镜像自地址
https://github.com/binary-husky/gpt_academic.git
已同步 2025-12-06 14:36:48 +00:00
比较提交
10 次代码提交
version3.4
...
version3.4
| 作者 | SHA1 | 提交日期 | |
|---|---|---|---|
|
|
790a1cf12a | ||
|
|
3ecf2977a8 | ||
|
|
aeddf6b461 | ||
|
|
3c00e7a143 | ||
|
|
ef1bfdd60f | ||
|
|
e48d92e82e | ||
|
|
110510997f | ||
|
|
b52695845e | ||
|
|
f30c9c6d3b | ||
|
|
ff5403eac6 |
50
README.md
50
README.md
@@ -43,7 +43,7 @@ chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
|
||||
[Arxiv小助手](https://www.bilibili.com/video/BV1LM4y1279X) | [函数插件] 输入arxiv文章url即可一键翻译摘要+下载PDF
|
||||
[谷歌学术统合小助手](https://www.bilibili.com/video/BV19L411U7ia) | [函数插件] 给定任意谷歌学术搜索页面URL,让gpt帮你[写relatedworks](https://www.bilibili.com/video/BV1GP411U7Az/)
|
||||
互联网信息聚合+GPT | [函数插件] 一键[让GPT先从互联网获取信息](https://www.bilibili.com/video/BV1om4y127ck),再回答问题,让信息永不过时
|
||||
Arxiv论文精密翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/),迄今为止最好的论文翻译工具
|
||||
⭐Arxiv论文精细翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/),迄今为止最好的论文翻译工具⭐
|
||||
公式/图片/表格显示 | 可以同时显示公式的[tex形式和渲染形式](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png),支持公式、代码高亮
|
||||
多线程函数插件支持 | 支持多线调用chatgpt,一键处理[海量文本](https://www.bilibili.com/video/BV1FT411H7c5/)或程序
|
||||
启动暗色gradio[主题](https://github.com/binary-husky/chatgpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
|
||||
@@ -233,66 +233,62 @@ Tip:不指定文件直接点击 `载入对话历史存档` 可以查看历史h
|
||||
<img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
|
||||
</div>
|
||||
|
||||
|
||||
|
||||
2. 生成报告。大部分插件都会在执行结束后,生成工作报告
|
||||
2. ⭐Latex/Arxiv论文翻译功能⭐
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="300" >
|
||||
<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="300" >
|
||||
<img src="https://user-images.githubusercontent.com/96192199/227504005-efeaefe0-b687-49d0-bf95-2d7b7e66c348.png" height="300" >
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/002a1a75-ace0-4e6a-94e2-ec1406a746f1" height="250" > ===>
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/9fdcc391-f823-464f-9322-f8719677043b" height="250" >
|
||||
</div>
|
||||
|
||||
3. 模块化功能设计,简单的接口却能支持强大的功能
|
||||
3. 生成报告。大部分插件都会在执行结束后,生成工作报告
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="250" >
|
||||
<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="250" >
|
||||
</div>
|
||||
|
||||
4. 模块化功能设计,简单的接口却能支持强大的功能
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/229288270-093643c1-0018-487a-81e6-1d7809b6e90f.png" height="400" >
|
||||
<img src="https://user-images.githubusercontent.com/96192199/227504931-19955f78-45cd-4d1c-adac-e71e50957915.png" height="400" >
|
||||
</div>
|
||||
|
||||
4. 这是一个能够“自我译解”的开源项目
|
||||
5. 这是一个能够“自我译解”的开源项目
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/226936850-c77d7183-0749-4c1c-9875-fd4891842d0c.png" width="500" >
|
||||
</div>
|
||||
|
||||
5. 译解其他开源项目,不在话下
|
||||
6. 译解其他开源项目,不在话下
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" width="500" >
|
||||
<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" height="250" >
|
||||
<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" height="250" >
|
||||
</div>
|
||||
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" width="500" >
|
||||
</div>
|
||||
|
||||
6. 装饰[live2d](https://github.com/fghrsh/live2d_demo)的小功能(默认关闭,需要修改`config.py`)
|
||||
7. 装饰[live2d](https://github.com/fghrsh/live2d_demo)的小功能(默认关闭,需要修改`config.py`)
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/236432361-67739153-73e8-43fe-8111-b61296edabd9.png" width="500" >
|
||||
</div>
|
||||
|
||||
7. 新增MOSS大语言模型支持
|
||||
8. 新增MOSS大语言模型支持
|
||||
<div align="center">
|
||||
<img src="https://user-images.githubusercontent.com/96192199/236639178-92836f37-13af-4fdd-984d-b4450fe30336.png" width="500" >
|
||||
</div>
|
||||
|
||||
8. OpenAI图像生成
|
||||
9. OpenAI图像生成
|
||||
<div align="center">
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/bc7ab234-ad90-48a0-8d62-f703d9e74665" width="500" >
|
||||
</div>
|
||||
|
||||
9. OpenAI音频解析与总结
|
||||
10. OpenAI音频解析与总结
|
||||
<div align="center">
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/709ccf95-3aee-498a-934a-e1c22d3d5d5b" width="500" >
|
||||
</div>
|
||||
|
||||
10. Latex全文校对纠错
|
||||
11. Latex全文校对纠错
|
||||
<div align="center">
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="250" > ===>
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="250">
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="200" > ===>
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="200">
|
||||
</div>
|
||||
|
||||
10. Latex/Arxiv论文翻译功能
|
||||
<div align="center">
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/002a1a75-ace0-4e6a-94e2-ec1406a746f1" height="250" >
|
||||
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/9fdcc391-f823-464f-9322-f8719677043b" height="250" >
|
||||
</div>
|
||||
|
||||
|
||||
## 版本:
|
||||
- version 3.5(Todo): 使用自然语言调用本项目的所有函数插件(高优先级)
|
||||
|
||||
@@ -146,7 +146,7 @@ def Latex英文纠错加PDF对比(txt, llm_kwargs, plugin_kwargs, chatbot, histo
|
||||
from .latex_utils import Latex精细分解与转化, 编译Latex
|
||||
except Exception as e:
|
||||
chatbot.append([ f"解析项目: {txt}",
|
||||
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
|
||||
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
|
||||
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
|
||||
return
|
||||
|
||||
@@ -216,7 +216,7 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
|
||||
from .latex_utils import Latex精细分解与转化, 编译Latex
|
||||
except Exception as e:
|
||||
chatbot.append([ f"解析项目: {txt}",
|
||||
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
|
||||
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
|
||||
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
|
||||
return
|
||||
|
||||
|
||||
@@ -23,13 +23,38 @@ def split_worker(text, mask, pattern, flags=0):
|
||||
mask[res.span()[0]:res.span()[1]] = PRESERVE
|
||||
return text, mask
|
||||
|
||||
def split_worker_reverse_caption(text, mask, pattern, flags=0):
|
||||
def split_worker_careful_brace(text, mask, pattern, flags=0):
|
||||
"""
|
||||
Move caption area out of preserve area
|
||||
Move area into preserve area
|
||||
"""
|
||||
pattern_compile = re.compile(pattern, flags)
|
||||
for res in pattern_compile.finditer(text):
|
||||
mask[res.regs[1][0]:res.regs[1][1]] = TRANSFORM
|
||||
brace_level = -1
|
||||
p = begin = end = res.regs[0][0]
|
||||
for _ in range(1024*16):
|
||||
if text[p] == '}' and brace_level == 0: break
|
||||
elif text[p] == '}': brace_level -= 1
|
||||
elif text[p] == '{': brace_level += 1
|
||||
p += 1
|
||||
end = p+1
|
||||
mask[begin:end] = PRESERVE
|
||||
return text, mask
|
||||
|
||||
def split_worker_reverse_careful_brace(text, mask, pattern, flags=0):
|
||||
"""
|
||||
Move area out of preserve area
|
||||
"""
|
||||
pattern_compile = re.compile(pattern, flags)
|
||||
for res in pattern_compile.finditer(text):
|
||||
brace_level = 0
|
||||
p = begin = end = res.regs[1][0]
|
||||
for _ in range(1024*16):
|
||||
if text[p] == '}' and brace_level == 0: break
|
||||
elif text[p] == '}': brace_level -= 1
|
||||
elif text[p] == '{': brace_level += 1
|
||||
p += 1
|
||||
end = p
|
||||
mask[begin:end] = TRANSFORM
|
||||
return text, mask
|
||||
|
||||
def split_worker_begin_end(text, mask, pattern, flags=0, limit_n_lines=42):
|
||||
@@ -97,6 +122,7 @@ def 寻找Latex主文件(file_manifest, mode):
|
||||
else:
|
||||
continue
|
||||
raise RuntimeError('无法找到一个主Tex文件(包含documentclass关键字)')
|
||||
|
||||
def rm_comments(main_file):
|
||||
new_file_remove_comment_lines = []
|
||||
for l in main_file.splitlines():
|
||||
@@ -108,6 +134,7 @@ def rm_comments(main_file):
|
||||
main_file = '\n'.join(new_file_remove_comment_lines)
|
||||
main_file = re.sub(r'(?<!\\)%.*', '', main_file) # 使用正则表达式查找半行注释, 并替换为空字符串
|
||||
return main_file
|
||||
|
||||
def merge_tex_files_(project_foler, main_file, mode):
|
||||
"""
|
||||
Merge Tex project recrusively
|
||||
@@ -185,14 +212,39 @@ def fix_content(final_tex, node_string):
|
||||
if node_string.count('\_') > 0 and node_string.count('\_') > final_tex.count('\_'):
|
||||
# walk and replace any _ without \
|
||||
final_tex = re.sub(r"(?<!\\)_", "\\_", final_tex)
|
||||
if node_string.count('{') != node_string.count('}'):
|
||||
if final_tex.count('{') != node_string.count('{'):
|
||||
final_tex = node_string # 出问题了,还原原文
|
||||
if final_tex.count('}') != node_string.count('}'):
|
||||
final_tex = node_string # 出问题了,还原原文
|
||||
|
||||
def compute_brace_level(string):
|
||||
# this function count the number of { and }
|
||||
brace_level = 0
|
||||
for c in string:
|
||||
if c == "{": brace_level += 1
|
||||
elif c == "}": brace_level -= 1
|
||||
return brace_level
|
||||
def join_most(tex_t, tex_o):
|
||||
# this function join translated string and original string when something goes wrong
|
||||
p_t = 0
|
||||
p_o = 0
|
||||
def find_next(string, chars, begin):
|
||||
p = begin
|
||||
while p < len(string):
|
||||
if string[p] in chars: return p, string[p]
|
||||
p += 1
|
||||
return None, None
|
||||
while True:
|
||||
res1, char = find_next(tex_o, ['{','}'], p_o)
|
||||
if res1 is None: break
|
||||
res2, char = find_next(tex_t, [char], p_t)
|
||||
if res2 is None: break
|
||||
p_o = res1 + 1
|
||||
p_t = res2 + 1
|
||||
return tex_t[:p_t] + tex_o[p_o:]
|
||||
|
||||
if compute_brace_level(final_tex) != compute_brace_level(node_string):
|
||||
# 出问题了,还原部分原文,保证括号正确
|
||||
final_tex = join_most(final_tex, node_string)
|
||||
return final_tex
|
||||
|
||||
def split_subprocess(txt, project_folder, return_dict):
|
||||
def split_subprocess(txt, project_folder, return_dict, opts):
|
||||
"""
|
||||
break down latex file to a linked list,
|
||||
each node use a preserve flag to indicate whether it should
|
||||
@@ -239,7 +291,8 @@ def split_subprocess(txt, project_folder, return_dict):
|
||||
text, mask = split_worker(text, mask, r"\\vspace\{(.*?)\}")
|
||||
text, mask = split_worker(text, mask, r"\\hspace\{(.*?)\}")
|
||||
text, mask = split_worker(text, mask, r"\\end\{(.*?)\}")
|
||||
# text, mask = split_worker_reverse_caption(text, mask, r"\\caption\{(.*?)\}", re.DOTALL)
|
||||
text, mask = split_worker_careful_brace(text, mask, r"\\hl\{(.*?)\}", re.DOTALL)
|
||||
text, mask = split_worker_reverse_careful_brace(text, mask, r"\\caption\{(.*?)\}", re.DOTALL)
|
||||
root = convert_to_linklist(text, mask)
|
||||
|
||||
# 修复括号
|
||||
@@ -369,7 +422,7 @@ class LatexPaperSplit():
|
||||
result_string = result_string[:position] + self.msg + msg + self.msg_declare + result_string[position:]
|
||||
return result_string
|
||||
|
||||
def split(self, txt, project_folder):
|
||||
def split(self, txt, project_folder, opts):
|
||||
"""
|
||||
break down latex file to a linked list,
|
||||
each node use a preserve flag to indicate whether it should
|
||||
@@ -381,7 +434,7 @@ class LatexPaperSplit():
|
||||
return_dict = manager.dict()
|
||||
p = multiprocessing.Process(
|
||||
target=split_subprocess,
|
||||
args=(txt, project_folder, return_dict))
|
||||
args=(txt, project_folder, return_dict, opts))
|
||||
p.start()
|
||||
p.join()
|
||||
self.nodes = return_dict['nodes']
|
||||
@@ -440,7 +493,7 @@ class LatexPaperFileGroup():
|
||||
|
||||
|
||||
|
||||
def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None):
|
||||
def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None, opts=[]):
|
||||
import time, os, re
|
||||
from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
|
||||
from .latex_utils import LatexPaperFileGroup, merge_tex_files, LatexPaperSplit, 寻找Latex主文件
|
||||
@@ -469,8 +522,10 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
|
||||
f.write(merged_content)
|
||||
|
||||
# <-------- 精细切分latex文件 ---------->
|
||||
chatbot.append((f"Latex文件融合完成", f'[Local Message] 正在精细切分latex文件,这需要一段时间计算,文档越长耗时越长,请耐心等待。'))
|
||||
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
|
||||
lps = LatexPaperSplit()
|
||||
res = lps.split(merged_content, project_folder) # 消耗时间的函数
|
||||
res = lps.split(merged_content, project_folder, opts) # 消耗时间的函数
|
||||
|
||||
# <-------- 拆分过长的latex片段 ---------->
|
||||
pfg = LatexPaperFileGroup()
|
||||
@@ -567,7 +622,7 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
|
||||
current_dir = os.getcwd()
|
||||
n_fix = 1
|
||||
max_try = 32
|
||||
chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder},如果程序停顿5分钟以上,则大概率是卡死在Latex里面了。不幸卡死时请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history)
|
||||
chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder},如果程序停顿5分钟以上,请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history)
|
||||
chatbot.append([f"正在编译PDF文档", '...']); yield from update_ui(chatbot=chatbot, history=history); time.sleep(1); chatbot[-1] = list(chatbot[-1]) # 刷新界面
|
||||
yield from update_ui_lastest_msg('编译已经开始...', chatbot, history) # 刷新Gradio前端界面
|
||||
|
||||
|
||||
@@ -8,7 +8,7 @@ def inspect_dependency(chatbot, history):
|
||||
import manim
|
||||
return True
|
||||
except:
|
||||
chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manimgl```"])
|
||||
chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manim manimgl```"])
|
||||
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
|
||||
return False
|
||||
|
||||
|
||||
@@ -58,6 +58,8 @@
|
||||
"连接网络回答问题": "ConnectToNetworkToAnswerQuestions",
|
||||
"联网的ChatGPT": "ChatGPTConnectedToNetwork",
|
||||
"解析任意code项目": "ParseAnyCodeProject",
|
||||
"读取知识库作答": "ReadKnowledgeArchiveAnswerQuestions",
|
||||
"知识库问答": "UpdateKnowledgeArchive",
|
||||
"同时问询_指定模型": "InquireSimultaneously_SpecifiedModel",
|
||||
"图片生成": "ImageGeneration",
|
||||
"test_解析ipynb文件": "Test_ParseIpynbFile",
|
||||
|
||||
@@ -483,7 +483,9 @@ def on_report_generated(files, chatbot):
|
||||
if len(report_files) == 0:
|
||||
return None, chatbot
|
||||
# files.extend(report_files)
|
||||
chatbot.append(['报告如何远程获取?', '报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。'])
|
||||
file_links = ''
|
||||
for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
|
||||
chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
|
||||
return report_files, chatbot
|
||||
|
||||
def is_openai_api_key(key):
|
||||
|
||||
在新工单中引用
屏蔽一个用户