比较提交

..

207 次代码提交

作者 SHA1 备注 提交日期
binary-husky
8a10db618e Merge branch 'master-interact' 2023-07-09 01:05:04 +08:00
binary-husky
1fe66f0291 优化azure的体验 2023-07-09 00:20:58 +08:00
binary-husky
ced977c443 修复双dollar公式匹配bug 2023-07-08 22:23:29 +08:00
binary-husky
6c2ffbae52 Update README.md 2023-07-08 19:17:35 +08:00
binary-husky
be2f54fac9 Update README.md 2023-07-08 18:21:20 +08:00
binary-husky
87b5e56378 Update requirements.txt 2023-07-08 18:10:33 +08:00
binary-husky
3a5764ed34 Update requirements.txt 2023-07-08 17:59:27 +08:00
qingxu fu
67d9051890 update error message 2023-07-07 17:41:43 +08:00
binary-husky
be96232127 Merge pull request #933 from binary-husky/master-latex-patch
Latex File Name Bug Patch
2023-07-07 16:57:58 +08:00
binary-husky
3b5bc7a784 Update use_azure.md 2023-07-07 10:55:22 +08:00
binary-husky
5e92f437a1 Update use_azure.md 2023-07-07 10:54:21 +08:00
qingxu fu
eabd9d312f 3.43 2023-07-07 10:47:30 +08:00
qingxu fu
0da6fe78ac 统一azure-gpt-3.5的格式 2023-07-07 10:45:11 +08:00
qingxu fu
be990380a0 Merge branch 'master' of https://github.com/binary-husky/chatgpt_academic into master 2023-07-07 10:42:41 +08:00
qingxu fu
9c0bc48420 修复Azure OpenAI接口的各种bug 2023-07-07 10:42:38 +08:00
binary-husky
5c0d34793e Latex File Name Bug Patch 2023-07-07 00:09:50 +08:00
binary-husky
37fc550652 Update config.py 2023-07-06 10:47:06 +08:00
binary-husky
2c1d6ac212 修复Organization的bug 2023-07-05 21:14:13 +08:00
binary-husky
8c699c1b26 Update README.md 2023-07-05 21:04:28 +08:00
binary-husky
c620fa9011 Update README.md 2023-07-05 20:55:59 +08:00
binary-husky
f16fd60211 Update README.md 2023-07-05 20:34:22 +08:00
binary-husky
9674e59d26 更新说明 2023-07-05 20:22:57 +08:00
binary-husky
643c5e125a 更新提醒 2023-07-05 20:10:18 +08:00
binary-husky
e5099e1daa 极少数情况下,openai的官方KEY需要伴随组织编码 2023-07-05 20:05:20 +08:00
binary-husky
3e621bbec1 Update Dockerfile 2023-07-05 14:37:54 +08:00
qingxu fu
bb1d5a61c0 update translation matrix 2023-07-05 14:32:33 +08:00
binary-husky
fd3d0be2d8 Update config.py 2023-07-05 14:13:04 +08:00
binary-husky
ae623258f3 更详细的配置提示 2023-07-05 14:10:06 +08:00
binary-husky
cda281f08b 把newbing的cookie加回来 2023-07-05 13:48:50 +08:00
binary-husky
9f8e7a6efa 显示更详细的报错 2023-07-05 13:35:11 +08:00
qingxu fu
57643dd2b6 update error msg 2023-07-05 13:01:06 +08:00
qingxu fu
6bc8a78cfe No more cookie for NewBing! 2023-07-05 12:45:10 +08:00
binary-husky
d2700e97fb 更新openai失效提醒 2023-07-05 11:03:11 +08:00
binary-husky
c4dd81dc9a Update Dockerfile 2023-07-04 12:28:52 +08:00
binary-husky
e9b06d7cde Merge pull request #927 from QuantumRoseinAmethystVase/master
Update 批量总结PDF文档.py
2023-07-04 12:24:17 +08:00
qingxu fu
6e6ea69611 Unsplash恢复了 2023-07-04 12:16:01 +08:00
QuantumRoseinAmethystVase
16c17eb077 Update 批量总结PDF文档.py
Improve the output.
2023-07-03 18:55:16 +08:00
kainstan
59877dd728 Local variable 'result' might be referenced before assignment, add else result 2023-07-01 22:27:11 +08:00
w_xiaolizu
5f7ffef238 增加基础功能判空 2023-07-01 22:04:42 +08:00
qingxu fu
41c10f5688 report image generation error in UI 2023-07-01 02:28:32 +08:00
qingxu fu
d7ac99f603 更正错误提示 2023-07-01 01:46:43 +08:00
qingxu fu
1616daae6a Merge branch 'master' of https://github.com/binary-husky/chatgpt_academic into master 2023-07-01 00:17:30 +08:00
qingxu fu
a1092d8f92 提供自动清空输入框的选项 2023-07-01 00:17:26 +08:00
binary-husky
34ca9f138f Merge branch 'master' of github.com:binary-husky/chatgpt_academic 2023-06-30 14:56:28 +08:00
binary-husky
df3f1aa3ca 更正ChatGLM2的默认Token数量 2023-06-30 14:56:22 +08:00
qingxu fu
bf805cf477 Merge branch 'master' of https://github.com/binary-husky/chatgpt_academic into master 2023-06-30 13:09:51 +08:00
qingxu fu
ecb08e69be remove find picture core functionality 2023-06-30 13:08:54 +08:00
binary-husky
28c1e3f11b Merge branch 'master' of github.com:binary-husky/chatgpt_academic 2023-06-30 12:06:33 +08:00
binary-husky
403667aec1 upgrade chatglm to chatglm2 2023-06-30 12:06:28 +08:00
qingxu fu
22f377e2fb fix multi user cwd shift 2023-06-30 11:05:47 +08:00
binary-husky
37172906ef 修复文件导出的bug 2023-06-29 14:55:55 +08:00
binary-husky
3b78e0538b 修复插件demo的图像显示的问题 2023-06-29 14:52:58 +08:00
binary-husky
d8f9ac71d0 Merge pull request #907 from Xminry/master
feat:联网搜索功能,cn.bing.com版,国内可用
2023-06-29 12:44:32 +08:00
qingxu fu
aced272d3c 微调插件提示 2023-06-29 12:43:50 +08:00
qingxu fu
aff77a086d Merge branch 'master' of https://github.com/Xminry/gpt_academic into Xminry-master 2023-06-29 12:38:43 +08:00
qingxu fu
49253c4dc6 [arxiv trans] add html comparison to zip file 2023-06-29 12:29:49 +08:00
qingxu fu
1a00093015 修复提示 2023-06-29 12:15:52 +08:00
qingxu fu
64f76e7401 3.42 2023-06-29 11:32:19 +08:00
qingxu fu
eb4c07997e 修复Latex矫错和本地Latex论文翻译的问题 2023-06-29 11:30:42 +08:00
Xminry
99cf7205c3 feat:联网搜索功能,cn.bing.com版,国内可用 2023-06-28 10:30:08 +08:00
binary-husky
d684b4cdb3 Merge pull request #905 from Xminry/master
Update 理解PDF文档内容.py
2023-06-27 23:37:25 +08:00
binary-husky
601a95c948 Merge pull request #881 from OverKit/master
update latex_utils.py
2023-06-27 19:20:17 +08:00
qingxu fu
e18bef2e9c add item breaker 2023-06-27 19:16:05 +08:00
qingxu fu
f654c1af31 merge regex expressions 2023-06-27 18:59:56 +08:00
qingxu fu
e90048a671 Merge branch 'master' of https://github.com/OverKit/gpt_academic into OverKit-master 2023-06-27 16:14:12 +08:00
binary-husky
ea624b1510 Merge pull request #889 from dackdawn/master
添加0613模型的声明
2023-06-27 15:03:15 +08:00
qingxu fu
057e3dda3c Merge branch 'master' of https://github.com/dackdawn/gpt_academic into dackdawn-master 2023-06-27 15:02:22 +08:00
Xminry
4290821a50 Update 理解PDF文档内容.py 2023-06-27 01:57:31 +08:00
binary-husky
280e14d7b7 更新Latex模块的docker-compose 2023-06-26 09:59:14 +08:00
505030475
9f0cf9fb2b arxiv PDF 引用 2023-06-25 23:30:31 +08:00
505030475
b8560b7510 修正误判latex模板文件的bug 2023-06-25 22:46:16 +08:00
505030475
d841d13b04 add arxiv translation test samples 2023-06-25 22:12:44 +08:00
binary-husky
efda9e5193 Merge pull request #897 from Ranhuiryan/master
添加azure-gpt35选项
2023-06-24 17:59:51 +10:00
Ranhuiryan
33d2e75aac add azure-gpt35 to model list 2023-06-21 16:19:49 +08:00
Ranhuiryan
74941170aa update azure use instruction 2023-06-21 16:19:26 +08:00
505030475
cd38949903 当遇到错误时,回滚到原文 2023-06-21 11:53:57 +10:00
505030475
d87f1eb171 更新接入azure的说明 2023-06-21 11:38:59 +10:00
binary-husky
cd1e4e1ba7 Merge pull request #797 from XiaojianTang/master
增加azure openai api的支持
2023-06-21 11:23:41 +10:00
505030475
cf5f348d70 update test samples 2023-06-21 11:20:31 +10:00
binary-husky
0ee25f475e Merge branch 'master' of github.com:binary-husky/chatgpt_academic 2023-06-20 23:07:51 +08:00
binary-husky
1fede6df7f temp 2023-06-20 23:05:17 +08:00
binary-husky
22a65cd163 Create build-with-latex.yml 2023-06-21 00:55:24 +10:00
binary-husky
538b041ea3 Merge pull request #890 from Mcskiller/master
Update README.md
2023-06-21 00:53:26 +10:00
505030475
d7b056576d add latex docker-compose 2023-06-21 00:52:58 +10:00
505030475
cb0bb6ab4a fix minor bugs 2023-06-21 00:41:33 +10:00
505030475
bf955aaf12 fix bugs 2023-06-20 23:12:30 +10:00
505030475
61eb0da861 fix encoding bug 2023-06-20 22:08:09 +10:00
Lebenito(生糸)
5da633d94d Update README.md
Fix the error URL for the git clone.
2023-06-20 19:10:11 +08:00
dackdawn
f3e4e26e2f 添加0613模型的声明
openai对gpt-3.5-turbo的RPM限制是3,而gpt-3.5-turbo-0613的RPM是60,虽然两个模型的内容是一致的,但是选定特定模型可以获得更高的RPM和TPM
2023-06-19 21:40:26 +08:00
505030475
af7734dd35 avoid file fusion 2023-06-19 16:57:11 +10:00
505030475
d5bab093f9 rename function names 2023-06-19 15:17:33 +10:00
505030475
f94b167dc2 Merge branch 'master' into overkit-master 2023-06-19 14:53:51 +10:00
505030475
951d5ec758 Merge branch 'master' of github.com:binary-husky/chatgpt_academic 2023-06-19 14:52:25 +10:00
505030475
016d8ee156 Merge remote-tracking branch 'origin/master' into OverKit-master 2023-06-19 14:51:59 +10:00
505030475
dca9ec4bae Merge branch 'master' of https://github.com/OverKit/gpt_academic into OverKit-master 2023-06-19 14:49:50 +10:00
binary-husky
a06e43c96b Update README.md 2023-06-18 16:15:37 +08:00
binary-husky
29c6bfb6cb Update README.md 2023-06-18 16:12:06 +08:00
binary-husky
8d7ee975a0 Update README.md 2023-06-18 16:10:45 +08:00
binary-husky
4bafbb3562 Update Latex输出PDF结果.py 2023-06-18 15:54:23 +08:00
OverKit
7fdf0a8e51 调整区分内容的代码 2023-06-18 15:51:29 +08:00
binary-husky
2bb13b4677 Update README.md 2023-06-18 15:44:42 +08:00
OverKit
9a5a509dd9 修复关于abstract的搜索 2023-06-17 19:27:21 +08:00
binary-husky
cbcb98ef6a Merge pull request #872 from Skyzayre/master
Update README.md
2023-06-16 17:54:39 +08:00
qingxu fu
bb864c6313 增加一些提示文字 2023-06-16 17:33:19 +08:00
qingxu fu
6d849eeb12 修复Langchain插件的bug 2023-06-16 17:33:03 +08:00
Skyzayre
ef752838b0 Update README.md 2023-06-15 02:07:43 +08:00
binary-husky
73d4a1ff4b Update README.md 2023-06-14 10:15:47 +08:00
qingxu fu
8c62f21aa6 3.41增加gpt-3.5-16k的支持 2023-06-14 09:57:09 +08:00
qingxu fu
c40ebfc21f 将gpt-3.5-16k作为加入支持列表 2023-06-14 09:50:15 +08:00
binary-husky
c365ea9f57 Update README.md 2023-06-13 16:13:19 +08:00
binary-husky
12d66777cc Merge pull request #864 from OverKit/master
check letter % after removing spaces or tabs in the left
2023-06-12 15:21:35 +08:00
OverKit
9ac3d0d65d check letter % after removing spaces or tabs in the left 2023-06-12 10:09:52 +08:00
binary-husky
9fd212652e 专业词汇声明 2023-06-12 09:45:59 +08:00
binary-husky
790a1cf12a 添加一些提示 2023-06-11 20:12:25 +08:00
binary-husky
3ecf2977a8 修复caption翻译 2023-06-11 18:23:54 +08:00
binary-husky
aeddf6b461 Update Latex输出PDF结果.py 2023-06-11 10:20:49 +08:00
505030475
ce0d8b9dab 虚空终端插件雏形 2023-06-11 01:36:23 +08:00
binary-husky
3c00e7a143 file link in chatbot 2023-06-10 21:45:38 +08:00
binary-husky
ef1bfdd60f update pip install notice 2023-06-08 21:29:10 +08:00
qingxu fu
e48d92e82e update translation 2023-06-08 18:34:06 +08:00
binary-husky
110510997f Update README.md 2023-06-08 12:48:52 +08:00
binary-husky
b52695845e Update README.md 2023-06-08 12:44:05 +08:00
binary-husky
f30c9c6d3b Update README.md 2023-06-08 12:43:13 +08:00
binary-husky
ff5403eac6 Update README.md 2023-06-08 12:42:24 +08:00
binary-husky
f9226d92be Update version 2023-06-08 12:24:14 +08:00
binary-husky
a0ea5d0e9e Update README.md 2023-06-08 12:22:03 +08:00
binary-husky
ce6f11d200 Update README.md 2023-06-08 12:20:49 +08:00
binary-husky
10b3001dba Update README.md 2023-06-08 12:19:11 +08:00
binary-husky
e2de1d76ea Update README.md 2023-06-08 12:18:31 +08:00
binary-husky
77cc141a82 Update README.md 2023-06-08 12:14:02 +08:00
binary-husky
526b4d8ecd Merge branch 'master' of github.com:binary-husky/chatgpt_academic 2023-06-07 11:09:20 +08:00
binary-husky
149db621ec langchain check depends 2023-06-07 11:09:12 +08:00
binary-husky
2e1bb7311c Merge pull request #848 from MengDanzz/master
将Dockerfile COPY分成两段,缓存依赖库,重新构建不需要重新安装
2023-06-07 10:44:09 +08:00
binary-husky
dae65fd2c2 在copy ..后在运行一次pip install检查依赖变化 2023-06-07 10:43:45 +08:00
MengDanzz
9aafb2ee47 非pypi包加入COPY 2023-06-07 09:18:57 +08:00
MengDanzz
6bc91bd02e Merge branch 'binary-husky:master' into master 2023-06-07 09:15:44 +08:00
qingxu fu
8ef7344101 fix subprocess bug in Windows 2023-06-06 18:57:52 +08:00
binary-husky
40da1b0afe 将Latex分解程序放到子进程执行 2023-06-06 18:44:00 +08:00
MengDanzz
c65def90f3 将Dockerfile COPY分成两段,缓存依赖库,重新构建不需要重新安装 2023-06-06 14:36:30 +08:00
binary-husky
ddeaf76422 check latex in PATH 2023-06-06 00:23:00 +08:00
qingxu fu
f23b66dec2 update Dockerfile with Latex 2023-06-05 23:49:54 +08:00
qingxu fu
a26b294817 Write Some Docstring 2023-06-05 23:44:59 +08:00
qingxu fu
66018840da declare resp 2023-06-05 23:24:41 +08:00
qingxu fu
cea2144f34 fix test samples 2023-06-05 23:11:21 +08:00
qingxu fu
7f5be93c1d 修正一些正则匹配bug 2023-06-05 22:57:39 +08:00
binary-husky
85b838b302 add Linux support 2023-06-04 23:06:35 +08:00
qingxu fu
27f97ba92a remove previous results 2023-06-04 16:55:36 +08:00
qingxu fu
14269eba98 建立本地arxiv缓存区 2023-06-04 16:08:01 +08:00
qingxu fu
d5c9bc9f0a 提高iffalse搜索优先级 2023-06-04 14:15:59 +08:00
qingxu fu
b0fed3edfc consider iffalse state 2023-06-04 14:06:02 +08:00
qingxu fu
7296d054a2 patch latex segmentation 2023-06-04 13:56:15 +08:00
qingxu fu
d57c7d352d improve quality 2023-06-03 23:54:30 +08:00
qingxu fu
3fd2927ea3 改善 2023-06-03 23:33:45 +08:00
qingxu fu
b745074160 avoid most compile failure 2023-06-03 23:33:32 +08:00
qingxu fu
70ee810133 improve success rate 2023-06-03 19:39:19 +08:00
qingxu fu
68fea9e79b fix test 2023-06-03 18:09:39 +08:00
qingxu fu
f82bf91aa8 test example 2023-06-03 18:06:39 +08:00
qingxu fu
dde9edcc0c fix a fatal mistake 2023-06-03 17:49:22 +08:00
qingxu fu
66c78e459e 修正提示 2023-06-03 17:18:38 +08:00
qingxu fu
de54102303 修改提醒 2023-06-03 16:43:26 +08:00
qingxu fu
7c7d2d8a84 Latex的minipage补丁 2023-06-03 16:16:32 +08:00
qingxu fu
834f989ed4 考虑有人用input不加.tex的情况 2023-06-03 15:42:22 +08:00
qingxu fu
b658ee6e04 修复arxiv翻译的一些问题 2023-06-03 15:36:55 +08:00
qingxu fu
1a60280ea0 添加警告 2023-06-03 14:40:37 +08:00
qingxu fu
991cb7d272 warning 2023-06-03 14:39:40 +08:00
qingxu fu
463991cfb2 fix bug 2023-06-03 14:24:06 +08:00
qingxu fu
06f10b5fdc fix zh cite bug 2023-06-03 14:17:58 +08:00
qingxu fu
d275d012c6 Merge branch 'langchain' into master 2023-06-03 13:53:39 +08:00
qingxu fu
c5d1ea3e21 update langchain version 2023-06-03 13:53:34 +08:00
qingxu fu
0022b92404 update prompt 2023-06-03 13:50:39 +08:00
qingxu fu
ef61221241 latex auto translation milestone 2023-06-03 13:46:40 +08:00
qingxu fu
5a1831db98 成功! 2023-06-03 00:34:23 +08:00
qingxu fu
a643f8b0db debug translation 2023-06-02 23:06:01 +08:00
qingxu fu
601712fd0a latex toolchain 2023-06-02 21:44:11 +08:00
505030475
e769f831c7 latex 2023-06-02 14:07:04 +08:00
binary-husky
dcd952671f Update main.py 2023-06-01 15:56:52 +08:00
binary-husky
06564df038 Merge branch 'langchain' 2023-06-01 09:39:34 +08:00
binary-husky
2f037f30d5 暂时移除插件锁定 2023-06-01 09:39:00 +08:00
505030475
efedab186d Merge branch 'master' into langchain 2023-06-01 00:10:22 +08:00
binary-husky
f49cae5116 Update Langchain知识库.py 2023-06-01 00:09:07 +08:00
binary-husky
2b620ccf2e 更新提示 2023-06-01 00:07:19 +08:00
binary-husky
a1b7a4da56 更新测试案例 2023-06-01 00:03:27 +08:00
binary-husky
61b0e49fed fix some bugs in linux 2023-05-31 23:49:25 +08:00
binary-husky
f60dc371db 12 2023-05-31 10:42:44 +08:00
binary-husky
0a3433b8ac Update README.md 2023-05-31 10:37:08 +08:00
binary-husky
31bce54abb Update README.md 2023-05-31 10:34:21 +08:00
binary-husky
5db1530717 Merge branch 'langchain' of github.com:binary-husky/chatgpt_academic into langchain 2023-05-30 20:08:47 +08:00
binary-husky
c32929fd11 Merge branch 'master' into langchain 2023-05-30 20:08:15 +08:00
505030475
3e4c2b056c knowledge base 2023-05-30 19:55:38 +08:00
505030475
e79e9d7d23 Merge branch 'master' into langchain 2023-05-30 18:31:39 +08:00
binary-husky
d175b93072 Update README.md.Italian.md 2023-05-30 17:27:41 +08:00
binary-husky
ed254687d2 Update README.md.Italian.md 2023-05-30 17:26:12 +08:00
binary-husky
c0392f7074 Update README.md.Korean.md 2023-05-30 17:25:32 +08:00
binary-husky
f437712af7 Update README.md.Portuguese.md 2023-05-30 17:22:46 +08:00
505030475
6d1ea643e9 langchain 2023-05-30 12:54:42 +08:00
binary-husky
9e84cfcd46 Update README.md 2023-05-29 19:48:34 +08:00
binary-husky
897695d29f 修复二级路径的文件屏蔽 2023-05-28 20:25:35 +08:00
binary-husky
1dcc2873d2 修复Gradio配置泄露的问题 2023-05-28 20:23:47 +08:00
binary-husky
42cf738a31 修复一些情况下复制键失效的问题 2023-05-28 18:12:48 +08:00
binary-husky
e4646789af Merge branch 'master' of github.com:binary-husky/chatgpt_academic 2023-05-28 16:07:29 +08:00
binary-husky
e6c3aabd45 docker-compose check 2023-05-28 16:07:24 +08:00
binary-husky
6789d1fab4 Update README.md 2023-05-28 11:21:50 +08:00
binary-husky
7a733f00a2 Update README.md 2023-05-28 00:19:23 +08:00
binary-husky
dd55888f0e Update README.md 2023-05-28 00:16:45 +08:00
binary-husky
0327df22eb Update README.md 2023-05-28 00:14:54 +08:00
binary-husky
e544f5e9d0 Update README.md 2023-05-27 23:45:15 +08:00
XiaojianTang
f3205994ea 增加azure openai api的支持 2023-05-26 23:22:12 +08:00
共有 48 个文件被更改,包括 3096 次插入1070 次删除

44
.github/workflows/build-with-latex.yml vendored 普通文件
查看文件

@@ -0,0 +1,44 @@
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
name: Create and publish a Docker image for Latex support
on:
push:
branches:
- 'master'
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}_with_latex
jobs:
build-and-push-image:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Log in to the Container registry
uses: docker/login-action@v2
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/metadata-action@v4
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
- name: Build and push Docker image
uses: docker/build-push-action@v4
with:
context: .
push: true
file: docs/GithubAction+NoLocal+Latex
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}

查看文件

@@ -1,24 +1,34 @@
# 此Dockerfile适用于“无本地模型”的环境构建,如果需要使用chatglm等本地模型,请参考 docs/Dockerfile+ChatGLM
# 如何构建: 先修改 `config.py`, 然后 docker build -t gpt-academic .
# 如何运行: docker run --rm -it --net=host gpt-academic
# 此Dockerfile适用于“无本地模型”的环境构建,如果需要使用chatglm等本地模型或者latex运行依赖,请参考 docker-compose.yml
# 如何构建: 先修改 `config.py`, 然后 `docker build -t gpt-academic . `
# 如何运行(Linux下): `docker run --rm -it --net=host gpt-academic `
# 如何运行(其他操作系统,选择任意一个固定端口50923): `docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic `
FROM python:3.11
# 非必要步骤,更换pip源
RUN echo '[global]' > /etc/pip.conf && \
echo 'index-url = https://mirrors.aliyun.com/pypi/simple/' >> /etc/pip.conf && \
echo 'trusted-host = mirrors.aliyun.com' >> /etc/pip.conf
# 进入工作路径
WORKDIR /gpt
# 装载项目文件
COPY . .
# 安装依赖
# 安装大部分依赖,利用Docker缓存加速以后的构建
COPY requirements.txt ./
COPY ./docs/gradio-3.32.2-py3-none-any.whl ./docs/gradio-3.32.2-py3-none-any.whl
RUN pip3 install -r requirements.txt
# 可选步骤,用于预热模块
# 装载项目文件,安装剩余依赖
COPY . .
RUN pip3 install -r requirements.txt
# 非必要步骤,用于预热模块
RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
# 启动
CMD ["python3", "-u", "main.py"]

169
README.md
查看文件

@@ -1,24 +1,26 @@
> **Note**
>
> 安装依赖时,请严格选择requirements.txt中**指定的版本**,并使用官方pip源
> 2023.7.5: Gradio依赖调整。请及时**更新代码**
>
> `pip install -r requirements.txt -i https://pypi.org/simple`
> 2023.7.8: pydantic出现兼容问题,已修改 `requirements.txt`。安装依赖时,请严格选择`requirements.txt`中**指定的版本**
>
> `pip install -r requirements.txt`
# <img src="docs/logo.png" width="40" > GPT 学术优化 (GPT Academic)
**如果喜欢这个项目,请给它一个Star;如果你发明了更好用的快捷键或函数插件,欢迎发pull requests**
# <div align=center><img src="docs/logo.png" width="40" > GPT 学术优化 (GPT Academic)</div>
**如果喜欢这个项目,请给它一个Star;如果您发明了好用的快捷键或函数插件,欢迎发pull requests**
If you like this project, please give it a Star. If you've come up with more useful academic shortcuts or functional plugins, feel free to open an issue or pull request. We also have a README in [English|](docs/README_EN.md)[日本語|](docs/README_JP.md)[한국어|](https://github.com/mldljyh/ko_gpt_academic)[Русский|](docs/README_RS.md)[Français](docs/README_FR.md) translated by this project itself.
To translate this project to arbitary language with GPT, read and run [`multi_language.py`](multi_language.py) (experimental).
> **Note**
>
> 1.请注意只有**红颜色**标识的函数插件(按钮)才支持读取文件,部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR
> 1.请注意只有 **高亮(如红色)** 标识的函数插件(按钮)才支持读取文件,部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR
>
> 2.本项目中每个文件的功能都在自译解[`self_analysis.md`](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)详细说明。随着版本的迭代,您也可以随时自行点击相关函数插件,调用GPT重新生成项目的自我解析报告。常见问题汇总在[`wiki`](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)当中。[安装方法](#installation)。
> 2.本项目中每个文件的功能都在自译解[`self_analysis.md`](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)详细说明。随着版本的迭代,您也可以随时自行点击相关函数插件,调用GPT重新生成项目的自我解析报告。常见问题汇总在[`wiki`](https://github.com/binary-husky/gpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)当中。[安装方法](#installation)。
>
> 3.本项目兼容并鼓励尝试国产大语言模型chatglm和RWKV, 盘古等等。支持多个api-key共存,可在配置文件中填写如`API_KEY="openai-key1,openai-key2,api2d-key3"`。需要临时更换`API_KEY`时,在输入区输入临时的`API_KEY`然后回车键提交后即可生效。
> 3.本项目兼容并鼓励尝试国产大语言模型ChatGLM和Moss等等。支持多个api-key共存,可在配置文件中填写如`API_KEY="openai-key1,openai-key2,api2d-key3"`。需要临时更换`API_KEY`时,在输入区输入临时的`API_KEY`然后回车键提交后即可生效。
@@ -31,23 +33,24 @@ To translate this project to arbitary language with GPT, read and run [`multi_la
一键中英互译 | 一键中英互译
一键代码解释 | 显示代码、解释代码、生成代码、给代码加注释
[自定义快捷键](https://www.bilibili.com/video/BV14s4y1E7jN) | 支持自定义快捷键
模块化设计 | 支持自定义强大的[函数插件](https://github.com/binary-husky/chatgpt_academic/tree/master/crazy_functions),插件支持[热更新](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)
[自我程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] [一键读懂](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)本项目的源代码
模块化设计 | 支持自定义强大的[函数插件](https://github.com/binary-husky/gpt_academic/tree/master/crazy_functions),插件支持[热更新](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)
[自我程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] [一键读懂](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)本项目的源代码
[程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] 一键可以剖析其他Python/C/C++/Java/Lua/...项目树
读论文、[翻译](https://www.bilibili.com/video/BV1KT411x7Wn)论文 | [函数插件] 一键解读latex/pdf论文全文并生成摘要
Latex全文[翻译](https://www.bilibili.com/video/BV1nk4y1Y7Js/)、[润色](https://www.bilibili.com/video/BV1FT411H7c5/) | [函数插件] 一键翻译或润色latex论文
批量注释生成 | [函数插件] 一键批量生成函数注释
Markdown[中英互译](https://www.bilibili.com/video/BV1yo4y157jV/) | [函数插件] 看到上面5种语言的[README](https://github.com/binary-husky/chatgpt_academic/blob/master/docs/README_EN.md)了吗?
Markdown[中英互译](https://www.bilibili.com/video/BV1yo4y157jV/) | [函数插件] 看到上面5种语言的[README](https://github.com/binary-husky/gpt_academic/blob/master/docs/README_EN.md)了吗?
chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
[PDF论文全文翻译功能](https://www.bilibili.com/video/BV1KT411x7Wn) | [函数插件] PDF论文提取题目&摘要+翻译全文(多线程)
[Arxiv小助手](https://www.bilibili.com/video/BV1LM4y1279X) | [函数插件] 输入arxiv文章url即可一键翻译摘要+下载PDF
[谷歌学术统合小助手](https://www.bilibili.com/video/BV19L411U7ia) | [函数插件] 给定任意谷歌学术搜索页面URL,让gpt帮你[写relatedworks](https://www.bilibili.com/video/BV1GP411U7Az/)
互联网信息聚合+GPT | [函数插件] 一键[让GPT从互联网获取信息](https://www.bilibili.com/video/BV1om4y127ck),再回答问题,让信息永不过时
互联网信息聚合+GPT | [函数插件] 一键[让GPT从互联网获取信息](https://www.bilibili.com/video/BV1om4y127ck)回答问题,让信息永不过时
⭐Arxiv论文精细翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/),目前最好的论文翻译工具
公式/图片/表格显示 | 可以同时显示公式的[tex形式和渲染形式](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png),支持公式、代码高亮
多线程函数插件支持 | 支持多线调用chatgpt,一键处理[海量文本](https://www.bilibili.com/video/BV1FT411H7c5/)或程序
启动暗色gradio[主题](https://github.com/binary-husky/chatgpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
[多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持,[API2D](https://api2d.com/)接口支持 | 同时被GPT3.5、GPT4、[清华ChatGLM](https://github.com/THUDM/ChatGLM-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧?
更多LLM模型接入,支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama),[RWKV](https://github.com/BlinkDL/ChatRWKV)和[盘古α](https://openi.org.cn/pangu/)
启动暗色[主题](https://github.com/binary-husky/gpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
[多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持 | 同时被GPT3.5、GPT4、[清华ChatGLM](https://github.com/THUDM/ChatGLM-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧?
更多LLM模型接入,支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama)和[盘古α](https://openi.org.cn/pangu/)
更多新功能展示(图像生成等) …… | 见本文档结尾处 ……
</div>
@@ -84,19 +87,18 @@ chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
<img src="https://user-images.githubusercontent.com/96192199/232537274-deca0563-7aa6-4b5d-94a2-b7c453c47794.png" width="700" >
</div>
---
# Installation
## 安装-方法1:直接运行 (Windows, Linux or MacOS)
### 安装方法I:直接运行 (Windows, Linux or MacOS)
1. 下载项目
```sh
git clone https://github.com/binary-husky/chatgpt_academic.git
cd chatgpt_academic
git clone https://github.com/binary-husky/gpt_academic.git
cd gpt_academic
```
2. 配置API_KEY
在`config.py`中,配置API KEY等设置,[特殊网络环境设置](https://github.com/binary-husky/gpt_academic/issues/1) 。
在`config.py`中,配置API KEY等设置,[点击查看特殊网络环境设置方法](https://github.com/binary-husky/gpt_academic/issues/1) 。
(P.S. 程序运行时会优先检查是否存在名为`config_private.py`的私密配置文件,并用其中的配置覆盖`config.py`的同名配置。因此,如果您能理解我们的配置读取逻辑,我们强烈建议您在`config.py`旁边创建一个名为`config_private.py`的新配置文件,并把`config.py`中的配置转移(复制)到`config_private.py`中。`config_private.py`不受git管控,可以让您的隐私信息更加安全。P.S.项目同样支持通过`环境变量`配置大多数选项,环境变量的书写格式参考`docker-compose`文件。读取优先级: `环境变量` > `config_private.py` > `config.py`)
@@ -112,6 +114,7 @@ conda activate gptac_venv # 激活anaconda环境
python -m pip install -r requirements.txt # 这个步骤和pip安装一样的步骤
```
<details><summary>如果需要支持清华ChatGLM/复旦MOSS作为后端,请点击展开此处</summary>
<p>
@@ -138,19 +141,13 @@ AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-
python main.py
```
5. 测试函数插件
```
- 测试函数插件模板函数要求gpt回答历史上的今天发生了什么,您可以根据此函数为模板,实现更复杂的功能
点击 "[函数插件模板Demo] 历史上的今天"
```
### 安装方法II使用Docker
## 安装-方法2使用Docker
1. 仅ChatGPT推荐大多数人选择
1. 仅ChatGPT推荐大多数人选择,等价于docker-compose方案1
``` sh
git clone https://github.com/binary-husky/chatgpt_academic.git # 下载项目
cd chatgpt_academic # 进入路径
git clone https://github.com/binary-husky/gpt_academic.git # 下载项目
cd gpt_academic # 进入路径
nano config.py # 用任意文本编辑器编辑config.py, 配置 “Proxy”, “API_KEY” 以及 “WEB_PORT” (例如50923) 等
docker build -t gpt-academic . # 安装
@@ -159,42 +156,48 @@ docker run --rm -it --net=host gpt-academic
#(最后一步-选择2在macOS/windows环境下,只能用-p选项将容器上的端口(例如50923)暴露给主机上的端口
docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
```
P.S. 如果需要依赖Latex的插件功能,请见Wiki。另外,您也可以直接使用docker-compose获取Latex功能修改docker-compose.yml,保留方案4并删除其他方案
2. ChatGPT + ChatGLM + MOSS需要熟悉Docker
``` sh
# 修改docker-compose.yml,删除方案1和方案3,保留方案2。修改docker-compose.yml中方案2的配置,参考其中注释即可
# 修改docker-compose.yml,保留方案2并删除其他方案。修改docker-compose.yml中方案2的配置,参考其中注释即可
docker-compose up
```
3. ChatGPT + LLAMA + 盘古 + RWKV需要熟悉Docker
``` sh
# 修改docker-compose.yml,删除方案1和方案2,保留方案3。修改docker-compose.yml中方案3的配置,参考其中注释即可
# 修改docker-compose.yml,保留方案3并删除其他方案。修改docker-compose.yml中方案3的配置,参考其中注释即可
docker-compose up
```
## 安装-方法3:其他部署姿势
### 安装方法III:其他部署姿势
1. 一键运行脚本。
完全不熟悉python环境的Windows用户可以下载[Release](https://github.com/binary-husky/gpt_academic/releases)中发布的一键运行脚本安装无本地模型的版本。
脚本的贡献来源是[oobabooga](https://github.com/oobabooga/one-click-installers)。
1. 如何使用反代URL/微软云AzureAPI
2. 使用docker-compose运行。
请阅读docker-compose.yml后,按照其中的提示操作即可
3. 如何使用反代URL
按照`config.py`中的说明配置API_URL_REDIRECT即可。
2. 远程云服务器部署(需要云服务器知识与经验)
请访问[部署wiki-1](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97)
4. 微软云AzureAPI
按照`config.py`中的说明配置即可AZURE_ENDPOINT等四个配置
3. 使用WSL2Windows Subsystem for Linux 子系统)
请访问[部署wiki-2](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
5. 远程云服务器部署(需要云服务器知识与经验)。
请访问[部署wiki-1](https://github.com/binary-husky/gpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97)
4. 如何在二级网址(如`http://localhost/subpath`)下运行
6. 使用WSL2Windows Subsystem for Linux 子系统)。
请访问[部署wiki-2](https://github.com/binary-husky/gpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
7. 如何在二级网址(如`http://localhost/subpath`)下运行。
请访问[FastAPI运行说明](docs/WithFastapi.md)
5. 使用docker-compose运行
请阅读docker-compose.yml后,按照其中的提示操作即可
---
# Advanced Usage
## 自定义新的便捷按钮 / 自定义函数插件
1. 自定义新的便捷按钮(学术快捷键)
# Advanced Usage
### I自定义新的便捷按钮学术快捷键
任意文本编辑器打开`core_functional.py`,添加条目如下,然后重启程序即可。(如果按钮已经添加成功并可见,那么前缀、后缀都支持热修改,无需重启程序即可生效。)
例如
```
@@ -210,50 +213,45 @@ docker-compose up
<img src="https://user-images.githubusercontent.com/96192199/226899272-477c2134-ed71-4326-810c-29891fe4a508.png" width="500" >
</div>
2. 自定义函数插件
### II自定义函数插件
编写强大的函数插件来执行任何你想得到的和想不到的任务。
本项目的插件编写、调试难度很低,只要您具备一定的python基础知识,就可以仿照我们提供的模板实现自己的插件功能。
详情请参考[函数插件指南](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)。
详情请参考[函数插件指南](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)。
---
# Latest Update
## 新功能动态
### I新功能动态
1. 对话保存功能。在函数插件区调用 `保存当前的对话` 即可将当前对话保存为可读+可复原的html文件,
另外在函数插件区(下拉菜单)调用 `载入对话历史存档` ,即可还原之前的会话。
Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史html存档缓存,点击 `删除所有本地对话历史记录` 可以删除所有html存档缓存
Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史html存档缓存。
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
</div>
2. 生成报告。大部分插件都会在执行结束后,生成工作报告
2. ⭐Latex/Arxiv论文翻译功能⭐
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="300" >
<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="300" >
<img src="https://user-images.githubusercontent.com/96192199/227504005-efeaefe0-b687-49d0-bf95-2d7b7e66c348.png" height="300" >
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/002a1a75-ace0-4e6a-94e2-ec1406a746f1" height="250" > ===>
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/9fdcc391-f823-464f-9322-f8719677043b" height="250" >
</div>
3. 模块化功能设计,简单的接口却能支持强大的功能
3. 生成报告。大部分插件都会在执行结束后,生成工作报告
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="250" >
<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="250" >
</div>
4. 模块化功能设计,简单的接口却能支持强大的功能
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/229288270-093643c1-0018-487a-81e6-1d7809b6e90f.png" height="400" >
<img src="https://user-images.githubusercontent.com/96192199/227504931-19955f78-45cd-4d1c-adac-e71e50957915.png" height="400" >
</div>
4. 这是一个能够“自我译解”的开源项目
5. 译解其他开源项目
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/226936850-c77d7183-0749-4c1c-9875-fd4891842d0c.png" width="500" >
</div>
5. 译解其他开源项目,不在话下
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" width="500" >
</div>
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" width="500" >
<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" height="250" >
<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" height="250" >
</div>
6. 装饰[live2d](https://github.com/fghrsh/live2d_demo)的小功能(默认关闭,需要修改`config.py`
@@ -278,13 +276,15 @@ Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史h
10. Latex全文校对纠错
<div align="center">
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" width="500" >
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="200" > ===>
<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="200">
</div>
## 版本:
### II版本:
- version 3.5(Todo): 使用自然语言调用本项目的所有函数插件(高优先级)
- version 3.4(Todo): 完善chatglm本地大模型的多线支持
- version 3.4: +arxiv论文翻译、latex论文批改功能
- version 3.3: +互联网信息综合功能
- version 3.2: 函数插件支持更多参数接口 (保存对话功能, 解读任意语言代码+同时询问任意的LLM组合)
- version 3.1: 支持同时问询多个gpt模型支持api2d,支持多个apikey负载均衡
@@ -302,29 +302,32 @@ gpt_academic开发者QQ群-2610599535
- 已知问题
- 某些浏览器翻译插件干扰此软件前端的运行
- gradio版本过高或过低,都会导致多种异常
- 官方Gradio目前有很多兼容性Bug,请务必使用`requirement.txt`安装Gradio
## 参考与学习
### III参考与学习
```
代码中参考了很多其他优秀项目中的设计,主要包括
代码中参考了很多其他优秀项目中的设计,顺序不分先后
# 项目1清华ChatGLM-6B:
# 清华ChatGLM-6B:
https://github.com/THUDM/ChatGLM-6B
# 项目2清华JittorLLMs:
# 清华JittorLLMs:
https://github.com/Jittor/JittorLLMs
# 项目3Edge-GPT:
https://github.com/acheong08/EdgeGPT
# 项目4ChuanhuChatGPT:
https://github.com/GaiZhenbiao/ChuanhuChatGPT
# 项目5ChatPaper:
# ChatPaper:
https://github.com/kaixindelele/ChatPaper
# 更多:
# Edge-GPT:
https://github.com/acheong08/EdgeGPT
# ChuanhuChatGPT:
https://github.com/GaiZhenbiao/ChuanhuChatGPT
# Oobabooga one-click installer:
https://github.com/oobabooga/one-click-installers
# More
https://github.com/gradio-app/gradio
https://github.com/fghrsh/live2d_demo
```

查看文件

@@ -12,6 +12,8 @@ def check_proxy(proxies):
result = f"代理配置 {proxies_https}, 代理所在地:{country}"
elif 'error' in data:
result = f"代理配置 {proxies_https}, 代理所在地未知,IP查询频率受限"
else:
result = f"代理配置 {proxies_https}, 代理数据解析失败:{data}"
print(result)
return result
except:

查看文件

@@ -34,58 +34,28 @@ def print亮紫(*kw,**kargs):
def print亮靛(*kw,**kargs):
print("\033[1;36m",*kw,"\033[0m",**kargs)
def print亮红(*kw,**kargs):
print("\033[1;31m",*kw,"\033[0m",**kargs)
def print亮绿(*kw,**kargs):
print("\033[1;32m",*kw,"\033[0m",**kargs)
def print亮黄(*kw,**kargs):
print("\033[1;33m",*kw,"\033[0m",**kargs)
def print亮蓝(*kw,**kargs):
print("\033[1;34m",*kw,"\033[0m",**kargs)
def print亮紫(*kw,**kargs):
print("\033[1;35m",*kw,"\033[0m",**kargs)
def print亮靛(*kw,**kargs):
print("\033[1;36m",*kw,"\033[0m",**kargs)
print_red = print红
print_green = print绿
print_yellow = print黄
print_blue = print蓝
print_purple = print紫
print_indigo = print靛
print_bold_red = print亮红
print_bold_green = print亮绿
print_bold_yellow = print亮黄
print_bold_blue = print亮蓝
print_bold_purple = print亮紫
print_bold_indigo = print亮靛
if not stdout.isatty():
# redirection, avoid a fucked up log file
print红 = print
print绿 = print
print黄 = print
print蓝 = print
print紫 = print
print靛 = print
print亮红 = print
print亮绿 = print
print亮黄 = print
print亮蓝 = print
print亮紫 = print
print亮靛 = print
print_red = print
print_green = print
print_yellow = print
print_blue = print
print_purple = print
print_indigo = print
print_bold_red = print
print_bold_green = print
print_bold_yellow = print
print_bold_blue = print
print_bold_purple = print
print_bold_indigo = print
# Do you like the elegance of Chinese characters?
def sprint红(*kw):
return "\033[0;31m"+' '.join(kw)+"\033[0m"
def sprint绿(*kw):
return "\033[0;32m"+' '.join(kw)+"\033[0m"
def sprint(*kw):
return "\033[0;33m"+' '.join(kw)+"\033[0m"
def sprint(*kw):
return "\033[0;34m"+' '.join(kw)+"\033[0m"
def sprint(*kw):
return "\033[0;35m"+' '.join(kw)+"\033[0m"
def sprint(*kw):
return "\033[0;36m"+' '.join(kw)+"\033[0m"
def sprint亮红(*kw):
return "\033[1;31m"+' '.join(kw)+"\033[0m"
def sprint亮绿(*kw):
return "\033[1;32m"+' '.join(kw)+"\033[0m"
def sprint亮黄(*kw):
return "\033[1;33m"+' '.join(kw)+"\033[0m"
def sprint亮蓝(*kw):
return "\033[1;34m"+' '.join(kw)+"\033[0m"
def sprint亮紫(*kw):
return "\033[1;35m"+' '.join(kw)+"\033[0m"
def sprint亮靛(*kw):
return "\033[1;36m"+' '.join(kw)+"\033[0m"

查看文件

@@ -1,16 +1,27 @@
# [step 1]>> 例如: API_KEY = "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" 此key无效
API_KEY = "sk-此处填API密钥" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey1,fkxxxx-api2dkey2"
"""
以下所有配置也都支持利用环境变量覆写,环境变量配置格式见docker-compose.yml。
读取优先级:环境变量 > config_private.py > config.py
--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
All the following configurations also support using environment variables to override,
and the environment variable configuration format can be seen in docker-compose.yml.
Configuration reading priority: environment variable > config_private.py > config.py
"""
# [step 1]>> API_KEY = "sk-123456789xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx123456789"。极少数情况下,还需要填写组织格式如org-123456789abcdefghijklmno的,请向下翻,找 API_ORG 设置项
API_KEY = "此处填API密钥" # 可同时填写多个API-KEY,用英文逗号分割,例如API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey3,azure-apikey4"
# [step 2]>> 改为True应用代理,如果直接在海外服务器部署,此处不修改
USE_PROXY = False
if USE_PROXY:
# 填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
# 例如 "socks5h://localhost:11284"
# [协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
# [地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了localhost意思是代理软件安装在本机上
# [端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
# 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5/http)、地址(localhost)和端口(11284)
"""
填写格式是 [协议]:// [地址] :[端口],填写之前不要忘记把USE_PROXY改成True,如果直接在海外服务器部署,此处不修改
<配置教程&视频教程> https://github.com/binary-husky/gpt_academic/issues/1>
[协议] 常见协议无非socks5h/http; 例如 v2**y 和 ss* 的默认本地协议是socks5h; 而cl**h 的默认本地协议是http
[地址] 懂的都懂,不懂就填localhost或者127.0.0.1肯定错不了localhost意思是代理软件安装在本机上
[端口] 在代理软件的设置里找。虽然不同的代理软件界面不一样,但端口号都应该在最显眼的位置上
"""
# 代理网络的地址,打开你的*学*网软件查看代理的协议(socks5h / http)、地址(localhost)和端口(11284)
proxies = {
# [协议]:// [地址] :[端口]
"http": "socks5h://localhost:11284", # 再例如 "http": "http://127.0.0.1:7890",
@@ -19,65 +30,92 @@ if USE_PROXY:
else:
proxies = None
# [step 3]>> 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
# 一言以蔽之免费用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询https://platform.openai.com/docs/guides/rate-limits/overview
# ------------------------------------ 以下配置可以优化体验, 但大部分场合下并不需要修改 ------------------------------------
# 重新URL重新定向,实现更换API_URL的作用常规情况下,不要修改!! 高危设置通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人
# 格式 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
# 例如 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions":"https://reverse-proxy-url/v1/chat/completions"}
API_URL_REDIRECT = {}
# 多线程函数插件中,默认允许多少路线程同时访问OpenAI。Free trial users的限制是每分钟3次,Pay-as-you-go users的限制是每分钟3500次
# 一言以蔽之免费5刀用户填3,OpenAI绑了信用卡的用户可以填 16 或者更高。提高限制请查询https://platform.openai.com/docs/guides/rate-limits/overview
DEFAULT_WORKER_NUM = 3
# [step 4]>> 以下配置可以优化体验,但大部分场合下并不需要修改
# 对话窗的高度
CHATBOT_HEIGHT = 1115
# 代码高亮
CODE_HIGHLIGHT = True
# 窗口布局
LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
DARK_MODE = True # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
LAYOUT = "LEFT-RIGHT" # "LEFT-RIGHT"(左右布局) # "TOP-DOWN"(上下布局)
DARK_MODE = True # 暗色模式 / 亮色模式
# 发送请求到OpenAI后,等待多久判定为超时
TIMEOUT_SECONDS = 30
# 网页的端口, -1代表随机端口
WEB_PORT = -1
# 如果OpenAI不响应网络卡顿、代理失败、KEY失效,重试的次数限制
MAX_RETRY = 2
# 模型选择是 (注意: LLM_MODEL是默认选中的模型, 同时它必须被包含在AVAIL_LLM_MODELS切换列表中 )
# 模型选择是 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "newbing-free", "stack-claude"]
# P.S. 其他可用的模型还包括 ["newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
# P.S. 其他可用的模型还包括 ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
# 本地LLM模型如ChatGLM的执行方式 CPU/GPU
LOCAL_MODEL_DEVICE = "cpu" # 可选 "cuda"
# 设置gradio的并行线程数不需要修改
CONCURRENT_COUNT = 100
# 是否在提交时自动清空输入框
AUTO_CLEAR_TXT = False
# 加一个live2d装饰
ADD_WAIFU = False
# 设置用户名和密码不需要修改相关功能不稳定,与gradio版本和网络都相关,如果本地使用不建议加这个
# [("username", "password"), ("username2", "password2"), ...]
AUTHENTICATION = []
# 重新URL重新定向,实现更换API_URL的作用常规情况下,不要修改!!
# 高危设置通过修改此设置,您将把您的API-KEY和对话隐私完全暴露给您设定的中间人
# 格式 {"https://api.openai.com/v1/chat/completions": "在这里填写重定向的api.openai.com的URL"}
# 例如 API_URL_REDIRECT = {"https://api.openai.com/v1/chat/completions": "https://ai.open.com/api/conversation"}
API_URL_REDIRECT = {}
# 如果需要在二级路径下运行(常规情况下,不要修改!!需要配合修改main.py才能生效!
CUSTOM_PATH = "/"
# 如果需要使用newbing,把newbing的长长的cookie放到这里
NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
# 从现在起,如果您调用"newbing-free"模型,则无需填写NEWBING_COOKIES
NEWBING_COOKIES = """
your bing cookies here
"""
# 极少数情况下,openai的官方KEY需要伴随组织编码格式如org-xxxxxxxxxxxxxxxxxxxxxxxx使用
API_ORG = ""
# 如果需要使用Slack Claude,使用教程详情见 request_llm/README.md
SLACK_CLAUDE_BOT_ID = ''
SLACK_CLAUDE_USER_TOKEN = ''
# 如果需要使用AZURE 详情请见额外文档 docs\use_azure.md
AZURE_ENDPOINT = "https://你亲手写的api名称.openai.azure.com/"
AZURE_API_KEY = "填入azure openai api的密钥" # 建议直接在API_KEY处填写,该选项即将被弃用
AZURE_ENGINE = "填入你亲手写的部署名" # 读 docs\use_azure.md
# 使用Newbing
NEWBING_STYLE = "creative" # ["creative", "balanced", "precise"]
NEWBING_COOKIES = """
put your new bing cookies here
"""

查看文件

@@ -63,6 +63,7 @@ def get_core_functions():
"Prefix": r"我需要你找一张网络图片。使用Unsplash API(https://source.unsplash.com/960x640/?<英语关键词>)获取图片URL," +
r"然后请使用Markdown格式封装,并且不要有反斜线,不要用代码块。现在,请按以下描述给我发送图片" + "\n\n",
"Suffix": r"",
"Visible": False,
},
"解释代码": {
"Prefix": r"请解释以下代码:" + "\n```\n",
@@ -73,6 +74,5 @@ def get_core_functions():
r"Note that, reference styles maybe more than one kind, you should transform each item correctly." +
r"Items need to be transformed:",
"Suffix": r"",
"Visible": False,
}
}

查看文件

@@ -112,11 +112,11 @@ def get_crazy_functions():
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(解析项目本身)
},
"[老旧的Demo] 把本项目源代码切换成全英文": {
# HotReload 的意思是热更新,修改函数插件代码后,不需要重启程序,代码直接生效
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(全项目切换英文)
},
# "[老旧的Demo] 把本项目源代码切换成全英文": {
# # HotReload 的意思是热更新,修改函数插件代码后,不需要重启程序,代码直接生效
# "AsButton": False, # 加入下拉菜单中
# "Function": HotReload(全项目切换英文)
# },
"[插件demo] 历史上的今天": {
# HotReload 的意思是热更新,修改函数插件代码后,不需要重启程序,代码直接生效
"Function": HotReload(高阶功能模板函数)
@@ -126,7 +126,7 @@ def get_crazy_functions():
###################### 第二组插件 ###########################
# [第二组插件]: 经过充分测试
from crazy_functions.批量总结PDF文档 import 批量总结PDF文档
from crazy_functions.批量总结PDF文档pdfminer import 批量总结PDF文档pdfminer
# from crazy_functions.批量总结PDF文档pdfminer import 批量总结PDF文档pdfminer
from crazy_functions.批量翻译PDF文档_多线程 import 批量翻译PDF文档
from crazy_functions.谷歌检索小助手 import 谷歌检索小助手
from crazy_functions.理解PDF文档内容 import 理解PDF文档内容标准文件输入
@@ -152,17 +152,16 @@ def get_crazy_functions():
# HotReload 的意思是热更新,修改函数插件代码后,不需要重启程序,代码直接生效
"Function": HotReload(批量总结PDF文档)
},
"[测试功能] 批量总结PDF文档pdfminer": {
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(批量总结PDF文档pdfminer)
},
# "[测试功能] 批量总结PDF文档pdfminer": {
# "Color": "stop",
# "AsButton": False, # 加入下拉菜单中
# "Function": HotReload(批量总结PDF文档pdfminer)
# },
"谷歌学术检索助手输入谷歌学术搜索页url": {
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(谷歌检索小助手)
},
"理解PDF文档内容 模仿ChatPDF": {
# HotReload 的意思是热更新,修改函数插件代码后,不需要重启程序,代码直接生效
"Color": "stop",
@@ -181,7 +180,7 @@ def get_crazy_functions():
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(Latex英文纠错)
},
"[测试功能] 中文Latex项目全文润色输入路径或上传压缩包": {
"中文Latex项目全文润色输入路径或上传压缩包": {
# HotReload 的意思是热更新,修改函数插件代码后,不需要重启程序,代码直接生效
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
@@ -210,65 +209,96 @@ def get_crazy_functions():
})
###################### 第三组插件 ###########################
# [第三组插件]: 尚未充分测试的函数插件,放在这里
from crazy_functions.下载arxiv论文翻译摘要 import 下载arxiv论文并翻译摘要
function_plugins.update({
"一键下载arxiv论文并翻译摘要先在input输入编号,如1812.10695": {
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(下载arxiv论文并翻译摘要)
}
})
# [第三组插件]: 尚未充分测试的函数插件
from crazy_functions.联网的ChatGPT import 连接网络回答问题
function_plugins.update({
"连接网络回答问题(先输入问题,再点击按钮,需要访问谷歌)": {
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(连接网络回答问题)
}
})
try:
from crazy_functions.下载arxiv论文翻译摘要 import 下载arxiv论文并翻译摘要
function_plugins.update({
"一键下载arxiv论文并翻译摘要先在input输入编号,如1812.10695": {
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(下载arxiv论文并翻译摘要)
}
})
except:
print('Load function plugin failed')
try:
from crazy_functions.联网的ChatGPT import 连接网络回答问题
function_plugins.update({
"连接网络回答问题(输入问题后点击该插件,需要访问谷歌)": {
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(连接网络回答问题)
}
})
from crazy_functions.联网的ChatGPT_bing版 import 连接bing搜索回答问题
function_plugins.update({
"连接网络回答问题中文Bing版,输入问题后点击该插件": {
"Color": "stop",
"AsButton": False, # 加入下拉菜单中
"Function": HotReload(连接bing搜索回答问题)
}
})
except:
print('Load function plugin failed')
try:
from crazy_functions.解析项目源代码 import 解析任意code项目
function_plugins.update({
"解析项目源代码(手动指定和筛选源代码文件类型)": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True, # 调用时,唤起高级参数输入区默认False
"ArgsReminder": "输入时用逗号隔开, *代表通配符, 加了^代表不匹配; 不输入代表全部匹配。例如: \"*.c, ^*.cpp, config.toml, ^*.toml\"", # 高级参数输入区的显示提示
"Function": HotReload(解析任意code项目)
},
})
except:
print('Load function plugin failed')
try:
from crazy_functions.询问多个大语言模型 import 同时问询_指定模型
function_plugins.update({
"询问多个GPT模型手动指定询问哪些模型": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True, # 调用时,唤起高级参数输入区默认False
"ArgsReminder": "支持任意数量的llm接口,用&符号分隔。例如chatglm&gpt-3.5-turbo&api2d-gpt-4", # 高级参数输入区的显示提示
"Function": HotReload(同时问询_指定模型)
},
})
except:
print('Load function plugin failed')
try:
from crazy_functions.图片生成 import 图片生成
function_plugins.update({
"图片生成先切换模型到openai或api2d": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True, # 调用时,唤起高级参数输入区默认False
"ArgsReminder": "在这里输入分辨率, 如256x256默认", # 高级参数输入区的显示提示
"Function": HotReload(图片生成)
},
})
except:
print('Load function plugin failed')
try:
from crazy_functions.总结音视频 import 总结音视频
function_plugins.update({
"批量总结音视频(输入路径或上传压缩包)": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder": "调用openai api 使用whisper-1模型, 目前支持的格式:mp4, m4a, wav, mpga, mpeg, mp3。此处可以输入解析提示,例如解析为简体中文默认",
"Function": HotReload(总结音视频)
}
})
except:
print('Load function plugin failed')
from crazy_functions.解析项目源代码 import 解析任意code项目
function_plugins.update({
"解析项目源代码(手动指定和筛选源代码文件类型)": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True, # 调用时,唤起高级参数输入区默认False
"ArgsReminder": "输入时用逗号隔开, *代表通配符, 加了^代表不匹配; 不输入代表全部匹配。例如: \"*.c, ^*.cpp, config.toml, ^*.toml\"", # 高级参数输入区的显示提示
"Function": HotReload(解析任意code项目)
},
})
from crazy_functions.询问多个大语言模型 import 同时问询_指定模型
function_plugins.update({
"询问多个GPT模型手动指定询问哪些模型": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True, # 调用时,唤起高级参数输入区默认False
"ArgsReminder": "支持任意数量的llm接口,用&符号分隔。例如chatglm&gpt-3.5-turbo&api2d-gpt-4", # 高级参数输入区的显示提示
"Function": HotReload(同时问询_指定模型)
},
})
from crazy_functions.图片生成 import 图片生成
function_plugins.update({
"图片生成先切换模型到openai或api2d": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True, # 调用时,唤起高级参数输入区默认False
"ArgsReminder": "在这里输入分辨率, 如256x256默认", # 高级参数输入区的显示提示
"Function": HotReload(图片生成)
},
})
from crazy_functions.总结音视频 import 总结音视频
function_plugins.update({
"批量总结音视频(输入路径或上传压缩包)": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder": "调用openai api 使用whisper-1模型, 目前支持的格式:mp4, m4a, wav, mpga, mpeg, mp3。此处可以输入解析提示,例如解析为简体中文默认",
"Function": HotReload(总结音视频)
}
})
try:
from crazy_functions.数学动画生成manim import 动画生成
function_plugins.update({
@@ -295,5 +325,95 @@ def get_crazy_functions():
except:
print('Load function plugin failed')
###################### 第n组插件 ###########################
try:
from crazy_functions.Langchain知识库 import 知识库问答
function_plugins.update({
"[功能尚不稳定] 构建知识库(请先上传文件素材)": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder": "待注入的知识库名称id, 默认为default",
"Function": HotReload(知识库问答)
}
})
except:
print('Load function plugin failed')
try:
from crazy_functions.Langchain知识库 import 读取知识库作答
function_plugins.update({
"[功能尚不稳定] 知识库问答": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder": "待提取的知识库名称id, 默认为default, 您需要首先调用构建知识库",
"Function": HotReload(读取知识库作答)
}
})
except:
print('Load function plugin failed')
try:
from crazy_functions.交互功能函数模板 import 交互功能模板函数
function_plugins.update({
"交互功能模板函数": {
"Color": "stop",
"AsButton": False,
"Function": HotReload(交互功能模板函数)
}
})
except:
print('Load function plugin failed')
try:
from crazy_functions.Latex输出PDF结果 import Latex英文纠错加PDF对比
function_plugins.update({
"Latex英文纠错+高亮修正位置 [需Latex]": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder": "如果有必要, 请在此处追加更细致的矫错指令(使用英文)。",
"Function": HotReload(Latex英文纠错加PDF对比)
}
})
from crazy_functions.Latex输出PDF结果 import Latex翻译中文并重新编译PDF
function_plugins.update({
"Arixv翻译输入arxivID[需Latex]": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder":
"如果有必要, 请在此处给出自定义翻译命令, 解决部分词汇翻译不准确的问题。 "+
"例如当单词'agent'翻译不准确时, 请尝试把以下指令复制到高级参数区: " + 'If the term "agent" is used in this section, it should be translated to "智能体". ',
"Function": HotReload(Latex翻译中文并重新编译PDF)
}
})
function_plugins.update({
"本地论文翻译上传Latex压缩包[需Latex]": {
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder":
"如果有必要, 请在此处给出自定义翻译命令, 解决部分词汇翻译不准确的问题。 "+
"例如当单词'agent'翻译不准确时, 请尝试把以下指令复制到高级参数区: " + 'If the term "agent" is used in this section, it should be translated to "智能体". ',
"Function": HotReload(Latex翻译中文并重新编译PDF)
}
})
except:
print('Load function plugin failed')
# try:
# from crazy_functions.虚空终端 import 终端
# function_plugins.update({
# "超级终端": {
# "Color": "stop",
# "AsButton": False,
# # "AdvancedArgs": True,
# # "ArgsReminder": "",
# "Function": HotReload(终端)
# }
# })
# except:
# print('Load function plugin failed')
return function_plugins

查看文件

@@ -0,0 +1,107 @@
from toolbox import CatchException, update_ui, ProxyNetworkActivate
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive, get_files_from_everything
@CatchException
def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本,例如需要翻译的一段话,再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数, 如温度和top_p等, 一般原样传递下去就行
plugin_kwargs 插件模型的参数,暂时没有用武之地
chatbot 聊天显示框的句柄,用于显示给用户
history 聊天历史,前情提要
system_prompt 给gpt的静默提醒
web_port 当前软件运行的端口号
"""
history = [] # 清空历史,以免输入溢出
chatbot.append(("这是什么功能?", "[Local Message] 从一批文件(txt, md, tex)中读取数据构建知识库, 然后进行问答。"))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# resolve deps
try:
from zh_langchain import construct_vector_store
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from .crazy_utils import knowledge_archive_interface
except Exception as e:
chatbot.append(
["依赖不足",
"导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."]
)
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
from .crazy_utils import try_install_deps
try_install_deps(['zh_langchain==0.2.1'])
# < --------------------读取参数--------------- >
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
kai_id = plugin_kwargs.get("advanced_arg", 'default')
# < --------------------读取文件--------------- >
file_manifest = []
spl = ["txt", "doc", "docx", "email", "epub", "html", "json", "md", "msg", "pdf", "ppt", "pptx", "rtf"]
for sp in spl:
_, file_manifest_tmp, _ = get_files_from_everything(txt, type=f'.{sp}')
file_manifest += file_manifest_tmp
if len(file_manifest) == 0:
chatbot.append(["没有找到任何可读取文件", "当前支持的格式包括: txt, md, docx, pptx, pdf, json等"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# < -------------------预热文本向量化模组--------------- >
chatbot.append(['<br/>'.join(file_manifest), "正在预热文本向量化模组, 如果是第一次运行, 将消耗较长时间下载中文向量化模型..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
print('Checking Text2vec ...')
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
with ProxyNetworkActivate(): # 临时地激活代理网络
HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
# < -------------------构建知识库--------------- >
chatbot.append(['<br/>'.join(file_manifest), "正在构建知识库..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
print('Establishing knowledge archive ...')
with ProxyNetworkActivate(): # 临时地激活代理网络
kai = knowledge_archive_interface()
kai.feed_archive(file_manifest=file_manifest, id=kai_id)
kai_files = kai.get_loaded_file()
kai_files = '<br/>'.join(kai_files)
# chatbot.append(['知识库构建成功', "正在将知识库存储至cookie中"])
# yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# chatbot._cookies['langchain_plugin_embedding'] = kai.get_current_archive_id()
# chatbot._cookies['lock_plugin'] = 'crazy_functions.Langchain知识库->读取知识库作答'
# chatbot.append(['完成', "“根据知识库作答”函数插件已经接管问答系统, 提问吧! 但注意, 您接下来不能再使用其他插件了,刷新页面即可以退出知识库问答模式。"])
chatbot.append(['构建完成', f"当前知识库内的有效文件:\n\n---\n\n{kai_files}\n\n---\n\n请切换至“知识库问答”插件进行知识库访问, 或者使用此插件继续上传更多文件。"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间,我们先及时地做一次界面更新
@CatchException
def 读取知识库作答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port=-1):
# resolve deps
try:
from zh_langchain import construct_vector_store
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from .crazy_utils import knowledge_archive_interface
except Exception as e:
chatbot.append(["依赖不足", "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
from .crazy_utils import try_install_deps
try_install_deps(['zh_langchain==0.2.1'])
# < ------------------- --------------- >
kai = knowledge_archive_interface()
if 'langchain_plugin_embedding' in chatbot._cookies:
resp, prompt = kai.answer_with_archive_by_id(txt, chatbot._cookies['langchain_plugin_embedding'])
else:
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
kai_id = plugin_kwargs.get("advanced_arg", 'default')
resp, prompt = kai.answer_with_archive_by_id(txt, kai_id)
chatbot.append((txt, '[Local Message] ' + prompt))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间,我们先及时地做一次界面更新
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=prompt, inputs_show_user=txt,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
sys_prompt=system_prompt
)
history.extend((prompt, gpt_say))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间,我们先及时地做一次界面更新

查看文件

@@ -238,3 +238,6 @@ def Latex英文纠错(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_p
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
yield from 多文件润色(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, language='en', mode='proofread')

查看文件

@@ -0,0 +1,300 @@
from toolbox import update_ui, trimmed_format_exc, get_conf, objdump, objload, promote_file_to_downloadzone
from toolbox import CatchException, report_execption, update_ui_lastest_msg, zip_result, gen_time_str
from functools import partial
import glob, os, requests, time
pj = os.path.join
ARXIV_CACHE_DIR = os.path.expanduser(f"~/arxiv_cache/")
# =================================== 工具函数 ===============================================
专业词汇声明 = 'If the term "agent" is used in this section, it should be translated to "智能体". '
def switch_prompt(pfg, mode, more_requirement):
"""
Generate prompts and system prompts based on the mode for proofreading or translating.
Args:
- pfg: Proofreader or Translator instance.
- mode: A string specifying the mode, either 'proofread' or 'translate_zh'.
Returns:
- inputs_array: A list of strings containing prompts for users to respond to.
- sys_prompt_array: A list of strings containing prompts for system prompts.
"""
n_split = len(pfg.sp_file_contents)
if mode == 'proofread_en':
inputs_array = [r"Below is a section from an academic paper, proofread this section." +
r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " + more_requirement +
r"Answer me only with the revised text:" +
f"\n\n{frag}" for frag in pfg.sp_file_contents]
sys_prompt_array = ["You are a professional academic paper writer." for _ in range(n_split)]
elif mode == 'translate_zh':
inputs_array = [r"Below is a section from an English academic paper, translate it into Chinese. " + more_requirement +
r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " +
r"Answer me only with the translated text:" +
f"\n\n{frag}" for frag in pfg.sp_file_contents]
sys_prompt_array = ["You are a professional translator." for _ in range(n_split)]
else:
assert False, "未知指令"
return inputs_array, sys_prompt_array
def desend_to_extracted_folder_if_exist(project_folder):
"""
Descend into the extracted folder if it exists, otherwise return the original folder.
Args:
- project_folder: A string specifying the folder path.
Returns:
- A string specifying the path to the extracted folder, or the original folder if there is no extracted folder.
"""
maybe_dir = [f for f in glob.glob(f'{project_folder}/*') if os.path.isdir(f)]
if len(maybe_dir) == 0: return project_folder
if maybe_dir[0].endswith('.extract'): return maybe_dir[0]
return project_folder
def move_project(project_folder, arxiv_id=None):
"""
Create a new work folder and copy the project folder to it.
Args:
- project_folder: A string specifying the folder path of the project.
Returns:
- A string specifying the path to the new work folder.
"""
import shutil, time
time.sleep(2) # avoid time string conflict
if arxiv_id is not None:
new_workfolder = pj(ARXIV_CACHE_DIR, arxiv_id, 'workfolder')
else:
new_workfolder = f'gpt_log/{gen_time_str()}'
try:
shutil.rmtree(new_workfolder)
except:
pass
# align subfolder if there is a folder wrapper
items = glob.glob(pj(project_folder,'*'))
if len(glob.glob(pj(project_folder,'*.tex'))) == 0 and len(items) == 1:
if os.path.isdir(items[0]): project_folder = items[0]
shutil.copytree(src=project_folder, dst=new_workfolder)
return new_workfolder
def arxiv_download(chatbot, history, txt):
def check_cached_translation_pdf(arxiv_id):
translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'translation')
if not os.path.exists(translation_dir):
os.makedirs(translation_dir)
target_file = pj(translation_dir, 'translate_zh.pdf')
if os.path.exists(target_file):
promote_file_to_downloadzone(target_file, rename_file=None, chatbot=chatbot)
return target_file
return False
def is_float(s):
try:
float(s)
return True
except ValueError:
return False
if ('.' in txt) and ('/' not in txt) and is_float(txt): # is arxiv ID
txt = 'https://arxiv.org/abs/' + txt.strip()
if ('.' in txt) and ('/' not in txt) and is_float(txt[:10]): # is arxiv ID
txt = 'https://arxiv.org/abs/' + txt[:10]
if not txt.startswith('https://arxiv.org'):
return txt, None
# <-------------- inspect format ------------->
chatbot.append([f"检测到arxiv文档连接", '尝试下载 ...'])
yield from update_ui(chatbot=chatbot, history=history)
time.sleep(1) # 刷新界面
url_ = txt # https://arxiv.org/abs/1707.06690
if not txt.startswith('https://arxiv.org/abs/'):
msg = f"解析arxiv网址失败, 期望格式例如: https://arxiv.org/abs/1707.06690。实际得到格式: {url_}"
yield from update_ui_lastest_msg(msg, chatbot=chatbot, history=history) # 刷新界面
return msg, None
# <-------------- set format ------------->
arxiv_id = url_.split('/abs/')[-1]
if 'v' in arxiv_id: arxiv_id = arxiv_id[:10]
cached_translation_pdf = check_cached_translation_pdf(arxiv_id)
if cached_translation_pdf: return cached_translation_pdf, arxiv_id
url_tar = url_.replace('/abs/', '/e-print/')
translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'e-print')
extract_dst = pj(ARXIV_CACHE_DIR, arxiv_id, 'extract')
os.makedirs(translation_dir, exist_ok=True)
# <-------------- download arxiv source file ------------->
dst = pj(translation_dir, arxiv_id+'.tar')
if os.path.exists(dst):
yield from update_ui_lastest_msg("调用缓存", chatbot=chatbot, history=history) # 刷新界面
else:
yield from update_ui_lastest_msg("开始下载", chatbot=chatbot, history=history) # 刷新界面
proxies, = get_conf('proxies')
r = requests.get(url_tar, proxies=proxies)
with open(dst, 'wb+') as f:
f.write(r.content)
# <-------------- extract file ------------->
yield from update_ui_lastest_msg("下载完成", chatbot=chatbot, history=history) # 刷新界面
from toolbox import extract_archive
extract_archive(file_path=dst, dest_dir=extract_dst)
return extract_dst, arxiv_id
# ========================================= 插件主程序1 =====================================================
@CatchException
def Latex英文纠错加PDF对比(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
# <-------------- information about this plugin ------------->
chatbot.append([ "函数插件功能?",
"对整个Latex项目进行纠错, 用latex编译为PDF对修正处做高亮。函数插件贡献者: Binary-Husky。注意事项: 目前仅支持GPT3.5/GPT4,其他模型转化效果未知。目前对机器学习类文献转化效果最好,其他类型文献转化效果未知。仅在Windows系统进行了测试,其他操作系统表现未知。"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# <-------------- more requirements ------------->
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
more_req = plugin_kwargs.get("advanced_arg", "")
_switch_prompt_ = partial(switch_prompt, more_requirement=more_req)
# <-------------- check deps ------------->
try:
import glob, os, time, subprocess
subprocess.Popen(['pdflatex', '-version'])
from .latex_utils import Latex精细分解与转化, 编译Latex
except Exception as e:
chatbot.append([ f"解析项目: {txt}",
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# <-------------- clear history and read input ------------->
history = []
if os.path.exists(txt):
project_folder = txt
else:
if txt == "": txt = '空空如也的输入栏'
report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到本地项目或无权访问: {txt}")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.tex', recursive=True)]
if len(file_manifest) == 0:
report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到任何.tex文件: {txt}")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# <-------------- if is a zip/tar file ------------->
project_folder = desend_to_extracted_folder_if_exist(project_folder)
# <-------------- move latex project away from temp folder ------------->
project_folder = move_project(project_folder, arxiv_id=None)
# <-------------- if merge_translate_zh is already generated, skip gpt req ------------->
if not os.path.exists(project_folder + '/merge_proofread_en.tex'):
yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
chatbot, history, system_prompt, mode='proofread_en', switch_prompt=_switch_prompt_)
# <-------------- compile PDF ------------->
success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_proofread_en',
work_folder_original=project_folder, work_folder_modified=project_folder, work_folder=project_folder)
# <-------------- zip PDF ------------->
zip_res = zip_result(project_folder)
if success:
chatbot.append((f"成功啦", '请查收结果(压缩包)...'))
yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
else:
chatbot.append((f"失败了", '虽然PDF生成失败了, 但请查收结果(压缩包), 内含已经翻译的Tex文档, 也是可读的, 您可以到Github Issue区, 用该压缩包+对话历史存档进行反馈 ...'))
yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
# <-------------- we are done ------------->
return success
# ========================================= 插件主程序2 =====================================================
@CatchException
def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
# <-------------- information about this plugin ------------->
chatbot.append([
"函数插件功能?",
"对整个Latex项目进行翻译, 生成中文PDF。函数插件贡献者: Binary-Husky。注意事项: 此插件Windows支持最佳,Linux下必须使用Docker安装,详见项目主README.md。目前仅支持GPT3.5/GPT4,其他模型转化效果未知。目前对机器学习类文献转化效果最好,其他类型文献转化效果未知。"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# <-------------- more requirements ------------->
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
more_req = plugin_kwargs.get("advanced_arg", "")
_switch_prompt_ = partial(switch_prompt, more_requirement=more_req)
# <-------------- check deps ------------->
try:
import glob, os, time, subprocess
subprocess.Popen(['pdflatex', '-version'])
from .latex_utils import Latex精细分解与转化, 编译Latex
except Exception as e:
chatbot.append([ f"解析项目: {txt}",
f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# <-------------- clear history and read input ------------->
history = []
txt, arxiv_id = yield from arxiv_download(chatbot, history, txt)
if txt.endswith('.pdf'):
report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"发现已经存在翻译好的PDF文档")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
if os.path.exists(txt):
project_folder = txt
else:
if txt == "": txt = '空空如也的输入栏'
report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到本地项目或无权访问: {txt}")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.tex', recursive=True)]
if len(file_manifest) == 0:
report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到任何.tex文件: {txt}")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# <-------------- if is a zip/tar file ------------->
project_folder = desend_to_extracted_folder_if_exist(project_folder)
# <-------------- move latex project away from temp folder ------------->
project_folder = move_project(project_folder, arxiv_id)
# <-------------- if merge_translate_zh is already generated, skip gpt req ------------->
if not os.path.exists(project_folder + '/merge_translate_zh.tex'):
yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
chatbot, history, system_prompt, mode='translate_zh', switch_prompt=_switch_prompt_)
# <-------------- compile PDF ------------->
success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_translate_zh', mode='translate_zh',
work_folder_original=project_folder, work_folder_modified=project_folder, work_folder=project_folder)
# <-------------- zip PDF ------------->
zip_res = zip_result(project_folder)
if success:
chatbot.append((f"成功啦", '请查收结果(压缩包)...'))
yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
else:
chatbot.append((f"失败了", '虽然PDF生成失败了, 但请查收结果(压缩包), 内含已经翻译的Tex文档, 也是可读的, 您可以到Github Issue区, 用该压缩包+对话历史存档进行反馈 ...'))
yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
# <-------------- we are done ------------->
return success

查看文件

@@ -4,16 +4,24 @@
运行方法 python crazy_functions/crazy_functions_test.py
"""
# ==============================================================================================================================
def validate_path():
import os, sys
dir_name = os.path.dirname(__file__)
root_dir_assume = os.path.abspath(os.path.dirname(__file__) + '/..')
os.chdir(root_dir_assume)
sys.path.append(root_dir_assume)
validate_path() # validate path so you can run from base directory
# ==============================================================================================================================
from colorful import *
from toolbox import get_conf, ChatBotWithCookies
import contextlib
import os
import sys
from functools import wraps
proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION, CHATBOT_HEIGHT, LAYOUT, API_KEY = \
get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION', 'CHATBOT_HEIGHT', 'LAYOUT', 'API_KEY')
@@ -30,7 +38,43 @@ history = []
system_prompt = "Serve me as a writing and programming assistant."
web_port = 1024
# ==============================================================================================================================
def silence_stdout(func):
@wraps(func)
def wrapper(*args, **kwargs):
_original_stdout = sys.stdout
sys.stdout = open(os.devnull, 'w')
for q in func(*args, **kwargs):
sys.stdout = _original_stdout
yield q
sys.stdout = open(os.devnull, 'w')
sys.stdout.close()
sys.stdout = _original_stdout
return wrapper
class CLI_Printer():
def __init__(self) -> None:
self.pre_buf = ""
def print(self, buf):
bufp = ""
for index, chat in enumerate(buf):
a, b = chat
bufp += sprint亮靛('[Me]:' + a) + '\n'
bufp += '[GPT]:' + b
if index < len(buf)-1:
bufp += '\n'
if self.pre_buf!="" and bufp.startswith(self.pre_buf):
print(bufp[len(self.pre_buf):], end='')
else:
print('\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n'+bufp, end='')
self.pre_buf = bufp
return
cli_printer = CLI_Printer()
# ==============================================================================================================================
def test_解析一个Python项目():
from crazy_functions.解析项目源代码 import 解析一个Python项目
txt = "crazy_functions/test_project/python/dqn"
@@ -116,6 +160,57 @@ def test_Markdown多语言():
for cookies, cb, hist, msg in Markdown翻译指定语言(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
print(cb)
def test_Langchain知识库():
from crazy_functions.Langchain知识库 import 知识库问答
txt = "./"
chatbot = ChatBotWithCookies(llm_kwargs)
for cookies, cb, hist, msg in silence_stdout(知识库问答)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
cli_printer.print(cb) # print(cb)
chatbot = ChatBotWithCookies(cookies)
from crazy_functions.Langchain知识库 import 读取知识库作答
txt = "What is the installation method?"
for cookies, cb, hist, msg in silence_stdout(读取知识库作答)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
cli_printer.print(cb) # print(cb)
def test_Langchain知识库读取():
from crazy_functions.Langchain知识库 import 读取知识库作答
txt = "远程云服务器部署?"
for cookies, cb, hist, msg in silence_stdout(读取知识库作答)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
cli_printer.print(cb) # print(cb)
def test_Latex():
from crazy_functions.Latex输出PDF结果 import Latex英文纠错加PDF对比, Latex翻译中文并重新编译PDF
# txt = r"https://arxiv.org/abs/1706.03762"
# txt = r"https://arxiv.org/abs/1902.03185"
# txt = r"https://arxiv.org/abs/2305.18290"
# txt = r"https://arxiv.org/abs/2305.17608"
# txt = r"https://arxiv.org/abs/2211.16068" # ACE
# txt = r"C:\Users\x\arxiv_cache\2211.16068\workfolder" # ACE
# txt = r"https://arxiv.org/abs/2002.09253"
# txt = r"https://arxiv.org/abs/2306.07831"
# txt = r"https://arxiv.org/abs/2212.10156"
# txt = r"https://arxiv.org/abs/2211.11559"
# txt = r"https://arxiv.org/abs/2303.08774"
# txt = r"https://arxiv.org/abs/2303.12712"
# txt = r"C:\Users\fuqingxu\arxiv_cache\2303.12712\workfolder"
txt = r"2306.17157" # 这个paper有个input命令文件名大小写错误
for cookies, cb, hist, msg in (Latex翻译中文并重新编译PDF)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
cli_printer.print(cb) # print(cb)
# txt = "2302.02948.tar"
# print(txt)
# main_tex, work_folder = Latex预处理(txt)
# print('main tex:', main_tex)
# res = 编译Latex(main_tex, work_folder)
# # for cookies, cb, hist, msg in silence_stdout(编译Latex)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
# cli_printer.print(cb) # print(cb)
# test_解析一个Python项目()
@@ -129,7 +224,9 @@ def test_Markdown多语言():
# test_联网回答问题()
# test_解析ipynb文件()
# test_数学动画生成manim()
test_Markdown多语言()
input("程序完成,回车退出。")
print("退出。")
# test_Langchain知识库()
# test_Langchain知识库读取()
if __name__ == "__main__":
test_Latex()
input("程序完成,回车退出。")
print("退出。")

查看文件

@@ -1,4 +1,5 @@
from toolbox import update_ui, get_conf, trimmed_format_exc
import threading
def input_clipping(inputs, history, max_token_limit):
import numpy as np
@@ -129,6 +130,11 @@ def request_gpt_model_in_new_thread_with_ui_alive(
yield from update_ui(chatbot=chatbot, history=[]) # 如果最后成功了,则删除报错信息
return final_result
def can_multi_process(llm):
if llm.startswith('gpt-'): return True
if llm.startswith('api2d-'): return True
if llm.startswith('azure-'): return True
return False
def request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
inputs_array, inputs_show_user_array, llm_kwargs,
@@ -174,7 +180,7 @@ def request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
except: max_workers = 8
if max_workers <= 0: max_workers = 3
# 屏蔽掉 chatglm的多线程,可能会导致严重卡顿
if not (llm_kwargs['llm_model'].startswith('gpt-') or llm_kwargs['llm_model'].startswith('api2d-')):
if not can_multi_process(llm_kwargs['llm_model']):
max_workers = 1
executor = ThreadPoolExecutor(max_workers=max_workers)
@@ -606,3 +612,142 @@ def get_files_from_everything(txt, type): # type='.md'
success = False
return success, file_manifest, project_folder
def Singleton(cls):
_instance = {}
def _singleton(*args, **kargs):
if cls not in _instance:
_instance[cls] = cls(*args, **kargs)
return _instance[cls]
return _singleton
@Singleton
class knowledge_archive_interface():
def __init__(self) -> None:
self.threadLock = threading.Lock()
self.current_id = ""
self.kai_path = None
self.qa_handle = None
self.text2vec_large_chinese = None
def get_chinese_text2vec(self):
if self.text2vec_large_chinese is None:
# < -------------------预热文本向量化模组--------------- >
from toolbox import ProxyNetworkActivate
print('Checking Text2vec ...')
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
with ProxyNetworkActivate(): # 临时地激活代理网络
self.text2vec_large_chinese = HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
return self.text2vec_large_chinese
def feed_archive(self, file_manifest, id="default"):
self.threadLock.acquire()
# import uuid
self.current_id = id
from zh_langchain import construct_vector_store
self.qa_handle, self.kai_path = construct_vector_store(
vs_id=self.current_id,
files=file_manifest,
sentence_size=100,
history=[],
one_conent="",
one_content_segmentation="",
text2vec = self.get_chinese_text2vec(),
)
self.threadLock.release()
def get_current_archive_id(self):
return self.current_id
def get_loaded_file(self):
return self.qa_handle.get_loaded_file()
def answer_with_archive_by_id(self, txt, id):
self.threadLock.acquire()
if not self.current_id == id:
self.current_id = id
from zh_langchain import construct_vector_store
self.qa_handle, self.kai_path = construct_vector_store(
vs_id=self.current_id,
files=[],
sentence_size=100,
history=[],
one_conent="",
one_content_segmentation="",
text2vec = self.get_chinese_text2vec(),
)
VECTOR_SEARCH_SCORE_THRESHOLD = 0
VECTOR_SEARCH_TOP_K = 4
CHUNK_SIZE = 512
resp, prompt = self.qa_handle.get_knowledge_based_conent_test(
query = txt,
vs_path = self.kai_path,
score_threshold=VECTOR_SEARCH_SCORE_THRESHOLD,
vector_search_top_k=VECTOR_SEARCH_TOP_K,
chunk_conent=True,
chunk_size=CHUNK_SIZE,
text2vec = self.get_chinese_text2vec(),
)
self.threadLock.release()
return resp, prompt
def try_install_deps(deps):
for dep in deps:
import subprocess, sys
subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--user', dep])
class construct_html():
def __init__(self) -> None:
self.css = """
.row {
display: flex;
flex-wrap: wrap;
}
.column {
flex: 1;
padding: 10px;
}
.table-header {
font-weight: bold;
border-bottom: 1px solid black;
}
.table-row {
border-bottom: 1px solid lightgray;
}
.table-cell {
padding: 5px;
}
"""
self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
def add_row(self, a, b):
tmp = """
<div class="row table-row">
<div class="column table-cell">REPLACE_A</div>
<div class="column table-cell">REPLACE_B</div>
</div>
"""
from toolbox import markdown_convertion
tmp = tmp.replace('REPLACE_A', markdown_convertion(a))
tmp = tmp.replace('REPLACE_B', markdown_convertion(b))
self.html_string += tmp
def save_file(self, file_name):
with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
f.write(self.html_string.encode('utf-8', 'ignore').decode())

查看文件

@@ -0,0 +1,797 @@
from toolbox import update_ui, update_ui_lastest_msg # 刷新Gradio前端界面
from toolbox import zip_folder, objdump, objload, promote_file_to_downloadzone
import os, shutil
import re
import numpy as np
pj = os.path.join
"""
========================================================================
Part One
Latex segmentation with a binary mask (PRESERVE=0, TRANSFORM=1)
========================================================================
"""
PRESERVE = 0
TRANSFORM = 1
def set_forbidden_text(text, mask, pattern, flags=0):
"""
Add a preserve text area in this paper
e.g. with pattern = r"\\begin\{algorithm\}(.*?)\\end\{algorithm\}"
you can mask out (mask = PRESERVE so that text become untouchable for GPT)
everything between "\begin{equation}" and "\end{equation}"
"""
if isinstance(pattern, list): pattern = '|'.join(pattern)
pattern_compile = re.compile(pattern, flags)
for res in pattern_compile.finditer(text):
mask[res.span()[0]:res.span()[1]] = PRESERVE
return text, mask
def reverse_forbidden_text(text, mask, pattern, flags=0, forbid_wrapper=True):
"""
Move area out of preserve area (make text editable for GPT)
count the number of the braces so as to catch compelete text area.
e.g.
\begin{abstract} blablablablablabla. \end{abstract}
"""
if isinstance(pattern, list): pattern = '|'.join(pattern)
pattern_compile = re.compile(pattern, flags)
for res in pattern_compile.finditer(text):
if not forbid_wrapper:
mask[res.span()[0]:res.span()[1]] = TRANSFORM
else:
mask[res.regs[0][0]: res.regs[1][0]] = PRESERVE # '\\begin{abstract}'
mask[res.regs[1][0]: res.regs[1][1]] = TRANSFORM # abstract
mask[res.regs[1][1]: res.regs[0][1]] = PRESERVE # abstract
return text, mask
def set_forbidden_text_careful_brace(text, mask, pattern, flags=0):
"""
Add a preserve text area in this paper (text become untouchable for GPT).
count the number of the braces so as to catch compelete text area.
e.g.
\caption{blablablablabla\texbf{blablabla}blablabla.}
"""
pattern_compile = re.compile(pattern, flags)
for res in pattern_compile.finditer(text):
brace_level = -1
p = begin = end = res.regs[0][0]
for _ in range(1024*16):
if text[p] == '}' and brace_level == 0: break
elif text[p] == '}': brace_level -= 1
elif text[p] == '{': brace_level += 1
p += 1
end = p+1
mask[begin:end] = PRESERVE
return text, mask
def reverse_forbidden_text_careful_brace(text, mask, pattern, flags=0, forbid_wrapper=True):
"""
Move area out of preserve area (make text editable for GPT)
count the number of the braces so as to catch compelete text area.
e.g.
\caption{blablablablabla\texbf{blablabla}blablabla.}
"""
pattern_compile = re.compile(pattern, flags)
for res in pattern_compile.finditer(text):
brace_level = 0
p = begin = end = res.regs[1][0]
for _ in range(1024*16):
if text[p] == '}' and brace_level == 0: break
elif text[p] == '}': brace_level -= 1
elif text[p] == '{': brace_level += 1
p += 1
end = p
mask[begin:end] = TRANSFORM
if forbid_wrapper:
mask[res.regs[0][0]:begin] = PRESERVE
mask[end:res.regs[0][1]] = PRESERVE
return text, mask
def set_forbidden_text_begin_end(text, mask, pattern, flags=0, limit_n_lines=42):
"""
Find all \begin{} ... \end{} text block that with less than limit_n_lines lines.
Add it to preserve area
"""
pattern_compile = re.compile(pattern, flags)
def search_with_line_limit(text, mask):
for res in pattern_compile.finditer(text):
cmd = res.group(1) # begin{what}
this = res.group(2) # content between begin and end
this_mask = mask[res.regs[2][0]:res.regs[2][1]]
white_list = ['document', 'abstract', 'lemma', 'definition', 'sproof',
'em', 'emph', 'textit', 'textbf', 'itemize', 'enumerate']
if (cmd in white_list) or this.count('\n') >= limit_n_lines: # use a magical number 42
this, this_mask = search_with_line_limit(this, this_mask)
mask[res.regs[2][0]:res.regs[2][1]] = this_mask
else:
mask[res.regs[0][0]:res.regs[0][1]] = PRESERVE
return text, mask
return search_with_line_limit(text, mask)
class LinkedListNode():
"""
Linked List Node
"""
def __init__(self, string, preserve=True) -> None:
self.string = string
self.preserve = preserve
self.next = None
# self.begin_line = 0
# self.begin_char = 0
def convert_to_linklist(text, mask):
root = LinkedListNode("", preserve=True)
current_node = root
for c, m, i in zip(text, mask, range(len(text))):
if (m==PRESERVE and current_node.preserve) \
or (m==TRANSFORM and not current_node.preserve):
# add
current_node.string += c
else:
current_node.next = LinkedListNode(c, preserve=(m==PRESERVE))
current_node = current_node.next
return root
"""
========================================================================
Latex Merge File
========================================================================
"""
def 寻找Latex主文件(file_manifest, mode):
"""
在多Tex文档中,寻找主文件,必须包含documentclass,返回找到的第一个。
P.S. 但愿没人把latex模板放在里面传进来 (6.25 加入判定latex模板的代码)
"""
canidates = []
for texf in file_manifest:
if os.path.basename(texf).startswith('merge'):
continue
with open(texf, 'r', encoding='utf8') as f:
file_content = f.read()
if r'\documentclass' in file_content:
canidates.append(texf)
else:
continue
if len(canidates) == 0:
raise RuntimeError('无法找到一个主Tex文件包含documentclass关键字')
elif len(canidates) == 1:
return canidates[0]
else: # if len(canidates) >= 2 通过一些Latex模板中常见但通常不会出现在正文的单词,对不同latex源文件扣分,取评分最高者返回
canidates_score = []
# 给出一些判定模板文档的词作为扣分项
unexpected_words = ['\LaTeX', 'manuscript', 'Guidelines', 'font', 'citations', 'rejected', 'blind review', 'reviewers']
expected_words = ['\input', '\ref', '\cite']
for texf in canidates:
canidates_score.append(0)
with open(texf, 'r', encoding='utf8') as f:
file_content = f.read()
for uw in unexpected_words:
if uw in file_content:
canidates_score[-1] -= 1
for uw in expected_words:
if uw in file_content:
canidates_score[-1] += 1
select = np.argmax(canidates_score) # 取评分最高者返回
return canidates[select]
def rm_comments(main_file):
new_file_remove_comment_lines = []
for l in main_file.splitlines():
# 删除整行的空注释
if l.lstrip().startswith("%"):
pass
else:
new_file_remove_comment_lines.append(l)
main_file = '\n'.join(new_file_remove_comment_lines)
# main_file = re.sub(r"\\include{(.*?)}", r"\\input{\1}", main_file) # 将 \include 命令转换为 \input 命令
main_file = re.sub(r'(?<!\\)%.*', '', main_file) # 使用正则表达式查找半行注释, 并替换为空字符串
return main_file
def find_tex_file_ignore_case(fp):
dir_name = os.path.dirname(fp)
base_name = os.path.basename(fp)
if not base_name.endswith('.tex'): base_name+='.tex'
if os.path.exists(pj(dir_name, base_name)): return pj(dir_name, base_name)
# go case in-sensitive
import glob
for f in glob.glob(dir_name+'/*.tex'):
base_name_s = os.path.basename(fp)
if base_name_s.lower() == base_name.lower(): return f
return None
def merge_tex_files_(project_foler, main_file, mode):
"""
Merge Tex project recrusively
"""
main_file = rm_comments(main_file)
for s in reversed([q for q in re.finditer(r"\\input\{(.*?)\}", main_file, re.M)]):
f = s.group(1)
fp = os.path.join(project_foler, f)
fp = find_tex_file_ignore_case(fp)
if fp:
with open(fp, 'r', encoding='utf-8', errors='replace') as fx: c = fx.read()
else:
raise RuntimeError(f'找不到{fp},Tex源文件缺失')
c = merge_tex_files_(project_foler, c, mode)
main_file = main_file[:s.span()[0]] + c + main_file[s.span()[1]:]
return main_file
def merge_tex_files(project_foler, main_file, mode):
"""
Merge Tex project recrusively
P.S. 顺便把CTEX塞进去以支持中文
P.S. 顺便把Latex的注释去除
"""
main_file = merge_tex_files_(project_foler, main_file, mode)
main_file = rm_comments(main_file)
if mode == 'translate_zh':
# find paper documentclass
pattern = re.compile(r'\\documentclass.*\n')
match = pattern.search(main_file)
assert match is not None, "Cannot find documentclass statement!"
position = match.end()
add_ctex = '\\usepackage{ctex}\n'
add_url = '\\usepackage{url}\n' if '{url}' not in main_file else ''
main_file = main_file[:position] + add_ctex + add_url + main_file[position:]
# fontset=windows
import platform
main_file = re.sub(r"\\documentclass\[(.*?)\]{(.*?)}", r"\\documentclass[\1,fontset=windows,UTF8]{\2}",main_file)
main_file = re.sub(r"\\documentclass{(.*?)}", r"\\documentclass[fontset=windows,UTF8]{\1}",main_file)
# find paper abstract
pattern_opt1 = re.compile(r'\\begin\{abstract\}.*\n')
pattern_opt2 = re.compile(r"\\abstract\{(.*?)\}", flags=re.DOTALL)
match_opt1 = pattern_opt1.search(main_file)
match_opt2 = pattern_opt2.search(main_file)
assert (match_opt1 is not None) or (match_opt2 is not None), "Cannot find paper abstract section!"
return main_file
"""
========================================================================
Post process
========================================================================
"""
def mod_inbraket(match):
"""
为啥chatgpt会把cite里面的逗号换成中文逗号呀
"""
# get the matched string
cmd = match.group(1)
str_to_modify = match.group(2)
# modify the matched string
str_to_modify = str_to_modify.replace('', ':') # 前面是中文冒号,后面是英文冒号
str_to_modify = str_to_modify.replace('', ',') # 前面是中文逗号,后面是英文逗号
# str_to_modify = 'BOOM'
return "\\" + cmd + "{" + str_to_modify + "}"
def fix_content(final_tex, node_string):
"""
Fix common GPT errors to increase success rate
"""
final_tex = re.sub(r"(?<!\\)%", "\\%", final_tex)
final_tex = re.sub(r"\\([a-z]{2,10})\ \{", r"\\\1{", string=final_tex)
final_tex = re.sub(r"\\\ ([a-z]{2,10})\{", r"\\\1{", string=final_tex)
final_tex = re.sub(r"\\([a-z]{2,10})\{([^\}]*?)\}", mod_inbraket, string=final_tex)
if "Traceback" in final_tex and "[Local Message]" in final_tex:
final_tex = node_string # 出问题了,还原原文
if node_string.count('\\begin') != final_tex.count('\\begin'):
final_tex = node_string # 出问题了,还原原文
if node_string.count('\_') > 0 and node_string.count('\_') > final_tex.count('\_'):
# walk and replace any _ without \
final_tex = re.sub(r"(?<!\\)_", "\\_", final_tex)
def compute_brace_level(string):
# this function count the number of { and }
brace_level = 0
for c in string:
if c == "{": brace_level += 1
elif c == "}": brace_level -= 1
return brace_level
def join_most(tex_t, tex_o):
# this function join translated string and original string when something goes wrong
p_t = 0
p_o = 0
def find_next(string, chars, begin):
p = begin
while p < len(string):
if string[p] in chars: return p, string[p]
p += 1
return None, None
while True:
res1, char = find_next(tex_o, ['{','}'], p_o)
if res1 is None: break
res2, char = find_next(tex_t, [char], p_t)
if res2 is None: break
p_o = res1 + 1
p_t = res2 + 1
return tex_t[:p_t] + tex_o[p_o:]
if compute_brace_level(final_tex) != compute_brace_level(node_string):
# 出问题了,还原部分原文,保证括号正确
final_tex = join_most(final_tex, node_string)
return final_tex
def split_subprocess(txt, project_folder, return_dict, opts):
"""
break down latex file to a linked list,
each node use a preserve flag to indicate whether it should
be proccessed by GPT.
"""
text = txt
mask = np.zeros(len(txt), dtype=np.uint8) + TRANSFORM
# 吸收title与作者以上的部分
text, mask = set_forbidden_text(text, mask, r"(.*?)\\maketitle", re.DOTALL)
# 吸收iffalse注释
text, mask = set_forbidden_text(text, mask, r"\\iffalse(.*?)\\fi", re.DOTALL)
# 吸收在42行以内的begin-end组合
text, mask = set_forbidden_text_begin_end(text, mask, r"\\begin\{([a-z\*]*)\}(.*?)\\end\{\1\}", re.DOTALL, limit_n_lines=42)
# 吸收匿名公式
text, mask = set_forbidden_text(text, mask, [ r"\$\$([^$]+)\$\$", r"\\\[.*?\\\]" ], re.DOTALL)
# 吸收其他杂项
text, mask = set_forbidden_text(text, mask, [ r"\\section\{(.*?)\}", r"\\section\*\{(.*?)\}", r"\\subsection\{(.*?)\}", r"\\subsubsection\{(.*?)\}" ])
text, mask = set_forbidden_text(text, mask, [ r"\\bibliography\{(.*?)\}", r"\\bibliographystyle\{(.*?)\}" ])
text, mask = set_forbidden_text(text, mask, r"\\begin\{thebibliography\}.*?\\end\{thebibliography\}", re.DOTALL)
text, mask = set_forbidden_text(text, mask, r"\\begin\{lstlisting\}(.*?)\\end\{lstlisting\}", re.DOTALL)
text, mask = set_forbidden_text(text, mask, r"\\begin\{wraptable\}(.*?)\\end\{wraptable\}", re.DOTALL)
text, mask = set_forbidden_text(text, mask, r"\\begin\{algorithm\}(.*?)\\end\{algorithm\}", re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\begin\{wrapfigure\}(.*?)\\end\{wrapfigure\}", r"\\begin\{wrapfigure\*\}(.*?)\\end\{wrapfigure\*\}"], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\begin\{figure\}(.*?)\\end\{figure\}", r"\\begin\{figure\*\}(.*?)\\end\{figure\*\}"], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\begin\{multline\}(.*?)\\end\{multline\}", r"\\begin\{multline\*\}(.*?)\\end\{multline\*\}"], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\begin\{table\}(.*?)\\end\{table\}", r"\\begin\{table\*\}(.*?)\\end\{table\*\}"], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\begin\{minipage\}(.*?)\\end\{minipage\}", r"\\begin\{minipage\*\}(.*?)\\end\{minipage\*\}"], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\begin\{align\*\}(.*?)\\end\{align\*\}", r"\\begin\{align\}(.*?)\\end\{align\}"], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\begin\{equation\}(.*?)\\end\{equation\}", r"\\begin\{equation\*\}(.*?)\\end\{equation\*\}"], re.DOTALL)
text, mask = set_forbidden_text(text, mask, [r"\\includepdf\[(.*?)\]\{(.*?)\}", r"\\clearpage", r"\\newpage", r"\\appendix", r"\\tableofcontents", r"\\include\{(.*?)\}"])
text, mask = set_forbidden_text(text, mask, [r"\\vspace\{(.*?)\}", r"\\hspace\{(.*?)\}", r"\\label\{(.*?)\}", r"\\begin\{(.*?)\}", r"\\end\{(.*?)\}", r"\\item "])
text, mask = set_forbidden_text_careful_brace(text, mask, r"\\hl\{(.*?)\}", re.DOTALL)
# reverse 操作必须放在最后
text, mask = reverse_forbidden_text_careful_brace(text, mask, r"\\caption\{(.*?)\}", re.DOTALL, forbid_wrapper=True)
text, mask = reverse_forbidden_text_careful_brace(text, mask, r"\\abstract\{(.*?)\}", re.DOTALL, forbid_wrapper=True)
text, mask = reverse_forbidden_text(text, mask, r"\\begin\{abstract\}(.*?)\\end\{abstract\}", re.DOTALL, forbid_wrapper=True)
root = convert_to_linklist(text, mask)
# 修复括号
node = root
while True:
string = node.string
if node.preserve:
node = node.next
if node is None: break
continue
def break_check(string):
str_stack = [""] # (lv, index)
for i, c in enumerate(string):
if c == '{':
str_stack.append('{')
elif c == '}':
if len(str_stack) == 1:
print('stack fix')
return i
str_stack.pop(-1)
else:
str_stack[-1] += c
return -1
bp = break_check(string)
if bp == -1:
pass
elif bp == 0:
node.string = string[:1]
q = LinkedListNode(string[1:], False)
q.next = node.next
node.next = q
else:
node.string = string[:bp]
q = LinkedListNode(string[bp:], False)
q.next = node.next
node.next = q
node = node.next
if node is None: break
# 屏蔽空行和太短的句子
node = root
while True:
if len(node.string.strip('\n').strip(''))==0: node.preserve = True
if len(node.string.strip('\n').strip(''))<42: node.preserve = True
node = node.next
if node is None: break
node = root
while True:
if node.next and node.preserve and node.next.preserve:
node.string += node.next.string
node.next = node.next.next
node = node.next
if node is None: break
# 将前后断行符脱离
node = root
prev_node = None
while True:
if not node.preserve:
lstriped_ = node.string.lstrip().lstrip('\n')
if (prev_node is not None) and (prev_node.preserve) and (len(lstriped_)!=len(node.string)):
prev_node.string += node.string[:-len(lstriped_)]
node.string = lstriped_
rstriped_ = node.string.rstrip().rstrip('\n')
if (node.next is not None) and (node.next.preserve) and (len(rstriped_)!=len(node.string)):
node.next.string = node.string[len(rstriped_):] + node.next.string
node.string = rstriped_
# =====
prev_node = node
node = node.next
if node is None: break
# 输出html调试文件,用红色标注处保留区PRESERVE,用黑色标注转换区TRANSFORM
with open(pj(project_folder, 'debug_log.html'), 'w', encoding='utf8') as f:
segment_parts_for_gpt = []
nodes = []
node = root
while True:
nodes.append(node)
show_html = node.string.replace('\n','<br/>')
if not node.preserve:
segment_parts_for_gpt.append(node.string)
f.write(f'<p style="color:black;">#{show_html}#</p>')
else:
f.write(f'<p style="color:red;">{show_html}</p>')
node = node.next
if node is None: break
for n in nodes: n.next = None # break
return_dict['nodes'] = nodes
return_dict['segment_parts_for_gpt'] = segment_parts_for_gpt
return return_dict
class LatexPaperSplit():
"""
break down latex file to a linked list,
each node use a preserve flag to indicate whether it should
be proccessed by GPT.
"""
def __init__(self) -> None:
self.nodes = None
self.msg = "*{\\scriptsize\\textbf{警告该PDF由GPT-Academic开源项目调用大语言模型+Latex翻译插件一键生成," + \
"版权归原文作者所有。翻译内容可靠性无保障,请仔细鉴别并以原文为准。" + \
"项目Github地址 \\url{https://github.com/binary-husky/gpt_academic/}。"
# 请您不要删除或修改这行警告,除非您是论文的原作者如果您是论文原作者,欢迎加REAME中的QQ联系开发者
self.msg_declare = "为了防止大语言模型的意外谬误产生扩散影响,禁止移除或修改此警告。}}\\\\"
def merge_result(self, arr, mode, msg):
"""
Merge the result after the GPT process completed
"""
result_string = ""
p = 0
for node in self.nodes:
if node.preserve:
result_string += node.string
else:
result_string += fix_content(arr[p], node.string)
p += 1
if mode == 'translate_zh':
pattern = re.compile(r'\\begin\{abstract\}.*\n')
match = pattern.search(result_string)
if not match:
# match \abstract{xxxx}
pattern_compile = re.compile(r"\\abstract\{(.*?)\}", flags=re.DOTALL)
match = pattern_compile.search(result_string)
position = match.regs[1][0]
else:
# match \begin{abstract}xxxx\end{abstract}
position = match.end()
result_string = result_string[:position] + self.msg + msg + self.msg_declare + result_string[position:]
return result_string
def split(self, txt, project_folder, opts):
"""
break down latex file to a linked list,
each node use a preserve flag to indicate whether it should
be proccessed by GPT.
P.S. use multiprocessing to avoid timeout error
"""
import multiprocessing
manager = multiprocessing.Manager()
return_dict = manager.dict()
p = multiprocessing.Process(
target=split_subprocess,
args=(txt, project_folder, return_dict, opts))
p.start()
p.join()
p.close()
self.nodes = return_dict['nodes']
self.sp = return_dict['segment_parts_for_gpt']
return self.sp
class LatexPaperFileGroup():
"""
use tokenizer to break down text according to max_token_limit
"""
def __init__(self):
self.file_paths = []
self.file_contents = []
self.sp_file_contents = []
self.sp_file_index = []
self.sp_file_tag = []
# count_token
from request_llm.bridge_all import model_info
enc = model_info["gpt-3.5-turbo"]['tokenizer']
def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
self.get_token_num = get_token_num
def run_file_split(self, max_token_limit=1900):
"""
use tokenizer to break down text according to max_token_limit
"""
for index, file_content in enumerate(self.file_contents):
if self.get_token_num(file_content) < max_token_limit:
self.sp_file_contents.append(file_content)
self.sp_file_index.append(index)
self.sp_file_tag.append(self.file_paths[index])
else:
from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
segments = breakdown_txt_to_satisfy_token_limit_for_pdf(file_content, self.get_token_num, max_token_limit)
for j, segment in enumerate(segments):
self.sp_file_contents.append(segment)
self.sp_file_index.append(index)
self.sp_file_tag.append(self.file_paths[index] + f".part-{j}.tex")
print('Segmentation: done')
def merge_result(self):
self.file_result = ["" for _ in range(len(self.file_paths))]
for r, k in zip(self.sp_file_result, self.sp_file_index):
self.file_result[k] += r
def write_result(self):
manifest = []
for path, res in zip(self.file_paths, self.file_result):
with open(path + '.polish.tex', 'w', encoding='utf8') as f:
manifest.append(path + '.polish.tex')
f.write(res)
return manifest
def write_html(sp_file_contents, sp_file_result, chatbot, project_folder):
# write html
try:
import shutil
from .crazy_utils import construct_html
from toolbox import gen_time_str
ch = construct_html()
orig = ""
trans = ""
final = []
for c,r in zip(sp_file_contents, sp_file_result):
final.append(c)
final.append(r)
for i, k in enumerate(final):
if i%2==0:
orig = k
if i%2==1:
trans = k
ch.add_row(a=orig, b=trans)
create_report_file_name = f"{gen_time_str()}.trans.html"
ch.save_file(create_report_file_name)
shutil.copyfile(pj('./gpt_log/', create_report_file_name), pj(project_folder, create_report_file_name))
promote_file_to_downloadzone(file=f'./gpt_log/{create_report_file_name}', chatbot=chatbot)
except:
from toolbox import trimmed_format_exc
print('writing html result failed:', trimmed_format_exc())
def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None, opts=[]):
import time, os, re
from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
from .latex_utils import LatexPaperFileGroup, merge_tex_files, LatexPaperSplit, 寻找Latex主文件
# <-------- 寻找主tex文件 ---------->
maintex = 寻找Latex主文件(file_manifest, mode)
chatbot.append((f"定位主Latex文件", f'[Local Message] 分析结果该项目的Latex主文件是{maintex}, 如果分析错误, 请立即终止程序, 删除或修改歧义文件, 然后重试。主程序即将开始, 请稍候。'))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
time.sleep(3)
# <-------- 读取Latex文件, 将多文件tex工程融合为一个巨型tex ---------->
main_tex_basename = os.path.basename(maintex)
assert main_tex_basename.endswith('.tex')
main_tex_basename_bare = main_tex_basename[:-4]
may_exist_bbl = pj(project_folder, f'{main_tex_basename_bare}.bbl')
if os.path.exists(may_exist_bbl):
shutil.copyfile(may_exist_bbl, pj(project_folder, f'merge.bbl'))
shutil.copyfile(may_exist_bbl, pj(project_folder, f'merge_{mode}.bbl'))
shutil.copyfile(may_exist_bbl, pj(project_folder, f'merge_diff.bbl'))
with open(maintex, 'r', encoding='utf-8', errors='replace') as f:
content = f.read()
merged_content = merge_tex_files(project_folder, content, mode)
with open(project_folder + '/merge.tex', 'w', encoding='utf-8', errors='replace') as f:
f.write(merged_content)
# <-------- 精细切分latex文件 ---------->
chatbot.append((f"Latex文件融合完成", f'[Local Message] 正在精细切分latex文件,这需要一段时间计算,文档越长耗时越长,请耐心等待。'))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
lps = LatexPaperSplit()
res = lps.split(merged_content, project_folder, opts) # 消耗时间的函数
# <-------- 拆分过长的latex片段 ---------->
pfg = LatexPaperFileGroup()
for index, r in enumerate(res):
pfg.file_paths.append('segment-' + str(index))
pfg.file_contents.append(r)
pfg.run_file_split(max_token_limit=1024)
n_split = len(pfg.sp_file_contents)
# <-------- 根据需要切换prompt ---------->
inputs_array, sys_prompt_array = switch_prompt(pfg, mode)
inputs_show_user_array = [f"{mode} {f}" for f in pfg.sp_file_tag]
if os.path.exists(pj(project_folder,'temp.pkl')):
# <-------- 【仅调试】如果存在调试缓存文件,则跳过GPT请求环节 ---------->
pfg = objload(file=pj(project_folder,'temp.pkl'))
else:
# <-------- gpt 多线程请求 ---------->
gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
inputs_array=inputs_array,
inputs_show_user_array=inputs_show_user_array,
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history_array=[[""] for _ in range(n_split)],
sys_prompt_array=sys_prompt_array,
# max_workers=5, # 并行任务数量限制, 最多同时执行5个, 其他的排队等待
scroller_max_len = 40
)
# <-------- 文本碎片重组为完整的tex片段 ---------->
pfg.sp_file_result = []
for i_say, gpt_say, orig_content in zip(gpt_response_collection[0::2], gpt_response_collection[1::2], pfg.sp_file_contents):
pfg.sp_file_result.append(gpt_say)
pfg.merge_result()
# <-------- 临时存储用于调试 ---------->
pfg.get_token_num = None
objdump(pfg, file=pj(project_folder,'temp.pkl'))
write_html(pfg.sp_file_contents, pfg.sp_file_result, chatbot=chatbot, project_folder=project_folder)
# <-------- 写出文件 ---------->
msg = f"当前大语言模型: {llm_kwargs['llm_model']},当前语言模型温度设定: {llm_kwargs['temperature']}"
final_tex = lps.merge_result(pfg.file_result, mode, msg)
with open(project_folder + f'/merge_{mode}.tex', 'w', encoding='utf-8', errors='replace') as f:
if mode != 'translate_zh' or "binary" in final_tex: f.write(final_tex)
# <-------- 整理结果, 退出 ---------->
chatbot.append((f"完成了吗?", 'GPT结果已输出, 正在编译PDF'))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# <-------- 返回 ---------->
return project_folder + f'/merge_{mode}.tex'
def remove_buggy_lines(file_path, log_path, tex_name, tex_name_pure, n_fix, work_folder_modified):
try:
with open(log_path, 'r', encoding='utf-8', errors='replace') as f:
log = f.read()
with open(file_path, 'r', encoding='utf-8', errors='replace') as f:
file_lines = f.readlines()
import re
buggy_lines = re.findall(tex_name+':([0-9]{1,5}):', log)
buggy_lines = [int(l) for l in buggy_lines]
buggy_lines = sorted(buggy_lines)
print("removing lines that has errors", buggy_lines)
file_lines.pop(buggy_lines[0]-1)
with open(pj(work_folder_modified, f"{tex_name_pure}_fix_{n_fix}.tex"), 'w', encoding='utf-8', errors='replace') as f:
f.writelines(file_lines)
return True, f"{tex_name_pure}_fix_{n_fix}", buggy_lines
except:
print("Fatal error occurred, but we cannot identify error, please download zip, read latex log, and compile manually.")
return False, -1, [-1]
def compile_latex_with_timeout(command, cwd, timeout=60):
import subprocess
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd)
try:
stdout, stderr = process.communicate(timeout=timeout)
except subprocess.TimeoutExpired:
process.kill()
stdout, stderr = process.communicate()
print("Process timed out!")
return False
return True
def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_folder_original, work_folder_modified, work_folder, mode='default'):
import os, time
current_dir = os.getcwd()
n_fix = 1
max_try = 32
chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder},如果程序停顿5分钟以上,请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history)
chatbot.append([f"正在编译PDF文档", '...']); yield from update_ui(chatbot=chatbot, history=history); time.sleep(1); chatbot[-1] = list(chatbot[-1]) # 刷新界面
yield from update_ui_lastest_msg('编译已经开始...', chatbot, history) # 刷新Gradio前端界面
while True:
import os
# https://stackoverflow.com/questions/738755/dont-make-me-manually-abort-a-latex-compile-when-theres-an-error
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译原始PDF ...', chatbot, history) # 刷新Gradio前端界面
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex', work_folder_original)
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译转化后的PDF ...', chatbot, history) # 刷新Gradio前端界面
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex', work_folder_modified)
if ok and os.path.exists(pj(work_folder_modified, f'{main_file_modified}.pdf')):
# 只有第二步成功,才能继续下面的步骤
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译BibTex ...', chatbot, history) # 刷新Gradio前端界面
if not os.path.exists(pj(work_folder_original, f'{main_file_original}.bbl')):
ok = compile_latex_with_timeout(f'bibtex {main_file_original}.aux', work_folder_original)
if not os.path.exists(pj(work_folder_modified, f'{main_file_modified}.bbl')):
ok = compile_latex_with_timeout(f'bibtex {main_file_modified}.aux', work_folder_modified)
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译文献交叉引用 ...', chatbot, history) # 刷新Gradio前端界面
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex', work_folder_original)
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex', work_folder_modified)
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex', work_folder_original)
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex', work_folder_modified)
if mode!='translate_zh':
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 使用latexdiff生成论文转化前后对比 ...', chatbot, history) # 刷新Gradio前端界面
print( f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
ok = compile_latex_with_timeout(f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 正在编译对比PDF ...', chatbot, history) # 刷新Gradio前端界面
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex', work_folder)
ok = compile_latex_with_timeout(f'bibtex merge_diff.aux', work_folder)
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex', work_folder)
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex', work_folder)
# <---------- 检查结果 ----------->
results_ = ""
original_pdf_success = os.path.exists(pj(work_folder_original, f'{main_file_original}.pdf'))
modified_pdf_success = os.path.exists(pj(work_folder_modified, f'{main_file_modified}.pdf'))
diff_pdf_success = os.path.exists(pj(work_folder, f'merge_diff.pdf'))
results_ += f"原始PDF编译是否成功: {original_pdf_success};"
results_ += f"转化PDF编译是否成功: {modified_pdf_success};"
results_ += f"对比PDF编译是否成功: {diff_pdf_success};"
yield from update_ui_lastest_msg(f'{n_fix}编译结束:<br/>{results_}...', chatbot, history) # 刷新Gradio前端界面
if diff_pdf_success:
result_pdf = pj(work_folder_modified, f'merge_diff.pdf') # get pdf path
promote_file_to_downloadzone(result_pdf, rename_file=None, chatbot=chatbot) # promote file to web UI
if modified_pdf_success:
yield from update_ui_lastest_msg(f'转化PDF编译已经成功, 即将退出 ...', chatbot, history) # 刷新Gradio前端界面
result_pdf = pj(work_folder_modified, f'{main_file_modified}.pdf') # get pdf path
if os.path.exists(pj(work_folder, '..', 'translation')):
shutil.copyfile(result_pdf, pj(work_folder, '..', 'translation', 'translate_zh.pdf'))
promote_file_to_downloadzone(result_pdf, rename_file=None, chatbot=chatbot) # promote file to web UI
return True # 成功啦
else:
if n_fix>=max_try: break
n_fix += 1
can_retry, main_file_modified, buggy_lines = remove_buggy_lines(
file_path=pj(work_folder_modified, f'{main_file_modified}.tex'),
log_path=pj(work_folder_modified, f'{main_file_modified}.log'),
tex_name=f'{main_file_modified}.tex',
tex_name_pure=f'{main_file_modified}',
n_fix=n_fix,
work_folder_modified=work_folder_modified,
)
yield from update_ui_lastest_msg(f'由于最为关键的转化PDF编译失败, 将根据报错信息修正tex源文件并重试, 当前报错的latex代码处于第{buggy_lines}行 ...', chatbot, history) # 刷新Gradio前端界面
if not can_retry: break
return False # 失败啦

查看文件

@@ -0,0 +1,63 @@
from toolbox import CatchException, update_ui
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
@CatchException
def 交互功能模板函数(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本,例如需要翻译的一段话,再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数, 如温度和top_p等, 一般原样传递下去就行
plugin_kwargs 插件模型的参数, 如温度和top_p等, 一般原样传递下去就行
chatbot 聊天显示框的句柄,用于显示给用户
history 聊天历史,前情提要
system_prompt 给gpt的静默提醒
web_port 当前软件运行的端口号
"""
history = [] # 清空历史,以免输入溢出
chatbot.append(("这是什么功能?", "交互功能函数模板。在执行完成之后, 可以将自身的状态存储到cookie中, 等待用户的再次调用。"))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
state = chatbot._cookies.get('plugin_state_0001', None) # 初始化插件状态
if state is None:
chatbot._cookies['lock_plugin'] = 'crazy_functions.交互功能函数模板->交互功能模板函数' # 赋予插件锁定 锁定插件回调路径,当下一次用户提交时,会直接转到该函数
chatbot._cookies['plugin_state_0001'] = 'wait_user_keyword' # 赋予插件状态
chatbot.append(("第一次调用:", "请输入关键词, 我将为您查找相关壁纸, 建议使用英文单词, 插件锁定中,请直接提交即可。"))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
if state == 'wait_user_keyword':
chatbot._cookies['lock_plugin'] = None # 解除插件锁定,避免遗忘导致死锁
chatbot._cookies['plugin_state_0001'] = None # 解除插件状态,避免遗忘导致死锁
# 解除插件锁定
chatbot.append((f"获取关键词:{txt}", ""))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
page_return = get_image_page_by_keyword(txt)
inputs=inputs_show_user=f"Extract all image urls in this html page, pick the first 5 images and show them with markdown format: \n\n {page_return}"
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=inputs, inputs_show_user=inputs_show_user,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
sys_prompt="When you want to show an image, use markdown format. e.g. ![image_description](image_url). If there are no image url provided, answer 'no image url provided'"
)
chatbot[-1] = [chatbot[-1][0], gpt_say]
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# ---------------------------------------------------------------------------------
def get_image_page_by_keyword(keyword):
import requests
from bs4 import BeautifulSoup
response = requests.get(f'https://wallhaven.cc/search?q={keyword}', timeout=2)
res = "image urls: \n"
for image_element in BeautifulSoup(response.content, 'html.parser').findAll("img"):
try:
res += image_element["data-src"]
res += "\n"
except:
pass
return res

查看文件

@@ -27,8 +27,10 @@ def gen_image(llm_kwargs, prompt, resolution="256x256"):
}
response = requests.post(url, headers=headers, json=data, proxies=proxies)
print(response.content)
image_url = json.loads(response.content.decode('utf8'))['data'][0]['url']
try:
image_url = json.loads(response.content.decode('utf8'))['data'][0]['url']
except:
raise RuntimeError(response.content.decode())
# 文件保存到本地
r = requests.get(image_url, proxies=proxies)
file_path = 'gpt_log/image_gen/'

查看文件

@@ -1,4 +1,4 @@
from toolbox import CatchException, update_ui
from toolbox import CatchException, update_ui, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
import re
@@ -29,9 +29,8 @@ def write_chat_to_file(chatbot, history=None, file_name=None):
for h in history:
f.write("\n>>>" + h)
f.write('</code>')
res = '对话历史写入:' + os.path.abspath(f'./gpt_log/{file_name}')
print(res)
return res
promote_file_to_downloadzone(f'./gpt_log/{file_name}', rename_file=file_name, chatbot=chatbot)
return '对话历史写入:' + os.path.abspath(f'./gpt_log/{file_name}')
def gen_file_preview(file_name):
try:

查看文件

@@ -14,17 +14,19 @@ def 解析docx(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot
doc = Document(fp)
file_content = "\n".join([para.text for para in doc.paragraphs])
else:
import win32com.client
word = win32com.client.Dispatch("Word.Application")
word.visible = False
# 打开文件
print('fp', os.getcwd())
doc = word.Documents.Open(os.getcwd() + '/' + fp)
# file_content = doc.Content.Text
doc = word.ActiveDocument
file_content = doc.Range().Text
doc.Close()
word.Quit()
try:
import win32com.client
word = win32com.client.Dispatch("Word.Application")
word.visible = False
# 打开文件
doc = word.Documents.Open(os.getcwd() + '/' + fp)
# file_content = doc.Content.Text
doc = word.ActiveDocument
file_content = doc.Range().Text
doc.Close()
word.Quit()
except:
raise RuntimeError('请先将.doc文档转换为.docx文档。')
print(file_content)
# private_upload里面的文件名在解压zip后容易出现乱码rar和7z格式正常,故可以只分析文章内容,不输入文件名

查看文件

@@ -71,7 +71,7 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot,
prefix = "接下来请你逐文件分析下面的论文文件,概括其内容" if index==0 else ""
i_say = prefix + f'请对下面的文章片段用中文做一个概述,文件名是{os.path.relpath(fp, project_folder)},文章内容是 ```{file_content}```'
i_say_show_user = prefix + f'[{index}/{len(file_manifest)}] 请对下面的文章片段做一个概述: {os.path.abspath(fp)}'
i_say_show_user = prefix + f'[{index + 1}/{len(file_manifest)}] 请对下面的文章片段做一个概述: {os.path.abspath(fp)}'
chatbot.append((i_say_show_user, "[Local Message] waiting gpt response."))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

查看文件

@@ -1,5 +1,5 @@
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import update_ui
from toolbox import update_ui, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
from .crazy_utils import read_and_clean_pdf_text
@@ -147,23 +147,14 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot,
print('writing html result failed:', trimmed_format_exc())
# 准备文件的下载
import shutil
for pdf_path in generated_conclusion_files:
# 重命名文件
rename_file = f'./gpt_log/翻译-{os.path.basename(pdf_path)}'
if os.path.exists(rename_file):
os.remove(rename_file)
shutil.copyfile(pdf_path, rename_file)
if os.path.exists(pdf_path):
os.remove(pdf_path)
rename_file = f'翻译-{os.path.basename(pdf_path)}'
promote_file_to_downloadzone(pdf_path, rename_file=rename_file, chatbot=chatbot)
for html_path in generated_html_files:
# 重命名文件
rename_file = f'./gpt_log/翻译-{os.path.basename(html_path)}'
if os.path.exists(rename_file):
os.remove(rename_file)
shutil.copyfile(html_path, rename_file)
if os.path.exists(html_path):
os.remove(html_path)
rename_file = f'翻译-{os.path.basename(html_path)}'
promote_file_to_downloadzone(html_path, rename_file=rename_file, chatbot=chatbot)
chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

查看文件

@@ -8,7 +8,7 @@ def inspect_dependency(chatbot, history):
import manim
return True
except:
chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manimgl```"])
chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manim manimgl```"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return False

查看文件

@@ -13,6 +13,8 @@ def 解析PDF(file_name, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
# 递归地切割PDF文件,每一块尽量是完整的一个section,比如introduction,experiment等,必要时再进行切割
# 的长度必须小于 2500 个 Token
file_content, page_one = read_and_clean_pdf_text(file_name) # 尝试按照章节切割PDF
file_content = file_content.encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars
page_one = str(page_one).encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars
TOKEN_LIMIT_PER_FRAGMENT = 2500

查看文件

@@ -0,0 +1,102 @@
from toolbox import CatchException, update_ui
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive, input_clipping
import requests
from bs4 import BeautifulSoup
from request_llm.bridge_all import model_info
def bing_search(query, proxies=None):
query = query
url = f"https://cn.bing.com/search?q={query}"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36'}
response = requests.get(url, headers=headers, proxies=proxies)
soup = BeautifulSoup(response.content, 'html.parser')
results = []
for g in soup.find_all('li', class_='b_algo'):
anchors = g.find_all('a')
if anchors:
link = anchors[0]['href']
if not link.startswith('http'):
continue
title = g.find('h2').text
item = {'title': title, 'link': link}
results.append(item)
for r in results:
print(r['link'])
return results
def scrape_text(url, proxies) -> str:
"""Scrape text from a webpage
Args:
url (str): The URL to scrape text from
Returns:
str: The scraped text
"""
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36',
'Content-Type': 'text/plain',
}
try:
response = requests.get(url, headers=headers, proxies=proxies, timeout=8)
if response.encoding == "ISO-8859-1": response.encoding = response.apparent_encoding
except:
return "无法连接到该网页"
soup = BeautifulSoup(response.text, "html.parser")
for script in soup(["script", "style"]):
script.extract()
text = soup.get_text()
lines = (line.strip() for line in text.splitlines())
chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
text = "\n".join(chunk for chunk in chunks if chunk)
return text
@CatchException
def 连接bing搜索回答问题(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本,例如需要翻译的一段话,再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数,如温度和top_p等,一般原样传递下去就行
plugin_kwargs 插件模型的参数,暂时没有用武之地
chatbot 聊天显示框的句柄,用于显示给用户
history 聊天历史,前情提要
system_prompt 给gpt的静默提醒
web_port 当前软件运行的端口号
"""
history = [] # 清空历史,以免输入溢出
chatbot.append((f"请结合互联网信息回答以下问题:{txt}",
"[Local Message] 请注意,您正在调用一个[函数插件]的模板,该模板可以实现ChatGPT联网信息综合。该函数面向希望实现更多有趣功能的开发者,它可以作为创建新功能函数的模板。您若希望分享新的功能模组,请不吝PR"))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间,我们先及时地做一次界面更新
# ------------- < 第1步爬取搜索引擎的结果 > -------------
from toolbox import get_conf
proxies, = get_conf('proxies')
urls = bing_search(txt, proxies)
history = []
# ------------- < 第2步依次访问网页 > -------------
max_search_result = 8 # 最多收纳多少个网页的结果
for index, url in enumerate(urls[:max_search_result]):
res = scrape_text(url['link'], proxies)
history.extend([f"{index}份搜索结果:", res])
chatbot.append([f"{index}份搜索结果:", res[:500]+"......"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间,我们先及时地做一次界面更新
# ------------- < 第3步ChatGPT综合 > -------------
i_say = f"从以上搜索结果中抽取信息,然后回答问题:{txt}"
i_say, history = input_clipping( # 裁剪输入,从最长的条目开始裁剪,防止爆token
inputs=i_say,
history=history,
max_token_limit=model_info[llm_kwargs['llm_model']]['max_token']*3//4
)
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=i_say, inputs_show_user=i_say,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=history,
sys_prompt="请从给定的若干条搜索结果中抽取信息,对最相关的两个搜索结果进行总结,然后回答问题。"
)
chatbot[-1] = (i_say, gpt_say)
history.append(i_say);history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新

查看文件

@@ -0,0 +1,131 @@
from toolbox import CatchException, update_ui, gen_time_str
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from .crazy_utils import input_clipping
prompt = """
I have to achieve some functionalities by calling one of the functions below.
Your job is to find the correct funtion to use to satisfy my requirement,
and then write python code to call this function with correct parameters.
These are functions you are allowed to choose from:
1.
功能描述: 总结音视频内容
调用函数: ConcludeAudioContent(txt, llm_kwargs)
参数说明:
txt: 音频文件的路径
llm_kwargs: 模型参数, 永远给定None
2.
功能描述: 将每次对话记录写入Markdown格式的文件中
调用函数: WriteMarkdown()
3.
功能描述: 将指定目录下的PDF文件从英文翻译成中文
调用函数: BatchTranslatePDFDocuments_MultiThreaded(txt, llm_kwargs)
参数说明:
txt: PDF文件所在的路径
llm_kwargs: 模型参数, 永远给定None
4.
功能描述: 根据文本使用GPT模型生成相应的图像
调用函数: ImageGeneration(txt, llm_kwargs)
参数说明:
txt: 图像生成所用到的提示文本
llm_kwargs: 模型参数, 永远给定None
5.
功能描述: 对输入的word文档进行摘要生成
调用函数: SummarizingWordDocuments(input_path, output_path)
参数说明:
input_path: 待处理的word文档路径
output_path: 摘要生成后的文档路径
You should always anwser with following format:
----------------
Code:
```
class AutoAcademic(object):
def __init__(self):
self.selected_function = "FILL_CORRECT_FUNCTION_HERE" # e.g., "GenerateImage"
self.txt = "FILL_MAIN_PARAMETER_HERE" # e.g., "荷叶上的蜻蜓"
self.llm_kwargs = None
```
Explanation:
只有GenerateImage和生成图像相关, 因此选择GenerateImage函数。
----------------
Now, this is my requirement:
"""
def get_fn_lib():
return {
"BatchTranslatePDFDocuments_MultiThreaded": ("crazy_functions.批量翻译PDF文档_多线程", "批量翻译PDF文档"),
"SummarizingWordDocuments": ("crazy_functions.总结word文档", "总结word文档"),
"ImageGeneration": ("crazy_functions.图片生成", "图片生成"),
"TranslateMarkdownFromEnglishToChinese": ("crazy_functions.批量Markdown翻译", "Markdown中译英"),
"SummaryAudioVideo": ("crazy_functions.总结音视频", "总结音视频"),
}
def inspect_dependency(chatbot, history):
return True
def eval_code(code, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
import subprocess, sys, os, shutil, importlib
with open('gpt_log/void_terminal_runtime.py', 'w', encoding='utf8') as f:
f.write(code)
try:
AutoAcademic = getattr(importlib.import_module('gpt_log.void_terminal_runtime', 'AutoAcademic'), 'AutoAcademic')
# importlib.reload(AutoAcademic)
auto_dict = AutoAcademic()
selected_function = auto_dict.selected_function
txt = auto_dict.txt
fp, fn = get_fn_lib()[selected_function]
fn_plugin = getattr(importlib.import_module(fp, fn), fn)
yield from fn_plugin(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port)
except:
from toolbox import trimmed_format_exc
chatbot.append(["执行错误", f"\n```\n{trimmed_format_exc()}\n```\n"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
def get_code_block(reply):
import re
pattern = r"```([\s\S]*?)```" # regex pattern to match code blocks
matches = re.findall(pattern, reply) # find all code blocks in text
if len(matches) != 1:
raise RuntimeError("GPT is not generating proper code.")
return matches[0].strip('python') # code block
@CatchException
def 终端(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本, 例如需要翻译的一段话, 再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数, 如温度和top_p等, 一般原样传递下去就行
plugin_kwargs 插件模型的参数, 暂时没有用武之地
chatbot 聊天显示框的句柄, 用于显示给用户
history 聊天历史, 前情提要
system_prompt 给gpt的静默提醒
web_port 当前软件运行的端口号
"""
# 清空历史, 以免输入溢出
history = []
# 基本信息:功能、贡献者
chatbot.append(["函数插件功能?", "根据自然语言执行插件命令, 作者: binary-husky, 插件初始化中 ..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# # 尝试导入依赖, 如果缺少依赖, 则给出安装建议
# dep_ok = yield from inspect_dependency(chatbot=chatbot, history=history) # 刷新界面
# if not dep_ok: return
# 输入
i_say = prompt + txt
# 开始
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=i_say, inputs_show_user=txt,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
sys_prompt=""
)
# 将代码转为动画
code = get_code_block(gpt_say)
yield from eval_code(code, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port)

查看文件

@@ -0,0 +1,28 @@
# encoding: utf-8
# @Time : 2023/4/19
# @Author : Spike
# @Descr :
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file
from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
@CatchException
def 猜你想问(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
if txt:
show_say = txt
prompt = txt+'\n回答完问题后,再列出用户可能提出的三个问题。'
else:
prompt = history[-1]+"\n分析上述回答,再列出用户可能提出的三个问题。"
show_say = '分析上述回答,再列出用户可能提出的三个问题。'
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=prompt,
inputs_show_user=show_say,
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history=history,
sys_prompt=system_prompt
)
chatbot[-1] = (show_say, gpt_say)
history.extend([show_say, gpt_say])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

查看文件

@@ -6,7 +6,7 @@ def 高阶功能模板函数(txt, llm_kwargs, plugin_kwargs, chatbot, history, s
"""
txt 输入栏用户输入的文本,例如需要翻译的一段话,再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数,如温度和top_p等,一般原样传递下去就行
plugin_kwargs 插件模型的参数,暂时没有用武之地
plugin_kwargs 插件模型的参数,如温度和top_p等,一般原样传递下去就行
chatbot 聊天显示框的句柄,用于显示给用户
history 聊天历史,前情提要
system_prompt 给gpt的静默提醒

查看文件

@@ -99,6 +99,34 @@ services:
command: >
bash -c " echo '[gpt-academic] 正在从github拉取最新代码...' &&
git pull &&
pip install -r requirements.txt &&
echo '[jittorllms] 正在从github拉取最新代码...' &&
git --git-dir=request_llm/jittorllms/.git --work-tree=request_llm/jittorllms pull --force &&
python3 -u main.py"
## ===================================================
## 【方案四】 chatgpt + Latex
## ===================================================
version: '3'
services:
gpt_academic_with_latex:
image: ghcr.io/binary-husky/gpt_academic_with_latex:master
environment:
# 请查阅 `config.py` 以查看所有的配置信息
API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
USE_PROXY: ' True '
proxies: ' { "http": "socks5h://localhost:10880", "https": "socks5h://localhost:10880", } '
LLM_MODEL: ' gpt-3.5-turbo '
AVAIL_LLM_MODELS: ' ["gpt-3.5-turbo", "gpt-4"] '
LOCAL_MODEL_DEVICE: ' cuda '
DEFAULT_WORKER_NUM: ' 10 '
WEB_PORT: ' 12303 '
# 与宿主的网络融合
network_mode: "host"
# 不使用代理网络拉取最新代码
command: >
bash -c "python3 -u main.py"

查看文件

@@ -0,0 +1,27 @@
# 此Dockerfile适用于“无本地模型”的环境构建,如果需要使用chatglm等本地模型,请参考 docs/Dockerfile+ChatGLM
# - 1 修改 `config.py`
# - 2 构建 docker build -t gpt-academic-nolocal-latex -f docs/Dockerfile+NoLocal+Latex .
# - 3 运行 docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex
FROM fuqingxu/python311_texlive_ctex:latest
# 指定路径
WORKDIR /gpt
ARG useProxyNetwork=''
RUN $useProxyNetwork pip3 install gradio openai numpy arxiv rich -i https://pypi.douban.com/simple/
RUN $useProxyNetwork pip3 install colorama Markdown pygments pymupdf -i https://pypi.douban.com/simple/
# 装载项目文件
COPY . .
# 安装依赖
RUN $useProxyNetwork pip3 install -r requirements.txt -i https://pypi.douban.com/simple/
# 可选步骤,用于预热模块
RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
# 启动
CMD ["python3", "-u", "main.py"]

查看文件

@@ -0,0 +1,25 @@
# 此Dockerfile适用于“无本地模型”的环境构建,如果需要使用chatglm等本地模型,请参考 docs/Dockerfile+ChatGLM
# - 1 修改 `config.py`
# - 2 构建 docker build -t gpt-academic-nolocal-latex -f docs/Dockerfile+NoLocal+Latex .
# - 3 运行 docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex
FROM fuqingxu/python311_texlive_ctex:latest
# 指定路径
WORKDIR /gpt
RUN pip3 install gradio openai numpy arxiv rich
RUN pip3 install colorama Markdown pygments pymupdf
# 装载项目文件
COPY . .
# 安装依赖
RUN pip3 install -r requirements.txt
# 可选步骤,用于预热模块
RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
# 启动
CMD ["python3", "-u", "main.py"]

查看文件

@@ -2,11 +2,11 @@
>
> Durante l'installazione delle dipendenze, selezionare rigorosamente le **versioni specificate** nel file requirements.txt.
>
> ` pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/`
> ` pip install -r requirements.txt`
# <img src="docs/logo.png" width="40" > GPT Ottimizzazione Accademica (GPT Academic)
# <img src="logo.png" width="40" > GPT Ottimizzazione Accademica (GPT Academic)
**Se ti piace questo progetto, ti preghiamo di dargli una stella. Se hai sviluppato scorciatoie accademiche o plugin funzionali più utili, non esitare ad aprire una issue o pull request. Abbiamo anche una README in [Inglese|](docs/README_EN.md)[Giapponese|](docs/README_JP.md)[Coreano|](https://github.com/mldljyh/ko_gpt_academic)[Russo|](docs/README_RS.md)[Francese](docs/README_FR.md) tradotta da questo stesso progetto.
**Se ti piace questo progetto, ti preghiamo di dargli una stella. Se hai sviluppato scorciatoie accademiche o plugin funzionali più utili, non esitare ad aprire una issue o pull request. Abbiamo anche una README in [Inglese|](README_EN.md)[Giapponese|](README_JP.md)[Coreano|](https://github.com/mldljyh/ko_gpt_academic)[Russo|](README_RS.md)[Francese](README_FR.md) tradotta da questo stesso progetto.
Per tradurre questo progetto in qualsiasi lingua con GPT, leggere e eseguire [`multi_language.py`](multi_language.py) (sperimentale).
> **Nota**
@@ -17,7 +17,9 @@ Per tradurre questo progetto in qualsiasi lingua con GPT, leggere e eseguire [`m
>
> 3. Questo progetto è compatibile e incoraggia l'utilizzo di grandi modelli di linguaggio di produzione nazionale come chatglm, RWKV, Pangu ecc. Supporta la coesistenza di più api-key e può essere compilato nel file di configurazione come `API_KEY="openai-key1,openai-key2,api2d-key3"`. Per sostituire temporaneamente `API_KEY`, inserire `API_KEY` temporaneo nell'area di input e premere Invio per renderlo effettivo.
<div align="center">Funzione | Descrizione
<div align="center">
Funzione | Descrizione
--- | ---
Correzione immediata | Supporta correzione immediata e ricerca degli errori di grammatica del documento con un solo clic
Traduzione cinese-inglese immediata | Traduzione cinese-inglese immediata con un solo clic
@@ -41,6 +43,8 @@ Avvia il tema di gradio [scuro](https://github.com/binary-husky/chatgpt_academic
Supporto per maggiori modelli LLM, supporto API2D | Sentirsi serviti simultaneamente da GPT3.5, GPT4, [Tsinghua ChatGLM](https://github.com/THUDM/ChatGLM-6B), [Fudan MOSS](https://github.com/OpenLMLab/MOSS) deve essere una grande sensazione, giusto?
Ulteriori modelli LLM supportat,i supporto per l'implementazione di Huggingface | Aggiunta di un'interfaccia Newbing (Nuovo Bing), introdotta la compatibilità con Tsinghua [Jittorllms](https://github.com/Jittor/JittorLLMs), [LLaMA](https://github.com/facebookresearch/llama), [RWKV](https://github.com/BlinkDL/ChatRWKV) e [PanGu-α](https://openi.org.cn/pangu/)
Ulteriori dimostrazioni di nuove funzionalità (generazione di immagini, ecc.)... | Vedere la fine di questo documento...
</div>
- Nuova interfaccia (modificare l'opzione LAYOUT in `config.py` per passare dal layout a sinistra e a destra al layout superiore e inferiore)
<div align="center">
@@ -202,11 +206,13 @@ ad esempio
2. Plugin di funzione personalizzati
Scrivi plugin di funzione personalizzati e esegui tutte le attività che desideri o non hai mai pensato di fare.
La difficoltà di scrittura e debug dei plugin del nostro progetto è molto bassa. Se si dispone di una certa conoscenza di base di Python, è possibile realizzare la propria funzione del plugin seguendo il nostro modello. Per maggiori dettagli, consultare la [guida al plugin per funzioni] (https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).
La difficoltà di scrittura e debug dei plugin del nostro progetto è molto bassa. Se si dispone di una certa conoscenza di base di Python, è possibile realizzare la propria funzione del plugin seguendo il nostro modello. Per maggiori dettagli, consultare la [guida al plugin per funzioni](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).
---
# Ultimo aggiornamento
## Nuove funzionalità dinamiche1. Funzionalità di salvataggio della conversazione. Nell'area dei plugin della funzione, fare clic su "Salva la conversazione corrente" per salvare la conversazione corrente come file html leggibile e ripristinabile, inoltre, nell'area dei plugin della funzione (menu a discesa), fare clic su "Carica la cronologia della conversazione archiviata" per ripristinare la conversazione precedente. Suggerimento: fare clic su "Carica la cronologia della conversazione archiviata" senza specificare il file consente di visualizzare la cache degli archivi html di cronologia, fare clic su "Elimina tutti i record di cronologia delle conversazioni locali" per eliminare tutte le cache degli archivi html.
## Nuove funzionalità dinamiche
1. Funzionalità di salvataggio della conversazione. Nell'area dei plugin della funzione, fare clic su "Salva la conversazione corrente" per salvare la conversazione corrente come file html leggibile e ripristinabile, inoltre, nell'area dei plugin della funzione (menu a discesa), fare clic su "Carica la cronologia della conversazione archiviata" per ripristinare la conversazione precedente. Suggerimento: fare clic su "Carica la cronologia della conversazione archiviata" senza specificare il file consente di visualizzare la cache degli archivi html di cronologia, fare clic su "Elimina tutti i record di cronologia delle conversazioni locali" per eliminare tutte le cache degli archivi html.
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
</div>

查看文件

@@ -17,7 +17,9 @@ GPT를 이용하여 프로젝트를 임의의 언어로 번역하려면 [`multi_
>
> 3. 이 프로젝트는 국내 언어 모델 chatglm과 RWKV, 판고 등의 시도와 호환 가능합니다. 여러 개의 api-key를 지원하며 설정 파일에 "API_KEY="openai-key1,openai-key2,api2d-key3""와 같이 작성할 수 있습니다. `API_KEY`를 임시로 변경해야하는 경우 입력 영역에 임시 `API_KEY`를 입력 한 후 엔터 키를 누르면 즉시 적용됩니다.
<div align="center">기능 | 설명
<div align="center">
기능 | 설명
--- | ---
원 키워드 | 원 키워드 및 논문 문법 오류를 찾는 기능 지원
한-영 키워드 | 한-영 키워드 지원

查看文件

@@ -2,7 +2,7 @@
>
> Ao instalar as dependências, por favor, selecione rigorosamente as versões **especificadas** no arquivo requirements.txt.
>
> `pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/`
> `pip install -r requirements.txt`
>
# <img src="logo.png" width="40" > Otimização acadêmica GPT (GPT Academic)
@@ -18,7 +18,9 @@ Para traduzir este projeto para qualquer idioma com o GPT, leia e execute [`mult
>
> 3. Este projeto é compatível com e incentiva o uso de modelos de linguagem nacionais, como chatglm e RWKV, Pangolin, etc. Suporta a coexistência de várias chaves de API e pode ser preenchido no arquivo de configuração como `API_KEY="openai-key1,openai-key2,api2d-key3"`. Quando precisar alterar temporariamente o `API_KEY`, basta digitar o `API_KEY` temporário na área de entrada e pressionar Enter para que ele entre em vigor.
<div align="center">Funcionalidade | Descrição
<div align="center">
Funcionalidade | Descrição
--- | ---
Um clique de polimento | Suporte a um clique polimento, um clique encontrar erros de gramática no artigo
Tradução chinês-inglês de um clique | Tradução chinês-inglês de um clique
@@ -216,7 +218,9 @@ Para mais detalhes, consulte o [Guia do plug-in de função.](https://github.com
---
# Última atualização
## Novas funções dinâmicas.1. Função de salvamento de diálogo. Ao chamar o plug-in de função "Salvar diálogo atual", é possível salvar o diálogo atual em um arquivo html legível e reversível. Além disso, ao chamar o plug-in de função "Carregar arquivo de histórico de diálogo" no menu suspenso da área de plug-in, é possível restaurar uma conversa anterior. Dica: clicar em "Carregar arquivo de histórico de diálogo" sem especificar um arquivo permite visualizar o cache do arquivo html de histórico. Clicar em "Excluir todo o registro de histórico de diálogo local" permite excluir todo o cache de arquivo html.
## Novas funções dinâmicas.
1. Função de salvamento de diálogo. Ao chamar o plug-in de função "Salvar diálogo atual", é possível salvar o diálogo atual em um arquivo html legível e reversível. Além disso, ao chamar o plug-in de função "Carregar arquivo de histórico de diálogo" no menu suspenso da área de plug-in, é possível restaurar uma conversa anterior. Dica: clicar em "Carregar arquivo de histórico de diálogo" sem especificar um arquivo permite visualizar o cache do arquivo html de histórico. Clicar em "Excluir todo o registro de histórico de diálogo local" permite excluir todo o cache de arquivo html.
<div align="center">
<img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
</div>

二进制文件未显示。

查看文件

@@ -58,6 +58,8 @@
"连接网络回答问题": "ConnectToNetworkToAnswerQuestions",
"联网的ChatGPT": "ChatGPTConnectedToNetwork",
"解析任意code项目": "ParseAnyCodeProject",
"读取知识库作答": "ReadKnowledgeArchiveAnswerQuestions",
"知识库问答": "UpdateKnowledgeArchive",
"同时问询_指定模型": "InquireSimultaneously_SpecifiedModel",
"图片生成": "ImageGeneration",
"test_解析ipynb文件": "Test_ParseIpynbFile",
@@ -1665,5 +1667,294 @@
"段音频的主要内容": "The main content of the segment audio is",
"z$ 分别是空间直角坐标系中的三个坐标": "z$, respectively, are the three coordinates in the spatial rectangular coordinate system",
"这个是怎么识别的呢我也不清楚": "I'm not sure how this is recognized",
"从现在起": "From now on"
"从现在起": "From now on",
"连接bing搜索回答问题": "ConnectBingSearchAnswerQuestion",
"联网的ChatGPT_bing版": "OnlineChatGPT_BingEdition",
"Markdown翻译指定语言": "TranslateMarkdownToSpecifiedLanguage",
"Langchain知识库": "LangchainKnowledgeBase",
"Latex英文纠错加PDF对比": "CorrectEnglishInLatexWithPDFComparison",
"Latex输出PDF结果": "OutputPDFFromLatex",
"Latex翻译中文并重新编译PDF": "TranslateChineseToEnglishInLatexAndRecompilePDF",
"sprint亮靛": "SprintIndigo",
"寻找Latex主文件": "FindLatexMainFile",
"专业词汇声明": "ProfessionalTerminologyDeclaration",
"Latex精细分解与转化": "DecomposeAndConvertLatex",
"编译Latex": "CompileLatex",
"如果您是论文原作者": "If you are the original author of the paper",
"正在编译对比PDF": "Compiling the comparison PDF",
"将 \\include 命令转换为 \\input 命令": "Converting the \\include command to the \\input command",
"取评分最高者返回": "Returning the highest-rated one",
"不要修改!! 高危设置!通过修改此设置": "Do not modify!! High-risk setting! By modifying this setting",
"Tex源文件缺失": "Tex source file is missing!",
"6.25 加入判定latex模板的代码": "Added code to determine the latex template on June 25",
"正在精细切分latex文件": "Finely splitting the latex file",
"获取response失败": "Failed to get response",
"手动指定语言": "Manually specify the language",
"输入arxivID": "Enter arxivID",
"对输入的word文档进行摘要生成": "Generate a summary of the input word document",
"将指定目录下的PDF文件从英文翻译成中文": "Translate PDF files from English to Chinese in the specified directory",
"如果分析错误": "If the analysis is incorrect",
"尝试第": "Try the",
"用户填3": "User fills in 3",
"请在此处追加更细致的矫错指令": "Please append more detailed correction instructions here",
"为了防止大语言模型的意外谬误产生扩散影响": "To prevent the accidental spread of errors in large language models",
"前面是中文冒号": "The colon before is in Chinese",
"内含已经翻译的Tex文档": "Contains a Tex document that has been translated",
"成功啦": "Success!",
"刷新页面即可以退出UpdateKnowledgeArchive模式": "Refresh the page to exit UpdateKnowledgeArchive mode",
"或者不在环境变量PATH中": "Or not in the environment variable PATH",
"--读取文件": "--Read the file",
"才能继续下面的步骤": "To continue with the next steps",
"代理数据解析失败": "Proxy data parsing failed",
"详见项目主README.md": "See the main README.md of the project for details",
"临时存储用于调试": "Temporarily stored for debugging",
"屏蔽空行和太短的句子": "Filter out empty lines and sentences that are too short",
"gpt 多线程请求": "GPT multi-threaded request",
"编译已经开始": "Compilation has started",
"无法找到一个主Tex文件": "Cannot find a main Tex file",
"修复括号": "Fix parentheses",
"请您不要删除或修改这行警告": "Please do not delete or modify this warning",
"请登录OpenAI查看详情 https": "Please log in to OpenAI to view details at https",
"调用函数": "Call a function",
"请查看终端的输出或耐心等待": "Please check the output in the terminal or wait patiently",
"LatexEnglishCorrection+高亮修正位置": "Latex English correction + highlight correction position",
"行": "line",
"Newbing 请求失败": "Newbing request failed",
"转化PDF编译是否成功": "Check if the conversion to PDF and compilation were successful",
"建议更换代理协议": "Recommend changing the proxy protocol",
"========================================= 插件主程序1 =====================================================": "========================================= Plugin Main Program 1 =====================================================",
"终端": "terminal",
"请先上传文件素材": "Please upload file materials first",
"前面是中文逗号": "There is a Chinese comma in front",
"请尝试把以下指令复制到高级参数区": "Please try copying the following instructions to the advanced parameters section",
"翻译-": "Translation -",
"请耐心等待": "Please be patient",
"将前后断行符脱离": "Remove line breaks before and after",
"json等": "JSON, etc.",
"生成中文PDF": "Generate Chinese PDF",
"用红色标注处保留区": "Use red color to highlight the reserved area",
"对比PDF编译是否成功": "Compare if the PDF compilation was successful",
"回答完问题后": "After answering the question",
"其他操作系统表现未知": "Unknown performance on other operating systems",
"-构建知识库": "Build knowledge base",
"还原原文": "Restore original text",
"或者重启之后再度尝试": "Or try again after restarting",
"免费": "Free",
"仅在Windows系统进行了测试": "Tested only on Windows system",
"欢迎加REAME中的QQ联系开发者": "Feel free to contact the developer via QQ in REAME",
"当前知识库内的有效文件": "Valid files in the current knowledge base",
"您可以到Github Issue区": "You can go to the Github Issue area",
"刷新Gradio前端界面": "Refresh the Gradio frontend interface",
"吸收title与作者以上的部分": "Include the title and the above part of the author",
"给出一些判定模板文档的词作为扣分项": "Provide some words in the template document as deduction items",
"--读取参数": "-- Read parameters",
"然后进行问答": "And then perform question-answering",
"根据自然语言执行插件命令": "Execute plugin commands based on natural language",
"*{\\scriptsize\\textbf{警告": "*{\\scriptsize\\textbf{Warning",
"但请查收结果": "But please check the results",
"翻译内容可靠性无保障": "No guarantee of translation accuracy",
"寻找主文件": "Find the main file",
"消耗时间的函数": "Time-consuming function",
"当前语言模型温度设定": "Current language model temperature setting",
"这需要一段时间计算": "This requires some time to calculate",
"为啥chatgpt会把cite里面的逗号换成中文逗号呀": "Why does ChatGPT change commas inside 'cite' to Chinese commas?",
"发现已经存在翻译好的PDF文档": "Found an already translated PDF document",
"待提取的知识库名称id": "Knowledge base name ID to be extracted",
"文本碎片重组为完整的tex片段": "Reassemble text fragments into complete tex fragments",
"注意事项": "Notes",
"参数说明": "Parameter description",
"或代理节点": "Or proxy node",
"构建知识库": "Building knowledge base",
"报错信息如下. 如果是与网络相关的问题": "Error message as follows. If it is related to network issues",
"功能描述": "Function description",
"禁止移除或修改此警告": "Removal or modification of this warning is prohibited",
"Arixv翻译": "Arixv translation",
"读取优先级": "Read priority",
"包含documentclass关键字": "Contains the documentclass keyword",
"根据文本使用GPT模型生成相应的图像": "Generate corresponding images using GPT model based on the text",
"图像生成所用到的提示文本": "Prompt text used for image generation",
"Your account is not active. OpenAI以账户失效为由": "Your account is not active. OpenAI states that it is due to account expiration",
"快捷的调试函数": "Convenient debugging function",
"在多Tex文档中": "In multiple Tex documents",
"因此选择GenerateImage函数": "Therefore, choose the GenerateImage function",
"当前工作路径为": "The current working directory is",
"实际得到格式": "Obtained format in reality",
"这段代码定义了一个名为TempProxy的空上下文管理器": "This code defines an empty context manager named TempProxy",
"吸收其他杂项": "Absorb other miscellaneous items",
"请输入要翻译成哪种语言": "Please enter which language to translate into",
"的单词": "of the word",
"正在尝试自动安装": "Attempting automatic installation",
"如果有必要": "If necessary",
"开始下载": "Start downloading",
"项目Github地址 \\url{https": "Project GitHub address \\url{https",
"将根据报错信息修正tex源文件并重试": "The Tex source file will be corrected and retried based on the error message",
"发送至azure openai api": "Send to Azure OpenAI API",
"吸收匿名公式": "Absorb anonymous formulas",
"用该压缩包+ConversationHistoryArchive进行反馈": "Provide feedback using the compressed package + ConversationHistoryArchive",
"需要特殊依赖": "Requires special dependencies",
"还原部分原文": "Restore part of the original text",
"构建完成": "Build completed",
"解析arxiv网址失败": "Failed to parse arXiv URL",
"输入问题后点击该插件": "Click the plugin after entering the question",
"请求子进程": "Requesting subprocess",
"请务必用 pip install -r requirements.txt 指令安装依赖": "Please make sure to install the dependencies using the 'pip install -r requirements.txt' command",
"如果程序停顿5分钟以上": "If the program pauses for more than 5 minutes",
"转化PDF编译已经成功": "Conversion to PDF compilation was successful",
"虽然PDF生成失败了": "Although PDF generation failed",
"分析上述回答": "Analyze the above answer",
"吸收在42行以内的begin-end组合": "Absorb the begin-end combination within 42 lines",
"推荐http": "Recommend http",
"Latex没有安装": "Latex is not installed",
"用latex编译为PDF对修正处做高亮": "Compile to PDF using LaTeX and highlight the corrections",
"reverse 操作必须放在最后": "'reverse' operation must be placed at the end",
"AZURE OPENAI API拒绝了请求": "AZURE OPENAI API rejected the request",
"该项目的Latex主文件是": "The main LaTeX file of this project is",
"You are associated with a deactivated account. OpenAI以账户失效为由": "You are associated with a deactivated account. OpenAI considers it as an account expiration",
"它*必须*被包含在AVAIL_LLM_MODELS列表中": "It *must* be included in the AVAIL_LLM_MODELS list",
"未知指令": "Unknown command",
"尝试执行Latex指令失败": "Failed to execute the LaTeX command",
"摘要生成后的文档路径": "Path of the document after summary generation",
"GPT结果已输出": "GPT result has been outputted",
"使用Newbing": "Using Newbing",
"其他模型转化效果未知": "Unknown conversion effect of other models",
"P.S. 但愿没人把latex模板放在里面传进来": "P.S. Hopefully, no one passes a LaTeX template in it",
"定位主Latex文件": "Locate the main LaTeX file",
"后面是英文冒号": "English colon follows",
"文档越长耗时越长": "The longer the document, the longer it takes.",
"压缩包": "Compressed file",
"但通常不会出现在正文": "But usually does not appear in the body.",
"正在预热文本向量化模组": "Preheating text vectorization module",
"5刀": "5 dollars",
"提问吧! 但注意": "Ask questions! But be careful",
"发送至AZURE OPENAI API": "Send to AZURE OPENAI API",
"请仔细鉴别并以原文为准": "Please carefully verify and refer to the original text",
"如果需要使用AZURE 详情请见额外文档 docs\\use_azure.md": "If you need to use AZURE, please refer to the additional document docs\\use_azure.md for details",
"使用正则表达式查找半行注释": "Use regular expressions to find inline comments",
"只有第二步成功": "Only the second step is successful",
"P.S. 顺便把CTEX塞进去以支持中文": "P.S. By the way, include CTEX to support Chinese",
"安装方法https": "Installation method: https",
"则跳过GPT请求环节": "Then skip the GPT request process",
"请切换至“UpdateKnowledgeArchive”插件进行知识库访问": "Please switch to the 'UpdateKnowledgeArchive' plugin for knowledge base access",
"=================================== 工具函数 ===============================================": "=================================== Utility functions ===============================================",
"填入azure openai api的密钥": "Fill in the Azure OpenAI API key",
"上传Latex压缩包": "Upload LaTeX compressed file",
"远程云服务器部署": "Deploy to remote cloud server",
"用黑色标注转换区": "Use black color to annotate the conversion area",
"音频文件的路径": "Path to the audio file",
"必须包含documentclass": "Must include documentclass",
"再列出用户可能提出的三个问题": "List three more questions that the user might ask",
"根据需要切换prompt": "Switch the prompt as needed",
"将文件复制一份到下载区": "Make a copy of the file in the download area",
"次编译": "Second compilation",
"Latex文件融合完成": "LaTeX file merging completed",
"返回": "Return",
"后面是英文逗号": "Comma after this",
"对不同latex源文件扣分": "Deduct points for different LaTeX source files",
"失败啦": "Failed",
"编译BibTex": "Compile BibTeX",
"Linux下必须使用Docker安装": "Must install using Docker on Linux",
"报错信息": "Error message",
"删除或修改歧义文件": "Delete or modify ambiguous files",
"-预热文本向量化模组": "- Preheating text vectorization module",
"将每次对话记录写入Markdown格式的文件中": "Write each conversation record into a file in Markdown format",
"其他类型文献转化效果未知": "Unknown conversion effect for other types of literature",
"获取线程锁": "Acquire thread lock",
"使用英文": "Use English",
"如果存在调试缓存文件": "If there is a debug cache file",
"您需要首先调用构建知识库": "You need to call the knowledge base building first",
"原始PDF编译是否成功": "Whether the original PDF compilation is successful",
"生成 azure openai api请求": "Generate Azure OpenAI API requests",
"正在编译PDF": "Compiling PDF",
"仅调试": "Debug only",
"========================================= 插件主程序2 =====================================================": "========================================= Plugin Main Program 2 =====================================================",
"多线程翻译开始": "Multithreaded translation begins",
"出问题了": "There is a problem",
"版权归原文作者所有": "Copyright belongs to the original author",
"当前大语言模型": "Current large language model",
"目前对机器学习类文献转化效果最好": "Currently, the best conversion effect for machine learning literature",
"这个paper有个input命令文件名大小写错误": "This paper has an input command with a filename case error!",
"期望格式例如": "Expected format, for example",
"解决部分词汇翻译不准确的问题": "Resolve the issue of inaccurate translation for some terms",
"待注入的知识库名称id": "Name/ID of the knowledge base to be injected",
"精细切分latex文件": "Fine-grained segmentation of LaTeX files",
"永远给定None": "Always given None",
"work_folder = Latex预处理": "work_folder = LaTeX preprocessing",
"请直接去该路径下取回翻译结果": "Please directly go to the path to retrieve the translation results",
"寻找主tex文件": "Finding the main .tex file",
"模型参数": "Model parameters",
"返回找到的第一个": "Return the first one found",
"编译转化后的PDF": "Compile the converted PDF",
"\\SEAFILE_LOCALŅ03047\\我的资料库\\music\\Akie秋绘-未来轮廓.mp3": "\\SEAFILE_LOCALŅ03047\\My Library\\music\\Akie秋绘-未来轮廓.mp3",
"拆分过长的latex片段": "Splitting overly long LaTeX fragments",
"没有找到任何可读取文件": "No readable files found",
"暗色模式 / 亮色模式": "Dark mode / Light mode",
"检测到arxiv文档连接": "Detected arXiv document link",
"此插件Windows支持最佳": "This plugin has best support for Windows",
"from crazy_functions.虚空终端 import 终端": "from crazy_functions.null_terminal import Terminal",
"本地论文翻译": "Local paper translation",
"输出html调试文件": "Output HTML debugging file",
"以下所有配置也都支持利用环境变量覆写": "All the following configurations can also be overridden using environment variables",
"PDF文件所在的路径": "Path of the PDF file",
"也是可读的": "It is also readable",
"将消耗较长时间下载中文向量化模型": "Downloading Chinese vectorization model will take a long time",
"环境变量配置格式见docker-compose.yml": "See docker-compose.yml for the format of environment variable configuration",
"编译文献交叉引用": "Compile bibliographic cross-references",
"默认为default": "Default is 'default'",
"或者使用此插件继续上传更多文件": "Or use this plugin to continue uploading more files",
"该PDF由GPT-Academic开源项目调用大语言模型+Latex翻译插件一键生成": "This PDF is generated by the GPT-Academic open-source project using a large language model + LaTeX translation plugin",
"使用latexdiff生成论文转化前后对比": "Use latexdiff to generate before and after comparison of paper transformation",
"正在编译PDF文档": "Compiling PDF document",
"读取config.py文件中关于AZURE OPENAI API的信息": "Read the information about AZURE OPENAI API from the config.py file",
"配置教程&视频教程": "Configuration tutorial & video tutorial",
"临时地启动代理网络": "Temporarily start proxy network",
"临时地激活代理网络": "Temporarily activate proxy network",
"功能尚不稳定": "Functionality is unstable",
"默认为Chinese": "Default is Chinese",
"请查收结果": "Please check the results",
"将 chatglm 直接对齐到 chatglm2": "Align chatglm directly to chatglm2",
"中读取数据构建知识库": "Build a knowledge base by reading data in",
"用于给一小段代码上代理": "Used to proxy a small piece of code",
"分析结果": "Analysis results",
"依赖不足": "Insufficient dependencies",
"Markdown翻译": "Markdown translation",
"除非您是论文的原作者": "Unless you are the original author of the paper",
"test_LangchainKnowledgeBase读取": "test_LangchainKnowledgeBase read",
"将多文件tex工程融合为一个巨型tex": "Merge multiple tex projects into one giant tex",
"吸收iffalse注释": "Absorb iffalser comments",
"您接下来不能再使用其他插件了": "You can no longer use other plugins next",
"正在构建知识库": "Building knowledge base",
"需Latex": "Requires Latex",
"即找不到": "That is not found",
"保证括号正确": "Ensure parentheses are correct",
"= 2 通过一些Latex模板中常见": "= 2 through some common Latex templates",
"请立即终止程序": "Please terminate the program immediately",
"解压失败! 需要安装pip install rarfile来解压rar文件": "Decompression failed! Install 'pip install rarfile' to decompress rar files",
"请在此处给出自定义翻译命令": "Please provide custom translation command here",
"解压失败! 需要安装pip install py7zr来解压7z文件": "Decompression failed! Install 'pip install py7zr' to decompress 7z files",
"执行错误": "Execution error",
"目前仅支持GPT3.5/GPT4": "Currently only supports GPT3.5/GPT4",
"P.S. 顺便把Latex的注释去除": "P.S. Also remove comments from Latex",
"写出文件": "Write out the file",
"当前报错的latex代码处于第": "The current error in the LaTeX code is on line",
"主程序即将开始": "Main program is about to start",
"详情信息见requirements.txt": "See details in requirements.txt",
"释放线程锁": "Release thread lock",
"由于最为关键的转化PDF编译失败": "Due to the critical failure of PDF conversion and compilation",
"即将退出": "Exiting soon",
"尝试下载": "Attempting to download",
"删除整行的空注释": "Remove empty comments from the entire line",
"也找不到": "Not found either",
"从一批文件": "From a batch of files",
"编译结束": "Compilation finished",
"调用缓存": "Calling cache",
"只有GenerateImage和生成图像相关": "Only GenerateImage and image generation related",
"待处理的word文档路径": "Path of the word document to be processed",
"是否在提交时自动清空输入框": "Whether to automatically clear the input box upon submission",
"检查结果": "Check the result",
"生成时间戳": "Generate a timestamp",
"编译原始PDF": "Compile the original PDF",
"填入ENGINE": "Fill in ENGINE",
"填入api版本": "Fill in the API version",
"中文Bing版": "Chinese Bing version",
"当前支持的格式包括": "Currently supported formats include"
}

119
docs/use_azure.md 普通文件
查看文件

@@ -0,0 +1,119 @@
# 通过微软Azure云服务申请 Openai API
由于Openai和微软的关系,现在是可以通过微软的Azure云计算服务直接访问openai的api,免去了注册和网络的问题。
快速入门的官方文档的链接是:[快速入门 - 开始通过 Azure OpenAI 服务使用 ChatGPT 和 GPT-4 - Azure OpenAI Service | Microsoft Learn](https://learn.microsoft.com/zh-cn/azure/cognitive-services/openai/chatgpt-quickstart?pivots=programming-language-python)
# 申请API
按文档中的“先决条件”的介绍,出了编程的环境以外,还需要以下三个条件:
1.  Azure账号并创建订阅
2.  为订阅添加Azure OpenAI 服务
3.  部署模型
## Azure账号并创建订阅
### Azure账号
创建Azure的账号时最好是有微软的账号,这样似乎更容易获得免费额度第一个月的200美元,实测了一下,如果用一个刚注册的微软账号登录Azure的话,并没有这一个月的免费额度
创建Azure账号的网址是[立即创建 Azure 免费帐户 | Microsoft Azure](https://azure.microsoft.com/zh-cn/free/)
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_944786_iH6AECuZ_tY0EaBd_1685327219?w=1327\&h=695\&type=image/png)
打开网页后,点击 “免费开始使用” 会跳转到登录或注册页面,如果有微软的账户,直接登录即可,如果没有微软账户,那就需要到微软的网页再另行注册一个。
注意,Azure的页面和政策时不时会变化,已实际最新显示的为准就好。
### 创建订阅
注册好Azure后便可进入主页
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_444847_tk-9S-pxOYuaLs_K_1685327675?w=1865\&h=969\&type=image/png)
首先需要在订阅里进行添加操作,点开后即可进入订阅的页面:
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_612820_z_1AlaEgnJR-rUl0_1685327892?w=1865\&h=969\&type=image/png)
第一次进来应该是空的,点添加即可创建新的订阅可以是“免费”或者“即付即用”的订阅,其中订阅ID是后面申请Azure OpenAI需要使用的。
## 为订阅添加Azure OpenAI服务
之后回到首页,点Azure OpenAI即可进入OpenAI服务的页面如果不显示的话,则在首页上方的搜索栏里搜索“openai”即可
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_269759_nExkGcPC0EuAR5cp_1685328130?w=1865\&h=969\&type=image/png)
不过现在这个服务还不能用。在使用前,还需要在这个网址申请一下:
[Request Access to Azure OpenAI Service (microsoft.com)](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUOFA5Qk1UWDRBMjg0WFhPMkIzTzhKQ1dWNyQlQCN0PWcu)
这里有二十来个问题,按照要求和自己的实际情况填写即可。
其中需要注意的是
1.  千万记得填对"订阅ID"
2.  需要填一个公司邮箱(可以不是注册用的邮箱)和公司网址
之后,在回到上面那个页面,点创建,就会进入创建页面了:
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_72708_9d9JYhylPVz3dFWL_1685328372?w=824\&h=590\&type=image/png)
需要填入“资源组”和“名称”,按照自己的需要填入即可。
完成后,在主页的“资源”里就可以看到刚才创建的“资源”了,点击进入后,就可以进行最后的部署了。
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_871541_CGCnbgtV9Uk1Jccy_1685329861?w=1217\&h=628\&type=image/png)
## 部署模型
进入资源页面后,在部署模型前,可以先点击“开发”,把密钥和终结点记下来。
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_852567_dxCZOrkMlWDSLH0d_1685330736?w=856\&h=568\&type=image/png)
之后,就可以去部署模型了,点击“部署”即可,会跳转到 Azure OpenAI Stuido 进行下面的操作:
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_169225_uWs1gMhpNbnwW4h2_1685329901?w=1865\&h=969\&type=image/png)
进入 Azure OpenAi Studio 后,点击新建部署,会弹出如下对话框:
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_391255_iXUSZAzoud5qlxjJ_1685330224?w=656\&h=641\&type=image/png)
在这里选 gpt-35-turbo 或需要的模型并按需要填入“部署名”即可完成模型的部署。
![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_724099_vBaHcUilsm1EtPgK_1685330396?w=1869\&h=482\&type=image/png)
这个部署名需要记下来。
到现在为止,申请操作就完成了,需要记下来的有下面几个东西:
 密钥对应AZURE_API_KEY,1或2都可以
● 终结点 对应AZURE_ENDPOINT
 部署名对应AZURE_ENGINE,不是模型名
# 修改 config.py
```
LLM_MODEL = "azure-gpt-3.5" # 指定启动时的默认模型,当然事后从下拉菜单选也ok
AZURE_ENDPOINT = "填入终结点" # 见上述图片
AZURE_API_KEY = "填入azure openai api的密钥"
AZURE_API_VERSION = "2023-05-15" # 默认使用 2023-05-15 版本,无需修改
AZURE_ENGINE = "填入部署名" # 见上述图片
```
# 关于费用
Azure OpenAI API 还是需要一些费用的免费订阅只有1个月有效期
具体可以可以看这个网址 [Azure OpenAI 服务 - 定价| Microsoft Azure](https://azure.microsoft.com/zh-cn/pricing/details/cognitive-services/openai-service/?cdn=disable)
并非网上说的什么“一年白嫖”,但注册方法以及网络问题都比直接使用openai的api要简单一些。

58
main.py
查看文件

@@ -2,12 +2,12 @@ import os; os.environ['no_proxy'] = '*' # 避免代理网络产生意外污染
def main():
import gradio as gr
if gr.__version__ not in ['3.28.3','3.32.2']: assert False, "用 pip install -r requirements.txt 安装依赖"
if gr.__version__ not in ['3.28.3','3.32.2']: assert False, "需要特殊依赖,请务必用 pip install -r requirements.txt 指令安装依赖,详情信息见requirements.txt"
from request_llm.bridge_all import predict
from toolbox import format_io, find_free_port, on_file_uploaded, on_report_generated, get_conf, ArgsGeneralWrapper, DummyWith
from toolbox import format_io, find_free_port, on_file_uploaded, on_report_generated, get_conf, ArgsGeneralWrapper, load_chat_cookies, DummyWith
# 建议您复制一个config_private.py放自己的秘密, 如API和代理网址, 避免不小心传github被别人看到
proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION, CHATBOT_HEIGHT, LAYOUT, API_KEY, AVAIL_LLM_MODELS = \
get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION', 'CHATBOT_HEIGHT', 'LAYOUT', 'API_KEY', 'AVAIL_LLM_MODELS')
proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION, CHATBOT_HEIGHT, LAYOUT, AVAIL_LLM_MODELS, AUTO_CLEAR_TXT = \
get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION', 'CHATBOT_HEIGHT', 'LAYOUT', 'AVAIL_LLM_MODELS', 'AUTO_CLEAR_TXT')
# 如果WEB_PORT是-1, 则随机选取WEB端口
PORT = find_free_port() if WEB_PORT <= 0 else WEB_PORT
@@ -45,23 +45,23 @@ def main():
proxy_info = check_proxy(proxies)
gr_L1 = lambda: gr.Row().style()
gr_L2 = lambda scale: gr.Column(scale=scale)
gr_L2 = lambda scale, elem_id: gr.Column(scale=scale, elem_id=elem_id)
if LAYOUT == "TOP-DOWN":
gr_L1 = lambda: DummyWith()
gr_L2 = lambda scale: gr.Row()
gr_L2 = lambda scale, elem_id: gr.Row()
CHATBOT_HEIGHT /= 2
cancel_handles = []
with gr.Blocks(title="ChatGPT 学术优化", theme=set_theme, analytics_enabled=False, css=advanced_css) as demo:
gr.HTML(title_html)
cookies = gr.State({'api_key': API_KEY, 'llm_model': LLM_MODEL})
cookies = gr.State(load_chat_cookies())
with gr_L1():
with gr_L2(scale=2):
chatbot = gr.Chatbot(label=f"当前模型:{LLM_MODEL}")
chatbot.style(height=CHATBOT_HEIGHT)
with gr_L2(scale=2, elem_id="gpt-chat"):
chatbot = gr.Chatbot(label=f"当前模型:{LLM_MODEL}", elem_id="gpt-chatbot")
if LAYOUT == "TOP-DOWN": chatbot.style(height=CHATBOT_HEIGHT)
history = gr.State([])
with gr_L2(scale=1):
with gr.Accordion("输入区", open=True) as area_input_primary:
with gr_L2(scale=1, elem_id="gpt-panel"):
with gr.Accordion("输入区", open=True, elem_id="input-panel") as area_input_primary:
with gr.Row():
txt = gr.Textbox(show_label=False, placeholder="Input question here.").style(container=False)
with gr.Row():
@@ -71,14 +71,14 @@ def main():
stopBtn = gr.Button("停止", variant="secondary"); stopBtn.style(size="sm")
clearBtn = gr.Button("清除", variant="secondary", visible=False); clearBtn.style(size="sm")
with gr.Row():
status = gr.Markdown(f"Tip: 按Enter提交, 按Shift+Enter换行。当前模型: {LLM_MODEL} \n {proxy_info}")
with gr.Accordion("基础功能区", open=True) as area_basic_fn:
status = gr.Markdown(f"Tip: 按Enter提交, 按Shift+Enter换行。当前模型: {LLM_MODEL} \n {proxy_info}", elem_id="state-panel")
with gr.Accordion("基础功能区", open=True, elem_id="basic-panel") as area_basic_fn:
with gr.Row():
for k in functional:
if ("Visible" in functional[k]) and (not functional[k]["Visible"]): continue
variant = functional[k]["Color"] if "Color" in functional[k] else "secondary"
functional[k]["Button"] = gr.Button(k, variant=variant)
with gr.Accordion("函数插件区", open=True) as area_crazy_fn:
with gr.Accordion("函数插件区", open=True, elem_id="plugin-panel") as area_crazy_fn:
with gr.Row():
gr.Markdown("注意:以下“红颜色”标识的函数插件需从输入区读取路径作为参数.")
with gr.Row():
@@ -100,16 +100,16 @@ def main():
with gr.Row():
with gr.Accordion("点击展开“文件上传区”。上传本地文件可供红色函数插件调用。", open=False) as area_file_up:
file_upload = gr.Files(label="任何文件, 但推荐上传压缩文件(zip, tar)", file_count="multiple")
with gr.Accordion("更换模型 & SysPrompt & 交互界面布局", open=(LAYOUT == "TOP-DOWN")):
with gr.Accordion("更换模型 & SysPrompt & 交互界面布局", open=(LAYOUT == "TOP-DOWN"), elem_id="interact-panel"):
system_prompt = gr.Textbox(show_label=True, placeholder=f"System Prompt", label="System prompt", value=initial_prompt)
top_p = gr.Slider(minimum=-0, maximum=1.0, value=1.0, step=0.01,interactive=True, label="Top-p (nucleus sampling)",)
temperature = gr.Slider(minimum=-0, maximum=2.0, value=1.0, step=0.01, interactive=True, label="Temperature",)
max_length_sl = gr.Slider(minimum=256, maximum=4096, value=512, step=1, interactive=True, label="Local LLM MaxLength",)
max_length_sl = gr.Slider(minimum=256, maximum=8192, value=4096, step=1, interactive=True, label="Local LLM MaxLength",)
checkboxes = gr.CheckboxGroup(["基础功能区", "函数插件区", "底部输入区", "输入清除键", "插件参数区"], value=["基础功能区", "函数插件区"], label="显示/隐藏功能区")
md_dropdown = gr.Dropdown(AVAIL_LLM_MODELS, value=LLM_MODEL, label="更换LLM模型/请求源").style(container=False)
gr.Markdown(description)
with gr.Accordion("备选输入区", open=True, visible=False) as area_input_secondary:
with gr.Accordion("备选输入区", open=True, visible=False, elem_id="input-panel2") as area_input_secondary:
with gr.Row():
txt2 = gr.Textbox(show_label=False, placeholder="Input question here.", label="输入区2").style(container=False)
with gr.Row():
@@ -144,6 +144,11 @@ def main():
resetBtn2.click(lambda: ([], [], "已重置"), None, [chatbot, history, status])
clearBtn.click(lambda: ("",""), None, [txt, txt2])
clearBtn2.click(lambda: ("",""), None, [txt, txt2])
if AUTO_CLEAR_TXT:
submitBtn.click(lambda: ("",""), None, [txt, txt2])
submitBtn2.click(lambda: ("",""), None, [txt, txt2])
txt.submit(lambda: ("",""), None, [txt, txt2])
txt2.submit(lambda: ("",""), None, [txt, txt2])
# 基础功能区的回调函数注册
for k in functional:
if ("Visible" in functional[k]) and (not functional[k]["Visible"]): continue
@@ -155,7 +160,7 @@ def main():
for k in crazy_fns:
if not crazy_fns[k].get("AsButton", True): continue
click_handle = crazy_fns[k]["Button"].click(ArgsGeneralWrapper(crazy_fns[k]["Function"]), [*input_combo, gr.State(PORT)], output_combo)
click_handle.then(on_report_generated, [file_upload, chatbot], [file_upload, chatbot])
click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
cancel_handles.append(click_handle)
# 函数插件-下拉菜单与随变按钮的互动
def on_dropdown_changed(k):
@@ -171,15 +176,16 @@ def main():
return {chatbot: gr.update(label="当前模型:"+k)}
md_dropdown.select(on_md_dropdown_changed, [md_dropdown], [chatbot] )
# 随变按钮的回调函数注册
def route(k, *args, **kwargs):
def route(request: gr.Request, k, *args, **kwargs):
if k in [r"打开插件列表", r"请先从插件列表中选择"]: return
yield from ArgsGeneralWrapper(crazy_fns[k]["Function"])(*args, **kwargs)
yield from ArgsGeneralWrapper(crazy_fns[k]["Function"])(request, *args, **kwargs)
click_handle = switchy_bt.click(route,[switchy_bt, *input_combo, gr.State(PORT)], output_combo)
click_handle.then(on_report_generated, [file_upload, chatbot], [file_upload, chatbot])
click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
cancel_handles.append(click_handle)
# 终止按钮的回调函数注册
stopBtn.click(fn=None, inputs=None, outputs=None, cancels=cancel_handles)
stopBtn2.click(fn=None, inputs=None, outputs=None, cancels=cancel_handles)
demo.load(lambda: 0, inputs=None, outputs=None, _js='()=>{ChatBotHeight();}')
# gradio的inbrowser触发不太稳定,回滚代码到原始的浏览器打开函数
def auto_opentab_delay():
@@ -197,7 +203,10 @@ def main():
threading.Thread(target=warm_up_modules, name="warm-up", daemon=True).start()
auto_opentab_delay()
demo.queue(concurrency_count=CONCURRENT_COUNT).launch(server_name="0.0.0.0", server_port=PORT, auth=AUTHENTICATION, favicon_path="docs/logo.png")
demo.queue(concurrency_count=CONCURRENT_COUNT).launch(
server_name="0.0.0.0", server_port=PORT,
favicon_path="docs/logo.png", auth=AUTHENTICATION,
blocked_paths=["config.py","config_private.py","docker-compose.yml","Dockerfile"])
# 如果需要在二级路径下运行
# CUSTOM_PATH, = get_conf('CUSTOM_PATH')
@@ -205,7 +214,8 @@ def main():
# from toolbox import run_gradio_in_subpath
# run_gradio_in_subpath(demo, auth=AUTHENTICATION, port=PORT, custom_path=CUSTOM_PATH)
# else:
# demo.launch(server_name="0.0.0.0", server_port=PORT, auth=AUTHENTICATION, favicon_path="docs/logo.png")
# demo.launch(server_name="0.0.0.0", server_port=PORT, auth=AUTHENTICATION, favicon_path="docs/logo.png",
# blocked_paths=["config.py","config_private.py","docker-compose.yml","Dockerfile"])
if __name__ == "__main__":
main()

查看文件

@@ -33,7 +33,7 @@ import pickle
import time
CACHE_FOLDER = "gpt_log"
blacklist = ['multi-language', 'gpt_log', '.git', 'private_upload', 'multi_language.py']
blacklist = ['multi-language', 'gpt_log', '.git', 'private_upload', 'multi_language.py', 'build', '.github', '.vscode', '__pycache__', 'venv']
# LANG = "TraditionalChinese"
# TransPrompt = f"Replace each json value `#` with translated results in Traditional Chinese, e.g., \"原始文本\":\"翻譯後文字\". Keep Json format. Do not answer #."
@@ -301,6 +301,7 @@ def step_1_core_key_translate():
elif isinstance(node, ast.ImportFrom):
for n in node.names:
if contains_chinese(n.name): syntax.append(n.name)
# if node.module is None: print(node.module)
for k in node.module.split('.'):
if contains_chinese(k): syntax.append(k)
return syntax
@@ -310,6 +311,7 @@ def step_1_core_key_translate():
for root, dirs, files in os.walk(directory_path):
if any([b in root for b in blacklist]):
continue
print(files)
for file in files:
if file.endswith('.py'):
file_path = os.path.join(root, file)
@@ -505,6 +507,6 @@ def step_2_core_key_translate():
with open(file_path_new, 'w', encoding='utf-8') as f:
f.write(content)
os.remove(file_path)
step_1_core_key_translate()
step_2_core_key_translate()
print('Finished, checkout generated results at ./multi-language/')

查看文件

@@ -19,9 +19,6 @@ from .bridge_chatgpt import predict as chatgpt_ui
from .bridge_chatglm import predict_no_ui_long_connection as chatglm_noui
from .bridge_chatglm import predict as chatglm_ui
from .bridge_newbing import predict_no_ui_long_connection as newbing_noui
from .bridge_newbing import predict as newbing_ui
# from .bridge_tgui import predict_no_ui_long_connection as tgui_noui
# from .bridge_tgui import predict as tgui_ui
@@ -48,10 +45,11 @@ class LazyloadTiktoken(object):
return encoder.decode(*args, **kwargs)
# Endpoint 重定向
API_URL_REDIRECT, = get_conf("API_URL_REDIRECT")
API_URL_REDIRECT, AZURE_ENDPOINT, AZURE_ENGINE = get_conf("API_URL_REDIRECT", "AZURE_ENDPOINT", "AZURE_ENGINE")
openai_endpoint = "https://api.openai.com/v1/chat/completions"
api2d_endpoint = "https://openai.api2d.net/v1/chat/completions"
newbing_endpoint = "wss://sydney.bing.com/sydney/ChatHub"
azure_endpoint = AZURE_ENDPOINT + f'openai/deployments/{AZURE_ENGINE}/chat/completions?api-version=2023-05-15'
# 兼容旧版的配置
try:
API_URL, = get_conf("API_URL")
@@ -84,6 +82,33 @@ model_info = {
"token_cnt": get_token_num_gpt35,
},
"gpt-3.5-turbo-16k": {
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": openai_endpoint,
"max_token": 1024*16,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
"gpt-3.5-turbo-0613": {
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": openai_endpoint,
"max_token": 4096,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
"gpt-3.5-turbo-16k-0613": {
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": openai_endpoint,
"max_token": 1024 * 16,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
"gpt-4": {
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
@@ -93,6 +118,16 @@ model_info = {
"token_cnt": get_token_num_gpt4,
},
# azure openai
"azure-gpt-3.5":{
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": azure_endpoint,
"max_token": 4096,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
# api_2d
"api2d-gpt-3.5-turbo": {
"fn_with_ui": chatgpt_ui,
@@ -112,7 +147,7 @@ model_info = {
"token_cnt": get_token_num_gpt4,
},
# chatglm
# chatglm 直接对齐到 chatglm2
"chatglm": {
"fn_with_ui": chatglm_ui,
"fn_without_ui": chatglm_noui,
@@ -121,12 +156,11 @@ model_info = {
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
# newbing
"newbing": {
"fn_with_ui": newbing_ui,
"fn_without_ui": newbing_noui,
"endpoint": newbing_endpoint,
"max_token": 4096,
"chatglm2": {
"fn_with_ui": chatglm_ui,
"fn_without_ui": chatglm_noui,
"endpoint": None,
"max_token": 1024,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
@@ -218,6 +252,23 @@ if "newbing-free" in AVAIL_LLM_MODELS:
})
except:
print(trimmed_format_exc())
if "newbing" in AVAIL_LLM_MODELS: # same with newbing-free
try:
from .bridge_newbingfree import predict_no_ui_long_connection as newbingfree_noui
from .bridge_newbingfree import predict as newbingfree_ui
# claude
model_info.update({
"newbing": {
"fn_with_ui": newbingfree_ui,
"fn_without_ui": newbingfree_noui,
"endpoint": newbing_endpoint,
"max_token": 4096,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
}
})
except:
print(trimmed_format_exc())
def LLM_CATCH_EXCEPTION(f):
"""

查看文件

@@ -40,12 +40,12 @@ class GetGLMHandle(Process):
while True:
try:
if self.chatglm_model is None:
self.chatglm_tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
self.chatglm_tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
device, = get_conf('LOCAL_MODEL_DEVICE')
if device=='cpu':
self.chatglm_model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()
self.chatglm_model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).float()
else:
self.chatglm_model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
self.chatglm_model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).half().cuda()
self.chatglm_model = self.chatglm_model.eval()
break
else:

查看文件

@@ -22,8 +22,8 @@ import importlib
# config_private.py放自己的秘密如API和代理网址
# 读取时首先看是否存在私密的config_private配置文件不受git管控,如果有,则覆盖原config文件
from toolbox import get_conf, update_ui, is_any_api_key, select_api_key, what_keys, clip_history, trimmed_format_exc
proxies, API_KEY, TIMEOUT_SECONDS, MAX_RETRY = \
get_conf('proxies', 'API_KEY', 'TIMEOUT_SECONDS', 'MAX_RETRY')
proxies, TIMEOUT_SECONDS, MAX_RETRY, API_ORG = \
get_conf('proxies', 'TIMEOUT_SECONDS', 'MAX_RETRY', 'API_ORG')
timeout_bot_msg = '[Local Message] Request timeout. Network error. Please check proxy settings in config.py.' + \
'网络错误,检查代理服务器是否可用,以及代理设置的格式是否正确,格式须是[协议]://[地址]:[端口],缺一不可。'
@@ -101,6 +101,8 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
if (time.time()-observe_window[1]) > watch_dog_patience:
raise RuntimeError("用户取消了程序。")
else: raise RuntimeError("意外Json结构"+delta)
if json_data['finish_reason'] == 'content_filter':
raise RuntimeError("由于提问含不合规内容被Azure过滤。")
if json_data['finish_reason'] == 'length':
raise ConnectionAbortedError("正常结束,但显示Token不足,导致输出不完整,请削减单次输入的文本量。")
return result
@@ -205,6 +207,7 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
chunk = get_full_error(chunk, stream_response)
chunk_decoded = chunk.decode()
error_msg = chunk_decoded
openai_website = ' 请登录OpenAI查看详情 https://platform.openai.com/signup'
if "reduce the length" in error_msg:
if len(history) >= 2: history[-1] = ""; history[-2] = "" # 清除当前溢出的输入history[-2] 是本次输入, history[-1] 是本次输出
history = clip_history(inputs=inputs, history=history, tokenizer=model_info[llm_kwargs['llm_model']]['tokenizer'],
@@ -214,9 +217,13 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
elif "does not exist" in error_msg:
chatbot[-1] = (chatbot[-1][0], f"[Local Message] Model {llm_kwargs['llm_model']} does not exist. 模型不存在, 或者您没有获得体验资格.")
elif "Incorrect API key" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] Incorrect API key. OpenAI以提供了不正确的API_KEY为由, 拒绝服务.")
chatbot[-1] = (chatbot[-1][0], "[Local Message] Incorrect API key. OpenAI以提供了不正确的API_KEY为由, 拒绝服务. " + openai_website)
elif "exceeded your current quota" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] You exceeded your current quota. OpenAI以账户额度不足为由, 拒绝服务.")
chatbot[-1] = (chatbot[-1][0], "[Local Message] You exceeded your current quota. OpenAI以账户额度不足为由, 拒绝服务." + openai_website)
elif "account is not active" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] Your account is not active. OpenAI以账户失效为由, 拒绝服务." + openai_website)
elif "associated with a deactivated account" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] You are associated with a deactivated account. OpenAI以账户失效为由, 拒绝服务." + openai_website)
elif "bad forward key" in error_msg:
chatbot[-1] = (chatbot[-1][0], "[Local Message] Bad forward key. API2D账户额度不足.")
elif "Not enough point" in error_msg:
@@ -241,6 +248,8 @@ def generate_payload(inputs, llm_kwargs, history, system_prompt, stream):
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}"
}
if API_ORG.startswith('org-'): headers.update({"OpenAI-Organization": API_ORG})
if llm_kwargs['llm_model'].startswith('azure-'): headers.update({"api-key": api_key})
conversation_cnt = len(history) // 2

查看文件

@@ -1,254 +0,0 @@
"""
========================================================================
第一部分来自EdgeGPT.py
https://github.com/acheong08/EdgeGPT
========================================================================
"""
from .edge_gpt import NewbingChatbot
load_message = "等待NewBing响应。"
"""
========================================================================
第二部分子进程Worker调用主体
========================================================================
"""
import time
import json
import re
import logging
import asyncio
import importlib
import threading
from toolbox import update_ui, get_conf, trimmed_format_exc
from multiprocessing import Process, Pipe
def preprocess_newbing_out(s):
pattern = r'\^(\d+)\^' # 匹配^数字^
sub = lambda m: '('+m.group(1)+')' # 将匹配到的数字作为替换值
result = re.sub(pattern, sub, s) # 替换操作
if '[1]' in result:
result += '\n\n```reference\n' + "\n".join([r for r in result.split('\n') if r.startswith('[')]) + '\n```\n'
return result
def preprocess_newbing_out_simple(result):
if '[1]' in result:
result += '\n\n```reference\n' + "\n".join([r for r in result.split('\n') if r.startswith('[')]) + '\n```\n'
return result
class NewBingHandle(Process):
def __init__(self):
super().__init__(daemon=True)
self.parent, self.child = Pipe()
self.newbing_model = None
self.info = ""
self.success = True
self.local_history = []
self.check_dependency()
self.start()
self.threadLock = threading.Lock()
def check_dependency(self):
try:
self.success = False
import certifi, httpx, rich
self.info = "依赖检测通过,等待NewBing响应。注意目前不能多人同时调用NewBing接口有线程锁,否则将导致每个人的NewBing问询历史互相渗透。调用NewBing时,会自动使用已配置的代理。"
self.success = True
except:
self.info = "缺少的依赖,如果要使用Newbing,除了基础的pip依赖以外,您还需要运行`pip install -r request_llm/requirements_newbing.txt`安装Newbing的依赖。"
self.success = False
def ready(self):
return self.newbing_model is not None
async def async_run(self):
# 读取配置
NEWBING_STYLE, = get_conf('NEWBING_STYLE')
from request_llm.bridge_all import model_info
endpoint = model_info['newbing']['endpoint']
while True:
# 等待
kwargs = self.child.recv()
question=kwargs['query']
history=kwargs['history']
system_prompt=kwargs['system_prompt']
# 是否重置
if len(self.local_history) > 0 and len(history)==0:
await self.newbing_model.reset()
self.local_history = []
# 开始问问题
prompt = ""
if system_prompt not in self.local_history:
self.local_history.append(system_prompt)
prompt += system_prompt + '\n'
# 追加历史
for ab in history:
a, b = ab
if a not in self.local_history:
self.local_history.append(a)
prompt += a + '\n'
# if b not in self.local_history:
# self.local_history.append(b)
# prompt += b + '\n'
# 问题
prompt += question
self.local_history.append(question)
print('question:', prompt)
# 提交
async for final, response in self.newbing_model.ask_stream(
prompt=question,
conversation_style=NEWBING_STYLE, # ["creative", "balanced", "precise"]
wss_link=endpoint, # "wss://sydney.bing.com/sydney/ChatHub"
):
if not final:
print(response)
self.child.send(str(response))
else:
print('-------- receive final ---------')
self.child.send('[Finish]')
# self.local_history.append(response)
def run(self):
"""
这个函数运行在子进程
"""
# 第一次运行,加载参数
self.success = False
self.local_history = []
if (self.newbing_model is None) or (not self.success):
# 代理设置
proxies, = get_conf('proxies')
if proxies is None:
self.proxies_https = None
else:
self.proxies_https = proxies['https']
# cookie
NEWBING_COOKIES, = get_conf('NEWBING_COOKIES')
try:
cookies = json.loads(NEWBING_COOKIES)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] 不能加载Newbing组件。NEWBING_COOKIES未填写或有格式错误。')
self.child.send('[Fail]')
self.child.send('[Finish]')
raise RuntimeError(f"不能加载Newbing组件。NEWBING_COOKIES未填写或有格式错误。")
try:
self.newbing_model = NewbingChatbot(proxy=self.proxies_https, cookies=cookies)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] 不能加载Newbing组件。{tb_str}')
self.child.send('[Fail]')
self.child.send('[Finish]')
raise RuntimeError(f"不能加载Newbing组件。")
self.success = True
try:
# 进入任务等待状态
asyncio.run(self.async_run())
except Exception:
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] Newbing失败 {tb_str}.')
self.child.send('[Fail]')
self.child.send('[Finish]')
def stream_chat(self, **kwargs):
"""
这个函数运行在主进程
"""
self.threadLock.acquire()
self.parent.send(kwargs) # 发送请求到子进程
while True:
res = self.parent.recv() # 等待newbing回复的片段
if res == '[Finish]':
break # 结束
elif res == '[Fail]':
self.success = False
break
else:
yield res # newbing回复的片段
self.threadLock.release()
"""
========================================================================
第三部分:主进程统一调用函数接口
========================================================================
"""
global newbing_handle
newbing_handle = None
def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="", observe_window=None, console_slience=False):
"""
多线程方法
函数的说明请见 request_llm/bridge_all.py
"""
global newbing_handle
if (newbing_handle is None) or (not newbing_handle.success):
newbing_handle = NewBingHandle()
observe_window[0] = load_message + "\n\n" + newbing_handle.info
if not newbing_handle.success:
error = newbing_handle.info
newbing_handle = None
raise RuntimeError(error)
# 没有 sys_prompt 接口,因此把prompt加入 history
history_feedin = []
for i in range(len(history)//2):
history_feedin.append([history[2*i], history[2*i+1]] )
watch_dog_patience = 5 # 看门狗 (watchdog) 的耐心, 设置5秒即可
response = ""
observe_window[0] = "[Local Message]: 等待NewBing响应中 ..."
for response in newbing_handle.stream_chat(query=inputs, history=history_feedin, system_prompt=sys_prompt, max_length=llm_kwargs['max_length'], top_p=llm_kwargs['top_p'], temperature=llm_kwargs['temperature']):
observe_window[0] = preprocess_newbing_out_simple(response)
if len(observe_window) >= 2:
if (time.time()-observe_window[1]) > watch_dog_patience:
raise RuntimeError("程序终止。")
return preprocess_newbing_out_simple(response)
def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt='', stream = True, additional_fn=None):
"""
单线程方法
函数的说明请见 request_llm/bridge_all.py
"""
chatbot.append((inputs, "[Local Message]: 等待NewBing响应中 ..."))
global newbing_handle
if (newbing_handle is None) or (not newbing_handle.success):
newbing_handle = NewBingHandle()
chatbot[-1] = (inputs, load_message + "\n\n" + newbing_handle.info)
yield from update_ui(chatbot=chatbot, history=[])
if not newbing_handle.success:
newbing_handle = None
return
if additional_fn is not None:
import core_functional
importlib.reload(core_functional) # 热更新prompt
core_functional = core_functional.get_core_functions()
if "PreProcess" in core_functional[additional_fn]: inputs = core_functional[additional_fn]["PreProcess"](inputs) # 获取预处理函数(如果有的话)
inputs = core_functional[additional_fn]["Prefix"] + inputs + core_functional[additional_fn]["Suffix"]
history_feedin = []
for i in range(len(history)//2):
history_feedin.append([history[2*i], history[2*i+1]] )
chatbot[-1] = (inputs, "[Local Message]: 等待NewBing响应中 ...")
response = "[Local Message]: 等待NewBing响应中 ..."
yield from update_ui(chatbot=chatbot, history=history, msg="NewBing响应缓慢,尚未完成全部响应,请耐心完成后再提交新问题。")
for response in newbing_handle.stream_chat(query=inputs, history=history_feedin, system_prompt=system_prompt, max_length=llm_kwargs['max_length'], top_p=llm_kwargs['top_p'], temperature=llm_kwargs['temperature']):
chatbot[-1] = (inputs, preprocess_newbing_out(response))
yield from update_ui(chatbot=chatbot, history=history, msg="NewBing响应缓慢,尚未完成全部响应,请耐心完成后再提交新问题。")
if response == "[Local Message]: 等待NewBing响应中 ...": response = "[Local Message]: NewBing响应异常,请刷新界面重试 ..."
history.extend([inputs, response])
logging.info(f'[raw_input] {inputs}')
logging.info(f'[response] {response}')
yield from update_ui(chatbot=chatbot, history=history, msg="完成全部响应,请提交新问题。")

查看文件

@@ -89,9 +89,6 @@ class NewBingHandle(Process):
if a not in self.local_history:
self.local_history.append(a)
prompt += a + '\n'
# if b not in self.local_history:
# self.local_history.append(b)
# prompt += b + '\n'
# 问题
prompt += question
@@ -101,7 +98,7 @@ class NewBingHandle(Process):
async for final, response in self.newbing_model.ask_stream(
prompt=question,
conversation_style=NEWBING_STYLE, # ["creative", "balanced", "precise"]
wss_link=endpoint, # "wss://sydney.bing.com/sydney/ChatHub"
wss_link=endpoint, # "wss://sydney.bing.com/sydney/ChatHub"
):
if not final:
print(response)
@@ -121,14 +118,26 @@ class NewBingHandle(Process):
self.local_history = []
if (self.newbing_model is None) or (not self.success):
# 代理设置
proxies, = get_conf('proxies')
proxies, NEWBING_COOKIES = get_conf('proxies', 'NEWBING_COOKIES')
if proxies is None:
self.proxies_https = None
else:
self.proxies_https = proxies['https']
if (NEWBING_COOKIES is not None) and len(NEWBING_COOKIES) > 100:
try:
cookies = json.loads(NEWBING_COOKIES)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] NEWBING_COOKIES未填写或有格式错误。')
self.child.send('[Fail]'); self.child.send('[Finish]')
raise RuntimeError(f"NEWBING_COOKIES未填写或有格式错误。")
else:
cookies = None
try:
self.newbing_model = NewbingChatbot(proxy=self.proxies_https)
self.newbing_model = NewbingChatbot(proxy=self.proxies_https, cookies=cookies)
except:
self.success = False
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
@@ -143,7 +152,7 @@ class NewBingHandle(Process):
asyncio.run(self.async_run())
except Exception:
tb_str = '\n```\n' + trimmed_format_exc() + '\n```\n'
self.child.send(f'[Local Message] Newbing失败 {tb_str}.')
self.child.send(f'[Local Message] Newbing 请求失败,报错信息如下. 如果是与网络相关的问题,建议更换代理协议推荐http或代理节点 {tb_str}.')
self.child.send('[Fail]')
self.child.send('[Finish]')
@@ -151,18 +160,14 @@ class NewBingHandle(Process):
"""
这个函数运行在主进程
"""
self.threadLock.acquire()
self.parent.send(kwargs) # 发送请求子进程
self.threadLock.acquire() # 获取线程锁
self.parent.send(kwargs) # 请求子进程
while True:
res = self.parent.recv() # 等待newbing回复的片段
if res == '[Finish]':
break # 结束
elif res == '[Fail]':
self.success = False
break
else:
yield res # newbing回复的片段
self.threadLock.release()
res = self.parent.recv() # 等待newbing回复的片段
if res == '[Finish]': break # 结束
elif res == '[Fail]': self.success = False; break # 失败
else: yield res # newbing回复的片段
self.threadLock.release() # 释放线程锁
"""

查看文件

@@ -1,4 +1,4 @@
from .bridge_newbing import preprocess_newbing_out, preprocess_newbing_out_simple
from .bridge_newbingfree import preprocess_newbing_out, preprocess_newbing_out_simple
from multiprocessing import Process, Pipe
from toolbox import update_ui, get_conf, trimmed_format_exc
import threading

查看文件

@@ -1,409 +0,0 @@
"""
========================================================================
第一部分来自EdgeGPT.py
https://github.com/acheong08/EdgeGPT
========================================================================
"""
import argparse
import asyncio
import json
import os
import random
import re
import ssl
import sys
import uuid
from enum import Enum
from typing import Generator
from typing import Literal
from typing import Optional
from typing import Union
import websockets.client as websockets
DELIMITER = "\x1e"
# Generate random IP between range 13.104.0.0/14
FORWARDED_IP = (
f"13.{random.randint(104, 107)}.{random.randint(0, 255)}.{random.randint(0, 255)}"
)
HEADERS = {
"accept": "application/json",
"accept-language": "en-US,en;q=0.9",
"content-type": "application/json",
"sec-ch-ua": '"Not_A Brand";v="99", "Microsoft Edge";v="110", "Chromium";v="110"',
"sec-ch-ua-arch": '"x86"',
"sec-ch-ua-bitness": '"64"',
"sec-ch-ua-full-version": '"109.0.1518.78"',
"sec-ch-ua-full-version-list": '"Chromium";v="110.0.5481.192", "Not A(Brand";v="24.0.0.0", "Microsoft Edge";v="110.0.1587.69"',
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-model": "",
"sec-ch-ua-platform": '"Windows"',
"sec-ch-ua-platform-version": '"15.0.0"',
"sec-fetch-dest": "empty",
"sec-fetch-mode": "cors",
"sec-fetch-site": "same-origin",
"x-ms-client-request-id": str(uuid.uuid4()),
"x-ms-useragent": "azsdk-js-api-client-factory/1.0.0-beta.1 core-rest-pipeline/1.10.0 OS/Win32",
"Referer": "https://www.bing.com/search?q=Bing+AI&showconv=1&FORM=hpcodx",
"Referrer-Policy": "origin-when-cross-origin",
"x-forwarded-for": FORWARDED_IP,
}
HEADERS_INIT_CONVER = {
"authority": "edgeservices.bing.com",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"accept-language": "en-US,en;q=0.9",
"cache-control": "max-age=0",
"sec-ch-ua": '"Chromium";v="110", "Not A(Brand";v="24", "Microsoft Edge";v="110"',
"sec-ch-ua-arch": '"x86"',
"sec-ch-ua-bitness": '"64"',
"sec-ch-ua-full-version": '"110.0.1587.69"',
"sec-ch-ua-full-version-list": '"Chromium";v="110.0.5481.192", "Not A(Brand";v="24.0.0.0", "Microsoft Edge";v="110.0.1587.69"',
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-model": '""',
"sec-ch-ua-platform": '"Windows"',
"sec-ch-ua-platform-version": '"15.0.0"',
"sec-fetch-dest": "document",
"sec-fetch-mode": "navigate",
"sec-fetch-site": "none",
"sec-fetch-user": "?1",
"upgrade-insecure-requests": "1",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.69",
"x-edge-shopping-flag": "1",
"x-forwarded-for": FORWARDED_IP,
}
def get_ssl_context():
import certifi
ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations(certifi.where())
return ssl_context
class NotAllowedToAccess(Exception):
pass
class ConversationStyle(Enum):
creative = "h3imaginative,clgalileo,gencontentv3"
balanced = "galileo"
precise = "h3precise,clgalileo"
CONVERSATION_STYLE_TYPE = Optional[
Union[ConversationStyle, Literal["creative", "balanced", "precise"]]
]
def _append_identifier(msg: dict) -> str:
"""
Appends special character to end of message to identify end of message
"""
# Convert dict to json string
return json.dumps(msg) + DELIMITER
def _get_ran_hex(length: int = 32) -> str:
"""
Returns random hex string
"""
return "".join(random.choice("0123456789abcdef") for _ in range(length))
class _ChatHubRequest:
"""
Request object for ChatHub
"""
def __init__(
self,
conversation_signature: str,
client_id: str,
conversation_id: str,
invocation_id: int = 0,
) -> None:
self.struct: dict = {}
self.client_id: str = client_id
self.conversation_id: str = conversation_id
self.conversation_signature: str = conversation_signature
self.invocation_id: int = invocation_id
def update(
self,
prompt,
conversation_style,
options,
) -> None:
"""
Updates request object
"""
if options is None:
options = [
"deepleo",
"enable_debug_commands",
"disable_emoji_spoken_text",
"enablemm",
]
if conversation_style:
if not isinstance(conversation_style, ConversationStyle):
conversation_style = getattr(ConversationStyle, conversation_style)
options = [
"nlu_direct_response_filter",
"deepleo",
"disable_emoji_spoken_text",
"responsible_ai_policy_235",
"enablemm",
conversation_style.value,
"dtappid",
"cricinfo",
"cricinfov2",
"dv3sugg",
]
self.struct = {
"arguments": [
{
"source": "cib",
"optionsSets": options,
"sliceIds": [
"222dtappid",
"225cricinfo",
"224locals0",
],
"traceId": _get_ran_hex(32),
"isStartOfSession": self.invocation_id == 0,
"message": {
"author": "user",
"inputMethod": "Keyboard",
"text": prompt,
"messageType": "Chat",
},
"conversationSignature": self.conversation_signature,
"participant": {
"id": self.client_id,
},
"conversationId": self.conversation_id,
},
],
"invocationId": str(self.invocation_id),
"target": "chat",
"type": 4,
}
self.invocation_id += 1
class _Conversation:
"""
Conversation API
"""
def __init__(
self,
cookies,
proxy,
) -> None:
self.struct: dict = {
"conversationId": None,
"clientId": None,
"conversationSignature": None,
"result": {"value": "Success", "message": None},
}
import httpx
self.proxy = proxy
proxy = (
proxy
or os.environ.get("all_proxy")
or os.environ.get("ALL_PROXY")
or os.environ.get("https_proxy")
or os.environ.get("HTTPS_PROXY")
or None
)
if proxy is not None and proxy.startswith("socks5h://"):
proxy = "socks5://" + proxy[len("socks5h://") :]
self.session = httpx.Client(
proxies=proxy,
timeout=30,
headers=HEADERS_INIT_CONVER,
)
for cookie in cookies:
self.session.cookies.set(cookie["name"], cookie["value"])
# Send GET request
response = self.session.get(
url=os.environ.get("BING_PROXY_URL")
or "https://edgeservices.bing.com/edgesvc/turing/conversation/create",
)
if response.status_code != 200:
response = self.session.get(
"https://edge.churchless.tech/edgesvc/turing/conversation/create",
)
if response.status_code != 200:
print(f"Status code: {response.status_code}")
print(response.text)
print(response.url)
raise Exception("Authentication failed")
try:
self.struct = response.json()
except (json.decoder.JSONDecodeError, NotAllowedToAccess) as exc:
raise Exception(
"Authentication failed. You have not been accepted into the beta.",
) from exc
if self.struct["result"]["value"] == "UnauthorizedRequest":
raise NotAllowedToAccess(self.struct["result"]["message"])
class _ChatHub:
"""
Chat API
"""
def __init__(self, conversation) -> None:
self.wss = None
self.request: _ChatHubRequest
self.loop: bool
self.task: asyncio.Task
print(conversation.struct)
self.request = _ChatHubRequest(
conversation_signature=conversation.struct["conversationSignature"],
client_id=conversation.struct["clientId"],
conversation_id=conversation.struct["conversationId"],
)
async def ask_stream(
self,
prompt: str,
wss_link: str,
conversation_style: CONVERSATION_STYLE_TYPE = None,
raw: bool = False,
options: dict = None,
) -> Generator[str, None, None]:
"""
Ask a question to the bot
"""
if self.wss and not self.wss.closed:
await self.wss.close()
# Check if websocket is closed
self.wss = await websockets.connect(
wss_link,
extra_headers=HEADERS,
max_size=None,
ssl=get_ssl_context()
)
await self._initial_handshake()
# Construct a ChatHub request
self.request.update(
prompt=prompt,
conversation_style=conversation_style,
options=options,
)
# Send request
await self.wss.send(_append_identifier(self.request.struct))
final = False
while not final:
objects = str(await self.wss.recv()).split(DELIMITER)
for obj in objects:
if obj is None or not obj:
continue
response = json.loads(obj)
if response.get("type") != 2 and raw:
yield False, response
elif response.get("type") == 1 and response["arguments"][0].get(
"messages",
):
resp_txt = response["arguments"][0]["messages"][0]["adaptiveCards"][
0
]["body"][0].get("text")
yield False, resp_txt
elif response.get("type") == 2:
final = True
yield True, response
async def _initial_handshake(self) -> None:
await self.wss.send(_append_identifier({"protocol": "json", "version": 1}))
await self.wss.recv()
async def close(self) -> None:
"""
Close the connection
"""
if self.wss and not self.wss.closed:
await self.wss.close()
class NewbingChatbot:
"""
Combines everything to make it seamless
"""
def __init__(
self,
cookies,
proxy
) -> None:
if cookies is None:
cookies = {}
self.cookies = cookies
self.proxy = proxy
self.chat_hub: _ChatHub = _ChatHub(
_Conversation(self.cookies, self.proxy),
)
async def ask(
self,
prompt: str,
wss_link: str,
conversation_style: CONVERSATION_STYLE_TYPE = None,
options: dict = None,
) -> dict:
"""
Ask a question to the bot
"""
async for final, response in self.chat_hub.ask_stream(
prompt=prompt,
conversation_style=conversation_style,
wss_link=wss_link,
options=options,
):
if final:
return response
await self.chat_hub.wss.close()
return None
async def ask_stream(
self,
prompt: str,
wss_link: str,
conversation_style: CONVERSATION_STYLE_TYPE = None,
raw: bool = False,
options: dict = None,
) -> Generator[str, None, None]:
"""
Ask a question to the bot
"""
async for response in self.chat_hub.ask_stream(
prompt=prompt,
conversation_style=conversation_style,
wss_link=wss_link,
raw=raw,
options=options,
):
yield response
async def close(self) -> None:
"""
Close the connection
"""
await self.chat_hub.close()
async def reset(self) -> None:
"""
Reset the conversation
"""
await self.close()
self.chat_hub = _ChatHub(_Conversation(self.cookies, self.proxy))

查看文件

@@ -1,4 +1,5 @@
./docs/gradio-3.32.2-py3-none-any.whl
pydantic==1.10.11
tiktoken>=0.3.3
requests[socks]
transformers

查看文件

@@ -1,6 +1,6 @@
import gradio as gr
from toolbox import get_conf
CODE_HIGHLIGHT, ADD_WAIFU = get_conf('CODE_HIGHLIGHT', 'ADD_WAIFU')
CODE_HIGHLIGHT, ADD_WAIFU, LAYOUT = get_conf('CODE_HIGHLIGHT', 'ADD_WAIFU', 'LAYOUT')
# gradio可用颜色列表
# gr.themes.utils.colors.slate (石板色)
# gr.themes.utils.colors.gray (灰色)
@@ -82,20 +82,76 @@ def adjust_theme():
button_cancel_text_color_dark="white",
)
# Layout = "LEFT-RIGHT"
js = """
<script>
function ChatBotHeight() {
function update_height(){
var { panel_height_target, chatbot_height, chatbot } = get_elements();
if (panel_height_target!=chatbot_height)
{
var pixelString = panel_height_target.toString() + 'px';
chatbot.style.maxHeight = pixelString; chatbot.style.height = pixelString;
}
}
function update_height_slow(){
var { panel_height_target, chatbot_height, chatbot } = get_elements();
if (panel_height_target!=chatbot_height)
{
new_panel_height = (panel_height_target - chatbot_height)*0.5 + chatbot_height;
if (Math.abs(new_panel_height - panel_height_target) < 10){
new_panel_height = panel_height_target;
}
// console.log(chatbot_height, panel_height_target, new_panel_height);
var pixelString = new_panel_height.toString() + 'px';
chatbot.style.maxHeight = pixelString; chatbot.style.height = pixelString;
}
}
update_height();
setInterval(function() {
update_height_slow()
}, 50); // 每100毫秒执行一次
}
function get_elements() {
var chatbot = document.querySelector('#gpt-chatbot > div.wrap.svelte-18telvq');
if (!chatbot) {
chatbot = document.querySelector('#gpt-chatbot');
}
const panel1 = document.querySelector('#input-panel');
const panel2 = document.querySelector('#basic-panel');
const panel3 = document.querySelector('#plugin-panel');
const panel4 = document.querySelector('#interact-panel');
const panel5 = document.querySelector('#input-panel2');
const panel_active = document.querySelector('#state-panel');
var panel_height_target = (20-panel_active.offsetHeight) + panel1.offsetHeight + panel2.offsetHeight + panel3.offsetHeight + panel4.offsetHeight + panel5.offsetHeight + 21;
var panel_height_target = parseInt(panel_height_target);
var chatbot_height = chatbot.style.height;
var chatbot_height = parseInt(chatbot_height);
return { panel_height_target, chatbot_height, chatbot };
}
</script>
"""
if LAYOUT=="TOP-DOWN":
js = ""
# 添加一个萌萌的看板娘
if ADD_WAIFU:
js = """
js += """
<script src="file=docs/waifu_plugin/jquery.min.js"></script>
<script src="file=docs/waifu_plugin/jquery-ui.min.js"></script>
<script src="file=docs/waifu_plugin/autoload.js"></script>
"""
gradio_original_template_fn = gr.routes.templates.TemplateResponse
def gradio_new_template_fn(*args, **kwargs):
res = gradio_original_template_fn(*args, **kwargs)
res.body = res.body.replace(b'</html>', f'{js}</html>'.encode("utf8"))
res.init_headers()
return res
gr.routes.templates.TemplateResponse = gradio_new_template_fn # override gradio template
gradio_original_template_fn = gr.routes.templates.TemplateResponse
def gradio_new_template_fn(*args, **kwargs):
res = gradio_original_template_fn(*args, **kwargs)
res.body = res.body.replace(b'</html>', f'{js}</html>'.encode("utf8"))
res.init_headers()
return res
gr.routes.templates.TemplateResponse = gradio_new_template_fn # override gradio template
except:
set_theme = None
print('gradio版本较旧, 不能自定义字体和颜色')

查看文件

@@ -1,11 +1,13 @@
import markdown
import importlib
import traceback
import time
import inspect
import re
import os
import gradio
from latex2mathml.converter import convert as tex2mathml
from functools import wraps, lru_cache
pj = os.path.join
"""
========================================================================
@@ -39,7 +41,7 @@ def ArgsGeneralWrapper(f):
"""
装饰器函数,用于重组输入参数,改变输入参数的顺序与结构。
"""
def decorated(cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
def decorated(request: gradio.Request, cookies, max_length, llm_model, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg, *args):
txt_passon = txt
if txt == "" and txt2 != "": txt_passon = txt2
# 引入一个有cookie的chatbot
@@ -53,13 +55,21 @@ def ArgsGeneralWrapper(f):
'top_p':top_p,
'max_length': max_length,
'temperature':temperature,
'client_ip': request.client.host,
}
plugin_kwargs = {
"advanced_arg": plugin_advanced_arg,
}
chatbot_with_cookie = ChatBotWithCookies(cookies)
chatbot_with_cookie.write_list(chatbot)
yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
if cookies.get('lock_plugin', None) is None:
# 正常状态
yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
else:
# 处理个别特殊插件的锁定状态
module, fn_name = cookies['lock_plugin'].split('->')
f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
return decorated
@@ -67,8 +77,32 @@ def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
"""
刷新用户界面
"""
assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时可用clear将其清空然后用for+append循环重新赋值。"
yield chatbot.get_cookies(), chatbot, history, msg
assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
cookies = chatbot.get_cookies()
# 解决插件锁定时的界面显示问题
if cookies.get('lock_plugin', None):
label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
chatbot_gr = gradio.update(value=chatbot, label=label)
if cookies.get('label', "") != label: cookies['label'] = label # 记住当前的label
elif cookies.get('label', None):
chatbot_gr = gradio.update(value=chatbot, label=cookies.get('llm_model', ""))
cookies['label'] = None # 清空label
else:
chatbot_gr = chatbot
yield cookies, chatbot_gr, history, msg
def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # 刷新界面
"""
刷新用户界面
"""
if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
chatbot[-1] = list(chatbot[-1])
chatbot[-1][-1] = lastmsg
yield from update_ui(chatbot=chatbot, history=history)
time.sleep(delay)
def trimmed_format_exc():
import os, traceback
@@ -83,7 +117,7 @@ def CatchException(f):
"""
@wraps(f)
def decorated(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT):
def decorated(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT=-1):
try:
yield from f(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT)
except Exception as e:
@@ -210,16 +244,21 @@ def text_divide_paragraph(text):
"""
将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
"""
pre = '<div class="markdown-body">'
suf = '</div>'
if text.startswith(pre) and text.endswith(suf):
return text
if '```' in text:
# careful input
return text
return pre + text + suf
else:
# wtf input
lines = text.split("\n")
for i, line in enumerate(lines):
lines[i] = lines[i].replace(" ", "&nbsp;")
text = "</br>".join(lines)
return text
return pre + text + suf
@lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
def markdown_convertion(txt):
@@ -331,8 +370,11 @@ def format_io(self, y):
if y is None or y == []:
return []
i_ask, gpt_reply = y[-1]
i_ask = text_divide_paragraph(i_ask) # 输入部分太自由,预处理一波
gpt_reply = close_up_code_segment_during_stream(gpt_reply) # 当代码输出半截的时候,试着补上后个```
# 输入部分太自由,预处理一波
if i_ask is not None: i_ask = text_divide_paragraph(i_ask)
# 当代码输出半截的时候,试着补上后个```
if gpt_reply is not None: gpt_reply = close_up_code_segment_during_stream(gpt_reply)
# process
y[-1] = (
None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
None if gpt_reply is None else markdown_convertion(gpt_reply)
@@ -380,7 +422,7 @@ def extract_archive(file_path, dest_dir):
print("Successfully extracted rar archive to {}".format(dest_dir))
except:
print("Rar format requires additional dependencies to install")
return '\n\n需要安装pip install rarfile来解压rar文件'
return '\n\n解压失败! 需要安装pip install rarfile来解压rar文件'
# 第三方库,需要预先pip install py7zr
elif file_extension == '.7z':
@@ -391,7 +433,7 @@ def extract_archive(file_path, dest_dir):
print("Successfully extracted 7z archive to {}".format(dest_dir))
except:
print("7z format requires additional dependencies to install")
return '\n\n需要安装pip install py7zr来解压7z文件'
return '\n\n解压失败! 需要安装pip install py7zr来解压7z文件'
else:
return ''
return ''
@@ -420,6 +462,17 @@ def find_recent_files(directory):
return recent_files
def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
# 将文件复制一份到下载区
import shutil
if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
new_path = os.path.join(f'./gpt_log/', rename_file)
if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
if not os.path.exists(new_path): shutil.copyfile(file, new_path)
if chatbot:
if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
else: current = []
chatbot._cookies.update({'file_to_promote': [new_path] + current})
def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
"""
@@ -459,25 +512,39 @@ def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
return chatbot, txt, txt2
def on_report_generated(files, chatbot):
def on_report_generated(cookies, files, chatbot):
from toolbox import find_recent_files
report_files = find_recent_files('gpt_log')
if 'file_to_promote' in cookies:
report_files = cookies['file_to_promote']
cookies.pop('file_to_promote')
else:
report_files = find_recent_files('gpt_log')
if len(report_files) == 0:
return None, chatbot
return cookies, None, chatbot
# files.extend(report_files)
chatbot.append(['报告如何远程获取?', '报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。'])
return report_files, chatbot
file_links = ''
for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
return cookies, report_files, chatbot
def load_chat_cookies():
API_KEY, LLM_MODEL, AZURE_API_KEY = get_conf('API_KEY', 'LLM_MODEL', 'AZURE_API_KEY')
if is_any_api_key(AZURE_API_KEY):
if is_any_api_key(API_KEY): API_KEY = API_KEY + ',' + AZURE_API_KEY
else: API_KEY = AZURE_API_KEY
return {'api_key': API_KEY, 'llm_model': LLM_MODEL}
def is_openai_api_key(key):
API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
return bool(API_MATCH_ORIGINAL)
def is_azure_api_key(key):
API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{32}$", key)
return bool(API_MATCH_ORIGINAL) or bool(API_MATCH_AZURE)
return bool(API_MATCH_AZURE)
def is_api2d_key(key):
if key.startswith('fk') and len(key) == 41:
return True
else:
return False
API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
return bool(API_MATCH_API2D)
def is_any_api_key(key):
if ',' in key:
@@ -486,10 +553,10 @@ def is_any_api_key(key):
if is_any_api_key(k): return True
return False
else:
return is_openai_api_key(key) or is_api2d_key(key)
return is_openai_api_key(key) or is_api2d_key(key) or is_azure_api_key(key)
def what_keys(keys):
avail_key_list = {'OpenAI Key':0, "API2D Key":0}
avail_key_list = {'OpenAI Key':0, "Azure Key":0, "API2D Key":0}
key_list = keys.split(',')
for k in key_list:
@@ -500,7 +567,11 @@ def what_keys(keys):
if is_api2d_key(k):
avail_key_list['API2D Key'] += 1
return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个,API2D Key {avail_key_list['API2D Key']}"
for k in key_list:
if is_azure_api_key(k):
avail_key_list['Azure Key'] += 1
return f"检测到: OpenAI Key {avail_key_list['OpenAI Key']} 个, Azure Key {avail_key_list['Azure Key']} 个, API2D Key {avail_key_list['API2D Key']}"
def select_api_key(keys, llm_model):
import random
@@ -515,8 +586,12 @@ def select_api_key(keys, llm_model):
for k in key_list:
if is_api2d_key(k): avail_key_list.append(k)
if llm_model.startswith('azure-'):
for k in key_list:
if is_azure_api_key(k): avail_key_list.append(k)
if len(avail_key_list) == 0:
raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源")
raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源右下角更换模型菜单中可切换openai,azure和api2d请求源")
api_key = random.choice(avail_key_list) # 随机负载均衡
return api_key
@@ -728,6 +803,8 @@ def clip_history(inputs, history, tokenizer, max_token_limit):
其他小工具:
- zip_folder: 把某个路径下所有文件压缩,然后转移到指定的另一个路径中gpt写的
- gen_time_str: 生成时间戳
- ProxyNetworkActivate: 临时地启动代理网络(如果有)
- objdump/objload: 快捷的调试函数
========================================================================
"""
@@ -762,11 +839,16 @@ def zip_folder(source_folder, dest_folder, zip_name):
print(f"Zip file created at {zip_file}")
def zip_result(folder):
import time
t = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
zip_folder(folder, './gpt_log/', f'{t}-result.zip')
return pj('./gpt_log/', f'{t}-result.zip')
def gen_time_str():
import time
return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
class ProxyNetworkActivate():
"""
这段代码定义了一个名为TempProxy的空上下文管理器, 用于给一小段代码上代理
@@ -775,8 +857,9 @@ class ProxyNetworkActivate():
from toolbox import get_conf
proxies, = get_conf('proxies')
if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
os.environ['HTTP_PROXY'] = proxies['http']
os.environ['HTTPS_PROXY'] = proxies['https']
if proxies is not None:
if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
return self
def __exit__(self, exc_type, exc_value, traceback):
@@ -784,3 +867,17 @@ class ProxyNetworkActivate():
if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
return
def objdump(obj, file='objdump.tmp'):
import pickle
with open(file, 'wb+') as f:
pickle.dump(obj, f)
return
def objload(file='objdump.tmp'):
import pickle, os
if not os.path.exists(file):
return
with open(file, 'rb') as f:
return pickle.load(f)

查看文件

@@ -1,5 +1,5 @@
{
"version": 3.37,
"version": 3.44,
"show_feature": true,
"new_feature": "修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持 <-> 提供复旦MOSS模型适配启用需额外依赖 <-> 提供docker-compose方案兼容LLAMA盘古RWKV等模型的后端 <-> 新增Live2D装饰 <-> 完善对话历史的保存/载入/删除 <-> 保存对话功能"
"new_feature": "改善UI <-> 修复Azure接口的BUG <-> 完善多语言模块 <-> 完善本地Latex矫错和翻译功能 <-> 增加gpt-3.5-16k的支持 <-> 新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件"
}