Comparing commits


60 commits

Author  SHA1  Message  Commit date
binary-husky  0844b6e9cf  Support proxy access to the GROBID service  2023-09-27 15:40:55 +08:00
binary-husky  9cb05e5724  Adjust layout  2023-09-27 15:20:28 +08:00
binary-husky  80b209fa0c  Merge branch 'frontier'  2023-09-27 15:19:07 +08:00
binary-husky  8d4cb05738  Shortcut for the Matlab project parsing plugin  2023-09-26 10:16:38 +08:00
binary-husky  31f4069563  Improve the polishing and proofreading prompts  2023-09-25 17:46:28 +08:00
binary-husky  8ba6fc062e  Merge branch 'frontier' of github.com:binary-husky/chatgpt_academic into frontier  2023-09-23 23:59:30 +08:00
binary-husky  c0c2d14e3d  better scrollbar  2023-09-23 23:58:32 +08:00
binary-husky  f0a5c49a9c  Merge branch 'frontier' of github.com:binary-husky/chatgpt_academic into frontier  2023-09-23 23:47:42 +08:00
binary-husky  9333570ab7  Reduce the minimum size of basic buttons such as Reset  2023-09-23 23:47:25 +08:00
binary-husky  d6eaaad962  Stop gradio from displaying the misleading share=True  2023-09-23 23:23:23 +08:00
binary-husky  e24f077b68  Explicitly add the azure-gpt-4 option  2023-09-23 23:06:58 +08:00
binary-husky  dc5bb9741a  Version bump  2023-09-23 22:45:07 +08:00
binary-husky  b383b45191  version 3.54 beta  2023-09-23 22:44:18 +08:00
binary-husky  2d8f37baba  Split proxy usage into finer-grained scenarios  2023-09-23 22:43:15 +08:00
binary-husky  409927ef8e  Unify the transformers version  2023-09-23 22:26:28 +08:00
binary-husky  5b231e0170  Add a copy-all button  2023-09-23 22:11:29 +08:00
binary-husky  87f629bb37  Add gpt-4-32k  2023-09-23 20:24:13 +08:00
binary-husky  3672c97a06  Dynamic code interpreter  2023-09-23 01:51:05 +08:00
binary-husky  b6ee3e9807  Merge pull request #1121 from binary-husky/frontier (add a disable-cache option to the arxiv translation plugin)  2023-09-21 09:33:19 +08:00
binary-husky  d56bc280e9  Add a disable-cache option  2023-09-20 22:04:15 +08:00
qingxu fu  d5fd00c15d  Tweak the Dockerfile  2023-09-20 10:02:10 +08:00
binary-husky  5e647ff149  Merge branch 'master' into frontier  2023-09-19 17:21:02 +08:00
binary-husky  868faf00cc  Fix docker compose  2023-09-19 17:10:57 +08:00
binary-husky  a0286c39b9  Update README  2023-09-19 17:08:20 +08:00
binary-husky  9cced321f1  Revise README  2023-09-19 16:55:39 +08:00
binary-husky  3073935e24  Update readme; push version 3.53  2023-09-19 16:49:33 +08:00
binary-husky  ef6631b280  Change TOKEN_LIMIT_PER_FRAGMENT to 1024  2023-09-19 16:31:36 +08:00
binary-husky  0801e4d881  Merge pull request #1111 from kaixindelele/only_chinese_pdf (improve the results of the PDF translation plugin)  2023-09-19 15:56:04 +08:00
qingxu fu  ae08cfbcae  Fix a minor bug  2023-09-19 15:55:27 +08:00
qingxu fu  1c0d5361ea  Adjust the minimum height of the status bar  2023-09-19 15:52:42 +08:00
qingxu fu  278464bfb7  Merge duplicate functions  2023-09-18 23:03:23 +08:00
qingxu fu  2a6996f5d0  Fix Azure ENDPOINT format compatibility  2023-09-18 21:19:02 +08:00
qingxu fu  84b11016c6  Also output the mmd file when nougat processing finishes  2023-09-18 15:21:30 +08:00
qingxu fu  7e74d3d699  Adjust button positions  2023-09-18 15:19:21 +08:00
qingxu fu  2cad8e2694  Support dynamic theme switching  2023-09-17 00:15:28 +08:00
qingxu fu  e765ec1223  dynamic theme  2023-09-17 00:02:49 +08:00
kaixindelele  471a369bb8  Paper translation outputs Chinese only  2023-09-16 22:09:44 +08:00
binary-husky  760ff1840c  Fix a loop bug  2023-09-15 17:08:23 +08:00
binary-husky  9905122fc2  Fix a Tex file matching bug  2023-09-15 12:55:41 +08:00
binary-husky  abea0d07ac  Fix a logging bug  2023-09-15 11:00:30 +08:00
binary-husky  16ff5ddcdc  Version 3.52  2023-09-14 23:07:12 +08:00
binary-husky  1c4cb340ca  Fix a prompt bug for leftover documents  2023-09-14 22:45:45 +08:00
binary-husky  5ba8ea27d1  Replace print with logging  2023-09-14 22:33:07 +08:00
binary-husky  567c6530d8  Add NOUGAT status messages and error-handling hints  2023-09-14 21:38:47 +08:00
binary-husky  a3f36668a8  Fix incorrect detection of the main latex file  2023-09-14 17:51:41 +08:00
binary-husky  a1cc2f733c  Fix a nougat thread-lock release bug  2023-09-14 15:26:03 +08:00
binary-husky  0937f37388  Fix Predict button parameters  2023-09-14 11:02:40 +08:00
binary-husky  74f35e3401  Add a hint for cases where the void terminal outputs no file  2023-09-14 01:51:55 +08:00
binary-husky  ab7999c71a  Correct the source-code scope of this project  2023-09-14 01:00:38 +08:00
binary-husky  544771db9a  Hide absolute paths in conversation history  2023-09-14 00:53:15 +08:00
binary-husky  ec9d030457  Make the upload path and log path unified, configurable variables  2023-09-14 00:51:25 +08:00
binary-husky  14de282302  Add a thread lock to nougat; merge redundant code  2023-09-13 23:21:00 +08:00
binary-husky  fb5467b85b  Update plugin system prompts  2023-09-12 19:13:36 +08:00
binary-husky  c4c6465927  Resolve issue #1097  2023-09-12 18:57:50 +08:00
qingxu fu  99a1cd6f9f  Add the pypinyin dependency  2023-09-12 12:20:05 +08:00
qingxu fu  7e73a255f4  Revise the knowledge-base plugin's prompt messages  2023-09-12 11:47:34 +08:00
qingxu fu  4b5f13bff2  Fix knowledge-base dependency issues  2023-09-12 11:35:31 +08:00
qingxu fu  d495b73456  Support more UI themes; add a dark/light toggle  2023-09-11 22:55:32 +08:00
qingxu fu  e699b6b13f  Merge branch 'master' of https://github.com/binary-husky/chatgpt_academic into master  2023-09-11 14:49:37 +08:00
qingxu fu  eb150987f0  Handle a third-party one-api bug where no "done" packet is sent  2023-09-11 14:49:30 +08:00
58 files changed, 1356 insertions and 821 deletions

View file

@@ -101,9 +101,11 @@ cd gpt_academic
2. 配置API_KEY
在`config.py`中,配置API KEY等设置,[点击查看特殊网络环境设置方法](https://github.com/binary-husky/gpt_academic/issues/1) 。
在`config.py`中,配置API KEY等设置,[点击查看特殊网络环境设置方法](https://github.com/binary-husky/gpt_academic/issues/1) 。[Wiki页面](https://github.com/binary-husky/gpt_academic/wiki/%E9%A1%B9%E7%9B%AE%E9%85%8D%E7%BD%AE%E8%AF%B4%E6%98%8E)。
(P.S. 程序运行时会优先检查是否存在名为`config_private.py`的私密配置文件,并用其中的配置覆盖`config.py`的同名配置。因此,如果您能理解我们的配置读取逻辑,我们强烈建议您在`config.py`旁边创建一个名为`config_private.py`的新配置文件,并把`config.py`中的配置转移(复制)到`config_private.py`中(仅复制您修改过的配置条目即可)。`config_private.py`不受git管控,可以让您的隐私信息更加安全。P.S.项目同样支持通过`环境变量`配置大多数选项,环境变量的书写格式参考`docker-compose`文件。读取优先级: `环境变量` > `config_private.py` > `config.py`)
「 程序会优先检查是否存在名为`config_private.py`的私密配置文件,并用其中的配置覆盖`config.py`的同名配置。如您能理解该读取逻辑,我们强烈建议您在`config.py`旁边创建一个名为`config_private.py`的新配置文件,并把`config.py`中的配置转移(复制)到`config_private.py`中(仅复制您修改过的配置条目即可)。
「 支持通过`环境变量`配置项目,环境变量的书写格式参考`docker-compose.yml`文件或者我们的[Wiki页面](https://github.com/binary-husky/gpt_academic/wiki/%E9%A1%B9%E7%9B%AE%E9%85%8D%E7%BD%AE%E8%AF%B4%E6%98%8E)。配置读取优先级: `环境变量` > `config_private.py` > `config.py`。 」
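A minimal sketch of that lookup order (environment variable > `config_private.py` > `config.py`) is shown below. This is an illustration only, assuming nothing beyond the documented priority; the project's actual accessor is `get_conf` in `toolbox.py`:

```python
import importlib
import os

def read_config_entry(name: str):
    """Illustrative lookup: environment variable > config_private.py > config.py."""
    if name in os.environ:                      # 1. environment variable wins
        return os.environ[name]
    try:
        config_private = importlib.import_module("config_private")
        if hasattr(config_private, name):       # 2. private, git-ignored overrides
            return getattr(config_private, name)
    except ModuleNotFoundError:
        pass                                    # config_private.py is optional
    config = importlib.import_module("config")  # 3. tracked defaults
    return getattr(config, name)
```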
3. 安装依赖
@@ -111,7 +113,7 @@ cd gpt_academic
# (选择I: 如熟悉python)(python版本3.9以上,越新越好),备注:使用官方pip源或者阿里pip源,临时换源方法:python -m pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
python -m pip install -r requirements.txt
# (选择II: 如不熟悉python)使用anaconda,步骤也是类似的 (https://www.bilibili.com/video/BV1rc411W7Dr)
# (选择II: 使用Anaconda)步骤也是类似的 (https://www.bilibili.com/video/BV1rc411W7Dr)
conda create -n gptac_venv python=3.11 # 创建anaconda环境
conda activate gptac_venv # 激活anaconda环境
python -m pip install -r requirements.txt # 这个步骤和pip安装一样的步骤
@@ -149,26 +151,25 @@ python main.py
### 安装方法II:使用Docker
0. 部署项目的全部能力(这个是包含cuda和latex的大型镜像。如果您网速慢、硬盘小或没有显卡,则不推荐使用这个,建议使用方案1)(需要熟悉[Nvidia Docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian)运行时)
[![fullcapacity](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-all-capacity.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-audio-assistant.yml)
1. 仅ChatGPT(推荐大多数人选择,等价于docker-compose方案1)
``` sh
# 修改docker-compose.yml,保留方案0并删除其他方案。修改docker-compose.yml中方案0的配置,参考其中注释即可
docker-compose up
```
1. 仅ChatGPT+文心一言+spark等在线模型(推荐大多数人选择)
[![basic](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml)
[![basiclatex](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml)
[![basicaudio](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-audio-assistant.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-audio-assistant.yml)
``` sh
git clone --depth=1 https://github.com/binary-husky/gpt_academic.git # 下载项目
cd gpt_academic # 进入路径
nano config.py # 用任意文本编辑器编辑config.py, 配置 “Proxy”, “API_KEY” 以及 “WEB_PORT” (例如50923) 等
docker build -t gpt-academic . # 安装
#(最后一步-Linux操作系统)用`--net=host`更方便快捷
docker run --rm -it --net=host gpt-academic
#(最后一步-MacOS/Windows操作系统)只能用-p选项将容器上的端口(例如50923)暴露给主机上的端口
docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
# 修改docker-compose.yml,保留方案1并删除其他方案。修改docker-compose.yml中方案1的配置,参考其中注释即可
docker-compose up
```
P.S. 如果需要依赖Latex的插件功能,请见Wiki。另外,您也可以直接使用docker-compose获取Latex功能(修改docker-compose.yml,保留方案4并删除其他方案)。
P.S. 如果需要依赖Latex的插件功能,请见Wiki。另外,您也可以直接使用方案4或者方案0获取Latex功能。
2. ChatGPT + ChatGLM2 + MOSS + LLAMA2 + 通义千问(需要熟悉[Nvidia Docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian)运行时)
[![chatglm](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml)
@@ -309,6 +310,7 @@ Tip不指定文件直接点击 `载入对话历史存档` 可以查看历史h
### II:版本:
- version 3.60(todo): 优化虚空终端,引入code interpreter和更多插件
- version 3.53: 支持动态选择不同界面主题,提高稳定性&解决多用户冲突问题
- version 3.50: 使用自然语言调用本项目的所有函数插件虚空终端,支持插件分类,改进UI,设计新主题
- version 3.49: 支持百度千帆平台和文心一言
- version 3.48: 支持阿里达摩院通义千问,上海AI-Lab书生,讯飞星火

View file

@@ -43,9 +43,10 @@ API_URL_REDIRECT = {}
DEFAULT_WORKER_NUM = 3
# 色彩主题可选 ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast"]
# 色彩主题, 可选 ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast"]
# 更多主题, 请查阅Gradio主题商店: https://huggingface.co/spaces/gradio/theme-gallery 可选 ["Gstaff/Xkcd", "NoCrypt/Miku", ...]
THEME = "Default"
AVAIL_THEMES = ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast", "Gstaff/Xkcd", "NoCrypt/Miku"]
# 对话窗的高度 (仅在LAYOUT="TOP-DOWN"时生效)
CHATBOT_HEIGHT = 1115
@@ -73,13 +74,13 @@ MAX_RETRY = 2
# 插件分类默认选项
DEFAULT_FN_GROUPS = ['对话', '编程', '学术']
DEFAULT_FN_GROUPS = ['对话', '编程', '学术', '智能体']
# 模型选择是 (注意: LLM_MODEL是默认选中的模型, 它*必须*被包含在AVAIL_LLM_MODELS列表中 )
LLM_MODEL = "gpt-3.5-turbo" # 可选 ↓↓↓
AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo",
"gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
"gpt-4", "gpt-4-32k", "azure-gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
# P.S. 其他可用的模型还包括 ["qianfan", "llama2", "qwen", "gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613",
# "spark", "sparkv2", "chatglm_onnx", "claude-1-100k", "claude-2", "internlm", "jittorllms_pangualpha", "jittorllms_llama"]
@@ -178,6 +179,12 @@ GROBID_URLS = [
# 是否允许通过自然语言描述修改本页的配置,该功能具有一定的危险性,默认关闭
ALLOW_RESET_CONFIG = False
# 临时的上传文件夹位置,请勿修改
PATH_PRIVATE_UPLOAD = "private_upload"
# 日志文件夹的位置,请勿修改
PATH_LOGGING = "gpt_log"
# 除了连接OpenAI之外,还有哪些场合允许使用代理,请勿修改
WHEN_TO_USE_PROXY = ["Download_LLM", "Download_Gradio_Theme", "Connect_Grobid"]
"""

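The new `WHEN_TO_USE_PROXY` whitelist pairs with the `ProxyNetworkActivate('Download_LLM')` and `ProxyNetworkActivate('Connect_Grobid')` call sites later in this diff. Below is a minimal sketch of how such a scenario-gated switch can work; the real context manager is `ProxyNetworkActivate` in `toolbox.py`, and the proxy address used here is a placeholder assumption:

```python
import os
from contextlib import contextmanager

WHEN_TO_USE_PROXY = ["Download_LLM", "Download_Gradio_Theme", "Connect_Grobid"]

@contextmanager
def proxy_network_activate(task, proxy_url="http://localhost:7890"):
    # proxy_url is a placeholder; the project reads the real address from its
    # `proxies` configuration entry.
    enabled = task in WHEN_TO_USE_PROXY
    saved = {k: os.environ.get(k) for k in ("HTTP_PROXY", "HTTPS_PROXY")}
    try:
        if enabled:
            os.environ["HTTP_PROXY"] = os.environ["HTTPS_PROXY"] = proxy_url
        yield enabled
    finally:
        # always restore the previous environment, proxied or not
        for k, v in saved.items():
            if v is None:
                os.environ.pop(k, None)
            else:
                os.environ[k] = v

# Usage mirroring the plugin code in this diff:
# with proxy_network_activate('Connect_Grobid'):
#     requests.get(grobid_url + '/api/isalive')
```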
View file

@@ -11,7 +11,8 @@ def get_core_functions():
# 前缀,会被加在你的输入之前。例如,用来描述你的要求,例如翻译、解释代码、润色等等
"Prefix": r"Below is a paragraph from an academic paper. Polish the writing to meet the academic style, " +
r"improve the spelling, grammar, clarity, concision and overall readability. When necessary, rewrite the whole sentence. " +
r"Furthermore, list all modification and explain the reasons to do so in markdown table." + "\n\n",
r"Firstly, you should provide the polished paragraph. "
r"Secondly, you should list all your modification and explain the reasons to do so in markdown table." + "\n\n",
# 后缀,会被加在你的输入之后。例如,配合前缀可以把你的输入内容用引号圈起来
"Suffix": r"",
# 按钮颜色 (默认 secondary)
@@ -27,17 +28,18 @@ def get_core_functions():
"Suffix": r"",
},
"查找语法错误": {
"Prefix": r"Can you help me ensure that the grammar and the spelling is correct? " +
r"Do not try to polish the text, if no mistake is found, tell me that this paragraph is good." +
r"If you find grammar or spelling mistakes, please list mistakes you find in a two-column markdown table, " +
r"put the original text the first column, " +
r"put the corrected text in the second column and highlight the key words you fixed.""\n"
"Prefix": r"Help me ensure that the grammar and the spelling is correct. "
r"Do not try to polish the text, if no mistake is found, tell me that this paragraph is good. "
r"If you find grammar or spelling mistakes, please list mistakes you find in a two-column markdown table, "
r"put the original text the first column, "
r"put the corrected text in the second column and highlight the key words you fixed. "
r"Finally, please provide the proofreaded text.""\n\n"
r"Example:""\n"
r"Paragraph: How is you? Do you knows what is it?""\n"
r"| Original sentence | Corrected sentence |""\n"
r"| :--- | :--- |""\n"
r"| How **is** you? | How **are** you? |""\n"
r"| Do you **knows** what **is** **it**? | Do you **know** what **it** **is** ? |""\n"
r"| Do you **knows** what **is** **it**? | Do you **know** what **it** **is** ? |""\n\n"
r"Below is a paragraph from an academic paper. "
r"You need to report all grammar and spelling mistakes as the example before."
+ "\n\n",
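Each core-function entry above is a prompt template: the `Prefix` is prepended to the user's input and the `Suffix` is appended before the text is sent to the model. A hypothetical helper showing that composition (the actual wiring lives elsewhere in the project):

```python
def apply_core_function(entry: dict, user_input: str) -> str:
    # Hypothetical illustration: compose the final prompt from a core-function
    # entry such as the polishing or proofreading templates above.
    return entry.get("Prefix", "") + user_input + entry.get("Suffix", "")
```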

View file

@@ -6,6 +6,7 @@ def get_crazy_functions():
from crazy_functions.生成函数注释 import 批量生成函数注释
from crazy_functions.解析项目源代码 import 解析项目本身
from crazy_functions.解析项目源代码 import 解析一个Python项目
from crazy_functions.解析项目源代码 import 解析一个Matlab项目
from crazy_functions.解析项目源代码 import 解析一个C项目的头文件
from crazy_functions.解析项目源代码 import 解析一个C项目
from crazy_functions.解析项目源代码 import 解析一个Golang项目
@@ -13,7 +14,6 @@ def get_crazy_functions():
from crazy_functions.解析项目源代码 import 解析一个Java项目
from crazy_functions.解析项目源代码 import 解析一个前端项目
from crazy_functions.高级功能函数模板 import 高阶功能模板函数
from crazy_functions.代码重写为全英文_多线程 import 全项目切换英文
from crazy_functions.Latex全文润色 import Latex英文润色
from crazy_functions.询问多个大语言模型 import 同时问询
from crazy_functions.解析项目源代码 import 解析一个Lua项目
@@ -39,7 +39,7 @@ def get_crazy_functions():
function_plugins = {
"虚空终端": {
"Group": "对话|编程|学术",
"Group": "对话|编程|学术|智能体",
"Color": "stop",
"AsButton": True,
"Function": HotReload(虚空终端)
@@ -78,6 +78,13 @@ def get_crazy_functions():
"Info": "批量总结word文档 | 输入参数为路径",
"Function": HotReload(总结word文档)
},
"解析整个Matlab项目": {
"Group": "编程",
"Color": "stop",
"AsButton": False,
"Info": "解析一个Matlab项目的所有源文件(.m) | 输入参数为路径",
"Function": HotReload(解析一个Matlab项目)
},
"解析整个C++项目头文件": {
"Group": "编程",
"Color": "stop",
@@ -400,12 +407,12 @@ def get_crazy_functions():
try:
from crazy_functions.Langchain知识库 import 知识库问答
function_plugins.update({
"构建知识库(先上传文件素材)": {
"构建知识库(先上传文件素材,再运行此插件": {
"Group": "对话",
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder": "待注入的知识库名称id, 默认为default",
"ArgsReminder": "此处待注入的知识库名称id, 默认为default。文件进入知识库后可长期保存。可以通过再次调用本插件的方式,向知识库追加更多文档。",
"Function": HotReload(知识库问答)
}
})
@@ -415,12 +422,12 @@ def get_crazy_functions():
try:
from crazy_functions.Langchain知识库 import 读取知识库作答
function_plugins.update({
"知识库问答": {
"知识库问答(构建知识库后,再运行此插件)": {
"Group": "对话",
"Color": "stop",
"AsButton": False,
"AdvancedArgs": True,
"ArgsReminder": "待提取的知识库名称id, 默认为default, 您需要首先调用构建知识库",
"ArgsReminder": "待提取的知识库名称id, 默认为default, 您需要构建知识库后再运行此插件。",
"Function": HotReload(读取知识库作答)
}
})
@@ -514,6 +521,18 @@ def get_crazy_functions():
except:
print('Load function plugin failed')
try:
from crazy_functions.函数动态生成 import 函数动态生成
function_plugins.update({
"动态代码解释器CodeInterpreter": {
"Group": "智能体",
"Color": "stop",
"AsButton": False,
"Function": HotReload(函数动态生成)
}
})
except:
print('Load function plugin failed')
# try:
# from crazy_functions.CodeInterpreter import 虚空终端CodeInterpreter

View file

@@ -1,6 +1,7 @@
from collections.abc import Callable, Iterable, Mapping
from typing import Any
from toolbox import CatchException, update_ui, gen_time_str, trimmed_format_exc, promote_file_to_downloadzone, clear_file_downloadzone
from toolbox import CatchException, update_ui, gen_time_str, trimmed_format_exc
from toolbox import promote_file_to_downloadzone, get_log_folder
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from .crazy_utils import input_clipping, try_install_deps
from multiprocessing import Process, Pipe
@@ -92,7 +93,7 @@ def gpt_interact_multi_step(txt, file_type, llm_kwargs, chatbot, history):
def make_module(code):
module_file = 'gpt_fn_' + gen_time_str().replace('-','_')
with open(f'gpt_log/{module_file}.py', 'w', encoding='utf8') as f:
with open(f'{get_log_folder()}/{module_file}.py', 'w', encoding='utf8') as f:
f.write(code)
def get_class_name(class_string):
@@ -102,7 +103,7 @@ def make_module(code):
return class_name
class_name = get_class_name(code)
return f"gpt_log.{module_file}->{class_name}"
return f"{get_log_folder().replace('/', '.')}.{module_file}->{class_name}"
def init_module_instance(module):
import importlib
@@ -171,7 +172,7 @@ def 虚空终端CodeInterpreter(txt, llm_kwargs, plugin_kwargs, chatbot, history
file_type = file_path.split('.')[-1]
# 粗心检查
if 'private_upload' in txt:
if is_the_upload_folder(txt):
chatbot.append([
"...",
f"请在输入框内填写需求,然后再次点击该插件(文件路径 {file_path} 已经被记忆)"

View file

@@ -1,4 +1,4 @@
from toolbox import CatchException, update_ui, ProxyNetworkActivate
from toolbox import CatchException, update_ui, ProxyNetworkActivate, update_ui_lastest_msg
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive, get_files_from_everything
@@ -15,7 +15,12 @@ def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
web_port 当前软件运行的端口号
"""
history = [] # 清空历史,以免输入溢出
chatbot.append(("这是什么功能?", "[Local Message] 从一批文件(txt, md, tex)中读取数据构建知识库, 然后进行问答。"))
# < --------------------读取参数--------------- >
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
kai_id = plugin_kwargs.get("advanced_arg", 'default')
chatbot.append((f"向`{kai_id}`知识库中添加文件。", "[Local Message] 从一批文件(txt, md, tex)中读取数据构建知识库, 然后进行问答。"))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# resolve deps
@@ -24,17 +29,12 @@ def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from .crazy_utils import knowledge_archive_interface
except Exception as e:
chatbot.append(
["依赖不足",
"导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."]
)
chatbot.append(["依赖不足", "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
from .crazy_utils import try_install_deps
try_install_deps(['zh_langchain==0.2.1', 'pypinyin'])
# < --------------------读取参数--------------- >
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
kai_id = plugin_kwargs.get("advanced_arg", 'default')
try_install_deps(['zh_langchain==0.2.1', 'pypinyin'], reload_m=['pypinyin', 'zh_langchain'])
yield from update_ui_lastest_msg("安装完成,您可以再次重试。", chatbot, history)
return
# < --------------------读取文件--------------- >
file_manifest = []
@@ -53,14 +53,14 @@ def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
print('Checking Text2vec ...')
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
with ProxyNetworkActivate(): # 临时地激活代理网络
with ProxyNetworkActivate('Download_LLM'): # 临时地激活代理网络
HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
# < -------------------构建知识库--------------- >
chatbot.append(['<br/>'.join(file_manifest), "正在构建知识库..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
print('Establishing knowledge archive ...')
with ProxyNetworkActivate(): # 临时地激活代理网络
with ProxyNetworkActivate('Download_LLM'): # 临时地激活代理网络
kai = knowledge_archive_interface()
kai.feed_archive(file_manifest=file_manifest, id=kai_id)
kai_files = kai.get_loaded_file()
@@ -84,19 +84,18 @@ def 读取知识库作答(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
chatbot.append(["依赖不足", "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
from .crazy_utils import try_install_deps
try_install_deps(['zh_langchain==0.2.1'])
try_install_deps(['zh_langchain==0.2.1', 'pypinyin'], reload_m=['pypinyin', 'zh_langchain'])
yield from update_ui_lastest_msg("安装完成,您可以再次重试。", chatbot, history)
return
# < ------------------- --------------- >
kai = knowledge_archive_interface()
if 'langchain_plugin_embedding' in chatbot._cookies:
resp, prompt = kai.answer_with_archive_by_id(txt, chatbot._cookies['langchain_plugin_embedding'])
else:
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
kai_id = plugin_kwargs.get("advanced_arg", 'default')
resp, prompt = kai.answer_with_archive_by_id(txt, kai_id)
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
kai_id = plugin_kwargs.get("advanced_arg", 'default')
resp, prompt = kai.answer_with_archive_by_id(txt, kai_id)
chatbot.append((txt, '[Local Message] ' + prompt))
chatbot.append((txt, f'[知识库 {kai_id}] ' + prompt))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间,我们先及时地做一次界面更新
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=prompt, inputs_show_user=txt,

View file

@@ -1,5 +1,5 @@
from toolbox import update_ui, trimmed_format_exc
from toolbox import CatchException, report_execption, write_results_to_file, zip_folder
from toolbox import update_ui, trimmed_format_exc, promote_file_to_downloadzone, get_log_folder
from toolbox import CatchException, report_execption, write_history_to_file, zip_folder
class PaperFileGroup():
@@ -51,7 +51,7 @@ class PaperFileGroup():
import os, time
folder = os.path.dirname(self.file_paths[0])
t = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
zip_folder(folder, './gpt_log/', f'{t}-polished.zip')
zip_folder(folder, get_log_folder(), f'{t}-polished.zip')
def 多文件润色(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, language='en', mode='polish'):
@@ -126,7 +126,9 @@ def 多文件润色(file_manifest, project_folder, llm_kwargs, plugin_kwargs, ch
# <-------- 整理结果,退出 ---------->
create_report_file_name = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + f"-chatgpt.polish.md"
res = write_results_to_file(gpt_response_collection, file_name=create_report_file_name)
res = write_history_to_file(gpt_response_collection, file_basename=create_report_file_name)
promote_file_to_downloadzone(res, chatbot=chatbot)
history = gpt_response_collection
chatbot.append((f"{fp}完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
@@ -137,7 +139,7 @@ def Latex英文润色(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_p
# 基本信息:功能、贡献者
chatbot.append([
"函数插件功能?",
"对整个Latex项目进行润色。函数插件贡献者: Binary-Husky"])
"对整个Latex项目进行润色。函数插件贡献者: Binary-Husky注意,此插件不调用Latex,如果有Latex环境,请使用“Latex英文纠错+高亮”插件)"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# 尝试导入依赖,如果缺少依赖,则给出安装建议

View file

@@ -1,5 +1,5 @@
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import update_ui, promote_file_to_downloadzone
from toolbox import CatchException, report_execption, write_history_to_file
fast_debug = False
class PaperFileGroup():
@@ -95,7 +95,8 @@ def 多文件翻译(file_manifest, project_folder, llm_kwargs, plugin_kwargs, ch
# <-------- 整理结果,退出 ---------->
create_report_file_name = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + f"-chatgpt.polish.md"
res = write_results_to_file(gpt_response_collection, file_name=create_report_file_name)
res = write_history_to_file(gpt_response_collection, create_report_file_name)
promote_file_to_downloadzone(res, chatbot=chatbot)
history = gpt_response_collection
chatbot.append((f"{fp}完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

View file

@@ -1,4 +1,4 @@
from toolbox import update_ui, trimmed_format_exc, get_conf, objdump, objload, promote_file_to_downloadzone
from toolbox import update_ui, trimmed_format_exc, get_conf, get_log_folder, promote_file_to_downloadzone
from toolbox import CatchException, report_execption, update_ui_lastest_msg, zip_result, gen_time_str
from functools import partial
import glob, os, requests, time
@@ -65,7 +65,7 @@ def move_project(project_folder, arxiv_id=None):
if arxiv_id is not None:
new_workfolder = pj(ARXIV_CACHE_DIR, arxiv_id, 'workfolder')
else:
new_workfolder = f'gpt_log/{gen_time_str()}'
new_workfolder = f'{get_log_folder()}/{gen_time_str()}'
try:
shutil.rmtree(new_workfolder)
except:
@@ -79,7 +79,7 @@ def move_project(project_folder, arxiv_id=None):
shutil.copytree(src=project_folder, dst=new_workfolder)
return new_workfolder
def arxiv_download(chatbot, history, txt):
def arxiv_download(chatbot, history, txt, allow_cache=True):
def check_cached_translation_pdf(arxiv_id):
translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'translation')
if not os.path.exists(translation_dir):
@@ -116,7 +116,7 @@ def arxiv_download(chatbot, history, txt):
arxiv_id = url_.split('/abs/')[-1]
if 'v' in arxiv_id: arxiv_id = arxiv_id[:10]
cached_translation_pdf = check_cached_translation_pdf(arxiv_id)
if cached_translation_pdf: return cached_translation_pdf, arxiv_id
if cached_translation_pdf and allow_cache: return cached_translation_pdf, arxiv_id
url_tar = url_.replace('/abs/', '/e-print/')
translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'e-print')
@@ -228,6 +228,9 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
# <-------------- more requirements ------------->
if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
more_req = plugin_kwargs.get("advanced_arg", "")
no_cache = more_req.startswith("--no-cache")
if no_cache: more_req.lstrip("--no-cache")
allow_cache = not no_cache
_switch_prompt_ = partial(switch_prompt, more_requirement=more_req)
# <-------------- check deps ------------->
@@ -244,7 +247,7 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
# <-------------- clear history and read input ------------->
history = []
txt, arxiv_id = yield from arxiv_download(chatbot, history, txt)
txt, arxiv_id = yield from arxiv_download(chatbot, history, txt, allow_cache)
if txt.endswith('.pdf'):
report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"发现已经存在翻译好的PDF文档")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

View file

@@ -1,5 +1,7 @@
from toolbox import update_ui, get_conf, trimmed_format_exc
from toolbox import update_ui, get_conf, trimmed_format_exc, get_log_folder
import threading
import os
import logging
def input_clipping(inputs, history, max_token_limit):
import numpy as np
@@ -649,7 +651,7 @@ class knowledge_archive_interface():
from toolbox import ProxyNetworkActivate
print('Checking Text2vec ...')
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
with ProxyNetworkActivate(): # 临时地激活代理网络
with ProxyNetworkActivate('Download_LLM'): # 临时地激活代理网络
self.text2vec_large_chinese = HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
return self.text2vec_large_chinese
@@ -705,49 +707,96 @@ class knowledge_archive_interface():
)
self.threadLock.release()
return resp, prompt
@Singleton
class nougat_interface():
def __init__(self):
self.threadLock = threading.Lock()
def try_install_deps(deps):
def nougat_with_timeout(self, command, cwd, timeout=3600):
import subprocess
logging.info(f'正在执行命令 {command}')
process = subprocess.Popen(command, shell=True, cwd=cwd)
try:
stdout, stderr = process.communicate(timeout=timeout)
except subprocess.TimeoutExpired:
process.kill()
stdout, stderr = process.communicate()
print("Process timed out!")
return False
return True
def NOUGAT_parse_pdf(self, fp, chatbot, history):
from toolbox import update_ui_lastest_msg
yield from update_ui_lastest_msg("正在解析论文, 请稍候。进度:正在排队, 等待线程锁...",
chatbot=chatbot, history=history, delay=0)
self.threadLock.acquire()
import glob, threading, os
from toolbox import get_log_folder, gen_time_str
dst = os.path.join(get_log_folder(plugin_name='nougat'), gen_time_str())
os.makedirs(dst)
yield from update_ui_lastest_msg("正在解析论文, 请稍候。进度正在加载NOUGAT... 提示首次运行需要花费较长时间下载NOUGAT参数",
chatbot=chatbot, history=history, delay=0)
self.nougat_with_timeout(f'nougat --out "{os.path.abspath(dst)}" "{os.path.abspath(fp)}"', os.getcwd(), timeout=3600)
res = glob.glob(os.path.join(dst,'*.mmd'))
if len(res) == 0:
self.threadLock.release()
raise RuntimeError("Nougat解析论文失败。")
self.threadLock.release()
return res[0]
def try_install_deps(deps, reload_m=[]):
import subprocess, sys, importlib
for dep in deps:
import subprocess, sys
subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--user', dep])
import site
importlib.reload(site)
for m in reload_m:
importlib.reload(__import__(m))
class construct_html():
def __init__(self) -> None:
self.css = """
HTML_CSS = """
.row {
display: flex;
flex-wrap: wrap;
}
.column {
flex: 1;
padding: 10px;
}
.table-header {
font-weight: bold;
border-bottom: 1px solid black;
}
.table-row {
border-bottom: 1px solid lightgray;
}
.table-cell {
padding: 5px;
}
"""
self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
"""
def add_row(self, a, b):
tmp = """
TABLE_CSS = """
<div class="row table-row">
<div class="column table-cell">REPLACE_A</div>
<div class="column table-cell">REPLACE_B</div>
</div>
"""
"""
class construct_html():
def __init__(self) -> None:
self.css = HTML_CSS
self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
def add_row(self, a, b):
tmp = TABLE_CSS
from toolbox import markdown_convertion
tmp = tmp.replace('REPLACE_A', markdown_convertion(a))
tmp = tmp.replace('REPLACE_B', markdown_convertion(b))
@@ -755,6 +804,13 @@ class construct_html():
def save_file(self, file_name):
with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
with open(os.path.join(get_log_folder(), file_name), 'w', encoding='utf8') as f:
f.write(self.html_string.encode('utf-8', 'ignore').decode())
return os.path.join(get_log_folder(), file_name)
def get_plugin_arg(plugin_kwargs, key, default):
# 如果参数是空的
if (key in plugin_kwargs) and (plugin_kwargs[key] == ""): plugin_kwargs.pop(key)
# 正常情况
return plugin_kwargs.get(key, default)

View file

@@ -0,0 +1,70 @@
import time
import importlib
from toolbox import trimmed_format_exc, gen_time_str, get_log_folder
from toolbox import CatchException, update_ui, gen_time_str, trimmed_format_exc, is_the_upload_folder
from toolbox import promote_file_to_downloadzone, get_log_folder, update_ui_lastest_msg
import multiprocessing
def get_class_name(class_string):
import re
# Use regex to extract the class name
class_name = re.search(r'class (\w+)\(', class_string).group(1)
return class_name
def try_make_module(code, chatbot):
module_file = 'gpt_fn_' + gen_time_str().replace('-','_')
fn_path = f'{get_log_folder(plugin_name="gen_plugin_verify")}/{module_file}.py'
with open(fn_path, 'w', encoding='utf8') as f: f.write(code)
promote_file_to_downloadzone(fn_path, chatbot=chatbot)
class_name = get_class_name(code)
manager = multiprocessing.Manager()
return_dict = manager.dict()
p = multiprocessing.Process(target=is_function_successfully_generated, args=(fn_path, class_name, return_dict))
# only has 10 seconds to run
p.start(); p.join(timeout=10)
if p.is_alive(): p.terminate(); p.join()
p.close()
return return_dict["success"], return_dict['traceback']
# check is_function_successfully_generated
def is_function_successfully_generated(fn_path, class_name, return_dict):
return_dict['success'] = False
return_dict['traceback'] = ""
try:
# Create a spec for the module
module_spec = importlib.util.spec_from_file_location('example_module', fn_path)
# Load the module
example_module = importlib.util.module_from_spec(module_spec)
module_spec.loader.exec_module(example_module)
# Now you can use the module
some_class = getattr(example_module, class_name)
# Now you can create an instance of the class
instance = some_class()
return_dict['success'] = True
return
except:
return_dict['traceback'] = trimmed_format_exc()
return
def subprocess_worker(code, file_path, return_dict):
return_dict['result'] = None
return_dict['success'] = False
return_dict['traceback'] = ""
try:
module_file = 'gpt_fn_' + gen_time_str().replace('-','_')
fn_path = f'{get_log_folder(plugin_name="gen_plugin_run")}/{module_file}.py'
with open(fn_path, 'w', encoding='utf8') as f: f.write(code)
class_name = get_class_name(code)
# Create a spec for the module
module_spec = importlib.util.spec_from_file_location('example_module', fn_path)
# Load the module
example_module = importlib.util.module_from_spec(module_spec)
module_spec.loader.exec_module(example_module)
# Now you can use the module
some_class = getattr(example_module, class_name)
# Now you can create an instance of the class
instance = some_class()
return_dict['result'] = instance.run(file_path)
return_dict['success'] = True
except:
return_dict['traceback'] = trimmed_format_exc()

View file

@@ -1,4 +1,4 @@
from toolbox import update_ui, update_ui_lastest_msg # 刷新Gradio前端界面
from toolbox import update_ui, update_ui_lastest_msg, get_log_folder
from toolbox import zip_folder, objdump, objload, promote_file_to_downloadzone
from .latex_toolbox import PRESERVE, TRANSFORM
from .latex_toolbox import set_forbidden_text, set_forbidden_text_begin_end, set_forbidden_text_careful_brace
@@ -363,7 +363,7 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
if mode!='translate_zh':
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 使用latexdiff生成论文转化前后对比 ...', chatbot, history) # 刷新Gradio前端界面
print( f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
ok = compile_latex_with_timeout(f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
ok = compile_latex_with_timeout(f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex', os.getcwd())
yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 正在编译对比PDF ...', chatbot, history) # 刷新Gradio前端界面
ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex', work_folder)
@@ -439,9 +439,9 @@ def write_html(sp_file_contents, sp_file_result, chatbot, project_folder):
trans = k
ch.add_row(a=orig, b=trans)
create_report_file_name = f"{gen_time_str()}.trans.html"
ch.save_file(create_report_file_name)
shutil.copyfile(pj('./gpt_log/', create_report_file_name), pj(project_folder, create_report_file_name))
promote_file_to_downloadzone(file=f'./gpt_log/{create_report_file_name}', chatbot=chatbot)
res = ch.save_file(create_report_file_name)
shutil.copyfile(res, pj(project_folder, create_report_file_name))
promote_file_to_downloadzone(file=res, chatbot=chatbot)
except:
from toolbox import trimmed_format_exc
print('writing html result failed:', trimmed_format_exc())

View file

@@ -256,6 +256,7 @@ def find_main_tex_file(file_manifest, mode):
canidates_score.append(0)
with open(texf, 'r', encoding='utf8', errors='ignore') as f:
file_content = f.read()
file_content = rm_comments(file_content)
for uw in unexpected_words:
if uw in file_content:
canidates_score[-1] -= 1
@@ -290,7 +291,11 @@ def find_tex_file_ignore_case(fp):
import glob
for f in glob.glob(dir_name+'/*.tex'):
base_name_s = os.path.basename(fp)
if base_name_s.lower() == base_name.lower(): return f
base_name_f = os.path.basename(f)
if base_name_s.lower() == base_name_f.lower(): return f
# 试着加上.tex后缀试试
if not base_name_s.endswith('.tex'): base_name_s+='.tex'
if base_name_s.lower() == base_name_f.lower(): return f
return None
def merge_tex_files_(project_foler, main_file, mode):
@@ -301,9 +306,9 @@ def merge_tex_files_(project_foler, main_file, mode):
for s in reversed([q for q in re.finditer(r"\\input\{(.*?)\}", main_file, re.M)]):
f = s.group(1)
fp = os.path.join(project_foler, f)
fp = find_tex_file_ignore_case(fp)
if fp:
with open(fp, 'r', encoding='utf-8', errors='replace') as fx: c = fx.read()
fp_ = find_tex_file_ignore_case(fp)
if fp_:
with open(fp_, 'r', encoding='utf-8', errors='replace') as fx: c = fx.read()
else:
raise RuntimeError(f'找不到{fp},Tex源文件缺失')
c = merge_tex_files_(project_foler, c, mode)

View file

@@ -1,16 +1,26 @@
from functools import lru_cache
from toolbox import gen_time_str
from toolbox import promote_file_to_downloadzone
from toolbox import write_history_to_file, promote_file_to_downloadzone
from toolbox import get_conf
from toolbox import ProxyNetworkActivate
from colorful import *
import requests
import random
from functools import lru_cache
import copy
import os
import math
class GROBID_OFFLINE_EXCEPTION(Exception): pass
def get_avail_grobid_url():
from toolbox import get_conf
GROBID_URLS, = get_conf('GROBID_URLS')
if len(GROBID_URLS) == 0: return None
try:
_grobid_url = random.choice(GROBID_URLS) # 随机负载均衡
if _grobid_url.endswith('/'): _grobid_url = _grobid_url.rstrip('/')
res = requests.get(_grobid_url+'/api/isalive')
with ProxyNetworkActivate('Connect_Grobid'):
res = requests.get(_grobid_url+'/api/isalive')
if res.text=='true': return _grobid_url
else: return None
except:
@@ -21,10 +31,141 @@ def parse_pdf(pdf_path, grobid_url):
import scipdf # pip install scipdf_parser
if grobid_url.endswith('/'): grobid_url = grobid_url.rstrip('/')
try:
article_dict = scipdf.parse_pdf_to_dict(pdf_path, grobid_url=grobid_url)
with ProxyNetworkActivate('Connect_Grobid'):
article_dict = scipdf.parse_pdf_to_dict(pdf_path, grobid_url=grobid_url)
except GROBID_OFFLINE_EXCEPTION:
raise GROBID_OFFLINE_EXCEPTION("GROBID服务不可用,请修改config中的GROBID_URL,可修改成本地GROBID服务。")
except:
raise RuntimeError("解析PDF失败,请检查PDF是否损坏。")
return article_dict
def produce_report_markdown(gpt_response_collection, meta, paper_meta_info, chatbot, fp, generated_conclusion_files):
# -=-=-=-=-=-=-=-= 写出第1个文件翻译前后混合 -=-=-=-=-=-=-=-=
res_path = write_history_to_file(meta + ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=f"{gen_time_str()}translated_and_original.md", file_fullname=None)
promote_file_to_downloadzone(res_path, rename_file=os.path.basename(res_path)+'.md', chatbot=chatbot)
generated_conclusion_files.append(res_path)
# -=-=-=-=-=-=-=-= 写出第2个文件仅翻译后的文本 -=-=-=-=-=-=-=-=
translated_res_array = []
# 记录当前的大章节标题:
last_section_name = ""
for index, value in enumerate(gpt_response_collection):
# 先挑选偶数序列号:
if index % 2 != 0:
# 先提取当前英文标题:
cur_section_name = gpt_response_collection[index-1].split('\n')[0].split(" Part")[0]
# 如果index是1的话,则直接使用first section name
if cur_section_name != last_section_name:
cur_value = cur_section_name + '\n'
last_section_name = copy.deepcopy(cur_section_name)
else:
cur_value = ""
# 再做一个小修改重新修改当前part的标题,默认用英文的
cur_value += value
translated_res_array.append(cur_value)
res_path = write_history_to_file(meta + ["# Meta Translation" , paper_meta_info] + translated_res_array,
file_basename = f"{gen_time_str()}-translated_only.md",
file_fullname = None,
auto_caption = False)
promote_file_to_downloadzone(res_path, rename_file=os.path.basename(res_path)+'.md', chatbot=chatbot)
generated_conclusion_files.append(res_path)
return res_path
def translate_pdf(article_dict, llm_kwargs, chatbot, fp, generated_conclusion_files, TOKEN_LIMIT_PER_FRAGMENT, DST_LANG):
from crazy_functions.crazy_utils import construct_html
from crazy_functions.crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from crazy_functions.crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
prompt = "以下是一篇学术论文的基本信息:\n"
# title
title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
# authors
authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
# abstract
abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
# command
prompt += f"请将题目和摘要翻译为{DST_LANG}"
meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
# 单线,获取文章meta信息
paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=prompt,
inputs_show_user=prompt,
llm_kwargs=llm_kwargs,
chatbot=chatbot, history=[],
sys_prompt="You are an academic paper reader。",
)
# 多线,翻译
inputs_array = []
inputs_show_user_array = []
# get_token_num
from request_llm.bridge_all import model_info
enc = model_info[llm_kwargs['llm_model']]['tokenizer']
def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
def break_down(txt):
raw_token_num = get_token_num(txt)
if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
return [txt]
else:
# raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
# find a smooth token limit to achieve even seperation
count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
token_limit_smooth = raw_token_num // count + count
return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
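# Note: break_down aims for evenly sized fragments rather than full fragments
# plus a small stub. For example, with raw_token_num = 3500 and
# TOKEN_LIMIT_PER_FRAGMENT = 1024: count = ceil(3500/1024) = 4 and
# token_limit_smooth = 3500 // 4 + 4 = 879, yielding four ~875-token pieces.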
for section in article_dict.get('sections'):
if len(section['text']) == 0: continue
section_frags = break_down(section['text'])
for i, fragment in enumerate(section_frags):
heading = section['heading']
if len(section_frags) > 1: heading += f' Part-{i+1}'
inputs_array.append(
f"你需要翻译{heading}章节,内容如下: \n\n{fragment}"
)
inputs_show_user_array.append(
f"# {heading}\n\n{fragment}"
)
gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
inputs_array=inputs_array,
inputs_show_user_array=inputs_show_user_array,
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history_array=[meta for _ in inputs_array],
sys_prompt_array=[
"请你作为一个学术翻译,负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
)
# -=-=-=-=-=-=-=-= 写出Markdown文件 -=-=-=-=-=-=-=-=
produce_report_markdown(gpt_response_collection, meta, paper_meta_info, chatbot, fp, generated_conclusion_files)
# -=-=-=-=-=-=-=-= 写出HTML文件 -=-=-=-=-=-=-=-=
ch = construct_html()
orig = ""
trans = ""
gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
for i,k in enumerate(gpt_response_collection_html):
if i%2==0:
gpt_response_collection_html[i] = inputs_show_user_array[i//2]
else:
# 先提取当前英文标题:
cur_section_name = gpt_response_collection[i-1].split('\n')[0].split(" Part")[0]
cur_value = cur_section_name + "\n" + gpt_response_collection_html[i]
gpt_response_collection_html[i] = cur_value
final = ["", "", "一、论文概况", "", "Abstract", paper_meta_info, "二、论文翻译", ""]
final.extend(gpt_response_collection_html)
for i, k in enumerate(final):
if i%2==0:
orig = k
if i%2==1:
trans = k
ch.add_row(a=orig, b=trans)
create_report_file_name = f"{os.path.basename(fp)}.trans.html"
html_file = ch.save_file(create_report_file_name)
generated_conclusion_files.append(html_file)
promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)

View file

@@ -1,5 +1,6 @@
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file, get_conf
from toolbox import update_ui, get_log_folder
from toolbox import write_history_to_file, promote_file_to_downloadzone
from toolbox import CatchException, report_execption, get_conf
import re, requests, unicodedata, os
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
def download_arxiv_(url_pdf):
@@ -28,7 +29,7 @@ def download_arxiv_(url_pdf):
if k in other_info['comment']:
title = k + ' ' + title
download_dir = './gpt_log/arxiv/'
download_dir = get_log_folder(plugin_name='arxiv')
os.makedirs(download_dir, exist_ok=True)
title_str = title.replace('?', '')\
@@ -40,9 +41,6 @@ def download_arxiv_(url_pdf):
requests_pdf_url = url_pdf
file_path = download_dir+title_str
# if os.path.exists(file_path):
# print('返回缓存文件')
# return './gpt_log/arxiv/'+title_str
print('下载中')
proxies, = get_conf('proxies')
@@ -61,7 +59,7 @@ def download_arxiv_(url_pdf):
.replace('\n', '')\
.replace(' ', ' ')\
.replace(' ', ' ')
return './gpt_log/arxiv/'+title_str, other_info
return file_path, other_info
def get_name(_url_):
@@ -184,11 +182,10 @@ def 下载arxiv论文并翻译摘要(txt, llm_kwargs, plugin_kwargs, chatbot, hi
chatbot[-1] = (i_say_show_user, gpt_say)
history.append(i_say_show_user); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
# 写入文件
import shutil
# 重置文件的创建时间
shutil.copyfile(pdf_path, f'./gpt_log/{os.path.basename(pdf_path)}'); os.remove(pdf_path)
res = write_results_to_file(history)
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
promote_file_to_downloadzone(pdf_path, chatbot=chatbot)
chatbot.append(("完成了吗?", res + "\n\nPDF文件也已经下载"))
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面

View file

@@ -1,138 +0,0 @@
import threading
from request_llm.bridge_all import predict_no_ui_long_connection
from toolbox import update_ui
from toolbox import CatchException, write_results_to_file, report_execption
from .crazy_utils import breakdown_txt_to_satisfy_token_limit
def extract_code_block_carefully(txt):
splitted = txt.split('```')
n_code_block_seg = len(splitted) - 1
if n_code_block_seg <= 1: return txt
# 剩下的情况都开头除去 ``` 结尾除去一次 ```
txt_out = '```'.join(splitted[1:-1])
return txt_out
def break_txt_into_half_at_some_linebreak(txt):
lines = txt.split('\n')
n_lines = len(lines)
pre = lines[:(n_lines//2)]
post = lines[(n_lines//2):]
return "\n".join(pre), "\n".join(post)
@CatchException
def 全项目切换英文(txt, llm_kwargs, plugin_kwargs, chatbot, history, sys_prompt, web_port):
# 第1步清空历史,以免输入溢出
history = []
# 第2步尝试导入依赖,如果缺少依赖,则给出安装建议
try:
import tiktoken
except:
report_execption(chatbot, history,
a = f"解析项目: {txt}",
b = f"导入软件依赖失败。使用该模块需要额外依赖,安装方法```pip install --upgrade tiktoken```。")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
# 第3步集合文件
import time, glob, os, shutil, re
os.makedirs('gpt_log/generated_english_version', exist_ok=True)
os.makedirs('gpt_log/generated_english_version/crazy_functions', exist_ok=True)
file_manifest = [f for f in glob.glob('./*.py') if ('test_project' not in f) and ('gpt_log' not in f)] + \
[f for f in glob.glob('./crazy_functions/*.py') if ('test_project' not in f) and ('gpt_log' not in f)]
# file_manifest = ['./toolbox.py']
i_say_show_user_buffer = []
# 第4步随便显示点什么防止卡顿的感觉
for index, fp in enumerate(file_manifest):
# if 'test_project' in fp: continue
with open(fp, 'r', encoding='utf-8', errors='replace') as f:
file_content = f.read()
i_say_show_user =f'[{index}/{len(file_manifest)}] 接下来请将以下代码中包含的所有中文转化为英文,只输出转化后的英文代码,请用代码块输出代码: {os.path.abspath(fp)}'
i_say_show_user_buffer.append(i_say_show_user)
chatbot.append((i_say_show_user, "[Local Message] 等待多线程操作,中间过程不予显示."))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# 第5步Token限制下的截断与处理
MAX_TOKEN = 3000
from request_llm.bridge_all import model_info
enc = model_info["gpt-3.5-turbo"]['tokenizer']
def get_token_fn(txt): return len(enc.encode(txt, disallowed_special=()))
# 第6步任务函数
mutable_return = [None for _ in file_manifest]
observe_window = [[""] for _ in file_manifest]
def thread_worker(fp,index):
if index > 10:
time.sleep(60)
print('Openai 限制免费用户每分钟20次请求,降低请求频率中。')
with open(fp, 'r', encoding='utf-8', errors='replace') as f:
file_content = f.read()
i_say_template = lambda fp, file_content: f'接下来请将以下代码中包含的所有中文转化为英文,只输出代码,文件名是{fp},文件代码是 ```{file_content}```'
try:
gpt_say = ""
# 分解代码文件
file_content_breakdown = breakdown_txt_to_satisfy_token_limit(file_content, get_token_fn, MAX_TOKEN)
for file_content_partial in file_content_breakdown:
i_say = i_say_template(fp, file_content_partial)
# # ** gpt request **
gpt_say_partial = predict_no_ui_long_connection(inputs=i_say, llm_kwargs=llm_kwargs, history=[], sys_prompt=sys_prompt, observe_window=observe_window[index])
gpt_say_partial = extract_code_block_carefully(gpt_say_partial)
gpt_say += gpt_say_partial
mutable_return[index] = gpt_say
except ConnectionAbortedError as token_exceed_err:
print('至少一个线程任务Token溢出而失败', e)
except Exception as e:
print('至少一个线程任务意外失败', e)
# 第7步所有线程同时开始执行任务函数
handles = [threading.Thread(target=thread_worker, args=(fp,index)) for index, fp in enumerate(file_manifest)]
for h in handles:
h.daemon = True
h.start()
chatbot.append(('开始了吗?', f'多线程操作已经开始'))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# 第8步循环轮询各个线程是否执行完毕
cnt = 0
while True:
cnt += 1
time.sleep(0.2)
th_alive = [h.is_alive() for h in handles]
if not any(th_alive): break
# 更好的UI视觉效果
observe_win = []
for thread_index, alive in enumerate(th_alive):
observe_win.append("[ ..."+observe_window[thread_index][0][-60:].replace('\n','').replace('```','...').replace(' ','.').replace('<br/>','.....').replace('$','.')+"... ]")
stat = [f'执行中: {obs}\n\n' if alive else '已完成\n\n' for alive, obs in zip(th_alive, observe_win)]
stat_str = ''.join(stat)
chatbot[-1] = (chatbot[-1][0], f'多线程操作已经开始,完成情况: \n\n{stat_str}' + ''.join(['.']*(cnt%10+1)))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# 第9步把结果写入文件
for index, h in enumerate(handles):
h.join() # 这里其实不需要join了,肯定已经都结束了
fp = file_manifest[index]
gpt_say = mutable_return[index]
i_say_show_user = i_say_show_user_buffer[index]
where_to_relocate = f'gpt_log/generated_english_version/{fp}'
if gpt_say is not None:
with open(where_to_relocate, 'w+', encoding='utf-8') as f:
f.write(gpt_say)
else: # 失败
shutil.copyfile(file_manifest[index], where_to_relocate)
chatbot.append((i_say_show_user, f'[Local Message] 已完成{os.path.abspath(fp)}的转化,\n\n存入{os.path.abspath(where_to_relocate)}'))
history.append(i_say_show_user); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
time.sleep(1)
# 第10步备份一个文件
res = write_results_to_file(history)
chatbot.append(("生成一份任务执行报告", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

View file

@@ -0,0 +1,252 @@
# 本源代码中, ⭐ = 关键步骤
"""
测试:
- 裁剪图像,保留下半部分
- 交换图像的蓝色通道和红色通道
- 将图像转为灰度图像
- 将csv文件转excel表格
Testing:
- Crop the image, keeping the bottom half.
- Swap the blue channel and red channel of the image.
- Convert the image to grayscale.
- Convert the CSV file to an Excel spreadsheet.
"""
from toolbox import CatchException, update_ui, gen_time_str, trimmed_format_exc, is_the_upload_folder
from toolbox import promote_file_to_downloadzone, get_log_folder, update_ui_lastest_msg
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive, get_plugin_arg
from .crazy_utils import input_clipping, try_install_deps
from crazy_functions.gen_fns.gen_fns_shared import is_function_successfully_generated
from crazy_functions.gen_fns.gen_fns_shared import get_class_name
from crazy_functions.gen_fns.gen_fns_shared import subprocess_worker
from crazy_functions.gen_fns.gen_fns_shared import try_make_module
import os
import time
import glob
import multiprocessing
templete = """
```python
import ... # Put dependencies here, e.g. import numpy as np.
class TerminalFunction(object): # Do not change the name of the class, The name of the class must be `TerminalFunction`
def run(self, path): # The name of the function must be `run`, it takes only a positional argument.
# rewrite the function you have just written here
...
return generated_file_path
```
"""
def inspect_dependency(chatbot, history):
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return True
def get_code_block(reply):
import re
pattern = r"```([\s\S]*?)```" # regex pattern to match code blocks
matches = re.findall(pattern, reply) # find all code blocks in text
if len(matches) == 1:
return matches[0].strip('python') # code block
for match in matches:
if 'class TerminalFunction' in match:
return match.strip('python') # code block
raise RuntimeError("GPT is not generating proper code.")
def gpt_interact_multi_step(txt, file_type, llm_kwargs, chatbot, history):
# 输入
prompt_compose = [
f'Your job:\n'
f'1. write a single Python function, which takes a path of a `{file_type}` file as the only argument and returns a `string` containing the result of analysis or the path of generated files. \n',
f"2. You should write this function to perform following task: " + txt + "\n",
f"3. Wrap the output python function with markdown codeblock."
]
i_say = "".join(prompt_compose)
demo = []
# 第一步
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=i_say, inputs_show_user=i_say,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=demo,
sys_prompt= r"You are a world-class programmer."
)
history.extend([i_say, gpt_say])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
# 第二步
prompt_compose = [
"If previous stage is successful, rewrite the function you have just written to satisfy following templete: \n",
templete
]
i_say = "".join(prompt_compose); inputs_show_user = "If previous stage is successful, rewrite the function you have just written to satisfy executable templete. "
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=i_say, inputs_show_user=inputs_show_user,
llm_kwargs=llm_kwargs, chatbot=chatbot, history=history,
sys_prompt= r"You are a programmer. You need to replace `...` with valid packages, do not give `...` in your answer!"
)
code_to_return = gpt_say
history.extend([i_say, gpt_say])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
# # 第三步
# i_say = "Please list to packages to install to run the code above. Then show me how to use `try_install_deps` function to install them."
# i_say += 'For instance. `try_install_deps(["opencv-python", "scipy", "numpy"])`'
# installation_advance = yield from request_gpt_model_in_new_thread_with_ui_alive(
# inputs=i_say, inputs_show_user=inputs_show_user,
# llm_kwargs=llm_kwargs, chatbot=chatbot, history=history,
# sys_prompt= r"You are a programmer."
# )
# # # 第三步
# i_say = "Show me how to use `pip` to install packages to run the code above. "
# i_say += 'For instance. `pip install -r opencv-python scipy numpy`'
# installation_advance = yield from request_gpt_model_in_new_thread_with_ui_alive(
# inputs=i_say, inputs_show_user=i_say,
# llm_kwargs=llm_kwargs, chatbot=chatbot, history=history,
# sys_prompt= r"You are a programmer."
# )
installation_advance = ""
return code_to_return, installation_advance, txt, file_type, llm_kwargs, chatbot, history
def for_immediate_show_off_when_possible(file_type, fp, chatbot):
if file_type in ['png', 'jpg']:
image_path = os.path.abspath(fp)
chatbot.append(['这是一张图片, 展示如下:',
f'本地文件地址: <br/>`{image_path}`<br/>'+
f'本地文件预览: <br/><div align="center"><img src="file={image_path}"></div>'
])
return chatbot
def have_any_recent_upload_files(chatbot):
_5min = 5 * 60
if not chatbot: return False # chatbot is None
most_recent_uploaded = chatbot._cookies.get("most_recent_uploaded", None)
if not most_recent_uploaded: return False # most_recent_uploaded is None
if time.time() - most_recent_uploaded["time"] < _5min: return True # most_recent_uploaded is new
else: return False # most_recent_uploaded is too old
def get_recent_file_prompt_support(chatbot):
most_recent_uploaded = chatbot._cookies.get("most_recent_uploaded", None)
path = most_recent_uploaded['path']
return path
@CatchException
def 函数动态生成(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
"""
txt 输入栏用户输入的文本,例如需要翻译的一段话,再例如一个包含了待处理文件的路径
llm_kwargs gpt模型参数,如温度和top_p等,一般原样传递下去就行
plugin_kwargs 插件模型的参数,暂时没有用武之地
chatbot 聊天显示框的句柄,用于显示给用户
history 聊天历史,前情提要
system_prompt 给gpt的静默提醒
web_port 当前软件运行的端口号
"""
# 清空历史
history = []
# 基本信息:功能、贡献者
chatbot.append(["正在启动: 插件动态生成插件", "插件动态生成, 执行开始, 作者Binary-Husky."])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# ⭐ 文件上传区是否有东西
# 1. 如果有文件: 作为函数参数
# 2. 如果没有文件需要用GPT提取参数 (太懒了,以后再写,虚空终端已经实现了类似的代码)
file_list = []
if get_plugin_arg(plugin_kwargs, key="file_path_arg", default=False):
file_path = get_plugin_arg(plugin_kwargs, key="file_path_arg", default=None)
file_list.append(file_path)
yield from update_ui_lastest_msg(f"当前文件: {file_path}", chatbot, history, 1)
elif have_any_recent_upload_files(chatbot):
file_dir = get_recent_file_prompt_support(chatbot)
file_list = glob.glob(os.path.join(file_dir, '**/*'), recursive=True)
yield from update_ui_lastest_msg(f"当前文件处理列表: {file_list}", chatbot, history, 1)
else:
chatbot.append(["文件检索", "没有发现任何近期上传的文件。"])
yield from update_ui_lastest_msg("没有发现任何近期上传的文件。", chatbot, history, 1)
return # 2. 如果没有文件
if len(file_list) == 0:
chatbot.append(["文件检索", "没有发现任何近期上传的文件。"])
yield from update_ui_lastest_msg("没有发现任何近期上传的文件。", chatbot, history, 1)
return # 2. 如果没有文件
# 读取文件
file_type = file_list[0].split('.')[-1]
# 粗心检查
if is_the_upload_folder(txt):
yield from update_ui_lastest_msg(f"请在输入框内填写需求, 然后再次点击该插件! 至于您的文件,不用担心, 文件路径 {txt} 已经被记忆. ", chatbot, history, 1)
return
# 开始干正事
MAX_TRY = 3
for j in range(MAX_TRY): # 最多重试5次
        traceback = ""
        try:
            # ⭐ 开始啦
            code, installation_advance, txt, file_type, llm_kwargs, chatbot, history = \
                yield from gpt_interact_multi_step(txt, file_type, llm_kwargs, chatbot, history)
            chatbot.append(["代码生成阶段结束", ""])
            yield from update_ui_lastest_msg(f"正在验证上述代码的有效性 ...", chatbot, history, 1)
            # ⭐ 分离代码块
            code = get_code_block(code)
            # ⭐ 检查模块
            ok, traceback = try_make_module(code, chatbot)
            # 搞定代码生成
            if ok: break
        except Exception as e:
            if not traceback: traceback = trimmed_format_exc()

        # 处理异常
        if not traceback: traceback = trimmed_format_exc()
        yield from update_ui_lastest_msg(f"{j+1}/{MAX_TRY} 次代码生成尝试, 失败了~ 别担心, 我们5秒后再试一次... \n\n此次我们的错误追踪是\n```\n{traceback}\n```\n", chatbot, history, 5)

    # 代码生成结束, 开始执行
    TIME_LIMIT = 15
    yield from update_ui_lastest_msg(f"开始创建新进程并执行代码! 时间限制 {TIME_LIMIT} 秒. 请等待任务完成... ", chatbot, history, 1)
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    # ⭐ 到最后一步了,开始逐个文件进行处理
    for file_path in file_list:
        if os.path.exists(file_path):
            chatbot.append([f"正在处理文件: {file_path}", f"请稍等..."])
            chatbot = for_immediate_show_off_when_possible(file_type, file_path, chatbot)
            yield from update_ui(chatbot=chatbot, history=history)  # 刷新界面
        else:
            continue

        # ⭐⭐⭐ subprocess_worker ⭐⭐⭐
        p = multiprocessing.Process(target=subprocess_worker, args=(code, file_path, return_dict))
        # ⭐ 开始执行,时间限制TIME_LIMIT
        p.start(); p.join(timeout=TIME_LIMIT)
        if p.is_alive(): p.terminate(); p.join()
        p.close()
        res = return_dict['result']
        success = return_dict['success']
        traceback = return_dict['traceback']

        if not success:
            if not traceback: traceback = trimmed_format_exc()
            chatbot.append(["执行失败了", f"错误追踪\n```\n{trimmed_format_exc()}\n```\n"])
            # chatbot.append(["如果是缺乏依赖,请参考以下建议", installation_advance])
            yield from update_ui(chatbot=chatbot, history=history)  # 刷新界面
            return

        # 顺利完成,收尾
        res = str(res)
        if os.path.exists(res):
            chatbot.append(["执行成功了,结果是一个有效文件", "结果:" + res])
            new_file_path = promote_file_to_downloadzone(res, chatbot=chatbot)
            chatbot = for_immediate_show_off_when_possible(file_type, new_file_path, chatbot)
            yield from update_ui(chatbot=chatbot, history=history)  # 刷新界面
        else:
            chatbot.append(["执行成功了,结果是一个字符串", "结果:" + res])
            yield from update_ui(chatbot=chatbot, history=history)  # 刷新界面
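The execution step above isolates the generated code in a child process and enforces the 15-second cap with `p.join(timeout=TIME_LIMIT)` plus `terminate()`. The `subprocess_worker` target is defined elsewhere in this file and is not shown in the excerpt; the following is only a minimal sketch of what such a worker could look like, assuming the generated module exposes an entry function (the name `main` is a hypothetical placeholder):

```python
# Minimal sketch only: the real subprocess_worker is not shown in this excerpt.
# Assumption: the GPT-generated code defines an entry function; `main` is a placeholder name.
import traceback as tb

def subprocess_worker(code, file_path, return_dict):
    # Runs inside multiprocessing.Process; results travel back through a Manager().dict().
    return_dict['result'] = None
    return_dict['success'] = False
    return_dict['traceback'] = ""
    try:
        namespace = {}
        exec(code, namespace)                                 # load the generated module code
        return_dict['result'] = namespace['main'](file_path)  # call the hypothetical entry point
        return_dict['success'] = True
    except Exception:
        return_dict['traceback'] = tb.format_exc()
```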


@@ -1,4 +1,4 @@
from toolbox import CatchException, update_ui, get_conf, select_api_key
from toolbox import CatchException, update_ui, get_conf, select_api_key, get_log_folder
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
import datetime
@@ -33,7 +33,7 @@ def gen_image(llm_kwargs, prompt, resolution="256x256"):
raise RuntimeError(response.content.decode())
# 文件保存到本地
r = requests.get(image_url, proxies=proxies)
file_path = 'gpt_log/image_gen/'
file_path = f'{get_log_folder()}/image_gen/'
os.makedirs(file_path, exist_ok=True)
file_name = 'Image' + time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + '.png'
with open(file_path+file_name, 'wb+') as f: f.write(r.content)
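This is the first of many hunks in this comparison that replace a hard-coded `gpt_log/` prefix with `toolbox.get_log_folder()`. The helper's body is not part of the diff; judging from the call sites (`get_log_folder()` here, `get_log_folder(plugin_name='audio')` in the audio plugin below), it behaves roughly like this hypothetical sketch:

```python
import os

def get_log_folder(plugin_name='default'):
    # Hypothetical sketch: the real helper lives in toolbox and reads the root
    # from the PATH_LOGGING config option introduced later in this comparison.
    PATH_LOGGING = 'gpt_log'  # assumed default value
    folder = os.path.join(PATH_LOGGING, plugin_name)
    os.makedirs(folder, exist_ok=True)
    return folder
```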


@@ -1,4 +1,4 @@
from toolbox import CatchException, update_ui, promote_file_to_downloadzone
from toolbox import CatchException, update_ui, promote_file_to_downloadzone, get_log_folder
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
import re
@@ -10,8 +10,8 @@ def write_chat_to_file(chatbot, history=None, file_name=None):
import time
if file_name is None:
file_name = 'chatGPT对话历史' + time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + '.html'
os.makedirs('./gpt_log/', exist_ok=True)
with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
fp = os.path.join(get_log_folder(), file_name)
with open(fp, 'w', encoding='utf8') as f:
from themes.theme import advanced_css
f.write(f'<!DOCTYPE html><head><meta charset="utf-8"><title>对话历史</title><style>{advanced_css}</style></head>')
for i, contents in enumerate(chatbot):
@@ -29,8 +29,8 @@ def write_chat_to_file(chatbot, history=None, file_name=None):
for h in history:
f.write("\n>>>" + h)
f.write('</code>')
promote_file_to_downloadzone(f'./gpt_log/{file_name}', rename_file=file_name, chatbot=chatbot)
return '对话历史写入:' + os.path.abspath(f'./gpt_log/{file_name}')
promote_file_to_downloadzone(fp, rename_file=file_name, chatbot=chatbot)
return '对话历史写入:' + fp
def gen_file_preview(file_name):
try:
@@ -106,7 +106,7 @@ def 载入对话历史存档(txt, llm_kwargs, plugin_kwargs, chatbot, history, s
if not success:
if txt == "": txt = '空空如也的输入栏'
import glob
local_history = "<br/>".join(["`"+hide_cwd(f)+f" ({gen_file_preview(f)})"+"`" for f in glob.glob(f'gpt_log/**/chatGPT对话历史*.html', recursive=True)])
local_history = "<br/>".join(["`"+hide_cwd(f)+f" ({gen_file_preview(f)})"+"`" for f in glob.glob(f'{get_log_folder()}/**/chatGPT对话历史*.html', recursive=True)])
chatbot.append([f"正在查找对话历史文件html格式: {txt}", f"找不到任何html文件: {txt}。但本地存储了以下历史文件,您可以将任意一个文件路径粘贴到输入区,然后重试:<br/>{local_history}"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
@@ -132,8 +132,8 @@ def 删除所有本地对话历史记录(txt, llm_kwargs, plugin_kwargs, chatbot
"""
import glob, os
local_history = "<br/>".join(["`"+hide_cwd(f)+"`" for f in glob.glob(f'gpt_log/**/chatGPT对话历史*.html', recursive=True)])
for f in glob.glob(f'gpt_log/**/chatGPT对话历史*.html', recursive=True):
local_history = "<br/>".join(["`"+hide_cwd(f)+"`" for f in glob.glob(f'{get_log_folder()}/**/chatGPT对话历史*.html', recursive=True)])
for f in glob.glob(f'{get_log_folder()}/**/chatGPT对话历史*.html', recursive=True):
os.remove(f)
chatbot.append([f"删除所有历史对话文件", f"已删除<br/>{local_history}"])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面


@@ -1,5 +1,6 @@
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import CatchException, report_execption
from toolbox import write_history_to_file, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
fast_debug = False
@@ -71,11 +72,13 @@ def 解析docx(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot
history.extend([i_say,gpt_say])
this_paper_history.extend([i_say,gpt_say])
res = write_results_to_file(history)
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append(("完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
res = write_results_to_file(history)
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append(("所有文件都总结完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面


@@ -1,5 +1,6 @@
from toolbox import CatchException, report_execption, select_api_key, update_ui, write_results_to_file, get_conf
from toolbox import CatchException, report_execption, select_api_key, update_ui, get_conf
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from toolbox import write_history_to_file, promote_file_to_downloadzone, get_log_folder
def split_audio_file(filename, split_duration=1000):
"""
@@ -15,7 +16,7 @@ def split_audio_file(filename, split_duration=1000):
"""
from moviepy.editor import AudioFileClip
import os
os.makedirs('gpt_log/mp3/cut/', exist_ok=True) # 创建存储切割音频的文件夹
os.makedirs(f"{get_log_folder(plugin_name='audio')}/mp3/cut/", exist_ok=True) # 创建存储切割音频的文件夹
# 读取音频文件
audio = AudioFileClip(filename)
@@ -31,8 +32,8 @@ def split_audio_file(filename, split_duration=1000):
start_time = split_points[i]
end_time = split_points[i + 1]
split_audio = audio.subclip(start_time, end_time)
split_audio.write_audiofile(f"gpt_log/mp3/cut/{filename[0]}_{i}.mp3")
filelist.append(f"gpt_log/mp3/cut/{filename[0]}_{i}.mp3")
split_audio.write_audiofile(f"{get_log_folder(plugin_name='audio')}/mp3/cut/{filename[0]}_{i}.mp3")
filelist.append(f"{get_log_folder(plugin_name='audio')}/mp3/cut/{filename[0]}_{i}.mp3")
audio.close()
return filelist
@@ -52,7 +53,7 @@ def AnalyAudio(parse_prompt, file_manifest, llm_kwargs, chatbot, history):
'Authorization': f"Bearer {api_key}"
}
os.makedirs('gpt_log/mp3/', exist_ok=True)
os.makedirs(f"{get_log_folder(plugin_name='audio')}/mp3/", exist_ok=True)
for index, fp in enumerate(file_manifest):
audio_history = []
# 提取文件扩展名
@@ -60,8 +61,8 @@ def AnalyAudio(parse_prompt, file_manifest, llm_kwargs, chatbot, history):
# 提取视频中的音频
if ext not in [".mp3", ".wav", ".m4a", ".mpga"]:
audio_clip = AudioFileClip(fp)
audio_clip.write_audiofile(f'gpt_log/mp3/output{index}.mp3')
fp = f'gpt_log/mp3/output{index}.mp3'
audio_clip.write_audiofile(f"{get_log_folder(plugin_name='audio')}/mp3/output{index}.mp3")
fp = f"{get_log_folder(plugin_name='audio')}/mp3/output{index}.mp3"
# 调用whisper模型音频转文字
voice = split_audio_file(fp)
for j, i in enumerate(voice):
@@ -113,18 +114,19 @@ def AnalyAudio(parse_prompt, file_manifest, llm_kwargs, chatbot, history):
history=audio_history,
sys_prompt="总结文章。"
)
history.extend([i_say, gpt_say])
audio_history.extend([i_say, gpt_say])
res = write_results_to_file(history)
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append((f"{index + 1}段音频完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# 删除中间文件夹
import shutil
shutil.rmtree('gpt_log/mp3')
res = write_results_to_file(history)
shutil.rmtree(f"{get_log_folder(plugin_name='audio')}/mp3")
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append(("所有音频都总结完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history)
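The hunks above only show the auth header and the audio extraction via moviepy; the transcription request itself sits outside the diff context. For orientation, a call against OpenAI's public speech-to-text endpoint looks like the sketch below (the endpoint and field names are the documented OpenAI API, not something taken from this diff):

```python
import requests

def transcribe_segment(fp, api_key, proxies=None):
    # One whisper call per audio fragment produced by split_audio_file above.
    with open(fp, 'rb') as f:
        response = requests.post(
            'https://api.openai.com/v1/audio/transcriptions',
            headers={'Authorization': f'Bearer {api_key}'},
            files={'file': f},
            data={'model': 'whisper-1'},
            proxies=proxies,
        )
    return response.json().get('text', '')
```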


@@ -1,7 +1,7 @@
import glob, time, os, re
import glob, time, os, re, logging
from toolbox import update_ui, trimmed_format_exc, gen_time_str, disable_auto_promotion
from toolbox import CatchException, report_execption, write_history_to_file
from toolbox import promote_file_to_downloadzone, get_log_folder
from toolbox import CatchException, report_execption, get_log_folder
from toolbox import write_history_to_file, promote_file_to_downloadzone
fast_debug = False
class PaperFileGroup():
@@ -34,7 +34,7 @@ class PaperFileGroup():
self.sp_file_contents.append(segment)
self.sp_file_index.append(index)
self.sp_file_tag.append(self.file_paths[index] + f".part-{j}.md")
print('Segmentation: done')
logging.info('Segmentation: done')
def merge_result(self):
self.file_result = ["" for _ in range(len(self.file_paths))]
@@ -101,7 +101,7 @@ def 多文件翻译(file_manifest, project_folder, llm_kwargs, plugin_kwargs, ch
pfg.merge_result()
pfg.write_result(language)
except:
print(trimmed_format_exc())
logging.error(trimmed_format_exc())
# <-------- 整理结果,退出 ---------->
create_report_file_name = gen_time_str() + f"-chatgpt.md"
@@ -121,7 +121,7 @@ def get_files_from_everything(txt, preference=''):
proxies, = get_conf('proxies')
# 网络的远程文件
if preference == 'Github':
print('正在从github下载资源 ...')
logging.info('正在从github下载资源 ...')
if not txt.endswith('.md'):
# Make a request to the GitHub API to retrieve the repository information
url = txt.replace("https://github.com/", "https://api.github.com/repos/") + '/readme'
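The print-to-logging migration in this file pairs with the `chat_secrets.log` setup in main.py further down: once `logging.basicConfig` has run at startup, module-level calls like the ones above land in the same log file. A minimal reproduction (the file name here is illustrative):

```python
import logging

logging.basicConfig(filename='chat_secrets.log', level=logging.INFO,
                    format='%(asctime)s %(levelname)-8s %(message)s')
logging.info('Segmentation: done')            # replaces the old print(...)
logging.error('trimmed traceback goes here')  # error-level entry for the except branch
```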


@@ -1,5 +1,6 @@
from toolbox import update_ui, promote_file_to_downloadzone, gen_time_str
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import CatchException, report_execption
from toolbox import write_history_to_file, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from .crazy_utils import read_and_clean_pdf_text
from .crazy_utils import input_clipping
@@ -99,8 +100,8 @@ do not have too much repetitive information, numerical values using the original
_, final_results = input_clipping("", final_results, max_token_limit=3200)
yield from update_ui(chatbot=chatbot, history=final_results) # 注意这里的历史记录被替代了
res = write_results_to_file(file_write_buffer, file_name=gen_time_str())
promote_file_to_downloadzone(res.split('\t')[-1], chatbot=chatbot)
res = write_history_to_file(file_write_buffer)
promote_file_to_downloadzone(res, chatbot=chatbot)
yield from update_ui(chatbot=chatbot, history=final_results) # 刷新界面


@@ -1,6 +1,7 @@
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import CatchException, report_execption
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from toolbox import write_history_to_file, promote_file_to_downloadzone
fast_debug = False
@@ -115,7 +116,8 @@ def 解析Paper(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbo
chatbot[-1] = (i_say, gpt_say)
history.append(i_say); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
res = write_results_to_file(history)
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append(("完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面


@@ -1,11 +1,12 @@
from toolbox import CatchException, report_execption, gen_time_str
from toolbox import CatchException, report_execption, get_log_folder, gen_time_str
from toolbox import update_ui, promote_file_to_downloadzone, update_ui_lastest_msg, disable_auto_promotion
from toolbox import write_history_to_file, get_log_folder
from toolbox import write_history_to_file, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
from .crazy_utils import read_and_clean_pdf_text
from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url
from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url, translate_pdf
from colorful import *
import copy
import os
import math
import logging
@@ -86,186 +87,29 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
# 开始正式执行任务
yield from 解析PDF_基于NOUGAT(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
def nougat_with_timeout(command, cwd, timeout=3600):
import subprocess
process = subprocess.Popen(command, shell=True, cwd=cwd)
try:
stdout, stderr = process.communicate(timeout=timeout)
except subprocess.TimeoutExpired:
process.kill()
stdout, stderr = process.communicate()
print("Process timed out!")
return False
return True
def NOUGAT_parse_pdf(fp):
import glob
from toolbox import get_log_folder, gen_time_str
dst = os.path.join(get_log_folder(plugin_name='nougat'), gen_time_str())
os.makedirs(dst)
nougat_with_timeout(f'nougat --out "{os.path.abspath(dst)}" "{os.path.abspath(fp)}"', os.getcwd())
res = glob.glob(os.path.join(dst,'*.mmd'))
if len(res) == 0:
raise RuntimeError("Nougat解析论文失败。")
return res[0]
def 解析PDF_基于NOUGAT(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
import copy
import tiktoken
TOKEN_LIMIT_PER_FRAGMENT = 1280
TOKEN_LIMIT_PER_FRAGMENT = 1024
generated_conclusion_files = []
generated_html_files = []
DST_LANG = "中文"
from crazy_functions.crazy_utils import nougat_interface, construct_html
nougat_handle = nougat_interface()
for index, fp in enumerate(file_manifest):
chatbot.append(["当前进度:", f"正在解析论文,请稍候。第一次运行时,需要花费较长时间下载NOUGAT参数"]); yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
fpp = NOUGAT_parse_pdf(fp)
fpp = yield from nougat_handle.NOUGAT_parse_pdf(fp, chatbot, history)
promote_file_to_downloadzone(fpp, rename_file=os.path.basename(fpp)+'.nougat.mmd', chatbot=chatbot)
with open(fpp, 'r', encoding='utf8') as f:
article_content = f.readlines()
article_dict = markdown_to_dict(article_content)
logging.info(article_dict)
prompt = "以下是一篇学术论文的基本信息:\n"
# title
title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
# authors
authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
# abstract
abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
# command
prompt += f"请将题目和摘要翻译为{DST_LANG}"
meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
# 单线,获取文章meta信息
paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=prompt,
inputs_show_user=prompt,
llm_kwargs=llm_kwargs,
chatbot=chatbot, history=[],
sys_prompt="You are an academic paper reader。",
)
# 多线,翻译
inputs_array = []
inputs_show_user_array = []
# get_token_num
from request_llm.bridge_all import model_info
enc = model_info[llm_kwargs['llm_model']]['tokenizer']
def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
def break_down(txt):
raw_token_num = get_token_num(txt)
if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
return [txt]
else:
# raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
# find a smooth token limit to achieve even separation
count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
token_limit_smooth = raw_token_num // count + count
return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
for section in article_dict.get('sections'):
if len(section['text']) == 0: continue
section_frags = break_down(section['text'])
for i, fragment in enumerate(section_frags):
heading = section['heading']
if len(section_frags) > 1: heading += f' Part-{i+1}'
inputs_array.append(
f"你需要翻译{heading}章节,内容如下: \n\n{fragment}"
)
inputs_show_user_array.append(
f"# {heading}\n\n{fragment}"
)
gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
inputs_array=inputs_array,
inputs_show_user_array=inputs_show_user_array,
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history_array=[meta for _ in inputs_array],
sys_prompt_array=[
"请你作为一个学术翻译,负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
)
res_path = write_history_to_file(meta + ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=None, file_fullname=None)
promote_file_to_downloadzone(res_path, rename_file=os.path.basename(fp)+'.md', chatbot=chatbot)
generated_conclusion_files.append(res_path)
ch = construct_html()
orig = ""
trans = ""
gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
for i,k in enumerate(gpt_response_collection_html):
if i%2==0:
gpt_response_collection_html[i] = inputs_show_user_array[i//2]
else:
gpt_response_collection_html[i] = gpt_response_collection_html[i]
final = ["", "", "一、论文概况", "", "Abstract", paper_meta_info, "二、论文翻译", ""]
final.extend(gpt_response_collection_html)
for i, k in enumerate(final):
if i%2==0:
orig = k
if i%2==1:
trans = k
ch.add_row(a=orig, b=trans)
create_report_file_name = f"{os.path.basename(fp)}.trans.html"
html_file = ch.save_file(create_report_file_name)
generated_html_files.append(html_file)
promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)
yield from translate_pdf(article_dict, llm_kwargs, chatbot, fp, generated_conclusion_files, TOKEN_LIMIT_PER_FRAGMENT, DST_LANG)
chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
class construct_html():
def __init__(self) -> None:
self.css = """
.row {
display: flex;
flex-wrap: wrap;
}
.column {
flex: 1;
padding: 10px;
}
.table-header {
font-weight: bold;
border-bottom: 1px solid black;
}
.table-row {
border-bottom: 1px solid lightgray;
}
.table-cell {
padding: 5px;
}
"""
self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
def add_row(self, a, b):
tmp = """
<div class="row table-row">
<div class="column table-cell">REPLACE_A</div>
<div class="column table-cell">REPLACE_B</div>
</div>
"""
from toolbox import markdown_convertion
tmp = tmp.replace('REPLACE_A', markdown_convertion(a))
tmp = tmp.replace('REPLACE_B', markdown_convertion(b))
self.html_string += tmp
def save_file(self, file_name):
with open(os.path.join(get_log_folder(), file_name), 'w', encoding='utf8') as f:
f.write(self.html_string.encode('utf-8', 'ignore').decode())
return os.path.join(get_log_folder(), file_name)
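Both PDF plugins previously carried the `break_down` helper removed above (the logic now lives behind `pdf_fns.parse_pdf.translate_pdf`). Instead of greedily filling each fragment up to the limit, it first computes how many fragments are needed and then derives a smoothed per-fragment limit, so the pieces come out nearly equal. A worked example of the arithmetic:

```python
import math

TOKEN_LIMIT_PER_FRAGMENT = 1024
raw_token_num = 3000  # tokens in one oversized section

count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))  # -> 3 fragments
token_limit_smooth = raw_token_num // count + count               # -> 1000 + 3 = 1003

# Splitting at ~1003 tokens gives three fragments of comparable size,
# instead of 1024 + 1024 + 952 under a greedy split.
assert count == 3 and token_limit_smooth == 1003
```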


@@ -1,12 +1,12 @@
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import CatchException, report_execption, get_log_folder, gen_time_str
from toolbox import update_ui, promote_file_to_downloadzone, update_ui_lastest_msg, disable_auto_promotion
from toolbox import write_history_to_file, get_log_folder
from toolbox import write_history_to_file, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
from .crazy_utils import read_and_clean_pdf_text
from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url
from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url, translate_pdf
from colorful import *
import glob
import copy
import os
import math
@@ -58,114 +58,35 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
def 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, grobid_url):
import copy
TOKEN_LIMIT_PER_FRAGMENT = 1280
import copy, json
TOKEN_LIMIT_PER_FRAGMENT = 1024
generated_conclusion_files = []
generated_html_files = []
DST_LANG = "中文"
from crazy_functions.crazy_utils import construct_html
for index, fp in enumerate(file_manifest):
chatbot.append(["当前进度:", f"正在连接GROBID服务,请稍候: {grobid_url}\n如果等待时间过长,请修改config中的GROBID_URL,可修改成本地GROBID服务。"]); yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
article_dict = parse_pdf(fp, grobid_url)
grobid_json_res = os.path.join(get_log_folder(), gen_time_str() + "grobid.json")
with open(grobid_json_res, 'w+', encoding='utf8') as f:
f.write(json.dumps(article_dict, indent=4, ensure_ascii=False))
promote_file_to_downloadzone(grobid_json_res, chatbot=chatbot)
if article_dict is None: raise RuntimeError("解析PDF失败,请检查PDF是否损坏。")
prompt = "以下是一篇学术论文的基本信息:\n"
# title
title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
# authors
authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
# abstract
abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
# command
prompt += f"请将题目和摘要翻译为{DST_LANG}"
meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
# 单线,获取文章meta信息
paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
inputs=prompt,
inputs_show_user=prompt,
llm_kwargs=llm_kwargs,
chatbot=chatbot, history=[],
sys_prompt="You are an academic paper reader。",
)
# 多线,翻译
inputs_array = []
inputs_show_user_array = []
# get_token_num
from request_llm.bridge_all import model_info
enc = model_info[llm_kwargs['llm_model']]['tokenizer']
def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
def break_down(txt):
raw_token_num = get_token_num(txt)
if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
return [txt]
else:
# raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
# find a smooth token limit to achieve even separation
count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
token_limit_smooth = raw_token_num // count + count
return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
for section in article_dict.get('sections'):
if len(section['text']) == 0: continue
section_frags = break_down(section['text'])
for i, fragment in enumerate(section_frags):
heading = section['heading']
if len(section_frags) > 1: heading += f' Part-{i+1}'
inputs_array.append(
f"你需要翻译{heading}章节,内容如下: \n\n{fragment}"
)
inputs_show_user_array.append(
f"# {heading}\n\n{fragment}"
)
gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
inputs_array=inputs_array,
inputs_show_user_array=inputs_show_user_array,
llm_kwargs=llm_kwargs,
chatbot=chatbot,
history_array=[meta for _ in inputs_array],
sys_prompt_array=[
"请你作为一个学术翻译,负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
)
res_path = write_history_to_file(meta + ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=None, file_fullname=None)
promote_file_to_downloadzone(res_path, rename_file=os.path.basename(fp)+'.md', chatbot=chatbot)
generated_conclusion_files.append(res_path)
ch = construct_html()
orig = ""
trans = ""
gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
for i,k in enumerate(gpt_response_collection_html):
if i%2==0:
gpt_response_collection_html[i] = inputs_show_user_array[i//2]
else:
gpt_response_collection_html[i] = gpt_response_collection_html[i]
final = ["", "", "一、论文概况", "", "Abstract", paper_meta_info, "二、论文翻译", ""]
final.extend(gpt_response_collection_html)
for i, k in enumerate(final):
if i%2==0:
orig = k
if i%2==1:
trans = k
ch.add_row(a=orig, b=trans)
create_report_file_name = f"{os.path.basename(fp)}.trans.html"
html_file = ch.save_file(create_report_file_name)
generated_html_files.append(html_file)
promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)
yield from translate_pdf(article_dict, llm_kwargs, chatbot, fp, generated_conclusion_files, TOKEN_LIMIT_PER_FRAGMENT, DST_LANG)
chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
"""
此函数已经弃用
"""
import copy
TOKEN_LIMIT_PER_FRAGMENT = 1280
TOKEN_LIMIT_PER_FRAGMENT = 1024
generated_conclusion_files = []
generated_html_files = []
from crazy_functions.crazy_utils import construct_html
for index, fp in enumerate(file_manifest):
# 读取PDF文件
file_content, page_one = read_and_clean_pdf_text(fp)
@@ -216,10 +137,11 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot,
final = ["一、论文概况\n\n---\n\n", paper_meta_info.replace('# ', '### ') + '\n\n---\n\n', "二、论文翻译", ""]
final.extend(gpt_response_collection_md)
create_report_file_name = f"{os.path.basename(fp)}.trans.md"
res = write_results_to_file(final, file_name=create_report_file_name)
res = write_history_to_file(final, create_report_file_name)
promote_file_to_downloadzone(res, chatbot=chatbot)
# 更新UI
generated_conclusion_files.append(f'./gpt_log/{create_report_file_name}')
generated_conclusion_files.append(f'{get_log_folder()}/{create_report_file_name}')
chatbot.append((f"{fp}完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
@@ -261,49 +183,3 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot,
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
class construct_html():
def __init__(self) -> None:
self.css = """
.row {
display: flex;
flex-wrap: wrap;
}
.column {
flex: 1;
padding: 10px;
}
.table-header {
font-weight: bold;
border-bottom: 1px solid black;
}
.table-row {
border-bottom: 1px solid lightgray;
}
.table-cell {
padding: 5px;
}
"""
self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
def add_row(self, a, b):
tmp = """
<div class="row table-row">
<div class="column table-cell">REPLACE_A</div>
<div class="column table-cell">REPLACE_B</div>
</div>
"""
from toolbox import markdown_convertion
tmp = tmp.replace('REPLACE_A', markdown_convertion(a))
tmp = tmp.replace('REPLACE_B', markdown_convertion(b))
self.html_string += tmp
def save_file(self, file_name):
with open(os.path.join(get_log_folder(), file_name), 'w', encoding='utf8') as f:
f.write(self.html_string.encode('utf-8', 'ignore').decode())
return os.path.join(get_log_folder(), file_name)
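The `construct_html` class deleted here (and in the NOUGAT variant above) now lives in `crazy_functions.crazy_utils`, which both plugins already import. Its call pattern, reconstructed from the removed call sites, is simply:

```python
# Usage pattern as seen in the removed call sites above.
ch = construct_html()
ch.add_row(a="# Introduction\n\nOriginal paragraph ...",  # left column: source text
           b="# 引言\n\n翻译后的段落 ...")                  # right column: translation
html_file = ch.save_file("demo.trans.html")               # written under get_log_folder()
```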


@@ -1,5 +1,6 @@
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import CatchException, report_execption
from toolbox import write_history_to_file, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
fast_debug = False
@@ -27,7 +28,8 @@ def 生成函数注释(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
if not fast_debug: time.sleep(2)
if not fast_debug:
res = write_results_to_file(history)
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append(("完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面


@@ -46,7 +46,7 @@ explain_msg = """
from pydantic import BaseModel, Field
from typing import List
from toolbox import CatchException, update_ui, gen_time_str
from toolbox import CatchException, update_ui, is_the_upload_folder
from toolbox import update_ui_lastest_msg, disable_auto_promotion
from request_llm.bridge_all import predict_no_ui_long_connection
from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
@@ -112,7 +112,7 @@ def 虚空终端(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt
# 用简单的关键词检测用户意图
is_certain, _ = analyze_intention_with_simple_rules(txt)
if txt.startswith('private_upload/') and len(txt) == 34:
if is_the_upload_folder(txt):
state.set_state(chatbot=chatbot, key='has_provided_explaination', value=False)
appendix_msg = "\n\n**很好,您已经上传了文件**,现在请您描述您的需求。"
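`is_the_upload_folder` replaces the brittle `txt.startswith('private_upload/') and len(txt) == 34` heuristic, which silently broke whenever the upload path format changed; the same helper is reused in the dynamic-code plugin above and in bridge_chatgpt.py below. Its implementation is not part of this diff; conceptually it needs to do little more than this hypothetical check:

```python
import os

def is_the_upload_folder(string):
    # Hypothetical sketch: the real logic lives in toolbox and may differ in detail.
    # PATH_PRIVATE_UPLOAD is the configurable upload root introduced in this comparison.
    PATH_PRIVATE_UPLOAD = 'private_upload'
    return string.startswith(PATH_PRIVATE_UPLOAD + '/') and os.path.isdir(string)
```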


@@ -1,5 +1,6 @@
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import CatchException, report_execption
from toolbox import write_history_to_file, promote_file_to_downloadzone
fast_debug = True
@@ -110,7 +111,8 @@ def ipynb解释(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbo
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
# <-------- 写入文件,退出 ---------->
res = write_results_to_file(history)
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append(("完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面


@@ -23,7 +23,7 @@ def 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
file_content = f.read()
prefix = "接下来请你逐文件分析下面的工程" if index==0 else ""
i_say = prefix + f'请对下面的程序文件做一个概述文件名是{os.path.relpath(fp, project_folder)},文件代码是 ```{file_content}```'
i_say_show_user = prefix + f'[{index}/{len(file_manifest)}] 请对下面的程序文件做一个概述: {os.path.abspath(fp)}'
i_say_show_user = prefix + f'[{index}/{len(file_manifest)}] 请对下面的程序文件做一个概述: {fp}'
# 装载请求内容
inputs_array.append(i_say)
inputs_show_user_array.append(i_say_show_user)
@@ -109,9 +109,8 @@ def 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
def 解析项目本身(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
history = [] # 清空历史,以免输入溢出
import glob
file_manifest = [f for f in glob.glob('./*.py') if ('test_project' not in f) and ('gpt_log' not in f)] + \
[f for f in glob.glob('./crazy_functions/*.py') if ('test_project' not in f) and ('gpt_log' not in f)]+ \
[f for f in glob.glob('./request_llm/*.py') if ('test_project' not in f) and ('gpt_log' not in f)]
file_manifest = [f for f in glob.glob('./*.py')] + \
[f for f in glob.glob('./*/*.py')]
project_folder = './'
if len(file_manifest) == 0:
report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到任何python文件: {txt}")
@@ -137,6 +136,23 @@ def 解析一个Python项目(txt, llm_kwargs, plugin_kwargs, chatbot, history, s
return
yield from 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
@CatchException
def 解析一个Matlab项目(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
history = [] # 清空历史,以免输入溢出
import glob, os
if os.path.exists(txt):
project_folder = txt
else:
if txt == "": txt = '空空如也的输入栏'
report_execption(chatbot, history, a = f"解析Matlab项目: {txt}", b = f"找不到本地项目或无权访问: {txt}")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.m', recursive=True)]
if len(file_manifest) == 0:
report_execption(chatbot, history, a = f"解析Matlab项目: {txt}", b = f"找不到任何`.m`源文件: {txt}")
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
return
yield from 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
@CatchException
def 解析一个C项目的头文件(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
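The new `解析一个Matlab项目` entry point mirrors the existing per-language parsers: validate the path, glob `**/*.m`, then delegate to `解析源代码新`. For it to appear in the UI (the `Matlab项目解析插件的Shortcut` commit), it must also be registered in the plugin catalog; that file is not part of this excerpt, but registration in this project follows a dictionary pattern along these hypothetical lines:

```python
# Hypothetical registration entry; the actual crazy_functional.py hunk is not shown here.
from toolbox import HotReload
from crazy_functions.解析项目源代码 import 解析一个Matlab项目

function_plugins = {
    "解析整个Matlab项目": {
        "Color": "stone",   # assumed button color
        "AsButton": False,  # reachable from the plugin dropdown rather than a fixed button
        "Function": HotReload(解析一个Matlab项目),
    },
}
```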


@@ -1,7 +1,7 @@
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file
from toolbox import CatchException, report_execption
from toolbox import write_history_to_file, promote_file_to_downloadzone
from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
fast_debug = False
def 解析Paper(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
@@ -17,32 +17,29 @@ def 解析Paper(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbo
chatbot.append((i_say_show_user, "[Local Message] waiting gpt response."))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
if not fast_debug:
msg = '正常'
# ** gpt request **
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(i_say, i_say_show_user, llm_kwargs, chatbot, history=[], sys_prompt=system_prompt) # 带超时倒计时
chatbot[-1] = (i_say_show_user, gpt_say)
history.append(i_say_show_user); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
if not fast_debug: time.sleep(2)
msg = '正常'
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(i_say, i_say_show_user, llm_kwargs, chatbot, history=[], sys_prompt=system_prompt) # 带超时倒计时
chatbot[-1] = (i_say_show_user, gpt_say)
history.append(i_say_show_user); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
time.sleep(2)
all_file = ', '.join([os.path.relpath(fp, project_folder) for index, fp in enumerate(file_manifest)])
i_say = f'根据以上你自己的分析,对全文进行概括,用学术性语言写一段中文摘要,然后再写一段英文摘要(包括{all_file})。'
chatbot.append((i_say, "[Local Message] waiting gpt response."))
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
if not fast_debug:
msg = '正常'
# ** gpt request **
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(i_say, i_say, llm_kwargs, chatbot, history=history, sys_prompt=system_prompt) # 带超时倒计时
msg = '正常'
# ** gpt request **
gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(i_say, i_say, llm_kwargs, chatbot, history=history, sys_prompt=system_prompt) # 带超时倒计时
chatbot[-1] = (i_say, gpt_say)
history.append(i_say); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
res = write_results_to_file(history)
chatbot.append(("完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
chatbot[-1] = (i_say, gpt_say)
history.append(i_say); history.append(gpt_say)
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
res = write_history_to_file(history)
promote_file_to_downloadzone(res, chatbot=chatbot)
chatbot.append(("完成了吗?", res))
yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面


@@ -2,8 +2,8 @@
# @Time : 2023/4/19
# @Author : Spike
# @Descr :
from toolbox import update_ui
from toolbox import CatchException, report_execption, write_results_to_file, get_log_folder
from toolbox import update_ui, get_conf
from toolbox import CatchException
from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
@@ -30,14 +30,13 @@ def 猜你想问(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt
@CatchException
def 清除缓存(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
chatbot.append(['清除本地缓存数据', '执行中. 删除 gpt_log & private_upload'])
chatbot.append(['清除本地缓存数据', '执行中. 删除数据'])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
import shutil, os
gpt_log_dir = os.path.join(os.path.dirname(__file__), '..', 'gpt_log')
private_upload_dir = os.path.join(os.path.dirname(__file__), '..', 'private_upload')
shutil.rmtree(gpt_log_dir, ignore_errors=True)
shutil.rmtree(private_upload_dir, ignore_errors=True)
PATH_PRIVATE_UPLOAD, PATH_LOGGING = get_conf('PATH_PRIVATE_UPLOAD', 'PATH_LOGGING')
shutil.rmtree(PATH_LOGGING, ignore_errors=True)
shutil.rmtree(PATH_PRIVATE_UPLOAD, ignore_errors=True)
chatbot.append(['清除本地缓存数据', '执行完成'])
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面


@@ -1,5 +1,54 @@
#【请修改完参数后,删除此行】请在以下方案中选择一种,然后删除其他的方案,最后docker-compose up运行 | Please choose one of the options below, delete the others as well as this line, then run docker-compose up
## ===================================================
## 【方案零】 部署项目的全部能力这个是包含cuda和latex的大型镜像。如果您网速慢、硬盘小或没有显卡,则不推荐使用这个
## ===================================================
version: '3'
services:
gpt_academic_full_capability:
image: ghcr.io/binary-husky/gpt_academic_with_all_capacity:master
environment:
# 请查阅 `config.py`或者 github wiki 以查看所有的配置信息
API_KEY: ' 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 '
# USE_PROXY: ' True '
# proxies: ' { "http": "http://localhost:10881", "https": "http://localhost:10881", } '
LLM_MODEL: ' gpt-3.5-turbo '
AVAIL_LLM_MODELS: ' ["gpt-3.5-turbo", "gpt-4", "qianfan", "sparkv2", "spark", "chatglm"] '
BAIDU_CLOUD_API_KEY : ' bTUtwEAveBrQipEowUvDwYWq '
BAIDU_CLOUD_SECRET_KEY : ' jqXtLvXiVw6UNdjliATTS61rllG8Iuni '
XFYUN_APPID: ' 53a8d816 '
XFYUN_API_SECRET: ' MjMxNDQ4NDE4MzM0OSNlNjQ2NTlhMTkx '
XFYUN_API_KEY: ' 95ccdec285364869d17b33e75ee96447 '
ENABLE_AUDIO: ' False '
DEFAULT_WORKER_NUM: ' 20 '
WEB_PORT: ' 12345 '
ADD_WAIFU: ' False '
ALIYUN_APPKEY: ' RxPlZrM88DnAFkZK '
THEME: ' Chuanhu-Small-and-Beautiful '
ALIYUN_ACCESSKEY: ' LTAI5t6BrFUzxRXVGUWnekh1 '
ALIYUN_SECRET: ' eHmI20SVWIwQZxCiTD2bGQVspP9i68 '
# LOCAL_MODEL_DEVICE: ' cuda '
# 加载英伟达显卡运行时
# runtime: nvidia
# deploy:
# resources:
# reservations:
# devices:
# - driver: nvidia
# count: 1
# capabilities: [gpu]
# 与宿主的网络融合
network_mode: "host"
# 不使用代理网络拉取最新代码
command: >
bash -c "python3 -u main.py"
## ===================================================
## 【方案一】 如果不需要运行本地模型(仅 chatgpt, azure, 星火, 千帆, claude 等在线大模型服务)
## ===================================================


@@ -12,22 +12,21 @@ RUN python3 -m pip install torch --extra-index-url https://download.pytorch.org/
RUN python3 -m pip install openai numpy arxiv rich
RUN python3 -m pip install colorama Markdown pygments pymupdf
RUN python3 -m pip install python-docx moviepy pdfminer
RUN python3 -m pip install zh_langchain==0.2.1
RUN python3 -m pip install nougat-ocr
RUN python3 -m pip install zh_langchain==0.2.1 pypinyin
RUN python3 -m pip install rarfile py7zr
RUN python3 -m pip install aliyun-python-sdk-core==2.13.3 pyOpenSSL scipy git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git
# 下载分支
WORKDIR /gpt
RUN git clone --depth=1 https://github.com/binary-husky/gpt_academic.git
WORKDIR /gpt/gpt_academic
RUN git clone https://github.com/OpenLMLab/MOSS.git request_llm/moss
RUN git clone --depth=1 https://github.com/OpenLMLab/MOSS.git request_llm/moss
RUN python3 -m pip install -r requirements.txt
RUN python3 -m pip install -r request_llm/requirements_moss.txt
RUN python3 -m pip install -r request_llm/requirements_qwen.txt
RUN python3 -m pip install -r request_llm/requirements_chatglm.txt
RUN python3 -m pip install -r request_llm/requirements_newbing.txt
RUN python3 -m pip install nougat-ocr
# 预热Tiktoken模块


@@ -299,7 +299,6 @@
"地址🚀": "Address 🚀",
"感谢热情的": "Thanks to the enthusiastic",
"开发者们❤️": "Developers ❤️",
"所有问询记录将自动保存在本地目录./gpt_log/chat_secrets.log": "All inquiry records will be automatically saved in the local directory ./gpt_log/chat_secrets.log",
"请注意自我隐私保护哦!": "Please pay attention to self-privacy protection!",
"当前模型": "Current model",
"输入区": "Input area",
@@ -892,7 +891,6 @@
"保存当前对话": "Save current conversation",
"您可以调用“LoadConversationHistoryArchive”还原当下的对话": "You can call 'LoadConversationHistoryArchive' to restore the current conversation",
"警告!被保存的对话历史可以被使用该系统的任何人查阅": "Warning! The saved conversation history can be viewed by anyone using this system",
"gpt_log/**/chatGPT对话历史*.html": "gpt_log/**/chatGPT conversation history *.html",
"正在查找对话历史文件": "Looking for conversation history file",
"html格式": "HTML format",
"找不到任何html文件": "No HTML files found",
@@ -908,7 +906,6 @@
"pip install pywin32 用于doc格式": "pip install pywin32 for doc format",
"仅支持Win平台": "Only supports Win platform",
"打开文件": "Open file",
"private_upload里面的文件名在解压zip后容易出现乱码": "The file name in private_upload is prone to garbled characters after unzipping",
"rar和7z格式正常": "RAR and 7z formats are normal",
"故可以只分析文章内容": "So you can only analyze the content of the article",
"不输入文件名": "Do not enter the file name",
@@ -1364,7 +1361,6 @@
"注意文章中的每一句话都要翻译": "Please translate every sentence in the article",
"一、论文概况": "I. Overview of the paper",
"二、论文翻译": "II. Translation of the paper",
"/gpt_log/总结论文-": "/gpt_log/Summary of the paper-",
"给出输出文件清单": "Provide a list of output files",
"第 0 步": "Step 0",
"切割PDF": "Split PDF",
@@ -1564,7 +1560,6 @@
"广义速度": "Generalized velocity",
"粒子的固有": "Intrinsic of particle",
"一个包含所有切割音频片段文件路径的列表": "A list containing the file paths of all segmented audio clips",
"/gpt_log/翻译-": "Translation log-",
"计算文件总时长和切割点": "Calculate total duration and cutting points of the file",
"总结音频": "Summarize audio",
"作者": "Author",
@@ -2339,7 +2334,6 @@
"将文件拖动到文件上传区": "Drag and drop the file to the file upload area",
"如果意图模糊": "If the intent is ambiguous",
"星火认知大模型": "Spark Cognitive Big Model",
"执行中. 删除 gpt_log & private_upload": "Executing. Delete gpt_log & private_upload",
"默认 Color = secondary": "Default Color = secondary",
"此处也不需要修改": "No modification is needed here",
"⭐ ⭐ ⭐ 分析用户意图": "⭐ ⭐ ⭐ Analyze user intent",
@@ -2449,48 +2443,75 @@
"├── CODE_HIGHLIGHT 代码高亮": "├── CODE_HIGHLIGHT Code highlighting",
"记得用插件": "Remember to use the plugin",
"谨慎操作": "Handle with caution",
"请检查PDF是否损坏": "#",
"执行成功了": "#",
"请在输入框内填写需求": "#",
"结果": "#",
"开始干正事": "#",
"次代码生成尝试": "#",
"代码生成结束": "#",
"Nougat解析论文失败": "#",
"受到google限制": "#",
"收尾": "#",
"结果是一个有效文件": "#",
"然后再次点击该插件": "#",
"用插件实现」": "#",
"文件路径": "#",
"仅供测试": "#",
"将csv文件转excel表格": "#",
"开始执行": "#",
"测试": "#",
"睡一会防止触发google反爬虫": "#",
"某段话的整个句子": "#",
"使用tex格式公式 测试2 给出柯西不等式": "#",
"找不到本地项目或无法处理": "#",
"交换图像的蓝色通道和红色通道": "#",
"第三步": "#",
"返回给定的url解析出的arxiv_id": "#",
"裁剪图像": "#",
"已经被记忆": "#",
"无法从bing获取信息": "#",
"可能触发了google反爬虫机制": "#",
"检索文章的历史版本的题目": "#",
"请配置讯飞星火大模型的XFYUN_APPID": "#",
"执行失败了": "#",
"需要花费较长时间下载NOUGAT参数": "#",
"请检查": "#",
"写入": "#",
"下个句子中已经说完的部分": "#",
"精准翻译PDF文档": "#",
"解析python源代码项目": "#",
"首先在arxiv上搜索": "#",
"错误追踪": "#",
"结果是一个字符串": "#",
"由 test_on_sentence_end": "#",
"获取文章摘要": "#",
"受到bing限制": "#"
"private_upload里面的文件名在解压zip后容易出现乱码": "The file name inside private_upload is prone to garbled characters after unzipping",
"直接返回报错": "Direct return error",
"临时的上传文件夹位置": "Temporary upload folder location",
"使用latex格式 测试3 写出麦克斯韦方程组": "Write Maxwell's equations using latex format for test 3",
"这是一张图片": "This is an image",
"没有发现任何近期上传的文件": "No recent uploaded files found",
"如url未成功匹配返回None": "Return None if the URL does not match successfully",
"如果有Latex环境": "If there is a Latex environment",
"第一次运行时": "When running for the first time",
"创建工作路径": "Create a working directory",
"": "To",
"执行中. 删除数据": "Executing. Deleting data",
"CodeInterpreter开源版": "CodeInterpreter open source version",
"建议选择更稳定的接口": "It is recommended to choose a more stable interface",
"现在您点击任意函数插件时": "Now when you click on any function plugin",
"请使用“LatexEnglishCorrection+高亮”插件": "Please use the 'LatexEnglishCorrection+Highlight' plugin",
"安装完成": "Installation completed",
"记得用插件!」": "Remember to use the plugin!",
"结论": "Conclusion",
"无法下载资源": "Unable to download resources",
"首先排除一个one-api没有done数据包的第三方Bug情形": "First exclude a third-party bug where one-api does not have a done data package",
"知识库中添加文件": "Add files to the knowledge base",
"处理重名的章节": "Handling duplicate chapter names",
"先上传文件素材": "Upload file materials first",
"无法从google获取信息": "Unable to retrieve information from Google!",
"展示如下": "Display as follows",
"「把Arxiv论文翻译成中文PDF": "Translate Arxiv papers into Chinese PDF",
"论文我刚刚放到上传区了」": "I just put the paper in the upload area",
"正在下载Gradio主题": "Downloading Gradio themes",
"再运行此插件": "Run this plugin again",
"记录近期文件": "Record recent files",
"粗心检查": "Careful check",
"更多主题": "More themes",
"//huggingface.co/spaces/gradio/theme-gallery 可选": "//huggingface.co/spaces/gradio/theme-gallery optional",
"由 test_on_result_chg": "By test_on_result_chg",
"所有问询记录将自动保存在本地目录./": "All inquiry records will be automatically saved in the local directory ./",
"正在解析论文": "Analyzing the paper",
"逐个文件转移到目标路径": "Move each file to the target path",
"最多重试5次": "Retry up to 5 times",
"日志文件夹的位置": "Location of the log folder",
"我们暂时无法解析此PDF文档": "We are temporarily unable to parse this PDF document",
"文件检索": "File retrieval",
"/**/chatGPT对话历史*.html": "/**/chatGPT conversation history*.html",
"非OpenAI官方接口返回了错误": "Non-OpenAI official interface returned an error",
"如果在Arxiv上匹配失败": "If the match fails on Arxiv",
"文件进入知识库后可长期保存": "Files can be saved for a long time after entering the knowledge base",
"您可以再次重试": "You can try again",
"整理文件集合": "Organize file collection",
"检测到有缺陷的非OpenAI官方接口": "Detected defective non-OpenAI official interface",
"此插件不调用Latex": "This plugin does not call Latex",
"移除过时的旧文件从而节省空间&保护隐私": "Remove outdated old files to save space & protect privacy",
"代码我刚刚打包拖到上传区了」": "I just packed the code and dragged it to the upload area",
"将图像转为灰度图像": "Convert the image to grayscale",
"待排除": "To be excluded",
"请勿修改": "Please do not modify",
"crazy_functions/代码重写为全英文_多线程.py": "crazy_functions/code rewritten to all English_multi-threading.py",
"开发中": "Under development",
"请查阅Gradio主题商店": "Please refer to the Gradio theme store",
"输出消息": "Output message",
"其他情况": "Other situations",
"获取文献失败": "Failed to retrieve literature",
"可以通过再次调用本插件的方式": "You can use this plugin again by calling it",
"保留下半部分": "Keep the lower half",
"排除问题": "Exclude the problem",
"知识库": "Knowledge base",
"ParsePDF失败": "ParsePDF failed",
"向知识库追加更多文档": "Append more documents to the knowledge base",
"此处待注入的知识库名称id": "The knowledge base name ID to be injected here",
"您需要构建知识库后再运行此插件": "You need to build the knowledge base before running this plugin",
"判定是否为公式 | 测试1 写出洛伦兹定律": "Determine whether it is a formula | Test 1 write out the Lorentz law",
"构建知识库后": "After building the knowledge base"
}


@@ -301,7 +301,6 @@
"缺少的依赖": "不足している依存関係",
"紫色": "紫色",
"唤起高级参数输入区": "高度なパラメータ入力エリアを呼び出す",
"所有问询记录将自动保存在本地目录./gpt_log/chat_secrets.log": "すべての問い合わせ記録は自動的にローカルディレクトリ./gpt_log/chat_secrets.logに保存されます",
"则换行符更有可能表示段落分隔": "したがって、改行記号は段落の区切りを表す可能性がより高いです",
";4、引用数量": ";4、引用数量",
"中转网址预览": "中継ウェブサイトのプレビュー",
@@ -448,7 +447,6 @@
"表示函数是否成功执行": "関数が正常に実行されたかどうかを示す",
"一般原样传递下去就行": "通常はそのまま渡すだけでよい",
"琥珀色": "琥珀色",
"gpt_log/**/chatGPT对话历史*.html": "gpt_log/**/chatGPT対話履歴*.html",
"jittorllms 没有 sys_prompt 接口": "jittorllmsにはsys_promptインターフェースがありません",
"清除": "クリア",
"小于正文的": "本文より小さい",
@@ -1234,7 +1232,6 @@
"找不到任何前端相关文件": "No frontend-related files can be found",
"Not enough point. API2D账户点数不足": "Not enough points. API2D account points are insufficient",
"当前版本": "Current version",
"/gpt_log/总结论文-": "/gpt_log/Summary paper-",
"1. 临时解决方案": "1. Temporary solution",
"第8步": "Step 8",
"历史": "History",


@@ -314,7 +314,6 @@
"请用markdown格式输出": "請用 Markdown 格式輸出",
"模仿ChatPDF": "模仿 ChatPDF",
"等待多久判定为超时": "等待多久判定為超時",
"/gpt_log/总结论文-": "/gpt_log/總結論文-",
"请结合互联网信息回答以下问题": "請結合互聯網信息回答以下問題",
"IP查询频率受限": "IP查詢頻率受限",
"高级参数输入区的显示提示": "高級參數輸入區的顯示提示",
@@ -511,7 +510,6 @@
"將生成的報告自動投射到文件上傳區": "將生成的報告自動上傳到文件區",
"函數插件作者": "函數插件作者",
"將要匹配的模式": "將要匹配的模式",
"所有问询记录将自动保存在本地目录./gpt_log/chat_secrets.log": "所有詢問記錄將自動保存在本地目錄./gpt_log/chat_secrets.log",
"正在分析一个项目的源代码": "正在分析一個專案的源代碼",
"使每个段落之间有两个换行符分隔": "使每個段落之間有兩個換行符分隔",
"并在被装饰的函数上执行": "並在被裝飾的函數上執行",
@@ -1059,7 +1057,6 @@
"重试中": "重試中",
"月": "月份",
"localhost意思是代理软件安装在本机上": "localhost意思是代理軟體安裝在本機上",
"gpt_log/**/chatGPT对话历史*.html": "gpt_log/**/chatGPT對話歷史*.html",
"的长度必须小于 2500 个 Token": "長度必須小於 2500 個 Token",
"抽取可用的api-key": "提取可用的api-key",
"增强报告的可读性": "增強報告的可讀性",


@@ -107,6 +107,12 @@ AZURE_API_KEY = "填入azure openai api的密钥"
AZURE_API_VERSION = "2023-05-15" # 默认使用 2023-05-15 版本,无需修改
AZURE_ENGINE = "填入部署名" # 见上述图片
# 例如
API_KEY = '6424e9d19e674092815cea1cb35e67a5'
AZURE_ENDPOINT = 'https://rhtjjjjjj.openai.azure.com/'
AZURE_ENGINE = 'qqwe'
LLM_MODEL = "azure-gpt-3.5" # 可选 ↓↓↓
```

main.py

@@ -8,12 +8,13 @@ def main():
# 建议您复制一个config_private.py放自己的秘密, 如API和代理网址, 避免不小心传github被别人看到
proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION = get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION')
CHATBOT_HEIGHT, LAYOUT, AVAIL_LLM_MODELS, AUTO_CLEAR_TXT = get_conf('CHATBOT_HEIGHT', 'LAYOUT', 'AVAIL_LLM_MODELS', 'AUTO_CLEAR_TXT')
ENABLE_AUDIO, AUTO_CLEAR_TXT = get_conf('ENABLE_AUDIO', 'AUTO_CLEAR_TXT')
ENABLE_AUDIO, AUTO_CLEAR_TXT, PATH_LOGGING, AVAIL_THEMES, THEME = get_conf('ENABLE_AUDIO', 'AUTO_CLEAR_TXT', 'PATH_LOGGING', 'AVAIL_THEMES', 'THEME')
# 如果WEB_PORT是-1, 则随机选取WEB端口
PORT = find_free_port() if WEB_PORT <= 0 else WEB_PORT
from check_proxy import get_current_version
from themes.theme import adjust_theme, advanced_css, theme_declaration
from themes.theme import adjust_theme, advanced_css, theme_declaration, load_dynamic_theme
initial_prompt = "Serve me as a writing and programming assistant."
title_html = f"<h1 align=\"center\">GPT 学术优化 {get_current_version()}</h1>{theme_declaration}"
description = "代码开源和更新[地址🚀](https://github.com/binary-husky/gpt_academic),"
@@ -21,12 +22,12 @@ def main():
# 问询记录, python 版本建议3.9+(越新越好)
import logging, uuid
os.makedirs("gpt_log", exist_ok=True)
try:logging.basicConfig(filename="gpt_log/chat_secrets.log", level=logging.INFO, encoding="utf-8", format="%(asctime)s %(levelname)-8s %(message)s", datefmt="%Y-%m-%d %H:%M:%S")
except:logging.basicConfig(filename="gpt_log/chat_secrets.log", level=logging.INFO, format="%(asctime)s %(levelname)-8s %(message)s", datefmt="%Y-%m-%d %H:%M:%S")
os.makedirs(PATH_LOGGING, exist_ok=True)
try:logging.basicConfig(filename=f"{PATH_LOGGING}/chat_secrets.log", level=logging.INFO, encoding="utf-8", format="%(asctime)s %(levelname)-8s %(message)s", datefmt="%Y-%m-%d %H:%M:%S")
except:logging.basicConfig(filename=f"{PATH_LOGGING}/chat_secrets.log", level=logging.INFO, format="%(asctime)s %(levelname)-8s %(message)s", datefmt="%Y-%m-%d %H:%M:%S")
# Disable logging output from the 'httpx' logger
logging.getLogger("httpx").setLevel(logging.WARNING)
print("所有问询记录将自动保存在本地目录./gpt_log/chat_secrets.log, 请注意自我隐私保护哦!")
print(f"所有问询记录将自动保存在本地目录./{PATH_LOGGING}/chat_secrets.log, 请注意自我隐私保护哦!")
# 一些普通功能模块
from core_functional import get_core_functions
@@ -59,6 +60,7 @@ def main():
cancel_handles = []
with gr.Blocks(title="GPT 学术优化", theme=set_theme, analytics_enabled=False, css=advanced_css) as demo:
gr.HTML(title_html)
secret_css, secret_font = gr.Textbox(visible=False), gr.Textbox(visible=False)
cookies = gr.State(load_chat_cookies())
with gr_L1():
with gr_L2(scale=2, elem_id="gpt-chat"):
@@ -123,6 +125,16 @@ def main():
max_length_sl = gr.Slider(minimum=256, maximum=8192, value=4096, step=1, interactive=True, label="Local LLM MaxLength",)
checkboxes = gr.CheckboxGroup(["基础功能区", "函数插件区", "底部输入区", "输入清除键", "插件参数区"], value=["基础功能区", "函数插件区"], label="显示/隐藏功能区")
md_dropdown = gr.Dropdown(AVAIL_LLM_MODELS, value=LLM_MODEL, label="更换LLM模型/请求源").style(container=False)
theme_dropdown = gr.Dropdown(AVAIL_THEMES, value=THEME, label="更换UI主题").style(container=False)
dark_mode_btn = gr.Button("切换界面明暗 ☀", variant="secondary").style(size="sm")
dark_mode_btn.click(None, None, None, _js="""() => {
if (document.querySelectorAll('.dark').length) {
document.querySelectorAll('.dark').forEach(el => el.classList.remove('dark'));
} else {
document.querySelector('body').classList.add('dark');
}
}""",
)
gr.Markdown(description)
with gr.Accordion("备选输入区", open=True, visible=False, elem_id="input-panel2") as area_input_secondary:
with gr.Row():
@@ -150,7 +162,7 @@ def main():
# 整理反复出现的控件句柄组合
input_combo = [cookies, max_length_sl, md_dropdown, txt, txt2, top_p, temperature, chatbot, history, system_prompt, plugin_advanced_arg]
output_combo = [cookies, chatbot, history, status]
predict_args = dict(fn=ArgsGeneralWrapper(predict), inputs=input_combo, outputs=output_combo)
predict_args = dict(fn=ArgsGeneralWrapper(predict), inputs=[*input_combo, gr.State(True)], outputs=output_combo)
# 提交按钮、重置按钮
cancel_handles.append(txt.submit(**predict_args))
cancel_handles.append(txt2.submit(**predict_args))
@@ -175,7 +187,7 @@ def main():
# 函数插件-固定按钮区
for k in plugins:
if not plugins[k].get("AsButton", True): continue
click_handle = plugins[k]["Button"].click(ArgsGeneralWrapper(plugins[k]["Function"]), [*input_combo, gr.State(PORT)], output_combo)
click_handle = plugins[k]["Button"].click(ArgsGeneralWrapper(plugins[k]["Function"]), [*input_combo], output_combo)
click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
cancel_handles.append(click_handle)
# 函数插件-下拉菜单与随变按钮的互动
@@ -188,14 +200,42 @@ def main():
ret.update({plugin_advanced_arg: gr.update(visible=False, label=f"插件[{k}]不需要高级参数。")})
return ret
dropdown.select(on_dropdown_changed, [dropdown], [switchy_bt, plugin_advanced_arg] )
def on_md_dropdown_changed(k):
return {chatbot: gr.update(label="当前模型:"+k)}
md_dropdown.select(on_md_dropdown_changed, [md_dropdown], [chatbot] )
def on_theme_dropdown_changed(theme, secret_css):
adjust_theme, css_part1, _, adjust_dynamic_theme = load_dynamic_theme(theme)
if adjust_dynamic_theme:
css_part2 = adjust_dynamic_theme._get_theme_css()
else:
css_part2 = adjust_theme()._get_theme_css()
return css_part2 + css_part1
theme_handle = theme_dropdown.select(on_theme_dropdown_changed, [theme_dropdown, secret_css], [secret_css])
theme_handle.then(
None,
[secret_css],
None,
_js="""(css) => {
var existingStyles = document.querySelectorAll("style[data-loaded-css]");
for (var i = 0; i < existingStyles.length; i++) {
var style = existingStyles[i];
style.parentNode.removeChild(style);
}
var styleElement = document.createElement('style');
styleElement.setAttribute('data-loaded-css', css);
styleElement.innerHTML = css;
document.head.appendChild(styleElement);
}
"""
)
# 随变按钮的回调函数注册
def route(request: gr.Request, k, *args, **kwargs):
if k in [r"打开插件列表", r"请先从插件列表中选择"]: return
yield from ArgsGeneralWrapper(plugins[k]["Function"])(request, *args, **kwargs)
click_handle = switchy_bt.click(route,[switchy_bt, *input_combo, gr.State(PORT)], output_combo)
click_handle = switchy_bt.click(route,[switchy_bt, *input_combo], output_combo)
click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
cancel_handles.append(click_handle)
# 终止按钮的回调函数注册
@@ -226,7 +266,7 @@ def main():
cookies.update({'uuid': uuid.uuid4()})
return cookies
demo.load(init_cookie, inputs=[cookies, chatbot], outputs=[cookies])
demo.load(lambda: 0, inputs=None, outputs=None, _js='()=>{ChatBotHeight();}')
demo.load(lambda: 0, inputs=None, outputs=None, _js='()=>{GptAcademicJavaScriptInit();}')
# gradio的inbrowser触发不太稳定,回滚代码到原始的浏览器打开函数
def auto_opentab_delay():
@@ -245,6 +285,7 @@ def main():
auto_opentab_delay()
demo.queue(concurrency_count=CONCURRENT_COUNT).launch(
quiet=True,
server_name="0.0.0.0",
server_port=PORT,
favicon_path="docs/logo.png",


@@ -33,9 +33,11 @@ import functools
import re
import pickle
import time
from toolbox import get_conf
CACHE_FOLDER = "gpt_log"
blacklist = ['multi-language', 'gpt_log', '.git', 'private_upload', 'multi_language.py', 'build', '.github', '.vscode', '__pycache__', 'venv']
CACHE_FOLDER, = get_conf('PATH_LOGGING')
blacklist = ['multi-language', CACHE_FOLDER, '.git', 'private_upload', 'multi_language.py', 'build', '.github', '.vscode', '__pycache__', 'venv']
# LANG = "TraditionalChinese"
# TransPrompt = f"Replace each json value `#` with translated results in Traditional Chinese, e.g., \"原始文本\":\"翻譯後文字\". Keep Json format. Do not answer #."


@@ -52,6 +52,7 @@ API_URL_REDIRECT, AZURE_ENDPOINT, AZURE_ENGINE = get_conf("API_URL_REDIRECT", "A
openai_endpoint = "https://api.openai.com/v1/chat/completions"
api2d_endpoint = "https://openai.api2d.net/v1/chat/completions"
newbing_endpoint = "wss://sydney.bing.com/sydney/ChatHub"
if not AZURE_ENDPOINT.endswith('/'): AZURE_ENDPOINT += '/'
azure_endpoint = AZURE_ENDPOINT + f'openai/deployments/{AZURE_ENGINE}/chat/completions?api-version=2023-05-15'
# 兼容旧版的配置
try:
@@ -125,6 +126,15 @@ model_info = {
"token_cnt": get_token_num_gpt4,
},
"gpt-4-32k": {
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": openai_endpoint,
"max_token": 32768,
"tokenizer": tokenizer_gpt4,
"token_cnt": get_token_num_gpt4,
},
# azure openai
"azure-gpt-3.5":{
"fn_with_ui": chatgpt_ui,
@@ -135,6 +145,15 @@ model_info = {
"token_cnt": get_token_num_gpt35,
},
"azure-gpt-4":{
"fn_with_ui": chatgpt_ui,
"fn_without_ui": chatgpt_noui,
"endpoint": azure_endpoint,
"max_token": 8192,
"tokenizer": tokenizer_gpt35,
"token_cnt": get_token_num_gpt35,
},
# api_2d
"api2d-gpt-3.5-turbo": {
"fn_with_ui": chatgpt_ui,


@@ -3,7 +3,7 @@ from transformers import AutoModel, AutoTokenizer
import time
import threading
import importlib
from toolbox import update_ui, get_conf
from toolbox import update_ui, get_conf, ProxyNetworkActivate
from multiprocessing import Process, Pipe
load_message = "ChatGLM尚未加载,加载需要一段时间。注意,取决于`config.py`的配置,ChatGLM消耗大量的内存CPU或显存GPU,也许会导致低配计算机卡死 ……"
@@ -48,16 +48,17 @@ class GetGLMHandle(Process):
while True:
try:
if self.chatglm_model is None:
self.chatglm_tokenizer = AutoTokenizer.from_pretrained(_model_name_, trust_remote_code=True)
if device=='cpu':
self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).float()
with ProxyNetworkActivate('Download_LLM'):
if self.chatglm_model is None:
self.chatglm_tokenizer = AutoTokenizer.from_pretrained(_model_name_, trust_remote_code=True)
if device=='cpu':
self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).float()
else:
self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).half().cuda()
self.chatglm_model = self.chatglm_model.eval()
break
else:
self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).half().cuda()
self.chatglm_model = self.chatglm_model.eval()
break
else:
break
break
except:
retry += 1
if retry > 3:
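
The model download now runs inside ProxyNetworkActivate('Download_LLM'); the following is a simplified stand-in for that context manager (not the project's implementation, and the proxy address is hypothetical), showing the intended enter/exit behavior:

import os

class ProxyNetworkActivate:
    """Apply proxy env vars only while the body of the `with` block runs."""
    def __init__(self, task=None):
        self.task = task
    def __enter__(self):
        os.environ['HTTP_PROXY'] = os.environ['HTTPS_PROXY'] = 'http://127.0.0.1:7890'  # hypothetical proxy
        return self
    def __exit__(self, exc_type, exc, tb):
        os.environ.pop('HTTP_PROXY', None)
        os.environ.pop('HTTPS_PROXY', None)

with ProxyNetworkActivate('Download_LLM'):
    pass  # AutoModel.from_pretrained(...) would download through the proxy here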

@@ -21,7 +21,7 @@ import importlib
# Put your secrets, such as API keys and proxy URLs, in config_private.py
# On load, a private config_private file (not tracked by git) is checked first; if present, it overrides the original config
from toolbox import get_conf, update_ui, is_any_api_key, select_api_key, what_keys, clip_history, trimmed_format_exc
from toolbox import get_conf, update_ui, is_any_api_key, select_api_key, what_keys, clip_history, trimmed_format_exc, is_the_upload_folder
proxies, TIMEOUT_SECONDS, MAX_RETRY, API_ORG = \
get_conf('proxies', 'TIMEOUT_SECONDS', 'MAX_RETRY', 'API_ORG')
@@ -72,6 +72,7 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
stream_response = response.iter_lines()
result = ''
json_data = None
while True:
try: chunk = next(stream_response).decode()
except StopIteration:
@@ -90,20 +91,21 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
delta = json_data["delta"]
if len(delta) == 0: break
if "role" in delta: continue
if "content" in delta:
if "content" in delta:
result += delta["content"]
if not console_slience: print(delta["content"], end='')
if observe_window is not None:
# Observation window: display the data received so far
if len(observe_window) >= 1: observe_window[0] += delta["content"]
if len(observe_window) >= 1:
observe_window[0] += delta["content"]
# Watchdog: terminate if it has not been fed before the deadline
if len(observe_window) >= 2:
if len(observe_window) >= 2:
if (time.time()-observe_window[1]) > watch_dog_patience:
raise RuntimeError("用户取消了程序。")
else: raise RuntimeError("意外Json结构"+delta)
if json_data['finish_reason'] == 'content_filter':
if json_data and json_data['finish_reason'] == 'content_filter':
raise RuntimeError("由于提问含不合规内容被Azure过滤。")
if json_data['finish_reason'] == 'length':
if json_data and json_data['finish_reason'] == 'length':
raise ConnectionAbortedError("正常结束,但显示Token不足,导致输出不完整,请削减单次输入的文本量。")
return result
@@ -128,6 +130,7 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
yield from update_ui(chatbot=chatbot, history=history, msg="缺少api_key") # 刷新界面
return
user_input = inputs
if additional_fn is not None:
from core_functional import handle_core_functionality
inputs, history = handle_core_functionality(additional_fn, inputs, history, chatbot)
@@ -138,8 +141,8 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
yield from update_ui(chatbot=chatbot, history=history, msg="等待响应") # 刷新界面
# check mis-behavior
if raw_input.startswith('private_upload/') and len(raw_input) == 34:
chatbot[-1] = (inputs, f"[Local Message] 检测到操作错误!当您上传文档之后,需点击“函数插件区”按钮进行处理,而不是点击“提交”按钮。")
if is_the_upload_folder(user_input):
chatbot[-1] = (inputs, f"[Local Message] 检测到操作错误!当您上传文档之后,需点击“**函数插件区**”按钮进行处理,请勿点击“提交”按钮或者“基础功能区”按钮")
yield from update_ui(chatbot=chatbot, history=history, msg="正常") # 刷新界面
time.sleep(2)
@@ -179,8 +182,13 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
# Errors like this come from non-official OpenAI endpoints; OpenAI and API2D never reach this branch
chunk_decoded = chunk.decode()
error_msg = chunk_decoded
# First rule out a third-party one-api bug where no "done" packet is sent
if len(gpt_replying_buffer.strip()) > 0 and len(error_msg) == 0:
yield from update_ui(chatbot=chatbot, history=history, msg="检测到有缺陷的非OpenAI官方接口,建议选择更稳定的接口。")
break
# Otherwise, return the error directly
chatbot, history = handle_error(inputs, llm_kwargs, chatbot, history, chunk_decoded, error_msg)
yield from update_ui(chatbot=chatbot, history=history, msg="非Openai官方接口返回了错误:" + chunk.decode()) # refresh the UI
yield from update_ui(chatbot=chatbot, history=history, msg="非OpenAI官方接口返回了错误:" + chunk.decode()) # refresh the UI
return
chunk_decoded = chunk.decode()
@@ -199,7 +207,7 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
chunkjson = json.loads(chunk_decoded[6:])
status_text = f"finish_reason: {chunkjson['choices'][0].get('finish_reason', 'null')}"
# If an exception is raised here, the text is usually too long; see the output of get_full_error for details
gpt_replying_buffer = gpt_replying_buffer + json.loads(chunk_decoded[6:])['choices'][0]["delta"]["content"]
gpt_replying_buffer = gpt_replying_buffer + chunkjson['choices'][0]["delta"]["content"]
history[-1] = gpt_replying_buffer
chatbot[-1] = (history[-2], history[-1])
yield from update_ui(chatbot=chatbot, history=history, msg=status_text) # refresh the UI
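
Two fixes above deserve emphasis: finish_reason checks are now guarded with `json_data and ...` so a stream that never yields a delta cannot raise, and the chunk JSON is parsed once into chunkjson instead of twice. A self-contained sketch of that parsing pattern (the chunk format follows the OpenAI SSE convention):

import json

def consume_chunks(chunks):
    buffer, json_data = '', None
    for chunk_decoded in chunks:
        if not chunk_decoded.startswith('data: '): continue
        chunkjson = json.loads(chunk_decoded[6:])     # parse once, reuse below
        json_data = chunkjson['choices'][0]
        buffer += json_data.get('delta', {}).get('content', '')
    if json_data and json_data.get('finish_reason') == 'length':
        raise ConnectionAbortedError("output truncated: token limit reached")
    return buffer

print(consume_chunks(['data: {"choices":[{"delta":{"content":"hi"},"finish_reason":null}]}']))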

@@ -30,7 +30,7 @@ class GetONNXGLMHandle(LocalLLMHandle):
with open(os.path.expanduser('~/.cache/huggingface/token'), 'w') as f:
f.write(huggingface_token)
model_id = 'meta-llama/Llama-2-7b-chat-hf'
with ProxyNetworkActivate():
with ProxyNetworkActivate('Download_LLM'):
self._tokenizer = AutoTokenizer.from_pretrained(model_id, use_auth_token=huggingface_token)
# use fp16
model = AutoModelForCausalLM.from_pretrained(model_id, use_auth_token=huggingface_token).eval()

@@ -109,6 +109,7 @@ class SparkRequestInstance():
code = data['header']['code']
if code != 0:
print(f'请求错误: {code}, {data}')
self.result_buf += str(data)
ws.close()
self.time_to_exit_event.set()
else:
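
The added line appends the error payload to result_buf, so a failed Spark request is surfaced to the user instead of the websocket closing silently; a toy sketch with a hypothetical message shape:

def handle_status(result_buf, data):
    code = data['header']['code']
    if code != 0:
        result_buf += f"request error: {code}, {data}"   # surface the failure
    return result_buf

print(handle_status('', {'header': {'code': 10013}}))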

@@ -1,5 +1,4 @@
protobuf
transformers>=4.27.1
cpm_kernels
torch>=1.10
mdtex2html

@@ -1,5 +1,4 @@
protobuf
transformers>=4.27.1
cpm_kernels
torch>=1.10
mdtex2html

@@ -2,6 +2,5 @@ jittor >= 1.3.7.9
jtorch >= 0.1.3
torch
torchvision
transformers==4.26.1
pandas
jieba

@@ -1,5 +1,4 @@
torch
transformers==4.25.1
sentencepiece
datasets
accelerate

@@ -2,7 +2,7 @@
pydantic==1.10.11
tiktoken>=0.3.3
requests[socks]
transformers
transformers>=4.27.1
python-markdown-math
beautifulsoup4
prompt_toolkit
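
These requirements edits follow the "统一 transformers 版本" commit, consolidating the transformers pin (>=4.27.1 in the main requirements.txt) and dropping per-subproject pins; a quick sketch, assuming the `packaging` helper is installed, to verify the pin in a live environment:

from packaging.version import Version
import transformers

assert Version(transformers.__version__) >= Version("4.27.1"), \
    f"transformers {transformers.__version__} is older than the pinned 4.27.1"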

@@ -6,11 +6,14 @@
import os, sys
def validate_path(): dir_name = os.path.dirname(__file__); root_dir_assume = os.path.abspath(dir_name + '/..'); os.chdir(root_dir_assume); sys.path.append(root_dir_assume)
validate_path() # cd to the project root
from tests.test_utils import plugin_test
if __name__ == "__main__":
from tests.test_utils import plugin_test
plugin_test(plugin='crazy_functions.函数动态生成->函数动态生成', main_input='交换图像的蓝色通道和红色通道', advanced_arg={"file_path_arg": "./build/ants.jpg"})
# plugin_test(plugin='crazy_functions.虚空终端->虚空终端', main_input='修改api-key为sk-jhoejriotherjep')
plugin_test(plugin='crazy_functions.批量翻译PDF文档_NOUGAT->批量翻译PDF文档', main_input='crazy_functions/test_project/pdf_and_word/aaai.pdf')
# plugin_test(plugin='crazy_functions.批量翻译PDF文档_NOUGAT->批量翻译PDF文档', main_input='crazy_functions/test_project/pdf_and_word/aaai.pdf')
# plugin_test(plugin='crazy_functions.虚空终端->虚空终端', main_input='调用插件,对C:/Users/fuqingxu/Desktop/旧文件/gpt/chatgpt_academic/crazy_functions/latex_fns中的python文件进行解析')

@@ -74,7 +74,7 @@ def plugin_test(main_input, plugin, advanced_arg=None):
plugin_kwargs['plugin_kwargs'] = advanced_arg
my_working_plugin = silence_stdout(plugin)(**plugin_kwargs)
with Live(Markdown(""), auto_refresh=False) as live:
with Live(Markdown(""), auto_refresh=False, vertical_overflow="visible") as live:
for cookies, chat, hist, msg in my_working_plugin:
md_str = vt.chat_to_markdown_str(chat)
md = Markdown(md_str)
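
Passing vertical_overflow="visible" stops rich from cropping renderables taller than the terminal while the live view refreshes in place; a minimal sketch:

from rich.live import Live
from rich.markdown import Markdown

with Live(Markdown(""), auto_refresh=False, vertical_overflow="visible") as live:
    for md_str in ["# step 1", "# step 1\nworking...", "# step 1\ndone"]:
        live.update(Markdown(md_str), refresh=True)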

@@ -19,3 +19,67 @@
.wrap.svelte-xwlu1w {
min-height: var(--size-32);
}
/* status bar height */
.min.svelte-1yrv54 {
min-height: var(--size-12);
}
/* copy btn */
.message-btn-row {
width: 19px;
height: 19px;
position: absolute;
left: calc(100% + 3px);
top: 0;
display: flex;
justify-content: space-between;
}
/* .message-btn-row-leading, .message-btn-row-trailing {
display: inline-flex;
gap: 4px;
} */
.message-btn-row button {
font-size: 18px;
align-self: center;
align-items: center;
flex-wrap: nowrap;
white-space: nowrap;
display: inline-flex;
flex-direction: row;
gap: 4px;
padding-block: 2px !important;
}
/* Scrollbar Width */
::-webkit-scrollbar {
width: 12px;
}
/* Scrollbar Track */
::-webkit-scrollbar-track {
background: #f1f1f1;
border-radius: 12px;
}
/* Scrollbar Handle */
::-webkit-scrollbar-thumb {
background: #888;
border-radius: 12px;
}
/* Scrollbar Handle on hover */
::-webkit-scrollbar-thumb:hover {
background: #555;
}
/* input btns: clear, reset, stop */
#input-panel button {
min-width: min(80px, 100%);
}
/* input btns: clear, reset, stop */
#input-panel2 button {
min-width: min(80px, 100%);
}

@@ -1,4 +1,85 @@
function ChatBotHeight() {
function gradioApp() {
// https://github.com/GaiZhenbiao/ChuanhuChatGPT/tree/main/web_assets/javascript
const elems = document.getElementsByTagName('gradio-app');
const elem = elems.length == 0 ? document : elems[0];
if (elem !== document) {
elem.getElementById = function(id) {
return document.getElementById(id);
};
}
return elem.shadowRoot ? elem.shadowRoot : elem;
}
const copiedIcon = '<span><svg stroke="currentColor" fill="none" stroke-width="2" viewBox="0 0 24 24" stroke-linecap="round" stroke-linejoin="round" height=".8em" width=".8em" xmlns="http://www.w3.org/2000/svg"><polyline points="20 6 9 17 4 12"></polyline></svg></span>';
const copyIcon = '<span><svg stroke="currentColor" fill="none" stroke-width="2" viewBox="0 0 24 24" stroke-linecap="round" stroke-linejoin="round" height=".8em" width=".8em" xmlns="http://www.w3.org/2000/svg"><rect x="9" y="9" width="13" height="13" rx="2" ry="2"></rect><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"></path></svg></span>';
function addCopyButton(botElement) {
// https://github.com/GaiZhenbiao/ChuanhuChatGPT/tree/main/web_assets/javascript
// Copy bot button
const messageBtnColumnElement = botElement.querySelector('.message-btn-row');
if (messageBtnColumnElement) {
// Do something if .message-btn-column exists, for example, remove it
// messageBtnColumnElement.remove();
return;
}
var copyButton = document.createElement('button');
copyButton.classList.add('copy-bot-btn');
copyButton.setAttribute('aria-label', 'Copy');
copyButton.innerHTML = copyIcon;
copyButton.addEventListener('click', async () => {
const textToCopy = botElement.innerText;
try {
if ("clipboard" in navigator) {
await navigator.clipboard.writeText(textToCopy);
copyButton.innerHTML = copiedIcon;
setTimeout(() => {
copyButton.innerHTML = copyIcon;
}, 1500);
} else {
const textArea = document.createElement("textarea");
textArea.value = textToCopy;
document.body.appendChild(textArea);
textArea.select();
try {
document.execCommand('copy');
copyButton.innerHTML = copiedIcon;
setTimeout(() => {
copyButton.innerHTML = copyIcon;
}, 1500);
} catch (error) {
console.error("Copy failed: ", error);
}
document.body.removeChild(textArea);
}
} catch (error) {
console.error("Copy failed: ", error);
}
});
var messageBtnColumn = document.createElement('div');
messageBtnColumn.classList.add('message-btn-row');
messageBtnColumn.appendChild(copyButton);
botElement.appendChild(messageBtnColumn);
}
function chatbotContentChanged(attempt = 1, force = false) {
// https://github.com/GaiZhenbiao/ChuanhuChatGPT/tree/main/web_assets/javascript
for (var i = 0; i < attempt; i++) {
setTimeout(() => {
gradioApp().querySelectorAll('#gpt-chatbot .message-wrap .message.bot').forEach(addCopyButton);
}, i === 0 ? 0 : 200);
}
}
function GptAcademicJavaScriptInit() {
chatbotIndicator = gradioApp().querySelector('#gpt-chatbot > div.wrap');
var chatbotObserver = new MutationObserver(() => {
chatbotContentChanged(1);
});
chatbotObserver.observe(chatbotIndicator, { attributes: true, childList: true, subtree: true });
function update_height(){
var { panel_height_target, chatbot_height, chatbot } = get_elements(true);
if (panel_height_target!=chatbot_height)

themes/gradios.py (new file)
@@ -0,0 +1,55 @@
import gradio as gr
import logging
from toolbox import get_conf, ProxyNetworkActivate
CODE_HIGHLIGHT, ADD_WAIFU, LAYOUT = get_conf('CODE_HIGHLIGHT', 'ADD_WAIFU', 'LAYOUT')
def dynamic_set_theme(THEME):
set_theme = gr.themes.ThemeClass()
with ProxyNetworkActivate('Download_Gradio_Theme'):
logging.info('正在下载Gradio主题,请稍等。')
if THEME.startswith('Huggingface-'): THEME = THEME.lstrip('Huggingface-')
if THEME.startswith('huggingface-'): THEME = THEME.lstrip('huggingface-')
set_theme = set_theme.from_hub(THEME.lower())
return set_theme
def adjust_theme():
try:
set_theme = gr.themes.ThemeClass()
with ProxyNetworkActivate('Download_Gradio_Theme'):
logging.info('正在下载Gradio主题,请稍等。')
THEME, = get_conf('THEME')
if THEME.startswith('Huggingface-'): THEME = THEME.lstrip('Huggingface-')
if THEME.startswith('huggingface-'): THEME = THEME.lstrip('huggingface-')
set_theme = set_theme.from_hub(THEME.lower())
if LAYOUT=="TOP-DOWN":
js = ""
else:
with open('themes/common.js', 'r', encoding='utf8') as f:
js = f"<script>{f.read()}</script>"
# Add a cute Live2D mascot
if ADD_WAIFU:
js += """
<script src="file=docs/waifu_plugin/jquery.min.js"></script>
<script src="file=docs/waifu_plugin/jquery-ui.min.js"></script>
<script src="file=docs/waifu_plugin/autoload.js"></script>
"""
gradio_original_template_fn = gr.routes.templates.TemplateResponse
def gradio_new_template_fn(*args, **kwargs):
res = gradio_original_template_fn(*args, **kwargs)
res.body = res.body.replace(b'</html>', f'{js}</html>'.encode("utf8"))
res.init_headers()
return res
gr.routes.templates.TemplateResponse = gradio_new_template_fn # override gradio template
except Exception as e:
set_theme = None
from toolbox import trimmed_format_exc
logging.error('gradio版本较旧, 不能自定义字体和颜色:', trimmed_format_exc())
return set_theme
# with open("themes/default.css", "r", encoding="utf-8") as f:
# advanced_css = f.read()
with open("themes/common.css", "r", encoding="utf-8") as f:
advanced_css = f.read()
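
dynamic_set_theme downloads a Gradio theme from the Hugging Face hub inside the proxy context. One caveat worth noting: str.lstrip strips a character set rather than a prefix, so theme names beginning with stripped characters could be over-trimmed; the re-sketch below removes the 'huggingface-' prefix explicitly ('gradio/soft' is an example hub theme name):

import gradio as gr

def set_theme_from_hub(theme_name: str):
    theme = gr.themes.ThemeClass()
    prefix = 'huggingface-'
    if theme_name.lower().startswith(prefix):
        theme_name = theme_name[len(prefix):]      # strip the prefix, not a char set
    return theme.from_hub(theme_name.lower())

# theme = set_theme_from_hub('Huggingface-gradio/soft')   # downloads on demand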

@@ -2,14 +2,22 @@ import gradio as gr
from toolbox import get_conf
THEME, = get_conf('THEME')
if THEME == 'Chuanhu-Small-and-Beautiful':
from .green import adjust_theme, advanced_css
theme_declaration = "<h2 align=\"center\" class=\"small\">[Chuanhu-Small-and-Beautiful主题]</h2>"
elif THEME == 'High-Contrast':
from .contrast import adjust_theme, advanced_css
theme_declaration = ""
else:
from .default import adjust_theme, advanced_css
theme_declaration = ""
def load_dynamic_theme(THEME):
adjust_dynamic_theme = None
if THEME == 'Chuanhu-Small-and-Beautiful':
from .green import adjust_theme, advanced_css
theme_declaration = "<h2 align=\"center\" class=\"small\">[Chuanhu-Small-and-Beautiful主题]</h2>"
elif THEME == 'High-Contrast':
from .contrast import adjust_theme, advanced_css
theme_declaration = ""
elif '/' in THEME:
from .gradios import adjust_theme, advanced_css
from .gradios import dynamic_set_theme
adjust_dynamic_theme = dynamic_set_theme(THEME)
theme_declaration = ""
else:
from .default import adjust_theme, advanced_css
theme_declaration = ""
return adjust_theme, advanced_css, theme_declaration, adjust_dynamic_theme
adjust_theme, advanced_css, theme_declaration, _ = load_dynamic_theme(THEME)
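
A simplified, self-contained re-sketch of the dispatch rule implemented by load_dynamic_theme: a THEME containing '/' is treated as a Hugging Face hub theme and resolved dynamically, everything else maps to a bundled theme module:

def resolve_theme(theme: str):
    adjust_dynamic_theme = None
    if theme == 'Chuanhu-Small-and-Beautiful':
        declaration = '[Chuanhu-Small-and-Beautiful theme]'
    elif '/' in theme:                              # hub themes look like "author/name"
        adjust_dynamic_theme = f'hub:{theme}'       # stand-in for dynamic_set_theme(theme)
        declaration = ''
    else:
        declaration = ''
    return declaration, adjust_dynamic_theme

print(resolve_theme('gradio/soft'))                 # ('', 'hub:gradio/soft')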

@@ -5,6 +5,8 @@ import inspect
import re
import os
import gradio
import shutil
import glob
from latex2mathml.converter import convert as tex2mathml
from functools import wraps, lru_cache
pj = os.path.join
@@ -77,14 +79,24 @@ def ArgsGeneralWrapper(f):
}
chatbot_with_cookie = ChatBotWithCookies(cookies)
chatbot_with_cookie.write_list(chatbot)
if cookies.get('lock_plugin', None) is None:
# normal state
yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
if len(args) == 0: # plugin channel
yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, request)
else: # chat channel, or basic-function channel
yield from f(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, *args)
else:
# Handle the locked state of individual special plugins
# Handle the locked state of special plugins in rare cases
module, fn_name = cookies['lock_plugin'].split('->')
f_hot_reload = getattr(importlib.import_module(module, fn_name), fn_name)
yield from f_hot_reload(txt_passon, llm_kwargs, plugin_kwargs, chatbot_with_cookie, history, system_prompt, request)
# Check whether the user mistakenly entered through the chat channel; if so, show a reminder
final_cookies = chatbot_with_cookie.get_cookies()
# len(args) != 0 means the "Submit" chat channel, or the basic-function channel
if len(args) != 0 and 'files_to_promote' in final_cookies and len(final_cookies['files_to_promote']) > 0:
chatbot_with_cookie.append(["检测到**滞留的缓存文档**,请及时处理。", "请及时点击“**保存当前对话**”获取所有滞留文档。"])
yield from update_ui(chatbot_with_cookie, final_cookies['history'], msg="检测到被滞留的缓存文档")
return decorated
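
The dispatch above hinges on len(args): an empty args means the call came from the plugin channel and the gradio Request is forwarded, otherwise the original extra arguments pass through. A toy sketch of that rule:

def dispatch(f, request, *args):
    if len(args) == 0:      # plugin channel: forward the gradio Request
        return f(request)
    else:                   # chat channel or basic-function channel
        return f(*args)

print(dispatch(lambda x: f"got {x}", "request-object"))         # plugin channel
print(dispatch(lambda *a: f"got {a}", "request-object", 1, 2))  # chat channel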
@@ -94,7 +106,8 @@ def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
"""
assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时, 可用clear将其清空, 然后用for+append循环重新赋值。"
cookies = chatbot.get_cookies()
# Back up a copy of the history as a record
cookies.update({'history': history})
# Fix the UI display issue while a plugin is locked
if cookies.get('lock_plugin', None):
label = cookies.get('llm_model', "") + " | " + "正在锁定插件" + cookies.get('lock_plugin', None)
@@ -171,7 +184,7 @@ def HotReload(f):
========================================================================
Part 2
Other small utilities:
- write_results_to_file: write results into a markdown file
- write_history_to_file: write results into a markdown file
- regular_txt_to_markdown: convert plain text into Markdown-formatted text.
- report_execption: add a simple unexpected-error message to the chatbot
- text_divide_paragraph: split text by paragraph separators and generate HTML with paragraph tags.
@@ -203,37 +216,7 @@ def get_reduce_token_percent(text):
return 0.5, '不详'
def write_results_to_file(history, file_name=None):
"""
Write the conversation history to a file in Markdown format. If no file name is given, generate one from the current time.
"""
import os
import time
if file_name is None:
# file_name = time.strftime("chatGPT分析报告%Y-%m-%d-%H-%M-%S", time.localtime()) + '.md'
file_name = 'GPT-Report-' + gen_time_str() + '.md'
os.makedirs('./gpt_log/', exist_ok=True)
with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
f.write('# GPT-Academic Report\n')
for i, content in enumerate(history):
try:
if type(content) != str: content = str(content)
except:
continue
if i % 2 == 0:
f.write('## ')
try:
f.write(content)
except:
# remove everything that cannot be handled by utf8
f.write(content.encode('utf-8', 'ignore').decode())
f.write('\n\n')
res = '以上材料已经被写入:\t' + os.path.abspath(f'./gpt_log/{file_name}')
print(res)
return res
def write_history_to_file(history, file_basename=None, file_fullname=None):
def write_history_to_file(history, file_basename=None, file_fullname=None, auto_caption=True):
"""
Write the conversation history to a file in Markdown format. If no file name is given, generate one from the current time.
"""
@@ -241,9 +224,9 @@ def write_history_to_file(history, file_basename=None, file_fullname=None):
import time
if file_fullname is None:
if file_basename is not None:
file_fullname = os.path.join(get_log_folder(), file_basename)
file_fullname = pj(get_log_folder(), file_basename)
else:
file_fullname = os.path.join(get_log_folder(), f'GPT-Academic-{gen_time_str()}.md')
file_fullname = pj(get_log_folder(), f'GPT-Academic-{gen_time_str()}.md')
os.makedirs(os.path.dirname(file_fullname), exist_ok=True)
with open(file_fullname, 'w', encoding='utf8') as f:
f.write('# GPT-Academic Report\n')
@@ -252,7 +235,7 @@ def write_history_to_file(history, file_basename=None, file_fullname=None):
if type(content) != str: content = str(content)
except:
continue
if i % 2 == 0:
if i % 2 == 0 and auto_caption:
f.write('## ')
try:
f.write(content)
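
The new auto_caption switch controls whether every even-indexed history entry is promoted to an H2 heading; a minimal stand-in implementation showing the effect:

def render_history(history, auto_caption=True):
    lines = ['# GPT-Academic Report']
    for i, content in enumerate(history):
        prefix = '## ' if (i % 2 == 0 and auto_caption) else ''
        lines.append(prefix + str(content))
    return '\n\n'.join(lines)

print(render_history(['question', 'answer']))                      # question becomes a heading
print(render_history(['question', 'answer'], auto_caption=False))  # raw text only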
@@ -519,7 +502,7 @@ def find_recent_files(directory):
if not os.path.exists(directory):
os.makedirs(directory, exist_ok=True)
for filename in os.listdir(directory):
file_path = os.path.join(directory, filename)
file_path = pj(directory, filename)
if file_path.endswith('.log'):
continue
created_time = os.path.getmtime(file_path)
@@ -534,7 +517,7 @@ def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
# Copy the file into the download area
import shutil
if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
new_path = os.path.join(get_log_folder(), rename_file)
new_path = pj(get_log_folder(), rename_file)
# If it already exists, delete it first
if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
# Copy the file over
@@ -544,49 +527,76 @@ def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
if 'files_to_promote' in chatbot._cookies: current = chatbot._cookies['files_to_promote']
else: current = []
chatbot._cookies.update({'files_to_promote': [new_path] + current})
return new_path
def disable_auto_promotion(chatbot):
chatbot._cookies.update({'files_to_promote': []})
return
def on_file_uploaded(files, chatbot, txt, txt2, checkboxes, cookies):
def is_the_upload_folder(string):
PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
pattern = r'^PATH_PRIVATE_UPLOAD/[A-Za-z0-9_-]+/\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}$'
pattern = pattern.replace('PATH_PRIVATE_UPLOAD', PATH_PRIVATE_UPLOAD)
if re.match(pattern, string): return True
else: return False
def del_outdated_uploads(outdate_time_seconds):
PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
current_time = time.time()
one_hour_ago = current_time - outdate_time_seconds
# Get a list of all subdirectories in the PATH_PRIVATE_UPLOAD folder
# Remove subdirectories that are older than one hour
for subdirectory in glob.glob(f'{PATH_PRIVATE_UPLOAD}/*/*'):
subdirectory_time = os.path.getmtime(subdirectory)
if subdirectory_time < one_hour_ago:
try: shutil.rmtree(subdirectory)
except: pass
return
def on_file_uploaded(request: gradio.Request, files, chatbot, txt, txt2, checkboxes, cookies):
"""
Callback invoked when files are uploaded
"""
if len(files) == 0:
return chatbot, txt
import shutil
import os
import time
import glob
from toolbox import extract_archive
try:
shutil.rmtree('./private_upload/')
except:
pass
# Remove outdated files to save space & protect privacy
outdate_time_seconds = 60
del_outdated_uploads(outdate_time_seconds)
# Create the working path
user_name = "default" if not request.username else request.username
time_tag = gen_time_str()
os.makedirs(f'private_upload/{time_tag}', exist_ok=True)
err_msg = ''
PATH_PRIVATE_UPLOAD, = get_conf('PATH_PRIVATE_UPLOAD')
target_path_base = pj(PATH_PRIVATE_UPLOAD, user_name, time_tag)
os.makedirs(target_path_base, exist_ok=True)
# Move files one by one to the target path
upload_msg = ''
for file in files:
file_origin_name = os.path.basename(file.orig_name)
shutil.copy(file.name, f'private_upload/{time_tag}/{file_origin_name}')
err_msg += extract_archive(f'private_upload/{time_tag}/{file_origin_name}',
dest_dir=f'private_upload/{time_tag}/{file_origin_name}.extract')
moved_files = [fp for fp in glob.glob('private_upload/**/*', recursive=True)]
if "底部输入区" in checkboxes:
txt = ""
txt2 = f'private_upload/{time_tag}'
this_file_path = pj(target_path_base, file_origin_name)
shutil.move(file.name, this_file_path)
upload_msg += extract_archive(file_path=this_file_path, dest_dir=this_file_path+'.extract')
# Collect the set of moved files
moved_files = [fp for fp in glob.glob(f'{target_path_base}/**/*', recursive=True)]
if "底部输入区" in checkboxes:
txt, txt2 = "", target_path_base
else:
txt = f'private_upload/{time_tag}'
txt2 = ""
txt, txt2 = target_path_base, ""
# Emit a message
moved_files_str = '\t\n\n'.join(moved_files)
chatbot.append(['我上传了文件,请查收',
chatbot.append(['我上传了文件,请查收',
f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
f'\n\n现在您点击任意函数插件时,以上文件将被作为输入参数'+err_msg])
f'\n\n现在您点击任意函数插件时,以上文件将被作为输入参数'+upload_msg])
# Record the most recent files
cookies.update({
'most_recent_uploaded': {
'path': f'private_upload/{time_tag}',
'path': target_path_base,
'time': time.time(),
'time_str': time_tag
}})
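
The refactored upload flow gives each user an isolated directory, PATH_PRIVATE_UPLOAD/<user>/<timestamp>/, and expires old uploads instead of wiping the whole folder; a condensed sketch (paths and the 60-second retention are illustrative):

import glob, os, shutil, time

PATH_PRIVATE_UPLOAD = 'private_upload'              # value comes from config in the project

def del_outdated_uploads(outdate_time_seconds=60):
    deadline = time.time() - outdate_time_seconds
    for subdirectory in glob.glob(f'{PATH_PRIVATE_UPLOAD}/*/*'):
        if os.path.getmtime(subdirectory) < deadline:
            shutil.rmtree(subdirectory, ignore_errors=True)

user_name, time_tag = 'default', time.strftime('%Y-%m-%d-%H-%M-%S')
target_path_base = os.path.join(PATH_PRIVATE_UPLOAD, user_name, time_tag)
os.makedirs(target_path_base, exist_ok=True)
del_outdated_uploads()                              # prune expired per-user uploads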
@@ -595,11 +605,12 @@ def on_file_uploaded(files, chatbot, txt, txt2, checkboxes, cookies):
def on_report_generated(cookies, files, chatbot):
from toolbox import find_recent_files
PATH_LOGGING, = get_conf('PATH_LOGGING')
if 'files_to_promote' in cookies:
report_files = cookies['files_to_promote']
cookies.pop('files_to_promote')
else:
report_files = find_recent_files('gpt_log')
report_files = find_recent_files(PATH_LOGGING)
if len(report_files) == 0:
return cookies, None, chatbot
# files.extend(report_files)
@@ -909,34 +920,35 @@ def zip_folder(source_folder, dest_folder, zip_name):
return
# Create the name for the zip file
zip_file = os.path.join(dest_folder, zip_name)
zip_file = pj(dest_folder, zip_name)
# Create a ZipFile object
with zipfile.ZipFile(zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
# Walk through the source folder and add files to the zip file
for foldername, subfolders, filenames in os.walk(source_folder):
for filename in filenames:
filepath = os.path.join(foldername, filename)
filepath = pj(foldername, filename)
zipf.write(filepath, arcname=os.path.relpath(filepath, source_folder))
# Move the zip file to the destination folder (if it wasn't already there)
if os.path.dirname(zip_file) != dest_folder:
os.rename(zip_file, os.path.join(dest_folder, os.path.basename(zip_file)))
zip_file = os.path.join(dest_folder, os.path.basename(zip_file))
os.rename(zip_file, pj(dest_folder, os.path.basename(zip_file)))
zip_file = pj(dest_folder, os.path.basename(zip_file))
print(f"Zip file created at {zip_file}")
def zip_result(folder):
t = gen_time_str()
zip_folder(folder, './gpt_log/', f'{t}-result.zip')
return pj('./gpt_log/', f'{t}-result.zip')
zip_folder(folder, get_log_folder(), f'{t}-result.zip')
return pj(get_log_folder(), f'{t}-result.zip')
def gen_time_str():
import time
return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
def get_log_folder(user='default', plugin_name='shared'):
_dir = os.path.join(os.path.dirname(__file__), 'gpt_log', user, plugin_name)
PATH_LOGGING, = get_conf('PATH_LOGGING')
_dir = pj(PATH_LOGGING, user, plugin_name)
if not os.path.exists(_dir): os.makedirs(_dir)
return _dir
@@ -944,7 +956,19 @@ class ProxyNetworkActivate():
"""
This code defines a context manager used to route a small section of code through a proxy
"""
def __init__(self, task=None) -> None:
self.task = task
if not task:
# No task given: the proxy takes effect by default
self.valid = True
else:
# A task was given: check it against the allow-list
from toolbox import get_conf
WHEN_TO_USE_PROXY, = get_conf('WHEN_TO_USE_PROXY')
self.valid = (task in WHEN_TO_USE_PROXY)
def __enter__(self):
if not self.valid: return self
from toolbox import get_conf
proxies, = get_conf('proxies')
if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
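
With the task parameter, the proxy is applied only when the task name appears in the WHEN_TO_USE_PROXY config; a toy sketch of that gate (the allow-list values are examples):

WHEN_TO_USE_PROXY = ['Download_LLM', 'Download_Gradio_Theme']   # example config

def proxy_is_valid(task=None):
    if not task:
        return True             # no task given: the proxy applies by default
    return task in WHEN_TO_USE_PROXY

print(proxy_is_valid('Download_LLM'))    # True
print(proxy_is_valid('Connect_OpenAI'))  # False under this example config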

@@ -1,5 +1,5 @@
{
"version": 3.50,
"version": 3.54,
"show_feature": true,
"new_feature": "支持插件分类! <-> 支持用户使用自然语言调度各个插件(虚空终端) <-> 改进UI,设计新主题 <-> 支持借助GROBID实现PDF高精度翻译 <-> 接入百度千帆平台和文心一言 <-> 接入阿里通义千问、讯飞星火、上海AI-Lab书生 <-> 优化一键升级 <-> 提高arxiv翻译速度和成功率"
"new_feature": "新增动态代码解释器CodeInterpreter <-> 增加文本回答复制按钮 <-> 细分代理场合 <-> 支持动态选择不同界面主题 <-> 提高稳定性&解决多用户冲突问题 <-> 支持插件分类和更多UI皮肤外观 <-> 支持用户使用自然语言调度各个插件(虚空终端) <-> 改进UI,设计新主题 <-> 支持借助GROBID实现PDF高精度翻译 <-> 接入百度千帆平台和文心一言 <-> 接入阿里通义千问、讯飞星火、上海AI-Lab书生 <-> 优化一键升级 <-> 提高arxiv翻译速度和成功率"
}