Update crazy_functional.py

Fix duplicate-definition bug for 'copiedIcon'
Allow the proxy to be used during module warm-up
2025-12-06 22:46:48 +00:00 · 2023-09-27 18:35:06 +08:00 · 2023-09-27 16:35:58 +08:00 · 2023-09-27 15:53:45 +08:00 · 2023-09-27 15:40:55 +08:00 · 2023-09-27 15:20:28 +08:00
--- a/README.md
+++ b/README.md
@@ -101,9 +101,11 @@ cd gpt_academic
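 2. Configure API_KEY
-In `config.py`, configure the API KEY and other settings. [Click here for setup instructions in special network environments](https://github.com/binary-husky/gpt_academic/issues/1).
+In `config.py`, configure the API KEY and other settings. [Click here for setup instructions in special network environments](https://github.com/binary-husky/gpt_academic/issues/1). [Wiki page](https://github.com/binary-husky/gpt_academic/wiki/%E9%A1%B9%E7%9B%AE%E9%85%8D%E7%BD%AE%E8%AF%B4%E6%98%8E).
-(P.S. At runtime the program first checks whether a private configuration file named `config_private.py` exists, and uses its settings to override the same-named settings in `config.py`. Therefore, if you understand our configuration-reading logic, we strongly recommend creating a new configuration file named `config_private.py` next to `config.py` and moving (copying) the settings from `config.py` into `config_private.py` (copying only the entries you have changed is enough). `config_private.py` is not tracked by git, which keeps your private information safer. P.S. The project also supports configuring most options through `environment variables`; for the format, refer to the `docker-compose` file. Read priority: `environment variables` > `config_private.py` > `config.py`)
+「 The program first checks whether a private configuration file named `config_private.py` exists, and uses its settings to override the same-named settings in `config.py`. If you understand this read logic, we strongly recommend creating a new configuration file named `config_private.py` next to `config.py` and moving (copying) the settings from `config.py` into `config_private.py` (copying only the entries you have changed is enough). 」
 「 The project can be configured through `environment variables`; for the format, refer to the `docker-compose.yml` file or our [Wiki page](https://github.com/binary-husky/gpt_academic/wiki/%E9%A1%B9%E7%9B%AE%E9%85%8D%E7%BD%AE%E8%AF%B4%E6%98%8E). Configuration read priority: `environment variables` > `config_private.py` > `config.py`. 」
 3. Install dependencies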
@@ -111,7 +113,7 @@ cd gpt_academic
 # (Option I: if you are familiar with Python) (Python 3.9 or later; the newer the better). Note: use the official pip source or the Aliyun pip source. To switch sources temporarily: python -m pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
 python -m pip install -r requirements.txt
-# (Option II: if you are not familiar with Python) Use Anaconda; the steps are similar (https://www.bilibili.com/video/BV1rc411W7Dr):
+# (Option II: using Anaconda) The steps are similar (https://www.bilibili.com/video/BV1rc411W7Dr):
 conda create -n gptac_venv python=3.11    # create the Anaconda environment
 conda activate gptac_venv                 # activate the Anaconda environment
 python -m pip install -r requirements.txt # same step as the pip installation
@@ -149,26 +151,25 @@ python main.py
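 ### Installation Method II: Using Docker
 0. Deploy all capabilities of the project (this is a large image that includes CUDA and LaTeX. If your network is slow, your disk is small, or you have no GPU, this is not recommended; use option 1 instead) (requires familiarity with the [Nvidia Docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian) runtime)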
 [![fullcapacity](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-all-capacity.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-all-capacity.yml)
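-1. ChatGPT only (recommended for most people; equivalent to docker-compose option 1)
+``` sh
 # Edit docker-compose.yml: keep option 0 and delete the other options. Then adjust the option 0 settings in docker-compose.yml, following the comments there
 docker-compose up
 ```
 1. Online models only: ChatGPT + ERNIE Bot (文心一言) + spark, etc. (recommended for most people)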
 [![basic](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml)
 [![basiclatex](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml)
 [![basicaudio](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-audio-assistant.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-audio-assistant.yml)
 ``` sh
-git clone --depth=1 https://github.com/binary-husky/gpt_academic.git  # download the project
+# Edit docker-compose.yml: keep option 1 and delete the other options. Then adjust the option 1 settings in docker-compose.yml, following the comments there
-cd gpt_academic                                 # enter the directory
+docker-compose up
 nano config.py                                      # edit config.py with any text editor; configure "Proxy", "API_KEY", and "WEB_PORT" (e.g. 50923), etc.
 docker build -t gpt-academic .                      # install
 # (Final step, Linux) using `--net=host` is quicker and more convenient
 docker run --rm -it --net=host gpt-academic
 # (Final step, macOS/Windows) the only option is to use -p to expose the container port (e.g. 50923) on a host port
 docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
 ```
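-P.S. If you need the plugin features that depend on LaTeX, see the Wiki. Alternatively, you can get LaTeX support directly through docker-compose (edit docker-compose.yml, keep option 4 and delete the other options).
+
 P.S. If you need the plugin features that depend on LaTeX, see the Wiki. Alternatively, you can get LaTeX support directly by using option 4 or option 0.
 2. ChatGPT + ChatGLM2 + MOSS + LLAMA2 + Tongyi Qianwen (通义千问) (requires familiarity with the [Nvidia Docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-ubuntu-and-debian) runtime)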
 [![chatglm](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml)
@@ -309,6 +310,7 @@ Tip：不指定文件直接点击 `载入对话历史存档` 可以查看历史h
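 ### II: Versions:
 - version 3.60 (todo): optimize the Void Terminal (虚空终端), introduce a code interpreter and more plugins
 - version 3.53: support dynamically selecting different UI themes, improve stability, and resolve multi-user conflicts
 - version 3.50: call all of this project's function plugins using natural language (Void Terminal), support plugin categories, improve the UI, design new themes
 - version 3.49: support the Baidu Qianfan platform and ERNIE Bot (文心一言)
 - version 3.48: support Alibaba DAMO Academy's Tongyi Qianwen (通义千问), Shanghai AI-Lab's InternLM (书生), and iFlytek Spark (讯飞星火)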
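
The read priority stated in the README hunk above (`environment variables` > `config_private.py` > `config.py`) can be sketched as a layered lookup. This is only an illustration: the dictionaries `CONFIG_DEFAULTS` and `CONFIG_PRIVATE` and the function `read_single_conf` are hypothetical stand-ins, not the project's actual loader.

```python
import os

# Hypothetical stand-ins for the contents of config.py and config_private.py.
CONFIG_DEFAULTS = {"API_KEY": "default-key", "WEB_PORT": 50923}
CONFIG_PRIVATE = {"API_KEY": "private-key"}


def read_single_conf(name, environ=None):
    """Return the value of `name`, honoring: environment > private > defaults."""
    environ = os.environ if environ is None else environ
    if name in environ:
        raw = environ[name]
        default = CONFIG_DEFAULTS.get(name)
        # Environment values are strings; cast them to the type of the default.
        # bool is checked first because bool is a subclass of int.
        if isinstance(default, bool):
            return raw.strip().lower() in ("1", "true", "yes")
        if isinstance(default, int):
            return int(raw)
        return raw
    if name in CONFIG_PRIVATE:
        return CONFIG_PRIVATE[name]
    return CONFIG_DEFAULTS[name]
```

Passing `environ` explicitly keeps the lookup testable without touching the real process environment.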
--- a/check_proxy.py
+++ b/check_proxy.py
@@ -155,11 +155,13 @@ def auto_update(raise_error=False):
 def warm_up_modules():
    print('正在执行一些模块的预热...')
    from toolbox import ProxyNetworkActivate
    from request_llm.bridge_all import model_info
-    enc = model_info["gpt-3.5-turbo"]['tokenizer']
+    with ProxyNetworkActivate("Warmup_Modules"):
-    enc.encode("模块预热", disallowed_special=())
+        enc = model_info["gpt-3.5-turbo"]['tokenizer']
-    enc = model_info["gpt-4"]['tokenizer']
+        enc.encode("模块预热", disallowed_special=())
-    enc.encode("模块预热", disallowed_special=())
+        enc = model_info["gpt-4"]['tokenizer']
        enc.encode("模块预热", disallowed_special=())
 if __name__ == '__main__':
    import os
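
The `with ProxyNetworkActivate("Warmup_Modules"):` pattern introduced above gates proxy use by a task name that must appear in `WHEN_TO_USE_PROXY`. The real implementation lives in `toolbox` and is not shown in this diff, so the following is only a minimal sketch of how such a gate could work. The function name `proxy_network_activate`, the `PROXIES` value, and the use of the `HTTP_PROXY`/`HTTPS_PROXY` environment variables are assumptions.

```python
import os
from contextlib import contextmanager

# Assumed values for illustration; the real ones are defined in config.py.
WHEN_TO_USE_PROXY = ["Download_LLM", "Download_Gradio_Theme", "Connect_Grobid", "Warmup_Modules"]
PROXIES = {"http": "socks5h://localhost:11284", "https": "socks5h://localhost:11284"}


@contextmanager
def proxy_network_activate(task=None):
    """Temporarily export proxy variables, but only for whitelisted tasks."""
    enabled = task in WHEN_TO_USE_PROXY and bool(PROXIES)
    saved = {key: os.environ.get(key) for key in ("HTTP_PROXY", "HTTPS_PROXY")}
    if enabled:
        os.environ["HTTP_PROXY"] = PROXIES["http"]
        os.environ["HTTPS_PROXY"] = PROXIES["https"]
    try:
        yield enabled
    finally:
        # Restore whatever was set before entering the block.
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
```

Restoring the previous values in `finally` means an exception inside the block cannot leave the proxy switched on for unrelated requests.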
--- a/config.py
+++ b/config.py
@@ -46,7 +46,7 @@ DEFAULT_WORKER_NUM = 3
 # Color theme; options: ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast"]
 # For more themes, see the Gradio theme gallery: https://huggingface.co/spaces/gradio/theme-gallery (options include ["Gstaff/Xkcd", "NoCrypt/Miku", ...])
 THEME = "Default"
-
+AVAIL_THEMES = ["Default", "Chuanhu-Small-and-Beautiful", "High-Contrast", "Gstaff/Xkcd", "NoCrypt/Miku"]
 # Height of the chat window (only takes effect when LAYOUT="TOP-DOWN")
 CHATBOT_HEIGHT = 1115
@@ -74,13 +74,13 @@ MAX_RETRY = 2
 # Default plugin category selection
-DEFAULT_FN_GROUPS = ['对话', '编程', '学术']
+DEFAULT_FN_GROUPS = ['对话', '编程', '学术', '智能体']
 # Model selection (note: LLM_MODEL is the model selected by default, and it *must* be included in the AVAIL_LLM_MODELS list)
 LLM_MODEL = "gpt-3.5-turbo" # options ↓↓↓
 AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt-3.5", "api2d-gpt-3.5-turbo", 
-                    "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
+                    "gpt-4", "gpt-4-32k", "azure-gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "stack-claude"]
 # P.S. Other available models include ["qianfan", "llama2", "qwen", "gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", 
 # "spark", "sparkv2", "chatglm_onnx", "claude-1-100k", "claude-2", "internlm", "jittorllms_pangualpha", "jittorllms_llama"]
@@ -183,6 +183,9 @@ ALLOW_RESET_CONFIG = False
 PATH_PRIVATE_UPLOAD = "private_upload"
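 # Location of the log folder; do not modify
 PATH_LOGGING = "gpt_log"
 # Situations other than connecting to OpenAI in which the proxy may be used; do not modify
 WHEN_TO_USE_PROXY = ["Download_LLM", "Download_Gradio_Theme", "Connect_Grobid", "Warmup_Modules"]
 """
 Diagram of the configuration relationships among online LLMs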
--- a/core_functional.py
+++ b/core_functional.py
@@ -11,7 +11,8 @@ def get_core_functions():
            # Prefix: prepended to your input. Use it to describe your request, for example translating, explaining code, or polishing
            "Prefix":   r"Below is a paragraph from an academic paper. Polish the writing to meet the academic style, " +
                        r"improve the spelling, grammar, clarity, concision and overall readability. When necessary, rewrite the whole sentence. " +
-                        r"Furthermore, list all modification and explain the reasons to do so in markdown table." + "\n\n",
+                        r"Firstly, you should provide the polished paragraph. "
                        r"Secondly, you should list all your modifications and explain the reasons for them in a markdown table." + "\n\n",
            # Suffix: appended to your input. For example, together with the prefix it can wrap your input in quotation marks
            "Suffix":   r"",
            # Button color (default: secondary)
@@ -27,17 +28,18 @@ def get_core_functions():
            "Suffix":   r"",
        },
        "查找语法错误": {
-            "Prefix":   r"Can you help me ensure that the grammar and the spelling is correct? " +
+            "Prefix":   r"Help me ensure that the grammar and the spelling are correct. "
-                        r"Do not try to polish the text, if no mistake is found, tell me that this paragraph is good." +
+                        r"Do not try to polish the text, if no mistake is found, tell me that this paragraph is good. "
-                        r"If you find grammar or spelling mistakes, please list mistakes you find in a two-column markdown table, " +
+                        r"If you find grammar or spelling mistakes, please list mistakes you find in a two-column markdown table, "
-                        r"put the original text the first column, " +
+                        r"put the original text the first column, "
-                        r"put the corrected text in the second column and highlight the key words you fixed.""\n"
+                        r"put the corrected text in the second column and highlight the key words you fixed. "
                        r"Finally, please provide the proofread text.""\n\n"
                        r"Example:""\n"
                        r"Paragraph: How is you? Do you knows what is it?""\n"
                        r"| Original sentence | Corrected sentence |""\n"
                        r"| :--- | :--- |""\n"
                        r"| How **is** you? | How **are** you? |""\n"
-                        r"| Do you **knows** what **is** **it**? | Do you **know** what **it** **is** ? |""\n"
+                        r"| Do you **knows** what **is** **it**? | Do you **know** what **it** **is** ? |""\n\n"
                        r"Below is a paragraph from an academic paper. "
                        r"You need to report all grammar and spelling mistakes as the example before."
                        + "\n\n",
--- a/crazy_functional.py
+++ b/crazy_functional.py
@@ -6,6 +6,7 @@ def get_crazy_functions():
    from crazy_functions.生成函数注释 import 批量生成函数注释
    from crazy_functions.解析项目源代码 import 解析项目本身
    from crazy_functions.解析项目源代码 import 解析一个Python项目
    from crazy_functions.解析项目源代码 import 解析一个Matlab项目
    from crazy_functions.解析项目源代码 import 解析一个C项目的头文件
    from crazy_functions.解析项目源代码 import 解析一个C项目
    from crazy_functions.解析项目源代码 import 解析一个Golang项目
@@ -38,7 +39,7 @@ def get_crazy_functions():
    function_plugins = {
        "虚空终端": {
-            "Group": "对话|编程|学术",
+            "Group": "对话|编程|学术|智能体",
            "Color": "stop",
            "AsButton": True,
            "Function": HotReload(虚空终端)
@@ -77,6 +78,13 @@ def get_crazy_functions():
            "Info": "批量总结word文档 | 输入参数为路径",
            "Function": HotReload(总结word文档)
        },
        "解析整个Matlab项目": {
            "Group": "编程",
            "Color": "stop",
            "AsButton": False,
            "Info": "解析一个Matlab项目的所有源文件(.m) | 输入参数为路径",
            "Function": HotReload(解析一个Matlab项目)
        },
        "解析整个C++项目头文件": {
            "Group": "编程",
            "Color": "stop",
@@ -243,20 +251,23 @@ def get_crazy_functions():
            "Info": "对中文Latex项目全文进行润色处理 | 输入参数为路径或上传压缩包",
            "Function": HotReload(Latex中文润色)
        },
-        "Latex项目全文中译英（输入路径或上传压缩包）": {
+
-            "Group": "学术",
+        # Superseded by the new plugin
-            "Color": "stop",
+        # "Latex项目全文中译英（输入路径或上传压缩包）": {
-            "AsButton": False,  # 加入下拉菜单中
+        #     "Group": "学术",
-            "Info": "对Latex项目全文进行中译英处理 | 输入参数为路径或上传压缩包",
+        #     "Color": "stop",
-            "Function": HotReload(Latex中译英)
+        #     "AsButton": False,  # 加入下拉菜单中
-        },
+        #     "Info": "对Latex项目全文进行中译英处理 | 输入参数为路径或上传压缩包",
-        "Latex项目全文英译中（输入路径或上传压缩包）": {
+        #     "Function": HotReload(Latex中译英)
-            "Group": "学术",
+        # },
-            "Color": "stop",
+        # "Latex项目全文英译中（输入路径或上传压缩包）": {
-            "AsButton": False,  # 加入下拉菜单中
+        #     "Group": "学术",
-            "Info": "对Latex项目全文进行英译中处理 | 输入参数为路径或上传压缩包",
+        #     "Color": "stop",
-            "Function": HotReload(Latex英译中)
+        #     "AsButton": False,  # 加入下拉菜单中
-        },
+        #     "Info": "对Latex项目全文进行英译中处理 | 输入参数为路径或上传压缩包",
        #     "Function": HotReload(Latex英译中)
        # },
        "批量Markdown中译英（输入路径或上传压缩包）": {
            "Group": "编程",
            "Color": "stop",
@@ -513,6 +524,18 @@ def get_crazy_functions():
    except:
        print('Load function plugin failed')
    try:
        from crazy_functions.函数动态生成 import 函数动态生成
        function_plugins.update({
            "动态代码解释器（CodeInterpreter）": {
                "Group": "智能体",
                "Color": "stop",
                "AsButton": False,
                "Function": HotReload(函数动态生成)
            }
        })
    except:
        print('Load function plugin failed')
    # try:
    #     from crazy_functions.CodeInterpreter import 虚空终端CodeInterpreter
--- a/crazy_functions/Langchain知识库.py
+++ b/crazy_functions/Langchain知识库.py
@@ -53,14 +53,14 @@ def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    print('Checking Text2vec ...')
    from langchain.embeddings.huggingface import HuggingFaceEmbeddings
-    with ProxyNetworkActivate():    # temporarily activate the proxy network
+    with ProxyNetworkActivate('Download_LLM'):    # temporarily activate the proxy network
        HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
    # < ------------------- build the knowledge base --------------- >
    chatbot.append(['<br/>'.join(file_manifest), "正在构建知识库..."])
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    print('Establishing knowledge archive ...')
-    with ProxyNetworkActivate():    # temporarily activate the proxy network
+    with ProxyNetworkActivate('Download_LLM'):    # temporarily activate the proxy network
        kai = knowledge_archive_interface()
        kai.feed_archive(file_manifest=file_manifest, id=kai_id)
    kai_files = kai.get_loaded_file()
--- a/crazy_functions/Latex输出PDF结果.py
+++ b/crazy_functions/Latex输出PDF结果.py
@@ -79,7 +79,7 @@ def move_project(project_folder, arxiv_id=None):
    shutil.copytree(src=project_folder, dst=new_workfolder)
    return new_workfolder
-def arxiv_download(chatbot, history, txt):
+def arxiv_download(chatbot, history, txt, allow_cache=True):
    def check_cached_translation_pdf(arxiv_id):
        translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'translation')
        if not os.path.exists(translation_dir):
@@ -116,7 +116,7 @@ def arxiv_download(chatbot, history, txt):
    arxiv_id = url_.split('/abs/')[-1]
    if 'v' in arxiv_id: arxiv_id = arxiv_id[:10]
    cached_translation_pdf = check_cached_translation_pdf(arxiv_id)
-    if cached_translation_pdf: return cached_translation_pdf, arxiv_id
+    if cached_translation_pdf and allow_cache: return cached_translation_pdf, arxiv_id
    url_tar = url_.replace('/abs/', '/e-print/')
    translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'e-print')
@@ -228,6 +228,9 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
    # <-------------- more requirements ------------->
    if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
    more_req = plugin_kwargs.get("advanced_arg", "")
    no_cache = more_req.startswith("--no-cache")
    if no_cache: more_req = more_req[len("--no-cache"):].strip()
    allow_cache = not no_cache
    _switch_prompt_ = partial(switch_prompt, more_requirement=more_req)
    # <-------------- check deps ------------->
@@ -244,7 +247,7 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
    # <-------------- clear history and read input ------------->
    history = []
-    txt, arxiv_id = yield from arxiv_download(chatbot, history, txt)
+    txt, arxiv_id = yield from arxiv_download(chatbot, history, txt, allow_cache)
    if txt.endswith('.pdf'):
        report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"发现已经存在翻译好的PDF文档")
        yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
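
When a literal flag such as `--no-cache` is peeled off the front of the plugin's advanced argument, two details matter: strings are immutable, so the stripped result has to be assigned, and `str.lstrip` takes a set of characters rather than a prefix. A minimal sketch of a prefix-safe split follows; the helper name `split_no_cache_flag` is hypothetical and not part of the project.

```python
def split_no_cache_flag(advanced_arg):
    """Return (allow_cache, remaining_text) for a plugin's advanced argument."""
    flag = "--no-cache"
    text = advanced_arg.strip()
    if text.startswith(flag):
        # Slice by length: str.lstrip(flag) would strip any run of the characters
        # '-', 'n', 'o', 'c', 'a', 'h', 'e' and could eat into the requirement text.
        return False, text[len(flag):].strip()
    return True, text
```

For example, `"--no-cachenone".lstrip("--no-cache")` returns an empty string, because every character of `none` belongs to the stripped set.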
--- a/crazy_functions/crazy_utils.py
+++ b/crazy_functions/crazy_utils.py
@@ -651,7 +651,7 @@ class knowledge_archive_interface():
            from toolbox import ProxyNetworkActivate
            print('Checking Text2vec ...')
            from langchain.embeddings.huggingface import HuggingFaceEmbeddings
-            with ProxyNetworkActivate():    # temporarily activate the proxy network
+            with ProxyNetworkActivate('Download_LLM'):    # temporarily activate the proxy network
                self.text2vec_large_chinese = HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
        return self.text2vec_large_chinese
@@ -807,3 +807,10 @@ class construct_html():
        with open(os.path.join(get_log_folder(), file_name), 'w', encoding='utf8') as f:
            f.write(self.html_string.encode('utf-8', 'ignore').decode())
        return os.path.join(get_log_folder(), file_name)
 def get_plugin_arg(plugin_kwargs, key, default):
    # If the argument is an empty string, treat it as not provided
    if (key in plugin_kwargs) and (plugin_kwargs[key] == ""): plugin_kwargs.pop(key)
    # Normal case
    return plugin_kwargs.get(key, default)
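
A short usage sketch of `get_plugin_arg` as added above. The function body is copied from the hunk so that the example runs on its own; the `kwargs` dictionary and the `"fallback"` default are made up for illustration.

```python
def get_plugin_arg(plugin_kwargs, key, default):
    # Copied from the hunk above: an empty string counts as "not provided".
    if (key in plugin_kwargs) and (plugin_kwargs[key] == ""): plugin_kwargs.pop(key)
    return plugin_kwargs.get(key, default)


# An empty advanced argument falls back to the default ...
kwargs = {"advanced_arg": ""}
value = get_plugin_arg(kwargs, "advanced_arg", default="fallback")

# ... while a non-empty one is returned unchanged.
explicit = get_plugin_arg({"advanced_arg": "--no-cache"}, "advanced_arg", default="fallback")
```

Note the side effect: the empty entry is removed from the caller's dictionary, so later lookups with `in` will not see the key.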
--- a/crazy_functions/gen_fns/gen_fns_shared.py
+++ b/crazy_functions/gen_fns/gen_fns_shared.py
@@ -0,0 +1,70 @@
 import time
 import importlib
 from toolbox import trimmed_format_exc, gen_time_str, get_log_folder
 from toolbox import CatchException, update_ui, gen_time_str, trimmed_format_exc, is_the_upload_folder
 from toolbox import promote_file_to_downloadzone, get_log_folder, update_ui_lastest_msg
 import multiprocessing
 def get_class_name(class_string):
    import re
    # Use regex to extract the class name
    class_name = re.search(r'class (\w+)\(', class_string).group(1)
    return class_name
 def try_make_module(code, chatbot):
    module_file = 'gpt_fn_' + gen_time_str().replace('-','_')
    fn_path = f'{get_log_folder(plugin_name="gen_plugin_verify")}/{module_file}.py'
    with open(fn_path, 'w', encoding='utf8') as f: f.write(code)
    promote_file_to_downloadzone(fn_path, chatbot=chatbot)
    class_name = get_class_name(code)
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    p = multiprocessing.Process(target=is_function_successfully_generated, args=(fn_path, class_name, return_dict))
    # only has 10 seconds to run
    p.start(); p.join(timeout=10)
    if p.is_alive(): p.terminate(); p.join()
    p.close()
    return return_dict["success"], return_dict['traceback']
 # check is_function_successfully_generated
 def is_function_successfully_generated(fn_path, class_name, return_dict):
    return_dict['success'] = False
    return_dict['traceback'] = ""
    try:
        # Create a spec for the module
        module_spec = importlib.util.spec_from_file_location('example_module', fn_path)
        # Load the module
        example_module = importlib.util.module_from_spec(module_spec)
        module_spec.loader.exec_module(example_module)
        # Now you can use the module
        some_class = getattr(example_module, class_name)
        # Now you can create an instance of the class
        instance = some_class()
        return_dict['success'] = True
        return 
    except:
        return_dict['traceback'] = trimmed_format_exc()
        return
 def subprocess_worker(code, file_path, return_dict):
    return_dict['result'] = None
    return_dict['success'] = False
    return_dict['traceback'] = ""
    try:
        module_file = 'gpt_fn_' + gen_time_str().replace('-','_')
        fn_path = f'{get_log_folder(plugin_name="gen_plugin_run")}/{module_file}.py'
        with open(fn_path, 'w', encoding='utf8') as f: f.write(code)
        class_name = get_class_name(code)
        # Create a spec for the module
        module_spec = importlib.util.spec_from_file_location('example_module', fn_path)
        # Load the module
        example_module = importlib.util.module_from_spec(module_spec)
        module_spec.loader.exec_module(example_module)
        # Now you can use the module
        some_class = getattr(example_module, class_name)
        # Now you can create an instance of the class
        instance = some_class()
        return_dict['result'] = instance.run(file_path)
        return_dict['success'] = True
    except:
        return_dict['traceback'] = trimmed_format_exc()
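
The regex in `get_class_name` above requires an opening parenthesis after the class name, so generated code that declares `class TerminalFunction:` without a base class would make `re.search` return `None` and the `.group(1)` call fail. A more tolerant variant could look like the sketch below; the name `get_class_name_tolerant` is hypothetical.

```python
import re


def get_class_name_tolerant(class_string):
    """Return the first class name found in `class_string`, or None if there is none."""
    # `\s*[(:]` accepts both `class A(Base):` and `class A:`.
    match = re.search(r'class\s+(\w+)\s*[(:]', class_string)
    return match.group(1) if match else None
```

Returning `None` instead of raising lets the caller decide whether to ask the model for a corrected answer.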
--- a/crazy_functions/pdf_fns/parse_pdf.py
+++ b/crazy_functions/pdf_fns/parse_pdf.py
@@ -1,16 +1,26 @@
 from functools import lru_cache
 from toolbox import gen_time_str
 from toolbox import promote_file_to_downloadzone
 from toolbox import write_history_to_file, promote_file_to_downloadzone
 from toolbox import get_conf
 from toolbox import ProxyNetworkActivate
 from colorful import *
 import requests
 import random
-from functools import lru_cache
+import copy
 import os
 import math
 class GROBID_OFFLINE_EXCEPTION(Exception): pass
 def get_avail_grobid_url():
    from toolbox import get_conf
    GROBID_URLS, = get_conf('GROBID_URLS')
    if len(GROBID_URLS) == 0: return None
    try:
        _grobid_url = random.choice(GROBID_URLS) # random load balancing
        if _grobid_url.endswith('/'): _grobid_url = _grobid_url.rstrip('/')
-        res = requests.get(_grobid_url+'/api/isalive')
+        with ProxyNetworkActivate('Connect_Grobid'):
            res = requests.get(_grobid_url+'/api/isalive')
        if res.text=='true': return _grobid_url
        else: return None
    except:
@@ -21,10 +31,141 @@ def parse_pdf(pdf_path, grobid_url):
    import scipdf   # pip install scipdf_parser
    if grobid_url.endswith('/'): grobid_url = grobid_url.rstrip('/')
    try:
-        article_dict = scipdf.parse_pdf_to_dict(pdf_path, grobid_url=grobid_url)
+        with ProxyNetworkActivate('Connect_Grobid'):
            article_dict = scipdf.parse_pdf_to_dict(pdf_path, grobid_url=grobid_url)
    except GROBID_OFFLINE_EXCEPTION:
        raise GROBID_OFFLINE_EXCEPTION("GROBID服务不可用，请修改config中的GROBID_URL，可修改成本地GROBID服务。")
    except:
        raise RuntimeError("解析PDF失败，请检查PDF是否损坏。")
    return article_dict
 def produce_report_markdown(gpt_response_collection, meta, paper_meta_info, chatbot, fp, generated_conclusion_files):
    # -=-=-=-=-=-=-=-= Write file 1: original and translation interleaved -=-=-=-=-=-=-=-=
    res_path = write_history_to_file(meta +  ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=f"{gen_time_str()}translated_and_original.md", file_fullname=None)
    promote_file_to_downloadzone(res_path, rename_file=os.path.basename(res_path)+'.md', chatbot=chatbot)
    generated_conclusion_files.append(res_path)
    # -=-=-=-=-=-=-=-= Write file 2: translated text only -=-=-=-=-=-=-=-=
    translated_res_array = []
    # Track the current top-level section title:
    last_section_name = ""
    for index, value in enumerate(gpt_response_collection):
        # Pick the odd indices, which hold the GPT replies (even indices hold the inputs):
        if index % 2 != 0:
            # First extract the current English heading:
            cur_section_name = gpt_response_collection[index-1].split('\n')[0].split(" Part")[0]
            # When index is 1, the first section name is used directly:
            if cur_section_name != last_section_name:
                cur_value = cur_section_name + '\n'
                last_section_name = copy.deepcopy(cur_section_name)
            else:
                cur_value = ""
            # One more small adjustment: rewrite the heading of the current part, using the English one by default
            cur_value += value
            translated_res_array.append(cur_value)
    res_path = write_history_to_file(meta +  ["# Meta Translation" , paper_meta_info] + translated_res_array, 
                                     file_basename = f"{gen_time_str()}-translated_only.md", 
                                     file_fullname = None,
                                     auto_caption = False)
    promote_file_to_downloadzone(res_path, rename_file=os.path.basename(res_path)+'.md', chatbot=chatbot)
    generated_conclusion_files.append(res_path)
    return res_path
 def translate_pdf(article_dict, llm_kwargs, chatbot, fp, generated_conclusion_files, TOKEN_LIMIT_PER_FRAGMENT, DST_LANG):
    from crazy_functions.crazy_utils import construct_html
    from crazy_functions.crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
    from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
    from crazy_functions.crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
    prompt = "以下是一篇学术论文的基本信息:\n"
    # title
    title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
    # authors
    authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
    # abstract
    abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
    # command
    prompt += f"请将题目和摘要翻译为{DST_LANG}。"
    meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
    # Single thread: fetch the paper's meta information
    paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
        inputs=prompt,
        inputs_show_user=prompt,
        llm_kwargs=llm_kwargs,
        chatbot=chatbot, history=[],
        sys_prompt="You are an academic paper reader.",
    )
    # Multiple threads: translate
    inputs_array = []
    inputs_show_user_array = []
    # get_token_num
    from request_llm.bridge_all import model_info
    enc = model_info[llm_kwargs['llm_model']]['tokenizer']
    def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
    def break_down(txt):
        raw_token_num = get_token_num(txt)
        if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
            return [txt]
        else:
            # raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
            # find a smooth token limit to achieve even separation
            count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
            token_limit_smooth = raw_token_num // count + count
            return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
    for section in article_dict.get('sections'):
        if len(section['text']) == 0: continue
        section_frags = break_down(section['text'])
        for i, fragment in enumerate(section_frags):
            heading = section['heading']
            if len(section_frags) > 1: heading += f' Part-{i+1}'
            inputs_array.append(
                f"你需要翻译{heading}章节，内容如下: \n\n{fragment}"
            )
            inputs_show_user_array.append(
                f"# {heading}\n\n{fragment}"
            )
    gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
        inputs_array=inputs_array,
        inputs_show_user_array=inputs_show_user_array,
        llm_kwargs=llm_kwargs,
        chatbot=chatbot,
        history_array=[meta for _ in inputs_array],
        sys_prompt_array=[
            "请你作为一个学术翻译，负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
    )
    # -=-=-=-=-=-=-=-= Write the Markdown file -=-=-=-=-=-=-=-=
    produce_report_markdown(gpt_response_collection, meta, paper_meta_info, chatbot, fp, generated_conclusion_files)
    # -=-=-=-=-=-=-=-= Write the HTML file -=-=-=-=-=-=-=-=
    ch = construct_html() 
    orig = ""
    trans = ""
    gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
    for i,k in enumerate(gpt_response_collection_html): 
        if i%2==0:
            gpt_response_collection_html[i] = inputs_show_user_array[i//2]
        else:
            # First extract the current English heading:
            cur_section_name = gpt_response_collection[i-1].split('\n')[0].split(" Part")[0]
            cur_value = cur_section_name + "\n" + gpt_response_collection_html[i]
            gpt_response_collection_html[i] = cur_value
    final = ["", "", "一、论文概况",  "", "Abstract", paper_meta_info,  "二、论文翻译",  ""]
    final.extend(gpt_response_collection_html)
    for i, k in enumerate(final): 
        if i%2==0:
            orig = k
        if i%2==1:
            trans = k
            ch.add_row(a=orig, b=trans)
    create_report_file_name = f"{os.path.basename(fp)}.trans.html"
    html_file = ch.save_file(create_report_file_name)
    generated_conclusion_files.append(html_file)
    promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)
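
The "smooth token limit" computed in `break_down` above spreads an over-long section across fragments of similar size instead of cutting full-size fragments and leaving a short tail. The arithmetic can be checked in isolation with the sketch below. The helper name `smooth_token_limit` is hypothetical, and the reading of the `+ count` term as rounding slack is an interpretation rather than something the source states.

```python
import math


def smooth_token_limit(raw_token_num, limit_per_fragment):
    """Return the per-fragment limit that would be used for a text of `raw_token_num` tokens."""
    if raw_token_num <= limit_per_fragment:
        # The text fits in one fragment; the configured limit applies unchanged.
        return limit_per_fragment
    # Number of fragments needed at the configured limit.
    count = int(math.ceil(raw_token_num / limit_per_fragment))
    # Even share per fragment, plus `count` tokens of slack (assumed purpose: absorb rounding).
    return raw_token_num // count + count
```

With 2500 tokens and a limit of 1024, three fragments are needed, so the smoothed limit is 2500 // 3 + 3 = 836 tokens per fragment rather than 1024, 1024, and a 452-token remainder.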
--- a/crazy_functions/函数动态生成.py
+++ b/crazy_functions/函数动态生成.py
@@ -0,0 +1,252 @@
 # In this source file, ⭐ marks a key step
 """
 Testing: 
    - Crop the image, keeping the bottom half. 
    - Swap the blue channel and red channel of the image. 
    - Convert the image to grayscale. 
    - Convert the CSV file to an Excel spreadsheet.
 """
 from toolbox import CatchException, update_ui, gen_time_str, trimmed_format_exc, is_the_upload_folder
 from toolbox import promote_file_to_downloadzone, get_log_folder, update_ui_lastest_msg
 from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive, get_plugin_arg
 from .crazy_utils import input_clipping, try_install_deps
 from crazy_functions.gen_fns.gen_fns_shared import is_function_successfully_generated
 from crazy_functions.gen_fns.gen_fns_shared import get_class_name
 from crazy_functions.gen_fns.gen_fns_shared import subprocess_worker
 from crazy_functions.gen_fns.gen_fns_shared import try_make_module
 import os
 import time
 import glob
 import multiprocessing
 templete = """
 ```python
 import ...  # Put dependencies here, e.g. import numpy as np. 
 class TerminalFunction(object): # Do not change the name of the class, The name of the class must be `TerminalFunction`
    def run(self, path):    # The name of the function must be `run`, it takes only a positional argument.
        # rewrite the function you have just written here 
        ...
        return generated_file_path
 ```
 """
 def inspect_dependency(chatbot, history):
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    return True
 def get_code_block(reply):
    import re
    pattern = r"```([\s\S]*?)```" # regex pattern to match code blocks
    matches = re.findall(pattern, reply) # find all code blocks in text
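    # str.strip('python') strips a set of characters from both ends, not a prefix,
    # so remove only the leading language tag (str.removeprefix needs Python 3.9+)
    if len(matches) == 1: 
        return matches[0].removeprefix('python') #  code block
    for match in matches:
        if 'class TerminalFunction' in match:
            return match.removeprefix('python') #  code block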
    raise RuntimeError("GPT is not generating proper code.")
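
One way to extract the code body without the language tag leaking into it is to capture the tag in its own regex group. The sketch below assumes replies that use triple-backtick fences followed by a newline; the name `extract_code_block` is hypothetical, and the fence is built from `FENCE` only to keep literal backtick runs out of the example.

```python
import re

FENCE = "`" * 3  # a markdown code fence, built programmatically


def extract_code_block(reply):
    """Return the body of the fenced block that defines TerminalFunction."""
    # Group 1 is the optional language tag, group 2 is the code body.
    pattern = FENCE + r"[ \t]*(\w*)[ \t]*\n([\s\S]*?)" + FENCE
    blocks = re.findall(pattern, reply)
    if len(blocks) == 1:
        return blocks[0][1]
    for _lang, body in blocks:
        if "class TerminalFunction" in body:
            return body
    raise RuntimeError("No suitable code block found in the reply.")
```

Because the tag is matched separately, code that happens to begin or end with the letters of `python` is returned intact.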
 def gpt_interact_multi_step(txt, file_type, llm_kwargs, chatbot, history):
    # Input
    prompt_compose = [
        f'Your job:\n'
        f'1. write a single Python function, which takes a path of a `{file_type}` file as the only argument and returns a `string` containing the result of analysis or the path of generated files. \n',
        f"2. You should write this function to perform following task: " + txt + "\n",
        f"3. Wrap the output python function with markdown codeblock."
    ]
    i_say = "".join(prompt_compose)
    demo = []
    # Step 1
    gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
        inputs=i_say, inputs_show_user=i_say, 
        llm_kwargs=llm_kwargs, chatbot=chatbot, history=demo, 
        sys_prompt= r"You are a world-class programmer."
    )
    history.extend([i_say, gpt_say])
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    # Step 2
    prompt_compose = [
        "If the previous stage was successful, rewrite the function you have just written to satisfy the following template: \n",
        templete
    ]
    i_say = "".join(prompt_compose); inputs_show_user = "If the previous stage was successful, rewrite the function you have just written to satisfy the executable template. "
    gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
        inputs=i_say, inputs_show_user=inputs_show_user, 
        llm_kwargs=llm_kwargs, chatbot=chatbot, history=history, 
        sys_prompt= r"You are a programmer. You need to replace `...` with valid packages, do not give `...` in your answer!"
    )
    code_to_return = gpt_say
    history.extend([i_say, gpt_say])
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    # # 第三步
    # i_say = "Please list to packages to install to run the code above. Then show me how to use `try_install_deps` function to install them."
    # i_say += 'For instance. `try_install_deps(["opencv-python", "scipy", "numpy"])`'
    # installation_advance = yield from request_gpt_model_in_new_thread_with_ui_alive(
    #     inputs=i_say, inputs_show_user=inputs_show_user, 
    #     llm_kwargs=llm_kwargs, chatbot=chatbot, history=history, 
    #     sys_prompt= r"You are a programmer."
    # )
    # # # 第三步  
    # i_say = "Show me how to use `pip` to install packages to run the code above. "
    # i_say += 'For instance. `pip install -r opencv-python scipy numpy`'
    # installation_advance = yield from request_gpt_model_in_new_thread_with_ui_alive(
    #     inputs=i_say, inputs_show_user=i_say, 
    #     llm_kwargs=llm_kwargs, chatbot=chatbot, history=history, 
    #     sys_prompt= r"You are a programmer."
    # )
    installation_advance = ""
    return code_to_return, installation_advance, txt, file_type, llm_kwargs, chatbot, history
 def for_immediate_show_off_when_possible(file_type, fp, chatbot):
    if file_type in ['png', 'jpg']:
        image_path = os.path.abspath(fp)
        chatbot.append(['这是一张图片, 展示如下:',  
            f'本地文件地址: <br/>`{image_path}`<br/>'+
            f'本地文件预览: <br/><div align="center"><img src="file={image_path}"></div>'
        ])
    return chatbot
 def have_any_recent_upload_files(chatbot):
    _5min = 5 * 60
    if not chatbot: return False    # chatbot is None
    most_recent_uploaded = chatbot._cookies.get("most_recent_uploaded", None)
    if not most_recent_uploaded: return False   # most_recent_uploaded is None
    if time.time() - most_recent_uploaded["time"] < _5min: return True # most_recent_uploaded is new
    else: return False  # most_recent_uploaded is too old
 def get_recent_file_prompt_support(chatbot):
    most_recent_uploaded = chatbot._cookies.get("most_recent_uploaded", None)
    path = most_recent_uploaded['path']
    return path
@CatchException
 def 函数动态生成(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
    """
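    txt             Text the user typed into the input box, e.g. a passage to translate, or a path that contains the files to process
    llm_kwargs      GPT model parameters such as temperature and top_p; usually passed through unchanged
    plugin_kwargs   Plugin parameters (this function reads `file_path_arg` from it)
    chatbot         Handle of the chat display box, used to show output to the user
    history         Chat history (prior context)
    system_prompt   Silent system prompt given to GPT
    web_port        Port the application is currently running on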
    """
    # Clear the history
    history = []
    # Basic info: function and contributors
    chatbot.append(["正在启动: 插件动态生成插件", "插件动态生成, 执行开始, 作者Binary-Husky."])
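    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    # ⭐ Check whether the file upload area contains anything
    # 1. If there are files: pass them as the function argument
    # 2. If there are no files: the argument would have to be extracted with GPT (not implemented yet; the 虚空终端 plugin already implements similar code)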
    file_list = []
    if get_plugin_arg(plugin_kwargs, key="file_path_arg", default=False):
        file_path = get_plugin_arg(plugin_kwargs, key="file_path_arg", default=None)
        file_list.append(file_path)
        yield from update_ui_lastest_msg(f"当前文件: {file_path}", chatbot, history, 1)
    elif have_any_recent_upload_files(chatbot):
        file_dir = get_recent_file_prompt_support(chatbot)
        file_list = glob.glob(os.path.join(file_dir, '**/*'), recursive=True)
        yield from update_ui_lastest_msg(f"当前文件处理列表: {file_list}", chatbot, history, 1)
    else:
        chatbot.append(["文件检索", "没有发现任何近期上传的文件。"])
        yield from update_ui_lastest_msg("没有发现任何近期上传的文件。", chatbot, history, 1)
        return  # 2. 如果没有文件
    if len(file_list) == 0:
        chatbot.append(["文件检索", "没有发现任何近期上传的文件。"])
        yield from update_ui_lastest_msg("没有发现任何近期上传的文件。", chatbot, history, 1)
        return  # 2. 如果没有文件
    # Determine the file type from the first file's extension
    file_type = file_list[0].split('.')[-1]
    # Sanity check: the input box still holds the upload folder path instead of a requirement
    if is_the_upload_folder(txt):
        yield from update_ui_lastest_msg(f"请在输入框内填写需求, 然后再次点击该插件! 至于您的文件，不用担心, 文件路径 {txt} 已经被记忆. ", chatbot, history, 1)
        return
    # Main work starts here
    MAX_TRY = 3
    for j in range(MAX_TRY):  # retry at most MAX_TRY times
        traceback = ""
        try:
            # ⭐ Start code generation
            code, installation_advance, txt, file_type, llm_kwargs, chatbot, history = \
                yield from gpt_interact_multi_step(txt, file_type, llm_kwargs, chatbot, history)
            chatbot.append(["代码生成阶段结束", ""])
            yield from update_ui_lastest_msg(f"正在验证上述代码的有效性 ...", chatbot, history, 1)
            # ⭐ Extract the code block
            code = get_code_block(code)
            # ⭐ Check that the code can be loaded as a module
            ok, traceback = try_make_module(code, chatbot)
            # Code generation succeeded
            if ok: break
        except Exception as e:
            if not traceback: traceback = trimmed_format_exc()
        # Handle the failure
        if not traceback: traceback = trimmed_format_exc()
        yield from update_ui_lastest_msg(f"第 {j+1}/{MAX_TRY} 次代码生成尝试, 失败了~ 别担心, 我们5秒后再试一次... \n\n此次我们的错误追踪是\n```\n{traceback}\n```\n", chatbot, history, 5)
    # Code generation finished; start execution
    TIME_LIMIT = 15
    yield from update_ui_lastest_msg(f"开始创建新进程并执行代码! 时间限制 {TIME_LIMIT} 秒. 请等待任务完成... ", chatbot, history, 1)
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    # ⭐ Final step: process the files one by one
    for file_path in file_list:
        if os.path.exists(file_path):
            chatbot.append([f"正在处理文件: {file_path}", f"请稍等..."])
            chatbot = for_immediate_show_off_when_possible(file_type, file_path, chatbot)
            yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
        else:
            continue
        # ⭐⭐⭐ subprocess_worker ⭐⭐⭐
        p = multiprocessing.Process(target=subprocess_worker, args=(code, file_path, return_dict))
        # ⭐ Start execution, limited to TIME_LIMIT seconds
        p.start(); p.join(timeout=TIME_LIMIT)
        if p.is_alive(): p.terminate(); p.join()
        p.close()
        res = return_dict['result']
        success = return_dict['success']
        traceback = return_dict['traceback']
        if not success:
            if not traceback: traceback = trimmed_format_exc()
            chatbot.append(["执行失败了", f"错误追踪\n```\n{traceback}\n```\n"])
            # chatbot.append(["如果是缺乏依赖，请参考以下建议", installation_advance])
            yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
            return
        # Finished successfully; wrap up
        res = str(res)
        if os.path.exists(res):
            chatbot.append(["执行成功了，结果是一个有效文件", "结果：" + res])
            new_file_path = promote_file_to_downloadzone(res, chatbot=chatbot)
            chatbot = for_immediate_show_off_when_possible(file_type, new_file_path, chatbot)
            yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
        else:
            chatbot.append(["执行成功了，结果是一个字符串", "结果：" + res])
            yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新   
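A note on `get_code_block` in this file: it returns `match.strip('python')`. `str.strip` treats its argument as a set of characters, not as a prefix, so it can also remove trailing characters of the generated code when the block does not end with a newline. The following is a minimal sketch of a prefix-only alternative; `strip_language_tag` is a hypothetical helper used for illustration and is not part of the repository.

```python
def strip_language_tag(block: str, tag: str = "python") -> str:
    """Remove a leading language tag from the body of a fenced code block.

    Hypothetical helper for illustration; only the prefix is removed.
    """
    if block.startswith(tag):
        return block[len(tag):]
    return block


body = "python\nimport json"            # fence body as captured between the backticks
print(repr(body.strip("python")))       # character-set strip also trims the trailing "on"
print(repr(strip_language_tag(body)))   # prefix-only removal keeps the code intact
```

In practice generated blocks usually end with a newline, which hides the problem, but the prefix-only form does not rely on that.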
--- a/crazy_functions/批量翻译PDF文档_NOUGAT.py
+++ b/crazy_functions/批量翻译PDF文档_NOUGAT.py
@@ -1,11 +1,12 @@
-from toolbox import CatchException, report_execption, gen_time_str
+from toolbox import CatchException, report_execption, get_log_folder, gen_time_str
 from toolbox import update_ui, promote_file_to_downloadzone, update_ui_lastest_msg, disable_auto_promotion
-from toolbox import write_history_to_file, get_log_folder
+from toolbox import write_history_to_file, promote_file_to_downloadzone
 from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
 from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
 from .crazy_utils import read_and_clean_pdf_text
-from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url
+from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url, translate_pdf
 from colorful import *
 import copy
 import os
 import math
 import logging
@@ -92,7 +93,7 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
 def 解析PDF_基于NOUGAT(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
    import copy
    import tiktoken
-    TOKEN_LIMIT_PER_FRAGMENT = 1280
+    TOKEN_LIMIT_PER_FRAGMENT = 1024
    generated_conclusion_files = []
    generated_html_files = []
    DST_LANG = "中文"
@@ -101,101 +102,12 @@ def 解析PDF_基于NOUGAT(file_manifest, project_folder, llm_kwargs, plugin_kwa
    for index, fp in enumerate(file_manifest):
        chatbot.append(["当前进度：", f"正在解析论文，请稍候。（第一次运行时，需要花费较长时间下载NOUGAT参数）"]); yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
        fpp = yield from nougat_handle.NOUGAT_parse_pdf(fp, chatbot, history)
-
+        promote_file_to_downloadzone(fpp, rename_file=os.path.basename(fpp)+'.nougat.mmd', chatbot=chatbot)
        with open(fpp, 'r', encoding='utf8') as f:
            article_content = f.readlines()
        article_dict = markdown_to_dict(article_content)
        logging.info(article_dict)
-
+        yield from translate_pdf(article_dict, llm_kwargs, chatbot, fp, generated_conclusion_files, TOKEN_LIMIT_PER_FRAGMENT, DST_LANG)
        prompt = "以下是一篇学术论文的基本信息:\n"
        # title
        title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
        # authors
        authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
        # abstract
        abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
        # command
        prompt += f"请将题目和摘要翻译为{DST_LANG}。"
        meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
        # 单线，获取文章meta信息
        paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
            inputs=prompt,
            inputs_show_user=prompt,
            llm_kwargs=llm_kwargs,
            chatbot=chatbot, history=[],
            sys_prompt="You are an academic paper reader。",
        )
        # 多线，翻译
        inputs_array = []
        inputs_show_user_array = []
        # get_token_num
        from request_llm.bridge_all import model_info
        enc = model_info[llm_kwargs['llm_model']]['tokenizer']
        def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
        from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
        def break_down(txt):
            raw_token_num = get_token_num(txt)
            if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
                return [txt]
            else:
                # raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
                # find a smooth token limit to achieve even separation
                count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
                token_limit_smooth = raw_token_num // count + count
                return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
        for section in article_dict.get('sections'):
            if len(section['text']) == 0: continue
            section_frags = break_down(section['text'])
            for i, fragment in enumerate(section_frags):
                heading = section['heading']
                if len(section_frags) > 1: heading += f' Part-{i+1}'
                inputs_array.append(
                    f"你需要翻译{heading}章节，内容如下: \n\n{fragment}"
                )
                inputs_show_user_array.append(
                    f"# {heading}\n\n{fragment}"
                )
        gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
            inputs_array=inputs_array,
            inputs_show_user_array=inputs_show_user_array,
            llm_kwargs=llm_kwargs,
            chatbot=chatbot,
            history_array=[meta for _ in inputs_array],
            sys_prompt_array=[
                "请你作为一个学术翻译，负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
        )
        res_path = write_history_to_file(meta +  ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=None, file_fullname=None)
        promote_file_to_downloadzone(res_path, rename_file=os.path.basename(fp)+'.md', chatbot=chatbot)
        generated_conclusion_files.append(res_path)
        ch = construct_html() 
        orig = ""
        trans = ""
        gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
        for i,k in enumerate(gpt_response_collection_html): 
            if i%2==0:
                gpt_response_collection_html[i] = inputs_show_user_array[i//2]
            else:
                gpt_response_collection_html[i] = gpt_response_collection_html[i]
        final = ["", "", "一、论文概况",  "", "Abstract", paper_meta_info,  "二、论文翻译",  ""]
        final.extend(gpt_response_collection_html)
        for i, k in enumerate(final): 
            if i%2==0:
                orig = k
            if i%2==1:
                trans = k
                ch.add_row(a=orig, b=trans)
        create_report_file_name = f"{os.path.basename(fp)}.trans.html"
        html_file = ch.save_file(create_report_file_name)
        generated_html_files.append(html_file)
        promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)
    chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
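The `break_down` helper in this hunk does not split at `TOKEN_LIMIT_PER_FRAGMENT` directly. It first computes how many fragments are needed and then derives a "smooth" limit so that the fragments come out at roughly equal length. The sketch below isolates that arithmetic; `smooth_token_limit` is a hypothetical name, and returning the unchanged limit for text that already fits is an assumption made only to keep the function total (the original returns the text unsplit in that case).

```python
import math


def smooth_token_limit(raw_token_num: int, limit_per_fragment: int) -> int:
    """Return an evened-out per-fragment limit, mirroring the arithmetic in break_down."""
    if raw_token_num <= limit_per_fragment:
        # Assumption for this sketch: nothing to split, keep the configured limit.
        return limit_per_fragment
    # Number of fragments needed at the configured limit.
    count = int(math.ceil(raw_token_num / limit_per_fragment))
    # Even share per fragment, plus a small margin of `count` tokens.
    return raw_token_num // count + count


print(smooth_token_limit(3000, 1024))  # 3 fragments needed -> 3000 // 3 + 3 = 1003
print(smooth_token_limit(1025, 1024))  # 2 fragments needed -> 1025 // 2 + 2 = 514
```

With the new limit of 1024, a 3000-token section is cut at about 1003 tokens per fragment instead of two full fragments and a short tail.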
--- a/crazy_functions/批量翻译PDF文档_多线程.py
+++ b/crazy_functions/批量翻译PDF文档_多线程.py
@@ -1,12 +1,12 @@
-from toolbox import CatchException, report_execption, get_log_folder
+from toolbox import CatchException, report_execption, get_log_folder, gen_time_str
 from toolbox import update_ui, promote_file_to_downloadzone, update_ui_lastest_msg, disable_auto_promotion
 from toolbox import write_history_to_file, promote_file_to_downloadzone
 from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
 from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
 from .crazy_utils import read_and_clean_pdf_text
-from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url
+from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url, translate_pdf
 from colorful import *
-import glob
+import copy
 import os
 import math
@@ -58,8 +58,8 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
 def 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, grobid_url):
-    import copy
+    import copy, json
-    TOKEN_LIMIT_PER_FRAGMENT = 1280
+    TOKEN_LIMIT_PER_FRAGMENT = 1024
    generated_conclusion_files = []
    generated_html_files = []
    DST_LANG = "中文"
@@ -67,104 +67,23 @@ def 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwa
    for index, fp in enumerate(file_manifest):
        chatbot.append(["当前进度：", f"正在连接GROBID服务，请稍候: {grobid_url}\n如果等待时间过长，请修改config中的GROBID_URL，可修改成本地GROBID服务。"]); yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
        article_dict = parse_pdf(fp, grobid_url)
        grobid_json_res = os.path.join(get_log_folder(), gen_time_str() + "grobid.json")
        with open(grobid_json_res, 'w+', encoding='utf8') as f:
            f.write(json.dumps(article_dict, indent=4, ensure_ascii=False))
        promote_file_to_downloadzone(grobid_json_res, chatbot=chatbot)
        if article_dict is None: raise RuntimeError("解析PDF失败，请检查PDF是否损坏。")
-        prompt = "以下是一篇学术论文的基本信息:\n"
+        yield from translate_pdf(article_dict, llm_kwargs, chatbot, fp, generated_conclusion_files, TOKEN_LIMIT_PER_FRAGMENT, DST_LANG)
        # title
        title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
        # authors
        authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
        # abstract
        abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
        # command
        prompt += f"请将题目和摘要翻译为{DST_LANG}。"
        meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
        # 单线，获取文章meta信息
        paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
            inputs=prompt,
            inputs_show_user=prompt,
            llm_kwargs=llm_kwargs,
            chatbot=chatbot, history=[],
            sys_prompt="You are an academic paper reader。",
        )
        # 多线，翻译
        inputs_array = []
        inputs_show_user_array = []
        # get_token_num
        from request_llm.bridge_all import model_info
        enc = model_info[llm_kwargs['llm_model']]['tokenizer']
        def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
        from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
        def break_down(txt):
            raw_token_num = get_token_num(txt)
            if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
                return [txt]
            else:
                # raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
                # find a smooth token limit to achieve even separation
                count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
                token_limit_smooth = raw_token_num // count + count
                return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
        for section in article_dict.get('sections'):
            if len(section['text']) == 0: continue
            section_frags = break_down(section['text'])
            for i, fragment in enumerate(section_frags):
                heading = section['heading']
                if len(section_frags) > 1: heading += f' Part-{i+1}'
                inputs_array.append(
                    f"你需要翻译{heading}章节，内容如下: \n\n{fragment}"
                )
                inputs_show_user_array.append(
                    f"# {heading}\n\n{fragment}"
                )
        gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
            inputs_array=inputs_array,
            inputs_show_user_array=inputs_show_user_array,
            llm_kwargs=llm_kwargs,
            chatbot=chatbot,
            history_array=[meta for _ in inputs_array],
            sys_prompt_array=[
                "请你作为一个学术翻译，负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
        )
        res_path = write_history_to_file(meta +  ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=None, file_fullname=None)
        promote_file_to_downloadzone(res_path, rename_file=os.path.basename(fp)+'.md', chatbot=chatbot)
        generated_conclusion_files.append(res_path)
        ch = construct_html() 
        orig = ""
        trans = ""
        gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
        for i,k in enumerate(gpt_response_collection_html): 
            if i%2==0:
                gpt_response_collection_html[i] = inputs_show_user_array[i//2]
            else:
                gpt_response_collection_html[i] = gpt_response_collection_html[i]
        final = ["", "", "一、论文概况",  "", "Abstract", paper_meta_info,  "二、论文翻译",  ""]
        final.extend(gpt_response_collection_html)
        for i, k in enumerate(final): 
            if i%2==0:
                orig = k
            if i%2==1:
                trans = k
                ch.add_row(a=orig, b=trans)
        create_report_file_name = f"{os.path.basename(fp)}.trans.html"
        html_file = ch.save_file(create_report_file_name)
        generated_html_files.append(html_file)
        promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)
    chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
 def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
    """
    This function is deprecated.
    """
    import copy
-    TOKEN_LIMIT_PER_FRAGMENT = 1280
+    TOKEN_LIMIT_PER_FRAGMENT = 1024
    generated_conclusion_files = []
    generated_html_files = []
    from crazy_functions.crazy_utils import construct_html
--- a/crazy_functions/解析项目源代码.py
+++ b/crazy_functions/解析项目源代码.py
@@ -136,6 +136,23 @@ def 解析一个Python项目(txt, llm_kwargs, plugin_kwargs, chatbot, history, s
        return
    yield from 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
@CatchException
 def 解析一个Matlab项目(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
    history = []    # clear the history to avoid input overflow
    import glob, os
    if os.path.exists(txt):
        project_folder = txt
    else:
        if txt == "": txt = '空空如也的输入栏'
        report_execption(chatbot, history, a = f"解析Matlab项目: {txt}", b = f"找不到本地项目或无权访问: {txt}")
        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
        return
    file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.m', recursive=True)]
    if len(file_manifest) == 0:
        report_execption(chatbot, history, a = f"解析Matlab项目: {txt}", b = f"找不到任何`.m`源文件: {txt}")
        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
        return
    yield from 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
@CatchException
 def 解析一个C项目的头文件(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -1,5 +1,54 @@
 # [Delete this line after editing the parameters] Choose one of the options below, delete the other options, then run docker-compose up
 ## ===================================================
 ## [Option 0] Deploy the project's full capability (a large image that includes CUDA and LaTeX; not recommended if you have a slow network, a small disk, or no GPU)
 ## ===================================================
 version: '3'
 services:
  gpt_academic_full_capability:
    image: ghcr.io/binary-husky/gpt_academic_with_all_capacity:master
    environment:
    # See `config.py` or the GitHub wiki for all configuration options
      API_KEY:                  '  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                        '
    # USE_PROXY:                '  True                                                                       '
    # proxies:                  '  { "http": "http://localhost:10881", "https": "http://localhost:10881", }   '
      LLM_MODEL:                '  gpt-3.5-turbo                                                              '
      AVAIL_LLM_MODELS:         '  ["gpt-3.5-turbo", "gpt-4", "qianfan", "sparkv2", "spark", "chatglm"]       '
      BAIDU_CLOUD_API_KEY :     '  bTUtwEAveBrQipEowUvDwYWq                                                   '
      BAIDU_CLOUD_SECRET_KEY :  '  jqXtLvXiVw6UNdjliATTS61rllG8Iuni                                           '
      XFYUN_APPID:              '  53a8d816                                                                   '
      XFYUN_API_SECRET:         '  MjMxNDQ4NDE4MzM0OSNlNjQ2NTlhMTkx                                           '
      XFYUN_API_KEY:            '  95ccdec285364869d17b33e75ee96447                                           '
      ENABLE_AUDIO:             '  False                                                                      '
      DEFAULT_WORKER_NUM:       '  20                                                                         '
      WEB_PORT:                 '  12345                                                                      '
      ADD_WAIFU:                '  False                                                                      '
      ALIYUN_APPKEY:            '  RxPlZrM88DnAFkZK                                                           '
      THEME:                    '  Chuanhu-Small-and-Beautiful                                                '
      ALIYUN_ACCESSKEY:         '  LTAI5t6BrFUzxRXVGUWnekh1                                                   '
      ALIYUN_SECRET:            '  eHmI20SVWIwQZxCiTD2bGQVspP9i68                                             '
    # LOCAL_MODEL_DEVICE:       '  cuda                                                                       '
    # Load the NVIDIA GPU runtime
    # runtime: nvidia
    # deploy:
    #     resources:
    #       reservations:
    #         devices:
    #           - driver: nvidia
    #             count: 1
    #             capabilities: [gpu]
    # Share the host's network
    network_mode: "host"
    # Do not pull the latest code through a proxy network
    command: >
      bash -c "python3 -u main.py"
 ## ===================================================
 ## [Option 1] If you do not need to run local models (online LLM services only: chatgpt, azure, Spark, Qianfan, claude, etc.)
 ## ===================================================
--- a/docs/GithubAction+AllCapacity
+++ b/docs/GithubAction+AllCapacity
@@ -13,21 +13,20 @@ RUN python3 -m pip install openai numpy arxiv rich
 RUN python3 -m pip install colorama Markdown pygments pymupdf
 RUN python3 -m pip install python-docx moviepy pdfminer
 RUN python3 -m pip install zh_langchain==0.2.1 pypinyin
 RUN python3 -m pip install nougat-ocr
 RUN python3 -m pip install rarfile py7zr
 RUN python3 -m pip install aliyun-python-sdk-core==2.13.3 pyOpenSSL scipy git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git
 # Clone the repository
 WORKDIR /gpt
 RUN git clone --depth=1 https://github.com/binary-husky/gpt_academic.git
 WORKDIR /gpt/gpt_academic
-RUN git clone https://github.com/OpenLMLab/MOSS.git request_llm/moss
+RUN git clone --depth=1 https://github.com/OpenLMLab/MOSS.git request_llm/moss
 RUN python3 -m pip install -r requirements.txt
 RUN python3 -m pip install -r request_llm/requirements_moss.txt
 RUN python3 -m pip install -r request_llm/requirements_qwen.txt
 RUN python3 -m pip install -r request_llm/requirements_chatglm.txt
 RUN python3 -m pip install -r request_llm/requirements_newbing.txt
-
+RUN python3 -m pip install nougat-ocr
 # Warm up the Tiktoken module
--- a/docs/use_azure.md
+++ b/docs/use_azure.md
@@ -107,6 +107,12 @@ AZURE_API_KEY = "填入azure openai api的密钥"
 AZURE_API_VERSION = "2023-05-15"  # defaults to version 2023-05-15; no change needed
 AZURE_ENGINE = "fill in the deployment name" # see the image above
 # For example
 API_KEY = '6424e9d19e674092815cea1cb35e67a5'
 AZURE_ENDPOINT = 'https://rhtjjjjjj.openai.azure.com/'
 AZURE_ENGINE = 'qqwe'
 LLM_MODEL = "azure-gpt-3.5" # options ↓↓↓
 ```
--- a/main.py
+++ b/main.py
@@ -8,12 +8,13 @@ def main():
    # We recommend keeping your secrets (API keys, proxy URLs) in a copy named config_private.py, so they are not accidentally pushed to GitHub and seen by others
    proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION = get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION')
    CHATBOT_HEIGHT, LAYOUT, AVAIL_LLM_MODELS, AUTO_CLEAR_TXT = get_conf('CHATBOT_HEIGHT', 'LAYOUT', 'AVAIL_LLM_MODELS', 'AUTO_CLEAR_TXT')
-    ENABLE_AUDIO, AUTO_CLEAR_TXT, PATH_LOGGING = get_conf('ENABLE_AUDIO', 'AUTO_CLEAR_TXT', 'PATH_LOGGING')
+    ENABLE_AUDIO, AUTO_CLEAR_TXT, PATH_LOGGING, AVAIL_THEMES, THEME = get_conf('ENABLE_AUDIO', 'AUTO_CLEAR_TXT', 'PATH_LOGGING', 'AVAIL_THEMES', 'THEME')
    # If WEB_PORT is -1 (any value <= 0), pick a free web port automatically
    PORT = find_free_port() if WEB_PORT <= 0 else WEB_PORT
    from check_proxy import get_current_version
-    from themes.theme import adjust_theme, advanced_css, theme_declaration
+    from themes.theme import adjust_theme, advanced_css, theme_declaration, load_dynamic_theme
    initial_prompt = "Serve me as a writing and programming assistant."
    title_html = f"<h1 align=\"center\">GPT 学术优化 {get_current_version()}</h1>{theme_declaration}"
    description =  "代码开源和更新[地址🚀](https://github.com/binary-husky/gpt_academic)，"
@@ -59,6 +60,7 @@ def main():
    cancel_handles = []
    with gr.Blocks(title="GPT 学术优化", theme=set_theme, analytics_enabled=False, css=advanced_css) as demo:
        gr.HTML(title_html)
        secret_css, secret_font = gr.Textbox(visible=False), gr.Textbox(visible=False)
        cookies = gr.State(load_chat_cookies())
        with gr_L1():
            with gr_L2(scale=2, elem_id="gpt-chat"):
@@ -123,7 +125,8 @@ def main():
                    max_length_sl = gr.Slider(minimum=256, maximum=8192, value=4096, step=1, interactive=True, label="Local LLM MaxLength",)
                    checkboxes = gr.CheckboxGroup(["基础功能区", "函数插件区", "底部输入区", "输入清除键", "插件参数区"], value=["基础功能区", "函数插件区"], label="显示/隐藏功能区")
                    md_dropdown = gr.Dropdown(AVAIL_LLM_MODELS, value=LLM_MODEL, label="更换LLM模型/请求源").style(container=False)
-                    dark_mode_btn = gr.Button("Toggle Dark Mode ☀", variant="secondary").style(size="sm")
+                    theme_dropdown = gr.Dropdown(AVAIL_THEMES, value=THEME, label="更换UI主题").style(container=False)
+                    dark_mode_btn = gr.Button("切换界面明暗 ☀", variant="secondary").style(size="sm")
                    dark_mode_btn.click(None, None, None, _js="""() => {
                            if (document.querySelectorAll('.dark').length) {
                                document.querySelectorAll('.dark').forEach(el => el.classList.remove('dark'));
@@ -197,9 +200,37 @@ def main():
                ret.update({plugin_advanced_arg: gr.update(visible=False, label=f"插件[{k}]不需要高级参数。")})
            return ret
        dropdown.select(on_dropdown_changed, [dropdown], [switchy_bt, plugin_advanced_arg] )
        def on_md_dropdown_changed(k):
            return {chatbot: gr.update(label="当前模型："+k)}
        md_dropdown.select(on_md_dropdown_changed, [md_dropdown], [chatbot] )
        def on_theme_dropdown_changed(theme, secret_css):
            adjust_theme, css_part1, _, adjust_dynamic_theme = load_dynamic_theme(theme)
            if adjust_dynamic_theme:
                css_part2 = adjust_dynamic_theme._get_theme_css()
            else:
                css_part2 = adjust_theme()._get_theme_css()
            return css_part2 + css_part1
        theme_handle = theme_dropdown.select(on_theme_dropdown_changed, [theme_dropdown, secret_css], [secret_css])
        theme_handle.then(
            None,
            [secret_css],
            None,
            _js="""(css) => {
                var existingStyles = document.querySelectorAll("style[data-loaded-css]");
                for (var i = 0; i < existingStyles.length; i++) {
                    var style = existingStyles[i];
                    style.parentNode.removeChild(style);
                }
                var styleElement = document.createElement('style');
                styleElement.setAttribute('data-loaded-css', css);
                styleElement.innerHTML = css;
                document.head.appendChild(styleElement);
            }
            """
        )
        # Register the callback for the context-dependent button (switchy_bt)
        def route(request: gr.Request, k, *args, **kwargs):
            if k in [r"打开插件列表", r"请先从插件列表中选择"]: return
@@ -235,7 +266,7 @@ def main():
            cookies.update({'uuid': uuid.uuid4()})
            return cookies
        demo.load(init_cookie, inputs=[cookies, chatbot], outputs=[cookies])
-        demo.load(lambda: 0, inputs=None, outputs=None, _js='()=>{ChatBotHeight();}')
+        demo.load(lambda: 0, inputs=None, outputs=None, _js='()=>{GptAcademicJavaScriptInit();}')
    # gradio's inbrowser trigger is unreliable; fall back to the original browser-opening function
    def auto_opentab_delay():
@@ -254,6 +285,7 @@ def main():
    auto_opentab_delay()
    demo.queue(concurrency_count=CONCURRENT_COUNT).launch(
        quiet=True,
        server_name="0.0.0.0", 
        server_port=PORT,
        favicon_path="docs/logo.png", 
--- a/request_llm/bridge_all.py
+++ b/request_llm/bridge_all.py
@@ -52,6 +52,7 @@ API_URL_REDIRECT, AZURE_ENDPOINT, AZURE_ENGINE = get_conf("API_URL_REDIRECT", "A
 openai_endpoint = "https://api.openai.com/v1/chat/completions"
 api2d_endpoint = "https://openai.api2d.net/v1/chat/completions"
 newbing_endpoint = "wss://sydney.bing.com/sydney/ChatHub"
 if not AZURE_ENDPOINT.endswith('/'): AZURE_ENDPOINT += '/'
 azure_endpoint = AZURE_ENDPOINT + f'openai/deployments/{AZURE_ENGINE}/chat/completions?api-version=2023-05-15'
 # Backward compatibility with the legacy configuration
 try:
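The Azure endpoint in the hunk above is assembled from `AZURE_ENDPOINT` and `AZURE_ENGINE`, with a trailing slash added when missing and the API version fixed at `2023-05-15`. A minimal sketch of the same string construction as a standalone function follows; `build_azure_endpoint` and the example host and deployment names are hypothetical and used only for illustration.

```python
def build_azure_endpoint(endpoint: str, engine: str, api_version: str = "2023-05-15") -> str:
    """Compose a chat-completions URL the same way bridge_all.py does (illustrative sketch)."""
    if not endpoint.endswith('/'):
        endpoint += '/'  # normalise so the path joins cleanly
    return endpoint + f'openai/deployments/{engine}/chat/completions?api-version={api_version}'


# Example with placeholder values (not real resources):
print(build_azure_endpoint('https://example.openai.azure.com', 'my-deployment'))
```

Because the deployment name is part of the URL, every `azure-*` entry in `model_info` that reuses `azure_endpoint` points at the same deployment.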
@@ -125,6 +126,15 @@ model_info = {
        "token_cnt": get_token_num_gpt4,
    },
    "gpt-4-32k": {
        "fn_with_ui": chatgpt_ui,
        "fn_without_ui": chatgpt_noui,
        "endpoint": openai_endpoint,
        "max_token": 32768,
        "tokenizer": tokenizer_gpt4,
        "token_cnt": get_token_num_gpt4,
    },
    # azure openai
    "azure-gpt-3.5":{
        "fn_with_ui": chatgpt_ui,
@@ -135,6 +145,15 @@ model_info = {
        "token_cnt": get_token_num_gpt35,
    },
    "azure-gpt-4":{
        "fn_with_ui": chatgpt_ui,
        "fn_without_ui": chatgpt_noui,
        "endpoint": azure_endpoint,
        "max_token": 8192,
        "tokenizer": tokenizer_gpt35,
        "token_cnt": get_token_num_gpt35,
    },
    # api_2d
    "api2d-gpt-3.5-turbo": {
        "fn_with_ui": chatgpt_ui,
--- a/request_llm/bridge_chatglm.py
+++ b/request_llm/bridge_chatglm.py
@@ -3,7 +3,7 @@ from transformers import AutoModel, AutoTokenizer
 import time
 import threading
 import importlib
-from toolbox import update_ui, get_conf
+from toolbox import update_ui, get_conf, ProxyNetworkActivate
 from multiprocessing import Process, Pipe
 load_message = "ChatGLM尚未加载，加载需要一段时间。注意，取决于`config.py`的配置，ChatGLM消耗大量的内存（CPU）或显存（GPU），也许会导致低配计算机卡死 ……"
@@ -48,16 +48,17 @@ class GetGLMHandle(Process):
        while True:
            try:
-                if self.chatglm_model is None:
-                    self.chatglm_tokenizer = AutoTokenizer.from_pretrained(_model_name_, trust_remote_code=True)
-                    if device=='cpu':
-                        self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).float()
-                    else:
-                        self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).half().cuda()
-                    self.chatglm_model = self.chatglm_model.eval()
-                    break
-                else:
-                    break
+                with ProxyNetworkActivate('Download_LLM'):
+                    if self.chatglm_model is None:
+                        self.chatglm_tokenizer = AutoTokenizer.from_pretrained(_model_name_, trust_remote_code=True)
+                        if device=='cpu':
+                            self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).float()
+                        else:
+                            self.chatglm_model = AutoModel.from_pretrained(_model_name_, trust_remote_code=True).half().cuda()
+                        self.chatglm_model = self.chatglm_model.eval()
+                        break
+                    else:
+                        break
            except:
                retry += 1
                if retry > 3: 
--- a/request_llm/bridge_llama2.py
+++ b/request_llm/bridge_llama2.py
@@ -30,7 +30,7 @@ class GetONNXGLMHandle(LocalLLMHandle):
        with open(os.path.expanduser('~/.cache/huggingface/token'), 'w') as f:
            f.write(huggingface_token)
        model_id = 'meta-llama/Llama-2-7b-chat-hf'
-        with ProxyNetworkActivate():
+        with ProxyNetworkActivate('Download_LLM'):
            self._tokenizer = AutoTokenizer.from_pretrained(model_id, use_auth_token=huggingface_token)
            # use fp16
            model = AutoModelForCausalLM.from_pretrained(model_id, use_auth_token=huggingface_token).eval()
--- a/request_llm/requirements_chatglm.txt
+++ b/request_llm/requirements_chatglm.txt
@@ -1,5 +1,4 @@
 protobuf
 transformers>=4.27.1
 cpm_kernels
 torch>=1.10
 mdtex2html
--- a/request_llm/requirements_chatglm_onnx.txt
+++ b/request_llm/requirements_chatglm_onnx.txt
@@ -1,5 +1,4 @@
 protobuf
 transformers>=4.27.1
 cpm_kernels
 torch>=1.10
 mdtex2html
--- a/request_llm/requirements_jittorllms.txt
+++ b/request_llm/requirements_jittorllms.txt
@@ -2,6 +2,5 @@ jittor >= 1.3.7.9
 jtorch >= 0.1.3
 torch
 torchvision
 transformers==4.26.1
 pandas
 jieba
--- a/request_llm/requirements_moss.txt
+++ b/request_llm/requirements_moss.txt
@@ -1,5 +1,4 @@
 torch
 transformers==4.25.1
 sentencepiece
 datasets
 accelerate
--- a/requirements.txt
+++ b/requirements.txt
@@ -2,7 +2,7 @@
 pydantic==1.10.11
 tiktoken>=0.3.3
 requests[socks]
-transformers
+transformers>=4.27.1
 python-markdown-math
 beautifulsoup4
 prompt_toolkit
--- a/tests/test_plugins.py
+++ b/tests/test_plugins.py
@@ -6,11 +6,14 @@
 import os, sys
 def validate_path(): dir_name = os.path.dirname(__file__); root_dir_assume = os.path.abspath(dir_name +  '/..'); os.chdir(root_dir_assume); sys.path.append(root_dir_assume)
 validate_path() # switch to the project root directory
 from tests.test_utils import plugin_test
 if __name__ == "__main__":
    from tests.test_utils import plugin_test
    plugin_test(plugin='crazy_functions.函数动态生成->函数动态生成', main_input='交换图像的蓝色通道和红色通道', advanced_arg={"file_path_arg": "./build/ants.jpg"})
    # plugin_test(plugin='crazy_functions.虚空终端->虚空终端', main_input='修改api-key为sk-jhoejriotherjep')
-    plugin_test(plugin='crazy_functions.批量翻译PDF文档_NOUGAT->批量翻译PDF文档', main_input='crazy_functions/test_project/pdf_and_word/aaai.pdf')
+
    # plugin_test(plugin='crazy_functions.批量翻译PDF文档_NOUGAT->批量翻译PDF文档', main_input='crazy_functions/test_project/pdf_and_word/aaai.pdf')
    # plugin_test(plugin='crazy_functions.虚空终端->虚空终端', main_input='调用插件，对C:/Users/fuqingxu/Desktop/旧文件/gpt/chatgpt_academic/crazy_functions/latex_fns中的python文件进行解析')
--- a/tests/test_utils.py
+++ b/tests/test_utils.py
@@ -74,7 +74,7 @@ def plugin_test(main_input, plugin, advanced_arg=None):
        plugin_kwargs['plugin_kwargs'] = advanced_arg
    my_working_plugin = silence_stdout(plugin)(**plugin_kwargs)
-    with Live(Markdown(""), auto_refresh=False) as live:
+    with Live(Markdown(""), auto_refresh=False, vertical_overflow="visible") as live:
        for cookies, chat, hist, msg in my_working_plugin:
            md_str = vt.chat_to_markdown_str(chat)
            md = Markdown(md_str)
--- a/themes/common.css
+++ b/themes/common.css
@@ -19,3 +19,67 @@
 .wrap.svelte-xwlu1w {
    min-height: var(--size-32);
 }
 /* status bar height */
 .min.svelte-1yrv54 {
    min-height: var(--size-12);
 }
 /* copy btn */
 .message-btn-row {
    width: 19px;
    height: 19px;
    position: absolute;
    left: calc(100% + 3px);
    top: 0;
    display: flex;
    justify-content: space-between;
 }
 /* .message-btn-row-leading, .message-btn-row-trailing {
    display: inline-flex;
    gap: 4px;
 } */
 .message-btn-row button {
    font-size: 18px;
    align-self: center;
    align-items: center;
    flex-wrap: nowrap;
    white-space: nowrap;
    display: inline-flex;
    flex-direction: row;
    gap: 4px;
    padding-block: 2px !important;
 }
 /* Scrollbar Width */
 ::-webkit-scrollbar {
    width: 12px;
 }
 /* Scrollbar Track */
 ::-webkit-scrollbar-track {
    background: #f1f1f1;
    border-radius: 12px;
 }
 /* Scrollbar Handle */
 ::-webkit-scrollbar-thumb {
    background: #888;
    border-radius: 12px;
 }
 /* Scrollbar Handle on hover */
 ::-webkit-scrollbar-thumb:hover {
    background: #555;
 }
 /* input btns: clear, reset, stop */
 #input-panel button {
    min-width: min(80px, 100%);
 }
 /* input btns: clear, reset, stop */
 #input-panel2 button {
    min-width: min(80px, 100%);
 }
--- a/themes/common.js
+++ b/themes/common.js
@@ -1,4 +1,86 @@
-function ChatBotHeight() {
+function gradioApp() {
    // https://github.com/GaiZhenbiao/ChuanhuChatGPT/tree/main/web_assets/javascript
    const elems = document.getElementsByTagName('gradio-app');
    const elem = elems.length == 0 ? document : elems[0];
    if (elem !== document) {
        elem.getElementById = function(id) {
            return document.getElementById(id);
        };
    }
    return elem.shadowRoot ? elem.shadowRoot : elem;
 }
 function addCopyButton(botElement) {
    // https://github.com/GaiZhenbiao/ChuanhuChatGPT/tree/main/web_assets/javascript
    // Copy bot button
    const copiedIcon = '<span><svg stroke="currentColor" fill="none" stroke-width="2" viewBox="0 0 24 24" stroke-linecap="round" stroke-linejoin="round" height=".8em" width=".8em" xmlns="http://www.w3.org/2000/svg"><polyline points="20 6 9 17 4 12"></polyline></svg></span>';
    const copyIcon = '<span><svg stroke="currentColor" fill="none" stroke-width="2" viewBox="0 0 24 24" stroke-linecap="round" stroke-linejoin="round" height=".8em" width=".8em" xmlns="http://www.w3.org/2000/svg"><rect x="9" y="9" width="13" height="13" rx="2" ry="2"></rect><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"></path></svg></span>';
    const messageBtnColumnElement = botElement.querySelector('.message-btn-row');
    if (messageBtnColumnElement) {
        // A .message-btn-row already exists, so this message already has its copy button; skip it
        // messageBtnColumnElement.remove();
        return;
    }
    var copyButton = document.createElement('button');
    copyButton.classList.add('copy-bot-btn');
    copyButton.setAttribute('aria-label', 'Copy');
    copyButton.innerHTML = copyIcon;
    copyButton.addEventListener('click', async () => {
        const textToCopy = botElement.innerText;
        try {
            if ("clipboard" in navigator) {
                await navigator.clipboard.writeText(textToCopy);
                copyButton.innerHTML = copiedIcon;
                setTimeout(() => {
                    copyButton.innerHTML = copyIcon;
                }, 1500);
            } else {
                const textArea = document.createElement("textarea");
                textArea.value = textToCopy;
                document.body.appendChild(textArea);
                textArea.select();
                try {
                    document.execCommand('copy');
                    copyButton.innerHTML = copiedIcon;
                    setTimeout(() => {
                        copyButton.innerHTML = copyIcon;
                    }, 1500);
                } catch (error) {
                    console.error("Copy failed: ", error);
                }
                document.body.removeChild(textArea);
            }
        } catch (error) {
            console.error("Copy failed: ", error);
        }
    });
    var messageBtnColumn = document.createElement('div');
    messageBtnColumn.classList.add('message-btn-row');
    messageBtnColumn.appendChild(copyButton);
    botElement.appendChild(messageBtnColumn);
 }
 function chatbotContentChanged(attempt = 1, force = false) {
    // https://github.com/GaiZhenbiao/ChuanhuChatGPT/tree/main/web_assets/javascript
    for (var i = 0; i < attempt; i++) {
        setTimeout(() => {
            gradioApp().querySelectorAll('#gpt-chatbot .message-wrap .message.bot').forEach(addCopyButton);
        }, i === 0 ? 0 : 200);
    }
 }
 function GptAcademicJavaScriptInit() {
    chatbotIndicator = gradioApp().querySelector('#gpt-chatbot > div.wrap');
    var chatbotObserver = new MutationObserver(() => {
        chatbotContentChanged(1);
    });
    chatbotObserver.observe(chatbotIndicator, { attributes: true, childList: true, subtree: true });
    function update_height(){
        var { panel_height_target, chatbot_height, chatbot } = get_elements(true);
        if (panel_height_target!=chatbot_height)
--- a/themes/gradios.py
+++ b/themes/gradios.py
@@ -3,11 +3,20 @@ import logging
 from toolbox import get_conf, ProxyNetworkActivate
 CODE_HIGHLIGHT, ADD_WAIFU, LAYOUT = get_conf('CODE_HIGHLIGHT', 'ADD_WAIFU', 'LAYOUT')
 def dynamic_set_theme(THEME):
    set_theme = gr.themes.ThemeClass()
    with ProxyNetworkActivate('Download_Gradio_Theme'):
        logging.info('正在下载Gradio主题，请稍等。')
        if THEME.startswith('Huggingface-'): THEME = THEME.lstrip('Huggingface-')
        if THEME.startswith('huggingface-'): THEME = THEME.lstrip('huggingface-')
        set_theme = set_theme.from_hub(THEME.lower())
    return set_theme
 def adjust_theme():
    try:
        set_theme = gr.themes.ThemeClass()
-        with ProxyNetworkActivate():
+        with ProxyNetworkActivate('Download_Gradio_Theme'):
            logging.info('正在下载Gradio主题，请稍等。')
            THEME, = get_conf('THEME')
            if THEME.startswith('Huggingface-'): THEME = THEME.lstrip('Huggingface-')
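A note on the prefix handling in the hunk above: `str.lstrip` treats its argument as a set of characters rather than a literal prefix, so `THEME.lstrip('huggingface-')` can also remove leading characters of the theme name itself. The sketch below shows the difference, assuming Python 3.9+ (for `str.removeprefix`). The helper name `strip_hub_prefix` and the theme id used here are hypothetical, chosen only for illustration, and are not part of the commit.

```python
def strip_hub_prefix(theme: str) -> str:
    """Remove a leading 'Huggingface-' / 'huggingface-' marker, if present.

    Hypothetical helper for illustration; not part of the commit.
    """
    for prefix in ("Huggingface-", "huggingface-"):
        if theme.startswith(prefix):
            return theme.removeprefix(prefix)
    return theme


theme = "huggingface-gstaff/xkcd"  # illustrative theme id

# lstrip removes any leading characters contained in the given set,
# so the leading 'g' of 'gstaff' is stripped as well.
print(theme.lstrip("huggingface-"))  # staff/xkcd
print(strip_hub_prefix(theme))       # gstaff/xkcd
```

Theme ids whose owner name starts with a character outside that set are unaffected, which is why the issue may go unnoticed in practice.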
--- a/themes/theme.py
+++ b/themes/theme.py
@@ -2,17 +2,22 @@ import gradio as gr
 from toolbox import get_conf
 THEME, = get_conf('THEME')
-if THEME == 'Chuanhu-Small-and-Beautiful':
-    from .green import adjust_theme, advanced_css
-    theme_declaration = "<h2 align=\"center\"  class=\"small\">[Chuanhu-Small-and-Beautiful主题]</h2>"
-elif THEME == 'High-Contrast':
-    from .contrast import adjust_theme, advanced_css
-    theme_declaration = ""
-elif '/' in THEME:
-    from .gradios import adjust_theme, advanced_css
-    theme_declaration = ""
-else:
-    from .default import adjust_theme, advanced_css
-    theme_declaration = ""
-
+def load_dynamic_theme(THEME):
+    adjust_dynamic_theme = None
+    if THEME == 'Chuanhu-Small-and-Beautiful':
+        from .green import adjust_theme, advanced_css
+        theme_declaration = "<h2 align=\"center\"  class=\"small\">[Chuanhu-Small-and-Beautiful主题]</h2>"
+    elif THEME == 'High-Contrast':
+        from .contrast import adjust_theme, advanced_css
+        theme_declaration = ""
+    elif '/' in THEME:
+        from .gradios import adjust_theme, advanced_css
+        from .gradios import dynamic_set_theme
+        adjust_dynamic_theme = dynamic_set_theme(THEME)
+        theme_declaration = ""
+    else:
+        from .default import adjust_theme, advanced_css
+        theme_declaration = ""
+    return adjust_theme, advanced_css, theme_declaration, adjust_dynamic_theme
+adjust_theme, advanced_css, theme_declaration, _ = load_dynamic_theme(THEME)
--- a/toolbox.py
+++ b/toolbox.py
@@ -216,7 +216,7 @@ def get_reduce_token_percent(text):
        return 0.5, '不详'
-def write_history_to_file(history, file_basename=None, file_fullname=None):
+def write_history_to_file(history, file_basename=None, file_fullname=None, auto_caption=True):
    """
    Write the conversation record `history` to a file in Markdown format. If no file name is given, one is generated from the current time.
    """
@@ -235,7 +235,7 @@ def write_history_to_file(history, file_basename=None, file_fullname=None):
                if type(content) != str: content = str(content)
            except:
                continue
-            if i % 2 == 0:
+            if i % 2 == 0 and auto_caption:
                f.write('## ')
            try:
                f.write(content)
@@ -527,6 +527,7 @@ def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
        if 'files_to_promote' in chatbot._cookies: current = chatbot._cookies['files_to_promote']
        else: current = []
        chatbot._cookies.update({'files_to_promote': [new_path] + current})
    return new_path
 def disable_auto_promotion(chatbot):
    chatbot._cookies.update({'files_to_promote': []})
@@ -955,7 +956,19 @@ class ProxyNetworkActivate():
    """
    This code defines an empty context manager named TempProxy, used to route a small block of code through the proxy
    """
    def __init__(self, task=None) -> None:
        self.task = task
        if not task:
            # No task given: the proxy is enabled by default
            self.valid = True
        else:
            # A task is given: check whether it is listed in WHEN_TO_USE_PROXY
            from toolbox import get_conf
            WHEN_TO_USE_PROXY, = get_conf('WHEN_TO_USE_PROXY')
            self.valid = (task in WHEN_TO_USE_PROXY)
    def __enter__(self):
        if not self.valid: return self
        from toolbox import get_conf
        proxies, = get_conf('proxies')
        if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
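The task-gating idea in this hunk (enable the proxy only when the task name appears in `WHEN_TO_USE_PROXY`, or when no task is given) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `proxy_for`, the proxy URL, and the use of `HTTP_PROXY`/`HTTPS_PROXY` environment variables are all hypothetical, and the hunk does not show how the real class sets or restores proxy state on exit.

```python
import os
from contextlib import contextmanager

# Hypothetical stand-ins for values the project reads through get_conf().
WHEN_TO_USE_PROXY = ["Download_LLM", "Download_Gradio_Theme"]
PROXIES = {
    "http": "socks5h://localhost:11284",   # illustrative address only
    "https": "socks5h://localhost:11284",  # illustrative address only
}


@contextmanager
def proxy_for(task=None):
    """Export proxy variables for the block if `task` is enabled or omitted."""
    enabled = (not task) or (task in WHEN_TO_USE_PROXY)
    if not enabled:
        yield False  # task not listed: leave the environment untouched
        return
    keys = ("HTTP_PROXY", "HTTPS_PROXY")
    saved = {k: os.environ.get(k) for k in keys}
    os.environ["HTTP_PROXY"] = PROXIES["http"]
    os.environ["HTTPS_PROXY"] = PROXIES["https"]
    try:
        yield True
    finally:
        # Restore whatever was set before entering the block.
        for k, v in saved.items():
            if v is None:
                os.environ.pop(k, None)
            else:
                os.environ[k] = v


with proxy_for("Download_LLM") as active:
    print("proxy active:", active)
```

Yielding a flag keeps the call site identical whether or not the proxy applies, which mirrors how the diff wraps the model download in a `with` block unconditionally.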
--- a/4
+++ b/4
@@ -1,5 +1,5 @@
 {
-  "version": 3.52,
+  "version": 3.54,
  "show_feature": true,
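-  "new_feature": "Improved stability & resolved multi-user conflicts <-> Support for plugin categories and more UI skins <-> Users can dispatch plugins in natural language (Void Terminal)! <-> Improved UI and newly designed themes <-> High-precision PDF translation with GROBID <-> Integrated the Baidu Qianfan platform and ERNIE Bot (Wenxin Yiyan) <-> Integrated Alibaba Tongyi Qianwen, iFlytek Spark, and Shanghai AI-Lab InternLM <-> Improved one-click upgrade <-> Faster and more reliable arxiv translation"
+  "new_feature": "Added a dynamic code interpreter (CodeInterpreter) <-> Added a copy button for text answers <-> Finer-grained proxy scenarios <-> Support for dynamically selecting different UI themes <-> Improved stability & resolved multi-user conflicts <-> Support for plugin categories and more UI skins <-> Users can dispatch plugins in natural language (Void Terminal)! <-> Improved UI and newly designed themes <-> High-precision PDF translation with GROBID <-> Integrated the Baidu Qianfan platform and ERNIE Bot (Wenxin Yiyan) <-> Integrated Alibaba Tongyi Qianwen, iFlytek Spark, and Shanghai AI-Lab InternLM <-> Improved one-click upgrade <-> Faster and more reliable arxiv translation"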
 }
Author	SHA1	Message	Commit date
binary-husky	87ccd1a89a	Update crazy_functional.py	2023-09-27 18:35:06 +08:00
binary-husky	87b9734986	Fix the duplicate definition bug of 'copiedIcon'	2023-09-27 16:35:58 +08:00
binary-husky	d2d5665c37	Allow using the proxy during module warm-up	2023-09-27 15:53:45 +08:00
binary-husky	0844b6e9cf	Support accessing the GROBID service through the proxy	2023-09-27 15:40:55 +08:00
binary-husky	9cb05e5724	Adjust layout	2023-09-27 15:20:28 +08:00
binary-husky	80b209fa0c	Merge branch 'frontier'	2023-09-27 15:19:07 +08:00
binary-husky	8d4cb05738	Shortcut for the Matlab project analysis plugin	2023-09-26 10:16:38 +08:00
binary-husky	31f4069563	Improve the polishing and proofreading prompts	2023-09-25 17:46:28 +08:00
binary-husky	8ba6fc062e	Merge branch 'frontier' of github.com:binary-husky/chatgpt_academic into frontier	2023-09-23 23:59:30 +08:00
binary-husky	c0c2d14e3d	better scrollbar	2023-09-23 23:58:32 +08:00
binary-husky	f0a5c49a9c	Merge branch 'frontier' of github.com:binary-husky/chatgpt_academic into frontier	2023-09-23 23:47:42 +08:00
binary-husky	9333570ab7	Reduce the minimum size of basic buttons such as Reset	2023-09-23 23:47:25 +08:00
binary-husky	d6eaaad962	Stop gradio from showing the misleading share=True message	2023-09-23 23:23:23 +08:00
binary-husky	e24f077b68	Explicitly add the azure-gpt-4 option	2023-09-23 23:06:58 +08:00
binary-husky	dc5bb9741a	Version update	2023-09-23 22:45:07 +08:00
binary-husky	b383b45191	version 3.54 beta	2023-09-23 22:44:18 +08:00
binary-husky	2d8f37baba	Finer-grained proxy scenarios	2023-09-23 22:43:15 +08:00
binary-husky	409927ef8e	Unify the transformers version	2023-09-23 22:26:28 +08:00
binary-husky	5b231e0170	Add a button for copying a whole message	2023-09-23 22:11:29 +08:00
binary-husky	87f629bb37	Add gpt-4-32k	2023-09-23 20:24:13 +08:00
binary-husky	3672c97a06	Dynamic code interpreter	2023-09-23 01:51:05 +08:00
binary-husky	b6ee3e9807	Merge pull request #1121 from binary-husky/frontier Add a disable-cache option to the arxiv translation plugin	2023-09-21 09:33:19 +08:00
binary-husky	d56bc280e9	Add a disable-cache option	2023-09-20 22:04:15 +08:00
qingxu fu	d5fd00c15d	Fine-tune the Dockerfile	2023-09-20 10:02:10 +08:00
binary-husky	5e647ff149	Merge branch 'master' into frontier	2023-09-19 17:21:02 +08:00
binary-husky	868faf00cc	Fix docker compose	2023-09-19 17:10:57 +08:00
binary-husky	a0286c39b9	Update README	2023-09-19 17:08:20 +08:00
binary-husky	9cced321f1	Revise README	2023-09-19 16:55:39 +08:00
binary-husky	3073935e24	Revise README and push version 3.53	2023-09-19 16:49:33 +08:00
binary-husky	ef6631b280	Change TOKEN_LIMIT_PER_FRAGMENT to 1024	2023-09-19 16:31:36 +08:00
binary-husky	0801e4d881	Merge pull request #1111 from kaixindelele/only_chinese_pdf Improve the results of the PDF translation plugin	2023-09-19 15:56:04 +08:00
qingxu fu	ae08cfbcae	Fix a minor bug	2023-09-19 15:55:27 +08:00
qingxu fu	1c0d5361ea	Adjust the minimum height of the status bar	2023-09-19 15:52:42 +08:00
qingxu fu	278464bfb7	Merge duplicate functions	2023-09-18 23:03:23 +08:00
qingxu fu	2a6996f5d0	Fix format compatibility of the Azure ENDPOINT	2023-09-18 21:19:02 +08:00
qingxu fu	84b11016c6	Also output the mmd file after nougat processing finishes	2023-09-18 15:21:30 +08:00
qingxu fu	7e74d3d699	Adjust button positions	2023-09-18 15:19:21 +08:00
qingxu fu	2cad8e2694	Support dynamic theme switching	2023-09-17 00:15:28 +08:00
qingxu fu	e765ec1223	dynamic theme	2023-09-17 00:02:49 +08:00
kaixindelele	471a369bb8	Output only Chinese in paper translation	2023-09-16 22:09:44 +08:00