Adjust the left/right ratio of the merged PDF to 95%

Fix error messages
update translation matrix
2025-12-06 06:26:47 +00:00 · 2023-09-10 17:22:35 +08:00 · 2023-09-10 16:52:35 +08:00 · 2023-09-09 21:57:24 +08:00 · 2023-09-09 20:32:44 +08:00 · 2023-09-09 20:15:46 +08:00
--- a/.github/workflows/build-with-all-capacity.yml
+++ b/.github/workflows/build-with-all-capacity.yml
@@ -0,0 +1,44 @@
+# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
+name: build-with-all-capacity
+
+on:
+  push:
+    branches:
+      - 'master'
+
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}_with_all_capacity
+
+jobs:
+  build-and-push-image:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Log in to the Container registry
+        uses: docker/login-action@v2
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata (tags, labels) for Docker
+        id: meta
+        uses: docker/metadata-action@v4
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v4
+        with:
+          context: .
+          push: true
+          file: docs/GithubAction+AllCapacity
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
--- a/.github/workflows/stale.yml
+++ b/.github/workflows/stale.yml
@@ -0,0 +1,25 @@
+# This workflow warns and then closes issues and PRs that have had no activity for a specified amount of time.
+#
+# You can adjust the behavior by modifying this file.
+# For more information, see:
+# https://github.com/actions/stale
+
+name: 'Close stale issues and PRs'
+on:
+  schedule:
+    - cron: '*/5 * * * *'
+
+jobs:
+  stale:
+    runs-on: ubuntu-latest
+    permissions:
+      issues: write
+      pull-requests: read
+    
+    steps:
+      - uses: actions/stale@v8
+        with:
+          stale-issue-message: 'This issue is stale because it has been open for 100 days with no activity. Remove the stale label or add a comment, or it will be closed in 1 day.'
+          days-before-stale: 100
+          days-before-close: 1
+          debug-only: true
--- a/README.md
+++ b/README.md
@@ -10,13 +10,13 @@
 **如果喜欢这个项目，请给它一个Star；如果您发明了好用的快捷键或函数插件，欢迎发pull requests！**

 If you like this project, please give it a Star. If you've come up with more useful academic shortcuts or functional plugins, feel free to open an issue or pull request. We also have a README in [English|](docs/README_EN.md)[日本語|](docs/README_JP.md)[한국어|](https://github.com/mldljyh/ko_gpt_academic)[Русский|](docs/README_RS.md)[Français](docs/README_FR.md) translated by this project itself.
-To translate this project to arbitary language with GPT, read and run [`multi_language.py`](multi_language.py) (experimental).
+To translate this project into an arbitrary language with GPT, read and run [`multi_language.py`](multi_language.py) (experimental).

 > **Note**
 >
-> 1.请注意只有 **高亮(如红色)** 标识的函数插件（按钮）才支持读取文件，部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR。
+> 1.请注意只有 **高亮** 标识的函数插件（按钮）才支持读取文件，部分插件位于插件区的**下拉菜单**中。另外我们以**最高优先级**欢迎和处理任何新插件的PR。
 >
-> 2.本项目中每个文件的功能都在[自译解报告`self_analysis.md`](https://github.com/binary-husky/gpt_academic/wiki/GPT‐Academic项目自译解报告)详细说明。随着版本的迭代，您也可以随时自行点击相关函数插件，调用GPT重新生成项目的自我解析报告。常见问题汇总在[`wiki`](https://github.com/binary-husky/gpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)当中。[安装方法](#installation)。
+> 2.本项目中每个文件的功能都在[自译解报告`self_analysis.md`](https://github.com/binary-husky/gpt_academic/wiki/GPT‐Academic项目自译解报告)详细说明。随着版本的迭代，您也可以随时自行点击相关函数插件，调用GPT重新生成项目的自我解析报告。常见问题[`wiki`](https://github.com/binary-husky/gpt_academic/wiki)。[安装方法](#installation) | [配置说明](https://github.com/binary-husky/gpt_academic/wiki/%E9%A1%B9%E7%9B%AE%E9%85%8D%E7%BD%AE%E8%AF%B4%E6%98%8E)。
 > 
 > 3.本项目兼容并鼓励尝试国产大语言模型ChatGLM和Moss等等。支持多个api-key共存，可在配置文件中填写如`API_KEY="openai-key1,openai-key2,azure-key3,api2d-key4"`。需要临时更换`API_KEY`时，在输入区输入临时的`API_KEY`然后回车键提交后即可生效。

@@ -53,7 +53,8 @@ Latex论文一键校对 | [函数插件] 仿Grammarly对Latex文章进行语法
 [多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持 | 同时被GPT3.5、GPT4、[清华ChatGLM2](https://github.com/THUDM/ChatGLM2-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧？
 ⭐ChatGLM2微调模型 | 支持加载ChatGLM2微调模型，提供ChatGLM2微调辅助插件
 更多LLM模型接入，支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应)，引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama)和[盘古α](https://openi.org.cn/pangu/)
-⭐[虚空终端](https://github.com/binary-husky/void-terminal)pip包 | 脱离GUI，在Python中直接调用本项目的函数插件（开发中）
+⭐[void-terminal](https://github.com/binary-husky/void-terminal) pip包 | 脱离GUI，在Python中直接调用本项目的所有函数插件（开发中）
+⭐虚空终端插件 | [函数插件] 用自然语言，直接调度本项目其他插件
 更多新功能展示 (图像生成等) …… | 见本文档结尾处 ……
 </div>

@@ -148,11 +149,14 @@ python main.py

 ### 安装方法II：使用Docker

+[![fullcapacity](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-all-capacity.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-all-capacity.yml)
+
 1. 仅ChatGPT（推荐大多数人选择，等价于docker-compose方案1）
 [![basic](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml)
 [![basiclatex](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml)
 [![basicaudio](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-audio-assistant.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-audio-assistant.yml)

+
 ``` sh
 git clone --depth=1 https://github.com/binary-husky/gpt_academic.git  # 下载项目
 cd gpt_academic                                 # 进入路径
@@ -249,10 +253,13 @@ Tip：不指定文件直接点击 `载入对话历史存档` 可以查看历史h
 <img src="https://github.com/binary-husky/gpt_academic/assets/96192199/9fdcc391-f823-464f-9322-f8719677043b" height="250" >
 </div>

-3. 生成报告。大部分插件都会在执行结束后，生成工作报告
+3. 虚空终端（从自然语言输入中，理解用户意图+自动调用其他插件）
+
+- 步骤一：输入 “ 请调用插件翻译PDF论文，地址为https://openreview.net/pdf?id=rJl0r3R9KX ”
+- 步骤二：点击“虚空终端”
+
 <div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="250" >
-<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="250" >
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/66f1b044-e9ff-4eed-9126-5d4f3668f1ed" width="500" >
 </div>

 4. 模块化功能设计，简单的接口却能支持强大的功能
@@ -299,6 +306,7 @@ Tip：不指定文件直接点击 `载入对话历史存档` 可以查看历史h
 </div>


+
 ### II：版本:
 - version 3.60（todo）: 优化虚空终端，引入code interpreter和更多插件
 - version 3.50: 使用自然语言调用本项目的所有函数插件（虚空终端），支持插件分类，改进UI，设计新主题
--- a/check_proxy.py
+++ b/check_proxy.py
@@ -5,7 +5,7 @@ def check_proxy(proxies):
    try:
        response = requests.get("https://ipapi.co/json/", proxies=proxies, timeout=4)
        data = response.json()
-        print(f'查询代理的地理位置，返回的结果是{data}')
+        # print(f'查询代理的地理位置，返回的结果是{data}')
        if 'country_name' in data:
            country = data['country_name']
            result = f"代理配置 {proxies_https}, 代理所在地：{country}"
--- a/crazy_functional.py
+++ b/crazy_functional.py
@@ -501,6 +501,32 @@ def get_crazy_functions():
    except:
        print('Load function plugin failed')

+    try:
+        from crazy_functions.批量翻译PDF文档_NOUGAT import 批量翻译PDF文档
+        function_plugins.update({
+            "精准翻译PDF文档（NOUGAT）": {
+                "Group": "学术",
+                "Color": "stop",
+                "AsButton": False,
+                "Function": HotReload(批量翻译PDF文档)
+            }
+        })
+    except:
+        print('Load function plugin failed')
+
+
+    # try:
+    #     from crazy_functions.CodeInterpreter import 虚空终端CodeInterpreter
+    #     function_plugins.update({
+    #         "CodeInterpreter（开发中，仅供测试）": {
+    #             "Group": "编程|对话",
+    #             "Color": "stop",
+    #             "AsButton": False,
+    #             "Function": HotReload(虚空终端CodeInterpreter)
+    #         }
+    #     })
+    # except:
+    #     print('Load function plugin failed')

    # try:
    #     from crazy_functions.chatglm微调工具 import 微调数据集生成
--- a/crazy_functions/CodeInterpreter.py
+++ b/crazy_functions/CodeInterpreter.py
@@ -0,0 +1,231 @@
+from collections.abc import Callable, Iterable, Mapping
+from typing import Any
+from toolbox import CatchException, update_ui, gen_time_str, trimmed_format_exc, promote_file_to_downloadzone, clear_file_downloadzone
+from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
+from .crazy_utils import input_clipping, try_install_deps
+from multiprocessing import Process, Pipe
+import os
+import time
+
+templete = """
+```python
+import ...  # Put dependencies here, e.g. import numpy as np
+
+class TerminalFunction(object): # Do not change the name of the class; it must be `TerminalFunction`
+
+    def run(self, path):    # The function must be named `run`, and it takes exactly one positional argument.
+        # rewrite the function you have just written here 
+        ...
+        return generated_file_path
+```
+"""
+
+def inspect_dependency(chatbot, history):
+    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+    return True
+
+def get_code_block(reply):
+    import re
+    pattern = r"```(?:python|py)?([\s\S]*?)```" # match code blocks; an optional language tag is consumed outside the capture group
+    matches = re.findall(pattern, reply) # find all code blocks in text
+    if len(matches) == 1: 
+        return matches[0].strip() # the only code block (str.strip('python') would also eat leading/trailing code characters)
+    for match in matches:
+        if 'class TerminalFunction' in match:
+            return match.strip() # the code block that defines TerminalFunction
+    raise RuntimeError("GPT is not generating proper code.")
+
+def gpt_interact_multi_step(txt, file_type, llm_kwargs, chatbot, history):
+    # 输入
+    prompt_compose = [
+        f'Your job:\n'
+        f'1. Write a single Python function that takes the path of a `{file_type}` file as its only argument and returns a `string` containing the result of the analysis or the path of the generated files. \n',
+        f"2. You should write this function to perform the following task: " + txt + "\n",
+        f"3. Wrap the output Python function in a markdown code block."
+    ]
+    i_say = "".join(prompt_compose)
+    demo = []
+
+    # 第一步
+    gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
+        inputs=i_say, inputs_show_user=i_say, 
+        llm_kwargs=llm_kwargs, chatbot=chatbot, history=demo, 
+        sys_prompt= r"You are a programmer."
+    )
+    history.extend([i_say, gpt_say])
+    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
+
+    # 第二步
+    prompt_compose = [
+        "If the previous stage is successful, rewrite the function you have just written to satisfy the following template: \n",
+        templete
+    ]
+    i_say = "".join(prompt_compose); inputs_show_user = "If the previous stage is successful, rewrite the function you have just written to satisfy an executable template. "
+    gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
+        inputs=i_say, inputs_show_user=inputs_show_user, 
+        llm_kwargs=llm_kwargs, chatbot=chatbot, history=history, 
+        sys_prompt= r"You are a programmer."
+    )
+    code_to_return = gpt_say
+    history.extend([i_say, gpt_say])
+    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
+    
+    # # 第三步
+    # i_say = "Please list the packages that must be installed to run the code above. Then show me how to use the `try_install_deps` function to install them."
+    # i_say += 'For instance. `try_install_deps(["opencv-python", "scipy", "numpy"])`'
+    # installation_advance = yield from request_gpt_model_in_new_thread_with_ui_alive(
+    #     inputs=i_say, inputs_show_user=inputs_show_user, 
+    #     llm_kwargs=llm_kwargs, chatbot=chatbot, history=history, 
+    #     sys_prompt= r"You are a programmer."
+    # )
+    # # # 第三步  
+    # i_say = "Show me how to use `pip` to install packages to run the code above. "
+    # i_say += 'For instance: `pip install opencv-python scipy numpy`'
+    # installation_advance = yield from request_gpt_model_in_new_thread_with_ui_alive(
+    #     inputs=i_say, inputs_show_user=i_say, 
+    #     llm_kwargs=llm_kwargs, chatbot=chatbot, history=history, 
+    #     sys_prompt= r"You are a programmer."
+    # )
+    installation_advance = ""
+    
+    return code_to_return, installation_advance, txt, file_type, llm_kwargs, chatbot, history
+
+def make_module(code):
+    module_file = 'gpt_fn_' + gen_time_str().replace('-','_')
+    with open(f'gpt_log/{module_file}.py', 'w', encoding='utf8') as f:
+        f.write(code)
+
+    def get_class_name(class_string):
+        import re
+        # Use regex to extract the class name
+        class_name = re.search(r'class (\w+)\(', class_string).group(1)
+        return class_name
+
+    class_name = get_class_name(code)
+    return f"gpt_log.{module_file}->{class_name}"
+
+def init_module_instance(module):
+    import importlib
+    module_, class_ = module.split('->')
+    init_f = getattr(importlib.import_module(module_), class_)
+    return init_f()
+
+def for_immediate_show_off_when_possible(file_type, fp, chatbot):
+    if file_type in ['png', 'jpg']:
+        image_path = os.path.abspath(fp)
+        chatbot.append(['这是一张图片, 展示如下:',  
+            f'本地文件地址: <br/>`{image_path}`<br/>'+
+            f'本地文件预览: <br/><div align="center"><img src="file={image_path}"></div>'
+        ])
+    return chatbot
+
+def subprocess_worker(instance, file_path, return_dict):
+    return_dict['result'] = instance.run(file_path)
+
+def have_any_recent_upload_files(chatbot):
+    _5min = 5 * 60
+    if not chatbot: return False    # chatbot is None
+    most_recent_uploaded = chatbot._cookies.get("most_recent_uploaded", None)
+    if not most_recent_uploaded: return False   # most_recent_uploaded is None
+    if time.time() - most_recent_uploaded["time"] < _5min: return True # most_recent_uploaded is new
+    else: return False  # most_recent_uploaded is too old
+
+def get_recent_file_prompt_support(chatbot):
+    most_recent_uploaded = chatbot._cookies.get("most_recent_uploaded", None)
+    path = most_recent_uploaded['path']
+    return path
+
+@CatchException
+def 虚空终端CodeInterpreter(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+    """
+    txt             输入栏用户输入的文本，例如需要翻译的一段话，再例如一个包含了待处理文件的路径
+    llm_kwargs      gpt模型参数，如温度和top_p等，一般原样传递下去就行
+    plugin_kwargs   插件模型的参数，暂时没有用武之地
+    chatbot         聊天显示框的句柄，用于显示给用户
+    history         聊天历史，前情提要
+    system_prompt   给gpt的静默提醒
+    web_port        当前软件运行的端口号
+    """
+    raise NotImplementedError
+
+    # 清空历史，以免输入溢出
+    history = []; clear_file_downloadzone(chatbot)
+
+    # 基本信息：功能、贡献者
+    chatbot.append([
+        "函数插件功能？",
+        "CodeInterpreter开源版, 此插件处于开发阶段, 建议暂时不要使用, 插件初始化中 ..."
+    ])
+    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+
+    if have_any_recent_upload_files(chatbot):
+        file_path = get_recent_file_prompt_support(chatbot)
+    else:
+        chatbot.append(["文件检索", "没有发现任何近期上传的文件。"])
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+
+    # 读取文件
+    if ("recently_uploaded_files" in plugin_kwargs) and (plugin_kwargs["recently_uploaded_files"] == ""): plugin_kwargs.pop("recently_uploaded_files")
+    recently_uploaded_files = plugin_kwargs.get("recently_uploaded_files", None)
+    file_path = recently_uploaded_files[-1]
+    file_type = file_path.split('.')[-1]
+
+    # 粗心检查
+    if 'private_upload' in txt:
+        chatbot.append([
+            "...",
+            f"请在输入框内填写需求，然后再次点击该插件（文件路径 {file_path} 已经被记忆）"
+        ])
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+        return
+    
+    # 开始干正事
+    for j in range(5):  # 最多重试5次
+        try:
+            code, installation_advance, txt, file_type, llm_kwargs, chatbot, history = \
+                yield from gpt_interact_multi_step(txt, file_type, llm_kwargs, chatbot, history)
+            code = get_code_block(code)
+            res = make_module(code)
+            instance = init_module_instance(res)
+            break
+        except Exception as e:
+            chatbot.append([f"第{j+1}次代码生成尝试，失败了", f"错误追踪\n```\n{trimmed_format_exc()}\n```\n"])
+            yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+
+    # 代码生成结束, 开始执行
+    try:
+        import multiprocessing
+        manager = multiprocessing.Manager()
+        return_dict = manager.dict()
+
+        p = multiprocessing.Process(target=subprocess_worker, args=(instance, file_path, return_dict))
+        # only has 10 seconds to run
+        p.start(); p.join(timeout=10)
+        if p.is_alive(): p.terminate(); p.join()
+        p.close()
+        res = return_dict['result']
+        # res = instance.run(file_path)
+    except Exception as e:
+        chatbot.append(["执行失败了", f"错误追踪\n```\n{trimmed_format_exc()}\n```\n"])
+        # chatbot.append(["如果是缺乏依赖，请参考以下建议", installation_advance])
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+        return
+
+    # 顺利完成，收尾
+    res = str(res)
+    if os.path.exists(res):
+        chatbot.append(["执行成功了，结果是一个有效文件", "结果：" + res])
+        new_file_path = promote_file_to_downloadzone(res, chatbot=chatbot)
+        chatbot = for_immediate_show_off_when_possible(file_type, new_file_path, chatbot)
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新
+    else:
+        chatbot.append(["执行成功了，结果是一个字符串", "结果：" + res])
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 界面更新   
+
+"""
+测试：
+    裁剪图像，保留下半部分
+    交换图像的蓝色通道和红色通道
+    将图像转为灰度图像
+    将csv文件转excel表格
+"""
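Review note on `get_code_block` in crazy_functions/CodeInterpreter.py: `str.strip('python')` treats its argument as a set of characters, not as a prefix, so it can also remove leading or trailing characters of the generated code itself. The sketch below is one possible standalone variant that drops the language tag in the regex instead. The name `extract_code_block`, the `marker` parameter, and the `FENCE` helper are hypothetical and not part of the patch.

```python
import re

# Three backticks, built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def extract_code_block(reply: str, marker: str = "class TerminalFunction") -> str:
    """Return the fenced code block in `reply` that holds the generated class.

    Hypothetical helper: mirrors the selection logic of `get_code_block`
    (single block wins, otherwise the block containing `marker`).
    """
    # The optional language tag (python / py) is matched outside the capture group,
    # so the captured text is only the code between the fences.
    pattern = FENCE + r"(?:python|py)?\s*\n([\s\S]*?)" + FENCE
    blocks = re.findall(pattern, reply)
    if len(blocks) == 1:
        return blocks[0].strip()
    for block in blocks:
        if marker in block:
            return block.strip()
    raise RuntimeError("no usable code block found in the reply")


if __name__ == "__main__":
    demo = "Sure:\n" + FENCE + "python\nclass TerminalFunction(object):\n    pass\n" + FENCE
    print(extract_code_block(demo))
```

Keeping the tag out of the capture group means no post-processing of the code text is needed beyond trimming whitespace.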
--- a/crazy_functions/Latex输出PDF结果.py
+++ b/crazy_functions/Latex输出PDF结果.py
@@ -109,7 +109,7 @@ def arxiv_download(chatbot, history, txt):

    url_ = txt   # https://arxiv.org/abs/1707.06690
    if not txt.startswith('https://arxiv.org/abs/'): 
-        msg = f"解析arxiv网址失败, 期望格式例如: https://arxiv.org/abs/1707.06690。实际得到格式: {url_}"
+        msg = f"解析arxiv网址失败, 期望格式例如: https://arxiv.org/abs/1707.06690。实际得到格式: {url_}。"
        yield from update_ui_lastest_msg(msg, chatbot=chatbot, history=history) # 刷新界面
        return msg, None
    # <-------------- set format ------------->
@@ -255,7 +255,7 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
        project_folder = txt
    else:
        if txt == "": txt = '空空如也的输入栏'
-        report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到本地项目或无权访问: {txt}")
+        report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到本地项目或无法处理: {txt}")
        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
        return
    
--- a/crazy_functions/crazy_utils.py
+++ b/crazy_functions/crazy_utils.py
@@ -469,14 +469,16 @@ def read_and_clean_pdf_text(fp):
                    '- ', '') for t in text_areas['blocks'] if 'lines' in t]
                
        ############################## <第 2 步，获取正文主字体> ##################################
-        fsize_statiscs = {}
-        for span in meta_span:
-            if span[1] not in fsize_statiscs: fsize_statiscs[span[1]] = 0
-            fsize_statiscs[span[1]] += span[2]
-        main_fsize = max(fsize_statiscs, key=fsize_statiscs.get)
-        if REMOVE_FOOT_NOTE:
-            give_up_fize_threshold = main_fsize * REMOVE_FOOT_FFSIZE_PERCENT
-
+        try:
+            fsize_statiscs = {}
+            for span in meta_span:
+                if span[1] not in fsize_statiscs: fsize_statiscs[span[1]] = 0
+                fsize_statiscs[span[1]] += span[2]
+            main_fsize = max(fsize_statiscs, key=fsize_statiscs.get)
+            if REMOVE_FOOT_NOTE:
+                give_up_fize_threshold = main_fsize * REMOVE_FOOT_FFSIZE_PERCENT
+        except Exception as e:
+            raise RuntimeError(f'抱歉, 我们暂时无法解析此PDF文档: {fp}。') from e
        ############################## <第 3 步，切分和重新整合> ##################################
        mega_sec = []
        sec = []
@@ -591,11 +593,16 @@ def get_files_from_everything(txt, type): # type='.md'
        # 网络的远程文件
        import requests
        from toolbox import get_conf
+        from toolbox import get_log_folder, gen_time_str
        proxies, = get_conf('proxies')
-        r = requests.get(txt, proxies=proxies)
-        with open('./gpt_log/temp'+type, 'wb+') as f: f.write(r.content)
-        project_folder = './gpt_log/'
-        file_manifest = ['./gpt_log/temp'+type]
+        try:
+            r = requests.get(txt, proxies=proxies)
+        except Exception as e:
+            raise ConnectionRefusedError(f"无法下载资源{txt}，请检查。") from e
+        path = os.path.join(get_log_folder(plugin_name='web_download'), gen_time_str()+type)
+        with open(path, 'wb+') as f: f.write(r.content)
+        project_folder = get_log_folder(plugin_name='web_download')
+        file_manifest = [path]
    elif txt.endswith(type):
        # 直接给定文件
        file_manifest = [txt]
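Review note on step 2 of `read_and_clean_pdf_text` in crazy_functions/crazy_utils.py: the main body font is chosen by a weighted vote, where each font size accumulates the length of the text set in it and the heaviest size wins. A minimal sketch of that idea follows. It assumes each span is a `(text, font_size, char_count)` tuple, which is a guess at the layout of `meta_span` based on the `span[1]` / `span[2]` indexing in the hunk; the function name `main_font_size` is hypothetical.

```python
from collections import defaultdict


def main_font_size(spans):
    """Return the font size that covers the most characters.

    `spans` is assumed to be an iterable of (text, font_size, char_count)
    tuples; this layout is an assumption, not taken from the project.
    """
    weight = defaultdict(int)
    for _text, size, n_chars in spans:
        weight[size] += n_chars  # each size is weighted by how much text uses it
    if not weight:
        # Same failure mode the patch guards against: a PDF with no extractable text.
        raise RuntimeError("no text spans found; cannot determine the main font size")
    return max(weight, key=weight.get)


if __name__ == "__main__":
    demo = [("Title", 18.0, 5), ("body", 10.0, 400), ("footnote", 8.0, 60), ("body", 10.0, 300)]
    print(main_font_size(demo))
```

With no spans at all, `max` over an empty mapping would raise `ValueError`, which is presumably the kind of failure the new `try`/`except` in the hunk converts into a readable error.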
--- a/crazy_functions/latex_fns/latex_toolbox.py
+++ b/crazy_functions/latex_fns/latex_toolbox.py
@@ -423,7 +423,7 @@ def compile_latex_with_timeout(command, cwd, timeout=60):

 def merge_pdfs(pdf1_path, pdf2_path, output_path):
    import PyPDF2
-    Percent = 0.8
+    Percent = 0.95
    # Open the first PDF file
    with open(pdf1_path, 'rb') as pdf1_file:
        pdf1_reader = PyPDF2.PdfFileReader(pdf1_file)
--- a/crazy_functions/live_audio/aliyunASR.py
+++ b/crazy_functions/live_audio/aliyunASR.py
@@ -1,4 +1,4 @@
-import time, threading, json
+import time, logging, json


 class AliyunASR():
@@ -12,14 +12,14 @@ class AliyunASR():
        message = json.loads(message)
        self.parsed_sentence = message['payload']['result']
        self.event_on_entence_end.set()
-        print(self.parsed_sentence)
+        # print(self.parsed_sentence)

    def test_on_start(self, message, *args):
        # print("test_on_start:{}".format(message))
        pass

    def test_on_error(self, message, *args):
-        print("on_error args=>{}".format(args))
+        logging.error("on_error args=>{}".format(args))
        pass

    def test_on_close(self, *args):
@@ -36,7 +36,6 @@ class AliyunASR():
        # print("on_completed:args=>{} message=>{}".format(args, message))
        pass

-
    def audio_convertion_thread(self, uuid):
        # 在一个异步线程中采集音频
        import nls  # pip install git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git
--- a/crazy_functions/pdf_fns/parse_pdf.py
+++ b/crazy_functions/pdf_fns/parse_pdf.py
@@ -20,6 +20,11 @@ def get_avail_grobid_url():
 def parse_pdf(pdf_path, grobid_url):
    import scipdf   # pip install scipdf_parser
    if grobid_url.endswith('/'): grobid_url = grobid_url.rstrip('/')
-    article_dict = scipdf.parse_pdf_to_dict(pdf_path, grobid_url=grobid_url)
+    try:
+        article_dict = scipdf.parse_pdf_to_dict(pdf_path, grobid_url=grobid_url)
+    except GROBID_OFFLINE_EXCEPTION:
+        raise GROBID_OFFLINE_EXCEPTION("GROBID服务不可用，请修改config中的GROBID_URL，可修改成本地GROBID服务。")
+    except Exception as e:
+        raise RuntimeError("解析PDF失败，请检查PDF是否损坏。") from e
    return article_dict

--- a/crazy_functions/批量翻译PDF文档_NOUGAT.py
+++ b/crazy_functions/批量翻译PDF文档_NOUGAT.py
@@ -0,0 +1,271 @@
+from toolbox import CatchException, report_execption, gen_time_str
+from toolbox import update_ui, promote_file_to_downloadzone, update_ui_lastest_msg, disable_auto_promotion
+from toolbox import write_history_to_file, get_log_folder
+from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
+from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
+from .crazy_utils import read_and_clean_pdf_text
+from .pdf_fns.parse_pdf import parse_pdf, get_avail_grobid_url
+from colorful import *
+import os
+import math
+import logging
+
+def markdown_to_dict(article_content):
+    cur_t = ""
+    cur_c = ""
+    results = {}
+    for line in article_content:
+        if line.startswith('#'):
+            if cur_t!="":
+                if cur_t not in results:
+                    results.update({cur_t:cur_c.lstrip('\n')})
+                else:
+                    # handle sections with duplicate headings
+                    results.update({cur_t + " " + gen_time_str():cur_c.lstrip('\n')})
+            cur_t = line.rstrip('\n')
+            cur_c = ""
+        else:
+            cur_c += line
+    if cur_t != "":  # flush the last section, which the loop above never stores
+        results[cur_t if cur_t not in results else cur_t + " " + gen_time_str()] = cur_c.lstrip('\n')
+    results_final = {}
+    for k in list(results.keys()):
+        if k.startswith('# '):
+            results_final['title'] = k.split('# ')[-1]
+            results_final['authors'] = results.pop(k).lstrip('\n')
+        if k.startswith('###### Abstract'):
+            results_final['abstract'] = results.pop(k).lstrip('\n')
+
+    results_final_sections = []
+    for k,v in results.items():
+        results_final_sections.append({
+            'heading':k.lstrip("# "),
+            'text':v if len(v) > 0 else f"The beginning of {k.lstrip('# ')} section."
+        })
+    results_final['sections'] = results_final_sections
+    return results_final
+
+
+@CatchException
+def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+
+    disable_auto_promotion(chatbot)
+    # 基本信息：功能、贡献者
+    chatbot.append([
+        "函数插件功能？",
+        "批量翻译PDF文档。函数插件贡献者: Binary-Husky"])
+    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+
+    # 尝试导入依赖，如果缺少依赖，则给出安装建议
+    try:
+        import nougat
+        import tiktoken
+    except:
+        report_execption(chatbot, history,
+                         a=f"解析项目: {txt}",
+                         b=f"导入软件依赖失败。使用该模块需要额外依赖，安装方法```pip install --upgrade nougat-ocr tiktoken```。")
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+        return
+
+    # 清空历史，以免输入溢出
+    history = []
+
+    from .crazy_utils import get_files_from_everything
+    success, file_manifest, project_folder = get_files_from_everything(txt, type='.pdf')
+    # 检测输入参数，如没有给定输入参数，直接退出
+    if not success:
+        if txt == "": txt = '空空如也的输入栏'
+
+    # 如果没找到任何文件
+    if len(file_manifest) == 0:
+        report_execption(chatbot, history,
+                         a=f"解析项目: {txt}", b=f"找不到任何.pdf文件: {txt}")
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+        return
+
+    # 开始正式执行任务
+    yield from 解析PDF_基于NOUGAT(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt)
+
+    
+def nougat_with_timeout(command, cwd, timeout=3600):
+    import subprocess
+    process = subprocess.Popen(command, shell=True, cwd=cwd)
+    try:
+        stdout, stderr = process.communicate(timeout=timeout)
+    except subprocess.TimeoutExpired:
+        process.kill()
+        stdout, stderr = process.communicate()
+        print("Process timed out!")
+        return False
+    return True
+
+
+def NOUGAT_parse_pdf(fp):
+    import glob
+    from toolbox import get_log_folder, gen_time_str
+    dst = os.path.join(get_log_folder(plugin_name='nougat'), gen_time_str())
+    os.makedirs(dst)
+    nougat_with_timeout(f'nougat --out "{os.path.abspath(dst)}" "{os.path.abspath(fp)}"', os.getcwd())
+    res = glob.glob(os.path.join(dst,'*.mmd'))
+    if len(res) == 0:
+        raise RuntimeError("Nougat解析论文失败。")
+    return res[0]
+
+
+def 解析PDF_基于NOUGAT(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
+    import copy
+    import tiktoken
+    TOKEN_LIMIT_PER_FRAGMENT = 1280
+    generated_conclusion_files = []
+    generated_html_files = []
+    DST_LANG = "中文"
+    for index, fp in enumerate(file_manifest):
+        chatbot.append(["当前进度：", f"正在解析论文，请稍候。（第一次运行时，需要花费较长时间下载NOUGAT参数）"]); yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+        fpp = NOUGAT_parse_pdf(fp)
+
+        with open(fpp, 'r', encoding='utf8') as f:
+            article_content = f.readlines()
+        article_dict = markdown_to_dict(article_content)
+        logging.info(article_dict)
+
+        prompt = "以下是一篇学术论文的基本信息:\n"
+        # title
+        title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
+        # authors
+        authors = article_dict.get('authors', '无法获取 authors'); prompt += f'authors:{authors}\n\n'
+        # abstract
+        abstract = article_dict.get('abstract', '无法获取 abstract'); prompt += f'abstract:{abstract}\n\n'
+        # command
+        prompt += f"请将题目和摘要翻译为{DST_LANG}。"
+        meta = [f'# Title:\n\n', title, f'# Abstract:\n\n', abstract ]
+
+        # 单线，获取文章meta信息
+        paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
+            inputs=prompt,
+            inputs_show_user=prompt,
+            llm_kwargs=llm_kwargs,
+            chatbot=chatbot, history=[],
+            sys_prompt="You are an academic paper reader.",
+        )
+
+        # 多线，翻译
+        inputs_array = []
+        inputs_show_user_array = []
+
+        # get_token_num
+        from request_llm.bridge_all import model_info
+        enc = model_info[llm_kwargs['llm_model']]['tokenizer']
+        def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
+        from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
+
+        def break_down(txt):
+            raw_token_num = get_token_num(txt)
+            if raw_token_num <= TOKEN_LIMIT_PER_FRAGMENT:
+                return [txt]
+            else:
+                # raw_token_num > TOKEN_LIMIT_PER_FRAGMENT
+                # find a smooth token limit to achieve even separation
+                count = int(math.ceil(raw_token_num / TOKEN_LIMIT_PER_FRAGMENT))
+                token_limit_smooth = raw_token_num // count + count
+                return breakdown_txt_to_satisfy_token_limit_for_pdf(txt, get_token_fn=get_token_num, limit=token_limit_smooth)
+
+        for section in article_dict.get('sections'):
+            if len(section['text']) == 0: continue
+            section_frags = break_down(section['text'])
+            for i, fragment in enumerate(section_frags):
+                heading = section['heading']
+                if len(section_frags) > 1: heading += f' Part-{i+1}'
+                inputs_array.append(
+                    f"你需要翻译{heading}章节，内容如下: \n\n{fragment}"
+                )
+                inputs_show_user_array.append(
+                    f"# {heading}\n\n{fragment}"
+                )
+
+        gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
+            inputs_array=inputs_array,
+            inputs_show_user_array=inputs_show_user_array,
+            llm_kwargs=llm_kwargs,
+            chatbot=chatbot,
+            history_array=[meta for _ in inputs_array],
+            sys_prompt_array=[
+                "请你作为一个学术翻译，负责把学术论文准确翻译成中文。注意文章中的每一句话都要翻译。" for _ in inputs_array],
+        )
+        res_path = write_history_to_file(meta +  ["# Meta Translation" , paper_meta_info] + gpt_response_collection, file_basename=None, file_fullname=None)
+        promote_file_to_downloadzone(res_path, rename_file=os.path.basename(fp)+'.md', chatbot=chatbot)
+        generated_conclusion_files.append(res_path)
+
+        ch = construct_html() 
+        orig = ""
+        trans = ""
+        gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
+        for i,k in enumerate(gpt_response_collection_html): 
+            if i%2==0:
+                gpt_response_collection_html[i] = inputs_show_user_array[i//2]
+            else:
+                gpt_response_collection_html[i] = gpt_response_collection_html[i]
+
+        final = ["", "", "一、论文概况",  "", "Abstract", paper_meta_info,  "二、论文翻译",  ""]
+        final.extend(gpt_response_collection_html)
+        for i, k in enumerate(final): 
+            if i%2==0:
+                orig = k
+            if i%2==1:
+                trans = k
+                ch.add_row(a=orig, b=trans)
+        create_report_file_name = f"{os.path.basename(fp)}.trans.html"
+        html_file = ch.save_file(create_report_file_name)
+        generated_html_files.append(html_file)
+        promote_file_to_downloadzone(html_file, rename_file=os.path.basename(html_file), chatbot=chatbot)
+
+    chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
+    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+
+
+
+class construct_html():
+    def __init__(self) -> None:
+        self.css = """
+.row {
+  display: flex;
+  flex-wrap: wrap;
+}
+
+.column {
+  flex: 1;
+  padding: 10px;
+}
+
+.table-header {
+  font-weight: bold;
+  border-bottom: 1px solid black;
+}
+
+.table-row {
+  border-bottom: 1px solid lightgray;
+}
+
+.table-cell {
+  padding: 5px;
+}
+        """
+        self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
+
+
+    def add_row(self, a, b):
+        tmp = """
+<div class="row table-row">
+    <div class="column table-cell">REPLACE_A</div>
+    <div class="column table-cell">REPLACE_B</div>
+</div>
+        """
+        from toolbox import markdown_convertion
+        tmp = tmp.replace('REPLACE_A', markdown_convertion(a))
+        tmp = tmp.replace('REPLACE_B', markdown_convertion(b))
+        self.html_string += tmp
+
+
+    def save_file(self, file_name):
+        with open(os.path.join(get_log_folder(), file_name), 'w', encoding='utf8') as f:
+            f.write(self.html_string.encode('utf-8', 'ignore').decode())
+        return os.path.join(get_log_folder(), file_name)
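Note on the `break_down` helper in the file above: it picks a "smooth" token limit so that fragments come out roughly equal in size instead of leaving a short tail fragment. A minimal sketch of that arithmetic follows. The standalone function `smooth_token_limit` is hypothetical and not part of the project, and the short-text branch is an assumption because the corresponding lines are not shown in this diff. The constant matches `TOKEN_LIMIT_PER_FRAGMENT = 1280` used in this plugin.

```python
import math

TOKEN_LIMIT_PER_FRAGMENT = 1280  # same value as in the plugin code


def smooth_token_limit(raw_token_num: int, limit: int = TOKEN_LIMIT_PER_FRAGMENT) -> int:
    """Hypothetical helper: per-fragment limit that splits a text evenly."""
    if raw_token_num <= limit:
        # Assumption: short texts are not split, so the plain limit applies.
        return limit
    # Number of fragments needed under the hard limit.
    count = int(math.ceil(raw_token_num / limit))
    # Even share per fragment, plus a small margin of `count` tokens.
    return raw_token_num // count + count


print(smooth_token_limit(3000))  # 3 fragments of about 1000 tokens -> 1003
```

With 3000 tokens and a hard limit of 1280, a naive split would give fragments of 1280, 1280 and 440 tokens; the smoothed limit of 1003 yields three fragments of similar length.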
--- a/crazy_functions/批量翻译PDF文档_多线程.py
+++ b/crazy_functions/批量翻译PDF文档_多线程.py
@@ -24,10 +24,11 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
    try:
        import fitz
        import tiktoken
+        import scipdf
    except:
        report_execption(chatbot, history,
                         a=f"解析项目: {txt}",
-                         b=f"导入软件依赖失败。使用该模块需要额外依赖，安装方法```pip install --upgrade pymupdf tiktoken```。")
+                         b=f"导入软件依赖失败。使用该模块需要额外依赖，安装方法```pip install --upgrade pymupdf tiktoken scipdf_parser```。")
        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
        return

@@ -58,7 +59,6 @@ def 批量翻译PDF文档(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst

 def 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, grobid_url):
    import copy
-    import tiktoken
    TOKEN_LIMIT_PER_FRAGMENT = 1280
    generated_conclusion_files = []
    generated_html_files = []
@@ -66,7 +66,7 @@ def 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwa
    for index, fp in enumerate(file_manifest):
        chatbot.append(["当前进度：", f"正在连接GROBID服务，请稍候: {grobid_url}\n如果等待时间过长，请修改config中的GROBID_URL，可修改成本地GROBID服务。"]); yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
        article_dict = parse_pdf(fp, grobid_url)
-        print(article_dict)
+        if article_dict is None: raise RuntimeError("解析PDF失败，请检查PDF是否损坏。")
        prompt = "以下是一篇学术论文的基本信息:\n"
        # title
        title = article_dict.get('title', '无法获取 title'); prompt += f'title:{title}\n\n'
--- a/crazy_functions/联网的ChatGPT.py
+++ b/crazy_functions/联网的ChatGPT.py
@@ -75,7 +75,11 @@ def 连接网络回答问题(txt, llm_kwargs, plugin_kwargs, chatbot, history, s
    proxies, = get_conf('proxies')
    urls = google(txt, proxies)
    history = []
-
+    if len(urls) == 0:
+        chatbot.append((f"结论：{txt}",
+                        "[Local Message] 受到google限制，无法从google获取信息！"))
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间，我们先及时地做一次界面更新
+        return
    # ------------- < 第2步：依次访问网页 > -------------
    max_search_result = 5   # 最多收纳多少个网页的结果
    for index, url in enumerate(urls[:max_search_result]):
--- a/crazy_functions/联网的ChatGPT_bing版.py
+++ b/crazy_functions/联网的ChatGPT_bing版.py
@@ -75,7 +75,11 @@ def 连接bing搜索回答问题(txt, llm_kwargs, plugin_kwargs, chatbot, histor
    proxies, = get_conf('proxies')
    urls = bing_search(txt, proxies)
    history = []
-
+    if len(urls) == 0:
+        chatbot.append((f"结论：{txt}",
+                        "[Local Message] 受到bing限制，无法从bing获取信息！"))
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间，我们先及时地做一次界面更新
+        return
    # ------------- < 第2步：依次访问网页 > -------------
    max_search_result = 8   # 最多收纳多少个网页的结果
    for index, url in enumerate(urls[:max_search_result]):
--- a/crazy_functions/虚空终端.py
+++ b/crazy_functions/虚空终端.py
@@ -24,12 +24,13 @@ explain_msg = """
 ## 虚空终端插件说明:

 1. 请用**自然语言**描述您需要做什么。例如：
-    - 「请调用插件，为我翻译PDF论文，论文我刚刚放到上传区了。」
-    - 「请调用插件翻译PDF论文，地址为https://www.nature.com/articles/s41586-019-1724-z.pdf」
-    - 「生成一张图片，图中鲜花怒放，绿草如茵，用插件实现。」
+    - 「请调用插件，为我翻译PDF论文，论文我刚刚放到上传区了」
+    - 「请调用插件翻译PDF论文，地址为https://openreview.net/pdf?id=rJl0r3R9KX」
+    - 「把Arxiv论文翻译成中文PDF，arxiv论文的ID是1812.10695，记得用插件！」
+    - 「生成一张图片，图中鲜花怒放，绿草如茵，用插件实现」
    - 「用插件翻译README，Github网址是https://github.com/facebookresearch/co-tracker」
-    - 「给爷翻译Arxiv论文，arxiv论文的ID是1812.10695，记得用插件，不要自己瞎搞！」
-    - 「我不喜欢当前的界面颜色，修改配置，把主题THEME更换为THEME="High-Contrast"。」
+    - 「我不喜欢当前的界面颜色，修改配置，把主题THEME更换为THEME="High-Contrast"」
+    - 「请调用插件，解析python源代码项目，代码我刚刚打包拖到上传区了」
    - 「请问Transformer网络的结构是怎样的？」

 2. 您可以打开插件下拉菜单以了解本项目的各种能力。    
--- a/crazy_functions/解析项目源代码.py
+++ b/crazy_functions/解析项目源代码.py
@@ -1,12 +1,13 @@
-from toolbox import update_ui
-from toolbox import CatchException, report_execption, write_results_to_file
+from toolbox import update_ui, promote_file_to_downloadzone, disable_auto_promotion
+from toolbox import CatchException, report_execption, write_history_to_file
 from .crazy_utils import input_clipping

 def 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt):
    import os, copy
    from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
    from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
-    msg = '正常'
+    disable_auto_promotion(chatbot=chatbot)
+
    summary_batch_isolation = True
    inputs_array = []
    inputs_show_user_array = []
@@ -43,7 +44,8 @@ def 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
    # 全部文件解析完成，结果写入文件，准备对工程源代码进行汇总分析
    report_part_1 = copy.deepcopy(gpt_response_collection)
    history_to_return = report_part_1
-    res = write_results_to_file(report_part_1)
+    res = write_history_to_file(report_part_1)
+    promote_file_to_downloadzone(res, chatbot=chatbot)
    chatbot.append(("完成？", "逐个文件分析已完成。" + res + "\n\n正在开始汇总。"))
    yield from update_ui(chatbot=chatbot, history=history_to_return) # 刷新界面

@@ -97,7 +99,8 @@ def 解析源代码新(file_manifest, project_folder, llm_kwargs, plugin_kwargs,

    ############################## <END> ##################################
    history_to_return.extend(report_part_2)
-    res = write_results_to_file(history_to_return)
+    res = write_history_to_file(history_to_return)
+    promote_file_to_downloadzone(res, chatbot=chatbot)
    chatbot.append(("完成了吗？", res))
    yield from update_ui(chatbot=chatbot, history=history_to_return) # 刷新界面

--- a/crazy_functions/语音助手.py
+++ b/crazy_functions/语音助手.py
@@ -80,9 +80,9 @@ class InterviewAssistant(AliyunASR):
    def __init__(self):
        self.capture_interval = 0.5 # second
        self.stop = False
-        self.parsed_text = ""
-        self.parsed_sentence = ""
-        self.buffered_sentence = ""
+        self.parsed_text = ""   # 下个句子中已经说完的部分, 由 test_on_result_chg() 写入
+        self.parsed_sentence = ""   # 某段话的整个句子,由 test_on_sentence_end() 写入
+        self.buffered_sentence = ""    #
        self.event_on_result_chg = threading.Event()
        self.event_on_entence_end = threading.Event()
        self.event_on_commit_question = threading.Event()
@@ -132,7 +132,7 @@ class InterviewAssistant(AliyunASR):
            self.plugin_wd.feed()

            if self.event_on_result_chg.is_set(): 
-                # update audio decode result
+                # called when some words have finished
                self.event_on_result_chg.clear()
                chatbot[-1] = list(chatbot[-1])
                chatbot[-1][0] = self.buffered_sentence + self.parsed_text
@@ -144,7 +144,11 @@ class InterviewAssistant(AliyunASR):
                # called when a sentence has ended
                self.event_on_entence_end.clear()
                self.parsed_text = self.parsed_sentence
-                self.buffered_sentence += self.parsed_sentence
+                self.buffered_sentence += self.parsed_text
+                chatbot[-1] = list(chatbot[-1])
+                chatbot[-1][0] = self.buffered_sentence
+                history = chatbot2history(chatbot)
+                yield from update_ui(chatbot=chatbot, history=history)  # 刷新界面

            if self.event_on_commit_question.is_set():
                # called when a question should be commited
--- a/crazy_functions/谷歌检索小助手.py
+++ b/crazy_functions/谷歌检索小助手.py
@@ -1,26 +1,75 @@
 from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
-from toolbox import CatchException, report_execption, write_results_to_file
-from toolbox import update_ui
+from toolbox import CatchException, report_execption, promote_file_to_downloadzone
+from toolbox import update_ui, update_ui_lastest_msg, disable_auto_promotion, write_history_to_file
+import logging
+import requests
+import time
+import random
+
+ENABLE_ALL_VERSION_SEARCH = True

 def get_meta_information(url, chatbot, history):
-    import requests
    import arxiv
    import difflib
+    import re
    from bs4 import BeautifulSoup
    from toolbox import get_conf
+    from urllib.parse import urlparse
+    session = requests.session()
+
    proxies, = get_conf('proxies')
    headers = {
-        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36',
+        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36',
+        'Accept-Encoding': 'gzip, deflate, br', 
+        'Accept-Language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
+        'Cache-Control':'max-age=0',
+        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7', 
+        'Connection': 'keep-alive'
    }
-    # 发送 GET 请求
-    response = requests.get(url, proxies=proxies, headers=headers)
+    session.proxies.update(proxies)
+    session.headers.update(headers)

+    response = session.get(url)
    # 解析网页内容
    soup = BeautifulSoup(response.text, "html.parser")

    def string_similar(s1, s2):
        return difflib.SequenceMatcher(None, s1, s2).quick_ratio()

+    if ENABLE_ALL_VERSION_SEARCH:
+        def search_all_version(url):
+            time.sleep(random.randint(1,5)) # 睡一会防止触发google反爬虫
+            response = session.get(url)
+            soup = BeautifulSoup(response.text, "html.parser")
+
+            for result in soup.select(".gs_ri"):
+                try:
+                    url = result.select_one(".gs_rt").a['href']
+                except:
+                    continue
+                arxiv_id = extract_arxiv_id(url)
+                if not arxiv_id:
+                    continue
+                search = arxiv.Search(
+                    id_list=[arxiv_id],
+                    max_results=1,
+                    sort_by=arxiv.SortCriterion.Relevance,
+                )
+                try: paper = next(search.results())
+                except: paper = None
+                return paper
+
+            return None
+
+        def extract_arxiv_id(url):
+            # 返回给定的url解析出的arxiv_id，如url未成功匹配返回None
+            pattern = r'arxiv\.org/abs/([^/]+)'
+            match = re.search(pattern, url)
+            if match:
+                return match.group(1)
+            else:
+                return None
+
    profile = []
    # 获取所有文章的标题和作者
    for result in soup.select(".gs_ri"):
@@ -31,32 +80,45 @@ def get_meta_information(url, chatbot, history):
        except:
            citation = 'cited by 0'
        abstract = result.select_one(".gs_rs").text.strip()  # 摘要在 .gs_rs 中的文本，需要清除首尾空格
+
+        # 首先在arxiv上搜索，获取文章摘要
        search = arxiv.Search(
            query = title,
            max_results = 1,
            sort_by = arxiv.SortCriterion.Relevance,
        )
-        try:
-            paper = next(search.results())
-            if string_similar(title, paper.title) > 0.90: # same paper
-                abstract = paper.summary.replace('\n', ' ')
-                is_paper_in_arxiv = True
-            else:   # different paper
-                abstract = abstract
-                is_paper_in_arxiv = False
-            paper = next(search.results())
-        except:
+        try: paper = next(search.results())
+        except: paper = None
+        
+        is_match = paper is not None and string_similar(title, paper.title) > 0.90
+
+        # 如果在Arxiv上匹配失败，检索文章的历史版本的题目
+        if not is_match and ENABLE_ALL_VERSION_SEARCH:
+            other_versions_page_url = [tag['href'] for tag in result.select_one('.gs_flb').select('.gs_nph') if 'cluster' in tag['href']]
+            if len(other_versions_page_url) > 0:
+                other_versions_page_url = other_versions_page_url[0]
+                paper = search_all_version('http://' + urlparse(url).netloc + other_versions_page_url)
+                is_match = paper is not None and string_similar(title, paper.title) > 0.90
+
+        if is_match:
+            # same paper
+            abstract = paper.summary.replace('\n', ' ')
+            is_paper_in_arxiv = True
+        else:
+            # different paper
            abstract = abstract
            is_paper_in_arxiv = False
-        print(title)
-        print(author)
-        print(citation)
+
+        logging.info('[title]:' + title)
+        logging.info('[author]:' + author)
+        logging.info('[citation]:' + citation)
+
        profile.append({
-            'title':title,
-            'author':author,
-            'citation':citation,
-            'abstract':abstract,
-            'is_paper_in_arxiv':is_paper_in_arxiv,
+            'title': title,
+            'author': author,
+            'citation': citation,
+            'abstract': abstract,
+            'is_paper_in_arxiv': is_paper_in_arxiv,
        })

        chatbot[-1] = [chatbot[-1][0], title + f'\n\n是否在arxiv中（不在arxiv中无法获取完整摘要）:{is_paper_in_arxiv}\n\n' + abstract]
@@ -65,6 +127,7 @@ def get_meta_information(url, chatbot, history):

@CatchException
 def 谷歌检索小助手(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+    disable_auto_promotion(chatbot=chatbot)
    # 基本信息：功能、贡献者
    chatbot.append([
        "函数插件功能？",
@@ -86,6 +149,9 @@ def 谷歌检索小助手(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
    # 清空历史，以免输入溢出
    history = []
    meta_paper_info_list = yield from get_meta_information(txt, chatbot, history)
+    if len(meta_paper_info_list) == 0:
+        yield from update_ui_lastest_msg(lastmsg='获取文献失败，可能触发了google反爬虫机制。',chatbot=chatbot, history=history, delay=0)
+        return
    batchsize = 5
    for batch in range(math.ceil(len(meta_paper_info_list)/batchsize)):
        if len(meta_paper_info_list[:batchsize]) > 0:
@@ -107,6 +173,7 @@ def 谷歌检索小助手(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
        "已经全部完成，您可以试试让AI写一个Related Works，例如您可以继续输入Write a \"Related Works\" section about \"你搜索的研究领域\" for me."])
    msg = '正常'
    yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
-    res = write_results_to_file(history)
-    chatbot.append(("完成了吗？", res)); 
+    path = write_history_to_file(history)
+    promote_file_to_downloadzone(path, chatbot=chatbot)
+    chatbot.append(("完成了吗？", path)); 
    yield from update_ui(chatbot=chatbot, history=history, msg=msg) # 刷新界面
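Note on `extract_arxiv_id` in the file above: it pulls the arXiv identifier out of an `arxiv.org/abs/...` URL and returns `None` when the URL does not match. The sketch below is a self-contained variant for illustration only, with the dot escaped so that it matches only a literal `.`. The module-level name `ARXIV_ABS_PATTERN` is an assumption and is not taken from the project.

```python
import re
from typing import Optional

# Assumed name; matches ".../arxiv.org/abs/<id>" and captures <id> up to the next slash.
ARXIV_ABS_PATTERN = re.compile(r"arxiv\.org/abs/([^/]+)")


def extract_arxiv_id(url: str) -> Optional[str]:
    """Return the arXiv ID embedded in an abstract-page URL, or None if absent."""
    match = ARXIV_ABS_PATTERN.search(url)
    return match.group(1) if match else None


print(extract_arxiv_id("https://arxiv.org/abs/1812.10695"))
print(extract_arxiv_id("https://example.com/paper.pdf"))
```

A version suffix such as `v2` stays part of the captured ID, and PDF links (`arxiv.org/pdf/...`) are not matched by this pattern.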
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -1,7 +1,7 @@
 #【请修改完参数后，删除此行】请在以下方案中选择一种，然后删除其他的方案，最后docker-compose up运行 | Please choose from one of these options below, delete other options as well as This Line

 ## ===================================================
-## 【方案一】 如果不需要运行本地模型（仅chatgpt,newbing类远程服务）
+## 【方案一】 如果不需要运行本地模型（仅 chatgpt, azure, 星火, 千帆, claude 等在线大模型服务）
 ## ===================================================
 version: '3'
 services:
@@ -13,7 +13,7 @@ services:
      USE_PROXY:                '    True                                                                                           '
      proxies:                  '    { "http": "socks5h://localhost:10880", "https": "socks5h://localhost:10880", }                 '
      LLM_MODEL:                '    gpt-3.5-turbo                                                                                  '
-      AVAIL_LLM_MODELS:         '    ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "newbing"]                    '
+      AVAIL_LLM_MODELS:         '    ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "sparkv2", "qianfan"]         '
      WEB_PORT:                 '    22303                                                                                          '
      ADD_WAIFU:                '    True                                                                                           '
      # THEME:                    '    Chuanhu-Small-and-Beautiful                                                                    '
--- a/docs/Dockerfile+ChatGLM
+++ b/docs/Dockerfile+ChatGLM
@@ -1,62 +1,2 @@
-# How to build | 如何构建: docker build -t gpt-academic --network=host  -f Dockerfile+ChatGLM .
-# How to run | (1) 我想直接一键运行（选择0号GPU）: docker run --rm -it --net=host --gpus \"device=0\" gpt-academic
-# How to run | (2) 我想运行之前进容器做一些调整（选择1号GPU）: docker run --rm -it --net=host --gpus \"device=1\" gpt-academic bash
- 
-# 从NVIDIA源，从而支持显卡运损（检查宿主的nvidia-smi中的cuda版本必须>=11.3）
-FROM nvidia/cuda:11.3.1-runtime-ubuntu20.04
-ARG useProxyNetwork=''
-RUN apt-get update
-RUN apt-get install -y curl proxychains curl 
-RUN apt-get install -y git python python3 python-dev python3-dev --fix-missing
+# 此Dockerfile不再维护，请前往docs/GithubAction+ChatGLM+Moss

-# 配置代理网络（构建Docker镜像时使用）
-# # comment out below if you do not need proxy network | 如果不需要翻墙 - 从此行向下删除
-RUN $useProxyNetwork curl cip.cc
-RUN sed -i '$ d' /etc/proxychains.conf
-RUN sed -i '$ d' /etc/proxychains.conf
-# 在这里填写主机的代理协议（用于从github拉取代码）
-RUN echo "socks5 127.0.0.1 10880" >> /etc/proxychains.conf
-ARG useProxyNetwork=proxychains
-# # comment out above if you do not need proxy network | 如果不需要翻墙 - 从此行向上删除
-
-
-# use python3 as the system default python
-RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.8
-# 下载pytorch
-RUN $useProxyNetwork python3 -m pip install torch --extra-index-url https://download.pytorch.org/whl/cu113
-# 下载分支
-WORKDIR /gpt
-RUN $useProxyNetwork git clone https://github.com/binary-husky/gpt_academic.git
-WORKDIR /gpt/gpt_academic
-RUN $useProxyNetwork python3 -m pip install -r requirements.txt
-RUN $useProxyNetwork python3 -m pip install -r request_llm/requirements_chatglm.txt
-RUN $useProxyNetwork python3 -m pip install -r request_llm/requirements_newbing.txt
-
-# 预热CHATGLM参数（非必要 可选步骤）
-RUN echo ' \n\
-from transformers import AutoModel, AutoTokenizer \n\
-chatglm_tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True) \n\
-chatglm_model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float() ' >> warm_up_chatglm.py
-RUN python3 -u warm_up_chatglm.py
-
-# 禁用缓存，确保更新代码
-ADD "https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h" skipcache
-RUN $useProxyNetwork git pull
-
-# 预热Tiktoken模块
-RUN python3  -c 'from check_proxy import warm_up_modules; warm_up_modules()'
-
-# 为chatgpt-academic配置代理和API-KEY （非必要 可选步骤）
-# 可同时填写多个API-KEY，支持openai的key和api2d的key共存，用英文逗号分割，例如API_KEY = "sk-openaikey1,fkxxxx-api2dkey2,........"
-# LLM_MODEL 是选择初始的模型
-# LOCAL_MODEL_DEVICE 是选择chatglm等本地模型运行的设备，可选 cpu 和 cuda
-# [说明: 以下内容与`config.py`一一对应，请查阅config.py来完成一下配置的填写]
-RUN echo ' \n\
-API_KEY = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,fkxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \n\
-USE_PROXY = True \n\
-LLM_MODEL = "chatglm" \n\
-LOCAL_MODEL_DEVICE = "cuda" \n\
-proxies = { "http": "socks5h://localhost:10880", "https": "socks5h://localhost:10880", } ' >> config_private.py
-
-# 启动
-CMD ["python3", "-u", "main.py"]
--- a/docs/Dockerfile+JittorLLM
+++ b/docs/Dockerfile+JittorLLM
@@ -1,59 +1 @@
-# How to build | 如何构建: docker build -t gpt-academic-jittor --network=host  -f Dockerfile+ChatGLM .
-# How to run | (1) 我想直接一键运行（选择0号GPU）: docker run --rm -it --net=host --gpus \"device=0\" gpt-academic-jittor bash
-# How to run | (2) 我想运行之前进容器做一些调整（选择1号GPU）: docker run --rm -it --net=host --gpus \"device=1\" gpt-academic-jittor bash
- 
-# 从NVIDIA源，从而支持显卡运损（检查宿主的nvidia-smi中的cuda版本必须>=11.3）
-FROM nvidia/cuda:11.3.1-runtime-ubuntu20.04
-ARG useProxyNetwork=''
-RUN apt-get update
-RUN apt-get install -y curl proxychains curl g++
-RUN apt-get install -y git python python3 python-dev python3-dev --fix-missing
-
-# 配置代理网络（构建Docker镜像时使用）
-# # comment out below if you do not need proxy network | 如果不需要翻墙 - 从此行向下删除
-RUN $useProxyNetwork curl cip.cc
-RUN sed -i '$ d' /etc/proxychains.conf
-RUN sed -i '$ d' /etc/proxychains.conf
-# 在这里填写主机的代理协议（用于从github拉取代码）
-RUN echo "socks5 127.0.0.1 10880" >> /etc/proxychains.conf
-ARG useProxyNetwork=proxychains
-# # comment out above if you do not need proxy network | 如果不需要翻墙 - 从此行向上删除
-
-
-# use python3 as the system default python
-RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.8
-# 下载pytorch
-RUN $useProxyNetwork python3 -m pip install torch --extra-index-url https://download.pytorch.org/whl/cu113
-# 下载分支
-WORKDIR /gpt
-RUN $useProxyNetwork git clone https://github.com/binary-husky/gpt_academic.git
-WORKDIR /gpt/gpt_academic
-RUN $useProxyNetwork python3 -m pip install -r requirements.txt
-RUN $useProxyNetwork python3 -m pip install -r request_llm/requirements_chatglm.txt
-RUN $useProxyNetwork python3 -m pip install -r request_llm/requirements_newbing.txt
-RUN $useProxyNetwork python3 -m pip install -r request_llm/requirements_jittorllms.txt -i https://pypi.jittor.org/simple -I
-
-# 下载JittorLLMs
-RUN $useProxyNetwork git clone https://github.com/binary-husky/JittorLLMs.git --depth 1 request_llm/jittorllms
-
-# 禁用缓存，确保更新代码
-ADD "https://www.random.org/cgi-bin/randbyte?nbytes=10&format=h" skipcache
-RUN $useProxyNetwork git pull
-
-# 预热Tiktoken模块
-RUN python3  -c 'from check_proxy import warm_up_modules; warm_up_modules()'
-
-# 为chatgpt-academic配置代理和API-KEY （非必要 可选步骤）
-# 可同时填写多个API-KEY，支持openai的key和api2d的key共存，用英文逗号分割，例如API_KEY = "sk-openaikey1,fkxxxx-api2dkey2,........"
-# LLM_MODEL 是选择初始的模型
-# LOCAL_MODEL_DEVICE 是选择chatglm等本地模型运行的设备，可选 cpu 和 cuda
-# [说明: 以下内容与`config.py`一一对应，请查阅config.py来完成一下配置的填写]
-RUN echo ' \n\
-API_KEY = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,fkxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \n\
-USE_PROXY = True \n\
-LLM_MODEL = "chatglm" \n\
-LOCAL_MODEL_DEVICE = "cuda" \n\
-proxies = { "http": "socks5h://localhost:10880", "https": "socks5h://localhost:10880", } ' >> config_private.py
-
-# 启动
-CMD ["python3", "-u", "main.py"]
+# 此Dockerfile不再维护，请前往docs/GithubAction+JittorLLMs
--- a/docs/Dockerfile+NoLocal+Latex
+++ b/docs/Dockerfile+NoLocal+Latex
@@ -1,27 +1 @@
-# 此Dockerfile适用于“无本地模型”的环境构建，如果需要使用chatglm等本地模型，请参考 docs/Dockerfile+ChatGLM
-# - 1 修改 `config.py`
-# - 2 构建 docker build -t gpt-academic-nolocal-latex -f docs/Dockerfile+NoLocal+Latex .
-# - 3 运行 docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex
-
-FROM fuqingxu/python311_texlive_ctex:latest
-
-# 指定路径
-WORKDIR /gpt
-
-ARG useProxyNetwork=''
-
-RUN $useProxyNetwork pip3 install gradio openai numpy arxiv rich -i https://pypi.douban.com/simple/
-RUN $useProxyNetwork pip3 install colorama Markdown pygments pymupdf -i https://pypi.douban.com/simple/
-
-# 装载项目文件
-COPY . .
-
-
-# 安装依赖
-RUN $useProxyNetwork pip3 install -r requirements.txt -i https://pypi.douban.com/simple/
-
-# 可选步骤，用于预热模块
-RUN python3  -c 'from check_proxy import warm_up_modules; warm_up_modules()'
-
-# 启动
-CMD ["python3", "-u", "main.py"]
+# 此Dockerfile不再维护，请前往docs/GithubAction+NoLocal+Latex
--- a/docs/GithubAction+AllCapacity
+++ b/docs/GithubAction+AllCapacity
@@ -0,0 +1,37 @@
+# docker build -t gpt-academic-all-capacity -f docs/GithubAction+AllCapacity  --network=host --build-arg http_proxy=http://localhost:10881 --build-arg https_proxy=http://localhost:10881 .
+
+# 从NVIDIA源，从而支持显卡（检查宿主的nvidia-smi中的cuda版本必须>=11.3）
+FROM fuqingxu/11.3.1-runtime-ubuntu20.04-with-texlive:latest
+
+# use python3 as the system default python
+WORKDIR /gpt
+RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.8
+# 下载pytorch
+RUN python3 -m pip install torch --extra-index-url https://download.pytorch.org/whl/cu113
+# 准备pip依赖
+RUN python3 -m pip install openai numpy arxiv rich
+RUN python3 -m pip install colorama Markdown pygments pymupdf
+RUN python3 -m pip install python-docx moviepy pdfminer
+RUN python3 -m pip install zh_langchain==0.2.1
+RUN python3 -m pip install nougat-ocr
+RUN python3 -m pip install rarfile py7zr
+RUN python3 -m pip install aliyun-python-sdk-core==2.13.3 pyOpenSSL scipy git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git
+# 下载分支
+WORKDIR /gpt
+RUN git clone --depth=1 https://github.com/binary-husky/gpt_academic.git
+WORKDIR /gpt/gpt_academic
+RUN git clone https://github.com/OpenLMLab/MOSS.git request_llm/moss
+
+RUN python3 -m pip install -r requirements.txt
+RUN python3 -m pip install -r request_llm/requirements_moss.txt
+RUN python3 -m pip install -r request_llm/requirements_qwen.txt
+RUN python3 -m pip install -r request_llm/requirements_chatglm.txt
+RUN python3 -m pip install -r request_llm/requirements_newbing.txt
+
+
+
+# 预热Tiktoken模块
+RUN python3  -c 'from check_proxy import warm_up_modules; warm_up_modules()'
+
+# 启动
+CMD ["python3", "-u", "main.py"]
--- a/docs/GithubAction+ChatGLM+Moss
+++ b/docs/GithubAction+ChatGLM+Moss
@@ -1,7 +1,6 @@

 # 从NVIDIA源，从而支持显卡运损（检查宿主的nvidia-smi中的cuda版本必须>=11.3）
 FROM nvidia/cuda:11.3.1-runtime-ubuntu20.04
-ARG useProxyNetwork=''
 RUN apt-get update
 RUN apt-get install -y curl proxychains curl gcc
 RUN apt-get install -y git python python3 python-dev python3-dev --fix-missing
--- a/docs/GithubAction+NoLocal+Latex
+++ b/docs/GithubAction+NoLocal+Latex
@@ -1,6 +1,6 @@
 # 此Dockerfile适用于“无本地模型”的环境构建，如果需要使用chatglm等本地模型，请参考 docs/Dockerfile+ChatGLM
 # - 1 修改 `config.py`
-# - 2 构建 docker build -t gpt-academic-nolocal-latex -f docs/Dockerfile+NoLocal+Latex .
+# - 2 构建 docker build -t gpt-academic-nolocal-latex -f docs/GithubAction+NoLocal+Latex .
 # - 3 运行 docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex

 FROM fuqingxu/python311_texlive_ctex:latest
@@ -10,6 +10,10 @@ WORKDIR /gpt

 RUN pip3 install gradio openai numpy arxiv rich
 RUN pip3 install colorama Markdown pygments pymupdf
+RUN pip3 install python-docx moviepy pdfminer 
+RUN pip3 install zh_langchain==0.2.1
+RUN pip3 install nougat-ocr
+RUN pip3 install aliyun-python-sdk-core==2.13.3 pyOpenSSL scipy git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git

 # 装载项目文件
 COPY . .
--- a/docs/translate_english.json
+++ b/docs/translate_english.json
@@ -2161,5 +2161,336 @@
    "在运行过程中动态地修改配置": "Dynamically modify configurations during runtime",
    "请先把模型切换至gpt-*或者api2d-*": "Please switch the model to gpt-* or api2d-* first",
    "获取简单聊天的句柄": "Get handle of simple chat",
-    "获取插件的默认参数": "Get default parameters of plugin"
+    "获取插件的默认参数": "Get default parameters of plugin",
+    "GROBID服务不可用": "GROBID service is unavailable",
+    "请问": "May I ask",
+    "如果等待时间过长": "If the waiting time is too long",
+    "编程": "programming",
+    "5. 现在": "5. Now",
+    "您不必读这个else分支": "You don't have to read this else branch",
+    "用插件实现": "Implement with plugins",
+    "插件分类默认选项": "Default options for plugin classification",
+    "填写多个可以均衡负载": "Filling in multiple can balance the load",
+    "色彩主题": "Color theme",
+    "可能附带额外依赖 -=-=-=-=-=-=-": "May come with additional dependencies -=-=-=-=-=-=-",
+    "讯飞星火认知大模型": "Xunfei Xinghuo cognitive model",
+    "ParsingLuaProject的所有源文件 | 输入参数为路径": "All source files of ParsingLuaProject | Input parameter is path",
+    "复制以下空间https": "Copy the following space https",
+    "如果意图明确": "If the intention is clear",
+    "如系统是Linux": "If the system is Linux",
+    "├── 语音功能": "├── Voice function",
+    "见Github wiki": "See Github wiki",
+    "⭐ ⭐ ⭐ 立即应用配置": "⭐ ⭐ ⭐ Apply configuration immediately",
+    "现在您只需要再次重复一次您的指令即可": "Now you just need to repeat your command again",
+    "没辙了": "No way",
+    "解析Jupyter Notebook文件 | 输入参数为路径": "Parse Jupyter Notebook file | Input parameter is path",
+    "⭐ ⭐ ⭐ 确认插件参数": "⭐ ⭐ ⭐ Confirm plugin parameters",
+    "找不到合适插件执行该任务": "Cannot find a suitable plugin to perform this task",
+    "接驳VoidTerminal": "Connect to VoidTerminal",
+    "**很好": "**Very good",
+    "对话|编程": "Conversation|Programming",
+    "对话|编程|学术": "Conversation|Programming|Academic",
+    "4. 建议使用 GPT3.5 或更强的模型": "4. It is recommended to use GPT3.5 or a stronger model",
+    "「请调用插件翻译PDF论文": "Please call the plugin to translate the PDF paper",
+    "3. 如果您使用「调用插件xxx」、「修改配置xxx」、「请问」等关键词": "3. If you use keywords such as 'call plugin xxx', 'modify configuration xxx', 'please', etc.",
+    "以下是一篇学术论文的基本信息": "The following is the basic information of an academic paper",
+    "GROBID服务器地址": "GROBID server address",
+    "修改配置": "Modify configuration",
+    "理解PDF文档的内容并进行回答 | 输入参数为路径": "Understand the content of the PDF document and answer | Input parameter is path",
+    "对于需要高级参数的插件": "For plugins that require advanced parameters",
+    "🏃‍♂️🏃‍♂️🏃‍♂️ 主进程执行": "Main process execution 🏃‍♂️🏃‍♂️🏃‍♂️",
+    "没有填写 HUGGINGFACE_ACCESS_TOKEN": "HUGGINGFACE_ACCESS_TOKEN not filled in",
+    "调度插件": "Scheduling plugin",
+    "语言模型": "Language model",
+    "├── ADD_WAIFU 加一个live2d装饰": "├── ADD_WAIFU Add a live2d decoration",
+    "初始化": "Initialization",
+    "选择了不存在的插件": "Selected a non-existent plugin",
+    "修改本项目的配置": "Modify the configuration of this project",
+    "如果输入的文件路径是正确的": "If the input file path is correct",
+    "2. 您可以打开插件下拉菜单以了解本项目的各种能力": "2. You can open the plugin dropdown menu to learn about various capabilities of this project",
+    "VoidTerminal插件说明": "VoidTerminal plugin description",
+    "无法理解您的需求": "Unable to understand your requirements",
+    "默认 AdvancedArgs = False": "Default AdvancedArgs = False",
+    "「请问Transformer网络的结构是怎样的": "What is the structure of the Transformer network?",
+    "比如1812.10695": "For example, 1812.10695",
+    "翻译README或MD": "Translate README or MD",
+    "读取新配置中": "Reading new configuration",
+    "假如偏离了您的要求": "If it deviates from your requirements",
+    "├── THEME 色彩主题": "├── THEME color theme",
+    "如果还找不到": "If still not found",
+    "问": "Ask",
+    "请检查系统字体": "Please check system fonts",
+    "如果错误": "If there is an error",
+    "作为替代": "As an alternative",
+    "ParseJavaProject的所有源文件 | 输入参数为路径": "All source files of ParseJavaProject | Input parameter is path",
+    "比对相同参数时生成的url与自己代码生成的url是否一致": "Check if the generated URL matches the one generated by your code when comparing the same parameters",
+    "清除本地缓存数据": "Clear local cache data",
+    "使用谷歌学术检索助手搜索指定URL的结果 | 输入参数为谷歌学术搜索页的URL": "Use Google Scholar search assistant to search for results of a specific URL | Input parameter is the URL of Google Scholar search page",
+    "运行方法": "Running method",
+    "您已经上传了文件**": "You have uploaded the file **",
+    "「给爷翻译Arxiv论文": "Translate Arxiv papers for me",
+    "请修改config中的GROBID_URL": "Please modify GROBID_URL in the config",
+    "处理特殊情况": "Handling special cases",
+    "不要自己瞎搞！」": "Don't mess around by yourself!",
+    "LoadConversationHistoryArchive | 输入参数为路径": "LoadConversationHistoryArchive | Input parameter is a path",
+    "| 输入参数是一个问题": "| Input parameter is a question",
+    "├── CHATBOT_HEIGHT 对话窗的高度": "├── CHATBOT_HEIGHT Height of the chat window",
+    "对C": "To C",
+    "默认关闭": "Disabled by default",
+    "当前进度": "Current progress",
+    "HUGGINGFACE的TOKEN": "HUGGINGFACE's TOKEN",
+    "查找可用插件中": "Searching for available plugins",
+    "下载LLAMA时起作用 https": "Works when downloading LLAMA https",
+    "使用 AK": "Using AK",
+    "正在执行任务": "Executing task",
+    "保存当前的对话 | 不需要输入参数": "Save current conversation | No input parameters required",
+    "对话": "Conversation",
+    "图中鲜花怒放": "Flowers blooming in the picture",
+    "批量将Markdown文件中文翻译为英文 | 输入参数为路径或上传压缩包": "Batch translate Chinese to English in Markdown files | Input parameter is a path or upload a compressed package",
+    "ParsingCSharpProject的所有源文件 | 输入参数为路径": "All source files of ParsingCSharpProject | Input parameter is a path",
+    "为我翻译PDF论文": "Translate PDF papers for me",
+    "聊天对话": "Chat conversation",
+    "拼接鉴权参数": "Concatenate authentication parameters",
+    "请检查config中的GROBID_URL": "Please check the GROBID_URL in the config",
+    "拼接字符串": "Concatenate strings",
+    "您的意图可以被识别的更准确": "Your intent can be recognized more accurately",
+    "该模型有七个 bin 文件": "The model has seven bin files",
+    "但思路相同": "But the idea is the same",
+    "你需要翻译": "You need to translate",
+    "或者描述文件所在的路径": "Or the path of the description file",
+    "请您上传文件": "Please upload the file",
+    "不常用": "Not commonly used",
+    "尚未充分测试的实验性插件 & 需要额外依赖的插件 -=--=-": "Experimental plugins that have not been fully tested & plugins that require additional dependencies -=--=-",
+    "⭐ ⭐ ⭐ 选择插件": "⭐ ⭐ ⭐ Select plugin",
+    "当前配置不允许被修改！如需激活本功能": "The current configuration does not allow modification! To activate this feature",
+    "正在连接GROBID服务": "Connecting to GROBID service",
+    "用户图形界面布局依赖关系示意图": "Diagram of user interface layout dependencies",
+    "是否允许通过自然语言描述修改本页的配置": "Allow modifying the configuration of this page through natural language description",
+    "self.chatbot被序列化": "self.chatbot is serialized",
+    "本地Latex论文精细翻译 | 输入参数是路径": "Locally translate Latex papers with fine-grained translation | Input parameter is the path",
+    "抱歉": "Sorry",
+    "以下这部分是最早加入的最稳定的模型 -=-=-=-=-=-=-": "The following section is the earliest and most stable model added",
+    "「用插件翻译README": "Translate README with plugins",
+    "如果不正确": "If incorrect",
+    "⭐ ⭐ ⭐ 读取可配置项目条目": "⭐ ⭐ ⭐ Read configurable project entries",
+    "开始语言对话 | 没有输入参数": "Start language conversation | No input parameters",
+    "谨慎操作 | 不需要输入参数": "Handle with caution | No input parameters required",
+    "对英文Latex项目全文进行纠错处理 | 输入参数为路径或上传压缩包": "Correct the entire English Latex project | Input parameter is the path or upload compressed package",
+    "如果需要处理文件": "If file processing is required",
+    "提供图像的内容": "Provide the content of the image",
+    "查看历史上的今天事件 | 不需要输入参数": "View historical events of today | No input parameters required",
+    "这个稍微啰嗦一点": "This is a bit verbose",
+    "多线程解析并翻译此项目的源码 | 不需要输入参数": "Parse and translate the source code of this project in multi-threading | No input parameters required",
+    "此处打印出建立连接时候的url": "Print the URL when establishing the connection here",
+    "精准翻译PDF论文为中文 | 输入参数为路径": "Translate PDF papers accurately into Chinese | Input parameter is the path",
+    "检测到操作错误！当您上传文档之后": "Operation error detected! After you upload the document",
+    "在线大模型配置关联关系示意图": "Online large model configuration relationship diagram",
+    "你的填写的空间名如grobid": "The space name you filled in, e.g. grobid",
+    "获取方法": "Get method",
+    "| 输入参数为路径": "| Input parameter is the path",
+    "⭐ ⭐ ⭐ 执行插件": "⭐ ⭐ ⭐ Execute plugin",
+    "├── ALLOW_RESET_CONFIG 是否允许通过自然语言描述修改本页的配置": "├── ALLOW_RESET_CONFIG Whether to allow modifying the configuration of this page through natural language description",
+    "重新页面即可生效": "Refresh the page to take effect",
+    "设为public": "Set as public",
+    "并在此处指定模型路径": "And specify the model path here",
+    "分析用户意图中": "Analyzing user intent",
+    "刷新下拉列表": "Refresh the drop-down list",
+    "失败 当前语言模型": "Failed current language model",
+    "1. 请用**自然语言**描述您需要做什么": "1. Please describe what you need to do in **natural language**",
+    "对Latex项目全文进行中译英处理 | 输入参数为路径或上传压缩包": "Translate the full text of Latex projects from Chinese to English | Input parameter is the path or upload a compressed package",
+    "没有配置BAIDU_CLOUD_API_KEY": "No configuration for BAIDU_CLOUD_API_KEY",
+    "设置默认值": "Set default value",
+    "如果太多了会导致gpt无法理解": "If there are too many, it will cause GPT to be unable to understand",
+    "绿草如茵": "Green grass",
+    "├── LAYOUT 窗口布局": "├── LAYOUT window layout",
+    "用户意图理解": "User intent understanding",
+    "生成RFC1123格式的时间戳": "Generate RFC1123 formatted timestamp",
+    "欢迎您前往Github反馈问题": "You are welcome to report issues on Github",
+    "排除已经是按钮的插件": "Exclude plugins that are already buttons",
+    "亦在下拉菜单中显示": "Also displayed in the dropdown menu",
+    "导致无法反序列化": "Causing deserialization failure",
+    "意图=": "Intent =",
+    "章节": "Chapter",
+    "调用插件": "Invoke plugin",
+    "ParseRustProject的所有源文件 | 输入参数为路径": "All source files of ParseRustProject | Input parameter is path",
+    "需要点击“函数插件区”按钮进行处理": "Need to click the 'Function Plugin Area' button for processing",
+    "默认 AsButton = True": "Default AsButton = True",
+    "收到websocket错误的处理": "Handling websocket errors",
+    "用插件": "Use Plugin",
+    "没有选择任何插件组": "No plugin group selected",
+    "答": "Answer",
+    "可修改成本地GROBID服务": "Can modify to local GROBID service",
+    "用户意图": "User intent",
+    "对英文Latex项目全文进行润色处理 | 输入参数为路径或上传压缩包": "Polish the full text of English Latex projects | Input parameters are paths or uploaded compressed packages",
+    "「我不喜欢当前的界面颜色": "I don't like the current interface color",
+    "「请调用插件": "Please call the plugin",
+    "VoidTerminal状态": "VoidTerminal status",
+    "新配置": "New configuration",
+    "支持Github链接": "Support Github links",
+    "没有配置BAIDU_CLOUD_SECRET_KEY": "No BAIDU_CLOUD_SECRET_KEY configured",
+    "获取当前VoidTerminal状态": "Get the current VoidTerminal status",
+    "刷新按钮": "Refresh button",
+    "为了防止pickle.dumps": "To prevent pickle.dumps",
+    "放弃治疗": "Give up",
+    "可指定不同的生成长度、top_p等相关超参": "Can specify different generation lengths, top_p and other related hyperparameters",
+    "请将题目和摘要翻译为": "Please translate the title and abstract into",
+    "通过appid和用户的提问来生成请参数": "Generate request parameters through appid and user's question",
+    "ImageGeneration | 输入参数字符串": "ImageGeneration | Input parameter string",
+    "将文件拖动到文件上传区": "Drag and drop the file to the file upload area",
+    "如果意图模糊": "If the intent is ambiguous",
+    "星火认知大模型": "Spark Cognitive Big Model",
+    "执行中. 删除 gpt_log & private_upload": "Executing. Delete gpt_log & private_upload",
+    "默认 Color = secondary": "Default Color = secondary",
+    "此处也不需要修改": "No modification is needed here",
+    "⭐ ⭐ ⭐ 分析用户意图": "⭐ ⭐ ⭐ Analyze user intent",
+    "再试一次": "Try again",
+    "请写bash命令实现以下功能": "Please write a bash command to implement the following function",
+    "批量SummarizingWordDocuments | 输入参数为路径": "Batch SummarizingWordDocuments | Input parameter is the path",
+    "/Users/fuqingxu/Desktop/旧文件/gpt/chatgpt_academic/crazy_functions/latex_fns中的python文件进行解析": "Parse the python file in /Users/fuqingxu/Desktop/旧文件/gpt/chatgpt_academic/crazy_functions/latex_fns",
+    "当我要求你写bash命令时": "When I ask you to write a bash command",
+    "├── AUTO_CLEAR_TXT 是否在提交时自动清空输入框": "├── AUTO_CLEAR_TXT Whether to automatically clear the input box when submitting",
+    "按停止键终止": "Press the stop key to terminate",
+    "文心一言": "ERNIE Bot",
+    "不能理解您的意图": "Cannot understand your intention",
+    "用简单的关键词检测用户意图": "Detect user intention with simple keywords",
+    "中文": "Chinese",
+    "解析一个C++项目的所有源文件": "Parse all source files of a C++ project",
+    "请求的Prompt为": "Requested prompt is",
+    "参考本demo的时候可取消上方打印的注释": "When referring to this demo, you can uncomment the print statement above",
+    "开始接收回复": "Start receiving replies",
+    "接入讯飞星火大模型 https": "Access to Xunfei Xinghuo large model https",
+    "用该压缩包进行反馈": "Use this compressed package for feedback",
+    "翻译Markdown或README": "Translate Markdown or README",
+    "SK 生成鉴权签名": "SK generates authentication signature",
+    "插件参数": "Plugin parameters",
+    "需要访问中文Bing": "Need to access Chinese Bing",
+    "ParseFrontendProject的所有源文件": "Parse all source files of ParseFrontendProject",
+    "现在将执行效果稍差的旧版代码": "Now execute the older version code with slightly worse performance",
+    "您需要明确说明并在指令中提到它": "You need to specify and mention it in the command",
+    "请在config.py中设置ALLOW_RESET_CONFIG=True后重启软件": "Please set ALLOW_RESET_CONFIG=True in config.py and restart the software",
+    "按照自然语言描述生成一个动画 | 输入参数是一段话": "Generate an animation based on natural language description | Input parameter is a sentence",
+    "你的hf用户名如qingxu98": "Your hf username, e.g. qingxu98",
+    "Arixv论文精细翻译 | 输入参数arxiv论文的ID": "Fine translation of Arxiv paper | Input parameter is the ID of the arxiv paper",
+    "无法获取 abstract": "Unable to retrieve abstract",
+    "尽可能地仅用一行命令解决我的要求": "Try to solve my request using only one command",
+    "提取插件参数": "Extract plugin parameters",
+    "配置修改完成": "Configuration modification completed",
+    "正在修改配置中": "Modifying configuration",
+    "ParsePythonProject的所有源文件": "All source files of ParsePythonProject",
+    "请求错误": "Request error",
+    "精准翻译PDF论文": "Accurate translation of PDF paper",
+    "无法获取 authors": "Unable to retrieve authors",
+    "该插件诞生时间不长": "This plugin has not been around for long",
+    "返回项目根路径": "Return project root path",
+    "BatchSummarizePDFDocuments的内容 | 输入参数为路径": "Content of BatchSummarizePDFDocuments | Input parameter is a path",
+    "百度千帆": "Baidu Qianfan",
+    "解析一个C++项目的所有头文件": "Parse all header files of a C++ project",
+    "现在请您描述您的需求": "Now please describe your requirements",
+    "该功能具有一定的危险性": "This feature has a certain level of danger",
+    "收到websocket关闭的处理": "Processing when receiving websocket closure",
+    "读取Tex论文并写摘要 | 输入参数为路径": "Read Tex paper and write abstract | Input parameter is the path",
+    "地址为https": "The address is https",
+    "限制最多前10个配置项": "Limit up to 10 configuration items",
+    "6. 如果不需要上传文件": "6. If file upload is not needed",
+    "默认 Group = 对话": "Default Group = Conversation",
+    "五秒后即将重启！若出现报错请无视即可": "Restarting in five seconds! Please ignore if there is an error",
+    "收到websocket连接建立的处理": "Processing when receiving websocket connection establishment",
+    "批量生成函数的注释 | 输入参数为路径": "Batch generate function comments | Input parameter is the path",
+    "聊天": "Chat",
+    "但您可以尝试再试一次": "But you can try again",
+    "千帆大模型平台": "Qianfan Big Model Platform",
+    "直接运行 python tests/test_plugins.py": "Run python tests/test_plugins.py directly",
+    "或是None": "Or None",
+    "进行hmac-sha256进行加密": "Perform encryption using hmac-sha256",
+    "批量总结音频或视频 | 输入参数为路径": "Batch summarize audio or video | Input parameter is path",
+    "插件在线服务配置依赖关系示意图": "Plugin online service configuration dependency diagram",
+    "开始初始化模型": "Start initializing model",
+    "弱模型可能无法理解您的想法": "Weak model may not understand your ideas",
+    "解除大小写限制": "Remove case sensitivity restriction",
+    "跳过提示环节": "Skip prompt section",
+    "接入一些逆向工程https": "Access some reverse engineering https",
+    "执行完成": "Execution completed",
+    "如果需要配置": "If configuration is needed",
+    "此处不修改；如果使用本地或无地域限制的大模型时": "Do not modify here; if using local or region-unrestricted large models",
+    "你是一个Linux大师级用户": "You are a Linux master-level user",
+    "arxiv论文的ID是1812.10695": "The ID of the arxiv paper is 1812.10695",
+    "而不是点击“提交”按钮": "Instead of clicking the 'Submit' button",
+    "解析一个Go项目的所有源文件 | 输入参数为路径": "Parse all source files of a Go project | Input parameter is path",
+    "对中文Latex项目全文进行润色处理 | 输入参数为路径或上传压缩包": "Polish the entire text of a Chinese Latex project | Input parameter is path or upload compressed package",
+    "「生成一张图片": "Generate an image",
+    "将Markdown或README翻译为中文 | 输入参数为路径或URL": "Translate Markdown or README to Chinese | Input parameters are path or URL",
+    "训练时间": "Training time",
+    "将请求的鉴权参数组合为字典": "Combine the requested authentication parameters into a dictionary",
+    "对Latex项目全文进行英译中处理 | 输入参数为路径或上传压缩包": "Translate the entire text of Latex project from English to Chinese | Input parameters are path or uploaded compressed package",
+    "内容如下": "The content is as follows",
+    "用于高质量地读取PDF文档": "Used for high-quality reading of PDF documents",
+    "上下文太长导致 token 溢出": "The context is too long, causing token overflow",
+    "├── DARK_MODE 暗色模式 / 亮色模式": "├── DARK_MODE Dark mode / Light mode",
+    "语言模型回复为": "The language model replies as",
+    "from crazy_functions.chatglm微调工具 import 微调数据集生成": "from crazy_functions.chatglm fine-tuning tool import fine-tuning dataset generation",
+    "为您选择了插件": "Selected plugin for you",
+    "无法获取 title": "Unable to get title",
+    "收到websocket消息的处理": "Processing of received websocket messages",
+    "2023年": "2023",
+    "清除所有缓存文件": "Clear all cache files",
+    "├── PDF文档精准解析": "├── Accurate parsing of PDF documents",
+    "论文我刚刚放到上传区了": "I just put the paper in the upload area",
+    "生成url": "Generate URL",
+    "以下部分是新加入的模型": "The following section is the newly added model",
+    "学术": "Academic",
+    "├── DEFAULT_FN_GROUPS 插件分类默认选项": "├── DEFAULT_FN_GROUPS Plugin classification default options",
+    "不推荐使用": "Not recommended for use",
+    "正在同时咨询": "Consulting simultaneously",
+    "将Markdown翻译为中文 | 输入参数为路径或URL": "Translate Markdown to Chinese | Input parameters are path or URL",
+    "Github网址是https": "The Github URL is https",
+    "试着加上.tex后缀试试": "Try adding the .tex suffix",
+    "对项目中的各个插件进行测试": "Test each plugin in the project",
+    "插件说明": "Plugin description",
+    "├── CODE_HIGHLIGHT 代码高亮": "├── CODE_HIGHLIGHT Code highlighting",
+    "记得用插件": "Remember to use the plugin",
+    "谨慎操作": "Handle with caution",
+    "请检查PDF是否损坏": "#",
+    "执行成功了": "#",
+    "请在输入框内填写需求": "#",
+    "结果": "#",
+    "开始干正事": "#",
+    "次代码生成尝试": "#",
+    "代码生成结束": "#",
+    "Nougat解析论文失败": "#",
+    "受到google限制": "#",
+    "收尾": "#",
+    "结果是一个有效文件": "#",
+    "然后再次点击该插件": "#",
+    "用插件实现」": "#",
+    "文件路径": "#",
+    "仅供测试": "#",
+    "将csv文件转excel表格": "#",
+    "开始执行": "#",
+    "测试": "#",
+    "睡一会防止触发google反爬虫": "#",
+    "某段话的整个句子": "#",
+    "使用tex格式公式 测试2 给出柯西不等式": "#",
+    "找不到本地项目或无法处理": "#",
+    "交换图像的蓝色通道和红色通道": "#",
+    "第三步": "#",
+    "返回给定的url解析出的arxiv_id": "#",
+    "裁剪图像": "#",
+    "已经被记忆": "#",
+    "无法从bing获取信息！": "#",
+    "可能触发了google反爬虫机制": "#",
+    "检索文章的历史版本的题目": "#",
+    "请配置讯飞星火大模型的XFYUN_APPID": "#",
+    "执行失败了": "#",
+    "需要花费较长时间下载NOUGAT参数": "#",
+    "请检查": "#",
+    "写入": "#",
+    "下个句子中已经说完的部分": "#",
+    "精准翻译PDF文档": "#",
+    "解析python源代码项目": "#",
+    "首先在arxiv上搜索": "#",
+    "错误追踪": "#",
+    "结果是一个字符串": "#",
+    "由 test_on_sentence_end": "#",
+    "获取文章摘要": "#",
+    "受到bing限制": "#"
 }
--- a/docs/translate_std.json
+++ b/docs/translate_std.json
@@ -83,5 +83,12 @@
    "图片生成": "ImageGeneration",
    "动画生成": "AnimationGeneration",
    "语音助手": "VoiceAssistant",
-    "启动微调": "StartFineTuning"
+    "启动微调": "StartFineTuning",
+    "清除缓存": "ClearCache",
+    "辅助功能": "Accessibility",
+    "虚空终端": "VoidTerminal",
+    "解析PDF_基于GROBID": "ParsePDF_BasedOnGROBID",
+    "虚空终端主路由": "VoidTerminalMainRoute",
+    "批量翻译PDF文档_NOUGAT": "BatchTranslatePDFDocuments_NOUGAT",
+    "解析PDF_基于NOUGAT": "ParsePDF_NOUGAT"
 }
--- a/multi_language.py
+++ b/multi_language.py
@@ -478,6 +478,8 @@ def step_2_core_key_translate():
    up = trans_json(need_translate, language=LANG, special=False)
    map_to_json(up, language=LANG)
    cached_translation = read_map_from_json(language=LANG)
+    LANG_STD = 'std'
+    cached_translation.update(read_map_from_json(language=LANG_STD))
    cached_translation = dict(sorted(cached_translation.items(), key=lambda x: -len(x[0])))

    # ===============================================
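The two added lines above merge the `std` map into the cached translation before the keys are sorted by descending length. A minimal sketch of why that order matters, assuming the map is later applied by plain substring replacement: `merge_and_sort` and `apply_map` are hypothetical helpers written for illustration, not functions from multi_language.py, and the `cached` entry is made up. The two `std` entries are taken from docs/translate_std.json.

```python
def merge_and_sort(cached, std):
    """Overlay std entries on the cached map, then put the longest keys first."""
    merged = dict(cached)
    merged.update(std)  # std entries win on conflicting keys
    return dict(sorted(merged.items(), key=lambda x: -len(x[0])))


def apply_map(text, mapping):
    """Hypothetical consumer: replace each key in insertion order."""
    for src, dst in mapping.items():
        text = text.replace(src, dst)
    return text


# Illustrative cached entry that disagrees with the standard map.
cached = {"虚空终端": "Void Terminal"}
std = {"虚空终端": "VoidTerminal", "虚空终端主路由": "VoidTerminalMainRoute"}

mapping = merge_and_sort(cached, std)
result = apply_map("from crazy_functions.虚空终端 import 虚空终端主路由", mapping)
print(result)
```

Because the longer key is replaced first, the shorter key `虚空终端` cannot clobber the prefix of `虚空终端主路由`.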
--- a/request_llm/bridge_spark.py
+++ b/request_llm/bridge_spark.py
@@ -2,11 +2,17 @@
 import time
 import threading
 import importlib
-from toolbox import update_ui, get_conf
+from toolbox import update_ui, get_conf, update_ui_lastest_msg
 from multiprocessing import Process, Pipe

 model_name = '星火认知大模型'

+def validate_key():
+    XFYUN_APPID,  = get_conf('XFYUN_APPID', )
+    if XFYUN_APPID == '00000000' or XFYUN_APPID == '': 
+        return False
+    return True
+
 def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="", observe_window=[], console_slience=False):
    """
        ⭐多线程方法
@@ -15,6 +21,9 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
    watch_dog_patience = 5
    response = ""

+    if validate_key() is False:
+        raise RuntimeError('请配置讯飞星火大模型的XFYUN_APPID, XFYUN_API_KEY, XFYUN_API_SECRET')
+
    from .com_sparkapi import SparkRequestInstance
    sri = SparkRequestInstance()
    for response in sri.generate(inputs, llm_kwargs, history, sys_prompt):
@@ -32,6 +41,10 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
    chatbot.append((inputs, ""))
    yield from update_ui(chatbot=chatbot, history=history)

+    if validate_key() is False:
+        yield from update_ui_lastest_msg(lastmsg="[Local Message]: 请配置讯飞星火大模型的XFYUN_APPID, XFYUN_API_KEY, XFYUN_API_SECRET", chatbot=chatbot, history=history, delay=0)
+        return
+
    if additional_fn is not None:
        from core_functional import handle_core_functionality
        inputs, history = handle_core_functionality(additional_fn, inputs, history, chatbot)
--- a/request_llm/com_sparkapi.py
+++ b/request_llm/com_sparkapi.py
@@ -58,7 +58,7 @@ class Ws_Param(object):
 class SparkRequestInstance():
    def __init__(self):
        XFYUN_APPID, XFYUN_API_SECRET, XFYUN_API_KEY = get_conf('XFYUN_APPID', 'XFYUN_API_SECRET', 'XFYUN_API_KEY')
-
+        if XFYUN_APPID == '00000000' or XFYUN_APPID == '': raise RuntimeError('请配置讯飞星火大模型的XFYUN_APPID, XFYUN_API_KEY, XFYUN_API_SECRET')
        self.appid = XFYUN_APPID
        self.api_secret = XFYUN_API_SECRET
        self.api_key = XFYUN_API_KEY
--- a/requirements.txt
+++ b/requirements.txt
@@ -20,4 +20,4 @@ arxiv
 rich
 pypdf2==2.12.1
 websocket-client
-scipdf_parser==0.3
+scipdf_parser>=0.3
--- a/tests/test_plugins.py
+++ b/tests/test_plugins.py
@@ -10,8 +10,9 @@ from tests.test_utils import plugin_test

 if __name__ == "__main__":
    # plugin_test(plugin='crazy_functions.虚空终端->虚空终端', main_input='修改api-key为sk-jhoejriotherjep')
+    plugin_test(plugin='crazy_functions.批量翻译PDF文档_NOUGAT->批量翻译PDF文档', main_input='crazy_functions/test_project/pdf_and_word/aaai.pdf')

-    plugin_test(plugin='crazy_functions.虚空终端->虚空终端', main_input='调用插件，对C:/Users/fuqingxu/Desktop/旧文件/gpt/chatgpt_academic/crazy_functions/latex_fns中的python文件进行解析')
+    # plugin_test(plugin='crazy_functions.虚空终端->虚空终端', main_input='调用插件，对C:/Users/fuqingxu/Desktop/旧文件/gpt/chatgpt_academic/crazy_functions/latex_fns中的python文件进行解析')

    # plugin_test(plugin='crazy_functions.命令行助手->命令行助手', main_input='查看当前的docker容器列表')

--- a/toolbox.py
+++ b/toolbox.py
@@ -281,8 +281,7 @@ def report_execption(chatbot, history, a, b):
    向chatbot中添加错误信息
    """
    chatbot.append((a, b))
-    history.append(a)
-    history.append(b)
+    history.extend([a, b])


 def text_divide_paragraph(text):
@@ -305,6 +304,7 @@ def text_divide_paragraph(text):
        text = "</br>".join(lines)
        return pre + text + suf

+
@lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
 def markdown_convertion(txt):
    """
@@ -359,19 +359,41 @@ def markdown_convertion(txt):
        content = content.replace('</script>\n</script>', '</script>')
        return content

-    def no_code(txt):
-        if '```' not in txt: 
-            return True
-        else:
-            if '```reference' in txt: return True    # newbing
-            else: return False
+    def is_equation(txt):
+        """
+        判定是否为公式 | 测试1 写出洛伦兹定律，使用tex格式公式 测试2 给出柯西不等式，使用latex格式 测试3 写出麦克斯韦方程组
+        """
+        if '```' in txt and '```reference' not in txt: return False
+        if '$' not in txt and '\\[' not in txt: return False
+        mathpatterns = {
+            r'(?<!\\|\$)(\$)([^\$]+)(\$)': {'allow_multi_lines': False},                            #  $...$
+            r'(?<!\\)(\$\$)([^\$]+)(\$\$)': {'allow_multi_lines': True},                            # $$...$$
+            r'(?<!\\)(\\\[)(.+?)(\\\])': {'allow_multi_lines': False},                              # \[...\]
+            # r'(?<!\\)(\\\()(.+?)(\\\))': {'allow_multi_lines': False},                            # \(...\)
+            # r'(?<!\\)(\\begin{([a-z]+?\*?)})(.+?)(\\end{\2})': {'allow_multi_lines': True},       # \begin...\end
+            # r'(?<!\\)(\$`)([^`]+)(`\$)': {'allow_multi_lines': False},                            # $`...`$
+        }
+        matches = []
+        for pattern, property in mathpatterns.items():
+            flags = re.ASCII|re.DOTALL if property['allow_multi_lines'] else re.ASCII
+            matches.extend(re.findall(pattern, txt, flags))
+        if len(matches) == 0: return False
+        contain_any_eq = False
+        illegal_pattern = re.compile(r'[^\x00-\x7F]|echo')
+        for match in matches:
+            if len(match) != 3: return False
+            eq_canidate = match[1]
+            if illegal_pattern.search(eq_canidate): 
+                return False
+            else: 
+                contain_any_eq = True
+        return contain_any_eq

-    if ('$' in txt) and no_code(txt):  # 有$标识的公式符号，且没有代码段```的标识
+    if is_equation(txt):  # 有$标识的公式符号，且没有代码段```的标识
        # convert everything to html format
        split = markdown.markdown(text='---')
-        convert_stage_1 = markdown.markdown(text=txt, extensions=['mdx_math', 'fenced_code', 'tables', 'sane_lists'], extension_configs=markdown_extension_configs)
+        convert_stage_1 = markdown.markdown(text=txt, extensions=['sane_lists', 'tables', 'mdx_math', 'fenced_code'], extension_configs=markdown_extension_configs)
        convert_stage_1 = markdown_bug_hunt(convert_stage_1)
-        # re.DOTALL: Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline. Corresponds to the inline flag (?s).
        # 1. convert to easy-to-copy tex (do not render math)
        convert_stage_2_1, n = re.subn(find_equation_pattern, replace_math_no_render, convert_stage_1, flags=re.DOTALL)
        # 2. convert to rendered equation
@@ -379,7 +401,7 @@ def markdown_convertion(txt):
        # cat them together
        return pre + convert_stage_2_1 + f'{split}' + convert_stage_2_2 + suf
    else:
-        return pre + markdown.markdown(txt, extensions=['fenced_code', 'codehilite', 'tables', 'sane_lists']) + suf
+        return pre + markdown.markdown(txt, extensions=['sane_lists', 'tables', 'fenced_code', 'codehilite']) + suf


 def close_up_code_segment_during_stream(gpt_reply):
@@ -561,7 +583,7 @@ def on_file_uploaded(files, chatbot, txt, txt2, checkboxes, cookies):
    chatbot.append(['我上传了文件，请查收',
                    f'[Local Message] 收到以下文件: \n\n{moved_files_str}' +
                    f'\n\n调用路径参数已自动修正到: \n\n{txt}' +
-                    f'\n\n现在您点击任意“红颜色”标识的函数插件时，以上文件将被作为输入参数'+err_msg])
+                    f'\n\n现在您点击任意函数插件时，以上文件将被作为输入参数'+err_msg])
    cookies.update({
        'most_recent_uploaded': {
            'path': f'private_upload/{time_tag}',
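The `is_equation` check added to toolbox.py in the diff above can be tried in isolation. The following is a condensed standalone restatement that keeps only the three active patterns from the diff. The module-level names `MATH_PATTERNS` and `ILLEGAL` are introduced here for illustration, and the `all(...)` return collapses the original loop; treat it as a sketch rather than the project's code.

```python
import re

# pattern -> whether the match may span multiple lines (copied from the diff)
MATH_PATTERNS = {
    r'(?<!\\|\$)(\$)([^\$]+)(\$)': False,   # $...$
    r'(?<!\\)(\$\$)([^\$]+)(\$\$)': True,   # $$...$$
    r'(?<!\\)(\\\[)(.+?)(\\\])': False,     # \[...\]
}
# Non-ASCII characters or the word "echo" disqualify a candidate.
ILLEGAL = re.compile(r'[^\x00-\x7F]|echo')


def is_equation(txt):
    # A code fence (other than the ```reference block) means "not math".
    if '```' in txt and '```reference' not in txt:
        return False
    if '$' not in txt and '\\[' not in txt:
        return False
    matches = []
    for pattern, multi_line in MATH_PATTERNS.items():
        flags = re.ASCII | re.DOTALL if multi_line else re.ASCII
        matches.extend(re.findall(pattern, txt, flags))
    if not matches:
        return False
    # Every delimited candidate must look like plain ASCII math.
    return all(ILLEGAL.search(body) is None for _, body, _ in matches)


print(is_equation("Energy: $E = mc^2$"))
print(is_equation("```bash\necho $HOME\n```"))
```

Compared with the old `'$' in txt` test, a lone dollar sign such as "price is $5" no longer routes the text through the math renderer, because no closed delimiter pair is found.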
Author	SHA1	Message	Commit date
binary-husky	34784333dc	Adjust the left/right ratio of the merged PDF to 95%	2023-09-10 17:22:35 +08:00
binary-husky	28d777a96b	Fix error message	2023-09-10 16:52:35 +08:00
qingxu fu	c45fa88684	update translation matrix	2023-09-09 21:57:24 +08:00
binary-husky	ad9807dd14	Update VoidTerminal prompts	2023-09-09 20:32:44 +08:00
binary-husky	2a51715075	Fix Dockerfile	2023-09-09 20:15:46 +08:00
binary-husky	7c307d8964	Fix compatibility between the source code parsing module and VoidTerminal	2023-09-09 19:33:05 +08:00
binary-husky	baaacc5a7b	Update README.md	2023-09-09 19:11:21 +08:00
binary-husky	6faf5947c9	Merge branch 'master' of github.com:binary-husky/chatgpt_academic	2023-09-09 18:30:59 +08:00
binary-husky	571335cbc4	fix docker file	2023-09-09 18:30:43 +08:00
binary-husky	7d5abb6d69	Merge pull request #1077 from jsz14897502/master Change how the Google Scholar search assistant retrieves abstracts	2023-09-09 18:24:30 +08:00
binary-husky	a0f592308a	Merge branch 'master' into jsz14897502-master	2023-09-09 18:22:29 +08:00
binary-husky	e512d99879	Add a delay to avoid triggering anti-crawler mechanisms	2023-09-09 18:22:22 +08:00
binary-husky	e70b636513	Fix bug in math formula detection	2023-09-09 17:50:38 +08:00
binary-husky	408b8403fe	Merge branch 'master' of github.com:binary-husky/chatgpt_academic	2023-09-08 12:10:22 +08:00
binary-husky	74f8cb3511	update dockerfile	2023-09-08 12:10:16 +08:00
qingxu fu	2202cf3701	remove proxy message	2023-09-08 11:11:53 +08:00
qingxu fu	cce69beee9	update error message	2023-09-08 11:08:02 +08:00
qingxu fu	347124c967	update scipdf_parser dep	2023-09-08 10:43:20 +08:00
qingxu fu	77a6105a9a	Modify demo example	2023-09-08 09:52:29 +08:00
qingxu fu	13c9606af7	Fix the error message shown when PDF download fails	2023-09-08 09:47:29 +08:00
binary-husky	bac6810e75	Modify operation hints	2023-09-08 09:38:16 +08:00
binary-husky	c176187d24	Fix inaccurate error message caused by a function return value	2023-09-07 23:46:54 +08:00
binary-husky	31d5ee6ccc	Update README.md	2023-09-07 23:05:54 +08:00
binary-husky	5e0dc9b9ad	Fix timestamp issue in PDF download path	2023-09-07 18:51:09 +08:00
binary-husky	4c6f3aa427	CodeInterpreter	2023-09-07 17:45:44 +08:00
binary-husky	d7331befc1	add note	2023-09-07 17:42:47 +08:00
binary-husky	63219baa21	Fix abnormal display at the end of sentences during voice conversation	2023-09-07 17:04:40 +08:00
binary-husky	97cb9a4adc	full capacity docker file	2023-09-07 15:09:38 +08:00
binary-husky	24f41b0a75	new docker file	2023-09-07 00:45:03 +08:00
binary-husky	bfec29e9bc	new docker file	2023-09-07 00:43:31 +08:00
binary-husky	dd9e624761	add new dockerfile	2023-09-07 00:40:11 +08:00
binary-husky	7855325ff9	update dockerfiles	2023-09-06 23:33:15 +08:00
binary-husky	2c039ff5c9	add session	2023-09-06 22:19:32 +08:00
binary-husky	9a5ee86434	Merge pull request #1084 from eltociear/patch-2 Update README.md	2023-09-06 21:56:39 +08:00
binary-husky	d6698db257	Translate PDF papers with nougat	2023-09-06 15:32:11 +08:00
Ikko Eltociear Ashimine	b2d03bf2a3	Update README.md arbitary -> arbitrary	2023-09-06 15:30:12 +09:00
binary-husky	2f83b60fb3	Add a message when the search fails	2023-09-06 12:36:59 +08:00
binary-husky	d183e34461	Add a switch for searching all versions	2023-09-06 11:42:29 +08:00
binary-husky	fb78569335	Merge branch 'master' of https://github.com/jsz14897502/gpt_academic into jsz14897502-master	2023-09-06 10:27:52 +08:00
qingxu fu	12c8cd75ee	Merge branch 'master' of https://github.com/binary-husky/chatgpt_academic into master	2023-09-06 10:24:14 +08:00
qingxu fu	0e21e3e2e7	Fix missing error message when the Xunfei APPID is not filled in	2023-09-06 10:24:11 +08:00
binary-husky	fda1e87278	Update stale.yml	2023-09-06 10:19:21 +08:00
binary-husky	1092031d77	Create stale.yml	2023-09-06 10:15:52 +08:00
binary-husky	f0482d3bae	Update docker-compose.yml	2023-09-04 12:39:25 +08:00
binary-husky	b6ac3d0d6c	Update README.md	2023-09-04 12:34:55 +08:00
binary-husky	3344ffcb8b	Update README.md	2023-09-04 11:41:52 +08:00
binary-husky	82936f71b6	Update README.md	2023-09-04 11:37:47 +08:00
binary-husky	51e809c09e	Update README.md	2023-09-04 11:34:46 +08:00
qingxu fu	713df396dc	Merge branch 'master' of https://github.com/binary-husky/chatgpt_academic into master	2023-09-03 16:46:30 +08:00
qingxu fu	23a42d93df	update translation matrix	2023-09-03 16:46:27 +08:00
binary-husky	0ef06683dc	Update README.md	2023-09-03 16:35:03 +08:00
jsz14	03164bcb6f	fix: handle the case where not all versions are retrieved	2023-09-02 19:58:24 +08:00
jsz14	d052d425af	Change how the Google Scholar search assistant retrieves abstracts	2023-08-30 19:14:01 +08:00