Mirrored from
https://github.com/binary-husky/gpt_academic.git
Synced 2025-12-06 06:26:47 +00:00
Comparing commits
69 commits
version3.4
...
version3.4
| Author | SHA1 | Commit date |
|---|---|---|
|  | 49253c4dc6 |  |
|  | 1a00093015 |  |
|  | 64f76e7401 |  |
|  | eb4c07997e |  |
|  | d684b4cdb3 |  |
|  | 601a95c948 |  |
|  | e18bef2e9c |  |
|  | f654c1af31 |  |
|  | e90048a671 |  |
|  | ea624b1510 |  |
|  | 057e3dda3c |  |
|  | 4290821a50 |  |
|  | 280e14d7b7 |  |
|  | 9f0cf9fb2b |  |
|  | b8560b7510 |  |
|  | d841d13b04 |  |
|  | efda9e5193 |  |
|  | 33d2e75aac |  |
|  | 74941170aa |  |
|  | cd38949903 |  |
|  | d87f1eb171 |  |
|  | cd1e4e1ba7 |  |
|  | cf5f348d70 |  |
|  | 0ee25f475e |  |
|  | 1fede6df7f |  |
|  | 22a65cd163 |  |
|  | 538b041ea3 |  |
|  | d7b056576d |  |
|  | cb0bb6ab4a |  |
|  | bf955aaf12 |  |
|  | 61eb0da861 |  |
|  | 5da633d94d |  |
|  | f3e4e26e2f |  |
|  | af7734dd35 |  |
|  | d5bab093f9 |  |
|  | f94b167dc2 |  |
|  | 951d5ec758 |  |
|  | 016d8ee156 |  |
|  | dca9ec4bae |  |
|  | a06e43c96b |  |
|  | 29c6bfb6cb |  |
|  | 8d7ee975a0 |  |
|  | 4bafbb3562 |  |
|  | 7fdf0a8e51 |  |
|  | 2bb13b4677 |  |
|  | 9a5a509dd9 |  |
|  | cbcb98ef6a |  |
|  | bb864c6313 |  |
|  | 6d849eeb12 |  |
|  | ef752838b0 |  |
|  | 73d4a1ff4b |  |
|  | 8c62f21aa6 |  |
|  | c40ebfc21f |  |
|  | c365ea9f57 |  |
|  | 12d66777cc |  |
|  | 9ac3d0d65d |  |
|  | 9fd212652e |  |
|  | 790a1cf12a |  |
|  | 3ecf2977a8 |  |
|  | aeddf6b461 |  |
|  | ce0d8b9dab |  |
|  | 3c00e7a143 |  |
|  | ef1bfdd60f |  |
|  | e48d92e82e |  |
|  | 110510997f |  |
|  | b52695845e |  |
|  | f30c9c6d3b |  |
|  | ff5403eac6 |  |
|  | f3205994ea |  |
44
.github/workflows/build-with-latex.yml
vendored
Normal file
@@ -0,0 +1,44 @@
+# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
+name: Create and publish a Docker image for Latex support
+
+on:
+  push:
+    branches:
+      - 'master'
+
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}_with_latex
+
+jobs:
+  build-and-push-image:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Log in to the Container registry
+        uses: docker/login-action@v2
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata (tags, labels) for Docker
+        id: meta
+        uses: docker/metadata-action@v4
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v4
+        with:
+          context: .
+          push: true
+          file: docs/GithubAction+NoLocal+Latex
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
129
README.md
@@ -16,7 +16,7 @@ To translate this project to arbitary language with GPT, read and run [`multi_la
 >
 > 1. Note that only function plugins (buttons) marked in **red** support reading files, and some plugins are located in the **dropdown menu** of the plugin area. In addition, we welcome and process PRs for any new plugin with the **highest priority**!
 >
-> 2. The function of every file in this project is described in detail in the self-analysis report [`self_analysis.md`](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A). As versions iterate, you can also click the relevant function plugin at any time and call GPT to regenerate the project's self-analysis report. Frequently asked questions are collected in the [`wiki`](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98). [Installation](#installation).
+> 2. The function of every file in this project is described in detail in the self-analysis report [`self_analysis.md`](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A). As versions iterate, you can also click the relevant function plugin at any time and call GPT to regenerate the project's self-analysis report. Frequently asked questions are collected in the [`wiki`](https://github.com/binary-husky/gpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98). [Installation](#installation).
 >
 > 3. This project is compatible with, and encourages trying, domestic large language models such as chatglm, RWKV, PanGu, etc. Multiple api-keys can coexist; in the config file, write e.g. `API_KEY="openai-key1,openai-key2,api2d-key3"`. To temporarily switch `API_KEY`, enter a temporary `API_KEY` in the input area and press Enter to submit; it takes effect immediately.
 
@@ -31,23 +31,23 @@ To translate this project to arbitary language with GPT, read and run [`multi_la
|
|||||||
一键中英互译 | 一键中英互译
|
一键中英互译 | 一键中英互译
|
||||||
一键代码解释 | 显示代码、解释代码、生成代码、给代码加注释
|
一键代码解释 | 显示代码、解释代码、生成代码、给代码加注释
|
||||||
[自定义快捷键](https://www.bilibili.com/video/BV14s4y1E7jN) | 支持自定义快捷键
|
[自定义快捷键](https://www.bilibili.com/video/BV14s4y1E7jN) | 支持自定义快捷键
|
||||||
模块化设计 | 支持自定义强大的[函数插件](https://github.com/binary-husky/chatgpt_academic/tree/master/crazy_functions),插件支持[热更新](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)
|
模块化设计 | 支持自定义强大的[函数插件](https://github.com/binary-husky/gpt_academic/tree/master/crazy_functions),插件支持[热更新](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)
|
||||||
[自我程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] [一键读懂](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)本项目的源代码
|
[自我程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] [一键读懂](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A)本项目的源代码
|
||||||
[程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] 一键可以剖析其他Python/C/C++/Java/Lua/...项目树
|
[程序剖析](https://www.bilibili.com/video/BV1cj411A7VW) | [函数插件] 一键可以剖析其他Python/C/C++/Java/Lua/...项目树
|
||||||
读论文、[翻译](https://www.bilibili.com/video/BV1KT411x7Wn)论文 | [函数插件] 一键解读latex/pdf论文全文并生成摘要
|
读论文、[翻译](https://www.bilibili.com/video/BV1KT411x7Wn)论文 | [函数插件] 一键解读latex/pdf论文全文并生成摘要
|
||||||
Latex全文[翻译](https://www.bilibili.com/video/BV1nk4y1Y7Js/)、[润色](https://www.bilibili.com/video/BV1FT411H7c5/) | [函数插件] 一键翻译或润色latex论文
|
Latex全文[翻译](https://www.bilibili.com/video/BV1nk4y1Y7Js/)、[润色](https://www.bilibili.com/video/BV1FT411H7c5/) | [函数插件] 一键翻译或润色latex论文
|
||||||
批量注释生成 | [函数插件] 一键批量生成函数注释
|
批量注释生成 | [函数插件] 一键批量生成函数注释
|
||||||
Markdown[中英互译](https://www.bilibili.com/video/BV1yo4y157jV/) | [函数插件] 看到上面5种语言的[README](https://github.com/binary-husky/chatgpt_academic/blob/master/docs/README_EN.md)了吗?
|
Markdown[中英互译](https://www.bilibili.com/video/BV1yo4y157jV/) | [函数插件] 看到上面5种语言的[README](https://github.com/binary-husky/gpt_academic/blob/master/docs/README_EN.md)了吗?
|
||||||
chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
|
chat分析报告生成 | [函数插件] 运行后自动生成总结汇报
|
||||||
[PDF论文全文翻译功能](https://www.bilibili.com/video/BV1KT411x7Wn) | [函数插件] PDF论文提取题目&摘要+翻译全文(多线程)
|
[PDF论文全文翻译功能](https://www.bilibili.com/video/BV1KT411x7Wn) | [函数插件] PDF论文提取题目&摘要+翻译全文(多线程)
|
||||||
[Arxiv小助手](https://www.bilibili.com/video/BV1LM4y1279X) | [函数插件] 输入arxiv文章url即可一键翻译摘要+下载PDF
|
[Arxiv小助手](https://www.bilibili.com/video/BV1LM4y1279X) | [函数插件] 输入arxiv文章url即可一键翻译摘要+下载PDF
|
||||||
[谷歌学术统合小助手](https://www.bilibili.com/video/BV19L411U7ia) | [函数插件] 给定任意谷歌学术搜索页面URL,让gpt帮你[写relatedworks](https://www.bilibili.com/video/BV1GP411U7Az/)
|
[谷歌学术统合小助手](https://www.bilibili.com/video/BV19L411U7ia) | [函数插件] 给定任意谷歌学术搜索页面URL,让gpt帮你[写relatedworks](https://www.bilibili.com/video/BV1GP411U7Az/)
|
||||||
互联网信息聚合+GPT | [函数插件] 一键[让GPT先从互联网获取信息](https://www.bilibili.com/video/BV1om4y127ck),再回答问题,让信息永不过时
|
互联网信息聚合+GPT | [函数插件] 一键[让GPT先从互联网获取信息](https://www.bilibili.com/video/BV1om4y127ck),再回答问题,让信息永不过时
|
||||||
Arxiv论文精密翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/),迄今为止最好的论文翻译工具
|
⭐Arxiv论文精细翻译 | [函数插件] 一键[以超高质量翻译arxiv论文](https://www.bilibili.com/video/BV1dz4y1v77A/),迄今为止最好的论文翻译工具⭐
|
||||||
公式/图片/表格显示 | 可以同时显示公式的[tex形式和渲染形式](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png),支持公式、代码高亮
|
公式/图片/表格显示 | 可以同时显示公式的[tex形式和渲染形式](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png),支持公式、代码高亮
|
||||||
多线程函数插件支持 | 支持多线调用chatgpt,一键处理[海量文本](https://www.bilibili.com/video/BV1FT411H7c5/)或程序
|
多线程函数插件支持 | 支持多线调用chatgpt,一键处理[海量文本](https://www.bilibili.com/video/BV1FT411H7c5/)或程序
|
||||||
启动暗色gradio[主题](https://github.com/binary-husky/chatgpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
|
启动暗色gradio[主题](https://github.com/binary-husky/gpt_academic/issues/173) | 在浏览器url后面添加```/?__theme=dark```可以切换dark主题
|
||||||
[多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持,[API2D](https://api2d.com/)接口支持 | 同时被GPT3.5、GPT4、[清华ChatGLM](https://github.com/THUDM/ChatGLM-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧?
|
[多LLM模型](https://www.bilibili.com/video/BV1wT411p7yf)支持 | 同时被GPT3.5、GPT4、[清华ChatGLM](https://github.com/THUDM/ChatGLM-6B)、[复旦MOSS](https://github.com/OpenLMLab/MOSS)同时伺候的感觉一定会很不错吧?
|
||||||
更多LLM模型接入,支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama),[RWKV](https://github.com/BlinkDL/ChatRWKV)和[盘古α](https://openi.org.cn/pangu/)
|
更多LLM模型接入,支持[huggingface部署](https://huggingface.co/spaces/qingxu98/gpt-academic) | 加入Newbing接口(新必应),引入清华[Jittorllms](https://github.com/Jittor/JittorLLMs)支持[LLaMA](https://github.com/facebookresearch/llama),[RWKV](https://github.com/BlinkDL/ChatRWKV)和[盘古α](https://openi.org.cn/pangu/)
|
||||||
更多新功能展示(图像生成等) …… | 见本文档结尾处 ……
|
更多新功能展示(图像生成等) …… | 见本文档结尾处 ……
|
||||||
|
|
||||||
@@ -91,13 +91,13 @@ Precision Arxiv paper translation | [Function plugin] One-click [high-quality translation of arxiv papers
 
 1. Download the project
 ```sh
-git clone https://github.com/binary-husky/chatgpt_academic.git
+git clone https://github.com/binary-husky/gpt_academic.git
-cd chatgpt_academic
+cd gpt_academic
 ```
 
 2. Configure API_KEY
 
-In `config.py`, configure the API KEY and other settings, [special network environment settings](https://github.com/binary-husky/gpt_academic/issues/1).
+In `config.py`, configure the API KEY and other settings, [click to see how to configure for special network environments](https://github.com/binary-husky/gpt_academic/issues/1).
 
 (P.S. When the program runs, it first checks for a private configuration file named `config_private.py` and uses its settings to override those with the same name in `config.py`. So if you understand our configuration reading logic, we strongly recommend creating a new configuration file named `config_private.py` next to `config.py` and moving (copying) the settings from `config.py` into `config_private.py`. `config_private.py` is not tracked by git, which keeps your private information safer. P.S. The project also supports configuring most options through `environment variables`; see the `docker-compose` file for the environment-variable format. Reading priority: `environment variables` > `config_private.py` > `config.py`)
 
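The reading priority described above (`environment variables` > `config_private.py` > `config.py`) can be sketched in Python. `read_single_conf` below is a hypothetical helper written for illustration, not the project's actual `toolbox` implementation:

```python
import importlib
import os

def read_single_conf(name, default=None):
    """Sketch of the documented read priority:
    environment variable > config_private.py > config.py.
    Illustrative only; the real project's config reader differs."""
    # 1) An environment variable always wins.
    if name in os.environ:
        return os.environ[name]
    # 2) The private, git-ignored config overrides the shared one.
    try:
        cfg = importlib.import_module("config_private")
        if hasattr(cfg, name):
            return getattr(cfg, name)
    except ImportError:
        pass
    # 3) Fall back to the shared config.py, then to the default.
    try:
        return getattr(importlib.import_module("config"), name)
    except (ImportError, AttributeError):
        return default
```

If neither config module is present, the helper degrades gracefully to the environment variable or the supplied default.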
@@ -113,6 +113,7 @@ conda activate gptac_venv                 # activate the anaconda environment
 python -m pip install -r requirements.txt # same steps as a pip installation
 ```
 
+
 <details><summary>Click to expand if you need Tsinghua ChatGLM / Fudan MOSS supported as backends</summary>
 <p>
 
@@ -139,19 +140,13 @@ AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-
 python main.py
 ```
 
-5. Test the function plugins
-```
-- Test the function plugin template function (asks gpt what happened in history on this day); you can use this function as a template to implement more complex features
-    Click "[函数插件模板Demo] 历史上的今天"
-```
 
 ## Installation - Method 2: Using Docker
 
-1. ChatGPT only (recommended for most people)
+1. ChatGPT only (recommended for most people; equivalent to docker-compose scheme 1)
 
 ``` sh
-git clone https://github.com/binary-husky/chatgpt_academic.git  # download the project
+git clone https://github.com/binary-husky/gpt_academic.git  # download the project
-cd chatgpt_academic  # enter the path
+cd gpt_academic  # enter the path
 nano config.py  # edit config.py with any text editor: configure "Proxy", "API_KEY", "WEB_PORT" (e.g. 50923) etc.
 docker build -t gpt-academic .  # install
 
@@ -160,40 +155,43 @@ docker run --rm -it --net=host gpt-academic
 #(Last step - option 2) In a macOS/windows environment, you can only use the -p option to expose a container port (e.g. 50923) to a host port
 docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
 ```
+P.S. If you need the Latex-dependent plugin features, see the Wiki. Alternatively, you can get the Latex features directly via docker-compose (modify docker-compose.yml, keep scheme 4 and delete the other schemes).
 
 2. ChatGPT + ChatGLM + MOSS (requires familiarity with Docker)
 
 ``` sh
-# Modify docker-compose.yml: delete schemes 1 and 3 and keep scheme 2. Then adjust the scheme-2 configuration in docker-compose.yml by following the comments in it
+# Modify docker-compose.yml: keep scheme 2 and delete the other schemes. Then adjust the scheme-2 configuration in docker-compose.yml by following the comments in it
 docker-compose up
 ```
 
 3. ChatGPT + LLAMA + PanGu + RWKV (requires familiarity with Docker)
 ``` sh
-# Modify docker-compose.yml: delete schemes 1 and 2 and keep scheme 3. Then adjust the scheme-3 configuration in docker-compose.yml by following the comments in it
+# Modify docker-compose.yml: keep scheme 3 and delete the other schemes. Then adjust the scheme-3 configuration in docker-compose.yml by following the comments in it
 docker-compose up
 ```
 
 
 ## Installation - Method 3: Other deployment options
 1. One-click run script.
-Windows users completely unfamiliar with the python environment can download the one-click run script published in [Release](https://github.com/binary-husky/gpt_academic/releases) to install the version without local models,
+Windows users completely unfamiliar with the python environment can download the one-click run script published in [Release](https://github.com/binary-husky/gpt_academic/releases) to install the version without local models.
-Users who already have python on their computer are advised against this method (installing the plugins' dependencies on top of it is troublesome).
 The script is contributed from [oobabooga](https://github.com/oobabooga/one-click-installers).
 
 2. Run with docker-compose.
 Read docker-compose.yml and follow its instructions.
 
-3. How to use a reverse-proxy URL / Microsoft Azure API.
+3. How to use a reverse-proxy URL
 Configure API_URL_REDIRECT following the instructions in `config.py`.
 
-4. Remote cloud-server deployment (requires cloud-server knowledge and experience).
-Please visit [deployment wiki-1](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97)
+4. Microsoft Azure API
+Configure following the instructions in `config.py` (the four AZURE settings such as AZURE_ENDPOINT)
 
-5. Use WSL2 (Windows Subsystem for Linux).
-Please visit [deployment wiki-2](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
+5. Remote cloud-server deployment (requires cloud-server knowledge and experience).
+Please visit [deployment wiki-1](https://github.com/binary-husky/gpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97)
 
-6. How to run under a secondary URL (such as `http://localhost/subpath`).
+6. Use WSL2 (Windows Subsystem for Linux).
+Please visit [deployment wiki-2](https://github.com/binary-husky/gpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
 
+7. How to run under a secondary URL (such as `http://localhost/subpath`).
 Please see the [FastAPI running instructions](docs/WithFastapi.md)
 
 ---
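Trimming docker-compose.yml down to a single scheme, as the instructions above describe, leaves a file shaped roughly like this. The service name and key values below are assumptions for illustration, not the project's actual compose file:

```yaml
# Hypothetical docker-compose.yml trimmed to scheme 1 (ChatGPT only)
version: '3'
services:
  gpt_academic:            # assumed service name
    image: gpt-academic    # built earlier with `docker build -t gpt-academic .`
    environment:
      API_KEY: 'sk-openaikey1,sk-openaikey2'   # same format as in config.py
      WEB_PORT: '50923'
    ports:
      - '50923:50923'      # expose the web UI to the host
```

Running `docker-compose up` with only one scheme block left avoids starting the other backends by accident.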
@@ -220,7 +218,7 @@ docker-compose up
 
 Write powerful function plugins to perform any task you can and cannot imagine.
 Writing and debugging plugins in this project is easy: with some basic python knowledge, you can implement your own plugin features by imitating the template we provide.
-For details, see the [Function Plugin Guide](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).
+For details, see the [Function Plugin Guide](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).
 
 ---
 # Latest Update
@@ -228,38 +226,33 @@ docker-compose up
 
 1. Conversation saving. In the function plugin area, call `保存当前的对话` to save the current conversation as a readable and restorable html file;
 in addition, call `载入对话历史存档` in the function plugin area (dropdown menu) to restore a previous session.
-Tip: clicking `载入对话历史存档` without specifying a file shows the historical html archive cache; clicking `删除所有本地对话历史记录` deletes all html archive caches.
+Tip: clicking `载入对话历史存档` without specifying a file shows the historical html archive cache.
 <div align="center">
 <img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
 </div>
 
+2. ⭐Latex/Arxiv paper translation⭐
 
-2. Report generation. Most plugins generate a work report after they finish
 <div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="300" >
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/002a1a75-ace0-4e6a-94e2-ec1406a746f1" height="250" > ===>
-<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="300" >
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/9fdcc391-f823-464f-9322-f8719677043b" height="250" >
-<img src="https://user-images.githubusercontent.com/96192199/227504005-efeaefe0-b687-49d0-bf95-2d7b7e66c348.png" height="300" >
 </div>
 
-3. Modular function design: simple interfaces that support powerful features
+3. Report generation. Most plugins generate a work report after they finish
+<div align="center">
+<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="250" >
+<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="250" >
+</div>
 
+4. Modular function design: simple interfaces that support powerful features
 <div align="center">
 <img src="https://user-images.githubusercontent.com/96192199/229288270-093643c1-0018-487a-81e6-1d7809b6e90f.png" height="400" >
 <img src="https://user-images.githubusercontent.com/96192199/227504931-19955f78-45cd-4d1c-adac-e71e50957915.png" height="400" >
 </div>
 
-4. This is an open-source project capable of "translating and explaining itself"
+5. Translating and explaining other open-source projects
 <div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/226936850-c77d7183-0749-4c1c-9875-fd4891842d0c.png" width="500" >
+<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" height="250" >
+<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" height="250" >
 
-5. Translating and explaining other open-source projects is no problem either
-<div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" width="500" >
-</div>
 
-<div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" width="500" >
 </div>
 
 6. A small feature decorating with [live2d](https://github.com/fghrsh/live2d_demo) (disabled by default; requires modifying `config.py`)
@@ -284,15 +277,11 @@ Tip: clicking `载入对话历史存档` without specifying a file shows the historical h
 
 10. Full-text Latex proofreading and correction
 <div align="center">
-<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="250" > ===>
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="200" > ===>
-<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="250">
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="200">
 </div>
 
-10. Latex/Arxiv paper translation
-<div align="center">
-<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/002a1a75-ace0-4e6a-94e2-ec1406a746f1" height="250" >
-<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/9fdcc391-f823-464f-9322-f8719677043b" height="250" >
-</div>
 
 ## Versions:
 - version 3.5 (Todo): call all of this project's function plugins using natural language (high priority)
@@ -314,30 +303,32 @@ gpt_academic developer QQ group 2: 610599535
 
 - Known issues
     - Some browser translation plugins interfere with the front-end of this software
-    - The official Gradio currently has many compatibility bugs; be sure to install Gradio using requirement.txt
+    - The official Gradio currently has many compatibility bugs; be sure to install Gradio using `requirement.txt`
 
 ## References and Learning
 
 ```
-The code references designs from many other excellent projects, mainly including:
+The code references designs from many other excellent projects, in no particular order:
 
-# Project 1: Tsinghua ChatGLM-6B:
+# Tsinghua ChatGLM-6B:
 https://github.com/THUDM/ChatGLM-6B
 
-# Project 2: Tsinghua JittorLLMs:
+# Tsinghua JittorLLMs:
 https://github.com/Jittor/JittorLLMs
 
-# Project 3: Edge-GPT:
+# ChatPaper:
-https://github.com/acheong08/EdgeGPT
 
-# Project 4: ChuanhuChatGPT:
-https://github.com/GaiZhenbiao/ChuanhuChatGPT
 
-# Project 5: ChatPaper:
 https://github.com/kaixindelele/ChatPaper
 
-# More:
+# Edge-GPT:
+https://github.com/acheong08/EdgeGPT
 
+# ChuanhuChatGPT:
+https://github.com/GaiZhenbiao/ChuanhuChatGPT
 
+# Oobabooga one-click installer:
+https://github.com/oobabooga/one-click-installers
 
+# More:
 https://github.com/gradio-app/gradio
 https://github.com/fghrsh/live2d_demo
-https://github.com/oobabooga/one-click-installers
 ```
12
config.py
@@ -1,6 +1,7 @@
 # [step 1]>> e.g.: API_KEY = "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" (this key is invalid)
 API_KEY = "sk-此处填API密钥"    # Several API-KEYs can be given at once, separated by English commas, e.g. API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey1,fkxxxx-api2dkey2"
 
 
 # [step 2]>> Set to True to use a proxy; do not change this if deploying directly on an overseas server
 USE_PROXY = False
 if USE_PROXY:
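The comment above says several API keys can live in one `API_KEY` string. A minimal sketch of how such a string could be split and grouped by provider prefix; the helper name and grouping rule are illustrative assumptions, not the project's actual key-selection logic:

```python
def split_api_keys(api_key_str):
    """Split a comma-separated API_KEY string and group keys by prefix.
    OpenAI keys start with 'sk-'; api2d keys start with 'fk' in the
    example format shown in config.py. Illustrative sketch only."""
    keys = [k.strip() for k in api_key_str.split(",") if k.strip()]
    openai_keys = [k for k in keys if k.startswith("sk-")]
    api2d_keys = [k for k in keys if k.startswith("fk")]
    return openai_keys, api2d_keys
```

A caller could then round-robin over `openai_keys` per request, which is one plausible way multiple coexisting keys get used.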
@@ -46,8 +47,8 @@ MAX_RETRY = 2
 
 # Model selection (note: LLM_MODEL is the model selected by default; it must also be included in the AVAIL_LLM_MODELS switch list)
 LLM_MODEL = "gpt-3.5-turbo" # options ↓↓↓
-AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "newbing-free", "stack-claude"]
+AVAIL_LLM_MODELS = ["gpt-3.5-turbo-16k", "gpt-3.5-turbo", "azure-gpt35", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-4", "chatglm", "moss", "newbing", "newbing-free", "stack-claude"]
-# P.S. Other available models also include ["newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
+# P.S. Other available models also include ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "newbing-free", "jittorllms_rwkv", "jittorllms_pangualpha", "jittorllms_llama"]
 
 # Execution mode (CPU/GPU) for local LLM models such as ChatGLM
 LOCAL_MODEL_DEVICE = "cpu" # options: "cuda"
@@ -81,3 +82,10 @@ your bing cookies here
 # To use Slack Claude, see request_llm/README.md for a detailed tutorial
 SLACK_CLAUDE_BOT_ID = ''
 SLACK_CLAUDE_USER_TOKEN = ''
 
 
+# To use AZURE, see the extra document docs\use_azure.md for details
+AZURE_ENDPOINT = "https://你的api名称.openai.azure.com/"
+AZURE_API_KEY = "填入azure openai api的密钥"
+AZURE_API_VERSION = "填入api版本"
+AZURE_ENGINE = "填入ENGINE"
@@ -112,11 +112,11 @@ def get_crazy_functions():
         "AsButton": False,  # put into the dropdown menu
         "Function": HotReload(解析项目本身)
     },
-    "[老旧的Demo] 把本项目源代码切换成全英文": {
-        # HotReload means hot reload: after modifying the plugin code, it takes effect without restarting the program
-        "AsButton": False,  # put into the dropdown menu
-        "Function": HotReload(全项目切换英文)
-    },
+    # "[老旧的Demo] 把本项目源代码切换成全英文": {
+    #     # HotReload means hot reload: after modifying the plugin code, it takes effect without restarting the program
+    #     "AsButton": False,  # put into the dropdown menu
+    #     "Function": HotReload(全项目切换英文)
+    # },
     "[插件demo] 历史上的今天": {
         # HotReload means hot reload: after modifying the plugin code, it takes effect without restarting the program
         "Function": HotReload(高阶功能模板函数)
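Each plugin above is one dict entry whose keys control UI placement (`AsButton`, `Color`, ...) and whose `Function` is the callable, wrapped in `HotReload` in the real project. A minimal self-contained sketch of that registration pattern, with a hypothetical plugin function and English key names invented for illustration:

```python
def build_plugin_registry():
    """Sketch of the registration pattern used in get_crazy_functions():
    each plugin is a dict entry; optional keys control UI placement.
    The plugin function and key name here are hypothetical."""
    def on_this_day(txt, **kwargs):
        # A stand-in for the real generator-based plugin signature.
        return f"Looking up historical events for: {txt}"

    return {
        "[Plugin demo] Today in history": {
            "AsButton": False,        # shown in the dropdown menu, not as a button
            "Function": on_this_day,  # the real project wraps this in HotReload(...)
        },
    }
```

The UI layer can then iterate over the registry, creating a button or dropdown entry per key and dispatching clicks to `entry["Function"]`.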
@@ -348,25 +348,52 @@ def get_crazy_functions():
     try:
         from crazy_functions.Latex输出PDF结果 import Latex英文纠错加PDF对比
         function_plugins.update({
-            "[功能尚不稳定] Latex英文纠错+LatexDiff高亮修正位置": {
+            "Latex英文纠错+高亮修正位置 [需Latex]": {
                 "Color": "stop",
                 "AsButton": False,
-                # "AdvancedArgs": True,
-                # "ArgsReminder": "",
+                "AdvancedArgs": True,
+                "ArgsReminder": "如果有必要, 请在此处追加更细致的矫错指令(使用英文)。",
                 "Function": HotReload(Latex英文纠错加PDF对比)
             }
         })
         from crazy_functions.Latex输出PDF结果 import Latex翻译中文并重新编译PDF
         function_plugins.update({
-            "[功能尚不稳定] Latex翻译/Arixv翻译+重构PDF": {
+            "Arixv翻译(输入arxivID)[需Latex]": {
                 "Color": "stop",
                 "AsButton": False,
-                # "AdvancedArgs": True,
-                # "ArgsReminder": "",
+                "AdvancedArgs": True,
+                "ArgsReminder":
+                    "如果有必要, 请在此处给出自定义翻译命令, 解决部分词汇翻译不准确的问题。 "+
+                    "例如当单词'agent'翻译不准确时, 请尝试把以下指令复制到高级参数区: " + 'If the term "agent" is used in this section, it should be translated to "智能体". ',
+                "Function": HotReload(Latex翻译中文并重新编译PDF)
+            }
+        })
+        function_plugins.update({
+            "本地论文翻译(上传Latex压缩包)[需Latex]": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True,
+                "ArgsReminder":
+                    "如果有必要, 请在此处给出自定义翻译命令, 解决部分词汇翻译不准确的问题。 "+
+                    "例如当单词'agent'翻译不准确时, 请尝试把以下指令复制到高级参数区: " + 'If the term "agent" is used in this section, it should be translated to "智能体". ',
                 "Function": HotReload(Latex翻译中文并重新编译PDF)
             }
         })
     except:
         print('Load function plugin failed')
-    ###################### Plugin group n ###########################
+
+    # try:
+    #     from crazy_functions.虚空终端 import 终端
+    #     function_plugins.update({
+    #         "超级终端": {
+    #             "Color": "stop",
+    #             "AsButton": False,
+    #             # "AdvancedArgs": True,
+    #             # "ArgsReminder": "",
+    #             "Function": HotReload(终端)
+    #         }
+    #     })
+    # except:
+    #     print('Load function plugin failed')
 
     return function_plugins
@@ -30,7 +30,7 @@ def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
         )
         yield from update_ui(chatbot=chatbot, history=history)  # refresh the UI
         from .crazy_utils import try_install_deps
-        try_install_deps(['zh_langchain==0.2.0'])
+        try_install_deps(['zh_langchain==0.2.1'])
 
     # < -------------------- read parameters --------------- >
     if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
@@ -84,7 +84,7 @@ def 读取知识库作答(txt, llm_kwargs, plugin_kwargs, chatbot, history, syst
         chatbot.append(["依赖不足", "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."])
         yield from update_ui(chatbot=chatbot, history=history)  # refresh the UI
         from .crazy_utils import try_install_deps
-        try_install_deps(['zh_langchain==0.2.0'])
+        try_install_deps(['zh_langchain==0.2.1'])
 
     # < ------------------- --------------- >
     kai = knowledge_archive_interface()
@@ -1,12 +1,13 @@
 from toolbox import update_ui, trimmed_format_exc, get_conf, objdump, objload, promote_file_to_downloadzone
 from toolbox import CatchException, report_execption, update_ui_lastest_msg, zip_result, gen_time_str
+from functools import partial
 import glob, os, requests, time
 pj = os.path.join
 ARXIV_CACHE_DIR = os.path.expanduser(f"~/arxiv_cache/")
 
 # =================================== Utility functions ===============================================
-沙雕GPT啊别犯这些低级翻译错误 = 'You must to translate "agent" to "智能体". '
-def switch_prompt(pfg, mode):
+专业词汇声明 = 'If the term "agent" is used in this section, it should be translated to "智能体". '
+def switch_prompt(pfg, mode, more_requirement):
     """
     Generate prompts and system prompts based on the mode for proofreading or translating.
     Args:
@@ -18,14 +19,14 @@ def switch_prompt(pfg, mode):
     - sys_prompt_array: A list of strings containing prompts for system prompts.
     """
     n_split = len(pfg.sp_file_contents)
-    if mode == 'proofread':
+    if mode == 'proofread_en':
         inputs_array = [r"Below is a section from an academic paper, proofread this section." +
-                        r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " +
+                        r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " + more_requirement +
                         r"Answer me only with the revised text:" +
                         f"\n\n{frag}" for frag in pfg.sp_file_contents]
         sys_prompt_array = ["You are a professional academic paper writer." for _ in range(n_split)]
     elif mode == 'translate_zh':
-        inputs_array = [r"Below is a section from an English academic paper, translate it into Chinese." + 沙雕GPT啊别犯这些低级翻译错误 +
+        inputs_array = [r"Below is a section from an English academic paper, translate it into Chinese. " + more_requirement +
                         r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " +
                         r"Answer me only with the translated text:" +
                         f"\n\n{frag}" for frag in pfg.sp_file_contents]
@@ -69,6 +70,12 @@ def move_project(project_folder, arxiv_id=None):
         shutil.rmtree(new_workfolder)
     except:
         pass

+    # align subfolder if there is a folder wrapper
+    items = glob.glob(pj(project_folder,'*'))
+    if len(glob.glob(pj(project_folder,'*.tex'))) == 0 and len(items) == 1:
+        if os.path.isdir(items[0]): project_folder = items[0]
+
     shutil.copytree(src=project_folder, dst=new_workfolder)
     return new_workfolder

@@ -79,7 +86,7 @@ def arxiv_download(chatbot, history, txt):
         os.makedirs(translation_dir)
         target_file = pj(translation_dir, 'translate_zh.pdf')
         if os.path.exists(target_file):
-            promote_file_to_downloadzone(target_file)
+            promote_file_to_downloadzone(target_file, rename_file=None, chatbot=chatbot)
             return target_file
         return False
     def is_float(s):
@@ -88,8 +95,10 @@ def arxiv_download(chatbot, history, txt):
             return True
         except ValueError:
             return False
-    if ('.' in txt) and ('/' not in txt) and is_float(txt):
-        txt = 'https://arxiv.org/abs/' + txt
+    if ('.' in txt) and ('/' not in txt) and is_float(txt): # is arxiv ID
+        txt = 'https://arxiv.org/abs/' + txt.strip()
+    if ('.' in txt) and ('/' not in txt) and is_float(txt[:10]): # is arxiv ID
+        txt = 'https://arxiv.org/abs/' + txt[:10]
     if not txt.startswith('https://arxiv.org'):
         return txt, None

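The hunk above normalizes a bare arXiv ID such as "2303.12712" (or one with a version suffix like "2303.12712v5") into an abs URL before downloading. A small standalone sketch of that detection logic, mirroring the diff's `is_float` check (the `normalize` wrapper name is ours, not the plugin's):

```python
def is_float(s):
    # an arXiv ID such as "2303.12712" parses as a float
    try:
        float(s)
        return True
    except ValueError:
        return False

def normalize(txt):
    # hypothetical wrapper around the two checks shown in the diff
    txt = txt.strip()
    if ('.' in txt) and ('/' not in txt) and is_float(txt):       # bare ID
        txt = 'https://arxiv.org/abs/' + txt
    if ('.' in txt) and ('/' not in txt) and is_float(txt[:10]):  # ID + version suffix, e.g. 2303.12712v5
        txt = 'https://arxiv.org/abs/' + txt[:10]
    return txt

print(normalize('2303.12712'))    # → https://arxiv.org/abs/2303.12712
print(normalize('2303.12712v5'))  # → https://arxiv.org/abs/2303.12712
```

An already-formed URL contains '/', so both branches are skipped and it passes through unchanged.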
@@ -105,6 +114,7 @@ def arxiv_download(chatbot, history, txt):
         return msg, None
     # <-------------- set format ------------->
     arxiv_id = url_.split('/abs/')[-1]
+    if 'v' in arxiv_id: arxiv_id = arxiv_id[:10]
    cached_translation_pdf = check_cached_translation_pdf(arxiv_id)
     if cached_translation_pdf: return cached_translation_pdf, arxiv_id

@@ -137,7 +147,11 @@ def Latex英文纠错加PDF对比(txt, llm_kwargs, plugin_kwargs, chatbot, histo
     chatbot.append([ "函数插件功能?",
         "对整个Latex项目进行纠错, 用latex编译为PDF对修正处做高亮。函数插件贡献者: Binary-Husky。注意事项: 目前仅支持GPT3.5/GPT4,其他模型转化效果未知。目前对机器学习类文献转化效果最好,其他类型文献转化效果未知。仅在Windows系统进行了测试,其他操作系统表现未知。"])
     yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

+    # <-------------- more requirements ------------->
+    if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
+    more_req = plugin_kwargs.get("advanced_arg", "")
+    _switch_prompt_ = partial(switch_prompt, more_requirement=more_req)

     # <-------------- check deps ------------->
     try:
@@ -146,7 +160,7 @@ def Latex英文纠错加PDF对比(txt, llm_kwargs, plugin_kwargs, chatbot, histo
         from .latex_utils import Latex精细分解与转化, 编译Latex
     except Exception as e:
         chatbot.append([ f"解析项目: {txt}",
-            f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
+            f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
         yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
         return

@@ -176,23 +190,26 @@ def Latex英文纠错加PDF对比(txt, llm_kwargs, plugin_kwargs, chatbot, histo


     # <-------------- if merge_translate_zh is already generated, skip gpt req ------------->
-    if not os.path.exists(project_folder + '/merge_proofread.tex'):
-        yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread_latex', switch_prompt=switch_prompt)
+    if not os.path.exists(project_folder + '/merge_proofread_en.tex'):
+        yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
+                                      chatbot, history, system_prompt, mode='proofread_en', switch_prompt=_switch_prompt_)


     # <-------------- compile PDF ------------->
-    success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_proofread',
+    success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_proofread_en',
                                    work_folder_original=project_folder, work_folder_modified=project_folder, work_folder=project_folder)


     # <-------------- zip PDF ------------->
-    zip_result(project_folder)
+    zip_res = zip_result(project_folder)
     if success:
         chatbot.append((f"成功啦", '请查收结果(压缩包)...'))
         yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
+        promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
     else:
         chatbot.append((f"失败了", '虽然PDF生成失败了, 但请查收结果(压缩包), 内含已经翻译的Tex文档, 也是可读的, 您可以到Github Issue区, 用该压缩包+对话历史存档进行反馈 ...'))
         yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
+        promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)

     # <-------------- we are done ------------->
     return success
@@ -205,9 +222,13 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
     # <-------------- information about this plugin ------------->
     chatbot.append([
         "函数插件功能?",
-        "对整个Latex项目进行翻译, 生成中文PDF。函数插件贡献者: Binary-Husky。注意事项: 目前仅支持GPT3.5/GPT4,其他模型转化效果未知。目前对机器学习类文献转化效果最好,其他类型文献转化效果未知。"])
+        "对整个Latex项目进行翻译, 生成中文PDF。函数插件贡献者: Binary-Husky。注意事项: 此插件Windows支持最佳,Linux下必须使用Docker安装,详见项目主README.md。目前仅支持GPT3.5/GPT4,其他模型转化效果未知。目前对机器学习类文献转化效果最好,其他类型文献转化效果未知。"])
     yield from update_ui(chatbot=chatbot, history=history) # 刷新界面

+    # <-------------- more requirements ------------->
+    if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
+    more_req = plugin_kwargs.get("advanced_arg", "")
+    _switch_prompt_ = partial(switch_prompt, more_requirement=more_req)

     # <-------------- check deps ------------->
     try:
@@ -216,7 +237,7 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,
         from .latex_utils import Latex精细分解与转化, 编译Latex
     except Exception as e:
         chatbot.append([ f"解析项目: {txt}",
-            f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
+            f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
         yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
         return

@@ -255,21 +276,24 @@ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot,

     # <-------------- if merge_translate_zh is already generated, skip gpt req ------------->
     if not os.path.exists(project_folder + '/merge_translate_zh.tex'):
-        yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='translate_zh', switch_prompt=switch_prompt)
+        yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
+                                      chatbot, history, system_prompt, mode='translate_zh', switch_prompt=_switch_prompt_)


     # <-------------- compile PDF ------------->
-    success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_translate_zh',
+    success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_translate_zh', mode='translate_zh',
                                    work_folder_original=project_folder, work_folder_modified=project_folder, work_folder=project_folder)

     # <-------------- zip PDF ------------->
-    zip_result(project_folder)
+    zip_res = zip_result(project_folder)
     if success:
         chatbot.append((f"成功啦", '请查收结果(压缩包)...'))
         yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
+        promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
     else:
         chatbot.append((f"失败了", '虽然PDF生成失败了, 但请查收结果(压缩包), 内含已经翻译的Tex文档, 也是可读的, 您可以到Github Issue区, 用该压缩包+对话历史存档进行反馈 ...'))
         yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
+        promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)


     # <-------------- we are done ------------->
@@ -188,7 +188,15 @@ def test_Latex():
     # txt = r"https://arxiv.org/abs/2305.17608"
     # txt = r"https://arxiv.org/abs/2211.16068"  # ACE
     # txt = r"C:\Users\x\arxiv_cache\2211.16068\workfolder" # ACE
-    txt = r"https://arxiv.org/abs/2002.09253"
+    # txt = r"https://arxiv.org/abs/2002.09253"
+    # txt = r"https://arxiv.org/abs/2306.07831"
+    # txt = r"https://arxiv.org/abs/2212.10156"
+    # txt = r"https://arxiv.org/abs/2211.11559"
+    # txt = r"https://arxiv.org/abs/2303.08774"
+    txt = r"https://arxiv.org/abs/2303.12712"
+    # txt = r"C:\Users\fuqingxu\arxiv_cache\2303.12712\workfolder"


     for cookies, cb, hist, msg in (Latex翻译中文并重新编译PDF)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
         cli_printer.print(cb) # print(cb)

@@ -217,6 +225,7 @@ def test_Latex():
 # test_数学动画生成manim()
 # test_Langchain知识库()
 # test_Langchain知识库读取()
-test_Latex()
-input("程序完成,回车退出。")
-print("退出。")
+if __name__ == "__main__":
+    test_Latex()
+    input("程序完成,回车退出。")
+    print("退出。")
@@ -698,3 +698,51 @@ def try_install_deps(deps):
     for dep in deps:
         import subprocess, sys
         subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--user', dep])
+
+
+class construct_html():
+    def __init__(self) -> None:
+        self.css = """
+        .row {
+            display: flex;
+            flex-wrap: wrap;
+        }
+
+        .column {
+            flex: 1;
+            padding: 10px;
+        }
+
+        .table-header {
+            font-weight: bold;
+            border-bottom: 1px solid black;
+        }
+
+        .table-row {
+            border-bottom: 1px solid lightgray;
+        }
+
+        .table-cell {
+            padding: 5px;
+        }
+        """
+        self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
+
+    def add_row(self, a, b):
+        tmp = """
+        <div class="row table-row">
+            <div class="column table-cell">REPLACE_A</div>
+            <div class="column table-cell">REPLACE_B</div>
+        </div>
+        """
+        from toolbox import markdown_convertion
+        tmp = tmp.replace('REPLACE_A', markdown_convertion(a))
+        tmp = tmp.replace('REPLACE_B', markdown_convertion(b))
+        self.html_string += tmp
+
+    def save_file(self, file_name):
+        with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
+            f.write(self.html_string.encode('utf-8', 'ignore').decode())
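The new `construct_html` class above accumulates original/translated fragments as a two-column HTML table via simple `REPLACE_A`/`REPLACE_B` templating. A simplified standalone sketch of just that templating (the `TwoColumnHtml` name is ours; it swaps `toolbox.markdown_convertion` for plain-text cells so it runs without the repo):

```python
class TwoColumnHtml:
    # minimal stand-in for construct_html: two cells per row, string templating only
    def __init__(self):
        self.html = '<!DOCTYPE html><head><meta charset="utf-8"></head>'

    def add_row(self, a, b):
        tmp = ('<div class="row table-row">'
               '<div class="column table-cell">REPLACE_A</div>'
               '<div class="column table-cell">REPLACE_B</div>'
               '</div>')
        # substitute both placeholders, then append the finished row
        tmp = tmp.replace('REPLACE_A', a).replace('REPLACE_B', b)
        self.html += tmp

page = TwoColumnHtml()
page.add_row('original paragraph', '翻译后的段落')
print('REPLACE_A' in page.html)  # → False
```

The real class additionally runs each cell through markdown conversion and injects the CSS shown in the hunk.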
@@ -8,31 +8,69 @@ pj = os.path.join
 """
 ========================================================================
 Part One
-Latex segmentation to a linklist
+Latex segmentation with a binary mask (PRESERVE=0, TRANSFORM=1)
 ========================================================================
 """
 PRESERVE = 0
 TRANSFORM = 1

-def split_worker(text, mask, pattern, flags=0):
+def set_forbidden_text(text, mask, pattern, flags=0):
     """
     Add a preserve text area in this paper
+    e.g. with pattern = r"\\begin\{algorithm\}(.*?)\\end\{algorithm\}"
+    you can mask out (mask = PRESERVE so that text become untouchable for GPT)
+    everything between "\begin{equation}" and "\end{equation}"
     """
+    if isinstance(pattern, list): pattern = '|'.join(pattern)
     pattern_compile = re.compile(pattern, flags)
     for res in pattern_compile.finditer(text):
         mask[res.span()[0]:res.span()[1]] = PRESERVE
     return text, mask

-def split_worker_reverse_caption(text, mask, pattern, flags=0):
+def set_forbidden_text_careful_brace(text, mask, pattern, flags=0):
     """
-    Move caption area out of preserve area
+    Add a preserve text area in this paper (text become untouchable for GPT).
+    count the number of the braces so as to catch compelete text area.
+    e.g.
+    \caption{blablablablabla\texbf{blablabla}blablabla.}
     """
     pattern_compile = re.compile(pattern, flags)
     for res in pattern_compile.finditer(text):
-        mask[res.regs[1][0]:res.regs[1][1]] = TRANSFORM
+        brace_level = -1
+        p = begin = end = res.regs[0][0]
+        for _ in range(1024*16):
+            if text[p] == '}' and brace_level == 0: break
+            elif text[p] == '}': brace_level -= 1
+            elif text[p] == '{': brace_level += 1
+            p += 1
+        end = p+1
+        mask[begin:end] = PRESERVE
     return text, mask

-def split_worker_begin_end(text, mask, pattern, flags=0, limit_n_lines=42):
+def reverse_forbidden_text_careful_brace(text, mask, pattern, flags=0, forbid_wrapper=True):
+    """
+    Move area out of preserve area (make text editable for GPT)
+    count the number of the braces so as to catch compelete text area.
+    e.g.
+    \caption{blablablablabla\texbf{blablabla}blablabla.}
+    """
+    pattern_compile = re.compile(pattern, flags)
+    for res in pattern_compile.finditer(text):
+        brace_level = 0
+        p = begin = end = res.regs[1][0]
+        for _ in range(1024*16):
+            if text[p] == '}' and brace_level == 0: break
+            elif text[p] == '}': brace_level -= 1
+            elif text[p] == '{': brace_level += 1
+            p += 1
+        end = p
+        mask[begin:end] = TRANSFORM
+        if forbid_wrapper:
+            mask[res.regs[0][0]:begin] = PRESERVE
+            mask[end:res.regs[0][1]] = PRESERVE
+    return text, mask
+
+def set_forbidden_text_begin_end(text, mask, pattern, flags=0, limit_n_lines=42):
     """
     Find all \begin{} ... \end{} text block that with less than limit_n_lines lines.
     Add it to preserve area
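The hunk above replaces the old linked-list segmentation with a byte-level mask: every character of the LaTeX source is flagged `PRESERVE` (GPT must not touch it) or `TRANSFORM` (GPT may rewrite it). A rough standalone sketch of that idea, copying the diff's `set_forbidden_text` shape onto a toy input (the sample `text` is ours):

```python
import re
import numpy as np

PRESERVE = 0   # byte belongs to a region GPT must not touch
TRANSFORM = 1  # byte may be rewritten/translated by GPT

def set_forbidden_text(text, mask, pattern, flags=0):
    # mark every regex match as PRESERVE, mirroring the helper in the diff
    if isinstance(pattern, list): pattern = '|'.join(pattern)
    for res in re.compile(pattern, flags).finditer(text):
        mask[res.span()[0]:res.span()[1]] = PRESERVE
    return text, mask

text = r"Intro text \begin{equation}x=1\end{equation} outro text"
mask = np.zeros(len(text), dtype=np.uint8) + TRANSFORM
text, mask = set_forbidden_text(text, mask, r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL)

# the equation block is now frozen; surrounding prose stays editable
frozen = ''.join(c for c, m in zip(text, mask) if m == PRESERVE)
print(frozen)  # → \begin{equation}x=1\end{equation}
```

Passing a list of patterns joins them with `|`, which is how the later `split_subprocess` hunk collapses many per-environment calls into one.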
@@ -85,29 +123,54 @@ Latex Merge File
 def 寻找Latex主文件(file_manifest, mode):
     """
     在多Tex文档中,寻找主文件,必须包含documentclass,返回找到的第一个。
-    P.S. 但愿没人把latex模板放在里面传进来
+    P.S. 但愿没人把latex模板放在里面传进来 (6.25 加入判定latex模板的代码)
     """
+    canidates = []
     for texf in file_manifest:
         if os.path.basename(texf).startswith('merge'):
             continue
         with open(texf, 'r', encoding='utf8') as f:
             file_content = f.read()
         if r'\documentclass' in file_content:
-            return texf
+            canidates.append(texf)
         else:
             continue
-    raise RuntimeError('无法找到一个主Tex文件(包含documentclass关键字)')
+
+    if len(canidates) == 0:
+        raise RuntimeError('无法找到一个主Tex文件(包含documentclass关键字)')
+    elif len(canidates) == 1:
+        return canidates[0]
+    else: # if len(canidates) >= 2 通过一些Latex模板中常见(但通常不会出现在正文)的单词,对不同latex源文件扣分,取评分最高者返回
+        canidates_score = []
+        # 给出一些判定模板文档的词作为扣分项
+        unexpected_words = ['\LaTeX', 'manuscript', 'Guidelines', 'font', 'citations', 'rejected', 'blind review', 'reviewers']
+        expected_words = ['\input', '\ref', '\cite']
+        for texf in canidates:
+            canidates_score.append(0)
+            with open(texf, 'r', encoding='utf8') as f:
+                file_content = f.read()
+            for uw in unexpected_words:
+                if uw in file_content:
+                    canidates_score[-1] -= 1
+            for uw in expected_words:
+                if uw in file_content:
+                    canidates_score[-1] += 1
+        select = np.argmax(canidates_score) # 取评分最高者返回
+        return canidates[select]

 def rm_comments(main_file):
     new_file_remove_comment_lines = []
     for l in main_file.splitlines():
         # 删除整行的空注释
-        if l.startswith("%") or (l.startswith(" ") and l.lstrip().startswith("%")):
+        if l.lstrip().startswith("%"):
             pass
         else:
             new_file_remove_comment_lines.append(l)
     main_file = '\n'.join(new_file_remove_comment_lines)
+    # main_file = re.sub(r"\\include{(.*?)}", r"\\input{\1}", main_file) # 将 \include 命令转换为 \input 命令
     main_file = re.sub(r'(?<!\\)%.*', '', main_file) # 使用正则表达式查找半行注释, 并替换为空字符串
     return main_file

 def merge_tex_files_(project_foler, main_file, mode):
     """
     Merge Tex project recrusively
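The hunk above stops returning the first file containing `\documentclass` and instead scores every candidate: words typical of LaTeX templates (e.g. "Guidelines", "blind review") subtract a point, commands typical of a real paper body (`\input`, `\ref`, `\cite`) add one, and the best-scoring file wins. A minimal sketch of that scoring on hypothetical in-memory contents (the `score` helper and sample strings are ours; the word lists are taken from the diff):

```python
# penalty words that suggest a LaTeX template rather than a real manuscript,
# and bonus words expected in a real paper body — lists copied from the diff
unexpected_words = ['\\LaTeX', 'manuscript', 'Guidelines', 'font', 'citations',
                    'rejected', 'blind review', 'reviewers']
expected_words = ['\\input', '\\ref', '\\cite']

def score(file_content):
    s = 0
    for uw in unexpected_words:
        if uw in file_content: s -= 1
    for ew in expected_words:
        if ew in file_content: s += 1
    return s

candidates = {
    'template.tex': r"\documentclass{article} Guidelines for the manuscript font and citations",
    'paper.tex':    r"\documentclass{article} \input{intro} see \ref{fig1} and \cite{smith}",
}
best = max(candidates, key=lambda k: score(candidates[k]))
print(best)  # → paper.tex
```

It is a heuristic, as the `(6.25 加入判定latex模板的代码)` note admits: a paper whose prose happens to contain "font" loses a point too.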
@@ -138,17 +201,24 @@ def merge_tex_files(project_foler, main_file, mode):
     main_file = rm_comments(main_file)

     if mode == 'translate_zh':
+        # find paper documentclass
         pattern = re.compile(r'\\documentclass.*\n')
         match = pattern.search(main_file)
+        assert match is not None, "Cannot find documentclass statement!"
         position = match.end()
         add_ctex = '\\usepackage{ctex}\n'
         add_url = '\\usepackage{url}\n' if '{url}' not in main_file else ''
         main_file = main_file[:position] + add_ctex + add_url + main_file[position:]
-        # 2 fontset=windows
+        # fontset=windows
         import platform
-        main_file = re.sub(r"\\documentclass\[(.*?)\]{(.*?)}", r"\\documentclass[\1,fontset=windows]{\2}",main_file)
-        main_file = re.sub(r"\\documentclass{(.*?)}", r"\\documentclass[fontset=windows]{\1}",main_file)
+        if platform.system() != 'Windows':
+            main_file = re.sub(r"\\documentclass\[(.*?)\]{(.*?)}", r"\\documentclass[\1,fontset=windows,UTF8]{\2}",main_file)
+            main_file = re.sub(r"\\documentclass{(.*?)}", r"\\documentclass[fontset=windows,UTF8]{\1}",main_file)
+        # find paper abstract
+        pattern_opt1 = re.compile(r'\\begin\{abstract\}.*\n')
+        pattern_opt2 = re.compile(r"\\abstract\{(.*?)\}", flags=re.DOTALL)
+        match_opt1 = pattern_opt1.search(main_file)
+        match_opt2 = pattern_opt2.search(main_file)
+        assert (match_opt1 is not None) or (match_opt2 is not None), "Cannot find paper abstract section!"
     return main_file

@@ -180,19 +250,46 @@ def fix_content(final_tex, node_string):
     final_tex = re.sub(r"\\\ ([a-z]{2,10})\{", r"\\\1{", string=final_tex)
     final_tex = re.sub(r"\\([a-z]{2,10})\{([^\}]*?)\}", mod_inbraket, string=final_tex)

+    if "Traceback" in final_tex and "[Local Message]" in final_tex:
+        final_tex = node_string # 出问题了,还原原文
     if node_string.count('\\begin') != final_tex.count('\\begin'):
         final_tex = node_string # 出问题了,还原原文
     if node_string.count('\_') > 0 and node_string.count('\_') > final_tex.count('\_'):
         # walk and replace any _ without \
         final_tex = re.sub(r"(?<!\\)_", "\\_", final_tex)
-    if node_string.count('{') != node_string.count('}'):
-        if final_tex.count('{') != node_string.count('{'):
-            final_tex = node_string # 出问题了,还原原文
-        if final_tex.count('}') != node_string.count('}'):
-            final_tex = node_string # 出问题了,还原原文
+
+    def compute_brace_level(string):
+        # this function count the number of { and }
+        brace_level = 0
+        for c in string:
+            if c == "{": brace_level += 1
+            elif c == "}": brace_level -= 1
+        return brace_level
+    def join_most(tex_t, tex_o):
+        # this function join translated string and original string when something goes wrong
+        p_t = 0
+        p_o = 0
+        def find_next(string, chars, begin):
+            p = begin
+            while p < len(string):
+                if string[p] in chars: return p, string[p]
+                p += 1
+            return None, None
+        while True:
+            res1, char = find_next(tex_o, ['{','}'], p_o)
+            if res1 is None: break
+            res2, char = find_next(tex_t, [char], p_t)
+            if res2 is None: break
+            p_o = res1 + 1
+            p_t = res2 + 1
+        return tex_t[:p_t] + tex_o[p_o:]
+
+    if compute_brace_level(final_tex) != compute_brace_level(node_string):
+        # 出问题了,还原部分原文,保证括号正确
+        final_tex = join_most(final_tex, node_string)
     return final_tex

-def split_subprocess(txt, project_folder, return_dict):
+def split_subprocess(txt, project_folder, return_dict, opts):
     """
     break down latex file to a linked list,
     each node use a preserve flag to indicate whether it should
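The `fix_content` hunk above drops the raw `{`/`}` count comparison in favor of comparing net brace depth between the GPT output and the original fragment, and partially restores the original when they disagree. A standalone sketch of just the depth check (the helper is copied from the diff; the sample strings are ours):

```python
def compute_brace_level(string):
    # net brace depth: +1 per '{', -1 per '}'
    brace_level = 0
    for c in string:
        if c == "{": brace_level += 1
        elif c == "}": brace_level -= 1
    return brace_level

original   = r"\textbf{bold} and \emph{italic}"
translated = r"\textbf{加粗} and \emph{斜体"  # GPT dropped a closing brace

# unequal net levels signal that the translation broke the brace structure,
# which would make the merged .tex uncompilable
print(compute_brace_level(original), compute_brace_level(translated))  # → 0 1
```

Net depth tolerates reordering of balanced groups, unlike the old exact-count check, while still catching a dropped or extra brace; `join_most` then splices the translated prefix onto the original suffix at the first divergent brace.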
@@ -202,44 +299,33 @@ def split_subprocess(txt, project_folder, return_dict, opts):
     mask = np.zeros(len(txt), dtype=np.uint8) + TRANSFORM

     # 吸收title与作者以上的部分
-    text, mask = split_worker(text, mask, r"(.*?)\\maketitle", re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, r"(.*?)\\maketitle", re.DOTALL)
-    # 删除iffalse注释
-    text, mask = split_worker(text, mask, r"\\iffalse(.*?)\\fi", re.DOTALL)
+    # 吸收iffalse注释
+    text, mask = set_forbidden_text(text, mask, r"\\iffalse(.*?)\\fi", re.DOTALL)
-    # 吸收在25行以内的begin-end组合
-    text, mask = split_worker_begin_end(text, mask, r"\\begin\{([a-z\*]*)\}(.*?)\\end\{\1\}", re.DOTALL, limit_n_lines=25)
+    # 吸收在42行以内的begin-end组合
+    text, mask = set_forbidden_text_begin_end(text, mask, r"\\begin\{([a-z\*]*)\}(.*?)\\end\{\1\}", re.DOTALL, limit_n_lines=42)
     # 吸收匿名公式
-    text, mask = split_worker(text, mask, r"\$\$(.*?)\$\$", re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [ r"\$\$(.*?)\$\$", r"\\\[.*?\\\]" ], re.DOTALL)
     # 吸收其他杂项
-    text, mask = split_worker(text, mask, r"\\section\{(.*?)\}")
-    text, mask = split_worker(text, mask, r"\\section\*\{(.*?)\}")
-    text, mask = split_worker(text, mask, r"\\subsection\{(.*?)\}")
-    text, mask = split_worker(text, mask, r"\\subsubsection\{(.*?)\}")
-    text, mask = split_worker(text, mask, r"\\bibliography\{(.*?)\}")
-    text, mask = split_worker(text, mask, r"\\bibliographystyle\{(.*?)\}")
-    text, mask = split_worker(text, mask, r"\\begin\{lstlisting\}(.*?)\\end\{lstlisting\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{wraptable\}(.*?)\\end\{wraptable\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{algorithm\}(.*?)\\end\{algorithm\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{wrapfigure\}(.*?)\\end\{wrapfigure\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{wrapfigure\*\}(.*?)\\end\{wrapfigure\*\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{figure\}(.*?)\\end\{figure\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{figure\*\}(.*?)\\end\{figure\*\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{multline\}(.*?)\\end\{multline\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{multline\*\}(.*?)\\end\{multline\*\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{table\}(.*?)\\end\{table\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{table\*\}(.*?)\\end\{table\*\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{minipage\}(.*?)\\end\{minipage\}", re.DOTALL)
-    text, mask = split_worker(text, mask, r"\\begin\{minipage\*\}(.*?)\\end\{minipage\*\}", re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [ r"\\section\{(.*?)\}", r"\\section\*\{(.*?)\}", r"\\subsection\{(.*?)\}", r"\\subsubsection\{(.*?)\}" ])
+    text, mask = set_forbidden_text(text, mask, [ r"\\bibliography\{(.*?)\}", r"\\bibliographystyle\{(.*?)\}" ])
+    text, mask = set_forbidden_text(text, mask, r"\\begin\{thebibliography\}.*?\\end\{thebibliography\}", re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, r"\\begin\{lstlisting\}(.*?)\\end\{lstlisting\}", re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, r"\\begin\{wraptable\}(.*?)\\end\{wraptable\}", re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, r"\\begin\{algorithm\}(.*?)\\end\{algorithm\}", re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\begin\{wrapfigure\}(.*?)\\end\{wrapfigure\}", r"\\begin\{wrapfigure\*\}(.*?)\\end\{wrapfigure\*\}"], re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\begin\{figure\}(.*?)\\end\{figure\}", r"\\begin\{figure\*\}(.*?)\\end\{figure\*\}"], re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\begin\{multline\}(.*?)\\end\{multline\}", r"\\begin\{multline\*\}(.*?)\\end\{multline\*\}"], re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\begin\{table\}(.*?)\\end\{table\}", r"\\begin\{table\*\}(.*?)\\end\{table\*\}"], re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\begin\{minipage\}(.*?)\\end\{minipage\}", r"\\begin\{minipage\*\}(.*?)\\end\{minipage\*\}"], re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\begin\{align\*\}(.*?)\\end\{align\*\}", r"\\begin\{align\}(.*?)\\end\{align\}"], re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\begin\{equation\}(.*?)\\end\{equation\}", r"\\begin\{equation\*\}(.*?)\\end\{equation\*\}"], re.DOTALL)
+    text, mask = set_forbidden_text(text, mask, [r"\\includepdf\[(.*?)\]\{(.*?)\}", r"\\clearpage", r"\\newpage", r"\\appendix", r"\\tableofcontents", r"\\include\{(.*?)\}"])
+    text, mask = set_forbidden_text(text, mask, [r"\\vspace\{(.*?)\}", r"\\hspace\{(.*?)\}", r"\\label\{(.*?)\}", r"\\begin\{(.*?)\}", r"\\end\{(.*?)\}", r"\\item "])
+    text, mask = set_forbidden_text_careful_brace(text, mask, r"\\hl\{(.*?)\}", re.DOTALL)
+    # reverse 操作必须放在最后
+    text, mask = reverse_forbidden_text_careful_brace(text, mask, r"\\caption\{(.*?)\}", re.DOTALL, forbid_wrapper=True)
+    text, mask = reverse_forbidden_text_careful_brace(text, mask, r"\\abstract\{(.*?)\}", re.DOTALL, forbid_wrapper=True)
|
||||||
text, mask = split_worker(text, mask, r"\\begin\{align\*\}(.*?)\\end\{align\*\}", re.DOTALL)
|
|
||||||
text, mask = split_worker(text, mask, r"\\begin\{align\}(.*?)\\end\{align\}", re.DOTALL)
|
|
||||||
text, mask = split_worker(text, mask, r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL)
|
|
||||||
text, mask = split_worker(text, mask, r"\\begin\{equation\*\}(.*?)\\end\{equation\*\}", re.DOTALL)
|
|
||||||
text, mask = split_worker(text, mask, r"\\item ")
|
|
||||||
text, mask = split_worker(text, mask, r"\\label\{(.*?)\}")
|
|
||||||
text, mask = split_worker(text, mask, r"\\begin\{(.*?)\}")
|
|
||||||
text, mask = split_worker(text, mask, r"\\vspace\{(.*?)\}")
|
|
||||||
text, mask = split_worker(text, mask, r"\\hspace\{(.*?)\}")
|
|
||||||
text, mask = split_worker(text, mask, r"\\end\{(.*?)\}")
|
|
||||||
# text, mask = split_worker_reverse_caption(text, mask, r"\\caption\{(.*?)\}", re.DOTALL)
|
|
||||||
root = convert_to_linklist(text, mask)
|
root = convert_to_linklist(text, mask)
|
||||||
|
|
||||||
# 修复括号
|
# 修复括号
|
||||||
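The replaced `split_worker` calls and the new `set_forbidden_text` calls both do the same underlying job: run a regex over the LaTeX source and flag the matched spans so they are never sent to the language model. A minimal sketch of that masking idea (`PRESERVE`/`TRANSFORM` values and the list-of-patterns handling are illustrative simplifications, not the project's exact implementation):

```python
import re

PRESERVE = 0   # this region must not be handed to the LLM
TRANSFORM = 1  # this region may be translated / proofread

def set_forbidden_text(text, mask, pattern, flags=0):
    """Simplified sketch: mark every regex match in `text` as PRESERVE in `mask`."""
    if isinstance(pattern, list):
        pattern = '|'.join(pattern)  # several patterns can be folded into one alternation
    for res in re.finditer(pattern, text, flags):
        a, b = res.span()
        mask[a:b] = [PRESERVE] * (b - a)
    return text, mask

text = r"hello \begin{equation}x=1\end{equation} world"
mask = [TRANSFORM] * len(text)
text, mask = set_forbidden_text(text, mask, r"\\begin\{equation\}(.*?)\\end\{equation\}", re.DOTALL)
```

After the call, the characters inside the `equation` environment carry `PRESERVE` while the surrounding prose stays `TRANSFORM`, which is exactly the split that `convert_to_linklist` then turns into nodes.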
@@ -313,7 +399,7 @@ def split_subprocess(txt, project_folder, return_dict):
 prev_node = node
 node = node.next
 if node is None: break
-
+# 输出html调试文件,用红色标注处保留区(PRESERVE),用黑色标注转换区(TRANSFORM)
 with open(pj(project_folder, 'debug_log.html'), 'w', encoding='utf8') as f:
 segment_parts_for_gpt = []
 nodes = []
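The `prev_node` / `node` walk in this hunk iterates the linked list built by `convert_to_linklist`. A plausible minimal sketch of such a traversal, here fusing neighbouring nodes that share a preserve flag (`Node` and `merge_adjacent` are illustrative stand-ins, not the project's classes):

```python
class Node:
    def __init__(self, string, preserve):
        self.string = string      # a fragment of the LaTeX source
        self.preserve = preserve  # True: keep verbatim; False: send to the LLM
        self.next = None

def merge_adjacent(root):
    # walk the list; fuse neighbours whose preserve flags agree
    node = root
    while True:
        if node.next and node.preserve == node.next.preserve:
            node.string += node.next.string
            node.next = node.next.next
            continue
        prev_node = node
        node = node.next
        if node is None: break
    return root

a = Node("hel", True); b = Node("lo ", True); c = Node("world", False)
a.next = b; b.next = c
root = merge_adjacent(a)
```

With the three nodes above, the first two fragments collapse into one `preserve` node and the third stays separate.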
@@ -344,8 +430,8 @@ class LatexPaperSplit():
 """
 def __init__(self) -> None:
 self.nodes = None
-self.msg = "{\\scriptsize\\textbf{警告:该PDF由GPT-Academic开源项目调用大语言模型+Latex翻译插件一键生成," + \
-"版权归原文作者所有。翻译内容可靠性无任何保障,请仔细鉴别并以原文为准。" + \
+self.msg = "*{\\scriptsize\\textbf{警告:该PDF由GPT-Academic开源项目调用大语言模型+Latex翻译插件一键生成," + \
+"版权归原文作者所有。翻译内容可靠性无保障,请仔细鉴别并以原文为准。" + \
 "项目Github地址 \\url{https://github.com/binary-husky/gpt_academic/}。"
 # 请您不要删除或修改这行警告,除非您是论文的原作者(如果您是论文原作者,欢迎加REAME中的QQ联系开发者)
 self.msg_declare = "为了防止大语言模型的意外谬误产生扩散影响,禁止移除或修改此警告。}}\\\\"
@@ -365,11 +451,18 @@ class LatexPaperSplit():
 if mode == 'translate_zh':
 pattern = re.compile(r'\\begin\{abstract\}.*\n')
 match = pattern.search(result_string)
-position = match.end()
+if not match:
+    # match \abstract{xxxx}
+    pattern_compile = re.compile(r"\\abstract\{(.*?)\}", flags=re.DOTALL)
+    match = pattern_compile.search(result_string)
+    position = match.regs[1][0]
+else:
+    # match \begin{abstract}xxxx\end{abstract}
+    position = match.end()
 result_string = result_string[:position] + self.msg + msg + self.msg_declare + result_string[position:]
 return result_string

-def split(self, txt, project_folder):
+def split(self, txt, project_folder, opts):
 """
 break down latex file to a linked list,
 each node use a preserve flag to indicate whether it should
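The fallback added in this hunk locates the insertion point for the warning banner: first try the `\begin{abstract}` environment, and if the paper instead uses the `\abstract{...}` command, insert just inside its brace group (`match.regs[1][0]` is the start offset of capture group 1). A self-contained sketch of that logic (the function name is illustrative):

```python
import re

def find_warning_insert_position(tex):
    """Sketch of the fallback above: try \\begin{abstract}, else \\abstract{...}."""
    match = re.compile(r'\\begin\{abstract\}.*\n').search(tex)
    if not match:
        # match \abstract{xxxx}: insert just inside the brace group
        match = re.compile(r"\\abstract\{(.*?)\}", flags=re.DOTALL).search(tex)
        return match.regs[1][0]
    # match \begin{abstract}xxxx\end{abstract}: insert right after the opener line
    return match.end()

pos1 = find_warning_insert_position("\\begin{abstract}\nSome text")
pos2 = find_warning_insert_position("\\abstract{Some text}")
```

`pos1` lands right after the `\begin{abstract}` line; `pos2` lands on the first character inside `\abstract{...}`.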
@@ -381,9 +474,10 @@ class LatexPaperSplit():
 return_dict = manager.dict()
 p = multiprocessing.Process(
 target=split_subprocess,
-args=(txt, project_folder, return_dict))
+args=(txt, project_folder, return_dict, opts))
 p.start()
 p.join()
+p.close()
 self.nodes = return_dict['nodes']
 self.sp = return_dict['segment_parts_for_gpt']
 return self.sp
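`LatexPaperSplit.split` hands the slow parsing to a child process and collects the result through a managed dict, so a crash or hang in the parser cannot take down the UI process. A minimal sketch of the same pattern (`heavy_parse` is a stand-in for `split_subprocess`, not the project's code; assumes a POSIX fork start method):

```python
import multiprocessing

def heavy_parse(txt, return_dict):
    # stands in for split_subprocess: crunch in a child, report via the shared dict
    return_dict['segment_parts_for_gpt'] = txt.split('.')

def split_in_subprocess(txt):
    manager = multiprocessing.Manager()
    return_dict = manager.dict()
    p = multiprocessing.Process(target=heavy_parse, args=(txt, return_dict))
    p.start()
    p.join()
    p.close()  # release the process handle, as the added line above does
    return return_dict['segment_parts_for_gpt']
```

The added `p.close()` simply releases the finished `Process` object's resources after `join()`.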
@@ -438,9 +532,35 @@ class LatexPaperFileGroup():
 f.write(res)
 return manifest

+def write_html(sp_file_contents, sp_file_result, chatbot, project_folder):
+    # write html
+    try:
+        import shutil
+        from .crazy_utils import construct_html
+        from toolbox import gen_time_str
+        ch = construct_html()
+        orig = ""
+        trans = ""
+        final = []
+        for c,r in zip(sp_file_contents, sp_file_result):
+            final.append(c)
+            final.append(r)
+        for i, k in enumerate(final):
+            if i%2==0:
+                orig = k
+            if i%2==1:
+                trans = k
+                ch.add_row(a=orig, b=trans)
+        create_report_file_name = f"{gen_time_str()}.trans.html"
+        ch.save_file(create_report_file_name)
+        shutil.copyfile(pj('./gpt_log/', create_report_file_name), pj(project_folder, create_report_file_name))
+        promote_file_to_downloadzone(file=f'./gpt_log/{create_report_file_name}', chatbot=chatbot)
+    except:
+        from toolbox import trimmed_format_exc
+        print('writing html result failed:', trimmed_format_exc())
+
-def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None):
+def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None, opts=[]):
 import time, os, re
 from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
 from .latex_utils import LatexPaperFileGroup, merge_tex_files, LatexPaperSplit, 寻找Latex主文件
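The even/odd walk inside `write_html` interleaves originals and translations and then pairs them back up into table rows. Its net effect can be isolated as a tiny pure function (names illustrative; this reduces the flatten-then-walk to what it computes):

```python
def pair_rows(sp_file_contents, sp_file_result):
    # the flatten-then-even/odd walk of write_html, reduced to its effect
    final = []
    for c, r in zip(sp_file_contents, sp_file_result):
        final.append(c)
        final.append(r)
    rows = []
    for i, k in enumerate(final):
        if i % 2 == 0:
            orig = k
        if i % 2 == 1:
            rows.append((orig, k))
    return rows
```

Each row ends up as `(original_fragment, translated_fragment)`, which is what `ch.add_row(a=..., b=...)` receives.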
@@ -469,8 +589,10 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
 f.write(merged_content)

 # <-------- 精细切分latex文件 ---------->
+chatbot.append((f"Latex文件融合完成", f'[Local Message] 正在精细切分latex文件,这需要一段时间计算,文档越长耗时越长,请耐心等待。'))
+yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
 lps = LatexPaperSplit()
-res = lps.split(merged_content, project_folder) # 消耗时间的函数
+res = lps.split(merged_content, project_folder, opts) # 消耗时间的函数

 # <-------- 拆分过长的latex片段 ---------->
 pfg = LatexPaperFileGroup()
@@ -513,6 +635,7 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
 pfg.get_token_num = None
 objdump(pfg, file=pj(project_folder,'temp.pkl'))

+write_html(pfg.sp_file_contents, pfg.sp_file_result, chatbot=chatbot, project_folder=project_folder)

 # <-------- 写出文件 ---------->
 msg = f"当前大语言模型: {llm_kwargs['llm_model']},当前语言模型温度设定: {llm_kwargs['temperature']}。"
@@ -562,17 +685,18 @@ def compile_latex_with_timeout(command, timeout=60):
 return False
 return True

-def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_folder_original, work_folder_modified, work_folder):
+def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_folder_original, work_folder_modified, work_folder, mode='default'):
 import os, time
 current_dir = os.getcwd()
 n_fix = 1
 max_try = 32
-chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder},如果程序停顿5分钟以上,则大概率是卡死在Latex里面了。不幸卡死时请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history)
+chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder},如果程序停顿5分钟以上,请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history)
 chatbot.append([f"正在编译PDF文档", '...']); yield from update_ui(chatbot=chatbot, history=history); time.sleep(1); chatbot[-1] = list(chatbot[-1]) # 刷新界面
 yield from update_ui_lastest_msg('编译已经开始...', chatbot, history) # 刷新Gradio前端界面

 while True:
 import os

 # https://stackoverflow.com/questions/738755/dont-make-me-manually-abort-a-latex-compile-when-theres-an-error
 yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译原始PDF ...', chatbot, history) # 刷新Gradio前端界面
 os.chdir(work_folder_original); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex'); os.chdir(current_dir)
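`compile_latex_with_timeout` exists because a broken `.tex` file can stall `pdflatex` indefinitely, which is also what the retry loop's "5 minutes" warning is about. A plausible sketch of such a helper, built only on the standard `subprocess` module (the real helper may differ in details; timeout here is the only failure it reports):

```python
import subprocess

def compile_latex_with_timeout(command, timeout=60):
    """Sketch: run a shell command, kill it if it exceeds `timeout` seconds."""
    process = subprocess.Popen(command, shell=True,
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        process.communicate()  # reap the killed child
        return False
    return True
```

A command that finishes in time returns `True`; one that hangs past the deadline is killed and reported as `False`, letting the caller move on to the next repair attempt.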
@@ -594,15 +718,16 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
 os.chdir(work_folder_original); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex'); os.chdir(current_dir)
 os.chdir(work_folder_modified); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex'); os.chdir(current_dir)

-yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 使用latexdiff生成论文转化前后对比 ...', chatbot, history) # 刷新Gradio前端界面
-print( f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
-ok = compile_latex_with_timeout(f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
+if mode!='translate_zh':
+    yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 使用latexdiff生成论文转化前后对比 ...', chatbot, history) # 刷新Gradio前端界面
+    print( f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
+    ok = compile_latex_with_timeout(f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')

 yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 正在编译对比PDF ...', chatbot, history) # 刷新Gradio前端界面
 os.chdir(work_folder); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex'); os.chdir(current_dir)
 os.chdir(work_folder); ok = compile_latex_with_timeout(f'bibtex merge_diff.aux'); os.chdir(current_dir)
 os.chdir(work_folder); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex'); os.chdir(current_dir)
 os.chdir(work_folder); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex'); os.chdir(current_dir)

 # <--------------------->
 os.chdir(current_dir)
@@ -617,13 +742,15 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
 results_ += f"对比PDF编译是否成功: {diff_pdf_success};"
 yield from update_ui_lastest_msg(f'第{n_fix}编译结束:<br/>{results_}...', chatbot, history) # 刷新Gradio前端界面

+if diff_pdf_success:
+    result_pdf = pj(work_folder_modified, f'merge_diff.pdf') # get pdf path
+    promote_file_to_downloadzone(result_pdf, rename_file=None, chatbot=chatbot) # promote file to web UI
 if modified_pdf_success:
 yield from update_ui_lastest_msg(f'转化PDF编译已经成功, 即将退出 ...', chatbot, history) # 刷新Gradio前端界面
-os.chdir(current_dir)
-result_pdf = pj(work_folder_modified, f'{main_file_modified}.pdf')
+result_pdf = pj(work_folder_modified, f'{main_file_modified}.pdf') # get pdf path
 if os.path.exists(pj(work_folder, '..', 'translation')):
 shutil.copyfile(result_pdf, pj(work_folder, '..', 'translation', 'translate_zh.pdf'))
-promote_file_to_downloadzone(result_pdf)
+promote_file_to_downloadzone(result_pdf, rename_file=None, chatbot=chatbot) # promote file to web UI
 return True # 成功啦
 else:
 if n_fix>=max_try: break
@@ -1,4 +1,4 @@
-from toolbox import CatchException, update_ui
+from toolbox import CatchException, update_ui, promote_file_to_downloadzone
 from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
 import re

@@ -29,9 +29,8 @@ def write_chat_to_file(chatbot, history=None, file_name=None):
 for h in history:
 f.write("\n>>>" + h)
 f.write('</code>')
-res = '对话历史写入:' + os.path.abspath(f'./gpt_log/{file_name}')
-print(res)
-return res
+promote_file_to_downloadzone(f'./gpt_log/{file_name}', rename_file=file_name, chatbot=chatbot)
+return '对话历史写入:' + os.path.abspath(f'./gpt_log/{file_name}')

 def gen_file_preview(file_name):
 try:
@@ -8,7 +8,7 @@ def inspect_dependency(chatbot, history):
 import manim
 return True
 except:
-chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manimgl```"])
+chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manim manimgl```"])
 yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
 return False
@@ -13,7 +13,9 @@ def 解析PDF(file_name, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
 # 递归地切割PDF文件,每一块(尽量是完整的一个section,比如introduction,experiment等,必要时再进行切割)
 # 的长度必须小于 2500 个 Token
 file_content, page_one = read_and_clean_pdf_text(file_name) # (尝试)按照章节切割PDF
+file_content = file_content.encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars
+page_one = str(page_one).encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars

 TOKEN_LIMIT_PER_FRAGMENT = 2500

 from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
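The two added lines use an encode/decode round-trip to silently discard characters (e.g. lone surrogates produced by broken PDF extraction) that cannot be represented in UTF-8. The trick in isolation (function name illustrative):

```python
def drop_non_utf8(s):
    # same trick as the two added lines above: discard anything that
    # cannot survive a UTF-8 encode ('ignore' skips unencodable chars)
    return s.encode('utf-8', 'ignore').decode()
```

Legitimate non-ASCII text survives untouched; only unencodable code points are dropped.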
crazy_functions/虚空终端.py (new file, +131 lines)
@@ -0,0 +1,131 @@
+from toolbox import CatchException, update_ui, gen_time_str
+from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
+from .crazy_utils import input_clipping
+
+prompt = """
+I have to achieve some functionalities by calling one of the functions below.
+Your job is to find the correct funtion to use to satisfy my requirement,
+and then write python code to call this function with correct parameters.
+
+These are functions you are allowed to choose from:
+1.
+功能描述: 总结音视频内容
+调用函数: ConcludeAudioContent(txt, llm_kwargs)
+参数说明:
+    txt: 音频文件的路径
+    llm_kwargs: 模型参数, 永远给定None
+2.
+功能描述: 将每次对话记录写入Markdown格式的文件中
+调用函数: WriteMarkdown()
+3.
+功能描述: 将指定目录下的PDF文件从英文翻译成中文
+调用函数: BatchTranslatePDFDocuments_MultiThreaded(txt, llm_kwargs)
+参数说明:
+    txt: PDF文件所在的路径
+    llm_kwargs: 模型参数, 永远给定None
+4.
+功能描述: 根据文本使用GPT模型生成相应的图像
+调用函数: ImageGeneration(txt, llm_kwargs)
+参数说明:
+    txt: 图像生成所用到的提示文本
+    llm_kwargs: 模型参数, 永远给定None
+5.
+功能描述: 对输入的word文档进行摘要生成
+调用函数: SummarizingWordDocuments(input_path, output_path)
+参数说明:
+    input_path: 待处理的word文档路径
+    output_path: 摘要生成后的文档路径
+
+You should always anwser with following format:
+----------------
+Code:
+```
+class AutoAcademic(object):
+    def __init__(self):
+        self.selected_function = "FILL_CORRECT_FUNCTION_HERE" # e.g., "GenerateImage"
+        self.txt = "FILL_MAIN_PARAMETER_HERE" # e.g., "荷叶上的蜻蜓"
+        self.llm_kwargs = None
+```
+Explanation:
+只有GenerateImage和生成图像相关, 因此选择GenerateImage函数。
+----------------
+
+Now, this is my requirement:
+
+"""
+def get_fn_lib():
+    return {
+        "BatchTranslatePDFDocuments_MultiThreaded": ("crazy_functions.批量翻译PDF文档_多线程", "批量翻译PDF文档"),
+        "SummarizingWordDocuments": ("crazy_functions.总结word文档", "总结word文档"),
+        "ImageGeneration": ("crazy_functions.图片生成", "图片生成"),
+        "TranslateMarkdownFromEnglishToChinese": ("crazy_functions.批量Markdown翻译", "Markdown中译英"),
+        "SummaryAudioVideo": ("crazy_functions.总结音视频", "总结音视频"),
+    }
+
+def inspect_dependency(chatbot, history):
+    return True
+
+def eval_code(code, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+    import subprocess, sys, os, shutil, importlib
+
+    with open('gpt_log/void_terminal_runtime.py', 'w', encoding='utf8') as f:
+        f.write(code)
+
+    try:
+        AutoAcademic = getattr(importlib.import_module('gpt_log.void_terminal_runtime', 'AutoAcademic'), 'AutoAcademic')
+        # importlib.reload(AutoAcademic)
+        auto_dict = AutoAcademic()
+        selected_function = auto_dict.selected_function
+        txt = auto_dict.txt
+        fp, fn = get_fn_lib()[selected_function]
+        fn_plugin = getattr(importlib.import_module(fp, fn), fn)
+        yield from fn_plugin(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port)
+    except:
+        from toolbox import trimmed_format_exc
+        chatbot.append(["执行错误", f"\n```\n{trimmed_format_exc()}\n```\n"])
+        yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+
+def get_code_block(reply):
+    import re
+    pattern = r"```([\s\S]*?)```" # regex pattern to match code blocks
+    matches = re.findall(pattern, reply) # find all code blocks in text
+    if len(matches) != 1:
+        raise RuntimeError("GPT is not generating proper code.")
+    return matches[0].strip('python') # code block
+
+@CatchException
+def 终端(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+    """
+    txt 输入栏用户输入的文本, 例如需要翻译的一段话, 再例如一个包含了待处理文件的路径
+    llm_kwargs gpt模型参数, 如温度和top_p等, 一般原样传递下去就行
+    plugin_kwargs 插件模型的参数, 暂时没有用武之地
+    chatbot 聊天显示框的句柄, 用于显示给用户
+    history 聊天历史, 前情提要
+    system_prompt 给gpt的静默提醒
+    web_port 当前软件运行的端口号
+    """
+    # 清空历史, 以免输入溢出
+    history = []
+
+    # 基本信息:功能、贡献者
+    chatbot.append(["函数插件功能?", "根据自然语言执行插件命令, 作者: binary-husky, 插件初始化中 ..."])
+    yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
+
+    # # 尝试导入依赖, 如果缺少依赖, 则给出安装建议
+    # dep_ok = yield from inspect_dependency(chatbot=chatbot, history=history) # 刷新界面
+    # if not dep_ok: return
+
+    # 输入
+    i_say = prompt + txt
+    # 开始
+    gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
+        inputs=i_say, inputs_show_user=txt,
+        llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
+        sys_prompt=""
+    )
+
+    # 将代码转为动画
+    code = get_code_block(gpt_say)
+    yield from eval_code(code, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port)
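One detail in the new file's `get_code_block` is worth noting: `matches[0].strip('python')` strips *characters* from the set `p, y, t, h, o, n` off both ends, so code that happens to start or end with one of those letters would be eaten. A sketch of the same extractor that drops only a leading language tag (function and message names are illustrative, not the plugin's):

```python
import re

def extract_single_code_block(reply):
    # like get_code_block above, but drop only a leading "python" language tag
    # instead of str.strip('python'), which strips those *characters* from both ends
    matches = re.findall(r"```([\s\S]*?)```", reply)
    if len(matches) != 1:
        raise RuntimeError("reply must contain exactly one fenced code block")
    code = matches[0]
    if code.startswith('python'):
        code = code[len('python'):]
    return code
```

Requiring exactly one fenced block, as the plugin does, is a cheap guard against the model emitting prose plus several snippets.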
@@ -103,3 +103,30 @@ services:
 echo '[jittorllms] 正在从github拉取最新代码...' &&
 git --git-dir=request_llm/jittorllms/.git --work-tree=request_llm/jittorllms pull --force &&
 python3 -u main.py"
+
+
+## ===================================================
+## 【方案四】 chatgpt + Latex
+## ===================================================
+version: '3'
+services:
+  gpt_academic_with_latex:
+    image: ghcr.io/binary-husky/gpt_academic_with_latex:master
+    environment:
+      # 请查阅 `config.py` 以查看所有的配置信息
+      API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
+      USE_PROXY: ' True '
+      proxies: ' { "http": "socks5h://localhost:10880", "https": "socks5h://localhost:10880", } '
+      LLM_MODEL: ' gpt-3.5-turbo '
+      AVAIL_LLM_MODELS: ' ["gpt-3.5-turbo", "gpt-4"] '
+      LOCAL_MODEL_DEVICE: ' cuda '
+      DEFAULT_WORKER_NUM: ' 10 '
+      WEB_PORT: ' 12303 '
+
+    # 与宿主的网络融合
+    network_mode: "host"
+
+    # 不使用代理网络拉取最新代码
+    command: >
+      bash -c "python3 -u main.py"
docs/GithubAction+NoLocal+Latex (new file, +25 lines)
@@ -0,0 +1,25 @@
+# 此Dockerfile适用于“无本地模型”的环境构建,如果需要使用chatglm等本地模型,请参考 docs/Dockerfile+ChatGLM
+# - 1 修改 `config.py`
+# - 2 构建 docker build -t gpt-academic-nolocal-latex -f docs/Dockerfile+NoLocal+Latex .
+# - 3 运行 docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex
+
+FROM fuqingxu/python311_texlive_ctex:latest
+
+# 指定路径
+WORKDIR /gpt
+
+RUN pip3 install gradio openai numpy arxiv rich
+RUN pip3 install colorama Markdown pygments pymupdf
+
+# 装载项目文件
+COPY . .
+
+# 安装依赖
+RUN pip3 install -r requirements.txt
+
+# 可选步骤,用于预热模块
+RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
+
+# 启动
+CMD ["python3", "-u", "main.py"]
@@ -58,6 +58,8 @@
 "连接网络回答问题": "ConnectToNetworkToAnswerQuestions",
 "联网的ChatGPT": "ChatGPTConnectedToNetwork",
 "解析任意code项目": "ParseAnyCodeProject",
+"读取知识库作答": "ReadKnowledgeArchiveAnswerQuestions",
+"知识库问答": "UpdateKnowledgeArchive",
 "同时问询_指定模型": "InquireSimultaneously_SpecifiedModel",
 "图片生成": "ImageGeneration",
 "test_解析ipynb文件": "Test_ParseIpynbFile",
docs/use_azure.md (new file, +152 lines)
@@ -0,0 +1,152 @@
|
# 通过微软Azure云服务申请 Openai API
|
||||||
|
|
||||||
|
由于Openai和微软的关系,现在是可以通过微软的Azure云计算服务直接访问openai的api,免去了注册和网络的问题。
|
||||||
|
|
||||||
|
快速入门的官方文档的链接是:[快速入门 - 开始通过 Azure OpenAI 服务使用 ChatGPT 和 GPT-4 - Azure OpenAI Service | Microsoft Learn](https://learn.microsoft.com/zh-cn/azure/cognitive-services/openai/chatgpt-quickstart?pivots=programming-language-python)
|
||||||
|
|
||||||
|
# 申请API
|
||||||
|
|
||||||
|
按文档中的“先决条件”的介绍,出了编程的环境以外,还需要以下三个条件:
|
||||||
|
|
||||||
|
1. Azure账号并创建订阅
|
||||||
|
|
||||||
|
2. 为订阅添加Azure OpenAI 服务
|
||||||
|
|
||||||
|
3. 部署模型
|
||||||
|
|
||||||
|
## Azure账号并创建订阅
|
||||||
|
|
||||||
|
### Azure账号
|
||||||
|
|
||||||
|
创建Azure的账号时最好是有微软的账号,这样似乎更容易获得免费额度(第一个月的200美元,实测了一下,如果用一个刚注册的微软账号登录Azure的话,并没有这一个月的免费额度)。
|
||||||
|
|
||||||
|
创建Azure账号的网址是:[立即创建 Azure 免费帐户 | Microsoft Azure](https://azure.microsoft.com/zh-cn/free/)
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
打开网页后,点击 “免费开始使用” 会跳转到登录或注册页面,如果有微软的账户,直接登录即可,如果没有微软账户,那就需要到微软的网页再另行注册一个。
|
||||||
|
|
||||||
|
注意,Azure的页面和政策时不时会变化,已实际最新显示的为准就好。
|
||||||
|
|
||||||
|
### 创建订阅
|
||||||
|
|
||||||
|
注册好Azure后便可进入主页:
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
首先需要在订阅里进行添加操作,点开后即可进入订阅的页面:
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
第一次进来应该是空的,点添加即可创建新的订阅(可以是“免费”或者“即付即用”的订阅),其中订阅ID是后面申请Azure OpenAI需要使用的。
|
||||||
|
|
||||||
|
## 为订阅添加Azure OpenAI服务
|
||||||
|
|
||||||
|
之后回到首页,点Azure OpenAI即可进入OpenAI服务的页面(如果不显示的话,则在首页上方的搜索栏里搜索“openai”即可)。
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
不过现在这个服务还不能用。在使用前,还需要在这个网址申请一下:
|
||||||
|
|
||||||
|
[Request Access to Azure OpenAI Service (microsoft.com)](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUOFA5Qk1UWDRBMjg0WFhPMkIzTzhKQ1dWNyQlQCN0PWcu)
|
||||||
|
|
||||||
|
这里有二十来个问题,按照要求和自己的实际情况填写即可。
|
||||||
|
|
||||||
|
其中需要注意的是
|
||||||
|
|
||||||
|
1. 千万记得填对"订阅ID"
|
||||||
|
|
||||||
|
2. 需要填一个公司邮箱(可以不是注册用的邮箱)和公司网址
|
||||||
|
|
||||||
|
之后,在回到上面那个页面,点创建,就会进入创建页面了:
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
需要填入“资源组”和“名称”,按照自己的需要填入即可。
|
||||||
|
|
||||||
|
完成后,在主页的“资源”里就可以看到刚才创建的“资源”了,点击进入后,就可以进行最后的部署了。
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
## 部署模型
|
||||||
|
|
||||||
|
进入资源页面后,在部署模型前,可以先点击“开发”,把密钥和终结点记下来。
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
之后,就可以去部署模型了,点击“部署”即可,会跳转到 Azure OpenAI Stuido 进行下面的操作:
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
进入 Azure OpenAi Studio 后,点击新建部署,会弹出如下对话框:
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
在这里选 gpt-35-turbo 或需要的模型并按需要填入“部署名”即可完成模型的部署。
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
这个部署名需要记下来。
|
||||||
|
|
||||||
|
到现在为止,申请操作就完成了,需要记下来的有下面几个东西:
|
||||||
|
|
||||||
|
● 密钥(1或2都可以)
|
||||||
|
|
||||||
|
● 终结点
|
||||||
|
|
||||||
|
● 部署名(不是模型名)
|
||||||
|
|
||||||
|
# 修改 config.py
|
||||||
|
|
||||||
|
```
|
||||||
|
AZURE_ENDPOINT = "填入终结点"
|
||||||
|
AZURE_API_KEY = "填入azure openai api的密钥"
|
||||||
|
AZURE_API_VERSION = "2023-05-15" # 默认使用 2023-05-15 版本,无需修改
|
||||||
|
AZURE_ENGINE = "填入部署名"
|
||||||
|
|
||||||
|
```
|
||||||
|
# Using the API

Next comes actually calling the API; again, the official documentation is a good reference: [Quickstart - Get started using ChatGPT and GPT-4 with Azure OpenAI Service | Microsoft Learn](https://learn.microsoft.com/zh-cn/azure/cognitive-services/openai/chatgpt-quickstart?pivots=programming-language-python)

It is quite similar to calling OpenAI's own API: you still install the openai library; what differs is how the call is configured:

```
import os
import openai

openai.api_type = "azure" # fixed value; no need to change
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT") # the "endpoint" recorded earlier
openai.api_version = "2023-05-15" # fixed value; no need to change
openai.api_key = os.getenv("AZURE_OPENAI_KEY") # "key 1" or "key 2" recorded earlier

response = openai.ChatCompletion.create(
    engine="gpt-35-turbo", # this is the deployment name, not the model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
        {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
        {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}
    ]
)

print(response)
print(response['choices'][0]['message']['content'])
```

Two things to note:

1. The `engine` argument takes the deployment name, not the model name.

2. The `response` obtained through the openai library differs from a `response` obtained by fetching a URL with the requests library: it needs no decoding, since it is already parsed JSON, so you can read it directly by key.

For further details, see the official API documentation.
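The second note above can be illustrated without touching the network. Below is a minimal sketch: the response dict is hand-built to mimic the shape of a ChatCompletion result (its contents are illustrative, not real API output), contrasting direct key access on an already-parsed response with the decode step a raw requests call would need.

```python
import json

# Hand-built stand-in for a parsed ChatCompletion response (illustrative only).
parsed_response = {
    "choices": [
        {"message": {"role": "assistant", "content": "Yes, other services support it too."}}
    ]
}

# With the openai library, the result is already a parsed structure:
content = parsed_response['choices'][0]['message']['content']

# With the requests library, you would first have to decode the raw body yourself:
raw_body = json.dumps(parsed_response)   # what an HTTP response body would look like
decoded = json.loads(raw_body)           # the extra decode step
assert decoded['choices'][0]['message']['content'] == content

print(content)
```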
# About cost

The Azure OpenAI API is not free (a free subscription is only valid for one month). Pricing is as follows:



For details, see: [Azure OpenAI Service - Pricing | Microsoft Azure](https://azure.microsoft.com/zh-cn/pricing/details/cognitive-services/openai-service/?cdn=disable)

So it is not the "free for a whole year" deal some posts online claim, but registration and network access are both somewhat simpler than using OpenAI's API directly.
## main.py (4 changed lines)

```diff
@@ -155,7 +155,7 @@ def main():
     for k in crazy_fns:
         if not crazy_fns[k].get("AsButton", True): continue
         click_handle = crazy_fns[k]["Button"].click(ArgsGeneralWrapper(crazy_fns[k]["Function"]), [*input_combo, gr.State(PORT)], output_combo)
-        click_handle.then(on_report_generated, [file_upload, chatbot], [file_upload, chatbot])
+        click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
         cancel_handles.append(click_handle)
     # 函数插件-下拉菜单与随变按钮的互动
     def on_dropdown_changed(k):
@@ -175,7 +175,7 @@ def main():
         if k in [r"打开插件列表", r"请先从插件列表中选择"]: return
         yield from ArgsGeneralWrapper(crazy_fns[k]["Function"])(*args, **kwargs)
     click_handle = switchy_bt.click(route,[switchy_bt, *input_combo, gr.State(PORT)], output_combo)
-    click_handle.then(on_report_generated, [file_upload, chatbot], [file_upload, chatbot])
+    click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
     cancel_handles.append(click_handle)
     # 终止按钮的回调函数注册
     stopBtn.click(fn=None, inputs=None, outputs=None, cancels=cancel_handles)
```
## request_llm/bridge_all.py

```diff
@@ -16,6 +16,9 @@ from toolbox import get_conf, trimmed_format_exc
 from .bridge_chatgpt import predict_no_ui_long_connection as chatgpt_noui
 from .bridge_chatgpt import predict as chatgpt_ui
 
+from .bridge_azure_test import predict_no_ui_long_connection as azure_noui
+from .bridge_azure_test import predict as azure_ui
+
 from .bridge_chatglm import predict_no_ui_long_connection as chatglm_noui
 from .bridge_chatglm import predict as chatglm_ui
 
@@ -83,6 +86,33 @@ model_info = {
         "tokenizer": tokenizer_gpt35,
         "token_cnt": get_token_num_gpt35,
     },
+
+    "gpt-3.5-turbo-16k": {
+        "fn_with_ui": chatgpt_ui,
+        "fn_without_ui": chatgpt_noui,
+        "endpoint": openai_endpoint,
+        "max_token": 1024*16,
+        "tokenizer": tokenizer_gpt35,
+        "token_cnt": get_token_num_gpt35,
+    },
+
+    "gpt-3.5-turbo-0613": {
+        "fn_with_ui": chatgpt_ui,
+        "fn_without_ui": chatgpt_noui,
+        "endpoint": openai_endpoint,
+        "max_token": 4096,
+        "tokenizer": tokenizer_gpt35,
+        "token_cnt": get_token_num_gpt35,
+    },
+
+    "gpt-3.5-turbo-16k-0613": {
+        "fn_with_ui": chatgpt_ui,
+        "fn_without_ui": chatgpt_noui,
+        "endpoint": openai_endpoint,
+        "max_token": 1024 * 16,
+        "tokenizer": tokenizer_gpt35,
+        "token_cnt": get_token_num_gpt35,
+    },
 
     "gpt-4": {
         "fn_with_ui": chatgpt_ui,
@@ -93,6 +123,16 @@ model_info = {
         "token_cnt": get_token_num_gpt4,
     },
+
+    # azure openai
+    "azure-gpt35":{
+        "fn_with_ui": azure_ui,
+        "fn_without_ui": azure_noui,
+        "endpoint": get_conf("AZURE_ENDPOINT"),
+        "max_token": 4096,
+        "tokenizer": tokenizer_gpt35,
+        "token_cnt": get_token_num_gpt35,
+    },
 
     # api_2d
     "api2d-gpt-3.5-turbo": {
         "fn_with_ui": chatgpt_ui,
```
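The model_info additions above follow a simple registry pattern: each model name maps to the UI/no-UI bridge callables and limits used to serve it, and callers dispatch by looking the model name up. A minimal stand-alone sketch of that dispatch idea (the function names and fields here are illustrative stand-ins, not the project's exact API):

```python
# Minimal registry-dispatch sketch (illustrative; not the project's exact API).
def azure_ui(prompt):        # stand-in for the Azure bridge function
    return f"[azure] {prompt}"

def chatgpt_ui(prompt):      # stand-in for the OpenAI bridge function
    return f"[openai] {prompt}"

# Each entry records which bridge serves the model and its token limit.
model_info = {
    "gpt-3.5-turbo-16k": {"fn_with_ui": chatgpt_ui, "max_token": 1024 * 16},
    "azure-gpt35":       {"fn_with_ui": azure_ui,   "max_token": 4096},
}

def predict(llm_model, prompt):
    # Look the model up once and route the call through its registered bridge.
    entry = model_info[llm_model]
    return entry["fn_with_ui"](prompt)

print(predict("azure-gpt35", "hello"))   # routed to the Azure stand-in
```

Adding a new backend then only requires registering one more dict entry, which is exactly what the diff above does for the Azure bridge.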
## request_llm/bridge_azure_test.py (new file, 241 lines)

```python
"""
    该文件中主要包含三个函数

    不具备多线程能力的函数:
    1. predict: 正常对话时使用,具备完备的交互功能,不可多线程

    具备多线程调用能力的函数
    2. predict_no_ui:高级实验性功能模块调用,不会实时显示在界面上,参数简单,可以多线程并行,方便实现复杂的功能逻辑
    3. predict_no_ui_long_connection:在实验过程中发现调用predict_no_ui处理长文档时,和openai的连接容易断掉,这个函数用stream的方式解决这个问题,同样支持多线程
"""

import logging
import traceback
import importlib
import openai
import time


# 读取config.py文件中关于AZURE OPENAI API的信息
from toolbox import get_conf, update_ui, clip_history, trimmed_format_exc
TIMEOUT_SECONDS, MAX_RETRY, AZURE_ENGINE, AZURE_ENDPOINT, AZURE_API_VERSION, AZURE_API_KEY = \
    get_conf('TIMEOUT_SECONDS', 'MAX_RETRY', "AZURE_ENGINE", "AZURE_ENDPOINT", "AZURE_API_VERSION", "AZURE_API_KEY")


def get_full_error(chunk, stream_response):
    """
        获取完整的从Openai返回的报错
    """
    while True:
        try:
            chunk += next(stream_response)
        except:
            break
    return chunk


def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt='', stream=True, additional_fn=None):
    """
        发送至azure openai api,流式获取输出。
        用于基础的对话功能。
        inputs 是本次问询的输入
        top_p, temperature是chatGPT的内部调优参数
        history 是之前的对话列表(注意无论是inputs还是history,内容太长了都会触发token数量溢出的错误)
        chatbot 为WebUI中显示的对话列表,修改它,然后yield出去,可以直接修改对话界面内容
        additional_fn代表点击的哪个按钮,按钮见functional.py
    """
    print(llm_kwargs["llm_model"])

    if additional_fn is not None:
        import core_functional
        importlib.reload(core_functional)    # 热更新prompt
        core_functional = core_functional.get_core_functions()
        if "PreProcess" in core_functional[additional_fn]: inputs = core_functional[additional_fn]["PreProcess"](inputs)  # 获取预处理函数(如果有的话)
        inputs = core_functional[additional_fn]["Prefix"] + inputs + core_functional[additional_fn]["Suffix"]

    raw_input = inputs
    logging.info(f'[raw_input] {raw_input}')
    chatbot.append((inputs, ""))
    yield from update_ui(chatbot=chatbot, history=history, msg="等待响应")  # 刷新界面

    payload = generate_azure_payload(inputs, llm_kwargs, history, system_prompt, stream)

    history.append(inputs); history.append("")

    retry = 0
    while True:
        try:
            openai.api_type = "azure"
            openai.api_version = AZURE_API_VERSION
            openai.api_base = AZURE_ENDPOINT
            openai.api_key = AZURE_API_KEY
            response = openai.ChatCompletion.create(timeout=TIMEOUT_SECONDS, **payload); break
        except:
            retry += 1
            chatbot[-1] = ((chatbot[-1][0], "获取response失败,重试中。。。"))
            retry_msg = f",正在重试 ({retry}/{MAX_RETRY}) ……" if MAX_RETRY > 0 else ""
            yield from update_ui(chatbot=chatbot, history=history, msg="请求超时"+retry_msg)  # 刷新界面
            if retry > MAX_RETRY: raise TimeoutError

    gpt_replying_buffer = ""
    is_head_of_the_stream = True
    if stream:
        stream_response = response
        while True:
            try:
                chunk = next(stream_response)
            except StopIteration:
                from toolbox import regular_txt_to_markdown; tb_str = '```\n' + trimmed_format_exc() + '```'
                chatbot[-1] = (chatbot[-1][0], f"[Local Message] 远程返回错误: \n\n{tb_str} \n\n{regular_txt_to_markdown(chunk)}")
                yield from update_ui(chatbot=chatbot, history=history, msg="远程返回错误:" + chunk)  # 刷新界面
                return

            if is_head_of_the_stream and (r'"object":"error"' not in chunk):
                # 数据流的第一帧不携带content
                is_head_of_the_stream = False; continue

            if chunk:
                # print(chunk)
                try:
                    if "delta" in chunk["choices"][0]:
                        if chunk["choices"][0]["finish_reason"] == "stop":
                            logging.info(f'[response] {gpt_replying_buffer}')
                            break
                        status_text = f"finish_reason: {chunk['choices'][0]['finish_reason']}"
                        gpt_replying_buffer = gpt_replying_buffer + chunk["choices"][0]["delta"]["content"]

                        history[-1] = gpt_replying_buffer
                        chatbot[-1] = (history[-2], history[-1])
                        yield from update_ui(chatbot=chatbot, history=history, msg=status_text)  # 刷新界面

                except Exception as e:
                    traceback.print_exc()
                    yield from update_ui(chatbot=chatbot, history=history, msg="Json解析不合常规")  # 刷新界面
                    chunk = get_full_error(chunk, stream_response)

                    error_msg = chunk
                    yield from update_ui(chatbot=chatbot, history=history, msg="Json异常" + error_msg)  # 刷新界面
                    return


def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="", observe_window=None, console_slience=False):
    """
        发送至AZURE OPENAI API,等待回复,一次性完成,不显示中间过程。但内部用stream的方法避免中途网线被掐。
        inputs:
            是本次问询的输入
        sys_prompt:
            系统静默prompt
        llm_kwargs:
            chatGPT的内部调优参数
        history:
            是之前的对话列表
        observe_window = None:
            用于负责跨越线程传递已经输出的部分,大部分时候仅仅为了fancy的视觉效果,留空即可。observe_window[0]:观测窗。observe_window[1]:看门狗
    """
    watch_dog_patience = 5  # 看门狗的耐心, 设置5秒即可
    payload = generate_azure_payload(inputs, llm_kwargs, history, system_prompt=sys_prompt, stream=True)
    retry = 0
    while True:

        try:
            openai.api_type = "azure"
            openai.api_version = AZURE_API_VERSION
            openai.api_base = AZURE_ENDPOINT
            openai.api_key = AZURE_API_KEY
            response = openai.ChatCompletion.create(timeout=TIMEOUT_SECONDS, **payload); break

        except:
            retry += 1
            traceback.print_exc()
            if retry > MAX_RETRY: raise TimeoutError
            if MAX_RETRY != 0: print(f'请求超时,正在重试 ({retry}/{MAX_RETRY}) ……')

    stream_response = response
    result = ''
    while True:
        try: chunk = next(stream_response)
        except StopIteration:
            break
        except:
            chunk = next(stream_response)  # 失败了,重试一次?再失败就没办法了。

        if len(chunk) == 0: continue
        if not chunk.startswith('data:'):
            error_msg = get_full_error(chunk, stream_response)
            if "reduce the length" in error_msg:
                raise ConnectionAbortedError("AZURE OPENAI API拒绝了请求:" + error_msg)
            else:
                raise RuntimeError("AZURE OPENAI API拒绝了请求:" + error_msg)
        if ('data: [DONE]' in chunk): break

        delta = chunk["delta"]
        if len(delta) == 0: break
        if "role" in delta: continue
        if "content" in delta:
            result += delta["content"]
            if not console_slience: print(delta["content"], end='')
            if observe_window is not None:
                # 观测窗,把已经获取的数据显示出去
                if len(observe_window) >= 1: observe_window[0] += delta["content"]
                # 看门狗,如果超过期限没有喂狗,则终止
                if len(observe_window) >= 2:
                    if (time.time()-observe_window[1]) > watch_dog_patience:
                        raise RuntimeError("用户取消了程序。")
        else: raise RuntimeError("意外Json结构:"+delta)
    if chunk['finish_reason'] == 'length':
        raise ConnectionAbortedError("正常结束,但显示Token不足,导致输出不完整,请削减单次输入的文本量。")
    return result


def generate_azure_payload(inputs, llm_kwargs, history, system_prompt, stream):
    """
        整合所有信息,选择LLM模型,生成 azure openai api请求,为发送请求做准备
    """

    conversation_cnt = len(history) // 2

    messages = [{"role": "system", "content": system_prompt}]
    if conversation_cnt:
        for index in range(0, 2*conversation_cnt, 2):
            what_i_have_asked = {}
            what_i_have_asked["role"] = "user"
            what_i_have_asked["content"] = history[index]
            what_gpt_answer = {}
            what_gpt_answer["role"] = "assistant"
            what_gpt_answer["content"] = history[index+1]
            if what_i_have_asked["content"] != "":
                if what_gpt_answer["content"] == "": continue
                messages.append(what_i_have_asked)
                messages.append(what_gpt_answer)
            else:
                messages[-1]['content'] = what_gpt_answer['content']

    what_i_ask_now = {}
    what_i_ask_now["role"] = "user"
    what_i_ask_now["content"] = inputs
    messages.append(what_i_ask_now)

    payload = {
        "model": llm_kwargs['llm_model'],
        "messages": messages,
        "temperature": llm_kwargs['temperature'],  # 1.0,
        "top_p": llm_kwargs['top_p'],  # 1.0,
        "n": 1,
        "stream": stream,
        "presence_penalty": 0,
        "frequency_penalty": 0,
        "engine": AZURE_ENGINE
    }
    try:
        print(f" {llm_kwargs['llm_model']} : {conversation_cnt} : {inputs[:100]} ..........")
    except:
        print('输入中可能存在乱码。')
    return payload
```
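The history-folding logic in generate_azure_payload above can be exercised in isolation. Below is a stand-alone sketch of just that part (a simplified, renamed helper, not the project's function): it folds an alternating [user, assistant, user, assistant, ...] history into chat messages, skipping question/answer pairs whose answer is empty, then appends the current question.

```python
def history_to_messages(history, system_prompt, inputs):
    # Fold alternating [user, assistant, ...] history into chat messages,
    # mirroring the pairing logic of generate_azure_payload (simplified sketch).
    messages = [{"role": "system", "content": system_prompt}]
    for i in range(0, len(history) // 2 * 2, 2):
        asked = {"role": "user", "content": history[i]}
        answered = {"role": "assistant", "content": history[i + 1]}
        if asked["content"] != "":
            if answered["content"] == "":
                continue                    # drop pairs with an empty answer
            messages.append(asked)
            messages.append(answered)
        else:
            # An empty question means the answer replaces the previous content.
            messages[-1]['content'] = answered['content']
    messages.append({"role": "user", "content": inputs})
    return messages

msgs = history_to_messages(["hi", "hello!", "dropped", ""], "You are helpful.", "next question")
print([m["role"] for m in msgs])   # ['system', 'user', 'assistant', 'user']
```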
## toolbox.py (48 changed lines)

```diff
@@ -6,6 +6,7 @@ import re
 import os
 from latex2mathml.converter import convert as tex2mathml
 from functools import wraps, lru_cache
+pj = os.path.join
 
 """
 ========================================================================
@@ -221,16 +222,21 @@ def text_divide_paragraph(text):
     """
     将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
     """
+    pre = '<div class="markdown-body">'
+    suf = '</div>'
+    if text.startswith(pre) and text.endswith(suf):
+        return text
+
     if '```' in text:
         # careful input
-        return text
+        return pre + text + suf
     else:
         # wtf input
         lines = text.split("\n")
         for i, line in enumerate(lines):
             lines[i] = lines[i].replace(" ", "&nbsp;")
         text = "</br>".join(lines)
-        return text
+        return pre + text + suf
 
 @lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
 def markdown_convertion(txt):
@@ -342,8 +348,11 @@ def format_io(self, y):
     if y is None or y == []:
         return []
     i_ask, gpt_reply = y[-1]
-    i_ask = text_divide_paragraph(i_ask) # 输入部分太自由,预处理一波
-    gpt_reply = close_up_code_segment_during_stream(gpt_reply) # 当代码输出半截的时候,试着补上后个```
+    # 输入部分太自由,预处理一波
+    if i_ask is not None: i_ask = text_divide_paragraph(i_ask)
+    # 当代码输出半截的时候,试着补上后个```
+    if gpt_reply is not None: gpt_reply = close_up_code_segment_during_stream(gpt_reply)
+    # process
     y[-1] = (
         None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
         None if gpt_reply is None else markdown_convertion(gpt_reply)
@@ -391,7 +400,7 @@ def extract_archive(file_path, dest_dir):
         print("Successfully extracted rar archive to {}".format(dest_dir))
     except:
         print("Rar format requires additional dependencies to install")
-        return '\n\n需要安装pip install rarfile来解压rar文件'
+        return '\n\n解压失败! 需要安装pip install rarfile来解压rar文件'
 
     # 第三方库,需要预先pip install py7zr
     elif file_extension == '.7z':
@@ -402,7 +411,7 @@ def extract_archive(file_path, dest_dir):
         print("Successfully extracted 7z archive to {}".format(dest_dir))
     except:
         print("7z format requires additional dependencies to install")
-        return '\n\n需要安装pip install py7zr来解压7z文件'
+        return '\n\n解压失败! 需要安装pip install py7zr来解压7z文件'
     else:
         return ''
     return ''
@@ -431,13 +440,17 @@ def find_recent_files(directory):
 
     return recent_files
 
-def promote_file_to_downloadzone(file, rename_file=None):
+def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
     # 将文件复制一份到下载区
     import shutil
     if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
     new_path = os.path.join(f'./gpt_log/', rename_file)
-    if os.path.exists(new_path): os.remove(new_path)
-    shutil.copyfile(file, new_path)
+    if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
+    if not os.path.exists(new_path): shutil.copyfile(file, new_path)
+    if chatbot:
+        if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
+        else: current = []
+        chatbot._cookies.update({'file_to_promote': [new_path] + current})
 
 def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
     """
@@ -477,14 +490,20 @@ def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
     return chatbot, txt, txt2
 
 
-def on_report_generated(files, chatbot):
+def on_report_generated(cookies, files, chatbot):
     from toolbox import find_recent_files
-    report_files = find_recent_files('gpt_log')
+    if 'file_to_promote' in cookies:
+        report_files = cookies['file_to_promote']
+        cookies.pop('file_to_promote')
+    else:
+        report_files = find_recent_files('gpt_log')
     if len(report_files) == 0:
         return None, chatbot
     # files.extend(report_files)
-    chatbot.append(['报告如何远程获取?', '报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。'])
-    return report_files, chatbot
+    file_links = ''
+    for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
+    chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
+    return cookies, report_files, chatbot
 
 def is_openai_api_key(key):
     API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
@@ -786,7 +805,8 @@ def zip_result(folder):
     import time
     t = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
     zip_folder(folder, './gpt_log/', f'{t}-result.zip')
+    return pj('./gpt_log/', f'{t}-result.zip')
 
 def gen_time_str():
     import time
     return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
```
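The promote_file_to_downloadzone / on_report_generated changes above implement a simple handoff: plugins record each generated file in a per-session cookie dict, and the report callback later drains that list instead of scanning the whole gpt_log directory. A stand-alone sketch of the idea (plain dicts stand in for Gradio's per-session cookies; the names are illustrative):

```python
# Stand-alone sketch of the cookie-based file handoff (illustrative names;
# a plain dict stands in for Gradio's per-session cookies).
def promote_file(cookies, new_path):
    # Prepend the new file to the session's pending-download list.
    current = cookies.get('file_to_promote', [])
    cookies['file_to_promote'] = [new_path] + current

def on_report_generated(cookies):
    # Drain the pending list if present; otherwise fall back to a scan.
    if 'file_to_promote' in cookies:
        report_files = cookies.pop('file_to_promote')
    else:
        report_files = []  # fallback (the real code scans ./gpt_log here)
    return report_files

cookies = {}
promote_file(cookies, 'gpt_log/a.zip')
promote_file(cookies, 'gpt_log/b.md')
print(on_report_generated(cookies))   # newest first: ['gpt_log/b.md', 'gpt_log/a.zip']
print(on_report_generated(cookies))   # drained: []
```

Because the pending list lives in the session cookies rather than in a shared directory scan, concurrent users no longer pick up each other's reports.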
## version (4 changed lines)

```diff
@@ -1,5 +1,5 @@
 {
-    "version": 3.4,
+    "version": 3.42,
     "show_feature": true,
-    "new_feature": "新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持"
+    "new_feature": "完善本地Latex矫错和翻译功能 <-> 增加gpt-3.5-16k的支持 <-> 新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持"
 }
```