merge frontier branch (#1620)

* Zhipu sdk update 适配最新的智谱SDK，支持GLM4v (#1502) * 适配 google gemini 优化为从用户input中提取文件 * 适配最新的智谱SDK、支持glm-4v * requirements.txt fix * pending history check --------- Co-authored-by: binary-husky <qingxu.fu@outlook.com> * Update "生成多种Mermaid图表" plugin: Separate out the file reading function (#1520) * Update crazy_functional.py with new functionality deal with PDF * Update crazy_functional.py and Mermaid.py for plugin_kwargs * Update crazy_functional.py with new chart type: mind map * Update SELECT_PROMPT and i_say_show_user messages * Update ArgsReminder message in get_crazy_functions() function * Update with read md file and update PROMPTS * Return the PROMPTS as the test found that the initial version worked best * Update Mermaid chart generation function * version 3.71 * 解决issues #1510 * Remove unnecessary text from sys_prompt in 解析历史输入 function * Remove sys_prompt message in 解析历史输入 function * Update bridge_all.py: supports gpt-4-turbo-preview (#1517) * Update bridge_all.py: supports gpt-4-turbo-preview supports gpt-4-turbo-preview * Update bridge_all.py --------- Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com> * Update config.py: supports gpt-4-turbo-preview (#1516) * Update config.py: supports gpt-4-turbo-preview supports gpt-4-turbo-preview * Update config.py --------- Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com> * Refactor 解析历史输入 function to handle file input * Update Mermaid chart generation functionality * rename files and functions --------- Co-authored-by: binary-husky <qingxu.fu@outlook.com> Co-authored-by: hongyi-zhao <hongyi.zhao@gmail.com> Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com> * 接入mathpix ocr功能 (#1468) * Update Latex输出PDF结果.py 借助mathpix实现了PDF翻译中文并重新编译PDF * Update config.py add mathpix appid & appkey * Add 'PDF翻译中文并重新编译PDF' feature to plugins. --------- Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com> * fix zhipuai * check picture * remove glm-4 due to bug * 修改config * 检查MATHPIX_APPID * Remove unnecessary code and update function_plugins dictionary * capture non-standard token overflow * bug fix #1524 * change mermaid style * 支持mermaid 滚动放大缩小重置,鼠标滚动和拖拽 (#1530) * 支持mermaid 滚动放大缩小重置,鼠标滚动和拖拽 * 微调未果先stage一下 * update --------- Co-authored-by: binary-husky <qingxu.fu@outlook.com> Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com> * ver 3.72 * change live2d * save the status of ``clear btn` in cookie * 前端选择保持 * js ui bug fix * reset btn bug fix * update live2d tips * fix missing get_token_num method * fix live2d toggle switch * fix persistent custom btn with cookie * fix zhipuai feedback with core functionality * Refactor button update and clean up functions * tailing space removal * Fix missing MATHPIX_APPID and MATHPIX_APPKEY configuration * Prompt fix、脑图提示词优化 (#1537) * 适配 google gemini 优化为从用户input中提取文件 * 脑图提示词优化 * Fix missing MATHPIX_APPID and MATHPIX_APPKEY configuration --------- Co-authored-by: binary-husky <qingxu.fu@outlook.com> * 优化“PDF翻译中文并重新编译PDF”插件 (#1602) * Add gemini_endpoint to API_URL_REDIRECT (#1560) * Add gemini_endpoint to API_URL_REDIRECT * Update gemini-pro and gemini-pro-vision model_info endpoints * Update to support new claude models (#1606) * Add anthropic library and update claude models * 更新bridge_claude.py文件，添加了对图片输入的支持。修复了一些bug。 * 添加Claude_3_Models变量以限制图片数量 * Refactor code to improve readability and maintainability * minor claude bug fix * more flexible one-api support * reformat config * fix one-api new access bug * dummy * compat non-standard api * version 3.73 --------- Co-authored-by: XIao <46100050+Kilig947@users.noreply.github.com> Co-authored-by: Menghuan1918 <menghuan2003@outlook.com> Co-authored-by: hongyi-zhao <hongyi.zhao@gmail.com> Co-authored-by: Hao Ma <893017927@qq.com> Co-authored-by: zeyuan huang <599012428@qq.com>
2025-12-09 16:06:48 +00:00 · 2024-03-11 17:26:09 +08:00
--- a/crazy_functions/批量翻译PDF文档_多线程.py
+++ b/crazy_functions/批量翻译PDF文档_多线程.py
@@ -68,7 +68,7 @@ def 解析PDF_基于GROBID(file_manifest, project_folder, llm_kwargs, plugin_kwa
        with open(grobid_json_res, 'w+', encoding='utf8') as f:
            f.write(json.dumps(article_dict, indent=4, ensure_ascii=False))
        promote_file_to_downloadzone(grobid_json_res, chatbot=chatbot)
-        
+
        if article_dict is None: raise RuntimeError("解析PDF失败，请检查PDF是否损坏。")
        yield from translate_pdf(article_dict, llm_kwargs, chatbot, fp, generated_conclusion_files, TOKEN_LIMIT_PER_FRAGMENT, DST_LANG)
    chatbot.append(("给出输出文件清单", str(generated_conclusion_files + generated_html_files)))
@@ -97,7 +97,7 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot,

        # 为了更好的效果，我们剥离Introduction之后的部分（如果有）
        paper_meta = page_one_fragments[0].split('introduction')[0].split('Introduction')[0].split('INTRODUCTION')[0]
-        
+
        # 单线，获取文章meta信息
        paper_meta_info = yield from request_gpt_model_in_new_thread_with_ui_alive(
            inputs=f"以下是一篇学术论文的基础信息，请从中提取出“标题”、“收录会议或期刊”、“作者”、“摘要”、“编号”、“作者邮箱”这六个部分。请用markdown格式输出，最后用中文翻译摘要部分。请提取：{paper_meta}",
@@ -121,7 +121,7 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot,
        )
        gpt_response_collection_md = copy.deepcopy(gpt_response_collection)
        # 整理报告的格式
-        for i,k in enumerate(gpt_response_collection_md): 
+        for i,k in enumerate(gpt_response_collection_md):
            if i%2==0:
                gpt_response_collection_md[i] = f"\n\n---\n\n ## 原文[{i//2}/{len(gpt_response_collection_md)//2}]： \n\n {paper_fragments[i//2].replace('#', '')}  \n\n---\n\n ## 翻译[{i//2}/{len(gpt_response_collection_md)//2}]：\n "
            else:
@@ -139,18 +139,18 @@ def 解析PDF(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot,

        # write html
        try:
-            ch = construct_html() 
+            ch = construct_html()
            orig = ""
            trans = ""
            gpt_response_collection_html = copy.deepcopy(gpt_response_collection)
-            for i,k in enumerate(gpt_response_collection_html): 
+            for i,k in enumerate(gpt_response_collection_html):
                if i%2==0:
                    gpt_response_collection_html[i] = paper_fragments[i//2].replace('#', '')
                else:
                    gpt_response_collection_html[i] = gpt_response_collection_html[i]
            final = ["论文概况", paper_meta_info.replace('# ', '### '),  "二、论文翻译",  ""]
            final.extend(gpt_response_collection_html)
-            for i, k in enumerate(final): 
+            for i, k in enumerate(final):
                if i%2==0:
                    orig = k
                if i%2==1: