merge frontier branch (#1620)

* Zhipu SDK update: adapt to the latest Zhipu SDK, support GLM4v (#1502)
* Adapt Google Gemini: extract files from the user input
* Adapt to the latest Zhipu SDK, support glm-4v
* requirements.txt fix
* pending history check

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>

* Update "生成多种Mermaid图表" (generate multiple Mermaid charts) plugin: Separate out the file reading function (#1520)
* Update crazy_functional.py with new functionality to deal with PDF
* Update crazy_functional.py and Mermaid.py for plugin_kwargs
* Update crazy_functional.py with new chart type: mind map
* Update SELECT_PROMPT and i_say_show_user messages
* Update ArgsReminder message in get_crazy_functions() function
* Update with read md file and update PROMPTS
* Return the PROMPTS as the test found that the initial version worked best
* Update Mermaid chart generation function
* version 3.71
* Resolve issue #1510
* Remove unnecessary text from sys_prompt in 解析历史输入 (parse history input) function
* Remove sys_prompt message in 解析历史输入 function
* Update bridge_all.py: supports gpt-4-turbo-preview (#1517)
* Update bridge_all.py: supports gpt-4-turbo-preview
* Update bridge_all.py

---------

Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* Update config.py: supports gpt-4-turbo-preview (#1516)
* Update config.py: supports gpt-4-turbo-preview
* Update config.py

---------

Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* Refactor 解析历史输入 function to handle file input
* Update Mermaid chart generation functionality
* rename files and functions

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>
Co-authored-by: hongyi-zhao <hongyi.zhao@gmail.com>
Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* Integrate Mathpix OCR (#1468)
* Update Latex输出PDF结果.py: use Mathpix to implement "PDF翻译中文并重新编译PDF" (translate a PDF into Chinese and recompile it)
* Update config.py: add Mathpix appid & appkey
* Add 'PDF翻译中文并重新编译PDF' feature to plugins.

---------

Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* fix zhipuai
* check picture
* remove glm-4 due to bug
* Modify config
* Check MATHPIX_APPID
* Remove unnecessary code and update function_plugins dictionary
* capture non-standard token overflow
* bug fix #1524
* change mermaid style
* Support Mermaid zoom in, zoom out, and reset, with mouse scroll and drag (#1530)
* Support Mermaid zoom in, zoom out, and reset, with mouse scroll and drag
* Fine-tuning did not work out yet; stage the changes for now
* update

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>
Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* ver 3.72
* change live2d
* save the status of `clear btn` in cookie
* Persist frontend selections
* js ui bug fix
* reset btn bug fix
* update live2d tips
* fix missing get_token_num method
* fix live2d toggle switch
* fix persistent custom btn with cookie
* fix zhipuai feedback with core functionality
* Refactor button update and clean up functions
* trailing space removal
* Fix missing MATHPIX_APPID and MATHPIX_APPKEY configuration
* Prompt fix, mind map prompt optimization (#1537)
* Adapt Google Gemini: extract files from the user input
* Optimize mind map prompts
* Fix missing MATHPIX_APPID and MATHPIX_APPKEY configuration

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>

* Optimize the "PDF翻译中文并重新编译PDF" plugin (#1602)
* Add gemini_endpoint to API_URL_REDIRECT (#1560)
* Add gemini_endpoint to API_URL_REDIRECT
* Update gemini-pro and gemini-pro-vision model_info endpoints
* Update to support new claude models (#1606)
* Add anthropic library and update claude models
* Update bridge_claude.py: add support for image input and fix some bugs
* Add the Claude_3_Models variable to limit the number of images
* Refactor code to improve readability and maintainability
* minor claude bug fix
* more flexible one-api support
* reformat config
* fix one-api new access bug
* dummy
* compat non-standard api
* version 3.73

---------

Co-authored-by: XIao <46100050+Kilig947@users.noreply.github.com>
Co-authored-by: Menghuan1918 <menghuan2003@outlook.com>
Co-authored-by: hongyi-zhao <hongyi.zhao@gmail.com>
Co-authored-by: Hao Ma <893017927@qq.com>
Co-authored-by: zeyuan huang <599012428@qq.com>
2025-12-06 06:26:47 +00:00 · 2024-03-11 17:26:09 +08:00
--- a/request_llms/bridge_moss.py
+++ b/request_llms/bridge_moss.py
@@ -18,7 +18,7 @@ class GetGLMHandle(Process):
        if self.check_dependency():
            self.start()
            self.threadLock = threading.Lock()
-        
+
    def check_dependency(self): # runs in the main process
        try:
            import datasets, os
@@ -54,9 +54,9 @@ class GetGLMHandle(Process):
        from models.tokenization_moss import MossTokenizer

        parser = argparse.ArgumentParser()
-        parser.add_argument("--model_name", default="fnlp/moss-moon-003-sft-int4", 
-                            choices=["fnlp/moss-moon-003-sft", 
-                                    "fnlp/moss-moon-003-sft-int8", 
+        parser.add_argument("--model_name", default="fnlp/moss-moon-003-sft-int4",
+                            choices=["fnlp/moss-moon-003-sft",
+                                    "fnlp/moss-moon-003-sft-int8",
                                    "fnlp/moss-moon-003-sft-int4"], type=str)
        parser.add_argument("--gpu", default="0", type=str)
        args = parser.parse_args()
@@ -76,7 +76,7 @@ class GetGLMHandle(Process):

        config = MossConfig.from_pretrained(model_path)
        self.tokenizer = MossTokenizer.from_pretrained(model_path)
-        if num_gpus > 1:  
+        if num_gpus > 1:
            print("Waiting for all devices to be ready, it may take a few minutes...")
            with init_empty_weights():
                raw_model = MossForCausalLM._from_config(config, torch_dtype=torch.float16)
@@ -135,15 +135,15 @@ class GetGLMHandle(Process):
                inputs = self.tokenizer(self.prompt, return_tensors="pt")
                with torch.no_grad():
                    outputs = self.model.generate(
-                        inputs.input_ids.cuda(), 
-                        attention_mask=inputs.attention_mask.cuda(), 
-                        max_length=2048, 
-                        do_sample=True, 
-                        top_k=40, 
-                        top_p=0.8, 
+                        inputs.input_ids.cuda(),
+                        attention_mask=inputs.attention_mask.cuda(),
+                        max_length=2048,
+                        do_sample=True,
+                        top_k=40,
+                        top_p=0.8,
                        temperature=0.7,
                        repetition_penalty=1.02,
-                        num_return_sequences=1, 
+                        num_return_sequences=1,
                        eos_token_id=106068,
                        pad_token_id=self.tokenizer.pad_token_id)
                    response = self.tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
@@ -167,7 +167,7 @@ class GetGLMHandle(Process):
            else:
                break
        self.threadLock.release()
-    
+
 global moss_handle
 moss_handle = None
 #################################################################################
@@ -180,7 +180,7 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
    if moss_handle is None:
        moss_handle = GetGLMHandle()
        if len(observe_window) >= 1: observe_window[0] = load_message + "\n\n" + moss_handle.info
-        if not moss_handle.success: 
+        if not moss_handle.success:
            error = moss_handle.info
            moss_handle = None
            raise RuntimeError(error)
@@ -194,7 +194,7 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
    response = ""
    for response in moss_handle.stream_chat(query=inputs, history=history_feedin, sys_prompt=sys_prompt, max_length=llm_kwargs['max_length'], top_p=llm_kwargs['top_p'], temperature=llm_kwargs['temperature']):
        if len(observe_window) >= 1:  observe_window[0] = response
-        if len(observe_window) >= 2:  
+        if len(observe_window) >= 2:
            if (time.time()-observe_window[1]) > watch_dog_patience:
                raise RuntimeError("程序终止。")
    return response
@@ -213,7 +213,7 @@ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_promp
        moss_handle = GetGLMHandle()
        chatbot[-1] = (inputs, load_message + "\n\n" + moss_handle.info)
        yield from update_ui(chatbot=chatbot, history=[])
-        if not moss_handle.success: 
+        if not moss_handle.success:
            moss_handle = None
            return
    else: