镜像自地址
https://github.com/binary-husky/gpt_academic.git
已同步 2025-12-06 14:36:48 +00:00
比较提交
6 次代码提交
frontier_2
...
purge_prin
| 作者 | SHA1 | 提交日期 | |
|---|---|---|---|
|
|
2f343179a2 | ||
|
|
bbf9e9f868 | ||
|
|
aa1f967dd7 | ||
|
|
0d082327c8 | ||
|
|
80acd9c875 | ||
|
|
17cd4f8210 |
@@ -1,14 +1,14 @@
|
||||
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
|
||||
name: build-with-latex-arm
|
||||
name: build-with-all-capacity-beta
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- "master"
|
||||
- 'master'
|
||||
|
||||
env:
|
||||
REGISTRY: ghcr.io
|
||||
IMAGE_NAME: ${{ github.repository }}_with_latex_arm
|
||||
IMAGE_NAME: ${{ github.repository }}_with_all_capacity_beta
|
||||
|
||||
jobs:
|
||||
build-and-push-image:
|
||||
@@ -18,17 +18,11 @@ jobs:
|
||||
packages: write
|
||||
|
||||
steps:
|
||||
- name: Set up QEMU
|
||||
uses: docker/setup-qemu-action@v3
|
||||
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@v3
|
||||
|
||||
- name: Checkout repository
|
||||
uses: actions/checkout@v4
|
||||
uses: actions/checkout@v3
|
||||
|
||||
- name: Log in to the Container registry
|
||||
uses: docker/login-action@v3
|
||||
uses: docker/login-action@v2
|
||||
with:
|
||||
registry: ${{ env.REGISTRY }}
|
||||
username: ${{ github.actor }}
|
||||
@@ -41,11 +35,10 @@ jobs:
|
||||
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
|
||||
|
||||
- name: Build and push Docker image
|
||||
uses: docker/build-push-action@v6
|
||||
uses: docker/build-push-action@v4
|
||||
with:
|
||||
context: .
|
||||
push: true
|
||||
platforms: linux/arm64
|
||||
file: docs/GithubAction+NoLocal+Latex
|
||||
file: docs/GithubAction+AllCapacityBeta
|
||||
tags: ${{ steps.meta.outputs.tags }}
|
||||
labels: ${{ steps.meta.outputs.labels }}
|
||||
labels: ${{ steps.meta.outputs.labels }}
|
||||
44
.github/workflows/build-with-jittorllms.yml
vendored
普通文件
44
.github/workflows/build-with-jittorllms.yml
vendored
普通文件
@@ -0,0 +1,44 @@
|
||||
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
|
||||
name: build-with-jittorllms
|
||||
|
||||
on:
|
||||
push:
|
||||
branches:
|
||||
- 'master'
|
||||
|
||||
env:
|
||||
REGISTRY: ghcr.io
|
||||
IMAGE_NAME: ${{ github.repository }}_jittorllms
|
||||
|
||||
jobs:
|
||||
build-and-push-image:
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
contents: read
|
||||
packages: write
|
||||
|
||||
steps:
|
||||
- name: Checkout repository
|
||||
uses: actions/checkout@v3
|
||||
|
||||
- name: Log in to the Container registry
|
||||
uses: docker/login-action@v2
|
||||
with:
|
||||
registry: ${{ env.REGISTRY }}
|
||||
username: ${{ github.actor }}
|
||||
password: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Extract metadata (tags, labels) for Docker
|
||||
id: meta
|
||||
uses: docker/metadata-action@v4
|
||||
with:
|
||||
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
|
||||
|
||||
- name: Build and push Docker image
|
||||
uses: docker/build-push-action@v4
|
||||
with:
|
||||
context: .
|
||||
push: true
|
||||
file: docs/GithubAction+JittorLLMs
|
||||
tags: ${{ steps.meta.outputs.tags }}
|
||||
labels: ${{ steps.meta.outputs.labels }}
|
||||
@@ -1,6 +1,5 @@
|
||||
> [!IMPORTANT]
|
||||
> 2024.10.10: 突发停电,紧急恢复了提供[whl包](https://drive.google.com/file/d/19U_hsLoMrjOlQSzYS3pzWX9fTzyusArP/view?usp=sharing)的文件服务器
|
||||
> 2024.10.8: 版本3.90加入对llama-index的初步支持,版本3.80加入插件二级菜单功能(详见wiki)
|
||||
> 2024.6.1: 版本3.80加入插件二级菜单功能(详见wiki)
|
||||
> 2024.5.1: 加入Doc2x翻译PDF论文的功能,[查看详情](https://github.com/binary-husky/gpt_academic/wiki/Doc2x)
|
||||
> 2024.3.11: 全力支持Qwen、GLM、DeepseekCoder等中文大语言模型! SoVits语音克隆模块,[查看详情](https://www.bilibili.com/video/BV1Rp421S7tF/)
|
||||
> 2024.1.17: 安装依赖时,请选择`requirements.txt`中**指定的版本**。 安装命令:`pip install -r requirements.txt`。本项目完全开源免费,您可通过订阅[在线服务](https://github.com/binary-husky/gpt_academic/wiki/online)的方式鼓励本项目的发展。
|
||||
|
||||
@@ -57,9 +57,9 @@ EMBEDDING_MODEL = "text-embedding-3-small"
|
||||
# "yi-34b-chat-0205","yi-34b-chat-200k","yi-large","yi-medium","yi-spark","yi-large-turbo","yi-large-preview",
|
||||
# ]
|
||||
# --- --- --- ---
|
||||
# 此外,您还可以在接入one-api/vllm/ollama/Openroute时,
|
||||
# 使用"one-api-*","vllm-*","ollama-*","openrouter-*"前缀直接使用非标准方式接入的模型,例如
|
||||
# AVAIL_LLM_MODELS = ["one-api-claude-3-sonnet-20240229(max_token=100000)", "ollama-phi3(max_token=4096)","openrouter-openai/gpt-4o-mini","openrouter-openai/chatgpt-4o-latest"]
|
||||
# 此外,您还可以在接入one-api/vllm/ollama时,
|
||||
# 使用"one-api-*","vllm-*","ollama-*"前缀直接使用非标准方式接入的模型,例如
|
||||
# AVAIL_LLM_MODELS = ["one-api-claude-3-sonnet-20240229(max_token=100000)", "ollama-phi3(max_token=4096)"]
|
||||
# --- --- --- ---
|
||||
|
||||
|
||||
|
||||
@@ -17,7 +17,7 @@ def get_core_functions():
|
||||
text_show_english=
|
||||
r"Below is a paragraph from an academic paper. Polish the writing to meet the academic style, "
|
||||
r"improve the spelling, grammar, clarity, concision and overall readability. When necessary, rewrite the whole sentence. "
|
||||
r"Firstly, you should provide the polished paragraph (in English). "
|
||||
r"Firstly, you should provide the polished paragraph. "
|
||||
r"Secondly, you should list all your modification and explain the reasons to do so in markdown table.",
|
||||
text_show_chinese=
|
||||
r"作为一名中文学术论文写作改进助理,你的任务是改进所提供文本的拼写、语法、清晰、简洁和整体可读性,"
|
||||
|
||||
@@ -6,6 +6,7 @@ from loguru import logger
|
||||
def get_crazy_functions():
|
||||
from crazy_functions.读文章写摘要 import 读文章写摘要
|
||||
from crazy_functions.生成函数注释 import 批量生成函数注释
|
||||
from crazy_functions.Rag_Interface import Rag问答
|
||||
from crazy_functions.SourceCode_Analyse import 解析项目本身
|
||||
from crazy_functions.SourceCode_Analyse import 解析一个Python项目
|
||||
from crazy_functions.SourceCode_Analyse import 解析一个Matlab项目
|
||||
@@ -51,6 +52,13 @@ def get_crazy_functions():
|
||||
from crazy_functions.SourceCode_Comment import 注释Python项目
|
||||
|
||||
function_plugins = {
|
||||
"Rag智能召回": {
|
||||
"Group": "对话",
|
||||
"Color": "stop",
|
||||
"AsButton": False,
|
||||
"Info": "将问答数据记录到向量库中,作为长期参考。",
|
||||
"Function": HotReload(Rag问答),
|
||||
},
|
||||
"虚空终端": {
|
||||
"Group": "对话|编程|学术|智能体",
|
||||
"Color": "stop",
|
||||
@@ -699,31 +707,6 @@ def get_crazy_functions():
|
||||
logger.error(trimmed_format_exc())
|
||||
logger.error("Load function plugin failed")
|
||||
|
||||
try:
|
||||
from crazy_functions.Rag_Interface import Rag问答
|
||||
|
||||
function_plugins.update(
|
||||
{
|
||||
"Rag智能召回": {
|
||||
"Group": "对话",
|
||||
"Color": "stop",
|
||||
"AsButton": False,
|
||||
"Info": "将问答数据记录到向量库中,作为长期参考。",
|
||||
"Function": HotReload(Rag问答),
|
||||
},
|
||||
}
|
||||
)
|
||||
except:
|
||||
logger.error(trimmed_format_exc())
|
||||
logger.error("Load function plugin failed")
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
# try:
|
||||
# from crazy_functions.高级功能函数模板 import 测试图表渲染
|
||||
# function_plugins.update({
|
||||
|
||||
@@ -2,7 +2,20 @@ from toolbox import CatchException, update_ui, get_conf, get_log_folder, update_
|
||||
from crazy_functions.crazy_utils import input_clipping
|
||||
from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
|
||||
|
||||
VECTOR_STORE_TYPE = "Milvus"
|
||||
|
||||
if VECTOR_STORE_TYPE == "Milvus":
|
||||
try:
|
||||
from crazy_functions.rag_fns.milvus_worker import MilvusRagWorker as LlamaIndexRagWorker
|
||||
except:
|
||||
VECTOR_STORE_TYPE = "Simple"
|
||||
|
||||
if VECTOR_STORE_TYPE == "Simple":
|
||||
from crazy_functions.rag_fns.llama_index_worker import LlamaIndexRagWorker
|
||||
|
||||
|
||||
RAG_WORKER_REGISTER = {}
|
||||
|
||||
MAX_HISTORY_ROUND = 5
|
||||
MAX_CONTEXT_TOKEN_LIMIT = 4096
|
||||
REMEMBER_PREVIEW = 1000
|
||||
@@ -10,16 +23,6 @@ REMEMBER_PREVIEW = 1000
|
||||
@CatchException
|
||||
def Rag问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, user_request):
|
||||
|
||||
# import vector store lib
|
||||
VECTOR_STORE_TYPE = "Milvus"
|
||||
if VECTOR_STORE_TYPE == "Milvus":
|
||||
try:
|
||||
from crazy_functions.rag_fns.milvus_worker import MilvusRagWorker as LlamaIndexRagWorker
|
||||
except:
|
||||
VECTOR_STORE_TYPE = "Simple"
|
||||
if VECTOR_STORE_TYPE == "Simple":
|
||||
from crazy_functions.rag_fns.llama_index_worker import LlamaIndexRagWorker
|
||||
|
||||
# 1. we retrieve rag worker from global context
|
||||
user_name = chatbot.get_user()
|
||||
checkpoint_dir = get_log_folder(user_name, plugin_name='experimental_rag')
|
||||
|
||||
@@ -1,13 +1,7 @@
|
||||
import pickle, os, random
|
||||
from toolbox import CatchException, update_ui, get_conf, get_log_folder, update_ui_lastest_msg
|
||||
from crazy_functions.crazy_utils import input_clipping
|
||||
from crazy_functions.crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
|
||||
from request_llms.bridge_all import predict_no_ui_long_connection
|
||||
from crazy_functions.json_fns.select_tool import structure_output, select_tool
|
||||
from pydantic import BaseModel, Field
|
||||
from loguru import logger
|
||||
from typing import List
|
||||
|
||||
import pickle, os
|
||||
|
||||
SOCIAL_NETWOK_WORKER_REGISTER = {}
|
||||
|
||||
@@ -15,7 +9,7 @@ class SocialNetwork():
|
||||
def __init__(self):
|
||||
self.people = []
|
||||
|
||||
class SaveAndLoad():
|
||||
class SocialNetworkWorker():
|
||||
def __init__(self, user_name, llm_kwargs, auto_load_checkpoint=True, checkpoint_dir=None) -> None:
|
||||
self.user_name = user_name
|
||||
self.checkpoint_dir = checkpoint_dir
|
||||
@@ -47,105 +41,8 @@ class SaveAndLoad():
|
||||
return SocialNetwork()
|
||||
|
||||
|
||||
class Friend(BaseModel):
|
||||
friend_name: str = Field(description="name of a friend")
|
||||
friend_description: str = Field(description="description of a friend (everything about this friend)")
|
||||
friend_relationship: str = Field(description="The relationship with a friend (e.g. friend, family, colleague)")
|
||||
|
||||
class FriendList(BaseModel):
|
||||
friends_list: List[Friend] = Field(description="The list of friends")
|
||||
|
||||
|
||||
class SocialNetworkWorker(SaveAndLoad):
|
||||
def ai_socail_advice(self, prompt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, run_gpt_fn, intention_type):
|
||||
pass
|
||||
|
||||
def ai_remove_friend(self, prompt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, run_gpt_fn, intention_type):
|
||||
pass
|
||||
|
||||
def ai_list_friends(self, prompt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, run_gpt_fn, intention_type):
|
||||
pass
|
||||
|
||||
def ai_add_multi_friends(self, prompt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, run_gpt_fn, intention_type):
|
||||
friend, err_msg = structure_output(
|
||||
txt=prompt,
|
||||
prompt="根据提示, 解析多个联系人的身份信息\n\n",
|
||||
err_msg=f"不能理解该联系人",
|
||||
run_gpt_fn=run_gpt_fn,
|
||||
pydantic_cls=FriendList
|
||||
)
|
||||
if friend.friends_list:
|
||||
for f in friend.friends_list:
|
||||
self.add_friend(f)
|
||||
msg = f"成功添加{len(friend.friends_list)}个联系人: {str(friend.friends_list)}"
|
||||
yield from update_ui_lastest_msg(lastmsg=msg, chatbot=chatbot, history=history, delay=0)
|
||||
|
||||
|
||||
def run(self, txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, user_request):
|
||||
prompt = txt
|
||||
run_gpt_fn = lambda inputs, sys_prompt: predict_no_ui_long_connection(inputs=inputs, llm_kwargs=llm_kwargs, history=[], sys_prompt=sys_prompt, observe_window=[])
|
||||
self.tools_to_select = {
|
||||
"SocialAdvice":{
|
||||
"explain_to_llm": "如果用户希望获取社交指导,调用SocialAdvice生成一些社交建议",
|
||||
"callback": self.ai_socail_advice,
|
||||
},
|
||||
"AddFriends":{
|
||||
"explain_to_llm": "如果用户给出了联系人,调用AddMultiFriends把联系人添加到数据库",
|
||||
"callback": self.ai_add_multi_friends,
|
||||
},
|
||||
"RemoveFriend":{
|
||||
"explain_to_llm": "如果用户希望移除某个联系人,调用RemoveFriend",
|
||||
"callback": self.ai_remove_friend,
|
||||
},
|
||||
"ListFriends":{
|
||||
"explain_to_llm": "如果用户列举联系人,调用ListFriends",
|
||||
"callback": self.ai_list_friends,
|
||||
}
|
||||
}
|
||||
|
||||
try:
|
||||
Explaination = '\n'.join([f'{k}: {v["explain_to_llm"]}' for k, v in self.tools_to_select.items()])
|
||||
class UserSociaIntention(BaseModel):
|
||||
intention_type: str = Field(
|
||||
description=
|
||||
f"The type of user intention. You must choose from {self.tools_to_select.keys()}.\n\n"
|
||||
f"Explaination:\n{Explaination}",
|
||||
default="SocialAdvice"
|
||||
)
|
||||
pydantic_cls_instance, err_msg = select_tool(
|
||||
prompt=txt,
|
||||
run_gpt_fn=run_gpt_fn,
|
||||
pydantic_cls=UserSociaIntention
|
||||
)
|
||||
except Exception as e:
|
||||
yield from update_ui_lastest_msg(
|
||||
lastmsg=f"无法理解用户意图 {err_msg}",
|
||||
chatbot=chatbot,
|
||||
history=history,
|
||||
delay=0
|
||||
)
|
||||
return
|
||||
|
||||
intention_type = pydantic_cls_instance.intention_type
|
||||
intention_callback = self.tools_to_select[pydantic_cls_instance.intention_type]['callback']
|
||||
yield from intention_callback(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, run_gpt_fn, intention_type)
|
||||
|
||||
|
||||
def add_friend(self, friend):
|
||||
# check whether the friend is already in the social network
|
||||
for f in self.social_network.people:
|
||||
if f.friend_name == friend.friend_name:
|
||||
f.friend_description = friend.friend_description
|
||||
f.friend_relationship = friend.friend_relationship
|
||||
logger.info(f"Repeated friend, update info: {friend}")
|
||||
return
|
||||
logger.info(f"Add a new friend: {friend}")
|
||||
self.social_network.people.append(friend)
|
||||
return
|
||||
|
||||
|
||||
@CatchException
|
||||
def I人助手(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, user_request):
|
||||
def I人助手(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, user_request, num_day=5):
|
||||
|
||||
# 1. we retrieve worker from global context
|
||||
user_name = chatbot.get_user()
|
||||
@@ -161,7 +58,8 @@ def I人助手(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt,
|
||||
)
|
||||
|
||||
# 2. save
|
||||
yield from social_network_worker.run(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, user_request)
|
||||
social_network_worker.social_network.people.append("张三")
|
||||
social_network_worker.save_to_checkpoint(checkpoint_dir)
|
||||
chatbot.append(["good", "work"])
|
||||
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
|
||||
|
||||
|
||||
@@ -1,26 +0,0 @@
|
||||
from crazy_functions.json_fns.pydantic_io import GptJsonIO, JsonStringError
|
||||
|
||||
def structure_output(txt, prompt, err_msg, run_gpt_fn, pydantic_cls):
|
||||
gpt_json_io = GptJsonIO(pydantic_cls)
|
||||
analyze_res = run_gpt_fn(
|
||||
txt,
|
||||
sys_prompt=prompt + gpt_json_io.format_instructions
|
||||
)
|
||||
try:
|
||||
friend = gpt_json_io.generate_output_auto_repair(analyze_res, run_gpt_fn)
|
||||
except JsonStringError as e:
|
||||
return None, err_msg
|
||||
|
||||
err_msg = ""
|
||||
return friend, err_msg
|
||||
|
||||
|
||||
def select_tool(prompt, run_gpt_fn, pydantic_cls):
|
||||
pydantic_cls_instance, err_msg = structure_output(
|
||||
txt=prompt,
|
||||
prompt="根据提示, 分析应该调用哪个工具函数\n\n",
|
||||
err_msg=f"不能理解该联系人",
|
||||
run_gpt_fn=run_gpt_fn,
|
||||
pydantic_cls=pydantic_cls
|
||||
)
|
||||
return pydantic_cls_instance, err_msg
|
||||
@@ -644,213 +644,6 @@ def run_in_subprocess(func):
|
||||
|
||||
|
||||
def _merge_pdfs(pdf1_path, pdf2_path, output_path):
|
||||
try:
|
||||
logger.info("Merging PDFs using _merge_pdfs_ng")
|
||||
_merge_pdfs_ng(pdf1_path, pdf2_path, output_path)
|
||||
except:
|
||||
logger.info("Merging PDFs using _merge_pdfs_legacy")
|
||||
_merge_pdfs_legacy(pdf1_path, pdf2_path, output_path)
|
||||
|
||||
|
||||
def _merge_pdfs_ng(pdf1_path, pdf2_path, output_path):
|
||||
import PyPDF2 # PyPDF2这个库有严重的内存泄露问题,把它放到子进程中运行,从而方便内存的释放
|
||||
from PyPDF2.generic import NameObject, TextStringObject, ArrayObject, FloatObject, NumberObject
|
||||
|
||||
Percent = 1
|
||||
# raise RuntimeError('PyPDF2 has a serious memory leak problem, please use other tools to merge PDF files.')
|
||||
# Open the first PDF file
|
||||
with open(pdf1_path, "rb") as pdf1_file:
|
||||
pdf1_reader = PyPDF2.PdfFileReader(pdf1_file)
|
||||
# Open the second PDF file
|
||||
with open(pdf2_path, "rb") as pdf2_file:
|
||||
pdf2_reader = PyPDF2.PdfFileReader(pdf2_file)
|
||||
# Create a new PDF file to store the merged pages
|
||||
output_writer = PyPDF2.PdfFileWriter()
|
||||
# Determine the number of pages in each PDF file
|
||||
num_pages = max(pdf1_reader.numPages, pdf2_reader.numPages)
|
||||
# Merge the pages from the two PDF files
|
||||
for page_num in range(num_pages):
|
||||
# Add the page from the first PDF file
|
||||
if page_num < pdf1_reader.numPages:
|
||||
page1 = pdf1_reader.getPage(page_num)
|
||||
else:
|
||||
page1 = PyPDF2.PageObject.createBlankPage(pdf1_reader)
|
||||
# Add the page from the second PDF file
|
||||
if page_num < pdf2_reader.numPages:
|
||||
page2 = pdf2_reader.getPage(page_num)
|
||||
else:
|
||||
page2 = PyPDF2.PageObject.createBlankPage(pdf1_reader)
|
||||
# Create a new empty page with double width
|
||||
new_page = PyPDF2.PageObject.createBlankPage(
|
||||
width=int(
|
||||
int(page1.mediaBox.getWidth())
|
||||
+ int(page2.mediaBox.getWidth()) * Percent
|
||||
),
|
||||
height=max(page1.mediaBox.getHeight(), page2.mediaBox.getHeight()),
|
||||
)
|
||||
new_page.mergeTranslatedPage(page1, 0, 0)
|
||||
new_page.mergeTranslatedPage(
|
||||
page2,
|
||||
int(
|
||||
int(page1.mediaBox.getWidth())
|
||||
- int(page2.mediaBox.getWidth()) * (1 - Percent)
|
||||
),
|
||||
0,
|
||||
)
|
||||
if "/Annots" in new_page:
|
||||
annotations = new_page["/Annots"]
|
||||
for i, annot in enumerate(annotations):
|
||||
annot_obj = annot.get_object()
|
||||
|
||||
# 检查注释类型是否是链接(/Link)
|
||||
if annot_obj.get("/Subtype") == "/Link":
|
||||
# 检查是否为内部链接跳转(/GoTo)或外部URI链接(/URI)
|
||||
action = annot_obj.get("/A")
|
||||
if action:
|
||||
|
||||
if "/S" in action and action["/S"] == "/GoTo":
|
||||
# 内部链接:跳转到文档中的某个页面
|
||||
dest = action.get("/D") # 目标页或目标位置
|
||||
# if dest and annot.idnum in page2_annot_id:
|
||||
if dest in pdf2_reader.named_destinations:
|
||||
# 获取原始文件中跳转信息,包括跳转页面
|
||||
destination = pdf2_reader.named_destinations[
|
||||
dest
|
||||
]
|
||||
page_number = (
|
||||
pdf2_reader.get_destination_page_number(
|
||||
destination
|
||||
)
|
||||
)
|
||||
# 更新跳转信息,跳转到对应的页面和,指定坐标 (100, 150),缩放比例为 100%
|
||||
# “/D”:[10,'/XYZ',100,100,0]
|
||||
if destination.dest_array[1] == "/XYZ":
|
||||
annot_obj["/A"].update(
|
||||
{
|
||||
NameObject("/D"): ArrayObject(
|
||||
[
|
||||
NumberObject(page_number),
|
||||
destination.dest_array[1],
|
||||
FloatObject(
|
||||
destination.dest_array[
|
||||
2
|
||||
]
|
||||
+ int(
|
||||
page1.mediaBox.getWidth()
|
||||
)
|
||||
),
|
||||
destination.dest_array[3],
|
||||
destination.dest_array[4],
|
||||
]
|
||||
) # 确保键和值是 PdfObject
|
||||
}
|
||||
)
|
||||
else:
|
||||
annot_obj["/A"].update(
|
||||
{
|
||||
NameObject("/D"): ArrayObject(
|
||||
[
|
||||
NumberObject(page_number),
|
||||
destination.dest_array[1],
|
||||
]
|
||||
) # 确保键和值是 PdfObject
|
||||
}
|
||||
)
|
||||
|
||||
rect = annot_obj.get("/Rect")
|
||||
# 更新点击坐标
|
||||
rect = ArrayObject(
|
||||
[
|
||||
FloatObject(
|
||||
rect[0]
|
||||
+ int(page1.mediaBox.getWidth())
|
||||
),
|
||||
rect[1],
|
||||
FloatObject(
|
||||
rect[2]
|
||||
+ int(page1.mediaBox.getWidth())
|
||||
),
|
||||
rect[3],
|
||||
]
|
||||
)
|
||||
annot_obj.update(
|
||||
{
|
||||
NameObject(
|
||||
"/Rect"
|
||||
): rect # 确保键和值是 PdfObject
|
||||
}
|
||||
)
|
||||
# if dest and annot.idnum in page1_annot_id:
|
||||
if dest in pdf1_reader.named_destinations:
|
||||
|
||||
# 获取原始文件中跳转信息,包括跳转页面
|
||||
destination = pdf1_reader.named_destinations[
|
||||
dest
|
||||
]
|
||||
page_number = (
|
||||
pdf1_reader.get_destination_page_number(
|
||||
destination
|
||||
)
|
||||
)
|
||||
# 更新跳转信息,跳转到对应的页面和,指定坐标 (100, 150),缩放比例为 100%
|
||||
# “/D”:[10,'/XYZ',100,100,0]
|
||||
if destination.dest_array[1] == "/XYZ":
|
||||
annot_obj["/A"].update(
|
||||
{
|
||||
NameObject("/D"): ArrayObject(
|
||||
[
|
||||
NumberObject(page_number),
|
||||
destination.dest_array[1],
|
||||
FloatObject(
|
||||
destination.dest_array[
|
||||
2
|
||||
]
|
||||
),
|
||||
destination.dest_array[3],
|
||||
destination.dest_array[4],
|
||||
]
|
||||
) # 确保键和值是 PdfObject
|
||||
}
|
||||
)
|
||||
else:
|
||||
annot_obj["/A"].update(
|
||||
{
|
||||
NameObject("/D"): ArrayObject(
|
||||
[
|
||||
NumberObject(page_number),
|
||||
destination.dest_array[1],
|
||||
]
|
||||
) # 确保键和值是 PdfObject
|
||||
}
|
||||
)
|
||||
|
||||
rect = annot_obj.get("/Rect")
|
||||
rect = ArrayObject(
|
||||
[
|
||||
FloatObject(rect[0]),
|
||||
rect[1],
|
||||
FloatObject(rect[2]),
|
||||
rect[3],
|
||||
]
|
||||
)
|
||||
annot_obj.update(
|
||||
{
|
||||
NameObject(
|
||||
"/Rect"
|
||||
): rect # 确保键和值是 PdfObject
|
||||
}
|
||||
)
|
||||
|
||||
elif "/S" in action and action["/S"] == "/URI":
|
||||
# 外部链接:跳转到某个URI
|
||||
uri = action.get("/URI")
|
||||
output_writer.addPage(new_page)
|
||||
# Save the merged PDF file
|
||||
with open(output_path, "wb") as output_file:
|
||||
output_writer.write(output_file)
|
||||
|
||||
|
||||
def _merge_pdfs_legacy(pdf1_path, pdf2_path, output_path):
|
||||
import PyPDF2 # PyPDF2这个库有严重的内存泄露问题,把它放到子进程中运行,从而方便内存的释放
|
||||
|
||||
Percent = 0.95
|
||||
|
||||
1
docs/Dockerfile+JittorLLM
普通文件
1
docs/Dockerfile+JittorLLM
普通文件
@@ -0,0 +1 @@
|
||||
# 此Dockerfile不再维护,请前往docs/GithubAction+JittorLLMs
|
||||
@@ -0,0 +1,57 @@
|
||||
# docker build -t gpt-academic-all-capacity -f docs/GithubAction+AllCapacity --network=host --build-arg http_proxy=http://localhost:10881 --build-arg https_proxy=http://localhost:10881 .
|
||||
# docker build -t gpt-academic-all-capacity -f docs/GithubAction+AllCapacityBeta --network=host .
|
||||
# docker run -it --net=host gpt-academic-all-capacity bash
|
||||
|
||||
# 从NVIDIA源,从而支持显卡(检查宿主的nvidia-smi中的cuda版本必须>=11.3)
|
||||
FROM fuqingxu/11.3.1-runtime-ubuntu20.04-with-texlive:latest
|
||||
|
||||
# edge-tts需要的依赖,某些pip包所需的依赖
|
||||
RUN apt update && apt install ffmpeg build-essential -y
|
||||
|
||||
# use python3 as the system default python
|
||||
WORKDIR /gpt
|
||||
RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.8
|
||||
|
||||
# # 非必要步骤,更换pip源 (以下三行,可以删除)
|
||||
# RUN echo '[global]' > /etc/pip.conf && \
|
||||
# echo 'index-url = https://mirrors.aliyun.com/pypi/simple/' >> /etc/pip.conf && \
|
||||
# echo 'trusted-host = mirrors.aliyun.com' >> /etc/pip.conf
|
||||
|
||||
# 下载pytorch
|
||||
RUN python3 -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113
|
||||
# 准备pip依赖
|
||||
RUN python3 -m pip install openai numpy arxiv rich
|
||||
RUN python3 -m pip install colorama Markdown pygments pymupdf
|
||||
RUN python3 -m pip install python-docx moviepy pdfminer
|
||||
RUN python3 -m pip install zh_langchain==0.2.1 pypinyin
|
||||
RUN python3 -m pip install rarfile py7zr
|
||||
RUN python3 -m pip install aliyun-python-sdk-core==2.13.3 pyOpenSSL webrtcvad scipy git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git
|
||||
# 下载分支
|
||||
WORKDIR /gpt
|
||||
RUN git clone --depth=1 https://github.com/binary-husky/gpt_academic.git
|
||||
WORKDIR /gpt/gpt_academic
|
||||
RUN git clone --depth=1 https://github.com/OpenLMLab/MOSS.git request_llms/moss
|
||||
|
||||
RUN python3 -m pip install -r requirements.txt
|
||||
RUN python3 -m pip install -r request_llms/requirements_moss.txt
|
||||
RUN python3 -m pip install -r request_llms/requirements_qwen.txt
|
||||
RUN python3 -m pip install -r request_llms/requirements_chatglm.txt
|
||||
RUN python3 -m pip install -r request_llms/requirements_newbing.txt
|
||||
RUN python3 -m pip install nougat-ocr
|
||||
|
||||
|
||||
# 预热Tiktoken模块
|
||||
RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
|
||||
|
||||
# 安装知识库插件的额外依赖
|
||||
RUN apt-get update && apt-get install libgl1 -y
|
||||
RUN pip3 install transformers protobuf langchain sentence-transformers faiss-cpu nltk beautifulsoup4 bitsandbytes tabulate icetk --upgrade
|
||||
RUN pip3 install unstructured[all-docs] --upgrade
|
||||
RUN python3 -c 'from check_proxy import warm_up_vectordb; warm_up_vectordb()'
|
||||
RUN rm -rf /usr/local/lib/python3.8/dist-packages/tests
|
||||
|
||||
|
||||
# COPY .cache /root/.cache
|
||||
# COPY config_private.py config_private.py
|
||||
# 启动
|
||||
CMD ["python3", "-u", "main.py"]
|
||||
@@ -3,19 +3,33 @@
|
||||
# - 2 构建 docker build -t gpt-academic-nolocal-latex -f docs/GithubAction+NoLocal+Latex .
|
||||
# - 3 运行 docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex
|
||||
|
||||
FROM menghuan1918/ubuntu_uv_ctex:latest
|
||||
ENV DEBIAN_FRONTEND=noninteractive
|
||||
SHELL ["/bin/bash", "-c"]
|
||||
FROM fuqingxu/python311_texlive_ctex:latest
|
||||
ENV PATH "$PATH:/usr/local/texlive/2022/bin/x86_64-linux"
|
||||
ENV PATH "$PATH:/usr/local/texlive/2023/bin/x86_64-linux"
|
||||
ENV PATH "$PATH:/usr/local/texlive/2024/bin/x86_64-linux"
|
||||
ENV PATH "$PATH:/usr/local/texlive/2025/bin/x86_64-linux"
|
||||
ENV PATH "$PATH:/usr/local/texlive/2026/bin/x86_64-linux"
|
||||
|
||||
# 指定路径
|
||||
WORKDIR /gpt
|
||||
|
||||
RUN pip3 install openai numpy arxiv rich
|
||||
RUN pip3 install colorama Markdown pygments pymupdf
|
||||
RUN pip3 install python-docx pdfminer
|
||||
RUN pip3 install nougat-ocr
|
||||
|
||||
# 装载项目文件
|
||||
COPY . .
|
||||
RUN /root/.cargo/bin/uv venv --seed \
|
||||
&& source .venv/bin/activate \
|
||||
&& /root/.cargo/bin/uv pip install openai numpy arxiv rich colorama Markdown pygments pymupdf python-docx pdfminer \
|
||||
&& /root/.cargo/bin/uv pip install -r requirements.txt \
|
||||
&& /root/.cargo/bin/uv clean
|
||||
|
||||
|
||||
# 安装依赖
|
||||
RUN pip3 install -r requirements.txt
|
||||
|
||||
# edge-tts需要的依赖
|
||||
RUN apt update && apt install ffmpeg -y
|
||||
|
||||
# 可选步骤,用于预热模块
|
||||
RUN .venv/bin/python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
|
||||
RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
|
||||
|
||||
# 启动
|
||||
CMD [".venv/bin/python3", "-u", "main.py"]
|
||||
CMD ["python3", "-u", "main.py"]
|
||||
|
||||
@@ -4,7 +4,7 @@ We currently support fastapi in order to solve sub-path deploy issue.
|
||||
|
||||
1. change CUSTOM_PATH setting in `config.py`
|
||||
|
||||
```sh
|
||||
``` sh
|
||||
nano config.py
|
||||
```
|
||||
|
||||
@@ -35,8 +35,9 @@ if __name__ == "__main__":
|
||||
main()
|
||||
```
|
||||
|
||||
|
||||
3. Go!
|
||||
|
||||
```sh
|
||||
``` sh
|
||||
python main.py
|
||||
```
|
||||
|
||||
文件差异内容过多而无法显示
加载差异
@@ -108,22 +108,5 @@
|
||||
"解析PDF_简单拆解": "ParsePDF_simpleDecomposition",
|
||||
"解析PDF_DOC2X_单文件": "ParsePDF_DOC2X_singleFile",
|
||||
"注释Python项目": "CommentPythonProject",
|
||||
"注释源代码": "CommentSourceCode",
|
||||
"log亮黄": "log_yellow",
|
||||
"log亮绿": "log_green",
|
||||
"log亮红": "log_red",
|
||||
"log亮紫": "log_purple",
|
||||
"log亮蓝": "log_blue",
|
||||
"Rag问答": "RagQA",
|
||||
"sprint红": "sprint_red",
|
||||
"sprint绿": "sprint_green",
|
||||
"sprint黄": "sprint_yellow",
|
||||
"sprint蓝": "sprint_blue",
|
||||
"sprint紫": "sprint_purple",
|
||||
"sprint靛": "sprint_indigo",
|
||||
"sprint亮红": "sprint_bright_red",
|
||||
"sprint亮绿": "sprint_bright_green",
|
||||
"sprint亮黄": "sprint_bright_yellow",
|
||||
"sprint亮蓝": "sprint_bright_blue",
|
||||
"sprint亮紫": "sprint_bright_purple"
|
||||
"注释源代码": "CommentSourceCode"
|
||||
}
|
||||
@@ -256,8 +256,6 @@ model_info = {
|
||||
"max_token": 128000,
|
||||
"tokenizer": tokenizer_gpt4,
|
||||
"token_cnt": get_token_num_gpt4,
|
||||
"openai_disable_system_prompt": True,
|
||||
"openai_disable_stream": True,
|
||||
},
|
||||
"o1-mini": {
|
||||
"fn_with_ui": chatgpt_ui,
|
||||
@@ -266,8 +264,6 @@ model_info = {
|
||||
"max_token": 128000,
|
||||
"tokenizer": tokenizer_gpt4,
|
||||
"token_cnt": get_token_num_gpt4,
|
||||
"openai_disable_system_prompt": True,
|
||||
"openai_disable_stream": True,
|
||||
},
|
||||
|
||||
"gpt-4-turbo": {
|
||||
@@ -1120,24 +1116,6 @@ if len(AZURE_CFG_ARRAY) > 0:
|
||||
if azure_model_name not in AVAIL_LLM_MODELS:
|
||||
AVAIL_LLM_MODELS += [azure_model_name]
|
||||
|
||||
# -=-=-=-=-=-=- Openrouter模型对齐支持 -=-=-=-=-=-=-
|
||||
# 为了更灵活地接入Openrouter路由,设计了此接口
|
||||
for model in [m for m in AVAIL_LLM_MODELS if m.startswith("openrouter-")]:
|
||||
from request_llms.bridge_openrouter import predict_no_ui_long_connection as openrouter_noui
|
||||
from request_llms.bridge_openrouter import predict as openrouter_ui
|
||||
model_info.update({
|
||||
model: {
|
||||
"fn_with_ui": openrouter_ui,
|
||||
"fn_without_ui": openrouter_noui,
|
||||
# 以下参数参考gpt-4o-mini的配置, 请根据实际情况修改
|
||||
"endpoint": openai_endpoint,
|
||||
"has_multimodal_capacity": True,
|
||||
"max_token": 128000,
|
||||
"tokenizer": tokenizer_gpt4,
|
||||
"token_cnt": get_token_num_gpt4,
|
||||
},
|
||||
})
|
||||
|
||||
|
||||
# -=-=-=-=-=-=--=-=-=-=-=-=--=-=-=-=-=-=--=-=-=-=-=-=-=-=
|
||||
# -=-=-=-=-=-=-=-=-=- ☝️ 以上是模型路由 -=-=-=-=-=-=-=-=-=
|
||||
@@ -1283,6 +1261,5 @@ def predict(inputs:str, llm_kwargs:dict, plugin_kwargs:dict, chatbot,
|
||||
if additional_fn: # 根据基础功能区 ModelOverride 参数调整模型类型
|
||||
llm_kwargs, additional_fn, method = execute_model_override(llm_kwargs, additional_fn, method)
|
||||
|
||||
# 更新一下llm_kwargs的参数,否则会出现参数不匹配的问题
|
||||
yield from method(inputs, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, stream, additional_fn)
|
||||
|
||||
|
||||
@@ -134,33 +134,22 @@ def predict_no_ui_long_connection(inputs:str, llm_kwargs:dict, history:list=[],
|
||||
observe_window = None:
|
||||
用于负责跨越线程传递已经输出的部分,大部分时候仅仅为了fancy的视觉效果,留空即可。observe_window[0]:观测窗。observe_window[1]:看门狗
|
||||
"""
|
||||
from request_llms.bridge_all import model_info
|
||||
|
||||
watch_dog_patience = 5 # 看门狗的耐心, 设置5秒即可
|
||||
|
||||
if model_info[llm_kwargs['llm_model']].get('openai_disable_stream', False): stream = False
|
||||
else: stream = True
|
||||
|
||||
headers, payload = generate_payload(inputs, llm_kwargs, history, system_prompt=sys_prompt, stream=stream)
|
||||
headers, payload = generate_payload(inputs, llm_kwargs, history, system_prompt=sys_prompt, stream=True)
|
||||
retry = 0
|
||||
while True:
|
||||
try:
|
||||
# make a POST request to the API endpoint, stream=False
|
||||
from .bridge_all import model_info
|
||||
endpoint = verify_endpoint(model_info[llm_kwargs['llm_model']]['endpoint'])
|
||||
response = requests.post(endpoint, headers=headers, proxies=proxies,
|
||||
json=payload, stream=stream, timeout=TIMEOUT_SECONDS); break
|
||||
json=payload, stream=True, timeout=TIMEOUT_SECONDS); break
|
||||
except requests.exceptions.ReadTimeout as e:
|
||||
retry += 1
|
||||
traceback.print_exc()
|
||||
if retry > MAX_RETRY: raise TimeoutError
|
||||
if MAX_RETRY!=0: logger.error(f'请求超时,正在重试 ({retry}/{MAX_RETRY}) ……')
|
||||
|
||||
if not stream:
|
||||
# 该分支仅适用于不支持stream的o1模型,其他情形一律不适用
|
||||
chunkjson = json.loads(response.content.decode())
|
||||
gpt_replying_buffer = chunkjson['choices'][0]["message"]["content"]
|
||||
return gpt_replying_buffer
|
||||
|
||||
stream_response = response.iter_lines()
|
||||
result = ''
|
||||
json_data = None
|
||||
@@ -192,7 +181,7 @@ def predict_no_ui_long_connection(inputs:str, llm_kwargs:dict, history:list=[],
|
||||
if (not has_content) and (not has_role): continue # raise RuntimeError("发现不标准的第三方接口:"+delta)
|
||||
if has_content: # has_role = True/False
|
||||
result += delta["content"]
|
||||
if not console_slience: print(delta["content"], end='')
|
||||
if not console_slience: logger.info(delta["content"], end='')
|
||||
if observe_window is not None:
|
||||
# 观测窗,把已经获取的数据显示出去
|
||||
if len(observe_window) >= 1:
|
||||
@@ -220,7 +209,7 @@ def predict(inputs:str, llm_kwargs:dict, plugin_kwargs:dict, chatbot:ChatBotWith
|
||||
chatbot 为WebUI中显示的对话列表,修改它,然后yeild出去,可以直接修改对话界面内容
|
||||
additional_fn代表点击的哪个按钮,按钮见functional.py
|
||||
"""
|
||||
from request_llms.bridge_all import model_info
|
||||
from .bridge_all import model_info
|
||||
if is_any_api_key(inputs):
|
||||
chatbot._cookies['api_key'] = inputs
|
||||
chatbot.append(("输入已识别为openai的api_key", what_keys(inputs)))
|
||||
@@ -249,10 +238,6 @@ def predict(inputs:str, llm_kwargs:dict, plugin_kwargs:dict, chatbot:ChatBotWith
|
||||
chatbot.append((_inputs, ""))
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="等待响应") # 刷新界面
|
||||
|
||||
# 禁用stream的特殊模型处理
|
||||
if model_info[llm_kwargs['llm_model']].get('openai_disable_stream', False): stream = False
|
||||
else: stream = True
|
||||
|
||||
# check mis-behavior
|
||||
if is_the_upload_folder(user_input):
|
||||
chatbot[-1] = (inputs, f"[Local Message] 检测到操作错误!当您上传文档之后,需点击“**函数插件区**”按钮进行处理,请勿点击“提交”按钮或者“基础功能区”按钮。")
|
||||
@@ -286,7 +271,7 @@ def predict(inputs:str, llm_kwargs:dict, plugin_kwargs:dict, chatbot:ChatBotWith
|
||||
try:
|
||||
# make a POST request to the API endpoint, stream=True
|
||||
response = requests.post(endpoint, headers=headers, proxies=proxies,
|
||||
json=payload, stream=stream, timeout=TIMEOUT_SECONDS);break
|
||||
json=payload, stream=True, timeout=TIMEOUT_SECONDS);break
|
||||
except:
|
||||
retry += 1
|
||||
chatbot[-1] = ((chatbot[-1][0], timeout_bot_msg))
|
||||
@@ -294,15 +279,10 @@ def predict(inputs:str, llm_kwargs:dict, plugin_kwargs:dict, chatbot:ChatBotWith
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="请求超时"+retry_msg) # 刷新界面
|
||||
if retry > MAX_RETRY: raise TimeoutError
|
||||
|
||||
gpt_replying_buffer = ""
|
||||
|
||||
if not stream:
|
||||
# 该分支仅适用于不支持stream的o1模型,其他情形一律不适用
|
||||
yield from handle_o1_model_special(response, inputs, llm_kwargs, chatbot, history)
|
||||
return
|
||||
|
||||
is_head_of_the_stream = True
|
||||
if stream:
|
||||
gpt_replying_buffer = ""
|
||||
is_head_of_the_stream = True
|
||||
stream_response = response.iter_lines()
|
||||
while True:
|
||||
try:
|
||||
@@ -363,24 +343,12 @@ def predict(inputs:str, llm_kwargs:dict, plugin_kwargs:dict, chatbot:ChatBotWith
|
||||
chunk_decoded = chunk.decode()
|
||||
error_msg = chunk_decoded
|
||||
chatbot, history = handle_error(inputs, llm_kwargs, chatbot, history, chunk_decoded, error_msg)
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="Json解析异常" + error_msg) # 刷新界面
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="Json异常" + error_msg) # 刷新界面
|
||||
logger.error(error_msg)
|
||||
return
|
||||
return # return from stream-branch
|
||||
|
||||
def handle_o1_model_special(response, inputs, llm_kwargs, chatbot, history):
|
||||
try:
|
||||
chunkjson = json.loads(response.content.decode())
|
||||
gpt_replying_buffer = chunkjson['choices'][0]["message"]["content"]
|
||||
log_chat(llm_model=llm_kwargs["llm_model"], input_str=inputs, output_str=gpt_replying_buffer)
|
||||
history[-1] = gpt_replying_buffer
|
||||
chatbot[-1] = (history[-2], history[-1])
|
||||
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
|
||||
except Exception as e:
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="Json解析异常" + response.text) # 刷新界面
|
||||
|
||||
def handle_error(inputs, llm_kwargs, chatbot, history, chunk_decoded, error_msg):
|
||||
from request_llms.bridge_all import model_info
|
||||
from .bridge_all import model_info
|
||||
openai_website = ' 请登录OpenAI查看详情 https://platform.openai.com/signup'
|
||||
if "reduce the length" in error_msg:
|
||||
if len(history) >= 2: history[-1] = ""; history[-2] = "" # 清除当前溢出的输入:history[-2] 是本次输入, history[-1] 是本次输出
|
||||
@@ -413,8 +381,6 @@ def generate_payload(inputs:str, llm_kwargs:dict, history:list, system_prompt:st
|
||||
"""
|
||||
整合所有信息,选择LLM模型,生成http请求,为发送请求做准备
|
||||
"""
|
||||
from request_llms.bridge_all import model_info
|
||||
|
||||
if not is_any_api_key(llm_kwargs['api_key']):
|
||||
raise AssertionError("你提供了错误的API_KEY。\n\n1. 临时解决方案:直接在输入区键入api_key,然后回车提交。\n\n2. 长效解决方案:在config.py中配置。")
|
||||
|
||||
@@ -443,16 +409,10 @@ def generate_payload(inputs:str, llm_kwargs:dict, history:list, system_prompt:st
|
||||
else:
|
||||
enable_multimodal_capacity = False
|
||||
|
||||
conversation_cnt = len(history) // 2
|
||||
openai_disable_system_prompt = model_info[llm_kwargs['llm_model']].get('openai_disable_system_prompt', False)
|
||||
|
||||
if openai_disable_system_prompt:
|
||||
messages = [{"role": "user", "content": system_prompt}]
|
||||
else:
|
||||
messages = [{"role": "system", "content": system_prompt}]
|
||||
|
||||
if not enable_multimodal_capacity:
|
||||
# 不使用多模态能力
|
||||
conversation_cnt = len(history) // 2
|
||||
messages = [{"role": "system", "content": system_prompt}]
|
||||
if conversation_cnt:
|
||||
for index in range(0, 2*conversation_cnt, 2):
|
||||
what_i_have_asked = {}
|
||||
@@ -474,6 +434,8 @@ def generate_payload(inputs:str, llm_kwargs:dict, history:list, system_prompt:st
|
||||
messages.append(what_i_ask_now)
|
||||
else:
|
||||
# 多模态能力
|
||||
conversation_cnt = len(history) // 2
|
||||
messages = [{"role": "system", "content": system_prompt}]
|
||||
if conversation_cnt:
|
||||
for index in range(0, 2*conversation_cnt, 2):
|
||||
what_i_have_asked = {}
|
||||
|
||||
@@ -111,7 +111,7 @@ def predict_no_ui_long_connection(inputs:str, llm_kwargs:dict, history:list=[],
|
||||
if chunkjson['event_type'] == 'stream-start': continue
|
||||
if chunkjson['event_type'] == 'text-generation':
|
||||
result += chunkjson["text"]
|
||||
if not console_slience: print(chunkjson["text"], end='')
|
||||
if not console_slience: logger.info(chunkjson["text"], end='')
|
||||
if observe_window is not None:
|
||||
# 观测窗,把已经获取的数据显示出去
|
||||
if len(observe_window) >= 1:
|
||||
|
||||
@@ -99,7 +99,7 @@ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="",
|
||||
logger.info(f'[response] {result}')
|
||||
break
|
||||
result += chunkjson['message']["content"]
|
||||
if not console_slience: print(chunkjson['message']["content"], end='')
|
||||
if not console_slience: logger.info(chunkjson['message']["content"], end='')
|
||||
if observe_window is not None:
|
||||
# 观测窗,把已经获取的数据显示出去
|
||||
if len(observe_window) >= 1:
|
||||
|
||||
@@ -1,541 +0,0 @@
|
||||
"""
|
||||
该文件中主要包含三个函数
|
||||
|
||||
不具备多线程能力的函数:
|
||||
1. predict: 正常对话时使用,具备完备的交互功能,不可多线程
|
||||
|
||||
具备多线程调用能力的函数
|
||||
2. predict_no_ui_long_connection:支持多线程
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import time
|
||||
import traceback
|
||||
import requests
|
||||
import random
|
||||
from loguru import logger
|
||||
|
||||
# config_private.py放自己的秘密如API和代理网址
|
||||
# 读取时首先看是否存在私密的config_private配置文件(不受git管控),如果有,则覆盖原config文件
|
||||
from toolbox import get_conf, update_ui, is_any_api_key, select_api_key, what_keys, clip_history
|
||||
from toolbox import trimmed_format_exc, is_the_upload_folder, read_one_api_model_name, log_chat
|
||||
from toolbox import ChatBotWithCookies, have_any_recent_upload_image_files, encode_image
|
||||
proxies, TIMEOUT_SECONDS, MAX_RETRY, API_ORG, AZURE_CFG_ARRAY = \
|
||||
get_conf('proxies', 'TIMEOUT_SECONDS', 'MAX_RETRY', 'API_ORG', 'AZURE_CFG_ARRAY')
|
||||
|
||||
timeout_bot_msg = '[Local Message] Request timeout. Network error. Please check proxy settings in config.py.' + \
|
||||
'网络错误,检查代理服务器是否可用,以及代理设置的格式是否正确,格式须是[协议]://[地址]:[端口],缺一不可。'
|
||||
|
||||
def get_full_error(chunk, stream_response):
|
||||
"""
|
||||
获取完整的从Openai返回的报错
|
||||
"""
|
||||
while True:
|
||||
try:
|
||||
chunk += next(stream_response)
|
||||
except:
|
||||
break
|
||||
return chunk
|
||||
|
||||
def make_multimodal_input(inputs, image_paths):
|
||||
image_base64_array = []
|
||||
for image_path in image_paths:
|
||||
path = os.path.abspath(image_path)
|
||||
base64 = encode_image(path)
|
||||
inputs = inputs + f'<br/><br/><div align="center"><img src="file={path}" base64="{base64}"></div>'
|
||||
image_base64_array.append(base64)
|
||||
return inputs, image_base64_array
|
||||
|
||||
def reverse_base64_from_input(inputs):
|
||||
# 定义一个正则表达式来匹配 Base64 字符串(假设格式为 base64="<Base64编码>")
|
||||
# pattern = re.compile(r'base64="([^"]+)"></div>')
|
||||
pattern = re.compile(r'<br/><br/><div align="center"><img[^<>]+base64="([^"]+)"></div>')
|
||||
# 使用 findall 方法查找所有匹配的 Base64 字符串
|
||||
base64_strings = pattern.findall(inputs)
|
||||
# 返回反转后的 Base64 字符串列表
|
||||
return base64_strings
|
||||
|
||||
def contain_base64(inputs):
|
||||
base64_strings = reverse_base64_from_input(inputs)
|
||||
return len(base64_strings) > 0
|
||||
|
||||
def append_image_if_contain_base64(inputs):
|
||||
if not contain_base64(inputs):
|
||||
return inputs
|
||||
else:
|
||||
image_base64_array = reverse_base64_from_input(inputs)
|
||||
pattern = re.compile(r'<br/><br/><div align="center"><img[^><]+></div>')
|
||||
inputs = re.sub(pattern, '', inputs)
|
||||
res = []
|
||||
res.append({
|
||||
"type": "text",
|
||||
"text": inputs
|
||||
})
|
||||
for image_base64 in image_base64_array:
|
||||
res.append({
|
||||
"type": "image_url",
|
||||
"image_url": {
|
||||
"url": f"data:image/jpeg;base64,{image_base64}"
|
||||
}
|
||||
})
|
||||
return res
|
||||
|
||||
def remove_image_if_contain_base64(inputs):
|
||||
if not contain_base64(inputs):
|
||||
return inputs
|
||||
else:
|
||||
pattern = re.compile(r'<br/><br/><div align="center"><img[^><]+></div>')
|
||||
inputs = re.sub(pattern, '', inputs)
|
||||
return inputs
|
||||
|
||||
def decode_chunk(chunk):
|
||||
# 提前读取一些信息 (用于判断异常)
|
||||
chunk_decoded = chunk.decode()
|
||||
chunkjson = None
|
||||
has_choices = False
|
||||
choice_valid = False
|
||||
has_content = False
|
||||
has_role = False
|
||||
try:
|
||||
chunkjson = json.loads(chunk_decoded[6:])
|
||||
has_choices = 'choices' in chunkjson
|
||||
if has_choices: choice_valid = (len(chunkjson['choices']) > 0)
|
||||
if has_choices and choice_valid: has_content = ("content" in chunkjson['choices'][0]["delta"])
|
||||
if has_content: has_content = (chunkjson['choices'][0]["delta"]["content"] is not None)
|
||||
if has_choices and choice_valid: has_role = "role" in chunkjson['choices'][0]["delta"]
|
||||
except:
|
||||
pass
|
||||
return chunk_decoded, chunkjson, has_choices, choice_valid, has_content, has_role
|
||||
|
||||
from functools import lru_cache
|
||||
@lru_cache(maxsize=32)
|
||||
def verify_endpoint(endpoint):
|
||||
"""
|
||||
检查endpoint是否可用
|
||||
"""
|
||||
if "你亲手写的api名称" in endpoint:
|
||||
raise ValueError("Endpoint不正确, 请检查AZURE_ENDPOINT的配置! 当前的Endpoint为:" + endpoint)
|
||||
return endpoint
|
||||
|
||||
def predict_no_ui_long_connection(inputs:str, llm_kwargs:dict, history:list=[], sys_prompt:str="", observe_window:list=None, console_slience:bool=False):
|
||||
"""
|
||||
发送至chatGPT,等待回复,一次性完成,不显示中间过程。但内部用stream的方法避免中途网线被掐。
|
||||
inputs:
|
||||
是本次问询的输入
|
||||
sys_prompt:
|
||||
系统静默prompt
|
||||
llm_kwargs:
|
||||
chatGPT的内部调优参数
|
||||
history:
|
||||
是之前的对话列表
|
||||
observe_window = None:
|
||||
用于负责跨越线程传递已经输出的部分,大部分时候仅仅为了fancy的视觉效果,留空即可。observe_window[0]:观测窗。observe_window[1]:看门狗
|
||||
"""
|
||||
from request_llms.bridge_all import model_info
|
||||
|
||||
watch_dog_patience = 5 # 看门狗的耐心, 设置5秒即可
|
||||
|
||||
if model_info[llm_kwargs['llm_model']].get('openai_disable_stream', False): stream = False
|
||||
else: stream = True
|
||||
|
||||
headers, payload = generate_payload(inputs, llm_kwargs, history, system_prompt=sys_prompt, stream=stream)
|
||||
retry = 0
|
||||
while True:
|
||||
try:
|
||||
# make a POST request to the API endpoint, stream=False
|
||||
endpoint = verify_endpoint(model_info[llm_kwargs['llm_model']]['endpoint'])
|
||||
response = requests.post(endpoint, headers=headers, proxies=proxies,
|
||||
json=payload, stream=stream, timeout=TIMEOUT_SECONDS); break
|
||||
except requests.exceptions.ReadTimeout as e:
|
||||
retry += 1
|
||||
traceback.print_exc()
|
||||
if retry > MAX_RETRY: raise TimeoutError
|
||||
if MAX_RETRY!=0: logger.error(f'请求超时,正在重试 ({retry}/{MAX_RETRY}) ……')
|
||||
|
||||
if not stream:
|
||||
# 该分支仅适用于不支持stream的o1模型,其他情形一律不适用
|
||||
chunkjson = json.loads(response.content.decode())
|
||||
gpt_replying_buffer = chunkjson['choices'][0]["message"]["content"]
|
||||
return gpt_replying_buffer
|
||||
|
||||
stream_response = response.iter_lines()
|
||||
result = ''
|
||||
json_data = None
|
||||
while True:
|
||||
try: chunk = next(stream_response)
|
||||
except StopIteration:
|
||||
break
|
||||
except requests.exceptions.ConnectionError:
|
||||
chunk = next(stream_response) # 失败了,重试一次?再失败就没办法了。
|
||||
chunk_decoded, chunkjson, has_choices, choice_valid, has_content, has_role = decode_chunk(chunk)
|
||||
if len(chunk_decoded)==0: continue
|
||||
if not chunk_decoded.startswith('data:'):
|
||||
error_msg = get_full_error(chunk, stream_response).decode()
|
||||
if "reduce the length" in error_msg:
|
||||
raise ConnectionAbortedError("OpenAI拒绝了请求:" + error_msg)
|
||||
elif """type":"upstream_error","param":"307""" in error_msg:
|
||||
raise ConnectionAbortedError("正常结束,但显示Token不足,导致输出不完整,请削减单次输入的文本量。")
|
||||
else:
|
||||
raise RuntimeError("OpenAI拒绝了请求:" + error_msg)
|
||||
if ('data: [DONE]' in chunk_decoded): break # api2d 正常完成
|
||||
# 提前读取一些信息 (用于判断异常)
|
||||
if (has_choices and not choice_valid) or ('OPENROUTER PROCESSING' in chunk_decoded):
|
||||
# 一些垃圾第三方接口的出现这样的错误,openrouter的特殊处理
|
||||
continue
|
||||
json_data = chunkjson['choices'][0]
|
||||
delta = json_data["delta"]
|
||||
if len(delta) == 0: break
|
||||
if (not has_content) and has_role: continue
|
||||
if (not has_content) and (not has_role): continue # raise RuntimeError("发现不标准的第三方接口:"+delta)
|
||||
if has_content: # has_role = True/False
|
||||
result += delta["content"]
|
||||
if not console_slience: print(delta["content"], end='')
|
||||
if observe_window is not None:
|
||||
# 观测窗,把已经获取的数据显示出去
|
||||
if len(observe_window) >= 1:
|
||||
observe_window[0] += delta["content"]
|
||||
# 看门狗,如果超过期限没有喂狗,则终止
|
||||
if len(observe_window) >= 2:
|
||||
if (time.time()-observe_window[1]) > watch_dog_patience:
|
||||
raise RuntimeError("用户取消了程序。")
|
||||
else: raise RuntimeError("意外Json结构:"+delta)
|
||||
if json_data and json_data['finish_reason'] == 'content_filter':
|
||||
raise RuntimeError("由于提问含不合规内容被Azure过滤。")
|
||||
if json_data and json_data['finish_reason'] == 'length':
|
||||
raise ConnectionAbortedError("正常结束,但显示Token不足,导致输出不完整,请削减单次输入的文本量。")
|
||||
return result
|
||||
|
||||
|
||||
def predict(inputs:str, llm_kwargs:dict, plugin_kwargs:dict, chatbot:ChatBotWithCookies,
|
||||
history:list=[], system_prompt:str='', stream:bool=True, additional_fn:str=None):
|
||||
"""
|
||||
发送至chatGPT,流式获取输出。
|
||||
用于基础的对话功能。
|
||||
inputs 是本次问询的输入
|
||||
top_p, temperature是chatGPT的内部调优参数
|
||||
history 是之前的对话列表(注意无论是inputs还是history,内容太长了都会触发token数量溢出的错误)
|
||||
chatbot 为WebUI中显示的对话列表,修改它,然后yeild出去,可以直接修改对话界面内容
|
||||
additional_fn代表点击的哪个按钮,按钮见functional.py
|
||||
"""
|
||||
from request_llms.bridge_all import model_info
|
||||
if is_any_api_key(inputs):
|
||||
chatbot._cookies['api_key'] = inputs
|
||||
chatbot.append(("输入已识别为openai的api_key", what_keys(inputs)))
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="api_key已导入") # 刷新界面
|
||||
return
|
||||
elif not is_any_api_key(chatbot._cookies['api_key']):
|
||||
chatbot.append((inputs, "缺少api_key。\n\n1. 临时解决方案:直接在输入区键入api_key,然后回车提交。\n\n2. 长效解决方案:在config.py中配置。"))
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="缺少api_key") # 刷新界面
|
||||
return
|
||||
|
||||
user_input = inputs
|
||||
if additional_fn is not None:
|
||||
from core_functional import handle_core_functionality
|
||||
inputs, history = handle_core_functionality(additional_fn, inputs, history, chatbot)
|
||||
|
||||
# 多模态模型
|
||||
has_multimodal_capacity = model_info[llm_kwargs['llm_model']].get('has_multimodal_capacity', False)
|
||||
if has_multimodal_capacity:
|
||||
has_recent_image_upload, image_paths = have_any_recent_upload_image_files(chatbot, pop=True)
|
||||
else:
|
||||
has_recent_image_upload, image_paths = False, []
|
||||
if has_recent_image_upload:
|
||||
_inputs, image_base64_array = make_multimodal_input(inputs, image_paths)
|
||||
else:
|
||||
_inputs, image_base64_array = inputs, []
|
||||
chatbot.append((_inputs, ""))
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="等待响应") # 刷新界面
|
||||
|
||||
# 禁用stream的特殊模型处理
|
||||
if model_info[llm_kwargs['llm_model']].get('openai_disable_stream', False): stream = False
|
||||
else: stream = True
|
||||
|
||||
# check mis-behavior
|
||||
if is_the_upload_folder(user_input):
|
||||
chatbot[-1] = (inputs, f"[Local Message] 检测到操作错误!当您上传文档之后,需点击“**函数插件区**”按钮进行处理,请勿点击“提交”按钮或者“基础功能区”按钮。")
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="正常") # 刷新界面
|
||||
time.sleep(2)
|
||||
|
||||
try:
|
||||
headers, payload = generate_payload(inputs, llm_kwargs, history, system_prompt, image_base64_array, has_multimodal_capacity, stream)
|
||||
except RuntimeError as e:
|
||||
chatbot[-1] = (inputs, f"您提供的api-key不满足要求,不包含任何可用于{llm_kwargs['llm_model']}的api-key。您可能选择了错误的模型或请求源。")
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="api-key不满足要求") # 刷新界面
|
||||
return
|
||||
|
||||
# 检查endpoint是否合法
|
||||
try:
|
||||
endpoint = verify_endpoint(model_info[llm_kwargs['llm_model']]['endpoint'])
|
||||
except:
|
||||
tb_str = '```\n' + trimmed_format_exc() + '```'
|
||||
chatbot[-1] = (inputs, tb_str)
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="Endpoint不满足要求") # 刷新界面
|
||||
return
|
||||
|
||||
# 加入历史
|
||||
if has_recent_image_upload:
|
||||
history.extend([_inputs, ""])
|
||||
else:
|
||||
history.extend([inputs, ""])
|
||||
|
||||
retry = 0
|
||||
while True:
|
||||
try:
|
||||
# make a POST request to the API endpoint, stream=True
|
||||
response = requests.post(endpoint, headers=headers, proxies=proxies,
|
||||
json=payload, stream=stream, timeout=TIMEOUT_SECONDS);break
|
||||
except:
|
||||
retry += 1
|
||||
chatbot[-1] = ((chatbot[-1][0], timeout_bot_msg))
|
||||
retry_msg = f",正在重试 ({retry}/{MAX_RETRY}) ……" if MAX_RETRY > 0 else ""
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="请求超时"+retry_msg) # 刷新界面
|
||||
if retry > MAX_RETRY: raise TimeoutError
|
||||
|
||||
|
||||
if not stream:
|
||||
# 该分支仅适用于不支持stream的o1模型,其他情形一律不适用
|
||||
yield from handle_o1_model_special(response, inputs, llm_kwargs, chatbot, history)
|
||||
return
|
||||
|
||||
if stream:
|
||||
gpt_replying_buffer = ""
|
||||
is_head_of_the_stream = True
|
||||
stream_response = response.iter_lines()
|
||||
while True:
|
||||
try:
|
||||
chunk = next(stream_response)
|
||||
except StopIteration:
|
||||
# 非OpenAI官方接口的出现这样的报错,OpenAI和API2D不会走这里
|
||||
chunk_decoded = chunk.decode()
|
||||
error_msg = chunk_decoded
|
||||
# 首先排除一个one-api没有done数据包的第三方Bug情形
|
||||
if len(gpt_replying_buffer.strip()) > 0 and len(error_msg) == 0:
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="检测到有缺陷的非OpenAI官方接口,建议选择更稳定的接口。")
|
||||
break
|
||||
# 其他情况,直接返回报错
|
||||
chatbot, history = handle_error(inputs, llm_kwargs, chatbot, history, chunk_decoded, error_msg)
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="非OpenAI官方接口返回了错误:" + chunk.decode()) # 刷新界面
|
||||
return
|
||||
|
||||
# 提前读取一些信息 (用于判断异常)
|
||||
chunk_decoded, chunkjson, has_choices, choice_valid, has_content, has_role = decode_chunk(chunk)
|
||||
|
||||
if is_head_of_the_stream and (r'"object":"error"' not in chunk_decoded) and (r"content" not in chunk_decoded):
|
||||
# 数据流的第一帧不携带content
|
||||
is_head_of_the_stream = False; continue
|
||||
|
||||
if chunk:
|
||||
try:
|
||||
if (has_choices and not choice_valid) or ('OPENROUTER PROCESSING' in chunk_decoded):
|
||||
# 一些垃圾第三方接口的出现这样的错误, 或者OPENROUTER的特殊处理,因为OPENROUTER的数据流未连接到模型时会出现OPENROUTER PROCESSING
|
||||
continue
|
||||
if ('data: [DONE]' not in chunk_decoded) and len(chunk_decoded) > 0 and (chunkjson is None):
|
||||
# 传递进来一些奇怪的东西
|
||||
raise ValueError(f'无法读取以下数据,请检查配置。\n\n{chunk_decoded}')
|
||||
# 前者是API2D的结束条件,后者是OPENAI的结束条件
|
||||
if ('data: [DONE]' in chunk_decoded) or (len(chunkjson['choices'][0]["delta"]) == 0):
|
||||
# 判定为数据流的结束,gpt_replying_buffer也写完了
|
||||
log_chat(llm_model=llm_kwargs["llm_model"], input_str=inputs, output_str=gpt_replying_buffer)
|
||||
break
|
||||
# 处理数据流的主体
|
||||
status_text = f"finish_reason: {chunkjson['choices'][0].get('finish_reason', 'null')}"
|
||||
# 如果这里抛出异常,一般是文本过长,详情见get_full_error的输出
|
||||
if has_content:
|
||||
# 正常情况
|
||||
gpt_replying_buffer = gpt_replying_buffer + chunkjson['choices'][0]["delta"]["content"]
|
||||
elif has_role:
|
||||
# 一些第三方接口的出现这样的错误,兼容一下吧
|
||||
continue
|
||||
else:
|
||||
# 至此已经超出了正常接口应该进入的范围,一些垃圾第三方接口会出现这样的错误
|
||||
if chunkjson['choices'][0]["delta"]["content"] is None: continue # 一些垃圾第三方接口出现这样的错误,兼容一下吧
|
||||
gpt_replying_buffer = gpt_replying_buffer + chunkjson['choices'][0]["delta"]["content"]
|
||||
|
||||
history[-1] = gpt_replying_buffer
|
||||
chatbot[-1] = (history[-2], history[-1])
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg=status_text) # 刷新界面
|
||||
except Exception as e:
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="Json解析不合常规") # 刷新界面
|
||||
chunk = get_full_error(chunk, stream_response)
|
||||
chunk_decoded = chunk.decode()
|
||||
error_msg = chunk_decoded
|
||||
chatbot, history = handle_error(inputs, llm_kwargs, chatbot, history, chunk_decoded, error_msg)
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="Json解析异常" + error_msg) # 刷新界面
|
||||
logger.error(error_msg)
|
||||
return
|
||||
return # return from stream-branch
|
||||
|
||||
def handle_o1_model_special(response, inputs, llm_kwargs, chatbot, history):
|
||||
try:
|
||||
chunkjson = json.loads(response.content.decode())
|
||||
gpt_replying_buffer = chunkjson['choices'][0]["message"]["content"]
|
||||
log_chat(llm_model=llm_kwargs["llm_model"], input_str=inputs, output_str=gpt_replying_buffer)
|
||||
history[-1] = gpt_replying_buffer
|
||||
chatbot[-1] = (history[-2], history[-1])
|
||||
yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
|
||||
except Exception as e:
|
||||
yield from update_ui(chatbot=chatbot, history=history, msg="Json解析异常" + response.text) # 刷新界面
|
||||
|
||||
def handle_error(inputs, llm_kwargs, chatbot, history, chunk_decoded, error_msg):
|
||||
from request_llms.bridge_all import model_info
|
||||
openai_website = ' 请登录OpenAI查看详情 https://platform.openai.com/signup'
|
||||
if "reduce the length" in error_msg:
|
||||
if len(history) >= 2: history[-1] = ""; history[-2] = "" # 清除当前溢出的输入:history[-2] 是本次输入, history[-1] 是本次输出
|
||||
history = clip_history(inputs=inputs, history=history, tokenizer=model_info[llm_kwargs['llm_model']]['tokenizer'],
|
||||
max_token_limit=(model_info[llm_kwargs['llm_model']]['max_token'])) # history至少释放二分之一
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] Reduce the length. 本次输入过长, 或历史数据过长. 历史缓存数据已部分释放, 您可以请再次尝试. (若再次失败则更可能是因为输入过长.)")
|
||||
elif "does not exist" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], f"[Local Message] Model {llm_kwargs['llm_model']} does not exist. 模型不存在, 或者您没有获得体验资格.")
|
||||
elif "Incorrect API key" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] Incorrect API key. OpenAI以提供了不正确的API_KEY为由, 拒绝服务. " + openai_website)
|
||||
elif "exceeded your current quota" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] You exceeded your current quota. OpenAI以账户额度不足为由, 拒绝服务." + openai_website)
|
||||
elif "account is not active" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] Your account is not active. OpenAI以账户失效为由, 拒绝服务." + openai_website)
|
||||
elif "associated with a deactivated account" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] You are associated with a deactivated account. OpenAI以账户失效为由, 拒绝服务." + openai_website)
|
||||
elif "API key has been deactivated" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] API key has been deactivated. OpenAI以账户失效为由, 拒绝服务." + openai_website)
|
||||
elif "bad forward key" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] Bad forward key. API2D账户额度不足.")
|
||||
elif "Not enough point" in error_msg:
|
||||
chatbot[-1] = (chatbot[-1][0], "[Local Message] Not enough point. API2D账户点数不足.")
|
||||
else:
|
||||
from toolbox import regular_txt_to_markdown
|
||||
tb_str = '```\n' + trimmed_format_exc() + '```'
|
||||
chatbot[-1] = (chatbot[-1][0], f"[Local Message] 异常 \n\n{tb_str} \n\n{regular_txt_to_markdown(chunk_decoded)}")
|
||||
return chatbot, history
|
||||
|
||||
def generate_payload(inputs:str, llm_kwargs:dict, history:list, system_prompt:str, image_base64_array:list=[], has_multimodal_capacity:bool=False, stream:bool=True):
|
||||
"""
|
||||
整合所有信息,选择LLM模型,生成http请求,为发送请求做准备
|
||||
"""
|
||||
from request_llms.bridge_all import model_info
|
||||
|
||||
if not is_any_api_key(llm_kwargs['api_key']):
|
||||
raise AssertionError("你提供了错误的API_KEY。\n\n1. 临时解决方案:直接在输入区键入api_key,然后回车提交。\n\n2. 长效解决方案:在config.py中配置。")
|
||||
|
||||
if llm_kwargs['llm_model'].startswith('vllm-'):
|
||||
api_key = 'no-api-key'
|
||||
else:
|
||||
api_key = select_api_key(llm_kwargs['api_key'], llm_kwargs['llm_model'])
|
||||
|
||||
headers = {
|
||||
"Content-Type": "application/json",
|
||||
"Authorization": f"Bearer {api_key}"
|
||||
}
|
||||
if API_ORG.startswith('org-'): headers.update({"OpenAI-Organization": API_ORG})
|
||||
if llm_kwargs['llm_model'].startswith('azure-'):
|
||||
headers.update({"api-key": api_key})
|
||||
if llm_kwargs['llm_model'] in AZURE_CFG_ARRAY.keys():
|
||||
azure_api_key_unshared = AZURE_CFG_ARRAY[llm_kwargs['llm_model']]["AZURE_API_KEY"]
|
||||
headers.update({"api-key": azure_api_key_unshared})
|
||||
|
||||
if has_multimodal_capacity:
|
||||
# 当以下条件满足时,启用多模态能力:
|
||||
# 1. 模型本身是多模态模型(has_multimodal_capacity)
|
||||
# 2. 输入包含图像(len(image_base64_array) > 0)
|
||||
# 3. 历史输入包含图像( any([contain_base64(h) for h in history]) )
|
||||
enable_multimodal_capacity = (len(image_base64_array) > 0) or any([contain_base64(h) for h in history])
|
||||
else:
|
||||
enable_multimodal_capacity = False
|
||||
|
||||
conversation_cnt = len(history) // 2
|
||||
openai_disable_system_prompt = model_info[llm_kwargs['llm_model']].get('openai_disable_system_prompt', False)
|
||||
|
||||
if openai_disable_system_prompt:
|
||||
messages = [{"role": "user", "content": system_prompt}]
|
||||
else:
|
||||
messages = [{"role": "system", "content": system_prompt}]
|
||||
|
||||
if not enable_multimodal_capacity:
|
||||
# 不使用多模态能力
|
||||
if conversation_cnt:
|
||||
for index in range(0, 2*conversation_cnt, 2):
|
||||
what_i_have_asked = {}
|
||||
what_i_have_asked["role"] = "user"
|
||||
what_i_have_asked["content"] = remove_image_if_contain_base64(history[index])
|
||||
what_gpt_answer = {}
|
||||
what_gpt_answer["role"] = "assistant"
|
||||
what_gpt_answer["content"] = remove_image_if_contain_base64(history[index+1])
|
||||
if what_i_have_asked["content"] != "":
|
||||
if what_gpt_answer["content"] == "": continue
|
||||
if what_gpt_answer["content"] == timeout_bot_msg: continue
|
||||
messages.append(what_i_have_asked)
|
||||
messages.append(what_gpt_answer)
|
||||
else:
|
||||
messages[-1]['content'] = what_gpt_answer['content']
|
||||
what_i_ask_now = {}
|
||||
what_i_ask_now["role"] = "user"
|
||||
what_i_ask_now["content"] = inputs
|
||||
messages.append(what_i_ask_now)
|
||||
else:
|
||||
# 多模态能力
|
||||
if conversation_cnt:
|
||||
for index in range(0, 2*conversation_cnt, 2):
|
||||
what_i_have_asked = {}
|
||||
what_i_have_asked["role"] = "user"
|
||||
what_i_have_asked["content"] = append_image_if_contain_base64(history[index])
|
||||
what_gpt_answer = {}
|
||||
what_gpt_answer["role"] = "assistant"
|
||||
what_gpt_answer["content"] = append_image_if_contain_base64(history[index+1])
|
||||
if what_i_have_asked["content"] != "":
|
||||
if what_gpt_answer["content"] == "": continue
|
||||
if what_gpt_answer["content"] == timeout_bot_msg: continue
|
||||
messages.append(what_i_have_asked)
|
||||
messages.append(what_gpt_answer)
|
||||
else:
|
||||
messages[-1]['content'] = what_gpt_answer['content']
|
||||
what_i_ask_now = {}
|
||||
what_i_ask_now["role"] = "user"
|
||||
what_i_ask_now["content"] = []
|
||||
what_i_ask_now["content"].append({
|
||||
"type": "text",
|
||||
"text": inputs
|
||||
})
|
||||
for image_base64 in image_base64_array:
|
||||
what_i_ask_now["content"].append({
|
||||
"type": "image_url",
|
||||
"image_url": {
|
||||
"url": f"data:image/jpeg;base64,{image_base64}"
|
||||
}
|
||||
})
|
||||
messages.append(what_i_ask_now)
|
||||
|
||||
|
||||
model = llm_kwargs['llm_model']
|
||||
if llm_kwargs['llm_model'].startswith('api2d-'):
|
||||
model = llm_kwargs['llm_model'][len('api2d-'):]
|
||||
if llm_kwargs['llm_model'].startswith('one-api-'):
|
||||
model = llm_kwargs['llm_model'][len('one-api-'):]
|
||||
model, _ = read_one_api_model_name(model)
|
||||
if llm_kwargs['llm_model'].startswith('vllm-'):
|
||||
model = llm_kwargs['llm_model'][len('vllm-'):]
|
||||
model, _ = read_one_api_model_name(model)
|
||||
if llm_kwargs['llm_model'].startswith('openrouter-'):
|
||||
model = llm_kwargs['llm_model'][len('openrouter-'):]
|
||||
model= read_one_api_model_name(model)
|
||||
if model == "gpt-3.5-random": # 随机选择, 绕过openai访问频率限制
|
||||
model = random.choice([
|
||||
"gpt-3.5-turbo",
|
||||
"gpt-3.5-turbo-16k",
|
||||
"gpt-3.5-turbo-1106",
|
||||
"gpt-3.5-turbo-0613",
|
||||
"gpt-3.5-turbo-16k-0613",
|
||||
"gpt-3.5-turbo-0301",
|
||||
])
|
||||
|
||||
payload = {
|
||||
"model": model,
|
||||
"messages": messages,
|
||||
"temperature": llm_kwargs['temperature'], # 1.0,
|
||||
"top_p": llm_kwargs['top_p'], # 1.0,
|
||||
"n": 1,
|
||||
"stream": stream,
|
||||
}
|
||||
|
||||
return headers,payload
|
||||
|
||||
|
||||
@@ -224,7 +224,7 @@ def get_predict_function(
|
||||
try:
|
||||
if finish_reason == "stop":
|
||||
if not console_slience:
|
||||
print(f"[response] {result}")
|
||||
logger.info(f"[response] {result}")
|
||||
break
|
||||
result += response_text
|
||||
if observe_window is not None:
|
||||
|
||||
@@ -2,15 +2,14 @@ https://public.agent-matrix.com/publish/gradio-3.32.10-py3-none-any.whl
|
||||
fastapi==0.110
|
||||
gradio-client==0.8
|
||||
pypdf2==2.12.1
|
||||
httpx<=0.25.2
|
||||
zhipuai==2.0.1
|
||||
tiktoken>=0.3.3
|
||||
requests[socks]
|
||||
pydantic==2.9.2
|
||||
pydantic==2.5.2
|
||||
llama-index==0.10
|
||||
protobuf==3.20
|
||||
transformers>=4.27.1,<4.42
|
||||
scipdf_parser>=0.52
|
||||
spacy==3.7.4
|
||||
anthropic>=0.18.1
|
||||
python-markdown-math
|
||||
pymdown-extensions
|
||||
@@ -33,14 +32,3 @@ loguru
|
||||
arxiv
|
||||
numpy
|
||||
rich
|
||||
|
||||
|
||||
llama-index-core==0.10.68
|
||||
llama-index-legacy==0.9.48
|
||||
llama-index-readers-file==0.1.33
|
||||
llama-index-readers-llama-parse==0.1.6
|
||||
llama-index-embeddings-azure-openai==0.1.10
|
||||
llama-index-embeddings-openai==0.1.10
|
||||
llama-parse==0.4.9
|
||||
mdit-py-plugins>=0.3.3
|
||||
linkify-it-py==2.0.3
|
||||
@@ -94,7 +94,7 @@ def read_single_conf_with_lru_cache(arg):
|
||||
if r is None:
|
||||
log亮红('[PROXY] 网络代理状态:未配置。无代理状态下很可能无法访问OpenAI家族的模型。建议:检查USE_PROXY选项是否修改。')
|
||||
else:
|
||||
log亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', str(r))
|
||||
log亮绿('[PROXY] 网络代理状态:已配置。配置信息如下:', r)
|
||||
assert isinstance(r, dict), 'proxies格式错误,请注意proxies选项的格式,不要遗漏括号。'
|
||||
return r
|
||||
|
||||
|
||||
@@ -90,6 +90,23 @@ def make_history_cache():
|
||||
|
||||
|
||||
|
||||
# """
|
||||
# with gr.Row():
|
||||
# txt = gr.Textbox(show_label=False, placeholder="Input question here.", elem_id='user_input_main').style(container=False)
|
||||
# txtx = gr.Textbox(show_label=False, placeholder="Input question here.", elem_id='user_input_main').style(container=False)
|
||||
# with gr.Row():
|
||||
# btn_value = "Test"
|
||||
# elem_id = "TestCase"
|
||||
# variant = "primary"
|
||||
# input_list = [txt, txtx]
|
||||
# output_list = [txt, txtx]
|
||||
# input_name_list = ["txt(input)", "txtx(input)"]
|
||||
# output_name_list = ["txt", "txtx"]
|
||||
# js_callback = """(txt, txtx)=>{console.log(txt); console.log(txtx);}"""
|
||||
# def function(txt, txtx):
|
||||
# return "booo", "goooo"
|
||||
# create_button_with_javascript_callback(btn_value, elem_id, variant, js_callback, input_list, output_list, function, input_name_list, output_name_list)
|
||||
# """
|
||||
def create_button_with_javascript_callback(btn_value, elem_id, variant, js_callback, input_list, output_list, function, input_name_list, output_name_list):
|
||||
import gradio as gr
|
||||
middle_ware_component = gr.Textbox(visible=False, elem_id=elem_id+'_buffer')
|
||||
|
||||
@@ -34,9 +34,6 @@ def is_api2d_key(key):
|
||||
API_MATCH_API2D = re.match(r"fk[a-zA-Z0-9]{6}-[a-zA-Z0-9]{32}$", key)
|
||||
return bool(API_MATCH_API2D)
|
||||
|
||||
def is_openroute_api_key(key):
|
||||
API_MATCH_OPENROUTE = re.match(r"sk-or-v1-[a-zA-Z0-9]{64}$", key)
|
||||
return bool(API_MATCH_OPENROUTE)
|
||||
|
||||
def is_cohere_api_key(key):
|
||||
API_MATCH_AZURE = re.match(r"[a-zA-Z0-9]{40}$", key)
|
||||
@@ -92,10 +89,6 @@ def select_api_key(keys, llm_model):
|
||||
if llm_model.startswith('cohere-'):
|
||||
for k in key_list:
|
||||
if is_cohere_api_key(k): avail_key_list.append(k)
|
||||
|
||||
if llm_model.startswith('openrouter-'):
|
||||
for k in key_list:
|
||||
if is_openroute_api_key(k): avail_key_list.append(k)
|
||||
|
||||
if len(avail_key_list) == 0:
|
||||
raise RuntimeError(f"您提供的api-key不满足要求,不包含任何可用于{llm_model}的api-key。您可能选择了错误的模型或请求源(左上角更换模型菜单中可切换openai,azure,claude,cohere等请求源)。")
|
||||
|
||||
@@ -11,7 +11,7 @@ def not_chat_log_filter(record):
|
||||
|
||||
def formatter_with_clip(record):
|
||||
# Note this function returns the string to be formatted, not the actual message to be logged
|
||||
# record["extra"]["serialized"] = "555555"
|
||||
record["extra"]["serialized"] = "555555"
|
||||
max_len = 12
|
||||
record['function_x'] = record['function'].center(max_len)
|
||||
if len(record['function_x']) > max_len:
|
||||
|
||||
@@ -1,12 +0,0 @@
|
||||
"""
|
||||
对项目中的各个插件进行测试。运行方法:直接运行 python tests/test_plugins.py
|
||||
"""
|
||||
|
||||
import init_test
|
||||
import os, sys
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
from test_utils import plugin_test
|
||||
|
||||
plugin_test(plugin='crazy_functions.数学动画生成manim->动画生成', main_input="A point moving along function culve y=sin(x), starting from x=0 and stop at x=4*\pi.")
|
||||
@@ -8,17 +8,4 @@ import os, sys
|
||||
|
||||
if __name__ == "__main__":
|
||||
from test_utils import plugin_test
|
||||
plugin_test(
|
||||
plugin='crazy_functions.Social_Helper->I人助手',
|
||||
main_input="""
|
||||
添加联系人:
|
||||
艾德·史塔克:我的养父,他是临冬城的公爵。
|
||||
凯特琳·史塔克:我的养母,她对我态度冷淡,因为我是私生子。
|
||||
罗柏·史塔克:我的哥哥,他是北境的继承人。
|
||||
艾莉亚·史塔克:我的妹妹,她和我关系亲密,性格独立坚强。
|
||||
珊莎·史塔克:我的妹妹,她梦想成为一位淑女。
|
||||
布兰·史塔克:我的弟弟,他有预知未来的能力。
|
||||
瑞肯·史塔克:我的弟弟,他是个天真无邪的小孩。
|
||||
山姆威尔·塔利:我的朋友,他在守夜人军团中与我并肩作战。
|
||||
伊格瑞特:我的恋人,她是野人中的一员。
|
||||
""")
|
||||
plugin_test(plugin='crazy_functions.Social_Helper->I人助手', main_input="|")
|
||||
|
||||
在新工单中引用
屏蔽一个用户