# API for Open LLMs

<p align="center">
    <a href="https://github.com/xusenlinzy/api-for-open-llm"><img src="https://img.shields.io/github/license/xusenlinzy/api-for-open-llm"></a>
    <a href=""><img src="https://img.shields.io/badge/python-3.8+-aff.svg"></a>
    <a href=""><img src="https://img.shields.io/badge/pytorch-%3E=1.14-red?logo=pytorch"></a>
    <a href="https://github.com/xusenlinzy/api-for-open-llm"><img src="https://img.shields.io/github/last-commit/xusenlinzy/api-for-open-llm"></a>
    <a href="https://github.com/xusenlinzy/api-for-open-llm"><img src="https://img.shields.io/github/issues/xusenlinzy/api-for-open-llm?color=9cc"></a>
    <a href="https://github.com/xusenlinzy/api-for-open-llm"><img src="https://img.shields.io/github/stars/xusenlinzy/api-for-open-llm?color=ccf"></a>
    <a href="https://github.com/xusenlinzy/api-for-open-llm"><img src="https://img.shields.io/badge/language-py-brightgreen?style=flat&color=blue"></a>
</p>

<div align="center"> Image from the paper: <a href="https://arxiv.org/pdf/2303.18223.pdf">A Survey of Large Language Models</a> </div>
16 |
## 🐧 QQ Group: 870207830

## 📢 News
22 |
+ 【2024.02.26】 The QWEN2 models require the environment variables `MODEL_NAME=qwen2` and `PROMPT_NAME=qwen2`

+ 【2024.01.19】 Added support for the [InternLM2](https://github.com/InternLM/InternLM) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#internlm2)

+ 【2023.12.21】 Added request forwarding for the [TGI](https://github.com/huggingface/text-generation-inference) generation API and the [TEI](https://github.com/huggingface/text-embeddings-inference) embedding API

+ 【2023.12.06】 Added support for the [SUS-Chat-34B](https://huggingface.co/SUSTech/SUS-Chat-34B) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#suschat)

+ 【2023.11.24】 Added [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) inference support, [usage docs](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/LLAMA_CPP.md)

+ 【2023.11.03】 Added `function call` support for the `chatglm3` and `qwen` models, in both streaming and non-streaming modes, [tool-use example](https://github.com/xusenlinzy/api-for-open-llm/tree/master/examples/chatglm3/tool_using.py); a web `demo` is integrated into [streamlit-demo](./streamlit-demo)

+ 【2023.10.29】 Added support for the [ChatGLM3](https://github.com/THUDM/ChatGLM3) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#chatglm3), [tool-use example](https://github.com/xusenlinzy/api-for-open-llm/tree/master/examples/chatglm3)

+ 【2023.09.27】 Added support for the [Qwen-14B-Chat-Int4](https://huggingface.co/Qwen/Qwen-14B-Chat-Int4) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#qwen-14b-chat)

+ 【2023.09.07】 Added support for the [baichuan2](https://github.com/baichuan-inc/Baichuan2) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#baichuan2)

+ 【2023.08.28】 Added streaming output via `transformers.TextIteratorStreamer`; enable it by setting the environment variable `USE_STREAMER_V2=true`

+ 【2023.08.26】 Added support for the [code-llama](https://github.com/facebookresearch/codellama) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#code-llama), [usage example](https://github.com/xusenlinzy/api-for-open-llm/tree/master/examples/code-llama)

+ 【2023.08.23】 Added support for the [sqlcoder](https://huggingface.co/defog/sqlcoder) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#sqlcoder), [usage example](https://github.com/xusenlinzy/api-for-open-llm/blob/master/examples/sqlcoder/inference.py)

+ 【2023.08.22】 Added support for the [xverse-13b-chat](https://github.com/xverse-ai/XVERSE-13B) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#xverse-13b-chat)

+ 【2023.08.10】 Added [vLLM](https://github.com/vllm-project/vllm) support for inference acceleration and concurrent requests, [usage docs](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/VLLM_SCRIPT.md)

+ 【2023.08.03】 Added support for the [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B) model, [launch instructions](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md#qwen-7b-chat)

For more news and history, see [the news page](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/NEWS.md)
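The `function call` support noted in the 2023.11.03 entry is exposed through the OpenAI-compatible chat endpoint. As a rough sketch, a request payload in the OpenAI `tools` schema might look like the following; the `get_weather` tool, its fields, and the schema details are illustrative assumptions, not taken from this repo — see the linked tool-use example for the authoritative format:

```python
# Illustrative chat-completion request body with a tool definition,
# following the OpenAI "tools" schema. The get_weather tool is hypothetical.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "What is the weather in Beijing?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "stream": False,  # streaming mode is also supported for function calls
}
print(payload["tools"][0]["function"]["name"])  # get_weather
```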
68 |
---

**About this project**

This project implements a unified backend API for open-source LLM inference, with responses consistent with the `OpenAI` API. Features:

+ ✨ Call various open-source LLMs the same way you call the `OpenAI ChatGPT API`

+ 🖨️ Streaming responses with a typewriter effect

+ 📖 Text embedding models to support document-based question answering

+ 🦜️ Works with the full feature set of [`langchain`](https://github.com/hwchase17/langchain), a development toolkit for LLM applications

+ 🙌 Use open-source models as drop-in replacements for `chatgpt` by simply changing environment variables, providing a backend for all kinds of applications

+ 🚀 Load your own fine-tuned `lora` models

+ ⚡ [vLLM](https://github.com/vllm-project/vllm) support for inference acceleration and concurrent requests
96 |
## Table of Contents

| Section | Description |
|:-----------------------------------------------------------------------------------------------:|:-----------------------------:|
| [💁🏻♂Supported Models](https://github.com/xusenlinzy/api-for-open-llm#-支持模型) | Open-source models supported by this project, with brief details |
| [🚄Launch Guide](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/SCRIPT.md) | Environment setup and launch commands |
| [⚡vLLM Launch Guide](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/VLLM_SCRIPT.md) | Environment setup and launch commands with `vLLM` |
| [🦙llama-cpp Launch Guide](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/LLAMA_CPP.md) | Environment setup and launch commands with `llama-cpp` |
| [💻Usage](https://github.com/xusenlinzy/api-for-open-llm#-使用方式) | How to call the API after launching a model |
| [❓FAQ](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/FAQ.md) | Answers to common questions |
| [📚Resources](https://github.com/xusenlinzy/api-for-open-llm/blob/master/docs/RESOURCES.md) | Resources on training and inference for open-source models |
108 |
## 🐼 Supported Models

**Language models**

| Model | Base model | Parameters | Language | Weights |
|:---------------------------------------------------------------------:|:------------:|:--------:|:------:|:-----------------------------------------------------------------------------------------------------------:|
| [baichuan2](https://github.com/baichuan-inc/Baichuan2) | Baichuan | 7/13B | en, zh | [baichuan-inc/Baichuan2-13B-Chat](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat) |
| [codellama](https://github.com/facebookresearch/codellama) | LLaMA2 | 7/13/34B | multi | [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) |
| [xverse-13b-chat](https://github.com/xverse-ai/XVERSE-13B) | Xverse | 13B | multi | [xverse/XVERSE-13B-Chat](https://huggingface.co/xverse/XVERSE-13B-Chat) |
| [qwen-7b-chat](https://github.com/QwenLM/Qwen-7B) | Qwen | 7B | en, zh | [Qwen/Qwen-7B-Chat](https://huggingface.co/Qwen/Qwen-7B-Chat) |
| [baichuan-13b-chat](https://github.com/baichuan-inc/Baichuan-13B) | Baichuan | 13B | en, zh | [baichuan-inc/Baichuan-13B-Chat](https://huggingface.co/baichuan-inc/Baichuan-13B-Chat) |
| [InternLM](https://github.com/InternLM/InternLM) | InternLM | 7B | en, zh | [internlm/internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) |
| [InternLM2](https://github.com/InternLM/InternLM) | InternLM2 | 20B | en, zh | [internlm/internlm2-chat-20b](https://huggingface.co/internlm/internlm2-chat-20b) |
| [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) | GLM | 6/130B | en, zh | [THUDM/chatglm2-6b](https://huggingface.co/THUDM/chatglm2-6b) |
| [baichuan-7b](https://github.com/baichuan-inc/baichuan-7B) | Baichuan | 7B | en, zh | [baichuan-inc/baichuan-7B](https://huggingface.co/baichuan-inc/baichuan-7B) |
| [Guanaco](https://github.com/artidoro/qlora/tree/main) | LLaMA | 7/33/65B | en | [timdettmers/guanaco-33b-merged](https://huggingface.co/timdettmers/guanaco-33b-merged) |
| [YuLan-Chat](https://github.com/RUC-GSAI/YuLan-Chat) | LLaMA | 13/65B | en, zh | [RUCAIBox/YuLan-Chat-13b-delta](https://huggingface.co/RUCAIBox/YuLan-Chat-13b-delta) |
| [TigerBot](https://github.com/TigerResearch/TigerBot) | BLOOMZ | 7/180B | en, zh | [TigerResearch/tigerbot-7b-sft](https://huggingface.co/TigerResearch/tigerbot-7b-sft) |
| [OpenBuddy](https://github.com/OpenBuddy/OpenBuddy) | LLaMA, Falcon | 7B | multi | [OpenBuddy](https://huggingface.co/OpenBuddy) |
| [MOSS](https://github.com/OpenLMLab/MOSS) | CodeGen | 16B | en, zh | [fnlp/moss-moon-003-sft-int4](https://huggingface.co/fnlp/moss-moon-003-sft-int4) |
| [Phoenix](https://github.com/FreedomIntelligence/LLMZoo) | BLOOMZ | 7B | multi | [FreedomIntelligence/phoenix-inst-chat-7b](https://huggingface.co/FreedomIntelligence/phoenix-inst-chat-7b) |
| [BAIZE](https://github.com/project-baize/baize-chatbot) | LLaMA | 7/13/30B | en | [project-baize/baize-lora-7B](https://huggingface.co/project-baize/baize-lora-7B) |
| [Chinese-LLaMA-Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca) | LLaMA | 7/13B | en, zh | [ziqingyang/chinese-alpaca-plus-lora-7b](https://huggingface.co/ziqingyang/chinese-alpaca-plus-lora-7b) |
| [BELLE](https://github.com/LianjiaTech/BELLE) | BLOOMZ | 7B | zh | [BelleGroup/BELLE-7B-2M](https://huggingface.co/BelleGroup/BELLE-7B-2M) |
| [ChatGLM](https://github.com/THUDM/ChatGLM-6B) | GLM | 6B | en, zh | [THUDM/chatglm-6b](https://huggingface.co/THUDM/chatglm-6b) |
135 |
**Embedding models**

| Model | Dimensions | Weights |
|:----------------------:|:----:|:-----------------------------------------------------------------------------------:|
| bge-large-zh | 1024 | [bge-large-zh](https://huggingface.co/BAAI/bge-large-zh) |
| m3e-large | 1024 | [moka-ai/m3e-large](https://huggingface.co/moka-ai/m3e-large) |
| text2vec-large-chinese | 1024 | [text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese) |
145 |
## 🤖 Usage

### Environment Variables

+ `OPENAI_API_KEY`: any non-empty string works here

+ `OPENAI_API_BASE`: the address of the running backend API, e.g. http://192.168.0.xx:80/v1
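These two variables can simply be exported in the shell before starting a client application; a minimal sketch (the host address is the same placeholder as above — substitute your server's address and port):

```shell
# Point OpenAI-compatible clients at the local backend.
# The key can be any non-empty string; the base URL must end with /v1.
export OPENAI_API_KEY="EMPTY"
export OPENAI_API_BASE="http://192.168.0.xx:80/v1"
```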
154 |
### [Chat UI](./applications)

```shell
cd streamlit-demo
pip install -r requirements.txt
streamlit run streamlit_app.py
```
164 |
### [openai v1.1.0](https://github.com/openai/openai-python)

<details>
<summary>👉 Chat Completions</summary>

```python
from openai import OpenAI

client = OpenAI(
    api_key="EMPTY",
    base_url="http://192.168.20.59:7891/v1/",
)

# Chat Completions API
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "你好",
        }
    ],
    model="gpt-3.5-turbo",
)
print(chat_completion)
# 你好👋!我是人工智能助手 ChatGLM3-6B,很高兴见到你,欢迎问我任何问题。


# Streaming variant:
# stream = client.chat.completions.create(
#     messages=[
#         {
#             "role": "user",
#             "content": "感冒了怎么办",
#         }
#     ],
#     model="gpt-3.5-turbo",
#     stream=True,
# )
# for part in stream:
#     print(part.choices[0].delta.content or "", end="", flush=True)
```

</details>
206 |
<details>
<summary>👉 Completions</summary>

```python
from openai import OpenAI

client = OpenAI(
    api_key="EMPTY",
    base_url="http://192.168.20.59:7891/v1/",
)


# Completions API
completion = client.completions.create(
    model="gpt-3.5-turbo",
    prompt="你好",
)
print(completion)
# 你好👋!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。
```

</details>
229 |
<details>
<summary>👉 Embeddings</summary>

```python
from openai import OpenAI

client = OpenAI(
    api_key="EMPTY",
    base_url="http://192.168.20.59:7891/v1/",
)


# Compute the embedding of the text
embedding = client.embeddings.create(
    input="你好",
    model="text-embedding-ada-002"
)
print(embedding)
```

</details>
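The embedding vectors returned by this endpoint can back document question answering by ranking passages against a query. A minimal sketch of cosine-similarity ranking — the vectors below are tiny placeholders, not real model output:

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# In real usage, each vector would come from
# client.embeddings.create(...).data[0].embedding
query = [1.0, 0.0, 1.0]
passages = {"doc-a": [1.0, 0.0, 1.0], "doc-b": [0.0, 1.0, 0.0]}
best = max(passages, key=lambda k: cosine_similarity(query, passages[k]))
print(best)  # doc-a is the closest passage
```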
252 |
### Compatible Projects

**By changing the `OPENAI_API_BASE` environment variable, most `chatgpt` applications and front-end/back-end projects can plug in seamlessly!**

+ [ChatGPT-Next-Web: One-Click to deploy well-designed ChatGPT web UI on Vercel](https://github.com/Yidadaa/ChatGPT-Next-Web)

```shell
docker run -d -p 3000:3000 \
   -e OPENAI_API_KEY="sk-xxxx" \
   -e BASE_URL="http://192.168.0.xx:80" \
   yidadaa/chatgpt-next-web
```
266 |
+ [dify: An easy-to-use LLMOps platform designed to empower more people to create sustainable, AI-native applications](https://github.com/langgenius/dify)

```yaml
# Add the following environment variables to the api and worker services in docker-compose.yml
OPENAI_API_BASE: http://192.168.0.xx:80/v1
DISABLE_PROVIDER_CONFIG_VALIDATION: 'true'
```
276 |
## 📜 License

This project is licensed under the `Apache 2.0` license. See the [LICENSE](LICENSE) file for details.
283 |
## 🚧 References

+ [ChatGLM: An Open Bilingual Dialogue Language Model](https://github.com/THUDM/ChatGLM-6B)

+ [BLOOM: A 176B-Parameter Open-Access Multilingual Language Model](https://arxiv.org/abs/2211.05100)

+ [LLaMA: Open and Efficient Foundation Language Models](https://arxiv.org/abs/2302.13971v1)

+ [Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca)

+ [Phoenix: Democratizing ChatGPT across Languages](https://github.com/FreedomIntelligence/LLMZoo)

+ [MOSS: An open-sourced plugin-augmented conversational language model](https://github.com/OpenLMLab/MOSS)

+ [FastChat: An open platform for training, serving, and evaluating large language model based chatbots](https://github.com/lm-sys/FastChat)

+ [LangChain: Building applications with LLMs through composability](https://github.com/hwchase17/langchain)

+ [ChuanhuChatgpt](https://github.com/GaiZhenbiao/ChuanhuChatGPT)


## Star History

[Star History Chart](https://star-history.com/#xusenlinzy/api-for-open-llm&Date)