feature: allow ai-proxy to forward standard AI capabilities that are natively supported #1704
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1704      +/-   ##
==========================================
+ Coverage   35.91%   43.61%   +7.70%
==========================================
  Files          69       76       +7
  Lines       11576    12358     +782
==========================================
+ Hits         4157     5390    +1233
+ Misses       7104     6630     -474
- Partials      315      338      +23
Setup script
Test script
WIP
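One more suggestion for this change: when an unsupported api name is detected, don't return an error directly. Instead, write a log entry saying that ai proxy will skip protocol handling for the current request, then skip the plugin logic and pass the request straight through. That way, a user who has ai proxy enabled and uses OpenAI's realtime API alongside the completions API won't get errors.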
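@@ -42,7 +42,7 @@ description: AI proxy plugin configuration reference
| `customSettings` | array of customSetting | Optional | - | Overrides or fills in parameters for AI requests |
| `failover` | object | Optional | - | Failover policy for apiToken: when an apiToken becomes unavailable it is removed from the apiToken list, and it is added back once it passes the health check |
| `retryOnFailure` | object | Optional | - | Retries immediately when a request fails |
| `capabilities` | array of string | Optional | - | Some AI capabilities of some providers are natively compatible with the openai/v1 format, so they need no rewriting and can be forwarded directly. This option specifies which capabilities to forward. Currently supported: openai/v1/chatcompletions, openai/v1/embeddings, openai/v1/imagegeneration, openai/v1/audiospeech, openai/v1/audiotranscription |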
Is this configuration item still needed? As far as I can see, every model already has a built-in mapping.
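Yes, I've been thinking about a few questions here as well.
The main point is that for some vendors we haven't been able to test every capability they support in detail. Earlier versions mostly adapted the chat API and paid no attention to the other APIs. If a user needs a non-chat API, they can get support quickly by changing the configuration. Take cohere: it supports embedding, but the code has no built-in default for it. A user who needs it can change the configuration to get there.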
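OK, but the type of this configuration item and its configuration example need to be changed. They don't match the config-parsing logic: doesn't that logic parse a map rather than an array of string?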
It parses a map[string]string. The example is wrong; I'll fix it shortly.
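To illustrate the mismatch raised in this thread, here is a minimal Go sketch in which a parser expecting map[string]string rejects the array-of-string form from the docs. The struct, the function name `parseCapabilities`, and the sample values are hypothetical, and `encoding/json` stands in for whatever JSON library the plugin actually uses. Treating the map value as an upstream path is also an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerConfig is a hypothetical stand-in for the plugin's provider config.
// Only the capabilities field is modeled here.
type providerConfig struct {
	// Key: capability name, e.g. "openai/v1/embeddings".
	// Value: assumed here to be the upstream path to forward to.
	Capabilities map[string]string `json:"capabilities"`
}

// parseCapabilities decodes the capabilities map and returns an error when
// the JSON shape does not match map[string]string.
func parseCapabilities(raw []byte) (map[string]string, error) {
	var cfg providerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parse capabilities: %w", err)
	}
	return cfg.Capabilities, nil
}

func main() {
	// Map form: matches what the parsing logic expects.
	good := []byte(`{"capabilities":{"openai/v1/embeddings":"/v1/embeddings"}}`)
	caps, err := parseCapabilities(good)
	fmt.Println(caps, err)

	// Array form, as the docs row describes it: fails to decode.
	bad := []byte(`{"capabilities":["openai/v1/embeddings"]}`)
	_, err = parseCapabilities(bad)
	fmt.Println("array form rejected:", err != nil)
}
```

With this shape, a docs example written as a list of names would fail at config load time, which is why the documented type and the example have to follow the map form.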
@pepesi Is the modelMapping logic the same for every API protocol? I ask because the modelMapping capability has now been split out into a separate plugin: https://github.com/alibaba/higress/tree/main/plugins/wasm-cpp/extensions/model_mapper
Let me look into it a bit more.
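That plugin is more flexible: it supports configuration for specific suffixes and lets you specify the model key yourself. The modelMapping in the current ai-proxy plugin works directly on requests whose body needs processing and replaces the "model" field in the JSON. On the question itself: I checked, and the API protocols we mainly support today all have a model field. For capabilities that users configure themselves, however, there is no guarantee that a model field exists.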
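A minimal Go sketch of the behavior described in this comment: replace the top-level "model" field when it is present and mapped, and leave the body untouched when the field is missing, as may happen with user-configured capabilities. The function name `mapModel` and the model names are placeholders, not the plugin's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mapModel rewrites the top-level "model" field of a JSON body according to
// mapping. It reports whether the body was changed. Bodies without a string
// "model" field, or with an unmapped model, are returned unchanged.
func mapModel(body []byte, mapping map[string]string) ([]byte, bool, error) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(body, &doc); err != nil {
		return nil, false, err
	}
	raw, ok := doc["model"]
	if !ok {
		return body, false, nil // no model field: nothing to map
	}
	var model string
	if err := json.Unmarshal(raw, &model); err != nil {
		return body, false, nil // model is not a string: leave as-is
	}
	target, ok := mapping[model]
	if !ok {
		return body, false, nil
	}
	encoded, err := json.Marshal(target)
	if err != nil {
		return nil, false, err
	}
	doc["model"] = encoded
	out, err := json.Marshal(doc)
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

func main() {
	mapping := map[string]string{"gpt-4": "qwen-turbo"} // placeholder names

	out, changed, err := mapModel([]byte(`{"model":"gpt-4","messages":[]}`), mapping)
	fmt.Println(string(out), changed, err)

	// A body with no model field passes through unchanged.
	out, changed, err = mapModel([]byte(`{"input":"hello"}`), mapping)
	fmt.Println(string(out), changed, err)
}
```

Re-encoding through a map does not preserve the original key order, which is acceptable for a sketch but may matter if the upstream is sensitive to byte-exact bodies.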
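I've thought it over and won't do this for now, mainly because I feel there could be a security issue: by default we only restrict the paths it can access and inject the token, and I'm not sure how much permission that token carries on the real service. An allowlist mechanism seems better.
As I understand it, there is no security issue, and the path restriction doesn't add any extra security either. What users actually need right now is support for every type of API. Alternatively, we could add a passthrough configuration item; when it is set, the body is allowed to pass through.
How about making it a configuration item and leaving the choice to the user?
OK, let's add a separate configuration item.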
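The outcome of this exchange can be sketched as a three-way decision: process supported APIs, pass unknown ones through only when the user opted in, and reject otherwise. This is a hedged Go sketch; the option name `passthrough`, the `config` struct, and the `decide` function are hypothetical and not taken from the PR.

```go
package main

import "fmt"

// action is what the plugin does with a request.
type action int

const (
	process     action = iota // apply ai-proxy protocol handling
	passThrough               // skip plugin logic, forward unchanged
	reject                    // return an error for the unsupported API
)

// config is a hypothetical view of the relevant settings.
type config struct {
	supported   map[string]bool // allowlist of API names the provider handles
	passthrough bool            // hypothetical opt-in for unknown APIs
}

// decide picks an action for the given API name.
func decide(cfg config, apiName string) action {
	switch {
	case cfg.supported[apiName]:
		return process
	case cfg.passthrough:
		// Log instead of failing, as suggested in the review.
		fmt.Printf("ai-proxy: %q is not supported, skipping protocol handling\n", apiName)
		return passThrough
	default:
		return reject
	}
}

func main() {
	cfg := config{
		supported:   map[string]bool{"openai/v1/chatcompletions": true},
		passthrough: true,
	}
	fmt.Println(decide(cfg, "openai/v1/chatcompletions") == process)
	fmt.Println(decide(cfg, "some/unknown/api") == passThrough)
}
```

Keeping the allowlist as the default and making passthrough an explicit opt-in reflects both positions in the thread: the cautious default and the user's ability to choose.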
Support forwarding AI capabilities that are natively supported.
Ⅰ. Describe what this PR did
Related issues
#1690
#1708
Tencent hunyuan
#1544
Ⅱ. Does this pull request fix one issue?
Ⅲ. Why don't you add test cases (unit test/integration test)?
Ⅳ. Describe how to verify it
Ⅴ. Special notes for reviews