
feature: allow ai-proxy to forward standard AI capabilities that are natively supported #1704

Merged — 13 commits, Feb 12, 2025

Conversation

@pepesi pepesi commented Jan 22, 2025

Support forwarding AI capabilities that providers natively support.

Ⅰ. Describe what this PR did

Related issues:
#1690
#1708
Tencent Hunyuan
#1544

Ⅱ. Does this pull request fix one issue?

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews


codecov-commenter commented Jan 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 43.61%. Comparing base (ef31e09) to head (ae9ae84).
Report is 289 commits behind head on main.

@@            Coverage Diff             @@
##             main    #1704      +/-   ##
==========================================
+ Coverage   35.91%   43.61%   +7.70%     
==========================================
  Files          69       76       +7     
  Lines       11576    12358     +782     
==========================================
+ Hits         4157     5390    +1233     
+ Misses       7104     6630     -474     
- Partials      315      338      +23     

see 70 files with indirect coverage changes

@pepesi pepesi marked this pull request as draft January 22, 2025 05:48
@pepesi pepesi force-pushed the feature-passthrough branch from e8083a3 to a3350ee Compare January 22, 2025 14:56
pepesi commented Jan 23, 2025

Setup script

# -*- coding: utf-8 -*-

from urllib.parse import urlparse
from kubernetes import client, config
from kubernetes.client.exceptions import ApiException


def apply_resources(name, url, token, wasm_url, namespace="higress-system"):
    """
    Create or update the Ingress, McpBridge, and WasmPlugin in apply fashion
    using the Kubernetes Python client.

    Args:
        name (str): Name used for the Ingress, the McpBridge registry entry, and the WasmPlugin match rule.
        url (str): Full provider URL (e.g. https://open.bigmodel.cn).
        token (str): API token for the WasmPlugin provider config.
        wasm_url (str): URL of the WasmPlugin module (e.g. http://192.168.31.51:8000/ai-proxy.wasm).
        namespace (str): Namespace for the resources; defaults to "higress-system".
    """
    # Load Kubernetes config (local kubeconfig, or in-cluster config inside a Pod)
    try:
        config.load_kube_config()  # use config.load_incluster_config() when running in a Pod
    except Exception as e:
        print(f"Failed to load Kubernetes config: {e}")
        return

    # Parse the URL
    parsed_url = urlparse(url)
    domain = parsed_url.hostname
    port = parsed_url.port or (443 if parsed_url.scheme == "https" else 80)
    protocol = parsed_url.scheme

    if not domain:
        print(f"Invalid URL: {url}")
        return

    # API clients
    networking_api = client.NetworkingV1Api()
    custom_objects_api = client.CustomObjectsApi()

    # API group and version for McpBridge
    mcpbridge_group = "networking.higress.io"
    mcpbridge_version = "v1"
    mcpbridge_plural = "mcpbridges"

    # API group and version for WasmPlugin
    wasmplugin_group = "extensions.higress.io"
    wasmplugin_version = "v1alpha1"
    wasmplugin_plural = "wasmplugins"

    # Registry entry to add to the McpBridge
    new_registry = {
        "domain": domain,
        "name": name,
        "port": port,
        "protocol": protocol,
        "type": "dns",
    }

    # Step 1: Apply McpBridge
    try:
        # Check whether the McpBridge exists
        existing_mcpbridge = custom_objects_api.get_namespaced_custom_object(
            group=mcpbridge_group,
            version=mcpbridge_version,
            namespace=namespace,
            plural=mcpbridge_plural,
            name="default",  # the McpBridge is always named "default"
        )
        print("McpBridge 'default' already exists; checking registries...")

        # Check whether the registries already contain this domain
        registries = existing_mcpbridge.get("spec", {}).get("registries", [])
        if any(registry["domain"] == domain for registry in registries):
            print(f"Domain {domain} is already in McpBridge 'default' registries; no update needed.")
        else:
            # Add the domain to the registries
            registries.append(new_registry)
            existing_mcpbridge["spec"]["registries"] = registries

            # Update the McpBridge
            custom_objects_api.replace_namespaced_custom_object(
                group=mcpbridge_group,
                version=mcpbridge_version,
                namespace=namespace,
                plural=mcpbridge_plural,
                name="default",
                body=existing_mcpbridge,
            )
            print(f"Added domain {domain} to McpBridge 'default' registries.")

    except ApiException as e:
        if e.status == 404:
            # Create the McpBridge if it does not exist
            print("McpBridge 'default' does not exist; creating...")
            mcpbridge_body = {
                "apiVersion": f"{mcpbridge_group}/{mcpbridge_version}",
                "kind": "McpBridge",
                "metadata": {
                    "name": "default",
                    "namespace": namespace,
                },
                "spec": {
                    "registries": [new_registry],
                },
            }
            custom_objects_api.create_namespaced_custom_object(
                group=mcpbridge_group,
                version=mcpbridge_version,
                namespace=namespace,
                plural=mcpbridge_plural,
                body=mcpbridge_body,
            )
            print(f"Created McpBridge 'default' with domain {domain}.")
        else:
            # Other errors
            print(f"Error handling McpBridge: {e.reason}")
            print(f"Details: {e.body}")

    # Step 2: Apply Ingress
    ingress = client.V1Ingress(
        api_version="networking.k8s.io/v1",
        kind="Ingress",
        metadata=client.V1ObjectMeta(
            name=name,
            namespace=namespace,
            annotations={
                "higress.io/backend-protocol": "HTTPS",
                "higress.io/destination": name + ".dns",
                "higress.io/proxy-ssl-name": domain,
                "higress.io/proxy-ssl-server-name": "on",
            },
            labels={
                "higress.io/resource-definer": "higress",
            },
        ),
        spec=client.V1IngressSpec(
            ingress_class_name="higress",
            rules=[
                client.V1IngressRule(
                    host=f"{name}.test.com",  # host derived from the resource name
                    http=client.V1HTTPIngressRuleValue(
                        paths=[
                            client.V1HTTPIngressPath(
                                path="/",
                                path_type="Prefix",
                                backend=client.V1IngressBackend(
                                    resource=client.V1TypedLocalObjectReference(
                                        api_group="networking.higress.io",
                                        kind="McpBridge",
                                        name=name,
                                    )
                                ),
                            )
                        ]
                    ),
                )
            ],
        ),
    )

    try:
        # Check whether the Ingress already exists
        _ = networking_api.read_namespaced_ingress(name=name, namespace=namespace)
        print(f"Ingress {name} already exists; updating...")
        # Replace it if it exists
        _ = networking_api.replace_namespaced_ingress(
            name=name, namespace=namespace, body=ingress
        )
        print(f"Ingress {name} updated successfully")
    except ApiException as e:
        if e.status == 404:
            # Create it if it does not exist
            print(f"Ingress {name} does not exist; creating...")
            _ = networking_api.create_namespaced_ingress(
                namespace=namespace, body=ingress
            )
            print(f"Ingress {name} created successfully")
        else:
            # Other errors
            print(f"Error handling Ingress: {e.reason}")
            print(f"Details: {e.body}")

    # Step 3: Apply WasmPlugin
    try:
        # Check whether the WasmPlugin exists
        existing_wasmplugin = custom_objects_api.get_namespaced_custom_object(
            group=wasmplugin_group,
            version=wasmplugin_version,
            namespace=namespace,
            plural=wasmplugin_plural,
            name="ai-proxy",
        )
        print("WasmPlugin 'ai-proxy' already exists; checking matchRules...")

        # Check whether matchRules already contain this provider type
        match_rules = existing_wasmplugin.get("spec", {}).get("matchRules", [])
        if any(rule["config"]["provider"]["type"] == name for rule in match_rules):
            print(f"Type {name} is already in WasmPlugin 'ai-proxy' matchRules; no update needed.")
        else:
            # Add the provider type to matchRules
            match_rules.append(
                {
                    "config": {
                        "provider": {
                            "apiTokens": [token],
                            "type": name,
                        },
                    },
                    "ingress": [name],
                }
            )
            existing_wasmplugin["spec"]["matchRules"] = match_rules
            existing_wasmplugin["spec"]["url"] = wasm_url  # refresh the module URL

            # Update the WasmPlugin
            custom_objects_api.replace_namespaced_custom_object(
                group=wasmplugin_group,
                version=wasmplugin_version,
                namespace=namespace,
                plural=wasmplugin_plural,
                name="ai-proxy",
                body=existing_wasmplugin,
            )
            print(
                f"Added type {name} and its token to WasmPlugin 'ai-proxy' matchRules and updated the URL."
            )

    except ApiException as e:
        if e.status == 404:
            # Create the WasmPlugin if it does not exist
            print("WasmPlugin 'ai-proxy' does not exist; creating...")
            wasmplugin_body = {
                "apiVersion": f"{wasmplugin_group}/{wasmplugin_version}",
                "kind": "WasmPlugin",
                "metadata": {
                    "name": "ai-proxy",
                    "namespace": namespace,
                    "labels": {
                        "higress.io/resource-definer": "higress",
                        "higress.io/wasm-plugin-built-in": "true",
                        "higress.io/wasm-plugin-category": "custom",
                        "higress.io/wasm-plugin-name": "ai-proxy",
                        "higress.io/wasm-plugin-version": "1.0.0",
                    },
                },
                "spec": {
                    "defaultConfigDisable": True,
                    "matchRules": [
                        {
                            "config": {
                                "provider": {
                                    "apiTokens": [token],
                                    "type": name,
                                },
                            },
                            "ingress": [name],
                        }
                    ],
                    "phase": "UNSPECIFIED_PHASE",
                    "priority": 100,
                    "url": wasm_url,
                },
            }
            custom_objects_api.create_namespaced_custom_object(
                group=wasmplugin_group,
                version=wasmplugin_version,
                namespace=namespace,
                plural=wasmplugin_plural,
                body=wasmplugin_body,
            )
            print(f"Created WasmPlugin 'ai-proxy' with type {name}, its token, and the module URL.")
        else:
            # Other errors
            print(f"Error handling WasmPlugin: {e.reason}")
            print(f"Details: {e.body}")


zhipuai = (
    "zhipuai",
    "https://open.bigmodel.cn",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
hunyuan = (
    "hunyuan",
    "https://api.hunyuan.cloud.tencent.com",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
baichuan = (
    "baichuan",
    "https://api.baichuan-ai.com",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
deepseek = (
    "deepseek",
    "https://api.deepseek.com",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
doubao = (
    "doubao",
    "https://ark.cn-beijing.volces.com",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
cohere = (
    "cohere",
    "https://api.cohere.com",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
baidu = (
    "baidu",
    "https://qianfan.baidubce.com",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
ai360 = (
    "ai360",
    "https://api.360.cn/v1",
    "<replace-with-your-token>",
    "http://172.22.7.227:8000/ai-proxy.wasm?v=1"
)
apply_resources(*zhipuai)
apply_resources(*hunyuan)
apply_resources(*baichuan)
apply_resources(*deepseek)
apply_resources(*doubao)
apply_resources(*cohere)
apply_resources(*baidu)
apply_resources(*ai360)

Test script

import requests
import unittest
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,  # log level
    format="%(asctime)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)


def send_request_with_host_header(url, host_header, data=None):
    headers = {
        "Host": host_header,
    }

    try:
        logger.info(f"Sending request to {url} with Host header: {host_header}")
        response = requests.post(url, headers=headers, json=data)
        if data and data.get("stream"):
            chunks = []
            for chunk in response.iter_content(chunk_size=None):
                if chunk:
                    logger.debug(f"Streamed chunk: {chunk.decode('utf-8')}")
                    chunks.append(chunk)
            return True, chunks
        else:
            ret = response.json()  # parse once and reuse
            logger.debug(f"Response JSON: {ret}")
            return True, ret
    except requests.exceptions.RequestException as e:
        logger.error(f"Request failed: {e}")
        return False, None


def testimggen(host, model):
    url = "http://127.0.0.1:18080/v1/images/generations"
    data = {
        "model": model,
        "prompt": "Draw a cat",
    }

    logger.info(f"Testing image gen with host: {host}, model: {model}")
    ok, ret = send_request_with_host_header(url, host, data)
    if not ok:
        return False
    # TODO: validate the response format
    return ok


def testrerank(host, model):
    url = "http://127.0.0.1:18080/v1/rerank"
    data = {
        "model": model,
        "query": "What is the capital of the United States?",
        "top_n": 3,
        "documents": [
            "Carson City is the capital city of the American state of Nevada.",
            "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",  # noqa
            "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",  # noqa
            "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",  # noqa
            "Capital punishment has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."  # noqa
        ]
    }
    logger.info(f"Testing rerank with host: {host}, model: {model}")
    ok, ret = send_request_with_host_header(url, host, data)
    if not ok:
        return False
    # TODO: validate the response format
    return ok


def testchat(host, model, stream=True):
    url = "http://127.0.0.1:18080/v1/chat/completions"
    data = {
        "model": model,
        "messages": [{"role": "user", "content": "Say hello"}],
        "temperature": 0.7,
        "stream": stream,
    }

    logger.info(f"Testing chat with host: {host}, model: {model}, stream: {stream}")
    ok, ret = send_request_with_host_header(url, host, data)
    if not ok:
        return False
    # TODO: validate the response format
    return ok


def testemb(host, model):
    url = "http://127.0.0.1:18080/v1/embeddings"
    data = {"model": model, "input": ["hi", "there"]}

    logger.info(f"Testing embedding with host: {host}, model: {model}")
    ok, ret = send_request_with_host_header(url, host, data)
    if not ok:
        return False
    return len(ret.get("data", [])) > 0


chatcases = (
    ["hunyuan", "hunyuan-turbo"],
    ["zhipuai", "glm-4-long"],
    ["baichuan", "Baichuan4"],
    ["deepseek", "deepseek-chat"],
    ["doubao", "ep-20240912163747-dtz49"],
    ["cohere", "command-r-plus-08-2024"],
    ["baidu", "ernie-4.0-8k"],
    ["ai360", "360zhinao2-o1"],
    ["minimax", "abab6.5s-chat"],
    ["mistral", "mistral-large-latest"],
    ["moonshot", "moonshot-v1-8k"],
    ["qwen", "qwen-plus"],
)

embcases = [
    ["hunyuan", "hunyuan-embedding"],
    ["zhipuai", "embedding-3"],
    ["baichuan", "Baichuan-Text-Embedding"],
    ["doubao", "ep-20250123094256-lgnwk"],
    # ["cohere", "embed-english-v3.0"],
    ["baidu", "embedding-v1"],
    ["ai360", "embedding_s1_v1.2"],
    ["mistral", "mistral-embed"],
    ["qwen", "text-embedding-v3"],
]

imggencases = [
    ["zhipuai", "cog"],
]

rerankcases = [
    ["cohere", "rerank-v3.5"],
]

chat_test_cases = [
    {
        "name": f"test_chat_{i[0]}",
        "params": [f"{i[0]}.test.com", i[1]] + i[2:],
        "function": testchat,
    } for i in chatcases
]

embedding_test_cases = [
    {
        "name": f"test_embedding_{i[0]}",
        "params": (f"{i[0]}.test.com", i[1]),
        "function": testemb,
    } for i in embcases]


imggen_test_cases = [
    {
        "name": f"test_imggen_{i[0]}",
        "params": (f"{i[0]}.test.com", i[1]),
        "function": testimggen,
    } for i in imggencases]

rerank_test_cases = [
    {
        "name": f"test_rerank_{i[0]}",
        "params": (f"{i[0]}.test.com", i[1]),
        "function": testrerank,
    } for i in rerankcases]


# Test class populated dynamically below
class TestProcessData(unittest.TestCase):
    pass


# Build a test method dynamically
def create_test_method(function, params):
    def test_method(self):
        logger.info(f"Running test with params: {params}")
        result = function(*params)
        self.assertEqual(result, True, f"Failed for params: {params}")
        logger.info(f"Test passed for params: {params}")

    return test_method


# Attach a test method for each chat test case
for case in chat_test_cases:
    test_name = case["name"]
    test_method = create_test_method(case["function"], case["params"])
    setattr(TestProcessData, test_name, test_method)

# Attach a test method for each embedding test case
for case in embedding_test_cases:
    test_name = case["name"]
    test_method = create_test_method(case["function"], case["params"])
    setattr(TestProcessData, test_name, test_method)

for case in imggen_test_cases:
    test_name = case["name"]
    test_method = create_test_method(case["function"], case["params"])
    setattr(TestProcessData, test_name, test_method)

for case in rerank_test_cases:
    test_name = case["name"]
    test_method = create_test_method(case["function"], case["params"])
    setattr(TestProcessData, test_name, test_method)


if __name__ == "__main__":
    unittest.main()

@pepesi pepesi marked this pull request as ready for review January 23, 2025 02:21
pepesi commented Jan 23, 2025

WIP

@pepesi pepesi force-pushed the feature-passthrough branch from 5d8c338 to 7025c0e Compare January 23, 2025 05:37
pepesi commented Jan 23, 2025

Registered zhipuai's cogview-4 model, with a modelMapping to "cog".

(screenshot)

Test coverage

See the scripts above for details:
chat completions
embeddings
image generation
rerank

Test logs

Image generation and modelMapping results:
(screenshot)

rerank results:
(screenshot)

@johnlanni johnlanni left a comment

One more suggestion for this change: when an unsupported API name is detected, don't return an error directly. Instead, log that ai-proxy will skip protocol handling for the current request, then bypass the plugin logic and pass the request through. That way a user who has ai-proxy enabled and calls OpenAI's realtime API alongside the completions API won't get errors.

@@ -42,7 +42,7 @@ description: AI Proxy plugin configuration reference
| `customSettings` | array of customSetting | Optional | - | Override or fill in parameters for AI requests |
| `failover` | object | Optional | - | Failover policy for apiTokens: an unavailable apiToken is removed from the list and re-added once it passes health checks |
| `retryOnFailure` | object | Optional | - | Retry immediately when a request fails |

| `capabilities` | array of string | Optional | - | Some providers' AI capabilities are natively compatible with the openai/v1 format and can be forwarded directly without rewriting; use this option to enable forwarding. Currently supported: openai/v1/chatcompletions, openai/v1/embeddings, openai/v1/imagegeneration, openai/v1/audiospeech, openai/v1/audiotranscription |
johnlanni (Collaborator):

Is this config option still needed? It looks like every model already has a built-in mapping.

pepesi (Contributor, Author):

Yes, I've been thinking about this too. The main issue is that for some providers we haven't been able to test all of their supported capabilities in detail. Earlier versions mainly adapted the chat API and didn't cover the other APIs. When a user needs a non-chat API, this option lets them enable it quickly through config. Cohere, for example, supports embeddings, but the code doesn't enable that by default; a user who needs it can get there with a config change.

johnlanni (Collaborator):

OK. But then the type and the example for this config option need fixing; they don't match the config-parsing logic, which parses a map rather than an array of strings, right?

pepesi (Contributor, Author):

It's parsed as a map[string]string. The example is wrong; I'll fix it shortly.
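Since the option is parsed as a map[string]string, a corrected example would presumably pair each capability name with the provider path it forwards to. The sketch below uses the same Python dict form as the setup script above; the `cohere` pairing and the `/v1/embed` path value are illustrative assumptions, not the plugin's confirmed schema.

```python
# Hypothetical shape of a map-style `capabilities` entry, in the dict form
# the setup script above uses for the WasmPlugin provider config.
# The capability names come from this PR's docs; the path value is an
# assumption for illustration only.
provider_config = {
    "type": "cohere",
    "apiTokens": ["<replace-with-your-token>"],
    "capabilities": {
        # capability name -> native provider path to forward to (illustrative)
        "openai/v1/embeddings": "/v1/embed",
    },
}
```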

plugins/wasm-go/extensions/ai-proxy/main.go (thread resolved)

johnlanni commented Jan 23, 2025

@pepesi Is the modelMapping logic the same across the different API protocols?

I ask because modelMapping has been split out into a standalone plugin: https://github.com/alibaba/higress/tree/main/plugins/wasm-cpp/extensions/model_mapper

pepesi commented Jan 23, 2025

> @pepesi Is the modelMapping logic the same across the different API protocols?
>
> I ask because modelMapping has been split out into a standalone plugin: https://github.com/alibaba/higress/tree/main/plugins/wasm-cpp/extensions/model_mapper

Let me look into it a bit more.

@pepesi pepesi force-pushed the feature-passthrough branch from 1094293 to 78ced6e Compare January 23, 2025 08:05
@pepesi pepesi force-pushed the feature-passthrough branch from 5954df1 to d644443 Compare January 23, 2025 08:09
pepesi commented Jan 23, 2025

> @pepesi Is the modelMapping logic the same across the different API protocols?
>
> I ask because modelMapping has been split out into a standalone plugin: https://github.com/alibaba/higress/tree/main/plugins/wasm-cpp/extensions/model_mapper

That plugin is more flexible: it supports suffix-specific configuration and lets you choose the model key yourself. The modelMapping in the current ai-proxy plugin simply rewrites the "model" field in the JSON body of requests whose body already needs processing.
The two serve the same purpose but target somewhat different scenarios, so I don't think they conflict. Keeping modelMapping in ai-proxy saves using an extra plugin, and all models of a single provider can be remapped with one unified config.

As for the question itself: I checked, and the currently supported API protocols all carry a model field:
"openai/v1/chatcompletions"
"openai/v1/embeddings"
"openai/v1/imagegeneration"
"openai/v1/audiospeech"
"cohere/v1/rerank"

But for capabilities the user configures themselves, a model field can't be guaranteed.
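The in-plugin modelMapping behavior described above can be sketched as follows (the real implementation is Go inside ai-proxy; the function name and the "*" catch-all key here are illustrative assumptions): the `model` field of a JSON request body is rewritten per the configured mapping, and bodies without a `model` field pass through unchanged.

```python
import json

def rewrite_model(body_bytes, model_mapping):
    """Sketch of modelMapping: rewrite the "model" field in a JSON request
    body per the mapping; "*" acts as a catch-all (an assumption here).
    Bodies without a "model" field pass through unchanged."""
    body = json.loads(body_bytes)
    model = body.get("model")
    if model is None:
        return body_bytes  # e.g. a user-configured capability with no model field
    target = model_mapping.get(model) or model_mapping.get("*")
    if target:
        body["model"] = target
    return json.dumps(body).encode("utf-8")
```

For example, with the zhipuai mapping from the test above, a request carrying `{"model": "cog", ...}` would be rewritten to carry `"model": "cogview-4"` before being forwarded.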

pepesi commented Jan 23, 2025

> One more suggestion for this change: when an unsupported API name is detected, don't return an error directly. Instead, log that ai-proxy will skip protocol handling for the current request, then bypass the plugin logic and pass the request through. That way a user who has ai-proxy enabled and calls OpenAI's realtime API alongside the completions API won't get errors.

I've thought it over and would rather not do this for now, mainly because I suspect a security concern: by default we only restrict the paths the route can access and inject the token, and I'm not sure how much permission that token carries on the real service. A whitelist mechanism seems safer.

johnlanni (Collaborator):

> I've thought it over and would rather not do this for now, mainly because I suspect a security concern: by default we only restrict the paths the route can access and inject the token, and I'm not sure how much permission that token carries on the real service. A whitelist mechanism seems safer.

I don't think this is a security issue; the path restriction doesn't add any real security guarantee anyway. What users actually need is support for all API types. Alternatively, we could add a passthrough config option that, when set, allows the body to be passed through.

@pepesi
pepesi commented Jan 23, 2025

> I don't think this is a security issue; the path restriction doesn't add any real security guarantee anyway. What users actually need is support for all API types. Alternatively, we could add a passthrough config option that, when set, allows the body to be passed through.

Then let's make it a config option and leave the decision to the user.

johnlanni (Collaborator):

OK, let's add a standalone config option.
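The behavior agreed on here can be sketched as follows (Python for illustration only; the plugin itself is Go, and the `passthrough` flag name is taken from the suggestion in this thread, not necessarily the final config key):

```python
def handle_api_name(api_name, supported_apis, passthrough_enabled):
    """Decision shape from the discussion above: supported API names are
    handled by the plugin's protocol rewriting; unsupported ones are either
    passed through untouched (when the user opts in via a hypothetical
    `passthrough` flag) or rejected, keeping the default whitelist behavior."""
    if api_name in supported_apis:
        return "rewrite"      # plugin handles protocol conversion
    if passthrough_enabled:
        return "passthrough"  # forward the request untouched, with a log line
    return "reject"           # default whitelist behavior: return an error
```

This keeps the author's whitelist as the default while letting users who also call e.g. OpenAI's realtime API opt in to transparent forwarding.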

@pepesi pepesi force-pushed the feature-passthrough branch from 506dcaf to eb23ddf Compare January 24, 2025 02:43
@johnlanni johnlanni merged commit a84a382 into alibaba:main Feb 12, 2025
13 checks passed
@johnlanni johnlanni changed the title feature: allow ai-proxy to forward standard AI capabilities that are … feature: allow ai-proxy to forward standard AI capabilities that are natively supported Feb 12, 2025