discussion: add "standard routing header" that gets prioritized #73

Open
mathetake opened this issue Jan 9, 2025 · 1 comment
Labels
discussion To be discussed in community

Comments

@mathetake
Member

per #71 (comment)

Currently, users have to specify all routing conditions in LLMRoute.Rules. For example, if an AI Gateway user wants to let clients select the backend themselves, they need to define something like:

apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: LLMRoute
metadata:
  name: some-route
  namespace: default
spec:
  inputSchema:
    schema: OpenAI
  rules:
    - matches:
      - headers:
        - type: Exact
          name: some-random-header-that-can-be-sent-directly-by-clients
          value: foo
      backendRefs:
        - name: somebackend
 ....

and tell clients to send -H "some-random-header-that-can-be-sent-directly-by-clients: foo" to select somebackend. This is not that inconvenient, because at the end of the day there won't be hundreds of backends, only tens at most.

On the other hand, we could provide a "standard routing header", say x-ai-gateway-backend, that effectively bypasses LLMRoute.Rules and routes the request to the backend named in the header's value. To do so, we need to know at the HTTPRoute construction phase which backends can be reached via that header. One option is to allow routing to any backend in the same namespace as the LLMRoute; another is to route only to backends that appear in LLMRoute.Rules.
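To make the proposed precedence concrete, here is a minimal Python sketch of the routing decision. It is not the gateway's implementation: the rule representation (a dict with "headers" and "backend" keys), the function name select_backend, and the choice to fall back to the rules when the header names a backend that is not allowed are all assumptions made for illustration. Only the header name x-ai-gateway-backend comes from this issue.

```python
from typing import Optional

# Header name proposed in this issue; everything else here is hypothetical.
STANDARD_HEADER = "x-ai-gateway-backend"


def select_backend(
    headers: dict[str, str],
    rules: list[dict],
    allowed: set[str],
) -> Optional[str]:
    """Return the backend name for a request, or None if nothing matches."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # 1. The standard header is prioritized when it names an allowed backend.
    requested = normalized.get(STANDARD_HEADER)
    if requested in allowed:
        return requested

    # 2. Otherwise fall back to the explicit rules; the first match wins.
    #    (Assumption: a disallowed header value is ignored, not rejected.)
    for rule in rules:
        wanted = rule["headers"]
        if all(normalized.get(k.lower()) == v for k, v in wanted.items()):
            return rule["backend"]
    return None
```

With a single rule mapping the example header value foo to somebackend, a request carrying x-ai-gateway-backend: other is routed to other only if other is in the allowed set; without the standard header, the rule still applies as before.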

Personally, I feel this "implicit routing" might make things more confusing, but I agree it would provide a better experience. In fact, this is how the PoC works, because it didn't have the Rules and routing inside extproc.

I am opening this issue because I feel this is at least worth discussing.

cc @envoyproxy/assignable

@mathetake mathetake self-assigned this Jan 9, 2025
@mathetake mathetake added the discussion To be discussed in community label Jan 9, 2025
@Krishanx92
Contributor

Are we discussing keeping only one of the options? I would prefer to keep both options and prioritize the standard header if it is set. Additionally, I suggest leaving that decision to the user level by defining a flag in the LLMRoute. For example, if the gateway user enables the flag, clients can use the standard x-ai-gateway-backend header to specify which backend they want to route to.

To determine which backends can be routed via headers, the gateway should route the request to "somebackend" only if it appears in one of the LLMRoute.Rules or is in the same namespace, which seems fine to me. However, if needed in the future, we can introduce an explicit allowed-backend list in the spec as well.
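As a rough illustration of the opt-in flag and the two candidate policies, the set of header-routable backends could be computed as sketched below. The function name routable_backends, the enabled flag, and the policy values "rules" and "namespace" are hypothetical names chosen for this sketch; none of them exist in the LLMRoute API.

```python
def routable_backends(
    rule_backends: list[str],
    namespace_backends: list[str],
    enabled: bool,
    policy: str = "rules",
) -> set[str]:
    """Backends the standard header may select; empty when the flag is off."""
    # Hypothetical opt-in flag: header routing is disabled unless the
    # gateway user turns it on for the route.
    if not enabled:
        return set()
    # Policy "rules": only backends referenced by the route's rules.
    if policy == "rules":
        return set(rule_backends)
    # Policy "namespace": any backend in the route's namespace.
    if policy == "namespace":
        return set(namespace_backends)
    raise ValueError(f"unknown policy: {policy!r}")
```

An explicit allow-list, if it is ever added, would simply be a third policy that returns the user-supplied names.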

@mathetake mathetake removed their assignment Jan 14, 2025