api: initial skeleton of LLMRoute and LLMBackend #20
Conversation
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
cc @arkodg
cc @sanjeewa-malalgoda (i would appreciate it if you can ping other folks from your org)
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@mathetake We reviewed this and it looks good to us.
// APISchema specifies the API schema of the input that the target Gateway(s) will receive.
// Based on this schema, the ai-gateway will perform the necessary transformation to the
// output schema specified in the selected LLMBackend during the routing process.
APISchema LLMAPISchema `json:"inputSchema"`
Suggested change: rename the JSON tag from `inputSchema` to `apiSchema`:
APISchema LLMAPISchema `json:"inputSchema"`
APISchema LLMAPISchema `json:"apiSchema"`
Are we going to allow a different APISchema for each LLM route if we define it at the route level?
If my understanding is correct, this indicates which vendor and model the route belongs to. Therefore, a single LLMRoute can correspond to only one vendor and model.
Are we going to allow different APISchema for different LLM route if we define at the route level?
I think the answer would be no at the moment, since I am not sure why a user would want to access the API like that - clients would need to use different sets of API clients and switch between them depending on the path? idk.
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
on second thought I kept the field name
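For readers skimming the thread, here is a minimal Go sketch of how the schema type behind this field could be declared, given that OpenAI is currently the only supported input schema per the discussion in this thread. The constant name and package are illustrative assumptions, not taken from this PR.

```go
package v1alpha1 // hypothetical package name for the sketch

// LLMAPISchema names the API schema spoken on one side of the ai-gateway, e.g.
// the schema of incoming requests (LLMRoute) or of a provider (LLMBackend).
type LLMAPISchema string

const (
	// APISchemaOpenAI is the OpenAI-compatible API schema; per the thread above,
	// it is the only input schema supported at this point.
	APISchemaOpenAI LLMAPISchema = "OpenAI"
)
```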
Suggested change (addition to the doc comment):
Based on this schema, the ai-gateway will perform the necessary transformation to the
output schema specified in the selected LLMBackend during the routing process.
Currently, the only supported schema is OpenAI as the input schema.
would it be less problematic to remove this text constraint and introduce Bedrock later once it's supported?
The constraint will be enforced by a CEL validation rule evaluated at the k8s API server - will do the follow-up soon.
good comment as usual, Adrian
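As a rough illustration of the follow-up mentioned above, the constraint can be attached to the field with a kubebuilder CEL marker so the k8s API server rejects unsupported values at admission time. The exact rule and message below are assumptions for illustration, not the merged change.

```go
// APISchema specifies the API schema of the input that the target Gateway(s) will receive.
// Currently, the only supported schema is OpenAI as the input schema.
//
// The marker below asks controller-gen to emit a CEL rule into the CRD so the
// API server validates the value on create/update (sketch only).
// +kubebuilder:validation:XValidation:rule="self == 'OpenAI'",message="currently only OpenAI is supported as the input schema"
APISchema LLMAPISchema `json:"inputSchema"`
```

Once Bedrock (or another schema) is supported, only the CEL rule needs to be relaxed; the field shape stays the same.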
Co-authored-by: Adrian Cole <64215+codefromthecrypt@users.noreply.github.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
This commit is a follow-up on #20. Basically, this makes LLMRoute a pure "addition" to the existing standardized HTTPRoute. This makes it possible to configure something like

```
kind: LLMRoute
metadata:
  name: llm-route
spec:
  inputSchema: OpenAI
  httpRouteRef:
    name: my-llm-route
---
kind: HTTPRoute
metadata:
  name: my-llm-route
spec:
  matches:
    - headers:
        key: x-envoy-ai-gateway-llm-model
        value: llama3-70b
  backendRefs:
    - kserve:
        weight: 20
    - aws-bedrock:
        weight: 80
```

where LLMRoute purely references the HTTPRoute, and users can configure whatever routing conditions they want in a standardized way via HTTPRoute while leveraging the LLM-specific information, in this case the x-envoy-ai-gateway-llm-model header.

In the implementation, though it's not merged yet, we have to do the routing calculation in the extproc by actually analyzing the referenced HTTPRoute and emulating its behavior in order to do the transformation. The reason is that the routing decision is generally made at the very end of the filter chain, and by the time we invoke the extproc, we don't have that information. Furthermore, `x-envoy-ai-gateway-llm-model` is not available before the extproc.

As a bonus, we no longer need TargetRef at the LLMRoute level since that lives within the HTTPRoute resources. This will really simplify the PoC implementation.

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
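To make the "pure addition" shape described in the commit message above concrete, here is a rough Go sketch of what the spec could look like after this change. The field names mirror the YAML example; the package name, the use of corev1.LocalObjectReference, and the reuse of the LLMAPISchema type sketched earlier are illustrative assumptions, not the merged implementation.

```go
package v1alpha1 // hypothetical package name for the sketch

import corev1 "k8s.io/api/core/v1"

// LLMRouteSpec keeps only the LLM-specific information and delegates the actual
// routing rules (matches, backendRefs, weights) to a referenced HTTPRoute.
type LLMRouteSpec struct {
	// InputSchema specifies the API schema of the requests arriving at the Gateway,
	// e.g. OpenAI. The ai-gateway transforms requests from this schema into the
	// schema of the backend selected during routing.
	InputSchema LLMAPISchema `json:"inputSchema"`

	// HTTPRouteRef names the HTTPRoute (in the same namespace) that holds the routing
	// conditions, such as matching on the x-envoy-ai-gateway-llm-model header.
	HTTPRouteRef corev1.LocalObjectReference `json:"httpRouteRef"`
}
```

With this shape, the extproc can look up the referenced HTTPRoute, evaluate its match rules against the request (including the model header), and pick the backend whose output schema drives the transformation.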
This adds the skeleton API of LLMRoute and LLMBackend.
These two resources will be the foundation for future
iterations, such as authn/z, token-based rate limiting,
schema transformation, and more advanced things like #10.
Note: we may (and likely will) break the APIs as necessity
comes up, until the initial release.
part of #13
cc @yuzisun
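Since the description names both resources, here is a hedged sketch of what the LLMBackend side could look like, based only on the doc comment earlier in the thread that says the output schema is specified in the selected LLMBackend; every name here is an assumption.

```go
package v1alpha1 // hypothetical package name for the sketch

// LLMBackendSpec describes a single LLM provider endpoint (sketch).
type LLMBackendSpec struct {
	// OutputSchema specifies the API schema this backend expects, e.g. OpenAI or
	// AWSBedrock. During routing, the ai-gateway transforms requests from the
	// LLMRoute's input schema into this schema before forwarding them.
	OutputSchema LLMAPISchema `json:"outputSchema"`
}
```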