Migrate extproc to dynamic modules when available in Envoy #90
Comments
cc @envoyproxy/ai-gateway-assignable @missBerg @wengyao04 @aabchoo |
Since dynamic modules won't be ready in Envoy by the end of the month (nor in the next release, Envoy v1.33, due in a few days), unfortunately we have to use extproc in the initial version. |
Per #90, we will migrate to dynamic modules in the second or third version of AI Gateway, so the public configuration package must not be coupled with extproc (see the description in the issue). This renames `extprocconfig` to the more generic (and accurate) name `filterconfig`. Signed-off-by: Takeshi Yoneda <[email protected]>
or if it's ok to use the main branch Envoy in our initial release (and possible delay of 1-3 weeks), then we could use the dynamic modules with the initial release. wdyt? |
I think we'd better delay the first release for better compatibility. |
makes sense |
ok so the plan would be
? |
So are you planning to implement the dynamic module in Rust by converting the existing extproc Go code in 0.2? |
Not sure if I'm using Rust or Go but yes that's my current plan |
I'd suggest using Go, as I believe many engineers involved in this project know Go. |
What would it take to make it Go friendly @mathetake ? ❤ |
yeah |
nothing specific needed from envoy side. just that we need to write code carefully on our side as i noted in the issue desc 👍 |
@mathetake Is there any doc I can read on how dynamic plugins are deployed? I just want to understand the benefits we are getting; proxy performance is usually not a concern for LLMs, as the bottleneck is always the inference time. |
The doc will be worked on in the next few months, but in summary: the module (a shared library) will be located inside the Envoy container and loaded inside the Envoy process, so from Envoy's point of view it's simply a native executable. For us, the only thing we need is to publish an "Envoy+AI Dynamic Module" container instead of the ExtProc container, which can be built as:
FROM envoyproxy/envoy:v1.33.0
COPY path/to/built/ai-dynamic-module.so /whatever/pre/configured/path/to/ai-dynamic-module.so
ENTRYPOINT ["envoy"]
Think of NGINX modules: this works exactly the same way as their module system. |
Then we won't need any extproc-related configuration such as the Deployment, Service, ConfigMap, etc. The equivalent of the ConfigMap will simply be handled at the Envoy xDS configuration layer. |
Thanks! Would the secret credentials mounted on the Envoy proxy be accessible by the dynamic module? |
yes, anything is possible |
Is it possible to adopt an approach similar to how Istio handles Wasm plugins? For example, the .so could be stored in an OCI image, and the ai-extpro-controller would pull the new OCI image and inject it into Envoy dynamically? |
Unfortunately, as long as Go is used, hot reloading is not technically possible, since the Go runtime forbids loading multiple Go shared libraries in a process (and once one shared library is loaded, it's even impossible to unload it lol). So the answer depends on which language we use (Go or Rust). |
Then could we keep both if we decide to write it in Go? |
See golang/go#11100 and golang/go#65050 |
The more I've been learning about Rust now, Rust seems like a very solid language 👀 |
Yeah, solid in particular when used as a shared library, compared to Go. The Go runtime has global side effects on the process it is loaded into, e.g. installing its own signal handlers, which results in the aforementioned limitations like golang/go#11100 and golang/go#65050. That's why Envoy will not officially have a Go SDK (vs Rust): it comes with limitations (on the code that can be written) and uncertainty. So either works for me, and Go is less work initially for sure, but I'm not sure it's the best option in the long run. |
Per the discussion yesterday, I think we are leaning a bit towards Rust considering the limitations and bottleneck. |
As discussed in the community meeting (though I was not able to attend), the upcoming Envoy v1.34 (to be released in April) will support Dynamic Modules with all the features necessary to rewrite the extproc as a dynamic module. A dynamic module is a shared object that can be written in any language that implements the ABI specified in a single pure C header. It has been in active development since last summer (see the PRs so far) and is almost ready to be used by real end users like this project.
I am the author of the new feature in Envoy upstream, and am sure that we should adopt it due to multiple reasons:
In addition, dynamic modules have already been worked on by multiple (large) companies like Netflix, LinkedIn, etc., so I believe the only possible concern, "maturity", should not block us from adopting it given the benefits stated above.
Since the "official" language supported by Envoy initially will be Rust (and I think C++ later), I would recommend rewriting it in Rust. Go can also be supported with some care, which is not critical: the code must be written in a way that won't break the Go runtime. As long as I write the initial code, simply porting the existing Go code to a dynamic module would be fine as well.
On that note, the public configuration API package, currently named "extprocconfig", will be the "migration interface" accepted by both the extproc and dynamic module versions, so I will rename it to something like "aifilterconfig". Edited: rename done in #92.
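To make the "migration interface" idea concrete, here is a minimal sketch in which one configuration type is consumed by two interchangeable runtimes. Everything in it is an assumption for illustration: the `Config` fields, the JSON encoding, the `Runtime` interface, and the sample backend names are hypothetical and are not the project's actual `filterconfig` API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical stand-in for the shared filter configuration.
// The real fields live in the project's config package and are not shown here.
type Config struct {
	Backends []string `json:"backends"`
}

// ParseConfig decodes the shared configuration. Both runtimes below rely on
// this one function, so neither needs its own config format.
func ParseConfig(raw []byte) (*Config, error) {
	var c Config
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, fmt.Errorf("parse filter config: %w", err)
	}
	return &c, nil
}

// Runtime is a hypothetical abstraction over how the filter is hosted.
type Runtime interface {
	Name() string
	Apply(c *Config) string
}

// extProcRuntime stands in for the external-processor deployment.
type extProcRuntime struct{}

func (extProcRuntime) Name() string { return "extproc" }
func (r extProcRuntime) Apply(c *Config) string {
	return fmt.Sprintf("%s: %d backend(s)", r.Name(), len(c.Backends))
}

// dynamicModuleRuntime stands in for the in-process dynamic module.
type dynamicModuleRuntime struct{}

func (dynamicModuleRuntime) Name() string { return "dynamic-module" }
func (r dynamicModuleRuntime) Apply(c *Config) string {
	return fmt.Sprintf("%s: %d backend(s)", r.Name(), len(c.Backends))
}

func main() {
	// Sample input with made-up backend names.
	cfg, err := ParseConfig([]byte(`{"backends":["backend-a","backend-b"]}`))
	if err != nil {
		panic(err)
	}
	// The same parsed config is handed to either runtime unchanged.
	for _, r := range []Runtime{extProcRuntime{}, dynamicModuleRuntime{}} {
		fmt.Println(r.Apply(cfg))
	}
}
```

Because the config package knows nothing about either runtime, swapping extproc for a dynamic module would not have to change the user-facing configuration.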