
Support for providers outside of this codebase #658

Open
aidando73 opened this issue Dec 19, 2024 · 7 comments

Comments

@aidando73
Contributor

aidando73 commented Dec 19, 2024

🚀 Describe the new functionality needed

Something to consider for after our APIs are stable and the ecosystem matures a bit - but keen to know your thoughts.

We probably don't have the capacity to support all providers; if we try to, we risk becoming the bottleneck of the ecosystem.

There will be a lot of providers that might not be a good fit to be supported within this codebase. For example, I personally have thought about building:

  • OpenAI inference provider - because it allows me to compare and experiment between models
  • Memory with BM25 search - to improve retrievals

But having them as part of this codebase means the burden is on you guys to maintain them. Experimental implementations will mostly not work out, but a few could be very valuable - so it doesn't make much sense for them to live in this codebase.
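For context on the BM25 idea above: BM25 ranks chunks by term frequency and inverse document frequency with length normalization, which often beats plain embedding search on keyword-heavy queries. A minimal self-contained sketch - this is not llama-stack code, and the function name and the `k1`/`b` defaults are just illustrative:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` using BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    df = Counter()                      # document frequency per term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)                 # term frequency within this document
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores
```

A memory provider would compute these scores over chunked documents at query time and return the top-k chunks.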

💡 Why is this needed? What if we don't build it?

If we allow the community to build these providers, the community can drive development and the best providers will bubble up.

Users would be able to pick the best providers; if one isn't well supported, others can fork it. The good ones will accumulate stars, and over time we'd end up with a set of well-maintained, quality providers.

Is this something that you guys have thought about? wdyt?

@cheesecake100201
Contributor

I've been thinking about this for a while now too. Adding something like GPT models to a pretty robust agentic framework like llama-stack would honestly be quite a game changer. This would obviously require changes in both the llama-models repo and the llama-stack repo.

@terrytangyuan
Collaborator

I personally have thought about building:
OpenAI inference provider - because it allows me to compare and experiment between models

See my PR #195 and @ashwinb's rationale for closing it. I'd love to pick it up again if the maintainers are open to it.

@raghotham
Contributor

raghotham commented Dec 23, 2024

One of the key reasons to not add a blanket OpenAI-compat inference provider is what Ashwin articulated - there would be no way for us to test/showcase the "compliant providers" of llama models.

On the other hand, an OpenAI-specific inference provider seems useful to have for benchmark evals on OpenAI's models. Would love to hear thoughts on just creating providers for closed models like OpenAI, Claude, and Gemini.

Regarding supporting providers outside of this codebase, the question is around discoverability in addition to verification. How would we keep track of all the providers that are available for developers to use? Maybe we create "community-maintained" vs "Meta-maintained" providers?

  • Meta-maintained providers will be verified and kept up to date - including bug fixes, CVE patches, and support of new llama stack capabilities. In addition, Meta will work with provider authors to support capabilities of new Llama models.
  • Community-maintained providers have no guarantees; they can quickly become useless as Llama and Llama Stack progress.
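One way to make the discoverability side of this concrete would be a registry that records who maintains each provider. A hypothetical sketch - the field names and package names here are made up for illustration, not an existing llama-stack schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProviderEntry:
    name: str           # e.g. "openai"
    api: str            # which Llama Stack API it implements, e.g. "inference"
    pip_package: str    # where the out-of-tree implementation lives
    maintainer: str     # "meta" or "community"

# Meta-maintained entries would be verified and patched; community entries carry no guarantees.
REGISTRY = [
    ProviderEntry("meta-reference", "inference", "llama-stack", "meta"),
    ProviderEntry("openai", "inference", "llama-stack-provider-openai", "community"),
]

def community_entries():
    """List the providers that ship outside this codebase."""
    return [e for e in REGISTRY if e.maintainer == "community"]
```

Tooling could then surface the maintainer field to developers at install time, so the "no guarantees" caveat is visible up front.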

Thoughts?

@aidando73
Contributor Author

aidando73 commented Dec 23, 2024

Community Providers

Maybe we create "community-maintained" vs "Meta-maintained" providers?

^ Yeah, I think this would be a good distinction. The community-maintained ones have no guarantees and can be a lot more experimental and less stable, and we should communicate that that's the case.

But it's definitely not something we should support until we see some demand for it and our APIs are stable. We'll know there's demand for it if:

  • we see requests like: "Could we support X in Y provider? I've submitted a PR, but you guys haven't reviewed it or vetoed it - I think it's super necessary or useful for Z use-case" - where it doesn't make sense for us to support it.
  • the community submits more PRs to keep providers up to date than we have capacity to review, and we see community members getting frustrated because of it.

Closed Models

Regarding closed models (OpenAI, Claude, Gemini) - I see early signs of demand:

"Focus on Llama Models" - probably the only downside I see so far
[1]

@cheesecake100201 has expressed interest above, and I myself have gone through the pain of rewriting something against OpenAI so I could compare models.

I'm speculating that a big barrier to industry adoption of Llama-stack will be the question "What if we want to use GPT-4 or Claude in the future?" Devs would need to justify that risk to other stakeholders in their design docs before introducing llama-stack. Imo, being able to switch between providers would be a big selling point. E.g., "If we go with Llama-stack, we can easily switch between OpenAI, Anthropic and Fireworks"; it de-risks vendor lock-in.
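To illustrate the de-risking point: if inference sits behind one shared interface, swapping vendors becomes a config change rather than a rewrite. A toy sketch - `EchoProvider` and the `PROVIDERS` dict are hypothetical stand-ins, not real llama-stack APIs:

```python
from typing import Protocol

class InferenceProvider(Protocol):
    """The one interface application code depends on."""
    def complete(self, prompt: str) -> str: ...

class EchoProvider:
    """Stand-in for a real backend (OpenAI, Anthropic, Fireworks, ...)."""
    def __init__(self, vendor: str) -> None:
        self.vendor = vendor

    def complete(self, prompt: str) -> str:
        return f"[{self.vendor}] {prompt}"

PROVIDERS: dict[str, InferenceProvider] = {
    "openai": EchoProvider("openai"),
    "fireworks": EchoProvider("fireworks"),
}

def run(provider_key: str, prompt: str) -> str:
    # Switching vendors is just a different key - no application code changes.
    return PROVIDERS[provider_key].complete(prompt)
```

This is the shape of the argument above: the application only ever sees `InferenceProvider`, so "what if we want GPT-4 later" stops being an architectural risk.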

@raghotham But what is Meta's stance on this? Since Llama-stack is a Meta-sponsored project, are there any concerns with adding support for models outside of Meta? (I'm imagining you'll need to justify this decision to other stakeholders.)

Imo, it makes sense from a long-term + Customer Obsession approach: if Llama-stack is the go-to framework, there's more opportunity than if we were to only support Llama models. Two analogies:

  • VSCode is free and you can install other coding assistants, but the default is GitHub Copilot
  • Chrome is free and you can set whatever default search engine you want, but the default is Google

^ But this only works if they're the go-to IDE/browser, which is earned by consistently making product decisions that do right by the user - including when users access extensions/tools from other companies.

wdyt?

@cheesecake100201
Contributor

I agree with @aidando73 on this. Since Llama-stack is a Meta-sponsored project, it's understandable that they might not want models outside of the Llama family, but the advantages of adding closed models like GPT and Claude to the llama-stack framework would greatly outweigh the disadvantages, if there are any. It would also make llama-stack the go-to framework for developers across the globe.
cc: @raghotham @ashwinb

Contributor

This issue has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.

@github-actions github-actions bot added the stale label Mar 14, 2025
@terrytangyuan
Collaborator

Let's keep this open.
