feat(config): add support for remote embedding services via config.toml #4284

siimons wants to merge 2 commits into TabbyML:main from
Conversation
May I ask why you don't use this configuration?

```toml
[model.embedding.http]
kind = "ollama/embedding"
model_name = "nomic-embed-text"
api_endpoint = "http://localhost:11434"
```
Right - you can always configure a remote embedding through an HTTP endpoint, e.g. https://tabby.tabbyml.com/docs/references/models-http-api/llama.cpp/
I understand now that Tabby already supports HTTP-based embedding backends via the `[model.embedding.http]` configuration.

If you think this is redundant or beyond the scope of the Tabby configuration concept, no problem; I will understand a decision to reject or revise the PR. I just wanted to explain the motivation and perhaps start a dialogue.
Description
This pull request introduces support for remote embedding models in `config.toml`, enabling users to delegate embedding generation to external servers. This is particularly valuable for environments without a GPU or without the ability to run `llama-server` locally.

What's Changed

- Added support for an `[embedding]` section with `type = "remote"` and `endpoint` fields in `config.toml`.
- Updated `embedding::create()` and `model::load_embedding()` to support `ModelConfig::Http` (remote models).
- Prevents Tabby from launching the local `llama-server` process when using a remote embedding service.
- Keeps compatibility with local embeddings (no breaking changes).
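For reference, the proposed section could look like the sketch below (the field names `type` and `endpoint` come from the PR description; the URL is an illustrative placeholder):

```toml
# config.toml: hypothetical remote embedding configuration
[embedding]
type = "remote"
endpoint = "http://localhost:8080"  # server exposing an OpenAI-compatible embeddings API
```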
Example Usage
Run an embedding service like:
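The original code block here did not survive; one plausible command, assuming llama.cpp's `llama-server` (which has an embedding mode via the `--embedding` flag) and a local GGUF model file, would be:

```shell
# Serve embeddings locally with llama.cpp (model file and port are placeholders)
llama-server --embedding -m nomic-embed-text-v1.5.Q4_K_M.gguf --port 8080
```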
Then launch Tabby:
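The launch command was also lost in this copy; presumably something along these lines (the flags are assumptions, `--device cpu` being the usual choice on GPU-less machines):

```shell
# Start Tabby on CPU; with a remote [embedding] config it should not spawn its own llama-server
tabby serve --device cpu
```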
Motivation
Currently, Tabby always attempts to launch its internal `llama-server` binary, which fails on machines without a compatible GPU or CUDA libraries. This PR introduces flexibility and portability, enabling Tabby to run in lightweight environments with minimal dependencies.

How to Test
- Start Tabby with a valid `config.toml` that includes a remote embedding config.
- Verify that the local `llama-server` process is not launched.

Known Limitations
- This does not disable internal embeddings when `[embedding]` is omitted (the default behavior).
- The remote server must follow the expected API (e.g., `/v1/embeddings` in an OpenAI-compatible format).

Request for Review
Would love feedback on:

- Integration approach
- Potential edge cases to test
- Any docs you'd like me to include
Let me know if you'd like me to add a sample embedding server (Python FastAPI) or documentation PR as a follow-up!
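As context for the "expected API" note under Known Limitations, here is a sketch of the OpenAI-compatible `/v1/embeddings` exchange a remote server would implement (the model name and vector values are placeholders, not real output):

```python
import json

# Request body a client POSTs to /v1/embeddings (OpenAI-compatible shape).
request_body = {
    "model": "nomic-embed-text",
    "input": ["fn main() {}", "println!(\"hello\");"],
}

# A conforming server replies with one embedding object per input,
# ordered by `index`. Vector values below are placeholders.
response_body = {
    "object": "list",
    "model": "nomic-embed-text",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.01, -0.02, 0.03]},
        {"object": "embedding", "index": 1, "embedding": [0.04, 0.00, -0.05]},
    ],
    "usage": {"prompt_tokens": 8, "total_tokens": 8},
}

# Invariant: each input string gets exactly one vector back.
assert len(response_body["data"]) == len(request_body["input"])
print(json.dumps(request_body))
```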