20 commits
- `78c4473` VoyageAI embeddings support (ggozad, Dec 26, 2025)
- `50b1f64` VoyageAI cassettes (ggozad, Dec 26, 2025)
- `a166cb7` Update docs for VoyageAI with example (ggozad, Dec 26, 2025)
- `aeb5ccc` Fix example tests (ggozad, Dec 26, 2025)
- `88ab61b` Refactor VoyageAI embeddings to use provider pattern (ggozad, Jan 9, 2026)
- `bcdd6f2` Add VoyageAI provider tests for full coverage (ggozad, Jan 9, 2026)
- `324e3bd` Fix coverage by restoring original infer_embedding_model structure (ggozad, Jan 15, 2026)
- `f079fa9` Add truncate setting to EmbeddingSettings base class (ggozad, Jan 15, 2026)
- `8526d4b` Simplify VoyageAIProvider to only accept api_key and voyageai_client (ggozad, Jan 15, 2026)
- `87c7b72` Add VoyageAI to API docs (ggozad, Jan 15, 2026)
- `093eaa5` Tests & vcr for truncate option in embeddings (ggozad, Jan 15, 2026)
- `b283b0b` Add voyageai_input_type setting (ggozad, Jan 15, 2026)
- `edde0de` Merge remote-tracking branch 'upstream/main' into voyageai-embeddings (ggozad, Jan 16, 2026)
- `95e2f45` Add voyage-4-large, voyage-4, voyage-4-lite to supported models (ggozad, Jan 16, 2026)
- `31d5e4a` Document truncate option, remove voyageai_truncation override (ggozad, Jan 16, 2026)
- `7820d20` Change voyageai_input_type to use none string instead of None for cla… (ggozad, Jan 16, 2026)
- `c35fa2c` Add overloads to VoyageAIProvider for mutually exclusive api_key/voya… (ggozad, Jan 16, 2026)
- `53e08af` Fix pyright ignores by using typed settings variables (ggozad, Jan 16, 2026)
- `0171011` Update docs, clean up overload signature of VoyageAIs __init__() (ggozad, Jan 19, 2026)
- `ca963f6` Merge upstream/main (ggozad, Jan 19, 2026)
2 changes: 2 additions & 0 deletions docs/api/embeddings.md
@@ -14,6 +14,8 @@

::: pydantic_ai.embeddings.google

::: pydantic_ai.embeddings.voyageai

::: pydantic_ai.embeddings.sentence_transformers

::: pydantic_ai.embeddings.test
2 changes: 2 additions & 0 deletions docs/api/providers.md
@@ -20,6 +20,8 @@

::: pydantic_ai.providers.cohere

::: pydantic_ai.providers.voyageai.VoyageAIProvider

::: pydantic_ai.providers.cerebras.CerebrasProvider

::: pydantic_ai.providers.mistral.MistralProvider
60 changes: 59 additions & 1 deletion docs/embeddings.md
@@ -342,6 +342,61 @@ embedder = Embedder(
)
```

### VoyageAI

[`VoyageAIEmbeddingModel`][pydantic_ai.embeddings.voyageai.VoyageAIEmbeddingModel] provides access to VoyageAI's embedding models, which are optimized for retrieval with specialized models for code, finance, and legal domains.

#### Install

To use VoyageAI embedding models, you need to install `pydantic-ai-slim` with the `voyageai` optional group:

```bash
pip/uv-add "pydantic-ai-slim[voyageai]"
```

#### Configuration

To use `VoyageAIEmbeddingModel`, go to [dash.voyageai.com](https://dash.voyageai.com/) to generate an API key. Once you have the API key, you can set it as an environment variable:

```bash
export VOYAGE_API_KEY='your-api-key'
```

You can then use the model:

```python {title="voyageai_embeddings.py"}
from pydantic_ai import Embedder

embedder = Embedder('voyageai:voyage-3.5')


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 1024
```

_(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)_

See the [VoyageAI Embeddings documentation](https://docs.voyageai.com/docs/embeddings) for available models.

#### VoyageAI-Specific Settings

VoyageAI models support additional settings via [`VoyageAIEmbeddingSettings`][pydantic_ai.embeddings.voyageai.VoyageAIEmbeddingSettings]:

```python {title="voyageai_settings.py"}
from pydantic_ai import Embedder
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingSettings

embedder = Embedder(
    'voyageai:voyage-3.5',
    settings=VoyageAIEmbeddingSettings(
        dimensions=512,  # Reduce output dimensions
        voyageai_input_type='document',  # Override input type for all requests
    ),
)
```

### Sentence Transformers (Local)

[`SentenceTransformerEmbeddingModel`][pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel] runs embeddings locally using the [sentence-transformers](https://www.sbert.net/) library. This is ideal for:
@@ -418,7 +473,10 @@ embedder = Embedder(model)

## Settings

[`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] provides common configuration options that work across providers.
[`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] provides common configuration options that work across providers:

- `dimensions`: Reduce the output embedding dimensions (supported by OpenAI, Google, Cohere, VoyageAI)
- `truncate`: When `True`, truncate input text that exceeds the model's context length instead of raising an error (supported by Cohere, VoyageAI)
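
For example, a minimal sketch that enables both options for a VoyageAI model at the embedder level (the values are illustrative):

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import EmbeddingSettings

embedder = Embedder(
    'voyageai:voyage-3.5',
    settings=EmbeddingSettings(
        dimensions=512,  # return smaller vectors
        truncate=True,  # truncate over-long inputs instead of raising an error
    ),
)
```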

Settings can be specified at the embedder level (applied to all calls) or per-call:

14 changes: 14 additions & 0 deletions pydantic_ai_slim/pydantic_ai/embeddings/__init__.py
@@ -48,6 +48,16 @@
        'cohere:embed-english-light-v3.0',
        'cohere:embed-multilingual-v3.0',
        'cohere:embed-multilingual-light-v3.0',
        'voyageai:voyage-4-large',
        'voyageai:voyage-4',
        'voyageai:voyage-4-lite',
        'voyageai:voyage-3-large',
        'voyageai:voyage-3.5',
        'voyageai:voyage-3.5-lite',
        'voyageai:voyage-code-3',
        'voyageai:voyage-finance-2',
        'voyageai:voyage-law-2',
        'voyageai:voyage-code-2',
    ],
)
"""Known model names that can be used with the `model` parameter of [`Embedder`][pydantic_ai.embeddings.Embedder].
@@ -104,6 +114,10 @@ def infer_embedding_model(
        from .sentence_transformers import SentenceTransformerEmbeddingModel

        return SentenceTransformerEmbeddingModel(model_name)
    elif model_kind == 'voyageai':
        from .voyageai import VoyageAIEmbeddingModel

        return VoyageAIEmbeddingModel(model_name, provider=provider)
    else:
        raise UserError(f'Unknown embeddings model: {model}')  # pragma: no cover

12 changes: 11 additions & 1 deletion pydantic_ai_slim/pydantic_ai/embeddings/cohere.py
@@ -76,6 +76,8 @@ class CohereEmbeddingSettings(EmbeddingSettings, total=False):
    - `'NONE'` (default): Raise an error if input exceeds max tokens.
    - `'END'`: Truncate the end of the input text.
    - `'START'`: Truncate the start of the input text.

    Note: This setting overrides the standard `truncate` boolean setting when specified.
    """


@@ -159,14 +161,22 @@ async def embed(
        if extra_body := settings.get('extra_body'):  # pragma: no cover
            request_options['additional_body_parameters'] = cast(dict[str, Any], extra_body)

        # Determine truncation strategy: cohere_truncate takes precedence over truncate
        if 'cohere_truncate' in settings:
            truncate = settings['cohere_truncate']
        elif settings.get('truncate'):
            truncate = 'END'
        else:
            truncate = 'NONE'

        try:
            response = await self._client.embed(
                model=self.model_name,
                texts=inputs,
                output_dimension=settings.get('dimensions'),
                input_type=cohere_input_type,
                max_tokens=settings.get('cohere_max_tokens'),
                truncate=settings.get('cohere_truncate', 'NONE'),
Collaborator comment: Let's specify on the cohere_truncate docstring that it overrides the truncate boolean

                truncate=truncate,
                request_options=request_options,
            )
        except ApiError as e:
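
As a usage sketch of the precedence implemented above (the model name is illustrative; `CohereEmbeddingSettings` and `cohere_truncate` are from this module):

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.cohere import CohereEmbeddingSettings

# cohere_truncate takes precedence, so 'START' is sent even though truncate=True implies 'END'
embedder = Embedder(
    'cohere:embed-english-v3.0',
    settings=CohereEmbeddingSettings(truncate=True, cohere_truncate='START'),
)
```
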
21 changes: 21 additions & 0 deletions pydantic_ai_slim/pydantic_ai/embeddings/settings.py
@@ -23,6 +23,27 @@ class EmbeddingSettings(TypedDict, total=False):
    * Cohere
    * Google
    * Sentence Transformers
    * VoyageAI
    """

    truncate: bool
    """Whether to truncate inputs that exceed the model's context length.
Collaborator comment:

- We should specify that the default is False
- I think it's worth explaining that you can use the max_input_tokens and count_tokens methods to implement your own (smarter) "truncation"


    Defaults to `False`. If `True`, inputs that are too long will be truncated.
    If `False`, an error will be raised for inputs that exceed the context length.

    For more control over truncation, you can use
    [`max_input_tokens()`][pydantic_ai.embeddings.Embedder.max_input_tokens] and
    [`count_tokens()`][pydantic_ai.embeddings.Embedder.count_tokens] to implement
    your own truncation logic.

    Provider-specific truncation settings (e.g., `cohere_truncate`) take precedence
    if specified.

    Supported by:

    * Cohere
    * VoyageAI
    """

    extra_headers: dict[str, str]
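
The docstring above points to `max_input_tokens()` and `count_tokens()` for smarter handling of over-long inputs. A rough sketch of that idea follows; the exact signatures of these `Embedder` methods are assumed here (e.g. that `count_tokens()` takes a single string), so adapt it to the real API:

```python
from pydantic_ai import Embedder

embedder = Embedder('voyageai:voyage-3.5')


async def embed_with_manual_truncation(text: str):
    # Assumed signatures: max_input_tokens() -> int | None, count_tokens(text) -> int
    limit = await embedder.max_input_tokens()
    if limit is not None and await embedder.count_tokens(text) > limit:
        # Naive character-based trim; a real implementation would cut on token boundaries
        text = text[: limit * 4]
    return await embedder.embed_query(text)
```
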
188 changes: 188 additions & 0 deletions pydantic_ai_slim/pydantic_ai/embeddings/voyageai.py
@@ -0,0 +1,188 @@
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass, field
from typing import Literal, cast

from pydantic_ai.exceptions import ModelAPIError
from pydantic_ai.providers import Provider, infer_provider
from pydantic_ai.usage import RequestUsage

from .base import EmbeddingModel, EmbedInputType
from .result import EmbeddingResult
from .settings import EmbeddingSettings

try:
    from voyageai.client_async import AsyncClient
    from voyageai.error import VoyageError
except ImportError as _import_error:
    raise ImportError(
        'Please install `voyageai` to use the VoyageAI embeddings model, '
        'you can use the `voyageai` optional group — `pip install "pydantic-ai-slim[voyageai]"`'
    ) from _import_error

LatestVoyageAIEmbeddingModelNames = Literal[
    'voyage-4-large',
    'voyage-4',
    'voyage-4-lite',
    'voyage-3-large',
    'voyage-3.5',
    'voyage-3.5-lite',
    'voyage-code-3',
    'voyage-finance-2',
    'voyage-law-2',
    'voyage-code-2',
]
"""Latest VoyageAI embedding models.

See [VoyageAI Embeddings](https://docs.voyageai.com/docs/embeddings)
for available models and their capabilities.
"""

VoyageAIEmbeddingModelName = str | LatestVoyageAIEmbeddingModelNames
"""Possible VoyageAI embedding model names."""

VoyageAIEmbedInputType = Literal['query', 'document', 'none']
"""VoyageAI embedding input types.

- `'query'`: For search queries; prepends retrieval-optimized prefix.
- `'document'`: For documents; prepends document retrieval prefix.
- `'none'`: Direct embedding without any prefix.
"""


class VoyageAIEmbeddingSettings(EmbeddingSettings, total=False):
"""Settings used for a VoyageAI embedding model request.

All fields from [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] are supported,
plus VoyageAI-specific settings prefixed with `voyageai_`.
"""

# ALL FIELDS MUST BE `voyageai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

voyageai_input_type: VoyageAIEmbedInputType
Collaborator comment: Hmm, if it only supports query and document anyway, I don't think a setting is warranted. If "direct embedding without prefix" is something users would want to do, I think we should make it 'none' instead of None so that the difference is clearer between this field being omitted (in which case we should use the default input type implied by the embed_query/document method) and it explicitly being set to none.

Author comment: The None option, according to their docs, does "raw" embedding. My guess is that it's useful if, say, you want to do clustering or classification. For retrieval one would use document or query.
Will change to none as you suggested.

Collaborator comment:
Hmm thinking about this more, what do you think about changing our EmbedInputType type to accept None as well, plus having Embedder.embed's input_type argument default to None? Then we would not need this custom setting at all anymore. I don't think any of the embeddings APIs require an input type, and if they do we can pick a reasonable default. OpenAI ignores the argument entirely anyway.

Then we would not need a new setting here at all, so even though it's kind of a separate task from this PR, I think it's worth trying it here so we don't introduce the new setting and then immediately deprecate it.

"""The VoyageAI-specific input type for the embedding.

Overrides the standard `input_type` argument. Options include:
`'query'`, `'document'`, or `'none'` for direct embedding without prefix.
"""


_MAX_INPUT_TOKENS: dict[VoyageAIEmbeddingModelName, int] = {
    'voyage-4-large': 32000,
    'voyage-4': 32000,
    'voyage-4-lite': 32000,
    'voyage-3-large': 32000,
    'voyage-3.5': 32000,
    'voyage-3.5-lite': 32000,
    'voyage-code-3': 32000,
    'voyage-finance-2': 32000,
    'voyage-law-2': 16000,
    'voyage-code-2': 16000,
}


@dataclass(init=False)
class VoyageAIEmbeddingModel(EmbeddingModel):
"""VoyageAI embedding model implementation.

VoyageAI provides state-of-the-art embedding models optimized for
retrieval, with specialized models for code, finance, and legal domains.

Example:
```python
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel

model = VoyageAIEmbeddingModel('voyage-3.5')
```
"""

_model_name: VoyageAIEmbeddingModelName = field(repr=False)
_provider: Provider[AsyncClient] = field(repr=False)

    def __init__(
        self,
        model_name: VoyageAIEmbeddingModelName,
        *,
        provider: Literal['voyageai'] | Provider[AsyncClient] = 'voyageai',
        settings: EmbeddingSettings | None = None,
    ):
        """Initialize a VoyageAI embedding model.

        Args:
            model_name: The name of the VoyageAI model to use.
                See [VoyageAI models](https://docs.voyageai.com/docs/embeddings)
                for available options.
            provider: The provider to use for authentication and API access. Can be:

                - `'voyageai'` (default): Uses the standard VoyageAI API
                - A [`VoyageAIProvider`][pydantic_ai.providers.voyageai.VoyageAIProvider] instance
                  for custom configuration
            settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings]
                to use as defaults for this model.
        """
        self._model_name = model_name

        if isinstance(provider, str):
            provider = infer_provider(provider)
        self._provider = provider

        super().__init__(settings=settings)

    @property
    def base_url(self) -> str:
        """The base URL for the provider API."""
        return self._provider.base_url

    @property
    def model_name(self) -> VoyageAIEmbeddingModelName:
        """The embedding model name."""
        return self._model_name

    @property
    def system(self) -> str:
        """The embedding model provider."""
        return self._provider.name

    async def embed(
Collaborator comment: It's a shame they don't (seem to) support counting tokens :(

        self,
        inputs: str | Sequence[str],
        *,
        input_type: EmbedInputType,
        settings: EmbeddingSettings | None = None,
    ) -> EmbeddingResult:
        inputs, settings = self.prepare_embed(inputs, settings)
        settings = cast(VoyageAIEmbeddingSettings, settings)

        voyageai_input_type: VoyageAIEmbedInputType = settings.get(
            'voyageai_input_type', 'document' if input_type == 'document' else 'query'
        )
        # Convert 'none' string to None for the API
        api_input_type = None if voyageai_input_type == 'none' else voyageai_input_type

        try:
            response = await self._provider.client.embed(
                texts=list(inputs),
                model=self.model_name,
                input_type=api_input_type,
                truncation=settings.get('truncate', False),
                output_dimension=settings.get('dimensions'),
            )
        except VoyageError as e:
            raise ModelAPIError(model_name=self.model_name, message=str(e)) from e

        return EmbeddingResult(
            embeddings=response.embeddings,
            inputs=inputs,
            input_type=input_type,
            usage=_map_usage(response.total_tokens),
            model_name=self.model_name,
            provider_name=self.system,
        )

    async def max_input_tokens(self) -> int | None:
        return _MAX_INPUT_TOKENS.get(self.model_name)


def _map_usage(total_tokens: int) -> RequestUsage:
    return RequestUsage(input_tokens=total_tokens)
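
A quick sketch of how the per-model limits above surface through the `Embedder` API, assuming `Embedder.max_input_tokens()` delegates to the model method defined here:

```python
import asyncio

from pydantic_ai import Embedder
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel


async def main():
    embedder = Embedder(VoyageAIEmbeddingModel('voyage-law-2'))
    print(await embedder.max_input_tokens())
    #> 16000


asyncio.run(main())
```
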
4 changes: 4 additions & 0 deletions pydantic_ai_slim/pydantic_ai/providers/__init__.py
@@ -161,6 +161,10 @@ def infer_provider_class(provider: str) -> type[Provider[Any]]:  # noqa: C901
        from .sentence_transformers import SentenceTransformersProvider

        return SentenceTransformersProvider
    elif provider == 'voyageai':
        from .voyageai import VoyageAIProvider

        return VoyageAIProvider
    else:  # pragma: no cover
        raise ValueError(f'Unknown provider: {provider}')

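A sketch of wiring the provider explicitly. Per the commit messages, `VoyageAIProvider` accepts `api_key` or a `voyageai_client`; the constructor is not shown in this diff, so treat the arguments below as assumptions:

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel
from pydantic_ai.providers.voyageai import VoyageAIProvider

# api_key usage is assumed from the PR description; passing a pre-configured
# voyageai_client is described as the mutually exclusive alternative
model = VoyageAIEmbeddingModel(
    'voyage-3.5',
    provider=VoyageAIProvider(api_key='your-api-key'),
)
embedder = Embedder(model)
```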