-
Notifications
You must be signed in to change notification settings - Fork 1.5k
feat: add BedrockEmbeddingModel for Nova, Cohere and Titan endpoints
#4008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
BedrockEmbeddingModel for Nova, Cohere and Titan endpoints
| export AWS_REGION='us-east-1' | ||
| ``` | ||
| - **AWS credentials file** (`~/.aws/credentials`) | ||
| - **IAM roles** (when running on AWS infrastructure) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we do in https://ai.pydantic.dev/embeddings/#vertex-ai, let's link to the existing Bedrock doc that is the canonical source on how to configure the provider, that also has more details like that we support the AWS_BEARER_TOKEN_BEDROCK env var. So I'd prefer for that to be the main place we explain all the options, and for this to have an example + link there for more details; if you want to specifically mention ~/.aws/credentials for example, let's add it there, not here.
|
|
||
| #### Basic Usage | ||
|
|
||
| ```python {title="bedrock_embeddings.py" test="skip"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't skip testing examples unless we have no other option.
|
|
||
| #### Supported Models | ||
|
|
||
| Bedrock supports three families of embedding models: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This'll get outdated, let's link to Amazon's doc on this and have just 1 example here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: I see now that we specifically implement support these 3 families. So then it does make sense listing them, but I'd rather use subheadings so they show up in the ToC sidebar, and instead of one "Bedrock-Specific Settings", I think each model family section should have list its own settings, as they're model family specific right?
| embedder = Embedder('bedrock:eu.cohere.embed-english-v3') | ||
| ``` | ||
|
|
||
| The model automatically normalizes these prefixes when looking up `max_input_tokens()`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs to be specified
|
|
||
| #### Using a Custom Provider | ||
|
|
||
| For advanced configuration, you can create a [`BedrockProvider`][pydantic_ai.providers.bedrock.BedrockProvider] directly: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can have just one example here of building the provider yourself, and link to the existing bedrock docs for more details.
| str(response_body), | ||
| ) | ||
|
|
||
| return EmbeddingResult( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets simplify this method by returning only the embeddings + tokens + in some cases the response ID, and then build the EmbeddingResult where we call parse_response. Right now when we combine single-document runs, we deconstruct the individual EmbeddingResults and turn them into one combined one anyway, so let's skip the intermediate EmbeddingResult. That also means we no longer have to pass inputs, model_name etc into this method anymore.
| def mock_vcr_botocore_content_length(mocker: MockerFixture): | ||
| # VCR doesn't properly handle botocore's content-length verification when replaying responses. | ||
| # This causes IncompleteReadError when the recorded response body length doesn't match the Content-Length header. | ||
| # This happens because VCR decodes compressed responses but doesn't update the Content-Length header. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we instead fix this by updating the Content-Length in tests/json_body_serializer.py's serialize method?
| embeddings=IsList(IsList(IsFloat(), length=1024), length=1), | ||
| inputs=['Hello, world!'], | ||
| input_type='query', | ||
| usage=RequestUsage(input_tokens=IsInt()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For usage, can we see the actual value rather than IsInt? I want to make sure it's not 0
|
|
||
|
|
||
| @pytest.mark.skipif(not bedrock_imports_successful(), reason='Bedrock not installed') | ||
| class TestBedrockHandlers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to unit test this, assuming all the paths are covered by the integration test above?
| """Test error handling when ClientError is raised with HTTP status code.""" | ||
| from botocore.exceptions import ClientError | ||
|
|
||
| from pydantic_ai.exceptions import ModelHTTPError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All imports at the top please
|
@DouweM Are we keeping the clients opinionated to work as text embedding models only? For instance, Nova has a tonne of multi-modal specific parameters and if we're restricting to text only, we shouldn't surface them. |
Pre-Review Checklist
make formatandmake typecheck.Pre-Merge Checklist