INTPYTHON-729 Allow creating search indexes with field mappings and analyzers #370
Pull Request Overview
This PR enhances MongoDB Atlas search index functionality by adding field mapping capabilities and index status monitoring. The changes allow developers to specify custom field mappings for search indexes and ensure proper synchronization during index operations.
Key changes:
- Added `field_mappings` parameter to `SearchIndex` to allow custom Atlas Search field configurations
- Introduced index status monitoring functions to wait for index creation/deletion completion
- Added `DynamicSearchIndex` class for dynamic field mapping scenarios
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| django_mongodb_backend/schema.py | Added index status monitoring functions and integrated them into index operations |
| django_mongodb_backend/indexes.py | Enhanced SearchIndex with field_mappings support and added DynamicSearchIndex class |
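The index status monitoring mentioned in the overview is not shown on this page. As a rough illustration only, waiting for an index could be sketched as a polling loop like the one below. The function name `wait_for_search_index`, its parameters, and the reliance on a `queryable` flag in each index description are assumptions for this sketch, not the code added in `schema.py`.

```python
import time


def wait_for_search_index(list_indexes, name, *, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll until the search index called `name` reports itself queryable.

    Hypothetical helper: `list_indexes` is any zero-argument callable that
    returns an iterable of index description dicts. With PyMongo this could be
    `lambda: collection.list_search_indexes()`; the "queryable" key is assumed
    to be present in each description.
    """
    deadline = time.monotonic() + timeout
    while True:
        for index in list_indexes():
            if index.get("name") == name and index.get("queryable"):
                return index
        # Give up once the deadline has passed without seeing a ready index.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Search index {name!r} was not ready after {timeout} seconds")
        sleep(interval)
```

Passing the listing function in as a callable keeps the loop independent of a live Atlas cluster, so it can be exercised with a fake in tests.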
fields_mappings to get added to SearchIndexModel configurations
I imagined the index type API as subclasses like AutocompleteSearchIndex but I guess that's not flexible enough if an index has multiple fields with different types.
django_mongodb_backend/indexes.py
Outdated
| if field_name in self.field_mappings: | ||
| fields[field_path] = self.field_mappings[field_name].copy() |
Is field_mappings really supposed to contain the entire mapping? (e.g. "type" too). I'd think it would be more likely to be interpreted as "extra options to add to the field".
Yeah, type in the Atlas Search Field Mapping refers to the Atlas Search Field Type. We infer type from our fields, but, for instance, strings can be interpreted as four different types:
- string (we infer)
- token
- stringFacet
- autocomplete
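To make the four options concrete, here is a minimal sketch of the mapping dicts a string field could end up with. The helper `string_field_mapping` is hypothetical and exists only for illustration; the type names come from the list above, and `minGrams`/`maxGrams` are options of the Atlas Search autocomplete type.

```python
# The four Atlas Search types named above that a string field can map to.
STRING_SEARCH_TYPES = ("string", "token", "stringFacet", "autocomplete")


def string_field_mapping(search_type="string", **options):
    """Return an Atlas Search field mapping for a string-valued field.

    Hypothetical helper for illustration; not part of django_mongodb_backend.
    """
    if search_type not in STRING_SEARCH_TYPES:
        raise ValueError(f"Unsupported search type for a string field: {search_type!r}")
    return {"type": search_type, **options}


# The default that is inferred for a string field, per the comment above.
default_mapping = string_field_mapping()
# Opting into autocomplete instead, with two of its type-specific options.
autocomplete_mapping = string_field_mapping("autocomplete", minGrams=2, maxGrams=10)
```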
Got it. Your original PR combined fields and field_mappings, but I made these arguments mutually exclusive. (Possibly a separate class, e.g. "MappedSearchIndex", would be a better separation of concerns than having mutually exclusive arguments.)
oooh, potentially. I actually chose to combine fields and fields_mappings because I envisioned folks being fine with the defaults set on a field name unless they wanted to make one small mutation. It's purely a QOL so folks don't have to commit to writing the entire field mapping, but I'm fine conceding to their separation unless we get requests from developers.
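For readers weighing the two designs, the "combined" behaviour described here could look roughly like the sketch below: inferred defaults are kept for every field, and a per-field override changes only the keys it names. The function `merge_field_mappings` and its arguments are hypothetical and are not taken from the PR.

```python
def merge_field_mappings(inferred, overrides=None):
    """Overlay user-supplied mapping options on top of inferred defaults.

    Hypothetical sketch: `inferred` maps field names to the default mapping
    derived from the model field; `overrides` maps field names to the options
    the user wants to change.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(inferred)
    if unknown:
        raise ValueError(f"Overrides given for fields not in the index: {sorted(unknown)}")
    # Keys in the override win; untouched keys keep their inferred defaults.
    return {name: {**default, **overrides.get(name, {})} for name, default in inferred.items()}


inferred = {"title": {"type": "string"}, "pages": {"type": "number"}}
merged = merge_field_mappings(inferred, {"title": {"type": "autocomplete"}})
```

With mutually exclusive arguments, by contrast, the caller would have to spell out the full mapping for `pages` as well.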
I think this is functionally complete for VectorSearchIndex doesn't take
This is a fairly straightforward addition.
Yeah, quantization and hnswOptions can definitely be split into a separate ticket.
| def __init__(self, *, fields=(), name=None): | ||
| def __init__( | ||
| self, *, fields=(), field_mappings=None, name=None, analyzer=None, search_analyzer=None |
While it's called "field mappings" in the MongoDB docs, I've been struggling to intuitively remember where the "s" goes (field_mappings, fields_mapping, fields_mappings, etc.). I think fields_mappings may be more intuitive since we have an existing fields parameter and also the JSON structure has ["mappings"] and ["fields"]. What do you think?
Hrm, after thinking it over, I say we still keep it field_mappings. I understand the dangling "s" concern. Since at the server level, fields and field_mappings are the actual names of the keys, keeping that parallelism still aligns better to me.
cc: @aclark4life, @WaVEV for additional opinions.
To clarify, the documentation uses the term "Field Mappings" but I haven't seen a key called "field_mappings". Did I miss it?
🤔 IMO, field_mappings is more accurate. In a noun + noun compound that names a new thing, the first noun stays singular, like "copilot settings".
@timgraham my mistake. I vaguely remember seeing field_mappings, in my earlier attempts at this, but I'm guessing it's that same documentation that I saw and attributed. Nonetheless, I still align with the field_mappings because it's also just phonetically better. (to @WaVEV's point as well)
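For context on the ["mappings"] and ["fields"] keys mentioned in this thread, the sketch below shows how the three new keyword arguments could plausibly map onto an Atlas Search index definition. `build_search_definition` is a hypothetical helper written for this illustration, and setting `dynamic` to `False` is an assumption; the PR's own construction of the definition is not shown on this page.

```python
def build_search_definition(fields, *, analyzer=None, search_analyzer=None):
    """Assemble an Atlas Search index definition dict.

    Hypothetical sketch: `fields` maps field paths to their mappings and ends
    up under definition["mappings"]["fields"]; the analyzers are top-level keys
    and are only included when given.
    """
    definition = {"mappings": {"dynamic": False, "fields": dict(fields)}}
    if analyzer is not None:
        definition["analyzer"] = analyzer
    if search_analyzer is not None:
        # Note the camelCase key on the server side vs. the snake_case argument.
        definition["searchAnalyzer"] = search_analyzer
    return definition


definition = build_search_definition(
    {"title": {"type": "string"}},
    analyzer="lucene.standard",
    search_analyzer="lucene.english",
)
```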
…nalyzers Co-authored-by: Tim Graham <[email protected]>
Summary
Defining Field Mappings for Atlas Search and Vector Search indexes can get complicated. Our initial `SearchIndex` and `VectorSearchIndex` solutions provide reasonable defaults for categorized fields. However, the typical MongoDB power user may want more nuanced indexes. This PR introduces an avenue to provide more custom field mappings on a field.
Key changes
- Added `field_mappings`, `analyzer`, and `search_analyzer` parameters to `SearchIndex` to allow custom Atlas Search field configurations.
- Extended the `options` returned by `get_constraints` to also include `analyzer` and `searchAnalyzer`.
Screenshots
Image of a customized field_mapping added in a migration

Its representation in MongoDB Compass
