-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle plain SearchParameters in HNSW searches #4167
Conversation
Hi @kaivalnp , I am still browsing through it, and reading the corresponding PR on Lucene side apache/lucene#14178 Meanwhile, could you update the format as suggested by https://github.com/facebookresearch/faiss/actions/runs/13143467156/job/36742197199?pr=4167 ? (I believe that is the only issue so far; the Linux x86_64 (conda) error is transient, and AMD folks are looking into the rocm runner) |
If the caller knows they are creating an IndexHNSW through this factory string, I'm confused why they can't pass a SearchParametersHNSW? (Apologies, I didn't catch the context of how Faiss gets invoked in the Lucene PR very well.) BTW, pre-filtering is of interest to us too. I'll keep reading your updates in apache/lucene#14178. |
Done!
The API currently exposed in the PR asks for two strings from the end-user: It internally uses Only the end-user knows what string (and subsequently, index type) is going to be used -- so we'd have to parse the Further, the user has already specified some search parameters in I'm also not sure about which All this may require non-trivial additional code in Lucene and / or Faiss. Please let me know if you had another approach in mind to tackle this issue (of applying only an FYI we're trying to create a standalone Lucene codec that internally uses Faiss for vector indexing and search, without any additional "glue" code -- and it would be awesome if Faiss supports required functionality out-of-the-box! (as long as these changes are not too invasive) |
@mnorris11 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: Add ability to search HNSW indexes using a plain [`SearchParameters`](https://github.com/facebookresearch/faiss/blob/6c046992a71e672df504a0101cddf6f2f2e90601/faiss/Index.h#L64-L69) object (i.e. only an [`IDSelector`](https://github.com/facebookresearch/faiss/blob/6c046992a71e672df504a0101cddf6f2f2e90601/faiss/Index.h#L66)) Issue: Currently if a plain `SearchParameters` is used to query an HNSW index, [an error is thrown](https://github.com/facebookresearch/faiss/blob/6c046992a71e672df504a0101cddf6f2f2e90601/faiss/IndexHNSW.cpp#L251) -- when the user's intent was only to filter some documents, and rely on index settings for remaining parameters (like `efSearch`, `check_relative_distance`, `search_bounded_queue`) Motivation: Faiss provides an amazing [index factory](https://github.com/facebookresearch/faiss/wiki/The-index-factory) and [parameter setter](https://github.com/facebookresearch/faiss/wiki/Index-IO,-cloning-and-hyper-parameter-tuning) to abstract away internals of the index type and settings used, like: ```cpp Index* index = index_factory(256, "HNSW32"); ParameterSpace().set_index_parameters(index, "efConstruction=200,efSearch=150"); ``` Now if a user wants to perform a filtered search on this _opaque_ index using: ```cpp SearchParameters parameters; parameters.sel = new IDSelectorRange(10, 20); index->search(nq, xq, k, d, id, ¶meters); ``` they are met with an error: ``` faiss/IndexHNSW.cpp:251: Error: '!(params)' failed: params type invalid ``` An easy way to reproduce this issue is to replace `Flat` -> `HNSW` [here](https://github.com/facebookresearch/faiss/blob/6c046992a71e672df504a0101cddf6f2f2e90601/c_api/example_c.c#L60) and run `example_c` like: ``` make -C build example_c ./build/c_api/example_c ``` This PR allows passing a plain `SearchParameters` to HNSW indexes, and use index settings as a fallback Reviewed By: asadoughi Differential Revision: D69266072 Pulled By: mnorris11
Hi @mnorris11, I see you created a new PR (#4174) to validate that these changes pass the internal Meta CI after updating a failing test -- but that PR is closed.. |
@kaivalnp We will merge this one to Faiss. I am trying to merge the internal CI fix separately currently. |
@mnorris11 Got it, thanks for the update! |
@mnorris11 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@mnorris11 merged this pull request in 1d8f393. |
Summary: Add ability to search HNSW indexes using a plain
SearchParameters
object (i.e. only anIDSelector
)Issue: Currently if a plain
SearchParameters
is used to query an HNSW index, an error is thrown -- when the user's intent was only to filter some documents, and rely on index settings for remaining parameters (likeefSearch
,check_relative_distance
,search_bounded_queue
)Motivation: Faiss provides an amazing index factory and parameter setter to abstract away internals of the index type and settings used, like:
Now if a user wants to perform a filtered search on this opaque index using:
they are met with an error:
An easy way to reproduce this issue is to replace
Flat
->HNSW
here and runexample_c
like:This PR allows passing a plain
SearchParameters
to HNSW indexes, and use index settings as a fallback