Explore more granular vector quantization?

### Description

Today, Lucene supports [8-, 4-, 2- and 1-bit](https://github.com/apache/lucene/blob/e1879e450b75b3a58fde2b0dad77ae6b499504dd/lucene/core/src/java/org/apache/lucene/codecs/lucene104/Lucene104ScalarQuantizedVectorsFormat.java#L119-L148) quantization.

Each quantization level typically has an upper bound on the recall it can achieve, measured as "exact KNN with quantized scores" vs. "exact KNN with original scores" (see https://github.com/mikemccand/luceneutil/issues/528). This bound is the information loss due to quantization itself, before any approximation from search algorithms like HNSW comes into the picture.
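To make the "recall ceiling" concrete, here is a minimal sketch of how it could be measured with brute-force search only. It assumes a simple uniform min/max scalar quantizer over random Gaussian vectors, with un-quantized queries scored by dot product; this is not Lucene's quantizer, and the names `quantize`, `exact_top_k` and `quantization_recall` are hypothetical.

```python
import random

def quantize(vec, bits, lo, hi):
    """Snap each value to one of 2**bits uniform levels in [lo, hi], then map back to floats.
    Illustrative only: this is NOT how Lucene quantizes vectors."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    out = []
    for x in vec:
        clamped = min(max(x, lo), hi)
        out.append(lo + round((clamped - lo) / step) * step)
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exact_top_k(query, docs, k):
    """Brute-force top-k doc ids by dot product (no HNSW, no approximation)."""
    order = sorted(range(len(docs)), key=lambda i: dot(query, docs[i]), reverse=True)
    return order[:k]

def quantization_recall(docs, queries, bits, k):
    """Average overlap between exact top-k on original docs and exact top-k on quantized docs."""
    lo = min(min(d) for d in docs)
    hi = max(max(d) for d in docs)
    quantized_docs = [quantize(d, bits, lo, hi) for d in docs]
    hits = 0
    for q in queries:
        truth = set(exact_top_k(q, docs, k))
        lossy = set(exact_top_k(q, quantized_docs, k))
        hits += len(truth & lossy)
    return hits / (k * len(queries))

# Synthetic data with a fixed seed; real datasets will behave differently.
rng = random.Random(42)
docs = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(300)]
queries = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(20)]

for bits in (1, 2, 4, 6, 8):
    print(bits, quantization_recall(docs, queries, bits, k=10))
```

Because both sides are exhaustive searches, any recall below 1.0 here comes from quantization alone, which is the bound described above.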

Any algorithm operating on quantized scores _alone_ cannot exceed this recall (e.g. by tweaking HNSW parameters like `maxConn`, `beamWidth`, `fanout`, etc.) without re-ranking using the original scores from un-quantized vectors. Re-ranking may not be feasible for some use cases, e.g. keeping the index in memory for performance, where also holding the un-quantized vectors increases the memory footprint by \~4x in the case of byte-quantized vectors.
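As a rough illustration of the \~4x figure, the arithmetic can be sketched as below. This assumes 4-byte float components for the un-quantized vectors, ignores any per-vector metadata or graph overhead, and uses a hypothetical dimension of 768; the function names are made up for this example.

```python
def raw_bytes(dims):
    """Bytes for one un-quantized vector, assuming 4-byte floats."""
    return 4 * dims

def quantized_bytes(dims, bits):
    """Bytes for one quantized vector at `bits` per component, rounded up to whole bytes."""
    return (dims * bits + 7) // 8

dims = 768  # hypothetical dimension, chosen only for illustration

byte_quantized = quantized_bytes(dims, 8)
originals = raw_bytes(dims)

print("byte-quantized vector:", byte_quantized, "bytes")
print("original vector:      ", originals, "bytes")
print("extra memory to also hold originals:", originals / byte_quantized, "x the quantized size")
```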

In such cases, I wonder if Lucene could support more granular quantization options (say _the equivalent of!_ 6-bit quantization), for a finer-grained trade-off between recall and memory requirements?
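For a sense of what a 6-bit layout would cost in memory, here is a small bit-packing sketch. It is only one naive way to store 6-bit codes (four values per three bytes), not a proposal for how Lucene should encode them; `pack` and `unpack` are hypothetical helpers.

```python
def pack(values, bits):
    """Pack unsigned integers of `bits` width into bytes, least-significant bits first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently in acc
    out = bytearray()
    for v in values:
        if not 0 <= v < (1 << bits):
            raise ValueError(f"value {v} does not fit in {bits} bits")
        acc |= v << nbits
        nbits += bits
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # flush the final partial byte
    return bytes(out)

def unpack(data, bits, count):
    """Inverse of pack: read `count` values of `bits` width back out."""
    acc = 0
    nbits = 0
    mask = (1 << bits) - 1
    source = iter(data)
    out = []
    for _ in range(count):
        while nbits < bits:
            acc |= next(source) << nbits
            nbits += 8
        out.append(acc & mask)
        acc >>= bits
        nbits -= bits
    return out

codes = [0, 63, 21, 42, 7]          # example 6-bit codes, each in 0..63
packed = pack(codes, 6)
assert unpack(packed, 6, len(codes)) == codes

dims = 768  # hypothetical dimension
print(len(pack([0] * dims, 6)), "bytes per vector at 6 bits")
```

At 6 bits per component this works out to 0.75 bytes per dimension, sitting between the 4-bit and 8-bit footprints.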

Explore more granular vector quantization? #15734

Description

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Explore more granular vector quantization? #15734

Description

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions