
Partial loading implementation for FAISS HNSW #2405

Open

wants to merge 9 commits into base: main

Conversation

@0ctopus13prime (Collaborator) commented Jan 17, 2025

Description

RFC : #2401

The OpenSearch KNN plugin supports three engines: NMSLIB, FAISS, and Lucene.
The first two, NMSLIB and FAISS, are native engines that require all vector-related data structures (such as HNSW graphs) to be loaded into memory for search operations.
For large workloads, this memory cost can quickly become substantial if quantization techniques are not applied.
Therefore, 'Partial Loading' must be offered as an option in native engines to control the memory available for KNN search. The objective of partial loading is twofold:

To allow users to control the maximum memory available for KNN searching.
To enable native engines to partially load only the necessary data within the constraint.
If we look closely, an HNSW graph mainly consists of the following:

Full-precision 32-bit vectors.
Graph representations.
Metadata such as dimensions, number of vectors, space type, headers, etc.
Of these, most of the memory goes to the full-precision vectors: 4 bytes * the number of vectors * the number of dimensions.
FAISS stores these vectors in a Flat index; during serialization and deserialization they are written to and read from the file and placed in main memory, which drives up memory consumption.
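As a quick illustration of that formula, the sketch below estimates the flat-storage footprint for a hypothetical workload (the corpus size and dimension are made-up numbers, not from this PR):

public class FlatStorageEstimate {
    public static void main(String[] args) {
        long numVectors = 10_000_000L;     // hypothetical corpus size
        long dimensions = 768;             // hypothetical embedding dimension
        long bytesPerFloat = Float.BYTES;  // 4 bytes per full-precision float32 component

        // 4 bytes * number of vectors * number of dimensions
        long flatStorageBytes = bytesPerFloat * numVectors * dimensions;
        System.out.printf("Flat vector storage ~= %.1f GiB%n",
            flatStorageBytes / (1024.0 * 1024.0 * 1024.0));
        // prints ~28.6 GiB, before counting HNSW graph links and metadata
    }
}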

Related Issues

#2401

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@0ctopus13prime 0ctopus13prime changed the title Partial loading implementation for FAISS HNSW> Partial loading implementation for FAISS HNSW Jan 18, 2025
@0ctopus13prime (Collaborator Author)

Please note: I will make sure all System.out calls used for debugging are removed once this is finalized, before merging.


package org.opensearch.knn.partialloading;

public class KdyPerfCheck {
Collaborator Author

This is a temporary class for tracking performance.
It will be removed before merging to main.

@@ -106,7 +106,7 @@ public void flush(int maxDoc, final Sorter.DocMap sortMap) throws IOException {
final QuantizationState quantizationState = train(field.getFieldInfo(), knnVectorValuesSupplier, totalLiveDocs);
// Check only after quantization state writer finish writing its state, since it is required
// even if there are no graph files in segment, which will be later used by exact search
if (shouldSkipBuildingVectorDataStructure(totalLiveDocs)) {
if (false /*TMP*/ && shouldSkipBuildingVectorDataStructure(totalLiveDocs)) {
Collaborator Author

This is temporary code. I will revert it before merging.

@0ctopus13prime (Collaborator Author) commented Jan 22, 2025

Partial Loading Code Review Breakdown

1. Goal

This document provides a comprehensive overview of this large partial-loading PR, to minimize the time reviewers need to complete a review.

2. Scope

Design Document : RFC

1. Supported Vector Types

  • Only float32 vectors are supported initially.
  • Binary and byte vector indices are not yet supported.

2. Supported Metrics

  • Dot product.
  • Euclidean distance.

3. Filtered Query

  • Partial loading supports filtered queries.

4. Nested Vectors

  • Supported for scenarios where parent documents contain multiple vectors.
  • Integer parent IDs are provided in KNNWeight.

5. Sparse Vector Documents

  • Supports cases where not all Lucene documents contain vectors.
  • Handles indexing documents without vectors.

3. Breakdown

The PR can be divided into two main parts, with the search part further split into five subparts:

  1. Index Loading
    1. Graceful resource cleanup.
  2. Searching
    1. Basic framework.
    2. Normal case: No filtering, no parent IDs, and all documents have indexed vectors.
    3. Filtering:
      1. Filtered queries.
      2. Handling deletions.
    4. Handling parent IDs.
    5. Sparse vector documents.

4. [Part 1] Index Partial Loading

  1. NativeMemoryLoadStrategy
    1. Fetches mapping configuration from settings to check if the current KNN field supports partial loading. If partial loading is disabled, it falls back to the default mode, loading everything into memory.
      1. Currently, retrieving this configuration from settings is not implemented and can be replaced with a placeholder for now.
  2. Partial Loading in FAISS
    1. Source : partialloading.faiss package.
    2. FaissIndex.partialLoad(InputStream input) is the entry point for partially loading a FAISS index by reading bytes from the provided InputStream. The main idea is to mark starting offsets and load bytes on demand.
      1. FaissIndex.partialLoad is a Java port of a corresponding function in FAISS.
        1. Please refer to FAISS C++ source code.
      2. Supported index types (see the dispatch sketch below):
        1. IxMp - FaissIdMapIndex
          1. Contains a mapping from an internal vector ID to a Lucene document ID.
        2. IHNf - FaissHNSWFlatIndex
          1. Contains FaissHNSW.
        3. IxF2 - FaissIndexFlat
          1. For Euclidean distance.
        4. IxFI - FaissIndexFlat
          1. For inner product distance.
  3. Resource Cleanup
    1. PartialLoadingContext may hold a non-null IndexInput reference, which is passed to a search thread for vector searches (e.g., HNSW graph search).
    2. Graceful resource cleanup is managed in NativeMemoryAllocation.IndexAllocation.close, which invokes PartialLoadingContext.close to release the IndexInput.
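To make the four-character type codes above concrete, here is a hedged sketch of the dispatch step. The class and method names are illustrative only, not the PR's actual FaissIndex.partialLoad API, and it assumes the code is stored as 4 ASCII bytes at the current read position.

import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.lucene.store.IndexInput;

// Illustrative only: reads a FAISS four-character index code and names the matching partial-load reader.
final class FaissIndexTypeDispatchSketch {

    static String readIndexType(IndexInput input) throws IOException {
        byte[] fourcc = new byte[4];
        input.readBytes(fourcc, 0, 4);   // FAISS serializes a 4-byte type code before the index body
        return new String(fourcc, StandardCharsets.US_ASCII);
    }

    static String describe(String fourcc) {
        switch (fourcc) {
            case "IxMp": return "FaissIdMapIndex (internal vector id -> Lucene doc id)";
            case "IHNf": return "FaissHNSWFlatIndex (wraps FaissHNSW)";
            case "IxF2": return "FaissIndexFlat, Euclidean distance";
            case "IxFI": return "FaissIndexFlat, inner product";
            default: throw new UnsupportedOperationException("Unsupported FAISS index type: " + fourcc);
        }
    }
}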

5. [Part 2] Search

2.1. Partial Loading Basic Framework

  1. The flow reaches KNNWeight.doANNSearch.
  2. Retrieves the configured partial loading mode from settings. [Not yet implemented]
  3. If partial loading is disabled, it falls back to the default search through C++ FAISS.
  4. If partial loading is enabled (see the toy dispatch sketch below):
    1. Obtains PartialLoadingContext from IndexAllocation.
    2. Retrieves the search strategy based on the partial loading mode.
      1. Currently, the only available strategy is MemoryEfficientPartialLoadingSearchStrategy, which accesses and loads bytes on demand without caching.
    3. Copies IndexInput.
    4. Extracts the efSearch value from the query.
    5. Calls queryIndex of the selected search strategy.
    6. Invokes FaissIndex.search to perform the search.
  5. Sources
    1. KNNWeight
    2. MemoryEfficientPartialLoadingSearchStrategy
    3. PartialLoadingContext
    4. FaissIndex
    5. FaissIdMapIndex → FaissHNSWFlatIndex → FaissIndexFlat
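Below is a self-contained toy model of the branching described in the list above. Every type here is a stand-in; the PR's real classes (KNNWeight, PartialLoadingContext, MemoryEfficientPartialLoadingSearchStrategy) have different, richer signatures.

import java.util.Map;

// Toy model of the doANNSearch dispatch: fall back to native FAISS when partial loading
// is disabled, otherwise delegate to the mode's search strategy. Names are placeholders.
public class PartialLoadingDispatchSketch {

    enum PartialLoadingMode { DISABLED, MEMORY_EFFICIENT }

    interface SearchStrategy {
        Map<Integer, Float> queryIndex(float[] query, int k, int efSearch);
    }

    static Map<Integer, Float> doAnnSearch(PartialLoadingMode mode,
                                           SearchStrategy memoryEfficientStrategy,
                                           float[] query, int k, int efSearch) {
        if (mode == PartialLoadingMode.DISABLED) {
            // Default path: the existing C++ FAISS search through the JNI layer.
            return defaultNativeSearch(query, k);
        }
        // Partial loading enabled: the selected strategy walks the on-disk index
        // via IndexInput, loading bytes on demand instead of keeping them in memory.
        return memoryEfficientStrategy.queryIndex(query, k, efSearch);
    }

    private static Map<Integer, Float> defaultNativeSearch(float[] query, int k) {
        return Map.of();  // placeholder for the JNI-backed baseline search
    }
}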

2.2. Normal Case — Happy Path

This is the straightforward case: no filtering IDs, no parent IDs, and all documents have indexed vectors.

  1. FaissIdMapIndex:
    1. Operates without a grouper or selector.
    2. Delegates the search directly to the nested index, FaissHNSWFlatIndex.
  2. FaissHNSWFlatIndex:
    1. Creates a max-heap based on the distance metric (see the heap sketch below).
    2. Passes the heap to FaissHNSW to initiate HNSW search.
  3. FaissHNSW:
    1. Executes the HNSW search.
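A minimal sketch of the "max-heap based on the distance metric" from step 2.1, assuming a smaller-is-better distance (Euclidean); for inner product the comparator direction flips. The record and class names are illustrative, not the PR's GroupedDistanceMaxHeap.

import java.util.Comparator;
import java.util.PriorityQueue;

// Keeps the k nearest candidates: the worst (largest distance) sits on top and gets evicted first.
public class TopKDistanceHeapSketch {

    record Candidate(int docId, float distance) {}

    public static void main(String[] args) {
        int k = 3;
        PriorityQueue<Candidate> heap =
            new PriorityQueue<>(Comparator.comparingDouble(Candidate::distance).reversed());

        float[] distances = {0.9f, 0.2f, 0.7f, 0.1f, 0.5f};
        for (int doc = 0; doc < distances.length; doc++) {
            Candidate c = new Candidate(doc, distances[doc]);
            if (heap.size() < k) {
                heap.add(c);
            } else if (c.distance() < heap.peek().distance()) {
                heap.poll();   // drop the current worst candidate
                heap.add(c);   // keep the closer one
            }
        }
        // The heap now holds the 3 nearest docs (3, 1, 4); its top is the k-th best.
        heap.forEach(c -> System.out.println(c.docId() + " -> " + c.distance()));
    }
}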

2.3. With Filtering

With Filtering:

  • If filtering is applied, filterIdsBitSet will have a non-null value in doANNSearch.
    • Live bits (representing "live" documents) are included in the bitset only when a filter is specified in the query.

No Integer List Conversion:

  • Unlike the C++ FAISS path, a partial-loading vector search in the JVM does not require converting the bitset into an integer list.
  • The search can use the bitset directly, as provided (see the sketch below).
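A small sketch of that direct bitset consumption, assuming a Lucene FixedBitSet for the filter; the helper name is made up, and the PR's actual check lives in its own search classes.

import org.apache.lucene.util.FixedBitSet;

// Accept a graph candidate only if it passes the filter; a null bitset means "no filter supplied".
public class FilteredCandidateCheckSketch {

    static boolean accept(FixedBitSet filterIdsBitSet, int docId) {
        return filterIdsBitSet == null || filterIdsBitSet.get(docId);
    }

    public static void main(String[] args) {
        FixedBitSet filter = new FixedBitSet(16);
        filter.set(3);
        filter.set(7);

        System.out.println(accept(filter, 3));   // true: doc 3 matches the filter
        System.out.println(accept(filter, 5));   // false: doc 5 is filtered out
        System.out.println(accept(null, 5));     // true: no filter, everything is eligible
    }
}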

2.4. With Parent IDs

Parent IDs Handling:

  • Parent IDs are passed down to FaissIndex.

Conversion to BitSet:

  • The passed parent IDs are converted into a bitset.
    • Refer to the comments in BitSetParentIdGrouper for details.

Grouper Creation:

  • A grouper is created to map child document IDs to their corresponding parent document IDs.

Parent-Level BFS in HNSW:

  • During BFS in HNSW, the max heap based on distance considers only the parent IDs.
    • For implementation, see GroupedDistanceMaxHeap.
    • Example: Child IDs (1, 2, 3) with parent ID '4'. The max heap evaluates distances at the parent level only.
      • However, we still keep track of the best child for each parent ID (see the grouping sketch below).
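A hedged sketch of that parent-level grouping, assuming the Lucene block-join convention where a parent is the last document of its block, so the parent of a child doc is the first set bit at or after the child's doc ID in the parent bitset. The class and variable names are illustrative, not the PR's BitSetParentIdGrouper or GroupedDistanceMaxHeap.

import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.util.FixedBitSet;

// Group child candidates under their parent and remember only the best (closest) child per parent.
public class ParentGroupingSketch {

    public static void main(String[] args) {
        FixedBitSet parentBits = new FixedBitSet(8);
        parentBits.set(4);   // docs 1, 2, 3 are children of parent doc 4

        // parentId -> {bestChildId, bestDistance}
        Map<Integer, float[]> bestChildPerParent = new HashMap<>();

        int[] childIds    = {1, 2, 3};
        float[] distances = {0.7f, 0.2f, 0.5f};
        for (int i = 0; i < childIds.length; i++) {
            int parent = parentBits.nextSetBit(childIds[i]);   // 4 for every child above
            float[] best = bestChildPerParent.get(parent);
            if (best == null || distances[i] < best[1]) {
                bestChildPerParent.put(parent, new float[] { childIds[i], distances[i] });
            }
        }
        // Only the parent id would enter the distance heap; child 2 (distance 0.2) is remembered for parent 4.
        bestChildPerParent.forEach((p, b) ->
            System.out.println("parent " + p + " -> best child " + (int) b[0] + " at distance " + b[1]));
    }
}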

2.5. Sparse Vector Documents

  1. Handling Sparse Vectors:
  • If some documents lack indexed vectors, vectorIdToDocIdMapping in FaissIdMapIndex will hold a non-null value (see the translation sketch below).
    • Example: If only documents 1, 5, and 10 have vectors, the mapping will be:
      • 0 → 1
      • 1 → 5
      • 2 → 10
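A tiny sketch of that ordinal-to-document translation; the array name mirrors the PR's vectorIdToDocIdMapping, but the surrounding class is illustrative.

// Translate an internal vector id (HNSW ordinal) back to the sparse Lucene doc id.
public class SparseVectorMappingSketch {

    public static void main(String[] args) {
        // Only Lucene docs 1, 5, and 10 contain vectors, so vector ids 0..2 map onto them.
        int[] vectorIdToDocIdMapping = {1, 5, 10};

        int internalVectorId = 2;                              // a hit returned by the HNSW search
        int luceneDocId = vectorIdToDocIdMapping[internalVectorId];
        System.out.println(luceneDocId);                       // 10: the doc id surfaced to the collector
    }
}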

@shatejas (Collaborator) left a comment

Still working on this PR

  • Should the cache key change here for partial loading? This depends on how we are planning to launch it, but it might be a good idea to change the cache key to avoid any unpredictable behavior.

@@ -87,6 +91,15 @@ public NativeMemoryAllocation.IndexAllocation load(NativeMemoryEntryContext.Inde
final Directory directory = indexEntryContext.getDirectory();
final int indexSizeKb = Math.toIntExact(directory.fileLength(vectorFileName) / 1024);

// TMP
final PartialLoadingMode partialLoadingMode = PartialLoadingMode.DISABLED;
Collaborator

I am a little confused between mapping and setting. What was decided: will we use a mapping or a setting?

Ideally, if the performance and recall are equal, we should eventually have the option of deprecating whichever approach is not memory-efficient. Will having a mapping make it a one-way door?

Collaborator Author

  1. Mapping is preferred, as we want it to be configured at the field level.
  2. You mean MEMORY_EFFICIENT mode's performance and recall being equal to the baseline, where everything is loaded into memory, right? Its performance can never equal the baseline, as it involves load costs. Even when MMapDirectory was configured, the FAISS baseline showed the best performance.
    The whole point of MEMORY_EFFICIENT is to give users an option to operate a big vector index in a memory-constrained environment.

);
}

private void validatePartialLoadingSupported(NativeMemoryEntryContext.IndexEntryContext indexEntryContext, KNNEngine knnEngine)
Collaborator

This validation seems too late; are there any validations like this while creating the mapping?

Collaborator Author

Yes, I agree! My strategy is to have two separate PRs: 1. core logic for partial loading, 2. extending the mapping.
This PR has the core logic, and I will make sure the early validation happens during mapping creation, as you suggested.
Will add a TODO.

@RequiredArgsConstructor
@Getter
public class PartialLoadingContext implements Closeable {
private final FaissIndex faissIndex;
Collaborator

Do we need faissIndex here? Should we decouple?

Collaborator Author

The partial loading context holds the FAISS components required for searching,
and FaissIndex is the top-level index that recursively delegates the search and gathers the results.

The partial loading context is set on IndexAllocation after the 'partial load' in the native load strategy.
Then, in KNNWeight, the context is retrieved to obtain the entry point for searching. I think we need to have it here.

public class PartialLoadingContext implements Closeable {
private final FaissIndex faissIndex;
private final String vectorFileName;
private final PartialLoadingMode partialLoadingMode;
Collaborator

Why is this mode needed here? Just curious.

Collaborator Author

When loading in the native memory load strategy, it retrieves the mode from the mapping and puts it in here.
Then, in KNNWeight, after acquiring the context, it creates the partial-load search strategy based on the mode without having to look up the mapping again.

if (indexInput != null) {
return indexInput.clone();
}
indexInput = directory.openInput(vectorFileName, IOContext.RANDOM);
Collaborator

  • Let the client worry about the IOContext/ReadAdvice here?
  • Can we initialize (open) in the constructor? I am not really comfortable with the functionality of get and open being combined.

Collaborator Author

  1. Sure, I think we should let the partial-load search strategy handle the low-level IOContext/ReadAdvice.
  2. Sure, will update in the next revision to initialize it in the constructor.

Map<Integer, Float> result = doExactSearch(context, docs, cardinality, k);
if (isExactSearchRequire(context, matchDocsCardinality, docIdsToScoreMap.size())) {
final BitSetIterator docs = filterWeight != null ? new BitSetIterator(filterBitSet, matchDocsCardinality) : null;
Map<Integer, Float> result = doExactSearch(context, docs, matchDocsCardinality, k);
return new PerLeafResult(filterWeight == null ? null : filterBitSet, result);
}
return new PerLeafResult(filterWeight == null ? null : filterBitSet, docIdsToScoreMap);
}

private BitSet getFilteredDocsBitSet(final LeafReaderContext ctx) throws IOException {
Collaborator

For partial loading you can always pass in liveDocs (irrespective of filter weight) and take care of deleted docs while doing the search instead of post-filtering. Currently this is not done because the liveDocs need to be converted to an array to pass to the JNI layer, which might add latency.

Collaborator Author

Yes, we can pass liveDocs to partial loading and let the components filter documents directly based on it.
But I intentionally tried to make it return the same output for the same input, to minimize any side effects that might come from it.
I had a hunch that a subtle difference introduced at the beginning is likely to gradually bring in more subtle differences that will make this hard to debug at some point, so I tried to make sure the same output is returned for the same input.
The current baseline only passes liveDocs (technically, an int array) when a filter is provided.

@@ -96,6 +109,45 @@ public NativeMemoryAllocation.IndexAllocation load(NativeMemoryEntryContext.Inde
}
}

private NativeMemoryAllocation.IndexAllocation createPartialLoadedIndexAllocation(
Collaborator

Should we create a separate class which extends NativeMemoryLoadStrategy for this, and possibly a PartialIndexAllocation POJO to hold the context?

This will simplify the code and ideally isolate partial-loading related code under one fork rather than an if-fork in each class. Let me know how it turns out?

Collaborator Author

The decision whether to partially load or not is made within NativeMemoryLoadStrategy::load by fetching the mode from the mapping. I think it would be good to have PartialIndexAllocation to separate the loading logic, but it seems hard to have a subclass of NativeMemoryLoadStrategy.
Will factor out the partial-loading related logic into PartialIndexAllocation in the next revision, as you suggested.

throw new UnsupportedOperationException("Partial loading does not support radius query with k=0");
}

final PartialLoadingSearchStrategy searchStrategy = partialLoadingContext.getPartialLoadingMode().createSearchStrategy();
Collaborator

Rather than having an interface specifically for PartialLoadingSearchStrategy, have you considered NativeIndexSearchStrategy? Its implementations would be MemoryEfficientSearchStrategy, which uses partial loading, and DefaultSearchStrategy, which just loads the entire index and calls the JNI layer. You will need a common set of parameters to pass to these strategies.

Collaborator Author

I like the idea; the only thing bugging me is that it would be cumbersome to define a common set of parameters, as the required parameters differ between partial loading and the baseline.
That is natural, since we're trying to abstract two different mechanisms: partial loading vs. calling C++ FAISS.

  1. Partial loading is fine with a bitset, but the baseline needs a long array.
  2. Generally, partial loading needs more parameters that the baseline does not need at all.

I'm open to it, but I have the impression that we'd be forcefully fitting two different mechanisms under one interface.
Please share your thoughts.

@0ctopus13prime 0ctopus13prime self-assigned this Feb 13, 2025
@0ctopus13prime (Collaborator Author)

Will clean up commit messages before squash merge

@0ctopus13prime (Collaborator Author)

Bypassing the cache manager

The partial loading design was inspired by Lucene, where IndexInput is used to fetch bytes during vector search.
The key difference lies merely in the underlying file format; both approaches use the same algorithm over IndexInput.

Since we rely on the NRT reader, which periodically triggers the Codec to create vector readers for newly published segments, there's no need for us to do lazy loading when a query comes in.

If we could fully delegate this responsibility to the NRT reader, maintenance would become much easier.
The idea is to move the partial loading logic into the Codec, returning a PartialLoadingVectorReader that handles vector search within KNNWeight. This eliminates the need to keep partial loading logic within the load strategy and the index allocation object.

I'll assess the required effort and, if feasible, refactor in the next revision. Otherwise, I'll take an iterative approach in subsequent PRs.

cc @shatejas

@Vikasht34 (Collaborator) left a comment

I am not really comfortable with the change we are doing to enable partial loading; from the change it feels like we are re-writing the whole FAISS search path again in Java. I know we must have discussed this during design review, but I did not anticipate the magnitude of the change.

After looking at the FAISS library, FAISS has a built-in capability to partially load the file, the way IndexInput is doing for us, using MMAP.

FAISS already supports mmap-based loading:

/*
 * Copyright (c) Meta Platforms, Inc. and affiliates.
 *
 * This source code is licensed under the MIT license found in the
 * LICENSE file in the root directory of this source tree.
 */

// I/O code for indexes

#ifndef FAISS_INDEX_IO_H
#define FAISS_INDEX_IO_H

#include <cstdio>
#include <string>
#include <typeinfo>
#include <vector>

/** I/O functions can read/write to a filename, a file handle or to an
 * object that abstracts the medium.
 *
 * The read functions return objects that should be deallocated with
 * delete. All references within these objectes are owned by the
 * object.
 */

namespace faiss {

struct Index;
struct IndexBinary;
struct VectorTransform;
struct ProductQuantizer;
struct IOReader;
struct IOWriter;
struct InvertedLists;

/// skip the storage for graph-based indexes
const int IO_FLAG_SKIP_STORAGE = 1;

void write_index(const Index* idx, const char* fname, int io_flags = 0);
void write_index(const Index* idx, FILE* f, int io_flags = 0);
void write_index(const Index* idx, IOWriter* writer, int io_flags = 0);

void write_index_binary(const IndexBinary* idx, const char* fname);
void write_index_binary(const IndexBinary* idx, FILE* f);
void write_index_binary(const IndexBinary* idx, IOWriter* writer);

// The read_index flags are implemented only for a subset of index types.
const int IO_FLAG_READ_ONLY = 2;
// strip directory component from ondisk filename, and assume it's in
// the same directory as the index file
const int IO_FLAG_ONDISK_SAME_DIR = 4;
// don't load IVF data to RAM, only list sizes
const int IO_FLAG_SKIP_IVF_DATA = 8;
// don't initialize precomputed table after loading
const int IO_FLAG_SKIP_PRECOMPUTE_TABLE = 16;
// don't compute the sdc table for PQ-based indices
// this will prevent distances from being computed
// between elements in the index. For indices like HNSWPQ,
// this will prevent graph building because sdc
// computations are required to construct the graph
const int IO_FLAG_PQ_SKIP_SDC_TABLE = 32;
// try to memmap data (useful to load an ArrayInvertedLists as an
// OnDiskInvertedLists)
const int IO_FLAG_MMAP = IO_FLAG_SKIP_IVF_DATA | 0x646f0000;

Index* read_index(const char* fname, int io_flags = 0);
Index* read_index(FILE* f, int io_flags = 0);
Index* read_index(IOReader* reader, int io_flags = 0);

IndexBinary* read_index_binary(const char* fname, int io_flags = 0);
IndexBinary* read_index_binary(FILE* f, int io_flags = 0);
IndexBinary* read_index_binary(IOReader* reader, int io_flags = 0);

void write_VectorTransform(const VectorTransform* vt, const char* fname);
void write_VectorTransform(const VectorTransform* vt, IOWriter* f);

VectorTransform* read_VectorTransform(const char* fname);
VectorTransform* read_VectorTransform(IOReader* f);

ProductQuantizer* read_ProductQuantizer(const char* fname);
ProductQuantizer* read_ProductQuantizer(IOReader* reader);

void write_ProductQuantizer(const ProductQuantizer* pq, const char* fname);
void write_ProductQuantizer(const ProductQuantizer* pq, IOWriter* f);

void write_InvertedLists(const InvertedLists* ils, IOWriter* f);
InvertedLists* read_InvertedLists(IOReader* reader, int io_flags = 0);

} // namespace faiss

#endif


If we need to make additional changes to make FAISS work, they would be very minimal. But with this change, we are going to face a real maintenance challenge.

I would like to have one more, broader discussion on this just to revisit things once again.

@0ctopus13prime (Collaborator Author)

> I am not really comfortable with the change we are doing to enable partial loading; from the change it feels like we are re-writing the whole FAISS search path again in Java. I know we must have discussed this during design review but did not anticipate the magnitude of change....

Thank you @Vikasht34 for taking a look.

  1. But MMap is not enough. We want to run the search on S3 as well, and mmap is not feasible for that case.
  2. Also, the code you presented gives users an option to use MMap to load 'everything' into memory instead of calling fread. Therefore, the APIs you shared do not solve the problem we want to solve: running vector search on FAISS in a memory-constrained environment.

And I agree about the maintenance costs. But on the flip side, with this, users could run FAISS in a memory-constrained environment, which addresses the current FAISS limitation of requiring memory-heavy, expensive hardware.
It is not an exaggeration to say that a user could run a 500GB FAISS index on a 32GB instance, which is not even possible at the moment.
Please consider the benefit users can get from this.

@Vikasht34 (Collaborator)

> 1. But MMap is not enough. We want to run the search on S3 as well, and mmap is not feasible for that case.

Let's have one more discussion among us.


Let's discuss this in detail; I think we need to revisit some parts of it again.
