Docs: more about ANN #19943

Merged 3 commits on Jun 25, 2025
5 changes: 3 additions & 2 deletions ydb/docs/en/core/concepts/glossary.md
@@ -158,11 +158,12 @@ A **primary index** or **primary key index** is the main data structure used to

A **secondary index** is an additional data structure used to locate rows in a table, typically when it can't be done efficiently using the [primary index](#primary-index). Unlike the primary index, secondary indexes are managed independently from the main table data. Thus, a table might have multiple secondary indexes for different use cases. {{ ydb-short-name }}'s capabilities in terms of secondary indexes are covered in a separate article [{#T}](secondary_indexes.md). Secondary indexes can be either unique or non-unique.

-A special type of **secondary index** is singled out separately - [vector index] (#vector-index).
+A special type of **secondary index** is singled out separately - [vector index](#vector-index).

#### Vector Index {#vector-index}

-**Vector index** is an additional data structure used to speed up the [vector search](vector_search.md) when there is a large amount of data, and the [exact vector search without an index](../yql/reference/udf/list/knn.md) does not perform satisfactorily. The capabilities of {{ ydb-short-name }} regarding vector indexes are described in a separate article [{#T}](../dev/vector-indexes.md).
+**Vector index** is an additional data structure used to speed up the [vector search](vector_search.md) when there is a large amount of data, and the [exact vector search without an index](../yql/reference/udf/list/knn.md) does not perform satisfactorily.
+The capabilities of {{ ydb-short-name }} regarding **ANN search** (approximate nearest neighbor search) with vector indexes are described in a separate article [{#T}](../dev/vector-indexes.md).

**Vector index** is distinct from a [secondary index](#secondary-index) as it solves other tasks.

5 changes: 4 additions & 1 deletion ydb/docs/en/core/concepts/vector_search.md
@@ -4,11 +4,14 @@

**Vector search**, also known as [nearest neighbor search](https://en.wikipedia.org/wiki/Nearest_neighbor_search) (NN), is an optimization problem where the goal is to find the nearest vector (or a set of vectors) in a given dataset relative to a specified query vector. The proximity between vectors is determined using distance or similarity metrics.

+One common approach, especially for large datasets, is **approximate nearest neighbor (ANN) search**, which allows faster vector retrieval at the cost of potential accuracy trade-offs.
+
+
Vector search is actively used in the following areas:

* recommendation systems
* semantic search
-* search for similar images
+* image similarity search
* anomaly detection
* classification systems
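
To make the difference between exact and approximate nearest neighbor search described above more concrete, here is a minimal Python sketch. It is an illustration only and is not taken from this pull request or from {{ ydb-short-name }}: the function names, the toy bucket-by-centroid index, the choice of Euclidean distance, and the sample data are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_nn(dataset, query, k=1):
    """Exact search: compare the query with every vector, keep the k closest."""
    return sorted(dataset, key=lambda v: euclidean(v, query))[:k]


def build_buckets(dataset, centroids):
    """Toy index (hypothetical): assign each vector to its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for v in dataset:
        nearest = min(range(len(centroids)),
                      key=lambda i: euclidean(v, centroids[i]))
        buckets[nearest].append(v)
    return buckets


def approx_nn(buckets, centroids, query, k=1):
    """Approximate search: scan only the bucket whose centroid is closest."""
    nearest = min(range(len(centroids)),
                  key=lambda i: euclidean(query, centroids[i]))
    return exact_nn(buckets[nearest], query, k)


# Made-up sample data: five 2-D vectors and two hand-picked centroids.
dataset = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (5.5, 0.0), (10.0, 0.0)]
centroids = [(0.0, 0.0), (10.0, 0.0)]
buckets = build_buckets(dataset, centroids)

query = (4.9, 0.0)
exact = exact_nn(dataset, query)
approx = approx_nn(buckets, centroids, query)
print(exact)   # [(5.5, 0.0)]
print(approx)  # [(3.0, 0.0)]
```

The approximate search scans three vectors instead of five, but for this query it misses the true nearest neighbor because that vector was assigned to the other bucket. This is the accuracy trade-off mentioned above, shown on a deliberately tiny dataset.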

3 changes: 2 additions & 1 deletion ydb/docs/ru/core/concepts/glossary.md
@@ -156,7 +156,8 @@

#### Vector index {#vector-index}

-**Векторный индекс**, or **vector index**, is an additional data structure used to speed up [vector search](vector_search.md) when there is a large amount of data and [exact vector search without an index](../yql/reference/udf/list/knn.md) does not perform satisfactorily. The capabilities of {{ ydb-short-name }} regarding vector indexes are described in a separate article [{#T}](../dev/vector-indexes.md).
+**Векторный индекс**, or **vector index**, is an additional data structure used to speed up [vector search](vector_search.md) when there is a large amount of data and [exact vector search without an index](../yql/reference/udf/list/knn.md) does not perform satisfactorily.
+The capabilities of {{ ydb-short-name }} for approximate nearest neighbor search (ANN search) using vector indexes are described in a separate article [{#T}](../dev/vector-indexes.md).

A **vector index** is singled out separately from a [secondary index](#secondary-index) because it solves different tasks.

2 changes: 2 additions & 0 deletions ydb/docs/ru/core/concepts/vector_search.md
@@ -4,6 +4,8 @@

**Vector search**, also known as [nearest neighbor search](https://en.wikipedia.org/wiki/Nearest_neighbor_search) (NN), is an optimization problem in which the goal is to find the nearest vector (or a set of vectors) in a given dataset relative to a specified query vector. The proximity between vectors is determined using distance or similarity metrics.

+One common approach, especially when working with large datasets, is approximate nearest neighbor search (ANN search), which returns results faster at the cost of a possible loss of accuracy.
+
Vector search is actively used in the following areas:

* recommendation systems;