---
layout: post
title: "Improving vector search diversity through native MMR"
authors:
- bzhangam
date: 2025-10-24
has_science_table: true
categories:
- technical-posts
meta_keywords: MMR, Maximal Marginal Relevance, search diversity, search ranking, OpenSearch 3.3, vector search
meta_description: Learn how to use Maximal Marginal Relevance (MMR) in OpenSearch to make your search results more diverse.
---
## Improving vector search diversity through native MMR

When it comes to search and recommendation systems, returning highly relevant results is only half the battle. Equally important is diversity: ensuring users see a range of results rather than multiple near-duplicates. OpenSearch 3.3 now supports native Maximal Marginal Relevance (MMR) for k-NN and neural queries, making this easy.
## What is MMR?

Maximal Marginal Relevance (MMR) is a re-ranking algorithm that balances relevance and diversity:

- **Relevance:** How well a result matches the query.
- **Diversity:** How different the results are from each other.

MMR iteratively selects results that are relevant to the query and not too similar to previously selected results. The trade-off is controlled by a diversity parameter (0 = prioritize relevance, 1 = prioritize diversity).

In vector search, this is particularly useful because embeddings often cluster similar results together. Without MMR, the top-k results might all look nearly identical.
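The selection loop described above can be sketched in a few lines of Python. This is an illustrative re-implementation, not OpenSearch's native code: the relevance scores, toy embeddings, and cosine similarity below are stand-ins for whatever the index actually stores.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (illustrative choice of metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mmr_rerank(candidates, k, diversity):
    """Greedily pick k results, trading off relevance vs. diversity.

    candidates: list of (doc_id, relevance_score, embedding) tuples.
    diversity:  0 = pure relevance ranking, 1 = pure diversity.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(c):
            # Penalize similarity to anything already selected.
            max_sim = max((cosine(c[2], s[2]) for s in selected), default=0.0)
            return (1 - diversity) * c[1] - diversity * max_sim
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [c[0] for c in selected]

candidates = [
    ("Red apple.", 0.95, [1.0, 0.0]),
    ("Red apple from USA.", 0.94, [0.99, 0.14]),  # near-duplicate of the first
    ("Orange juice.", 0.60, [0.0, 1.0]),
]
# With diversity=0.7, the near-duplicate "Red apple from USA." is displaced
# by the less relevant but more distinct "Orange juice."
ranked = mmr_rerank(candidates, k=2, diversity=0.7)
```

With `diversity=0`, the two near-duplicate apple results both make the cut; raising `diversity` lets a distinct result displace the second duplicate, which is exactly the behavior shown in the query examples later in this post.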
## Native MMR in OpenSearch

Previously, MMR could only be implemented externally, requiring custom pipelines and extra coding. Now, OpenSearch supports native MMR directly in k-NN and neural queries on `knn_vector` fields. This simplifies your setup and reduces latency.
## How to Use MMR

### Prerequisites

Before using Maximal Marginal Relevance (MMR) for reranking, make sure the required [system-generated search processor factories](https://docs.opensearch.org/latest/search-plugins/search-pipelines/system-generated-search-processors/) are enabled in your cluster:

```json
PUT _cluster/settings
{
  "persistent": {
    "cluster.search.enabled_system_generated_factories": [
      "mmr_over_sample_factory",
      "mmr_rerank_factory"
    ]
  }
}
```

These factories enable OpenSearch to automatically perform the oversampling and reranking steps needed for MMR.
### Example: Improving Diversity in Neural Search

Suppose we have a neural search index with a semantic field for product descriptions using a dense embedding model. You can set up your index following this [guide](https://docs.opensearch.org/latest/field-types/supported-field-types/semantic/).

#### Index Sample Data

We index a few example product descriptions:

```json
PUT /_bulk

{ "update": { "_index": "my-nlp-index", "_id": "1" } }
{ "doc": {"product_description": "Red apple from USA."}, "doc_as_upsert": true }

{ "update": { "_index": "my-nlp-index", "_id": "2" } }
{ "doc": {"product_description": "Red apple from usa."}, "doc_as_upsert": true }

{ "update": { "_index": "my-nlp-index", "_id": "3" } }
{ "doc": {"product_description": "Crispy apple."}, "doc_as_upsert": true }

{ "update": { "_index": "my-nlp-index", "_id": "4" } }
{ "doc": {"product_description": "Red apple."}, "doc_as_upsert": true }

{ "update": { "_index": "my-nlp-index", "_id": "5" } }
{ "doc": {"product_description": "Orange juice from usa."}, "doc_as_upsert": true }
```
#### Query Without MMR

A standard neural search query for "Red apple" might look like this:

```json
GET /my-nlp-index/_search
{
  "size": 3,
  "_source": { "excludes": ["product_description_semantic_info"] },
  "query": {
    "neural": {
      "product_description": { "query_text": "Red apple" }
    }
  }
}
```

Results:

```json
"hits": [
  { "_id": "4", "_score": 0.956, "_source": {"product_description": "Red apple."} },
  { "_id": "1", "_score": 0.743, "_source": {"product_description": "Red apple from USA."} },
  { "_id": "2", "_score": 0.743, "_source": {"product_description": "Red apple from usa."} }
]
```

Notice how all of the top results are very similar: there's little diversity in what the user sees.
#### Query With MMR

By adding MMR, we can diversify the top results while maintaining relevance:

```json
GET /my-nlp-index/_search
{
  "size": 3,
  "_source": { "excludes": ["product_description_semantic_info"] },
  "query": {
    "neural": {
      "product_description": { "query_text": "Red apple" }
    }
  },
  "ext": {
    "mmr": {
      "candidates": 10,
      "diversity": 0.4
    }
  }
}
```

Results:

```json
"hits": [
  { "_id": "4", "_score": 0.956, "_source": {"product_description": "Red apple."} },
  { "_id": "1", "_score": 0.743, "_source": {"product_description": "Red apple from USA."} },
  { "_id": "3", "_score": 0.611, "_source": {"product_description": "Crispy apple."} }
]
```

By using MMR, we introduce more diverse results (like "Crispy apple") without sacrificing relevance for the top hits.
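In application code, the `ext.mmr` section can be attached to an existing query body before it is sent to OpenSearch. The small helper below is our own illustrative sketch (the function name `with_mmr` is not part of any OpenSearch client library):

```python
def with_mmr(query_body, candidates=10, diversity=0.4):
    """Return a copy of an OpenSearch query body with MMR reranking enabled.

    Illustrative helper, not a client-library API. Adds the ext.mmr section
    that native MMR in OpenSearch 3.3 reads.
    """
    body = dict(query_body)  # shallow copy; the caller's body is left untouched
    body["ext"] = {"mmr": {"candidates": candidates, "diversity": diversity}}
    return body

query = {
    "size": 3,
    "query": {"neural": {"product_description": {"query_text": "Red apple"}}},
}
mmr_query = with_mmr(query, candidates=10, diversity=0.4)
```

Keeping the MMR settings in one place like this makes it easy to A/B test different `diversity` values without touching the underlying query.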
## Benchmarking MMR Reranking in OpenSearch

To evaluate the performance impact of Maximal Marginal Relevance (MMR) reranking, we ran benchmark tests on OpenSearch 3.3 across both [vector search](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/vectorsearch/params/corpus/10million/faiss-cohere-768-dp.json) and [neural-search](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/neural_search/params/semanticfield/neural_search_semantic_field_dense_model.json) workloads. These tests help quantify the latency trade-offs introduced by MMR while highlighting the benefits of more diverse search results.

### Cluster configuration

The following OpenSearch cluster configuration was used:

* Version: OpenSearch 3.3
* Data nodes: 3 × r6g.2xlarge
* Master nodes: 3 × c6g.xlarge
* Benchmark instance: c6g.large
### Vector Search Performance

We used the cohere-1m dataset, which contains one million precomputed embeddings, to evaluate k-nearest neighbor (KNN) queries. The table below summarizes query latency (in milliseconds) for different values of k and MMR candidate sizes:

| k | Query size | MMR candidates | KNN p50 (no MMR) | KNN p90 (no MMR) | KNN p50 (with MMR) | KNN p90 (with MMR) | p50 Δ (%) | p90 Δ (%) | p50 Δ (ms) | p90 Δ (ms) |
| --- | ---------- | -------------- | ---------------- | ---------------- | ------------------ | ------------------ | --------- | --------- | ---------- | ---------- |
| 1 | 1 | 1 | 6.70 | 7.19 | 8.22 | 8.79 | 22.7 | 22.2 | 1.52 | 1.60 |
| 10 | 10 | 10 | 8.09 | 8.64 | 9.14 | 9.62 | 13.0 | 11.3 | 1.05 | 0.98 |
| 30 | 10 | 30 | 7.85 | 8.40 | 10.83 | 11.48 | 37.9 | 36.7 | 2.98 | 3.08 |
| 50 | 10 | 50 | 7.17 | 7.63 | 11.76 | 12.55 | 64.1 | 64.5 | 4.59 | 4.92 |
| 50 | 20 | 50 | 8.04 | 8.57 | 14.08 | 14.94 | 75.0 | 74.4 | 6.04 | 6.37 |
| 50 | 50 | 50 | 8.34 | 8.91 | 17.25 | 17.94 | 106.8 | 101.3 | 8.91 | 9.03 |
| 100 | 10 | 100 | 7.92 | 8.46 | 15.81 | 16.73 | 99.7 | 97.7 | 7.89 | 8.27 |
### Neural Search Performance

For neural search, we used the Quora dataset, containing over 500,000 documents. The table below shows query latency with and without MMR reranking:

| k | Query size | MMR candidates | Neural p50 (no MMR) | Neural p90 (no MMR) | Neural p50 (with MMR) | Neural p90 (with MMR) | p50 Δ (%) | p90 Δ (%) | p50 Δ (ms) | p90 Δ (ms) |
| --- | ---------- | -------------- | ------------------- | ------------------- | --------------------- | --------------------- | --------- | --------- | ---------- | ---------- |
| 1 | 1 | 1 | 113.59 | 122.22 | 113.08 | 122.38 | -0.46 | 0.13 | -0.52 | 0.16 |
| 10 | 10 | 10 | 112.03 | 122.90 | 113.88 | 122.63 | 1.66 | -0.22 | 1.86 | -0.27 |
| 30 | 10 | 30 | 112.09 | 118.82 | 119.57 | 127.65 | 6.67 | 7.42 | 7.48 | 8.82 |
| 50 | 10 | 50 | 113.48 | 126.35 | 122.56 | 133.34 | 8.00 | 5.53 | 9.08 | 6.99 |
| 50 | 20 | 50 | 113.80 | 125.20 | 122.94 | 134.80 | 8.04 | 7.67 | 9.14 | 9.60 |
| 50 | 50 | 50 | 113.36 | 125.54 | 128.33 | 136.52 | 13.21 | 8.74 | 14.97 | 10.97 |
| 100 | 10 | 100 | 119.04 | 128.71 | 130.52 | 139.95 | 9.65 | 8.73 | 11.48 | 11.24 |
### Key Observations

1. MMR adds latency, and the increase grows with the number of MMR candidates.
2. KNN and neural queries without MMR scale well with k. The dominant cost comes from graph traversal (ef_search), not from selecting the top k candidates.

Choosing the number of MMR candidates requires balancing diversity against query latency. More candidates improve result diversity but increase latency, so select values appropriate for your workload.
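The Δ columns in the tables above are simple absolute and relative differences between the with-MMR and no-MMR latencies. As a quick worked example, here is the p50 row for k=50, query size 10 from the vector search table:

```python
# Recompute the latency deltas for one benchmark row (k=50, query size 10,
# 50 MMR candidates) from the vector search table above.
p50_no_mmr, p50_with_mmr = 7.17, 11.76

delta_ms = p50_with_mmr - p50_no_mmr      # absolute increase in milliseconds
delta_pct = 100 * delta_ms / p50_no_mmr   # relative increase in percent

# ~4.59 ms and ~64%; the table's 64.1% was presumably computed from
# unrounded raw measurements, hence the small rounding difference.
print(f"{delta_ms:.2f} ms, {delta_pct:.1f} %")
```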
## Using MMR with Cross-cluster Search

Currently, for [cross-cluster search](https://docs.opensearch.org/latest/search-plugins/cross-cluster-search/), OpenSearch cannot automatically resolve vector field information from the index mapping in the remote clusters. This means users must explicitly provide the vector field details when using MMR.

Here's an example query:

```json
POST /my-index/_search
{
  "query": {
    "neural": {
      "my_vector_field": {
        "query_text": "query text",
        "model_id": "<your model id>"
      }
    }
  },
  "ext": {
    "mmr": {
      "diversity": 0.5,
      "candidates": 10,
      "vector_field_path": "my_vector_field",
      "vector_field_data_type": "float",
      "vector_field_space_type": "l2"
    }
  }
}
```

The MMR parameters for remote clusters are:

- **vector_field_path:** Path to the vector field to use for MMR re-ranking.
- **vector_field_data_type:** Data type of the vector (e.g., `float`).
- **vector_field_space_type:** Distance metric used for similarity calculations (e.g., `l2`).
- **candidates** and **diversity:** Same as in local MMR queries, controlling the number of candidates and the diversity weight.

Providing this information ensures that MMR can correctly compute diversity and re-rank results even when querying across remote clusters.
## Summary

OpenSearch's Maximal Marginal Relevance (MMR) feature makes it easy to deliver search results that are both relevant and diverse. By intelligently re-ranking results, MMR helps surface a wider variety of options, reduces redundancy, and creates a richer, more engaging search experience for your users.

If you're looking to improve your vector search diversity, MMR in OpenSearch is a powerful tool to try today.

## What's Next

In the future, we can make MMR even easier and more flexible:

- **Better support for remote clusters:** Removing the need to manually specify vector field information.
- **Expanded query type support:** Currently, MMR supports only k-NN queries and neural queries on `knn_vector` fields. We could potentially support more query types, such as `bool` and hybrid queries, so that MMR can enhance a wider variety of search scenarios.
