---
layout: post
title: "Improving vector search diversity through native MMR"
authors:
  - bzhangam
date: 2025-10-24
has_science_table: true
categories:
  - technical-posts
meta_keywords: MMR, Maximal Marginal Relevance, search diversity, search ranking, OpenSearch 3.3, vector search
meta_description: Learn how to use Maximal Marginal Relevance (MMR) in OpenSearch to make your search results more diverse.
---

## Improving vector search diversity through native MMR

When it comes to search and recommendation systems, returning highly relevant results is only half the battle. Equally important is diversity — ensuring that users see a range of results rather than multiple near-duplicates. OpenSearch 3.3 introduces native Maximal Marginal Relevance (MMR) support for k-NN and neural queries, making this easy.

## What is MMR?

Maximal Marginal Relevance (MMR) is a re-ranking algorithm that balances relevance and diversity:

- **Relevance:** How well a result matches the query.
- **Diversity:** How different the results are from each other.

MMR iteratively selects results that are relevant to the query and not too similar to previously selected results. The trade-off is controlled by a diversity parameter (0 = prioritize relevance, 1 = prioritize diversity).

In vector search, this is particularly useful because embeddings often cluster similar results together. Without MMR, the top-k results might all look nearly identical.

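The selection loop described above can be sketched in a few lines of Python. This is an illustrative re-implementation, not OpenSearch's internal code; the document IDs, relevance scores, and toy embeddings are made up for the example:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mmr_rerank(candidates, k, diversity):
    """Iteratively pick k results, trading off relevance against
    similarity to already-selected results via `diversity` (0..1)."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def mmr_score(cand):
            _, relevance, emb = cand
            # Penalty: similarity to the most similar already-selected result.
            max_sim = max((cosine(emb, s[2]) for s in selected), default=0.0)
            return (1 - diversity) * relevance - diversity * max_sim
        best = max(pool, key=mmr_score)
        selected.append(best)
        pool.remove(best)
    return [cand[0] for cand in selected]

# Toy candidates: (doc_id, relevance_score, embedding)
docs = [
    ("1", 0.95, [1.0, 0.0]),    # "Red apple."
    ("2", 0.94, [0.99, 0.05]),  # near-duplicate of doc 1
    ("3", 0.60, [0.2, 0.9]),    # "Crispy apple." - points elsewhere
]
print(mmr_rerank(docs, k=2, diversity=0.4))  # ['1', '3']
print(mmr_rerank(docs, k=2, diversity=0.0))  # ['1', '2'] (pure relevance)
```

With `diversity=0.4` the near-duplicate is skipped in favor of the less similar document; with `diversity=0.0` the ranking degenerates to plain relevance ordering.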
## Native MMR in OpenSearch

Previously, MMR could only be implemented externally, requiring custom pipelines and extra coding. Now OpenSearch supports native MMR directly in k-NN and neural queries on `knn_vector` fields. This simplifies your setup and reduces latency.

## How to Use MMR

### Prerequisites
Before using Maximal Marginal Relevance (MMR) for reranking, make sure the required [system-generated search processor factories](https://docs.opensearch.org/latest/search-plugins/search-pipelines/system-generated-search-processors/) are enabled in your cluster:

```json
PUT _cluster/settings
{
  "persistent": {
    "cluster.search.enabled_system_generated_factories": [
      "mmr_over_sample_factory",
      "mmr_rerank_factory"
    ]
  }
}
```
These factories enable OpenSearch to automatically perform the oversampling and reranking steps needed for MMR.

### Example: Improving Diversity in Neural Search

Suppose we have a neural search index with a semantic field for product descriptions using a dense embedding model. You can set up your index following this [guide](https://docs.opensearch.org/latest/field-types/supported-field-types/semantic/).

#### Index Sample Data

We index a few example product descriptions:

```json
POST /_bulk
{ "update": { "_index": "my-nlp-index", "_id": "1" } }
{ "doc": {"product_description": "Red apple from USA."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "2" } }
{ "doc": {"product_description": "Red apple from usa."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "3" } }
{ "doc": {"product_description": "Crispy apple."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "4" } }
{ "doc": {"product_description": "Red apple."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "5" } }
{ "doc": {"product_description": "Orange juice from usa."}, "doc_as_upsert": true }
```

#### Query Without MMR

A standard neural search query for "Red apple" might look like this:
```json
GET /my-nlp-index/_search
{
  "size": 3,
  "_source": { "excludes": ["product_description_semantic_info"] },
  "query": {
    "neural": {
      "product_description": { "query_text": "Red apple" }
    }
  }
}
```
Results:

```json
"hits": [
  { "_id": "4", "_score": 0.956, "_source": {"product_description": "Red apple."} },
  { "_id": "1", "_score": 0.743, "_source": {"product_description": "Red apple from USA."} },
  { "_id": "2", "_score": 0.743, "_source": {"product_description": "Red apple from usa."} }
]
```
Notice how all top results are very similar — there’s little diversity in what the user sees.

#### Query With MMR

By adding MMR, we can diversify the top results while maintaining relevance:
```json
GET /my-nlp-index/_search
{
  "size": 3,
  "_source": { "excludes": ["product_description_semantic_info"] },
  "query": {
    "neural": {
      "product_description": { "query_text": "Red apple" }
    }
  },
  "ext": {
    "mmr": {
      "candidates": 10,
      "diversity": 0.4
    }
  }
}
```

Results:
```json
"hits": [
  { "_id": "4", "_score": 0.956, "_source": {"product_description": "Red apple."} },
  { "_id": "1", "_score": 0.743, "_source": {"product_description": "Red apple from USA."} },
  { "_id": "3", "_score": 0.611, "_source": {"product_description": "Crispy apple."} }
]
```

By using MMR, we introduce more diverse results (like “Crispy apple”) without sacrificing relevance for the top hits.

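If you issue queries from application code rather than the Dev Tools console, the MMR request is plain JSON and easy to build programmatically. A minimal sketch in Python; the send step (commented out) assumes a running cluster and the `opensearch-py` client:

```python
import json

def mmr_query(query_text, size=3, candidates=10, diversity=0.4):
    """Build the MMR neural search body; `candidates` and `diversity`
    map directly to the "ext.mmr" options."""
    return {
        "size": size,
        "_source": {"excludes": ["product_description_semantic_info"]},
        "query": {"neural": {"product_description": {"query_text": query_text}}},
        "ext": {"mmr": {"candidates": candidates, "diversity": diversity}},
    }

body = mmr_query("Red apple")
print(json.dumps(body, indent=2))

# Sending it (assumes opensearch-py is installed and a cluster at localhost:9200):
# from opensearchpy import OpenSearch
# client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
# response = client.search(index="my-nlp-index", body=body)
```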
## Benchmarking MMR Reranking in OpenSearch
To evaluate the performance impact of Maximal Marginal Relevance (MMR) reranking, we ran benchmark tests on OpenSearch 3.3 across both [vector search](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/vectorsearch/params/corpus/10million/faiss-cohere-768-dp.json) and [neural search](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/neural_search/params/semanticfield/neural_search_semantic_field_dense_model.json) workloads. These tests help quantify the latency trade-offs introduced by MMR while highlighting the benefits of more diverse search results.

### Cluster configuration

The following OpenSearch cluster configuration was used:

* Version: OpenSearch 3.3
* Data nodes: 3 × r6g.2xlarge
* Cluster manager nodes: 3 × c6g.xlarge
* Benchmark instance: c6g.large
### Vector Search Performance
We used the cohere-1m dataset, which contains one million precomputed embeddings, to evaluate k-nearest neighbor (KNN) queries. The table below summarizes query latency (in milliseconds) for different values of k and MMR candidate sizes:

| k | Query size | MMR candidates | KNN p50 (no MMR) | KNN p90 (no MMR) | KNN p50 (with MMR) | KNN p90 (with MMR) | p50 Δ (%) | p90 Δ (%) | p50 Δ (ms) | p90 Δ (ms) |
| --- | ---------- | -------------- | ---------------- | ---------------- | ------------------ | ------------------ | --------- | --------- | ---------- | ---------- |
| 1 | 1 | 1 | 6.70 | 7.19 | 8.22 | 8.79 | 22.7 | 22.2 | 1.52 | 1.60 |
| 10 | 10 | 10 | 8.09 | 8.64 | 9.14 | 9.62 | 13.0 | 11.3 | 1.05 | 0.98 |
| 30 | 10 | 30 | 7.85 | 8.40 | 10.83 | 11.48 | 37.9 | 36.7 | 2.98 | 3.08 |
| 50 | 10 | 50 | 7.17 | 7.63 | 11.76 | 12.55 | 64.1 | 64.5 | 4.59 | 4.92 |
| 50 | 20 | 50 | 8.04 | 8.57 | 14.08 | 14.94 | 75.0 | 74.4 | 6.04 | 6.37 |
| 50 | 50 | 50 | 8.34 | 8.91 | 17.25 | 17.94 | 106.8 | 101.3 | 8.91 | 9.03 |
| 100 | 10 | 100 | 7.92 | 8.46 | 15.81 | 16.73 | 99.7 | 97.7 | 7.89 | 8.27 |

| 165 | + |
| 166 | +### Neural Search Performance |
| 167 | + |
| 168 | +For neural search, we used the Quora dataset, containing over 500,000 documents. The table below shows query latency with and without MMR reranking: |
| 169 | + |
| 170 | +| k | Query size | MMR candidates | Neural p50 (no MMR) | Neural p90 (no MMR) | Neural p50 (with MMR) | Neural p90 (with MMR) | p50 Δ (%) | p90 Δ (%) | p50 Δ (ms) | p90 Δ (ms) | |
| 171 | +| --- | ---------- | -------------- | ------------------- | ------------------- | --------------------- | --------------------- | --------- | --------- | ---------- | ---------- | |
| 172 | +| 1 | 1 | 1 | 113.59 | 122.22 | 113.08 | 122.38 | -0.46 | 0.13 | -0.52 | 0.16 | |
| 173 | +| 10 | 10 | 10 | 112.03 | 122.90 | 113.88 | 122.63 | 1.66 | -0.22 | 1.86 | -0.27 | |
| 174 | +| 30 | 10 | 30 | 112.09 | 118.82 | 119.57 | 127.65 | 6.67 | 7.42 | 7.48 | 8.82 | |
| 175 | +| 50 | 10 | 50 | 113.48 | 126.35 | 122.56 | 133.34 | 8.00 | 5.53 | 9.08 | 6.99 | |
| 176 | +| 50 | 20 | 50 | 113.80 | 125.20 | 122.94 | 134.80 | 8.04 | 7.67 | 9.14 | 9.60 | |
| 177 | +| 50 | 50 | 50 | 113.36 | 125.54 | 128.33 | 136.52 | 13.21 | 8.74 | 14.97 | 10.97 | |
| 178 | +| 100 | 10 | 100 | 119.04 | 128.71 | 130.52 | 139.95 | 9.65 | 8.73 | 11.48 | 11.24 | |
| 179 | + |
### Key Observations

1. MMR adds latency, and the increase grows with the number of MMR candidates.
2. KNN and neural queries without MMR scale well with k: the dominant cost comes from graph traversal (`ef_search`), not from selecting the top k candidates.

Choosing the number of MMR candidates requires balancing diversity against query latency. More candidates improve result diversity but increase latency, so select values appropriate for your workload.

## Using MMR with Cross-cluster Search

Currently, for [cross-cluster search](https://docs.opensearch.org/latest/search-plugins/cross-cluster-search/), OpenSearch cannot automatically resolve vector field information from the index mapping in the remote clusters. This means users must explicitly provide the vector field details when using MMR.

Here’s an example query:

```json
POST /my-index/_search
{
  "query": {
    "neural": {
      "my_vector_field": {
        "query_text": "query text",
        "model_id": "<your model id>"
      }
    }
  },
  "ext": {
    "mmr": {
      "diversity": 0.5,
      "candidates": 10,
      "vector_field_path": "my_vector_field",
      "vector_field_data_type": "float",
      "vector_field_space_type": "l2"
    }
  }
}
```

MMR parameters for remote clusters:

- **vector_field_path:** Path to the vector field to use for MMR re-ranking.
- **vector_field_data_type:** Data type of the vector (for example, `float`).
- **vector_field_space_type:** Distance metric used for similarity calculations (for example, `l2`).
- **candidates** and **diversity:** Same as in local MMR queries, controlling the number of candidates and the diversity weight.

Providing this information ensures that MMR can correctly compute diversity and re-rank results even when querying across remote clusters.

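To make the space type concrete, here is a toy comparison of two common metrics, Euclidean (`l2`) distance and cosine similarity. The vectors are arbitrary examples; the point is that the declared space type determines how pairwise similarity between results is computed for diversity scoring:

```python
import math

def l2_distance(a, b):
    """Euclidean ("l2") distance: lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine similarity: higher means more similar (direction only)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# v points in the same direction as u but is twice as long:
# cosine similarity is 1.0 even though the l2 distance is nonzero.
u, v = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(l2_distance(u, v))        # ~3.742 (sqrt(14))
print(cosine_similarity(u, v))  # 1.0
```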
## Summary

OpenSearch’s Maximal Marginal Relevance (MMR) feature makes it easy to deliver search results that are both relevant and diverse. By intelligently re-ranking results, MMR helps surface a wider variety of options, reduces redundancy, and creates a richer, more engaging search experience for your users.

If you’re looking to improve your vector search diversity, MMR in OpenSearch is a powerful tool to try today.

## What's Next

In the future, we can make MMR even easier and more flexible:

- **Better support for remote clusters:** Removing the need to manually specify vector field information.
- **Expanded query type support:** MMR currently supports only k-NN queries and neural queries on `knn_vector` fields. We could potentially support more query types, such as `bool` and `hybrid` queries, so MMR can enhance a wider variety of search scenarios.