Commit 0a622d2

Alexandr Guzhva authored and facebook-github-bot committed
Update docs for benchmarks in benchs/ directory (facebookresearch#2565)
Summary: Pull Request resolved: facebookresearch#2565

Reviewed By: mdouze

Differential Revision: D40856253

fbshipit-source-id: 78f549bb37cdb3e6f562d877f5e33fa1c20834dc
1 parent 19f7696 commit 0a622d2

1 file changed, +31 -8 lines changed

benchs/README.md

@@ -75,7 +75,7 @@ http://corpus-texmex.irisa.fr/ to subdirectory bigann/
7575

7676
### Getting Deep1B
7777

78-
The ground-truth and queries are available here
78+
The ground-truth and queries are available here
7979

8080
https://yadi.sk/d/11eDCm7Dsn9GA
8181

@@ -145,7 +145,7 @@ The 8-byte results can be reproduced with the factory key `IMI2x12,PQ8`
145145

146146
### Experiments of the appendix
147147

148-
The experiments in the appendix are only in the ArXiv version of the paper (table 3).
148+
The experiments in the appendix are only in the ArXiv version of the paper (table 3).
149149

150150
```
151151
python bench_polysemous_1bn.py SIFT1000M OPQ8_64,IMI2x13,PQ8 nprobe={1,2,4,8,16,32,64,128},ht={20,24,26,28,30}
@@ -179,11 +179,11 @@ The original results were obtained with `nprobe=1024,ht=66,max_codes=262144`.
179179

180180
## GPU experiments
181181

182-
The benchmarks below run 1 or 4 Titan X GPUs and reproduce the results of the "GPU paper". They are also a good starting point on how to use GPU Faiss.
182+
The benchmarks below run 1 or 4 Titan X GPUs and reproduce the results of the "GPU paper". They are also a good starting point on how to use GPU Faiss.
183183

184184
### Search on SIFT1M
185185

186-
See above on how to get SIFT1M into subdirectory sift1M/. The script [`bench_gpu_sift1m.py`](bench_gpu_sift1m.py) reproduces the "exact k-NN time" plot in the ArXiv paper, and the SIFT1M numbers.
186+
See above on how to get SIFT1M into subdirectory sift1M/. The script [`bench_gpu_sift1m.py`](bench_gpu_sift1m.py) reproduces the "exact k-NN time" plot in the ArXiv paper, and the SIFT1M numbers.
187187

188188
The output is:
189189
```
@@ -245,14 +245,14 @@ nprobe= 512 0.527 s recalls= 0.9907 0.9987 0.9987
245245

246246
To get the "infinite MNIST dataset", follow the instructions on [Léon Bottou's website](http://leon.bottou.org/projects/infimnist). The script assumes the file `mnist8m-patterns-idx3-ubyte` is in subdirectory `mnist8m`.
247247

248-
The script [`kmeans_mnist.py`](kmeans_mnist.py) produces the following output:
248+
The script [`kmeans_mnist.py`](kmeans_mnist.py) produces the following output:
249249

250250
```
251251
python kmeans_mnist.py 1 256
252252
...
253253
Clustering 8100000 points in 784D to 256 clusters, redo 1 times, 20 iterations
254254
Preprocessing in 7.94526 s
255-
Iteration 19 (131.697 s, search 114.78 s): objective=1.44881e+13 imbalance=1.05963 nsplit=0
255+
Iteration 19 (131.697 s, search 114.78 s): objective=1.44881e+13 imbalance=1.05963 nsplit=0
256256
final objective: 1.449e+13
257257
total runtime: 140.615 s
258258
```
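
The `objective` and `imbalance` values in the log above can be illustrated with a minimal pure-Python sketch. It assumes that the objective is the sum of squared L2 distances from each point to its nearest centroid, and that the imbalance factor is `k * sum(n_i^2) / n^2` (1.0 for perfectly balanced clusters). The helper names below are hypothetical and are not part of `kmeans_mnist.py`.

```python
from collections import Counter


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Return (nearest centroid index, squared distance) for each point."""
    out = []
    for p in points:
        dists = [squared_l2(p, c) for c in centroids]
        best = min(range(len(centroids)), key=dists.__getitem__)
        out.append((best, dists[best]))
    return out


def kmeans_stats(points, centroids):
    """Compute the clustering objective and the imbalance factor.

    Assumption: imbalance = k * sum(n_i^2) / n^2, where n_i is the
    number of points assigned to centroid i.
    """
    assignments = assign(points, centroids)
    objective = sum(d for _, d in assignments)
    counts = Counter(i for i, _ in assignments)
    k, n = len(centroids), len(points)
    imbalance = k * sum(c * c for c in counts.values()) / (n * n)
    return objective, imbalance


# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [(0.0, 0.5), (10.0, 10.5)]
objective, imbalance = kmeans_stats(points, centroids)
print(objective, imbalance)
```

An imbalance close to 1, as in the log above, suggests that the clusters have similar sizes.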
@@ -263,7 +263,7 @@ The script [`bench_gpu_1bn.py`](bench_gpu_1bn.py) runs multi-gpu searches on the
263263

264264
Even on multiple GPUs, building the 1B datasets can last several hours. It is often a good idea to validate that everything is working fine on smaller datasets like SIFT1M, SIFT2M, etc.
265265

266-
The search results on SIFT1B in the "GPU paper" can be obtained with
266+
The search results on SIFT1B in the "GPU paper" can be obtained with
267267

268268
<!-- see P57124181 -->
269269

@@ -285,7 +285,7 @@ We use the `-tempmem` option to reduce the temporary memory allocation to 1.5G,
285285

286286
### Search on Deep1B
287287

288-
The same script generates the GPU search results on Deep1B.
288+
The same script generates the GPU search results on Deep1B.
289289

290290
```
291291
python bench_gpu_1bn.py Deep1B OPQ20_80,IVF262144,PQ20 -nnn 10 -R 2 -ngpu 4 -altadd -noptables -tempmem $[1024*1024*1024]
@@ -336,3 +336,26 @@ search...
336336
999997440/1000000000 (36717.207 s, 0.6015) probe=128: 36717.309 s rank-10 intersection results: 0.6015
337337
999997440/1000000000 (70616.392 s, 0.6047) probe=256: 70616.581 s rank-10 intersection results: 0.6047
338338
```
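
The "rank-10 intersection" numbers in the log above can be read as the average fraction of the true 10 nearest neighbors that appear among the 10 returned results. The following is a minimal pure-Python sketch of that measure under this assumption; the function name and the toy data are hypothetical and are not taken from `bench_gpu_1bn.py`.

```python
def intersection_measure(gt_ids, result_ids, rank):
    """Average overlap between the true and the returned top-`rank` ids.

    gt_ids and result_ids hold one list of neighbor ids per query.
    """
    assert len(gt_ids) == len(result_ids)
    hits = 0
    for gt_row, res_row in zip(gt_ids, result_ids):
        # Order within the top-`rank` does not matter, only membership.
        hits += len(set(gt_row[:rank]) & set(res_row[:rank]))
    return hits / (rank * len(gt_ids))


# Two queries, rank 3: the first finds 2 of 3 true neighbors, the second 1 of 3.
gt = [[1, 2, 3], [4, 5, 6]]
found = [[3, 1, 9], [7, 8, 4]]
score = intersection_measure(gt, found, rank=3)
print(score)
```

Unlike recall@1, this measure rewards partially correct result lists, which is why it is often used for k-NN graph style evaluations.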
339+
340+
# Additional benchmarks
341+
342+
This directory also contains additional benchmarks, which also serve as further examples of how to use the Faiss code.
343+
Some of these tests and benchmarks may be outdated.
344+
345+
* bench_6bit_codec.cpp - tests vector codecs for SQ6 quantization on a synthetic dataset
346+
* bench_cppcontrib_sa_decode.cpp - benchmarks specialized kernels for vector codecs for PQ, IVFPQ and Residual+PQ on a synthetic dataset
347+
* bench_for_interrupt.py - evaluates the impact of the interrupt callback handler (which can be triggered from Python code)
348+
* bench_hamming_computer.cpp - specialized implementations for Hamming distance computations
349+
* bench_heap_replace.cpp - benchmarks different implementations of certain calls for a Heap data structure
350+
* bench_hnsw.py - benchmarks HNSW, alone and in combination with other index types, on the SIFT1M dataset
351+
* bench_index_flat.py - benchmarks IndexFlatL2 on a synthetic dataset
352+
* bench_index_pq.py - benchmarks PQ on the SIFT1M dataset
353+
* bench_ivf_fastscan_single_query.py - benchmarks a single query at different nprobe levels for IVF{nlist},PQ{M}x4fs on the BIGANN dataset
354+
* bench_ivf_fastscan.py - compares IVF{nlist},PQ{M}x4fs against other indices on the SIFT1M dataset
355+
* bench_ivf_selector.cpp - checks the possible overhead of using the faiss::IDSelectorAll interface
356+
* bench_pairwise_distances.py - benchmarks pairwise distance computation between two synthetic datasets
357+
* bench_partition.py - benchmarks partitioning functions
358+
* bench_pq_tables.py - benchmarks ProductQuantizer.compute_inner_prod_tables() and ProductQuantizer.compute_distance_tables() calls
359+
* bench_quantizer.py - benchmarks various quantizers for SIFT1M, Deep1B, BigANN datasets
360+
* bench_scalar_quantizer.py - benchmarks IVF+SQ on the SIFT1M dataset
361+
* bench_vector_ops.py - benchmarks dot product and distance computations on a synthetic dataset
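
As a rough illustration of what several of the benchmarks above measure, here is a minimal pure-Python sketch that times a Hamming distance computation between two binary codes. It is not taken from `bench_hamming_computer.cpp`; the function name, the code size and the repetition count are all assumptions chosen for the example.

```python
import os
import timeit


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary codes."""
    assert len(a) == len(b)
    x = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    return bin(x).count("1")


# Two random 64-byte (512-bit) codes; the size is arbitrary.
code_a = os.urandom(64)
code_b = os.urandom(64)

# Time 10000 distance computations, in the spirit of a micro-benchmark.
elapsed = timeit.timeit(lambda: hamming(code_a, code_b), number=10000)
print(f"10000 Hamming distances in {elapsed:.4f} s")
```

The C++ benchmarks in this directory follow the same pattern (fixed inputs, repeated calls, wall-clock timing), but with specialized implementations instead of generic Python integers.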
