Adds ankerl::unordered_dense::{segmented_map, segmented_set}
This new underlying container has a much smoother memory allocation curve
than the default underlying `std::vector`.

* Much smoother memory usage: it increases continuously.
* No high peak memory usage.
* Faster insertion, because elements never need to be moved to newly allocated blocks.
* Slightly slower indexing compared to `std::vector`, because an additional
  indirection is needed.

Abseil is fastest for this simple insertion test, taking a bit over 0.8 seconds.
Its peak memory usage is about 430 MB. Note how the memory usage goes down after
the last peak; when it drops to ~290 MB it has finished rehashing and could free
the previously used memory block.

`ankerl::unordered_dense::segmented_map` doesn't have these peaks, and instead has
a smooth increase of memory usage. Note there are still sudden drops & increases in
memory because the indexing data structure still needs to grow by a fixed
factor. But because the data is held in a separate container, we are able to first free
the old indexing structure and then allocate a new, bigger one; thus we
do not have peaks.

bump to 4.0.0
martinus committed Apr 8, 2023
1 parent 10782bf commit ec970e9
Showing 182 changed files with 1,965 additions and 617 deletions.
30 changes: 14 additions & 16 deletions .clang-tidy
@@ -1,20 +1,18 @@
---
Checks: '*,
-altera*,
-fuchsia*,
-llvmlibc*,
-bugprone-easily-swappable-parameters,
-cert-err58-cpp,
-cppcoreguidelines-avoid-magic-numbers,
-cppcoreguidelines-pro-bounds-constant-array-index,
-cppcoreguidelines-pro-bounds-pointer-arithmetic,
-llvm-header-guard,
-readability-function-cognitive-complexity,
-readability-identifier-length,
-readability-magic-numbers,
'
Checks: '*
-altera*
-bugprone-easily-swappable-parameters
-cert-err58-cpp
-cppcoreguidelines-avoid-magic-numbers
-cppcoreguidelines-pro-bounds-constant-array-index
-cppcoreguidelines-pro-bounds-pointer-arithmetic
-fuchsia*
-llvm-header-guard
-llvmlibc*
-readability-function-cognitive-complexity
-readability-identifier-length
-readability-magic-numbers
'
WarningsAsErrors: '*'
HeaderFilterRegex: ''
CheckOptions:
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -1,6 +1,6 @@
cmake_minimum_required(VERSION 3.12)
project("unordered_dense"
VERSION 3.1.1
VERSION 4.0.0
DESCRIPTION "A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion"
HOMEPAGE_URL "https://github.com/martinus/unordered_dense")

28 changes: 23 additions & 5 deletions README.md
@@ -10,7 +10,9 @@

A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion.

The classes `ankerl::unordered_dense::map` and `ankerl::unordered_dense::set` are (almost) drop-in replacements for `std::unordered_map` and `std::unordered_set`. While they don't have as strong iterator / reference stability guarantees, they are typically *much* faster.
The classes `ankerl::unordered_dense::map` and `ankerl::unordered_dense::set` are (almost) drop-in replacements for `std::unordered_map` and `std::unordered_set`. While they don't have as strong iterator / reference stability guarantees, they are typically *much* faster.

Additionally, there are `ankerl::unordered_dense::segmented_map` and `ankerl::unordered_dense::segmented_set` with lower peak memory usage.

- [1. Overview](#1-overview)
- [2. Installation](#2-installation)
@@ -257,13 +259,29 @@ The map/set supports two different bucket types. The default should be good for
* up to 2^63 = 9223372036854775808 elements.
* 12 bytes overhead per bucket.

## 4. Design
## 4. `segmented_map` and `segmented_set`

`ankerl::unordered_dense` provides a custom container implementation that has lower memory requirements than the default `std::vector`. Memory is not contiguous, but it can allocate segments without having to reallocate and move all the elements. In summary, this leads to:

* Much smoother memory usage: it increases continuously.
* No high peak memory usage.
* Faster insertion, because elements never need to be moved to newly allocated blocks.
* Slightly slower indexing compared to `std::vector`, because an additional indirection is needed.
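
To make the trade-offs in this list concrete, here is a minimal sketch of a segmented container. It is an illustration only, not the library's actual implementation: the name `SegmentedVec`, the `SegmentSize` parameter and the use of `std::unique_ptr<T[]>` blocks are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical illustration only; NOT the container used by the library.
// Requires T to be default-constructible and copy-assignable.
template <typename T, std::size_t SegmentSize = 1024>
class SegmentedVec {
    // Each segment is a fixed-capacity block. Blocks are never reallocated,
    // so elements stay at the address where they were first stored.
    std::vector<std::unique_ptr<T[]>> m_segments;
    std::size_t m_size = 0;

public:
    void push_back(T const& value) {
        if (m_size == m_segments.size() * SegmentSize) {
            // Growth allocates one more fixed-size block; only the small
            // array of block pointers may be reallocated, never the elements.
            m_segments.push_back(std::make_unique<T[]>(SegmentSize));
        }
        m_segments[m_size / SegmentSize][m_size % SegmentSize] = value;
        ++m_size;
    }

    // Indexing costs one extra indirection: pick the segment, then the offset.
    T& operator[](std::size_t i) {
        return m_segments[i / SegmentSize][i % SegmentSize];
    }

    std::size_t size() const { return m_size; }
};
```

In this sketch a pointer to an element stays valid while more elements are appended, which is the property that lets insertion avoid moving data; the division and modulo in `operator[]` are the extra indirection mentioned in the last bullet.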

Here is a comparison against `absl::flat_hash_map` and `ankerl::unordered_dense::map` when inserting 10 million entries:
![allocated memory](doc/allocated_memory.png)

Abseil is fastest for this simple insertion test, taking a bit over 0.8 seconds. Its peak memory usage is about 430 MB. Note how the memory usage goes down after the last peak; when it drops to ~290 MB it has finished rehashing and could free the previously used memory block.

`ankerl::unordered_dense::segmented_map` doesn't have these peaks, and instead has a smooth increase of memory usage. Note there are still sudden drops & increases in memory because the indexing data structure still needs to grow by a fixed factor. But because the data is held in a separate container, we are able to first free the old indexing structure and then allocate a new, bigger one; thus we do not have peaks.
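
The arithmetic behind these peaks can be sketched in a few lines. The functions below are hypothetical helpers written for this illustration; they assume a growth factor of 2 for the contiguous case (the real factor of `std::vector` is implementation-defined) and count element slots or bytes rather than measuring real allocations.

```cpp
#include <algorithm>
#include <cstddef>

// Peak number of element slots alive at once when a contiguous array grows
// by doubling (assumed factor): while elements are copied over, the old and
// the new block exist at the same time.
std::size_t peak_slots_doubling(std::size_t n) {
    std::size_t capacity = 1;
    std::size_t peak = capacity;
    while (capacity < n) {
        std::size_t const bigger = capacity * 2;
        peak = std::max(peak, capacity + bigger); // old + new during the move
        capacity = bigger;
    }
    return peak;
}

// Peak slots with fixed-size segments: blocks are only ever added, so the
// peak is simply the final number of slots.
std::size_t peak_slots_segmented(std::size_t n, std::size_t segment_size) {
    std::size_t const segments = (n + segment_size - 1) / segment_size;
    return segments * segment_size;
}

// Bytes alive at the worst moment of an index rebuild. Freeing the old index
// before allocating the new one avoids having both alive at once.
std::size_t peak_index_rebuild(std::size_t old_bytes, std::size_t new_bytes,
                               bool free_old_first) {
    return free_old_first ? std::max(old_bytes, new_bytes)
                          : old_bytes + new_bytes;
}
```

Under these assumptions, growing to 10 million elements by doubling briefly holds 25,165,824 slots (2^23 + 2^24), while segments of 4096 elements never hold more than 10,002,432. These are slot counts for the value storage only and are not meant to reproduce the measured numbers in the plot.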

## 5. Design

The map/set has two data structures:
* `std::vector<value_type>` which holds all data. map/set iterators are just `std::vector<value_type>::iterator`!
* An indexing structure (bucket array), which is a flat array with 8-byte buckets.

### 4.1. Inserts
### 5.1. Inserts

Whenever an element is added, it is appended to the vector with `emplace_back`. The key is hashed, and an entry (bucket) is added at the
corresponding location in the bucket array. The bucket has this structure:
@@ -283,12 +301,12 @@ Each bucket stores 3 things:
This structure is especially designed for the collision resolution strategy robin-hood hashing with backward shift
deletion.
### 4.2. Lookups
### 5.2. Lookups
The key is hashed, and the bucket array is searched for an entry at that location with a matching fingerprint. When one is found,
the key in the data vector is compared, and if it is equal the value is returned.
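
The insert and lookup paths described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: it uses plain linear probing instead of robin-hood hashing, the `Bucket` layout, the 8-bit fingerprint, the hash mixing constant and the 50% load limit are invented for the example, and the class name `DenseMapSketch` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: linear probing, NOT robin-hood hashing.
class DenseMapSketch {
    struct Bucket {
        std::uint32_t fingerprint; // a few hash bits, filters most key compares
        std::uint32_t value_idx;   // position of the element in m_values
        bool used;
    };

    std::vector<std::pair<std::uint64_t, std::uint64_t>> m_values; // dense data
    std::vector<Bucket> m_buckets = std::vector<Bucket>(8);        // the index

    static std::uint64_t hash(std::uint64_t key) {
        // Assumed mixing step; std::hash alone is often the identity.
        return std::hash<std::uint64_t>{}(key) * 0x9E3779B97F4A7C15ULL;
    }

    void place(std::uint64_t h, std::uint32_t value_idx) {
        std::size_t const mask = m_buckets.size() - 1; // size is a power of two
        std::size_t pos = (h >> 32) & mask;
        while (m_buckets[pos].used) { pos = (pos + 1) & mask; }
        m_buckets[pos].fingerprint = static_cast<std::uint32_t>(h & 0xFF);
        m_buckets[pos].value_idx = value_idx;
        m_buckets[pos].used = true;
    }

    void grow_index() {
        // The index owns no data, so it can be released first and then
        // rebuilt at twice the size from the value vector.
        std::size_t const new_size = m_buckets.size() * 2;
        std::vector<Bucket>().swap(m_buckets);
        m_buckets.resize(new_size);
        for (std::uint32_t i = 0; i < m_values.size(); ++i) {
            place(hash(m_values[i].first), i);
        }
    }

public:
    // Returns false if the key was already present.
    bool insert(std::uint64_t key, std::uint64_t value) {
        if (find(key) != nullptr) { return false; }
        m_values.emplace_back(key, value); // data is appended to the vector
        if (m_values.size() * 2 > m_buckets.size()) {
            grow_index(); // the rebuild also indexes the new element
        } else {
            place(hash(key), static_cast<std::uint32_t>(m_values.size() - 1));
        }
        return true;
    }

    std::uint64_t* find(std::uint64_t key) {
        std::uint64_t const h = hash(key);
        std::size_t const mask = m_buckets.size() - 1;
        auto const fp = static_cast<std::uint32_t>(h & 0xFF);
        for (std::size_t pos = (h >> 32) & mask; m_buckets[pos].used;
             pos = (pos + 1) & mask) {
            Bucket const& b = m_buckets[pos];
            // Cheap fingerprint check first, then the key in the data vector.
            if (b.fingerprint == fp && m_values[b.value_idx].first == key) {
                return &m_values[b.value_idx].second;
            }
        }
        return nullptr;
    }

    std::size_t size() const { return m_values.size(); }
};
```

The point of the sketch is the split between the two structures: all elements live densely in `m_values`, while the bucket array only stores a fingerprint and an index into it, so the index can be thrown away and rebuilt at any time.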
### 4.3. Removals
### 5.3. Removals
Since all data is stored in a vector, removals are a bit more complicated:
Fuzz corpus inputs added under data/fuzz/api/ (raw binary contents are not displayable and are omitted here):
1 change: 1 addition & 0 deletions data/fuzz/api/00b05d7d90848f32ca139aa1a2a7bfc05ecdc6aa
1 change: 1 addition & 0 deletions data/fuzz/api/05cb85ee3c7ee6f363d64e4e3e2dd99a9c755a0a
1 change: 1 addition & 0 deletions data/fuzz/api/08c684aa36bebe1d5c53e9761e6029a490377e93
1 change: 1 addition & 0 deletions data/fuzz/api/0d06b0a171fca96ca6a3e68323726b39e008838a
1 change: 1 addition & 0 deletions data/fuzz/api/1078aadb2db22cf008832242292a02fca407acb2
1 change: 1 addition & 0 deletions data/fuzz/api/18c8e7b5836a3fd84896e6423e48f20cf6b5f7fe
1 change: 1 addition & 0 deletions data/fuzz/api/29371b45b46460a6bf8240283a31cc49a30c6c83
1 change: 1 addition & 0 deletions data/fuzz/api/2a687445f01f304375eef92e7b616a56e6e2bb2d
1 change: 1 addition & 0 deletions data/fuzz/api/31c440f9ab80976665056e180c9c49263c32aeb7
1 change: 1 addition & 0 deletions data/fuzz/api/3ee207f869c84fca99b82c26283935f5db0fbaf7
1 change: 1 addition & 0 deletions data/fuzz/api/4067c8d1bee31f0fff1ad43cc9bf358bf965830d
1 change: 1 addition & 0 deletions data/fuzz/api/451bb5b8e85859f406439d5777ac733416f4bf43
1 change: 1 addition & 0 deletions data/fuzz/api/46d6a22920746f08ce476c3c0891a312422d8430
1 change: 1 addition & 0 deletions data/fuzz/api/46ed7c052afc7f78fad09e04bdd05e230bae9bf1
1 change: 1 addition & 0 deletions data/fuzz/api/4a0977214a607e3f61185afb43493a0a3dda0ab0
1 change: 1 addition & 0 deletions data/fuzz/api/5124a4a6f29061b3d61f3e8a86db67a4128ba351
1 change: 1 addition & 0 deletions data/fuzz/api/5375cdb60501a7b4d522805d484d32ea8a0bc605
1 change: 1 addition & 0 deletions data/fuzz/api/53d3a3d0e9c50b3050f17f9cb74e345f8c6400e1
1 change: 1 addition & 0 deletions data/fuzz/api/5af6c73c6a666c0d483ed0c74f0469acbfb57de3
1 change: 1 addition & 0 deletions data/fuzz/api/6035e522da2f771dbdaf1027b132429f0b41f0de
1 change: 1 addition & 0 deletions data/fuzz/api/6a3325de19b7a40f7f7abef68978d55137cc8991
1 change: 1 addition & 0 deletions data/fuzz/api/6d01dff409415c1f1299af3502f125eef9b71df5
1 change: 1 addition & 0 deletions data/fuzz/api/76d14ea91168626a28ed808ab86d116969ee5579
1 change: 1 addition & 0 deletions data/fuzz/api/818a038b030208a3fc21ef5f56006a6ca8612b29
1 change: 1 addition & 0 deletions data/fuzz/api/88fa28c86e37004eff702379366a79790a264c26
1 change: 1 addition & 0 deletions data/fuzz/api/9b4eb939e2fba6ce08c6bd6d9e2fb27e72ef9f4b
1 change: 1 addition & 0 deletions data/fuzz/api/b7b4e06ffc64fbb2c8974fd39541a1ef1b7493b4
1 change: 1 addition & 0 deletions data/fuzz/api/bf4d13c7ee2607d3cefa3671e90ae41914a3817e
1 change: 1 addition & 0 deletions data/fuzz/api/c85488a14efb3cd93838a2136d398912793fc8b5
1 change: 1 addition & 0 deletions data/fuzz/api/c9c9699668662ac0d4c21cafd4b2e86f4b3db945
1 change: 1 addition & 0 deletions data/fuzz/api/cdbebb288783b7afbfea18f41d74282a5a760baa
1 change: 1 addition & 0 deletions data/fuzz/api/ce547c1571410a433155945e3c8efd6608a8cd55
1 change: 1 addition & 0 deletions data/fuzz/api/d09396127da6714e99f1f18d881939e19749c627
1 change: 1 addition & 0 deletions data/fuzz/api/d62f1806de56aed27a3782f7c390ba2937b34a34
1 change: 1 addition & 0 deletions data/fuzz/api/e573c1e57431b2d466c8c41408eb764892321bdd
1 change: 1 addition & 0 deletions data/fuzz/api/ea602310ae83e245a51971758a86e97f203d6095
1 change: 1 addition & 0 deletions data/fuzz/api/eb487de4e5b68bc61fa54da5140481e7a07f11c8
1 change: 1 addition & 0 deletions data/fuzz/api/ee292cb81ed36de2ac8c3c4e07f3e61f881ea9e7
1 change: 1 addition & 0 deletions data/fuzz/api/f8ca2fde2281ab28f40091fb8083e9eef068e01c
1 change: 1 addition & 0 deletions data/fuzz/api/fdeeb02ad36e89b018b71dea53e5faaee2081ca0
Further binary files in this part of the commit are not shown.
44 changes: 44 additions & 0 deletions doc/allocated_memory.gnuplot
@@ -0,0 +1,44 @@
#!/usr/bin/gnuplot

#set terminal pngcairo
#set terminal pngcairo size 730,510 enhanced font 'Verdana,10'
set terminal pngcairo size 800,600 enhanced font 'Verdana,10'

# define axis
# remove border on top and right and set color to gray
set style line 11 lc rgb '#808080' lt 1
set border 3 back ls 11
set tics nomirror
# define grid
set style line 12 lc rgb '#808080' lt 0 lw 1
set grid back ls 12

# line styles
set style line 1 lt 1 lc rgb '#1B9E77' # dark teal
set style line 2 lt 1 lc rgb '#D95F02' # dark orange
set style line 3 lt 1 lc rgb '#7570B3' # dark lilac
set style line 4 lt 1 lc rgb '#E7298A' # dark magenta
set style line 5 lt 1 lc rgb '#66A61E' # dark lime green
set style line 6 lt 1 lc rgb '#E6AB02' # dark banana
set style line 7 lt 1 lc rgb '#A6761D' # dark tan
set style line 8 lt 1 lc rgb '#666666' # dark gray


set style line 101 lc rgb '#808080' lt 1 lw 1
set border 3 front ls 101
set tics nomirror out scale 0.75

set key left top

set output 'allocated_memory.png'

set xlabel "Runtime [s]"
set ylabel "Allocated memory [MB]"

set title "Inserting 10 Million uint64\\\_t -> uint64\\\_t pairs"

# allocated_memory_segmented_vector.txt allocated_memory_std_unordered_map.txt allocated_memory_std_vector.txt
plot \
'allocated_memory_segmented_vector.txt' using ($1):($2/1e6) w steps ls 1 lw 2 title "ankerl::unordered\\\_dense::segmented\\\_map" , \
'allocated_memory_std_vector.txt' using ($1):($2/1e6) w steps ls 2 lw 2 title "ankerl::unordered\\\_dense::map" , \
'allocated_memory_absl_flat_hash_map.txt' using ($1):($2/1e6) w steps ls 3 lw 2 title "absl::flat\\\_hash\\\_map"
Binary file added doc/allocated_memory.png