Skip to content

elliota43/SwissTables

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Swiss Table

An educational Rust implementation of the Swiss Table hash map algorithm, built to understand how modern hash tables work under the hood.

I started working on this after reading the Go team's blog post on bringing Swiss Tables to Go -- I wanted to implement the algorithm myself to better understand its implementation. Ironically, I discovered halfway through that Rust's own HashMap has been backed by Swiss Tables (via the hashbrown crate) since 2019.

That said, digging through the hashbrown or Abseil source code isn't exactly a gentle introduction. This implementation strips the algorithm down to its core ideas with no unsafe code, no SIMD, and no crazy generics -- just the data structure in a digestible form.

The Core Idea

Traditional open-addressed hash tables probe one slot at a time on collision, which gets slow as the table fills up. Swiss Tables group slots into chunks of 8 and attach a control word to each group -- 8 bytes of metadata, one per slot, each storing a 7-bit hash fingerprint (h2).

On lookup, instead of comparing keys one by one, you compare your h2 against all 8 control bytes in parallel (using a SIMD instruction). This tells you exactly which slots are worth checking -- typically only one or two -- so you skip almost all full key comparisons.

hash(key) -> 64-bit value
          ├── upper 57 bits (h1) -> selects which group to search.
          └── lower 7 bits (h2) -> fingerprint stored in control word.

The Code

SwissTable<K, V> struct:

  • insert(key, value) -- adds a new entry or updates an existing one, returning the old value if the key was already present.
  • get(key)/get_mut(key) -- returns a shared / mutable reference to the value for a given key
  • remove(key) -- deletes an entry and returns its value (uses tombstones internally to avoid breaking probe chains)
  • contains_key(key) -- checks whether a key exists without returning a value
  • Automatically grows/resizes the table when the load factor exceeds 87.5%

The code is structured to make each piece of the algorithm easy to follow: h1/h2 hash splitting, per-group control words with bitmask matching, group-level linear probing, the 0x80 high-bit trick for distinguishing slot states in a single bitwise check, tombstone-based deletion to preserve probe chains, and early termination on empty slots during lookup.

I've tried to keep the code as simple as possible, though I'm sure there are some optimizations that could be made.

What's Missing

This is deliberately minimal -- enough to understand the algorithm, not enough to ship in production. A real implementation like hashbrown includes:

  • SIMD -- match_h2 is a plain loop; production code uses _mm_cmpeq_epi8 (x86) or NEON (ARM) equivalents to compare all control bytes in a single CPU instruction.
  • 16-wide groups -- I used groups of 8 for simplicity, but with 128-bit SIMD registers, you can double that and probe 16 slots in parallel.
  • Iterator support -- this implementation does not support / have a way to iterate over the table's entries.
  • Custom hashers -- This implementation hardcodes RandomState: hashbrown is generic over BuildHasher so you can plug in faster hashers like ahash
  • Allocation optimizations -- no support for pre-allocation capacity. no small-table optimizations.
  • Incremental growth -- everything is rehashed at once on resize; Go, for example, splits maps into many small tables (max 1024 entries) so that growth only copies one table at a time, bounding worst-case insertion latency.

Usage

use swiss_table::SwissTable;

let mut map = SwissTable::new();

map.insert("language", "rust");
map.insert("algorithm", "swiss table");

assert_eq!(map.get(&"language"), Some(&"rust"));
assert_eq!(map.len(), 2);

map.remove(&"language");
assert_eq!(map.get(&"language"), None);

Running tests

cargo test

Tests cover insertion, lookup, updates, deletion, probe chain integrity after removal, and large-scale insert/remove cycles.

Resources

About

An educational Rust implementation of the Swiss Table hash map algorithm, built to understand how modern hash tables work under the hood.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages