Skip to content

sangeet01/gradient

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Gradient Hashing

Breaking the 28-Year-Old 'Impossible Triangle' of Distributed Load Balancing.

License

TL;DR: Traditional hashing (Maglev, Ring) forces you to choose between speed and locality. Gradient Hashing uses physics-inspired potential fields to achieve $O(1)$ speed, minimal churn, and 90% better spatial locality—simultaneously.

Resilience Demo

The Breakthrough: Solving the Impossible Triangle

For decades, distributed systems have been bounded by a three-way trade-off:

  1. Speed ($O(1)$ Lookup): Instant routing (Google Maglev).
  2. Consistency (Minimal Churn): Low data movement during Scaling (Ring Hashing).
  3. Spatial Locality: Proximity-aware routing (Geo-Hashing).

Gradient Hashing is the first algorithm to achieve all three. It replaces random permutations with a potential field equation modeled after mycelial nutrient transport.


Technical Core: The Gradient Equation

Instead of static math, we use a dynamic flow equation:

$$\Phi(x, y, s_i) = \text{Gravity} \cdot \text{Pressure} \cdot \text{Trust}$$

  • Gravity ($1/d^2$): Pulls traffic to the nearest node (Locality).
  • Pressure ($1/Capacity$): Pushes traffic away from overloaded nodes (Load Balance).
  • Trust ($[0, 1]$): Multiplicative filter that instantly isolates Byzantine nodes.
  • Sticky Hysteresis ($\alpha$): Creates "potential wells" that pin keys to existing servers, preventing rehash storms during cluster scaling (Consistency).

This results in a Liquid System: Traffic flows to the optimal server but naturally "spills over" to physical neighbors during spikes.


Hybrid Dual-Mode Architecture

A single lookup table, two optimized behaviors:

1. Spatial Mode (Geo-Aware)

  • Use Case: CDNs, IoT, Multi-Region DBs.
  • Performance: 90.3% distance reduction vs Maglev.
  • Locality: Related keys stay within 2 neighboring servers (vs 10 for Ring Hash).

2. Hash Mode (High Throughput)

  • Use Case: Database sharding, HTTP Caches.
  • Performance: 1.1 Million Req/s in Python (2.7x faster than Maglev).
  • Consistency: 4.4% churn (Matches theoretical ideal of $1/N$).

Benchmarks (1,000 Server Cluster)

Metric Ring Hashing Google Maglev Gradient Hashing
Lookup Speed $O(\log N)$ $O(1)$ $O(1)$
Throughput 0.37M req/s 0.43M req/s 1.10M req/s
Avg. Distance 0.425 0.425 0.041
Byzantine Resilience 94.8% (Fail) 94.9% (Fail) 100.0% (Immune)
Locality Factor Low Low Optimal (Voronoi)

Usage Guide

Explore the protocol using the bundled simulation suite:

Run the Peak Performance Test

python gradient.py

Jupyter Notebook

jupyter notebook gradient.ipynb

Origin Story: The Intelligence of Fungi

We drew inspiration from the Tokyo Subway Experiment. A slime mold recreated the entire Tokyo rail network efficiently just by seeking food. We asked: "Why use random math for the internet when biology has solved routing for millions of years?"

By modeling server clusters as mycelial networks, we unlocked a geometry-first approach to data.

License

Apache License 2.0 - see LICENSE file for details.

PS: Sangeet's the name, a daft undergrad splashing through chemistry and code like a toddler—my titrations are a mess, and I've used my mouth to pipette.

About

Hashing dynamically using principles from biology

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors