A fast, concurrent CLI tool to find specific IDs across all columns in your PostgreSQL database.
Ever needed to find where a UUID appears in your database? This tool scans every column of every table, checking if your target IDs exist anywhere.
- 🔍 Auto-detects ID types - UUID, BigInt, or Text
- ⚡ Highly concurrent - Configurable parallelism with semaphore-based rate limiting
- 🎯 Smart filtering - Restrict by schema, column name, or column pattern
- 📊 Progress tracking - Live progress bar for large databases
- 🛡️ Error resilient - Continues scanning even if individual columns fail
- 🔌 Supabase ready - Works with connection poolers out of the box
# Clone the repository
git clone https://github.com/yourusername/needle_haystack_rust.git
cd needle_haystack_rust- Create a
.envfile with your database connection:
DATABASE_URL=postgres://user:password@host:port/databaseSupabase users: Use the connection pooler URL (port
6543), not direct connection (5432)
- Create a
needles.txtfile with IDs to search (one per line):
# Comments are supported
id-example-1
id-example-2
id-example-3
# Basic usage (searches all schemas)
cargo run --release
# Specify a different IDs file
cargo run --release -- --ids my_ids.txt
# Restrict to public schema only
cargo run --release -- --schema public
# Only check columns named "id"
cargo run --release -- --column-exact id
# Only check columns containing "_id" in their name
cargo run --release -- --column-contains _id
# Stop after first match (faster)
cargo run --release -- --first-only
# Adjust concurrency for Supabase (conservative)
cargo run --release -- --concurrency 8 --pool-size 5
# Adjust concurrency for on-premise (aggressive)
cargo run --release -- --concurrency 32 --pool-size 20| Flag | Default | Description |
|---|---|---|
--ids |
needles.txt |
Path to file containing IDs |
--concurrency |
16 |
Max concurrent column checks |
--pool-size |
10 |
Max database connections |
--schema |
all | Restrict to specific schema |
--column-exact |
- | Only check columns with this exact name |
--column-contains |
- | Only check columns containing this substring |
--first-only |
false |
Stop after finding the first match |
--timeout-secs |
30 |
Query timeout per column |
--pool-timeout-secs |
60 |
Connection pool acquire timeout |
Connecting with pool_size=10, concurrency=16, timeouts=60s/30s
Loaded 3 IDs of type 'uuid'
Candidate columns to check: 847
⠹ [00:00:12] [################>-----------------------] 342/847 (40%) checking public.users.id
✓ FOUND: public.profiles.user_id
✓ FOUND: public.orders.customer_id
⠹ [00:00:28] [########################################] 847/847 (100%) done
=== Summary ===
Columns checked: 847/847
Matches found: 2
→ public.profiles.user_id
→ public.orders.customer_id
- Parses IDs and auto-detects their type (UUID, integer, or text)
- Queries
pg_catalogto find all columns matching the ID type - For each candidate column, runs:
SELECT EXISTS(SELECT 1 FROM table WHERE column = ANY($1)) - Uses connection pooling and semaphores to control concurrency
- Reports all locations where the IDs were found
- Supabase: Start with
--concurrency 8 --pool-size 5and increase gradually - On-premise: Can go up to
--concurrency 64 --pool-size 30depending on yourmax_connections - Large tables: Increase
--timeout-secsif you have tables with millions of rows - Quick check: Use
--first-onlyif you just need to confirm the ID exists somewhere