|
1 | | -# VUKE - PROJECT KNOWLEDGE BASE |
| 1 | +# VUKE |
2 | 2 |
|
3 | | -**Generated:** 2025-12-31 |
4 | | -**Commit:** c1f2d45 |
5 | | -**Branch:** main |
| 3 | +Bitcoin key vulnerability research tool. Reproduces historical weak key generation (brainwallets, weak PRNGs, derivation bugs) and analyzes keys for vulnerable origins. Rust, optional WebGPU acceleration. |
6 | 4 |
|
7 | | -## OVERVIEW |
| 5 | +## Quick Commands |
8 | 6 |
|
9 | | -Bitcoin key vulnerability research tool - reproduces historical weak key generation (brainwallets, weak PRNGs, derivation bugs) and analyzes keys for vulnerable origins. Rust + optional WebGPU acceleration. |
| 7 | +```bash |
| 8 | +# Build |
| 9 | +cargo build --release |
| 10 | +cargo build --release --features gpu # WebGPU acceleration |
| 11 | +cargo build --release --features storage # Parquet output |
| 12 | +cargo build --release --features storage-query # + DuckDB queries |
| 13 | + |
| 14 | +# Test |
| 15 | +cargo test # All tests |
| 16 | +cargo test path::to::tests::test_name -- --exact # Single test; --exact needs the full test path
| 17 | +cargo test transform::mt64::tests # All tests in a module |
| 18 | +cargo test --no-fail-fast # Don't stop on first failure |
| 19 | + |
| 20 | +# Bench |
| 21 | +cargo bench # Criterion benchmarks (codspeed-criterion-compat) |
| 22 | + |
| 23 | +# Release |
| 24 | +just release 0.8.0 # Bump version, changelog, tag, commit |
| 25 | +``` |
| 26 | + |
| 27 | +No rustfmt.toml or clippy.toml — default Rust formatting and lints apply. |
10 | 28 |
|
11 | | -## STRUCTURE |
| 29 | +## Codebase Map |
12 | 30 |
|
13 | 31 | ``` |
14 | | -vuke/ |
15 | | -├── src/ |
16 | | -│ ├── analyze/ # Reverse-engineer key origins (brute-force, heuristics) |
17 | | -│ ├── transform/ # Forward key generation (SHA256, PRNGs, derivation) |
18 | | -│ ├── gpu/ # WebGPU acceleration (WGSL shaders, pipelines) |
19 | | -│ ├── source/ # Input providers (range, wordlist, timestamps, stdin) |
20 | | -│ ├── output/ # Result formatting (console, files) |
21 | | -│ ├── storage/ # Parquet/Arrow TB-scale persistence |
22 | | -│ ├── main.rs # CLI: generate, scan, single, bench, analyze commands |
23 | | -│ ├── lib.rs # Library exports |
24 | | -│ ├── derive.rs # Key → addresses/WIF derivation |
25 | | -│ ├── matcher.rs # Target address matching |
26 | | -│ └── {prng}.rs # Shared PRNG logic (lcg, xorshift, mt64, sha256_chain) |
27 | | -├── benches/ # Criterion benchmarks (codspeed-criterion-compat) |
28 | | -└── src/gpu/shaders/ # WGSL compute shaders |
| 32 | +src/ |
| 33 | +├── main.rs # CLI entry (clap derive): generate, scan, single, bench, analyze |
| 34 | +├── lib.rs # Public API exports |
| 35 | +├── derive.rs # Key → addresses/WIF derivation (secp256k1 + bitcoin crate) |
| 36 | +├── matcher.rs # Target address matching |
| 37 | +├── network.rs # Bitcoin network handling |
| 38 | +├── benchmark.rs # Performance utilities |
| 39 | +├── provider.rs # Puzzle data provider (feature: boha) |
| 40 | +│ |
| 41 | +├── transform/ # Forward: Input → Key generation (see src/transform/AGENTS.md) |
| 42 | +├── analyze/ # Reverse: Key → origin detection (see src/analyze/AGENTS.md) |
| 43 | +├── source/ # Input providers: range, wordlist, timestamps, stdin, files (see src/source/AGENTS.md) |
| 44 | +├── output/ # Result formatting: console CSV, multi-output, storage bridge |
| 45 | +├── gpu/ # WebGPU acceleration (see src/gpu/AGENTS.md) |
| 46 | +├── storage/ # Parquet/Arrow persistence (see src/storage/AGENTS.md) |
| 47 | +│ |
| 48 | +├── {prng}.rs # Shared PRNG logic used by BOTH transform/ and analyze/ |
| 49 | +│ # (mt64.rs, xorshift.rs, lcg.rs, sha256_chain.rs, electrum.rs, multibit.rs) |
| 50 | +└── bitimage.rs # Bitimage puzzle key integration |
29 | 51 | ``` |
30 | 52 |
|
31 | | -## WHERE TO LOOK |
| 53 | +### Data Flow |
| 54 | + |
| 55 | +- **Generate/Scan**: Source → Transform → KeyDeriver → Matcher → Output |
| 56 | +- **Analyze**: Key → Analyzer(s) → AnalysisResult |
| 57 | + |
| 58 | +### Core Traits (all `Send + Sync`) |
| 59 | + |
| 60 | +| Trait | Location | Signature | |
| 61 | +|-------|----------|-----------| |
| 62 | +| `Transform` | `src/transform/mod.rs` | `apply_batch(&[Input], &mut Vec<(String, Key)>)` | |
| 63 | +| `Analyzer` | `src/analyze/mod.rs` | `analyze(&Key, &AnalysisConfig, Option<&ProgressBar>) → AnalysisResult` | |
| 64 | +| `Source` | `src/source/mod.rs` | `process(transforms, deriver, matcher, output) → Result<ProcessStats>` | |
| 65 | +| `Output` | `src/output/mod.rs` | `key(source, transform, derived)`, `hit(source, transform, derived, match_info)` | |
| 66 | +| `StorageBackend` | `src/storage/mod.rs` | `write_batch(&[ResultRecord])`, `flush() → Vec<PathBuf>` | |
| 67 | + |
| 68 | +### Key Types |
| 69 | + |
| 70 | +- `Key` = `[u8; 32]` private key |
| 71 | +- `Input` = multi-representation struct (`string_val`, `u64_val`, `bytes_be`, `bytes_le`) |
| 72 | +- `DerivedKey` = full derivation (WIF, P2PKH compressed/uncompressed, P2WPKH) |
32 | 73 |
|
33 | | -| Task | Location | Notes | |
34 | | -|------|----------|-------| |
35 | | -| Add new vulnerability | `src/transform/{name}.rs` + `src/analyze/{name}.rs` | Implement both Transform and Analyzer | |
36 | | -| Add PRNG variant | `src/{prng}.rs` shared logic + transform + analyze | Keep config/logic in shared module | |
| 74 | +### Where to Add Things |
| 75 | + |
| 76 | +| Task | Where | Then | |
| 77 | +|------|-------|------| |
| 78 | +| New vulnerability | `src/transform/{name}.rs` + `src/analyze/{name}.rs` | Add to both `TransformType` and `AnalyzerType` enums | |
| 79 | +| New PRNG variant | `src/{prng}.rs` (shared) + transform + analyze | Keep config/logic in shared module | |
| 80 | +| New input source | `src/source/{name}.rs` | Implement `Source`, update `SourceType` and `main.rs` | |
| 81 | +| New output format | `src/output/{name}.rs` | Implement `Output` trait | |
37 | 82 | | GPU acceleration | `src/gpu/shaders/{algo}.wgsl` + `src/gpu/{algo}.rs` | Feature-gated behind `gpu` | |
38 | | -| New input source | `src/source/{name}.rs` | Implement Source trait | |
39 | | -| New output format | `src/output/{name}.rs` | Implement Output trait | |
| 83 | +| Storage backend | `src/storage/{name}.rs` | Implement `StorageBackend` trait | |
40 | 84 | | CLI changes | `src/main.rs` | clap derive macros | |
41 | | -| Key derivation | `src/derive.rs` | secp256k1 + bitcoin crate | |
42 | | -| Storage backend | `src/storage/{name}.rs` | Implement StorageBackend trait | |
43 | 85 |
|
44 | | -## CODE MAP |
| 86 | +## Code Conventions |
45 | 87 |
|
46 | | -**Core Traits** (all require `Send + Sync`): |
| 88 | +### Imports |
47 | 89 |
|
48 | | -| Trait | Location | Purpose | |
49 | | -|-------|----------|---------| |
50 | | -| `Transform` | `src/transform/mod.rs` | Input → Key generation | |
51 | | -| `Analyzer` | `src/analyze/mod.rs` | Key origin detection | |
52 | | -| `Source` | `src/source/mod.rs` | Input batch processing | |
53 | | -| `Output` | `src/output/mod.rs` | Result formatting | |
54 | | -| `StorageBackend` | `src/storage/mod.rs` | Persistent result storage | |
| 90 | +Ordered: external crates → std → blank line → `super::` → blank line → `crate::`. |
55 | 91 |
|
56 | | -**Data Flow**: |
57 | | -- **Generate/Scan**: Source → Transform → KeyDeriver → Matcher → Output |
58 | | -- **Analyze**: Key → Analyzer(s) → AnalysisResult |
| 92 | +```rust |
| 93 | +use anyhow::Result; |
| 94 | +use rayon::prelude::*; |
| 95 | +use std::path::PathBuf; |
| 96 | + |
| 97 | +use super::Source; |
| 98 | + |
| 99 | +use crate::derive::KeyDeriver; |
| 100 | +use crate::transform::{Input, Transform}; |
| 101 | +``` |
| 102 | + |
| 103 | +### Error Handling |
| 104 | + |
| 105 | +- **CLI/top-level**: `anyhow::Result<T>`, `anyhow::bail!()` for errors |
| 106 | +- **Domain modules**: Custom error enums (`ElectrumError`, `GpuError`, `ParseError`, `StorageError`) with `Display` + `Error` impls |
| 107 | +- **`.unwrap()`**: Present in ~500 call sites (especially GPU code) — pragmatic, not ideal. Prefer `?` in new code |
| 108 | +- **No `unsafe`**: Intentional. Maintain memory safety |
| 109 | +- **No `panic!()`**: Prefer `Result` types |
| 110 | + |
| 111 | +### Naming |
| 112 | + |
| 113 | +- Types/structs: `PascalCase` (`Mt64Analyzer`, `ConsoleOutput`) |
| 114 | +- Functions/methods: `snake_case` (`apply_batch`, `from_string`) |
| 115 | +- Constants: `SCREAMING_SNAKE_CASE` (`BATCH_SIZE`, `CURVE_ORDER`) |
| 116 | +- Files/modules: `snake_case` (`sha256_chain.rs`, `key_parser.rs`) |
| 117 | +- Struct suffixes by role: `{Name}Transform`, `{Name}Analyzer`, `{Name}Source` |
59 | 118 |
|
60 | | -**Key Types**: |
61 | | -- `Key`: `[u8; 32]` private key |
62 | | -- `Input`: Multi-representation (string, u64, bytes_be/le) |
63 | | -- `DerivedKey`: Full derivation (WIF, P2PKH, P2WPKH) |
64 | | - |
65 | | -## CONVENTIONS |
66 | | - |
67 | | -- **PRNG shared logic**: Common implementations in `src/{prng}.rs`, used by both transform and analyze |
68 | | -- **GPU optional**: Feature-gated, graceful CPU fallback via `supports_gpu()` + `apply_batch_gpu()` |
69 | | -- **Variant configs**: `{Prng}Variant` enums + `{Prng}Config` structs for parameterization |
70 | | -- **Cascade filtering**: Multi-target verification for 64-bit seed spaces (mt64, xorshift) |
71 | | -- **Masked analysis**: `(full_key & mask) | (1 << (bits-1))` for puzzle solving |
72 | | -- **Batch processing**: Always `&[Input]` → `&mut Vec<(String, Key)>` |
73 | | -- **Progress bars**: Use `indicatif::ProgressBar` for long operations |
74 | | -- **Early termination**: Use `AtomicBool` for found flag across threads |
75 | | - |
76 | | -## ANTI-PATTERNS (THIS PROJECT) |
77 | | - |
78 | | -- **Excessive `.unwrap()`**: 125+ instances, especially in GPU code - should use `?` operator |
79 | | -- **No unsafe blocks**: Intentional, maintain memory safety |
80 | | -- **No panic!()**: Prefer Result types |
81 | | -- **No type suppression**: Never `as any`, `@ts-ignore` equivalent |
82 | | - |
83 | | -## TRANSFORMS |
84 | | - |
85 | | -| Name | Seed Size | Description | |
86 | | -|------|-----------|-------------| |
87 | | -| `sha256` | - | Classic brainwallet | |
88 | | -| `double_sha256` | - | Bitcoin-style hash | |
89 | | -| `md5` | - | Legacy weak hash | |
90 | | -| `milksad` | 32-bit | MT19937 (CVE-2023-39910) | |
91 | | -| `mt64` | 64-bit | MT19937-64 | |
92 | | -| `lcg:{variant}:{endian}` | 31-32 bit | glibc/minstd/msvc/borland | |
93 | | -| `xorshift:{variant}` | 64-bit | 64/128/128plus/xoroshiro | |
94 | | -| `sha256_chain:{variant}` | 32-bit | iterated/indexed/counter | |
95 | | -| `multibit` | - | MultiBit HD seed-as-entropy bug | |
96 | | -| `electrum` | - | Pre-BIP39 derivation | |
97 | | -| `armory` | - | Pre-BIP32 HD | |
98 | | - |
99 | | -## ANALYZERS |
100 | | - |
101 | | -| Name | Method | GPU | Notes | |
102 | | -|------|--------|-----|-------| |
103 | | -| `milksad` | 2^32 brute-force | Yes | Supports mask/cascade | |
104 | | -| `mt64` | 2^64 w/ cascade | No | Requires cascade filter | |
105 | | -| `lcg` | 2^31-32 brute-force | No | Multi-variant | |
106 | | -| `xorshift` | 2^64 w/ cascade | No | Multi-variant | |
107 | | -| `sha256_chain` | 2^32 + depth | Yes | Iterated/indexed | |
108 | | -| `multibit-hd` | Mnemonic test | No | Dictionary attack support | |
109 | | -| `direct` | Pattern detect | No | ASCII, small seeds | |
110 | | -| `heuristic` | Statistical | No | Entropy, hamming | |
111 | | - |
112 | | -## COMMANDS |
| 119 | +### Patterns |
| 120 | + |
| 121 | +- **Batch processing**: All transforms/sources process `&[Input]` batches via Rayon `par_chunks()` |
| 122 | +- **Factory enums**: `TransformType::create()`, `AnalyzerType::create()` — parse from CLI strings via `FromStr` |
| 123 | +- **Builder pattern**: Used for configuration (`ElectrumTransform::new().with_change()`, `ParquetBackend::new().with_compression()`) |
| 124 | +- **Progress bars**: `indicatif::ProgressBar` for long operations |
| 125 | +- **Early termination**: `AtomicBool` shared across Rayon threads |
| 126 | +- **Cascade filtering**: Required for 64-bit seed spaces (mt64, xorshift) — multi-target verification to avoid false positives |
| 127 | +- **Masked analysis**: `(full_key & ((1<<N)-1)) | (1<<(N-1))` for puzzle solving |
| 128 | +- **GPU feature gating**: `#[cfg(feature = "gpu")]` on trait methods with CPU fallback as default impl |
| 129 | + |
| 130 | +### Tests |
| 131 | + |
| 132 | +Inline `#[cfg(test)] mod tests` at the end of each file, using standard `assert!`/`assert_eq!` and the `tempfile` crate for file I/O tests. To run a single test by its full path:
113 | 133 |
|
114 | 134 | ```bash |
115 | | -# Dev |
116 | | -cargo test # Run tests |
117 | | -cargo build --release # Build optimized |
118 | | -cargo build --release --features gpu # With GPU |
119 | | -cargo build --release --features storage # With Parquet |
120 | | - |
121 | | -# Benchmarks |
122 | | -cargo bench # Run benchmarks |
123 | | - |
124 | | -# Release (via justfile) |
125 | | -just release 0.8.0 # Bump version, changelog, tag |
126 | | - |
127 | | -# CI |
128 | | -# - crates.yml: Publish to crates.io on tags |
129 | | -# - aur.yml: Publish to AUR on tags |
130 | | -# - codspeed.yml: Benchmark on push/PR |
| 135 | +cargo test transform::mt64::tests::test_transform_generates_key -- --exact |
131 | 136 | ``` |
132 | 137 |
|
133 | | -## NOTES |
| 138 | +## Feature Flags |
| 139 | + |
| 140 | +| Flag | What it gates | |
| 141 | +|------|---------------| |
| 142 | +| `gpu` | WebGPU acceleration (wgpu, pollster, bytemuck) | |
| 143 | +| `storage` | Parquet/Arrow output | |
| 144 | +| `storage-query` | DuckDB SQL queries on Parquet results (implies `storage`) | |
| 145 | +| `storage-cloud` | S3/R2/MinIO upload (implies `storage`) | |
| 146 | +| `storage-iceberg` | Iceberg catalog (implies `storage-cloud`) | |
| 147 | +| `boha` | Bitcoin puzzle data provider | |
| 148 | + |
| 149 | +No features enabled by default. |
| 150 | + |
| 151 | +## CI |
| 152 | + |
| 153 | +- **crates.yml**: Publish to crates.io on `v*` tags |
| 154 | +- **aur.yml**: Publish to AUR on `v*` tags |
| 155 | +- **codspeed.yml**: Criterion benchmarks on push to `main` and PRs |
| 156 | + |
| 157 | +No CI-enforced clippy or rustfmt checks. |
| 158 | + |
| 159 | +## Commit Style |
| 160 | + |
| 161 | +Semantic commits: `type(scope): description` |
| 162 | + |
| 163 | +``` |
| 164 | +fix(output): escape compact CSV fields in console output |
| 165 | +test(output): harden CSV edge-case coverage |
| 166 | +fix(source): validate descending ranges and guard timestamp counters |
| 167 | +``` |
| 168 | + |
| 169 | +## Execution Workflow |
| 170 | + |
| 171 | +1. **Explore** — read relevant code before changing it. Understand the trait, the factory enum, existing implementations |
| 172 | +2. **Plan** — identify which files need changes. New vulnerability = transform + analyze + shared PRNG + both factory enums + CLI |
| 173 | +3. **Edit** — make focused changes. Don't refactor surrounding code |
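| 174 | +4. **Verify** — run `cargo test` after changes. When touching feature-gated code, build with the relevant flag (for example `cargo build --release --features gpu`), since gated code is not compiled without its feature. Run specific module tests for targeted verification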
| 175 | + |
| 176 | +## Safety |
| 177 | + |
| 178 | +- Don't commit unless explicitly asked |
| 179 | +- Don't push unless explicitly asked |
| 180 | +- No secrets in outputs — this tool handles private keys, treat test vectors carefully |
| 181 | +- Avoid destructive git operations (force push, reset --hard) unless explicitly requested |
| 182 | +- Release profile uses `panic = "abort"` — unrecoverable panics kill the process, so prefer `Result` types |
| 183 | + |
| 184 | +## Complexity Hotspots |
| 185 | + |
| 186 | +| File | Lines | Why | |
| 187 | +|------|-------|-----| |
| 188 | +| `src/analyze/sha256_chain.rs` | ~843 | Multiple chain variants, GPU support, cascade filtering | |
| 189 | +| `src/gpu/sha256_chain.rs` | ~662 | Hybrid CPU-GPU pipeline, cascade | |
| 190 | +| `src/analyze/milksad.rs` | ~581 | Full 2^32 brute-force with GPU path | |
| 191 | +| `src/analyze/xorshift.rs` | ~511 | Multiple PRNG variants with cascade | |
| 192 | +| `src/gpu/hash.rs` | ~538 | Multi-algorithm GPU hashing | |
134 | 193 |
|
135 | | -- **GPU feature**: Compile with `--features gpu` for WebGPU acceleration |
136 | | -- **Storage feature**: Compile with `--features storage` for Parquet output |
137 | | -- **Release profile**: Aggressive optimization (LTO, single codegen unit, stripped) |
138 | | -- **Large files**: `src/analyze/sha256_chain.rs` (843L), `src/gpu/sha256_chain.rs` (662L) - complexity hotspots with refactoring potential |
139 | | -- **Rust 2021 edition**, requires Rust 1.70+ |
140 | | -- **TODO**: GPU for generate/scan needs Source trait redesign (main.rs:322) |
141 | | -- **Refactoring opportunity**: Extract common brute-force framework, masking utilities, cascade formatting across analyzers |
| 194 | +Refactoring potential: common brute-force framework, shared masking utilities, generic cascade formatting across analyzers. |
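
The masked-analysis formula listed under Patterns, `(full_key & ((1<<N)-1)) | (1<<(N-1))`, may be easier to follow as code. The sketch below is illustrative only: `mask_key` is a hypothetical helper, not a function from the repository, and it assumes the 32-byte `Key` is interpreted as a big-endian integer.

```rust
/// Hypothetical helper (not part of the codebase): keep the low `bits` bits
/// of a 256-bit key and force bit `bits - 1` to 1, i.e.
/// `(full_key & ((1 << N) - 1)) | (1 << (N - 1))`.
/// Assumption: the key is a big-endian integer, so bit 0 lives in byte 31.
fn mask_key(full_key: [u8; 32], bits: u32) -> [u8; 32] {
    assert!((1..=256).contains(&bits), "bits must be in 1..=256");
    let mut out = [0u8; 32];

    // Copy only the low `bits` bits; everything above stays zero.
    for i in 0..bits {
        let byte = 31 - (i / 8) as usize;
        let bit = 1u8 << (i % 8);
        out[byte] |= full_key[byte] & bit;
    }

    // Force the top bit of the N-bit range so the result has exactly N bits.
    let top = bits - 1;
    out[31 - (top / 8) as usize] |= 1u8 << (top % 8);
    out
}

fn main() {
    let mut key = [0xFFu8; 32];
    key[31] = 0b0000_0101;

    // Low 5 bits of the key are 0b00101; forcing bit 4 gives 0b10101.
    let masked = mask_key(key, 5);
    assert_eq!(masked[31], 0b0001_0101);
    assert!(masked[..31].iter().all(|&b| b == 0));
    println!("{:02x}", masked[31]);
}
```

Forcing bit `N-1` means every candidate falls in the range `[2^(N-1), 2^N)`, which is the property the formula encodes for N-bit puzzle keys.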