Commit a516f2f

chore: add AGENTS.md files
1 parent 1cf1803 commit a516f2f

9 files changed

Lines changed: 229 additions & 120 deletions

File tree

AGENTS.md

Lines changed: 173 additions & 120 deletions
Original file line number | Diff line number | Diff line change
@@ -1,141 +1,194 @@
1-
# VUKE - PROJECT KNOWLEDGE BASE
1+
# VUKE
2 2

3-
**Generated:** 2025-12-31
4-
**Commit:** c1f2d45
5-
**Branch:** main
3+
Bitcoin key vulnerability research tool. Reproduces historical weak key generation (brainwallets, weak PRNGs, derivation bugs) and analyzes keys for vulnerable origins. Rust, optional WebGPU acceleration.
6 4

7-
## OVERVIEW
5+
## Quick Commands
8 6

9-
Bitcoin key vulnerability research tool - reproduces historical weak key generation (brainwallets, weak PRNGs, derivation bugs) and analyzes keys for vulnerable origins. Rust + optional WebGPU acceleration.
7+
```bash
8+
# Build
9+
cargo build --release
10+
cargo build --release --features gpu # WebGPU acceleration
11+
cargo build --release --features storage # Parquet output
12+
cargo build --release --features storage-query # + DuckDB queries
13+
14+
# Test
15+
cargo test # All tests
16+
cargo test test_name -- --exact # Single test by name
17+
cargo test transform::mt64::tests # All tests in a module
18+
cargo test --no-fail-fast # Don't stop on first failure
19+
20+
# Bench
21+
cargo bench # Criterion benchmarks (codspeed-criterion-compat)
22+
23+
# Release
24+
just release 0.8.0 # Bump version, changelog, tag, commit
25+
```
26+
27+
No rustfmt.toml or clippy.toml — default Rust formatting and lints apply.
10 28

11-
## STRUCTURE
29+
## Codebase Map
12 30

13 31
```
14-
vuke/
15-
├── src/
16-
│ ├── analyze/ # Reverse-engineer key origins (brute-force, heuristics)
17-
│ ├── transform/ # Forward key generation (SHA256, PRNGs, derivation)
18-
│ ├── gpu/ # WebGPU acceleration (WGSL shaders, pipelines)
19-
│ ├── source/ # Input providers (range, wordlist, timestamps, stdin)
20-
│ ├── output/ # Result formatting (console, files)
21-
│ ├── storage/ # Parquet/Arrow TB-scale persistence
22-
│ ├── main.rs # CLI: generate, scan, single, bench, analyze commands
23-
│ ├── lib.rs # Library exports
24-
│ ├── derive.rs # Key → addresses/WIF derivation
25-
│ ├── matcher.rs # Target address matching
26-
│ └── {prng}.rs # Shared PRNG logic (lcg, xorshift, mt64, sha256_chain)
27-
├── benches/ # Criterion benchmarks (codspeed-criterion-compat)
28-
└── src/gpu/shaders/ # WGSL compute shaders
32+
src/
33+
├── main.rs # CLI entry (clap derive): generate, scan, single, bench, analyze
34+
├── lib.rs # Public API exports
35+
├── derive.rs # Key → addresses/WIF derivation (secp256k1 + bitcoin crate)
36+
├── matcher.rs # Target address matching
37+
├── network.rs # Bitcoin network handling
38+
├── benchmark.rs # Performance utilities
39+
├── provider.rs # Puzzle data provider (feature: boha)
40+
41+
├── transform/ # Forward: Input → Key generation (see src/transform/AGENTS.md)
42+
├── analyze/ # Reverse: Key → origin detection (see src/analyze/AGENTS.md)
43+
├── source/ # Input providers: range, wordlist, timestamps, stdin, files (see src/source/AGENTS.md)
44+
├── output/ # Result formatting: console CSV, multi-output, storage bridge
45+
├── gpu/ # WebGPU acceleration (see src/gpu/AGENTS.md)
46+
├── storage/ # Parquet/Arrow persistence (see src/storage/AGENTS.md)
47+
48+
├── {prng}.rs # Shared PRNG logic used by BOTH transform/ and analyze/
49+
│ # (mt64.rs, xorshift.rs, lcg.rs, sha256_chain.rs, electrum.rs, multibit.rs)
50+
└── bitimage.rs # Bitimage puzzle key integration
29 51
```
30 52

31-
## WHERE TO LOOK
53+
### Data Flow
54+
55+
- **Generate/Scan**: Source → Transform → KeyDeriver → Matcher → Output
56+
- **Analyze**: Key → Analyzer(s) → AnalysisResult
57+
58+
### Core Traits (all `Send + Sync`)
59+
60+
| Trait | Location | Signature |
61+
|-------|----------|-----------|
62+
| `Transform` | `src/transform/mod.rs` | `apply_batch(&[Input], &mut Vec<(String, Key)>)` |
63+
| `Analyzer` | `src/analyze/mod.rs` | `analyze(&Key, &AnalysisConfig, Option<&ProgressBar>) → AnalysisResult` |
64+
| `Source` | `src/source/mod.rs` | `process(transforms, deriver, matcher, output) → Result<ProcessStats>` |
65+
| `Output` | `src/output/mod.rs` | `key(source, transform, derived)`, `hit(source, transform, derived, match_info)` |
66+
| `StorageBackend` | `src/storage/mod.rs` | `write_batch(&[ResultRecord])`, `flush() → Vec<PathBuf>` |
67+
68+
### Key Types
69+
70+
- `Key` = `[u8; 32]` private key
71+
- `Input` = multi-representation struct (`string_val`, `u64_val`, `bytes_be`, `bytes_le`)
72+
- `DerivedKey` = full derivation (WIF, P2PKH compressed/uncompressed, P2WPKH)
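
The `Transform` shape and the `Key`/`Input` types above can be sketched as follows. This is a minimal illustration and not the project's actual code: the field types on `Input` are assumed, only two of its fields are modelled, and `DirectU64Transform` is a hypothetical transform invented for the example.

```rust
/// 32-byte private key, as described under "Key Types".
pub type Key = [u8; 32];

/// Simplified stand-in for the project's `Input`; field types are assumed.
#[derive(Debug, Clone)]
pub struct Input {
    pub string_val: String,
    pub u64_val: Option<u64>,
}

/// Mirrors the `apply_batch` shape listed in the Core Traits table.
pub trait Transform: Send + Sync {
    fn apply_batch(&self, inputs: &[Input], out: &mut Vec<(String, Key)>);
}

/// Hypothetical transform: writes the numeric input into the
/// low 8 bytes of the key in big-endian order.
pub struct DirectU64Transform;

impl Transform for DirectU64Transform {
    fn apply_batch(&self, inputs: &[Input], out: &mut Vec<(String, Key)>) {
        for input in inputs {
            // Inputs without a numeric representation are skipped.
            let Some(value) = input.u64_val else { continue };
            let mut key: Key = [0u8; 32];
            key[24..].copy_from_slice(&value.to_be_bytes());
            out.push((input.string_val.clone(), key));
        }
    }
}
```

Writing into a caller-provided `Vec` lets one allocation be reused across batches, which is presumably why the trait takes `&mut Vec<(String, Key)>` rather than returning a new vector.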
32 73

33-
| Task | Location | Notes |
34-
|------|----------|-------|
35-
| Add new vulnerability | `src/transform/{name}.rs` + `src/analyze/{name}.rs` | Implement both Transform and Analyzer |
36-
| Add PRNG variant | `src/{prng}.rs` shared logic + transform + analyze | Keep config/logic in shared module |
74+
### Where to Add Things
75+
76+
| Task | Where | Then |
77+
|------|-------|------|
78+
| New vulnerability | `src/transform/{name}.rs` + `src/analyze/{name}.rs` | Add to both `TransformType` and `AnalyzerType` enums |
79+
| New PRNG variant | `src/{prng}.rs` (shared) + transform + analyze | Keep config/logic in shared module |
80+
| New input source | `src/source/{name}.rs` | Implement `Source`, update `SourceType` and `main.rs` |
81+
| New output format | `src/output/{name}.rs` | Implement `Output` trait |
37 82
| GPU acceleration | `src/gpu/shaders/{algo}.wgsl` + `src/gpu/{algo}.rs` | Feature-gated behind `gpu` |
38-
| New input source | `src/source/{name}.rs` | Implement Source trait |
39-
| New output format | `src/output/{name}.rs` | Implement Output trait |
83+
| Storage backend | `src/storage/{name}.rs` | Implement `StorageBackend` trait |
40 84
| CLI changes | `src/main.rs` | clap derive macros |
41-
| Key derivation | `src/derive.rs` | secp256k1 + bitcoin crate |
42-
| Storage backend | `src/storage/{name}.rs` | Implement StorageBackend trait |
43 85

44-
## CODE MAP
86+
## Code Conventions
45 87

46-
**Core Traits** (all require `Send + Sync`):
88+
### Imports
47 89

48-
| Trait | Location | Purpose |
49-
|-------|----------|---------|
50-
| `Transform` | `src/transform/mod.rs` | Input → Key generation |
51-
| `Analyzer` | `src/analyze/mod.rs` | Key origin detection |
52-
| `Source` | `src/source/mod.rs` | Input batch processing |
53-
| `Output` | `src/output/mod.rs` | Result formatting |
54-
| `StorageBackend` | `src/storage/mod.rs` | Persistent result storage |
90+
Ordered: external crates → std → blank line → `super::` → blank line → `crate::`.
55 91

56-
**Data Flow**:
57-
- **Generate/Scan**: Source → Transform → KeyDeriver → Matcher → Output
58-
- **Analyze**: Key → Analyzer(s) → AnalysisResult
92+
```rust
93+
use anyhow::Result;
94+
use rayon::prelude::*;
95+
use std::path::PathBuf;
96+
97+
use super::Source;
98+
99+
use crate::derive::KeyDeriver;
100+
use crate::transform::{Input, Transform};
101+
```
102+
103+
### Error Handling
104+
105+
- **CLI/top-level**: `anyhow::Result<T>`, `anyhow::bail!()` for errors
106+
- **Domain modules**: Custom error enums (`ElectrumError`, `GpuError`, `ParseError`, `StorageError`) with `Display` + `Error` impls
107+
- **`.unwrap()`**: Present in ~500 call sites (especially GPU code) — pragmatic, not ideal. Prefer `?` in new code
108+
- **No `unsafe`**: Intentional. Maintain memory safety
109+
- **No `panic!()`**: Prefer `Result` types
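
As a rough sketch of the domain-error convention above, the following shows a custom error enum with `Display` + `Error` impls and a function that propagates failures with `?` instead of `.unwrap()`. The names `HexKeyError`, `parse_key_hex`, and `hex_val` are hypothetical and chosen only for illustration; they are not taken from the codebase.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical domain error; the variants are illustrative only.
#[derive(Debug)]
pub enum HexKeyError {
    InvalidLength { expected: usize, actual: usize },
    InvalidHex(char),
}

impl fmt::Display for HexKeyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            HexKeyError::InvalidLength { expected, actual } => {
                write!(f, "expected {expected} hex characters, got {actual}")
            }
            HexKeyError::InvalidHex(c) => write!(f, "invalid hex character '{c}'"),
        }
    }
}

impl Error for HexKeyError {}

/// Parses a 64-character hex string into 32 bytes without panicking.
pub fn parse_key_hex(s: &str) -> Result<[u8; 32], HexKeyError> {
    if s.len() != 64 {
        return Err(HexKeyError::InvalidLength { expected: 64, actual: s.len() });
    }
    let mut key = [0u8; 32];
    for (i, pair) in s.as_bytes().chunks(2).enumerate() {
        // `?` forwards the first invalid digit to the caller.
        key[i] = (hex_val(pair[0])? << 4) | hex_val(pair[1])?;
    }
    Ok(key)
}

/// Converts one ASCII hex digit to its numeric value.
fn hex_val(b: u8) -> Result<u8, HexKeyError> {
    (b as char)
        .to_digit(16)
        .map(|d| d as u8)
        .ok_or(HexKeyError::InvalidHex(b as char))
}
```

Because `HexKeyError` implements `std::error::Error`, a top-level function returning `anyhow::Result<T>` can call `parse_key_hex(...)?` directly.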
110+
111+
### Naming
112+
113+
- Types/structs: `PascalCase` (`Mt64Analyzer`, `ConsoleOutput`)
114+
- Functions/methods: `snake_case` (`apply_batch`, `from_string`)
115+
- Constants: `SCREAMING_SNAKE_CASE` (`BATCH_SIZE`, `CURVE_ORDER`)
116+
- Files/modules: `snake_case` (`sha256_chain.rs`, `key_parser.rs`)
117+
- Struct suffixes by role: `{Name}Transform`, `{Name}Analyzer`, `{Name}Source`
59 118

60-
**Key Types**:
61-
- `Key`: `[u8; 32]` private key
62-
- `Input`: Multi-representation (string, u64, bytes_be/le)
63-
- `DerivedKey`: Full derivation (WIF, P2PKH, P2WPKH)
64-
65-
## CONVENTIONS
66-
67-
- **PRNG shared logic**: Common implementations in `src/{prng}.rs`, used by both transform and analyze
68-
- **GPU optional**: Feature-gated, graceful CPU fallback via `supports_gpu()` + `apply_batch_gpu()`
69-
- **Variant configs**: `{Prng}Variant` enums + `{Prng}Config` structs for parameterization
70-
- **Cascade filtering**: Multi-target verification for 64-bit seed spaces (mt64, xorshift)
71-
- **Masked analysis**: `(full_key & mask) | (1 << (bits-1))` for puzzle solving
72-
- **Batch processing**: Always `&[Input]` → `&mut Vec<(String, Key)>`
73-
- **Progress bars**: Use `indicatif::ProgressBar` for long operations
74-
- **Early termination**: Use `AtomicBool` for found flag across threads
75-
76-
## ANTI-PATTERNS (THIS PROJECT)
77-
78-
- **Excessive `.unwrap()`**: 125+ instances, especially in GPU code - should use `?` operator
79-
- **No unsafe blocks**: Intentional, maintain memory safety
80-
- **No panic!()**: Prefer Result types
81-
- **No type suppression**: Never `as any`, `@ts-ignore` equivalent
82-
83-
## TRANSFORMS
84-
85-
| Name | Seed Size | Description |
86-
|------|-----------|-------------|
87-
| `sha256` | - | Classic brainwallet |
88-
| `double_sha256` | - | Bitcoin-style hash |
89-
| `md5` | - | Legacy weak hash |
90-
| `milksad` | 32-bit | MT19937 (CVE-2023-39910) |
91-
| `mt64` | 64-bit | MT19937-64 |
92-
| `lcg:{variant}:{endian}` | 31-32 bit | glibc/minstd/msvc/borland |
93-
| `xorshift:{variant}` | 64-bit | 64/128/128plus/xoroshiro |
94-
| `sha256_chain:{variant}` | 32-bit | iterated/indexed/counter |
95-
| `multibit` | - | MultiBit HD seed-as-entropy bug |
96-
| `electrum` | - | Pre-BIP39 derivation |
97-
| `armory` | - | Pre-BIP32 HD |
98-
99-
## ANALYZERS
100-
101-
| Name | Method | GPU | Notes |
102-
|------|--------|-----|-------|
103-
| `milksad` | 2^32 brute-force | Yes | Supports mask/cascade |
104-
| `mt64` | 2^64 w/ cascade | No | Requires cascade filter |
105-
| `lcg` | 2^31-32 brute-force | No | Multi-variant |
106-
| `xorshift` | 2^64 w/ cascade | No | Multi-variant |
107-
| `sha256_chain` | 2^32 + depth | Yes | Iterated/indexed |
108-
| `multibit-hd` | Mnemonic test | No | Dictionary attack support |
109-
| `direct` | Pattern detect | No | ASCII, small seeds |
110-
| `heuristic` | Statistical | No | Entropy, hamming |
111-
112-
## COMMANDS
119+
### Patterns
120+
121+
- **Batch processing**: All transforms/sources process `&[Input]` batches via Rayon `par_chunks()`
122+
- **Factory enums**: `TransformType::create()`, `AnalyzerType::create()` — parse from CLI strings via `FromStr`
123+
- **Builder pattern**: Used for configuration (`ElectrumTransform::new().with_change()`, `ParquetBackend::new().with_compression()`)
124+
- **Progress bars**: `indicatif::ProgressBar` for long operations
125+
- **Early termination**: `AtomicBool` shared across Rayon threads
126+
- **Cascade filtering**: Required for 64-bit seed spaces (mt64, xorshift) — multi-target verification to avoid false positives
127+
- **Masked analysis**: `(full_key & ((1<<N)-1)) | (1<<(N-1))` for puzzle solving
128+
- **GPU feature gating**: `#[cfg(feature = "gpu")]` on trait methods with CPU fallback as default impl
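
The masked-analysis formula in the list above can be illustrated with a small worked example. This sketch operates on a `u64` for brevity, whereas real keys are 256-bit, and the function name `mask_key` is an assumption made for the example.

```rust
/// Keeps the low `bits` bits of `full_key` and forces bit `bits - 1` on:
/// `(full_key & ((1 << N) - 1)) | (1 << (N - 1))`.
///
/// Returns `None` when `bits` is outside `1..=64`, so the caller
/// never hits a shift overflow.
pub fn mask_key(full_key: u64, bits: u32) -> Option<u64> {
    if !(1..=64).contains(&bits) {
        return None;
    }
    // `1 << 64` would overflow, so the full-width mask is special-cased.
    let low_mask = if bits == 64 {
        u64::MAX
    } else {
        (1u64 << bits) - 1
    };
    Some((full_key & low_mask) | (1u64 << (bits - 1)))
}
```

For `N = 8`, an input of `0x01` becomes `0x81`: the low eight bits are kept and the top bit of the range is set, so the result always lies in `[2^(N-1), 2^N - 1]`.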
129+
130+
### Tests
131+
132+
Inline `#[cfg(test)] mod tests` at end of each file. Standard `assert!`/`assert_eq!`. `tempfile` crate for file I/O tests.
113 133

114 134
```bash
115-
# Dev
116-
cargo test # Run tests
117-
cargo build --release # Build optimized
118-
cargo build --release --features gpu # With GPU
119-
cargo build --release --features storage # With Parquet
120-
121-
# Benchmarks
122-
cargo bench # Run benchmarks
123-
124-
# Release (via justfile)
125-
just release 0.8.0 # Bump version, changelog, tag
126-
127-
# CI
128-
# - crates.yml: Publish to crates.io on tags
129-
# - aur.yml: Publish to AUR on tags
130-
# - codspeed.yml: Benchmark on push/PR
135+
cargo test transform::mt64::tests::test_transform_generates_key -- --exact
131 136
```
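
The inline test layout described above might look like the following sketch. The helper `key_to_hex` is hypothetical and exists only to give the test module something to exercise.

```rust
/// Hypothetical helper: lowercase hex encoding of a 32-byte key.
pub fn key_to_hex(key: &[u8; 32]) -> String {
    key.iter().map(|byte| format!("{byte:02x}")).collect()
}

// Inline test module at the end of the file, compiled only for `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_key_to_hex_pads_each_byte() {
        let mut key = [0u8; 32];
        key[31] = 0x0a;
        let hex = key_to_hex(&key);
        assert_eq!(hex.len(), 64);
        assert!(hex.ends_with("0a"));
    }
}
```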
132 137

133-
## NOTES
138+
## Feature Flags
139+
140+
| Flag | What it gates |
141+
|------|---------------|
142+
| `gpu` | WebGPU acceleration (wgpu, pollster, bytemuck) |
143+
| `storage` | Parquet/Arrow output |
144+
| `storage-query` | DuckDB SQL queries on Parquet results (implies `storage`) |
145+
| `storage-cloud` | S3/R2/MinIO upload (implies `storage`) |
146+
| `storage-iceberg` | Iceberg catalog (implies `storage-cloud`) |
147+
| `boha` | Bitcoin puzzle data provider |
148+
149+
No features enabled by default.
150+
151+
## CI
152+
153+
- **crates.yml**: Publish to crates.io on `v*` tags
154+
- **aur.yml**: Publish to AUR on `v*` tags
155+
- **codspeed.yml**: Criterion benchmarks on push to `main` and PRs
156+
157+
No CI-enforced clippy or rustfmt checks.
158+
159+
## Commit Style
160+
161+
Semantic commits: `type(scope): description`
162+
163+
```
164+
fix(output): escape compact CSV fields in console output
165+
test(output): harden CSV edge-case coverage
166+
fix(source): validate descending ranges and guard timestamp counters
167+
```
168+
169+
## Execution Workflow
170+
171+
1. **Explore** — read relevant code before changing it. Understand the trait, the factory enum, existing implementations
172+
2. **Plan** — identify which files need changes. New vulnerability = transform + analyze + shared PRNG + both factory enums + CLI
173+
3. **Edit** — make focused changes. Don't refactor surrounding code
174+
4. **Verify** — run `cargo test` after changes. Run `cargo build --release` if touching feature-gated code. Run specific module tests for targeted verification
175+
176+
## Safety
177+
178+
- Don't commit unless explicitly asked
179+
- Don't push unless explicitly asked
180+
- No secrets in outputs — this tool handles private keys, treat test vectors carefully
181+
- Avoid destructive git operations (force push, reset --hard) unless explicitly requested
182+
- Release profile uses `panic = "abort"` — unrecoverable panics kill the process, so prefer `Result` types
183+
184+
## Complexity Hotspots
185+
186+
| File | Lines | Why |
187+
|------|-------|-----|
188+
| `src/analyze/sha256_chain.rs` | ~843 | Multiple chain variants, GPU support, cascade filtering |
189+
| `src/gpu/sha256_chain.rs` | ~662 | Hybrid CPU-GPU pipeline, cascade |
190+
| `src/analyze/milksad.rs` | ~581 | Full 2^32 brute-force with GPU path |
191+
| `src/analyze/xorshift.rs` | ~511 | Multiple PRNG variants with cascade |
192+
| `src/gpu/hash.rs` | ~538 | Multi-algorithm GPU hashing |
134 193

135-
- **GPU feature**: Compile with `--features gpu` for WebGPU acceleration
136-
- **Storage feature**: Compile with `--features storage` for Parquet output
137-
- **Release profile**: Aggressive optimization (LTO, single codegen unit, stripped)
138-
- **Large files**: `src/analyze/sha256_chain.rs` (843L), `src/gpu/sha256_chain.rs` (662L) - complexity hotspots with refactoring potential
139-
- **Rust 2021 edition**, requires Rust 1.70+
140-
- **TODO**: GPU for generate/scan needs Source trait redesign (main.rs:322)
141-
- **Refactoring opportunity**: Extract common brute-force framework, masking utilities, cascade formatting across analyzers
194+
Refactoring potential: common brute-force framework, shared masking utilities, generic cascade formatting across analyzers.

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
@AGENTS.md

src/analyze/CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
@AGENTS.md

src/gpu/CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
@AGENTS.md

src/output/AGENTS.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
# OUTPUT MODULE
2+
3+
Result formatting for generated and matched keys.
4+
5+
## STRUCTURE
6+
7+
```
8+
output/
9+
├── mod.rs # Output trait, escape_csv_field()
10+
├── console.rs # Stdout and file output (compact CSV + verbose YAML)
11+
├── multi.rs # Multi-output dispatcher (fan-out to multiple outputs)
12+
├── storage.rs # Parquet backend bridge (feature: storage)
13+
└── query_format.rs # DuckDB result formatting (feature: storage-query)
14+
```
15+
16+
## OUTPUT TRAIT
17+
18+
```rust
19+
pub trait Output: Send + Sync {
20+
    fn key(&self, source: &str, transform: &str, derived: &DerivedKey) -> Result<()>;
21+
    fn hit(&self, source: &str, transform: &str, derived: &DerivedKey, match_info: &MatchInfo) -> Result<()>;
22+
    fn flush(&self) -> Result<()>;
23+
}
24+
```
25+
26+
## CONVENTIONS
27+
28+
- **CSV escaping**: Use `escape_csv_field()` from `mod.rs` for any field written to CSV
29+
- **Compact format**: `source,transform,privkey_hex,address` — default for file output
30+
- **Verbose format**: YAML-like multi-line — for console stdout
31+
- **Thread safety**: All outputs must be `Send + Sync` (use interior mutability where needed)
32+
- **Flush**: Always call `flush()` at end of processing to ensure all data is written
33+
34+
## WHERE TO LOOK
35+
36+
| Task | Location |
37+
|------|----------|
38+
| Add new output format | Create `{name}.rs`, implement `Output` trait |
39+
| Fix CSV formatting | `mod.rs` for escaping, `console.rs` for field assembly |
40+
| Multi-output behavior | `multi.rs` — dispatches to `Vec<Box<dyn Output>>` |
41+
| Storage output | `storage.rs` — bridges Output trait to StorageBackend |
42+
| Query result display | `query_format.rs` — table, CSV, JSON formatting |
43+
44+
## ADDING A NEW OUTPUT
45+
46+
1. Create `src/output/{name}.rs`
47+
2. Implement `Output` trait with `key()`, `hit()`, `flush()`
48+
3. Add `mod {name};` and `pub use` in `mod.rs`
49+
4. Wire up in `src/main.rs` where outputs are constructed
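
Step 2 above could be sketched roughly as follows. The trait is copied from the "OUTPUT TRAIT" section, but `DerivedKey`, `MatchInfo`, and the `Result` alias are simplified stand-ins with assumed fields, and `MemoryOutput` is a hypothetical implementation invented for illustration. Real code would also pass each field through the project's `escape_csv_field()` helper, which is omitted here because its signature is not shown.

```rust
use std::error::Error;
use std::sync::Mutex;

// Stand-ins for project types; the real definitions may differ.
type Result<T> = std::result::Result<T, Box<dyn Error + Send + Sync>>;

pub struct DerivedKey {
    pub private_key_hex: String,
    pub address: String,
}

pub struct MatchInfo {
    pub address: String,
}

pub trait Output: Send + Sync {
    fn key(&self, source: &str, transform: &str, derived: &DerivedKey) -> Result<()>;
    fn hit(&self, source: &str, transform: &str, derived: &DerivedKey, match_info: &MatchInfo) -> Result<()>;
    fn flush(&self) -> Result<()>;
}

/// Hypothetical output that buffers compact CSV lines in memory.
/// The `Mutex` provides the interior mutability needed for `&self` methods.
#[derive(Default)]
pub struct MemoryOutput {
    lines: Mutex<Vec<String>>,
}

impl MemoryOutput {
    /// Returns a copy of the buffered lines (empty if the lock is poisoned).
    pub fn lines(&self) -> Vec<String> {
        self.lines.lock().map(|guard| guard.to_vec()).unwrap_or_default()
    }

    fn push(&self, line: String) -> Result<()> {
        self.lines.lock().map_err(|_| "output lock poisoned")?.push(line);
        Ok(())
    }
}

impl Output for MemoryOutput {
    fn key(&self, source: &str, transform: &str, derived: &DerivedKey) -> Result<()> {
        // Compact format: source,transform,privkey_hex,address
        self.push(format!("{source},{transform},{},{}", derived.private_key_hex, derived.address))
    }

    fn hit(&self, source: &str, transform: &str, derived: &DerivedKey, match_info: &MatchInfo) -> Result<()> {
        self.push(format!("{source},{transform},{},{}", derived.private_key_hex, match_info.address))
    }

    fn flush(&self) -> Result<()> {
        // Nothing to flush for an in-memory buffer.
        Ok(())
    }
}
```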

src/output/CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
@AGENTS.md

src/source/CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
@AGENTS.md

src/storage/CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
@AGENTS.md

src/transform/CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
@AGENTS.md
