feat(derive): add P2SH-P2WPKH address support#89
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📜 Recent review details🧰 Additional context used📓 Path-based instructions (4)**/*.rs📄 CodeRabbit inference engine (AGENTS.md)
Files:
src/{derive,matcher,network,benchmark,provider,transform,analyze,source,output,gpu,storage}/**/*.rs📄 CodeRabbit inference engine (AGENTS.md)
Files:
src/{transform,output,storage}/**/*.rs📄 CodeRabbit inference engine (AGENTS.md)
Files:
src/**/*.rs📄 CodeRabbit inference engine (AGENTS.md)
Files:
🧠 Learnings (7)📓 Common learnings📚 Learning: 2026-03-22T19:15:02.128ZApplied to files:
📚 Learning: 2026-03-02T14:18:34.668ZApplied to files:
📚 Learning: 2026-03-02T14:18:34.668ZApplied to files:
📚 Learning: 2026-03-02T14:18:34.668ZApplied to files:
📚 Learning: 2026-03-22T18:38:29.431ZApplied to files:
📚 Learning: 2026-03-13T13:20:58.369ZApplied to files:
🧬 Code graph analysis (1)src/storage/iceberg/schema.rs (2)
🔇 Additional comments (2)
📝 WalkthroughWalkthroughThis PR adds P2SH-P2WPKH address derivation to DerivedKey, includes it in DerivedKey::addresses(), extends matching to detect P2SH-P2WPKH, and propagates the new field through console output, storage records, Arrow/Iceberg schemas, and provider verification. Changes
Sequence Diagram(s)sequenceDiagram
participant KeyDeriver
participant DerivedKey
participant Matcher
participant Provider
participant Storage
participant Console
KeyDeriver->>DerivedKey: derive(pubkey) -> fill addresses (p2pkh, p2wpkh, p2tr, p2sh_p2wpkh)
DerivedKey->>Matcher: addresses()
Matcher->>Provider: check(puzzle/address against addresses)
alt match found
Provider->>Console: emit hit with address_type (may be p2sh_p2wpkh)
Provider->>Storage: write record including address_p2sh_p2wpkh
else no match
Provider->>Console: no-hit
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
✨ Simplify code
Comment |
There was a problem hiding this comment.
No issues found across 9 files
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
Requires human review: Contains storage schema changes (Parquet/Iceberg) and modifications to core address derivation/matching logic, which require human review.
Architecture diagram
sequenceDiagram
participant CLI as CLI / Main
participant KD as KeyDeriver
participant DK as DerivedKey
participant M as Matcher
participant P as Provider
participant Out as Output (Console/Storage)
Note over CLI,Out: Address Derivation & Matching Flow
CLI->>KD: derive(private_key)
KD->>KD: Generate P2PKH (Comp/Uncomp)
KD->>KD: NEW: Generate P2SH-P2WPKH (Address::p2shwpkh)
KD->>KD: Generate P2WPKH & P2TR
KD-->>CLI: Return DerivedKey object
CLI->>M: check(DerivedKey)
M->>DK: CHANGED: addresses() returns 5 strings
loop For each address in DerivedKey
M->>M: Compare against target list
alt NEW: Match on P2SH-P2WPKH
M-->>CLI: Return MatchInfo (AddressType::P2shP2wpkh)
else Match on other types
M-->>CLI: Return MatchInfo
end
end
opt Match Found
CLI->>P: verify_key_boha(key, query)
P->>DK: Compare address strings
Note right of P: NEW: Includes P2SH-P2WPKH check
P-->>CLI: Return VerifyReport
CLI->>Out: emit(DerivedKey, MatchInfo)
par Console Output
Out->>Out: CHANGED: Print P2SH-P2WPKH address
and Storage Output (Parquet/Iceberg)
Out->>Out: NEW: Map P2SH-P2WPKH to AddressRecord
end
end
Note over CLI: CLI 'single' command displays 5 addresses (CHANGED)
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/storage/schema.rs (3)
473-474:⚠️ Potential issue | 🔴 CriticalTest will fail:
num_columns()assertion is stale.Same issue as
records_to_batch_empty— should be 21.Proposed fix
assert_eq!(batch.num_rows(), 1); - assert_eq!(batch.num_columns(), 20); + assert_eq!(batch.num_columns(), 21);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/schema.rs` around lines 473 - 474, The test asserts batch.num_columns() == 20 but the schema now has 21 columns; update the assertion in the failing test (the same test that checks batch.num_rows() and batch.num_columns(), referenced by the test name around these assertions, e.g., records_to_batch or records_to_batch_empty) to expect 21 instead of 20 so the assertion matches the current schema column count.
391-396:⚠️ Potential issue | 🔴 CriticalTest will fail:
num_columns()assertion is stale.Schema now has 21 columns, not 20.
Proposed fix
fn records_to_batch_empty() { let batch = records_to_batch(&[]).unwrap(); assert_eq!(batch.num_rows(), 0); - assert_eq!(batch.num_columns(), 20); + assert_eq!(batch.num_columns(), 21); assert_eq!(batch.schema(), Arc::new(result_schema())); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/schema.rs` around lines 391 - 396, The unit test records_to_batch_empty in src/storage/schema.rs asserts that batch.num_columns() == 20 but the schema was changed to 21 columns; update the expected column count to 21 (and ensure the expected schema from result_schema() still matches) by changing the assertion in records_to_batch_empty to assert_eq!(batch.num_columns(), 21) and run the tests to confirm consistency with records_to_batch and result_schema().
518-530:⚠️ Potential issue | 🔴 CriticalTest will fail: column indices shifted after new field insertion.
Column 16 is now
address_p2sh_p2wpkh, notaddress_p2wpkh. Column 18 is nowaddress_p2tr, notwif_compressed.Proposed fix
let addr_p2wpkh_col = batch - .column(16) + .column(17) .as_any() .downcast_ref::<StringArray>() .unwrap(); assert_eq!(addr_p2wpkh_col.value(0), "bc1qtest"); let wif_compressed_col = batch - .column(18) + .column(19) .as_any() .downcast_ref::<StringArray>() .unwrap(); assert_eq!(wif_compressed_col.value(0), "L1234");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/storage/schema.rs` around lines 518 - 530, The test uses hard-coded column indices that shifted after inserting a new field; update the indices so addr_p2wpkh_col reads from column 17 (not 16) and wif_compressed_col reads from column 19 (not 18). Locate the downcast/ref calls for addr_p2wpkh_col and wif_compressed_col in src/storage/schema.rs and change their .column(...) index values accordingly, keeping the same downcast to StringArray and the existing assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/output/console.rs`:
- Line 171: Add a regression assertion that the newly added p2sh_p2wpkh field is
actually rendered in the verbose output: in the test that constructs the fixture
and produces the verbose/printed output (the test that calls the console
renderer/to_string()/render for the fixture), assert that the output contains
the p2sh_p2wpkh value (e.g. "3Wrapped..." or the fixture variable) so a broken
print line will fail; locate the fixture usage and add a
contains/assert!(output.contains(...)) for p2sh_p2wpkh.
In `@src/storage/iceberg/schema.rs`:
- Around line 136-141: Add a unit test that explicitly asserts the presence of
the new field name "address_p2sh_p2wpkh" in the Iceberg schema instead of
relying only on field count/subset checks; locate the test that currently
validates the schema (the one doing a field count and subset name checks) and
add an assertion that the schema's field names include "address_p2sh_p2wpkh"
(the field created by NestedField::optional(next_id(),
fields::ADDRESS_P2SH_P2WPKH, Type::Primitive(PrimitiveType::String))). Ensure
the test fails if that exact name is missing to catch regressions.
---
Outside diff comments:
In `@src/storage/schema.rs`:
- Around line 473-474: The test asserts batch.num_columns() == 20 but the schema
now has 21 columns; update the assertion in the failing test (the same test that
checks batch.num_rows() and batch.num_columns(), referenced by the test name
around these assertions, e.g., records_to_batch or records_to_batch_empty) to
expect 21 instead of 20 so the assertion matches the current schema column
count.
- Around line 391-396: The unit test records_to_batch_empty in
src/storage/schema.rs asserts that batch.num_columns() == 20 but the schema was
changed to 21 columns; update the expected column count to 21 (and ensure the
expected schema from result_schema() still matches) by changing the assertion in
records_to_batch_empty to assert_eq!(batch.num_columns(), 21) and run the tests
to confirm consistency with records_to_batch and result_schema().
- Around line 518-530: The test uses hard-coded column indices that shifted
after inserting a new field; update the indices so addr_p2wpkh_col reads from
column 17 (not 16) and wif_compressed_col reads from column 19 (not 18). Locate
the downcast/ref calls for addr_p2wpkh_col and wif_compressed_col in
src/storage/schema.rs and change their .column(...) index values accordingly,
keeping the same downcast to StringArray and the existing assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f52c1550-35e2-4e74-869b-207bddad25ef
📒 Files selected for processing (9)
src/derive.rssrc/main.rssrc/matcher.rssrc/output/console.rssrc/output/multi.rssrc/output/storage.rssrc/provider.rssrc/storage/iceberg/schema.rssrc/storage/schema.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (9)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Order imports as: external crates → std → blank line →super::→ blank line →crate::
Prefer?operator over.unwrap()in new code
UsePascalCasefor types and structs
Usesnake_casefor function and method names
UseSCREAMING_SNAKE_CASEfor constants
Usesnake_casefor file and module names
Files:
src/output/multi.rssrc/main.rssrc/output/console.rssrc/provider.rssrc/storage/iceberg/schema.rssrc/matcher.rssrc/derive.rssrc/storage/schema.rssrc/output/storage.rs
src/{derive,matcher,network,benchmark,provider,transform,analyze,source,output,gpu,storage}/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
Implement custom error enums with
DisplayandErrortrait implementations for domain modules
Files:
src/output/multi.rssrc/output/console.rssrc/storage/iceberg/schema.rssrc/storage/schema.rssrc/output/storage.rs
src/{transform,output,storage}/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
Use builder pattern for configuration (e.g.,
ElectrumTransform::new().with_change(),ParquetBackend::new().with_compression())
Files:
src/output/multi.rssrc/output/console.rssrc/storage/iceberg/schema.rssrc/storage/schema.rssrc/output/storage.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Useindicatif::ProgressBarfor long-running operations
Place inline tests in#[cfg(test)] mod testsat the end of each file, using standardassert!andassert_eq!macros
Usetempfilecrate for file I/O tests
Files:
src/output/multi.rssrc/main.rssrc/output/console.rssrc/provider.rssrc/storage/iceberg/schema.rssrc/matcher.rssrc/derive.rssrc/storage/schema.rssrc/output/storage.rs
src/output/**/*.rs
📄 CodeRabbit inference engine (src/output/AGENTS.md)
src/output/**/*.rs: Useescape_csv_field()frommod.rsfor any field written to CSV in the output module
All outputs must beSend + Sync(use interior mutability where needed) for thread safety
Output trait must implement three methods:key(),hit(), andflush()with specified signatures
Files:
src/output/multi.rssrc/output/console.rssrc/output/storage.rs
src/output/*.rs
📄 CodeRabbit inference engine (src/output/AGENTS.md)
Create new output formats by implementing the Output trait in
src/output/{name}.rsand registering withmod {name};andpub useinmod.rs
Files:
src/output/multi.rssrc/output/console.rssrc/output/storage.rs
src/output/multi.rs
📄 CodeRabbit inference engine (src/output/AGENTS.md)
Multi-output behavior is implemented by dispatching to
Vec<Box<dyn Output>>in the multi-output dispatcher
Files:
src/output/multi.rs
src/main.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/main.rs: Useanyhow::Result<T>andanyhow::bail!()for CLI and top-level error handling
CLI command and subcommand changes go insrc/main.rsusing clap derive macros
Files:
src/main.rs
src/output/console.rs
📄 CodeRabbit inference engine (src/output/AGENTS.md)
src/output/console.rs: Compact CSV format for file output should be:source,transform,privkey_hex,address
Use verbose YAML-like multi-line format for console stdout output
Files:
src/output/console.rs
🧠 Learnings (11)
📓 Common learnings
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 86
File: src/provider.rs:442-443
Timestamp: 2026-03-22T18:38:26.890Z
Learning: In oritwoen/vuke src/provider.rs, the `derived.p2tr == addr` branch in `verify_key_boha` has no test coverage intentionally — no boha puzzles currently have P2TR/Taproot addresses. The branch is forward-compatible for when taproot puzzles are added to boha. Do not flag missing P2TR test coverage in this function.
📚 Learning: 2026-03-22T18:38:26.890Z
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 86
File: src/provider.rs:442-443
Timestamp: 2026-03-22T18:38:26.890Z
Learning: In oritwoen/vuke src/provider.rs, the `derived.p2tr == addr` branch in `verify_key_boha` has no test coverage intentionally — no boha puzzles currently have P2TR/Taproot addresses. The branch is forward-compatible for when taproot puzzles are added to boha. Do not flag missing P2TR test coverage in this function.
Applied to files:
src/output/multi.rssrc/main.rssrc/output/console.rssrc/provider.rssrc/matcher.rssrc/derive.rssrc/output/storage.rs
📚 Learning: 2026-03-13T13:20:58.369Z
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 76
File: src/source/wordlist.rs:134-143
Timestamp: 2026-03-13T13:20:58.369Z
Learning: Do not flag .ok() calls on output.hit() / output.key() inside Rayon par_chunks().for_each() closures in Rust files (e.g., src/source/wordlist.rs) as bugs. Rayon for_each requires FnMut() -> (), which cannot propagate Result with ?. This is a known limitation discussed in issue `#77`. Treat these .ok() usages in this pattern as intentional and not regressions; focus review on actual logic or error handling outside this specific closure pattern.
Applied to files:
src/output/multi.rssrc/main.rssrc/output/console.rssrc/provider.rssrc/storage/iceberg/schema.rssrc/matcher.rssrc/derive.rssrc/storage/schema.rssrc/output/storage.rs
📚 Learning: 2026-03-12T18:25:01.029Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/output/AGENTS.md:0-0
Timestamp: 2026-03-12T18:25:01.029Z
Learning: Applies to src/output/console.rs : Compact CSV format for file output should be: `source,transform,privkey_hex,address`
Applied to files:
src/main.rssrc/output/console.rs
📚 Learning: 2026-03-12T18:24:52.371Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T18:24:52.371Z
Learning: Applies to src/main.rs : CLI command and subcommand changes go in `src/main.rs` using clap derive macros
Applied to files:
src/main.rs
📚 Learning: 2026-03-12T18:25:01.029Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/output/AGENTS.md:0-0
Timestamp: 2026-03-12T18:25:01.029Z
Learning: Applies to src/output/console.rs : Use verbose YAML-like multi-line format for console stdout output
Applied to files:
src/output/console.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update Arrow schema definition in `schema.rs`, including `result_schema()` function and field constants
Applied to files:
src/storage/iceberg/schema.rssrc/storage/schema.rssrc/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Map variable-length fields to fixed columns in schema for SQL-friendly querying
Applied to files:
src/storage/iceberg/schema.rssrc/storage/schema.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update `records_to_batch()` conversion when adding new columns to the schema
Applied to files:
src/storage/iceberg/schema.rssrc/storage/schema.rssrc/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Define record structs in `mod.rs` when modifying the record structure
Applied to files:
src/storage/schema.rssrc/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Use `&'a str` and `&'a [T]` references in record structs for zero-copy efficiency
Applied to files:
src/output/storage.rs
🔇 Additional comments (11)
src/main.rs (1)
769-769:singlenow exposes wrapped-segwit address output.Line 769 correctly prints
derived.p2sh_p2wpkh, sosinglestays aligned with the expanded derivation set.src/output/console.rs (1)
74-75: Wrapped-segwit field is consistently printed in both verbose paths.Lines 74 and 119 wire
p2sh_p2wpkhintokey()andhit(), so operators see the same address set in both flows.As per coding guidelines,
src/output/console.rsshould keep verbose YAML-like multi-line output, and this change follows that format.Also applies to: 119-120
src/output/multi.rs (1)
94-94: Test fixture update is correct for the expandedDerivedKey.Line 94 keeps
make_test_derived()structurally in sync with the newp2sh_p2wpkhfield.src/provider.rs (1)
440-441: Provider verification now covers wrapped-segwit matches.Lines 440-441 add the missing
derived.p2sh_p2wpkh == addrpath and report the correct address type string.src/output/storage.rs (1)
184-187: Parquet record mapping includes the new address type end-to-end.Lines 184-187 persist
p2sh_p2wpkhintoaddresses, and Line 286 keeps test data aligned with that schema expansion.Also applies to: 286-286
src/matcher.rs (1)
25-25: Matcher support for wrapped-segwit is complete and covered.Variant,
as_str()mapping, runtime check, and test were all added consistently, so matching/reporting forp2sh_p2wpkhis now end-to-end in this module.Also applies to: 35-35, 93-99, 176-190
src/derive.rs (1)
39-40: Derivation and public address list are correctly extended to P2SH-P2WPKH.The new field is generated from the compressed pubkey, exposed in
DerivedKey::addresses(), and validated by tests, so this module’s address set expansion is coherent.Also applies to: 49-54, 121-125, 174-174, 212-213, 223-228
src/storage/schema.rs (4)
42-44: LGTM!Constant follows naming convention and is placed in correct position between the address fields.
110-114: LGTM!Field insertion position is correct — between
ADDRESS_P2PKH_UNCOMPRESSEDandADDRESS_P2WPKH. Doc table matches code.
235-240: LGTM!Array construction and vec ordering matches schema. The
"p2sh_p2wpkh"lookup key needs to match whateverAddressRecord.address_typeis set to in the derive module — worth a quick sanity check that they align.Also applies to: 287-288
304-306: LGTM!Test assertions properly updated for the new column count and field positions.
Also applies to: 333-333, 362-362, 380-380
Closes #82
P2SH-P2WPKH (the
3-prefixed wrapped segwit addresses) was the dominant address format from ~2017 through 2021. If a vulnerable key had funds sent to its P2SH-P2WPKH address, the scanner couldn't find them because that address was never computed.Adds
p2sh_p2wpkhfield toDerivedKeyusingAddress::p2shwpkh()from the bitcoin crate. Updated everywhere that matters: matcher, provider verification, console output (verbose + hit), storage schema (Parquet + Iceberg), and thesinglecommand display.DerivedKey::addresses()returns 5 elements now instead of 4.