
feat(derive): add P2TR (Taproot) address support #79

Merged

oritwoen merged 3 commits into main from feat/p2tr-taproot-support on Mar 13, 2026

Conversation

oritwoen (Owner)

Taproot has been active on mainnet since November 2021, but vuke only derives and matches P2PKH and P2WPKH addresses. Scanning against bc1p... targets silently produces zero matches because the address type doesn't exist in the pipeline.

This adds P2TR (key-path spend, no script tree) across the full flow:

  • KeyDeriver::derive() now produces a p2tr field via Address::p2tr() from the bitcoin crate (see the sketch after this list)
  • Matcher::check() tests the new address type, AddressType::P2tr variant added
  • Console output includes P2TR in both verbose and hit formats
  • Parquet and Iceberg storage schemas gain an address_p2tr column (19 -> 20 columns)
  • All test helpers and schema assertions updated
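
For reference, a minimal sketch of key-path-only P2TR derivation with the bitcoin crate; the function and variable names below are illustrative, not vuke's actual KeyDeriver code:

    use bitcoin::secp256k1::{All, Secp256k1, SecretKey, XOnlyPublicKey};
    use bitcoin::{Address, Network};

    // Key-path spend with no script tree: the internal key is the x-only pubkey,
    // merkle_root is None, and Address::p2tr applies the BIP341 tweak internally.
    fn p2tr_address(secp: &Secp256k1<All>, sk: &SecretKey) -> String {
        let (internal_key, _parity) = XOnlyPublicKey::from_keypair(&sk.keypair(secp));
        Address::p2tr(secp, internal_key, None, Network::Bitcoin).to_string()
    }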

Closes #67

Derive, match, and output bc1p Taproot addresses alongside the
existing P2PKH and P2WPKH formats. Covers the full pipeline:
KeyDeriver, Matcher, all output handlers, and the Parquet/Iceberg
storage schemas (19 -> 20 columns).

Closes #67

coderabbitai Bot commented Mar 13, 2026

📝 Walkthrough

P2TR (Taproot) support added: DerivedKey now exposes p2tr, derivation computes the P2TR address, Matcher recognizes P2tr, outputs include and persist the p2tr column, and the storage schemas and tests are updated to include ADDRESS_P2TR.

Changes

Cohort / File(s) Summary
Core Derivation & Matching
src/derive.rs, src/matcher.rs
Added p2tr: String to DerivedKey; derive() now computes p2tr and addresses() returns four items. Added AddressType::P2tr, and Matcher checks derived.p2tr (bc1p...) addresses. See the sketch after this table.
Console & Multi Output
src/output/console.rs, src/output/multi.rs, src/output/storage.rs
Verbose and multi output now print/include derived.p2tr; test fixtures updated to populate p2tr. Storage output writes an AddressRecord with type p2tr.
Storage Schema & Backends
src/storage/schema.rs, src/storage/iceberg/schema.rs, src/storage/parquet_backend.rs
Added ADDRESS_P2TR field and a nullable Utf8 column; schema expanded from 19→20 fields. Tests and batch construction updated to include the new P2TR column.
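
To make the table above concrete, here is a rough sketch of the shapes it describes, assuming the field and variant names from the summary; vuke's actual definitions may differ:

    // Hypothetical shapes inferred from the walkthrough, not the actual vuke code.
    pub struct DerivedKey {
        pub private_key_hex: String,
        pub p2pkh_compressed: String,
        pub p2pkh_uncompressed: String,
        pub p2wpkh: String,
        pub p2tr: String, // new: Taproot key-path address (bc1p...)
    }

    pub enum AddressType {
        P2pkh,
        P2wpkh,
        P2tr, // new variant so hits can be reported as Taproot matches
    }

    impl DerivedKey {
        // addresses() now yields four items, including the bc1p Taproot address.
        pub fn addresses(&self) -> [&str; 4] {
            [
                self.p2pkh_compressed.as_str(),
                self.p2pkh_uncompressed.as_str(),
                self.p2wpkh.as_str(),
                self.p2tr.as_str(),
            ]
        }
    }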

Sequence Diagram

sequenceDiagram
    participant KeyDeriver
    participant DerivedKey
    participant Matcher
    participant Storage
    participant Console

    KeyDeriver->>DerivedKey: derive() -> compute p2pkh/p2wpkh/p2tr
    DerivedKey->>Matcher: provide addresses (incl. p2tr)
    Matcher->>Matcher: compare targets vs p2tr (bc1p...)
    alt Match found
        Matcher->>Storage: write record (address_type: "p2tr", address: derived.p2tr)
        Matcher->>Console: emit hit (include p2tr)
    else No match
        Matcher->>Console: no-hit flow
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

bc1p arrives, a taproot gleam,
Derived, matched, and stored in the stream,
Four addresses now lined in a row,
Tests updated so the checks will know,
Small change — modern wallets can show.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title accurately summarizes the main change: adding P2TR (Taproot) address support across the codebase.
Description check ✅ Passed Description clearly explains the problem (P2TR missing from pipeline since Nov 2021), the solution scope (KeyDeriver, Matcher, console, storage), and references the linked issue.
Linked Issues check ✅ Passed All coding requirements from issue #67 are implemented: p2tr field added to DerivedKey, P2TR derivation via bitcoin crate, P2tr variant added to AddressType, Matcher::check() extended, addresses() returns 4 items, console outputs updated, storage schemas expanded (19→20 columns).
Out of Scope Changes check ✅ Passed Changes are tightly scoped to P2TR support: derive.rs, matcher.rs, console/multi/storage output modules, and storage schemas. No unrelated refactoring or feature additions detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



codeant-ai Bot commented Mar 13, 2026

Sequence Diagram

This PR extends the Bitcoin pipeline to include Taproot addresses as a first-class type. A derived key now includes a P2TR address, the Matcher checks it against targets, and the outputs and storage persist the new address field.

sequenceDiagram
    participant Scanner
    participant KeyDeriver
    participant Matcher
    participant ConsoleOutput
    participant Storage

    Scanner->>KeyDeriver: Derive key data and all address types
    KeyDeriver-->>Scanner: Return key with p2pkh p2wpkh and p2tr
    Scanner->>Matcher: Check derived addresses against targets
    Matcher-->>Scanner: Match result can be p2tr
    Scanner->>ConsoleOutput: Print key and hit details including p2tr
    Scanner->>Storage: Write record with address p2tr column

Generated by CodeAnt AI

oritwoen self-assigned this on Mar 13, 2026

cubic-dev-ai Bot left a comment

No issues found across 8 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

Requires human review: This PR modifies the storage schemas for Parquet and Iceberg (adding columns), which is classified as a high-impact change requiring human review.

Architecture diagram
sequenceDiagram
    participant R as Runner / Main
    participant KD as KeyDeriver
    participant M as Matcher
    participant C as ConsoleOutput
    participant S as StorageOutput
    participant P as Parquet/Iceberg Backend

    Note over R,P: Address Derivation and Matching Flow

    R->>KD: derive(private_key)
    KD->>KD: Generate P2PKH & P2WPKH
    KD->>KD: NEW: Derive P2TR (Taproot) via X-only Pubkey
    KD-->>R: Return DerivedKey (includes P2TR)

    R->>M: check(DerivedKey)
    M->>M: Compare targets against P2PKH/P2WPKH
    M->>M: NEW: Compare targets against AddressType::P2tr
    alt Match Found
        M-->>R: Some(MatchInfo)
    else No Match
        M-->>R: None
    end

    opt Verbose Mode or Match Found
        R->>C: print_details(DerivedKey)
        Note right of C: CHANGED: Console now includes<br/>P2TR address in output
    end

    opt Storage Enabled
        R->>S: write(DerivedKey)
        S->>P: append_row(data)
        Note right of P: CHANGED: Schema expanded to 20 columns<br/>to include address_p2tr
    end


codspeed-hq Bot commented Mar 13, 2026

Merging this PR will degrade performance by 24.33%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

❌ 2 regressed benchmarks
✅ 5 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
derive_batch_100 36.4 ms 48.1 ms -24.33%
derive_addresses 419.8 µs 548.2 µs -23.43%

Comparing feat/p2tr-taproot-support (768c8df) with main (5e227e8)

Open in CodSpeed


coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/output/console.rs (1)

65-86: ⚠️ Potential issue | 🟠 Major

Compact CSV still drops Taproot addresses.

This adds derived.p2tr to verbose output, but the non-verbose CSV path still writes only derived.p2pkh_compressed. So to_file() continues to omit bc1p... values entirely, which means Taproot support is still incomplete for the default file-output path. If the CSV has to stay 4 columns, emit one row per derived address instead of one row per key. As per coding guidelines, "Compact CSV format for file output should be: source,transform,privkey_hex,address."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/output/console.rs` around lines 65 - 86, The non-verbose branch still
writes only a single CSV row using derived.p2pkh_compressed, which omits Taproot
(derived.p2tr); change the else path in console.rs (the block that calls
format_compact_csv_row) to emit one compact CSV row per derived address instead
of one row per key: build an ordered list of addresses from the Derived struct
(e.g. derived.p2pkh_compressed, derived.p2pkh_uncompressed, derived.p2wpkh,
derived.p2tr) and loop over them, calling format_compact_csv_row(source,
transform, &derived.private_key_hex, &address) and writeln! each result so the
compact file output follows "source,transform,privkey_hex,address" and includes
Taproot rows.
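
A self-contained sketch of the per-address loop the prompt describes; the DerivedKey fields and the row format are assumptions taken from the review text, not the verified console.rs code:

    use std::io::Write;

    struct DerivedKey {
        private_key_hex: String,
        p2pkh_compressed: String,
        p2pkh_uncompressed: String,
        p2wpkh: String,
        p2tr: String,
    }

    fn format_compact_csv_row(source: &str, transform: &str, privkey_hex: &str, address: &str) -> String {
        format!("{source},{transform},{privkey_hex},{address}")
    }

    // One compact CSV row per derived address instead of one row per key,
    // so bc1p... values reach the compact file output too.
    fn write_compact_rows<W: Write>(
        out: &mut W,
        source: &str,
        transform: &str,
        derived: &DerivedKey,
    ) -> std::io::Result<()> {
        for address in [
            &derived.p2pkh_compressed,
            &derived.p2pkh_uncompressed,
            &derived.p2wpkh,
            &derived.p2tr,
        ] {
            writeln!(
                out,
                "{}",
                format_compact_csv_row(source, transform, &derived.private_key_hex, address)
            )?;
        }
        Ok(())
    }
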
🧹 Nitpick comments (2)
src/storage/parquet_backend.rs (1)

582-585: This still doesn't prove the new column is populated.

This only checks that the batch grew to 20 columns. addresses1 never includes a p2tr entry and nothing reads column 17 back, so a typo between src/output/storage.rs and src/storage/schema.rs would leave address_p2tr null everywhere and CI would still pass. Add one round-trip assertion for a non-null Taproot value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/parquet_backend.rs` around lines 582 - 585, The test currently
only checks column count but not that the new Taproot column is populated;
modify the test after obtaining `batch = &batches[0]` to read the `address_p2tr`
column (either by name "address_p2tr" or index 17) and assert that at least one
row contains a non-null, expected Taproot value (for example compare the
recorded value to the known Taproot from `addresses1` or check
`!array.is_null(row_idx)`/matches expected bytes); use the Arrow/Parquet API
calls you already use elsewhere in the test to fetch the column array and
validate its contents so CI will fail if the column is left null everywhere.
src/derive.rs (1)

216-221: Pin Taproot derivation to an exact fixture.

starts_with("bc1p") is too loose here. A wrong tweak or wrong x-only key still gives you a valid-looking Taproot string, so this test stays green while derivation is wrong. Use a fixed key and assert the full expected p2tr address.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/derive.rs` around lines 216 - 221, The Taproot assertion is too loose:
replace the starts_with("bc1p") check so the test pins Taproot output to a fixed
fixture; compute or hardcode the exact expected p2tr address and assert equality
against addrs[3] (the result of derived.addresses()) instead of a prefix check,
ensuring the test fails if the tweak/x-only key is wrong; keep the existing
checks for addrs[0..2] but change the P2TR check to assert_eq!(addrs[3],
"<expected_p2tr_address>") using the known fixture value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/storage/iceberg/schema.rs`:
- Around line 117-122: The new NestedField::optional(fields::ADDRESS_P2TR, ...)
block is currently inserted before the existing wif fields and therefore shifts
their generated IDs (affecting wif_compressed and wif_uncompressed); move the
ADDRESS_P2TR field block so it is appended after the wif_compressed and
wif_uncompressed field declarations (i.e., place the
NestedField::optional(next_id(), fields::ADDRESS_P2TR,
Type::Primitive(PrimitiveType::String)).into() after the existing wif_* entries)
to ensure existing field IDs remain unchanged and the new field gets a fresh ID.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c9c30707-e47e-4647-acf5-6b8c443ce033

📥 Commits

Reviewing files that changed from the base of the PR and between 5e227e8 and 768c8df.

📒 Files selected for processing (8)
  • src/derive.rs
  • src/matcher.rs
  • src/output/console.rs
  • src/output/multi.rs
  • src/output/storage.rs
  • src/storage/iceberg/schema.rs
  • src/storage/parquet_backend.rs
  • src/storage/schema.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: benchmarks
🧰 Additional context used
📓 Path-based instructions (8)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Order imports as: external crates → std → blank line → super:: → blank line → crate::
Prefer ? operator over .unwrap() in new code
Use PascalCase for types and structs
Use snake_case for function and method names
Use SCREAMING_SNAKE_CASE for constants
Use snake_case for file and module names

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/derive.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
  • src/matcher.rs
src/{derive,matcher,network,benchmark,provider,transform,analyze,source,output,gpu,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Implement custom error enums with Display and Error trait implementations for domain modules

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
src/{transform,output,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Use builder pattern for configuration (e.g., ElectrumTransform::new().with_change(), ParquetBackend::new().with_compression())

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Use indicatif::ProgressBar for long-running operations
Place inline tests in #[cfg(test)] mod tests at the end of each file, using standard assert! and assert_eq! macros
Use tempfile crate for file I/O tests

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/derive.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
  • src/matcher.rs
src/output/**/*.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

src/output/**/*.rs: Use escape_csv_field() from mod.rs for any field written to CSV in the output module
All outputs must be Send + Sync (use interior mutability where needed) for thread safety
Output trait must implement three methods: key(), hit(), and flush() with specified signatures

Files:

  • src/output/storage.rs
  • src/output/multi.rs
  • src/output/console.rs
src/output/*.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

Create new output formats by implementing the Output trait in src/output/{name}.rs and registering with mod {name}; and pub use in mod.rs

Files:

  • src/output/storage.rs
  • src/output/multi.rs
  • src/output/console.rs
src/output/multi.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

Multi-output behavior is implemented by dispatching to Vec<Box<dyn Output>> in the multi-output dispatcher

Files:

  • src/output/multi.rs
src/output/console.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

src/output/console.rs: Compact CSV format for file output should be: source,transform,privkey_hex,address
Use verbose YAML-like multi-line format for console stdout output

Files:

  • src/output/console.rs
🧠 Learnings (13)
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update Arrow schema definition in `schema.rs`, including `result_schema()` function and field constants

Applied to files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/output/storage.rs
  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Map variable-length fields to fixed columns in schema for SQL-friendly querying

Applied to files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-13T13:20:58.369Z
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 76
File: src/source/wordlist.rs:134-143
Timestamp: 2026-03-13T13:20:58.369Z
Learning: Do not flag .ok() calls on output.hit() / output.key() inside Rayon par_chunks().for_each() closures in Rust files (e.g., src/source/wordlist.rs) as bugs. Rayon for_each requires FnMut() -> (), which cannot propagate Result with ?. This is a known limitation discussed in issue `#77`. Treat these .ok() usages in this pattern as intentional and not regressions; focus review on actual logic or error handling outside this specific closure pattern.

Applied to files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/derive.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
  • src/matcher.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update `records_to_batch()` conversion when adding new columns to the schema

Applied to files:

  • src/storage/schema.rs
  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Define record structs in `mod.rs` when modifying the record structure

Applied to files:

  • src/storage/schema.rs
  • src/output/storage.rs
📚 Learning: 2026-03-12T18:25:01.029Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/output/AGENTS.md:0-0
Timestamp: 2026-03-12T18:25:01.029Z
Learning: Applies to src/output/console.rs : Compact CSV format for file output should be: `source,transform,privkey_hex,address`

Applied to files:

  • src/storage/schema.rs
  • src/output/storage.rs
  • src/output/console.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Use `&'a str` and `&'a [T]` references in record structs for zero-copy efficiency

Applied to files:

  • src/storage/schema.rs
  • src/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/*.rs : Implement `StorageBackend` trait in a new `{name}.rs` file when adding a new storage backend

Applied to files:

  • src/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Add error variants to the `StorageError` enum in `mod.rs`

Applied to files:

  • src/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/parquet_backend.rs : Implement chunk rotation logic in `should_rotate()` and `rotate_chunk()` methods in `parquet_backend.rs`

Applied to files:

  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/parquet_backend.rs : Configure ParquetBackend with appropriate compression, chunk_records, and chunk_bytes thresholds

Applied to files:

  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/parquet_backend.rs : Use Hive-style partitioning format `transform={name}/date={YYYY-MM-DD}/chunk_{NNNN}.parquet` for ParquetBackend file organization

Applied to files:

  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-12T18:25:01.029Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/output/AGENTS.md:0-0
Timestamp: 2026-03-12T18:25:01.029Z
Learning: Applies to src/output/console.rs : Use verbose YAML-like multi-line format for console stdout output

Applied to files:

  • src/output/console.rs
🧬 Code graph analysis (1)
src/matcher.rs (1)
src/output/console.rs (2)
  • key (62-91)
  • new (21-26)

Comment thread src/storage/iceberg/schema.rs
Iceberg identifies columns by field ID, not position. Inserting
P2TR before the WIF fields would shift their IDs and break reads
on existing tables. Append it at the end so WIF keeps IDs 18/19
and P2TR gets the fresh ID 20.
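
A sketch of the resulting field ordering, reusing the NestedField::optional call from the review prompt; the literal IDs and field-name strings are illustrative rather than the module's actual constants:

    use iceberg::spec::{NestedField, PrimitiveType, Type};
    use std::sync::Arc;

    // Tail of the schema field list: the new column is appended after the WIF
    // fields so their IDs (18, 19) stay stable and address_p2tr gets the fresh ID 20.
    fn trailing_fields() -> Vec<Arc<NestedField>> {
        vec![
            NestedField::optional(18, "wif_compressed", Type::Primitive(PrimitiveType::String)).into(),
            NestedField::optional(19, "wif_uncompressed", Type::Primitive(PrimitiveType::String)).into(),
            NestedField::optional(20, "address_p2tr", Type::Primitive(PrimitiveType::String)).into(),
        ]
    }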

coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/storage/iceberg/schema.rs (1)

160-160: Line 160 check is too weak to guard schema-ID compatibility.

len() == 20 still passes if someone later reorders fields and shifts IDs. Add an explicit regression test for wif_compressed=18, wif_uncompressed=19, and address_p2tr=20 (and include address_p2tr in the field-name test), so the compatibility intent is enforced by tests.

Suggested test hardening
@@
     fn schema_has_correct_field_names() {
         let schema = build_iceberg_schema().unwrap();
         let struct_type = schema.as_struct();

         assert!(struct_type.field_by_name("source").is_some());
         assert!(struct_type.field_by_name("transform").is_some());
         assert!(struct_type.field_by_name("timestamp").is_some());
         assert!(struct_type.field_by_name("private_key_raw").is_some());
         assert!(struct_type.field_by_name("address_p2wpkh").is_some());
+        assert!(struct_type.field_by_name("address_p2tr").is_some());
     }
@@
     fn field_ids_are_sequential() {
         let schema = build_iceberg_schema().unwrap();
         let fields = schema.as_struct().fields();
         let ids: Vec<i32> = fields.iter().map(|f| f.id).collect();

         for (i, id) in ids.iter().enumerate() {
             assert_eq!(*id, (i + 1) as i32);
         }
     }
+
+    #[test]
+    fn p2tr_append_preserves_wif_field_ids() {
+        let schema = build_iceberg_schema().unwrap();
+        let struct_type = schema.as_struct();
+        assert_eq!(struct_type.field_by_name("wif_compressed").unwrap().id, 18);
+        assert_eq!(struct_type.field_by_name("wif_uncompressed").unwrap().id, 19);
+        assert_eq!(struct_type.field_by_name("address_p2tr").unwrap().id, 20);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/iceberg/schema.rs` at line 160, Replace the weak length-only
assertion in the schema test with explicit checks that guard field ordering/IDs:
assert that schema.as_struct().fields() contains fields with names
"wif_compressed", "wif_uncompressed", and "address_p2tr" at the exact positions
corresponding to IDs 18, 19, and 20 respectively (i.e., indexes 17, 18, 19), and
include "address_p2tr" in the existing field-name assertions; update or add a
regression test in schema.rs to enforce these exact position/name mappings so
reordering cannot silently break schema-ID compatibility.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 42769662-1b5b-4009-a8d4-eea964920016

📥 Commits

Reviewing files that changed from the base of the PR and between 768c8df and 1e8910a.

📒 Files selected for processing (1)
  • src/storage/iceberg/schema.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Order imports as: external crates → std → blank line → super:: → blank line → crate::
Prefer ? operator over .unwrap() in new code
Use PascalCase for types and structs
Use snake_case for function and method names
Use SCREAMING_SNAKE_CASE for constants
Use snake_case for file and module names

Files:

  • src/storage/iceberg/schema.rs
src/{derive,matcher,network,benchmark,provider,transform,analyze,source,output,gpu,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Implement custom error enums with Display and Error trait implementations for domain modules

Files:

  • src/storage/iceberg/schema.rs
src/{transform,output,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Use builder pattern for configuration (e.g., ElectrumTransform::new().with_change(), ParquetBackend::new().with_compression())

Files:

  • src/storage/iceberg/schema.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Use indicatif::ProgressBar for long-running operations
Place inline tests in #[cfg(test)] mod tests at the end of each file, using standard assert! and assert_eq! macros
Use tempfile crate for file I/O tests

Files:

  • src/storage/iceberg/schema.rs
🧠 Learnings (4)
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update Arrow schema definition in `schema.rs`, including `result_schema()` function and field constants

Applied to files:

  • src/storage/iceberg/schema.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Define record structs in `mod.rs` when modifying the record structure

Applied to files:

  • src/storage/iceberg/schema.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Map variable-length fields to fixed columns in schema for SQL-friendly querying

Applied to files:

  • src/storage/iceberg/schema.rs
📚 Learning: 2026-03-13T13:20:58.369Z
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 76
File: src/source/wordlist.rs:134-143
Timestamp: 2026-03-13T13:20:58.369Z
Learning: Do not flag .ok() calls on output.hit() / output.key() inside Rayon par_chunks().for_each() closures in Rust files (e.g., src/source/wordlist.rs) as bugs. Rayon for_each requires FnMut() -> (), which cannot propagate Result with ?. This is a known limitation discussed in issue `#77`. Treat these .ok() usages in this pattern as intentional and not regressions; focus review on actual logic or error handling outside this specific closure pattern.

Applied to files:

  • src/storage/iceberg/schema.rs
🔇 Additional comments (1)
src/storage/iceberg/schema.rs (1)

129-135: Appending address_p2tr after WIF is correct.

This keeps the existing WIF field IDs stable and adds P2TR as a new trailing field, which is the safe evolution path.

oritwoen merged commit bc9b540 into main on Mar 13, 2026
2 checks passed
oritwoen deleted the feat/p2tr-taproot-support branch on March 13, 2026 at 14:55

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Matcher and KeyDeriver don't support P2TR (Taproot) addresses

1 participant