
feat(derive): add P2TR (Taproot) address support #79

Merged

oritwoen merged 3 commits into main from feat/p2tr-taproot-support on Mar 13, 2026

Conversation

oritwoen (Owner)

Taproot has been active on mainnet since November 2021, but vuke only derives and matches P2PKH and P2WPKH addresses. Scanning against bc1p... targets silently produces zero matches because the address type doesn't exist in the pipeline.

This adds P2TR (key-path spend, no script tree) across the full flow:

  • KeyDeriver::derive() now produces a p2tr field via Address::p2tr() from the bitcoin crate (see the sketch after this list)
  • Matcher::check() tests the new address type, AddressType::P2tr variant added
  • Console output includes P2TR in both verbose and hit formats
  • Parquet and Iceberg storage schemas gain an address_p2tr column (19 -> 20 columns)
  • All test helpers and schema assertions updated
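
For reference, a minimal sketch of key-path-only P2TR derivation with the bitcoin crate; the function and variable names below are illustrative, not vuke's actual KeyDeriver code:

    use bitcoin::secp256k1::{All, Secp256k1, SecretKey, XOnlyPublicKey};
    use bitcoin::{Address, Network};

    // Key-path spend with no script tree: the internal key is the x-only pubkey,
    // merkle_root is None, and Address::p2tr applies the BIP341 tweak internally.
    fn p2tr_address(secp: &Secp256k1<All>, sk: &SecretKey) -> String {
        let (internal_key, _parity) = XOnlyPublicKey::from_keypair(&sk.keypair(secp));
        Address::p2tr(secp, internal_key, None, Network::Bitcoin).to_string()
    }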

Closes #67

Derive, match, and output bc1p Taproot addresses alongside the
existing P2PKH and P2WPKH formats. Covers the full pipeline:
KeyDeriver, Matcher, all output handlers, and the Parquet/Iceberg
storage schemas (19 -> 20 columns).

Closes #67

coderabbitai Bot commented Mar 13, 2026

📝 Walkthrough

P2TR (Taproot) support added: DerivedKey now exposes p2tr, derivation computes the P2TR address, Matcher recognizes P2tr, outputs include and persist the p2tr column, and the storage schemas and tests are updated to include ADDRESS_P2TR.

Changes

Cohort / File(s) Summary
Core Derivation & Matching
src/derive.rs, src/matcher.rs
Added p2tr: String to DerivedKey; derive() now computes p2tr and addresses() returns four items. Added AddressType::P2tr, and Matcher checks derived.p2tr (bc1p...) addresses. See the sketch after this table.
Console & Multi Output
src/output/console.rs, src/output/multi.rs, src/output/storage.rs
Verbose and multi output now print/include derived.p2tr; test fixtures updated to populate p2tr. Storage output writes an AddressRecord with type p2tr.
Storage Schema & Backends
src/storage/schema.rs, src/storage/iceberg/schema.rs, src/storage/parquet_backend.rs
Added ADDRESS_P2TR field and a nullable Utf8 column; schema expanded from 19→20 fields. Tests and batch construction updated to include the new P2TR column.
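
To make the table above concrete, here is a rough sketch of the shapes it describes, assuming the field and variant names from the summary; vuke's actual definitions may differ:

    // Hypothetical shapes inferred from the walkthrough, not the actual vuke code.
    pub struct DerivedKey {
        pub private_key_hex: String,
        pub p2pkh_compressed: String,
        pub p2pkh_uncompressed: String,
        pub p2wpkh: String,
        pub p2tr: String, // new: Taproot key-path address (bc1p...)
    }

    pub enum AddressType {
        P2pkh,
        P2wpkh,
        P2tr, // new variant so hits can be reported as Taproot matches
    }

    impl DerivedKey {
        // addresses() now yields four items, including the bc1p Taproot address.
        pub fn addresses(&self) -> [&str; 4] {
            [
                self.p2pkh_compressed.as_str(),
                self.p2pkh_uncompressed.as_str(),
                self.p2wpkh.as_str(),
                self.p2tr.as_str(),
            ]
        }
    }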

Sequence Diagram

sequenceDiagram
    participant KeyDeriver
    participant DerivedKey
    participant Matcher
    participant Storage
    participant Console

    KeyDeriver->>DerivedKey: derive() -> compute p2pkh/p2wpkh/p2tr
    DerivedKey->>Matcher: provide addresses (incl. p2tr)
    Matcher->>Matcher: compare targets vs p2tr (bc1p...)
    alt Match found
        Matcher->>Storage: write record (address_type: "p2tr", address: derived.p2tr)
        Matcher->>Console: emit hit (include p2tr)
    else No match
        Matcher->>Console: no-hit flow
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

bc1p arrives, a taproot gleam,
Derived, matched, and stored in the stream,
Four addresses now lined in a row,
Tests updated so the checks will know,
Small change — modern wallets can show.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title accurately summarizes the main change: adding P2TR (Taproot) address support across the codebase.
Description check ✅ Passed Description clearly explains the problem (P2TR missing from pipeline since Nov 2021), the solution scope (KeyDeriver, Matcher, console, storage), and references the linked issue.
Linked Issues check ✅ Passed All coding requirements from issue #67 are implemented: p2tr field added to DerivedKey, P2TR derivation via bitcoin crate, P2tr variant added to AddressType, Matcher::check() extended, addresses() returns 4 items, console outputs updated, storage schemas expanded (19→20 columns).
Out of Scope Changes check ✅ Passed Changes are tightly scoped to P2TR support: derive.rs, matcher.rs, console/multi/storage output modules, and storage schemas. No unrelated refactoring or feature additions detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



codeant-ai Bot commented Mar 13, 2026

Sequence Diagram

This PR extends the Bitcoin pipeline to include Taproot addresses as a first-class type. A derived key now includes a P2TR address, the Matcher checks it against targets, and the outputs and storage persist the new address field.

sequenceDiagram
    participant Scanner
    participant KeyDeriver
    participant Matcher
    participant ConsoleOutput
    participant Storage

    Scanner->>KeyDeriver: Derive key data and all address types
    KeyDeriver-->>Scanner: Return key with p2pkh p2wpkh and p2tr
    Scanner->>Matcher: Check derived addresses against targets
    Matcher-->>Scanner: Match result can be p2tr
    Scanner->>ConsoleOutput: Print key and hit details including p2tr
    Scanner->>Storage: Write record with address p2tr column

Generated by CodeAnt AI

oritwoen self-assigned this on Mar 13, 2026

cubic-dev-ai Bot left a comment

No issues found across 8 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.

Requires human review: This PR modifies the storage schemas for Parquet and Iceberg (adding columns), which is classified as a high-impact change requiring human review.

Architecture diagram
sequenceDiagram
    participant R as Runner / Main
    participant KD as KeyDeriver
    participant M as Matcher
    participant C as ConsoleOutput
    participant S as StorageOutput
    participant P as Parquet/Iceberg Backend

    Note over R,P: Address Derivation and Matching Flow

    R->>KD: derive(private_key)
    KD->>KD: Generate P2PKH & P2WPKH
    KD->>KD: NEW: Derive P2TR (Taproot) via X-only Pubkey
    KD-->>R: Return DerivedKey (includes P2TR)

    R->>M: check(DerivedKey)
    M->>M: Compare targets against P2PKH/P2WPKH
    M->>M: NEW: Compare targets against AddressType::P2tr
    alt Match Found
        M-->>R: Some(MatchInfo)
    else No Match
        M-->>R: None
    end

    opt Verbose Mode or Match Found
        R->>C: print_details(DerivedKey)
        Note right of C: CHANGED: Console now includes<br/>P2TR address in output
    end

    opt Storage Enabled
        R->>S: write(DerivedKey)
        S->>P: append_row(data)
        Note right of P: CHANGED: Schema expanded to 20 columns<br/>to include address_p2tr
    end


codspeed-hq Bot commented Mar 13, 2026

Merging this PR will degrade performance by 24.33%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

❌ 2 regressed benchmarks
✅ 5 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
derive_batch_100 36.4 ms 48.1 ms -24.33%
derive_addresses 419.8 µs 548.2 µs -23.43%

Comparing feat/p2tr-taproot-support (768c8df) with main (5e227e8)

Open in CodSpeed


coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/output/console.rs (1)

65-86: ⚠️ Potential issue | 🟠 Major

Compact CSV still drops Taproot addresses.

This adds derived.p2tr to verbose output, but the non-verbose CSV path still writes only derived.p2pkh_compressed. So to_file() continues to omit bc1p... values entirely, which means Taproot support is still incomplete for the default file-output path. If the CSV has to stay 4 columns, emit one row per derived address instead of one row per key. As per coding guidelines, "Compact CSV format for file output should be: source,transform,privkey_hex,address."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/output/console.rs` around lines 65 - 86, The non-verbose branch still
writes only a single CSV row using derived.p2pkh_compressed, which omits Taproot
(derived.p2tr); change the else path in console.rs (the block that calls
format_compact_csv_row) to emit one compact CSV row per derived address instead
of one row per key: build an ordered list of addresses from the Derived struct
(e.g. derived.p2pkh_compressed, derived.p2pkh_uncompressed, derived.p2wpkh,
derived.p2tr) and loop over them, calling format_compact_csv_row(source,
transform, &derived.private_key_hex, &address) and writeln! each result so the
compact file output follows "source,transform,privkey_hex,address" and includes
Taproot rows.
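
A self-contained sketch of the per-address loop the prompt describes; the DerivedKey fields and the row format are assumptions taken from the review text, not the verified console.rs code:

    use std::io::Write;

    struct DerivedKey {
        private_key_hex: String,
        p2pkh_compressed: String,
        p2pkh_uncompressed: String,
        p2wpkh: String,
        p2tr: String,
    }

    fn format_compact_csv_row(source: &str, transform: &str, privkey_hex: &str, address: &str) -> String {
        format!("{source},{transform},{privkey_hex},{address}")
    }

    // One compact CSV row per derived address instead of one row per key,
    // so bc1p... values reach the compact file output too.
    fn write_compact_rows<W: Write>(
        out: &mut W,
        source: &str,
        transform: &str,
        derived: &DerivedKey,
    ) -> std::io::Result<()> {
        for address in [
            &derived.p2pkh_compressed,
            &derived.p2pkh_uncompressed,
            &derived.p2wpkh,
            &derived.p2tr,
        ] {
            writeln!(
                out,
                "{}",
                format_compact_csv_row(source, transform, &derived.private_key_hex, address)
            )?;
        }
        Ok(())
    }
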
🧹 Nitpick comments (2)
src/storage/parquet_backend.rs (1)

582-585: This still doesn't prove the new column is populated.

This only checks that the batch grew to 20 columns. addresses1 never includes a p2tr entry and nothing reads column 17 back, so a typo between src/output/storage.rs and src/storage/schema.rs would leave address_p2tr null everywhere and CI would still pass. Add one round-trip assertion for a non-null Taproot value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/parquet_backend.rs` around lines 582 - 585, The test currently
only checks column count but not that the new Taproot column is populated;
modify the test after obtaining `batch = &batches[0]` to read the `address_p2tr`
column (either by name "address_p2tr" or index 17) and assert that at least one
row contains a non-null, expected Taproot value (for example compare the
recorded value to the known Taproot from `addresses1` or check
`!array.is_null(row_idx)`/matches expected bytes); use the Arrow/Parquet API
calls you already use elsewhere in the test to fetch the column array and
validate its contents so CI will fail if the column is left null everywhere.
src/derive.rs (1)

216-221: Pin Taproot derivation to an exact fixture.

starts_with("bc1p") is too loose here. A wrong tweak or wrong x-only key still gives you a valid-looking Taproot string, so this test stays green while derivation is wrong. Use a fixed key and assert the full expected p2tr address.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/derive.rs` around lines 216 - 221, The Taproot assertion is too loose:
replace the starts_with("bc1p") check so the test pins Taproot output to a fixed
fixture; compute or hardcode the exact expected p2tr address and assert equality
against addrs[3] (the result of derived.addresses()) instead of a prefix check,
ensuring the test fails if the tweak/x-only key is wrong; keep the existing
checks for addrs[0..2] but change the P2TR check to assert_eq!(addrs[3],
"<expected_p2tr_address>") using the known fixture value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/storage/iceberg/schema.rs`:
- Around line 117-122: The new NestedField::optional(fields::ADDRESS_P2TR, ...)
block is currently inserted before the existing wif fields and therefore shifts
their generated IDs (affecting wif_compressed and wif_uncompressed); move the
ADDRESS_P2TR field block so it is appended after the wif_compressed and
wif_uncompressed field declarations (i.e., place the
NestedField::optional(next_id(), fields::ADDRESS_P2TR,
Type::Primitive(PrimitiveType::String)).into() after the existing wif_* entries)
to ensure existing field IDs remain unchanged and the new field gets a fresh ID.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c9c30707-e47e-4647-acf5-6b8c443ce033

📥 Commits

Reviewing files that changed from the base of the PR and between 5e227e8 and 768c8df.

📒 Files selected for processing (8)
  • src/derive.rs
  • src/matcher.rs
  • src/output/console.rs
  • src/output/multi.rs
  • src/output/storage.rs
  • src/storage/iceberg/schema.rs
  • src/storage/parquet_backend.rs
  • src/storage/schema.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: benchmarks
🧰 Additional context used
📓 Path-based instructions (8)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Order imports as: external crates → std → blank line → super:: → blank line → crate::
Prefer ? operator over .unwrap() in new code
Use PascalCase for types and structs
Use snake_case for function and method names
Use SCREAMING_SNAKE_CASE for constants
Use snake_case for file and module names

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/derive.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
  • src/matcher.rs
src/{derive,matcher,network,benchmark,provider,transform,analyze,source,output,gpu,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Implement custom error enums with Display and Error trait implementations for domain modules

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
src/{transform,output,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Use builder pattern for configuration (e.g., ElectrumTransform::new().with_change(), ParquetBackend::new().with_compression())

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Use indicatif::ProgressBar for long-running operations
Place inline tests in #[cfg(test)] mod tests at the end of each file, using standard assert! and assert_eq! macros
Use tempfile crate for file I/O tests

Files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/derive.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
  • src/matcher.rs
src/output/**/*.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

src/output/**/*.rs: Use escape_csv_field() from mod.rs for any field written to CSV in the output module
All outputs must be Send + Sync (use interior mutability where needed) for thread safety
Output trait must implement three methods: key(), hit(), and flush() with specified signatures

Files:

  • src/output/storage.rs
  • src/output/multi.rs
  • src/output/console.rs
src/output/*.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

Create new output formats by implementing the Output trait in src/output/{name}.rs and registering with mod {name}; and pub use in mod.rs

Files:

  • src/output/storage.rs
  • src/output/multi.rs
  • src/output/console.rs
src/output/multi.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

Multi-output behavior is implemented by dispatching to Vec<Box<dyn Output>> in the multi-output dispatcher

Files:

  • src/output/multi.rs
src/output/console.rs

📄 CodeRabbit inference engine (src/output/AGENTS.md)

src/output/console.rs: Compact CSV format for file output should be: source,transform,privkey_hex,address
Use verbose YAML-like multi-line format for console stdout output

Files:

  • src/output/console.rs
🧠 Learnings (13)
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update Arrow schema definition in `schema.rs`, including `result_schema()` function and field constants

Applied to files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/output/storage.rs
  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Map variable-length fields to fixed columns in schema for SQL-friendly querying

Applied to files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-13T13:20:58.369Z
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 76
File: src/source/wordlist.rs:134-143
Timestamp: 2026-03-13T13:20:58.369Z
Learning: Do not flag .ok() calls on output.hit() / output.key() inside Rayon par_chunks().for_each() closures in Rust files (e.g., src/source/wordlist.rs) as bugs. Rayon for_each requires FnMut() -> (), which cannot propagate Result with ?. This is a known limitation discussed in issue `#77`. Treat these .ok() usages in this pattern as intentional and not regressions; focus review on actual logic or error handling outside this specific closure pattern.

Applied to files:

  • src/storage/iceberg/schema.rs
  • src/storage/schema.rs
  • src/derive.rs
  • src/output/storage.rs
  • src/output/multi.rs
  • src/storage/parquet_backend.rs
  • src/output/console.rs
  • src/matcher.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update `records_to_batch()` conversion when adding new columns to the schema

Applied to files:

  • src/storage/schema.rs
  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Define record structs in `mod.rs` when modifying the record structure

Applied to files:

  • src/storage/schema.rs
  • src/output/storage.rs
📚 Learning: 2026-03-12T18:25:01.029Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/output/AGENTS.md:0-0
Timestamp: 2026-03-12T18:25:01.029Z
Learning: Applies to src/output/console.rs : Compact CSV format for file output should be: `source,transform,privkey_hex,address`

Applied to files:

  • src/storage/schema.rs
  • src/output/storage.rs
  • src/output/console.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Use `&'a str` and `&'a [T]` references in record structs for zero-copy efficiency

Applied to files:

  • src/storage/schema.rs
  • src/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/*.rs : Implement `StorageBackend` trait in a new `{name}.rs` file when adding a new storage backend

Applied to files:

  • src/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Add error variants to the `StorageError` enum in `mod.rs`

Applied to files:

  • src/output/storage.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/parquet_backend.rs : Implement chunk rotation logic in `should_rotate()` and `rotate_chunk()` methods in `parquet_backend.rs`

Applied to files:

  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/parquet_backend.rs : Configure ParquetBackend with appropriate compression, chunk_records, and chunk_bytes thresholds

Applied to files:

  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/parquet_backend.rs : Use Hive-style partitioning format `transform={name}/date={YYYY-MM-DD}/chunk_{NNNN}.parquet` for ParquetBackend file organization

Applied to files:

  • src/storage/parquet_backend.rs
📚 Learning: 2026-03-12T18:25:01.029Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/output/AGENTS.md:0-0
Timestamp: 2026-03-12T18:25:01.029Z
Learning: Applies to src/output/console.rs : Use verbose YAML-like multi-line format for console stdout output

Applied to files:

  • src/output/console.rs
🧬 Code graph analysis (1)
src/matcher.rs (1)
src/output/console.rs (2)
  • key (62-91)
  • new (21-26)

Comment thread src/storage/iceberg/schema.rs
Iceberg identifies columns by field ID, not position. Inserting
P2TR before the WIF fields would shift their IDs and break reads
on existing tables. Append it at the end so WIF keeps IDs 18/19
and P2TR gets the fresh ID 20.
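
A sketch of the resulting field ordering, reusing the NestedField::optional call from the review prompt; the literal IDs and field-name strings are illustrative rather than the module's actual constants:

    use iceberg::spec::{NestedField, PrimitiveType, Type};
    use std::sync::Arc;

    // Tail of the schema field list: the new column is appended after the WIF
    // fields so their IDs (18, 19) stay stable and address_p2tr gets the fresh ID 20.
    fn trailing_fields() -> Vec<Arc<NestedField>> {
        vec![
            NestedField::optional(18, "wif_compressed", Type::Primitive(PrimitiveType::String)).into(),
            NestedField::optional(19, "wif_uncompressed", Type::Primitive(PrimitiveType::String)).into(),
            NestedField::optional(20, "address_p2tr", Type::Primitive(PrimitiveType::String)).into(),
        ]
    }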

coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/storage/iceberg/schema.rs (1)

160-160: Line 160 check is too weak to guard schema-ID compatibility.

len() == 20 still passes if someone later reorders fields and shifts IDs. Add an explicit regression test for wif_compressed=18, wif_uncompressed=19, and address_p2tr=20 (and include address_p2tr in the field-name test), so the compatibility intent is enforced by tests.

Suggested test hardening
@@
     fn schema_has_correct_field_names() {
         let schema = build_iceberg_schema().unwrap();
         let struct_type = schema.as_struct();

         assert!(struct_type.field_by_name("source").is_some());
         assert!(struct_type.field_by_name("transform").is_some());
         assert!(struct_type.field_by_name("timestamp").is_some());
         assert!(struct_type.field_by_name("private_key_raw").is_some());
         assert!(struct_type.field_by_name("address_p2wpkh").is_some());
+        assert!(struct_type.field_by_name("address_p2tr").is_some());
     }
@@
     fn field_ids_are_sequential() {
         let schema = build_iceberg_schema().unwrap();
         let fields = schema.as_struct().fields();
         let ids: Vec<i32> = fields.iter().map(|f| f.id).collect();

         for (i, id) in ids.iter().enumerate() {
             assert_eq!(*id, (i + 1) as i32);
         }
     }
+
+    #[test]
+    fn p2tr_append_preserves_wif_field_ids() {
+        let schema = build_iceberg_schema().unwrap();
+        let struct_type = schema.as_struct();
+        assert_eq!(struct_type.field_by_name("wif_compressed").unwrap().id, 18);
+        assert_eq!(struct_type.field_by_name("wif_uncompressed").unwrap().id, 19);
+        assert_eq!(struct_type.field_by_name("address_p2tr").unwrap().id, 20);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/iceberg/schema.rs` at line 160, Replace the weak length-only
assertion in the schema test with explicit checks that guard field ordering/IDs:
assert that schema.as_struct().fields() contains fields with names
"wif_compressed", "wif_uncompressed", and "address_p2tr" at the exact positions
corresponding to IDs 18, 19, and 20 respectively (i.e., indexes 17, 18, 19), and
include "address_p2tr" in the existing field-name assertions; update or add a
regression test in schema.rs to enforce these exact position/name mappings so
reordering cannot silently break schema-ID compatibility.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 42769662-1b5b-4009-a8d4-eea964920016

📥 Commits

Reviewing files that changed from the base of the PR and between 768c8df and 1e8910a.

📒 Files selected for processing (1)
  • src/storage/iceberg/schema.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Order imports as: external crates → std → blank line → super:: → blank line → crate::
Prefer ? operator over .unwrap() in new code
Use PascalCase for types and structs
Use snake_case for function and method names
Use SCREAMING_SNAKE_CASE for constants
Use snake_case for file and module names

Files:

  • src/storage/iceberg/schema.rs
src/{derive,matcher,network,benchmark,provider,transform,analyze,source,output,gpu,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Implement custom error enums with Display and Error trait implementations for domain modules

Files:

  • src/storage/iceberg/schema.rs
src/{transform,output,storage}/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

Use builder pattern for configuration (e.g., ElectrumTransform::new().with_change(), ParquetBackend::new().with_compression())

Files:

  • src/storage/iceberg/schema.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Use indicatif::ProgressBar for long-running operations
Place inline tests in #[cfg(test)] mod tests at the end of each file, using standard assert! and assert_eq! macros
Use tempfile crate for file I/O tests

Files:

  • src/storage/iceberg/schema.rs
🧠 Learnings (4)
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Update Arrow schema definition in `schema.rs`, including `result_schema()` function and field constants

Applied to files:

  • src/storage/iceberg/schema.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/mod.rs : Define record structs in `mod.rs` when modifying the record structure

Applied to files:

  • src/storage/iceberg/schema.rs
📚 Learning: 2026-03-02T14:18:34.668Z
Learnt from: CR
Repo: oritwoen/vuke PR: 0
File: src/storage/AGENTS.md:0-0
Timestamp: 2026-03-02T14:18:34.668Z
Learning: Applies to src/storage/storage/schema.rs : Map variable-length fields to fixed columns in schema for SQL-friendly querying

Applied to files:

  • src/storage/iceberg/schema.rs
📚 Learning: 2026-03-13T13:20:58.369Z
Learnt from: oritwoen
Repo: oritwoen/vuke PR: 76
File: src/source/wordlist.rs:134-143
Timestamp: 2026-03-13T13:20:58.369Z
Learning: Do not flag .ok() calls on output.hit() / output.key() inside Rayon par_chunks().for_each() closures in Rust files (e.g., src/source/wordlist.rs) as bugs. Rayon for_each requires FnMut() -> (), which cannot propagate Result with ?. This is a known limitation discussed in issue `#77`. Treat these .ok() usages in this pattern as intentional and not regressions; focus review on actual logic or error handling outside this specific closure pattern.

Applied to files:

  • src/storage/iceberg/schema.rs
🔇 Additional comments (1)
src/storage/iceberg/schema.rs (1)

129-135: Appending address_p2tr after WIF is correct.

This keeps the existing WIF field IDs stable and adds P2TR as a new trailing field, which is the safe evolution path.

oritwoen merged commit bc9b540 into main on Mar 13, 2026
2 checks passed
oritwoen deleted the feat/p2tr-taproot-support branch on March 13, 2026 at 14:55

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Matcher and KeyDeriver don't support P2TR (Taproot) addresses

1 participant