Add puzzle 49 to benchmark, track time instead of ops/s by oritwoen · Pull Request #104 · oritwoen/kangaroo

oritwoen · 2026-04-02T16:30:44Z

Performance History tracked ops/s - misleading metric. Higher throughput doesn't mean faster solve, wall-clock time does. Switched to seconds with improvement % based on time reduction. Added 49-bit puzzle case to the suite.

Rebuilt history table with real measurements on AMD RX 6800S across v0.3.0–v0.8.0, added missing v0.5.0. v0.2.0 wouldn't compile - dropped it.

Higher ops/s doesn't mean faster solving — what matters is wall-clock time. Switch Performance History from rate (M/s) to time (seconds) with improvement percentages based on time reduction. Add 49-bit puzzle case to the benchmark suite using puzzle #49's known public key. Rebuild Performance History with real measurements from AMD RX 6800S across all compilable versions (v0.3.0–v0.8.0), adding previously missing v0.5.0 entry.

coderabbitai · 2026-04-02T16:30:58Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 80fb5b02-80e6-421c-b23b-d82bc47f298a

📥 Commits

Reviewing files that changed from the base of the PR and between d0f2167 and cec41d2.

📒 Files selected for processing (1)

src/benchmark.rs

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: cubic · AI code reviewer

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2026-02-20T07:57:41.039Z

Learnt from: CR
Repo: oritwoen/kangaroo PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-20T07:57:41.039Z
Learning: Applies to src/gpu_crypto/context.rs : Detect and warn about software renderer fallbacks (llvmpipe/SwiftShader)

Applied to files:

src/benchmark.rs

🔇 Additional comments (7)

src/benchmark.rs (7)

33-38: 49-bit case looks structurally correct.

Start 0x1000000000000 (2^48) with range_bits: 49 gives search space [2^48, 2^49) which is right for puzzle #49. The pubkey format is valid compressed SEC1.

Worth double-checking the pubkey matches the official puzzle #49 public key if you haven't already.

175-181: Clean metric switch from throughput to time.

Using time_secs directly instead of deriving M/s makes the history tracking more intuitive and aligns with the PR objective.

245-248: Good fix for the legacy preservation issue.

This handles the all-legacy scenario correctly. One edge case: if you somehow had mixed formats (old M/s rows + new seconds rows), the unparseable rows would silently drop. Probably won't happen since you're rebuilding history from scratch, but worth knowing.

303-308: Parsing logic works correctly for new format.

strip_suffix("s") on "15.00s" gives "15.00" which parses fine. Legacy "M/s" fails parse as expected, triggering the preservation guard.

338-342: Improvement calculation is correct for lower-is-better.

For baseline=25s, time=15s: (25-15)/25*100 = 40% → "+40%" meaning "40% faster". Math checks out.

605-636: Good test for legacy preservation.

Covers the exact scenario from the previous review comment - legacy M/s rows preserved when parsing fails.

511-514: Test data properly migrated to time format.

Tests use seconds format and improvement values that align with the new lower-is-better calculation.

Also applies to: 537-540

📝 Walkthrough

Walkthrough

Switched the 48-bit benchmark from reporting throughput (M/s) to execution time (seconds), inverted the improvement formula for lower-is-better semantics, added a 49-bit benchmark case, updated parsing/formatting and tests, and preserved legacy history when parsing fails.

Changes

Cohort / File(s)	Summary
Benchmark docs `BENCHMARKS.md`	Replaced "48-bit Rate (M/s)" column with "48-bit Time (s)" and updated table rows to report seconds and revised improvement percentages (v0.3.0 baseline).
Benchmark implementation & tests `src/benchmark.rs`, `tests/...`	Added 49‑bit `Case`; changed history model from `rate: f64` to `time: f64`; switched updates to use `r.time_secs`; parse/format now strip `"s"`; inverted improvement calculation; added defensive handling to preserve legacy/ unparsable history; updated tests and sample markdown expectations.

Sequence Diagram(s)

(Skipped — change is focused on metric representation/parsing and small control flow; not a multi-component sequential feature.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

feat(benchmark): auto-save results to BENCHMARKS.md #34: Modifies the same benchmarking markdown logic (generate_markdown/update_performance_history/parse_history_rows) and format change from 48-bit rate to time; likely closely related.

Poem

⏱️ Numbers flipped, the table sings,
48s now measured in swings,
49 joined the marching line,
old rows kept if they won't parse fine—
small change, clearer metric signs.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the two main changes: adding puzzle 49 to the benchmark suite and switching the performance metric from throughput (ops/s) to wall-clock time.
Description check	✅ Passed	The description clearly explains why the change matters (throughput is misleading, wall-clock time is what matters) and covers both main objectives: metric change and 49-bit puzzle addition.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/benchmark-49bit-time-history

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch feat/benchmark-49bit-time-history

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/benchmark.rs`:
- Around line 245-247: The code currently strips the trailing "s" and parses
legacy throughput strings (e.g. "25.10 M/s") which can leave malformed suffixes
(like "25.10 M/") leading to parse failures and emptying rows; update the
parsing logic used by upsert_history_row (and the helper that strips/splits
units) to robustly handle legacy formats by detecting parse errors and, on
failure, preserving the original rows instead of returning an empty vec: return
the original rows unchanged (or append the new row without deleting history),
and emit a debug/warn message so callers (e.g. format_history_section) won’t
overwrite the whole section when parsing fails. Ensure the change touches the
functions that perform the strip_suffix("s") and numeric parse so they
short-circuit on parse error and avoid destructive replacement.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: da6ba7c4-1963-4a51-b250-fc57bc5fcb6c

📥 Commits

Reviewing files that changed from the base of the PR and between 6f6e4a6 and d0f2167.

📒 Files selected for processing (2)

BENCHMARKS.md
src/benchmark.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: cubic · AI code reviewer

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2026-02-20T07:57:41.039Z

Learnt from: CR
Repo: oritwoen/kangaroo PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-20T07:57:41.039Z
Learning: Applies to src/gpu_crypto/context.rs : Detect and warn about software renderer fallbacks (llvmpipe/SwiftShader)

Applied to files:

src/benchmark.rs

🔇 Additional comments (5)

src/benchmark.rs (4)

33-38: 49-bit benchmark case is wired correctly

This case is consistent with the existing benchmark flow and should execute without special handling.

175-182: 48-bit history source switched cleanly to wall-clock time

Line 175 through Line 182 now pulls time_secs for the 48-bit row and only updates history when present, which avoids writing partial history data.

331-349: Improvement math matches the new metric direction

Line 338 correctly computes improvement as time reduction vs baseline, so faster runs produce positive percentages.

507-547: Updated test fixtures/assertions are aligned with time-based history

The history fixtures and assertions now validate seconds-format rows and the new expected values.

BENCHMARKS.md (1)

33-41: Performance History table is consistent with the new time-based metric

Header, units, and improvement direction are coherent in this section.

cubic-dev-ai

No issues found across 2 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

_{Auto-approved: Low-impact changes restricted to the benchmarking tool and its documentation. Switching metrics and adding a test case does not affect core business logic or security.}

Architecture diagram

sequenceDiagram
    participant User as CLI (kangaroo)
    participant Runner as Benchmark Runner
    participant Suites as Benchmark Cases
    participant Parser as History Logic
    participant FS as File System (BENCHMARKS.md)

    User->>Runner: --benchmark
    
    loop For each Case in CASES
        Runner->>Suites: Execute puzzle solver
        Note over Suites: Includes NEW: 49-bit puzzle
        Suites-->>Runner: Return CaseResult (time_secs)
    end

    Runner->>FS: Read existing documentation
    FS-->>Runner: Markdown content

    Runner->>Parser: update_performance_history(content, version, time_secs)
    
    Parser->>Parser: parse_history_rows()
    Note right of Parser: CHANGED: Parses "s" suffix<br/>instead of "M/s"

    alt Existing version found
        Parser->>Parser: Update existing row time
    else New version
        Parser->>Parser: Append new HistoryRow
    end

    Parser->>Parser: format_history_section()
    Note right of Parser: CHANGED: Improvement % calculated<br/>via time reduction (lower is better)

    Parser-->>Runner: Updated Markdown content
    Runner->>FS: Write updated BENCHMARKS.md
    Runner-->>User: Display results summary

If existing Performance History has M/s format rows that can't be parsed as seconds, keep the section unchanged instead of silently dropping it.

cubic-dev-ai

0 issues found across 1 file (changes from recent commits).

_{Auto-approved: This is a safe update to benchmarking utilities and documentation. It switches performance metrics to time-based measurements and adds a new test case without affecting core logic.}

coderabbitai Bot added the performance Performance improvements label Apr 2, 2026

coderabbitai Bot requested changes Apr 2, 2026

View reviewed changes

Comment thread src/benchmark.rs

cubic-dev-ai Bot approved these changes Apr 2, 2026

View reviewed changes

fix(bench): preserve legacy M/s history rows on format migration

cec41d2

If existing Performance History has M/s format rows that can't be parsed as seconds, keep the section unchanged instead of silently dropping it.

coderabbitai Bot approved these changes Apr 2, 2026

View reviewed changes

cubic-dev-ai Bot approved these changes Apr 2, 2026

View reviewed changes

oritwoen self-assigned this Apr 2, 2026

oritwoen merged commit e550cbd into main Apr 2, 2026
4 checks passed

oritwoen deleted the feat/benchmark-49bit-time-history branch April 2, 2026 19:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add puzzle 49 to benchmark, track time instead of ops/s#104

Add puzzle 49 to benchmark, track time instead of ops/s#104
oritwoen merged 2 commits intomainfrom
feat/benchmark-49bit-time-history

oritwoen commented Apr 2, 2026

Uh oh!

coderabbitai Bot commented Apr 2, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

oritwoen commented Apr 2, 2026

Uh oh!

coderabbitai Bot commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Apr 2, 2026 •

edited

Loading