Add helper-level rev5 vs rev6 validator, correctness-first monte_carlo_super_bs_eirp_dist_rev7, and rev15 integration #13

Open
nicklasorte wants to merge 1 commit into main from codex/create-helper-comparison-harness-for-validation


Motivation

  • rev14 introduced a faster helper (rev6) that replaced per-row interp1(...,'spline') with custom piecewise-polynomial (PP) evaluation. This produced unacceptable output drift versus rev11, so a correctness-first path and diagnostics are required.
  • Provide a deterministic harness to compare rev5 and rev6 helper behavior on identical inputs to locate semantic divergence points before any further optimization.
  • Restore a safe production path by wiring a rev7 helper that preserves rev5 semantics and integrating it into the subchunk flow as rev15 for end-to-end validation.

Description

  • Added validate_monte_carlo_super_bs_eirp_dist_rev5_rev6.m, a helper-level comparator that reconstructs the exact helper-call inputs and runs both rev5 and rev6. It reports shapes, max/mean absolute and relative diffs, worst-case indices, and clustering near endpoints/breakpoints/clamps, and it fails closed when tight thresholds are exceeded.
  • Implemented monte_carlo_super_bs_eirp_dist_rev7.m as a correctness-first helper that preserves rev5 per-row interp1(...,'spline') semantics, adds shape/length assertions and optional DEBUG_CHECKS, clamps queries, and avoids the manual PP evaluation strategy used in rev6.
  • Implemented subchunk_agg_check_maxazi_rev15.m by cloning the rev11 aggregation/RNG/chunking flow and replacing only the EIRP helper invocation with monte_carlo_super_bs_eirp_dist_rev7, preserving RNG and chunking behavior.
  • Added validate_subchunk_agg_check_maxazi_rev11_rev15_statistical.m to run end-to-end rev11 vs rev15 comparisons on identical inputs, measure runtimes, compute summary and upper-tail metrics (mean/std/min/max/median/p90/p95/p99), and enforce fail-closed thresholds.
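
The comparison and summary metrics described above can be sketched in a few lines. This is a minimal illustration in Python rather than the MATLAB code in this PR, and every name in it (`compare_outputs`, `summarize`, `abs_tol`, `rel_tol`) is hypothetical. The tolerance values and the rule of requiring both the absolute and the relative threshold to hold are assumptions, not the thresholds the validators actually use.

```python
import math


def compare_outputs(ref, new, abs_tol=1e-9, rel_tol=1e-9):
    """Element-wise comparison of two 2-D outputs given as lists of rows.

    Hypothetical sketch: tolerances and the pass rule are assumptions.
    """
    if len(ref) != len(new) or any(len(r) != len(n) for r, n in zip(ref, new)):
        raise ValueError("shape mismatch between reference and candidate output")

    max_abs = 0.0
    max_rel = 0.0
    sum_abs = 0.0
    count = 0
    worst = None  # (row, col) of the largest absolute difference

    for i, (r_row, n_row) in enumerate(zip(ref, new)):
        for j, (r, n) in enumerate(zip(r_row, n_row)):
            if not (math.isfinite(r) and math.isfinite(n)):
                raise ValueError(f"non-finite value at ({i}, {j})")
            d = abs(r - n)
            # Relative diff is taken against the reference value.
            rel = d / abs(r) if r != 0 else (0.0 if d == 0 else math.inf)
            if worst is None or d > max_abs:
                max_abs, worst = d, (i, j)
            max_rel = max(max_rel, rel)
            sum_abs += d
            count += 1

    # Fail closed: an empty comparison never passes.
    passed = count > 0 and max_abs <= abs_tol and max_rel <= rel_tol
    return {
        "shape": (len(ref), len(ref[0]) if ref else 0),
        "max_abs": max_abs,
        "mean_abs": sum_abs / count if count else math.nan,
        "max_rel": max_rel,
        "worst_index": worst,
        "passed": passed,
    }


def percentile(sorted_vals, p):
    """Linear-interpolation percentile of pre-sorted data, p in [0, 100]."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def summarize(values):
    """Summary and upper-tail metrics for one end-to-end output vector."""
    s = sorted(values)
    n = len(s)
    mean = sum(s) / n
    # Sample standard deviation (n - 1 in the denominator).
    std = math.sqrt(sum((v - mean) ** 2 for v in s) / (n - 1)) if n > 1 else 0.0
    return {
        "mean": mean, "std": std, "min": s[0], "max": s[-1],
        "median": percentile(s, 50), "p90": percentile(s, 90),
        "p95": percentile(s, 95), "p99": percentile(s, 99),
    }
```

In this sketch, a caller would run `compare_outputs` on the two helper outputs and stop if `passed` is false, then run `summarize` on each end-to-end result and compare the two dictionaries field by field. The percentile interpolation used here may differ from what MATLAB or Octave computes, so tail values from this sketch should not be compared directly against numbers produced by the real validators.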

Testing

  • Basic repository checks and file inspections confirm that the four new files are present and viewable in the tree (their creation and content were printed for review). Result: succeeded.
  • Static consistency: the new validators use must_exist(...) guards and return structured result objects for automated gating; these functions are in place for runtime validation. Result: code paths added (no runtime execution).
  • MATLAB/Octave runtime validation could not be executed in this environment because neither Octave nor MATLAB is available, so the helper-level and end-to-end numeric tests were not run here. The validation scripts are written to be run in a MATLAB/Octave environment and will fail closed if drift thresholds are exceeded. Result: runtime check not executed (environment limitation).
