Skip to content

Add real-input regression validator for subchunk max-azimuth rev9/rev10#7

Open
nicklasorte wants to merge 1 commit intomainfrom
codex/create-input-validation-for-rev9-and-rev10
Open

Add real-input regression validator for subchunk max-azimuth rev9/rev10#7
nicklasorte wants to merge 1 commit intomainfrom
codex/create-input-validation-for-rev9-and-rev10

Conversation

@nicklasorte
Copy link
Copy Markdown
Owner

Motivation

  • Provide a trustworthy regression harness that runs subchunk_agg_check_maxazi_rev9 and subchunk_agg_check_maxazi_rev10 on the exact same production inputs to detect behavioral drift.
  • Validate structural invariants and numeric agreement while measuring runtime to catch performance regressions.
  • Fail closed on any meaningful mismatch so CI or a developer run cannot silently accept regressions.

Description

  • Added validate_subchunk_agg_check_maxazi_rev9_rev10_real.m which exposes the exact production signature and accepts the real inputs used by the existing codebase.
  • The validator calls both versions with identical arguments (via local function handles), captures outputs, and performs structural checks: size match, column-vector shape, NaN pattern match, Inf pattern match, and class compatibility.
  • It performs numeric comparisons including diff_vals, abs_diff, rel_diff, max_abs_diff, max_rel_diff, mean_abs_diff, an exact_match via isequaln, worst-index reporting, and a finite-element mismatch count versus tight tolerances tol_abs = 1e-10 and tol_rel = 1e-10.
  • It measures runtimes using timeit, computes speedup = runtime_rev9 / runtime_rev10, prints a concise --- VALIDATION SUMMARY ---, returns a results struct with clear fields, and raises error('Validation failed: rev10 does not match rev9 within tolerance.') when checks fail.

Testing

  • Verified the presence and signature of the existing implementation by inspecting subchunk_agg_check_maxazi_rev9.m, which succeeded.
  • Created and committed validate_subchunk_agg_check_maxazi_rev9_rev10_real.m to the repository, and confirmed the new file content with nl/sed.
  • Attempted to execute an Octave runtime check but the environment lacked Octave/MATLAB (octave: command not found), so runtime timeit execution was not performed here.
  • To fully exercise the validator run it in a MATLAB or Octave environment with the real production inputs; the validator will print the summary and return the results struct or raise an error on failure.

Codex Task

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 72492f3afc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

on_full_Pr_dBm, clutter_loss, custom_antenna_pattern, sub_point_idx);

out_rev9 = rev9_fun();
out_rev10 = rev10_fun();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard missing rev10 implementation before executing validator

The validator calls rev10_fun() unconditionally, but this repository does not contain a subchunk_agg_check_maxazi_rev10 definition (repo-wide search only finds this reference), so execution fails immediately with an undefined-function error. That prevents the new regression harness from running at all in the current codebase and blocks the intended rev9/rev10 comparison workflow.

Useful? React with 👍 / 👎.

finite_mismatch_count = 0;
end

numeric_match = (max_abs_diff <= tol_abs) || (max_rel_diff <= tol_rel);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Evaluate numeric pass/fail with per-element tolerance logic

The numeric_match check uses global maxima with ||, which can incorrectly fail even when every finite element satisfies the intended (abs<=tol_abs) OR (rel<=tol_rel) rule. For example, one element can pass only relative tolerance while another passes only absolute tolerance; finite_mismatch_count is zero, but both global maxima exceed their thresholds so this expression marks the run as a failure. This can produce false regression failures and make CI results unreliable.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant