Skip to content

Add rev20 optimization with vectorized Pr helper and validation suite#17

Open
nicklasorte wants to merge 1 commit intomainfrom
claude/optimize-monte-carlo-function-Ibhsg
Open

Add rev20 optimization with vectorized Pr helper and validation suite#17
nicklasorte wants to merge 1 commit intomainfrom
claude/optimize-monte-carlo-function-Ibhsg

Conversation

@nicklasorte
Copy link
Copy Markdown
Owner

Summary

This PR introduces subchunk_agg_check_maxazi_rev20, a performance-optimized branch that replaces the Pr helper function with a vectorized implementation (monte_carlo_Pr_dBm_rev3_app), along with comprehensive profiling and statistical validation tools to ensure correctness against the golden rev11 baseline.

Key Changes

  • New optimized function: subchunk_agg_check_maxazi_rev20.m

    • Semantic equivalent to rev19 but with improved Pr helper (rev2 → rev3)
    • Configurable azimuth chunking (default 128) for memory/throughput tuning
    • Precomputes off-axis gain matrix once instead of per-iteration
    • Pre-generates all MC random numbers upfront for better cache locality
  • Vectorized Pr helper: monte_carlo_Pr_dBm_rev3_app.m

    • Replaces nearestpoint_app bracket search with discretize for speed
    • Fully vectorized per-TX interpolation (proven pattern from clutter_rev5)
    • RNG-free, maintains exact rev2 output contract and units
    • Handles edge cases (NaN, Inf, zero-span interpolation) with rev2 semantics
  • Profiling tool: profile_subchunk_agg_check_maxazi_rev20_real.m

    • Direct timing comparison of rev20 vs rev11 golden baseline on identical inputs
    • Detailed function-level profiling with top-20 contributors by total time
    • Tracks material Pr helper performance drop and identifies new bottlenecks
    • Returns structured results with full profile tables for analysis
  • Statistical validation: validate_subchunk_agg_check_maxazi_rev11_rev20_statistical.m

    • Fail-closed validation: rev20 must stay within thresholds of rev11
    • Separate core (mean, std, min, max, median) and tail (p95, p99) metrics
    • Configurable absolute and relative drift tolerances
    • Multiple timing runs to assess runtime stability

Implementation Details

  • Azimuth chunking in rev20 processes simulation azimuths in configurable blocks to balance memory usage and vectorization efficiency
  • Discretize-based bracket search in rev3 Pr helper eliminates two nearestpoint_app calls per TX per MC iteration
  • Pre-generation of random numbers decouples RNG from inner loops, enabling better parallelization in future revisions
  • Validation thresholds use max(absolute_dB, relative_fraction) to handle both small and large signal magnitudes

https://claude.ai/code/session_018i3rp6VALpD9dvwAoYQmMZ

Vectorize monte_carlo_Pr_dBm_rev2_app into rev3 by replacing two
nearestpoint_app calls with discretize and eliminating the per-TX
scalar loop via sub2ind indexing (same proven pattern as clutter_rev5).
Wire rev3 into subchunk_agg_check_maxazi_rev20 pipeline with
fail-closed statistical validator and profiler against rev11 baseline.

https://claude.ai/code/session_018i3rp6VALpD9dvwAoYQmMZ
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4a69d247d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

base.subchunk_agg_check_maxazi_rev11=summarize_rows(baseline_tbl,match_rows(baseline_tbl,'subchunk_agg_check_maxazi_rev11'),baseline_wall_s);
base.monte_carlo_clutter_rev3_app=summarize_rows(baseline_tbl,match_rows(baseline_tbl,'monte_carlo_clutter_rev3_app'),baseline_wall_s);
base.monte_carlo_Pr_dBm_rev2_app=summarize_rows(baseline_tbl,match_rows(baseline_tbl,'monte_carlo_Pr_dBm_rev2_app'),baseline_wall_s);
base.monte_carlo_super_bs_eirp_dist_valid=summarize_rows(baseline_tbl,match_rows(baseline_tbl,eirp_pattern),baseline_wall_s);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use rev11 EIRP helper when summarizing baseline profile

This baseline summary uses eirp_pattern selected from file existence, so when monte_carlo_super_bs_eirp_dist_rev8.m is present it matches rev8 in the rev11 profiler table even though subchunk_agg_check_maxazi_rev11 calls monte_carlo_super_bs_eirp_dist_rev5 (see subchunk_agg_check_maxazi_rev11.m line 80). In that common setup, base.monte_carlo_super_bs_eirp_dist_valid is recorded as not visible/zero, which silently skews baseline-vs-rev20 profiling outputs and can mislead performance conclusions.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants