Add memory-aware two-level MC chunk planner (rev2) and grouped-save parfor wrapper (rev9) #5

Open
nicklasorte wants to merge 1 commit into main from codex/refactor-matlab-chunk-sizing-and-persistence-strategy

@nicklasorte (Owner)

Motivation

  • Separate compute-chunk sizing from save-file grouping so that very large Monte Carlo runs stay within per-worker memory while placing less load on the filesystem.
  • Replace the previous one-file-per-compute-chunk pattern, which slowed down once the number of saved chunks grew beyond ~200.
  • Preserve restart/recovery behavior and keep the design parfor-safe by writing each grouped save chunk to its own unique output file.

Description

  • Added dynamic_mc_chunks_rev2.m, a two-level planner. It first computes a memory-safe compute_chunk_size from worker_memory_mb, target_memory_utilization, num_bs, and num_sim_azi using a conservative 4x safety multiplier, then groups the resulting compute subchunks into num_save_chunks save chunks, capped by max_saved_chunks (preferred cap: 128).
  • Added parfor_randchunk_aggcheck_rev9_claude.m, which processes one save chunk at a time: it iterates over that chunk's memory-safe compute subchunks via subchunk_agg_check_rev8, accumulates the results in memory, and writes a single grouped .mat file per save chunk. Each write goes to a temporary file that is then moved into place with movefile, with retry logic, to reduce the risk of partial writes and corruption.
  • Added validate_grouped_chunk_strategy_rev1.m, which provides automated integrity checks and an optional numerical A/B path. The A/B path reconstructs the baseline (per-compute-chunk) and grouped-save outputs and verifies size equality, NaN-pattern equality, maximum absolute difference, randomized-order invariance, and restart-skip behavior.
  • Existing .m files are left unchanged. The planner returns a structured chunk_plan (mappings, ranges, randomized save order, and derived sizing information), and parfor safety is enforced by ensuring that no two workers write to the same file concurrently.
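
The two-level planning described above can be sketched roughly as follows. This is an illustrative sketch, not the contents of dynamic_mc_chunks_rev2.m: the per-iteration memory model (one num_bs-by-num_sim_azi array of doubles), the num_mc input, the example values, and the chunk_plan field names are all assumptions; only the parameter names, the 4x multiplier, and the cap of 128 come from the description.

```matlab
% Hypothetical inputs; only the parameter names come from the PR description.
num_mc = 100000;                   % total Monte Carlo iterations (assumed input)
num_bs = 5000;                     % number of base stations
num_sim_azi = 72;                  % number of simulated azimuths
worker_memory_mb = 2048;           % memory available to each worker
target_memory_utilization = 0.5;   % fraction of worker memory to plan for
max_saved_chunks = 128;            % preferred cap on grouped .mat files
safety_multiplier = 4;             % conservative 4x multiplier

% Level 1: memory-safe compute chunk size.
% ASSUMPTION: one MC iteration holds a num_bs-by-num_sim_azi double array.
bytes_per_iter = num_bs * num_sim_azi * 8 * safety_multiplier;
budget_bytes = worker_memory_mb * 2^20 * target_memory_utilization;
compute_chunk_size = max(1, min(num_mc, floor(budget_bytes / bytes_per_iter)));
num_compute_chunks = ceil(num_mc / compute_chunk_size);

% Level 2: group compute subchunks into at most max_saved_chunks save chunks.
num_save_chunks = min(num_compute_chunks, max_saved_chunks);
edges = round(linspace(0, num_compute_chunks, num_save_chunks + 1));

% Map every compute chunk index to the save chunk that owns it.
save_chunk_of = zeros(num_compute_chunks, 1);
for s = 1:num_save_chunks
    save_chunk_of(edges(s) + 1 : edges(s + 1)) = s;
end

% Bundle the plan; field names are hypothetical.
chunk_plan = struct( ...
    'compute_chunk_size', compute_chunk_size, ...
    'num_compute_chunks', num_compute_chunks, ...
    'num_save_chunks', num_save_chunks, ...
    'save_chunk_of', save_chunk_of, ...
    'save_order', randperm(num_save_chunks));
```

The point of the split is that compute_chunk_size is driven only by memory, while num_save_chunks is driven only by how many files the filesystem should hold, so raising the iteration count grows the number of compute subchunks per file rather than the number of files.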
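
The temp-file-then-movefile write with retries and restart skipping might look something like the sketch below. The file naming, retry budget, backoff, and variable names are assumptions made for illustration, not taken from parfor_randchunk_aggcheck_rev9_claude.m.

```matlab
% Hypothetical names throughout; only the temp-file + movefile + retry idea
% comes from the PR description.
out_dir = tempname;                 % fresh scratch folder for this demo
mkdir(out_dir);
save_idx = 3;                       % index of the save chunk being written
final_file = fullfile(out_dir, sprintf('save_chunk_%04d.mat', save_idx));
grouped_result = magic(4);          % stand-in for the accumulated results
max_retries = 3;                    % assumed retry budget

if exist(final_file, 'file') == 2
    % Restart path: this save chunk was finished by an earlier run.
    fprintf('Skipping save chunk %d (already on disk)\n', save_idx);
else
    saved_ok = false;
    for attempt = 1:max_retries
        tmp_file = [tempname(out_dir) '.mat'];   % unique name per attempt
        try
            save(tmp_file, 'grouped_result');
            [ok, msg] = movefile(tmp_file, final_file, 'f');
            if ~ok
                error('grouped_save:moveFailed', '%s', msg);
            end
            saved_ok = true;
            break
        catch err
            if exist(tmp_file, 'file') == 2
                delete(tmp_file);                % drop any partial temp file
            end
            warning('grouped_save:retry', 'Attempt %d failed: %s', ...
                attempt, err.message);
            pause(attempt);                      % simple linear backoff
        end
    end
    if ~saved_ok
        error('grouped_save:failed', 'Could not write %s', final_file);
    end
end
```

Because the final file name appears only after a complete write, a restart can treat its existence as proof that the save chunk finished. Whether the move itself is atomic depends on the filesystem. Note also that MATLAB does not allow save to be called directly in a parfor body, so in practice logic like this would live in a helper function called from the loop.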

Testing

  • Performed automated static checks by creating and inspecting the new files dynamic_mc_chunks_rev2.m, parfor_randchunk_aggcheck_rev9_claude.m, and validate_grouped_chunk_strategy_rev1.m to verify that the expected interfaces and required fields are present.
  • Attempted a runtime check via octave --version, but no Octave/MATLAB runtime was available in this environment, so no numerical tests were executed.
  • validate_grouped_chunk_strategy_rev1 implements the automated numerical checks (baseline vs. grouped assembly, randomized-order invariance, and restart-skip detection). It can be run in a MATLAB/Octave environment against representative inputs to confirm numerical equivalence and restart behavior.
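
As a rough illustration of the comparisons involved, the sketch below checks size equality, NaN-pattern equality, maximum absolute difference, and order invariance on small placeholder arrays. The data, the tolerance, and the variable names are assumptions; this is not the code in validate_grouped_chunk_strategy_rev1.m.

```matlab
% Placeholder data standing in for the two assembled outputs (assumed).
baseline = [1 2 NaN; 4 5 6];   % per-compute-chunk assembly
grouped  = [1 2 NaN; 4 5 6];   % grouped-save assembly
tol = 0;                       % assumed tolerance; 0 demands exact equality

% Size and NaN-pattern equality.
same_size = isequal(size(baseline), size(grouped));
same_nan = same_size && isequal(isnan(baseline), isnan(grouped));

% Maximum absolute difference over the non-NaN entries.
if same_nan
    mask = ~isnan(baseline);
    d = abs(baseline(mask) - grouped(mask));
    max_abs_diff = max([0; d(:)]);   % leading 0 handles the all-NaN case
else
    max_abs_diff = Inf;
end

% Randomized-order invariance: refill rows by range in shuffled order.
row_ranges = {1, 2};                 % hypothetical rows owned by each save chunk
reassembled = nan(size(grouped));
for k = randperm(numel(row_ranges))
    rows = row_ranges{k};
    reassembled(rows, :) = grouped(rows, :);
end
order_invariant = isequaln(reassembled, grouped);   % isequaln treats NaN == NaN

passed = same_size && same_nan && max_abs_diff <= tol && order_invariant;
```

Comparing NaN patterns separately matters because NaN never equals NaN under ==, so a plain difference would either hide or misreport positions where both outputs are intentionally missing.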
