Prune redundant flags, variables, and debug code in training code

<html>
<body>
<html><head></head><body><h3>Prune redundant flags, variables, and debug code in <code inline="">train_policy.py</code></h3>
<p><strong>Summary</strong><br>
Several CLI arguments, parameters, and helper constructs are currently duplicated or unused. Cleaning them up will make the script easier to read and maintain, and will reduce the cognitive load for anyone experimenting with new hyper-parameters.</p>
<hr>
<h4>Redundancies &amp; Proposed Actions</h4>

Area | Details | Suggested fix
-- | -- | --
Model-selection flags | --use_lstm is still parsed and passed into PolicyNet, but the real switch is now --architecture. Inside PolicyNet the argument is only used to print a warning. | Remove --use_lstm entirely (both CLI arg and PolicyNet kwarg).
Scheduler toggles | --disable_scheduler is parsed but never checked (the old guard is commented out). | Delete flag, its help text, and any dead checks.
CyclicLR parameters | --step_size_down is parsed but never forwarded to CyclicLR. | Drop the argument or wire it into the scheduler call.
NaN/Inf detection | Script disables Autograd’s built-in anomaly detection and then adds a custom hook that serves the same purpose. | Rely on one mechanism (prefer the custom hook for its field granularity).
Hyper-parameter shadowing | Local copies (alignment_f, anti_spill, etc.) immediately mirror args.* values. | Use args.alignment_f (etc.) directly.
Loss bookkeeping | last_boundary_loss, last_mse_loss, best_mse_loss are all maintained, but only best_mse_loss influences the function’s return value. | Keep best_mse_loss (and maybe last_mse_loss for debugging); drop the others unless they gate logic in the future.
Scheduler creation logic | A scheduler is always instantiated; for "exp" mode this duplicates the manual decay already applied. | Unify LR decay handling: either remove the standalone ExponentialLR or convert the manual decay into scheduler calls.
Parameter defaults vs. hard-coded values | Flags like sigma_scale, initial_action_noise, resolution have defaults that exactly match the literal values passed to HelioEnv. | Consider removing the flags or wiring the CLI values through.
Debug scatter plots | Five back-to-back scatter3d_vectors calls often visualise the same data with different weightings. | Decide which plots are genuinely useful and drop the remainder (or gate them behind a dedicated debug flag).
scheduler_gamma relevance | Only affects "exp_range" mode, but default mode is "triangular2". | Clarify in help text or lazily parse only when needed.


<hr>
<h4>Acceptance Criteria</h4>
<ul>
<li>
<p>All removed flags/variables are gone from the CLI parser, documentation, and downstream code paths.</p>
</li>
<li>
<p>Training behaviour remains unchanged when equivalent functionality is preserved.</p>
</li>
<li>
<p>CI (or a quick training smoke-test) completes without new warnings or errors.</p>
</li>
</ul></body></html>
</body>
</html>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Prune redundant flags, variables, and debug code in training code #12

Prune redundant flags, variables, and debug code in `train_policy.py`

Redundancies & Proposed Actions

Acceptance Criteria

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Area	Details	Suggested fix
Model-selection flags	--use_lstm is still parsed and passed into PolicyNet, but the real switch is now --architecture. Inside PolicyNet the argument is only used to print a warning.	Remove --use_lstm entirely (both CLI arg and PolicyNet kwarg).
Scheduler toggles	--disable_scheduler is parsed but never checked (the old guard is commented out).	Delete flag, its help text, and any dead checks.
CyclicLR parameters	--step_size_down is parsed but never forwarded to CyclicLR.	Drop the argument or wire it into the scheduler call.
NaN/Inf detection	Script disables Autograd’s built-in anomaly detection and then adds a custom hook that serves the same purpose.	Rely on one mechanism (prefer the custom hook for its field granularity).
Hyper-parameter shadowing	Local copies (alignment_f, anti_spill, etc.) immediately mirror args.* values.	Use args.alignment_f (etc.) directly.
Loss bookkeeping	last_boundary_loss, last_mse_loss, best_mse_loss are all maintained, but only best_mse_loss influences the function’s return value.	Keep best_mse_loss (and maybe last_mse_loss for debugging); drop the others unless they gate logic in the future.
Scheduler creation logic	A scheduler is always instantiated; for "exp" mode this duplicates the manual decay already applied.	Unify LR decay handling: either remove the standalone ExponentialLR or convert the manual decay into scheduler calls.
Parameter defaults vs. hard-coded values	Flags like sigma_scale, initial_action_noise, resolution have defaults that exactly match the literal values passed to HelioEnv.	Consider removing the flags or wiring the CLI values through.
Debug scatter plots	Five back-to-back scatter3d_vectors calls often visualise the same data with different weightings.	Decide which plots are genuinely useful and drop the remainder (or gate them behind a dedicated debug flag).
scheduler_gamma relevance	Only affects "exp_range" mode, but default mode is "triangular2".	Clarify in help text or lazily parse only when needed.

Prune redundant flags, variables, and debug code in training code #12

Description

Prune redundant flags, variables, and debug code in train_policy.py

Redundancies & Proposed Actions

Acceptance Criteria

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Prune redundant flags, variables, and debug code in `train_policy.py`