Skip to content

Improve baseline handling when regenerating baselines#4952

Open
jgfouca wants to merge 4 commits intomasterfrom
jgfouca/imprv_baseline_handling
Open

Improve baseline handling when regenerating baselines#4952
jgfouca wants to merge 4 commits intomasterfrom
jgfouca/imprv_baseline_handling

Conversation

@jgfouca
Copy link
Copy Markdown
Contributor

@jgfouca jgfouca commented Mar 25, 2026

When regenerating baselines, we should never remove the existing
baselines until we know the current run has passed. The idea here
is that we never want to be in a situation where we've removed
existing baselines, but the case being run to regenerate them has
FAILed, leaving us with no baselines now.

The approach is, instead of cleaning baselines at the beginning of
TestScheduler, just do the cleaning of existing baselines at the point
just before new baselines are being copied over.

Checklist

  • My code follows the style guidelines of this project (black formatting)
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • I have added tests that exercise my feature/fix and existing tests continue to pass
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding additions and changes to the documentation

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes baseline regeneration behavior so existing baselines are not removed up-front (in TestScheduler) and are instead cleaned immediately before copying newly generated baseline outputs, preventing loss of baselines when a regeneration run fails.

Changes:

  • Stop deleting existing baselines early in TestScheduler when --allow-baseline-overwrite is used.
  • Add pre-generation cleanup in hist_utils._generate_baseline_impl to remove stale baseline artifacts while preserving namelist-related files.
  • Adjust namelist baseline generation to remove all prior user_nl* files in the baseline directory before copying new ones.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
CIME/test_scheduler.py Removes early baseline directory clearing during scheduler initialization.
CIME/hist_utils.py Adds “stale baseline file” cleanup just before generating/copying baseline outputs.
CIME/case/case_cmpgen_namelists.py Cleans old user_nl* files from baseline dir before copying new ones.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +655 to +666
# Remove stale baseline files from a previous run so they don't linger when the
# new run no longer produces them. CaseDocs and user_nl* are managed by the
# namelist-generation phase (which already ran), so we preserve those.
if os.path.isdir(basegen_dir):
for item in os.listdir(basegen_dir):
if item == "CaseDocs" or item.startswith("user_nl"):
continue
item_path = os.path.join(basegen_dir, item)
if os.path.isfile(item_path):
os.remove(item_path)
elif os.path.isdir(item_path):
shutil.rmtree(item_path)
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new pre-generation cleanup deletes everything in the baseline dir except CaseDocs and user_nl*. This will also delete the baseline TestStatus file (TEST_STATUS_FILENAME), which is written earlier by generate_teststatus when baseline_store_teststatus is enabled, so that file will be lost after baseline generation. Consider excluding TEST_STATUS_FILENAME (and any other baseline metadata you intend to preserve) from this cleanup list so storing TestStatus continues to work.

Copilot uses AI. Check for mistakes.
Comment on lines +655 to +666
# Remove stale baseline files from a previous run so they don't linger when the
# new run no longer produces them. CaseDocs and user_nl* are managed by the
# namelist-generation phase (which already ran), so we preserve those.
if os.path.isdir(basegen_dir):
for item in os.listdir(basegen_dir):
if item == "CaseDocs" or item.startswith("user_nl"):
continue
item_path = os.path.join(basegen_dir, item)
if os.path.isfile(item_path):
os.remove(item_path)
elif os.path.isdir(item_path):
shutil.rmtree(item_path)
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cleanup loop will also remove the existing bless_log file (BLESS_LOG_NAME) before the code appends the new entry later in this function, which defeats the use of append mode and discards the existing bless history. If retaining bless history is desired, skip BLESS_LOG_NAME during the stale-file removal step.

Copilot uses AI. Check for mistakes.
Comment on lines +655 to +666
# Remove stale baseline files from a previous run so they don't linger when the
# new run no longer produces them. CaseDocs and user_nl* are managed by the
# namelist-generation phase (which already ran), so we preserve those.
if os.path.isdir(basegen_dir):
for item in os.listdir(basegen_dir):
if item == "CaseDocs" or item.startswith("user_nl"):
continue
item_path = os.path.join(basegen_dir, item)
if os.path.isfile(item_path):
os.remove(item_path)
elif os.path.isdir(item_path):
shutil.rmtree(item_path)
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is existing unit coverage for baseline generation (e.g., SystemTestsCommon._generate_baseline), but it doesn’t exercise the new “remove stale baseline files” behavior. Adding a test that pre-populates the baseline dir (including TestStatus / bless_log and an extra stale file) would help ensure the cleanup removes only what it should and preserves expected metadata.

Copilot uses AI. Check for mistakes.
clear_folder(os.path.join(test_baseline, "CaseDocs"))
else:
clear_folder(test_baseline)
pass
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if allow_baseline_overwrite and run_count == 0: block is now a no-op (pass). Consider removing the conditional entirely or replacing it with a short comment explaining that baseline cleanup is intentionally deferred to the baseline-generation step, to avoid future confusion about whether something is missing here.

Suggested change
pass
# Intentionally do not skip tests with existing baselines when
# baseline overwrite is enabled; baseline cleanup is handled
# during the baseline-generation step.
continue

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the CIME knowledge to comment on the correctness of the impl (copilot prob has that covered anyways). But the feature is very welcome. Thanks Jim!

Copy link
Copy Markdown
Collaborator

@jasonb5 jasonb5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs some tests to ensure expected behavior.

@jgfouca
Copy link
Copy Markdown
Contributor Author

jgfouca commented Mar 26, 2026

@jasonb5 , done.

jgfouca added 4 commits March 27, 2026 14:31
When regenerating baselines, we should never remove the existing
baselines until we know the current run has passed. The idea here
is that we never want to be in a situation where we've removed
existing baselines, but the case being run to regenerate them has
FAILed, leaving us with no baselines now.

The approach is, instead of cleaning baselines at the beginning of
TestScheduler, just do the cleaning of existing baselines at the point
just before new baselines are being copied over.
@jgfouca jgfouca force-pushed the jgfouca/imprv_baseline_handling branch from 130faaa to 5507f30 Compare March 27, 2026 20:31
@jgfouca jgfouca requested a review from jasonb5 March 27, 2026 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants