Skip to content

Correct a bug that would overwrite the output_coord_ranges dictionary if feature order in datastore.yaml is incorrect.#87

Closed
zweihuehner wants to merge 6 commits intomllam:mainfrom
zweihuehner:#81_coord_ranges
Closed

Correct a bug that would overwrite the output_coord_ranges dictionary if feature order in datastore.yaml is incorrect.#87
zweihuehner wants to merge 6 commits intomllam:mainfrom
zweihuehner:#81_coord_ranges

Conversation

@zweihuehner
Copy link
Copy Markdown
Contributor

@zweihuehner zweihuehner commented Nov 26, 2025

Changes

Rename the output_coord_ranges to output_coord_ranges_tmp to avoid overwriting the original coord ranges. Add inline comment to explain the issue.

Issue Link

Resolves #81

Type of change

🐛 Bug fix

Copy link
Copy Markdown
Member

@leifdenby leifdenby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this! Just one minor suggestion if you think it fits.

If you could update the changelog too then that would be perfect. Thanks for your contribution!

Comment thread mllam_data_prep/create_dataset.py Outdated
Comment on lines +239 to +240
# Use a temporary dict to avoid modifying the original ranges.
# Static features have no time dimension, so they would return an empty dict.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about the suggestion below?

Suggested change
# Use a temporary dict to avoid modifying the original ranges.
# Static features have no time dimension, so they would return an empty dict.
# Use a temporary dict to apply selection on coordinate ranges to avoid
# modifying the original ranges given in the config. This is needed because
# for example static features would not have a time dimension, but that
# shouldn't that doing selection on time shouldn't be applied to the other
# variables.

Copy link
Copy Markdown
Contributor Author

@zweihuehner zweihuehner Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to make it a bit more detailed, however, I am a bit confused about the last sentence, something might have gotten mixed up. Do you mean something like:

This is needed because static features, for example, do not have a time dimension. Hence, the time-based selection returns an empty dictionary, which should not overwrite the selection for the other variables.

I have also updated the changelog under v0.6.1, I hope that is correct.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed because static features, for example, do not have a time dimension. Hence, the time-based selection returns an empty dictionary, which should not overwrite the selection for the other variables.

That's perfect! I also think the changelog text isn't quite right. The issue here is that if a output variable didn't share a dimension and this output variable was processed before others, then the coordination selection for that dimension wouldn't be applied. Is that how you understand it too?

I have also updated the changelog under v0.6.1, I hope that is correct.

Almost, rather than changing what is in the v0.6.1 release the right way would be to create a new section at the top with unreleased changes (like https://github.com/mllam/mllam-data-prep/blob/2f02bac4120541ae2438e6bbfeb66b6466cc957a/CHANGELOG.md), since your fix will be part of the next release which hasn't been created yet :)

zweihuehner and others added 4 commits November 26, 2025 12:03
@zweihuehner
Copy link
Copy Markdown
Contributor Author

@leifdenby I corrected a pre-commit issue, but I am rather unsure why the test seems to fail for Test pip install python 3.9 on ubuntu-latest with zarr >=2,<3

It seems that it is not able to connect to some external .zarr datasets for testing. Is this anything I have influence on ?

@leifdenby
Copy link
Copy Markdown
Member

Superseded by #90

@leifdenby leifdenby closed this Feb 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Problem with output.coord_ranges slicing for coordinates not present in all of state/static/forcing

3 participants