Improve support for adding parent experiment data #212

Open
bouweandela opened this issue Mar 28, 2025 · 0 comments · May be fixed by #214
Several metrics make use of a parent dataset. So far, we have assumed that the parent data is described by the same facets as the child dataset, except for experiment_id, and that it uses the same time dimension. This assumption limits the number of models these metrics work with, and it would be good to relax it.

For CMIP6, the parent dataset is described by the facets:

branch_time_in_child
branch_time_in_parent
parent_activity_id
parent_experiment_id
parent_source_id
parent_time_units
parent_variant_label

Finding matching data

In order to find the matching data, the parent_experiment_id, parent_source_id, and parent_variant_label should be used along with the variable_id, table_id, and grid_label from the child dataset.

To make this work, I will implement:

@frozen
class SelectParentExperiment:
    """
    Include a dataset's parent experiment in the selection
    """

    def apply(self, group: pd.DataFrame, data_catalog: pd.DataFrame) -> pd.DataFrame:
        """
        Include a dataset's parent experiment in the selection

        Not yet implemented
        """
        raise NotImplementedError("This is not implemented yet")  # pragma: no cover
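
As a rough illustration of the matching rule described above, the lookup could be sketched with a pandas merge. This is only a sketch under assumptions, not the planned implementation: the helper name `find_parent_rows`, the column layout of the catalog, and the model names in the example are all hypothetical, and a real catalog would probably also need to handle multiple dataset versions and files per dataset.

```python
import pandas as pd

# Facets taken unchanged from the child dataset (as listed in the issue text).
SHARED_FACETS = ["variable_id", "table_id", "grid_label"]

# Child-side "parent_*" facet -> facet it must equal on the parent dataset.
PARENT_FACETS = {
    "parent_experiment_id": "experiment_id",
    "parent_source_id": "source_id",
    "parent_variant_label": "variant_label",
}


def find_parent_rows(group: pd.DataFrame, data_catalog: pd.DataFrame) -> pd.DataFrame:
    """Return the catalog rows describing the parent of any dataset in ``group``.

    Hypothetical helper: assumes every facet is a column of both data frames.
    """
    # One row per distinct parent/variable combination requested by the children.
    wanted = (
        group[list(PARENT_FACETS) + SHARED_FACETS]
        .drop_duplicates()
        .rename(columns=PARENT_FACETS)
    )
    # The inner join keeps only catalog rows that match a requested parent.
    return data_catalog.merge(wanted, on=list(wanted.columns), how="inner")


# Made-up catalog for illustration; "MODEL-A" and "MODEL-B" are placeholders.
columns = [
    "source_id", "experiment_id", "variant_label",
    "variable_id", "table_id", "grid_label",
    "parent_source_id", "parent_experiment_id", "parent_variant_label",
]
rows = [
    ("MODEL-A", "historical", "r1i1p1f1", "tas", "Amon", "gn", "MODEL-A", "piControl", "r1i1p1f1"),
    ("MODEL-A", "piControl", "r1i1p1f1", "tas", "Amon", "gn", "MODEL-A", "piControl-spinup", "r1i1p1f1"),
    ("MODEL-A", "piControl", "r1i1p1f1", "pr", "Amon", "gn", "MODEL-A", "piControl-spinup", "r1i1p1f1"),
    ("MODEL-B", "piControl", "r1i1p1f1", "tas", "Amon", "gn", "MODEL-B", "piControl-spinup", "r1i1p1f1"),
]
catalog = pd.DataFrame(rows, columns=columns)

child = catalog[catalog["experiment_id"] == "historical"]
parents = find_parent_rows(child, catalog)
print(parents[["source_id", "experiment_id", "variable_id"]])
```

In this example only the MODEL-A piControl `tas` row is selected: the `pr` row differs in variable_id and the MODEL-B row differs in source_id.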

Finding a matching timerange

For temporal subsetting, we also need a function that computes the desired time range in the parent dataset from a time range in the child dataset. To translate time between the parent and child datasets, we need branch_time, time_units, and calendar for both datasets. This will require adding time_units and calendar to the facets we collect for every dataset.
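
The core of such a translation can be sketched on numeric time values. The sketch below is an illustration under stated assumptions, not the planned function: the names `child_to_parent_time` and `parent_timerange` are hypothetical, and it assumes both datasets count time in the same step (for example "days since ...") and use the same calendar. Under those assumptions only the two branch times are needed. Converting the result to calendar dates would additionally require each dataset's time units and calendar, for example with a library such as cftime.

```python
def child_to_parent_time(
    child_time: float,
    branch_time_in_child: float,
    branch_time_in_parent: float,
) -> float:
    """Map a numeric time in the child dataset to the parent's numeric time axis.

    Assumes both axes use the same step (e.g. days) and the same calendar.
    branch_time_in_child is expressed in the child's time units,
    branch_time_in_parent in the parent's time units.
    """
    elapsed_since_branch = child_time - branch_time_in_child
    return branch_time_in_parent + elapsed_since_branch


def parent_timerange(
    child_range: tuple[float, float],
    branch_time_in_child: float,
    branch_time_in_parent: float,
) -> tuple[float, float]:
    """Translate a (start, end) range on the child axis to the parent axis."""
    start, end = child_range
    return (
        child_to_parent_time(start, branch_time_in_child, branch_time_in_parent),
        child_to_parent_time(end, branch_time_in_child, branch_time_in_parent),
    )


# Made-up values: the child branches at its own time 0.0 from parent time 36500.0.
print(parent_timerange((0.0, 3650.0), 0.0, 36500.0))
```

If the two datasets used different unit steps or calendars, the elapsed time would first have to be converted between them, which is why the units and calendar need to be available for every dataset.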
