PMP enso #273
Merged
69 commits
414a83b (lee1043) initial commit for enso codes
b5d8751 (lee1043) rename internal function and generalize variable name
3973278 (lee1043) apply changes from #271
92ede94 (lee1043) update
7c5730f (lee1043) update
06e50ff (lee1043) pre-commit fix
7719d88 (lee1043) pre-commit fix
f08c655 (lee1043) in progress
8ecd3f7 (lee1043) update
afa9c49 (lee1043) Update packages/climate-ref-pmp/src/climate_ref_pmp/diagnostics/enso.py
76c7b97 (lee1043) update
b2aeb1c (lee1043) update
81e60c4 (lee1043) in progress
dbcdade (lee1043) in progress
d6b7491 (lee1043) update
75501a6 (lee1043) add change log
4b689d4 (lee1043) Update environment.yml
0098813 (lee1043) update
beae40e (lee1043) update
78d8d3b (lewisjared) Merge remote-tracking branch 'origin/main' into 223_pmp-enso-2
d11b581 (lewisjared) feat: Rework so that the command is executed
c4dd856 (lee1043) clean up
a49b336 (lee1043) update
5b6885b (lee1043) ruff fix
7511d09 (lee1043) remove enso param file as enso driver does not need it for the curren…
cd116e9 (lee1043) update
5e21ae7 (lee1043) generate landmask for reference per variable basis because it is poss…
f19e52d (lee1043) typo fix
b78f6b3 (lee1043) update
c986f81 (lee1043) update
aec1b48 (lee1043) update
477a4ae (lee1043) add logger lib to the pmp env
f606fc6 (lee1043) update
136edb1 (lee1043) update -- bug fix
665ba8d (lee1043) update -- typo fix
6c6a72c (lee1043) update
b488f32 (lee1043) adjust numpy version limit
b9220c7 (lewisjared) chore: Update lockfile
9166fbc (lee1043) bug fix
c212593 (lee1043) update
14fc030 (lee1043) typo fix
61d937e (lee1043) cmec converter added
09740b2 (lee1043) update cmec converter
b9adbfa (lee1043) bug fix
1d6c427 (lee1043) update
89ebb03 (lee1043) update
242daac (lee1043) clean up
3783587 (lee1043) add ERA-5
fa77ff3 (lee1043) clean up
31209b6 (lewisjared) Merge remote-tracking branch 'origin/main' into 223_pmp-enso-2
907d97e (lewisjared) chore: cleanup dict_datasets
8c7e2df (lewisjared) chore: Add files to obs4REF registry
5a8e54e (lewisjared) Merge branch 'main' into 223_pmp-enso-2
5d1f92d (lewisjared) chore: Skip coverage of driver files
42f1e89 (lewisjared) Merge remote-tracking branch 'origin/main' into 223_pmp-enso-2
1a7f33e (lewisjared) chore: Add areacella and sftlf
2c066c1 (lewisjared) chore: Adding REF_TEST_DATA_DIR for out-of-source sample data
c9637c9 (lee1043) typo fix
757ed71 (lee1043) bug fix -- re-enable landsea mask estimation for obs and models if ne…
a06787a (lee1043) typo fix
24d99aa (lewisjared) Merge remote-tracking branch 'origin/main' into 223_pmp-enso-2
7e7f9ca (lee1043) testing
1362cf3 (lee1043) clean up
e4b8c31 (lee1043) clean up
34c1746 (lewisjared) Merge remote-tracking branch 'origin/main' into 223_pmp-enso-2
59bef7d (lewisjared) chore: Add additional dimensions
4389095 (lewisjared) chore: Add regression outputs
3755dd4 (lewisjared) chore: fix number of obs4ref file
67656d3 (lewisjared) chore: Fix coverage
Changelog entry (new file, 1 addition):

```diff
@@ -0,0 +1 @@
+Implemented PMP ENSO metrics
```
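The PR defines three ENSO metrics collections, each mapped to a fixed set of CMIP6 model variables and rejecting anything else with a `ValueError`. A minimal standalone sketch of that dispatch (mirroring the `ENSO.__init__` validation in the `enso.py` diff later in this PR; not the real class):

```python
# Sketch of the metrics-collection -> model-variables dispatch used by the
# ENSO diagnostic. The mapping values are taken from the PR's enso.py diff.
VALID_COLLECTIONS = {
    "ENSO_perf": ("pr", "ts", "tauu"),
    "ENSO_tel": ("pr", "ts"),
    "ENSO_proc": ("ts", "tauu", "hfls", "hfss", "rlds", "rlus", "rsds", "rsus"),
}


def model_variables(metrics_collection: str) -> tuple[str, ...]:
    """Return the model variables for a collection, or raise like ENSO.__init__."""
    try:
        return VALID_COLLECTIONS[metrics_collection]
    except KeyError:
        raise ValueError(
            f"Unknown metrics collection: {metrics_collection}. "
            "Valid options are: ENSO_perf, ENSO_tel, ENSO_proc"
        ) from None


print(model_variables("ENSO_tel"))  # → ('pr', 'ts')
```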
(Diffs for several other changed files in this pull request failed to load and are not shown.)
packages/climate-ref-pmp/src/climate_ref_pmp/diagnostics/__init__.py (2 additions, 0 deletions)

```diff
@@ -1,9 +1,11 @@
 """PMP diagnostics."""

 from climate_ref_pmp.diagnostics.annual_cycle import AnnualCycle
+from climate_ref_pmp.diagnostics.enso import ENSO
 from climate_ref_pmp.diagnostics.variability_modes import ExtratropicalModesOfVariability

 __all__ = [
+    "ENSO",
     "AnnualCycle",
     "ExtratropicalModesOfVariability",
 ]
```
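The two added lines register the new diagnostic in the package's public API. A standalone sketch (using a hypothetical module name, not the real package) of how `__all__` gates star-imports:

```python
import sys
import types

# Build a throwaway module that mimics the edited __init__.py: __all__ lists
# the public diagnostics; _helper is deliberately left out of __all__.
mod = types.ModuleType("demo_diagnostics")
mod.__dict__.update(
    {
        "__all__": ["ENSO", "AnnualCycle", "ExtratropicalModesOfVariability"],
        "ENSO": type("ENSO", (), {}),
        "AnnualCycle": type("AnnualCycle", (), {}),
        "ExtratropicalModesOfVariability": type("ExtratropicalModesOfVariability", (), {}),
        "_helper": object(),  # private: excluded from star-imports
    }
)
sys.modules["demo_diagnostics"] = mod

ns: dict = {}
exec("from demo_diagnostics import *", ns)
# Only the names in __all__ are exposed; _helper stays private.
print(sorted(k for k in ns if not k.startswith("__")))
# → ['AnnualCycle', 'ENSO', 'ExtratropicalModesOfVariability']
```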
packages/climate-ref-pmp/src/climate_ref_pmp/diagnostics/enso.py (new file, 245 additions, 0 deletions)

```python
import json
import os
from collections.abc import Collection, Iterable
from typing import Any

from loguru import logger

from climate_ref_core.constraints import AddSupplementaryDataset
from climate_ref_core.datasets import DatasetCollection, FacetFilter, SourceDatasetType
from climate_ref_core.diagnostics import (
    CommandLineDiagnostic,
    DataRequirement,
    ExecutionDefinition,
    ExecutionResult,
)
from climate_ref_pmp.pmp_driver import _get_resource, process_json_result


class ENSO(CommandLineDiagnostic):
    """
    Calculate the ENSO performance metrics for a dataset
    """

    facets = ("source_id", "member_id", "grid_label", "experiment_id", "metric", "reference_datasets")

    def __init__(self, metrics_collection: str, experiments: Collection[str] = ("historical",)) -> None:
        self.name = metrics_collection
        self.slug = metrics_collection.lower()
        self.metrics_collection = metrics_collection
        self.parameter_file = "pmp_param_enso.py"
        self.obs_sources: tuple[str, ...]
        self.model_variables: tuple[str, ...]

        if metrics_collection == "ENSO_perf":  # pragma: no cover
            self.model_variables = ("pr", "ts", "tauu")
            self.obs_sources = ("GPCP-Monthly-3-2", "TropFlux-1-0", "HadISST-1-1")
        elif metrics_collection == "ENSO_tel":
            self.model_variables = ("pr", "ts")
            self.obs_sources = ("GPCP-Monthly-3-2", "TropFlux-1-0", "HadISST-1-1")
        elif metrics_collection == "ENSO_proc":
            self.model_variables = ("ts", "tauu", "hfls", "hfss", "rlds", "rlus", "rsds", "rsus")
            self.obs_sources = (
                "GPCP-Monthly-3-2",
                "TropFlux-1-0",
                "HadISST-1-1",
                "CERES-EBAF-4-2",
            )
        else:
            raise ValueError(
                f"Unknown metrics collection: {metrics_collection}. "
                "Valid options are: ENSO_perf, ENSO_tel, ENSO_proc"
            )

        self.data_requirements = self._get_data_requirements(experiments)

    def _get_data_requirements(
        self,
        experiments: Collection[str] = ("historical",),
    ) -> tuple[DataRequirement, DataRequirement]:
        filters = [
            FacetFilter(
                facets={
                    "frequency": "mon",
                    "experiment_id": tuple(experiments),
                    "variable_id": self.model_variables,
                }
            )
        ]

        return (
            DataRequirement(
                source_type=SourceDatasetType.obs4MIPs,
                filters=(
                    FacetFilter(facets={"source_id": self.obs_sources, "variable_id": self.model_variables}),
                ),
                group_by=("activity_id",),
            ),
            DataRequirement(
                source_type=SourceDatasetType.CMIP6,
                filters=tuple(filters),
                group_by=("source_id", "experiment_id", "member_id", "grid_label"),
                constraints=(
                    AddSupplementaryDataset.from_defaults("areacella", SourceDatasetType.CMIP6),
                    AddSupplementaryDataset.from_defaults("sftlf", SourceDatasetType.CMIP6),
                ),
            ),
        )

    def build_cmd(self, definition: ExecutionDefinition) -> Iterable[str]:
        """
        Build the command to run the diagnostic on the given configuration.

        Parameters
        ----------
        definition : ExecutionDefinition
            The configuration to run the diagnostic on.

        Returns
        -------
        :
            The command and arguments used to execute the PMP ENSO driver.
        """
        mc_name = self.metrics_collection

        # ------------------------------------------------
        # Get the input datasets information for the model
        # ------------------------------------------------
        input_datasets = definition.datasets[SourceDatasetType.CMIP6]
        input_selectors = input_datasets.selector_dict()
        source_id = input_selectors["source_id"]
        member_id = input_selectors["member_id"]
        experiment_id = input_selectors["experiment_id"]
        variable_ids = set(input_datasets["variable_id"].unique()) - {"areacella", "sftlf"}
        mod_run = f"{source_id}_{member_id}"

        # We only need one entry for the model run
        dict_mod: dict[str, dict[str, Any]] = {mod_run: {}}

        def extract_variable(dc: DatasetCollection, variable: str) -> list[str]:
            return dc.datasets[input_datasets["variable_id"] == variable]["path"].to_list()  # type: ignore

        # TO DO: Get the path to the files per variable
        for variable in variable_ids:
            list_files = extract_variable(input_datasets, variable)
            list_areacella = extract_variable(input_datasets, "areacella")
            list_sftlf = extract_variable(input_datasets, "sftlf")

            if len(list_files) > 0:
                dict_mod[mod_run][variable] = {
                    "path + filename": list_files,
                    "varname": variable,
                    "path + filename_area": list_areacella,
                    "areaname": "areacella",
                    "path + filename_landmask": list_sftlf,
                    "landmaskname": "sftlf",
                }

        # -------------------------------------------------------
        # Get the input datasets information for the observations
        # -------------------------------------------------------
        reference_dataset = definition.datasets[SourceDatasetType.obs4MIPs]
        reference_dataset_names = reference_dataset["source_id"].unique()

        dict_obs: dict[str, dict[str, Any]] = {}

        # TO DO: Get the path to the files per variable and per source
        for obs_name in reference_dataset_names:
            dict_obs[obs_name] = {}
            for variable in variable_ids:
                # Get the list of files for the current variable and observation source
                list_files = reference_dataset.datasets[
                    (reference_dataset["variable_id"] == variable)
                    & (reference_dataset["source_id"] == obs_name)
                ]["path"].to_list()
                # If the list is not empty, add it to the dictionary
                if len(list_files) > 0:
                    dict_obs[obs_name][variable] = {
                        "path + filename": list_files,
                        "varname": variable,
                    }

        # Assemble the input dataset information for the driver
        dict_datasets = {
            "model": dict_mod,
            "observations": dict_obs,
            "metricsCollection": mc_name,
            "experiment_id": experiment_id,
        }

        # Create JSON file for dictDatasets
        json_file = os.path.join(
            definition.output_directory, f"input_{mc_name}_{source_id}_{experiment_id}_{member_id}.json"
        )
        with open(json_file, "w") as f:
            json.dump(dict_datasets, f, indent=4)
        logger.debug(f"JSON file created: {json_file}")

        driver_file = _get_resource("climate_ref_pmp.drivers", "enso_driver.py", use_resources=True)
        return [
            "python",
            driver_file,
            "--metrics_collection",
            mc_name,
            "--experiment_id",
            experiment_id,
            "--input_json_path",
            json_file,
            "--output_directory",
            str(definition.output_directory),
        ]

    def build_execution_result(self, definition: ExecutionDefinition) -> ExecutionResult:
        """
        Build a diagnostic result from the output of the PMP driver

        Parameters
        ----------
        definition
            Definition of the diagnostic execution

        Returns
        -------
            Result of the diagnostic execution
        """
        input_datasets = definition.datasets[SourceDatasetType.CMIP6]
        source_id = input_datasets["source_id"].unique()[0]
        experiment_id = input_datasets["experiment_id"].unique()[0]
        member_id = input_datasets["member_id"].unique()[0]
        mc_name = self.metrics_collection
        pattern = f"{mc_name}_{source_id}_{experiment_id}_{member_id}"

        # Find the results files
        results_files = list(definition.output_directory.glob(f"{pattern}_cmec.json"))
        logger.debug(f"Results files: {results_files}")

        if len(results_files) != 1:  # pragma: no cover
            logger.warning(f"A single cmec output file not found: {results_files}")
            return ExecutionResult.build_from_failure(definition)

        # Find the other outputs
        png_files = [definition.as_relative_path(f) for f in definition.output_directory.glob("*.png")]
        data_files = [definition.as_relative_path(f) for f in definition.output_directory.glob("*.nc")]

        cmec_output, cmec_metric = process_json_result(results_files[0], png_files, data_files)

        input_selectors = definition.datasets[SourceDatasetType.CMIP6].selector_dict()
        cmec_metric_bundle = cmec_metric.remove_dimensions(
            [
                "model",
                "realization",
            ],
        ).prepend_dimensions(
            {
                "source_id": input_selectors["source_id"],
                "member_id": input_selectors["member_id"],
                "grid_label": input_selectors["grid_label"],
                "experiment_id": input_selectors["experiment_id"],
            }
        )

        return ExecutionResult.build_from_output_bundle(
            definition,
            cmec_output_bundle=cmec_output,
            cmec_metric_bundle=cmec_metric_bundle,
        )
```
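`build_cmd` hands the PMP driver a single JSON file describing every model and observation input. A standalone sketch of the structure it writes (the paths and model/run names below are hypothetical; real ones come from the CMIP6 and obs4MIPs catalogs):

```python
import json

# Hypothetical model run and file paths, illustrating the dict_datasets layout
# that build_cmd serializes to input_<collection>_<source>_<experiment>_<member>.json.
mod_run = "EXAMPLE-MODEL_r1i1p1f1"
dict_datasets = {
    "model": {
        mod_run: {
            "ts": {
                "path + filename": ["/data/ts_Amon_EXAMPLE-MODEL_historical.nc"],
                "varname": "ts",
                "path + filename_area": ["/data/areacella_fx_EXAMPLE-MODEL.nc"],
                "areaname": "areacella",
                "path + filename_landmask": ["/data/sftlf_fx_EXAMPLE-MODEL.nc"],
                "landmaskname": "sftlf",
            },
        },
    },
    "observations": {
        "HadISST-1-1": {
            "ts": {"path + filename": ["/data/ts_mon_HadISST-1-1.nc"], "varname": "ts"},
        },
    },
    "metricsCollection": "ENSO_tel",
    "experiment_id": "historical",
}

# Serialize and read back, as the driver would on its side of the interface.
payload = json.dumps(dict_datasets, indent=4)
round_trip = json.loads(payload)
print(sorted(round_trip))  # → ['experiment_id', 'metricsCollection', 'model', 'observations']
```

Keeping the interface to a single JSON file means the driver can run in a separate PMP conda environment without importing any climate-ref code.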
(The diff for one further changed file failed to load and is not shown.)