Skip to content

Test addmeta#523

Draft
aidanheerdegen wants to merge 9 commits intodev-preindustrial+concentrationsfrom
test-addmeta
Draft

Test addmeta#523
aidanheerdegen wants to merge 9 commits intodev-preindustrial+concentrationsfrom
test-addmeta

Conversation

@aidanheerdegen
Copy link
Copy Markdown
Member

@aidanheerdegen aidanheerdegen commented Mar 2, 2026

1. Summary:

Added postscript changes to run addmeta tool for global metadata to make data conform to ACCESS Output Data Spec.

2. Issues Addressed:

3. Dependencies (e.g. on payu, or model)

This change requires changes to (note pull request(s) where relevant):

  • workflow manager (payu):
  • model deployment (ACCESS-ESM1.6):
  • model component or library dependency:
  • input workflow:
  • post-processing workflow:

4. Ad-hoc Testing

No changes were made to model configuration, so no testing was done to ensure bit repro.

5. CI Testing

  • !test repro has been run

No.

6. Reproducibility

Is this reproducible with the previous commit? (If not, why not?)

  • Yes
  • No - !test repro commit has been run.

7. Performance

Has the model performance (say, throughput of model-years/wall-day) changed?

  • Yes
  • No
  • N/A (if selected, please add a brief explanation why performance testing is not necessary for this PR)

If yes, provide the numbers from your testing. Is the performance better or worse?

8. Manifests

Have you changed the executable, the input files and/or the restart files?

  • Yes
  • No

If yes, have you updated the manifests?

  • Yes
  • No

To update the manifests, run payu setup (in a cloned copy of your feature branch) with reproducibility tests turned off:

manifest:
  reproduce:
    exe: false
    input: false
    restart: false
runlog:
  enable: false

Then commit the newly created manifest files (under manifests/) only to the branch for this PR.

9. Documentation

Is the documentation updated?

  • Yes
  • N/A

Not updated yet.

10. Merge Strategy

  • Merge commit
  • Rebase and merge
  • Squash

@aidanheerdegen aidanheerdegen marked this pull request as draft March 2, 2026 23:07
@aidanheerdegen
Copy link
Copy Markdown
Member Author

The metadata spec should be removed before merging, as it is only for testing.

@access-hive-bot
Copy link
Copy Markdown

This pull request has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/csiro-access-nri-standup-minutes/3789/80

@@ -0,0 +1,4 @@
global:
frequency: "{{ __argdata__.freq }}"
variable_id:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see variable_id extraction from the filename is only implemented for ocean files at the moment?

Maybe put an loud placeholder here so it's less likely to be forgotten?

Suggested change
variable_id:
variable_id: "NOT IMPLEMENTED YET"

Copy link
Copy Markdown
Member Author

@aidanheerdegen aidanheerdegen Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I'd prefer to use a string that was easily machine matchable, e.g. __NOT_IMPLEMENTED__

Waddyareckon?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that current behaviour leaves it out, as they are multivariable files. Do we want this in published output if it isn't relevant?

Copy link
Copy Markdown
Collaborator

@joshuatorrance joshuatorrance Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__NOT_IMPLEMENTED__ is fine by me! We don't want placeholders like this ending up in published output though of course.

Ideally if there are multiple variables per file then variable_id should be a list of those variables - var1,var2,var3. How many different sets of variable lists are there? (i.e. how fiddly would it be to implement a temporary variable_id list while we wait for one-var-per-file?)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah quite fiddly, and would need to be kept updated when there are changes to the files. If we did want to do this it would be easier to specify what isn't a variable, and then automatically generate the list of variables and subtract the the not-variables from it.

If one variable per file is planned then I don't think we should go to a lot of trouble to implement this for multivariable files.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__NOT_IMPLEMENTED__ is fine by me! We don't want placeholders like this ending up in published output though of course.

Now re-reading I'm not sure what to do then. If we add the __NOT_IMPLEMENTED__ placeholder it will be in the file outputs.

Is the intention to make sure we don't forget about it? In which case a commented string might be better.

Comment thread addmeta/ice/dataspec.yaml
@@ -0,0 +1,4 @@
global:
frequency: "{{ __argdata__.freq }}"
variable_id:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see variable_id extraction from the filename is only implemented for ocean files at the moment?

Maybe put an loud placeholder here so it's less likely to be forgotten?

Suggested change
variable_id:
variable_id: "NOT IMPLEMENTED YET"

Comment thread addmeta/ice/addmeta_monthly.args Outdated
Comment thread addmeta/ice/addmeta_daily.args Outdated
Comment thread addmeta/ocean/addmeta_fixed.args Outdated
Comment thread addmeta/dataspec.yaml
Comment thread scripts/NetCDF-conversion/UM_conversion_job.sh Outdated
Comment thread addmeta/ocean/addmeta.args
--datavar=realm=atmos
--datavar=freq=1mon
# Files to process
${PAYU_CURRENT_OUTPUT_DIR}/atmosphere/netCDF/aiihca.p*_mon.nc
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this gets implemented prior to the UM filename changes, we'll need to add addmeta_6hourly.args, addmeta_3hourly.args and addmeta_1hourly.args files for the subdaily output from the production runs, i.e:

aiihca.pj-025101_6hr.nc
aiihca.pi-025104_3hr.nc
aiihca.pc-025110.nc (this is hourly)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh. No I'd rather wait for the changes then we wouldn't need separate files as we could parse out a compliant frequency from the filename.

But

aiihca.pc-025110.nc (this is hourly)

should be aiihca.pc-025110_1hr.nc

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshuatorrance here is some examples of sub-daily output if you wanted to check the units were consistent with the schema.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I found a few examples of these files and they are all using days since 0001-01-01 00:00 as their time units which is consistent with the rest of the data I've seen so far.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be aiihca.pc-025110_1hr.nc

Any progress on this @blimlim?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The um2nc changes for this are currently going through review here: ACCESS-NRI/um2nc-standalone#229

aidanheerdegen and others added 9 commits April 7, 2026 10:11
 - Daily and monthly atmosphere command files with hardcoded realm and frequency
 - Daily and monthly ice command files with hardcoded realm and frequency
 - Fixed and time-vaying ocean command files with inferred frequency and realm logic

Added in validation for checking, included schema data so checks can run in a PBS job
Co-authored-by: joshuatorrance <[email protected]>
Co-authored-by: joshuatorrance <[email protected]>
* Added config files for time & time_bnds/bounds.

* Updated time yamls with renaming.

* Removed portion of comment that not so relevant anymore

* Moved time/time_bnds into dataspec.yaml

* time.yaml can be removed
@aidanheerdegen
Copy link
Copy Markdown
Member Author

I have rebased on to the latest version of dev-preindustrial+concentrations and force pushed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants