Ensure time is the first dimension in CMORised variables #221

@rbeucher

Context

In CMIP tables, the dimensions field does not always list time first.
For example:

longitude latitude time

or

longitude latitude plev time

However, when looking at actual CMIP6 datasets, the time dimension is almost always the first dimension in the NetCDF files:

tas(time, lat, lon)
ua(time, plev, lat, lon)
thetao(time, lev, lat, lon)

Why this happens

This convention comes mainly from historical NetCDF practice.

Time is typically defined as the unlimited (record) dimension:

time = UNLIMITED ;

In the NetCDF classic format, the record dimension must be the first (slowest-varying) dimension of any variable that uses it. This makes appending new timesteps efficient and matches how climate models typically write data (one timestep at a time).

As a result, most CMIP6 data follow the storage pattern:

(time, vertical, lat, lon)
(time, lat, lon)

even if the CMIP tables list time last.

Proposal

For CMORised outputs produced by MOPPy, ensure that time is the first dimension whenever it exists, while preserving the order of the remaining dimensions.
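
As a rough illustration, the rule can be stated as a check over each variable's dimension names. The sketch below is plain Python; `is_time_first` and the example `var_dims` mapping are hypothetical names made up for this issue, not existing MOPPy functions, and it assumes the time axis is literally named `time`.

```python
def is_time_first(dims, time_name="time"):
    """Return True if `time_name` is absent from `dims` or is its first entry."""
    dims = tuple(dims)
    return time_name not in dims or dims[0] == time_name


# Hypothetical dimension tuples, in the form xarray reports them via `.dims`.
var_dims = {
    "tas": ("time", "lat", "lon"),
    "ua": ("plev", "time", "lat", "lon"),
    "orog": ("lat", "lon"),  # no time axis, so there is nothing to enforce
}

# Variables that have a time axis which is not the leading dimension.
offenders = [name for name, dims in var_dims.items() if not is_time_first(dims)]
print(offenders)  # ['ua']
```

Variables without a time axis pass the check unchanged, which matches the "whenever it exists" wording above.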

Example implementation

This can be handled easily with xarray before writing the variable:

dims = list(ds[var].dims)

if "time" in dims and dims[0] != "time":
    new_order = ["time"] + [d for d in dims if d != "time"]
    ds[var] = ds[var].transpose(*new_order)

This keeps the relative order of the other axes but guarantees that time is the leading dimension.
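
For reuse across variables, the same logic could be wrapped in a small helper. This is only a sketch: `time_first_order` and `move_time_first` are hypothetical names, not part of MOPPy or xarray, and the time axis is assumed to be named `time`. The helper relies only on the object exposing `.dims` and `.transpose(*dims)`, which `xarray.DataArray` does.

```python
def time_first_order(dims, time_name="time"):
    """Return `dims` as a tuple with `time_name` moved to the front.

    The relative order of the other dimensions is unchanged. If
    `time_name` is not present, the original order is returned.
    """
    dims = tuple(dims)
    if time_name not in dims:
        return dims
    return (time_name,) + tuple(d for d in dims if d != time_name)


def move_time_first(da, time_name="time"):
    """Transpose an xarray.DataArray-like object so that time leads.

    Assumes `da` has `.dims` and `.transpose(*dims)`, as
    xarray.DataArray does. Returns `da` itself when no reordering is needed.
    """
    new_order = time_first_order(da.dims, time_name)
    if new_order == tuple(da.dims):
        return da
    return da.transpose(*new_order)


print(time_first_order(("lon", "lat", "plev", "time")))
# ('time', 'lon', 'lat', 'plev')
```

With a dataset `ds`, this would be used as `ds[var] = move_time_first(ds[var])`. Note that only time moves: a variable stored as `(lon, lat, plev, time)` becomes `(time, lon, lat, plev)`, not `(time, plev, lat, lon)`.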

Motivation

This would align MOPPy output with historical CMIP6 datasets, follow NetCDF best practices, and avoid surprises for downstream tools that expect time to be the leading dimension.
