Skip to content

Add maximum absolute error metric#21

Merged
treigerm merged 5 commits into
mainfrom
max_error
Apr 14, 2025
Merged

Add maximum absolute error metric#21
treigerm merged 5 commits into
mainfrom
max_error

Conversation

@treigerm
Copy link
Copy Markdown
Member

Adds a maximum absolute error metric, as was raised during #20 .

Copy link
Copy Markdown
Collaborator

@juntyr juntyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also get a max relative error to match? Else it looks good to go from my end :)

y : xr.DataArray
Shape (realization, time, vertical, latitude, longitude)
"""
rel_error = np.abs(x - y) / np.abs(x)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens for x=0? If x=y, we should ensure that the rel_error=0, otherwise it should be inf

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If x = 0 and y = 0, the relative error will evaluate to np.nan and hence be ignored in the max reduction. So it still returns the correct result, unless all entries of x and y are 0. Typing this reasoning out loud makes me realise I should just add an explicit check :D

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use an explicit np.where here, e.g.

np.where(
    x != 0,
    np.abs(x - y) / np.abs(x),
    np.where(
        y == 0,
        0,
        np.inf,
    ),
)

@treigerm
Copy link
Copy Markdown
Member Author

@juntyr I have also added a max relative error bound now. I also noticed that the default behaviour of calling np.mean(x) when x: xr.DataArray is to skip the NaN values (similar for computing the maximum). This is a bit counter-intuitive because if you call np.mean(x) when x: np.NDArray then it does NOT skip the NaN values. So I changed the code so that we explicitly call the xarray aggregation functions with skipna=True to make this behaviour clear. So the code change to the other metrics is not changing the behaviour of the function right now.

In another PR I think we should add an argument to the metric functions about whether they should skip the NaNs or not. It is actually quite useful to skip the NaNs in practice: for the data sources which have NaNs in them and for which the compressors do not strictly follow the NaN patterns (JPEG and SZ3) it gives us an idea what sort of error to expect if we would apply some simple NaN mask pre/post-processing. If we just return NaN for the metrics that information is lost.

Also we do have information about whether the NaN masks match from our error bound tests so we can use the pass/fail information to filter the validity of computed metrics.

@treigerm
Copy link
Copy Markdown
Member Author

But long-term I think it make sense to compute both versions of each metric: one that skips NaNs and one which doesn't.

@juntyr
Copy link
Copy Markdown
Collaborator

juntyr commented Apr 14, 2025

But long-term I think it make sense to compute both versions of each metric: one that skips NaNs and one which doesn't.

I feel like the failure of the error bound metric (or if we want to be more explicit we can always add a "do NaNs match" test just for this) will be enough, seeing metric=NaN never looks good (even if it's the compressor's fault and not the benchmark's)

Copy link
Copy Markdown
Collaborator

@juntyr juntyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@treigerm treigerm merged commit 6512622 into main Apr 14, 2025
3 checks passed
@juntyr juntyr deleted the max_error branch April 14, 2025 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants