Py vernier aggregations by mo-marqh · Pull Request #223 · MetOffice/Vernier

mo-marqh · 2026-03-04T11:49:25Z

Description

a new class is added to the post-processing vernier python library, to enable aggregation across multiple runs, that are asserted to be the same, but where the members can be also accessed, removed, etc.

Linked issues

Closes # (issue)

related to #221

related to #229

Type of change

New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How has this been tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

New tests have been added
Tests have been modified to accommodate this change

Checklist:

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes, for both debug and optimised builds

andrewcoughtrie · 2026-03-04T14:55:46Z

Something @oakleybrunt mentioned on the previous PR (and @mo-mglover has just mentioned as well) the spelling of calliper should have 2 ls in this repo, we have tried to be consistent with this and so should continue to do so. So if this can be corrected in this PR that would be great.

mo-marqh · 2026-03-05T14:13:32Z

Something @oakleybrunt mentioned on the previous PR (and @mo-mglover has just mentioned as well) the spelling of calliper should have 2 ls in this repo, we have tried to be consistent with this and so should continue to do so. So if this can be corrected in this PR that would be great.

agreed, see
f9ed767
@andrewcoughtrie

mo-marqh · 2026-03-09T13:14:12Z

i have added tests for the new aggregation class

this is likely suitable for review now @EdHone

EdHone · 2026-03-09T16:07:45Z

The technical implementation seems sound to me all round.
In terms of the design, I'm not entirely sure what the new class is offering in terms of additional user functionality that doesn't already exist with the aggregate function (that admittedly isn't a part of the API but could be).

For the case where a user wants to group a set of runs that have been loaded in order to analyse them as one, it's simple to do with a script like this:

from vernier import VernierReader, aggregate

# Load data
run1_timers = VernierReader(Path(run1_file)).load()
run2_timers = VernierReader(Path(run2_file)).load()
run3_timers = VernierReader(Path(run3_file)).load()
run4_timers = VernierReader(Path(run4_file)).load()

# Analyse individual data
analyse_data(run1_timers)
analyse_data(run2_timers)
analyse_data(run3_timers)
analyse_data(run4_timers)

# Analyse aggregated data
aggregated_timers = aggregate[run1_timers, run2_timers, run3_timers, run4_timers]
analyse_data(aggregated_timers)

with the benefit that because the product of the aggregation is just a VernierData object, the same analysis functions can be used.

If i'm misunderstanding the scope of your change and the example I've given above isn't the same as what you're trying to achieve please let me know

EdHone

Comment above ^

mo-marqh · 2026-03-10T16:33:48Z

In terms of the design, I'm not entirely sure what the new class is offering in terms of additional user functionality that doesn't already exist with the aggregate function (that admittedly isn't a part of the API but could be).

I think that the design issue here is important.

I think that having an object which aggregates whilst keeping the identities of the members separate is a valuable and different entity from an aggregation that loses the distinction between input units.

On reflection, I think that the aggregate function as defined would be better to be moved to a method on the VernierData class, to continue to enable collecting data from many single files (& perhaps renamed)

I think that the VernierDataAggregation class provides a richer interface, better supporting an aggregation by assertion, that is that two or more independent runs can be considered together, whilst a VernierData method that support managing data in different files from the same run can be adapted from the current function.

I'll have a look at putting such a change set into a further commit to this PR, for design and implementation consideration.

mo-marqh · 2026-03-11T11:56:20Z

8408b39
implements the suggestion I made in #223 (comment)

this provides a consistent design, with aggregate now being a method on the VernierData class, and the different behaviour of an asserted aggregation being a new VernierDataAggregation class

mo-marqh · 2026-03-11T12:07:45Z

this does leave a further potential confusion, with overuse of the term aggregate, so I have changed the name of the new class to VernierDataCollation
8b80cbd
This serves to disambiguate, and enhances the differential that a collation is used to bring together different pieces of written information so that the similarities and differences can be seen.

An aggregation, in constrast, is something formed by adding together several amounts or things

mo-marqh · 2026-03-11T12:08:42Z

i think this is ready for re-review now, it would be helpful if
#231 could be reviewed and merged first, then this rebased to that, as #231 is a spelling consistency change which confuses the diff within the PR
@EdHone

update: #231 closed, rebased to main

EdHone · 2026-03-13T10:26:28Z

Really happy with these changes, I'll get into a full review soon. I think the potential to enforce consistency of interface between the VernierData and VernierDataCollation is an opportunity we shouldn't pass on, as this would make all user analysis scripts consistent

EdHone

A few spelling issues to fix

documentation/Sphinx/user_guide/build-guide/build-guide.rst

CMakeLists.txt

EdHone

I managed to use GitHub's magic to fix the spelling errors - now approved

github-actions bot added the cla-required The CLA has not yet been signed by the author of this PR - added by GA label Mar 5, 2026

mo-marqh force-pushed the py-vernier-aggregations branch from 3c9bb2a to f9ed767 Compare March 5, 2026 14:14

github-actions bot added cla-signed The CLA has been signed as part of this PR - added by GA and removed cla-required The CLA has not yet been signed by the author of this PR - added by GA labels Mar 5, 2026

mo-marqh requested a review from EdHone March 9, 2026 13:14

mo-marqh mentioned this pull request Mar 9, 2026

[Custom Issue]: post Processing Code Maintenance #229

Open

mo-marqh assigned EdHone Mar 9, 2026

EdHone reviewed Mar 10, 2026

View reviewed changes

mo-marqh force-pushed the py-vernier-aggregations branch from 280d574 to 02acc93 Compare March 11, 2026 14:13

github-actions bot added cla-modified The CLA has been modified as part of this PR - added by GA and removed cla-signed The CLA has been signed as part of this PR - added by GA labels Mar 11, 2026

mo-marqh added 9 commits March 11, 2026 14:25

Data Aggregator for consistent job runs

a35e6b3

consistency and ease of use for aggregation

7c9a0c1

consistent get method

c0dad34

imports

c4559c2

rename caliper to calliper

b210d96

add testing for aggregation class

852b702

move aggregate to be a method on VernierData

8408b39

rename VernierDataAggregation to VernierDataCollation (clarity)

8b80cbd

fix refactor

cad1247

mo-marqh force-pushed the py-vernier-aggregations branch from 02acc93 to cad1247 Compare March 11, 2026 14:26

github-actions bot removed the cla-modified The CLA has been modified as part of this PR - added by GA label Mar 11, 2026

EdHone requested changes Mar 13, 2026

View reviewed changes

documentation/Sphinx/user_guide/build-guide/build-guide.rst Outdated Show resolved Hide resolved

CMakeLists.txt Outdated Show resolved Hide resolved

CMakeLists.txt Outdated Show resolved Hide resolved

EdHone added 3 commits March 13, 2026 10:35

Update documentation/Sphinx/user_guide/build-guide/build-guide.rst

2515df4

Update CMakeLists.txt

a86db54

Update CMakeLists.txt

721aab5

EdHone approved these changes Mar 13, 2026

View reviewed changes

EdHone merged commit 85aed81 into MetOffice:main Mar 13, 2026
12 checks passed

Conversation

mo-marqh commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Linked issues

Type of change

How has this been tested?

Checklist:

Uh oh!

andrewcoughtrie commented Mar 4, 2026

Uh oh!

mo-marqh commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mo-marqh commented Mar 9, 2026

Uh oh!

EdHone commented Mar 9, 2026

Uh oh!

EdHone left a comment

Choose a reason for hiding this comment

Uh oh!

mo-marqh commented Mar 10, 2026

Uh oh!

mo-marqh commented Mar 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mo-marqh commented Mar 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mo-marqh commented Mar 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

EdHone commented Mar 13, 2026

Uh oh!

EdHone left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

EdHone left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

mo-marqh commented Mar 4, 2026 •

edited

Loading

mo-marqh commented Mar 5, 2026 •

edited

Loading

mo-marqh commented Mar 11, 2026 •

edited

Loading

mo-marqh commented Mar 11, 2026 •

edited

Loading

mo-marqh commented Mar 11, 2026 •

edited

Loading