Skip to content

Post-processing: Store rank and thread information in a single array#235

Merged
Ed Hone (EdHone) merged 8 commits intoMetOffice:mainfrom
EdHone:py-vernier-threads
Mar 24, 2026
Merged

Post-processing: Store rank and thread information in a single array#235
Ed Hone (EdHone) merged 8 commits intoMetOffice:mainfrom
EdHone:py-vernier-threads

Conversation

@EdHone
Copy link
Collaborator

@EdHone Ed Hone (EdHone) commented Mar 23, 2026

Description

This PR unifies the representation for multi-threaded callipers. This is achieved with the same single-data-list approach as before, but the rank and thread information is also stored in the calliper. The get method has been extended to accept the thread and rank IDs as arguments and should now become the primary (and only) way to get subsets of data from the VernierData and VernierDataCollation objects, i.e.:

timers = VernierReader(`vernier-output').load()

#Get data for a calliper
alg_timers = timers.get('algorithm')

# Get data for single rank/thread
alg_timers_thread_0 = timers.get('algorithm', thread=0)
alg_timers_rank_1 = timers.get('algorithm', rank=1)

A future issue should be opened to make the data attribute of the VernierData class 'private' (i.e. rename to _data).

Linked issues

Closes #232
Closes #228

Type of change

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How has this been tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • New tests have been added
  • Tests have been modified to accommodate this change

Checklist:

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes, for both debug and optimised builds

@github-actions github-actions bot added the cla-signed The CLA has been signed as part of this PR - added by GA label Mar 23, 2026
@EdHone Ed Hone (EdHone) marked this pull request as ready for review March 23, 2026 09:39
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ed, a couple notes regarding documentation but the core functionality looks robust to me. A good change :)

sline = line.split()
if len(sline) > 0: # Line contains data
if sline[0].isdigit(): # Calliper lines start with a digit
if "Task" in sline:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably change this to something that definitely can't be in a calliper line. Someone could have a calliper called "Task" and that would then break this line.

Suggested change
if "Task" in sline:
if "MPI rank ID" in sline:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesnt quite work as you suggested as the line is already split at this point - I've changed it to if sline[0] == "Task" which should capture the condition more robustly and rule out the possibility of the case you suggested above

Comment on lines +75 to +81
while True:
try:
start = self.rank.index(rank, start)
results.append(start)
start += 1
except ValueError:
return results

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use some comments, it's not the easiest to understand.

Suggested change
while True:
try:
start = self.rank.index(rank, start)
results.append(start)
start += 1
except ValueError:
return results
while True:
try:
# Get the index for the first occurrence of `rank` after `start`
start = self.rank.index(rank, start)
# Add the index to the list
results.append(start)
# Move our start position to the index after the rank just found
start += 1
except ValueError:
return results

Comment on lines +96 to +102
while True:
try:
start = self.thread.index(thread, start)
results.append(start)
start += 1
except ValueError:
return results

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, a couple comments would help, e.g.:

Suggested change
while True:
try:
start = self.thread.index(thread, start)
results.append(start)
start += 1
except ValueError:
return results
while True:
try:
# Get the index for the first occurrence of `thread` after `start`
start = self.thread.index(thread, start)
# Add the index to the list
results.append(start)
# Move our start position to the index after the rank just found
start += 1
except ValueError:
return results

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the variable names around and added a comment to (hopefully) make things a bit easier to follow

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still required after the changes to VernierReader?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

@EdHone Ed Hone (EdHone) self-assigned this Mar 24, 2026
@EdHone
Copy link
Collaborator Author

Hopefully I've addressed your comments in the latest commit Oakley Brunt (@oakleybrunt) - back to you

@EdHone Ed Hone (EdHone) merged commit 07109fc into MetOffice:main Mar 24, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed The CLA has been signed as part of this PR - added by GA

Projects

None yet

2 participants