Add figure generation code #158

stephprince · 2025-09-25T17:59:00Z

Here is an initial script to generate figures for the benchmarks paper. Currently, this script leans towards plotting everything, and we can condense down as we go.

Related plots can be found on Google Drive or can be generated from this script (using an updated, local GitHub clone of nwb-benchmarks-results).

A couple of things I noticed:

For asking about downloading vs. streaming (as described in Hypotheses: The purpose of NWB Benchmarks #89), we still need to capture the download time for these different test cases. The only download test case I saw was 'time_download_lindi.LindiDownloadBenchmark.time_download'.
I think the icephys slice size should probably be increased. As it is right now, it seems to be too small to capture any variation in time vs. slice size.

@rly @oruebel @CodyCBakerPhD let me know if you have any initial feedback on figures or different things you would like to see before we meet. Some of the other to-do items are indicated in the script.

CodyCBakerPhD · 2025-09-25T18:59:21Z

Very cool stuff

I'd love to see some scatter plot versions of the timing (slicing and file read) to really see the variability across machines (bonus points for plotly with a hover effect that displays the machine ID for easy cross-referencing)

CodyCBakerPhD · 2025-09-25T19:00:53Z

src/nwb_benchmarks/scripts/generate_figures.py

+repackage_as_parquet(
+    directory=results_directory,
+    output_directory=db_directory,
+    minimum_results_version="3.0.0",
+    minimum_machines_version="1.4.0",
+)


I think this is good for development. When the paper is finalized, we can remove the endpoint from the Flask server (or I guess do that whenever) and treat the 'results database' repo as version-controlled data to go along with the publication (also with a Zenodo DOI).
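
For readers unfamiliar with the `minimum_results_version` style of argument in the hunk above, one way such a cutoff could work is sketched below. This is not the implementation of `repackage_as_parquet`; the helper names and the record layout are hypothetical. The point it illustrates is that versions should be compared numerically, since plain string comparison would sort "3.10.2" before "3.9.0".

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '3.0.0' into a comparable tuple (3, 0, 0)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(version: str, minimum: str) -> bool:
    """Return True if `version` is at least `minimum`, compared numerically."""
    return parse_version(version) >= parse_version(minimum)


# Hypothetical result records; the real files and fields may look different.
results = [
    {"file": "a.json", "results_version": "2.9.1"},
    {"file": "b.json", "results_version": "3.0.0"},
    {"file": "c.json", "results_version": "3.10.2"},
]

kept = [r["file"] for r in results if meets_minimum(r["results_version"], "3.0.0")]
print(kept)  # ['b.json', 'c.json']
```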

CodyCBakerPhD · 2025-09-25T19:09:14Z

I would also combine the Zarr and HDF5 options to more easily see a side-by-side comparison (perhaps with an asterisk or something to indicate a different type of file). It is still kind of apples-to-apples, since the user can choose between backends just like they can choose streaming methods for the HDF5 backend.

CodyCBakerPhD · 2025-09-25T19:15:46Z

Finally:

While it is nice to have the colorful slicing_HDF5PyNWB.png to see the landscape, we might want to reorganize the panels so they correspond more closely to the testing conditions:

1a) Put the 'preloaded' ones in their own panel. That makes them easier to compare across methods, and it is kind of unfair to compare them to the others since they had different untimed steps in their setups.

1b) Separate caching from non-caching to more easily see which methods are best within a given mode (depending on the file system, it may not always be convenient to have the cache enabled).

I would also experiment with outputting to .eps or .svg to ensure the script is capable of that. It not only helps us during early exploration (we might want to zoom in indefinitely to see some of the smaller-scale items), but will almost certainly be required by the journal as well.
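
The panel split suggested in 1a and 1b could look something like the following sketch. It uses only the standard library, and the method names and the `preloaded` / `cached` flags are hypothetical stand-ins for however the benchmarks are actually labeled.

```python
from collections import defaultdict

# Hypothetical benchmark descriptions; real names and fields may differ.
benchmarks = [
    {"method": "method_a", "preloaded": False, "cached": False},
    {"method": "method_a", "preloaded": False, "cached": True},
    {"method": "method_b", "preloaded": False, "cached": False},
    {"method": "method_b", "preloaded": False, "cached": True},
    {"method": "method_c", "preloaded": False, "cached": False},
    {"method": "method_a", "preloaded": True, "cached": False},
    {"method": "method_b", "preloaded": True, "cached": False},
]


def panel_for(benchmark: dict) -> str:
    """Pick the panel: preloaded runs get their own, the rest split by cache mode."""
    if benchmark["preloaded"]:
        return "preloaded"
    return "with cache" if benchmark["cached"] else "no cache"


panels = defaultdict(list)
for benchmark in benchmarks:
    panels[panel_for(benchmark)].append(benchmark["method"])

for name in ("no cache", "with cache", "preloaded"):
    print(f"{name}: {panels[name]}")
```

Each key in `panels` would then map to one subplot, so methods are only compared against others run under the same conditions.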

stephprince · 2025-09-30T18:13:53Z

I think the updated code addresses most of the changes mentioned here or discussed in our last meeting (see generated plots here):

add scatter plots
add different plots for different slice range sizes
add n for each run to the plots
grouping: combine Zarr / HDF5 / Lindi into one figure, separate out preloaded data into different panels, reorder cached vs. uncached
add additional network tracking metrics: avg time per request and percent network time (something is off with some of these values, where the network time is larger than the total time)
output figures as PDFs (editable in vector graphics software)
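
For reference, a minimal sketch of vector output with editable text, assuming the figures are made with matplotlib (the plotting library is not named in this thread). The data and file names are placeholders. Setting the font type to 42 is a commonly used setting so that vector editors treat text as text rather than outlines.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Embed TrueType (Type 42) fonts in PDF/EPS output instead of the default Type 3.
plt.rcParams["pdf.fonttype"] = 42
plt.rcParams["ps.fonttype"] = 42
# Keep SVG text as text elements instead of converting it to paths.
plt.rcParams["svg.fonttype"] = "none"

fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [0.8, 1.5, 2.1])  # placeholder data
ax.set_xlabel("slice size (placeholder units)")
ax.set_ylabel("time (s)")

# Write the same figure in each vector format to a temporary directory.
out_dir = Path(tempfile.mkdtemp())
saved = []
for ext in ("pdf", "svg", "eps"):
    path = out_dir / f"example_figure.{ext}"
    fig.savefig(path)
    saved.append(path)
plt.close(fig)
```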

Once we have some of the additional results (including download times, different versions, and scaling chunk vs. object tests), I can add plots for those.
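
To make the two network metrics from the list above concrete, here is a small sketch of how they could be computed and how the suspicious rows (network time larger than total time) could be flagged. The definitions are assumptions: average time per request is taken as network time divided by the number of requests, and the argument names are hypothetical rather than the columns used in the script.

```python
def network_metrics(total_time_s: float, network_time_s: float, n_requests: int) -> dict:
    """Compute per-run network metrics and flag physically implausible values."""
    avg_per_request = network_time_s / n_requests if n_requests else float("nan")
    percent_network = (
        100.0 * network_time_s / total_time_s if total_time_s else float("nan")
    )
    return {
        "avg_time_per_request_s": avg_per_request,
        "percent_network_time": percent_network,
        # Network time should not exceed total time; flag rows where it does.
        "suspicious": network_time_s > total_time_s,
    }


ok_run = network_metrics(total_time_s=2.0, network_time_s=0.5, n_requests=10)
odd_run = network_metrics(total_time_s=1.0, network_time_s=1.5, n_requests=3)

print(ok_run["percent_network_time"], ok_run["suspicious"])    # 25.0 False
print(odd_run["percent_network_time"], odd_run["suspicious"])  # 150.0 True
```

Flagging rather than dropping these rows keeps them visible while the cause of the mismatch is tracked down.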

stephprince and others added 12 commits September 18, 2025 10:43

add visualization dependencies

e006938

refactor dataclasses, parquet processing

9a9cd78

[pre-commit.ci] auto fixes from pre-commit.com hooks

994b1b4

for more information, see https://pre-commit.ci

add figure script - wip

deeb6fb

update plotting functions

f9f0575

update minimum version setting for parquet

c98bcec

save timestamp info in parquet

fec918f

Merge branch 'main' into add-figure-script

475e7f4

remove accidentally tracked figures

eb74de6

add slice vs time plots

b9742f0

update figure script

7f50a75

[pre-commit.ci] auto fixes from pre-commit.com hooks

6b4a4f6

CodyCBakerPhD reviewed Sep 25, 2025

stephprince and others added 10 commits September 25, 2025 16:09

save figures as pdfs with editable text

6d5dc1f

add scatter plots, combine plots, pull out preloaded

e077e9f

update figure script

c272e31

[pre-commit.ci] auto fixes from pre-commit.com hooks

fac69e8

update figures from feedback

394b397

refactor db processing and visualization code

44e74b7

refactor db processing and visualization code

1370885

update figure generation script to use classes

b480d2f

add print logging for plotting

1f54780

ignore pdf figure files

b2ea98e

stephprince changed the title ~~Add script to generate figures~~ Add figure generation code Sep 30, 2025

pre-commit-ci bot and others added 3 commits September 30, 2025 17:55

[pre-commit.ci] auto fixes from pre-commit.com hooks

60535e7

update cache/no cache order

b00e126

[pre-commit.ci] auto fixes from pre-commit.com hooks

2e3ab14
