Add figure generation code #158
Conversation
Very cool stuff! I'd love to see some scatter plot versions of the timing (slicing and file read) to truly experience the variability over the machines (bonus points for Plotly with a hover effect displaying the machine ID for easy cross-reference).
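A minimal sketch of that Plotly scatter, assuming the parsed results live in a pandas DataFrame; the columns "test_case", "time_s", and "machine_id" and the file "results.parquet" are illustrative names, not taken from the actual script:

import pandas as pd
import plotly.express as px

df = pd.read_parquet("results.parquet")  # hypothetical results file

fig = px.scatter(
    df,
    x="test_case",
    y="time_s",
    hover_data=["machine_id"],  # hovering a point shows the machine ID for cross-reference
    title="Per-run timing variability across machines",
)
fig.show()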
# Repackage the raw benchmark results into a Parquet database, filtering out
# entries below the minimum results/machine schema versions.
repackage_as_parquet(
    directory=results_directory,
    output_directory=db_directory,
    minimum_results_version="3.0.0",
    minimum_machines_version="1.4.0",
)
I think this is good for development. When the paper is finalized, we can remove the endpoint from the Flask server (or do that whenever) and treat the 'results database' repo as version-controlled 'data' to accompany the publication (also with a Zenodo DOI).
I would also combine the Zarr options with the HDF5 options (perhaps with an asterisk or something to indicate the different file type, but still roughly apples-to-apples, since the user can choose between backends just like they can choose streaming methods for the HDF5 backend) to more easily see a side-by-side comparison.
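A minimal sketch of that combined panel, assuming a DataFrame with illustrative columns "method", "backend", and "time_s"; the asterisk suffix marks the Zarr entries as a different file type while keeping them side by side with the HDF5 methods:

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_parquet("results.parquet")  # hypothetical results file
summary = df.groupby(["backend", "method"])["time_s"].median().reset_index()

# Suffix Zarr methods with an asterisk to flag the different file type
labels = [
    f"{row.method}*" if row.backend == "zarr" else row.method
    for row in summary.itertuples()
]
plt.bar(labels, summary["time_s"])
plt.ylabel("median time (s)")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()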
Finally:

1a) Put the 'preloaded' benchmarks in their own panel: that makes them easier to compare across methods, and it is somewhat unfair to compare them to the others since they had different untimed steps in their setups.

1b) Separate caching from non-caching runs to more easily see which methods are best within a given mode (it may not always be convenient to have the cache enabled, depending on the file system); see the faceting sketch after this list.
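A minimal sketch of that panel layout using seaborn faceting, assuming illustrative columns "method", "time_s", "preloaded", and "cache_enabled"; preloaded runs get their own column of panels and caching modes are split into rows, so each comparison stays within a single mode:

import pandas as pd
import seaborn as sns

df = pd.read_parquet("results.parquet")  # hypothetical results file
grid = sns.catplot(
    data=df,
    x="method",
    y="time_s",
    col="preloaded",      # preloaded runs in their own panels
    row="cache_enabled",  # caching vs. non-caching separated
    kind="strip",
)
for ax in grid.axes.flat:
    ax.tick_params(axis="x", rotation=45)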
I think the updated code addresses most of the changes mentioned here or discussed in our last meeting (see the generated plots here).
Once we have some of the additional results (including download times, different versions, and scaling chunk vs. object tests), I can add additional plots for those.
Here is an initial script to generate figures for the benchmarks paper. Currently the script leans toward plotting everything; we can condense it down as we go.
Related plots can be found on Google Drive or can be generated from this script (using an updated, local GitHub clone of nwb-benchmarks-results).
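For anyone reproducing the plots locally, the invocation might look like this minimal sketch; the module generate_figures, the function generate_all_figures, and the paths are hypothetical placeholders for whatever entry point the script actually exposes:

from pathlib import Path

from generate_figures import generate_all_figures  # hypothetical entry point

results_clone = Path.home() / "nwb-benchmarks-results"  # updated local clone
generate_all_figures(results_directory=results_clone, output_directory=Path("figures"))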
A couple of things I noticed:
- time_download_lindi.LindiDownloadBenchmark.time_download
@rly @oruebel @CodyCBakerPhD let me know if you have any initial feedback on the figures or other things you would like to see before we meet. Some of the other to-do items are indicated in the script.