README.md: 22 additions & 11 deletions
@@ -6,15 +6,30 @@ This repository houses performance benchmarks for [Parcels](https://github.com/O

 ## Development instructions

+This project uses a combination of [Pixi](https://pixi.sh/dev/installation/), [ASV](https://asv.readthedocs.io/), and [intake-xarray](https://github.com/intake/intake-xarray) to set up and run the benchmarks.
+
+- Scripts download the required datasets into the correct location.
+- intake-xarray defines data catalogues that can be easily accessed from within benchmark scripts.
+- ASV runs the benchmarks (see the [Writing the benchmarks](#writing-the-benchmarks) section).
+- Pixi orchestrates all of the above into a convenient, user-friendly workflow.
+
+You can run `pixi task list` to see the list of available tasks in the workspace.
+
+In brief, you can set up the data and run the benchmarks by doing:
-`PARCELS_BENCHMARKS_DATA_FOLDER=./data pixi run benchmarks`

-You can run the linting with `pixi run lint`
+> [!NOTE]
+> The syntax `PARCELS_BENCHMARKS_DATA_FOLDER=./data pixi run ...` sets the environment variable for the task, but you can also set environment variables [in other ways](https://askubuntu.com/a/58828).

 > [!IMPORTANT]
-> The default path for the benchmark data is set by [pooch.os_cache](https://www.fatiando.org/pooch/latest/api/generated/pooch.os_cache.html), which typically is a subdirectory of your home directory. Currently, you will need at least 50GB of disk space available to store the benchmark data.
-> To change the location of the benchmark data cache, you can set the environment variable `PARCELS_DATADIR` to a preferred location to store the benchmark data.
+> Currently, you will need at least 50GB of disk space available to store the unzipped benchmark data. Because the zip archives are deleted only after they have been downloaded and extracted, about 80GB of disk space is needed during setup.
+> You must explicitly choose where the benchmark data will be saved by
+> setting the `PARCELS_BENCHMARKS_DATA_FOLDER` environment variable. This
+> environment variable is used both when downloading the data and when
+> defining the benchmarks.
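
The requirements in the note above can be checked before starting a long download. The following is a minimal sketch, not part of this repository: the helper names `resolve_data_folder` and `has_enough_space` are hypothetical, and the 80GB threshold is simply the figure quoted in the note.

```python
import os
import shutil
from pathlib import Path

ENV_VAR = "PARCELS_BENCHMARKS_DATA_FOLDER"
REQUIRED_BYTES = 80 * 10**9  # roughly 80GB, the figure quoted in the note


def resolve_data_folder(environ=os.environ) -> Path:
    """Return the data folder from the environment, failing loudly if unset."""
    value = environ.get(ENV_VAR)
    if not value:
        raise RuntimeError(f"Set {ENV_VAR} before running the benchmarks")
    return Path(value).expanduser().resolve()


def has_enough_space(folder: Path, required: int = REQUIRED_BYTES) -> bool:
    """Check free space on the filesystem that holds (or will hold) `folder`."""
    probe = folder
    # The folder may not exist yet, so walk up to the nearest existing parent.
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free >= required
```

Calling `has_enough_space(resolve_data_folder())` before the setup task gives an early failure instead of a partially downloaded dataset.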

 To view the benchmark data

@@ -34,7 +49,7 @@ Members of the Parcels community can contribute benchmark data using the followi
@@ -61,13 +76,9 @@ Adding benchmarks for parcels typically involves adding a dataset and defining t
 ### Adding new data

 Data is hosted remotely on a SurfDrive managed by the Parcels developers. You will need to open an issue on this repository to start the process of getting your data hosted in the shared SurfDrive.
-Once your data is hosted in the shared SurfDrive, you can easily add your dataset to the benchmark dataset manifest using
-
-```
-pixi run benchmark-setup pixi add-dataset --name "Name for your dataset" --file "Path to ZIP archive in the SurfDrive"
-```
+Once your data is hosted in the shared SurfDrive, you can add your dataset to the benchmark dataset catalogue by modifying `catalogs/parcels-benchmarks/catalog.yml`.

-During this process, the dataset will be downloaded and a complete entry will be added to the [parcels_benchmarks/benchmarks.json](./parcels_benchmarks/benchmarks.json) manifest file. Once updated, this file can be committed to this repository and contributed via a pull request.
+You can now use this catalogue entry in your benchmarks.
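
As a rough illustration of using a catalogue entry from a benchmark, the sketch below shows an ASV-style class that opens the catalogue with intake and reads a single value from the dataset. The entry name `my_new_dataset` and the class and method names are hypothetical. The code assumes the entry is an intake-xarray source opened through intake's classic `open_catalog` and `to_dask` interface, and that it runs from the repository root. The repository's actual benchmarks may load data differently.

```python
from pathlib import Path

# Catalogue path as given in the README, relative to the repository root.
CATALOG_PATH = Path("catalogs/parcels-benchmarks/catalog.yml")


class ExampleDatasetSuite:
    """ASV-style benchmark sketch: ASV calls setup() and times time_* methods."""

    # Hypothetical entry name: replace with the key you added to catalog.yml.
    entry_name = "my_new_dataset"

    def setup(self):
        # Imported lazily so this module can be imported without intake installed.
        import intake

        catalog = intake.open_catalog(str(CATALOG_PATH))
        # Assumption: an intake-xarray source, whose to_dask() returns a lazy xarray.Dataset.
        self.ds = catalog[self.entry_name].to_dask()

    def time_read_first_value(self):
        # Read one element of the first variable so the timing includes real I/O
        # without pulling the whole dataset into memory.
        name = next(iter(self.ds.data_vars))
        variable = self.ds[name]
        variable.isel({dim: 0 for dim in variable.dims}).load()
```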