Merged
44 commits
fd2c217
Update README
VeckoTheGecko Mar 25, 2026
b75d0b3
Update gitignore
VeckoTheGecko Mar 25, 2026
295cb7b
Add Parcels as submodule
VeckoTheGecko Mar 25, 2026
98855a5
Add sandbox environment
VeckoTheGecko Mar 25, 2026
bb967c6
Rename benchmarks.json to datasets.json
VeckoTheGecko Mar 26, 2026
4d59a44
typo
VeckoTheGecko Mar 26, 2026
1f8777c
update function name
VeckoTheGecko Mar 26, 2026
96b3646
No need to have this default here
VeckoTheGecko Mar 26, 2026
c56cf69
Add pydantic
VeckoTheGecko Mar 26, 2026
2cec8b4
Migrate to pydantic
VeckoTheGecko Mar 26, 2026
64fd72c
Assert no duplicate dataset names
VeckoTheGecko Mar 26, 2026
b281743
refactor
VeckoTheGecko Mar 26, 2026
96e24a7
Rename function
VeckoTheGecko Mar 26, 2026
2d83e26
Add download-catalogue option
VeckoTheGecko Mar 26, 2026
4570368
Update catalogue
VeckoTheGecko Mar 26, 2026
9fa1963
Move files and add tasks
VeckoTheGecko Mar 26, 2026
bddcbdc
Add catalogue
VeckoTheGecko Mar 26, 2026
99cf82d
Move file
VeckoTheGecko Mar 26, 2026
070bcf0
Update folder location
VeckoTheGecko Mar 26, 2026
96b97d8
Add PARCELS_BENCHMARKS_DATA_FOLDER env var
VeckoTheGecko Mar 26, 2026
98bb9e7
Use curl instead
VeckoTheGecko Mar 26, 2026
42d7f53
Rename files (catalogue to catalog and yaml->yml)
VeckoTheGecko Mar 26, 2026
fc37186
Add task descriptions
VeckoTheGecko Mar 27, 2026
e582cff
Update readme
VeckoTheGecko Mar 27, 2026
2a064be
Remove parcels_benchmarks internal package
VeckoTheGecko Mar 27, 2026
c3299ba
Update script to unpack zips correctly
VeckoTheGecko Mar 27, 2026
0f3a93a
Update toml and lock
VeckoTheGecko Mar 27, 2026
9a25838
Add comment
VeckoTheGecko Mar 27, 2026
0c2cd03
Update catalogs regardless of folder existing
VeckoTheGecko Mar 27, 2026
0c2b6cf
Fix catalogues
VeckoTheGecko Mar 27, 2026
c2eb313
Migrate fesom ingestion to intake
VeckoTheGecko Mar 27, 2026
b4da946
Update MOI
VeckoTheGecko Mar 27, 2026
6fb1b7f
Update ASV conf
VeckoTheGecko Mar 27, 2026
515b767
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 27, 2026
4365e7b
Default PARCELS_BENCHMARKS_DATA_FOLDER to ./data
VeckoTheGecko Apr 2, 2026
5c98107
Clean out dependencies
VeckoTheGecko Apr 2, 2026
1a06ac3
Fix ASV/py-rattler deps
VeckoTheGecko Apr 2, 2026
d46f4b1
run pre-commit
VeckoTheGecko Apr 2, 2026
b1d3bc5
Rename asv.conf.json to .jsonc (comment supported format)
VeckoTheGecko Apr 2, 2026
0675afe
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 2, 2026
f992e3a
Review feedback
VeckoTheGecko Apr 2, 2026
10f4c96
update readme
VeckoTheGecko Apr 2, 2026
cbbc180
Disable isort
VeckoTheGecko Apr 2, 2026
c116a3c
Update catalogs/parcels-benchmarks/catalog.yml
VeckoTheGecko Apr 2, 2026
5 changes: 4 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@ credentials.json
*.egg-info
__pycache__
build/
parcels/
.asv/
html/
.DS_Store

data
.env
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "Parcels"]
path = Parcels
url = git@github.com:Parcels-code/Parcels
1 change: 1 addition & 0 deletions Parcels
Submodule Parcels added at c6f11d
33 changes: 22 additions & 11 deletions README.md
Expand Up @@ -6,15 +6,30 @@ This repository houses performance benchmarks for [Parcels](https://github.com/O

## Development instructions

This project uses a combination of [Pixi](https://pixi.sh/dev/installation/), [ASV](https://asv.readthedocs.io/), and [intake-xarray](https://github.com/intake/intake-xarray) to coordinate the setting up and running of benchmarks.

- Scripts are used to download the datasets required into the correct location
- intake-xarray is used to define data catalogues which can be easily accessed from within benchmark scripts
- ASV is used to run the benchmarks (see the [Writing the benchmarks](#writing-the-benchmarks) section).
- Pixi is used to orchestrate all the above into a convenient, user friendly workflow

You can run `pixi task list` to see the list of available tasks in the workspace.

In brief, you can set up the data and run the benchmarks by doing:

- [install Pixi](https://pixi.sh/dev/installation/) `curl -fsSL https://pixi.sh/install.sh | bash`
- `pixi install`
- `pixi run asv run`
- `PARCELS_BENCHMARKS_DATA_FOLDER=./data pixi run benchmarks`

You can run the linting with `pixi run lint`
> [!NOTE]
> The syntax `PARCELS_BENCHMARKS_DATA_FOLDER=./data pixi run ...` sets the environment variable for that task invocation only, but you can set environment variables [in other ways](https://askubuntu.com/a/58828) as well.
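
For example, both of the following achieve this (assuming a POSIX shell; `sh -c 'echo …'` stands in here for the pixi task so the snippet is self-contained):

```shell
# One-off: the variable is set only for this single command
PARCELS_BENCHMARKS_DATA_FOLDER=./data sh -c 'echo "$PARCELS_BENCHMARKS_DATA_FOLDER"'

# Session-wide: exported for every subsequent command in this shell
export PARCELS_BENCHMARKS_DATA_FOLDER=./data
sh -c 'echo "$PARCELS_BENCHMARKS_DATA_FOLDER"'
```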

> [!IMPORTANT]
> The default path for the benchmark data is set by [pooch.os_cache](https://www.fatiando.org/pooch/latest/api/generated/pooch.os_cache.html), which typically is a subdirectory of your home directory. Currently, you will need at least 50GB of disk space available to store the benchmark data.
> To change the location of the benchmark data cache, you can set the environment variable `PARCELS_DATADIR` to a preferred location to store the benchmark data.
> Currently, you will need at least 50GB of disk space available to store the unzipped benchmark data. Since the zip archives are only deleted after being downloaded and extracted, you will temporarily need about 80GB of disk space in total.
> You need to state explicitly where the benchmark data will be saved by
> setting the `PARCELS_BENCHMARKS_DATA_FOLDER` environment variable. This
> environment variable is used both when downloading the data and in the
> definition of the benchmarks.

To view the benchmark data

Expand All @@ -34,7 +49,7 @@ Members of the Parcels community can contribute benchmark data using the following
2. Clone your fork onto your system

```
git clone git@github.com:<your-github-handle>/parcels-benchmarks.git ~/parcels-benchmarks
git clone --recurse-submodules git@github.com:<your-github-handle>/parcels-benchmarks.git
```

3. Run the benchmarks
Expand All @@ -61,13 +76,9 @@ Adding benchmarks for parcels typically involves adding a dataset and defining t
### Adding new data

Data is hosted remotely on a SurfDrive managed by the Parcels developers. You will need to open an issue on this repository to start the process of getting your data hosted in the shared SurfDrive.
Once your data is hosted in the shared SurfDrive, you can easily add your dataset to the benchmark dataset manifest using

```
pixi run benchmark-setup pixi add-dataset --name "Name for your dataset" --file "Path to ZIP archive in the SurfDrive"
```
Once your data is hosted in the shared SurfDrive, you can easily add your dataset to the benchmark dataset catalogue by modifying `catalogs/parcels-benchmarks/catalog.yml`.

During this process, the dataset will be downloaded and a complete entry will be added to the [parcels_benchmarks/benchmarks.json](./parcels_benchmarks/benchmarks.json) manifest file. Once updated, this file can be committed to this repository and contributed via a pull request.
You can then use this catalogue entry from within your benchmark.
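
A sketch of what using such an entry from a benchmark might look like (the entry name and folder layout are taken from this PR; the intake calls are shown as comments since they require the data to be downloaded first):

```python
import os
from pathlib import Path

# Folder layout assumed from this PR's download tasks: the benchmarks catalog
# lives under <data folder>/surf-data/parcels-benchmarks/catalog.yml
data_folder = Path(os.environ.get("PARCELS_BENCHMARKS_DATA_FOLDER", "./data"))
catalog_path = data_folder / "surf-data" / "parcels-benchmarks" / "catalog.yml"

# Inside a benchmark, the catalogue entry is then opened via intake, e.g.:
#   import intake
#   cat = intake.open_catalog(str(catalog_path))
#   ds = cat["GlobCurrent_example_data"].to_dask()  # lazily-loaded xarray Dataset
print(catalog_path)
```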

### Writing the benchmarks

Expand Down
25 changes: 0 additions & 25 deletions asv.conf.json

This file was deleted.

31 changes: 31 additions & 0 deletions asv.conf.jsonc
@@ -0,0 +1,31 @@
{
    "version": 1,
    "project": "parcels",
    "project_url": "https://github.com/Parcels-Code/parcels",
    "repo": "./Parcels",
    "dvcs": "git",
    "branches": ["main"],
    "environment_type": "rattler",
    "conda_channels": [
        "conda-forge",
        "defaults",
        "https://repo.prefix.dev/parcels",
    ],
    "default_benchmark_timeout": 1800,
    "env_dir": ".asv/env",
    "results_dir": "results",
    "html_dir": "html",
    "build_command": ["python -m build --wheel -o {build_cache_dir} {build_dir}"],
    // "install_command": [
    //     "in-dir={conf_dir} python -m pip install .",
    //     "in-dir={build_dir} python -m pip install ."
    // ],
    // "uninstall_command": [
    //     "return-code=any python -m pip uninstall -y parcels parcels_benchmarks"
    // ]
    "matrix": {
        "req": {
            "intake-xarray": [],
        },
    },
}
27 changes: 27 additions & 0 deletions benchmarks/__init__.py
@@ -0,0 +1,27 @@
import logging
import os
from pathlib import Path

logger = logging.getLogger(__name__)

PIXI_PROJECT_ROOT = os.environ.get("PIXI_PROJECT_ROOT")
if PIXI_PROJECT_ROOT is not None:
    PIXI_PROJECT_ROOT = Path(PIXI_PROJECT_ROOT)

PIXI_PROJECT_ROOT: Path | None

try:
    PARCELS_BENCHMARKS_DATA_FOLDER = Path(os.environ["PARCELS_BENCHMARKS_DATA_FOLDER"])
except KeyError:
    # Default to `./data`
    PARCELS_BENCHMARKS_DATA_FOLDER = Path("./data")
    logger.info("PARCELS_BENCHMARKS_DATA_FOLDER was not set. Defaulting to `./data`")

if not PARCELS_BENCHMARKS_DATA_FOLDER.is_absolute():
    if PIXI_PROJECT_ROOT is None:
        raise RuntimeError(
            "PARCELS_BENCHMARKS_DATA_FOLDER is a relative path, but PIXI_PROJECT_ROOT env variable is not set. We don't know where to store the data."
        )
    PARCELS_BENCHMARKS_DATA_FOLDER = PIXI_PROJECT_ROOT / str(
        PARCELS_BENCHMARKS_DATA_FOLDER
    )
12 changes: 12 additions & 0 deletions benchmarks/catalogs.py
@@ -0,0 +1,12 @@
import intake

from . import PARCELS_BENCHMARKS_DATA_FOLDER


class Catalogs:
    CAT_EXAMPLES = intake.open_catalog(
        f"{PARCELS_BENCHMARKS_DATA_FOLDER}/surf-data/parcels-examples/catalog.yml"
    )
    CAT_BENCHMARKS = intake.open_catalog(
        f"{PARCELS_BENCHMARKS_DATA_FOLDER}/surf-data/parcels-benchmarks/catalog.yml"
    )
28 changes: 14 additions & 14 deletions benchmarks/fesom2.py
@@ -1,5 +1,6 @@
import numpy as np
import uxarray as ux
import xarray as xr
from parcels import (
FieldSet,
Particle,
Expand All @@ -8,39 +9,38 @@
)
from parcels.kernels import AdvectionRK2_3D

from parcels_benchmarks.benchmark_setup import PARCELS_DATADIR, download_example_dataset
from . import PARCELS_BENCHMARKS_DATA_FOLDER

runtime = np.timedelta64(1, "D")
dt = np.timedelta64(2400, "s")


def _load_ds(datapath):
def _load_ds():
"""Helper function to load the FESOM benchmark data as a uxarray dataset"""

grid_file = f"{datapath}/mesh/fesom.mesh.diag.nc"
data_files = f"{datapath}/*.nc"
return ux.open_mfdataset(grid_file, data_files, combine="by_coords")
base = (
f"{PARCELS_BENCHMARKS_DATA_FOLDER}/surf-data/parcels-benchmarks/data/"
"Parcelsv4_Benchmarking_data/Parcels_Benchmarks_FESOM-baroclinic-gyre/data"
)
grid_file = xr.open_mfdataset(f"{base}/mesh/fesom.mesh.diag.nc")
data_files = xr.open_mfdataset(f"{base}/*.nc")

grid = ux.open_grid(grid_file)
return ux.UxDataset(data_files, uxgrid=grid)


class FESOM2:
params = ([10000], [AdvectionRK2_3D])
param_names = ["npart", "integrator"]

def setup(self, npart, integrator):
# Ensure the dataset is downloaded in the desired data_home
# and obtain the path to the dataset
self.datapath = download_example_dataset(
"FESOM-baroclinic-gyre", data_home=PARCELS_DATADIR
)

def time_load_data(self, npart, integrator):
ds = _load_ds(self.datapath)
ds = _load_ds()
for i in range(min(ds.coords["time"].size, 2)):
_u = ds["u"].isel(time=i).compute()
_v = ds["v"].isel(time=i).compute()

def pset_execute(self, npart, integrator):
ds = _load_ds(self.datapath)
ds = _load_ds()
ds = convert.fesom_to_ugrid(ds)
fieldset = FieldSet.from_ugrid_conventions(ds)

Expand Down
12 changes: 7 additions & 5 deletions benchmarks/moi_curvilinear.py
Expand Up @@ -6,12 +6,16 @@
import xgcm
from parcels.interpolators import XLinear

from parcels_benchmarks.benchmark_setup import PARCELS_DATADIR, download_example_dataset

runtime = np.timedelta64(2, "D")
dt = np.timedelta64(15, "m")


PARCELS_DATADIR = ... # TODO: Replace with intake


def download_dataset(*args, **kwargs): ... # TODO: Replace with intake


def _load_ds(datapath, chunk):
"""Helper function to load xarray dataset from datapath with or without chunking"""

Expand Down Expand Up @@ -72,9 +76,7 @@ class MOICurvilinear:
]

def setup(self, interpolator, chunk, npart):
self.datapath = download_example_dataset(
"MOi-curvilinear", data_home=PARCELS_DATADIR
)
self.datapath = download_dataset("MOi-curvilinear", data_home=PARCELS_DATADIR)

def time_load_data_3d(self, interpolator, chunk, npart):
"""Benchmark that times loading the 'U' and 'V' data arrays only for 3-D"""
Expand Down
63 changes: 63 additions & 0 deletions catalogs/parcels-benchmarks/catalog.yml
Member: What's the difference between the catalogues in parcels-benchmarks and the parcels-examples? They seem to be the same now?

Contributor Author: Yes, to be updated in a future PR (mainly focussing on the actual downloading of the datasets - will fix the catalogs and ingestion at the same time)

@@ -0,0 +1,63 @@
# zip_url: https://surfdrive.surf.nl/index.php/s/7xlfdOFaUGDEmpD/download?path=%2F&files=
# ^ Do not remove this line! Used by the download script to find the data source
plugins:
  source:
    - module: intake_xarray
sources: #!TODO Update
  croco:
    description: CROCO_idealized
    driver: netcdf
    #cache:
    #  - argkey: urlpath
    #    regex: ''
    #    type: file
    args:
      urlpath: "{{ CATALOG_DIR }}/data/CROCOidealized_data/CROCO_idealized.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  GlobCurrent_example_data:
    description: GlobCurrent_example_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/GlobCurrent_example_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  MITgcm_example_data:
    description: MITgcm_example_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/MITgcm_example_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  MovingEddies_data:
    description: MovingEddies_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/MovingEddies_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"

  # NemoCurvilinear_data:
  # NemoNorthSeaORCA025-N006_data:
  # OFAM_example_data
  # Peninsula_data
  SWASH_data:
    description: SWASH_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/SWASH_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  WOA_data:
    description: WOA_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/WOA_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
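
The `# zip_url:` comment on the first line of each catalog is the hook the download script uses to locate the archive on SurfDrive. A minimal sketch of extracting it (the parsing approach is an assumption; the actual download script is not shown in this diff):

```python
def read_zip_url(catalog_text: str) -> str:
    """Pull the data-source URL out of the `# zip_url: ...` marker that the
    catalog files carry on their first line."""
    first_line = catalog_text.splitlines()[0]
    prefix = "# zip_url:"
    if not first_line.startswith(prefix):
        raise ValueError("catalog file does not start with a zip_url marker")
    return first_line[len(prefix):].strip()


sample = (
    "# zip_url: https://surfdrive.surf.nl/index.php/s/7xlfdOFaUGDEmpD/download?path=%2F&files=\n"
    "plugins:\n"
)
print(read_zip_url(sample))
```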
64 changes: 64 additions & 0 deletions catalogs/parcels-examples/catalog.yml
@@ -0,0 +1,64 @@
# zip_url: https://surfdrive.surf.nl/index.php/s/cmdSy8wBtCLDaGJ/download?path=%2F&files=
# ^ Do not remove this line! Used by the download script to find the data source
plugins:
  source:
    - module: intake_xarray
sources:
  croco:
    description: CROCO_idealized
    driver: netcdf
    #cache:
    #  - argkey: urlpath
    #    regex: ''
    #    type: file
    args:
      urlpath: "{{ CATALOG_DIR }}/data/CROCOidealized_data/CROCO_idealized.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  GlobCurrent_example_data:
    description: GlobCurrent_example_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/GlobCurrent_example_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  MITgcm_example_data:
    description: MITgcm_example_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/MITgcm_example_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  MovingEddies_data:
    description: MovingEddies_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/MovingEddies_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"

  # NemoCurvilinear_data:
  # NemoNorthSeaORCA025-N006_data:
  # OFAM_example_data
  # Peninsula_data
  # SWASH_data
  SWASH_data:
    description: SWASH_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/SWASH_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
  WOA_data:
    description: WOA_data
    driver: netcdf
    args:
      urlpath: "{{ CATALOG_DIR }}/data/WOA_data/*.nc"
      chunks: {}
      xarray_kwargs:
        engine: "netcdf4"
1 change: 0 additions & 1 deletion parcels_benchmarks/__init__.py

This file was deleted.
