Retrieve, subset, and process CORDEX-CORE and ERA5 climate data directly from THREDDS/OPeNDAP.
cavapy is a Python package built for climate-impact workflows where you need reliable data access without handling massive raw NetCDF archives manually.
It is part of the CAVA (Climate and Agriculture Risk Visualization and Assessment) ecosystem and focuses on:
- Fast access to CORDEX-CORE simulations
- Access to ERA5 reanalysis, used as the observational reference
- Optional bias correction and calendar harmonization
- Clean integration with downstream hydrology, agronomy, and risk-analysis pipelines
Project context: CAVA overview
- CORDEX-CORE regional climate simulations (25 km)
- ERA5 reanalysis (used directly and for optional correction workflows)
Data is hosted on the University of Cantabria THREDDS infrastructure within the CAVA initiative (FAO, University of Cantabria, University of Cape Town, Predictia).
Available datasets:
- CORDEX-CORE: original model outputs
- CORDEX-CORE-BC: pre-bias-corrected outputs using ISIMIP methodology
Available variables:
- tasmax: daily maximum temperature (degC)
- tasmin: daily minimum temperature (degC)
- pr: daily precipitation (mm)
- hurs: daily relative humidity (%)
- sfcWind: daily wind speed at 2 m (m/s)
- rsds: daily solar radiation (W/m2)
- Domains: NAM-22, EUR-22, AFR-22, EAS-22, SEA-22, WAS-22, AUS-22, SAM-22, CAM-22
- RCPs: rcp26, rcp85
- GCMs: MOHC, MPI, NCC
- RCMs: REMO, Reg
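The option lists above can be checked locally before any request reaches the server. The following is a minimal sketch using only the standard library; `SUPPORTED`, `validate_request`, and `expand_combinations` are hypothetical helpers for illustration and are not part of cavapy. It only checks names against the lists above; whether every domain/GCM/RCM combination is actually hosted is not something this sketch can confirm.

```python
from itertools import product

# Option values copied from the lists above; the dict itself is illustrative.
SUPPORTED = {
    "cordex_domain": {"NAM-22", "EUR-22", "AFR-22", "EAS-22", "SEA-22",
                      "WAS-22", "AUS-22", "SAM-22", "CAM-22"},
    "rcp": {"rcp26", "rcp85"},
    "gcm": {"MOHC", "MPI", "NCC"},
    "rcm": {"REMO", "Reg"},
}


def validate_request(**options):
    """Return a list of human-readable problems (empty if all names are known)."""
    problems = []
    for name, value in options.items():
        allowed = SUPPORTED.get(name)
        if allowed is None:
            problems.append(f"unknown option: {name}")
            continue
        # Accept either a single string or a list/tuple of strings.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            if item not in allowed:
                problems.append(f"{name}={item!r} is not one of {sorted(allowed)}")
    return problems


def expand_combinations(rcp, gcm, rcm):
    """Enumerate every (rcp, gcm, rcm) triple implied by three lists."""
    return list(product(rcp, gcm, rcm))


problems = validate_request(cordex_domain="AFR-22", rcp=["rcp26", "rcp45"])
combos = expand_combinations(["rcp26", "rcp85"], ["MPI", "MOHC"], ["Reg", "REMO"])
```

Here `problems` flags `rcp45` as unsupported, and `combos` holds the eight triples that a request with two RCPs, two GCMs, and two RCMs would imply.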
conda create -n cavapy "python>=3.11"
conda activate cavapy
pip install cavapy

import cavapy

# Pre-bias-corrected projections (CORDEX-CORE-BC) for Togo
togo = cavapy.get_climate_data(
    country="Togo",
    variables=["tasmax", "pr"],
    cordex_domain="AFR-22",
    rcp="rcp26",
    gcm="MPI",
    rcm="REMO",
    years_up_to=2030,
    dataset="CORDEX-CORE-BC",
)

import cavapy
# Original CORDEX-CORE outputs with bias correction applied on request
togo = cavapy.get_climate_data(
    country="Togo",
    variables=["tasmax", "pr"],
    cordex_domain="AFR-22",
    rcp="rcp26",
    gcm="MPI",
    rcm="REMO",
    years_up_to=2030,
    bias_correction=True,
    dataset="CORDEX-CORE",
)

import cavapy
# ERA5 only (obs=True); range(1980, 2019) yields the years 1980 through 2018
era5 = cavapy.get_climate_data(
    country="Togo",
    variables=["tasmax", "pr"],
    obs=True,
    years_obs=range(1980, 2019),
)

import cavapy
# Same kind of request with historical=True set
data = cavapy.get_climate_data(
    country="Afghanistan",
    variables=["tasmax", "pr"],
    cordex_domain="WAS-22",
    rcp="rcp85",
    gcm="NCC",
    rcm="REMO",
    years_up_to=2030,
    historical=True,
    dataset="CORDEX-CORE-BC",
)

Pass lists (or None) to rcp, gcm, and rcm to request several scenario/model combinations in one call.
import cavapy

multi = cavapy.get_climate_data(
    country="Togo",
    cordex_domain="AFR-22",
    rcp=["rcp26", "rcp85"],
    gcm=["MPI", "MOHC"],
    rcm=["Reg", "REMO"],
    years_up_to=2030,
    historical=True,
)

Return shape for multi-combination requests:

multi[rcp][f"{gcm}-{rcm}"][variable]  # -> xarray.DataArray

get_climate_data() orchestrates:
- Server-side access and subsetting via OPeNDAP
- Parallel data retrieval
- Unit conversions
- Calendar conversion to Gregorian calendar
- Optional empirical quantile mapping bias correction
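To make the last step above concrete, here is a minimal standard-library sketch of empirical quantile mapping on synthetic numbers. It illustrates the general technique only: the function names are hypothetical, the data are made up, and cavapy's own implementation (number of quantiles, handling of extremes, treatment of dry days for precipitation) may well differ.

```python
from bisect import bisect_right


def quantile_nodes(sample, n_nodes=11):
    """Return n_nodes evenly spaced empirical quantiles of sample (min to max)."""
    if n_nodes < 2 or len(sample) < 2:
        raise ValueError("need at least two nodes and two sample values")
    s = sorted(sample)
    nodes = []
    for i in range(n_nodes):
        # Fractional position of the i-th quantile in the sorted sample.
        pos = i * (len(s) - 1) / (n_nodes - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        # Linear interpolation between neighbouring order statistics.
        nodes.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    return nodes


def quantile_map(value, model_nodes, obs_nodes):
    """Map a model value onto the observed distribution via matching quantiles."""
    # Values outside the calibration range are clamped to the observed extremes.
    if value <= model_nodes[0]:
        return obs_nodes[0]
    if value >= model_nodes[-1]:
        return obs_nodes[-1]
    # Locate the quantile interval containing the value, then interpolate.
    i = bisect_right(model_nodes, value) - 1
    x0, x1 = model_nodes[i], model_nodes[i + 1]
    y0, y1 = obs_nodes[i], obs_nodes[i + 1]
    return y0 + (value - x0) / (x1 - x0) * (y1 - y0)


# Synthetic calibration data: the "model" runs 2 degC warmer than the "observations".
obs_hist = [20.0 + i for i in range(11)]
model_hist = [v + 2.0 for v in obs_hist]

model_nodes = quantile_nodes(model_hist)
obs_nodes = quantile_nodes(obs_hist)

corrected = [quantile_map(v, model_nodes, obs_nodes) for v in (27.0, 27.5, 40.0)]
```

With a constant 2 degC warm bias in the calibration period, in-range values are shifted down by 2 degC, while a value beyond the calibration maximum is clamped to the observed maximum. Clamping is a simplification chosen for this sketch; other extrapolation strategies exist.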
- Single model/scenario combo: parallel across variables
- Multiple combos: parallel across combo-variable tasks, capped globally
- Sequential mode is used when num_processes <= 1 or only one variable is requested
- Default global cap for multi-combo execution: up to 6 processes
- Inside each process, threaded downloads are used for fetch operations
cavapy includes built-in plotting helpers:
- plot_spatial_map()
- plot_time_series()
import cavapy

# range(1990, 2011) yields the years 1990 through 2010
data = cavapy.get_climate_data(country="Togo", obs=True, years_obs=range(1990, 2011))

# Spatial map of tasmax over 2000-2010
fig = cavapy.plot_spatial_map(
    data["tasmax"],
    time_period=(2000, 2010),
    title="Mean Max Temperature 2000-2010",
    cmap="Reds",
)

# Precipitation time series aggregated by sum, with a trend line
fig = cavapy.plot_time_series(
    data["pr"],
    title="Precipitation Time Series - Togo (1990-2010)",
    trend_line=True,
    ylabel="Annual Precipitation (mm)",
    aggregation="sum",
    figsize=(12, 6),
)

If your primary goal is advanced visualization/reporting, see CAVAanalytics.
- Check GitHub issues for data server outages or announcement posts.
- Set CAVAPY_NO_ANNOUNCEMENTS=1 to disable startup announcements in scripts/production runs.
- License: MIT
- Package metadata and build details: pyproject.toml

