[Talk Suggestion]: Speed up access to your NetCDF data using VirtualiZarr

**Describe the topic for the talk**
Zarr is an "Analysis Ready Cloud Optimized" (ARCO) data format that allows very fast data access for analysis. Many datasets, however, exist in NetCDF formats, which can significantly slow down analysis. Rewriting the NetCDF data to Zarr is a monstrous task that would also duplicate your data, and it would need to be redone every time you update your data. Thankfully there's a win-win-win solution! Enter VirtualiZarr, a project that maps out your NetCDF data to create virtual Zarr stores, which reference the chunks in the original files rather than copying them. This gives you Zarr-like access to NetCDF data with no data duplication, and in a way that makes it easy to update the datasets.
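
To make the "mapping out" idea concrete for the talk, here is a toy sketch in plain Python. It is not VirtualiZarr's API: the file, the `manifest` dictionary and the `read_chunk` helper are all hypothetical, and a small binary file stands in for a NetCDF file. As I understand it, the real project stores a similar path/offset/length record per chunk, but see the docs for the actual interface.

```python
import os
import tempfile

# Toy stand-in for a NetCDF file: a small header followed by two raw "chunks".
header = b"HDR:"
chunk_a = b"AAAA"
chunk_b = b"BBBBBB"

path = os.path.join(tempfile.mkdtemp(), "archive.bin")
with open(path, "wb") as f:
    f.write(header + chunk_a + chunk_b)

# Hypothetical manifest: Zarr-style chunk key -> location of the bytes
# inside the original file. Only these references are stored, not the data.
manifest = {
    "0": {"path": path, "offset": len(header), "length": len(chunk_a)},
    "1": {"path": path, "offset": len(header) + len(chunk_a), "length": len(chunk_b)},
}


def read_chunk(manifest, key):
    """Fetch one chunk by seeking into the original file."""
    ref = manifest[key]
    with open(ref["path"], "rb") as f:
        f.seek(ref["offset"])
        return f.read(ref["length"])


# The second chunk is read directly, without scanning or rewriting the file.
assert read_chunk(manifest, "1") == chunk_b
```

The point for a scientist audience is that the original file is never rewritten; only the small manifest needs regenerating when files are added or changed.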

[VirtualiZarr Docs](https://virtualizarr.readthedocs.io/en/stable/index.html)

**Describe the benefit**

I think this talk should be pitched at a higher level, focussing on the benefits from the scientist's POV (i.e., tailored to the audience). The exact implementation details would be of interest to RSEs and Data Engineers, but they are also well documented in the project docs.


**Would you be capable/willing to give the talk?**
Yes - I want to do this for our Lorenz data anyway, and it would also be a good opportunity to explore interactions with Icechunk etc. I'm also open to others taking over or being involved if interested.

**Additional comments**

_None_




[Talk Suggestion]: Speed up access to your NetCDF data using VirtualiZarr #57
