43 changes: 43 additions & 0 deletions data_access/README_load_swarm.md
@@ -0,0 +1,43 @@
Author: Maximilian Gregorius

A Jupyter notebook briefly explaining how to access and download data from ESA's Swarm mission, as part of the module Earth System Data Processing.

This readme file serves more as a lab book of my findings while working with the Swarm data than as a conventional, purely descriptive readme file,
and mostly follows the layout given in the homework assignment sheet. But first, a brief summary of the Swarm mission and of the data retrieval system VirES.


# The Swarm Mission and VirES
Swarm is ESA's first constellation mission to survey Earth's magnetic and electric fields and their temporal variations. It launched in 2013, consists of three satellites called Alpha (A), Bravo (B) and Charlie (C), and is planned to continue the survey until 2025. The satellites carry a wide range of instruments, with the magnetometers as their main focus. The primary objectives of the mission include a better understanding of core-mantle interactions and the investigation of the electric currents flowing in the ionosphere and magnetosphere. A basic overview of the mission can be found at https://earth.esa.int/eogateway/missions/swarm.

In this notebook we will use the data retrieval system VirES to get access to a vast amount of data, including data from many of ESA’s Earth Explorer missions such as Aeolus and, most importantly, Swarm. The data is organised into many different collections of measurements, auxiliary measurements and associated models, including the magnetic field, electric field, ion temperature, satellite positions and many more. A more experienced user might use this data to calculate and visualize Earth's magnetic field or to do simple space weather forecasting, but this notebook focuses mainly on getting access to the Swarm data and on downloading some geomagnetic field data. The goal is to familiarize the user with the VirES environment and with Swarm data so that they can adapt the notebook to their own needs in the future. Some basic understanding of geomagnetism is advised but not necessary, and anyone willing to learn can check out the resource below to deepen their understanding.

Understanding Earth's geomagnetic sources: https://link.springer.com/journal/11214/volumes-and-issues/206-1.


# Evaluation of VirES as a Data Portal
As described above, VirES (Virtual environments for Earth observation Scientists) is a data retrieval system that also includes a server system and a graphical web interface, which allows easy visualization and manipulation of Swarm products.
It is designed to lower the barrier of entry for scientists who want to access Swarm data. The website (https://notebooks.vires.services/docs/vre-overview#) offers a lot of basic explanations and example code, which makes it possible to get started in
a matter of minutes.
To identify the magnetic data used in the notebook, the Swarm Product Data Handbook (https://swarmhandbook.earth.esa.int/catalogue/index), which is provided directly by ESA, is used. With the appropriate filters for Swarm and magnetic data it is possible to identify a handful of products that potentially hold useful data. The VirES support page "Available parameters for Swarm" (https://viresclient.readthedocs.io/en/latest/available_parameters.html) provides a long list of available collections and measurements, which makes it possible to identify "MAG" as the desired collection for use in the homework notebook. While the resources are plentiful and everything is meticulously inter-linked, it can still be quite difficult to find the data one wants to use, as the number of options can be overwhelming. It is very helpful to have a basic idea of what you are looking for and some experience in working with geomagnetic data, in order to understand the abbreviations used for the measurements more quickly. In some cases the naming conventions of the measurements differ slightly between ESA and VirES, which may cause headaches, but the differences are usually small enough that the right measurements can still be found easily.
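
The lookup described in this section can also be done from Python. The following is only a minimal sketch, assuming the `viresclient` package is installed and an access token is already configured. `mag_collection` and `list_mag_measurements` are hypothetical helper names introduced here for illustration, and the naming pattern `SW_OPER_MAG<x>_LR_1B` for the low-rate magnetic collections is taken from the "Available parameters for Swarm" page.

```python
def mag_collection(spacecraft: str) -> str:
    """Return the low-rate magnetic collection name for Swarm A, B or C.

    Hypothetical helper: it only formats the collection name, it does not
    check with the server that the collection exists.
    """
    spacecraft = spacecraft.upper()
    if spacecraft not in ("A", "B", "C"):
        raise ValueError("spacecraft must be 'A', 'B' or 'C'")
    return f"SW_OPER_MAG{spacecraft}_LR_1B"


def list_mag_measurements():
    """Ask the VirES server which measurements the "MAG" collections offer.

    The import is done lazily because the call needs network access
    and a configured token.
    """
    from viresclient import SwarmRequest

    request = SwarmRequest()
    return request.available_measurements("MAG")
```

Calling `mag_collection("A")` gives the name that is later passed to the request, while `list_mag_measurements()` helps to match the abbreviations in the ESA handbook against what VirES actually provides.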


# Accessing Swarm Data
To access the data using VirES, an access token is required. On the first request for data the user is automatically prompted to create an account at https://vires.services/ and to set up a token. On subsequent uses of the notebook this token is
picked up automatically, which makes logging in very fast. Alternatively, it is also possible to include the user and token information directly in a few lines of code, which might be useful in some cases. Information on how VirES handles tokens can be found at https://notebooks.vires.services/notebooks/02a__intro-swarm-viresclient, and a more general overview of tokens and account creation at VirES can be found at https://viresclient.readthedocs.io/en/latest/access_token.html.
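
Setting the token in code could look roughly like the sketch below. It assumes that `viresclient` is installed and that the token has been created on the VirES website beforehand. The environment variable name `VIRES_TOKEN` and the helpers `read_token` and `store_token` are assumptions made for this example and are not part of viresclient.

```python
import os

# Server address of the Swarm VirES instance as used in the viresclient docs.
VIRES_URL = "https://vires.services/ows"


def read_token(env=os.environ, var="VIRES_TOKEN"):
    """Read a token from an environment-like mapping (hypothetical variable name).

    Returns None when the variable is missing or empty.
    """
    token = env.get(var, "").strip()
    return token or None


def store_token(token):
    """Save the token in the local viresclient configuration.

    Imported lazily so that this module can be loaded without viresclient.
    """
    from viresclient import set_token

    set_token(VIRES_URL, token=token, set_default=True)
```

Keeping the token in an environment variable rather than in the notebook itself avoids committing it to the repository by accident.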


# Downloading Swarm Data
The data is downloaded with the Python package viresclient. The user sends a request for the data in which they specify the collection, the products and the timeframe they need.
Under collection the user chooses which ESA mission and which instruments are desired; in this case this means the Swarm mission and the geomagnetic data. Each collection has a number of measurements associated with it, and the appropriate collection must be set in order to access these measurements.
Under products the user specifies the measurements they want to download, such as magnetic field intensity or the magnetic field vector. Here it is also possible to request model data and auxiliary data and to set the sampling step. Auxiliary data is not necessarily collected by the requested mission, but can be useful as supporting data. For example, one could request the auxiliary "Disturbance storm time Index" (Dst) to flag geomagnetic data as unreliable at times of strong solar activity. Several different models are available for a wide variety of objectives. The models are evaluated at the same sample points as the requested data products. For most models these evaluations are calculated server side at the time of the data request; only for the most commonly requested and computationally expensive models does the server store and use a cache of some of the model values. How VirES handles models is described at https://viresclient.readthedocs.io/en/latest/geomagnetic_models.html. In this notebook a simple model of Earth's geomagnetic field, the "International Geomagnetic Reference Field" (IGRF), is requested as a point of comparison for the satellite data. The sampling step is given as an ISO 8601 duration, with a minimum interval that depends on the collection used (usually one second).
The available timeframe depends on the collection used. The Swarm mission launched in 2013, so earlier times are not available, but data from some ground-based magnetic observatories might be available for more than a hundred years.
In this notebook the month of May 2024 will be used as an example.
The request returns a data object of type `ReturnedData`, which is basically a wrapper around a temporary CDF file that can either be written to disk directly or first be transformed into a pandas DataFrame or an xarray Dataset.
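
The request described above can be sketched as follows. This is not necessarily the exact code used in the notebook: the measurement names `F` and `B_NEC`, the collection `SW_OPER_MAGA_LR_1B` and the sampling step `PT10S` are assumptions chosen for illustration, and `build_products` and `fetch_may_2024` are hypothetical helper names. The sketch assumes that `viresclient` is installed and a token is configured.

```python
import datetime as dt


def build_products(sampling_step="PT10S"):
    """Collect the keyword arguments for SwarmRequest.set_products.

    sampling_step is an ISO 8601 duration, e.g. "PT10S" for ten seconds.
    """
    return {
        "measurements": ["F", "B_NEC"],  # field intensity and vector (assumed choice)
        "models": ["IGRF"],              # model evaluated at the same sample points
        "auxiliaries": ["Dst"],          # disturbance storm time index
        "sampling_step": sampling_step,
    }


def fetch_may_2024(collection="SW_OPER_MAGA_LR_1B"):
    """Request one month of Swarm Alpha magnetic data from VirES.

    Needs network access, so viresclient is imported lazily.
    """
    from viresclient import SwarmRequest

    request = SwarmRequest()
    request.set_collection(collection)
    request.set_products(**build_products())
    return request.get_between(
        start_time=dt.datetime(2024, 5, 1),
        end_time=dt.datetime(2024, 6, 1),
    )
```

The object returned by `fetch_may_2024()` can then be converted with `as_dataframe()` or `as_xarray()`, or written to disk with `to_file("swarm_may_2024.cdf")`, where the file name is only an example.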


# Visualize Geomagnetic Data
The notebook provides a simple scatter plot to compare the downloaded satellite data to the IGRF model. This approach only works for smaller data sizes and is only meant to demonstrate how the data may look when mapped onto the globe, and to show a few common problems the user might encounter when working with this kind of data. For geomagnetic data these are mainly empty data columns and outliers outside of a physically reasonable range, which need to be dealt with before the data is processed further.
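
A simple way to deal with missing values and outliers before plotting is sketched below in plain Python. The function name `clean_intensity` and the bounds of 15000 nT and 70000 nT are assumptions made for this example and not values taken from the notebook; they should be adapted to the data at hand. With a pandas DataFrame the same filter could be expressed as a boolean mask.

```python
import math


def clean_intensity(records, f_min=15000.0, f_max=70000.0):
    """Drop records whose field intensity "F" (in nT) is missing or implausible.

    records is a list of dicts with an "F" entry; the default bounds are
    rough assumptions and not official quality thresholds.
    """
    kept = []
    for rec in records:
        f = rec.get("F")
        if f is None or math.isnan(f):
            continue  # empty value
        if not (f_min <= f <= f_max):
            continue  # outlier outside the assumed physical range
        kept.append(rec)
    return kept


sample = [{"F": 45000.0}, {"F": float("nan")}, {"F": 0.0}, {"F": 1e9}, {}]
cleaned = clean_intensity(sample)
```

In this sample only the first record survives; the NaN, the zero, the unrealistically large value and the record without an `F` entry are all removed.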


# Scaling Data Access
It should be relatively simple to scale up the amount of data requested by changing the parameters of the request. The user can add or remove collections, measurements, auxiliary measurements and models, or change the sampling step and time window; regardless of the requested data size, the result is a single CDF file. Requesting data for up to 100 years and many measurements could therefore lead to an abnormally large CDF file. The largest file downloaded during the testing of the notebook was around three GB, but how VirES reacts to much larger files is untested. It might be a good idea to split the request into smaller "subrequests" in order not to overburden the downloader and to avoid having to deal with a single CDF file or xarray object of several hundred GB. The most obvious idea is to split the requests by date, but depending on the project it might be more useful to split by measurement or by collection. It really depends on the use case.
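
Splitting a request by date could be sketched like this. `split_time_range` is a hypothetical helper that uses only the standard library, and the chunk length of seven days is an arbitrary assumption. It has not been tested against very large requests.

```python
import datetime as dt


def split_time_range(start, end, step=dt.timedelta(days=7)):
    """Split [start, end) into consecutive chunks of at most `step` length.

    Returns a list of (chunk_start, chunk_end) tuples that cover the whole
    range without gaps or overlaps.
    """
    if step <= dt.timedelta(0):
        raise ValueError("step must be positive")
    chunks = []
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end
    return chunks


chunks = split_time_range(dt.datetime(2024, 5, 1), dt.datetime(2024, 6, 1))
```

Each `(chunk_start, chunk_end)` pair can then be passed to its own data request and written to its own file, so that no single download or file grows too large.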