Eurec4a test by chitvansingh03 · Pull Request #216 · atmdrops/pydropsonde

chitvansingh03 · 2025-08-06T16:12:47Z

I have modified some of the files so as to process EUREC4A's HALO Level 0 data using pydropsonde. The first change listed below needs to be generalized so that both EUREC4A and ORCESTRA can be processed with the same code.

Change in processor.py - modified the method get_circle_times_from_segmentation() so that the 'flight_id' corresponds to the 'flight_id's in EUREC4A's flight segmentation file (type 'Platform-MMDD'). These are not the same as the flight id that pydropsonde derives from the Level 0 data (type 'YYYYMMDD'), so this change may not work for ORCESTRA. It needs to be generalized so that both ORCESTRA and EUREC4A can be processed using the same code.
Change in rawreader.py - modified the opening and reading of A-files. Earlier it strictly read only UTF-8 characters; I changed it so that if a non-UTF-8 character comes up, the error is ignored and reading proceeds. (This problem occurred in the P3 files, and after making this change it magically worked, until another error came up.)
Added 2 more files relevant for EUREC4A - a segmentation file and a config file I made (the path needs to be modified).
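For illustration, the relaxed A-file reading described above could look roughly like the sketch below. The function name `read_a_file_lines` is hypothetical and is not part of pydropsonde; the only thing it demonstrates is the standard `errors="ignore"` option of Python's built-in `open()`.

```python
def read_a_file_lines(path, encoding="utf-8"):
    """Return the lines of a text file, skipping undecodable bytes.

    Hypothetical helper, not pydropsonde's actual reader.
    """
    # errors="ignore" drops any byte sequence that is not valid in the
    # given encoding instead of raising UnicodeDecodeError.
    with open(path, "r", encoding=encoding, errors="ignore") as f:
        return f.read().splitlines()
```

Note that ignoring errors drops the offending bytes silently, so a corrupted value in a broken file can go unnoticed. Passing `errors="replace"` instead would keep a visible U+FFFD marker at each damaged position.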
Thank you
Chitvan Singh

tmieslinger · 2025-09-04T14:49:02Z

Thank you for testing pydropsonde on the EUREC4A dropsonde data!
This PR covers three things. We've already discussed some offline and I'll try to summarise my suggestions below:

Reading some of the A-files from P3 results in errors. This was actually a very good discovery, as some of the P3 files are indeed broken with bit-flips and cannot be decoded properly with ASCII. Instead of catching but ignoring the error message, I implemented a change in the reading, which is detailed in the respective PR "change reading code for a-file to handle P3 cases" (#217).
The EUREC4A or JOANNE Level 0 data (raw files and folder names) do not directly fit into the input scheme of pydropsonde. Also, some files are duplicated (A-files) or partly duplicated (files with a short naming scheme). I would generally suggest that we improve the sparse pydropsonde documentation to clarify what the input should look like. In the end, the cleaning of duplicate files is independent of pydropsonde and should be done beforehand. More complicated is the folder naming: pydropsonde assumes that the folder which includes all files from a measurement flight has a name that uniquely identifies this flight (or measurement sequence). Accordingly, the folder name is written to the flight_id variable used within pydropsonde and also to the output files from Level 2 onwards. Your suggestion to extract the flight_id later on in the processing from a flight segmentation file is a neat solution, but it would only work for the Level 4 dataset, leading to an inconsistency in flight_id between Level 2, 3, and 4. Also, pydropsonde is designed to do additional QC (Level 2) and concatenate single profile measurements that are meant to be evaluated together (Level 3). It includes further good things like adding derived variables. All of that is independent of a flight segmentation and of the possibility to combine certain profiles into mesoscale products (e.g. omega, Level 4). In fact, for most datasets that pydropsonde could be applied to, there is likely no flight segmentation information available. Therefore, I'd suggest updating the documentation to clarify that the folder names must be such that they uniquely identify a flight, and also recommending that users preferably make use of the official campaign-wide flight IDs.
The EUREC4A config files would be super cool to have! However, I'd suggest adding them not here, but to the orcestra-campaign/dropsondes repository. The current repo is meant to cover only the pydropsonde code, including minimalistic example data and a respective config to run tests on the code. Config files and further information on specific campaigns and their datasets are better placed in separate repos. For the EUREC4A case, I think it's a good option to add it to the ORCESTRA dropsondes, as the above linked repo is meant to include everything needed to reproduce the processing that we do for the final ORCESTRA dropsonde datasets and the respective data paper. We include a comparison to JOANNE anyway, so it would be perfect to further compare the JOANNE profiles to reprocessed EUREC4A profiles with the most recent pydropsonde version :)
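To make the flight_id mismatch from the second point concrete, here is a minimal sketch that converts a date-style id ('YYYYMMDD') into the 'Platform-MMDD' style of the segmentation file and checks a set of ids for uniqueness. Both function names are hypothetical and not part of pydropsonde, and the sketch assumes at most one flight per platform and day.

```python
from collections import Counter
from datetime import datetime


def to_segmentation_flight_id(platform, date_id):
    """Turn a 'YYYYMMDD' id into 'Platform-MMDD' (hypothetical helper)."""
    # strptime raises ValueError if date_id is not a valid YYYYMMDD date.
    date = datetime.strptime(date_id, "%Y%m%d")
    return f"{platform}-{date:%m%d}"


def duplicated_ids(flight_ids):
    """Return the sorted ids that occur more than once."""
    counts = Counter(flight_ids)
    return sorted(fid for fid, n in counts.items() if n > 1)
```

A mapping like this only helps at the stage where a segmentation file is available. Naming the input folders with the campaign-wide flight IDs from the start keeps the flight_id consistent across all levels.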

Overall, this was a super helpful test, and I would appreciate it if you added the EUREC4A config to the linked repo and opened a PR there. Thanks again for all the work!

chitvansingh03 added 10 commits July 27, 2025 17:52

print(gridded) & print(eerror) added in create_and_populate_circle & concat_circle functions respectively. And 4 new files added which are to be removed

975475b

print statements added in get_circle_times_from_segmentation and create_and_populate_flight_object, to check if their output is empty or not. turns out they are empty!"

f0f5ff7

In processeor.py - get_circle..._from_segmentation, the function modified for EUREC4A needs. pipeline.py - create_and_populate_circle_object , the print and saving statement updated.

574ae22

create_populate_circle_object - saving test file removed

938aafb

rawreader.py - in check_launch_detect_in_a_file relaxed file reading, processor.py - generalised finding flight_id for P3 and HALO

91dd19b

reading of A_files relaxed from just UTF-8

139b963

unnecesary changes from origin commented out

0840fb6

Final Changes - 1) In rawreader.py - relaxed enconding required to read A- files in all functions doing so 2) In processor.py - modified get_circle_times_from_segmentation() for eurec4a data, it won't work on orchestra.

1b3233f

finishing touches. Only processor.py and rawreader.py are modified

94b3f1a

unnecesary files removed and config file for eurec4a added

2f68924

chitvansingh03 wants to merge 10 commits into atmdrops:main from chitvansingh03:eurec4a_test
