Update for DAAC inputs #46

Open
jfahlen wants to merge 22 commits into emit-sds:dev from jfahlen:update_for_DAAC_inputs

jfahlen commented Sep 3, 2025

Use spec_io.py from SpectralUtil (see https://github.com/jfahlen/SpectralUtil/tree/Updates_for_DAAC_parallel_mf and the associated PR) for generic input file I/O, so that the code runs on either DAAC or local cluster inputs.

The code does not produce identical outputs for the same case when run on DAAC inputs versus local cluster inputs, because the input data differ slightly: the SZA, H2O, and elevation values vary, and the wavelength and FWHM inputs are single precision in one case and double precision in the other. In a test on the same EMIT FID with those five inputs hardcoded to be identical, the code does produce identical outputs.
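To illustrate the precision point only, here is a minimal sketch of how storing a wavelength grid in single rather than double precision perturbs the values. The grid below is made up for illustration and is not the actual EMIT wavelength grid; `to_float32` is a hypothetical helper, not part of this PR.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical wavelength grid in nm (illustrative values, an assumption).
wavelengths_f64 = [381.00558 + 7.4493 * i for i in range(5)]
wavelengths_f32 = [to_float32(w) for w in wavelengths_f64]

# Largest perturbation introduced purely by the change in storage precision.
max_abs_diff = max(abs(a - b) for a, b in zip(wavelengths_f64, wavelengths_f32))
print(f"max |float64 - float32| = {max_abs_diff:.3e} nm")
```

The perturbation is tiny but nonzero, which is enough to prevent bit-identical outputs downstream.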

NOTE: I don't know how best to incorporate spec_io, so I just added it to my Python path. This is probably not what we want, but I'm not sure how to proceed. Should SpectralUtil be turned into a Python module and added to the EMIT-SDS repos?

IMPORTANT: It's not clear to me how to handle the different input file groups. DAAC-derived files are organized differently from local cluster files. For example, there is no l1b_bandmask DAAC file; that field is stored in the L2A_MASK file from the DAAC. Other similar differences mean the same file is passed in multiple times. This also means that the help text for the argparse inputs is not properly labelled. I'm not sure what to do about that.

Local ENVI file input order to ghg_process:
rdn
obs
loc
glt
l1b_bandmask
l2a_mask
OUTFILENAME
--state_subs l2a_statesubs

DAAC input order to ghg_process:
RAD
OBS
RAD
OBS
L2A_MASK
L2A_MASK
OUTFILENAME
--state_subs L2A_MASK
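As a rough sketch of the two input orders listed above, the mapping from positional slot to DAAC product could be written down once and used to build the argument list. The names `SLOTS`, `DAAC_SLOT_SOURCE`, and `build_ghg_args` are hypothetical and not part of this PR; only the ordering is taken from the lists above.

```python
from typing import Dict, List

# Positional slots of ghg_process, in the local-cluster order listed above.
SLOTS = ["rdn", "obs", "loc", "glt", "l1b_bandmask", "l2a_mask"]

# Which DAAC product fills each slot, following the DAAC order listed above.
DAAC_SLOT_SOURCE = {
    "rdn": "RAD",
    "obs": "OBS",
    "loc": "RAD",
    "glt": "OBS",
    "l1b_bandmask": "L2A_MASK",
    "l2a_mask": "L2A_MASK",
}


def build_ghg_args(files: Dict[str, str], out: str, daac: bool) -> List[str]:
    """Return the argument list for ghg_process (hypothetical helper).

    For DAAC inputs, `files` is keyed by product (RAD, OBS, L2A_MASK).
    For local inputs, `files` is keyed by slot name plus l2a_statesubs.
    """
    if daac:
        positional = [files[DAAC_SLOT_SOURCE[slot]] for slot in SLOTS]
        state_subs = files["L2A_MASK"]
    else:
        positional = [files[slot] for slot in SLOTS]
        state_subs = files["l2a_statesubs"]
    return positional + [out, "--state_subs", state_subs]


daac_files = {"RAD": "rad.nc", "OBS": "obs.nc", "L2A_MASK": "mask.nc"}
print(build_ghg_args(daac_files, "OUTFILENAME", daac=True))
```

Keeping the mapping in one place makes the repeated files explicit, though it does not by itself fix the argparse help labels.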

Example local cluster call:
srun -p debug -N 1 -c 64 --mem=300G --pty python ghg_process.py /store/emit/ops/data/acquisitions/20240409/emit20240409t074435/l1b/emit20240409t074435_o10005_s000_l1b_rdn_b0106_v01.img /store/emit/ops/data/acquisitions/20240409/emit20240409t074435/l1b/emit20240409t074435_o10005_s000_l1b_obs_b0106_v01.img /store/emit/ops/data/acquisitions/20240409/emit20240409t074435/l1b/emit20240409t074435_o10005_s000_l1b_loc_b0106_v01.img /store/emit/ops/data/acquisitions/20240409/emit20240409t074435/l1b/emit20240409t074435_o10005_s000_l1b_glt_b0106_v01.img /store/emit/ops/data/acquisitions/20240409/emit20240409t074435/l1b/emit20240409t074435_o10005_s000_l1b_bandmask_b0106_v01.img /store/emit/ops/data/acquisitions/20240409/emit20240409t074435/l2a/emit20240409t074435_o10005_s000_l2a_mask_b0106_v01.img /store/jfahlen/test/test_baseline/test --state_subs /store/emit/ops/data/acquisitions/20240409/emit20240409t074435/l2a/emit20240409t074435_o10005_s000_l2a_statesubs_b0106_v01.img --overwrite

Example DAAC call:
srun -p debug -N 1 -c 64 --mem=300G --pty python ghg_process.py /store/jfahlen/test/daac_data_for_test/granules/EMIT_L2B_CH4ENH_001_20240409T074435_2410005_007/EMIT_L1B_RAD_001_20240409T074435_2410005_007.nc /store/jfahlen/test/daac_data_for_test/granules/EMIT_L2B_CH4ENH_001_20240409T074435_2410005_007/EMIT_L1B_OBS_001_20240409T074435_2410005_007.nc /store/jfahlen/test/daac_data_for_test/granules/EMIT_L2B_CH4ENH_001_20240409T074435_2410005_007/EMIT_L1B_RAD_001_20240409T074435_2410005_007.nc /store/jfahlen/test/daac_data_for_test/granules/EMIT_L2B_CH4ENH_001_20240409T074435_2410005_007/EMIT_L1B_OBS_001_20240409T074435_2410005_007.nc /store/jfahlen/test/daac_data_for_test/granules/EMIT_L2B_CH4ENH_001_20240409T074435_2410005_007/EMIT_L2A_MASK_001_20240409T074435_2410005_007.nc /store/jfahlen/test/daac_data_for_test/granules/EMIT_L2B_CH4ENH_001_20240409T074435_2410005_007/EMIT_L2A_MASK_001_20240409T074435_2410005_007.nc /store/jfahlen/test/test_daac/test --overwrite --state_subs /store/jfahlen/test/daac_data_for_test/granules/EMIT_L2B_CH4ENH_001_20240409T074435_2410005_007/EMIT_L2A_MASK_001_20240409T074435_2410005_007.nc

jfahlen commented Sep 11, 2025

Now updated to include running AV3 from DAAC inputs. This was trickier than I anticipated because the AV3 L1B MASK files (which are akin to the EMIT L2A MASK files) are not delivered to the DAAC. We therefore need to compute them before running the ghg_process code.

The L1B MASK code is called make_emit_masks.py. I copied it into this repo so that it could be modified to accept DAAC inputs using spec_io. It requires helper code from ISOFIT. Not knowing how best to handle this, I simply copied the relevant files from the ISOFIT library and included them here so that the repo is self-contained. Aside from changed import statements, the ISOFIT files are unmodified.

I also needed to include an AV3 noise file. I took /store/airborne/aviris-3/2025/20250715/AV320250715t180536/l1b_rdn/AV320250715t180536_001_L1B_RDN_76a54582_NOISE.txt and added it to the repo.

To facilitate running the three different modes of ghg_process, I added run_ghg_process.py. It can be used as a reference for which types of DAAC files should be provided to ghg_process. For example, compare the ghg_process command in run_ghg_process_EMIT_cluster with those in run_ghg_process_EMIT_DAAC and run_ghg_process_AV3_DAAC. The run_ghg_process_AV3_DAAC function also demonstrates how to call make_emit_masks.py. Note that run_ghg_process.py will not run for most people, because it assumes that the DAAC data were downloaded using my earthaccess_helpers_EMIT and earthaccess_helpers_AV3 from emit-sds/SpectralUtil#21, which has not yet been accepted.

I tested this code on AV320250715t180536_001 and compared the MF, UNC, and SNS outputs. These are all within about 0.2-0.3 of each other. Here are the calls:
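For reference, a comparison like the one described above could be sketched as a maximum absolute difference over valid pixels. This is only an illustration under assumptions: `max_abs_diff` is a hypothetical helper, the values are made up, and masked pixels are assumed to be stored as NaN, which may not match the actual output files.

```python
import math
from typing import Iterable


def max_abs_diff(a: Iterable[float], b: Iterable[float]) -> float:
    """Largest |a_i - b_i| over pairs where both values are finite."""
    diffs = [
        abs(x - y)
        for x, y in zip(a, b)
        if math.isfinite(x) and math.isfinite(y)
    ]
    return max(diffs) if diffs else 0.0


# Made-up values standing in for flattened outputs from two runs.
run_a = [12.1, float("nan"), -3.4, 250.0]
run_b = [12.3, 5.0, -3.5, 249.75]
print(max_abs_diff(run_a, run_b))
```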

srun -p debug -N 1 -c 64 --mem=300G --pty python make_emit_masks.py /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/AV320250715t180536_001_L1B_RDN_4842d6a3_RDN.nc /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/AV320250715t180536_001_L1B_RDN_4842d6a3_RDN.nc /store/airborne/software/asds_data/main/l1b_rdn/kurucz_0.1nm.dat /store/jfahlen/test/AV320250715t180536/local/AV320250715t180536_001/L1B_MASK

srun -p debug -N 1 -c 1 --mem=30G --pty python ghg_process.py /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/AV320250715t180536_001_L1B_RDN_4842d6a3_RDN.nc /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/AV320250715t180536_001_L1B_ORT_57c4e387_OBS.nc /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/AV320250715t180536_001_L1B_RDN_4842d6a3_RDN.nc /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/AV320250715t180536_001_L1B_ORT_57c4e387_OBS.nc /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/AV320250715t180536_001_L1B_RDN_4842d6a3_BANDMASK.nc /store/jfahlen/test/AV320250715t180536/local/AV320250715t180536_001/L1B_MASK /store/jfahlen/test/AV320250715t180536/local/AV320250715t180536_001/ --overwrite --noise_file /home/jfahlen/src/emit-ghg/instrument_noise_parameters/AV320250715t180536_001_L1B_RDN_76a54582_NOISE.txt

These were created with python run_ghg_process.py AV3_DAAC AV320250715t180536 /store/jfahlen/test/AV320250715t180536/granules/AV320250715t180536_001_L2B_GHG_1/ /store/jfahlen/test/AV320250715t180536/local/AV320250715t180536_001/ --execute
