topsStack computeCoherence unnecessarily accesses burst_0*.slc.vrt in DataAccessorPy.py, producing HPC problems #247

Open
falkamelung opened this issue Mar 4, 2021 · 3 comments

@falkamelung
Contributor

We ran into trouble using topsStack on HPC (TACC's Stampede2): running more than ~200 ISCE processes simultaneously put too heavy an I/O load on the shared disk. The admins advised us to copy all required input files for each step to a node-local /tmp disk. This works fine for all processing steps except run_10_filter_coherence, where we get the error:

GDAL open (R): /tmp/merged/SLC/20160629/20160629.slc.full.vrt
ERROR 4: /tmp/merged/SLC/20160629/../../../coreg_secondarys/20160629/IW1/burst_01.slc.vrt: No such file or directory
Error. Cannot open the file /tmp/merged/SLC/20160629/20160629.slc.full.vrt in read mode.
Error in file /home/conda/feedstock_root/build_artifacts/isce2_1605839897087/work/build/components/iscesys/ImageApi/InterleavedAccessor/src/GDALAccessor.cpp at line 77 Exiting

This error occurs in runBurstIfg.py, in the slc2.createImage() call. More precisely, it fails in DataAccessorPy.py. I have no solution, but it is similar to #245 (comment).

To be clear, the processing works fine on /scratch. This happens only when using two disks. Here is our modified config_igram_filt_coh_*:

cat config_igram_filt_coh_20160605_20160629
##########################
###################################
[Function-1]
FilterAndCoherence :
input : /tmp/merged/interferograms/20160605_20160629/fine.int
filt : /scratch/05861/tg851601/GalapagosSenDT128/merged/interferograms/20160605_20160629/filt_fine.int
coh : /scratch/05861/tg851601/GalapagosSenDT128/merged/interferograms/20160605_20160629/filt_fine.cor
strength : 0.2
slc1 : /tmp/merged/SLC/20160605/20160605.slc.full
slc2 : /tmp/merged/SLC/20160629/20160629.slc.full
complex_coh : /scratch/05861/tg851601/GalapagosSenDT128/merged/interferograms/20160605_20160629/fine.cor
range_looks : 15
azimuth_looks : 5

It reads fine.int from the node-local /tmp but writes all output files to the shared /scratch. In the same way, it reads 20160605.slc.full from /tmp but writes fine.cor to /scratch. The error occurs when it tries to access burst_01.slc.vrt. Copying this file to /tmp is not possible because it also wants burst*.slc and azimuth_*off*, which require too much space on /tmp.
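The error message shows that 20160629.slc.full.vrt refers to the burst files through a path relative to the VRT's own directory, which is why moving only the merged VRT to /tmp breaks it. A minimal sketch for checking this before launching a job, assuming the VRT uses GDAL's standard `<SourceFilename relativeToVRT="1">` elements; the helper names `vrt_sources` and `missing_sources` are hypothetical and not part of ISCE:

```python
import os
import xml.etree.ElementTree as ET


def vrt_sources(vrt_path):
    """Return normalized paths of every file a VRT references.

    Paths flagged relativeToVRT="1" are resolved against the directory
    that contains the VRT, as GDAL does when it opens the file.
    """
    vrt_dir = os.path.dirname(os.path.abspath(vrt_path))
    sources = []
    for elem in ET.parse(vrt_path).iter("SourceFilename"):
        name = (elem.text or "").strip()
        if elem.get("relativeToVRT", "0") == "1":
            name = os.path.join(vrt_dir, name)
        sources.append(os.path.normpath(name))
    return sources


def missing_sources(vrt_path):
    """Return the referenced files that do not exist on this node."""
    return [p for p in vrt_sources(vrt_path) if not os.path.exists(p)]
```

Calling `missing_sources("/tmp/merged/SLC/20160629/20160629.slc.full.vrt")` on the compute node would list the burst VRTs that were left behind. It only inspects one level; a burst VRT that exists may itself reference further files.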

My question:

Does it really need the burst_0*.slc.vrt files to calculate the complex coherence? If not, how can we tell ISCE not to open these files?

If this can be resolved we can run ~5000 processes (or more) simultaneously, so this is important.

Our python scripts for job submission on HPC and copying to local disk are on a public GitHub (MinSAR) but still messy. We plan to share asap.

@piyushrpt
Member

Access to these files is not unnecessary.

The merged VRT points to a collection of individual VRTs. These individual VRTs are the only interpretation of the data type and layout of the bursts in the raw file. This lets you mosaic data in any format: flat files, TIFFs, etc.

In your workflow, if you are already paying the extra cost of generating a physical .slc.full, you can overwrite the slc.vrt file with an image.renderVRT() call after your gdal_translate or gdal.Translate() call. ISCE is seeing a .vrt which still points to pieces from individual bursts. Once you have the full file, update the .vrt to point to the merged physical file.
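For illustration, a VRT that describes one flat binary file directly, rather than a mosaic of burst VRTs, could be generated as follows. This is a minimal sketch using GDAL's raw-band VRT layout and is not the output of image.renderVRT(); the function name `raw_slc_vrt`, the image sizes, and the assumption of little-endian complex64 samples are all illustrative.

```python
from xml.sax.saxutils import escape

BYTES_PER_PIXEL = 8  # assumption: complex64 (GDAL CFloat32) samples


def raw_slc_vrt(raw_name, width, length):
    """Build VRT text describing a single flat complex64 file.

    raw_name is written with relativeToVRT="1", so the VRT and the
    binary file must sit in the same directory.
    """
    line_bytes = width * BYTES_PER_PIXEL
    return (
        f'<VRTDataset rasterXSize="{width}" rasterYSize="{length}">\n'
        '  <VRTRasterBand dataType="CFloat32" band="1" '
        'subClass="VRTRawRasterBand">\n'
        f'    <SourceFilename relativeToVRT="1">{escape(raw_name)}'
        '</SourceFilename>\n'
        '    <ByteOrder>LSB</ByteOrder>\n'
        '    <ImageOffset>0</ImageOffset>\n'
        f'    <PixelOffset>{BYTES_PER_PIXEL}</PixelOffset>\n'
        f'    <LineOffset>{line_bytes}</LineOffset>\n'
        '  </VRTRasterBand>\n'
        '</VRTDataset>\n'
    )
```

A VRT of this form has no references outside its own directory, so it can be copied to another disk together with the binary file it describes.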

@falkamelung
Contributor Author

Thank you. We don't have the *.slc.full, so what you are suggesting is not possible.

MintPy does not use the complex coherence. We therefore would like to add an option to make this calculation optional, probably stackSentinel.py --no_complex_coherence.

Does anybody know what the complex coherence is used for and whether it is used in routine workflows? It was introduced recently: #97. Alternatively, instead of introducing an option to skip it, we could change the default back to what it was prior to this PR and introduce an option to calculate it: stackSentinel.py --complex_coherence. Any comments/thoughts/objections?

@yunjunz
Member

yunjunz commented Mar 17, 2021

I like the idea of stackSentinel.py --cpx_coh / --complex_coherence. The default setting would then use less disk space and less computing time, which is probably the more common scenario.
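The opt-in flag proposed in this thread could look roughly like the sketch below. It is not the actual stackSentinel.py code: `create_parser` and `filter_coherence_config` are hypothetical stand-ins, and only the option names and config keys are taken from the discussion and the config file above.

```python
import argparse


def create_parser():
    """Hypothetical stand-in for the stackSentinel.py argument parser."""
    parser = argparse.ArgumentParser(description="opt-in complex coherence")
    parser.add_argument(
        "--cpx_coh", "--complex_coherence",
        dest="complex_coherence", action="store_true",
        help="also compute the complex coherence from the full SLCs "
             "(off by default)",
    )
    return parser


def filter_coherence_config(args, ifg, filt, coh, slc1, slc2, cpx_coh):
    """Return the FilterAndCoherence entries as a dict.

    The SLC inputs and the complex coherence output are added only when
    the flag is set, so the default job never opens the SLC VRTs.
    """
    cfg = {"input": ifg, "filt": filt, "coh": coh, "strength": 0.2}
    if args.complex_coherence:
        cfg.update({"slc1": slc1, "slc2": slc2, "complex_coh": cpx_coh})
    return cfg
```

With no flag, the resulting config carries no slc1/slc2 entries, which matches the behavior requested for the node-local /tmp workflow.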
