Indexing of ASTER LP DAAC Datasets to datacube #213
ashoka1234 wants to merge 15 commits into develop
Codecov Report
@@ Coverage Diff @@
## develop #213 +/- ##
===========================================
- Coverage 66.8% 66.72% -0.08%
===========================================
Files 42 42
Lines 3190 3189 -1
===========================================
- Hits 2131 2128 -3
- Misses 1059 1061 +2
@ashoka1234, @omad I have a few questions/comments:
Thanks @Kirill888 for review comments. Answers to questions respectively:
Scanning about 100 It seems '
@ashoka1234 that's another datacube limitation. I think the most sane way is to have exactly the same set of bands across both product and dataset definitions.
I see 2 possible workarounds here:
The advantage of option (1) is that it is simple to verify the correctness of the yaml, as you don't need to keep track of which netcdf variable ended up with which band index in the VRT. Option (2) is basically a workaround for datacube not having an option to indicate that a particular band is missing for this particular dataset.
@Kirill888
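As for consistency of band mappings, the code currently relies on consistent ordering of subdatasets returned by the corresponding GDAL query. All the dataset files I checked seemed to have the same consistent order! But maybe we should not rely on this GDAL feature.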
Thanks Ashoka, looks pretty good! I still haven't actually tested it in practice, have some comments and a PR first.
"the code currently relies on consistent ordering of subdatasets returned by the corresponding GDAL query. All the dataset files I checked seemed to have the same consistent order! But maybe we should not rely on this GDAL feature."
Is there any check done to ensure the ordering is consistent? We really don't want bands accidentally mapped wrongly, and need to error out instead.
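One way to get the "error out instead" behaviour is to match bands by name rather than by position. The following is a minimal sketch, not the code in this PR: the function name `map_bands`, the example subdataset strings, and the assumption that the band name is the last colon-separated field of a GDAL subdataset name are all hypothetical and would need checking against real ASTER files.

```python
from typing import Dict, Sequence


def map_bands(expected: Sequence[str], subdatasets: Sequence[str]) -> Dict[str, str]:
    """Map each expected band name to exactly one subdataset name.

    Matching is done on the last colon-separated field of the subdataset
    name (an assumption about the naming scheme), so the order in which
    GDAL happens to list subdatasets does not matter.
    """
    mapping = {}
    for band in expected:
        matches = [name for name in subdatasets if name.split(":")[-1] == band]
        if len(matches) != 1:
            # Refuse to guess: a missing or ambiguous band is an error.
            raise ValueError(
                f"band {band!r}: expected exactly one subdataset, found {len(matches)}"
            )
        mapping[band] = matches[0]
    return mapping


# Hypothetical subdataset names, listed in two different orders.
subs = [
    'HDF4_EOS:EOS_SWATH:"example.hdf":VNIR_Swath:ImageData2',
    'HDF4_EOS:EOS_SWATH:"example.hdf":VNIR_Swath:ImageData1',
]
same = map_bands(["ImageData1", "ImageData2"], subs)
also_same = map_bands(["ImageData1", "ImageData2"], list(reversed(subs)))
```

With this shape, a file whose subdatasets come back in a different order produces the same mapping, and a file that lacks a band raises instead of silently shifting band numbers.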
Yes, for consistency I sort the bands defined in the global constant
Replace this review with a new review. All the requested changes are done.
Refactoring, cleanup and vrt generation
click indexing, dataset doc, vrt generation, product def working - initial version
indexing with vrt file works with absolute path names for bands
…DAL order of subdatasets for band name to band number mapping.
OK sure will do! I thought it was unnecessary for scripts tests to run on Travis!
NP. I think that if we've gone to the effort of writing tests, we might as well do a bit more and have them run all the time in case anything breaks. :)

Reason for this PR
ASTER data consists of visible and near-infrared (VNIR) frequencies
with three bands at 15-meter resolution, short-wave infrared (SWIR)
frequencies with six bands at 30-meter resolution, and thermal infrared (TIR)
wavelengths with five bands at 90-meter resolution. ASTER SWIR data acquired since April 2008 are not usable: they show saturation of values and severe striping. CRS definitions are not available directly
from the datasets but can be computed from other information the datasets provide. This PR attempts to compute CRS definitions and index ASTER datasets.
Proposed Solution
- CRS definition is computed per original dataset, while GeoTransform is computed per sensor, i.e. per datacube product.
- … indexed dataset points to a corresponding VRT file, which defines the bands for the product and refers to the original .hdf file.
- … modis indexing scripts with additional commands for VRT file generation. Thus there are separate commands for deriving and adding a product definition, generating a VRT file, and generating and indexing a dataset, each for a particular hdf file and a particular product. A separate bash script is provided to index multiple files.
- … hdf files use absolute path names, but indexed datasets use relative path names to the VRT files. This is due to a GDAL/rasterio library restriction.
- … PRODUCTS.
- … include or exclude are defined by means of common prefixes in a global constant EXTRA_METADATA_PREFIXES.
Tests
- … aster_lpdaac (use psql -h agdcdev-db.nci.org.au -d aster_lpdaac) for this work, and a number of datasets are indexed corresponding to three products: aster_l1t_vnir, aster_l1t_swir, and aster_l1t_tir.
- … (scripts_tests module), and a basic form of VRT schema was derived, with tests provided for VRT file content validation (in memory).
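To make the VRT generation and in-memory validation idea concrete, here is a minimal sketch using only the standard library. It is not the PR's implementation: `build_vrt` and `validate_vrt` are hypothetical names, the source strings are placeholders, and a real VRT would also carry `SRS` and `GeoTransform` elements, which are omitted here for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def build_vrt(width: int, height: int, sources: Sequence[str],
              data_type: str = "Byte") -> str:
    """Build a VRT document with one VRTRasterBand per source, in order."""
    root = ET.Element("VRTDataset", rasterXSize=str(width), rasterYSize=str(height))
    for number, source in enumerate(sources, start=1):
        band = ET.SubElement(root, "VRTRasterBand", dataType=data_type, band=str(number))
        simple = ET.SubElement(band, "SimpleSource")
        # relativeToVRT="0": the source is given as an absolute path/name.
        filename = ET.SubElement(simple, "SourceFilename", relativeToVRT="0")
        filename.text = source
        ET.SubElement(simple, "SourceBand").text = "1"
    return ET.tostring(root, encoding="unicode")


def validate_vrt(xml_text: str, expected_bands: int) -> None:
    """Check basic VRT structure in memory; raise ValueError on any problem."""
    root = ET.fromstring(xml_text)
    if root.tag != "VRTDataset":
        raise ValueError("root element is not VRTDataset")
    bands = root.findall("VRTRasterBand")
    numbers = [int(band.get("band", "0")) for band in bands]
    if numbers != list(range(1, expected_bands + 1)):
        raise ValueError(f"unexpected band numbering: {numbers}")
    for band in bands:
        filename = band.find("SimpleSource/SourceFilename")
        if filename is None or not filename.text:
            raise ValueError("band without a source filename")


# Placeholder source names standing in for HDF subdataset references.
vrt_xml = build_vrt(100, 80, ["/data/example.hdf:band_a", "/data/example.hdf:band_b"])
validate_vrt(vrt_xml, expected_bands=2)
```

Keeping both steps as pure string-in, string-out functions means the tests never need to touch the filesystem or GDAL.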