Releases: MAAP-Project/gedi-subsetter

0.14.0

16 Apr 21:25
d26b452

Added

  • Add support for running in the MAAP DPS via CWL (#158)

Removed

  • Breaking change: Remove Scalene support, with no planned replacement. This is technically a breaking change because the algorithm no longer defines a --scalene-args input, but end users were not using it, so it is very unlikely to cause breakage in practice.

0.13.0

30 Sep 17:23
399237b

Added

  • Add support for indexing and slicing 2D datasets (#48)

Removed

  • Remove use of returns library (#3)

Full Changelog: 0.12.0...0.13.0

0.12.0

03 Jul 22:31
e88bc7c

Changed

  • Add "requester pays" flag for reading granule data files from S3 (#138)
  • Add tolerated_failure_percentage input to control percentage of individual granule failures to tolerate before failing a job. Default tolerance is 0 (i.e., fail fast), thus any single failure will immediately fail the job. (#62)
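
The tolerance check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function name exceeds_tolerance and its parameters n_failed and n_total are hypothetical, and the strict "greater than" comparison is an assumption chosen so that the default of 0 fails the job on the first failure.

```python
def exceeds_tolerance(
    n_failed: int,
    n_total: int,
    tolerated_failure_percentage: float = 0,
) -> bool:
    """Return True when the share of failed granules exceeds the tolerance.

    Hypothetical helper: the name and the strict comparison are assumptions.
    """
    if n_total == 0:
        # Nothing was processed, so nothing can have failed.
        return False
    failure_percentage = 100 * n_failed / n_total
    return failure_percentage > tolerated_failure_percentage


# With the default tolerance of 0, a single failure is already too many.
print(exceeds_tolerance(n_failed=1, n_total=100))
# With a tolerance of 5, five failures out of 100 are still accepted.
print(exceeds_tolerance(n_failed=5, n_total=100, tolerated_failure_percentage=5))
```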

Full Changelog: 0.11.0...0.12.0

0.11.0

25 Apr 19:07
4c74dfb

Changed

  • In addition to a logical name or an official DOI name, the value for the doi input may now be a collection concept ID to uniquely specify a collection, because searching by DOI does not guarantee uniqueness (due to a bug in CMR Search).

    Further, specifying a logical name (e.g., "L4A") now uses the appropriate collection concept ID rather than the official DOI name so that this change requires no change in user code (other than changing the algorithm version) in order to gain this uniqueness guarantee. (#124)
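
To illustrate the three kinds of value the doi input accepts, here is a rough sketch of how such a value might be classified. Everything in it is an assumption for illustration: the function resolve_collection, the LOGICAL_NAMES table, and the placeholder concept ID "C0000000000-EXAMPLE" are hypothetical, and the regular expressions only approximate the general shape of CMR concept IDs and DOI names.

```python
import re

# Hypothetical placeholder mapping. The real concept IDs are not listed in
# these release notes, so an obviously fake value is used here.
LOGICAL_NAMES = {"L4A": "C0000000000-EXAMPLE"}

# Approximate shapes only (assumptions): "C<digits>-<PROVIDER>" and "10.<digits>/<suffix>".
CONCEPT_ID_PATTERN = re.compile(r"^C\d+-\w+$")
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def resolve_collection(doi: str) -> tuple[str, str]:
    """Return a (kind, value) pair describing how to search for the collection."""
    if doi in LOGICAL_NAMES:
        # Logical names map to a concept ID, which identifies a collection uniquely.
        return ("concept_id", LOGICAL_NAMES[doi])
    if CONCEPT_ID_PATTERN.match(doi):
        return ("concept_id", doi)
    if DOI_PATTERN.match(doi):
        # Searching by DOI name is still possible, but may not be unique.
        return ("doi", doi)
    raise ValueError(f"Unrecognized doi value: {doi!r}")


print(resolve_collection("L4A"))
print(resolve_collection("C1234567890-EXAMPLE"))
print(resolve_collection("10.1234/EXAMPLE"))
```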

Full Changelog: 0.10.0...0.11.0

0.10.0

01 Apr 20:25
96c2512

Fixed

  • Columns in the output file are now guaranteed to be in the same order as given by the columns input value. (#100)

Changed

  • Valid values for the output option have changed such that a file extension is required, whereas previously this was optional since .gpkg was the only supported output format. (#97)

Added

  • The output value may now specify alternative file formats: in addition to the file extension .gpkg for GeoPackage format, the extensions .parquet for (Geo)Parquet format and .fgb for FlatGeobuf format are now supported. (#97)
  • Add gedi CLI. Run gedi --help for details. (#95)
  • Document how to derive a date-only column or a datetime column. See the section "Computing Dates and Times" in MAAP_USAGE.md. (#110, #111)
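
Since the file extension of the output value now selects the format and is required, the selection could look something like the sketch below. The names FORMATS and output_format are hypothetical, and treating extensions case-insensitively is an assumption, not documented behavior.

```python
from pathlib import Path

# Extensions and format names as listed in these release notes.
FORMATS = {
    ".gpkg": "GeoPackage",
    ".parquet": "(Geo)Parquet",
    ".fgb": "FlatGeobuf",
}


def output_format(output: str) -> str:
    """Return the format name implied by the output file's extension."""
    suffix = Path(output).suffix.lower()  # lowercasing is an assumption
    if suffix not in FORMATS:
        supported = ", ".join(FORMATS)
        raise ValueError(
            f"Output {output!r} must end with one of these extensions: {supported}"
        )
    return FORMATS[suffix]


print(output_format("subset.parquet"))
```

A value with no extension, such as "subset", is rejected by this sketch, which mirrors the change that made the extension mandatory.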

0.9.0

09 Oct 15:54
08aed56

Added

  • Add "L4C" as a valid value for the doi input, for convenience (#90)

0.8.0

13 Aug 17:05
2aa78e0

Fixed

  • Remove hard-coded MAAP API host value in the scripts bin/algo/describe, bin/algo/delete, and bin/algo/register, letting maap-py make use of the MAAP_API_HOST environment variable (#85)

Changed

  • Obtain AWS S3 credentials via a role using the EC2 instance metadata rather than via the maap-py library (#14)
  • Log messages with timestamps in ISO 8601 UTC combined date and time representations with milliseconds (#72)
  • Read granule files directly from AWS S3 instead of downloading them (#54)
  • Optimize AWS S3 read performance to provide ~10% speed improvement (on average) over downloading files by tuning the default_cache_type, default_block_size, and default_fill_cache keyword arguments to the fsspec.url_to_fs function (#77)
  • Set default granule limit to 100000. Although this is not unlimited, it effectively behaves as such because all of the supported GEDI collections have fewer granules than this limit. (#69)
  • Set default job queue to maap-dps-worker-32vcpu-64gb to improve performance by running on 32 CPUs (#78)
  • Succeed even when the result is an empty subset (#79)
  • Upgrade to Python 3.12
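
As an illustration of the ISO 8601 UTC log timestamps mentioned in this list, the following is one possible way to produce them with Python's standard logging module. It is a sketch under assumptions: the class name UTCFormatter, the log line layout, and the trailing "Z" designator are illustrative choices, and the project's actual log format may differ.

```python
import logging
from datetime import datetime, timezone


class UTCFormatter(logging.Formatter):
    """Format record times as ISO 8601 UTC with millisecond precision."""

    def formatTime(self, record, datefmt=None):
        # record.created is seconds since the epoch as a float.
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        # Use the compact "Z" designator instead of "+00:00" (an assumption).
        return timestamp.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def build_logger(name: str) -> logging.Logger:
    """Return a logger whose stream handler uses the UTC formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(UTCFormatter("%(asctime)s [%(levelname)s] %(message)s"))
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


build_logger("example").info("Subsetting started")
```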

Added

  • Add fsspec_kwargs input to allow user to specify keyword arguments to the fsspec.url_to_fs function; see MAAP_USAGE.md for details. (#77)
  • Add processes input to allow user to specify the number of processes to use, defaulting to the number of available CPUs (#77)

0.7.0

24 Apr 22:54

Added

  • #57 Users may choose to profile their jobs by specifying command-line options for the scalene profiling tool. See docs/MAAP_USAGE.md for more information.
  • #44 Granule download failures are now retried up to 10 times to reduce the likelihood that subsetting will fail due to a download failure.
  • #56 The bin/subset script now captures output to stderr and writes it to the log file named gedi-subset.log. When a job succeeds, the log file will appear in the job's output directory. Otherwise, it will appear in the job's triage directory.
  • #65 All supported GEDI collections are now cloud-hosted, and granules are now downloaded from the cloud rather than from DAAC servers.
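
The download retry behavior noted in this list could be approximated as shown below. This is only a sketch: the helper with_retries is hypothetical, catching OSError is an assumption about which errors count as download failures, and "retried up to 10 times" is read here as one initial attempt plus at most 10 retries.

```python
import time


def with_retries(operation, max_retries=10, delay_seconds=0.0):
    """Call operation(), retrying on OSError up to max_retries times.

    Hypothetical helper: the exception type and delay are assumptions.
    """
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except OSError:
            if attempt == max_retries:
                # Out of retries: let the last error propagate to the caller.
                raise
            time.sleep(delay_seconds)


# Example: an operation that fails three times before succeeding.
attempts = []


def flaky_download():
    attempts.append(1)
    if len(attempts) < 4:
        raise OSError("simulated download failure")
    return "granule.h5"


print(with_retries(flaky_download), "after", len(attempts), "attempts")
```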

0.6.2

05 Dec 21:10
513589e

Fixed

  • Updated to use v3.1.3 of maap-py in environment-maappy.yml. Previous versions of maap-py were referencing the deprecated MAAP Query Service API endpoint.

Full Changelog: 0.6.1...0.6.2

0.6.1

28 Sep 17:01
c9279d4

Fixed

  • #49 Remove all API URLs that contain ops, as they have now been retired (e.g., api.ops.maap-project.org).