Releases: Ecogenomics/GTDBTk

2.6.1

12 Dec 06:12
4461f70

Bug Fixes:

  • (#680) Fixed a check_install error that occurred when the StageLogger path was not set.

2.6.0

10 Dec 06:28
5cdb5f8

Major Changes:

  • GTDB-Tk now pins skani to v0.3.1 and pplacer to v1.1.alpha19 to i) ensure reproducibility of results and ii) use the sketch format compatible with skani v0.3.1.
  • The limit on the number of genomes compared in dense genera has been removed. This ensures that all representative genomes in a genus are compared, preventing incorrect species assignments when the closest genome by ANI is outside the previous 100-genome limit. This is especially important in dense genera like Collinsella and significantly improves classification accuracy, even if runtime increases slightly.
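
A toy sketch of why a cap on compared genomes can change the result. The data, the function name and the candidate ordering are all invented for illustration; how GTDB-Tk ordered candidates before applying the previous limit is not described here.

```python
def closest_by_ani(ani_by_genome, limit=None):
    """Return the (genome_id, ani) pair with the highest ANI.

    `ani_by_genome` is a list of (genome_id, ani) pairs in candidate order.
    Only the first `limit` candidates are compared; `limit=None` compares all.
    """
    candidates = ani_by_genome if limit is None else ani_by_genome[:limit]
    return max(candidates, key=lambda pair: pair[1])


# Invented data: the closest representative is the third candidate.
candidates = [("rep_A", 94.1), ("rep_B", 94.8), ("rep_C", 96.3)]

print(closest_by_ani(candidates, limit=2))  # capped comparison misses rep_C
print(closest_by_ani(candidates))           # all representatives compared
```

With the cap, the best hit inside the window is reported even though a closer representative exists outside it; removing the cap trades extra comparisons for the correct closest genome.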

Bug Fixes:

  • (#670, #674, #668) Fixed an issue where GTDB-Tk would crash when using pplacer v1.1.alpha20. This issue is now resolved by pinning pplacer to v1.1.alpha19.
  • (#671) The limit on the number of genomes compared in dense genera has been removed.
  • (#672) skani is now pinned to v0.3.1 and uses the sketch and search commands instead of dist.
  • (#665) GTDB-Tk now uses skani v0.3.1 and has an option (--skani_sketch_dir) to save the sketch database for reference genomes for future use.
  • (#669) BaseModel from pydantic has been replaced by a dataclass to avoid warnings with pydantic v2.x.
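
As a minimal sketch of the kind of change described in #669, a pydantic-style model can often be expressed with the standard-library dataclasses module. The class and field names below are hypothetical and are not taken from the GTDB-Tk source.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ClassificationRow:
    """Hypothetical record type; not GTDB-Tk's actual class."""
    genome_id: str
    closest_genome_ani: Optional[float] = None

    def __post_init__(self):
        # Dataclasses do not validate or coerce types, so any conversion
        # that pydantic performed automatically must be written explicitly.
        if self.closest_genome_ani is not None:
            self.closest_genome_ani = float(self.closest_genome_ani)


row = ClassificationRow(genome_id="genome_1", closest_genome_ani="97.5")
print(asdict(row))
```

The trade-off is that type coercion and validation become the caller's responsibility, in exchange for dropping the dependency on a specific pydantic major version.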

2.5.2

12 Sep 03:00
2e9d4c1

Bug Fixes:

  • (#662, #663) Resolves TypeError: bool() undefined when iterable == total == None

2.5.1

09 Sep 04:55
9e97cda

Bug Fixes:

  • (#658) Changed the spinner to a progress bar.

2.5.0

08 Sep 05:30
3e65722

Bug Fixes:

  • (#644, #641) Fixed compatibility with recent versions of NumPy (≥1.24), which removed the tostring() method from numpy.ndarray.

Minor Changes:

  • (#650) Updated the CLI with an up-to-date taxon.

Major Changes:

  • GTDB-Tk now uses skani exclusively for genome clustering, replacing the previous Mash/skani hybrid approach. This change simplifies the CLI and removes the dependency on Mash, streamlining installation and execution.

2.4.1

18 Apr 04:27
655baba

Bug Fixes:

  • (#630) Fixed SyntaxWarning in Python 3.12 by using raw strings for regex in HMMResultsIO.py
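
To illustrate the class of warning fixed in #630: Python 3.12 reports invalid escape sequences such as "\d" in ordinary string literals as a SyntaxWarning, and a raw string avoids this. The pattern and the sample line below are illustrative assumptions, not the actual expression used in HMMResultsIO.py.

```python
import re

# In a normal string literal, "\d" is an invalid escape sequence, which
# Python 3.12 reports as a SyntaxWarning at compile time. A raw string
# passes the backslash through to the regex engine unchanged.
PATTERN = re.compile(r"^(\S+)\s+(\d+\.\d+)$")  # hypothetical pattern

match = PATTERN.match("PF00380.20 123.4")  # made-up input line
if match:
    name, score = match.group(1), float(match.group(2))
    print(name, score)
```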

Minor Changes:

  • (#631) The gtdb_to_ncbi_majority_vote.py script is now included as part of the release.

The GTDB-Tk version has been bumped to synchronise its release with GTDB R226.

2.4.0

24 Apr 01:21
59609e2

Choose a tag to compare

Bug Fixes:

  • (#576) When all genomes fail the Prodigal step in classify_wf, the
    bac120 summary file is still produced, with all failed genomes listed as 'Unclassified'.
  • (#573) When running the 3 classify steps independently, a genome can be filtered out in the align
    step but still be classified in the identify step. To avoid duplicate rows, the genome is classified with a warning.
  • (#540) Empty files are skipped during the Mash sketch step;
    they are then caught in the Prodigal step and returned as 'Unclassified'.
  • (#549) --force has been modified to deal with #540. Prodigal
    wasn't returning the empty files as failed genomes; it was only skipping them. These genomes are now returned in the summary file and flagged as 'Unclassified'.

Major Changes:

  • FastANI has been replaced by skani as the primary tool for computing Average Nucleotide Identity (ANI). Users may notice slight variations in the results compared to those obtained using FastANI.

  • In the generated summary.tsv files, several columns have been renamed for clarity and consistency. The following columns have been affected:

    • "fastani_reference" column has been renamed to "closest_genome_reference".
    • "fastani_reference_radius" column has been renamed to "closest_genome_reference_radius".
    • "fastani_taxonomy" column has been renamed to "closest_genome_taxonomy".
    • "fastani_ani" column has been renamed to "closest_genome_ani".
    • "fastani_af" column has been renamed to "closest_genome_af".

These changes have been implemented to improve the readability and understanding of the data within the summary.tsv files. Users should update their scripts or processes accordingly to reflect these renamed column headers.
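
For scripts that still expect the old headers, the renaming above can be sketched as a small conversion step. The mapping comes from the list above; the function names, the user_genome column and the sample row are illustrative assumptions.

```python
import csv
import io

# Old-to-new header mapping, taken from the renamed columns listed above.
RENAMED_COLUMNS = {
    "fastani_reference": "closest_genome_reference",
    "fastani_reference_radius": "closest_genome_reference_radius",
    "fastani_taxonomy": "closest_genome_taxonomy",
    "fastani_ani": "closest_genome_ani",
    "fastani_af": "closest_genome_af",
}


def rename_header(columns):
    """Map old column names to the new ones; leave other names untouched."""
    return [RENAMED_COLUMNS.get(col, col) for col in columns]


def upgrade_summary(src, dst):
    """Copy a tab-separated summary from src to dst with renamed headers."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    writer.writerow(rename_header(next(reader)))
    writer.writerows(reader)


# In-memory example; the column selection and row values are made up.
old = io.StringIO(
    "user_genome\tfastani_reference\tfastani_ani\n"
    "genome_1\tGCF_000000000.1\t98.2\n"
)
new = io.StringIO()
upgrade_summary(old, new)
print(new.getvalue())
```

The same mapping can be applied in the other direction if a downstream tool has not yet been updated, by inverting the dictionary.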

2.3.2

05 Jul 22:38
7765d60

Bug Fixes:

  • (#528) (#529) setup.py has been modified to restrict the pydantic version to >=1.9.2 and <2.0a1.

Minor Changes:

  • (#526) Mash stderr is now captured in a separate buffer (thanks @wasade for your contribution).

2.3.1

05 Jul 22:27
50ac357

-- Disregard this release

2.3.0

09 May 00:11
c3597ba

Bug Fixes:

  • (#508) (#509) If ALL genomes for a specific domain are either filtered out or classified with ANI, they are now reported in the summary file.

Minor Changes:

  • (#491) (#498) Allow GTDB-Tk to show --help and -v without GTDBTK_DATA_PATH being set.
    • WARNING: This is a breaking change if you import GTDB-Tk as a library and import values from gtdbtk.config.config. Instead, use from gtdbtk.config.common import CONFIG, then access values via CONFIG.<var>
  • (#508) The Mash distance has been changed from 0.1 to 0.15. This will increase the number of FastANI comparisons but will cover cases where genomes have a larger Mash distance but a small ANI.
  • (#497) Added a convert_to_species function in GTDB-Tk to replace GCA/GCF IDs with their GTDB species name.
  • Added a --db_version flag to check_install to check the version of previous GTDB-Tk packages.