Releases: Ecogenomics/GTDBTk
2.6.1
2.6.0
Major Changes:
- GTDB-Tk now pins skani to v0.3.1 and pplacer to v1.1.alpha19 to i) ensure reproducibility of results and ii) use the sketch format compatible with skani v0.3.1.
- The limit on the number of genomes compared in dense genera has been removed. This ensures that all representative genomes in a genus are compared, preventing incorrect species assignments when the closest genome by ANI is outside the previous 100-genome limit. This is especially important in dense genera like Collinsella and significantly improves classification accuracy, even if runtime increases slightly.
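To illustrate why a cap on compared genomes can change a species assignment, here is a toy sketch in Python. The genome names, the ANI values, and the assumption that the cap simply kept the first N candidates are all hypothetical; it does not reproduce GTDB-Tk's actual candidate selection.

```python
from typing import Dict, List, Optional, Tuple


def closest_by_ani(
    candidates: List[str],
    ani: Dict[str, float],
    limit: Optional[int] = None,
) -> Tuple[str, float]:
    """Return the candidate with the highest ANI.

    `limit` mimics a hypothetical cap that keeps only the first N candidates
    before the comparison; None means every candidate is compared.
    """
    pool = candidates if limit is None else candidates[:limit]
    best = max(pool, key=lambda name: ani[name])
    return best, ani[best]


# Hypothetical dense genus with 150 representative genomes.
reps = [f"rep_{i:03d}" for i in range(150)]
ani_values = {name: 90.0 for name in reps}
ani_values["rep_010"] = 94.0  # best hit among the first 100 candidates
ani_values["rep_120"] = 97.5  # overall best hit, beyond the first 100

capped = closest_by_ani(reps, ani_values, limit=100)
uncapped = closest_by_ani(reps, ani_values)
print(capped)    # ('rep_010', 94.0)
print(uncapped)  # ('rep_120', 97.5)
```

With the cap, the toy query would be matched to a more distant representative; without it, the true closest genome is found at the cost of more comparisons.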
Bug Fixes:
- (#670, #674, #668) Fixed an issue where GTDB-Tk would crash when using pplacer v1.1.alpha20. This issue is now resolved by pinning pplacer to v1.1.alpha19.
- (#671) The limit on the number of genomes compared in dense genera has been removed.
- (#672) skani is now pinned to v0.3.1 and uses the sketch + search commands instead of dist.
- (#665) GTDB-Tk now uses skani v0.3.1 and has an option to save the sketch db for reference genomes for future use (--skani_sketch_dir).
- (#669) BaseModel from pydantic is now replaced by DataClass to avoid warnings with pydantic v2.x.
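A minimal sketch of how the sketch-directory option might be passed from a Python wrapper. Only --skani_sketch_dir and the classify_wf workflow name come from these notes; --genome_dir and --out_dir are assumed to be the usual classify_wf options, and the helper name and all paths are placeholders. The command is only built here, not executed.

```python
import shlex
from typing import List


def build_classify_cmd(genome_dir: str, out_dir: str, sketch_dir: str) -> List[str]:
    """Build (but do not run) a classify_wf command line.

    Hypothetical helper: --genome_dir and --out_dir are assumptions,
    --skani_sketch_dir is the option named in the release notes.
    """
    return [
        "gtdbtk", "classify_wf",
        "--genome_dir", genome_dir,
        "--out_dir", out_dir,
        "--skani_sketch_dir", sketch_dir,
    ]


cmd = build_classify_cmd("genomes/", "gtdbtk_out/", "skani_sketches/")
print(shlex.join(cmd))
```

On a machine with GTDB-Tk installed, the list could be handed to subprocess.run(cmd, check=True); reusing the same sketch directory on later runs is the intended way to avoid re-sketching the reference genomes.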
2.5.2
2.5.1
2.5.0
Bug Fixes:
- (#644, #641) Fixed compatibility with recent versions of NumPy (≥1.24), which removed the tostring() method from numpy.ndarray.
Minor Changes:
- (#650) Update CLI with an up-to-date taxon.
Major Changes:
- GTDB-Tk now uses Skani exclusively for genome clustering, replacing the previous Mash/Skani hybrid approach. This change simplifies the CLI and removes the dependency on Mash, streamlining installation and execution.
2.4.1
2.4.0
Bug Fixes:
- (#576) When all genomes fail the prodigal step in the classify_wf, the bac120 summary file is still produced with all the failed genomes listed as 'Unclassified'.
- (#573) When running the 3 classify steps independently, a genome can be filtered out in the align step but still be classified in the identify step. To avoid duplicated rows, the genome is classified with a warning.
- (#540) Empty files are skipped during the sketch step of Mash; they are then caught in the prodigal step and are returned as 'Unclassified'.
- (#549) --force has been modified to deal with #540. Prodigal wasn't returning the empty files as failed genomes, it was only skipping them. These genomes are now returned in the summary file and flagged as Unclassified.
Major Changes:
- FastANI has been replaced by skani as the primary tool for computing Average Nucleotide Identity (ANI). Users may notice slight variations in the results compared to those obtained using FastANI.
- In the generated summary.tsv files, several columns have been renamed for clarity and consistency. The following columns have been affected:
  - "fastani_reference" column has been renamed to "closest_genome_reference".
  - "fastani_reference_radius" column has been renamed to "closest_genome_reference_radius".
  - "fastani_taxonomy" column has been renamed to "closest_genome_taxonomy".
  - "fastani_ani" column has been renamed to "closest_genome_ani".
  - "fastani_af" column has been renamed to "closest_genome_af".
These changes have been implemented to improve the readability and understanding of the data within the summary.tsv files. Users should update their scripts or processes accordingly to reflect these renamed column headers.
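For scripts that still expect the old headers, the renames can be captured in a small mapping. The sketch below uses only the Python standard library; the mapping is taken from the list above, while the helper name, the user_genome column, and the sample row are illustrative assumptions.

```python
import csv
import io
from typing import Dict, List

# Old header -> new header, as listed in the 2.4.0 release notes.
RENAMED_COLUMNS: Dict[str, str] = {
    "fastani_reference": "closest_genome_reference",
    "fastani_reference_radius": "closest_genome_reference_radius",
    "fastani_taxonomy": "closest_genome_taxonomy",
    "fastani_ani": "closest_genome_ani",
    "fastani_af": "closest_genome_af",
}


def upgrade_header(header: List[str]) -> List[str]:
    """Map pre-2.4.0 column names to their new names; leave others untouched."""
    return [RENAMED_COLUMNS.get(col, col) for col in header]


# Tiny in-memory TSV standing in for an older summary file (made-up values).
old_tsv = (
    "user_genome\tfastani_reference\tfastani_ani\n"
    "genome_1\tGCF_000000000.1\t97.2\n"
)
rows = list(csv.reader(io.StringIO(old_tsv), delimiter="\t"))
rows[0] = upgrade_header(rows[0])
print(rows[0])
```

The same mapping can be inverted if a downstream tool still needs the old names.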
2.3.2
2.3.1
2.3.0
Bug Fixes:
- (#508) (#509) If ALL genomes for a specific domain are either filtered out or classified with ANI they are now reported in the summary file.
Minor changes:
- (#491) (#498) Allow GTDB-Tk to show --help and -v without GTDBTK_DATA_PATH being set.
  - WARNING: This is a breaking change if you are importing GTDB-Tk as a library and importing values from gtdbtk.config.config. Instead, you need to import with "from gtdbtk.config.common import CONFIG" and then access values via CONFIG.<var>.
- (#508) Mash distance is changed from 0.1 to 0.15. This will increase the number of FastANI comparisons but will cover cases where genomes have a larger Mash distance but a small ANI.
- (#497) Add a convert_to_species function in GTDB-Tk to replace GCA/GCF ids with their GTDB species name.
- Add --db_version flag to check_install to check the version of previous GTDB-Tk packages.