Releases: wwood/singlem
v0.20.3
v0.20.0 / 0.20.1 / 0.20.2
Major new function - Long-read input support (Nanopore >= R10.4.1 or PacBio HiFi recommended), thanks to @thepatientwait.
- Lyrebird database updated to v0.3.1, improving exclusion of off-target (non-phage) sequences
- microbial_fraction subcommand renamed to prokaryotic_fraction (old name retained as synonym)
- More flexible options for specifying genome input in pipe mode
- appraise mode: Add --stream-inputs
- GlobDB R226 metapackage released. To use it, decompress it and then specify pipe --metapackage /path/to/GlobDB_r226.metapackage_v1.smpkg rather than setting the environment variable.
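As a rough illustration of the GlobDB note above, the following Python sketch only assembles the command line and does not run SingleM. The --metapackage option and the metapackage path come from the release note; the read and output file names are hypothetical, and the -1 (forward reads) and -p (taxonomic profile output) options are assumed from SingleM's usual pipe usage and should be checked against `singlem pipe --help`.

```python
import shlex


def build_pipe_command(metapackage, forward_reads, profile_out):
    """Return an argv list for a SingleM pipe run with an explicit metapackage."""
    return [
        "singlem", "pipe",
        "--metapackage", metapackage,  # path to the decompressed metapackage
        "-1", forward_reads,           # assumed option: forward reads
        "-p", profile_out,             # assumed option: taxonomic profile output
    ]


cmd = build_pipe_command(
    "/path/to/GlobDB_r226.metapackage_v1.smpkg",
    "sample_1.fastq.gz",   # hypothetical input file
    "sample.profile.tsv",  # hypothetical output file
)
# Print a copy-pasteable shell command rather than executing it.
print(shlex.join(cmd))
```

Passing the path on the command line keeps the choice of metapackage visible per run, which is the approach the note recommends over the environment variable.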
Thanks to @AroneyS, @rzhao-2, @EisenRa, @thepatientwait, @dspeth, @Anna-MarieSeelen, @l-gallucci, @ilnamkang and others for contributions and testing.
v0.19.0
Major new function - profiling of Caudoviricetes (aka "Caudovirales") phage communities (Lyrebird), thanks to @rzhao-2.
Other changes:
- Update default metapackage to GTDB R226
- admin: Use pixi instead of conda
- Use of diamond v2.1.10 specifically, to avoid segfault issues with diamond v2.1.11
- Clarify non-standard metapackage usage (#220)
- doc: Improve summarise --cluster (#210)
Thanks @rzhao-2 @AroneyS @ilnamkang Phil Hugenholtz @pchaumeil @zackhenny @thepatientwait
v0.18.3
v0.18.1
v0.18.0
Combined changelog for v0.17.0 and 0.18.0
- Use of GTDB R220 reference metapackage by default
- pipe/condense: Improve algorithm by delaying some filtering steps, leading to more accurate taxonomic profiles
- pipe: update to smafa v0.8.0 for substantial speed improvement
- microbial_fraction: Remove % from column data and add average genome size estimation
- supplement: Change command line options in backwards incompatible way, clarifying their meaning
- summarise: Add --output-taxonomic-profile-with-extras output to add relative abundance etc. to taxonomic profiles
- summarise: Add --output-species-by-site-relative-abundance-prefix to create taxon-level specific relative abundances from taxonomic profiles
- summarise: Add --output-taxonomic-level-coverage to show how much coverage and number of taxa assigned to each level
- pipe: Faster processing when many genome fasta files are input
- seqs: Prioritise high-info HMM positions.
- dist: Fix singularity container
- assorted bug and documentation fixes
Thanks @AroneyS @EisenRa @jakobnissen @rzhao-2 @rrohwer @shaze @ellyyuyang @VadimDu @adityabandla @luispedro, and anonymous reviewers, among others.
The microbial_fraction mode now has its own citation - https://www.biorxiv.org/content/10.1101/2024.05.16.594470v1
v0.16.0
This version tweaks the method which assigns taxonomy to OTUs (increasing the species-level threshold) and the method which summarises the OTUs to create a final taxonomic profile (very low abundance lineages are given lower taxonomic resolution, rather than ignored completely). This reduces the rate of "overclassification", i.e. when novel species are classified wrongly to the species level, and improves the read_fraction (now called microbial_fraction) estimates in complex / shallowly sequenced metagenomes.
We suggest recomputing community profiles using renew or pipe modes.
- pipe/renew: Change default species-level assignment from 3bp or closer, to 2bp or closer.
- pipe/renew/condense: Assign sub-min-taxon-coverage higher.
- read_fraction mode renamed to microbial_fraction
Thanks to Yu Yang, Caitlin Singleton, @MadsAlbertsen @EisenRa @BigDataBiology
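To make the v0.16.0 threshold change concrete, here is a minimal Python sketch of a "within N bp" species-level rule. It is not SingleM's implementation: it assumes the OTU window and its best reference hit are already aligned and of equal length, counts plain mismatches, and uses a made-up 60 bp sequence. The function names and the max_mismatches parameter are hypothetical.

```python
def mismatches(a, b):
    """Count differing positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def assign_species_level(window, best_hit, max_mismatches=2):
    """Return True if the window is close enough to the hit for a species-level call."""
    return mismatches(window, best_hit) <= max_mismatches


def mutate(seq, positions, base="T"):
    """Replace the bases at the given positions (illustration helper)."""
    chars = list(seq)
    for p in positions:
        chars[p] = base
    return "".join(chars)


reference = "ATGC" * 15                    # made-up 60 bp reference window
two_off = mutate(reference, [0, 10])       # differs at 2 positions
three_off = mutate(reference, [0, 10, 20])  # differs at 3 positions

# Old default (3 bp or closer) versus new default (2 bp or closer).
print(assign_species_level(three_off, reference, max_mismatches=3))  # True
print(assign_species_level(three_off, reference, max_mismatches=2))  # False
print(assign_species_level(two_off, reference, max_mismatches=2))    # True
```

Under this sketch, a window 3 bp away from its best hit would have received a species-level assignment with the old default but not with the new one, which is the kind of case the stricter threshold is meant to leave at a higher taxonomic level.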
v0.15.1
Mostly minor bugfixes
- pipe: extract: Apply --evalue to hmmsearch thresholding.
- Fix for appraise --plot
- pipe: Dedup hmmsearch results during diamond package assignment.
- pipe/renew/condense: Prevent no_assign_taxonomy and taxonomic profile output.
Thanks @kalonji08 @AroneyS @harmonydouwes
v0.15.0
- Genomes that encode proteins with translation table 4 are now supported. This works by assuming all genomes have translation table 4, since regular sequence similarity search excludes inappropriately translated sequences from genomes which use table 11 (the standard bacterial table). Thanks to Dr. Andy Leu for useful test cases. NOTE: The renew mode is not sufficient for detecting these lineages, pipe must be run again from scratch.
- new_package_creation (beta): A snakemake pipeline included in the extras directory used to create new SingleM metapackages from scratch. In development. Thanks to @harmonydouwes @tvtv195 @JemmaSun for testing.
- Version S3.2.1 of the default metapackage released, which includes updated genome sizes for GTDB genomes (for use with read_fraction), now corrected based on CheckM v2 estimates of completeness and contamination. Thanks to @EisenRa for collaboration.
- seqs: Output the best window position to STDOUT.
- Other assorted bug fixes and documentation updates.
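For readers unfamiliar with the translation tables mentioned in the v0.15.0 notes, the Python sketch below shows the one amino acid assignment that differs between them: in NCBI table 4, TGA encodes tryptophan (W), while in table 11 it is a stop codon. The codon strings follow the NCBI genetic code listing (bases ordered T, C, A, G); the example gene is made up, and the code only illustrates the tables, not how SingleM performs its search.

```python
from itertools import product

BASES = "TCAG"  # NCBI ordering of bases within the codon tables
# Amino acids for codons TTT, TTC, TTA, TTG, TCT, ... GGG ('*' marks a stop).
TABLE_11 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE_4 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_map(amino_acids):
    """Pair each of the 64 codons with its amino acid, in NCBI order."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(dna, table):
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


table11 = codon_map(TABLE_11)
table4 = codon_map(TABLE_4)

gene = "ATGTGAGCTTAA"  # made-up sequence with an internal TGA codon
print(translate(gene, table11))  # M*A*
print(translate(gene, table4))   # MWA*
```

Translating with table 4 reads through TGA, so proteins from table 4 genomes are recovered whole. For table 11 genomes the same assumption only adds read-through sequence, which the notes above say is excluded by the similarity search.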
v0.14.0
This release is a huge step forward for the SingleM software, comprising >750 git commits and several years' work (particularly from @AroneyS, @EisenRa and @rzhao-2) since v0.13.2.
There are so many changes that generating a CHANGELOG would take too long.
This release is equivalent to 1.0.0beta8, and is intended as a pre-release for version 1.0.0, but using a standard version number allows for a more streamlined release process.
