Hi,
I was wondering whether you have considered providing precomputed skani sketches for GTDB representative genomes as part of the GTDB-Tk reference data available for download.
I believe this could offer several advantages:
- It would remove the need to compute skani sketches during the first execution of GTDB-Tk.
- Users would only need to specify the GTDB-Tk data path (GTDBTK_DATA_PATH), without also having to provide a separate path to the skani sketches.
- It may make it unnecessary to include the representative genome FASTA files in the GTDB-Tk reference package (or at least allow far fewer of them to be included), potentially decreasing the package size and speeding up downloads.
Thanks again for all your work,
Florian