Hi again,
I'm running dRep for dereplication of a large genome dataset. I previously faced an issue where it seemed like dRep (or possibly fastANI) silently stalled during execution. As suggested, I re-ran the same command with the -d flag to enable debug output.
..:: dRep dereplicate Step 2. Cluster ::..
07-14 22:45 INFO Running primary clustering
07-14 22:45 INFO Running pair-wise MASH clustering
07-14 22:45 INFO Will split genomes into 9 groups for primary clustering
07-15 05:28 DEBUG Clustering MASH database
07-15 06:29 DEBUG Debug mode on - saving Mdb ASAP
07-15 07:42 DEBUG Debug mode on - saving CdbF ASAP
07-15 07:42 DEBUG Saving primary_linkage pickle to /data/chandrasekaran/drep_completed_genomes/drep_output/data/Clustering_files/
07-15 07:42 INFO 7430 primary clusters made
07-15 07:42 INFO Running secondary clustering
07-15 07:42 INFO Running 32018245 fastANI comparisons- should take ~ 20812.9 min
07-15 07:42 DEBUG running cluster 2588
07-15 07:42 DEBUG /data/chandrasekaran/miniconda3/envs/drep/bin/fastANI --ql /data/chandrasekaran/drep_completed_genomes/drep_output/data/fastANI_files/tmp/genomeList --rl /data/chandrasekaran/drep_completed_genomes/drep_output/data/fastANI_files/tmp/genomeList -o /data/chandrasekaran/drep_completed_genomes/drep_output/data/fastANI_files/fastANI_out_ejqycipfir --matrix -t 6 --minFraction 0 ejqycipfir
07-15 07:42 DEBUG running cluster 3286
07-15 07:42 DEBUG /data/chandrasekaran/miniconda3/envs/drep/bin/fastANI --ql /data/chandrasekaran/drep_completed_genomes/drep_output/data/fastANI_files/tmp/genomeList --rl /data/chandrasekaran/drep_completed_genomes/drep_output/data/fastANI_files/tmp/genomeList -o /data/chandrasekaran/drep_completed_genomes/drep_output/data/fastANI_files/fastANI_out_btqmmtzduj --matrix -t 6 --minFraction 0 btqmmtzduj
With the -d flag I got some additional files:
(base) [chandrasekaran@actinium cmd_logs]$ ls
2025-07-15_07.42.22.045546.CMD 2025-07-15_07.42.22.045546.STDOUT 2025-07-15_07.42.35.013850.STDERR
2025-07-15_07.42.22.045546.STDERR 2025-07-15_07.42.35.013850.CMD 2025-07-15_07.42.35.013850.STDOUT
Of these, the stalled cluster 3286 corresponds to the file 2025-07-15_07.42.35.013850.STDERR:
Kmer size = 16
Fragment length = 3000
Threads = 6
ANI output file = /data/chandrasekaran/drep_completed_genomes/drep_output/data/fastANI_files/fastANI_out_btqmmtzduj
Sanity Check = 0
INFO [thread 0], skch::main, Count of threads executing parallel_for : 6
INFO [thread 0], skch::Sketch::build, window size for minimizer sampling = 24
INFO [thread 0], skch::Sketch::build, minimizers picked from reference = 268142983
INFO [thread 0], skch::Sketch::index, unique minimizers = 5582865
INFO [thread 0], skch::Sketch::computeFreqHist, Frequency histogram of minimizers = (1, 1752540) ... (27134, 1)
INFO [thread 0], skch::Sketch::computeFreqHist, consider all minimizers during lookup.
INFO [thread 0], skch::main, Time spent sketching the reference : 429.987 sec
INFO [thread 0], skch::main, Start Map 1
.
.
.
INFO [thread 0], skch::main, Start Map 40
INFO [thread 0], skch::main, Time spent mapping fragments in query #40 : 1787.84 sec
INFO [thread 0], skch::main, Time spent post mapping : 0.570964 sec
INFO [thread 0], skch::main, Start Map 41
It seems to be stuck at Start Map 41. I don't understand how to proceed from here. Could you please help me debug the issue?
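To make it easier to tell a stall from a slow query, the STDERR file can be summarized with a small script. This is only a minimal sketch that assumes the fastANI log lines look exactly like the excerpt above; the function name `summarize_fastani_log` and the sample text are hypothetical, not part of dRep or fastANI.

```python
import re


def summarize_fastani_log(text):
    """Summarize mapping progress from fastANI STDERR text (format assumed from the log above)."""
    # Every "Start Map N" line marks a query that has begun mapping.
    started = [int(n) for n in re.findall(r"Start Map (\d+)", text)]
    # Query number -> seconds spent mapping its fragments (only present once the query finished).
    timings = {
        int(query): float(seconds)
        for query, seconds in re.findall(
            r"Time spent mapping fragments in query #(\d+) : ([\d.]+) sec", text
        )
    }
    last_started = max(started) if started else None
    last_finished = max(timings) if timings else None
    return {
        "last_started": last_started,
        "last_finished": last_finished,
        # True when a query has started but has no timing line yet.
        "in_progress": last_started is not None and last_started != last_finished,
        "last_duration_sec": timings.get(last_finished),
    }


# Sample lines copied from the STDERR excerpt above.
sample = """INFO [thread 0], skch::main, Start Map 40
INFO [thread 0], skch::main, Time spent mapping fragments in query #40 : 1787.84 sec
INFO [thread 0], skch::main, Time spent post mapping : 0.570964 sec
INFO [thread 0], skch::main, Start Map 41
"""
print(summarize_fastani_log(sample))
```

Since query #40 took 1787.84 sec (about 30 minutes) to map, query #41 may simply still be running rather than stalled; comparing the elapsed time since the last log line against that duration would help decide.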
Thanks in advance!