Extracting Target Sequences error: ZeroDivisionError: division by zero #13

Open
neolycus23 opened this issue Jun 28, 2024 · 5 comments

@neolycus23

Hello,

First of all, I wanted to congratulate you on such an amazing pipeline. I am really enjoying it and looking forward to using it for my future papers.

My problem right now: I am trying to extract UCEs from my assemblies, but I keep hitting the error below (see the attached full log). Scipio only manages to finish some of my samples (22 out of 28). The files that fail also happen to be the largest assembly files (> 1 GB each). Any ideas what could be causing this? A couple of days ago Scipio managed to finish 24 out of 28 samples from the same dataset, but it crashed after that. Thank you for your help!

concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/concurrent/futures/process.py", line 263, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/extract.py", line 1339, in scipio_coding
    final_models = scipio_yaml_to_dict(yaml_final_file, min_score, min_identity,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/bioformats.py", line 2321, in scipio_yaml_to_dict
    model = parse_model(yaml[prot][yaml_model],
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/bioformats.py", line 2304, in parse_model
    mismatch_rate      = len(set(mod["mismatches"])) / prot_len_matched
                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
"""
[nohup.txt](https://github.com/user-attachments/files/16034555/nohup.txt)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/dmz/home/vferreira/.conda/envs/captus/bin/captus_assembly", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/captus_assembly.py", line 1424, in main
    CaptusAssembly()
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/captus_assembly.py", line 90, in __init__
    getattr(self, args.command)()
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/captus_assembly.py", line 1074, in extract
    extract(full_command, args)
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/extract.py", line 505, in extract
    tqdm_parallel_nested_run(scipio_coding, scipio_params, d_msg, f_msg,
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/site-packages/captus/misc.py", line 158, in tqdm_parallel_nested_run
    result = future.result()
             ^^^^^^^^^^^^^^^
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/dmz/home/vferreira/.conda/envs/captus/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
ZeroDivisionError: division by zero
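
The last frame of the worker traceback above divides by `prot_len_matched`, so the error means that value was zero for at least one Scipio model. The following is a minimal sketch of how such a division could be guarded. It is not the Captus implementation or its fix: `safe_rate` is a hypothetical helper, the `mod` dictionary is made-up stand-in data, and returning `None` for an empty match is only one possible policy.

```python
def safe_rate(count, denominator):
    """Return count / denominator, or None when the denominator is zero.

    Hypothetical helper for illustration; not part of Captus.
    """
    if denominator == 0:
        # Nothing was matched, so a rate is undefined; let the caller decide
        # whether to skip the model or report it.
        return None
    return count / denominator


# Made-up stand-in for one parsed Scipio model (assumption, not real data).
mod = {"mismatches": [3, 3, 7]}

# Same expression shape as in the traceback, with a zero matched length.
prot_len_matched = 0
mismatch_rate = safe_rate(len(set(mod["mismatches"])), prot_len_matched)

if mismatch_rate is None:
    print("model has zero matched length; skipping")
else:
    print(f"mismatch rate: {mismatch_rate:.3f}")
```

A guard like this would only hide the symptom. Why some models end up with a matched length of zero on the larger assemblies would still need to be diagnosed from the data.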

@neolycus23
Author

A quick update: I ran into the same issue with a single sample, a large genome assembly of 3.78 GB...

@edgardomortiz
Owner

Hi @neolycus23,

Sorry for the late reply, I was in the chaos of moving countries. If I understand correctly, UCEs are not necessarily translatable to protein, right? If so, I would suggest providing your references as miscellaneous DNA (-d option).

Otherwise, please upload your exact command, the logs produced by Captus, and if possible your reference targets so I can start diagnosing the problem.

Thanks!

Edgardo
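
As a rough illustration of the suggestion above, the sketch below builds the `extract` command with the references passed either as nuclear protein targets (`-n`, as in the command reported in this thread) or as miscellaneous DNA (`-d`). The helper `build_extract_command` is hypothetical, and it assumes that `-d` takes a FASTA path in the same way `-n` does. The directory and file names are the ones quoted in this thread.

```python
import shlex


def build_extract_command(assemblies_dir, refs_fasta, as_misc_dna):
    """Build the argument list for a `captus_assembly extract` run.

    Hypothetical helper: it only assembles the arguments, it does not run Captus.
    """
    # -d treats the references as miscellaneous DNA; -n as nuclear protein targets.
    flag = "-d" if as_misc_dna else "-n"
    return ["captus_assembly", "extract", "-a", assemblies_dir, flag, refs_fasta]


cmd = build_extract_command("02_assemblies", "probes.fasta", as_misc_dna=True)

# Print the command for inspection; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```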

@neolycus23
Author

Hi Edgar,

Now it is my turn to apologize for the delay in responding to you. UCEs are not necessarily translatable to proteins, but ours are.

I ran Captus with the command "captus_assembly extract -a 02_assemblies -n probes.fasta" (please see the log file) for several samples, and it worked for almost all of them. Captus crashed during the extraction stage for some of my larger genome assembly files (>1 GB). I kept rerunning the same command, and it worked for some of the failed samples, but some are persistently failing. I am attaching the log with the error that I mentioned above, and the NUC_scipio final log of one of the samples that failed to complete.

I am looking forward to hearing from you! Thanks for your help!

Vinicius
captus_log.txt
NUC_scipio_final.log

@neolycus23
Author

Hi Edgar,

I am following up on this issue. Have you had some time to take a look at these files? Thanks a lot!

Vinicius

@edgardomortiz
Owner

Hi Vinicius,

Sorry for the delay, I have been in the process of switching jobs. Unfortunately, your logs don't tell me much. Do you think you could share the assembly that fails to extract, as well as the reference file? My aim is to reproduce the error.

Thanks!

Edgardo
