ERROR:index list out of range #3

Open
mas160 opened this issue Aug 23, 2020 · 1 comment


mas160 commented Aug 23, 2020

Hi,

I am trying to use your pipeline to trim the barcodes from my 10x-generated raw reads, but I am getting the following error:
PROCESS NOTE Finished reading in barcode whitelist
PROCESS FILES L4R1.fastq,L4R2.fastq
PROCESS ERROR:[TwoReadIlluminaRun] Error reading next read
Traceback (most recent call last):
File "./proc10xG/process_10xReads.py", line 463, in main
fragment = iterator.next_raw()
File "./proc10xG/process_10xReads.py", line 252, in next_raw
rbc = (id1.split()[1]).split(':')[3]
IndexError: list index out of range
PROCESS ERROR An unknown fatal error was encountered.
How can this be solved?

Hoping to hear from you soon.

Thanks,

mas160 changed the title from "ERROR:index list out of rnage" to "ERROR:index list out of range" on Aug 23, 2020

mdoliv commented May 14, 2021

I think I've found the issue. On line 252 of process_10xReads.py, the next_raw method parses the sequence header, assuming it is in a format like this:

@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36

It takes the second whitespace-separated field (071112_SLXA-EAS1_s_7:5:1:817:345), splits it on the ":" (colon) character, and reads the element at index 3. This is usually fine, but look at the headers of my FASTQ files, retrieved from NCBI SRA using fasterq-dump:

@SRR9108936.1 1 length=150

It doesn't conform to the format the script expects: the second field contains no colons, so the split yields a single element and index 3 is out of range. From what I understand, what the script wants is the second string in this header, which in this case is just the number "1". A possible temporary fix is to go to line 252 and change this:

rbc = (id1.split()[1]).split(':')[3]

to this:

rbc = id1.split()[1]

Note that this is just a temporary fix; something more elegant is needed to account for different header formats.
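
As a rough idea of what a more tolerant version could look like, here is a minimal sketch. The function name extract_read_barcode is hypothetical and not part of proc10xG. It assumes that keeping the original index-3 lookup when the second field has enough colon-separated parts, and otherwise falling back to the whole second field (as in the temporary fix above), is acceptable for the rest of the pipeline; I have not verified that.

```python
def extract_read_barcode(header, default=""):
    """Return a barcode-like value from a FASTQ header line.

    Hypothetical helper, not part of proc10xG. It mirrors the lookup on
    line 252 when possible and degrades gracefully otherwise.
    """
    fields = header.split()
    if len(fields) < 2:
        # Header has no second whitespace-separated field at all.
        return default

    parts = fields[1].split(":")
    if len(parts) > 3:
        # Same result as the original (id1.split()[1]).split(':')[3].
        return parts[3]

    # Assumption: fall back to the whole second field, as in the temp fix.
    return fields[1]


if __name__ == "__main__":
    print(extract_read_barcode("@SRR9108936.1 1 length=150"))
    print(extract_read_barcode("@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36"))
```

Line 252 could then become rbc = extract_read_barcode(id1), but whether the fallback value is meaningful downstream depends on how rbc is used later in the script.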
