Value Error for chromosome number #277

Open
swethas112 opened this issue Jan 15, 2025 · 8 comments
Labels
input data Issue is caused by input data

Comments

@swethas112

swethas112 commented Jan 15, 2025

Hi,

I am trying to use IsoQuant for my Nanopore data, but I keep getting the following ValueError. I wonder if there is some issue with the annotation file.

Here is the command I used:
isoquant.py -d nanopore --fastq WT.fastq.gz KO.fastq.gz --reference hg38.fa --genedb gencode.v45.basic.annotation.gtf --complete_genedb --count_exons --output isoquant_new


Traceback (most recent call last):
  File "/home/ssubramanian/.conda/envs/isoquant/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/ssubramanian/.conda/envs/isoquant/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/home/ssubramanian/.conda/envs/isoquant/lib/python3.8/concurrent/futures/process.py", line 198, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/dataset_processor.py", line 141, in collect_reads_in_parallel
    for gene_info, assignment_storage in alignment_collector.process():
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/alignment_processor.py", line 260, in process
    for res in self.forward_alignments(alignment_storage):
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/alignment_processor.py", line 279, in forward_alignments
    yield self.process_alignments_in_region(new_region, alignments)
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/alignment_processor.py", line 285, in process_alignments_in_region
    assignment_storage = self.process_intergenic(alignment_storage, current_region)
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/alignment_processor.py", line 294, in process_intergenic
    corrector = IlluminaExonCorrector(self.chr_id, region[0], region[1], self.illumina_bam)
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/illumina_exon_corrector.py", line 38, in __init__
    self.short_introns, self.counts = self.get_introns(short_read_file, chromosome, start, end)
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/illumina_exon_corrector.py", line 56, in get_introns
    intr = samfile.find_introns(samfile.fetch(chromosome, start = start, stop = end))
  File "pysam/libcalignmentfile.pyx", line 1092, in pysam.libcalignmentfile.AlignmentFile.fetch
  File "pysam/libchtslib.pyx", line 683, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid contig `chr1`
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ssubramanian/.conda/envs/isoquant/bin/isoquant.py", line 819, in <module>
    main(sys.argv[1:])
  File "/home/ssubramanian/.conda/envs/isoquant/bin/isoquant.py", line 813, in main
    run_pipeline(args)
  File "/home/ssubramanian/.conda/envs/isoquant/bin/isoquant.py", line 763, in run_pipeline
    dataset_processor.process_all_samples(args.input_data)
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/dataset_processor.py", line 456, in process_all_samples
    self.process_sample(sample)
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/dataset_processor.py", line 480, in process_sample
    self.collect_reads(sample)
  File "/home/ssubramanian/.conda/envs/isoquant/share/isoquant-3.6.3-0/src/dataset_processor.py", line 558, in collect_reads
    for read_groups, alignment_stats, processed_reads in results:
  File "/home/ssubramanian/.conda/envs/isoquant/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/home/ssubramanian/.conda/envs/isoquant/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
    yield fs.pop().result()
  File "/home/ssubramanian/.conda/envs/isoquant/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/home/ssubramanian/.conda/envs/isoquant/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
ValueError: invalid contig `chr1`

Thanks in advance!

@andrewprzh
Collaborator

Dear @swethas112

This looks like a problem with the reference genome and annotation. Are you sure the chromosomes have the same names in hg38.fa and gencode.v45.basic.annotation.gtf?

Best
Andrey
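
One way to run the check suggested above is to collect the sequence names from the FASTA headers and from the first column of the GTF, then compare the two sets. The following is a minimal sketch using only the Python standard library; the function names (`fasta_contigs`, `gtf_contigs`, `compare_contigs`) are hypothetical helpers written for illustration and are not part of IsoQuant.

```python
import gzip
from pathlib import Path


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def fasta_contigs(path):
    """Return the set of sequence names in a FASTA file.

    The name is the text after '>' up to the first whitespace.
    """
    names = set()
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def gtf_contigs(path):
    """Return the set of sequence names in the first column of a GTF file."""
    names = set()
    with _open_text(path) as handle:
        for line in handle:
            # Skip comment/header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.split("\t", 1)[0])
    return names


def compare_contigs(fasta_path, gtf_path):
    """Report names shared by both files and names unique to each."""
    fa = fasta_contigs(fasta_path)
    gtf = gtf_contigs(gtf_path)
    return {
        "shared": fa & gtf,
        "only_in_fasta": fa - gtf,
        "only_in_gtf": gtf - fa,
    }
```

With the files from this thread, the call would be `compare_contigs("hg38.fa", "gencode.v45.basic.annotation.gtf")`. A mismatch such as `chr1` versus `1` would show up in `only_in_fasta` and `only_in_gtf`; an empty `only_in_gtf` suggests the naming is consistent between the two files.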

@andrewprzh andrewprzh added the input data Issue is caused by input data label Jan 15, 2025
@swethas112
Author

Hi @andrewprzh ,

Yes, I checked that already.

image

image

@andrewprzh
Collaborator

This looks really odd, since the exception is raised by the pysam library, not IsoQuant itself. Could you send the entire log file?

@swethas112
Author

Sure.

isoquant.log

@andrewprzh
Collaborator

What is particularly odd is that the problem is with chr1 only; all other chromosomes are processed normally.
Could you send me the BAM file header (samtools view -H)?
Also, you can try removing the .fai index file if it was created before the IsoQuant run, and then rerun IsoQuant.

Best
Andrey
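
The header check requested above can also be scripted. pysam's `fetch` raises `ValueError: invalid contig` when the requested name is not recognized in the file being queried, so it can help to list the `@SQ` entries of the header and confirm that `chr1` is among them. The traceback shows the failing `fetch` call inside `illumina_exon_corrector.py`, so the header of whichever BAM file is opened there may be the relevant one to inspect. The sketch below parses the text printed by `samtools view -H`; the helper names are hypothetical and the code is an illustration, not part of IsoQuant or pysam.

```python
def contigs_from_sam_header(header_text):
    """Return {contig name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type has the form TAG:VALUE.
        tags = dict(
            field.split(":", 1)
            for field in line.split("\t")[1:]
            if ":" in field
        )
        if "SN" in tags:
            contigs[tags["SN"]] = int(tags["LN"]) if "LN" in tags else None
    return contigs


def missing_contigs(header_text, wanted):
    """Return the names in `wanted` that have no @SQ entry in the header."""
    present = contigs_from_sam_header(header_text)
    return [name for name in wanted if name not in present]
```

For example, after saving the output of `samtools view -H` to a text file, `missing_contigs(open("header.txt").read(), ["chr1"])` returns `["chr1"]` if the header has no `@SQ` line for that name, and an empty list otherwise.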

@swethas112
Author

Here is the header from one of the BAM files

header.txt

I'll try after removing the index file

@andrewprzh
Collaborator

BAM file looks fine, really puzzling...

@swethas112
Author

I tried it again after removing the .fai file, but I still get the same error.


2 participants