HISAT2 missed reads which can be aligned to genome #459

@santataRU

Hi HISAT2 users and developers,

I am using HISAT2 version 2.2.1 on macOS to align RNA-seq data from human cells to the hg38 genome. I noticed that HISAT2 occasionally fails to align reads that clearly map to the genome.

For example, consider the following paired-end read (BAM records shown below):

QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL TAGs
AV241602:4_7_2025_B_Xiao:2437668272:2:11903:4775:1668 73 chr1 45550823 60 54M89N78M1S = 45550823 0 GCCAGCGCCACCTGCCTCATTGTGCCCAGGAGTTCTCCAAACCCGCGCTGCGGAGTCCTTCCTCGGGAGGCGGCGAAGGCGGTCCACCCTGCGCGTGATCCTTTATGCCCGGCCCCTGCCCCTCCCTCCGGGG 5MMMKMMLOLPPPMOPOPPPPPPBPOPPKPOPPPPOPPPPPPPPNOPPPPOOPPPOPOPLPPPPPPPPPOPPOOOPPPPOONPPPPOPPPPOOPLONOPPOPPPOPPPKPPOPNPOONOOOOOOOOOONONON AS:i:-1 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:132 YT:Z:UP XS:A:- NH:i:1
AV241602:4_7_2025_B_Xiao:2437668272:2:11903:4775:1668 133 chr1 45550823 0 * = 45550823 0 CCCCCGAGGGAGGGGCTGGGGCCGGGCATAAAGGATCACGCGCAGGGTGGACCCCCTTCGCCTCCTCCCGAGGAAGGACTCCGCAGCGCGGGTTTTGAGAACTTCTGGGCACAAAGAGGCAGGTGGCGCGGGCATTTCGGAAGGTTTTTG EEEE(88EJGIEJII;(M/MB063MCMH;MMM9DM+(-/FM39MMJIJ-><.M&5M8MAMGJ%14M>F8D@<M(LMHM7@)J5M(.<FDKFILML%MHF9?MC9LMEIL5D)=J%FBD8,/MCM0M-(./LM??H'?8MFLHLD,0?4HM YT:Z:UP

As shown, one read of the pair is mapped with high confidence (FLAG 73, MAPQ 60), while its mate is reported as unmapped (FLAG 133, CIGAR *). However, I verified that both reads can be aligned to the hg38 genome using BLAT in the IGV browser.
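For readers who want to check the two records themselves, here is a minimal Python sketch that decodes the FLAG values and the CIGAR string quoted above. The bit meanings and CIGAR operators follow the SAM specification; the helper names `decode_flag` and `cigar_lengths` are hypothetical and are not part of HISAT2 or samtools.

```python
import re

# SAM FLAG bits as defined in the SAM specification.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "read_unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "read_reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]

# One CIGAR element: a count followed by a single operator character.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS if flag & bit]


def cigar_lengths(cigar):
    """Return (query_length, reference_span) implied by a CIGAR string.

    M, I, S, = and X consume read bases; M, D, N, = and X consume
    reference bases. A CIGAR of "*" (no alignment) yields (0, 0).
    """
    if cigar == "*":
        return 0, 0
    query = ref = 0
    for count, op in _CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "MIS=X":
            query += n
        if op in "MDN=X":
            ref += n
    return query, ref


if __name__ == "__main__":
    # Values copied from the two BAM records in this issue.
    for flag in (73, 133):
        print(flag, decode_flag(flag))
    print(cigar_lengths("54M89N78M1S"))
```

FLAG 73 decodes to paired, mate unmapped, first in pair; FLAG 133 decodes to paired, read unmapped, second in pair. The CIGAR `54M89N78M1S` covers 133 read bases and spans 221 reference bases, including the 89-base intron. The unmapped mate carries the same RNAME and POS as the mapped read, which is the usual SAM convention for placing an unmapped mate next to its partner rather than evidence of an alignment.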

I’m curious why HISAT2 is failing to map the second read. Could this be due to a complexity limitation, or is there another explanation?

Thanks in advance for any insights!

Best regards,
Xiao

P.S. The HISAT2 command, as recorded in the BAM file header:

CL:"/usr/local/bin/../Cellar/hisat2/2.2.1_1/bin/hisat2-align-s --wrapper basic-0 -p 8 --no-temp-splicesite --summary-file HT1080_shNT-2_VSVNL43_HISAT2_summary.txt --rna-strandness RF -x /Users/bieniaszlab/Documents/Xiao/CLIP/index/hg38/HISAT2_index/hg38 --read-lengths 150,135,147,127,129,136,140,137,130,138,128,141,126,125,122,143,139,121,124,133,118,117,131,115,145,116,132,123,112,120,111,114,134,144,142,119,110,146,105,109,113,108,106,98,107,103,101,104,99,102,96,97,95,90,100,94,92,89,85,93,91,88,87,86,82,81,80,83,78,76,84,77,79,74,69,75,73,67,72,71,70,68,66,62,61,50,65,60,58,64,63,57,56,55,54,52,51 -1 /tmp/11953.inpipe1 -2 /tmp/11953.inpipe2"
