Hi HISAT2 users and developers,
I am using HISAT2 version 2.2.1 on macOS to align RNA-seq data from human cells to the hg38 genome. I noticed that HISAT2 occasionally fails to align reads that clearly map to the genome.
For example, consider the following read pair (both BAM records are shown below):
| QNAME | FLAG | RNAME | POS | MAPQ | CIGAR | RNEXT | PNEXT | TLEN | SEQ | QUAL | TAGs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AV241602:4_7_2025_B_Xiao:2437668272:2:11903:4775:1668 | 73 | chr1 | 45550823 | 60 | 54M89N78M1S | = | 45550823 | 0 | GCCAGCGCCACCTGCCTCATTGTGCCCAGGAGTTCTCCAAACCCGCGCTGCGGAGTCCTTCCTCGGGAGGCGGCGAAGGCGGTCCACCCTGCGCGTGATCCTTTATGCCCGGCCCCTGCCCCTCCCTCCGGGG | 5MMMKMMLOLPPPMOPOPPPPPPBPOPPKPOPPPPOPPPPPPPPNOPPPPOOPPPOPOPLPPPPPPPPPOPPOOOPPPPOONPPPPOPPPPOOPLONOPPOPPPOPPPKPPOPNPOONOOOOOOOOOONONON | AS:i:-1 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:132 YT:Z:UP XS:A:- NH:i:1 |
| AV241602:4_7_2025_B_Xiao:2437668272:2:11903:4775:1668 | 133 | chr1 | 45550823 | 0 | * | = | 45550823 | 0 | CCCCCGAGGGAGGGGCTGGGGCCGGGCATAAAGGATCACGCGCAGGGTGGACCCCCTTCGCCTCCTCCCGAGGAAGGACTCCGCAGCGCGGGTTTTGAGAACTTCTGGGCACAAAGAGGCAGGTGGCGCGGGCATTTCGGAAGGTTTTTG | EEEE(88EJGIEJII;(M/MB063MCMH;MMM9DM+(-/FM39MMJIJ-><.M&5M8MAMGJ%14M>F8D@<M(LMHM7@)J5M(.<FDKFILML%MHF9?MC9LMEIL5D)=J%FBD8,/MCM0M-(./LM??H'?8MFLHLD,0?4HM | YT:Z:UP |
As shown, one read of the pair is successfully mapped with high confidence (MAPQ 60), while the mate read is reported as unmapped (with CIGAR *). However, I verified that both reads can be aligned to the hg38 genome using BLAT in the IGV browser.
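For reference, the FLAG values of the two records (73 and 133) can be decoded with a few lines of Python. This is only a minimal sketch based on the bit definitions in the SAM specification; `decode_flag` and the label names are hypothetical helpers of my own, not part of HISAT2 or samtools.

```python
# Bit definitions follow the SAM specification; the labels are informal names.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# The two FLAG values from the records in this post.
for flag in (73, 133):
    print(flag, decode_flag(flag))
```

So FLAG 73 marks the first read of the pair as mapped with an unmapped mate, and FLAG 133 marks the second read itself as unmapped, which matches the `*` CIGAR.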
I’m curious why HISAT2 is failing to map the second read. Could this be due to a complexity limitation, or is there another explanation?
Thanks in advance for any insights!
Best regards,
Xiao
PS. HISAT2 command as shown in the BAM file header:
CL:"/usr/local/bin/../Cellar/hisat2/2.2.1_1/bin/hisat2-align-s --wrapper basic-0 -p 8 --no-temp-splicesite --summary-file HT1080_shNT-2_VSVNL43_HISAT2_summary.txt --rna-strandness RF -x /Users/bieniaszlab/Documents/Xiao/CLIP/index/hg38/HISAT2_index/hg38 --read-lengths 150,135,147,127,129,136,140,137,130,138,128,141,126,125,122,143,139,121,124,133,118,117,131,115,145,116,132,123,112,120,111,114,134,144,142,119,110,146,105,109,113,108,106,98,107,103,101,104,99,102,96,97,95,90,100,94,92,89,85,93,91,88,87,86,82,81,80,83,78,76,84,77,79,74,69,75,73,67,72,71,70,68,66,62,61,50,65,60,58,64,63,57,56,55,54,52,51 -1 /tmp/11953.inpipe1 -2 /tmp/11953.inpipe2"