
Breakpoint detection does not work in long read data #22

@ArthurDondi

Dear DaPars2 developers,

Because of the shape of long-read 3'UTR coverage profiles (see image below), DaPars2's breakpoint detection is not accurate on such data: it assumes uniform coverage before and after the breakpoint, whereas long-read profiles show a slope of decreasing coverage.

For the COL6A1 example below, DaPars2 finds:

Gene fit_value Predicted_Proximal_APA Loci Red Green
ENST00000361866.8|COL6A1|chr21|+ 1298.1 46003542 chr21:46003391-46005048 1.00 1.00

While the correct answer should be something like:

Gene fit_value Predicted_Proximal_APA Loci Red Green
ENST00000361866.8|COL6A1|chr21|+ XXXX 46004100 chr21:46003391-46005048 0.90 0.70

So it places the breakpoint incorrectly, which in turn leads to incorrect PDUI values.

While DaPars2 does not claim to support long reads, I thought it would be nice to have a version that works for them.
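To make the problem concrete, here is a minimal sketch on synthetic data. It is not DaPars2's actual code: `step_breakpoint` is a simplified stand-in for the uniform-coverage assumption (a flat level on each side of the breakpoint), and `sloped_breakpoint` is one hypothetical alternative that lets each side have its own slope. Both function names and all numbers in the toy profile are made up for illustration.

```python
import numpy as np

def step_breakpoint(cov):
    """Breakpoint minimizing squared error when each side is a flat level."""
    best_sse, best_b = np.inf, None
    for b in range(3, len(cov) - 3):
        left, right = cov[:b], cov[b:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best_b = sse, b
    return best_b

def sloped_breakpoint(cov):
    """Same search, but each side is a straight line with its own slope."""
    x = np.arange(len(cov))
    best_sse, best_b = np.inf, None
    for b in range(3, len(cov) - 3):
        sse = 0.0
        for xs, ys in ((x[:b], cov[:b]), (x[b:], cov[b:])):
            slope, intercept = np.polyfit(xs, ys, 1)
            sse += ((ys - (slope * xs + intercept)) ** 2).sum()
        if sse < best_sse:
            best_sse, best_b = sse, b
    return best_b

# Synthetic, noise-free profile (made-up numbers): coverage declines steadily,
# drops by about 10 at position 300, then declines more gently.
true_bp = 300
pos = np.arange(400)
cov = np.where(pos < true_bp, 100 - 0.2 * pos, 30 - 0.1 * (pos - true_bp))

step_bp = step_breakpoint(cov)
sloped_bp = sloped_breakpoint(cov)
print("flat-level model:", step_bp, "| sloped model:", sloped_bp, "| true:", true_bp)
```

On this toy profile the flat-level search settles well inside the declining region rather than at the drop, while the sloped search recovers position 300. Real long-read coverage is noisy and not exactly linear, so this only illustrates the direction a long-read-aware model could take.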

[Image: Screenshot 2023-06-30 at 12 09 54]
