Variant sequences in double #21

@meglecz

The sqlite database occasionally contains sequences in both upper case and lower case. Some of them are identical apart from the case. The lower-case sequences do not have read counts.

I guess this comes from running taxassign on sequences that are not yet in the sqlite db: all of these sequences are added to the db in lower case, even if they are identical to a variant already stored in upper case. As a result, the same sequence can have different varIDs.
I would prefer to eliminate this redundancy and systematically use the same ID for identical sequences.
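
To make the request concrete, here is a minimal Python sketch of one possible approach: list the sequences that exist under more than one ID when case is ignored, and look up variants case-insensitively before inserting. The table name `variant`, its columns `id` and `sequence`, and both function names are assumptions made for illustration. They are not taken from the actual schema or code.

```python
import sqlite3
from collections import defaultdict


def find_case_duplicates(conn):
    """Map each upper-cased sequence to its ids when it is stored more than once.

    Assumes a hypothetical table variant(id INTEGER PRIMARY KEY, sequence TEXT).
    """
    groups = defaultdict(list)
    for var_id, seq in conn.execute("SELECT id, sequence FROM variant ORDER BY id"):
        groups[seq.upper()].append(var_id)
    return {seq: ids for seq, ids in groups.items() if len(ids) > 1}


def get_or_create_variant_id(conn, sequence):
    """Return the id of an existing variant, ignoring case, or insert a new one.

    New sequences are stored in upper case so later lookups stay consistent.
    """
    seq = sequence.upper()
    row = conn.execute(
        "SELECT id FROM variant WHERE UPPER(sequence) = ? ORDER BY id LIMIT 1",
        (seq,),
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO variant (sequence) VALUES (?)", (seq,))
    return cur.lastrowid
```

If taxassign went through something like `get_or_create_variant_id` when it adds sequences, a lower-case copy of an existing variant would get the existing varID instead of a new one. `find_case_duplicates` could be used to locate the redundant entries that are already in the db.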
