Overall goals for bio2bel_dbsnp #1

@cthoyt

Need to write a reproducible script that:

  1. Downloads dbSNP data
  2. Parses it (it's in JSON Lines format, so this is trivial)
  3. Extracts mappings from dbSNP identifiers to RefSeq or gene identifiers so that, for SNPs like rs429358, we can automatically generate BEL graphs including:
    • For mutations inside genes, equivalences between reference genes (RefSeq identifiers starting with NG_) and Entrez Gene identifiers, plus HGNC for human genes, like g(NG_007084.2) eq g(HGNC:APOE)
    • Reference gene g(NG_007084.2) hasVariant g(DBSNP:rs429358)
    • Impact on gene g(DBSNP:rs429358) eq g(NG_007084.2, var("g.7903T>C"))
    • Reference transcript(s) r(NM_001302688.2) hasVariant r(DBSNP:rs429358)
    • Impact on transcript(s) r(DBSNP:rs429358) eq r(NM_001302688.2, var("c.466T>C"))
    • Reference protein, when available p(NP_001289617.1) hasVariant p(DBSNP:rs429358)
    • Impact on protein, when available p(DBSNP:rs429358) eq p(NP_001289617.1, var("p.Cys156Arg"))
    • Mappings between various RefSeq identifiers on the genomic level to genes in Entrez or HGNC
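
As a rough illustration of steps 2 and 3, here is a minimal Python sketch that reads a JSON Lines stream and formats the BEL statements listed above. The record layout (`rsid`, `hgnc`, `gene`, `transcripts`, `proteins`) is a hypothetical, pre-flattened shape made up for this example and is not the actual dbSNP JSON schema. The function names `iter_json_lines` and `bel_statements` are likewise placeholders. Downloading (step 1) and the Entrez Gene mappings are left out.

```python
import io
import json
from typing import Iterator, List, TextIO


def iter_json_lines(handle: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


def _variant_pair(func: str, refseq: str, hgvs: str, snp: str) -> List[str]:
    """Return the hasVariant and eq statements for one reference sequence."""
    return [
        f"{func}({refseq}) hasVariant {func}({snp})",
        f'{func}({snp}) eq {func}({refseq}, var("{hgvs}"))',
    ]


def bel_statements(record: dict) -> List[str]:
    """Build BEL statement strings from a hypothetical flattened record.

    The keys read here are assumptions for illustration only; a real
    implementation would first extract them from the dbSNP JSON.
    """
    snp = f"DBSNP:{record['rsid']}"
    statements: List[str] = []

    gene = record.get("gene")
    if gene:
        if record.get("hgnc"):
            # Equivalence between the RefSeq gene and the HGNC symbol.
            statements.append(f"g({gene['refseq']}) eq g(HGNC:{record['hgnc']})")
        statements.extend(_variant_pair("g", gene["refseq"], gene["hgvs"], snp))

    for transcript in record.get("transcripts", []):
        statements.extend(
            _variant_pair("r", transcript["refseq"], transcript["hgvs"], snp)
        )

    # Protein entries are optional, matching "when available" above.
    for protein in record.get("proteins", []):
        statements.extend(
            _variant_pair("p", protein["refseq"], protein["hgvs"], snp)
        )

    return statements


# Hypothetical flattened record built from the rs429358 example above.
example = {
    "rsid": "rs429358",
    "hgnc": "APOE",
    "gene": {"refseq": "NG_007084.2", "hgvs": "g.7903T>C"},
    "transcripts": [{"refseq": "NM_001302688.2", "hgvs": "c.466T>C"}],
    "proteins": [{"refseq": "NP_001289617.1", "hgvs": "p.Cys156Arg"}],
}

if __name__ == "__main__":
    stream = io.StringIO(json.dumps(example) + "\n")
    for rec in iter_json_lines(stream):
        for statement in bel_statements(rec):
            print(statement)
```

Keeping the parsing and the statement formatting in separate functions means the same formatter could be reused once the real field extraction from dbSNP records is written.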
