merged changes, update changes
lpantano committed Jul 1, 2019
2 parents 0fc120a + 87be0e0 commit b4d26d0
Showing 2 changed files with 16 additions and 8 deletions.
9 changes: 7 additions & 2 deletions Changes.md
@@ -1,5 +1,10 @@
# 1.2 July, 1st, 2019

* `## TOOLS:` is added as required in the header of the file.

# 1.1 October, 24th, 2018

* UID is following exactly MINTplates licenses since this commit: 7f7717d5f23ea638f1a14fccc6386e1dbb8a7e1a in mirtop applied
* iso_5p and iso_3p changed their meaning. Now the sign means whether the isomiR starts or ends upstream or downstream of the reference sequence.
* SNP was changed to SNV to support any kind of variant
* iso_5p and iso_3p changed their meaning. Now the sign means whether the isomiR starts or ends upstream or downstream of the reference sequence. This mainly affects iso_5p where the sign will be the opposite of that in version 1.0. `-` -> `+` and `+` -> `-`.
* `snp` word was changed to `snv` to support any kind of variant
* `iso_add` is renamed to `iso_add3p` and the category `iso_add5p` is added to the list
15 changes: 9 additions & 6 deletions definition.md
@@ -23,11 +23,13 @@ Please add description for each columns/attribute (R:required, O:optional)
* (R) database: `##source-ontology` using FAIRSharing.org:
* miRBase: (FAIRsharing) doi:10.25504/fairsharing.hmgte8
* mirGeneDB: http://mirgenedb.org
* mirCarta: https://mircarta.cs.uni-saarland.de/
* Custom database: please, provide a link to an archive release if this is the case
* (R) tools used starting with the label `## TOOLS:` and followed by tools used to call isomiRs separated by comma (`,`).
* (O) commands used to generate the file. At least information about adapter removal, filtering, aligner, mirna tool. All of them starting like: `## CMD:`. Can be multiple lines starting with this tag.
* (O) commands used to generate the file. At least information about adapter removal, filtering, aligner, mirna tool. All of them starting like: `## CMD: `. Can be multiple lines starting with this tag.
* (O) genome/database version used (maybe try to get from BAM file if GFF3 generated from it): `## REFERENCE:`
* (R) sample names used in attribute:Expression: `## COLDATA:` separated by spaces
* (O) Filter tags meaning: See Filter attribute below. Different filter tags should be separated by `,` character. Example: `## FILTER:` and example would be `## FILTER: PASS(is ok), REJECT(false positive), REJECT lowcount(rejected due to low count in data)`.
* (R) sample names used in attribute:Expression: `## COLDATA:` separated by comma: `,`.
* (O) Filter tags meaning: See Filter attribute below. Different filter tags should be separated by `,` character. Example: `## FILTER: ` and example would be `## FILTER: PASS(is ok), REJECT(false positive), REJECT lowcount(rejected due to low count in data)`.
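
The header tags listed above can be read with a few lines of Python. This is only a minimal sketch, not part of the commit: it assumes the newer wording in this hunk (comma-separated `## TOOLS:`, `## COLDATA:` and `## FILTER:` values, one `## CMD:` line per command), and the helper names `parse_header` and `missing_required` are hypothetical.

```python
def parse_header(lines):
    """Collect `## TAG: value` header lines into a dict (hypothetical helper)."""
    header = {"TOOLS": [], "CMD": [], "COLDATA": [], "FILTER": [], "REFERENCE": None}
    for line in lines:
        # Only lines of the form "## TAG: value" are considered here;
        # "##source-ontology" and data rows are skipped.
        if not line.startswith("## ") or ":" not in line:
            continue
        tag, _, value = line[3:].partition(":")
        tag, value = tag.strip(), value.strip()
        if tag in ("TOOLS", "COLDATA", "FILTER"):
            # Assumption: these values are comma-separated lists.
            header[tag] = [item.strip() for item in value.split(",") if item.strip()]
        elif tag == "CMD":
            # The tag may appear on several lines, one command each.
            header["CMD"].append(value)
        elif tag == "REFERENCE":
            header["REFERENCE"] = value
    return header


def missing_required(header):
    """Return the (R) tags from the list above that are absent or empty."""
    return [tag for tag in ("TOOLS", "COLDATA") if not header[tag]]


example = [
    "## TOOLS: toolA, toolB",
    "## CMD: trim-adapters reads.fq",
    "## CMD: align trimmed.fq",
    "## COLDATA: sample1,sample2",
]
parsed = parse_header(example)
print(parsed["COLDATA"], missing_required(parsed))
```

The tool and command strings in `example` are placeholders, not real invocations.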

## Columns

@@ -44,9 +46,10 @@ Please add description for each columns/attribute (R:required, O:optional)
* (R) Name: mature name
* (R) Parent: hairpin precursor name
* (R) Variant: (categorical types - adapted from isomiR-SEA)
* `iso_5p:+/-N`. `+` indicates extra nucleotides not in the reference miRNA. `-` indicates removed nucleotides not in the sequence. `N` the number of nucleotides of difference. For instance, if the sequence starts 2 nts after the reference miRNA, the label will be: `iso_5p:-2`, but if it starts before, the label will be `iso_5p:+2`.
* `iso_5p:+/-N`. `+` indicates the start is shifted to the right. `-` indicates the start is shifted to the left. `N` the number of nucleotides of difference. For instance, if the sequence starts 2 nts after the reference miRNA, the label will be: `iso_5p:+2`, but if it starts before, the label will be `iso_5p:-2`.
* `iso_3p:+/-N`. Same explanation applied.
* `iso_add:+N`. Same explanation applied.
* `iso_add3p:N`. Number of non-template nucleotides added at 3p.
* `iso_add5p:N`. Number of non-template nucleotides added at 5p.
* `iso_snv_seed`: when affected nucleotides are between [2-7].
* `iso_snv_central_offset`: when the affected nucleotide is at position [8].
* `iso_snv_central`: when affected nucleotides are between [9-12].
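
To make the label syntax above concrete, here is a small sketch of how a `Variant` value could be split into labels and numbers. It is an illustration under assumptions, not the reference implementation: it follows the newer wording in this hunk (signed `iso_5p`/`iso_3p`, unsigned `iso_add3p`/`iso_add5p`), assumes labels are joined by `,`, and `parse_variant` is a hypothetical name.

```python
import re

# Signed shifts: iso_5p:+2, iso_3p:-1
_SHIFT = re.compile(r"^(iso_[35]p):([+-])(\d+)$")
# Unsigned non-template additions: iso_add3p:3, iso_add5p:1
_ADD = re.compile(r"^(iso_add[35]p):(\d+)$")


def parse_variant(value):
    """Map each label in a Variant value to an int, or None if it has no number."""
    parsed = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        m = _SHIFT.match(item)
        if m:
            n = int(m.group(3))
            # Positive: start/end shifted to the right; negative: to the left.
            parsed[m.group(1)] = n if m.group(2) == "+" else -n
            continue
        m = _ADD.match(item)
        if m:
            parsed[m.group(1)] = int(m.group(2))
            continue
        # Labels such as iso_snv_seed carry no numeric part.
        parsed[item] = None
    return parsed


print(parse_variant("iso_add3p:3,iso_3p:+2"))
```

With the sign convention described above, a positive `iso_5p` value means the sequence starts after the reference miRNA start, and a negative one means it starts before.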
@@ -55,7 +58,7 @@ Please add description for each columns/attribute (R:required, O:optional)
* (O) Changes (optional): similar to previous one but indicating the nucleotides being changed.
* additions are in capital case
* deletions are in lower case
* example: `Changes iso_5p:0,iso_3p:TT,iso_add:GTC` where `Variant iso_add:+3,iso_3p:+2`.
* example: `Changes iso_5p:0,iso_3p:TT,iso_add3p:GTC` where `Variant iso_add3p:3,iso_3p:+2`.
* (R) Cigar: CIGAR string as indicated [here](https://samtools.github.io/hts-specs/SAMv1.pdf). It is the standard CIGAR for aligners, with the restriction that `M` always means an exact match. This differs from some aligners, where `M` includes mismatches. Here, a mismatch should be output like `11MA7M`, which indicates a mismatch at position 12, where `A` is the reference nucleotide.
* (R) Hits: number of hits in the database.
* (O) Alias (Optional): get names from miRBase/miRgeneDB or other database separated by `,`
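
The restricted `Cigar` convention described above (`M` is always an exact match, a bare letter marks a mismatch and gives the reference nucleotide) can be decoded as in the sketch below. This is a simplified illustration: it only handles `M` runs and single-letter mismatches, deliberately rejects every other operation (insertions, deletions, clipping), and `mismatch_positions` is a hypothetical helper name.

```python
import re

# Either a run of exact matches ("11M") or a single reference nucleotide ("A").
_TOKEN = re.compile(r"(\d+)M|([ACGTN])")


def mismatch_positions(cigar):
    """Return [(1-based position, reference nucleotide)] for each mismatch."""
    position = 0
    mismatches = []
    index = 0
    while index < len(cigar):
        m = _TOKEN.match(cigar, index)
        if m is None:
            # Indels and other operations are outside this sketch.
            raise ValueError(f"unsupported CIGAR token at offset {index}: {cigar!r}")
        if m.group(1) is not None:
            position += int(m.group(1))  # exact-match run
        else:
            position += 1  # one mismatching base
            mismatches.append((position, m.group(2)))
        index = m.end()
    return mismatches


print(mismatch_positions("11MA7M"))  # [(12, 'A')]
```

For the `11MA7M` example, the 11 exact matches are followed by one mismatch at position 12 with reference nucleotide `A`, then 7 more exact matches.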