merged changes, update changes

miRTop · Jul 1, 2019 · b4d26d0
2 parents 0fc120a + 87be0e0
commit b4d26d0
Showing 2 changed files with 16 additions and 8 deletions.
diff --git a/Changes.md b/Changes.md
@@ -1,5 +1,10 @@
+# 1.2 July, 1st, 2019
+
+* `## TOOLS:` is now a required line in the header of the file.
+
 # 1.1 October, 24th, 2018
 
 * UID is following exactly MINTplates licenses since this commit: 7f7717d5f23ea638f1a14fccc6386e1dbb8a7e1a in mirtop applied
-* iso_5p and iso_3p changed their meaning. Nows the sign means whether the isomiR starts or ends upstream or downstream the refence sequence.
-* SNP was change to SNV to support any kind of variant
+* iso_5p and iso_3p changed their meaning. Now the sign indicates whether the isomiR starts or ends upstream or downstream of the reference sequence. This mainly affects iso_5p, where the sign is the opposite of the one used in version 1.0: `-` -> `+` and `+` -> `-`.
+* The word `snp` was changed to `snv` to support any kind of variant
+* `iso_add` is renamed to `iso_add3p` and the category `iso_add5p` is added to the list
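
The renames and the sign flip listed above can be illustrated with a small converter. This is only a sketch, not part of mirtop: the function name `upgrade_variant` is hypothetical, and it assumes that 1.0 labels looked like `iso_5p:-2`, `iso_add:+3` and `iso_snp_seed`, and that only `iso_5p` needs its sign flipped.

```python
def upgrade_variant(variant: str) -> str:
    """Rewrite a comma-separated 1.0 Variant string using 1.1 naming (sketch)."""
    upgraded = []
    for label in variant.split(","):
        name, _, value = label.strip().partition(":")
        if name == "iso_5p" and value and value[0] in "+-":
            # Assumption: only iso_5p flips its sign between 1.0 and 1.1.
            value = ("-" if value[0] == "+" else "+") + value[1:]
        elif name == "iso_add":
            # iso_add becomes iso_add3p and the count is written without a sign.
            name = "iso_add3p"
            value = value.lstrip("+")
        # Assumption: 1.0 categories used the prefix iso_snp.
        name = name.replace("iso_snp", "iso_snv")
        upgraded.append(f"{name}:{value}" if value else name)
    return ",".join(upgraded)


print(upgrade_variant("iso_5p:-2,iso_add:+3,iso_snp_seed"))
```

Labels that carry no sign, such as `iso_5p:0`, pass through unchanged in this sketch.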
diff --git a/definition.md b/definition.md
@@ -23,11 +23,13 @@ Please add description for each columns/attribute (R:required, O:optional)
   * (R) database: `##source-ontology` using FAIRSharing.org:
     * miRBase: (FAIRsharing) doi:10.25504/fairsharing.hmgte8
     * mirGeneDB: http://mirgenedb.org
+    * mirCarta: https://mircarta.cs.uni-saarland.de/
+    * Custom database: please provide a link to an archived release if this is the case
   * (R) tools used starting with the label `## TOOLS:` and followed by tools used to call isomiRs separated by comma (`,`).
-  * (O) commands used to generate the file. At least information about adapter removal, filtering, aligner, mirna tool. All of them starting like: `## CMD:`. Can be multiple lines starting with this tag.
+  * (O) commands used to generate the file, with at least information about adapter removal, filtering, aligner and miRNA tool. Each command line starts with `## CMD: `. There can be multiple lines starting with this tag.
   *  (O) genome/database version used (maybe try to get from BAM file if GFF3 generated from it): `## REFERENCE:`
-  * (R) sample names used in attribute:Expression: `## COLDATA:` separated by spaces
-  * (O) Filter tags meaning: See Filter attribute below. Different filter tags should be separated by `,` character. Example: `## FILTER:` and example would be `## FILTER: PASS(is ok), REJECT(false positive), REJECT lowcount(rejected due to low count in data)`.
+  * (R) sample names used in attribute:Expression: `## COLDATA:` separated by comma: `,`.
+  * (O) Filter tags meaning: See Filter attribute below. Different filter tags should be separated by the `,` character. The line starts with `## FILTER: `, for example: `## FILTER: PASS(is ok), REJECT(false positive), REJECT lowcount(rejected due to low count in data)`.
 
 ## Columns
 
@@ -44,9 +46,10 @@ Please add description for each columns/attribute (R:required, O:optional)
   * (R) Name: mature name
   * (R) Parent: hairpin precursor name
   * (R) Variant: (categorical types - adapted from isomiR-SEA)
-    * `iso_5p:+/-N`. `+` indicates extra nucleotides not in the reference miRNA. `-` indicates removed nucleotides not in the sequence. `N` the number of nucleotides of difference. For instance, if the sequence starts 2 nts after the reference miRNA, the label will be: `iso_5p:-2`, but if it starts before, the label will be `iso_5p:+2`.
+    * `iso_5p:+/-N`. `+` indicates the start is shifted to the right. `-` indicates the start is shifted to the left. `N` is the number of nucleotides of difference. For instance, if the sequence starts 2 nts after the reference miRNA, the label will be: `iso_5p:+2`, but if it starts before, the label will be `iso_5p:-2`.
     * `iso_3p:+/-N`. Same explanation applied.
-    * `iso_add:+N`. Same explanation applied.
+    * `iso_add3p:N`. Number of non-template nucleotides added at 3p.
+    * `iso_add5p:N`. Number of non-template nucleotides added at 5p.
     * `iso_snv_seed`: when affected nucleotides are between [2-7].
     * `iso_snv_central_offset`: when affected nucleotides is at position [8].
     * `iso_snv_central`: when affected nucleotides are between [9-12].
@@ -55,7 +58,7 @@ Please add description for each columns/attribute (R:required, O:optional)
   * (O) Changes (optional): similar to previous one but indicating the nucleotides being changed.
     * additions are in capital case
     * deletions are in lower case
-    * example: `Changes iso_5p:0,iso_3p:TT,iso_add:GTC` where `Variant iso_add:+3,iso_3p:+2`.
+    * example: `Changes iso_5p:0,iso_3p:TT,iso_add3p:GTC` where `Variant iso_add3p:3,iso_3p:+2`.
   * (R) Cigar: CIGAR string as indicated [here](https://samtools.github.io/hts-specs/SAMv1.pdf). It is the standard CIGAR for aligners. With the restriction that `M` means exact match always. That's a difference with some aligners where `M` includes mismatches. In this case, if there is a mismatch, then it should be output like: `11MA7M` to indicates there is a mismatch at position 12, where `A` is the reference nucleotide.
   * (R) Hits: number of hits in the database.
   * (O) Alias (Optional): get names from miRBase/miRgeneDB or other database separated by `,`
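
The restricted CIGAR described above, where `M` always means an exact match and a bare nucleotide letter marks a mismatch with the reference base, could be read roughly as follows. This is a minimal sketch under stated assumptions: the helper name `mismatch_positions` is hypothetical, positions are 1-based, and only `M` and the letters `A`, `C`, `G`, `T` are handled; any other operation raises an error rather than being guessed at.

```python
import re

NUCLEOTIDES = set("ACGT")


def mismatch_positions(cigar: str):
    """Return (position, reference_nt) pairs for mismatches in the CIGAR (sketch)."""
    tokens = re.findall(r"(\d*)([A-Z])", cigar)
    # Reject strings that are not fully covered by count/operation tokens.
    if "".join(count + op for count, op in tokens) != cigar:
        raise ValueError(f"cannot parse CIGAR {cigar!r}")
    position = 0
    found = []
    for count, op in tokens:
        length = int(count) if count else 1
        if op == "M":
            # Exact matches only advance the position.
            position += length
        elif op in NUCLEOTIDES:
            # Each mismatch records the reference nucleotide at that position.
            for _ in range(length):
                position += 1
                found.append((position, op))
        else:
            raise ValueError(f"operation {op!r} is not handled in this sketch")
    return found


print(mismatch_positions("11MA7M"))
```

For the example in the text, `11MA7M`, the sketch reports a single mismatch at position 12 with reference nucleotide `A`.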