forked from JurMich/RNANR
-
Notifications
You must be signed in to change notification settings - Fork 0
bonsai-team/RNANR
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
__ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
/_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/
RNANR - RNA Non-Redundant - program to compute
and sample non-redundantly RNA secondary structures
__ __ __ __ __ __ __ __ __ __ __ __ __
/_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/
______ __ _ ____ __ _ ______
| __ \ | \ | | / __ \ | \ | | | __ \
| |__| | | \ | | / |__| \ | \ | | | |__| |
| _ / | |\ \| | | __ | | |\ \| | | _ /
| | \ \ | | \ | | | | | | | \ | | | \ \
|_| \__\ |_| \__| |_| |_| |_| \__| |_| \__\
__ __ __ __ __ __ __ __ __ __ __ __ __
/_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/
\ /
\/\/HAT IS IT ?
----------------
RNANR is a C program that reads an RNA sequence and computes all
possible locally optimal secondary structures for this sequence (or
optionnally just counting them). The folding model is based on the
Nussinov model. It can be refined through a series of parameters. See
the section "OPTIONAL PARAMETERS - TOPOLOGY OF THE SECONDARY STRUCTURE".
One of features of RNANR is also the ability to sample locally optimal
secondary structures without redundancy. The core of the functionality
is based on the computation of Boltzmann probability of each substructure
using Turner energy model and generating the complete structure based on
them. To know more about it, please refer to the section "HOW DO I
PERFORM SAMPLING?"
|_|
| |OW TO INSTALL IT ?
---------------------
_
|_|
| \EQUIREMENTS:
---------------
Before any installation of program is made, make sure you have:
- ViennaRNA. The version tested is 2.2.8., however the latest version
should be compatible too.
- Optional: MPFR and GMP by extension (required by MPFR). The version
tested is 3.1.5. If you don't have or don't wan't to install it you
install the version of program without MPFR (see below).
|
|NSTALLATION:
-------------
The program offers two versions to install:
- the default version includes possibility to raise computation precision.
To install this version, just go to 'RNANR/src' of your downloaded
directory and type 'make'.
- you can also install the version not needing MPFR (obligatory for those
not having it installed). Go to 'RNANR/src' of your downloaded directory
and type 'make CFLAGS="-DIGNOREMPFR"'.This will install version without
MPFR dependancy. Option '-w' isn't available in this version.
Go to the "src" directory, and type 'make'. This will create an executable
file "locopt".
|_|
| |OW TO RUN IT ?
-----------------
You can try the program with the file "test_file.fa" provided in the
samplel directory.
> RNANR -i ../sample/test_file.fa
The input file is an RNA sequence in FASTA format. DNA alphabet
(A,C,G,T) is also accepted.
By default, results are displayed on the standard output. With the
"test_file.fa" and default paramater values, you should get 1106
locally optimal secondary structures, that are represented in
bracket-dot format. The number at the end of each line is the total
number of base pairs in the structure.
|_|
| |OW DO I PERFORM SAMPLING ?
-----------------------------
There are three ways to perform sampling using optional parameters:
-s : Performs a non-redundant sampling with number of samples equal to
the number entered following this parameteres. This value has to
be given. Note that if higher number of samples than existing
structures is asked for, the program will return an error message
specifying so.
-z : Performs a non-redundant sampling, but unlike the previous option,
the number of samples corresponds to attained coverage (the sum of
Boltzmann probabilities of returned samples). Has to have value
between 0 and 1.
-s <int> -r : Similar to option -s, but with redundancy, and consequently
without upper limit on number of structures returned.
|_|
| |OW TO GET SUPPLEMENTARY INFORMATION ABOUT ALL PARAMETERS ?
-------------------------------------------------------------
The parameters are described more in detail below this paragraph.
However, you can always print the manual included in the program itself
by simply typing:
> RNANR -h
The help will be also printed if incorrect parameters or their values
were entered.
_
| |
|_|PTIONAL PARAMETER - COUNTING
-------------------------------
-c : Count the number of locally optimal secondary structures, instead
of building them.
_
| |
|_|PTIONAL PARAMETERS - TOPOLOGY OF THE SECONDARY STRUCTURE
-----------------------------------------------------------
The algorithm of locopt includes a series of combinatorial parameters,
that allow to drastically reduce search space and to generate more
realistic structures.
-a : Minimum helix length
The minimum number of consecutive base pairs in a helix.
Default is 3.
-k : Minimum hairpin loop length.
The minimum number of single stranded bases between any two bases
that are paired together. Default is 3.
-l : Maximum hairpin loop length
The maximum number of single stranded bases between any two bases
that are paired togethe. Default is 12.
-m : Maximum loop length
The maximum number of single stranded bases between any two
paired bases that do not form a terminal loop. Default is 7.
-n : Maximum span
The maximum number of positions for a base pair to span. Default
is the length of the input sequence.
-p : Minimum percentage of paired positions. Default is 0. Not
compatible with option -c.
-q : Maximum number of branches within multiloop. Default option is the
length of entered sequence, which basically means there is no limit.
For example, the secondary structure below fullfils the following parameters.
((((.(((..((...))))).....(((....)))))))
Minimum helix length: 2
Maximum stemloop length: 4
Minimum stemloop length: 3
Maximum loop length: 5
Maximum span: 37
Minimum percentage of paired positions: 64
Maximum degree of multiloop: 2
_
| |
|_|PTIONAL PARAMETERS - SET OF BASE PAIRS
-----------------------------------------
By default, the set of all possible base pairs is composed of all A-U,
C-G and G-U base pairs present in the RNA sequence. It is also
possible to specify your own set of base pairs, and/or to combine both
solutions.
In this case, you should provide the program with a text file that
contains all the base pairs in dot-bracket format. This file can
contain as many lines as necessary.
-d : This option allows the user to specify his/her own set of base
pairs. The set of locally optimal secondary structures will be
built solely on this set of base pairs.
-e : This option enables to specify additional base pairs, in
complement to A-U, C-G and G-U base pairs present in the RNA
sequence.
_
| |
|_|PTIONAL PARAMETERS - PRECISION
---------------------------------
The program offers two precision versions, the standard double and 100
bit precision using MPFR. The default version used is standard.
- w : Toggles on MPFR precision. This allows deeper sampling. The
computation becomes considerably slower, so for samples of smaller
sequences and smaller sample sizes in general this option is not
recommended. Unavailable if the second version of program was
installed.
_
| |
|_|PTIONAL PARAMETERS - TEMPERATURE SCALING
-------------------------------------------
The program has two ways of influencing the temperature at which energy
computation and sampling is done:
- t : Changes the global temperature of the system to the value entered
in Celsius. Default value is 37°C.
- b : Scaling of temperature that does not affect the free energy of
computed structures and only intervenes during the computation
of Boltzmann probabilities. Has positive values. Higher values can
be used to randomize sampling. Default is 1.0
_
| |
|_|PTIONAL PARAMETERS - OUTPUT FILE
-----------------------------------
By default, the result is displayed on the standart output.
-o : Name of the output file that will contain all locally optimal
secondary structures. Not compatible with option -c or sampling
options
_
| |
|_|PTIONAL PARAMETERS - COMPUTING DIFFERENT VALUES
------------------------------------------------
Besides the count of all locally optimal secondary structures, there
are two options that can be called and return a global information about
the structures of entered sequence:
- f : Computes partition function of all locally optimal secondary
structures and exits.
- u : Computes the number of all flat structures and exits.
|/
|\NOWN BUGS : The sampling starts to send invalid samples once their
------------- Boltzmann probability starts getting low. This manifests
as program returning empty and half-empty structures.
To prevent this use '-w' option.
_
|_|
| |UTHOR : Helene Touzet ([email protected])
---------- Juraj Michalik ([email protected])
Yann Ponty ([email protected])
Please send comments and bug reports to one of the addresses above.
_
| |
|_IBRARY |_REDITS:
-------------------
ViennaRNA - https://www.tbi.univie.ac.at/RNA/
MPFR - http://www.mpfr.org/
__ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __
/_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/ /_/
About
A software for determining secondary structures within RNA and their non-redundant sampling.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published
Languages
- C 98.5%
- Makefile 1.5%