richarddurbin/hexamer

hexamer and hextable
--------------------

hextable builds a table of statistics that hexamer uses to scan for
likely coding regions. The principle is to score 6mers without
deriving any information from base composition. I therefore normalise
the frequency of each 6mer by dividing it by the total frequency of
all 6mers with the same base composition.
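
The normalisation above can be sketched as follows. This is only an
illustration in Python, not the code in hextable: the function names
are hypothetical, and counting 6mers at every third position (one per
codon) is an assumption based on the input being in frame.

```python
from collections import Counter, defaultdict

def composition(kmer):
    # Base composition: how many a, c, g and t the 6mer holds, ignoring order.
    return tuple(kmer.count(b) for b in "acgt")

def normalised_frequencies(seqs, k=6, step=3):
    # Count k-mers starting at every `step`-th base (assumed: codon positions).
    counts = Counter()
    for s in seqs:
        s = s.lower()
        for i in range(0, len(s) - k + 1, step):
            counts[s[i:i + k]] += 1
    # Sum the counts of all k-mers that share a base composition.
    class_total = defaultdict(int)
    for kmer, n in counts.items():
        class_total[composition(kmer)] += n
    # Dividing counts is the same as dividing frequencies, because the
    # overall total cancels out of the ratio.
    return {kmer: n / class_total[composition(kmer)]
            for kmer, n in counts.items()}
```

With this scheme a 6mer only scores well if it is common relative to
the other 6mers built from the same bases, so a sequence cannot score
highly simply by being, say, GC rich.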

The input of hextable is a fasta file of coding sequences in frame.
The -o output file is an ascii list of 4096 floating point numbers,
one per possible 6mer (4^6 = 4096), giving log likelihood ratio scores
in bits.  The output on stdout is a summary of the information content
of the table, indicating how discriminative it is likely to be.  The
output of hexamer is the maximal scoring segments of its input with
score greater than or equal to the threshold T (set with -T), in GFF
format (http://www.sanger.ac.uk/Users/rd/gff.html).
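
To illustrate what "segments with score greater than or equal to T"
means, here is a simple single-pass sketch in Python. It is not taken
from hexamer and may differ from what hexamer actually does: the
function name, the list of per-position scores, and the 0-based
inclusive coordinates are all assumptions made for the example.

```python
def high_scoring_segments(scores, threshold):
    # Returns (start, end, score) tuples, 0-based and inclusive.
    segments = []
    total = 0.0       # running score of the current candidate segment
    best = 0.0        # best running score seen in this candidate
    start = 0         # where the current candidate begins
    best_end = None   # position at which `best` was reached
    for i, s in enumerate(scores):
        total += s
        if total > best:
            best, best_end = total, i
        if total <= 0:
            # The candidate has lost all its score: report it if it was
            # good enough, then start afresh at the next position.
            if best_end is not None and best >= threshold:
                segments.append((start, best_end, best))
            total, best, best_end = 0.0, 0.0, None
            start = i + 1
    if best_end is not None and best >= threshold:
        segments.append((start, best_end, best))
    return segments
```

For example, with scores [-1, 3, 4, -10, 5, 6, -1] and a threshold of
10, the sketch reports the single segment covering positions 4 to 5
with score 11; the earlier run of 3 and 4 only reaches 7 and is
dropped.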

Type "make" to build the programs, and "make clean" to remove them.

Example usage:

	hextable -o worm.hex worm.coding
	hexamer -T 20 worm.hex AH6.dna

NB these programs assume that sequences consist only of a, c, g and t.
Any n's found in sequences are converted to c.
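
As a sketch of how a 6mer can be mapped to one of the 4096 table
entries with n treated as c, consider the Python below. The a, c, g, t
ordering and the function name are assumptions for illustration; the
order actually used in the .hex file is not stated here.

```python
# Two bits per base; n shares the code of c, as described above.
CODE = {"a": 0, "c": 1, "g": 2, "t": 3, "n": 1}

def hexamer_index(kmer):
    # Read the 6mer as a base-4 number, giving a value from 0 to 4095.
    idx = 0
    for base in kmer.lower():
        idx = idx * 4 + CODE[base]
    return idx
```

Under this assumed ordering "aaaaaa" maps to 0, "tttttt" to 4095, and
"aaaaan" to the same entry as "aaaaac".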

Richard Durbin ([email protected]) 9/95-4/98

PS 30/3/99 The original version of hexamer had some initialisation
bugs, which have been fixed today.
