Developer Guide

Bui Quang Minh edited this page Dec 6, 2015 · 20 revisions

This guide gives developers an overview of the IQ-TREE software design and data structures, and discusses how new models can be incorporated into IQ-TREE.

To achieve both high performance and flexibility, IQ-TREE is written entirely in object-oriented C++, which facilitates extending it with new sequence data types or new models. The IQ-TREE code consists of C++ classes, most of which inherit from three base classes: Alignment, ModelSubst and PhyloTree, which handle sequence alignments, models of substitution and phylogenetic trees, respectively. The following sections introduce these base classes.

TIP: IQ-TREE makes extensive use of the C++ Standard Template Library (STL), so first become familiar with the STL and its fundamental data structures such as string, vector, set and map.

Alignment class

The Alignment class stores the data as a vector of Pattern. Each Pattern is in turn a string holding the characters across all sequences at one alignment site, together with the number of times this site-pattern occurs in the Alignment (from header file pattern.h):

/**
	Site-patterns in a multiple sequence alignment
*/
class Pattern : public string {
public:
	...

	/**
		frequency appearance of the pattern
	*/
	int frequency;
};

The rationale for storing the data this way (instead of storing a set of sequences) is that most computations are carried out along the site-patterns of the Alignment. Identical columns can then be processed once and weighted by their frequency, which makes these operations both more convenient and faster.
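To make the site-pattern idea concrete, the following is a minimal sketch of how identical alignment columns could be collapsed into patterns with a frequency count. It is not IQ-TREE's actual code: SitePattern and compressSites are hypothetical names introduced for illustration, and the sketch assumes all sequences have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

/**
    Hypothetical stand-in for Pattern: the characters of one alignment
    column plus the number of times that column occurs
*/
struct SitePattern : public std::string {
    explicit SitePattern(const std::string &column)
        : std::string(column), frequency(0) {}
    int frequency;
};

/**
    Collapse identical alignment columns into unique site-patterns.
    Hypothetical helper, not part of IQ-TREE.
    @param sequences aligned sequences, all of the same length
    @return unique patterns in order of first appearance
*/
std::vector<SitePattern> compressSites(const std::vector<std::string> &sequences) {
    std::vector<SitePattern> patterns;
    if (sequences.empty()) return patterns;

    const std::size_t nsites = sequences[0].size();
    for (std::size_t s = 0; s < sequences.size(); ++s)
        assert(sequences[s].size() == nsites); // aligned input is assumed

    // maps a column string to its position in the patterns vector
    std::map<std::string, std::size_t> index;
    for (std::size_t site = 0; site < nsites; ++site) {
        // read the column at this site across all sequences
        std::string column;
        for (std::size_t s = 0; s < sequences.size(); ++s)
            column.push_back(sequences[s][site]);

        std::map<std::string, std::size_t>::iterator it = index.find(column);
        if (it == index.end()) {
            // first time this column is seen: register a new pattern
            it = index.insert(std::make_pair(column, patterns.size())).first;
            patterns.push_back(SitePattern(column));
        }
        patterns[it->second].frequency += 1;
    }
    return patterns;
}
```

For the three sequences ACGA, ACGA and ATGA, the four sites collapse into three patterns (AAA, CCT and GGG), with AAA carrying a frequency of 2. A per-site computation then needs to run only three times instead of four, with each result weighted by the pattern frequency.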

As noted above, the Alignment class is defined as (from header file alignment.h):

/**
    Multiple Sequence Alignment. Stored by a vector of site-patterns
*/
class Alignment : public vector<Pattern> {
public:
    /**
            constructor
            @param filename file name
            @param sequence_type type of the sequence, either "BIN", "DNA", "AA", or NULL
            @param intype (OUT) input format of the file
     */
    Alignment(char *filename, char *sequence_type, InputType &intype);

    ...
};

NOTICE: Please follow the commenting style of the code when declaring new components (classes, functions or variables), as in the example above. That way, the source code documentation can be generated with tools like Doxygen. See the Doxygen commenting manual for more details.

ModelSubst class (model of substitution)

ModelSubst is the base class for all substitution models implemented in IQ-TREE. It implements the basic Jukes-Cantor-type model (equal substitution rates and equal state frequencies), which works for all data types. The ModelSubst class declares a number of virtual methods that need to be overridden when implementing a new model, for example (from header file model/modelsubst.h):

/**
    Substitution model abstract class
*/
class ModelSubst: public Optimization
{
public:
	/**
		constructor
		@param nstates number of states, e.g. 4 for DNA, 20 for proteins.
	*/
    ModelSubst(int nstates);

	/**
		@return the number of dimensions
	*/
	virtual int getNDim() { return 0; }

    ...
};

As an example, the method getNDim() should return the number of free parameters of the model, which is 0 for the default JC-type model.
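The overriding mechanism can be sketched as follows. This is a simplified illustration rather than IQ-TREE code: ModelSubstSketch is a stand-in for the real ModelSubst (which derives from Optimization and declares many more methods), and ModelKappaSketch and countFreeParameters are hypothetical names. The derived class is assumed to have a single free parameter, such as a transition/transversion ratio.

```cpp
#include <cassert>

/**
    Simplified stand-in for ModelSubst (assumption: only the parts shown
    in this guide are reproduced)
*/
class ModelSubstSketch {
public:
    /**
        constructor
        @param nstates number of states, e.g. 4 for DNA, 20 for proteins.
    */
    explicit ModelSubstSketch(int nstates) : num_states(nstates) {
        assert(nstates > 1);
    }

    virtual ~ModelSubstSketch() {}

    /**
        @return the number of free parameters, 0 for the JC-type default
    */
    virtual int getNDim() { return 0; }

    int num_states;
};

/**
    Hypothetical model with one free parameter, for illustration only
*/
class ModelKappaSketch : public ModelSubstSketch {
public:
    explicit ModelKappaSketch(int nstates) : ModelSubstSketch(nstates) {}

    /**
        @return 1, because this model has a single free parameter
    */
    virtual int getNDim() { return 1; }
};

/**
    Code written against the base class picks up the override through
    virtual dispatch
    @param model any model derived from ModelSubstSketch
    @return the number of free parameters reported by the model
*/
int countFreeParameters(ModelSubstSketch &model) {
    return model.getNDim();
}
```

Because getNDim() is virtual, code that only knows the base class, such as an optimizer, still receives the value reported by the derived model.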

PhyloTree class (phylogenetic tree)

PhyloTree is the base class for phylogenetic trees.
