bito, or "Bayesian Inference of Trees via Optimization", is a Python-interface C++ library for phylogenetic variational inference so that you can express interesting parts of your phylogenetic model in Python/TensorFlow/PyTorch/etc and let bito handle the tree structure and likelihood computations for you.
"Bito" is also the name of a tree native to Africa that produces medicinal oil.
We pronounce "bito" with a long /e/ sound ("bito" rhymes with "burrito").
This library is in an experimental state; it was formerly known as "libsbn".
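To make that division of labor concrete, here is a minimal sketch of the intended workflow. The constructor and method names below (`unrooted_instance`, `read_fasta_file`, `read_newick_file`, `log_likelihoods`) are illustrative assumptions rather than a statement of the documented API; see the documentation linked below for the real interface.

```python
import bito

# Illustrative sketch only: the constructor and method names here are assumptions.
# The idea is that bito owns the trees, the alignment, and the likelihood
# machinery, while model parameters live in Python (e.g. as TensorFlow or
# PyTorch tensors).
inst = bito.unrooted_instance("example")      # assumed constructor
inst.read_fasta_file("data/alignment.fasta")  # hypothetical file paths
inst.read_newick_file("data/trees.nwk")

# A real run would also need to specify a substitution model and other
# settings before asking for per-tree log likelihoods; that setup is omitted.
print(inst.log_likelihoods())                 # assumed method name
```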
- If you are on Linux, install gcc >= 7.5, which is standard in Debian Buster and Ubuntu 18.04.
- If you are on OS X, use a recent version of Xcode and install the command line tools.
 
We suggest using Anaconda and the associated conda environment file, which will install the relevant dependencies:

- `conda env create -f environment.yml`
- `conda activate bito`
(Very optional) The notebooks require R, IRKernel, rpy2 >= 3.1.0, and some R packages such as ggplot2 and cowplot.
For your first build, do

- `git submodule update --init --recursive`
- `make`
This will install the bito Python module.
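As a quick sanity check that the build worked (with the conda environment above active), confirm that the module imports; this is just a minimal smoke test:

```python
# Smoke test: verify that the freshly built bito module can be imported
# from the active conda environment.
import bito

print("bito imported from:", bito.__file__)
```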
You can build and run tests using `make test` and `make fasttest` (the latter excludes some slow tests).
Note that `make` accepts `-j` flags for multi-core builds: e.g. `-j20` will build with 20 jobs.
- (Optional) If you modify the lexer and parser, call `make bison`. This assumes that you have installed Bison >= 3.4 (`conda install -c conda-forge bison`).
- (Optional) If you modify the test preparation scripts, call `make prep`. This assumes that you have installed ete3 (`conda install -c etetoolkit ete3`).
The following two papers will explain what this repository is about:
- Zhang & Matsen IV, NeurIPS 2018. Generalizing Tree Probability Estimation via Bayesian Networks; 👉🏽 blog post.
 - Zhang & Matsen IV, ICLR 2019. Variational Bayesian Phylogenetic Inference; 👉🏽 blog post.
 
Our documentation consists of:
- Online documentation
- Derivations in `doc/tex`, which explain what's going on in the code.
We welcome your contributions! Please see our detailed contribution guidelines.
- Erick Matsen (@matsen): implementation, design, janitorial duties
- Dave H. Rich (@DaveRich): core developer
- Ognian Milanov (@ognian-): core developer
- Mathieu Fourment (@4ment): implementation of substitution models and likelihoods/gradients, design
- Seong-Hwan Jun (@junseonghwan): generalized pruning design and implementation, implementation of SBN gradients, design
- Hassan Nasif (@hrnasif): hot start for generalized pruning; gradient descent for generalized pruning
- Anna Kooperberg (@annakooperberg): refactoring the subsplit DAG
- Sho Kiami (@shokiami): refactoring the subsplit DAG
- Tanvi Ganapathy (@tanviganapathy): refactoring the subsplit DAG
- Lucy Yang (@lucyyang01): subsplit DAG visualization
- Cheng Zhang (@zcrabbit): concept, design, algorithms
- Christiaan Swanepoel (@christiaanjs): design
- Xiang Ji (@xji3): gradient expertise and node height code
- Marc Suchard (@msuchard): gradient expertise and node height code
- Michael Karcher (@mdkarcher): SBN expertise
- Eric J. Isaac (@EricJIsaac): C++ wisdom
 
If you are citing this library, please cite the NeurIPS and ICLR papers listed above. We require BEAGLE, so please also cite the BEAGLE papers.

We are grateful to the following for code that we have adapted:

- Jaime Huerta-Cepas: several tree traversal functions are copied from ete3
- Thomas Junier: parts of the parser are copied from newick_utils
- The parser driver is derived from the Bison C++ example
 
In addition to the packages mentioned above, we also employ:

- cxx-prettyprint for STL container pretty-printing
- Eigen
- fast-cpp-csv-parser
- Progress-CPP for the progress bar