EnsemblLite

Warning EnsemblLite is not ready for use! We will remove this notice when we are ready to post to PyPi at which point it will be ready for trialling. In the meantime, you can check the project progress towards being usable via the EnsemblLite roadmap.

A screencast of an early prototype

🎬 Very early proof-of-concept demo and plan for a new style terminal user interface

demo-tui.mp4

NOTE: the command line name has changed since this early version. See text below for the new name.

Developer installs

Fork the repo and clone your fork to your local machine. In the terminal, create either a python virtual environment or a new conda environment and activate it. In that virtual environment

$ pip install flit

Then do the flit version of a "developer install". (It is basically creating a symlink to the repos source directory.)

$ flit install -s --python `which python`

Installation

Suggest creating a conda environment or a python virtual environment, using python3.11. Then install directly into that environment from the GitHub repo as

$ python -m pip install "ensembl_lite @ git+https://github.com/cogent3/EnsemblLite.git@develop"

Then run for the first time using

$ elt tui

The first start takes a while as, behind the scenes, cogent3 is transpiling various functions into C and compiling them. Eventually, you get a very neat terminal interface you can click around in. To exit, make sure the "root" is selected on the left panel then ^+r.

Usage

The setup is (for now) controlled using a config file, defined in ini format. To get a starting template use the exportrc subcommand.

Usage: elt exportrc [OPTIONS]

  exports sample config and species table to the nominated path

  setting an environment variable ENSEMBLDBRC with this path will force its
  contents to override the default ensembl_lite settings

Options:
  -o, --outpath PATH  path to directory to export all rc contents
  --help              Show this message and exit.

Click to see a sample config file I've been using for development

Using this config, it takes approximately 16' to download (over a ~200MB/s WiFi connection) and ~45' to install on my M2 Macbook Pro (note the install is incomplete). (Note this step uses up to 10 CPU cores.)

[remote path]
host=ftp.ensembl.org
path=pub
[local path]
staging_path=~/Desktop/Outbox/ensembl_download
install_path=~/Desktop/Outbox/ensembl_install
[release]
release=110
[Mouse Lemur]
db=core
[Macaque]
db=core
[Gibbon]
db=core
[Orangutan]
db=core
[Bonobo]
db=core
[Human]
db=core
[Chimp]
db=core
[Gorilla]
db=core
[compara]
align_names=10_primates.epo

Download

Downloads the species indicated in the config file:

genomes sequences as fasta format
annotations as gff3
gene homologies for individual genomes in tsv format

Alignments indicated in the config file will be downloaded in .maf format.

Downloads are written to a local directory, specified in the config file. Downloads are done in parallel (using threads).

Install

"Installation" presently involves transforming downloaded files into local sqlite3 databases which are saved to the location specified in the config file.

From the maf alignment files, the "ancestral" sequences are discarded and for every aligned sequence only the gap data is stored (i.e. gap position and length) along with the genomic coordinates. These alignments will be reconstructable by combining this information with the whole genome sequence. (This approach reduces storage requirements ~5-fold).

Installation is done in parallel on multiple CPUs (since the data need to be decompressed on the fly).

Name		Name	Last commit message	Last commit date
Latest commit History 774 Commits
.github/workflows		.github/workflows
changelog.d		changelog.d
src/ensembl_lite		src/ensembl_lite
tests		tests
.gitignore		.gitignore
.hgignore		.hgignore
.sourcery.yaml		.sourcery.yaml
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
README.md		README.md
noxfile.py		noxfile.py
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EnsemblLite

A screencast of an early prototype

Developer installs

Installation

Usage

Download

Install

About

Releases

Packages

Languages

weilinwu97/EnsemblLite

Folders and files

Latest commit

History

Repository files navigation

EnsemblLite

A screencast of an early prototype

Developer installs

Installation

Usage

Download

Install

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages