diff --git a/README.md b/README.md index 608b4dd..bcae8b4 100644 --- a/README.md +++ b/README.md @@ -7,151 +7,16 @@ [![GitHub Issues](https://img.shields.io/github/issues/gear-genomics/tracy.svg)](https://github.com/gear-genomics/tracy/issues) -## Installing Tracy +## Tracy: basecalling, alignment, assembly and deconvolution of Sanger Chromatogram trace files +Tracy is available as a [Bioconda package](https://anaconda.org/bioconda/tracy), as a pre-compiled statically linked binary from [Tracy's github release page](https://github.com/gear-genomics/tracy/releases), as a singularity container [SIF file](https://github.com/gear-genomics/tracy/releases) or as a minimal [Docker container](https://hub.docker.com/r/geargenomics/tracy/). -The easiest way to get Tracy is to download the statically linked binary or the singularity container (SIF file) from the [Tracy release page](https://github.com/gear-genomics/tracy/releases). Alternatively, you can download Tracy from [Bioconda](https://anaconda.org/bioconda/tracy) or pull the [Tracy docker container](https://hub.docker.com/r/geargenomics/tracy/). +[Source Code](https://github.com/gear-genomics/tracy/) +[Web Application](https://www.gear-genomics.com) -## Building from Source +[Documentation](https://www.gear-genomics.com/docs/tracy/) -`git clone --recursive https://github.com/gear-genomics/tracy.git` +## Citation -`cd tracy/` - -`make all` - -`make install` - -Tracy requires some system libraries such as bzip2, zlib and boost. For Ubuntu Linux you install these using: - -`apt-get install -y build-essential g++ cmake zlib1g-dev libbz2-dev liblzma-dev libboost-all-dev` - -The Mac OSX versions of these packages are: - -`brew install cmake zlib readline xz bzip2 gsl libtool pkg-config boost` - -For Mac OSX you also often need to set the library path to HTSlib. 
- -`cd tracy/` - -`export DYLD_LIBRARY_PATH=`pwd`/src/htslib/` - - -## Running Tracy - -`tracy -h` - - -## Basecalling a Trace File - -To get the primary sequence (highest peak) of a trace file in FASTA or FASTQ format. - -`tracy basecall -f fasta -o out.fasta input.ab1` - -`tracy basecall -f fastq -o out.fastq input.ab1` - -To get full trace information, including primary and secondary basecalls for heterozygous variants. - -`tracy basecall -f tsv -o out.tsv input.ab1` - - -## Alignment to a Fasta Slice - -Alignment of a trace file to a FASTA reference slice. - -`tracy align -o outprefix -r ref_slice.fa input.ab1` - - -## Alignment to a Wildtype Chromatogram - -Alignment of a trace file to a wildtype chromatogram is also possible. - -`tracy align -o outprefix -r wildtype.ab1 input.ab1` - - -## Alignment to an indexed reference genome - -Alignment to a large reference genome requires a pre-built index on a bgzip compressed genome. - -`tracy index -o hg38.fa.fm9 hg38.fa.gz` - -`samtools faidx hg38.fa.gz` - -Once the index has been built you can align to the indexed genome. - -`tracy align -r hg38.fa.gz input.ab1` - -The index needs to be built only once. Pre-built genome indices for commonly used reference genomes are available for [download here](https://gear.embl.de/data/tracy/). - - -## Separating heterozygous mutations - -Double-peaks in the chromatogram trace can cause alignment issues. Tracy supports deconvolution of heterozygous variants into two separate alleles. - -`tracy decompose -r hg38.fa.gz -o outprefix input.ab1` - -The two alleles are then separately aligned. - -`cat outprefix.align1 outprefix.align2` - -You can also use a wildtype chromatogram for decomposition. - -`tracy decompose -r wildtype.ab1 -o outprefix mutated.ab1` - -Or a simple FASTA file. 
- -`tracy decompose -r sequence.fa -o outprefix mutated.ab1` - - -## Single-nucleotide variant (SNV) and insertion & deletion (InDel) variant calling and annotation - -Tracy can call and annotate variants with respect to a reference genome. - -`tracy decompose -v -a homo_sapiens -r hg38.fa.gz -o outprefix input.ab1` - -This command produces a variant call file in binary BCF format. It can be converted to VCF using [bcftools](https://github.com/samtools/bcftools). - -`bcftools view outprefix.bcf` - - -## Using forward & reverse ab1 files to improve variant calling - -If you do have forward and reverse trace files for the same expected genomic variant you can merge variant files and check consistency of calls and genotypes. Forward trace decomposition: - -`tracy decompose -o forward -a homo_sapiens -r hg38.fa.gz forward.ab1` - -Reverse trace decomposition: - -`tracy decompose -o reverse -a homo_sapiens -r hg38.fa.gz reverse.ab1` - -Left-alignment of InDels: - -`bcftools norm -O b -o forward.norm.bcf -f hg38.fa.gz forward.bcf` - -`bcftools norm -O b -o reverse.norm.bcf -f hg38.fa.gz reverse.bcf` - -Merging of normalized variant files: - -`bcftools merge --force-samples forward.norm.bcf reverse.norm.bcf` - - -## Trace assembly - -If you tiled a genomic region with multiple chromatogram files you can assemble all of these with tracy. - -`tracy assemble -r reference.fa file1.ab1 file2.ab1 fileN.ab1` - -Tracy also supports de novo assembly if chromatogram trace files overlap sufficiently with each other. - -`tracy assemble file1.ab1 file2.ab1 fileN.ab1` - - -## Graphical user interface - -All features of tracy are available as web applications at [gear.embl.de](https://gear.embl.de/). - - -## Questions - -In case of questions feel free to send us an [email](https://www-db.embl.de/EMBLPersonGroup-PersonPicture/MailForm/?recipient=ggenomics). 
+If you use tracy please cite our URL in publications: [https://www.gear-genomics.com](https://www.gear-genomics.com) diff --git a/docs/cli/README.md b/docs/cli/README.md index c49af25..97cecad 100644 --- a/docs/cli/README.md +++ b/docs/cli/README.md @@ -1,6 +1,6 @@ # Usage -Tracy uses subcommands for [basecalling](#basecalling-a-chromatogram-trace-file), [alignment](#trace-alignment), [deconvolution](#deconvolution-of-heterozygous-mutations), [variant calling](#variant-calling). These subcommands are explained below. +Tracy uses subcommands for [basecalling](#basecalling-a-chromatogram-trace-file), [alignment](#trace-alignment), [deconvolution](#deconvolution-of-heterozygous-mutations), [variant calling](#variant-calling) and [trace assembly](#trace-assembly). These subcommands are explained below. ## Basecalling a Chromatogram Trace File @@ -111,3 +111,18 @@ Merging of normalized variant files: ```bash bcftools merge --force-samples forward.norm.bcf reverse.norm.bcf ``` + +## Trace assembly + +For a short genomic region that you tiled with multiple, overlapping Sanger Chromatogram trace files you can use tracy to assemble these. + +```bash +tracy assemble -r reference.fa file1.ab1 file2.ab1 fileN.ab1 +``` + +Instead of a reference-guided assembly using the '-r' option, tracy also supports de novo assembly of chromatogram trace files if these sufficiently overlap each other. + +```bash +tracy assemble file1.ab1 file2.ab1 fileN.ab1 +``` + diff --git a/docs/faq/README.md b/docs/faq/README.md index 36375a2..bbc320d 100644 --- a/docs/faq/README.md +++ b/docs/faq/README.md @@ -13,3 +13,7 @@ Pre-built genome indices for commonly used reference genomes are available for d ## How can I use forward and reverse trace files to confirm variants? Please see [this section](/cli/#using-forward-and-reverse-ab1-files-to-improve-variant-calling) in the documentation of tracy. + +## I have a feature request, how do I contact the developers? 
+ +For questions, help or feature requests please contact gear_genomics@embl.de diff --git a/docs/webapps/README.md b/docs/webapps/README.md index 23cf8b4..7129f60 100644 --- a/docs/webapps/README.md +++ b/docs/webapps/README.md @@ -1 +1,27 @@ # Web Applications + +Tracy features a range of companion web applications hosted at [https://www.gear-genomics.com](https://www.gear-genomics.com) for browsing trace alignments, inspecting variant calls or patching reference sequences. The web apps consist of [teal](#teal) (trace browser), [sage](#sage) (trace alignment), [indigo](#indigo) (trace decomposition and variant calling), [pearl](#pearl) (patching a reference sequence using trace information), [sabre](#sabre) (multiple trace alignment viewer) and the [Wily-DNA-Editor](#wily-dna-editor) (sequence editor). + +# Teal + +ToDo + +# Sage + +ToDo + +# Indigo + +[Indigo](https://www.gear-genomics.com/indigo) can be used to identify single-nucleotide variants (SNVs) and short insertions or deletions (InDels) in a Sanger Chromatogram trace. The application also supports deconvolution of heterozygous mutations: SNVs cause simple double peaks but heterozygous InDels cause a shift in the trace signal and Indigo can be used to separate the two overlapping alleles. The input screen of Indigo requires a chromatogram trace file in scf, abi, ab1 or ab format. Optionally, a left and right trimming size for this trace can be specified. We recommend using [teal](https://www.gear-genomics.com/teal) for estimating such trim sizes. Indigo also requires a reference sequence as input to identify variants. This can be either a wildtype chromatogram, a small sequence in FASTA format or a large indexed reference genome. Once these input requirements have been specified the launch analysis button kicks off tracy and the results are visualized in a separate browser tab. At the top, Indigo shows the actual trace signal. 
Below the trace viewer is an alignment of both deconvoluted alleles with respect to the reference and an alignment of both alleles against each other. Following the alignment, Indigo lists all identified variants including the rs identifier for known polymorphisms, a calling quality, estimated genotype and the basecalling position in the original trace. Please note that all variants are connected via hyperlinks to the original trace for easier browsing. At the very bottom, Indigo shows the decomposition plot. In case of heterozygous InDels you should observe two distinct minima in this plot. For instance, the provided example trace file contains a heterozygous 7bp deletion and thus Indigo shows minima at 0 and -7bp in the decomposition plot. All plots of Indigo can be saved in png format, zoomed and panned using the [plotly](http://help.plot.ly) buttons at the top of each chart. + +# Pearl + +ToDo + +# Sabre + +ToDo + +# Wily-DNA-Editor + +ToDo