Commit afc287a

Finish Release-2.11
Updated documentation. Removed dependency on Sphinx extensions in order to keep both readthedocs and homebrew happy. Fixed representation of labels and plotting coordinates based on whether the matrix should be interpreted as transposed or not. Added shebang to Python library files so that homebrew is happy to include them in the bin directory.
2 parents f6f170a + 2c39b52 commit afc287a

36 files changed, +266 -294 lines changed

README.md

+8 -1
@@ -104,6 +104,13 @@ Both commands do the same thing.
 
 GNU GPL V3. See COPYING file for more details.
 
+##Cite:
+
+The KAT paper is currently in submission. In the meantime, if you use our software
+and wish to cite us please use our bioRxiv preprint:
+
+Daniel Mapleson et al. 2016. KAT: A K-mer Analysis Toolkit to quality control NGS datasets and genome assemblies. bioRxiv doi: 10.1101/064733
+
 
 ##Authors:
 
@@ -118,7 +125,7 @@ See AUTHORS file for more details.
 
 ##Acknowledgements:
 
-* Affiliation: The Genome Analysis Centre (TGAC)
+* Affiliation: Earlham Institute (EI)
 * Funding: The Biotechnology and Biological Sciences Research Council (BBSRC)
 
 We would also like to thank the authors of Jellyfish: https://github.com/gmarcais/Jellyfish;

configure.ac

+1 -1
@@ -4,7 +4,7 @@
 
 # Autoconf initialistion. Sets package name version and contact details
 AC_PREREQ([2.68])
-AC_INIT([kat],[2.1.0],[https://github.com/TGAC/KAT/issues],[kat],[https://github.com/TGAC/KAT])
+AC_INIT([kat],[2.1.1],[https://github.com/TGAC/KAT/issues],[kat],[https://github.com/TGAC/KAT])
 AC_CONFIG_SRCDIR([src/kat.cc])
 AC_CONFIG_AUX_DIR([build-aux])
 AC_CONFIG_MACRO_DIR([m4])

doc/source/conf.py

+3 -5
@@ -28,9 +28,7 @@
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = [
-'sphinx.ext.pngmath',
-'sphinx.ext.mathjax',
+extensions = [
 ]
 
 # Add any paths that contain templates here, relative to this directory.
@@ -54,9 +52,9 @@
 # built documents.
 #
 # The short X.Y version.
-version = '2.1.0'
+version = '2.1.1'
 # The full version, including alpha/beta/rc tags.
-release = '2.1.0'
+release = '2.1.1'
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

doc/source/conf.py.in

+1 -3
@@ -28,9 +28,7 @@ needs_sphinx = '1.3'
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = [
-'sphinx.ext.pngmath',
-'sphinx.ext.mathjax',
+extensions = [
 ]
 
 # Add any paths that contain templates here, relative to this directory.

doc/source/images/contaminant_MP.png (344 KB)
doc/source/images/contaminant_PE.png (273 KB)
doc/source/images/contaminant_all.png (570 KB)
(file name not captured) (359 KB)
(file name not captured) (380 KB)
doc/source/images/gc_bias_a.png (327 KB)
doc/source/images/gc_bias_b.png (348 KB)
doc/source/images/gc_bias_c.png (357 KB)
doc/source/images/gc_bias_d.png (420 KB)
(file name not captured) (149 KB)
doc/source/images/pe_v_asm_clean.png (137 KB)
doc/source/images/pe_v_asm_wrong.png (147 KB)
(file name not captured) (420 KB)
(file name not captured) (287 KB)
(file name not captured) (500 KB)
(file name not captured) (301 KB)
(file name not captured) (342 KB)
(file name not captured) (257 KB)
(file name not captured) (432 KB)
(file name not captured) (294 KB)
(file name not captured) (268 KB)
doc/source/images/real_r1_v_r2.png (340 KB)
(file name not captured) (293 KB)
(file name not captured) (322 KB)

doc/source/index.rst

+5 -3
@@ -43,8 +43,10 @@ selected and the size of your datasets.
 Citing
 ======
 
-We are currently planning to submit multiple publications around different aspects of KAT.
-In the meantime if you use KAT in your work, please reference our github page `https://github.com/TGAC/KAT <https://github.com/TGAC/KAT>`_
+The KAT paper is currently in submission. In the meantime, if you use our software
+and wish to cite us please use our bioRxiv preprint:
+
+Daniel Mapleson et al. 2016. KAT: A K-mer Analysis Toolkit to quality control NGS datasets and genome assemblies. bioRxiv doi: 10.1101/064733
 
 
 
@@ -54,7 +56,7 @@ Issues
 ======
 
 Should you discover any issues with spectre, or wish to request a new feature please raise a `ticket here <https://github.com/TGAC/KAT/issues>`_.
-Alternatively, contact Daniel Mapleson at: daniel.mapleson@tgac.ac.uk
+Alternatively, contact Daniel Mapleson at: daniel.mapleson@earlham.ac.uk
 
 
 .. _availability:

doc/source/kmer.rst

+14 -13
@@ -3,25 +3,26 @@
 K-mer spectra
 =============
 
-K-mer spectra is a representation of a dataset showing how many short
-fixed length words (y-axis) appear a certain number of times (x-axis). The k-mer
-spectra is composed of distributions representing groups of motifs at different
-frequencies in the sample, plus biases. Given not too many biases, this becomes
-shape of the distributions provides a useful set of properties describing the
-biological sample and the sequencing process and the amount of useful data in the
-dataset.
+A K-mer spectra is a graphical representation of a dataset showing how many short
+fixed length words (k-mers) appear a certain number of times. The frequency of
+occurance is plotted on the x-axis and the number of k-mers on the y-axis. The
+k-mer spectra is composed of distributions representing groups of motifs at different
+frequencies in the sample, plus biases. Given not too many biases, the shape of the
+distributions provides a useful set of properties describing the biological sample,
+the sequencing process and the amount of useful data in the dataset.
 
-A typical nice 31-mer spectrum of S.cerevisae S288C WGS dataset is shown in the
+A typical 31-mer spectrum of S.cerevisae S288C WGS dataset is shown in the
 following figure:
 
 .. image:: images/kmer_spectra1.png
 :scale: 75%
 
 
-This is composed of an error component containing a huge amount of
-rare motifs, and a several other components as distributions with different modes
-according to how many times a motif appear on the genome. The decomposition
-showing this distributions can be seen here:
+This is composed of an error component containing a huge amount of rare motifs at
+frequency < 7 arising from errors in the sequencing process, and a several other
+components as distributions with different modes according to how many times a motif
+appears on the genome (once, twice, three times etc.). The following plot shows the
+decomposition of this distribution into it's component distributions:
 
 .. image:: images/kmer_spectra_breakdown.png
-:scale: 50%
+:scale: 50%
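
Note on the kmer.rst text above: the spectrum it describes (frequency of occurrence on the x-axis, number of distinct k-mers on the y-axis) can be illustrated with a small Python sketch. This is not KAT's implementation. The function names, the toy reads and the cutoff value are hypothetical, and as a simplifying assumption the sketch does not merge reverse-complement k-mers, which real k-mer counters may do.

```python
from collections import Counter


def kmer_spectrum(sequences, k):
    """Return {frequency: number of distinct k-mers seen that many times}."""
    kmer_counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            kmer_counts[seq[i:i + k]] += 1
    # Invert the table: how many distinct k-mers share each frequency.
    return Counter(kmer_counts.values())


def split_at_cutoff(spectrum, cutoff):
    """Count distinct k-mers below the cutoff and at or above it."""
    low = sum(n for freq, n in spectrum.items() if freq < cutoff)
    high = sum(n for freq, n in spectrum.items() if freq >= cutoff)
    return low, high


# Toy input (hypothetical); real datasets use much longer k, e.g. 31.
reads = ["ACGTACGT", "ACGTACGA", "TTTTACGT"]
spectrum = kmer_spectrum(reads, k=4)
for freq in sorted(spectrum):
    print(freq, spectrum[freq])

# Illustrative cutoff only; the text reports frequency < 7 for its dataset.
low, high = split_at_cutoff(spectrum, cutoff=2)
```

On a real dataset, the low-frequency group would roughly correspond to the error component that the text places at frequency < 7, though the right cutoff depends on coverage and has to be read off the spectrum itself.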
