- A bug in storing indices for `Transformation` is fixed.
- A bug in input checking for `Transformation` is fixed.
- Selection strings now support interpolation.
- Selection strings are made much faster.
- Change compatibility bounds for new MetaGraphs.jl release.
- `Base.copy` is defined for structural elements and carries out a recursive copy of the element.
- The fields of `Transformation` are made concrete, which may improve performance.
- PDB download functions now use HTTPS rather than FTP as the PDB will deprecate the FTP protocol.
- `retrievepdb` now takes the `format` keyword argument and uses `MMCIFFormat` by default.
- MMTF files are no longer available to download via `downloadpdb`, `downloadentirepdb`, `updatelocalpdb` and `downloadallobsoletepdb` as the RCSB PDB no longer provides them.
The package is made considerably more lightweight by moving a number of dependencies to extensions. This should make it easier for other packages to build on top of BioStructures.jl. Some types and functions are also renamed to avoid clashes, and a convenient string selection syntax is introduced.
- `PDB`, `PDBXML`, `MMCIF` and `MMTF` are renamed to `PDBFormat`, `PDBXMLFormat`, `MMCIFFormat` and `MMTFFormat` respectively to avoid clashing with module names. `read(fp, PDB)` should be replaced with `read(fp, PDBFormat)`, for example.
- `ProteinStructure` is renamed to `MolecularStructure` since it is not limited to representing protein structures.
- `x`, `y`, `z`, `x!`, `y!` and `z!` are no longer exported as they are common variable names. They are still available as `BioStructures.x` etc.
- Importing BioSequences.jl is now required to use `LongAA`.
- Importing BioSequences.jl and BioAlignments.jl is now required to use `pairalign`, `superimpose!`, `rmsd`/`displacements` with the `superimpose` option or `Transformation` on structural elements.
- Importing MMTF.jl is now required to use `MMTFDict` or `writemmtf`.
- Importing DSSP_jll.jl is now required to use `rundssp!`, `rundssp` or the `run_dssp` option with `read`/`retrievepdb`.
- Importing STRIDE_jll.jl is now required to use `runstride!`, `runstride` or the `run_stride` option with `read`/`retrievepdb`.
- Support for Julia versions before 1.9 is dropped.
- A string selection syntax is introduced, allowing selections such as `collectatoms(struc, sel"name CA and resnumber <= 5")`.
- The selectors `sidechainselector`, `proteinselector`, `acidicresselector`, `aliphaticresselector`, `aromaticresselector`, `basicresselector`, `chargedresselector`, `neutralresselector`, `hydrophobicresselector`, `polarresselector` and `nonpolarresselector` are added.
- PDB parsing in certain situations is now much faster.
- PrecompileTools.jl is used to reduce the time to first execution of PDB file reading.
- On Julia 1.9 and later the `DataFrame` and `MetaGraph` constructors are moved to package extensions in order to reduce the number of dependencies. Calling `using DataFrames` and `using Graphs, MetaGraphs` respectively is now required to access these functions.
- The file formats `PDB`, `PDBXML`, `MMCIF` and `MMTF` are no longer subtypes of `BioCore.IO.FileFormat`, allowing BioCore.jl to be removed as a dependency.
- DSSP and STRIDE can now be run to assign secondary structure to proteins.
- The required versions of BioSequences.jl and BioAlignments.jl are updated to v3 of each, with support for earlier versions being dropped. `LongAminoAcidSeq` is hence renamed to `LongAA`, an alias for `LongSequence{AminoAcidAlphabet}`.
- Fix bug in `pdbentrylist`.
- Fix bug allowing reflections during structural superimposition.
- `firstindex` and `lastindex` are defined for structural elements, contact maps and distance maps. This allows `begin` and `end` to be used in indexing expressions.
- Support for Julia versions before 1.6 is dropped.
- The `chainid!` function is added, allowing the chain ID of a chain or residue to be changed. The new `PDBConsistencyError` is thrown when this would give an inconsistent structural state.
- "WAT" is added to `waterresnames` and is hence used in `waterselector` and `notwaterselector`.
- Switch from using LightGraphs.jl to using Graphs.jl.
- The ordering when sorting residues in a chain is changed from standard/hetero residue then residue number then insertion code to residue number then insertion code then standard/hetero residue. This makes in-chain hetero residues appear in the correct place in written PDB files.
- Support for Julia versions before 1.3 is dropped.
- Fix bug in expanding disordered residues before applying residue selectors.
- Change compatibility bounds for new DataFrames.jl release.
- Fix bug in expanding disordered atoms before applying atom selectors.
- Change compatibility bounds for new DataFrames.jl release.
- Change compatibility bounds for new Format.jl release.
- Some mmCIF files, such as the chemical component dictionary from the PDB, contain multiple data blocks. These can now be read in to a `Dict{String, MMCIFDict}` with `readmultimmcif` and written out with `writemultimmcif`.
- Tab completion and an improved REPL display are added for `MMCIFDict` and `MMTFDict`.
- A `ProteinStructure` can now be obtained from a `MMCIFDict` or `MMTFDict` by passing them to the `ProteinStructure` constructor. This saves having to read the file twice when both the dictionary and the structure object are required.
- Add `get` method for `MMTFDict`.
- Gzip support is added for reading and writing mmCIF files via the `gzip` keyword argument.
- Add `get` method for `MMCIFDict`.
- Fix bug in reading mmCIF data values containing a comment character.
- The required versions of BioSequences.jl and BioAlignments.jl are updated to v2 of each, with support for earlier versions being dropped. `AminoAcidSequence` is hence renamed to `LongAminoAcidSeq`.
- `threeletter_to_aa`, a lookup table of amino acids, is re-exported from BioSymbols.
- Change compatibility bounds for new DataFrames.jl release.
- Change keyword argument names `pdb_dir` to `dir` and `file_format` to `format` for `downloadpdb`, `downloadentirepdb`, `updatelocalpdb`, `downloadallobsoletepdb` and `retrievepdb`.
- Remove `readpdb`, which has the same functionality as `read`.
- API reference section, more docstrings, links to related software and interactive Bio3DView.jl examples in documentation.
- Change compatibility bounds for new RecipesBase.jl and CodecZlib.jl releases.
- Change compatibility bounds for new RecipesBase.jl release.
- Improvements to performance throughout the package. Some functions are made up to 5 times faster.
- Fix documentation build.
- A reader and writer are added for the MMTF file format, building on top of MMTF.jl. The interface is the same as for PDB and mmCIF files, with files either being read into the standard hierarchical structure or a `MMTFDict`. Gzipped files are supported. PDB, mmCIF and MMTF files can be interconverted.
- The `expand_disordered` flag is added to `collectatoms`, `collectresidues`, `countatoms`, `countresidues`, `coordarray`, `writepdb`, `writemmcif`, `writemmtf` and `DataFrame`. It determines whether disordered atoms and residues are expanded to include all entries. By default it is `false` except for the output functions, i.e. the last four above, where it is `true` by default.
- The `pdbextension` dictionary is changed to remove leading dots in the values.
- Improved file writing of empty elements.
- Examples are split off into a separate section in the documentation.
- A benchmark suite is added to track performance.
- Superimposition of structural elements is supported using the Kabsch algorithm. New functions are `superimpose!`, `Transformation`, `applytransform!` and `applytransform`.
- `rmsd` and `displacements` carry out superimposition by default, with the relevant keyword arguments available. Setting `superimpose` to `false` prevents this. `rmsdatoms` and `dispatoms` respectively determine which atoms to calculate the property for.
- The trivial `allselector`, which selects all atoms or residues, is added.
- The backbone oxygen `"O"` is added to `backboneatomnames`.
- Compatible bounds of package dependencies are added to Project.toml.
- `MetaGraph` from MetaGraphs.jl is extended to create graphs of contacting elements in a molecular structure, giving access to all the graph analysis tools in LightGraphs.jl.
- `DataFrame` from DataFrames.jl is extended to allow creation of data frames from lists of atoms or residues.
- `pairalign` from BioAlignments.jl is extended to produce pairwise alignments from structural elements.
- `AminoAcidSequence` now takes any element type and has the `gaps` keyword argument.
- Documentation example of interoperability with NearestNeighbors.jl.
- Parametric types used more extensively internally.
- `collectatoms`, `collectresidues`, `collectchains` and `collectmodels` no longer run `sort` before returning the final list. The user can run an explicit `sort` themselves if desired. This change makes the functions faster and allows preservation of the element order.
- Speed up residue iteration.
- Documentation improvements.
- Fix `MMCIFDict` to always contain a `Dict{String, Vector{String}}` rather than a `Dict{String, Union{String, Vector{String}}}`, which includes making the `"data_"` tag a `Vector{String}`.
- More functions documented and documentation bugfixes.
- The mmCIF reader now returns `Array{String,1}` for dictionary values even when there is only a single component. This improves consistency.
- Documentation expanded with references to Bio3DView.jl and an extra example.
- Replace REQUIRE with Project.toml.
- Bugfix when reading truncated MODEL line in a PDB file.
- The `ContactMap` and `DistanceMap` types are introduced along with their supertype `SpatialMap`. `contactmap` is removed. Plot recipes are defined for visualisation of `ContactMap`s and `DistanceMap`s. `showcontactmap` provides a quick way to view a `ContactMap` in the terminal.
- Bug fix on downloading MMTF files.
- Code is now compatible with Julia v0.7 and v1.0. Support for earlier Julia versions is dropped.
- `downloadpdb` can now be given a function as the first argument, in which case the function is run with the downloaded filepath(s) as an argument and the file(s) are deleted afterwards.
- Improved function docstrings.
- A reader and writer are added for the mmCIF format, which has been the standard PDB archive format since 2014. mmCIF files can either be read into a hierarchical structure object or read directly into a dictionary. PDB and mmCIF files can be interconverted.
- `chainid` now returns a `String` instead of a `Char`. This allows multi-character chain IDs. This also changes `chainids`, `chain` and `chains`. Chains can be accessed by string (e.g. `struc["A"]`), but can still be accessed by character for single chain IDs (e.g. `struc['A']`).
- `show` now returns a single line statement for objects across the module, in line with Julia conventions.
Transfer of existing code from Bio.jl. Compatible with Julia v0.6.
Features:
- Hierarchical data structure suitable for macromolecules, particularly proteins.
- Fast reader and writer for the Protein Data Bank (PDB) file format.
- Selection and iteration of structural elements.
- Calculation of spatial properties such as distances and Ramachandran angles.
- Functions to access the PDB.