Home
Foldseek runs on modern UNIX operating systems and is tested on Linux and macOS. Additionally, we are providing a preview version for Windows.
Foldseek takes advantage of multi-core systems through OpenMP and uses the SIMD capabilities of the system. Optimal performance requires a system supporting the AVX2 instruction set; however, SSE4.1 and very old systems with only SSE2 are also supported. Foldseek also supports the PPC64LE and ARM64 processor architectures, which require support for the AltiVec or NEON SIMD instruction sets, respectively.
To check whether Foldseek supports your system, run the commands for your operating system. On Linux:
[ $(uname -m) = "x86_64" ] && echo "64bit: Yes" || echo "64bit: No"
grep -q avx2 /proc/cpuinfo && echo "AVX2: Yes" || echo "AVX2: No"
grep -q sse4_1 /proc/cpuinfo && echo "SSE4.1: Yes" || echo "SSE4.1: No"
# for very old systems which support neither SSE4.1 nor AVX2
grep -q sse2 /proc/cpuinfo && echo "SSE2: Yes" || echo "SSE2: No"
On macOS:
[ $(uname -m) = "x86_64" ] && echo "64bit: Yes" || echo "64bit: No"
sysctl machdep.cpu.leaf7_features | grep -q AVX2 && echo "AVX2: Yes" || echo "AVX2: No"
sysctl machdep.cpu.features | grep -q SSE4.1 && echo "SSE4.1: Yes" || echo "SSE4.1: No"
Foldseek can be installed on Linux or macOS by
(1) downloading a statically compiled version. For a Linux computer that supports AVX2, use:
wget https://mmseqs.com/foldseek/foldseek-linux-avx2.tar.gz
tar xvzf foldseek-linux-avx2.tar.gz
export PATH=$(pwd)/foldseek/bin/:$PATH
Linux with SSE4.1
wget https://mmseqs.com/foldseek/foldseek-linux-sse41.tar.gz
tar xvzf foldseek-linux-sse41.tar.gz
export PATH=$(pwd)/foldseek/bin/:$PATH
macOS build (universal binary with SSE4.1/AVX2/M1 NEON)
wget https://mmseqs.com/foldseek/foldseek-osx-universal.tar.gz
tar xvzf foldseek-osx-universal.tar.gz
export PATH=$(pwd)/foldseek/bin/:$PATH
(2) using bioconda
conda install -c conda-forge -c bioconda foldseek
(3) compiling from source (see below).
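For option (1), the choice of archive can be automated. The following is a minimal sketch, assuming the three archive names listed above are the only static builds of interest; pick_build is a hypothetical helper introduced here for illustration and is not part of Foldseek. It prints "none" when no listed archive fits, in which case bioconda or compiling from source remain as options.

```shell
#!/bin/sh
# Sketch: map OS, architecture and CPU flags to one of the archive names
# listed above. pick_build is a hypothetical helper, not part of Foldseek.
pick_build() {
    os="$1"; arch="$2"; flags="$3"
    case "$os" in
        Darwin)
            # The macOS archive is a universal binary (SSE4.1/AVX2/M1 NEON).
            echo "foldseek-osx-universal.tar.gz" ;;
        Linux)
            if [ "$arch" != "x86_64" ]; then
                echo "none"
                return 0
            fi
            # Pad with spaces so that only whole flag names match.
            case " $flags " in
                *" avx2 "*)   echo "foldseek-linux-avx2.tar.gz" ;;
                *" sse4_1 "*) echo "foldseek-linux-sse41.tar.gz" ;;
                *)            echo "none" ;;
            esac ;;
        *)
            echo "none" ;;
    esac
}

# Report a suggestion for the current machine (nothing is downloaded here).
flags=""
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo)
fi
archive=$(pick_build "$(uname -s)" "$(uname -m)" "$flags")
echo "Suggested archive: $archive"
```

If an archive name is printed, it can be appended to https://mmseqs.com/foldseek/ and fetched with wget as in the commands above.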
Compiling Foldseek from source has the advantage that it will be optimized for the specific system, which should improve its performance. To compile Foldseek, git, g++ (4.9 or higher) and cmake (2.8.12 or higher) are needed. Afterwards, the foldseek binary will be located in build/bin/.
git clone https://github.com/steineggerlab/foldseek.git
cd foldseek
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=. ..
make
make install
export PATH=$(pwd)/bin/:$PATH
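Before starting the build, it can help to confirm that the required tools are present. The sketch below is one way to do so; have_tool and version_ge are hypothetical helper names, and the version comparison assumes a sort that supports the -V option. Only the cmake version is compared, because the version string reported by g++ differs between compilers (on macOS, g++ is typically Clang).

```shell
#!/bin/sh
# Sketch of a pre-build check. have_tool and version_ge are hypothetical helpers.

# Succeeds if the named command is on PATH.
have_tool() { command -v "$1" >/dev/null 2>&1; }

# Succeeds if version $1 >= version $2 (relies on `sort -V`, assumed available).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

for tool in git g++ cmake; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: MISSING"
    fi
done

# `cmake --version` prints "cmake version X.Y.Z" on its first line.
if have_tool cmake; then
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" "2.8.12"; then
        echo "cmake $cmake_version: OK"
    else
        echo "cmake $cmake_version: too old, 2.8.12 or higher is needed"
    fi
fi
```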
See the Customizing compilation through CMake section if you compile Foldseek on a different system than the one where it will eventually run.
To compile Foldseek with (Apple-)Clang you need to install either Xcode or the Command Line Tools.
You also need libomp. We recommend installing it using Homebrew:
brew install cmake libomp zlib bzip2
CMake currently does not correctly identify paths to libomp. Use the script in util/build_osx.sh to compile Foldseek.
The resulting binary will be placed in OUTPUT_DIR/mmseqs.
./util/build_osx.sh PATH_TO_FOLDSEEK_REPO OUTPUT_DIR
To compile with GCC on macOS instead, please install the following packages with Homebrew:
brew install cmake gcc@11 zlib bzip2
Use the following cmake call:
CC="gcc-11" CXX="g++-11" cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=. ..
Most of the MMseqs2 CMake options also apply to Foldseek; refer to MMseqs2's user guide for details.
Install the google-cloud-cpp package from vcpkg:
git clone https://github.com/microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
./vcpkg/vcpkg install google-cloud-cpp
Foldseek can now be compiled with GCS support:
cd path-to-foldseek
mkdir build && cd build
cmake -DHAVE_GCS=1 -DCMAKE_TOOLCHAIN_FILE=[path to vcpkg]/scripts/buildsystems/vcpkg.cmake ..
make -j $(nproc --all)
To give users an easy-to-interpret measure of how significant a hit is, Foldseek estimates for each hit the probability that it is a true positive (i.e. query and target are homologous) from its structural bit score. The probability is the fraction of TP hits (same superfamily) among all TP and FP (not same fold) hits found, on average, at that score. To compute it, we estimate the bit score distributions of TP and FP hits; both distributions were fitted on SCOPe40. For example, Foldseek finds about as many FP as TP hits with a score of 51 in SCOPe40, so the resulting probability for a hit with score 51 is 50%.
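Written as a formula, the description above amounts to the following. The symbols $N_{\mathrm{TP}}(s)$ and $N_{\mathrm{FP}}(s)$, the expected numbers of TP and FP hits at bit score $s$ under the fitted distributions, are notation introduced here for illustration and are not names used by Foldseek.

```latex
P(\mathrm{TP} \mid s) = \frac{N_{\mathrm{TP}}(s)}{N_{\mathrm{TP}}(s) + N_{\mathrm{FP}}(s)},
\qquad
N_{\mathrm{TP}}(51) \approx N_{\mathrm{FP}}(51)
\;\Rightarrow\;
P(\mathrm{TP} \mid 51) \approx \frac{1}{2} = 50\%.
```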