linfa (Italian) / sap (English):
The vital circulating fluid of a plant.
`linfa` aims to provide a comprehensive toolkit to build Machine Learning applications with Rust.
Kin in spirit to Python's `scikit-learn`, it focuses on common preprocessing tasks and classical ML algorithms for your everyday ML tasks.
Where does `linfa` stand right now? Are we learning yet?
`linfa` currently provides sub-packages with the following algorithms:
Name | Purpose | Status | Category | Notes |
---|---|---|---|---|
clustering | Data clustering | Tested / Benchmarked | Unsupervised learning | Clustering of unlabeled data; contains K-Means, Gaussian-Mixture-Model, DBSCAN and OPTICS |
kernel | Kernel methods for data transformation | Tested | Pre-processing | Maps feature vector into higher-dimensional space |
linear | Linear regression | Tested | Partial fit | Contains Ordinary Least Squares (OLS), Generalized Linear Models (GLM) |
elasticnet | Elastic Net | Tested | Supervised learning | Linear regression with elastic net constraints |
logistic | Logistic regression | Tested | Partial fit | Builds two-class logistic regression models |
reduction | Dimensionality reduction | Tested | Pre-processing | Diffusion mapping, Principal Component Analysis (PCA), Random projections |
trees | Decision trees | Tested / Benchmarked | Supervised learning | Linear decision trees |
svm | Support Vector Machines | Tested | Supervised learning | Classification or regression analysis of labeled datasets |
hierarchical | Agglomerative hierarchical clustering | Tested | Unsupervised learning | Cluster and build hierarchy of clusters |
bayes | Naive Bayes | Tested | Supervised learning | Contains Gaussian Naive Bayes |
ica | Independent component analysis | Tested | Unsupervised learning | Contains FastICA implementation |
pls | Partial Least Squares | Tested | Supervised learning | Contains PLS estimators for dimensionality reduction and regression |
tsne | Dimensionality reduction | Tested | Unsupervised learning | Contains exact solution and Barnes-Hut approximation t-SNE |
preprocessing | Normalization & Vectorization | Tested / Benchmarked | Pre-processing | Contains data normalization/whitening and count vectorization/tf-idf |
nn | Nearest Neighbours & Distances | Tested / Benchmarked | Pre-processing | Spatial index structures and distance functions |
ftrl | Follow The Regularized Leader - proximal | Tested / Benchmarked | Partial fit | Contains L1 and L2 regularization; incremental updates are possible |
We believe that only a significant community effort can nurture, build, and sustain a machine learning ecosystem in Rust - there is no other way forward.
If this strikes a chord with you, please take a look at the roadmap and get involved!
Some algorithm crates need linear algebra routines. By default, these come from a pure-Rust implementation. However, you can choose an external BLAS/LAPACK backend library instead by enabling the `blas` feature and a feature corresponding to your BLAS backend. Currently you can choose between the following BLAS/LAPACK backends: `openblas`, `netlib` or `intel-mkl`.
Backend | Linux | Windows | macOS |
---|---|---|---|
OpenBLAS | ✔️ | - | - |
Netlib | ✔️ | - | - |
Intel MKL | ✔️ | ✔️ | ✔️ |
Each BLAS backend has two features available, which let you choose between linking the BLAS library already installed on your system and building the library statically. For example, the features for the `intel-mkl` backend are `intel-mkl-static` and `intel-mkl-system`.
An example set of Cargo flags for enabling the Intel MKL backend on an algorithm crate is `--features blas,linfa/intel-mkl-system`. Note that the BLAS backend features are defined on the `linfa` crate and should only be specified for the final executable.
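
For a downstream project, the same selection might be written in `Cargo.toml`. The following is a minimal sketch, not a definitive setup: it assumes an algorithm crate that exposes the `blas` feature (`linfa-pls` is used here purely as an illustration), and the version numbers are placeholders to be replaced with the release you actually use.

```toml
[dependencies]
# The backend choice lives on the `linfa` crate (here: link the system Intel MKL).
linfa = { version = "0.7", features = ["intel-mkl-system"] }
# Assumed example of an algorithm crate with a `blas` feature; versions are placeholders.
linfa-pls = { version = "0.7", features = ["blas"] }
```

Keeping the backend feature in the manifest of the final executable, rather than in a library crate, follows the note above that backend features should only be specified for the final executable.
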
Dual-licensed to be compatible with the Rust project.
Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 or the MIT license http://opensource.org/licenses/MIT, at your option. This file may not be copied, modified, or distributed except according to those terms.