Here are notes on ideas we have...
- SAGA paper: out-of-core variant [Olivier, Fabian, Tom]
- parallel SGD-like/SAGA implementation [Olivier, Fabian, Tom]
- pubsub API
- adaptive hyperparameter search [Jim, Fabian, Olivier]
- stress infrastructure and experiment [Gael, Olivier, Tom]
- Joblib/dask integration: automated data broadcasting/scattering [Olivier, Jim]
- nd transforms [Kira, Juan]
- benchmarks (useful for nd transforms) [Emma]
- numba (precompilation?) [Emma, Juan]
- rewrite Cython
- parallelism
- stencil
- GPU support
- Data types
- automatically daskify (most?) functions? [Emma]
- Big/many images with dask [John]
- scikit-image [Tom]
- scikit-learn
- Matrix balancing on large scale matrices [Nelle]
- Matrix factorization "at scale" [John]
- NMF
- online dictionary learning
- See arthurmensch/modl
Figure out how to give Emma access to 32-core machines [Nelle]