loevlie/ml-research-template

ml-research-template

A Copier template for ML research projects — PyTorch + Lightning Fabric + Hydra + uv,
with multi-seed significance testing and copier update baked in.

Quick start  •  Why this template  •  Multi-seed demo

Generate a project in ~30 seconds.


uv tool install copier
copier copy --trust gh:loevlie/ml-research-template my-project

That's it. Answer ~10 prompts, get a ready-to-train project with git initialized, deps locked, and pre-commit installed.


Why this template

  • Fabric, not Trainer. The training loop is ~40 explicit lines you can read and edit. Modify it for custom optimizers, multi-network updates, adversarial training, RL, curriculum learning — no callbacks, no hidden machinery.

  • Statistical rigor on day one. scripts/run_seeds.sh launches N seeds; scripts/aggregate_seeds.py computes bootstrap CIs + paired Wilcoxon/t-tests + Cohen's d. Publication-ready out of the box.

  • Runtime shape checking via @jaxtyped(typechecker=beartype) on tensor functions. Catches broadcasting bugs on the first forward pass, not after a day of wasted training.

  • Copier-native, not "Use this template." Generate with prompts instead of renaming 20 files by hand. copier update pulls template improvements into existing projects without clobbering your work — the compounding win across multiple projects.

  • uv + PyTorch CUDA wheels pre-wired. Pick cu118/cu124/cu126/cu128/cpu at template time. uv advertises 10-100× faster installs than pip.

Why you might NOT want this

  • Skip this if you want a Lightning Trainer black box (callbacks, auto-EMA, Strategy-based DeepSpeed/FSDP) — use lightning-hydra-template instead.

  • Skip this if you're doing non-PyTorch work (JAX, TF). The template assumes PyTorch throughout.

  • Skip this if you don't want to learn Hydra or Fabric — expect a learning curve.


Quick start

# 1. Install Copier (one-time)
uv tool install copier

# 2. Generate a new project
copier copy --trust gh:loevlie/ml-research-template my-project

# 3. Train
cd my-project
uv run python src/<your_package_name>/train.py

# 4. Multi-seed run with significance tests
bash scripts/run_seeds.sh experiment=example seeds="42,123,456,789,1337"
uv run python scripts/aggregate_seeds.py outputs/multi_seed_*

--trust lets the template run git init, uv lock, uv sync, and pre-commit install after generation. Drop the flag if you want to run those yourself.
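As a rough picture of what the multi-seed step in the quick start does, the sketch below parses a seeds string like the one in step 4 and expands it into one training command per seed. It is not the contents of scripts/run_seeds.sh; the `seed=` override name and the helpers `parse_seeds` and `build_commands` are assumptions made for illustration.

```python
import shlex


def parse_seeds(spec: str) -> list[int]:
    """Parse a comma-separated seed string such as "42,123,456"."""
    seeds = [int(tok) for tok in spec.split(",") if tok.strip()]
    if len(set(seeds)) != len(seeds):
        raise ValueError(f"duplicate seeds in {spec!r}")
    return seeds


def build_commands(package: str, experiment: str, spec: str) -> list[str]:
    """Return one shell command per seed (hypothetical `seed=` override)."""
    commands = []
    for seed in parse_seeds(spec):
        argv = [
            "uv", "run", "python", f"src/{package}/train.py",
            f"experiment={experiment}",
            f"seed={seed}",  # assumed override name, not confirmed by the template
        ]
        commands.append(shlex.join(argv))
    return commands


for command in build_commands("my_project", "example", "42,123,456,789,1337"):
    print(command)
```

Rejecting duplicate seeds up front matters for the paired comparison later: each seed should identify exactly one run per experiment.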

What you get

my-project/
├── configs/                   # Composable Hydra YAMLs (data/, model/, trainer/, logger/, experiment/)
├── src/my_project/
│   ├── train.py               # Explicit Fabric training loop — ~200 lines, all visible
│   ├── eval.py
│   ├── models/module.py       # @jaxtyped(typechecker=beartype) shape-checked forward
│   ├── data/datamodule.py     # Reproducible splits, seeded workers
│   └── utils/{seed,stats}.py
├── scripts/
│   ├── run_seeds.sh           # Multi-seed launcher
│   └── aggregate_seeds.py     # Bootstrap CIs + paired significance tests
├── tests/                     # Smoke tests (overfit batch, init loss, shapes)
├── demo/app.py                # Gradio → HF Spaces (optional)
├── docs/                      # MkDocs autogen from docstrings (optional)
├── .copier-answers.yml        # Enables `copier update`
└── pyproject.toml             # uv-managed deps, CUDA index routing
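The tree above lists bootstrap CIs as one job of scripts/aggregate_seeds.py. A minimal sketch of a percentile bootstrap for the mean of per-seed metrics follows, using only the standard library. The function name `bootstrap_ci`, its defaults, and the accuracy values are hypothetical; the template's script may use a different bootstrap variant.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    # Resample with replacement, record the mean of each resample, then sort.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical per-seed validation accuracies, made up for illustration.
val_acc = [0.921, 0.918, 0.930, 0.925, 0.923]
low, high = bootstrap_ci(val_acc)
print(f"mean={statistics.fmean(val_acc):.4f}  95% CI=[{low:.4f}, {high:.4f}]")
```

With only a handful of seeds the interval is coarse, because every resample is drawn from the same few values.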

Multi-seed + significance testing

# Train 5 seeds of your method
bash scripts/run_seeds.sh experiment=ours seeds="42,123,456,789,1337"

# Train 5 seeds of a baseline with the same seeds (for paired comparison)
bash scripts/run_seeds.sh experiment=baseline seeds="42,123,456,789,1337"

# Aggregate + Wilcoxon signed-rank
uv run python scripts/aggregate_seeds.py outputs/multi_seed_ours_* \
  --baseline outputs/multi_seed_baseline_* --metric val/acc

Output:

Metric: val/acc
Mean:   0.9234 ± 0.0045
95% CI: [0.9145, 0.9310]

--- Paired Comparison ---
Ours:     0.9234 ± 0.0045
Baseline: 0.8912 ± 0.0051
Delta:    +0.0322 **
Test:     Wilcoxon signed-rank
Stat:     0.0000, p=0.0079
Effect:   Cohen's d = 6.712
Significant at p<0.01
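To make the paired comparison concrete, here is a small standard-library sketch of an exact two-sided Wilcoxon signed-rank test and a paired effect size. It is an illustration under stated assumptions, not the code in scripts/aggregate_seeds.py: the function names and the accuracy values are hypothetical, ties and zero differences are not handled, and the effect size shown is the paired-differences form of Cohen's d (mean difference over the standard deviation of the differences), which may differ from the convention the script uses.

```python
import itertools
import statistics


def wilcoxon_signed_rank_exact(ours, baseline):
    """Exact two-sided Wilcoxon signed-rank test for a small number of pairs.

    Assumes no zero differences and no ties among the absolute differences.
    """
    diffs = [a - b for a, b in zip(ours, baseline)]
    n = len(diffs)
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    stat = min(w_plus, w_minus)
    # Under the null hypothesis each rank is positive or negative with equal
    # probability, so enumerate all 2**n sign patterns.
    at_most = sum(
        1
        for signs in itertools.product((0, 1), repeat=n)
        if sum(r for r, s in zip(range(1, n + 1), signs) if s) <= stat
    )
    p_value = min(1.0, 2 * at_most / 2**n)
    return stat, p_value


def cohens_d_paired(ours, baseline):
    """Mean of the paired differences divided by their sample std dev."""
    diffs = [a - b for a, b in zip(ours, baseline)]
    return statistics.fmean(diffs) / statistics.stdev(diffs)


# Hypothetical per-seed accuracies (same seeds for both methods).
ours = [0.921, 0.918, 0.930, 0.925, 0.923]
baseline = [0.890, 0.884, 0.897, 0.893, 0.888]
stat, p_value = wilcoxon_signed_rank_exact(ours, baseline)
print(f"W={stat}, p={p_value:.4f}, d={cohens_d_paired(ours, baseline):.2f}")
```

In this example every difference favors the first method, so the statistic is 0 and the exact two-sided p-value is 2/2^5 = 0.0625. That is the smallest value an exact signed-rank test can return with five pairs; reaching p < 0.01 with this test takes at least eight paired seeds (2/2^8 ≈ 0.0078).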

Updating an existing project

cd path/to/existing-project
copier update --trust

Pulls template improvements (new CI rules, pre-commit updates, config defaults) into your project. Files listed in _skip_if_exists (your model, data, README, experiment configs) are preserved. Other conflicts show up as .rej files or inline markers — three-way merge, not clobber.

Projects using this template

Submit a PR to README.md to add yours.

Prompts reference

The 10 prompts you'll answer
Prompt                Default   Notes
project_name          (none)    Human-readable, e.g. "Retinal OCT Classifier"
package_name          derived   Import name (retinal_oct_classifier). Validated against ^[a-z][a-z0-9_]*$
project_description   generic   Used in pyproject.toml and README
author_name           (none)    LICENSE + pyproject authors
author_email          (none)    pyproject authors
python_version        3.11      Pins .python-version, requires-python, ruff/mypy target
cuda_version          cu124     One of cpu, cu118, cu124, cu126, cu128; affects [tool.uv.sources]
logger                wandb     Default tracker (wandb, aim, tensorboard, csv). Switch at runtime via logger=aim
include_example       true      Ship the MLP reference (demo/, docs/, project_page/, configs/experiment/example.yaml, tests/test_model.py, mkdocs.yml)
include_dennys_rules  false     Include Dennis Loevlie's research operating manual (DENNYS_RULES.md)
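The package_name prompt is derived from project_name and checked against the pattern ^[a-z][a-z0-9_]*$. The sketch below shows one plausible derivation plus the validation step. The derivation rule (lowercase, then collapse runs of other characters into underscores) is an assumption chosen to reproduce the "Retinal OCT Classifier" example; the template's actual rule may differ, and both function names are hypothetical.

```python
import re

# Pattern quoted in the prompts table.
PACKAGE_NAME_RE = re.compile(r"^[a-z][a-z0-9_]*$")


def derive_package_name(project_name: str) -> str:
    """Assumed derivation: lowercase, non-alphanumeric runs become "_"."""
    return re.sub(r"[^a-z0-9]+", "_", project_name.lower()).strip("_")


def is_valid_package_name(name: str) -> bool:
    """True if `name` is a lowercase identifier starting with a letter."""
    return PACKAGE_NAME_RE.fullmatch(name) is not None


name = derive_package_name("Retinal OCT Classifier")
print(name, is_valid_package_name(name))
print("3d_unet", is_valid_package_name("3d_unet"))
```

A name that starts with a digit, such as 3d_unet, fails the check because the pattern requires a leading lowercase letter.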

Support

If this saved you setup time, star the repo — it's the main way others discover it.

Issues, PRs, and copier update conflict reports welcome.

License

MIT

About

Opinionated PyTorch template for reproducible ML research. Hydra configs, W&B tracking, multi-seed significance testing, shape-checked code, auto-docs, and HF Spaces demo - ready for NeurIPS/ICML/ICLR.
