Why this matters (UX)
First-run experience is the hardest thing to get right in any dev tool, and `rapidfireai init` is the very first thing a new user runs. Right now it jumps straight into several minutes of `uv pip install` without validating whether the environment can actually support the install. If anything is misconfigured — wrong Python version, missing driver, missing credentials, not enough disk space — the user doesn't find out until a package deep in the dependency tree fails with a traceback. By then they've already waited, downloaded multiple gigabytes, and are left wondering whether their environment is now in a half-installed state.
A preflight step fixes this the way every good CLI tool does it: fail fast, fail clearly, and tell the user exactly what to do next.
- Time: seconds of checks instead of minutes of downloads before an error surfaces.
- Clarity: one specific, human-readable line per missing prerequisite instead of a Python traceback from deep inside a package build or import.
- Agency: the error explains why the prerequisite is required and how to proceed — install the missing piece, or switch to a mode that doesn't need it.
- Confidence: when preflight passes, the user knows the install will work before it starts. When init finishes, a one-line summary confirms what succeeded and what didn't.
This is the difference between "I tried this tool and it was broken" and "I tried this tool and it told me what I forgot." Only one of those users comes back.
Summary
Add a preflight phase at the top of `rapidfireai init` that validates prerequisites against the requested install mode and exits early with actionable guidance if anything critical is missing. `rapidfireai doctor` already knows how to detect most of this — it just runs after the fact, when the damage is done.
Proposed preflight checks
The check set is mode-aware — not every prerequisite applies to every install path, and preflight should only enforce what the chosen mode actually needs.
Checks to run:
- Python version matches supported range (3.12.x per `pyproject.toml`).
- Virtual environment active (warn, don't fail, if installing into the system interpreter).
- NVIDIA driver present (`nvidia-smi` runnable).
- CUDA version resolvable (detected or passed via `--cudaversion`) and within a supported major range.
- GPU compute capability resolvable (detected or passed via `--computecapabilityversion`).
- Disk space in `RF_HOME` / venv site-packages adequate for the requested mode (evals pulls vllm + torch + flashinfer — multi-GB).
- Network reachability to `download.pytorch.org` and `flashinfer.ai` wheel indexes (quick HEAD check).
- HuggingFace auth present (warn if `HF_TOKEN` / `~/.cache/huggingface/token` missing, since the tutorials need it).
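As a sketch of what the individual check primitives could look like, here are three of the checks above written against a hypothetical `CheckResult` shape. The function names, the dataclass, and the `warn_only` convention are all illustrative, not the actual rapidfireai API:

```python
import shutil
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    ok: bool
    warn_only: bool = False  # warnings don't gate the install
    fix_hint: str = ""


def check_python_version() -> CheckResult:
    """Enforce the 3.12.x range from pyproject.toml."""
    return CheckResult(
        name=f"Python {sys.version.split()[0]}",
        ok=sys.version_info[:2] == (3, 12),
        fix_hint="Install Python 3.12 and recreate the virtual environment.",
    )


def check_nvidia_driver() -> CheckResult:
    """Driver counts as present if nvidia-smi exists and exits cleanly."""
    smi = shutil.which("nvidia-smi")
    ok = False
    if smi:
        proc = subprocess.run([smi], capture_output=True, text=True)
        ok = proc.returncode == 0
    return CheckResult(
        name="NVIDIA driver",
        ok=ok,
        fix_hint="Install the NVIDIA driver for your GPU, or use a CPU-only mode.",
    )


def check_disk_space(path: str, needed_gb: float) -> CheckResult:
    """Warn (don't fail) when free space is below the mode's estimate."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return CheckResult(
        name=f"{free_gb:.1f} GB free in {path}",
        ok=free_gb >= needed_gb,
        warn_only=True,
        fix_hint=f"~{needed_gb:.0f} GB of wheels expected for this mode.",
    )
```

Each check is cheap (milliseconds, no downloads), which is what makes the seconds-not-minutes promise hold.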
Proposed UX
```
$ rapidfireai init --evals

🛫 Preflight checks for mode: evals
──────────────────────────────────
✅ Python 3.12.3
✅ Virtual environment active
✅ NVIDIA driver 580.95.05
✅ CUDA 12.9 detected
✅ Compute capability 8.9
❌ HuggingFace auth not configured
   Tutorial notebooks need access to gated models on HuggingFace Hub.
   Run `hf auth login` before `rapidfireai init`, or set HF_TOKEN
   in your environment.
⚠️  Only 6.2 GB free in ./; evals mode downloads ~12 GB of wheels.

Preflight failed. Fix the errors above and re-run `rapidfireai init --evals`.
```
A `--skip-preflight` escape hatch keeps power users unblocked. A post-init summary line ("init complete for mode: evals — 0 errors, 1 warning") gives users a clear answer to "did it work?" without having to parse pip output.
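The gating and the escape hatch are a small amount of wiring. A minimal sketch with argparse, assuming a `run_preflight` helper that returns error and warning counts (the helper and subcommand layout are hypothetical, not the existing CLI code):

```python
import argparse


def run_preflight(mode: str) -> tuple[int, int]:
    """Placeholder for the check runner: returns (errors, warnings)."""
    return (0, 0)


def cmd_init(args) -> int:
    mode = "evals" if args.evals else "default"
    errors, warnings = (0, 0)
    if not args.skip_preflight:
        errors, warnings = run_preflight(mode)
        if errors:
            print("Preflight failed. Fix the errors above and re-run `rapidfireai init`.")
            return 1
    # ... existing uv pip install flow would run here ...
    print(f"init complete for mode: {mode} — {errors} errors, {warnings} warnings")
    return 0


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="rapidfireai")
    sub = parser.add_subparsers(dest="command", required=True)
    init = sub.add_parser("init")
    init.add_argument("--evals", action="store_true")
    init.add_argument("--skip-preflight", action="store_true",
                      help="Bypass preflight checks (power users).")
    init.set_defaults(func=cmd_init)
    return parser
```

The key property: `--skip-preflight` only skips the gate, not the summary line, so even power users get the one-line "did it work?" answer.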
Why share code with `doctor`
`utils/doctor.py:get_doctor_info` already does driver / CUDA / port detection. Factor the check primitives into a shared module (`utils/preflight.py`) so:
- `rapidfireai init` runs them as a gate (fail fast, mode-aware).
- `rapidfireai doctor` runs them in report mode (never fails, shows everything, as today).
- Both stay in sync when we add or adjust prerequisites.
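One way to get both behaviors from one check list is a small registry with two consumers. This is a design sketch only; the registry, the mode table, and the check names are invented for illustration:

```python
import sys
from typing import Callable

# Each check returns (passed, warn_only, message).
Check = Callable[[], tuple[bool, bool, str]]

REGISTRY: dict[str, Check] = {}

# mode -> which checks that mode actually needs (mode-aware gating)
MODE_CHECKS: dict[str, list[str]] = {
    "default": ["python", "venv", "disk"],
    "evals": ["python", "venv", "driver", "cuda", "disk", "network", "hf_auth"],
}


def register(name: str):
    def wrap(fn: Check) -> Check:
        REGISTRY[name] = fn
        return fn
    return wrap


@register("python")
def _python() -> tuple[bool, bool, str]:
    return (sys.version_info[:2] == (3, 12), False,
            f"Python {sys.version.split()[0]}")


def preflight(mode: str) -> bool:
    """Gate mode (init): run only what this mode needs; any error fails."""
    ok = True
    for name in MODE_CHECKS[mode]:
        if name not in REGISTRY:
            continue  # check not implemented in this sketch
        passed, warn_only, msg = REGISTRY[name]()
        print(("✅" if passed else "⚠️" if warn_only else "❌"), msg)
        ok = ok and (passed or warn_only)
    return ok


def doctor_report() -> None:
    """Report mode (doctor): run everything, never fail."""
    for fn in REGISTRY.values():
        passed, _, msg = fn()
        print(("✅" if passed else "❌"), msg)
```

The registry is the single source of truth; adding a prerequisite means writing one check function and listing it in the modes that need it.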
Out of scope
- Auto-installing missing prerequisites (CUDA Toolkit, drivers). Detect and point to docs only.
- Replacing `doctor`. `doctor` is still the "everything's installed but something's weird" tool; preflight is the "before we install" tool.