diff --git a/docs/docs/examples/_category_.json b/docs/docs/examples/_category_.json index 7170e1e..9d54848 100644 --- a/docs/docs/examples/_category_.json +++ b/docs/docs/examples/_category_.json @@ -1,6 +1,7 @@ { "label": "Examples", "position": 4, + "collapsed": false, "link": { "type": "generated-index" } diff --git a/docs/docs/examples/firedrake/cavity.md b/docs/docs/examples/firedrake/cavity.md new file mode 100644 index 0000000..78e8b67 --- /dev/null +++ b/docs/docs/examples/firedrake/cavity.md @@ -0,0 +1,91 @@ +--- +sidebar_position: 2 +--- + +# Cavity + +⚠️ **NOTE**: These are **advanced workflow examples** showing direct solver access, stability analysis, and specialized workflows. They do NOT use the standard RL interface. + +**Looking for standard RL examples?** See [Getting Started](./getting_started) for `env.reset()` / `env.step()` interface. + +--- + +The open cavity is a classic CFD benchmark problem demonstrating recirculating flows and shear-layer instability. + +## Physical Description + +**Configuration:** +- Open square cavity (1×1) with moving top wall +- Inlet velocity: U = 1.0 +- All other walls: no-slip (U = 0) +- Reynolds number Re = 7500 + +## Quick Start + +### 1. Flow Simulation + +Run open cavity at Re=7500: + +```bash +python run-transient.py +``` + +**What it does:** Simulates turbulent cavity flow from perturbed base state +**Outputs:** + +- `output/stats.dat` - Time series of CFL, KE, TKE +- Console shows evolution of kinetic energy and turbulent kinetic energy + +**Prerequisites:** Requires steady state checkpoint from solve-steady.py + +### 2. 
Find Steady State + +Solve for steady flow using Newton iteration with Reynolds ramping: + +```bash +python solve-steady.py +``` + +**What it does:** Computes steady base flow for high-Re cavity +**Uses ramping:** 500 → 1000 → 2000 → 4000 → 7500 for convergence +**Outputs:** +- `output/7500_steady.h5` - Steady flow checkpoint for restart +- `output/7500_steady.pvd` - Paraview visualization +**Prerequisites:** None + +### 3. Stability Analysis + +Compute eigenvalues of steady flow: + +```bash +python stability.py --Re 7500 --num-eigs 10 +``` + +**What it does:** Linear stability analysis using Arnoldi iteration +**Purpose:** Identify unstable modes leading to shear-layer instability +**Outputs:** Eigenvalues, eigenvectors, growth rates +**Prerequisites:** Optional (can compute steady state internally) + +### 4. Complete Workflow + +Two-stage simulation: steady solve + perturbed transient: + +```bash +python unsteady.py +``` + +**What it does:** Demonstrates transition from steady to unstable flow +**Stage 1:** Solve steady state with Reynolds ramping (500 → ... → 7500) +**Stage 2:** Add perturbation and run long transient (Tf=500) +**Outputs:** Time series, Paraview animations, TKE evolution +**Prerequisites:** None (computes steady state internally) + +--- + +**MPI Parallelization:** +All scripts support parallel execution: +```bash +mpirun -np 4 python <script>.py +``` + + diff --git a/docs/docs/examples/firedrake/cylinder.md b/docs/docs/examples/firedrake/cylinder.md new file mode 100644 index 0000000..4989bdb --- /dev/null +++ b/docs/docs/examples/firedrake/cylinder.md @@ -0,0 +1,117 @@ +--- +sidebar_position: 3 +--- + +# Cylinder + +⚠️ **NOTE**: These are **advanced workflow examples** showing direct solver access, stability analysis, and specialized control. They do NOT use the standard RL interface. + +**Looking for standard RL examples?** See [Getting Started](./getting_started) for `env.reset()` / `env.step()` interface.
+ +--- + +Flow around a circular cylinder is a canonical benchmark in fluid mechanics and flow control. + +## Physical Description + +**Configuration:** +- Circular cylinder (radius = 0.5) in 2D +- Uniform inflow from left (U∞ = 1.0) +- Reynolds number Re = 100 (default) + +**Actuation:** +- **Jet blowing/suction (Cylinder class):** Two 10° jets at ±90° from stagnation point + - Used in: `solve-steady.py`, `unsteady.py`, `step_input.py`, `pressure-probes.py` +- **Rotary control (RotaryCylinder class):** Tangential velocity on cylinder surface + - Used in: `run-transient.py`, `pd-control.py`, `pd-phase-sweep.py`, `lti_system.py` + +**Note:** Both actuation types can suppress vortex shedding, but use different physical mechanisms. + + +## Quick Start + +### 1. Basic Vortex Shedding + +Run uncontrolled flow at Re=100 to observe natural vortex shedding: + +```bash +python run-transient.py +``` + +**What it does:** Simulates uncontrolled cylinder flow showing oscillating lift/drag from vortex shedding +**Outputs:** Checkpoints for restart, console shows CL/CD time series +**Prerequisites:** None + +### 2. Find Steady State + +Solve for the unstable steady state at Re=100 using Newton iteration: + +```bash +python solve-steady.py +``` + +**What it does:** Computes unstable equilibrium (saddle point) for stability analysis +**Outputs:** `output/cylinder_Re100_steady.h5`, Paraview files +**Prerequisites:** None + +### 3. Observe Instability Growth + +Two-stage simulation: steady solve + perturbed transient: + +```bash +python unsteady.py +``` + +**What it does:** Demonstrates transition from steady state to limit cycle (vortex shedding) +**Outputs:** Time series, Paraview animations +**Prerequisites:** None (computes steady state internally) + +### 4. 
Stability Analysis + +Compute eigenvalues/eigenmodes using Arnoldi iteration: + +```bash +python stability.py +``` + +**What it does:** Linear stability analysis - finds growth rates and frequencies +**Outputs:** Eigenvalues (growth rate, frequency), eigenvectors +**Prerequisites:** Optional checkpoint from solve-steady.py (or computes internally) + +### 5. Apply PD Control + +Suppress vortex shedding using feedback control: + +```bash +python pd-control.py +``` + +**What it does:** Demonstrates feedback control (off→on) to stabilize unstable flow +**Outputs:** Time series showing oscillation suppression +**Prerequisites:** **REQUIRED** - Must run `run-transient.py` first for checkpoint + +--- + +**MPI Parallelization:** +All scripts support parallel execution: +```bash +mpirun -np 4 python <script>.py +``` + +--- + +## Complete Script Reference + +| Script | Purpose | Key Features | Prerequisites | +|--------|---------|--------------|---------------| +| **solve-steady.py** | Compute steady-state flow | Newton solver, Reynolds ramping | None | +| **unsteady.py** | Steady→unsteady transition | Two-stage: Newton + transient | None | +| **run-transient.py** | Basic time integration | Simple vortex shedding demo | None | +| **stability.py** | Linear stability analysis | Eigenvalues, eigenmodes (direct/adjoint) | Optional steady checkpoint | +| **step_input.py** | System identification | Step response for control design | None | +| **pressure-probes.py** | Point measurements | Demonstrates sparse sensing | None | +| **pd-control.py** | Feedback control | PD controller with on/off phases | **Requires** run-transient.py checkpoint | +| **pd-phase-sweep.py** | Controller tuning | Sweeps phase angles for optimal gain | **Requires** run-transient.py checkpoint | +| **lti_system.py** | Model linearization | Extracts base flow + control influence | None | + + diff --git a/docs/docs/examples/firedrake/getting_started.md b/docs/docs/examples/firedrake/getting_started.md new file mode
100644 index 0000000..a608c20 --- /dev/null +++ b/docs/docs/examples/firedrake/getting_started.md @@ -0,0 +1,409 @@ +--- +sidebar_position: 1 +--- + +# Getting Started + +**START HERE** for standard RL interface examples using `env.reset()` and `env.step()`. + +This directory contains comprehensive configuration examples and testing utilities for HydroGym's Firedrake-based flow environments using the **standard RL interface**. + +> **Looking for advanced workflows?** (steady solvers, stability analysis, direct control) +> See [`examples/firedrake/advanced`](https://github.com/dynamicslab/hydrogym/tree/main/examples/firedrake/advanced) for specialized research and development examples. + +## Files + +### [`config_reference.py`](https://github.com/dynamicslab/hydrogym/blob/main/examples/firedrake/getting_started/config_reference.py) +**Comprehensive configuration examples** - Copy-pasteable configurations for all use cases. + +Run to see all examples: +```bash +python config_reference.py +``` + +Contains 10 detailed examples: +1. **Minimal Configuration** - Simplest setup using defaults +2. **Cylinder with Velocity Probes** - Probe-based observations +3. **Rotary Cylinder** - Rotation actuation +4. **Cavity with Multi-Substep** - Multi-substep simulation with callbacks +5. **Pinball with Multiple Checkpoints** - Curriculum learning +6. **Step with Noise Forcing** - Random forcing for exploration +7. **Cylinder with Restart** - Load from checkpoint +8. **Advanced Multi-Substep** - All aggregation strategies +9. **All Observation Types** - Comparing observation modes +10. **Production RL Setup** - Recommended training configuration + +### [`test_firedrake_env.py`](https://github.com/dynamicslab/hydrogym/blob/main/examples/firedrake/getting_started/test_firedrake_env.py) +**Interactive test script** - Test environments with command-line arguments. 
+ +Usage: +```bash +# Single process +python test_firedrake_env.py --environment cylinder --num-steps 10 + +# MPI parallel +mpirun -np 4 python test_firedrake_env.py --environment cylinder --num-steps 50 +``` + +Contains **inline configuration documentation** showing all available options. + +### [`train_sb3_firedrake.py`](https://github.com/dynamicslab/hydrogym/blob/main/examples/firedrake/getting_started/train_sb3_firedrake.py) +**SB3 training script** - Train reinforcement learning agents (PPO/TD3/SAC) with Stable-Baselines3. + +Features: +- Monitor wrapper for episode statistics +- VecNormalize for observation/reward normalization +- Checkpoint saving with normalization stats +- TensorBoard logging +- Pure Python execution (no MPMD required) + +Usage: +```bash +# Basic training +python train_sb3_firedrake.py --env cylinder --algo PPO --total-timesteps 100000 + +# With custom configuration +python train_sb3_firedrake.py --env cavity --reynolds 7500 --mesh fine --algo SAC + +# Monitor training +tensorboard --logdir logs/ +``` + +### [`run_example_docker.sh`](https://github.com/dynamicslab/hydrogym/blob/main/examples/firedrake/getting_started/run_example_docker.sh) +**Docker runner script** - Run Firedrake examples in Docker with automatic setup. + +Usage: +```bash +# Test environment +./run_example_docker.sh + +# Train SB3 agent +./run_example_docker.sh train +``` + +## Quick Start + +### 1. View All Configuration Options +```bash +python config_reference.py +``` + +### 2. Test an Environment +```bash +python test_firedrake_env.py --environment cylinder --num-steps 10 --verbose +``` + +### 3. Copy a Configuration Template +Open `config_reference.py` and copy the example that matches your use case. 
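
As described in the Checkpoint Management section below, Hugging Face Hub checkpoints follow the fixed naming pattern `{FlowClass}_2D_Re{Reynolds}_{mesh}_FD`. A small helper can construct these names when building configurations programmatically (this helper is illustrative only, not part of the HydroGym API):

```python
def hf_env_name(flow_class: str, reynolds: int, mesh: str) -> str:
    """Build a checkpoint/environment name following the documented
    {FlowClass}_2D_Re{Reynolds}_{mesh}_FD convention.

    Illustrative helper -- not part of the HydroGym API.
    """
    return f"{flow_class}_2D_Re{reynolds}_{mesh}_FD"

print(hf_env_name("Cylinder", 100, "medium"))  # Cylinder_2D_Re100_medium_FD
print(hf_env_name("Cavity", 7500, "medium"))   # Cavity_2D_Re7500_medium_FD
```

The result can be passed as the `restart` value in `flow_config` wherever an environment name is accepted.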
+ +## Configuration Categories + +### **Flow Configuration** (`flow_config`) +| Parameter | Description | Options/Examples | +|-----------|-------------|------------------| +| `mesh` | Mesh resolution | `'coarse'`, `'medium'`, `'fine'` | +| `Re` | Reynolds number | Flow-dependent (e.g., 100 for cylinder) | +| `observation_type` | Observation method | `'lift_drag'`, `'stress_sensor'`, `'velocity_probes'`, `'pressure_probes'`, `'vorticity_probes'` | +| `probes` | Probe locations | `[(x1, y1), (x2, y2), ...]` | +| `restart` | Checkpoint file(s) | `None` (auto), `'file.h5'`, `'Cylinder_2D_Re100_medium_FD'` (env name), or `['file1.h5', 'file2.h5']` (multiple) | +| `local_dir` | Local checkpoint directory | `'/path/to/checkpoints'` (for offline/testing) | +| `cache_dir` | Custom cache directory | `'/path/to/cache'` (where HF downloads are stored) | +| `velocity_order` | FEM element order | `2` (default, P2-P1 Taylor-Hood) | + +### **Solver Configuration** (`solver_config`) +| Parameter | Description | Default/Options | +|-----------|-------------|-----------------| +| `dt` | Time step | **REQUIRED** (e.g., `1e-2` for cylinder, `1e-4` for cavity) | +| `order` | BDF order | `3` (options: 1, 2, 3) | +| `stabilization` | Stabilization type | `'supg'`, `'gls'`, `'none'` | +| `rtol` | Krylov tolerance | `1e-6` | + +### **Actuation Configuration** (`actuation_config`) +| Parameter | Description | Default/Options | +|-----------|-------------|-----------------| +| `num_substeps` | Solver steps per action | `1` (default) | +| `reward_aggregation` | Aggregation method | `'mean'`, `'sum'`, `'median'` | + +### **Environment Settings** +| Parameter | Description | Default | +|-----------|-------------|---------| +| `max_steps` | Episode length | `1e6` | +| `callbacks` | Callback list | `[]` | + +## Available Environments + +| Environment | Inputs | Control Type | Default Obs | Meshes | +|-------------|--------|--------------|-------------|--------| +| **Cylinder** | 1 | 
Blowing/suction (±0.1) | lift_drag | medium, fine | +| **RotaryCylinder** | 1 | Rotation (±0.5π rad) | lift_drag | medium, fine | +| **Pinball** | 3 | Rotation (±10.0) | lift_drag | medium, fine | +| **Cavity** | 1 | Blowing/suction (±0.1) | stress_sensor | medium, fine | +| **Step** | 1 | Blowing/suction (±0.1) | stress_sensor | coarse, medium, fine | + +## Observation Types + +### 1. **Force-Based Observations** +- `'lift_drag'` → Returns `(CL, CD)` for cylinder/rotary, `(CL1, CD1, CL2, CD2, CL3, CD3)` for pinball + +### 2. **Sensor-Based Observations** +- `'stress_sensor'` → Returns wall shear stress (scalar) + +### 3. **Probe-Based Observations** +- `'velocity_probes'` → Returns `[u1, u2, ..., v1, v2, ...]` at probe locations +- `'pressure_probes'` → Returns `[p1, p2, ...]` at probe locations +- `'vorticity_probes'` → Returns `[ω1, ω2, ...]` at probe locations + +**Note:** For probe-based observations, you must specify `probes` in `flow_config`. + +## Usage Examples + +### Example 1: Basic Cylinder Environment +```python +from hydrogym import FlowEnv +import hydrogym.firedrake as hgym + +env_config = { + 'flow': hgym.Cylinder, + 'flow_config': {'mesh': 'medium', 'Re': 100}, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-2}, +} + +env = FlowEnv(env_config) +obs, info = env.reset() + +for _ in range(100): + action = env.action_space.sample() + obs, reward, terminated, truncated, info = env.step(action) +``` + +### Example 2: Multi-Substep Simulation +```python +env_config = { + 'flow': hgym.Cylinder, + 'flow_config': {'mesh': 'medium', 'Re': 100}, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-2}, + 'actuation_config': { + 'num_substeps': 5, # Run 5 solver steps per action + 'reward_aggregation': 'mean', # Average rewards over substeps + }, +} + +env = FlowEnv(env_config) +# Each env.step() now runs 5 simulation steps internally +``` + +### Example 3: Training with Stable-Baselines3 +```python +# See train_sb3_firedrake.py for full 
implementation +from hydrogym import FlowEnv +import hydrogym.firedrake as hgym +from stable_baselines3 import PPO +from stable_baselines3.common.monitor import Monitor +from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize + +def make_env(): + env_config = { + 'flow': hgym.Cylinder, + 'flow_config': {'mesh': 'medium', 'Re': 100}, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-2}, + 'actuation_config': {'num_substeps': 2}, + } + env = FlowEnv(env_config) + return Monitor(env) + +env = DummyVecEnv([make_env]) +env = VecNormalize(env, norm_obs=True, norm_reward=True, clip_obs=10.0) + +model = PPO("MlpPolicy", env, verbose=1, tensorboard_log="./logs") +model.learn(total_timesteps=100000) + +model.save("ppo_cylinder") +env.save("vec_normalize.pkl") +``` + +Run with: +```bash +python train_sb3_firedrake.py --env cylinder --algo PPO --total-timesteps 100000 +``` + +### Example 4: Automatic Checkpoint Loading +```python +# Checkpoints are automatically inferred from flow config and downloaded from HF Hub +env_config = { + 'flow': hgym.Cylinder, + 'flow_config': { + 'mesh': 'medium', + 'Re': 100, + # No 'restart' specified - automatically loads 'Cylinder_2D_Re100_medium_FD' + }, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-2}, +} + +env = FlowEnv(env_config) +# Checkpoint auto-downloaded from HF Hub and loaded! 
+print(f"Loaded checkpoint: {env.flow.checkpoint_path}") +``` + +### Example 5: Local Checkpoint Directory +```python +# Use local checkpoints without HF Hub (for offline/testing) +env_config = { + 'flow': hgym.Cylinder, + 'flow_config': { + 'mesh': 'medium', + 'Re': 100, + 'local_dir': '/workspace/my_checkpoints', # Local directory + # Automatically loads: /workspace/my_checkpoints/Cylinder_2D_Re100_medium_FD/*.ckpt + }, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-2}, +} + +env = FlowEnv(env_config) +``` + +### Example 6: Multiple Checkpoints for Curriculum Learning +```python +env_config = { + 'flow': hgym.Pinball, + 'flow_config': { + 'mesh': 'fine', + 'Re': 30, + 'restart': [ + 'checkpoint_early.h5', + 'checkpoint_mid.h5', + 'checkpoint_late.h5', + ], + }, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-2}, +} + +env = FlowEnv(env_config) +# Each reset() randomly selects one of the three initial conditions +obs, info = env.reset() +print(f"Started from checkpoint index: {info.get('checkpoint_index')}") +``` + +### Example 7: Probe-Based Observations +```python +import numpy as np + +# Define wake probes +wake_probes = [(x, 0.0) for x in np.linspace(1.0, 10.0, 20)] + +env_config = { + 'flow': hgym.Cylinder, + 'flow_config': { + 'mesh': 'medium', + 'Re': 100, + 'observation_type': 'velocity_probes', + 'probes': wake_probes, + }, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-2}, +} + +env = FlowEnv(env_config) +obs, _ = env.reset() +print(f"Observation shape: {obs.shape}") # (40,) for 20 probes × 2 velocity components +``` + +### Example 8: Using Callbacks +```python +from hydrogym.firedrake.io import CheckpointCallback, LogCallback + +env_config = { + 'flow': hgym.Cavity, + 'flow_config': {'mesh': 'fine', 'Re': 7500}, + 'solver': hgym.SemiImplicitBDF, + 'solver_config': {'dt': 1e-4}, + 'callbacks': [ + CheckpointCallback( + interval=1000, + filename='cavity_checkpoint.h5', + ), + LogCallback( + postprocess=lambda 
flow: flow.get_observations(), + nvals=1, + interval=10, + filename='cavity_log.txt', + ), + ], +} + +env = FlowEnv(env_config) +``` + +## Checkpoint Management + +HydroGym provides flexible checkpoint management with automatic inference and HuggingFace Hub integration. + +### **Checkpoint Loading Methods** + +| Method | Example | Use Case | +|--------|---------|----------| +| **Automatic** | No `restart` specified | Auto-loads from HF Hub based on flow config | +| **Environment Name** | `restart='Cylinder_2D_Re100_medium_FD'` | Load specific HF Hub environment | +| **Explicit Path** | `restart='/path/to/checkpoint.h5'` | Use local checkpoint file | +| **Multiple Checkpoints** | `restart=['ckpt1.h5', 'ckpt2.h5']` | Random selection for curriculum learning | + +### **Configuration Parameters** + +```python +flow_config = { + # Checkpoint configuration + 'restart': None, # or path, environment name, or list + 'local_dir': '/path/to/local/checkpoints', # For offline/testing + 'cache_dir': '/path/to/custom/cache', # Custom HF cache location +} +``` + +### **Automatic Checkpoint Naming** + +Checkpoints follow the pattern: `{FlowClass}_2D_Re{Reynolds}_{mesh}_FD` + +Examples: +- `Cylinder_2D_Re100_medium_FD` - Cylinder at Re=100 on medium mesh +- `Pinball_2D_Re30_fine_FD` - Pinball at Re=30 on fine mesh +- `Cavity_2D_Re7500_medium_FD` - Cavity at Re=7500 on medium mesh + +### **How It Works** + +1. **No restart specified** → Auto-constructs environment name → Downloads from HF Hub → Loads first checkpoint +2. **Environment name given** → Downloads from HF Hub → Loads first checkpoint +3. **Explicit path** → Uses path directly +4. 
**Local directory** → Searches local directory → Uses symlinks (no duplication) + +### **Verification** + +After loading, check the checkpoint: +```python +env = FlowEnv(env_config) +if env.flow.checkpoint_path: + print(f"Loaded: {env.flow.checkpoint_path}") +else: + print("Starting from zeros") +``` + +## Available Callbacks + +Import from `hydrogym.firedrake.io`: + +| Callback | Purpose | Key Parameters | +|----------|---------|----------------| +| `CheckpointCallback` | Save HDF5 checkpoints | `interval`, `filename`, `write_mesh` | +| `ParaviewCallback` | Export for visualization | `interval`, `filename`, `postprocess` | +| `LogCallback` | Log to text file | `interval`, `filename`, `postprocess`, `nvals` | +| `SnapshotCallback` | Save for modal analysis | `interval`, `filename` | +| `GenericCallback` | Custom function | `callback`, `interval` | + +--- + +**Last Updated**: March 2026 +**HydroGym Version**: 1.0+ +**Maintainer**: HydroGym Team diff --git a/docs/docs/examples/firedrake/manage-docs-versions.md b/docs/docs/examples/firedrake/manage-docs-versions.md deleted file mode 100644 index 3e4687f..0000000 --- a/docs/docs/examples/firedrake/manage-docs-versions.md +++ /dev/null @@ -1,53 +0,0 @@ ---- -sidebar_position: 1 ---- - -# Manage Docs Versions - -Docusaurus can manage multiple versions of your docs. - -## Create a docs version - -Release a version 1.0 of your project: - -```bash -npm run docusaurus docs:version 1.0 -``` - -The `docs` folder is copied into `versioned_docs/version-1.0` and `versions.json` is created. - -Your docs now have 2 versions: - -- `1.0` at `http://localhost:3000/docs/` for the version 1.0 docs -- `current` at `http://localhost:3000/docs/next/` for the **upcoming, unreleased docs** - -## Add a Version Dropdown - -To navigate seamlessly across versions, add a version dropdown. 
- -Modify the `docusaurus.config.js` file: - -```js title="docusaurus.config.js" -export default { - themeConfig: { - navbar: { - items: [ - // highlight-start - { - type: 'docsVersionDropdown', - }, - // highlight-end - ], - }, - }, -}; -``` - -The docs version dropdown appears in your navbar: - -## Update an existing version - -It is possible to edit versioned docs in their respective folder: - -- `versioned_docs/version-1.0/hello.md` updates `http://localhost:3000/docs/hello` -- `docs/hello.md` updates `http://localhost:3000/docs/next/hello` diff --git a/docs/docs/examples/firedrake/pinball.md b/docs/docs/examples/firedrake/pinball.md new file mode 100644 index 0000000..3f11c82 --- /dev/null +++ b/docs/docs/examples/firedrake/pinball.md @@ -0,0 +1,82 @@ +--- +sidebar_position: 4 +--- + +# Pinball + +⚠️ **NOTE**: These are **advanced workflow examples** showing direct solver access and specialized workflows. They do NOT use the standard RL interface. + +**Looking for standard RL examples?** See [Getting Started](./getting_started) for `env.reset()` / `env.step()` interface. + +--- + +Flow around three cylinders in triangular arrangement - a challenging benchmark for flow control. + +## Physical Description + +**Configuration:** +- Three cylinders in equilateral triangle arrangement +- Cylinder radius: 0.5 +- Uniform inflow from left (U∞ = 1.0) +- Reynolds number Re = 30-150 + +**Key Phenomena:** +- **Re = 30:** Steady symmetric flow +- **Re = 150:** Complex unsteady wake with three-body interactions +- Wake can exhibit mode switching between symmetric/asymmetric states +- Chaotic dynamics possible at higher Re + +## Quick Start + +### 1. Basic Simulation + +Run unsteady pinball flow at Re=30: + +```bash +python run-transient.py +``` + +**What it does:** Simulates flow around three cylinders showing wake interactions +**Outputs:** +- `coeffs.dat` - Time series of CL for all three cylinders +- Console shows forces on each cylinder +**Prerequisites:** None + +### 2. 
Find Steady State + +Solve for steady flow at Re=80 using Newton iteration: + +```bash +python solve-steady.py +``` + +**What it does:** Computes steady state (or unstable equilibrium) for stability analysis +**Uses ramping:** 40 → 60 → 80 for better convergence +**Outputs:** +- `output/pinball_Re80_steady.h5` - Checkpoint for restart +- Paraview files for visualization +- Force coefficients for all three cylinders +**Prerequisites:** None + +### 3. Observe Wake Dynamics + +Two-stage simulation: steady solve + perturbed transient: + +```bash +python unsteady.py +``` + +**What it does:** Demonstrates transition from steady state to complex wake dynamics +**Stage 1:** Solve steady state with Reynolds ramping (40 → 60 → 80 → 100) +**Stage 2:** Add perturbation and run transient (Tf=200) +**Outputs:** Time series, Paraview animations, force data for all cylinders +**Prerequisites:** None (computes steady state internally) + +--- + +**MPI Parallelization:** +All scripts support parallel execution: +```bash +mpirun -np 4 python <script>.py +``` + diff --git a/docs/docs/examples/firedrake/step.md b/docs/docs/examples/firedrake/step.md new file mode 100644 index 0000000..605de48 --- /dev/null +++ b/docs/docs/examples/firedrake/step.md @@ -0,0 +1,96 @@ +--- +sidebar_position: 5 +--- + +# Backward-Facing Step + +⚠️ **NOTE**: These are **advanced workflow examples** showing direct solver access and specialized workflows. They do NOT use the standard RL interface. + +**Looking for standard RL examples?** See [Getting Started](./getting_started) for `env.reset()` / `env.step()` interface. + +--- + +Flow over a backward-facing step demonstrates separated flow and reattachment.
+ +## Physical Description + +**Configuration:** +- Channel with sudden expansion (backward-facing step) +- Step height creates separation zone +- Uniform inflow from left +- Reynolds number Re = 600 (default) + +**Observations:** +- Kinetic energy (KE) +- Turbulent kinetic energy (TKE) +- Reattachment point location + +## Quick Start + +### 1. Basic Simulation + +Run transient step flow at Re=600: + +```bash +python run-transient.py +``` + +**What it does:** Simulates separated flow from perturbed base state +**Outputs:** +- `output/stats.dat` - Time series of CFL, KE, TKE +- Console shows KE, TKE evolution +- Long time integration (1000 time units) +**Prerequisites:** Requires steady state checkpoint from solve-steady.py + +### 2. Find Steady State + +Solve for steady flow using Newton iteration with Reynolds ramping: + +```bash +python solve-steady.py +``` + +**What it does:** Computes steady base flow for separated step flow +**Uses ramping:** 100 → 200 → 300 → 400 → 500 → 600 for convergence +**Outputs:** +- `output/600_steady.h5` - Steady checkpoint for restart +- `output/600_steady.pvd` - Paraview visualization of recirculation zone +**Prerequisites:** None + +### 3. Observe Instability + +Two-stage simulation: steady solve + perturbed transient: + +```bash +python unsteady.py +``` + +**What it does:** Demonstrates transition from steady to unsteady separated flow +**Stage 1:** Solve steady state with Reynolds ramping +**Stage 2:** Add perturbation and run long transient (Tf=1000) +**Outputs:** Time series, Paraview animations, TKE evolution +**Prerequisites:** None (computes steady state internally) + +### 4. 
Test Control Response + +Apply step input actuation: + +```bash +python step-control.py +``` + +**What it does:** Applies step change in actuation to measure flow response +**Control:** Off until t=50, then constant actuation +**Purpose:** System identification - measure step response +**Outputs:** Time series showing response to actuation +**Prerequisites:** None (uses internal initial condition) + +--- + +**MPI Parallelization:** +All scripts support parallel execution: +```bash +mpirun -np 4 python <script>.py +``` + + diff --git a/docs/docs/examples/jax/channel.md b/docs/docs/examples/jax/channel.md new file mode 100644 index 0000000..91089e8 --- /dev/null +++ b/docs/docs/examples/jax/channel.md @@ -0,0 +1,166 @@ +--- +sidebar_position: 3 +--- + +# Turbulent Channel + +HydroGym contains a 3D channel flow written in the differentiable programming language [JAX](https://docs.jax.dev/en/latest/notebooks/thinking_in_jax.html). The channel flow is of size $[2\pi, \pi, 2]$, where $z$ is the wall-normal direction. The channel flow is run at $Re_\tau = 180$ and is pre-configured to be controlled with 24 wall-normal jets evenly spaced along the wall. The observation consists of evenly spaced x-velocity values sampled at $y^+ \approx 9$. + +## Initializing the Environment + +To set up the environment, we first import JAX and the necessary HydroGym modules. Then we can construct the environment and work with it. + +```python +import jax +import jax.numpy as jnp +import matplotlib.pyplot as plt + +# Import channel flow and environment configuration +from hydrogym.jax.envs.channel import ChannelFlowSpectralEnv + +env_config = {} +env = ChannelFlowSpectralEnv(env_config) +params = env.default_params +``` + +To start from a clean state, we also need to reset the environment to its initial conditions.
These are provided in the [HuggingFace initial fields folder](https://huggingface.co/datasets/dynamicslab/HydroGym-environments). + +```python +key = jax.random.PRNGKey(0) +obs, state = env.reset_env(key, params) +print("Initial state shape U:", state.U.shape) +print("Initial state shape V:", state.V.shape) +print("Initial state shape W:", state.W.shape) +print("Initial mean observation value: ", jnp.mean(obs)) +``` + +At this point we can begin running the environment. + +## A First Environment Step + +To interact with the environment, we now need to define an action to take. For this we can rely on the default parameters provided by the environment to give us the right dimensions. + +```python +action = jnp.zeros((params.action_dim,)) +``` + +Now we can step the environment forward by passing in this action. + +```python +obs, state, reward, done, info = env.step_env(key, state, action, params) + +print("Mean observation value: ", jnp.mean(obs)) +print("Reward:", reward) +``` + +To validate that the environment is working correctly, we should see approximately the following output: + +| Quantity | Value | +| -------- | -------- | +| Mean Observation Value | 3.3810941472675573 | +| Reward | -0.3378005217734956 | + +## Visualizing U without Control + +After stepping the environment, the state contains (U, V, W) fields which can be accessed and visualized: + +```python +U = state.U +y_idx = 36 +z_idx = 8 +x_idx = 36 +U_slice_xz = U[:, y_idx, :] +U_slice_xy = U[:, :, z_idx] +U_slice_yz = U[x_idx, :, :] + +fig, axes = plt.subplots(1, 3, figsize=(10, 2)) +slices = [U_slice_xz, U_slice_xy, U_slice_yz] + +vmin = min(s.min() for s in slices) +vmax = max(s.max() for s in slices) + +for ax, slice_ in zip(axes, slices): + im = ax.imshow( + slice_.T, + origin="lower", + aspect="auto", + vmin=vmin, + vmax=vmax, + ) +fig.subplots_adjust(right=0.9) + +plt.tight_layout(rect=[0, 0, 1.25, 1]) +fig.colorbar(im, ax=axes,
label="velocity") +plt.show() +``` + +![Channel flow visualization without control](img/channel_1.png) + +Next, we can explore the effect of the wall-normal jets. Here we perform six steps in the environment and inspect the mean observation value and reward after each step. + +```python +key = jax.random.PRNGKey(0) +obs, state = env.reset_env(key, params) + +# action = jax.random.normal(key, (params.action_dim,)) +action = 0.01 * jnp.ones((params.action_dim,)) +num_steps = 6 + +for i in range(num_steps): + obs, state, reward, done, info = env.step_env(key, state, action, params) + print("Mean observation value after environment step: ", jnp.mean(obs)) + print("Reward:", reward) +``` + +The output should look approximately like the following: + +| Step | Mean Observation Value | Reward | +| -------- | -------- | -------- | +| 1 | 3.3811771453005943 | -0.3379856556277938 | +| 2 | 3.374816954342459 | -0.3296336066794982 | +| 3 | 3.3952557633014124 | -0.32246759246707535 | +| 4 | 3.4049538883848918 | -0.317317974468479 | +| 5 | 3.369506324681792 | -0.31543620069660644 | +| 6 | 3.3887298110075403 | -0.31826735862330213 | + +## Visualizing U after a few Steps in the Environment with Control + +To visualize the flow after a few steps in the environment with control, we can reuse the same code as above:
+ +```python +U = state.U +y_idx = 36 +z_idx = 9 +x_idx = 36 +U_slice_xz = U[:, y_idx, :] +U_slice_xy = U[:, :, z_idx] +U_slice_yz = U[x_idx, :, :] + +fig, axes = plt.subplots(1, 3, figsize=(10, 2)) +slices = [U_slice_xz, U_slice_xy, U_slice_yz] + +vmin = min(s.min() for s in slices) +vmax = max(s.max() for s in slices) + +for ( + ax, + slice_, +) in zip(axes, slices): + im = ax.imshow( + slice_.T, + origin="lower", + aspect="auto", + vmin=vmin, + vmax=vmax, + ) +fig.subplots_adjust(right=0.9) + +plt.tight_layout(rect=[0, 0, 1.25, 1]) +fig.colorbar(im, ax=axes, label="velocity") +plt.show() +``` + +![Channel flow visualization with control](img/channel_2.png) diff --git a/docs/docs/examples/jax/getting_started.md b/docs/docs/examples/jax/getting_started.md new file mode 100644 index 0000000..bf8b781 --- /dev/null +++ b/docs/docs/examples/jax/getting_started.md @@ -0,0 +1,68 @@ +--- +sidebar_position: 1 +--- + +# Getting Started + +Examples demonstrating HydroGym's JAX backend for GPU-accelerated, fully-differentiable flow control. + +## What is the JAX backend? + +HydroGym's JAX backend provides pseudo-spectral Navier-Stokes solvers written entirely in JAX. This enables: + +- **GPU acceleration** — solvers run on GPU via JAX's XLA compilation +- **Vectorized environments** — run many parallel environments inside a single JIT-compiled training loop (PureJAX-style) +- **End-to-end differentiability** — gradients can flow through the solver for gradient-based control + +The JAX environments follow the [gymnax](https://github.com/RobertTLange/gymnax) interface (`reset_env` / `step_env` with explicit `params`) and include wrappers (`VecEnv`, `LogWrapper`, `ClipAction`, `NormalizeVecObservation`, `NormalizeVecReward`) for RL training. 
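
A key design point of this interface is that the environment state is passed around explicitly rather than stored on the object, which is what allows JAX to `jit` and `vmap` the whole loop. A minimal pure-Python stand-in (the `ToyParams` / `reset_env` / `step_env` names below are illustrative, not part of HydroGym) showing the stateless shape of the interface:

```python
class ToyParams:
    """Stand-in for an environment's `default_params` container."""

    action_dim = 2
    max_steps = 5


def reset_env(rng, params):
    """Return (obs, state); the state is an explicit value, not a hidden attribute."""
    state = {"x": [0.0] * params.action_dim, "t": 0}
    return list(state["x"]), state


def step_env(rng, state, action, params):
    """Pure function: consumes the old state and returns the new one."""
    x = [xi + ai for xi, ai in zip(state["x"], action)]
    t = state["t"] + 1
    reward = -sum(xi * xi for xi in x)  # e.g. a quadratic cost on the state
    done = t >= params.max_steps
    return list(x), {"x": x, "t": t}, reward, done, {}


params = ToyParams()
obs, state = reset_env(None, params)
for _ in range(params.max_steps):
    obs, state, reward, done, info = step_env(None, state, [0.1, 0.1], params)
```

Because `step_env` has no side effects, mapping it over a batch of states (which is what `VecEnv` does with `jax.vmap`) is mechanical.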
+ +## Quick Start + +```bash +# Activate the GPU environment +source /home/easybuild/venvs/hydrogym_gpu/bin/activate + +# Test Kolmogorov flow +cd getting_started/1_kolmogorov +./run_kolmogorov_docker.sh + +# Test channel flow +cd getting_started/2_channel +./run_channel_docker.sh + +# Train PPO +cd getting_started/3_ppo +./run_ppo_docker.sh --env kolmogorov --total-timesteps 20000 +``` + +## Available Environments + +| Environment | Solver | Grid | Action | Observation | Reward | +|---|---|---|---|---|---| +| `KolmogorovFlow` | 2D pseudo-spectral | 64×64 | 4 body-force modes | 8×8 velocity probes | -(α·TKE + action penalty) | +| `ChannelFlowSpectralEnv` | 3D pseudo-spectral | 72×72×72 | 24 wall jets | 8×8×2 near-wall velocities | -WSS (drag) | + +## Typical Usage + +```python +import jax +import jax.numpy as jnp +from hydrogym.jax.envs.kolmogorov import KolmogorovFlow + +env = KolmogorovFlow(env_config={}, flow_config={}) +params = env.default_params + +key = jax.random.PRNGKey(0) +obs, state = env.reset_env(key, params) + +action = jnp.zeros((params.action_dim,)) +obs, state, reward, done, info = env.step_env(key, state, action, params) +``` + +**Note:** The channel flow environment downloads a fully turbulent initial field from Hugging Face Hub (`dynamicslab/HydroGym-environments`) on the first run and caches it at `~/.cache/hydrogym/`. 
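
The download-once caching behaviour can be pictured as follows. This is a simplified sketch: `fetch_initial_field` and the exact cache layout are illustrative, not HydroGym's actual implementation.

```python
import tempfile
from pathlib import Path


def fetch_initial_field(name, cache_dir, download):
    """Return the cached file, calling `download` only on a cache miss."""
    path = Path(cache_dir) / name
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(download())  # the network hit happens only here, once
    return path


calls = []


def fake_download():
    calls.append(1)
    return b"turbulent-initial-field"


cache = tempfile.mkdtemp()
p1 = fetch_initial_field("channel_init.h5", cache, fake_download)
p2 = fetch_initial_field("channel_init.h5", cache, fake_download)
```

The second call returns the same path without touching the network, which is why only the first run of the channel environment needs internet access.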
+
+## Requirements
+
+- JAX with GPU support (`jax[cuda12]` or equivalent)
+- `flax`, `optax`, `distrax` for PPO training
+- Internet access on first run (channel flow initial field download)
diff --git a/docs/docs/examples/jax/img/channel_1.png b/docs/docs/examples/jax/img/channel_1.png
new file mode 100644
index 0000000..c4e1820
Binary files /dev/null and b/docs/docs/examples/jax/img/channel_1.png differ
diff --git a/docs/docs/examples/jax/img/channel_2.png b/docs/docs/examples/jax/img/channel_2.png
new file mode 100644
index 0000000..34acd88
Binary files /dev/null and b/docs/docs/examples/jax/img/channel_2.png differ
diff --git a/docs/docs/examples/jax/img/kolmogorov.gif b/docs/docs/examples/jax/img/kolmogorov.gif
new file mode 100644
index 0000000..b6b4727
Binary files /dev/null and b/docs/docs/examples/jax/img/kolmogorov.gif differ
diff --git a/docs/docs/examples/jax/img/kolmogorov_1.png b/docs/docs/examples/jax/img/kolmogorov_1.png
new file mode 100644
index 0000000..8c653ee
Binary files /dev/null and b/docs/docs/examples/jax/img/kolmogorov_1.png differ
diff --git a/docs/docs/examples/jax/img/kolmogorov_2.png b/docs/docs/examples/jax/img/kolmogorov_2.png
new file mode 100644
index 0000000..0c0ce61
Binary files /dev/null and b/docs/docs/examples/jax/img/kolmogorov_2.png differ
diff --git a/docs/docs/examples/jax/kolmogorov.md b/docs/docs/examples/jax/kolmogorov.md
new file mode 100644
index 0000000..5bd4c6e
--- /dev/null
+++ b/docs/docs/examples/jax/kolmogorov.md
@@ -0,0 +1,178 @@
+---
+sidebar_position: 2
+---
+
+# Kolmogorov
+
+A recent development in HydroGym is the inclusion of differentiable solvers written in [JAX](https://docs.jax.dev/en/latest/notebooks/thinking_in_jax.html). The idea is to leverage a differentiable environment to compute sensitivities (gradients of an objective with respect to the controls), which can lead to a better controller with less compute time. This tutorial covers how to set up the Kolmogorov JAX environment and how to run basic control. 
Currently, the Kolmogorov flow is the main differentiable flow environment implemented in HydroGym.
+
+## Setting up the JAX Environment
+
+To set up our environment, we begin by importing JAX, plotting libraries, and the required components from HydroGym. The most important of these is the [pseudospectral Navier-Stokes solver in 2D](https://github.com/dynamicslab/hydroGym/blob/main/hydroGym/jax/envs/kolmogorov.py).
+
+```python
+import os
+import time
+import jax.numpy as jnp
+import matplotlib.pyplot as plt
+import seaborn as sns
+from hydrogym.jax.solvers.base import RungeKuttaCrankNicolson
+from hydrogym.jax.utils import io as io
+from hydrogym.jax.envs.kolmogorov import FlowConfig, PseudoSpectralNavierStokes2D
+```
+
+In addition, we define a postprocessing function so that we can later log and visualize the results:
+
+```python
+print_fmt = "vel1: {0:0.3f}\t\t vel2: {1:0.3f}\t\t vel3: {2:0.3e}\t\t vel4: {3:0.3e}"
+
+
+def log_postprocess(flow):
+    """
+    The default observation is the velocity at 64 equally spaced points along the domain.
+    This postprocess function computes the mean of the observations.
+    """
+    obs = flow.get_observations()
+    mean_obs_time = jnp.mean(obs, axis=1)
+    return mean_obs_time
+
+
+output_dir = "kolmogorov_data"
+np_file_name = "kolmogorov_trajectory"
+gif_file_name = "kolmogorov"
+os.makedirs(output_dir, exist_ok=True)
+```
+
+## The Reinforcement Learning Environment
+
+To set up the reinforcement learning environment, we will be utilizing HydroGym's [FlowConfig](https://github.com/dynamicslab/hydroGym/blob/main/hydroGym/jax/envs/kolmogorov.py). If you would like to change the grid resolution, pass it as an argument to the FlowConfig, like
+
+```python
+FlowConfig(domain_x = 256, domain_y = 256)...
+```
+
+The default here is $64 \times 64$. If you would like to change the Reynolds number, specifically to view extreme events, set `flow.Re` to a value between 40 and 80. 
The default Reynolds number is 200, a fully turbulent state.
+
+```python
+flow = FlowConfig()
+flow.Re = 100
+dt = 0.001
+equation = PseudoSpectralNavierStokes2D(flow)
+solver = RungeKuttaCrankNicolson(flow, dt, 1, equation)
+end_time = 50  # This is in seconds!
+
+callbacks = [
+    io.LogCallback(
+        postprocess=log_postprocess,
+        interval=1,
+        filename=f"{output_dir}/kolmogorov.dat",
+        print_fmt=print_fmt,
+    ),
+]
+```
+
+If you have access to a GPU, run the above code there: the simulation will be much faster, and you can afford a higher grid resolution. You can check which device JAX is using as follows:
+
+```python
+import jax
+
+
+def check_jax_device():
+    # jax.default_backend() replaces the deprecated jax.lib.xla_bridge lookup
+    print(f"JAX is using: {jax.default_backend()}")
+
+
+check_jax_device()
+start = time.time()
+final, trajectory = solver.solve(dt, flow, (0, end_time), callbacks, None, save_n=1)
+end = time.time()
+jnp.save(output_dir + "/" + np_file_name, trajectory)
+print("Total time:", end - start)
+```
+
+## Visualizing the Results
+
+We can now visualize the results and see how the Kolmogorov flow evolves in time.
+
+```python
+cols = 5
+fig, axs = plt.subplots(1, cols, figsize=(15, 5))
+
+simulation = jnp.load(output_dir + "/" + np_file_name + ".npy")
+
+data = jnp.fft.irfftn(simulation, axes=(1, 2))
+
+for i in range(cols):
+    t_idx = int(len(data) * (i / cols))  # renamed from `time` to avoid shadowing the time module
+    axs[i].imshow(data[t_idx], cmap="icefire", vmin=-8, vmax=8)
+    axs[i].set_title("time {}".format(t_idx))
+
+plt.tight_layout()
+plt.show()
+```
+
+![Kolmogorov Flow evolving](img/kolmogorov_1.png)
+
+### GIF Generation
+
+We can also summarize the evolution of the flow in time as a GIF, using the [imageio](https://imageio.readthedocs.io/en/stable/) library. 
+
+```python
+import numpy as np
+import imageio.v2 as imageio
+
+frames = []
+for i in range(data.shape[0]):
+    fig, ax = plt.subplots(figsize=(4, 4))
+    ax.imshow(data[i], cmap="icefire")
+    ax.axis("off")
+    fig.canvas.draw()
+    # buffer_rgba() works on current Matplotlib versions, where tostring_rgb() has been removed
+    image = np.asarray(fig.canvas.buffer_rgba())[..., :3].copy()
+    frames.append(image)
+
+    plt.close(fig)
+
+imageio.mimsave(output_dir + "/" + gif_file_name + ".gif", frames, fps=10)
+```
+
+![Kolmogorov Flow GIF](img/kolmogorov.gif)
+
+## Applying Control
+
+To add a controller to the system, simply specify the desired forcing under `flow.control_function`. For instance, to apply sinusoidal forcing with wavenumber 4 and amplitude -0.6, we can do the following:
+
+```python
+x, y = flow.load_mesh(name="")
+
+
+def control_func(a, k, y):
+    return (a * jnp.sin(k * y), jnp.zeros_like(y))
+
+
+flow.control_function = control_func(-0.6, 4, y)
+final, trajectory = solver.solve(dt, flow, (0, end_time), callbacks, None, save_n=1)
+jnp.save(output_dir + "/" + "kolmogorov_controlled", trajectory)
+```
+
+The controlled flow can then be visualized as follows:
+
+```python
+cols = 5
+fig, axs = plt.subplots(1, cols, figsize=(15, 5))
+
+simulation = jnp.load(output_dir + "/" + "kolmogorov_controlled" + ".npy")
+
+data = jnp.fft.irfftn(simulation, axes=(1, 2))
+
+for i in range(cols):
+    t_idx = int(len(data) * (i / cols))  # renamed from `time` to avoid shadowing the time module
+    axs[i].imshow(data[t_idx], cmap="icefire", vmin=-8, vmax=8)
+    axs[i].set_title("time {}".format(t_idx))
+
+plt.tight_layout()
+plt.show()
+```
+
+![Controlled Kolmogorov Flow](img/kolmogorov_2.png)
+
+For comparison, here is the uncontrolled Kolmogorov flow from earlier in the example:
+
+![Uncontrolled Kolmogorov Flow](img/kolmogorov_1.png)
diff --git a/docs/docs/examples/jax/manage-docs-versions.md b/docs/docs/examples/jax/manage-docs-versions.md
deleted file mode 100644
index 3e4687f..0000000
--- a/docs/docs/examples/jax/manage-docs-versions.md
+++ /dev/null @@ -1,53 +0,0 @@ ---- -sidebar_position: 1 ---- - -# Manage Docs Versions - -Docusaurus can manage multiple versions of your docs. - -## Create a docs version - -Release a version 1.0 of your project: - -```bash -npm run docusaurus docs:version 1.0 -``` - -The `docs` folder is copied into `versioned_docs/version-1.0` and `versions.json` is created. - -Your docs now have 2 versions: - -- `1.0` at `http://localhost:3000/docs/` for the version 1.0 docs -- `current` at `http://localhost:3000/docs/next/` for the **upcoming, unreleased docs** - -## Add a Version Dropdown - -To navigate seamlessly across versions, add a version dropdown. - -Modify the `docusaurus.config.js` file: - -```js title="docusaurus.config.js" -export default { - themeConfig: { - navbar: { - items: [ - // highlight-start - { - type: 'docsVersionDropdown', - }, - // highlight-end - ], - }, - }, -}; -``` - -The docs version dropdown appears in your navbar: - -## Update an existing version - -It is possible to edit versioned docs in their respective folder: - -- `versioned_docs/version-1.0/hello.md` updates `http://localhost:3000/docs/hello` -- `docs/hello.md` updates `http://localhost:3000/docs/next/hello` diff --git a/docs/docs/examples/jax/ppo.md b/docs/docs/examples/jax/ppo.md new file mode 100644 index 0000000..090f15c --- /dev/null +++ b/docs/docs/examples/jax/ppo.md @@ -0,0 +1,447 @@ +--- +sidebar_position: 4 +--- + +# PPO Training + +Pure-JAX PPO training for the Kolmogorov and turbulent channel environments, based on [purejaxrl](https://github.com/luchris429/purejaxrl/) with HydroGym integrations (`VecEnv`, normalization wrappers, etc.). 
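
Before diving into the code, it helps to see what the normalization wrappers do. Below is a standalone sketch of discounted-return reward scaling in the purejaxrl style; the `RunningRewardNormalizer` class is illustrative, a simplification of what `NormalizeVecReward` implements across vectorized environments.

```python
class RunningRewardNormalizer:
    """Scale rewards by the std of a running discounted-return estimate (Welford update)."""

    def __init__(self, gamma=0.99, eps=1e-8):
        self.gamma, self.eps = gamma, eps
        self.ret = 0.0  # running discounted return
        self.count, self.mean, self.m2 = 0, 0.0, 0.0  # Welford accumulators

    def update(self, reward, done):
        # accumulate the discounted return, resetting at episode boundaries
        self.ret = self.ret * self.gamma * (1.0 - done) + reward
        self.count += 1
        delta = self.ret - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (self.ret - self.mean)
        var = self.m2 / self.count
        # note: the very first rewards are divided by a near-zero std;
        # real implementations warm the statistics up before trusting them
        return reward / (var**0.5 + self.eps)


norm = RunningRewardNormalizer()
out = [norm.update(r, d) for r, d in [(1.0, 0.0), (2.0, 0.0), (0.5, 1.0), (1.0, 0.0)]]
```

Scaling by the return's std (rather than the raw reward's) keeps the effective reward magnitude stable across environments with very different reward scales, which matters when the same PPO hyperparameters are reused for Kolmogorov and channel flow.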
+
+## Setting up the JAX Environment
+
+We begin by setting up the JAX environment with all required software dependencies:
+
+```python
+import argparse
+import pickle
+from typing import NamedTuple, Sequence
+
+import distrax
+import flax.linen as nn
+import flax.serialization
+import jax
+import jax.numpy as jnp
+import matplotlib.pyplot as plt
+import numpy as np
+import optax
+from flax.linen.initializers import constant, orthogonal  # used by the actor-critic network below
+from flax.training.train_state import TrainState  # used when constructing the optimizer state
+```
+
+From HydroGym, we will need the functions to wrap the environment in a `VecEnv` and normalize the observations and rewards.
+
+```python
+from hydrogym.jax.env_core import ClipAction, LogWrapper, NormalizeVecObservation, NormalizeVecReward, VecEnv
+```
+
+## Constructing the Reinforcement Learning Environment
+
+To construct the reinforcement learning environment, we define a utility function that reads the environment name from the configuration and instantiates the chosen case.
+
+```python
+def make_env(config):
+    """Instantiate the environment selected by config["ENV_NAME"]."""
+    env_name = config.get("ENV_NAME", "kolmogorov").lower()
+    if env_name == "kolmogorov":
+        from hydrogym.jax.envs.kolmogorov import KolmogorovFlow
+
+        env = KolmogorovFlow(env_config={}, flow_config={})
+    elif env_name == "channel":
+        from hydrogym.jax.envs.channel import ChannelFlowSpectralEnv
+
+        env = ChannelFlowSpectralEnv(env_config={})
+    else:
+        raise ValueError(f"Unknown ENV_NAME: {env_name!r}. 
Choose 'kolmogorov' or 'channel'.")
+    return env, env.default_params
+```
+
+In addition, we require utility functions for saving and loading the model:
+
+```python
+def save_model(params, filepath):
+    with open(filepath, "wb") as f:
+        # Using pickle to serialize params
+        pickle.dump(flax.serialization.to_bytes(params), f)
+
+
+def load_model(filepath):
+    with open(filepath, "rb") as f:
+        # Deserialize params using pickle
+        params_bytes = pickle.load(f)
+        params = flax.serialization.from_bytes(None, params_bytes)
+    return params
+```
+
+## Defining Reinforcement Learning Training
+
+For the reinforcement learning training, we first define an actor-critic network, then a container for transitions, and finally the training loop itself.
+
+```python
+class ActorCritic(nn.Module):
+    action_dim: Sequence[int]
+    activation: str = "tanh"
+
+    @nn.compact
+    def __call__(self, x):
+        if self.activation == "relu":
+            activation = nn.relu
+        else:
+            activation = nn.tanh
+        actor_mean = nn.Dense(256, kernel_init=orthogonal(np.sqrt(2)), bias_init=constant(0.0))(x)
+        actor_mean = activation(actor_mean)
+        actor_mean = nn.Dense(256, kernel_init=orthogonal(np.sqrt(2)), bias_init=constant(0.0))(actor_mean)
+        actor_mean = activation(actor_mean)
+        actor_mean = nn.Dense(self.action_dim, kernel_init=orthogonal(0.01), bias_init=constant(0.0))(actor_mean)
+        actor_logtstd = self.param("log_std", nn.initializers.zeros, (self.action_dim,))
+        pi = distrax.MultivariateNormalDiag(actor_mean, jnp.exp(actor_logtstd))  # std is stored as log_std, so exponentiate
+
+        critic = nn.Dense(256, kernel_init=orthogonal(np.sqrt(2)), bias_init=constant(0.0))(x)
+        critic = activation(critic)
+        critic = nn.Dense(256, kernel_init=orthogonal(np.sqrt(2)), bias_init=constant(0.0))(critic)
+        critic = activation(critic)
+        critic = nn.Dense(1, kernel_init=orthogonal(1.0), bias_init=constant(0.0))(critic)
+
+        return pi, jnp.squeeze(critic, axis=-1)
+```
+
+The transition 
class is then defined as follows: + +```python +class Transition(NamedTuple): + done: jnp.ndarray + action: jnp.ndarray + value: jnp.ndarray + reward: jnp.ndarray + log_prob: jnp.ndarray + obs: jnp.ndarray + info: jnp.ndarray +``` + +With the rollout function following the purejaxrl implementation: + +```python +def rollout(env, params, env_params, num_steps=10, num_envs=4, activation="tanh"): + rng = jax.random.PRNGKey(30) + rng, _rng = jax.random.split(rng) + reset_rng = jax.random.split(_rng, num_envs) + observations = [] + actions = [] + rewards = [] + dones = [] + + # Wrap before reset so the wrapped env is used throughout + env = ClipAction(env) + + obs, env_state = env.reset(reset_rng, env_params) + + network = ActorCritic(env.action_space(env_params).shape[0], activation=activation) + + for _ in range(num_steps): + observations.append(obs) + + rng, action_rng = jax.random.split(rng) + pi, _ = network.apply(params, obs) + action = pi.sample(seed=action_rng) + actions.append(action) + + rng, step_rng = jax.random.split(rng) + obs, env_state, reward, done, _ = env.step(step_rng, env_state, action, env_params) + rewards.append(reward) + dones.append(done) + + return { + "observations": jnp.array(observations), + "actions": jnp.array(actions), + "rewards": jnp.array(rewards), + "dones": jnp.array(dones), + } +``` + +Culminating in the following training loop: + +```python +def make_train(config): + total_batch = config["NUM_ENVS"] * config["NUM_STEPS"] + if total_batch % config["NUM_MINIBATCHES"] != 0: + raise ValueError( + f"NUM_ENVS * NUM_STEPS ({config['NUM_ENVS']} * {config['NUM_STEPS']} = {total_batch}) " + f"must be divisible by NUM_MINIBATCHES ({config['NUM_MINIBATCHES']}). 
" + f"Valid NUM_MINIBATCHES values for your settings: " + f"{[d for d in range(1, total_batch + 1) if total_batch % d == 0]}" + ) + config["NUM_UPDATES"] = config["TOTAL_TIMESTEPS"] // config["NUM_STEPS"] // config["NUM_ENVS"] + config["MINIBATCH_SIZE"] = total_batch // config["NUM_MINIBATCHES"] + env, env_params = make_env(config) + env = LogWrapper(env) + env = ClipAction(env) + env = VecEnv(env) + + if config["NORMALIZE_ENV"]: + env = NormalizeVecObservation(env) + env = NormalizeVecReward(env, config["GAMMA"]) + + def linear_schedule(count): + frac = 1.0 - (count // (config["NUM_MINIBATCHES"] * config["UPDATE_EPOCHS"])) / config["NUM_UPDATES"] + return config["LR"] * frac + + # @partial(jax.jit, static_argnums=(1,)) + def train(rng): + # INIT NETWORK + network = ActorCritic(env.action_space(env_params).shape[0], activation=config["ACTIVATION"]) + rng, _rng = jax.random.split(rng) + init_x = jnp.zeros(env.observation_space(env_params).shape) + network_params = network.init(_rng, init_x) + if config["ANNEAL_LR"]: + tx = optax.chain( + optax.clip_by_global_norm(config["MAX_GRAD_NORM"]), + optax.adam(learning_rate=linear_schedule, eps=1e-5), + ) + else: + tx = optax.chain( + optax.clip_by_global_norm(config["MAX_GRAD_NORM"]), + optax.adam(config["LR"], eps=1e-5), + ) + train_state = TrainState.create( + apply_fn=network.apply, + params=network_params, + tx=tx, + ) + + # INIT ENV + rng, _rng = jax.random.split(rng) + reset_rng = jax.random.split(_rng, config["NUM_ENVS"]) + obsv, env_state = env.reset(reset_rng, env_params) + + # TRAIN LOOP + def _update_step(runner_state, unused): + # COLLECT TRAJECTORIES + def _env_step(runner_state, unused): + train_state, env_state, last_obs, rng = runner_state + + # SELECT ACTION + rng, _rng = jax.random.split(rng) + pi, value = network.apply(train_state.params, last_obs) + action = pi.sample(seed=_rng) # clip action here + log_prob = pi.log_prob(action) + + # STEP ENV + rng, _rng = jax.random.split(rng) + rng_step = 
jax.random.split(_rng, config["NUM_ENVS"]) + obsv, env_state, reward, done, info = env.step(rng_step, env_state, action, env_params) + transition = Transition(done, action, value, reward, log_prob, last_obs, info) + runner_state = (train_state, env_state, obsv, rng) + return runner_state, transition + + runner_state, traj_batch = jax.lax.scan(_env_step, runner_state, None, config["NUM_STEPS"]) + + # CALCULATE ADVANTAGE + train_state, env_state, last_obs, rng = runner_state + _, last_val = network.apply(train_state.params, last_obs) + + def _calculate_gae(traj_batch, last_val): + def _get_advantages(gae_and_next_value, transition): + gae, next_value = gae_and_next_value + done, value, reward = ( + transition.done, + transition.value, + transition.reward, + ) + delta = reward + config["GAMMA"] * next_value * (1 - done) - value + gae = delta + config["GAMMA"] * config["GAE_LAMBDA"] * (1 - done) * gae + return (gae, value), gae + + _, advantages = jax.lax.scan( + _get_advantages, + (jnp.zeros_like(last_val), last_val), + traj_batch, + reverse=True, + unroll=16, + ) + return advantages, advantages + traj_batch.value + + advantages, targets = _calculate_gae(traj_batch, last_val) + + # UPDATE NETWORK + def _update_epoch(update_state, unused): + def _update_minbatch(train_state, batch_info): + traj_batch, advantages, targets = batch_info + + def _loss_fn(params, traj_batch, gae, targets): + # RERUN NETWORK + pi, value = network.apply(params, traj_batch.obs) + log_prob = pi.log_prob(traj_batch.action) + + # CALCULATE VALUE LOSS + value_pred_clipped = traj_batch.value + (value - traj_batch.value).clip( + -config["CLIP_EPS"], config["CLIP_EPS"] + ) + value_losses = jnp.square(value - targets) + value_losses_clipped = jnp.square(value_pred_clipped - targets) + value_loss = 0.5 * jnp.maximum(value_losses, value_losses_clipped).mean() + + # CALCULATE ACTOR LOSS + ratio = jnp.exp(log_prob - traj_batch.log_prob) + gae = (gae - gae.mean()) / (gae.std() + 1e-8) + loss_actor1 = ratio 
* gae + loss_actor2 = ( + jnp.clip( + ratio, + 1.0 - config["CLIP_EPS"], + 1.0 + config["CLIP_EPS"], + ) + * gae + ) + loss_actor = -jnp.minimum(loss_actor1, loss_actor2) + loss_actor = loss_actor.mean() + entropy = pi.entropy().mean() + + total_loss = loss_actor + config["VF_COEF"] * value_loss - config["ENT_COEF"] * entropy + return total_loss, (value_loss, loss_actor, entropy) + + grad_fn = jax.value_and_grad(_loss_fn, has_aux=True) + total_loss, grads = grad_fn(train_state.params, traj_batch, advantages, targets) + train_state = train_state.apply_gradients(grads=grads) + return train_state, total_loss + + train_state, traj_batch, advantages, targets, rng = update_state + rng, _rng = jax.random.split(rng) + batch_size = config["MINIBATCH_SIZE"] * config["NUM_MINIBATCHES"] + permutation = jax.random.permutation(_rng, batch_size) + batch = (traj_batch, advantages, targets) + batch = jax.tree_util.tree_map(lambda x: x.reshape((batch_size,) + x.shape[2:]), batch) + shuffled_batch = jax.tree_util.tree_map(lambda x: jnp.take(x, permutation, axis=0), batch) + minibatches = jax.tree_util.tree_map( + lambda x: jnp.reshape(x, [config["NUM_MINIBATCHES"], -1] + list(x.shape[1:])), + shuffled_batch, + ) + train_state, total_loss = jax.lax.scan(_update_minbatch, train_state, minibatches) + update_state = (train_state, traj_batch, advantages, targets, rng) + return update_state, total_loss + + update_state = (train_state, traj_batch, advantages, targets, rng) + update_state, loss_info = jax.lax.scan(_update_epoch, update_state, None, config["UPDATE_EPOCHS"]) + train_state = update_state[0] + metric = traj_batch.info + rng = update_state[-1] + + if config.get("DEBUG"): + + def callback(info): + step = int(info["timestep"].max()) + total = config["TOTAL_TIMESTEPS"] + pct = 100.0 * step / total + + # Extra env-specific metrics + extras = [] + if "mean_tke" in info: + extras.append(f"mean_tke={float(info['mean_tke'].mean()):.4f}") + + # Completed episodes in this rollout batch + 
done_mask = info["returned_episode"] + if done_mask.any(): + mean_return = float(info["returned_episode_returns"][done_mask].mean()) + extras.append(f"return={mean_return:.4f}") + + extra_str = " " + " ".join(extras) if extras else "" + print(f" step {step:>6}/{total} ({pct:5.1f}%){extra_str}") + + jax.debug.callback(callback, metric) + + runner_state = (train_state, env_state, last_obs, rng) + return runner_state, metric + + rng, _rng = jax.random.split(rng) + runner_state = (train_state, env_state, obsv, _rng) + runner_state, metric = jax.lax.scan(_update_step, runner_state, None, config["NUM_UPDATES"]) + return {"runner_state": runner_state, "metrics": metric} + + return train +``` + +## Performing the Training + +At this point, we can now define the configuration of our training hyperparameters, and pull the individual pieces together + +```python +config = { + "LR": 1e-4, # try 3e-4 - 1e-5 (play around with it) 1e-4 + "NUM_ENVS": 4, + "NUM_STEPS": 40, # 40 + "TOTAL_TIMESTEPS": 100, # 4000 + "UPDATE_EPOCHS": 10, + "NUM_MINIBATCHES": 8, + "GAMMA": 0.99, + "GAE_LAMBDA": 0.985, # can tune to go up to 0.995. 
+    "CLIP_EPS": 0.2,
+    "ENT_COEF": 0.0,  # can be increased to approx 0.1 or 0.2, or stay the same
+    "VF_COEF": 0.5,
+    "MAX_GRAD_NORM": 0.5,
+    "ACTIVATION": "tanh",  # the mish activation function is also worth trying
+    "ANNEAL_LR": False,
+    "NORMALIZE_ENV": False,
+    "DEBUG": True,
+}
+```
+
+Next, we define command-line arguments specific to HydroGym,
+
+```python
+parser = argparse.ArgumentParser(description="PPO training for HydroGym JAX environments")
+parser.add_argument(
+    "--env",
+    default="kolmogorov",
+    choices=["kolmogorov", "channel"],
+    help="Environment to train on (default: kolmogorov)",
+)
+parser.add_argument("--total-timesteps", type=int, default=4000)
+parser.add_argument("--num-envs", type=int, default=4)
+parser.add_argument("--num-steps", type=int, default=10)
+parser.add_argument("--num-minibatches", type=int, default=8, help="Must divide NUM_ENVS * NUM_STEPS (default: 8)")
+parser.add_argument("--lr", type=float, default=1e-4)
+parser.add_argument("--model-save-path", default=None, help="Path to save trained model (.pkl)")
+parser.add_argument("--plot-path", default=None, help="Path to save reward plot (.png)")
+args = parser.parse_args()
+```
+
+set the paths for the model and plots, and fold the parsed arguments into the configuration (without this step the command-line flags would have no effect, and the `MODEL_SAVE_PATH` / `PLOT_TRAINING_PATH` keys used later would be missing),
+
+```python
+model_save_path = args.model_save_path or f"trained_model_{args.env}.pkl"
+plot_path = args.plot_path or f"plot_reward_{args.env}.png"
+
+config.update(
+    {
+        "ENV_NAME": args.env,
+        "TOTAL_TIMESTEPS": args.total_timesteps,
+        "NUM_ENVS": args.num_envs,
+        "NUM_STEPS": args.num_steps,
+        "NUM_MINIBATCHES": args.num_minibatches,
+        "LR": args.lr,
+        "MODEL_SAVE_PATH": model_save_path,
+        "PLOT_TRAINING_PATH": plot_path,
+    }
+)
+```
+
+and, just for our own sanity, inspect the configuration and paths to be sure that they are set correctly before beginning the training. 
+
+```python
+print(f"=== PPO Training: {args.env} environment ===")
+print(f"  Total timesteps : {config['TOTAL_TIMESTEPS']}")
+print(f"  Num envs        : {config['NUM_ENVS']}")
+print(f"  Num steps       : {config['NUM_STEPS']}")
+print(f"  Learning rate   : {config['LR']}")
+print(f"  Model save path : {model_save_path}")
+print(f"  Plot save path  : {plot_path}")
+print("")
+```
+
+At this point we can run the full training:
+
+```python
+rng = jax.random.PRNGKey(30)
+train_jit = jax.jit(make_train(config))
+out = train_jit(rng)
+```
+
+After the training has completed, we can save the trained model
+
+```python
+trained_params = out["runner_state"][0].params
+save_model(trained_params, model_save_path)
+```
+
+and plot the training results:
+
+```python
+plt.plot(out["metrics"]["returned_episode_returns"].mean(-1).reshape(-1))
+plt.xlabel("Updates")
+plt.ylabel("Return")
+plt.savefig(plot_path, format="png")  # save before plt.show(), which clears the current figure
+plt.show()
+jnp.save("rewardovertime", out["metrics"]["returned_episode_returns"].mean(-1).reshape(-1))
+```
diff --git a/docs/docs/examples/jaxfluids/manage-docs-versions.md b/docs/docs/examples/jaxfluids/manage-docs-versions.md
deleted file mode 100644
index 3e4687f..0000000
--- a/docs/docs/examples/jaxfluids/manage-docs-versions.md
+++ /dev/null
@@ -1,53 +0,0 @@
----
-sidebar_position: 1
----
-
-# Manage Docs Versions
-
-Docusaurus can manage multiple versions of your docs.
-
-## Create a docs version
-
-Release a version 1.0 of your project:
-
-```bash
-npm run docusaurus docs:version 1.0
-```
-
-The `docs` folder is copied into `versioned_docs/version-1.0` and `versions.json` is created.
-
-Your docs now have 2 versions:
-
-- `1.0` at `http://localhost:3000/docs/` for the version 1.0 docs
-- `current` at `http://localhost:3000/docs/next/` for the **upcoming, unreleased docs**
-
-## Add a Version Dropdown
-
-To navigate seamlessly across versions, add a version dropdown. 
- -Modify the `docusaurus.config.js` file: - -```js title="docusaurus.config.js" -export default { - themeConfig: { - navbar: { - items: [ - // highlight-start - { - type: 'docsVersionDropdown', - }, - // highlight-end - ], - }, - }, -}; -``` - -The docs version dropdown appears in your navbar: - -## Update an existing version - -It is possible to edit versioned docs in their respective folder: - -- `versioned_docs/version-1.0/hello.md` updates `http://localhost:3000/docs/hello` -- `docs/hello.md` updates `http://localhost:3000/docs/next/hello` diff --git a/docs/docs/examples/maia/getting_started.md b/docs/docs/examples/maia/getting_started.md new file mode 100644 index 0000000..e5dee71 --- /dev/null +++ b/docs/docs/examples/maia/getting_started.md @@ -0,0 +1,321 @@ +--- +sidebar_position: 1 +--- + +# Getting Started + +✅ **START HERE** for standard RL interface examples using MAIA solver with `env.reset()` and `env.step()`. + +This directory contains examples and utilities for HydroGym's MAIA-based flow environments using the **standard RL interface** with **MPMD coupling**. + +## Files + +### [`test_maia_env.py`](https://github.com/dynamicslab/hydrogym/blob/main/examples/maia/getting_started/test_maia_env.py) +**Interactive test script** - Test MAIA environments with command-line arguments via MPMD execution. + +Usage: +```bash +# Basic usage (1 Python + 1 MAIA process) +mpirun -np 1 python test_maia_env.py --environment Cylinder_2D_Re200 : -np 1 maia properties.toml + +# Parallel MAIA (1 Python + 4 MAIA processes) +mpirun -np 1 python test_maia_env.py --environment Cylinder_2D_Re200 : -np 4 maia properties.toml +``` + +### [`train_sb3_maia.py`](https://github.com/dynamicslab/hydrogym/blob/main/examples/maia/getting_started/train_sb3_maia.py) +**SB3 training script** - Train reinforcement learning agents (PPO/TD3/SAC) with Stable-Baselines3. 
+ +Features: +- Monitor wrapper for episode statistics +- VecNormalize for observation/reward normalization +- Checkpoint saving with normalization stats +- TensorBoard logging + +Usage: +```bash +# First, prepare workspace +python prepare_workspace.py --env Cylinder_2D_Re200 --work-dir ./train_run + +# Then train with MPMD execution +cd train_run +mpirun -np 1 python ../train_sb3_maia.py --env Cylinder_2D_Re200 --algo PPO --total-timesteps 100000 : -np 1 maia properties.toml + +# Monitor training +tensorboard --logdir logs/ +``` + +### [`prepare_workspace.py`](https://github.com/dynamicslab/hydrogym/blob/main/examples/maia/getting_started/prepare_workspace.py) +**Workspace setup utility** - Downloads environment data and creates workspace for HPC jobs. + +Usage: +```bash +python prepare_workspace.py --env Cylinder_2D_Re200 --work-dir ./my_workspace +``` + +### [`run_example_docker.sh`](https://github.com/dynamicslab/hydrogym/blob/main/examples/maia/getting_started/run_example_docker.sh) +**Docker runner script** - Run MAIA examples in Docker with automatic setup. + +Usage: +```bash +# Test environment +./run_example_docker.sh + +# Train SB3 agent +./run_example_docker.sh train +``` + +## Quick Start + +**⚠️ Internet Required:** Environment files are downloaded from Hugging Face Hub. For offline/HPC use, see [Offline Usage](#offline-usage-no-internet-on-compute-nodes) below. 
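
The colon in the `mpirun` commands above is MPMD (multiple program, multiple data) syntax: a single MPI job launches both executables, and all processes share one `MPI_COMM_WORLD`, with ranks assigned in the order of the segments. A small helper (illustrative only, not part of HydroGym) makes the rank layout concrete:

```python
def mpmd_rank_layout(spec):
    """spec: list of (program, nprocs) pairs, as in `mpirun -np 1 python ... : -np 4 maia ...`."""
    layout, next_rank = {}, 0
    for prog, n in spec:
        # each segment gets the next contiguous block of global ranks
        layout[prog] = list(range(next_rank, next_rank + n))
        next_rank += n
    return layout


# 1 Python process followed by 4 MAIA processes in one MPI job
layout = mpmd_rank_layout([("python", 1), ("maia", 4)])
print(layout)  # {'python': [0], 'maia': [1, 2, 3, 4]}
```

This is why the Python RL process is always rank 0 in these examples, while the MAIA solver occupies the remaining ranks regardless of how many you request with `-np`.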
+ +### Basic Test Workflow (Online) + +**Step 1:** Prepare the workspace (downloads data from Hugging Face Hub, **requires internet**, no MPI needed): + +```bash +python prepare_workspace.py --env Cylinder_2D_Re200 --work-dir ./test_run_000 +``` + +This creates: +- `test_run_000/` - working directory +- `test_run_000/properties.toml` - MAIA configuration file (symlink) +- `test_run_000/grid` - mesh file (symlink) +- `test_run_000/out/` - output directory + +**Step 2:** Run the test with MPMD execution: + +```bash +cd test_run_000 +mpirun -np 1 python ../test_maia_env.py --environment Cylinder_2D_Re200 --num-steps 10 : -np 1 maia properties.toml +``` + +This runs: +- 1 Python process (RL environment) +- 1 MAIA process (CFD solver) +- Communication via MPI + +### Parallel MAIA + +To run with more MAIA processes for larger meshes: + +```bash +cd test_run_000 +mpirun -np 1 python ../test_maia_env.py --environment Cylinder_2D_Re200 : -np 4 maia properties.toml +``` + +### Explore Options + +The test script supports many configuration options: + +```bash +python test_maia_env.py --help +``` + +## Usage Examples + +### Example 1: Basic RL Loop + +```python +import hydrogym.maia as maia +import numpy as np + +# Create MAIA environment from Hugging Face Hub +# Probe locations are flattened: [x0, y0, x1, y1, ...] 
+probe_locations = []
+for x in np.linspace(1.0, 8.0, 8):
+    probe_locations.extend([x, 0.0])
+
+env = maia.from_hf(
+    'Cylinder_2D_Re200',
+    probe_locations=probe_locations,
+    obs_normalization_strategy='U_inf',
+)
+
+obs, info = env.reset()
+
+# Run standard RL loop
+for step in range(100):
+    action = env.action_space.sample()  # Replace with your policy
+    obs, reward, terminated, truncated, info = env.step(action)
+
+    if terminated or truncated:
+        obs, info = env.reset()
+
+env.close()
+```
+
+**Important:** This script must be run with MPMD:
+```bash
+mpirun -np 1 python your_script.py : -np 4 maia properties.toml
+```
+
+### Example 2: Training with Stable-Baselines3
+
+```python
+# See train_sb3_maia.py for full implementation
+import hydrogym.maia as maia
+import numpy as np
+from stable_baselines3 import PPO
+from stable_baselines3.common.monitor import Monitor
+from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
+
+# Define probes
+probe_locations = []
+for x in np.linspace(1.0, 8.0, 8):
+    for y in np.linspace(-1.0, 1.0, 5):
+        probe_locations.extend([x, y])
+
+def make_env():
+    env = maia.from_hf(
+        'Cylinder_2D_Re200',
+        use_clean_cache=False,
+        probe_locations=probe_locations,
+        obs_normalization_strategy='U_inf',
+    )
+    return Monitor(env)
+
+env = DummyVecEnv([make_env])
+env = VecNormalize(env, norm_obs=True, norm_reward=True, clip_obs=10.0)
+
+model = PPO("MlpPolicy", env, verbose=1, tensorboard_log="./logs")
+model.learn(total_timesteps=100000)
+
+model.save("ppo_cylinder")
+env.save("vec_normalize.pkl")
+```
+
+**Run with MPMD:**
+```bash
+cd work_dir
+mpirun -np 1 python ../train_sb3_maia.py --env Cylinder_2D_Re200 --algo PPO : -np 1 maia properties.toml
+```
+
+### Example 3: Custom Probe Configuration
+
+```python
+import hydrogym.maia as maia
+import numpy as np
+
+# Define wake probes (flattened format)
+wake_probes = []
+for x in np.linspace(1.0, 8.0, 10):
+    for y in np.linspace(-1.0, 1.0, 5):
+        wake_probes.extend([x, y])  # 50 probes total
+
+env 
= maia.from_hf(
+    'Cylinder_2D_Re200',
+    probe_locations=wake_probes,
+    obs_normalization_strategy='U_inf',
+)
+
+obs, _ = env.reset()
+print(f"Observation shape: {obs.shape}")  # (100,) for 50 probes × 2 velocity components
+```
+
+### Example 4: Custom Normalization
+
+```python
+import hydrogym.maia as maia
+import numpy as np
+
+# Define wake probes (flattened format)
+wake_probes = []
+for x in np.linspace(1.0, 8.0, 8):
+    for y in np.linspace(-1.0, 1.0, 5):
+        wake_probes.extend([x, y])  # 40 probes total
+
+# Define custom normalization (location and scale for each probe component)
+# For N probes with nDim=2 velocity, you need nDim*N location and scale values
+custom_loc = [0.0] * 80  # Here 40 probes × 2 components
+custom_scale = [1.0] * 80
+
+env = maia.from_hf(
+    'Cylinder_2D_Re200',
+    probe_locations=[...],  # Your probe locations
+    obs_normalization_strategy='customized',
+    obs_loc=custom_loc,
+    obs_scale=custom_scale,
+)
+```
+
+### Example 5: Offline Usage with Local Files
+
+```python
+import hydrogym.maia as maia
+
+# Use pre-downloaded environment files (no internet required)
+env = maia.from_hf(
+    'Cylinder_2D_Re200',
+    probe_locations=[...],
+    local_fallback_dir='/scratch/my_project/hf_environments/models--dynamicslab--HydroGym-environments/snapshots/main',
+    use_clean_cache=False,  # Don't try to download from HF Hub
+)
+```
+
+## Offline Usage (No Internet on Compute Nodes)
+
+**Important:** Environment files are stored on Hugging Face Hub and require internet access. HPC compute nodes often lack internet connectivity.
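
Before submitting an offline job, it helps to confirm the environment data is actually in the cache. The sketch below is a stdlib-only illustration (the helper name is ours, not part of HydroGym); it assumes the standard Hugging Face cache layout `models--<org>--<name>/snapshots/<revision>/`:

```python
import os

def find_cached_snapshot(cache_root, repo="dynamicslab/HydroGym-environments"):
    """Return a cached snapshot directory for `repo`, or None if nothing is cached.

    Assumes the standard Hugging Face cache layout:
    <cache_root>/models--<org>--<name>/snapshots/<revision>/
    """
    repo_dir = "models--" + repo.replace("/", "--")
    snapshots = os.path.join(cache_root, repo_dir, "snapshots")
    if not os.path.isdir(snapshots):
        return None
    revisions = sorted(os.listdir(snapshots))
    return os.path.join(snapshots, revisions[0]) if revisions else None

if __name__ == "__main__":
    cached = find_cached_snapshot(os.path.expanduser("~/.cache/huggingface/hub"))
    print(cached or "Not cached yet -- run the download step on a node with internet access.")
```

The returned directory is the kind of path you would pass as `local_fallback_dir` after copying it to shared storage, as in the workflow below.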
+ +### HPC Offline Workflow + +**Step 1:** On a machine with internet (login node or workstation), download environment data: + +```bash +# Download to Hugging Face cache +python -c " +from hydrogym.data_manager import HFDataManager +dm = HFDataManager(repo_id='dynamicslab/HydroGym-environments', use_clean_cache=False) +env_path = dm.get_environment_path('Cylinder_2D_Re200') +print(f'Environment downloaded to: {env_path}') +" +``` + +**Step 2:** Copy to shared HPC filesystem (if needed): + +```bash +# Copy from HF cache to shared storage accessible from compute nodes +cp -r ~/.cache/huggingface/hub/models--dynamicslab--HydroGym-environments \ + /scratch/my_project/hf_environments/ +``` + +**Step 3:** Create a workspace preparation script that uses local files: + +```python +# prepare_offline_workspace.py +from hydrogym.maia.workspace import prepare_maia_workspace + +work_dir, props_file = prepare_maia_workspace( + environment_name='Cylinder_2D_Re200', + work_dir='./my_run', + local_fallback_dir='/scratch/my_project/hf_environments/models--dynamicslab--HydroGym-environments/snapshots/main', + use_clean_cache=False, # Use existing cache + force_download=False, # Don't try to download +) + +print(f"Workspace: {work_dir}") +print(f"Properties: {props_file}") +``` + +**Step 4:** In your RL script, use the same `local_fallback_dir`: + +```python +import hydrogym.maia as maia + +env = maia.from_hf( + 'Cylinder_2D_Re200', + probe_locations=[...], + local_fallback_dir='/scratch/my_project/hf_environments/models--dynamicslab--HydroGym-environments/snapshots/main', + use_clean_cache=False, +) +``` + +**Step 5:** Run on compute node (offline): + +```bash +cd my_run +mpirun -np 1 python ../my_rl_script.py : -np 4 maia properties.toml +``` + +--- + +**Last Updated**: March 2026 +**HydroGym Version**: 1.0+ +**Maintainer**: HydroGym Team diff --git a/docs/docs/examples/maia/manage-docs-versions.md b/docs/docs/examples/maia/manage-docs-versions.md deleted file mode 100644 index 
642c90c..0000000 --- a/docs/docs/examples/maia/manage-docs-versions.md +++ /dev/null @@ -1,5 +0,0 @@ ---- -sidebar_position: 1 ---- - -# Manage Docs Versions diff --git a/docs/docs/examples/nek5000/from_hf.md b/docs/docs/examples/nek5000/from_hf.md new file mode 100644 index 0000000..1d6857b --- /dev/null +++ b/docs/docs/examples/nek5000/from_hf.md @@ -0,0 +1,74 @@ +--- +sidebar_position: 5 +--- + +# from_hf Pattern + +Convenient method for loading environments with minimal configuration using `NekEnv.from_hf()`. + +## Interface + +```python +from hydrogym.nek import NekEnv + +# Minimal configuration - just environment name and nproc +env = NekEnv.from_hf( + 'TCFmini_3D_Re180', + nproc=10, + use_clean_cache=False, + local_fallback_dir=None # Optional: local environment directory +) + +# That's it! Auto-detects config files, handles setup automatically +obs, info = env.reset() +obs, reward, terminated, truncated, info = env.step(action) +``` + +## Files + +- `test_nek_DM.py` - Test from_hf() loading pattern +- `train_sb3_from_hf.py` - SB3 training with from_hf() pattern +- `run_from_hf_docker.sh` - Docker/MPI execution + +## Usage + +### Test Environment +```bash +mpirun -np 1 python test_nek_DM.py --steps 100 : -np 10 nek5000 +``` + +### Train RL Agent +```bash +mpirun -np 1 python train_sb3_from_hf.py --env MiniChannel_Re180 --algo PPO --total-timesteps 100000 : -np 10 nek5000 +``` + +## When to Use + +- **Recommended for most users** - simplest approach +- Need minimal configuration (environment name + nproc) +- Want automatic config file detection +- Using standard/pre-packaged environments +- HuggingFace Hub integration (future) + +## Comparison with Other Patterns + +| Pattern | Configuration | Use Case | +|---------|--------------|----------| +| **from_hf()** (Chapter 4) | Minimal (name + nproc) | Recommended default | +| **Direct instantiation** (Chapter 1) | env_config dict | Full control needed | +| **Legacy config** | YAML + OmegaConf | Backwards 
compatibility | + +## Benefits of from_hf() + +- **Minimal code**: Just env name and nproc +- **Auto-detection**: Finds config files automatically +- **Fallback support**: Local directories or HuggingFace +- **Clean workspace**: No nested directories created +- **Caching**: Reuses prepared environments + +## Notes + +- Auto-detects `environment_config.yaml` or `config.yaml` +- `use_clean_cache=False` reuses existing prepared workspace +- `local_fallback_dir` allows using local environment packages, e.g. for HPC system usage +- For advanced configuration, use direct instantiation (Chapter 1) diff --git a/docs/docs/examples/nek5000/getting_started.md b/docs/docs/examples/nek5000/getting_started.md new file mode 100644 index 0000000..d29c802 --- /dev/null +++ b/docs/docs/examples/nek5000/getting_started.md @@ -0,0 +1,236 @@ +--- +sidebar_position: 1 +--- + +# Getting Started + +**START HERE** for NEK5000-based RL interface examples using `env.reset()` and `env.step()`. + +This directory contains comprehensive examples for using HydroGym's NEK5000-based flow environments with different interface patterns, from single-agent to multi-agent reinforcement learning. + +> **Note:** NEK5000 requires MPI for parallel execution. All examples use `mpirun` to coordinate between the Python controller and NEK5000 solver processes. + +## Directory Structure + +Each subdirectory demonstrates a specific interface pattern with complete examples: + +### 1. 
[`1_nekenv_single/`](https://github.com/dynamicslab/hydrogym/tree/main/examples/nek/getting_started/1_nekenv_single) - Single Agent (Standard Gym) +**Interface:** `NekEnv` - Standard Gymnasium single-agent interface +**Use Case:** Single actuator/sensor scenarios +**SB3 Compatible:** ✅ Direct (no wrapper needed) + +```python +from hydrogym.nek import NekEnv + +env_config = { + 'environment_name': 'TCFmini_3D_Re180', + 'nproc': 10, +} +env = NekEnv(env_config=env_config) + +# Standard Gym interface +obs, info = env.reset() +obs, reward, terminated, truncated, info = env.step(action) + +# Works directly with Stable-Baselines3 +from stable_baselines3 import PPO +model = PPO("MlpPolicy", env) +model.learn(total_timesteps=100000) +``` + +**Files:** +- `test_nek_direct.py` - Basic environment test with zero control +- `train_sb3_nek_direct.py` - SB3 training with Monitor & VecNormalize +- `run_nekenv_docker.sh` - Docker/MPI execution script + +--- + +### 2. [`2_parallel_env/`](https://github.com/dynamicslab/hydrogym/tree/main/examples/nek/getting_started/2_parallel_env) - Multi-Agent Parallel (PettingZoo) +**Interface:** `parallel_env` - PettingZoo parallel multi-agent +**Use Case:** Multiple independent agents with simultaneous actions +**SB3 Compatible:** ⚠️ Requires wrapper (SuperSuit or custom) + +```python +from hydrogym.nek import parallel_env + +env = parallel_env( + environment_name='TCFmini_3D_Re180', + nproc=10, + num_agents=3, # Multiple agents +) + +# Dictionary-based observations and actions +obs = env.reset() # {'agent_0': array, 'agent_1': array, 'agent_2': array} +actions = {agent: env.action_space(agent).sample() for agent in env.agents} +obs, rewards, terminations, truncations, infos = env.step(actions) +``` + +**Files:** +- `test_nek_parallel.py` - Multi-agent environment test +- `train_sb3_parallel.py` - SB3 training with SuperSuit wrappers +- `run_parallel_docker.sh` - Docker/MPI execution script + +--- + +### 3. 
[`3_pettingzoo/`](https://github.com/dynamicslab/hydrogym/tree/main/examples/nek/getting_started/3_pettingzoo) - PettingZoo AEC Interface +**Interface:** PettingZoo AEC (Agent Environment Cycle) +**Use Case:** Turn-based multi-agent scenarios +**Configuration file:** YAML configs are used to lock simulation and runner settings for reproducible training. +**SB3 Compatible:** ⚠️ Requires wrapper + +```python +from hydrogym.nek import parallel_env +from pettingzoo.utils import parallel_to_aec + +parallel = parallel_env(environment_name='TCFmini_3D_Re180', nproc=10) +env = parallel_to_aec(parallel) + +# Turn-based API +env.reset() +for agent in env.agent_iter(): + observation, reward, termination, truncation, info = env.last() + action = env.action_space(agent).sample() + env.step(action) +``` + +**Files:** +- `test_nek_pettingzoo.py` - AEC interface test +- `train_sb3_pettingzoo.py` - Training with turn-based agents +- `run_pettingzoo_docker.sh` - Docker/MPI execution script + +--- + +### 4. [`4_from_hf/`](https://github.com/dynamicslab/hydrogym/tree/main/examples/nek/getting_started/4_from_hf) - HuggingFace Data Manager +**Interface:** Load pre-packaged environments from HuggingFace Hub or local directories +**Use Case:** Using standardized, version-controlled environment configurations +**SB3 Compatible:** ✅ Works with any env type + +```python +from hydrogym.nek import NekDataManager, NekEnv + +# Initialize data manager +dm = NekDataManager(local_dir="./packaged_envs") + +# Prepare workspace (downloads/extracts if needed) +config = dm.prepare_workspace( + env_name="TCFmini_3D_Re180", + nproc=10, +) + +# Create environment +env = NekEnv(env_config=config) +``` + +**Files:** +- `test_nek_DM.py` - Data manager test +- `train_sb3_from_hf.py` - Training with HF environments +- `run_from_hf_docker.sh` - Docker/MPI execution script + +--- + +### 5. 
[`5_hydrogym_control/`](https://github.com/dynamicslab/hydrogym/tree/main/examples/nek/getting_started/5_hydrogym_control) - HydroGym Controllers + Integrate +**Interface:** Using `hgym.integrate()` for time-stepping with controllers +**Use Case:** Classical control, RL deployment, or hybrid control strategies +**SB3 Compatible:** ✅ Pass trained model as controller + +```python +from hydrogym import integrate +from hydrogym.nek import NekEnv + +# Train an RL agent +env = NekEnv(env_config=config) +model = PPO("MlpPolicy", env) +model.learn(total_timesteps=100000) + +# Use trained model as controller +integrate( + env, + t_span=(0, 100), + controller=model, # Can be trained model, PID, or custom controller +) +``` + +**Files:** +- `test_nek_env_controller.py` - Environment with controller test +- `train_sb3_with_integrate.py` - Training + deployment with integrate +- `run_control_docker.sh` - Docker/MPI execution script + +--- + +### 6. [`6_zeroshot_wing_demo/`](https://github.com/dynamicslab/hydrogym/tree/main/examples/nek/getting_started/6_zeroshot_wing_demo) - Zero-Shot Wing Deployment +**Interface:** `NekEnv` + PettingZoo rollout with deployment controllers +**Use Case:** Deploy pre-trained/legacy DRL policies on small wing without new training +**SB3 Compatible:** ✅ For loading trained policies; demo script is rollout-only + +```python +from hydrogym.nek import NekEnv +from hydrogym.nek.pettingzoo_env import make_pettingzoo_env + +base_env = NekEnv.from_hf("NACA4412_3D_Re75000_AOA5", nproc=12) +env = make_pettingzoo_env(base_env) +``` + +**Files:** +- `test_nek_pettingzoo.py` - zero-shot wing rollout demo +- `meta_policy_small_wing_template.py` - template for explicit legacy `MetaPolicy.py` usage +- `run_pettingzoo_docker.sh` - Docker/MPI execution script + +--- + +## Quick Start + +### 1. 
Choose Your Interface +Pick the directory that matches your use case: +- **Single agent?** → Start with `1_nekenv_single/` +- **Multiple agents?** → Try `2_parallel_env/` +- **Pre-packaged environments?** → Use `4_from_hf/` +- **Deploy trained models?** → See `5_hydrogym_control/` +- **Zero-shot wing deployment?** → See `6_zeroshot_wing_demo/` + +### 2. Test the Environment +```bash +cd 1_nekenv_single/ +mpirun -np 1 python test_nek_direct.py --steps 100 : -np 10 nek5000 +``` + +### 3. Train an RL Agent +```bash +mpirun -np 1 python train_sb3_nek_direct.py \ + --env TCFmini_3D_Re180 \ + --algo PPO \ + --total-timesteps 100000 \ + : -np 10 nek5000 +``` + +## Comparison Table + +| Directory | Interface | Obs Format | Action Format | SB3 Direct | Best For | +|-----------|-----------|------------|---------------|------------|----------| +| **1_nekenv_single** | `NekEnv` | Array | Array | ✅ Yes | Single actuator, simple baseline | +| **2_parallel_env** | `parallel_env` | Dict | Dict | ⚠️ Wrapper | Independent multi-agent scenarios | +| **3_pettingzoo** | AEC | Sequential | Sequential | ⚠️ Wrapper | Turn-based agents | +| **4_from_hf** | Any | Depends | Depends | Depends | Reproducible, versioned environments | +| **5_hydrogym_control** | Any + `integrate()` | Any | Any | ✅ Yes | Classical + RL hybrid control | +| **6_zeroshot_wing_demo** | PettingZoo Parallel | Dict | Dict | ✅ Deployment | Small-wing zero-shot DRL rollout | + +## Requirements + +### NEK5000 Setup +NEK5000 must be compiled and the `nek5000` executable must be in your PATH or specified in the environment configuration. We highly recommend using the provided Docker container. 
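
The same check can be done from Python so that a missing executable fails fast, before any MPI processes are launched. This is a hypothetical helper (not part of the HydroGym API), sketched for illustration:

```python
import shutil

def resolve_nek_executable(env_config):
    """Return the NEK5000 executable path, preferring an explicit 'nek_path'.

    Falls back to searching PATH, and raises early with a clear message
    rather than failing later inside the MPMD launch.
    """
    path = env_config.get('nek_path') or shutil.which('nek5000')
    if path is None:
        raise FileNotFoundError(
            "nek5000 executable not found: add it to PATH or set 'nek_path' in env_config"
        )
    return path
```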
+
+```bash
+# Check NEK5000 is available
+which nek5000
+```
+
+```python
+# Or set path explicitly in config
+env_config = {
+    'environment_name': 'TCFmini_3D_Re180',
+    'nek_path': '/path/to/nek5000',
+    'nproc': 10,
+}
+```
+---
+
+**Last Updated**: March 2026
+**HydroGym Version**: 1.0+
+**Maintainer**: HydroGym Team
diff --git a/docs/docs/examples/nek5000/hydrogym_control.md b/docs/docs/examples/nek5000/hydrogym_control.md
new file mode 100644
index 0000000..96ec2c0
--- /dev/null
+++ b/docs/docs/examples/nek5000/hydrogym_control.md
@@ -0,0 +1,125 @@
+---
+sidebar_position: 6
+---
+
+# Control with integrate()
+
+Use `integrate()` for time-stepping simulation with classical or RL controllers.
+
+## Interface
+
+```python
+from hydrogym.nek import NekEnv, integrate
+
+# Create environment
+env = NekEnv.from_hf('TCFmini_3D_Re80', nproc=10)
+
+# Classical controller
+def opposition_control(t, obs, env):
+    return -obs[:env.action_space.shape[0]]
+
+integrate(env, controller=opposition_control, num_steps=1000)
+
+# Or use trained SB3 model
+from stable_baselines3 import PPO
+model = PPO.load("model.zip")
+integrate(env, controller=model, num_steps=1000)
+```
+
+## Files
+
+- `test_nek_env_controller.py` - Test various controllers with integrate()
+- `train_sb3_with_integrate.py` - **Complete workflow: train + evaluate + compare**
+- `run_control_docker.sh` - Docker/MPI execution
+
+## Usage
+
+### Test Controllers
+```bash
+mpirun -np 1 python test_nek_env_controller.py --config test_config.yml : -np 10 nek5000
+```
+
+### Train & Evaluate Workflow
+```bash
+mpirun -np 1 python train_sb3_with_integrate.py --env MiniChannel_Re180 --algo PPO --total-timesteps 50000 --eval-steps 200 : -np 10 nek5000
+```
+
+## When to Use
+
+- **Evaluating trained RL policies** on longer rollouts
+- Comparing RL vs classical control strategies
+- Time-stepping simulations with custom control laws
+- Benchmarking different control approaches
+
+## Controllers Supported
+
+### Classical Controllers
+Custom functions with signature: `action = controller(t, obs, env)` + +- **Opposition Control**: `action = -alpha * observation` +- **Blowing/Suction**: `action = constant` +- **Sinusoidal**: `action = sin(omega * t)` +- **Zero Control**: `action = 0` (baseline) + +```python +def opposition_control(t, obs, env): + return -obs[:env.action_space.shape[0]] * 0.5 + +def zero_control(t, obs, env): + return np.zeros(env.action_space.shape, dtype=np.float32) +``` + +### RL Controllers +Any SB3 model with `.predict()` method: + +```python +from stable_baselines3 import PPO +model = PPO.load("trained_model.zip", env=env) +integrate(env, controller=model, num_steps=1000) +``` + +## Complete Train + Evaluate Workflow + +The `train_sb3_with_integrate.py` script demonstrates the full workflow: + +1. **Train RL agent** with SB3 (Monitor, VecNormalize) +2. **Save model and normalization stats** +3. **Load trained model** for evaluation +4. **Evaluate with integrate()** for extended rollouts +5. **Compare RL vs classical controllers** + +```python +# Phase 1: Training +env = NekEnv(env_config=config) +env = Monitor(env) +env = DummyVecEnv([lambda: env]) +env = VecNormalize(env, ...) + +model = PPO("MlpPolicy", env, ...) 
+model.learn(total_timesteps=50000) +model.save("model.zip") +env.save("vec_normalize.pkl") + +# Phase 2: Evaluation +env_eval = NekEnv(env_config=config) +env_eval = DummyVecEnv([lambda: env_eval]) +env_eval = VecNormalize.load("vec_normalize.pkl", env_eval) +model = PPO.load("model.zip") + +integrate(env_eval, controller=model, num_steps=1000) +``` + +## Key Features + +- **Controller agnostic**: Works with RL models and classical functions +- **Automatic handling**: Detects controller type automatically +- **Normalization support**: Properly handles VecNormalize for RL policies +- **Comparison tool**: Evaluate multiple controllers in sequence + +## Notes + +- `integrate()` handles the time-stepping loop internally +- Compatible with any environment type (NekEnv, parallel_env, etc.) +- RL policies need the same normalization wrapper used during training +- Classical controllers don't need normalization wrappers +- Use for evaluation and comparison, not for training diff --git a/docs/docs/examples/nek5000/manage-docs-versions.md b/docs/docs/examples/nek5000/manage-docs-versions.md deleted file mode 100644 index 3e4687f..0000000 --- a/docs/docs/examples/nek5000/manage-docs-versions.md +++ /dev/null @@ -1,53 +0,0 @@ ---- -sidebar_position: 1 ---- - -# Manage Docs Versions - -Docusaurus can manage multiple versions of your docs. - -## Create a docs version - -Release a version 1.0 of your project: - -```bash -npm run docusaurus docs:version 1.0 -``` - -The `docs` folder is copied into `versioned_docs/version-1.0` and `versions.json` is created. - -Your docs now have 2 versions: - -- `1.0` at `http://localhost:3000/docs/` for the version 1.0 docs -- `current` at `http://localhost:3000/docs/next/` for the **upcoming, unreleased docs** - -## Add a Version Dropdown - -To navigate seamlessly across versions, add a version dropdown. 
- -Modify the `docusaurus.config.js` file: - -```js title="docusaurus.config.js" -export default { - themeConfig: { - navbar: { - items: [ - // highlight-start - { - type: 'docsVersionDropdown', - }, - // highlight-end - ], - }, - }, -}; -``` - -The docs version dropdown appears in your navbar: - -## Update an existing version - -It is possible to edit versioned docs in their respective folder: - -- `versioned_docs/version-1.0/hello.md` updates `http://localhost:3000/docs/hello` -- `docs/hello.md` updates `http://localhost:3000/docs/next/hello` diff --git a/docs/docs/examples/nek5000/parallel_environments.md b/docs/docs/examples/nek5000/parallel_environments.md new file mode 100644 index 0000000..3d7ed08 --- /dev/null +++ b/docs/docs/examples/nek5000/parallel_environments.md @@ -0,0 +1,74 @@ +--- +sidebar_position: 3 +--- + +# Parallel Multi-Agent Interface + +Dict-based multi-agent interface where each actuator is treated as a separate agent. + +## Interface + +```python +from hydrogym.nek import NekEnv, NekParallelEnv + +# Create base environment +env_config = {'environment_name': 'TCFmini_3D_Re180', 'nproc': 10} +base_env = NekEnv(env_config=env_config) + +# Wrap with parallel interface +env = NekParallelEnv(base_env) + +# Dict-based API +observations, infos = env.reset() +# observations: {'agent_0': array, 'agent_1': array, ...} + +actions = {agent: policy(obs) for agent, obs in observations.items()} +observations, rewards, terminated, truncated, infos = env.step(actions) +``` + +## Files + +- `test_nek_parallel.py` - Test parallel environment with dict-based interface +- `train_sb3_parallel.py` - SB3 training with **DIY centralized wrapper** (educational) +- `run_parallel_docker.sh` - Docker/MPI execution script + +## Usage + +### Test Environment +```bash +mpirun -np 1 python test_nek_parallel.py --steps 100 : -np 10 nek5000 +``` + +### Train RL Agent (DIY Centralized Approach) +```bash +mpirun -np 1 python train_sb3_parallel.py --env MiniChannel_Re180 --algo 
PPO --total-timesteps 100000 : -np 10 nek5000 +``` + +## When to Use + +- Multiple agents with independent observation/action spaces +- Dict-based multi-agent interface needed +- True MARL scenarios (with RLlib or custom frameworks) +- Per-agent reward inspection + +## SB3 Integration + +**parallel_env is NOT directly compatible with SB3** (SB3 expects arrays, not dicts). + +**Solutions:** + +1. **DIY Centralized Wrapper** (Chapter 2 - Educational) + - Shown in `train_sb3_parallel.py` + - Manually concatenates all agents → single array + - Educational: shows how conversion works + +2. **SuperSuit** (Chapter 3 - Production) + - See chapter 3 for PettingZoo + SuperSuit approach + - Production-ready ecosystem solution + +## Notes + +- Each agent controls one actuator (scalar action) +- Each agent receives local observations +- Rewards can be per-agent or shared +- For simple centralized control, just use `NekEnv` directly (Chapter 1) diff --git a/docs/docs/examples/nek5000/pettingzoo.md b/docs/docs/examples/nek5000/pettingzoo.md new file mode 100644 index 0000000..dcb1955 --- /dev/null +++ b/docs/docs/examples/nek5000/pettingzoo.md @@ -0,0 +1,121 @@ +--- +sidebar_position: 4 +--- + +# PettingZoo Interface + +PettingZoo-compatible wrapper for ecosystem integration and production-ready SB3 training via SuperSuit. 
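
Under the hood, both the SuperSuit route used here and the DIY wrapper of Chapter 2 reduce to the same dict-to-array bookkeeping so that a single SB3 policy can drive all agents. A minimal illustration (the helper names are ours, not part of either library):

```python
def concat_obs(obs_dict, agent_order):
    """Flatten per-agent observations into one centralized observation list."""
    return [value for agent in agent_order for value in obs_dict[agent]]

def split_action(flat_action, agent_order, act_dim=1):
    """Slice a flat centralized action back into a per-agent action dict."""
    return {
        agent: flat_action[i * act_dim:(i + 1) * act_dim]
        for i, agent in enumerate(agent_order)
    }
```

SuperSuit layers padding and vectorization on top of this conversion, which is why it is the preferred route for production training.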
+ +## Interface + +```python +from hydrogym.nek import NekEnv, make_pettingzoo_env + +# Create base environment +env_config = {'environment_name': 'TCFmini_3D_Re180', 'nproc': 10} +base_env = NekEnv(env_config=env_config) + +# Wrap with PettingZoo interface +env = make_pettingzoo_env(base_env) + +# PettingZoo parallel API +observations, infos = env.reset() +actions = {agent: policy(obs) for agent, obs in observations.items()} +observations, rewards, terminated, truncated, infos = env.step(actions) +``` + +## Files + +- `test_nek_pettingzoo.py` - Test PettingZoo interface +- `train_sb3_pettingzoo.py` - **SB3 training with SuperSuit** (production approach) +- `run_pettingzoo_docker.sh` - Docker/MPI execution + +## Usage + +### Test Environment +```bash +mpirun -np 1 python test_nek_pettingzoo.py --steps 100 : -np 10 nek5000 +``` + +### Train RL Agent (SuperSuit Production Approach) +```bash +mpirun -np 1 python train_sb3_pettingzoo.py --env MiniChannel_Re180 --algo PPO --total-timesteps 100000 : -np 10 nek5000 +``` + +## Configuration-Driven Tutorial (Recommended for Reproducibility) + +Use a fixed YAML config to lock simulation + runner settings across runs. + +### 1) Prepare a workspace +```bash +python ../prepare_workspace.py \ + --local-dir ../../../packaged_envs \ + --env TCFmini_3D_Re180 \ + --work-dir ./train_run +``` + +### 2) Train with a config file +```bash +cd train_run +mpirun -np 1 python ../train_sb3_pettingzoo.py \ + --env TCFmini_3D_Re180 \ + --nproc 10 \ + --config-file ../configs/pettingzoo_tcfmini_re180.yml \ + --algo TD3 \ + --total-timesteps 5000000 \ + : -np 10 nek5000 +``` + +### 3) Evaluate (PettingZoo rollouts) +```bash +cd train_run +mpirun -np 1 python ../test_nek_pettingzoo.py \ + --env TCFmini_3D_Re180 \ + --nproc 10 \ + --config-file ../configs/pettingzoo_tcfmini_re180.yml \ + --steps 2500 \ + : -np 10 nek5000 +``` + +Notes: +- The config lives in `examples/nek/configs/pettingzoo_tcfmini_re180.yml`. 
+- Run from the workspace (`train_run`) so `compile_path: "."` resolves to case files. +- Ensure `--nproc` matches `simulation.nproc` in the config. + +## When to Use + +- **Production SB3 training** on multi-agent environments +- PettingZoo ecosystem compatibility +- Using PettingZoo-specific tools and libraries +- Need standardized multi-agent API + +## SB3 Integration with SuperSuit + +**SuperSuit** is PettingZoo's official wrapper library for converting multi-agent envs to SB3-compatible format. + +### Installation +```bash +pip install pettingzoo supersuit +``` + +### SuperSuit Wrappers Used +1. `pad_observations_v0` - Pad observations to uniform size +2. `pad_action_space_v0` - Pad action spaces to uniform size +3. `black_death_v3` - Handle agent termination +4. `pettingzoo_env_to_vec_env_v1` - Convert to vectorized Gym env + +### Comparison with Chapter 2 + +| Aspect | Chapter 2 (DIY) | Chapter 3 (SuperSuit) | +|--------|-----------------|----------------------| +| Purpose | Educational | Production | +| Dependencies | None extra | pettingzoo, supersuit | +| Code complexity | ~50 lines wrapper | ~5 lines SuperSuit | +| Maintenance | DIY | Ecosystem-maintained | +| Use when | Learning | Deploying | + +## Notes + +- SuperSuit is the **recommended** approach for production +- Chapter 2 (DIY wrapper) is for understanding how it works +- PettingZoo API ensures compatibility with many MARL tools diff --git a/docs/docs/examples/nek5000/single_environment.md b/docs/docs/examples/nek5000/single_environment.md new file mode 100644 index 0000000..c82bfab --- /dev/null +++ b/docs/docs/examples/nek5000/single_environment.md @@ -0,0 +1,46 @@ +--- +sidebar_position: 2 +--- + +# Single Agent Interface + +Standard Gym interface for single-agent NEK5000 environments with direct instantiation. 
+ +## Interface + +```python +env_config = { + 'environment_name': 'TCFmini_3D_Re180', + 'nproc': 10, + 'use_clean_cache': False, + 'configuration_file': None, # Auto-detects config.yaml +} +env = NekEnv(env_config=env_config) + +obs, info = env.reset() +obs, reward, terminated, truncated, info = env.step(action) +``` + +## Files + +- `test_nek_direct.py` - Basic environment test with zero control +- `train_sb3_nek_direct.py` - SB3 training (PPO/TD3/SAC) with Monitor & VecNormalize +- `run_nekenv_docker.sh` - Docker/MPI execution script + +## Usage + +### Test Environment +```bash +mpirun -np 1 python test_nek_direct.py --steps 100 : -np 10 nek5000 +``` + +### Train RL Agent +```bash +mpirun -np 1 python train_sb3_nek_direct.py --env MiniChannel_Re180 --algo PPO --total-timesteps 100000 : -np 10 nek5000 +``` + +## When to Use + +- Single actuator/sensor scenarios +- Direct SB3 compatibility (Monitor, VecNormalize wrappers included) +- Simple baseline comparisons with zero control diff --git a/docs/docs/examples/nek5000/zeroshot_wing.md b/docs/docs/examples/nek5000/zeroshot_wing.md new file mode 100644 index 0000000..5f1c8d6 --- /dev/null +++ b/docs/docs/examples/nek5000/zeroshot_wing.md @@ -0,0 +1,122 @@ +--- +sidebar_position: 7 +--- + +# Zero-Shot Wing Deployment (Multi-Policy MARL) + +Zero-shot deployment demo for the small NACA4412 wing case: multiple control policies are mapped to actuator subsets and executed together in one PettingZoo rollout. + +**This is a deployment/evaluation demo only (no training). The template and controllers are intended for demonstration and should not be used to draw physical conclusions.** + +> NOTE: The provided Nek5000 executable is pre-compiled for this chapter, so this demo focuses on the DRL-style rollout/deployment workflow. 
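
The policy-to-actuator mapping described below (group agents by streamwise `x_range` and by wing side) boils down to a simple geometric filter. A sketch with illustrative inputs (the helper and the coordinate dict are ours, not the demo's API):

```python
def select_agents(agent_coords, x_range, side):
    """Pick actuator agents whose (x, y) position falls in x_range on one side.

    side: "SS" selects the suction side (y > 0), "PS" the pressure side (y < 0).
    """
    x_min, x_max = x_range
    on_side = (lambda y: y > 0) if side == "SS" else (lambda y: y < 0)
    return [
        agent for agent, (x, y) in agent_coords.items()
        if x_min <= x <= x_max and on_side(y)
    ]
```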
+ +## What the script does + +`test_nek_pettingzoo.py`: +- loads a base `NekEnv` via `NekEnv.from_hf(...)` and wraps it with `make_pettingzoo_env(...)` +- builds one controller per entry in `POLICY_SPECS` (from `meta_policy_small_wing_template.py`) +- assigns each controller to actuator agents by `x_range` and `side` (`SS` means `y > 0`, `PS` means `y < 0`) +- refreshes each group's actions every `drl_step` steps (refresh at step `0`; otherwise actions are held) +- clips actions to `action_bounds` +- computes an “inverted” + scaled reward summary for display (deployment-only) + +Unassigned actuator agents receive zero action. + +## Interface (PettingZoo rollout) + +```python +from hydrogym.nek import NekEnv +from hydrogym.nek.pettingzoo_env import make_pettingzoo_env + +base_env = NekEnv.from_hf("NACA4412_3D_Re75000_AOA5", nproc=12) +env = make_pettingzoo_env(base_env) + +obs_dict, info = env.reset() +actions = {agent: controller(obs_dict[agent]) for agent in env.agents} +obs_dict, rewards_dict, terminations, truncations, infos = env.step(actions) +``` + +## Files + +- `test_nek_pettingzoo.py` - zero-shot multi-policy rollout demo (deployment only) +- `meta_policy_small_wing_template.py` - template defining `ENV_NAME`, `NPROC`, and `POLICY_SPECS` +- `run_pettingzoo_docker.sh` - runner script (module load + workspace prep + `mpirun`) + +## Usage + +### Recommended: use the runner script +From `6_zeroshot_wing_demo/`: + +```bash +./run_pettingzoo_docker.sh +./run_pettingzoo_docker.sh --policy-root /workspace/legacy_runs +``` + +### Direct: run the Python deployment script + +Default template: +```bash +mpirun -np 1 python test_nek_pettingzoo.py : -np 12 nek5000 +``` + +Legacy policy template + run root: +```bash +mpirun -np 1 python test_nek_pettingzoo.py \ + --policy-template ./meta_policy_small_wing_template.py \ + --policy-root /path/to/legacy_runs \ + --steps 3000 \ + : -np 12 nek5000 +``` + +Useful overrides: +- `--policy-template PATH` (defaults to 
`./meta_policy_small_wing_template.py`) +- `--env ENV_NAME` (defaults from template `ENV_NAME`) +- `--nproc NPROC` (defaults from template `NPROC`) +- `--steps NUM_STEPS` (defaults from template `NUM_STEPS`) +- `--policy-root PATH` (where RL model run folders live) +- `--local-dir PATH` (optional fallback dir for packaged envs) +- `--log-every N` (reward table frequency) + +## Policy Template (`meta_policy_small_wing_template.py`) + +The template defines a lightweight legacy-`MetaPolicy.py`-style configuration. + +Required top-level variables: +- `ENV_NAME` +- `NPROC` +- `NUM_STEPS` +- `POLICY_ROOT` (default for `--policy-root`) +- `POLICY_SPECS` (list of policy group dicts) + +Each `POLICY_SPECS` entry supports: +- `name` +- `x_range: [x_min, x_max]` +- `side: "SS"` (y > 0) or `"PS"` (y < 0) +- `algorithm: "PPO" | "TD3" | "DDPG" | "BL" | "ZERO"` +- `drl_step` (action refresh interval; actions are held between refreshes) +- `action_bounds: [min, max]` +- optional scaling knobs: `u_tau`, `baseline_dudy` +- RL algorithms only: `agent_run_name`, `policy`, and/or `model_path` + +Algorithm semantics: +- `ZERO` outputs an all-zero action (no model needed) +- `BL` outputs a constant action equal to `action_max` (no model needed) +- `PPO`/`TD3`/`DDPG` load a Stable-Baselines3 model from `model_path`/`POLICY_ROOT` + +For overlapping actuator regions, the last-assigned policy takes precedence. + +## Default RL Model Path Convention + +For RL policies (`PPO`, `TD3`, `DDPG`), if `model_path` is not set, the default expected path is: + +```text +//logs/- +``` + +## Notes + +- Deployment-only (evaluation). No training happens in this chapter. +- `drl_step` controls when the controller is queried; between refreshes, the last action is held for the whole group. +- `u_tau` is used to normalize observations before calling the controller (the code comments note that solver-side normalization by `u_tau` should be kept consistent with how the policies were trained). 
+- This demo uses deterministic controller calls (`controller.predict(..., deterministic=True)`), and displays a reward summary to help compare controller configurations. + diff --git a/docs/docs/faq.md b/docs/docs/faq.md new file mode 100644 index 0000000..f88b76b --- /dev/null +++ b/docs/docs/faq.md @@ -0,0 +1,10 @@ +--- +sidebar_position: 7 +--- + +# FAQ + +> Answers to frequently asked questions to go here. + +## I want to use HydroGym in my work, how do I cite it? + diff --git a/docs/docs/installation/_category_.json b/docs/docs/installation/_category_.json index 7e61e15..d0d4962 100644 --- a/docs/docs/installation/_category_.json +++ b/docs/docs/installation/_category_.json @@ -1,8 +1,8 @@ { "label": "Installation", "position": 3, + "collapsed": false, "link": { - "type": "generated-index", - "description": "5 minutes to learn the most important Docusaurus concepts." + "type": "generated-index" } } diff --git a/docs/docusaurus.config.ts b/docs/docusaurus.config.ts index bc38026..df13e96 100644 --- a/docs/docusaurus.config.ts +++ b/docs/docusaurus.config.ts @@ -1,6 +1,8 @@ import {themes as prismThemes} from 'prism-react-renderer'; import type {Config} from '@docusaurus/types'; import type * as Preset from '@docusaurus/preset-classic'; +import remarkMath from 'remark-math'; +import rehypeKatex from 'rehype-katex'; // This runs in Node.js - Don't use client-side code here (browser APIs, JSX...) @@ -35,12 +37,23 @@ const config: Config = { locales: ['en'], }, + stylesheets: [ + { + href: 'https://cdn.jsdelivr.net/npm/katex@0.13.24/dist/katex.min.css', + type: 'text/css', + integrity: 'sha384-odtC+0UGzzFL/6PNoE8rX/SPcQDXBJ+uRepguP4QkPCm2LBxH3FA3y+fKSiJ+AmM', + crossorigin: 'anonymous', + }, + ], + presets: [ [ 'classic', { docs: { sidebarPath: './sidebars.ts', + remarkPlugins: [remarkMath], + rehypePlugins: [rehypeKatex], // Please change this to your repo. // Remove this to remove the "edit this page" links. 
editUrl: diff --git a/docs/package-lock.json b/docs/package-lock.json index 25972b7..42b78e6 100644 --- a/docs/package-lock.json +++ b/docs/package-lock.json @@ -1,12 +1,12 @@ { - "name": "docs", - "version": "0.0.0", + "name": "HydroGym Documentation", + "version": "1.0.0", "lockfileVersion": 3, "requires": true, "packages": { "": { - "name": "docs", - "version": "0.0.0", + "name": "HydroGym Documentation", + "version": "1.0.0", "dependencies": { "@docusaurus/core": "3.9.2", "@docusaurus/preset-classic": "3.9.2", @@ -14,7 +14,9 @@ "clsx": "^2.0.0", "prism-react-renderer": "^2.3.0", "react": "^19.0.0", - "react-dom": "^19.0.0" + "react-dom": "^19.0.0", + "rehype-katex": "^7", + "remark-math": "^6" }, "devDependencies": { "@docusaurus/module-type-aliases": "3.9.2", @@ -5376,6 +5378,12 @@ "integrity": "sha512-5+fP8P8MFNC+AyZCDxrB2pkZFPGzqQWUzpSeuuVLvm8VMcorNYavBqoFcxK8bQz4Qsbn4oUEEem4wDLfcysGHA==", "license": "MIT" }, + "node_modules/@types/katex": { + "version": "0.16.8", + "resolved": "https://registry.npmjs.org/@types/katex/-/katex-0.16.8.tgz", + "integrity": "sha512-trgaNyfU+Xh2Tc+ABIb44a5AYUpicB3uwirOioeOkNPPbmgRNtcWyDeeFRzjPZENO9Vq8gvVqfhaaXWLlevVwg==", + "license": "MIT" + }, "node_modules/@types/mdast": { "version": "4.0.4", "resolved": "https://registry.npmjs.org/@types/mdast/-/mdast-4.0.4.tgz", @@ -9240,6 +9248,55 @@ "node": ">= 0.4" } }, + "node_modules/hast-util-from-dom": { + "version": "5.0.1", + "resolved": "https://registry.npmjs.org/hast-util-from-dom/-/hast-util-from-dom-5.0.1.tgz", + "integrity": "sha512-N+LqofjR2zuzTjCPzyDUdSshy4Ma6li7p/c3pA78uTwzFgENbgbUrm2ugwsOdcjI1muO+o6Dgzp9p8WHtn/39Q==", + "license": "ISC", + "dependencies": { + "@types/hast": "^3.0.0", + "hastscript": "^9.0.0", + "web-namespaces": "^2.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, + "node_modules/hast-util-from-html": { + "version": "2.0.3", + "resolved": 
"https://registry.npmjs.org/hast-util-from-html/-/hast-util-from-html-2.0.3.tgz", + "integrity": "sha512-CUSRHXyKjzHov8yKsQjGOElXy/3EKpyX56ELnkHH34vDVw1N1XSQ1ZcAvTyAPtGqLTuKP/uxM+aLkSPqF/EtMw==", + "license": "MIT", + "dependencies": { + "@types/hast": "^3.0.0", + "devlop": "^1.1.0", + "hast-util-from-parse5": "^8.0.0", + "parse5": "^7.0.0", + "vfile": "^6.0.0", + "vfile-message": "^4.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, + "node_modules/hast-util-from-html-isomorphic": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/hast-util-from-html-isomorphic/-/hast-util-from-html-isomorphic-2.0.0.tgz", + "integrity": "sha512-zJfpXq44yff2hmE0XmwEOzdWin5xwH+QIhMLOScpX91e/NSGPsAzNCvLQDIEPyO2TXi+lBmU6hjLIhV8MwP2kw==", + "license": "MIT", + "dependencies": { + "@types/hast": "^3.0.0", + "hast-util-from-dom": "^5.0.0", + "hast-util-from-html": "^2.0.0", + "unist-util-remove-position": "^5.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, "node_modules/hast-util-from-parse5": { "version": "8.0.3", "resolved": "https://registry.npmjs.org/hast-util-from-parse5/-/hast-util-from-parse5-8.0.3.tgz", @@ -9260,6 +9317,19 @@ "url": "https://opencollective.com/unified" } }, + "node_modules/hast-util-is-element": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/hast-util-is-element/-/hast-util-is-element-3.0.0.tgz", + "integrity": "sha512-Val9mnv2IWpLbNPqc/pUem+a7Ipj2aHacCwgNfTiK0vJKl0LF+4Ba4+v1oPHFpf3bLYmreq0/l3Gud9S5OH42g==", + "license": "MIT", + "dependencies": { + "@types/hast": "^3.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, "node_modules/hast-util-parse-selector": { "version": "4.0.0", "resolved": "https://registry.npmjs.org/hast-util-parse-selector/-/hast-util-parse-selector-4.0.0.tgz", @@ -9372,6 +9442,22 @@ "url": "https://opencollective.com/unified" } }, + 
"node_modules/hast-util-to-text": { + "version": "4.0.2", + "resolved": "https://registry.npmjs.org/hast-util-to-text/-/hast-util-to-text-4.0.2.tgz", + "integrity": "sha512-KK6y/BN8lbaq654j7JgBydev7wuNMcID54lkRav1P0CaE1e47P72AWWPiGKXTJU271ooYzcvTAn/Zt0REnvc7A==", + "license": "MIT", + "dependencies": { + "@types/hast": "^3.0.0", + "@types/unist": "^3.0.0", + "hast-util-is-element": "^3.0.0", + "unist-util-find-after": "^5.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, "node_modules/hast-util-whitespace": { "version": "3.0.0", "resolved": "https://registry.npmjs.org/hast-util-whitespace/-/hast-util-whitespace-3.0.0.tgz", @@ -10353,6 +10439,31 @@ "graceful-fs": "^4.1.6" } }, + "node_modules/katex": { + "version": "0.16.45", + "resolved": "https://registry.npmjs.org/katex/-/katex-0.16.45.tgz", + "integrity": "sha512-pQpZbdBu7wCTmQUh7ufPmLr0pFoObnGUoL/yhtwJDgmmQpbkg/0HSVti25Fu4rmd1oCR6NGWe9vqTWuWv3GcNA==", + "funding": [ + "https://opencollective.com/katex", + "https://github.com/sponsors/katex" + ], + "license": "MIT", + "dependencies": { + "commander": "^8.3.0" + }, + "bin": { + "katex": "cli.js" + } + }, + "node_modules/katex/node_modules/commander": { + "version": "8.3.0", + "resolved": "https://registry.npmjs.org/commander/-/commander-8.3.0.tgz", + "integrity": "sha512-OkTL9umf+He2DZkUq8f8J9of7yL6RJKI24dVITBmNfZBmri9zYZQrKkuXiKhyfPSu8tUhnVBB1iKXevvnlR4Ww==", + "license": "MIT", + "engines": { + "node": ">= 12" + } + }, "node_modules/keyv": { "version": "4.5.4", "resolved": "https://registry.npmjs.org/keyv/-/keyv-4.5.4.tgz", @@ -10837,6 +10948,25 @@ "url": "https://opencollective.com/unified" } }, + "node_modules/mdast-util-math": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/mdast-util-math/-/mdast-util-math-3.0.0.tgz", + "integrity": "sha512-Tl9GBNeG/AhJnQM221bJR2HPvLOSnLE/T9cJI9tlc6zwQk2nPk/4f0cHkOdEixQPC/j8UtKDdITswvLAy1OZ1w==", + "license": "MIT", + "dependencies": { + 
"@types/hast": "^3.0.0", + "@types/mdast": "^4.0.0", + "devlop": "^1.0.0", + "longest-streak": "^3.0.0", + "mdast-util-from-markdown": "^2.0.0", + "mdast-util-to-markdown": "^2.1.0", + "unist-util-remove-position": "^5.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, "node_modules/mdast-util-mdx": { "version": "3.0.0", "resolved": "https://registry.npmjs.org/mdast-util-mdx/-/mdast-util-mdx-3.0.0.tgz", @@ -11653,6 +11783,81 @@ ], "license": "MIT" }, + "node_modules/micromark-extension-math": { + "version": "3.1.0", + "resolved": "https://registry.npmjs.org/micromark-extension-math/-/micromark-extension-math-3.1.0.tgz", + "integrity": "sha512-lvEqd+fHjATVs+2v/8kg9i5Q0AP2k85H0WUOwpIVvUML8BapsMvh1XAogmQjOCsLpoKRCVQqEkQBB3NhVBcsOg==", + "license": "MIT", + "dependencies": { + "@types/katex": "^0.16.0", + "devlop": "^1.0.0", + "katex": "^0.16.0", + "micromark-factory-space": "^2.0.0", + "micromark-util-character": "^2.0.0", + "micromark-util-symbol": "^2.0.0", + "micromark-util-types": "^2.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, + "node_modules/micromark-extension-math/node_modules/micromark-factory-space": { + "version": "2.0.1", + "resolved": "https://registry.npmjs.org/micromark-factory-space/-/micromark-factory-space-2.0.1.tgz", + "integrity": "sha512-zRkxjtBxxLd2Sc0d+fbnEunsTj46SWXgXciZmHq0kDYGnck/ZSGj9/wULTV95uoeYiK5hRXP2mJ98Uo4cq/LQg==", + "funding": [ + { + "type": "GitHub Sponsors", + "url": "https://github.com/sponsors/unifiedjs" + }, + { + "type": "OpenCollective", + "url": "https://opencollective.com/unified" + } + ], + "license": "MIT", + "dependencies": { + "micromark-util-character": "^2.0.0", + "micromark-util-types": "^2.0.0" + } + }, + "node_modules/micromark-extension-math/node_modules/micromark-util-character": { + "version": "2.1.1", + "resolved": 
"https://registry.npmjs.org/micromark-util-character/-/micromark-util-character-2.1.1.tgz", + "integrity": "sha512-wv8tdUTJ3thSFFFJKtpYKOYiGP2+v96Hvk4Tu8KpCAsTMs6yi+nVmGh1syvSCsaxz45J6Jbw+9DD6g97+NV67Q==", + "funding": [ + { + "type": "GitHub Sponsors", + "url": "https://github.com/sponsors/unifiedjs" + }, + { + "type": "OpenCollective", + "url": "https://opencollective.com/unified" + } + ], + "license": "MIT", + "dependencies": { + "micromark-util-symbol": "^2.0.0", + "micromark-util-types": "^2.0.0" + } + }, + "node_modules/micromark-extension-math/node_modules/micromark-util-symbol": { + "version": "2.0.1", + "resolved": "https://registry.npmjs.org/micromark-util-symbol/-/micromark-util-symbol-2.0.1.tgz", + "integrity": "sha512-vs5t8Apaud9N28kgCrRUdEed4UJ+wWNvicHLPxCa9ENlYuAY31M0ETy5y1vA33YoNPDFTghEbnh6efaE8h4x0Q==", + "funding": [ + { + "type": "GitHub Sponsors", + "url": "https://github.com/sponsors/unifiedjs" + }, + { + "type": "OpenCollective", + "url": "https://opencollective.com/unified" + } + ], + "license": "MIT" + }, "node_modules/micromark-extension-mdx-expression": { "version": "3.0.1", "resolved": "https://registry.npmjs.org/micromark-extension-mdx-expression/-/micromark-extension-mdx-expression-3.0.1.tgz", @@ -15694,6 +15899,25 @@ "regjsparser": "bin/parser" } }, + "node_modules/rehype-katex": { + "version": "7.0.1", + "resolved": "https://registry.npmjs.org/rehype-katex/-/rehype-katex-7.0.1.tgz", + "integrity": "sha512-OiM2wrZ/wuhKkigASodFoo8wimG3H12LWQaH8qSPVJn9apWKFSH3YOCtbKpBorTVw/eI7cuT21XBbvwEswbIOA==", + "license": "MIT", + "dependencies": { + "@types/hast": "^3.0.0", + "@types/katex": "^0.16.0", + "hast-util-from-html-isomorphic": "^2.0.0", + "hast-util-to-text": "^4.0.0", + "katex": "^0.16.0", + "unist-util-visit-parents": "^6.0.0", + "vfile": "^6.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, "node_modules/rehype-raw": { "version": "7.0.0", "resolved": 
"https://registry.npmjs.org/rehype-raw/-/rehype-raw-7.0.0.tgz", @@ -15799,6 +16023,22 @@ "url": "https://opencollective.com/unified" } }, + "node_modules/remark-math": { + "version": "6.0.0", + "resolved": "https://registry.npmjs.org/remark-math/-/remark-math-6.0.0.tgz", + "integrity": "sha512-MMqgnP74Igy+S3WwnhQ7kqGlEerTETXMvJhrUzDikVZ2/uogJCb+WHUg97hK9/jcfc0dkD73s3LN8zU49cTEtA==", + "license": "MIT", + "dependencies": { + "@types/mdast": "^4.0.0", + "mdast-util-math": "^3.0.0", + "micromark-extension-math": "^3.0.0", + "unified": "^11.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, "node_modules/remark-mdx": { "version": "3.1.1", "resolved": "https://registry.npmjs.org/remark-mdx/-/remark-mdx-3.1.1.tgz", @@ -17407,6 +17647,20 @@ "url": "https://github.com/sponsors/sindresorhus" } }, + "node_modules/unist-util-find-after": { + "version": "5.0.0", + "resolved": "https://registry.npmjs.org/unist-util-find-after/-/unist-util-find-after-5.0.0.tgz", + "integrity": "sha512-amQa0Ep2m6hE2g72AugUItjbuM8X8cGQnFoHk0pGfrFeT9GZhzN5SW8nRsiGKK7Aif4CrACPENkA6P/Lw6fHGQ==", + "license": "MIT", + "dependencies": { + "@types/unist": "^3.0.0", + "unist-util-is": "^6.0.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/unified" + } + }, "node_modules/unist-util-is": { "version": "6.0.1", "resolved": "https://registry.npmjs.org/unist-util-is/-/unist-util-is-6.0.1.tgz", @@ -17446,6 +17700,20 @@ "url": "https://opencollective.com/unified" } }, + "node_modules/unist-util-remove-position": { + "version": "5.0.0", + "resolved": "https://registry.npmjs.org/unist-util-remove-position/-/unist-util-remove-position-5.0.0.tgz", + "integrity": "sha512-Hp5Kh3wLxv0PHj9m2yZhhLt58KzPtEYKQQ4yxfYFEO7EvHwzyDYnduhHnY1mDxoqr7VUwVuHXk9RXKIiYS1N8Q==", + "license": "MIT", + "dependencies": { + "@types/unist": "^3.0.0", + "unist-util-visit": "^5.0.0" + }, + "funding": { + "type": "opencollective", + "url": 
"https://opencollective.com/unified" + } + }, "node_modules/unist-util-stringify-position": { "version": "4.0.0", "resolved": "https://registry.npmjs.org/unist-util-stringify-position/-/unist-util-stringify-position-4.0.0.tgz", diff --git a/docs/package.json b/docs/package.json index cfc6d8b..67045b2 100644 --- a/docs/package.json +++ b/docs/package.json @@ -22,7 +22,9 @@ "clsx": "^2.0.0", "prism-react-renderer": "^2.3.0", "react": "^19.0.0", - "react-dom": "^19.0.0" + "react-dom": "^19.0.0", + "rehype-katex": "^7", + "remark-math": "^6" }, "devDependencies": { "@docusaurus/module-type-aliases": "3.9.2", diff --git a/examples/nek/getting_started/6_zeroshot_wing_demo/README.md b/examples/nek/getting_started/6_zeroshot_wing_demo/README.md index 77d9df9..5d7f6ee 100644 --- a/examples/nek/getting_started/6_zeroshot_wing_demo/README.md +++ b/examples/nek/getting_started/6_zeroshot_wing_demo/README.md @@ -9,6 +9,7 @@ __This is a deployment/evaluation demo only (no training). The template and cont ## What the script does `test_nek_pettingzoo.py`: + - loads a base `NekEnv` via `NekEnv.from_hf(...)` and wraps it with `make_pettingzoo_env(...)` - builds one controller per entry in `POLICY_SPECS` (from `meta_policy_small_wing_template.py`) - assigns each controller to actuator agents by `x_range` and `side` (`SS` means `y > 0`, `PS` means `y < 0`) @@ -41,6 +42,7 @@ obs_dict, rewards_dict, terminations, truncations, infos = env.step(actions) ## Usage ### Recommended: use the runner script + From `6_zeroshot_wing_demo/`: ```bash @@ -56,6 +58,7 @@ mpirun -np 1 python test_nek_pettingzoo.py : -np 12 nek5000 ``` Legacy policy template + run root: + ```bash mpirun -np 1 python test_nek_pettingzoo.py \ --policy-template ./meta_policy_small_wing_template.py \ @@ -65,6 +68,7 @@ mpirun -np 1 python test_nek_pettingzoo.py \ ``` Useful overrides: + - `--policy-template PATH` (defaults to `./meta_policy_small_wing_template.py`) - `--env ENV_NAME` (defaults from template `ENV_NAME`) - 
`--nproc NPROC` (defaults from template `NPROC`) @@ -78,6 +82,7 @@ Useful overrides: The template defines a lightweight legacy-`MetaPolicy.py`-style configuration. Required top-level variables: + - `ENV_NAME` - `NPROC` - `NUM_STEPS` @@ -85,6 +90,7 @@ Required top-level variables: - `POLICY_SPECS` (list of policy group dicts) Each `POLICY_SPECS` entry supports: + - `name` - `x_range: [x_min, x_max]` - `side: "SS"` (y>0) or `"PS"` (y<0) @@ -95,6 +101,7 @@ Each `POLICY_SPECS` entry supports: - RL algorithms only: `agent_run_name`, `policy`, and/or `model_path` Algorithm semantics: + - `ZERO` outputs an all-zero action (no model needed) - `BL` outputs a constant action equal to `action_max` (no model needed) - `PPO`/`TD3`/`DDPG` load a Stable-Baselines3 model from `model_path`/`POLICY_ROOT`
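Tying the field list and algorithm semantics above together, a minimal hypothetical `POLICY_SPECS` might pair one RL group with a zero-action fallback. Every value below is illustrative, not a tuned or shipped configuration:

```python
# Hypothetical POLICY_SPECS fragment; all names, ranges, and bounds are
# illustrative placeholders, not values from the real wing template.
POLICY_SPECS = [
    {
        "name": "ss_front_ppo",
        "x_range": [0.1, 0.4],            # streamwise extent of this group
        "side": "SS",                     # suction side (y > 0)
        "algorithm": "PPO",               # loads a Stable-Baselines3 model
        "drl_step": 10,                   # query the policy every 10 steps
        "action_bounds": [-0.5, 0.5],     # actions clipped to this range
        "agent_run_name": "example_run",  # feeds the default model path
    },
    {
        "name": "ps_off",
        "x_range": [0.0, 1.0],
        "side": "PS",                     # pressure side (y < 0)
        "algorithm": "ZERO",              # all-zero action, no model needed
        "drl_step": 10,
        "action_bounds": [-0.5, 0.5],
    },
]
```

Because the second entry spans the whole pressure side, any PS actuator agent falls back to zero action, while the single SS group drives the front of the suction side.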