mujocolab · kevinzakka · Dec 27, 2025 · Dec 28, 2025
diff --git a/Makefile b/Makefile
@@ -45,6 +45,10 @@ build:
 docs:
 	uv run --extra docs sphinx-build docs docs/_build
 
+.PHONY: docs-watch
+docs-watch:
+	uv run --extra docs sphinx-autobuild docs docs/_build
+
 .PHONY: docker-build
 docker-build:
 	docker build -t mjlab:latest .
diff --git a/README.md b/README.md
@@ -45,6 +45,9 @@ Launch the demo directly in your browser with an interactive Viser viewer.
 
 ## Installation
 
+> **Note:** `mujoco-warp` is not yet on PyPI, so it must be installed from GitHub.
+> Once it becomes available on PyPI, the installation commands below will simplify.
+
 **From source:**
 
 ```bash
@@ -169,7 +172,7 @@ make format
 Compile documentation locally:
 
 ```bash
-uv pip install -r docs/requirements.txt
+uv sync --extra docs
 make docs
 ```
 

diff --git a/docs/conf.py b/docs/conf.py
@@ -122,11 +122,9 @@
 html_last_updated_fmt = ""
 
 html_static_path = ["source/_static"]
-html_css_files = ["custom.css"]
 
 html_theme_options = {
   "path_to_docs": "docs/",
-  "collapse_navigation": True,
   "repository_url": "https://github.com/mujocolab/mjlab",
   "use_repository_button": True,
   "use_issues_button": True,
@@ -146,7 +144,7 @@
     {
       "name": "mjlab",
       "url": "https://github.com/mujocolab/mjlab",
-      "icon": "https://img.shields.io/badge/mjlab-0.1.0-silver.svg",
+      "icon": "https://img.shields.io/badge/mjlab-1.0.0-silver.svg",
       "type": "url",
     },
     {

diff --git a/docs/index.rst b/docs/index.rst
@@ -23,7 +23,7 @@ You can try mjlab *without installing anything* by using `uvx`:
 
    # Run the mjlab demo (no local installation needed)
    uvx --from mjlab \
-       --with "mujoco-warp @ git+https://github.com/google-deepmind/mujoco_warp@9fc294d86955a303619a254cefae809a41adb274" \
+       --with "mujoco-warp @ git+https://github.com/google-deepmind/mujoco_warp@f2f795796fc433adf8e235f01fae3747585ae5db" \
        demo
 
 If this runs, your setup is compatible with mjlab *for evaluation*.
@@ -44,7 +44,7 @@ If you use mjlab in your research, we would appreciate a citation:
         month = sep,
         title = {{mjlab: Isaac Lab API, powered by MuJoCo-Warp, for RL and robotics research.}},
         url = {https://github.com/mujocolab/mjlab},
-        version = {0.1.0},
+        version = {1.0.0},
         year = {2025}
     }
 
@@ -62,32 +62,63 @@ Table of Contents
 =================
 
 .. toctree::
-   :maxdepth: 1
+   :maxdepth: 2
    :caption: Getting Started
+   :titlesonly:
 
-   source/installation
-   source/migration_isaac_lab
+   source/getting_started/installation
+   source/getting_started/motivation
+   source/getting_started/walkthrough/index
 
 .. toctree::
-   :maxdepth: 1
-   :caption: About the Project
+   :maxdepth: 2
+   :caption: Architecture
+   :titlesonly:
 
-   source/motivation
-   source/faq
+   source/architecture/manager_based_env
+   source/architecture/scene
+   source/architecture/control_flow
 
 .. toctree::
    :maxdepth: 2
-   :caption: API Reference
+   :caption: Components
+   :titlesonly:
+
+   source/components/entities
+   source/components/actuators
+   source/components/sensors
+   source/components/terrains
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Environment Guide
+   :titlesonly:
+
+   source/environment_guide/observations
+   source/environment_guide/domain_randomization
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Features
+   :titlesonly:
+
+   source/features/configuration
+   source/features/distributed_training
+   source/features/nan_guard
+
+.. toctree::
+   :maxdepth: 1
+   :caption: API
+   :titlesonly:
 
    source/api/index
 
 .. toctree::
    :maxdepth: 1
-   :caption: Core Concepts
-
-   source/randomization
-   source/nan_guard
-   source/observation
-   source/actuators
-   source/sensors
-   source/distributed_training
+   :caption: References
+   :titlesonly:
+
+   source/references/changelog
+   source/references/contributing
+   source/references/faq
+   source/references/migration_isaac_lab
diff --git a/docs/source/api/index.rst b/docs/source/api/index.rst
@@ -1,16 +1,15 @@
 API Reference
 =============
 
-This section provides detailed API documentation for all public modules in mjlab.
-
 .. toctree::
-   :maxdepth: 1
+   :maxdepth: 2
+   :titlesonly:
 
+   actuator
+   entity
    envs
+   managers
    scene
-   sim
-   entity
-   actuator
    sensor
-   managers
+   sim
    terrains
diff --git a/docs/source/architecture/control_flow.rst b/docs/source/architecture/control_flow.rst
@@ -0,0 +1,5 @@
+.. _control-flow:
+
+Control Flow
+============
+
diff --git a/docs/source/architecture/manager_based_env.rst b/docs/source/architecture/manager_based_env.rst
@@ -0,0 +1,5 @@
+.. _manager-based-env:
+
+Manager-Based Environment
+=========================
+
diff --git a/docs/source/architecture/scene.rst b/docs/source/architecture/scene.rst
@@ -0,0 +1,5 @@
+.. _scene-architecture:
+
+Scene
+=====
+
diff --git a/docs/source/actuators.rst → docs/source/components/actuators.rst b/docs/source/actuators.rst → docs/source/components/actuators.rst
diff --git a/docs/source/components/entities.rst b/docs/source/components/entities.rst
@@ -0,0 +1,4 @@
+.. _entities:
+
+Entities
+========
diff --git a/docs/source/sensors.rst → docs/source/components/sensors.rst b/docs/source/sensors.rst → docs/source/components/sensors.rst
diff --git a/docs/source/components/terrains.rst b/docs/source/components/terrains.rst
@@ -0,0 +1,5 @@
+.. _terrains:
+
+Terrains
+========
+
diff --git a/docs/source/randomization.rst → ...nvironment_guide/domain_randomization.rst b/docs/source/randomization.rst → ...nvironment_guide/domain_randomization.rst
@@ -1,3 +1,5 @@
+.. _domain-randomization:
+
 Domain Randomization
 ====================
 
@@ -172,7 +174,7 @@ Center of Mass (COM) (startup)
             "ranges": {0: (-0.02, 0.02), 1: (-0.02, 0.02)},
             "operation": "add",
         },
-    ) 
+    )
 
 Custom Class-Based Event Terms
 ------------------------------

diff --git a/docs/source/observation.rst → ...source/environment_guide/observations.rst b/docs/source/observation.rst → ...source/environment_guide/observations.rst
diff --git a/docs/source/features/configuration.rst b/docs/source/features/configuration.rst
@@ -0,0 +1,6 @@
+.. _configuration:
+
+Configuration System
+====================
+
+tyro
diff --git a/docs/source/distributed_training.rst → .../source/features/distributed_training.rst b/docs/source/distributed_training.rst → .../source/features/distributed_training.rst
diff --git a/docs/source/nan_guard.rst → docs/source/features/nan_guard.rst b/docs/source/nan_guard.rst → docs/source/features/nan_guard.rst
@@ -85,7 +85,7 @@ Use the interactive viewer to scrub through captured states:
     uv run viz-nan /tmp/mjlab/nan_dumps/nan_dump_20251014_123456.npz
 
 
-.. figure:: _static/content/nan_debug.gif
+.. figure:: ../_static/content/nan_debug.gif
    :alt: NaN Debug Viewer
 
    NaN debug viewer.

diff --git a/docs/source/installation.rst → docs/source/getting_started/installation.rst b/docs/source/installation.rst → docs/source/getting_started/installation.rst
@@ -3,9 +3,8 @@
 Installation Guide
 ==================
 
-``mjlab`` is in active **beta** and tightly coupled to MuJoCo Warp.
-This guide presents different installation paths so you can
-choose the one that best fits your use case.
+``mjlab`` is tightly coupled to MuJoCo Warp. This guide presents different
+installation paths so you can choose the one that best fits your use case.
 
 .. contents::
    :local:
@@ -92,7 +91,7 @@ install. These options are interchangeable: you can switch at any time.
 
       .. code:: bash
 
-         uv add mjlab "mujoco-warp @ git+https://github.com/google-deepmind/mujoco_warp@9fc294d86955a303619a254cefae809a41adb274"
+         uv add mjlab "mujoco-warp @ git+https://github.com/google-deepmind/mujoco_warp@f2f795796fc433adf8e235f01fae3747585ae5db"
 
       .. note::
 
@@ -104,7 +103,7 @@ install. These options are interchangeable: you can switch at any time.
 
       .. code:: bash
 
-         uv add "mjlab @ git+https://github.com/mujocolab/mjlab" "mujoco-warp @ git+https://github.com/google-deepmind/mujoco_warp@9fc294d86955a303619a254cefae809a41adb274"
+         uv add "mjlab @ git+https://github.com/mujocolab/mjlab" "mujoco-warp @ git+https://github.com/google-deepmind/mujoco_warp@f2f795796fc433adf8e235f01fae3747585ae5db"
 
       .. note::
 
@@ -201,7 +200,7 @@ Install mjlab and dependencies via pip
 
       .. code:: bash
 
-         pip install git+https://github.com/google-deepmind/mujoco_warp@9fc294d86955a303619a254cefae809a41adb274
+         pip install git+https://github.com/google-deepmind/mujoco_warp@f2f795796fc433adf8e235f01fae3747585ae5db
          pip install mjlab
 
    .. tab-item:: Source
@@ -210,7 +209,7 @@ Install mjlab and dependencies via pip
 
       .. code:: bash
 
-         pip install git+https://github.com/google-deepmind/mujoco_warp@9fc294d86955a303619a254cefae809a41adb274
+         pip install git+https://github.com/google-deepmind/mujoco_warp@f2f795796fc433adf8e235f01fae3747585ae5db
          git clone https://github.com/mujocolab/mjlab.git
          cd mjlab
          pip install -e .

diff --git a/docs/source/motivation.rst → docs/source/getting_started/motivation.rst b/docs/source/motivation.rst → docs/source/getting_started/motivation.rst
diff --git a/docs/source/getting_started/walkthrough/evaluate_policy.rst b/docs/source/getting_started/walkthrough/evaluate_policy.rst
@@ -0,0 +1,28 @@
+Evaluating the Policy
+=====================
+
+This section shows how to evaluate and visualize your trained policy.
+
+Running Evaluation
+------------------
+
+TODO: Show the command to evaluate a trained policy.
+
+.. code-block:: bash
+
+   uv run play --task CartPole --checkpoint /path/to/model.pt
+
+Visualization
+-------------
+
+TODO: Show how to visualize the policy execution (MuJoCo viewer, Viser, video recording).
+
+Analyzing Performance
+---------------------
+
+TODO: Explain how to analyze the policy's performance and identify issues.
+
+Next Steps
+----------
+
+TODO: Suggest next steps (tuning hyperparameters, trying different tasks, customization).
diff --git a/docs/source/getting_started/walkthrough/index.rst b/docs/source/getting_started/walkthrough/index.rst
@@ -0,0 +1,25 @@
+Walkthrough
+===========
+
+This walkthrough guides you through creating a complete RL task from scratch
+and training a policy using mjlab. We'll use a CartPole task as an example
+to demonstrate the full workflow.
+
+By the end of this tutorial, you'll understand how to:
+
+- Define a new task environment
+- Configure observations, actions, and rewards
+- Set up training parameters
+- Train an RL policy
+- Evaluate and visualize the trained policy
+
+.. toctree::
+   :maxdepth: 1
+   :titlesonly:
+
+   task_setup
+   observations_actions
+   rewards_terminations
+   training_config
+   run_training
+   evaluate_policy
diff --git a/docs/source/getting_started/walkthrough/observations_actions.rst b/docs/source/getting_started/walkthrough/observations_actions.rst
@@ -0,0 +1,23 @@
+Observations and Actions
+========================
+
+This section explains how to configure observations and actions for your task.
+
+Observation Space
+-----------------
+
+TODO: Explain what observations the agent needs (cart position, velocity, pole angle, etc.)
+
+TODO: Show how to configure observation managers.
+
+Action Space
+------------
+
+TODO: Explain the action space (force applied to cart).
+
+TODO: Show how to configure action managers and actuators.
+
+Code Example
+------------
+
+TODO: Provide complete code example for observation and action configuration.
diff --git a/docs/source/getting_started/walkthrough/rewards_terminations.rst b/docs/source/getting_started/walkthrough/rewards_terminations.rst
@@ -0,0 +1,23 @@
+Rewards and Terminations
+========================
+
+This section covers how to define reward functions and termination conditions.
+
+Reward Function
+---------------
+
+TODO: Explain the reward structure for CartPole (staying upright, penalizing extreme positions).
+
+TODO: Show how to implement custom reward functions.
+
+Termination Conditions
+----------------------
+
+TODO: Explain when episodes should terminate (pole falls over, cart out of bounds, time limit).
+
+TODO: Show how to configure termination managers.
+
+Code Example
+------------
+
+TODO: Provide complete code example for rewards and terminations.