📄 Official project page for our CVPR 2025 paper.
🎥 Watch the demo video below to see GenManip in action!
GenManip is a simulation platform designed for large-scale evaluation of generalist robotic manipulation policies under diverse, realistic instruction-following scenarios.
Built on NVIDIA Isaac Sim, GenManip offers:
- 🧠 LLM-driven task generation via a novel Task-oriented Scene Graph (ToSG); see the sketch after this list
- 🔬 200 curated scenarios for both modular and end-to-end policy benchmarking
- 🧱 A scalable asset pool of 10,000+ rigid and 100+ articulated objects, each with vision-language labels
- 🧭 Evaluation of spatial, appearance, commonsense, and long-horizon reasoning capabilities
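To make the ToSG idea concrete, here is a minimal, hypothetical sketch of such a graph in Python: object nodes carry vision-language attributes, and directed edges encode the relations a task must satisfy. All class and field names below are illustrative assumptions, not the platform's actual API.

```python
# Hypothetical sketch of a Task-oriented Scene Graph (ToSG).
# Names are illustrative assumptions, not GenManip's actual API.
from dataclasses import dataclass, field

@dataclass
class ToSGNode:
    """An object node with vision-language attributes."""
    name: str                                       # e.g. "red_mug"
    category: str                                   # e.g. "mug"
    attributes: dict = field(default_factory=dict)  # color, material, ...

@dataclass
class ToSGEdge:
    """A directed relation between two objects."""
    source: str
    target: str
    relation: str                                   # e.g. "on_top_of", "inside"

@dataclass
class ToSG:
    nodes: list[ToSGNode]
    edges: list[ToSGEdge]

# A goal graph for "put the red mug on the wooden tray":
goal = ToSG(
    nodes=[ToSGNode("red_mug", "mug", {"color": "red"}),
           ToSGNode("tray", "tray", {"material": "wood"})],
    edges=[ToSGEdge("red_mug", "tray", "on_top_of")],
)
```

Under this framing, an LLM can propose or mutate goal graphs, and task synthesis reduces to sampling scenes whose current graph differs from the goal graph.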
For the IROS 2025 Challenge: Vision-Language Manipulation in Open Tabletop Environments, we are thrilled to release 10 hand-crafted post-training tasks and over 55K synthesized, generalizable pick-and-place tasks spanning ~14K objects on the ALOHA platform.
The pre-training dataset from the GenManip platform features randomized backgrounds, randomized objects, and diverse instructions, enabling strong generalization across a wide range of scenarios.
📌 Register for the IROS Challenge: https://eval.ai/web/challenges/challenge-page/2626/overview
📂 Data Access
- Pre-training Data (Dual-arm Generalizable Pick-and-Place): Hugging Face Link
- Post-training Data (Dual-arm Manipulation, 10 Tasks): Hugging Face Link
In addition, high-quality tasks from the GenManip Benchmark will soon be merged into InternManip. We also provide data generated by GenManip through InternData-M1, a large-scale embodied robotics dataset containing ~250,000 simulation demonstrations with rich frame-level annotations, including 2D/3D bounding boxes, trajectories, grasp points, and semantic masks.
Due to the dataset's massive size, conversion from the GenManip format to the LeRobot format has taken some time, and uploads are still in progress. Rest assured, all data has already been generated and will be available soon. Even more large-scale data for long-horizon tasks is on the way! 🚀
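The frame-level annotations lend themselves to a simple per-frame record. Below is a hedged, illustrative sketch of what one such record might contain; every field name and path here is an assumption for illustration, not the released InternData-M1 schema.

```python
# Illustrative sketch of a single frame-level annotation record.
# All field names, units, and paths are assumptions, not the released schema.
frame_annotation = {
    "frame_index": 120,
    "rgb": "episode_0001/rgb/000120.png",            # hypothetical image path
    "boxes_2d": [{"label": "red_mug", "xyxy": [312, 188, 402, 276]}],
    "boxes_3d": [{"label": "red_mug",
                  "center": [0.42, -0.10, 0.83],     # meters, world frame
                  "size": [0.08, 0.08, 0.10],
                  "rotation": [0.0, 0.0, 0.0, 1.0]}],  # quaternion (x, y, z, w)
    "ee_trajectory": [[0.40, -0.12, 0.90], [0.41, -0.11, 0.87]],
    "grasp_points": [[0.42, -0.10, 0.88]],
    "semantic_mask": "episode_0001/mask/000120.png",  # hypothetical mask path
}
```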
| Feature | Description |
|---|---|
| 🎯 ToSG-based Task Synthesis | Graph-based semantic representation for generating complex tasks |
| 🖼️ Photorealistic Simulation | RTX ray-traced rendering with physical accuracy |
| 📊 Benchmark Suite | 200 high-diversity tasks annotated via human-in-the-loop refinement |
| 🧪 Evaluation Tools | Supports SR, SPL, ablations, and generalization diagnostics |
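SR (success rate) and SPL (Success weighted by Path Length, Anderson et al., 2018) are standard instruction-following metrics. Below is a minimal sketch of how they are typically computed, assuming per-episode success flags plus shortest-path and executed-path lengths; the function names are ours, not the released evaluation tools' API.

```python
# Minimal sketch of the two headline metrics. Assumes per-episode success
# flags, shortest-path lengths l_i, and executed path lengths p_i.
def success_rate(successes: list[bool]) -> float:
    """SR: fraction of episodes completed successfully."""
    return sum(successes) / len(successes)

def spl(successes: list[bool], shortest: list[float], taken: list[float]) -> float:
    """SPL = (1/N) * sum_i S_i * l_i / max(p_i, l_i)."""
    return sum(
        s * l / max(p, l) for s, l, p in zip(successes, shortest, taken)
    ) / len(successes)

# Example: 2 of 3 episodes succeed; the second takes a detour.
print(success_rate([True, True, False]))                            # 0.667
print(spl([True, True, False], [1.0, 2.0, 1.5], [1.0, 3.0, 2.0]))   # 0.556
```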
Code is released!
You can visit our official website for more information, documentation, and updates.
- GenManip Website for setup, VLM agent usage, and the leaderboard
- Code for demo generation (demogen), rendering, and evaluation
- GenManip Bench (20 tasks)
- Full GenManip Bench with evaluation metrics
- GenManip Assets (10K+ objects)
- More models: Seer, ACT, and beyond
- Objaverse scaling pipeline
- etc.
If our work is helpful in your research, please cite:
@inproceedings{gao2025genmanip,
  title={GenManip: LLM-driven Simulation for Generalizable Instruction-Following Manipulation},
  author={Gao, Ning and Chen, Yilun and Yang, Shuai and Chen, Xinyi and Tian, Yang and Li, Hao and Huang, Haifeng and Wang, Hanqing and Wang, Tai and Pang, Jiangmiao},
  booktitle={CVPR},
  year={2025}
}
Have questions or ideas? Reach out via the project page or open an issue. We welcome contributions, collaborations, and feedback from the community!