CoDance

Official implementation of CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation.

CoDance animates one or more subjects in a reference image using a driving pose sequence. It is designed for mismatched-pose settings, where the pose layout is not rigidly aligned with the reference image.

Links: Project Page · Checkpoint · Model Card

News

Code release scaffold is available.
CoDance checkpoint is hosted on Hugging Face.

Installation

conda create -n codance python=3.10 -y
conda activate codance

# Choose the CUDA wheel matching your machine.
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu121

pip install -r requirements.txt
pip install -e .

FlashAttention and SageAttention are supported as optional acceleration backends when they are installed. By default, torch SDPA is used.
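To see which of these backends are importable in the current environment, a quick check along the following lines may help. This is a minimal sketch and not part of the repository: the helper `available_attention_backends` is hypothetical, and the import names `flash_attn` and `sageattention` are assumed to be the modules those packages install. It only reports availability and says nothing about which backend CoDance actually selects.

```python
import importlib.util

# Assumed import names for each backend; adjust if your install differs.
# torch SDPA ships with PyTorch itself, so "torch" is used as a proxy.
BACKEND_MODULES = {
    "flash_attention": "flash_attn",
    "sage_attention": "sageattention",
    "torch_sdpa": "torch",
}


def available_attention_backends():
    """Return {backend_name: bool}, True when the module can be located."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in BACKEND_MODULES.items()
    }


if __name__ == "__main__":
    for name, found in available_attention_backends().items():
        print(f"{name}: {'installed' if found else 'not installed'}")
```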

Checkpoints

Create the checkpoint directory:

mkdir -p checkpoints

Download the CoDance checkpoint:

python scripts/download_weights.py

Or download it manually from Hugging Face and place it at:

checkpoints/codance.ckpt

Download Wan2.1-I2V-14B-720P:

huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P \
  --local-dir Wan2.1-I2V-14B-720P

Download the DWPose ONNX models and place them as:

checkpoints/yolox_l.onnx
checkpoints/dw-ll_ucoco_384.onnx

Expected layout:

CoDance/
├── checkpoints/
│   ├── codance.ckpt
│   ├── yolox_l.onnx
│   └── dw-ll_ucoco_384.onnx
└── Wan2.1-I2V-14B-720P/
    ├── diffusion_pytorch_model-00001-of-00007.safetensors
    ├── ...
    ├── models_t5_umt5-xxl-enc-bf16.pth
    ├── models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
    └── Wan2.1_VAE.pth
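Before running inference, it can be useful to confirm that the files named in the layout above are in place. The following is a minimal sketch under the assumption that it is run from the repository root; `missing_files` is a hypothetical helper, and it checks only the files listed explicitly above, not the shards hidden behind the `...` entry.

```python
from pathlib import Path

# Files named explicitly in the expected layout (elided shards are not checked).
EXPECTED_FILES = [
    "checkpoints/codance.ckpt",
    "checkpoints/yolox_l.onnx",
    "checkpoints/dw-ll_ucoco_384.onnx",
    "Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00001-of-00007.safetensors",
    "Wan2.1-I2V-14B-720P/models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1-I2V-14B-720P/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
    "Wan2.1-I2V-14B-720P/Wan2.1_VAE.pth",
]


def missing_files(repo_root="."):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All listed files are present.")
```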

Inference

CoDance uses three inputs:

A reference image.
A DWPose frame directory extracted from a driving video.
A reference subject mask generated by SAM-2 or provided by the user.

1. Extract driving poses

python process_data.py \
  --source_video_paths data/videos/driving.mp4 \
  --saved_pose_dir data/saved_pkl \
  --saved_pose data/saved_pose \
  --det_model checkpoints/yolox_l.onnx \
  --pose_model checkpoints/dw-ll_ucoco_384.onnx

This writes pose images to:

data/saved_pose/driving/
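A quick way to check the extraction result is to list the pose frames in order and count them. The sketch below is illustrative only: `list_pose_frames` is a hypothetical helper, and the frame file extensions and numeric naming are assumptions, since the output naming scheme is not documented here.

```python
import re
from pathlib import Path

# Assumed image formats for pose frames; adjust to what process_data.py writes.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def natural_key(path):
    """Sort key that orders 'frame_2' before 'frame_10'."""
    return [int(tok) if tok.isdigit() else tok
            for tok in re.split(r"(\d+)", path.name)]


def list_pose_frames(pose_dir):
    """Return image files in pose_dir, sorted by the numbers in their names."""
    frames = [p for p in Path(pose_dir).iterdir()
              if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES]
    return sorted(frames, key=natural_key)


if __name__ == "__main__":
    frames = list_pose_frames("data/saved_pose/driving")
    print(f"Found {len(frames)} pose frames")
```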

2. Generate a reference mask

Provide one or more positive points on the target subject:

python get_mask.py \
  --image data/images/reference.png \
  --points "480,1470;1168,1424" \
  --output data/masks/reference_mask.png

You can also use any external segmentation tool, as long as the saved mask is an RGB image.
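The `--points` value in the command above is a semicolon-separated list of comma-separated pairs. As a minimal sketch of how such a string could be validated before calling `get_mask.py`, the hypothetical helper below parses it into integer tuples. It assumes each pair is `x,y` in pixel coordinates of the reference image, which is not stated above, and it is not taken from the repository's own parser.

```python
def parse_points(spec):
    """Parse 'x1,y1;x2,y2' into [(x1, y1), (x2, y2)].

    Assumes integer pixel coordinates in x,y order (an assumption, not
    something documented for get_mask.py).
    """
    points = []
    for pair in spec.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ';'
        x_str, y_str = pair.split(",")
        points.append((int(x_str), int(y_str)))
    if not points:
        raise ValueError("at least one point is required")
    return points


if __name__ == "__main__":
    print(parse_points("480,1470;1168,1424"))
```

A malformed pair such as `480;1470` raises `ValueError`, which surfaces typos before the segmentation model is loaded.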

3. Run CoDance

python examples/inference_480p_single.py \
  --ref_img data/images/reference.png \
  --mask data/masks/reference_mask.png \
  --pose_dir data/saved_pose/driving \
  --save_path outputs/codance.mp4 \
  --prompt "A character is dancing." \
  --wan_dir Wan2.1-I2V-14B-720P \
  --codance_ckpt checkpoints/codance.ckpt

Default inference settings are 81 frames, 832x480 resolution, CFG scale 5, 50 denoising steps, and sigma shift 5.
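For reference, the defaults above can be collected in one place, together with a sanity check on the resulting latent size. This is a sketch under stated assumptions: the class and field names are hypothetical and do not mirror the script's actual arguments, and the strides (4x temporal, 8x spatial) are an assumption about the Wan2.1 VAE rather than something stated in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceDefaults:
    # Values taken from the defaults listed above; names are illustrative.
    num_frames: int = 81
    width: int = 832
    height: int = 480
    cfg_scale: float = 5.0
    num_steps: int = 50
    sigma_shift: float = 5.0


def latent_shape(cfg, temporal_stride=4, spatial_stride=8):
    """Return (latent_frames, latent_height, latent_width).

    Assumes the first frame is encoded alone and every further group of
    temporal_stride frames maps to one latent frame.
    """
    if (cfg.num_frames - 1) % temporal_stride != 0:
        raise ValueError("num_frames must equal k * temporal_stride + 1")
    if cfg.width % spatial_stride or cfg.height % spatial_stride:
        raise ValueError("width and height must be multiples of spatial_stride")
    return (
        (cfg.num_frames - 1) // temporal_stride + 1,
        cfg.height // spatial_stride,
        cfg.width // spatial_stride,
    )


if __name__ == "__main__":
    print(latent_shape(InferenceDefaults()))
```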

If your Wan2.1 local directory uses a different layout, pass explicit model files:

python examples/inference_480p_single.py \
  --ref_img data/images/reference.png \
  --mask data/masks/reference_mask.png \
  --pose_dir data/saved_pose/driving \
  --model_path Wan2.1-I2V-14B-720P/models_t5_umt5-xxl-enc-bf16.pth \
  --model_path Wan2.1-I2V-14B-720P/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
  --model_path Wan2.1-I2V-14B-720P/Wan2.1_VAE.pth \
  --model_path Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00001-of-00007.safetensors \
  --model_path Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00002-of-00007.safetensors \
  --model_path Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00003-of-00007.safetensors \
  --model_path Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00004-of-00007.safetensors \
  --model_path Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00005-of-00007.safetensors \
  --model_path Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00006-of-00007.safetensors \
  --model_path Wan2.1-I2V-14B-720P/diffusion_pytorch_model-00007-of-00007.safetensors
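Typing the repeated `--model_path` flags by hand is error-prone, so the argument list can be generated instead. The sketch below builds the same flags as the command above from a directory name; `model_path_args` is a hypothetical helper, and the file names and shard count are copied from the example rather than discovered on disk.

```python
from pathlib import Path

# Non-sharded model files, in the order used in the command above.
AUX_FILES = [
    "models_t5_umt5-xxl-enc-bf16.pth",
    "models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
    "Wan2.1_VAE.pth",
]


def model_path_args(wan_dir, num_shards=7):
    """Return ['--model_path', path, ...] for all model files under wan_dir."""
    names = list(AUX_FILES)
    names += [
        f"diffusion_pytorch_model-{i:05d}-of-{num_shards:05d}.safetensors"
        for i in range(1, num_shards + 1)
    ]
    args = []
    for name in names:
        args += ["--model_path", (Path(wan_dir) / name).as_posix()]
    return args


if __name__ == "__main__":
    # Print the flags in a form that can be pasted into a shell command.
    print(" ".join(model_path_args("Wan2.1-I2V-14B-720P")))
```

The returned list can also be appended to the argument list of a `subprocess.run` call that launches `examples/inference_480p_single.py` with the other flags shown above.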

Training Details

The released checkpoint contains LoRA weights, the Pose Shift Encoder, and the Mask Encoder. See docs/TRAINING.md for the implementation settings used in the paper.

Repository Structure

.
├── diffsynth/              # Diffusion models, pipelines, schedulers, and loaders
├── dwpose/                 # DWPose ONNX inference utilities
├── examples/               # CoDance inference scripts
├── scripts/                # Download and maintenance helpers
├── process_data.py         # Driving-pose extraction CLI
├── get_mask.py             # SAM-2 mask generation CLI
├── MODEL_CARD.md
└── README.md

Ethics

CoDance is released for academic research. Do not use this project for impersonation, non-consensual identity manipulation, harassment, fraud, or deceptive media generation. Users are responsible for ensuring that reference images, masks, and driving videos are used with proper rights and consent.

Acknowledgements

This implementation builds on the DiffSynth-style video generation codebase and benefits from prior work including UniAnimate-DiT, MimicMotion, MusePose, Animate-X, Wan2.1, DWPose, and SAM-2.

Citation

@article{CoDance2025,
  title={CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation},
  author={Tan, Shuai and Gong, Biao and Ma, Ke and Feng, Yutong and Zhang, Qiyuan and Wang, Yan and Shen, Yujun and Zhao, Hengshuang},
  journal={arXiv preprint arXiv:2601.11096},
  year={2025}
}

License

This repository is released under the Apache-2.0 license. See LICENSE.
