According to the paper, 81-frame clips are extracted from Sekai-Real-HQ and SpatialVID-HQ and then quality-filtered. For each retained clip, Qwen2.5-VL-72B, GroundedSAM2, and MegaSAM provide captions, object masks, depth, and camera poses. These are lifted into background/object point clouds, fitted with 3D Gaussian trajectories, and rendered as background/trajectory maps plus a merged mask, which together constitute the 4D Geometric Control.
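To make sure I understand the first step, here is a minimal sketch of the 81-frame clip extraction as I read it. The stride and all function/variable names are my assumptions, not from the paper (it is unclear to me whether clips overlap):

```python
def extract_clips(num_frames: int, clip_len: int = 81, stride: int = 81):
    """Return (start, end) frame-index pairs for fixed-length clips.

    Assumes non-overlapping windows (stride == clip_len); a trailing
    segment shorter than clip_len is dropped. Both choices are my guesses.
    """
    clips = []
    start = 0
    while start + clip_len <= num_frames:
        clips.append((start, start + clip_len))
        start += stride
    return clips


# e.g. a 200-frame video yields two full 81-frame clips: (0, 81) and (81, 162)
print(extract_clips(200))
```

Each retained clip would then, as I understand it, be passed through the quality filter and the annotation models listed above.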
Are there any plans to open-source the code related to this training data augmentation pipeline?
Looking forward to your response. Thank you!