Hi, thank you for releasing such wonderful work and the clean codebase!
I have a question about how you collect and preprocess the data for training the action VAE, especially for the π0 experiments.
- Episode selection from DROID / pre-training datasets
For π0, the paper mentions using around 3000 episodes from Open-X Embodiment and DROID as the pre-training dataset for the action VAE.
Could you share a bit more detail on how these episodes are selected in practice?
- Do you sample episodes uniformly at random from the full DROID (and other) datasets?
- Is there any filtering based on task type, trajectory length, success / failure labels, or embodiment?
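To make the question concrete, here is a rough sketch of the two options I have in mind: plain uniform sampling, and uniform sampling after a simple filter. Everything in it is hypothetical and not taken from your codebase: `select_episodes`, `episode_lengths`, and `min_length` are names I made up, and trajectory length is just one example of a possible filter.

```python
import random


def select_episodes(episode_lengths, num_episodes=3000, min_length=None, seed=0):
    """Pick episode ids uniformly at random, optionally after a length filter.

    episode_lengths: dict mapping a (hypothetical) episode id to its length T.
    min_length: if given, episodes shorter than this are dropped before sampling.
    """
    # Sort the ids so that the result depends only on the seed, not dict order.
    candidates = sorted(
        eid for eid, T in episode_lengths.items()
        if min_length is None or T >= min_length
    )
    rng = random.Random(seed)
    # Never ask for more episodes than are available after filtering.
    k = min(num_episodes, len(candidates))
    return rng.sample(candidates, k)


# Toy example: 100 fake episodes with lengths 20..119.
lengths = {f"ep{i}": 20 + i for i in range(100)}
uniform = select_episodes(lengths, num_episodes=10)                  # no filtering
filtered = select_episodes(lengths, num_episodes=10, min_length=50)  # only T >= 50
```

If the actual selection differs from both variants (for example, filtering by task or embodiment), that is exactly the detail I would like to know.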
- Sub-trajectory splitting based on action chunks
After you collect the episodes and load the `qpos` sequences (shape `(T, action_dim)`), how do you construct the sub-trajectories (action chunks) used as VAE training samples?
My current understanding is that you take a sliding window over the action sequence with a fixed action chunk length `s_length` (e.g., `s_length = 50` for π0), and use these chunks as VAE inputs.
Could you confirm whether it works like this, or correct me if I'm misunderstanding:
- Suppose an episode has actions `a_0, a_1, ..., a_{T-1}`.
- We choose an action chunk length `L = s_length` (e.g., `L = 50`) and a stride `stride` (e.g., `stride = 5`).
- Then we construct sub-trajectories like:
  - `{a_0, ..., a_49}`,
  - `{a_5, ..., a_54}`,
  - `{a_10, ..., a_59}`,
  - and so on, until we reach the end of the episode.
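In code, my understanding of the splitting is roughly the sketch below. This is only my guess, not your implementation: `split_into_chunks` is a name I made up, and I am assuming that windows which would run past the end of the episode are dropped rather than padded, which is one of the things I would like to confirm. It relies only on `len` and slicing, so it should work on a NumPy array of shape `(T, action_dim)` as well as on a plain list.

```python
def split_into_chunks(qpos, s_length=50, stride=5):
    """Slide a window of s_length steps over one episode, moving by stride.

    qpos: sequence of T actions (e.g. a list of lists or an array of shape
          (T, action_dim)).
    Returns a list of full-length chunks; a trailing partial window is
    dropped (assumption), and an episode shorter than s_length yields [].
    """
    if s_length <= 0 or stride <= 0:
        raise ValueError("s_length and stride must be positive")
    T = len(qpos)
    # The last valid start index is T - s_length, so every chunk is full length.
    return [qpos[start:start + s_length]
            for start in range(0, T - s_length + 1, stride)]


# Toy episode: T = 120 steps, action_dim = 7, every entry of a_t equals t.
episode = [[float(t)] * 7 for t in range(120)]
chunks = split_into_chunks(episode, s_length=50, stride=5)
print(len(chunks))       # 15 chunks, starting at t = 0, 5, ..., 70
print(chunks[1][0][0])   # 5.0, i.e. the second chunk starts at a_5
print(chunks[-1][-1][0]) # 119.0, i.e. the last chunk ends at a_{T-1}
```

With these values the chunks overlap heavily (45 of 50 steps are shared between neighbours), so I am also curious whether you use such a small stride or a larger one.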
Thanks again for your great work!