Unable to Download Full Dataset: Only 27 of 55 Datasets Available #104

@Lynne-Zheng-Linfang

Hi, thank you for your impressive work and for generously sharing the dataset.

I'm currently trying to download the full dataset, but I've only been able to successfully download 27 out of the 55 datasets. For the remaining datasets, no files were downloaded, and no errors were reported by gsutil.

Here are the details:

Initially, I attempted to use tfds.load({dataset_name}), but all datasets returned a Dataset not found error. According to the instructions in the README:

If you run into this issue when trying to run tfds.load({dataset_name}),
Try downloading the dataset manually by running
gsutil -m cp -r gs://gdm-robotics-open-x-embodiment/{dataset_name} ~/tensorflow_datasets/
Once you download the dataset like this, you can use the dataset with the regular tfds.load({dataset_name}) command!

I followed this suggestion and used:

gsutil -m cp -r gs://gdm-robotics-open-x-embodiment/{dataset_name} ~/tensorflow_datasets/
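For reference, here is a minimal sketch of how the per-dataset copy could be scripted and then checked for empty results. It assumes gsutil is installed and on the PATH. The helper names (`build_copy_command`, `count_files`, `datasets_without_files`, `download_all`) are hypothetical and are not part of the repository; only the bucket URL and the gsutil flags come from the README.

```python
import os
import subprocess

BUCKET = "gs://gdm-robotics-open-x-embodiment"  # bucket named in the README


def build_copy_command(dataset_name, dest_dir):
    """Return the gsutil argument list for copying one dataset directory."""
    return ["gsutil", "-m", "cp", "-r", f"{BUCKET}/{dataset_name}", dest_dir]


def count_files(dest_dir, dataset_name):
    """Count regular files under dest_dir/dataset_name (0 if it is absent)."""
    root = os.path.join(dest_dir, dataset_name)
    return sum(len(files) for _, _, files in os.walk(root))


def datasets_without_files(dataset_names, dest_dir):
    """Return the names whose local directory is missing or holds no files."""
    return [n for n in dataset_names if count_files(dest_dir, n) == 0]


def download_all(dataset_names, dest_dir="~/tensorflow_datasets/"):
    """Run the copy for each name, then report names that produced no files."""
    dest_dir = os.path.expanduser(dest_dir)  # subprocess does not expand "~"
    os.makedirs(dest_dir, exist_ok=True)
    for name in dataset_names:
        # check=False: continue even if gsutil exits non-zero for one name
        subprocess.run(build_copy_command(name, dest_dir), check=False)
    return datasets_without_files(dataset_names, dest_dir)
```

Calling `download_all(["kuka", "bc_z"])` would return the subset of names for which nothing was written locally, which makes silent failures visible even when gsutil prints no error.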

However, only the following 27 datasets were downloaded successfully:

asu_table_top_converted_externally_to_rlds                   cmu_play_fusion                                        kuka
austin_buds_dataset_converted_externally_to_rlds             cmu_stretch                                            language_table
austin_sirius_dataset_converted_externally_to_rlds           columbia_cairlab_pusht_real                            nyu_door_opening_surprising_effectiveness
berkeley_fanuc_manipulation                                  dlr_sara_pour_converted_externally_to_rlds             stanford_hydra_dataset_converted_externally_to_rlds
berkeley_gnm_cory_hall                                       eth_agent_affordances                                  stanford_robocook_converted_externally_to_rlds
berkeley_gnm_recon                                           furniture_bench_dataset_converted_externally_to_rlds   ucsd_kitchen_dataset_converted_externally_to_rlds
berkeley_gnm_sac_son                                         iamlab_cmu_pickup_insert_converted_externally_to_rlds  utaustin_mutex
berkeley_mvp_converted_externally_to_rlds                    imperialcollege_sawyer_wrist_cam                       utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds
cmu_franka_exploration_dataset_converted_externally_to_rlds  kaist_nonprehensile_converted_externally_to_rlds       utokyo_saytap_converted_externally_to_rlds

I’ve also checked some of the other dataset folders (e.g., bc_z and berkeley_autolab_ur5). They exist in the bucket but appear to contain only folder markers, with no actual files. Attempts to list or download them either return no results or fail with errors such as "One or more URLs matched no objects".
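The folder-marker check can be sketched roughly as follows. This assumes the text comes from `gsutil ls -r gs://gdm-robotics-open-x-embodiment/{dataset_name}`, and that directory headers end with `:` while placeholder entries end with `/` or `_$folder$`. That is my understanding of the listing format rather than something confirmed by the maintainers, and `split_listing` is a hypothetical helper name.

```python
def split_listing(listing_text):
    """Split recursive listing output into (real_objects, placeholders).

    Assumption: lines ending in ":" are directory headers, and lines ending
    in "/" or "_$folder$" are folder markers rather than data files.
    """
    real_objects, placeholders = [], []
    for line in listing_text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank separator between directory groups
        if line.endswith((":", "/", "_$folder$")):
            placeholders.append(line)
        else:
            real_objects.append(line)
    return real_objects, placeholders
```

If `real_objects` comes back empty for a dataset, the folder holds only markers, which matches what I see for the missing datasets.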

Could you kindly clarify:

  • Whether these datasets are expected to be publicly available?

  • If so, is there any additional step needed to access or download them?

Any guidance would be greatly appreciated. Thanks again for your great work!
