
Bug inside value_guided_sampling.py #10636

Open
@rdesc

Description


Describe the bug

There's a bug here:

        for key in self.data.keys():
            try:
                self.means[key] = self.data[key].mean()
            except:  # noqa: E722
                pass
        self.stds = {}
        for key in self.data.keys():
            try:
                self.stds[key] = self.data[key].std()
            except:  # noqa: E722
                pass

The means and stds should be computed across each individual dimension of the observation and action spaces, as is done in the original jannerm/diffuser code. This is also made clear by the final video in the reinforcement_learning_with_diffusers.ipynb Colab notebook, shared here for reference, when compared to a rollout video produced by jannerm/diffuser (second video).

buggy.mp4
jannerm_rollout.mp4

Proposed fix:

        for key in self.data.keys():
            try:
                if key in ['observations', 'actions']:
                    self.means[key] = self.data[key].mean(axis=0)
                else:
                    self.means[key] = self.data[key].mean()
            except:  # noqa: E722
                pass
        self.stds = {}
        for key in self.data.keys():
            try:
                if key in ['observations', 'actions']:
                    self.stds[key] = self.data[key].std(axis=0)
                else:
                    self.stds[key] = self.data[key].std()
            except:  # noqa: E722
                pass
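The difference is easy to demonstrate on a toy buffer. Below is a minimal sketch (the `data` dict and its values are stand-ins, not the actual D4RL dataset; only the `observations` key and array shape mirror the real setup) showing why a single scalar statistic mis-normalizes dimensions that live on different scales:

```python
import numpy as np

# Stand-in for the episode buffer: 3 transitions with 3-dim observations
# whose dimensions live on very different scales.
data = {
    "observations": np.array(
        [
            [0.1, 10.0, 100.0],
            [0.3, 30.0, 300.0],
            [0.2, 20.0, 200.0],
        ]
    )
}

# Current behavior: one scalar mean/std over the entire array,
# dominated by the largest-scale dimension.
scalar_mean = data["observations"].mean()
scalar_std = data["observations"].std()

# Proposed behavior: per-dimension statistics along the batch axis.
dim_means = data["observations"].mean(axis=0)  # shape (3,)
dim_stds = data["observations"].std(axis=0)    # shape (3,)

# Per-dimension normalization centers every dimension at zero;
# the scalar version leaves each dimension badly off-center.
normalized = (data["observations"] - dim_means) / dim_stds
print(dim_means)                 # [  0.2  20.  200. ]
print(normalized.mean(axis=0))   # ~[0. 0. 0.]
```

With the scalar statistics, the small-scale dimension collapses to nearly constant values after normalization, which matches the degraded rollouts seen in the first video.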

Reproduction

Run the Google Colab notebook reinforcement_learning_with_diffusers.ipynb.

Logs

System Info

  • 🤗 Diffusers version: 0.31.0
  • Platform: Linux-6.8.0-51-generic-x86_64-with-glibc2.17
  • Running on Google Colab?: No
  • Python version: 3.8.20
  • PyTorch version (GPU?): 2.4.1+cu121 (True)
  • Flax version (CPU?/GPU?/TPU?): 0.7.2 (cpu)
  • Jax version: 0.4.13
  • JaxLib version: 0.4.13
  • Huggingface_hub version: 0.26.2
  • Transformers version: not installed
  • Accelerate version: 1.0.1
  • PEFT version: not installed
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.5
  • xFormers version: not installed
  • Accelerator: NVIDIA GeForce RTX 2080 Ti, 11264 MiB
    NVIDIA TITAN RTX, 24576 MiB
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@yiyixuxu @DN6
