Fix deterministic issue when getting pipeline dtype and device #10696

Open · wants to merge 1 commit into main
Conversation

dimitribarbot (Contributor)

What does this PR do?

This PR fixes a non-determinism issue when getting the pipeline dtype or device across multiple runs of the same Python script.

Internally, the ordering issue comes from the use of set in the _get_signature_keys function in src/diffusers/pipelines/pipeline_utils.py. As stated in the Python documentation, set does not guarantee the same element order across multiple runs of a script: string hashing is initialized with a random salt each time the Python interpreter starts. More information can be found in the Python documentation on object.__hash__ (especially the second note at the end of that section).
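The mechanism can be illustrated with a small standalone sketch (hypothetical pipeline class and function names, not the actual diffusers code): the set-based version may iterate in a different order on each interpreter run, while the sorted version is stable.

```python
import inspect


def get_signature_keys_nondeterministic(cls):
    # Iteration order of a set of strings depends on PYTHONHASHSEED,
    # so it can differ between interpreter runs.
    params = inspect.signature(cls.__init__).parameters
    return {name for name in params if name != "self"}


def get_signature_keys_deterministic(cls):
    # Sorting yields a stable order regardless of hash randomization.
    params = inspect.signature(cls.__init__).parameters
    return sorted(name for name in params if name != "self")


class FakePipeline:
    # Hypothetical stand-in for a DiffusionPipeline subclass.
    def __init__(self, unet=None, vae=None, text_encoder=None, scheduler=None):
        pass


print(get_signature_keys_deterministic(FakePipeline))
# ['scheduler', 'text_encoder', 'unet', 'vae'] on every run
```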

Fixes #10671.

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul

@@ -1619,10 +1619,12 @@ def components(self) -> Dict[str, Any]:
k: getattr(self, k) for k in self.config.keys() if not k.startswith("_") and k not in optional_parameters
}

if set(components.keys()) != expected_modules:
actual = sorted(set(components.keys()))
expected = sorted(expected_modules)
Comment from dimitribarbot (Contributor, Author):

I'm not sure whether using sorted is strictly necessary here, since expected_modules comes from the _get_signature_keys call, which already calls sorted internally.

However, I left it this way because it keeps this code completely decoupled from the _get_signature_keys implementation. And I don't expect a significant performance overhead (sorting n ≈ 10 elements that are already sorted).
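On the performance point: CPython's Timsort detects already-sorted input and handles it in linear time, so re-sorting a component list of around ten names is negligible. A quick sketch with hypothetical component names:

```python
import timeit

# Hypothetical component names, roughly the scale of a real pipeline.
names = ["feature_extractor", "safety_checker", "scheduler",
         "text_encoder", "tokenizer", "unet", "vae"]
already_sorted = sorted(names)

# sorted() is idempotent on already-sorted input ...
assert sorted(already_sorted) == already_sorted

# ... and cheap: Timsort finds the single ascending run in one pass.
per_call = timeit.timeit(lambda: sorted(already_sorted), number=100_000) / 100_000
print(f"~{per_call * 1e9:.0f} ns per sorted() call")
```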

@@ -585,3 +585,104 @@ def test_video_to_video(self):
with io.StringIO() as stderr, contextlib.redirect_stderr(stderr):
_ = pipe(**inputs)
self.assertTrue(stderr.getvalue() == "", "Progress bar should be disabled")


@require_torch_gpu
Comment from dimitribarbot (Contributor, Author):

I need this to test multiple devices across pipeline components, so that the test with distinct pipeline devices fails without the code changes. Otherwise, I guess only the CPU would be available.
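Why this surfaces for the device property: when components live on different devices, the reported value depends on which module is visited first, so a hash-randomized iteration order changes the answer across runs. A minimal standalone sketch (hypothetical names and strings in place of real torch modules, not the diffusers implementation):

```python
# Map of hypothetical pipeline components to the device each lives on.
components = {"unet": "cuda:0", "vae": "cuda:1", "text_encoder": "cuda:0"}


def pipeline_device(component_names):
    # Report the device of the first component in iteration order,
    # mimicking a first-module-wins dtype/device lookup.
    for name in component_names:
        return components[name]


# Non-deterministic across runs: set order depends on hash randomization.
# pipeline_device({"unet", "vae", "text_encoder"})  -> "cuda:0" or "cuda:1"

# Deterministic: sorted order always visits "text_encoder" first.
print(pipeline_device(sorted(components)))
# cuda:0
```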

Successfully merging this pull request may close these issues.

Deterministic issue in DiffusionPipeline when getting dtype or device