Commit 27637a5

sayakpaul, XciD, noskill, DN6, and yiyixuxu authored
Flux pipeline (#9043)
add flux!

Signed-off-by: Adrien <[email protected]>
Co-authored-by: Adrien <[email protected]>
Co-authored-by: Anatoly Belikov <[email protected]>
Co-authored-by: Dhruv Nair <[email protected]>
Co-authored-by: yiyixuxu <[email protected]>
1 parent 2ea22e1 commit 27637a5

21 files changed, +2270 -30 lines changed

docs/source/en/_toctree.yml

Lines changed: 4 additions & 0 deletions
@@ -253,6 +253,8 @@
     title: HunyuanDiT2DModel
   - local: api/models/aura_flow_transformer2d
     title: AuraFlowTransformer2DModel
+  - local: api/models/flux_transformer
+    title: FluxTransformer2DModel
   - local: api/models/latte_transformer3d
     title: LatteTransformer3DModel
   - local: api/models/lumina_nextdit2d
@@ -320,6 +322,8 @@
     title: DiffEdit
   - local: api/pipelines/dit
     title: DiT
+  - local: api/pipelines/flux
+    title: Flux
   - local: api/pipelines/hunyuandit
     title: Hunyuan-DiT
   - local: api/pipelines/i2vgenxl
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License.
11+
-->
12+
13+
# FluxTransformer2DModel
14+
15+
A Transformer model for image-like data from [Flux](https://blackforestlabs.ai/announcing-black-forest-labs/).
16+
17+
## FluxTransformer2DModel
18+
19+
[[autodoc]] FluxTransformer2DModel

docs/source/en/api/pipelines/flux.md

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
1+
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License.
11+
-->
12+
13+
# Flux
14+
15+
Flux is a series of text-to-image generation models based on diffusion transformers. To learn more about Flux, check out the original [blog post](https://blackforestlabs.ai/announcing-black-forest-labs/) by the creators of Flux, Black Forest Labs.
16+
17+
Original model checkpoints for Flux can be found [here](https://huggingface.co/black-forest-labs). Original inference code can be found [here](https://github.com/black-forest-labs/flux).
18+
19+
<Tip>
20+
21+
Flux can be quite expensive to run on consumer hardware devices. However, you can perform a suite of optimizations to run it faster and in a more memory-friendly manner. Check out [this section](https://huggingface.co/blog/sd3#memory-optimizations-for-sd3) for more details. Additionally, Flux can benefit from quantization for memory efficiency with a trade-off in inference latency. Refer to [this blog post](https://huggingface.co/blog/quanto-diffusers) to learn more.
22+
23+
</Tip>
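As a rough illustration of the quantization route mentioned in the tip, the sketch below quantizes only the transformer weights to 8-bit floats. It assumes the `optimum-quanto` package is installed alongside `diffusers` and `torch`; the helper name `load_flux_fp8` is hypothetical and not part of either library.

```python
def load_flux_fp8(model_id="black-forest-labs/FLUX.1-schnell"):
    """Load a Flux pipeline and quantize its transformer weights to float8.

    Hypothetical helper for illustration; assumes `torch`, `diffusers`,
    and `optimum-quanto` are installed.
    """
    # Imported lazily so that defining the helper does not require the
    # heavy dependencies to be present.
    import torch
    from diffusers import FluxPipeline
    from optimum.quanto import freeze, qfloat8, quantize

    pipe = FluxPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    # Mark the transformer weights for float8 quantization, then freeze to
    # replace the original weights with their quantized versions.
    quantize(pipe.transformer, weights=qfloat8)
    freeze(pipe.transformer)
    return pipe
```

Device placement is left to the caller. As the tip notes, the memory saving may come at the cost of higher inference latency.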
24+
25+
Flux comes in two variants:
26+
27+
* Timestep-distilled (`black-forest-labs/FLUX.1-schnell`)
28+
* Guidance-distilled (`black-forest-labs/FLUX.1-dev`)
29+
30+
The two checkpoints differ slightly in usage, as detailed below.
31+
32+
### Timestep-distilled
33+
34+
* `max_sequence_length` cannot be more than 256.
35+
* `guidance_scale` needs to be 0.
36+
* As this is a timestep-distilled model, it benefits from fewer sampling steps.
37+
38+
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
out = pipe(
    prompt=prompt,
    guidance_scale=0.,
    height=768,
    width=1360,
    num_inference_steps=4,
    max_sequence_length=256,
).images[0]
out.save("image.png")
```
56+
57+
### Guidance-distilled
58+
59+
* The guidance-distilled variant takes about 50 sampling steps for good-quality generation.
60+
* It doesn't have any limitations around the `max_sequence_length`.
61+
62+
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = "a tiny astronaut hatching from an egg on the moon"
out = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    height=768,
    width=1360,
    num_inference_steps=50,
).images[0]
out.save("image.png")
```
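The differences between the two variants can be collected in one place. The following is a minimal sketch, not part of `diffusers`: `PRESETS` and `flux_call_kwargs` are hypothetical names, and the values are simply those used in the two examples above.

```python
# Hypothetical helper, not part of diffusers: it only gathers the call
# arguments described above for each Flux checkpoint.
SCHNELL = "black-forest-labs/FLUX.1-schnell"
DEV = "black-forest-labs/FLUX.1-dev"

# Per-variant settings copied from the two examples above.
PRESETS = {
    SCHNELL: {"guidance_scale": 0.0, "num_inference_steps": 4, "max_sequence_length": 256},
    DEV: {"guidance_scale": 3.5, "num_inference_steps": 50},
}


def flux_call_kwargs(model_id, prompt, **overrides):
    """Build keyword arguments for a pipeline call and enforce the schnell limits."""
    if model_id not in PRESETS:
        raise ValueError(f"unknown Flux checkpoint: {model_id}")

    kwargs = {"prompt": prompt, "height": 768, "width": 1360}
    kwargs.update(PRESETS[model_id])
    kwargs.update(overrides)

    # Only the timestep-distilled variant has hard constraints.
    if model_id == SCHNELL:
        if kwargs["guidance_scale"] != 0:
            raise ValueError("FLUX.1-schnell needs guidance_scale=0")
        if kwargs["max_sequence_length"] > 256:
            raise ValueError("FLUX.1-schnell needs max_sequence_length <= 256")
    return kwargs
```

With a pipeline loaded as in the examples above, a call would then look like `pipe(**flux_call_kwargs(DEV, prompt)).images[0]`.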
79+
80+
## FluxPipeline
81+
82+
[[autodoc]] FluxPipeline
83+
- all
84+
- __call__
