FluxPipeline is not working with GGUF :( #10674

Open
nitinmukesh opened this issue Jan 28, 2025 · 4 comments
Labels: bug (Something isn't working)

Comments


nitinmukesh commented Jan 28, 2025

Describe the bug

CPU offload is not working for the Flux GGUF pipeline. It works fine for the AuraFlow GGUF pipeline.

Reproduction

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from diffusers import GGUFQuantizationConfig

model_id = "ostris/Flex.1-alpha"
dtype = torch.bfloat16
transformer_path = "https://huggingface.co/hum-ma/Flex.1-alpha-GGUF/blob/main/Flex.1-alpha-Q4_K_M.gguf"
transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=dtype,
)

pipe = FluxPipeline.from_pretrained(
	model_id,
	transformer=transformer,
	torch_dtype=dtype,
)
# pipe.enable_sequential_cpu_offload()
pipe.enable_model_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
inference_params = {
	"prompt": "An oak tree",
	"negative_prompt": "",
	"height": 512,
	"width": 512,
	"guidance_scale": 1.0,
	"num_inference_steps": 20,
	"generator": torch.Generator(device="cuda").manual_seed(0),
	"max_sequence_length":512,
}
image = pipe(**inference_params).images[0]
image.save("image.png")

Logs

(venv) C:\aiOWN\diffuser_webui>python Flex1alpha-gguf.py
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  4.07it/s]
Loading pipeline components...:  57%|█████████████████▋             | 4/7 [00:01<00:00,  4.45it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00,  4.36it/s]
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\Flex1alpha-gguf.py", line 20, in <module>
    pipe.enable_model_cpu_offload()
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1095, in enable_model_cpu_offload
    self.to("cpu", silence_dtype_warnings=True)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 467, in to
    module.to(device, dtype)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 1191, in to
    return super().to(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1340, in to
    return self._apply(convert)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 927, in _apply
    param_applied = fn(param)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1333, in convert
    raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.

System Info

  • 🤗 Diffusers version: 0.33.0.dev0
  • Platform: Windows-10-10.0.26100-SP0
  • Running on Google Colab?: No
  • Python version: 3.10.11
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.27.1
  • Transformers version: 4.48.1
  • Accelerate version: 1.4.0.dev0
  • PEFT version: not installed
  • Bitsandbytes version: 0.45.1
  • Safetensors version: 0.5.2
  • xFormers version: not installed
  • Accelerator: NVIDIA GeForce RTX 4060 Laptop GPU, 8188 MiB
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response

nitinmukesh added the bug label Jan 28, 2025
nitinmukesh (Author) commented

Tested the same pipeline without GGUF and with int8wo quantization.

Both of these work:
pipe.enable_sequential_cpu_offload()
pipe.enable_model_cpu_offload()
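
For reference, a minimal sketch of the int8wo variant being described, assuming it uses diffusers' TorchAoConfig (the exact quantization setup is not shown in the comment, so treat this as an illustration, not the reporter's script):

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, TorchAoConfig

model_id = "ostris/Flex.1-alpha"
dtype = torch.bfloat16

# int8 weight-only quantization via torchao (requires the torchao package)
transformer = FluxTransformer2DModel.from_pretrained(
	model_id,
	subfolder="transformer",
	quantization_config=TorchAoConfig("int8wo"),
	torch_dtype=dtype,
)

pipe = FluxPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=dtype)
pipe.enable_model_cpu_offload()  # reportedly works with this setup, unlike GGUF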

nitinmukesh (Author) commented Jan 28, 2025

  1. This doesn't work at all (with the offloading calls removed).
  2. Also tried removing the VAE optimizations; same result, no progress.
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from diffusers import GGUFQuantizationConfig

model_id = "ostris/Flex.1-alpha"
dtype = torch.bfloat16
transformer_path = "https://huggingface.co/hum-ma/Flex.1-alpha-GGUF/blob/main/Flex.1-alpha-Q4_K_M.gguf"
transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=dtype,
)

pipe = FluxPipeline.from_pretrained(
	model_id,
	transformer=transformer,
	torch_dtype=dtype,
)

pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
inference_params = {
	"prompt": "An oak tree",
	"negative_prompt": "",
	"height": 512,
	"width": 512,
	"guidance_scale": 1.0,
	"num_inference_steps": 20,
	"generator": torch.Generator(device="cuda").manual_seed(0),
	"max_sequence_length":512,
}
image = pipe(**inference_params).images[0]
image.save("image.png")

It just hangs:

(venv) C:\aiOWN\diffuser_webui>python Flex1alpha-gguf.py
Loading pipeline components...: 43%|█████████████▎ | 3/7 [00:00<00:00, 10.39it/s]You set add_prefix_space. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 4.04it/s]
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00, 4.34it/s]

nitinmukesh changed the title from "NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device." to "FluxPipeline is not working with GGUF :(" Jan 28, 2025
DN6 (Collaborator) commented Jan 30, 2025

@nitinmukesh Flex.1-alpha doesn't have the same architecture as Flux (it has fewer layers), so we cannot automatically infer the config. You will have to pass in a config so that we know how to configure the model. Can you try changing to this:

transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=dtype,
	config=model_id,
)

nitinmukesh (Author) commented

@DN6

Sorry for the delay in trying the suggested solution.

Thank you for sharing the updated code and explaining the cause of the issue. I updated the code as suggested and am getting the following error.

There is no config file here:
https://huggingface.co/ostris/Flex.1-alpha/tree/main

(venv) C:\aiOWN\diffuser_webui>python Flex1alpha-gguf.py
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_http.py", line 406, in hf_raise_for_status
    response.raise_for_status()
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\requests\models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/ostris/Flex.1-alpha/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\configuration_utils.py", line 387, in load_config
    config_file = hf_hub_download(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 860, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 923, in _hf_hub_download_to_cache_dir
    (url_to_download, etag, commit_hash, expected_size, head_call_error) = _get_metadata_or_catch_error(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 1374, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 1294, in get_hf_file_metadata
    r = _request_wrapper(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 278, in _request_wrapper
    response = _request_wrapper(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 302, in _request_wrapper
    hf_raise_for_status(response)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_http.py", line 417, in hf_raise_for_status
    raise _format(EntryNotFoundError, message, response) from e
huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-679dd169-71101245509553b71d3d8609;26341227-8e5c-46fb-a5a5-97a4787bb7bf)

Entry Not Found for url: https://huggingface.co/ostris/Flex.1-alpha/resolve/main/config.json.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\Flex1alpha-gguf.py", line 8, in <module>
    transformer = FluxTransformer2DModel.from_single_file(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\loaders\single_file_model.py", line 311, in from_single_file
    diffusers_model_config = cls.load_config(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\configuration_utils.py", line 414, in load_config
    raise EnvironmentError(
OSError: ostris/Flex.1-alpha does not appear to have a file named config.json.
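
A possible workaround, sketched here as an untested assumption rather than a confirmed fix from this thread: the Flex.1-alpha repo uses the diffusers layout, so its transformer config lives at transformer/config.json rather than at the repo root, and from_single_file accepts a subfolder argument for the config lookup in recent diffusers releases. Pointing it at that subfolder may avoid the 404:

import torch
from diffusers import FluxTransformer2DModel, GGUFQuantizationConfig

model_id = "ostris/Flex.1-alpha"
transformer_path = "https://huggingface.co/hum-ma/Flex.1-alpha-GGUF/blob/main/Flex.1-alpha-Q4_K_M.gguf"

# config points at the diffusers-format repo; subfolder tells load_config
# to look for transformer/config.json instead of a root-level config.json
transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=torch.bfloat16,
	config=model_id,
	subfolder="transformer",
)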
