FluxPipeline is not working with GGUF :( #10674

Open
nitinmukesh opened this issue Jan 28, 2025 · 4 comments
Labels: bug (Something isn't working)

Comments


nitinmukesh commented Jan 28, 2025

Describe the bug

CPU offload is not working for the Flux GGUF pipeline. It works fine for the AuraFlow GGUF pipeline.

Reproduction

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from diffusers import GGUFQuantizationConfig

model_id = "ostris/Flex.1-alpha"
dtype = torch.bfloat16
transformer_path = "https://huggingface.co/hum-ma/Flex.1-alpha-GGUF/blob/main/Flex.1-alpha-Q4_K_M.gguf"
transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=dtype,
)

pipe = FluxPipeline.from_pretrained(
	model_id,
	transformer=transformer,
	torch_dtype=dtype,
)
# pipe.enable_sequential_cpu_offload()
pipe.enable_model_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
inference_params = {
	"prompt": "An oak tree",
	"negative_prompt": "",
	"height": 512,
	"width": 512,
	"guidance_scale": 1.0,
	"num_inference_steps": 20,
	"generator": torch.Generator(device="cuda").manual_seed(0),
	"max_sequence_length":512,
}
image = pipe(**inference_params).images[0]
image.save("image.png")

Logs

(venv) C:\aiOWN\diffuser_webui>python Flex1alpha-gguf.py
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00,  4.07it/s]
Loading pipeline components...:  57%|█████████████████▋             | 4/7 [00:01<00:00,  4.45it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00,  4.36it/s]
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\Flex1alpha-gguf.py", line 20, in <module>
    pipe.enable_model_cpu_offload()
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1095, in enable_model_cpu_offload
    self.to("cpu", silence_dtype_warnings=True)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 467, in to
    module.to(device, dtype)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 1191, in to
    return super().to(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1340, in to
    return self._apply(convert)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 900, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 927, in _apply
    param_applied = fn(param)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1333, in convert
    raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.

System Info

  • 🤗 Diffusers version: 0.33.0.dev0
  • Platform: Windows-10-10.0.26100-SP0
  • Running on Google Colab?: No
  • Python version: 3.10.11
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.27.1
  • Transformers version: 4.48.1
  • Accelerate version: 1.4.0.dev0
  • PEFT version: not installed
  • Bitsandbytes version: 0.45.1
  • Safetensors version: 0.5.2
  • xFormers version: not installed
  • Accelerator: NVIDIA GeForce RTX 4060 Laptop GPU, 8188 MiB
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response

nitinmukesh added the bug label Jan 28, 2025
nitinmukesh (Author) commented

Tested the same pipeline without GGUF and with int8wo quantization.

Both of these work:
pipe.enable_sequential_cpu_offload()
pipe.enable_model_cpu_offload()
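
For reference, a minimal sketch of the int8wo variant being described, assuming it uses diffusers' TorchAoConfig (the exact quantization setup is not shown in the comment, so treat this as an illustration, not the reporter's script):

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, TorchAoConfig

model_id = "ostris/Flex.1-alpha"
dtype = torch.bfloat16

# int8 weight-only quantization via torchao (requires the torchao package)
transformer = FluxTransformer2DModel.from_pretrained(
	model_id,
	subfolder="transformer",
	quantization_config=TorchAoConfig("int8wo"),
	torch_dtype=dtype,
)

pipe = FluxPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=dtype)
pipe.enable_model_cpu_offload()  # reportedly works with this setup, unlike GGUF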

nitinmukesh (Author) commented Jan 28, 2025

  1. This doesn't work at all (with the offloading calls removed).
  2. Also tried removing the VAE optimizations; same result, no progress.
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from diffusers import GGUFQuantizationConfig

model_id = "ostris/Flex.1-alpha"
dtype = torch.bfloat16
transformer_path = "https://huggingface.co/hum-ma/Flex.1-alpha-GGUF/blob/main/Flex.1-alpha-Q4_K_M.gguf"
transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=dtype,
)

pipe = FluxPipeline.from_pretrained(
	model_id,
	transformer=transformer,
	torch_dtype=dtype,
)

pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
inference_params = {
	"prompt": "An oak tree",
	"negative_prompt": "",
	"height": 512,
	"width": 512,
	"guidance_scale": 1.0,
	"num_inference_steps": 20,
	"generator": torch.Generator(device="cuda").manual_seed(0),
	"max_sequence_length":512,
}
image = pipe(**inference_params).images[0]
image.save("image.png")

It just hangs:

(venv) C:\aiOWN\diffuser_webui>python Flex1alpha-gguf.py
Loading pipeline components...: 43%|█████████████▎ | 3/7 [00:00<00:00, 10.39it/s]You set add_prefix_space. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 4.04it/s]
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00, 4.34it/s]

nitinmukesh changed the title from "NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device." to "FluxPipeline is not working with GGUF :(" Jan 28, 2025
DN6 (Collaborator) commented Jan 30, 2025

@nitinmukesh Flex.1-alpha doesn't have the same architecture as Flux (it has fewer layers), so we cannot automatically infer the config. You will have to pass in a config so that we know how to configure the model. Can you try changing to this:

transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=dtype,
	config=model_id,
)

nitinmukesh (Author) commented

@DN6

Sorry for the delay in trying the suggested solution.

Thank you for sharing the updated code and explaining the cause of the issue. I updated the code as suggested and am getting the following error.

There is no config file here:
https://huggingface.co/ostris/Flex.1-alpha/tree/main

(venv) C:\aiOWN\diffuser_webui>python Flex1alpha-gguf.py
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_http.py", line 406, in hf_raise_for_status
    response.raise_for_status()
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\requests\models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/ostris/Flex.1-alpha/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\configuration_utils.py", line 387, in load_config
    config_file = hf_hub_download(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 860, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 923, in _hf_hub_download_to_cache_dir
    (url_to_download, etag, commit_hash, expected_size, head_call_error) = _get_metadata_or_catch_error(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 1374, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 1294, in get_hf_file_metadata
    r = _request_wrapper(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 278, in _request_wrapper
    response = _request_wrapper(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\file_download.py", line 302, in _request_wrapper
    hf_raise_for_status(response)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_http.py", line 417, in hf_raise_for_status
    raise _format(EntryNotFoundError, message, response) from e
huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-679dd169-71101245509553b71d3d8609;26341227-8e5c-46fb-a5a5-97a4787bb7bf)

Entry Not Found for url: https://huggingface.co/ostris/Flex.1-alpha/resolve/main/config.json.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\Flex1alpha-gguf.py", line 8, in <module>
    transformer = FluxTransformer2DModel.from_single_file(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\loaders\single_file_model.py", line 311, in from_single_file
    diffusers_model_config = cls.load_config(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\configuration_utils.py", line 414, in load_config
    raise EnvironmentError(
OSError: ostris/Flex.1-alpha does not appear to have a file named config.json.
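
A possible workaround, sketched here as an untested assumption rather than a confirmed fix from this thread: the Flex.1-alpha repo uses the diffusers layout, so its transformer config lives at transformer/config.json rather than at the repo root, and from_single_file accepts a subfolder argument for the config lookup in recent diffusers releases. Pointing it at that subfolder may avoid the 404:

import torch
from diffusers import FluxTransformer2DModel, GGUFQuantizationConfig

model_id = "ostris/Flex.1-alpha"
transformer_path = "https://huggingface.co/hum-ma/Flex.1-alpha-GGUF/blob/main/Flex.1-alpha-Q4_K_M.gguf"

# config points at the diffusers-format repo; subfolder tells load_config
# to look for transformer/config.json instead of a root-level config.json
transformer = FluxTransformer2DModel.from_single_file(
	transformer_path,
	quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
	torch_dtype=torch.bfloat16,
	config=model_id,
	subfolder="transformer",
)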
