StableDiffusionImg2ImgPipeline OSError: Consistency check failed #1549

@emreaniloguz

Describe the bug

I'm trying to run the DreamPose repository. When UNet fine-tuning finished, the training script saved the fine-tuned network with this code snippet:

            if accelerator.is_main_process and global_step % 500 == 0:
                pipeline = StableDiffusionImg2ImgPipeline.from_pretrained(
                    args.pretrained_model_name_or_path,
                    #adapter=accelerator.unwrap_model(adapter),
                    unet=accelerator.unwrap_model(unet),
                    tokenizer=tokenizer,
                    image_encoder=accelerator.unwrap_model(clip_encoder),
                    clip_processor=accelerator.unwrap_model(clip_processor),
                    revision=args.revision,
                )
                pipeline.save_pretrained(os.path.join(args.output_dir, f'checkpoint-{epoch}'))
                model_path = args.output_dir+f'/unet_epoch_{epoch}.pth'
                torch.save(unet.state_dict(), model_path)
                adapter_path = args.output_dir+f'/adapter_{epoch}.pth'
                torch.save(adapter.state_dict(), adapter_path)
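
For context, the traceback below shows the crash happens inside this from_pretrained call while it resolves the base model from the Hub. Pre-fetching the base repository once before training would make this call read only from the local cache; a minimal sketch, using the repo id and revision from the launch command in the Logs section:

    from huggingface_hub import snapshot_download

    # Sketch only: download CompVis/stable-diffusion-v1-4 at the pinned revision
    # up front so later from_pretrained calls resolve files from the local cache.
    snapshot_download(
        repo_id="CompVis/stable-diffusion-v1-4",
        revision="ebb811dd71cdc38a204ecbdd6ac5d580f529fd8c",
    )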

It failed with: OSError: Consistency check failed: file should be of size 1215981833 but has size 492265879 (model.safetensors). (You can find the full output in the Logs section.)

  • I set the force_download parameter to True, but nothing changed (see the sketch after this list).
  • I have enough disk space to save the model.
  • I'm using the latest huggingface-hub version.
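
Roughly what I mean by setting force_download, expressed as a standalone retry with the arguments the error message itself suggests (force_download=True, resume_download=False). The second part is just a sketch of dropping the cached revision entirely using huggingface_hub's cache utilities; repo id and revision again come from my launch command:

    from huggingface_hub import scan_cache_dir, snapshot_download

    # Retry the download with the arguments suggested by the error message.
    snapshot_download(
        repo_id="CompVis/stable-diffusion-v1-4",
        revision="ebb811dd71cdc38a204ecbdd6ac5d580f529fd8c",
        force_download=True,
        resume_download=False,
    )

    # Sketch only: delete every cached revision of the repo and start from scratch.
    cache_info = scan_cache_dir()
    for repo in cache_info.repos:
        if repo.repo_id == "CompVis/stable-diffusion-v1-4":
            commits = [rev.commit_hash for rev in repo.revisions]
            cache_info.delete_revisions(*commits).execute()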

Reproduction

No response

Logs

Fetching 14 files:   0%|                                 | 0/14 [00:00<?, ?it/s]Force download:  True
Force download:  True
Fetching 14 files:  21%|█████▎                   | 3/14 [00:06<00:23,  2.11s/it]
Traceback (most recent call last):
  File "finetune-unet.py", line 458, in <module>92M/492M [00:05<00:00, 85.9MB/s]
    main(args)
  File "finetune-unet.py", line 438, in main
    pipeline = StableDiffusionImg2ImgPipeline.from_pretrained(
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 908, in from_pretrained
    cached_folder = cls.download(
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 1349, in download
    cached_folder = snapshot_download(
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/huggingface_hub/_snapshot_download.py", line 235, in snapshot_download
    thread_map(
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/tqdm/contrib/concurrent.py", line 69, in thread_map
    return _executor_map(ThreadPoolExecutor, fn, *iterables, **tqdm_kwargs)
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/tqdm/contrib/concurrent.py", line 51, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "***/anaconda3/envs/***/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
    yield fs.pop().result()
  File "***/anaconda3/envs/***/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "***/anaconda3/envs/***/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "***/anaconda3/envs/***/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/huggingface_hub/_snapshot_download.py", line 211, in _inner_hf_hub_download
    return hf_hub_download(
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 1365, in hf_hub_download
    http_get(
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/huggingface_hub/file_download.py", line 547, in http_get
    raise EnvironmentError(
OSError: Consistency check failed: file should be of size 1215981833 but has size 492265879 (model.safetensors).
We are sorry for the inconvenience. Please retry download and pass `force_download=True, resume_download=False` as argument.
If the issue persists, please let us know by opening an issue on https://github.com/huggingface/huggingface_hub.
Downloading model.safetensors: 100%|█████████| 492M/492M [00:05<00:00, 83.3MB/s]
Steps: 100%|██████████████| 500/500 [06:10<00:00,  1.35it/s, loss=0.95, lr=1e-5]
Traceback (most recent call last):
  File "***/anaconda3/envs/***/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/accelerate/commands/launch.py", line 941, in launch_command
    simple_launcher(args)
  File "***/anaconda3/envs/***/lib/python3.8/site-packages/accelerate/commands/launch.py", line 603, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['***/anaconda3/envs/***/bin/python', 'finetune-unet.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--instance_data_dir=demo/sample_emre/train', '--output_dir=demo/custom-chkpts_default', '--resolution=512', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=1e-5', '--num_train_epochs=500', '--dropout_rate=0.0', '--custom_chkpt=checkpoints/unet_epoch_20.pth', '--revision', 'ebb811dd71cdc38a204ecbdd6ac5d580f529fd8c', '--use_8bit_adam']' returned non-zero exit status 1.

System info

- huggingface_hub version: 0.15.1
- Platform: Linux-5.4.0-150-generic-x86_64-with-glibc2.17
- Python version: 3.8.16
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: ***.cache/huggingface/token
- Has saved token ?: False
- Configured git credential helpers: 
- FastAI: N/A
- Tensorflow: N/A
- Torch: 1.13.1+cu116
- Jinja2: N/A
- Graphviz: N/A
- Pydot: N/A
- Pillow: 10.0.0
- hf_transfer: N/A
- gradio: N/A
- numpy: 1.24.4
- ENDPOINT: https://huggingface.co
- HUGGINGFACE_HUB_CACHE: ***.cache/huggingface/hub
- HUGGINGFACE_ASSETS_CACHE: ***.cache/huggingface/assets
- HF_TOKEN_PATH: ***.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
