
BLIP 2 caption fail with fresh install #3037

Open
RealArtGames opened this issue Jan 4, 2025 · 2 comments
RealArtGames commented Jan 4, 2025

21:20:22-948389 INFO     BLIP2 captionning beam...
Traceback (most recent call last):
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\gradio\route_utils.py", line 321, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\gradio\blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 2505, in run_sync_in_worker_thread
    return await future
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 1005, in run
    result = context.run(func, *args)
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\gradio\utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\kohya_gui\blip2_caption_gui.py", line 151, in caption_images_beam_search
    processor, model, device = load_model()
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\kohya_gui\blip2_caption_gui.py", line 19, in load_model
    processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\transformers\processing_utils.py", line 465, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\transformers\processing_utils.py", line 511, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 825, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 2048, in from_pretrained
    return cls._from_pretrained(
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 2287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\transformers\models\gpt2\tokenization_gpt2_fast.py", line 134, in __init__
    super().__init__(
  File "C:\Users\ZeroTwo\Downloads\Kohya_ss-GUI-LoRA-Portable-main\kohya_ss-masterREINSTALL\kohya_ss-master\venv\lib\site-packages\transformers\tokenization_utils_fast.py", line 111, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: data did not match any variant of untagged enum ModelWrapper at line 250373 column 3

GIT Caption:
c:\Users\ZeroTwo\AppData\Local\Programs\Python\Python3109\python.exe: can't open file 'C:\\Users\\ZeroTwo\\Downloads\\Kohya_ss-GUI-LoRA-Portable-main\\kohya_ss-masterREINSTALL\\kohya_ss-master\\sd-scripts\\finetune\\make_captions_by_git.py': [Errno 2] No such file or directory
21:22:44-176572 INFO     ...captioning done
BLIP Caption:
C:\Users\ZeroTwo\AppData\Local\Programs\Python\Python3109\python.exe: can't open file 'C:\\Users\\ZeroTwo\\Downloads\\Kohya_ss-GUI-LoRA-Portable-main\\kohya_ss-masterREINSTALL\\kohya_ss-master\\sd-scripts\\finetune\\make_captions.py': [Errno 2] No such file or directory
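The GIT and BLIP caption failures above are plain "file not found" errors: `sd-scripts\finetune` is missing from the checkout, which usually happens when the repository is downloaded as a zip (zips do not include the `sd-scripts` git submodule). A quick check along these lines would confirm it (the script names are taken from the log; the function is hypothetical):

```python
# Sketch: verify the sd-scripts caption helpers exist in the checkout.
from pathlib import Path


def missing_caption_scripts(repo_root: Path) -> list[str]:
    """Return the finetune caption scripts absent from sd-scripts."""
    expected = ("make_captions.py", "make_captions_by_git.py")
    base = repo_root / "sd-scripts" / "finetune"
    return [name for name in expected if not (base / name).is_file()]


# An empty list means both scripts are present:
print(missing_caption_scripts(Path(".")))
```

If both scripts are reported missing, re-cloning with `git clone --recursive` (or running `git submodule update --init --recursive` in the existing checkout) should restore them.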

nvidia-smi:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 551.61                 Driver Version: 551.61         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...  WDDM  |   00000000:01:00.0  On |                  N/A |
|  0%   38C    P0             50W /  285W |    5845MiB /  16376MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

I am using a fresh Python 3.10.9 install and a fresh Kohya_ss setup.

Erlandsson commented Jan 4, 2025

Tip: if your (large, long-running) training works without captions (which it will), then DON'T uninstall and try again. I did that and haven't been able to get it working again in two weeks. I can sometimes train small runs (30 minutes max) if I'm lucky, but training crashes randomly, so I don't dare set up longer runs.

Also, the setup misses a lot: often I have to press install 2-5 times before everything is installed. The same when starting training: it won't load everything, and it takes repeated presses of the train button before it finally starts.

Kohya is very buggy. Or maybe it is bmaltais's code that is buggy, I don't know. I can't even get it to work in Pinokio.

Use WD14 for captioning instead, or one of the other taggers.

PS: I have also tried downloading the zip file, git cloning, and the portable build I found. None of them work.

MoeMonsuta commented Jan 11, 2025

I tried for several hours to determine the issue; it seems to be that BLIP-2 requires older Transformers, (possibly) anyio, and Gradio versions to work correctly. WD14 MOAT Tagger v2 or HWtagger work best. BLIP-1 works fine but is practically useless for tagging.

It looks like BLIP-3 (xgen-mm) is already here, according to the developer, so BLIP-2 is probably just outdated.

https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-interleave-r-v1.5

Kohya might implement it in the future. Personally, though, I think the easiest way to tag is with HWtagger or with the WD14 extension in Forge.

Again, HWtagger is a very promising contender and the developers are friendly (they even implemented a couple of ideas I had). I highly recommend giving them a try! https://github.com/HaW-Tagger/HWtagger
