Issue with Training on Custom Dataset: KeyError (v_num) #185

Open
@adityaajay33

I am trying to train a model on a custom dataset, but training fails with a KeyError on v_num when I run the following command: python yolo/lazy.py task=train dataset=custom use_wandb=False. The full error is:

** CODE **

Error executing job with overrides: ['dataset=custom', 'task=train', 'task.data.batch_size=8', 'model=v9-s', 'weight=False', 'use_wandb=False']
Traceback (most recent call last):
File "/mnt/c/Users/is231191daus/algorithms/YOLO/yolo/lazy.py", line 45, in <module>
   main()
File "/root/miniconda3/lib/python3.12/site-packages/hydra/main.py", line 94, in decorated_main
   _run_hydra(
File "/root/miniconda3/lib/python3.12/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
   _run_app(
File "/root/miniconda3/lib/python3.12/site-packages/hydra/_internal/utils.py", line 457, in _run_app
   run_and_report(
File "/root/miniconda3/lib/python3.12/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
   raise ex
File "/root/miniconda3/lib/python3.12/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
   return func()
          ^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
   lambda: hydra.run(
           ^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/hydra/_internal/hydra.py", line 132, in run
   _ = ret.return_value
       ^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/hydra/core/utils.py", line 260, in return_value
   raise self._return_value
File "/root/miniconda3/lib/python3.12/site-packages/hydra/core/utils.py", line 186, in run_job
   ret.return_value = task_function(task_cfg)
                      ^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/c/Users/is231191daus/algorithms/YOLO/yolo/lazy.py", line 35, in main
   trainer.fit(model)
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py", line 539, in fit
   call._call_and_handle_interrupt(
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/trainer/call.py", line 47, in _call_and_handle_interrupt
   return trainer_fn(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py", line 575, in _fit_impl
   self._run(model, ckpt_path=ckpt_path)
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py", line 982, in _run
   results = self._run_stage()
             ^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py", line 1026, in _run_stage
   self.fit_loop.run()
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/loops/fit_loop.py", line 216, in run
   self.advance()
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/loops/fit_loop.py", line 455, in advance
   self.epoch_loop.run(self._data_fetcher)
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 150, in run
   self.advance(data_fetcher)
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 339, in advance
   call._call_callback_hooks(trainer, "on_train_batch_end", batch_output, batch, batch_idx)
File "/root/miniconda3/lib/python3.12/site-packages/lightning/pytorch/trainer/call.py", line 222, in _call_callback_hooks
   fn(trainer, trainer.lightning_module, *args, **kwargs)
File "/root/miniconda3/lib/python3.12/site-packages/lightning_utilities/core/rank_zero.py", line 42, in wrapped_fn
   return fn(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.12/site-packages/yolo/utils/logging_utils.py", line 110, in on_train_batch_end
   metrics.pop("v_num")
KeyError: 'v_num'

CODE
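For reference, the last frame of the traceback is a plain dictionary operation: metrics.pop("v_num") raises KeyError whenever the key is absent. The snippet below is only a minimal sketch of that behaviour and of a tolerant alternative. It assumes metrics is an ordinary dict; the helper name drop_version_key and the sample values are hypothetical, and this is not the library's actual code or a confirmed fix.

```python
def drop_version_key(metrics: dict) -> dict:
    """Return a copy of `metrics` without the 'v_num' entry, if it exists.

    Hypothetical helper for illustration only; not part of the yolo package.
    """
    cleaned = dict(metrics)
    # Passing a default to pop() means a missing key is ignored
    # instead of raising KeyError.
    cleaned.pop("v_num", None)
    return cleaned


# Sample metric dictionaries (made-up values).
with_version = {"loss": 0.42, "v_num": 3}
without_version = {"loss": 0.42}

# Reproduce the failure mode from the traceback on a dict lacking 'v_num'.
try:
    dict(without_version).pop("v_num")
except KeyError as err:
    print("KeyError:", err)

# The tolerant variant handles both cases.
print(drop_version_key(with_version))
print(drop_version_key(without_version))
```

If the missing key is indeed the cause here, a pop with a default at that line would sidestep the crash, though whether v_num should be present in this configuration is a separate question.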

This is an example of my config file by the way:

[Image: screenshot of the config file]

Any help would be very much appreciated!!

System Info (please complete the following information):

  • Ubuntu 22.04
  • Python 3.12
  • PyTorch 2.6.0+cu124
  • Lightning 2.5.0
  • CUDA 11.5
  • YOLOv9-s


Labels: bug (Something isn't working)
