Description
Hi guys,
I recently started working with computer vision as part of my thesis. Initially, I achieved good results using the Ultralytics YOLO model, but its license poses a challenge for future use cases. After some research, I believe this YOLO repository is the best fit for my needs. I also tried YOLOX, DAMO-YOLO, and YOLO-NAS but couldn’t get them to work.
I now have some questions and problems regarding the training process and how to correctly load a trained model for predictions. I would be very happy if someone could give me some ideas on how to solve them.
I am training on a custom dataset in the normalized YOLO labeling format (c, x, y, w, h; an example label line is shown after the configs below), with the following folder structure and configs:
#######################################
dataset/
├── images
│   ├── train
│   └── val
├── labels
│   ├── train
│   └── val
└── classes.txt
#######################################
dataset.yaml:
path: dataset_dog/
train: train
validation: val
class_num: 4
class_list: ["Bicycle","Car","Cat","Dog"]
#######################################
config.yaml:
hydra:
  run:
    dir: runs
name: v9-train-dog
defaults:
  - _self_
  - task: train
  - dataset: yolo_dog
  - model: v9-m
  - general
#######################################
general.yaml:
device: cuda
cpu_num: 8
image_size: [640, 640]
out_path: runs
exist_ok: True
lucky_number: 10
use_wandb: False
use_tensorboard: True
weight: yolo/v9-m.pt
#######################################
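For reference, each .txt file in labels/train and labels/val contains one object per line in the normalized format described above, for example (values purely illustrative):
3 0.481 0.530 0.240 0.315
i.e. class index 3 ("Dog" in my class_list), followed by the box center x, center y, width, and height relative to the image size.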
While training on my custom dataset, the training metrics look like this:

(screenshot of training metrics)
After training, the last checkpoint is saved in the runs folder. I am currently loading the model as suggested in lazy.py, but I do not get any predictions on my pictures. I am not sure whether I am doing something wrong in training or prediction, or how to validate whether the model was trained correctly.
The configs for predictions look like this:
#######################################
config_inference.yaml:
hydra:
  run:
    dir: runs
name: v9-inf
defaults:
  - _self_
  - task: inference
  - dataset: yolo_dog
  - model: v9-m
  - general_inference
#######################################
general_inference.yaml:
device: cuda
cpu_num: 8
image_size: [640, 640]
out_path: runs
exist_ok: True
lucky_number: 10
use_wandb: False
use_tensorboard: True
weight: yolo/v9-m-custom.ckpt
#######################################
What is the correct way to load the model for predictions? I want to set up a pipeline in which I can run predictions in real time, inside a loop, on camera (not webcam) images in OpenCV format, and get results I can work with (which class was predicted, etc., like in Ultralytics).
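To make the question concrete, here is a rough sketch of the loop I am aiming for. load_model and postprocess are placeholder names I made up for whatever this repository's actual checkpoint-loading and decoding/NMS functions are, and the preprocessing (BGR to RGB, resize to 640x640, normalize) is only my assumption:
#######################################
inference_sketch.py:
import cv2
import torch

def load_model(ckpt_path):
    # placeholder: however this repo expects a trained .ckpt to be restored
    raise NotImplementedError

def postprocess(raw_output):
    # placeholder: confidence filtering + NMS, yielding (class_id, score, box) tuples
    raise NotImplementedError

model = load_model("yolo/v9-m-custom.ckpt")
model.eval().to("cuda")

cap = cv2.VideoCapture(0)  # stand-in; my frames come from an industrial camera, not a webcam
while True:
    ok, frame = cap.read()  # frame is a BGR numpy array (OpenCV format)
    if not ok:
        break
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (640, 640))
    tensor = torch.from_numpy(img).permute(2, 0, 1).float().div(255).unsqueeze(0).to("cuda")
    with torch.no_grad():
        raw = model(tensor)
    for class_id, score, box in postprocess(raw):
        print(class_id, score, box)  # e.g. draw boxes or trigger downstream logic here
#######################################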