Description
Describe the bug
When creating a converter by following the hf_demo example, behavior differs depending on the model size. For some sizes, inference fails with:
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor.
This happens when DEFAULT_MODEL is v9-m or v9-s, and does not happen with either v9-c or v7. Tracing the problem through the yolo code yielded no results, so I am asking here in case this is an oversight on my part.
To Reproduce
This is the relevant code. No other part of my code interfaces with the yolo module. I believe copying and pasting this will reproduce the exact same result.
import torch
from omegaconf import OmegaConf

# import path assumed to match the hf_demo example
from yolo import AugmentationComposer, NMSConfig, PostProcess, create_converter, create_model

DEFAULT_MODEL = "v9-m"
IMAGE_SIZE = (640, 640)

def load_model(model_name, device):
    model_cfg = OmegaConf.load(f"./modules/yolo_config/model/{model_name}.yaml")
    model_cfg.model.auxiliary = {}
    model = create_model(model_cfg, True)
    model.to(device).eval()
    return model, model_cfg

class YoloTracker:
    component_name = 'yolo_tracker'

    def __init__(self):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model, self.model_cfg = load_model(DEFAULT_MODEL, self.device)
        self.converter = create_converter(self.model_cfg.name, self.model, self.model_cfg.anchor, IMAGE_SIZE, self.device)
        self.class_list = ['Person']  # OmegaConf.load("./modules/yolo_config/dataset/coco.yaml").class_list
        self.transform = AugmentationComposer([])
        nms_confidence = 0.5
        nms_iou = 0.5
        max_bbox = 100
        nms_config = NMSConfig(nms_confidence, nms_iou, max_bbox)
        self.post_process = PostProcess(self.converter, nms_config)
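For context, this RuntimeError is PyTorch's generic complaint that an input tensor is still a CPU FloatTensor while the layer's weights live on CUDA. A minimal, generic guard (plain PyTorch, assuming nothing about the yolo API) is to move the input onto whatever device the model's parameters are on before calling it:

```python
import torch

def to_model_device(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    # Move the input onto the same device as the model's parameters,
    # so a CUDA model never receives a plain CPU tensor.
    return x.to(next(model.parameters()).device)

# Illustration with a CPU module; the same call works unchanged for a CUDA model.
model = torch.nn.Conv2d(3, 8, kernel_size=3)
frame = torch.randn(1, 3, 640, 640)
out = model(to_model_device(model, frame))
print(out.shape)  # torch.Size([1, 8, 638, 638])
```

This does not explain why only v9-m and v9-s trigger the error, but it narrows the question to where, for those sizes, a tensor is created or left on the CPU.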
Expected behavior
I would expect the issue to arise either on all model sizes or on none of them.
System Info (please complete the following information):
- OS: Arch 6.12.7-arch1-1
- Python Version: 3.10.16
- PyTorch Version: 2.5.1+cu124
- CUDA/cuDNN/MPS Version: 12.7
- YOLO Model Version: not applicable