Recover support for object detection #257

Open
dpascualhe opened this issue Jan 17, 2025 · 27 comments

@dpascualhe
Collaborator

dpascualhe commented Jan 17, 2025

DetectionMetrics was originally built as a tool for evaluating object detection models (v1.0.0). Recover that functionality, adapted to our new architecture and toolset.

@Shyam-duba

Hey, I can work on this. I am interested!

@dpascualhe
Collaborator Author

Hi @Shyam-duba, thanks for your interest and sorry for the late reply!

This issue involves a pretty big chunk of work. My advice for you is to first follow the installation instructions and try out our notebook tutorial for image segmentation (examples/tutorial_image_segmentation.ipynb). Let us know if you find any issue during the process!

Once you are a bit more comfortable with the functionality that the tool provides, you can explore the current architecture and make a proposal for how object detection evaluation could be integrated. Let's keep the discussion going!

@SakhinetiPraveena
Contributor

Hi @dpascualhe ,
I started exploring this problem statement and am now comfortable with the functionality the tool provides. To understand the problem statement better: which datasets do you want to support for object detection? Do you want to recover all the datasets that were supported in v1?
I see that this involves a pretty big chunk of work, so could you guide me to a starting point? Thanks in advance!

@dpascualhe
Collaborator Author

Hi @SakhinetiPraveena ! As of now we don't need to recover all the compatibility that v1 offered. We could start with some popular dataset like COCO and some pretrained model like Mask R-CNN (available in torchvision). A very broad roadmap would be:

  • Downloading and testing a popular dataset and model (see the smoke-test sketch below).
  • Defining the ImageDetectionModel and ImageDetectionDataset classes (similarly to the ones already defined for segmentation, and avoiding code duplication!).
  • Implementing basic metrics for object detection.
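
For the first step, a minimal sketch along these lines should do as a smoke test (paths are placeholders; assumes torchvision ≥ 0.13 and pycocotools are installed):

```python
# Minimal smoke test: pretrained Mask R-CNN on a COCO val2017 image.
# Paths are placeholders; adjust them to wherever COCO is unpacked.
import torch
import torchvision
from torchvision.datasets import CocoDetection
from torchvision.transforms import functional as F

# Pretrained detector with COCO weights (torchvision >= 0.13 API)
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

dataset = CocoDetection(
    root="coco/val2017",  # placeholder path
    annFile="coco/annotations/instances_val2017.json",  # placeholder path
)

image, target = dataset[0]  # PIL image + list of COCO annotation dicts
with torch.no_grad():
    # Returns a dict with 'boxes', 'labels', 'scores', and 'masks'
    preds = model([F.to_tensor(image)])[0]

print(preds["boxes"].shape, preds["scores"][:5])
```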

@Aadik1ng

Hi @dpascualhe,

Would it be possible for me to take on this task? I have a relevant background in machine learning and computer vision, including experience with PyTorch, TensorFlow, and object detection models. I'd love to contribute by restoring the object detection evaluation functionality from DetectionMetrics v1.0.0 while aligning it with the current architecture and toolset.

If you agree, I can share a detailed roadmap outlining the steps to achieve this, including:

Using a popular dataset like COCO and a pretrained model such as Mask R-CNN (from torchvision).

Defining ImageDetectionModel and ImageDetectionDataset classes while ensuring minimal code duplication with the existing segmentation implementation.

Implementing fundamental object detection metrics.

Once the task is complete, I will raise a PR for review. Looking forward to your thoughts!

@dpascualhe
Collaborator Author

Hi @Aadik1ng 👋
Thanks for your proposal! For now, let's wait to see if @SakhinetiPraveena has made any progress so far 😉

In the meantime, a good starting point could be developing some new notebook tutorials. For instance, adding a new tutorial focusing on our computational cost estimation tool would be a great contribution! (#245)

@SakhinetiPraveena
Contributor

SakhinetiPraveena commented Mar 30, 2025

Hi @dpascualhe ,
Thanks a lot for assigning this issue to me. I wanted to give you a quick update on my progress. As you suggested, I was able to download the COCO dataset and test it with Mask R-CNN. Right now I am working on defining the ImageDetectionModel and ImageDetectionDataset classes, trying to get a better understanding of the existing image segmentation code so that I can avoid any code duplication.

I'm also working on my GSOC'25 proposal, I look forward to the opportunity to collaborate and work under your guidance. Excited for what we can achieve as a team!

@dpascualhe
Collaborator Author

Great @SakhinetiPraveena ! 👍

@rudrakatkar
Contributor

Hi @dpascualhe and @SakhinetiPraveena 👋

I have been following this discussion and I am really interested in contributing to this effort. It’s great to see the progress being made!

I noticed that object detection metrics often include additional evaluation aspects like small/medium/large object analysis (similar to COCO’s evaluation metrics). Would it be useful to incorporate these into the planned implementation? I’d be happy to help research and implement such improvements.

Additionally, is there any aspect of the ImageDetectionModel or ImageDetectionDataset implementation where I could assist to speed up development while ensuring minimal code duplication?

Looking forward to your thoughts!

@SakhinetiPraveena
Contributor

SakhinetiPraveena commented Apr 2, 2025

Hi @rudrakatkar , I really appreciate your enthusiasm and would love to collaborate with you. We can probably collectively decide on the approach (to maintain code consistency and avoid issues) and share the work among us.

So I have a couple of approaches in mind to define ImageDetectionModel. One way is to create a base DetectionModel class similar to SegmentationModel and create an ImageDetectionModel class similar to ImageSegmentationModel, where we implement object detection logic.

But that seemed like a redundant way of doing things. Instead of creating a separate DetectionModel base class, we can rename SegmentationModel (to, say, VisionModel) so it serves as a base class for both segmentation and detection. We can introduce a task_type attribute (with values "segmentation" or "detection") to guide further implementations.
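
Roughly what I have in mind for the second approach (all names and signatures are illustrative, not final):

```python
# Rough sketch of the shared-base idea; names and signatures are illustrative.
from abc import ABC, abstractmethod

class VisionModel(ABC):
    """Single base class for both segmentation and detection models."""

    def __init__(self, model, task_type: str, model_cfg: dict):
        assert task_type in ("segmentation", "detection")
        self.model = model
        self.task_type = task_type
        self.model_cfg = model_cfg

    @abstractmethod
    def inference(self, data):
        """Task-specific forward pass."""

    @abstractmethod
    def eval(self, dataset):
        """Task-specific evaluation loop and metrics."""
```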

Please let me know your thoughts on this, @dpascualhe

@rudrakatkar
Contributor

Hey @SakhinetiPraveena,

Really appreciate your response! Joining this would be my pleasure.

I think you're right: merging both models into a shared VisionModel is a good idea, with no duplication and no mess. task_type looks like a good attribute for differentiating between segmentation and detection.

One thing to consider: does this approach give us enough flexibility for task-specific differences like loss functions and post-processing? Or maybe we could define some abstract methods in VisionModel that ImageSegmentationModel and ImageDetectionModel would implement?

Would it be appropriate to create a sub-issue for refactoring SegmentationModel into VisionModel? That would allow us to track this change independently, keeping the main issue focused on object detection evaluation.

Let me know what you think @dpascualhe

@dpascualhe
Collaborator Author

Hi, great to see that you are brainstorming together @SakhinetiPraveena @rudrakatkar !

I think building a parent DetectionModel class similar to SegmentationModel makes sense, though. That way, we can have an ImageDetectionModel and a LidarDetectionModel inherit from it. Then we can build an even higher-level PerceptionModel from which both detection and segmentation models inherit common functionality and attributes. What do you think?
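
In rough strokes, something like this (purely illustrative; the real classes would mirror what we already do for segmentation):

```python
# Illustrative hierarchy only; actual class contents would mirror the
# existing segmentation implementation.
from abc import ABC, abstractmethod

class PerceptionModel(ABC):
    """Common functionality/attributes shared by all models."""

class SegmentationModel(PerceptionModel):
    """Existing base for ImageSegmentationModel / LidarSegmentationModel."""

class DetectionModel(PerceptionModel):
    """New base for detection models."""

    @abstractmethod
    def eval(self, dataset):
        """Detection-specific evaluation loop."""

class ImageDetectionModel(DetectionModel):
    """Image detection (e.g., wrapping a torchvision Mask R-CNN)."""

    def eval(self, dataset):
        raise NotImplementedError  # per-sample inference + metrics go here

class LidarDetectionModel(DetectionModel):
    """Open door for future LiDAR detection support."""
```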

@SakhinetiPraveena
Contributor

SakhinetiPraveena commented Apr 4, 2025

Thanks for the feedback @dpascualhe, that sounds like a better way to do it. I will refactor my current progress accordingly. And for uniformity, it would be better to use the same approach for ImageDetectionDataset as well, right?

@RohanDobriyal

Hi @dpascualhe and team 👋

I've been following this fantastic discussion and am very inspired by how collaborative and forward-thinking it’s been! I’d love to contribute to this project and am currently writing my GSoC proposal around DetectionMetrics.

Alongside the current efforts on evaluation and architecture refactoring, I’d like to bring in another perspective: interactive evaluation visualization and comparative analysis tools.

Here’s what I’m thinking:

  • Evaluation Dashboard: Develop interactive dashboards (maybe with Plotly or Streamlit; toy mockup below) to visualize key metrics (Precision/Recall vs IoU, class-wise mAP, confusion matrix, object size distribution, etc.)
  • Model Comparison: Build a utility to compare two or more models on the same dataset with metric and image-based comparisons.
  • Detection Explorer: A visual sample viewer that highlights detections vs ground truth with filtering by confidence, class, size, or error type.
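
To make the first idea concrete, here's a toy Streamlit mockup (the metric values are made-up placeholders, just to show the shape):

```python
# Toy Streamlit mockup of the dashboard idea; the metric values below are
# made-up placeholders, not real evaluation results.
import pandas as pd
import streamlit as st

results = pd.DataFrame(
    {"class": ["person", "car", "dog"], "AP@0.5": [0.72, 0.65, 0.58]}  # placeholders
).set_index("class")

st.title("DetectionMetrics evaluation dashboard (mockup)")
iou = st.slider("IoU threshold", min_value=0.5, max_value=0.95, value=0.5, step=0.05)
st.caption(f"Per-class AP at IoU >= {iou} (static placeholder data)")
st.bar_chart(results)
```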

Would these ideas be aligned with the roadmap, and could they complement the current work on metrics and class structure?

Looking forward to your thoughts!

@rudrakatkar
Contributor

rudrakatkar commented Apr 6, 2025

Hello, @dpascualhe and @SakhinetiPraveena 👋

I wanted to provide a brief update from my end. To verify the current metrics pipeline, I quickly tested it with a sample image that I downloaded locally. The updated demo_run.py was used to evaluate the image, and although the detection model produced logical results, the metrics (precision, recall, and F1) all came back at zero. Poor overlap between predictions and ground truth is probably the cause of this, so I think the problem should go away on its own once we have a suitable dataset evaluation in place (with realistic ground truth).
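
For what it's worth, this is the kind of sanity check that points at the overlap problem (the boxes below are placeholders):

```python
# Sanity check for the zero-metrics symptom: if no predicted box reaches the
# IoU threshold against any ground-truth box, precision/recall/F1 all drop to 0.
import torch
from torchvision.ops import box_iou

pred_boxes = torch.tensor([[10.0, 10.0, 50.0, 50.0]])    # placeholder prediction (xyxy)
gt_boxes = torch.tensor([[200.0, 200.0, 260.0, 280.0]])  # placeholder ground truth (xyxy)

iou = box_iou(pred_boxes, gt_boxes)  # (num_preds, num_gts) pairwise IoU matrix
matched = (iou >= 0.5).any(dim=1)    # true positives at IoU >= 0.5
print(iou, matched)                  # no overlap here -> zero metrics
```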

I simply tested the current configuration to see how it works; I haven't changed anything in the metrics module or other areas of the repository.

Since the issue is already assigned, I didn’t want to raise a PR directly, but thought it might be useful to share this progress here.

I can raise the PR if you allow,
Thanks!

@dpascualhe
Collaborator Author

Hi @RohanDobriyal 👋

Thanks for your interest in DetectionMetrics! Your proposal for interactive evaluation would fit better with GUI development. You can check out issue #243 in that regard.

@dpascualhe
Collaborator Author

Thanks for the feedback @dpascualhe, that sounds like a better way to do it. I will refactor my current progress accordingly. And for uniformity, it would be better to use the same approach for ImageDetectionDataset as well, right?

@SakhinetiPraveena , sure! Same logic would apply for defining the dataset classes.

@dpascualhe
Collaborator Author

Hello, @dpascualhe and @SakhinetiPraveena 👋

I wanted to provide a brief update from my end. To verify the current metrics pipeline, I quickly tested it with a sample image that I downloaded locally. The updated demo_run.py was used to evaluate the image, and although the detection model produced logical results, the metrics (precision, recall, and F1) all came back at zero. Poor overlap between predictions and ground truth is probably the cause of this, so I think the problem should go away on its own once we have a suitable dataset evaluation in place (with realistic ground truth).

I simply tested the current configuration to see how it works; I haven't changed anything in the metrics module or other areas of the repository.

Since the issue is already assigned, I didn’t want to raise a PR directly, but thought it might be useful to share this progress here.

I can raise the PR if you allow, Thanks!

Hi @rudrakatkar ! I’m not sure I fully understand what you’ve been testing. How were you able to run an object detection model with the current pipeline? As it stands, inference, evaluation, and metric computation are quite tightly coupled with the segmentation task. In any case, feel free to open a PR and mark it as a draft. That will help us better understand what you’re working on.

@rudrakatkar
Contributor

Hi @dpascualhe, thanks for your response!

You're absolutely right, the current pipeline is indeed tailored to segmentation tasks. What I did was create a separate modular flow that handles object detection inference (which I forgot to mention earlier, sorry for the confusion 😅).

This includes:

  • A dataset loader for object detection (COCO-style)
  • A wrapper for torchvision-based detection models (rough shape sketched below)
  • A separate evaluator and metrics module
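
The wrapper looks roughly like this (simplified; the names are mine, not final):

```python
# Simplified shape of the torchvision wrapper from my flow; names are not final.
import torch
from torchvision.transforms import functional as F

class TorchvisionDetectionWrapper:
    """Runs a torchvision detector and filters outputs for the evaluator."""

    def __init__(self, model, score_threshold: float = 0.5):
        self.model = model.eval()
        self.score_threshold = score_threshold

    @torch.no_grad()
    def __call__(self, image):
        # PIL image in; dict of boxes/labels/scores (and masks) out
        preds = self.model([F.to_tensor(image)])[0]
        keep = preds["scores"] >= self.score_threshold
        return {k: v[keep] for k, v in preds.items()}
```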

Thanks for allowing me to raise a draft PR; I have created one. Looking forward to your feedback.

@dpascualhe
Collaborator Author

Hi @rudrakatkar

I've taken a look at your draft. As discussed previously, we would be inheriting SegmentationModel and DetectionModel from a parent PerceptionModel class, and DetectionModel would then serve as a parent class for ImageDetectionModel (leaving an open door for a LidarDetectionModel, as is done for segmentation). Check out model.py. The same goes for the dataset classes (check out dataset.py and the particular classes for each dataset, e.g. rellis3d, goose, etc.). This is more aligned with @SakhinetiPraveena's vision.
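
For the datasets, the same pattern would look roughly like this (illustrative names only, not the actual contents of dataset.py):

```python
# Illustrative dataset hierarchy mirroring the model classes; not actual code.
from abc import ABC, abstractmethod
import pandas as pd

class PerceptionDataset(ABC):
    """Common functionality, e.g. a dataframe of sample/annotation paths."""

    def __init__(self, dataset: pd.DataFrame, ontology: dict):
        self.dataset = dataset
        self.ontology = ontology

class DetectionDataset(PerceptionDataset):
    """Detection base: annotations are boxes + labels instead of masks."""

    @abstractmethod
    def read_annotation(self, fname: str):
        """Return boxes and labels for one sample."""

class ImageDetectionDataset(DetectionDataset):
    """Concrete image flavor; format-specific readers (e.g., COCO) go here."""

    def read_annotation(self, fname: str):
        raise NotImplementedError
```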

@SakhinetiPraveena
Contributor

Yes, I agree with @dpascualhe. I think it's best to follow the proposed structure. We can plan and divide the work among ourselves if you're interested, @rudrakatkar.

As for an update on my progress: I’m currently working on refactoring model.py. I was focused on my GSoC proposal and the overall project implementation last week, which caused a bit of a delay. I should be able to raise a draft PR over the weekend.

@jayzalani
Contributor

jayzalani commented Apr 11, 2025

That's great, @SakhinetiPraveena! I'm also interested in implementing this object detection module; if needed, you can share the work and we can build it together. I am currently researching the GUI and have started building some of the basic structure needed for it.

I have read and followed this whole conversation and am familiar with the structure that needs to be implemented.

@SakhinetiPraveena
Contributor

Hi @dpascualhe , I have raised a draft pull request where I have refactored the segmentation model class structure for now. Please have a look when you have a moment and share any feedback. Thanks in advance!

@jayzalani
Contributor

That's great work, @SakhinetiPraveena, I'd like to collaborate with you. I am currently working on the dataset handler classes for object detection. While reviewing your draft, I saw that you have created a PerceptionModel class that introduces a standard hierarchy: it adds a more generic base class, makes SegmentationModel inherit from it, and moves some common functionality to the parent class. If you're not working on the dataset classes, I would like to continue my work on them; if you are, please let me know and I can work on another aspect of the object detection functionality instead. @dpascualhe, can you also assign this issue to me?

@dpascualhe
Collaborator Author

I've divided this issue into multiple sub-issues to ease collaboration. As of now, @SakhinetiPraveena, I've linked your draft PR to #309. Let us know if you are already working on the dataset classes; otherwise, I'll assign them to @jayzalani. If @SakhinetiPraveena is already working on that, @jayzalani can work on #311. Does that sound good?

@rudrakatkar
Contributor

Hello @dpascualhe and @SakhinetiPraveena , thanks for organizing everything so clearly.

I’d be happy to align my previous object detection work with the updated structure based on PerceptionModel. I have reviewed the new draft PR. If it's alright, I’d like to contribute to #310 (dataset classes), but I’m also open to helping with #311 if needed. Just let me know what works best.

@SakhinetiPraveena
Contributor

Hi @dpascualhe,

I haven't started refactoring the dataset classes yet, so you can go ahead and assign that to either one of them. I will take up other issues once I am done refactoring the model classes.
