Gripper segment custom models #5
base: main
Changes from 15 commits
6f3fef4
1fd0029
f7bb54e
16958eb
288782f
6564dbb
3a39835
9e84dfa
9118391
4a7843e
622ae73
c6199f5
a97de4f
7d4d31b
8abf814
5e6c8b0
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,3 +5,4 @@ checkpoints | |
| __pycache__ | ||
| outputs | ||
| .vscode | ||
| ~/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -108,6 +108,30 @@ automatic annotations using DINO work well in many cases but can struggle with t | |
| gripper masks. All downstream object tracking and reconstruction results are sensitive | ||
| to the segmentation quality and thus spending a bit of effort here might be worthwhile. | ||
|
|
||
| ##### Gripper masking with fine tuned models | ||
|
Owner
I think it would make sense to also make this an option for the run_asset_generation.py pipeline. Maybe add a
||
| We provide fine tuned networks for SAM2 and GroundingDINO for the segmentation and annotation | ||
| of the gripper used in our provided dataset which can be downloaded from [here](https://mitprod-my.sharepoint.com/personal/nepfaff_mit_edu/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fnepfaff%5Fmit%5Fedu%2FDocuments%2Fscalable%5Freal2sim%5Fmodel%5Fweights&ga=1). | ||
|
|
||
| Please put the downloaded checkpoint files in the `./checkpoints` directory. | ||
|
Owner
:nit: That directory doesn't currently exist. It might make sense to create it and put a
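One way to make the missing directory less of a stumbling block is to create it on demand and report which checkpoint files are still absent. The sketch below is only an illustration: the helper name `missing_checkpoints` and the file names in the usage line are hypothetical and do not come from this PR.

```python
from pathlib import Path
from typing import Iterable, List


def missing_checkpoints(checkpoint_dir: str, expected_names: Iterable[str]) -> List[str]:
    """Create checkpoint_dir if needed and list expected files not found in it."""
    directory = Path(checkpoint_dir)
    # Creating the directory up front means a fresh clone has a place
    # to put the downloaded weights.
    directory.mkdir(parents=True, exist_ok=True)
    return [name for name in expected_names if not (directory / name).is_file()]
```

A caller could run `missing_checkpoints("./checkpoints", ["sam2_gripper.pt", "dino_gripper.pth"])` (placeholder file names) before segmentation and print a download hint if the returned list is non-empty.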
||
| We used mmdetection's implementation to fine tune Grounding DINO. Please see the | ||
| [mmdetection Official Github](https://github.com/open-mmlab/mmdetection/tree/main) | ||
| for installation instructions. | ||
|
||
|
|
||
| When `--txt_prompt` is set to `gripper`, the segmentation script will use the gripper fine tuned | ||
| models for annotation and segmentation. | ||
|
|
||
| To fine tune your own object detection model for your gripper, see [these instructions](https://github.com/open-mmlab/mmdetection/blob/main/configs/grounding_dino/README.md) | ||
| from the mmdetection Official Github. | ||
|
|
||
| To fine tune your own segmentation model for your gripper, see the [training instructions](https://github.com/facebookresearch/sam2/blob/main/training/README.md) from the | ||
| SAM2 Official Github. | ||
|
|
||
| An example of segmentation failure on the gripper with default models: \ | ||
|
Owner
Could you elaborate a bit here on why it can fail without fine-tuning? It could be worth mentioning that our particular gripper seems to be out of distribution for SAM2, and thus it loses track of it in long videos. This can also be solved by re-prompting it after failure (segmenting the video in parts), but fine-tuning removes this extra step.
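The "segment the video in parts" workaround can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: `prompt_fn` and `propagate_fn` are hypothetical callables standing in for the detector and the video segmenter, and a fixed chunk size is assumed in place of actual failure detection.

```python
from typing import Any, Callable, Dict, List, Tuple


def split_into_chunks(num_frames: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges that cover [0, num_frames)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_frames))
        for start in range(0, num_frames, chunk_size)
    ]


def segment_in_chunks(
    num_frames: int,
    chunk_size: int,
    prompt_fn: Callable[[int], Any],
    propagate_fn: Callable[[int, int, Any], Dict[int, Any]],
) -> Dict[int, Any]:
    """Re-prompt at the first frame of every chunk and merge the per-frame masks."""
    masks: Dict[int, Any] = {}
    for start, end in split_into_chunks(num_frames, chunk_size):
        # A fresh prompt per chunk keeps one tracking failure from
        # carrying over into the rest of the video.
        prompt = prompt_fn(start)
        masks.update(propagate_fn(start, end, prompt))
    return masks
```

Fine-tuned models avoid this bookkeeping because a single prompt on the first frame is then enough for the whole video.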
||
| <img src="assets/mask_sam2_default.png" width="200"> | ||
|
||
|
|
||
| Gripper segmentation on the same image with custom models: \ | ||
| <img src="assets/mask_sam2_custom.png" width="200"> | ||
|
|
||
| ### Submodules | ||
|
|
||
| #### robot_payload_id | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| # This configuration file is taken from https://github.com/open-mmlab/mmdetection/tree/main/configs | ||
|
|
||
| # dataset settings | ||
| dataset_type = "CocoDataset" | ||
| data_root = "data/coco/" | ||
|
|
||
| # Example to use different file client | ||
| # Method 1: simply set the data root and let the file I/O module | ||
| # automatically infer from prefix (not support LMDB and Memcache yet) | ||
|
|
||
| # data_root = 's3://openmmlab/datasets/detection/coco/' | ||
|
|
||
| # Method 2: Use `backend_args`, `file_client_args` in versions before 3.0.0rc6 | ||
| # backend_args = dict( | ||
| # backend='petrel', | ||
| # path_mapping=dict({ | ||
| # './data/': 's3://openmmlab/datasets/detection/', | ||
| # 'data/': 's3://openmmlab/datasets/detection/' | ||
| # })) | ||
| backend_args = None | ||
|
|
||
| train_pipeline = [ | ||
| dict(type="LoadImageFromFile", backend_args=backend_args), | ||
| dict(type="LoadAnnotations", with_bbox=True), | ||
| dict(type="Resize", scale=(1333, 800), keep_ratio=True), | ||
| dict(type="RandomFlip", prob=0.5), | ||
| dict(type="PackDetInputs"), | ||
| ] | ||
| test_pipeline = [ | ||
| dict(type="LoadImageFromFile", backend_args=backend_args), | ||
| dict(type="Resize", scale=(1333, 800), keep_ratio=True), | ||
| # If you don't have a gt annotation, delete the pipeline | ||
| dict(type="LoadAnnotations", with_bbox=True), | ||
| dict( | ||
| type="PackDetInputs", | ||
| meta_keys=("img_id", "img_path", "ori_shape", "img_shape", "scale_factor"), | ||
| ), | ||
| ] | ||
| train_dataloader = dict( | ||
| batch_size=2, | ||
| num_workers=2, | ||
| persistent_workers=True, | ||
| sampler=dict(type="DefaultSampler", shuffle=True), | ||
| batch_sampler=dict(type="AspectRatioBatchSampler"), | ||
| dataset=dict( | ||
| type=dataset_type, | ||
| data_root=data_root, | ||
| ann_file="annotations/instances_train2017.json", | ||
| data_prefix=dict(img="train2017/"), | ||
| filter_cfg=dict(filter_empty_gt=True, min_size=32), | ||
| pipeline=train_pipeline, | ||
| backend_args=backend_args, | ||
| ), | ||
| ) | ||
| val_dataloader = dict( | ||
| batch_size=1, | ||
| num_workers=2, | ||
| persistent_workers=True, | ||
| drop_last=False, | ||
| sampler=dict(type="DefaultSampler", shuffle=False), | ||
| dataset=dict( | ||
| type=dataset_type, | ||
| data_root=data_root, | ||
| ann_file="annotations/instances_val2017.json", | ||
| data_prefix=dict(img="val2017/"), | ||
| test_mode=True, | ||
| pipeline=test_pipeline, | ||
| backend_args=backend_args, | ||
| ), | ||
| ) | ||
| test_dataloader = val_dataloader | ||
|
|
||
| val_evaluator = dict( | ||
| type="CocoMetric", | ||
| ann_file=data_root + "annotations/instances_val2017.json", | ||
| metric="bbox", | ||
| format_only=False, | ||
| backend_args=backend_args, | ||
| ) | ||
| test_evaluator = val_evaluator | ||
|
|
||
| # inference on test dataset and | ||
| # format the output results for submission. | ||
| # test_dataloader = dict( | ||
| # batch_size=1, | ||
| # num_workers=2, | ||
| # persistent_workers=True, | ||
| # drop_last=False, | ||
| # sampler=dict(type='DefaultSampler', shuffle=False), | ||
| # dataset=dict( | ||
| # type=dataset_type, | ||
| # data_root=data_root, | ||
| # ann_file=data_root + 'annotations/image_info_test-dev2017.json', | ||
| # data_prefix=dict(img='test2017/'), | ||
| # test_mode=True, | ||
| # pipeline=test_pipeline)) | ||
| # test_evaluator = dict( | ||
| # type='CocoMetric', | ||
| # metric='bbox', | ||
| # format_only=True, | ||
| # ann_file=data_root + 'annotations/image_info_test-dev2017.json', | ||
| # outfile_prefix='./work_dirs/coco_detection/test') |
|
Owner
:nit: I find it a bit weird to have python files inside a
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # This configuration file is taken from https://github.com/open-mmlab/mmdetection/tree/main/configs | ||
|
|
||
| default_scope = "mmdet" | ||
|
|
||
| default_hooks = dict( | ||
| timer=dict(type="IterTimerHook"), | ||
| logger=dict(type="LoggerHook", interval=50), | ||
| param_scheduler=dict(type="ParamSchedulerHook"), | ||
| checkpoint=dict(type="CheckpointHook", interval=1), | ||
| sampler_seed=dict(type="DistSamplerSeedHook"), | ||
| visualization=dict(type="DetVisualizationHook"), | ||
| ) | ||
|
|
||
| env_cfg = dict( | ||
| cudnn_benchmark=False, | ||
| mp_cfg=dict(mp_start_method="fork", opencv_num_threads=0), | ||
| dist_cfg=dict(backend="nccl"), | ||
| ) | ||
|
|
||
| vis_backends = [dict(type="LocalVisBackend")] | ||
| visualizer = dict( | ||
| type="DetLocalVisualizer", vis_backends=vis_backends, name="visualizer" | ||
| ) | ||
| log_processor = dict(type="LogProcessor", window_size=50, by_epoch=True) | ||
|
|
||
| log_level = "INFO" | ||
| load_from = None | ||
| resume = False |
Is this needed?
nope, forgot to remove that after adding it in from a local git annoyance