
Commit c996fb8 ("clean up")
1 parent: 24469e8

26 files changed: +114 / -104 lines

.gitignore (+15)

@@ -0,0 +1,15 @@
+# dot files
+.vscode
+
+# cache
+__pycache__/
+.pytest_cache
+
+# packaging
+*.egg-info/
+
+# logs
+wandb/
+
+
+

README.md (+11 / -11)

@@ -1,8 +1,8 @@
 # Rotated-Object-Detection
-Tiny ResNet inspired FPN network (<2M params) for Rotated Object Detection using 5-parameter Modulated Rotation Loss
+Novel ResNet inspired Tiny-FPN network (<2M params) for Rotated Object Detection using 5-parameter Modulated Rotation Loss
 
 ### Crux
-* **Architecture**: FPN with classification and regression heads ~1.9M paramters
+* **Architecture**: FPN with classification and regression heads ~1.9M parameters
 * **Loss Function**: 5 Parameter Modulated Rotation Loss
 * **Activation**: Mish
 * **Model Summary** - *reports/FPN_torchsummary.txt* (reports/ also contain alterantive summary with named layers in table)
@@ -13,27 +13,27 @@ Tiny ResNet inspired FPN network (<2M params) for Rotated Object Detection using
 
 
 ### Method
-* The reported results are using a ResNet inspired building block modules and a FPN.
+* The reported results are using a ResNet inspired building block modules and an FPN.
 * Separate classification and regression subnets (single FC) are used.
-* Feature map from the top of the pyramid which has the best semantic representation is used for classifcation.
-* While the finer feature map at the bottom of the pyramid which has best global representation is used for regressing the rotated bounding box. Finer details can be found in the code as comments. Code: *src/models/detector_fpn.py*
+* Feature map from the top of the pyramid that has the best semantic representation is used for classification.
+* While the finer feature map at the bottom of the pyramid that has the best global representation is used for regressing the rotated bounding box. Finer details can be found in the code as comments. Code: *src/models/detector_fpn.py*
 
 * The whole implementation is from scratch, in PyTorch. Only the method for calculating AP from PR curves is borrowed and referenced (*src/metrics.py/compute_ap*).
 
 ### Approach
-1. (Confidential) Random data generator that creates images with high noise and rotated objects (shapes) in random scales and orientations
-2. Compare reusing generated samples for each epoch vs online generating and loading
+1. Random data generator that creates images with high noise and rotated objects (shapes) in random scales and orientations. (Private)
+2. Compare reusing generated samples for each epoch VS online generating and loading
 3. Implement modulated rotated loss and other metrics
 4. Experiment with loss functions and activations
-5. Tried to replace standard conv layers with ORN (Oriented Response Network) which use rotated filters to learn orientation (Could not integrate due to technical challenges)
+5. Tried to replace standard convolutional layers with ORN (Oriented Response Network) that use rotated filters to learn orientation (Could not integrate due to technical challenges)
 6. Improve basic model to use different heads for classification and regression
-7. Try variations by removing 512 dimensional filters as they take up the most parameters (~1M)
+7. Try variations by removing 512-dimensional filters as they take up the most parameters (~1M)
 8. Add feature pyramid and experiment with different building blocks and convolutional parameters (kernel size, stride in the first layer plays a big role)
-9. Streamline parameters in the building blocks, prediciton heads to be lower than 2M
+9. Streamline parameters in the building blocks and the prediction heads to be lower than 2M
 
 * **Please find the rest of the report, with details on experiments and analysis, in** *reports/experiments.pdf*
 
 ### Opportunities to improve
-1. Use the rest of the pyramid layers for prediction (take more parameters) and have better logic to get best detection
+1. Use the rest of the pyramid layers for prediction (take more parameters) and have better logic to get the best detection
 2. Integrate ORN layers to FPN
 3. Using DenseNets with compact convolution layer configurations
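
For context, the headline technique in this README is the 5-parameter modulated rotation loss; its implementation is not shown in this commit (see src/loss.py). A minimal sketch of the modulated idea is below: the prediction is scored against both equivalent descriptions of the same rotated box (w/h swapped, yaw shifted by 90 degrees) and the cheaper one is kept. The function name, the (x, y, yaw, h, w) ordering, and the smooth-L1 base loss are assumptions here, not the repo's confirmed code:

import math
import torch
import torch.nn.functional as F

def modulated_rotation_loss(pred, target):
    """Hypothetical sketch, not the repo's src/loss.py.
    pred, target: (N, 5) tensors of (x, y, yaw, h, w), yaw in radians."""
    # loss against the target as given
    l_plain = F.smooth_l1_loss(pred, target, reduction='none').sum(dim=1)

    # the same physical box, described with h/w swapped and yaw rotated 90 deg
    x, y, yaw, h, w = target.unbind(dim=1)
    yaw_alt = torch.where(yaw > 0, yaw - math.pi / 2, yaw + math.pi / 2)
    target_alt = torch.stack((x, y, yaw_alt, w, h), dim=1)
    l_alt = F.smooth_l1_loss(pred, target_alt, reduction='none').sum(dim=1)

    # modulated loss: do not penalize the network for picking the other
    # equivalent parameterization of the same rotated box
    return torch.min(l_plain, l_alt)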

bsridatta.egg-info/PKG-INFO (-11): file deleted

bsridatta.egg-info/SOURCES.txt (-9): file deleted

bsridatta.egg-info/dependency_links.txt (-1): file deleted

bsridatta.egg-info/top_level.txt (-1): file deleted

src/dataloader.py (+3 / -1)

@@ -16,6 +16,7 @@ def train_dataloader(opt):
     print("samples - ", len(dataset))
     return loader
 
+
 def val_dataloader(opt):
     print("[INFO]: Validation dataloader called")
     dataset = Ships(n_samples=opt.val_len)
@@ -30,6 +31,7 @@ def val_dataloader(opt):
     print("samples - ", len(dataset))
     return loader
 
+
 def test_dataloader(opt):
     print("[INFO]: Test dataloader called")
     dataset = Ships(n_samples=opt.test_len)
@@ -42,4 +44,4 @@ def test_dataloader(opt):
                         sampler=sampler,
                         shuffle=shuffle)
     print("samples - ", len(dataset))
-    return loader
+    return loader
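
The hunks above show only the head and tail of each function. A hedged reconstruction of the pattern they suggest; opt.test_len and opt.pin_memory appear elsewhere in this commit, while batch_size and the sampler/shuffle defaults are assumptions:

from torch.utils.data import DataLoader

def test_dataloader_sketch(opt):
    """Sketch under assumed opt fields, not the repo's exact function."""
    print("[INFO]: Test dataloader called")
    dataset = Ships(n_samples=opt.test_len)
    sampler, shuffle = None, False              # assumed defaults for evaluation
    loader = DataLoader(dataset,
                        batch_size=opt.batch_size,   # assumed option name
                        pin_memory=opt.pin_memory,
                        sampler=sampler,
                        shuffle=shuffle)
    print("samples - ", len(dataset))
    return loader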

src/dataset.py (+8 / -5)

@@ -4,37 +4,38 @@
 import numpy as np
 from tqdm import tqdm
 
+
 class Ships(Dataset):
     """ship datasets with has ship labels
     Keyword Arguments:
         n_samples {int} -- items in dataset, here, items per epoch (default: {1000})
         pre_load {bool} -- to make all items at once and query for each step (default: {False})
-
+
     Returns:
         sample {Tenosr} -- p_ship, x, y, yaw, h, w
     """
+
     def __init__(self, n_samples=1000, pre_load=False):
         self.n_samples = n_samples
         self.pre_load = pre_load
         if pre_load:
             images, labels = make_batch(n_samples)
             # row, col -> n_channel,row,col
             inp = torch.tensor(images, dtype=torch.float32)
-            self.inps = inp[:,None, :, :]
+            self.inps = inp[:, None, :, :]
 
             # x,y,yaw,h,w -> p(ship),x,y,yaw,h,w
             target = torch.tensor(labels, dtype=torch.float32)
-            has_ship = (~torch.isnan(target[:,0])).float().reshape(-1,1)
+            has_ship = (~torch.isnan(target[:, 0])).float().reshape(-1, 1)
             self.targets = torch.cat((has_ship, target), dim=1)
 
-
     def __len__(self):
         return self.n_samples
 
     def __getitem__(self, idx):
         if self.pre_load:
             inp = self.inps[idx]
-            target = self.targets[idx]
+            target = self.targets[idx]
         else:
             image, label = make_data()
@@ -52,6 +53,8 @@ def __getitem__(self, idx):
         return sample
 
 # Used for simple experiment
+
+
 def make_batch(batch_size):
     """Used only when pre_load = True
 
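
The has_ship line reformatted above encodes "no ship" as an all-NaN label row. A small worked illustration of that conversion, with made-up label values:

import torch

target = torch.tensor([[0.5, 0.5, 0.3, 10., 4.],   # one ship: x, y, yaw, h, w
                       [float('nan')] * 5])        # no ship: NaN row
has_ship = (~torch.isnan(target[:, 0])).float().reshape(-1, 1)
targets = torch.cat((has_ship, target), dim=1)
# targets[0] -> [1.0, 0.5, 0.5, 0.3, 10.0, 4.0]
# targets[1] -> [0.0, nan, nan, nan, nan, nan]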

src/loss.py (+4 / -3)

@@ -1,12 +1,13 @@
 import torch
 
+
 def compute_loss(pred, target):
     """Compute loss handling no ships
 
     Arguments:
         pred {Tensor Batch} -- p(ship), x, y, yaw, w, h
         target {Tensor Batch} -- p(ship), x, y, yaw, w, h
-
+
     Returns:
         loss -- list of all - not averaged
     """
@@ -21,9 +22,9 @@ def compute_loss(pred, target):
 
     l_ship = torch.nn.functional.binary_cross_entropy_with_logits(
         pred[:, 0], target[:, 0], reduction='none')
-
+
     loss = l_ship + l_bbox
-
+
     return loss, l_ship, l_bbox
 
 
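
Since compute_loss returns per-sample losses ("not averaged"), the caller picks the reduction. A hypothetical consuming fragment; the batch shapes and the mean reduction are assumptions, only the call signature comes from the diff:

import torch

pred = torch.randn(8, 6, requires_grad=True)    # columns: p(ship), x, y, yaw, w, h
target = torch.cat((torch.randint(0, 2, (8, 1)).float(),  # 0/1 has-ship flags
                    torch.rand(8, 5)), dim=1)
loss, l_ship, l_bbox = compute_loss(pred, target)
loss.mean().backward()                          # reduce over the batch here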

src/main.py (+2 / -2)

@@ -73,8 +73,8 @@ def _get_argparser():
     # device
     parser.add_argument('--cuda', default=True, type=lambda x: (str(x).lower() == 'true'),
                         help='enable cuda if available')
-    parser.add_argument('--pin_memory', default = False, type = lambda x: (str(x).lower() == 'true'),
-                        help = 'pin memory to device')
+    parser.add_argument('--pin_memory', default=False, type=lambda x: (str(x).lower() == 'true'),
+                        help='pin memory to device')
     parser.add_argument('--seed', default=400, type=int,
                         help='random seed')
     return parser
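
The lambda in these arguments is the standard workaround for argparse's boolean pitfall: type=bool would turn any non-empty string, including "False", into True. A self-contained illustration of the pattern:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--pin_memory', default=False,
                    type=lambda x: (str(x).lower() == 'true'))
print(parser.parse_args(['--pin_memory', 'False']).pin_memory)  # -> False
print(parser.parse_args(['--pin_memory', 'true']).pin_memory)   # -> True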

src/metrics.py (+8 / -7)

@@ -14,7 +14,7 @@ def compute_metrics(pred, target, iou_threshold=0.7, pr_score=0.5):
     Keyword Arguments:
         iou_threshold {float} -- predicted bbox is correct if IOU > this value (default: {0.7})
         pr_score {float} -- object conf. threshold to sample precision and recall (default: {0.5})
-
+
     Returns:
         precision, recall, F1 @ pr_score, AP @ iou_threshold and mean IOU
 
@@ -43,7 +43,7 @@ def compute_metrics(pred, target, iou_threshold=0.7, pr_score=0.5):
     mean_iou = torch.mean(ious)
 
     # Calcualted Precision, Recall, F1, AP
-    #### sort by conf
+    # sort by conf
     sorted_idx = torch.argsort(conf, dim=0, descending=True)
     tp, conf = tp[sorted_idx], conf[sorted_idx]
 
@@ -59,11 +59,11 @@ def compute_metrics(pred, target, iou_threshold=0.7, pr_score=0.5):
     rec = tpc / sum_tp_fn
 
     # One P, R at conf threshold
-    #### -1 as conf decreases along x
+    # -1 as conf decreases along x
     p = torch.tensor(np.interp(-pr_score, -conf.cpu(), prec[:, 0].cpu()))
     r = torch.tensor(np.interp(-pr_score, -conf.cpu(), rec[:, 0].cpu()))
 
-    ap = compute_ap(list(rec),list(prec))
+    ap = compute_ap(list(rec), list(prec))
 
     f1 = 2 * p * r / (p + r + eps)
 
@@ -72,10 +72,10 @@
 
 def compute_ap(recall, precision):
     """ Compute the average precision, given the recall and precision curves.
-
+
     Code Source:
         unmodified - https://github.com/rbgirshick/py-faster-rcnn.
-
+
     Reference:
         https://github.com/ultralytics/yolov3/blob/e0a5a6b411cca45f0d64aa932abffbf3c99b92b3/test.py
 
@@ -87,7 +87,8 @@ def compute_ap(recall, precision):
     """
 
     # Append sentinel values to beginning and end
-    mrec = np.concatenate(([0.], recall, [min(recall[-1] + 1E-3, 1.)])).astype('float')
+    mrec = np.concatenate(
+        ([0.], recall, [min(recall[-1] + 1E-3, 1.)])).astype('float')
     mpre = np.concatenate(([0.], precision, [0.]))
 
     # Compute the precision envelope
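
The hunk stops at the precision-envelope comment. The borrowed py-faster-rcnn AP computation typically continues by making the precision curve non-increasing and integrating it where recall changes; a sketch of that standard continuation follows (the repo's exact remaining lines are not shown in this diff):

import numpy as np

def compute_ap_sketch(recall, precision):
    # sentinels, as in the hunk above
    mrec = np.concatenate(([0.], recall, [min(recall[-1] + 1E-3, 1.)]))
    mpre = np.concatenate(([0.], precision, [0.]))
    # precision envelope: make precision non-increasing, right to left
    mpre = np.flip(np.maximum.accumulate(np.flip(mpre)))
    # integrate p(r) dr over the points where recall changes
    i = np.where(mrec[1:] != mrec[:-1])[0]
    return np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
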
7 binary files changed (contents not shown): -302 Bytes, -1.88 KB, -2.23 KB, (no size shown), -1.23 KB, -2.03 KB, -2.21 KB

src/models/detect_orn.py (+1)

@@ -11,6 +11,7 @@ class Detector_ORN(nn.Module):
     Advatages - better IOU, fewer parameters, faster convergence, should be ideal for the task
     ORN paper - https://arxiv.org/pdf/1701.01833.pdf
     """
+
     def __init__(self):
         super(Detector_ORN, self).__init__()
         self.image_size = 200

src/models/detector.py (+1 / -1)

@@ -35,7 +35,7 @@ def _build_features(self, n_filter, activ):
     Arguments:
         n_filter {list} -- number of filter for each conv block
         activ {nn.Module} -- activation function to be used
-
+
     Returns:
         feature extraction module
     """

src/models/mish.py (+4 / -2)

@@ -1,6 +1,7 @@
 import torch
 import torch.nn.functional as F
 
+
 @torch.jit.script
 def mish(input):
     '''
@@ -10,19 +11,20 @@ def mish(input):
     '''
     return input * torch.tanh(F.softplus(input))
 
+
 class Mish(torch.nn.Module):
     '''
     Source: https://github.com/digantamisra98/Mish/blob/master/Mish/Torch/mish.py
-
+
     Applies the mish function element-wise:
     Shape:
         - Input: (N, *) where * means, any number of additional
           dimensions
         - Output: (N, *), same shape as the input
     '''
+
     def __init__(self):
         super().__init__()
 
     def forward(self, input):
         return mish(input)
-
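
A quick usage check for the module above, with illustrative input values:

import torch

act = Mish()
x = torch.linspace(-3., 3., 5)
print(act(x))   # mish(x) = x * tanh(softplus(x)), smooth and non-monotonic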
