
Commit

Update the code comments
shashankboosi committed Dec 1, 2019
1 parent bdb0563 commit d565860
Showing 5 changed files with 34 additions and 61 deletions.
29 changes: 15 additions & 14 deletions README.md
@@ -1,6 +1,7 @@
# Machine-Vision
This repository consists of algorithms related to Image Processing, Features, Segmentation, Pattern Recognition and Deep Learning focused on images.
It consists of problems that are tackled in both the traditional way and the AI way.

### Installation

@@ -35,7 +36,7 @@ All the datasets required for other related problems are provided with the repos
### Results and Comparisons

1) **Feature extraction**: Feature extraction is done using the SIFT algorithm, a texture-based feature extractor that extracts keypoints
in an image. SIFT is applied to an Eiffel Tower image and we check how well SIFT matches the keypoints from the original image to the rotated image (a minimal matching sketch follows the first rotation example below).
1) Rotated Image angle 0 degrees:

![0-degree-rotation](OutputImages/Sift/Eiffel-Tower-OriginalImage0.jpg)
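
The matching step can be sketched roughly as follows; this is a hedged illustration using OpenCV's SIFT interface (`cv2.SIFT_create` on recent builds, `cv2.xfeatures2d.SIFT_create` on older ones) and Lowe's ratio test, with a hypothetical image path rather than the exact code in `src/sift.py`:

```python
import cv2

# Minimal sketch: detect SIFT keypoints in an image and in a rotated copy,
# then keep matches that pass Lowe's ratio test.
img = cv2.imread("eiffel.jpg", cv2.IMREAD_GRAYSCALE)          # hypothetical path
h, w = img.shape
M = cv2.getRotationMatrix2D((w / 2, h / 2), 45, 1.0)          # rotate by 45 degrees
rotated = cv2.warpAffine(img, M, (w, h))

sift = cv2.SIFT_create()                                      # cv2.xfeatures2d.SIFT_create() on older builds
kp1, des1 = sift.detectAndCompute(img, None)
kp2, des2 = sift.detectAndCompute(rotated, None)

matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [[m] for m, n in matches if m.distance < 0.75 * n.distance]
out = cv2.drawMatchesKnn(img, kp1, rotated, kp2, good, None,
                         flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
```
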
@@ -57,32 +58,32 @@ in an image. SIFT is applied on an Eiffel Tower image and we check how SIFT is a

![segmentation](OutputImages/Segmentation/strawberry.png)

Other comparisons are available in `OutputImages/Segmentation`

Code available at: `src/segmentation.py`

3) **Pattern Recognition**: The confusion matrix for the result obtained from KNN on the `scikit-learn` digits dataset is:

![knn-confusion-matrix](OutputImages/Pattern_recognition/knn_confusion_matrix.png)

Code available at: `src/pattern_recognition.py`
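
A minimal sketch of this kind of experiment, assuming scikit-learn and not the exact script in `src/pattern_recognition.py`:

```python
# KNN on the scikit-learn digits dataset with a confusion matrix.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print(confusion_matrix(y_test, knn.predict(X_test)))
```
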

4) **Deep Learning**: The loss vs. epochs graph for the `deep_learning` code on the MNIST dataset is:

![learningCurve](OutputImages/Deep_Learning/lossvsepochs.png)

Code available at: `src/deep_learning.py`
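
A hedged sketch of the kind of small PyTorch CNN that could produce such a curve on 28x28 MNIST digits; the layer sizes here are illustrative assumptions, not the exact architecture in `src/deep_learning.py`:

```python
import torch.nn as nn

class SmallCNN(nn.Module):
    """Tiny CNN for 1x28x28 digit images with 10 output classes."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Typical training ingredients for a loss-vs-epochs curve like the one above:
# model = SmallCNN(); criterion = nn.CrossEntropyLoss()
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```
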

5) **Oil Painting**: Oil painting is implemented from scratch as part of the learning process
of manipulating images, and it is applied to a Sydney tram image. Oil
painting is done on the image using different filter sizes and the results are:

![oil-paint](OutputImages/Oil-Paint/plots/combinedGrayRailImageWithAll3Filters.png)

For a detailed description of the project, kindly look at the paper
`report/Oil-Paint-report.pdf`, where everything about the implementation is explained visually.

All the related code is available in the directory `src/Oil-Paint`.
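
The core idea can be sketched as follows: a hedged, from-scratch illustration of the classic most-frequent-intensity oil-paint filter for a grayscale image, where the filter size and the number of intensity levels are assumptions rather than the values used in `src/Oil-Paint`:

```python
import numpy as np

def oil_paint_gray(gray, filter_size=5, levels=20):
    """Replace each pixel with the most frequent quantised intensity in its window."""
    radius = filter_size // 2
    quantised = (gray.astype(np.int32) * levels) // 256           # bin intensities into `levels` buckets
    padded = np.pad(quantised, radius, mode="edge")
    out = np.empty_like(gray)
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            window = padded[y:y + filter_size, x:x + filter_size]
            counts = np.bincount(window.ravel(), minlength=levels)
            out[y, x] = np.argmax(counts) * (255 // (levels - 1))  # map the winning bucket back to 0-255
    return out
```
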

6) **Optic Disc Segmentation**: The Optic Disc Segmentation is evaluated on the IDRiD dataset with 54 images:
1) Image of a good prediction of the optic disc:
13 changes: 3 additions & 10 deletions src/IDRiD/optic_disc_localize_on_single_image.py
@@ -12,28 +12,21 @@ def canny(img, sigma):


def jaccard_score(input, target, epsilon=1e-6):
    # Flatten both masks to 1-D vectors
    input = input.reshape(-1)
    target = target.reshape(-1)

    # Compute jaccard (intersection over union); epsilon avoids division by zero
    intersect = (input * target).sum()
    union = input.sum() + target.sum()

    return intersect / (union + epsilon - intersect)
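
# A quick sanity check (illustrative note, assuming binary masks): Dice and
# Jaccard are related by dice = 2 * jaccard / (1 + jaccard), so jaccard_score
# above and dice_score below can be cross-checked on the same prediction.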


def dice_score(input, target, epsilon=1e-6):
    # Flatten both masks to 1-D vectors
    input = input.reshape(-1)
    target = target.reshape(-1)

    # Compute dice; epsilon avoids division by zero
    intersect = (input * target).sum()
    union = input.sum() + target.sum()
9 changes: 4 additions & 5 deletions src/deep_learning.py
@@ -3,13 +3,13 @@
from sklearn import preprocessing
import matplotlib.pyplot as plt

# Load data
data = pd.read_csv("../data/mnist_train.csv")
train_data = data[:2000]
test_data = data[2000:2500]

# ----- Prepare Data ----- #
# Prepare the data, including data normalization
train_X = train_data.iloc[:, 1:]
train_Y = train_data["label"]
test_X = test_data.iloc[:, 1:]
@@ -19,7 +19,7 @@
train_X_data_norm = min_max_scaler.fit_transform(train_X)
test_X_data_norm = min_max_scaler.fit_transform(test_X)

# Transform the numpy arrays into PyTorch tensors
train_X_tensor = torch.tensor(train_X_data_norm, dtype=torch.float32).reshape(-1, 1, 28, 28)
test_X_tensor = torch.tensor(test_X_data_norm, dtype=torch.float32).reshape(-1, 1, 28, 28)
train_Y_tensor = torch.tensor(train_Y)
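
# Note: the .reshape(-1, 1, 28, 28) calls above arrange the image data as
# (batch, channels, height, width), which is the layout PyTorch convolution
# layers expect.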
@@ -73,14 +73,13 @@ def PlotLearningCurve(epoch, trainingloss, testingloss):
epochs = 100
for epoch in range(1, epochs + 1):
    model.train()
    # Forward pass, loss, and parameter update on the training data
    output = model(train_X_tensor)
    loss = criterion(output, train_Y_tensor)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    trainingloss += loss.item(),
    # Evaluate the model on the testing data and compute the accuracy
    correct = 0
    with torch.no_grad():
        total = 0
41 changes: 11 additions & 30 deletions src/segmentation.py
@@ -52,55 +52,36 @@ def plot_three_images(figure_title, image1, label1,
# Convert the image to a numpy matrix
img_mat = np.array(img)[:, :, :3]

# --------------- Mean Shift algorithm ---------------------

# Extract the three RGB colour channels
b, g, r = cv2.split(img_mat)

# Combine the three colour channels by flattening each channel and
# then stacking the flattened channels together.
# This gives the "colour_samples"

colour_samples = np.stack((b.flatten(), g.flatten(), r.flatten()), axis=1)

# Perform Meanshift clustering
ms_clf = MeanShift(bin_seeding=True)
ms_labels = ms_clf.fit_predict(colour_samples)

# Reshape ms_labels back to the original image shape for displaying the segmentation output
ms_labels = np.reshape(ms_labels, b.shape)
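
# An optional, illustrative variation (assuming scikit-learn's MeanShift API):
# the bandwidth can be estimated explicitly instead of relying on the default, e.g.
#   from sklearn.cluster import estimate_bandwidth
#   bandwidth = estimate_bandwidth(colour_samples, quantile=0.1, n_samples=500)
#   ms_clf = MeanShift(bandwidth=bandwidth, bin_seeding=True)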

# %%
# ------------- Watershed algorithm --------------------------

# Convert the image to gray scale
img_array = cv2.cvtColor(img_mat, cv2.COLOR_BGR2GRAY)

# Calculate the distance transform
distance = ndi.distance_transform_edt(img_array)

# Generate the watershed markers from the local maxima of the distance map
local_maximum = peak_local_max(distance, indices=False, footprint=np.ones((3, 3)))
markers = ndi.label(local_maximum)[0]

# Perform watershed and store the labels
ws_labels = watershed(-distance, markers, mask=img_array)
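
# Note: negating the distance turns distance peaks into basins, so the
# watershed floods outward from the markers placed at the local maxima.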

# Display the results
3 changes: 1 addition & 2 deletions src/sift.py
@@ -1,6 +1,7 @@
import cv2
import numpy as np


class SiftDetector:
def __init__(self, norm="L2", params=None):
self.detector = self.get_detector(params)
@@ -59,7 +60,6 @@ def rotate(image, angle):


# Get coordinates of center point.
# image: Image that will be rotated
# return: (x, y) coordinates of point at center of image
def get_img_center(image):
@@ -99,7 +99,6 @@ def get_img_center(image):
if m.distance < 0.75 * n.distance:
    good.append([m])

img3 = cv2.drawMatchesKnn(gray_image, kp1, gray_image_1, kp2, good, None,
flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
file_extensions = '../OutputImages/Sift/Road-Sign-OriginalImage{}.jpg'.format(angle)
