
ashutoshmakone/Intel-Image-Scene-Multiclass-Classification--A-Computer-Vision-case-study


A self-directed case study in deep-learning-based computer vision: transfer learning with VGG16, ResNet50, Inception_V3 and MobileNet.

The Dataset

  1. The dataset can be obtained from Kaggle: https://www.kaggle.com/puneet6060/intel-image-classification
  2. The dataset contains around 25k images of size 150x150, distributed across 6 categories.
  3. The categories are 'buildings', 'forest', 'glacier', 'mountain', 'sea' and 'street'.
  4. The Train, Test and Prediction data are provided in separate zip files. There are around 14k images in Train, 3k in Test and 7k in Prediction.
  5. The Train and Test images are grouped into folders by class, so ImageDataGenerator can infer their class labels from the directory structure (see the loading sketch after this list).
  6. The images from test folder are used for validation during training.
  7. Prediction images are not grouped into folders by class, so test accuracy cannot be calculated on them.
  8. Prediction images are used to check predictions on individual images.
  9. All images are of size 150 x 150.
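Below is a minimal sketch of how the Train and Test folders can be loaded with ImageDataGenerator. The seg_train/seg_test directory names follow the Kaggle download layout, and the batch size shown here is illustrative rather than the exact value used in the notebooks.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Directory layout assumed from the Kaggle download (seg_train / seg_test);
# adjust the paths to match where the archives were extracted.
TRAIN_DIR = "seg_train/seg_train"
TEST_DIR = "seg_test/seg_test"

# Rescale pixel values to [0, 1]; class labels are inferred from the
# per-class sub-folders (buildings, forest, glacier, mountain, sea, street).
train_gen = ImageDataGenerator(rescale=1.0 / 255)
test_gen = ImageDataGenerator(rescale=1.0 / 255)

train_data = train_gen.flow_from_directory(
    TRAIN_DIR, target_size=(150, 150), batch_size=128, class_mode="categorical")
val_data = test_gen.flow_from_directory(
    TEST_DIR, target_size=(150, 150), batch_size=128, class_mode="categorical",
    shuffle=False)  # keep order fixed so predictions align with labels later

print(train_data.class_indices)  # mapping from class name to integer label
```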

The problem statement

It is a multiclass classification problem with six classes on the Intel image scene dataset.

Preprocessing

Different preprocessing is applied for each model:

  1. Data augmentation is used for VGG16, Inception_V3 and MobileNet (see the sketch after this list).
  2. For ResNet50, the built-in preprocessing provided by tf.keras.applications.resnet.preprocess_input is used.
  3. Data augmentation, when used, is applied only to the train set, not to the validation and test sets.
  4. Rescaling by 1/255 is applied to all sets.
  5. Batch sizes used: ResNet50 - 256, MobileNet - 128, Inception_V3 - 128, VGG16 - 256.
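A rough illustration of the two preprocessing paths listed above, assuming an ImageDataGenerator pipeline; the augmentation parameter values are placeholders, not the exact settings used in this repository.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.resnet import preprocess_input

# Augmentation path (VGG16 / Inception_V3 / MobileNet): applied to the
# train set only, together with 1/255 rescaling. The parameter values
# below are illustrative placeholders.
augmented_train_gen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
)

# Validation/test path: rescaling only, no augmentation.
plain_gen = ImageDataGenerator(rescale=1.0 / 255)

# ResNet50 path: the built-in preprocess_input handles normalisation.
resnet_gen = ImageDataGenerator(preprocessing_function=preprocess_input)
```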

Modeling

  1. Pretrained models with ImageNet weights are used for transfer learning.
  2. Modeling is done with VGG16, ResNet50, Inception_V3 and MobileNet.
  3. All layers of each model, except the last few, are frozen.
  4. The original top (classification) layer is not used; instead, a dense layer with 6 units and softmax activation is added (see the sketch after this list).
  5. VGG16 took the longest to train, while MobileNet was the quickest.
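A sketch of this transfer-learning setup, using ResNet50 as the example backbone; the number of layers left trainable and the optimizer choice are assumptions made for illustration, not the repository's exact configuration.

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, models

# Load the pretrained backbone without its classification head.
base = ResNet50(weights="imagenet", include_top=False,
                input_shape=(150, 150, 3), pooling="avg")

# Freeze everything except the last few layers (the cut-off of 10 layers
# here is illustrative).
for layer in base.layers[:-10]:
    layer.trainable = False

# Replace the original top with a 6-unit softmax classifier.
model = models.Sequential([
    base,
    layers.Dense(6, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```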

Results

  1. Accuracy and loss plots, along with a confusion matrix, are produced for every model (a confusion-matrix sketch follows this list).
  2. Validation accuracies are as follows:
    VGG16 - 89.13%
    Inception_V3 - 84.17%
    ResNet50 - 91.37%
    MobileNet - 90.43%
  3. Predictions on individual test (prediction) images are not as good for the Inception_V3 and MobileNet models.
  4. VGG16 and ResNet50 have performed really well on these images.
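For reference, a minimal sketch of how the confusion matrix can be computed from the validation generator using scikit-learn; `model` and `val_data` refer to the earlier sketches, and the generator is assumed to have been created with shuffle=False so predictions line up with the stored labels.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Predict class probabilities on the validation generator, then take the
# arg-max to get predicted class indices.
probs = model.predict(val_data)
y_pred = np.argmax(probs, axis=1)
y_true = val_data.classes  # true labels, in generator order (shuffle=False)

cm = confusion_matrix(y_true, y_pred)
ConfusionMatrixDisplay(cm, display_labels=list(val_data.class_indices)).plot()
plt.show()
```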
