A self-directed case study in deep-learning-based computer vision: transfer learning with VGG16, ResNet50, Inception_V3 and MobileNet.
- The dataset can be obtained from Kaggle : https://www.kaggle.com/puneet6060/intel-image-classification
- The dataset contains around 25k images of size 150x150, distributed across 6 categories.
- The categories are 'buildings', 'forest', 'glacier', 'mountain', 'sea' and 'street'.
- The Train, Test and Prediction data are provided in separate zip files. There are around 14k images in Train, 3k in Test and 7k in Prediction.
- The Train and Test images are grouped in folders by class, so ImageDataGenerator can infer their class labels from the folder names.
- The images from test folder are used for validation during training.
- Prediction images are not grouped in folders by class, so test accuracy cannot be calculated from them.
- Prediction images are used to check predictions on individual images.
- All images are of size 150 x 150.
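The folder layout described above can be sanity-checked before training. The sketch below uses only the standard library and mirrors the way Keras' `flow_from_directory` assigns labels (sub-folder names in alphanumeric order). The helper names and the accepted file extensions are assumptions made for illustration, and any path passed in is a placeholder for wherever the zip files were extracted.

```python
from pathlib import Path

# File extensions counted as images (an assumption; adjust if needed).
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def class_indices(root):
    """Map each class sub-folder name to an integer label, in sorted order."""
    names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(names)}


def images_per_class(root):
    """Count the image files inside each class sub-folder."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1 for f in folder.iterdir() if f.suffix.lower() in IMAGE_EXTS
            )
    return counts
```

Calling `class_indices` on the extracted Train folder should return the six category names mapped to 0 through 5, and `images_per_class` gives a quick check for class imbalance.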
This is a multiclass classification problem with six classes, using the Intel image scene dataset.
Different preprocessing is done for different models:
- Data Augmentation is used for VGG16, Inception_V3 and MobileNet.
- For ResNet50, the built-in preprocessing provided by tf.keras.applications.resnet.preprocess_input is used.
- The Data Augmentation, when used, is applied only to train set and not to validation and test set.
- Rescaling by 1/255 is applied to all sets.
- Batch sizes used are: ResNet50: 256, MobileNet: 128, Inception_V3: 128, VGG16: 256.
- Pretrained models having imagenet weights are used with transfer learning.
- Modeling is done with VGG16, ResNet50, Inception_V3 and MobileNet.
- All layers of each model except the last few are frozen.
- The top layer of the model is not used. Instead, a dense layer with 6 units and softmax activation is used.
- VGG16 took the longest to train, while MobileNet was the quickest.
- Accuracy and loss plots, along with a confusion matrix, are plotted for all models.
- Validation accuracies are as follows
- The predictions on test images are not as good for the Inception_V3 and MobileNet models.
- VGG16 and ResNet50 performed well on test images.
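
The preprocessing and modeling steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebooks' exact code: the augmentation settings, the number of unfrozen layers (`n_trainable`), the global-average-pooling layer before the dense head, and the optimizer and loss are all assumptions, since the list only fixes the 1/255 rescaling, the frozen backbone and the 6-unit softmax layer. The function names are hypothetical.

```python
import tensorflow as tf

IMG_SIZE = (150, 150)  # image size stated above
NUM_CLASSES = 6        # buildings, forest, glacier, mountain, sea, street

# Constructors for the four backbones named above.
BACKBONES = {
    "vgg16": tf.keras.applications.VGG16,
    "resnet50": tf.keras.applications.ResNet50,
    "inception_v3": tf.keras.applications.InceptionV3,
    "mobilenet": tf.keras.applications.MobileNet,
}

ImageDataGenerator = tf.keras.preprocessing.image.ImageDataGenerator


def make_datagens():
    """Return (train, evaluation) generators; only the train set is augmented."""
    train = ImageDataGenerator(
        rescale=1.0 / 255,
        rotation_range=20,     # assumed value
        zoom_range=0.2,        # assumed value
        horizontal_flip=True,  # assumed setting
    )
    evaluation = ImageDataGenerator(rescale=1.0 / 255)
    return train, evaluation


def build_transfer_model(name, weights="imagenet", n_trainable=4):
    """Backbone without its top layer, followed by a 6-unit softmax head."""
    base = BACKBONES[name](
        include_top=False, weights=weights, input_shape=IMG_SIZE + (3,)
    )
    # Freeze every layer except the last n_trainable (assumed count).
    for layer in base.layers[: len(base.layers) - n_trainable]:
        layer.trainable = False
    # Pooling before the head is an assumption; the list only names the dense layer.
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = tf.keras.Model(base.input, outputs)
    # Optimizer and loss are assumptions; one-hot labels suit this loss.
    model.compile(
        optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
    )
    return model
```

Each generator would then be connected to its folder with `flow_from_directory(..., target_size=IMG_SIZE, batch_size=...)`, using the batch size listed for the chosen model, and the results passed to `model.fit` with the test-folder generator as `validation_data`. For ResNet50, the list above indicates that `preprocessing_function=tf.keras.applications.resnet.preprocess_input` takes the place of the augmentation arguments.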