
VaibhavSaran/Hand-Sign-Detection-with-Tensorflow-Object-Detection


Acknowledgements

I take this opportunity to express my sincere thanks to Mr. Nicholas Renotte; his work on hand sign detection using TensorFlow was a great help in building my own project and bringing it to a level where it can be used to form sentences. This project was an amazing journey of learning, and I have enclosed some of the links that helped me along the way.

Inspiration and Overview

Mute people often face a huge gap when communicating with others. There are some state-of-the-art models available that approach this problem using 3D CNNs, LSTMs with an FSM Context-Aware Model, and more.
The general concept is to stack a number of CNN layers followed by a number of LSTM layers, or to use a pretrained MobileNet followed by a number of LSTM layers. These models end up requiring large amounts of data to produce good results and also demand very high compute power because they contain 30 to 40 million parameters.
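To make the pattern concrete, here is a minimal sketch of the CNN-plus-LSTM idea described above, assuming tf.keras is available; the frame count, layer sizes, and class count are illustrative assumptions, not taken from the cited models:

```python
import tensorflow as tf

# Pretrained CNN backbone applied to every frame of a short video clip.
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, pooling="avg")

model = tf.keras.Sequential([
    # 30 frames per clip is an illustrative assumption.
    tf.keras.layers.TimeDistributed(backbone, input_shape=(30, 224, 224, 3)),
    tf.keras.layers.LSTM(256),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 sign classes
])

# Even this small variant runs to millions of parameters, which is why such
# models need large datasets and serious compute to train well.
print(f"{model.count_params():,} parameters")
```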

About SSD MOBNET

SSD MobileNet V2 FPNLite 320x320 resizes the input image to 320x320 during preprocessing and, in post-processing, converts the detections it finds back to the original resolution. The training pipeline uses image augmentation, i.e. it may darken, shift, or flip the image so that we ideally get a better-performing model.
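To make the resize-and-rescale behaviour concrete, here is a minimal sketch, assuming TensorFlow 2.x, tensorflow_hub, and the published TF Hub handle for this architecture; the model accepts an image of any resolution and returns boxes in normalized coordinates, which you scale back to pixels yourself:

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# TF Hub handle for SSD MobileNet V2 FPNLite 320x320 (assumed available).
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/fpnlite_320x320/1")

image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)  # any resolution
result = detector(tf.expand_dims(image, 0))  # resized to 320x320 internally

# Boxes come back as [ymin, xmin, ymax, xmax] in [0, 1]; scale to pixels.
h, w = image.shape[:2]
pixel_boxes = result["detection_boxes"][0].numpy() * np.array([h, w, h, w])
```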

Minimum Requirements

  1. RAM : 8 GB or more
  2. Disk Space : ~2 GB (the approximate size of the repository)
  3. Processor : Intel i3 or better, 10th gen (for anything older than 10th gen, at least an Intel i5 is required)
  4. GPU : good to have, but optional
  5. CUDA and cuDNN : required only if a GPU is used

Setup Process


Step 1. Clone this repository: Hand Sign Detection.
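If you have Git installed, the clone command follows directly from the repository path above:
git clone https://github.com/VaibhavSaran/Hand-Sign-Detection-with-Tensorflow-Object-Detection.git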

Step 2. Create a new virtual environment
python -m venv tfod

Step 3. Activate your virtual environment
source tfod/bin/activate # Linux
.\tfod\Scripts\activate # Windows 

Step 4. Install dependencies and add virtual environment to the Python Kernel
python -m pip install --upgrade pip
pip install ipykernel
python -m ipykernel install --user --name=tfod

Step 5. Run Jupyter Notebook or JupyterLab, whichever is available. Run the command :
jupyter notebook # To run Jupyter Notebook 
 # or run the below to launch JupyterLab
jupyter lab
Error: If the above command is not recognized, install the missing package with one of the commands below. Either one will do, or you can install both.
pip install jupyterlab # To install JupyterLab
pip install jupyter # To install Jupyter Notebook

Step 6. After launching Jupyter, make sure to set the kernel to the virtual environment (tfod).


Step 7. First run the notebook **Dependencies.ipynb**; it will install all the requisite libraries for your system. However, do check which version of TensorFlow you are installing and that your CUDA and cuDNN libraries match it.
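Once the dependencies are installed, a quick sanity check is worth running; this is a minimal sketch (assuming TensorFlow installed successfully in the tfod environment) that prints the TensorFlow version and whether a GPU is visible:

```python
import tensorflow as tf

print("TensorFlow version:", tf.__version__)

# An empty list means TensorFlow is running CPU-only, which usually points
# to a CUDA/cuDNN version mismatch with the installed TensorFlow build.
print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))
```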

Executing Project

Run all the cells of the notebook Generating Labelled Data.ipynb if you want to create your own data and train the model on it. If not, you can use the pretrained model, which has been saved and is available as models.tar.gz. If you hit any errors while executing the project, do try checking out Common Errors.md; there is a good chance your error is already covered there.
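To unpack the pretrained model, here is a minimal sketch, assuming models.tar.gz sits in the repository root and should be extracted into the working directory:

```python
import tarfile

# Extract the pretrained model archive shipped with the repository.
with tarfile.open("models.tar.gz", "r:gz") as archive:
    archive.extractall(".")
```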

Data Collection

For this project I have not used any data from a third-party source or company. I created my own data by taking pictures of myself performing the signs and then labelling each and every image. There are 10 labels used in this project, namely ok, notok, thankyou, livelong, name, what, you, iloveyou, nice, and love. 30 images were collected per class, making a total of 300 images, which were then split into 240 images to train and 60 images to test the model. The model itself has been trained for 10,000 steps. A sketch of the collection loop follows below.
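The collection loop below is a minimal sketch of that process, assuming OpenCV (cv2) is installed and a webcam is available on device 0; the label list matches the project, while the folder layout, pauses, and file naming are illustrative assumptions:

```python
import os
import time
import uuid

import cv2

LABELS = ["ok", "notok", "thankyou", "livelong", "name",
          "what", "you", "iloveyou", "nice", "love"]
IMAGES_PER_LABEL = 30
IMAGE_DIR = os.path.join("workspace", "images", "collectedimages")  # assumed layout

cap = cv2.VideoCapture(0)
for label in LABELS:
    os.makedirs(os.path.join(IMAGE_DIR, label), exist_ok=True)
    print(f"Collecting images for {label}")
    time.sleep(5)  # time to get into position for the next sign
    for _ in range(IMAGES_PER_LABEL):
        ret, frame = cap.read()
        if not ret:
            continue
        # Unique file name per shot so nothing gets overwritten.
        path = os.path.join(IMAGE_DIR, label, f"{label}.{uuid.uuid1()}.jpg")
        cv2.imwrite(path, frame)
        cv2.imshow("frame", frame)
        time.sleep(2)  # brief pause so consecutive shots vary slightly
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```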

Working Summary of Project

  1. Installing all the dependencies.
  2. Defining the labels to be collected and collecting data for each label.
  3. Using labelImg to label the data and splitting it into train and test sets.
  4. Training the SSD MobileNet V2 FPNLite 320x320 model.
  5. Performing real-time detections and detection on images (see the sketch after this list).
  6. Freezing the graph and saving the model.
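The real-time detection step (item 5) boils down to running the exported model on webcam frames. This is a minimal sketch, assuming the trained model has been exported as a TensorFlow SavedModel; the path and confidence threshold are illustrative assumptions:

```python
import cv2
import numpy as np
import tensorflow as tf

# Path to the exported SavedModel directory is an assumption; adjust to your export.
detect_fn = tf.saved_model.load("exported-model/saved_model")

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Exported TF Object Detection API models take a batched uint8 tensor.
    input_tensor = tf.convert_to_tensor(np.expand_dims(frame, 0), dtype=tf.uint8)
    detections = detect_fn(input_tensor)

    h, w = frame.shape[:2]
    boxes = detections["detection_boxes"][0].numpy()   # normalized coordinates
    scores = detections["detection_scores"][0].numpy()
    for box, score in zip(boxes, scores):
        if score < 0.5:  # illustrative confidence threshold
            continue
        ymin, xmin, ymax, xmax = box
        cv2.rectangle(frame, (int(xmin * w), int(ymin * h)),
                      (int(xmax * w), int(ymax * h)), (0, 255, 0), 2)

    cv2.imshow("detections", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```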

Results Achieved

  1. Results after training the model for 10,000 steps.
  2. Model evaluation.
  3. Graph for learning rate and losses.

Future Scope and Improvements

  1. Adding more data for better predictions.
  2. Adding more signs for recognition.
  3. Integrating ASL alphabets so that basic conversation can take place using this model.
  4. Developing a robot that recognizes these hand signs; such robots could be deployed at airports and railway stations to ease communication between mute people and the authorities.
  5. Using other SOTA architectures that can detect motion signs, such as 3D CNNs, or perhaps MediaPipe with an LSTM.
