
With Sign2Voice we offer sign language video to audio translation to enable everyone to interact with sign language


Sign2Voice/Sign2Voice_tensorflow


sign2voice 🗣️ - tensorflow

sign2voice aims to improve the inclusion of people who rely on sign language by providing a tool that translates sign language video input into audio.

Further details about the project and the team can be found in sign2voice.pdf.

The model architecture consists of three stages:

1️⃣ SLR (sign language recognition) - recognizing glosses in live video input

2️⃣ G2T (gloss to text) - transforming glosses into actual text incl. grammar

3️⃣ TTS (text to speech) - transforming text into audio

The full pipeline is assembled in a streamlit web app that lets you record a live video, translates it into a gloss sequence, transforms the glosses into one or more sentences, and finally reads them aloud.
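As a rough illustration of how the three stages chain together, here is a minimal Python sketch. The function names (`run_pipeline`, `fake_slr`, `fake_g2t`, `fake_tts`) and the `Frame` type are hypothetical stand-ins, not the repo's actual API; the real models live in the slr_tf_rtod and Gloss2Text2Speech folders.

```python
from typing import Callable, List

# Hypothetical frame type; the real app works on live video frames.
Frame = bytes


def run_pipeline(
    frames: List[Frame],
    recognize_glosses: Callable[[List[Frame]], List[str]],
    glosses_to_text: Callable[[List[str]], str],
    text_to_speech: Callable[[str], bytes],
) -> bytes:
    """Chain SLR -> G2T -> TTS and return the synthesized audio bytes."""
    glosses = recognize_glosses(frames)   # 1. SLR: video frames -> glosses
    sentence = glosses_to_text(glosses)   # 2. G2T: glosses -> sentence
    return text_to_speech(sentence)       # 3. TTS: sentence -> audio


# Stand-in stages so the sketch runs without any model (assumptions only).
def fake_slr(frames: List[Frame]) -> List[str]:
    return ["montag", "regen"]


def fake_g2t(glosses: List[str]) -> str:
    return " ".join(glosses).capitalize() + "."


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


audio = run_pipeline([b"frame-1", b"frame-2"], fake_slr, fake_g2t, fake_tts)
```

Passing the stages in as callables keeps each one replaceable, for example when swapping the SLR backend or the TTS provider.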

A demo video of the MVP using tensorflow real time object detection built in streamlit can be found here:

Watch the video

SLR (sign language recognition) with tensorflow real time object detection

Navigate to the folder slr_tf_rtod and follow these steps:

  • set up & activate a virtual environment: python3.10 -m venv .venv
  • install the requirements: pip install -r slr_tf_rtod_requirements.txt
  • run the code in the jupyter notebooks
    • slr_tf_rtod_create_training_data.ipynb
    • slr_tf_rtod_train_tf_model.ipynb

Sample .ckpt files for the tensorflow model trained with the sample glosses "montag", "auch", "mehr", "wolke", "als", "sonne", "ueberwiegend", "regen", "gewitter" from PHOENIX 2014t weather data can be found in slr_tf_rtod/Tensorflow/workspace/models/phoenix_new.
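TensorFlow object detection maps class names to integer ids through a label map in pbtxt format. The sketch below shows what such a label map could look like for the nine sample glosses. The helper name `build_label_map` is hypothetical, and whether the training notebook generates the file in exactly this way is an assumption.

```python
from typing import List

# The sample glosses listed above (PHOENIX 2014t weather data).
GLOSSES = [
    "montag", "auch", "mehr", "wolke", "als",
    "sonne", "ueberwiegend", "regen", "gewitter",
]


def build_label_map(labels: List[str]) -> str:
    """Return label map text in pbtxt format, with ids starting at 1."""
    entries = []
    for idx, name in enumerate(labels, start=1):
        entries.append(f"item {{\n  name: '{name}'\n  id: {idx}\n}}\n")
    return "".join(entries)


label_map = build_label_map(GLOSSES)
```

Writing `label_map` to a file such as label_map.pbtxt (file name assumed here) would give the training configuration one class per gloss.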

CREDITS - the repo is largely based on Nicholas Renotte's Real Time Sign Language Detection with Tensorflow Object Detection and Python | Deep Learning SSD.

Youtube tutorial: https://www.youtube.com/watch?v=pDXdlXlaCco&ab_channel=NicholasRenotte

Github Repo: https://github.com/nicknochnack/RealTimeObjectDetection

Gloss2Text2Speech

Get the G2T model ready by adding the adapter_model.bin file (available from the authors on request) to the Gloss2Text2Speech/pretrained folder.
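A quick check before starting the app can confirm that the adapter file is in place. This is a minimal sketch; the helper `adapter_ready` is hypothetical, and the demo uses a temporary directory in place of a real checkout.

```python
import tempfile
from pathlib import Path


def adapter_ready(repo_root: Path) -> bool:
    """Return True if adapter_model.bin sits in Gloss2Text2Speech/pretrained."""
    adapter = repo_root / "Gloss2Text2Speech" / "pretrained" / "adapter_model.bin"
    return adapter.is_file()


# Demo against a throwaway directory that stands in for the repo root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = adapter_ready(root)  # file not added yet

    pretrained = root / "Gloss2Text2Speech" / "pretrained"
    pretrained.mkdir(parents=True)
    (pretrained / "adapter_model.bin").write_bytes(b"")  # empty placeholder

    after = adapter_ready(root)
```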

For details on how the model works check out the respective README.md file in Gloss2Text2Speech.

Text2Speech

Get the TTS model ready by creating a .env file in the repo with the following structure:

AZUREENDPOINT=
APIKEY=
AZUREDEPLOYMENT=
APIVERSION=

Note that the credentials used by the team cannot be shared externally.
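To catch an incomplete .env file early, the four keys can be validated before the TTS step runs. The sketch below uses a small hand-written parser to stay dependency-free; the repo itself may well load the file differently (for example with python-dotenv), and the helper names and sample values are assumptions made for illustration.

```python
from typing import Dict, List

# Keys taken from the .env structure shown above.
REQUIRED_KEYS = ["AZUREENDPOINT", "APIKEY", "AZUREDEPLOYMENT", "APIVERSION"]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Made-up sample content: APIKEY is left empty and APIVERSION is absent.
sample = "AZUREENDPOINT=https://example.invalid\nAPIKEY=\nAZUREDEPLOYMENT=demo\n"
missing = missing_keys(parse_env(sample))
```

In practice the text would come from reading the .env file, and a non-empty `missing` list would be reported before any request is sent.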

For details on how the model works check out the respective README.md file in Gloss2Text2Speech.

RUN the streamlit app

  • create & activate a virtual environment: python3.9 -m venv .venv
  • update the environment: pip install -r streamlit_requirements.txt
  • run the commands in the jupyter notebook streamlit_setup.ipynb
  • run the streamlit web app: streamlit run st_to_txt/streamlit_app.py

FUTURE IMPROVEMENTS

  • streamlit cloud - build a ready-to-use web/mobile app
  • real time object detection - switch real time object detection to pytorch, as tensorflow object detection is deprecated
  • vocabulary - train a more comprehensive model to improve generalizability & accuracy of gloss detection
  • TTS - evaluate free alternatives to the OpenAI TTS solution currently used
