A searchable audio transcript interface using Wav2Vec2, Elasticsearch, Flask, and React.
URL: http://3.90.182.103/
- `asr/`: ASR microservice with wav2vec2 model
- `deployment-design/`: Architecture design (PDF)
- `elastic-backend/`: Elasticsearch indexing setup
- `search-ui/`: Frontend search interface
- Navigate to the directory:

  ```shell
  cd asr
  ```

- Build the image:

  ```shell
  docker build -t asr-api ./asr
  ```

- Run the container:

  ```shell
  docker run -p 8001:8001 asr-api
  ```

- Test the API:

  ```shell
  curl -F "file=@/path/to/sample.mp3" http://localhost:8001/asr
  ```
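For scripting, the `curl` call above can be reproduced in Python. A minimal client sketch using the third-party `requests` library (the form field name `file` matches the curl example; the shape of the JSON response is not assumed here):

```python
import requests


def transcribe(path, url="http://localhost:8001/asr"):
    """POST an audio file to the ASR service and return the parsed JSON.

    The "file" form field matches the curl example above; the structure
    of the returned JSON depends on the service implementation.
    """
    with open(path, "rb") as f:
        resp = requests.post(url, files={"file": f})
    resp.raise_for_status()
    return resp.json()
```

For example, `transcribe("sample.mp3")` returns whatever JSON payload the service produces for that clip.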
- Navigate to the directory:

  ```shell
  cd elastic-backend
  ```

- (Optional but recommended) Create a virtual environment:

  ```shell
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install Python dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Start the Elasticsearch cluster:

  ```shell
  docker compose up
  ```

  Open http://localhost:9200/_cat/nodes?v in your browser — you should see both `es01` and `es02` nodes listed.
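The `cv-index.py` script used in the next step is not shown here; the sketch below illustrates the kind of CSV-to-bulk-action transformation such a script typically performs. The index name and the column names (`filename`, `text`, `age`, `gender`, `accent`) are assumptions about the source CSV, not confirmed ones:

```python
import csv
from typing import Iterable, Iterator


def bulk_actions(rows: Iterable[dict], index: str = "cv-transcriptions") -> Iterator[dict]:
    """Turn CSV rows into Elasticsearch bulk-index actions.

    Index and field names are illustrative assumptions; metadata columns
    may be empty in the source data, so blanks are normalized to None.
    """
    for row in rows:
        yield {
            "_index": index,
            "_source": {
                "filename": row.get("filename"),
                "text": row.get("text"),
                "age": row.get("age") or None,
                "gender": row.get("gender") or None,
                "accent": row.get("accent") or None,
            },
        }


# The real script would then hand these actions to the bulk helper, e.g.:
#   from elasticsearch import Elasticsearch, helpers
#   es = Elasticsearch("http://localhost:9200")
#   with open("path/to/source.csv", newline="") as f:  # placeholder path
#       helpers.bulk(es, bulk_actions(csv.DictReader(f)))
```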
- Index the data:

  ```shell
  python cv-index.py
  ```

- Start the backend API server:

  ```shell
  python search_api.py
  ```

- Navigate to the frontend directory:
  ```shell
  cd search-ui
  ```

- Install dependencies:

  ```shell
  npm install
  ```

- Start the development server:

  ```shell
  npm start
  ```

  Open http://localhost:3000 in your browser to see the result.
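The backend started above serves search requests against the index. A sketch of the kind of query body such a server might send to Elasticsearch — the field names (`generated_text`, `text`) and facet fields are assumptions about the index mapping, not confirmed ones:

```python
def build_search_body(query: str, size: int = 10) -> dict:
    """Build a hypothetical Elasticsearch query body that searches both
    the ASR transcription and the original text, with facet aggregations.

    All field names here are illustrative assumptions about the mapping.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": query,
                # Search both the generated transcription and the
                # original text, since transcriptions may be inaccurate.
                "fields": ["generated_text", "text"],
            }
        },
        "aggs": {
            # Facets are capped at a fixed number of values.
            field: {"terms": {"field": f"{field}.keyword", "size": 10}}
            for field in ("accent", "gender", "age")
        },
    }
```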
- The ASR model used (`wav2vec2-large-960h`) may produce inaccurate transcriptions, especially for non-US accents or noisy audio, so the search covers both the generated transcription and the original reference text.
- Some metadata fields (e.g., age, gender, accent) may be missing or inconsistent in the source CSV file.
- The search UI currently does not support fuzzy matching or partial phrase queries.
- Facets are limited to a fixed number of values (e.g., only top 10 accent types are shown).
- Backend and search functionality assume the local Elasticsearch and ASR services are running on ports `9200` and `8001`, respectively.
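If the hard-coded ports become a problem, one conventional mitigation is to read the service endpoints from environment variables, falling back to the current defaults. The variable names below are hypothetical — the current code does not read them:

```python
import os


def service_urls(env=os.environ) -> dict:
    """Resolve service endpoints from (hypothetical) environment variables,
    defaulting to the ports the project currently assumes."""
    return {
        "elasticsearch": env.get("ES_URL", "http://localhost:9200"),
        "asr": env.get("ASR_URL", "http://localhost:8001"),
    }
```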