A searchable audio transcript interface using Wav2Vec2, Elasticsearch, Flask, and React.
URL: http://3.90.182.103/
- `asr/`: ASR microservice with wav2vec2 model
- `deployment-design/`: Architecture design (PDF)
- `elastic-backend/`: Elasticsearch indexing setup
- `search-ui/`: Frontend search interface
- Navigate to the directory:

  ```shell
  cd asr
  ```

- Build the image:

  ```shell
  docker build -t asr-api ./asr
  ```

- Run the container:

  ```shell
  docker run -p 8001:8001 asr-api
  ```

- Test the API:

  ```shell
  curl -F "file=@/path/to/sample.mp3" http://localhost:8001/asr
  ```
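For scripting, the `curl` call above can be reproduced in Python. A minimal client sketch using the third-party `requests` library (the form field name `file` matches the curl example; the shape of the JSON response is not assumed here):

```python
import requests


def transcribe(path, url="http://localhost:8001/asr"):
    """POST an audio file to the ASR service and return the parsed JSON.

    The "file" form field matches the curl example above; the structure
    of the returned JSON depends on the service implementation.
    """
    with open(path, "rb") as f:
        resp = requests.post(url, files={"file": f})
    resp.raise_for_status()
    return resp.json()
```

For example, `transcribe("sample.mp3")` returns whatever JSON payload the service produces for that clip.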
- Navigate to the directory:

  ```shell
  cd elastic-backend
  ```

- (Optional but recommended) Create a virtual environment:

  ```shell
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install Python dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Start the Elasticsearch cluster:

  ```shell
  docker compose up
  ```

  Open http://localhost:9200/_cat/nodes?v in your browser — you should see both `es01` and `es02` nodes listed.
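The `cv-index.py` script used in the next step is not shown here; the sketch below illustrates the kind of CSV-to-bulk-action transformation such a script typically performs. The index name and the column names (`filename`, `text`, `age`, `gender`, `accent`) are assumptions about the source CSV, not confirmed ones:

```python
import csv
from typing import Iterable, Iterator


def bulk_actions(rows: Iterable[dict], index: str = "cv-transcriptions") -> Iterator[dict]:
    """Turn CSV rows into Elasticsearch bulk-index actions.

    Index and field names are illustrative assumptions; metadata columns
    may be empty in the source data, so blanks are normalized to None.
    """
    for row in rows:
        yield {
            "_index": index,
            "_source": {
                "filename": row.get("filename"),
                "text": row.get("text"),
                "age": row.get("age") or None,
                "gender": row.get("gender") or None,
                "accent": row.get("accent") or None,
            },
        }


# The real script would then hand these actions to the bulk helper, e.g.:
#   from elasticsearch import Elasticsearch, helpers
#   es = Elasticsearch("http://localhost:9200")
#   with open("path/to/source.csv", newline="") as f:  # placeholder path
#       helpers.bulk(es, bulk_actions(csv.DictReader(f)))
```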
- Index the data:

  ```shell
  python cv-index.py
  ```

- Start the backend API server:

  ```shell
  python search_api.py
  ```

- Navigate to the frontend directory:
  ```shell
  cd search-ui
  ```

- Install dependencies:

  ```shell
  npm install
  ```

- Start the development server:

  ```shell
  npm start
  ```

  Open http://localhost:3000 in your browser to see the result.
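The backend started above serves search requests against the index. A sketch of the kind of query body such a server might send to Elasticsearch — the field names (`generated_text`, `text`) and facet fields are assumptions about the index mapping, not confirmed ones:

```python
def build_search_body(query: str, size: int = 10) -> dict:
    """Build a hypothetical Elasticsearch query body that searches both
    the ASR transcription and the original text, with facet aggregations.

    All field names here are illustrative assumptions about the mapping.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": query,
                # Search both the generated transcription and the
                # original text, since transcriptions may be inaccurate.
                "fields": ["generated_text", "text"],
            }
        },
        "aggs": {
            # Facets are capped at a fixed number of values.
            field: {"terms": {"field": f"{field}.keyword", "size": 10}}
            for field in ("accent", "gender", "age")
        },
    }
```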
- The ASR model used (`wav2vec2-large-960h`) may produce inaccurate transcriptions, especially for non-US accents or noisy audio, so the search covers both the generated transcription and the original reference text.
- Some metadata fields (e.g., age, gender, accent) may be missing or inconsistent in the source CSV file.
- The search UI currently does not support fuzzy matching or partial phrase queries.
- Facets are limited to a fixed number of values (e.g., only top 10 accent types are shown).
- Backend and search functionality assume the local Elasticsearch and ASR services are running on ports `9200` and `8001`, respectively.
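If the hard-coded ports become a problem, one conventional mitigation is to read the service endpoints from environment variables, falling back to the current defaults. The variable names below are hypothetical — the current code does not read them:

```python
import os


def service_urls(env=os.environ) -> dict:
    """Resolve service endpoints from (hypothetical) environment variables,
    defaulting to the ports the project currently assumes."""
    return {
        "elasticsearch": env.get("ES_URL", "http://localhost:9200"),
        "asr": env.get("ASR_URL", "http://localhost:8001"),
    }
```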