Automatic Speech Recognition (ASR) with PyTorch

About • Results • Installation • How To Use • Demo • Credits • License

About

This repository contains Homework 1 of the HSE DLA course.
The homework was solved with a Conformer-based model, specifically its small version. The model was trained on the train-clean-100 partition of the LibriSpeech dataset. CometML report of the training run: report.

Results

| Metric | test-clean | test-other |
|---|---|---|
| Argmax WER | 40.04 | 67.2 |
| Argmax CER | 13.46 | 30.73 |
| BeamSearch WER | 39.57 | 65.03 |
| BeamSearch CER | 12.67 | 29.09 |
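
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how WER and CER are commonly computed: the Levenshtein (edit) distance between reference and prediction, divided by the reference length, over words for WER and over characters for CER. The function names `edit_distance`, `wer` and `cer` are illustrative and are not taken from this repository; the repository's own metric code may differ in details such as text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(target: str, prediction: str) -> float:
    """Word error rate as a fraction (multiply by 100 for percent)."""
    ref_words = target.split()
    hyp_words = prediction.split()
    if not ref_words:
        # Assumption: an empty reference scores 0 only for an empty prediction.
        return 0.0 if not hyp_words else 1.0
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(target: str, prediction: str) -> float:
    """Character error rate as a fraction (multiply by 100 for percent)."""
    if not target:
        return 0.0 if not prediction else 1.0
    return edit_distance(target, prediction) / len(target)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(f"WER: {100 * wer('the cat sat', 'the bat sat'):.2f}")
    # One substituted character out of eleven reference characters.
    print(f"CER: {100 * cer('the cat sat', 'the bat sat'):.2f}")
```

Both functions return fractions; the values in the table are presumably the same quantities expressed as percentages.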

Installation

Follow these steps to install the project:

(Optional) Create and activate a new environment using conda or venv (+pyenv).

a. conda version:

```
# create env
conda create -n project_env python=PYTHON_VERSION

# activate env
conda activate project_env
```

b. venv (+pyenv) version:

```
# create env
~/.pyenv/versions/PYTHON_VERSION/bin/python3 -m venv project_env

# alternatively, using default python version
python3 -m venv project_env

# activate env
source project_env/bin/activate
```

Install all required packages:
```
pip install -r requirements.txt
```
Install pre-commit:
```
pre-commit install
```

Pretrained Model

To run inference, first download the pretrained model from HuggingFace:

```
huggingface-cli download artem1085715/conformer-small --local-dir DIR_TO_SAVE_MODEL
```

The download directory contains two checkpoints. You can use either model_best.pth (the best model by WER) or checkpoint-epoch50.pth (the last training checkpoint).
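
As a convenience, the choice between the two checkpoint files can be sketched in a few lines of Python. This is only an illustration: `find_checkpoint` is a hypothetical helper, not part of this repository, and it assumes both files sit directly in the directory passed to `--local-dir`.

```python
from pathlib import Path


def find_checkpoint(model_dir, prefer_best=True):
    """Return the path of a downloaded checkpoint.

    Hypothetical helper: looks for the two file names mentioned above,
    assuming they are stored directly inside `model_dir`.
    """
    model_dir = Path(model_dir)
    names = ["model_best.pth", "checkpoint-epoch50.pth"]
    if not prefer_best:
        names.reverse()  # try the last training checkpoint first
    for name in names:
        candidate = model_dir / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no known checkpoint found in {model_dir}")
```

The returned path can then be passed to whichever inference config argument expects the checkpoint location.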

How To Use

To train a model, run the following command:

```
python3 train.py -cn=CONFIG_NAME HYDRA_CONFIG_ARGUMENTS
```

Where CONFIG_NAME is a config from src/configs and HYDRA_CONFIG_ARGUMENTS are optional arguments.

To run inference (evaluate the model or save predictions):

```
python3 inference.py HYDRA_CONFIG_ARGUMENTS
```
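
When launching several runs from a script, the two commands above can be assembled programmatically. The sketch below is one possible way to do it: `build_command` is a hypothetical helper, and the override `trainer.n_epochs=10` is an invented example of Hydra's `key=value` override syntax, not a key confirmed to exist in this repository's configs.

```python
import shlex


def build_command(script, config_name=None, overrides=()):
    """Build an argument list for train.py or inference.py.

    Hypothetical helper: `overrides` are Hydra-style `key=value` strings.
    """
    cmd = ["python3", script]
    if config_name is not None:
        cmd.append(f"-cn={config_name}")  # -cn is Hydra's short form of --config-name
    cmd.extend(overrides)
    return cmd


if __name__ == "__main__":
    # "trainer.n_epochs=10" is a made-up override used only for illustration.
    train_cmd = build_command("train.py", "CONFIG_NAME", ["trainer.n_epochs=10"])
    infer_cmd = build_command("inference.py")
    print(shlex.join(train_cmd))
    print(shlex.join(infer_cmd))
    # To actually launch a run: subprocess.run(train_cmd, check=True)
```

Returning a list rather than a single string lets the result be passed to `subprocess.run` without shell quoting issues.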
