Use Machine Learning (ML) on handwritten text to draw inferences about the gender of the person who wrote it.
The project uses a Vision Transformer (ViT) image classifier for predictions. It outperforms the baseline model, a standard Random Forest image classifier.
This project uses DVC. It is included as a requirement in the requirements.txt file.
To set up the virtual environment manually, run the following commands:
For Linux / Mac OS:
pyenv local 3.11.3
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
For Windows (PowerShell), use the following commands instead:
pyenv local 3.11.3
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements.txt
- Put samples in images/bw
- Run the script segmenter.py to cut out words.
- Create a table index_raw.csv with the columns Geschlecht (gender) and Datei Index (file index)
- Train the ViT Model with the script vit_train.py
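The index table from the steps above could be created by hand or with a small script. The following is a minimal sketch, not part of the repository: the helper names `write_index` and `read_index` are hypothetical, and the README does not say which label values or index format vit_train.py expects, so any rows you pass in are your own assumption. Only the file name and the two column names come from this README.

```python
import csv
from pathlib import Path

# Column names as given in the README (German for "gender" and "file index").
COLUMNS = ["Geschlecht", "Datei Index"]


def write_index(rows, path="index_raw.csv"):
    """Write (gender label, file index) pairs to a CSV file with the expected header.

    Hypothetical helper: the label encoding and index format are assumptions.
    """
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return path


def read_index(path="index_raw.csv"):
    """Read the table back as a list of dicts keyed by column name."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `write_index([("w", 1), ("m", 2)])` would write a header row followed by two data rows. The labels "w" and "m" are placeholders here; check segmenter.py and vit_train.py for the values they actually use.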
To download the relevant data using DVC, run the following command (Linux / Mac OS / Windows):
dvc pull