This project uses a pre-trained model from the Hugging Face transformers library to generate descriptive captions for images.
-
Create and activate the virtual environment:
python3 -m venv .venv source .venv/bin/activate -
Install the dependencies:
pip install -r requirements.txt
To generate a caption for an image, run the predict.py script with the path to your image. Sample images are provided in the data directory.
For example:
python predict.py data/demo.png