This tutorial explains how to create a dataset on Hugging Face and use it to retrain a large language model.
The concept of large language models (LLMs)
Starting with a single text file, curate a corpus of texts that will be used to retrain one of the standard large language models.
Hugging Face hosts numerous models that you can retrain with your own bespoke dataset. Choose one of these models for retraining.
Use a Jupyter "notebook" to retrain your model
- ...
- ...
- ...
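The corpus-curation step in the outline above (starting from a single text file) could be sketched roughly as follows. This is only a minimal illustration and not the workshop's actual method: the function names `split_into_passages` and `write_jsonl`, the `min_chars` threshold, and the choice of one JSON object per line with a single `text` field are all assumptions made for the example.

```python
import json
import re
from pathlib import Path


def split_into_passages(text: str, min_chars: int = 20) -> list[str]:
    """Split raw text on blank lines and drop very short fragments.

    Hypothetical helper: the min_chars threshold is an arbitrary assumption.
    """
    passages = []
    for block in re.split(r"\n\s*\n", text):
        # Collapse internal line breaks and repeated spaces into single spaces.
        cleaned = " ".join(block.split())
        if len(cleaned) >= min_chars:
            passages.append(cleaned)
    return passages


def write_jsonl(passages: list[str], path: Path) -> None:
    """Write one JSON object per line, each with a single "text" field.

    Hypothetical helper: the "text" field name is an assumed convention.
    """
    with path.open("w", encoding="utf-8") as f:
        for passage in passages:
            f.write(json.dumps({"text": passage}, ensure_ascii=False) + "\n")
```

Under these assumptions, calling `write_jsonl(split_into_passages(Path("corpus.txt").read_text(encoding="utf-8")), Path("train.jsonl"))` would turn a hypothetical `corpus.txt` into a `train.jsonl` file. A JSON Lines file like this can typically be loaded with the Hugging Face `datasets` library via `load_dataset("json", data_files="train.jsonl")`.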
This tutorial was created by Douglas Edric Stanley for the workshop In one body there are millions at the Master Media Design (HEAD – Genève) in March 2023.