This repository contains code and resources for training a generative screenplay model using NanoGPT. The model is trained on screenplays written by the renowned filmmaker Christopher Nolan.
The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of minGPT that prioritizes teeth over education. Still under active development, but currently the file train.py
reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training. The code itself is plain and readable: train.py
is a ~300-line boilerplate training loop and model.py
a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.
Because the code is so simple, it is very easy to hack to your needs, train new models from scratch, or finetune pretrained checkpoints (e.g. biggest one currently available as a starting point would be the GPT-2 1.3B model from OpenAI).
Dependencies:
- pytorch <3
- numpy <3
pip install transformers
for huggingface transformers <3 (to load GPT-2 checkpoints)pip install datasets
for huggingface datasets <3 (if you want to download + preprocess OpenWebText)pip install tiktoken
for OpenAI's fast BPE code <3pip install wandb
for optional logging <3pip install tqdm
<3
$ python data/screenplay/prepare.py
This creates a train.bin
and val.bin
in that data directory. Now it is time to train your GPT. The size of it very much depends on the computational resources of your system:
$ python train.py config/train_screenplay.py
$ python sample.py --out_dir=out-screenplay
This generates a few samples, for example:
ANGELO:
And cowards it be strawn to my bed,
And thrust the gates of my threats,
Because he that ale away, and hang'd
An one with him.
DUKE VINCENTIO:
I thank your eyes against it.
DUKE VINCENTIO:
Then will answer him to save the malm:
And what have you tyrannous shall do this?
DUKE VINCENTIO:
If you have done evils of all disposition
To end his power, the day of thrust for a common men
That I leave, to fight with over-liking
Hasting in a roseman.
$ python train.py config/train_screenplay.py --device=cpu --compile=False --eval_iters=20 --log_interval=1 --block_size=64 --batch_size=12 --n_layer=4 --n_head=4 --n_embd=128 --max_iters=2000 --lr_decay_iters=2000 --dropout=0.0
$ python sample.py --out_dir=out-screenplay --device=cpu