ScreenPlay_nanoGPT

This repository contains code and resources for training a generative screenplay model using NanoGPT. The model is trained on screenplays written by the renowned filmmaker Christopher Nolan.

The simplest, fastest repository for training/finetuning medium-sized GPTs. It is a rewrite of minGPT that prioritizes teeth over education. Still under active development, but currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training. The code itself is plain and readable: train.py is a ~300-line boilerplate training loop and model.py a ~300-line GPT model definition, which can optionally load the GPT-2 weights from OpenAI. That's it.

Because the code is so simple, it is very easy to hack to your needs, train new models from scratch, or finetune pretrained checkpoints (e.g. biggest one currently available as a starting point would be the GPT-2 1.3B model from OpenAI).

install

Dependencies:

pytorch <3
numpy <3
pip install transformers for huggingface transformers <3 (to load GPT-2 checkpoints)
pip install datasets for huggingface datasets <3 (if you want to download + preprocess OpenWebText)
pip install tiktoken for OpenAI's fast BPE code <3
pip install wandb for optional logging <3
pip install tqdm <3

quick start

$ python data/screenplay/prepare.py

This creates a train.bin and val.bin in that data directory. Now it is time to train your GPT. The size of it very much depends on the computational resources of your system:

$ python train.py config/train_screenplay.py

$ python sample.py --out_dir=out-screenplay

This generates a few samples, for example:

ANGELO:
And cowards it be strawn to my bed,
And thrust the gates of my threats,
Because he that ale away, and hang'd
An one with him.

DUKE VINCENTIO:
I thank your eyes against it.

DUKE VINCENTIO:
Then will answer him to save the malm:
And what have you tyrannous shall do this?

DUKE VINCENTIO:
If you have done evils of all disposition
To end his power, the day of thrust for a common men
That I leave, to fight with over-liking
Hasting in a roseman.

$ python train.py config/train_screenplay.py --device=cpu --compile=False --eval_iters=20 --log_interval=1 --block_size=64 --batch_size=12 --n_layer=4 --n_head=4 --n_embd=128 --max_iters=2000 --lr_decay_iters=2000 --dropout=0.0

$ python sample.py --out_dir=out-screenplay --device=cpu

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
__pycache__		__pycache__
assets		assets
config		config
data		data
out-screenplay		out-screenplay
out-shakespeare-char		out-shakespeare-char
.DS_Store		.DS_Store
NOLAN.pdf		NOLAN.pdf
README.md		README.md
bench.py		bench.py
configurator.py		configurator.py
model.py		model.py
sample.py		sample.py
train.py		train.py
transformer_sizing.ipynb		transformer_sizing.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ScreenPlay_nanoGPT

install

quick start

About

Releases

Packages

Languages

Adithya-jh/Generative_screenplay_Nano

Folders and files

Latest commit

History

Repository files navigation

ScreenPlay_nanoGPT

install

quick start

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages