
j0zko/tiny-llm


tiny-llm

A character-level GPT transformer trained on Shakespeare. Inference is implemented from scratch in Rust; training is done in Python with PyTorch.

How it works

1. Training (Python)

The training script (train/train.py) downloads the Tiny Shakespeare dataset (~1MB of text) and trains a small GPT model on it using PyTorch.

  • Tokenizer: character-level — each unique character is a token (65 total)
  • Model: 5-layer transformer, 4 attention heads, 64-dimensional embeddings, 256-token context window
  • Size: just under 1MB of weights (staying below 1MB is a design constraint)
  • Output: exports weights/shakespeare.bin (custom binary format) and weights/vocab.json
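As an illustration of the character-level tokenizer described above, here is a minimal sketch in plain Python. The function names (build_vocab, encode, decode) and the sorted ordering of characters are assumptions made for this example; they are not taken from train/train.py, and the layout of weights/vocab.json may differ.

```python
def build_vocab(text):
    """Assign an integer id to every unique character (sorted order is an assumption)."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}  # character -> id
    itos = {i: ch for ch, i in stoi.items()}      # id -> character
    return stoi, itos


def encode(text, stoi):
    """Turn a string into a list of token ids, one per character."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Turn a list of token ids back into a string."""
    return "".join(itos[i] for i in ids)


if __name__ == "__main__":
    stoi, itos = build_vocab("To be, or not to be")
    ids = encode("not to be", stoi)
    print(len(stoi), ids, decode(ids, itos))
```

On the full Tiny Shakespeare corpus the same procedure would produce the 65-entry vocabulary mentioned above. Characters that never appear in the training text have no id, so a prompt containing one would raise a KeyError in this sketch.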

2. Inference (Rust)

The Rust binary loads the pre-trained weights and generates text token by token.

  • Reads the .bin file directly into f32 slices (zero-copy weight loading)
  • Runs a full transformer forward pass: token + positional embeddings → N layers of (LayerNorm → Attention → LayerNorm → FFN) → LM head
  • Samples the next character using temperature scaling + top-k filtering
  • Streams output to stdout character by character
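The zero-copy loading idea in the first step can be sketched in Python as follows. This is only an illustration of the technique, not the Rust loader in src/weights.rs: the file layout assumed here (a flat run of native-endian f32 values with no header) and the helper names load_f32 and take are hypothetical, since the custom binary format is not documented here.

```python
import struct


def load_f32(buf):
    """View a bytes-like buffer as 32-bit floats without copying the data."""
    if len(buf) % 4 != 0:
        raise ValueError("buffer length is not a multiple of 4 bytes")
    return memoryview(buf).cast("f")


def take(view, offset, count):
    """Return `count` floats starting at `offset`, plus the next offset.

    Slicing a memoryview yields another view over the same memory, so each
    tensor is a window into the original buffer rather than a copy.
    """
    return view[offset:offset + count], offset + count


if __name__ == "__main__":
    # Stand-in for the contents of a weights file (assumed layout: raw f32 values).
    raw = struct.pack("6f", 1.0, 2.5, -0.5, 0.0, 4.0, 8.0)
    weights = load_f32(raw)
    embedding, pos = take(weights, 0, 2)   # hypothetical 2-float tensor
    ffn, pos = take(weights, pos, 4)       # hypothetical 4-float tensor
    print(list(embedding), list(ffn), pos)
```

With a real file, the same view could be taken over a memory-mapped buffer (Python's mmap module) instead of an in-memory bytes object, which keeps the weights out of the heap entirely.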

Project structure

train/
  train.py          # PyTorch training script
  input.txt         # auto-downloaded Shakespeare corpus
weights/
  shakespeare.bin   # exported model weights (binary)
  vocab.json        # character vocabulary
src/
  main.rs           # CLI entry point + sampling logic
  model.rs          # transformer forward pass (attention, FFN, layer norm)
  tokenizer.rs      # character-level tokenizer
  weights.rs        # binary weight file loader
generate.sh         # convenience wrapper around cargo run

Usage

Train the model

cd train
pip install -r requirements.txt
python train.py

Generate text

# using the convenience script
./generate.sh "HAMLET:" 200 0.8

# or directly
cargo run --release -- weights/shakespeare.bin weights/vocab.json "HAMLET:" --tokens 200 --temp 0.8 --topk 40

Options

Flag        Default  Description
--tokens N  200      Number of characters to generate
--temp F    0.8      Sampling temperature (higher = more random)
--topk N    40       Top-k filtering (0 = disabled)
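To show how the temperature and top-k options interact, here is a minimal sketch of one sampling step in plain Python. It is not the code in src/main.rs; the function name sample_next, the rng parameter, and the choice to reject non-positive temperatures are assumptions made for this example.

```python
import math
import random


def sample_next(logits, temperature=0.8, top_k=40, rng=random):
    """Pick one token id from raw logits using temperature scaling and top-k filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")  # assumed behaviour

    # Dividing by the temperature flattens (T > 1) or sharpens (T < 1) the distribution.
    scaled = [x / temperature for x in logits]

    # Keep only the top_k highest-scoring ids; top_k == 0 disables filtering.
    ids = list(range(len(scaled)))
    if 0 < top_k < len(scaled):
        ids = sorted(ids, key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the surviving ids, subtracting the max for numerical stability.
    peak = max(scaled[i] for i in ids)
    weights = [math.exp(scaled[i] - peak) for i in ids]

    # random.choices normalises the weights itself.
    return rng.choices(ids, weights=weights, k=1)[0]


if __name__ == "__main__":
    logits = [0.1, 3.0, -1.0, 2.2]
    print([sample_next(logits, temperature=0.8, top_k=2) for _ in range(10)])
```

With top_k set to 1 the step becomes greedy decoding, since only the highest-scoring character survives the filter. Lower temperatures move the other settings toward the same behaviour.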

Requirements

  • Inference: Rust (stable)
  • Training: Python 3.8+, PyTorch, NumPy

