
🔥 Immi-Torch

A minimal deep learning framework built from scratch

Python 3.9+ · License: MIT · Learning · Immi-Torch CI

"Immi" (Tamil: เฎ‡เฎฎเฏเฎฎเฎฟ) โ€” the smallest primitive measure (1/2,150,400)

Big Picture • Tiers • Milestones • Learning Paths


🎯 What is this?

I'm building a stripped-down, primitive implementation of PyTorch from scratch for educational purposes. No magic, no black boxes; just pure understanding of how deep learning frameworks actually work.

Following the TinyTorch curriculum from the ML Systems Book by Prof. Vijay Janapa Reddi (Harvard University).

"What I cannot create, I do not understand." โ€” Richard Feynman


🗺️ The Big Picture

20 modules. Three tiers. One complete ML system.

TinyTorch Module Flow: Foundation (blue) → Architecture (purple) → Optimization (orange)

```text
┌─ OPTIMIZATION (15-20) 🟠 ───────────────────────────────────────────
│  Profiling → Quantization → Compression → Memoization
│            → Acceleration → Benchmarking
├─ ARCHITECTURE (09-14) 🟣 ───────────────────────────────────────────
│  DataLoader+ → CNNs                                  (Vision Track)
│  DataLoader+ → Tokenization → Embeddings → Attention → Transformers
│                                                     (Language Track)
├─ FOUNDATION (01-08) 🔵 ─────────────────────────────────────────────
│  Tensor → Activations → Layers → Losses
│     ↓                               ↓
│  Autograd → Optimizers → Training
└─────────────────────────────────────────────────────────────────────
```

🎨 Three Tiers

🔵 Tier 1: Foundation (Modules 01-08)

Build the core machinery

| #  | Module      | What it does                               | Status          |
|----|-------------|--------------------------------------------|-----------------|
| 01 | Tensor      | Data structure - holds all your numbers    | ✅ pushed Jan 5  |
| 02 | Activations | Non-linearity - ReLU, Sigmoid, Tanh        | ✅ pushed Jan 29 |
| 03 | Layers      | Parameterized transformations              | ⏳               |
| 04 | Losses      | Measure prediction error                   | ⏳               |
| 05 | DataLoader  | Efficient data batching                    | ⏳               |
| 06 | Autograd    | Automatic gradient computation             | ⏳               |
| 07 | Optimizers  | SGD, Adam, RMSprop                         | ⏳               |
| 08 | Training    | Complete training loop                     | ⏳               |
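
To make Tier 1 concrete, here is a rough sketch of the kind of object Module 01 builds: a NumPy-backed tensor with a slot for gradients, plus a Module 02-style activation. It is an illustration with assumed names, not the actual Immi-Torch code.

```python
# Illustrative sketch only: not the actual Immi-Torch API.
# Assumes a NumPy-backed Tensor like the one Module 01 targets.
import numpy as np


class Tensor:
    """A bare-bones multidimensional array with a slot for gradients."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=np.float32)
        self.grad = None  # filled in later by autograd (Module 06)

    @property
    def shape(self):
        return self.data.shape

    def __add__(self, other):
        return Tensor(self.data + other.data)

    def __matmul__(self, other):
        return Tensor(self.data @ other.data)


def relu(t: Tensor) -> Tensor:
    """Module 02 style non-linearity: clamp negatives to zero."""
    return Tensor(np.maximum(t.data, 0.0))


if __name__ == "__main__":
    x = Tensor([[-1.0, 2.0], [3.0, -4.0]])
    w = Tensor([[0.5, 0.5], [0.5, 0.5]])
    print(relu(x @ w).data)  # forward pass through a toy "layer"
```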

🟣 Tier 2: Architecture (Modules 09-14)

Apply foundation to real problems

| #  | Module       | What it does              | Track       |
|----|--------------|---------------------------|-------------|
| 09 | DataLoader+  | Advanced data pipelines   | Both        |
| 10 | CNNs         | Convolutions for images   | 👁️ Vision   |
| 11 | Tokenization | Text → tokens             | 📝 Language |
| 12 | Embeddings   | Tokens → vectors          | 📝 Language |
| 13 | Attention    | Self-attention mechanism  | 📝 Language |
| 14 | Transformers | GPT architecture          | 📝 Language |
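
As a preview of the Language Track, here is a minimal NumPy sketch of the scaled dot-product attention that Module 13 targets. The names and shapes are assumptions for illustration; the real module will build on the Tier 1 Tensor and layers.

```python
# Illustrative sketch of scaled dot-product attention (Module 13's target),
# written with plain NumPy rather than the Immi-Torch Tensor.
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) V for a single (seq_len, d) sequence."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)       # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v                  # weighted mix of the values


seq_len, d = 4, 8
x = np.random.randn(seq_len, d)
out = self_attention(x, x, x)           # self-attention: Q = K = V = x
print(out.shape)                        # (4, 8)
```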

🟠 Tier 3: Optimization (Modules 15-20)

Make it production-ready

| #  | Module       | What it does           |
|----|--------------|------------------------|
| 15 | Profiling    | Find bottlenecks       |
| 16 | Quantization | Reduce precision       |
| 17 | Compression  | Smaller models         |
| 18 | Memoization  | Cache computations     |
| 19 | Acceleration | Hardware optimization  |
| 20 | Benchmarking | MLPerf-style metrics   |
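
For a taste of what Tier 3 optimizes, here is a toy sketch of the idea behind Module 16 (quantization): mapping float32 weights to int8 plus a scale factor. It is an assumption-laden NumPy illustration, not the module's actual interface.

```python
# Toy post-training quantization sketch (the theme of Module 16).
# Not the Immi-Torch API: just the core idea of float32 -> int8 + scale.
import numpy as np


def quantize_int8(weights: np.ndarray):
    """Symmetric linear quantization: w ≈ scale * q, with q in [-127, 127]."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale


w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(f"storage: {w.nbytes} B -> {q.nbytes} B")         # 4x smaller
print(f"max abs error: {np.abs(w - w_hat).max():.4f}")  # small, not zero
```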

🏆 Milestones

Historical achievements I'll unlock by recreating 70 years of ML evolution:

| Milestone      | Year | Achievement                           | Modules Required |
|----------------|------|---------------------------------------|------------------|
| 🧠 Perceptron  | 1957 | First learning algorithm (Rosenblatt) | 01-04            |
| ⚡ XOR         | 1969 | MLP solves non-linear problems        | 01-08            |
| ✍️ MLP         | 1986 | Handwritten digit recognition         | 01-08            |
| 👁️ CNN         | 1998 | LeNet-5 image classification          | 01-10            |
| 🤖 Transformer | 2017 | "Attention Is All You Need"           | 01-14            |
| 🏁 MLPerf      | 2018 | Production-speed benchmarks           | 01-20            |

What I'll Have at Each Checkpoint

| Modules | Outcome                                      | Historical Context                  |
|---------|----------------------------------------------|-------------------------------------|
| 01-04   | Working Perceptron classifier                | Rosenblatt (1957)                   |
| 01-08   | MLP solving XOR + complete training pipeline | AI Winter breakthrough (1969→1986)  |
| 01-10   | CNN with convolutions and pooling            | LeNet-5 (1998)                      |
| 01-14   | GPT model with autoregressive generation     | "Attention Is All You Need" (2017)  |
| 01-19   | Optimized, quantized, accelerated system     | Production ML today                 |
| 01-20   | MLPerf-style benchmarking submission         | Torch Olympics                      |
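
As an example of what the very first checkpoint involves, here is a compact NumPy sketch of the 1957 perceptron learning rule. It is an illustration only, not the actual milestones/01_perceptron.py script, which will be built on the Immi-Torch modules.

```python
# Sketch of the 1957 perceptron learning rule (the first milestone's target).
# Plain NumPy illustration, not the actual milestones/01_perceptron.py.
import numpy as np


def train_perceptron(X, y, epochs=20, lr=0.1):
    """Learn w, b so that sign(w.x + b) matches labels y in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified: nudge the boundary
                w += lr * yi * xi
                b += lr * yi
    return w, b


# Linearly separable toy data: the class is the sign of the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] > 0, 1, -1)

w, b = train_perceptron(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
print(f"training accuracy: {acc:.2f}")
```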

📁 Project Structure

```text
Immi-Torch/
├── immi_torch/
│   ├── __init__.py                    # Main package exports
│   │
│   ├── tier1_foundation/              # 🔵 Core ML machinery (01-08)
│   │   ├── tensor.py                  # 01: Multidimensional arrays
│   │   ├── activations.py             # 02: ReLU, Sigmoid, Tanh
│   │   ├── layers.py                  # 03: Linear, Module base
│   │   ├── losses.py                  # 04: MSE, CrossEntropy
│   │   ├── data.py                    # 05: DataLoader, Dataset
│   │   ├── autograd.py                # 06: Automatic differentiation
│   │   ├── optim.py                   # 07: SGD, Adam, RMSprop
│   │   └── train.py                   # 08: Training loop
│   │
│   ├── tier2_architecture/            # 🟣 Vision & Language (09-14)
│   │   ├── cnn.py                     # 10: Conv2d, Pooling
│   │   ├── tokenizer.py               # 11: Text tokenization
│   │   ├── embeddings.py              # 12: Token embeddings
│   │   ├── attention.py               # 13: Self-attention
│   │   └── transformer.py             # 14: GPT architecture
│   │
│   └── tier3_optimization/            # 🟠 Production-ready (15-20)
│       ├── profiling.py               # 15: Find bottlenecks
│       ├── quantization.py            # 16: Reduce precision
│       ├── compression.py             # 17: Pruning, distillation
│       ├── memoization.py             # 18: Cache computations
│       ├── acceleration.py            # 19: JIT, op fusion
│       └── benchmarking.py            # 20: MLPerf metrics
│
├── tests/                             # Test suite
├── milestones/                        # Historical achievements (70 years of ML)
│   ├── 01_perceptron.py               # 🧠 1957 - First neural network
│   ├── 02_xor.py                      # ⚡ 1969 - Non-linear learning
│   ├── 03_mnist_mlp.py                # ✍️ 1986 - Handwritten digits
│   ├── 04_cnn_lenet.py                # 👁️ 1998 - LeNet-5 vision
│   ├── 05_transformer.py              # 🤖 2017 - Attention mechanism
│   └── 06_mlperf.py                   # 🏁 2018 - Production benchmarks
├── examples/                          # Usage examples
├── docs/                              # Documentation
└── tier1_plans.md                     # Detailed Tier 1 roadmap
```

🚀 Getting Started

```bash
# Clone the repository
git clone https://github.com/ashwin-r11/Immi-Torch.git
cd Immi-Torch

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest tests/
```

Quick Example

```python
from immi_torch import Tensor, Linear, ReLU, MSELoss, SGD

# Create a simple model
model = Linear(10, 1)
loss_fn = MSELoss()
optimizer = SGD(model.parameters(), lr=0.01)

# Training step
x = Tensor.randn(32, 10)
y = Tensor.randn(32, 1)

pred = ReLU()(model(x))
loss = loss_fn(pred, y)
loss.backward()
optimizer.step()
```

💪 Expect to Struggle (That's the Design)

Getting stuck is not a bug; it's a feature.

TinyTorch uses productive struggle as a teaching tool. The frustration you feel is your brain rewiring to understand ML systems at a deeper level.

When stuck:

  • Run tests early and often
  • Explain the problem to a rubber duck
  • Ask for help after 30+ minutes on a single bug

📚 Resources

  • [TinyTorch curriculum](https://mlsysbook.ai/tinytorch/), the course this project follows
  • [ML Systems Book](https://mlsysbook.ai/) by Prof. Vijay Janapa Reddi (Harvard University)


📄 License

MIT License - see LICENSE for details.


The North Star 🌟

By module 14, I'll have a complete GPT model generating text, built from raw Python.

By module 20, I'll benchmark my entire framework with MLPerf-style submissions.

Every tensor operation. Every gradient calculation. Every optimization trick.

I wrote it.
