
Models


This folder contains EduScale's super-resolution model code, training workflows, checkpoints, conversion tools, and deployment exports.

Role In The Study

Models are evaluated as readability enhancers for educational content. The primary research question is whether lightweight SPAN-based upscaling can make low-resolution lecture slides and classroom videos more readable on budget Android hardware.

Folder Layout

| Path | Purpose |
| --- | --- |
| models/training/ | Fine-tuning, evaluation, augmentation, OCR benchmark, and data preparation helpers |
| models/conversion/ | PyTorch, ONNX, quantization, and TFLite conversion scripts |
| models/span/ | SPAN architecture plus original, fine-tuned, optimized, and metadata folders |
| models/residual_sr.py | Lightweight residual baseline model |
| models/torch_device.py | Shared device resolution helper |

Current Model Inventory

| Item | Count |
| --- | --- |
| Retained fine-tuned run folders | 30 |
| Runs with metrics.csv | 30 |
| Completed training epoch rows | 1,359 |
| App-packaged TFLite files | 2 |

The full training history is summarized in ../TRAINING_REPORT.md.
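The inventory counts can be recomputed from disk. The following is a minimal sketch, assuming each retained run is a direct subfolder of the fine-tuned directory and that every data row in a metrics.csv represents one completed epoch. Both points are assumptions about the layout, and `summarize_runs` is a hypothetical helper, not part of the repository.

```python
import csv
from pathlib import Path


def summarize_runs(root: Path) -> dict:
    """Count run folders, runs with metrics.csv, and logged epoch rows.

    Assumes each run is a direct subfolder of `root` and that metrics.csv
    has one header line followed by one row per completed epoch.
    """
    run_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    runs_with_metrics = 0
    epoch_rows = 0
    for run_dir in run_dirs:
        metrics_path = run_dir / "metrics.csv"
        if not metrics_path.is_file():
            continue
        runs_with_metrics += 1
        with metrics_path.open(newline="") as handle:
            # DictReader consumes the header, so only data rows are counted.
            epoch_rows += sum(1 for _ in csv.DictReader(handle))
    return {
        "run_folders": len(run_dirs),
        "runs_with_metrics": runs_with_metrics,
        "epoch_rows": epoch_rows,
    }
```

Calling `summarize_runs(Path("models/span/education-finetuned"))` from the repository root would report the three counts for comparison with the table.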

Current Deployment Candidates

| Scale | Checkpoint | Reason it matters |
| --- | --- | --- |
| x2 | models/span/education-finetuned/x2-v3-tpgsr-refine-20260411/last_model.pt | Strongest saved x2 readability result: 91.2295 OCR confidence and 0.0765 CER on the 291-sample updated held-out set |
| x3 | models/span/education-finetuned/x3-real-refine-20260412/best_model.pt | Refined x3 real-video checkpoint with 26.6166 PSNR and 0.9642 SSIM on the 291-sample updated held-out set |
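For readers unfamiliar with CER, the sketch below shows the standard definition: the character-level edit distance between the OCR output and the reference text, divided by the reference length. It is an illustration only. The repository's OCR benchmark may normalize case, whitespace, or punctuation differently, and the function names here are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character edits turning one string into the other."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length; lower is better."""
    if not reference:
        # Convention chosen for this sketch: an empty reference scores 0.0
        # only when the hypothesis is also empty.
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)
```

Under this definition, one wrong character in a five-character reference gives a CER of 0.2.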

Public TensorFlow Lite exports are available on Hugging Face: jimzzzz/EduScale.

Packaged Android exports:

| File | Size | Target |
| --- | --- | --- |
| models/span/optimized/span_education_x2.tflite | 971 KB | 360p to 720p |
| models/span/optimized/span_education_x3.tflite | 1.18 MB | 240p to 720p |

Model Selection Criteria

Use a checkpoint only after checking:

  1. Held-out PSNR and SSIM on the manifest that matches the checkpoint's scale.
  2. OCR confidence and CER against the same held-out manifest.
  3. Runtime on the intended device or a representative desktop benchmark.
  4. Exportability to TFLite without unsupported operators.
  5. App-side behavior with real educational input videos.
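The checklist above can be expressed as an explicit gate. The sketch below is one possible shape for such a check, not the project's tooling: the `Limits` fields, the summary keys, and the idea of recording criteria 4 and 5 as manual pass/fail flags are all assumptions, and the thresholds must be chosen per scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical acceptance limits; choose values per scale and study."""
    min_psnr: float
    min_ssim: float
    min_ocr_confidence: float
    max_cer: float
    max_runtime_ms: float


def failed_checks(summary: dict, limits: Limits) -> list:
    """Return the names of criteria the checkpoint does not satisfy.

    `summary` uses hypothetical keys. A missing key counts as a failure so
    that an unmeasured criterion is never silently accepted.
    """
    checks = {
        "psnr": lambda v: v >= limits.min_psnr,
        "ssim": lambda v: v >= limits.min_ssim,
        "ocr_confidence": lambda v: v >= limits.min_ocr_confidence,
        "cer": lambda v: v <= limits.max_cer,
        "runtime_ms": lambda v: v <= limits.max_runtime_ms,
        # Criteria 4 and 5 are recorded as manual pass/fail flags.
        "tflite_export_ok": lambda v: v is True,
        "app_behavior_ok": lambda v: v is True,
    }
    failures = []
    for name, passes in checks.items():
        value = summary.get(name)
        if value is None or not passes(value):
            failures.append(name)
    return failures
```

An empty list means every criterion was measured and passed. Treating missing values as failures keeps a checkpoint from being promoted on quality metrics alone before runtime and app behavior have been checked.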

Training Workflows

Single-stage fine-tuning

.\.venv\Scripts\python.exe -m models.training.finetune_education `
  --manifest_name powerpoint_raw_x2 `
  --output_dir models/span/education-finetuned/x2-raw-example `
  --scale 2 `
  --use_span `
  --init_checkpoint models/span/education-finetuned/x2-v3-tpgsr/best_model.pt `
  --selection_metric val_l1 `
  --batch_size 8 `
  --epochs 50 `
  --lr 2e-5 `
  --num_workers auto

OCR-aware checkpoint selection

.\.venv\Scripts\python.exe -m models.training.finetune_education `
  --manifest_name powerpoint_raw_x2 `
  --output_dir models/span/education-finetuned/x2-ocr-aware-example `
  --scale 2 `
  --use_span `
  --init_checkpoint models/span/education-finetuned/x2-v3-tpgsr/best_model.pt `
  --selection_metric ocr_primary `
  --heldout_pairs_csv datasets/manifests/heldout/heldout_real_x2.csv `
  --benchmark_interval 2 `
  --batch_size 8 `
  --epochs 20 `
  --lr 5e-5 `
  --num_workers auto

Two-stage training

.\.venv\Scripts\python.exe scripts/train_two_stage.py `
  --stage1_train_csv datasets/manifests/all_x2_20260409/train_pairs.csv `
  --stage1_val_csv datasets/manifests/all_x2_20260409/val_pairs.csv `
  --stage2_train_csv datasets/manifests/powerpoint_raw_x2/train_pairs.csv `
  --stage2_val_csv datasets/manifests/powerpoint_raw_x2/val_pairs.csv `
  --heldout_pairs_csv datasets/manifests/heldout/heldout_real_x2.csv `
  --output_dir models/span/education-finetuned/x2-two-stage-example `
  --scale 2 `
  --use_span `
  --selection_metric ocr_primary

Evaluation Workflow

.\.venv\Scripts\python.exe -m models.training.evaluation `
  --pairs_csv datasets/manifests/heldout/heldout_real_x2.csv `
  --checkpoint models/span/education-finetuned/x2-v3-tpgsr-refine-20260411/last_model.pt `
  --output_csv benchmarks/results/x2-v3-tpgsr-refine-20260411-last-heldout-updated-eval.csv `
  --output_summary_json benchmarks/results/x2-v3-tpgsr-refine-20260411-last-heldout-updated-summary.json `
  --include_ocr_metrics

Conversion Workflow

.\.venv\Scripts\python.exe models/conversion/optimization_pipeline.py `
  --checkpoint models/span/education-finetuned/x2-v3-tpgsr-refine-20260411/last_model.pt `
  --output_dir models/span/optimized/x2-v3-refine-export `
  --input_size 1,3,360,640 `
  --quant fp16
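The `--input_size` value fixes the tensor shape used for export, so it is worth checking that it lines up with the intended output resolution. The sketch below assumes the four numbers are in batch, channels, height, width order, which the value 1,3,360,640 suggests but the source does not state. The helper names are hypothetical, and the exported TFLite model may use a different layout.

```python
def parse_input_size(text: str) -> tuple:
    """Parse a comma-separated shape string such as "1,3,360,640"."""
    parts = [int(part) for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated integers, got {text!r}")
    return tuple(parts)


def upscaled_shape(input_size: tuple, scale: int) -> tuple:
    """Output shape of a super-resolution model, assuming NCHW input order."""
    batch, channels, height, width = input_size
    # Only the spatial dimensions grow; batch and channel counts are unchanged.
    return (batch, channels, height * scale, width * scale)
```

With the x2 command above, `upscaled_shape(parse_input_size("1,3,360,640"), 2)` gives `(1, 3, 720, 1280)`, which matches the 360p to 720p target listed for the x2 export.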

Copy final exports into the Android app:

Copy-Item models/span/optimized/span_education_x2.tflite android/app/src/main/assets/span_education_x2.tflite -Force
Copy-Item models/span/optimized/span_education_x3.tflite android/app/src/main/assets/span_education_x3.tflite -Force

Checkpoint Files

| File | Meaning |
| --- | --- |
| best_model.pt | Best checkpoint for the selected metric |
| best_loss_model.pt | Best validation-loss checkpoint |
| best_ocr_model.pt | Best OCR-primary checkpoint when OCR selection is enabled |
| last_model.pt | Most recent training state |
| checkpoint_epoch_050.pt | Periodic epoch snapshot |
| metrics.csv | Per-epoch training log |
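Because several checkpoint files can coexist in one run folder, it can help to inspect metrics.csv directly to see which epoch scored best on a given metric. The sketch below assumes the log has a header row and a column named after the metric, such as `val_l1` (the name is borrowed from the `--selection_metric` flag and is not confirmed as a column name). `load_rows` and `best_row` are hypothetical helpers.

```python
import csv
from pathlib import Path


def load_rows(metrics_csv: Path) -> list:
    """Read the per-epoch log into a list of dictionaries keyed by column name."""
    with metrics_csv.open(newline="") as handle:
        return list(csv.DictReader(handle))


def best_row(rows, metric: str = "val_l1", lower_is_better: bool = True) -> dict:
    """Return the logged row with the best value for `metric`.

    Rows where the metric is missing or empty are skipped, which covers
    metrics that are only measured every few epochs.
    """
    scored = [row for row in rows if row.get(metric) not in (None, "")]
    if not scored:
        raise ValueError(f"no rows contain a value for {metric!r}")
    pick = min if lower_is_better else max
    return pick(scored, key=lambda row: float(row[metric]))
```

For loss-style metrics keep `lower_is_better=True`; for scores such as OCR confidence, pass `lower_is_better=False`.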

Open Details To Confirm

  • Final publication checkpoint for x2 and x3
  • Exact SPAN source citation and architecture variant to cite
  • Whether model binaries should be published or released separately
  • Target Android runtime backend: CPU, NNAPI, GPU delegate, or mixed

Related Docs