The OCR-Based Student Assessment System automates the evaluation of handwritten exam scripts using Vision-Language Models (VLMs), Gemini-based assessment, and Retrieval-Augmented Generation (RAG).
It performs OCR, text evaluation, and figure analysis, providing a complete AI-based grading pipeline suitable for academic institutions.
- Uses Qwen2-VL-2B-Instruct, a vision-language model, to directly interpret handwritten text.
- Extracts student information (Name, ID, Course, etc.) from title pages.
- Detects question boundaries using standard markers such as `Answer to the question no-1a` and `End of Answer-1a`.
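The marker convention above lends itself to a simple pattern match. The following is a minimal sketch, assuming the markers appear in the OCR output exactly in the form shown (`Answer to the question no-<label>` and `End of Answer-<label>`); the function name `split_answers` is hypothetical, and the project's own boundary detection in `ocr_engine.py` may work differently.

```python
import re

# Assumed marker format, taken from the example in this README:
#   "Answer to the question no-<label>" ... "End of Answer-<label>"
ANSWER_BLOCK = re.compile(
    r"Answer to the question no-(?P<label>\w+)\s*"
    r"(?P<body>.*?)\s*"
    r"End of Answer-(?P=label)",
    re.IGNORECASE | re.DOTALL,
)


def split_answers(ocr_text: str) -> dict:
    """Map each question label (e.g. '1a') to the text between its markers."""
    return {
        match.group("label"): match.group("body")
        for match in ANSWER_BLOCK.finditer(ocr_text)
    }
```

The back-reference `(?P=label)` ensures an answer is only closed by the end marker carrying the same label, so a stray `End of Answer-2` inside answer `1a` does not truncate it.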
- Automatically detects and crops diagrams from scanned answer pages using OpenCV.
- Each extracted figure is evaluated using the Qwen2-VL and Gemini models.
- Outputs structured JSON such as:
{ "figure_number": "1a", "target": "heart", "caption": "Labeled diagram of human heart", "marks": 92 }
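Downstream code will usually want to check this JSON before trusting it, since model output can be malformed. A minimal validation sketch follows, assuming the four fields shown in the example are always required; the helper name `parse_figure_result` is hypothetical, and no range is enforced on `marks` because the README does not state one.

```python
import json

# Field names and types are inferred from the example output in this README.
REQUIRED_FIELDS = {
    "figure_number": str,
    "target": str,
    "caption": str,
    "marks": (int, float),
}


def parse_figure_result(raw: str) -> dict:
    """Parse a figure-assessment JSON string and check its required fields."""
    result = json.loads(raw)
    if not isinstance(result, dict):
        raise ValueError("figure result must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in result:
            raise ValueError(f"missing field: {field}")
        value = result[field]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"unexpected type for field: {field}")
    return result
```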
- Evaluates student answers using semantic similarity and keyword matching.
- Combines three factors:
  - Semantic Similarity (SentenceTransformer embeddings)
  - Keyword Overlap (TF-IDF weighting)
  - Length Factor (relative completeness)
- Two scoring pipelines available:
  - Gemini 2.5 Flash (default): API-based evaluation.
  - Prometheus RAG pipeline: local grading using textbook retrieval.
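One plausible way to combine the three local factors listed above is a weighted average. The sketch below is illustrative only: the weights `(0.6, 0.3, 0.1)`, the word-count definition of the length factor, and both function names are assumptions, not values taken from `assessment_core.py`. Each input score is expected to lie in the range 0 to 1.

```python
def length_factor(student_answer: str, reference_answer: str) -> float:
    """Relative completeness: student word count over reference word count, capped at 1."""
    reference_words = len(reference_answer.split())
    if reference_words == 0:
        return 0.0
    return min(1.0, len(student_answer.split()) / reference_words)


def combine_scores(semantic: float, keyword: float, length: float,
                   weights=(0.6, 0.3, 0.1)) -> float:
    """Weighted average of the three factors (weights are hypothetical)."""
    w_semantic, w_keyword, w_length = weights
    total = w_semantic + w_keyword + w_length
    weighted = w_semantic * semantic + w_keyword * keyword + w_length * length
    return weighted / total
```

Dividing by the weight total keeps the result in the 0 to 1 range even if the weights are later tuned and no longer sum to one.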
- Accepts single or multi-student `.zip` archives.
- Automatically categorizes:
  - `title` → student info extraction
  - `figure` → figure segmentation + assessment
  - `answer` → full OCR + grading
- Merges all text into structured `.txt` files for downstream grading.
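The categorization step can be pictured as a small routing function. This sketch assumes the category is inferred from the filename prefix, as in the sample archive layout (`title.png`, `figure_1.png`, `page_1.png`), with anything else treated as an answer page; the actual rule in `batch_processor.py` may differ, and `categorize` is a hypothetical name.

```python
from pathlib import PurePosixPath


def categorize(filename: str) -> str:
    """Return 'title', 'figure', or 'answer' based on the file's name prefix."""
    stem = PurePosixPath(filename).stem.lower()
    if stem.startswith("title"):
        return "title"
    if stem.startswith("figure"):
        return "figure"
    # Assumption: every remaining page (e.g. page_1.png) is an answer page.
    return "answer"
```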
- Builds FAISS vector store from textbooks or PDFs for contextual grading.
- Enables Prometheus to access relevant content during evaluation.
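To illustrate what retrieval from the vector store does conceptually, the sketch below ranks text chunks by cosine similarity to a query embedding. It deliberately swaps FAISS for a brute-force pure-Python search so it runs without dependencies; the vectors are toy values, and `cosine` and `top_k` are hypothetical helpers rather than functions from `rag_retriever.py`.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k: int = 2) -> list:
    """Return the indices of the k chunks most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

A FAISS index performs the same nearest-neighbour lookup, but over real embedding vectors and far more efficiently on large textbooks.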
- Provides RESTful endpoints for OCR, grading, and vector store management.
- Includes real-time streaming endpoints for batch script checking.
- Can run locally or be deployed as a microservice.
| Component | Purpose |
|---|---|
| `model_loader.py` | Loads Qwen2-VL model (supports GPU and quantization). |
| `ocr_engine.py` | Extracts text and figures from scanned images. |
| `figure_processor.py` | Segments figures using OpenCV contour detection. |
| `gemini_assessment.py` | Grades answers with Gemini 2.5 Flash API. |
| `gemini_figure_processor.py` | Grades figures via Gemini multimodal reasoning. |
| `assessment_core.py` | Local semantic evaluation pipeline. |
| `assessment_core_prometheus.py` | RAG-enhanced Prometheus-based assessment. |
| `rag_retriever.py` | Retrieves semantically relevant content from FAISS index. |
| `build_vector_store.py` | Builds FAISS index from textbook PDFs. |
| `semantic_chunking_vector_store.py` | Alternative semantic chunking index builder. |
| `batch_processor.py` | Handles ZIP batch processing of multiple scripts. |
| `database.py` | Writes extracted student info and marks to CSV. |
| `app.py` | FastAPI backend server for API usage. |
| `config.py` | Stores configuration and prompt templates. |
project_root/
├── app.py
├── batch_processor.py
├── ocr_engine.py
├── assessment_core.py
├── assessment_core_prometheus.py
├── gemini_assessment.py
├── gemini_figure_processor.py
├── figure_processor.py
├── model_loader.py
├── database.py
├── config.py
├── rag_retriever.py
├── semantic_chunking_vector_store.py
├── build_vector_store.py
├── format_answer.py
├── utils.py
├── main.py
├── output/
│   ├── text/
│   ├── figures/
│   └── assessments/
└── results/
    ├── students.csv
    └── marks.csv
- Python 3.9 or higher
- CUDA GPU (optional but recommended)
- Minimum 8 GB VRAM (4 GB possible with quantization)
git clone <repo_url>
cd OCR-Based-Student-Assessment-System
python -m venv venv
source venv/bin/activate # Linux/macOS
venv\Scripts\activate # Windows
pip install -r requirements.txt
python cuda.py
python main.py /path/to/student_script.zip
python main.py --multi parent_batch.zip
parent_batch/
├── student_1/
│   ├── title.png
│   ├── figure_1.png
│   └── page_1.png
├── student_2/
│   ├── title.png
│   ├── figure_2.png
│   └── page_1.png
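The multi-student layout above can be walked with the standard `zipfile` module. This is a minimal sketch, assuming each image sits directly inside its student folder; `group_by_student` and `demo_archive` are hypothetical helpers for illustration and are not part of `batch_processor.py`.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_student(archive) -> dict:
    """Group file names by the student folder that directly contains them."""
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            parts = PurePosixPath(name).parts
            if len(parts) < 2:
                continue  # file is not inside any folder
            # The student folder is the file's immediate parent directory.
            groups[parts[-2]].append(parts[-1])
    return {student: sorted(files) for student, files in groups.items()}


def demo_archive() -> io.BytesIO:
    """Build a small in-memory archive mirroring the documented layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name in (
            "parent_batch/student_1/title.png",
            "parent_batch/student_1/page_1.png",
            "parent_batch/student_2/title.png",
        ):
            zf.writestr(name, b"")  # empty placeholder content
    buffer.seek(0)
    return buffer
```

Calling `group_by_student(demo_archive())` yields one entry per student folder, which is a convenient shape for processing scripts one student at a time.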
uvicorn app:app --host 0.0.0.0 --port 8000

GET /health
Response:
{ "status": "ok" }

POST /api/v1/upload-pdf
Uploads a textbook PDF and rebuilds the FAISS index.
Response:
{ "status": "ok", "message": "Vector store rebuilt." }

POST /api/v1/check-scripts
Accepts a .zip of scanned pages and rubric text, returns structured grading results.

POST /api/v1/batch-check-scripts-stream
Processes multi-student ZIP archives with real-time progress events.
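Before submitting scripts, a client can confirm the server is up by calling the health endpoint. The sketch below uses only the standard library and assumes the server is reachable at `http://localhost:8000` (matching the port in the `uvicorn` command); the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the API runs locally on the port used in the uvicorn command.
BASE_URL = "http://localhost:8000"


def build_health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Create the GET request for the /health endpoint."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/health", method="GET")


def is_healthy(body: bytes) -> bool:
    """Return True when the body matches the documented { "status": "ok" } payload."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        return False  # not JSON, or not a JSON object


def check_health(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Call /health and report whether the server answered as expected."""
    try:
        request = build_health_request(base_url)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_healthy(response.read())
    except OSError:
        return False  # connection refused, timeout, or HTTP error
```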
Create a semantic knowledge base from textbooks for context-aware evaluation:
python build_vector_store.py
or, for advanced semantic chunking:
python semantic_chunking_vector_store.py

| Pipeline | Model | Description |
|---|---|---|
| Gemini (Default) | gemini-2.5-flash-lite | Fast API-based grading |
| Prometheus RAG | Local + FAISS | Contextual grading with textbook retrieval |
- OCR accuracy varies with handwriting clarity.
- Prometheus pipeline requires GPU memory and FAISS index.
- Gemini API key required via `.env`.
Developed by Tanjeeb Meheran Rohan and Afra Anika
Department of Computer Science and Engineering
Islamic University of Technology (IUT)
© 2025. Academic and Research Use Only.