This project explores methods for enhancing educational accessibility by transforming lecture videos into structured, book-like formats. We apply Computer Vision and Machine Learning techniques such as optical flow, scene segmentation, masking, and CNN-based analysis to detect visual transitions and improve readability for all users, especially those with learning differences or limited access to multimedia content.
project/
├── annotations/
│   ├── masking.py                  # Applies Gaussian blur and thresholding to extract foreground
│   ├── masking_and_subtraction.py  # Full pipeline for masking + subtraction + scene detection
│   └── subtraction.py              # Identifies scene changes by pixel-level subtraction
├── scrolling/
│   ├── scroll_detector.ipynb       # Optical flow-based scroll detection
│   └── webcam_OF_demo.ipynb        # Real-time optical flow demo using webcam
├── segmenting/
│   ├── CNN.ipynb                   # Convolutional Neural Network for scene segmentation
│   └── Labeling.ipynb              # Manual labeling tool for training data
├── archived/
├── data/                           # Includes example data used
└── presentations/                  # Includes posters and presentations
This project uses the following libraries and tools:
- Python 3.8+
- OpenCV
- NumPy
- Matplotlib
- TensorFlow / PyTorch (depending on CNN implementation)
- Jupyter Notebook
To install dependencies:
pip install -r requirements.txt
Given an engineering lecture video, this component aims to distinguish frames that only add annotations, as well as redundant frames, from genuine scene changes.
The pipeline involves extracting frames, applying masks, performing pixel-level subtraction, and counting the remaining non-null pixels to detect scene transitions.
Thresholding is used to isolate the foreground elements in segmented frames. This works by:
- Applying a Gaussian Blur to reduce noise,
- Reducing the RGB color range,
- Applying thresholding against a constant value $t$ to produce a binary (black-and-white) image, marking pixel $(x, y)$ as foreground when $a_{x,y} < t$.
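
A minimal sketch of this masking step, assuming OpenCV and treating the reduced color range as a grayscale conversion; the exact parameters in `annotations/masking.py` may differ:

```python
import cv2

def mask_frame(frame, threshold=127, blur_kernel=(5, 5)):
    """Blur and threshold a frame to isolate foreground (e.g., pen strokes)."""
    blurred = cv2.GaussianBlur(frame, blur_kernel, 0)   # reduce noise
    gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)    # collapse the color range
    # Pixels darker than `threshold` (a_{x,y} < t) become white foreground
    _, binary = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY_INV)
    return binary

# Usage on a single extracted frame (path is illustrative)
mask = mask_frame(cv2.imread("frame_0001.png"))
```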
This method compares pairs of consecutive frames and "subtracts" similar pixels:
- For each adjacent frame pair (`frame1`, `frame2`), if pixel $(x, y)$ is similar within a threshold, it is nulled out.
- This is repeated across all frame pairs to identify major changes.
- Frames with more than 1.5% remaining (non-null) pixels are then flagged as potential scene changes.
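
A minimal sketch of the subtraction and flagging logic, assuming grayscale frames and an illustrative per-pixel threshold; the constants in `subtraction.py` may differ:

```python
import cv2

def remaining_pixel_ratio(frame1, frame2, pixel_threshold=25):
    """Null out pixels that barely change between two frames and return
    the fraction of pixels that remain (are non-null)."""
    diff = cv2.absdiff(frame1, frame2)        # per-pixel absolute difference
    remaining = diff > pixel_threshold        # True where the pixel changed noticeably
    return remaining.mean()

def flag_scene_changes(frames, ratio_threshold=0.015):
    """Return indices of frames whose difference to the previous frame
    leaves more than 1.5% of pixels non-null."""
    return [i for i in range(1, len(frames))
            if remaining_pixel_ratio(frames[i - 1], frames[i]) > ratio_threshold]
```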
The masking_and_subtraction.py script combines all the above into a streamlined pipeline:
- Frame extraction
- Masking and subtraction
- Remaining pixel analysis
- Scene change prediction and output generation
Example output:
Extracting frames 19:47:37
Creating masks 19:47:42
Starting subtraction 19:48:19
Counting remaining 19:48:41
Selecting frames 19:48:41
[21, 25, 226, 227, 228, 282, 300, 301, 320, 321, 354, 355, 356, 369, 370, 371, 385, 386, 407, 408, 409, 410, 411, 412, 413, 547, 548, 549, 550, 551, 552]
Finished 19:48:41
This notebook extracts frames from a video and compares them using a pre-trained Convolutional Neural Network (CNN), specifically VGG16 from Keras. It uses CNN feature vectors to compute visual similarity between scenes and saves the computed distances to a text file.
It performs the following steps:
- Extracts one frame per second from the input video
- Converts each frame into a feature vector using the VGG16 model
- Computes the Euclidean distance between consecutive frame vectors
- Saves the distance values to output/txt/output.txt
- Helps identify slide changes or major scene transitions based on the distance values
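
A minimal sketch of the feature extraction and distance computation, assuming Keras' VGG16 with `include_top=False` and average pooling; the notebook's exact configuration may differ:

```python
import cv2
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

# Pre-trained VGG16 without the classifier head; global pooling yields a flat vector
model = VGG16(weights="imagenet", include_top=False, pooling="avg")

def frame_vector(frame_bgr):
    """Resize a frame to VGG16's input size and return its feature vector."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    resized = cv2.resize(rgb, (224, 224)).astype(np.float32)
    batch = preprocess_input(resized[np.newaxis])   # shape (1, 224, 224, 3)
    return model.predict(batch, verbose=0)[0]       # shape (512,)

def consecutive_distances(frames):
    """Euclidean distance between feature vectors of consecutive frames."""
    vectors = [frame_vector(f) for f in frames]
    return [float(np.linalg.norm(a - b)) for a, b in zip(vectors, vectors[1:])]
```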
Example:
Input:
- Video file in `data/input/`
Output:
- Feature distances saved in `data/output/output_cropped_video.txt`
- Extracted frames saved in `data/output/segmented_frames_cropped_video`
A small notebook to help label frames of a video as scene changes or non-scene-changes, with a flexible interval between selected frames (sampling frequency). It traverses the given video backwards and displays two buttons ('scene change' and 'no scene change') along with two frames from the video; the top frame chronologically precedes the bottom frame by one interval.
If 'scene change' is clicked, a row with the frame number, sample number (0-indexed), and 'True' is written to the CSV.
If 'no scene change' is clicked, a row with the frame number, sample number, and 'False' is written.
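
A minimal sketch of the labeling callback, assuming `ipywidgets` buttons and Python's `csv` module; the interval, video path, and CSV path below are illustrative:

```python
import csv
import cv2
import ipywidgets as widgets
from IPython.display import display

INTERVAL = 30                                      # sampling interval in frames (illustrative)
cap = cv2.VideoCapture("data/input/lecture.mp4")   # illustrative path
total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
frame_idx = total - 1                              # traverse the video backwards
sample_idx = 0

def write_label(is_scene_change):
    """Append one row: frame number, sample number, True/False."""
    global frame_idx, sample_idx
    with open("labels.csv", "a", newline="") as f:
        csv.writer(f).writerow([frame_idx, sample_idx, is_scene_change])
    frame_idx -= INTERVAL
    sample_idx += 1

yes_btn = widgets.Button(description="scene change")
no_btn = widgets.Button(description="no scene change")
yes_btn.on_click(lambda _: write_label(True))
no_btn.on_click(lambda _: write_label(False))
display(yes_btn, no_btn)
```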
Detects scrolling within lecture videos using CNN-based visual similarity and optical flow.
This notebook performs a three-stage pipeline:
- Frame Extraction: Extracts one frame per second from the lecture video.
- CNN-Based Grouping: Uses VGG16 features to group visually similar frames, separating slide changes from continuous scrolling.
- Optical Flow Analysis: Computes vertical motion between adjacent frames using Farneback dense optical flow. Scrolling is identified based on:
  - Rewarded Vertical Motion (RVM): the vertical motion between frames, weighted by the coverage of the area in motion (a sketch of this computation appears below).
The output is a `.csv` file per scene group indicating how much scrolling occurred between consecutive frames.
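
A minimal sketch of the vertical-motion measurement with Farneback dense optical flow; the motion threshold and weighting below are illustrative rather than the notebook's exact RVM formula:

```python
import cv2
import numpy as np

def rewarded_vertical_motion(prev_gray, next_gray, motion_threshold=0.5):
    """Mean vertical flow weighted by the fraction of the frame that is moving."""
    # Farneback parameters: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    vertical = flow[..., 1]                          # per-pixel vertical displacement
    moving = np.abs(vertical) > motion_threshold     # pixels with noticeable motion
    coverage = moving.mean()                         # fraction of the frame in motion
    mean_vertical = vertical[moving].mean() if moving.any() else 0.0
    return mean_vertical * coverage                  # reward motion covering more area
```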
Live webcam-based demo that visualizes optical flow in real time. Helpful for testing and tuning parameters.
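
A minimal sketch of a live webcam demo in the same spirit, using the standard OpenCV visualization that maps flow direction to hue and magnitude to brightness; window and key handling are illustrative:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                    # default webcam
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
hsv = np.zeros_like(frame)
hsv[..., 1] = 255                            # full saturation

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2                               # direction -> hue
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)   # speed -> brightness
    cv2.imshow("optical flow", cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR))
    prev_gray = gray
    if cv2.waitKey(1) & 0xFF == ord("q"):                             # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```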
Developed by Sunwoo Baek, Supia Park, Ashley Li, Enya Chen




