This script identifies the start and end of a round in boxing video footage.
- Video Processing: Loads a video and processes it frame by frame.
- Action Recognition: Uses the R(2+1)D-18 model, a ResNet-based video model that factorizes 3D convolutions into spatial (2D) and temporal (1D) parts, pretrained on the Kinetics-400 dataset.
- Frame Transformation: Applies image transformations (resize, center crop, and normalization) using the albumentations library.
- Output Video: Generates an output video in which each frame is labeled with the detected action.
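
The frame-by-frame processing and action-recognition steps listed above can be connected with a sliding window, because video models such as R(2+1)D-18 take a short stack of consecutive frames rather than a single image. The following is a minimal sketch, not the script's actual code: the function name `sliding_clips` and the clip length of 16 frames are assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")


def sliding_clips(frames: Iterable[Frame], clip_len: int = 16) -> Iterator[List[Frame]]:
    """Yield the most recent `clip_len` frames each time the buffer is full.

    `clip_len=16` is an assumed value, not taken from the script.
    """
    if clip_len < 1:
        raise ValueError("clip_len must be at least 1")
    buffer: deque = deque(maxlen=clip_len)  # the oldest frame drops out automatically
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == clip_len:
            yield list(buffer)  # copy so later appends do not alter the clip
```

In a real pipeline each element would presumably be a frame array returned by `cv2.VideoCapture.read()`, and each yielded clip would be transformed and passed to the model to obtain one action label.
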
- Python 3.x
- torch
- torchvision
- cv2 (OpenCV)
- numpy
- tqdm
- albumentations
- Setup:
pip install torch torchvision opencv-python-headless numpy tqdm albumentations
- input/: This directory should contain the video files you want to process.
- outputs/: The processed videos will be saved in this directory with the action label superimposed on the frames.
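
The input/ and outputs/ layout above implies a simple mapping from an input file to its processed counterpart. A minimal sketch of that mapping, assuming the output keeps the input's file name; the helper `output_path_for` is hypothetical and may not match how the script actually names its files:

```python
from pathlib import Path


def output_path_for(input_video: str, output_dir: str = "outputs") -> Path:
    """Return the path inside `output_dir` that reuses the input file name.

    Assumption: the processed video keeps the same file name as the input.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create outputs/ if it is missing
    return out_dir / Path(input_video).name
```

For example, `output_path_for("input/video.mp4")` would resolve to `outputs/video.mp4` under this assumption.
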
- Run the Script:
- By default, the script processes the video "input/video.mp4".
python action_recognition.py
- Output:
- The processed video will be saved in the "outputs" directory with the action label superimposed on the frames.
- To process a different video, change the input_video variable in the main function to the desired video path.
- Enhance the script to provide timestamps indicating intervals where no boxing activity is detected for a specified duration.
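
One possible way to approach this enhancement is to scan the per-frame action labels for runs that contain no boxing label and convert the run boundaries to seconds. This is only a sketch under stated assumptions: the function `inactive_intervals`, its parameters, and the label strings used in the example ("boxing", "idle") are hypothetical and do not come from the script or from the model's class list.

```python
from typing import List, Optional, Sequence, Set, Tuple


def inactive_intervals(
    labels: Sequence[str],
    fps: float,
    boxing_labels: Set[str],
    min_duration: float,
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans without any boxing label
    that last at least `min_duration` seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    intervals: List[Tuple[float, float]] = []
    run_start: Optional[int] = None  # first frame index of the current non-boxing run
    # A trailing None sentinel closes a run that reaches the end of the video.
    for i, label in enumerate(list(labels) + [None]):
        is_boxing = label is None or label in boxing_labels
        if not is_boxing and run_start is None:
            run_start = i
        elif is_boxing and run_start is not None:
            start, end = run_start / fps, i / fps
            if end - start >= min_duration:
                intervals.append((start, end))
            run_start = None
    return intervals


# Example with made-up labels at 2 frames per second.
example = ["boxing"] * 4 + ["idle"] * 6 + ["boxing"] * 2 + ["idle"] * 1 + ["boxing"] * 2
gaps = inactive_intervals(example, fps=2.0, boxing_labels={"boxing"}, min_duration=2.0)
# gaps == [(2.0, 5.0)]; the single idle frame at 6.0 s is too short to report
```

With a suitably long `min_duration`, the start of a reported interval could serve as an estimate of the end of a round, and its end as an estimate of the start of the next round.
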
