EventLens: Multi-Label Album Event Classification

EventLens is a deep learning project for multi-label classification of photo albums into event categories. The design uses a Swin Transformer backbone for feature extraction, a transformer-based aggregator for sequence modeling, and an MLP for classification. (In practice we have replaced the Swin backbone with ResNet50 and ConvNeXt because of limited resources, especially GPU.) The project targets multi-label classification, where one album can carry several event labels, with accuracy and scalability as design goals.

EventLens model download link: https://huggingface.co/Vantuk/Eventlens_Photo_Album_Event_Recognition/tree/main

Features

Multi-Label Classification: Supports multiple event labels per album.
Transformer-Based Aggregation: Uses a transformer encoder for sequence modeling of image features.
Custom Dataset Handling: Includes a dataset loader for structured photo albums.
Focal Loss: Implements Focal Loss for handling class imbalance.
Mean Average Precision (mAP): Evaluates model performance using mAP.
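
To make the last two features concrete, here is a minimal pure-Python sketch of a binary focal loss and of mAP for multi-label scores. It is an illustration, not the project's code: the function names are hypothetical, the defaults `gamma=2.0` and `alpha=0.25` are common choices rather than the settings used in training, and scoring a class with no positive samples as 0 is a simplification (some evaluators skip such classes instead).

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one (predicted probability, 0/1 target) pair."""
    prob = min(max(prob, eps), 1.0 - eps)          # avoid log(0)
    p_t = prob if target == 1 else 1.0 - prob      # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of easy, well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def average_precision(scores, targets):
    """AP for one class: mean precision at each rank where a positive appears."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if targets[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    # Simplification: a class without positives contributes 0.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_rows, target_rows):
    """mAP: average the per-class AP over all classes (columns)."""
    num_classes = len(score_rows[0])
    aps = [
        average_precision([row[c] for row in score_rows],
                          [row[c] for row in target_rows])
        for c in range(num_classes)
    ]
    return sum(aps) / num_classes
```

Each row of `score_rows` and `target_rows` stands for one album and each column for one event class, so the loss would be summed over every (album, class) pair during training while mAP is computed once per evaluation run.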

Architecture

The model architecture consists of:

Backbone: Swin Transformer (swin_tiny_patch4_window7_224) for feature extraction.
Positional Encoding: Adds positional embeddings for image sequences.
Classification Token: Adds a learnable token to the sequence of image embeddings to represent the whole album.
Aggregator: Transformer encoder for sequence modeling.
Classifier: Fully connected layers with dropout and activation for multi-label classification.
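
The components above can be wired together roughly as follows. This is a hedged PyTorch sketch, not the code in model_arch.py: the class name, layer sizes, `max_images`, and the tiny convolutional stand-in backbone are all assumptions chosen so the example runs quickly on CPU. Placing the classification token at the front of the sequence is also an assumption. Any module that maps a batch of images to one feature vector per image (Swin, ResNet50, ConvNeXt) could take the backbone's place.

```python
import torch
import torch.nn as nn


class AlbumClassifierSketch(nn.Module):
    """Backbone -> [CLS] + positional embeddings -> transformer -> MLP head."""

    def __init__(self, backbone, feat_dim, num_classes, max_images=32,
                 num_layers=2, num_heads=4, dropout=0.1):
        super().__init__()
        self.backbone = backbone  # maps (B*N, 3, H, W) to (B*N, feat_dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, feat_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, max_images + 1, feat_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, dropout=dropout, batch_first=True)
        self.aggregator = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 2),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(feat_dim // 2, num_classes),
        )

    def forward(self, images):
        # images: (batch, n_images, 3, H, W), with n_images <= max_images
        b, n = images.shape[:2]
        feats = self.backbone(images.flatten(0, 1)).reshape(b, n, -1)
        # Put the learnable album token in front of the image features.
        tokens = torch.cat([self.cls_token.expand(b, -1, -1), feats], dim=1)
        tokens = tokens + self.pos_embed[:, : n + 1]
        encoded = self.aggregator(tokens)
        # The album token's output summarizes the album; return raw logits.
        return self.classifier(encoded[:, 0])


# Stand-in backbone (assumption): one conv layer plus global average pooling.
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
model = AlbumClassifierSketch(toy_backbone, feat_dim=64, num_classes=10).eval()
with torch.no_grad():
    logits = model(torch.randn(2, 5, 3, 32, 32))  # 2 albums, 5 images each
probs = torch.sigmoid(logits)  # independent per-label probabilities
```

The head returns logits rather than probabilities, so a sigmoid per label (not a softmax across labels) turns them into multi-label scores.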

Dataset

The dataset is structured as follows:

Images: Stored in folders corresponding to album IDs.
Labels: Defined in a JSON file (event_type.json) with multi-label annotations. The full dataset (for both training and evaluation) is available at: https://www.kaggle.com/datasets/quanho02/thesis-cufed/data (thanks to quanho02 for hosting it as his thesis dataset).
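
As an illustration of this layout, the sketch below indexes album folders and turns their labels into multi-hot target vectors. It is not the loader in dataset.py: the function name is hypothetical, and the code assumes event_type.json maps each album ID to a list of event names, which should be checked against the actual file.

```python
import json
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def load_album_index(image_root, label_file):
    """Return (class_names, albums); each album has image paths and a multi-hot target."""
    with open(label_file, encoding="utf-8") as f:
        labels = json.load(f)  # assumed shape: {album_id: [event_name, ...]}

    # Build a stable class order from every event name that appears.
    class_names = sorted({event for events in labels.values() for event in events})
    class_to_idx = {name: i for i, name in enumerate(class_names)}

    albums = []
    for album_id, events in sorted(labels.items()):
        album_dir = Path(image_root) / album_id
        if not album_dir.is_dir():
            continue  # skip labeled albums that have no image folder
        images = sorted(p for p in album_dir.iterdir()
                        if p.suffix.lower() in IMAGE_SUFFIXES)
        target = [0.0] * len(class_names)
        for event in events:
            target[class_to_idx[event]] = 1.0  # an album may have several labels
        albums.append({"album_id": album_id, "images": images, "target": target})
    return class_names, albums
```

Each returned entry pairs one album's image paths with a target vector that has a 1.0 for every event the album belongs to, which is the form a multi-label loss expects.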

