Influencer Performance System

A resource-intensive project that processes video data, clusters influencers, and generates performance insights. It combines OpenAI's CLIP model for embeddings with face detection and clustering techniques.

Architecture Overview

Input:
- Video URLs provided as raw data.
Output:
- A comprehensive report showcasing influencers’ performance, with clustered data, visualizations, and insights.
- See the full output: Influencer Performance System

Influencer Performance Report

| Influencer Label | Average Performance | Video URL |
| --- | --- | --- |
| 30 | 1.5304 | Video Link |
| 25 | 1.12256666666667 | Video Link |
| 3 | 1.02478604992305 | Video Link |
| 55 | 0.9830456907 | Video Link |
| 45 | 0.917806254725 | Video Link |
| 15 | 0.8273821321 | Video Link |
| 7 | 0.80381331575 | Video Link |
| 22 | 0.7929559845 | Video Link |
| 34 | 0.5907609883 | Video Link |

Process Flow:
- Generate vector embeddings for each video using OpenAI CLIP.
- Cluster videos based on their embeddings to identify unique videos.
- Calculate the average performance score for each unique video.
- Extract human faces from the videos using OpenCV Haar cascades.
- Identify the best-captured face from extracted images.
- Save images to a GitHub raw content repository.
- Match influencer faces across clusters to combine clusters based on face similarity using OpenAI CLIP.
- Calculate the average performance score for each unique influencer.
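
The clustering and scoring steps above can be sketched in a few lines. This is a minimal illustration, not the notebook's actual code: the README does not say which clustering algorithm is used, so the sketch assumes a greedy cosine-similarity threshold. The function names, the 0.9 threshold, and the toy 3-dimensional vectors (stand-ins for real CLIP embeddings) are all hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_by_similarity(embeddings, threshold=0.9):
    """Greedy clustering (an assumed method): put each vector in the first
    cluster whose representative (its first member) is at least `threshold`
    similar, otherwise start a new cluster. Returns one label per vector."""
    representatives = []
    labels = []
    for vec in embeddings:
        for label, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                labels.append(label)
                break
        else:
            # No existing cluster was similar enough.
            representatives.append(vec)
            labels.append(len(representatives) - 1)
    return labels


def average_performance(labels, scores):
    """Mean performance score per cluster label."""
    grouped = defaultdict(list)
    for label, score in zip(labels, scores):
        grouped[label].append(score)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


# Toy data (hypothetical): three near-duplicate videos and one distinct video.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],
    [0.0, 1.0, 0.0],
    [0.98, 0.0, 0.1],
]
scores = [1.0, 2.0, 0.5, 3.0]

labels = cluster_by_similarity(embeddings)
averages = average_performance(labels, scores)
print(labels, averages)
```

The same similarity test could, in principle, be applied to face embeddings when merging clusters that belong to the same influencer, with the per-influencer average computed the same way.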
Visualization:
- Accessible via a Streamlit app or as an HTML file.
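
As a rough idea of how a static HTML report could be produced from the per-influencer averages, here is a minimal sketch using only the standard library. It is not the code behind influencer_report_up.html: `render_report` is a hypothetical helper, the labels and scores are the first three rows of the report table, and the example.com URLs are placeholders.

```python
import html


def render_report(rows):
    """Build a standalone HTML page with one table row per influencer.

    `rows` is an iterable of (label, average_score, video_url) tuples.
    """
    body = "\n".join(
        '<tr><td>{}</td><td>{:.4f}</td><td><a href="{}">Video Link</a></td></tr>'.format(
            html.escape(str(label)), score, html.escape(url, quote=True)
        )
        for label, score, url in rows
    )
    return (
        "<!DOCTYPE html>\n<html><head><meta charset=\"utf-8\">"
        "<title>Influencer Performance Report</title></head><body>\n"
        "<table>\n"
        "<tr><th>Influencer Label</th><th>Average Performance</th><th>Video URL</th></tr>\n"
        + body
        + "\n</table>\n</body></html>\n"
    )


# Scores from the report table; URLs are placeholders, not real video links.
rows = [
    (30, 1.5304, "https://example.com/video-30"),
    (25, 1.12256666666667, "https://example.com/video-25"),
    (3, 1.02478604992305, "https://example.com/video-3"),
]

page = render_report(rows)
print(page)
```

Writing `page` to a file, for example with `pathlib.Path("report.html").write_text(page, encoding="utf-8")`, gives a file that opens in any browser; the file name here is only an example.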

System Used

Machine Specifications:

CPU: 4 vCPUs (Intel Xeon Scalable, 3.5 GHz, Sapphire Rapids)
Memory: 16 GiB (4 GiB per vCPU)
Architecture: x86_64
Environment: Lightning AI or an equivalent system

Dependencies:

All dependencies are listed in the requirements.txt file.

Setup Guide

Clone the Repository

git clone https://github.com/Ansumanbhujabal/Influencer_Performance_System.git
cd Influencer_Performance_System

Create and Activate Virtual Environment

# Create a virtual environment
python3 -m venv venv

# Activate the environment
source venv/bin/activate    # For Linux/Mac
venv\Scripts\activate       # For Windows

Install Dependencies

Install all required packages:

pip install -r requirements.txt

Manually install additional dependencies:

pip install ftfy regex tqdm --quiet
pip install git+https://github.com/openai/CLIP.git --quiet
pip install matplotlib --quiet
pip install opencv-python-headless --quiet
pip install torch torchvision torchaudio --quiet
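
After installation, a quick check that the packages are importable can save a failed notebook run. The snippet below is a small sketch, not part of the repository: `missing_modules` is a hypothetical helper, and the list uses the import names of the packages installed above (the CLIP package is imported as `clip`, and opencv-python-headless as `cv2`).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names for the packages installed with pip above.
REQUIRED = [
    "clip",
    "cv2",
    "ftfy",
    "matplotlib",
    "regex",
    "torch",
    "torchaudio",
    "torchvision",
    "tqdm",
]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```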

Run the Jupyter Notebook

Navigate to the notebooks directory:
```
cd notebooks
```
Start Jupyter Notebook:
```
jupyter notebook
```
Open and run Data_Processor_and_Insights.ipynb to process the data and generate insights.

Visualization and Report Access

Online Access:

The performance report can be visualized remotely:

Influencer Performance System

Local Access:

Open the HTML file in any browser:
```
influencer_report_up.html
```
Run the Streamlit app:
```
streamlit run app.py
```

For the underlying numbers, refer to the spreadsheet in the output directory:

cd output
Final_Influencer_Data_insights_up_dec1_t2.xlsx

The generated HTML report (influencer_report_up.html) can be opened in any browser.

Future Scope

- Real-time Face Matching: Optimize the workflow by introducing real-time face extraction and matching, eliminating redundant processes to save resources and reduce execution time.
- Enhanced Clustering: Improve clustering mechanisms for better influencer detection and performance accuracy.
- Scalability: Adapt the system to process larger datasets more efficiently.

Challenges Faced

- Resource-Intensive Processing: Managing vector generation, face matching, and clustering on non-GPU systems.
- Data Cleaning: Ensuring unique identification of videos and influencers from raw, unstructured data.
- Cluster Matching: Efficiently combining clusters to avoid duplicates while maintaining accuracy.

Made Over a Weekend

Created with passion and dedication in a short span to showcase influencer performance analytics.

Author: Ansuman Bhujabal
GitHub Repository: Influencer Performance System (https://github.com/Ansumanbhujabal/Influencer_Performance_System)
