This project analyzes and predicts trends in YouTube videos using data science and machine learning techniques. Using real-world data on India's trending videos from 2017 and 2018, it aims to understand what makes a video go viral and whether a video's performance can be predicted accurately.
In this project, I performed:
- Exploratory Data Analysis (EDA) to uncover trends and patterns in the dataset.
- Feature Engineering to extract meaningful insights from video metrics.
- Predictive Modeling using:
  - Linear Regression to predict the view count.
  - XGBoost Classifier to determine if a video will trend or not.

Source: Kaggle – YouTube Trending Video Dataset
Focus: Indian region (INvideos.csv)
Features include: title, channel_title, category_id, views, likes, dislikes, comment_count, publish_time, etc.
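
As a rough illustration of feature engineering on these columns, here is a minimal sketch. The function name `add_engagement_features` and the derived columns (`publish_hour`, `publish_weekday`, `like_ratio`, `comment_ratio`) are assumptions chosen for illustration, not necessarily the features used in this project. The two-row frame is made-up sample data, not rows from the dataset.

```python
import pandas as pd


def add_engagement_features(df):
    """Return a copy of df with a few derived columns (hypothetical choices)."""
    out = df.copy()

    # Parse the publish timestamp and pull out simple calendar features.
    out["publish_time"] = pd.to_datetime(out["publish_time"], utc=True)
    out["publish_hour"] = out["publish_time"].dt.hour
    out["publish_weekday"] = out["publish_time"].dt.day_name()

    # Engagement ratios; rows with zero views become NaN instead of dividing by zero.
    views = out["views"].where(out["views"] > 0)
    out["like_ratio"] = out["likes"] / views
    out["comment_ratio"] = out["comment_count"] / views
    return out


# Made-up sample rows (NOT real data). With the real file, one would instead
# start from something like: df = pd.read_csv("INvideos.csv")
sample = pd.DataFrame(
    {
        "views": [1000, 0],
        "likes": [100, 0],
        "dislikes": [5, 0],
        "comment_count": [20, 0],
        "publish_time": ["2017-11-12T12:20:39.000Z", "2018-01-05T04:00:00.000Z"],
    }
)
features = add_engagement_features(sample)
```

Ratios are used here because raw like and comment counts scale with views, so a ratio is often easier to compare across videos of very different sizes.
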
- Python
- Pandas, NumPy
- Matplotlib, Seaborn (Visualization)
- Scikit-learn (ML models & preprocessing)
- XGBoost (Boosting algorithm for classification)
- Understand engagement metrics and their influence on video popularity.
- Predict future view counts with Linear Regression.
- Classify whether a video will trend using XGBoost.
- Gain deeper insights into YouTube's trending dynamics.
- Build a web application that lets users enter video details or a link and get predictions instantly.
- Use the YouTube Data API to fetch video stats automatically from a URL.
- Improve visualizations and interactivity with dashboards.
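
For the planned URL-based lookup, a first step might look like the sketch below. It only extracts a video ID from common URL shapes and builds a request URL for the YouTube Data API v3 `videos` endpoint; it does not send the request. The function names `extract_video_id` and `build_stats_request` are hypothetical, the set of URL patterns handled is an assumption, and `"YOUR_API_KEY"` is a placeholder.

```python
from urllib.parse import parse_qs, urlencode, urlparse

API_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"
YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in YOUTUBE_HOSTS:
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Embed and Shorts links: /embed/<id>, /shorts/<id>
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    return None


def build_stats_request(video_id, api_key):
    """Build (but do not send) a videos.list request URL for one video."""
    query = urlencode(
        {"part": "snippet,statistics", "id": video_id, "key": api_key}
    )
    return f"{API_ENDPOINT}?{query}"


video_id = extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
request_url = build_stats_request(video_id, "YOUR_API_KEY")  # placeholder key
```

Sending the request and mapping the response fields onto the model's input columns is left out here. One thing to keep in mind is that public dislike counts are no longer exposed by the API, so a model that relies on `dislikes` would need to handle that input being missing.
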