| 1 | + Movie Recommender System (Optimized SVM + PCA): |
| 2 | + |
| 3 | +This project implements a content-based movie recommendation system using: |
| 4 | +- PCA (Principal Component Analysis) via Singular Value Decomposition (SVD) for dimensionality reduction |
| 5 | +- SVM (Support Vector Machine) using a custom fast vectorized implementation |
| 6 | + |
| 7 | +Everything is implemented from scratch with NumPy and pandas; scikit-learn is not required. |
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | + Dataset : |
| 12 | + |
| 13 | +The input dataset is a file named `cleaned_data.csv`, which includes: |
| 14 | +- `title`: Title of the movie |
| 15 | +- `imdb_id`: Movie ID (not used) |
| 16 | +- Other numerical columns representing metadata (genres, language, keywords, etc.) |
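
As a rough illustration of the layout described above, the sketch below loads a CSV shaped like `cleaned_data.csv` and separates the titles from the numeric feature matrix. The helper name `load_dataset` and the demo column names (`genre_action`, `genre_drama`, `lang_en`) are hypothetical; only `title` and `imdb_id` come from the description.

```python
import io

import pandas as pd


def load_dataset(csv_source):
    """Return (titles, X) from a CSV laid out like cleaned_data.csv."""
    df = pd.read_csv(csv_source)
    titles = df["title"].astype(str).to_numpy()
    # imdb_id is documented as unused, so it is dropped along with the title.
    X = df.drop(columns=["title", "imdb_id"]).to_numpy(dtype=float)
    return titles, X


# Tiny in-memory stand-in for the real file; the feature columns are made up.
demo_csv = io.StringIO(
    "title,imdb_id,genre_action,genre_drama,lang_en\n"
    "Movie A,tt0000001,1,0,1\n"
    "Movie B,tt0000002,0,1,1\n"
    "Movie C,tt0000003,1,1,0\n"
)
titles, X = load_dataset(demo_csv)
print(titles.tolist(), X.shape)
```

With the real data, the call would presumably be `load_dataset("cleaned_data.csv")`.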
| 17 | + |
| 18 | + |
| 19 | +How It Works : |
| 20 | + |
| 21 | +Functions Overview : |
| 22 | + |
| 23 | +| Function Name | Role | |
| 24 | +|---------------------------|----------------------------------------------------------------------| |
| 25 | +| `pca_svd(X, n_components)` | Reduces the number of dimensions in `X` to `n_components` using SVD |
| 26 | +| `recommend_svm_optimized(...)` | Trains a linear SVM from scratch to identify similar movies |
| 27 | +| main block | Accepts user input, runs PCA, trains the SVM, and outputs the top results |
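
To make the PCA row concrete, here is a minimal sketch of what a function with the `pca_svd(X, n_components)` signature could look like. The body is an assumption based on the standard PCA-via-SVD recipe, not the project's actual code, and the random demo matrix is illustrative only.

```python
import numpy as np


def pca_svd(X, n_components):
    """Project X onto its top principal components using SVD (sketch)."""
    # PCA assumes centered data; centering again is harmless if X is
    # already standardized.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 8))   # 50 samples, 8 made-up features
X_reduced = pca_svd(X_demo, 3)
print(X_reduced.shape)
```

The projected columns are mutually uncorrelated and ordered by decreasing variance, which is why keeping only the leading ones preserves most of the spread in the data.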
| 28 | + |
| 29 | +--- |
| 30 | + |
| 31 | +Execution Flow : |
| 32 | + |
| 33 | +1. Load Data: |
| 34 | + - Reads `cleaned_data.csv` |
| 35 | + - Extracts features and movie titles |
| 36 | + |
| 37 | +2. Standardize Data: |
| 38 | + - Normalizes each feature to have zero mean and unit variance |
| 39 | + |
| 40 | +3. Apply PCA: |
| 41 | + - Uses SVD to reduce features to the top `100` principal components |
| 42 | + |
| 43 | +4. Train Linear SVM (One-vs-All): |
| 44 | + - Assigns label `1` to input movie, `0` to all others |
| 45 | + - Trains linear SVM using hinge loss and gradient descent |
| 46 | + |
| 47 | +5. Score & Recommend: |
| 48 | + - Calculates similarity scores for all movies |
| 49 | + - Recommends top `10` similar titles, excluding the input itself |
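
The standardize, train, and score steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's `recommend_svm_optimized`: the function names are hypothetical, the labels are `+1`/`-1` (the usual convention for hinge loss) rather than `1`/`0`, and the single positive example is up-weighted so the many negatives do not swamp it. The PCA step is omitted to keep the example short.

```python
import numpy as np


def standardize(X):
    """Scale every column to zero mean and unit variance."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns stay at zero instead of dividing by 0
    return (X - X.mean(axis=0)) / std


def train_linear_svm(X, y, lr=0.01, reg=0.01, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss.

    y must contain +1 / -1 labels (assumption, see the note above).
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    # Assumption: weight positives so they balance the negatives.
    n_pos, n_neg = (y > 0).sum(), (y < 0).sum()
    weights = np.where(y > 0, n_neg / max(n_pos, 1), 1.0)
    for _ in range(epochs):
        active = y * (X @ w + b) < 1       # samples inside or beyond the margin
        coef = (weights * y)[active]
        w -= lr * (reg * w - coef @ X[active] / n)
        b -= lr * (-coef.sum() / n)
    return w, b


def recommend(titles, X, query_title, top_k=10):
    """Rank all other movies by the decision score of a one-vs-all SVM."""
    matches = np.flatnonzero(titles == query_title)
    if matches.size == 0:
        raise ValueError(f"unknown title: {query_title!r}")
    idx = matches[0]
    y = -np.ones(len(titles))
    y[idx] = 1.0
    w, b = train_linear_svm(X, y)
    scores = X @ w + b
    ranked = [i for i in np.argsort(-scores) if i != idx][:top_k]
    return [str(titles[i]) for i in ranked]


# Two synthetic clusters of six "movies" each; titles and features are made up.
rng = np.random.default_rng(0)
centers = np.array([[3.0, 3.0, 0.0, 0.0], [0.0, 0.0, 3.0, 3.0]])
X_demo = np.vstack([c + 0.1 * rng.normal(size=(6, 4)) for c in centers])
titles_demo = np.array([f"A{i}" for i in range(6)] + [f"B{i}" for i in range(6)])
recs = recommend(titles_demo, standardize(X_demo), "A0", top_k=3)
print(recs)
```

In the full pipeline, the standardized matrix would pass through PCA before being handed to the SVM; the scoring and exclusion of the query title work the same way.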
| 50 | + |
| 51 | +--- |
| 52 | + |
| 53 | + |
| 54 | +Requirements : |
| 55 | + |
| 56 | +Only basic libraries: |
| 57 | +```bash |
| 58 | +pip install numpy pandas |
| 59 | +``` |