We use the 1M version of the MovieLens dataset. It contains around 1 million ratings from about 6,000 users on about 4,000 movies, along with some user features and movie genres. Each rating also carries a timestamp, which makes it possible to build a sequence of movie ratings for each user, the input format the BST model expects.
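As a rough illustration of how timestamps turn individual ratings into per-user sequences, here is a minimal sketch in plain Python. The toy rows, the function names `build_user_sequences` and `sliding_windows`, and the window length are assumptions made for illustration; they are not taken from the project notebook.

```python
from collections import defaultdict

# Hypothetical toy ratings: (user_id, movie_id, rating, unix_timestamp).
# These values are made up; only the four-field layout mirrors the dataset.
ratings = [
    (1, 10, 5, 300),
    (2, 20, 3, 150),
    (1, 11, 4, 100),
    (1, 12, 2, 200),
    (2, 21, 4, 250),
]


def build_user_sequences(rows):
    """Group ratings by user, ordering each user's ratings by timestamp."""
    by_user = defaultdict(list)
    # Sorting once by timestamp keeps every per-user list chronological.
    for user_id, movie_id, rating, _ts in sorted(rows, key=lambda r: r[3]):
        by_user[user_id].append((movie_id, rating))
    return dict(by_user)


def sliding_windows(seq, length, step=1):
    """Cut a sequence into fixed-length windows; shorter sequences yield none."""
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]


sequences = build_user_sequences(ratings)
# Window length 2 is an arbitrary choice for this toy example.
windows = {user: sliding_windows(seq, length=2) for user, seq in sequences.items()}
```

Fixed-length windows are one common way to feed variable-length histories to a sequence model; the window length and step would need tuning for the real data.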

DhanashriPetkar/Dy.Tech

 
 

Team - Dy.Tech

Team Members

| Name | Class | Roll No. |
| --- | --- | --- |
| Amar Parab | TY A | 39 |
| Simeen Pathan | TY A | 35 |
| Dhanashri Petkar | TY A | 02 |
| Sania Alam | TY A | 17 |

College - D.Y.Patil College of Engineering & Technology

## PROJECT TITLE

### movielens_recommendations_transformers


## USE OF REAL DATASET

### Project Structure

  • Uploading and Reading the Dataset: Upload the dialogue transcript and read it into a pandas DataFrame.
  • Preprocessing the Data: Clean and preprocess the data for training, including encoding emotions and normalizing VAD (Valence, Arousal, Dominance) scores.
  • Tokenization: Tokenize the input text using a BERT tokenizer.
  • Custom Dataset Class: Create a custom dataset class to handle input encodings and labels.
  • Model Training: Train a BERT model for sequence classification using the prepared dataset.
  • Model Evaluation: Evaluate the trained model on the dataset.
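The preprocessing step in the list above (encoding emotions and normalizing VAD scores) could look roughly like the sketch below. It is written in plain Python so it runs without extra dependencies. The field names (`emotion`, `valence`, `arousal`, `dominance`), the sample rows, and the choice of min-max scaling are assumptions, since the list does not specify them.

```python
# Hypothetical transcript rows; the field names and values are illustrative only.
rows = [
    {"text": "That is wonderful news!", "emotion": "joy",
     "valence": 4.0, "arousal": 3.0, "dominance": 2.0},
    {"text": "Leave me alone.", "emotion": "anger",
     "valence": 1.0, "arousal": 5.0, "dominance": 2.0},
    {"text": "Thanks, that helps.", "emotion": "joy",
     "valence": 2.5, "arousal": 1.0, "dominance": 2.0},
]


def encode_labels(values):
    """Map each distinct label to an integer id (sorted for reproducibility)."""
    label_to_id = {label: i for i, label in enumerate(sorted(set(values)))}
    return [label_to_id[v] for v in values], label_to_id


def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


labels, label_to_id = encode_labels([r["emotion"] for r in rows])
vad = {
    key: min_max_normalize([r[key] for r in rows])
    for key in ("valence", "arousal", "dominance")
}
```

The integer labels would serve as classification targets, and keeping `label_to_id` around lets predictions be mapped back to emotion names at evaluation time.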
