Skip to content

Daphne210/Data-Mining-Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 

Repository files navigation

Video presentation for the project: https://youtu.be/NN9GYnHz-g4

We opted to use the 2023 Amazon Reviews Dataset, but since it was greater than 150GB in size, we chose to merge 14 categories, whose zipped files were less than 1GB, or whose total size is less than 3GB. We merged samples from 14 categories, resulting in a dataset of 30,594,036 records (~15GB). The merged categories include:

  • All_Beauty: 701,528 records
  • Appliances: 2,128,605 records
  • Health_and_Personal_Care: 494,121 records
  • Industrial_and_Scientific: 5,183,005 records
  • Amazon_Fashion: 2,500,939 records
  • Baby_Products: 6,028,884 records
  • Subscription_Boxes: 16,216 records
  • Handmade_Products: 664,162 records
  • Digital_Music: 130,434 records
  • Software: 4,880,181 records
  • Musical_Instruments: 3,017,439 records
  • Magazine_Subscriptions: 71,497 records
  • Video_Games: 4,624,615 records
  • Gift_Cards: 152,410 records

About

This is a data mining project on the 2023 Amazon Reviews Dataset and the Traveling Salesman Problem.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors