Python-based data mining, using tools such as numpy and pandas to learn the basics of the field. Topics include, but are not limited to: Preprocessing, Classification, Artificial Neural Networks, K-Nearest Neighbor, and Regression.

mattishii26/DataMiningRepo

Data mining with Python!

Libraries

  • pandas
  • numpy
  • sklearn
  • scipy
  • matplotlib

Projects

Project 1

Objective: Preprocess data by removing noise, null values, outliers, and duplicates. Graph the cleaned data and split it into training and test datasets.
Learned: You must account for varied user input (e.g., United States may appear as US, USA, united states, US of A, etc.), and you cannot simply drop every row that contains a null value: the null attribute may be unimportant while the rest of the row's attributes still matter.
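The cleaning steps above can be sketched with pandas. This is a minimal illustration and not the project's actual code: the `raw` table, the `COUNTRY_ALIASES` mapping, the choice of `age` as the only required column, and the 0 to 120 plausibility range are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: inconsistent country spellings, missing values,
# one exact duplicate row, and one implausible age.
raw = pd.DataFrame({
    "country": ["US", "usa", "United States", "US of A", "Canada", "Canada", "US"],
    "age": [34, 29, np.nan, 41, 38, 38, 950],
    "income": [52000, 48000, 61000, np.nan, 57000, 57000, 50000],
})

# Assumed alias table: lower-cased spelling variants mapped to one canonical label.
COUNTRY_ALIASES = {
    "us": "United States",
    "usa": "United States",
    "united states": "United States",
    "us of a": "United States",
}

def clean(df):
    out = df.copy()
    # Normalize free-text country entries; unknown values are kept unchanged.
    key = out["country"].str.strip().str.lower()
    out["country"] = key.map(COUNTRY_ALIASES).fillna(out["country"])
    # Remove exact duplicate rows.
    out = out.drop_duplicates()
    # Drop rows only when a required attribute (here: age) is missing;
    # rows with a null in a less important column (income) are kept.
    out = out.dropna(subset=["age"])
    # Treat ages outside an assumed plausible range as outliers.
    out = out[out["age"].between(0, 120)]
    return out.reset_index(drop=True)

cleaned = clean(raw)

# Simple random split into training and test sets.
train = cleaned.sample(frac=0.75, random_state=0)
test = cleaned.drop(train.index)
```

Restricting `dropna` to a subset of columns is one way to act on the lesson above: a row is kept as long as the attributes that matter are present.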

Project 2

Objective: Preprocess data and apply either Simple or Multiple Linear Regression. Normalize the data and classify the data set.
Learned: You need to know your data to get meaningful output.
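As a rough sketch of the regression step, the example below min-max normalizes two features and fits both a simple (one-feature) and a multiple (two-feature) linear regression by least squares with numpy. The synthetic data, the helper names `min_max`, `fit_linear`, and `predict`, and the choice of min-max scaling are assumptions for illustration; the project may have used a different normalization or sklearn's estimators.

```python
import numpy as np

# Synthetic, noise-free data (an assumption): y depends on both features.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(50, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0

def min_max(a):
    """Rescale each column to the [0, 1] range."""
    lo, hi = a.min(axis=0), a.max(axis=0)
    return (a - lo) / (hi - lo)

def fit_linear(features, target):
    """Least-squares fit; returns [intercept, slope_1, ..., slope_k]."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict(coef, features):
    return coef[0] + features @ coef[1:]

Xn = min_max(X)

simple = fit_linear(Xn[:, :1], y)   # simple regression: first feature only
multiple = fit_linear(Xn, y)        # multiple regression: both features

# Mean squared error of each model on the data it was fitted to.
mse_simple = np.mean((predict(simple, Xn[:, :1]) - y) ** 2)
mse_multiple = np.mean((predict(multiple, Xn) - y) ** 2)
```

Comparing `mse_simple` with `mse_multiple` shows why knowing the data matters: leaving out a feature that drives the target leaves error that no amount of tuning on the remaining feature removes.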

Project 3

Objective: Implement Classification Models (Decision Trees, Support Vector Machine, K-Nearest Neighbor, Naive Bayes, Logistic Regression, Artificial Neural Networks)
Learned: Feature importance analysis is essential to applying these classification models well.
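One way to look at feature importance before fitting other classifiers is to read `feature_importances_` from a fitted decision tree. The sketch below uses synthetic data in which only the first feature determines the label, then compares K-Nearest Neighbor trained on all features against K-Nearest Neighbor trained on the top-ranked feature alone. The data and variable names are assumptions, not the project's own.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (an assumption): the label depends only on feature 0;
# features 1 and 2 are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = (X[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Rank features with a decision tree's impurity-based importances.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
importances = tree.feature_importances_
top = int(np.argmax(importances))

# K-Nearest Neighbor on every feature versus only the top-ranked one.
knn_all = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
knn_top = KNeighborsClassifier(n_neighbors=5).fit(X_train[:, [top]], y_train)

acc_all = knn_all.score(X_test, y_test)
acc_top = knn_top.score(X_test[:, [top]], y_test)
```

Distance-based models such as K-Nearest Neighbor treat every feature equally, so uninformative features can dilute the signal; printing `acc_all` and `acc_top` lets you check that effect on your own data.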

Project 4

Objective: Implement Clustering, Text mining and Artificial Neural Networks
Learned: Clustering is very sensitive to its input; changes to the input can dramatically change the resulting clusters.
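Feature scale is one example of this sensitivity. The sketch below runs k-means on the same synthetic points twice, once raw and once standardized, and scores each clustering against the known groups with the adjusted Rand index. The data, the choice of k-means, and the standardization step are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

# Synthetic data (an assumption): two groups separated along feature 0,
# plus a large-scale noise feature that carries no group information.
rng = np.random.default_rng(0)
labels_true = np.repeat([0, 1], 100)
signal = rng.normal(loc=labels_true.astype(float), scale=0.1)
noise = rng.normal(loc=0.0, scale=100.0, size=200)
X = np.column_stack([signal, noise])

def cluster(data):
    """Run 2-cluster k-means and return the cluster label of each point."""
    model = KMeans(n_clusters=2, n_init=10, random_state=0)
    return model.fit_predict(data)

# Same points, two input representations.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

ari_raw = adjusted_rand_score(labels_true, cluster(X))
ari_scaled = adjusted_rand_score(labels_true, cluster(X_scaled))
```

On the raw data the large-scale noise feature dominates the Euclidean distances that k-means minimizes, so the two runs can disagree sharply even though the underlying points are identical.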
