
This project focuses on predicting house prices using machine learning techniques. The dataset consists of more than 1,000,000 rows and 12 columns containing information about various house attributes. The goal is to build predictive models to estimate house prices based on these attributes.


dharmendradiwaker/Forecasting-House-Prices-Using-Machine-Learning


Forecasting House Prices Using Machine Learning

Overview

This project focuses on predicting house prices using machine learning techniques. The dataset consists of more than 1,000,000 rows and 12 columns containing information about various house attributes. The goal is to build predictive models that estimate house prices from these attributes. The project explores three scikit-learn models: Linear Regression, Decision Trees, and Random Forests.

Key Highlights

  • Data Cleaning & Analysis: Processed and analyzed a large dataset with over 1,000,000 rows to prepare it for modeling.
  • Models Trained:
    • Linear Regression: Simple but effective for baseline prediction.
    • Decision Tree: Captures non-linear relationships between features and target.
    • Random Forests: Ensemble method that improved accuracy by combining multiple decision trees.
  • Model Performance: Achieved a Root Mean Square Error (RMSE) of 866,152 after hyperparameter tuning.
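For readers unfamiliar with the metric, RMSE is the square root of the mean squared difference between actual and predicted prices, so it is expressed in the same units as the price. The snippet below is a minimal sketch of that calculation with NumPy; the `rmse` helper and the sample prices are illustrative assumptions, not code or data from this repository.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up prices, purely for illustration (not taken from the dataset).
actual = [250_000, 400_000, 320_000]
predicted = [260_000, 380_000, 320_000]
print(rmse(actual, predicted))
```

Because large errors are squared before averaging, a few badly mispredicted expensive properties can dominate the score.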

Dataset

The dataset contains more than 1,000,000 rows and 12 columns, including features like:

  • Price of the house (target variable)
  • Date of Transfer
  • Property Type 🏠 (e.g., detached, semi-detached, terraced, flat)
  • Old/New 🏑 (indicates whether the property is newly built or existing)
  • Duration (e.g., freehold or leasehold)
  • Town/City
  • District
  • County 🌍
  • PPDCategory Type (indicates if the property was a full or partial sale)
  • Record Status (applicable to monthly file updates)
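The columns listed above could be loaded with pandas along the following lines. This is only a sketch: the snake_case column names, the assumption that the file has no header row, and the two sample rows are hypothetical, so adjust them to match the actual file in the repository.

```python
import io

import pandas as pd

# Hypothetical column names; rename to match the real file's header.
COLUMNS = [
    "price", "date_of_transfer", "property_type", "old_new", "duration",
    "town_city", "district", "county", "ppd_category_type", "record_status",
]
CATEGORICAL = [c for c in COLUMNS if c not in ("price", "date_of_transfer")]

# Two made-up rows standing in for the real CSV file.
SAMPLE_CSV = """\
250000,2015-03-14,D,N,F,LEEDS,LEEDS,WEST YORKSHIRE,A,A
180000,2016-07-01,F,Y,L,LONDON,CAMDEN,GREATER LONDON,A,A
"""


def load_prices(source):
    """Read the price data, parse the transfer date, and mark categoricals."""
    df = pd.read_csv(source, names=COLUMNS, header=None,
                     parse_dates=["date_of_transfer"])
    # The category dtype reduces memory use on a file with 1,000,000+ rows.
    return df.astype({col: "category" for col in CATEGORICAL})


df = load_prices(io.StringIO(SAMPLE_CSV))
print(df.dtypes)
```

In practice, `load_prices` would be given the path to the real CSV file instead of the in-memory sample.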

Requirements

  • Python (version 3.6 or higher recommended)
  • Required Python libraries:
    • scikit-learn (for machine learning algorithms)
    • Pandas (for data manipulation)
    • NumPy (for numerical operations)
    • Matplotlib & Seaborn (for data visualization)

You can install these dependencies using pip:

pip install scikit-learn pandas numpy matplotlib seaborn

Workflow

  1. Data Cleaning: Missing values were handled, and categorical variables were encoded.
  2. Feature Engineering: Created new features or transformed existing ones to improve model accuracy.
  3. Model Training: Trained multiple machine learning models, including:
    • Linear Regression (baseline)
    • Decision Tree Regressor
    • Random Forest Regressor
  4. Model Evaluation: Evaluated model performance using Root Mean Square Error (RMSE).
  5. Hyperparameter Tuning: Used grid search to tune the hyperparameters and improve model performance.
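The workflow above can be sketched end to end with scikit-learn. The example below is an illustration under stated assumptions, not the notebook's actual code: it uses a small synthetic dataset, hypothetical column names (`property_type`, `duration`, `year`), one-hot encoding for the categorical variables, and an arbitrary small parameter grid. Missing-value handling is omitted because the synthetic data has none.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real data (all values are made up).
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "property_type": rng.choice(["D", "S", "T", "F"], size=n),
    "duration": rng.choice(["F", "L"], size=n),
    "year": rng.integers(2000, 2020, size=n),
})
base = X["property_type"].map(
    {"D": 400_000, "S": 300_000, "T": 250_000, "F": 200_000})
y = base + 5_000 * (X["year"] - 2000) + rng.normal(0, 20_000, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)


def make_pipeline(model):
    """One-hot encode the categorical columns, then fit the given model."""
    pre = ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"),
          ["property_type", "duration"])],
        remainder="passthrough",  # numeric columns pass through unchanged
    )
    return Pipeline([("pre", pre), ("model", model)])


models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(random_state=0),
}

# Train each model and record its RMSE on the held-out test set.
results = {}
for name, model in models.items():
    pipe = make_pipeline(model).fit(X_train, y_train)
    pred = pipe.predict(X_test)
    results[name] = float(np.sqrt(mean_squared_error(y_test, pred)))

# Grid search over a small, arbitrary grid for the random forest.
grid = GridSearchCV(
    make_pipeline(RandomForestRegressor(random_state=0)),
    param_grid={"model__n_estimators": [25, 50],
                "model__max_depth": [4, None]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
grid.fit(X_train, y_train)

print(results)
print(grid.best_params_, -grid.best_score_)
```

Wrapping the encoder and the model in one `Pipeline` means the encoding is refit inside each cross-validation fold, which avoids leaking information from the validation split into preprocessing.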

Setup

  1. Clone this repository to your local machine:

    git clone https://github.com/dharmendradiwaker/Forecasting-House-Prices-Using-Machine-Learning.git
  2. Navigate to the project folder:

    cd Forecasting-House-Prices-Using-Machine-Learning
  3. Install the required libraries:

    pip install -r requirements.txt
  4. Run the Jupyter Notebook or Python script:

    jupyter notebook House_Price_Prediction.ipynb

How to Use

  1. Load the house price dataset from the provided data/ folder.
  2. Follow the steps in the notebook or script to clean, preprocess, and train the models.
  3. Explore the performance metrics of each model and see the final predictions.
  4. You can adjust hyperparameters or add new features to improve the model's accuracy.
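As one example of adding new features, the transfer date can be split into calendar parts that tree-based models can use directly. The sketch below assumes a hypothetical `date_of_transfer` column name and made-up sample rows; the `add_date_features` helper is illustrative and is not part of the repository.

```python
import pandas as pd


def add_date_features(df, date_col="date_of_transfer"):
    """Return a copy of df with year, month, and quarter columns
    derived from date_col, and with the raw date column dropped."""
    out = df.copy()
    dates = pd.to_datetime(out[date_col])
    out["sale_year"] = dates.dt.year
    out["sale_month"] = dates.dt.month
    out["sale_quarter"] = dates.dt.quarter
    return out.drop(columns=[date_col])


# Made-up rows for illustration only.
sample = pd.DataFrame({
    "date_of_transfer": ["2015-03-14", "2016-07-01"],
    "price": [250_000, 180_000],
})
features = add_date_features(sample)
print(features)
```

Whether such features help depends on the data, so it is worth comparing RMSE with and without them.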

Conclusion

This project demonstrates how to predict house prices with machine learning algorithms and offers insight into the factors that influence a property's price. After hyperparameter tuning, the best model reached a Root Mean Square Error (RMSE) of 866,152. Because RMSE is expressed in the same units as the price, this figure is best judged against the range of prices in the dataset. Further feature engineering or more advanced techniques may reduce it.

Contributors

  • @Dharmendradiwaker12
