This project predicts health insurance charges using machine learning, with a Streamlit app for instant estimates. The repository includes all necessary scripts and setup documentation.

puni-ram48/Health-Insurance-Charges-Prediction

💻 Health Insurance Charges Prediction Using Machine Learning

Streamlit Link

App link

Project Overview

This project predicts health insurance charges from personal factors such as age, BMI, smoking habits, number of children, and region using machine learning algorithms. The primary goal is to build a predictive model that accurately estimates the insurance premiums individuals might expect based on their health and demographic information. The project also includes a user-friendly interface built with Streamlit, where users enter their details and receive an instant charge prediction.

Datasets

The dataset used for this project is sourced from Kaggle. It includes the following features:

  • Age: The age of the individual.
  • Sex: Gender of the insurance contractor (female or male).
  • BMI: Body Mass Index, which is a measure of body fat based on height and weight.
  • Children: The number of children covered by the insurance policy.
  • Smoker: Whether the individual is a smoker or not.
  • Region: The residential area of the individual in the US (northeast, southeast, southwest, or northwest).
  • Charges: The medical costs billed by the health insurance provider.

Tools and Technologies Used

  • Data Analysis: Python (Pandas, NumPy)
  • Machine Learning: Scikit-Learn (Linear Regression, Support Vector Regression, Decision Tree, Random Forest, K-Nearest Neighbors, Gradient Boosting), XGBoost
  • Visualization: Matplotlib, Seaborn
  • Model Deployment: Streamlit
  • Version Control: Git, GitHub

Installation and Usage

Prerequisites

Ensure you have Python installed on your machine. You will also need to install the required libraries:

# Install dependencies
pip install -r requirements.txt

Running the Project

# Clone the repository
git clone https://github.com/puni-ram48/Health-Insurance-Charges-Prediction.git
# Move into the project directory
cd Health-Insurance-Charges-Prediction
# Run the Streamlit Application
streamlit run app.py

Interact with the Application:

  • Open the provided local URL in your web browser.
  • Enter the required details: age, sex, BMI, number of children, smoking status, and region.
  • The application will predict the insurance charges based on the inputs provided.
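
As a rough illustration of the interaction described above, the sketch below shows one way a Streamlit form could collect the six inputs and turn them into a single feature record. It is not the repository's `app.py`: the function names `build_features` and `main`, the input bounds, and the `"yes"`/`"no"` smoker labels are assumptions made for this example, and the sketch stops short of loading the saved model.

```python
# Hypothetical sketch of an input form; not the repository's actual app.py.
REGIONS = ("northeast", "southeast", "southwest", "northwest")


def build_features(age, sex, bmi, children, smoker, region):
    """Validate raw form values and return them as one feature record."""
    if sex not in ("female", "male"):
        raise ValueError(f"unknown sex: {sex!r}")
    if smoker not in ("yes", "no"):  # assumed label encoding
        raise ValueError(f"unknown smoker value: {smoker!r}")
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    if age < 0 or bmi <= 0 or children < 0:
        raise ValueError("age, BMI, and children must be non-negative (BMI > 0)")
    return {
        "age": int(age),
        "sex": sex,
        "bmi": float(bmi),
        "children": int(children),
        "smoker": smoker,
        "region": region,
    }


def main():
    # Imported here so build_features can be used without Streamlit installed.
    import streamlit as st

    st.title("Health Insurance Charges Prediction")
    age = st.number_input("Age", min_value=18, max_value=100, value=30)
    sex = st.selectbox("Sex", ["female", "male"])
    bmi = st.number_input("BMI", min_value=10.0, max_value=60.0, value=25.0)
    children = st.number_input("Number of children", min_value=0, max_value=10, value=0)
    smoker = st.selectbox("Smoker", ["yes", "no"])
    region = st.selectbox("Region", list(REGIONS))

    if st.button("Predict"):
        features = build_features(age, sex, bmi, children, smoker, region)
        # A real app would pass this record to the saved model here.
        st.write(features)
```

To try the form, add a call to `main()` at the bottom of the script and launch it with `streamlit run`. Keeping validation in a plain function means it can be checked without starting the app.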

Data: Contains the dataset for the project.

Project Analysis Report: Final report containing the data analysis, visualizations, and model development.

SVM_Model: Saved machine learning model for deployment.

app.py: Streamlit application script.

requirements.txt: List of required Python libraries.

Model Development and Evaluation

  1. Data Preprocessing:
  • Handling missing values (if any).
  • Encoding categorical variables (e.g., smoker, region).
  • Scaling numerical features for better model performance.
  2. Model Training:
  • Several models, including Linear Regression, Decision Tree, and Random Forest, were trained.
  • Hyperparameter tuning was performed to improve model accuracy.
  3. Model Evaluation:
  • The models were evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared.
  • The best-performing model was selected for deployment.
  4. Best Model:
  • The Support Vector Regression (SVR) model achieved an R-squared value of 0.8895, the strongest result among the models evaluated, and was therefore selected for deployment.
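
The workflow above can be sketched with a scikit-learn pipeline. This is a minimal illustration, not the project's actual training code: the data is synthetic (randomly generated, not the Kaggle dataset), the column names simply mirror the feature list in this README, and the SVR settings (`kernel="rbf"`, `C=1000.0`) are arbitrary placeholders rather than the tuned values, so the printed scores say nothing about the real model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the Kaggle dataset); columns mirror the README.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.choice(["female", "male"], n),
    "bmi": rng.uniform(18, 40, n),
    "children": rng.integers(0, 5, n),
    "smoker": rng.choice(["yes", "no"], n),
    "region": rng.choice(["northeast", "southeast", "southwest", "northwest"], n),
})
# Made-up target so the example has something to fit.
df["charges"] = (
    250 * df["age"]
    + 300 * df["bmi"]
    + 20000 * (df["smoker"] == "yes")
    + rng.normal(0, 1000, n)
)

X = df.drop(columns="charges")
y = df["charges"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 1: scale numeric features and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "bmi", "children"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex", "smoker", "region"]),
])

# Step 2: train a Support Vector Regression model on the preprocessed features.
model = Pipeline([
    ("preprocess", preprocess),
    ("svr", SVR(kernel="rbf", C=1000.0)),  # placeholder hyperparameters
])
model.fit(X_train, y_train)

# Step 3: evaluate on the held-out split with the metrics named above.
pred = model.predict(X_test)
mae = mean_absolute_error(y_test, pred)
mse = mean_squared_error(y_test, pred)
rmse = float(np.sqrt(mse))
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}  R2={r2:.4f}")
```

Wrapping preprocessing and the model in one `Pipeline` means the scaler and encoder are fitted on the training split only, and the same transformations are applied automatically at prediction time.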

Contributing

We welcome contributions to this project! If you would like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/YourFeature).
  3. Make your changes and commit them (git commit -am 'Add some feature').
  4. Push to the branch (git push origin feature/YourFeature).
  5. Create a new Pull Request.

Please ensure your code is well-documented.

Authors and Acknowledgment

This project was initiated and completed by Puneetha Dharmapura Shrirama. Special thanks to Jeevitha DS for the guidance and support.

License

This project is licensed under the MIT License - see the LICENSE file for details.
