💻 Health Insurance Charges Prediction Using Machine Learning

Streamlit Link

Project Overview

This project predicts health insurance charges from personal factors such as age, BMI, smoking habits, number of children, and region using machine learning algorithms. The primary goal is to build a predictive model that estimates the insurance premiums individuals might expect based on their health and demographic information. The project also includes a user-friendly interface built with Streamlit, where users enter their details and receive an instant charge prediction.

Datasets

The dataset used for this project is sourced from Kaggle. It includes the following features:

Age: The age of the individual.
Sex: The gender of the insurance contractor (female or male).
BMI: Body Mass Index, which is a measure of body fat based on height and weight.
Children: The number of children covered by the insurance policy.
Smoker: Whether the individual is a smoker or not.
Region: The individual's residential area in the US: northeast, southeast, southwest, or northwest.
Charges: The medical costs billed by the health insurance provider.
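
As a quick illustration of the feature list above, the sketch below checks that a DataFrame carries the expected columns. The lowercase column names and the two sample rows are assumptions made up for this example; they are not taken from the actual Kaggle file.

```python
import pandas as pd

# Column names assumed from the feature list above (lowercase spelling is a guess).
EXPECTED_COLUMNS = ["age", "sex", "bmi", "children", "smoker", "region", "charges"]


def check_schema(df):
    """Return the expected columns that are missing from df."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


# Two made-up rows for illustration only (not real records).
sample = pd.DataFrame({
    "age": [19, 45],
    "sex": ["female", "male"],
    "bmi": [27.9, 30.1],
    "children": [0, 2],
    "smoker": ["yes", "no"],
    "region": ["southwest", "northeast"],
    "charges": [16000.0, 8000.0],
})

missing = check_schema(sample)
missing_without_bmi = check_schema(sample.drop(columns=["bmi"]))
```

Running a check like this right after loading the data catches renamed or missing columns before any model code runs.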

Tools and Technologies Used

Data Analysis: Python (Pandas, NumPy)
Machine Learning: Scikit-Learn (Linear Regression, Support Vector Regression, Decision Tree, Random Forest, K-Nearest Neighbors, Gradient Boosting), XGBoost
Visualization: Matplotlib, Seaborn
Model Deployment: Streamlit
Version Control: Git, GitHub

Installation and Usage

Prerequisites

Ensure you have Python installed on your machine. The required libraries are listed in requirements.txt and are installed as part of the steps below.

Running the Project

# Clone the repository and move into it
git clone https://github.com/puni-ram48/Health-Insurance-Charges-Prediction.git
cd Health-Insurance-Charges-Prediction

# Install dependencies
pip install -r requirements.txt

# Run the Streamlit Application
streamlit run app.py

Interact with the Application:

Open the provided local URL in your web browser.
Enter the required details: age, sex, BMI, number of children, smoking status, and region.
The application will predict the insurance charges based on the inputs provided.
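
The interaction described above could be wired up roughly as follows. This is a hypothetical sketch, not the repository's actual app.py: the model file name, the widget ranges and defaults, and the assumption that the pickled object accepts a raw single-row DataFrame are all illustrative.

```python
import pickle

import pandas as pd

MODEL_PATH = "svm_model.pkl"  # assumed file name for the saved model
FEATURES = ["age", "sex", "bmi", "children", "smoker", "region"]


def build_input_frame(age, sex, bmi, children, smoker, region):
    """Pack one user's answers into a single-row DataFrame."""
    row = {
        "age": age, "sex": sex, "bmi": bmi,
        "children": children, "smoker": smoker, "region": region,
    }
    return pd.DataFrame([row], columns=FEATURES)


def main():
    # Imported here so the helper above can be used without Streamlit installed.
    import streamlit as st

    st.title("Health Insurance Charges Prediction")
    # Ranges and defaults below are illustrative assumptions.
    age = st.number_input("Age", min_value=18, max_value=100, value=30)
    sex = st.selectbox("Sex", ["female", "male"])
    bmi = st.number_input("BMI", min_value=10.0, max_value=60.0, value=25.0)
    children = st.number_input("Number of children", min_value=0, max_value=10, value=0)
    smoker = st.selectbox("Smoker", ["yes", "no"])
    region = st.selectbox("Region", ["northeast", "southeast", "southwest", "northwest"])

    if st.button("Predict"):
        with open(MODEL_PATH, "rb") as f:
            model = pickle.load(f)  # assumed to be a full preprocessing + model pipeline
        frame = build_input_frame(age, sex, bmi, children, smoker, region)
        st.write(f"Estimated charges: {model.predict(frame)[0]:.2f}")


# Example of the helper on its own, outside of Streamlit.
example = build_input_frame(30, "female", 25.0, 0, "no", "northeast")
```

In a script launched with `streamlit run`, the file would end with a call to `main()`. Keeping the DataFrame construction in its own function makes it possible to test the input handling without starting the web interface.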

Data: Contains the dataset for the project.

Project Analysis Report: Final report containing the data analysis, visualizations, and model development.

SVM_Model: Saved machine learning model for deployment.

app.py: Streamlit application script.

requirements.txt: List of required Python libraries.
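
For readers unfamiliar with how a saved model file is produced and reloaded, here is a minimal sketch using the standard-library `pickle` module. The synthetic data, the pipeline contents, and the temporary file name are assumptions for illustration; the project's saved model may have been created differently.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data; the real project trains on the insurance dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X[:, 0] + 0.5 * X[:, 1]

# Scaling and the regressor are bundled so both are saved together.
model = make_pipeline(StandardScaler(), SVR(kernel="linear")).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The reloaded pipeline should reproduce the original predictions.
same_predictions = bool(np.allclose(model.predict(X), restored.predict(X)))
```

Pickled models should only be loaded from trusted sources, and generally with the same Scikit-Learn version that produced them.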

Model Development and Evaluation

Data Preprocessing:

Handling missing values (if any).
Encoding categorical variables (e.g., smoker, region).
Scaling numerical features for better model performance.
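
The three preprocessing steps above can be sketched with Scikit-Learn's `ColumnTransformer`. The column names, the median imputation strategy, and the use of one-hot encoding are assumptions for this example; the analysis report may use different choices.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["age", "bmi", "children"]        # assumed column names
CATEGORICAL = ["sex", "smoker", "region"]   # assumed column names


def build_preprocessor():
    """Impute and scale numeric columns; one-hot encode categorical ones."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fills missing numbers
        ("scale", StandardScaler()),                   # zero mean, unit variance
    ])
    return ColumnTransformer(
        [
            ("num", numeric, NUMERIC),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ],
        sparse_threshold=0.0,  # always return a dense array
    )


# Made-up rows for illustration; one BMI value is deliberately missing.
rows = pd.DataFrame({
    "age": [19, 33, 45, 60],
    "sex": ["female", "male", "male", "female"],
    "bmi": [27.9, None, 30.1, 25.8],
    "children": [0, 1, 2, 0],
    "smoker": ["yes", "no", "no", "yes"],
    "region": ["southwest", "southeast", "northeast", "northwest"],
})

features = build_preprocessor().fit_transform(rows)
```

Here the output has 11 columns: 3 scaled numeric columns plus 2, 2, and 4 one-hot columns for sex, smoker, and region.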

Model Training:

Several models, including Linear Regression, Support Vector Regression, Decision Tree, and Random Forest, were trained.
Hyperparameter tuning was performed to improve model accuracy.
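
One common way to train several models and tune their hyperparameters is a cross-validated grid search per candidate. The sketch below assumes `GridSearchCV` with small, hypothetical parameter grids and synthetic data; the source does not state which tuning method or grids were actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic, roughly linear data standing in for the preprocessed features.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=90)

# Candidate models with illustrative (hypothetical) parameter grids.
candidates = {
    "linear_regression": (LinearRegression(), {"fit_intercept": [True, False]}),
    "decision_tree": (DecisionTreeRegressor(random_state=0), {"max_depth": [2, 4, 8]}),
    "random_forest": (
        RandomForestRegressor(n_estimators=25, random_state=0),
        {"max_depth": [2, 4, 8]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # 3-fold cross-validation, scored by R-squared.
    search = GridSearchCV(estimator, grid, cv=3, scoring="r2")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

# Name of the candidate with the highest cross-validated R-squared.
best_name = max(results, key=lambda n: results[n][0])
```

Scoring every candidate with the same cross-validation splits keeps the comparison between models fair.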

Model Evaluation:

The models were evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared.
The best-performing model was selected for deployment.
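
The four metrics named above can be computed with `sklearn.metrics`, as in the sketch below. The true and predicted values are made-up numbers chosen so the arithmetic is easy to follow; they are not results from this project.

```python
import math

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_report(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R-squared for one set of predictions."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "mse": mse,
        "rmse": math.sqrt(mse),  # RMSE is the square root of MSE
        "r2": r2_score(y_true, y_pred),
    }


# Illustrative values only: every prediction is off by exactly 10.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

report = regression_report(y_true, y_pred)
```

For these values MAE is 10, MSE is 100, RMSE is 10, and R-squared is 1 - 400/50000 = 0.992. MAE and RMSE are in the same units as the charges, which makes them easier to interpret than MSE.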

Best Model:
- The SVM Regression model achieved an R-squared value of 0.8895, the best among the models evaluated, and was therefore selected for deployment. An R-squared of 0.8895 means the model explains roughly 89% of the variance in insurance charges.

Contributing

We welcome contributions to this project! If you would like to contribute, please follow these steps:

Fork the repository.
Create a new branch (git checkout -b feature/YourFeature).
Make your changes and commit them (git commit -am 'Add some feature').
Push to the branch (git push origin feature/YourFeature).
Create a new Pull Request.

Please ensure your code is well-documented.

Authors and Acknowledgment

This project was initiated and completed by Puneetha Dharmapura Shrirama. Special thanks to Jeevitha DS for the guidance and support.

License

This project is licensed under the MIT License - see the LICENSE file for details.
