This project predicts health insurance premiums using machine learning techniques. The goal is to estimate a premium from key features such as age, smoking habits, BMI, and medical history.
- Machine Learning Models Used:
  - Linear Regression
  - XGBoost Regressor
- Best Model: The XGBoost Regressor was selected as the best-performing model, achieving 97% accuracy on the test dataset.
- Conducted detailed error analysis to evaluate the model's performance across different data segments.
- Observed that the model's predictions were consistently positive and reliable, aligning well with the actual premium values.
- Comprehensive data preprocessing and feature engineering to handle various types of input data.
- Comparison of model performance to ensure accurate predictions.
- Clear and concise code structure to make it easy to understand and extend.
- Programming Language: Python
- Libraries/Frameworks: Scikit-learn, XGBoost, NumPy, Pandas, Matplotlib
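
The workflow described above (preprocessing, comparing Linear Regression with a boosted-tree regressor, then checking errors per data segment) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`age`, `bmi`, `smoker`, `medical_history`), the premium formula, the hyperparameters, and the age-group bins are all assumptions, and the data is synthetic because the real dataset is not included. The metric shown is R², which is also an assumption about how the models are scored. If `xgboost` is not installed, the sketch falls back to scikit-learn's `GradientBoostingRegressor`, which is a different implementation of the same idea.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

try:
    from xgboost import XGBRegressor
    boosted = XGBRegressor(n_estimators=200, max_depth=4,
                           learning_rate=0.1, random_state=0)
except ImportError:
    # Fallback (not XGBoost): scikit-learn's own gradient boosting.
    from sklearn.ensemble import GradientBoostingRegressor
    boosted = GradientBoostingRegressor(random_state=0)

# Synthetic stand-in data; column names and value ranges are hypothetical.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "bmi": rng.normal(27, 4, n).round(1),
    "smoker": rng.choice(["yes", "no"], n, p=[0.2, 0.8]),
    "medical_history": rng.choice(["none", "diabetes", "heart_disease"], n),
})
# Made-up premium formula, used only so the models have something to learn.
history_cost = {"none": 0, "diabetes": 2500, "heart_disease": 4000}
premium = (
    3000 + 120 * df["age"] + 80 * df["bmi"]
    + 6000 * (df["smoker"] == "yes")
    + df["medical_history"].map(history_cost)
    + rng.normal(0, 500, n)
)

X_train, X_test, y_train, y_test = train_test_split(
    df, premium, test_size=0.3, random_state=0
)

def build_pipeline(model):
    """Scale numeric columns, one-hot encode categorical ones, then fit model."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), ["age", "bmi"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"),
         ["smoker", "medical_history"]),
    ])
    return Pipeline([("preprocess", preprocess), ("model", model)])

# Fit both models on the same split and compare R^2 on the held-out data.
models = {"linear_regression": LinearRegression(), "boosted_trees": boosted}
scores, predictions = {}, {}
for name, model in models.items():
    pipe = build_pipeline(model).fit(X_train, y_train)
    predictions[name] = pipe.predict(X_test)
    scores[name] = r2_score(y_test, predictions[name])
    print(f"{name}: R^2 = {scores[name]:.3f}")

# Error analysis: absolute percentage error of the boosted model per age group.
report = X_test.assign(actual=y_test, predicted=predictions["boosted_trees"])
report["abs_pct_error"] = (
    (report["predicted"] - report["actual"]).abs() / report["actual"] * 100
)
report["age_group"] = pd.cut(
    report["age"], bins=[17, 30, 45, 64], labels=["18-30", "31-45", "46-64"]
)
segment_summary = (
    report.groupby("age_group", observed=True)["abs_pct_error"]
    .agg(["mean", "max"])
)
print(segment_summary)
```

Scores on this synthetic data say nothing about the results reported for the real dataset. The per-segment table is the useful part of the pattern: a segment whose mean or maximum error stands out is a candidate for extra features or a separate model.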
- Health insurance providers can use this model to estimate premiums more accurately based on client profiles.
- Enables data-driven decisions for personalized insurance plans.
Explore the code, models, and insights in this repository to understand how ML techniques are applied to real-world problems.
Data Files: The data files used in this project are not provided in the repository due to privacy concerns. However, you can use similarly structured datasets to replicate and extend the results.
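
When substituting your own dataset, a quick column check before training can catch structural mismatches early. The sketch below is only an illustration: the required column names are hypothetical, inferred from the features mentioned in this README rather than taken from the original data files.

```python
import pandas as pd

# Hypothetical schema; adjust to match the columns your code expects.
REQUIRED_COLUMNS = {"age", "bmi", "smoker", "medical_history", "premium"}

def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the sorted list of required columns that are absent from df."""
    return sorted(set(required) - set(df.columns))

# Example: a frame lacking two of the assumed columns.
sample = pd.DataFrame({"age": [34], "bmi": [26.1], "smoker": ["no"]})
print(missing_columns(sample))  # ['medical_history', 'premium']
```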
THANK YOU Praveen