An ML service that predicts insurance premiums from customer input data, with explainability (feature importance, SHAP) and an executive summary for the user.
🔗 Live Demo: insurance-pricing-demo.symfa.com
This project demonstrates a transparent approach to insurance pricing using machine learning. Based on the US Health Insurance Dataset from Kaggle, the system predicts premiums while prioritizing explainability. Instead of a black-box output, it provides a detailed breakdown of how individual customer factors—such as age, BMI, and region—drive the calculated cost, helping business users understand the "why" behind every price.
The goal is to build a predictive model that estimates insurance premiums based on customer characteristics, providing:
- Accurate premium predictions based on customer attributes
- Explainability via feature importance and SHAP values
- Executive summary generation in human-readable language
- Interactive parameter adjustment with real-time updates
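To make the explainability and executive-summary goals concrete, here is a minimal, standard-library-only sketch. It computes exact Shapley values (the quantity that SHAP values are based on) by brute-force enumeration and turns them into a short plain-language summary. Everything in it is an assumption for illustration: `toy_premium` is a hypothetical hand-written pricing rule, not the trained model, `BASELINE` is an invented reference customer, and the helper names are not part of this repository. The real service relies on the SHAP library rather than enumeration.

```python
from itertools import permutations

# Hypothetical toy pricing rule used only for illustration.
# It is NOT the trained model shipped with this project.
def toy_premium(customer: dict) -> float:
    premium = 2000.0 + 50.0 * customer["age"] + 100.0 * customer["children"]
    if customer["smoker"] == "yes":
        premium += 300.0 * customer["bmi"]  # smoker/BMI interaction
    return premium

# Invented reference customer that contributions are measured against.
BASELINE = {"age": 40, "bmi": 30.0, "children": 1, "smoker": "no"}

def shapley_values(model, customer: dict, baseline: dict) -> dict:
    """Exact Shapley values: average each feature's marginal effect
    over every order in which features can be switched from the
    baseline value to the customer's value."""
    features = list(customer)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = customer[name]
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

def executive_summary(model, customer: dict, baseline: dict) -> str:
    """Render the contributions as short human-readable sentences."""
    contributions = shapley_values(model, customer, baseline)
    lines = [
        f"Estimated premium: {model(customer):,.0f} "
        f"(reference customer: {model(baseline):,.0f})."
    ]
    ranked = sorted(contributions.items(), key=lambda item: -abs(item[1]))
    for name, delta in ranked:
        if delta == 0:
            continue  # feature equals the baseline or has no effect
        direction = "raises" if delta > 0 else "lowers"
        lines.append(f"- {name} {direction} the premium by {abs(delta):,.0f}.")
    return "\n".join(lines)
```

Calling `executive_summary(toy_premium, {"age": 50, "bmi": 35.0, "children": 1, "smoker": "yes"}, BASELINE)` returns a ranked list of drivers. The contributions always sum to the difference between the customer's premium and the reference premium, which is what makes this kind of breakdown easy to present to business users. Enumeration grows factorially with the number of features, so it is only practical for a handful of inputs like the ones here.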
insurance-pricing/
├── backend/ # 🐍 Python Backend (UV workspace member)
│ ├── src/insurance_pricing/ # FastAPI application
│ │ ├── __init__.py
│ │ └── main.py # API endpoints
│ ├── models/ # Trained ML model artifacts
│ ├── notebooks/ # Jupyter notebooks (EDA, experiments)
│ ├── scripts/ # Training & preprocessing scripts
│ ├── data/ # Datasets
│ │ └── source.csv
│ └── pyproject.toml # Backend dependencies
│
├── frontend/ # ⚛️ Next.js Frontend
│ ├── src/app/
│ │ ├── layout.tsx
│ │ ├── page.tsx
│ │ └── globals.css
│ └── package.json
│
├── pyproject.toml # UV workspace definition
├── uv.lock # Lockfile
├── .pre-commit-config.yaml # Code quality hooks
└── README.md
The dataset contains health insurance records with the following features:
| Feature | Description |
|---|---|
| age | Age of the primary beneficiary |
| sex | Gender (male/female) |
| bmi | Body mass index |
| children | Number of dependents covered |
| smoker | Smoking status (yes/no) |
| region | Residential area in the US |

| Feature | Description |
|---|---|
| charges | Target - Individual medical costs billed by insurance |
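As an illustration of how these columns could be validated and turned into model inputs, the sketch below parses one CSV row into a typed record and one-hot encodes the categorical fields. It uses only the standard library as a stand-in for the Pydantic validation mentioned in the tech stack. The `Customer` class, `parse_row`, `encode`, the four region names, and the sample row are all assumptions made for this example; they are not taken from the project code or from `data/source.csv`.

```python
import csv
import io
from dataclasses import dataclass

# Assumed region labels; check them against the actual dataset.
REGIONS = ("northeast", "northwest", "southeast", "southwest")

@dataclass(frozen=True)
class Customer:
    age: int
    sex: str
    bmi: float
    children: int
    smoker: str
    region: str

    def __post_init__(self) -> None:
        # Reject values outside the categories listed in the table above.
        if self.sex not in ("male", "female"):
            raise ValueError(f"unexpected sex: {self.sex!r}")
        if self.smoker not in ("yes", "no"):
            raise ValueError(f"unexpected smoker flag: {self.smoker!r}")
        if self.region not in REGIONS:
            raise ValueError(f"unexpected region: {self.region!r}")
        if self.age < 0 or self.bmi <= 0 or self.children < 0:
            raise ValueError("age, bmi and children must be non-negative")

def parse_row(row: dict) -> tuple[Customer, float]:
    """Convert one CSV row (all strings) into typed features and target."""
    customer = Customer(
        age=int(row["age"]),
        sex=row["sex"],
        bmi=float(row["bmi"]),
        children=int(row["children"]),
        smoker=row["smoker"],
        region=row["region"],
    )
    return customer, float(row["charges"])

def encode(customer: Customer) -> list[float]:
    """Numeric vector: age, bmi, children, is_male, is_smoker, region one-hot."""
    return [
        float(customer.age),
        customer.bmi,
        float(customer.children),
        1.0 if customer.sex == "male" else 0.0,
        1.0 if customer.smoker == "yes" else 0.0,
        *[1.0 if customer.region == name else 0.0 for name in REGIONS],
    ]

# Made-up sample row for demonstration only (not real dataset values).
SAMPLE = (
    "age,sex,bmi,children,smoker,region,charges\n"
    "45,female,27.5,2,no,southwest,8000.0\n"
)
rows = [parse_row(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
```

Here `rows[0]` holds the typed `Customer` and its `charges` target, and `encode(rows[0][0])` yields a nine-element vector that a scikit-learn estimator could consume. In the real pipeline, pandas and scikit-learn would typically handle loading and encoding.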
- Python 3.13+
- FastAPI - Modern, high-performance web framework
- Pydantic - Data validation
- uvicorn - ASGI server
- Next.js 16 - React framework with SSR
- TypeScript - Type-safe JavaScript
- Tailwind CSS 4 - Utility-first CSS framework
- React 19
- pandas - Data manipulation
- scikit-learn - Machine learning
- SHAP - Model explainability
- uv - Fast Python package manager
- pnpm - Fast Node.js package manager
- pre-commit - Git hooks for code quality
- ruff - Linter and formatter
- mypy - Static type checker
- Python 3.13+
- Node.js 20+
- pnpm (fast and efficient Node.js package manager)
- uv (recommended for Python)
- Clone the repository:

      git clone https://github.com/Symfa-Inc/insurance-pricing.git
      cd insurance-pricing

- Install Python dependencies:

      uv sync

- Install frontend dependencies:

      cd frontend
      pnpm install
Backend (FastAPI):

    uv run uvicorn insurance_pricing.main:app --reload

API will be available at: http://localhost:8000

API docs at: http://localhost:8000/docs
Frontend (Next.js):

    cd frontend
    pnpm dev

Frontend will be available at: http://localhost:3000
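Once both servers are started, a quick smoke check can confirm that they respond. The sketch below is an optional helper, not part of the repository: `is_up` and `check_all` are hypothetical names, and the only URLs it touches are the ones listed above.

```python
import http.client
import urllib.error
import urllib.request

# URLs taken from the run instructions above.
SERVICES = {
    "backend docs": "http://localhost:8000/docs",
    "frontend": "http://localhost:3000",
}

def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, or a bad response.
        return False

def check_all() -> dict:
    """Check every known service and print one status line for each."""
    results = {name: is_up(url) for name, url in SERVICES.items()}
    for name, ok in results.items():
        print(f"{name}: {'up' if ok else 'down'} ({SERVICES[name]})")
    return results
```

Running `check_all()` from a Python shell prints one line per service, and a `down` result usually means the corresponding dev server has not finished starting or is bound to a different port.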