Customer Churn MLOps is an end-to-end machine learning pipeline for predicting customer churn using tabular data. It integrates DVC for data/model versioning, MLflow for experiment tracking, FastAPI for model serving, and GitHub Actions for CI/CD automation, making the project fully production-ready.

moeedfaiz/customer_churn_mlops

📊 Customer Churn Prediction – MLOps Pipeline

This project implements an end-to-end MLOps workflow for a Customer Churn Prediction Model using modern tools like DVC, MLflow, GitHub Actions, and FastAPI.

It predicts whether a customer will churn (leave the service) based on historical tabular data.

🚀 Features Implemented

✅ Data validation using Pandera
✅ Model training and evaluation with scikit-learn
✅ Experiment tracking with MLflow
✅ API serving using FastAPI
✅ Version control with Git
✅ Data & model tracking with DVC
✅ Remote storage setup with Google Drive (DVC Remote)
✅ CI/CD automation with GitHub Actions

🏗 Project Structure

customer_churn_mlops/
│
├── data/                    # Raw & processed datasets (tracked with DVC)
│   ├── raw/
│   └── processed/
│
├── models/                  # Stored models (tracked with DVC)
│
├── src/                     # Source code
│   ├── data_validation.py   # Pandera validation schemas
│   ├── train_model.py       # Training script with MLflow logging
│   └── serve_api.py         # FastAPI app for predictions
│
├── .dvc/                    # DVC configuration
├── dvc.yaml                 # DVC pipeline stages
├── dvc.lock                 # Locked pipeline stages
├── requirements.txt         # Python dependencies
├── README.md                # Project documentation
└── .github/workflows/       # GitHub Actions CI/CD pipelines

⚙ Installation & Setup

1️⃣ Clone the Repository

git clone https://github.com//customer_churn_mlops.git
cd customer_churn_mlops

2️⃣ Create a Virtual Environment

python -m venv .dvc_env
.dvc_env\Scripts\activate        # Windows
source .dvc_env/bin/activate     # Linux/Mac

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Set Up DVC Remote

dvc push   # Upload data & models to remote

📦 Running the Pipeline

Reproduce the ML pipeline:

dvc repro

Show metrics:

dvc metrics show
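`dvc metrics show` reads metrics files that a pipeline stage declares in dvc.yaml; DVC accepts JSON, YAML, and TOML for these. The README does not show how this project writes its metrics, so the following is only a minimal sketch, assuming the training stage dumps a flat JSON file. The file name `metrics.json` and the `write_metrics` helper are hypothetical.

```python
import json
from pathlib import Path


def write_metrics(metrics: dict[str, float], path: str = "metrics.json") -> Path:
    """Write evaluation metrics as JSON so a DVC stage can track them."""
    out = Path(path)
    # Create the parent directory if the metrics file lives in a subfolder.
    out.parent.mkdir(parents=True, exist_ok=True)
    # Round for readable diffs; sort keys so the file is stable across runs.
    rounded = {name: round(float(value), 4) for name, value in metrics.items()}
    out.write_text(json.dumps(rounded, indent=2, sort_keys=True) + "\n")
    return out
```

For `dvc metrics show` to pick the file up, it would need to be listed as a metrics output of the stage that produces it in dvc.yaml.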

🔎 Model Training & Tracking

Train the model manually:

python src/train_model.py

View experiments in MLflow UI:

mlflow ui
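The README does not reproduce `src/train_model.py`, so here is a minimal sketch of what a scikit-learn training run with MLflow logging might look like. Everything in it is an assumption for illustration: synthetic data from `make_classification` stands in for the churn dataset, and `LogisticRegression`, the function names, and the hyperparameters are not taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(C: float = 1.0, seed: int = 42):
    """Fit a classifier on stand-in data and return (model, params, metrics)."""
    # Synthetic binary data; the real script would load the processed dataset.
    X, y = make_classification(n_samples=500, n_features=8, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    params = {"C": C, "seed": seed}
    metrics = {
        "accuracy": float(accuracy_score(y_test, preds)),
        "f1": float(f1_score(y_test, preds)),
    }
    return model, params, metrics


def log_to_mlflow(params: dict, metrics: dict) -> None:
    """Record one run's parameters and metrics in MLflow."""
    import mlflow  # imported lazily so training works without MLflow installed

    with mlflow.start_run():
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
```

Calling `log_to_mlflow(params, metrics)` after `train_and_evaluate()` records the run in MLflow's default local tracking store (unless a tracking URI is configured), where `mlflow ui` can display it when started from the same directory.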

🌐 API Serving

Run FastAPI server:

uvicorn src.serve_api:app --reload

Then test via browser or cURL:

http://127.0.0.1:8000/predict
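The README does not document the request schema for `/predict`. Prediction endpoints in FastAPI are commonly POST routes that take a JSON body, in which case opening the URL in a browser (a GET request) would not return a prediction. The sketch below assumes such a POST route; the feature names in the usage note are hypothetical, and the helper names are not part of the project.

```python
import json
import urllib.request

PREDICT_URL = "http://127.0.0.1:8000/predict"


def build_predict_request(features: dict, url: str = PREDICT_URL) -> urllib.request.Request:
    """Build a JSON POST request for the prediction endpoint (assumed to accept POST)."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: dict, url: str = PREDICT_URL, timeout: float = 5.0) -> dict:
    """Send the request to a running server and decode the JSON response."""
    request = build_predict_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `predict({"tenure": 12, "monthly_charges": 70.5})` would return the decoded response, provided the field names match what `src/serve_api.py` expects. FastAPI also serves interactive documentation at http://127.0.0.1:8000/docs by default, which shows the actual request schema.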

⚡ CI/CD with GitHub Actions

This repo includes a GitHub Actions workflow that:

1. Installs dependencies
2. Pulls dataset & models from Google Drive (DVC Remote)
3. Reproduces the pipeline (dvc repro)
4. Shows metrics (dvc metrics show)

Check it under: 👉 GitHub → Actions tab

🛠 Tools & Tech Stack

• Python 3.11
• scikit-learn – Model training
• Pandera – Data validation
• MLflow – Experiment tracking
• FastAPI – Model serving
• DVC – Data & model versioning
• GitHub Actions – CI/CD automation
• Google Drive – DVC remote storage

👨‍💻 Author

Developed by Moeed Abbasi
