- Predictive maintenance system for industrial equipment failure detection.
- Gradient Boosting model (ROC-AUC 0.985, Recall 0.824)
- FastAPI + Docker + SHAP explainability
- 89% test coverage, CI/CD ready
- Overview
- Key Features
- Problem & Solution
- Model Performance
- Project Structure
- Quick Start
- API Documentation
- Model Details
- Development
- Testing
- Deployment
- Explainability
- License
## Overview

This project implements an end-to-end machine learning system for predictive maintenance of industrial milling machines. The system predicts equipment failures before they occur, enabling proactive maintenance scheduling and reducing costly downtime.
Business Value:
- Early failure detection with 82% recall
- Reliable alerts with 72% precision
- Explainable predictions using SHAP
- Production-ready API with Docker deployment
- Real-time inference with <100ms latency
## Key Features

- Gradient Boosting Classifier achieving 0.985 ROC-AUC
- Custom threshold optimization balancing precision and recall
- SHAP-based explainability for model transparency
- Handles class imbalance (failure rate ~3%)
- FastAPI with automatic OpenAPI documentation
- RESTful endpoints for prediction and explanation
- Batch prediction support
- SHAP visualization endpoint
- Health checks and monitoring
- Request ID tracking for debugging
- Docker containerization for consistent deployment
- 89% test coverage (unit + integration tests)
- Structured logging (JSON format)
- Configuration management via YAML
- CI/CD ready with GitHub Actions workflow
- Makefile for common operations
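Configuration is centralized in `config/config.yml`. A minimal sketch of what such a file could contain (illustrative only; the actual keys in the repository's config may differ):

```yaml
# Illustrative configuration sketch — not the project's actual config.yml
model:
  threshold: 0.10        # decision threshold for alerts
api:
  host: 0.0.0.0
  port: 8000
paths:
  artifacts: artifacts/final
  raw_data: data/raw
```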
## Problem & Solution

Industrial equipment failures cause:
- Unplanned downtime
- Lost productivity
- Emergency repair costs
- Product quality issues
The solution is a predictive maintenance system that:
- Monitors equipment sensors in real-time
- Predicts failures before they occur
- Triggers maintenance alerts when risk is high
- Provides explanations for each prediction
## Model Performance

The model was selected based on strict business requirements:
Hard Constraints:
- Minimum Recall ≥ 0.80 — catch at least 80% of failures
- Minimum Precision ≥ 0.50 — keep false alarms manageable
Ranking Metrics (among models meeting constraints):
- PR-AUC (primary)
- ROC-AUC
- Recall
- Precision
Operating Point: Threshold = 0.10
| Metric | Value | Interpretation |
|---|---|---|
| Recall | 0.824 | Catches 82% of actual failures |
| Precision | 0.718 | 72% of alerts are true failures |
| F1-Score | 0.767 | Strong balance of precision/recall |
| ROC-AUC | 0.985 | Excellent ranking quality |
| PR-AUC | 0.839 | Best performance on imbalanced data |
- Only model meeting both business constraints simultaneously
- Highest PR-AUC among eligible models
- Best balance between early detection and alert reliability
- Production-ready with calibrated threshold and explainability
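As a sanity check, the reported F1-score follows directly from the precision and recall in the table above:

```python
# F1 is the harmonic mean of precision and recall.
precision, recall = 0.718, 0.824
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # 0.767, matching the table
```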
## Project Structure

```
predictive-maintenance-dockerized-api/
│
├── api/                          # FastAPI application
│   ├── main.py                   # API endpoints
│   ├── schemas.py                # Pydantic models
│   ├── deps.py                   # Dependency injection
│   └── static/                   # Frontend assets
│       └── index.html
│
├── src/                          # Core ML pipeline
│   ├── config.py                 # Configuration loader
│   ├── paths.py                  # Path management
│   ├── data_loader.py            # Data loading utilities
│   ├── preprocessing.py          # Feature engineering
│   ├── models.py                 # Model builders
│   ├── predictive_model.py       # Production model wrapper
│   ├── training.py               # Training pipeline
│   ├── evaluation.py             # Metrics calculation and evaluation
│   ├── thresholding.py           # Threshold optimization
│   ├── artifacts_io.py           # Model persistence
│   ├── logging_config.py         # Logging
│   └── visualization/            # Plotting utilities
│       ├── comparison.py
│       ├── explainability.py
│       └── threshold_analysis.py
│
├── tests/                        # Test suite (89% coverage)
│   ├── unit/                     # Unit tests
│   └── integration/              # Integration tests
│
├── notebooks/                    # Jupyter notebooks
│   ├── 01_eda.ipynb              # Exploratory analysis
│   ├── 02_baseline_models.ipynb  # Baseline experiments
│   ├── 03_tree_models.ipynb      # Random Forest models
│   ├── 04_gradient_boosting_models.ipynb            # Gradient Boosting models
│   └── 05_model_selection_and_explainability.ipynb  # Final model selection
│
├── artifacts/                    # Model artifacts
│   ├── final/                    # Production model
│   │   ├── pipeline.joblib       # Trained pipeline
│   │   ├── threshold.joblib      # Decision threshold
│   │   ├── metrics.json          # Performance metrics
│   │   └── threshold_sweep.csv   # Threshold analysis
│   └── split/                    # Train/test split
│
├── data/
│   ├── raw/                      # Original dataset
│   └── processed/                # Processed data
│
├── config/
│   └── config.yml                # Project configuration
│
├── .github/workflows/
│   └── tests.yml                 # CI/CD pipeline
│
├── Dockerfile                    # Container definition
├── Makefile                      # Development commands
├── requirements.txt              # Python dependencies
├── pytest.ini                    # Test configuration
├── .coveragerc                   # Coverage settings
└── README.md                     # This file
```
## Quick Start

Prerequisites:

- Python 3.9+
- Docker (optional, recommended)
```bash
# Clone repository
git clone https://github.com/foxymadeit/predictive-maintenance-dockerized-api.git
cd predictive-maintenance-dockerized-api

# Build and run
make build
make run-d

# Verify
make health
```

API runs at http://localhost:8000
```bash
# Clone repository
git clone https://github.com/foxymadeit/predictive-maintenance-dockerized-api.git
cd predictive-maintenance-dockerized-api

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run API
uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
```

```bash
# Health check
curl http://localhost:8000/health

# Single prediction
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "Air temperature [K]": 300,
    "Process temperature [K]": 310,
    "Rotational speed [rpm]": 1500,
    "Torque [Nm]": 40,
    "Tool wear [min]": 100,
    "Type": "M"
  }'
```

Response:

```json
{
  "proba_failure": 0.156,
  "alert": 1,
  "threshold": 0.10
}
```

## API Documentation

Once running, visit:

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Simple web interface for testing predictions.
Health check endpoint.

Response:

```json
{
  "status": "ok",
  "threshold": 0.10,
  "features": ["Air temperature [K]", ...]
}
```

Single machine prediction.

Request Body:

```json
{
  "Air temperature [K]": 300.0,
  "Process temperature [K]": 310.0,
  "Rotational speed [rpm]": 1500.0,
  "Torque [Nm]": 40.0,
  "Tool wear [min]": 100.0,
  "Type": "M"
}
```

Response:

```json
{
  "proba_failure": 0.156,
  "alert": 1,
  "threshold": 0.10
}
```

Fields:

- `proba_failure`: probability of failure (0-1)
- `alert`: binary flag (0 = safe, 1 = maintenance needed)
- `threshold`: decision threshold used
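The relationship between these fields can be sketched as follows (`make_response` is a hypothetical helper for illustration; the actual service code in `api/` may differ):

```python
def make_response(proba_failure: float, threshold: float = 0.10) -> dict:
    """Assemble a /predict-style payload: the alert fires when the
    predicted failure probability reaches the decision threshold."""
    return {
        "proba_failure": round(proba_failure, 3),
        "alert": int(proba_failure >= threshold),
        "threshold": threshold,
    }

print(make_response(0.156))  # alert fires: 0.156 >= 0.10
print(make_response(0.043))  # below threshold: no alert
```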
Batch prediction for multiple machines.
Request Body:

```json
{
  "records": [
    {"Air temperature [K]": 300, ...},
    {"Air temperature [K]": 305, ...}
  ]
}
```

Response:

```json
{
  "results": [
    {"proba_failure": 0.156, "alert": 1, "threshold": 0.10},
    {"proba_failure": 0.043, "alert": 0, "threshold": 0.10}
  ]
}
```

Get SHAP explanation for a prediction.

Query Parameters:

- `top_k` (int, optional): number of top features to return (default: 8)
Request Body: Same as /predict
Response:

```json
{
  "proba_failure": 0.156,
  "alert": 1,
  "threshold": 0.10,
  "top_contributors": [
    {
      "feature": "Torque [Nm]",
      "value": 40.0,
      "shap_value": 0.087,
      "direction": "increases_risk"
    },
    {
      "feature": "Tool wear [min]",
      "value": 100.0,
      "shap_value": 0.065,
      "direction": "increases_risk"
    },
    ...
  ]
}
```

Get SHAP waterfall plot as PNG image.

Request Body: Same as /predict

Response: PNG image (Content-Type: image/png)
Example:

```bash
curl -X POST http://localhost:8000/explain/plot \
  -H "Content-Type: application/json" \
  -d '{"Air temperature [K]": 300, ...}' \
  -o shap_plot.png
```

## Model Details

Numerical Features (5):

- `Air temperature [K]` — ambient temperature
- `Process temperature [K]` — operational temperature
- `Rotational speed [rpm]` — spindle rotation speed
- `Torque [Nm]` — torque measurement
- `Tool wear [min]` — cumulative tool usage time
Categorical Features (1):
- `Type` — machine quality variant (L = Low, M = Medium, H = High)
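Put together, the feature handling might be wired into a scikit-learn pipeline along these lines (an illustrative reconstruction; the actual builders live in `src/preprocessing.py` and `src/models.py` and may differ, e.g. tree ensembles do not strictly need scaling):

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERICAL = [
    "Air temperature [K]", "Process temperature [K]",
    "Rotational speed [rpm]", "Torque [Nm]", "Tool wear [min]",
]
CATEGORICAL = ["Type"]

# Scale numeric sensor readings, one-hot encode the machine quality variant.
preprocessor = ColumnTransformer([
    ("num", StandardScaler(), NUMERICAL),
    ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
])

pipeline = Pipeline([
    ("preprocess", preprocessor),
    ("model", GradientBoostingClassifier(random_state=42)),
])
```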
The decision threshold was selected by:

- Computing the precision-recall curve on the test set
- Filtering thresholds meeting the business constraints:
  - Recall ≥ 0.80
  - Precision ≥ 0.50
- Selecting the threshold maximizing F1-score among valid options

Final threshold: 0.10 (optimized for early detection)
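The sweep described above can be sketched with scikit-learn. Here synthetic data stands in for the real test set; `y_true` and `y_score` are placeholders, not project artifacts:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Synthetic stand-in for test-set labels (~3% failures) and model scores.
rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.03, size=2000)
y_score = np.clip(y_true * 0.6 + rng.uniform(0.0, 0.4, size=2000), 0.0, 1.0)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
precision, recall = precision[:-1], recall[:-1]  # align with thresholds

# Apply the business constraints, then maximize F1 among valid thresholds.
valid = (recall >= 0.80) & (precision >= 0.50)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best = int(np.argmax(np.where(valid, f1, -1.0)))

print(f"threshold={thresholds[best]:.2f}  "
      f"precision={precision[best]:.3f}  recall={recall[best]:.3f}")
```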
All trained models and their performance:
| Model | Threshold | Precision | Recall | F1 | ROC-AUC | PR-AUC |
|---|---|---|---|---|---|---|
| GB (final) | 0.10 | 0.718 | 0.824 | 0.767 | 0.985 | 0.839 |
| GB (tuned) | 0.10 | 0.718 | 0.824 | 0.767 | 0.985 | 0.839 |
| RF (tuned) | 0.06 | 0.305 | 0.912 | 0.458 | 0.962 | 0.797 |
| LR (balanced) | 0.58 | 0.158 | 0.750 | 0.261 | 0.889 | 0.396 |
| LR (default) | 0.02 | 0.106 | 0.853 | 0.188 | 0.889 | 0.456 |
Why Gradient Boosting?
- Only model meeting both constraints
- Highest PR-AUC (critical for imbalanced data)
- Best precision-recall balance
- Stable performance across thresholds
## Development

```bash
# Clone and create venv
git clone https://github.com/foxymadeit/predictive-maintenance-dockerized-api.git
cd predictive-maintenance-dockerized-api
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest -v

# Run with coverage
pytest --cov=src --cov=api --cov-report=html

# View coverage report
open htmlcov/index.html
```

```bash
# Docker operations
make build            # Build Docker image
make run              # Run container (foreground)
make run-d            # Run container (background)
make stop             # Stop container
make rebuild          # Full rebuild cycle
make clean            # Remove images and cache

# API testing
make health           # Check API health
make predict          # Test /predict endpoint
make explain          # Test /explain endpoint
make explain-plot     # Test /explain/plot endpoint

# Testing
make test             # Run all tests
make test-cov         # Run tests with coverage
make test-unit        # Run unit tests only
make test-integration # Run integration tests only
make test-docker      # Run tests in Docker
```
## Testing

```bash
# Run all tests
pytest -v

# With coverage report
pytest --cov=src --cov=api --cov-report=term-missing

# Unit tests only
pytest tests/unit/ -v

# Integration tests only
pytest tests/integration/ -v

# Test in Docker
make test-docker
```

```
tests/
├── unit/                         # Unit tests
│   ├── test_preprocessing.py
│   ├── test_models.py
│   ├── test_evaluation.py
│   └── test_predictive_model.py
└── integration/                  # Integration tests
    ├── test_api.py
    └── test_pipeline.py
```
## Explainability

The model uses SHAP to explain individual predictions:
Global Explanations:
- Feature importance ranking
- Average impact of each feature
- Feature interactions
Local Explanations:
- Contribution of each feature to a specific prediction
- Direction of impact (increases/decreases risk)
- Magnitude of effect
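The direction label reported by the API simply reflects the sign of the SHAP value. A sketch of that mapping (`contributor` is a hypothetical helper, not the repository's actual code):

```python
def contributor(feature: str, value: float, shap_value: float) -> dict:
    """Format one SHAP contribution: positive SHAP values push the
    prediction toward failure, negative values push it away."""
    return {
        "feature": feature,
        "value": value,
        "shap_value": round(shap_value, 3),
        "direction": "increases_risk" if shap_value > 0 else "decreases_risk",
    }

print(contributor("Torque [Nm]", 40.0, 0.087))
print(contributor("Rotational speed [rpm]", 1500.0, -0.023))
```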
```bash
curl -X POST "http://localhost:8000/explain?top_k=5" \
  -H "Content-Type: application/json" \
  -d '{
    "Air temperature [K]": 300,
    "Process temperature [K]": 310,
    "Rotational speed [rpm]": 1500,
    "Torque [Nm]": 40,
    "Tool wear [min]": 100,
    "Type": "M"
  }'
```

Response:
```json
{
  "proba_failure": 0.156,
  "alert": 1,
  "threshold": 0.10,
  "top_contributors": [
    {
      "feature": "Torque [Nm]",
      "value": 40.0,
      "shap_value": 0.087,
      "direction": "increases_risk"
    },
    {
      "feature": "Tool wear [min]",
      "value": 100.0,
      "shap_value": 0.065,
      "direction": "increases_risk"
    },
    {
      "feature": "Rotational speed [rpm]",
      "value": 1500.0,
      "shap_value": -0.023,
      "direction": "decreases_risk"
    }
  ]
}
```

Interpretation:
"The model predicts 15.6% failure probability (ALERT triggered). Primary risk factors: high torque (40 Nm) and elevated tool wear (100 min). Normal rotational speed slightly reduces risk."
Get SHAP waterfall plot:

```bash
curl -X POST http://localhost:8000/explain/plot \
  -H "Content-Type: application/json" \
  -d '{"Air temperature [K]": 300, ...}' \
  -o shap_waterfall.png
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Contributing

Contributions are welcome! Please:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Project Author: @foxymadeit
Project Link: https://github.com/foxymadeit/predictive-maintenance-dockerized-api