Bias Drift Guardian is a production-ready dashboard and API for real-time bias and drift detection in machine learning models. Built with Streamlit and FastAPI, it supports EEOC and EU AI Act compliance monitoring and explains root causes with SHAP.
Key Highlights:
- ⚡ Deploy in 30 seconds - Standalone Streamlit dashboard with pre-loaded demo
- 📚 Multi-Dataset Support - Switch instantly between German Credit, Adult Census, and COMPAS Recidivism
- 🎯 Unique Feature - Intersectional bias detection (not available in standard tools)
- 📊 Comprehensive Monitoring - Drift detection (PSI, KS, Chi-square) + Fairness metrics
- 🔍 Root Cause Analysis - SHAP-based explanations for model behavior changes
- 🚀 Production-Ready - FastAPI backend with persistence, Docker support, live deployment
- 🛡️ Robust Safety Net - Auto-fallback mock data generation (never crashes on missing files)
- 📚 Actively Maintained - 2,800+ lines of documentation, open-source (MIT), updated December 2025
🎯 Perfect for: ML Engineers • Data Scientists • Compliance Teams • AI Ethics Researchers
- 🌟 Why Bias Drift Guardian?
- ✨ Key Features
- 🚀 Quick Start
- 📊 Demo & Screenshots
- 🏗️ Architecture
- 💼 Use Cases
- 📚 Documentation
- 🛠️ Installation
- 🎓 Usage Examples
- 🌐 API Reference
- 🤝 Contributing
- 📄 License
- 📞 Contact
The Problem:
- 🚨 80% of AI models experience performance degradation in production due to data drift
- ⚖️ $1M+ lawsuits from algorithmic discrimination are becoming common
- 🔍 Hidden bias in intersectional groups (e.g., "Female employees 50+") goes undetected by standard tools
The Solution: Bias Drift Guardian is a production-ready monitoring system that combines:
- ✅ Data Drift Detection - Catch distribution shifts before they break your model
- ✅ Fairness Analysis - Ensure compliance with EEOC and EU AI Act
- ✅ Intersectional Bias Detection - Unique feature that catches compound discrimination
- ✅ Root Cause Analysis - SHAP-based explanations for why drift is happening
- ✅ Counterfactual "What-If" Analysis - Generate actionable, minimal changes to flip model predictions (e.g., "Increase income by $5k to get approved")
What makes us different: Most fairness tools only check one attribute at a time (gender OR age). We detect compound bias affecting specific subgroups.
Example:
Standard Analysis: "No gender bias" ✅ (Male: 70%, Female: 68%)
Our Analysis: "Female employees aged 50+ have only 38% approval rate!" ❌
Why it matters:
- 📋 EEOC compliance requirement
- 💼 Prevents discrimination lawsuits
- 🎓 Not available in standard Fairlearn
- PSI (Population Stability Index) - Industry standard for numerical features
- KS Test - Statistical distribution comparison
- Chi-square Test - Categorical feature drift
Thresholds:
- PSI < 0.1: ✅ No drift
- PSI 0.1-0.25: ⚠️ Monitor closely
- PSI > 0.25: ❌ Action required
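For reference, PSI can be computed roughly as follows. This is a minimal sketch, not the project's exact implementation; the thresholds above are applied to its output.

```python
import numpy as np

def population_stability_index(baseline, current, bins=10):
    """Rough PSI: compare binned distributions of baseline vs. current data."""
    # Bin edges from baseline quantiles so each baseline bin holds ~1/bins of the data
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range production values

    expected = np.histogram(baseline, bins=edges)[0] / len(baseline)
    actual = np.histogram(current, bins=edges)[0] / len(current)

    # Avoid division by zero / log(0) with a small floor
    expected = np.clip(expected, 1e-6, None)
    actual = np.clip(actual, 1e-6, None)
    return float(np.sum((actual - expected) * np.log(actual / expected)))

rng = np.random.default_rng(0)
baseline = rng.normal(35, 10, 5_000)   # e.g. baseline "age"
drifted = rng.normal(42, 10, 5_000)    # production data shifted upward
print(f"PSI: {population_stability_index(baseline, drifted):.3f}")  # > 0.25 -> action required
```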
SHAP-based feature importance drift detection:
Root Cause Analysis:
- age: Importance increased by 0.0847 (0.1234 → 0.2081)
- credit_amount: Decreased by 0.0423
Recommendation: Investigate data distribution changes
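A report like the one above can be approximated by comparing mean absolute SHAP values between a baseline window and the current window. The sketch below trains a synthetic tree model purely for illustration; it is not the project's actual code, and the exact return shape of `shap_values` depends on the SHAP version (both common cases are handled).

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier

def shap_importance(model, X):
    """Mean |SHAP value| per feature as a simple importance score."""
    values = shap.TreeExplainer(model).shap_values(X)
    if isinstance(values, list):      # older SHAP versions: one array per class
        values = values[1]
    elif values.ndim == 3:            # newer versions: (samples, features, classes)
        values = values[..., 1]
    return np.abs(values).mean(axis=0)

rng = np.random.default_rng(0)
X_baseline = pd.DataFrame({"age": rng.normal(35, 10, 500),
                           "credit_amount": rng.normal(5000, 1500, 500)})
y = ((X_baseline["age"] < 40) & (X_baseline["credit_amount"] < 5500)).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_baseline, y)

# A production window in which "age" has drifted upward by ten years
X_current = X_baseline.assign(age=X_baseline["age"] + 10)

for feat, before, after in zip(X_baseline.columns,
                               shap_importance(model, X_baseline),
                               shap_importance(model, X_current)):
    print(f"{feat}: importance {before:.4f} -> {after:.4f} ({after - before:+.4f})")
```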
Go beyond "Why?" to "How to fix it?"
- Actionable Insights: "If this applicant increases savings by 10%, they would be approved."
- Constraint-Aware: Respects real-world constraints (e.g., Age cannot decrease, Race is immutable).
- L0/L1 Optimization: Suggests the fewest possible changes to achieve the desired outcome.
- EEOC Compliance: Includes a sticky disclaimer and "Rejected Plans" toggle for full auditability.
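One way to generate such counterfactuals is with the DiCE library credited in the acknowledgements. The snippet below is an illustrative sketch with placeholder data and feature names, not necessarily how this project implements the feature; immutable attributes are held fixed via `features_to_vary`.

```python
import dice_ml
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy training data (placeholder columns)
train_df = pd.DataFrame({
    "age": [25, 40, 55, 35, 60, 45],
    "income": [30_000, 55_000, 42_000, 70_000, 38_000, 65_000],
    "savings": [1_000, 8_000, 2_500, 12_000, 1_500, 9_000],
    "approved": [0, 1, 0, 1, 0, 1],
})
model = RandomForestClassifier(random_state=0).fit(
    train_df.drop(columns="approved"), train_df["approved"]
)

data = dice_ml.Data(dataframe=train_df,
                    continuous_features=["age", "income", "savings"],
                    outcome_name="approved")
dice = dice_ml.Dice(data, dice_ml.Model(model=model, backend="sklearn"), method="random")

# Rejected applicant: ask for minimal changes while keeping immutable features fixed
query = train_df.drop(columns="approved").iloc[[0]]
cfs = dice.generate_counterfactuals(query, total_CFs=2, desired_class="opposite",
                                    features_to_vary=["income", "savings"])
cfs.visualize_as_dataframe(show_only_changes=True)
```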
Educational tool to visualize how distribution shifts affect model performance in real-time.
- Confusion Matrix visualization
- Accuracy, Precision, Recall, F1-Score
- Error breakdown and actionable insights
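The effect the simulator visualizes can be reproduced offline: shift one feature, confirm the shift with a KS test, and recompute the metrics listed above. A self-contained sketch on synthetic data (the approval rule and feature names are invented for illustration):

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

rng = np.random.default_rng(42)

def make_data(n, age_mean):
    """Synthetic applicants; approval depends on age and credit amount."""
    age = rng.normal(age_mean, 10, n)
    credit = rng.normal(5000, 1500, n)
    y = ((age < 40) & (credit < 6000)).astype(int)   # toy ground-truth rule
    return np.column_stack([age, credit]), y

X_train, y_train = make_data(2_000, age_mean=35)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Simulate drift: production applicants are ten years older on average
X_prod, y_prod = make_data(1_000, age_mean=45)
stat, p_value = ks_2samp(X_train[:, 0], X_prod[:, 0])
print(f"KS test on 'age': statistic={stat:.3f}, p={p_value:.2e}")

# Compare how the metrics respond on the baseline vs. drifted window
for name, (X, y) in {"baseline": (X_train, y_train), "drifted": (X_prod, y_prod)}.items():
    y_pred = model.predict(X)
    print(f"[{name}] accuracy={accuracy_score(y, y_pred):.2%} "
          f"f1={f1_score(y, y_pred):.2%}\n{confusion_matrix(y, y_pred)}")
```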
- Multi-Dataset Selector: Switch context instantly regardless of current state.
- 💳 German Credit: Financial compliance demo
- 👔 Adult Income: Census-based hiring fairness
- ⚖️ COMPAS: Criminal justice recidivism (with audit logging)
- Robust Loader: "Missing File" protection with auto-generated mock data for rock-solid demos.
- Promise-First Onboarding: "Bias Gap in 15s" frictionless startup flow.
Perfect for demos and portfolio showcases.
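The fallback behaviour of the robust loader can be approximated with a pattern like the one below. The file path and column names are placeholders, not the project's actual layout.

```python
from pathlib import Path
import numpy as np
import pandas as pd

def load_dataset(path="data/german_credit.csv", n_mock=1_000, seed=0):
    """Load the selected dataset, or fall back to generated mock data if the file is missing."""
    file = Path(path)
    if file.exists():
        return pd.read_csv(file)

    # Fallback: synthesize a plausible demo dataset so the dashboard never crashes
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "age": rng.integers(19, 75, n_mock),
        "credit_amount": rng.normal(3_000, 1_200, n_mock).round(0),
        "sex": rng.choice(["Male", "Female"], n_mock),
        "approved": rng.choice([0, 1], n_mock, p=[0.3, 0.7]),
    })

df = load_dataset()   # works with or without the CSV on disk
print(df.head())
```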
# Clone repository
git clone https://github.com/ImdataScientistSachin/Bias-Drift-Detector.git
cd Bias-Drift-Detector
# Install dependencies
pip install -r requirements.txt
# Run dashboard
streamlit run dashboard/app.py
Access: http://localhost:8501
For production deployment with API backend.
# Install all dependencies
pip install -r requirements-full.txt
# Terminal 1: Start API
uvicorn api.main:app --reload
# Terminal 2: Start Dashboard
streamlit run dashboard/app.py
Access:
- 📊 Dashboard: http://localhost:8501
- 🌐 API: http://localhost:8000
- 📖 API Docs: http://localhost:8000/docs
docker-compose up -d
Try it now: https://bias-drift-guardian.streamlit.app/
📊 Dashboard Overview
Top Metrics Cards
- Total Predictions: 150
- Fairness Score: 60/100
- Drift Alerts: 4
- Average Drift Score: 0.18
🌊 Interactive Drift Simulation
Visualize how data distribution changes affect your model with real-time KS-test calculations.
🎯 Intersectional Bias Analysis
Worst-Performing Groups:
- Female_50+ → 38% approval (Disparity: 0.48 ❌)
- Female_40-50 → 52% approval (Disparity: 0.65 ⚠️)
- Male_50+ → 58% approval (Disparity: 0.73 ⚠️)
┌─────────────────────────────────────────────────────────────┐
│ BIAS DRIFT GUARDIAN │
└─────────────────────────────────────────────────────────────┘
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ STREAMLIT │ │ FASTAPI │ │ CORE ENGINE │
│ DASHBOARD │────▶│ API │────▶│ (Analytics) │
│ │ │ │ │ │
│ • Visualizations │ │ • REST Endpoints │ │ • Drift Detector │
│ • Metrics Cards │ │ • Persistence │ │ • Bias Analyzer │
│ • Simulations │ │ • Background │ │ • Intersectional │
│ │ │ Tasks │ │ • Root Cause │
└──────────────────┘ └──────────────────┘ └──────────────────┘
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | Streamlit | Interactive dashboard |
| Backend | FastAPI | REST API with async support |
| Analytics | Fairlearn, SHAP | Fairness metrics & explainability |
| Data | Pandas, NumPy | Data processing |
| Visualization | Plotly, Seaborn | Interactive charts |
| Statistics | SciPy, Scikit-learn | Statistical tests |
Scenario: Credit scoring model monitoring
- Monitor for age/gender bias in loan approvals
- Detect drift in applicant demographics
- EEOC compliance reporting
- Prevent discrimination lawsuits
Scenario: Hiring algorithm fairness
- Intersectional bias detection (race × gender × age)
- Resume screening fairness analysis
- Legal risk mitigation
- Diversity & inclusion metrics
Scenario: Treatment recommendation systems
- Ensure equal treatment across demographics
- Monitor for patient population changes
- Regulatory compliance (HIPAA, GDPR)
- Ethical AI deployment
Scenario: Recommendation systems
- Prevent filter bubbles
- Ensure fair product exposure
- Monitor for seasonal drift
- A/B testing fairness
- Documentation Index - Start here for navigation
- Comprehensive Analysis - Technical deep dive (30 min read)
- Quick Reference - Code examples & API docs (10 min read)
- Analysis Summary - Executive overview (5 min read)
- Dashboard Guide - UI/UX documentation
- Deployment Guide - Production deployment
- Fixes & Improvements - Change log
- Cleanup Plan - Project structure
- Grade: ⭐⭐⭐⭐½ (4.5/5) - See Analysis Summary
- Python 3.9 or higher
- pip package manager
- Git
Dashboard Only (~30MB):
pip install -r requirements.txt
Full Stack (~150MB):
pip install -r requirements-full.txt
Key Packages:
- `streamlit` - Dashboard framework
- `fastapi` - API framework (full stack only)
- `fairlearn` - Fairness metrics (full stack only)
- `shap` - Explainability (full stack only)
- `pandas`, `numpy` - Data processing
- `plotly`, `seaborn` - Visualizations
- `scipy`, `scikit-learn` - Statistical tests
from core.drift_detector import DriftDetector
import pandas as pd
# Initialize detector with baseline data
detector = DriftDetector(
baseline_data=train_df,
numerical_features=['age', 'credit_amount', 'duration'],
categorical_features=['job', 'housing', 'purpose']
)
# Detect drift in production data
drift_results = detector.detect_feature_drift(production_df)
# Check for alerts
alerts = drift_results[drift_results['alert'] == True]
print(f"Drift detected in {len(alerts)} features:")
for _, row in alerts.iterrows():
print(f" - {row['feature']}: PSI={row['psi']:.3f}")from core.bias_analyzer import BiasAnalyzer
# Initialize analyzer
analyzer = BiasAnalyzer(sensitive_attrs=['Sex', 'Age_Group'])
# Calculate fairness metrics
metrics = analyzer.calculate_bias_metrics(
y_true=true_labels,
y_pred=predictions,
sensitive_features=sensitive_df
)
# Check fairness score
print(f"Fairness Score: {metrics['fairness_score']}/100")
# Check disparate impact
for attr in ['Sex', 'Age_Group']:
    di = metrics[attr]['disparate_impact']
    status = "✅ PASS" if di >= 0.8 else "❌ FAIL"
    print(f"{attr} Disparate Impact: {di:.3f} {status}")
from core.intersectional_analyzer import IntersectionalAnalyzer
# Initialize analyzer
analyzer = IntersectionalAnalyzer(
sensitive_attrs=['Sex', 'Age_Group', 'Race']
)
# Analyze intersectional bias
results = analyzer.analyze_intersectional_bias(
y_pred=predictions,
sensitive_features=sensitive_df,
min_group_size=10
)
# Get worst-performing groups
leaderboard = analyzer.get_intersectional_leaderboard(
y_pred=predictions,
sensitive_features=sensitive_df
)
print("Worst-Performing Groups:")
for group in leaderboard[:5]:
print(f" {group['group']}: {group['selection_rate']:.1%} "
f"(Disparity: {group['disparity_ratio']:.2f})")import requests
# Register model
response = requests.post("http://localhost:8000/api/v1/models/register", json={
"model_id": "credit_model_v1",
"numerical_features": ["age", "credit_amount"],
"categorical_features": ["job", "housing"],
"sensitive_attributes": ["Sex", "Age_Group"],
"baseline_data": baseline_records
})
# Log prediction
requests.post("http://localhost:8000/api/v1/predictions/log", json={
"model_id": "credit_model_v1",
"features": {"age": 35, "credit_amount": 5000, "job": "skilled"},
"prediction": 1,
"sensitive_features": {"Sex": "Female", "Age_Group": "30-40"}
})
# Get metrics
metrics = requests.get("http://localhost:8000/api/v1/metrics/credit_model_v1").json()
print(f"Drift Alerts: {len([d for d in metrics['drift_analysis'] if d['alert']])}")
print(f"Fairness Score: {metrics['bias_analysis']['fairness_score']}")We provide 3 ready-to-run examples demonstrating real-world use cases:
Use Case: Credit risk analysis with fairness monitoring
python examples/german_credit_demo.py
What it does:
- Loads German Credit dataset (1,000 samples)
- Trains RandomForest classifier
- Registers model with API
- Simulates drift by shifting age distribution
- Logs 150 predictions
- Analyzes drift and bias
Expected Output:
Drift Alerts: 2 (age, savings_status)
Fairness Score: 60/100
Sex Disparate Impact: 0.75 ❌ FAIL
File: examples/german_credit_demo.py
Use Case: Income prediction fairness analysis
python examples/adult_demo.py
What it does:
- Analyzes Adult Census dataset
- Detects intersectional bias (race × gender × age)
- Monitors for demographic drift
File: examples/adult_demo.py
Use Case: API integration example
python examples/live_demo_client.py
What it does:
- Demonstrates API endpoints
- Shows model registration
- Logs predictions
- Retrieves metrics
File: examples/live_demo_client.py
💡 Tip: Start with german_credit_demo.py for a complete end-to-end example!
POST /api/v1/models/register
Request Body:
{
"model_id": "my_model_v1",
"numerical_features": ["age", "income"],
"categorical_features": ["job", "education"],
"sensitive_attributes": ["Sex", "Race"],
"baseline_data": [...]
}
POST /api/v1/predictions/log
Request Body:
{
"model_id": "my_model_v1",
"features": {"age": 35, "income": 50000},
"prediction": 1,
"true_label": 1,
"sensitive_features": {"Sex": "Female", "Race": "Asian"}
}
GET /api/v1/metrics/{model_id}
Response:
{
"model_id": "my_model_v1",
"total_predictions": 150,
"drift_analysis": [...],
"bias_analysis": {...},
"root_cause_report": "..."
}
GET /api/v1/models
GET /api/v1/health
Full API Documentation: http://localhost:8000/docs (when API is running)
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch
git checkout -b feature/AmazingFeature
- Commit your changes
git commit -m 'Add AmazingFeature'
- Push to the branch
git push origin feature/AmazingFeature
- Open a Pull Request
# Clone your fork
git clone https://github.com/YOUR_USERNAME/Bias-Drift-Detector.git
cd Bias-Drift-Detector
# Install dev dependencies
pip install -r requirements-full.txt
pip install pytest pytest-asyncio black flake8
# Run tests (when available)
pytest tests/
# Format code
black .
# Lint code
flake8 .
- Follow PEP 8
- Use Black for formatting
- Add docstrings to all functions
- Include type hints where possible
Click to see all 9 completed features
- ✅ Core drift detection (PSI, KS, Chi-square)
- ✅ Fairness analysis (Disparate Impact, Demographic Parity, Equalized Odds)
- ✅ Intersectional bias detection ⭐ Unique feature
- ✅ Root cause analysis (SHAP-based explanations)
- ✅ Standalone Streamlit demo (works without backend)
- ✅ FastAPI backend with persistence layer
- ✅ Comprehensive documentation (2,800+ lines)
- ✅ Docker support (docker-compose ready)
- ✅ Streamlit Cloud deployment (live demo available)
- ⏳ Unit tests (target: 80% coverage)
- ⏳ CI/CD pipeline (GitHub Actions)
- ⏳ Performance optimizations
- 📅 Time-series drift tracking
- 📧 Automated alerting (email/Slack)
- 📊 Model comparison features
- 🗄️ PostgreSQL integration
- ⚡ Redis caching
- ☸️ Kubernetes deployment guide
- 🌍 Multi-language support
- 🔌 Custom metric plugins
Can I deploy this commercially?
Yes! This project is MIT licensed. You can use it commercially with attribution. Perfect for:
- Enterprise ML monitoring
- SaaS products
- Consulting projects
- Internal tools
Is this GDPR compliant?
The system doesn't store sensitive data by default. For GDPR compliance:
- ✅ Data is processed in-memory
- ✅ No PII stored without configuration
- ⚠️ Implement data anonymization for production
- ⚠️ Configure retention policies as needed
Can I use this with my own dataset?
Yes! The system is model-agnostic. Just register your model with baseline data and start logging predictions.
What models are supported?
Any scikit-learn compatible model. For SHAP analysis: RandomForest, XGBoost, LightGBM, and Linear models work best.
How much data do I need?
Minimum 500 samples for baseline. For production monitoring, analyze every 100-1000 predictions depending on risk level.
How does intersectional analysis differ from standard bias analysis?
Standard analysis checks one attribute at a time (e.g., gender OR age). Intersectional analysis checks combinations (e.g., "Female employees aged 50+"), catching compound discrimination that single-attribute analysis misses.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this project in your research or work, please cite:
@software{bias_drift_guardian,
author = {Sachin Paunikar},
title = {Bias Drift Guardian: Real-time AI Fairness and Data Drift Monitoring},
year = {2025},
url = {https://github.com/ImdataScientistSachin/Bias-Drift-Detector}
}
- Fairlearn - Microsoft's fairness toolkit
- DiCE - Diverse Counterfactual Explanations
- SHAP - Lundberg & Lee's explainability framework
- Streamlit - Amazing dashboard framework
- FastAPI - Modern Python web framework
- UCI ML Repository - German Credit & Adult Census datasets
This project was inspired by the need for accessible, production-ready fairness monitoring tools in the ML community and the growing importance of ethical AI deployment.
Sachin Paunikar
- 📧 Email: ImdataScientistSachin@gmail.com
- 💼 LinkedIn: linkedin.com/in/sachin-paunikar-datascientists
- 🐙 GitHub: @ImdataScientistSachin
If you find this project useful, please consider giving it a star! It helps others discover the project.
Made with ❤️ for Ethical AI
