Bridge Health Level Classification using Explainable Boosting Machine
An AI-powered bridge health classification system that automatically categorizes bridge inspection reports into health levels using machine learning. The system uses an Explainable Boosting Machine (EBM), which reaches 91.88% test accuracy while keeping its predictions interpretable.
- EBM 25x speedup: 25 min → 63.8 s (16-way parallel processing)
- 91.88% Test Accuracy - practical level reached
- 86.20% F1-macro score (+19.8pt improvement from v0.2)
- Full-data training: 8,615 records processed (31x more data used)
- Repair-requirement F1: 80% - sufficient for practical use
| Version | Data Size | Test Accuracy | F1-Macro | Key Innovation |
|---|---|---|---|---|
| v0.3 | 8,615 records | 91.88% | 86.20% | 16-way parallel speedup + full data |
| v0.2 | 276 records | 84.34% | 66.40% | MVP on aggregated data |
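The ratios and deltas quoted in this README can be reproduced from the figures in the table. The following is a minimal arithmetic check, assuming the 25-minute EBM baseline is exact (the source gives it only to the minute); the variable names are illustrative and not part of the project code.

```python
# Figures quoted in this README (v0.2 vs v0.3).
v02 = {"records": 276, "accuracy": 84.34, "f1_macro": 66.40}
v03 = {"records": 8615, "accuracy": 91.88, "f1_macro": 86.20}

# Assumption: the EBM run before parallelization took exactly 25 minutes.
baseline_seconds = 25 * 60
parallel_seconds = 63.8

accuracy_gain = round(v03["accuracy"] - v02["accuracy"], 2)  # percentage points
f1_gain = round(v03["f1_macro"] - v02["f1_macro"], 2)        # percentage points
data_ratio = round(v03["records"] / v02["records"], 1)
speedup = round(baseline_seconds / parallel_seconds, 1)

print(f"accuracy: +{accuracy_gain:.2f}pt, F1-macro: +{f1_gain:.2f}pt")
print(f"data: {data_ratio}x, EBM speedup: {speedup}x")
```

Under that assumption the speedup comes out near 23.5x, so the quoted 25x is best read as a rounded figure.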
# Clone the repository
git clone https://github.com/YOUR_USERNAME/health-ebm-classification.git
cd health-ebm-classification
# Set up virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1 # Windows
# source .venv/bin/activate # macOS/Linux
# Install dependencies
pip install -r requirements.txt
# Run the complete pipeline
cd src
python main_pipeline.py

| Model | Training Time | Val F1-Macro | Test Accuracy | Notes |
|---|---|---|---|---|
| 🥇 EBM | 63.80 s | 85.34% | 91.88% | Highest accuracy, after speedup |
| 🥈 XGBoost Enhanced | 5.42 s | 82.91% | 89.12% | Fast and accurate |
| 🥉 CatBoost | 28.12 s | 79.44% | 87.56% | Balanced |
| LightGBM | 2.23 s | 76.38% | 85.23% | Very fast |
| Random Forest | 0.42 s | 71.64% | 82.34% | Fastest |
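Training times like those in the table could be collected with a small timing helper. This is a sketch only: `track_time`, `format_report`, and the `training_times` registry are hypothetical names, not functions from this repository, and the workload below is a stand-in for a real `fit` call.

```python
import time
from contextlib import contextmanager

# Hypothetical registry: model name -> elapsed training seconds.
training_times = {}


@contextmanager
def track_time(model_name, registry=training_times):
    """Record wall-clock seconds spent inside the `with` block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        registry[model_name] = time.perf_counter() - start


def format_report(registry):
    """Return table rows sorted from fastest to slowest."""
    rows = sorted(registry.items(), key=lambda item: item[1])
    return [f"| {name} | {seconds:.2f} s |" for name, seconds in rows]


# Stand-in for model.fit(X, y); replace with a real training call.
with track_time("dummy_model"):
    sum(i * i for i in range(100_000))

for row in format_report(training_times):
    print(row)
```

`time.perf_counter()` measures wall-clock time, so the recorded value reflects the benefit of parallel training rather than total CPU time across workers.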
- EBM speedup: 25 min → 63.8 s (about 25x faster)
- Accuracy: Test Accuracy 84.34% → 91.88% (+7.54pt)
- F1-macro: 66.40% → 86.20% (+19.80pt)
- Practicality: Repair-requirement F1 25% → 80% (+55pt)
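Accuracy and F1-macro move differently because macro averaging weights every class equally, so a weak minority class pulls the score down even when overall accuracy is high. The sketch below computes both on toy labels (not project results); `per_class_f1` and `macro_f1` are illustrative helpers, and unlike some library implementations they only score labels that occur in `y_true`.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for each label in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy data: 8 majority-class samples and 2 minority-class samples.
y_true = ["II"] * 8 + ["III+"] * 2
y_pred = ["II"] * 8 + ["II", "III+"]  # one minority sample is missed

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

Here a single missed minority sample leaves accuracy at 0.90 while macro F1 drops to about 0.80, which is the same kind of gap seen between the accuracy and F1-macro figures above.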
Raw CSV Data → Preprocessing → Feature Engineering → Model Training → Evaluation
     ↓              ↓                 ↓                  ↓              ↓
9,753 records → 8,615 samples → 1,019 features → 7 models → Best: EBM (91.88%)
(31x data)      (fully used)     (16-way parallel)  (practical level)
- Level Ⅰ (Healthy): 1,404 samples (16.3%)
- Level Ⅱ (Preventive): 6,332 samples (73.5%)
- Repair-required (III+): 879 samples (10.2%) - a 44x increase
- ⚡ 16-way parallel processing: speeds up EBM training about 25x (25 min → 63.8 s)
- Full-data training: processes 8,615 individual records
- 🎯 Execution-time tracking: performance monitoring for all models
- 🔧 Tuned parameters: interactions=10, max_bins=64
- 7 Advanced Models, including EBM, LightGBM, CatBoost, XGBoost, and Random Forest
- Class Imbalance Handling: Strategic class consolidation and weighting
- Cross-validation: 5-fold CV for robust evaluation
- Interpretable AI: EBM provides feature importance and decision explanations
- Japanese NLP: Janome morphological analysis
- TF-IDF Vectorization: 1,000-dimensional text features
- Domain Keywords: Bridge-specific terminology extraction
- Multi-modal Features: Text + numerical + categorical data
- Automated Pipeline: End-to-end ML workflow
- Error Handling: Robust processing with fallback strategies
- Modular Design: Easily extensible components
- Performance Monitoring: Detailed metrics and reporting
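The tuned parameters and 5-fold cross-validation mentioned above might be wired together roughly as follows. This is a sketch under stated assumptions, not the repository's `model_trainer.py`: it assumes "16-way parallel" corresponds to `n_jobs=16`, that the folds are stratified, and that a fixed seed is used. `build_ebm` and `cross_validate_ebm` are hypothetical helper names.

```python
# Hyperparameters quoted in the feature list (interactions, max_bins).
EBM_PARAMS = {
    "interactions": 10,  # number of pairwise interaction terms to detect
    "max_bins": 64,      # maximum bins per feature for the main effects
    "n_jobs": 16,        # assumption: "16-way parallel" maps to n_jobs
    "random_state": 42,  # assumption: fixed seed for reproducibility
}


def build_ebm(params=EBM_PARAMS):
    """Create an unfitted EBM classifier (requires `pip install interpret`)."""
    # Imported lazily so this module loads even without interpret installed.
    from interpret.glassbox import ExplainableBoostingClassifier

    return ExplainableBoostingClassifier(**params)


def cross_validate_ebm(X, y, n_splits=5):
    """Return per-fold macro-F1 scores from stratified k-fold CV."""
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    # Assumption: stratified folds, to keep the rare repair class in each fold.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    return cross_val_score(build_ebm(), X, y, cv=cv, scoring="f1_macro")
```

After fitting, interpret's `explain_global()` method on the classifier is one way to inspect the feature importances referred to in the list.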
health-ebm-classification/
├── src/
│   ├── main_pipeline.py # Main execution pipeline (v0.3 compatible)
│   ├── data_loader.py # Data loading + full-data mode
│   ├── feature_engineering.py # Feature extraction and engineering
│   └── model_trainer.py # Model training + 16-way parallel processing
├── 1_inspection-dataset/ # Bridge inspection data
├── docs/
│   ├── README_v0-3.md # v0.3 results summary
│   ├── README_v0-2.md # v0.2 technical documentation
│   └── QUICK_GUIDE.md # 5-minute start guide
├── requirements.txt # Python dependencies
├── .gitignore # Git ignore rules
└── README.md # This file
- Automated Inspection: Reduce manual assessment time
- Risk Prioritization: Identify bridges requiring immediate attention
- Maintenance Planning: Data-driven repair scheduling
- Quality Assurance: Consistent evaluation standards
- Explainable Predictions: Understand why a bridge needs repair
- Confidence Scoring: Reliability indicators for each prediction
- Comparative Analysis: Benchmark against historical data
- Expert Validation: AI recommendations with human oversight
- Python 3.11+
- 8GB+ RAM (for full-data processing)
- Multi-core CPU recommended (for 16-way parallel processing)
- Bridge inspection CSV data
pip install pandas numpy scikit-learn
pip install lightgbm xgboost catboost
pip install interpret janome # For EBM and Japanese text

Your CSV files should contain:
- BridgeID: Unique bridge identifier
- HealthLevel: Current health assessment (Ⅰ, Ⅱ, Ⅲ, Ⅳ, Ⅴ)
- Diagnosis: Inspection text description
- DamageComment: Detailed damage observations
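A quick way to check that a CSV file matches this layout is sketched below. It also maps the five health levels onto the three classes reported in this README (Healthy, Preventive, Repair-required), assuming that "III+" means levels Ⅲ, Ⅳ, and Ⅴ are merged. `load_inspections`, the `target` key, and the sample rows are illustrative and not taken from the project's `data_loader.py`.

```python
import csv
import io
from collections import Counter

REQUIRED_COLUMNS = {"BridgeID", "HealthLevel", "Diagnosis", "DamageComment"}

# Assumption: levels Ⅲ, Ⅳ and Ⅴ are merged into one "Repair-required" class.
CONSOLIDATED = {
    "Ⅰ": "Healthy",
    "Ⅱ": "Preventive",
    "Ⅲ": "Repair-required",
    "Ⅳ": "Repair-required",
    "Ⅴ": "Repair-required",
}


def load_inspections(handle):
    """Read rows from an open CSV file, checking columns and level values."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for row in reader:
        level = row["HealthLevel"].strip()
        if level not in CONSOLIDATED:
            raise ValueError(
                f"line {reader.line_num}: unknown HealthLevel {level!r}"
            )
        row["target"] = CONSOLIDATED[level]  # consolidated class label
        rows.append(row)
    return rows


# Tiny made-up sample, for illustration only.
sample = io.StringIO(
    "BridgeID,HealthLevel,Diagnosis,DamageComment\n"
    "B001,Ⅰ,no notable damage,none\n"
    "B002,Ⅲ,corrosion on main girder,section loss observed\n"
    "B003,Ⅳ,deck cracking,rebar exposed\n"
)
rows = load_inspections(sample)
print(Counter(row["target"] for row in rows))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` (adjusting the encoding to match the data) and pass the handle to `load_inspections`.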
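- 25x speedup: EBM training time 25 min → 63.8 s
- 16-way parallel processing: implemented to make full use of the CPU
- Full-data training: supports processing 8,615 records
- Practical level reached: Test Accuracy 91.88%
- Comprehensive documentation: README_v0-3.md created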
- REST API implementation
- Real-time prediction endpoint
- Enhanced interpretability dashboard
- Web application interface
- Automated model retraining
- Integration with inspection databases
- Multi-language support
- Image analysis integration
- Geographic factor modeling
- Predictive maintenance forecasting
- Mobile app development
We welcome contributions! Please see our Contributing Guidelines for details.
# Clone with development dependencies
git clone https://github.com/YOUR_USERNAME/health-ebm-classification.git
cd health-ebm-classification
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
python -m pytest tests/
# Run linting
flake8 src/
black src/

This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- Documentation: See the docs/ folder for detailed guides
- Questions: Create a discussion in the repository
- Microsoft InterpretML: For the Explainable Boosting Machine implementation
- Japanese NLP Community: For Janome morphological analyzer
- Bridge Engineering Domain Experts: For validation and insights
Built with ❤️ for safer infrastructure
Last Updated: October 3, 2025 - v0.3 Release