Intelligent Vulnerability Triage Tool - MVP

An AI-powered tool for prioritizing security vulnerabilities and reducing alert fatigue in DevSecOps workflows.

Quick Start

Option 1: Automated Start (Recommended)

Run the entire application with a single command:

./start.sh

This script will:

  • Install Python dependencies
  • Start Redis (if available)
  • Initialize database
  • Create test user (admin / Str0ngP@ssw0rd!_f0r_D3m0)
  • Start Flask application at http://localhost:5000

Test Credentials:

  • Username: admin
  • Password: Str0ngP@ssw0rd!_f0r_D3m0

Option 2: Manual Start

# 1. Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Start Redis (optional but recommended for caching)
redis-server  # Or: sudo systemctl start redis

# 4. Initialize database and create test user
flask --app app:create_app create-test-user

# 5. Start the application
flask --app app:create_app run --debug
# Or for production: gunicorn -w 4 -b 0.0.0.0:5000 'app:create_app()'

Application will be available at http://localhost:5000.

What's Loaded on Startup

flowchart TD
    %%{init: {'theme': 'base', 'themeVariables': {'background': '#ffffff', 'mainBkg': '#ffffff', 'primaryTextColor': '#000000', 'lineColor': '#000000', 'textColor': '#000000'}}}%%
    START(["🚀 Application Start"]) --> DEPS["📦 Load Dependencies<br/>Flask, Redis, SQLAlchemy"]
    DEPS --> DB["🗄️ Initialize Database<br/>Create tables"]
    DB --> MODELS["🤖 Load ML Models<br/>~5 seconds"]

    subgraph models [" Model Loading "]
        direction LR
        NB["⚡ Naive Bayes<br/>50ms | 80KB"]
        BERT["🧠 BERT Model<br/>2-5s | 418MB"]
    end

    MODELS --> models
    models --> READY(["✅ Application Ready<br/>Listening on :5000"])

    style START fill:#E6F3FF,stroke:#000000,stroke-width:2px,color:#000000
    style READY fill:#E6FFE6,stroke:#000000,stroke-width:2px,color:#000000
    style NB fill:#FFFDE7,stroke:#000000,stroke-width:2px,color:#000000
    style BERT fill:#E6F3FF,stroke:#000000,stroke-width:2px,color:#000000
    style DEPS fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style DB fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style MODELS fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style models fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
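
The diagram above boils down to: load the small Naive Bayes pickle first, then the much larger BERT weights. A simplified sketch of that eager-loading pattern is shown below; class, path, and config names are illustrative assumptions, not the repository's exact code.

import pickle
from pathlib import Path

from flask import Flask


class ModelContainer:
    """Illustrative holder for both classifiers (not the repo's actual class)."""

    def __init__(self, model_dir: str = "models"):
        model_dir = Path(model_dir)
        # Naive Bayes: small scikit-learn pipeline, loads in ~50 ms.
        with open(model_dir / "naive_bayes_model.pkl", "rb") as fh:
            self.naive_bayes = pickle.load(fh)
        # BERT: ~418 MB of weights, dominates the ~5 s startup time.
        from transformers import AutoModelForSequenceClassification, AutoTokenizer
        bert_dir = str(model_dir / "bert_model")
        self.bert_tokenizer = AutoTokenizer.from_pretrained(bert_dir)
        self.bert_model = AutoModelForSequenceClassification.from_pretrained(bert_dir)


def create_app() -> Flask:
    app = Flask(__name__)
    # Loading eagerly keeps request latency predictable at the cost of the
    # one-time startup delay shown in the flowchart.
    app.config["MODEL_CONTAINER"] = ModelContainer()
    return app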

Overview

This project addresses the "alert fatigue" problem in software security, where scanners generate thousands of alerts (many false positives or low priority). Our solution uses machine learning to intelligently triage and prioritize vulnerabilities, allowing teams to focus on critical risks.

System Architecture

graph TB
    %%{init: {'theme': 'base', 'themeVariables': {'background': '#ffffff', 'mainBkg': '#ffffff', 'primaryTextColor': '#000000', 'lineColor': '#000000', 'textColor': '#000000', 'clusterBkg': '#FFFFFF', 'clusterBorder': '#000000'}}}%%
    subgraph clients [" 🖥️ Client Layer "]
        direction LR
        API["📡 API Client<br/>REST/JSON"]
        WEB["🌐 Web Browser<br/>HTML/Forms"]
    end

    subgraph application [" ⚙️ Application Layer "]
        FLASK["Flask Application"]
        AUTH["🔐 Authentication<br/>Flask-Login + API Keys"]
        LIMITER["⏱️ Rate Limiter"]
    end

    subgraph services [" 🔧 Service Layer "]
        INFERENCE["🔮 Inference Service"]
        REDIS["📊 Redis Service<br/>Caching + Metrics"]
    end

    subgraph models [" 🤖 Model Layer "]
        CONTAINER["Model Container"]
        NB["⚡ Naive Bayes<br/>80KB - Fast"]
        BERT["🧠 BERT Model<br/>418MB - Accurate"]
    end

    API -->|HTTPS| FLASK
    WEB -->|HTTPS| FLASK
    FLASK --> AUTH
    FLASK --> LIMITER
    FLASK --> INFERENCE
    INFERENCE --> REDIS
    INFERENCE --> CONTAINER
    CONTAINER --> NB
    CONTAINER --> BERT

    style BERT fill:#E6F3FF,stroke:#000000,stroke-width:2px,color:#000000
    style NB fill:#FFFDE7,stroke:#000000,stroke-width:2px,color:#000000
    style INFERENCE fill:#FFE6E6,stroke:#000000,stroke-width:2px,color:#000000
    style clients fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style application fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style services fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style models fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style API fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style WEB fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style FLASK fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style AUTH fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style LIMITER fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style REDIS fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style CONTAINER fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000

Features

Core Features

  • Upload Security Reports: Supports OWASP Dependency-Check JSON format
  • REST API: Production-ready API with X-API-Key authentication
  • Dual AI Models:
    • Naive Bayes baseline model for fast classification
    • BERT transformer model for advanced, context-aware analysis
  • AI-Powered Prioritization: Uses ML models to classify vulnerability severity
  • Intuitive Dashboard: Clean, modern interface built with Tailwind CSS
  • Alert Reduction: Highlights critical vulnerabilities to reduce noise
  • Real-time Analysis: Instant processing and results

Enterprise Features

  • User Authentication: Flask-Login session management for web UI
  • API Key Management: Secure token-based authentication for REST API
  • Redis Integration: Distributed caching and metrics collection
  • Clean Architecture: SOLID principles with dependency injection
  • Observability: Request tracking, metrics, and structured logging

Technology Stack

Backend

  • Python 3.x
  • Flask (Web Framework)
  • Pydantic v2 (API validation & serialization)
  • scikit-learn (Naive Bayes baseline)
  • Transformers (BERT for advanced classification)
  • pandas & numpy (Data processing)

Frontend

  • Jinja2 Templates
  • Tailwind CSS (via CDN)
  • Responsive design

Installation

  1. Clone or navigate to the project directory:
cd vulnerability-triage-mvp
  2. Create a virtual environment:
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt

Usage

  1. Start the Flask application:
python app.py
  2. Open your browser:
http://localhost:5000
  3. Upload a security report:

    • Click "Upload a file" or drag and drop
    • Select an OWASP Dependency-Check JSON report
    • Click "Analyze Vulnerabilities"
  4. View results:

    • See prioritized vulnerabilities
    • Focus on critical alerts
    • Track alert reduction rate

API Usage

The application includes a production-ready REST API with secure API key authentication.

API Request Flow

sequenceDiagram
    %%{init: {'theme': 'base', 'themeVariables': {'background': '#ffffff', 'mainBkg': '#ffffff', 'primaryColor': '#ffffff', 'primaryTextColor': '#000000', 'signalColor': '#000000', 'actorTextColor': '#000000', 'actorLineColor': '#000000', 'labelBoxBorderColor': '#000000', 'actorBkg': '#FFFFFF', 'labelBoxBkgColor': '#FFFFFF', 'loopTextColor': '#000000', 'noteBorderColor': '#000000', 'noteBkgColor': '#FFFDE7', 'noteTextColor': '#000000'}}}%%
    participant CLIENT as 📱 API Client
    participant FLASK as ⚙️ Flask App
    participant AUTH as 🔐 Auth Service
    participant REDIS as 📊 Redis Cache
    participant MODEL as 🤖 BERT/NB Model

    CLIENT->>FLASK: POST /api/v1/triage<br/>{description, model}
    FLASK->>AUTH: Validate X-API-Key
    AUTH-->>FLASK: ✅ User verified
    FLASK->>FLASK: Validate request body
    FLASK->>REDIS: Check cache
    REDIS-->>FLASK: ❌ Cache miss
    FLASK->>MODEL: Predict vulnerability
    MODEL-->>FLASK: {priority, confidence, ai_score}
    FLASK->>REDIS: Store result (TTL: 3600s)
    FLASK-->>CLIENT: 200 OK<br/>{prediction, metadata}

    Note over CLIENT,MODEL: ⚡ Total: 200-500ms (BERT)<br/>10-50ms (Naive Bayes)<br/>5-10ms (Cached)
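
The cache step in the sequence above keys each prediction on the request and stores it with a one-hour TTL. A minimal sketch of that pattern with redis-py is shown below; the key format and helper names are assumptions, not the repository's exact code.

import hashlib
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
CACHE_TTL_SECONDS = 3600  # matches the TTL in the sequence diagram


def cache_key(description: str, model: str) -> str:
    digest = hashlib.sha256(f"{model}:{description}".encode()).hexdigest()
    return f"triage:{digest}"


def get_or_predict(description: str, model: str, predict_fn):
    key = cache_key(description, model)
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)            # cache hit: ~5-10 ms
    result = predict_fn(description, model)  # cache miss: run the model
    r.setex(key, CACHE_TTL_SECONDS, json.dumps(result))
    return result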

Complete API Tutorial

Step 1: Start the Application

# Quick start (recommended)
./start.sh

# Or manual start
flask --app app:create_app run --debug

Wait for: "BERT model loaded on cpu" in the logs (~5 seconds)

Step 2: Get Your API Key

Option A: From Web UI

  1. Open browser: http://localhost:5000
  2. Login with:
    • Username: admin
    • Password: Str0ngP@ssw0rd!_f0r_D3m0
  3. Navigate to Profile page
  4. Click "Generate API Key"
  5. Copy the key (shown once: sk_...)

Option B: From Command Line

# Get API key from test user
flask --app app:create_app create-test-user
# Output shows: API Key: sk_abc123...

Step 3: Verify API is Running

# Check health endpoint (no auth required)
curl http://localhost:5000/api/v1/health | jq

Expected response:

{
  "status": "healthy",
  "models": {
    "naive_bayes": "available",
    "bert": "available"
  },
  "fallback_rate": "0.00%",
  "total_predictions": 0,
  "timestamp": "2025-11-11T16:15:30.123Z"
}

Step 4: Make Your First Prediction

Replace YOUR_API_KEY with your actual key from Step 2:

curl -X POST http://localhost:5000/api/v1/triage \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "description": "SQL injection vulnerability in login form allows authentication bypass and database access",
    "cvss_score": 9.8,
    "model": "bert",
    "include_confidence": true
  }' | jq

Expected response:

{
  "prediction": {
    "priority": "Critical",
    "is_critical": true,
    "ai_score": 0.94,
    "confidence": 0.91,
    "method": "bert_model"
  },
  "metadata": {
    "model_used": "bert",
    "processing_time_ms": 342,
    "timestamp": "2025-11-11T16:15:30.456Z",
    "request_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
  }
}

Step 5: Try Different Models

Using Naive Bayes (Faster)

curl -X POST http://localhost:5000/api/v1/triage \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "description": "Cross-site scripting (XSS) vulnerability in comment field",
    "cvss_score": 6.1,
    "model": "naive_bayes"
  }' | jq

Auto Model Selection (Tries BERT, falls back to Naive Bayes)

curl -X POST http://localhost:5000/api/v1/triage \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "description": "Buffer overflow in authentication module allows remote code execution",
    "model": "auto"
  }' | jq

Step 6: Check Performance Metrics

curl http://localhost:5000/api/v1/metrics | jq

Example response:

{
  "counters": {
    "api.requests.total": 15,
    "api.requests.success": 15,
    "api.model.predictions": 15,
    "api.model.bert": 10,
    "api.model.naive_bayes": 5
  },
  "histograms": {
    "api.response_time_ms": {
      "count": 15,
      "mean": 234.5,
      "min": 12,
      "max": 487
    }
  },
  "fallback_rate": 0.0,
  "fallback_rate_formatted": "0.0%",
  "fallback_threshold_exceeded": false,
  "timestamp": "2025-11-11T16:20:00.123Z"
}

Step 7: Python Integration Example

Create a file test_api.py:

import requests
import json

# Your API key
API_KEY = "YOUR_API_KEY_HERE"
BASE_URL = "http://localhost:5000/api/v1"

def predict_vulnerability(description, cvss_score=None, model="auto"):
    """Predict vulnerability severity."""

    headers = {
        "Content-Type": "application/json",
        "X-API-Key": API_KEY
    }

    payload = {
        "description": description,
        "model": model,
        "include_confidence": True
    }

    if cvss_score is not None:
        payload["cvss_score"] = cvss_score

    response = requests.post(
        f"{BASE_URL}/triage",
        headers=headers,
        json=payload
    )

    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error {response.status_code}: {response.text}")
        return None

# Test cases
vulnerabilities = [
    {
        "description": "Remote code execution in Apache Struts framework",
        "cvss_score": 9.8,
        "model": "bert"
    },
    {
        "description": "Information disclosure through debug logs",
        "cvss_score": 4.3,
        "model": "naive_bayes"
    },
    {
        "description": "XML external entity (XXE) injection in XML parser",
        "cvss_score": 7.5,
        "model": "auto"
    }
]

# Make predictions
for vuln in vulnerabilities:
    print(f"\n{'='*60}")
    print(f"Testing: {vuln['description'][:50]}...")

    result = predict_vulnerability(**vuln)

    if result:
        pred = result['prediction']
        meta = result['metadata']

        print(f"Priority: {pred['priority']}")
        print(f"AI Score: {pred['ai_score']:.2f}")
        print(f"Model Used: {meta['model_used']}")
        print(f"Processing Time: {meta['processing_time_ms']}ms")

        if 'confidence' in pred:
            print(f"Confidence: {pred['confidence']:.2f}")

Run it:

python3 test_api.py

Authentication

Get your API key:

  1. Login to web UI: http://localhost:5000/login
  2. Navigate to Profile page
  3. Click "Generate API Key"
  4. Copy the key (shown once)

Alternatively, get the API key from test user creation:

flask --app app:create_app create-test-user

API Endpoints

POST /api/v1/triage

Predict vulnerability severity using ML models. Requires authentication.

Request:

curl -X POST http://localhost:5000/api/v1/triage \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY_HERE" \
  -d '{
    "description": "SQL injection vulnerability in login form allowing authentication bypass",
    "cvss_score": 9.8,
    "model": "auto",
    "include_confidence": true
  }'

Response:

{
  "prediction": {
    "priority": "Critical",
    "is_critical": true,
    "confidence": 0.87,
    "ai_score": 0.92,
    "method": "bert_model"
  },
  "metadata": {
    "model_used": "bert",
    "processing_time_ms": 245,
    "timestamp": "2025-10-31T14:23:45.123Z",
    "request_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
  }
}

Request Parameters:

  • description (string, required): Vulnerability description (10-5000 chars, min 3 words)
  • cvss_score (float, optional): CVSS score (0.0-10.0)
  • model (string, optional): Model selection - "auto" (default), "bert", or "naive_bayes"
  • include_confidence (boolean, optional): Include confidence scores (default: false)

Response Fields:

  • prediction.priority: Severity level ("Critical", "High", "Medium", "Low")
  • prediction.is_critical: Boolean flag for critical classification
  • prediction.confidence: Model confidence score (0.0-1.0, if requested)
  • prediction.ai_score: AI-generated severity score (0.0-1.0)
  • prediction.method: Prediction method used ("bert_model", "ml_model", or "heuristic")
  • metadata.model_used: Model type that generated prediction
  • metadata.processing_time_ms: Request processing time
  • metadata.timestamp: Prediction timestamp (ISO 8601 UTC)
  • metadata.request_id: Unique request identifier (UUID)
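
Since the technology stack lists Pydantic v2 for API validation, the request constraints above could be expressed roughly as follows. This is an illustrative sketch, not the repository's actual schema; class and validator names are assumptions.

from typing import Literal, Optional

from pydantic import BaseModel, Field, field_validator


class TriageRequest(BaseModel):
    # 10-5000 characters, at least 3 words (see Request Parameters above)
    description: str = Field(min_length=10, max_length=5000)
    cvss_score: Optional[float] = Field(default=None, ge=0.0, le=10.0)
    model: Literal["auto", "bert", "naive_bayes"] = "auto"
    include_confidence: bool = False

    @field_validator("description")
    @classmethod
    def require_three_words(cls, value: str) -> str:
        if len(value.split()) < 3:
            raise ValueError("description must contain at least 3 words")
        return value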

GET /api/v1/health

Check service health and model availability.

Request:

curl http://localhost:5000/api/v1/health

Response:

{
  "status": "healthy",
  "models": {
    "naive_bayes": "available",
    "bert": "available"
  },
  "timestamp": "2025-10-31T14:23:45.123Z"
}

Error Handling

The API returns structured error responses for all failure cases:

Validation Error (422):

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request validation failed",
    "details": {
      "description": ["Field is required", "Minimum length is 10 characters"]
    }
  }
}

Model Unavailable (503):

{
  "error": {
    "code": "MODEL_UNAVAILABLE",
    "message": "BERT model is not available",
    "details": {
      "fallback": "Use model=\"naive_bayes\" or model=\"auto\""
    }
  }
}

API Documentation

Full API documentation is available in OpenAPI 3.0 format:

  • OpenAPI Specification: docs/openapi.yaml
  • Interactive Documentation: Import docs/openapi.yaml into Swagger Editor

Example Usage with Python

import requests

# Your API key (get from profile page or flask create-test-user)
API_KEY = 'your-api-key-here'

# Predict vulnerability severity
response = requests.post(
    'http://localhost:5000/api/v1/triage',
    headers={'X-API-Key': API_KEY},
    json={
        'description': 'Buffer overflow vulnerability in authentication module',
        'cvss_score': 8.5,
        'model': 'auto',
        'include_confidence': True
    }
)

result = response.json()
print(f"Priority: {result['prediction']['priority']}")
print(f"Model: {result['metadata']['model_used']}")
print(f"Processing time: {result['metadata']['processing_time_ms']}ms")

Rate Limiting

Currently no rate limiting is enforced (MVP). Production deployments should implement rate limiting based on requirements.
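
If rate limiting is added later, Flask-Limiter is one common option for a Flask app; a minimal sketch is shown below (the limits and key function are arbitrary examples, not project defaults).

from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)
limiter = Limiter(
    get_remote_address,                    # key requests by client IP
    app=app,
    default_limits=["200 per day", "50 per hour"],
)


@app.route("/api/v1/triage", methods=["POST"])
@limiter.limit("10 per minute")            # tighter limit for the expensive endpoint
def triage():
    return {"status": "ok"}                # placeholder body for illustration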

CLI Commands

The application includes Flask CLI commands for user management:

# Create admin user (interactive password prompt)
flask create-admin username

# Create test user (non-interactive, for development)
flask create-test-user

# List all users
flask list-users

# Revoke API key for a user
flask revoke-api-key username
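
For reference, commands like these are plain Flask CLI commands; a sketch of how one could be registered is shown below (illustrative only — the real implementation and its persistence logic are not shown here).

import secrets

import click
from flask import Flask

app = Flask(__name__)


@app.cli.command("create-admin")
@click.argument("username")
@click.password_option()                   # interactive, hidden password prompt
def create_admin(username: str, password: str) -> None:
    """Create an admin user and print a freshly generated API key."""
    api_key = "sk_" + secrets.token_hex(24)
    # ... hash the password and API key, then persist the user here ...
    click.echo(f"Created {username}; API Key: {api_key}")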

Training the Models

Training Pipeline Overview

flowchart LR
    %%{init: {'theme': 'base', 'themeVariables': {'background': '#ffffff', 'mainBkg': '#ffffff', 'primaryTextColor': '#000000', 'lineColor': '#000000', 'textColor': '#000000', 'clusterBkg': '#FFFFFF', 'clusterBorder': '#000000'}}}%%
    subgraph collection [" 📥 Data Collection "]
        NVD["NVD API<br/>CVE JSON"]
        COLLECT["collect_cve_data.py"]
        NVD --> COLLECT
    end

    subgraph preprocessing [" 🔧 Preprocessing "]
        RAW[("data/raw/")]
        PREPROCESS["preprocess_data.py<br/>Clean & Label"]
        SPLIT["Train/Val/Test<br/>70% / 15% / 15%"]
    end

    subgraph data [" 📊 Processed Data "]
        direction TB
        TRAIN[("train.csv<br/>397 samples")]
        VAL[("val.csv<br/>86 samples")]
        TEST[("test.csv<br/>85 samples")]
    end

    subgraph training [" 🎓 Model Training "]
        direction TB
        NB_TRAIN["⚡ Naive Bayes<br/>train_naive_bayes.py<br/>~5 min"]
        BERT_TRAIN["🧠 BERT Training<br/>train_bert.py<br/>~45 min CPU"]
    end

    subgraph output [" 💾 Saved Models "]
        direction TB
        NB_MODEL[("naive_bayes_model.pkl<br/>80KB")]
        BERT_MODEL[("bert_model/<br/>model.safetensors<br/>418MB")]
    end

    COLLECT --> RAW --> PREPROCESS --> SPLIT
    SPLIT --> TRAIN & VAL & TEST
    TRAIN --> NB_TRAIN & BERT_TRAIN
    VAL --> NB_TRAIN & BERT_TRAIN
    NB_TRAIN --> NB_MODEL
    BERT_TRAIN --> BERT_MODEL

    style NB_MODEL fill:#E6FFE6,stroke:#000000,stroke-width:2px,color:#000000
    style BERT_MODEL fill:#E6FFE6,stroke:#000000,stroke-width:2px,color:#000000
    style NVD fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style COLLECT fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style RAW fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style PREPROCESS fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style SPLIT fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style TRAIN fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style VAL fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style TEST fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style NB_TRAIN fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style BERT_TRAIN fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style collection fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style preprocessing fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style data fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style training fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style output fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
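
The collection stage pulls CVE records from NVD's public CVE API 2.0 and writes the raw JSON to data/raw/. A minimal sketch of that step is shown below; the page counts and file names are illustrative, and the repository's collect_cve_data.py may differ.

import json
from pathlib import Path

import requests

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"
OUT_DIR = Path("data/raw")
OUT_DIR.mkdir(parents=True, exist_ok=True)


def collect(pages: int = 3, per_page: int = 200) -> None:
    for page in range(pages):
        resp = requests.get(
            NVD_URL,
            params={"resultsPerPage": per_page, "startIndex": page * per_page},
            timeout=60,
        )
        resp.raise_for_status()
        out_file = OUT_DIR / f"nvd_page_{page}.json"
        out_file.write_text(json.dumps(resp.json()))
        print(f"saved {out_file}")


if __name__ == "__main__":
    collect()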

Naive Bayes Model

To train the baseline Naive Bayes model:

python scripts/train_naive_bayes.py

This will:

  • Load preprocessed data from data/processed/
  • Train a TF-IDF + Naive Bayes pipeline
  • Evaluate on validation and test sets
  • Save the model to models/naive_bayes_model.pkl
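
Conceptually, the script reduces to a scikit-learn pipeline like the sketch below; the CSV column names ("description", "label") are assumptions, not necessarily what preprocess_data.py produces.

import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

train = pd.read_csv("data/processed/train.csv")
test = pd.read_csv("data/processed/test.csv")

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
    ("nb", MultinomialNB()),
])
pipeline.fit(train["description"], train["label"])

print(classification_report(test["label"], pipeline.predict(test["description"])))
joblib.dump(pipeline, "models/naive_bayes_model.pkl")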

BERT Model

To train the advanced BERT transformer model:

python scripts/train_bert.py

This will:

  • Load preprocessed data from data/processed/
  • Fine-tune BERT-base-uncased on vulnerability classification
  • Train for 3 epochs with checkpoint saving
  • Save the model and tokenizer to models/bert_model/
  • Automatically resume from checkpoints if interrupted
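
For orientation, a condensed fine-tuning sketch using the Hugging Face Trainer API is shown below. The actual train_bert.py manages its own per-epoch checkpoints (checkpoint_epoch_N.pt), so treat this only as an approximation; the column names and binary label set are assumptions.

import pandas as pd
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # critical vs. non-critical (assumed)
)


def tokenize(batch):
    return tokenizer(batch["description"], truncation=True,
                     padding="max_length", max_length=256)


train_ds = Dataset.from_pandas(pd.read_csv("data/processed/train.csv")).map(tokenize, batched=True)
val_ds = Dataset.from_pandas(pd.read_csv("data/processed/val.csv")).map(tokenize, batched=True)

args = TrainingArguments(output_dir="models/bert_model", num_train_epochs=3,
                         per_device_train_batch_size=8, save_strategy="epoch")
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
trainer.save_model("models/bert_model")
tokenizer.save_pretrained("models/bert_model")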

BERT Training Flow with Checkpoints

flowchart TD
    %%{init: {'theme': 'base', 'themeVariables': {'background': '#ffffff', 'mainBkg': '#ffffff', 'primaryTextColor': '#000000', 'lineColor': '#000000', 'textColor': '#000000', 'clusterBkg': '#FFFFFF', 'clusterBorder': '#000000'}}}%%
    START([Start Training]) --> CHECK{Checkpoint<br/>Exists?}
    CHECK -->|Yes| LOAD[Load checkpoint_epoch_N.pt<br/>Resume from Epoch N+1]
    CHECK -->|No| INIT[Initialize BERT Model<br/>bert-base-uncased]

    LOAD --> EPOCH
    INIT --> EPOCH

    EPOCH[Train Epoch<br/>Forward + Backward Pass] --> VALIDATE[Validation<br/>Calculate F1 Score]

    VALIDATE --> BEST{New Best<br/>F1?}
    BEST -->|Yes| SAVE_MODEL[Save Best Model<br/>models/bert_model/]
    BEST -->|No| SKIP

    SAVE_MODEL --> CHECKPOINT
    SKIP[Skip Save] --> CHECKPOINT

    CHECKPOINT[Save Checkpoint<br/>checkpoint_epoch_N.pt<br/>~1.3GB]

    CHECKPOINT --> MORE{More<br/>Epochs?}
    MORE -->|Yes| EPOCH
    MORE -->|No| TEST[Final Test Evaluation<br/>83.27% Accuracy<br/>83.93% F1 Score]

    TEST --> END([Training Complete])

    style END fill:#E6FFE6,stroke:#000000,stroke-width:2px,color:#000000
    style CHECKPOINT fill:#FFFDE7,stroke:#000000,stroke-width:2px,color:#000000
    style START fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style CHECK fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style LOAD fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style INIT fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style EPOCH fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style VALIDATE fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style BEST fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style SAVE_MODEL fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style SKIP fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style MORE fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000
    style TEST fill:#FFFFFF,stroke:#000000,stroke-width:1px,color:#000000

Notes on BERT training:

  • Checkpoint System: Automatically saves progress after each epoch
  • Resume Capability: Interrupted training resumes from last checkpoint
  • CPU Training: ~45 minutes for 3 epochs
  • GPU Training: ~15-20 minutes (if available)
  • Disk Space: ~4GB total (model: 418MB, checkpoints: 1.3GB each)
  • Memory: ~450MB per worker for inference
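
The checkpoint and resume behaviour described above follows the standard PyTorch pattern; an illustrative sketch is below (file names match the checkpoint_epoch_N.pt convention, everything else is assumed).

from pathlib import Path

import torch


def save_checkpoint(epoch, model, optimizer, path="models"):
    torch.save(
        {"epoch": epoch,
         "model_state_dict": model.state_dict(),
         "optimizer_state_dict": optimizer.state_dict()},
        Path(path) / f"checkpoint_epoch_{epoch}.pt",
    )


def resume_if_possible(model, optimizer, path="models"):
    checkpoints = sorted(Path(path).glob("checkpoint_epoch_*.pt"))
    if not checkpoints:
        return 0                            # no checkpoint: start from epoch 0
    state = torch.load(checkpoints[-1], map_location="cpu")
    model.load_state_dict(state["model_state_dict"])
    optimizer.load_state_dict(state["optimizer_state_dict"])
    return state["epoch"] + 1               # resume from the next epoch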

Using BERT Predictions

Once trained, the BERT model can be used via the /bert_predict route:

curl -X POST -F "file=@report.json" http://localhost:5000/bert_predict

Or modify the frontend to add a "Use BERT Model" option alongside the existing Naive Bayes prediction.

Project Structure

vulnerability-triage-mvp/
├── app/
│   ├── __init__.py          # Flask app factory
│   ├── model.py             # Model loading and inference
│   ├── routes.py            # Web UI routes
│   └── api/                 # REST API package
│       ├── __init__.py      # API package initializer
│       └── v1/              # API version 1
│           ├── __init__.py  # Blueprint factory
│           ├── routes.py    # API endpoints
│           ├── schemas.py   # Request/response validation
│           ├── validators.py # Custom validation logic
│           └── errors.py    # Error handling
├── data/
│   ├── raw/                 # Raw CVE/NVD data
│   ├── processed/           # Processed training data
│   └── uploads/             # Uploaded scanner reports
├── docs/
│   └── openapi.yaml         # OpenAPI 3.0 specification
├── models/
│   ├── naive_bayes_model.pkl  # Trained Naive Bayes model
│   └── bert_model/            # Fine-tuned BERT model
├── scripts/
│   ├── train_naive_bayes.py   # Train Naive Bayes model
│   └── train_bert.py          # Train BERT model
├── static/                  # Static assets (CSS, JS)
├── templates/               # Jinja2 templates
│   ├── base.html
│   ├── index.html
│   ├── results.html
│   └── about.html
├── tests/
│   ├── test_model.py        # Model unit tests
│   ├── test_routes.py       # Web UI tests
│   └── test_api.py          # API endpoint tests
├── app.py                   # Application entry point
├── requirements.txt         # Python dependencies
└── README.md               # This file

Development Phases

Phase 1: Data Collection (Week 3)

  • Collect CVE/NVD data
  • Preprocess and label data
  • Create training dataset

Phase 2: Model Development (Weeks 4-5) ✓

  • ✅ Implement Naive Bayes baseline
  • ✅ Train BERT transformer model
  • ✅ Model comparison framework

Phase 3: MVP Integration (Weeks 5-6) ✓

  • ✅ Flask backend implementation
  • ✅ Web interface with Tailwind CSS
  • ✅ Results dashboard
  • ✅ Dual model support (Naive Bayes + BERT)

Phase 4: Validation (Week 7)

  • Test with real-world reports
  • Measure alert reduction rate
  • Final technical report

Current Status

Advanced MVP with REST API Complete - The application now features:

  • ✅ Upload interface
  • ✅ JSON parsing (OWASP Dependency-Check format)
  • ✅ Naive Bayes baseline model
  • ✅ BERT transformer model
  • ✅ Dual model endpoints
  • ✅ Results dashboard
  • ✅ Heuristic fallback for robustness
  • ✅ Production-ready REST API (v1)
  • ✅ Request validation with Pydantic v2
  • ✅ Comprehensive error handling
  • ✅ OpenAPI 3.0 specification
  • ✅ Full API test suite

Model Performance

Performance Comparison

graph LR
    subgraph "Model Characteristics"
        NB_CHAR[Naive Bayes<br/>---<br/>Inference: 10-50ms<br/>Memory: 50MB<br/>Accuracy: ~75%<br/>File Size: 80KB]

        BERT_CHAR[BERT<br/>---<br/>Inference: 200-500ms<br/>Memory: 450MB<br/>Accuracy: 83.27%<br/>File Size: 418MB]

        CACHE_CHAR[Cached<br/>---<br/>Inference: 5-10ms<br/>Memory: 10MB<br/>Accuracy: Same as model<br/>TTL: 3600s]
    end

    subgraph "Use Cases"
        FAST[High Throughput<br/>Low Latency Required]
        ACCURATE[Maximum Accuracy<br/>Context Understanding]
        REPEAT[Repeated Queries<br/>Same Vulnerabilities]
    end

    FAST --> NB_CHAR
    ACCURATE --> BERT_CHAR
    REPEAT --> CACHE_CHAR

    style NB_CHAR fill:#fff4e1
    style BERT_CHAR fill:#e1f5ff
    style CACHE_CHAR fill:#c8e6c9

Naive Bayes (Baseline)

  • Speed: Fast inference (10-50ms per request)
  • Memory: Low footprint (~50MB)
  • Accuracy: Good baseline (~75%)
  • Best For: High-volume, low-latency scenarios

BERT (Advanced)

  • Speed: Moderate inference (200-500ms per request)
  • Memory: Higher footprint (~450MB per worker)
  • Accuracy: 83.27% (F1: 83.93%)
  • Best For: Maximum accuracy, context-aware analysis

Training Results (BERT)

  • Dataset: 568 samples (train: 397, val: 86, test: 85)
  • Training Time: ~45 minutes (CPU), ~15 minutes (GPU)
  • Test Accuracy: 83.27%
  • Test F1 Score: 83.93%
  • Alert Reduction: 67.4% (568 → 185 flagged as critical)
  • Recall: 82% (116 true positives / 142 actual critical)
  • Confusion Matrix:
    • True Negatives: 357 | False Positives: 69
    • False Negatives: 26 | True Positives: 116
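
Accuracy, recall, and the alert-reduction rate follow directly from this confusion matrix; a quick sanity check is below (the reported F1 of 83.93% is presumably a weighted average across classes and is not reproduced by this check).

# Deriving the headline numbers from the confusion matrix above.
tn, fp, fn, tp = 357, 69, 26, 116
total = tn + fp + fn + tp                   # 568 samples

accuracy = (tn + tp) / total                # 473 / 568 = 0.8327 -> 83.27%
recall = tp / (tp + fn)                     # 116 / 142 = 0.817 -> ~82%
flagged = tp + fp                           # 185 flagged as critical
alert_reduction = 1 - flagged / total       # 0.674 -> 67.4%

print(accuracy, recall, flagged, alert_reduction)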

Next Steps

  1. ✅ Collect and preprocess CVE/NVD training data
  2. ✅ Train Naive Bayes baseline model
  3. ✅ Implement BERT-based classification
  4. Compare model performance metrics
  5. Validate with real-world scanner reports
  6. Create comparative analysis report

Deployment Architecture

Production Deployment Diagram

graph TB
    subgraph "External"
        CLIENTS[API Clients<br/>Web Browsers]
    end

    subgraph "Reverse Proxy"
        NGINX[Nginx<br/>- SSL Termination<br/>- Load Balancing<br/>- Static Files<br/>:443]
    end

    subgraph "Application Servers"
        GUNICORN[Gunicorn Master<br/>--workers 4<br/>--timeout 30]

        subgraph "Workers"
            W1[Worker 1<br/>Flask + Models<br/>~550MB RAM]
            W2[Worker 2<br/>Flask + Models<br/>~550MB RAM]
            W3[Worker 3<br/>Flask + Models<br/>~550MB RAM]
            W4[Worker 4<br/>Flask + Models<br/>~550MB RAM]
        end
    end

    subgraph "Data Tier"
        REDIS[(Redis<br/>Cache + Metrics<br/>:6379)]
        SQLITE[(SQLite<br/>Users + Sessions<br/>app.db)]
        FILES[File System<br/>Models: 418MB<br/>Checkpoints: 1.3GB×3]
    end

    CLIENTS -->|HTTPS| NGINX
    NGINX -->|HTTP| GUNICORN

    GUNICORN --> W1
    GUNICORN --> W2
    GUNICORN --> W3
    GUNICORN --> W4

    W1 <-->|Cache| REDIS
    W2 <-->|Cache| REDIS
    W3 <-->|Cache| REDIS
    W4 <-->|Cache| REDIS

    W1 <-->|Auth| SQLITE
    W2 <-->|Auth| SQLITE
    W3 <-->|Auth| SQLITE
    W4 <-->|Auth| SQLITE

    W1 -.->|Load at Startup| FILES
    W2 -.->|Load at Startup| FILES
    W3 -.->|Load at Startup| FILES
    W4 -.->|Load at Startup| FILES

    style W1 fill:#fff4e1
    style W2 fill:#fff4e1
    style W3 fill:#fff4e1
    style W4 fill:#fff4e1
    style REDIS fill:#ffe1e1

Deployment Checklist

Pre-Deployment:

  • Train models: python3 scripts/train_bert.py
  • Verify model files: ls models/bert_model/
  • Run tests: python3 -m pytest tests/
  • Configure environment variables (.env)
  • Set up Redis server
  • Create production API keys

Production Command:

gunicorn -w 4 \
  --timeout 30 \
  --bind 0.0.0.0:5000 \
  --access-logfile logs/access.log \
  --error-logfile logs/error.log \
  'app:create_app()'

Monitoring:

  • Health endpoint: GET /api/v1/health
  • Metrics endpoint: GET /api/v1/metrics
  • Monitor fallback rate < 10%
  • Monitor response time p95 < 1s
  • Check cache hit rate > 50%
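
A small poller against the two endpoints above can enforce those thresholds; an illustrative sketch follows (field names mirror the /api/v1/metrics example earlier in this README, and the fallback rate is assumed to be a 0-1 fraction as in that example).

import requests

BASE = "http://localhost:5000/api/v1"

health = requests.get(f"{BASE}/health", timeout=5).json()
metrics = requests.get(f"{BASE}/metrics", timeout=5).json()

assert health["status"] == "healthy", f"unhealthy: {health}"
assert metrics["fallback_rate"] < 0.10, "fallback rate above 10%"

# p95 is not exposed in the example payload, so use max as a conservative
# stand-in for the 1 s response-time budget.
response_times = metrics["histograms"]["api.response_time_ms"]
assert response_times["max"] < 1000, "response time budget exceeded"

print("all monitoring checks passed")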

Sample Data

A sample OWASP Dependency-Check JSON file is provided in data/sample/ for testing.
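
For reference, an abridged example of the Dependency-Check report shape the upload expects is shown below; only the fields most relevant to triage are included, and the exact structure varies by Dependency-Check version.

{
  "dependencies": [
    {
      "fileName": "struts2-core-2.3.34.jar",
      "vulnerabilities": [
        {
          "name": "CVE-2017-5638",
          "severity": "CRITICAL",
          "description": "Remote code execution via crafted Content-Type header",
          "cvssv3": { "baseScore": 10.0 }
        }
      ]
    }
  ]
}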

License

This project is licensed under CC BY-NC-SA 4.0. The license covers all current and historical commits in this repository. See the LICENSE file for details.

Documentation

For comprehensive technical documentation with detailed architecture diagrams, see:

  • BERT Architecture and Integration Guide - Complete documentation with 20+ Mermaid diagrams covering:

    • System architecture overview
    • BERT training pipeline with checkpoints
    • Model loading and initialization sequences
    • API request flow and component interactions
    • Data flow diagrams
    • Deployment architecture
    • Performance characteristics
    • Troubleshooting guide
  • OpenAPI Specification - REST API documentation

  • Project Summary (docs/Project Summary.md) - High-level project overview

References

  • FIRST.Org - CVSS v3.1 Specification
  • OWASP Dependency-Check
  • Papadopoulos & Tsioutsiouliklis (2024) - VulTriager
  • Yang, Sun & Fu (2023) - Transformer-based vulnerability identification
  • Hugging Face Transformers - BERT implementation
