McHughCyber/opencti-cve-org-connector

OpenCTI CVE.org CVE List Connector

The CVE.org connector is a production-ready Python process that collects vulnerability data directly from the CVE Program, providing an alternative to NIST NVD for OpenCTI. It imports all historical and current CVE records from the official CVE List repository (CVEProject/cvelistV5), creates the related attack pattern, software, and course of action entities, and rate-limits its writes to OpenCTI.

Features

Core Functionality

  • Complete CVE Dataset: Imports all 300,000+ CVE records from 1999 to present
  • Direct CVE Data: Ingests CVE Records in CVE JSON v5.x format directly from the CVE Program
  • GitHub Releases: Uses GitHub releases from CVEProject/cvelistV5 repository for consistent snapshots
  • Nested ZIP Handling: Automatically extracts nested ZIP files containing the full CVE dataset
  • Incremental Updates: Supports both initial load and incremental updates via delta releases

Advanced Processing

  • Dual-Mode Processing: Full releases (300,000+ CVEs) and delta releases (incremental updates)
  • Comprehensive CVSS Extraction: Supports all CVE format variations from legacy (1999) to modern (2025)
  • Multi-Version CVSS: CVSS v2, v3.0, v3.1, and v4.0 support with fallback extraction strategies
  • Enhanced Data Extraction: CWE, CPE, CAPEC, solutions, credits, and affected products
  • Always-Update Strategy: Automatically updates existing vulnerabilities with new data

Entity Integration

  • AttackPattern Entities: Creates CAPEC attack patterns from modern CVEs (2023+)
  • Software Entities: Creates software entities from CPE 2.3 data (2022+)
  • CourseOfAction Entities: Creates mitigation entities from solution data
  • Rich Relationships: Links vulnerabilities to attack patterns, software, and solutions
  • Entity Deduplication: Prevents duplicate entities with intelligent caching

Production Features

  • GraphQL Mode: Direct GraphQL operations for better performance and update capabilities
  • Advanced Rate Limiting: Adaptive rate limiting with system strain detection and cooldown
  • State Management: Persistent state tracking with automatic backup and error recovery
  • Comprehensive Logging: Detailed logging for monitoring and troubleshooting
  • Health Monitoring: Built-in health checks and performance metrics
  • Software Bill of Materials (SBOM): Comprehensive SBOM generation and attestation

Requirements

  • OpenCTI Platform version 5.12.0 or higher
  • Python 3.8+
  • Internet access to GitHub API for releases
  • Docker (for containerized deployment)

Quick Start

1. Configure Environment

cp env.example .env
# Edit .env with your OpenCTI settings

2. Deploy with Docker (Recommended)

docker build -t cve-org-connector .
docker run -d --name cve-connector --env-file .env cve-org-connector

3. Deploy with Python

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python3 src/main.py

Configuration

Production Configuration

# Production .env configuration
OPENCTI_URL=http://opencti:8080
OPENCTI_TOKEN=your_production_token
CVE_CREATE_MODE=graphql
CVE_UPDATE_EXISTING=true
CVE_BATCH_SIZE=50
CONNECTOR_LOG_LEVEL=info
CVE_RELEASE_POLL_INTERVAL=3600
CVE_RELEASE_MAX_BACKLOG=10

# Delta processing
CVE_ENABLE_DELTA_PROCESSING=true
CVE_DELTA_POLL_INTERVAL=900
CVE_PREFER_DELTAS=true

# Rate limiting (conservative defaults)
RATE_LIMIT_REQUESTS_PER_SECOND=3.0
RATE_LIMIT_CONCURRENT_REQUESTS=2
RATE_LIMIT_ADAPTIVE_SCALING=true
RATE_LIMIT_POST_OPERATION_DELAY_MS=250

Key Environment Variables

| Variable | Description | Default | Production |
|---|---|---|---|
| OPENCTI_URL | OpenCTI platform URL | http://localhost:8080 | http://opencti:8080 |
| OPENCTI_TOKEN | OpenCTI API token | Required | Get from OpenCTI admin panel |
| CVE_CREATE_MODE | Processing mode | graphql | graphql (recommended) |
| CVE_UPDATE_EXISTING | Enable updates | true | true |
| CVE_BATCH_SIZE | Batch size | 50 | 50-200 |
| CONNECTOR_LOG_LEVEL | Logging level | info | info |
| CVE_RELEASE_POLL_INTERVAL | Polling interval (seconds) | 3600 | 3600 (1 hour) |

Entity Integration Configuration

# Entity creation toggles
CREATE_ATTACK_PATTERNS=true
CREATE_SOFTWARE_ENTITIES=true
CREATE_COURSE_OF_ACTIONS=true

# Relationship creation
LINK_ATTACK_PATTERNS=true
LINK_SOFTWARE=true
LINK_SOLUTIONS=true

# Processing limits
MAX_CPE_PER_CVE=50
MAX_CAPEC_PER_CVE=10
ENTITY_DEDUPLICATION_ENABLED=true
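
As an illustration of how these toggles and limits might be read at startup, here is a minimal sketch. The `EntityConfig` class and the helper functions are hypothetical and not taken from the connector's source; the fallback values simply mirror the settings listed above and are assumed to be the defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def env_bool(name: str, default: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a true/false toggle; an unset variable falls back to the default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(name: str, default: int, env: Optional[Mapping[str, str]] = None) -> int:
    """Read an integer limit; an unset variable falls back to the default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    return default if raw is None else int(raw)


@dataclass(frozen=True)
class EntityConfig:  # hypothetical container, not the connector's own class
    create_attack_patterns: bool
    create_software_entities: bool
    create_course_of_actions: bool
    max_cpe_per_cve: int
    max_capec_per_cve: int


def load_entity_config(env: Optional[Mapping[str, str]] = None) -> EntityConfig:
    # Fallback values are assumptions that mirror the listed settings.
    return EntityConfig(
        create_attack_patterns=env_bool("CREATE_ATTACK_PATTERNS", True, env),
        create_software_entities=env_bool("CREATE_SOFTWARE_ENTITIES", True, env),
        create_course_of_actions=env_bool("CREATE_COURSE_OF_ACTIONS", True, env),
        max_cpe_per_cve=env_int("MAX_CPE_PER_CVE", 50, env),
        max_capec_per_cve=env_int("MAX_CAPEC_PER_CVE", 10, env),
    )
```

Calling `load_entity_config()` with no argument reads from the process environment; passing a dict makes the function easy to exercise in tests.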

Rate Limiting Configuration

| Variable | Default | Description |
|---|---|---|
| RATE_LIMIT_REQUESTS_PER_SECOND | 3.0 | Maximum GraphQL requests per second |
| RATE_LIMIT_CONCURRENT_REQUESTS | 2 | Maximum concurrent GraphQL requests |
| RATE_LIMIT_BURST_SIZE | 10 | Burst allowance for initial requests |
| RATE_LIMIT_ADAPTIVE | true | Enable adaptive rate limiting based on response times |
| RATE_LIMIT_POST_OPERATION_DELAY_MS | 250 | Delay in milliseconds after each operation |
| RATE_LIMIT_COOLDOWN_THRESHOLD_SECONDS | 3.0 | Response time threshold for triggering cooldown |
| RATE_LIMIT_COOLDOWN_DURATION_SECONDS | 30 | Duration of cooldown pause when strain is detected |

Docker Image

Using Pre-built Images

Pull the latest image from GitHub Container Registry:

docker pull ghcr.io/mchughcyber/opencti-cve-org-connector:latest

Available tags:

  • master - Latest build from the default branch
  • v1.0.0 - Specific version releases
  • v1.0 - Latest patch version of 1.0
  • v1 - Latest minor version of 1.x

Building Locally

For development or customization:

docker build -t opencti-cve-org-connector .

For multi-platform builds:

docker buildx build --platform linux/amd64,linux/arm64 -t opencti-cve-org-connector .

Initial Import Process

The connector performs a full initial load of all CVE records:

Expected Initial Load Process:

  1. Download: Downloads ~466MB CVE archive from GitHub releases
  2. Extract: Extracts outer ZIP file
  3. Nested Extraction: Automatically detects and extracts nested cves.zip file
  4. Discovery: Finds 300,000+ CVE JSON files in nested directory structure
  5. Processing: Converts CVE records to GraphQL operations and imports to OpenCTI
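
Steps 2 to 4 above can be sketched with the standard library alone. This is a simplified illustration, not the connector's implementation: the function name `extract_release` and the `outer` directory name are made up, and the `CVE-*.json` filename pattern is an assumption about how record files are named.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_release(archive: Path, workdir: Path) -> List[Path]:
    """Extract the outer ZIP, then any ZIP found inside it, and list CVE JSON files."""
    outer_dir = workdir / "outer"
    with zipfile.ZipFile(archive) as outer:
        outer.extractall(outer_dir)

    # The nested archive is described above as cves.zip; globbing for *.zip
    # avoids hard-coding that name.
    nested_dir = workdir / "cves_extracted"
    for nested in list(outer_dir.rglob("*.zip")):
        with zipfile.ZipFile(nested) as inner:
            inner.extractall(nested_dir)

    # Fall back to the outer directory if the release had no nested archive.
    search_root = nested_dir if nested_dir.exists() else outer_dir
    return sorted(search_root.rglob("CVE-*.json"))
```

The returned list corresponds to the "Found N JSON files" discovery step; parsing and import would follow from there.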

Sample Log Output:

Found 314625 JSON files in /tmp/cve-releases-cache/.../cves_extracted
Successfully processed 314625 CVE records
Initial load completed: 314625 vulnerabilities created

Verification:

  • ✅ Successful connection to OpenCTI
  • ✅ Download and extraction of CVE archive
  • ✅ Processing of 300,000+ CVE records
  • ✅ GraphQL operations sent to OpenCTI

Architecture

Core Components

EnhancedReleaseConnector
├── ReleaseDetector          # Detects full vs delta releases
├── DeltaProcessor           # Handles incremental updates
├── DifferentialProcessor    # Smart change detection
├── VulnerabilityManagerV2   # Always-update strategy
├── StateManager            # Persistent state tracking
├── CVEParserEnhanced       # Multi-format CVSS extraction
├── EntityManager           # AttackPattern, Software, CourseOfAction creation
├── RelationshipManager     # Entity relationship management
└── RateLimiterV2           # Adaptive rate limiting
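
To make the full-versus-delta distinction concrete, the sketch below classifies the assets of a GitHub release payload by filename. It is not the connector's `ReleaseDetector`. The matching rules (`delta` and `all_CVEs` substrings) are an assumption about how cvelistV5 release assets are named, and both function names are hypothetical. The `assets`, `name`, and `browser_download_url` fields are part of the GitHub releases API response.

```python
from typing import Any, Dict, Optional


def classify_asset(name: str) -> Optional[str]:
    """Return 'delta', 'full', or None for a release asset filename."""
    lowered = name.lower()
    if not lowered.endswith(".zip"):
        return None
    if "delta" in lowered:       # assumed naming convention
        return "delta"
    if "all_cves" in lowered:    # assumed naming convention
        return "full"
    return None


def pick_asset(release: Dict[str, Any], prefer_deltas: bool = True) -> Optional[Dict[str, str]]:
    """Pick one asset from a release payload, honoring a CVE_PREFER_DELTAS-style flag."""
    by_kind: Dict[str, Dict[str, str]] = {}
    for asset in release.get("assets", []):
        kind = classify_asset(asset.get("name", ""))
        if kind and kind not in by_kind:
            by_kind[kind] = {
                "kind": kind,
                "name": asset["name"],
                "url": asset.get("browser_download_url", ""),
            }
    order = ("delta", "full") if prefer_deltas else ("full", "delta")
    for kind in order:
        if kind in by_kind:
            return by_kind[kind]
    return None
```

An initial load would call `pick_asset(release, prefer_deltas=False)` to get the full archive, while routine polling would prefer the smaller delta.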

CVSS Extraction Pipeline

CVE Data
├── StructuredCVSSExtractor    # containers.cna.metrics[]
├── LegacyCVSSExtractor        # x_legacyV4Record.impact.cvss
├── TextCVSSExtractor          # Description text parsing
└── ADPMetricsExtractor        # containers.adp[].metrics[]
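
A fallback chain over these locations could look like the sketch below. It is an illustration only: the function names are hypothetical, the order in which sources are tried is an assumption, the legacy block is assumed to be a single object with a `baseScore` field, and description text parsing is left out. The metric keys (`cvssV4_0`, `cvssV3_1`, `cvssV3_0`, `cvssV2_0`) come from the CVE JSON v5 schema.

```python
from typing import Any, Dict, List, Optional

# Metric keys defined by the CVE JSON v5 schema, newest version first.
CVSS_KEYS = ("cvssV4_0", "cvssV3_1", "cvssV3_0", "cvssV2_0")


def _from_metrics(metrics: Optional[List[Dict[str, Any]]]) -> Optional[Dict[str, Any]]:
    """Return the newest-version CVSS entry found in a metrics[] array."""
    for key in CVSS_KEYS:
        for metric in metrics or []:
            cvss = metric.get(key)
            if isinstance(cvss, dict) and "baseScore" in cvss:
                return {
                    "key": key,
                    "score": float(cvss["baseScore"]),
                    "vector": cvss.get("vectorString"),
                }
    return None


def extract_cvss(record: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Try CNA metrics, then the legacy record, then ADP metrics."""
    containers = record.get("containers", {})
    cna = containers.get("cna", {})

    found = _from_metrics(cna.get("metrics"))
    if found:
        return {**found, "origin": "cna"}

    # Assumption: the legacy impact block is one object carrying baseScore.
    legacy = cna.get("x_legacyV4Record", {}).get("impact", {}).get("cvss")
    if isinstance(legacy, dict) and "baseScore" in legacy:
        return {
            "key": "legacy",
            "score": float(legacy["baseScore"]),
            "vector": legacy.get("vectorString"),
            "origin": "legacy",
        }

    for adp in containers.get("adp", []):
        found = _from_metrics(adp.get("metrics"))
        if found:
            return {**found, "origin": "adp"}
    return None
```

Returning the `origin` alongside the score keeps it visible which extractor produced a value, which helps when checking extraction success rates.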

Data Flow

GitHub Releases → Outer ZIP → Nested ZIP → CVE JSON v5.x → GraphQL Operations → OpenCTI
     ↓              ↓           ↓            ↓              ↓
  Download      Extract    Extract      Parse &         Rate Limited
  & Cache       Outer      Nested       Validate        Operations

Entity Integration

Entity Types Created

  1. AttackPattern - From CAPEC data in modern CVEs (2023+)
  2. Software - From CPE 2.3 data in modern CVEs (2022+)
  3. CourseOfAction - From solution/mitigation data in CVEs

Relationship Types

  • Vulnerability --[targets]--> AttackPattern - CVE can be exploited via attack pattern
  • Vulnerability --[has]--> Software - CVE affects software (built-in field)
  • CourseOfAction --[mitigates]--> Vulnerability - Solution addresses CVE
  • CourseOfAction --[mitigates]--> AttackPattern - Solution blocks attack pattern
  • AttackPattern --[uses]--> AttackPattern - Attack pattern hierarchy
  • Software --[related-to]--> Software - Related software products

Data Sources

  • CAPEC Attack Patterns: containers.cna.impacts[] in modern CVEs (2023+)
  • CPE Software Data: containers.cna.cpeApplicability[] in modern CVEs (2022+)
  • Solution Data: containers.cna.solutions[] in CVEs
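
Entity deduplication for software can be pictured as a cache keyed on the identifying parts of a CPE 2.3 string. The sketch below is hypothetical (`cpe_key` and `SoftwareCache` are not names from the connector) and splits on plain colons, so it assumes the CPE values contain no escaped `\:` sequences.

```python
from typing import Callable, Dict, Tuple


def cpe_key(cpe: str) -> Tuple[str, str, str]:
    """Return (vendor, product, version) from a CPE 2.3 formatted string."""
    parts = cpe.split(":")  # simplification: ignores escaped colons
    if len(parts) < 6 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 string: {cpe!r}")
    # Layout: cpe:2.3:part:vendor:product:version:...
    return parts[3].lower(), parts[4].lower(), parts[5].lower()


class SoftwareCache:
    """Create each software entity once and reuse its ID afterwards."""

    def __init__(self, create: Callable[[str], str]):
        self._create = create  # callable that creates the entity and returns its ID
        self._ids: Dict[Tuple[str, str, str], str] = {}

    def get_or_create(self, cpe: str) -> str:
        key = cpe_key(cpe)
        if key not in self._ids:
            self._ids[key] = self._create(cpe)
        return self._ids[key]
```

With a cache like this, a product referenced by thousands of CVEs costs one create operation instead of one per CVE, which matters under a 3 requests per second limit.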

Rate Limiting

The CVE connector includes comprehensive rate limiting to prevent overwhelming the OpenCTI instance with too many GraphQL operations. This is especially important when processing large numbers of CVE records (300,000+ during initial load).

Rate Limiting Features

  • Token Bucket Algorithm: Implements a token bucket with configurable rate limits
  • Concurrent Request Limiting: Limits the number of simultaneous GraphQL operations
  • Adaptive Rate Limiting: Automatically adjusts rate based on OpenCTI response times
  • System Strain Detection: Tracks consecutive slow responses and triggers cooldown
  • Post-Operation Delays: Configurable delays after each vulnerability operation
  • Burst Handling: Allows short bursts of requests for better performance
  • Statistics Tracking: Monitors rate limiting effectiveness and performance
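
For readers unfamiliar with the token bucket algorithm named above, here is a generic, minimal version. It is not the connector's `RateLimiterV2`; the class and method names are illustrative, and the injectable `clock` parameter exists only to make the behavior easy to test.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `burst` requests, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, burst: int, clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self._clock = clock
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        self.tokens = min(self.capacity, self.tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller must wait."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token becomes available (0 if one is ready)."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.rate)
```

With `rate=3.0` and `burst=10`, the first ten requests pass immediately and later ones are spaced to about three per second. A caller would typically sleep for `wait_time()` whenever `try_acquire()` returns False.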

Default Rate Limiting Settings

The connector uses conservative default settings to protect the OpenCTI instance:

  • 3 requests per second maximum
  • 2 concurrent requests maximum
  • 10 request burst allowance
  • 250ms post-operation delay
  • Adaptive rate limiting enabled by default
  • 30-second cooldown when system strain detected
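
The strain detection and cooldown behavior can be sketched as a small counter over response times. This is an assumption-laden illustration: the `StrainDetector` name is made up, and the number of consecutive slow responses that triggers a cooldown (3 here) is not documented, so treat it as a placeholder. The 3.0 second threshold and 30 second cooldown match the documented defaults.

```python
class StrainDetector:
    """Signal a cooldown after several consecutive slow responses."""

    def __init__(self, threshold_s: float = 3.0, cooldown_s: float = 30.0,
                 consecutive: int = 3):
        self.threshold_s = threshold_s   # RATE_LIMIT_COOLDOWN_THRESHOLD_SECONDS
        self.cooldown_s = cooldown_s     # RATE_LIMIT_COOLDOWN_DURATION_SECONDS
        self.consecutive = consecutive   # assumed value, not documented
        self._slow = 0

    def record(self, response_time_s: float) -> float:
        """Record one response time; return seconds to pause (0.0 if none)."""
        if response_time_s >= self.threshold_s:
            self._slow += 1
        else:
            self._slow = 0  # a fast response breaks the streak
        if self._slow >= self.consecutive:
            self._slow = 0
            return self.cooldown_s
        return 0.0
```

A caller would pass each GraphQL response time to `record()` and sleep for the returned duration before issuing the next request.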

Tuning Rate Limits

For production environments, you may need to adjust these settings based on your OpenCTI instance capacity:

# Conservative settings for small instances
RATE_LIMIT_REQUESTS_PER_SECOND=2.0
RATE_LIMIT_CONCURRENT_REQUESTS=1
RATE_LIMIT_POST_OPERATION_DELAY_MS=500

# Aggressive settings for high-performance instances
RATE_LIMIT_REQUESTS_PER_SECOND=10.0
RATE_LIMIT_CONCURRENT_REQUESTS=5
RATE_LIMIT_POST_OPERATION_DELAY_MS=100

Monitoring and Troubleshooting

Log Monitoring

# Monitor logs in real-time
docker logs -f cve-connector

# Check recent logs
docker logs --tail=50 cve-connector

# Check connector status
docker ps | grep cve-connector

Health Monitoring

The connector provides comprehensive metrics:

  • Processing statistics (total CVEs, deltas, full releases)
  • Rate limiting metrics (requests/sec, throttling, response times)
  • State information (last processed releases, failed CVEs)
  • Health status (component health, error rates)
  • Entity creation statistics (attack patterns, software, courses of action)

Common Issues

  1. Connection Issues

    • Verify OpenCTI URL and token
    • Check network connectivity to GitHub API
    • Ensure OpenCTI platform is running
  2. Memory Issues

    • Reduce batch size: CVE_BATCH_SIZE=25
    • Increase container memory limits
    • Monitor system resources
  3. Performance Issues

    • Increase batch size for faster processing
    • Use GraphQL mode for better performance
    • Monitor OpenCTI API response times
    • Check rate limiting statistics
  4. Rate Limiting Issues

    • Monitor rate limiter metrics in logs
    • Adjust rate limiting configuration
    • Check for system strain detection warnings

Reset Connector

To force a fresh import:

  1. Via OpenCTI UI: Go to Data → Connectors → Reset
  2. Via Docker:
    docker stop cve-connector
    docker rm cve-connector
    # Restart with fresh state

Production Deployment

Docker Compose Integration

To integrate with existing OpenCTI deployments:

# Add to your main docker-compose.yml
services:
  cve-org-connector:
    build: ./opencti-connectors-dev/cve-org-import
    container_name: opencti-cve-org-connector
    restart: unless-stopped
    environment:
      - OPENCTI_URL=http://opencti:8080
      - OPENCTI_TOKEN=${OPENCTI_TOKEN}
      - CVE_CREATE_MODE=graphql
      - CVE_UPDATE_EXISTING=true
      - CVE_BATCH_SIZE=50
      - CONNECTOR_LOG_LEVEL=info
      # Rate limiting
      - RATE_LIMIT_REQUESTS_PER_SECOND=3.0
      - RATE_LIMIT_CONCURRENT_REQUESTS=2
      - RATE_LIMIT_ADAPTIVE_SCALING=true
      # Entity integration
      - CREATE_ATTACK_PATTERNS=true
      - CREATE_SOFTWARE_ENTITIES=true
      - CREATE_COURSE_OF_ACTIONS=true
    volumes:
      - cve-release-cache:/tmp/cve-releases-cache
    networks:
      - opencti_default
    depends_on:
      - opencti

volumes:
  cve-release-cache:

Resource Requirements

  • CPU: 2+ cores recommended
  • Memory: 4GB+ recommended
  • Storage: 10GB+ for CVE cache
  • Network: Access to GitHub API and OpenCTI platform

Software Bill of Materials (SBOM)

This connector implements comprehensive SBOM generation and attestation for supply chain transparency and security compliance.

SBOM Features

  • Multi-format Support: Generates both SPDX and CycloneDX SBOM formats
  • Container Analysis: Complete Docker image SBOM including OS packages and dependencies
  • Application Analysis: Detailed Python application and dependency SBOM
  • Signed Attestations: Cryptographically signed SBOM attestations for integrity verification
  • Vulnerability Scanning: Integrated Grype scanning with SBOM input
  • Security Reporting: Automated security scan reports and GitHub Security tab integration

Accessing SBOMs

  1. GitHub Actions: Download SBOM artifacts from the Actions tab
  2. Container Registry: View attestations in GitHub Container Registry (GHCR)
  3. Releases: Download SBOMs from GitHub release assets
  4. Security Tab: View vulnerability scan results in the repository Security tab

Testing

Test Suite

The connector includes comprehensive testing with real CVE data:

# Run all tests
python -m pytest tests/

# Run specific test suites
python -m pytest tests/unit/
python -m pytest tests/integration/

# Run tests with real CVE data
python -m pytest tests/unit/test_cve_parser_real_data.py

Test Coverage

  • ✅ All CVE format variations (legacy, transitional, modern, multi-version, minimal)
  • ✅ CVSS extraction from all sources
  • ✅ Rate limiting with adaptive scaling
  • ✅ Delta processing scenarios
  • ✅ State management and persistence
  • ✅ Entity creation and relationships
  • ✅ End-to-end integration with real CVE data

Performance

Throughput Targets

  • Full Release: Process 300,000 CVEs in < 12 hours
  • Delta Release: Process < 5,000 CVEs in < 30 minutes
  • Rate Limiting: Max 3 req/s with adaptive scaling to 1 req/s under load
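
The arithmetic behind these targets is worth checking against your rate limit settings. The sketch below assumes exactly one GraphQL request per CVE, which is a simplification, since entity and relationship creation would add requests; the function names are illustrative.

```python
def required_rate(total_cves: int, hours: float) -> float:
    """Requests per second needed to finish within the given number of hours."""
    return total_cves / (hours * 3600)


def hours_needed(total_cves: int, requests_per_second: float) -> float:
    """Hours needed at a sustained request rate."""
    return total_cves / requests_per_second / 3600


print(f"{required_rate(300_000, 12):.2f} req/s")   # 6.94 req/s
print(f"{hours_needed(300_000, 3.0):.2f} h")       # 27.78 h
print(f"{hours_needed(5_000, 3.0) * 60:.1f} min")  # 27.8 min
```

Under that one-request-per-CVE assumption, the delta target fits within the default 3 req/s, while the 12-hour full-release target needs roughly 7 req/s sustained, so it presumably relies on tuned settings such as the aggressive profile of 10 req/s (about 8.3 hours).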

Quality Requirements

  • CVSS Extraction: > 95% success rate for modern CVEs, > 70% for legacy
  • Update Success: > 99% with retry logic
  • Data Integrity: Zero data loss, all fields mapped when available
  • Entity Creation: > 90% success rate for available entity data

Operation and Maintenance

Health Monitoring

# Check container status
docker ps | grep cve-connector

# Monitor logs for errors
docker logs cve-connector | grep -i error

# Monitor resource usage
docker stats cve-connector

Regular Maintenance

  1. Monitor Logs: Check for errors or warnings
  2. Verify Data Import: Ensure new CVEs are being imported
  3. Check Cache Usage: Monitor disk space for CVE cache
  4. Update Connector: Keep connector updated
  5. Monitor Entity Creation: Verify attack patterns, software, and courses of action are being created

Backup and Recovery

  • Configuration: Keep .env files backed up
  • State: Connector state is stored in OpenCTI and auto-recovers
  • Cache: CVE cache persists across restarts

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues and questions:

  • Create an issue in the repository
  • Check the troubleshooting section
  • Review OpenCTI documentation

Changelog

v3.0.0 - Enhanced Connector Rewrite

  • Dual-Mode Processing: Full and delta release support
  • Comprehensive CVSS Extraction: All CVE format variations (1999-2025)
  • Advanced Rate Limiting: Adaptive scaling and system strain detection
  • Always-Update Strategy: Overwrite existing vulnerabilities
  • State Management: Persistent state with error recovery
  • Entity Integration: AttackPattern, Software, and CourseOfAction creation
  • Rich Relationships: Comprehensive relationship mapping
  • Production Ready: Enhanced monitoring and troubleshooting
  • Real Data Testing: Comprehensive testing with actual CVE data
  • SBOM Support: Software Bill of Materials generation and attestation
