A comprehensive Python application that analyzes Pi-hole DNS logs using Large Language Models (LLMs) to provide intelligent insights, threat detection, and automated anomaly analysis.
- Features
- Quick Start
- Installation
- Configuration
- Usage
- Architecture
- Examples
- Troubleshooting
- Development
- References
- License
- AI-powered categorization of DNS queries and domains
- Automated anomaly detection using machine learning patterns
- Natural language search over DNS logs
- Comprehensive reporting with actionable insights
- Real-time threat detection and alerting
- Domain reputation analysis with threat intelligence
- DNS tunneling and beaconing detection
- Suspicious client activity monitoring
- Pattern-based attack recognition
- Daily, weekly, and security-focused reports
- Client-specific activity analysis
- Domain categorization and trend analysis
- Customizable alert thresholds and notifications
- Export capabilities (JSON, text, structured reports)
- Clean separation of concerns with dedicated modules
- Type-safe data models with comprehensive validation
- Configurable components for different environments
- Extensive logging and error handling
- RESTful API design patterns
Monitor and protect your home network from threats, track family internet usage, and identify potential security risks in real-time.
Track employee browsing patterns, detect security risks, generate compliance reports, and monitor for data exfiltration attempts.
Analyze DNS patterns for research purposes, study internet usage trends, and investigate network behavior patterns.
Real-time threat detection, automated incident response, DNS-based attack prevention, and comprehensive security reporting.
- Operating System: Windows 10+, macOS 10.15+, Linux (Ubuntu 20.04+)
- Python: 3.11 or higher (required for modern type hints and performance)
- RAM: Minimum 2GB, Recommended 4GB+
- Disk Space: 500MB for application and logs
- Network: Stable connection to Pi-hole server
- Pi-hole v6+ with API access enabled
- LLM Provider (choose one):
- Ollama with any compatible model (e.g., gpt-oss:latest, llama3.2:latest, gemma3:4b)
- OpenAI API with valid API key
- Network access between the analytics server and Pi-hole
- Clone the repository:
  git clone https://github.com/LiteObject/pihole-llm-analytics.git
  cd pihole-llm-analytics
- Set up a Python virtual environment:
  python -m venv .venv
  # On Windows:
  .venv\Scripts\activate
  # On macOS/Linux:
  source .venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
- Configure environment variables:
  # Copy the example configuration
  cp .env.example .env
  # Edit .env with your settings
  nano .env
Create a .env file in the project root with your settings:
# Pi-hole Configuration
PIHOLE_HOST=192.168.7.99 # Pi-hole server IP/hostname
PIHOLE_PORT=8080 # Pi-hole admin interface port (default: 80)
PIHOLE_PASSWORD=your_admin_password # Pi-hole admin password (NOT API token)
# LLM Configuration (Ollama - default)
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=gpt-oss:latest
OLLAMA_TIMEOUT=120
OLLAMA_TEMPERATURE=0.2
OLLAMA_MAX_TOKENS=512
# LLM Configuration (OpenAI - alternative)
# LLM_PROVIDER=openai
# LLM_API_KEY=your_openai_api_key_here
# LLM_MODEL=gpt-3.5-turbo
# LLM_API_BASE_URL=https://api.openai.com/v1
# General LLM Configuration
LLM_TIMEOUT=120
LLM_TEMPERATURE=0.2
LLM_MAX_TOKENS=512
MAX_PROMPT_CHARS=18000
# Analytics Configuration
ANALYTICS_LOG_COUNT=1000
ANALYTICS_ENABLE_CACHING=true
ANALYTICS_ANOMALY_THRESHOLD=0.1
# Security Configuration
SECURITY_ENABLE_THREAT_DETECTION=true
SECURITY_ALERT_THRESHOLD=medium
SECURITY_ENABLE_REPUTATION_CHECK=true
# Application Configuration
APP_LOG_LEVEL=INFO
APP_LOG_FORMAT=json
Important Notes:
- Use PIHOLE_PASSWORD for your Pi-hole admin password, not the API token
- Include PIHOLE_HOST and PIHOLE_PORT separately for better control
- Always test connections first: python integrated_analysis.py --test-connection
- Pi-hole data fetching is reliable: even if LLM analysis fails, you'll get DNS query data
- Use pihole_diagnostics.py for comprehensive Pi-hole diagnostics
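Before running the quick-start commands, you can sanity-check that the environment is wired up. The following is a minimal sketch using only the standard library; it assumes the variables above are exported in your shell (or loaded from .env by a tool such as python-dotenv) and is not part of the application itself:

```python
import os

# Read the documented settings straight from the environment (illustrative only;
# the project's own config module handles this with validation and defaults).
pihole_host = os.getenv("PIHOLE_HOST", "192.168.7.99")
pihole_port = int(os.getenv("PIHOLE_PORT", "8080"))
pihole_password = os.getenv("PIHOLE_PASSWORD", "")

ollama_url = os.getenv("OLLAMA_URL", "http://localhost:11434")
ollama_model = os.getenv("OLLAMA_MODEL", "gpt-oss:latest")

if not pihole_password:
    raise SystemExit("PIHOLE_PASSWORD is not set; see the .env example above")

print(f"Pi-hole target: {pihole_host}:{pihole_port}")
print(f"Ollama endpoint: {ollama_url} (model: {ollama_model})")
```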
# Test connections first (recommended)
python integrated_analysis.py --test-connection
# Basic analysis (standalone functionality)
python integrated_analysis.py
# Analyze more queries with JSON output
python integrated_analysis.py --count 500 --output json
# Use different model
python integrated_analysis.py --model llama3.2:latest --verbose
# Test with verbose logging for troubleshooting
python integrated_analysis.py --count 50 --verbose
The application provides a comprehensive CLI for all operations:
# Run comprehensive DNS analysis
python -m pihole_analytics analyze --count 1000
# Generate security report for the last 7 days
python -m pihole_analytics report --type security --days 7
# Search DNS logs with natural language
python -m pihole_analytics search "suspicious DNS queries from last hour"
# Check system status and component health
python -m pihole_analytics status
# Analyze specific client activity
python -m pihole_analytics client-analysis 192.168.1.100 --hours 24
# Check domain reputation
python -m pihole_analytics reputation suspicious-domain.com
# Categorize domains using AI
python -m pihole_analytics categorize google.com facebook.com doubleclick.net
Use the modular API for integration into other applications:
from pihole_analytics.main import PiholeAnalytics
# Initialize with default configuration
analytics = PiholeAnalytics()
# Run comprehensive analysis
results = analytics.run_analysis(count=1000)
# Generate security report
security_report = analytics.generate_report("security", days=7)
# Search logs with natural language
search_results = analytics.search_logs("blocked advertising domains")
# Analyze specific client
client_analysis = analytics.get_client_analysis("192.168.1.100", hours=24)
# Check system health
status = analytics.get_system_status()
For advanced users who want direct access to the new LLM analyzer with the factory pattern:
from pihole_analytics.core.pihole_client import PiholeClient
from pihole_analytics.analytics.llm_analyzer import LLMAnalyzer
from pihole_analytics.analytics.llm_providers.factory import LLMProvider
from pihole_analytics.utils.config import PiholeConfig
# Configure Pi-hole client
pihole_config = PiholeConfig(host="192.168.7.99", port=8080, password="your_password")
pihole_client = PiholeClient(pihole_config)
# Initialize LLM analyzer using factory pattern
llm_analyzer = LLMAnalyzer.create_with_provider(LLMProvider.OLLAMA)
# Test connections
if not llm_analyzer.test_connection():
print("Warning: LLM service not available")
# Fetch and analyze
with pihole_client:
queries = pihole_client.fetch_queries(100)
analysis = llm_analyzer.analyze_queries(queries)
print(f"Risk level: {analysis.threat_summary.get('risk_level')}")
# Custom analysis with specific instructions
custom_analysis = llm_analyzer.analyze_queries(queries, """
Focus on:
1. Gaming traffic patterns
2. Streaming service usage
3. Potential security threats
Return JSON with detailed breakdowns.
""")
# Alternative: Legacy configuration method (still supported)
from pihole_analytics.analytics.llm_analyzer import LLMConfig
llm_config = LLMConfig.from_env()
legacy_analyzer = LLMAnalyzer(llm_config)
This project has been redesigned from a standalone script into a comprehensive, modular architecture with a factory pattern for LLM providers:
- Modular Design: Separated concerns into dedicated modules
- Factory Pattern: Pluggable LLM providers (Ollama, OpenAI, extensible)
- Error Handling: Comprehensive error handling with graceful fallbacks
- Multiple Interfaces: CLI, Python API, and standalone scripts
- Configuration Management: Environment variables and programmatic config
- Robust Authentication: Multiple Pi-hole authentication methods
- LLM Integration: Structured LLM analysis with connection testing
The new factory pattern implementation provides:
- Provider Abstraction: Easy switching between LLM providers
- Extensibility: Simple to add new providers (Azure OpenAI, Anthropic, etc.)
- Fallback Support: Automatic fallback when primary provider fails
- Type Safety: Fully typed provider interfaces
- Configuration: Environment-based or programmatic provider selection
- Ollama: Local LLM hosting (default, requires Ollama service)
- OpenAI: OpenAI API integration (requires API key)
- Extensible: Easy to add Azure OpenAI, Anthropic Claude, etc.
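To make the pattern concrete, here is a small, self-contained sketch of how a provider registry can be wired. The class and method names are illustrative only; the project's actual interfaces live under pihole_analytics/analytics/llm_providers/ and may differ:

```python
from abc import ABC, abstractmethod
from enum import Enum

class BaseLLMProvider(ABC):
    """Minimal provider contract (illustrative, not the project's base class)."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        """Send a prompt to the backing LLM and return raw text."""

class EchoProvider(BaseLLMProvider):
    # Stand-in for a real backend (Ollama, OpenAI, Azure OpenAI, ...).
    def generate(self, prompt: str) -> str:
        return f"echo: {prompt[:40]}"

class ProviderKind(Enum):
    ECHO = "echo"

_REGISTRY: dict[ProviderKind, type[BaseLLMProvider]] = {ProviderKind.ECHO: EchoProvider}

def create_provider(kind: ProviderKind) -> BaseLLMProvider:
    # Adding a new provider means one subclass plus one registry entry.
    return _REGISTRY[kind]()

if __name__ == "__main__":
    provider = create_provider(ProviderKind.ECHO)
    print(provider.generate("Summarize today's blocked domains"))
```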
| Feature | Implementation | Benefit |
|---|---|---|
| Modular Structure | Separated packages | Easy maintenance & testing |
| Factory Pattern | Pluggable LLM providers | Easy provider switching |
| Error Handling | Comprehensive logging & fallbacks | Reliable operation |
| Configuration | Environment + programmatic | Flexible deployment |
| Authentication | Multiple Pi-hole auth methods | Broad compatibility |
| Output Formats | JSON, text, structured | Integration flexibility |
| Connection Testing | Built-in diagnostics | Quick troubleshooting |
| Extensibility | Plugin architecture | Easy customization |
pihole-llm-analytics/
├── pihole_analytics/ # Main package
│ ├── core/ # Core functionality
│ │ ├── pihole_client.py # Pi-hole API client
│ │ └── llm_client.py # Legacy LLM integration
│ ├── analytics/ # Analytics engine
│ │ ├── analyzer.py # Traditional DNS analysis
│ │ ├── llm_analyzer.py # AI-powered analysis with factory pattern
│ │ └── llm_providers/ # LLM provider factory pattern
│ │ ├── __init__.py # Package initialization
│ │ ├── base.py # Abstract base provider
│ │ ├── factory.py # Provider factory and enum
│ │ ├── ollama_provider.py # Ollama implementation
│ │ └── openai_provider.py # OpenAI implementation
│ ├── security/ # Security monitoring
│ │ └── threat_detector.py # Threat detection and alerting
│ ├── utils/ # Shared utilities
│ │ ├── models.py # Data models and types
│ │ ├── config.py # Configuration management
│ │ └── logging.py # Logging utilities
│ ├── main.py # Main application interface
│ ├── cli.py # Command-line interface
│ └── __main__.py # Module entry point
├── integrated_analysis.py # Standalone analysis script
├── requirements.txt # Python dependencies
└── README.md # This documentation
- PiholeClient: Robust Pi-hole API integration with authentication, retry logic, and error handling
- LLMClient: Legacy Ollama integration (maintained for compatibility)
- LLMAnalyzer: AI-powered analysis with factory pattern for multiple LLM providers
- DNSAnalyzer: Traditional DNS log analysis with anomaly detection, categorization, and reporting
- LLMAnalyzer: AI-powered analysis with factory pattern, custom instructions and structured output
- ThreatDetector: Advanced security monitoring with pattern recognition and threat intelligence
- LLM Providers: Factory pattern implementation supporting Ollama, OpenAI, and extensible architecture
- Type-safe dataclasses for all data structures
- Comprehensive enums for status codes and categories
- Validation and serialization support
- Enhanced AnalysisResult model with LLM insights
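As an illustration of what a typed, serializable model can look like, here is a hedged sketch; the field names are hypothetical and not the actual definitions in pihole_analytics/utils/models.py:

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from enum import Enum

class QueryStatus(Enum):
    ALLOWED = "allowed"
    BLOCKED = "blocked"

@dataclass
class ExampleDNSQuery:
    """Illustrative DNS query record with basic validation and serialization."""
    timestamp: datetime
    client: str
    domain: str
    status: QueryStatus

    def __post_init__(self) -> None:
        if not self.domain:
            raise ValueError("domain must not be empty")

    def to_dict(self) -> dict:
        data = asdict(self)
        data["timestamp"] = self.timestamp.isoformat()
        data["status"] = self.status.value
        return data

query = ExampleDNSQuery(datetime.now(), "192.168.1.100", "ads.example.com", QueryStatus.BLOCKED)
print(query.to_dict())
```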
- Environment variable-based configuration
- Dataclass-based config objects with validation
- Support for multiple configuration sources
- Factory pattern configuration for LLM providers
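A minimal sketch of the environment-backed, dataclass-based style described here is shown below; the real PiholeConfig in pihole_analytics/utils/config.py may expose different fields and validation:

```python
import os
from dataclasses import dataclass

@dataclass
class ExamplePiholeConfig:
    """Illustrative config object built from environment variables."""
    host: str
    port: int = 80
    password: str = ""
    timeout: int = 30

    def __post_init__(self) -> None:
        if not (1 <= self.port <= 65535):
            raise ValueError(f"invalid port: {self.port}")

    @classmethod
    def from_env(cls) -> "ExamplePiholeConfig":
        return cls(
            host=os.getenv("PIHOLE_HOST", "localhost"),
            port=int(os.getenv("PIHOLE_PORT", "80")),
            password=os.getenv("PIHOLE_PASSWORD", ""),
            timeout=int(os.getenv("PIHOLE_TIMEOUT", "30")),
        )

config = ExamplePiholeConfig.from_env()
print(f"Configured for {config.host}:{config.port} (timeout {config.timeout}s)")
```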
- Domain Reputation Analysis: Real-time checking against threat intelligence feeds
- Pattern Recognition: Detection of DNS tunneling, beaconing, and other suspicious patterns
- Anomaly Detection: Statistical analysis to identify unusual query patterns
- Alert Generation: Configurable alerts with severity levels and automated notifications
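As a flavor of the signals such detection can use, the toy heuristic below flags long, high-entropy subdomain labels, which are typical of DNS tunneling payloads. It is purely illustrative and is not the project's ThreatDetector implementation:

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Bits of entropy per character in the given string."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_like_tunneling(domain: str, min_len: int = 40, min_entropy: float = 3.5) -> bool:
    label = domain.split(".")[0]  # the leftmost label usually carries the payload
    return len(label) >= min_len and shannon_entropy(label) >= min_entropy

for d in ("www.google.com",
          "aGVsbG8gd29ybGQgdGhpcyBpcyBhIHZlcnkgbG9uZyBwYXlsb2Fk.tunnel.example"):
    print(f"{d} -> suspicious: {looks_like_tunneling(d)}")
```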
- Client Behavior Analysis: Track individual client query patterns and anomalies
- Temporal Analysis: Identify time-based attack patterns and suspicious activity windows
- Domain Classification: AI-powered categorization of domains and threat assessment
- Trend Analysis: Long-term pattern recognition for emerging threats
- PIHOLE_HOST: Pi-hole server IP address or hostname
- PIHOLE_PORT: Pi-hole admin interface port (default: 80)
- PIHOLE_PASSWORD: Pi-hole admin password (required for query logs access)
- PIHOLE_TIMEOUT: Request timeout in seconds (default: 30)
- LLM_URL: Ollama server URL (default: http://localhost:11434)
- LLM_MODEL: Model name (gpt-oss:latest recommended)
- LLM_TIMEOUT: Request timeout in seconds (default: 30)
- ANALYTICS_LOG_COUNT: Number of recent queries to analyze (default: 1000)
- ANALYTICS_ENABLE_CACHING: Enable response caching (default: true)
- ANALYTICS_ANOMALY_THRESHOLD: Anomaly detection sensitivity (default: 0.1)
from pihole_analytics.analytics.llm_analyzer import LLMAnalyzer
from pihole_analytics.analytics.llm_providers.factory import LLMProvider
# Factory pattern (recommended)
analyzer = LLMAnalyzer.create_with_provider(LLMProvider.OLLAMA)
# Legacy configuration (still supported)
from pihole_analytics.analytics.llm_analyzer import LLMConfig
llm_config = LLMConfig(
url="http://localhost:11434", # Ollama server URL
model="gpt-oss:latest", # Model name
timeout=120, # Request timeout
max_prompt_chars=18000, # Prompt size limit
temperature=0.2, # Response creativity (0.0-1.0)
max_tokens=512 # Response length limit
)
legacy_analyzer = LLMAnalyzer(llm_config)
# Test connection and get available models
if analyzer.test_connection():
models = analyzer.get_available_models()
print(f"Available models: {models}")# LLM Configuration
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=gpt-oss:latest
OLLAMA_TIMEOUT=120
OLLAMA_TEMPERATURE=0.2
OLLAMA_MAX_TOKENS=512
MAX_PROMPT_CHARS=18000
- SECURITY_ENABLE_THREAT_DETECTION: Enable threat detection (default: true)
- SECURITY_ALERT_THRESHOLD: Minimum alert severity (low/medium/high/critical)
- SECURITY_ENABLE_REPUTATION_CHECK: Enable domain reputation checking (default: true)
# Simple analysis
python integrated_analysis.py
# Comprehensive analysis with options
python integrated_analysis.py --count 500 --output json --verbose
# Full CLI interface
python -m pihole_analytics analyze --count 500
from pihole_analytics.main import PiholeAnalytics
analytics = PiholeAnalytics()
results = analytics.run_analysis(count=500)
print(f"Analyzed {results['summary']['total_queries']} queries")
print(f"Block rate: {results['summary']['block_rate']*100:.1f}%")
print(f"Anomalies detected: {results['summary']['anomalies_detected']}")# Generate security report
security_report = analytics.generate_report("security", days=7)
# Check for active threats
if security_report.get("threat_analysis"):
threats = security_report["threat_analysis"]["threats_detected"]
print(f"Active threats: {threats}")
# Review security alerts
for alert in security_report.get("active_alerts", []):
print(f"Alert: {alert['title']} - {alert['severity']}")from pihole_analytics.analytics.llm_analyzer import LLMAnalyzer
from pihole_analytics.analytics.llm_providers.factory import LLMProvider
# Initialize LLM analyzer with factory pattern
analyzer = LLMAnalyzer.create_with_provider(LLMProvider.OLLAMA)
# Test connection first
if not analyzer.test_connection():
print("LLM service unavailable")
exit(1)
# Custom analysis with specific focus
custom_instructions = """
Analyze the DNS logs focusing on:
1. Gaming traffic patterns (Steam, Epic, etc.)
2. Streaming service usage (Netflix, YouTube, etc.)
3. Social media activity (Facebook, Twitter, etc.)
4. Potential security threats or suspicious domains
5. Bandwidth-heavy applications
Provide detailed breakdown with percentages and specific recommendations.
Return analysis in the standard JSON format.
"""
analysis = analyzer.analyze_queries(queries, custom_instructions)
# Extract specific insights
risk_level = analysis.threat_summary.get('risk_level', 'unknown')
print(f"Risk Assessment: {risk_level}")
if analysis.anomalies:
print(f"Security Anomalies: {len(analysis.anomalies)}")
for anomaly in analysis.anomalies:
print(f" • {anomaly.description} (Confidence: {anomaly.confidence:.1%})")# Complete integration test
from pihole_analytics.core.pihole_client import PiholeClient
from pihole_analytics.analytics.llm_analyzer import LLMAnalyzer
from pihole_analytics.analytics.llm_providers.factory import LLMProvider
from pihole_analytics.utils.config import PiholeConfig
def test_integration():
"""Test the complete integration."""
# Load from environment or configure directly
pihole_config = PiholeConfig.from_env() # Or manual config
# Test connections
pihole_client = PiholeClient(pihole_config)
llm_analyzer = LLMAnalyzer.create_with_provider(LLMProvider.OLLAMA)
print("Testing Pi-hole connection...")
with pihole_client:
queries = pihole_client.fetch_queries(10)
print(f"Fetched {len(queries)} queries")
print("Testing LLM connection...")
if llm_analyzer.test_connection():
print("LLM service available")
models = llm_analyzer.get_available_models()
print(f"Available models: {models[:3]}...")
return True
# Run the test
if test_integration():
print("Integration test successful!")# Analyze specific client activity
client_ip = "192.168.1.100"
analysis = analytics.get_client_analysis(client_ip, hours=24)
print(f"Client {client_ip} analysis:")
print(f" Total queries: {analysis['summary']['total_queries']}")
print(f" Unique domains: {analysis['summary']['unique_domains']}")
print(f" Block rate: {analysis['summary']['block_rate']*100:.1f}%")# Test all connections without running analysis
python integrated_analysis.py --test-connection
# Verbose testing with detailed output
python integrated_analysis.py --test-connection --verbose
# Test with different query counts
python integrated_analysis.py --count 50 --output text # Small test
python integrated_analysis.py --count 500 --output json # Medium test
python integrated_analysis.py --count 1000 --verbose # Large test
# Test different models
python integrated_analysis.py --model llama3.2:latest --count 100
python integrated_analysis.py --model gemma3:4b --count 100
Problem: The application authenticates successfully but fails when fetching queries with a 401 Unauthorized error.
Symptoms:
2025-08-27 13:19:39,471 - pihole_analytics.piholeclient - INFO - Successfully authenticated to Pi-hole
2025-08-27 13:19:39,532 - pihole_analytics.piholeclient - ERROR - Error occurred: 401 Client Error: Unauthorized for url: http://192.168.7.99:8080/api/queries?sid=...
Fixed in v1.1: The application now automatically tries multiple authentication methods for Pi-hole v6.0+ compatibility.
Integrated Architecture Benefits:
- Multiple Auth Methods: Automatically tries session-based, password-based, and legacy token methods
- Graceful Fallbacks: Continues with available data when some endpoints fail
- Enhanced Logging: Detailed error messages with troubleshooting hints
- Connection Testing: Built-in testing via the --test-connection flag
How it works:
- Session-based authentication (Pi-hole v5.x, backward compatibility)
- Password-based per-request authentication (Pi-hole v6.0+)
- Legacy API token method (older custom installations)
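For readers curious what the v6-style flow looks like on the wire, the sketch below authenticates with the admin password, receives a session id (sid), and passes it on subsequent requests. Endpoint paths and response fields follow the Pi-hole v6 FTL API as published in the official OpenAPI specification; verify against your own instance, since the client in this project handles these details (and the fallbacks) for you:

```python
import requests

BASE = "http://192.168.7.99:8080/api"   # adjust to your Pi-hole host and port
PASSWORD = "your_admin_password"

# Step 1: exchange the admin password for a session id.
auth = requests.post(f"{BASE}/auth", json={"password": PASSWORD}, timeout=30)
auth.raise_for_status()
sid = auth.json()["session"]["sid"]

# Step 2: include the sid on data requests (the endpoint shown failing in the log above).
resp = requests.get(f"{BASE}/queries", params={"sid": sid}, timeout=30)
resp.raise_for_status()
print(f"Fetched {len(resp.json().get('queries', []))} queries")
```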
Quick Test:
# Test with integrated script
python integrated_analysis.py --test-connection
# Test with full CLI
python -m pihole_analytics status
# Test specific analysis (will gracefully handle API restrictions)
python integrated_analysis.py --count 10 --verbose
If you are still having issues:
- Run integrated diagnostics:
  # Test both Pi-hole and LLM connections
  python integrated_analysis.py --test-connection --verbose
  # Test Pi-hole authentication and endpoints
  python pihole_diagnostics.py
- Verify the Pi-hole admin password:
  # Test login via the Pi-hole web interface first
  # Use the exact same password in PIHOLE_PASSWORD
- Check Pi-hole API settings:
  - Navigate to Pi-hole Settings → API
  - Ensure API access is enabled
  - Check Query log display permissions
- Test integrated authentication:
  from pihole_analytics.core.pihole_client import PiholeClient
  from pihole_analytics.utils.config import PiholeConfig

  config = PiholeConfig(host="your_ip", port=8080, password="your_password")
  with PiholeClient(config) as client:
      queries = client.fetch_queries(5)  # Uses multiple auth methods automatically
      print(f"Success! Got {len(queries)} queries")
Pi-hole Version Notes:
- v5.x and earlier: Uses session-based authentication
- v6.0+: Requires password-based auth for queries endpoint
- Custom installations: May require legacy API token format
The application automatically detects your Pi-hole version and uses the appropriate authentication method.
- Verify the Pi-hole URL and API key in .env
- Ensure the Pi-hole API is enabled in admin settings
- Check network connectivity and firewall rules
- Connection Testing: Use python integrated_analysis.py --test-connection to verify LLM connectivity
- Model Availability: Check available models with the integrated analyzer
- Graceful Fallback: The application continues with basic analysis if LLM fails
# Test LLM connectivity
python integrated_analysis.py --test-connection
# List available models
python -c "
from pihole_analytics.analytics.llm_analyzer import LLMAnalyzer
from pihole_analytics.analytics.llm_providers.factory import LLMProvider
analyzer = LLMAnalyzer.create_with_provider(LLMProvider.OLLAMA)
print('Available models:', analyzer.get_available_models())
"
# Test with different model
python integrated_analysis.py --model llama3.2:latest --test-connection
Common LLM Parsing Error: If you see an unhashable type: 'dict' error:
# This indicates the LLM service is responding but with unexpected format
# Pi-hole data collection still works - only LLM analysis fails
# Try with a smaller query count first
python integrated_analysis.py --count 10 --verbose
# Test only the connection without analysis
python integrated_analysis.py --test-connection
# Check if Ollama service is running properly
ollama ps # Should show running models
ollama list # Should show available models
What works regardless of LLM issues:
- Pi-hole authentication and connection
- DNS query data fetching (100+ queries)
- Basic DNS log parsing and validation
- Only the AI-powered analysis may fail
The Pi-hole data fetching will still work even if LLM analysis fails.
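A minimal sketch of that fallback, reusing the calls shown in the earlier examples (the try/except handling here is illustrative, not the application's exact logic):

```python
from pihole_analytics.core.pihole_client import PiholeClient
from pihole_analytics.analytics.llm_analyzer import LLMAnalyzer
from pihole_analytics.analytics.llm_providers.factory import LLMProvider
from pihole_analytics.utils.config import PiholeConfig

pihole_client = PiholeClient(PiholeConfig.from_env())
analyzer = LLMAnalyzer.create_with_provider(LLMProvider.OLLAMA)

# DNS data collection does not depend on the LLM service.
with pihole_client:
    queries = pihole_client.fetch_queries(100)
print(f"Fetched {len(queries)} queries from Pi-hole")

# Only attempt AI analysis when the LLM service answers.
if analyzer.test_connection():
    try:
        analysis = analyzer.analyze_queries(queries)
        print(f"Risk level: {analysis.threat_summary.get('risk_level')}")
    except Exception as exc:  # e.g. the parsing error described above
        print(f"LLM analysis failed, continuing with raw data only: {exc}")
else:
    print("LLM service unavailable; skipping AI analysis")
```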
- Start with smaller query counts for testing
- Check Pi-hole logs for any query retrieval issues
- Verify sufficient system memory for large datasets
Enable verbose logging for detailed diagnostic information:
# Set environment variable
export APP_LOG_LEVEL=DEBUG
# Or use CLI flag
python -m pihole_analytics --verbose analyze
The project includes comprehensive diagnostic capabilities:
# Test both Pi-hole and LLM connections
python integrated_analysis.py --test-connection
# Test Pi-hole authentication and all endpoints
python pihole_diagnostics.py
# This will test:
# - Authentication to Pi-hole
# - All available API endpoints
# - LLM service connectivity
# - Data retrieval and parsing
# - Provide detailed error information
Example diagnostic output:
Getting session ID...
Authentication successful: BOf8VGaTLcX62ezS0Zw64g=
Testing: http://192.168.7.99:8080/api/summary?sid=...
Keys: ['domains_being_blocked', 'dns_queries_today', 'ads_blocked_today', ...]
Testing: http://192.168.7.99:8080/api/queries?sid=...
Failed: 401 Client Error: Unauthorized for url: ...
This helps identify exactly which endpoints are working and which are failing.
- Query Count: Start with 500-1000 queries for initial testing
- Caching: Enable analytics caching for repeated analyses
- LLM Timeout: Adjust timeout based on model size and hardware
- Memory Usage: Monitor system memory with large query datasets
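To illustrate why caching pays off for repeated analyses, here is a toy time-based cache; it is a sketch of the general idea, not the project's caching layer:

```python
import time
from functools import wraps

def ttl_cache(seconds: float):
    """Cache results keyed by positional arguments for a limited time."""
    def decorator(func):
        store = {}

        @wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            if args in store and now - store[args][0] < seconds:
                return store[args][1]  # reuse the recent result
            result = func(*args)
            store[args] = (now, result)
            return result
        return wrapper
    return decorator

@ttl_cache(seconds=300)
def fetch_and_summarize(count: int) -> str:
    # Placeholder for an expensive fetch-plus-LLM call.
    return f"summary of the last {count} queries"

print(fetch_and_summarize(500))  # computed
print(fetch_and_summarize(500))  # served from cache within 5 minutes
```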
- Install development dependencies:
  pip install pytest pylint black mypy
- Run code quality checks:
  # Linting
  pylint pihole_analytics/
  # Type checking
  mypy pihole_analytics/
  # Code formatting
  black pihole_analytics/
- Run tests:
  pytest tests/
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make your changes with proper testing
- Ensure code quality: pylint, mypy, black
- Submit a pull request with a detailed description
The application follows clean architecture principles:
- Separation of Concerns: Each module has a single responsibility
- Dependency Injection: Configuration and clients are injected
- Type Safety: Comprehensive type hints and validation
- Error Handling: Structured error handling with proper logging
- Extensibility: Easy to add new analyzers and detectors
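The dependency-injection point can be illustrated with a short, hypothetical example: the reporting component receives its query source instead of constructing it, so a fake source can stand in during tests. The names below are illustrative, not the project's classes:

```python
from dataclasses import dataclass
from typing import Protocol

class QuerySource(Protocol):
    def fetch_queries(self, count: int) -> list[dict]: ...

@dataclass
class BlockRateReport:
    source: QuerySource  # injected dependency, e.g. a Pi-hole client or a fake

    def block_rate(self, count: int) -> float:
        queries = self.source.fetch_queries(count)
        blocked = sum(1 for q in queries if q.get("blocked"))
        return blocked / len(queries) if queries else 0.0

class FakeSource:
    """Test double returning one blocked query out of every four."""
    def fetch_queries(self, count: int) -> list[dict]:
        return [{"blocked": i % 4 == 0} for i in range(count)]

print(f"Block rate: {BlockRateReport(FakeSource()).block_rate(100):.0%}")
```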
- Pi-hole Project: For providing excellent DNS filtering capabilities
- Ollama: For local LLM hosting and management
- Python Community: For robust libraries and development tools
- Pi-hole FTL API Specification: Official OpenAPI specification - Complete reference for Pi-hole v6+ API endpoints and authentication methods
- Pi-hole Documentation: docs.pi-hole.net
- DNS Security Best Practices: Industry standards for DNS monitoring and threat detection
- LLM Integration Patterns: Architectural patterns for AI-powered network analysis
For questions, issues, or contributions:
- Issues: Use GitHub Issues for bug reports and feature requests
- Documentation: Check the comprehensive examples and code comments
- Community: Join discussions in the repository discussions section
This project is licensed under the MIT License - see the LICENSE file for details.