KubeStack-AI

AI-Powered Unified Middleware Management for Kubernetes & Beyond

中文文档 • Quickstart • Architecture • Contributing • Plugin Development

🚀 Mission Statement

KubeStack-AI is a revolutionary, AI-powered command-line assistant that transforms how you diagnose, manage, and optimize your entire middleware stack running on Kubernetes and bare-metal environments. By combining the power of Large Language Models with deep middleware expertise, KubeStack-AI provides intelligent, natural language-driven operations for complex cloud-native infrastructures.

🎯 Why KubeStack-AI?

The Challenge

Modern cloud-native environments involve dozens of middleware components (Redis, Kafka, PostgreSQL, MinIO, Elasticsearch, etc.), each with its own operational complexities. Traditional approaches require:

Fragmented Tools: Different CLI tools for each middleware
Deep Expertise: Extensive knowledge of each system's internals
Manual Correlation: Connecting symptoms across multiple systems
Time-Consuming Diagnosis: Hours spent troubleshooting complex issues

Our Solution

KubeStack-AI provides a unified, AI-driven interface that:

✅ Speaks Your Language: Natural language queries instead of complex commands
✅ Thinks Holistically: Cross-middleware correlation and root cause analysis
✅ Acts Intelligently: AI-powered diagnosis with actionable recommendations
✅ Extends Seamlessly: Plugin architecture for any middleware
✅ Operates Safely: Interactive confirmation for critical operations

⭐ Key Features

🔍 Intelligent Diagnosis & Anomaly Detection

Automated Detection: Built-in detectors for threshold breaches, time-series anomalies, and log patterns.
AI-Powered RCA: Root Cause Analysis engine that infers underlying issues from symptoms using rule-based logic and knowledge graph queries.
Multi-Layer Analysis: System, Kubernetes, and middleware-specific checks
Natural Language Queries: Ask questions in plain English
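
As a rough illustration of the first two detector types listed above, the sketch below flags fixed-threshold breaches and rolling z-score outliers in a metric series. It is not taken from the KubeStack-AI codebase: the function names, the window size, and the z-score limit are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def threshold_breaches(samples, limit):
    """Return indices of samples strictly above a fixed limit."""
    return [i for i, value in enumerate(samples) if value > limit]


def zscore_anomalies(samples, window=5, z_limit=3.0):
    """Flag points that deviate strongly from the preceding window.

    Each sample is compared with the mean and population standard
    deviation of the `window` samples that come before it.
    """
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if samples[i] != mu:
                flagged.append(i)
            continue
        if abs(samples[i] - mu) / sigma > z_limit:
            flagged.append(i)
    return flagged


# Illustrative replica-lag readings in seconds (made-up data).
lag_seconds = [0.2, 0.3, 0.2, 0.3, 0.2, 2.3, 0.3]
print(threshold_breaches(lag_seconds, limit=1.0))
print(zscore_anomalies(lag_seconds))
```

A fixed threshold catches known limits such as a maximum replica lag, while the rolling comparison catches sudden jumps in metrics that have no agreed limit.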

🛠️ Universal Middleware Support

Database Systems: MySQL, PostgreSQL, MongoDB, Redis, ClickHouse
Message Queues: Kafka, RabbitMQ, Pulsar
Search & Analytics: Elasticsearch, OpenSearch
Storage: MinIO, Ceph
Monitoring: Prometheus, Grafana
Service Discovery: etcd, Consul

🧩 Plugin Architecture (✨ Phase 2.5 - Enhanced)

Extensible Design: Add support for any middleware through plugins with standardized interfaces
Hot-Loading: Dynamic plugin loading/unloading without service restart
Lifecycle Management: Complete Init → Start → Stop → Reload lifecycle with health checks
Sandbox Isolation: Timeout control, panic recovery, and resource limits per plugin
Core Plugins (Production-Ready):
- Redis Plugin: Comprehensive diagnostics (memory, connections, replication, persistence, performance)
  - Modes: Standalone, Sentinel, Cluster
  - Versions: 5.x, 6.x, 7.x
  - Diagnostics: Memory usage, fragmentation, eviction, client analysis, replication lag, slow logs, hit rate
- Kafka Plugin: Broker health, consumer lag tracking, topic analysis
  - Versions: 2.x, 3.x
  - Features: Consumer group monitoring, partition analysis, SASL/TLS authentication
- MySQL Plugin: Replication monitoring, slow query analysis, connection pool diagnostics
  - Versions: 5.7, 8.x
  - Features: Replication status, performance_schema integration, InnoDB metrics
Community Driven: Open plugin SDK with comprehensive documentation
Multi-Format Output: JSON, YAML, and formatted table output for all plugins
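
To make the lifecycle and sandbox ideas above more concrete, here is a minimal sketch of a host that walks a plugin through Init → Start → Reload → Stop and wraps every call with a timeout and exception recovery. It is an illustration only: `Lifecycle`, `run_sandboxed`, `EchoPlugin`, and the transition table are hypothetical and do not reflect the actual plugin SDK.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def run_sandboxed(func, *args, timeout=2.0):
    """Run func in a worker thread and turn failures into a result dict.

    Note: a Python thread cannot be killed, so a timed-out call keeps
    running in the background; real isolation would need a subprocess.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return {"ok": True, "value": future.result(timeout=timeout)}
    except FutureTimeout:
        return {"ok": False, "error": "timeout"}
    except Exception as exc:  # stands in for panic recovery
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    finally:
        pool.shutdown(wait=False)


class Lifecycle:
    """Drives one plugin through an assumed set of state transitions."""

    # (current state, action) -> next state; this table is an assumption.
    ALLOWED = {
        ("new", "init"): "initialized",
        ("initialized", "start"): "running",
        ("running", "reload"): "running",
        ("running", "stop"): "stopped",
        ("stopped", "start"): "running",
    }

    def __init__(self, plugin):
        self.plugin = plugin
        self.state = "new"

    def apply(self, action, *args, timeout=2.0):
        next_state = self.ALLOWED.get((self.state, action))
        if next_state is None:
            raise ValueError(f"cannot {action} from state {self.state}")
        result = run_sandboxed(getattr(self.plugin, action), *args, timeout=timeout)
        if result["ok"]:
            # Only advance the state when the plugin call succeeded.
            self.state = next_state
        return result


class EchoPlugin:
    """Hypothetical plugin used only for this demonstration."""

    def init(self, config):
        self.config = config
        return "initialized"

    def start(self):
        return "started"

    def reload(self):
        return "reloaded"

    def stop(self):
        return "stopped"


lifecycle = Lifecycle(EchoPlugin())
steps = [("init", ({"addr": "localhost:6379"},)), ("start", ()), ("reload", ()), ("stop", ())]
for action, args in steps:
    print(action, lifecycle.apply(action, *args), lifecycle.state)

# A call that exceeds its time budget is reported instead of blocking the host.
print(run_sandboxed(time.sleep, 0.2, timeout=0.05))
```

Keeping the state change conditional on a successful call means a plugin that fails or times out during `start` stays in its previous state and can be retried or unloaded.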

🤖 AI-Enhanced Operations

Smart Recommendations: Context-aware optimization suggestions
Automated Fixes: One-click resolution for common issues
Knowledge Integration: Built-in best practices and troubleshooting guides

Advanced RAG Pipeline

Hybrid Retrieval: Combines semantic and keyword-based search to improve recall.
Reranking: Refines search results using a cross-encoder model to improve relevance.
Configurable: The entire RAG pipeline is configurable via the configs/knowledge/knowledge.yaml file.
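
One common way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF), followed by a rerank of the top candidates. The sketch below shows that pattern under stated assumptions: the README does not say which fusion method KubeStack-AI uses, the document IDs and texts are invented, and `score_pair` is a toy word-overlap scorer standing in for a real cross-encoder model.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def rerank(query, doc_ids, score_pair, top_n=3):
    """Re-order the first top_n candidates with a pairwise scorer."""
    head = sorted(doc_ids[:top_n], key=lambda d: -score_pair(query, d))
    return head + doc_ids[top_n:]


# Hypothetical knowledge-base snippets keyed by made-up IDs.
texts = {
    "doc-a": "redis oom killer and maxmemory",
    "doc-b": "redis replication lag",
    "doc-c": "mysql slow queries",
    "doc-d": "kafka consumer lag",
}


def overlap_score(query, doc_id):
    """Toy stand-in for a cross-encoder: count of shared words."""
    return len(set(query.lower().split()) & set(texts[doc_id].split()))


semantic_hits = ["doc-a", "doc-b", "doc-c"]  # e.g. from a vector index
keyword_hits = ["doc-b", "doc-d", "doc-a"]   # e.g. from a keyword index

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
print(rerank("Redis OOM", fused, overlap_score))
```

RRF only needs ranks, not comparable scores, which is why it is a convenient choice when the two retrievers score on different scales.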

🚀 Getting Started

We highly recommend checking out our comprehensive Quickstart Guide for detailed setup, configuration, and usage instructions.

Installation

Option 1: Go Install

go install github.com/turtacn/kubestack-ai/cmd/ksa@latest

Option 2: Homebrew (macOS/Linux)

brew tap turtacn/kubestack-ai
brew install kubestack-ai

Option 3: Download Binary

Visit our releases page to download pre-built binaries.

Quick Start

# Initialize KubeStack-AI
ksa init

# Diagnose middleware instances (Phase 2.5 Enhanced)
ksa diagnose redis localhost:6379                    # Redis diagnostics
ksa diagnose kafka broker1:9092,broker2:9092        # Kafka cluster
ksa diagnose mysql "user:pass@tcp(host:3306)/db"   # MySQL instance

# With specific categories
ksa diagnose redis localhost:6379 --categories memory,replication

# JSON output for automation
ksa diagnose redis localhost:6379 -o json | jq .

# Diagnose all middleware in current namespace
ksa diagnose --all

# Ask natural language questions
ksa ask "Why is my Redis cluster slow?"

# Get specific middleware status
ksa status redis --namespace production

# List available plugins
ksa plugin list

# Get plugin information
ksa plugin info redis-diagnostics

# Search knowledge base for solutions
ksa kb search "Redis OOM"

# Get specific KB entry
ksa kb get kb-redis-001

Web Interface

Start the API server:
```
ksa server start
```
Access the Web Console at http://localhost:8080/console. If the full frontend is running, it is served at http://localhost:3000, but the console integration is available directly on the API server port.

Asynchronous Tasks

You can submit long-running diagnosis tasks asynchronously:

# Quote the URL so the shell does not treat "?" as a glob character
curl -X POST "http://localhost:8080/console/diagnose?async=true" \
  -H "Content-Type: application/json" \
  -d '{"targetMiddleware": "redis", "instance": "my-redis"}'

This returns a task_id. You can check the status:

curl http://localhost:8080/console/task/status/<task_id>
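
For scripts that need to wait on the result, the status check can be wrapped in a small polling loop. The sketch below is an assumption-heavy illustration: only the endpoint path comes from the commands above, while the `status` field name and the `pending`/`running` values are guesses about the response shape and should be adjusted to match the real API.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def wait_for_task(task_id, base_url="http://localhost:8080", fetch=fetch_json,
                  interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll the task status endpoint until the task leaves an in-progress state.

    ASSUMPTION: the response is a JSON object with a "status" field, and
    "pending" or "running" mean the task has not finished yet.
    """
    url = f"{base_url}/console/task/status/{task_id}"
    for attempt in range(max_attempts):
        payload = fetch(url)
        if payload.get("status") not in ("pending", "running"):
            return payload
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"task {task_id} still in progress after {max_attempts} polls")
```

With the API server running, `wait_for_task("<task_id>")` would block until the task finishes or the attempt limit is reached. The `fetch` and `sleep` parameters exist so the loop can be exercised without a live server.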

Basic Usage Examples

Example 1: Comprehensive System Health Check

$ ksa diagnose --middleware redis,mysql,kafka
🔍 Analyzing Redis cluster...
✅ Redis: Healthy (3/3 nodes up, memory usage: 45%)

🔍 Analyzing MySQL primary-replica...
⚠️  MySQL: Warning detected
   • Replica lag: 2.3s (threshold: 1s)
   • Slow queries: 23 in last hour

🔍 Analyzing Kafka cluster...
❌ Kafka: Critical issues found
   • Topic 'orders': 50K messages backed up
   • Consumer group 'payment-service': 5min lag

💡 AI Recommendations:
   1. MySQL: Consider tuning innodb_buffer_pool_size
   2. Kafka: Scale consumer group or check processing logic

Example 2: Natural Language Troubleshooting

$ ksa ask "My application can't connect to the database"
🤔 Analyzing connection issues...

🔍 Discovered Issues:
   • PostgreSQL max_connections (100) reached
   • Connection pool exhaustion in app pods
   • Network policy blocking traffic on port 5432

🛠️  Suggested Actions:
   1. Increase max_connections: `ksa exec postgres --set max_connections=200`
   2. Scale app replicas: `ksa scale app --replicas 5`
   3. Review network policies: `ksa network analyze postgres`

Execute fixes? [y/N]:

Example 3: Plugin Management

$ ksa plugin install clickhouse
📦 Installing ClickHouse plugin v1.2.0...
✅ Plugin installed successfully

$ ksa diagnose clickhouse --cluster analytics
🔍 ClickHouse Cluster Analysis:
   • Merge queue: 145 items (high)
   • Query latency P95: 2.3s
   • Disk usage: 78% on shard-2

💡 Recommendations:
   • Consider adding more background merge threads
   • Archive old partitions in 'events' table

📖 Documentation

Architecture Overview - Technical deep-dive into system design
Anomaly Detection - Design of the anomaly detection system
RCA Engine - Design of the Root Cause Analysis engine
Plugin Development Guide - Build your own middleware plugins
Supported Middlewares - List of supported middlewares and their capabilities
Configuration Reference - Complete config options. See configs/knowledge/knowledge.yaml for RAG pipeline configuration.
E2E Testing Guide - How to run and write E2E tests
Troubleshooting Guide - Common issues and solutions
API Reference - REST API and SDK documentation

🏗️ Codebase Structure

A brief overview of the key directories in the KubeStack-AI repository:

/cmd: Main application entry points. The ksa CLI application lives here.
/internal: All of the core application logic. As this is an internal package, it is not meant to be imported by external applications.
- /cli: Defines the command-line interface using Cobra, including command definitions, flag parsing, and UI formatters.
- /core: The heart of the application. It contains the central orchestrator and the primary interfaces for diagnosis, execution, and plugins.
  - /detection: Anomaly detection system.
  - /rca: Root Cause Analysis engine.
- /llm: Abstractions and clients for interacting with Large Language Models (LLMs) and the Retrieval-Augmented Generation (RAG) pipeline.
- /knowledge: Components for the knowledge base, including storage, crawling, and search functionalities.
- /plugin: The new unified plugin system architecture (Phase 4), including Registry, Loader, and Validator.
- /plugins: Built-in middleware plugins (e.g., Redis, Kafka, MySQL).
/pkg: Shared utility packages that could theoretically be used by external applications.
/deployments: Kubernetes manifests, Dockerfiles, and other deployment-related artifacts.
/docs: Project documentation, including architecture and contribution guides.
/scripts: Helper scripts for development tasks like building, testing, and linting.
/web: Contains frontend assets for a potential web-based UI.

🤝 Contributing

We welcome contributions from the community! KubeStack-AI is built by middleware experts for middleware experts.

How to Contribute

🐛 Report Issues: Found a bug? Open an issue
💡 Feature Requests: Have ideas? Start a discussion
🔧 Code Contributions: Fork, develop, and submit PRs
📝 Documentation: Help improve our docs
🧩 Plugin Development: Build plugins for new middleware

Development Setup

git clone https://github.com/turtacn/kubestack-ai.git
cd kubestack-ai
make dev-setup
make test
make e2e-test
make build

Testing & Quality Assurance

KubeStack-AI maintains high code quality with comprehensive testing:

# Run all unit tests
make test

# Run integration tests
make test-integration

# Run E2E tests
make e2e-test

# Run CLI smoke tests (Phase 26)
./scripts/cli_smoke_test.sh

# Generate coverage report
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out

Test Coverage: 80%+ overall, 90%+ for CLI components

CLI Testing (Phase 26):

✅ 20+ E2E test scenarios covering all commands
✅ 5/5 middleware plugins fully tested (Redis, MySQL, Kafka, Elasticsearch, PostgreSQL)
✅ 100% output format coverage (text, JSON, YAML)
✅ Comprehensive configuration validation
✅ Automated smoke tests for CI/CD

For detailed CLI testing documentation, see:

See CONTRIBUTING.md for detailed guidelines.

🏆 Community & Support

💬 Discussions: GitHub Discussions
🐛 Issues: GitHub Issues
📧 Email: kubestack-ai@turtacn.com
🐦 Twitter: @KubeStackAI

📜 License

KubeStack-AI is licensed under the Apache License 2.0. See LICENSE file for details.

Built with ❤️ by the KubeStack-AI community
