|
| 1 | +# AegisClaw Roadmap |
| 2 | + |
| 3 | +> Last updated: February 2026 |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## Completed Phases |
| 8 | + |
| 9 | +### ✅ v0.1.x — Foundations (Complete) |
| 10 | + |
| 11 | +- Go-based CLI (`aegisclaw init`, `secrets`, `sandbox`, `logs`) |
| 12 | +- Policy engine with granular scopes (`files.read`, `shell.exec`, `net.outbound`) |
| 13 | +- TUI-based human-in-the-loop approval for high-risk actions |
| 14 | +- Hardened Docker sandbox (non-root, read-only rootfs, dropped capabilities, seccomp) |
| 15 | +- `age`-based secret encryption for API keys |
| 16 | +- Tamper-evident, hash-chained audit logging |
| 17 | +- OpenClaw adapter for agent runtime integration |
| 18 | +- Egress proxy for network control |
| 19 | +- Signed skill verification (ed25519) |
| 20 | + |
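The tamper-evident, hash-chained audit log listed above can be illustrated with a minimal Go sketch. It is not AegisClaw's actual log format: the `Entry` type, its fields, and the `Append`/`Verify` helpers are hypothetical names chosen for this example. The only idea it demonstrates is that each entry's hash covers the previous entry's hash, so altering any record breaks the chain from that point on.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry is a hypothetical audit record; the field names are illustrative only.
type Entry struct {
	Action   string
	PrevHash string // hash of the previous entry ("" for the first entry)
	Hash     string // SHA-256 over PrevHash and Action
}

// hashEntry derives an entry's hash from its predecessor's hash and its action.
func hashEntry(prevHash, action string) string {
	sum := sha256.Sum256([]byte(prevHash + "\n" + action))
	return hex.EncodeToString(sum[:])
}

// Append links a new entry to the tail of the log and returns the extended log.
func Append(log []Entry, action string) []Entry {
	prev := ""
	if len(log) > 0 {
		prev = log[len(log)-1].Hash
	}
	return append(log, Entry{Action: action, PrevHash: prev, Hash: hashEntry(prev, action)})
}

// Verify returns the index of the first inconsistent entry, or -1 if the chain is intact.
func Verify(log []Entry) int {
	prev := ""
	for i, e := range log {
		if e.PrevHash != prev || e.Hash != hashEntry(prev, e.Action) {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	var log []Entry
	log = Append(log, "files.read /tmp/report.txt")
	log = Append(log, "net.outbound api.example.com")
	fmt.Println(Verify(log)) // -1: chain is intact

	log[0].Action = "shell.exec something-else" // simulate tampering
	fmt.Println(Verify(log))                    // 0: first entry no longer matches its hash
}
```

A real implementation would likely also hash timestamps and other metadata and anchor the latest hash somewhere an attacker cannot rewrite it; this sketch leaves both out.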
| 21 | +### ✅ v0.2.x — Policy & Runtimes (Complete) |
| 22 | + |
| 23 | +- OPA (Rego) policy engine integration |
| 24 | +- gVisor (`sandbox_runtime`) support for stronger isolation |
| 25 | +- Skill manifest format (`skills/*.yaml`) with scope declarations |
| 26 | +- Adapter health monitoring and connection status |
| 27 | + |
| 28 | +### ✅ v0.3.x — Observability & UX (Complete) |
| 29 | + |
| 30 | +- Modern web dashboard (dark mode, Security Operations Center view) |
| 31 | +- Real-time terminal streaming with live log output |
| 32 | +- Prometheus metrics endpoint |
| 33 | +- OpenTelemetry tracing |
| 34 | +- Active secret redaction in logs and console output |
| 35 | +- Emergency lockdown / panic button |
| 36 | +- Security envelope visualisation (sandbox status indicator) |
| 37 | +- Explainable audit tooltips (why an action was allowed/denied) |
| 38 | +- Skill store with remote registry browsing and one-click install |
| 39 | + |
| 40 | +--- |
| 41 | + |
| 42 | +## Upcoming Phases |
| 43 | + |
| 44 | +### 🔜 v0.4.x — Usability & Developer Experience (Q2 2026) |
| 45 | + |
| 46 | +#### Installation & Onboarding |
| 47 | + |
| 48 | +- **Package manager distribution**: `brew install aegisclaw`, `go install`, pre-built binaries via GoReleaser for Linux/macOS/Windows (amd64 + arm64) |
| 49 | +- **Interactive `init` wizard**: Guided setup that detects Docker/gVisor availability, configures default policies, and walks through first secret and skill registration |
| 50 | +- **Starter skill packs**: Curated bundles of safe, pre-signed skills (file organiser, web search, code runner) so users have something useful immediately after install |
| 51 | +- **`aegisclaw doctor`**: Diagnostic command that checks Docker, gVisor, secrets, adapter connectivity, and policy health in one go — outputs a clear pass/fail checklist |
| 52 | + |
| 53 | +#### Day-to-Day Workflow |
| 54 | + |
| 55 | +- **`docker-compose` skill orchestration**: Support multi-container skills (e.g., agent + database + cache) with coordinated sandboxing and shared network policies |
| 56 | +- **Policy templates**: Pre-built Rego policy profiles, selectable during init: `strict` (deny by default, require approval for every action), `standard` (allow known-safe actions, require approval for high-risk ones), and `permissive` (allow most actions, log everything) |
| 57 | +- **Scope autosuggestion**: When a skill requests scopes beyond its manifest, suggest the minimal scopes needed based on observed behaviour rather than requiring manual YAML editing |
| 58 | +- **Dashboard mobile responsiveness**: Responsive web UI for monitoring agent activity from a phone or tablet |
| 59 | +- **Notification system**: Webhook, Slack, and email notifications for pending approvals, denied actions, and emergency lockdowns |
| 60 | + |
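As a rough illustration of how the three policy profiles above differ, the decision table can be sketched in Go. The planned templates are Rego policies, so this is only a model of their intended behaviour. The `Risk` and `Decision` types and the `Decide` function are hypothetical, and the sketch assumes every action is classified as either known-safe or high-risk, which the roadmap does not specify.

```go
package main

import "fmt"

// Risk is a hypothetical two-level classification of an action.
type Risk int

const (
	KnownSafe Risk = iota
	HighRisk
)

// Decision is the outcome a profile assigns to an action.
type Decision string

const (
	Allow           Decision = "allow"
	RequireApproval Decision = "require-approval"
)

// Decide models the three profiles named in the roadmap.
// Unknown profile names fall back to the strictest behaviour (an assumption).
func Decide(profile string, risk Risk) Decision {
	switch profile {
	case "permissive":
		// "Allow most, log everything": simplified here to allow everything.
		return Allow
	case "standard":
		if risk == HighRisk {
			return RequireApproval
		}
		return Allow
	default: // "strict" and anything unrecognised
		return RequireApproval
	}
}

func main() {
	for _, p := range []string{"strict", "standard", "permissive"} {
		fmt.Println(p, Decide(p, KnownSafe), Decide(p, HighRisk))
	}
}
```

Falling back to `strict` for unrecognised names is a fail-closed choice made for this sketch, so a typo in a profile name cannot silently loosen the policy.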
| 61 | +#### CLI Enhancements |
| 62 | + |
| 63 | +- **`aegisclaw replay <log-id>`**: Replay an audit log entry in dry-run mode to understand what happened and what would happen if re-executed |
| 64 | +- **`aegisclaw diff <policy-a> <policy-b>`**: Compare two Rego policies side-by-side with highlighted permission differences |
| 65 | +- **Shell completions**: Bash, Zsh, Fish, and PowerShell autocompletions generated from CLI metadata |
| 66 | + |
| 67 | +--- |
| 68 | + |
| 69 | +### 🛡️ v0.5.x — Advanced Security (Q3 2026) |
| 70 | + |
| 71 | +#### Runtime Hardening |
| 72 | + |
| 73 | +- **Kata Containers / Firecracker support**: MicroVM-based isolation for workloads that need stronger-than-Docker boundaries |
| 74 | +- **Nix/bubblewrap sandbox**: Lightweight, non-Docker sandbox option for environments where Docker isn't available or desired |
| 75 | +- **Runtime behaviour profiling**: Learn normal syscall and network patterns per skill and flag anomalies in real time (e.g., a file-organiser skill suddenly making network requests) |
| 76 | +- **Resource quotas**: CPU, memory, disk I/O, and network bandwidth limits per skill — prevent runaway agents from consuming host resources |
| 77 | + |
| 78 | +#### Secret Management |
| 79 | + |
| 80 | +- **`sops` integration**: Support SOPS-encrypted files (SOPS was formerly a Mozilla project) alongside the existing `age`-based secret encryption |
| 81 | +- **Pluggable vault backends**: HashiCorp Vault, Infisical, Bitwarden, and AWS Secrets Manager as secret sources — secrets are never written to disk unencrypted |
| 82 | +- **Secret rotation**: Automatic key rotation with configurable schedules and notification when skills need re-authentication |
| 83 | +- **Ephemeral secrets**: Short-lived credentials injected into sandboxes that auto-expire after execution |
| 84 | + |
| 85 | +#### LLM Safety |
| 86 | + |
| 87 | +- **NeMo Guardrails integration**: LLM prompt protection layer — detect and block prompt injection, jailbreaks, and off-topic steering before prompts reach the model |
| 88 | +- **Prompt/response audit trail**: Log every LLM interaction (prompt + response) with optional PII redaction, creating a full chain of accountability for agent decisions |
| 89 | +- **Token budget enforcement**: Per-skill and per-session token limits to prevent cost runaway from agent loops |
| 90 | +- **Output content filtering**: Configurable filters that flag or block agent outputs containing sensitive data, harmful content, or policy violations |
| 91 | + |
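The token budget enforcement item above could work along these lines. This is a minimal sketch under stated assumptions, not a planned API: `TokenBudget`, `NewTokenBudget`, and `Spend` are hypothetical names, the sketch tracks per-skill limits only (no per-session limits), and it assumes the caller already knows how many tokens a request will consume.

```go
package main

import (
	"fmt"
	"sync"
)

// TokenBudget is a hypothetical per-skill token limiter.
type TokenBudget struct {
	mu    sync.Mutex
	limit int            // maximum tokens each skill may consume
	used  map[string]int // tokens consumed so far, keyed by skill name
}

// NewTokenBudget creates a budget that gives every skill the same limit.
func NewTokenBudget(limit int) *TokenBudget {
	return &TokenBudget{limit: limit, used: make(map[string]int)}
}

// Spend records token usage for a skill, or returns an error without
// recording anything if the request would push the skill over its limit.
func (b *TokenBudget) Spend(skill string, tokens int) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.used[skill]+tokens > b.limit {
		return fmt.Errorf("skill %q would exceed its budget of %d tokens", skill, b.limit)
	}
	b.used[skill] += tokens
	return nil
}

func main() {
	b := NewTokenBudget(1000)
	fmt.Println(b.Spend("web-search", 800)) // <nil>
	fmt.Println(b.Spend("web-search", 300)) // rejected: 800 + 300 > 1000
}
```

Rejecting the whole request rather than truncating it keeps the accounting simple; an agent stuck in a loop then fails fast once its budget is exhausted.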
| 92 | +#### Auth & Access Control |
| 93 | + |
| 94 | +- **Tailscale/WireGuard integration**: Private mesh networking so the dashboard and API are only accessible over encrypted tunnels |
| 95 | +- **Authelia/Keycloak SSO**: Web UI identity provider integration for team deployments — RBAC with admin, operator, and viewer roles |
| 96 | +- **mTLS for adapter communication**: Mutual TLS between AegisClaw and OpenClaw endpoints to prevent man-in-the-middle attacks |
| 97 | +- **API key scoping**: Per-key permissions so different integrations (CI, dashboard, CLI) have minimal required access |
| 98 | + |
| 99 | +--- |
| 100 | + |
| 101 | +### ✨ v0.6.x — Wow Factor & Ecosystem (Q4 2026) |
| 102 | + |
| 103 | +#### Visual & Interactive |
| 104 | + |
| 105 | +- **Live threat map**: Real-time animated dashboard view showing agent actions as they happen — skill executions pulse, denied actions flash red, approvals glow green — a "mission control" feel for your AI agents |
| 106 | +- **"Agent X-Ray" mode**: Click any running skill to see a live breakdown: active syscalls, open file handles, network connections, memory usage, and current scope consumption — full transparency into what the agent is actually doing inside the sandbox |
| 107 | +- **Security posture score**: Embeddable badge and dashboard widget (`AegisClaw: A+`) scoring your configuration across sandboxing, secret management, policy strictness, and audit integrity — gamification that rewards good security hygiene |
| 108 | +- **Approval UX overhaul**: Rich approval cards (web + Slack + mobile push) showing exactly what the agent wants to do, with context (which skill, what scope, risk level), diff of proposed changes, and one-tap approve/deny |
| 109 | + |
| 110 | +#### Skills Ecosystem |
| 111 | + |
| 112 | +- **Git-based skill distribution**: `aegisclaw skill install github.com/org/skill` — pull skills directly from Git repos with hash-chained provenance verification |
| 113 | +- **Skill marketplace**: Community registry with ratings, verified publishers, security audit badges, and automated vulnerability scanning of skill images |
| 114 | +- **Skill sandboxing profiles**: Per-skill seccomp and AppArmor profiles auto-generated from observed behaviour during a "learning" phase, then locked down for production |
| 115 | +- **Skill composition**: Chain multiple skills into workflows with data passing between sandbox boundaries — each step isolated, full audit trail across the pipeline |
| 116 | + |
| 117 | +#### Developer Experience |
| 118 | + |
| 119 | +- **VS Code extension**: Sidebar panel showing AegisClaw status, live audit stream, one-click approvals, and Rego policy linting with inline diagnostics |
| 120 | +- **`aegisclaw simulate`**: Dry-run mode that predicts what a skill would do (file access, network calls, resource usage) without actually executing it — like a flight simulator for agent actions |
| 121 | +- **Policy playground**: Browser-based Rego editor with live evaluation against sample skill manifests and audit scenarios — test policies before deploying them |
| 122 | +- **Terraform/Pulumi provider**: Infrastructure-as-code resources for provisioning AegisClaw instances, policies, and skill registries in team/org deployments |
| 123 | + |
| 124 | +#### Integrations |
| 125 | + |
| 126 | +- **MCP (Model Context Protocol) server**: Expose AegisClaw as an MCP tool server so any MCP-compatible AI assistant can run sandboxed skills through AegisClaw's security envelope |
| 127 | +- **GitHub Actions integration**: `aegisclaw/action@v1` that runs skills in CI with the same sandbox guarantees as local execution — consistent security in dev and CI |
| 128 | +- **Webhook-driven automation**: IFTTT-style triggers — "when a skill is denied 3 times, notify the team and auto-escalate to admin" |
| 129 | + |
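The example trigger above ("when a skill is denied 3 times, notify the team") boils down to a per-skill counter with a threshold. The Go sketch below shows one way that could look. `Escalator`, `NewEscalator`, and `RecordDenial` are hypothetical names, and the notification is an injected callback standing in for whatever webhook delivery the feature ends up using.

```go
package main

import "fmt"

// Escalator is a hypothetical trigger that fires once a skill has been
// denied a configured number of times. It is not safe for concurrent use.
type Escalator struct {
	threshold int
	denials   map[string]int
	notify    func(skill string, count int)
}

// NewEscalator builds a trigger with the given threshold and notifier.
func NewEscalator(threshold int, notify func(skill string, count int)) *Escalator {
	return &Escalator{threshold: threshold, denials: make(map[string]int), notify: notify}
}

// RecordDenial counts a denial and reports whether it triggered an escalation.
// The notifier fires exactly once per skill, when the count reaches the threshold.
func (e *Escalator) RecordDenial(skill string) bool {
	e.denials[skill]++
	if e.denials[skill] == e.threshold {
		e.notify(skill, e.denials[skill])
		return true
	}
	return false
}

func main() {
	esc := NewEscalator(3, func(skill string, count int) {
		fmt.Printf("escalating %s after %d denials\n", skill, count)
	})
	for i := 0; i < 4; i++ {
		esc.RecordDenial("code-runner") // the notifier fires on the third call only
	}
}
```

Whether the counter should reset after an escalation or after a time window is left open here; the roadmap does not say.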
| 130 | +--- |
| 131 | + |
| 132 | +## Long-Term Vision (2027+) |
| 133 | + |
| 134 | +- **Multi-node orchestration**: Distribute agent workloads across multiple machines with centralised policy management and unified audit logs |
| 135 | +- **Federated skill trust**: Cross-organisation skill sharing with cryptographic trust chains — org A's signed skills are verifiable by org B without a central authority |
| 136 | +- **eBPF-based runtime monitoring**: Kernel-level observability without modifying the sandbox: trace syscalls, network flows, and file access with low overhead |
| 137 | +- **AI-powered policy generation**: Analyse a skill's code/manifest and automatically suggest the minimal Rego policy — "this skill only needs `files.read:/tmp` and `net.outbound:api.openai.com`" |
| 138 | +- **Compliance frameworks**: Pre-built policy packs for SOC 2, HIPAA, GDPR, and NIST — one command to apply a compliance baseline |
| 139 | +- **AegisClaw Cloud**: Hosted SaaS with org management, centralised dashboards, SSO, and managed skill registries for teams that don't want to self-host |
| 140 | + |
| 141 | +--- |
| 142 | + |
| 143 | +## How to Contribute |
| 144 | + |
| 145 | +We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for details. |
| 146 | + |
| 147 | +**High-impact areas right now:** |
| 148 | + |
| 149 | +- 🐳 Adding Kata Containers / Firecracker runtime support |
| 150 | +- 🔐 Pluggable vault backend implementations (Vault, Infisical, Bitwarden) |
| 151 | +- 📝 Writing and publishing community skills with signed manifests |
| 152 | +- 🧪 Security testing and fuzzing of the sandbox boundary |
| 153 | +- 📚 Documentation improvements and tutorials |
| 154 | + |
| 155 | +Report bugs or request features via [GitHub Issues](https://github.com/mackeh/AegisClaw/issues). |