feat: self-improvement system — logging, perf monitoring, retrospectives by AgentWrapper · Pull Request #108 · ComposioHQ/agent-orchestrator

AgentWrapper · 2026-02-18T19:53:16Z

Summary

Full self-improvement system for ao: structured logging, API performance monitoring, session retrospectives, dashboard management, and comprehensive tests.

Core Infrastructure

JSONL log writer (log-writer.ts): size-based rotation, crash-safe appendFileSync, configurable backups
Log reader (log-reader.ts): query/filter by time, level, source, session, pattern; reads across rotated files; tail support
Session report cards (session-report-card.ts): per-session metrics (CI attempts, review rounds, outcome, duration)
Retrospectives (retrospective.ts): post-session analysis with timeline, metrics, and heuristic lessons
Dashboard manager (dashboard-manager.ts): programmatic restartDashboard() for orchestrator agent to kill/clean/restart Next.js without CLI
Path utilities: getLogsDir(), getRetrospectivesDir()
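
To make the log-writer bullet concrete, here is a minimal sketch of size-based rotation around `appendFileSync`. It is not the code in log-writer.ts: the entry shape `SketchLogEntry`, the option names `maxBytes` and `backups`, and the helpers `rotate` and `appendEntry` are all assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical entry shape; the PR's real LogEntry schema is not shown here.
interface SketchLogEntry {
  ts: string;
  level: "debug" | "info" | "warn" | "error";
  source: string;
  message: string;
  sessionId?: string;
}

// Hypothetical option names.
interface RotationOptions {
  maxBytes: number; // rotate before the file would grow past this size
  backups: number; // number of rotated files (.1, .2, ...) to keep
}

// Shift file.1 -> file.2 and so on, then file -> file.1.
// The oldest backup is overwritten by the rename.
function rotate(file: string, backups: number): void {
  if (backups < 1) {
    fs.rmSync(file, { force: true });
    return;
  }
  for (let i = backups - 1; i >= 1; i--) {
    const from = `${file}.${i}`;
    if (fs.existsSync(from)) fs.renameSync(from, `${file}.${i + 1}`);
  }
  fs.renameSync(file, `${file}.1`);
}

function appendEntry(file: string, entry: SketchLogEntry, opts: RotationOptions): void {
  fs.mkdirSync(path.dirname(file), { recursive: true }); // auto-create dirs
  const line = JSON.stringify(entry) + "\n";
  const size = fs.existsSync(file) ? fs.statSync(file).size : 0;
  if (size > 0 && size + Buffer.byteLength(line) > opts.maxBytes) {
    rotate(file, opts.backups);
  }
  // Synchronous append: the line is handed to the OS before the call returns.
  fs.appendFileSync(file, line);
}

// Usage: five entries with a tiny size limit force several rotations.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "ao-logs-"));
const demoFile = path.join(demoDir, "dashboard.jsonl");
for (let i = 0; i < 5; i++) {
  appendEntry(
    demoFile,
    { ts: new Date().toISOString(), level: "info", source: "dashboard", message: `line ${i}` },
    { maxBytes: 100, backups: 2 },
  );
}
console.log(fs.readdirSync(demoDir).sort());
```

A synchronous append keeps each entry on one line even if the process dies right after the call, which is presumably what "crash-safe" refers to above; the trade-off is that every write blocks the event loop briefly.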

CLI Commands

ao logs dashboard|events|session — query structured logs
ao perf routes|slow|cache|enrichment — API performance analysis
ao retro list|show|generate — session retrospective management
ao dashboard restart|status|logs — dashboard process management with PID tracking

Dashboard (Web)

/logs page with LogViewer component (filterable by source, level, time, session)
/perf page with PerfDashboard component (route stats, slow requests, cache hit rates)
POST /api/client-logs — browser-side error/perf ingestion
GET /api/logs, GET /api/perf — log and performance query APIs
ClientLogger component: captures window.onerror, PerformanceObserver, fetch timing
Cache hit/miss tracking in TTLCache.getStats()
API request timing in /api/sessions route
PR enrichment timing in serialize.ts
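
The cache hit/miss tracking can be pictured with a small stand-in class. This is a sketch under stated assumptions, not the PR's `TTLCache`: the class name `SketchTTLCache`, the injectable clock, and the field names returned by `getStats()` (`hits`, `misses`, `hitRate`, `size`) are hypothetical, chosen to match what the tests list says is covered.

```typescript
interface CacheStats {
  hits: number;
  misses: number;
  hitRate: number; // hits / (hits + misses), or 0 before any lookup
  size: number; // entries currently stored
}

// Hypothetical stand-in for a TTL cache that counts hits and misses.
class SketchTTLCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;
  private hits = 0;
  private misses = 0;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock so expiry can be tested without sleeping
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry !== undefined && entry.expiresAt > this.now()) {
      this.hits++;
      return entry.value;
    }
    if (entry !== undefined) this.entries.delete(key); // expired: evict
    this.misses++;
    return undefined;
  }

  getStats(): CacheStats {
    const total = this.hits + this.misses;
    return {
      hits: this.hits,
      misses: this.misses,
      hitRate: total === 0 ? 0 : this.hits / total,
      size: this.entries.size,
    };
  }
}

// Usage with a fake clock.
let fakeClock = 0;
const prCache = new SketchTTLCache<string>(1000, () => fakeClock);
prCache.set("pr-108", "enriched");
prCache.get("pr-108"); // hit
prCache.get("pr-109"); // miss: never stored
fakeClock = 2000;
prCache.get("pr-108"); // miss: expired and evicted
console.log(prCache.getStats());
```

Counting an expired entry as a miss is a design choice in this sketch; it makes the hit rate reflect how often a caller actually avoided recomputing a value.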

Event Persistence & Lifecycle

Lifecycle manager writes events to events.jsonl with full state transition data
Auto-generates retrospectives on session merge/kill
Session manager passes AO_PROJECT_ID and AO_LOG_DIR env vars to spawned sessions

Tests (101 new)

log-writer.test.ts (13): append, rotation, close, auto-create dirs, write failure
log-reader.test.ts (42): all filters, corrupted lines, readLogsFromDir, tailLogs
session-report-card.test.ts (24): transitions, CI/review counting, outcomes, duration
retrospective.test.ts (15): save/load, filters, corrupted JSON, round-trip
cache.test.ts (7 new): getStats hit/miss tracking, hit rate, size

Bugbot Fixes

Background mode log capture uses file descriptors instead of pipes (survives parent exit)
Fixed visibilitychange listener leak in client-logger
Removed dead code (with-timing.ts, unused getRequestStats)
Fixed port conflict detection logic in dashboard status
Implemented --since option in dashboard logs subcommand

Test plan

pnpm test — all 927 tests pass
pnpm typecheck — 0 errors
pnpm lint — 0 errors
ao start — dashboard logs captured to JSONL
ao logs dashboard --tail 20 — shows recent output
ao dashboard restart --clean --wait — kills, cleans .next, restarts
ao perf routes — per-route p50/p95/p99
ao retro list — shows retrospectives
Dashboard /logs and /perf pages render

🤖 Generated with Claude Code


cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.



cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.



…retrospectives

Add comprehensive observability infrastructure for ao to dog-food itself:

Core log infrastructure:
- JSONL log writer with size-based rotation (log-writer.ts)
- Log query/filter utilities for reading rotated files (log-reader.ts)
- getLogsDir() and getRetrospectivesDir() path utilities

Event persistence:
- Lifecycle manager writes state transitions to events.jsonl
- Auto-generates retrospectives on session merge/kill

Dashboard log capture:
- `ao start -b` background mode with log piping to files
- PID file for reliable `ao stop`
- Browser-side error/perf capture (ClientLogger, /api/client-logs)

API performance monitoring:
- Request timing with breakdown (serviceInit, sessionList, prEnrichment)
- Cache hit/miss tracking on TTLCache
- Enrichment timing instrumentation in enrichSessionPR()
- withTiming() HOF for route instrumentation

Session retrospectives:
- Report card generation from event logs (CI attempts, review rounds)
- Retrospective with timeline, metrics, heuristic lessons
- Auto-save on session completion

CLI commands:
- `ao logs dashboard|events|session <id>`: query structured logs
- `ao retro list|show|generate`: manage retrospectives
- `ao perf routes|slow|cache|enrichment`: API performance analysis

Dashboard pages:
- /logs: filterable log viewer with auto-refresh
- /perf: route stats table, slow requests, cache effectiveness
- Nav links added to main dashboard header

Environment:
- AO_PROJECT_ID and AO_LOG_DIR env vars in spawned sessions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Refactors `ao dashboard` from a flat start command into a command group:
- `ao dashboard` (default): starts with PID tracking + log capture
- `ao dashboard restart [--clean]`: kill + restart, optionally wipe .next
- `ao dashboard status`: show process state, port, .next cache age, log info
- `ao dashboard logs`: tail dashboard.jsonl logs

Process awareness: always writes dashboard.pid, validates PID on read, cleans up stale PID files automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
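
The "validates PID on read, cleans up stale PID files" behavior can be sketched as follows. This is an illustration, not the PR's implementation: `isProcessAlive` and `readValidatedPid` are hypothetical names, and the only real API relied on is Node's `process.kill(pid, 0)`, which tests for a process without sending a signal.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Signal 0 performs an existence/permission check only.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Hypothetical helper: returns the PID only if the file holds a live process ID,
// and deletes the file when it is stale or malformed.
function readValidatedPid(pidFile: string): number | null {
  let raw: string;
  try {
    raw = fs.readFileSync(pidFile, "utf8");
  } catch {
    return null; // no PID file at all
  }
  const text = raw.trim();
  const pid = /^\d+$/.test(text) ? Number(text) : NaN;
  if (Number.isSafeInteger(pid) && pid > 0 && isProcessAlive(pid)) return pid;
  fs.rmSync(pidFile, { force: true }); // stale or garbage: clean up
  return null;
}

// Usage: our own PID is valid; garbage content is treated as stale.
const pidDir = fs.mkdtempSync(path.join(os.tmpdir(), "ao-pid-"));
const livePidFile = path.join(pidDir, "dashboard.pid");
fs.writeFileSync(livePidFile, String(process.pid));
console.log(readValidatedPid(livePidFile) === process.pid);

const stalePidFile = path.join(pidDir, "stale.pid");
fs.writeFileSync(stalePidFile, "not-a-pid");
console.log(readValidatedPid(stalePidFile), fs.existsSync(stalePidFile));
```

A PID check alone cannot tell whether the live process is still the dashboard, since the OS may reuse PIDs; that is presumably why the status command also looks at the port.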

Adds `packages/core/src/dashboard-manager.ts` with functions callable by the orchestrator agent or lifecycle manager without going through CLI:
- restartDashboard(opts): kill → clean .next → spawn detached → PID file
- waitForHealthy(port): poll until dashboard responds to HTTP
- getDashboardStatus(logDir, port): check if running via PID file or port
- stopDashboard(logDir, port): kill without restarting

The CLI `ao dashboard restart` now delegates to restartDashboard() from core, adding --wait flag to optionally block until healthy.

All exported from @composio/ao-core for programmatic use:

import { restartDashboard, waitForHealthy } from "@composio/ao-core";

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ache stats

Covers log-writer (13 tests), log-reader (42 tests), session-report-card (24 tests), retrospective (15 tests), and cache getStats (7 tests). All 927 tests pass across the full suite.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Orchestrator prompt: add ao logs, ao perf, ao retro, ao dashboard management commands; add debugging/monitoring workflows; fix non-existent `ao session attach` → `tmux attach -t`
- Child agent prompt: document AO_PROJECT_ID/AO_LOG_DIR env vars and JSONL log file locations for self-debugging
- CLAUDE.md: add Logging & Observability section with log file table, key module index, LogEntry schema, dashboard pages, debugging tips, and CLI command reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implement `getAttachInfo` on SessionManager interface, delegating to the runtime plugin's `getAttachInfo()` method. CLI command dispatches on attach type (tmux/docker/ssh/web) for proper runtime-agnostic session attachment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move data types (LogEntry, LogQueryOptions, SessionReportCard, Retrospective) to types.ts as canonical contracts.

Add pluggable service interfaces:
- EventLogger: swap JSONL writer for database/cloud logging
- EventLogReader: swap JSONL reader for custom query backend
- RetrospectiveStore: swap JSON file storage for database/API

LogWriter now implements EventLogger. LifecycleManagerDeps accepts EventLogger interface instead of hardcoded logDir string. JsonlRetrospectiveStore provides default file-based implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix FD leak in dashboard-manager.ts and start.ts: close parent FDs after spawn (child gets copies via dup2)
- Fix retrospective sessionId filter: use `startsWith(id + "-")` to prevent matching overlapping prefixes (e.g., myapp-1 vs myapp-10)
- Fix duplicate "Process:" output in `ao dashboard status`: conflict and stale states no longer reset to "not running"
- Deduplicate parseSinceArg: move to shared lib/format.ts, remove copies from dashboard.ts, logs.ts, and perf.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
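
The sessionId filter fix is easy to see in isolation. The sketch below assumes retrospective files are named `<sessionId>-<timestamp>.json`; that naming scheme and the helper `matchesSession` are hypothetical, while the `startsWith(id + "-")` check is the one the commit describes.

```typescript
// Hypothetical helper: true when the file belongs to exactly this session.
function matchesSession(fileName: string, sessionId: string): boolean {
  // Requiring the trailing "-" stops "myapp-1" from also matching "myapp-10-...".
  return fileName.startsWith(sessionId + "-");
}

// Assumed file naming, for illustration only.
const retroFiles = ["myapp-1-2026-02-18.json", "myapp-10-2026-02-18.json"];

// A bare prefix check matches both sessions.
const naiveMatches = retroFiles.filter((f) => f.startsWith("myapp-1"));

// The delimiter-aware check matches only the intended one.
const strictMatches = retroFiles.filter((f) => matchesSession(f, "myapp-1"));

console.log(naiveMatches.length, strictMatches);
```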

Move shared project-directory resolution into core helpers `resolveProjectLogDir()` and `resolveProjectRetroDir()` in paths.ts, replacing 6 duplicate implementations across perf.ts, logs.ts, dashboard.ts, retrospective.ts, and the web API routes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ensive tests

- Remove unused EventLogReader interface from types.ts
- Extract percentile() and normalizeRoutePath() to core/utils.ts (was duplicated in CLI + web)
- Create shared perf-utils.ts for CLI commands (resolveLogDir, loadRequests, ParsedRequest)
- Rewrite client-logs route with runtime type-guard validation instead of unsafe `as` cast
- Add 76 new tests: core utils (14), CLI perf-utils (8), CLI logs (11), CLI perf (14), CLI retrospective (14), web observability routes (15)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
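
As an illustration of "runtime type-guard validation instead of unsafe `as` cast", here is a minimal sketch. The payload shape `ClientLogPayload` and the helpers `isClientLogPayload` and `parseClientLog` are assumptions; the actual fields accepted by the client-logs route are not listed in this PR text.

```typescript
// Hypothetical payload shape for a browser-side log entry.
interface ClientLogPayload {
  level: "error" | "warn" | "info";
  message: string;
  url?: string;
}

// User-defined type guard: checks every field at runtime before narrowing.
function isClientLogPayload(value: unknown): value is ClientLogPayload {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>; // widening only, nothing is assumed
  const level = v["level"];
  const levelOk = level === "error" || level === "warn" || level === "info";
  const urlOk = v["url"] === undefined || typeof v["url"] === "string";
  return levelOk && typeof v["message"] === "string" && urlOk;
}

// Parse a request body; return null instead of trusting unknown JSON.
function parseClientLog(body: string): ClientLogPayload | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return null; // not JSON at all
  }
  return isClientLogPayload(parsed) ? parsed : null;
}

const accepted = parseClientLog('{"level":"error","message":"boom","url":"/logs"}');
const rejected = parseClientLog('{"level":"fatal","message":42}');
console.log(accepted, rejected);
```

With a plain `as ClientLogPayload` cast, the second body would flow into the log writer unchecked; the guard turns that into an explicit rejection the route can answer with a 400-style error.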

- Export readPidFile/writePidFile/removePidFile from core dashboard-manager
- Replace duplicated PID functions in CLI dashboard.ts with core imports
- Fix duplicate import lint error in perf-utils.test.ts
- Fix formatAge "X ago old" → "(X ago)" in dashboard status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Deduplicate identical log formatting functions that existed independently in both dashboard.ts and logs.ts. The canonical version (from logs.ts, which includes sessionId display) now lives in lib/format.ts.

Also fixes the --since option in `ao dashboard logs` which was declared but silently ignored (imports were unused).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… test lint

- dashboard-manager.ts: fix NEXT_PUBLIC_TERMINAL_PORT defaults from 3001/3003 to 14800/14801 to match the canonical defaults in web-dir.ts
- log-reader.ts: extract ApiLogEntry type and parseApiLogs() function to core so CLI and web packages share one implementation instead of two
- perf-utils.ts: loadRequests() delegates to core parseApiLogs() (no more duplicated parsing loop)
- perf-utils.test.ts: update mock to include parseApiLogs delegating to mockReadLogsFromDir; add sessionId to toEqual expectation
- client-logger.test.ts: remove useless constructor from ThrowingObserver class
- request-logger.test.ts: replace forbidden typeof import() type annotations with proper import type at the top of the file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Merge duplicate imports from ../dashboard-manager.js into one statement using inline `type` modifier (no-duplicate-imports rule)
- Replace `Function` type with a typed callback alias (no-unsafe-function-type)
- Rename unused `args` parameter to `_args` in process.kill spy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

….ts PID utilities

- dashboard.ts: set running=false/pid=null after printing conflict message to prevent a second "Process: running" line appearing below it
- perf.ts: output {} as JSON when no requests found (--json flag respected)
- logs.ts: remove early plain-text return from `ao logs session` so printEntries() handles empty+json case correctly
- start.ts: replace inline readFileSync/writeFileSync/unlinkSync PID logic with readPidFile/writePidFile/removePidFile from @composio/ao-core

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ctness

- request-logger.ts: import percentile/normalizeRoutePath from @composio/ao-core instead of reimplementing them locally (single source of truth)
- retrospective.ts: replace magic numbers with named constants (CI_FAILURE_HIGH, CI_FAILURE_LOW, REVIEW_ROUNDS_HIGH, SESSION_LONG_HOURS, SESSION_QUICK_HOURS) for self-documenting thresholds
- format.test.ts: expand from 9 to 52 tests covering all exported functions (parseSinceArg, formatMs, padCol, colorLevel, formatLogEntry, ciStatusIcon, reviewDecisionIcon, activityIcon)
- perf.test.ts: add sessionId field to makeRequest, add empty JSON routes test
- logs.test.ts: fix stale assertion ("No events found" → "No log entries found.")
- request-logger.test.ts: add percentile/normalizeRoutePath to all vi.doMock calls so re-imported module resolves all its @composio/ao-core dependencies

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ication

Three copies of the same route-grouping + percentile stats logic existed in:
- web getRequestStats (request-logger.ts)
- web GET /api/perf (perf/route.ts inline)
- CLI ao perf routes (perf.ts inline)

Also, RequestLog in web was a duplicate of ApiLogEntry in core.

Changes:
- Add computeApiStats(), RouteStats, ApiPerfResult to packages/core/src/log-reader.ts
- Export from @composio/ao-core
- Rewrite web/request-logger.ts as thin delegation: re-export types from core, getRequestStats() = parseApiLogs() + computeApiStats()
- Simplify web/perf/route.ts to use getRequestStats() (removes 80-line inline loop)
- Update CLI perf.ts to use computeApiStats() (removes inline grouping)
- Update with-timing.ts to use ApiLogEntry instead of local RequestLog
- Add parseApiLogs + computeApiStats tests to core log-reader.test.ts
- Simplify web request-logger.test.ts to a 3-test delegation contract
- Update observability-routes.test.ts mock to include parseApiLogs/computeApiStats
- Update perf.test.ts mock to include computeApiStats with inline implementation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
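
The "route-grouping + percentile stats" logic that was consolidated can be sketched roughly as below. This is not `computeApiStats()` itself: the types `TimedRequest` and `RouteSummary`, the helpers `nearestRank` and `summarizeByRoute`, and the choice of the nearest-rank percentile definition are all assumptions for illustration.

```typescript
// Hypothetical minimal request record.
interface TimedRequest {
  route: string;
  durationMs: number;
}

interface RouteSummary {
  count: number;
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile over an ascending array (assumed definition).
function nearestRank(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return 0;
  const rank = Math.ceil((p * sortedAsc.length) / 100);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  return sortedAsc[index] ?? 0;
}

// Group durations by route, then compute per-route percentiles.
function summarizeByRoute(requests: TimedRequest[]): Map<string, RouteSummary> {
  const byRoute = new Map<string, number[]>();
  for (const r of requests) {
    const list = byRoute.get(r.route) ?? [];
    list.push(r.durationMs);
    byRoute.set(r.route, list);
  }
  const summary = new Map<string, RouteSummary>();
  byRoute.forEach((durations, route) => {
    const sorted = durations.slice().sort((a, b) => a - b);
    summary.set(route, {
      count: sorted.length,
      p50: nearestRank(sorted, 50),
      p95: nearestRank(sorted, 95),
      p99: nearestRank(sorted, 99),
    });
  });
  return summary;
}

// Usage: 100 requests of 1..100 ms on one route, a single request on another.
const sampleRequests: TimedRequest[] = [];
for (let ms = 100; ms >= 1; ms--) {
  sampleRequests.push({ route: "/api/sessions", durationMs: ms });
}
sampleRequests.push({ route: "/api/perf", durationMs: 7 });
const routeSummary = summarizeByRoute(sampleRequests);
console.log(routeSummary.get("/api/sessions"), routeSummary.get("/api/perf"));
```

Keeping the grouping and the percentile function pure (no file I/O) is what lets the CLI and the web route share one implementation and test it without mocks.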

Two fixes:

1. perf/route.ts was importing request-logger.js via relative path with .js extension. Next.js webpack can't resolve this; use the @/lib/request-logger alias like all other API routes do. Fixes CI Typecheck and Fresh Onboarding.
2. perf.test.ts and observability-routes.test.ts contained ~100 lines of inline mock implementations duplicating computeApiStats and parseApiLogs from core. These are tested in log-reader.test.ts. Removed the duplication:
   - perf.test.ts: drop @composio/ao-core mock entirely; use real pure fns
   - observability-routes.test.ts: mock request-logger.js at the module boundary (getRequestStats), replace 9 behavioral tests with 7 focused HTTP-contract tests (JSON shape, param forwarding, error handling)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The visibilitychange handler was registered with an anonymous arrow function, making it impossible to remove via removeEventListener. Extracted it to a named variable (onVisibilityChange) and added proper cleanup, matching the pattern already used for onError / onRejection / onUnload.

Added two new tests in the cleanup suite:
- removes visibilitychange event listener
- no longer flushes on visibilitychange after cleanup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
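
The listener-leak fix follows a common pattern: keep a reference to the handler so the same function can be passed to `removeEventListener`. The sketch below is an assumption-laden illustration, not the client-logger code: `installFlushOnHide` and its `isHidden`/`flush` parameters are hypothetical, and a plain `EventTarget` stands in for `document` so the example runs outside a browser.

```typescript
// Registers a visibilitychange handler and returns a cleanup function.
function installFlushOnHide(
  target: EventTarget,
  isHidden: () => boolean,
  flush: () => void,
): () => void {
  // Named reference: the exact same function must be passed to removeEventListener.
  const onVisibilityChange = (): void => {
    if (isHidden()) flush();
  };
  target.addEventListener("visibilitychange", onVisibilityChange);
  return () => target.removeEventListener("visibilitychange", onVisibilityChange);
}

// Usage with a stand-in for `document`.
let flushCount = 0;
const fakeDocument = new EventTarget();
const cleanupListener = installFlushOnHide(fakeDocument, () => true, () => {
  flushCount++;
});

fakeDocument.dispatchEvent(new Event("visibilitychange")); // handler runs, flushes once
cleanupListener();
fakeDocument.dispatchEvent(new Event("visibilitychange")); // handler removed, no flush
console.log(flushCount);
```

In a browser the call would look like `installFlushOnHide(document, () => document.visibilityState === "hidden", flush)`. Had the handler been passed inline as an anonymous arrow function, the cleanup call would have received a different function object and removed nothing.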