feat: self-improvement system — logging, perf monitoring, retrospectives#108
Open
AgentWrapper wants to merge 21 commits into main
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
…retrospectives
Add comprehensive observability infrastructure for ao to dog-food itself:
Core log infrastructure:
- JSONL log writer with size-based rotation (log-writer.ts)
- Log query/filter utilities for reading rotated files (log-reader.ts)
- getLogsDir() and getRetrospectivesDir() path utilities
Event persistence:
- Lifecycle manager writes state transitions to events.jsonl
- Auto-generates retrospectives on session merge/kill
Dashboard log capture:
- `ao start -b` background mode with log piping to files
- PID file for reliable `ao stop`
- Browser-side error/perf capture (ClientLogger, /api/client-logs)
API performance monitoring:
- Request timing with breakdown (serviceInit, sessionList, prEnrichment)
- Cache hit/miss tracking on TTLCache
- Enrichment timing instrumentation in enrichSessionPR()
- withTiming() HOF for route instrumentation
Session retrospectives:
- Report card generation from event logs (CI attempts, review rounds)
- Retrospective with timeline, metrics, heuristic lessons
- Auto-save on session completion
CLI commands:
- `ao logs dashboard|events|session <id>`: query structured logs
- `ao retro list|show|generate`: manage retrospectives
- `ao perf routes|slow|cache|enrichment`: API performance analysis
Dashboard pages:
- /logs: filterable log viewer with auto-refresh
- /perf: route stats table, slow requests, cache effectiveness
- Nav links added to main dashboard header
Environment:
- AO_PROJECT_ID and AO_LOG_DIR env vars in spawned sessions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
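The log-reader behaviour described above (reading across rotated files, filtering, and tolerating damaged lines) could look roughly like the sketch below. It is an illustration only: `EntrySketch`, `QueryOptsSketch`, `parseJsonl`, and `queryRotated` are hypothetical names, the entry fields are assumed because this page does not list the real `LogEntry` schema, and file contents are passed in as strings so the example stays free of filesystem calls.

```typescript
// Hypothetical entry shape; the real LogEntry fields are not shown in this PR text.
interface EntrySketch {
  ts: string; // ISO-8601 timestamp
  level: string;
  message: string;
}

interface QueryOptsSketch {
  level?: string;
  since?: Date;
  tail?: number;
}

// Parse one JSONL file body, skipping blank, corrupted, or incomplete lines.
function parseJsonl(text: string): EntrySketch[] {
  const out: EntrySketch[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // e.g. a line truncated by a crash mid-write
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const rec = parsed as Record<string, unknown>;
    const ts = rec["ts"];
    const level = rec["level"];
    const message = rec["message"];
    if (typeof ts === "string" && typeof level === "string" && typeof message === "string") {
      out.push({ ts, level, message });
    }
  }
  return out;
}

// Merge rotated files (oldest first, active file last), then filter and tail.
function queryRotated(filesOldestFirst: string[], opts: QueryOptsSketch = {}): EntrySketch[] {
  let merged: EntrySketch[] = [];
  for (const text of filesOldestFirst) merged = merged.concat(parseJsonl(text));
  const level = opts.level;
  const since = opts.since;
  let result = merged.filter((e) => {
    if (level !== undefined && e.level !== level) return false;
    if (since !== undefined) {
      const t = Date.parse(e.ts);
      if (Number.isNaN(t) || t < since.getTime()) return false;
    }
    return true;
  });
  if (opts.tail !== undefined) {
    result = opts.tail > 0 ? result.slice(-opts.tail) : [];
  }
  return result;
}
```

Skipping unparseable lines rather than throwing keeps a single damaged line from hiding the rest of a log file, which matches the "corrupted lines" cases the test list mentions.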
Refactors `ao dashboard` from a flat start command into a command group:
- `ao dashboard` (default): starts with PID tracking + log capture
- `ao dashboard restart [--clean]`: kill + restart, optionally wipe .next
- `ao dashboard status`: show process state, port, .next cache age, log info
- `ao dashboard logs`: tail dashboard.jsonl logs
Process awareness: always writes dashboard.pid, validates PID on read, cleans up stale PID files automatically.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
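The "validates PID on read, cleans up stale PID files" behaviour can be pictured as a small classification step. The sketch below is a guess at the shape, not the code in this PR: `PidStatus`, `parsePid`, `checkPidFile`, and `shouldRemovePidFile` are hypothetical names, and the liveness probe is injected so the logic can run without touching real processes.

```typescript
type PidStatus =
  | { state: "missing" }
  | { state: "invalid" }
  | { state: "stale"; pid: number }
  | { state: "running"; pid: number };

// Accept only a bare positive integer; anything else is treated as a corrupt PID file.
function parsePid(content: string): number | null {
  const trimmed = content.trim();
  if (!/^[1-9]\d*$/.test(trimmed)) return null;
  const pid = Number(trimmed);
  return Number.isSafeInteger(pid) ? pid : null;
}

// `content` is the PID file body, or null when the file does not exist.
// `isAlive` is injected for testability; in Node a common probe is
// process.kill(pid, 0), which throws if no such process exists.
function checkPidFile(content: string | null, isAlive: (pid: number) => boolean): PidStatus {
  if (content === null) return { state: "missing" };
  const pid = parsePid(content);
  if (pid === null) return { state: "invalid" };
  return isAlive(pid) ? { state: "running", pid } : { state: "stale", pid };
}

// Only stale or invalid files are candidates for automatic removal.
function shouldRemovePidFile(status: PidStatus): boolean {
  return status.state === "stale" || status.state === "invalid";
}
```

Separating "classify" from "remove" keeps the destructive step explicit, so a status command can report a stale file without necessarily deleting it.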
Adds `packages/core/src/dashboard-manager.ts` with functions callable
by the orchestrator agent or lifecycle manager without going through CLI:
- restartDashboard(opts) — kill → clean .next → spawn detached → PID file
- waitForHealthy(port) — poll until dashboard responds to HTTP
- getDashboardStatus(logDir, port) — check if running via PID file or port
- stopDashboard(logDir, port) — kill without restarting
The CLI `ao dashboard restart` now delegates to restartDashboard() from
core, adding --wait flag to optionally block until healthy.
All exported from @composio/ao-core for programmatic use:
import { restartDashboard, waitForHealthy } from "@composio/ao-core";
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
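The "poll until dashboard responds to HTTP" step is essentially a deadline-bounded retry loop. A minimal sketch under stated assumptions follows: `pollUntilHealthy` and `PollOptions` are hypothetical names rather than the exports of `@composio/ao-core`, and the probe, clock, and sleep are injected, so the real `waitForHealthy(port)` may differ in signature and defaults.

```typescript
interface PollOptions {
  timeoutMs: number;
  intervalMs: number;
  now?: () => number; // injectable clock, defaults to Date.now
  sleep?: (ms: number) => Promise<void>; // injectable delay, defaults to setTimeout
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Call `probe` repeatedly until it reports healthy or the deadline passes.
// A probe that rejects (e.g. connection refused while the server boots)
// is treated as "not healthy yet" rather than as a fatal error.
async function pollUntilHealthy(
  probe: () => Promise<boolean>,
  opts: PollOptions,
): Promise<boolean> {
  const now = opts.now ?? Date.now;
  const sleep = opts.sleep ?? defaultSleep;
  const deadline = now() + opts.timeoutMs;
  while (true) {
    try {
      if (await probe()) return true;
    } catch {
      // keep polling
    }
    if (now() >= deadline) return false;
    await sleep(opts.intervalMs);
  }
}
```

In real use the probe would presumably be an HTTP request against the dashboard port that resolves to `true` on a successful response; the exact URL and success criterion are not stated in this PR text.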
…ache stats
Covers log-writer (13 tests), log-reader (42 tests), session-report-card (24 tests), retrospective (15 tests), and cache getStats (7 tests). All 927 tests pass across the full suite.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
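For readers unfamiliar with what a `getStats` hit/miss test exercises, here is a small self-contained sketch of a TTL cache that counts lookups. It is not the project's `TTLCache`: `TtlCacheSketch` and `CacheStatsSketch` are hypothetical, and details such as counting an expired entry as a miss are assumptions.

```typescript
interface CacheStatsSketch {
  hits: number;
  misses: number;
  hitRate: number; // hits / (hits + misses); 0 when nothing has been looked up
  size: number;
}

class TtlCacheSketch<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;
  private hits = 0;
  private misses = 0;

  // The clock is injectable so expiry can be tested without real waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= this.now()) {
      // Assumption: an expired entry is evicted and counted as a miss.
      if (entry !== undefined) this.entries.delete(key);
      this.misses++;
      return undefined;
    }
    this.hits++;
    return entry.value;
  }

  getStats(): CacheStatsSketch {
    const total = this.hits + this.misses;
    return {
      hits: this.hits,
      misses: this.misses,
      hitRate: total === 0 ? 0 : this.hits / total,
      size: this.entries.size,
    };
  }
}
```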
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Orchestrator prompt: add ao logs, ao perf, ao retro, ao dashboard management commands; add debugging/monitoring workflows; fix non-existent `ao session attach` → `tmux attach -t`
- Child agent prompt: document AO_PROJECT_ID/AO_LOG_DIR env vars and JSONL log file locations for self-debugging
- CLAUDE.md: add Logging & Observability section with log file table, key module index, LogEntry schema, dashboard pages, debugging tips, and CLI command reference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement `getAttachInfo` on SessionManager interface, delegating to the runtime plugin's `getAttachInfo()` method. CLI command dispatches on attach type (tmux/docker/ssh/web) for proper runtime-agnostic session attachment.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move data types (LogEntry, LogQueryOptions, SessionReportCard, Retrospective) to types.ts as canonical contracts. Add pluggable service interfaces:
- EventLogger: swap JSONL writer for database/cloud logging
- EventLogReader: swap JSONL reader for custom query backend
- RetrospectiveStore: swap JSON file storage for database/API
LogWriter now implements EventLogger. LifecycleManagerDeps accepts EventLogger interface instead of hardcoded logDir string. JsonlRetrospectiveStore provides default file-based implementation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
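The point of accepting an interface instead of a `logDir` string is that the consumer no longer cares where events end up. The sketch below illustrates that idea with two interchangeable backends. Every name in it (`EventSketch`, `EventLoggerSketch`, `JsonlLoggerSketch`, `MemoryLoggerSketch`, `recordTransition`) and the state names in the usage are hypothetical; the real `EventLogger` contract may expose different methods.

```typescript
// Hypothetical minimal contracts for illustration only.
interface EventSketch {
  sessionId: string;
  from: string;
  to: string;
  ts: string;
}

interface EventLoggerSketch {
  append(entry: EventSketch): void;
  close(): void;
}

// Backend 1: serialise each event as one JSON line and hand it to a writer
// function (in a real system this might append to a file).
class JsonlLoggerSketch implements EventLoggerSketch {
  private readonly writeLine: (line: string) => void;
  constructor(writeLine: (line: string) => void) {
    this.writeLine = writeLine;
  }
  append(entry: EventSketch): void {
    this.writeLine(JSON.stringify(entry) + "\n");
  }
  close(): void {}
}

// Backend 2: keep events in memory, convenient for unit tests.
class MemoryLoggerSketch implements EventLoggerSketch {
  readonly entries: EventSketch[] = [];
  append(entry: EventSketch): void {
    this.entries.push(entry);
  }
  close(): void {}
}

// The consumer depends only on the interface, never on a concrete backend.
function recordTransition(
  deps: { eventLogger: EventLoggerSketch; now?: () => Date },
  sessionId: string,
  from: string,
  to: string,
): void {
  const now = deps.now ?? (() => new Date());
  deps.eventLogger.append({ sessionId, from, to, ts: now().toISOString() });
}
```

Swapping `MemoryLoggerSketch` for `JsonlLoggerSketch` requires no change to `recordTransition`, which is the property the commit describes for database or cloud backends.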
- Fix FD leak in dashboard-manager.ts and start.ts: close parent FDs after spawn (child gets copies via dup2)
- Fix retrospective sessionId filter: use `startsWith(id + "-")` to prevent matching overlapping prefixes (e.g., myapp-1 vs myapp-10)
- Fix duplicate "Process:" output in `ao dashboard status`: conflict and stale states no longer reset to "not running"
- Deduplicate parseSinceArg: move to shared lib/format.ts, remove copies from dashboard.ts, logs.ts, and perf.ts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
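The sessionId prefix fix is easiest to see side by side. The file names below are made up for illustration (the real retrospective naming scheme is not shown here), and `matchesNaive` / `matchesSession` are hypothetical helper names.

```typescript
// Naive filter: "myapp-10-..." also starts with "myapp-1", so it over-matches.
function matchesNaive(fileName: string, sessionId: string): boolean {
  return fileName.startsWith(sessionId);
}

// Fixed filter: require the separator immediately after the id.
function matchesSession(fileName: string, sessionId: string): boolean {
  return fileName.startsWith(sessionId + "-");
}

// Hypothetical retrospective file names.
const retroFiles = [
  "myapp-1-2024-01-01.json",
  "myapp-10-2024-01-02.json",
  "myapp-1-2024-01-03.json",
];

const naiveMatches = retroFiles.filter((f) => matchesNaive(f, "myapp-1"));
const fixedMatches = retroFiles.filter((f) => matchesSession(f, "myapp-1"));

console.log(naiveMatches.length); // 3: includes the myapp-10 file by mistake
console.log(fixedMatches.length); // 2: only files for session myapp-1
```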
Move shared project-directory resolution into core helpers `resolveProjectLogDir()` and `resolveProjectRetroDir()` in paths.ts, replacing 6 duplicate implementations across perf.ts, logs.ts, dashboard.ts, retrospective.ts, and the web API routes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ensive tests
- Remove unused EventLogReader interface from types.ts
- Extract percentile() and normalizeRoutePath() to core/utils.ts (was duplicated in CLI + web)
- Create shared perf-utils.ts for CLI commands (resolveLogDir, loadRequests, ParsedRequest)
- Rewrite client-logs route with runtime type-guard validation instead of unsafe `as` cast
- Add 76 new tests: core utils (14), CLI perf-utils (8), CLI logs (11), CLI perf (14), CLI retrospective (14), web observability routes (15)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
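Replacing an `as` cast with a runtime type guard means untrusted request bodies are checked field by field before they are treated as typed values. The sketch below shows the general pattern; `ClientLogSketch`, `isClientLog`, and `validateBatch` are hypothetical, and the payload fields are assumed because the real client-logs schema is not shown on this page.

```typescript
// Hypothetical payload shape for a browser-side log entry.
interface ClientLogSketch {
  level: "error" | "warn" | "info";
  message: string;
  url?: string;
}

// User-defined type guard: returns true only when every field has the expected
// runtime type, so callers get a narrowed type without an unchecked cast.
function isClientLog(value: unknown): value is ClientLogSketch {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const level = v["level"];
  if (level !== "error" && level !== "warn" && level !== "info") return false;
  if (typeof v["message"] !== "string") return false;
  if (v["url"] !== undefined && typeof v["url"] !== "string") return false;
  return true;
}

// Validate an untrusted request body. Returns null when the body is not an
// array at all (a route handler would likely answer with a 4xx in that case).
function validateBatch(
  body: unknown,
): { accepted: ClientLogSketch[]; rejected: number } | null {
  if (!Array.isArray(body)) return null;
  const accepted: ClientLogSketch[] = [];
  let rejected = 0;
  for (const item of body as unknown[]) {
    if (isClientLog(item)) accepted.push(item);
    else rejected++;
  }
  return { accepted, rejected };
}
```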
- Export readPidFile/writePidFile/removePidFile from core dashboard-manager
- Replace duplicated PID functions in CLI dashboard.ts with core imports
- Fix duplicate import lint error in perf-utils.test.ts
- Fix formatAge "X ago old" → "(X ago)" in dashboard status
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deduplicate identical log formatting functions that existed independently in both dashboard.ts and logs.ts. The canonical version (from logs.ts, which includes sessionId display) now lives in lib/format.ts.
Also fixes the --since option in `ao dashboard logs` which was declared but silently ignored (imports were unused).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… test lint
- dashboard-manager.ts: fix NEXT_PUBLIC_TERMINAL_PORT defaults from 3001/3003 to 14800/14801 to match the canonical defaults in web-dir.ts
- log-reader.ts: extract ApiLogEntry type and parseApiLogs() function to core so CLI and web packages share one implementation instead of two
- perf-utils.ts: loadRequests() delegates to core parseApiLogs() (no more duplicated parsing loop)
- perf-utils.test.ts: update mock to include parseApiLogs delegating to mockReadLogsFromDir; add sessionId to toEqual expectation
- client-logger.test.ts: remove useless constructor from ThrowingObserver class
- request-logger.test.ts: replace forbidden typeof import() type annotations with proper import type at the top of the file
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Merge duplicate imports from ../dashboard-manager.js into one statement using inline `type` modifier (no-duplicate-imports rule)
- Replace `Function` type with a typed callback alias (no-unsafe-function-type)
- Rename unused `args` parameter to `_args` in process.kill spy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
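As a brief illustration of the `Function` replacement, a typed callback alias states the expected call shape, so both the spy and its callers are checked by the compiler. `KillFn` and `makeKillSpy` are hypothetical names; the signature is modelled loosely on how `process.kill` is usually called and is not taken from the test file.

```typescript
// Before (flagged by no-unsafe-function-type): `Function` accepts any callable
// and erases argument and return types.
// type KillSpy = Function;

// After: an explicit alias documents the arguments and the return value.
type KillFn = (pid: number, signal?: string | number) => boolean;

type KillCall = [number, string | number | undefined];

// Hypothetical spy factory: records each call and returns a fixed result.
function makeKillSpy(result: boolean): { fn: KillFn; calls: KillCall[] } {
  const calls: KillCall[] = [];
  const fn: KillFn = (pid, signal) => {
    calls.push([pid, signal]);
    return result;
  };
  return { fn, calls };
}
```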
….ts PID utilities
- dashboard.ts: set running=false/pid=null after printing conflict message
to prevent a second "Process: running" line appearing below it
- perf.ts: output {} as JSON when no requests found (--json flag respected)
- logs.ts: remove early plain-text return from `ao logs session` so
printEntries() handles empty+json case correctly
- start.ts: replace inline readFileSync/writeFileSync/unlinkSync PID logic
with readPidFile/writePidFile/removePidFile from @composio/ao-core
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ctness
- request-logger.ts: import percentile/normalizeRoutePath from @composio/ao-core
instead of reimplementing them locally (single source of truth)
- retrospective.ts: replace magic numbers with named constants
(CI_FAILURE_HIGH, CI_FAILURE_LOW, REVIEW_ROUNDS_HIGH, SESSION_LONG_HOURS,
SESSION_QUICK_HOURS) for self-documenting thresholds
- format.test.ts: expand from 9 to 52 tests covering all exported functions
(parseSinceArg, formatMs, padCol, colorLevel, formatLogEntry, ciStatusIcon,
reviewDecisionIcon, activityIcon)
- perf.test.ts: add sessionId field to makeRequest, add empty JSON routes test
- logs.test.ts: fix stale assertion ("No events found" → "No log entries found.")
- request-logger.test.ts: add percentile/normalizeRoutePath to all vi.doMock
calls so re-imported module resolves all its @composio/ao-core dependencies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ication
Three copies of the same route-grouping + percentile stats logic existed in:
- web getRequestStats (request-logger.ts)
- web GET /api/perf (perf/route.ts inline)
- CLI ao perf routes (perf.ts inline)
Also, RequestLog in web was a duplicate of ApiLogEntry in core.
Changes:
- Add computeApiStats(), RouteStats, ApiPerfResult to packages/core/src/log-reader.ts
- Export from @composio/ao-core
- Rewrite web/request-logger.ts as thin delegation: re-export types from core, getRequestStats() = parseApiLogs() + computeApiStats()
- Simplify web/perf/route.ts to use getRequestStats() (removes 80-line inline loop)
- Update CLI perf.ts to use computeApiStats() (removes inline grouping)
- Update with-timing.ts to use ApiLogEntry instead of local RequestLog
- Add parseApiLogs + computeApiStats tests to core log-reader.test.ts
- Simplify web request-logger.test.ts to a 3-test delegation contract
- Update observability-routes.test.ts mock to include parseApiLogs/computeApiStats
- Update perf.test.ts mock to include computeApiStats with inline implementation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
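The "route-grouping + percentile stats" logic that this commit consolidates can be sketched in a few lines. This is not the `computeApiStats()` from core: the names carry a `Sketch` suffix to mark them as hypothetical, the entry fields are assumed, and the nearest-rank percentile method is one common choice that the real helper may or may not use.

```typescript
// Hypothetical fields; the real ApiLogEntry in core may differ.
interface ApiEntrySketch {
  method: string;
  path: string;
  durationMs: number;
}

interface RouteStatsSketch {
  route: string;
  count: number;
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile over an ascending-sorted array.
function percentileSketch(sorted: number[], p: number): number {
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Group request durations by "METHOD path" and summarise each group.
function computeStatsSketch(entries: ApiEntrySketch[]): RouteStatsSketch[] {
  const byRoute = new Map<string, number[]>();
  for (const e of entries) {
    const key = `${e.method} ${e.path}`;
    const list = byRoute.get(key);
    if (list !== undefined) list.push(e.durationMs);
    else byRoute.set(key, [e.durationMs]);
  }
  const stats: RouteStatsSketch[] = [];
  byRoute.forEach((durations, route) => {
    const sorted = durations.slice().sort((a, b) => a - b);
    stats.push({
      route,
      count: sorted.length,
      p50: percentileSketch(sorted, 50),
      p95: percentileSketch(sorted, 95),
      p99: percentileSketch(sorted, 99),
    });
  });
  // Slowest routes first, by p95.
  return stats.sort((a, b) => b.p95 - a.p95);
}
```

Keeping one pure function like this in a shared package is what lets the web route and the CLI command become thin wrappers, as the change list describes.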
Two fixes:
1. perf/route.ts was importing request-logger.js via relative path with .js
extension. Next.js webpack can't resolve this — use @/lib/request-logger
alias like all other API routes do. Fixes CI Typecheck and Fresh Onboarding.
2. perf.test.ts and observability-routes.test.ts contained ~100 lines of
inline mock implementations duplicating computeApiStats and parseApiLogs
from core. These are tested in log-reader.test.ts. Removed the duplication:
- perf.test.ts: drop @composio/ao-core mock entirely; use real pure fns
- observability-routes.test.ts: mock request-logger.js at the module
boundary (getRequestStats), replace 9 behavioral tests with 7 focused
HTTP-contract tests (JSON shape, param forwarding, error handling)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The visibilitychange handler was registered with an anonymous arrow function, making it impossible to remove via removeEventListener. Extracted it to a named variable (onVisibilityChange) and added proper cleanup, matching the pattern already used for onError / onRejection / onUnload.
Added two new tests in the cleanup suite:
- removes visibilitychange event listener
- no longer flushes on visibilitychange after cleanup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
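The underlying rule is that `removeEventListener` only removes a listener when it is given the same function reference that was registered, which an inline arrow function cannot provide. A minimal sketch of the named-handler pattern follows; `ListenerTarget` and `installVisibilityFlush` are hypothetical, and the target is a small structural type standing in for `document` so the example also runs outside a browser.

```typescript
// Minimal structural stand-in for an event target such as `document`.
interface ListenerTarget {
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Registers a visibilitychange handler and returns a cleanup function.
// The handler is held in a named variable so the exact same reference can be
// passed to removeEventListener later.
function installVisibilityFlush(
  target: ListenerTarget,
  isHidden: () => boolean,
  flush: () => void,
): () => void {
  const onVisibilityChange = (): void => {
    if (isHidden()) flush();
  };
  target.addEventListener("visibilitychange", onVisibilityChange);
  return () => {
    target.removeEventListener("visibilitychange", onVisibilityChange);
  };
}
```

Returning the cleanup function from the installer keeps registration and removal next to each other, which makes a leak like the one fixed here harder to reintroduce.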
Summary
Full self-improvement system for ao: structured logging, API performance monitoring, session retrospectives, dashboard management, and comprehensive tests.
Core Infrastructure
- `log-writer.ts`: size-based rotation, crash-safe `appendFileSync`, configurable backups
- `log-reader.ts`: query/filter by time, level, source, session, pattern; reads across rotated files; tail support
- `session-report-card.ts`: per-session metrics (CI attempts, review rounds, outcome, duration)
- `retrospective.ts`: post-session analysis with timeline, metrics, and heuristic lessons
- `dashboard-manager.ts`: programmatic `restartDashboard()` for orchestrator agent to kill/clean/restart Next.js without CLI
- `getLogsDir()`, `getRetrospectivesDir()`
CLI Commands
- `ao logs dashboard|events|session`: query structured logs
- `ao perf routes|slow|cache|enrichment`: API performance analysis
- `ao retro list|show|generate`: session retrospective management
- `ao dashboard restart|status|logs`: dashboard process management with PID tracking
Dashboard (Web)
- `/logs` page with `LogViewer` component (filterable by source, level, time, session)
- `/perf` page with `PerfDashboard` component (route stats, slow requests, cache hit rates)
- `POST /api/client-logs`: browser-side error/perf ingestion
- `GET /api/logs`, `GET /api/perf`: log and performance query APIs
- `ClientLogger` component: captures `window.onerror`, `PerformanceObserver`, fetch timing
- `TTLCache.getStats()`
- `/api/sessions` route
- `serialize.ts`
Event Persistence & Lifecycle
- `events.jsonl` with full state transition data
- `AO_PROJECT_ID` and `AO_LOG_DIR` env vars to spawned sessions
Tests (101 new)
- `log-writer.test.ts` (13): append, rotation, close, auto-create dirs, write failure
- `log-reader.test.ts` (42): all filters, corrupted lines, readLogsFromDir, tailLogs
- `session-report-card.test.ts` (24): transitions, CI/review counting, outcomes, duration
- `retrospective.test.ts` (15): save/load, filters, corrupted JSON, round-trip
- `cache.test.ts` (7 new): getStats hit/miss tracking, hit rate, size
Bugbot Fixes
- `visibilitychange` listener leak in client-logger
- `with-timing.ts`, unused `getRequestStats`
- `--since` option in dashboard logs subcommand
Test plan
- `pnpm test`: all 927 tests pass
- `pnpm typecheck`: 0 errors
- `pnpm lint`: 0 errors
- `ao start`: dashboard logs captured to JSONL
- `ao logs dashboard --tail 20`: shows recent output
- `ao dashboard restart --clean --wait`: kills, cleans .next, restarts
- `ao perf routes`: per-route p50/p95/p99
- `ao retro list`: shows retrospectives
- `/logs` and `/perf` pages render
🤖 Generated with Claude Code