feat: self-improvement system — logging, perf monitoring, retrospectives #108

Open
AgentWrapper wants to merge 21 commits into main from session/ao-52

AgentWrapper commented Feb 18, 2026

Summary

Full self-improvement system for ao: structured logging, API performance monitoring, session retrospectives, dashboard management, and comprehensive tests.

Core Infrastructure

  • JSONL log writer (log-writer.ts): size-based rotation, crash-safe appendFileSync, configurable backups
  • Log reader (log-reader.ts): query/filter by time, level, source, session, pattern; reads across rotated files; tail support
  • Session report cards (session-report-card.ts): per-session metrics (CI attempts, review rounds, outcome, duration)
  • Retrospectives (retrospective.ts): post-session analysis with timeline, metrics, and heuristic lessons
  • Dashboard manager (dashboard-manager.ts): programmatic restartDashboard() so the orchestrator agent can kill, clean, and restart the Next.js dashboard without going through the CLI
  • Path utilities: getLogsDir(), getRetrospectivesDir()
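
To make the size-based rotation idea concrete, here is a minimal sketch of a JSONL appender that rotates before writing. It is not the code in log-writer.ts: the option names (`filePath`, `maxSizeBytes`, `maxBackups`), the `.1`/`.2` backup suffixes, and the helper names are assumptions chosen for illustration.

```typescript
import { appendFileSync, existsSync, mkdirSync, mkdtempSync, readFileSync, renameSync, statSync, unlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical options; the real log-writer.ts may use different names and defaults.
interface RotatingWriterOptions {
  filePath: string;
  maxSizeBytes: number;
  maxBackups: number;
}

// Shift file.1 -> file.2, ..., then file -> file.1, dropping the oldest backup.
function rotate({ filePath, maxBackups }: RotatingWriterOptions): void {
  if (maxBackups < 1) {
    unlinkSync(filePath); // no backups requested: just start over
    return;
  }
  const oldest = `${filePath}.${maxBackups}`;
  if (existsSync(oldest)) unlinkSync(oldest);
  for (let i = maxBackups - 1; i >= 1; i--) {
    const from = `${filePath}.${i}`;
    if (existsSync(from)) renameSync(from, `${filePath}.${i + 1}`);
  }
  renameSync(filePath, `${filePath}.1`);
}

// Append one JSON object per line; rotate first if the file reached the size limit.
function appendJsonl(opts: RotatingWriterOptions, entry: Record<string, unknown>): void {
  mkdirSync(dirname(opts.filePath), { recursive: true });
  if (existsSync(opts.filePath) && statSync(opts.filePath).size >= opts.maxSizeBytes) {
    rotate(opts);
  }
  // Synchronous append: the write is issued before the call returns.
  appendFileSync(opts.filePath, JSON.stringify(entry) + "\n");
}

// Usage: a tiny size limit forces a rotation on every write after the first.
const dir = mkdtempSync(join(tmpdir(), "ao-log-sketch-"));
const opts: RotatingWriterOptions = { filePath: join(dir, "events.jsonl"), maxSizeBytes: 10, maxBackups: 2 };
for (let i = 0; i < 4; i++) {
  appendJsonl(opts, { ts: i, level: "info", message: "state transition" });
}
const oldestKept = JSON.parse(readFileSync(`${opts.filePath}.2`, "utf8"));
```

After the four writes, the active file and two backups remain and the very first entry has been discarded, which is the trade-off a bounded backup count implies.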

CLI Commands

  • ao logs dashboard|events|session — query structured logs
  • ao perf routes|slow|cache|enrichment — API performance analysis
  • ao retro list|show|generate — session retrospective management
  • ao dashboard restart|status|logs — dashboard process management with PID tracking

Dashboard (Web)

  • /logs page with LogViewer component (filterable by source, level, time, session)
  • /perf page with PerfDashboard component (route stats, slow requests, cache hit rates)
  • POST /api/client-logs — browser-side error/perf ingestion
  • GET /api/logs, GET /api/perf — log and performance query APIs
  • ClientLogger component: captures window.onerror, PerformanceObserver, fetch timing
  • Cache hit/miss tracking in TTLCache.getStats()
  • API request timing in /api/sessions route
  • PR enrichment timing in serialize.ts
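
As a rough illustration of hit/miss tracking, the sketch below counts hits and misses inside `get()` and derives a hit rate in `getStats()`. The class name `TTLCacheSketch`, the stats field names, and the injectable clock are assumptions; the actual TTLCache in this PR may be structured differently.

```typescript
// Hypothetical stats shape; the real getStats() may return different fields.
interface CacheStatsSketch {
  hits: number;
  misses: number;
  hitRate: number;
  size: number;
}

class TTLCacheSketch<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private hits = 0;
  private misses = 0;
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable only to make the example deterministic.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      if (entry) this.entries.delete(key); // expired entries count as misses
      this.misses++;
      return undefined;
    }
    this.hits++;
    return entry.value;
  }

  getStats(): CacheStatsSketch {
    const total = this.hits + this.misses;
    return {
      hits: this.hits,
      misses: this.misses,
      hitRate: total === 0 ? 0 : this.hits / total,
      size: this.entries.size,
    };
  }
}

// Usage with a fake clock.
let fakeNow = 0;
const prCache = new TTLCacheSketch<string>(1000, () => fakeNow);
prCache.set("pr:1", "open");
prCache.get("pr:1"); // hit
prCache.get("pr:2"); // miss (never set)
fakeNow = 2000;
prCache.get("pr:1"); // miss (expired)
const cacheStats = prCache.getStats();
```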

Event Persistence & Lifecycle

  • Lifecycle manager writes events to events.jsonl with full state transition data
  • Auto-generates retrospectives on session merge/kill
  • Session manager passes AO_PROJECT_ID and AO_LOG_DIR env vars to spawned sessions

Tests (101 new)

  • log-writer.test.ts (13): append, rotation, close, auto-create dirs, write failure
  • log-reader.test.ts (42): all filters, corrupted lines, readLogsFromDir, tailLogs
  • session-report-card.test.ts (24): transitions, CI/review counting, outcomes, duration
  • retrospective.test.ts (15): save/load, filters, corrupted JSON, round-trip
  • cache.test.ts (7 new): getStats hit/miss tracking, hit rate, size

Bugbot Fixes

  • Background mode log capture uses file descriptors instead of pipes (survives parent exit)
  • Fixed visibilitychange listener leak in client-logger
  • Removed dead code (with-timing.ts, unused getRequestStats)
  • Fixed port conflict detection logic in dashboard status
  • Implemented --since option in dashboard logs subcommand

Test plan

  • pnpm test — all 927 tests pass
  • pnpm typecheck — 0 errors
  • pnpm lint — 0 errors
  • ao start — dashboard logs captured to JSONL
  • ao logs dashboard --tail 20 — shows recent output
  • ao dashboard restart --clean --wait — kills, cleans .next, restarts
  • ao perf routes — per-route p50/p95/p99
  • ao retro list — shows retrospectives
  • Dashboard /logs and /perf pages render

🤖 Generated with Claude Code

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

AgentWrapper and others added 19 commits February 20, 2026 19:34
…retrospectives

Add comprehensive observability infrastructure for ao to dog-food itself:

Core log infrastructure:
- JSONL log writer with size-based rotation (log-writer.ts)
- Log query/filter utilities for reading rotated files (log-reader.ts)
- getLogsDir() and getRetrospectivesDir() path utilities

Event persistence:
- Lifecycle manager writes state transitions to events.jsonl
- Auto-generates retrospectives on session merge/kill

Dashboard log capture:
- `ao start -b` background mode with log piping to files
- PID file for reliable `ao stop`
- Browser-side error/perf capture (ClientLogger, /api/client-logs)

API performance monitoring:
- Request timing with breakdown (serviceInit, sessionList, prEnrichment)
- Cache hit/miss tracking on TTLCache
- Enrichment timing instrumentation in enrichSessionPR()
- withTiming() HOF for route instrumentation

Session retrospectives:
- Report card generation from event logs (CI attempts, review rounds)
- Retrospective with timeline, metrics, heuristic lessons
- Auto-save on session completion

CLI commands:
- `ao logs dashboard|events|session <id>` — query structured logs
- `ao retro list|show|generate` — manage retrospectives
- `ao perf routes|slow|cache|enrichment` — API performance analysis

Dashboard pages:
- /logs — filterable log viewer with auto-refresh
- /perf — route stats table, slow requests, cache effectiveness
- Nav links added to main dashboard header

Environment:
- AO_PROJECT_ID and AO_LOG_DIR env vars in spawned sessions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactors `ao dashboard` from a flat start command into a command group:
- `ao dashboard` (default) — starts with PID tracking + log capture
- `ao dashboard restart [--clean]` — kill + restart, optionally wipe .next
- `ao dashboard status` — show process state, port, .next cache age, log info
- `ao dashboard logs` — tail dashboard.jsonl logs

Process awareness: always writes dashboard.pid, validates PID on read,
cleans up stale PID files automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
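
One common way to validate a PID on read and clean up stale PID files is to probe the process with signal 0. The sketch below assumes that approach; `isProcessAlive` and `readValidPid` are hypothetical names and are not the PID helpers used in this PR.

```typescript
import { existsSync, mkdtempSync, readFileSync, unlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Signal 0 performs the existence/permission check without delivering a signal.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    return (err as { code?: string }).code === "EPERM";
  }
}

// Returns the PID if the file names a live process; otherwise removes the file.
function readValidPid(pidFile: string): number | null {
  if (!existsSync(pidFile)) return null;
  const pid = Number.parseInt(readFileSync(pidFile, "utf8").trim(), 10);
  if (Number.isInteger(pid) && pid > 0 && isProcessAlive(pid)) return pid;
  unlinkSync(pidFile); // stale or corrupt PID file
  return null;
}

// Usage: the current process is alive; a garbage file is treated as stale.
const pidFile = join(mkdtempSync(join(tmpdir(), "ao-pid-sketch-")), "dashboard.pid");
writeFileSync(pidFile, String(process.pid));
const livePid = readValidPid(pidFile);
writeFileSync(pidFile, "not-a-pid");
const stalePid = readValidPid(pidFile);
```

A live PID does not prove the process is still the dashboard (PIDs can be reused), which is presumably why the status check described later also looks at the port.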
Adds `packages/core/src/dashboard-manager.ts` with functions callable
by the orchestrator agent or lifecycle manager without going through CLI:

- restartDashboard(opts) — kill → clean .next → spawn detached → PID file
- waitForHealthy(port) — poll until dashboard responds to HTTP
- getDashboardStatus(logDir, port) — check if running via PID file or port
- stopDashboard(logDir, port) — kill without restarting

The CLI `ao dashboard restart` now delegates to restartDashboard() from
core, adding --wait flag to optionally block until healthy.

All exported from @composio/ao-core for programmatic use:
  import { restartDashboard, waitForHealthy } from "@composio/ao-core";

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ache stats

Covers log-writer (13 tests), log-reader (42 tests), session-report-card
(24 tests), retrospective (15 tests), and cache getStats (7 tests).
All 927 tests pass across the full suite.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Orchestrator prompt: add ao logs, ao perf, ao retro, ao dashboard
  management commands; add debugging/monitoring workflows; fix
  non-existent `ao session attach` → `tmux attach -t`
- Child agent prompt: document AO_PROJECT_ID/AO_LOG_DIR env vars and
  JSONL log file locations for self-debugging
- CLAUDE.md: add Logging & Observability section with log file table,
  key module index, LogEntry schema, dashboard pages, debugging tips,
  and CLI command reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement `getAttachInfo` on SessionManager interface, delegating to
the runtime plugin's `getAttachInfo()` method. CLI command dispatches
on attach type (tmux/docker/ssh/web) for proper runtime-agnostic
session attachment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move data types (LogEntry, LogQueryOptions, SessionReportCard,
Retrospective) to types.ts as canonical contracts. Add pluggable
service interfaces:

- EventLogger: swap JSONL writer for database/cloud logging
- EventLogReader: swap JSONL reader for custom query backend
- RetrospectiveStore: swap JSON file storage for database/API

LogWriter now implements EventLogger. LifecycleManagerDeps accepts
EventLogger interface instead of hardcoded logDir string.
JsonlRetrospectiveStore provides default file-based implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
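
The benefit of accepting an interface instead of a `logDir` string can be sketched as follows. The method names (`append`, `close`), the entry fields, and `recordTransition` are assumptions for illustration only; the actual EventLogger and LogEntry contracts live in types.ts and may differ.

```typescript
// Hypothetical entry shape; the real LogEntry schema is defined in types.ts.
interface LogEntrySketch {
  ts: string;
  level: "debug" | "info" | "warn" | "error";
  source: string;
  sessionId?: string;
  message: string;
}

// Hypothetical logger contract: any backend that can accept entries.
interface EventLoggerSketch {
  append(entry: LogEntrySketch): void;
  close(): void;
}

// An in-memory backend, handy for tests; a JSONL or database writer would fit the same slot.
class InMemoryEventLogger implements EventLoggerSketch {
  readonly entries: LogEntrySketch[] = [];
  append(entry: LogEntrySketch): void {
    this.entries.push(entry);
  }
  close(): void {
    // Nothing to release for an in-memory store.
  }
}

// A caller that depends only on the interface, not on where logs are stored.
function recordTransition(logger: EventLoggerSketch, sessionId: string, from: string, to: string): void {
  logger.append({
    ts: new Date().toISOString(),
    level: "info",
    source: "lifecycle",
    sessionId,
    message: `${from} -> ${to}`,
  });
}

// Usage
const memoryLogger = new InMemoryEventLogger();
recordTransition(memoryLogger, "myapp-1", "working", "pr_open");
memoryLogger.close();
```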
- Fix FD leak in dashboard-manager.ts and start.ts: close parent FDs
  after spawn (child gets copies via dup2)
- Fix retrospective sessionId filter: use `startsWith(id + "-")` to
  prevent matching overlapping prefixes (e.g., myapp-1 vs myapp-10)
- Fix duplicate "Process:" output in `ao dashboard status`: conflict
  and stale states no longer reset to "not running"
- Deduplicate parseSinceArg: move to shared lib/format.ts, remove
  copies from dashboard.ts, logs.ts, and perf.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
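
The overlapping-prefix problem from the sessionId fix above is easy to reproduce. In this sketch the file names (session ID, a hyphen, then a timestamp) are made up for illustration; only the `startsWith(id + "-")` check comes from the commit message.

```typescript
// Keep only files that belong to exactly this session.
function filterBySession(fileNames: string[], sessionId: string): string[] {
  return fileNames.filter((fileName) => fileName.startsWith(sessionId + "-"));
}

// Hypothetical retrospective file names.
const retroFiles = [
  "myapp-1-1708300000.json",
  "myapp-10-1708300500.json",
  "myapp-1-1708301000.json",
];

// A bare prefix check also picks up myapp-10.
const naiveMatches = retroFiles.filter((fileName) => fileName.startsWith("myapp-1"));
// Requiring the trailing separator excludes it.
const fixedMatches = filterBySession(retroFiles, "myapp-1");
```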
Move shared project-directory resolution into core helpers
`resolveProjectLogDir()` and `resolveProjectRetroDir()` in paths.ts,
replacing 6 duplicate implementations across perf.ts, logs.ts,
dashboard.ts, retrospective.ts, and the web API routes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ensive tests

- Remove unused EventLogReader interface from types.ts
- Extract percentile() and normalizeRoutePath() to core/utils.ts (was duplicated in CLI + web)
- Create shared perf-utils.ts for CLI commands (resolveLogDir, loadRequests, ParsedRequest)
- Rewrite client-logs route with runtime type-guard validation instead of unsafe `as` cast
- Add 76 new tests: core utils (14), CLI perf-utils (8), CLI logs (11), CLI perf (14),
  CLI retrospective (14), web observability routes (15)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
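
For readers unfamiliar with the pattern, a runtime type guard replaces a blind `as` cast with a check that narrows `unknown` only after the fields have been inspected. The payload shape below (`level`, `message`, optional `url`) is an assumption, not the schema accepted by the client-logs route.

```typescript
// Hypothetical payload shape for a browser-submitted log entry.
interface ClientLogPayloadSketch {
  level: "info" | "warn" | "error";
  message: string;
  url?: string;
}

// Type guard: returns true only if every field has the expected runtime type.
function isClientLogPayload(value: unknown): value is ClientLogPayloadSketch {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const levelOk =
    candidate.level === "info" || candidate.level === "warn" || candidate.level === "error";
  const urlOk = candidate.url === undefined || typeof candidate.url === "string";
  return levelOk && typeof candidate.message === "string" && urlOk;
}

// Usage: untrusted JSON is validated before it is treated as a payload.
const goodBody: unknown = JSON.parse('{"level":"error","message":"boom","url":"/logs"}');
const badBody: unknown = JSON.parse('{"level":"fatal","message":42}');
const goodAccepted = isClientLogPayload(goodBody);
const badAccepted = isClientLogPayload(badBody);
```

A route handler would typically answer with a 400 status when the guard fails instead of writing the entry to the log.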
- Export readPidFile/writePidFile/removePidFile from core dashboard-manager
- Replace duplicated PID functions in CLI dashboard.ts with core imports
- Fix duplicate import lint error in perf-utils.test.ts
- Fix formatAge "X ago old" → "(X ago)" in dashboard status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deduplicate identical log formatting functions that existed independently in
both dashboard.ts and logs.ts. The canonical version (from logs.ts, which
includes sessionId display) now lives in lib/format.ts.

Also fixes the --since option in `ao dashboard logs` which was declared
but silently ignored (imports were unused).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… test lint

- dashboard-manager.ts: fix NEXT_PUBLIC_TERMINAL_PORT defaults from 3001/3003
  to 14800/14801 to match the canonical defaults in web-dir.ts
- log-reader.ts: extract ApiLogEntry type and parseApiLogs() function to core
  so CLI and web packages share one implementation instead of two
- perf-utils.ts: loadRequests() delegates to core parseApiLogs() (no more
  duplicated parsing loop)
- perf-utils.test.ts: update mock to include parseApiLogs delegating to
  mockReadLogsFromDir; add sessionId to toEqual expectation
- client-logger.test.ts: remove useless constructor from ThrowingObserver class
- request-logger.test.ts: replace forbidden typeof import() type annotations
  with proper import type at the top of the file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Merge duplicate imports from ../dashboard-manager.js into one statement
  using inline `type` modifier (no-duplicate-imports rule)
- Replace `Function` type with a typed callback alias (no-unsafe-function-type)
- Rename unused `args` parameter to `_args` in process.kill spy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
….ts PID utilities

- dashboard.ts: set running=false/pid=null after printing conflict message
  to prevent a second "Process: running" line appearing below it
- perf.ts: output {} as JSON when no requests found (--json flag respected)
- logs.ts: remove early plain-text return from `ao logs session` so
  printEntries() handles empty+json case correctly
- start.ts: replace inline readFileSync/writeFileSync/unlinkSync PID logic
  with readPidFile/writePidFile/removePidFile from @composio/ao-core

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ctness

- request-logger.ts: import percentile/normalizeRoutePath from @composio/ao-core
  instead of reimplementing them locally (single source of truth)
- retrospective.ts: replace magic numbers with named constants
  (CI_FAILURE_HIGH, CI_FAILURE_LOW, REVIEW_ROUNDS_HIGH, SESSION_LONG_HOURS,
  SESSION_QUICK_HOURS) for self-documenting thresholds
- format.test.ts: expand from 9 to 52 tests covering all exported functions
  (parseSinceArg, formatMs, padCol, colorLevel, formatLogEntry, ciStatusIcon,
  reviewDecisionIcon, activityIcon)
- perf.test.ts: add sessionId field to makeRequest, add empty JSON routes test
- logs.test.ts: fix stale assertion ("No events found" → "No log entries found.")
- request-logger.test.ts: add percentile/normalizeRoutePath to all vi.doMock
  calls so re-imported module resolves all its @composio/ao-core dependencies

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ication

Three copies of the same route-grouping + percentile stats logic existed in:
  - web getRequestStats (request-logger.ts)
  - web GET /api/perf (perf/route.ts inline)
  - CLI ao perf routes (perf.ts inline)

Also, RequestLog in web was a duplicate of ApiLogEntry in core.

Changes:
- Add computeApiStats(), RouteStats, ApiPerfResult to packages/core/src/log-reader.ts
- Export from @composio/ao-core
- Rewrite web/request-logger.ts as thin delegation: re-export types from core,
  getRequestStats() = parseApiLogs() + computeApiStats()
- Simplify web/perf/route.ts to use getRequestStats() (removes 80-line inline loop)
- Update CLI perf.ts to use computeApiStats() (removes inline grouping)
- Update with-timing.ts to use ApiLogEntry instead of local RequestLog
- Add parseApiLogs + computeApiStats tests to core log-reader.test.ts
- Simplify web request-logger.test.ts to a 3-test delegation contract
- Update observability-routes.test.ts mock to include parseApiLogs/computeApiStats
- Update perf.test.ts mock to include computeApiStats with inline implementation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
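
The shared route-grouping and percentile logic could look roughly like the sketch below. It is not the computeApiStats() added to core: the entry fields, the `Sketch`-suffixed names, and the nearest-rank percentile method are all assumptions (the real percentile() may interpolate instead).

```typescript
// Hypothetical minimal request record.
interface ApiLogEntrySketch {
  method: string;
  path: string;
  durationMs: number;
}

interface RouteStatsSketch {
  route: string;
  count: number;
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile over an ascending-sorted array.
function percentileSketch(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  return sortedAsc[index];
}

// Group durations by "METHOD path", then compute per-route percentiles.
function computeStatsSketch(entries: ApiLogEntrySketch[]): RouteStatsSketch[] {
  const byRoute = new Map<string, number[]>();
  for (const entry of entries) {
    const key = `${entry.method} ${entry.path}`;
    const durations = byRoute.get(key) ?? [];
    durations.push(entry.durationMs);
    byRoute.set(key, durations);
  }
  return Array.from(byRoute.entries()).map(([route, durations]) => {
    const sorted = [...durations].sort((a, b) => a - b);
    return {
      route,
      count: sorted.length,
      p50: percentileSketch(sorted, 50),
      p95: percentileSketch(sorted, 95),
      p99: percentileSketch(sorted, 99),
    };
  });
}

// Usage
const sampleRequests: ApiLogEntrySketch[] = [
  { method: "GET", path: "/api/sessions", durationMs: 40 },
  { method: "GET", path: "/api/sessions", durationMs: 10 },
  { method: "GET", path: "/api/sessions", durationMs: 100 },
  { method: "GET", path: "/api/sessions", durationMs: 20 },
  { method: "GET", path: "/api/sessions", durationMs: 30 },
  { method: "GET", path: "/api/perf", durationMs: 5 },
];
const routeStats = computeStatsSketch(sampleRequests);
const sessionsStats = routeStats.find((r) => r.route === "GET /api/sessions");
```

Keeping this as a pure function over parsed entries is what lets the CLI and the web route share it, as the commit describes.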
AgentWrapper and others added 2 commits February 20, 2026 19:34
Two fixes:
1. perf/route.ts was importing request-logger.js via relative path with .js
   extension. Next.js webpack can't resolve this — use @/lib/request-logger
   alias like all other API routes do. Fixes CI Typecheck and Fresh Onboarding.

2. perf.test.ts and observability-routes.test.ts contained ~100 lines of
   inline mock implementations duplicating computeApiStats and parseApiLogs
   from core. These are tested in log-reader.test.ts. Removed the duplication:
   - perf.test.ts: drop @composio/ao-core mock entirely; use real pure fns
   - observability-routes.test.ts: mock request-logger.js at the module
     boundary (getRequestStats), replace 9 behavioral tests with 7 focused
     HTTP-contract tests (JSON shape, param forwarding, error handling)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The visibilitychange handler was registered with an anonymous arrow
function, making it impossible to remove via removeEventListener.
Extracted it to a named variable (onVisibilityChange) and added proper
cleanup, matching the pattern already used for onError / onRejection /
onUnload.

Added two new tests in the cleanup suite:
- removes visibilitychange event listener
- no longer flushes on visibilitychange after cleanup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
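
The leak described above comes from `removeEventListener` needing the same function reference that was registered. A minimal sketch of the fix, using a plain `EventTarget` in place of `document` so it runs outside a browser; `installFlushOnHide` is a hypothetical helper, and the real handler may also check the page's visibility state before flushing.

```typescript
// Registers a flush handler and returns a cleanup function that removes it.
function installFlushOnHide(target: EventTarget, flush: () => void): () => void {
  // Keep a named reference so the identical function can be removed later.
  const onVisibilityChange = (): void => flush();
  target.addEventListener("visibilitychange", onVisibilityChange);
  return () => target.removeEventListener("visibilitychange", onVisibilityChange);
}

// Usage: after cleanup, further events no longer trigger a flush.
let flushCount = 0;
const fakeDocument = new EventTarget();
const removeFlushHandler = installFlushOnHide(fakeDocument, () => {
  flushCount++;
});
fakeDocument.dispatchEvent(new Event("visibilitychange"));
removeFlushHandler();
fakeDocument.dispatchEvent(new Event("visibilitychange"));
```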