Spec Kitty Deterministic Enforcement System
Summary
Make Spec Kitty the ULTIMATE in deterministic behavior enforcement for AI coding agents, inspired by GoopSpec's approach but working across ALL 12+ supported agents (Claude, Cursor, Gemini, Copilot, etc.).
Key Insight
GoopSpec hooks into OpenCode's tool pipeline. We can't do that for all agents. Instead, we enforce at infrastructure layers BELOW all agents: filesystem, git, and event store.
Architecture: Hybrid Defense-in-Depth
┌─────────────────────────────────────────────────────────┐
│ AI Agent (any of 12+ providers) │
└────────────────────────┬────────────────────────────────┘
│ File Operations
▼
┌─────────────────────────────────────────────────────────┐
│ LAYER 1: FILESYSTEM WATCHER DAEMON (spec-kitty-guardian)│
│ - Monitors file changes via OS-level APIs │
│ - Immediately reverts writes to protected paths │
│ - Emits PolicyViolationAttempted events │
└────────────────────────┬────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ LAYER 2: GIT HOOKS (pre-commit-phase-check) │
│ - Blocks commits with phase violations │
│ - Uses existsSync() pattern from GoopSpec │
│ - Runs before every commit (skipped only by --no-verify)│
└────────────────────────┬────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ LAYER 3: EVENT STORE VALIDATORS │
│ - Validates state transitions before persistence │
│ - Server-authoritative for SaaS sync │
│ - Optimistic locking for multi-agent safety │
└─────────────────────────────────────────────────────────┘
Phase Enforcement Rules
| Phase | ALLOWED Writes | BLOCKED Writes |
|---|---|---|
| research | research/, research.md | src/, contracts/, plan.md, spec.md |
| specify | spec.md, checklists/ | src/, plan.md, tasks/ |
| plan | plan.md, data-model.md, contracts/ | src/, tasks/ |
| tasks | tasks/, tasks.md | src/ |
| implement | src/, tests/ | spec.md, plan.md |
| review | Review comments, lane changes | src/ (unless changes_requested) |
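The table above could be encoded as a plain lookup. The following is a minimal sketch, not the proposed implementation: it assumes paths are given relative to the feature directory, treats anything not listed under ALLOWED as blocked (the table does not say what happens to unlisted paths), and leaves out the review phase because its rule depends on the changes_requested state. The names `ALLOWED_WRITES` and `_matches` are illustrative.

```python
import posixpath
from enum import Enum
from pathlib import PurePosixPath


class Phase(Enum):
    RESEARCH = "research"
    SPECIFY = "specify"
    PLAN = "plan"
    TASKS = "tasks"
    IMPLEMENT = "implement"


# Allowed-write entries per phase, copied from the table.
# A trailing "/" marks a directory prefix; anything else is an exact file name.
ALLOWED_WRITES = {
    Phase.RESEARCH: ("research/", "research.md"),
    Phase.SPECIFY: ("spec.md", "checklists/"),
    Phase.PLAN: ("plan.md", "data-model.md", "contracts/"),
    Phase.TASKS: ("tasks/", "tasks.md"),
    Phase.IMPLEMENT: ("src/", "tests/"),
}


def _matches(rel_path: str, entry: str) -> bool:
    if entry.endswith("/"):
        return rel_path.startswith(entry)
    return rel_path == entry


def is_write_allowed(path: PurePosixPath, current_phase: Phase) -> bool:
    """Pure lookup: no I/O, no clock, no randomness."""
    # Normalise first so "src/../spec.md" is judged as "spec.md".
    rel_path = posixpath.normpath(path.as_posix())
    # Assumption: default-deny for paths the table does not mention.
    return any(_matches(rel_path, e) for e in ALLOWED_WRITES[current_phase])
```

Because the function only reads a constant mapping, the same inputs always give the same answer, which is the property the determinism tests are meant to check.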
New Components to Create
1. Guardian Daemon (src/specify_cli/guardian/)
src/specify_cli/guardian/
├── __init__.py
├── daemon.py # PhaseEnforcementHandler, Observer
├── rules.py # is_write_allowed(), PROTECTED_PATHS, get_current_phase()
├── cli.py # guardian start/stop/status commands
└── pid_manager.py # PID file management
Key function (deterministic, pure):
def is_write_allowed(path: Path, current_phase: Phase) -> bool:
    """Returns same result for same inputs - code-enforced, not LLM-dependent."""
2. Git Hook (pre-commit-phase-check)
Location: .git/hooks/pre-commit-phase-check
- Implements GoopSpec's existsSync() pattern
- Checks for required artifacts before phase transitions
- Blocks commits with src/ changes during research/specify/plan phases
- Integrated into existing pre-commit orchestrator
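The hook logic in the list above might look roughly like this. It is a sketch under assumptions: the phase is passed in as a string, staged paths are relative to the repository root, and the helper names (`find_violations`, `missing_artifacts`, `staged_files`, `main`) are hypothetical. `Path.exists()` plays the role of GoopSpec's existsSync().

```python
import subprocess
import sys
from pathlib import Path

# Phases in which src/ must not change, per the rule listed above.
PRE_IMPLEMENT_PHASES = {"research", "specify", "plan"}


def find_violations(staged_paths, phase):
    """Return the staged paths under src/ when the phase forbids them."""
    if phase not in PRE_IMPLEMENT_PHASES:
        return []
    return [p for p in staged_paths if p.startswith("src/")]


def missing_artifacts(feature_dir, required):
    """existsSync-style check: which required artifacts are absent?"""
    return [name for name in required if not (Path(feature_dir) / name).exists()]


def staged_files():
    """List the paths staged for the commit being created."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def main(phase):
    """Return the hook's exit status; non-zero makes git abort the commit."""
    violations = find_violations(staged_files(), phase)
    for path in violations:
        print(f"phase '{phase}' forbids changes to {path}", file=sys.stderr)
    return 1 if violations else 0
```

A hook script would end with something like `sys.exit(main(phase))`. Note that git skips pre-commit hooks when a commit is made with `--no-verify`, so this layer depends on the watcher and event store layers to catch what it misses.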
3. Event Validators (src/specify_cli/events/validators.py)
- PhaseTransitionValidator - validates WP status changes
- validate_wp_status_change() - checks dependencies, required artifacts
- New event types: PolicyViolationAttempted, PhaseTransitionBlocked
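A possible shape for validate_wp_status_change() is sketched below. The event fields, the parameter names, and the rule "a WP may enter doing only when every dependency is in a done lane" are assumptions made for illustration; the RFC names the function and the event type but does not define them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhaseTransitionBlocked:
    """Event recorded when a status change is refused (fields are assumed)."""
    wp_id: str
    target_lane: str
    reasons: tuple


def validate_wp_status_change(
    wp_id: str,
    target_lane: str,
    dependency_lanes: dict,    # dependency WP id -> its current lane
    present_artifacts: set,    # artifact names that currently exist
    required_artifacts: set,   # artifact names the target lane needs
) -> Optional[PhaseTransitionBlocked]:
    """Return None when the change is valid, else a blocking event."""
    reasons = []
    # Assumed rule: work may start only after all dependencies are done.
    if target_lane == "doing":
        for dep_id, lane in sorted(dependency_lanes.items()):
            if lane != "done":
                reasons.append(f"dependency {dep_id} is in lane '{lane}'")
    # Sorted so the reasons come out in a stable order on every run.
    for name in sorted(required_artifacts - present_artifacts):
        reasons.append(f"missing artifact {name}")
    if reasons:
        return PhaseTransitionBlocked(wp_id, target_lane, tuple(reasons))
    return None
```

Returning the event instead of raising lets the caller persist the refusal for auditing before rejecting the change.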
Files to Modify
| File | Change |
|---|---|
| src/specify_cli/cli/__init__.py | Add guardian command group |
| .git/hooks/pre-commit | Add pre-commit-phase-check to orchestrator |
| src/specify_cli/events/store.py | Add validation before event persistence |
| src/specify_cli/cli/commands/implement.py | Auto-start guardian on implement |
| pyproject.toml | Add watchdog dependency |
Phase Detection (GoopSpec pattern)
def get_current_phase(feature_path: Path) -> Phase:
    # 1. Check for active WPs in "doing" lane → implement
    # 2. Check for WPs in "for_review" lane → review
    # 3. Check artifact existence (GoopSpec existsSync pattern):
    #    - tasks/WP01.md exists → tasks phase
    #    - plan.md exists → plan phase
    #    - spec.md exists → specify phase
    #    - Otherwise → research phase
Multi-Agent Safety
- Worktree isolation - Each agent works in its own git worktree
- Guardian per worktree - Separate daemon instance per workspace
- Event-based coordination - Agents communicate via events, not filesystem
- Optimistic locking - Event store uses aggregate versions
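Optimistic locking on aggregate versions can be illustrated with an in-memory store. This is a sketch only: `InMemoryEventStore`, `ConcurrencyError`, and the `expected_version` parameter are hypothetical names, and the real event store's append signature is not given in this RFC.

```python
class ConcurrencyError(Exception):
    """Raised when another writer appended to the aggregate first."""


class InMemoryEventStore:
    def __init__(self):
        self._streams = {}  # aggregate id -> list of events

    def version(self, aggregate_id):
        """The aggregate version is the number of events stored for it."""
        return len(self._streams.get(aggregate_id, []))

    def append(self, aggregate_id, event, expected_version):
        """Append only if nobody else has written since the caller's read."""
        stream = self._streams.setdefault(aggregate_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"{aggregate_id}: expected version {expected_version}, "
                f"found {len(stream)}"
            )
        stream.append(event)
        return len(stream)  # the new aggregate version
```

If two agents both read version 0 and then try to append, the first append succeeds and the second raises, so the losing agent has to re-read the aggregate and retry instead of silently overwriting.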
Implementation Order
WP01: Core Rules Engine
- Create guardian/rules.py with is_write_allowed() and PROTECTED_PATHS
- Create guardian/__init__.py with get_current_phase()
- Write unit tests for determinism (same inputs → same outputs)
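For WP01, the detection order described under Phase Detection could be implemented along these lines. The sketch assumes each work package lives in tasks/WP*.md and carries a `lane:` key in a `---` delimited frontmatter block; that file layout, the `read_lane` helper, and the use of plain strings for phases are assumptions, not part of the RFC.

```python
from pathlib import Path


def read_lane(wp_file: Path) -> str:
    """Return the `lane:` value from a WP file's frontmatter, or ""."""
    lines = wp_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        key, _, value = line.partition(":")
        if key.strip() == "lane":
            return value.strip().strip("\"'")
    return ""


def get_current_phase(feature_path: Path) -> str:
    tasks_dir = feature_path / "tasks"
    wp_files = sorted(tasks_dir.glob("WP*.md")) if tasks_dir.is_dir() else []
    lanes = {read_lane(f) for f in wp_files}
    # 1. Any WP being worked on means implementation is under way.
    if "doing" in lanes:
        return "implement"
    # 2. Otherwise, any WP awaiting review means the review phase.
    if "for_review" in lanes:
        return "review"
    # 3. Fall back to artifact existence, latest artifact first.
    if (tasks_dir / "WP01.md").exists():
        return "tasks"
    if (feature_path / "plan.md").exists():
        return "plan"
    if (feature_path / "spec.md").exists():
        return "specify"
    return "research"
```

The function reads the filesystem, so it is deterministic only for a given directory state; the pure part of the design is is_write_allowed(), which takes the detected phase as an input.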
WP02: Filesystem Watcher Daemon
- Create guardian/daemon.py with PhaseEnforcementHandler
- Create guardian/cli.py with start/stop/status commands
- Integration with watchdog library
- Auto-start on spec-kitty implement
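The revert step of the daemon can be sketched without the watcher itself. The class below is a hypothetical, standard-library-only illustration: it snapshots protected files and restores them when told that a path changed. In the daemon, a watchdog `FileSystemEventHandler` subclass would call `handle_change(Path(event.src_path))` from `on_created` and `on_modified`. The snapshot strategy, class name, and `violations` list (a stand-in for PolicyViolationAttempted events) are assumptions; the RFC does not say how reverts are performed.

```python
from pathlib import Path
from typing import Callable, Dict, List


class ProtectedSnapshot:
    """Remembers protected files' contents so later writes can be undone."""

    def __init__(self, root: Path, is_protected: Callable[[str], bool]):
        self.root = root
        self.is_protected = is_protected
        self._contents: Dict[str, bytes] = {}
        self.violations: List[str] = []  # stand-in for emitted events

    def capture(self) -> None:
        """Record the current contents of every protected file under root."""
        for file in self.root.rglob("*"):
            rel = file.relative_to(self.root).as_posix()
            if file.is_file() and self.is_protected(rel):
                self._contents[rel] = file.read_bytes()

    def handle_change(self, changed: Path) -> bool:
        """Revert `changed` if it is protected; return True when reverted."""
        rel = changed.relative_to(self.root).as_posix()
        if not self.is_protected(rel):
            return False
        original = self._contents.get(rel)
        if original is None:
            if not changed.exists():
                return False  # already removed, nothing to undo
            changed.unlink()  # file is new in a protected path: delete it
        elif changed.exists() and changed.read_bytes() == original:
            return False  # content already matches, e.g. our own restore
        else:
            changed.parent.mkdir(parents=True, exist_ok=True)
            changed.write_bytes(original)
        self.violations.append(rel)
        return True
```

The early return when the content already matches matters in practice: the restore write itself triggers another filesystem event, and without that check the handler would record a violation for its own write.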
WP03: Git Hook Enforcement
- Create pre-commit-phase-check hook script
- Add to existing pre-commit orchestrator
- Install via migration system
WP04: Event Store Integration
- Create events/validators.py with PhaseTransitionValidator
- Add new event types for violations
- Integrate validation into EventStore.append()
WP05: Testing & Documentation
- Unit tests for rules (determinism verification)
- Integration tests for guardian daemon
- E2E tests simulating agent violations
- Update docs with enforcement model
Verification Approach
- Determinism tests: Run is_write_allowed() 1000x with same inputs, verify same outputs
- Guardian tests: Start daemon, write to protected path, verify immediate revert
- Git hook tests: Attempt commit with violations, verify block
- Multi-agent tests: Simulate parallel agents in different worktrees
- E2E simulation: Run actual agent (Claude Code) and verify enforcement
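The determinism check in the first item could be written as a small reusable helper. This is a sketch: `assert_deterministic` and the stub rule function are hypothetical, and the stub stands in for the real is_write_allowed() so the example runs on its own.

```python
def assert_deterministic(fn, cases, runs=1000):
    """Call fn `runs` times per case and fail on any differing result."""
    calls = 0
    for args in cases:
        expected = fn(*args)
        calls += 1
        for _ in range(runs - 1):
            assert fn(*args) == expected, f"non-deterministic result for {args!r}"
            calls += 1
    return calls  # total number of calls made


def is_write_allowed_stub(path: str, phase: str) -> bool:
    # Stand-in rule for illustration: src/ is writable only while implementing.
    return phase == "implement" and path.startswith("src/")
```

Repeated calls can only expose nondeterminism that shows up within one process; they complement, but do not replace, reviewing the function for hidden inputs such as the clock, environment variables, or filesystem state.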
Dependencies to Add
# pyproject.toml
[project]
dependencies = [
    "watchdog>=4.0.0,<5.0.0",       # Cross-platform filesystem monitoring
    "python-daemon>=3.0.0,<4.0.0",  # Daemon process management
]
Key Differences from GoopSpec
| Aspect | GoopSpec | Spec Kitty (Proposed) |
|---|---|---|
| Enforcement point | Agent tool pipeline | OS filesystem layer |
| Agent support | OpenCode only | All 12+ agents |
| Parallel work | Wave-based sequential | True worktree isolation |
| Phase detection | State file | Frontmatter + artifact existence |
| Recovery | None | Periodic integrity checks |
Success Criteria
- Agent-agnostic: Works identically for Claude, Cursor, Gemini, etc.
- Deterministic: is_write_allowed() is a pure function
- Real-time: Violations reverted within 500ms
- Non-bypassable: Even direct file writes are caught
- Auditable: All violations logged as events
Background & Motivation
This RFC was developed after analyzing GoopSpec, an OpenCode plugin that achieves deterministic enforcement by hooking into the LLM's tool execution pipeline. Their maintainer correctly noted:
"Most of my quality gates are deterministic, not at the whim of LLMs, that is a cherry on top rather than the baseline. LLMs validate intent, instead."
The key insight: code-enforced rules cannot be bypassed by LLM reasoning. An existsSync() check will block regardless of what the AI agent thinks.
However, GoopSpec only works with OpenCode. Spec Kitty needs to support 12+ different agents. Our solution: enforce at layers below ALL agents - the filesystem itself, git operations, and the event store.
Generated with Claude Opus 4.5 - competitive analysis and planning session