RFC: Deterministic Enforcement System (inspired by GoopSpec) #121

@robertDouglass

Spec Kitty Deterministic Enforcement System

Summary

Make Spec Kitty's behavior enforcement for AI coding agents deterministic. The design is inspired by GoopSpec's approach but works across all 12+ supported agents (Claude, Cursor, Gemini, Copilot, etc.).

Key Insight

GoopSpec hooks into OpenCode's tool pipeline. We can't do that for all agents. Instead, we enforce at infrastructure layers BELOW all agents: filesystem, git, and event store.


Architecture: Hybrid Defense-in-Depth

┌──────────────────────────────────────────────────────────┐
│           AI Agent (any of 12+ providers)                │
└────────────────────────┬─────────────────────────────────┘
                         │ File Operations
                         ▼
┌──────────────────────────────────────────────────────────┐
│  LAYER 1: FILESYSTEM WATCHER DAEMON (spec-kitty-guardian)│
│  - Monitors file changes via OS-level APIs               │
│  - Immediately reverts writes to protected paths         │
│  - Emits PolicyViolationAttempted events                 │
└────────────────────────┬─────────────────────────────────┘
                         ▼
┌──────────────────────────────────────────────────────────┐
│  LAYER 2: GIT HOOKS (pre-commit-phase-check)             │
│  - Blocks commits with phase violations                  │
│  - Uses existsSync() pattern from GoopSpec               │
│  - Runs before each commit; skippable via --no-verify    │
└────────────────────────┬─────────────────────────────────┘
                         ▼
┌──────────────────────────────────────────────────────────┐
│  LAYER 3: EVENT STORE VALIDATORS                         │
│  - Validates state transitions before persistence        │
│  - Server-authoritative for SaaS sync                    │
│  - Optimistic locking for multi-agent safety             │
└──────────────────────────────────────────────────────────┘

Phase Enforcement Rules

| Phase | ALLOWED Writes | BLOCKED Writes |
|---|---|---|
| research | research/, research.md | src/, contracts/, plan.md, spec.md |
| specify | spec.md, checklists/ | src/, plan.md, tasks/ |
| plan | plan.md, data-model.md, contracts/ | src/, tasks/ |
| tasks | tasks/, tasks.md | src/ |
| implement | src/, tests/ | spec.md, plan.md |
| review | Review comments, lane changes | src/ (unless changes_requested) |
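
The blocked-writes column of the table above could be encoded as plain data, with the rule check reduced to a lookup. The following is only a sketch under stated assumptions: paths are taken to be already relative to whatever root the table's entries refer to, anything not explicitly blocked is allowed, and the changes_requested exception for the review phase is left out. The names `BLOCKED_WRITES` and `_matches` are hypothetical and do not come from the RFC.

```python
from enum import Enum
from pathlib import PurePosixPath


class Phase(str, Enum):
    RESEARCH = "research"
    SPECIFY = "specify"
    PLAN = "plan"
    TASKS = "tasks"
    IMPLEMENT = "implement"
    REVIEW = "review"


# Blocked write targets per phase, copied from the table above.
# Entries ending in "/" are directory prefixes; others are exact file names.
# BLOCKED_WRITES is a hypothetical name used for illustration.
BLOCKED_WRITES: dict[Phase, tuple[str, ...]] = {
    Phase.RESEARCH: ("src/", "contracts/", "plan.md", "spec.md"),
    Phase.SPECIFY: ("src/", "plan.md", "tasks/"),
    Phase.PLAN: ("src/", "tasks/"),
    Phase.TASKS: ("src/",),
    Phase.IMPLEMENT: ("spec.md", "plan.md"),
    Phase.REVIEW: ("src/",),  # the changes_requested exception is not modeled
}


def _matches(rel_path: str, rule: str) -> bool:
    """Directory rules match by prefix, file rules match exactly."""
    if rule.endswith("/"):
        return rel_path.startswith(rule)
    return rel_path == rule


def is_write_allowed(path: str, current_phase: Phase) -> bool:
    """Pure lookup: no I/O, no clock, no randomness."""
    rel_path = PurePosixPath(path).as_posix()  # normalizes a leading "./"
    rules = BLOCKED_WRITES[current_phase]
    return not any(_matches(rel_path, rule) for rule in rules)
```

Because the function reads only its arguments and a module-level constant, the same inputs always give the same answer. Whether unlisted paths should default to allowed or blocked is a policy decision the table does not settle.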

New Components to Create

1. Guardian Daemon (src/specify_cli/guardian/)

src/specify_cli/guardian/
├── __init__.py
├── daemon.py          # PhaseEnforcementHandler, Observer
├── rules.py           # is_write_allowed(), PROTECTED_PATHS, get_current_phase()
├── cli.py             # guardian start/stop/status commands
└── pid_manager.py     # PID file management

Key function (deterministic, pure):

def is_write_allowed(path: Path, current_phase: Phase) -> bool:
    """Returns same result for same inputs - code-enforced, not LLM-dependent."""
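
One way the daemon's handler might use this function is sketched below. The decision logic is kept separate from the OS watcher so it can be tested without touching the filesystem. Everything beyond the `PhaseEnforcementHandler` and `PolicyViolationAttempted` names is an assumption: the injected callables, the `handle_write` method, and the event dictionary shape are illustrative only. In a real daemon, a watchdog `FileSystemEventHandler` subclass would presumably call `handle_write` from its `on_modified` or `on_created` callback.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PhaseEnforcementHandler:
    """Decides what to do with one observed write; no OS-specific code here."""

    get_phase: Callable[[], str]                 # hypothetical: current phase lookup
    is_write_allowed: Callable[[str, str], bool]  # the pure rule function
    revert: Callable[[str], None]                # hypothetical: restores the file
    emit: Callable[[dict], None]                 # hypothetical: event sink

    def handle_write(self, rel_path: str) -> bool:
        """Return True if the write was allowed, False if it was reverted."""
        phase = self.get_phase()
        if self.is_write_allowed(rel_path, phase):
            return True
        self.revert(rel_path)
        self.emit(
            {"type": "PolicyViolationAttempted", "path": rel_path, "phase": phase}
        )
        return False


# Usage with in-memory fakes standing in for the real collaborators.
reverted: list[str] = []
events: list[dict] = []
handler = PhaseEnforcementHandler(
    get_phase=lambda: "plan",
    is_write_allowed=lambda path, phase: not path.startswith("src/"),
    revert=reverted.append,
    emit=events.append,
)
handler.handle_write("src/app.py")  # blocked: recorded in reverted and events
```

A design point worth noting: reverting a file is itself a write, so a real daemon would need some way to ignore the events caused by its own reverts, or it could loop.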

2. Git Hook (pre-commit-phase-check)

Location: .git/hooks/pre-commit-phase-check

  • Implements GoopSpec's existsSync() pattern
  • Checks for required artifacts before phase transitions
  • Blocks commits with src/ changes during research/specify/plan phases
  • Integrated into existing pre-commit orchestrator
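
The src/ check in the list above could be sketched as a small Python script. This is not the proposed hook itself: how the hook learns the current phase is not specified here, so `phase` is simply passed in, and the names `staged_paths`, `find_violations`, and `PRE_IMPLEMENT_PHASES` are hypothetical. The only external command used is `git diff --cached --name-only`, which lists the paths staged for commit.

```python
#!/usr/bin/env python3
import subprocess
import sys

# Phases in which commits touching src/ are rejected (from the list above).
PRE_IMPLEMENT_PHASES = frozenset({"research", "specify", "plan"})


def staged_paths() -> list[str]:
    """Paths staged for the commit being created, relative to the repo root."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def find_violations(paths: list[str], phase: str) -> list[str]:
    """Return staged paths under src/ when the phase forbids source changes."""
    if phase not in PRE_IMPLEMENT_PHASES:
        return []
    return [p for p in paths if p.startswith("src/")]


def main(phase: str) -> int:
    """Report violations on stderr; a non-zero return aborts the commit."""
    violations = find_violations(staged_paths(), phase)
    for path in violations:
        print(f"blocked: {path} changed during {phase} phase", file=sys.stderr)
    return 1 if violations else 0
```

The orchestrator would end with something like `sys.exit(main(phase))`. Keeping `find_violations` free of subprocess calls means the rule can be unit-tested without a repository.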

3. Event Validators (src/specify_cli/events/validators.py)

  • PhaseTransitionValidator - validates WP status changes
  • validate_wp_status_change() - checks dependencies, required artifacts
  • New event types: PolicyViolationAttempted, PhaseTransitionBlocked
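
As a rough illustration of the dependency and artifact checks, `validate_wp_status_change()` might look like the sketch below. The RFC does not define the work package model, so the `WorkPackage` shape, the `done` lane name, and the choice of spec.md and plan.md as required artifacts for entering the `doing` lane are all assumptions made for this example.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class WorkPackage:
    """Hypothetical WP model; the real one may differ."""

    wp_id: str
    lane: str
    depends_on: tuple[str, ...] = ()


def validate_wp_status_change(
    wp: WorkPackage,
    new_lane: str,
    all_wps: dict[str, WorkPackage],
    feature_path: Path,
) -> list[str]:
    """Return the reasons a transition is rejected; an empty list means valid."""
    errors: list[str] = []
    if new_lane == "doing":
        # Required-artifact check (the existsSync() idea, via Path.exists()).
        # The artifact list here is an assumption.
        for artifact in ("spec.md", "plan.md"):
            if not (feature_path / artifact).exists():
                errors.append(f"missing required artifact: {artifact}")
        # Dependency check: every prerequisite WP must already be done.
        for dep_id in wp.depends_on:
            dep = all_wps.get(dep_id)
            if dep is None or dep.lane != "done":
                errors.append(f"dependency not done: {dep_id}")
    return errors
```

Returning a list of reasons rather than a boolean would let a `PhaseTransitionBlocked` event carry the explanation with it.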

Files to Modify

| File | Change |
|---|---|
| src/specify_cli/cli/__init__.py | Add guardian command group |
| .git/hooks/pre-commit | Add pre-commit-phase-check to orchestrator |
| src/specify_cli/events/store.py | Add validation before event persistence |
| src/specify_cli/cli/commands/implement.py | Auto-start guardian on implement |
| pyproject.toml | Add watchdog and python-daemon dependencies |

Phase Detection (GoopSpec pattern)

def get_current_phase(feature_path: Path) -> Phase:
    """Detect the phase; checks run in order and the first match wins."""
    # 1. Any WP in the "doing" lane → implement
    # 2. Any WP in the "for_review" lane → review
    # 3. Otherwise check artifact existence (GoopSpec existsSync pattern):
    #    - tasks/WP01.md exists → tasks phase
    #    - plan.md exists → plan phase
    #    - spec.md exists → specify phase
    #    - Otherwise → research phase
    ...
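
A runnable version of the detection order above might look like this. It is a sketch, not the proposed implementation: how WP lanes are read (for example from frontmatter) is not specified here, so the lanes are passed in as a hypothetical `wp_lanes` argument, and phases are returned as plain strings rather than a `Phase` type to keep the example self-contained.

```python
from pathlib import Path
from typing import Iterable


def get_current_phase(feature_path: Path, wp_lanes: Iterable[str] = ()) -> str:
    """Resolve the phase from WP lanes first, then from which artifacts exist."""
    lanes = set(wp_lanes)
    if "doing" in lanes:
        return "implement"
    if "for_review" in lanes:
        return "review"
    # Artifact checks run from the latest phase to the earliest, so the most
    # advanced artifact present decides the phase.
    if (feature_path / "tasks" / "WP01.md").exists():
        return "tasks"
    if (feature_path / "plan.md").exists():
        return "plan"
    if (feature_path / "spec.md").exists():
        return "specify"
    return "research"
```

Unlike `is_write_allowed()`, this function reads the filesystem, so its result depends on repository state as well as its arguments. Determinism here means the same files and lanes always yield the same phase.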

Multi-Agent Safety

  1. Worktree isolation - Each agent works in its own git worktree
  2. Guardian per worktree - Separate daemon instance per workspace
  3. Event-based coordination - Agents communicate via events, not filesystem
  4. Optimistic locking - Event store uses aggregate versions
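
Optimistic locking on aggregate versions can be illustrated with a toy in-memory store. This is a sketch of the general technique, not Spec Kitty's event store: the class names, the `expected_version` parameter, and the use of the stream length as the version are assumptions for the example.

```python
from collections import defaultdict


class ConcurrencyError(Exception):
    """Raised when another writer appended to the aggregate first."""


class InMemoryEventStore:
    """Toy store; the version of an aggregate is its number of events."""

    def __init__(self) -> None:
        self._streams: dict[str, list[dict]] = defaultdict(list)

    def version(self, aggregate_id: str) -> int:
        return len(self._streams[aggregate_id])

    def append(self, aggregate_id: str, event: dict, expected_version: int) -> int:
        """Append only if the caller saw the latest version; return the new one."""
        current = self.version(aggregate_id)
        if expected_version != current:
            raise ConcurrencyError(
                f"{aggregate_id}: expected v{expected_version}, store is at v{current}"
            )
        self._streams[aggregate_id].append(event)
        return current + 1


# Two agents read version 0; only the first append succeeds.
store = InMemoryEventStore()
seen = store.version("WP01")
store.append("WP01", {"lane": "doing", "agent": "a"}, expected_version=seen)
try:
    store.append("WP01", {"lane": "doing", "agent": "b"}, expected_version=seen)
    conflict = False
except ConcurrencyError:
    conflict = True  # agent b must re-read the stream and retry
```

In a store shared between processes, the compare and the append would have to happen atomically, for instance through a database uniqueness constraint on (aggregate, version). The in-memory version above does not address that.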

Implementation Order

WP01: Core Rules Engine

  • Create guardian/rules.py with is_write_allowed(), PROTECTED_PATHS, and get_current_phase()
  • Create guardian/__init__.py
  • Write unit tests for determinism (same inputs → same outputs)

WP02: Filesystem Watcher Daemon

  • Create guardian/daemon.py with PhaseEnforcementHandler
  • Create guardian/cli.py with start/stop/status commands
  • Integration with watchdog library
  • Auto-start on spec-kitty implement

WP03: Git Hook Enforcement

  • Create pre-commit-phase-check hook script
  • Add to existing pre-commit orchestrator
  • Install via migration system

WP04: Event Store Integration

  • Create events/validators.py with PhaseTransitionValidator
  • Add new event types for violations
  • Integrate validation into EventStore.append()

WP05: Testing & Documentation

  • Unit tests for rules (determinism verification)
  • Integration tests for guardian daemon
  • E2E tests simulating agent violations
  • Update docs with enforcement model

Verification Approach

  1. Determinism tests: Run is_write_allowed() 1000x with same inputs, verify same outputs
  2. Guardian tests: Start daemon, write to protected path, verify immediate revert
  3. Git hook tests: Attempt commit with violations, verify block
  4. Multi-agent tests: Simulate parallel agents in different worktrees
  5. E2E simulation: Run actual agent (Claude Code) and verify enforcement
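
The determinism test in step 1 could be written as a small helper like the one below. The helper name `assert_deterministic` is hypothetical, and the lambda stands in for the real `is_write_allowed()`. Repeated calls can only expose nondeterminism that happens to show up during the run; they do not prove a function is pure.

```python
from typing import Any, Callable


def assert_deterministic(fn: Callable[..., Any], *args: Any, runs: int = 1000) -> Any:
    """Call fn(*args) `runs` times and fail if any result differs from the first."""
    first = fn(*args)
    for i in range(1, runs):
        result = fn(*args)
        assert result == first, f"run {i} returned {result!r}, expected {first!r}"
    return first


# Stand-in rule: src/ writes are blocked outside the implement phase.
def stand_in_rule(path: str, phase: str) -> bool:
    return phase == "implement" or not path.startswith("src/")


outcome = assert_deterministic(stand_in_rule, "src/app.py", "plan")
```

A fuller suite would probably run this over every phase and a representative set of paths rather than a single input pair.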

Dependencies to Add

# pyproject.toml
[project]
dependencies = [
    "watchdog>=4.0.0,<5.0.0",       # Cross-platform filesystem monitoring
    "python-daemon>=3.0.0,<4.0.0",  # Daemon process management
]

Key Differences from GoopSpec

| Aspect | GoopSpec | Spec Kitty (Proposed) |
|---|---|---|
| Enforcement point | Agent tool pipeline | OS filesystem layer |
| Agent support | OpenCode only | All 12+ agents |
| Parallel work | Wave-based sequential | True worktree isolation |
| Phase detection | State file | Frontmatter + artifact existence |
| Recovery | None | Periodic integrity checks |

Success Criteria

  1. Agent-agnostic: Works identically for Claude, Cursor, Gemini, etc.
  2. Deterministic: is_write_allowed() is a pure function
  3. Real-time: Violations reverted within 500ms
  4. Non-bypassable: Even direct file writes are caught
  5. Auditable: All violations logged as events

Background & Motivation

This RFC was developed after analyzing GoopSpec, an OpenCode plugin that achieves deterministic enforcement by hooking into the LLM's tool execution pipeline. Its maintainer correctly noted:

"Most of my quality gates are deterministic, not at the whim of LLMs, that is a cherry on top rather than the baseline. LLMs validate intent, instead."

The key insight: code-enforced rules cannot be bypassed by LLM reasoning. An existsSync() check will block regardless of what the AI agent thinks.

However, GoopSpec only works with OpenCode. Spec Kitty needs to support 12+ different agents. Our solution: enforce at layers below ALL agents - the filesystem itself, git operations, and the event store.


Generated with Claude Opus 4.5 - competitive analysis and planning session
