Automated fix-verify loops using Claude Agent SDK with intelligent model selection and task decomposition.
You spend hours doing this:
See failure -> Paste to Claude -> Apply fix -> Run tests -> See failure -> repeat...
uv run grind run --task "Fix failing tests" --verify "pytest tests/ -v"
Walk away. Come back to passing tests.
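As a rough illustration of the loop being automated, here is a minimal sketch. It is not the code in engine.py: `run_verify`, `grind_loop`, and the `attempt_fix` callback are hypothetical names, and in the real tool the fix step is performed by the agent.

```python
import subprocess
from typing import Callable


def run_verify(command: str) -> tuple[bool, str]:
    """Run the verify command in a shell; exit code 0 counts as a pass."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def grind_loop(
    verify_command: str,
    attempt_fix: Callable[[str], None],  # hypothetical hook standing in for the agent
    max_iterations: int = 10,
) -> bool:
    """Alternate verify and fix until verification passes or the budget runs out."""
    for _ in range(max_iterations):
        passed, output = run_verify(verify_command)
        if passed:
            return True
        attempt_fix(output)  # hand the failure output to whatever applies the fix
    # Check once more so a fix made on the final iteration still counts.
    return run_verify(verify_command)[0]
```

The only contract the sketch relies on is the one stated in the options table: the verify command passes when it exits with status 0.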
- Intelligent Model Selection: Opus 4.5 for planning, Haiku 4.5 for execution (3-5x cost savings)
- Extended Thinking: 10K token reasoning budget for complex decomposition
- CostAwareRouter: Automatic model assignment based on task complexity
- Interleaved Thinking: Better reasoning between tool calls
- DAG Execution: Parallel task execution with dependency management
- Git Worktrees: Conflict-free parallel execution
- WebSearch Integration: Research capability during decomposition
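To make the DAG execution item concrete, the sketch below groups tasks into waves that could run in parallel once their dependencies are done. It uses Python's standard `graphlib` module; the task names and the shape of the dependency mapping are assumptions for illustration, not the tool's actual data model.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical task graph: each key maps to the tasks it depends on.
deps = {
    "fix-lint": set(),
    "fix-types": set(),
    "fix-tests": {"fix-lint", "fix-types"},
}
print(execution_waves(deps))  # [['fix-lint', 'fix-types'], ['fix-tests']]
```

Each wave could then be handed to separate git worktrees so that parallel fixes do not touch the same checkout.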
Pricing (Dec 2025):
- Haiku 4.5: $1/$5 per million tokens (default, 73% of Opus capability)
- Sonnet 4.5: $3/$15 per million tokens (medium complexity)
- Opus 4.5: $5/$25 per million tokens (planning, 67% cheaper than Opus 4.1)
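For a feel of what these rates mean in practice, here is a small cost estimate. The prices are copied from the list above; the token counts are made up for illustration, and `estimate_cost` is a hypothetical helper rather than part of grind.

```python
# USD per million tokens as (input, output), copied from the pricing list above.
PRICES = {
    "haiku": (1.00, 5.00),
    "sonnet": (3.00, 15.00),
    "opus": (5.00, 25.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one run."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical run: 200K input tokens and 50K output tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 200_000, 50_000):.2f}")
# haiku: $0.45
# sonnet: $1.35
# opus: $2.25
```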
# Clone the repo
cd claude_code_agent
# Install dependencies
uv sync
# Verify Claude Code CLI is installed
claude --version
Fix one thing:
uv run grind run --task "Fix failing unit tests" --verify "pytest tests/ -v"
# Short form
uv run grind -t "Fix tests" -v "pytest"
When you have a list of tasks:
# Create a tasks file (or use decompose to generate one)
uv run grind batch tasks.yaml
tasks.yaml format:
tasks:
  - task: "Fix auth tests"
    verify: "pytest tests/auth/ -v"
    max_iterations: 5
  - task: "Fix API tests"
    verify: "pytest tests/api/ -v"
    max_iterations: 5
When you have a big problem and need Claude to break it down:
Option A: Using slash command in conversation
Talk to Claude about your problems, then:
/generate-tasks
Reviews context and generates tasks.yaml automatically
Option B: Using CLI decompose
# Analyze and create task list
uv run grind decompose \
--problem "Fix all 47 failing tests" \
--verify "pytest tests/ -v" \
--output tasks.yaml
# Then run the generated tasks
uv run grind batch tasks.yaml
What works:
- Interactive shell for running grind tasks
- Command history and tab completion
- Basic task execution and status tracking
What's planned:
- Real-time multi-agent monitoring
- DAG visualization
- Log streaming dashboard
Try it:
# Launch TUI
uv run grind tui
# Launch with task file
uv run grind tui -t tasks.yaml
Navigate tabs with 1-6 keys. Use tab 6 (Shell) for interactive commands.
After running DAG tasks with worktrees, you'll have multiple branches with fixes. Use the intelligent merge command to combine them:
# Interactive merge with conflict resolution
uv run grind merge
# Merge specific branches
uv run grind merge fix/lint fix/tests fix/types
# Custom pattern
uv run grind merge --pattern "feature/*,bugfix/*"
# With post-merge verification
uv run grind merge --verify "pytest && ruff check"
# Dry run (see what would be merged)
uv run grind merge --dry-run
What makes this smart:
- ✓ Merges clean branches automatically
- ⚠️ Prompts only when conflicts occur
- 💾 Creates backup and staging branches (never touches main directly)
- 🧪 Runs verification after merging
- 📊 Shows clear summary with next steps
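The `--pattern` value shown earlier looks like a comma-separated list of shell-style globs. Assuming that reading is right (the actual matching rules may differ), branch selection could be sketched as follows; `select_branches` and the branch names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_branches(branches: list[str], pattern: str) -> list[str]:
    """Return branches matching any glob in a comma-separated pattern string."""
    globs = [g.strip() for g in pattern.split(",") if g.strip()]
    return [b for b in branches if any(fnmatchcase(b, g) for g in globs)]


# Hypothetical branch list for illustration.
branches = ["main", "feature/login", "bugfix/null-check", "fix/lint"]
print(select_branches(branches, "feature/*,bugfix/*"))
# ['feature/login', 'bugfix/null-check']
```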
Conflict resolution options: when conflicts occur, you'll be prompted to choose one of the following:
- Show diff (investigate the conflict)
- Keep ours (discard their changes)
- Keep theirs (accept their changes)
- Skip this branch (handle manually later)
- Abort entire merge
After merging:
# Review the merged result
git diff main..grind-merge-20251207-1430
# If satisfied, merge to main
git checkout main
git merge grind-merge-20251207-1430 --ff-only
# Let Claude analyze and decompose
uv run grind decompose \
-p "Fix all failing pytest tests" \
-v "pytest tests/ -v --tb=short" \
-o test-tasks.yaml
# Review the generated tasks
cat test-tasks.yaml
# Run them
uv run grind batch test-tasks.yaml
# Decompose by issue type/file
uv run grind decompose \
-p "Fix all SonarQube code smells and bugs" \
-v "sonar-scanner && ./check-quality-gate.sh" \
-o sonar-tasks.yaml
uv run grind batch sonar-tasks.yaml
# Usually a single grind is enough for linting
uv run grind run \
-t "Fix all ruff linting errors" \
-v "ruff check src/"
uv run grind run \
-t "Fix all mypy type errors" \
-v "mypy src/ --strict" \
-n 15 # May need more iterations for complex type fixes
| Option | Short | Default | Description |
|---|---|---|---|
| --task | -t | required | What to fix |
| --verify | -v | required | Command to verify (exit 0 = pass) |
| --max-iter | -n | 10 | Max iterations |
| --cwd | -c | . | Working directory |
| --verbose | | false | Show full Claude output |
| --quiet | -q | false | Minimal output |
| Option | Description |
|---|---|
| file | YAML/JSON file with task list |
| --verbose | Show full output |
| --stop-on-stuck | Stop if any task gets stuck |
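Since the task file can be hand-edited, a quick validation pass before running batch can catch missing fields. The sketch below checks data that has already been parsed from YAML or JSON. `TaskSpec` and `parse_tasks` are hypothetical names, the JSON shape is assumed to mirror the tasks.yaml format shown earlier, and the default of 10 iterations is an assumption borrowed from `--max-iter`.

```python
import json
from dataclasses import dataclass


@dataclass
class TaskSpec:
    task: str
    verify: str
    max_iterations: int = 10  # assumed to mirror the --max-iter default


def parse_tasks(data: dict) -> list[TaskSpec]:
    """Validate already-parsed YAML/JSON data and return typed task specs."""
    entries = data.get("tasks")
    if not isinstance(entries, list) or not entries:
        raise ValueError("expected a non-empty 'tasks' list")
    specs = []
    for index, entry in enumerate(entries):
        missing = [key for key in ("task", "verify") if not entry.get(key)]
        if missing:
            raise ValueError(f"task {index}: missing {', '.join(missing)}")
        specs.append(
            TaskSpec(entry["task"], entry["verify"], int(entry.get("max_iterations", 10)))
        )
    return specs


# JSON equivalent of one tasks.yaml entry (shape assumed, see above).
raw = '{"tasks": [{"task": "Fix auth tests", "verify": "pytest tests/auth/ -v", "max_iterations": 5}]}'
print(parse_tasks(json.loads(raw)))
```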
| Option | Short | Description |
|---|---|---|
| --problem | -p | Problem to analyze |
| --verify | -v | Verification command |
| --output | -o | Save tasks to file |
| --cwd | -c | Working directory |
| --verbose | | Show analysis |
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Error |
| 2 | Agent got stuck |
| 3 | Max iterations reached |
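These exit codes make it easy to script around grind. The sketch below wraps the documented `grind run` invocation and maps each code to a follow-up. The `GrindExit` enum, `run_grind`, and the advice strings are hypothetical, and whether every subcommand uses the same codes is an assumption.

```python
import subprocess
from enum import IntEnum


class GrindExit(IntEnum):
    """Exit codes from the table above."""
    SUCCESS = 0
    ERROR = 1
    STUCK = 2
    MAX_ITERATIONS = 3


# Hypothetical follow-up suggestions; not produced by the tool itself.
ADVICE = {
    GrindExit.SUCCESS: "done",
    GrindExit.ERROR: "inspect the error output",
    GrindExit.STUCK: "split the task with decompose",
    GrindExit.MAX_ITERATIONS: "retry with a higher --max-iter",
}


def run_grind(task: str, verify: str) -> GrindExit:
    """Invoke the CLI with the documented flags and return its exit status."""
    result = subprocess.run(
        ["uv", "run", "grind", "run", "--task", task, "--verify", verify]
    )
    return GrindExit(result.returncode)  # raises ValueError on an undocumented code
```

A caller could then branch on the result, for example `print(ADVICE[run_grind("Fix tests", "pytest")])`.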
Custom slash commands for use in Claude Code conversations:
/generate-tasks: generates a tasks.yaml file from conversation context.
Usage: Just type /generate-tasks after discussing problems/goals with Claude.
It will:
- Analyze what you've been discussing
- Break down into actionable tasks
- Choose appropriate models
- Generate properly formatted YAML
- Write to file and show usage
See .claude/commands/README.md for details.
Choose the right model for your task based on complexity and budget (December 2025 rates):
| Model | Use Case | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|---|
| haiku (default) | Simple fixes, linting, formatting | $1.00 | $5.00 |
| sonnet | Bug fixes, refactoring, medium complexity | $3.00 | $15.00 |
| opus | Planning, architecture, complex logic | $5.00 | $25.00 |
Usage:
# Use default (haiku)
uv run grind run -t "Fix linting" -v "ruff check ."
# Specify model explicitly
uv run grind run -t "Refactor auth" -v "pytest tests/auth/" -m sonnet
Recommendation: Start with haiku for most tasks. Use sonnet for medium complexity work. Reserve opus for architectural decisions and complex planning tasks.
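The recommendation above can be expressed as a small lookup when scripting grind. This is only a sketch and not how CostAwareRouter works internally: the complexity labels and `build_command` are hypothetical, while the `-t`, `-v`, and `-m` flags and the model aliases are the ones shown in the usage examples.

```python
import shlex

# Hypothetical complexity labels; the model aliases come from the table above.
MODEL_FOR = {"simple": "haiku", "medium": "sonnet", "complex": "opus"}


def build_command(task: str, verify: str, complexity: str = "simple") -> list[str]:
    """Build a `grind run` invocation with a model chosen by complexity."""
    model = MODEL_FOR.get(complexity, "haiku")  # unknown labels fall back to the default
    return ["uv", "run", "grind", "run", "-t", task, "-v", verify, "-m", model]


command = build_command("Refactor auth", "pytest tests/auth/", complexity="medium")
print(shlex.join(command))  # shell-quoted form, ready to paste into a terminal
```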
- Use /generate-tasks in conversations to automatically create task files
- Start with decompose for large problems - let Claude figure out the chunks
- Review generated tasks before running batch - you can edit the YAML
- Use --verbose while learning to see what Claude is doing
- Lower max_iterations for quick tasks, higher for complex ones
- Good verification commands give useful error output
- Choose models wisely - haiku for simple tasks, sonnet for medium complexity, opus for planning/architecture
grind/
__init__.py # Package exports
models.py # Data structures
engine.py # Core grind loop
hooks.py # Slash command hooks
prompts.py # Prompt templates
tasks.py # Task loading
batch.py # Batch execution
cli.py # Command-line interface
utils.py # Output formatting
grind.py # Entry point
examples/
example-tasks.yaml # Example task definitions
Grind Loop is designed to work seamlessly with Claude Code.
# Install dependencies
uv sync
# Install slash commands globally (optional but recommended)
make install-commands
Now use /generate-tasks in any Claude Code conversation to automatically generate task files!
See "Using with Claude Code" for the complete integration guide.
📚 Full Documentation (MkDocs site)
- Using with Claude Code - Integration guide and workflows
- Getting Started - Installation and setup
- Features Guide - Complete feature reference
- Architecture - System design
- SDK Reference - Claude Agent SDK docs
# View documentation locally
make docs
# Or manually:
uv run mkdocs serve
Then open http://127.0.0.1:8000