
Add prompt-refactor meta-skill and SWE-Bench modular skills#42

Draft
juanmichelini wants to merge 1 commit into main from add-prompt-refactor-skill

Conversation

@juanmichelini
Collaborator

Overview

This PR introduces a new approach to managing prompt complexity by decomposing monolithic prompts into modular, reusable skills.

What's Added

1. prompt-refactor (Meta-Skill)

A meta-skill that helps decompose monolithic prompts into modular, reusable components:

  • analyze_prompt.py script - Automated prompt analysis tool that identifies phases and structure
  • skill-design-patterns.md - Reference documentation for best practices in skill design
  • Workflow for analyzing, designing, and implementing modular skills
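To make the analysis step concrete, here is a minimal sketch of the kind of phase detection a script like analyze_prompt.py could perform, splitting a prompt template at its headings. The function name and heading-based heuristic are illustrative assumptions, not the script's actual API:

```python
"""Illustrative sketch only: the real analyze_prompt.py may use a
different strategy; this splits a prompt into phases at markdown headings."""
import re

def find_phases(prompt_text: str) -> list[dict]:
    """Split a prompt template into phases, one per markdown heading."""
    phases = []
    current = {"title": "preamble", "lines": []}
    for line in prompt_text.splitlines():
        match = re.match(r"^#+\s+(.*)", line)
        if match:
            if current["lines"]:
                phases.append(current)
            # Start a new phase named after the heading.
            current = {"title": match.group(1), "lines": []}
        else:
            current["lines"].append(line)
    if current["lines"]:
        phases.append(current)
    return phases

prompt = "# Explore\nRead the code.\n# Fix\nApply the patch.\n"
for phase in find_phases(prompt):
    print(phase["title"], len(phase["lines"]))
```

Each detected phase is then a candidate for extraction into its own skill.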

2. SWE-Bench Modular Skills (6 skills)

Six specialized skills created by refactoring the SWE-Bench prompt template:

  • code-issue-analysis - Analyze bug reports and issue descriptions
  • code-exploration - Explore codebases to understand structure and patterns
  • test-reproduction - Create minimal reproduction scripts before fixing
  • fix-analysis - Analyze problems and plan fixes with best practices
  • code-implementation - Implement fixes following established patterns
  • code-verification - Thoroughly test and verify implementations

Why This Matters

Before (Monolithic Prompt)

  • 3874 characters, 63 lines of instructions in a single template
  • Hard to maintain and update individual sections
  • No reusability across different use cases
  • Context window bloat with instructions that may not all be relevant

After (Modular Skills)

  • 7 focused, reusable skills (average ~80 lines each)
  • Each skill is independently maintainable
  • Skills can be reused across different benchmarks (SWE-Bench, Commit0, etc.)
  • Progressive disclosure - only the relevant skills are loaded when needed
  • Better context window management
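The progressive-disclosure idea above can be sketched as a tiny skill registry that surfaces only skills whose metadata matches the task at hand. The registry, descriptions, and keyword-overlap rule below are assumptions for illustration, not OpenHands' actual loading mechanism:

```python
"""Hypothetical sketch of progressive disclosure: load a skill's
SKILL.md into context only when its description matches the task."""

# Two of the six SWE-Bench skills, with descriptions taken from this PR.
SKILLS = {
    "test-reproduction": "Create minimal reproduction scripts before fixing",
    "code-verification": "Thoroughly test and verify implementations",
}

def relevant_skills(task: str) -> list[str]:
    """Return skills whose description shares a keyword with the task."""
    task_words = set(task.lower().split())
    return [
        name
        for name, description in SKILLS.items()
        if task_words & set(description.lower().split())
    ]

print(relevant_skills("reproduction of the reported crash"))
```

Only the matching skills' instructions would then be injected into the prompt, keeping the context window small.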

Benefits

  1. Reusability - Skills can be used across different benchmarks and workflows
  2. Maintainability - Update one skill without affecting others
  3. Testability - Each skill can be tested independently
  4. Context Efficiency - Progressive disclosure loads only what's needed
  5. Discoverability - Clear skill metadata makes it easy to understand when to use each skill

Example Usage

The prompt-refactor meta-skill can be used to refactor other monolithic prompts:

python skills/prompt-refactor/scripts/analyze_prompt.py path/to/prompt.j2

The SWE-Bench skills can be referenced in new prompt templates or used directly by OpenHands when solving coding issues.

Testing

All skills follow the established SKILL.md format with:

  • ✅ Valid YAML frontmatter (name, description)
  • ✅ Clear overview and usage guidance
  • ✅ Step-by-step workflows
  • ✅ Best practices and examples
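The frontmatter check in this list could look roughly like the following sketch. The required field names (name, description) come from this PR; the parser itself is a simplified assumption that only handles flat `key: value` pairs, not full YAML:

```python
"""Simplified sketch of a SKILL.md frontmatter check; a real check
would use a YAML parser rather than flat key: value splitting."""

def check_frontmatter(text: str) -> list[str]:
    """Return a list of problems; an empty list means the check passes."""
    problems = []
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---'"]
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    else:
        problems.append("missing closing '---'")
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing required field: {required}")
    return problems

sample = "---\nname: code-exploration\ndescription: Explore codebases\n---\n"
print(check_frontmatter(sample))
```

A check like this could run in CI so that every new skill keeps the format consistent.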

Related Work

This is part of a larger effort to transform the SWE-Bench benchmark prompt architecture. A corresponding PR will be created for the benchmarks repository with a new skills.j2 template that references these modular skills.

Files Changed

  • skills/prompt-refactor/SKILL.md - Meta-skill documentation
  • skills/prompt-refactor/scripts/analyze_prompt.py - Prompt analysis tool
  • skills/prompt-refactor/references/skill-design-patterns.md - Design patterns
  • skills/code-issue-analysis/SKILL.md - Issue analysis skill
  • skills/code-exploration/SKILL.md - Code exploration skill
  • skills/test-reproduction/SKILL.md - Test reproduction skill
  • skills/fix-analysis/SKILL.md - Fix analysis skill
  • skills/code-implementation/SKILL.md - Implementation skill
  • skills/code-verification/SKILL.md - Verification skill

Total: 1101 lines added across 9 files

Co-authored-by: openhands <openhands@all-hands.dev>

juanmichelini marked this pull request as draft on February 13, 2026 at 14:20
@openhands-ai

openhands-ai bot commented Feb 13, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Check README.md in Skills

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #42 at branch `add-prompt-refactor-skill`

Feel free to include any additional details that might help me get this PR into a better state.
