Claude Skill for benchmarks#337

Open
gabotechs wants to merge 11 commits into main from gabrielmusat/claude-skill-benchmarks

@gabotechs gabotechs commented Feb 2, 2026

This PR adds several improvements to the benchmarks:

  • Claude Skills: adds some Claude Code skills for dealing with remote benchmarks. These benchmarks typically take a while to execute, and analyzing the results can be time-consuming. This PR allows Claude to autonomously handle the tedious tasks of running benchmarks, waiting for the results, analyzing them, making small changes, and redeploying to benchmark again.
  • Adds Ballista back: adds the Ballista benchmarks back so that we have another engine to compare against. This time, Ballista lives in a cargo crate detached from the workspace, so users do not need to care about importing it. The Ballista code is pretty much what was removed in Remove ballista from benchmarks #292.
  • Stores execution plans: stores the execution plans, with metrics attached, alongside each benchmark run. This way, Claude can run benchmarks and analyze the execution plans together with the results in order to identify bottlenecks and points for improvement.
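
For reference, one way to keep a crate detached from a Cargo workspace is the `exclude` key in the root manifest. This is only a sketch: the `benchmarks/ballista` path and the member list are hypothetical placeholders, not necessarily the layout this PR uses.

```toml
# Root Cargo.toml (sketch; the paths below are hypothetical placeholders)
[workspace]
members = ["."]
# Crates listed here are not workspace members, so a plain `cargo build`
# at the root does not build them or resolve their dependencies.
exclude = ["benchmarks/ballista"]
```

Under that assumption, the detached crate would be built on its own, for example with `cargo build --manifest-path benchmarks/ballista/Cargo.toml`.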

@gabotechs gabotechs changed the title from "Gabrielmusat/claude skill benchmarks" to "Claude skill benchmarks" on Feb 2, 2026
@gabotechs gabotechs changed the title from "Claude skill benchmarks" to "Claude Skill for benchmarks" on Feb 2, 2026
@gabotechs gabotechs force-pushed the gabrielmusat/fix-metrics-display-on-leaf-nodes branch from a2311e6 to a33ebd5 on February 2, 2026 19:06
Base automatically changed from gabrielmusat/fix-metrics-display-on-leaf-nodes to main on February 3, 2026 16:59
@gabotechs gabotechs force-pushed the gabrielmusat/claude-skill-benchmarks branch from 76df407 to 357b99a on February 6, 2026 07:33

@dd-annarose dd-annarose left a comment

It looks good to me, but it might need a more experienced eye to take a look!

2 participants