Website · Docs · Discord · Changelog
Generate tests from requirements, simulate conversation flows, detect adversarial behaviors, evaluate with 60+ metrics, and trace failures with OpenTelemetry. Engineers and domain experts, working together.
We built Rhesis because existing LLM testing tools didn't meet our needs for testing agentic applications. If you face the same challenges, contributions are welcome.
Testing shouldn't be limited to engineers. Legal teams understand compliance requirements. Marketing knows brand guidelines. Domain experts identify edge cases. Rhesis enables everyone to contribute their expertise without writing code.
Define requirements in plain language. Rhesis generates test scenarios based on your team's collective knowledge. Execute tests automatically via UI, SDK, or CI/CD. Get detailed results showing exactly how your LLM and agentic applications perform.
MIT licensed. The enterprise version lives in ee/ folders and remains separate.
Check out our main repository and documentation to get started.
Quick start options:
- Cloud - app.rhesis.ai - Managed service; just connect your app
- Self-hosted - Run locally with Docker in 5 minutes
- Python SDK - Integrate directly into your codebase
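To make the requirements → scenarios → evaluation loop described above concrete, here is a minimal sketch in plain Python. All names here (`generate_scenarios`, `run_app`, `evaluate`) are hypothetical stand-ins, not the actual Rhesis SDK API; see the SDK docs for the real interface.

```python
# Illustrative sketch only: every function below is a hypothetical
# stand-in, not the Rhesis SDK. It mirrors the loop described above:
# plain-language requirement -> generated scenarios -> automated checks.

def generate_scenarios(requirement: str) -> list[dict]:
    """Turn a plain-language requirement into concrete test cases.
    (Rhesis automates this generation; here it is stubbed.)"""
    return [
        {"prompt": f"User asks about: {requirement}",
         "must_not_contain": "guarantee"},
    ]

def run_app(prompt: str) -> str:
    """Stand-in for your LLM application under test."""
    return "I can help with that, but I cannot guarantee outcomes."

def evaluate(scenarios: list[dict]) -> list[bool]:
    """Run each scenario against the app and apply its check."""
    results = []
    for scenario in scenarios:
        reply = run_app(scenario["prompt"])
        results.append(scenario["must_not_contain"] not in reply)
    return results

results = evaluate(generate_scenarios("refund policy questions"))
print(results)  # -> [False]: the reply contains the forbidden word
```

In practice the generation and evaluation steps run inside Rhesis (with its 60+ metrics) rather than hand-written checks like this, and the same loop can be triggered from the UI or a CI/CD pipeline.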
