Skip to content

feat: rewrite load test suite#4099

Closed
MasterPtato wants to merge 1 commit into02-02-chore_guard_centralize_cors_configfrom
02-02-feat_rewrite_load_test_suite
Closed

feat: rewrite load test suite#4099
MasterPtato wants to merge 1 commit into02-02-chore_guard_centralize_cors_configfrom
02-02-feat_rewrite_load_test_suite

Conversation

@MasterPtato
Copy link
Contributor

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-4099 February 3, 2026 02:08 Destroyed
@railway-app
Copy link

railway-app bot commented Feb 3, 2026

🚅 Deployed to the rivet-pr-4099 environment in rivet-frontend

Service Status Web Updated (UTC)
mcp-hub 🕐 Queued (View Logs) Web Feb 4, 2026 at 6:49 pm
frontend-inspector 🕒 Building (View Logs) Web Feb 4, 2026 at 6:49 pm
website 🕒 Building (View Logs) Web Feb 4, 2026 at 6:49 pm
frontend-cloud 🕒 Building (View Logs) Web Feb 4, 2026 at 6:49 pm

Copy link
Contributor Author

MasterPtato commented Feb 3, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Feb 3, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4099

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@4099

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4099

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4099

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4099

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4099

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4099

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4099

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4099

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4099

commit: 07288c0

@claude
Copy link

claude bot commented Feb 3, 2026

Code Review - PR #4099: Rewrite Load Test Suite

This PR migrates the load test infrastructure from a custom TypeScript implementation to k6, consolidating testing under scripts/tests/load-test/. Overall, this is a solid improvement that brings professional-grade load testing capabilities.


✅ Strengths

Architecture & Design

  • Excellent choice of k6: Industry-standard load testing tool with better performance
  • Good consolidation: Moving from engine/tests/load/ to scripts/tests/load-test/ better organizes the codebase
  • Well-structured variations: The sporadic/idle/chatty test patterns provide good coverage
  • Comprehensive metrics: Custom k6 metrics provide excellent observability

Developer Experience

  • Outstanding documentation: QUICKSTART.md is clear and well-organized
  • Helpful validation: Pre-flight checks for k6, engine health, and test-runner prevent failures
  • Good error messages: Formatted error messages with solutions are helpful
  • Smart CLI design: Duration validation prevents configuration mistakes

🔍 Critical Issues

1. Incorrect log message in SIGINT handler (test-runner/src/index.ts:127)
Line says "received SIGTERM" but should say "received SIGINT"

2. Hardcoded localhost URL (test-runner/src/index.ts:138)
Should use RIVET_ENDPOINT instead of hardcoding http://127.0.0.1:6420

3. Empty function body formatting (test-runner/src/index.ts:187)
onDisconnected: () => { } has space inside braces - should be () => {}


🐛 Code Quality Issues

4. Missing radix parameter (run.ts:102, 110)
Consider parseInt(value, 10) to be explicit

5. WebSocket error handling in chatty mode (actor-lifecycle.js:583-605)
Connection failures are caught but loop continues without WebSocket testing

6. K6 script uses JavaScript (actor-lifecycle.js)
Inconsistent with TypeScript codebase - consider k6 TypeScript support

7. Memory leak potential (test-runner/src/index.ts:30)
websocketLastMsgIndexes Map is never cleaned up - add cleanup in close handler

8. No comment on blocking sleep (actor-lifecycle.js:258)
sleep(durationSeconds) blocks VU - add comment this is intentional for idle variation

9. Health check timeouts (run.ts:314-368)
Individual fetch has 5s timeout, but no overall timeout for health checks

10. Variable naming clarity (run.ts:123)
variationDurationMs vs testDuration - both are in ms but naming is inconsistent

11. Counter race condition (actor-lifecycle.js:123, 392)
activeActors only decremented on successful destroy - fails leave it inaccurate

12. Success check needs comment (actor-lifecycle.js:232-241)
WebSocket success logic is correct but subtle - add clarifying comment


📝 Documentation Issues

13. Inconsistent script names (run.ts:209)
Help says run-load-test.ts but file is run.ts - be consistent

14. Missing variation explanation (QUICKSTART.md)
Doesn't explain what "sporadic" means - users might be confused

15. Incorrect path (QUICKSTART.md:38)
Says load-test/run-load-test.ts but should be load-test/run.ts


💡 Suggestions

16. No input validation: Consider validating RIVET_TOKEN and RIVET_ENDPOINT
17. Token in URL path: May appear in logs - verify this is intended
18. Strict thresholds: 95% success rate might fail in slow environments - make configurable
19. No cleanup on failure: Interrupted tests may leave actors running
20. No tests for test-runner changes: Functional improvements lack verification


Future Improvements

  1. Add InfluxDB/Prometheus output examples for trend tracking
  2. Allow mixing variations (70% sporadic, 20% idle, 10% chatty)
  3. Add distributed testing support
  4. Add CI integration examples
  5. Add actor cleanup utility for failed runs

📋 Required Changes

  1. ✅ Fix SIGINT log message
  2. ✅ Fix hardcoded localhost URL
  3. ✅ Add cleanup for websocketLastMsgIndexes map
  4. ✅ Fix script name inconsistencies
  5. ✅ Add variation explanations to QUICKSTART

✨ Assessment

High-quality refactor that significantly improves load testing. K6 migration is well-executed, developer experience is excellent, code is mostly clean. Issues are minor and easily addressable.

Recommendation: Approve with requested changes.

Great work on comprehensive help text, validation, and documentation! 🚀

@claude
Copy link

claude bot commented Feb 3, 2026

Pull Request Review: Load Test Suite Rewrite

This PR rewrites the load testing infrastructure with a unified k6-based solution.

Strengths

  1. Excellent documentation with clear QUICKSTART guide
  2. Comprehensive test variations (sporadic, idle, chatty)
  3. Good pre-flight health checks
  4. Rich custom k6 metrics
  5. User-friendly CLI with helpful errors

Critical Issues

1. SIGTERM Handler Bug (test-runner/src/index.ts:53)
Log says SIGTERM but handler is for SIGINT

2. Runner Variable Scope (test-runner/src/index.ts:23)
In /start endpoint, local variables shadow global runner, breaking /shutdown

3. File Naming Inconsistency
Docs reference run-load-test.ts but file is run.ts

4. Missing Dependency (scripts/tests/package.json)
Missing tsx in devDependencies

5. Wrong Token Variable (run.ts:322)
Uses global RIVET_TOKEN instead of function parameter

6. Race Condition Risk (actor-lifecycle.js:1842-1896)
Chatty test opens/closes WebSockets in tight loop without rate limiting

Recommendations

High Priority:

  • Fix SIGINT/SIGTERM log mismatch
  • Fix runner variable scope in /start endpoint
  • Rename file OR update all docs for consistency
  • Add tsx to devDependencies
  • Fix token variable in checkEngineHealth

Medium Priority:

  • Add rate limiting to chatty WebSocket connections
  • Remove unnecessary space in empty function (line 71)

Overall: Well-designed rewrite that improves testing. Address the bugs above before merging.

process.exit(0);
});
process.on("SIGINT", async () => {
getLogger().debug("received SIGTERM, force exiting in 3s");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy-paste error in SIGINT handler log message. It says "received SIGTERM" but should say "received SIGINT".

getLogger().debug("received SIGINT, force exiting in 3s");
Suggested change
getLogger().debug("received SIGTERM, force exiting in 3s");
getLogger().debug("received SIGINT, force exiting in 3s");

Spotted by Graphite Agent

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +119 to +132
process.on("SIGTERM", async () => {
getLogger().debug("received SIGTERM, force exiting in 3s");

await new Promise(res => setTimeout(res, 3000));

process.exit(0);
});
process.on("SIGINT", async () => {
getLogger().debug("received SIGTERM, force exiting in 3s");

await new Promise(res => setTimeout(res, 3000));

process.exit(0);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These signal handlers contain unhandled promises. Since Node.js event handlers don't wait for async functions to complete, these create 'floating promises'. Fix by either removing the async keyword and using setTimeout with a callback, or add .catch() handlers to properly handle errors.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

runnerStarted.resolve(runner);
},
onDisconnected: () => {},
onDisconnected: () => { },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent formatting with extra space after the empty function body. This should be formatted as 'onDisconnected: () => {}' to match the code style used elsewhere.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

@MasterPtato MasterPtato force-pushed the 02-02-feat_rewrite_load_test_suite branch from 07288c0 to 1728003 Compare February 3, 2026 21:30
@claude
Copy link

claude bot commented Feb 3, 2026

Pull Request Review: Load Test Suite Rewrite

Summary

This PR replaces the previous TypeScript-based load testing suite with a new k6-based implementation. The change moves from custom test code to industry-standard load testing tooling, providing better metrics, configuration options, and test variations.

✅ Strengths

  1. Better tooling choice: Using k6 is a solid decision for load testing - industry-standard, well-maintained, and provides excellent metrics.

  2. Comprehensive documentation: QUICKSTART.md is well-structured and user-friendly with clear setup steps and troubleshooting.

  3. Good test coverage: Three test variations (sporadic, idle, chatty) cover different usage patterns effectively.

  4. Health checks: Pre-flight checks in run.ts for k6, engine health, and test runner availability provide excellent UX.

  5. Cleanup: Removing old engine/tests/load and engine/tests/smoke directories eliminates dead code.

  6. Structured metrics: Custom k6 metrics provide good observability (success rates, durations, counters).

⚠️ Issues & Concerns

1. Error Handling - Missing Try-Catch in Finally Blocks 🔴

Location: actor-lifecycle.js:469-471, 521-523, 625-627

Problem: If destroyActor throws in the finally block, it will mask the original error and crash the VU.

Fix: Wrap cleanup in try-catch to prevent masking original errors.

2. Inconsistent Logging - Mixed Capitalization 🟡

Location: Throughout actor-lifecycle.js

Per CLAUDE.md, log messages should be lowercase unless mentioning code symbols. Example: "Created actor" should be "created actor".

3. WebSocket Timeout Not Configurable 🟡

Location: actor-lifecycle.js:224, 604

5-second timeout is hardcoded. Under high load, legitimate operations might timeout. Make configurable via environment variable.

4. Missing Structured Logging in TypeScript 🟡

Location: test-runner/src/index.ts:199, 247, 252

Per CLAUDE.md, use structured logging instead of string interpolation. Convert messages like "Fetch called for actor X" to structured format.

5. No Graceful Shutdown for SIGTERM/SIGINT 🔴

Location: test-runner/src/index.ts:119-132

Signal handlers wait 3 seconds but do not call runner.shutdown(). This could leave actors in undefined states.

Fix: Call await runner?.shutdown(true) before process.exit().

6. Missing .gitignore Entries 🟡

Location: scripts/tests/load-test/.gitignore

Missing common k6 output files: *.json, *.csv, summary.html

7. Logger Default Level 🟢

Location: test-runner/src/log.ts:19

Default "warn" level makes debugging difficult. Consider "info" for better test visibility.

8. Race Condition in Chatty WebSocket 🟢

Location: actor-lifecycle.js:583-605

Socket closes immediately after sending without waiting for responses. Could lose messages. Consider adding delay or document as fire-and-forget.

🔒 Security Concerns

  1. Token defaults: RIVET_TOKEN defaults to "dev". Documentation should warn against default tokens in production.

  2. No rate limiting validation: Code does not warn if configuration might overwhelm the system.

🎯 Performance Considerations

  1. WebSocket connection reuse: Chatty mode creates new connection per burst. Consider reusing connections.

  2. Memory leak: websocketLastMsgIndexes Map (index.ts:30) grows unbounded. Add cleanup for closed connections.

📋 Test Coverage

Missing automated tests for the load testing infrastructure itself. Consider unit tests for parseDuration() and calculateTestDuration().

📝 Documentation

Excellent overall. Suggestions:

  • Add examples of interpreting failed results
  • Document baseline performance metrics
  • Expand k6 error troubleshooting

✅ Recommendations

Must Fix (Blocking) 🔴:

  1. Fix error handling in finally blocks
  2. Fix graceful shutdown handlers

Should Fix (High Priority) 🟡:
3. Convert to structured logging
4. Fix logging capitalization
5. Add .gitignore entries
6. Fix websocketLastMsgIndexes memory leak

Nice to Have 🟢:
7. Make WebSocket timeout configurable
8. Change default log level to info
9. Add unit tests

📊 Overall Assessment

Solid rewrite modernizing load testing infrastructure. k6 is the right choice, documentation is excellent, test variations provide good coverage. However, error handling and graceful shutdown issues should be addressed before merging.

Recommendation: Request changes for blocking issues, then approve after fixes.


Review by Claude Code following CLAUDE.md conventions

@MasterPtato MasterPtato force-pushed the 02-02-chore_guard_centralize_cors_config branch from cb28ad8 to d3bae78 Compare February 4, 2026 18:49
@MasterPtato MasterPtato force-pushed the 02-02-feat_rewrite_load_test_suite branch from 1728003 to 349575f Compare February 4, 2026 18:49
@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-4099 February 4, 2026 18:49 Destroyed
@graphite-app
Copy link
Contributor

graphite-app bot commented Feb 4, 2026

Merge activity

  • Feb 4, 6:50 PM UTC: MasterPtato added this pull request to the Graphite merge queue.
  • Feb 4, 6:51 PM UTC: CI is running for this pull request on a draft pull request (#4113) due to your merge queue CI optimization settings.
  • Feb 4, 6:51 PM UTC: Merged by the Graphite merge queue via draft PR: #4113.

graphite-app bot pushed a commit that referenced this pull request Feb 4, 2026
# Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

## Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update

## How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

## Checklist:

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
@graphite-app graphite-app bot closed this Feb 4, 2026
@graphite-app graphite-app bot deleted the 02-02-feat_rewrite_load_test_suite branch February 4, 2026 18:51
@claude
Copy link

claude bot commented Feb 4, 2026

Pull Request Review: Rewrite Load Test Suite

This PR rewrites the load test suite from engine/tests/load/ to scripts/tests/load-test/ with k6-based testing.

Key Strengths

  • Better separation of concerns moving tests to scripts/
  • Professional k6 integration with metrics and thresholds
  • Three test variations (sporadic, idle, chatty) for comprehensive coverage
  • Excellent QUICKSTART.md documentation
  • Good error handling with try-catch and finally blocks

Critical Issues

  1. Missing input validation - run.ts:102 parseInt without NaN checking
  2. WebSocket race condition - actor-lifecycle.js:218 timeout set before connection opens
  3. Async/sync mismatch - log.ts:154 async configureDefaultLogger called without await

High Priority

  1. Actor cleanup failure - destroyActor in finally blocks needs retry logic
  2. Inconsistent error handling - chatty test missing try-catch around WebSocket
  3. Memory leak - websocketLastMsgIndexes Map never cleared

Medium Priority

  1. Non-idempotent metrics - use Gauge instead of Counter for activeActors
  2. Missing graceful shutdown in k6 teardown
  3. Hardcoded timeouts throughout
  4. Code duplication across test variations

Recommendations

Must fix: Issues 1-3
Should fix: Issues 4-6
Consider: Extract common setup/teardown, add README.md

Summary

Solid rewrite improving load testing infrastructure. Main concerns are error handling edge cases, input validation, and resource leaks. Recommend approve with requested changes for critical and high priority issues.

The code demonstrates good engineering practices but needs production-readiness improvements before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant