Skip to content

Conversation

@NathanFlurry
Copy link
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Copy link
Member Author

NathanFlurry commented Feb 4, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Feb 4, 2026

PR Review: feat(workflow): add workflow history inspector integration

Summary

This PR adds workflow history inspector integration to RivetKit, allowing real-time observation of workflow execution state through the actor inspector interface. The implementation includes schema definitions, transport layer encoding, and integration with the workflow engine's history snapshot system.

Code Quality & Architecture

✅ Strengths

  1. Clean separation of concerns: The implementation properly separates transport layer (transport.ts), inspector adapter (inspector.ts), and schema definitions
  2. Type safety: Strong TypeScript typing throughout with proper generic constraints
  3. Event-driven design: Uses nanoevents for efficient history update notifications
  4. Proper serialization: Uses CBOR for efficient binary encoding of workflow history
  5. Optional integration: The inspector config is properly optional and doesn't break existing workflows

⚠️ Areas for Improvement

1. Missing null safety in toU64 (rivetkit/src/workflow/inspector.ts:62)

function toU64(value: number): bigint {
	return BigInt(Math.max(0, Math.floor(value)));
}

Issue: This function assumes value is always a valid number. If undefined or null is passed, it will return 0n silently.

Recommendation: Add explicit null checking or update the function signature to accept number | undefined.

2. Potential memory pressure in history snapshots (workflow-engine/src/storage.ts:69-92)

export function createHistorySnapshot(storage: Storage): WorkflowHistorySnapshot {
	// Creates full copy of all entries and metadata
}

Issue: This function is called on every history update via flush() (storage.ts:355-357). For long-running workflows with many entries, this creates a full copy of all history data on every flush.

Recommendation: Consider adding:

  • A debounce/throttle mechanism for history notifications
  • A diff-based approach that only sends changed entries
  • Documentation about the performance implications for workflows with large histories

3. Schema version handling lacks migration strategy

The PR removes v3 and v4 schemas (actor-inspector/v3.bare, v4.bare) but doesn't provide a migration path.

Recommendation:

  • Document the breaking change in the PR description
  • Consider adding a compatibility layer or version negotiation

Performance Considerations

🔴 Critical: Frequent CBOR encoding overhead

In workflow/inspector.ts, every single value (outputs, states, messages) is CBOR-encoded immediately when creating the snapshot. For workflows with large step outputs, many loop iterations, or frequent updates, this could cause significant CPU overhead and memory allocation pressure.

Recommendation:

  1. Add size limits for encoded values (similar to maxQueueMessageSize)
  2. Consider lazy encoding (only encode when sending to inspector client)
  3. Add performance metrics/benchmarks for workflows with large histories

Security Concerns

⚠️ Potential information disclosure

The inspector exposes full workflow history including:

  • Step outputs (may contain sensitive data)
  • Error messages (may contain stack traces with internal paths)
  • Message payloads (may contain user data)

Risk: If inspector connections are not properly authenticated, sensitive workflow data could be exposed.

Recommendation:

  1. Document that workflow outputs may be visible in the inspector
  2. Add guidance about redacting sensitive data in production
  3. Consider adding a config option to disable history in production environments

Test Coverage

✅ Good additions

  • history-updates.test.ts covers the basic notification flow
  • Tests run in both "yield" and "live" modes

⚠️ Missing coverage

  1. Error cases: No tests for malformed data, encoding errors, or large payloads
  2. Edge cases: Missing tests for workflows with >1000 entries, deeply nested locations, concurrent updates, and memory cleanup
  3. Integration: No end-to-end tests showing inspector client receiving updates

Best Practices

✅ Following project conventions

  • Uses structured logging with logger.debug()
  • Follows error handling patterns from CLAUDE.md
  • Proper use of readonly types
  • Follows naming conventions (e.g., *At for timestamps)

Documentation

⚠️ Missing documentation

  1. No JSDoc comments explaining the workflow inspector adapter interface
  2. No examples showing how to use the inspector with workflows
  3. No migration guide for the schema changes
  4. The CLAUDE.md in workflow-engine/ doesn't mention inspector integration

Recommendation: Add documentation for:

  • How to enable workflow inspector
  • What data is exposed and privacy implications
  • Performance characteristics
  • Example integration code

Recommendations Summary

High Priority

  1. Add null safety to toU64 function
  2. Add performance tests for large workflow histories
  3. Document security implications of exposing workflow data
  4. Add size limits for history entries to prevent memory issues

Medium Priority

  1. Add schema migration guide or version compatibility layer
  2. Consider debouncing/throttling history updates
  3. Add integration tests for inspector client receiving updates
  4. Add JSDoc documentation for public interfaces

Low Priority

  1. Add examples/guide for using workflow inspector
  2. Consider lazy CBOR encoding optimization

Conclusion

This is a solid implementation that follows good software engineering practices. The main concerns are around:

  1. Performance - The frequent snapshotting and encoding could be expensive for large workflows
  2. Security - Workflow data exposure needs to be documented
  3. Testing - Missing edge case and performance test coverage

The code is well-structured and maintainable. With the recommended improvements, this would be ready for production use.

Overall Assessment: ✅ Approve with recommendations

The PR is functional and well-designed, but addressing the performance and security concerns before merging would be beneficial for production deployments.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Feb 4, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4111

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@4111

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4111

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4111

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4111

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4111

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4111

@rivetkit/traces

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/traces@4111

@rivetkit/workflow-engine

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/workflow-engine@4111

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4111

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4111

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4111

commit: 7d741be

@@ -0,0 +1 @@
export * from "../../../dist/schemas/persist/v1";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import is failing because the dist/schemas/persist/v1.js file doesn't exist yet. The build:schema script needs to be run before building to generate this file. Consider adding a try/catch block to handle the case where the schema file doesn't exist yet, or ensure the build process runs build:schema first.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

@@ -0,0 +1 @@
export * from "../../../dist/schemas/transport/v1";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import is failing because the dist/schemas/transport/v1.js file doesn't exist yet. The build:schema script needs to be run before building to generate this file. Consider adding a try/catch block to handle the case where the schema file doesn't exist yet, or ensure the build process runs build:schema first.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines 175 to +176
"build": "tsup src/mod.ts src/client/mod.ts src/common/log.ts src/common/websocket.ts src/actor/errors.ts src/topologies/coordinate/mod.ts src/topologies/partition/mod.ts src/utils.ts src/driver-helpers/mod.ts src/driver-test-suite/mod.ts src/serve-test-suite/mod.ts src/test/mod.ts src/inspector/mod.ts src/workflow/mod.ts",
"build:schema": "./scripts/compile-bare.ts compile schemas/client-protocol/v1.bare -o dist/schemas/client-protocol/v1.ts && ./scripts/compile-bare.ts compile schemas/client-protocol/v2.bare -o dist/schemas/client-protocol/v2.ts && ./scripts/compile-bare.ts compile schemas/client-protocol/v3.bare -o dist/schemas/client-protocol/v3.ts && ./scripts/compile-bare.ts compile schemas/file-system-driver/v1.bare -o dist/schemas/file-system-driver/v1.ts && ./scripts/compile-bare.ts compile schemas/file-system-driver/v2.bare -o dist/schemas/file-system-driver/v2.ts && ./scripts/compile-bare.ts compile schemas/file-system-driver/v3.bare -o dist/schemas/file-system-driver/v3.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v1.bare -o dist/schemas/actor-persist/v1.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v2.bare -o dist/schemas/actor-persist/v2.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v3.bare -o dist/schemas/actor-persist/v3.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v4.bare -o dist/schemas/actor-persist/v4.ts && ./scripts/compile-bare.ts compile schemas/actor-inspector/v1.bare -o dist/schemas/actor-inspector/v1.ts && ./scripts/compile-bare.ts compile schemas/actor-inspector/v2.bare -o dist/schemas/actor-inspector/v2.ts && ./scripts/compile-bare.ts compile schemas/actor-inspector/v3.bare -o dist/schemas/actor-inspector/v3.ts && ./scripts/compile-bare.ts compile schemas/actor-inspector/v4.bare -o dist/schemas/actor-inspector/v4.ts"
"build:schema": "./scripts/compile-bare.ts compile schemas/client-protocol/v1.bare -o dist/schemas/client-protocol/v1.ts && ./scripts/compile-bare.ts compile schemas/client-protocol/v2.bare -o dist/schemas/client-protocol/v2.ts && ./scripts/compile-bare.ts compile schemas/client-protocol/v3.bare -o dist/schemas/client-protocol/v3.ts && ./scripts/compile-bare.ts compile schemas/file-system-driver/v1.bare -o dist/schemas/file-system-driver/v1.ts && ./scripts/compile-bare.ts compile schemas/file-system-driver/v2.bare -o dist/schemas/file-system-driver/v2.ts && ./scripts/compile-bare.ts compile schemas/file-system-driver/v3.bare -o dist/schemas/file-system-driver/v3.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v1.bare -o dist/schemas/actor-persist/v1.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v2.bare -o dist/schemas/actor-persist/v2.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v3.bare -o dist/schemas/actor-persist/v3.ts && ./scripts/compile-bare.ts compile schemas/actor-persist/v4.bare -o dist/schemas/actor-persist/v4.ts && ./scripts/compile-bare.ts compile schemas/persist/v1.bare -o dist/schemas/persist/v1.ts && ./scripts/compile-bare.ts compile schemas/transport/v1.bare -o dist/schemas/transport/v1.ts && ./scripts/compile-bare.ts compile schemas/actor-inspector/v1.bare -o dist/schemas/actor-inspector/v1.ts && ./scripts/compile-bare.ts compile schemas/actor-inspector/v2.bare -o dist/schemas/actor-inspector/v2.ts",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build script should be updated to run build:schema first, ensuring the schema files are generated before the main build process. For example: "build": "npm run build:schema && tsup src/mod.ts ..."

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +1 to +17
import * as cbor from "cbor-x";
import { createNanoEvents } from "nanoevents";
import type {
BranchStatus,
BranchStatusType,
EntryKind,
EntryStatus,
Location,
SleepState,
WorkflowHistoryEntry,
WorkflowHistorySnapshot,
WorkflowEntryMetadataSnapshot,
} from "@rivetkit/workflow-engine";
import { encodeWorkflowHistoryTransport } from "@/inspector/transport";
import type * as inspectorSchema from "@/schemas/actor-inspector/mod";
import * as transport from "@/schemas/transport/mod";
import { assertUnreachable, bufferToArrayBuffer } from "@/utils";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports should be sorted alphabetically according to Biome's rules. External imports should come before local imports (those starting with '@/'). Run 'pnpm run lint:fix' to automatically fix this issue.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +80 to +85
type MessageEntry struct {
# Message name
name: str
# Message data (CBOR-encoded arbitrary data)
messageData: Cbor
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent indentation. There's an extra space before 'type MessageEntry struct' and the indentation of the fields doesn't match the rest of the file. Run 'pnpm run lint:fix' to automatically fix formatting issues.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

This was referenced Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant