perf: batch multiproof dispatch for small blocks#22082

Draft
yongkangc wants to merge 4 commits into main from yk/batch-multiproof-small-blocks

@yongkangc
Member

@yongkangc yongkangc commented Feb 11, 2026

Summary
The multiproof dispatch path sends a separate proof request for every transaction state update, so the partition, Arc clone, chunking check, channel send, and worker wakeup are each repeated N times per block. For a 30-tx block, that is 30 dispatches where a single merged request would suffice. This PR buffers HashedPostState updates and, for small blocks (<30 state updates), dispatches a single merged proof request on FinishedStateUpdates. Larger blocks fall back to per-update dispatch.
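
The buffering and fallback described above can be sketched roughly as follows. This is a toy model, not the code in this PR: `SmallBlockBatcher`, `Update`, and `dispatch_count` are hypothetical names, an update is reduced to a set of hashed keys, and "dispatching" just appends to a vector. The threshold is a parameter because the summary and the commit messages quote different values (30 and 128). Flushing the buffer as one merged request when the threshold is hit is an assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for one transaction's hashed state update:
/// just the set of hashed keys it touches.
type Update = BTreeSet<u64>;

/// Buffers updates while the block is still "small" and falls back to
/// per-update dispatch once the threshold is reached.
struct SmallBlockBatcher {
    threshold: usize,
    seen: usize,
    buffer: Vec<Update>,
    /// Every proof request that was "dispatched", in order.
    dispatched: Vec<Update>,
}

impl SmallBlockBatcher {
    fn new(threshold: usize) -> Self {
        Self { threshold, seen: 0, buffer: Vec::new(), dispatched: Vec::new() }
    }

    /// Called once per transaction state update.
    fn on_state_update(&mut self, update: Update) {
        self.seen += 1;
        if self.seen < self.threshold {
            // Still a small block: defer the dispatch.
            self.buffer.push(update);
        } else {
            // Threshold reached: flush the buffer as one merged request
            // (assumption), then dispatch this update on its own.
            self.flush();
            self.dispatched.push(update);
        }
    }

    /// Called when the block has no more state updates.
    fn on_finished_state_updates(&mut self) {
        self.flush();
    }

    /// Merge all buffered updates into a single request, if any exist.
    fn flush(&mut self) {
        if self.buffer.is_empty() {
            return;
        }
        let merged: Update = self.buffer.drain(..).flatten().collect();
        self.dispatched.push(merged);
    }
}

/// Demo helper: feed `n` synthetic updates and count dispatched requests.
fn dispatch_count(threshold: usize, n: u64) -> usize {
    let mut batcher = SmallBlockBatcher::new(threshold);
    for i in 0..n {
        batcher.on_state_update(BTreeSet::from([i, i + 1]));
    }
    batcher.on_finished_state_updates();
    batcher.dispatched.len()
}
```

In this model, `dispatch_count(30, 10)` yields one merged request. `dispatch_count(30, 40)` yields one merged request for the first 29 updates plus 11 individual ones.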

Ubuntu added 2 commits February 11, 2026 18:39
For blocks with fewer than 128 state updates, buffer per-tx hashed
state updates and dispatch a single merged proof request at block
completion instead of dispatching per-tx proofs.

This eliminates per-update overhead from:
- partition_by_targets (splitting already-fetched vs not-fetched)
- dispatch_with_chunking (worker availability checks + chunking)
- Arc<MultiAddedRemovedKeys> clone per dispatch
- crossbeam channel send/recv per dispatch
- ProofSequencer sequence numbering per dispatch

The multi_added_removed_keys tracking is still done per-update during
buffering for correctness. Only the dispatch/partition/chunking is
deferred and done once on the merged state.

Falls back to per-update dispatch if the threshold is exceeded, and
disables batching for BAL (Block Access List) paths which already
provide complete state.
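
A minimal sketch of the split described in this commit message, assuming a toy model rather than reth's actual types: key tracking runs eagerly on every update, while the proof dispatch is deferred to block completion unless batching is disabled (as for the BAL path). `Batcher`, `Update`, and `touched_keys` are hypothetical, and `touched_keys` is only a stand-in for whatever `multi_added_removed_keys` records.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical per-tx update: hashed storage slot -> new value.
type Update = BTreeMap<u64, u64>;

struct Batcher {
    /// False for paths that already provide complete state (e.g. BAL).
    batching_enabled: bool,
    /// Stand-in for the added/removed keys tracker; updated per update.
    touched_keys: BTreeSet<u64>,
    /// Merged pending state; later writes overwrite earlier ones.
    pending: Update,
    /// Proof requests sent so far, in order.
    dispatched: Vec<Update>,
}

impl Batcher {
    fn new(batching_enabled: bool) -> Self {
        Self {
            batching_enabled,
            touched_keys: BTreeSet::new(),
            pending: Update::new(),
            dispatched: Vec::new(),
        }
    }

    fn on_state_update(&mut self, update: Update) {
        // Tracking always runs per update, even while dispatch is deferred.
        self.touched_keys.extend(update.keys().copied());
        if self.batching_enabled {
            // Defer: fold the update into the merged pending state.
            self.pending.extend(update);
        } else {
            // Batching disabled: dispatch immediately.
            self.dispatched.push(update);
        }
    }

    fn on_finished_state_updates(&mut self) {
        // Partition, chunking, and dispatch would happen once, here.
        if !self.pending.is_empty() {
            self.dispatched.push(std::mem::take(&mut self.pending));
        }
    }
}
```

With batching enabled, the tracker has seen every key before the single merged request goes out. With batching disabled, each update is dispatched as it arrives.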

Amp-Thread-ID: https://ampcode.com/threads/T-019c4dc7-32c0-76dd-a099-4966bf2717f2
Two optimizations for small-block multiproof dispatch:

1. Single-dispatch path for batched small blocks: When flushing the
   small-block state update batch (< 128 updates), bypass
   dispatch_with_chunking entirely and dispatch a single proof request.
   This prevents the merged state (typically ~80-100 targets) from being
   split into multiple chunks when chunking_len > chunk_size (60),
   which would partially undo the batching win.

2. Fix V2 chunk size: Use effective_multiproof_chunk_size() instead of
   multiproof_chunk_size() in both MultiProofTask and SparseTrieCacheTask
   creation. With V2 proofs enabled (default), this changes the chunk size
   from 60 to 240, reducing unnecessary chunking across all block sizes.
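
The arithmetic behind both points can be checked with a small sketch. The two chunk sizes are the ones quoted above. `effective_chunk_size` and `chunk_count` are hypothetical helpers, and plain ceiling division is an assumption about how a chunking step would split targets, not a description of `dispatch_with_chunking`.

```rust
/// Chunk sizes quoted in the commit message.
const LEGACY_CHUNK_SIZE: usize = 60;
const V2_CHUNK_SIZE: usize = 240;

/// Hypothetical selector: picks the larger chunk size when V2 proofs
/// are enabled. The real selection logic may differ.
fn effective_chunk_size(v2_enabled: bool) -> usize {
    if v2_enabled { V2_CHUNK_SIZE } else { LEGACY_CHUNK_SIZE }
}

/// Number of proof requests produced if `targets` are split into chunks
/// of at most `chunk_size` (assumed ceiling division, chunk_size > 0).
fn chunk_count(targets: usize, chunk_size: usize) -> usize {
    (targets + chunk_size - 1) / chunk_size
}
```

Under these assumptions, a merged batch of 90 targets becomes two requests at a chunk size of 60 and stays a single request at 240.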

Amp-Thread-ID: https://ampcode.com/threads/T-019c4de3-0cce-7083-80fd-f10b6724f5cc
@yongkangc yongkangc added the C-perf (a change motivated by improving speed, memory usage or disk footprint) and A-engine (related to the engine implementation) labels Feb 11, 2026
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Feb 11, 2026
@github-actions
Contributor

github-actions bot commented Feb 11, 2026

⚠️ Changelog not found.

A changelog entry is required before merging. We've generated a suggested changelog based on your changes:

Preview
---
reth-engine-tree: patch
---

Fixed bugs in small-block multiproof batching by using effective chunk size instead of raw chunk size and improved batching logic with proper buffer management and single-dispatch optimization for small blocks.


@yongkangc yongkangc self-assigned this Feb 11, 2026