perf(engine): reduce proof worker count for small blocks #22074
Open
7f38920 to
9c323c0
Compare
Member
overhead is reduced with rayon, can we rebench?
Member
@DaniPopes thanks for the note, rebenching
yongkangc
reviewed
Feb 12, 2026
For blocks with ≤30 transactions, cap proof workers at 32 each (storage + account) instead of the full rayon pool size. Fewer transactions produce fewer state changes, making most workers idle overhead. Adds ProofWorkerHandle::with_max_workers() to support capping worker count while still using the dedicated rayon pools.
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019c515e-d52f-77df-95a3-f5e213a81aa4
Remove with_max_workers/new_inner indirection: callers now pass worker counts directly and cap them at the call site for small blocks.
Amp-Thread-ID: https://ampcode.com/threads/T-019c5312-9cb4-752a-a05b-0e3bce585ed9
In the spawn_payload_processor path, transaction_count comes from the actual block envelope — it's always the real count. Empty blocks have even fewer state changes, so capping workers is justified.
bf8eadb to
9203128
Compare
added 4 commits
February 12, 2026 21:53
Amp-Thread-ID: https://ampcode.com/threads/T-019c53d6-1454-7746-a8c9-2e9b87f18415 Co-authored-by: Amp <amp@ampcode.com>
Summary
Reduce proof worker pool size for blocks with ≤30 transactions to cut per-block spawn overhead.
Motivation
ProofWorkerHandle::new() dominates spawn_payload_processor cost: ~1.5ms of ~2.5ms total per block. On a 31-core machine it spawns 62 workers (31 storage + 31 account), each opening an MDBX read txn and creating trie cursors.
Profiling shows each worker uses only 3.9μs CPU per block (0.0005% utilization). They are almost entirely idle for blocks under 30 txs.
We cannot skip SRT entirely (PR #22129 tried, regressed +34%) because PreservedSparseTrie is reused block-to-block via state_root anchoring. Switching to Parallel forces full trie recomputation.
Changes
- Read transaction_count from ExecutionEnv before it is moved
- Pass storage_worker_count and account_worker_count as params to ProofWorkerHandle::new() instead of reading from runtime
- For small blocks, cap workers at min(pool_size, 16): preserves SRT cache chain while reducing spawn overhead
Expected: MDBX read txns per block 62→16 for small blocks, spawn overhead 1.5ms→0.4ms.
Bench
Results < 30
normal:
Profiling data: Samply baseline | Samply feature
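Sketch
The cap described under Changes could look roughly like the following. This is a minimal, self-contained illustration, not the code in this PR: the function name proof_worker_counts and both constants are hypothetical, and applying the cap to each pool separately (storage and account) is an assumption.

```rust
/// Hypothetical threshold: blocks with at most this many transactions
/// are treated as "small" (the PR summary gives ≤30).
const SMALL_BLOCK_TX_THRESHOLD: usize = 30;

/// Hypothetical cap taken from the `min(pool_size, 16)` rule under Changes.
const SMALL_BLOCK_MAX_WORKERS: usize = 16;

/// Returns `(storage_worker_count, account_worker_count)` for one block.
///
/// Small blocks get each pool capped at `SMALL_BLOCK_MAX_WORKERS`;
/// larger blocks keep the full pool sizes. Assumption: the cap is
/// applied per pool rather than shared across both pools.
fn proof_worker_counts(
    transaction_count: usize,
    storage_pool_size: usize,
    account_pool_size: usize,
) -> (usize, usize) {
    if transaction_count <= SMALL_BLOCK_TX_THRESHOLD {
        (
            storage_pool_size.min(SMALL_BLOCK_MAX_WORKERS),
            account_pool_size.min(SMALL_BLOCK_MAX_WORKERS),
        )
    } else {
        (storage_pool_size, account_pool_size)
    }
}
```

Under these assumptions, a caller would compute the two counts from the block's transaction count before spawning workers and pass them on as storage_worker_count and account_worker_count. Pools that are already smaller than the cap are left unchanged.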