Optimise fork choice attestation dequeueing#8378
Optimise fork choice attestation dequeueing#8378michaelsproul wants to merge 10 commits intosigp:unstablefrom
Conversation
| PersistedForkChoice { | ||
| proto_array: self.proto_array().as_ssz_container(), | ||
| queued_attestations: self.queued_attestations().to_vec(), | ||
| queued_attestations: self.queued_attestations().iter().cloned().collect(), |
There was a problem hiding this comment.
This is potentially slower than if we implemented Encode for VecDeque. The conversion from VecDeque to Vec is non-trivial in most cases where the VecDeque doesn't start at index 0:
I'll check fork choice persistence times to make sure this isn't having a substantial impact.
| fc_store, | ||
| proto_array, | ||
| queued_attestations: persisted.queued_attestations, | ||
| queued_attestations: persisted.queued_attestations.into(), |
There was a problem hiding this comment.
This conversion Vec => VecDeque is trivial and cheap, but only happens once on startup anyway.
| use types::{Epoch, Hash256, Slot}; | ||
|
|
||
| fn all_benches(c: &mut Criterion) { | ||
| let num_attestations = 1_500_000_usize / 16; |
There was a problem hiding this comment.
This is ~2 slots worth of attestations on a 1.5M validator network
|
Some required checks have failed. Could you please take a look @michaelsproul? 🙏 |
There was a problem hiding this comment.
Pull Request Overview
This PR optimizes the attestation queue management in fork choice by replacing Vec<QueuedAttestation> with VecDeque<QueuedAttestation> and improving the dequeue_attestations algorithm to preserve allocation capacity and avoid unnecessary reallocations.
Key Changes
- Replaced
VecwithVecDequefor the queued attestations data structure across fork choice components - Optimized
dequeue_attestationsfunction usingpartition_point,rotate_left, andsplit_offto maintain capacity while removing old attestations - Made
dequeue_attestationspublic and exported it for external use, including in benchmarks
Reviewed Changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| consensus/fork_choice/src/fork_choice.rs | Core implementation changes: converted queued_attestations to VecDeque, optimized dequeue logic with capacity preservation, made QueuedAttestation fields public, added Debug derive |
| consensus/fork_choice/src/lib.rs | Added public export of dequeue_attestations function |
| consensus/fork_choice/tests/tests.rs | Updated test helper function signature to use VecDeque reference |
| consensus/fork_choice/benches/benches.rs | Added comprehensive benchmark testing dequeue performance with realistic attestation volumes |
| consensus/fork_choice/Cargo.toml | Added criterion dev-dependency and benchmark harness configuration |
| Cargo.lock | Updated with criterion dependency |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
|
Some required checks have failed. Could you please take a look @michaelsproul? 🙏 |
|
This pull request has merge conflicts. Could you please resolve them @michaelsproul? 🙏 |
| let to_pop = queued_attestations.partition_point(|a| a.slot < current_slot); | ||
|
|
||
| // Rotate the entries to remove into the *end* of the vec deque. | ||
| queued_attestations.rotate_left(to_pop); |
There was a problem hiding this comment.
Why not change self.queued_attestations into a map of Slot -> Vec< QueuedAttestation> and when that slot arrives we just remove the entire vec? No need to rotate or split_off
There was a problem hiding this comment.
Good idea, I'll try it and see how the benchmark looks
There was a problem hiding this comment.
It will have the same problem of re-allocating memory for the Vecs though.
There was a problem hiding this comment.
It's 4x slower 💀
dequeue_attestations/93750
time: [41.906 ms 41.928 ms 41.953 ms]
Found 8 outliers among 100 measurements (8.00%)
5 (5.00%) high mild
3 (3.00%) high severe
Impl here: michaelsproul@daf3617.
One of the problems is that I didn't try optimising out the concatenation of Vecs. dequeue_attestations could definitely return like a Vec<Vec<_>> or something, but I didn't bother refactoring any further, because I think the lack of memory-reuse just makes it worse than the VecDeque.
|
Putting this back to |
|
Hi @michaelsproul, this pull request has been closed automatically due to 30 days of inactivity. If you’d like to continue working on it, feel free to reopen at any time. |
|
Hi @michaelsproul, this pull request has been closed automatically due to 30 days of inactivity. If you’d like to continue working on it, feel free to reopen at any time. |
Proposed Changes
Optimise dequeuing of attestations in fork choice by avoiding reallocating the queue after every dequeue.
The included benchmarks shows that this takes ~35% off the runtime of
dequeue_attestationsover multiple slots:The baseline branch with just the benchmark added to
unstablecan be found here: https://github.com/michaelsproul/lighthouse/tree/dequeue-attestation-baseline-benchmarkAdditional Info
I played around with several
VecDeque-based implementations. The one withpartition_pointandrotate_left/split_offis substantially faster than usingpop_frontordrain.The risk of this PR is that the allocation for
queued_attestationsis never shrunk/contracted, so in unusual circumstances it could grow very large. We could consider adding a hard cap on the number of items in the queue, as we've also had this queue grow in an unbounded way during times of memory corruption: