Tree-sync friendly lookup sync tests#8592
Conversation
| // removed from the da_checker. Note that ALL components are removed from the da_checker | ||
| // so when we re-download and process the block we get the error | ||
| // MissingComponentsAfterAllProcessed and get stuck. | ||
| lookup.reset_requests(); |
There was a problem hiding this comment.
Bug found on testing, lookups may be stuck given this sequence of events
| // sending retry requests to the disconnecting peer. | ||
| for sync_request_id in self.network.peer_disconnected(peer_id) { | ||
| self.inject_error(*peer_id, sync_request_id, RPCError::Disconnected); | ||
| } |
There was a problem hiding this comment.
Minor bug, we need to remove the peer from the sync states (e.g. self.block_lookups) then inject the disconnect events. Otherwise we may send requests to peers that are already disconnected. I don't think there's risk of sync getting stuck if libp2p rejects sending messages to disconnected peers, but deserves a fix anyway.
|
This pull request has merge conflicts. Could you please resolve them @dapplion? 🙏 |
|
@jimmygchen Thanks for the review! I've addressed all your feedback: Already fixed:
Just fixed (8d29e73):
All 12 review threads resolved. The 2 remaining threads are my own bug notes. |
beacon_node/genesis/src/interop.rs
Outdated
| }; | ||
|
|
||
| data.signature = data.create_signature(&keypair.sk, spec); | ||
| data.signature = SignatureBytes::empty(); |
There was a problem hiding this comment.
Double check if we need these changes in this file.
Should we feature gate this?
There was a problem hiding this comment.
Potentialy delete this function in a future PR?
| ); | ||
| self.expect_empty_network(); | ||
| self.expect_no_active_lookups(); |
There was a problem hiding this comment.
Add documentation for naming conventions, e.g.
.expect_for mocking / setting up component behaviour.assert_for making assertion on behaviour
| // The signature should be checked for new validators. Return early for a bad | ||
| // signature. | ||
| if is_valid_deposit_signature(&deposit_data, spec).is_err() { | ||
| if verify_signatures.is_true() && is_valid_deposit_signature(&deposit_data, spec).is_err() { |
There was a problem hiding this comment.
do we need to add the same check to all other places where is_valid_deposit_signature is called? or would it be cleaner to use feature so we don't have to pass VerifySignatures around and remember to add this check whereever we verify signature?
… threading, cfg(test) metrics
|
Some required checks have failed. Could you please take a look @dapplion? 🙏 |
|
The The test has a latent race: on the last loop iteration, enqueuing the final gossip column triggers reconstruction nearly simultaneously. The |
are you @dapplion |
4260646 to
415e89e
Compare
| r.corrupt_last_block_signature(); | ||
| r.trigger_with_last_block(); | ||
| r.simulate(SimulateConfig::happy_path()).await; | ||
| if cfg!(feature = "fake_crypto") { |
There was a problem hiding this comment.
Having this condition in the same test feels a bit confusing, as we're testing different outcomes and in the case of fake_crypto scenario, it doesn't match what the test name suggests,
I prefer something like:
#[tokio::test]
#[cfg(feature = "fake_crypto")]
async fn block_signature_not_checked_with_fake_crypto() { ... }
#[tokio::test]
#[cfg(not(feature = "fake_crypto"))]
async fn invalid_block_signature_fails_processing() { ... }but yeah this is a minor style issue
jimmygchen
left a comment
There was a problem hiding this comment.
Hey @dapplion
Nice work! This is a great improvement to the existing tests, which is becoming hard to maintain and will break once we change how sync works.
It looks pretty solid now and we can keep iterating over the test interfaces as we add more tests.
Merge Queue StatusRule:
This pull request spent 32 minutes 9 seconds in the queue, including 29 minutes 48 seconds running CI. Required conditions to merge
|
Issue Addressed
Current lookup sync tests are written in an explicit way that assume how the internals of lookup sync work. For example the test would do:
This is unnecessarily verbose. And it will requires a complete re-write when something changes in the internals of lookup sync (has happened a few times, mostly for deneb and fulu).
What we really want to assert is:
Proposed Changes
Keep all existing tests and add new cases but written in the new style described above. The logic to serve and respond to request is in this function
fn simulatehttps://github.com/dapplion/lighthouse/blob/2288a3aeb11164bb1960dc803f41696c984c69ff/beacon_node/network/src/sync/tests/lookups.rs#L301CompleteStrategywhere you can set for example "respond to BlocksByRoot requests with empty"TestConfigAlong the way I found a couple bugs, which I documented on the diff.
Review guide
Look at
lighthouse/beacon_node/network/src/sync/tests/lookups.rsdirectly (no diff).Other changes are very minor and should not affect production paths