spectests: align with robust message replay#576
Conversation
There was a problem hiding this comment.
Greptile Summary
This PR synchronizes the ssv-spec repository with changes from the main SSV repository (PR #2445) to implement robust message replay functionality and ensure spec-tests pass correctly. The changes focus on three main areas:
Enhanced Error Reporting: The PR improves error messages throughout the codebase by replacing generic error strings with more descriptive messages that include specific slot numbers and validation contexts. For example, generic "invalid partial sig slot" errors are now replaced with detailed messages like "future partial sig msg, message slot: 50, want slot: 12" or "invalid partial sig slot: 50, want: 7413760". This provides better debugging information when slot validation fails.
Method Name Simplification: The validatePartialSigMsgForSlot method is renamed to validatePartialSigMsg across multiple files, following the principle of shorter, clearer names since the method already takes a slot parameter making the "ForSlot" suffix redundant.
Test Infrastructure Improvements: Multiple test files are updated to enhance error handling in test execution. The GetPostState() methods in various spec test classes now provide formatted error messages when expected errors don't match actual errors, showing both the expected and actual error strings for better debugging.
The changes also include updates to JSON test specification files that define expected error messages for various slot validation scenarios, ensuring the spec tests validate the new error message formats correctly. The modifications span across runner validation logic, signature validation, and comprehensive test suites for both pre-consensus and post-consensus message processing scenarios.
Confidence score: 1/5
- This PR has a critical blocking issue that makes it unsafe to merge
- Score severely lowered due to an empty/corrupted test specification file that would break the test suite
- The file
ssv/spectest/generate/tests/tests.MultiMsgProcessingSpecTest_post_consensus_invalid_msg_slot.jsonis completely empty despite being referenced in the changes
9 files reviewed, 1 comment
| ), | ||
| Duty: testingutils.TestingAggregatorDuty(spec.DataVersionElectra), | ||
| Messages: []*types.SignedSSVMessage{ | ||
| testingutils.SignPartialSigSSVMessage(ks, testingutils.SSVMsgAggregator(nil, invalidateSlot(testingutils.PostConsensusAggregatorMsg(ks.Shares[1], 1, spec.DataVersionElectra)))), |
There was a problem hiding this comment.
logic: Using invalidateSlot instead of invalidateSlotV - this may not test the version-specific slot validation logic for Electra
There was a problem hiding this comment.
In this PR I'm not really looking at what this test intends to check, I'm just trying to make it work with updated error-messages ...
As for what is the original intent of this test is (and how well invalidateSlot fits it, vs invalidateSlotV) - I'll leave this discussion open for somebody from spec-team to check out.
75a57db to
c60049c
Compare
| nil, | ||
| // TODO: before merge ask engineering how often they see such message in production | ||
| "could not process msg: invalid signed message: did not receive proposal for this round", | ||
| "no proposal for round: invalid signed message: did not receive proposal for this round", | ||
| nil, | ||
| ks, | ||
| ) |
There was a problem hiding this comment.
We do see quite a bit of these, here is an example from our production cluster:

So, with ssvlabs/ssv#2445 we should be able to buffer messages in this scenario and "replay" it later (vs just dropping the message as if it never came, as is done currently)
| ExpectedError: "no post consensus phase for voluntary exit", | ||
| }, | ||
| { | ||
| Name: fmt.Sprintf("aggregator (%s)", spec.DataVersionPhase0.String()), |
There was a problem hiding this comment.
@iurii-ssv
can you explain the reason for this change?
There was a problem hiding this comment.
Yes, so in ssvlabs/ssv#2445 I had to introduce a new error to split invalid partial sig slot in 2:
invalid partial sig slotErrFuturePartialSigMsg(this is a newly-defined error current spec-tests don't know about and this change fixes that)
the reason to split it like that is because ErrFuturePartialSigMsg is a retryable error (we can buffer the message and replay it ~100ms later successfully)
|
This is no longer needed since #585 fully supersedes it - hence closing. |
This PR intends to update/align the code in
ssv-specrepo with the changes in ssvlabs/ssv#2445 so that we can have spec-tests passing inssvrepo.Note, I'm not fully replicating the whole retry-logic implemented in ssvlabs/ssv#2445 - rather I'm just copying over the bare minimum we need to make spec-tests pass (so I'm mostly updating error-messages and re-generating json-test-files) - this is because the code in
ssvrepo already diverges from the code inssv-specrepo ... and re-implementing every feature we add from now on in both doesn't make much sense. So, what this PR is doing isn't really viable long-term, ideally we'd want to copy all the spec-tests over tossvrepo and update/adjust them there as we need.