aggregator committee: extend value checks#2682
Conversation
Greptile OverviewGreptile SummaryThis PR extends aggregator committee validation by:
Key issues to address before merge are around runtime safety (nil Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Net as P2P Network
participant R as AggregatorCommitteeRunner
participant BR as BaseRunner
participant BN as BeaconNode
participant VC as AggregatorCommitteeChecker
Net->>R: PreConsensus PartialSignatureMessages
R->>BR: basePreConsensusMsgProcessing()
BR->>BR: ValidatePreConsensusMsg(ctx, logger, runner, psigMsgs)
BR->>R: expectedPreConsensusRoots(ctx, logger)
R->>BN: fetch attestation/contribution data
R-->>BR: expected signing roots (aggregators + contributions)
BR->>BR: verifyExpectedSigningRoots(psigMsgs, expectedRoots)
R->>VC: NewAggregatorCommitteeChecker(duty, BN)
R->>BR: decide(ctx, logger, slot, consensusData, VC)
Net->>R: Consensus SignedSSVMessage
R->>BR: baseConsensusMsgProcessing(valCheck.CheckValue)
Net->>R: PostConsensus PartialSignatureMessages
R->>BR: basePostConsensusMsgProcessing()
BR->>BR: ValidatePostConsensusMsg(ctx, logger, runner, psigMsgs)
BR->>R: expectedPostConsensusRootsAndBeaconObjects(ctx, logger)
BR->>BR: verifyExpectedSigningRoots(psigMsgs, expectedRoots)
|
| func NewAggregatorCommitteeChecker( | ||
| duty *spectypes.AggregatorCommitteeDuty, | ||
| subnetProvider syncCommitteeSubnetIDProvider, | ||
| ) ValueChecker { |
There was a problem hiding this comment.
Nil duty can panic
NewAggregatorCommitteeChecker dereferences duty.ValidatorDuties unconditionally, but AggregatorCommitteeRunner now sets r.ValCheck at runtime and the test/spectest helpers also call this with constructed duties. If duty is ever nil (e.g., runner constructed before duty assignment, or a future caller passes nil), this will panic rather than returning a validation error.
Consider returning a non-nil checker with empty allowlists (or returning an error from the constructor) when duty == nil, and ensure all call sites pass a real duty.
| r.ValCheck = ssv.NewAggregatorCommitteeChecker( | ||
| duty, | ||
| r.GetBeaconNode(), | ||
| ) |
There was a problem hiding this comment.
Nil ValCheck until pre-consensus
NewAggregatorCommitteeRunner no longer initializes ValCheck, and it’s only set in ProcessPreConsensus right before decide. If any code path calls ProcessConsensus / ProcessPostConsensus before ProcessPreConsensus for the duty, r.ValCheck.CheckValue will panic.
If out-of-order message processing is possible in this runner (as it is elsewhere), initialize ValCheck in the constructor (with duty-independent checks) or guard ProcessConsensus/ProcessPostConsensus with a clear error when r.ValCheck == nil.
Additional Comments (1)
Use a saturating subtraction (e.g. |
Codecov Report✅ All modified and coverable lines are covered by tests. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| if !ok { | ||
| return spectypes.NewError( | ||
| spectypes.QBFTValueInvalidErrorCode, | ||
| fmt.Sprintf("unexpected aggregator validator %d", agg.ValidatorIndex), |
There was a problem hiding this comment.
hi, bringing back the divergent views concern from the parent PR (#2503, #2503 (comment) by @MatheusFranco99).
Since operators can have different views of the aggregator lists (i.e. different or failed BN responses, or some operators haven't processed a ValidatorAdded event yet), strict rejection here would stall consensus. The same issue applies to verifyExpectedSigningRoots I think -> an unknown validator's signing root causes the entire partial sig batch to be dropped, including valid sigs for validators the operator does know about.
Matheus's suggestion of only rejecting validators that are known to the operator but shouldn't have that duty rather than simply unknown seems like an ok tradeoff
There was a problem hiding this comment.
@shane-moore Hi, thanks
@MatheusFranco99 Can we implement this in spec? I'll adjust the implementation afterwards
There was a problem hiding this comment.
Correct, we can't do validator id checks 👍
Regarding not having a duty for the validator, I think this is naturally handled by spec code.
Still, I'm adding a PR for making this clear. Will ask reviews soon.
GalRogozinski
left a comment
There was a problem hiding this comment.
I don't think the changes are correct so rejecting for now
| if runner.GetRole() == spectypes.RoleAggregatorCommittee { | ||
| aggRunner, ok := runner.(*AggregatorCommitteeRunner) | ||
| if !ok { | ||
| return fmt.Errorf("unexpected runner type %T for aggregator committee role", runner) | ||
| } | ||
|
|
||
| roots, domain, err := runner.expectedPreConsensusRootsAndDomain() | ||
| aggregatorMap, contributionMap, err := aggRunner.expectedPreConsensusRoots(ctx, logger) | ||
| if err != nil { | ||
| return fmt.Errorf("compute pre-consensus roots and domain: %w", err) | ||
| return fmt.Errorf("compute pre-consensus roots: %w", err) | ||
| } | ||
|
|
||
| return b.verifyExpectedRoot(ctx, runner, psigMsgs, roots, domain) | ||
| expectedRoots := make(map[[32]byte]struct{}) | ||
| for _, root := range aggregatorMap { | ||
| expectedRoots[root] = struct{}{} | ||
| } | ||
| for _, indexMap := range contributionMap { | ||
| for _, root := range indexMap { | ||
| expectedRoots[root] = struct{}{} | ||
| } | ||
| } | ||
|
|
||
| return b.verifyExpectedSigningRoots(psigMsgs, expectedRoots) |
There was a problem hiding this comment.
I understand that you are trying to fend off attacks where a single validator can complete quorum on a single root, thus preventing selection proofs on all other roots?
We had a discussion on this with @liorrutenberg and it was decided that we allow this attack to allow operators to have divergent views of validators (in case of EL issues).
I just discussed with @MatheusFranco99 and maybe the correct solution is to actually wait on the preconsensus stage if no aggregator passes the selection proof
There was a problem hiding this comment.
Yes, I'm adding a PR for it.
| if err := b.validatePartialSigMsg(psigMsgs, expectedSlot); err != nil { | ||
| return err | ||
| } | ||
|
|
||
| aggRunner, ok := runner.(*AggregatorCommitteeRunner) | ||
| if !ok { | ||
| return fmt.Errorf("unexpected runner type %T for aggregator committee role", runner) | ||
| } | ||
|
|
||
| aggregatorMap, contributionMap, _, err := aggRunner.expectedPostConsensusRootsAndBeaconObjects(ctx, logger) | ||
| if err != nil { | ||
| return fmt.Errorf("compute post-consensus roots: %w", err) | ||
| } | ||
|
|
||
| expectedRoots := make(map[[32]byte]struct{}) | ||
| for _, root := range aggregatorMap { | ||
| expectedRoots[root] = struct{}{} | ||
| } | ||
| for _, roots := range contributionMap { | ||
| for _, root := range roots { | ||
| expectedRoots[root] = struct{}{} | ||
| } | ||
| } | ||
|
|
||
| return b.verifyExpectedSigningRoots(psigMsgs, expectedRoots) |
There was a problem hiding this comment.
I think whatever conclusion we have on preconsensus validation will be also valid here
|
|
||
| for _, agg := range cd.Aggregators { | ||
| allowedByValidator, ok := v.allowedAggregators[agg.ValidatorIndex] | ||
| if !ok { | ||
| return spectypes.NewError( | ||
| spectypes.QBFTValueInvalidErrorCode, | ||
| fmt.Sprintf("unexpected aggregator validator %d", agg.ValidatorIndex), | ||
| ) | ||
| } | ||
| if _, ok := allowedByValidator[phase0.CommitteeIndex(agg.CommitteeIndex)]; !ok { | ||
| return spectypes.NewError( | ||
| spectypes.QBFTValueInvalidErrorCode, | ||
| fmt.Sprintf("unexpected aggregator committee index %d for validator %d", agg.CommitteeIndex, agg.ValidatorIndex), | ||
| ) | ||
| } | ||
| } | ||
|
|
||
| for _, contrib := range cd.Contributors { | ||
| allowedByValidator, ok := v.allowedContributors[contrib.ValidatorIndex] | ||
| if !ok { | ||
| return spectypes.NewError( | ||
| spectypes.QBFTValueInvalidErrorCode, | ||
| fmt.Sprintf("unexpected contributor validator %d", contrib.ValidatorIndex), | ||
| ) | ||
| } | ||
| subnetID := contrib.CommitteeIndex | ||
| if _, ok := allowedByValidator[subnetID]; !ok { | ||
| return spectypes.NewError( | ||
| spectypes.QBFTValueInvalidErrorCode, | ||
| fmt.Sprintf("unexpected contributor subnet %d for validator %d", subnetID, contrib.ValidatorIndex), | ||
| ) | ||
| } | ||
| } |
There was a problem hiding this comment.
do you see any good motivation for having this check @MatheusFranco99?
We need more than F bad nodes for this to fail I think
6c20a6a to
bb08a8e
Compare
On hold until it's implemented in spec due to broken spec tests