chore(consensus): read EL state with a backoff and emit useful syncing info on failure #2494
Conversation
Summary
This PR enhances retry logic and observability for validator config reads:
Changes:
- Replaces fixed 1-second retry delay with linear backoff (1s to 30s)
- Adds network sync status and block height telemetry to help diagnose failures
- Removes escalation from `warn!` to `error!` after 10 attempts (now always `warn!`)
- Adds utility functions `display_result` and `display_option` for cleaner telemetry formatting
Additional Notes:
- The new telemetry fields (`is_syncing`, `best_block`, `target_block`, `blocks_behind`) provide valuable context for debugging validator config read failures, especially during node sync
- The `display_result` and `display_option` utilities are well implemented, with proper error chain handling (see the sketch below)
- Removing the warn-to-error escalation reduces log severity for persistent failures; consider whether this aligns with alerting requirements
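For illustration, a minimal sketch of what such helpers could look like; the names come from the summary above, but the actual implementations in `tempo_telemetry_util` may differ:

```rust
use std::fmt::Display;

/// Format an `Ok` value directly; for an `Err`, flatten the error and its
/// source chain into a single line. (Illustrative sketch only.)
pub fn display_result<T: Display, E: std::error::Error>(result: &Result<T, E>) -> String {
    match result {
        Ok(value) => value.to_string(),
        Err(err) => {
            let mut msg = err.to_string();
            let mut source = err.source();
            while let Some(cause) = source {
                msg.push_str(": ");
                msg.push_str(&cause.to_string());
                source = cause.source();
            }
            msg
        }
    }
}

/// Format `Some(value)` as the value itself and `None` as a placeholder.
pub fn display_option<T: Display>(option: &Option<T>) -> String {
    match option {
        Some(value) => value.to_string(),
        None => "<none>".to_owned(),
    }
}
```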
best_block = %tempo_telemetry_util::display_result(&best_block),
target_block,
blocks_behind = %tempo_telemetry_util::display_option(&blocks_behind),
"reading validator config from contract failed; will retry",
i feel like this might scare operators, maybe we can add to the msg that this is expected when we're behind?
even though is_syncing is in there, i feel like it might be missed
agreed, or separating the two cases into different logs
Well, right now it emits errors for hours, so this is an improvement.
WARN is appropriate because the failure is being handled, yet it still constitutes a hiccup in normal operation and should be looked into.
We already have a log for this (from reth), and it is easily missed. That's why this info is added here.
Separating this into separate events feels incorrect because this is just one event.
This patch implements a backoff, ramping from 1s up to 30s, when reading the validator config from execution layer state at the end of an epoch. Spamming this every second was excessive, especially while a node was syncing.
The event level was also lowered to WARN; after all, the node is actively trying to recover.
Finally, the event includes extra information about the execution layer, which can help explain why reading its state is currently failing.
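A rough sketch of the retry shape described here, assuming a `tracing`-based warning; the types, helper names, and exact field set below are placeholders rather than the actual code:

```rust
use std::{thread, time::Duration};
use tracing::warn;

// Placeholder for the execution layer's sync report (illustration only).
struct SyncStatus {
    is_syncing: bool,
    best_block: u64,
    target_block: u64,
}

// Retry a config read, backing off linearly from 1s up to a 30s cap, and
// attach execution-layer sync info to every warning so operators can tell
// whether the node is simply still syncing.
fn read_with_backoff<T, E: std::fmt::Display>(
    mut read: impl FnMut() -> Result<T, E>,
    sync_status: impl Fn() -> SyncStatus,
) -> T {
    let mut attempt: u64 = 0;
    loop {
        match read() {
            Ok(value) => return value,
            Err(err) => {
                attempt += 1;
                let sync = sync_status();
                warn!(
                    error = %err,
                    is_syncing = sync.is_syncing,
                    best_block = sync.best_block,
                    target_block = sync.target_block,
                    blocks_behind = sync.target_block.saturating_sub(sync.best_block),
                    "reading validator config from contract failed; will retry",
                );
                // Linear backoff: 1s, 2s, 3s, ... capped at 30s.
                thread::sleep(Duration::from_secs(attempt.min(30)));
            }
        }
    }
}
```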