Description
The milestone synchronization mechanism introduces structural latency at multiple points in the sync pipeline. This latency is inherent to the current design and affects both the syncToTip phase and the event loop at tip, regardless of whether the node eventually catches up.
Root causes
1. Milestone scraper pollDelay = 1s (vs 200ms for spans)
In polygon/heimdall/service.go:89, the milestone scraper polls Heimdall every 1 second:
```go
milestoneScraper := NewScraper(
	"milestones",
	store.Milestones(),
	milestoneFetcher,
	1*time.Second, // ← 5x slower than spans (200ms)
	...
)
```

During syncToTip, each cycle calls `SynchronizeMilestones()`, which blocks on `syncEvent.Wait()` until the scraper completes a full poll cycle. This adds up to ~1s of dead time per syncToTip iteration, contributing to the 32-58s inter-cycle gap observed in production.

The span scraper uses `200*time.Millisecond` (service.go:98). There is no reason for milestones to be 5x slower.
2. futureMilestoneDelay = 1s re-queue polling loop
In polygon/sync/sync.go:289-308, when a milestone arrives ahead of the current tip (which is the common case since Heimdall publishes milestones before the node has executed the blocks):
```go
if milestone.EndBlock().Uint64() > ccb.Tip().Number.Uint64() {
	// finality is already tracked here (line 293) ✓
	go func() {
		time.Sleep(futureMilestoneDelay) // 1s
		s.tipEvents.events.PushEvent(...) // re-queue
	}()
	return nil
}
```

This spawns a goroutine that sleeps 1s and re-pushes the event, repeating until the tip catches up. For a milestone 3 blocks (6s) ahead of the tip, this creates ~6 polling goroutines. The finality tracking (`lastFinalizedBlockNum`) is already done at line 293 before the re-queue; the re-queue only serves CCB pruning and milestone verification, which the next on-time milestone will handle anyway (~32s later).
3. WaitUntilHeimdallIsSynced + SynchronizeSpans on every block event
In polygon/sync/sync.go:364-376, every single block event in the event loop triggers:
```go
err := s.heimdallSync.WaitUntilHeimdallIsSynced(ctx, 200*time.Millisecond)
err = s.heimdallSync.SynchronizeSpans(ctx, math.MaxUint64)
```

This is fast in steady state, but during span rotation (every 128 blocks, ~256s), SynchronizeSpans must fetch from Heimdall and recompute producer selection, adding ~12s of overhead: a systematic source of lag at a predictable interval.
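One way to make the span sync non-blocking for ordinary block events is to gate it on span boundaries. The sketch below is a hypothetical helper (the name and gating strategy are not from the Erigon code): the blocking SynchronizeSpans call would run only when the tip advances across a 128-block span boundary, and other block events would skip it or run it in the background.

```go
package main

import "fmt"

// spanLength is the span size referenced above: a new span every 128 blocks.
const spanLength = 128

// crossesSpanBoundary reports whether advancing the tip from prev to next
// crosses a span boundary, i.e. whether a new span could have started in
// the interval (prev, next]. A block event loop could call the blocking
// span synchronization only when this returns true.
func crossesSpanBoundary(prev, next uint64) bool {
	return next/spanLength > prev/spanLength
}

func main() {
	fmt.Println(crossesSpanBoundary(255, 256)) // 256 starts a new span
	fmt.Println(crossesSpanBoundary(256, 257)) // still inside the same span
}
```

With a gate like this, the ~12s rotation overhead is paid once per 128 blocks where it is unavoidable, instead of putting a Heimdall round trip on the critical path of every block event.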
Production data
From v3.4.0-beta (bor-mainnet, commit 48d7b0b):
- syncToTip phase: 32-58s inter-cycle gap. Execution + trie dominates, but the 1s scraper poll is wasted time on every iteration.
- Event loop at tip: FC cycle avg = 2.07s for 2s blocks. Steady-state head age = 2-4s. Every second of unnecessary latency directly translates to falling further behind.
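To make the "falling further behind" claim concrete, the arithmetic on those two figures can be sketched as follows (a simple drift model using the quoted production numbers, not measured code):

```go
package main

import "fmt"

// lagGrowth models steady-state drift when each fork-choice cycle takes
// cycleSec while blocks are produced every blockSec: the node loses
// (cycleSec - blockSec) per block, and falls a full block behind every
// blockSec/drift blocks.
func lagGrowth(cycleSec, blockSec float64) (driftPerBlock, blocksPerBlockLost float64) {
	driftPerBlock = cycleSec - blockSec
	blocksPerBlockLost = blockSec / driftPerBlock
	return
}

func main() {
	// 2.07s FC cycles against 2s blocks, per the figures above.
	drift, n := lagGrowth(2.07, 2.0)
	fmt.Printf("drift %.2fs per block; one full block behind every ~%.0f blocks\n", drift, n)
}
```

At 2.07s cycles the node loses roughly a block every half-minute of wall time, which is why every second shaved off the pipeline matters.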
From issue #59 logs (v3.1.2):
```
[sync] update fork choice done in=8.7s
[sync] applying new milestone event milestoneId=3648361 ...
[sync] applying new milestone event milestoneId=3648362 ...
[span-rotation] need to wait for span rotation ...
[bor.heimdall] anticipating new span update within 8 seconds
[span-rotation] producer set was not updated within 8 seconds
```
Milestones are processed sequentially after the FC update, then span rotation adds another 8s — the node is idle for seconds doing Heimdall bookkeeping instead of processing blocks.
Suggested improvements
- Reduce milestone `pollDelay` from 1s to 200ms: align with spans, a one-line change in `service.go:89`
- Remove the `futureMilestoneDelay` re-queue loop: finality is already tracked; drop the event and let the next on-time milestone handle CCB pruning and verification
- Consider making span synchronization non-blocking for block events that don't fall on a span boundary
Related issues
- Polygon sync enters feedback loop when milestones accumulate #116 — Milestone accumulation feedback loop (consequence of this latency)
- Polygon mainnet: Erigon v3.3.7 intermittently falls behind by several thousand blocks occasionally #112 — Intermittent lag of thousands of blocks
- Sync Performance Degradation - Node Falls Behind by Dozens of Blocks with Multi-Second Processing Times #59 — Sync performance degradation with multi-second FC updates + span rotation