Raft leader election does not gate on execution-layer sync: an unsynchronized node can win election and block production indefinitely #3255

@auricom

Description

Summary

A node whose execution layer is significantly behind the raft-committed height can win a raft leader election. Once elected, it cannot produce new blocks until it has replayed the missing entries, creating an extended (potentially permanent) outage. In our case, a node 166,305 blocks behind raft state won leadership and the cluster produced no new blocks for over 3 hours.


Environment

  • Cluster: 3 nodes (node-1 / node-2 / node-3), ev-node-evm, raft consensus enabled
  • Deployment: Docker Compose, one container per node, snapshot-bootstrapped

Observed behaviour

After a cascade of leader failovers, node-2 won election at raft term 192 while its execution layer was 166,305 blocks behind raft state:

# node-2 immediately after being elected leader at 07:40:41 UTC
local state behind raft state, skipping recovery to allow catchup
  component=main  diff=-166305  local_height=128706364  raft_height=128872669

became leader but not synced, attempting recovery  component=main
recovering state from raft  component=syncer  height=128872669

The execution_replayer then reported the same gap every ~60 seconds without making progress:

07:16:43  execution layer is behind, syncing blocks
          component=execution_replayer  blocks_to_sync=166305
          exec_layer_height=128706364  target_height=128872669

07:17:41  execution layer is behind, syncing blocks  blocks_to_sync=166305  …
07:18:40  execution layer is behind, syncing blocks  blocks_to_sync=166305  …
…  (repeated at ~60s intervals, count never decreasing, for 38+ minutes)

The last block ever produced by the cluster was at 06:12:33 UTC at height 128,872,669 — more than 3 hours before the issue was discovered at 09:36.


How the execution layer fell 166,305 blocks behind

Time (UTC)        Event
Apr 14 10:41      Cluster bootstrapped; node-2 wins initial election (term 2), becomes leader
Apr 15 01:33      node-1 wins election (term 3); node-2 becomes follower
Apr 15 06:08:43   node-3 wins election (term 4)
Apr 15 06:12:33   Last block produced (height 128,872,669) by node-3
Apr 15 06:12:39   node-3 crashes (leader lock lost); cluster enters crash loop
Apr 15 07:40:41   node-2 wins term 192; exec layer 166,305 blocks behind raft

Between 01:33 (when node-2 became a follower) and 07:40 (when it won the election), node-2's execution layer stopped processing raft log entries at height 128,706,364 — the height at which it had previously been leader. Raft replication kept its raft log current (raft_height = 128,872,669), but the execution layer never applied those entries while it was a follower.


Why this is a problem

1. Raft election eligibility is not gated on execution-layer sync

Raft's log-matching safety property ensures only nodes with up-to-date raft logs can win elections. However, ev-node decouples raft log state from execution-layer state. A node with a fully replicated raft log but a stale execution layer can win an election even though it cannot immediately produce new blocks.

2. The new leader cannot produce blocks during catch-up

The leader transitions to a "recovering" state and must replay up to hundreds of thousands of blocks before resuming production. During this window the cluster produces no new blocks, potentially for minutes or hours.

3. Catch-up is continuously interrupted, making recovery impossible

In our scenario, an ongoing crash loop on node-1 triggered new elections every ~60s. Each displacement aborted the in-progress catch-up. The blocks_to_sync=166305 count never decremented across 38 minutes of attempts:

07:16:43  blocks_to_sync=166305
07:17:41  blocks_to_sync=166305
…
07:54:36  blocks_to_sync=166305  ← same count, 38 minutes later

4. Follower execution lag is not surfaced in election eligibility

A follower whose execution layer is far behind looks identical to a fully synced follower from the raft election perspective. There is no mechanism to prefer a synced node (node-3, exec_height = 128,872,669) over an unsynced one (node-2, exec_height = 128,706,364) during election.


Expected behaviour

One or more of the following mitigations would address this:

  1. Gate election eligibility on execution-layer sync. A node should withhold its pre-vote / vote until its execution layer is within a configurable threshold of raft height (e.g. max_exec_lag_blocks). This prevents an unsynchronized node from winning an election.

  2. Defer leader operations until catch-up completes. If a node does become leader while unsynced, starting leader operations and block production should be deferred until exec_layer_height >= raft_height. The current code logs "became leader but not synced, attempting recovery" and then immediately begins leader operations anyway.

  3. Bound follower execution lag. If a follower's execution layer falls more than N blocks behind its raft log, it should either stop voting (self-demote) or emit a high-severity metric/alert. Currently the drift is silent and only becomes visible after a failover.


Key log evidence

node-2 winning term 192 while 166,305 blocks behind (07:40 UTC):

2026-04-15T07:40:41  raft: election won: term=192 tally=2
2026-04-15T07:40:41  raft: entering leader state
2026-04-15T07:40:18  local state behind raft state, skipping recovery to allow catchup
                     diff=-166305  local_height=128706364  raft_height=128872669
2026-04-15T07:40:41  became leader but not synced, attempting recovery
2026-04-15T07:40:41  recovering state from raft  height=128872669

node-1 (same scenario, 2,151 blocks behind) briefly winning election during crash loop:

2026-04-15T07:40:10  local state behind raft state, skipping recovery to allow catchup
                     diff=-2151  local_height=128870518  raft_height=128872669
2026-04-15T07:40:10  became leader but not synced, attempting recovery

Execution replayer stuck at same count for 38 minutes:

2026-04-15T07:16:43  blocks_to_sync=166305  exec_layer_height=128706364  target_height=128872669
2026-04-15T07:17:41  blocks_to_sync=166305
2026-04-15T07:18:40  blocks_to_sync=166305
…
2026-04-15T07:54:36  blocks_to_sync=166305

Related

  • Raft leader re-election takes up to 90s after SIGTERM on a 3-node cluster #3229 — Slow re-election after SIGTERM (related: the ongoing crash loop that caused the divergence in the first place)
  • leader lock lost fatal crash — a node losing leadership exits the process instead of gracefully stepping down to follower. This is a separate but contributing issue: it is what triggered the repeated unsynchronized elections described here and what prevented recovery once node-2 was elected with a stale execution layer.
