Skip to content

fix(runner): log warning instead of closing ws on message index gap#4594

Open
NathanFlurry wants to merge 3 commits intov2.1.xfrom
fix/ws-gap-warning
Open

fix(runner): log warning instead of closing ws on message index gap#4594
NathanFlurry wants to merge 3 commits intov2.1.xfrom
fix/ws-gap-warning

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Summary

  • When the engine driver detects a gap in hibernatable WebSocket message indices, it now logs a warning instead of closing the connection with a 1008 error.

Test plan

  • Verify that WebSocket connections remain open when message index gaps occur
  • Confirm warning is logged with gap details

🤖 Generated with Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Apr 9, 2026

PR Review: fix(runner): log warning instead of closing ws on message index gap

The core logic of each change is sound. Here is a breakdown by file.

websocket-tunnel-adapter.ts - Gap handling relaxation

The change to warn-and-continue instead of closing with 1008 is appropriate for hibernation semantics. During migration, messages may have been delivered with indices that appear as gaps to the new runner. Closing on any gap was too aggressive.

One thing worth noting: the private field serverMessageIndex is still updated to the gapped value, so any late-arriving messages in the skipped range will be dropped by the de-dup check (wrappingLteU16). This is correct for at-least-once delivery semantics, but worth documenting with a comment.

tunnel.ts - Hibernatable ws.downstream_closed bypass

The new branch correctly performs the same in-memory cleanup (deleteWebSocket, deletePendingRequest, removeRequestToActor) while intentionally skipping adapter._handleClose() to preserve KV state for the next runner. The logic is sound.

Minor concern: "ws.downstream_closed" is matched as a plain string literal but its canonical definition lives in Rust (engine/packages/pegboard-gateway/src/lib.rs). A comment pointing to the source would prevent silent drift if the string ever changes on the Rust side.

state-manager.ts - saveState bug fix

This is a genuine correctness fix. The early-exit guard in saveState() was inconsistent with the inner savePersistInner() check. The outer guard only checked the persistChanged flag, missing the connsWithPersistChanged.size > 0 condition. This meant connection state changes could be silently dropped when actor state itself had not changed.

Suggestion: this fix has real impact on persistence correctness and would benefit from a unit test. A case where connsWithPersistChanged is non-empty but persistChanged is false, confirming that saveState now triggers a write, would be a good regression test.

mod.ts - Diagnostic logging

The new info-level logs and performance.now() timing around onActorStart are useful for debugging migration issues. Two things to consider:

  1. Log level: several of these are fairly verbose for production (e.g. logging every hibernating request individually in a loop). Consider downgrading to debug once the migration issue is resolved.
  2. Pre-existing issue: line 355 uses console.error in the onActorStop catch block instead of the structured logger pattern (this.log?.error). Not introduced by this PR, but worth a follow-up cleanup.

Summary

Severity / File / Issue:

  • Minor: tunnel.ts - Add comment linking "ws.downstream_closed" to its Rust definition
  • Minor: state-manager.ts - No regression test added for the saveState bug fix
  • Suggestion: mod.ts - New info-level logs may be too verbose for production long-term
  • Pre-existing: mod.ts - console.error in onActorStop catch instead of structured logger

Overall the changes are correct and the saveState fix closes a real persistence gap. The approach for handling message index gaps during hibernation is sound.

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Apr 9, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4594

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4594

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4594

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4594

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4594

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4594

@rivetkit/sqlite-vfs

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sqlite-vfs@4594

@rivetkit/traces

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/traces@4594

@rivetkit/workflow-engine

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/workflow-engine@4594

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4594

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4594

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4594

commit: 16648e0

@NathanFlurry NathanFlurry force-pushed the fix/ws-gap-warning branch 2 times, most recently from 651b567 to 065a506 Compare April 9, 2026 02:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant