Fix reported issue with room reconnect when wifi reconnects #900

Open
darryncampbell wants to merge 1 commit into main from darryn/room-reconnect-on-wifi-reconnect
To address issue: https://github.com/fabiogambaaik/livekit-demo-app

Problem

When Wi-Fi is disabled on an Android device with multiple concurrent LiveKit rooms, two issues occur:

  1. Some rooms never show the RECONNECTING state. The SDK relies on ICE PeerConnectionState reaching FAILED to trigger the onEngineReconnecting callback, which sets Room state to RECONNECTING. For some rooms, however, ICE never reaches FAILED; it stays at DISCONNECTED or even CONNECTED. Because the SDK intentionally ignores PeerConnectionState.DISCONNECTED as transient, these rooms remain stuck showing CONNECTED despite having no network.
  2. Rooms fail to recover when Wi-Fi is re-enabled. When the network drops, all engines detect ICE ConnectionState.DISCONNECTED (the internal connection state, distinct from PeerConnectionState) within ~200ms and self-start their own reconnect loops (RTCEngine.kt lines 148-149). These loops burn through retries making blocking WebSocket/DNS calls to unreachable servers. By the time Android's NetworkCallback.onAvailable() fires (~10+ seconds later), all engines have active reconnect jobs stuck in blocking joinImpl() calls. The existing engine.reconnect() call from onAvailable hits the reconnectingJob?.isActive guard and does nothing. The stuck jobs eventually exhaust the 60-second timeout or all 30 retries, and the rooms never recover.
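
The second failure mode can be illustrated with a minimal model of the guard. This is a sketch only: `FakeJob` and `GuardedEngine` are hypothetical stand-ins rather than the SDK's classes, and the real engine uses coroutine jobs instead of a plain flag.

```kotlin
// Simplified stand-in for a coroutine Job; only the isActive flag matters here.
class FakeJob(var isActive: Boolean = true)

// Hypothetical model of the engine's idempotency guard, not the SDK's RTCEngine.
class GuardedEngine {
    var reconnectingJob: FakeJob? = null
        private set
    var loopsStarted = 0
        private set

    /** Starts a reconnect loop unless one is already active. Returns true if a new loop started. */
    fun reconnect(): Boolean {
        if (reconnectingJob?.isActive == true) {
            // Guard: an earlier loop is still running, so this call is dropped.
            return false
        }
        reconnectingJob = FakeJob()
        loopsStarted++
        return true
    }
}
```

In this model, the first `reconnect()` (triggered by ICE state detection while offline) starts a loop, and a later `reconnect()` from onAvailable returns false because the first job is still marked active.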

Changes

Room.kt - onLost callback: Set Room state to RECONNECTING and post RoomEvent.Reconnecting immediately when Android reports network loss. This ensures all rooms accurately reflect their connectivity status using the OS-level network signal, rather than waiting for ICE state detection that may never fire for some rooms.
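
A rough sketch of the onLost behaviour described here, using hypothetical types (`SketchRoomState`, `SketchRoomEvent`, `SketchRoom`) in place of the SDK's Room and event classes. The early return for rooms that are not CONNECTED is an assumption added for illustration, not something stated in this PR.

```kotlin
enum class SketchRoomState { CONNECTED, RECONNECTING, DISCONNECTED }

sealed class SketchRoomEvent {
    object Reconnecting : SketchRoomEvent()
}

// Hypothetical room model; the real Room.kt receives onLost from Android's NetworkCallback.
class SketchRoom {
    var state = SketchRoomState.CONNECTED
        private set
    val events = mutableListOf<SketchRoomEvent>()

    /** Called when the OS reports network loss. */
    fun onLost() {
        // Assumption: only rooms that still look connected are flipped,
        // so a repeated onLost does not post duplicate events.
        if (state != SketchRoomState.CONNECTED) return
        state = SketchRoomState.RECONNECTING
        events.add(SketchRoomEvent.Reconnecting)
    }
}
```

The point of the sketch is that the state change depends only on the OS signal, so it happens even for rooms whose ICE state never reaches FAILED.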

Room.kt - onAvailable callback: Call engine.forceReconnect() instead of engine.reconnect(). The old reconnect() path could never succeed because the engine's own reconnect job, started by ICE state detection while offline, was always already active.

Room.kt - reconnect() method: Removed the state == State.RECONNECTING guard. This guard was preventing onAvailable from triggering reconnection for rooms that had already transitioned to RECONNECTING via onEngineReconnecting. With the new forceReconnect() approach from onAvailable, this method is now only used by other internal callers and the guard is unnecessary (the engine has its own idempotency guard).

RTCEngine.kt - New forceReconnect() method: Cancels any in-progress reconnect job, clears the job reference, sets fullReconnectOnNext = true, and calls reconnect() to start a fresh reconnect loop. This is necessary because:

  • The cancelled job may have already called closeResources() during a full reconnect attempt, destroying peer connections and the SignalClient. A soft reconnect (the default first attempt) would fail on these destroyed resources, so forcing a full reconnect ensures everything is rebuilt from scratch.
  • The fresh job gets a new 60-second timeout budget and full retry count, rather than inheriting the nearly-exhausted budget of the stuck job.
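
The four steps of forceReconnect() can be sketched as follows. `SketchJob` and `SketchEngine` are hypothetical stand-ins for the coroutine job and RTCEngine, and resetting the flag once the new job picks it up is an assumption made for the model.

```kotlin
// Stand-in for a coroutine Job with cancel support; records which reconnect mode it was started in.
class SketchJob(val fullReconnect: Boolean) {
    var isActive = true
        private set

    fun cancel() {
        isActive = false
    }
}

// Hypothetical engine model, not the SDK's RTCEngine.
class SketchEngine {
    var reconnectingJob: SketchJob? = null
        private set
    var fullReconnectOnNext = false
        private set

    fun reconnect() {
        // Idempotency guard: do nothing while a loop is already running.
        if (reconnectingJob?.isActive == true) return
        // Assumption: the new job consumes the flag, which is then reset.
        reconnectingJob = SketchJob(fullReconnect = fullReconnectOnNext)
        fullReconnectOnNext = false
    }

    fun forceReconnect() {
        reconnectingJob?.cancel()   // 1. cancel the in-progress (possibly stuck) job
        reconnectingJob = null      // 2. clear the reference so the guard passes
        fullReconnectOnNext = true  // 3. skip the soft attempt and rebuild everything
        reconnect()                 // 4. start a fresh loop with a fresh budget
    }
}
```

Clearing the reference before calling reconnect() is what lets the call get past the guard; in this model the cancelled job is also inactive, but clearing the reference makes that independent of how quickly cancellation takes effect.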

Why not trigger engine.reconnect() from onLost?

Starting the reconnect loop while offline would waste the retry budget (30 retries, 60s timeout) on a network that's known to be down. The engine could exhaust all retries before the network returns, making recovery impossible. The engine already self-starts reconnection via ICE state detection anyway, so the fix addresses the recovery side (onAvailable), not the detection side.
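
The budget argument can be checked with a small simulation. This is a toy model on simulated time: the `recovers` function and the per-attempt durations are hypothetical, and only the 30-retry and 60-second limits come from the description above.

```kotlin
/**
 * Simulates one reconnect loop on a simulated clock.
 * Each failed attempt costs attemptMs; the network becomes reachable at networkBackAtMs.
 * Returns true if an attempt is made while the network is up and the budget is not exhausted.
 */
fun recovers(
    loopStartMs: Long,
    networkBackAtMs: Long,
    attemptMs: Long,
    maxRetries: Int = 30,
    timeoutMs: Long = 60_000
): Boolean {
    var now = loopStartMs
    for (attempt in 1..maxRetries) {
        if (now - loopStartMs > timeoutMs) return false // timeout budget exhausted
        if (now >= networkBackAtMs) return true          // this attempt reaches the server
        now += attemptMs                                 // failed attempt consumes time
    }
    return false // retry budget exhausted
}
```

For example, with the network returning after 90 seconds and 3-second attempts, a loop started at time 0 runs out of its 60-second budget, while a fresh loop started at the moment the network returns succeeds on its first attempt.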

@changeset-bot
changeset-bot bot commented Mar 24, 2026

⚠️ No Changeset found

Latest commit: 76c6f25

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types
