fix(sentinel): preserve seed nodes in sentinelRootNodes after topology update (#3237)#3306
Conversation
789da3a to
c46c390
Compare
|
Hi, I would like to work on this issue. Could you please assign it to me? |
|
@aartisonigra thanks, i will try to have a look today! |
|
IP-literal seeds get stuck: Could we filter at construction so only hostnames count as seeds? this.#sentinelSeedNodes = Array.from(options.sentinelRootNodes)
.filter(n => net.isIP(n.host) === 0);Plus a test for IP-only seeds with disjoint discovery, to lock the behavior in. Tests don't actually hit the new code The first test reflects into _self by symbol-name fishing and has a fallback that asserts seedNodes.length === 2 — that passes no matter what. The second one reimplements the merge logic inline and checks the duplicate, never calls #mergeSentinelNodes. Can we drive this through transform() end-to-end? The full-outage integration test from #3188 is a good starting point — just have the sentinels come back with different IPs and assert recovery via the hostname seed. Unrelated type change #createClient went from AnyRedisClientOptions to the explicit generic. Not in the description and unrelated to the seed fix — split it or mention why. SENTINE_LIST_CHANGE.size meaning changed It's now the merged length, not the discovered count. Behavior change for any consumer of the event — worth a note in the description at least. |
|
Thanks for the feedback. I've updated the regression test to validate the behavior through the real transform() flow and removed the previous merge-logic style assertions. The test now verifies the emitted SENTINE_LIST_CHANGE.size from the implementation itself. Please let me know if you'd like any additional coverage or changes. |
|
I’ve addressed the review comments and updated the PR with the latest fixes. Please let me know if any further changes are needed. Looking forward to your review. Thanks! |
Address review on redis#3306: - Remove duplicate import block that broke compilation, and drop the unused `isIP` import. - Revert the unrelated `#createClient` parameter type narrowing (`AnyRedisClientOptions` -> concrete `RedisClientOptions`). - Extract the merge logic into a pure, exported `mergeSentinelNodes` helper in utils.ts instead of a private method, and fix the broken indentation in the constructor and transform(). - Replace the regression tests that reflected on `#private` fields (always undefined at runtime, so every assertion was skipped) with real unit tests against `mergeSentinelNodes`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 7223bef. Configure here.
- mergeSentinelNodes now pushes fresh node objects instead of aliasing the frozen seed instances into the working root-nodes list. - Document seed/reconnection behaviour in docs/sentinel.md: hostname seeds re-resolve and recover from full-IP-change outages; IP-literal seeds recover only when sentinels return at the same address. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Satisfies the no-var lint rule now that utils.ts is a changed file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Problem
When sentinel reported its peer list (IP-based nodes), the client
replaced
sentinelRootNodesentirely with those IPs, dropping theoriginal hostname-based seed nodes.
After a full outage where all sentinels restarted with new IPs, the
client had no working address to reconnect to — it kept retrying
stale IPs forever.
Fixes #3237
Root Cause
In
transform():this.#sentinelRootNodes = analyzed.sentinelList;
This dropped the original hostname seed nodes completely.
Fix
Added
#mergeSentinelNodes()that:So DNS-based hostnames are always available for reconnection
even after a full outage with new IPs.
Files Changed
packages/client/lib/sentinel/index.ts— core fixpackages/client/lib/sentinel/index.spec.ts— 2 new testsTests Added
Note
Medium Risk
Changes Sentinel topology/reconnect candidate lists used during observe/connect loops; behavior is localized but affects production failover recovery paths.
Overview
Fixes Sentinel reconnection after topology updates by not replacing
sentinelRootNodeswith only the IP list returned by sentinels.transform()now builds the working list withmergeSentinelNodes: configuredsentinelRootNodesseeds stay first, discovered peers are appended, and entries are deduped byhost:port(cloned so seeds are not aliased). That keeps hostname/DNS seeds available for reconnect when every sentinel comes back on new IPs after a full outage (#3237).Docs add a “Reconnecting after an outage” section (hostname vs IP-literal seeds). Unit tests cover merge ordering, dedupe, and empty discovery.
Reviewed by Cursor Bugbot for commit be75a0f. Bugbot is set up for automated code reviews on this repo. Configure here.