Skip to content

fix(plugin-redis): stop idle refresh from emptying the key list (#1701)#1706

Merged
datlechin merged 2 commits into
mainfrom
fix/redis-engine-rewrite-1701
Jun 17, 2026
Merged

fix(plugin-redis): stop idle refresh from emptying the key list (#1701)#1706
datlechin merged 2 commits into
mainfrom
fix/redis-engine-rewrite-1701

Conversation

@datlechin

Copy link
Copy Markdown
Member

Fixes #1701.

Root cause

The connection health monitor pinged every driver with execute(query: "SELECT 1"). For SQL that is a harmless no-op; for Redis, SELECT 1 is a real command that switches the live connection to database index 1. It returns +OK, so the monitor considered the connection healthy while every later refresh scanned the wrong (usually empty) database. That is why entries vanished after the connection sat idle, and why navigating to another database and back fixed it (that re-issues SELECT).

RedisPluginDriver.ping() already sent PING correctly, but it was unreachable: DatabaseDriver had no ping() and PluginDriverAdapter never bridged it, so the monitor fell through to the destructive SELECT 1.

What changed

  • Health ping (the bug): added ping() to DatabaseDriver with a default that keeps SELECT 1 for SQL drivers, bridged it in PluginDriverAdapter, and pointed the health monitor at mainDriver.ping(). Redis now pings with PING and never mutates the selected database.
  • Key browsing completeness/pagination: default browse and namespace browse emitted single-batch SCAN 0 MATCH ... COUNT n, which never iterates the cursor (so large keyspaces showed a partial fixed set) and ignores OFFSET (so paging re-showed the first batch). Both now route through KEYBROWSE, which iterates the cursor to completion (bounded at 10k), sorts, dedupes, and honors LIMIT/OFFSET.
  • Transparent reconnect: on a dropped socket (REDIS_ERR_EOF/REDIS_ERR_IO) the connection now reconnects on the next command, replays AUTH and the selected database, and retries once, matching redis-cli. Previously commands failed until the next health check rebuilt the driver.
  • Cleanup: removed dead cachedScan* fields, deduped the hiredis error-message reading, removed stray comments, cleared pre-existing lint debt in touched files.

Decisions (deliberately not done)

These were evaluated against the actual code and declined because doing them would trade a correct pattern for risk or regression, not because they were skipped:

  • Actor conversion of the connection: hiredis is a blocking synchronous C API. The current dedicated serial DispatchQueue is the correct way to wrap blocking I/O; a plain actor would block Swift's cooperative pool. The @unchecked Sendable here is encapsulated (single serial queue + lock-guarded state), not a hack.
  • argv statement generation: verified that escapeArgument and the parser tokenizer are a lossless inverse and the C layer already uses explicit-length redisCommandArgv, so there is no injection or data-loss bug. Rewriting would overload executeParameterized app-wide for no benefit.
  • RESP3/HELLO negotiation: would change reply shapes (HGETALL etc. as maps) and regress every result-builder for marginal gain.
  • KEYS rewriting and transactions: an explicit KEYS typed in the CLI runs literally, like redis-cli; only the browse path matters and it uses SCAN. MULTI/EXEC/DISCARD is Redis's genuine transaction primitive and stays.

The namespace tree view and type-aware key opening already exist; this PR fixes the real completeness defect in that flow.

Compatibility

All changes are confined to the app target and the Redis plugin bundle. No TableProPluginKit types change, so no plugin ABI bump or registry re-release is needed.

Tests

  • New PluginDriverAdapterPingTests: a driver with a ping() override routes through it (never execute); a driver without one falls back to SELECT 1.
  • Updated RedisQueryBuilderTests to assert the corrected KEYBROWSE output (complete, paginated) instead of the old single-batch SCAN.
  • swiftlint --strict clean on changed files; full xcodebuild test succeeds (app + Redis plugin + tests compile and pass).

The C-bridge reconnect path compiles but can only be runtime-validated against a live Redis (the standard plugin C-bridge test limitation).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1407abe750

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

return try executeCommandSync(args)
} catch let error as RedisPluginError where isConnectionError(error) && !isShuttingDown {
try reconnectSync()
return try executeCommandSync(args)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid replaying non-idempotent Redis commands

When an EOF/IO is reported after Redis has received a write but before the reply reaches the client, this unconditional retry sends the exact same command again. That affects non-idempotent operations routed through executeCommand (e.g. INCR, LPUSH, SADD, or commands inside a transaction), so a transient disconnect can double-apply data changes instead of surfacing an ambiguous failure. Please only reconnect and retry commands that are known safe/idempotent, or reconnect without replaying the failed mutation.

Useful? React with 👍 / 👎.

Comment on lines +493 to +494
freeContextSync()
try openContextSync(selectDatabase: targetDatabase)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep retry path available after a failed reconnect

When a reconnect attempt fails because Redis is still unavailable, freeContextSync() has already cleared context; subsequent public methods hit the guard context != nil check and throw notConnected before executeCommandSyncRetrying can call reconnectSync again. In the common Redis restart case, the connection therefore remains unable to self-heal on the next command after the server is back, which is the path this change is trying to fix.

Useful? React with 👍 / 👎.

@datlechin datlechin merged commit bbe42bc into main Jun 17, 2026
3 checks passed
@datlechin datlechin deleted the fix/redis-engine-rewrite-1701 branch June 17, 2026 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Redis entry disappears after 10 seconds

1 participant