Skip to content

feat(redis): add resilient redis conn layer#511

Open
smarcet wants to merge 1 commit intomainfrom
feature/redis-resilience-conn-layer
Open

feat(redis): add resilient redis conn layer#511
smarcet wants to merge 1 commit intomainfrom
feature/redis-resilience-conn-layer

Conversation

@smarcet
Copy link
Collaborator

@smarcet smarcet commented Feb 25, 2026

No description provided.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an opt-in “resilient” Predis Redis driver to retry selected commands on transient Predis\Connection\ConnectionException failures, with configurable retry limits/delay, plus accompanying tests, ADR, and CI coverage.

Changes:

  • Introduces ResilientPredisConnection (retry-with-reconnect + exponential backoff) and ResilientPredisConnector (wraps Laravel’s Predis connector).
  • Registers a new Redis client driver predis_resilient via a service provider and adds retry_limit / retry_delay to Redis connection configs.
  • Adds unit tests for retry behavior and runs/uploads Redis test results in CI; documents the decision in an ADR.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
app/Redis/ResilientPredisConnection.php Implements retry logic and idempotent-command allowlist.
app/Redis/ResilientPredisConnector.php Wraps the default connector to return the resilient connection.
app/Providers/RedisResilienceServiceProvider.php Registers the predis_resilient driver in the Redis manager.
config/database.php Adds retry_limit / retry_delay config keys for Redis connections.
config/app.php Registers the new service provider.
tests/Redis/ResilientPredisConnectionTest.php Unit tests validating retry/non-retry behavior.
adr/001-resilient-redis-connection.md ADR documenting rationale, design, and configuration.
.github/workflows/push.yml Executes Redis tests in CI and uploads results.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

public function boot(): void
{
$this->app->afterResolving('redis', function (RedisManager $redis, Application $app) {
$redis->extend('predis_resilient', function () {
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callback passed to $redis->extend() is declared with no parameters, but RedisManager custom creators are invoked with arguments (at least the application instance and the connection config). On PHP 8.3 this will raise an ArgumentCountError when resolving the predis_resilient driver. Update the closure signature to accept the expected args (or use a variadic) and return the connector from there.

Suggested change
$redis->extend('predis_resilient', function () {
$redis->extend('predis_resilient', function (...$parameters) {

Copilot uses AI. Check for mistakes.
'EXISTS', 'TYPE', 'TTL', 'PTTL', 'KEYS', 'SCAN', 'HSCAN', 'SSCAN', 'ZSCAN',
'INFO', 'PING', 'DBSIZE', 'TIME', 'STRLEN', 'GETRANGE',
// idempotent writes
'SET', 'SETEX', 'PSETEX', 'MSET', 'SETNX', 'GETSET',
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SETNX and GETSET are listed as idempotent, but retrying them can change the observed result (e.g., SETNX returns 1 then 0; GETSET returns the previous value, which will differ on retry). Retrying these commands after a read-side failure can change application behavior even if the final key state is the same; consider removing them from the retry allowlist or adjusting the idempotency criteria to account for return values.

Suggested change
'SET', 'SETEX', 'PSETEX', 'MSET', 'SETNX', 'GETSET',
'SET', 'SETEX', 'PSETEX', 'MSET',

Copilot uses AI. Check for mistakes.
Comment on lines +49 to +53
'HSET', 'HMSET', 'HSETNX',
'DEL', 'HDEL', 'UNLINK',
'EXPIRE', 'EXPIREAT', 'PEXPIRE', 'PEXPIREAT', 'PERSIST',
'SADD', 'SREM',
'ZADD', 'ZREM', 'ZREMRANGEBYSCORE', 'ZREMRANGEBYRANK',
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several commands in the "idempotent writes" allowlist (e.g., HSET, DEL, EXPIRE, SADD, ZADD) can return different replies when executed twice (counts / 1-or-0 flags). If callers rely on these return values, a retry after a connection failure can change application behavior. Consider restricting retries to commands with stable replies on repetition, or explicitly redefining/normalizing the semantics you consider safe to retry.

Copilot uses AI. Check for mistakes.
'attempt' => $attempt,
'max_retries' => $this->retryLimit,
'delay_ms' => $delay,
'error' => $previous->getMessage(),
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the retry warning log, the 'error' field is always taken from the original $previous exception, so later attempts will log the wrong message if subsequent retries fail with a different ConnectionException. Use the most recent exception for logging (e.g., the caught exception for that attempt) so the log reflects the actual failure being retried.

Suggested change
'error' => $previous->getMessage(),
'error' => $lastException->getMessage(),

Copilot uses AI. Check for mistakes.
Comment on lines +42 to +45
### 2. Job Dispatch Fallback for Audit Logging

Separately, `AuditLogOtlpStrategy` was updated to dispatch `EmitAuditLogJob` via `JobDispatcher::withSyncFallback()` instead of `EmitAuditLogJob::dispatch()`. This catches Redis failures at the job dispatch level and runs the audit log emission synchronously as a fallback — preventing audit logging from ever failing the parent business transaction.

Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ADR states that AuditLogOtlpStrategy was updated to use JobDispatcher::withSyncFallback(), but that change is not part of this PR’s diffs. Either include the referenced code change in this PR, or adjust the ADR text to avoid documenting behavior that isn’t implemented here.

Copilot uses AI. Check for mistakes.
| `app/Providers/RedisResilienceServiceProvider.php` | Registers `predis_resilient` driver |
| `config/database.php` | Added `retry_limit`, `retry_delay` to Redis connections |
| `config/app.php` | Registered service provider |
| `app/Audit/AuditLogOtlpStrategy.php` | Changed to `JobDispatcher::withSyncFallback` |
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Files table lists app/Audit/AuditLogOtlpStrategy.php as changed in this ADR, but that file is not modified in this PR. Update the table to match the actual change set (or add the missing implementation change if it’s intended to be part of this ADR/PR).

Suggested change
| `app/Audit/AuditLogOtlpStrategy.php` | Changed to `JobDispatcher::withSyncFallback` |

Copilot uses AI. Check for mistakes.
Comment on lines +58 to +60
- Retry adds latency on failure (up to ~350ms with defaults: 50 + 100 + 200ms backoff)
- `usleep()` in the retry loop blocks the PHP process during backoff — acceptable for 2-3 retries but would need async handling at higher retry counts
- The idempotent command list is manually maintained and must be updated if new Redis commands are used
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latency calculation here doesn’t match the implemented defaults: with retry_limit defaulting to 2 (code + config), exponential backoff sleeps are 50ms + 100ms (=150ms), not 50+100+200 (=350ms). Either update the ADR math/examples or change the default retry limit to align with the documented worst case.

Copilot uses AI. Check for mistakes.
Comment on lines +56 to +61
/**
* @param \Predis\Client $client
* @param int $retryLimit Max number of retries (0 = no retries, behaves like stock PredisConnection)
* @param int $retryDelay Base delay in milliseconds between retries (doubled each attempt)
*/
public function __construct($client, int $retryLimit = 2, int $retryDelay = 50)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When retryLimit is 0, command() still routes into retryCommand(), which will emit the "all retries exhausted" error log before rethrowing. That’s not quite "behaves like stock PredisConnection" as described in the constructor doc. Consider short-circuiting in command() (or retryCommand()) when retryLimit <= 0 to rethrow without the retry/error-log path.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants