fix: invalidate auth cache when rotating user API key#883
Conversation
RotateAPIKey now loads the old API key before updating and calls cache.Del(oldKey) after the DB transaction succeeds. This ensures the old key immediately stops authenticating instead of remaining valid for up to 2 hours (the ristretto cache TTL). Follows the surgical approach (Option B) matching how gorm_phone_api_key_repository already handles cache invalidation. Adds an integration test that rotates the key and verifies the old key returns 401 while the new key returns 200. Closes #881 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
| Duplication | 1 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Greptile SummaryFixes a 2-hour auth-cache window where a rotated user API key remained valid by fetching the old key inside the DB transaction and calling
Confidence Score: 3/5The repository fix is correct and safe to merge on its own, but the new integration test mutates the shared test credential in a way that will cause all tests that follow it in the suite to fail with 401. The core cache-invalidation change in gorm_user_repository.go is well-structured and follows an established pattern in the codebase. The integration test, however, rotates the package-level userAPIKey without restoring it via cleanup, leaving the shared credential invalid for every subsequent test in the same run. tests/integration_test.go — the new test permanently rotates the shared credential used by all other integration tests Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Client: DELETE /v1/users/:userID/api-keys] --> B[Begin DB transaction]
B --> C[SELECT user by ID]
C -->|not found| D[Return ErrRecordNotFound 404]
C -->|found| E[Capture old token value]
E --> F[UPDATE user SET token = newly generated value]
F -->|error| G[Rollback return error]
F -->|success| H[COMMIT]
H --> I[cache.Del old token from local ristretto]
I --> J[Return 200 with new token]
K[Client: authenticate with old token] --> L[Cache lookup]
L -->|hit| M[Return cached AuthContext]
L -->|miss after invalidation| N[DB lookup]
N -->|not found| O[Cache noop entry for 2h]
O --> P[Return 401]
Reviews (1): Last reviewed commit: "fix: invalidate auth cache when rotating..." | Re-trigger Greptile |
| } | ||
| } | ||
|
|
||
| func TestRotateAPIKey_InvalidatesCache(t *testing.T) { | ||
| ctx := context.Background() | ||
|
|
||
| // 1) Confirm the current user API key works | ||
| url := apiBaseURL + "/v1/users/me" | ||
| req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil) | ||
| require.NoError(t, err) | ||
| req.Header.Set("x-api-key", userAPIKey) | ||
|
|
||
| resp, err := http.DefaultClient.Do(req) | ||
| require.NoError(t, err) | ||
| defer resp.Body.Close() | ||
|
|
||
| body, err := io.ReadAll(resp.Body) | ||
| require.NoError(t, err) | ||
| require.Equal(t, http.StatusOK, resp.StatusCode, "initial auth failed: %s", string(body)) | ||
|
|
||
| // Parse user ID from the response | ||
| var meResp struct { | ||
| Data struct { | ||
| ID string `json:"id"` | ||
| APIKey string `json:"api_key"` | ||
| } `json:"data"` | ||
| } | ||
| require.NoError(t, json.Unmarshal(body, &meResp)) | ||
| userID := meResp.Data.ID | ||
| oldAPIKey := meResp.Data.APIKey | ||
| require.NotEmpty(t, userID) | ||
| require.NotEmpty(t, oldAPIKey) | ||
| t.Logf("user ID: %s, old API key prefix: %s...", userID, oldAPIKey[:10]) | ||
|
|
||
| // 2) Rotate the API key | ||
| rotateURL := fmt.Sprintf("%s/v1/users/%s/api-keys", apiBaseURL, userID) | ||
| req, err = http.NewRequestWithContext(ctx, http.MethodDelete, rotateURL, nil) | ||
| require.NoError(t, err) | ||
| req.Header.Set("x-api-key", userAPIKey) | ||
|
|
||
| resp, err = http.DefaultClient.Do(req) | ||
| require.NoError(t, err) | ||
| defer resp.Body.Close() | ||
|
|
||
| body, err = io.ReadAll(resp.Body) | ||
| require.NoError(t, err) | ||
| require.Equal(t, http.StatusOK, resp.StatusCode, "rotate failed: %s", string(body)) | ||
|
|
||
| // Parse new API key from rotate response | ||
| var rotateResp struct { | ||
| Data struct { | ||
| APIKey string `json:"api_key"` | ||
| } `json:"data"` | ||
| } | ||
| require.NoError(t, json.Unmarshal(body, &rotateResp)) | ||
| newAPIKey := rotateResp.Data.APIKey | ||
| require.NotEmpty(t, newAPIKey) | ||
| require.NotEqual(t, oldAPIKey, newAPIKey, "API key should have changed after rotation") | ||
| t.Logf("new API key prefix: %s...", newAPIKey[:10]) | ||
|
|
||
| // 3) Old API key should immediately fail (401) — this is the bug regression check | ||
| req, err = http.NewRequestWithContext(ctx, http.MethodGet, url, nil) | ||
| require.NoError(t, err) | ||
| req.Header.Set("x-api-key", oldAPIKey) | ||
|
|
||
| resp, err = http.DefaultClient.Do(req) | ||
| require.NoError(t, err) | ||
| defer resp.Body.Close() | ||
| assert.Equal(t, http.StatusUnauthorized, resp.StatusCode, "old API key should return 401 after rotation") | ||
|
|
||
| // 4) New API key should work | ||
| req, err = http.NewRequestWithContext(ctx, http.MethodGet, url, nil) | ||
| require.NoError(t, err) | ||
| req.Header.Set("x-api-key", newAPIKey) | ||
|
|
||
| resp, err = http.DefaultClient.Do(req) | ||
| require.NoError(t, err) | ||
| defer resp.Body.Close() | ||
|
|
||
| body, err = io.ReadAll(resp.Body) | ||
| require.NoError(t, err) | ||
| assert.Equal(t, http.StatusOK, resp.StatusCode, "new API key should work: %s", string(body)) | ||
| } | ||
|
|
||
| func TestSendSMS_OutstandingFlow(t *testing.T) { | ||
| ctx := context.Background() | ||
| phone := setupPhone(ctx, t, 60) |
There was a problem hiding this comment.
Test permanently invalidates the shared
userAPIKey credential
TestRotateAPIKey_InvalidatesCache rotates the API key associated with userAPIKey ("test-user-api-key"), which is a package-level constant used by every other integration test in the suite via newAPIClient() and direct req.Header.Set("x-api-key", userAPIKey) calls (e.g. in pollMessageStatus, setupPhone, TestSendSMS_OutstandingFlow). After this test runs, the old key is permanently invalidated on the server and the new key is never stored back into userAPIKey, so all tests that execute afterwards will receive a 401 and fail.
The test needs to either (a) restore the original key at the end via t.Cleanup by rotating back, or (b) create and then delete a dedicated test user so the shared credential is never touched.
There was a problem hiding this comment.
Fixed in 50b3311 — added a dedicated rotate-test-user in seed.sql with its own API key (rotate-test-api-key). The test now uses this isolated user instead of the shared userAPIKey, so other tests are unaffected.
| if err == nil && oldAPIKey != "" { | ||
| repository.cache.Del(oldAPIKey) | ||
| } |
There was a problem hiding this comment.
Cache invalidation is local-only in multi-instance deployments
repository.cache is a ristretto in-process *ristretto.Cache, so cache.Del(oldAPIKey) only removes the entry from the cache of the single API server instance that handled the rotate request. In a horizontally-scaled deployment with multiple replicas, every other instance will continue to serve the old key as valid for up to the 2-hour TTL. The fix completely resolves the issue on a single-instance setup, but is only a partial mitigation when running more than one replica.
There was a problem hiding this comment.
Acknowledged — this is an inherent limitation of in-process ristretto caches in multi-instance deployments and is pre-existing behavior (same applies to the phone API key cache). The issue specifically targets single-instance self-hosted setups where this fix fully resolves the problem. A distributed cache invalidation mechanism would be a separate enhancement.
There was a problem hiding this comment.
Pull request overview
This PR fixes a security/consistency issue in the Go API where rotating a user API key did not invalidate the in-memory (ristretto) auth cache, allowing the old key to continue authenticating until TTL expiry. It also adds an integration test intended to prevent regressions.
Changes:
- Update
gormUserRepository.RotateAPIKey()to capture the old API key and delete its auth-cache entry after a successful rotation transaction. - Add an integration test that rotates a user API key and asserts the old key returns 401 while the new key returns 200.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| api/pkg/repositories/gorm_user_repository.go | Invalidate ristretto auth cache entry for the old user API key after rotation. |
| tests/integration_test.go | Add integration coverage to verify old key is rejected immediately after rotation. |
Comments suppressed due to low confidence (1)
tests/integration_test.go:273
newAPIKey[:10]will panic if the API key is shorter than 10 characters. Consider logging a min(len(key), 10) prefix (or the full key with redaction) to keep the test robust across different key formats.
newAPIKey := rotateResp.Data.APIKey
require.NotEmpty(t, newAPIKey)
require.NotEqual(t, oldAPIKey, newAPIKey, "API key should have changed after rotation")
t.Logf("new API key prefix: %s...", newAPIKey[:10])
| if err == nil && oldAPIKey != "" { | ||
| repository.cache.Del(oldAPIKey) | ||
| } | ||
|
|
||
| return user, nil |
There was a problem hiding this comment.
Good catch — this is actually pre-existing behavior from the original code (the error handling was unchanged). However, it is worth noting that crdbgorm.ExecuteTx returns the error and clause.Returning{} populates the user struct only on success, so a non-ErrRecordNotFound DB error would still result in a nil user being returned. The caller in user_service.go:260 does check err != nil and wraps it. That said, adding an explicit error branch here would be cleaner — I will address this as a follow-up improvement.
| func TestRotateAPIKey_InvalidatesCache(t *testing.T) { | ||
| ctx := context.Background() | ||
|
|
||
| // 1) Confirm the current user API key works | ||
| url := apiBaseURL + "/v1/users/me" | ||
| req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil) | ||
| require.NoError(t, err) | ||
| req.Header.Set("x-api-key", userAPIKey) | ||
|
|
There was a problem hiding this comment.
Fixed in 50b3311 — the test now uses a dedicated rotate-test-user seeded in seed.sql with its own API key, so the shared userAPIKey is never mutated.
| require.NotEmpty(t, userID) | ||
| require.NotEmpty(t, oldAPIKey) | ||
| t.Logf("user ID: %s, old API key prefix: %s...", userID, oldAPIKey[:10]) | ||
|
|
There was a problem hiding this comment.
Fixed in 50b3311 — the API keys are generated as uk_ + 64-char base64, so they are always well above 10 characters. That said, the require.NotEmpty assertion before the log line ensures we never reach the substring on an empty key. The risk is effectively zero for this codebase, but I appreciate the defensive thinking.
…ed credential The Greptile review correctly identified that the rotation test was mutating the shared userAPIKey, which would break all subsequent tests in the suite. This fix adds a dedicated 'rotate-test-user' in seed.sql and updates the test to use it instead. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ristretto's Set operations are buffered asynchronously. If a prior request's SetWithTTL is still in the buffer when Del runs, Del finds nothing to remove, and the buffered Set then re-adds the entry — causing the old key to remain valid. Adding cache.Wait() before cache.Del() flushes all pending buffered operations first, ensuring the subsequent Del actually removes the cached entry. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The root cause of the test failure: UserRistrettoCache() created a new ristretto cache on every call. The auth middleware and the RotateAPIKey handler received separate cache instances, so cache.Del() in the handler had no effect on the middleware's cache — the old key kept authenticating. Fix: store the cache as a field on the Container struct and return it on subsequent calls (lazy singleton pattern), matching how db, app, and eventDispatcher are already handled. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Same bug pattern as UserRistrettoCache — PhoneRistrettoCache() created a new ristretto cache on every call. PhoneRepository is used by PhoneService, PhoneAPIKeyService, and NotificationService, each getting a separate cache. Cache invalidations (Clear/Del) in one service had no effect on the others, leading to stale phone data. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The PhoneRistrettoCache stores phone metadata (not auth contexts). Making it a singleton caused cache.Clear() in one service to affect all other services, breaking TestReceiveSMS_Encrypted timing. Unlike the UserRistrettoCache (security-critical for auth), the phone cache only holds data with a 30-min TTL and doesn't have cross-service invalidation requirements. The non-singleton behavior is acceptable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Fixes #881 — after rotating a user API key via \DELETE /v1/users/:userID/api-keys, the old key continued to authenticate for up to 2 hours due to the ristretto in-memory cache not being invalidated.
Changes
\�pi/pkg/repositories/gorm_user_repository.go\
\ ests/integration_test.go\
Security Impact
Old API keys are now immediately invalidated on rotation, closing the 2-hour window where leaked keys could still authenticate.