Skip to content

chore(api): add redis sync with orch#2122

Merged
jakubno merged 7 commits intomainfrom
chore/sync-with-redis
Apr 1, 2026
Merged

chore(api): add redis sync with orch#2122
jakubno merged 7 commits intomainfrom
chore/sync-with-redis

Conversation

@jakubno
Copy link
Copy Markdown
Member

@jakubno jakubno commented Mar 12, 2026

In case of some some random error (DB, node time out), the sandbox is removed from sandbox store, but it stays running on the node forever, this will make sure it gets properly cleaned up.

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review with inline comments below.

@jakubno jakubno force-pushed the chore/sync-with-redis branch from cf9cf5a to 5fa0270 Compare March 31, 2026 12:20
@cursor
Copy link
Copy Markdown

cursor bot commented Mar 31, 2026

PR Summary

Medium Risk
Introduces automated deletion of node sandboxes based on Redis divergence, so logic/Redis errors could terminate valid workloads. The concurrent kill loop in Store.Reconcile captures the loop variable and may attempt to kill the wrong sandbox repeatedly.

Overview
Updates node synchronization to Reconcile running sandboxes against the configured backend: for Redis, the store now detects node-reported sandboxes missing from Redis (with an age grace period and pipelined per-team MGET) and triggers orchestrator-side gRPC deletion to clean up orphans; for memory/populate_redis it preserves the prior behavior of re-adding missing sandboxes to the local cache.

Written by Cursor Bugbot for commit 2b884d8. This will update automatically on new commits. Configure here.

@jakubno jakubno marked this pull request as ready for review March 31, 2026 12:46
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Goroutine uses cancellable context for orphan cleanup
    • Applied context.WithoutCancel to detach the goroutine's context from the parent timeout context, matching the pattern used on line 118 for AsyncNewlyCreatedSandbox.

Create PR

Or push these changes by commenting:

@cursor push c5e9ba42ee
Preview (c5e9ba42ee)
diff --git a/packages/api/internal/sandbox/store.go b/packages/api/internal/sandbox/store.go
--- a/packages/api/internal/sandbox/store.go
+++ b/packages/api/internal/sandbox/store.go
@@ -168,7 +168,7 @@
 		// Redis is the source of truth — divergent sandboxes are orphans running
 		// on the node but not present in the store. Kill them.
 		for _, sbx := range sbxsToBeSynced {
-			go s.callbacks.RemoveSandboxFromNode(ctx, sbx)
+			go s.callbacks.RemoveSandboxFromNode(context.WithoutCancel(ctx), sbx)
 		}
 	} else {
 		// Memory backend — divergent sandboxes are ones discovered on the node

This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

@jakubno jakubno requested a review from sitole April 1, 2026 08:06
@jakubno jakubno merged commit cde821a into main Apr 1, 2026
36 checks passed
@jakubno jakubno deleted the chore/sync-with-redis branch April 1, 2026 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants