fix(orch): keep the sandbox in the map even after it starts shutting down to handle ip reverse lookups by jakubno · Pull Request #2116 · e2b-dev/infra

jakubno · 2026-03-12T08:50:23Z

Note

Medium Risk
Changes sandbox lifecycle state and eviction timing, which can affect routing/firewall lookups and cleanup correctness under concurrent deletes or slot reuse. Risk is moderated by CAS idempotency and guarded eviction, but timing-based behavior could still cause edge-case leaks or mis-association.

Overview
Keeps sandboxes discoverable by host IP for a short grace period after shutdown by introducing an explicit SandboxStatus (starting/running/dead) and replacing immediate map removal with MarkDead plus delayed eviction; live queries (Get, Items, Count) now ignore non-running entries while GetByHostPort can still resolve dead sandboxes. Also adds safeguards to prevent stale IP reuse and accidental eviction of replacement sandboxes (CAS state transition, pointer-checked eviction, and insert-time removal of older entries sharing the same IP).

Written by Cursor Bugbot for commit f923bee. This will update automatically on new commits.

claude · 2026-03-12T08:54:37Z

packages/orchestrator/internal/sandbox/map.go

+// MarkRunning transitions a sandbox from starting to running and notifies OnInsert subscribers
 func (m *Map) MarkRunning(sbx *Sandbox) {
-	sbx.started.Store(true)
+	sbx.status.Store(int32(Status`Running))


Syntax error / compilation failure: Status followed by a backtick is an unterminated raw string literal in Go, so the package will not compile. This should be StatusRunning:

sbx.status.Store(int32(StatusRunning))

claude · 2026-03-12T08:54:37Z

packages/orchestrator/internal/sandbox/map.go

+// notified immediately (so the proxy / firewall limiter can clean up), but the
+// entry stays in the map for deadEvictionGracePeriod so that IP-based lookups
+// still resolve while the Firecracker process finishes shutting down.
+func (m *Map) MarkDead(ctx context.Context, sandboxID string) {


MarkDead is not idempotent, so OnRemove can fire twice in a concurrent scenario. The old Remove used RemoveCb(return exists) which was naturally idempotent: a second concurrent call would see exists=false and skip subscriber notification.

With MarkDead, the entry intentionally stays in the map for 30 s. If two goroutines both pass the Get/IsRunning() check before either calls MarkDead (a TOCTOU race on concurrent Delete RPCs for the same sandbox), both find the entry still present, both call sbx.status.Store(StatusDead), and both fire go m.trigger(OnRemove), notifying every subscriber twice.

Using CAS would prevent this:

if !sbx.status.CompareAndSwap(int32(StatusRunning), int32(StatusDead)) { return }

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

packages/orchestrator/internal/sandbox/map.go

jakubno added 5 commits March 11, 2026 23:14

chore(orch): simplify sandbox map cleanup flow

af58b5d

chore: add new line

64f7659

chore: add sandbox ip address

15f8164

Apply suggestion from @jakubno

c517f92

Apply suggestion from @jakubno

24d6db8

e2b-request-same-site-reviewers bot assigned sitole Mar 12, 2026

jakubno changed the base branch from main to chore/simplify-sandobx-map-cleanup-flow March 12, 2026 08:50

claude bot reviewed Mar 12, 2026

fix(orch): keep the sandbox in the map even after it starts shutting …

7562220

…down to handle ip reverse lookups

jakubno force-pushed the chore/error-finding-sandbox-after-kill branch from c14e526 to 7562220 Compare March 12, 2026 08:55

jakubno added 2 commits March 12, 2026 09:57

fix(orch): keep the sandbox in the map even after it starts shutting …

cab5fb5

…down to handle ip reverse lookups

chore: add CAS in markingDead for idempotency

9e7dc3d

cursor bot reviewed Mar 12, 2026

packages/orchestrator/internal/sandbox/map.go (resolved)

Base automatically changed from chore/simplify-sandobx-map-cleanup-flow to main March 13, 2026 12:37

Merge branch 'main' into chore/error-finding-sandbox-after-kill

f923bee

jakubno wants to merge 9 commits into main from chore/error-finding-sandbox-after-kill
