fix(sdk): watch handle callback awaiting, request timeouts, and exit semantics#1436
fix(sdk): watch handle callback awaiting, request timeouts, and exit semantics#1436mishushakov wants to merge 8 commits into
Conversation
- JS: await async onEvent/onExit callbacks in WatchHandle so events are handled sequentially and callback errors are routed to onExit - JS: report a clean exit (no error) to onExit when the watch is stopped via stop() instead of a misleading TimeoutError - Python sync: apply request timeout and user auth headers to WatchHandle.get_new_events/stop and accept a request_timeout param - Python async: invoke on_exit whenever the watching ends — with the exception on stream failure, or None on normal stream end and stop() - Tests: unit tests for all three handles, secured-envd watch coverage for JS, fixed async secured-envd test to actually use a secure sandbox Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
🦋 Changeset detectedLatest commit: 0db7048 The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
PR SummaryMedium Risk Overview JS Python async Python sync
Adds unit tests for start-event helpers and watch handles, secured-envd watch coverage (JS + fixed async Python test), and changesets for patch releases. Reviewed by Cursor Bugbot for commit 0db7048. Bugbot is set up for automated code reviews on this repo. Configure here. |
Package ArtifactsBuilt from cc0f4fc. Download artifacts from this workflow run. JS SDK ( npm install ./e2b-2.30.6-mishushakov-fix-watchhandle-callbacks-timeout.0.tgzCLI ( npm install ./e2b-cli-2.12.3-mishushakov-fix-watchhandle-callbacks-timeout.0.tgzPython SDK ( pip install ./e2b-2.29.5+mishushakov.fix.watchhandle.callbacks.timeout-py3-none-any.whl |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bfb9c5bd8f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
- JS: report onEvent callback errors to onExit even when stop() was requested while the callback was in flight — only the stream abort caused by stop() is treated as a clean exit - Python async: stop() now re-raises exceptions raised by on_exit instead of silently swallowing them via asyncio.wait Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
stop() now resolves only after the watching has fully ended and the onExit callback has completed, re-throwing errors raised by onExit. An in-flight onEvent callback is abandoned on stop — the JS equivalent of the cancelled handler task in the Python SDK — which also prevents a deadlock when stop() is awaited from inside onEvent. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Guard against an empty process/watch start stream where `.next().value` is `undefined`. The optional chain was on `.event` rather than the `startEvent` itself, so an exhausted stream slipped past the guard and surfaced as a bare `TypeError: Cannot read properties of undefined` instead of the intended `Expected start event` error. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t stop Promise.race could pick the stop over an onEvent callback that rejected in the same tick, swallowing the error and calling onExit(undefined). Track the callback's settlement so a settled (resolved/rejected) callback always wins over a concurrent stop(); only a still-pending callback is abandoned. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…handle-callbacks-timeout # Conflicts: # packages/python-sdk/e2b/sandbox_async/filesystem/watch_handle.py # packages/python-sdk/e2b/sandbox_sync/filesystem/filesystem.py # packages/python-sdk/e2b/sandbox_sync/filesystem/watch_handle.py
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 4f48d20. Configure here.
stop() awaits handlingEvents, but handleEvents blocks on `await onExit`. Calling (or awaiting) stop() from inside onExit created a cycle: stop() → handlingEvents → onExit → stop(). Track an `inOnExit` flag and skip awaiting handlingEvents on re-entrant calls; onExit already received the exit error as its argument so nothing is lost. The onEvent re-entrant path was already protected by the stoppedPromise race. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f0977f9817
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…dge cases - js: a synchronous onEvent callback that calls stop() before throwing had its error swallowed — the throw reached the outer catch where `stopped` was already true and was converted to a clean onExit(undefined). Capture the callback invocation error at the call site, matching the async path and the Python SDK. - python (async): stop() swallowed CancelledError of its own caller (e.g. asyncio.wait_for(handle.stop(), ...) timing out), reporting a successful stop and interrupting an in-flight on_exit. Shield the watcher task so the caller's cancellation propagates and on_exit is left to finish. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| async stop() { | ||
| this.stopped = true | ||
| this.notifyStopped() | ||
| this.handleStop() | ||
| // When `stop()` is called from inside `onExit`, awaiting `handlingEvents` | ||
| // would deadlock: it cannot settle until `onExit` returns, and `onExit` is | ||
| // blocked awaiting this call. `onExit` already received the exit error as | ||
| // its argument, so returning early loses nothing. | ||
| if (this.inOnExit) { | ||
| return | ||
| } | ||
| await this.handlingEvents |
There was a problem hiding this comment.
🔴 stop() returns early whenever inOnExit is true, even when the call is not re-entrant — so an external await handle.stop() that races with an in-flight async onExit skips await this.handlingEvents, and any error onExit subsequently raises is silently swallowed by the constructor's this.handlingEvents.catch(() => {}). This breaks the documented contract added in this PR (and stated in .changeset/watch-handle-stop-clean-exit.md): "stop() … re-throwing errors raised by onExit". The inOnExit boolean cannot distinguish a true re-entrant call (same async task, where awaiting handlingEvents would deadlock) from an external call from a separate async context (where it would not); a per-call-context token via AsyncLocalStorage would fix this without re-introducing the deadlock.
Extended reasoning...
What is going wrong
The PR documents this contract on stop() (watchHandle.ts:107–110, mirrored verbatim in .changeset/watch-handle-stop-clean-exit.md):
Resolves after the watching has fully ended and the
onExitcallback (if any) has completed; errors thrown byonExitare re-thrown here.
The inOnExit early-return at watchHandle.ts:124-126 was added to prevent the legitimate re-entrant deadlock when stop() is awaited from inside onExit. But inOnExit is a single instance flag — it is true any time handleEvents is parked on await this.onExit?.(error), regardless of which async task is calling stop(). An external caller racing with an in-flight async onExit therefore takes the same early-return branch as a re-entrant caller, skips await this.handlingEvents, and never observes errors raised by onExit. The constructor's unconditional this.handlingEvents.catch(() => {}) on line 102 then silently absorbs those errors so they don't crash the process either.
Step-by-step proof
Consider this realistic usage — the watch ends on its own and onExit does awaited cleanup that may rethrow the exit error:
const handle = await sandbox.files.watchDir(path, () => {}, {
timeoutMs: 100,
onExit: async (err) => {
await someSlowCleanup() // takes ~300ms
if (err) throw err // expected to be re-thrown by stop()
},
})
await new Promise(r => setTimeout(r, 200))
await handle.stop() // expected: waits, re-throws the TimeoutError
// actual: returns immediately, error is lostTrace through watchHandle.ts:
- t ≈ 100ms: the watch times out.
iterateEvents()rejects with the stream-abort error. InhandleEvents, thecatchat line 197-203 fires.this.stoppedis stillfalse, soerror = streamErroris captured. - t ≈ 100ms: line 205 sets
this.inOnExit = true. Line 207 awaitsthis.onExit?.(error). Control yields —someSlowCleanup()is in flight. - t ≈ 200ms: the user code calls
await handle.stop(). Instop()(line 116-128): setsthis.stopped = true, runsnotifyStopped(), runshandleStop(), then evaluatesif (this.inOnExit)— TRUE, so itreturns. - The user's
await handle.stop()resolves at t ≈ 200ms, with no error. - t ≈ 400ms:
someSlowCleanup()completes.onExitre-throws the TimeoutError. Theawait this.onExit?.(error)at line 207 rejects, propagates out ofhandleEvents, and rejectshandlingEvents. - The constructor's
this.handlingEvents.catch(() => {})(line 102) marks the rejection as handled. The error is gone.
The caller never sees the TimeoutError that onExit explicitly rethrew, even though it did await handle.stop(). Worst-class failure mode: no log, no propagation, contract silently violated.
Why existing safeguards don't catch this
- The constructor's
handlingEvents.catch(() => {})is correct for the "no caller is waiting" case (watch ends on its own, nobody callsstop()), but the same chain swallows the rejection when the newinOnExitshortcut detaches the external caller fromhandlingEvents. - The new test
stop can be awaited from inside onExit without deadlocking(watchHandle.test.ts:165-205) covers the legitimate re-entrant case with a synchronousonExit. - The test
stop rethrows onExit error when called after the watch already ended(watchHandle.test.ts:208-229) deliberately insertsawait new Promise(r => setTimeout(r, 0))afteronExitresolves, so by the timestop()runs, thefinallyblock at line 208-210 has already clearedinOnExitback tofalse. The intermediate window —onExitmid-flight, externalstop()called — is not exercised.
Trigger window
Any watch that ends without an explicit stop() (timeout, stream failure, normal stream end) where the user's onExit does awaited work. Trying to clean up in onExit and rethrowing the original exit error is exactly the pattern this PR encourages by documenting that errors propagate from stop(), so the failure mode hits the most realistic usage.
How to fix it
Track re-entrance with AsyncLocalStorage (or a per-invocation token set just around the await this.onExit?.(error) call) instead of a shared boolean. Only skip await this.handlingEvents when the current async context is the one running onExit. External callers from a separate async task then still await handlingEvents and receive the documented error.

Description
Fixes several
WatchHandleissues across the SDKs and aligns the watch lifecycle contract between JS and Python:onEvent/onExitcallbacks are now awaited (sequential event handling, callback errors routed toonExit).stop()reports a clean exit instead of a misleadingTimeoutError, and now resolves only after the watching has fully ended andonExithas completed — re-throwing errors raised byonExit. An in-flightonEventcallback is abandoned on stop (the JS equivalent of Python's cancelled handler task), which also prevents a deadlock whenstop()is awaited from insideonEvent.get_new_events()/stop()now apply the default request timeout and the user auth headers (previously unbounded and sent as the default user) and accept arequest_timeoutparameter.on_exitis now invoked whenever the watching ends — with the exception on stream failure, orNoneon normal stream end and onstop()— andstop()re-raises exceptions fromon_exitinstead of silently swallowing them.The resulting contract is identical in both SDKs: the exit callback fires exactly once however the watching ends (
None/undefinedon normal end or user stop, the error otherwise), andstop()returns only after exit handling has completed.Includes unit tests for all three handles, a secured-envd
watchDirtest for JS, and a fix for the async secured-envd test which wasn't actually using a secure sandbox.Usage
🤖 Generated with Claude Code