fix(python): avoid quadratic stdout/stderr accumulation in command handles #1457
davidzeng-pplx wants to merge 2 commits into
We require contributors to sign our Contributor License Agreement, and we don't have @davidzeng-pplx on file. You can sign our CLA at https://e2b.dev/docs/cla . Once you've signed, post a comment here that says '@cla-bot check' |
🦋 Changeset detected. Latest commit: ad209bc. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
|
@cla-bot check |
|
The cla-bot has been summoned, and re-checked this pull request! |
|
can you provide a reproduction where we can see the issue happening? thanks! |
|
@mishushakov Sure! Since the slowdown is purely in the SDK's client-side accumulation (not the sandbox), here's a self-contained reproduction isolating the exact pattern from command_handle.py; no sandbox needed.

The handles do self._stdout += chunk per chunk. Because _stdout is an instance attribute (STORE_ATTR), CPython's in-place string-concat optimization, which only applies to local STORE_FAST targets, doesn't kick in, so each append re-copies the whole buffer → O(n²) in total bytes.

```python
class Buggy:  # current SDK
    ...
class Fixed:  # this PR
    ...
def run(cls, n, size=1024):
    ...
for n in (2000, 4000, 8000, 16000):
    ...
```

Results (chunks, total size, buggy → fixed):
2,000 (1 MiB): 0.98s → 0.001s
4,000 (3 MiB): 3.33s → 0.003s
8,000 (7 MiB): 15.41s → 0.011s
16,000 (15 MiB): 72.27s → 0.014s

Each doubling roughly quadruples the buggy time (4× = 2², classic O(n²)), while list+join stays in the low milliseconds. A command streaming ~15 MiB of stdout blocks for over a minute today (and stalls the event loop in async), vs ~14 ms with this PR. |
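For anyone who wants to try the comparison locally, here is a minimal sketch of the accumulation pattern described in the comment above. It is not the author's exact script: the `feed` and `result` method names, the `"x"` filler chunk, and the smaller chunk counts are assumptions chosen so the run finishes quickly. Timings depend on the machine and CPython version, so none are shown.

```python
import time


class Buggy:
    """Accumulates with += on an instance attribute (the pattern under discussion)."""

    def __init__(self):
        self._stdout = ""

    def feed(self, chunk):
        # The attribute still holds a reference to the old string while the
        # concatenation runs, so a new, larger string is built on every call.
        self._stdout += chunk

    def result(self):
        return self._stdout


class Fixed:
    """Buffers chunks in a list and joins them once on read."""

    def __init__(self):
        self._chunks = []

    def feed(self, chunk):
        self._chunks.append(chunk)  # amortized O(1) per chunk

    def result(self):
        return "".join(self._chunks)  # one linear pass over all chunks


def run(cls, n, size=1024):
    """Feed n chunks of `size` characters; return (elapsed seconds, total length)."""
    chunk = "x" * size  # hypothetical filler data
    handle = cls()
    start = time.perf_counter()
    for _ in range(n):
        handle.feed(chunk)
    out = handle.result()
    return time.perf_counter() - start, len(out)


if __name__ == "__main__":
    # Smaller counts than in the comment above, to keep the slow path short.
    for n in (500, 1000, 2000):
        slow, size_a = run(Buggy, n)
        fast, size_b = run(Fixed, n)
        assert size_a == size_b  # both strategies produce the same amount of output
        print(f"{n:>6,} chunks: {slow:.3f}s -> {fast:.3f}s")
```

Raising the counts toward 16000 should make the gap between the two columns grow in the way the comment reports, though the absolute numbers will differ.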
Problem
AsyncCommandHandle/CommandHandle accumulate streamed output with self._stdout += out per chunk. Because self._stdout is an instance attribute (STORE_ATTR), CPython's in-place string-concatenation optimization, which only applies to local STORE_FAST targets, does not kick in, so each append re-copies the entire buffer. For commands that emit large volumes of output this becomes O(n²) in total bytes, and in async contexts it stalls the event loop for hundreds of ms per chunk near the tail.
Fix
Buffer decoded chunks in a list[str] and "".join() them on read. This restores linear-time accumulation and keeps streaming responsive, with no change to the resulting stdout/stderr values or the public API.
Notes
@e2b/python-sdk, patch).
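
The buffering approach from the Fix section can be sketched in an async setting as follows. This is an illustration under stated assumptions, not the SDK's code: `SketchAsyncHandle`, its `wait` method, the `(stream, text)` event tuples, and `fake_events` are all hypothetical names invented for the example.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


class SketchAsyncHandle:
    """Hypothetical stand-in for an async command handle; not the SDK's class."""

    def __init__(self, events: AsyncIterator[Tuple[str, str]]) -> None:
        self._events = events
        self._stdout_chunks: List[str] = []
        self._stderr_chunks: List[str] = []

    async def wait(self) -> None:
        # Appending to a list is amortized O(1), so the time spent per chunk
        # on the event loop does not grow with the amount already buffered.
        async for stream, text in self._events:
            if stream == "stdout":
                self._stdout_chunks.append(text)
            elif stream == "stderr":
                self._stderr_chunks.append(text)

    @property
    def stdout(self) -> str:
        # Joined on read: one linear pass over everything received so far.
        return "".join(self._stdout_chunks)

    @property
    def stderr(self) -> str:
        return "".join(self._stderr_chunks)


async def fake_events() -> AsyncIterator[Tuple[str, str]]:
    """Hypothetical event source standing in for streamed command output."""
    for item in [("stdout", "hello "), ("stderr", "warn\n"), ("stdout", "world\n")]:
        yield item
        await asyncio.sleep(0)  # hand control back to the event loop between chunks


async def demo() -> Tuple[str, str]:
    handle = SketchAsyncHandle(fake_events())
    await handle.wait()
    return handle.stdout, handle.stderr


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the properties return plain strings, callers reading stdout or stderr see the same values as with direct concatenation. The trade-off is that each read pays for a join, which is linear in the buffered size.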