fix(python): avoid quadratic stdout/stderr accumulation in command handles #1457

Open
davidzeng-pplx wants to merge 2 commits into e2b-dev:main from davidzeng-pplx:fix/python-command-handle-quadratic-output

@davidzeng-pplx

Problem

AsyncCommandHandle / CommandHandle accumulate streamed output with self._stdout += out per chunk. CPython's in-place string-concatenation optimization only applies when the result is stored straight back into a simple variable such as a local (a STORE_FAST target). Because self._stdout is an instance attribute (STORE_ATTR), the optimization does not apply, so each append re-copies the entire buffer. For commands that emit large volumes of output this becomes O(n²) in total bytes, and in async contexts it stalls the event loop for hundreds of ms per chunk near the tail.
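
To make the STORE_ATTR vs STORE_FAST distinction concrete, the sketch below disassembles two tiny functions and lists the store instruction each `+=` compiles to. `Handle`, `feed_attr`, `feed_local` and `store_ops` are hypothetical names used only for illustration, not SDK code, and exact opcode names can vary slightly between CPython versions.

```python
import dis


class Handle:
    """Hypothetical stand-in for a command handle; not the SDK class."""

    def __init__(self):
        self._stdout = ""

    def feed_attr(self, out):
        # The concatenation result is written back to an instance attribute.
        self._stdout += out


def feed_local(buf, out):
    # The concatenation result is written back to a local variable.
    buf += out


def store_ops(func):
    """Return the names of the STORE_* instructions a function compiles to."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("STORE_")
    ]


attr_stores = store_ops(Handle.feed_attr)
local_stores = store_ops(feed_local)
print("attribute target:", attr_stores)
print("local target:    ", local_stores)
```

The attribute version ends in a STORE_ATTR, so the old string is still referenced by the instance while the new one is built, and CPython has to allocate and copy.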

Fix

Buffer decoded chunks in a list[str] and "".join() them on read. This restores linear-time accumulation and keeps streaming responsive, with no change to the resulting stdout/stderr values or the public API.
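
A minimal sketch of the list-plus-join pattern described here, assuming a hypothetical `ChunkBuffer` helper (the class and method names are illustrative, not the names used in the SDK). Collapsing the list on read is one possible refinement so that repeated reads stay cheap; the PR itself may simply join each time.

```python
class ChunkBuffer:
    """Hypothetical accumulator; names are illustrative, not the SDK's."""

    def __init__(self) -> None:
        self._chunks = []

    def append(self, chunk: str) -> None:
        # list.append is amortized O(1) and never copies earlier chunks.
        self._chunks.append(chunk)

    @property
    def value(self) -> str:
        # Join once on read and keep the joined string as the only chunk,
        # so a later read does not repeat the work.
        if len(self._chunks) > 1:
            self._chunks = ["".join(self._chunks)]
        return self._chunks[0] if self._chunks else ""


buf = ChunkBuffer()
for piece in ("hello, ", "wor", "ld\n"):
    buf.append(piece)

reference = "hello, " + "wor" + "ld\n"  # what += accumulation would produce
print(buf.value == reference)
```

The value seen by callers is the same string that `+=` would have built, which is why the public API does not need to change.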

Notes

  • Applies the same change to both the sync and async command handles.
  • Pure internal change; the incremental UTF-8 decoding behavior is preserved.
  • Changeset included (@e2b/python-sdk, patch).
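
For the incremental UTF-8 note, here is a rough sketch of how chunk buffering can sit behind an incremental decoder. It uses the standard library's `codecs.getincrementaldecoder`; whether the SDK decodes this way is an assumption, and the 3-byte split is only there to force chunk boundaries inside multi-byte characters.

```python
import codecs

# Assumption: raw bytes arrive in arbitrary chunks; the SDK's transport may differ.
decoder = codecs.getincrementaldecoder("utf-8")()
chunks = []

payload = "naïve café ✓\n".encode("utf-8")
# Fixed 3-byte slices cut through the multi-byte characters in the payload.
pieces = [payload[i:i + 3] for i in range(0, len(payload), 3)]

for piece in pieces:
    text = decoder.decode(piece)  # holds back an incomplete trailing sequence
    if text:
        chunks.append(text)

# Flush the decoder; this raises UnicodeDecodeError if the stream ended mid-character.
chunks.append(decoder.decode(b"", final=True))
stdout = "".join(chunks)
print(stdout == payload.decode("utf-8"))
```

Because the decoder only ever hands back complete characters, appending its output to a list and joining later gives the same text as concatenating it chunk by chunk.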

@cla-bot

cla-bot Bot commented Jun 17, 2026

We require contributors to sign our Contributor License Agreement, and we don't have @davidzeng-pplx on file. You can sign our CLA at https://e2b.dev/docs/cla. Once you've signed, post a comment here that says '@cla-bot check'.

@changeset-bot

changeset-bot Bot commented Jun 17, 2026

🦋 Changeset detected

Latest commit: ad209bc

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @e2b/python-sdk | Patch |

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@davidzeng-pplx

Author

@cla-bot check

cla-bot Bot added the cla-signed label Jun 17, 2026
@cla-bot

cla-bot Bot commented Jun 17, 2026

The cla-bot has been summoned, and re-checked this pull request!

@mishushakov

Member

can you provide a reproduction where we can see the issue happening? thanks!

@davidzeng-pplx

Author

@mishushakov Sure! Since the slowdown is purely in the SDK's client-side accumulation (not the sandbox), here's a self-contained reproduction that isolates the exact pattern from command_handle.py. No sandbox needed.

The handles do self._stdout += chunk per chunk. Because _stdout is an instance attribute (STORE_ATTR), CPython's in-place string-concat optimization doesn't kick in: it only applies when the result goes straight back into a simple variable such as a local (STORE_FAST). Each append therefore re-copies the whole buffer, which is O(n²) in total bytes.

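```python
import time


class Buggy:  # current SDK: concatenate onto an instance attribute
    def __init__(self):
        self._stdout = ""

    def feed(self, c):
        self._stdout += c  # builds a new string and copies the buffer each time


class Fixed:  # this PR: buffer chunks in a list
    def __init__(self):
        self._chunks = []

    def feed(self, c):
        self._chunks.append(c)  # amortized O(1) per chunk


def run(cls, n, size=1024):
    """Time feeding n chunks of `size` characters into a fresh handle."""
    h, chunk = cls(), "x" * size
    t = time.perf_counter()
    for _ in range(n):
        h.feed(chunk)
    return time.perf_counter() - t


for n in (2000, 4000, 8000, 16000):
    print(n, f"buggy={run(Buggy, n):.3f}s fixed={run(Fixed, n):.4f}s")
```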

Results on CPython 3.11 (chunks → buggy → fixed):

  • 2,000 (≈1.95 MiB): 0.98s → 0.001s
  • 4,000 (≈3.9 MiB): 3.33s → 0.003s
  • 8,000 (≈7.8 MiB): 15.41s → 0.011s
  • 16,000 (≈15.6 MiB): 72.27s → 0.014s

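Each doubling of the chunk count roughly quadruples the buggy time (4× = 2², the signature of O(n²) growth), while appending to a list stays in the millisecond range. A command streaming ~15 MiB of stdout blocks for over a minute today (and stalls the event loop in async), vs ~14 ms of accumulation with this PR (plus one linear "".join() on read, which the benchmark does not time).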
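
As a quick sanity check on the quadratic claim, the sketch below recomputes the growth ratios from the buggy timings reported above and compares them with an idealized copy-cost model. `chars_copied` is a hypothetical helper that assumes every append copies the whole buffer; real timings also include allocator and cache effects, so the match is only approximate.

```python
import math

# Timings copied from the results reported above (seconds, current += behavior).
buggy = {2000: 0.98, 4000: 3.33, 8000: 15.41, 16000: 72.27}

counts = sorted(buggy)
ratios = []
for small, large in zip(counts, counts[1:]):
    ratio = buggy[large] / buggy[small]
    ratios.append(ratio)
    exponent = math.log2(ratio)  # 1.0 would be linear, 2.0 quadratic
    print(f"{small} -> {large}: x{ratio:.2f} (exponent ~{exponent:.2f})")


def chars_copied(n: int, size: int = 1024) -> int:
    """Idealized model: the k-th append builds a new string of k * size characters."""
    return size * n * (n + 1) // 2


model_ratio = chars_copied(4000) / chars_copied(2000)
print(f"model ratio for a doubling: x{model_ratio:.3f}")
```

The measured ratios land between 3 and 5 per doubling, close to the factor of 4 the model predicts.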
