
Stream volume file uploads and downloads instead of buffering in memory#1453

Merged
mishushakov merged 27 commits into main from mishushakov/stream-write-volumes on Jun 22, 2026


@mishushakov

Member

Follow-up to #1433. Builds on the shared streaming infrastructure introduced there (FILE_TIMEOUT_MS, request-controller/stream-cleanup helpers in connectionConfig, io_utils chunk iterators, the runtime guard) and applies the same streaming model to volumes.

Note

Based on mishushakov/stream-write-file-upload (#1433). Merge that PR first; this PR's diff will then retarget to main automatically.

What changed

  • Volume.writeFile() / Volume.write_file() — stream the request body instead of buffering it in memory.
    • JS: ReadableStream data is streamed outside the browser (half-duplex); browsers still buffer since they can't stream request bodies.
    • Python: file-like objects are streamed in chunks (async wraps them in an async iterator; sync passes them to httpx directly, text-mode IO is encoded chunk-by-chunk).
  • Volume.readFile(format="stream") / read_file(format="stream") — the request timeout now bounds only the initial handshake, not the body read, matching the sandbox files.read stream path. A dropped connection during the handshake surfaces the same typed, health-checked error; JS supports signal to cancel an in-flight stream and cancels unconsumed bodies on error so the pooled connection is released.
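The chunked upload described for Python file-like objects can be pictured roughly as follows. This is a minimal sketch, not the SDK's io_utils code: the name `iter_io_chunks` and the 64 KiB chunk size are assumptions made for illustration. It shows how a text-mode object can be encoded piece by piece so the whole body is never held in memory.

```python
import io
from typing import IO, Iterator

CHUNK_SIZE = 64 * 1024  # assumed chunk size, not taken from the SDK


def iter_io_chunks(source: IO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the contents of a file-like object as byte chunks (hypothetical helper).

    Text-mode objects return str from read(); each piece is encoded to UTF-8
    on its own, so only one chunk is in memory at a time.
    """
    while True:
        piece = source.read(chunk_size)
        if not piece:  # '' or b'' signals end of input
            return
        yield piece.encode("utf-8") if isinstance(piece, str) else piece


# Binary input is passed through in chunk_size pieces.
binary_chunks = list(iter_io_chunks(io.BytesIO(b"a" * 150_000)))

# Text input is read 4 characters at a time and encoded per piece.
text_chunks = list(iter_io_chunks(io.StringIO("héllo wörld"), chunk_size=4))
```

An iterator like this can be handed to an HTTP client that accepts an iterable request body, which is the general shape the description attributes to the sync path.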

Usage

JS — stream a file straight to a volume without buffering:

import { createReadStream } from 'node:fs'
import { Readable } from 'node:stream'

const stream = Readable.toWeb(createReadStream('large-input.bin'))
await volume.writeFile('/data/large-input.bin', stream)

// read back as a stream; the body lives until consumed/cancelled
const out = await volume.readFile('/data/large-input.bin', { format: 'stream' })
for await (const chunk of out) {
  // process chunk
}

Python — stream a file-like object:

with open("large-input.bin", "rb") as f:
    volume.write_file("/data/large-input.bin", f)  # streamed, not read() into memory

for chunk in volume.read_file("/data/large-input.bin", format="stream"):
    ...  # process chunk

Testing

  • pnpm run format, pnpm run lint, pnpm run typecheck pass.
  • Added volume streaming tests (JS tests/volume/file.test.ts; Python sync/async test_file.py text-stream cases).

🤖 Generated with Claude Code

mishushakov and others added 13 commits June 12, 2026 21:17
…memory

- Volume.writeFile/write_file: stream ReadableStream (JS, non-browser) and
  file-like objects (Python) to the API instead of buffering them in memory
- Sandbox.files.write with octet-stream upload: stream ReadableStream data
  (JS, non-browser) and file-like objects (Python), with chunked gzip
  compression
- Python Sandbox.files.read(format="stream"): stream the response body
  instead of downloading it into memory before iterating (sync and async)
- JS Sandbox.files.read({ format: 'stream' }): bound only the initial
  handshake by the request timeout instead of killing an actively-consumed
  stream; the user signal can still cancel it mid-stream

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Empty files short-circuit response parsing (Content-Length: 0), so
Sandbox.files.read() with format 'blob' returned '' and Volume.readFile()
with 'blob'/'stream' returned undefined. Return an empty Blob or
ReadableStream matching the requested format instead.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- js-sdk: handle empty files explicitly in the bytes path of read()
  instead of relying on new Uint8Array(undefined) coercion
- python-sdk: bound the request timeout to the initial handshake for
  read(format="stream") in both sync and async implementations, matching
  the JS SDK behavior; document the semantics in the stream overloads

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ite-file-upload

# Conflicts:
#	packages/js-sdk/src/sandbox/filesystem/index.ts
#	packages/python-sdk/e2b/sandbox_async/filesystem/filesystem.py
#	packages/python-sdk/e2b/sandbox_sync/filesystem/filesystem.py
read(format="stream") returned a bare (async) generator whose finally
only closed the response if iteration had begun. A reader that was
created but never consumed (or never started) held its pooled connection
open until the client was closed, leaking connections.

Wrap the streamed response in FileStreamReader / AsyncFileStreamReader,
which:
- release the connection when the stream is fully consumed or errors,
- expose deterministic cleanup via close()/aclose() and (async) context
  manager support,
- register a weakref.finalize safety net so an abandoned reader releases
  its connection on garbage collection (the async variant schedules
  aclose() on the running loop).

Both remain Iterator[bytes] / AsyncIterator[bytes], so existing usage is
unchanged. Adds credential-free unit tests covering consume/context
manager/close/GC, plus live-sandbox tests for the context manager and
partial-then-close paths.
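
The reader contract in this commit message (release on full consume or error, explicit close, context manager, and a weakref.finalize safety net) could look roughly like the sketch below. It is an illustration only: `StreamReaderSketch` and its `release` callback are hypothetical stand-ins, and the real FileStreamReader wraps an HTTP response rather than a plain iterator.

```python
import weakref
from typing import Callable, Iterator


class StreamReaderSketch:
    """Iterator over byte chunks that releases its connection exactly once."""

    def __init__(self, chunks: Iterator[bytes], release: Callable[[], None]) -> None:
        self._chunks = chunks
        # weakref.finalize runs `release` at most once: when close() calls it,
        # or when the reader is garbage collected without having been closed.
        # `release` must not hold a reference to the reader itself.
        self._finalizer = weakref.finalize(self, release)

    def __iter__(self) -> "StreamReaderSketch":
        return self

    def __next__(self) -> bytes:
        try:
            return next(self._chunks)
        except BaseException:
            # StopIteration (fully consumed) and read errors both release.
            self.close()
            raise

    def close(self) -> None:
        self._finalizer()  # no-op if it has already run

    def __enter__(self) -> "StreamReaderSketch":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()


released = []

# Partial read: the context manager releases on block exit.
with StreamReaderSketch(iter([b"ab", b"cd"]), lambda: released.append("ctx")) as reader:
    first = next(reader)

# Full consumption releases automatically; the later close() does nothing.
full = StreamReaderSketch(iter([b"x"]), lambda: released.append("full"))
data = b"".join(full)
full.close()
```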

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Match the Python SDK's connection lifecycle for read(format='stream'):
- explicitly cancel the unconsumed error body before propagating, instead
  of relying solely on the abort controller (parity with r.close())
- add a FinalizationRegistry safety net so an abandoned stream releases its
  connection on GC, mirroring Python's weakref.finalize on FileStreamReader

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eader

The async stream reader's garbage-collection safety net
(_schedule_response_aclose via weakref.finalize) was best-effort at best:
loop.create_task(aclose()) from a finalizer is not thread-safe, has no
guarantee of running before loop teardown, and is useless once the loop is
gone. Remove it and rely on the cleanup that actually works—auto-close on
full consume / read error, aclose(), and the async context manager.

The sync FileStreamReader keeps its weakref.finalize(response.close) net,
which is reliable because close() is synchronous.

Document that an abandoned async stream holds its pooled connection until
the client is closed, and update the unit test accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-ups on the stream upload/download work, applying the
established stream policy consistently and addressing review findings:

- volumes: drop the client read timeout on `read_file`/`readFile` streams
  (Python `httpx.Timeout(..., read=None)`, JS handshake-bounded controller
  + `wrapStreamWithConnectionCleanup`), matching the sandbox files stream
  path and the RPC streams. The request timeout now bounds only the
  handshake, not body consumption.
- JS sandbox streaming uploads: use the file-transfer timeout (1h) instead
  of the 60s request default so large streamed uploads aren't aborted
  mid-transfer; buffered uploads keep the short default. Centralize
  `FILE_TIMEOUT_MS` in connectionConfig and reuse it from volume.
- JS: factor the stream cleanup + GC-finalizer logic into a shared
  `wrapStreamWithConnectionCleanup` used by both sandbox files and volumes.
- stream handshake error mapping (Bugbot): map dropped connections during
  the stream handshake to typed, health-checked errors — JS via
  `handleEnvdApiFetchError`, Python via the `httpx.RemoteProtocolError`
  wrapper — mirroring the non-stream read paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Streaming an upload only happens on the octet-stream path; the multipart
path buffers (`toBlob` in JS, `.read()` for text file-likes in Python), so
with the old `useOctetStream`/`use_octet_stream` default of false a streamed
write was silently buffered into memory.

Default the flag to auto-detect instead: use octet-stream when any write
entry is streamable (JS `ReadableStream`; Python file-like / non-str-bytes),
and `multipart/form-data` otherwise. Browsers stay on multipart since they
can't stream request bodies. An explicit flag value still wins, gzip still
implies octet-stream, and the old-envd fallback is preserved.
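
A rough sketch of the auto-detect rule described above, for the Python side only. The function names are hypothetical, the browser and old-envd fallbacks are left out, and the precedence of gzip over an explicit `False` is an assumption, since the commit message does not spell out that corner.

```python
import io
from typing import Iterable, Optional


def is_streamable(data: object) -> bool:
    """Treat anything that is not str or bytes as a streamable file-like."""
    return not isinstance(data, (str, bytes))


def resolve_use_octet_stream(
    entries: Iterable[object],
    use_octet_stream: Optional[bool] = None,
    gzip: bool = False,
) -> bool:
    if gzip:
        return True  # gzip implies octet-stream (assumed to take precedence)
    if use_octet_stream is not None:
        return use_octet_stream  # an explicit flag value wins over detection
    # Auto-detect: one streamable entry is enough to pick octet-stream.
    return any(is_streamable(entry) for entry in entries)


buffered_only = resolve_use_octet_stream(["text", b"bytes"])
with_file_like = resolve_use_octet_stream(["text", io.BytesIO(b"x")])
explicit_off = resolve_use_octet_stream([io.BytesIO(b"x")], use_octet_stream=False)
gzip_on = resolve_use_octet_stream(["text"], gzip=True)
```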

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Streamed (file-like) sandbox writes used the 60s request timeout for the
write phase, so a large or slow streamed upload could trip WriteTimeout
while the body was still being sent — inconsistent with the JS SDK (1h)
and Python volume writes (1h).

Relax the write timeout to FILE_TIMEOUT (1h) when any write entry is
streamable, keeping connection setup and the response read bounded by the
request timeout. Buffered str/bytes uploads keep the request timeout.
FILE_TIMEOUT is shared via e2b/connection_config.py, mirroring the JS
SDK's FILE_TIMEOUT_MS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Revert the volume read/write streaming changes so this PR is scoped to the
sandbox files streaming work. The volume changes land in a follow-up PR that
builds on the shared streaming infrastructure introduced here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Stream `Volume.writeFile`/`write_file` request bodies and bound only the
`Volume.readFile`/`read_file` stream handshake with the request timeout,
reusing the shared streaming infrastructure from the sandbox files PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector


Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@changeset-bot

changeset-bot Bot commented Jun 17, 2026


🦋 Changeset detected

Latest commit: 90ae65f

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| --- | --- |
| e2b | Patch |
| @e2b/python-sdk | Patch |


@cursor

cursor Bot commented Jun 17, 2026


PR Summary

Medium Risk
Streaming volume I/O changes timeout and connection behavior for large transfers; duplicate tests are a hygiene issue only.

Overview
Volume file I/O in JS and Python now streams uploads and downloads with handshake-only timeouts, idle read bounds, and renamed option types with deprecated aliases.

packages/js-sdk/tests/sandbox/files/read.test.ts accidentally duplicates three sandboxTest blocks (read file as stream, read non-existing file as stream, read empty file in all formats); the same cases already exist earlier in the file and should not be added twice.

Reviewed by Cursor Bugbot for commit 90ae65f. Bugbot is set up for automated code reviews on this repo.

@github-actions

github-actions Bot commented Jun 17, 2026

Contributor

Package Artifacts

Built from 859bb7b. Download artifacts from this workflow run.

JS SDK (e2b@2.30.5-mishushakov-stream-write-volumes.0):

npm install ./e2b-2.30.5-mishushakov-stream-write-volumes.0.tgz

CLI (@e2b/cli@2.12.3-mishushakov-stream-write-volumes.0):

npm install ./e2b-cli-2.12.3-mishushakov-stream-write-volumes.0.tgz

Python SDK (e2b==2.29.4+mishushakov-stream-write-volumes):

pip install ./e2b-2.29.4+mishushakov.stream.write.volumes-py3-none-any.whl

@mishushakov mishushakov marked this pull request as draft June 17, 2026 15:51
@mishushakov mishushakov marked this pull request as ready for review June 18, 2026 17:02
@chatgpt-codex-connector


Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

…ite-volumes

# Conflicts:
#	packages/js-sdk/src/connectionConfig.ts
…ite-file-upload-v1

# Conflicts:
#	packages/js-sdk/src/connectionConfig.ts
Comment thread packages/python-sdk/e2b/api/__init__.py Outdated
mishushakov and others added 3 commits June 18, 2026 19:09
A fully consumed stream returns its connection to the pool, where it can
linger as an idle keep-alive entry until the server-side close is observed.
Asserting on total pool size therefore flaked under load
(test_sync_full_consume_releases_connection saw 1 instead of 0 on CI).
Count only checked-out (non-idle) connections, which is what the helper name promises
and is the actual leak condition.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The FinalizationRegistry safety net for `read({ format: 'stream' })` only ran
`cleanup()` (aborting the handshake AbortController), unlike the cancel and
error paths which explicitly cancel the response body to release the pooled
envd connection. Abandoned streams could leave connections checked out until
the client was torn down. Mirror the cancel/error paths (and the Python sync
finalizer's `response.close`) by cancelling the body reader before cleanup.

Adds unit tests for wrapStreamWithConnectionCleanup, including a GC-abandonment
test (needs --expose-gc, enabled for the connectionConfig vitest project).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drop the FinalizationRegistry (JS) and weakref.finalize (sync Python) safety
nets on streamed reads in favor of a deterministic idle-read timeout that
reclaims a stalled stream's pooled connection. Python maps it to httpx's
per-chunk read timeout; JS arms a per-chunk timer that aborts the request
controller. Configurable via streamIdleTimeoutMs / stream_idle_timeout
(default 60s, 0/None disables), and the consume/close contract is now
documented consistently across all three readers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mishushakov and others added 5 commits June 18, 2026 21:02
Bound a stalled streamed transfer with a per-chunk idle timeout (default the
request timeout, configurable via streamIdleTimeoutMs / stream_idle_timeout,
0/None disables) on both reads and writes, so a producer or consumer that stops
making progress no longer holds the pooled connection. Reads map it to httpx's
per-chunk read timeout / a JS idle-abort wrapper; writes use httpx's per-write
timeout / a JS upload idle-abort wrapper.

The total-transfer cap is intentionally left to the server (envd): a client
cap is advisory, can't protect against non-conforming clients, and would mean
maintaining the same ceiling across three SDKs. The pre-existing client-side
1h upload total is removed for consistency with reads.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
aiter_io_chunks / agzip_iter were async generators doing synchronous file
reads and zlib compression inline, stalling the asyncio event loop for the
duration of those operations on large AsyncSandbox uploads. Offload both to a
worker thread via asyncio.to_thread.
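
The offloading described here can be sketched as below. This is an illustration under assumptions, not the SDK's `aiter_io_chunks`: the name `aiter_chunks_in_thread` is hypothetical and gzip compression is omitted. Each blocking `read()` runs in a worker thread through `asyncio.to_thread` (Python 3.9+), so the event loop is free while the read is in progress.

```python
import asyncio
import io
from typing import AsyncIterator, BinaryIO


async def aiter_chunks_in_thread(
    source: BinaryIO, chunk_size: int = 64 * 1024
) -> AsyncIterator[bytes]:
    """Yield chunks from a blocking file object without blocking the loop."""
    while True:
        # The blocking read() runs in a worker thread; other tasks on the
        # event loop keep running while it waits.
        chunk = await asyncio.to_thread(source.read, chunk_size)
        if not chunk:
            return
        yield chunk


async def collect(source: BinaryIO, chunk_size: int) -> list:
    return [chunk async for chunk in aiter_chunks_in_thread(source, chunk_size)]


chunks = asyncio.run(collect(io.BytesIO(b"0123456789"), 4))
```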

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Arm the JS read idle timer only around the network read and clear it the moment
a chunk arrives, so a slow or paused consumer no longer trips it; it fires only
when the server stops sending mid-stream (a held-but-unread stream is reclaimed
server-side). Matches Python's httpx read timeout, which only counts during
socket reads.
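
The idle-read semantics can be illustrated with a small asyncio sketch. It is neither the JS timer nor httpx's read timeout, only a stand-in with the same behavior: the timer covers the wait for the next chunk and nothing else, so a slow consumer never trips it while a stalled producer does. `with_idle_timeout`, `producer`, and `demo` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Tuple


async def with_idle_timeout(
    source: AsyncIterator[bytes], idle_timeout: Optional[float]
) -> AsyncIterator[bytes]:
    """Re-yield chunks, bounding only the wait for each next chunk."""
    iterator = source.__aiter__()
    while True:
        try:
            # The timer covers just this await; None disables it.
            chunk = await asyncio.wait_for(iterator.__anext__(), idle_timeout)
        except StopAsyncIteration:
            return
        # Time the consumer spends between iterations is not counted.
        yield chunk


async def producer(delays: List[float]) -> AsyncIterator[bytes]:
    for index, delay in enumerate(delays):
        await asyncio.sleep(delay)  # simulated wait on the network
        yield bytes([index])


async def demo() -> Tuple[List[bytes], bool]:
    # Slow consumer: processing takes longer than the idle timeout.
    received = []
    async for chunk in with_idle_timeout(producer([0, 0, 0]), 0.05):
        await asyncio.sleep(0.1)
        received.append(chunk)

    # Stalled producer: the second chunk takes far longer than the timeout.
    stalled = False
    try:
        async for _ in with_idle_timeout(producer([0, 1.0]), 0.05):
            pass
    except asyncio.TimeoutError:
        stalled = True
    return received, stalled


received, stalled = asyncio.run(demo())
```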

Drop the JS upload idle wrapper: it bounded producer latency (local), not the
upload wire (not observable through fetch). Stalled uploads are bounded
server-side or via the caller's signal; Python keeps its per-write httpx
timeout, which does bound the wire.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After the filesystem streaming revert, FILE_TIMEOUT is used only by the volume
client; sandbox filesystem streaming bounds each chunk by the request timeout
and leaves the total to the server. Reword the comment so it no longer reads as
a general streaming-transfer timeout (addresses a Bugbot review note).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…imeout

The per-write httpx timeout on streamed uploads guards a stuck socket
write (server stops reading); it can't observe the opposite direction.
Record that envd >= 0.6.7's per-read idle timeout backstops that case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread packages/js-sdk/src/volume/index.ts
Stack the volume streaming work on top of file-upload's refactored shared
infrastructure instead of the pre-refactor snapshot it was cut from.

- connectionConfig: replace the GC FinalizationRegistry safety net with the
  idle-read-timeout `wrapStreamWithConnectionCleanup`.
- volume.readFile(stream): pass `controller` + `idleTimeoutMs` and expose a
  `streamIdleTimeoutMs` option (JS) / `stream_idle_timeout` (Python),
  defaulting to the request timeout.
- volume.writeFile: drop the client-side request timeout for streamed uploads
  (bounded server-side / via signal), matching the sandbox upload path.
- volume/client: define FILE_TIMEOUT_MS locally now that connectionConfig no
  longer exports it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mishushakov mishushakov force-pushed the mishushakov/stream-write-volumes branch from dc59e62 to 24f99d2 Compare June 19, 2026 14:39
Comment thread packages/js-sdk/src/volume/index.ts Outdated
Comment thread packages/js-sdk/src/volume/index.ts Outdated
Introduces a named VolumeReadOpts type for the readFile stream options
(replacing the inline `{ streamIdleTimeoutMs?: number }` literal) and
renames VolumeWriteOptions -> VolumeWriteOpts and VolumeMetadataOptions
-> VolumeMetadataOpts, matching the sandbox filesystem `*Opts`
convention (FilesystemReadOpts / FilesystemWriteOpts).

The old VolumeWriteOptions and VolumeMetadataOptions names are kept as
deprecated type aliases, so this is non-breaking.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mishushakov mishushakov force-pushed the mishushakov/stream-write-volumes branch from 44eebd6 to 11254d8 Compare June 19, 2026 15:37

@matthewlouisbrockman matthewlouisbrockman left a comment

Contributor


hmm, don't think there's timeouts on this is there?

mishushakov added a commit that referenced this pull request Jun 22, 2026
…1433)

## Description

Removes full in-memory buffering from the SDK **sandbox** file-transfer
paths, in both JS and Python (sync + async).

**Streamed uploads** — `Sandbox.files.write` / `write_files` streams
`ReadableStream` (JS, outside the browser) and file-like (Python) input
to the sandbox with chunk-by-chunk gzip compression, instead of
buffering the whole body in memory. `useOctetStream`/`use_octet_stream`
now defaults to auto-detect — octet-stream when any entry is streamable
(so streamed uploads aren't silently buffered), `multipart/form-data`
otherwise; browsers always use `multipart/form-data` since streaming
request bodies aren't supported there. A streamed upload is bounded by a
per-chunk timeout on the wire (Python's per-write `httpx` timeout,
default the request timeout); a stalled upload the wire can't observe is
bounded server-side. On Python's `AsyncSandbox`, the blocking file reads
and gzip compression of a streamed upload now run in a worker thread so
a large upload doesn't stall the event loop.

**Streamed downloads** — `Sandbox.files.read(format="stream")` now
streams the response body from the sandbox instead of downloading it
into memory before iterating (Python sync + async), and the 60s request
timeout no longer kills the stream while it's being consumed:
- The request timeout now bounds only the initial handshake.
- The body is bounded by a per-chunk **idle-read timeout** on the wire —
a per-`read()` option (`streamIdleTimeoutMs` in JS,
`stream_idle_timeout` in Python; default the request timeout — 60s —
`0`/`None` to disable). It's armed only while waiting on a network read
and cleared the moment a chunk arrives, so it aborts only when the
server stops sending mid-stream; a slow or paused consumer never trips
it (a held-but-unread stream is reclaimed server-side, not by this
timer).
- A dropped connection during the handshake surfaces the same typed,
health-checked error as non-stream reads. In JS, `signal` can still
cancel an in-flight stream.
- The stream holds its pooled connection until it is consumed to the
end, cancelled/closed, errors, or the idle timeout fires — consume it
fully, use the context manager, or close it. (This replaces the earlier
GC-finalizer net.) Python returns a
`FileStreamReader`/`AsyncFileStreamReader` supporting deterministic
cleanup via `close()`/`aclose()` and (async) context-manager use; both
still satisfy `Iterator[bytes]`/`AsyncIterator[bytes]`, so existing
iteration is unchanged.

**Empty files** — JS `Sandbox.files.read()` with `blob` or `stream`
format now returns a format-correct empty value (empty `Blob` / empty
`ReadableStream`) for empty files instead of `""`.

> [!NOTE]
> The equivalent **volume** streaming changes
(`Volume.writeFile`/`write_file`, `Volume.readFile`/`read_file` streams)
live in a follow-up PR, #1453, which is based on this branch.

## Usage

```ts
import { createReadStream } from 'node:fs'
import { Readable } from 'node:stream'

// JS: upload a large file without holding it in memory
const file = createReadStream('large.bin')
await sandbox.files.write('large.bin', Readable.toWeb(file), { gzip: true })

// JS: consume a download for longer than 60s without it being killed
const stream = await sandbox.files.read('large.bin', { format: 'stream' })
for await (const chunk of stream) { /* ... */ }

// JS: tune (or disable) the per-chunk idle-read timeout for a read
// (named differently from `stream` above so both declarations can share a scope)
const tunedStream = await sandbox.files.read('large.bin', {
  format: 'stream',
  streamIdleTimeoutMs: 120_000, // 0 to disable
})

// JS: empty files now return format-correct empty values
const blob = await sandbox.files.read('empty.txt', { format: 'blob' }) // Blob (size 0), not ''
```

```python
# Python: streamed upload and download
with open("large.bin", "rb") as f:
    sandbox.files.write("large.bin", f, gzip=True)

for chunk in sandbox.files.read("large.bin", format="stream"):
    ...

# Python: deterministic cleanup when not reading the stream to the end
with sandbox.files.read("large.bin", format="stream") as stream:
    first_chunk = next(iter(stream))  # connection released on block exit

# Python: tune (or disable) the per-chunk idle-read timeout for a read
for chunk in sandbox.files.read(
    "large.bin", format="stream", stream_idle_timeout=120.0  # None to disable
):
    ...
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Base automatically changed from mishushakov/stream-write-file-upload to main June 22, 2026 17:45

@cursor cursor Bot left a comment



Cursor Bugbot has reviewed your changes and found 2 potential issues.

Reviewed by Cursor Bugbot for commit b9039a2.

Comment thread packages/js-sdk/tests/sandbox/files/read.test.ts
Comment thread packages/python-sdk/e2b/connection_config.py Outdated
@mishushakov mishushakov enabled auto-merge (squash) June 22, 2026 17:48
@mishushakov mishushakov disabled auto-merge June 22, 2026 17:48
The constant was dead code: every consumer imports FILE_TIMEOUT from
e2b.volume.connection_config, which has its own copy. Keeping a second
definition risked the two drifting apart. Move the explanatory comment
to the volume constant that's actually used.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mishushakov mishushakov enabled auto-merge (squash) June 22, 2026 19:03
@mishushakov mishushakov merged commit c1415f3 into main Jun 22, 2026
29 checks passed
@mishushakov mishushakov deleted the mishushakov/stream-write-volumes branch June 22, 2026 19:11