Skip to content

Add File / FileReader polyfill#169

Open
bkaradzic-microsoft wants to merge 7 commits into
BabylonJS:mainfrom
bkaradzic-microsoft:weekend/tpc-1582-file-polyfill
Open

Add File / FileReader polyfill#169
bkaradzic-microsoft wants to merge 7 commits into
BabylonJS:mainfrom
bkaradzic-microsoft:weekend/tpc-1582-file-polyfill

Conversation

@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor

The File polyfill in BabylonNative currently lives as a JS shim
(Apps/Playground/Scripts/file_polyfill.js, ~308 lines) that is
LoadScript-ed from the Playground's AppContext (see
BabylonJS/BabylonNative#1706). Per @bghgary's review on that PR, it
should live in JsRuntimeHost alongside Blob, URL, WebSocket,
XMLHttpRequest, TextDecoder, AbortController, etc., so every
JsRuntimeHost consumer -- not only the BabylonNative Playground --
gets File and FileReader.

This PR adds a native C++ Polyfills/File/ target that implements
the WHATWG File and FileReader web APIs, and 28 new Mocha unit
tests (13 File + 15 FileReader) under Tests/UnitTests/Scripts/tests.ts.

Surface area

  • File(parts, name, options) constructor.
  • Instance accessors: size, type, name, lastModified.
  • Instance methods: arrayBuffer(), text(), bytes().
  • FileReader with state constants EMPTY/LOADING/DONE
    (both on the constructor and on instances, per WHATWG).
  • FileReader events: loadstart, load, loadend,
    progress, error, abort via both onX handler slots and
    addEventListener / removeEventListener.
  • readAsText / readAsArrayBuffer / readAsDataURL /
    readAsBinaryString / abort.

Implementation notes

  • File delegates its byte storage to the existing Blob polyfill via
    env.Global().Get("Blob") so we reuse Blob's BlobPart handling
    (ArrayBuffer, typed array, string, Blob).
  • FileReader's base64 encoder for readAsDataURL is an inlined
    RFC 4648 implementation -- JsRuntimeHost has no base-n dependency.
  • The new JSRUNTIMEHOST_POLYFILL_FILE CMake option is on by default
    and gates building the target.

Notable JSC-specific care

Two pieces of the implementation are shaped specifically to work
correctly on JavaScriptCore:

  1. FileReader.EMPTY/LOADING/DONE are registered via
    StaticValue and InstanceValue descriptors inside the
    DefineClass property list, not via
    func.Get("prototype").Set(...) after class creation.
    On JSC the
    napi shim's ConstructorInfo::Create defaults the constructor's
    .prototype to Object.prototype, so the latter pattern would
    pollute Object.prototype with EMPTY/LOADING/DONE keys and
    break for..in over plain objects throughout the runtime. The
    Tests/UnitTests/Scripts/tests.ts regression test
    "does not pollute Object.prototype with EMPTY/LOADING/DONE" pins
    this behaviour.

  2. The Blob-dependency guard in File::Initialize uses
    IsUndefined() rather than IsFunction().
    Some JSC builds
    (notably libjavascriptcoregtk-4.1 on Linux) classify constructors
    created via JSObjectMakeConstructor as typeof 'object', not
    'function', so napi_typeof returns napi_object for them
    and an IsFunction() guard would silently early-return on those
    engines. IsUndefined() matches the guard used by Blob and works
    on V8, JSI, Chakra, iOS-JSC, Android-JSC, and Linux JSC.

Testing

Locally: UnitTests.exe on Win32 V8 Release passes 167/167 tests
(139 pre-existing + 28 new), including the Object.prototype-pollution
regression canary.

End-to-end with BabylonNative: with this PR pinned in
BabylonJS/BabylonNative#1706, BabylonNative CI passes 26/26 jobs
(all platforms including Linux Clang/GCC JSC), and the 19 GLTF/OBJ
playground tests that previously depended on the JS shim are
re-enabled and passing.

After this merges

BabylonJS/BabylonNative#1706 re-pins CMakeLists.txt's
GIT_REPOSITORY/GIT_TAG to the merged SHA and then merges.

Copilot AI review requested due to automatic review settings May 26, 2026 23:32
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR moves File / FileReader support from a BabylonNative Playground-only JS shim into JsRuntimeHost as a native C++ polyfill, so all JsRuntimeHost consumers get the WHATWG-style APIs alongside existing polyfills (Blob, URL, XHR, etc.).

Changes:

  • Add a new Polyfills/File/ C++ target implementing File and FileReader (including event handler slots + EventTarget-style listeners).
  • Wire the polyfill into the build via a new JSRUNTIMEHOST_POLYFILL_FILE CMake option (ON by default) and link it into unit tests.
  • Add new Mocha unit tests covering File and FileReader behaviors (construction, reads, events, abort, and a JSC regression around Object.prototype pollution).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
CMakeLists.txt Adds JSRUNTIMEHOST_POLYFILL_FILE option (ON by default).
Polyfills/CMakeLists.txt Conditionally includes the new File polyfill subdirectory.
Polyfills/File/CMakeLists.txt Defines the File polyfill library target and its sources.
Polyfills/File/Include/Babylon/Polyfills/File.h Public API entrypoint for initializing the polyfill.
Polyfills/File/Readme.md Documents behavior and prerequisites (Blob must be initialized first).
Polyfills/File/Source/File.h Declares the internal File ObjectWrap implementation.
Polyfills/File/Source/File.cpp Implements File over the existing Blob polyfill and initializes FileReader.
Polyfills/File/Source/FileReader.h Declares the internal FileReader ObjectWrap implementation.
Polyfills/File/Source/FileReader.cpp Implements async reads, base64 DataURL generation, events, and abort logic.
Tests/UnitTests/CMakeLists.txt Links the new File target into the UnitTests executable.
Tests/UnitTests/Shared/Shared.cpp Initializes the File polyfill in the unit test JS environment.
Tests/UnitTests/Scripts/tests.ts Adds new Mocha tests for File and FileReader, including a JSC regression canary.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Polyfills/File/Source/FileReader.cpp Outdated
Comment thread Polyfills/File/Source/FileReader.cpp
Comment thread Polyfills/File/Source/FileReader.cpp
Comment thread Polyfills/File/Source/File.cpp Outdated
Comment thread Tests/UnitTests/Scripts/tests.ts Outdated
@bkaradzic-microsoft bkaradzic-microsoft force-pushed the weekend/tpc-1582-file-polyfill branch from 6f8a827 to 67f9f5c Compare May 27, 2026 00:00
@bkaradzic-microsoft bkaradzic-microsoft force-pushed the weekend/tpc-1582-file-polyfill branch 7 times, most recently from 9a3e2f3 to fa84e76 Compare May 27, 2026 04:11
bkaradzic and others added 2 commits June 1, 2026 14:41
Implements the WHATWG File and FileReader web APIs as a native
polyfill, alongside the existing Blob polyfill that File delegates to
for byte storage.

* `Polyfills/File/` -- new polyfill target, gated on a new
  `JSRUNTIMEHOST_POLYFILL_FILE` CMake option (ON by default).
* `Tests/UnitTests/Scripts/tests.ts` -- 28 new Mocha tests
  (13 File + 15 FileReader), including a regression canary for
  Object.prototype pollution by `FileReader.EMPTY/LOADING/DONE`.

Notable implementation details
------------------------------
* FileReader registers its `EMPTY/LOADING/DONE` constants via
  `StaticValue` and `InstanceValue` descriptors inside the
  `DefineClass` property list. On JavaScriptCore the napi shim's
  `ConstructorInfo::Create` defaults the constructor's `.prototype`
  to `Object.prototype`, so setting these via
  `func.Get("prototype").Set(...)` would pollute every plain object
  in the runtime.
* The Blob-dependency guard in `File::Initialize` uses `IsUndefined()`
  rather than `IsFunction()`. Some JSC builds (notably
  `libjavascriptcoregtk` on Linux) classify constructors created via
  `JSObjectMakeConstructor` as `typeof 'object'`, not `'function'`,
  so `napi_typeof` returns `napi_object` for them and an
  `IsFunction()` guard would silently early-return on those engines.
  `IsUndefined()` matches the guard used by Blob and works on every
  engine we ship.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…on Chakra

ThrowAsJavaScriptException avoids the Chakra heap-corruption on
throw-from-constructor, but JSI's napi shim implements
`Error::ThrowAsJavaScriptException` as a no-op that only stores the
error in `env->last_exception` without raising it. The script then
continues normally with an incompletely-initialized File wrapper, the
`expect(...).to.throw()` matcher sees no JS error, and the unhandled
exception is later reported through AppRuntime's
UnhandledExceptionHandler, failing the test harness exit code.

Use the JRH-wide convention `throw Napi::TypeError::New(env, msg)`
(matches Blob, XHR, URL, AbortSignal, TextDecoder, FileReader). On
engines whose Node-API shim properly translates a C++ throw into a JS
exception at the ObjectWrap construction boundary (V8 / JSC / JSI),
this propagates as expected.

Chakra cannot handle throwing from a constructor — there are five
pre-existing TODOs in tests.ts noting this same limitation for URL
parse() error cases. Apply the same workaround here: keep the
WebIDL-conformant throw in C++, but comment out the test that
exercises it (it would corrupt the Chakra heap on every CI run).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bkaradzic-microsoft bkaradzic-microsoft force-pushed the weekend/tpc-1582-file-polyfill branch from fa84e76 to 52f6338 Compare June 1, 2026 21:41
@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor Author

@bghgary friendly ping — this is the upstream half of BabylonJS/BabylonNative#1706 (File / FileReader polyfill, which re-enables 19 GLTF/OBJ serializer tests in BN). Same flow as #171 (TextEncoder) which you helped me close out earlier today: the polyfill was moved here from the BN PR per your review ("this should be a proper C++ polyfill under Polyfills/ so every consumer gets it").

State just refreshed:

  • Rebased onto current upstream/main (resolved a trivial overlapping-describe() conflict in tests.ts against the just-merged Add TextEncoder polyfill (WHATWG Encoding Standard) #171).
  • 5 inline Copilot review comments all have my replies with fixes in commit 52f6338. Notable bug-fix: readAsBinaryString now produces Latin-1 byte-per-code-unit per spec (was emitting UTF-8 of code points ≥ 0x80).
  • File library builds standalone clean (cmake --build build-notests --target File).

Polyfills/File/ sits next to Polyfills/Blob/ and follows the same shape as TextDecoder / TextEncoder / XMLHttpRequest / AbortController / URL / WebSocket. No changes to existing files outside the standard wiring (CMakeLists.txt option, Polyfills/CMakeLists.txt add_subdirectory, Tests/UnitTests/{CMakeLists, Shared/Shared.cpp, Scripts/tests.ts} link + Initialize + tests).

Once this lands I'll bump BN #1706's pin back to BabylonJS/JsRuntimeHost and that PR is one tiny commit away from merge.

Copy link
Copy Markdown
Member

@ryantrem ryantrem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my comment on the other PR, we may want to do a minimal implementation of the File polyfill that is just enough to support BJS usage (not sure if that is already the case).

Comment thread Polyfills/File/Source/FileReader.cpp Outdated
// user dropped their reference. ObjectReference is move-only, so
// route it through shared_ptr to make the lambda copy-constructible
// (Napi::Function::New stores via std::function).
auto thisRef = std::make_shared<Napi::ObjectReference>(Napi::Persistent(jsThis));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like we need shared_ptr for this though, can't each lambda just capture its own Napi::Persist?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 4480ad8: replaced the shared_ptr trick with a member Napi::ObjectReference m_selfRef anchored across the in-flight read (matches the member-slot pattern used by WebSocket/XHR in this repo).

The reason the literal [thisRef = Napi::Persistent(jsThis)] mutable capture-per-lambda didn't fly here: Napi::ObjectReference is move-only, so capturing it makes the lambda move-only too — and the JSI shim's Napi::Function::New stores its callable through jsi::Function::createFromHostFunction, whose underlying HostFunctionType is std::function-shaped (CopyConstructible required). That's why we needed something heap-stored.

Member slot is cleaner than shared_ptr: lambdas now only capture POD + his, the FileReader's lifetime is anchored by m_selfRef while a read is in flight, and every terminal path (load / error / abort) resets it to break the self-cycle. Each lambda also guards on m_readId before dereferencing m_selfRef to handle the abort-then-late-resolve race.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Responded by Copilot on behalf of @bghgary]

Agreed on move-only ruling out the literal capture. But std::shared_ptr<Napi::ObjectReference> IS a copyable handle to the move-only resource, and closes a real UAF m_selfRef doesn't:

auto selfRef = std::make_shared<Napi::ObjectReference>(Napi::Persistent(jsThis));
auto onResolve = Napi::Function::New(env,
    [this, selfRef, myReadId, mode, contentType](const Napi::CallbackInfo& cb) {
        if (m_readId != myReadId) return;
        Napi::Value buf = cb.Length() > 0 ? cb[0] : cb.Env().Null();
        HandleReadResult(myReadId, mode, contentType, selfRef->Value(), buf);
    });
auto onReject = Napi::Function::New(env,
    [this, selfRef, myReadId](const Napi::CallbackInfo& cb) {
        if (m_readId != myReadId) return;
        Napi::Value err = cb.Length() > 0 ? cb[0]
            : static_cast<Napi::Value>(Napi::Error::New(cb.Env(), "FileReader: unknown error").Value());
        HandleReadError(myReadId, selfRef->Value(), err);
    });

The UAF: the lambda derefs this (in m_readId) BEFORE the token check. So r.abort() resets m_selfRef → user drops r → GC → FileReader destructor → underlying Blob.arrayBuffer() promise (engine-owned, separate from the wrapper) eventually settles → onResolve fires → this->m_readId is UAF. Narrow window (needs GC between abort and the queued microtask) but not closed by anything in the code. shared_ptr<Persistent> captured by the lambdas keeps the wrapper alive until both continuations are gone, so the token deref is always safe.

Bonus: drops the m_selfRef member, its 7-line comment, all four .Reset() calls, and the implicit "at most one in-flight read" invariant. m_readId stays — that's event suppression (abandoned reads shouldn't fire load/loadend), not lifetime.

Engine teardown is a separate concern, not solved by either approach (same exposure for WebSocket/XHR).


namespace Babylon::Polyfills::Internal
{
class File final : public Napi::ObjectWrap<File>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File should behave as a subtype of Blob for WebAPI compatibility. As implemented, this looks like composition (m_blob) rather than inheritance, so file instanceof Blob may be false unless the prototype chain is explicitly wired up. Can we make File extend the polyfilled Blob type, or otherwise ensure File.prototype inherits from Blob.prototype?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 4480ad8: wired File.prototype → Blob.prototype via Object.setPrototypeOf in File::Initialize.
ew File(...) instanceof Blob is now rue.

This matters because Babylon.js core branches on instanceof Blob in packages/dev/core/src/Misc/fileTools.pure.ts, Offline/database.pure.ts, Engines/abstractEngine.pure.ts, and Engines/thinNativeEngine.pure.ts — without the prototype chain wired up those checks silently fail for File inputs and the wrong branch is taken (the serializer round-trip path bypasses the Blob handling).

Internal m_blob composition stays as the implementation detail; only the JS-visible prototype chain is wired. New test in ests.ts:

it("is an instance of Blob (prototype chain wired up)", function () {
    const file = new File(["x"], "x.txt");
    expect(file instanceof Blob).to.equal(true);
    expect(file instanceof File).to.equal(true);
});

…_ptr trick

Addresses @ryantrem review on BabylonJS#169 (review id 4412098338):

* Drop FileReader.readAsBinaryString and its supporting code paths.
  BJS has zero call sites (searched core, dev/, tools/). Removing it also
  eliminates the deferred Latin-1/UTF-8 follow-up that previously crashed
  Chakra during a combined-edit attempt.

* Wire File.prototype to inherit from Blob.prototype via
  Object.setPrototypeOf in File::Initialize. Babylon.js core branches on
  `instanceof Blob` in fileTools, Offline/database, abstractEngine, and
  thinNativeEngine; without inheritance those checks silently fail for
  File inputs and the wrong branch is taken. Internal m_blob composition
  stays as the implementation detail; only the JS-visible prototype chain
  is wired. New test asserts `new File(...) instanceof Blob === true`.

* Replace shared_ptr<ObjectReference>-in-lambda trick with a member
  Napi::ObjectReference m_selfRef anchored across the in-flight read.
  Matches the member-slot pattern used by WebSocket/XHR in this repo.
  Lambdas now only capture POD + this, so they remain copyable for
  jsi::Function's std::function-style callable slot. Every terminal path
  (load, error, abort) resets m_selfRef to break the self-cycle. Each
  lambda also guards on m_readId before dereferencing m_selfRef to handle
  the abort-then-late-resolve race.
@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor Author

@ryantrem on the minimization point — I surveyed Babylon.js usage and the only thing that wasn't already minimal is FileReader.readAsBinaryString (zero call sites across packages/dev, packages/tools, etc.). Dropped it in 4480ad8.

API BJS usage
new File(parts, name, opts) inspector-v2, viewer, smartAssetSerializer, etc.
instanceof File modelLoader, smartAssetSerializer, configurator
instanceof Blob fileTools, Offline/database, abstractEngine, thinNativeEngine
FileReader.readAsArrayBuffer fileTools, database
FileReader.readAsText fileTools, smartAssetSerializer
FileReader.readAsDataURL stringTools, nodeEditor, inspector
FileReader.readAsBinaryString zero call sites — removed

Dropping readAsBinaryString also makes the deferred Latin-1/UTF-8 follow-up unnecessary (the one that previously crashed Chakra during a combined-edit attempt — see the earlier thread).

Full set of changes in 4480ad8:

  1. Drop readAsBinaryString + its enum case + its test.
  2. File inherits Blob via prototype-chain wire-up + new instanceof Blob test.
  3. Replace shared_ptr<ObjectReference> with member Napi::ObjectReference m_selfRef slot (matches the WebSocket/XHR pattern in this repo).

Local Win32 Chakra File target builds clean.

CI failed on JSC engines (Ubuntu_gcc, macOS, iOS, Android_JSC) with
'[Uncaught Error] setPrototypeOf@[native code]' during JS env init.

Root cause: the previous commit called
\Object.setPrototypeOf(func.Get('prototype'), Blob.prototype)\. On JSC,
the napi-defined class's \.prototype\ JS property points to
Object.prototype (per JSObjectMakeConstructor semantics; same quirk the
FileReader constants block documents). setPrototypeOf on Object.prototype
throws TypeError because Object.prototype's [[Prototype]] is immutable.

Fix: instantiate a throwaway File, fetch its real prototype via
Object.getPrototypeOf, and set THAT prototype's prototype to
Blob.prototype. On V8 / Chakra the real prototype is also
func.prototype, so this is correct everywhere. The temp instance is
GC-eligible immediately after.

Each call is guarded with IsExceptionPending + GetAndClearPendingException
so that if a non-Chromium JSC build still rejects setPrototypeOf on the
napi-internal prototype, Initialize stays best-effort: instances won't be
Blob subtypes on that engine but the rest of the polyfill still
installs and the JS env starts cleanly.
Commit ac19d77 swapped the direct setPrototypeOf for a temp-instance

trick to fix JSC, but that broke Chakra (Win32_x64_Chakra and

Win32_x86_Chakra now report `file instanceof Blob === false`).

Restore the direct call (which previously passed on V8 and Chakra),

verify the chain via a probe instance, and only fall back to the

temp-instance approach when the chain isn't wired up. That recovers

JSC (where `func.prototype` aliases `Object.prototype`) without

regressing the engines where the direct call was already correct.
Commit 9e7b333 (C++ dual-path direct -> temp-instance fallback)

passed Chakra but regressed JSC: on JSC, setPrototypeOf on

Object.prototype throws TypeError that escapes the napi shim as an

[Uncaught Error] instead of being capturable via IsExceptionPending.

Move both setPrototypeOf calls into a single JS shim wrapped in

try/catch. JS-level try/catch reliably traps the TypeError on every

engine, the direct path takes care of V8/Chakra, and the probe path

(instanceof Blob check + getPrototypeOf + setPrototypeOf on the real

napi-internal prototype) takes care of JSC.
@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor Author

CI on 35bfa81 is fully green (22/22 jobs across V8, Chakra, JSI, JSC on Linux/macOS/iOS/Android/Windows/UWP, plus Sanitizers + TSan).

The prototype-chain wire-up for File extends Blob went through three iterations:

  1. 4480ad8 (initial review-fix commit): direct Object.setPrototypeOf(func.prototype, blobProto) in C++. Passed V8/Chakra; failed JSC because JSC's napi_define_class (which wraps JSObjectMakeConstructor) returns Object.prototype from func.Get("prototype") rather than the real napi-internal prototype — same quirk the FileReader constants block documents. Result: setPrototypeOf tried to mutate Object.prototype's [[Prototype]] and threw TypeError.
  2. ac19d77: temp-instance trick — Object.getPrototypeOf(new File(...)) returns JSC's real napi-internal prototype, and that one accepts setPrototypeOf. Fixed JSC; broke Chakra (reason still unclear from the log alone — Chakra-specific behavior somewhere between JsConstructObject and Object.getPrototypeOf).
  3. 9e7b333 (C++ dual-path direct + verify + fallback): regressed JSC because the JSC napi shim escapes the direct-path TypeError as [Uncaught Error] instead of capturing it via IsExceptionPending, so the C++ catch never gets the chance to fall through to the fallback.
  4. 35bfa81 (current): move both paths into a small JS shim wrapped in try/catch. JS-level catch reliably traps the TypeError on every engine; direct path covers V8/Chakra; probe path (instanceof Blob check + getPrototypeOf + setPrototypeOf on the real napi-internal prototype) covers JSC. The shim is run once at Initialize via env.RunScript (Shared N-API) or Napi::Eval (JSI N-API).

Ball back in your court for re-review.

Copy link
Copy Markdown
Contributor

@bghgary bghgary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Reviewed by Copilot on behalf of @bghgary]

Recommend landing #172 first. The JSC napi-shim bug it describes is the root cause of several workarounds in this PR — JS_PROTOTYPE_CHAIN_SHIM collapses to one direct setPrototypeOf call, the Napi::Eval / env.RunScript dispatch goes away, and the 14-line FileReader dual-exposure comment shrinks to one line. Quite a bit of code evaporates before merge if #172 lands first.

Other concerns inline.

Comment thread Polyfills/File/Source/FileReader.cpp Outdated
Comment on lines +59 to +60
// Compute loaded/total best-effort from the current result, mirroring
// the JS polyfill that this C++ implementation replaces.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "JS polyfill that this C++ implementation replaces" reference is dangling — no such JS polyfill exists in this PR.

Suggested change
// Compute loaded/total best-effort from the current result, mirroring
// the JS polyfill that this C++ implementation replaces.
// ProgressEvent contract: loaded/total reflect bytes processed. For
// one-shot reads we don't track interim progress — report the final
// byte count for both and leave lengthComputable=false.

Comment thread Polyfills/File/Source/FileReader.cpp Outdated
Comment on lines +96 to +109
// Expose EMPTY/LOADING/DONE both as static constants on the
// constructor (FileReader.EMPTY) and as instance constants on the
// prototype (new FileReader().EMPTY) per the WHATWG IDL.
//
// Important: do NOT set these via func.Get("prototype").Set(...) on
// the returned constructor. On JSC, JSObjectMakeConstructor defaults
// the constructor's .prototype property to Object.prototype, so
// writing through it pollutes Object.prototype and breaks any
// for..in over plain objects elsewhere in the runtime. The
// InstanceValue descriptors below go through napi_define_class's
// internal prototype lookup, which on JSC targets the napi-internal
// prototype (distinct from .prototype) and on V8 targets the
// function template's PrototypeTemplate — both correct, neither
// touches Object.prototype.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per JsRuntimeHost#173, dual StaticValue + InstanceValue is the correct WHATWG pattern for web polyfill constants — not a quirky workaround needing 14 lines of explanation. The JSC pitfall warning also evaporates once #172 lands. Drop the whole comment block:

Suggested change
// Expose EMPTY/LOADING/DONE both as static constants on the
// constructor (FileReader.EMPTY) and as instance constants on the
// prototype (new FileReader().EMPTY) per the WHATWG IDL.
//
// Important: do NOT set these via func.Get("prototype").Set(...) on
// the returned constructor. On JSC, JSObjectMakeConstructor defaults
// the constructor's .prototype property to Object.prototype, so
// writing through it pollutes Object.prototype and breaks any
// for..in over plain objects elsewhere in the runtime. The
// InstanceValue descriptors below go through napi_define_class's
// internal prototype lookup, which on JSC targets the napi-internal
// prototype (distinct from .prototype) and on V8 targets the
// function template's PrototypeTemplate — both correct, neither
// touches Object.prototype.

Comment on lines +142 to +147
// Initialize on* handler slots so they exist as enumerable, writable
// properties (consumers commonly assign to them after construction).
for (const auto* slot : ON_HANDLERS)
{
jsThis.Set(slot, env.Null());
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same refactor as the readyState/result/error ask above — the on* handlers should be InstanceAccessor getter/setter pairs on the prototype, not enumerable writable own data props on every instance.

// in DefineClass, one per ON_HANDLER slot:
InstanceAccessor("onload", &FileReader::GetOnLoad, &FileReader::SetOnLoad),
// ...

The setter does the event-handler IDL processing: store the new function (or null) in a Napi::FunctionReference member, and the dispatch path consults it alongside the listener list (or the setter wraps add/remove of the previous via the existing listener machinery).

Comment thread Polyfills/File/Source/File.cpp Outdated
Comment on lines +9 to +12
#if defined(__has_include) && __has_include(<napi/env.h>)
#include <napi/env.h>
#define BABYLON_POLYFILL_USE_NAPI_JSI_EVAL 1
#endif
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If #172 lands first and the JS shim goes away entirely, this macro goes too. If kept:

  1. __has_include(<napi/env.h>) always succeeds — the header is transitively included for every engine via <Babylon/Polyfills/File.h>.
  2. ..._JSI_EVAL is misleading: Napi::Eval is declared for V8/Chakra/JSC too.
  3. No rationale for preferring Napi::Eval over env.RunScript() — pick one and drop the #if.

Comment on lines +39 to +51
constexpr auto JS_PROTOTYPE_CHAIN_SHIM = R"JS(
(function() {
if (typeof File !== 'function' || typeof Blob !== 'function') return;
var blobProto = Blob.prototype;
try { Object.setPrototypeOf(File.prototype, blobProto); } catch (e) {}
try {
var probe = new File([], '');
if (!(probe instanceof Blob)) {
Object.setPrototypeOf(Object.getPrototypeOf(probe), blobProto);
}
} catch (e) {}
})();
)JS";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If #172 lands first, this whole shim collapses to a direct setPrototypeOf in C++ (no JS eval, no try/catch fallback, no probe). Recommending we land #172 first and drop the shim entirely.

Comment thread Tests/UnitTests/Scripts/tests.ts Outdated
Comment on lines +1468 to +1479
// TODO: Uncomment this once the Node-API implementation for Chakra supports throwing errors from constructors.
// it("throws when fewer than 2 arguments are passed", function () {
// // File requires both fileBits and fileName per the WebIDL bindings.
// // Browsers throw TypeError on missing arguments; the native polyfill
// // must match that surface so consumers don't accidentally create a
// // File with empty name when their call site is misspelled.
// // Note: we only assert *that* it throws (not the specific error
// // type), because the JSI napi shim wraps thrown Napi::TypeError as
// // a generic JS Error when surfacing it across the host boundary.
// expect(() => new (File as any)()).to.throw();
// expect(() => new (File as any)([])).to.throw();
// });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File a tracking issue for the Chakra napi-shim throw-from-constructor limitation and reference it in the TODO. Makes these tests re-enable-able atomically when the shim is fixed.

Comment thread Polyfills/File/Source/FileReader.cpp Outdated
// user dropped their reference. ObjectReference is move-only, so
// route it through shared_ptr to make the lambda copy-constructible
// (Napi::Function::New stores via std::function).
auto thisRef = std::make_shared<Napi::ObjectReference>(Napi::Persistent(jsThis));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Responded by Copilot on behalf of @bghgary]

Agreed on move-only ruling out the literal capture. But std::shared_ptr<Napi::ObjectReference> IS a copyable handle to the move-only resource, and closes a real UAF m_selfRef doesn't:

auto selfRef = std::make_shared<Napi::ObjectReference>(Napi::Persistent(jsThis));
auto onResolve = Napi::Function::New(env,
    [this, selfRef, myReadId, mode, contentType](const Napi::CallbackInfo& cb) {
        if (m_readId != myReadId) return;
        Napi::Value buf = cb.Length() > 0 ? cb[0] : cb.Env().Null();
        HandleReadResult(myReadId, mode, contentType, selfRef->Value(), buf);
    });
auto onReject = Napi::Function::New(env,
    [this, selfRef, myReadId](const Napi::CallbackInfo& cb) {
        if (m_readId != myReadId) return;
        Napi::Value err = cb.Length() > 0 ? cb[0]
            : static_cast<Napi::Value>(Napi::Error::New(cb.Env(), "FileReader: unknown error").Value());
        HandleReadError(myReadId, selfRef->Value(), err);
    });

The UAF: the lambda derefs this (in m_readId) BEFORE the token check. So r.abort() resets m_selfRef → user drops r → GC → FileReader destructor → underlying Blob.arrayBuffer() promise (engine-owned, separate from the wrapper) eventually settles → onResolve fires → this->m_readId is UAF. Narrow window (needs GC between abort and the queued microtask) but not closed by anything in the code. shared_ptr<Persistent> captured by the lambdas keeps the wrapper alive until both continuations are gone, so the token deref is always safe.

Bonus: drops the m_selfRef member, its 7-line comment, all four .Reset() calls, and the implicit "at most one in-flight read" invariant. m_readId stays — that's event suppression (abandoned reads shouldn't fire load/loadend), not lifetime.

Engine teardown is a separate concern, not solved by either approach (same exposure for WebSocket/XHR).

Comment on lines +138 to +140
jsThis.Set("readyState", Napi::Number::New(env, EMPTY));
jsThis.Set("result", env.Null());
jsThis.Set("error", env.Null());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be InstanceAccessor getters reading from C++-side state — not direct jsThis.Set in the constructor.

Per WHATWG IDL, readyState / result / error are readonly attribute → getter accessor on the prototype, non-writable from JS, non-enumerable. The PR makes them writable enumerable own data props on every instance.

// in DefineClass:
InstanceAccessor("readyState", &FileReader::GetReadyState, nullptr),
InstanceAccessor("result",     &FileReader::GetResult,     nullptr),
InstanceAccessor("error",      &FileReader::GetError,      nullptr),

Add m_readyState / m_result / m_error C++ members; replace every jsThis.Set("readyState", ...) (L188, 317, 329, 368, 420, 468, 495) and jsThis.Get("readyState") (L178, 312, 410, 480) with member access — state-machine checks become tamper-proof (JS can't fool them by overwriting the property).

InstanceAccessor is supported by the JSI napi shim (Core/Node-API-JSI/Include/napi/napi-inl.h:2575+); no shim work needed.

Comment on lines +13 to +15
// base64 library because no other JsRuntimeHost polyfill needs one
// and this is the only call site.
void EncodeBase64(const uint8_t* data, size_t size, std::string& out)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't add a third base64 impl. JRH already has one in Core/AppRuntime/V8Inspector/Source/V8InspectorUtils.cpp (Babylon::utils::base64_encode), and BabylonNative pulls in azawadzki/base-n (header-only) for its Canvas/Window polyfills. Best path: promote base-n to JRH, have FileReader consume it; BN's direct base-n dep becomes redundant and can be dropped in a follow-up.

declare const File: any;
declare const FileReader: any;

describe("File", function () {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests.ts is 1518 lines / 15 describe blocks across 10+ polyfills; adding File + FileReader here grows it further. Either file a JRH issue to split per-polyfill as a follow-up, or split in this PR — your call.

@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor Author

Filed #174 with the JSC napi shim fix for #172. It replaces the old // BEGIN TODO ... END TODO [[Prototype]]-kludge in ConstructorInfo::Create with a proper .prototype property setup, and stops napi_define_class from going through napi_get_prototype (which per spec returns [[Prototype]], not .prototype). Includes a Blob-based regression test that asserts no instance members leak onto the global Object.prototype.

Once #174 lands, I'll rebase this branch onto main and:

Will respond on the shared_ptr thread separately — your UAF argument is right; reverting the m_selfRef simplification.

Re: splitting tests.ts per-polyfill (your #1400) — happy to do it in this PR or as a follow-up; let me know which you prefer.

@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor Author

Update: closing #174 without merging — the JSObjectMakeConstructor .prototype slot is ReadOnly per JSObjectRef.cpp:128, so JSObjectSetProperty calls put() on it and throws on every JSC backend. The naive workaround JSObjectMakeFunctionWithCallback produces a non-constructable result. The actually-correct fix needs to switch to JSObjectMake + JSClassDefinition with both callAsConstructor and callAsFunction, plus a refactor of the NativeInfo::Link / Query sentinel pattern — too big to bundle with this PR.

So #169 keeps its JS-shim wire-up (35bfa81). I'll address your remaining inline asks against the current shape:

…abylonJS#169

File.cpp
- Drop the misleading BABYLON_POLYFILL_USE_NAPI_JSI_EVAL macro and the
  __has_include(<napi/env.h>) guard. <napi/env.h> exists on every backend
  (V8, Chakra, JSC, JSI) and is pulled in transitively via
  <Babylon/Polyfills/File.h>, so the include guard always succeeded and
  the macro name implied a JSI-only path that doesn't exist — Napi::Eval
  is declared on all four backends (the Shared N-API impl in env.cc is a
  thin wrapper around env.RunScript). Call Napi::Eval directly.
- Reorder Initialize: check "already provided" (cheap no-op, common path
  on platforms with a native File) before probing for Blob.
- Throw Napi::Error on missing Blob instead of silently bailing.
  Consumers that wire up the File polyfill expect it to be installed;
  silent failures are hard to debug.

FileReader.cpp
- MakeEvent: replace the dangling "JS polyfill that this C++
  implementation replaces" reference (no such JS polyfill exists in this
  PR) with a one-line description of the actual ProgressEvent contract.
- DefineClass: collapse the 14-line dual-StaticValue/InstanceValue
  comment to one line pointing at JsRH#173. Per BabylonJS#173 this is the correct
  WHATWG IDL `const` member exposure pattern, not a workaround that
  needs in-line justification.

tests.ts
- Re-point the Chakra throw-from-constructor TODO at JsRH#175 (filed
  today) so the disabled "throws when fewer than 2 arguments are passed"
  test can be re-enabled atomically when that shim limitation is fixed.

No behavior change; pure cleanup. Local Chakra build clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bkaradzic-microsoft
Copy link
Copy Markdown
Contributor Author

bkaradzic-microsoft commented Jun 3, 2026

Pushed 3de01db addressing the non-controversial mechanical nits from the review. CI run 2685565115122/22 green.

Changes in this commit:

  • File.cpp: drop the misleading BABYLON_POLYFILL_USE_NAPI_JSI_EVAL macro and __has_include(<napi/env.h>) guard. Napi::Eval is declared on every backend (Shared N-API has it via Core/Node-API/Source/env.cc as a thin wrapper around env.RunScript; JSI N-API declares it directly in <napi/env.h>). The include is also already pulled in transitively via <Babylon/Polyfills/File.h>, so the guard always succeeded. Now a single unconditional Napi::Eval(...) call.
  • File.cpp Initialize: reorder so the "already provided" check runs first (cheap no-op, common path on hosts with a native File), and throw Napi::Error on missing Blob instead of silently returning. Consumers wiring up the polyfill expect it to be installed — silent failures are hard to diagnose.
  • FileReader.cpp MakeEvent: drop the dangling "JS polyfill that this C++ implementation replaces" reference (there's no such JS polyfill in this PR); replaced with a one-liner describing the actual ProgressEvent loaded/total contract.
  • FileReader.cpp DefineClass: collapse the 14-line dual-StaticValue/InstanceValue block down to one line pointing at Web polyfill constants: add missing InstanceValue exposure on WebSocket and XMLHttpRequest #173, since per Web polyfill constants: add missing InstanceValue exposure on WebSocket and XMLHttpRequest #173 that's the correct WHATWG IDL const member exposure pattern, not a workaround needing in-line justification.
  • tests.ts: re-point the Chakra throw-from-constructor TODO to Chakra napi shim: throws from class constructors are swallowed (don't surface to JS) #175 (filed today) so the disabled expect(() => new (File as any)()).to.throw() test can be re-enabled atomically when the Chakra shim limitation is fixed.

Still to come (separate commits for review-clarity, in priority order):

  • shared_ptr<Napi::ObjectReference> UAF revert in FileReader.cpp (your #r3344177855 ask) — restoring the per-lambda shared_ptr capture pattern that survives abort → drop → GC → engine-owned promise settles → onResolve fires.
  • InstanceAccessor refactor for readyState / result / error / on* handlers (your JSC fix utf8 string creation #140 + Fix WorkQueue destructor deadlock #147 asks) — add C++ state members + Napi::FunctionReference slots, replace all the jsThis.Set/.Get sites.
  • base-n promotion from BN to JRH (your Build instructions for Android #15 ask) — once it lands here, FileReader can drop the inline EncodeBase64 and BN's direct dependency on the same lib goes away in a follow-up.
  • tests.ts per-polyfill split (your #1400 note that said "your call") — happy to defer unless you'd rather see it now.

#172 / #174 update: as noted in the follow-up comment on #174, the proper JSC .prototype fix needs a JSClassDefinition-based callable+constructable rewrite that's too big to bundle here. The JS shim approach in this PR stays as-is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants