
Centralize DOM insertion guards#460

Open
prk-Jr wants to merge 5 commits into main from fix/centralize-dom-insertion-dispatcher

Conversation


@prk-Jr prk-Jr commented Mar 7, 2026

Summary

  • Centralize SDK DOM insertion interception behind a single shared dispatcher so integrations no longer stack independent appendChild / insertBefore wrappers.
  • Move GPT onto the shared dispatcher and harden its proxy path so dynamic GPT loads remain first-party without broadening upstream encoding negotiation.
  • Add coordination and reset coverage so SPA re-inits, test runs, and future integrations do not accumulate prototype patches.
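
The shared-dispatcher shape described above can be sketched in simplified TypeScript (hypothetical names such as `registerHandler` and `dispatchInsertion`; the real API lives in dom_insertion_dispatcher.ts):

```typescript
// Simplified sketch of a single shared insertion dispatcher (hypothetical
// names; not the actual dom_insertion_dispatcher.ts API).
type Candidate = { src: string };

interface Handler {
  id: string;
  priority: number; // lower runs first
  handle: (candidate: Candidate) => boolean;
}

const handlers: Handler[] = [];

// Register a handler and return an unregister function so integrations can
// coordinate reset without touching DOM prototypes themselves.
function registerHandler(handler: Handler): () => void {
  handlers.push(handler);
  // Deterministic ordering: sort by priority; Array.prototype.sort is stable,
  // so equal priorities keep registration order.
  handlers.sort((a, b) => a.priority - b.priority);
  return () => {
    const index = handlers.indexOf(handler);
    if (index !== -1) handlers.splice(index, 1);
  };
}

// The one wrapper that would replace appendChild/insertBefore: a handler
// returning true short-circuits the chain, but the node is always inserted.
function dispatchInsertion(candidate: Candidate, insert: (c: Candidate) => void): void {
  for (const handler of handlers) {
    if (handler.handle(candidate)) break;
  }
  insert(candidate);
}
```

Integrations then call `registerHandler` instead of patching `Element.prototype` directly, so reset amounts to unregistering handlers plus one baseline restore in the dispatcher.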

Changes

  • crates/common/src/integrations/gpt.rs: Switched GPT asset fetching to the shared proxy helper, preserved caller Accept-Encoding, updated bootstrap comments, and refreshed proxy tests/fixtures.
  • crates/js/lib/src/shared/dom_insertion_dispatcher.ts: Added the runtime-global DOM insertion dispatcher that owns the single appendChild / insertBefore patch, orders handlers deterministically, and restores baselines safely.
  • crates/js/lib/src/shared/script_guard.ts: Refactored shared script guards to register dispatcher handlers instead of patching DOM prototypes directly.
  • crates/js/lib/src/integrations/gpt/script_guard.ts: Moved GPT Layer 5 onto the shared dispatcher and coordinated reset through handler unregistering.
  • crates/js/lib/src/integrations/gpt/index.ts: Made GPT shim activation robust when the inline enable flag executes before the unified bundle.
  • crates/js/lib/src/integrations/datadome/script_guard.ts: Updated DataDome to the new shared guard API and aligned guard comments with dispatcher behavior.
  • crates/js/lib/src/integrations/google_tag_manager/script_guard.ts: Updated GTM to the new shared guard API with stable integration identity.
  • crates/js/lib/src/integrations/lockr/nextjs_guard.ts: Updated Lockr to the new shared guard API and aligned guard comments with dispatcher behavior.
  • crates/js/lib/src/integrations/permutive/script_guard.ts: Updated Permutive to the new shared guard API with stable integration identity.
  • crates/js/lib/test/shared/dom_insertion_dispatcher.test.ts: Added shared-dispatcher coverage for single patch ownership, ordering, reset coordination, external patch ownership, handler isolation, and repeated install/reset cycles.
  • crates/js/lib/test/integrations/gpt/index.test.ts: Added coverage for GPT auto-install when the enable flag is set before module evaluation.
  • crates/js/lib/test/integrations/gpt/script_guard.test.ts: Updated GPT guard tests for dispatcher-based reset behavior and realistic versioned GPT asset paths.
  • crates/js/lib/test/integrations/datadome/script_guard.test.ts: Updated DataDome guard tests to assert reset restores shared prototype baselines without manual prototype cleanup.
  • crates/js/lib/test/integrations/google_tag_manager/script_guard.test.ts: Updated GTM guard tests to assert reset restores shared prototype baselines without manual prototype cleanup.
  • crates/js/lib/test/integrations/lockr/nextjs_guard.test.ts: Updated Lockr guard tests to assert reset restores shared prototype baselines without manual prototype cleanup.

Closes

Closes #402

Test plan

  • cargo test --workspace
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo fmt --all -- --check
  • JS tests: cd crates/js/lib && npx vitest run
  • JS format: cd crates/js/lib && npm run format
  • Docs format: cd docs && npm run format
  • WASM build: cargo build --bin trusted-server-fastly --release --target wasm32-wasip1
  • Manual testing via fastly compute serve
  • Other: cd crates/js/lib && npm run build

Checklist

  • Changes follow CLAUDE.md conventions
  • No unwrap() in production code — use expect("should ...")
  • Uses tracing macros (not println!)
  • New code has tests
  • No secrets or credentials committed

@prk-Jr prk-Jr self-assigned this Mar 7, 2026

@aram356 aram356 left a comment


Summary

Well-executed refactoring that replaces N independent appendChild/insertBefore prototype patches with a single shared dispatcher. The JS design is sound and thoroughly tested. The Rust-side GPT proxy rewrite introduces two behavioral changes that need attention before merge.

Blocking

🔧 wrench

  • Redirect following widens proxy trust boundary: build_proxy_config inherits follow_redirects: true, allowing the GPT proxy to follow up to 4 redirects to any host. The old code was single-hop. GPT static assets don't redirect, so follow_redirects = false is the correct setting. (gpt.rs:128)
  • Error handling semantics changed silently: Old code returned Err on non-2xx upstream status; new code passes through the response body. This is arguably better behavior but is a semantic change that should be documented or reverted. (gpt.rs:160)

Non-blocking

🤔 thinking

  • Vary header normalization dropped: Old vary_with_accept_encoding logic removed; relying on upstream Google CDN to always send correct Vary headers. Low practical risk but removes a defensive layer. (gpt.rs:147)
  • Privacy-scrubbing relies on header set-last-wins: copy_proxy_forward_headers forwards Referer/X-Forwarded-For before with_header overrides blank them. Correct today, fragile if the proxy helper ever appends instead of sets. (gpt.rs:134)
  • Stale dispatcher state replaced without teardown: Version mismatch path overwrites the global without restoring old prototype wrappers. No production risk (single page load), but can stack wrappers in HMR/dev. (dom_insertion_dispatcher.ts:81)

♻️ refactor

  • handle JSDoc ambiguous about node insertion: "Consumed" implies the node might not be inserted, but the dispatcher always inserts. Clarify that return value only controls handler chain short-circuiting. (dom_insertion_dispatcher.ts:28)

👍 praise

  • Single shared prototype patch: The core dispatcher design eliminates wrapper stacking, makes ordering deterministic, and the test coverage (priority ordering, external patch preservation, repeated cycles, error resilience) is thorough.
  • GPT auto-install on pre-set flag: Closes a real race condition between inline bootstrap and bundle evaluation. Both orderings tested.
  • Test cleanup delegated to dispatcher teardown: Tests validate that teardown works rather than masking broken cleanup with manual prototype restoration.

CI Status

  • cargo fmt: PASS
  • cargo test: PASS
  • vitest: PASS
  • format-typescript: PASS
  • format-docs: PASS
  • CodeQL: PASS

}

fn build_proxy_config<'a>(target_url: &'a str, req: &Request) -> ProxyRequestConfig<'a> {
let mut config = ProxyRequestConfig::new(target_url).with_streaming();

🔧 wrench — GPT proxy now follows redirects across arbitrary hosts

The old GPT proxy used Request::new().send(backend) — a single hop with no redirect following. build_proxy_config uses ProxyRequestConfig::new() which defaults to follow_redirects: true (proxy.rs:55). proxy_with_redirects (proxy.rs:463) follows up to 4 redirects to any http/https host with no allowlist. This widens the proxy trust boundary: if an upstream redirect targets an internal or attacker-controlled host, the proxy will fetch it on behalf of the client.

GPT static assets (gpt.js, pubads_impl.js) are served directly — they don't redirect. Following redirects is unnecessary here and creates unnecessary risk.

Fix: Disable redirect following for GPT asset proxying:

fn build_proxy_config<'a>(target_url: &'a str, req: &Request) -> ProxyRequestConfig<'a> {
    let mut config = ProxyRequestConfig::new(target_url).with_streaming();
    config.follow_redirects = false;
    config.forward_synthetic_id = false;
    // ...
}

response
}

async fn proxy_gpt_asset(

🔧 wrench — GPT proxy silently changes error handling semantics

The old handle_script_serving / handle_pagead_proxy returned Err(Report<TrustedServerError>) on non-2xx upstream status, preventing callers from seeing partial or broken response bodies. The new proxy_gpt_asset passes through non-2xx responses with their original body (which may be an HTML error page served with a wrong Content-Type). The comment at line 173 acknowledges this, but it's a semantic change bundled into a refactoring PR.

If this is intentional, it should be documented in the PR description as a behavioral change. If not, add:

if !response.get_status().is_success() {
    return Err(Report::new(Self::error(format!(
        "{context}: upstream returned {}",
        response.get_status()
    ))));
}

)
}

fn finalize_gpt_asset_response(&self, mut response: Response) -> Response {

🤔 thinking — Vary header normalization for compressed responses was dropped

The old code had explicit vary_with_accept_encoding logic ensuring Vary: Accept-Encoding was present when Content-Encoding was forwarded. The new streaming passthrough (proxy.rs:287) returns upstream headers as-is, then finalize_gpt_asset_response stamps Cache-Control: public, max-age=3600.

Google's CDN always sends correct Vary headers, and Fastly's cache is Vary-aware, so this is low-risk in practice. But if a shared CDN layer caches the response without Vary normalization, a brotli-compressed variant could be served to a client that doesn't support it.

Consider re-adding Vary normalization in finalize_gpt_asset_response when Content-Encoding is present, or document that you're relying on upstream Vary correctness.

header::USER_AGENT,
fastly::http::HeaderValue::from_static("TrustedServer/1.0"),
);
config = config.with_header(

🤔 thinking — proxy_request forwards Accept/Accept-Language that the old code didn't

copy_proxy_forward_headers (proxy.rs:90) forwards Accept, Accept-Language, Referer, X-Forwarded-For, and User-Agent from the client before build_proxy_config's with_header calls override some of them. The old code built a fresh request and only forwarded Accept, Accept-Encoding, and Accept-Language explicitly.

The end result is correct (overrides win), but the privacy-scrubbing relies on set-last-wins ordering which is fragile if copy_proxy_forward_headers ever changes to append rather than set.

return existingState as DomInsertionDispatcherState;
}

if (existingState) {

🤔 thinking — Stale global dispatcher state replaced without tearing down old wrappers

On version mismatch, the old state is replaced at line 98 without restoring any prototype wrappers it may have installed. If the old state had active appendChildWrapper/insertBeforeWrapper on Element.prototype, those become orphaned closures dispatching to a detached handler map. The new state's installDispatcher then saves the old wrapper as baselineAppendChild, creating a permanent wrapper chain.

In production this can't happen (single page load). In HMR or multi-bundle dev scenarios, wrappers can stack. Consider shape-checking the old state and attempting best-effort teardown before replacement.
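
A best-effort teardown along the lines suggested here could look like the following sketch (`StaleState` and `replaceStaleState` are illustrative shapes, not the module's actual types):

```typescript
// Sketch of the suggested best-effort teardown before replacing a stale
// global dispatcher state (illustrative types, not the module's own).
type InsertFn = (...args: unknown[]) => unknown;

interface StaleState {
  version?: number;
  baselineAppendChild?: InsertFn;
  teardown?: () => void;
}

function replaceStaleState(target: { appendChild: InsertFn }, old: unknown): void {
  const stale = old as StaleState | undefined;
  if (stale && typeof stale.teardown === "function") {
    // Preferred path: let the old state unwind its own prototype patches.
    try {
      stale.teardown();
    } catch {
      // Best effort only; a malformed stale state must not break install.
    }
  } else if (stale && typeof stale.baselineAppendChild === "function") {
    // Fallback: restore the baseline the old state recorded, so the new
    // install does not save an orphaned wrapper as its baseline.
    target.appendChild = stale.baselineAppendChild;
  }
}
```

The shape check keeps the path safe even when a future bundle version changed the state layout.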


export interface DomInsertionHandler {
/**
* Process a normalized DOM insertion candidate.

♻️ refactor — handle return value JSDoc is ambiguous about node insertion

The JSDoc says "Return true when the handler consumed or rewrote the candidate and no subsequent handlers should run." But the dispatcher always inserts the node regardless of return value (lines 183–184). "Consumed" implies the node might not be inserted. Suggest clarifying:

/**
 * Process a normalized DOM insertion candidate.
 *
 * Return `true` when this handler has processed the candidate (e.g.
 * rewritten its URL) and no subsequent handlers should run.  The node
 * is always inserted into the DOM regardless of the return value.
 */


win.__tsjs_installGptShim = installGptShim;

if (win.__tsjs_gpt_enabled === true) {

👍 praise — The check for __tsjs_gpt_enabled === true after registering __tsjs_installGptShim makes activation robust regardless of inline-script-vs-bundle ordering. The test at index.test.ts explicitly verifies both orderings. Clean fix for a real race condition.



Development

Successfully merging this pull request may close these issues.

Multiple global prototype patches stack without coordination
