feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle by d-cs · Pull Request #4118 · triggerdotdev/trigger.dev

d-cs · 2026-07-02T17:39:05Z

What

Routes the webapp write path through the run-ops split seam: trigger/batch minting, idempotency-key resolution, and the run-lifecycle services now determine residency and dispatch writes to the correct store.

Trigger & batch (runEngine/services/triggerTask.server.ts, batchTrigger.server.ts, createBatch.server.ts, streamBatchItems.server.ts, v3/services/batchTriggerV3.server.ts): mint ids with the run-ops-aware minting and route creation/streaming through the store; batch children inherit the parent's residency.
Idempotency (runEngine/concerns/idempotencyKeys.server.ts + new idempotencyResidency.server.ts): idempotency-key lookup/dedup is residency-aware so a keyed retrigger resolves against the store that owns the original run.
Run lifecycle services (createCheckpoint, createTaskRunAttempt, enqueueDelayedRun, expireEnqueuedRun, finalizeTaskRun, resumeBatchRun, cancelDevSessionRuns, executeTasksWaitingForDeploy, triggerFailedTask): resolve their target run through the store rather than a fixed client.
Reads that fan out from writes (runsRepository + clickhouseRunsRepository, BulkActionV2 + batch read-through, realtime sessions/runReader, alerts deliverAlert/performTaskRunAlerts): route through the read-through resolver.
9535ae63d — resolves the parent run through an injectable run store in TriggerFailedTaskService.
bf8f7c881 — drops the "known-migrated" concept from write-path and read repos; residency is id-shape only.
515b897ea — self-defaults resolveWaitpointThroughReadThrough to the safe run-ops clients.

Why

PR6 of the run-ops split stack. This is the write-path counterpart to the read foundation in the previous PRs: with it in place, both reads and writes route through the seam. Additive when the split is disabled (id-shape resolution collapses to the control-plane client); behavior-changing on the minting, idempotency, and lifecycle paths when enabled.

Tests

Large new/expanded vitest suite under apps/webapp/test/ and colocated service tests: trigger-task and batch-trigger store routing, residency inheritance, idempotency dedup residency + legacy-authority, bulk-action read routing, cancel-dev-session routing, alerts store routing, runs-repository read-through, realtime session/run-reader read-through and stream-registration routing, and the waitpoint read-through default. Testcontainers-backed; no mocks.

Notes

Draft, stacked on #4117 (runops/pr05-webapp-foundation). Review that first; this diff is against it.

Server-change / changeset note to be added at stack-assembly time.

🤖 Generated with Claude Code

changeset-bot · 2026-07-02T17:39:09Z

⚠️ No Changeset found

Latest commit: ea22f52

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-07-02T17:39:15Z

Walkthrough

This PR migrates webapp write and read paths for run triggering, batch processing, idempotency-key resolution, and alert/session serialization from direct Prisma access to a RunStore/ControlPlaneResolver abstraction supporting split reads across legacy and new (run-ops) Postgres databases. It adds mint-kind-aware friendly-ID generation (KSUID vs CUID) for runs, batches, and failed-task creation to preserve residency inheritance across parent/child runs. New helpers (resolveIdempotencyDedupClient, resolveWaitpointThroughReadThrough, hydrateRunsAcrossSeam) route reads/writes based on split-mode configuration and ID classification. Numerous services (trigger, batch trigger, alerts, sessions, runs repository, cancel/resume flows) are updated accordingly, alongside extensive new integration tests using heterogeneous Postgres containers to validate store routing.

Changes

Cohort: RunStore/control-plane seam migration

Idempotency dedup client resolution: new resolveIdempotencyDedupClient helper and wiring into IdempotencyKeyConcern
Waitpoint read-through resolver: new resolveWaitpointThroughReadThrough helper
Trigger task/failed-task friendly-id minting with mint-kind resolution and residency inheritance
Batch trigger services (RunEngineBatchTriggerService, CreateBatchService, StreamBatchItemsService, BatchTriggerV3Service) migrated to runStore/mintBatchFriendlyId
Control-plane resolver adoption across run lifecycle services (task run attempts, enqueue/expire, finalize, alerts)
Run reading/session/bulk-action seam adoption (runsRepository, ClickHouse hydration, sessions, cancel/resume flows)
Documentation note added describing the run-ops write-path routing change
Minor findLatestSession signature update to accept an injectable Prisma client

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant TriggerTaskService
  participant ResolveMintKind
  participant RunStore
  participant PrismaNew
  participant PrismaLegacy

  Caller->>TriggerTaskService: call(triggerRequest)
  TriggerTaskService->>ResolveMintKind: resolveRunIdMintKind / resolveInheritedMintKind
  ResolveMintKind-->>TriggerTaskService: mintKind (ksuid | cuid)
  TriggerTaskService->>TriggerTaskService: mintRunFriendlyId(environment, parentRunFriendlyId)
  TriggerTaskService->>RunStore: findRun(parentRunId) [if parentRunId provided]
  RunStore->>PrismaNew: query (if NEW residency)
  RunStore->>PrismaLegacy: query (if LEGACY residency)
  RunStore-->>TriggerTaskService: parentRun | null
  TriggerTaskService-->>Caller: created run (friendlyId, depth, parentTaskRunId)

Compact Metadata

Estimated code review effort: 5/5 (very high — 60+ files touched, high-risk data-routing logic, extensive but distributed test coverage)
Related PRs: None specified
Suggested labels: run-engine, database, run-ops-split, needs-careful-review
Suggested reviewers: Based on the scope of Prisma/RunStore seam changes across trigger, batch, and alert services, reviewers familiar with run-engine internals and database migration strategy should review this PR.

Poem:
A rabbit hops between two dens, PG14 and PG17 too,
Minting KSUIDs and CUIDs anew, deciding which store gets the queue.
Idempotency keys now dedupe with care,
Batches seal and runs find their lair.
Through the seam we hop, split but never lost—
Every run finds its home, whatever the cost. 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description covers scope, rationale, and tests, but it does not follow the required template or include Closes #, checklist, changelog, or screenshots sections.	Add the missing template sections, especially Closes #, checklist items, Testing, Changelog, and Screenshots.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title clearly summarizes the main change: routing the webapp write path through run-ops for minting, idempotency, and lifecycle flows.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch runops/pr06-write-path

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

pkg-pr-new · 2026-07-02T18:06:10Z

Open in StackBlitz

@trigger.dev/build

npm i https://pkg.pr.new/@trigger.dev/build@ea22f52

trigger.dev

npm i https://pkg.pr.new/trigger.dev@ea22f52

@trigger.dev/core

npm i https://pkg.pr.new/@trigger.dev/core@ea22f52

@trigger.dev/python

npm i https://pkg.pr.new/@trigger.dev/python@ea22f52

@trigger.dev/react-hooks

npm i https://pkg.pr.new/@trigger.dev/react-hooks@ea22f52

@trigger.dev/redis-worker

npm i https://pkg.pr.new/@trigger.dev/redis-worker@ea22f52

@trigger.dev/rsc

npm i https://pkg.pr.new/@trigger.dev/rsc@ea22f52

@trigger.dev/schema-to-json

npm i https://pkg.pr.new/@trigger.dev/schema-to-json@ea22f52

@trigger.dev/sdk

npm i https://pkg.pr.new/@trigger.dev/sdk@ea22f52

commit: ea22f52

d-cs · 2026-07-02T19:10:40Z

Addressed the outside-diff note on idempotencyKeys.server.ts (blockRunWithWaitpoint writing via this.prisma): the block now passes the residency-resolved dedupClient as tx, so the idempotent parent run's waitpoint write lands on the store that owns that parent run rather than the fallback client.

… routing, run lifecycle Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…e in TriggerFailedTaskService TriggerFailedTaskService read the parent run via the ambient module-singleton store while the engine wrote the run through its own store, so a ksuid parent's row was not found and parentTaskRunId came back null. Add an optional injected runStore (defaults to the shared singleton, preserving production behaviour) and resolve the parent through it at both call sites, mirroring triggerTask.server.ts. Align the three affected webapp tests to read through the same store the engine wrote to: triggerFailedTask.test.ts passes engine.runStore; performTaskRunAlerts routing passes a passthrough store over the seeded container; triggerTask.test.ts stubs the run-ops db handles and pins split mode off so the idempotency dedup uses the container client. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…id-shape only Migration is deferred, so child/batch residency is a pure id-shape check. Remove the isKnownMigrated (and mint-only isSplitEnabled) deps from the mint sites (triggerTask, triggerFailedTask, batchTriggerV3) and call the now- synchronous resolveInheritedMintKind(parentFriendlyId) with no deps arg. Read paths: drop the isKnownMigrated re-probe-avoidance from the ClickHouse runs hydrate (probe all missing on legacy), the runsRepository readThrough options type, resolveWaitpointThroughReadThrough deps, and the BulkActionV2 batch seam adapter — keeping the genuine cross-seam fallback that reads NEW first for unclassifiable/legacy-candidate ids. Delete the injected-marker test cases; the remaining residency tests assert pure id-shape inheritance. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…s and test names Review hygiene only: remove the NEW-1 label, Test X: name prefixes, and [TEST-NEWSEED] comment label. No product logic or test behavior changed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…o safe run-ops clients The read-through concern defaulted both newClient and legacyReplica to $replica (control-plane), so a bare caller that omits `deps` — the waitpoints wait route — never queried the dedicated run-ops replica. A co-located, NEW-resident waitpoint minted by streams.input().wait() lives on the run-ops-new DB, so the read missed, returned null, and the route 404'd (re-serialized to 500). Match the deps the complete/callback routes pass: default newClient to runOpsNewReplica, legacyReplica to $replica, and splitEnabled to runOpsSplitReadEnabled — mirroring readThroughRun's own self-defaulting. This immunizes any bare caller (present or future) against the control-plane pin, without touching the wait route. The wait/complete/callback call sites live on a higher branch and are unchanged; complete/callback keep their explicit deps (now redundant but harmless). Adds a heteroRunOps regression case driving the concern with no `deps` via the `defaults` DI seam: proves the old $replica default misses a NEW-resident waitpoint (null) while the safe run-ops default finds it. No mocks; the fallback is exercised against real PG14/PG17 containers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rvice prisma to the resolved store - Block the idempotent parent run's waitpoint via the residency-resolved dedup client instead of the fallback prisma, so the write lands on the store that owns the parent run. - Pass the caller-provided _prisma into WithRunEngine so a custom store isn't silently overridden by the module singleton. - Throw when a run-backed alert's environment can't be resolved instead of marking it SENT, so a transient replica miss doesn't permanently suppress the alert. - Pin splitEnabled:false in the waitpoint passthrough test so it exercises single-DB behaviour rather than relying on ksuid residency. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The write-path split added static `runOpsLegacyPrisma`/`runOpsNewPrisma` imports to idempotencyKeys.server.ts, which this test loads. vitest validates every named import against the `~/db.server` mock, so the mock now errored on the missing run-ops singletons. Add the four run-ops exports (empty stubs, same boundary pattern as the batchTriggerV3 residency test) and pin isSplitEnabled() to false so the dedup routing deterministically returns the injected fake prisma regardless of the ambient RUN_OPS_SPLIT_ENABLED. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…setup Worker/engine/marqs/pubsub/socket singletons each construct an ioredis client at import time (singleton() + no lazyConnect), so any test importing the service graph opened real Redis connections on import. In CI there is no Redis, so these accumulate infinite-retry clients across a shard and take the suite down (locally they pass only because dev Redis is up). Globally mock the eager-Redis modules to no-op stubs in test/setup.ts: commonWorker, batchTriggerWorker, legacyRunEngineWorker, alertsWorker, the RunEngine and MarQS singletons, devPubSub and the socket.io server. Only these singletons are mocked — never the run store (~/v3/runStore.server, ~/db.server), which store-routing/residency tests need real against testcontainer Postgres. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…yConnect + stub runtime Redis singletons The setup-file mocks of the six eager worker/engine singletons were not enough: CI shards still flooded ECONNREFUSED/maxRetries. Two further classes of env-Redis usage survived them, reproduced locally by running the failing shards with REDIS_PORT pointed at a dead port: 1. Import-time construction: ~15 more singletons (platform cache, billing-limit reconcile queue, alerts rate limiter, DevPresence, auto-increment counter, s2 token cache, v1 streams cache, ...) build ioredis clients at module import, and ioredis dials on construction. A global ioredis mock now forces lazyConnect: true so clients only dial on first command — testcontainer-backed tests are unaffected (their first command connects as before). 2. Runtime commands inside code under test: tracePubSub.publish() (eventRepository writes), alertsRateLimiter.check() (deliverAlert) and the task metadata cache each issue commands against env-configured Redis mid-test; every command burns ~20 reconnect cycles before its error surfaces, which times the tests out. These three modules are now stubbed (metadata cache pinned to its Noop implementation, which is what CI's unset env resolves to anyway). Verified: webapp shards 2/5/6/8 (the ones failing on the pr06+ stack) run green with Redis pointed at a dead port, and shards 2/8 stay green against live Redis (store-routing suites still exercise the real run store). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…in CI CI runners have no .env, no REDIS_HOST/REDIS_PORT, and no Postgres at localhost:5432, which surfaced two failure layers that local runs mask (the dev stack answers on both): - suites transitively importing triggerTaskV1.server failed to collect because autoIncrementCounter.server.ts throws at import when REDIS_HOST/REDIS_PORT are unset (shards 2/5/6). Default the pair in test/setup.ts — the global ioredis lazyConnect mock means nothing dials. - TriggerFailedTaskService.call() resolved its event repository via getEventRepository → global prisma (feature-flag read + Prisma event repo), so in CI the swallowed connect error returned null friendlyIds (shard 8). Allow injecting the repository/store pair and bind the test to an EventRepository over the testcontainer DB. - once the cancelDevSessionRuns suite could collect, findLatestSession's hardwired global $replica was the next masked layer; give it an injectable client (defaulting to $replica) and pass the service's _replica through. Verified by replaying the exact CI env locally (.env hidden, workflow env vars, dead localhost DB, GITHUB_ACTIONS set): all four failing suites and full shards 2/5/6/8 reproduce the CI failures before and pass after. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…method access Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…am regressions Two regression tests for the write-path read seams: - runsRepository: paginating the full keyset over interleaved cuid/ksuid runs enumerates every id once, no empty page, in ClickHouse (created_at DESC, run_id DESC) order -- fails if hydration reverts to lexical id desc across the id-space seam. - runReader: a NEW-resident (ksuid) run's terminal metadata hydrates through the owning store, never a generic legacy replica. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 24f52900-ef0f-43d1-973c-1bc57696ef16

📥 Commits

Reviewing files that changed from the base of the PR and between 515b897 and ea22f52.

📒 Files selected for processing (2)

.server-changes/run-ops-split-webapp-write-path.md
apps/webapp/app/models/runtimeEnvironment.server.ts

📜 Review details

⏰ Context from checks skipped due to timeout. (11)

GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (10, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (9, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (6, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (2, 10)
GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 10)
GitHub Check: typecheck / typecheck

⚠️ CI failures not shown inline (4)

GitHub Actions: 📝 CLAUDE.md Audit / audit: feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle

Conclusion: failure

View job details

##[group]Run anthropics/claude-code-action@428971d2ecd6e3a7cb0ee0da2a3a8b33fdb3678d
 with:
   anthropic_***REDACTED***
   use_sticky_comment: true
   allowed_bots: devin-ai-integration[bot]
   claude_args: --max-turns 25
--model claude-opus-4-8
--allowedTools "Read,Glob,Grep,Bash(git diff:*)"
   prompt: You are reviewing a PR to check whether any CLAUDE.md files or .claude/rules/ files need updating.
## Your task
1. Run `git diff origin/main...HEAD --name-only` to see which files changed in this PR.
2. For each changed directory, check if there's a CLAUDE.md in that directory or a parent directory.
3. Determine if any CLAUDE.md or .claude/rules/ file should be updated based on the changes. Consider:
   - New files/directories that aren't covered by existing documentation
   - Changed architecture or patterns that contradict current CLAUDE.md guidance
   - New dependencies, services, or infrastructure that Claude should know about
   - Renamed or moved files that are referenced in CLAUDE.md
   - Changes to build commands, test patterns, or development workflows
## Response format
If NO updates are needed, respond with exactly:
✅ CLAUDE.md files look current for this PR.
If updates ARE needed, respond with a short list:
📝 **CLAUDE.md updates suggested:**
- `path/to/CLAUDE.md`: [what should be added/changed]
- `.claude/rules/file.md`: [what should be added/changed]
Keep suggestions specific and brief. Only flag things that would actually mislead Claude in future sessions.
Do NOT suggest updates for trivial changes (bug fixes, small refactors within existing patterns).
Do NOT suggest creating new CLAUDE.md files - only updates to existing ones.
   trigger_phrase: `@claude`
   label_trigger: claude
   branch_prefix: claude/
   use_bedrock: false
   use_vertex: false
   use_foundry: false
   classify_inline_comments: true
   use_commit_signing: false
   bot_id: 41898282
   bot_name: claude[bot]
   track_progress: false
   include_fix_links: true
   display_report: false...

GitHub Actions: 📝 CLAUDE.md Audit / 0_audit.txt: feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle

Conclusion: failure

View job details

alt@3.0.0-beta.46 -> `@trigger.dev/yalt`@3.0.0-beta.46
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.47 -> `@trigger.dev/yalt`@3.0.0-beta.47
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.48 -> `@trigger.dev/yalt`@3.0.0-beta.48
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.49 -> `@trigger.dev/yalt`@3.0.0-beta.49
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.5 -> `@trigger.dev/yalt`@3.0.0-beta.5
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.50 -> `@trigger.dev/yalt`@3.0.0-beta.50
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.51 -> `@trigger.dev/yalt`@3.0.0-beta.51
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.52 -> `@trigger.dev/yalt`@3.0.0-beta.52
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.53 -> `@trigger.dev/yalt`@3.0.0-beta.53
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.55 -> `@trigger.dev/yalt`@3.0.0-beta.55
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.6 -> `@trigger.dev/yalt`@3.0.0-beta.6
  * [new tag]             `@trigger.dev/yalt`@3.0.0-beta.7 -> `@trigger.dev/yalt`@3.0.0-beta.7
  * [new tag]             build-alert-hotfix.rc1      -> build-alert-hotfix.rc1
  * [new tag]             build-alert-hotfix.rc2      -> build-alert-hotfix.rc2
  * [new tag]             build-arm-builds-rc.1       -> build-arm-builds-rc.1
  * [new tag]             build-arm-builds-rc.2       -> build-arm-builds-rc.2
  * [new tag]             build-arm-builds-rc.3       -> build-arm-builds-rc.3
  * [new tag]             build-batchid-carryover-rc.0 -> build-batchid-carryover-rc.0
  * [new tag]             build-batching-rc.1         -> build-batching-rc.1
  * [new tag]             build-batching-rc.2         -> build-batching-rc.2
  * [new tag]             build-billing-0.0.1         -> build-billing-0.0.1
  * [new tag]             build-billing-0.0.2         -> build-billing-0.0.2
  * [new tag]             build-billing-0.0.3         -> build-billing-0.0.3
  * [new tag]             build-buildinfo-rc.0        -> b...

GitHub Actions: 🔎 REVIEW.md Drift Audit / audit: feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle

Conclusion: failure

View job details

##[group]Run anthropics/claude-code-action@428971d2ecd6e3a7cb0ee0da2a3a8b33fdb3678d
 with:
   anthropic_***REDACTED***
   use_sticky_comment: true
   allowed_bots: devin-ai-integration[bot]
   claude_args: --max-turns 30
--allowedTools "Read,Glob,Grep,Bash(git diff:*)"
   prompt: You are auditing this PR for drift against `.claude/REVIEW.md`.
## Context
`.claude/REVIEW.md` is the repo's source of truth for what AI / agent code reviewers should treat as critical findings (rolling-deploy safety, hot-table indexes, recovery-path queries, testcontainers usage, Lua versioning, etc.). It is consumed by review agents to calibrate severity. If REVIEW.md goes stale, every future agent review degrades.
## Strategy — read this first
You have a hard turn budget. Spend it on signal, not coverage. The audit is allowed to miss things; it is NOT allowed to time out.
1. Read `.claude/REVIEW.md` once, in full.
2. Run `git diff origin/main...HEAD --name-only` to get the list of changed files. Do NOT read the diff content yet.
3. Scan the file-list for relevance to REVIEW.md scope. Relevance signals: changes to Prisma schema, Redis / queue / Lua code, hot tables, recovery / restart loops, new packages, deletions of paths REVIEW.md cites. Skim everything else.
4. Open at most **5 files** total — only the ones most likely to surface a real signal. If nothing in the file-list looks relevant to any REVIEW.md rule, do NOT read any files; go straight to the verdict.
5. Form a verdict and stop. Do not exhaust the turn budget exploring.
Large PRs (>50 files changed) are a strong signal to be MORE selective, not more thorough. Pick 3-5 files at most.
## What to look for
- **Stale references** — does any REVIEW.md rule cite a file, directory, function, table, Prisma model, or package name that has been removed or renamed in this PR (or is already gone from `main`)?
- **Contradictions** — does code in this PR clearly violate a current REVIEW.md rule? (Don't re-review the PR. Only flag if REVIE...

GitHub Actions: 🔎 REVIEW.md Drift Audit / 0_audit.txt: feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle

Conclusion: failure

View job details

y-run-engine.fix3 -> build-legacy-run-engine.fix3
  * [new tag]             build-manual-checkpoints.rc1 -> build-manual-checkpoints.rc1
  * [new tag]             build-metadata-upgrade-logging.rc1 -> build-metadata-upgrade-logging.rc1
  * [new tag]             build-metadata-upgrade-logging.rc2 -> build-metadata-upgrade-logging.rc2
  * [new tag]             build-metadata-upgrade-logging.rc3 -> build-metadata-upgrade-logging.rc3
  * [new tag]             build-new-build-system.rc.1 -> build-new-build-system.rc.1
  * [new tag]             build-otel-upgrade-rc.0     -> build-otel-upgrade-rc.0
  * [new tag]             build-otel-upgrade-rc.1     -> build-otel-upgrade-rc.1
  * [new tag]             build-pre-pull-deployments-rc.1 -> build-pre-pull-deployments-rc.1
  * [new tag]             build-prod-rescue-rc.1      -> build-prod-rescue-rc.1
  * [new tag]             build-rate-limiter-fix-rc.1 -> build-rate-limiter-fix-rc.1
  * [new tag]             build-re2.rc0               -> build-re2.rc0
  * [new tag]             build-realtime-v2-stream-fix -> build-realtime-v2-stream-fix
  * [new tag]             build-realtime-v2-stream-fix-2 -> build-realtime-v2-stream-fix-2
  * [new tag]             build-realtime-v2-stream-fix-3 -> build-realtime-v2-stream-fix-3
  * [new tag]             build-realtime-v2-stream-fix-4 -> build-realtime-v2-stream-fix-4
  * [new tag]             build-realtime-v2-stream-fix-5 -> build-realtime-v2-stream-fix-5
  * [new tag]             build-realtimestreams-dedupe -> build-realtimestreams-dedupe
  * [new tag]             build-registry-maintenance-rc.1 -> build-registry-maintenance-rc.1
  * [new tag]             build-registry-maintenance-rc.2 -> build-registry-maintenance-rc.2
  * [new tag]             build-remote-ecr-rc.0       -> build-remote-ecr-rc.0
  * [new tag]             build-reschedule-hotfix.rc1 -> build-reschedule-hotfix.rc1
  * [new tag]             build-resume-fixes.rc1      -> build-resume-fixes.rc1
  * [new tag]      ...

🧰 Additional context used

📓 Path-based instructions (8)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: Access environment variables through the env export of env.server.ts instead of directly accessing process.env
Use subpath exports from @trigger.dev/core package instead of importing from the root @trigger.dev/core path

Use named constants for sentinel/placeholder values (e.g. const UNSET_VALUE = '__unset__') instead of raw string literals scattered across comparisons

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

apps/webapp/**/*.server.ts

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

apps/webapp/**/*.server.ts: Never use request.signal for detecting client disconnects. Use getRequestAbortSignal() from app/services/httpAsyncStorage.server.ts instead, which is wired directly to Express res.on('close') and fires reliably
Access environment variables via env export from app/env.server.ts. Never use process.env directly
Always use findFirst instead of findUnique in Prisma queries. findUnique has an implicit DataLoader that batches concurrent calls and has active bugs even in Prisma 6.x (uppercase UUIDs returning null, composite key SQL correctness issues, 5-10x worse performance). findFirst is never batched and avoids this entire class of issues

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs}: Use pnpm run typecheck for changes in apps (apps/*) and internal packages (internal-packages/*), and never use build to verify those changes.
Use Vitest for tests, and never mock anything; use testcontainers instead.
Prefer static imports over dynamic import(), and only use dynamic imports for unresolved circular dependencies, genuine code-splitting needs, or conditional runtime loading.

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

**/*.{ts,tsx,js,jsx,mts,cts,mjs,cjs,md,mdx}

📄 CodeRabbit inference engine (CLAUDE.md)

Always import from @trigger.dev/sdk when writing Trigger.dev tasks; never use @trigger.dev/sdk/v3 or deprecated client.defineJob.

Files:

apps/webapp/app/models/runtimeEnvironment.server.ts

🧠 Learnings (14)

📚 Learning: 2026-03-22T13:26:12.060Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-03-22T19:24:14.403Z

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-05-18T08:21:27.694Z

Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma error P1001 ("Can't reach database server") in TypeScript, don’t assume a single error shape. Prisma can surface P1001 via two different error classes/fields: `PrismaClientKnownRequestError` exposes it as `err.code === "P1001"` (common during mid-query connection drops), while `PrismaClientInitializationError` exposes it as `err.errorCode === "P1001"` (common on client startup failure). Therefore, predicates should use `err.code === "P1001" || err.errorCode === "P1001"`. Do not flag `err.code === "P1001"` as “unreachable/never matches,” as it is expected in production.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-05-18T08:21:27.694Z

Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3632
File: apps/webapp/sentry.server.ts:4-21
Timestamp: 2026-05-18T08:21:27.694Z
Learning: When handling Prisma errors for P1001 ("Can't reach database server"), do not assume it only appears under a single property name. Prisma may surface P1001 via either `PrismaClientKnownRequestError` (`err.code === "P1001"`, e.g., mid-query connection drops) or `PrismaClientInitializationError` (`err.errorCode === "P1001"`, e.g., client startup connection failure). To reliably detect the condition, check `err.code === "P1001" || err.errorCode === "P1001"`, and avoid review rules that would incorrectly flag `err.code === "P1001"` as unreachable/never-matching.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-06-13T19:53:13.759Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3937
File: packages/trigger-sdk/skills/realtime-and-frontend/SKILL.md:258-260
Timestamp: 2026-06-13T19:53:13.759Z
Learning: When reviewing code that uses `trigger.dev/react-hooks`’s `useRealtimeRun`, preserve the call signature where the first argument is the full realtime handle object (not `handle.id`). This is intentional to maintain type-safety and is consistent with the official docs; do not suggest changing the first argument from the handle object to `handle.id`.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-06-17T17:13:49.929Z

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3948
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.bulk-actions.$bulkActionParam/route.tsx:48-62
Timestamp: 2026-06-17T17:13:49.929Z
Learning: In triggerdotdev/trigger.dev, within `dashboardLoader`/`dashboardAction` (or similar context resolver code) whenever you resolve an organization ID from an organization slug for RBAC/enterprise authorization scope, always read from the primary Prisma client (`prisma`), not `$replica`. Using `$replica` can hit replica-lag and cause the RBAC lookup/authorization to run without the correct org scope (bypassing intended role enforcement). Implement the slug→org lookup with `prisma.organization.findFirst(...)` (or equivalent primary-client query) and add an inline comment documenting why the primary client is required (replica lag could lead to unscoped RBAC checks).

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-06-23T13:04:21.413Z

Learnt from: carderne
Repo: triggerdotdev/trigger.dev PR: 4023
File: apps/webapp/app/services/upsertBranch.server.ts:14-18
Timestamp: 2026-06-23T13:04:21.413Z
Learning: In TypeScript, it’s valid to `import { type X }` and then use `typeof X` in a type-only position, e.g. `type Alias = z.infer<typeof X>`. The `type` modifier suppresses the runtime import, but the type checker still has the full exported type so `z.infer<typeof X>` can resolve correctly. In code reviews, don’t flag this as a TypeScript compile error as long as `typeof X` is used in a type context (e.g., with `z.infer`, `type` aliases, generics), not as a runtime value.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-05-05T09:38:02.512Z

Learnt from: d-cs
Repo: triggerdotdev/trigger.dev PR: 3523
File: apps/webapp/app/routes/api.v3.batches.ts:178-181
Timestamp: 2026-05-05T09:38:02.512Z
Learning: When reviewing code that catches `ServiceValidationError` in `*.server.ts` files, do not blindly forward `error.status` to HTTP responses, because SVEs may be thrown with non-default statuses (e.g., 400/500) and forwarding them can cause client-visible behavioral regressions (e.g., surfacing 500s to clients). Prefer a safe default response status of `error.status ?? 422`, but only after confirming via the reachable call graph that the caught `ServiceValidationError` instances are expected to carry those non-default statuses; otherwise, normalize to `422` to avoid unexpected client-visible 5xx behavior.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-05-12T21:04:05.815Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3542
File: apps/webapp/app/components/sessions/v1/SessionStatus.tsx:1-3
Timestamp: 2026-05-12T21:04:05.815Z
Learning: In this Remix + TypeScript codebase, do not flag a server/client boundary violation when a file imports only types from a module matching `*.server`.

Specifically, it’s safe to import types using `import type { Foo } from "*.server"` or `import { type Foo } from "*.server"` because TypeScript erases type-only imports at compile time and they emit no JavaScript, so they won’t cross the Remix server/client bundle boundary.

Only raise the boundary concern for value imports (e.g., `import { Foo }` without `type`, or `import Foo`), since those produce JavaScript output.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-06-25T18:21:51.905Z

Learnt from: carderne
Repo: triggerdotdev/trigger.dev PR: 4039
File: apps/webapp/app/routes/invite-revoke.tsx:0-0
Timestamp: 2026-06-25T18:21:51.905Z
Learning: During the Zod v4 migration in the triggerdotdev/trigger.dev webapp, ensure any imports from `conform-to/zod` use the Zod-4 subpath: `conform-to/zod/v4` (e.g., `import { parseWithZod } from "conform-to/zod/v4"`). Do not import from the package root `conform-to/zod`, because it is the Zod 3 implementation and may load Zod-3-only symbols (e.g., `ZodBranded`, `ZodEffects`), which can throw at module load (notably with `zod4.4.3`). This should be enforced across `apps/webapp/**/*` where helpers like `parseWithZod` and `conformZodMessage` are used.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-07-03T17:10:21.498Z

Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 4148
File: apps/webapp/app/models/orgMember.server.ts:149-168
Timestamp: 2026-07-03T17:10:21.498Z
Learning: In triggerdotdev/trigger.dev, `User.email` (Prisma schema: `internal-packages/database/prisma/schema.prisma`) currently does NOT use `citext` and does NOT have a `lower(email)` functional unique index. Therefore, do not introduce Prisma queries like `where: { email: { equals: <value>, mode: "insensitive" } }` (or any case-insensitive lookup) against `User.email`, because it can force sequential scans of the `users` table under load. During review, ensure email is normalized (e.g., lowercased/trimmed) before both writes and subsequent lookups, and if true case-insensitive behavior/uniqueness is required, implement it via a separate app-wide migration (e.g., switch to `citext` and/or add a functional unique index with backfill) rather than bolting it onto individual feature PRs.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-06-04T18:16:35.386Z

Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3836
File: apps/supervisor/src/backpressure/backpressureMonitor.ts:3-5
Timestamp: 2026-06-04T18:16:35.386Z
Learning: When reviewing TypeScript in this repo, apply the rule “prefer type aliases over interfaces” only to data/object shapes and union/intersection type modeling. If an interface is being used as a behavioral contract for collaborators to implement (e.g., method-shape interfaces that define required behavior, such as `BackpressureLogger` / `BackpressureSignalSource` in `apps/supervisor/src/backpressure/backpressureMonitor.ts`), keep it as an `interface` and do not flag it as a type-alias-vs-interface violation.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-06-09T17:58:04.699Z

Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3879
File: apps/webapp/app/models/vercelIntegration.server.ts:619-630
Timestamp: 2026-06-09T17:58:04.699Z
Learning: In this codebase, outbound raw `fetch` calls should typically rely on Node/undici’s default request timeout (about ~300s) rather than adding a per-call `AbortController` + `setTimeout` wrapper inside individual functions (e.g. in files like `apps/webapp/app/models/vercelIntegration.server.ts`). During code review, do not flag the absence of a per-call timeout on a single `fetch` as an issue; if per-call timeouts are needed, they should be implemented via a codebase-wide convention (e.g., a shared fetch wrapper or documented pattern) rather than ad-hoc per-function changes.

Applied to files:

apps/webapp/app/models/runtimeEnvironment.server.ts

📚 Learning: 2026-05-14T14:54:39.095Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3545
File: .server-changes/agent-view-sessions.md:10-10
Timestamp: 2026-05-14T14:54:39.095Z
Learning: In the `trigger.dev` repository, do not flag inconsistent dot vs slash notation in route/path strings inside `.server-changes/*.md` files. These markdown files are consumed verbatim into the changelog, so the mixed notation (e.g., `resources.orgs.../runs.$runParam/...`) is intentional and should be preserved as-is.

Applied to files:

.server-changes/run-ops-split-webapp-write-path.md

🪛 LanguageTool

.server-changes/run-ops-split-webapp-write-path.md

[grammar] ~6-~6: Ensure spelling is correct
Context: ...ea: webapp type: feature --- Route the webapp write path — trigger/batch run minting,...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🔇 Additional comments (2)

.server-changes/run-ops-split-webapp-write-path.md (1)

6-6: LGTM!

apps/webapp/app/models/runtimeEnvironment.server.ts (1)

361-375: LGTM!

devin-ai-integration

Devin Review found 1 new potential issue.

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 24f52900-ef0f-43d1-973c-1bc57696ef16

📥 Commits

Reviewing files that changed from the base of the PR and between 515b897 and ea22f52.

📒 Files selected for processing (2)

.server-changes/run-ops-split-webapp-write-path.md
apps/webapp/app/models/runtimeEnvironment.server.ts

📜 Review details

🔇 Additional comments (2)

.server-changes/run-ops-split-webapp-write-path.md (1)

6-6: LGTM!

apps/webapp/app/models/runtimeEnvironment.server.ts (1)

361-375: LGTM!

🛑 Comments failed to post (1)

apps/webapp/app/models/runtimeEnvironment.server.ts (1)

5-5: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect the resolver implementation to confirm tx-forwarding and return shape
fd -t f 'controlPlaneResolver.server.ts' -x cat -n {}

# Check downstream consumers of findEnvironmentFromRun for reliance on nested relations
rg -nP -A5 -B5 '\bfindEnvironmentFromRun\s*\(' --type=ts

Repository: triggerdotdev/trigger.dev

Length of output: 18773

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- apps/webapp/app/models/runtimeEnvironment.server.ts (relevant section) ---'
sed -n '240,340p' apps/webapp/app/models/runtimeEnvironment.server.ts

echo
echo '--- type definitions / exports in apps/webapp/app/models/runtimeEnvironment.server.ts ---'
rg -n "type EnvironmentFromRun|interface EnvironmentFromRun|export type EnvironmentFromRun|export async function findEnvironmentFromRun|resolveAuthenticatedEnv" apps/webapp/app/models/runtimeEnvironment.server.ts

echo
echo '--- downstream consumers of EnvironmentFromRun.environment ---'
rg -n -A3 -B3 "EnvironmentFromRun|environment\." apps/webapp/app/models -g '*.ts' -g '*.tsx'

echo
echo '--- callsites of findEnvironmentFromRun ---'
rg -n -A4 -B4 "\bfindEnvironmentFromRun\s*\(" apps/webapp -g '*.ts' -g '*.tsx'

Repository: triggerdotdev/trigger.dev

Length of output: 18779

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- runEngineHandlers usage around findEnvironmentFromRun result ---'
sed -n '597,660p' apps/webapp/app/v3/runEngineHandlers.server.ts

echo
echo '--- authenticated environment shape helpers ---'
rg -n -A4 -B4 "type AuthenticatedEnvironment|authIncludeBase|authIncludeWithParent|toAuthenticated" apps/webapp/app/models/runtimeEnvironment.server.ts

Repository: triggerdotdev/trigger.dev

Length of output: 6348

Forward tx into the environment lookup. findRun() respects the caller’s transaction, but controlPlaneResolver.resolveAuthenticatedEnv() always uses its own client, so a passed-in tx can still read the environment from a different connection/replica and break read-after-write consistency.

d-cs · 2026-07-03T17:39:06Z

Clarifying the test/setup.ts test-infra changes (for anyone reading the review note about "global mocks")

We are not globally mocking Redis. The ioredis handler is a thin subclass that only forces lazyConnect: true — it is the real ioredis client, it just doesn't open a socket at construction. Any test that issues a command (i.e. uses a real Redis testcontainer) connects and behaves exactly as in production.

What it actually fixes is a connection storm, not test correctness. ~20+ modules construct ioredis clients at import time (eager singletons: workers, cache stores, pub/sub, rate limiters, …). As the webapp suite has grown, importing the service graph across many test files × forked workers means all those eager clients dial simultaneously. CI has no Redis service, so that's a flood of ECONNREFUSED at shard scale that fails the suite before a single assertion runs. lazyConnect defers the dial to first use, so unused singletons stay quiet while real-Redis tests still connect and exercise real Redis.

Separately, a handful of pure-infra singletons (the workers, engine, marqs, socket.io, trace pub/sub, rate limiter, task-metadata cache) are stubbed to no-ops — but only ones no test exercises. Every test that needs real infra builds its own client from the testcontainer (redisOptions) and hits real Postgres/Redis; no product logic is mocked. (Verified: the real-Redis rate-limit tests still genuinely reject over-limit requests through this setup.)

d-cs · 2026-07-03T17:39:06Z

Follow-up (non-blocking): the cleaner long-term fix is to make those eager clients lazyConnect: true (or defer their construction) at the source, so importing a module never opens a Redis connection. Once that lands, this test-setup scaffold can be deleted entirely — the lazyConnect wrapper and the infra-singleton stubs only exist because construction currently has a side effect. Flagging it here so the scaffold isn't treated as load-bearing indefinitely.

…#4119) ## What Extends the ClickHouse runs-replication service to fan in from multiple Postgres sources (the control-plane DB and the run-ops DB) instead of a single source, plus the admin operations to run and observe it. - **Multi-source fan-in** (`services/runsReplicationService.server.ts`, new `runsReplicationInstance.server.ts`, `runsReplicationGlobal.server.ts`): factors the replication service into per-source instances and a coordinator so a single ClickHouse target is fed from more than one Postgres source. - **Admin ops** (`routes/admin.api.v1.runs-replication.status.ts`, `admin.api.v1.runs-replication.backfill.ts`, `v3/services/adminWorker.server.ts`): adds a status endpoint reporting per-source replication state and updates the backfill entrypoint for the multi-source shape. ## Why PR7 of the run-ops split stack, and the final piece: once run state can live in a separate run-ops DB (earlier PRs), the analytics replication into ClickHouse has to consume both sources so runs remain queryable regardless of residency. Behavior-changing for the replication service internals; the ClickHouse-facing output is unchanged (still one runs stream), and single-source operation is preserved when the split is not enabled. ## Tests New vitest coverage: `runsReplicationInstance.test.ts` (per-source instance behavior) and `runsReplicationService.part8`/`part9` suites exercising the multi-source coordinator. Testcontainers-backed (ClickHouse + Postgres); no mocks. ## Notes Draft, **stacked on #4118** (`runops/pr06-write-path`). Review that first; this diff is against it. Server-change / changeset note to be added at stack-assembly time. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

d-cs mentioned this pull request Jul 2, 2026

feat(run-ops): ClickHouse multi-source replication fan-in + admin ops #4119

Merged

d-cs self-assigned this Jul 2, 2026

This comment was marked as resolved.

Sign in to view

d-cs force-pushed the runops/pr05-webapp-foundation branch from 413a945 to 99643f8 Compare July 2, 2026 18:02

d-cs force-pushed the runops/pr06-write-path branch from 515b897 to cb97148 Compare July 2, 2026 18:02

This comment was marked as resolved.

Sign in to view

d-cs force-pushed the runops/pr05-webapp-foundation branch from 26871d5 to cdc4eb9 Compare July 2, 2026 19:25

d-cs force-pushed the runops/pr06-write-path branch from c59d9c5 to d5d7fa1 Compare July 2, 2026 19:25

d-cs force-pushed the runops/pr05-webapp-foundation branch from cdc4eb9 to e0b35d5 Compare July 2, 2026 20:21

d-cs force-pushed the runops/pr06-write-path branch 3 times, most recently from 0db90f0 to d5415e8 Compare July 2, 2026 21:44

d-cs force-pushed the runops/pr05-webapp-foundation branch from 8024e36 to f9b9b0b Compare July 3, 2026 08:51

d-cs force-pushed the runops/pr06-write-path branch from aa55b6b to 3153bc4 Compare July 3, 2026 08:51

d-cs force-pushed the runops/pr05-webapp-foundation branch from f9b9b0b to 0937b15 Compare July 3, 2026 10:02

d-cs force-pushed the runops/pr06-write-path branch from 3153bc4 to d561590 Compare July 3, 2026 10:02

d-cs force-pushed the runops/pr05-webapp-foundation branch from 0937b15 to 729daf1 Compare July 3, 2026 10:36

d-cs force-pushed the runops/pr06-write-path branch from d561590 to 9e7c367 Compare July 3, 2026 10:36

d-cs force-pushed the runops/pr05-webapp-foundation branch from 729daf1 to bd6fc79 Compare July 3, 2026 10:44

d-cs force-pushed the runops/pr06-write-path branch from 9e7c367 to e23432d Compare July 3, 2026 10:44

d-cs force-pushed the runops/pr05-webapp-foundation branch from bd6fc79 to a7e0846 Compare July 3, 2026 11:08

d-cs force-pushed the runops/pr06-write-path branch from e23432d to 8dff8b2 Compare July 3, 2026 11:08

d-cs force-pushed the runops/pr05-webapp-foundation branch from a7e0846 to 4119616 Compare July 3, 2026 12:08

d-cs force-pushed the runops/pr06-write-path branch 2 times, most recently from 891d81a to 5140cbc Compare July 3, 2026 15:42

d-cs force-pushed the runops/pr05-webapp-foundation branch from d087c25 to b554794 Compare July 3, 2026 16:33

d-cs force-pushed the runops/pr05-webapp-foundation branch from b554794 to 071cdc1 Compare July 3, 2026 16:44

d-cs force-pushed the runops/pr06-write-path branch from f8f3096 to 4bda37a Compare July 3, 2026 16:44

Base automatically changed from runops/pr05-webapp-foundation to main July 3, 2026 17:02

d-cs and others added 15 commits July 3, 2026 18:03

feat(run-ops): webapp write path — trigger/batch minting, idempotency…

f7bea8f

… routing, run lifecycle Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

style(run-ops): apply oxfmt

8ff98c2

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chore: add server-changes for pr06

fba183a

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chore(run-ops): fix lint/format for main lint rules

30bfdda

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(run-ops test): make engine/marqs no-op mock recursive for nested …

c1371f2

…method access Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

d-cs force-pushed the runops/pr06-write-path branch from 4bda37a to ea22f52 Compare July 3, 2026 17:08

d-cs marked this pull request as ready for review July 3, 2026 17:08

coderabbitai Bot reviewed Jul 3, 2026

View reviewed changes

devin-ai-integration Bot reviewed Jul 3, 2026

View reviewed changes

Comment thread apps/webapp/test/setup.ts

coderabbitai Bot reviewed Jul 3, 2026

View reviewed changes

ericallam approved these changes Jul 3, 2026

View reviewed changes

d-cs merged commit 84f3e1b into main Jul 3, 2026
40 checks passed

d-cs deleted the runops/pr06-write-path branch July 3, 2026 17:52

github-actions Bot mentioned this pull request Jul 3, 2026

chore: release v4.5.1 #4126

Open

Uh oh!

Uh oh!

Conversation

d-cs commented Jul 2, 2026

What

Why

Tests

Notes

Uh oh!

changeset-bot Bot commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai Bot commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Compact Metadata

❌ Failed checks (1 warning)

Uh oh!

This comment was marked as resolved.

Uh oh!

pkg-pr-new Bot commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as resolved.

Uh oh!

d-cs commented Jul 2, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

d-cs commented Jul 3, 2026

Uh oh!

d-cs commented Jul 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

changeset-bot Bot commented Jul 2, 2026 •

edited

Loading

coderabbitai Bot commented Jul 2, 2026 •

edited

Loading

pkg-pr-new Bot commented Jul 2, 2026 •

edited

Loading