[codex] Preserve managed endpoint failure diagnostics #3366

Closed
juliusmarminge wants to merge 4 commits into main from codex/preserve-managed-endpoint-diagnostics

@juliusmarminge juliusmarminge commented Jun 20, 2026

Member

Summary

  • preserve exact secret-store and schema-decode causes on typed configuration errors while logging only bounded structural tags
  • replace raw process/platform causes with stable error tags, bounded module/method fields, and connector identifiers
  • re-propagate connector-scope interruptions instead of logging them as runtime failures
  • use Effect.catchTags for tagged platform/config alternatives and keep user-facing spawn failures independent of internal causes
  • retain output redaction by logging only relay output length

Verification

  • vp test apps/server/src/cloud/ManagedEndpointRuntime.test.ts (13 passed)
  • vp check (passes with 20 pre-existing warnings)
  • vp run typecheck

Overlap audit

No active PR touches the three changed files (gh pr list --state open --limit 1000).


Note

Medium Risk
Changes cloud relay connector supervision, restart triggers on probe failure, and persisted config bootstrap. These paths matter for tunnel availability, but the change is scoped to managed endpoint runtime logging and reconciliation.

Overview
Cloud managed endpoint runtime logging and error handling are tightened so operators get useful structure without echoing secrets or nested failure text.

Logging now uses bounded diagnostic summaries (managedEndpointCauseDiagnostics, platformErrorDiagnostics) instead of raw cause objects or relay line content. Transport warnings log outputLength only, never the relay output itself, redacted or otherwise. Pure scope interruptions are re-propagated via logManagedEndpointCause and are not logged as supervisor/observer failures.
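As a rough illustration of the length-only transport warning described above, here is a minimal sketch in plain TypeScript. Only the `outputLength` field name comes from this PR; `transportWarningFields`, the message text, and `connectorId` as a field name are assumptions made for the example, not the code in ManagedEndpointRuntime.ts.

```typescript
// Hypothetical sketch: only `outputLength` is named in the PR description.
interface TransportWarningFields {
  readonly message: string;
  readonly connectorId: string;
  readonly outputLength: number;
}

// Build the log record for a relay output line without copying the line.
function transportWarningFields(
  connectorId: string,
  relayLine: string,
): TransportWarningFields {
  return {
    message: "relay client reported a transport warning", // assumed wording
    connectorId,
    // Only the length crosses into the log record; the content is dropped.
    outputLength: relayLine.length,
  };
}

const relayLine = "auth failed for token=abc123";
const transportRecord = transportWarningFields("connector-1", relayLine);
console.log(JSON.stringify(transportRecord));
```

Because the record never holds the line, a redaction bug cannot leak it; the trade-off is that operators see only that output arrived and how large it was.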

Config load wraps secret read/decode in tagged CloudManagedEndpointRuntimeConfigReadError / DecodeError (exact .cause retained on the error class); startup logs only errorTag, resource, and causeTag. decodeRuntimeConfig switches from decodeUnknownOption to decodeUnknownEffect.
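The split between what the error keeps and what the startup log shows could look roughly like the following. This is a plain TypeScript model, not the Effect tagged error classes the PR introduces; the two `_tag` strings and the log field names `errorTag`, `resource`, and `causeTag` come from the description above, while `causeTagOf`, `startupLogFields`, the resource string, and the 64-character bound are assumptions.

```typescript
// Hypothetical model: the PR defines these as tagged error classes.
type ConfigError =
  | {
      readonly _tag: "CloudManagedEndpointRuntimeConfigReadError";
      readonly resource: string;
      readonly cause: unknown; // exact cause retained on the error
    }
  | {
      readonly _tag: "CloudManagedEndpointRuntimeConfigDecodeError";
      readonly resource: string;
      readonly cause: unknown;
    };

// Reduce an arbitrary cause to a short structural tag (assumed 64-char bound).
function causeTagOf(cause: unknown): string {
  if (typeof cause === "object" && cause !== null && "_tag" in cause) {
    const tag = (cause as { readonly _tag: unknown })._tag;
    if (typeof tag === "string") return tag.slice(0, 64);
  }
  if (cause instanceof Error) return cause.name.slice(0, 64);
  return typeof cause;
}

// Only tags and the resource name are logged; messages and payloads are not.
function startupLogFields(error: ConfigError) {
  return {
    errorTag: error._tag,
    resource: error.resource,
    causeTag: causeTagOf(error.cause),
  };
}

const originalCause = new Error("secret store rejected value hunter2");
const readError: ConfigError = {
  _tag: "CloudManagedEndpointRuntimeConfigReadError",
  resource: "managed-endpoint-runtime-config", // assumed resource name
  cause: originalCause,
};
const startupFields = startupLogFields(readError);
console.log(JSON.stringify(startupFields));
```

Callers that need the full failure can still inspect `readError.cause`; the log line carries only the three structural fields.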

Runtime behavior: failed isRunning probes log a structured warning and treat the connector as dead (restart on next reconcile); spawn PlatformError yields a fixed user-facing reason (Failed to start the relay client.) while logs stay tag/module/method-only.
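The two runtime behaviors above can be sketched as pure decisions. The fixed reason string is quoted from this PR; everything else (`ProbeResult`, `decideFromProbe`, `describeSpawnFailure`, the field layout, and the choice to restart a connector whose probe succeeds but reports not running) is an assumption made for illustration.

```typescript
// Hypothetical probe outcome: either a definite answer or a tagged failure.
type ProbeResult =
  | { readonly kind: "ok"; readonly running: boolean }
  | {
      readonly kind: "failed";
      readonly errorTag: string;
      readonly module: string;
      readonly method: string;
    };

interface ProbeDecision {
  readonly restartOnNextReconcile: boolean;
  readonly warning: Readonly<Record<string, string>> | null;
}

// A failed probe is treated the same as a dead connector, plus a warning.
function decideFromProbe(connectorId: string, probe: ProbeResult): ProbeDecision {
  if (probe.kind === "ok") {
    return { restartOnNextReconcile: !probe.running, warning: null };
  }
  return {
    restartOnNextReconcile: true,
    warning: {
      connectorId,
      errorTag: probe.errorTag,
      module: probe.module,
      method: probe.method,
    },
  };
}

// Fixed user-facing reason quoted from the PR description.
const SPAWN_FAILURE_REASON = "Failed to start the relay client.";

// The description is accepted but deliberately never copied anywhere.
function describeSpawnFailure(error: {
  readonly _tag: string;
  readonly module: string;
  readonly method: string;
  readonly description: string;
}) {
  return {
    userFacing: { reason: SPAWN_FAILURE_REASON },
    logFields: { errorTag: error._tag, module: error.module, method: error.method },
  };
}

const probeDecision = decideFromProbe("connector-1", {
  kind: "failed",
  errorTag: "PlatformError",
  module: "ChildProcess",
  method: "isRunning",
});
const spawnReport = describeSpawnFailure({
  _tag: "PlatformError",
  module: "ChildProcess",
  method: "spawn",
  description: "ENOENT: /opt/relay/bin/client",
});
console.log(probeDecision.restartOnNextReconcile, spawnReport.userFacing.reason);
```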

Reviewed by Cursor Bugbot for commit 96f192e. Bugbot is set up for automated code reviews on this repo. Configure here.

Note

Preserve structured failure diagnostics in managed endpoint logging

  • Replaces raw cause/description logging in CloudManagedEndpointRuntime with bounded diagnostic summaries (tag counts, module/method fields) via a new managedEndpointCauseDiagnostics helper.
  • Adds logManagedEndpointCause to handle mixed failure/interruption causes: pure interruptions are never logged, mixed causes are logged with diagnostics while the interrupt cause is still propagated.
  • Adds platformErrorDiagnostics for consistent flat diagnostic objects from PlatformError without nested cause strings.
  • Introduces CloudManagedEndpointRuntimeConfigReadError and CloudManagedEndpointRuntimeConfigDecodeError tagged error classes; readRuntimeConfig and decodeRuntimeConfig now surface distinct typed failures instead of returning Option.
  • Relay client output logging now records only output length rather than copying potentially sensitive line content.
  • Behavioral Change: spawn failures now return the fixed reason 'Failed to start the relay client.' with no nested error detail; isRunning probe failures with PlatformError now trigger a restart instead of being treated as not-running silently.
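To make "tag counts" and "flat diagnostic objects" concrete, here is one possible shape for such summaries. It is a sketch under stated assumptions: the `Reason` model, both function names ending in `Sketch`, and the two numeric bounds are hypothetical, and the real helpers operate on Effect `Cause` and `PlatformError` values rather than these plain types.

```typescript
// Hypothetical flat model of a cause: a list of reasons.
type Reason =
  | { readonly kind: "fail"; readonly error: unknown }
  | { readonly kind: "die"; readonly defect: unknown }
  | { readonly kind: "interrupt" };

const MAX_FIELD_LENGTH = 64; // assumed bound on any logged string
const MAX_DISTINCT_TAGS = 8; // assumed bound on the number of counted tags

function bounded(value: string): string {
  return value.length > MAX_FIELD_LENGTH ? value.slice(0, MAX_FIELD_LENGTH) : value;
}

function tagOf(value: unknown): string {
  if (typeof value === "object" && value !== null && "_tag" in value) {
    const tag = (value as { readonly _tag: unknown })._tag;
    if (typeof tag === "string") return bounded(tag);
  }
  return "Unknown";
}

// Summarize a cause as counts per tag; no messages or nested causes are kept.
function causeDiagnosticsSketch(reasons: ReadonlyArray<Reason>) {
  const tagCounts = new Map<string, number>();
  let interruptCount = 0;
  for (const reason of reasons) {
    if (reason.kind === "interrupt") {
      interruptCount += 1;
      continue;
    }
    const tag = tagOf(reason.kind === "fail" ? reason.error : reason.defect);
    // Count known tags always; admit new tags only while under the cap.
    if (tagCounts.has(tag) || tagCounts.size < MAX_DISTINCT_TAGS) {
      tagCounts.set(tag, (tagCounts.get(tag) ?? 0) + 1);
    }
  }
  return {
    reasonCount: reasons.length,
    interruptCount,
    tagCounts: Object.fromEntries(tagCounts),
  };
}

// Flatten a platform-style error to tag, module, and method only.
function platformErrorDiagnosticsSketch(error: {
  readonly _tag: string;
  readonly module: string;
  readonly method: string;
}) {
  return {
    errorTag: bounded(error._tag),
    module: bounded(error.module),
    method: bounded(error.method),
  };
}

const spawnError = {
  _tag: "PlatformError",
  module: "ChildProcess",
  method: "spawn",
  description: "ENOENT: /home/user/.secrets/relay",
};
const causeSummary = causeDiagnosticsSketch([
  { kind: "fail", error: spawnError },
  { kind: "fail", error: spawnError },
  { kind: "interrupt" },
]);
const platformSummary = platformErrorDiagnosticsSketch(spawnError);
console.log(JSON.stringify(causeSummary), JSON.stringify(platformSummary));
```

The size of each summary is bounded by the two constants regardless of how large or deeply nested the original cause is.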

Macroscope summarized 96f192e.

@juliusmarminge juliusmarminge added the vouch:trusted PR author is trusted by repo permissions or the VOUCHED list. label Jun 20, 2026
coderabbitai Bot commented Jun 20, 2026


Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e2f6ae73-1dfc-40f9-b68c-0bf23071eb73

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added the size:M 30-99 changed lines (additions + deletions). label Jun 20, 2026
macroscopeapp Bot previously approved these changes Jun 20, 2026
macroscopeapp Bot commented Jun 20, 2026

Contributor

Approvability

Verdict: Needs human review

This PR modifies error handling behavior across multiple code paths in the managed endpoint runtime, changing what diagnostic information is logged and introducing new error types. While the changes appear security-positive (avoiding sensitive data in logs), the substantive modifications to error handling infrastructure warrant human review.

No code changes detected at 96f192e. Prior analysis still applies.

You can customize Macroscope's approvability policy. Learn more.

@macroscopeapp macroscopeapp Bot dismissed their stale review June 20, 2026 17:06

Dismissing prior approval to re-evaluate c9180e8

macroscopeapp Bot previously approved these changes Jun 20, 2026
@juliusmarminge juliusmarminge force-pushed the codex/preserve-managed-endpoint-diagnostics branch from c9180e8 to 2f9cc13 Compare June 20, 2026 18:01
@macroscopeapp macroscopeapp Bot dismissed their stale review June 20, 2026 18:01

Dismissing prior approval to re-evaluate 2f9cc13

@github-actions github-actions Bot added size:L 100-499 changed lines (additions + deletions). and removed size:M 30-99 changed lines (additions + deletions). labels Jun 20, 2026
@juliusmarminge juliusmarminge force-pushed the codex/preserve-managed-endpoint-diagnostics branch from 2f9cc13 to b7a1982 Compare June 20, 2026 22:41

@cursor cursor Bot left a comment

Contributor

Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Mixed causes drop failure diagnostics
    • Changed the interrupt-only check from interruptionReasons.length > 0 to interruptionReasons.length === cause.reasons.length so logging is only suppressed for purely-interrupt causes, and added a mixed-cause branch that logs diagnostics before re-propagating interrupts.

Or push these changes by commenting:

@cursor push 42a4fe013d
Preview (42a4fe013d)
diff --git a/apps/server/src/cloud/ManagedEndpointRuntime.ts b/apps/server/src/cloud/ManagedEndpointRuntime.ts
--- a/apps/server/src/cloud/ManagedEndpointRuntime.ts
+++ b/apps/server/src/cloud/ManagedEndpointRuntime.ts
@@ -88,13 +88,19 @@
   attributes: Readonly<Record<string, unknown>>,
 ) {
   const interruptionReasons = cause.reasons.filter(Cause.isInterruptReason);
-  if (interruptionReasons.length > 0) {
+  if (interruptionReasons.length > 0 && interruptionReasons.length === cause.reasons.length) {
     return Effect.failCause(Cause.fromReasons<never>(interruptionReasons));
   }
-  return Effect.logWarning(message, {
+  const log = Effect.logWarning(message, {
     ...attributes,
     ...managedEndpointCauseDiagnostics(cause),
   });
+  if (interruptionReasons.length > 0) {
+    return log.pipe(
+      Effect.andThen(Effect.failCause(Cause.fromReasons<never>(interruptionReasons))),
+    );
+  }
+  return log;
 }
 
 function platformErrorDiagnostics(error: PlatformError.PlatformError) {
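
The three outcomes of the patched branch logic in the preview above (pure interruption, mixed cause, failure only) can be modelled without Effect. This is a sketch, not the repository code: `SimpleReason`, `CauseOutcome`, and `classifyCause` are hypothetical names, and the real function returns effects rather than a plain record.

```typescript
// Hypothetical stand-in for the reasons carried by a cause.
type SimpleReason =
  | { readonly kind: "fail"; readonly tag: string }
  | { readonly kind: "interrupt" };

interface CauseOutcome {
  readonly logged: boolean; // would a warning with diagnostics be emitted?
  readonly propagated: ReadonlyArray<SimpleReason>; // interrupts re-raised
}

function classifyCause(reasons: ReadonlyArray<SimpleReason>): CauseOutcome {
  const interrupts = reasons.filter((reason) => reason.kind === "interrupt");
  // Pure interruption: every reason is an interrupt, so nothing is logged.
  if (interrupts.length > 0 && interrupts.length === reasons.length) {
    return { logged: false, propagated: interrupts };
  }
  // Mixed or failure-only: log first, then re-raise any interrupts.
  return { logged: true, propagated: interrupts };
}

const pureInterrupt = classifyCause([{ kind: "interrupt" }]);
const mixedCause = classifyCause([
  { kind: "fail", tag: "PlatformError" },
  { kind: "interrupt" },
]);
const failureOnly = classifyCause([{ kind: "fail", tag: "PlatformError" }]);
console.log(pureInterrupt.logged, mixedCause.logged, failureOnly.logged);
```

Before the fix, the `length > 0` check alone sent the mixed case down the first branch, so its failure diagnostics were never logged.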

Reviewed by Cursor Bugbot for commit b7a1982. Configure here.

Comment thread apps/server/src/cloud/ManagedEndpointRuntime.ts
@juliusmarminge juliusmarminge force-pushed the codex/preserve-managed-endpoint-diagnostics branch 14 times, most recently from a42406d to e1eb3fb Compare June 21, 2026 02:20
juliusmarminge and others added 3 commits June 20, 2026 19:21
Co-authored-by: codex <codex@users.noreply.github.com>
Co-authored-by: codex <codex@users.noreply.github.com>
Co-authored-by: codex <codex@users.noreply.github.com>
Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge juliusmarminge force-pushed the codex/preserve-managed-endpoint-diagnostics branch from e1eb3fb to 96f192e Compare June 21, 2026 02:22
Labels

size:L 100-499 changed lines (additions + deletions). vouch:trusted PR author is trusted by repo permissions or the VOUCHED list.

1 participant