[CRE] Add test framework support for confidential relay DON#21396

Closed
nadahalli wants to merge 1 commit into develop from tejaswi/cre-relay-don-feature

@nadahalli
Contributor

Confidential CRE Workflows (implementation plan | relay DON design)

Summary

The CRE test framework can't spin up a relay DON because the gateway job spec generator doesn't know about the confidential-compute-relay handler type, and there's no Feature to wire it. This adds both, unblocking remote-mode E2E tests in confidential-compute.

Changes

  • deployment/cre/jobs/pkg/gateway_job.go: Add GatewayHandlerTypeConfidentialRelay constant, confidentialRelayHandlerConfig struct, newDefaultConfidentialRelayHandler() factory, and switch case in Resolve(). Config is NodeRateLimiter-only; RequestTimeoutSec defaults to 30 in the handler constructor.
  • system-tests/lib/cre/types.go: Add ConfidentialRelayCapability flag.
  • system-tests/lib/cre/features/confidential_relay/confidential_relay.go (new): Feature implementation. PreEnvStartup adds the gateway handler, configures gateway access, and propagates CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS / CL_CONFIDENTIAL_RELAY_CA_ROOTS_PEM from CapabilityConfig.Values to NodeSet.EnvVars. PostEnvStartup is a no-op; the relay handler auto-starts via env var gate. No on-chain capability registration (the relay handler is a CRE subservice, not a registered capability).
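
A minimal sketch of the gateway_job.go change described above, assuming simplified stand-in types. Only the names GatewayHandlerTypeConfidentialRelay, confidentialRelayHandlerConfig, newDefaultConfidentialRelayHandler, and the handler type name come from this PR; the rate limiter fields, their values, and the resolveHandlers helper are hypothetical and only illustrate the shape of the factory plus switch case.

```go
package main

import (
	"errors"
	"fmt"
)

// Handler type name as given in the PR description.
const GatewayHandlerTypeConfidentialRelay = "confidential-compute-relay"

// nodeRateLimiterConfig is a hypothetical stand-in; the field names and
// the default values used below are assumptions, not the real config.
type nodeRateLimiterConfig struct {
	GlobalBurst    int
	GlobalRPS      int
	PerSenderBurst int
	PerSenderRPS   int
}

// confidentialRelayHandlerConfig carries only a rate limiter, matching the
// "NodeRateLimiter-only" note in the PR description.
type confidentialRelayHandlerConfig struct {
	NodeRateLimiter nodeRateLimiterConfig
}

// handler is a simplified stand-in for the generator's handler type.
type handler struct {
	Name   string
	Config any
}

func newDefaultConfidentialRelayHandler() handler {
	return handler{
		Name: GatewayHandlerTypeConfidentialRelay,
		Config: confidentialRelayHandlerConfig{
			NodeRateLimiter: nodeRateLimiterConfig{
				GlobalBurst: 10, GlobalRPS: 50, PerSenderBurst: 10, PerSenderRPS: 10,
			},
		},
	}
}

// resolveHandlers is a hypothetical helper that mirrors the switch case
// added to Resolve(): each requested type maps to its default handler.
func resolveHandlers(types []string) ([]handler, error) {
	hs := make([]handler, 0, len(types))
	for _, t := range types {
		switch t {
		case GatewayHandlerTypeConfidentialRelay:
			hs = append(hs, newDefaultConfidentialRelayHandler())
		default:
			return nil, errors.New("unknown handler type: " + t)
		}
	}
	return hs, nil
}

func main() {
	hs, err := resolveHandlers([]string{GatewayHandlerTypeConfidentialRelay})
	if err != nil {
		panic(err)
	}
	fmt.Println(hs[0].Name)
}
```

Rejecting unknown types in the default branch is a choice made for this sketch; the real Resolve() may handle them differently.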

Related PRs

Add gateway handler type, capability flag, and Feature implementation
so the CRE test framework can spin up a relay DON for remote-mode E2E
tests.
Copilot AI review requested due to automatic review settings March 4, 2026 13:01
@nadahalli nadahalli requested review from a team as code owners March 4, 2026 13:01
@github-actions
Contributor

github-actions bot commented Mar 4, 2026

👋 nadahalli, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@github-actions
Contributor

github-actions bot commented Mar 4, 2026

✅ No conflicts with other open PRs targeting develop

Contributor

Copilot AI left a comment

Pull request overview

This PR extends the CRE system-test framework to support spinning up and configuring a confidential relay DON by teaching the gateway job-spec generator about a new gateway handler type and adding a corresponding CRE Feature/Capability flag.

Changes:

  • Add confidential-compute-relay gateway handler type support to the gateway job-spec generator (default config + resolve switch case).
  • Add a new CRE capability flag confidential-relay.
  • Add a new confidential_relay Feature that wires the gateway handler into topology and propagates required env vars to DON nodes.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| deployment/cre/jobs/pkg/gateway_job.go | Adds a new gateway handler type constant and default handler config, and resolves it into generated gateway TOML. |
| system-tests/lib/cre/types.go | Adds the ConfidentialRelayCapability flag constant. |
| system-tests/lib/cre/features/confidential_relay/confidential_relay.go | New Feature that adds the gateway handler + gateway access config and forwards confidential relay env vars from capability config. |


func newDefaultConfidentialRelayHandler() handler {
	return handler{
		Name: GatewayHandlerTypeConfidentialRelay,

Copilot AI Mar 4, 2026

newDefaultConfidentialRelayHandler() doesn’t set ServiceName. In legacy gateway config, non-legacy JSON-RPC requests (without params.body.don_id) are routed by jsonRequest.ServiceName() via the serviceNameToDonID map, which is populated from handler ServiceName fields; without it, requests for this handler’s methods will fail with "Service name not found".

Set ServiceName to the method prefix used by the confidential relay handler (and keep it consistent with the handler’s exported Methods()), similar to how GatewayHandlerTypeHTTPCapabilities uses ServiceName: "workflows".

Suggested change
- Name: GatewayHandlerTypeConfidentialRelay,
+ Name: GatewayHandlerTypeConfidentialRelay,
+ ServiceName: GatewayHandlerTypeConfidentialRelay,
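
The routing behaviour described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the gateway's actual code: handlerSpec, buildServiceIndex, and route are hypothetical names, the DON ID "relay-don" is made up, and deriving the service name from the part of the method before the first dot is an assumption about the method format.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// handlerSpec is a hypothetical, reduced view of a gateway handler entry.
type handlerSpec struct {
	Name        string
	ServiceName string
}

// buildServiceIndex maps each non-empty ServiceName to the DON that owns it,
// in the spirit of the serviceNameToDonID map mentioned in the comment.
func buildServiceIndex(donID string, hs []handlerSpec) map[string]string {
	idx := make(map[string]string)
	for _, h := range hs {
		if h.ServiceName != "" {
			idx[h.ServiceName] = donID
		}
	}
	return idx
}

// route resolves a request method to a DON ID. It assumes methods look like
// "<service>.<method>"; that format is an assumption made for this sketch.
func route(idx map[string]string, method string) (string, error) {
	service, _, _ := strings.Cut(method, ".")
	donID, ok := idx[service]
	if !ok {
		return "", errors.New("service name not found: " + service)
	}
	return donID, nil
}

func main() {
	// Without ServiceName the index is empty and routing fails.
	missing := buildServiceIndex("relay-don", []handlerSpec{
		{Name: "confidential-compute-relay"},
	})
	_, err := route(missing, "confidential-compute-relay.example_method")
	fmt.Println("without ServiceName:", err)

	// With ServiceName set, the same request resolves to the DON.
	set := buildServiceIndex("relay-don", []handlerSpec{{
		Name:        "confidential-compute-relay",
		ServiceName: "confidential-compute-relay",
	}})
	donID, err := route(set, "confidential-compute-relay.example_method")
	fmt.Println("with ServiceName:", donID, err)
}
```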

Comment on lines +105 to +106
case GatewayHandlerTypeConfidentialRelay:
	hs = append(hs, newDefaultConfidentialRelayHandler())

Copilot AI Mar 4, 2026

A new handler type was added to Resolve() but there’s no corresponding test coverage in gateway_job_test.go to assert the generated TOML includes the confidential relay handler block and its default config. Add a TestGateway_Resolve_WithConfidentialRelayHandler (similar to the vault/http-capabilities tests) to prevent accidental regressions in handler name/config formatting.

Comment on lines +46 to +61
// Set env vars from capability config to activate the confidential relay handler on DON nodes.
// The handler is gated by CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS; without it, the subservice won't start.
capConfig, ok := don.CapabilityConfigs[flag]
if ok && capConfig.Values != nil {
	ns := don.MustNodeSet()
	if ns.EnvVars == nil {
		ns.EnvVars = make(map[string]string)
	}

	if v, exists := capConfig.Values["trustedPCRs"]; exists {
		ns.EnvVars["CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS"] = fmt.Sprintf("%v", v)
	}
	if v, exists := capConfig.Values["caRootsPEM"]; exists {
		ns.EnvVars["CL_CONFIDENTIAL_RELAY_CA_ROOTS_PEM"] = fmt.Sprintf("%v", v)
	}
}

Copilot AI Mar 4, 2026

This feature currently silently proceeds when the confidential relay capability config is missing or when trustedPCRs/caRootsPEM are present but not strings. Since the relay subservice is gated by CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS, this can lead to confusing “feature enabled but nothing starts” failures.

Consider validating that trustedPCRs is provided and is either a JSON string or a structured value that you explicitly json.Marshal to a string; otherwise, return an error. Avoid fmt.Sprintf("%v", v) here because it will produce Go’s formatting for maps/slices rather than JSON.
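
A small sketch of the validation suggested here, assuming trustedPCRs may arrive either as a string or as a structured value. The envValue helper and the sample PCR map are hypothetical; the real shape of trustedPCRs is not shown in this PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// envValue converts a capability config value into an env var string.
// Strings pass through unchanged; structured values are marshalled to JSON;
// missing or empty values are rejected instead of being silently skipped.
func envValue(key string, v any) (string, error) {
	switch t := v.(type) {
	case nil:
		return "", errors.New(key + ": value is missing")
	case string:
		if t == "" {
			return "", errors.New(key + ": value is empty")
		}
		return t, nil
	default:
		b, err := json.Marshal(t)
		if err != nil {
			return "", fmt.Errorf("%s: cannot marshal to JSON: %w", key, err)
		}
		return string(b), nil
	}
}

func main() {
	// Hypothetical structured value, used only to show the difference.
	pcrs := map[string]any{"pcr0": "aa", "pcr1": "bb"}

	// Go's default formatting is not JSON.
	fmt.Printf("%v\n", pcrs) // map[pcr0:aa pcr1:bb]

	s, err := envValue("trustedPCRs", pcrs)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"pcr0":"aa","pcr1":"bb"}

	// A missing value surfaces as an error rather than a feature that never starts.
	_, err = envValue("trustedPCRs", nil)
	fmt.Println(err)
}
```

Returning an error for a missing trustedPCRs value makes the "feature enabled but nothing starts" case fail at setup time rather than later in the test run.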

Comment on lines +16 to +22
const flag = cre.ConfidentialRelayCapability

type ConfidentialRelay struct{}

func (o *ConfidentialRelay) Flag() cre.CapabilityFlag {
	return flag
}

Copilot AI Mar 4, 2026

The new ConfidentialRelay feature/flag won’t be usable via the default CRE environment wiring unless it’s registered in the default feature set and flag providers. system-tests/lib/cre/features/sets/sets.go currently constructs the feature set without this feature, and system-tests/lib/cre/flags/provider.go doesn’t include cre.ConfidentialRelayCapability in SupportedCapabilityFlags().

Add the new feature to sets.New() and include the new capability in the relevant capability-flags providers so the framework can select and execute it when the flag is specified.

@nadahalli
Contributor Author

Consolidated into #21375

@nadahalli nadahalli closed this Mar 5, 2026