[CRE] Add test framework support for confidential relay DON #21396
Add gateway handler type, capability flag, and Feature implementation so the CRE test framework can spin up a relay DON for remote-mode E2E tests.
👋 nadahalli, thanks for creating this pull request! To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team. Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!
Pull request overview
This PR extends the CRE system-test framework to support spinning up and configuring a confidential relay DON by teaching the gateway job-spec generator about a new gateway handler type and adding a corresponding CRE Feature/Capability flag.
Changes:
- Add `confidential-compute-relay` gateway handler type support to the gateway job-spec generator (default config + resolve switch case).
- Add a new CRE capability flag `confidential-relay`.
- Add a new `confidential_relay` Feature that wires the gateway handler into topology and propagates required env vars to DON nodes.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `deployment/cre/jobs/pkg/gateway_job.go` | Adds a new gateway handler type constant and default handler config, and resolves it into generated gateway TOML. |
| `system-tests/lib/cre/types.go` | Adds the `ConfidentialRelayCapability` flag constant. |
| `system-tests/lib/cre/features/confidential_relay/confidential_relay.go` | New Feature that adds the gateway handler + gateway access config and forwards confidential relay env vars from capability config. |
    func newDefaultConfidentialRelayHandler() handler {
        return handler{
            Name: GatewayHandlerTypeConfidentialRelay,
`newDefaultConfidentialRelayHandler()` doesn’t set `ServiceName`. In legacy gateway config, non-legacy JSON-RPC requests (without `params.body.don_id`) are routed by `jsonRequest.ServiceName()` via the `serviceNameToDonID` map, which is populated from handler `ServiceName` fields; without it, requests for this handler’s methods will fail with "Service name not found".
Set `ServiceName` to the method prefix used by the confidential relay handler (and keep it consistent with the handler’s exported `Methods()`), similar to how `GatewayHandlerTypeHTTPCapabilities` uses `ServiceName: "workflows"`.
Suggested change:
    - Name: GatewayHandlerTypeConfidentialRelay,
    + Name: GatewayHandlerTypeConfidentialRelay,
    + ServiceName: GatewayHandlerTypeConfidentialRelay,
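To illustrate why a missing `ServiceName` makes requests unroutable, here is a minimal sketch of the lookup described above. It is not the gateway's actual code: `handlerSpec`, `buildServiceMap`, `routeByMethod`, the `"relay-don"` ID, the `.example` method suffix, and the assumption that the service name is the part of the method before the first `.` are all hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"strings"
)

// handlerSpec is a hypothetical stand-in for a gateway handler entry.
type handlerSpec struct {
	Name        string
	ServiceName string
}

// buildServiceMap mimics the idea behind serviceNameToDonID:
// only handlers that declare a ServiceName become routable.
func buildServiceMap(donID string, handlers []handlerSpec) map[string]string {
	m := make(map[string]string)
	for _, h := range handlers {
		if h.ServiceName != "" {
			m[h.ServiceName] = donID
		}
	}
	return m
}

// routeByMethod resolves a DON ID from a method name.
// Assumption: the service name is the prefix before the first ".".
func routeByMethod(m map[string]string, method string) (string, error) {
	service, _, _ := strings.Cut(method, ".")
	donID, ok := m[service]
	if !ok {
		return "", fmt.Errorf("service name not found: %q", service)
	}
	return donID, nil
}

func main() {
	const method = "confidential-compute-relay.example" // hypothetical method name

	// Handler without ServiceName: the lookup fails.
	without := buildServiceMap("relay-don", []handlerSpec{
		{Name: "confidential-compute-relay"},
	})
	_, err := routeByMethod(without, method)
	fmt.Println("without ServiceName:", err)

	// Handler with ServiceName: the lookup succeeds.
	with := buildServiceMap("relay-don", []handlerSpec{
		{Name: "confidential-compute-relay", ServiceName: "confidential-compute-relay"},
	})
	donID, err := routeByMethod(with, method)
	fmt.Println("with ServiceName:", donID, err)
}
```

The real prefix must match whatever the relay handler's `Methods()` actually exports, which this sketch does not know.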
    case GatewayHandlerTypeConfidentialRelay:
        hs = append(hs, newDefaultConfidentialRelayHandler())
A new handler type was added to `Resolve()` but there’s no corresponding test coverage in `gateway_job_test.go` to assert the generated TOML includes the confidential relay handler block and its default config. Add a `TestGateway_Resolve_WithConfidentialRelayHandler` (similar to the vault/http-capabilities tests) to prevent accidental regressions in handler name/config formatting.
    // Set env vars from capability config to activate the confidential relay handler on DON nodes.
    // The handler is gated by CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS; without it, the subservice won't start.
    capConfig, ok := don.CapabilityConfigs[flag]
    if ok && capConfig.Values != nil {
        ns := don.MustNodeSet()
        if ns.EnvVars == nil {
            ns.EnvVars = make(map[string]string)
        }

        if v, exists := capConfig.Values["trustedPCRs"]; exists {
            ns.EnvVars["CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS"] = fmt.Sprintf("%v", v)
        }
        if v, exists := capConfig.Values["caRootsPEM"]; exists {
            ns.EnvVars["CL_CONFIDENTIAL_RELAY_CA_ROOTS_PEM"] = fmt.Sprintf("%v", v)
        }
    }
This feature currently silently proceeds when the confidential relay capability config is missing or when `trustedPCRs`/`caRootsPEM` are present but not strings. Since the relay subservice is gated by `CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS`, this can lead to confusing “feature enabled but nothing starts” failures.
Consider validating that `trustedPCRs` is provided and is either a JSON string or a structured value that you explicitly `json.Marshal` to a string; otherwise, return an error. Avoid `fmt.Sprintf("%v", v)` here because it will produce Go’s formatting for maps/slices rather than JSON.
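The difference between `%v` formatting and JSON encoding can be seen in a small sketch. It assumes `capConfig.Values` behaves like a `map[string]any`; `envValueFromConfig` and the sample `pcr0` value are hypothetical and not part of the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envValueFromConfig is a hypothetical helper: it returns the value for key
// as an env-var string, passing strings through unchanged and JSON-encoding
// structured values. A missing, nil, or empty value is reported as an error.
func envValueFromConfig(values map[string]any, key string) (string, error) {
	v, ok := values[key]
	if !ok || v == nil {
		return "", fmt.Errorf("confidential relay: %q is required", key)
	}
	if s, isString := v.(string); isString {
		if s == "" {
			return "", fmt.Errorf("confidential relay: %q must not be empty", key)
		}
		return s, nil
	}
	b, err := json.Marshal(v)
	if err != nil {
		return "", fmt.Errorf("confidential relay: encoding %q: %w", key, err)
	}
	return string(b), nil
}

func main() {
	// Sample structured value; the real shape of trustedPCRs is not shown in the PR.
	values := map[string]any{
		"trustedPCRs": map[string]any{"pcr0": "abc"},
	}

	// Go's default formatting is not valid JSON.
	fmt.Printf("%v\n", values["trustedPCRs"]) // prints: map[pcr0:abc]

	// Explicit JSON encoding yields a parseable string.
	s, err := envValueFromConfig(values, "trustedPCRs")
	fmt.Println(s, err) // prints: {"pcr0":"abc"} <nil>

	// A missing key surfaces as an error rather than a silent no-op.
	_, err = envValueFromConfig(values, "caRootsPEM")
	fmt.Println(err)
}
```

Whether `caRootsPEM` should be required or optional is a design decision for the feature; the sketch treats every requested key as required only for brevity.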
    const flag = cre.ConfidentialRelayCapability

    type ConfidentialRelay struct{}

    func (o *ConfidentialRelay) Flag() cre.CapabilityFlag {
        return flag
    }
The new `ConfidentialRelay` feature/flag won’t be usable via the default CRE environment wiring unless it’s registered in the default feature set and flag providers. `system-tests/lib/cre/features/sets/sets.go` currently constructs the feature set without this feature, and `system-tests/lib/cre/flags/provider.go` doesn’t include `cre.ConfidentialRelayCapability` in `SupportedCapabilityFlags()`.
Add the new feature to `sets.New()` and include the new capability in the relevant capability-flags providers so the framework can select and execute it when the flag is specified.
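The two registration points can be pictured with a simplified sketch. It does not reproduce the real `sets.New()` or `SupportedCapabilityFlags()` signatures; `CapabilityFlag`, `Feature`, and `usable` below are hypothetical stand-ins that only show why both lists must include the new flag.

```go
package main

import "fmt"

// CapabilityFlag and Feature are simplified stand-ins for the cre package types.
type CapabilityFlag = string

type Feature interface {
	Flag() CapabilityFlag
}

type ConfidentialRelay struct{}

func (o *ConfidentialRelay) Flag() CapabilityFlag { return "confidential-relay" }

// usable reports whether a flag is both advertised by a flag provider and
// backed by a registered feature. Hypothetical helper for illustration only.
func usable(flag CapabilityFlag, supported []CapabilityFlag, features []Feature) bool {
	advertised := false
	for _, f := range supported {
		if f == flag {
			advertised = true
			break
		}
	}
	if !advertised {
		return false
	}
	for _, feat := range features {
		if feat.Flag() == flag {
			return true
		}
	}
	return false
}

func main() {
	flag := "confidential-relay"

	// Feature implemented but registered in neither list.
	fmt.Println(usable(flag, nil, nil))

	// Registered in the feature set but not advertised by the provider.
	fmt.Println(usable(flag, nil, []Feature{&ConfidentialRelay{}}))

	// Registered in both places.
	fmt.Println(usable(flag, []CapabilityFlag{flag}, []Feature{&ConfidentialRelay{}}))
}
```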
Consolidated into #21375




Summary
The CRE test framework can't spin up a relay DON because the gateway job spec generator doesn't know about the `confidential-compute-relay` handler type, and there's no Feature to wire it. This adds both, unblocking remote-mode E2E tests in `confidential-compute`.

Changes
- `deployment/cre/jobs/pkg/gateway_job.go`: Add `GatewayHandlerTypeConfidentialRelay` constant, `confidentialRelayHandlerConfig` struct, `newDefaultConfidentialRelayHandler()` factory, and switch case in `Resolve()`. Config is NodeRateLimiter-only; `RequestTimeoutSec` defaults to 30 in the handler constructor.
- `system-tests/lib/cre/types.go`: Add `ConfidentialRelayCapability` flag.
- `system-tests/lib/cre/features/confidential_relay/confidential_relay.go` (new): Feature implementation. `PreEnvStartup` adds the gateway handler, configures gateway access, and propagates `CL_CONFIDENTIAL_RELAY_TRUSTED_PCRS` / `CL_CONFIDENTIAL_RELAY_CA_ROOTS_PEM` from `CapabilityConfig.Values` to `NodeSet.EnvVars`. `PostEnvStartup` is a no-op; the relay handler auto-starts via env var gate. No on-chain capability registration (the relay handler is a CRE subservice, not a registered capability).

Related PRs