feat(providers): support safe custom provider profile updates #1881

@johntmyers

Problem Statement

Providers v2 supports importing, exporting, linting, and deleting custom provider profiles, but it does not support updating an existing custom provider profile.

This makes custom profile lifecycle management awkward and blocks safe rollout of profile changes. Provider instances reference profiles by provider.type; they do not copy endpoint or binary policy into the provider instance. Effective sandbox policy is composed just-in-time from the sandbox policy plus the attached providers' profiles. That means a provider profile update is the natural control-plane operation for changing provider-derived policy across all provider instances of that type.

Today, changing a custom profile means creating a new profile ID, recreating or updating provider instances to use the new type, and reattaching or recreating sandbox state. That churn is unnecessary because the runtime model already resolves profiles dynamically.

Related umbrella issue: #896.

Proposed Design

Add first-class update support for custom provider profiles.

User-facing CLI:

openshell provider profile update -f custom-profile.yaml

Optional batch form:

openshell provider profile update --from ./provider-profiles

Server/API behavior:

  • Add an UpdateProviderProfile RPC.
  • Accept one profile per update request, or mirror the existing import batch shape if batch updates are preferred.
  • Require the profile ID to already exist as a custom profile.
  • Reject updates to built-in provider profile IDs.
  • Preserve the existing StoredProviderProfile.metadata.id, name, created_at_ms, and labels.
  • Update StoredProviderProfile.profile and increment resource_version.
  • Support optimistic concurrency with expected_resource_version if practical.
  • Reuse existing profile validation before persisting.
  • Reuse and extend existing attached-sandbox diagnostics so an update cannot introduce ambiguous dynamic token grants for sandboxes that already use affected provider types.
  • Do not mutate provider instances. Provider instances continue to store only type, credentials, config, and credential expiry metadata.
  • Do not persist provider-derived network rules into sandbox policies.
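The server-side rules above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `StoredProfile`, `ProfileStore`, `UpdateError`, and the `update` signature are hypothetical stand-ins, not the actual openshell types, and a plain string stands in for the profile payload. Validation and attached-sandbox diagnostics are left out.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for StoredProviderProfile.
#[derive(Clone, Debug, PartialEq)]
pub struct StoredProfile {
    pub id: String,
    pub created_at_ms: u64,
    pub resource_version: u64,
    /// Stands in for StoredProviderProfile.profile.
    pub payload: String,
}

#[derive(Debug, PartialEq)]
pub enum UpdateError {
    BuiltIn,
    NotFound,
    VersionConflict { current: u64 },
}

/// Hypothetical in-memory store; the real server persists profiles elsewhere.
pub struct ProfileStore {
    builtin_ids: Vec<String>,
    custom: HashMap<String, StoredProfile>,
}

impl ProfileStore {
    pub fn new(builtin_ids: &[&str]) -> Self {
        Self {
            builtin_ids: builtin_ids.iter().map(|s| s.to_string()).collect(),
            custom: HashMap::new(),
        }
    }

    /// Create-only insert, mirroring the must-create behavior of import.
    pub fn insert_custom(&mut self, id: &str, payload: &str, created_at_ms: u64) {
        self.custom.insert(
            id.to_string(),
            StoredProfile {
                id: id.to_string(),
                created_at_ms,
                resource_version: 1,
                payload: payload.to_string(),
            },
        );
    }

    /// Replace the payload of an existing custom profile.
    /// `expected_version` is the optional optimistic-concurrency check.
    pub fn update(
        &mut self,
        id: &str,
        new_payload: &str,
        expected_version: Option<u64>,
    ) -> Result<StoredProfile, UpdateError> {
        // Built-in profile IDs are never updatable.
        if self.builtin_ids.iter().any(|b| b.as_str() == id) {
            return Err(UpdateError::BuiltIn);
        }
        // The profile must already exist as a custom profile.
        let stored = self.custom.get_mut(id).ok_or(UpdateError::NotFound)?;
        // Compare-and-swap on resource_version when the caller supplies one.
        if let Some(expected) = expected_version {
            if expected != stored.resource_version {
                return Err(UpdateError::VersionConflict {
                    current: stored.resource_version,
                });
            }
        }
        // Only the payload and version change; id and created_at_ms are preserved.
        stored.payload = new_payload.to_string();
        stored.resource_version += 1;
        Ok(stored.clone())
    }
}
```

In this sketch a stale `expected_version` fails without touching the stored record, so two concurrent editors cannot silently overwrite each other.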

Effective policy behavior:

  • Sandboxes using providers of the updated profile type should receive updated provider-derived network rules on the next sandbox config/policy sync.
  • This works because GetSandboxConfig composes effective policy from the current sandbox policy plus profile_provider_policy_layers(...).
  • If providers_v2_enabled=false, profile network policy changes should not affect sandbox effective policy.
  • If a gateway-global policy is active, provider-derived policy layers should remain suppressed as they are today.
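To illustrate why a profile update reaches sandboxes without rewriting anything else, here is a minimal sketch of just-in-time composition. The `Gateway` struct, the `Rule` type, and `effective_policy` are hypothetical simplifications, not the real `GetSandboxConfig` or `profile_provider_policy_layers(...)` code. The sketch models only the suppression of provider layers, not how a gateway-global policy itself is applied.

```rust
use std::collections::HashMap;

/// Hypothetical network rule, reduced to an allowed host name.
#[derive(Clone, Debug, PartialEq)]
pub struct Rule(pub String);

/// Hypothetical gateway state, reduced to what composition needs.
pub struct Gateway {
    pub providers_v2_enabled: bool,
    pub global_policy_active: bool,
    /// Provider name -> provider type (the profile ID it references).
    pub providers: HashMap<String, String>,
    /// Profile ID -> provider-derived network rules.
    pub profiles: HashMap<String, Vec<Rule>>,
}

/// Compose the effective policy at read time.
/// Nothing is written back to the sandbox policy or to provider records.
pub fn effective_policy(gw: &Gateway, sandbox_rules: &[Rule], attached: &[&str]) -> Vec<Rule> {
    let mut out = sandbox_rules.to_vec();
    // Provider layers are skipped when v2 is off or a global policy is active.
    if !gw.providers_v2_enabled || gw.global_policy_active {
        return out;
    }
    for name in attached {
        // Walk: attached provider name -> provider type -> current profile.
        let Some(provider_type) = gw.providers.get(*name) else {
            continue;
        };
        if let Some(rules) = gw.profiles.get(provider_type) {
            out.extend(rules.iter().cloned());
        }
    }
    out
}
```

Because the profile is looked up on every call, replacing the entry in `profiles` changes the next composed result while the sandbox's own rules and the provider records stay untouched.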

Credential/dynamic credential behavior:

  • If the profile update changes dynamic token grants, the provider environment revision should change so that sandbox-side provider credential state refreshes.
  • Existing compute_provider_env_revision(...) already hashes custom profile payloads; tests should lock this in through the public update path.
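The revision behavior can be pictured with a small hashing sketch. The `env_revision` function below is a hypothetical illustration, not the real `compute_provider_env_revision(...)`: it assumes the revision is a digest over the attached profiles' IDs and payloads, and it uses the standard library's `DefaultHasher` only for brevity. That hasher's algorithm is not guaranteed stable across Rust releases, so a real implementation would want a stable digest.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical revision: a digest over (profile ID, payload) pairs.
/// Pairs are sorted first so attachment order does not change the result.
pub fn env_revision(attached_profiles: &[(&str, &str)]) -> u64 {
    let mut sorted = attached_profiles.to_vec();
    sorted.sort();
    let mut hasher = DefaultHasher::new();
    for (id, payload) in sorted {
        id.hash(&mut hasher);
        payload.hash(&mut hasher);
    }
    hasher.finish()
}
```

A test through the public update path would assert the same property as this sketch: identical payloads yield the same revision, and a changed payload yields a different one.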

Validation and safety:

  • Reject invalid profile IDs using the same normalization rules as import/get/delete.
  • Reject profile ID changes during update.
  • Reject updates whose profile ID collides with a different existing custom profile or with a built-in profile ID.
  • Reject updates that would make active attached-provider sets ambiguous for dynamic token grant resolution.
  • Keep delete behavior unchanged: custom profile delete remains blocked while in use by sandboxes.
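The ambiguity check could run against the candidate profile before anything is persisted. The sketch below rests on a loud assumption: it treats "ambiguous" as two or more attached profiles declaring a grant under the same key. The issue does not define the real rule, and `ambiguous_grants` with its grant-key strings is entirely hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Return grant keys claimed by more than one profile in an attached set,
/// evaluated as if `candidate_grants` had already replaced the stored
/// grants of `candidate_id`. An empty result means the update is safe
/// under this sketch's (assumed) definition of ambiguity.
pub fn ambiguous_grants(
    attached: &HashMap<String, Vec<String>>, // profile ID -> grant keys
    candidate_id: &str,
    candidate_grants: &[String],
) -> BTreeSet<String> {
    let mut owners: HashMap<&str, usize> = HashMap::new();
    for (id, grants) in attached {
        // Substitute the candidate payload for the profile being updated.
        let grants: &[String] = if id.as_str() == candidate_id {
            candidate_grants
        } else {
            grants.as_slice()
        };
        // Count each key once per profile, even if the profile repeats it.
        let unique: BTreeSet<&str> = grants.iter().map(String::as_str).collect();
        for key in unique {
            *owners.entry(key).or_insert(0) += 1;
        }
    }
    owners
        .into_iter()
        .filter(|(_, count)| *count > 1)
        .map(|(key, _)| key.to_string())
        .collect()
}
```

Running this kind of check once per sandbox that attaches a provider of the updated type would let the handler reject the update with the offending keys before the write.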

Implementation outline:

  • proto/openshell.proto: add UpdateProviderProfileRequest, and either add UpdateProviderProfileResponse or reuse the existing ProviderProfileResponse.
  • crates/openshell-cli/src/main.rs: add openshell provider profile update.
  • crates/openshell-cli/src/run.rs: parse YAML/JSON profile input using existing profile import helpers.
  • crates/openshell-server/src/grpc/provider.rs: add handler that validates, fetches existing custom profile, runs attached-sandbox diagnostics with the candidate profile, then writes with CAS/update semantics instead of WriteCondition::MustCreate.
  • crates/openshell-server/src/grpc/policy.rs: add or extend tests proving updated profile endpoints appear in effective policy without modifying provider instances or persisted sandbox source policy.
  • docs/sandboxes/providers-v2.mdx: document update semantics and rollout behavior.
  • docs/sandboxes/manage-providers.mdx: add CLI examples for updating custom profiles.

Definition of done:

  • openshell provider profile update -f profile.yaml updates an existing custom profile.
  • Updating a built-in profile returns a clear error.
  • Updating a missing custom profile returns a clear not-found error.
  • Updating a profile changes effective policy for sandboxes with attached providers of that type on next config sync.
  • Provider objects are not rewritten when a profile is updated.
  • Persisted sandbox source policy does not gain _provider_* rules.
  • Dynamic token grant ambiguity is detected before persisting an update.
  • Docs explain that profile updates affect all provider instances of that type.

Alternatives Considered

Use openshell provider profile import --replace.

This is compact, but it makes import more dangerous because the existing command is create-only today. A dedicated update command is clearer, easier to gate with stronger validation, and avoids accidental replacement when users expect import to be non-destructive.

Create a new profile ID for every change.

This works today but forces provider instance churn and sandbox attachment churn. It does not match the current runtime model, where providers reference profiles dynamically by type.

Copy profile network policy into provider instances.

This would make profile updates harder because every provider instance would need migration. The current design already avoids this by resolving profiles from provider.type during policy composition.

Agent Investigation

  • Confirmed the CLI has provider profile export, import, lint, and delete, but no update.
  • Confirmed ImportProviderProfiles persists custom profiles with WriteCondition::MustCreate.
  • Confirmed import rejects existing custom profile IDs and rejects overwriting built-in profile IDs.
  • Confirmed provider instances store only metadata, type, credentials, config, and credential_expires_at_ms.
  • Confirmed sandbox provider attachment is SandboxSpec.providers, a repeated list of provider names.
  • Confirmed effective policy composition walks from sandbox provider names to provider records, then from provider.type to the provider profile.
  • Confirmed profile-derived policy layers are composed just-in-time and are not persisted into the sandbox source policy.
  • Confirmed compute_provider_env_revision(...) hashes custom profile payloads, so profile changes can trigger sandbox-side provider refresh behavior.

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request


Labels

area:cli (CLI-related work), area:gateway (Gateway server and control-plane work), area:policy (Policy engine and policy lifecycle work), state:review-ready (Ready for human review)
