RBAC evaluation crashes on missing claims instead of falling back #15110

@tom-ludwig

Description

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened? What did you expect to happen?

The SSO RBAC gatekeeper aborts the entire authorization process if a single rule fails to evaluate (e.g., due to a missing variable in the OIDC token). This prevents fallback to lower-priority ServiceAccounts (like a default read-only account).

Specifically, if a token is missing a claim referenced in a high-priority rule (e.g., groups is missing because the user has no GitHub Teams), the expr library returns an error. The gatekeeper treats this error as fatal and returns immediately, rather than logging it and continuing to the next rule.

Expected Behavior: The Admin rule errors or evaluates to false, the loop continues to the next rule, and the Default SA (Precedence 0) is matched.

Actual Behavior: The Admin rule returns an error (unknown name groups). The function exits immediately with PermissionDenied.

Relevant Code
Argo Workflows (gatekeeper.go): On an evaluation failure, the loop immediately returns nil, err instead of continuing to the next rule.

return nil, fmt.Errorf("failed to evaluate rule: %w", err)

Expr Library (checker.go): Returns error on missing keys. https://github.com/expr-lang/expr/blob/593f93febed21dc14168c830df606dfd96aff827/checker/checker.go#L283

Before I contribute a fix, I need to know: is this behavior by design?

Version(s)

v3.7.4

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

na

Logs from the workflow controller

level=error msg="failed to perform RBAC authorization" error="failed to evaluate rule: unknown name groups (1:45)\n | '...' in groups\n | ............................................^"
time="2025-12-05T14:40:49.672Z" level=warning msg="finished unary call with code PermissionDenied" error="rpc error: code = PermissionDenied desc = not allowed" grpc.code=PermissionDenied grpc.method=GetInfo grpc.service=info.InfoService grpc.start_time="2025-12-05T14:40:49Z" grpc.time_ms=2.412 span.kind=server system=grpc

Logs from your workflow's wait container

na.
