Skip to content

fix: network isolated cluster oras login should use acr auth scope#8719

Open
fseldow wants to merge 13 commits into
mainfrom
xinhl/nilogin
Open

fix: network isolated cluster oras login should use acr auth scope#8719
fseldow wants to merge 13 commits into
mainfrom
xinhl/nilogin

Conversation

@fseldow

@fseldow fseldow commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

What this PR does / why we need it:
fix: network isolated cluster oras login should use acr auth scope
otherwise when auth-as-arm is disabled, the login will fail at fetching acr exchange token

The pr will change to use acr scope first for the identity token, if fails then fallback to arm endpoint

https://learn.microsoft.com/en-us/azure/container-registry/container-registry-disable-authentication-as-arm#microsoft-entra-authentication-scope-types

Which issue(s) this PR fixes:

Fixes #

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates ORAS ACR login flows used in network-isolated scenarios to request an IMDS token for the ACR audience (https://containerregistry.azure.net) first, with fallback to the ARM audience (https://management.azure.com/) for compatibility when ACR-scoped tokens aren’t available.

Changes:

  • Windows: update Invoke-OrasLogin to try ACR auth scope first and fall back to ARM scope.
  • Linux: update oras_login_with_kubelet_identity to try ACR auth scope first and fall back to ARM scope.
  • Add/adjust Pester and ShellSpec tests to cover the new endpoint selection and fallback behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
staging/cse/windows/networkisolatedclusterfunc.ps1 Adds ACR-first IMDS token acquisition with ARM fallback for ORAS login.
staging/cse/windows/networkisolatedclusterfunc.tests.ps1 Extends Pester coverage for ACR-first and fallback scenarios.
parts/linux/cloud-init/artifacts/cse_helpers.sh Adds ACR-first token acquisition with ARM fallback for ORAS login on Linux CSE.
spec/parts/linux/cloud-init/artifacts/cse_helpers_spec.sh Updates ShellSpec expectations and adds fallback/success test cases.

Comment thread staging/cse/windows/networkisolatedclusterfunc.ps1
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread staging/cse/windows/networkisolatedclusterfunc.tests.ps1
- Linux: track last_ret_code across endpoint loop, return it instead of
  always returning ERR_ORAS_PULL_UNAUTHORIZED when all endpoints fail
- Windows: distinguish IMDS token failures (WINDOWS_CSE_ERROR_ORAS_IMDS_TIMEOUT)
  from auth failures (WINDOWS_CSE_ERROR_ORAS_PULL_UNAUTHORIZED)
- Update Windows test to expect IMDS timeout when IMDS calls fail
Copilot AI review requested due to automatic review settings June 17, 2026 10:27

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread staging/cse/windows/networkisolatedclusterfunc.ps1
…nflict

- Windows: use anyEndpointGotAccessToken flag to correctly classify
  errors. Only report IMDS timeout when NO endpoint successfully
  obtained an access token. This handles mixed-failure scenarios
  (e.g., ACR gets token but fails exchange, ARM fails IMDS).
- Fix pre-existing test failures: rename mock scriptblock parameter
  from $Args to $RequestArgs to avoid PowerShell automatic variable
  conflict when passing hashtable args through scriptblock invocation.
The dot-sourced windowscsehelper.ps1 defines Retry-Command at script
scope, which takes precedence over the global: function defined in
BeforeEach. Switch to Pester's Mock mechanism which correctly
intercepts regardless of scope.
Copilot AI review requested due to automatic review settings June 17, 2026 10:55

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread staging/cse/windows/networkisolatedclusterfunc.ps1 Outdated
xinhl and others added 2 commits June 17, 2026 11:01
The dot-sourced windowscsehelper.ps1 defines Retry-Command at script
scope which shadows the global: override. Copy the mock function to
script scope to ensure it takes precedence.
… Retry-Command mock, use Invoke-RestMethod mocks directly
Copilot AI review requested due to automatic review settings June 17, 2026 11:57

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread staging/cse/windows/networkisolatedclusterfunc.ps1
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh Outdated
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
xinhl added 2 commits June 17, 2026 12:44
- Reset last_ret_code=0 at the start of each endpoint iteration so
  only the current endpoint's error is preserved (per djsly review)
- Print the actual error response content when refresh token response
  contains error, for better debuggability
Keep the exit code stable (always 212) for consistent error handling
in the caller (logs_to_events || exit $?), while logging the actual
last_ret_code for debuggability.
Copilot AI review requested due to automatic review settings June 17, 2026 12:49

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
xinhl added 2 commits June 17, 2026 13:08
Remove raw response from log to avoid potential token leakage.
Extract only .error and .error_description via jq instead of printing
the full response body, avoiding potential token leakage while still
providing useful debugging info.
Copilot AI review requested due to automatic review settings June 17, 2026 13:12

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants