fix: network isolated cluster oras login should use acr auth scope#8719
Open
fseldow wants to merge 13 commits into
Open
fix: network isolated cluster oras login should use acr auth scope#8719fseldow wants to merge 13 commits into
fseldow wants to merge 13 commits into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates ORAS ACR login flows used in network-isolated scenarios to request an IMDS token for the ACR audience (https://containerregistry.azure.net) first, with fallback to the ARM audience (https://management.azure.com/) for compatibility when ACR-scoped tokens aren’t available.
Changes:
- Windows: update
Invoke-OrasLoginto try ACR auth scope first and fall back to ARM scope. - Linux: update
oras_login_with_kubelet_identityto try ACR auth scope first and fall back to ARM scope. - Add/adjust Pester and ShellSpec tests to cover the new endpoint selection and fallback behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| staging/cse/windows/networkisolatedclusterfunc.ps1 | Adds ACR-first IMDS token acquisition with ARM fallback for ORAS login. |
| staging/cse/windows/networkisolatedclusterfunc.tests.ps1 | Extends Pester coverage for ACR-first and fallback scenarios. |
| parts/linux/cloud-init/artifacts/cse_helpers.sh | Adds ACR-first token acquisition with ARM fallback for ORAS login on Linux CSE. |
| spec/parts/linux/cloud-init/artifacts/cse_helpers_spec.sh | Updates ShellSpec expectations and adds fallback/success test cases. |
- Linux: track last_ret_code across endpoint loop, return it instead of always returning ERR_ORAS_PULL_UNAUTHORIZED when all endpoints fail - Windows: distinguish IMDS token failures (WINDOWS_CSE_ERROR_ORAS_IMDS_TIMEOUT) from auth failures (WINDOWS_CSE_ERROR_ORAS_PULL_UNAUTHORIZED) - Update Windows test to expect IMDS timeout when IMDS calls fail
…nflict - Windows: use anyEndpointGotAccessToken flag to correctly classify errors. Only report IMDS timeout when NO endpoint successfully obtained an access token. This handles mixed-failure scenarios (e.g., ACR gets token but fails exchange, ARM fails IMDS). - Fix pre-existing test failures: rename mock scriptblock parameter from $Args to $RequestArgs to avoid PowerShell automatic variable conflict when passing hashtable args through scriptblock invocation.
The dot-sourced windowscsehelper.ps1 defines Retry-Command at script scope, which takes precedence over the global: function defined in BeforeEach. Switch to Pester's Mock mechanism which correctly intercepts regardless of scope.
The dot-sourced windowscsehelper.ps1 defines Retry-Command at script scope which shadows the global: override. Copy the mock function to script scope to ensure it takes precedence.
… Retry-Command mock, use Invoke-RestMethod mocks directly
djsly
reviewed
Jun 17, 2026
djsly
reviewed
Jun 17, 2026
djsly
reviewed
Jun 17, 2026
djsly
requested changes
Jun 17, 2026
- Reset last_ret_code=0 at the start of each endpoint iteration so only the current endpoint's error is preserved (per djsly review) - Print the actual error response content when refresh token response contains error, for better debuggability
Keep the exit code stable (always 212) for consistent error handling in the caller (logs_to_events || exit $?), while logging the actual last_ret_code for debuggability.
…HORIZED" This reverts commit a4124a6.
Remove raw response from log to avoid potential token leakage.
Extract only .error and .error_description via jq instead of printing the full response body, avoiding potential token leakage while still providing useful debugging info.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does / why we need it:
fix: network isolated cluster oras login should use acr auth scope
otherwise when auth-as-arm is disabled, the login will fail at fetching acr exchange token
The pr will change to use acr scope first for the identity token, if fails then fallback to arm endpoint
https://learn.microsoft.com/en-us/azure/container-registry/container-registry-disable-authentication-as-arm#microsoft-entra-authentication-scope-types
Which issue(s) this PR fixes:
Fixes #