feat: Add PR documentation monitor workflow by jongio · Pull Request #6927 · Azure/azure-dev

jongio · 2026-02-27T17:35:16Z

Summary

Adds a new GitHub Actions workflow and custom TypeScript action that automatically monitors PRs for documentation impact. When a code PR is opened or updated against main, the workflow uses AI (GitHub Models API / GPT-4o) to analyze the diff and determine which docs in Azure/azure-dev and MicrosoftDocs/azure-dev-docs-pr need to be created, updated, or deleted.

Closes #6924

Security Model

This workflow is designed to be safe for fork PRs from external contributors.

Concern	Mitigation
Fork modifies workflow YAML to steal secrets	`pull_request_target` runs workflow code from `main`, not from the fork
Secrets exposed on runner	No secrets stored in GitHub. OIDC federated credentials + Key Vault signing. Private key is non-exportable and never leaves Key Vault.
Token over-scoping	GitHub App token scoped to `MicrosoftDocs/azure-dev-docs-pr` only
Executing untrusted code	Action never checks out or executes PR code. All data read via GitHub REST API
Token tied to a person	GitHub App is an org-level identity, not tied to any individual
Markdown injection via PR title/body	Multi-layer sanitization: `sanitizePlainText()` on AI output, `escapeTableCell()` (strips HTML/markdown links, escapes pipes/newlines), `sanitizeForMarkdown()` on PR bodies, output length caps (`MAX_REASON_LENGTH=200`, `MAX_SUMMARY_LENGTH=500`)
Tracking comment spoofing	`findTrackingComment` verifies comment author is `github-actions[bot]`
AI prompt injection via doc content	`sanitizeText()` strips HTML tags and control characters from all doc manifest data (titles, topics, headings) before injection into AI prompt
AI output manipulation (prompt injection)	`MAX_IMPACTS=15` cap on impact count, unknown repos rejected (not just warned), repo format validated via regex, path traversal (`..`, leading `/`) blocked
Resource exhaustion via batch modes	`MAX_PRS_PER_RUN=20` cap on `all_open` and `list` modes
Large file denial-of-service	`MAX_CONTENT_SIZE_BYTES=50KB` per doc file — oversized files skipped during inventory
Supply chain via unpinned actions	All actions pinned to commit SHAs (`actions/checkout@34e11487...`, `azure/login@a457da9e...`)
Log injection via PR title	PR title truncated to 100 chars and control characters stripped before logging
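
As a rough illustration of the last row, a log-sanitizing helper could look like the sketch below. The name `sanitizeTitleForLog` and the exact character class are assumptions made for illustration; only the 100-character limit and the stripping of control characters come from the table.

```typescript
// Hypothetical helper: the action's real function name is not shown in this description.
const MAX_LOG_TITLE_LENGTH = 100; // from the table: "truncated to 100 chars"

function sanitizeTitleForLog(title: string): string {
  // Assumption: "control characters" means C0 controls (U+0000-U+001F) plus DEL (U+007F).
  // Removing CR/LF keeps an attacker-chosen title from starting a new log line.
  const stripped = title.replace(/[\u0000-\u001f\u007f]/g, "");
  return stripped.slice(0, MAX_LOG_TITLE_LENGTH);
}
```

Stripping before truncating means removed characters do not count against the 100-character budget; the real action may order these steps differently.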

Auth Flow

flowchart LR
    OIDC["1. GitHub OIDC token<br/>(id-token: write)"]
    AZ["2. azure/login@v2<br/>(federated credentials)"]
    KV["3. az keyvault key sign<br/>(non-exportable RSA key<br/>in azuresdkengkeyvault)"]
    JWT["4. Signed JWT<br/>(GitHub App 1086291)"]
    INST["5. POST /app/installations/{id}/access_tokens"]
    TOKEN["6. Installation token<br/>(scoped, 1h TTL)"]

    OIDC --> AZ --> KV --> JWT --> INST --> TOKEN

GITHUB_TOKEN handles in-repo operations (read PR diff, create doc PRs in azure-dev, post comments)
Workflow requests OIDC token from GitHub (id-token: write permission)
azure/login@v2 exchanges OIDC token for Azure access using federated credentials (no client secret)
eng/common/actions/login-to-github composite action signs a JWT using az keyvault key sign (RSA key azure-sdk-automation in azuresdkengkeyvault -- key is non-exportable)
JWT is exchanged for a short-lived GitHub App installation token scoped to MicrosoftDocs/azure-dev-docs-pr
Token is exported as GH_TOKEN env var, expires after 1 hour, never stored

No secrets are stored in GitHub. The entire auth chain is keyless -- OIDC federation replaces client secrets, and Key Vault signing replaces private key storage.

Flow

flowchart TD
    A["PR Event: opened/synchronize/closed"] --> B{Event Type?}
    B -->|opened / synchronize| C["Fetch PR Diff via API"]
    B -->|closed + merged| SKIP["Skip: PRs already exist"]
    B -->|closed + not merged| Z["Close doc PRs, clean branches"]

    D["Manual Trigger"] --> E{Mode?}
    E -->|single| C
    E -->|all_open| F["Enumerate open PRs"]
    E -->|list| G["Parse PR numbers"]
    F --> C
    G --> C

    C --> H["Classify changes"]
    H --> I["Build docs inventory"]
    I --> J["AI Analysis via GPT-4o"]
    J --> K{Docs impacted?}

    K -->|No| L["Post: no doc changes needed"]
    K -->|Yes| M["Generate doc proposals"]

    M --> N{"In-repo docs?"}
    N -->|Yes| O["Branch: docs/pr-N in azure-dev"]
    O --> P["Create/update PR"]
    N -->|No| Q{"External docs?"}

    P --> Q
    Q -->|Yes| R["Mint token via OIDC"]
    R --> R2["Branch: docs/pr-N in docs repo"]
    R2 --> S["Create/update docs PR"]
    Q -->|No| T["Update tracking comment"]

    S --> T
    L --> U["Done"]
    T --> U
    Z --> U
    SKIP --> U
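
The event branch at the top of the flowchart can be sketched as a small pure function. This is a minimal illustration, not the action's code: the names `PrEvent`, `NextStep`, and `routeEvent` are hypothetical, and the use of `merged_at` follows the later commit note about merge detection.

```typescript
// Hypothetical types for illustration only.
type NextStep = "analyze" | "skip" | "cleanup";

interface PrEvent {
  action: "opened" | "synchronize" | "reopened" | "closed";
  // Mirrors pull_request.merged_at in the webhook payload: null when the PR was not merged.
  mergedAt: string | null;
}

function routeEvent(event: PrEvent): NextStep {
  // Any non-closing event leads to diff fetch and analysis.
  if (event.action !== "closed") {
    return "analyze";
  }
  // Closed + merged: companion PRs already exist, so skip re-analysis.
  // Closed + not merged: close companion doc PRs and clean up branches.
  return event.mergedAt !== null ? "skip" : "cleanup";
}
```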

Architecture

graph TD
    subgraph "GitHub Action (.github/actions/doc-monitor)"
        IDX[index.ts - Entry Point] --> INP[inputs.ts - Validation]
        IDX --> PROC[processor.ts - Orchestrator]
        PROC --> DIFF[diff.ts - PR Diff Extraction]
        PROC --> INV[docs-inventory.ts - Doc Manifest]
        PROC --> ANA[analyze.ts - AI Analysis]
        PROC --> PRM[pr-manager.ts - Companion PRs]
        PROC --> CMT[comment-tracker.ts - Tracking Comments]
        PRM --> GHU[github-utils.ts - API Helpers]
        PRM --> PRB[pr-body.ts - Markdown Builders]
        ANA --> CON[constants.ts - Config Values]
    end

    subgraph "Auth (eng/common)"
        LOGIN[login-to-github - Composite Action] --> SCRIPT[login-to-github.ps1 - Key Vault JWT Signing]
    end

    ANA -->|"OpenAI API"| GMAI["GitHub Models GPT-4o"]
    DIFF -->|"REST API"| GH["GitHub API"]
    INV -->|"REST API"| GH
    PRM -->|"REST API"| GH
    CMT -->|"REST API"| GH
    SCRIPT -->|"az keyvault key sign"| KV["Azure Key Vault"]

What it does

Triggers on pull_request_target events (opened, synchronize, reopened, closed) targeting main, plus manual workflow_dispatch
Extracts the PR diff via GitHub REST API and classifies changes
Inventories documentation in both Azure/azure-dev and MicrosoftDocs/azure-dev-docs-pr (using git.getTree + git.getBlob for efficiency, with sanitizeText() on all extracted content)
Analyzes changes using GitHub Models AI (GPT-4o) to determine doc impact, with comprehensive output validation (repo format regex, path traversal blocking, unknown repo rejection, MAX_IMPACTS=15 cap, output length caps)
Creates companion PRs in the appropriate repos with deterministic branch naming (docs/pr-{N})
Posts a tracking comment on the source PR linking to all companion doc PRs (with author verification and multi-layer markdown injection prevention)
Assigns doc PRs to alexwolfmsft and diberry
Respects human edits -- never force-pushes over manual changes on doc branches
Skips merged PRs -- avoids wasteful re-analysis of already-merged PRs
Graceful degradation -- without cross-repo token, still scans and reports impacts
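
Deterministic branch naming is what lets a re-run find the existing companion branch instead of creating a new one. A minimal sketch, assuming the `docs/pr-{N}` scheme described above; the helper names `companionBranch` and `sourcePrFromBranch` are hypothetical.

```typescript
const BRANCH_PREFIX = "docs/pr-"; // scheme from the description: docs/pr-{N}

// Map a source PR number to its companion doc branch.
function companionBranch(prNumber: number): string {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new RangeError("PR number must be a positive integer");
  }
  return BRANCH_PREFIX + String(prNumber);
}

// Recover the source PR number from a branch name, or null if it is not a companion branch.
function sourcePrFromBranch(ref: string): number | null {
  const match = /^docs\/pr-(\d+)$/.exec(ref);
  return match ? Number(match[1]) : null;
}
```

Because the mapping is a pure function of the PR number, both repos can use the same branch name and no state needs to be stored between runs.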

Modes

Mode	Trigger	Description
`auto`	PR events	Processes the triggering PR
`single`	`workflow_dispatch`	Process a specific PR by number
`all_open`	`workflow_dispatch`	Process all open PRs targeting main (capped at 20)
`list`	`workflow_dispatch`	Process a comma-separated list of PR numbers (capped at 20)
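
For the `list` mode, input parsing might look like the sketch below. The cap of 20 comes from the table and the 999999 upper bound from a later commit message; the function name `parsePrList`, the de-duplication, and the choice to truncate (rather than fail) when the cap is exceeded are assumptions.

```typescript
const MAX_PRS_PER_RUN = 20;   // cap stated in the Modes table
const MAX_PR_NUMBER = 999999; // upper bound mentioned in the hardening commit

// Parse a comma-separated list such as "6901, 6915,6927" into PR numbers.
function parsePrList(raw: string): number[] {
  const result: number[] = [];
  for (const token of raw.split(",")) {
    const trimmed = token.trim();
    if (trimmed === "") continue; // tolerate trailing commas
    if (!/^\d+$/.test(trimmed)) {
      throw new Error('Invalid PR number: "' + trimmed + '"');
    }
    const n = Number(trimmed);
    if (n < 1 || n > MAX_PR_NUMBER) {
      throw new Error("PR number out of range: " + trimmed);
    }
    if (result.indexOf(n) === -1) result.push(n); // drop duplicates (assumption)
  }
  // Assumption: entries beyond the cap are dropped rather than treated as an error.
  return result.slice(0, MAX_PRS_PER_RUN);
}
```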

Security Hardening

A comprehensive red team assessment was performed against the action simulating 10 attacker personas (Script Kiddie through Nation-State). 11 findings were produced:

#	Severity	Finding	Resolution
1	CRIT	`workflow_dispatch` any-collaborator trigger	Admin: add `required_reviewers` to environment (comment)
2	HIGH	AI output drives write operations (prompt injection)	Code: MAX_IMPACTS=15, reject unknown repos, path traversal block, repo format validation
3	HIGH	Cross-org token scope	Admin: verify App installation scoped to single repo (comment)
4	HIGH	Hardcoded OIDC GUIDs	Admin: move to `vars.*` repo variables (comment)
5	MED	No rate limit on batch modes	Code: MAX_PRS_PER_RUN=20
6	MED	AI output length uncapped	Code: MAX_REASON_LENGTH=200, MAX_SUMMARY_LENGTH=500
7	MED	Doc inventory from attacker paths	Low residual: content from main branch only
8	MED	Actions not pinned to SHA	Code: pinned to commit SHAs
9	MED	Large file ReDoS/perf risk	Code: MAX_CONTENT_SIZE_BYTES=50KB
10	LOW	PR title logged unsanitized	Code: control char stripping + truncation
11	MED	Doc manifest prompt injection	Code: sanitizeText() on all extracted data

Files

Path	Purpose
`.github/workflows/doc-monitor.yml`	Workflow definition (`pull_request_target` + OIDC + Key Vault)
`.github/actions/doc-monitor/`	Custom TypeScript action (12 source modules + compiled dist)
`eng/common/actions/login-to-github/action.yml`	Composite action wrapping Key Vault signing script
`eng/common/scripts/login-to-github.ps1`	PowerShell script that mints GitHub App tokens via Key Vault

Prerequisites (managed by EngSys)

Component	Value	Purpose
GitHub Environment	`AzureSDKEngKeyVault`	OIDC federated credential binding
Azure Key Vault	`azuresdkengkeyvault`	Hosts the non-exportable RSA signing key
Key Vault Key	`azure-sdk-automation`	RSA key used to sign GitHub App JWTs
GitHub App	ID `1086291` (Azure SDK Automation)	Installed on MicrosoftDocs org with `contents:write` + `pull_requests:write`
Workflow permissions	`id-token: write`, `contents: write`, `pull-requests: write`, `models: read`	Already configured in workflow YAML

Auth Evolution

This PR tried three auth approaches; the first two were rejected in favor of the current design:

PAT (rejected) -- secrets exposed on runner, tied to individual, doesn't expire automatically
actions/create-github-app-token (rejected) -- requires GitHub App private key as a GitHub secret, which is present on the runner during workflow execution
OIDC + Key Vault signing (current) -- no secrets in GitHub, private key never leaves Key Vault, fully keyless OIDC chain. Uses the same pattern as azure-sdk-tools (see PR #14219)

Adds a GitHub Actions workflow + custom TypeScript action that uses GitHub Models AI (GPT-4o) to analyze PR diffs and identify impacted documentation across Azure/azure-dev and MicrosoftDocs/azure-dev-docs-pr. Creates companion doc PRs, posts tracking comments, and supports manual batch processing via workflow_dispatch.

Closes Azure#6924

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds an automated “doc-monitor” GitHub Actions workflow plus a custom TypeScript action that analyzes PR diffs with GitHub Models (GPT-4o) to determine documentation impact and creates/updates companion documentation PRs.

Changes:

Introduces .github/workflows/doc-monitor.yml to run on PR events and manual dispatch modes.
Adds a custom Node 20 TypeScript action under .github/actions/doc-monitor/ to fetch diffs, inventory docs across repos, run AI analysis, and manage companion PRs + tracking comments.
Adds supporting action metadata, build config, and documentation.

Reviewed changes

Copilot reviewed 18 out of 20 changed files in this pull request and generated 5 comments.


File	Description
.github/workflows/doc-monitor.yml	New workflow wiring (triggers, permissions, concurrency, action invocation).
.github/actions/doc-monitor/action.yml	Declares action inputs/outputs + Node runtime entrypoint.
.github/actions/doc-monitor/package.json	Action dependencies and build/test scripts.
.github/actions/doc-monitor/tsconfig.json	TypeScript compilation settings for the action.
.github/actions/doc-monitor/src/index.ts	Mode routing and PR enumeration (auto/single/all_open/list).
.github/actions/doc-monitor/src/inputs.ts	Input parsing + validation.
.github/actions/doc-monitor/src/processor.ts	Orchestrates diff fetch, inventory, AI analysis, PR/comment updates.
.github/actions/doc-monitor/src/diff.ts	PR metadata/files fetch + change classification + diff summarization.
.github/actions/doc-monitor/src/docs-inventory.ts	Builds markdown doc inventory from repo contents for AI context.
.github/actions/doc-monitor/src/analyze.ts	GitHub Models/OpenAI client integration + response validation.
.github/actions/doc-monitor/src/pr-manager.ts	Companion branch/PR creation, updates, and closure behavior.
.github/actions/doc-monitor/src/comment-tracker.ts	Tracking comment create/update and formatting.
.github/actions/doc-monitor/src/pr-body.ts	Markdown body/summary builders for companion PRs.
.github/actions/doc-monitor/src/constants.ts	Centralized constants for limits, defaults, and markers.
.github/actions/doc-monitor/src/types.ts	Shared type definitions for the action.
.github/actions/doc-monitor/README.md	Local documentation for configuring and developing the action.
.github/actions/doc-monitor/.gitignore	Ignores action-local node_modules.


.github/actions/doc-monitor/src/diff.ts

.github/actions/doc-monitor/src/processor.ts

.github/actions/doc-monitor/src/docs-inventory.ts

.github/workflows/doc-monitor.yml

.github/actions/doc-monitor/README.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Switch external docs repo from MicrosoftDocs/azure-dev-docs-pr (private) to MicrosoftDocs/azure-dev-docs (public).

When DOCS_REPO_PAT is not set, fall back to GITHUB_TOKEN for reading the public docs repo inventory. Companion PR creation still requires DOCS_REPO_PAT.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace PAT-based cross-repo auth with GitHub App token minting via actions/create-github-app-token. Switch trigger from pull_request to pull_request_target to prevent fork PRs from exfiltrating secrets.

Security model:
- pull_request_target runs workflow code from main (not fork branch)
- GitHub App tokens are short-lived (1 hour), scoped to specific repo
- Action reads PR data via GitHub API only, never executes PR code

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jongio · 2026-02-27T18:43:48Z

Security Stance

This workflow is designed to be safe for use with fork PRs from external contributors. Here's why:

1. `pull_request_target` trigger (not `pull_request`)

The workflow uses pull_request_target which runs the workflow code from the base branch (main), not from the PR branch. This is the critical security boundary:

pull_request (unsafe for secrets): Runs workflow YAML from the PR head. A fork contributor could modify .github/workflows/doc-monitor.yml to exfiltrate any secrets the workflow has access to.
pull_request_target (safe for secrets): Runs workflow YAML from the repository's default branch. Fork contributors cannot modify the workflow code that executes.

2. No PR code checkout or execution

The action never checks out the PR's source code. The only actions/checkout step loads the action's own compiled code from main:

- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
  with:
    ref: ${{ github.event.repository.default_branch }}
    sparse-checkout: .github/actions/doc-monitor

All PR data (diff, files, metadata) is read exclusively through the GitHub REST API (octokit.pulls.listFiles(), octokit.pulls.get()). There is no path for untrusted code execution.

3. OIDC + Key Vault signing (no secrets in GitHub)

Cross-repo write access uses OIDC federated credentials + Azure Key Vault signing instead of stored secrets:

Property	PAT / App Private Key	OIDC + Key Vault
Secrets in GitHub	Yes (PAT or private key stored as repo/org secret)	None (OIDC is fully keyless)
Private key exposure	On runner during workflow execution	Never leaves Key Vault (signing is server-side)
Token lifetime	Until manually revoked (PAT) or 1h (App token)	1h (App installation token)
Scope	Depends on configuration	Only repos where App is installed
Identity	Tied to a person (PAT) or App	Org-level App identity

The auth chain works as follows:

GitHub OIDC provider issues a token to the workflow (id-token: write)
azure/login@v2 exchanges the OIDC token for Azure access using federated credentials (no client secret needed)
eng/common/actions/login-to-github creates a JWT, computes its SHA-256 hash, and signs it via az keyvault key sign --algorithm RS256 (the RSA key azure-sdk-automation in azuresdkengkeyvault is non-exportable)
The signed JWT is exchanged for a GitHub App installation token via POST /app/installations/{id}/access_tokens
The token is scoped to MicrosoftDocs/azure-dev-docs-pr only and expires in 1 hour

This is the same pattern used by the Azure SDK EngSys team across Azure SDK repos. See azure-sdk-tools PR #14219 for the composite action source.
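
To make the signing step concrete, here is a TypeScript sketch of the same idea. The actual implementation is the PowerShell script `login-to-github.ps1`, so this is an illustration under assumptions rather than a port: the names `buildSigningInput` and `assembleJwt` are hypothetical, the `iat`/`exp` offsets are illustrative values within GitHub's 10-minute limit for App JWTs, and the Key Vault call itself is left out (the caller would pass the digest to `az keyvault key sign` and feed the base64url-encoded signature back in).

```typescript
import { createHash } from "node:crypto";

// Base64url-encode a UTF-8 string (no padding), as used for JWT segments.
function base64url(text: string): string {
  return Buffer.from(text, "utf8").toString("base64url");
}

interface SigningInput {
  signingInput: string; // "<header>.<payload>", the bytes that get signed
  digestBase64: string; // SHA-256 of signingInput, base64, handed to the external signer
}

// Build the unsigned JWT and the digest that a remote signer (e.g. Key Vault) would sign.
function buildSigningInput(appId: string, nowSeconds: number): SigningInput {
  const header = { alg: "RS256", typ: "JWT" };
  // Assumed offsets: backdate iat by 60s for clock skew, expire 9 minutes from now.
  const payload = { iat: nowSeconds - 60, exp: nowSeconds + 540, iss: appId };
  const signingInput =
    base64url(JSON.stringify(header)) + "." + base64url(JSON.stringify(payload));
  const digestBase64 = createHash("sha256").update(signingInput).digest("base64");
  return { signingInput, digestBase64 };
}

// Append the signature returned by the signer; the private key never appears here.
function assembleJwt(signingInput: string, signatureBase64Url: string): string {
  return signingInput + "." + signatureBase64Url;
}
```

The point of splitting the work this way is that only a 32-byte digest leaves the runner and only a signature comes back, which is why the key can stay non-exportable.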

4. Multi-layer injection prevention

PR titles, bodies, and doc content are attacker-controlled data that flow through the system. Five sanitization layers prevent injection at every stage:

Layer (function)	Stage	Where Applied	What It Does
`sanitizeText()`	Doc inventory input	titles, topics, H2 headings from doc files	Strips HTML tags and control characters before they enter the AI prompt
Anti-injection system prompt	AI analysis	GPT-4o system message	Instructs the model to ignore embedded instructions in untrusted data
`sanitizePlainText()`	AI output	reason, suggestedChanges, summary, repo, path	Strips HTML, control chars, and excessive whitespace from AI responses
`escapeTableCell()`	Tracking comment	all table cell values	Strips HTML tags, converts markdown links to plain text, removes images, escapes pipes, collapses newlines
`sanitizeForMarkdown()`	PR bodies	companion PR body content	Prevents markdown injection in generated PR descriptions
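
As an illustration of the table-cell layer, the sketch below implements the behaviors listed for `escapeTableCell()` in the table, plus the backtick stripping added in a later commit. It is not the action's actual code: the regular expressions are assumptions, and regex-based HTML stripping is a best-effort filter rather than a full sanitizer.

```typescript
// Illustrative re-implementation; the real escapeTableCell() lives in comment-tracker.ts.
function escapeTableCell(value: string): string {
  return value
    .replace(/!\[[^\]]*\]\([^)]*\)/g, "")    // remove images first, so the link rule does not see them
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1") // convert markdown links to their plain text
    .replace(/<[^>]*>/g, "")                 // strip HTML tags
    .replace(/`/g, "")                       // strip backticks
    .replace(/[\r\n]+/g, " ")                // collapse newlines and carriage returns
    .replace(/\|/g, "\\|")                   // escape pipes so they cannot add table columns
    .trim();
}
```

For example, `escapeTableCell("[click](http://evil.example)")` yields `click`, so an AI-proposed reason cannot smuggle a live link into the tracking comment.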

Additionally, AI output is structurally constrained:

MAX_REASON_LENGTH=200 and MAX_SUMMARY_LENGTH=500 cap output field lengths
MAX_IMPACTS=15 caps the number of doc impacts the AI can propose
Unknown repos are rejected (not just warned) — AI cannot target arbitrary repositories
Repo format validated via regex (owner/repo pattern required)
Path traversal blocked (.. and leading / rejected)
Error messages redacted from tracking comments to prevent data leakage
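
The structural constraints above can be sketched as a single validation pass over the model's proposed impacts. This is a hedged illustration: the `Impact` shape, the function name `validateImpacts`, the exact repo regex, and the decision to treat `..` as a path segment are all assumptions; only the limits and the categories of checks come from the list.

```typescript
const MAX_IMPACTS = 15;
const MAX_REASON_LENGTH = 200;
// Assumed pattern for "owner/repo"; the action's real regex is not shown in this thread.
const REPO_PATTERN = /^[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+$/;

// Hypothetical shape of one AI-proposed doc impact.
interface Impact {
  repo: string;
  path: string;
  reason: string;
}

function validateImpacts(raw: readonly Impact[], allowedRepos: readonly string[]): Impact[] {
  const accepted: Impact[] = [];
  for (const impact of raw) {
    if (accepted.length >= MAX_IMPACTS) break;                 // cap the impact count
    if (!REPO_PATTERN.test(impact.repo)) continue;             // malformed repo name
    if (allowedRepos.indexOf(impact.repo) === -1) continue;    // unknown repo: reject, not warn
    if (impact.path.charAt(0) === "/") continue;               // leading slash
    if (impact.path.split("/").indexOf("..") !== -1) continue; // path traversal segment
    accepted.push({
      repo: impact.repo,
      path: impact.path,
      reason: impact.reason.slice(0, MAX_REASON_LENGTH),       // cap field length
    });
  }
  return accepted;
}
```

Rejecting rather than repairing bad entries keeps the write path simple: anything that reaches the PR manager has already passed every check.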

5. Tracking comment author verification

findTrackingComment verifies that the comment author is github-actions[bot] in addition to checking for the marker. This prevents an attacker from pre-planting a comment with the marker to hijack the tracking display.
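
A minimal sketch of that check, assuming a simplified comment shape: the `IssueComment` interface and the marker string are hypothetical (the real marker is defined in `constants.ts` and is not shown here), while the `github-actions[bot]` author check comes from the description above.

```typescript
// Simplified, hypothetical view of an issue comment.
interface IssueComment {
  id: number;
  body: string;
  authorLogin: string;
}

const TRACKING_MARKER = "<!-- doc-monitor-tracking -->"; // placeholder value, not the real marker
const BOT_LOGIN = "github-actions[bot]";

// Return the tracking comment only if BOTH the marker and the bot author match.
function findTrackingComment(comments: readonly IssueComment[]): IssueComment | undefined {
  for (const comment of comments) {
    const hasMarker = comment.body.indexOf(TRACKING_MARKER) !== -1;
    if (hasMarker && comment.authorLogin === BOT_LOGIN) {
      return comment;
    }
  }
  return undefined;
}
```

With the author check in place, a comment planted by another user is ignored even if it contains the marker and appears earlier in the thread.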

6. Bot loop prevention

The workflow skips execution when:

The PR head ref starts with docs/pr- (it's a companion doc PR, not a code PR)
The PR actor is github-actions[bot] (prevents infinite recursion)
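
Expressed as code, the two skip conditions amount to a single predicate. In the workflow this is presumably a job-level `if:` expression in YAML; the TypeScript function below, including its name `shouldSkipRun`, is only an illustration of the logic.

```typescript
// Returns true when the run should be skipped to avoid the bot reacting to its own PRs.
function shouldSkipRun(headRef: string, actor: string): boolean {
  const isCompanionPr = headRef.indexOf("docs/pr-") === 0; // head ref starts with docs/pr-
  const isBotActor = actor === "github-actions[bot]";
  return isCompanionPr || isBotActor;
}
```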

7. Actions pinned to commit SHAs

All third-party actions are pinned to immutable commit SHAs to prevent supply chain attacks via tag mutation:

actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5   # v2

8. Resource exhaustion prevention

Batch processing and doc inventory are bounded by fixed caps:

MAX_PRS_PER_RUN=20 — caps all_open and list modes
MAX_CONTENT_SIZE_BYTES=50KB — skips oversized doc files during inventory
MAX_CONTENT_FETCHES=50 — limits the number of doc files fetched per repo
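
The two inventory caps can be applied while walking the Git tree listing, before any blob is fetched. A sketch under assumptions: `TreeEntry` mirrors the `path`, `type`, and `size` fields that the Git trees API returns, `selectDocsToFetch` is a hypothetical name, 50KB is taken as 50 * 1024 bytes, and limiting the inventory to `.md` files is a guess based on the review summary's mention of a markdown doc inventory.

```typescript
const MAX_CONTENT_SIZE_BYTES = 50 * 1024; // assumption: "50KB" means 51200 bytes
const MAX_CONTENT_FETCHES = 50;

// Subset of a Git tree entry relevant to the inventory.
interface TreeEntry {
  path: string;
  type: "blob" | "tree";
  size?: number; // present for blobs
}

// Pick which doc files to fetch, honoring the size and fetch-count caps.
function selectDocsToFetch(tree: readonly TreeEntry[]): string[] {
  const selected: string[] = [];
  for (const entry of tree) {
    if (selected.length >= MAX_CONTENT_FETCHES) break; // fetch-count cap
    if (entry.type !== "blob") continue;               // directories carry no content
    if (!/\.md$/i.test(entry.path)) continue;          // markdown docs only (assumption)
    if (entry.size === undefined || entry.size > MAX_CONTENT_SIZE_BYTES) continue; // skip oversized
    selected.push(entry.path);
  }
  return selected;
}
```

Filtering on the tree listing's `size` field means an oversized file costs no extra API call, which fits the switch from per-file requests to `git.getTree`.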

Summary

This design combines the following safeguards for a GitHub Actions workflow that needs cross-repo write access on fork PRs:

pull_request_target runs trusted code from main
PR data read via API only (never checkout)
OIDC federated credentials (no secrets stored in GitHub)
Key Vault signing (private key never on runner)
Short-lived scoped tokens for cross-repo operations
5-layer input/output sanitization chain
Structural AI output constraints (impact count, field length, repo validation, path traversal blocking)
Actions pinned to commit SHAs
Resource exhaustion prevention (PR count, file size, fetch count caps)
Author verification on tracking comments
Error message redaction from public-facing comments

- Use merged_at instead of merged for reliable merge detection (thread Azure#1)
- Expand isDocOnlyPr to handle doc-adjacent assets (thread Azure#2)
- Replace N+1 API calls with git.getTree for doc inventory (thread Azure#3)
- Fix README trigger types to match actual workflow config (thread Azure#5)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jongio · 2026-02-27T19:59:58Z

Research: GitHub Agentic Workflows as Alternative to Custom Action

Investigated whether GitHub Agentic Workflows (technical preview, Feb 2026) could replace our custom TypeScript action. Here is the analysis:

What It Is

GitHub Agentic Workflows let you define repository automation in Markdown instead of code. An AI agent (Copilot, Claude, or Codex) interprets natural-language instructions, runs read-only in a sandbox, then uses "safe-outputs" (structured, permission-separated jobs) to write back to GitHub. Developed by GitHub Next and Microsoft Research.

Capability Mapping

Our Requirement	gh-aw Support	Notes
Trigger on PR events	Supported	`on: pull_request`
Analyze PR diff with AI	Supported	Core feature -- agent reads repo context
Scan docs in azure-dev	Supported	Full read access to own repo
Scan docs in MicrosoftDocs/azure-dev-docs-pr	Supported	`checkout:` with cross-repo token
Create PR in azure-dev	Supported	`safe-outputs: create-pull-request`
Create PR in azure-dev-docs-pr (cross-repo)	Supported	`create-pull-request: {target-repo: "..."}` with token
Push updates to doc PR branch	Same-repo only	`push-to-pull-request-branch` does not support cross-repo
Maintain tracking comment on source PR	Supported	`safe-outputs: add-comment` (max 1 default)
Preserve human changes on doc PRs	Limited	AI reasoning, not deterministic logic
Manual trigger for specific PRs	Supported	`on: workflow_dispatch`
Batch run against multiple PRs	Not built-in	Would need custom logic
Fork security	Supported	Built-in sandboxing, agents run read-only
Cross-repo auth without stored secrets	Not solved	Still requires PAT or App token -- does not support OIDC + Key Vault

Advantages

~1200 lines of TypeScript would shrink to ~50-line Markdown file with YAML frontmatter
Built-in security via safe-outputs (least-privilege, agent runs read-only)
No build step (no tsc, no ncc, no dist/index.js to maintain)
Multi-model support (swap Copilot/Claude/Codex without code changes)
Maintained by GitHub (security model and tooling updates automatically)

Blockers for Adoption Today

Technical preview -- may change significantly; not production-ready for Azure/azure-dev
Cross-repo push limitation -- push-to-pull-request-branch is same-repo only, so we cannot iteratively update the MicrosoftDocs PR branch (a core requirement)
Same auth challenge -- still needs PAT or GitHub App token for cross-repo writes; does not support OIDC + Key Vault signing (our current approach)

Recommendation

Ship the current custom action (built, reviewed, deterministic), and track gh-aw as a migration target once it exits preview and adds cross-repo push support. The Markdown-based approach is a natural fit for this use case long-term.

References

- Switch auth from GitHub App secrets to OIDC + Key Vault signing
- Add eng/common login-to-github action (from azure-sdk-tools #14219)
- Fix 12 MQ review findings (CR-002 through CR-013)
- Update all deps to latest CJS-compatible versions (0 CVEs)
- Change docs repo to MicrosoftDocs/azure-dev-docs-pr
- Rebuild dist bundle

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 20 out of 22 changed files in this pull request and generated 6 comments.


Copilot · 2026-02-28T01:22:16Z

eng/common/scripts/login-to-github.ps1

+  $SignResultJson = az keyvault key sign `
+      --vault-name $VaultName `
+      --name $KeyName `
+      --algorithm RS256 `
+      --digest $Base64Value | ConvertFrom-Json
+
+  if ($LASTEXITCODE -ne 0) {
+    throw "Failed to sign JWT with Azure Key Vault. Error: $SignResult"
+  }
+


$SignResult is not defined, so the error path will itself throw/produce an unhelpful message. Also, piping az ... | ConvertFrom-Json means a non-zero az exit (or non-JSON output) can terminate before $LASTEXITCODE is checked. Capture the raw az output first, check $LASTEXITCODE, then ConvertFrom-Json only on success (and include the captured output in the thrown message).

Suggested change

-  $SignResultJson = az keyvault key sign `
-      --vault-name $VaultName `
-      --name $KeyName `
-      --algorithm RS256 `
-      --digest $Base64Value | ConvertFrom-Json
-
-  if ($LASTEXITCODE -ne 0) {
-    throw "Failed to sign JWT with Azure Key Vault. Error: $SignResult"
-  }
+  $SignResultRaw = az keyvault key sign `
+      --vault-name $VaultName `
+      --name $KeyName `
+      --algorithm RS256 `
+      --digest $Base64Value 2>&1
+
+  if ($LASTEXITCODE -ne 0) {
+    throw "Failed to sign JWT with Azure Key Vault. ExitCode: $LASTEXITCODE. Output: $SignResultRaw"
+  }
+
+  try {
+    $SignResultJson = $SignResultRaw | ConvertFrom-Json
+  }
+  catch {
+    throw "Failed to parse Azure Key Vault sign response as JSON. Raw output: $SignResultRaw"
+  }

This file is from eng/common/ and is owned by the eng-sys team. It's being contributed via azure-sdk-tools PR #14219. Deferring this feedback to that PR's reviewers.

Copilot · 2026-02-28T01:22:17Z

eng/common/scripts/login-to-github.ps1

+    $resp = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
+    if (!$resp.id) { throw "No installations found for this App." }
+    return $resp.id


Where-Object can return multiple matching installations, which makes $resp.id an array. That can later break the token exchange call that expects a single installation id. Select a single match deterministically (e.g., the first match) and improve the error to include the requested owner to aid troubleshooting.

Suggested change

-    $resp = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
-    if (!$resp.id) { throw "No installations found for this App." }
-    return $resp.id
+    $matches = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
+    if (-not $matches) {
+        throw "No installations found for this App and owner '$InstallationTokenOwner'."
+    }
+    $selected = $matches | Select-Object -First 1
+    if ($matches.Count -gt 1) {
+        Write-Warning "Multiple installations found for owner '$InstallationTokenOwner'. Using installation id $($selected.id)."
+    }
+    return $selected.id

This file is from eng/common/ and is owned by the eng-sys team. It's being contributed via azure-sdk-tools PR #14219. Deferring this feedback to that PR's reviewers.

Copilot · 2026-02-28T01:22:17Z

eng/common/actions/login-to-github/action.yml

+        $owners = $env:INPUT_TOKEN_OWNERS -split ',' | ForEach-Object { $_.Trim() }
+        & $scriptPath `
+          -KeyVaultName $env:INPUT_KEY_VAULT_NAME `
+          -KeyName $env:INPUT_KEY_NAME `
+          -GitHubAppId $env:INPUT_APP_ID `
+          -InstallationTokenOwners $owners `
+          -VariableNamePrefix $env:INPUT_VARIABLE_NAME_PREFIX


Splitting token-owners without filtering empty entries means values like an empty string (or a trailing comma) will produce "" as an owner and cause Get-GitHubInstallationId to fail. Filter out empty/whitespace-only owners after trimming (and consider failing fast if the resulting list is empty).

This file is from eng/common/ and is owned by the eng-sys team. It's being contributed via azure-sdk-tools PR #14219. Deferring this feedback to that PR's reviewers.

.github/actions/doc-monitor/src/docs-inventory.ts

.github/actions/doc-monitor/src/comment-tracker.ts

.github/actions/doc-monitor/src/diff.ts

- Pin actions to commit SHAs (actions/checkout, azure/login)
- Cap all_open/list mode to MAX_PRS_PER_RUN=20
- Cap AI output: MAX_REASON_LENGTH=200, MAX_SUMMARY_LENGTH=500
- Add MAX_IMPACTS=15 to limit AI-generated impact count
- Add MAX_CONTENT_SIZE_BYTES=50KB per doc file
- Sanitize doc manifest content (titles, topics, headings)
- Reject unknown repos from AI output (not just warn)
- Validate repo format with regex (owner/repo)
- Block path traversal in AI-returned paths
- Sanitize PR title in log output (strip control chars)
- Strip HTML from existing PR body in closeCompanionPrs
- Remove error messages from tracking comment (prevent data leak)
- Upper-bound PR number input to 999999
- Rename TRUSTED_DOC_INVENTORY to DOC_INVENTORY tag

Red team findings addressed: Azure#2, Azure#5, Azure#6, Azure#8, Azure#9, Azure#10, Azure#11
Admin items remaining: Azure#1 (env gating), Azure#3 (token scope), Azure#4 (OIDC vars)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jongio · 2026-02-28T05:33:00Z

🔧 Eng-Sys Action Items — Red Team Findings Requiring Admin/Infra Changes

The following items were identified during a red team security assessment of the doc-monitor workflow. They cannot be resolved through code changes alone and require admin or infrastructure configuration.

Finding #1 (CRITICAL) — `workflow_dispatch` Environment Gating

Risk: Any repo collaborator can manually trigger workflow_dispatch, which runs with contents: write + pull-requests: write permissions and mints a cross-org GitHub App token via OIDC.

Required Action: Add required_reviewers protection to the AzureSDKEngKeyVault environment so that workflow_dispatch runs require approval before executing.

Steps:

Go to Settings → Environments → AzureSDKEngKeyVault
Enable Required reviewers
Add the eng-sys team or designated approvers
pull_request_target runs (the normal path) already pass through the environment gate via OIDC, so this only adds a gate for manual dispatch

Finding #3 (HIGH) — Cross-Org GitHub App Token Scope

Risk: The GitHub App (ID 1086291) minted via OIDC + Key Vault signing is used for cross-org operations on MicrosoftDocs/azure-dev-docs-pr. If the App installation grants access to more repos than needed, a compromised token could affect other repos.

Required Action: Verify the GitHub App installation on the MicrosoftDocs org is scoped to only the azure-dev-docs-pr repository.

Steps:

Go to the GitHub App settings for App ID 1086291
Under Install & Authorize, check the MicrosoftDocs installation
Ensure it uses "Only select repositories" with only azure-dev-docs-pr selected
If it currently has "All repositories" access, restrict it

Finding #4 (HIGH) — Hardcoded OIDC Configuration

Risk: client-id, tenant-id, and subscription-id are hardcoded in the workflow YAML. If these values ever need rotation or the workflow is forked/copied, the GUIDs are visible in the source code.

Required Action: Create GitHub repository variables and update the workflow to reference them.

Steps:

Go to Settings → Variables → Actions
Create these repository variables:
- AZURE_CLIENT_ID = (current client-id value)
- AZURE_TENANT_ID = (current tenant-id value)
- AZURE_SUBSCRIPTION_ID = (current subscription-id value)
Once created, we will update doc-monitor.yml to use ${{ vars.AZURE_CLIENT_ID }} etc.

Summary of Code-Level Fixes Already Applied

All code-fixable findings have been addressed in commit 05628b66:

#	Severity	Finding	Status
2	HIGH	AI output drives write ops (prompt injection)	✅ MAX_IMPACTS=15, reject unknown repos, path traversal block
5	MED	No rate limit on all_open	✅ MAX_PRS_PER_RUN=20
6	MED	AI markdown output unsanitized	✅ MAX_REASON/SUMMARY_LENGTH caps
8	MED	Actions not pinned to SHA	✅ Pinned to commit SHAs
9	MED	Large file ReDoS/perf	✅ MAX_CONTENT_SIZE_BYTES=50KB
10	LOW	PR title in logs	✅ Control char stripping, truncation
11	MED	Doc manifest prompt injection	✅ sanitizeText() on all extracted data

cc @jongio

…escaping, magic number

- docs-inventory.ts: resolve default branch tree SHA instead of passing 'HEAD' to git.getTree (which can 404)
- comment-tracker.ts: strip backticks and carriage returns in escapeTableCell() to prevent markdown injection
- diff.ts: replace magic number 30 with actual string length for accurate size budgeting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings February 27, 2026 17:35

jongio requested review from danieljurek, rajeshkamal5050, tg-msft and vhvb1989 as code owners February 27, 2026 17:35

microsoft-github-policy-service bot assigned jongio Feb 27, 2026

Copilot started reviewing on behalf of jongio February 27, 2026 17:35 View session

jongio mentioned this pull request Feb 27, 2026

Add automated PR documentation impact analysis workflow #6924

Open

13 tasks

Copilot AI reviewed Feb 27, 2026

View reviewed changes

jongio and others added 3 commits February 27, 2026 09:44

fix: Make docs-repo-token optional for graceful degradation

7e83b90

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jongio requested a review from Copilot February 28, 2026 01:15

Copilot started reviewing on behalf of jongio February 28, 2026 01:16 View session

Copilot AI reviewed Feb 28, 2026

View reviewed changes

-    $resp = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
-    if (!$resp.id) { throw "No installations found for this App." }
-    return $resp.id
+    $matches = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
+    if (-not $matches) {
+        throw "No installations found for this App and owner '$InstallationTokenOwner'."
+    }
+    $selected = $matches | Select-Object -First 1
+    if ($matches.Count -gt 1) {
+        Write-Warning "Multiple installations found for owner '$InstallationTokenOwner'. Using installation id $($selected.id)."
+    }
+    return $selected.id
