feat: Add PR documentation monitor workflow #6927

Open
jongio wants to merge 8 commits into Azure:main from jongio:doc-monitor-workflow

Member

jongio commented Feb 27, 2026

Summary

Adds a new GitHub Actions workflow and custom TypeScript action that automatically monitors PRs for documentation impact. When a code PR is opened or updated against main, the workflow uses AI (GitHub Models API / GPT-4o) to analyze the diff and determine which docs in Azure/azure-dev and MicrosoftDocs/azure-dev-docs-pr need to be created, updated, or deleted.

Closes #6924

Security Model

This workflow is designed to be safe for fork PRs from external contributors.

| Concern | Mitigation |
| --- | --- |
| Fork modifies workflow YAML to steal secrets | pull_request_target runs workflow code from main, not from the fork |
| Secrets exposed on runner | No secrets stored in GitHub. OIDC federated credentials + Key Vault signing. Private key is non-exportable and never leaves Key Vault. |
| Token over-scoping | GitHub App token scoped to MicrosoftDocs/azure-dev-docs-pr only |
| Executing untrusted code | Action never checks out or executes PR code. All data read via GitHub REST API |
| Token tied to a person | GitHub App is an org-level identity, not tied to any individual |
| Markdown injection via PR title/body | Multi-layer sanitization: sanitizePlainText() on AI output, escapeTableCell() (strips HTML/markdown links, escapes pipes/newlines), sanitizeForMarkdown() on PR bodies, output length caps (MAX_REASON_LENGTH=200, MAX_SUMMARY_LENGTH=500) |
| Tracking comment spoofing | findTrackingComment verifies comment author is github-actions[bot] |
| AI prompt injection via doc content | sanitizeText() strips HTML tags and control characters from all doc manifest data (titles, topics, headings) before injection into AI prompt |
| AI output manipulation (prompt injection) | MAX_IMPACTS=15 cap on impact count, unknown repos rejected (not just warned), repo format validated via regex, path traversal (.., leading /) blocked |
| Resource exhaustion via batch modes | MAX_PRS_PER_RUN=20 cap on all_open and list modes |
| Large file denial-of-service | MAX_CONTENT_SIZE_BYTES=50KB per doc file; oversized files skipped during inventory |
| Supply chain via unpinned actions | All actions pinned to commit SHAs (actions/checkout@34e11487..., azure/login@a457da9e...) |
| Log injection via PR title | PR title truncated to 100 chars and control characters stripped before logging |
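
To make the last row of the table concrete, here is a minimal sketch of log sanitization for a PR title. The helper name `sanitizeTitleForLog` and the constant `MAX_LOG_TITLE_LENGTH` are hypothetical and do not come from the action's source. Only the 100-character cap and the control-character stripping come from the description above.

```typescript
// Hypothetical helper: illustrates "truncate to 100 chars and strip control
// characters before logging". Names are assumptions, not the action's code.
const MAX_LOG_TITLE_LENGTH = 100;

function sanitizeTitleForLog(title: string): string {
  // Remove C0 control characters (including CR and LF) and DEL so that a
  // crafted title cannot start a new, forged log line.
  const stripped = title.replace(/[\u0000-\u001f\u007f]/g, "");
  // Cap the length so a very long title cannot flood the log.
  return stripped.slice(0, MAX_LOG_TITLE_LENGTH);
}
```

Stripping before truncating keeps the cap meaningful: the 100 characters that survive are all printable.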

Auth Flow

flowchart LR
    OIDC["1. GitHub OIDC token | (id-token: write)"]
    AZ["2. azure/login@v2 | (federated credentials)"]
    KV["3. az keyvault key sign | (non-exportable RSA key | in azuresdkengkeyvault)"]
    JWT["4. Signed JWT | (GitHub App 1086291)"]
    INST["5. POST /app/installations/ | {id}/access_tokens"]
    TOKEN["6. Installation token | (scoped, 1h TTL)"]

    OIDC --> AZ --> KV --> JWT --> INST --> TOKEN
  1. GITHUB_TOKEN handles in-repo operations (read PR diff, create doc PRs in azure-dev, post comments)
  2. Workflow requests OIDC token from GitHub (id-token: write permission)
  3. azure/login@v2 exchanges OIDC token for Azure access using federated credentials (no client secret)
  4. eng/common/actions/login-to-github composite action signs a JWT using az keyvault key sign (RSA key azure-sdk-automation in azuresdkengkeyvault -- key is non-exportable)
  5. JWT is exchanged for a short-lived GitHub App installation token scoped to MicrosoftDocs/azure-dev-docs-pr
  6. Token is exported as GH_TOKEN env var, expires after 1 hour, never stored

No secrets are stored in GitHub. The entire auth chain is keyless -- OIDC federation replaces client secrets, and Key Vault signing replaces private key storage.
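
The signing step can be sketched as follows. In this PR the real work is done by the PowerShell script login-to-github.ps1; the TypeScript below is only an illustration of the same idea, and every function and type name in it is hypothetical. It builds the unsigned JWT for a GitHub App and the SHA-256 digest that would be handed to `az keyvault key sign`. The Key Vault call itself is left out, and the assumption that the returned signature is already base64url-encoded is mine.

```typescript
import { createHash } from "node:crypto";

// Base64url (no padding) is the encoding used for JWT segments.
function base64url(input: string): string {
  return Buffer.from(input, "utf8").toString("base64url");
}

interface UnsignedJwt {
  signingInput: string; // "<header>.<payload>", the bytes that get signed
  digestBase64: string; // SHA-256 of signingInput, base64-encoded
}

// Hypothetical helper: builds the unsigned part of a GitHub App JWT.
function buildUnsignedAppJwt(appId: string, nowSeconds: number): UnsignedJwt {
  const header = base64url(JSON.stringify({ alg: "RS256", typ: "JWT" }));
  const payload = base64url(
    JSON.stringify({
      iat: nowSeconds - 60, // backdated to tolerate clock drift
      exp: nowSeconds + 9 * 60, // GitHub caps App JWT lifetime at 10 minutes
      iss: appId,
    }),
  );
  const signingInput = `${header}.${payload}`;
  // Only this digest is sent to Key Vault; the private key never leaves it.
  const digestBase64 = createHash("sha256").update(signingInput).digest("base64");
  return { signingInput, digestBase64 };
}

// Assumption: the signature segment is already base64url-encoded.
function assembleJwt(signingInput: string, signatureBase64Url: string): string {
  return `${signingInput}.${signatureBase64Url}`;
}
```

The assembled JWT is what gets exchanged for the installation token in step 5. Because the runner only ever sees the digest and the resulting signature, there is no key material to exfiltrate.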

Flow

flowchart TD
    A["PR Event: opened/synchronize/closed"] --> B{Event Type?}
    B -->|opened / synchronize| C["Fetch PR Diff via API"]
    B -->|closed + merged| SKIP["Skip: PRs already exist"]
    B -->|closed + not merged| Z["Close doc PRs, clean branches"]

    D["Manual Trigger"] --> E{Mode?}
    E -->|single| C
    E -->|all_open| F["Enumerate open PRs"]
    E -->|list| G["Parse PR numbers"]
    F --> C
    G --> C

    C --> H["Classify changes"]
    H --> I["Build docs inventory"]
    I --> J["AI Analysis via GPT-4o"]
    J --> K{Docs impacted?}

    K -->|No| L["Post: no doc changes needed"]
    K -->|Yes| M["Generate doc proposals"]

    M --> N{"In-repo docs?"}
    N -->|Yes| O["Branch: docs/pr-N in azure-dev"]
    O --> P["Create/update PR"]
    N -->|No| Q{"External docs?"}

    P --> Q
    Q -->|Yes| R["Mint token via OIDC"]
    R --> R2["Branch: docs/pr-N in docs repo"]
    R2 --> S["Create/update docs PR"]
    Q -->|No| T["Update tracking comment"]

    S --> T
    L --> U["Done"]
    T --> U
    Z --> U
    SKIP --> U
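
The event routing at the top of the flowchart can be sketched as a small pure function. This is an illustrative reading of the diagram, not the action's code: `routePrEvent`, `companionBranchName`, and the `Route` type are hypothetical names. The `mergedAt` field mirrors the REST API's merged_at, which a later commit in this PR adopts for merge detection.

```typescript
type Route = "analyze" | "skip" | "cleanup" | "ignore";

interface PrEvent {
  action: string; // e.g. "opened", "synchronize", "reopened", "closed"
  mergedAt: string | null; // mirrors the REST API's merged_at field
}

// Hypothetical router following the flowchart branches.
function routePrEvent(event: PrEvent): Route {
  switch (event.action) {
    case "opened":
    case "synchronize":
    case "reopened":
      return "analyze";
    case "closed":
      // Merged PRs keep their companion PRs; unmerged ones get cleaned up.
      return event.mergedAt !== null ? "skip" : "cleanup";
    default:
      return "ignore";
  }
}

// Deterministic branch naming (docs/pr-N) lets re-runs find the same branch.
function companionBranchName(prNumber: number): string {
  return `docs/pr-${prNumber}`;
}
```

Keeping the routing free of API calls makes each branch of the diagram easy to unit test.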

Architecture

graph TD
    subgraph "GitHub Action (.github/actions/doc-monitor)"
        IDX[index.ts - Entry Point] --> INP[inputs.ts - Validation]
        IDX --> PROC[processor.ts - Orchestrator]
        PROC --> DIFF[diff.ts - PR Diff Extraction]
        PROC --> INV[docs-inventory.ts - Doc Manifest]
        PROC --> ANA[analyze.ts - AI Analysis]
        PROC --> PRM[pr-manager.ts - Companion PRs]
        PROC --> CMT[comment-tracker.ts - Tracking Comments]
        PRM --> GHU[github-utils.ts - API Helpers]
        PRM --> PRB[pr-body.ts - Markdown Builders]
        ANA --> CON[constants.ts - Config Values]
    end

    subgraph "Auth (eng/common)"
        LOGIN[login-to-github - Composite Action] --> SCRIPT[login-to-github.ps1 - Key Vault JWT Signing]
    end

    ANA -->|"OpenAI API"| GMAI["GitHub Models GPT-4o"]
    DIFF -->|"REST API"| GH["GitHub API"]
    INV -->|"REST API"| GH
    PRM -->|"REST API"| GH
    CMT -->|"REST API"| GH
    SCRIPT -->|"az keyvault key sign"| KV["Azure Key Vault"]

What it does

  1. Triggers on pull_request_target events (opened, synchronize, reopened, closed) targeting main, plus manual workflow_dispatch
  2. Extracts the PR diff via GitHub REST API and classifies changes
  3. Inventories documentation in both Azure/azure-dev and MicrosoftDocs/azure-dev-docs-pr (using git.getTree + git.getBlob for efficiency, with sanitizeText() on all extracted content)
  4. Analyzes changes using GitHub Models AI (GPT-4o) to determine doc impact, with comprehensive output validation (repo format regex, path traversal blocking, unknown repo rejection, MAX_IMPACTS=15 cap, output length caps)
  5. Creates companion PRs in the appropriate repos with deterministic branch naming (docs/pr-{N})
  6. Posts a tracking comment on the source PR linking to all companion doc PRs (with author verification and multi-layer markdown injection prevention)
  7. Assigns doc PRs to alexwolfmsft and diberry
  8. Respects human edits -- never force-pushes over manual changes on doc branches
  9. Skips merged PRs -- avoids wasteful re-analysis of already-merged PRs
  10. Graceful degradation -- without cross-repo token, still scans and reports impacts

Modes

| Mode | Trigger | Description |
| --- | --- | --- |
| auto | PR events | Processes the triggering PR |
| single | workflow_dispatch | Process a specific PR by number |
| all_open | workflow_dispatch | Process all open PRs targeting main (capped at 20) |
| list | workflow_dispatch | Process a comma-separated list of PR numbers (capped at 20) |
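
As an illustration of the list mode, here is a minimal sketch of parsing the comma-separated input. The function name `parsePrList` is hypothetical. The cap of 20 comes from the table above, and the 999999 upper bound comes from a later commit in this PR. De-duplicating entries and throwing on malformed ones are assumptions about reasonable behavior, not something the PR confirms.

```typescript
const MAX_PRS_PER_RUN = 20;
const MAX_PR_NUMBER = 999999;

// Hypothetical parser for the "list" mode input.
function parsePrList(input: string): number[] {
  const seen = new Set<number>();
  for (const raw of input.split(",")) {
    const token = raw.trim();
    if (token === "") continue; // tolerate trailing commas and blanks
    if (!/^\d+$/.test(token)) {
      throw new Error(`Invalid PR number: "${token}"`);
    }
    const n = Number(token);
    if (n < 1 || n > MAX_PR_NUMBER) {
      throw new Error(`PR number out of range: ${token}`);
    }
    seen.add(n); // a Set drops duplicates while keeping first-seen order
  }
  // Apply the per-run cap after de-duplication.
  return Array.from(seen).slice(0, MAX_PRS_PER_RUN);
}
```

For example, `parsePrList("12, 7,12,")` yields `[12, 7]`.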

Security Hardening

A comprehensive red team assessment was performed against the action simulating 10 attacker personas (Script Kiddie through Nation-State). 11 findings were produced:

| # | Severity | Finding | Resolution |
| --- | --- | --- | --- |
| 1 | CRIT | workflow_dispatch any-collaborator trigger | Admin: add required_reviewers to environment (comment) |
| 2 | HIGH | AI output drives write operations (prompt injection) | Code: MAX_IMPACTS=15, reject unknown repos, path traversal block, repo format validation |
| 3 | HIGH | Cross-org token scope | Admin: verify App installation scoped to single repo (comment) |
| 4 | HIGH | Hardcoded OIDC GUIDs | Admin: move to vars.* repo variables (comment) |
| 5 | MED | No rate limit on batch modes | Code: MAX_PRS_PER_RUN=20 |
| 6 | MED | AI output length uncapped | Code: MAX_REASON_LENGTH=200, MAX_SUMMARY_LENGTH=500 |
| 7 | MED | Doc inventory from attacker paths | Low residual: content from main branch only |
| 8 | MED | Actions not pinned to SHA | Code: pinned to commit SHAs |
| 9 | MED | Large file ReDoS/perf risk | Code: MAX_CONTENT_SIZE_BYTES=50KB |
| 10 | LOW | PR title logged unsanitized | Code: control char stripping + truncation |
| 11 | MED | Doc manifest prompt injection | Code: sanitizeText() on all extracted data |

Files

| Path | Purpose |
| --- | --- |
| .github/workflows/doc-monitor.yml | Workflow definition (pull_request_target + OIDC + Key Vault) |
| .github/actions/doc-monitor/ | Custom TypeScript action (12 source modules + compiled dist) |
| eng/common/actions/login-to-github/action.yml | Composite action wrapping Key Vault signing script |
| eng/common/scripts/login-to-github.ps1 | PowerShell script that mints GitHub App tokens via Key Vault |

Prerequisites (managed by EngSys)

| Component | Value | Purpose |
| --- | --- | --- |
| GitHub Environment | AzureSDKEngKeyVault | OIDC federated credential binding |
| Azure Key Vault | azuresdkengkeyvault | Hosts the non-exportable RSA signing key |
| Key Vault Key | azure-sdk-automation | RSA key used to sign GitHub App JWTs |
| GitHub App | ID 1086291 (Azure SDK Automation) | Installed on MicrosoftDocs org with contents:write + pull_requests:write |
| Workflow permissions | id-token: write, contents: write, pull-requests: write, models: read | Already configured in workflow YAML |

Auth Evolution

This PR went through three auth approaches; the third is the current design:

  1. PAT (rejected) -- secrets exposed on runner, tied to individual, doesn't expire automatically
  2. actions/create-github-app-token (rejected) -- requires GitHub App private key as a GitHub secret, which is present on the runner during workflow execution
  3. OIDC + Key Vault signing (current) -- no secrets in GitHub, private key never leaves Key Vault, fully keyless OIDC chain. Uses the same pattern as azure-sdk-tools (see PR #14219)

Adds a GitHub Actions workflow + custom TypeScript action that uses
GitHub Models AI (GPT-4o) to analyze PR diffs and identify impacted
documentation across Azure/azure-dev and MicrosoftDocs/azure-dev-docs-pr.

Creates companion doc PRs, posts tracking comments, and supports
manual batch processing via workflow_dispatch.

Closes Azure#6924

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds an automated “doc-monitor” GitHub Actions workflow plus a custom TypeScript action that analyzes PR diffs with GitHub Models (GPT-4o) to determine documentation impact and creates/updates companion documentation PRs.

Changes:

  • Introduces .github/workflows/doc-monitor.yml to run on PR events and manual dispatch modes.
  • Adds a custom Node 20 TypeScript action under .github/actions/doc-monitor/ to fetch diffs, inventory docs across repos, run AI analysis, and manage companion PRs + tracking comments.
  • Adds supporting action metadata, build config, and documentation.

Reviewed changes

Copilot reviewed 18 out of 20 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| .github/workflows/doc-monitor.yml | New workflow wiring (triggers, permissions, concurrency, action invocation). |
| .github/actions/doc-monitor/action.yml | Declares action inputs/outputs + Node runtime entrypoint. |
| .github/actions/doc-monitor/package.json | Action dependencies and build/test scripts. |
| .github/actions/doc-monitor/tsconfig.json | TypeScript compilation settings for the action. |
| .github/actions/doc-monitor/src/index.ts | Mode routing and PR enumeration (auto/single/all_open/list). |
| .github/actions/doc-monitor/src/inputs.ts | Input parsing + validation. |
| .github/actions/doc-monitor/src/processor.ts | Orchestrates diff fetch, inventory, AI analysis, PR/comment updates. |
| .github/actions/doc-monitor/src/diff.ts | PR metadata/files fetch + change classification + diff summarization. |
| .github/actions/doc-monitor/src/docs-inventory.ts | Builds markdown doc inventory from repo contents for AI context. |
| .github/actions/doc-monitor/src/analyze.ts | GitHub Models/OpenAI client integration + response validation. |
| .github/actions/doc-monitor/src/pr-manager.ts | Companion branch/PR creation, updates, and closure behavior. |
| .github/actions/doc-monitor/src/comment-tracker.ts | Tracking comment create/update and formatting. |
| .github/actions/doc-monitor/src/pr-body.ts | Markdown body/summary builders for companion PRs. |
| .github/actions/doc-monitor/src/constants.ts | Centralized constants for limits, defaults, and markers. |
| .github/actions/doc-monitor/src/types.ts | Shared type definitions for the action. |
| .github/actions/doc-monitor/README.md | Local documentation for configuring and developing the action. |
| .github/actions/doc-monitor/.gitignore | Ignores action-local node_modules. |


jongio and others added 3 commits February 27, 2026 09:44
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Switch external docs repo from MicrosoftDocs/azure-dev-docs-pr (private)
to MicrosoftDocs/azure-dev-docs (public). When DOCS_REPO_PAT is not set,
fall back to GITHUB_TOKEN for reading the public docs repo inventory.
Companion PR creation still requires DOCS_REPO_PAT.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace PAT-based cross-repo auth with GitHub App token minting via
actions/create-github-app-token. Switch trigger from pull_request to
pull_request_target to prevent fork PRs from exfiltrating secrets.

Security model:
- pull_request_target runs workflow code from main (not fork branch)
- GitHub App tokens are short-lived (1 hour), scoped to specific repo
- Action reads PR data via GitHub API only, never executes PR code

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member Author

jongio commented Feb 27, 2026

Security Stance

This workflow is designed to be safe for use with fork PRs from external contributors. Here's why:

1. pull_request_target trigger (not pull_request)

The workflow uses pull_request_target which runs the workflow code from the base branch (main), not from the PR branch. This is the critical security boundary:

  • pull_request (unsafe for secrets): Runs workflow YAML from the PR head. A fork contributor could modify .github/workflows/doc-monitor.yml to exfiltrate any secrets the workflow has access to.
  • pull_request_target (safe for secrets): Runs workflow YAML from the repository's default branch. Fork contributors cannot modify the workflow code that executes.

2. No PR code checkout or execution

The action never checks out the PR's source code. The only actions/checkout step loads the action's own compiled code from main:

- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
  with:
    ref: ${{ github.event.repository.default_branch }}
    sparse-checkout: .github/actions/doc-monitor

All PR data (diff, files, metadata) is read exclusively through the GitHub REST API (octokit.pulls.listFiles(), octokit.pulls.get()). There is no path for untrusted code execution.

3. OIDC + Key Vault signing (no secrets in GitHub)

Cross-repo write access uses OIDC federated credentials + Azure Key Vault signing instead of stored secrets:

| Property | PAT / App Private Key | OIDC + Key Vault |
| --- | --- | --- |
| Secrets in GitHub | Yes (PAT or private key stored as repo/org secret) | None (OIDC is fully keyless) |
| Private key exposure | On runner during workflow execution | Never leaves Key Vault (signing is server-side) |
| Token lifetime | Until manually revoked (PAT) or 1h (App token) | 1h (App installation token) |
| Scope | Depends on configuration | Only repos where App is installed |
| Identity | Tied to a person (PAT) or App | Org-level App identity |

The auth chain works as follows:

  1. GitHub OIDC provider issues a token to the workflow (id-token: write)
  2. azure/login@v2 exchanges the OIDC token for Azure access using federated credentials (no client secret needed)
  3. eng/common/actions/login-to-github creates a JWT, computes its SHA-256 hash, and signs it via az keyvault key sign --algorithm RS256 (the RSA key azure-sdk-automation in azuresdkengkeyvault is non-exportable)
  4. The signed JWT is exchanged for a GitHub App installation token via POST /app/installations/{id}/access_tokens
  5. The token is scoped to MicrosoftDocs/azure-dev-docs-pr only and expires in 1 hour

This is the same pattern used by the Azure SDK EngSys team across Azure SDK repos. See azure-sdk-tools PR #14219 for the composite action source.

4. Multi-layer injection prevention

PR titles, bodies, and doc content are attacker-controlled data that flow through the system. Five sanitization layers prevent injection at every stage:

| Layer | Function | Where Applied | What It Does |
| --- | --- | --- | --- |
| sanitizeText() | Doc inventory input | titles, topics, H2 headings from doc files | Strips HTML tags and control characters before they enter the AI prompt |
| Anti-injection system prompt | AI analysis | GPT-4o system message | Instructs the model to ignore embedded instructions in untrusted data |
| sanitizePlainText() | AI output | reason, suggestedChanges, summary, repo, path | Strips HTML, control chars, and excessive whitespace from AI responses |
| escapeTableCell() | Tracking comment | all table cell values | Strips HTML tags, converts markdown links to plain text, removes images, escapes pipes, collapses newlines |
| sanitizeForMarkdown() | PR bodies | companion PR body content | Prevents markdown injection in generated PR descriptions |
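
Two of these layers are simple enough to sketch. The functions below are hypothetical stand-ins (note the `Sketch` suffix) that follow the behavior described in the table. They are not the action's implementation, and a regex-based tag stripper like this should not be treated as a complete HTML sanitizer.

```typescript
// Sketch of the sanitizeText() layer: strip HTML tags and control characters
// from doc manifest data before it is placed into the AI prompt.
function sanitizeTextSketch(value: string): string {
  return value
    .replace(/<[^>]*>/g, "") // drop anything that looks like an HTML tag
    .replace(/[\u0000-\u001f\u007f]/g, " ") // control characters become spaces
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

// Sketch of the escapeTableCell() layer: make a value safe inside one cell
// of a Markdown table in the tracking comment.
function escapeTableCellSketch(value: string): string {
  return value
    .replace(/!\[[^\]]*\]\([^)]*\)/g, "") // remove images (before links)
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1") // keep link text, drop the URL
    .replace(/<[^>]*>/g, "") // strip HTML tags
    .replace(/\|/g, "\\|") // a raw pipe would start a new cell
    .replace(/\s*[\r\n]+\s*/g, " ") // a newline would end the table row
    .trim();
}
```

The order matters in the second function: images share the link syntax, so they have to be removed before links are reduced to their text.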

Additionally, AI output is structurally constrained:

  • MAX_REASON_LENGTH=200 and MAX_SUMMARY_LENGTH=500 cap output field lengths
  • MAX_IMPACTS=15 caps the number of doc impacts the AI can propose
  • Unknown repos are rejected (not just warned) — AI cannot target arbitrary repositories
  • Repo format validated via regex (owner/repo pattern required)
  • Path traversal blocked (.. and leading / rejected)
  • Error messages redacted from tracking comments to prevent data leakage
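
The structural constraints above can be sketched as a single validation pass over the model's output. Everything here is an illustrative assumption: the `DocImpact` shape, the `validateImpacts` name, and the exact regex are not taken from the action. Only the limits and the reject-rather-than-warn behavior come from the list.

```typescript
interface DocImpact {
  repo: string; // expected as "owner/repo"
  path: string; // repo-relative path to a doc file
  reason: string;
}

const MAX_IMPACTS = 15;
const MAX_REASON_LENGTH = 200;
// Assumed pattern for "owner/repo"; the action's actual regex may differ.
const REPO_PATTERN = /^[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+$/;

// Hypothetical validator: drops any impact that fails a structural check.
function validateImpacts(
  raw: readonly DocImpact[],
  allowedRepos: readonly string[],
): DocImpact[] {
  const accepted: DocImpact[] = [];
  for (const impact of raw) {
    if (accepted.length >= MAX_IMPACTS) break; // cap the impact count
    if (!REPO_PATTERN.test(impact.repo)) continue; // malformed repo
    if (!allowedRepos.includes(impact.repo)) continue; // unknown repo: reject
    if (impact.path.startsWith("/") || impact.path.includes("..")) continue; // traversal
    accepted.push({ ...impact, reason: impact.reason.slice(0, MAX_REASON_LENGTH) });
  }
  return accepted;
}
```

Because the allow-list is supplied by the caller rather than derived from model output, a prompt-injected response cannot widen the set of repositories the action writes to.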

5. Tracking comment author verification

findTrackingComment verifies that the comment author is github-actions[bot] in addition to checking for the marker. This prevents an attacker from pre-planting a comment with the marker to hijack the tracking display.
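
A minimal sketch of that check, assuming a simplified comment shape. The marker string and the `IssueComment` interface are hypothetical; the real action defines its own marker in constants.ts.

```typescript
interface IssueComment {
  id: number;
  body: string;
  authorLogin: string;
}

// Hypothetical marker value, used here only for illustration.
const TRACKING_MARKER = "<!-- doc-monitor-tracking -->";
const TRUSTED_AUTHOR = "github-actions[bot]";

// Returns the tracking comment only if both the marker and the author match.
function findTrackingCommentSketch(
  comments: readonly IssueComment[],
): IssueComment | undefined {
  return comments.find(
    (c) => c.authorLogin === TRUSTED_AUTHOR && c.body.includes(TRACKING_MARKER),
  );
}
```

A comment that carries the marker but was posted by another account is skipped, so the workflow would create its own comment instead of editing the planted one.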

6. Bot loop prevention

The workflow skips execution when:

  • The PR head ref starts with docs/pr- (it's a companion doc PR, not a code PR)
  • The PR actor is github-actions[bot] (prevents infinite recursion)
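
Expressed as code, the two skip conditions amount to one predicate. The function name is hypothetical, and in the actual workflow this check may live in a YAML `if:` expression rather than in TypeScript.

```typescript
// Hypothetical guard: true when the run should be skipped to avoid a loop.
function shouldSkipRun(headRef: string, actor: string): boolean {
  const isCompanionDocPr = headRef.startsWith("docs/pr-");
  const isBotActor = actor === "github-actions[bot]";
  return isCompanionDocPr || isBotActor;
}
```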

7. Actions pinned to commit SHAs

All third-party actions are pinned to immutable commit SHAs to prevent supply chain attacks via tag mutation:

actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5   # v2

8. Resource exhaustion prevention

Batch processing modes are rate-limited:

  • MAX_PRS_PER_RUN=20 — caps all_open and list modes
  • MAX_CONTENT_SIZE_BYTES=50KB — skips oversized doc files during inventory
  • MAX_CONTENT_FETCHES=50 — limits the number of doc files fetched per repo
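
The two inventory caps can be sketched as a filter over the file listing. The `DocFile` shape and the function name are hypothetical. Treating 50KB as 50 × 1024 bytes and restricting the inventory to .md files are assumptions.

```typescript
interface DocFile {
  path: string;
  sizeBytes: number;
}

const MAX_CONTENT_SIZE_BYTES = 50 * 1024; // assumption: 50KB means 51200 bytes
const MAX_CONTENT_FETCHES = 50;

// Hypothetical selector: decides which doc files are worth fetching.
function selectFilesToFetch(files: readonly DocFile[]): DocFile[] {
  return files
    .filter((f) => f.path.endsWith(".md")) // inventory covers markdown docs
    .filter((f) => f.sizeBytes <= MAX_CONTENT_SIZE_BYTES) // skip oversized files
    .slice(0, MAX_CONTENT_FETCHES); // bound the number of blob fetches per repo
}
```

Filtering on the size reported by the tree listing means oversized files are never downloaded at all, which is what keeps the cost bounded.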

Summary

This design follows the most secure pattern available for GitHub Actions workflows that need cross-repo write access on fork PRs:

  • pull_request_target runs trusted code from main
  • PR data read via API only (never checkout)
  • OIDC federated credentials (no secrets stored in GitHub)
  • Key Vault signing (private key never on runner)
  • Short-lived scoped tokens for cross-repo operations
  • 5-layer input/output sanitization chain
  • Structural AI output constraints (impact count, field length, repo validation, path traversal blocking)
  • Actions pinned to commit SHAs
  • Resource exhaustion prevention (PR count, file size, fetch count caps)
  • Author verification on tracking comments
  • Error message redaction from public-facing comments

- Use merged_at instead of merged for reliable merge detection (thread #1)
- Expand isDocOnlyPr to handle doc-adjacent assets (thread #2)
- Replace N+1 API calls with git.getTree for doc inventory (thread #3)
- Fix README trigger types to match actual workflow config (thread #5)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member Author

jongio commented Feb 27, 2026

Research: GitHub Agentic Workflows as Alternative to Custom Action

Investigated whether GitHub Agentic Workflows (technical preview, Feb 2026) could replace our custom TypeScript action. Here is the analysis:

What It Is

GitHub Agentic Workflows let you define repository automation in Markdown instead of code. An AI agent (Copilot, Claude, or Codex) interprets natural-language instructions, runs read-only in a sandbox, then uses "safe-outputs" (structured, permission-separated jobs) to write back to GitHub. Developed by GitHub Next and Microsoft Research.

Capability Mapping

| Our Requirement | gh-aw Support | Notes |
| --- | --- | --- |
| Trigger on PR events | Supported | on: pull_request |
| Analyze PR diff with AI | Supported | Core feature -- agent reads repo context |
| Scan docs in azure-dev | Supported | Full read access to own repo |
| Scan docs in MicrosoftDocs/azure-dev-docs-pr | Supported | checkout: with cross-repo token |
| Create PR in azure-dev | Supported | safe-outputs: create-pull-request |
| Create PR in azure-dev-docs-pr (cross-repo) | Supported | create-pull-request: {target-repo: "..."} with token |
| Push updates to doc PR branch | Same-repo only | push-to-pull-request-branch does not support cross-repo |
| Maintain tracking comment on source PR | Supported | safe-outputs: add-comment (max 1 default) |
| Preserve human changes on doc PRs | Limited | AI reasoning, not deterministic logic |
| Manual trigger for specific PRs | Supported | on: workflow_dispatch |
| Batch run against multiple PRs | Not built-in | Would need custom logic |
| Fork security | Supported | Built-in sandboxing, agents run read-only |
| Cross-repo auth without stored secrets | Not solved | Still requires PAT or App token -- does not support OIDC + Key Vault |

Advantages

  • ~1200 lines of TypeScript would shrink to a ~50-line Markdown file with YAML frontmatter
  • Built-in security via safe-outputs (least-privilege, agent runs read-only)
  • No build step (no tsc, no ncc, no dist/index.js to maintain)
  • Multi-model support (swap Copilot/Claude/Codex without code changes)
  • Maintained by GitHub (security model and tooling updates automatically)

Blockers for Adoption Today

  1. Technical preview -- may change significantly; not production-ready for Azure/azure-dev
  2. Cross-repo push limitation -- push-to-pull-request-branch is same-repo only, so we cannot iteratively update the MicrosoftDocs PR branch (a core requirement)
  3. Same auth challenge -- still needs PAT or GitHub App token for cross-repo writes; does not support OIDC + Key Vault signing (our current approach)

Recommendation

Ship the current custom action (built, reviewed, deterministic), and track gh-aw as a migration target once it exits preview and adds cross-repo push support. The Markdown-based approach is a natural fit for this use case long-term.

- Switch auth from GitHub App secrets to OIDC + Key Vault signing
- Add eng/common login-to-github action (from azure-sdk-tools #14219)
- Fix 12 MQ review findings (CR-002 through CR-013)
- Update all deps to latest CJS-compatible versions (0 CVEs)
- Change docs repo to MicrosoftDocs/azure-dev-docs-pr
- Rebuild dist bundle

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 20 out of 22 changed files in this pull request and generated 6 comments.


Comment on lines +100 to +109
$SignResultJson = az keyvault key sign `
--vault-name $VaultName `
--name $KeyName `
--algorithm RS256 `
--digest $Base64Value | ConvertFrom-Json

if ($LASTEXITCODE -ne 0) {
throw "Failed to sign JWT with Azure Key Vault. Error: $SignResult"
}

Copilot AI Feb 28, 2026

$SignResult is not defined, so the error path will itself throw/produce an unhelpful message. Also, piping az ... | ConvertFrom-Json means a non-zero az exit (or non-JSON output) can terminate before $LASTEXITCODE is checked. Capture the raw az output first, check $LASTEXITCODE, then ConvertFrom-Json only on success (and include the captured output in the thrown message).

Suggested change
- $SignResultJson = az keyvault key sign `
-     --vault-name $VaultName `
-     --name $KeyName `
-     --algorithm RS256 `
-     --digest $Base64Value | ConvertFrom-Json
- if ($LASTEXITCODE -ne 0) {
-     throw "Failed to sign JWT with Azure Key Vault. Error: $SignResult"
- }
+ $SignResultRaw = az keyvault key sign `
+     --vault-name $VaultName `
+     --name $KeyName `
+     --algorithm RS256 `
+     --digest $Base64Value 2>&1
+ if ($LASTEXITCODE -ne 0) {
+     throw "Failed to sign JWT with Azure Key Vault. ExitCode: $LASTEXITCODE. Output: $SignResultRaw"
+ }
+ try {
+     $SignResultJson = $SignResultRaw | ConvertFrom-Json
+ }
+ catch {
+     throw "Failed to parse Azure Key Vault sign response as JSON. Raw output: $SignResultRaw"
+ }

Member Author

This file is from eng/common/ and is owned by the eng-sys team. It's being contributed via azure-sdk-tools PR #14219. Deferring this feedback to that PR's reviewers.

Comment on lines +133 to +135
$resp = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
if (!$resp.id) { throw "No installations found for this App." }
return $resp.id
Copilot AI Feb 28, 2026

Where-Object can return multiple matching installations, which makes $resp.id an array. That can later break the token exchange call that expects a single installation id. Select a single match deterministically (e.g., the first match) and improve the error to include the requested owner to aid troubleshooting.

Suggested change
- $resp = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
- if (!$resp.id) { throw "No installations found for this App." }
- return $resp.id
+ $matches = $resp | Where-Object { $_.account.login -ieq $InstallationTokenOwner }
+ if (-not $matches) {
+     throw "No installations found for this App and owner '$InstallationTokenOwner'."
+ }
+ $selected = $matches | Select-Object -First 1
+ if ($matches.Count -gt 1) {
+     Write-Warning "Multiple installations found for owner '$InstallationTokenOwner'. Using installation id $($selected.id)."
+ }
+ return $selected.id

Member Author

This file is from eng/common/ and is owned by the eng-sys team. It's being contributed via azure-sdk-tools PR #14219. Deferring this feedback to that PR's reviewers.

Comment on lines +96 to +102
$owners = $env:INPUT_TOKEN_OWNERS -split ',' | ForEach-Object { $_.Trim() }
& $scriptPath `
-KeyVaultName $env:INPUT_KEY_VAULT_NAME `
-KeyName $env:INPUT_KEY_NAME `
-GitHubAppId $env:INPUT_APP_ID `
-InstallationTokenOwners $owners `
-VariableNamePrefix $env:INPUT_VARIABLE_NAME_PREFIX
Copilot AI Feb 28, 2026

Splitting token-owners without filtering empty entries means values like an empty string (or a trailing comma) will produce \"\" as an owner and cause Get-GitHubInstallationId to fail. Filter out empty/whitespace-only owners after trimming (and consider failing fast if the resulting list is empty).

Member Author

This file is from eng/common/ and is owned by the eng-sys team. It's being contributed via azure-sdk-tools PR #14219. Deferring this feedback to that PR's reviewers.

- Pin actions to commit SHAs (actions/checkout, azure/login)
- Cap all_open/list mode to MAX_PRS_PER_RUN=20
- Cap AI output: MAX_REASON_LENGTH=200, MAX_SUMMARY_LENGTH=500
- Add MAX_IMPACTS=15 to limit AI-generated impact count
- Add MAX_CONTENT_SIZE_BYTES=50KB per doc file
- Sanitize doc manifest content (titles, topics, headings)
- Reject unknown repos from AI output (not just warn)
- Validate repo format with regex (owner/repo)
- Block path traversal in AI-returned paths
- Sanitize PR title in log output (strip control chars)
- Strip HTML from existing PR body in closeCompanionPrs
- Remove error messages from tracking comment (prevent data leak)
- Upper-bound PR number input to 999999
- Rename TRUSTED_DOC_INVENTORY to DOC_INVENTORY tag

Red team findings addressed: #2, #5, #6, #8, #9, #10, #11
Admin items remaining: #1 (env gating), #3 (token scope), #4 (OIDC vars)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member Author

jongio commented Feb 28, 2026

🔧 Eng-Sys Action Items — Red Team Findings Requiring Admin/Infra Changes

The following items were identified during a red team security assessment of the doc-monitor workflow. They cannot be resolved through code changes alone and require admin or infrastructure configuration.


Finding #1 (CRITICAL) — workflow_dispatch Environment Gating

Risk: Any repo collaborator can manually trigger workflow_dispatch, which runs with contents: write + pull-requests: write permissions and mints a cross-org GitHub App token via OIDC.

Required Action: Add required_reviewers protection to the AzureSDKEngKeyVault environment so that workflow_dispatch runs require approval before executing.

Steps:

  1. Go to Settings → Environments → AzureSDKEngKeyVault
  2. Enable Required reviewers
  3. Add the eng-sys team or designated approvers
  4. pull_request_target runs (the normal path) already pass through the environment gate via OIDC, so this only adds a gate for manual dispatch

Finding #3 (HIGH) — Cross-Org GitHub App Token Scope

Risk: The GitHub App (ID 1086291) minted via OIDC + Key Vault signing is used for cross-org operations on MicrosoftDocs/azure-dev-docs-pr. If the App installation grants access to more repos than needed, a compromised token could affect other repos.

Required Action: Verify the GitHub App installation on the MicrosoftDocs org is scoped to only the azure-dev-docs-pr repository.

Steps:

  1. Go to the GitHub App settings for App ID 1086291
  2. Under Install & Authorize, check the MicrosoftDocs installation
  3. Ensure it uses "Only select repositories" with only azure-dev-docs-pr selected
  4. If it currently has "All repositories" access, restrict it

Finding #4 (HIGH) — Hardcoded OIDC Configuration

Risk: client-id, tenant-id, and subscription-id are hardcoded in the workflow YAML. The GUIDs are visible in the source, are carried along whenever the workflow is forked or copied, and can only be rotated through a code change.

Required Action: Create GitHub repository variables and update the workflow to reference them.

Steps:

  1. Go to Settings → Variables → Actions
  2. Create these repository variables:
    • AZURE_CLIENT_ID = (current client-id value)
    • AZURE_TENANT_ID = (current tenant-id value)
    • AZURE_SUBSCRIPTION_ID = (current subscription-id value)
  3. Once created, we will update doc-monitor.yml to use ${{ vars.AZURE_CLIENT_ID }} etc.

Summary of Code-Level Fixes Already Applied

All code-fixable findings have been addressed in commit 05628b66:

| # | Severity | Finding | Status |
| --- | --- | --- | --- |
| 2 | HIGH | AI output drives write ops (prompt injection) | ✅ MAX_IMPACTS=15, reject unknown repos, path traversal block |
| 5 | MED | No rate limit on all_open | ✅ MAX_PRS_PER_RUN=20 |
| 6 | MED | AI markdown output unsanitized | ✅ MAX_REASON/SUMMARY_LENGTH caps |
| 8 | MED | Actions not pinned to SHA | ✅ Pinned to commit SHAs |
| 9 | MED | Large file ReDoS/perf | ✅ MAX_CONTENT_SIZE_BYTES=50KB |
| 10 | LOW | PR title in logs | ✅ Control char stripping, truncation |
| 11 | MED | Doc manifest prompt injection | ✅ sanitizeText() on all extracted data |

cc @jongio

…escaping, magic number

- docs-inventory.ts: resolve default branch tree SHA instead of passing
  'HEAD' to git.getTree (which can 404)
- comment-tracker.ts: strip backticks and carriage returns in
  escapeTableCell() to prevent markdown injection
- diff.ts: replace magic number 30 with actual string length for
  accurate size budgeting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Add automated PR documentation impact analysis workflow

2 participants