guard: GraphQL response labeling, backend enrichment, and sub-method skip by lpcox · Pull Request #2335 · github/gh-aw-mcpg

lpcox · 2026-03-22T20:15:53Z

Summary

Fixes integrity filtering bugs in the GitHub guard that caused false positives when processing responses from the GitHub MCP Server.

Problem

The GitHub MCP Server returns responses in different formats depending on the API used (GraphQL vs REST), and several fields critical for integrity labeling are missing:

GraphQL format not handled — list_issues and list_pull_requests return GraphQL-shaped responses ({issues: [...]}, {pullRequests: [...]}) that the guard's response labeling didn't recognize
Missing author_association — GraphQL IssueFragment doesn't request authorAssociation; MinimalPullRequest lacks the field entirely (github/github-mcp-server#2250)
Sub-method response mismatch — pull_request_read(method: get_check_runs) returns check run objects, but the guard tried to parse them as PRs → pr:#unknown with none integrity → incorrectly filtered
Error responses mislabeled — 404 Not Found responses were parsed as domain objects → pr:#unknown → filtered

Fixes

GraphQL response format support (response_items.rs, response_paths.rs) — Handle {issues: [...]}, {pullRequests: [...]}, and GraphQL single-node formats alongside existing REST array format
Backend enrichment (helpers.rs) — When author_association is missing on public repos, the guard now makes backend calls to fetch it:
- Issues: calls issue_read (REST endpoint includes the field)
- PRs: calls get_pull_request_facts for association + fork/merge status
Sub-method skip (response_items.rs, response_paths.rs) — For pull_request_read and issue_read sub-methods (get_check_runs, get_files, get_comments, etc.), skip per-item response labeling. These return non-PR/issue objects. Resource-level labels from tool_rules provide correct integrity.
Error response skip (response_items.rs, response_paths.rs) — When isError=true, skip response labeling entirely. Resource-level labels handle error cases.
Search scope extraction (tool_rules.rs) — search_issues and search_pull_requests now extract repo:owner/repo from query strings to apply scoped secrecy/integrity labels.

Validation

Tested via 5 sequential repo-assist CI runs:

Run	DIFC-FILTERED	Result
23407151206 (v0.1.24 baseline)	1	—
23408145677 (GraphQL fix)	6	❌ lpcox admin issues filtered
23410257857 (+enrichment)	4	✅ lpcox fixed; check_runs + search remaining
23410932567 (+sub-method skip)	1	404 error mislabeled
23411168049 (+isError skip)	0	✅ All clean

Note

repo-assist.lock.yml is temporarily set to build a local container for testing. Must be reverted before merge.

The guard's response labeling code only recognized REST-like formats (arrays, {items: [...]}) but not GraphQL nested responses like data.repository.pullRequests.nodes. This caused the proxy to apply coarse-grained filtering (blocking entire responses) instead of per-item integrity filtering for gh CLI calls. Changes: - extract_items_array() now traverses GraphQL nested paths: data.repository.{issues,pullRequests,discussions,...}.nodes and data.search.nodes - extract_graphql_nodes() extracts collection arrays from GraphQL - extract_graphql_single_object() extracts singular objects (data.repository.issue, data.repository.pullRequest) - is_graphql_wrapper() prevents treating {data:...} as a single item - response_items.rs PR and issue handlers check GraphQL formats before the single-object fallback - 9 new tests covering GraphQL collection, single object, search, path labeling, and wrapper detection Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When the GitHub MCP Server omits author_association from list responses (e.g. list_issues via GraphQL, list_pull_requests via REST), the guard now fetches the individual item via a backend call to obtain the correct author_association value. This prevents incorrectly assigning 'none' integrity to repo members/collaborators. Changes: - Add has_author_association() helper to detect absent field - issue_integrity(): enrich via get_issue_author_association when field is missing on public repos (private repos already get writer integrity) - pr_integrity(): enrich via get_pull_request_facts when field is missing, also backfills is_forked and is_merged status - Add 6 tests for has_author_association and enrichment scenarios Upstream fix filed: github/github-mcp-server#2250 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- response_items.rs/response_paths.rs: Skip per-item response labeling for pull_request_read and issue_read sub-methods (get_check_runs, get_files, get_comments, etc.) that return non-PR/issue objects. Resource-level labels from tool_rules provide correct integrity. - tool_rules.rs: Extract repo scope from search_issues and search_pull_requests query strings using extract_repo_info_from_search_query. When repo:owner/repo is present, apply scoped secrecy and integrity instead of deferring entirely to response labeling. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When the MCP backend returns isError=true (e.g. 404 Not Found), skip per-item response labeling in both response_items.rs and response_paths.rs. Error responses don't contain valid domain objects and would be mislabeled as pr:#unknown or issue:#unknown with none integrity, causing incorrect DIFC filtering. Resource-level labels from tool_rules still apply to error responses. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds GraphQL-aware response labeling to the GitHub guard so PR/issue results can be labeled correctly even when returned in GraphQL wrapper shapes, and adjusts workflow wiring to run against a locally built MCPG container.

Changes:

Improve repo scoping for search_issues / search_pull_requests by parsing repo:owner/repo from the search query.
Extend response labeling to handle GraphQL collection/single-object shapes and skip per-item labeling for *_read sub-methods and error responses.
Add backend “enrichment” in pr_integrity / issue_integrity when author_association is missing, and update the repo-assist workflow to build/run a local MCPG image.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
`guards/github-guard/rust-guard/src/labels/tool_rules.rs`	Adds repo-qualified search scoping for issues/PR search tools.
`guards/github-guard/rust-guard/src/labels/response_paths.rs`	Adds error-response skip + GraphQL items-path support + sub-method skip logic.
`guards/github-guard/rust-guard/src/labels/response_items.rs`	Adds GraphQL node/single-object extraction + sub-method skip logic.
`guards/github-guard/rust-guard/src/labels/mod.rs`	Re-exports new helpers and adds tests for GraphQL extraction helpers.
`guards/github-guard/rust-guard/src/labels/helpers.rs`	Implements GraphQL extraction helpers and backend enrichment for missing `author_association`.
`.github/workflows/repo-assist.lock.yml`	Switches repo-assist to build and run `gh-aw-mcpg:local` instead of a pinned image.

Comments suppressed due to low confidence (4)

.github/workflows/repo-assist.lock.yml:421

Switching the proxy container from a pinned release image to ghcr.io/github/gh-aw-mcpg:local makes this workflow dependent on the local docker build step and removes the reproducibility benefits of the lock file. If the intent is to run published code, keep the pinned tag; if the intent is to test PR builds, consider limiting this to PR-triggered runs (not scheduled) or make it configurable via workflow inputs.

          docker run -d --name awmg-proxy --network host \
            -e GH_TOKEN \
            -e DEBUG='*' \
            -v "$PROXY_LOG_DIR:$PROXY_LOG_DIR" \
            ghcr.io/github/gh-aw-mcpg:local proxy \
              --policy "$POLICY" \

.github/workflows/repo-assist.lock.yml:517

The image pre-pull step no longer includes the MCPG image; combined with the shift to a locally built MCPG image, this reduces cache usefulness and can increase runtime variability. If the workflow should rely on released images, keep MCPG in the pre-pull list; if it should rely on local builds, consider adjusting the step name/documentation and ensuring scheduled runs aren’t impacted.

      - name: Download container images
        run: bash ${RUNNER_TEMP}/gh-aw/actions/download_docker_images.sh ghcr.io/github/gh-aw-firewall/agent:0.24.3 ghcr.io/github/gh-aw-firewall/api-proxy:0.24.3 ghcr.io/github/gh-aw-firewall/squid:0.24.3 ghcr.io/github/github-mcp-server:v0.32.0 node:lts-alpine

guards/github-guard/rust-guard/src/labels/helpers.rs:985

pr_integrity now does a backend get_pull_request call per item when author_association is missing. With MAX_ITEMS_PER_RESPONSE = 100, this can turn a single list/search response into up to 100 extra GitHub API calls, increasing latency and risk of rate limiting. Consider adding caching for PR facts (keyed by repo+number), lowering the max for enrichment, or only enriching when is_forked/merge state is unknown and the integrity outcome would otherwise remain empty.

    // Backend enrichment: when author_association is absent from the response
    // (e.g. GitHub MCP Server omits it from MinimalPullRequest), fetch the
    // individual PR via REST to obtain the correct association, fork status,
    // and merge status.
    if integrity.is_empty() && !has_author_association(item) && !repo_private {
        if let Some(number) = item.get("number").and_then(|v| v.as_u64()) {
            let (owner, repo) = repo_full_name.split_once('/').unwrap_or(("", ""));
            if !owner.is_empty() && !repo.is_empty() {
                let number_str = number.to_string();
                if let Some(facts) =
                    super::backend::get_pull_request_facts(owner, repo, &number_str)
                {
                    crate::log_debug(&format!(
                        "[integrity] pr:{}#{} enriched: author_association={:?}, is_forked={:?}, is_merged={}",
                        repo_full_name, number, facts.author_association, facts.is_forked, facts.is_merged
                    ));

.github/workflows/repo-assist.lock.yml:865

MCP_GATEWAY_DOCKER_COMMAND is updated to run ghcr.io/github/gh-aw-mcpg:local, coupling gateway startup to the earlier local docker build. This can be fragile on reruns/cached steps and makes the lock workflow non-reproducible. Prefer the pinned release image for scheduled/default runs, or gate the local image path behind an explicit debug/PR-only switch.

          export GH_AW_ENGINE="copilot"
          export MCP_GATEWAY_DOCKER_COMMAND='docker run -i --rm --network host -v /var/run/docker.sock:/var/run/docker.sock -e MCP_GATEWAY_PORT -e MCP_GATEWAY_DOMAIN -e MCP_GATEWAY_API_KEY -e MCP_GATEWAY_PAYLOAD_DIR -e MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD -e DEBUG -e MCP_GATEWAY_LOG_DIR -e GH_AW_MCP_LOG_DIR -e GH_AW_SAFE_OUTPUTS -e GH_AW_SAFE_OUTPUTS_CONFIG_PATH -e GH_AW_SAFE_OUTPUTS_TOOLS_PATH -e GH_AW_ASSETS_BRANCH -e GH_AW_ASSETS_MAX_SIZE_KB -e GH_AW_ASSETS_ALLOWED_EXTS -e DEFAULT_BRANCH -e GITHUB_MCP_SERVER_TOKEN -e GITHUB_MCP_GUARD_MIN_INTEGRITY -e GITHUB_MCP_GUARD_REPOS -e GITHUB_REPOSITORY -e GITHUB_SERVER_URL -e GITHUB_SHA -e GITHUB_WORKSPACE -e GITHUB_TOKEN -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e GITHUB_JOB -e GITHUB_ACTION -e GITHUB_EVENT_NAME -e GITHUB_EVENT_PATH -e GITHUB_ACTOR -e GITHUB_ACTOR_ID -e GITHUB_TRIGGERING_ACTOR -e GITHUB_WORKFLOW -e GITHUB_WORKFLOW_REF -e GITHUB_WORKFLOW_SHA -e GITHUB_REF -e GITHUB_REF_NAME -e GITHUB_REF_TYPE -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GH_AW_SAFE_OUTPUTS_PORT -e GH_AW_SAFE_OUTPUTS_API_KEY -v /tmp/gh-aw/mcp-payloads:/tmp/gh-aw/mcp-payloads:rw -v /opt:/opt:ro -v /tmp:/tmp:rw -v '"${GITHUB_WORKSPACE}"':'"${GITHUB_WORKSPACE}"':rw ghcr.io/github/gh-aw-mcpg:local'

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-22T20:21:58Z

guards/github-guard/rust-guard/src/labels/helpers.rs

+    // Backend enrichment: when author_association is absent from the response
+    // (e.g. GitHub MCP Server's list_issues GraphQL path omits it), fetch the
+    // individual issue via REST to obtain the correct value. This avoids
+    // incorrectly assigning "none" integrity to members/collaborators.
+    if integrity.is_empty() && !has_author_association(item) && !repo_private {
+        if let Some(number) = item.get("number").and_then(|v| v.as_u64()) {
+            let (owner, repo) = repo_full_name.split_once('/').unwrap_or(("", ""));
+            if !owner.is_empty() && !repo.is_empty() {
+                let number_str = number.to_string();
+                if let Some(association) =
+                    super::backend::get_issue_author_association(owner, repo, &number_str)
+                {
+                    crate::log_debug(&format!(
+                        "[integrity] issue:{}#{} enriched author_association='{}'",
+                        repo_full_name, number, association
+                    ));
+                    // Re-check trusted bot status with enriched login
+                    let enriched_floor =
+                        author_association_floor_from_str(repo_full_name, Some(&association), ctx);
+                    integrity =
+                        max_integrity(repo_full_name, integrity, enriched_floor, ctx);
+                } else {


issue_integrity adds per-item backend enrichment via get_issue_author_association when author_association is missing. This can create an N+1 pattern (up to MAX_ITEMS_PER_RESPONSE extra backend calls) on list/search issue responses. Consider caching associations (repo+issue number) or otherwise bounding/avoiding enrichment to reduce latency and rate-limit exposure.

This issue also appears on line 970 of the same file.

Copilot · 2026-03-22T20:21:58Z

.github/workflows/repo-assist.lock.yml

      # NOTE: Local container build kept for debugging. Uncomment to test unpublished changes.
-      # - name: Build local MCPG container (debugging only)
-      #   run: |
-      #     rustup target add wasm32-wasip1
-      #     cd guards/github-guard/rust-guard && ./build.sh && cd ../../..
-      #     docker build . -t ghcr.io/github/gh-aw-mcpg:local
+      - name: Build local MCPG container (debugging only)
+        run: |
+          rustup target add wasm32-wasip1
+          cd guards/github-guard/rust-guard && ./build.sh && cd ../../..
+          docker build . -t ghcr.io/github/gh-aw-mcpg:local


This step is labeled “debugging only” and the preceding note says it should be uncommented for debugging, but it is now always enabled. Building the MCPG container in this scheduled workflow will add significant runtime and introduces additional failure modes (toolchain install, docker build). If this is meant for occasional debugging, gate it behind an input/flag and keep the default path using the pinned published image.

This issue also appears in the following locations of the same file:

line 416

line 516

line 864

Copilot · 2026-03-22T20:21:59Z

guards/github-guard/rust-guard/src/labels/helpers.rs

+                        "[integrity] issue:{}#{} enriched author_association='{}'",
+                        repo_full_name, number, association
+                    ));
+                    // Re-check trusted bot status with enriched login


The comment // Re-check trusted bot status with enriched login is misleading here: the enrichment path only fetches author_association (no author login), so there is no bot re-check happening. Please update/remove the comment to reflect what the code actually does.

Suggested change

// Re-check trusted bot status with enriched login

// Recompute integrity floor using enriched author_association

Copilot · 2026-03-22T20:21:59Z

guards/github-guard/rust-guard/src/labels/tool_rules.rs

+                desc = format!("search_issues:{}", q_repo_id);
+                secrecy =
+                    apply_repo_visibility_secrecy(&q_owner, &q_repo, &q_repo_id, secrecy, ctx);
+                integrity = private_writer_integrity(&q_repo_id, repo_private, ctx);


In the repo-qualified search path, repo_private is computed from extract_repo_info(tool_args) (which is typically empty for search tools), so private_writer_integrity(&q_repo_id, repo_private, ctx) will never grant writer integrity even when the repo:owner/repo target is private. Compute repo visibility for q_owner/q_repo and pass that result into private_writer_integrity (or return the privacy flag from apply_repo_visibility_secrecy). Also update baseline_scope to q_repo_id when you emit repo-scoped integrity so ensure_integrity_baseline doesn’t downgrade labels.

Suggested change

integrity = private_writer_integrity(&q_repo_id, repo_private, ctx);

// Derive repo privacy from the applied secrecy labels for this repo scope

let repo_private_for_search =

secrecy.contains(&policy_private_scope_label(&q_repo_id, ctx));

integrity = private_writer_integrity(&q_repo_id, repo_private_for_search, ctx);

// Ensure the integrity baseline is applied at the repo scope for this query

baseline_scope = q_repo_id.clone();

Copilot · 2026-03-22T20:21:59Z

guards/github-guard/rust-guard/src/labels/tool_rules.rs

+                desc = format!("search_pull_requests:{}", q_repo_id);
+                secrecy =
+                    apply_repo_visibility_secrecy(&q_owner, &q_repo, &q_repo_id, secrecy, ctx);
+                integrity = private_writer_integrity(&q_repo_id, repo_private, ctx);


Same issue as search_issues: repo_private here is derived from tool_args owner/repo (usually absent for search), so private_writer_integrity(&q_repo_id, repo_private, ctx) won’t grant writer integrity for private repos even when the query includes repo:owner/repo. Determine privacy for q_owner/q_repo and set baseline_scope = q_repo_id when applying repo-scoped integrity.

Suggested change

desc = format!("search_pull_requests:{}", q_repo_id);

secrecy =

apply_repo_visibility_secrecy(&q_owner, &q_repo, &q_repo_id, secrecy, ctx);

integrity = private_writer_integrity(&q_repo_id, repo_private, ctx);

// Determine privacy for the repository referenced in the query and

// scope integrity baselines to that repository.

let (_, _, _, q_repo_private) = extract_repo_info(&q_owner, &q_repo, ctx);

desc = format!("search_pull_requests:{}", q_repo_id);

secrecy =

apply_repo_visibility_secrecy(&q_owner, &q_repo, &q_repo_id, secrecy, ctx);

baseline_scope = q_repo_id.clone();

integrity = private_writer_integrity(&q_repo_id, q_repo_private, ctx);

Switch repo-assist.lock.yml from local container build to published `ghcr.io/github/gh-aw-mcpg:v0.1.25` image (includes GraphQL response labeling, backend enrichment, sub-method skip, and error response handling fixes from #2335). ### Changes - Comment out local container build step (kept for debugging) - Update proxy image: `:local` → `:v0.1.25` - Update gateway image: `:local` → `:v0.1.25`

## Summary Switch repo-assist from local container build to published image `v0.1.26`, which includes all DIFC integrity filtering fixes: - **Backend enrichment tool name fix** (PR #2340): corrected `get_pull_request` → `pull_request_read` and added missing `method: "get"` parameters - **Sub-method response labeling skip** (PR #2335): prevents mislabeling `get_check_runs`/`get_comments` sub-method responses - **isError response skip** (PR #2335): avoids labeling 404/error responses as domain objects - **Search scope extraction** (PR #2335): enables filtering for `search_issues`/`search_pull_requests` ## Changes - Comment out local container build step (kept as comment for future debugging) - Update proxy image tag: `:local` → `:v0.1.26` - Update MCP gateway image tag: `:local` → `:v0.1.26` ## Validation v0.1.26 was validated with **0 DIFC-FILTERED events** in run [23412918230](https://github.com/github/gh-aw-mcpg/actions/runs/23412918230).

lpcox and others added 5 commits March 22, 2026 10:07

test: temporarily use local container build for repo-assist

93eec6a

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 22, 2026 20:15

Copilot started reviewing on behalf of lpcox March 22, 2026 20:16 View session

lpcox changed the title ~~Feat/guard graphql response labeling~~ guard: GraphQL response labeling, backend enrichment, and sub-method skip Mar 22, 2026

lpcox merged commit b2b0527 into main Mar 22, 2026
23 checks passed

lpcox deleted the feat/guard-graphql-response-labeling branch March 22, 2026 20:20

Copilot AI reviewed Mar 22, 2026

View reviewed changes

github-actions bot mentioned this pull request Mar 22, 2026

Smoke Test: Copilot - 23411699256 #2336

Closed

lpcox mentioned this pull request Mar 22, 2026

chore: update repo-assist to gateway v0.1.25 #2338

Merged

lpcox mentioned this pull request Mar 22, 2026

chore: update repo-assist MCPG container image to v0.1.26 #2346

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

guard: GraphQL response labeling, backend enrichment, and sub-method skip#2335

guard: GraphQL response labeling, backend enrichment, and sub-method skip#2335
lpcox merged 5 commits intomainfrom
feat/guard-graphql-response-labeling

lpcox commented Mar 22, 2026 •

edited

Loading

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	// Re-check trusted bot status with enriched login
	// Recompute integrity floor using enriched author_association

-                integrity = private_writer_integrity(&q_repo_id, repo_private, ctx);
+                // Derive repo privacy from the applied secrecy labels for this repo scope
+                let repo_private_for_search =
+                    secrecy.contains(&policy_private_scope_label(&q_repo_id, ctx));
+                integrity = private_writer_integrity(&q_repo_id, repo_private_for_search, ctx);
+                // Ensure the integrity baseline is applied at the repo scope for this query
+                baseline_scope = q_repo_id.clone();

Conversation

lpcox commented Mar 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Fixes

Validation

Related

Note

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

lpcox commented Mar 22, 2026 •

edited

Loading