safe-outputs: no permissive / reputation mode — research workflows produce all-redacted URLs

## Summary

The `safe-outputs` content sanitizer (`actions/setup/js/sanitize_content_core.cjs`) redacts any URL whose domain isn't on the merged `network.allowed` + `safe-outputs.allowed-domains` allowlist, rewriting it to `(domain)/redacted` in the final discussion / issue body. There is no permissive escape hatch for workflows that legitimately need to surface URLs from an open set of domains — research, news, and competitive-analysis use cases in particular.

For an example of the resulting output, see [RealPage/ai-internal-enablement#521](https://github.com/RealPage/ai-internal-enablement/discussions/521): a weekly-research run where every external citation is rendered as `[text]((domain.com)/redacted)`, which makes the report unreadable. Related upstream issue: [githubnext/agentics#309](https://github.com/githubnext/agentics/issues/309).

## What I tried

Verified against current source on `main`:

- `safe-outputs.allowed-domains: ["*"]` — explicitly rejected by `pkg/workflow/network_firewall_validation.go:211-218` (`wildcard-only domain '*' is not allowed`).
- `network.allowed: ["*"]` — special-cased in `pkg/workflow/firewall.go:185` to disable the egress firewall, but the sanitizer matcher (`actions/setup/js/sanitize_content_core.cjs:276-292`) does not honor `"*"`, so output URLs still get redacted.
- `network: {}` — denies egress and falls back to the hardcoded GitHub-only sanitizer set (`sanitize_content_core.cjs:112`); makes things worse.
- Manually allowlisting domains — the workaround we just shipped ([RealPage/ai-internal-enablement#563](https://github.com/RealPage/ai-internal-enablement/pull/563)) — works but isn't tractable for open-web research workflows.

There is no URL-reputation hook in the source (no Safe Browsing / URLhaus / VirusTotal / blocklist integration).
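To make the `"*"` observation above concrete, here is a minimal sketch of an allowlist matcher of the general shape described. It is not the code in `sanitize_content_core.cjs`: `isAllowed` and `redactUrls` are hypothetical names, and the URL regex is a deliberate simplification. It only illustrates that a matcher built on exact and suffix comparison never matches a bare `"*"` entry unless that entry is special-cased.

```javascript
// Hypothetical stand-in for the sanitizer's domain check (assumption, not the real code).
function isAllowed(hostname, allowedDomains) {
  const host = hostname.toLowerCase();
  return allowedDomains.some((entry) => {
    const domain = entry.toLowerCase();
    if (domain.startsWith("*.")) {
      // "*.example.com" matches example.com and any subdomain of it.
      const base = domain.slice(2);
      return host === base || host.endsWith("." + base);
    }
    // Plain entries match the domain itself and its subdomains.
    // A bare "*" falls through to here and matches no real hostname.
    return host === domain || host.endsWith("." + domain);
  });
}

// Hypothetical redaction pass: rewrites non-allowlisted URLs to "(domain)/redacted".
function redactUrls(text, allowedDomains) {
  return text.replace(/https?:\/\/[^\s)\]]+/g, (url) => {
    let hostname;
    try {
      hostname = new URL(url).hostname;
    } catch {
      return "(redacted)"; // unparseable URL: assumed fallback
    }
    return isAllowed(hostname, allowedDomains) ? url : `(${hostname})/redacted`;
  });
}

console.log(redactUrls("[report](https://example.org/a)", ["*"]));
// prints "[report]((example.org)/redacted)"
```

Under these assumptions, listing `"*"` changes nothing on the output side, which matches what I see in practice.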

## Why this matters

`weekly-research.md` ships as a sample workflow in `githubnext/agentics`. It is structurally incompatible with a static domain allowlist — research surfaces URLs from arbitrary publishers. Any user who installs it sees the same redacted output. The current docs ("If you see `(redacted)` in workflow outputs, add the domain to your `network.allowed` list") imply this is tractable; it isn't, for this class of workflow.

The current design optimizes hard for one threat (URL-based exfiltration through agent output) and produces unusable output for any open-web reporting use case. There should be a way to opt in to a different threat model.

## Proposed solutions (in priority order)

### 1. `safe-outputs.url-policy` mode (preferred)

Add a `safe-outputs.url-policy:` field with values:

- `allowlist` — current behavior (default; backwards-compatible).
- `audit` — pass all URLs through unchanged, but emit a workflow log line for any URL whose domain isn't on the allowlist. Lets users see what would have been redacted without breaking the output.
- `reputation` — call a pluggable URL reputation service (Google Safe Browsing API is the obvious default; URLhaus, PhishTank, VirusTotal could be alternatives via config). Redact only entries flagged as malicious. Configurable via `safe-outputs.reputation: { provider: google-safe-browsing, api-key-secret: SB_API_KEY }` or similar.

The `audit` mode alone would cover most cases and is cheap to implement.
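As a rough illustration of how small the `allowlist` / `audit` split could be, here is a sketch of a policy switch around the redaction step. Everything in it is an assumption for discussion: `applyUrlPolicy`, `onAllowlist`, the `log` callback, and the log message format are hypothetical, and the `reputation` mode is left out because it needs a network call to an external provider.

```javascript
// Hypothetical helper: exact or subdomain match against the allowlist.
function onAllowlist(hostname, allowedDomains) {
  const host = hostname.toLowerCase();
  return allowedDomains.some((d) => {
    const domain = d.toLowerCase();
    return host === domain || host.endsWith("." + domain);
  });
}

// Hypothetical entry point; `policy` mirrors the proposed `safe-outputs.url-policy` values.
function applyUrlPolicy(text, { policy = "allowlist", allowedDomains = [], log = console.warn } = {}) {
  if (policy !== "allowlist" && policy !== "audit") {
    throw new Error(`url-policy not covered by this sketch: ${policy}`);
  }
  return text.replace(/https?:\/\/[^\s)\]]+/g, (url) => {
    let hostname = null;
    try {
      hostname = new URL(url).hostname;
    } catch {
      // Leave hostname as null; treated as not on the allowlist.
    }
    if (hostname !== null && onAllowlist(hostname, allowedDomains)) {
      return url;
    }
    if (policy === "audit") {
      // Pass the URL through unchanged, but record what would have been redacted.
      log(`url-policy audit: would redact ${url}`);
      return url;
    }
    return hostname !== null ? `(${hostname})/redacted` : "(redacted)";
  });
}
```

With `policy: "audit"` the output body is byte-for-byte the agent's text, and the log lines give maintainers the list of domains they would need to add if they later switch back to `allowlist`.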

### 2. Accept `"*"` as a valid value in `safe-outputs.allowed-domains`

If the policy-mode approach is too large a surface, the minimal change is: have the validator and sanitizer matcher accept `"*"` to mean "pass any URL through unchanged". This mirrors the existing `network.allowed: ["*"]` semantic on the egress side. Users pair it with `network: defaults` to retain the egress firewall as the remaining defensive gate.
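On the sanitizer side, the change could be as small as one early return. The sketch below is hypothetical (`isAllowedWithWildcard` is not an existing function, and the real matcher may normalize entries differently); the matching validator change in `network_firewall_validation.go` is not shown.

```javascript
// Hypothetical matcher that treats a bare "*" entry as "allow every domain".
function isAllowedWithWildcard(hostname, allowedDomains) {
  if (allowedDomains.includes("*")) {
    return true; // proposed semantic: pass any URL through unchanged
  }
  const host = hostname.toLowerCase();
  return allowedDomains.some((d) => {
    const domain = d.toLowerCase();
    return host === domain || host.endsWith("." + domain);
  });
}
```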

### 3. Document the limitation clearly

Independent of the above, the `(redacted)` paragraph in `docs/.../reference/network.md` should call out explicitly that open-web research workflows are not viable under the current model, and link to whichever fix lands. Right now the docs read as if "add the domain to the allowlist" is a complete answer.

## Threat model note

The user's real-world concern when adopting an `audit` or `*` mode is malicious URLs in agent output (phishing, drive-by). That's a real risk, but:

- It's already mitigated for the egress side (firewall) — the sanitizer is an additional belt on top of suspenders.
- Allowlisting doesn't actually defend against it once the attacker compromises an allowlisted domain.
- A reputation-based mode addresses the actual threat far better than an allowlist does.

Happy to send a PR for option 2 (smallest change) if there's directional agreement. Option 1 is the more correct fix but a larger spec change.

