[demo] missing_docs drift-watch: CLI backlog burn-down sample#278
Conversation
…secret) Sample output of one missing_docs drift-watch pass over the CLI backlog, to evaluate the skill's efficacy. Resolves 14 of 31 undocumented CLI commands: - Draft: document the `oz api-key list/create/expire` subcommands in the CLI reference (reference/cli/api-keys.mdx), with flags/args extracted from crates/warp_cli/src/api_key.rs. - Map (no duplication): `oz schedule *` and `oz secret *` subcommands are already documented in their feature pages, so add surface-map entries pointing there instead of re-drafting. Reviewer routing (suggest_reviewers.py): crates/warp_cli ownership -> @bnavetta, @ianhodge. Audit after: CLI findings 31 -> 17, exit 0, unaccounted none. Co-Authored-By: Oz <oz-agent@warp.dev>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
There was a problem hiding this comment.
Overview
This PR updates the missing-docs feature surface map for existing schedule and secret CLI coverage, and adds an API-key CLI management section to reference/cli/api-keys.mdx. The direction is reasonable, but the API-key reference is still incomplete for one supported expire flag and the examples should avoid normalizing non-expiring credentials. No approved spec context was provided, so no implementation/spec drift was assessed.
Concerns
oz api-key expiresupports the shared--jq <FILTER>output filter, but the new expire section only documents--force.- The create example uses
--no-expirationfor an agent key even though the same page recommends setting appropriate expirations; prefer an expiring example and leave no-expiration as a documented option. - Clarify that
oz api-key listprints metadata for active API keys, not recoverable secret values.
Security
- The primary create example currently demonstrates a non-expiring agent key; use an expiring example to avoid encouraging long-lived credentials.
Verdict
Found: 0 critical, 1 important, 2 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
|
|
||
| ### List keys | ||
|
|
||
| `oz api-key list` prints your active API keys. |
There was a problem hiding this comment.
💡 [SUGGESTION] This wording can imply the command prints recoverable secret values. Say it prints metadata for active API keys to keep the CLI behavior aligned with the page's one-time-secret guidance.
| oz api-key create "ci-pipeline" --expires-in 30d | ||
|
|
||
| # An agent key scoped to a cloud agent, with no expiration | ||
| oz api-key create "nightly-triage" --agent AGENT_UID --no-expiration |
There was a problem hiding this comment.
💡 [SUGGESTION] [SECURITY] The primary agent-key example normalizes a non-expiring credential. Use --expires-in in the example and keep --no-expiration only in the option list.
| `oz api-key expire <NAME_OR_UID>` immediately invalidates a key (also available as `oz api-key delete`). Expired keys can't be recovered. | ||
|
|
||
| * `<NAME_OR_UID>` — The name or UID of the key to expire (required). | ||
| * `--force` — Skip the confirmation prompt. |
There was a problem hiding this comment.
oz api-key expire also flattens the shared JSON output options, so this section is missing the supported --jq <FILTER> flag. Add it here so the reference covers every flag for the command.
…e detection (#201) * Overhaul missing_docs skill: fix audit blind spots, add change detection - audit_docs.py: fail loud (exit 2 + audits_skipped) when repos are missing; fix repo auto-detection to siblings of the docs repo root - GA detection via the app/src/features.rs cargo-feature bridge plus RELEASE_FLAGS/PREVIEW_FLAGS/DOGFOOD_FLAGS instead of snake_case guessing - New audits: public API routes (router/handlers/public_api gin groups vs OpenAPI spec), CLI subcommands (clap enum tree, hidden skipped), slash commands (static registry), surface-map hygiene (dead entries) - Staleness: strip code spans, word-boundary matching, skip historical changelog pages (73 -> 29 findings; 'oz agent' CLI noise eliminated) - Change detection: references/surface_snapshot.json + --diff mode reporting added/removed/promoted surfaces and changelog items since last run; --update-snapshot regenerates the baseline - Seed feature_surface_map.md: map handoff/orchestration/queueing/BYOK/ billing/etc. flags, prune 45+ dead entries, add slash-command section and API internal sentinels - SKILL.md: document the exit-code contract, diff workflow, drift-watch mode for the recurring agent (with copyable scheduled-agent prompt), and make surface-map + snapshot updates explicit drafting steps Co-Authored-By: Oz <oz-agent@warp.dev> * Expand missing_docs audit: settings, web app, tools, skills, structure, stale refs - Settings audit: parse the define_setting! toml_path registry (~200 settings, flag-status aware, object-typed settings handled) and check coverage in the all-settings reference; reverse check catches documented settings that were renamed/removed in code (e.g. agents.oz.* -> agents.warp_agent.*) - Stale doc references: validate documented keybinding actions (scope:action) still exist anywhere in warp-internal source - Docs structure audit: flag pages missing from src/sidebar.ts (with an allowlist section in the surface map) - CLI: recursive subcommand parsing (oz run message send, oz environment image list, ...) plus per-module --flag tracking in the snapshot - API: positional RouterGroup argument resolution at Register* call sites (fixes oauth route prefixes) and param-name-insensitive OpenAPI matching - Snapshot v2: settings, web app routes (AgentsApp.tsx), server-side agent tools (ToolName consts + Create*NativeTool registrations), bundled + channel-gated skills, CLI flags; graceful one-time note when diffing against a v1 snapshot - Changelog cross-check now also tracks 'Oz updates' bullets - Extraction sanity guards: implausibly low parse counts (broken parser after a code-layout change) skip dependent audits and exit 2 instead of silently under-reporting; map hygiene and reverse checks gated on healthy extraction - Feature flag enum parsing is brace-safe (survives future struct variants) - SKILL.md documents the 9 coverage audits, snapshot-only surfaces, and adjacent-skill ownership (validate_ui_refs, sync-error-docs, style_lint, weekly-404-monitor) Co-Authored-By: Oz <oz-agent@warp.dev> * Add completeness accounting and map integrity checks; reclassify GA flags Triple-check that every feature is encapsulated in the mapping: - Built-in completeness accounting on every full run: partitions every extracted surface item (277 flags, 74 CLI commands, 71 API routes, 47 slash commands, 201 settings) into exactly one accountability bucket (mapped / ignored / doc-covered / visible finding / snapshot-tracked) and exits 2 with integrity:accounting if anything escapes — an unaccounted item can only mean the audit logic regressed - Map integrity checks in hygiene: entries in both the mapping and the ignore list (ignore silently wins) and duplicate keys within a section are now medium findings - Ignore-list review against computed statuses found ~20 flags filed under 'Non-GA' that have since GA'd: reclassified 15 user-facing ones to real mappings (session sharing trio, AgentHarness, SshRemoteServer, ArtifactCommand, OzIdentityFederation, image-context pair, OzPlatformSkills, WorkflowAliases, ShellSelector, KittyImages, UndoClosedPanes, RevertDiffHunk), surfaced FullScreenZenMode as a visible undocumented-feature finding, and retitled the section so placement no longer asserts rollout status - Re-baselined the snapshot after live drift the diff caught on today's checkouts (SuperGrok dogfood->ga, new /rename-conversation slash command — both now standing coverage findings) - SKILL.md: documented the accounting contract and the end-to-end 'how every change path is caught' chain (new/promoted/removed surfaces, no-code-change launches via changelog net, parser rot via extraction guards, map rot via hygiene) Co-Authored-By: Oz <oz-agent@warp.dev> * Target the public warp repo instead of warp-internal Warp's client code is open source at warpdotdev/warp — the audit now treats the public repo as the primary source: - Repo auto-detection prefers a sibling checkout named 'warp' and falls back to 'warp-internal' for transitional environments - New --warp flag is the primary CLI option; --warp-internal remains as a deprecated alias (same destination) so existing invocations keep working - All docstrings, stderr messages, skip reasons, section headers, and SKILL.md guidance (requirements, audit descriptions, drift-watch command, scheduled-agent prompt) now reference the public warp client repo Validated: auto-detect fallback, preferred 'warp' sibling resolution, explicit --warp, and the deprecated alias all run the full audit with clean completeness accounting. Co-Authored-By: Oz <oz-agent@warp.dev> * First drift-watch burn-down: settings, keybindings, slash commands, SuperGrok docs (#203) * First drift-watch burn-down: settings, keybindings, slash commands, SuperGrok Dogfoods the missing_docs drift-watch workflow on the standing findings backlog (41 findings resolved): - all-settings.mdx: documented 17 missing settings (prompt submission mode, orchestration message display, auto handoff on sleep, agent attribution, handoff kill-switches, OSC 52 clipboard access, async find, directory tab colors, vertical-tabs panel options, hidden files in project explorer, line number mode, force X11, Ctrl+Enter submit for CLI agents, input focus on block selection) with types/defaults/options extracted from the settings registry; moved git_operations_autogen_enabled into [agents.warp_agent.active_ai] reflecting the agents.oz -> agents.warp_agent rename and removed the stale remnant section; added an Experimental section - keyboard-shortcuts.mdx: fixed all 14 dead action names (10 renames like workspace:open_new_tab -> workspace:new_tab, editor_view:cmd_i -> editor_view:inspect_command, terminal:trigger_subshell_bootstrap -> terminal:warpify_subshell; 4 removed actions blanked) - slash-commands.mdx: added /environment, /harness, /host (cloud agent session selectors) and /rename-conversation - bring-your-own-api-key.mdx: documented connecting a SuperGrok subscription instead of an xAI API key (newly GA SuperGrok flag) + map entry - Surface map: allowlisted guides/agent-workflows/warp-vs-claude-code as intentionally unlisted (per its frontmatter note) - Re-applied the GA-flag reclassification from the previous session (15 mappings + FullScreenZenMode surfaced) which had been silently reverted by a 'git checkout' during integrity testing before it was committed \u2014 caught by comparing accounting bucket counts run-over-run - Snapshot re-baselined; audit now reports 0 stale doc references, 0 undocumented settings/slash commands/unlisted pages; remaining backlog: 29 CLI subcommand docs, 24 OpenAPI spec gaps (update-open-api-spec), 29 terminology pages (style_lint), FullScreenZenMode + GroupedTabs Co-Authored-By: Oz <oz-agent@warp.dev> * docs: second drift-watch pass — settings, feature flags, map hygiene Re-ran the missing_docs drift-watch audit against current code surfaces and burned down the newly-found in-scope drift (features 5→0, settings 6→0, map hygiene 2→0). Settings (all-settings.mdx): - Document appearance.icon.show_dock_icon (macOS Dock / Cmd-Tab visibility) - Document agents.warp_agent.other.long_running_command_submission_mode - Document code.editor.format_on_save - Document cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled - Document warpify.ssh.reuse_existing_control_master - Map warpify.ssh.ssh_tmux_deprecation_notice_pending -> internal (one-time migration banner state, not user-configurable) Feature flags (feature_surface_map.md): - Map CodexPlugin -> cli-agents/codex.md - Map FullScreenZenMode, AsyncFind -> all-settings.mdx (surfaces are documented settings) - Map CustomModelRouters -> inference/model-choice.mdx (new "Custom routers" section) - Ignore GroupedTabs (macOS-only Preview; docs pending GA promotion) Map hygiene: - Prune stale ignore-list flags FreeUserNoAi and WelcomeTab (no longer in code) Co-Authored-By: Oz <oz-agent@warp.dev> * docs(missing_docs): codify finding-resolution patterns in SKILL.md Add a "Resolution patterns" subsection capturing the per-type decision rules applied during the second drift-watch pass, so recurring runs resolve findings consistently: - user-facing setting -> document in all-settings - internal/state-only setting -> map `section.key -> internal` - feature flag with a dedicated page -> map to it - feature flag whose only surface is a documented setting -> map to that page - preview/pre-launch feature with no docs -> ignore-list with a comment - stale map entry/doc reference -> prune after confirming removal in code Co-Authored-By: Oz <oz-agent@warp.dev> * Update .agents/skills/missing_docs/references/feature_surface_map.md Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com> --------- Co-authored-by: Oz <oz-agent@warp.dev> Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com> * docs(missing_docs): add reviewer routing from code ownership Add scripts/suggest_reviewers.py, which maps the source surface behind each drift-watch finding to its owning engineer using the CODEOWNERS-format ownership files already maintained in the code repos: - warp client: .github/STAKEHOLDERS - warp-server: .github/STAKEHOLDERS (advisory) + .github/CODEOWNERS (enforced) It resolves with standard last-match-wins precedence, dedupes owners into users/teams, and prints a ready-to-run `gh pr edit --add-reviewer` command. Unresolved paths are non-fatal so a scheduled run is never blocked. Wire it into SKILL.md: a new "Reviewer routing" section (per-category source-file hints), a drift-watch "Route reviewers" step before opening the PR, an updated scheduled-agent prompt, and a References entry. Ownership stays sourced from STAKEHOLDERS/CODEOWNERS (kept fresh by sync-stakeholders) — never hardcoded. Co-Authored-By: Oz <oz-agent@warp.dev> * docs(missing_docs): address review comments - SKILL.md: Phase 3 step 7 now points to src/sidebar.ts (the real sidebar source the structure audit checks); astro.config.mjs only for a new top-level topic. - all-settings.mdx: correct code.editor.show_hidden_files default to `true` (matches the client setting definition). - surface_snapshot.json: regenerate so it includes the newly documented settings (code.editor.format_on_save, appearance.icon.show_dock_icon, agents.warp_agent.other.long_running_command_submission_mode, warpify.ssh.reuse_existing_control_master, cloud_platform.third_party_api_keys.gemini_enterprise_credentials_enabled) and current flag/CLI/API state. Audit: exit 0, audits_skipped none, unaccounted none. Co-Authored-By: Oz <oz-agent@warp.dev> * demo(missing_docs): CLI drift burn-down sample (api-key / schedule / secret) (#278) Sample output of one missing_docs drift-watch pass over the CLI backlog, to evaluate the skill's efficacy. Resolves 14 of 31 undocumented CLI commands: - Draft: document the `oz api-key list/create/expire` subcommands in the CLI reference (reference/cli/api-keys.mdx), with flags/args extracted from crates/warp_cli/src/api_key.rs. - Map (no duplication): `oz schedule *` and `oz secret *` subcommands are already documented in their feature pages, so add surface-map entries pointing there instead of re-drafting. Reviewer routing (suggest_reviewers.py): crates/warp_cli ownership -> @bnavetta, @ianhodge. Audit after: CLI findings 31 -> 17, exit 0, unaccounted none. Co-authored-by: Oz <oz-agent@warp.dev> * test(missing_docs): add stdlib test suite for the skill scripts - test_suggest_reviewers.py: 15 unit tests for reviewer resolution (CODEOWNERS matching incl. anchored dir-prefix / exact-file / glob / default rule, last-match-wins precedence, user vs team split, dedup, unresolved paths, warp-internal alias, stdin) — all via temp ownership files. - test_audit_docs.py: 6 behavioral integration tests that run audit_docs.py against the sibling code repos — clean exit + completeness accounting (unaccounted empty), --category scoping, --severity filtering, fail-loud (exit 2) on a missing repo, committed-snapshot currency, and that --update-snapshot honors --snapshot without mutating the committed snapshot. Skips gracefully when the code repos aren't checked out. Both suites use only the Python stdlib (unittest) — no third-party deps. 21/21 pass. Documented under a new "## Tests" section in SKILL.md. Co-Authored-By: Oz <oz-agent@warp.dev> * docs(missing_docs): encode public vs. private surface boundary Make explicit that the skill must only document publicly released surfaces: - New "Public vs. private surfaces" section in SKILL.md: the OSS warp client repo is public; warp-server is a PRIVATE repo whose only public surface is the released Oz Agent API already in the OpenAPI spec. Two gates (source/exposure + GA rollout); never document private or unreleased surfaces. - Woven into the API audit description, Phase 3 API-gap research, and Resolution patterns: warp-server endpoints not in the released spec are not auto- documentable — route released ones via sync-openapi-spec, else `-> internal`/defer. Apply the rule to detected drift: Agent Memory is research preview, so its `oz memory*` / `oz memory-store*` CLI and `/memory_stores/*` REST API are mapped `-> internal` with comments. Added a public/private POLICY note to the surface map's API section. Audit after: CLI 17->15, API 29->18, audits_skipped none, unaccounted none. Co-Authored-By: Oz <oz-agent@warp.dev> * test(missing_docs): cover public/private boundary + run skill tests in CI - test_audit_docs.py: add test_research_preview_surfaces_are_deferred, a regression guard asserting Agent Memory's CLI (`oz memory*`) and REST API (`/memory_stores/*`) are never flagged for documentation (public/private boundary). 22 tests total, all green. - ci.yml: run the skill's stdlib test suites on every PR. The reviewer-resolver unit tests run fully; the audit integration tests skip gracefully since the warp/warp-server code repos aren't checked out in docs CI. - SKILL.md: note the new boundary test in the Tests section. Co-Authored-By: Oz <oz-agent@warp.dev> * feat(missing_docs): rollout-gate CLI/API surfaces via gated:<Flag> Fixes the limitation that the CLI and API audits flagged commands/routes regardless of whether their feature had shipped (so non-GA surfaces like Agent Memory needed a permanent `-> internal`). audit_docs.py: - New `gated:<Flag>` surface-map target for CLI commands and API routes. The audit resolves the gating flag's rollout status (the same machinery used for feature flags + settings): * non-GA (preview/dogfood/other) -> deferred (new `gated_non_ga` accounting bucket), not a finding; * GA -> falls through to normal coverage so it auto-surfaces as a finding; * unknown flag -> conservative (still a finding) + map-hygiene error so the annotation can't silently rot. - audit_cli / audit_api now take flag_statuses; main computes it for API runs. - Completeness accounting keeps totality (gated_non_ga counted; unaccounted none). feature_surface_map.md: migrate Agent Memory's `oz memory*` / `oz memory-store*` CLI and `/memory_stores/*` API from `-> internal` to `-> gated:AIMemories`, so they auto-surface for docs when AIMemories goes GA. Documented the `gated:` sentinel in the header. SKILL.md: document `gated:<Flag>` in Public vs. private surfaces + Resolution patterns. tests: add TestGatedLogic (helper, non-GA deferral, GA auto-surface, unknown-flag conservatism, map-hygiene validation). 27 tests pass; audit exit 0, unaccounted none. Co-Authored-By: Oz <oz-agent@warp.dev> * docs(missing_docs): sync SKILL.md accounting buckets with gated_non_ga The gated:<Flag> work added a `gated_non_ga` bucket to the CLI and API completeness accounting, but SKILL.md's bucket list still omitted it. Document `gated_non_ga` for CLI commands and API routes, and note the `gated:<Flag>` sentinel alongside `internal` in the References section so the documented accounting matches what the audit actually emits. Co-Authored-By: Oz <oz-agent@warp.dev> * docs(cli): resolve missing_docs CLI drift (oz agent, oz run messaging, oz provider) (#279) Demonstrates the missing_docs drift-watch loop end to end on real drift. The audit flagged 15 undocumented `oz` subcommands; this resolves all of them according to each command's GA rollout status: - GA (NamedAgents): document the `oz agent` named-agent management group (list/get/create/update/delete + `oz agent skills`) and fix the existing `oz agent list` entry, which incorrectly described skill listing. - GA (ConversationApi): document `oz run conversation get` and the `oz run message` inbox commands (list/read/watch/send/mark-delivered). - Non-GA (ProviderCommand = dogfood): defer the whole `oz provider` group via `gated:ProviderCommand` so it auto-surfaces for docs when it goes GA. All command flags drafted from crates/warp_cli source (agent.rs, task.rs, provider.rs). CLI audit now reports 0 gaps; cli_commands gated_non_ga = 14. Co-authored-by: Oz <oz-agent@warp.dev> * docs(cli): cross-link run-cloud --agent to named-agent management The new "Managing named agents" section says agents are run with `oz agent run-cloud --agent <UID>`, but that flag was absent from the run-cloud key-flags list. Document `--agent <UID>` (from RunCloudArgs in crates/warp_cli/src/agent.rs) and cross-link the two sections so readers can get from creating a named agent to running one. Co-Authored-By: Oz <oz-agent@warp.dev> --------- Co-authored-by: Oz <oz-agent@warp.dev> Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
What this is
A sample PR generated by running the
missing_docsdrift-watch skill over the standing CLI backlog, so we can evaluate the skill's efficacy. Not intended to merge as-is — it's stacked on #201 (the skill itself) and meant for inspection.What one pass produced
Triaging the 31 undocumented
ozCLI commands, the skill split them into two resolution types:Draft (genuinely undocumented):
oz api-key list,oz api-key create, andoz api-key expireinreference/cli/api-keys.mdx— command syntax, every flag, and examples, all extracted fromcrates/warp_cli/src/api_key.rs(--sort-by,--sort-order,--jq,--agent, the required expiration group,--force).Map, don't duplicate (already documented elsewhere):
oz schedule *(7 subcommands) →platform/triggers/scheduled-agents.mdxoz secret *(4 subcommands) →platform/secrets.mdxThese are fully documented in their feature pages, so the skill adds surface-map entries pointing there instead of re-drafting (avoids junk-drawer duplication).
Reviewer routing
suggest_reviewers.pyresolved the owning engineers fromcrates/warp_cliownership: @bnavetta, @ianhodge. (Not actually requested here since this is a throwaway demo PR.)Result
unaccounted: none, drafted doc passesstyle_lint.What it does NOT cover (good signal on limits)
oz run message/conversation,oz agentCRUD,oz memory*[research preview],oz provider*[dogfood]) — these need a human rollout/readiness call before documenting, which is exactly the kind of judgment the skill defers rather than guesses.Conversation: https://staging.warp.dev/conversation/c5ce6a1a-81a4-4a04-92d9-8d587693be12
Run: https://oz.staging.warp.dev/runs/019f19e2-f981-725b-b383-bc7ea01adb6a
This PR was generated with Oz.