Skip to content

refactor: self-validating clients, fast validate + separate coverage/smoke (#1504)#1505

Merged
cliffhall merged 12 commits into
v2/mainfrom
v2/launcher-self-validate
Jun 21, 2026
Merged

refactor: self-validating clients, fast validate + separate coverage/smoke (#1504)#1505
cliffhall merged 12 commits into
v2/mainfrom
v2/launcher-self-validate

Conversation

@cliffhall

@cliffhall cliffhall commented Jun 20, 2026

Copy link
Copy Markdown
Member

Closes #1504.

Summary

A build/test tooling refactor that makes every client self-validating, separates the fast validate check from the coverage gate, splits the cross-client smoke suite into its own target, and cleans up the noisy web unit-test output. Each section below is independently reviewable.


1. Every client self-validates

The launcher, CLI, and TUI now each have their own web-style validate (the launcher and the cli/tui previously had no lint/format:check of their own):

  • Added per-client lint / format / format:check / validate + a Node-only eslint flat config (web's React/Storybook plugins stripped) and prettier.
    • CLI lints/formats clean as-is.
    • TUI registers eslint-plugin-react-hooks for the classic rules only (rules-of-hooks + exhaustive-deps, which the source already references in disable directives); the stricter react-hooks@7 rules (e.g. set-state-in-effect) are not enforced on the interim component surface tracked in TUI: extend coverage gate to Ink components (lift interim exclusion) #1501.
    • Launcher gets a per-file coverage gate over parse-launcher-argv.ts (100%); src/index.ts is excluded as binary bootstrap.
  • The repo-root validate is just the four per-client validations (validate:webvalidate:clivalidate:tuivalidate:launcher); each delegates to that client's own npm run validate. No cross-client build dependencies.

2. validate is fast; coverage is its own target

  • Each client's validate runs test, not test:coverage — no v8 instrumentation, and web runs the unit project only. validate still builds each client bundle (cli via its test pretest, the others via an explicit build).
  • The per-file coverage gate moved to a dedicated top-level coverage (and per-client coverage:web/coverage:cli/coverage:tui/coverage:launcher, each → that client's test:coverage). Run npm run coverage when you want the gate.
  • One deliberate asymmetry: the CLI's out-of-process e2e.test.ts spawns the built binary, so its test/test:coverage build first and its validate folds that in (format:check && lint && test, no separate build).

3. smoke is a separate top-level target

  • smoke (smoke:launcher + prod smoke:cli/smoke:tui/smoke:web) is not part of validate — run it after a build/validate (npm run validate && npm run smoke). It has no build commands; the bundles come from validate.
  • smoke:launcher is now a real script (scripts/smoke-launcher.mjs) like the others, asserting --help / --cli --help / --tui --help each print the right mode's usage banner (proving the launcher resolved the right client build), not just that they exit 0.

4. CI

Collapsed to: validate (fast: format/lint/build/test + builds all bundles) → web test:integration (real stdio/HTTP/OAuth/storage paths, no coverage) → smoke → storybook. The coverage gate is intentionally not run in CI — it's a local/manual npm run coverage. (Integration is kept in CI as its own step since it otherwise only ran via test:coverage.)

5. Quieter unit-test output

Green web unit runs printed dozens of alarming-but-expected lines from happy-dom doing real I/O with no server:

  • <iframe src="/sandbox.html"> → ~40 DOMException/AsyncTaskManager destroyed lines. Fixed by disabling child-frame navigation in the unit project (tests only assert on the iframe element).
  • App-level renders reading GET /api/config → ~76 404s. Fixed with a benign default fetch (empty 200) in the unit setup; tests that exercise fetch still override it.

Negative tests already suppressed their own console output; CLI/TUI/integration were already clean.

6. Root script cleanup + reordering

Dropped redundant root aggregators (test, test:web/cli/tui, test:coverage, and the rarely-used inspector / root test:storybook) now that each client self-validates, and reordered the script lists (frequently-run first). The build:* / validate:* / coverage:* / smoke:* families follow one consistent aggregate + per-client pattern.


Verification

npm run validate && npm run coverage && npm run smoke passes end-to-end (exit 0): all four client validations, the full coverage gate (incl. web integration, web unit at 2424 tests / gate green), and all four smokes (smoke:launcher/smoke:cli/smoke:tui/smoke:web). Unit output is noise-free.

🤖 Generated with Claude Code

…vel step (#1504)

Previously `validate:launcher` built cli/tui/web and ran the cross-client
smokes, so the launcher's "validation" depended on every other client
being built — and the launcher had no lint/format/coverage of its own.

- clients/launcher: add a web-style `validate` (format:check && lint &&
  build && test:coverage) with a Node-only eslint flat config, prettier,
  and a per-file coverage gate (parse-launcher-argv.ts covered;
  src/index.ts excluded as binary bootstrap). 100% on the gated file.
- Root `validate:launcher` now just delegates: `cd clients/launcher &&
  npm run validate` — no dependency on the other clients.
- New top-level `smoke` step (smoke:launcher + smoke:cli + smoke:tui +
  smoke:web) with NO build commands; `validate` runs it last. Each
  client's own validation builds its bundle first, so nothing builds
  twice (cli/tui via test:coverage, launcher via its validate;
  smoke:web builds dist on demand).
- clients/tui test:coverage now builds the TUI bundle (mirroring cli) so
  the build-free smoke has it available.
- CI: replace the standalone cli/tui/launcher build step with a
  "Validate launcher" step. Docs updated (AGENTS.md, launcher README).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cliffhall cliffhall added the v2 Issues and PRs for v2 label Jun 20, 2026
…rom validate

Extends the launcher self-validate work to the CLI and TUI, and makes the
top-level `validate` nothing but the per-client validations — smokes are
now a separate top-level target.

- clients/cli, clients/tui: add a web-style `validate` with a Node flat
  eslint config + prettier. CLI lint/format pass as-is. TUI registers
  eslint-plugin-react-hooks for the classic rules only (rules-of-hooks +
  exhaustive-deps, which the source already references); the stricter
  react-hooks@7 rules are not enforced on the interim component surface
  (#1501).
- Avoid double builds:
  - CLI's e2e spawns the built binary, so its `test:coverage` builds first;
    `validate` folds that in (`format:check && lint && test:coverage`, no
    separate `build`).
  - TUI tests don't spawn a bundle, so `validate` keeps an explicit
    `build` (for the smoke) and `test:coverage` is pure.
- Root: `validate` = validate:web && validate:cli && validate:tui &&
  validate:launcher (each delegates to the client's own validate). `smoke`
  is no longer part of `validate` — it's a separate top-level target run
  after a build/validate, with no build commands of its own.
- CI: per-client "Validate X" steps + a single "Run cross-client smokes"
  (npm run smoke). Docs updated (AGENTS.md, cli/tui READMEs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cliffhall cliffhall changed the title refactor(launcher): self-contained validate; split smokes into top-level step (#1504) refactor: every client self-validates; smoke separated from validate (#1504) Jun 20, 2026
cliffhall and others added 7 commits June 20, 2026 16:40
Now that every client self-validates and the top-level `validate` chains
the per-client validations, the root `test` / `test:web` / `test:cli` /
`test:tui` / `test:coverage` aggregators are redundant — run
`npm run validate` (all) or `cd clients/<name> && npm run validate` (one).
`test:storybook` stays (it's the one check not part of validate). CI was
already using each client's own scripts, so it's unaffected. Docs updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Passing unit runs printed dozens of alarming-but-expected lines that made
green runs look broken — all from happy-dom performing real I/O with no
server behind it:
- `<iframe src="/sandbox.html">` (MCP Apps sandbox) made happy-dom fetch
  and later unload that URL, emitting ~40 `DOMException [NetworkError/
  AbortError]` + `AsyncTaskManager destroyed` lines. Disable child-frame
  navigation in the unit project (the tests only assert on the iframe
  element/attributes, not its document), removing the noise with no
  coverage loss.
- App-level renders read `GET /api/config` via useSandboxUrl /
  useServerListWritable, 404-ing ~76x. Add a benign default `fetch` (empty
  200) in the unit setup; tests that exercise fetch still override it.

Negative tests that assert on errors already suppress their own console
output (e.g. jsonUtils/schemaUtils); CLI, TUI, and the web integration
suite were already clean. Unit suite is now noise-free; 2424 web tests
pass and the coverage gate is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… own target

Each client's `validate` now runs `test` instead of `test:coverage`, so
the everyday pre-push check skips v8 instrumentation (and, for web, the
slower integration project). The per-file coverage gate moves to a
dedicated top-level `coverage` script:

  coverage = coverage:web && coverage:cli && coverage:tui && coverage:launcher
  coverage:<name> = cd clients/<name> && npm run test:coverage

`validate` still builds each client's bundle (cli via its test `pretest`,
the others via an explicit `build`), so `npm run validate && npm run
smoke` works unchanged. CI now runs `validate` (fast checks + builds),
then `coverage` (the gate), then `smoke` — collapsing the old per-client
steps. Docs updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CI no longer runs `npm run coverage` — the per-file coverage gate is now a
local/manual check (`npm run coverage` before pushing). To avoid silently
dropping the web integration suite (it only ran via test:coverage, and
`validate` runs the unit project only), CI gains a standalone
`clients/web` `test:integration` step that exercises the real
stdio/HTTP/OAuth/storage paths without coverage instrumentation.

CI is now: validate (fast) → web integration → smoke → storybook. Docs
updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
smoke:launcher was the odd one out — an inline chain of three
`node clients/launcher/build/index.js ... --help` commands while smoke:cli
/ smoke:tui / smoke:web each run a `scripts/smoke-*.mjs`. Extract it to
`scripts/smoke-launcher.mjs`, which runs the same `--help` / `--cli --help`
/ `--tui --help` dispatch checks but now also asserts each prints that
mode's usage banner (proving the launcher resolved the right client build)
and verifies the required builds exist up front with a clear error. Prints
a `smoke:launcher OK — ...` summary in the same style as the rest. Docs
updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move frequently-run scripts toward the top and rarely-run-by-hand ones
(pretest hooks, lint/format internals, etc.) toward the end, consistently
across the root and client manifests. Pure reordering — no script names or
values changed, and no other manifest content is touched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reordering dropped the root `inspector` and `test:storybook`
aggregators (intentional — `inspector` is redundant with `web`, and
storybook is run from clients/web in CI). Update AGENTS.md so it no longer
points at `npm run inspector`, and run `test:storybook` from `clients/web/`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cliffhall cliffhall changed the title refactor: every client self-validates; smoke separated from validate (#1504) refactor: self-validating clients, fast validate + separate coverage/smoke (#1504) Jun 20, 2026
@cliffhall

Copy link
Copy Markdown
Member Author

@claude review

@claude

claude Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Claude finished @cliffhall's task in 1m 54s —— View job


Code Review

  • Read the diff against origin/v2/main
  • Review root + per-client package.json scripts
  • Review CI workflow changes
  • Review new eslint/vitest configs + smoke-launcher script
  • Review web test-output quieting changes
  • Post review feedback

Overall this is a clean, well-motivated tooling refactor with unusually good inline documentation — every non-obvious choice (the coverage-gate split, the happy-dom navigation flag, the launcher index.ts coverage exclusion, the smoke ordering) carries a comment that explains why. The script families now follow one consistent aggregate + per-client pattern, and the self-validating model removes the awkward cross-client validate:launcher that used to build cli/tui by hand. I verified the smoke-launcher markers against the actual source and they all line up. Nothing here is blocking. A few observations below, mostly minor.

Findings

1. format / format:check globs miss top-level TS files (low)
The cli/tui/launcher scripts use prettier --write src __tests__, but lint uses eslint .. For the TUI this leaves index.ts, tui.tsx, and dev.ts (repo-root of the client) linted but not format-checked; for the launcher, scripts/ is in the same gap. So formatting drift in those files won't be caught by validate, even though they're linted. Consider widening to prettier --write "**/*.{ts,tsx}" --ignore-path ... or explicitly adding the stray files, so format:check and lint cover the same surface.

2. New eslint.config.js files aren't covered by format:check and use a different quote style (cosmetic)
The three added flat configs use single quotes, while the TS source (and prettier's default) is double-quoted. Since they live outside src/__tests__, prettier never checks them, so the inconsistency is invisible to CI but real. Harmless, just worth a note for consistency.

3. fetch baseline in setup.ts — restoration is not config-enforced (low)
The comment says "any test that actually exercises fetch overrides this … which Vitest restores back to this baseline afterward." That auto-restore only happens for vi.stubGlobal + unstubAllGlobals, or with restoreMocks: true — and the unit project sets neither (no restoreMocks in vite.config.ts). In practice the suite is green, so tests are cleaning up themselves, but the stated guarantee is slightly stronger than what's configured. Two side effects worth being aware of:

  • A test that forgets to stub fetch and asserts on a failure/error path will now silently get a benign 200 {} instead of throwing — a missing stub can read as a pass.
  • If you want the restoration the comment describes to be automatic, add restoreMocks: true (or unstubEnvs/unstubGlobals) to the unit project.

Reasonable tradeoff for quieting ~76 noisy 404s; just flagging that the comment slightly oversells the mechanism.

4. Coverage gate is now CI-invisible (process note, by design)
Moving the per-file gate to a local-only npm run coverage is a deliberate, documented choice and it does make CI much faster. The flip side: a PR that drops a file below the 90/85/80/50 thresholds will pass CI green, and the regression only surfaces when someone remembers to run coverage locally. That's a real behavior change from the prior CI (which ran test:coverage for web/cli/tui). Since AGENTS.md already documents "run coverage before pushing," this is acceptable — but it does rely on contributor discipline rather than enforcement. Worth confirming the team is comfortable with the gate being advisory in CI.

Nits

  • smoke-launcher.mjs asserts on exact banner substrings (Usage: inspector-cli, Usage: mcp-inspector-tui, Mode flags (--web, --cli, --tui)). All three exist in source today, but they're now load-bearing test fixtures — a future reword of any usage banner will break the smoke. That's arguably the point (it proves the right build was reached), just noting the coupling.
  • The Object.defineProperty(globalThis, "fetch", …) baseline returns the same {} 200 regardless of method/URL; fine for incidental calls, but if more components start doing meaningful GETs on mount, this could quietly satisfy assertions that should fail. Not actionable now.

None of these block merge — they're polish/awareness items. Nice cleanup.
· v2/launcher-self-validate

- TUI format/format:check now also cover the client-root source files
  (index.ts, tui.tsx, dev.ts) that live outside src/ — `eslint .` already
  lints them, so they were linted but not format-checked. (cli/launcher
  have no source outside src/; their root .ts are single-quoted configs
  intentionally left out of prettier, matching web.)
- Reword the setup.ts default-fetch comment: it overstated auto-restore
  (the unit project sets no restoreMocks/unstubGlobals), and now also
  flags that the permissive 200 default means a test asserting on a fetch
  FAILURE must install its own rejecting stub.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cliffhall

Copy link
Copy Markdown
Member Author

Thanks for the thorough review! Addressed the two actionable findings in 5e872b7:

  • Progress notifications #1 (format/lint surface mismatch): Widened the TUI format / format:check to also cover its client-root source files — index.ts, tui.tsx, dev.ts — which eslint . already lints. (cli/launcher have no source outside src/; their only root .ts are the single-quoted vitest/tsup configs, which — like web's vite.config.ts — are intentionally kept out of prettier so they don't get flipped to double quotes. So I left those, and the launcher's scripts/make-executable.js is .js, outside the eslint ts/mts glob too.)
  • Completion requests #3 (fetch-baseline comment oversells restore): Reworded the setup.ts comment — it no longer claims Vitest auto-reverts (the unit project sets no restoreMocks/unstubGlobals), and it now flags the real tradeoff you noted: the permissive 200 {} default means a test asserting on a fetch failure must install its own rejecting stub. I opted to fix the comment rather than add restoreMocks: true globally, since the suite is green and that flag (restoring after every test) risks breaking files that set spies in beforeAll.

On the by-design / awareness items:

  • Request logging #2 (eslint.config.js single quotes): Left as-is — they match the repo's config-file convention (web's vite.config.ts, every client vitest.config.ts) and aren't prettier-gated anywhere.
  • SSE transport support #4 (coverage gate now CI-invisible): Confirmed intentional — the gate is local-only (npm run coverage, documented in AGENTS.md as a pre-push step). Happy to revisit if the team wants it enforced in CI.
  • Nits (smoke-launcher banner coupling, single-response fetch default): Acknowledged — the banner coupling is deliberate (it proves the right build was reached), and the fetch default is fine for incidental calls per the reworded comment.

@cliffhall

Copy link
Copy Markdown
Member Author

@claude review

@claude

claude Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Claude finished @cliffhall's task in 40s —— View job


Re-review (post-fixups in 5e872b77)

  • Review changes in 5e872b77 (format glob gap + fetch-baseline comment)
  • Verify the two addressed findings
  • Post review

Re-reviewed the fixup commit against my prior feedback. Both actionable findings are resolved cleanly, and the reasoning in your reply checks out against the actual tree.

#1 — format/lint surface mismatch (TUI)
format / format:check now glob src __tests__ index.ts tui.tsx dev.ts. I confirmed those three are exactly the non-config source files at the TUI client root:

clients/tui/{index.ts, tui.tsx, dev.ts}   ← now format-checked + linted (matched surface)
clients/tui/{tsup.config.ts, vitest.config.ts}  ← intentionally excluded (single-quoted configs)

Cross-checked the other two clients — the asymmetry is justified:

  • CLI root has only tsup.config.ts / vitest.config.ts (configs) → correctly left out.
  • Launcher root has eslint.config.js, vitest.config.ts (configs) and scripts/make-executable.js (a .js, outside both the eslint ts/mts glob and prettier).

So eslint . and prettier now cover the same source surface in every client, with only the deliberately-excluded config files differing. 👍

#3 — fetch-baseline comment
The reworded comment in setup.ts no longer claims automatic restoration. It now correctly states that direct globalThis.fetch mutation isn't auto-reverted without restoreMocks, and explicitly flags the permissive-200 tradeoff (a test asserting on a fetch failure must install its own rejecting stub). Matches the actual config — accurate now.

By-design items (#2 eslint.config.js quotes, #4 CI-invisible coverage gate, smoke-launcher coupling): acknowledged as intentional in your reply; no objection.

Nothing else surfaced in the delta. LGTM — nice, tightly-scoped fixups. ✅

@cliffhall

Copy link
Copy Markdown
Member Author

Thanks — LGTM confirmed, all findings resolved. Ready for merge into v2/main once CI is green.

cliffhall and others added 2 commits June 20, 2026 20:28
`install:clients` ran the same `node scripts/install-clients.mjs` as
`postinstall`, which fires automatically on `npm install` — so the manual
alias was redundant. Drop it and the doc references (AGENTS.md, spec).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This PR made CI run the full `npm run smoke`, which surfaced that
smoke:tui fails in headless CI: the Ink TUI throws "Raw mode is not
supported" on mount (no TTY) and exits before rendering its first frame.
The old CI never ran smoke:tui, so this was latent.

smoke-tui.mjs now self-skips when `process.env.CI` is set — the boot/
render check is inherently local-only (a TUI needs a terminal). The TUI is
still built and unit-tested in CI; `npm run smoke` stays the single
command everywhere (full locally, tui-skipped in CI). Verified
`CI=true npm run smoke` is green (launcher/cli OK, tui SKIPPED, web OK).
Docs + CI comment updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cliffhall cliffhall merged commit 5efe9dd into v2/main Jun 21, 2026
1 check passed
@cliffhall cliffhall deleted the v2/launcher-self-validate branch June 21, 2026 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

v2 Issues and PRs for v2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Self-validating clients: per-client validate scripts; smoke separated from validate

1 participant