
Fire the GitHub ↔ LLM-session graph join (cross-source convergence + File re-key) #125

Merged
philcunliffe merged 4 commits into master from github-llm-graph-bridge
Jun 21, 2026

@philcunliffe

What & why

Make a repo, commit, or file seen by both @hypaware/github and a recorded Claude/Codex session land on one graph node. Convergence is automatic given a shared natural key (ids are content-addressed, LLP 0023) — the GitHub plugin already mints bridge-ready Repo/Commit/File nodes; this is the host-side half that makes the LLM-session contract adopt matching keys so the join actually fires. The same re-key also collapses one file across git worktrees, which absolute-path keying never could.

Design recorded in llp/0032-github-llm-graph-bridge.decision.md.

Changes

  • Shared key vocabulary — new context-graph/src/graph-keys.js, exposed on the kit as kit.keys. Repo/Commit/File recipes are byte-identical to github-hyp-plugin/src/keys.js; host-only remote-URL→owner/repo and absolute-path→relpath reconciliation feed them.
  • Capture (schema_version 7) — new nullable git_remote / head_sha / repo_root. The Claude hook already shells git for the branch, so it now also reads the remote, full HEAD sha, and repo root. Codex's already-captured git_origin_url/git_commit/workspace path are promoted to first-class fields (live + backfill). The gateway data source pads its declared schema columns so reading a new column doesn't throw ColumnNotFoundError over pre-v7 partitions.
  • Additive nodes/edges: Repo, Commit, with Session -in-> Repo, Session -at-> Commit, and Commit -in-> Repo (the last converges with the GitHub side's identical edge).
  • File migration (costly-to-reverse) — re-key File from absolute path → owner/repo:relpath, with absolute-path fallback for out-of-repo / non-github / no-repo files. Orphans committed File/touched rows (content-addressed ids, no retract).
  • Convergence guard: ai-gateway-graph-bridge.test.js pins the exact node/edge digests @hypaware/github publishes and asserts the host bridge mints them (e1505143…, c40ec7e7…, ca7c3b20…, f036a284…).
  • Actor stays deferred to enrichment (LLP 0028) — no deterministic T0 key.
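
As a rough illustration of why a shared natural key is enough for convergence, the sketch below hashes a node type and key into an id and reduces two spellings of the same remote to one `owner/repo` key. Both function bodies are assumptions made for illustration: the real recipes live in graph-keys.js and github-hyp-plugin/src/keys.js and are not shown in this PR description, so the hash input layout and the URL pattern here are hypothetical.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical recipe: a node id is the SHA-256 of its type and natural key.
// The real digest layout is defined by the plugin code, not by this sketch.
function nodeId(type, key) {
  return createHash('sha256').update(`${type}\0${key}`).digest('hex');
}

// Hypothetical normalizer: reduce a github.com remote to "owner/repo".
// Returns null for anything it does not recognize so callers can fall back.
function repoKeyFromRemote(remote) {
  if (typeof remote !== 'string') return null;
  const pattern =
    /^(?:https?:\/\/(?:[^@/]+@)?github\.com\/|(?:ssh:\/\/)?git@github\.com[:/])([^/]+)\/([^/]+?)(?:\.git)?\/?$/i;
  const match = remote.trim().match(pattern);
  return match ? `${match[1]}/${match[2]}` : null;
}

// Two sources that see the same repo through different remote spellings
// derive the same key, and therefore the same content-addressed id.
const fromHttps = nodeId('Repo', repoKeyFromRemote('https://github.com/acme/repo.git'));
const fromScp = nodeId('Repo', repoKeyFromRemote('git@github.com:acme/repo.git'));
console.log(fromHttps === fromScp); // true
```

Because the id depends only on the type and the key, neither side needs to know the other exists; agreeing on the key recipe is the whole contract.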

Migration note (operational)

Re-projecting mints the new owner/repo:relpath File ids alongside the orphaned absolute ones; compaction won't merge them. To retire the stale rows, drop the ai-gateway.t0 File/touched rows and re-run hyp graph project. See LLP 0032 §file-migration.

Cross-repo coupling

graph-keys.js and github-hyp-plugin/src/keys.js must stay byte-identical; the digest pins on both sides are the enforcement.

Verification

  • npm test: 1245 pass / 1 skipped / 0 fail
  • npm run typecheck + npm run lint: clean
  • Smokes: gateway_codex_capture (exercises Codex git promotion) + backfill_codex_fixture pass. context_graph_projects_rows / gateway_claude_capture fail identically on baseline (pre-existing temp-install harness gaps), not introduced here.

🤖 Generated with Claude Code

…File re-key)

Make a repo, commit, or file seen by BOTH @hypaware/github and a recorded
Claude/Codex session land on ONE graph node. Convergence is automatic given a
shared natural key (ids are content-addressed, LLP 0023), so the work is
agreeing on the key on the LLM-session side and capturing what the keys need.

- Shared key vocabulary: context-graph/src/graph-keys.js, exposed as kit.keys.
  Repo/Commit/File recipes are byte-identical to github-hyp-plugin/src/keys.js;
  host-only remote-URL → owner/repo and absolute-path → relpath reconciliation
  feed them. Digest pins on both sides enforce the cross-repo contract.

- Capture (schema_version 7): new nullable git_remote / head_sha / repo_root
  columns. The Claude hook already shells git for the branch, so it now also
  reads the remote, full HEAD sha, and repo root; Codex's already-captured
  git_origin_url / git_commit / workspace path are promoted to first-class
  fields (live + backfill). The gateway data source pads its declared schema
  columns so reading a new column doesn't throw over pre-v7 partitions.

- Additive nodes/edges: Repo (owner/repo), Commit (full HEAD sha), with
  Session -in-> Repo, Session -at-> Commit, and Commit -in-> Repo (the last
  converges with @hypaware/github's identical edge).

- File migration (costly-to-reverse): re-key File from absolute path to
  owner/repo:relpath, with absolute-path fallback for out-of-repo / non-github
  / no-repo files. This also collapses one file across git worktrees, which
  absolute-path keying never could. Orphans committed File/touched rows
  (content-addressed ids, no retract) — re-project deliberately; see LLP 0032.

- Convergence guard (ai-gateway-graph-bridge.test.js) pins the exact node/edge
  digests @hypaware/github publishes and asserts the host bridge mints them.

- LLP 0032 records the decision, the migration, the github.com-only V1 limit,
  the abbreviated-sha guard, and Actor staying deferred to enrichment (0028).
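
A minimal sketch of the capture step described above, assuming the hook reads the `origin` remote and shells out to git three times. The helper names (`runGit`, `captureGitIdentity`) and the injectable runner are hypothetical; only the three column names and the use of `git rev-parse --show-toplevel` come from this PR.

```javascript
const { execFileSync } = require('node:child_process');

// Default runner: invoke git in `cwd` and return trimmed stdout, or null on any failure.
function runGit(cwd, args) {
  try {
    const out = execFileSync('git', args, {
      cwd,
      encoding: 'utf8',
      stdio: ['ignore', 'pipe', 'ignore'],
    });
    return out.trim() || null;
  } catch {
    return null; // not a repo, no remote configured, or git not installed
  }
}

// Hypothetical helper: collect the three nullable v7 fields for a working directory.
// The runner is injectable so the function can be exercised without a real repo.
function captureGitIdentity(cwd, run = runGit) {
  return {
    git_remote: run(cwd, ['config', '--get', 'remote.origin.url']), // assumes "origin"
    head_sha: run(cwd, ['rev-parse', 'HEAD']), // full sha, not abbreviated
    repo_root: run(cwd, ['rev-parse', '--show-toplevel']),
  };
}
```

Every field degrades to null independently, which matches the columns being nullable: a directory with no remote can still report a HEAD sha and a repo root.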

npm test: 1245 pass / 0 fail. typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@philcunliffe

Dual-agent review — request_changes

  • Verdict: request_changes
  • Risk class: medium
  • Auto-merge advisory: 👎 thumbs down — verdict is request_changes; needs human-gated follow-up

Advisory only: no merge was attempted.

Risk capstone

Cross-reference: reviewer findings vs high-risk surfaces

| Source | Finding (severity, evidence) | Intersects |
| --- | --- | --- |
| Codex | major: capability accepts ^1.0.0 but unconditionally uses kit.keys (ai-gateway-graph/src/index.js:26, graph_contract.js:43/122) | Capability version negotiation gap (Risks #1) |
| Codex | major: raw git_remote persisted to rows + graph provenance (hook_command.js:70, backfill.js:508, graph_contract.js:137/222/332) | Raw git_remote persisted verbatim (Risks #2) |
| Claude | minor: codex backfill repo-identity mapping exercised but unverified (backfill.js:508-510; codex-backfill.test.js) | Capture chain (Direct callers); coverage gap, not a high-risk surface |

Codex review

Fix Validations

GitHub ↔ LLM graph convergence

  • Status: correct
  • Evidence: hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:148, hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:186, hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:231, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:128
  • Assessment: Repo, Commit, and File keys now route through kit.keys, and the contract uses those keys consistently for nodes and edges.

File re-key with fallback

  • Status: correct
  • Evidence: hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:207, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:326
  • Assessment: In-repo GitHub files re-key to owner/repo:relpath; non-bridgeable paths keep the prior absolute-path key.

Additive v7 column compatibility

  • Status: correct
  • Evidence: hypaware-core/plugins-workspace/ai-gateway/src/dataset.js:144, hypaware-core/plugins-workspace/ai-gateway/src/dataset.js:166
  • Assessment: The gateway data source now advertises declared schema columns even when older partitions lack them, so the new graph SQL can reference nullable v7 fields.

Findings

2) Contract & Interface Fidelity

  • Severity: major
  • Confidence: high
  • Evidence: hypaware-core/plugins-workspace/context-graph/src/index.js:23, hypaware-core/plugins-workspace/context-graph/src/index.js:66, hypaware-core/plugins-workspace/ai-gateway-graph/src/index.js:26, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:43, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:122
  • Why it matters: ai-gateway-graph still accepts hypaware.context-graph@^1.0.0, but now unconditionally calls kit.keys; an older 1.x provider can satisfy activation and then fail later during projection.
  • Suggested fix: Bump the context-graph capability version that guarantees kit.keys, update the consumer requirement accordingly, and fail at activation with a clear error if kit.keys is absent.
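
The gap can be sketched as follows, assuming a provider object of the shape `{ version, kit }`. The caret check is a deliberately small stand-in for a real semver library (major version 1 or higher, no prerelease tags), and `requireKitKeys` is a hypothetical name, not the project's `requireCapability`.

```javascript
// Minimal caret-range check for "^MAJOR.MINOR.PATCH" with MAJOR >= 1.
// Prerelease and build metadata are not handled in this sketch.
function satisfiesCaret(version, range) {
  const parse = (s) => s.replace(/^\^/, '').split('.').map(Number);
  const [vMaj, vMin, vPat] = parse(version);
  const [rMaj, rMin, rPat] = parse(range);
  if (vMaj !== rMaj) return false;
  if (vMin !== rMin) return vMin > rMin;
  return vPat >= rPat;
}

// Hypothetical activation guard: check the advertised version first, then
// confirm the field is really there, so failures surface before projection.
function requireKitKeys(provider, range) {
  if (!satisfiesCaret(provider.version, range)) {
    throw new Error(`capability ${provider.version} does not satisfy ${range}`);
  }
  if (!provider.kit || typeof provider.kit.keys !== 'object' || provider.kit.keys === null) {
    throw new Error('provider kit does not expose kit.keys');
  }
  return provider.kit.keys;
}

// A 1.0.x provider satisfies ^1.0.0 even though it predates kit.keys,
// which is exactly the late-failure case the finding describes.
console.log(satisfiesCaret('1.0.3', '^1.0.0')); // true
console.log(satisfiesCaret('1.0.3', '^1.1.0')); // false
```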

6) Security Surface

  • Severity: major
  • Confidence: high
  • Evidence: hypaware-core/plugins-workspace/claude/src/hook_command.js:70, hypaware-core/plugins-workspace/claude/src/hook_command.js:178, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:535, hypaware-core/plugins-workspace/codex/src/backfill.js:508, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:137, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:222, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:332
  • Why it matters: Git remotes can be credential-bearing HTTPS URLs, and this PR persists the raw remote into gateway rows and graph provenance even though only the normalized owner/repo is needed for convergence.
  • Suggested fix: Redact URL userinfo before storing git_remote or graph source_keys, and add a regression case such as https://user:token@github.com/acme/repo.git.
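
One way to implement the suggested redaction is sketched below using the standard `URL` class. The function name `redactRemoteUserinfo` is hypothetical, and treating every `scheme://` remote the same way is an assumption of this sketch; the review only names the HTTPS case explicitly.

```javascript
// Hypothetical redactor: strip userinfo from URL-style remotes and leave
// everything else (scp-like "git@host:owner/repo", local paths) untouched.
function redactRemoteUserinfo(remote) {
  if (typeof remote !== 'string' || !/^[a-z][a-z0-9+.-]*:\/\//i.test(remote)) {
    return remote; // no "scheme://" prefix, so there is no URL userinfo to strip
  }
  let url;
  try {
    url = new URL(remote);
  } catch {
    return remote; // unparseable: leave as-is rather than guess
  }
  if (!url.username && !url.password) return remote;
  url.username = '';
  url.password = '';
  return url.toString();
}

console.log(redactRemoteUserinfo('https://user:token@github.com/acme/repo.git'));
// https://github.com/acme/repo.git
console.log(redactRemoteUserinfo('git@github.com:acme/repo.git'));
// git@github.com:acme/repo.git
```

Applying this once at ingress, before the value is written anywhere, means no downstream sink needs its own redaction logic.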

No Finding

  1. Behavioral Correctness
  2. Change Impact / Blast Radius
  3. Concurrency, Ordering & State Safety
  4. Error Handling & Resilience
  5. Resource Lifecycle & Cleanup
  6. Release Safety
  7. Test Evidence Quality
  8. Architectural Consistency
  9. Debuggability & Operability

Evidence Bundle

  • Changed hot paths: context-graph capability kit, ai-gateway graph contract projection, ai-gateway message schema/data source, Claude session hook/projector, Codex live/backfill projectors.
  • Impacted callers: hypaware-core/plugins-workspace/ai-gateway-graph/src/index.js:26, hypaware-core/plugins-workspace/context-graph/src/index.js:66, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:120, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:204.
  • Impacted tests: test/plugins/ai-gateway-graph-bridge.test.js:52, test/plugins/ai-gateway-graph-contract.test.js:214, test/plugins/claude-session-context-hook.test.js:154, test/plugins/codex-exchange-projector.test.js:479, test/plugins/ai-gateway-message-projector.test.js:34.
  • Unresolved uncertainty: I did not inspect more than the requested five files; the GitHub plugin counterpart is outside this worktree, and the claim that Codex workspace.path is always the repo root remains assumed rather than proven by the diff.
Claude review

Codex backfill's new first-class repo-identity mapping is exercised but unverified

  • Severity: minor
  • Confidence: 85
  • Evidence: hypaware-core/plugins-workspace/codex/src/backfill.js:508-510 (mapping); test/plugins/codex-backfill.test.js:245-249, 264-276 (asserts only attributes.codex.* and materialized rows, never the new first-class fields)
  • Why it matters: The PR adds new deterministic field bindings in projectedExchangeFromSession (git_remote = session.gitOriginUrl, head_sha = session.gitCommit, repo_root = session.cwd); the modern-rollout test fixture has a git block so this code runs, but the test asserts none of exchange.git_remote/head_sha/repo_root (nor on the materialized rows), so a wrong/swapped binding in the backfill path would pass green — unlike the Claude hook and Codex live-projector paths, which both got direct assertions for the equivalent wiring.
  • Suggested fix: In the "modern rollout projects into canonical ai_gateway_messages rows" test, add assert.equal(exchange.git_remote, 'https://github.com/acme/repo.git'), assert.equal(exchange.head_sha, 'abc123def'), assert.equal(exchange.repo_root, '/work/repo') (and ideally assert the same three on a materialized row), mirroring codex-exchange-projector.test.js:479-481.

Reports: .git/dual-review/pr-125

…daction

Fixes the two majors and the coverage gap from the dual-review of #125.

1. Contract fidelity (capability version). `kit.keys` became a required
   field of the `hypaware.context-graph` capability, but the provider still
   advertised 1.0.0 and the connector required ^1.0.0 — a stale 1.0.x
   provider would satisfy activation and then throw deep in projection.
   Bump the provided capability to 1.1.0 (constant + manifest) and tighten
   @hypaware/ai-gateway-graph to ^1.1.0, so requireCapability rejects a
   pre-keys provider at activation with a clear cap_missing error. The
   unrelated context-graph-enrich consumer keeps ^1.0.0 (1.1.0 satisfies it;
   it doesn't use keys).

2. Security (raw git remote). The Repo key normalizes to owner/repo
   (credential-safe), but the raw git_remote was persisted verbatim into the
   ai_gateway_messages row, the attributes.codex.git_origin_url mirror, the
   Claude session-context sidecar, and (read back from the row) the graph
   node/edge source_keys. A token-bearing HTTPS remote
   (https://x-access-token:<token>@github.com/o/r.git) would leak the token
   to disk. Redact URL userinfo at ingress in every capture path, before the
   value reaches any sink; the scp-like SSH form (git@host:o/r) is left
   intact. owner/repo is all convergence needs, so nothing downstream breaks.

3. Test coverage. Pin the redaction on each path (claude hook end-to-end via
   a real git repo, codex backfill, codex live projector, plus a unit test of
   the redactor) and assert the first-class git_remote/head_sha/repo_root
   fields on the codex backfill path, which were exercised but unverified.

LLP 0032 updated in the same commit (capability-bump rationale + new
"Remote redaction" section). Full suite green (1256 pass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011CRtJuCKFbvQ8W3kG2543P
@philcunliffe

Addressed the dual-review findings — 0df89a2

Codex major (cat 2, contract fidelity): kit.keys is now a required field of the capability, so the provided hypaware.context-graph capability is bumped to 1.1.0 (constant + manifest) and @hypaware/ai-gateway-graph tightens its requirement to ^1.1.0. requireCapability now rejects a pre-keys 1.0.x provider at activation with a clear cap_missing error instead of throwing deep in projection. (context-graph-enrich keeps ^1.0.0; 1.1.0 satisfies it and it doesn't use keys.)

Codex major (cat 6 — security): the raw git_remote is now redacted at ingress in every capture path (Claude hook, Codex live projector, Codex backfill) — URL userinfo (https://user:token@…) is stripped before the value reaches any sink: the row column, the attributes.codex.git_origin_url mirror, the Claude session-context sidecar, and the graph source_keys (read back from the row). The scp-like SSH form (git@host:o/r) is left intact; owner/repo is all convergence needs, so nothing downstream changes.

Claude minor (coverage): the first-class git_remote/head_sha/repo_root bindings on the Codex backfill path are now asserted, and each redaction path is pinned by a test (Claude end-to-end via a real git repo, Codex backfill, Codex live projector, plus a unit test of the redactor).

LLP 0032 updated in the same commit (capability-bump rationale + new Remote redaction section). Full suite green: 1256 pass, 0 fail.

…face

Architectural correction following review of the capability bump in 0df89a2.
The bridge-key recipes (Repo/Commit/File: repoKeyFromRemote, commitKey,
fileKeyFromParts, …) lived in the engine kit as `kit.keys`. That gave three
specific node types a privileged home in `@hypaware/context-graph`, a substrate
that otherwise hardcodes zero node types — the wrong precedent for a graph meant
to carry many sources (some unofficial), each of which should own its own node
types symmetrically. It was also what forced the 1.0.0 → 1.1.0 capability bump.

Move `graph-keys.js` from `@hypaware/context-graph` (engine) into
`@hypaware/ai-gateway-graph` (connector), beside the contract that mints those
nodes — the host-side twin of `github-hyp-plugin/src/keys.js`. The contract now
imports `keys` directly instead of reading it off the kit.

Consequences:
- Engine reverts to a node-type-agnostic substrate: kit is back to
  { nodeId, edgeId, makeRowBuilders }; capability back to 1.0.0; connector
  requires ^1.0.0 again. `git diff master -- context-graph/` is now empty — the
  PR's engine surface is a no-op. The bump + activation guard from 0df89a2 are
  gone (the bump was a symptom of the misplacement, not a real need).
- Convergence with @hypaware/github is unchanged — still enforced by the digest
  pins in ai-gateway-graph-bridge.test.js (which pass unchanged, proving the
  move is byte-identical), not by shared engine code. context-graph-enrich's
  ^1.0.0 requirement stays satisfied.
- GraphKeys type consolidates into the connector (full 8-fn, documenting the
  module it now owns); the engine kit type drops `keys`.

Credential-redaction fix and capture-column coverage tests are untouched.
LLP 0032 "Shared key vocabulary" rewritten to record connector ownership and
the "engine stays node-type-agnostic" principle; the capability-bump paragraph
is removed. Full suite green (1256 pass), typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011CRtJuCKFbvQ8W3kG2543P
@philcunliffe

Reverted the capability bump — bridge keys now owned by the connector (e199fd5)

Follow-up to a design concern about the dual-review's Fix 1: the 1.0.0 → 1.1.0 capability bump was a symptom of putting Repo/Commit/File-specific key recipes (kit.keys) into the generic @hypaware/context-graph engine. The engine hardcodes zero node types everywhere else; hosting three blessed node types' identity recipes there is the wrong precedent for a substrate meant to carry many sources (some unofficial), each of which should own its own node types.

What changed:

  • Moved graph-keys.js from @hypaware/context-graph (engine) → @hypaware/ai-gateway-graph (connector), beside the contract that mints those nodes — the host-side twin of github-hyp-plugin/src/keys.js. The contract imports keys directly instead of reading it off the kit.
  • Engine reverts to a node-type-agnostic substrate: kit back to { nodeId, edgeId, makeRowBuilders }, capability back to 1.0.0, connector requires ^1.0.0 again. git diff master -- context-graph/ is now empty — the PR's engine surface is a no-op. The version bump + activation guard from 0df89a2 are gone.
  • Convergence with @hypaware/github is unchanged — still enforced by the digest pins in ai-gateway-graph-bridge.test.js, which pass unchanged, proving the move is byte-identical. context-graph-enrich's ^1.0.0 stays satisfied.

The credential-redaction fix (Fix 2) and capture-column coverage tests (Fix 3) are untouched. LLP 0032's "Shared key vocabulary" section now records connector ownership and the "engine stays node-type-agnostic" principle. Full suite green (1256 pass), typecheck + lint clean, all @ref LLP 0032 anchors resolve.

@philcunliffe

Dual-agent review — request_changes

  • Verdict: request_changes
  • Risk class: high
  • Auto-merge advisory: 👎 thumbs down — verdict is request_changes; needs human-gated follow-up

Advisory only: no merge was attempted.

Risk capstone

Cross-reference: reviewer findings vs high-risk surfaces

| Source | Finding (severity, evidence) | Intersects |
| --- | --- | --- |
| Codex F1 | `..` escape mints wrong File key (major, graph-keys.js:211) | Targets (relativizePath), Risks bullet 1 |
| Codex F2 | Codex repo_root←cwd mis-keys in subdirs (major, exchange-projector.js:537 / backfill.js:512) | Targets (new columns), Risks bullet 2 |
| Claude | Claude backfill drops repo-identity fields (major, claude/backfill.js:303) | Direct callers (column producers), Risks bullet 3 |
| Claude | withSchemaColumns padding untested (minor, dataset.js:166) | Direct callers (withSchemaColumns), Risks bullet 4 |

Codex review

Fix Validations

GitHub bridge IDs for Repo/Commit

  • Status: correct
  • Evidence: hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:131, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:146, test/plugins/ai-gateway-graph-bridge.test.js:51, test/plugins/ai-gateway-graph-bridge.test.js:58
  • Assessment: Repo and full-sha Commit nodes now derive through the shared key helpers and are pinned against the GitHub-side digests.

File bridge re-key

  • Status: incomplete
  • Evidence: hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:324, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:211, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:537
  • Assessment: The happy path converges, but two edge cases can still mint wrong File keys or skip convergence. Details below.

Remote credential redaction

  • Status: correct
  • Evidence: hypaware-core/plugins-workspace/codex/src/exchange-projector.js:502, hypaware-core/plugins-workspace/codex/src/backfill.js:438, hypaware-core/plugins-workspace/claude/src/hook_command.js:185, test/plugins/codex-exchange-projector.test.js:518
  • Assessment: New live, backfill, and Claude capture paths redact URL userinfo before storing git_remote.

Findings

1) Behavioral Correctness

  • Severity: major
  • Confidence: high
  • Evidence: hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:211, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:218, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:63, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:349
  • Why it matters: relativizePath() uses a raw prefix check and normalizeRelpath() does not collapse or reject .., so a raw tool path like /repo/../outside.txt is treated as bridgeable and becomes owner/repo:../outside.txt instead of falling back to the absolute out-of-repo key.
  • Suggested fix: Normalize the absolute path and root before the containment check, then reject relative results that are .., start with ../, or are absolute; add tests for /repo/../outside, /repo/sub/../../outside, and normal in-repo paths.
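
The suggested containment check might look like the sketch below. It reuses the name `relativizePath` from the review, but the signature and the choice to return null on any escape are assumptions; the actual function in graph-keys.js is not shown here.

```javascript
const path = require('node:path');

// Hypothetical containment-safe relativizer. Returns a POSIX relpath when
// `absPath` is inside `repoRoot`, otherwise null so the caller can fall back
// to the absolute-path key.
function relativizePath(repoRoot, absPath) {
  if (!repoRoot || !absPath) return null;
  if (!path.posix.isAbsolute(repoRoot) || !path.posix.isAbsolute(absPath)) return null;
  // Normalize first so ".." segments are collapsed before any comparison.
  const root = path.posix.normalize(repoRoot);
  const target = path.posix.normalize(absPath);
  const rel = path.posix.relative(root, target);
  if (rel === '' || rel === '..' || rel.startsWith('../') || path.posix.isAbsolute(rel)) {
    return null; // the root itself, or anything outside it, is not bridgeable
  }
  return rel;
}

console.log(relativizePath('/repo', '/repo/src/a.js'));       // src/a.js
console.log(relativizePath('/repo', '/repo/src/../a.js'));    // a.js
console.log(relativizePath('/repo', '/repo/../outside.txt')); // null
console.log(relativizePath('/repo', '/repo-other/a.js'));     // null
```

The last case is one a raw string-prefix check gets wrong as well: `/repo-other/a.js` starts with `/repo` but is not inside it.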

2) Contract & Interface Fidelity

  • Severity: major
  • Confidence: medium
  • Evidence: collectivus-plugin-kernel-types.d.ts:1370, hypaware-core/plugins-workspace/claude/src/hook_command.js:180, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:537, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:542, hypaware-core/plugins-workspace/codex/src/backfill.js:437, hypaware-core/plugins-workspace/codex/src/backfill.js:512
  • Why it matters: repo_root is documented and consumed as git rev-parse --show-toplevel, but Codex live/backfill populate it from workspace/cwd; if Codex runs in /repo/pkg, files under /repo/pkg key as owner/repo:a.js instead of owner/repo:pkg/a.js, and sibling repo files fail to converge.
  • Suggested fix: Only set Codex repo_root when the metadata explicitly provides the repository root or when backfill can derive it with git rev-parse --show-toplevel; otherwise leave it unset so File keys fall back rather than minting wrong bridge IDs.
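
The mis-key is easy to reproduce in isolation. In this sketch, `fileKey` is a hypothetical builder that bridges only when it is given a repo root, which makes the difference between a verified toplevel, a subdirectory cwd, and an unset root visible side by side.

```javascript
const path = require('node:path');

// Hypothetical File-key builder: bridge only when a repo root is supplied.
function fileKey({ repoKey, repoRoot, absPath }) {
  if (repoKey && repoRoot) {
    const rel = path.posix.relative(repoRoot, absPath);
    if (rel && rel !== '..' && !rel.startsWith('../')) return `${repoKey}:${rel}`;
  }
  return absPath; // fallback: absolute-path key, no cross-source join
}

const absPath = '/repo/pkg/a.js';
const fromToplevel = fileKey({ repoKey: 'acme/repo', repoRoot: '/repo', absPath });
const fromSubdirCwd = fileKey({ repoKey: 'acme/repo', repoRoot: '/repo/pkg', absPath });
const failSafe = fileKey({ repoKey: 'acme/repo', repoRoot: null, absPath });

console.log(fromToplevel);  // acme/repo:pkg/a.js
console.log(fromSubdirCwd); // acme/repo:a.js (wrong: same key as a real /repo/a.js)
console.log(failSafe);      // /repo/pkg/a.js
```

Leaving the root unset loses the join for that file but never produces a key that points at a different file, which is why falling back is the safer default.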

No Finding

  1. Change Impact / Blast Radius
  2. Concurrency, Ordering & State Safety
  3. Error Handling & Resilience
  4. Security Surface
  5. Resource Lifecycle & Cleanup
  6. Release Safety
  7. Test Evidence Quality
  8. Architectural Consistency
  9. Debuggability & Operability

Evidence Bundle

  • Changed hot paths: message_projector.js:5, message_projector.js:51, message_projector.js:662, graph_contract.js:118, graph_contract.js:202, graph-keys.js:235, hook_command.js:176, exchange-projector.js:465, backfill.js:429
  • Impacted callers: ai-gateway-graph/src/index.js:32, context-graph/src/project.js:50, src/core/query/sql.js:108
  • Impacted tests: ai-gateway-graph-bridge.test.js:51, ai-gateway-graph-contract.test.js:235, claude-session-context-hook.test.js:154, codex-exchange-projector.test.js:479, codex-backfill.test.js:248
  • Unresolved uncertainty: I did not run the test suite; review is based on the supplied diff plus targeted caller/contract tracing. Actual Codex metadata may currently use repo roots as workspace keys, but the code and tests do not establish that contract.
Claude review

Claude backfill drops the new repo-identity fields, so backfilled Claude sessions never converge

  • Severity: major
  • Confidence: 85
  • Evidence: hypaware-core/plugins-workspace/claude/src/backfill.js:303
  • Why it matters: The live Claude projector stamps git_remote/head_sha/repo_root from the session-context record (projector.js:273-275), and pickLatestMatching hands the same record (which session_context.js:126-128 normalizes those three fields onto) to the backfill path — but projectedExchangeFromEntries stamps only cwd/git_branch (backfill.js:303-304), so backfilled Claude sessions silently mint no Repo/Commit nodes, keep absolute-path File keys, and never join with @hypaware/github or across worktrees — the exact convergence this PR exists to deliver, which live Claude and both Codex paths (live + backfill) do perform. LLP 0032 §capture documents Codex backfill explicitly but is silent on Claude backfill, so the asymmetry reads as an oversight, not a deliberate scope cut.
  • Suggested fix: Mirror the live projector after backfill.js:304 — if (record?.git_remote) exchange.git_remote = record.git_remote (and head_sha, repo_root); add the three fields to the appended record + row assertions in test/plugins/claude-backfill.test.js. If Claude backfill is intentionally left un-bridged, say so in LLP 0032 §capture.
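
A compact way to express the suggested stamping, assuming the session-context record and the exchange are plain objects. `stampRepoIdentity` is a hypothetical helper; the review itself proposes three inline `if` statements, and this is the same logic written as a loop.

```javascript
// The three repo-identity fields named in this PR.
const REPO_IDENTITY_FIELDS = ['git_remote', 'head_sha', 'repo_root'];

// Hypothetical helper: copy each field from the session-context record onto
// the exchange, skipping any that are missing, null, or empty.
function stampRepoIdentity(exchange, record) {
  for (const field of REPO_IDENTITY_FIELDS) {
    const value = record?.[field];
    if (value) exchange[field] = value;
  }
  return exchange;
}

const exchange = stampRepoIdentity(
  { cwd: '/work/repo', git_branch: 'main' },
  { git_remote: 'https://github.com/acme/repo.git', head_sha: null, repo_root: '/work/repo' }
);
console.log(Object.keys(exchange)); // [ 'cwd', 'git_branch', 'git_remote', 'repo_root' ]
```

Sharing one helper between the live projector and the backfill path would also make it harder for the two to drift apart again.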

withSchemaColumns v7 null-padding (the no-ColumnNotFoundError shim) has no direct test

  • Severity: minor
  • Confidence: 80
  • Evidence: hypaware-core/plugins-workspace/ai-gateway/src/dataset.js:166
  • Why it matters: LLP 0032 §capture names this padding as the load-bearing mechanism that lets the additive v7 columns ship with no partition-label bump or cache wipe — it prevents ColumnNotFoundError when a contract/query reads git_remote/head_sha/repo_root over a pre-v7 partition. It is exercised only transitively (every other test stages partitions that already carry all columns), so a regression dropping the padding would pass the whole suite while breaking real queries over old data.
  • Suggested fix: Add one test in test/core/ai-gateway-dataset.test.js: build a createDataSource over a partition that physically lacks git_remote, assert source.columns includes the three new columns, and that scanning a row reads them as null rather than throwing.
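
The padding and the suggested check can be sketched together as below. The name `withSchemaColumns` comes from the review, but this signature and the `{ columns, scan }` source shape are assumptions for illustration; the real data source in dataset.js may look quite different.

```javascript
// Hypothetical padding shim: advertise every declared column and read any
// column the partition physically lacks as null.
function withSchemaColumns(source, declaredColumns) {
  const physical = new Set(source.columns);
  const missing = declaredColumns.filter((name) => !physical.has(name));
  return {
    columns: [...source.columns, ...missing],
    *scan() {
      for (const row of source.scan()) {
        const padded = { ...row };
        for (const name of missing) padded[name] = null;
        yield padded;
      }
    },
  };
}

// A pre-v7 partition: no git_remote / head_sha / repo_root on disk.
const preV7 = {
  columns: ['session_id', 'cwd'],
  scan: () => [{ session_id: 's1', cwd: '/work/repo' }],
};
const declared = ['session_id', 'cwd', 'git_remote', 'head_sha', 'repo_root'];
const padded = withSchemaColumns(preV7, declared);
const [firstRow] = [...padded.scan()];
console.log(padded.columns.includes('git_remote'), firstRow.git_remote); // true null
```

A direct test along these lines fails as soon as the padding is removed, which the transitive coverage described above would not.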

Reports: .git/dual-review/pr-125

…pture paths

Four findings from the dual-agent review of the GitHub ↔ LLM-session graph
join, all converging on the File re-key path producing wrong-but-plausible
keys or skipping convergence on some ingestion paths:

- relativizePath (`..` escape, major): POSIX-normalize root and path before
  the containment check, so a path that escapes the repo via `..`
  (`/repo/../outside`) falls back to its absolute key instead of slicing to a
  bogus `owner/repo:../outside` relpath (which could collide with a real file).
  In-repo `..` that stays inside still relativizes.

- Codex repo_root (subdir mis-key, major): Codex exposes no verified git
  toplevel — live `workspace.path` and backfill rollout `cwd` can be a repo
  subdir, which silently mis-relativizes (and can false-merge) File keys onto
  one content-addressed node. Stop populating Codex repo_root; Codex File nodes
  keep absolute keys in V1. Repo/Commit convergence (git_remote/head_sha)
  is unaffected, so the headline session↔repo / session↔commit joins still fire.

- Claude backfill (dropped fields, major): backfill stamped only cwd/git_branch
  from the session-context record, so re-imported Claude sessions never
  converged. Stamp git_remote/head_sha/repo_root too, mirroring the live
  projector (Claude's hook captures a real --show-toplevel, so it bridges).

- withSchemaColumns (untested shim, minor): add a direct test that a v7 column
  read over a pre-v7 partition surfaces as null instead of ColumnNotFoundError.

LLP 0032 updated in lockstep: corrected §capture (the workspace path is NOT the
repo root), added §codex-repo-root recording the fail-safe, noted Claude-backfill
parity, and documented the `..`-normalization fallback in §file-migration.

npm test green (1259 pass), lint + typecheck clean, @refs resolve.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@philcunliffe philcunliffe merged commit dcea821 into master Jun 21, 2026
6 checks passed
@philcunliffe philcunliffe deleted the github-llm-graph-bridge branch June 21, 2026 19:32