Fire the GitHub ↔ LLM-session graph join (cross-source convergence + File re-key)#125
…File re-key)

Make a repo, commit, or file seen by BOTH @hypaware/github and a recorded Claude/Codex session land on ONE graph node. Convergence is automatic given a shared natural key (ids are content-addressed, LLP 0023), so the work is agreeing the key on the LLM-session side and capturing what the keys need.

- Shared key vocabulary: context-graph/src/graph-keys.js, exposed as kit.keys. Repo/Commit/File recipes are byte-identical to github-hyp-plugin/src/keys.js; host-only remote-URL → owner/repo and absolute-path → relpath reconciliation feed them. Digest pins on both sides enforce the cross-repo contract.
- Capture (schema_version 7): new nullable git_remote / head_sha / repo_root columns. The Claude hook already shells git for the branch, so it now also reads the remote, full HEAD sha, and repo root; Codex's already-captured git_origin_url / git_commit / workspace path are promoted to first-class fields (live + backfill). The gateway data source pads its declared schema columns so reading a new column doesn't throw over pre-v7 partitions.
- Additive nodes/edges: Repo (owner/repo), Commit (full HEAD sha), with Session -in-> Repo, Session -at-> Commit, and Commit -in-> Repo (the last converges with @hypaware/github's identical edge).
- File migration (costly-to-reverse): re-key File from absolute path to owner/repo:relpath, with absolute-path fallback for out-of-repo / non-github / no-repo files. This also collapses one file across git worktrees, which absolute-path keying never could. Orphans committed File/touched rows (content-addressed ids, no retract); re-project deliberately, see LLP 0032.
- Convergence guard (ai-gateway-graph-bridge.test.js) pins the exact node/edge digests @hypaware/github publishes and asserts the host bridge mints them.
- LLP 0032 records the decision, the migration, the github.com-only V1 limit, the abbreviated-sha guard, and Actor staying deferred to enrichment (0028).

npm test: 1245 pass / 0 fail. typecheck + lint clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Dual-agent review —
| Source | Finding (severity, evidence) | Intersects |
|---|---|---|
| Codex | major: capability accepts `^1.0.0` but unconditionally uses `kit.keys` (ai-gateway-graph/src/index.js:26, graph_contract.js:43/122) | Capability version negotiation gap (Risks #1) |
| Codex | major: raw `git_remote` persisted to rows + graph provenance (hook_command.js:70, backfill.js:508, graph_contract.js:137/222/332) | Raw git_remote persisted verbatim (Risks #2) |
| Claude | minor: codex backfill repo-identity mapping exercised but unverified (backfill.js:508-510; codex-backfill.test.js) | Capture chain (Direct callers); coverage gap, not a high-risk surface |
Codex review
Fix Validations
GitHub ↔ LLM graph convergence
- Status: correct
- Evidence: hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:148, hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:186, hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:231, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:128
- Assessment: Repo, Commit, and File keys now route through `kit.keys`, and the contract uses those keys consistently for nodes and edges.
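To illustrate why a shared natural key is enough for convergence, here is a minimal sketch. It assumes a SHA-256 id over the node type and key, which is a stand-in and not the recipe from LLP 0023 or `graph-keys.js`; `nodeIdSketch` and `repoKeyFromRemoteSketch` are hypothetical names, and the github.com-only matching mirrors the V1 limit mentioned in the PR.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical content-addressed id: same (type, key) in, same id out.
// The real digest recipe is not shown in this thread.
function nodeIdSketch(type, naturalKey) {
  return createHash('sha256').update(`${type}\u0000${naturalKey}`).digest('hex');
}

// Hypothetical remote → owner/repo normalizer (github.com only).
// Accepts https remotes (with or without userinfo), ssh:// remotes,
// and the scp-like form git@github.com:owner/repo(.git).
function repoKeyFromRemoteSketch(remote) {
  const pattern =
    /^(?:https?:\/\/(?:[^/]*@)?github\.com\/|ssh:\/\/git@github\.com\/|git@github\.com:)([^/]+)\/([^/]+?)(?:\.git)?\/?$/i;
  const match = pattern.exec(remote ?? '');
  return match ? `${match[1]}/${match[2]}` : null; // null => not bridgeable
}

// Two different spellings of the same remote yield one Repo node id.
const viaHttps = repoKeyFromRemoteSketch('https://github.com/acme/repo.git');
const viaSsh = repoKeyFromRemoteSketch('git@github.com:acme/repo.git');
const sameNode = nodeIdSketch('Repo', viaHttps) === nodeIdSketch('Repo', viaSsh);
```

Because the id depends only on the type and the normalized key, any producer that agrees on the key mints the same node without coordinating at write time.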
File re-key with fallback
- Status: correct
- Evidence: hypaware-core/plugins-workspace/context-graph/src/graph-keys.js:207, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:326
- Assessment: In-repo GitHub files re-key to `owner/repo:relpath`; non-bridgeable paths keep the prior absolute-path key.
Additive v7 column compatibility
- Status: correct
- Evidence: hypaware-core/plugins-workspace/ai-gateway/src/dataset.js:144, hypaware-core/plugins-workspace/ai-gateway/src/dataset.js:166
- Assessment: The gateway data source now advertises declared schema columns even when older partitions lack them, so the new graph SQL can reference nullable v7 fields.
Findings
2) Contract & Interface Fidelity
- Severity: major
- Confidence: high
- Evidence: hypaware-core/plugins-workspace/context-graph/src/index.js:23, hypaware-core/plugins-workspace/context-graph/src/index.js:66, hypaware-core/plugins-workspace/ai-gateway-graph/src/index.js:26, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:43, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:122
- Why it matters: `ai-gateway-graph` still accepts `hypaware.context-graph@^1.0.0`, but now unconditionally calls `kit.keys`; an older 1.x provider can satisfy activation and then fail later during projection.
- Suggested fix: Bump the context-graph capability version that guarantees `kit.keys`, update the consumer requirement accordingly, and fail at activation with a clear error if `kit.keys` is absent.
6) Security Surface
- Severity: major
- Confidence: high
- Evidence: hypaware-core/plugins-workspace/claude/src/hook_command.js:70, hypaware-core/plugins-workspace/claude/src/hook_command.js:178, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:535, hypaware-core/plugins-workspace/codex/src/backfill.js:508, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:137, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:222, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:332
- Why it matters: Git remotes can be credential-bearing HTTPS URLs, and this PR persists the raw remote into gateway rows and graph provenance even though only the normalized owner/repo is needed for convergence.
- Suggested fix: Redact URL userinfo before storing `git_remote` or graph `source_keys`, and add a regression case such as `https://user:token@github.com/acme/repo.git`.
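The suggested redaction could look roughly like the sketch below. `redactRemoteUserinfo` is a hypothetical name, not the helper the PR ships, and the regex approach is an assumption chosen so that non-URL remotes pass through untouched. Note that under this sketch an `ssh://git@host/...` remote also loses its `git@` prefix; whether that is desirable is not settled by the review.

```javascript
// Hypothetical redactor: strips "user[:password]@" from scheme://-style remotes.
// scp-like SSH remotes (git@host:owner/repo) contain no "://" and are returned as-is.
function redactRemoteUserinfo(remote) {
  if (typeof remote !== 'string') return remote;
  // [^/]* is greedy, so the match extends to the LAST "@" before the first "/"
  // after the scheme; this also covers passwords that themselves contain "@".
  return remote.replace(/^([a-z][a-z0-9+.-]*:\/\/)[^/]*@/i, '$1');
}

// Regression case named in the finding:
const redacted = redactRemoteUserinfo('https://user:token@github.com/acme/repo.git');
// scp-like form stays intact:
const scpLike = redactRemoteUserinfo('git@github.com:acme/repo.git');
```

Applying the redactor at ingress, before the value reaches any row or sidecar, means downstream consumers never see the token in the first place.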
No Finding
- Behavioral Correctness
- Change Impact / Blast Radius
- Concurrency, Ordering & State Safety
- Error Handling & Resilience
- Resource Lifecycle & Cleanup
- Release Safety
- Test Evidence Quality
- Architectural Consistency
- Debuggability & Operability
Evidence Bundle
- Changed hot paths: context-graph capability kit, ai-gateway graph contract projection, ai-gateway message schema/data source, Claude session hook/projector, Codex live/backfill projectors.
- Impacted callers: hypaware-core/plugins-workspace/ai-gateway-graph/src/index.js:26, hypaware-core/plugins-workspace/context-graph/src/index.js:66, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:120, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:204.
- Impacted tests: test/plugins/ai-gateway-graph-bridge.test.js:52, test/plugins/ai-gateway-graph-contract.test.js:214, test/plugins/claude-session-context-hook.test.js:154, test/plugins/codex-exchange-projector.test.js:479, test/plugins/ai-gateway-message-projector.test.js:34.
- Unresolved uncertainty: I did not inspect more than the requested five files; the GitHub plugin counterpart is outside this worktree, and the claim that Codex `workspace.path` is always the repo root remains assumed rather than proven by the diff.
Claude review
Codex backfill's new first-class repo-identity mapping is exercised but unverified
- Severity: minor
- Confidence: 85
- Evidence: hypaware-core/plugins-workspace/codex/src/backfill.js:508-510 (mapping); test/plugins/codex-backfill.test.js:245-249, 264-276 (asserts only `attributes.codex.*` and materialized rows, never the new first-class fields)
- Why it matters: The PR adds new deterministic field bindings in `projectedExchangeFromSession` (`git_remote = session.gitOriginUrl`, `head_sha = session.gitCommit`, `repo_root = session.cwd`). The modern-rollout test fixture has a git block, so this code runs, but the test asserts none of `exchange.git_remote`/`head_sha`/`repo_root` (nor the same fields on the materialized rows), so a wrong or swapped binding in the backfill path would pass green. The Claude hook and Codex live-projector paths, by contrast, both got direct assertions for the equivalent wiring.
- Suggested fix: In the "modern rollout projects into canonical ai_gateway_messages rows" test, add `assert.equal(exchange.git_remote, 'https://github.com/acme/repo.git')`, `assert.equal(exchange.head_sha, 'abc123def')`, `assert.equal(exchange.repo_root, '/work/repo')` (and ideally assert the same three on a materialized `row`), mirroring `codex-exchange-projector.test.js:479-481`.
Reports: .git/dual-review/pr-125
…daction

Fixes the two majors and the coverage gap from the dual-review of #125.

1. Contract fidelity (capability version). `kit.keys` became a required field of the `hypaware.context-graph` capability, but the provider still advertised 1.0.0 and the connector required ^1.0.0; a stale 1.0.x provider would satisfy activation and then throw deep in projection. Bump the provided capability to 1.1.0 (constant + manifest) and tighten @hypaware/ai-gateway-graph to ^1.1.0, so requireCapability rejects a pre-keys provider at activation with a clear cap_missing error. The unrelated context-graph-enrich consumer keeps ^1.0.0 (1.1.0 satisfies it; it doesn't use keys).
2. Security (raw git remote). The Repo key normalizes to owner/repo (credential-safe), but the raw git_remote was persisted verbatim into the ai_gateway_messages row, the attributes.codex.git_origin_url mirror, the Claude session-context sidecar, and (read back from the row) the graph node/edge source_keys. A token-bearing HTTPS remote (https://x-access-token:<token>@github.com/o/r.git) would leak the token to disk. Redact URL userinfo at ingress in every capture path, before the value reaches any sink; the scp-like SSH form (git@host:o/r) is left intact. owner/repo is all convergence needs, so nothing downstream breaks.
3. Test coverage. Pin the redaction on each path (claude hook end-to-end via a real git repo, codex backfill, codex live projector, plus a unit test of the redactor) and assert the first-class git_remote/head_sha/repo_root fields on the codex backfill path, which were exercised but unverified.

LLP 0032 updated in the same commit (capability-bump rationale + new "Remote redaction" section). Full suite green (1256 pass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011CRtJuCKFbvQ8W3kG2543P
Addressed the dual-review findings —
|
…face

Architectural correction following review of the capability bump in 0df89a2.

The bridge-key recipes (Repo/Commit/File: repoKeyFromRemote, commitKey, fileKeyFromParts, …) lived in the engine kit as `kit.keys`. That gave three specific node types a privileged home in `@hypaware/context-graph`, a substrate that otherwise hardcodes zero node types: the wrong precedent for a graph meant to carry many sources (some unofficial), each of which should own its own node types symmetrically. It was also what forced the 1.0.0 → 1.1.0 capability bump.

Move `graph-keys.js` from `@hypaware/context-graph` (engine) into `@hypaware/ai-gateway-graph` (connector), beside the contract that mints those nodes, as the host-side twin of `github-hyp-plugin/src/keys.js`. The contract now imports `keys` directly instead of reading it off the kit.

Consequences:
- Engine reverts to a node-type-agnostic substrate: kit is back to { nodeId, edgeId, makeRowBuilders }; capability back to 1.0.0; connector requires ^1.0.0 again. `git diff master -- context-graph/` is now empty; the PR's engine surface is a no-op. The bump + activation guard from 0df89a2 are gone (the bump was a symptom of the misplacement, not a real need).
- Convergence with @hypaware/github is unchanged: still enforced by the digest pins in ai-gateway-graph-bridge.test.js (which pass unchanged, proving the move is byte-identical), not by shared engine code. context-graph-enrich's ^1.0.0 requirement stays satisfied.
- GraphKeys type consolidates into the connector (full 8-fn, documenting the module it now owns); the engine kit type drops `keys`.

Credential-redaction fix and capture-column coverage tests are untouched. LLP 0032 "Shared key vocabulary" rewritten to record connector ownership and the "engine stays node-type-agnostic" principle; the capability-bump paragraph is removed.

Full suite green (1256 pass), typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011CRtJuCKFbvQ8W3kG2543P
Reverted the capability bump — bridge keys now owned by the connector (
|
Dual-agent review —
| Source | Finding (severity, evidence) | Intersects |
|---|---|---|
| Codex | F1 .. escape mints wrong File key (major, graph-keys.js:211) | Targets (relativizePath), Risks bullet 1 |
| Codex | F2 Codex repo_root←cwd mis-keys in subdirs (major, exchange-projector.js:537 / backfill.js:512) | Targets (new columns), Risks bullet 2 |
| Claude | Claude backfill drops repo-identity fields (major, claude/backfill.js:303) | Direct callers (column producers), Risks bullet 3 |
| Claude | withSchemaColumns padding untested (minor, dataset.js:166) | Direct callers (withSchemaColumns), Risks bullet 4 |
Codex review
Fix Validations
GitHub bridge IDs for Repo/Commit
- Status: correct
- Evidence: hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:131, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:146, test/plugins/ai-gateway-graph-bridge.test.js:51, test/plugins/ai-gateway-graph-bridge.test.js:58
- Assessment: Repo and full-sha Commit nodes now derive through the shared key helpers and are pinned against the GitHub-side digests.
File bridge re-key
- Status: incomplete
- Evidence: hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:324, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:211, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:537
- Assessment: The happy path converges, but two edge cases can still mint wrong File keys or skip convergence. Details below.
Remote credential redaction
- Status: correct
- Evidence: hypaware-core/plugins-workspace/codex/src/exchange-projector.js:502, hypaware-core/plugins-workspace/codex/src/backfill.js:438, hypaware-core/plugins-workspace/claude/src/hook_command.js:185, test/plugins/codex-exchange-projector.test.js:518
- Assessment: New live, backfill, and Claude capture paths redact URL userinfo before storing `git_remote`.
Findings
1) Behavioral Correctness
- Severity: major
- Confidence: high
- Evidence: hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:211, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:218, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph-keys.js:63, hypaware-core/plugins-workspace/ai-gateway-graph/src/graph_contract.js:349
- Why it matters: `relativizePath()` uses a raw prefix check and `normalizeRelpath()` does not collapse or reject `..`, so a raw tool path like `/repo/../outside.txt` is treated as bridgeable and becomes `owner/repo:../outside.txt` instead of falling back to the absolute out-of-repo key.
- Suggested fix: Normalize the absolute path and root before the containment check, then reject relative results that are `..`, start with `../`, or are absolute; add tests for `/repo/../outside`, `/repo/sub/../../outside`, and normal in-repo paths.
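The normalize-then-check approach from the suggested fix can be sketched with Node's `path.posix` helpers. This is an illustration only: `relativizeWithinRoot` and `fileKeySketch` are hypothetical names, and the real `relativizePath()` / `normalizeRelpath()` bodies are not shown in this thread.

```javascript
const path = require('node:path');

// Returns the repo-relative POSIX path, or null when the target is not
// strictly inside the root (so the caller can fall back to the absolute key).
function relativizeWithinRoot(repoRoot, absPath) {
  if (!repoRoot || !absPath) return null;
  const root = path.posix.normalize(repoRoot);   // collapses "." and ".." segments
  const target = path.posix.normalize(absPath);
  const rel = path.posix.relative(root, target);
  const escapes = rel === '..' || rel.startsWith('../') || path.posix.isAbsolute(rel);
  // rel === '' means the target IS the root, which is not a file inside it.
  return rel === '' || escapes ? null : rel;
}

// Hypothetical key builder: bridge key when containable, absolute path otherwise.
function fileKeySketch(ownerRepo, repoRoot, absPath) {
  const rel = ownerRepo ? relativizeWithinRoot(repoRoot, absPath) : null;
  return rel ? `${ownerRepo}:${rel}` : absPath;
}
```

Using `path.posix.relative` instead of a string prefix check also avoids treating `/repository/a.js` as if it were under `/repo`.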
2) Contract & Interface Fidelity
- Severity: major
- Confidence: medium
- Evidence: collectivus-plugin-kernel-types.d.ts:1370, hypaware-core/plugins-workspace/claude/src/hook_command.js:180, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:537, hypaware-core/plugins-workspace/codex/src/exchange-projector.js:542, hypaware-core/plugins-workspace/codex/src/backfill.js:437, hypaware-core/plugins-workspace/codex/src/backfill.js:512
- Why it matters: `repo_root` is documented and consumed as `git rev-parse --show-toplevel`, but Codex live/backfill populate it from workspace/cwd; if Codex runs in `/repo/pkg`, files under `/repo/pkg` key as `owner/repo:a.js` instead of `owner/repo:pkg/a.js`, and sibling repo files fail to converge.
- Suggested fix: Only set Codex `repo_root` when the metadata explicitly provides the repository root or when backfill can derive it with `git rev-parse --show-toplevel`; otherwise leave it unset so File keys fall back rather than minting wrong bridge IDs.
No Finding
- Change Impact / Blast Radius
- Concurrency, Ordering & State Safety
- Error Handling & Resilience
- Security Surface
- Resource Lifecycle & Cleanup
- Release Safety
- Test Evidence Quality
- Architectural Consistency
- Debuggability & Operability
Evidence Bundle
- Changed hot paths: message_projector.js:5, message_projector.js:51, message_projector.js:662, graph_contract.js:118, graph_contract.js:202, graph-keys.js:235, hook_command.js:176, exchange-projector.js:465, backfill.js:429
- Impacted callers: ai-gateway-graph/src/index.js:32, context-graph/src/project.js:50, src/core/query/sql.js:108
- Impacted tests: ai-gateway-graph-bridge.test.js:51, ai-gateway-graph-contract.test.js:235, claude-session-context-hook.test.js:154, codex-exchange-projector.test.js:479, codex-backfill.test.js:248
- Unresolved uncertainty: I did not run the test suite; review is based on the supplied diff plus targeted caller/contract tracing. Actual Codex metadata may currently use repo roots as workspace keys, but the code and tests do not establish that contract.
Claude review
Claude backfill drops the new repo-identity fields, so backfilled Claude sessions never converge
- Severity: major
- Confidence: 85
- Evidence: hypaware-core/plugins-workspace/claude/src/backfill.js:303
- Why it matters: The live Claude projector stamps `git_remote`/`head_sha`/`repo_root` from the session-context record (projector.js:273-275), and `pickLatestMatching` hands the same record (which session_context.js:126-128 normalizes those three fields onto) to the backfill path. However, `projectedExchangeFromEntries` stamps only `cwd`/`git_branch` (backfill.js:303-304), so backfilled Claude sessions silently mint no `Repo`/`Commit` nodes, keep absolute-path `File` keys, and never join with `@hypaware/github` or across worktrees. That is the exact convergence this PR exists to deliver, and live Claude and both Codex paths (live + backfill) do perform it. LLP 0032 §capture documents Codex backfill explicitly but is silent on Claude backfill, so the asymmetry reads as an oversight, not a deliberate scope cut.
- Suggested fix: Mirror the live projector after backfill.js:304: `if (record?.git_remote) exchange.git_remote = record.git_remote` (and `head_sha`, `repo_root`); add the three fields to the appended record + row assertions in `test/plugins/claude-backfill.test.js`. If Claude backfill is intentionally left un-bridged, say so in LLP 0032 §capture.
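The stamping in the suggested fix can be written once for all three fields. The sketch below is an assumption about shape only: `stampRepoIdentity` is a hypothetical helper, and the actual `projectedExchangeFromEntries` code around backfill.js:304 is not reproduced here.

```javascript
// The three repo-identity columns named in the review.
const REPO_IDENTITY_FIELDS = ['git_remote', 'head_sha', 'repo_root'];

// Copies each truthy repo-identity field from the session-context record onto
// the exchange; absent or empty fields leave the exchange untouched, so File
// keys fall back to absolute paths instead of minting a partial bridge key.
function stampRepoIdentity(exchange, record) {
  for (const field of REPO_IDENTITY_FIELDS) {
    if (record?.[field]) exchange[field] = record[field];
  }
  return exchange;
}

const stamped = stampRepoIdentity(
  { cwd: '/work/repo', git_branch: 'main' },
  { git_remote: 'https://github.com/acme/repo.git', head_sha: 'abc123def', repo_root: '/work/repo' },
);
```

Sharing one helper between the live projector and backfill would also make the two paths harder to drift apart again, though the review does not require that.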
withSchemaColumns v7 null-padding (the no-ColumnNotFoundError shim) has no direct test
- Severity: minor
- Confidence: 80
- Evidence: hypaware-core/plugins-workspace/ai-gateway/src/dataset.js:166
- Why it matters: LLP 0032 §capture names this padding as the load-bearing mechanism that lets the additive v7 columns ship with no partition-label bump or cache wipe: it prevents `ColumnNotFoundError` when a contract/query reads `git_remote`/`head_sha`/`repo_root` over a pre-v7 partition. It is exercised only transitively (every other test stages partitions that already carry all columns), so a regression dropping the padding would pass the whole suite while breaking real queries over old data.
- Suggested fix: Add one test in `test/core/ai-gateway-dataset.test.js`: build a `createDataSource` over a partition that physically lacks `git_remote`, assert `source.columns` includes the three new columns, and assert that scanning a row reads them as null rather than throwing.
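For readers unfamiliar with the shim, the idea behind null-padding can be shown in a few lines. This is a generic sketch under assumed names (`advertisedColumns`, `padRowToSchema`) and an assumed row-as-object representation; it does not reproduce `withSchemaColumns` or the `createDataSource` API.

```javascript
// Columns a source should advertise: everything physically present plus any
// declared schema column the partition predates (order: physical first).
function advertisedColumns(physicalColumns, declaredColumns) {
  const seen = new Set(physicalColumns);
  return [...physicalColumns, ...declaredColumns.filter((name) => !seen.has(name))];
}

// Pads a scanned row so every advertised column is readable; columns the
// partition never stored surface as null instead of raising a lookup error.
function padRowToSchema(row, columns) {
  const padded = {};
  for (const name of columns) {
    padded[name] = Object.prototype.hasOwnProperty.call(row, name) ? row[name] : null;
  }
  return padded;
}

// A pre-v7 partition that physically lacks the three new columns:
const columns = advertisedColumns(['session_id', 'cwd'], ['session_id', 'cwd', 'git_remote', 'head_sha', 'repo_root']);
const paddedRow = padRowToSchema({ session_id: 's1', cwd: '/work/repo' }, columns);
```

A direct test along the lines the review asks for would assert both halves: the advertised column list and the null reads.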
Reports: .git/dual-review/pr-125
…pture paths

Four findings from the dual-agent review of the GitHub ↔ LLM-session graph join, all converging on the File re-key path producing wrong-but-plausible keys or skipping convergence on some ingestion paths:

- relativizePath (`..` escape, major): POSIX-normalize root and path before the containment check, so a path that escapes the repo via `..` (`/repo/../outside`) falls back to its absolute key instead of slicing to a bogus `owner/repo:../outside` relpath (which could collide with a real file). In-repo `..` that stays inside still relativizes.
- Codex repo_root (subdir mis-key, major): Codex exposes no verified git toplevel: live `workspace.path` and backfill rollout `cwd` can be a repo subdir, which silently mis-relativizes (and can false-merge) File keys onto one content-addressed node. Stop populating Codex repo_root; Codex File nodes keep absolute keys in V1. Repo/Commit convergence (git_remote/head_sha) is unaffected, so the headline session↔repo / session↔commit joins still fire.
- Claude backfill (dropped fields, major): backfill stamped only cwd/git_branch from the session-context record, so re-imported Claude sessions never converged. Stamp git_remote/head_sha/repo_root too, mirroring the live projector (Claude's hook captures a real --show-toplevel, so it bridges).
- withSchemaColumns (untested shim, minor): add a direct test that a v7 column read over a pre-v7 partition surfaces as null instead of ColumnNotFoundError.

LLP 0032 updated in lockstep: corrected §capture (the workspace path is NOT the repo root), added §codex-repo-root recording the fail-safe, noted Claude-backfill parity, and documented the `..`-normalization fallback in §file-migration.

npm test green (1259 pass), lint + typecheck clean, @refs resolve.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What & why
Make a repo, commit, or file seen by both `@hypaware/github` and a recorded Claude/Codex session land on one graph node. Convergence is automatic given a shared natural key (ids are content-addressed, LLP 0023): the GitHub plugin already mints bridge-ready `Repo`/`Commit`/`File` nodes; this is the host-side half that makes the LLM-session contract adopt matching keys so the join actually fires. The same re-key also collapses one file across git worktrees, which absolute-path keying never could.

Design recorded in `llp/0032-github-llm-graph-bridge.decision.md`.

Changes
- Shared key vocabulary: `context-graph/src/graph-keys.js`, exposed on the kit as `kit.keys`. `Repo`/`Commit`/`File` recipes are byte-identical to `github-hyp-plugin/src/keys.js`; host-only remote-URL → `owner/repo` and absolute-path → relpath reconciliation feed them.
- Capture (`schema_version` 7): new nullable `git_remote`/`head_sha`/`repo_root`. The Claude hook already shells `git` for the branch, so it now also reads the remote, full HEAD sha, and repo root. Codex's already-captured `git_origin_url`/`git_commit`/workspace path are promoted to first-class fields (live + backfill). The gateway data source pads its declared schema columns so reading a new column doesn't throw `ColumnNotFoundError` over pre-v7 partitions.
- Additive nodes/edges: `Repo`, `Commit`, with `Session -in-> Repo`, `Session -at-> Commit`, and `Commit -in-> Repo` (the last converges with the GitHub side's identical edge).
- File migration: re-key `File` from absolute path → `owner/repo:relpath`, with absolute-path fallback for out-of-repo / non-github / no-repo files. Orphans committed `File`/`touched` rows (content-addressed ids, no retract).
- Convergence guard: `ai-gateway-graph-bridge.test.js` pins the exact node/edge digests `@hypaware/github` publishes and asserts the host bridge mints them (`e1505143…`, `c40ec7e7…`, `ca7c3b20…`, `f036a284…`).

Migration note (operational)
Re-projecting mints the new `owner/repo:relpath` File ids alongside the orphaned absolute ones; compaction won't merge them. To retire the stale rows, drop the `ai-gateway.t0` `File`/`touched` rows and re-run `hyp graph project`. See LLP 0032 §file-migration.

Cross-repo coupling
`graph-keys.js` and `github-hyp-plugin/src/keys.js` must stay byte-identical; the digest pins on both sides are the enforcement.

Verification
- `npm test`: 1245 pass / 1 skipped / 0 fail
- `npm run typecheck` + `npm run lint`: clean
- `gateway_codex_capture` (exercises Codex git promotion) + `backfill_codex_fixture` pass. `context_graph_projects_rows` / `gateway_claude_capture` fail identically on baseline (pre-existing temp-install harness gaps), not introduced here.

🤖 Generated with Claude Code