27 changes: 27 additions & 0 deletions bin/admin.js
@@ -18,6 +18,9 @@
 *   hypaware-server-admin activate-config <name> <etag>
 *   hypaware-server-admin query "SELECT ..."
 *   hypaware-server-admin run-mover | run-archive [--force] | run-eviction
 *   hypaware-server-admin github-backfill [owner/repo ...]
 *   hypaware-server-admin graph-project [--source <dataset>] [--dry-run]
 *   hypaware-server-admin graph-neighbors <node> [--depth N] [--type T] [--edge-type T] [--direction out|in|both] [--limit N]
 *
 * Env: HYPSERVER_URL (default http://127.0.0.1:8740), HYPSERVER_ADMIN_TOKEN
 */
@@ -122,6 +125,30 @@ switch (command) {
  case 'run-eviction':
    await call('POST', '/v1/admin/eviction/run')
    break
  case 'github-backfill':
    // Positional owner/repo args narrow the configured selection; none = all.
    await call('POST', '/v1/admin/github/backfill', { repos: rest.filter((a) => !a.startsWith('--')) })
    break
  case 'graph-project':
    await call('POST', '/v1/admin/graph/project', {
      ...(flagValue('--source') ? { source: flagValue('--source') } : {}),
      dry_run: rest.includes('--dry-run'),
    })
    break
  case 'graph-neighbors': {
    const depth = flagValue('--depth')
    const limit = flagValue('--limit')
    const node = rest[0] && !rest[0].startsWith('--') ? rest[0] : ''
    await call('POST', '/v1/admin/graph/neighbors', {
      node,
      ...(flagValue('--type') ? { type: flagValue('--type') } : {}),
      ...(flagValue('--edge-type') ? { edge_types: [flagValue('--edge-type')] } : {}),
      ...(flagValue('--direction') ? { direction: flagValue('--direction') } : {}),
      ...(depth !== undefined ? { depth: Number(depth) } : {}),
      ...(limit !== undefined ? { limit: Number(limit) } : {}),
    })
    break
  }
  default:
    console.error('unknown command; see header comment for usage')
    process.exit(2)
149 changes: 149 additions & 0 deletions llp/0010-server-side-graph.decision.md
@@ -0,0 +1,149 @@
# LLP 0010: Server-Side Graph — GitHub Capture, T0 Projection, and Neighbors as Admin Operations

**Type:** Decision
**Status:** Active
**Systems:** Query, Core
**Author:** Phil / Claude
**Date:** 2026-06-22
**Related:** LLP 0002, LLP 0004, LLP 0006

> The server now holds a context graph, not just forwarded logs. It loads the
> bundled `@hypaware/context-graph` plugin and a vendored `@hypaware/github`
> source plugin into its own kernel, captures GitHub activity directly, and
> exposes `graph project` / `graph neighbors` as admin operations — so the
> forwarded LLM sessions ([hypaware LLP 0032 git-bridge](../../hypaware/llp))
> and GitHub activity converge into one node/edge graph on the server.

## Why on the server at all

The admin attach is a read-only SQL endpoint ([LLP 0006](./0006-admin-query-attach.decision.md)):
it can read the server's cache + archive but cannot invoke a plugin command. And
the graph's whole point is convergence — `Repo`/`Actor`/`Commit`/`File` nodes
minted from GitHub activity sharing content-addressed ids with nodes minted from
LLM sessions (the `git_remote`/`head_sha`/`repo_root` columns 1.6 added to
`ai_gateway_messages`). That convergence only happens where both datasets live in
one kernel cache. The forwarded LLM logs live in the **server's** cache. So the
projection must run there, not in a side client reading the server over SQL.

## Decision

<a id="in-kernel-graph"></a>**The server activates the graph plugins in its own
kernel.** `pluginSelection()` ([boot.js](../src/boot.js)) gains two entries
before the control plane: `@hypaware/context-graph` (from the bundled workspace
that ships inside the `hypaware` dependency) and `@hypaware/github` (vendored —
see below). The kernel resolver orders context-graph first from github's
`requires.plugins`, so github's eager `requireCapability('hypaware.context-graph')`
resolves. The plugins activate for their **registrations** — github_events /
node / edge datasets, the github T0 contract, and the `github backfill` /
`graph project` / `graph neighbors` commands — exactly as on a client.
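
The ordering rule above can be illustrated with a small dependency sort. This is a minimal sketch, not the kernel resolver itself: `activationOrder` and the reduced manifest shape are hypothetical, and only the `requires.plugins` field is taken from the plugin manifest.

```javascript
// Hypothetical manifests, reduced to the one field the ordering needs.
const manifests = [
  { name: '@hypaware/github', requires: { plugins: { '@hypaware/context-graph': '^0.1.0' } } },
  { name: '@hypaware/context-graph', requires: { plugins: {} } },
]

// Depth-first topological sort: a plugin is emitted only after everything it requires.
function activationOrder(list) {
  const byName = new Map(list.map((m) => [m.name, m]))
  const state = new Map() // name -> 'visiting' | 'done'
  const order = []
  function visit(name) {
    if (state.get(name) === 'done') return
    if (state.get(name) === 'visiting') throw new Error(`dependency cycle at ${name}`)
    const manifest = byName.get(name)
    if (!manifest) throw new Error(`missing required plugin ${name}`)
    state.set(name, 'visiting')
    for (const dep of Object.keys(manifest.requires?.plugins ?? {})) visit(dep)
    state.set(name, 'done')
    order.push(name)
  }
  for (const m of list) visit(m.name)
  return order
}

const order = activationOrder(manifests)
console.log(order)
```

Listing github first in the input still yields context-graph first in the output, which is the property the eager `requireCapability` call relies on.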

Alongside them the server also loads `@hypaware/ai-gateway-graph` — the
cross-source connector that maps the forwarded `ai_gateway_messages` (the LLM
sessions) into the same node/edge graph. It is pure contract registration (no
source, no config; requires context-graph), and its bridge-ready
`Repo`/`Commit`/`File` keys are content-addressed identically to github's, so a
single `graph project` run spans both sources and the two converge by id — the
GitHub↔LLM join 1.6's git-bridge (`git_remote`/`head_sha`/`repo_root`) was built
for (hypaware LLP 0032).
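
To make "converge by id" concrete, here is a hedged sketch of content addressing. The hash recipe, the key normalization, and the helper names `nodeId` and `repoKeyFromRemote` are assumptions for illustration; the plugins' actual key derivation is not shown in this document.

```javascript
const { createHash } = require('node:crypto')

// Hypothetical id scheme: hash of the node type plus a normalized natural key.
function nodeId(type, naturalKey) {
  const canonical = `${type}\n${naturalKey.trim().toLowerCase()}`
  return createHash('sha256').update(canonical).digest('hex').slice(0, 16)
}

// Hypothetical normalizer: reduce a GitHub remote URL to "owner/repo".
function repoKeyFromRemote(remote) {
  return remote
    .replace(/^git@github\.com:/, '')
    .replace(/^https:\/\/github\.com\//, '')
    .replace(/\.git$/, '')
}

// One id minted from GitHub activity, one from an LLM session's git_remote column.
const fromGithub = nodeId('Repo', 'Acme/Widgets')
const fromSession = nodeId('Repo', repoKeyFromRemote('git@github.com:acme/widgets.git'))
console.log(fromGithub === fromSession)
```

Because both sources derive the id from the same normalized key, neither needs to know about the other for their nodes to land on the same row.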

This reuses the kernel-host substrate of [LLP 0002](./0002-kernel-reuse.decision.md):
the server expresses behavior as activated plugins, not bespoke code. Projection
and traversal are the graph plugin's own **pure functions** — `projectGraph()` and
`queryNeighbors()` — invoked over the server kernel's `runtime.query` /
`runtime.storage` handles (the same handles `executeSql` uses). The registered
contracts are read through the plugin's process-global registry singleton
(`requireGraphRuntime().registry.list()`) — the identical seam the in-process
`hyp graph project` command uses — so nothing in the `hypaware` repo had to change.
The deep reach into the bundled plugin tree is funnelled through the one sanctioned
shim ([kernel/shim.js](../src/kernel/shim.js), [LLP 0002#host-shim](./0002-kernel-reuse.decision.md#host-shim)),
anchored on the same `bundledWorkspaceDir` the loader used so the singleton is the
very instance activation populated.
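
As an illustration of what a pure traversal over an edge list can look like, here is a minimal breadth-first sketch. It is not the plugin's `queryNeighbors()`: the function name `neighbors`, the edge shape, and the edge type names are hypothetical, and only the `depth` / `direction` / edge-type / `limit` parameters mirror the admin operation.

```javascript
// Hypothetical pure traversal over an in-memory edge list.
function neighbors(edges, start, { depth = 1, direction = 'both', edgeTypes = null, limit = Infinity } = {}) {
  const seen = new Set([start])
  const found = []
  let frontier = [start]
  for (let hop = 1; hop <= depth && frontier.length > 0; hop += 1) {
    const next = []
    for (const edge of edges) {
      if (edgeTypes && !edgeTypes.includes(edge.type)) continue
      const candidates = []
      if (direction !== 'in' && frontier.includes(edge.from)) candidates.push(edge.to)
      if (direction !== 'out' && frontier.includes(edge.to)) candidates.push(edge.from)
      for (const node of candidates) {
        if (seen.has(node)) continue
        if (found.length >= limit) return found
        seen.add(node)
        found.push({ node, hop })
        next.push(node)
      }
    }
    frontier = next
  }
  return found
}

// Hypothetical edges; the type names are made up for this example.
const edges = [
  { from: 'repo:acme/widgets', to: 'commit:abc', type: 'HAS_COMMIT' },
  { from: 'commit:abc', to: 'file:src/a.js', type: 'TOUCHES' },
  { from: 'actor:phil', to: 'commit:abc', type: 'AUTHORED' },
]

const twoHops = neighbors(edges, 'repo:acme/widgets', { depth: 2 })
console.log(twoHops.map((n) => `${n.hop}:${n.node}`))
```

Keeping the traversal free of I/O is what lets the same function run unchanged on a client kernel or the server kernel; only the handles that supply the rows differ.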

**The github plugin is vendored** under [`plugins/github/`](../plugins/github/)
rather than added as a dependency: it has no git remote yet, and `plugins/` is
already on the server's plugin path and already copied by the Docker build, so
vendoring adds the plugin with no Dockerfile or build-context change. Keep exactly
one copy of its source tree — the runtime singletons depend on module identity.

<a id="server-pulls-github"></a>**The server pulls GitHub directly; capture is an
admin one-shot, not a daemon source.** The github plugin registers a poll source,
but the server never starts it — and the wrapped source registry would suppress it
anyway ([LLP 0002](./0002-kernel-reuse.decision.md), only `@hypaware-server/*` may
start sources). Instead, `github backfill` runs on demand via the admin surface,
calling the plugin's `runCaptureTick(runtime, { mode: 'backfill' })` and then
flushing the `github_events` cache table so the freshly captured rows are queryable
on the same kernel. Repo selection (`orgs` / `repos` / `ignore`) comes from server
env (`HYPSERVER_GITHUB_*`), injected as the plugin's `[github]` config section by
`githubSection()` in boot. The GitHub token stays in the box environment under the
name in `token_env` (default `GITHUB_TOKEN`) and is read by the github client at
request time — never in config, consistent with the secrets-never-in-config
invariant ([LLP 0000](./0000-hypaware-server.explainer.md)). The box therefore now
makes outbound calls to the GitHub API; this is the one network egress the server
performs beyond its archive destination.
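
The env-to-config mapping might look roughly like the sketch below. Only the `HYPSERVER_GITHUB_` prefix, the `orgs` / `repos` / `ignore` fields, and the `token_env` default come from this document; the individual variable names and the comma-separated format are assumptions, and this `githubSection` is a stand-in for the real one in boot.

```javascript
// Split a comma-separated env value into a trimmed, non-empty list.
function listFromEnv(value) {
  return (value ?? '').split(',').map((s) => s.trim()).filter(Boolean)
}

// Hypothetical variable names under the HYPSERVER_GITHUB_ prefix.
function githubSection(env) {
  return {
    orgs: listFromEnv(env.HYPSERVER_GITHUB_ORGS),
    repos: listFromEnv(env.HYPSERVER_GITHUB_REPOS),
    ignore: listFromEnv(env.HYPSERVER_GITHUB_IGNORE),
    // Only the variable *name* goes into config; the token value never does.
    token_env: env.HYPSERVER_GITHUB_TOKEN_ENV || 'GITHUB_TOKEN',
  }
}

const section = githubSection({
  HYPSERVER_GITHUB_REPOS: 'acme/widgets, acme/gadgets',
  GITHUB_TOKEN: 'not-copied-into-config',
})
console.log(section)
```

The config section carries a pointer to the secret rather than the secret, so dumping or logging the resolved config cannot leak the token.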

<a id="admin-operations"></a>**Graph operations are admin operations on the server
kernel, following the precedent [LLP 0006](./0006-admin-query-attach.decision.md) set for SQL.**
`POST /v1/admin/github/backfill`, `/v1/admin/graph/project`, and
`/v1/admin/graph/neighbors` ([routes-admin.js](../src/http/routes-admin.js)) sit
behind the same admin bearer token, alongside the mover / archive / eviction
escape hatches. They are *operations*, not fleet config ([LLP 0009](./0009-remote-config.spec.md)):
nothing here is served to gateways. Each has a one-line `hypaware-server-admin`
wrapper for docker-exec workflows ([LLP 0008#admin-visibility](./0008-fleet-enrollment.spec.md#admin-visibility)).
Projection is idempotent — content-addressed ids plus pre-write dedup mean a
re-run with no new source data writes zero rows — so an operator firing
`graph-project` after each backfill (or after a fresh forward of LLM logs) just
folds new activity into the existing graph.
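
The idempotence claim can be shown with a toy projection. This is a sketch under stated assumptions: the in-memory `store`, the `project` function, and the string id are stand-ins, not the graph plugin's `projectGraph()` or its storage.

```javascript
// In-memory stand-in for the node dataset, keyed by content-addressed id.
const store = new Map()

// Hypothetical projection step: derive a row per event, skip ids already stored.
function project(events) {
  let written = 0
  for (const event of events) {
    const id = `Repo:${event.repo.toLowerCase()}` // stand-in for a content hash
    if (store.has(id)) continue // pre-write dedup
    store.set(id, { id, type: 'Repo', key: event.repo })
    written += 1
  }
  return written
}

const events = [{ repo: 'acme/widgets' }, { repo: 'acme/gadgets' }, { repo: 'acme/widgets' }]
const firstRun = project(events)
const secondRun = project(events) // no new source data
const thirdRun = project([...events, { repo: 'acme/tools' }]) // one new repo
console.log({ firstRun, secondRun, thirdRun })
```

A re-run over unchanged input writes nothing, and a run over a superset writes only the difference, which is why firing `graph-project` after every backfill is safe.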

<a id="self-managed-datasets"></a>**Graph datasets keep their own read closures;
the server's date-partition synthesis is only for forwarded ingest.** The
catalog-backed registry ([LLP 0004#catalog-backed-registry](./0004-dataset-catalog.spec.md#catalog-backed-registry))
synthesizes `discoverPartitions` / `createDataSource` that read `date=` partitions —
correct for the wire-ingested datasets the mover fills (logs, traces, metrics,
ai_gateway_messages), and it deliberately discards the client-side closures those
plugins ship because they assume a client layout. But `github_events` is
`source=`-partitioned and `node` / `edge` are `graph_v1`-partitioned, and all three
are written **directly** into the cache by the in-kernel plugins (never wire-ingested),
each carrying a `createDataSource` authored for its own layout. So
[registry.js](../src/catalog/registry.js) routes datasets from the self-managed
plugins (`@hypaware/server`, `@hypaware/context-graph`, `@hypaware/github`) to the
`custom` map — keeping their closures — and everything else to the catalog seed +
date-partition synthesis. The split is by plugin, not by closure presence, because
`@hypaware/ai-gateway` also ships a `createDataSource` yet must take the synthesized
path. Without this, projection and neighbors silently read zero rows even though the
data is on disk.
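
A minimal sketch of the by-plugin split follows. The three plugin names are the ones listed above; the registration shape (`plugin`, `name`, `createDataSource`) and the `routeDatasets` function are hypothetical and do not reproduce `registry.js`.

```javascript
// Plugins whose datasets keep their own read closures.
const SELF_MANAGED_PLUGINS = new Set(['@hypaware/server', '@hypaware/context-graph', '@hypaware/github'])

// Hypothetical router: decide by owning plugin, not by whether a closure exists.
function routeDatasets(registrations) {
  const custom = new Map()
  const synthesized = []
  for (const reg of registrations) {
    if (SELF_MANAGED_PLUGINS.has(reg.plugin)) custom.set(reg.name, reg.createDataSource)
    else synthesized.push(reg.name) // closure, if any, is deliberately discarded
  }
  return { custom, synthesized }
}

const routed = routeDatasets([
  { plugin: '@hypaware/github', name: 'github_events', createDataSource: () => 'source= layout' },
  { plugin: '@hypaware/context-graph', name: 'node', createDataSource: () => 'graph_v1 layout' },
  { plugin: '@hypaware/ai-gateway', name: 'ai_gateway_messages', createDataSource: () => 'client layout' },
])
console.log([...routed.custom.keys()], routed.synthesized)
```

Testing for closure presence instead would send `ai_gateway_messages` down the custom path, because that plugin also ships a `createDataSource`.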

## Consequences and bounds

- `github_events` / `node` / `edge` are cache-resident and never archived (no wire
signal, no archive ack). Ack-coupled eviction ([LLP 0003](./0003-spool-durability.spec.md))
refuses to evict unarchived partitions, so they are not lost; they also are not
`date=`-partitioned, so the date-based eviction sweep does not touch them.
- A concurrent backfill and projection stay safe: pre-write dedup plus
content-addressed ids keep projection idempotent, and `graph compact` (a graph
command) cleans any residue from interleaved runs.
- The hermetic smoke test ([test/smoke.js](../test/smoke.js)) drives the full chain
with no network, exercising the real cache append + read path: inject an in-memory
GitHub client into the runtime singleton, run `github backfill`, run `graph project`
(asserting idempotence on re-run), then run `graph neighbors`. It also pins the
surfaces a backfill-only path leaves dormant:
  - the per-repo cursor sidecar round-trip and a `poll`-mode tick (a no-change tick
    captures nothing and skips PR descent; an advanced high-water mark captures only
    new rows and moves the cursor forward);
  - `resolveRepos` / `captureRepos` repo-set resolution and per-repo error isolation;
  - the `graph project` `--source` filter and `dry_run` safe preview;
  - the `graph neighbors` direction/`edge_types`/`limit`/`depth` parameters;
  - the `hypaware-server-admin` CLI wrappers, spawned end-to-end against the running
    daemon.
- **`plugins/github/` is vendored third-party code and is out of scope for this
repo's `/ref-check`.** Its source carries `@ref LLP NNNN` annotations that resolve
against the **`@hypaware/github`** corpus (where, e.g., LLP 0002 is "capture-model"),
not this server's (where LLP 0002 is "kernel-reuse"). Those refs are honest in their
own repo; treat the vendored tree like a dependency and exclude it when validating
this repo's references. The server's own code carries the `@ref LLP 0010` annotations
for everything decided here.
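
The `poll`-mode behavior that the smoke test pins can be pictured with a small cursor sketch. Everything here is hypothetical (the `pollTick` function, the cursor shape, the `updatedAt` field); it only illustrates the two cases named above: a no-change tick and an advanced high-water mark.

```javascript
// Hypothetical poll tick for one repo: capture rows newer than the cursor, then advance it.
// Timestamps are ISO 8601 strings in UTC, so plain string comparison orders them correctly.
function pollTick(cursor, remoteEvents) {
  const fresh = remoteEvents.filter((e) => e.updatedAt > cursor.highWater)
  if (fresh.length === 0) return { captured: [], cursor } // no change: cursor untouched
  const highWater = fresh.reduce((max, e) => (e.updatedAt > max ? e.updatedAt : max), cursor.highWater)
  return { captured: fresh, cursor: { ...cursor, highWater } }
}

const start = { repo: 'acme/widgets', highWater: '2026-06-01T00:00:00Z' }
const remote = [
  { id: 1, updatedAt: '2026-05-30T12:00:00Z' },
  { id: 2, updatedAt: '2026-06-01T00:00:00Z' },
]

const quiet = pollTick(start, remote) // nothing newer than the cursor
const busy = pollTick(start, [...remote, { id: 3, updatedAt: '2026-06-02T08:30:00Z' }])
console.log(quiet.captured.length, busy.captured.length, busy.cursor.highWater)
```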

## Not in scope

Ongoing server-side polling (the github poll source stays dormant by design),
enrichment beyond T0 (a non-goal for these datasets), and serving the github plugin
to fleet clients (clients do not capture GitHub in this design). A future
metadata-level partition spec for the graph datasets is deferred, as for the
archive ([LLP 0005#day-aligned-exports](./0005-archive-sink.decision.md#day-aligned-exports)).
11 changes: 6 additions & 5 deletions package-lock.json

Some generated files are not rendered by default.

34 changes: 34 additions & 0 deletions plugins/github/hypaware.plugin.json
@@ -0,0 +1,34 @@
{
  "schema_version": 1,
  "name": "@hypaware/github",
  "version": "0.1.0",
  "hypaware_api": "^1.0.0",
  "runtime": "node",
  "node_engine": ">=20",
  "entrypoint": "./src/index.js",
  "description": "GitHub source plugin built for the context-graph. Captures GitHub activity (issues, pull requests, commits, files, reviews, comments) into a thin append-only github_events skeleton and bundles a T0 projection contract that maps it into the node/edge graph with bridge-ready natural keys. Capture is poll/backfill/sync; projection is the graph plugin's `hyp graph project`.",
  "permissions": ["network", "read_state", "write_state"],
  "requires": {
    "plugins": {
      "@hypaware/context-graph": "^0.1.0"
    },
    "capabilities": {
      "hypaware.context-graph": "^1.0.0"
    }
  },
  "contributes": {
    "datasets": [
      { "name": "github_events", "summary": "Append-only GitHub activity skeleton (structural columns only; content stays in GitHub)" }
    ],
    "config_sections": [
      { "section": "github", "summary": "Repo/org selection, ignore list, optional poll interval, and token env-var name" }
    ],
    "sources": [
      { "name": "github", "summary": "Ongoing poll (opt-in via poll_interval): append events past each repo's cursor" }
    ],
    "commands": [
      { "name": "github backfill", "summary": "Pull a repo's full history into github_events (cold-start)" },
      { "name": "github sync", "summary": "Run one poll tick now, off the daemon" }
    ]
  }
}