
feat(provider): add Sandra semantic graph provider #42

Open

ShabanShaame wants to merge 1 commit into supermemoryai:main from ShabanShaame:feat/sandra-provider

Conversation

@ShabanShaame

Summary

Adds a `sandra` provider that stores memories in Sandra, a semantic graph database that exposes typed refs and entity factories via an MCP HTTP server.

Architecture

Ingestion — two-stage pipeline matching the existing filesystem/RAG providers:

  1. LLM extractor reads each session and emits structured entities + facts as JSON.
  2. One sandra_batch MCP call writes everything to the graph in a single roundtrip, scoped by instance_id = containerTag.
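The two stages above can be sketched as a pure payload-building step; the type and function names here (`ExtractedEntity`, `buildBatchPayload`) are illustrative, not the PR's actual API — only `sandra_batch` and the `instance_id = containerTag` scoping come from the description.

```typescript
// Shapes the LLM extractor is described as emitting (stage 1) —
// field names are assumptions for illustration.
interface ExtractedEntity {
  factory: string
  name: string
}

interface ExtractedFact {
  subject: string
  verb: string
  object: string
}

interface BatchPayload {
  instance_id: string
  entities: ExtractedEntity[]
  facts: ExtractedFact[]
}

// Stage 2: fold one session's extractor output into a single
// sandra_batch payload, scoped by instance_id = containerTag, so the
// whole write happens in one MCP roundtrip.
function buildBatchPayload(
  containerTag: string,
  entities: ExtractedEntity[],
  facts: ExtractedFact[],
): BatchPayload {
  return { instance_id: containerTag, entities, facts }
}
```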

Retrieval — single-shot sandra_semantic_search on the lme_fact factory, client-filtered by instance_id. No multi-turn tool-use agent — search results are static, matching how the Mem0 and Zep providers return retrieval results today.
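The client-side filtering step can be sketched as below; the `SearchHit` fields are assumptions for illustration, while the filter-by-`instance_id` behaviour is what the PR describes.

```typescript
// Hypothetical shape of a sandra_semantic_search result row.
interface SearchHit {
  instance_id: string
  text: string
  score: number
}

// Single-shot retrieval post-processing: keep only hits belonging to
// this benchmark run's containerTag, best score first. No agent loop —
// the filtered list is returned as-is.
function filterByInstance(hits: SearchHit[], containerTag: string): SearchHit[] {
  return hits
    .filter((h) => h.instance_id === containerTag)
    .sort((a, b) => b.score - a.score)
}
```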

Cleanup — clear() is a best-effort no-op, as in the supermemory provider; tests use a dedicated DB (SANDRA_DB=benchmark_mb), so physical isolation comes from the database, not per-containerTag deletion.

Extractor LLM

Provider preference is checked at init time:

| Env var present | Provider | Default model |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | Anthropic | `claude-haiku-4-5-20251001` |
| `OPENAI_API_KEY` | OpenAI | `gpt-4o-mini` |

Override either default via SANDRA_EXTRACTOR_MODEL. Anthropic wins if both keys are set.
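A minimal sketch of that init-time check follows; the env var names and defaults come from the table above, while `pickExtractor` itself is an illustrative helper, not the PR's code.

```typescript
interface ExtractorChoice {
  provider: "anthropic" | "openai"
  model: string
}

// Init-time provider preference: Anthropic wins if both keys are set,
// and SANDRA_EXTRACTOR_MODEL overrides whichever default applies.
function pickExtractor(env: Record<string, string | undefined>): ExtractorChoice {
  const override = env.SANDRA_EXTRACTOR_MODEL
  if (env.ANTHROPIC_API_KEY) {
    return { provider: "anthropic", model: override ?? "claude-haiku-4-5-20251001" }
  }
  if (env.OPENAI_API_KEY) {
    return { provider: "openai", model: override ?? "gpt-4o-mini" }
  }
  throw new Error("Set ANTHROPIC_API_KEY or OPENAI_API_KEY for the extractor")
}
```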

Env vars added

  • SANDRA_URL — MCP HTTP endpoint. Default http://localhost:8090/mcp.
  • SANDRA_TOKEN — optional bearer for the MCP server.
  • (Reused) ANTHROPIC_API_KEY or OPENAI_API_KEY for the extractor.
  • (Optional) SANDRA_EXTRACTOR_MODEL to pin a specific extractor model.
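One plausible way the config layer could resolve these variables, as a hedged sketch — only the variable names and the default URL come from the PR; `resolveSandraConfig` is illustrative.

```typescript
interface SandraConfig {
  url: string
  token?: string
}

// Resolve the Sandra-specific env vars, falling back to the documented
// default MCP endpoint when SANDRA_URL is unset. SANDRA_TOKEN is
// optional and passed through as a bearer token when present.
function resolveSandraConfig(env: Record<string, string | undefined>): SandraConfig {
  return {
    url: env.SANDRA_URL ?? "http://localhost:8090/mcp",
    token: env.SANDRA_TOKEN,
  }
}
```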

Self-hosting

Sandra ships as PHP + MySQL; the Sandra repo includes a docker-compose setup. No managed-cloud requirement — reviewers can spin up a local instance and run memorybench -p sandra end-to-end.

Wiring

| File | Change |
| --- | --- |
| `src/providers/sandra/index.ts` | `SandraProvider` class |
| `src/providers/sandra/extractor.ts` | LLM extractor (Anthropic / OpenAI via Vercel AI SDK) |
| `src/providers/sandra/mcp-client.ts` | MCP HTTP client for `sandra_batch` / `sandra_semantic_search` |
| `src/providers/sandra/prompts.ts` | Extractor + answer prompts |
| `src/providers/sandra/schema.ts` | Factory + verb constants |
| `src/providers/index.ts` | Register `sandra` in factory map |
| `src/types/provider.ts` | Extend `ProviderName` union |
| `src/cli/index.ts` | Document the provider + flag |
| `src/utils/config.ts` | Add `SANDRA_URL`, `SANDRA_TOKEN`; route apiKey slot to `ANTHROPIC_API_KEY` |

Test plan

  • Reviewer spins up a local Sandra instance (docker compose up in the Sandra repo).
  • Reviewer sets SANDRA_URL, ANTHROPIC_API_KEY (or OPENAI_API_KEY).
  • bun run src/cli/index.ts -p sandra --dataset longmemeval --questions 5 ingests a small slice and returns answers.
  • Reviewer verifies that switching extractor (Anthropic ↔ OpenAI) via env vars works without code changes.

Adds a `sandra` provider that stores memories in Sandra
(github.com/everdreamsoft/sandra), a semantic graph database
exposing typed refs and entity factories via an MCP HTTP server.

Ingestion is a two-stage pipeline: an LLM extractor emits
entities + facts per session, then one `sandra_batch` call writes
them to the graph scoped by `instance_id = containerTag`.

Extractor supports both Anthropic (default `claude-haiku-4-5-20251001`)
and OpenAI (default `gpt-4o-mini`); preference order is Anthropic if
`ANTHROPIC_API_KEY` is set, otherwise OpenAI. Override either default
via `SANDRA_EXTRACTOR_MODEL`.

Retrieval is single-shot: `sandra_semantic_search` on the
`lme_fact` factory, client-filtered by `instance_id`. No
multi-turn tool-use agent — search results are static, matching
how Mem0 and Zep providers return retrieval results today.

Wiring:
- src/providers/sandra/{index,extractor,mcp-client,prompts,schema}.ts
- providers/index.ts: register `sandra` in factory map
- types/provider.ts: extend `ProviderName` union
- cli/index.ts: document the provider + flag
- utils/config.ts: add `SANDRA_URL`, `SANDRA_TOKEN`; route Sandra's
  apiKey slot to `ANTHROPIC_API_KEY`

Required env (one of):
- ANTHROPIC_API_KEY — preferred path, Claude Haiku
- OPENAI_API_KEY    — fallback path, gpt-4o-mini

Plus SANDRA_URL (MCP HTTP endpoint, default
http://localhost:8090/mcp) and optional SANDRA_TOKEN bearer.

Sandra is self-hostable (PHP + MySQL, docker-compose in the
Sandra repo); no managed-cloud requirement.
@ShabanShaame ShabanShaame marked this pull request as ready for review May 12, 2026 07:54