
Add native LLM core foundation #24712

Open
kitlangton wants to merge 169 commits into dev from llm-core-patch-api

Conversation

@kitlangton
Contributor

Summary

  • Add packages/llm, a native Effect-based LLM core with typed request/event schemas, provider adapters, patches, tool runtime, and recorded provider tests.
  • Add an OpenCode native request/event/tool stream bridge behind OPENCODE_EXPERIMENTAL_LLM_NATIVE, while keeping the existing stream path as the default fallback (flag gating sketched after this list).
  • Extract generic HTTP cassette recording/replay into the private workspace package @opencode-ai/http-recorder.
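
A rough illustration of the flag gating in the second bullet (runNativeBridge, runLegacyStream, and the accepted flag value are assumptions, not code from this PR):

```ts
// Hypothetical stand-ins for the two stream paths; the real wiring lives in
// packages/opencode. Names and the accepted flag value are assumed.
declare function runNativeBridge(request: unknown): AsyncIterable<unknown>
declare function runLegacyStream(request: unknown): AsyncIterable<unknown>

const nativeEnabled = process.env["OPENCODE_EXPERIMENTAL_LLM_NATIVE"] === "true"

// Native path is opt-in; the existing stream path stays the default.
const run = (request: unknown) => (nativeEnabled ? runNativeBridge(request) : runLegacyStream(request))
```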

Safety

  • PR diff was scanned for real-looking secrets, AI attribution trailers, debug leftovers, and suspicious files.
  • Secret-like hits are intentional fake/test sentinel values or documented AWS example credentials.
  • Branch was checked to merge cleanly with current origin/dev.

Testing

  • cd packages/opencode && bun run test test/provider/llm-bridge.test.ts test/session/llm-native.test.ts test/session/llm-native-events.test.ts test/session/llm-native-stream.test.ts
  • cd packages/opencode && bun typecheck
  • cd packages/http-recorder && bun run test
  • cd packages/http-recorder && bun typecheck
  • cd packages/llm && bun run test test/provider/openai-compatible-chat.recorded.test.ts test/provider/anthropic-messages.recorded.test.ts test/provider/gemini.recorded.test.ts
  • cd packages/llm && bun typecheck
  • pre-push hook: bun turbo typecheck

kitlangton added 25 commits May 1, 2026 08:11
- Structurally match recorded requests by canonical JSON so non-deterministic
  field ordering doesn't break replay (see the sketch after this list).
- Pluggable header allow-list and body redaction hook on the record/replay
  layer, so adapters with non-default auth (Anthropic, Bedrock) can plug in
  without touching this file.
- Move the cassette-name dedupe set inside recordedTests() so two describe
  files using different prefixes can run in parallel.
- Replace inline SSE template literals and per-file HTTP layers with shared
  test/lib helpers (sseEvents, fixedResponse, dynamicResponse, truncatedStream).
- Tighten recorded-test assertions to exact text and usage so adapter parser
  regressions surface immediately instead of passing fuzzy length>0 checks.
- Add cancellation and mid-stream transport-error tests for the OpenAI Chat
  adapter.
- Add cross-phase patch tests that verify each phase sees an updated
  PatchContext and that same-order patches sort deterministically by id.
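
The canonical-JSON matching in the first bullet amounts to serializing with recursively sorted keys; a minimal sketch (helper names are assumptions, not the @opencode-ai/http-recorder API):

```ts
// Recursively sort object keys so two requests whose fields arrive in
// different orders serialize to the same string.
const canonicalize = (value: unknown): unknown => {
  if (Array.isArray(value)) return value.map(canonicalize)
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>)
        .sort(([a], [b]) => a.localeCompare(b))
        .map(([key, inner]) => [key, canonicalize(inner)]),
    )
  }
  return value
}

// Replay lookup compares canonical forms rather than raw body text.
const matchesRecording = (recorded: unknown, incoming: unknown): boolean =>
  JSON.stringify(canonicalize(recorded)) === JSON.stringify(canonicalize(incoming))
```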
- Shared SSE helper now expects Effectful decodeChunk and process callbacks,
  so adapter parsers can be Effect.gen and yield typed ProviderChunkError
  instead of throwing across the sync mapAccum boundary.
- parseJson returns Effect<unknown, ProviderChunkError> via Effect.try,
  matching the package style guide on yieldable errors (sketched after this
  list).
- OpenAI Chat finalizes accumulated tool inputs eagerly when finish_reason
  arrives, surfacing JSON parse failures at the boundary instead of at halt.
  onHalt stays sync and just emits from state.
- generate's runFold reducer now mutates the accumulator instead of
  reallocating the events array on every chunk, dropping O(n^2) growth on
  long streams.
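
A minimal sketch of the parseJson shape described above, assuming ProviderChunkError is a Data.TaggedError (the real type lives in packages/llm):

```ts
import { Data, Effect } from "effect"

// Stand-in for the package's real error type.
class ProviderChunkError extends Data.TaggedError("ProviderChunkError")<{ message: string }> {}

// Fails with a typed, yieldable error instead of throwing across the parser boundary.
const parseJson = (raw: string): Effect.Effect<unknown, ProviderChunkError> =>
  Effect.try({
    try: () => JSON.parse(raw) as unknown,
    catch: (cause) => new ProviderChunkError({ message: `invalid JSON chunk: ${String(cause)}` }),
  })

// An Effect.gen adapter parser can then yield the error directly:
const decodeChunk = (raw: string) =>
  Effect.gen(function* () {
    return yield* parseJson(raw)
  })
```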
Gemini rejects integer enums, dangling required fields, untyped arrays, and
object keywords on scalar schemas. The sanitizer was previously a divergent
copy in OpenCode; this lands it in the package as a tool-schema patch with
deterministic tests and selects it for Gemini-protocol or Gemini-named models.

Also tightens the Gemini test suite: covers tool-choice none, drops the
tool-input-delta assertion that Gemini does not actually emit, and confirms
total usage stays undefined when only thoughtsTokenCount arrives.
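
A sketch of the kinds of rewrites such a sanitizer performs (illustrative only; the real patch ships in packages/llm with deterministic tests, and the array-item default here is an assumption):

```ts
type JsonSchema = Record<string, unknown>

const sanitizeForGemini = (schema: JsonSchema): JsonSchema => {
  const out: JsonSchema = { ...schema }
  // Integer enums: drop the enum rather than send values Gemini rejects.
  if (out.type === "integer" && Array.isArray(out.enum)) delete out.enum
  // Dangling required entries must name keys that exist in properties.
  if (Array.isArray(out.required) && out.properties && typeof out.properties === "object") {
    const known = new Set(Object.keys(out.properties as JsonSchema))
    out.required = (out.required as string[]).filter((key) => known.has(key))
  }
  // Untyped arrays need an explicit item schema.
  if (out.type === "array" && out.items === undefined) out.items = { type: "string" }
  // Object keywords are invalid on scalar schemas.
  if (out.type !== "object" && out.type !== "array" && out.type !== undefined) {
    delete out.properties
    delete out.required
  }
  return out
}
```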
…integration

Updates the AGENTS.md TODO list:
- mark Responses, Anthropic, and Gemini adapter coverage as done
- mark the Gemini schema sanitizer port as done
- add concrete next-step items for OpenCode integration: ModelRef bridge,
  request bridge, provider-quirk patches, request/stream parity tests, and
  a flagged rollout against existing session/llm.test.ts cases
- add OpenAI-compatible Chat, Bedrock Converse, and Vertex routing as
  outstanding adapter/dispatch decisions
Every adapter's parse already produces LLMEvents (via the process callback in
the shared sse helper), and every raise was Stream.make(event). The Chunk type
parameter, the raise field, the RaiseState interface, and the Stream.flatMap
raise step in client.stream were all pure overhead.

- Adapter contract shrinks from <Draft, Target, Chunk> to <Draft, Target>
  (sketched after this list).
- All four adapters drop their raise: (event) => Stream.make(event) line.
- client.stream skips the no-op flatMap.
- AGENTS.md adapter section reflects the simpler contract.
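
Approximate before/after shape (field names and placeholder types below are assumptions; only the type-parameter shrink and the removal of raise come from the commit):

```ts
import type { Effect, Stream } from "effect"

// Placeholder aliases so the sketch typechecks; the real types live in packages/llm.
type LLMEvent = { type: string }
type ProviderRequestError = Error
type ProviderChunkError = Error

// Was Adapter<Draft, Target, Chunk> with an extra field:
//   raise: (event: LLMEvent) => Stream.Stream<LLMEvent>  // always Stream.make(event)
interface Adapter<Draft, Target> {
  lower: (draft: Draft) => Effect.Effect<Target, ProviderRequestError>
  parse: (body: Stream.Stream<Uint8Array>) => Stream.Stream<LLMEvent, ProviderChunkError>
}
```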
Per the package style guide, sync if/return functions that need to fail
should yield the error directly via Effect.gen rather than ladder
Effect.fail / Effect.succeed across every branch.

Touches all four adapters' tool-choice lowering. The naming-required
validation now reads as 'guard, then return' rather than embedded in a
chain of monadic returns. Behavior unchanged.
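
A small before/after illustration of that style (ToolChoiceError is a stand-in name):

```ts
import { Data, Effect } from "effect"

class ToolChoiceError extends Data.TaggedError("ToolChoiceError")<{ message: string }> {}

// Before: Effect.fail / Effect.succeed laddered across every branch.
// After: guard, then return, inside Effect.gen.
const lowerToolChoice = (choice: { type: "auto" | "none" | "tool"; toolName?: string }) =>
  Effect.gen(function* () {
    if (choice.type === "tool" && choice.toolName === undefined) {
      return yield* new ToolChoiceError({ message: "tool choice 'tool' requires a toolName" })
    }
    return choice
  })
```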
Locks down the error contract before OpenCode integration (sketched after this list):
- mid-stream provider errors (Anthropic 'event: error', OpenAI Responses
  'type: error') surface as 'provider-error' LLMEvents
- HTTP 4xx responses fail with ProviderRequestError before stream parsing
  begins (the executor contract)

Anthropic already had both. Adds:
- OpenAI Responses: provider-error fixture, code-fallback fixture, HTTP 400
- OpenAI Chat: HTTP 400 sad path
- AGENTS.md TODO refreshed; live recordings of provider errors still pending
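
The two halves of the contract, sketched with stand-in names (the contract is from the commit; the code shapes are assumptions):

```ts
import { Data, Effect } from "effect"

class ProviderRequestError extends Data.TaggedError("ProviderRequestError")<{ status: number }> {}

// Half one: a 4xx response fails the request before any stream parsing starts.
const checkStatus = (response: { status: number }) =>
  Effect.gen(function* () {
    if (response.status >= 400) {
      return yield* new ProviderRequestError({ status: response.status })
    }
    return response
  })

// Half two: errors the provider sends inside the stream (Anthropic 'event: error',
// OpenAI Responses 'type: error') become ordinary events, not stream failures.
const midStreamError = { type: "provider-error", message: "overloaded" } as const
```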
Schema-first, Effect-first tool loop:

- 'tool({ description, parameters, success, execute })' constructs a fully
  typed Tool. parameters and success are Effect Schemas; execute is typed
  against them and returns Effect<Success, ToolFailure>. Handler dependencies
  are closed over at construction time so the runtime never sees per-tool
  services.
- 'ToolRuntime.run(client, { request, tools, maxSteps?, stopWhen? })' streams
  the model, decodes tool-call inputs against parameters, dispatches to the
  matching handler, encodes results against success, emits tool-result events,
  appends assistant + tool messages, and re-streams. Stops on non-tool-calls
  finish, maxSteps, or stopWhen.
- Three recoverable error paths emit tool-error events so the model can
  self-correct: unknown tool name, input fails parameters Schema, handler
  returns ToolFailure. Defects fail the stream.
- 'ToolFailure' added to the schema and exported as the single forced error
  channel for handlers.
- Tool definitions on the LLMRequest are derived via toJsonSchemaDocument so
  consumers don't write JSON Schema by hand.

8 deterministic fixture tests cover the loop, errors, maxSteps, stopWhen, and
parallel tool calls in one step.
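
A hedged usage sketch of that constructor (the tool and ToolFailure names come from the commit; the signature approximated here may differ from what packages/llm exports):

```ts
import { Data, Effect, Schema } from "effect"

// Stand-in for the exported single forced error channel.
class ToolFailure extends Data.TaggedError("ToolFailure")<{ message: string }> {}

// Approximated signature for illustration only.
declare function tool<A, I, S, SI>(definition: {
  description: string
  parameters: Schema.Schema<A, I>
  success: Schema.Schema<S, SI>
  execute: (input: A) => Effect.Effect<S, ToolFailure>
}): unknown

const weather = tool({
  description: "Look up the current temperature for a city",
  parameters: Schema.Struct({ city: Schema.String }),
  success: Schema.Struct({ tempC: Schema.Number }),
  execute: ({ city }) =>
    city === "Atlantis"
      ? Effect.fail(new ToolFailure({ message: "unknown city" })) // recoverable: surfaces as a tool-error event
      : Effect.succeed({ tempC: 21 }),
})
```

ToolRuntime.run(client, { request, tools: [weather], maxSteps: 4 }) would then drive the loop described above.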
kitlangton added 29 commits May 5, 2026 16:58
…e-patch-api

Users now see consistent, descriptive error messages when sending content
types (like media or reasoning) to LLM providers that don't support them.
Instead of a generic or inconsistent failure, each error states which
provider and message role were involved and which content types are actually
supported.
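
A sketch of the message shape described (exact wording is an assumption):

```ts
// Builds the descriptive text: provider, role, offending content type, and
// the content types that are actually supported.
const unsupportedContent = (provider: string, role: string, kind: string, supported: string[]): string =>
  `${provider} does not support '${kind}' content in '${role}' messages; supported: ${supported.join(", ")}`

// e.g. unsupportedContent("bedrock-converse", "assistant", "reasoning", ["text", "tool-call"])
```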

Labels

bug (Something isn't working), contributor (Vouched)
