NVIDIA NeMo Relay helps you see and control what happens inside agent runs without rewriting the agent stack you already have. It gives coding agents, applications, framework integrations, middleware, and observability backends a shared runtime for scopes, policy, plugins, and lifecycle events.
The best first step is to get one real run on disk. Once Relay is writing raw events and a trajectory file, you have something concrete to inspect, debug, and build from.
This walkthrough gives you an end-to-end success signal. You install the
nemo-relay CLI, turn on local exporters, run either Codex or Claude Code
through Relay, and check that Relay wrote both raw events and normalized
trajectories.
Tip
Start by trusting the raw Agent Trajectory Observability Format (ATOF) JSONL. It shows the lifecycle events Relay actually captured before anything is translated into Agent Trajectory Interchange Format (ATIF), OpenTelemetry, or OpenInference output.
cargo install nemo-relay-cli
If you use cargo-binstall, the CLI can also be installed with:
cargo binstall nemo-relay-cli
From the project directory you want to observe, open the project-scoped plugin editor:
nemo-relay plugins edit --project
The editor creates or updates the nearest project plugin file at
.nemo-relay/plugins.toml. In the menu:
- Enable the Observability component.
- Open ATOF, toggle the section on, and set:
  - output_directory to .nemo-relay/atof
  - filename to events.jsonl
  - mode to overwrite
- Open ATIF, toggle the section on, and set:
  - output_directory to .nemo-relay/atif
  - filename_template to trajectory-{session_id}.json
- Press p to preview the generated TOML.
- Press s to save.
Note
Use nemo-relay plugins edit without --project only when you want these
exporter settings in your user-level Relay config instead of this one project.
Use the host CLI that is installed on your machine.
# Codex
nemo-relay codex -- exec "Summarize this repository."
# Claude Code
nemo-relay claude -- "Summarize this repository."
The transparent wrapper starts a local Relay gateway, injects host-specific hook and provider settings for that launched process, then shuts the gateway down when the agent exits.
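The wrapper lifecycle described above (start a helper, launch the agent with injected settings, stop the helper on exit) is a common sidecar pattern. The following is a generic sketch of that pattern in Python, not Relay's implementation: run_with_sidecar and the environment variable used in the usage note are hypothetical, and the real wrapper may inject settings through hooks or config files rather than environment variables.

```python
import os
import subprocess


def run_with_sidecar(sidecar_cmd, child_cmd, injected_env):
    """Start a helper process, run the child with extra environment, then stop the helper.

    Returns the child's exit code. The helper is stopped even if the child fails.
    """
    sidecar = subprocess.Popen(sidecar_cmd)
    try:
        # Settings apply only to the launched child, not to the parent shell.
        env = {**os.environ, **injected_env}
        return subprocess.run(child_cmd, env=env).returncode
    finally:
        sidecar.terminate()
        sidecar.wait(timeout=10)
```

For example, calling run_with_sidecar with a long-running helper command, an agent command, and a mapping such as {"HYPOTHETICAL_GATEWAY_URL": "http://127.0.0.1:0"} would scope that variable to the agent process alone, which mirrors why the transparent wrapper leaves no persistent configuration behind.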
Warning
Codex users may need to review and activate generated hooks before events appear. Refer to the Codex CLI guide for the current hook activation caveat and troubleshooting steps.
After the run exits, check that raw events and trajectory files were written:
test -s .nemo-relay/atof/events.jsonl
ls .nemo-relay/atif/*.json
for file in .nemo-relay/atif/*.json; do
  python3 -m json.tool "$file" >/dev/null
done
Then verify that at least one raw ATOF 0.1 event exists:
python3 - <<'PY'
from pathlib import Path
import json
events_path = Path(".nemo-relay/atof/events.jsonl")
events = [
    json.loads(line)
    for line in events_path.read_text().splitlines()
    if line.strip()
]
assert events, "no ATOF events were written"
assert any(event.get("atof_version") == "0.1" for event in events), "no ATOF 0.1 events found"
print(f"validated {len(events)} ATOF event(s)")
PY
A successful run gives you two things to inspect:
- .nemo-relay/atof/events.jsonl, the raw canonical event stream.
- One or more .nemo-relay/atif/*.json trajectory files for analysis and evaluation workflows.
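For a first look at the raw event stream, a small summary can be more useful than reading the JSONL line by line. The sketch below counts events per atof_version and tallies which top-level field names appear. It assumes only what the validation script above already relies on: each non-empty line is a JSON object that may carry an atof_version field. The function names are hypothetical, and no other ATOF field names are assumed.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_atof(lines):
    """Return (versions, fields): counts per atof_version and per top-level key."""
    versions = Counter()
    fields = Counter()
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines, as the validation script does
        event = json.loads(line)
        versions[event.get("atof_version", "<missing>")] += 1
        fields.update(event.keys())
    return versions, fields


def summarize_file(path=".nemo-relay/atof/events.jsonl"):
    """Summarize the event file written by the ATOF exporter."""
    return summarize_atof(Path(path).read_text().splitlines())
```

Printing the two counters from summarize_file() shows which fields the captured events actually carry, which is a reasonable starting point before writing any field-specific analysis.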
Tip
If raw ATOF events exist but LLM spans are missing, provider traffic probably isn't flowing through the Relay gateway. If ATIF is missing, make sure the agent session or turn ended and the output directory is writable. Use NeMo Relay CLI when you are ready for persistent host plugin installation, gateway configuration, exporter options, and agent-specific diagnostics.
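The file-level checks in this tip can be scripted. The sketch below is a hypothetical helper, not a nemo-relay command: it inspects the output paths configured earlier in this walkthrough and returns the matching hints. It cannot detect missing LLM spans, because that would require assumptions about the event schema that this document does not state.

```python
from pathlib import Path


def diagnose(root="."):
    """Return troubleshooting hints based on which Relay output files exist."""
    base = Path(root) / ".nemo-relay"
    hints = []

    events = base / "atof" / "events.jsonl"
    if not events.is_file() or events.stat().st_size == 0:
        hints.append(
            "No raw ATOF events: check the exporter settings; "
            "Codex users may need to activate generated hooks."
        )

    atif_dir = base / "atif"
    if not atif_dir.is_dir() or not any(atif_dir.glob("*.json")):
        hints.append(
            "No ATIF trajectory: confirm the agent session or turn ended "
            "and the output directory is writable."
        )
    return hints
```

An empty list means both outputs are present; it does not confirm that provider traffic flowed through the gateway.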
Pick the row closest to what you are trying to do next. Refer to the corresponding documentation for more information.
| Goal | Start With |
|---|---|
| Observe Codex, Claude Code, Cursor, or Hermes locally | NeMo Relay CLI |
| Instrument app-owned LLM or tool calls | Quick Start |
| Use LangChain, LangGraph, Deep Agents, or OpenClaw | Supported Integrations |
| Build a framework or provider integration | Integrate into Frameworks |
| Export ATOF, ATIF, OpenTelemetry, or OpenInference | Observability Plugin |
| Package reusable middleware or exporters | Build Plugins |
| Develop or test this repository from source | CONTRIBUTING.md |
If you own the code that calls the model or tool, install the binding for your language and route that boundary through Relay directly.
# Python
uv add nemo-relay
# Node.js
npm install nemo-relay-node
# Rust
cargo add nemo-relay
Then run the smallest workflow for that binding:
The Node.js package requires Node.js 24 or newer.
Relay is the liaison between agent systems. A production application may combine NeMo Agent Toolkit, LangChain, LangGraph, provider SDKs, custom harness code, NeMo Guardrails, tracing systems, and evaluation pipelines. Relay gives those pieces one runtime contract instead of asking every layer to invent its own wrappers and trace vocabulary.
Relay gives those systems:
- Scopes so runs, turns, tools, LLM calls, and subagents have clear ownership, parent-child lineage, cleanup boundaries, and request isolation.
- Managed LLM and tool calls so the same lifecycle and middleware rules apply around each callback.
- Middleware for the places where Relay must block, sanitize, transform, route, retry, or replace execution.
- Plugins so reusable observability, guardrail, adaptive, and exporter behavior can be turned on from configuration.
- Events and subscribers so raw ATOF, normalized ATIF, OpenTelemetry, and OpenInference output all come from the same runtime stream.
Relay does not replace your framework, model provider, application logic, observability backend, or guardrail authoring system. It gives those systems a common boundary to meet at.
flowchart LR
    App[Application, Framework, or CLI Harness]
    subgraph Runtime[NeMo Relay Runtime]
        direction TB
        Scopes[Scopes]
        Middleware[Middleware]
        Plugins[Plugins]
        Events[Lifecycle Events]
    end
    Output[Subscribers and Exporters]
    App --> Scopes
    App --> Middleware
    Plugins --> Middleware
    Scopes --> Events
    Middleware --> Events
    Events --> Output
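The relationship between scopes, middleware, lifecycle events, and subscribers in the diagram can be illustrated with a toy model. This is a conceptual sketch only: ToyRuntime and its methods are invented for illustration and are not the nemo-relay API. It also simplifies heavily, since middleware here emits no events of its own and nothing is isolated per request.

```python
from contextlib import contextmanager
from itertools import count


class ToyRuntime:
    """Illustrative stand-in for a scoped runtime; NOT the nemo-relay API."""

    def __init__(self):
        self._ids = count(1)
        self._stack = []       # open scope ids, innermost last
        self.subscribers = []  # callables that receive each event dict
        self.middleware = []   # callables: (name, payload) -> payload

    def _emit(self, kind, scope_id, parent_id, name):
        event = {"kind": kind, "scope": scope_id, "parent": parent_id, "name": name}
        for subscriber in self.subscribers:
            subscriber(event)

    @contextmanager
    def scope(self, name):
        """Open a scope whose parent is the innermost scope already open."""
        scope_id = next(self._ids)
        parent_id = self._stack[-1] if self._stack else None
        self._stack.append(scope_id)
        self._emit("start", scope_id, parent_id, name)
        try:
            yield scope_id
        finally:
            # Cleanup boundary: the end event fires even if the body raises.
            self._stack.pop()
            self._emit("end", scope_id, parent_id, name)

    def call(self, name, func, payload):
        """Run func in its own scope after passing payload through middleware."""
        with self.scope(name):
            for layer in self.middleware:
                payload = layer(name, payload)
            return func(payload)
```

In this model, appending a list's append method to subscribers collects every start and end event, and each nested call records its enclosing scope as parent. That single event stream is the idea behind having every exporter read from the same runtime source.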
Note
The main supported paths today are Rust, Python, and Node.js. Go, WebAssembly, and raw C FFI are available for source-first users, but they are still experimental.
The following table shows which language bindings and CLI features are currently supported:
| Binding | Status | Notes |
|---|---|---|
| Python | Fully supported | Documented with Quick Start and Guides. |
| Node.js | Fully supported | Documented with Quick Start and Guides. |
| Rust | Fully supported | Documented with Quick Start and Guides. |
| NeMo Relay CLI | Supported | Local observability and hook-backed security are supported; optimization is partial and host-dependent. |
| Go | Experimental | Source-first under go/nemo_relay. |
| WebAssembly | Experimental | Source-first under crates/wasm. |
| FFI | Experimental | Source-first under crates/ffi. |
The CLI support matrix separates the supported CLI surface from host-specific coverage.
- Observability works for the listed harnesses.
- Security is supported when the host exposes blocking hooks.
- Optimization remains partial and host-dependent.
| Agent | Observability | Security | Optimization | Notes |
|---|---|---|---|---|
| Claude Code | Yes | Yes | Partial | Hook forwarding, pre-tool blocking, and gateway-routed LLM observability are supported. |
| Codex | Yes | Yes | Partial | Hook activation is required; missing session-end behavior limits trajectory finalization and full optimization coverage. |
| Hermes Agent | Yes | Yes | Partial | Hook forwarding, pre-tool blocking, and gateway-routed or hook-backed LLM observability are supported. |
| Cursor | Partial | Limited | No | Missing hooks under cursor-agent and manual gateway routing limit full feature coverage. |
Use these integrations when the framework exposes stable callbacks, middleware, or plugin hooks that preserve enough lifecycle fidelity.
| Agent / Library | Observability | Security | Optimization | Notes |
|---|---|---|---|---|
| LangChain | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| LangGraph | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| Deep Agents | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| OpenClaw | Yes | Partial | No | Hook-backed telemetry with pre-tool guardrails. Managed execution rewrites require the patch-based integration. |
The Python nemo-relay package ships extras for LangChain, LangGraph, and Deep
Agents:
uv add "nemo-relay[langchain,langgraph,deepagents]"
Refer to Supported Integrations for setup guides and current caveats.
Patch-based integrations are experimental samples maintained against pinned upstream checkouts. Use third_party/README.md for the clone, checkout, and patch-application workflow.
| Integration | Observability | Security | Optimization | Notes |
|---|---|---|---|---|
| LangChain, LangGraph, LangChain NVIDIA | Yes | Yes | Yes | Directly patches behavior into code. |
| opencode | Yes | Yes | Yes | Directly patches behavior into code. |
| OpenClaw | Yes | Yes | Yes | Adds middleware support to OpenClaw and a built-in plugin. |
| Hermes Agent | Yes | Yes | Yes | Directly patches behavior into code. |
End-user documentation lives at NVIDIA NeMo Relay documentation.
Important local entry points:
For source builds, tests, and contribution workflow, refer to CONTRIBUTING.md.
- NemoClaw support and integration for managed tool and LLM execution flows.
- Deeper NVIDIA NeMo ecosystem integration across agent, guardrail, evaluation, and observability workflows.
- Expanded adaptive optimization capabilities for performance-aware scheduling, hints, and cache behavior.
- First-party plugins and packages for common agent runtimes and frameworks where upstream extension points allow it.
NVIDIA NeMo Relay is licensed under the Apache License 2.0.