NVIDIA/NeMo-Relay

NVIDIA NeMo Relay

NVIDIA NeMo Relay helps you see and control what happens inside agent runs without rewriting the agent stack you already have. It gives coding agents, applications, framework integrations, middleware, and observability backends a shared runtime for scopes, policy, plugins, and lifecycle events.

The best first step is to get one real run on disk. Once Relay is writing raw events and a trajectory file, you have something concrete to inspect, debug, and build from.

Start Here: Capture One Local Agent Run

This walkthrough gives you an end-to-end success signal. You install the nemo-relay CLI, turn on local exporters, run either Codex or Claude Code through Relay, and check that Relay wrote both raw events and normalized trajectories.

Tip

Start by trusting the raw Agent Trajectory Observability Format (ATOF) JSONL. It shows the lifecycle events Relay actually captured before anything is translated into Agent Trajectory Interchange Format (ATIF), OpenTelemetry, or OpenInference output.

1. Install the CLI

cargo install nemo-relay-cli

If you use cargo-binstall, the CLI can also be installed with:

cargo binstall nemo-relay-cli

2. Enable Local Observability Output

From the project directory you want to observe, open the project-scoped plugin editor:

nemo-relay plugins edit --project

The editor creates or updates the nearest project plugin file at .nemo-relay/plugins.toml. In the menu:

  1. Enable the Observability component.
  2. Open ATOF, toggle the section on, and set:
    • output_directory to .nemo-relay/atof
    • filename to events.jsonl
    • mode to overwrite
  3. Open ATIF, toggle the section on, and set:
    • output_directory to .nemo-relay/atif
    • filename_template to trajectory-{session_id}.json
  4. Press p to preview the generated TOML.
  5. Press s to save.

Note

Use nemo-relay plugins edit without --project only when you want these exporter settings in your user-level Relay config instead of this one project.

3. Run Codex or Claude Code Through Relay

Use the host CLI that is installed on your machine.

# Codex
nemo-relay codex -- exec "Summarize this repository."

# Claude Code
nemo-relay claude -- "Summarize this repository."

The transparent wrapper starts a local Relay gateway, injects host-specific hook and provider settings for that launched process, then shuts the gateway down when the agent exits.
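If you launch agents from a script rather than a terminal, the same wrapper invocations can be built programmatically. This is a minimal sketch that only reproduces the two command lines shown above; `relay_command` and `run_through_relay` are hypothetical helper names, and no other `nemo-relay` flags are assumed.

```python
import shutil
import subprocess


def relay_command(host: str, prompt: str) -> list[str]:
    """Build the argv for the wrapper invocations shown above."""
    if host == "codex":
        return ["nemo-relay", "codex", "--", "exec", prompt]
    if host == "claude":
        return ["nemo-relay", "claude", "--", prompt]
    raise ValueError(f"unsupported host: {host!r}")


def run_through_relay(host: str, prompt: str) -> int:
    """Run the host CLI through Relay and return the wrapper's exit code."""
    if shutil.which("nemo-relay") is None:
        raise FileNotFoundError("nemo-relay is not on PATH; install the CLI first")
    # The wrapper owns the gateway lifecycle, so the script only waits for exit.
    return subprocess.run(relay_command(host, prompt), check=False).returncode
```

Passing the prompt as a separate argv element avoids shell quoting problems when the prompt contains quotes or spaces.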

Warning

Codex users may need to review and activate generated hooks before events appear. Refer to the Codex CLI guide for the current hook activation caveat and troubleshooting steps.

4. Verify the Run

After the run exits, check that raw events and trajectory files were written:

test -s .nemo-relay/atof/events.jsonl
ls .nemo-relay/atif/*.json
for file in .nemo-relay/atif/*.json; do
  python3 -m json.tool "$file" >/dev/null
done

Then verify that at least one raw ATOF 0.1 event exists:

python3 - <<'PY'
from pathlib import Path
import json

events_path = Path(".nemo-relay/atof/events.jsonl")
events = [
    json.loads(line)
    for line in events_path.read_text().splitlines()
    if line.strip()
]

assert events, "no ATOF events were written"
assert any(event.get("atof_version") == "0.1" for event in events), "no ATOF 0.1 events found"
print(f"validated {len(events)} ATOF event(s)")
PY

A successful run gives you two things to inspect:

  • .nemo-relay/atof/events.jsonl, the raw canonical event stream.
  • One or more .nemo-relay/atif/*.json trajectory files for analysis and evaluation workflows.
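For a quick look at what was captured, the two outputs can be summarized with a few lines of Python. The sketch below relies only on what the verification step above already uses: one JSON object per line, and the `atof_version` field. The helper names are illustrative, and no other ATOF or ATIF fields are assumed.

```python
import json
from collections import Counter
from pathlib import Path


def count_atof_versions(events_path: str) -> Counter:
    """Count events per atof_version value in a JSONL file."""
    counts = Counter()
    for line in Path(events_path).read_text().splitlines():
        if line.strip():
            # Assumes each non-empty line is a JSON object.
            counts[json.loads(line).get("atof_version", "<missing>")] += 1
    return counts


def list_trajectories(atif_dir: str) -> list[str]:
    """Return the trajectory file names in the ATIF directory, sorted."""
    return sorted(path.name for path in Path(atif_dir).glob("*.json"))


# Only report when the run output exists, so the snippet is safe to run anywhere.
if Path(".nemo-relay/atof/events.jsonl").exists():
    print(dict(count_atof_versions(".nemo-relay/atof/events.jsonl")))
    print(list_trajectories(".nemo-relay/atif"))
```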

Tip

If raw ATOF events exist but LLM spans are missing, provider traffic probably isn't flowing through the Relay gateway. If ATIF files are missing, make sure the agent session or turn ended and the output directory is writable. Refer to the NeMo Relay CLI guide when you are ready for persistent host plugin installation, gateway configuration, exporter options, and agent-specific diagnostics.
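The file-level checks in this tip can be scripted. The following sketch reports whether the ATOF file is present and non-empty, and whether the ATIF directory exists, is writable, and contains trajectories. It uses the paths from this walkthrough as examples; `diagnose_output` is a hypothetical helper, and it cannot tell whether provider traffic reached the gateway.

```python
import os
from pathlib import Path


def diagnose_output(atof_file: str, atif_dir: str) -> list[str]:
    """Return human-readable findings for the two exporter outputs."""
    findings = []
    events = Path(atof_file)
    if not events.is_file() or events.stat().st_size == 0:
        findings.append(f"{atof_file}: missing or empty")
    trajectories = Path(atif_dir)
    if not trajectories.is_dir():
        findings.append(f"{atif_dir}: directory does not exist")
    elif not os.access(trajectories, os.W_OK):
        findings.append(f"{atif_dir}: directory is not writable")
    elif not any(trajectories.glob("*.json")):
        findings.append(f"{atif_dir}: no trajectory files yet")
    return findings


for finding in diagnose_output(".nemo-relay/atof/events.jsonl", ".nemo-relay/atif"):
    print(finding)
```

An empty result means both outputs look healthy at the file level.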

Choose Your Next Path

Pick the row closest to what you are trying to do next. Refer to the corresponding documentation for more information.

| Goal | Start With |
| --- | --- |
| Observe Codex, Claude Code, Cursor, or Hermes locally | NeMo Relay CLI |
| Instrument app-owned LLM or tool calls | Quick Start |
| Use LangChain, LangGraph, Deep Agents, or OpenClaw | Supported Integrations |
| Build a framework or provider integration | Integrate into Frameworks |
| Export ATOF, ATIF, OpenTelemetry, or OpenInference | Observability Plugin |
| Package reusable middleware or exporters | Build Plugins |
| Develop or test this repository from source | CONTRIBUTING.md |

Application Quick Starts

If you own the code that calls the model or tool, install the binding for your language and route that boundary through Relay directly.

# Python
uv add nemo-relay

# Node.js
npm install nemo-relay-node

# Rust
cargo add nemo-relay

Then run the smallest workflow for that binding, as described in Quick Start.

The Node.js package requires Node.js 24 or newer.
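The Node.js requirement can be checked before installing. This sketch parses the text printed by `node --version` (for example `v24.1.0`) and compares the major version against 24; the helper names are illustrative.

```python
import re

MINIMUM_NODE_MAJOR = 24  # from the requirement stated above


def node_major(version_text: str) -> int:
    """Extract the major version from `node --version` output such as 'v24.1.0'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version: {version_text!r}")
    return int(match.group(1))


def node_is_supported(version_text: str) -> bool:
    """Return True when the version meets the minimum major version."""
    return node_major(version_text) >= MINIMUM_NODE_MAJOR
```

To use it against the local install, pass in the output of `subprocess.run(["node", "--version"], capture_output=True, text=True).stdout`.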

What Relay Adds

Relay is the liaison between agent systems. A production application may combine NeMo Agent Toolkit, LangChain, LangGraph, provider SDKs, custom harness code, NeMo Guardrails, tracing systems, and evaluation pipelines. Relay gives those pieces one runtime contract instead of asking every layer to invent its own wrappers and trace vocabulary.

Relay gives those systems:

  • Scopes so runs, turns, tools, LLM calls, and subagents have clear ownership, parent-child lineage, cleanup boundaries, and request isolation.
  • Managed LLM and tool calls so the same lifecycle and middleware rules apply around each callback.
  • Middleware for the places where Relay must block, sanitize, transform, route, retry, or replace execution.
  • Plugins so reusable observability, guardrail, adaptive, and exporter behavior can be turned on from configuration.
  • Events and subscribers so raw ATOF, normalized ATIF, OpenTelemetry, and OpenInference output all come from the same runtime stream.

Relay does not replace your framework, model provider, application logic, observability backend, or guardrail authoring system. It gives those systems a common boundary to meet at.
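To make the scope and event ideas concrete, here is a toy model in plain Python. It is not the Relay API: `ToyScope`, `ToyRuntime`, and the event dictionary shape are all invented for illustration. It shows only the idea that every event carries parent-child lineage and that all subscribers read the same stream.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Event = dict  # illustrative event shape, not the ATOF schema
Subscriber = Callable[[Event], None]


@dataclass
class ToyScope:
    """Hypothetical scope: a named unit of work with a link to its parent."""
    name: str
    parent: Optional["ToyScope"] = None

    def lineage(self) -> list:
        """Return scope names from the root down to this scope."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


@dataclass
class ToyRuntime:
    """Hypothetical runtime that fans each event out to every subscriber."""
    subscribers: list = field(default_factory=list)

    def emit(self, scope: ToyScope, kind: str) -> None:
        event = {"kind": kind, "lineage": scope.lineage()}
        for subscriber in self.subscribers:
            subscriber(event)


# Two subscribers consume the same stream: one keeps raw events, one keeps kinds.
raw_log, kinds = [], []
runtime = ToyRuntime(subscribers=[raw_log.append, lambda e: kinds.append(e["kind"])])
run = ToyScope("run")
tool = ToyScope("tool", parent=ToyScope("turn", parent=run))
runtime.emit(tool, "tool_start")
print(raw_log[0]["lineage"])  # ['run', 'turn', 'tool']
```

Because both subscribers receive the same event objects, a raw log and a derived view cannot drift apart, which is the property the last bullet above describes.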

flowchart LR
    App[Application, Framework, or CLI Harness]

    subgraph Runtime[NeMo Relay Runtime]
        direction TB
        Scopes[Scopes]
        Middleware[Middleware]
        Plugins[Plugins]
        Events[Lifecycle Events]
    end

    Output[Subscribers and Exporters]

    App --> Scopes
    App --> Middleware
    Plugins --> Middleware
    Scopes --> Events
    Middleware --> Events
    Events --> Output
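
The middleware path in the diagram can be sketched as a chain of wrappers around a call. Again, this is a toy in plain Python and not Relay's middleware interface: `compose`, `block_shell`, and `redact_secret` are hypothetical names, and `hunter2` is a placeholder secret. It shows how one layer can block a call while another transforms the request before execution.

```python
from typing import Any, Callable

Call = Callable[[dict], Any]
Middleware = Callable[[dict, Call], Any]


def _bind(middleware: Middleware, next_call: Call) -> Call:
    """Fix `next_call` as the continuation for one middleware."""
    def bound(request: dict) -> Any:
        return middleware(request, next_call)
    return bound


def compose(middlewares: list, call: Call) -> Call:
    """Wrap `call` so the first middleware in the list runs outermost."""
    wrapped = call
    for middleware in reversed(middlewares):
        wrapped = _bind(middleware, wrapped)
    return wrapped


def block_shell(request: dict, next_call: Call) -> Any:
    """Blocking middleware: refuse a (hypothetical) shell tool outright."""
    if request.get("tool") == "shell":
        raise PermissionError("blocked by policy")
    return next_call(request)


def redact_secret(request: dict, next_call: Call) -> Any:
    """Transforming middleware: mask a placeholder secret in the arguments."""
    cleaned = {**request, "args": request.get("args", "").replace("hunter2", "***")}
    return next_call(cleaned)


def echo_tool(request: dict) -> str:
    """Stand-in for the real tool callback."""
    return request["args"]


guarded = compose([block_shell, redact_secret], echo_tool)
print(guarded({"tool": "search", "args": "token hunter2"}))  # token ***
```

Ordering matters in this sketch: the blocking layer runs first, so a refused call never reaches the transforming layer or the tool.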

Support Status

Note

The main supported paths today are Rust, Python, and Node.js. Go, WebAssembly, and raw C FFI are available for source-first users, but they are still experimental.

The following table shows which language bindings and CLI features are currently supported:

| Binding | Status | Notes |
| --- | --- | --- |
| Python | Fully supported | Documented with Quick Start and Guides. |
| Node.js | Fully supported | Documented with Quick Start and Guides. |
| Rust | Fully supported | Documented with Quick Start and Guides. |
| NeMo Relay CLI | Supported | Local observability and hook-backed security are supported; optimization is partial and host-dependent. |
| Go | Experimental | Source-first under go/nemo_relay. |
| WebAssembly | Experimental | Source-first under crates/wasm. |
| FFI | Experimental | Source-first under crates/ffi. |

Agent Harness Support

The CLI support matrix separates the supported CLI surface from host-specific coverage.

  • Observability works for the listed harnesses.
  • Security is supported when the host exposes blocking hooks.
  • Optimization remains partial and host-dependent.

| Agent | Observability | Security | Optimization | Notes |
| --- | --- | --- | --- | --- |
| Claude Code | Yes | Yes | Partial | Hook forwarding, pre-tool blocking, and gateway-routed LLM observability are supported. |
| Codex | Yes | Yes | Partial | Hook activation is required; missing session-end behavior limits trajectory finalization and full optimization coverage. |
| Hermes Agent | Yes | Yes | Partial | Hook forwarding, pre-tool blocking, and gateway-routed or hook-backed LLM observability are supported. |
| Cursor | Partial | Limited | No | Missing hooks under cursor-agent and manual gateway routing limit full feature coverage. |

Public API Integrations

Use these integrations when the framework exposes stable callbacks, middleware, or plugin hooks that preserve enough lifecycle fidelity.

| Agent / Library | Observability | Security | Optimization | Notes |
| --- | --- | --- | --- | --- |
| LangChain | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| LangGraph | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| Deep Agents | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| OpenClaw | Yes | Partial | No | Hook-backed telemetry with pre-tool guardrails. Managed execution rewrites require the patch-based integration. |

The Python nemo-relay package ships extras for LangChain, LangGraph, and Deep Agents:

uv add "nemo-relay[langchain,langgraph,deepagents]"

Refer to Supported Integrations for setup guides and current caveats.

Patch-Based Integrations

Patch-based integrations are experimental samples maintained against pinned upstream checkouts. Use third_party/README.md for the clone, checkout, and patch-application workflow.

| Integration | Observability | Security | Optimization | Notes |
| --- | --- | --- | --- | --- |
| LangChain, LangGraph, LangChain NVIDIA | Yes | Yes | Yes | Directly patches behavior into code. |
| opencode | Yes | Yes | Yes | Directly patches behavior into code. |
| OpenClaw | Yes | Yes | Yes | Adds middleware support to OpenClaw and a built-in plugin. |
| Hermes Agent | Yes | Yes | Yes | Directly patches behavior into code. |

Documentation

End-user documentation lives at NVIDIA NeMo Relay documentation.

Important local entry points:

For source builds, tests, and contribution workflow, refer to CONTRIBUTING.md.

Roadmap

  • NemoClaw support and integration for managed tool and LLM execution flows.
  • Deeper NVIDIA NeMo ecosystem integration across agent, guardrail, evaluation, and observability workflows.
  • Expanded adaptive optimization capabilities for performance-aware scheduling, hints, and cache behavior.
  • First-party plugins and packages for common agent runtimes and frameworks where upstream extension points allow it.

License

NVIDIA NeMo Relay is licensed under the Apache License 2.0.

About

Multi-language agent runtime for execution scope management, lifecycle events, and middleware on tool and LLM calls.
