Advanced Features

ClawRouter v0.5+ includes intelligent routing features that work automatically.

Response Cache
Agentic Auto-Detection
Tool Detection
Context-Length-Aware Routing
Model Aliases
Free Tier Fallback
Session Persistence
Cost Tracking with /stats

Response Cache

ClawRouter includes LLM response caching inspired by LiteLLM's caching system. Identical requests return cached responses, saving both cost and latency.

How it works:

Request: "What is 2+2?"
  First call:  → API ($0.001) → Cache response
  Second call: → Cache HIT → Return instantly ($0)

Features:

Feature	Default	Description
TTL	10 minutes	Responses expire after TTL
Max size	200 entries	LRU eviction when full
Item limit	1MB	Large responses skipped
Auto-enabled	Yes	No config needed

Cache key generation:

The cache key is a SHA-256 hash of the request body (model + messages + params), with normalization:

Message timestamps stripped (OpenClaw injects [Mon 2024-01-15 10:30 UTC])
Keys sorted for consistent hashing
Stream mode, user, and request_id fields excluded

Bypass cache:

// Via header
fetch("/v1/chat/completions", {
  headers: { "Cache-Control": "no-cache" }
})

// Via body
{
  "model": "blockrun/auto",
  "cache": false,  // or "no_cache": true
  "messages": [...]
}

Check cache stats:

curl http://localhost:8402/cache

Response:

{
  "size": 42,
  "maxSize": 200,
  "hits": 156,
  "misses": 89,
  "evictions": 3,
  "hitRate": "63.7%"
}

Configuration:

Response caching is enabled by default with sensible defaults. For advanced tuning, the cache can be configured programmatically:

import { ResponseCache } from "@blockrun/clawrouter";

const cache = new ResponseCache({
  maxSize: 500, // Max cached responses
  defaultTTL: 300, // 5 minutes
  maxItemSize: 2_097_152, // 2MB max per item
  enabled: true,
});

Agentic Auto-Detection

ClawRouter automatically detects multi-step agentic tasks and routes to models optimized for autonomous execution:

"what is 2+2"                    → gemini-flash (standard)
"build the project then run tests" → kimi-k2.5 (auto-agentic)
"fix the bug and make sure it works" → kimi-k2.5 (auto-agentic)

How it works:

Detects agentic keywords: file ops ("read", "edit"), execution ("run", "test", "deploy"), iteration ("fix", "debug", "verify")
Threshold: 2+ signals triggers auto-switch to agentic tiers
No config needed — works automatically

Agentic tier models (optimized for multi-step autonomy):

Tier	Agentic Model	Why
SIMPLE	claude-haiku-4.5	Fast + reliable tool use
MEDIUM	kimi-k2.5	200+ tool chains, 76% cheaper
COMPLEX	claude-sonnet-4.6	Best balance for complex tasks
REASONING	kimi-k2.5	Extended reasoning + execution

Force Agentic Mode

You can also force agentic mode via config:

# openclaw.yaml
plugins:
  - id: "@blockrun/clawrouter"
    config:
      routing:
        overrides:
          agenticMode: true # Always use agentic tiers

Tool Detection

When your request includes a tools array (function calling), ClawRouter automatically switches to agentic tiers:

// Request with tools → auto-agentic mode
{
  model: "blockrun/auto",
  messages: [{ role: "user", content: "Check the weather" }],
  tools: [{ type: "function", function: { name: "get_weather", ... } }]
}
// → Routes to claude-haiku-4.5 (excellent tool use)
// → Instead of gemini-flash (may produce malformed tool calls)

Why this matters: Some models (like deepseek-reasoner) are optimized for chain-of-thought reasoning but can generate malformed tool calls. Tool detection ensures requests with functions go to models proven to handle tool use correctly.

Context-Length-Aware Routing

ClawRouter automatically filters out models that can't handle your context size:

150K token request:
  Full chain: [grok-4-fast (131K), deepseek (128K), kimi (262K), gemini (1M)]
  Filtered:   [kimi (262K), gemini (1M)]
  → Skips models that would fail with "context too long" errors

This prevents wasted API calls and faster fallback to capable models.

Model Aliases

Use short aliases instead of full model paths:

/model free      # gpt-oss-120b (FREE!)
/model br-sonnet # anthropic/claude-sonnet-4.6
/model br-opus   # anthropic/claude-opus-4
/model br-haiku  # anthropic/claude-haiku-4.5
/model gpt       # openai/gpt-4o
/model gpt5      # openai/gpt-5.2
/model deepseek  # deepseek/deepseek-chat
/model reasoner  # deepseek/deepseek-reasoner
/model kimi      # moonshot/kimi-k2.5
/model gemini    # google/gemini-2.5-pro
/model flash     # google/gemini-2.5-flash
/model grok      # xai/grok-3
/model grok-fast # xai/grok-4-fast-reasoning

All aliases work with /model blockrun/xxx or just /model xxx.

Free Tier Fallback

When your wallet balance hits $0, ClawRouter automatically falls back to the free model (gpt-oss-120b):

Wallet: $0.00
Request: "Help me write a function"
→ Routes to gpt-oss-120b (FREE)
→ No "insufficient funds" error
→ Keep building while you top up

You'll never get blocked by an empty wallet — the free tier keeps you running.

Session Persistence

For multi-turn conversations, ClawRouter pins the model to prevent mid-task switching:

Turn 1: "Build a React component" → claude-sonnet-4.6
Turn 2: "Add dark mode support"   → claude-sonnet-4.6 (pinned)
Turn 3: "Now add tests"           → claude-sonnet-4.6 (pinned)

Sessions are identified by conversation ID and persist for 1 hour of inactivity.

Cost Tracking with /stats

Track your savings in real-time:

# In any OpenClaw conversation
/stats

Output:

+============================================================+
|              ClawRouter Usage Statistics                   |
+============================================================+
|  Period: last 7 days                                      |
|  Total Requests: 442                                      |
|  Total Cost: $1.73                                       |
|  Baseline Cost (Opus): $20.13                            |
|  Total Saved: $18.40 (91.4%)                             |
+------------------------------------------------------------+
|  Routing by Tier:                                          |
|    SIMPLE     ===========           55.0% (243)            |
|    MEDIUM     ======                30.8% (136)            |
|    COMPLEX    =                      7.2% (32)             |
|    REASONING  =                      7.0% (31)             |
+============================================================+

Stats are stored locally at ~/.openclaw/blockrun/logs/ and aggregated on demand.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Advanced Features

Table of Contents

Response Cache

Agentic Auto-Detection

Force Agentic Mode

Tool Detection

Context-Length-Aware Routing

Model Aliases

Free Tier Fallback

Session Persistence

Cost Tracking with /stats

FilesExpand file tree

features.md

Latest commit

History

features.md

File metadata and controls

Advanced Features

Table of Contents

Response Cache

Agentic Auto-Detection

Force Agentic Mode

Tool Detection

Context-Length-Aware Routing

Model Aliases

Free Tier Fallback

Session Persistence

Cost Tracking with /stats