Commit fe3ee1a

add note on performance and latency
1 parent 75435ec commit fe3ee1a

7 files changed, +14 -7 lines changed

docs/ref/checks/custom_prompt_check.md

Lines changed: 2 additions & 1 deletion
@@ -23,7 +23,8 @@ Implements custom content checks using configurable LLM prompts. Uses your custo
  - **`include_reasoning`** (optional): Whether to include reasoning/explanation fields in the guardrail output (default: `false`)
  - When `false`: The LLM only generates the essential fields (`flagged` and `confidence`), reducing token generation costs
  - When `true`: Additionally, returns detailed reasoning for its decisions
- - **Use Case**: Keep disabled for production to minimize costs; enable for development and debugging
+ - **Performance**: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on model) while maintaining detection performance
+ - **Use Case**: Keep disabled for production to minimize costs and latency; enable for development and debugging

  ## Implementation Notes

docs/ref/checks/hallucination_detection.md

Lines changed: 2 additions & 1 deletion
@@ -28,7 +28,8 @@ Flags model text containing factual claims that are clearly contradicted or not
  - **`include_reasoning`** (optional): Whether to include detailed reasoning fields in the output (default: `false`)
  - When `false`: Returns only `flagged` and `confidence` to save tokens
  - When `true`: Additionally, returns `reasoning`, `hallucination_type`, `hallucinated_statements`, and `verified_statements`
- - Recommended: Keep disabled for production (default); enable for development/debugging
+ - **Performance**: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on model) while maintaining detection performance
+ - **Use Case**: Keep disabled for production to minimize costs and latency; enable for development and debugging

  ### Tuning guidance

docs/ref/checks/jailbreak.md

Lines changed: 2 additions & 1 deletion
@@ -46,7 +46,8 @@ Detects attempts to bypass safety or policy constraints via manipulation (prompt
  - **`include_reasoning`** (optional): Whether to include reasoning/explanation fields in the guardrail output (default: `false`)
  - When `false`: The LLM only generates the essential fields (`flagged` and `confidence`), reducing token generation costs
  - When `true`: Additionally, returns detailed reasoning for its decisions
- - **Use Case**: Keep disabled for production to minimize costs; enable for development and debugging
+ - **Performance**: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on model) while maintaining detection performance
+ - **Use Case**: Keep disabled for production to minimize costs and latency; enable for development and debugging

  ### Tuning guidance

docs/ref/checks/llm_base.md

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,8 @@ Base configuration for LLM-based guardrails. Provides common configuration optio
  - **`include_reasoning`** (optional): Whether to include reasoning/explanation fields in the guardrail output (default: `false`)
  - When `true`: The LLM generates and returns detailed reasoning for its decisions (e.g., `reason`, `reasoning`, `observation`, `evidence` fields)
  - When `false`: The LLM only returns the essential fields (`flagged` and `confidence`), reducing token generation costs
- - **Use Case**: Keep disabled for production to minimize costs; enable for development and debugging
+ - **Performance**: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on model) while maintaining detection performance
+ - **Use Case**: Keep disabled for production to minimize costs and latency; enable for development and debugging

  ## What It Does
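
Note on the hunk above: the "Use Case" guidance amounts to switching `include_reasoning` by environment. The sketch below is one way that could look; it is not taken from the library. Only the `include_reasoning` key comes from the docs in this commit. The function name, the environment names, the `APP_ENV` variable, and the flat dict shape are all assumptions for illustration.

```python
import os

# Hypothetical environment names in which reasoning fields are worth the
# extra tokens and latency. Everything else keeps the documented default.
DEBUG_ENVIRONMENTS = {"development", "debug"}


def llm_check_config(environment: str) -> dict:
    """Build an illustrative config fragment for an LLM-based check.

    The dict shape is an assumption; only `include_reasoning` is a
    documented option (default: false).
    """
    return {
        "include_reasoning": environment in DEBUG_ENVIRONMENTS,
    }


if __name__ == "__main__":
    # APP_ENV is a hypothetical variable name; default to production.
    env = os.environ.get("APP_ENV", "production")
    print(llm_check_config(env))
```

Defaulting unknown environments to `False` mirrors the documented default, so a misconfigured deployment errs toward lower cost and latency.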

docs/ref/checks/nsfw.md

Lines changed: 2 additions & 1 deletion
@@ -32,7 +32,8 @@ Flags workplace‑inappropriate model outputs: explicit sexual content, profanit
  - **`include_reasoning`** (optional): Whether to include reasoning/explanation fields in the guardrail output (default: `false`)
  - When `false`: The LLM only generates the essential fields (`flagged` and `confidence`), reducing token generation costs
  - When `true`: Additionally, returns detailed reasoning for its decisions
- - **Use Case**: Keep disabled for production to minimize costs; enable for development and debugging
+ - **Performance**: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on model) while maintaining detection performance
+ - **Use Case**: Keep disabled for production to minimize costs and latency; enable for development and debugging

  ### Tuning guidance

docs/ref/checks/off_topic_prompts.md

Lines changed: 2 additions & 1 deletion
@@ -23,7 +23,8 @@ Ensures content stays within defined business scope using LLM analysis. Flags co
  - **`include_reasoning`** (optional): Whether to include reasoning/explanation fields in the guardrail output (default: `false`)
  - When `false`: The LLM only generates the essential fields (`flagged` and `confidence`), reducing token generation costs
  - When `true`: Additionally, returns detailed reasoning for its decisions
- - **Use Case**: Keep disabled for production to minimize costs; enable for development and debugging
+ - **Performance**: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on model) while maintaining detection performance
+ - **Use Case**: Keep disabled for production to minimize costs and latency; enable for development and debugging

  ## Implementation Notes

docs/ref/checks/prompt_injection_detection.md

Lines changed: 2 additions & 1 deletion
@@ -44,7 +44,8 @@ After tool execution, the prompt injection detection check validates that the re
  - **`include_reasoning`** (optional): Whether to include the `observation` and `evidence` fields in the output (default: `false`)
  - When `true`: Returns detailed `observation` explaining what the action is doing and `evidence` with specific quotes/details
  - When `false`: Omits reasoning fields to save tokens (typically 100-300 tokens per check)
- - Recommended: Keep disabled for production (default); enable for development/debugging
+ - **Performance**: In our evaluations, disabling reasoning reduces median latency by 40% on average (ranging from 18% to 67% depending on model) while maintaining detection performance
+ - **Use Case**: Keep disabled for production to minimize costs and latency; enable for development and debugging

  **Flags as MISALIGNED:**
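
Note on the "Performance" line added in each file: the commit does not say how the 40% figure was computed. One plausible reading is a per-model relative drop in median latency, averaged across models. The sketch below shows that arithmetic under this assumption. The model names and latency samples are made up for illustration and are not the evaluation data behind the commit.

```python
from statistics import mean, median


def median_latency_reduction(with_reasoning, without_reasoning):
    """Fractional drop in median latency when reasoning is disabled."""
    baseline = median(with_reasoning)
    return (baseline - median(without_reasoning)) / baseline


# Hypothetical latency samples in milliseconds:
# (reasoning enabled, reasoning disabled) per hypothetical model.
samples = {
    "model_a": ([1000, 1200, 1100], [800, 900, 850]),
    "model_b": ([2000, 2200, 2100], [1000, 1050, 1100]),
}

# Relative reduction per model, then the unweighted average across models.
per_model = {
    name: median_latency_reduction(enabled, disabled)
    for name, (enabled, disabled) in samples.items()
}
average_reduction = mean(per_model.values())

if __name__ == "__main__":
    for name, reduction in per_model.items():
        print(f"{name}: {reduction:.1%}")
    print(f"average: {average_reduction:.1%}")
```

Averaging per-model ratios weights every model equally regardless of its absolute latency, which is consistent with reporting a range "depending on model" alongside the average.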
