-
Notifications
You must be signed in to change notification settings - Fork 10
Support multi-turn for all LLM based guardrails #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR extends multi-turn conversation support from the Jailbreak guardrail to all LLM-based guardrails. It centralizes conversation history handling in llm-base.ts, introduces configurable max_turns and include_reasoning parameters, and refactors existing guardrails to use this unified infrastructure. The changes enable more robust detection across all LLM guardrails while giving users control over token costs through configurable reasoning fields and conversation history limits.
Key Changes:
- Unified multi-turn support infrastructure in
llm-base.tswith automatic conversation history extraction and configurable turn limits - Added
include_reasoningconfig to control whether detailed explanation fields are included in outputs (defaults tofalsefor production cost savings) - Added
max_turnsconfig to limit conversation history size (defaults to 10 turns)
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
src/checks/llm-base.ts |
Core infrastructure: added extractConversationHistory, buildAnalysisPayload, LLMReasoningOutput schema, include_reasoning and max_turns config fields; updated runLLM and createLLMCheckFn to support multi-turn analysis |
src/checks/jailbreak.ts |
Refactored to use createLLMCheckFn instead of custom implementation, removing duplicate conversation history logic |
src/checks/prompt_injection_detection.ts |
Added conditional reasoning field inclusion based on include_reasoning; integrated max_turns parameter for conversation slicing |
src/checks/hallucination-detection.ts |
Added conditional reasoning field inclusion with explicit field listing pattern |
src/checks/user-defined-llm.ts |
Updated to use automatic reasoning handling via createLLMCheckFn |
src/checks/topical-alignment.ts |
Updated to use automatic reasoning handling via createLLMCheckFn |
src/checks/nsfw.ts |
Updated to use automatic reasoning handling via createLLMCheckFn |
src/checks/moderation.ts |
Minor cleanup: removed checked_text field from one error path |
src/__tests__/unit/llm-base.test.ts |
Comprehensive tests for new helper functions, reasoning control, and multi-turn behavior |
src/__tests__/unit/prompt_injection_detection.test.ts |
Tests for include_reasoning and max_turns configuration options |
src/__tests__/unit/checks/jailbreak.test.ts |
Updated tests to reflect refactored implementation using createLLMCheckFn |
src/__tests__/unit/checks/hallucination-detection.test.ts |
New comprehensive test file for reasoning control and error handling |
src/__tests__/unit/checks/user-defined-llm.test.ts |
Updated test to verify include_reasoning functionality |
examples/basic/hello_world.ts |
Demonstrates include_reasoning: true in example configuration |
docs/ref/checks/*.md |
Comprehensive documentation updates across all LLM guardrail docs explaining new include_reasoning and max_turns parameters with consistent performance claims |
.gitignore |
Added PR_READINESS_CHECKLIST.md to ignored files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Codex Review: Didn't find any major issues. More of your lovely PRs please. ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
…v/steven/multi_turn
Extending
llm_base.pyto always use theconversation_historyfromctxto provide the conversation history to all LLM based guardrails. Previously we had the Jailbreak guardrail as a custom multi-turn guardrail.max_turnsin the config to control how much of the conversation is passed to the guardrail, balancing token cost with context