Parameterize LLM returning reasoning #64
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Pull request overview
This PR adds the ability to parameterize whether LLM-based guardrails return reasoning/explanation fields, allowing users to reduce token generation costs in production while enabling detailed output for development and debugging.
Key Changes
- Added `include_reasoning` field to `LLMConfig` (default: `false`) to control whether reasoning fields are included
- Created `LLMReasoningOutput` class extending `LLMOutput` with a `reason` field (see the sketch after this list)
- Updated guardrail implementations to conditionally use extended or base output models based on configuration
- Updated tests to validate behavior with reasoning enabled/disabled
- Updated documentation to reflect the new optional parameter
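A minimal sketch of how these pieces might fit together (the class and field names come from the PR summary; the base-class contents and the selector helper are assumptions, not the PR's actual code):

```python
from pydantic import BaseModel


class LLMConfig(BaseModel):
    """LLM guardrail configuration (sketch; other fields omitted)."""

    model: str
    include_reasoning: bool = False  # new field: opt in to reasoning output


class LLMOutput(BaseModel):
    """Base output: only the essential fields are generated."""

    flagged: bool
    confidence: float


class LLMReasoningOutput(LLMOutput):
    """Extended output: adds a reason field when reasoning is enabled."""

    reason: str


def select_output_model(config: LLMConfig) -> type[LLMOutput]:
    """Choose the extended model only when reasoning is requested (assumed helper)."""
    return LLMReasoningOutput if config.include_reasoning else LLMOutput
```

Keeping the `reason` field off the base model means the LLM never generates that text unless asked, which is where the token savings come from.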
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/guardrails/checks/text/llm_base.py` | Added `include_reasoning` config field, `LLMReasoningOutput` class, and logic to select the output model |
| `src/guardrails/checks/text/jailbreak.py` | Removed custom `JailbreakLLMOutput` class; now uses base/reasoning models conditionally |
| `src/guardrails/checks/text/hallucination_detection.py` | Updated to conditionally use `HallucinationDetectionOutput` based on reasoning config |
| `src/guardrails/checks/text/prompt_injection_detection.py` | Updated to conditionally use `PromptInjectionDetectionOutput` based on reasoning config |
| `src/guardrails/checks/text/nsfw.py` | Removed explicit `output_model` parameter; now uses default reasoning support |
| `src/guardrails/checks/text/off_topic_prompts.py` | Removed explicit `output_model` parameter; now uses default reasoning support |
| `src/guardrails/checks/text/user_defined_llm.py` | Removed explicit `output_model` parameter; now uses default reasoning support |
| `src/guardrails/evals/core/benchmark_reporter.py` | Fixed file path sanitization for model names containing `/` (sketch below) |
| `tests/unit/checks/test_llm_base.py` | Added tests for reasoning toggle behavior |
| `tests/unit/checks/test_jailbreak.py` | Updated tests to not expect `reason` field by default; added reasoning toggle tests |
| `docs/ref/checks/*.md` | Updated documentation to describe the `include_reasoning` parameter |
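As a rough illustration of that sanitization fix (a sketch only; the actual helper in `benchmark_reporter.py` is not shown in this thread, and the function name here is hypothetical):

```python
def safe_model_filename(model_name: str) -> str:
    """Make a model name such as "org/model" safe to embed in a file path."""
    return model_name.replace("/", "_")


# e.g. safe_model_filename("meta-llama/Llama-3-8B") -> "meta-llama_Llama-3-8B"
```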
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.
@codex review
Pull request overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.
💡 Codex Review
Here are some automated review suggestions for this pull request.
@codex review
💡 Codex Review
openai-guardrails-python/src/guardrails/checks/text/llm_base.py
Lines 389 to 393 in 2d42617
```python
return (
    output_model(
        flagged=False,
        confidence=0.0,
    ),
```
When include_reasoning=True, guardrails using create_llm_check_fn pass LLMReasoningOutput (which requires a reason) into run_llm. In the empty-response fallback here, the code instantiates output_model(flagged=False, confidence=0.0) without supplying the required reasoning field, so a blank/filtered completion will raise a validation error and fall into the generic error path instead of returning a benign unflagged result (the previous behaviour). Provide a default reason or skip validation in this branch so reasoning-enabled guardrails gracefully handle empty outputs.
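For illustration, one shape such a fix could take, building on the classes sketched under Key Changes above (an assumption, not necessarily the fix the PR landed):

```python
# Empty-response fallback (sketch): supply the required `reason` field
# when the reasoning-enabled model is in use, so pydantic validation
# cannot fail on a blank or filtered completion.
defaults: dict[str, object] = {"flagged": False, "confidence": 0.0}
if issubclass(output_model, LLMReasoningOutput):
    defaults["reason"] = ""  # benign default; nothing to explain
fallback = output_model(**defaults)
```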
@codex review
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
Codex Review: Didn't find any major issues. Hooray!
gabor-openai left a comment
LGTM thank you!
Allow users to toggle `reason` on and off for the LLM-based guardrails via the config file (example sketch below):

- `include_reasoning` (optional): Whether to include reasoning/explanation fields in the guardrail output (default: `false`)
  - `false`: The LLM only generates the essential fields (`flagged` and `confidence`), reducing token generation costs
  - `true`: Additionally returns detailed reasoning for its decisions

Updated docs and tests.
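For example, a hypothetical guardrail config entry (only `include_reasoning` and its default come from this PR; the surrounding structure, guardrail name, and model value are assumptions for illustration):

```python
# Hypothetical config entry for an LLM-based guardrail check.
guardrail_entry = {
    "name": "Jailbreak",
    "config": {
        "model": "gpt-4.1-mini",
        "include_reasoning": True,  # leave at False (default) to save tokens in prod
    },
}
```

With `include_reasoning` left at its default, the guardrail output carries only `flagged` and `confidence`.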