
fix: harden SetModelResponseTool fallback to prevent infinite loops#5091

Open
vietnamesekid wants to merge 1 commit into google:main from vietnamesekid:fix/harden-set-model-response-fallback

Conversation

@vietnamesekid

Summary

This is a follow-up to #5057. It improves the fallback behavior of SetModelResponseTool to avoid infinite loops, especially when flash models (like gemini-2.5-flash and gemini-3-flash) ignore set_model_response and keep calling other tools.

The changes come from the investigation and discussion in #5054.

Changes

  • Stronger instructions for simple types (_output_schema_processor.py): primitive schemas like str and int are easy for models to ignore, so we now give clearer guidance in these cases.
  • Deterministic tool restriction: on the second-to-last round (N-1), we use tool_config to force the model to call only set_model_response, guaranteeing structured output.
  • Hard cutoff: at round N (_MAX_TOOL_ROUNDS=25), we stop execution completely to avoid runaway loops and unnecessary API usage.
  • Early return after success (base_llm_flow.py): once set_model_response succeeds, we skip extra steps like transfer_to_agent.

Test plan

  • All unit tests pass (19/19), including 6 new tests
  • All integration tests pass (4/4), covering BaseModel and str schemas across GOOGLE_AI and Vertex AI
  • No regressions in existing tests
  • Formatting and lint checks pass

Related

Flash models (gemini-2.5-flash, gemini-3-flash) can ignore
set_model_response and loop indefinitely when output_schema is used
with tools. This adds a layered defense:

1. Type-aware instruction: primitive schemas (str, int) get a stronger
   prompt since their trivial tool signature is easily ignored by flash
   models.

2. Deterministic tool_choice guard: on round N-1 (_MAX_TOOL_ROUNDS-1),
   restrict the model to only call set_model_response via tool_config.

3. Hard cutoff: on round N, terminate the invocation entirely to
   prevent runaway API costs.

4. Early return after set_model_response: skip unnecessary
   transfer_to_agent processing in base_llm_flow.py after structured
   output is successfully produced.

Based on analysis by @surfai, @nino-robotfutures-co, and
@surajksharma07 on google#5054.
@adk-bot added the core [Component] label on Apr 1, 2026.
