Title
Gemini fails when a tool returns a top-level JSON array, and the same result can keep breaking later turns via tape replay
Body
## What happened
When using a Gemini model, if a tool returns valid JSON whose top-level value is an array, for example `[]`, Bub fails with:
```text
invalid_input: gemini:... 1 validation error for FunctionResponse
response
  Input should be a valid dictionary [type=dict_type, input_value=[], input_type=list]
```

## How we hit it
In our case:
- Bub received a message from a channel
- Bub called a script through `bash`
- The script successfully returned `[]`
- The same turn then failed with the Gemini `FunctionResponse.response` validation error
So the script/tool execution itself succeeded. The failure happened after that result was fed back into the model flow.
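The mismatch can be illustrated without Gemini at all: the error says `response` must be a dictionary, while the tool output parses to a list. The following is a minimal sketch, not Bub's actual code. `as_response_dict` and the `"result"` wrapper key are hypothetical names chosen for illustration. It shows one way a tool result could be coerced into a dictionary before it reaches a dict-only field.

```python
import json
from typing import Any


def as_response_dict(raw: str) -> dict[str, Any]:
    """Return a dict suitable for a dict-only response field.

    Hypothetical helper: JSON objects pass through unchanged; arrays,
    scalars, and non-JSON text are wrapped under a "result" key.
    """
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # Not JSON at all: keep the raw text as the wrapped value.
        return {"result": raw}
    if isinstance(value, dict):
        return value
    # Top-level array or scalar: wrap it so the outer value is an object.
    return {"result": value}


print(as_response_dict("[]"))            # {'result': []}
print(as_response_dict('{"ok": true}'))  # {'ok': True}
```

Wrapping keeps the original value intact for the model, but whether this belongs in Bub, in the Gemini adapter, or in each tool is a design question this sketch does not answer.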
## Tape evidence
We checked the local tape and found the corresponding entries:
- the user message asking for the task list
- the `bash` tool call
- the tool result recorded as: `"payload": {"results": ["[]"]}`
- then the Gemini validation error on the same turn
This confirms that the failure is triggered by a successful tool result whose content is a top-level JSON array.
## Why it kept happening later
After that result was written into tape, later turns in the same session could fail again, even for unrelated messages such as “hello”.
From our investigation, this happened because the historical `tool_result` was replayed from tape during later turns, so the same invalid payload re-entered the model flow.
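To check whether a session is affected, the tape can be scanned for tool results that parse to non-object JSON. This is a rough sketch under stated assumptions. Only the `"payload": {"results": [...]}` shape comes from the tape excerpt above. The `kind` field, the in-memory list of entries, and the function name `non_object_results` are hypothetical, and the real tape format may differ.

```python
import json
from typing import Any, Iterable


def non_object_results(entries: Iterable[dict[str, Any]]) -> list[tuple[int, str]]:
    """Return (entry index, raw result) for results that parse to non-object JSON."""
    hits: list[tuple[int, str]] = []
    for index, entry in enumerate(entries):
        payload = entry.get("payload")
        if not isinstance(payload, dict):
            continue
        results = payload.get("results")
        if not isinstance(results, list):
            continue
        for raw in results:
            if not isinstance(raw, str):
                continue
            try:
                value = json.loads(raw)
            except json.JSONDecodeError:
                continue  # plain text is not the case reported here
            if not isinstance(value, dict):
                hits.append((index, raw))
    return hits


# Assumed tape shape, for illustration only.
tape = [
    {"kind": "message", "payload": {"content": "show my task list"}},
    {"kind": "tool_result", "payload": {"results": ["[]"]}},
    {"kind": "message", "payload": {"content": "hello"}},
]
print(non_object_results(tape))  # [(1, '[]')]
```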
## What we changed locally
We made two local changes:
- We changed the relevant skill scripts so they no longer emit raw top-level array/scalar JSON to Bub. Instead, they emit labeled text such as: `JSON list response (0 items): []`
- We also made tape replay render old non-object `tool_result` values as safe text, so existing tape data would not keep reproducing the same failure.
After restarting Bub, the same session started working normally again.
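The two local changes can be sketched roughly as follows. This is an illustration of the idea, not the exact patch. The function names are hypothetical, and the scalar label is an assumption, since only the list label `JSON list response (0 items): []` is taken from our scripts.

```python
import json
from typing import Any


def label_json_output(value: Any) -> str:
    """Script side: emit labeled text instead of a raw top-level array/scalar."""
    text = json.dumps(value)
    if isinstance(value, dict):
        return text  # JSON objects are left as they are
    if isinstance(value, list):
        return f"JSON list response ({len(value)} items): {text}"
    # Assumed wording for scalars; the source only shows the list label.
    return f"JSON scalar response: {text}"


def render_replayed_result(raw: str) -> str:
    """Replay side: turn an old non-object tool result into safe text."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # already plain text
    if isinstance(value, dict):
        return raw  # objects are replayed unchanged
    return label_json_output(value)


print(label_json_output([]))                # JSON list response (0 items): []
print(render_replayed_result("[]"))         # JSON list response (0 items): []
print(render_replayed_result('{"id": 1}'))  # {"id": 1}
```

Handling the replay side separately matters because fixing the scripts alone leaves already-recorded `[]` results in the tape.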
## Result after local mitigation
After the local changes:
- Bub was able to answer the “empty task list” case correctly
- later outbound delivery succeeded as well
- the same tape no longer reproduced the error in our environment