Skip to content

fix(autocontext): preserve tool_calls/tool_result pairing during large message offloading#1042

Open
shiyiyue1102 wants to merge 2 commits intoagentscope-ai:mainfrom
shiyiyue1102:main
Open

fix(autocontext): preserve tool_calls/tool_result pairing during large message offloading#1042
shiyiyue1102 wants to merge 2 commits intoagentscope-ai:mainfrom
shiyiyue1102:main

Conversation

@shiyiyue1102
Copy link
Contributor

@shiyiyue1102 shiyiyue1102 commented Mar 25, 2026

In offloadingLargePayload (Strategy 2/3), messages were offloaded as plain TextBlock stubs regardless of role, which caused two bugs:

  1. ASSISTANT messages with ToolUseBlock (tool_calls) were stripped of their ToolUseBlock when offloaded. The subsequent TOOL result messages then had no preceding tool_calls message, triggering DashScope 400: 'messages with role tool must be a response to a preceding message with tool_calls'.

  2. TOOL result messages were never actually compressed because Msg.getTextContent() only extracts top-level TextBlocks and returns empty string for TOOL messages (whose content is a ToolResultBlock).

Fix:

  • Skip ASSISTANT+ToolUseBlock messages entirely; these pairs are handled exclusively by Strategy 1 (summaryToolsMessages).
  • Handle TOOL result messages separately: extract output text from ToolResultBlock.output for size comparison, offload the original message, then rebuild a replacement that preserves the ToolResultBlock shell (id + name) while replacing only the output text with the offload hint. This keeps tool_call_id/name intact for API formatters.

Add three regression tests covering:

  • Strategy 2/3 must not offload ASSISTANT tool-call messages as plain stubs
  • Compressed TOOL result must preserve ToolResultBlock id and name
  • Full conversation integrity: every TOOL result must follow a tool-call assistant

Change-Id: If62a76459d0bdb681cf04fbfd10c7df093b2ab8a
Co-developed-by: Qoder noreply@qoder.com

AgentScope-Java Version

[The version of AgentScope-Java you are working on, e.g. 1.0.9, check your pom.xml dependency version or run mvn dependency:tree | grep agentscope-parent:pom(only mac/linux)]

Description

[Please describe the background, purpose, changes made, and how to test this PR]

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has been formatted with mvn spotless:apply
  • All tests are passing (mvn test)
  • Javadoc comments are complete and follow project conventions
  • Related documentation has been updated (e.g. links, examples, etc.)
  • Code is ready for review

…e payload offloading

In offloadingLargePayload (Strategy 2/3), messages were offloaded as
plain TextBlock stubs regardless of role, which caused two bugs:

1. ASSISTANT messages with ToolUseBlock (tool_calls) were stripped of
   their ToolUseBlock when offloaded. The subsequent TOOL result messages
   then had no preceding tool_calls message, triggering DashScope 400:
   'messages with role tool must be a response to a preceding message
   with tool_calls'.

2. TOOL result messages were never actually compressed because
   Msg.getTextContent() only extracts top-level TextBlocks and returns
   empty string for TOOL messages (whose content is a ToolResultBlock).

Fix:
- Skip ASSISTANT+ToolUseBlock messages entirely; these pairs are handled
  exclusively by Strategy 1 (summaryToolsMessages).
- Handle TOOL result messages separately: extract output text from
  ToolResultBlock.output for size comparison, offload the original
  message, then rebuild a replacement that preserves the ToolResultBlock
  shell (id + name) while replacing only the output text with the
  offload hint. This keeps tool_call_id/name intact for API formatters.

Add three regression tests covering:
- Strategy 2/3 must not offload ASSISTANT tool-call messages as plain stubs
- Compressed TOOL result must preserve ToolResultBlock id and name
- Full conversation integrity: every TOOL result must follow a tool-call assistant

Change-Id: If62a76459d0bdb681cf04fbfd10c7df093b2ab8a
Co-developed-by: Qoder <noreply@qoder.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes AutoContext large-payload offloading (Strategies 2/3) to preserve valid tool-call / tool-result sequencing and to properly offload oversized TOOL result outputs without breaking API formatting constraints.

Changes:

  • Skip offloading ASSISTANT tool-call messages (ToolUseBlock) during large payload offloading to avoid orphaning subsequent TOOL results.
  • Add specialized offloading for TOOL result messages: offload based on ToolResultBlock output size, then rebuild a replacement message that keeps tool_call_id/name intact.
  • Add regression tests for tool-call pairing and ToolResultBlock preservation during offloading.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
agentscope-extensions/agentscope-extensions-autocontext-memory/src/main/java/io/agentscope/core/memory/autocontext/AutoContextMemory.java Updates Strategy 2/3 offloading to avoid stripping tool_calls and to offload TOOL results while preserving ToolResultBlock shell.
agentscope-extensions/agentscope-extensions-autocontext-memory/src/test/java/io/agentscope/core/memory/autocontext/AutoContextMemoryTest.java Adds regression tests around tool_calls/tool_result integrity and ToolResultBlock id/name preservation during offloading.


mem.compressIfNeeded();
List<Msg> messages = mem.getMessages();

Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testToolCallPairingIntegrityAfterMixedOffloading validates the pairing invariant, but it doesn’t assert that any offloading actually occurred in this run. Adding an assertion that the large USER message was replaced with an offload hint / has _compress_meta.offloaduuid would ensure the test exercises Strategy 2/3 (so it can’t pass without covering the offloading path).

Suggested change
// Assert that the large USER message was actually offloaded and replaced
// with an offload hint containing the expected metadata marker.
boolean offloaded =
messages.stream()
.filter(msg -> msg.getRole() == MsgRole.USER)
.map(Object::toString)
.anyMatch(
s ->
s.contains("\"_compress_meta\"")
&& s.contains("offloaduuid"));
assertTrue(
offloaded,
"Expected the large USER message to be offloaded and replaced with an offload "
+ "hint containing _compress_meta.offloaduuid, but no such hint was "
+ "found in the compressed message list.");

Copilot uses AI. Check for mistakes.
ToolResultBlock.of(
originalResult.getId(),
originalResult.getName(),
TextBlock.builder().text(offloadHint).build());
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When rebuilding the compressed ToolResultBlock, the original ToolResultBlock metadata is dropped (e.g., the agentscope_suspended flag and any other tool execution metadata). Preserve originalResult.getMetadata() when creating compressedResult (e.g., use the ToolResultBlock.of(id, name, output, metadata) overload or merge metadata) so downstream consumers don’t lose semantic flags after offloading.

Suggested change
TextBlock.builder().text(offloadHint).build());
TextBlock.builder().text(offloadHint).build(),
originalResult.getMetadata());

Copilot uses AI. Check for mistakes.
Comment on lines +1659 to +1671
// Build a large ASSISTANT tool-use message (> largePayloadThreshold)
String largeInput = "x".repeat(200);
Msg largeToolUseMsg =
Msg.builder()
.role(MsgRole.ASSISTANT)
.name("assistant")
.content(
ToolUseBlock.builder()
.id("call_large")
.name("search")
.input(Map.of("query", largeInput))
.build())
.build();
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testLargePayloadOffloadingSkipsAssistantToolUseMessage doesn’t currently exercise the regression scenario: the tool-call ASSISTANT message contains only a ToolUseBlock, so Msg.getTextContent() is empty and Strategy 2/3 would not offload it even in the buggy implementation. To make this a real regression test, construct a tool-call ASSISTANT message that includes a large top-level TextBlock alongside the ToolUseBlock (as can happen in real messages), and/or assert that Strategy 2/3 actually performed an offload in this run; otherwise this test can pass without validating the fix.

Copilot uses AI. Check for mistakes.
Comment on lines +1769 to +1797

// If the TOOL message was offloaded (compressed), it must still carry ToolResultBlock
// with the original id and name intact.
if (toolResultMsg != null) {
ToolResultBlock block = toolResultMsg.getFirstContentBlock(ToolResultBlock.class);
assertNotNull(
block,
"Compressed TOOL result message must still contain a ToolResultBlock"
+ " (not be degraded to plain TextBlock)");
assertEquals(
"call_tool_id_001",
block.getId(),
"ToolResultBlock id must be preserved after offloading");
assertEquals(
"search",
block.getName(),
"ToolResultBlock name must be preserved after offloading");
// The output should now contain the offload hint
String outputText =
block.getOutput().stream()
.filter(b -> b instanceof TextBlock)
.map(b -> ((TextBlock) b).getText())
.findFirst()
.orElse("");
assertTrue(
outputText.contains("CONTEXT_OFFLOAD"),
"Compressed tool result output should contain offload hint. Got: "
+ outputText);
}
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can silently pass if no TOOL result message is found because the assertions are guarded by if (toolResultMsg != null). Since the scenario always adds a TOOL result, assert toolResultMsg is non-null and (optionally) that compressIfNeeded() actually returned true / the message was replaced, to avoid false positives.

Suggested change
// If the TOOL message was offloaded (compressed), it must still carry ToolResultBlock
// with the original id and name intact.
if (toolResultMsg != null) {
ToolResultBlock block = toolResultMsg.getFirstContentBlock(ToolResultBlock.class);
assertNotNull(
block,
"Compressed TOOL result message must still contain a ToolResultBlock"
+ " (not be degraded to plain TextBlock)");
assertEquals(
"call_tool_id_001",
block.getId(),
"ToolResultBlock id must be preserved after offloading");
assertEquals(
"search",
block.getName(),
"ToolResultBlock name must be preserved after offloading");
// The output should now contain the offload hint
String outputText =
block.getOutput().stream()
.filter(b -> b instanceof TextBlock)
.map(b -> ((TextBlock) b).getText())
.findFirst()
.orElse("");
assertTrue(
outputText.contains("CONTEXT_OFFLOAD"),
"Compressed tool result output should contain offload hint. Got: "
+ outputText);
}
assertNotNull(
toolResultMsg,
"TOOL result message must be present in the compressed history");
// If the TOOL message was offloaded (compressed), it must still carry ToolResultBlock
// with the original id and name intact.
ToolResultBlock block = toolResultMsg.getFirstContentBlock(ToolResultBlock.class);
assertNotNull(
block,
"Compressed TOOL result message must still contain a ToolResultBlock"
+ " (not be degraded to plain TextBlock)");
assertEquals(
"call_tool_id_001",
block.getId(),
"ToolResultBlock id must be preserved after offloading");
assertEquals(
"search",
block.getName(),
"ToolResultBlock name must be preserved after offloading");
// The output should now contain the offload hint
String outputText =
block.getOutput().stream()
.filter(b -> b instanceof TextBlock)
.map(b -> ((TextBlock) b).getText())
.findFirst()
.orElse("");
assertTrue(
outputText.contains("CONTEXT_OFFLOAD"),
"Compressed tool result output should contain offload hint. Got: "
+ outputText);

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 90.90909% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...ope/core/memory/autocontext/AutoContextMemory.java 90.90% 2 Missing and 3 partials ⚠️

📢 Thoughts on this report? Let us know!

…arge tool results

When compressing a large TOOL result message, the replacement
ToolResultBlock was built without the original metadata, silently
dropping semantic flags such as agentscope_suspended and any custom
tool execution metadata written by tool implementations.

Pass originalResult.getMetadata() to ToolResultBlock.of() so all
metadata is carried through to the compressed stub.

Change-Id: I54e37a5b7e1dee6ec2de8462450016f9a1bf1124
Co-developed-by: Qoder <noreply@qoder.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants