feat(agent): add stream_final_turn_only parameter to stream_async#2104

Open
zhifanl wants to merge 1 commit into strands-agents:main from zhifanl:feat/stream-final-turn-only

@zhifanl zhifanl commented Apr 9, 2026

Motivation

When using stream_async with tool-using agents, text events from every model turn are yielded to the caller, including intermediate reasoning before tool calls. For production chat UIs and SSE endpoints, this is noise. The only workaround today requires consumers to implement fragile buffering logic that depends on SDK internals such as start_event_loop, raw messageStop events, and the end_turn → tool_use override.

This adds a first-class SDK option to stream only the final answer, eliminating the need for consumer-side buffering.

Resolves: #2055

Public API Changes

Agent.stream_async accepts a new stream_final_turn_only keyword argument:

# Before: consumers receive text from ALL model turns
async for event in agent.stream_async("Analyze this data"):
    if "data" in event:
        yield event["data"]  # Includes intermediate "Let me look that up..." text

# After: consumers receive text only from the final turn
async for event in agent.stream_async("Analyze this data", stream_final_turn_only=True):
    if "data" in event:
        yield event["data"]  # Only final answer tokens

When stream_final_turn_only=True, intermediate turn text events are buffered internally and discarded when the turn ends with tool use. Text from the final turn (where stop_reason == "end_turn") is flushed to both the caller and callback handler. Non-text events (lifecycle, tool use, reasoning, citations, model stream chunks) pass through unchanged regardless of this setting.

The default is False, so the change is fully backward compatible: behavior is unchanged unless callers opt in.

Use Cases

  • Chat applications streaming via SSE where users should only see the final answer
  • API endpoints wrapping agents where downstream consumers expect a single coherent streamed response
  • Any production deployment where intermediate model reasoning is noise for the end user

Related Issues

#2055

Type of Change

New feature

Testing

  • 8 unit tests covering backward compatibility, single/multi-turn scenarios, callback handler behavior, empty final turns, and non-text event passthrough
  • All 408 agent tests pass
  • I ran hatch run prepare

All tests passed.

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly - will update once I gather positive feedback
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed - will update once I gather positive feedback
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Add a stream_final_turn_only parameter to Agent.stream_async that buffers
intermediate turn text events and only yields text from the final model
turn. Non-text events (lifecycle, tool use, reasoning, citations) pass
through unchanged.

Closes strands-agents#2055
Development

Successfully merging this pull request may close these issues.

[FEATURE] Make agent only yield final reponse