fix(ai-amazon-bedrock): emit finish part after metadata event to preserve token usage in streaming #6160
jamesone wants to merge 1 commit into `Effect-TS:main`
## Description

In the Bedrock `ConverseStream` API, events arrive in the order `messageStop` → `metadata`. The `metadata` event carries all token usage data (`inputTokens`, `outputTokens`, `totalTokens`, `cacheReadInputTokens`, `cacheWriteInputTokens`) as well as trace information.

Previously, the streaming handler emitted the finish part during the `messageStop` event, before the `metadata` event had populated the usage object. Because the downstream `LanguageModel` layer decodes stream parts through a schema (creating new instances), the finish part was captured with uninitialized usage values, causing all token counts (including `cachedInputTokens`) to be lost in streaming responses.

## Fix

The fix introduces a `tryEmitFinish` guard that defers the finish part until both `messageStop` (which provides the stop reason) and `metadata` (which provides usage/trace) have been received. This:

- Ensures `cachedInputTokens`, `inputTokens`, `outputTokens`, and `totalTokens` are all correctly populated in streaming responses
- Ensures `trace` and `cacheWriteInputTokens` metadata are present
- Handles event ordering defensively (emits on whichever arrives second)
- Preserves correct behavior when errors interrupt the stream between `messageStop` and `metadata` (no misleading finish part with empty usage)

The non-streaming (`Converse`) path was already correct since it reads directly from the decoded `ConverseResponse`.

## Tests

Adds a comprehensive test suite for `AmazonBedrockLanguageModel` covering both streaming and non-streaming paths: token usage, cached tokens (`cacheReadInputTokens`/`cacheWriteInputTokens`), explicit zero cache counts (`0`, not `undefined`), trace metadata, tool calls, an error between `messageStop` and `metadata` (no misleading finish part), and a missing-`messageStop` edge case (no finish part without a stop reason).
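To illustrate the idea, here is a minimal standalone sketch of the deferred-finish guard. All names (`tryEmitFinish`, `StreamState`, `FinishPart`) and shapes are illustrative assumptions, not the actual `@effect/ai-amazon-bedrock` internals; the point is only that the finish part is emitted once, by whichever of the two events arrives second:

```typescript
interface Usage {
  inputTokens: number
  outputTokens: number
  totalTokens: number
  cachedInputTokens?: number
}

interface FinishPart {
  type: "finish"
  reason: string
  usage: Usage
}

// State accumulated while consuming ConverseStream events.
interface StreamState {
  stopReason?: string // set by the messageStop event
  usage?: Usage       // set by the metadata event
  finished: boolean   // ensures the finish part is emitted at most once
}

// Emit the finish part only once both the stop reason (messageStop) and
// the usage (metadata) are known, regardless of event order.
function tryEmitFinish(state: StreamState, emit: (part: FinishPart) => void): void {
  if (state.finished || state.stopReason === undefined || state.usage === undefined) {
    return
  }
  state.finished = true
  emit({ type: "finish", reason: state.stopReason, usage: state.usage })
}

// Simulate the Bedrock ordering: messageStop arrives before metadata.
const parts: FinishPart[] = []
const state: StreamState = { finished: false }

// messageStop arrives first: nothing is emitted yet, since usage is unknown.
state.stopReason = "end_turn"
tryEmitFinish(state, (p) => parts.push(p))

// metadata arrives second: the finish part is emitted with populated usage.
state.usage = { inputTokens: 10, outputTokens: 5, totalTokens: 15, cachedInputTokens: 4 }
tryEmitFinish(state, (p) => parts.push(p))
```

If the stream errors between `messageStop` and `metadata`, `tryEmitFinish` simply never fires, which matches the PR's goal of avoiding a finish part with empty usage.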
🦋 Changeset detected. Latest commit: 924be6f. The changes in this PR will be included in the next version bump; this PR includes changesets to release 1 package.