
Conversation

@apascal07 (Collaborator) commented Dec 7, 2025

Originally, formatters were implemented by modifying the model response message in place, which differed from the JS implementation. This was done in error: that approach could not handle non-JSON formats (e.g. ModelResponse.Output() assumed the content was JSON, and since no formatter was attached to the type, there was no way to hook into format-specific logic).

In this PR, we introduce a new interface called StreamingFormatHandler which, despite the name, is the general format handler interface. It defines ParseOutput(*Message) and ParseChunk(*ModelResponseChunk), which are used when the user calls ModelResponse.Output(), ModelResponseChunk.Output(), or the convenience function OutputFrom[T any](), without modifying content in place.
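For context, a rough sketch of the interface shape described above; the method names and parameter types come from this description, while the return types and doc comments are assumptions rather than the exact definition in the diff:

// Sketch only: return types are assumed; the real definition lives in the
// Go ai package alongside Message and ModelResponseChunk.
type StreamingFormatHandler interface {
	// ParseOutput parses the final model message without mutating it.
	ParseOutput(m *Message) (*Message, error)
	// ParseChunk parses a streamed chunk incrementally, also without
	// modifying the chunk's content in place.
	ParseChunk(c *ModelResponseChunk) (*ModelResponseChunk, error)
}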

@apascal07 apascal07 changed the title from "feat(go): Refactored formatters + added support for formatting streams." to "feat(go): refactored formatters + added support for formatting streams" Dec 7, 2025
@github-actions github-actions bot added the go label Dec 7, 2025
@apascal07 apascal07 marked this pull request as ready for review December 12, 2025 22:43
@apascal07 apascal07 requested a review from pavelgj December 12, 2025 22:43
@pavelgj (Collaborator) commented Dec 16, 2025

@gemini-code-assist

/gemini review

@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request is a significant refactoring of the formatters to better support streaming and non-JSON formats. It introduces a new StreamingFormatHandler interface, which is a great improvement for handling streaming data parsing without in-place modifications. The changes are well-implemented across the various formatters (text, json, jsonl, array, enum). The addition of comprehensive unit tests for each formatter is also a major plus.

I have a couple of suggestions for improvement:

  • For performance, strings.Builder should be used for accumulating text in streaming chunks across all formatters.
  • For consistency with the new design and other formatters, the legacy ParseMessage method in jsonHandler and enumHandler should be simplified to a no-op, as the parsing logic is now handled by the new ParseOutput and ParseChunk methods.

newParts = append(newParts, NewJSONPart(line))
for _, part := range chunk.Content {
	if part.IsText() {
		a.accumulatedText += part.Text

medium

For performance, it's better to use a strings.Builder for accumulating text from chunks instead of string concatenation with +=, which allocates a new string on every append.

To implement this, you could change the accumulatedText field in the arrayHandler struct to strings.Builder.

This would require a few adjustments:

  1. In the struct definition: accumulatedText strings.Builder
  2. When resetting: a.accumulatedText.Reset() instead of a.accumulatedText = ""
  3. Here: a.accumulatedText.WriteString(part.Text)
  4. When reading: base.ExtractItems(a.accumulatedText.String(), a.cursor)

This advice applies to the other format handlers (enumHandler, jsonHandler, jsonlHandler, textHandler) as well.
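A minimal sketch of what this could look like for arrayHandler; the field and helper names are taken from the snippet above, and everything else is assumed rather than copied from this PR:

type arrayHandler struct {
	accumulatedText strings.Builder // previously a plain string appended to with +=
	cursor          int
}

// In ParseChunk, accumulate streamed text without a new allocation per append:
//
//	if part.IsText() {
//		a.accumulatedText.WriteString(part.Text)
//	}
//
// Read via base.ExtractItems(a.accumulatedText.String(), a.cursor) and reset
// with a.accumulatedText.Reset() instead of a.accumulatedText = "".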

Comment on lines +109 to 152
func (j *jsonHandler) ParseMessage(m *Message) (*Message, error) {
	if m == nil {
		return nil, errors.New("message is empty")
	}
	if len(m.Content) == 0 {
		return nil, errors.New("message has no content")
	}

	var nonTextParts []*Part
	accumulatedText := strings.Builder{}

	for _, part := range m.Content {
		if !part.IsText() {
			nonTextParts = append(nonTextParts, part)
		} else {
			accumulatedText.WriteString(part.Text)
		}
	}

	newParts := []*Part{}
	text := base.ExtractJSONFromMarkdown(accumulatedText.String())
	if text != "" {
		if j.config.Schema != nil {
			schemaBytes, err := json.Marshal(j.config.Schema)
			if err != nil {
				return nil, fmt.Errorf("expected schema is not valid: %w", err)
			}
			if err = base.ValidateRaw([]byte(text), schemaBytes); err != nil {
				return nil, err
			}
		} else {
			if !base.ValidJSON(text) {
				return nil, errors.New("message is not a valid JSON")
			}
		}
		newParts = append(newParts, NewJSONPart(text))
	}

	newParts = append(newParts, nonTextParts...)

	m.Content = newParts

	return m, nil
}

medium

This ParseMessage implementation still modifies the message in-place, which seems to contradict the main goal of this refactoring. The PR description states that in-place modification was an error, and the new StreamingFormatHandler interface is meant to provide a better way to parse output.

For consistency with other updated formatters like arrayHandler and jsonlHandler, and to align with the // Legacy: ... comment in formatter.go, this method should probably be a no-op that just returns the message without modification. The actual parsing is now handled by ParseOutput and ParseChunk.

This also applies to enumHandler's ParseMessage implementation.

func (j *jsonHandler) ParseMessage(m *Message) (*Message, error) {
	return m, nil
}
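Presumably enumHandler's ParseMessage would be reduced to the same no-op (a sketch; the receiver name here is assumed):

func (e *enumHandler) ParseMessage(m *Message) (*Message, error) {
	return m, nil
}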
