feat(go): refactored formatters + added support for formatting streams #3905

apascal07 · 2025-12-07T15:01:17Z

Originally, formatters modified the model response message in place, which differed from the JS implementation. This was a mistake: the approach could not handle non-JSON formats. For example, ModelResponse.Output() assumed the content was JSON, and because no formatter was attached to the type, there was no way to hook into format-specific logic.

This PR introduces a new interface called StreamingFormatHandler which, despite the name, is the general format handler interface. It defines ParseOutput(*Message) and ParseChunk(*ModelResponseChunk), which are used when the user calls ModelResponse.Output(), ModelResponseChunk.Output(), or the convenience function OutputFrom[T any](). None of these modify content in place.
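A rough sketch of the shape described above, not the code from this PR. Part, Message, and ModelResponseChunk are simplified stand-ins for the Genkit types, the (any, error) return types are an assumption, and jsonSketchHandler is a hypothetical handler invented for illustration. The point it tries to show is that parsing returns a value and leaves the message content alone.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Part, Message, and ModelResponseChunk are simplified stand-ins for the
// Genkit types of the same name; only the fields needed here are modeled.
type Part struct{ Text string }

type Message struct{ Content []*Part }

type ModelResponseChunk struct{ Content []*Part }

// StreamingFormatHandler mirrors the two methods named in the PR description.
// The (any, error) return types are an assumption made for this sketch.
type StreamingFormatHandler interface {
	ParseOutput(m *Message) (any, error)
	ParseChunk(c *ModelResponseChunk) (any, error)
}

// jsonSketchHandler is a hypothetical handler that decodes JSON text.
type jsonSketchHandler struct {
	accumulated strings.Builder // text seen so far across chunks
}

// joinText concatenates the text of all parts.
func joinText(parts []*Part) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p.Text)
	}
	return sb.String()
}

// ParseOutput decodes the message text and leaves m.Content untouched.
func (h *jsonSketchHandler) ParseOutput(m *Message) (any, error) {
	var out any
	if err := json.Unmarshal([]byte(joinText(m.Content)), &out); err != nil {
		return nil, fmt.Errorf("parse output: %w", err)
	}
	return out, nil
}

// ParseChunk accumulates chunk text and returns a value only once the
// accumulated text decodes as JSON; until then it returns nil.
func (h *jsonSketchHandler) ParseChunk(c *ModelResponseChunk) (any, error) {
	h.accumulated.WriteString(joinText(c.Content))
	var out any
	if err := json.Unmarshal([]byte(h.accumulated.String()), &out); err != nil {
		return nil, nil // incomplete so far; not treated as an error in this sketch
	}
	return out, nil
}

func main() {
	var h StreamingFormatHandler = &jsonSketchHandler{}
	msg := &Message{Content: []*Part{{Text: `{"name":"genkit"}`}}}
	out, err := h.ParseOutput(msg)
	// The parsed value is returned separately; the original text is still there.
	fmt.Println(out, err, msg.Content[0].Text)
}
```

A real handler would likely return partial results for incomplete JSON during streaming; this sketch simply waits until the accumulated text is complete.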

Checklist (if applicable):

PR title is following https://www.conventionalcommits.org/en/v1.0.0/
Tested (manually, unit tested, etc.)
Docs updated (updated docs or a docs bug required)

pavelgj · 2025-12-16T21:23:49Z

@gemini-code-assist

/gemini review

gemini-code-assist

Code Review

This pull request is a significant refactoring of the formatters to better support streaming and non-JSON formats. It introduces a new StreamingFormatHandler interface, which is a great improvement for handling streaming data parsing without in-place modifications. The changes are well-implemented across the various formatters (text, json, jsonl, array, enum). The addition of comprehensive unit tests for each formatter is also a major plus.

I have a couple of suggestions for improvement:

For performance, strings.Builder should be used for accumulating text in streaming chunks across all formatters.
For consistency with the new design and other formatters, the legacy ParseMessage method in jsonHandler and enumHandler should be simplified to a no-op, as the parsing logic is now handled by the new ParseOutput and ParseChunk methods.

gemini-code-assist · 2025-12-16T21:26:09Z

go/ai/format_array.go

-			newParts = append(newParts, NewJSONPart(line))
+	for _, part := range chunk.Content {
+		if part.IsText() {
+			a.accumulatedText += part.Text


For performance, it's better to use strings.Builder for accumulating text from chunks instead of string concatenation with +=. This avoids repeated memory allocations for the string.

To implement this, you could change the accumulatedText field in the arrayHandler struct to strings.Builder.

This would require a few adjustments:

In the struct definition: accumulatedText strings.Builder

When resetting: a.accumulatedText.Reset() instead of a.accumulatedText = ""

Here: a.accumulatedText.WriteString(part.Text)

When reading: base.ExtractItems(a.accumulatedText.String(), a.cursor)

This advice applies to the other format handlers (enumHandler, jsonHandler, jsonlHandler, textHandler) as well.
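The adjustments listed above could look roughly like this. chunkAccumulator is a hypothetical stand-in for a handler's accumulation state, not the arrayHandler from the PR, and the item extraction step is left out because it depends on internal helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkAccumulator is a hypothetical stand-in for the part of a format
// handler that collects streamed text.
type chunkAccumulator struct {
	accumulatedText strings.Builder
}

// Add appends chunk text without reallocating the whole string each time.
func (a *chunkAccumulator) Add(text string) {
	a.accumulatedText.WriteString(text)
}

// Text returns everything accumulated so far.
func (a *chunkAccumulator) Text() string {
	return a.accumulatedText.String()
}

// Reset clears the accumulated text, replacing `a.accumulatedText = ""`.
func (a *chunkAccumulator) Reset() {
	a.accumulatedText.Reset()
}

func main() {
	var acc chunkAccumulator
	for _, chunk := range []string{`[{"id":1},`, `{"id":2}]`} {
		acc.Add(chunk)
	}
	fmt.Println(acc.Text())
	acc.Reset()
	fmt.Println(acc.Text() == "")
}
```

One thing to keep in mind with this change: a strings.Builder must not be copied after it has been written to, so a struct that embeds one should be passed around by pointer, as the pointer receivers here do.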

gemini-code-assist · 2025-12-16T21:26:09Z

go/ai/format_json.go

+func (j *jsonHandler) ParseMessage(m *Message) (*Message, error) {
+	if m == nil {
+		return nil, errors.New("message is empty")
+	}
+	if len(m.Content) == 0 {
+		return nil, errors.New("message has no content")
+	}
+
+	var nonTextParts []*Part
+	accumulatedText := strings.Builder{}
+
+	for _, part := range m.Content {
+		if !part.IsText() {
+			nonTextParts = append(nonTextParts, part)
+		} else {
+			accumulatedText.WriteString(part.Text)
 		}
+	}

-		newParts := []*Part{}
-		text := base.ExtractJSONFromMarkdown(accumulatedText.String())
-		if text != "" {
-			if j.config.Schema != nil {
-				schemaBytes, err := json.Marshal(j.config.Schema)
-				if err != nil {
-					return nil, fmt.Errorf("expected schema is not valid: %w", err)
-				}
-				if err = base.ValidateRaw([]byte(text), schemaBytes); err != nil {
-					return nil, err
-				}
-			} else {
-				if !base.ValidJSON(text) {
-					return nil, errors.New("message is not a valid JSON")
-				}
+	newParts := []*Part{}
+	text := base.ExtractJSONFromMarkdown(accumulatedText.String())
+	if text != "" {
+		if j.config.Schema != nil {
+			schemaBytes, err := json.Marshal(j.config.Schema)
+			if err != nil {
+				return nil, fmt.Errorf("expected schema is not valid: %w", err)
+			}
+			if err = base.ValidateRaw([]byte(text), schemaBytes); err != nil {
+				return nil, err
+			}
+		} else {
+			if !base.ValidJSON(text) {
+				return nil, errors.New("message is not a valid JSON")
 			}
-			newParts = append(newParts, NewJSONPart(text))
 		}
+		newParts = append(newParts, NewJSONPart(text))
+	}

-		newParts = append(newParts, nonTextParts...)
+	newParts = append(newParts, nonTextParts...)

-		m.Content = newParts
-	}
+	m.Content = newParts

 	return m, nil
 }


This ParseMessage implementation still modifies the message in-place, which seems to contradict the main goal of this refactoring. The PR description states that in-place modification was an error, and the new StreamingFormatHandler interface is meant to provide a better way to parse output.

For consistency with other updated formatters like arrayHandler and jsonlHandler, and to align with the // Legacy: ... comment in formatter.go, this method should probably be a no-op that just returns the message without modification. The actual parsing is now handled by ParseOutput and ParseChunk.

This also applies to enumHandler's ParseMessage implementation.

func (j *jsonHandler) ParseMessage(m *Message) (*Message, error) { return m, nil }

Refactored formatters + added support for formatting streams.

9d1d995

github-project-automation bot added this to Genkit Backlog Dec 7, 2025

apascal07 changed the title ~~feat(go): Refactored formatters + added support for formatting streams.~~ feat(go): refactored formatters + added support for formatting streams Dec 7, 2025

github-actions bot added the go label Dec 7, 2025

apascal07 added 3 commits December 7, 2025 12:58

Added V2 of formatters that have legacy behavior.

f445059

Added back backward compatibility for JSONL.

6dc77f1

Removed v2 formats.

f5d9f5c

apascal07 marked this pull request as ready for review December 12, 2025 22:43

apascal07 requested a review from pavelgj December 12, 2025 22:43

apascal07 added 7 commits December 12, 2025 14:51

Merge branch 'main' into ap/go-structured-streaming

3f1e7aa

Update gen.go

af4ecf6

Update extract_test.go

39f9f74

Update formatter_test.go

c48d79c

Update formatter_test.go

72425be

Update gemini_test.go

0e5b472

Rewrote formatter tests.

31f46e7

pavelgj approved these changes Dec 16, 2025

gemini-code-assist bot reviewed Dec 16, 2025
