Skip to content

Conversation

@ilopezluna
Copy link
Contributor

Adds support for multimodal messages in Ollama API

@ilopezluna ilopezluna requested a review from a team December 10, 2025 13:48
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Ollama API's message handling by introducing robust support for multimodal messages. It enables the system to process and correctly format messages that include both textual content and image data, converting them into an OpenAI-compatible structure. A crucial part of this change is the intelligent handling of image data URIs, ensuring that images are always correctly prefixed for downstream processing, even when the input format varies from sources like OpenWebUI.

Highlights

  • Multimodal Message Support: Implemented the ability to send messages containing both text and images, aligning with OpenAI's multimodal message format by structuring the 'content' field as an array of text and image objects.
  • Image Data URI Handling: Introduced a new utility function, 'ensureDataURIPrefix', to automatically prepend 'data:image/jpeg;base64,' to raw base64 image data, ensuring compatibility with 'llama.cpp' and correctly handling various URI schemes (data:, http://, https://).
  • Conversion Logic Update: Modified the 'convertMessages' function to transform Ollama's message structure (which previously used a separate 'Images' field) into the OpenAI-compatible 'content' array format for multimodal inputs, deprecating the old 'images' field handling.
  • Comprehensive Testing: Added a new test file, 'http_handler_test.go', with extensive unit tests covering various multimodal scenarios, raw base64 handling, and preservation of message order to ensure the robustness of the new features.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for multimodal messages by handling image data URIs, which is a great feature. The implementation correctly converts messages to the OpenAI multimodal format and includes comprehensive tests for various scenarios.

My review includes a few suggestions to improve the implementation:

  • In ensureDataURIPrefix, I've suggested a way to make image type detection more robust instead of hardcoding image/jpeg.
  • In convertMessages, I've recommended pre-allocating a slice to improve performance.
  • For the new tests, I've proposed a more robust way to compare JSON objects to avoid failures from formatting or key order differences.

Overall, this is a solid contribution. Addressing these points will make the code more resilient and performant.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • The ensureDataURIPrefix helper assumes all raw base64 should be treated as image/jpeg; if other formats are possible from clients, consider either detecting the MIME type or making the default configurable to avoid mislabeling non-JPEG data.
  • The tests in TestConvertMessages_Multimodal compare JSON via exact string equality on map-encoded objects, which is fragile due to nondeterministic map key order; consider unmarshaling both expected and actual JSON into structs or map[string]interface{} and comparing structurally instead.
  • In ensureDataURIPrefix, you may want to strings.TrimSpace the input before checking prefixes so that leading/trailing whitespace from UIs does not prevent correct detection of an existing scheme.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `ensureDataURIPrefix` helper assumes all raw base64 should be treated as `image/jpeg`; if other formats are possible from clients, consider either detecting the MIME type or making the default configurable to avoid mislabeling non-JPEG data.
- The tests in `TestConvertMessages_Multimodal` compare JSON via exact string equality on map-encoded objects, which is fragile due to nondeterministic map key order; consider unmarshaling both expected and actual JSON into structs or `map[string]interface{}` and comparing structurally instead.
- In `ensureDataURIPrefix`, you may want to `strings.TrimSpace` the input before checking prefixes so that leading/trailing whitespace from UIs does not prevent correct detection of an existing scheme.

## Individual Comments

### Comment 1
<location> `pkg/ollama/http_handler.go:736-737` </location>
<code_context>
+		return imageData
+	}
+
+	// Assume raw base64 data - add data URI prefix
+	return "data:image/jpeg;base64," + imageData
+}
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Hardcoding the MIME type to image/jpeg may be too restrictive for non-JPEG images.

This treats all raw base64 input as `image/jpeg`. If upstream may send PNG/WebP or other formats, the data URI will be incorrect. Consider making the MIME type configurable (e.g., from a parameter or `msg` metadata) or otherwise centralizing/documenting this assumption so it’s easy to change if additional formats are supported.

Suggested implementation:

```golang
 // ensureDataURIPrefix ensures that image data has a proper data URI prefix.
 // OpenWebUI may send raw base64 data without prefix, but llama.cpp requires it.
 // This function:
 // - Returns data as-is if it already starts with "data:", "http://", or "https://"
-// - Prepends "data:image/jpeg;base64," to raw base64 strings
-func ensureDataURIPrefix(imageData string) string {
+// - Prepends "data:<mimeType>;base64," to raw base64 strings
+//   The mimeType parameter is used for the data URI; if empty, "image/jpeg" is assumed.
+func ensureDataURIPrefix(imageData string, mimeType string) string {
 	// Check if already has a URI scheme
 	if strings.HasPrefix(imageData, "data:") ||
 		strings.HasPrefix(imageData, "http://") ||
 		strings.HasPrefix(imageData, "https://") {
 		return imageData
 	}
+
+	// Default to image/jpeg if no MIME type was provided
+	if mimeType == "" {
+		mimeType = "image/jpeg"
+	}

-	// Assume raw base64 data - add data URI prefix
-	return "data:image/jpeg;base64," + imageData
+	// Assume raw base64 data - add data URI prefix using the provided MIME type
+	return "data:" + mimeType + ";base64," + imageData
 }

```

1. All call sites of `ensureDataURIPrefix` must be updated from `ensureDataURIPrefix(imageData)` to `ensureDataURIPrefix(imageData, mimeType)` (or `ensureDataURIPrefix(imageData, "")` to keep the existing default JPEG behavior).
2. Where possible, derive `mimeType` from upstream metadata (e.g., message fields like `image_mime_type`, HTTP headers such as `Content-Type`, or other existing structures) so that PNG/WebP/etc. are passed through correctly.
3. If there is a central place where image messages are handled, consider defining a small helper that encapsulates how the MIME type is resolved (e.g., `resolveImageMimeType(msg)`), and use that helper when calling `ensureDataURIPrefix`.
</issue_to_address>

### Comment 2
<location> `pkg/ollama/http_handler_test.go:106-107` </location>
<code_context>
+				t.Fatalf("Failed to marshal result: %v", err)
+			}
+
+			// Compare JSON strings
+			if string(resultJSON) != tt.expected {
+				t.Errorf("convertMessages() mismatch\nGot:      %s\nExpected: %s", string(resultJSON), tt.expected)
+			}
</code_context>

<issue_to_address>
**suggestion (testing):** Avoid brittle JSON string comparison in table test

Comparing raw JSON strings here is brittle for `map[string]interface{}` because key order isn’t guaranteed and may vary by Go version/implementation. Instead, unmarshal both `resultJSON` and `tt.expected` into a structured type (e.g. `[]map[string]interface{}` or `[]any`) and compare with `reflect.DeepEqual` or a cmp helper so the test only fails on real semantic differences, not ordering or formatting.

Suggested implementation:

```golang
			// Marshal to JSON for comparison
			resultJSON, err := json.Marshal(result)
			if err != nil {
				t.Fatalf("Failed to marshal result: %v", err)
			}

			// Unmarshal both actual and expected into structured values to avoid brittle
			// string comparison that depends on JSON key ordering or formatting.
			var resultData []map[string]interface{}
			if err := json.Unmarshal(resultJSON, &resultData); err != nil {
				t.Fatalf("Failed to unmarshal result JSON: %v", err)
			}

			var expectedData []map[string]interface{}
			if err := json.Unmarshal([]byte(tt.expected), &expectedData); err != nil {
				t.Fatalf("Failed to unmarshal expected JSON: %v", err)
			}

			// Compare structured values so the test only fails on semantic differences.
			if !reflect.DeepEqual(resultData, expectedData) {
				t.Errorf("convertMessages() mismatch\nGot:      %s\nExpected: %s", string(resultJSON), tt.expected)
			}

```

1. Ensure the file imports the `reflect` package at the top:
   - Add `reflect` to the existing import block, e.g.:
     `import ( ... "encoding/json" "reflect" ... )`
2. If the `expected` field in the table tests is not JSON representing a `[]map[string]interface{}`, adjust the `resultData`/`expectedData` types (e.g. to `[]any` or a concrete struct) to match the actual shape of `convertMessages` output and the expected JSON.
</issue_to_address>

### Comment 3
<location> `pkg/ollama/http_handler_test.go:91` </location>
<code_context>
+		{
+			name: "assistant message with tool calls",
</code_context>

<issue_to_address>
**suggestion (testing):** Add coverage for messages with `tool_call_id` (tool result messages)

Since `convertMessages` handles `ToolCalls` and `ToolCallID` separately, please add a test case for a tool result message (e.g., `Role` = "tool" with `ToolCallID` set) and assert that the output map includes the correct `"tool_call_id"`. This will exercise the tool-result path and protect against regressions.
</issue_to_address>

### Comment 4
<location> `pkg/ollama/http_handler_test.go:114` </location>
<code_context>
+	}
+}
+
+func TestEnsureDataURIPrefix(t *testing.T) {
+	tests := []struct {
+		name     string
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding an empty-string case for `ensureDataURIPrefix`

You already cover prefixed data URIs and http/https URLs. Please also add a case for an empty-string input, which currently becomes just the `"data:image/jpeg;base64,"` prefix, so this behavior is explicitly tested and guarded against future changes.

Suggested implementation:

```golang
func TestEnsureDataURIPrefix(t *testing.T) {
	tests := []struct {
		name     string
		input    string
		expected string
	}{
		{
			name:     "jpeg data uri",
			input:    "data:image/jpeg;base64,abc123",
			expected: "data:image/jpeg;base64,abc123",
		},
		{
			name:     "http url",
			input:    "http://example.com/image.jpg",
			expected: "http://example.com/image.jpg",
		},
		{
			name:     "https url",
			input:    "https://example.com/image.jpg",
			expected: "https://example.com/image.jpg",
		},
		{
			name:     "raw base64 without prefix",
			input:    "abc123",
			expected: "data:image/jpeg;base64,abc123",
		},
		{
			name:     "empty string",
			input:    "",
			// Current behavior: empty input becomes just the data URI prefix.
			expected: "data:image/jpeg;base64,",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			result := ensureDataURIPrefix(tt.input)
			if result != tt.expected {
				t.Errorf("ensureDataURIPrefix(%q) = %q, want %q", tt.input, result, tt.expected)
			}
		})
	}
}

```

If the concrete test cases or field names in `TestEnsureDataURIPrefix` differ from what I inferred, adjust the SEARCH block to match the existing tests slice and insert the `empty string` case in the same style:

- Keep `name`, `input`, and `expected` field names consistent.
- Ensure the expected value for the empty-string input is `"data:image/jpeg;base64,"`, reflecting the current behavior you want to guard.
</issue_to_address>

### Comment 5
<location> `pkg/ollama/http_handler_test.go:157-166` </location>
<code_context>
+func TestConvertMessages_PreservesOrder(t *testing.T) {
</code_context>

<issue_to_address>
**suggestion (testing):** Strengthen multimodal assertions in `TestConvertMessages_PreservesOrder`

To improve robustness, also assert the structure of the two content parts (e.g., first has `type == "text"` with the expected text, second has `type == "image_url"` with the expected URL) so the test validates the multimodal `content` array contents, not just its length.

Suggested implementation:

```golang
func TestConvertMessages_PreservesOrder(t *testing.T) {
	messages := []Message{
		{Role: "system", Content: "You are a helpful assistant"},
		{Role: "user", Content: "Hello"},
		{Role: "assistant", Content: "Hi there!"},
		{Role: "user", Content: "What's in this image?", Images: []string{"data:image/jpeg;base64,abc123"}},
	}

	result := convertMessages(messages)

	if len(result) != 4 {
		t.Fatalf("expected 4 converted messages, got %d", len(result))
	}

	// Verify roles are preserved in order
	expectedRoles := []string{"system", "user", "assistant", "user"}
	for i, role := range expectedRoles {
		if result[i].Role != role {
			t.Fatalf("message %d role mismatch: got %q, want %q", i, result[i].Role, role)
		}
	}

	// Verify multimodal content structure for the last user message
	last := result[3]

	// We expect a multimodal content array with two parts: text then image_url
	if len(last.Content) != 2 {
		t.Fatalf("expected last message to have 2 content parts, got %d", len(last.Content))
	}

	// First part: text
	textPart, ok := last.Content[0].(openai.ChatCompletionMessageContentPartText)
	if !ok {
		t.Fatalf("expected first content part to be text, got %T", last.Content[0])
	}
	if textPart.Type != "text" {
		t.Fatalf("expected first content part type %q, got %q", "text", textPart.Type)
	}
	if textPart.Text != "What's in this image?" {
		t.Fatalf("unexpected text content: got %q, want %q", textPart.Text, "What's in this image?")
	}

	// Second part: image_url
	imagePart, ok := last.Content[1].(openai.ChatCompletionMessageContentPartImage)
	if !ok {
		t.Fatalf("expected second content part to be image_url, got %T", last.Content[1])
	}
	if imagePart.Type != "image_url" {
		t.Fatalf("expected second content part type %q, got %q", "image_url", imagePart.Type)
	}
	if imagePart.ImageURL.URL != "data:image/jpeg;base64,abc123" {
		t.Fatalf("unexpected image_url content URL: got %q, want %q", imagePart.ImageURL.URL, "data:image/jpeg;base64,abc123")
	}

```

1. This change assumes:
   - `result` is a slice of a struct with a `Role string` field and a `Content []interface{}` (or similar) field.
   - `Content` elements are concrete types `openai.ChatCompletionMessageContentPartText` and `openai.ChatCompletionMessageContentPartImage` from the `github.com/sashabaranov/go-openai` package, with fields:
     - `Type string`
     - `Text string` on the text part
     - `ImageURL.URL string` on the image part.
2. If your actual types differ (e.g., `[]openai.ChatCompletionMessagePart` with a `Type` discriminator instead of Go interfaces), adjust the type assertions and field accesses accordingly:
   - Replace the type assertions with direct indexing (e.g. `last.Content[0].Type == openai.ChatMessagePartTypeText`).
   - Replace `imagePart.ImageURL.URL` with the correct field path used in your code.
3. Ensure the test file imports the `openai` package if it is not already imported:
   - Add `openai "github.com/sashabaranov/go-openai"` (or the correct alias/path) to the imports of `pkg/ollama/http_handler_test.go`.
</issue_to_address>

### Comment 6
<location> `pkg/ollama/http_handler_test.go:69` </location>
<code_context>
+		{
+			name: "multimodal message with raw base64 from OpenWebUI (no prefix)",
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a case for multiple raw base64 images without prefix

You already cover a single raw base64 image (no prefix) being normalized via `ensureDataURIPrefix` in `convertMessages`. Since this also runs for multiple images, please add a test where `Images` contains at least two raw base64 strings without prefixes, and assert that each resulting `image_url.url` is correctly prefixed. This helps catch any off‑by‑one or loop issues in the multimodal path.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +106 to +107
// Compare JSON strings
if string(resultJSON) != tt.expected {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Avoid brittle JSON string comparison in table test

Comparing raw JSON strings here is brittle for map[string]interface{} because key order isn’t guaranteed and may vary by Go version/implementation. Instead, unmarshal both resultJSON and tt.expected into a structured type (e.g. []map[string]interface{} or []any) and compare with reflect.DeepEqual or a cmp helper so the test only fails on real semantic differences, not ordering or formatting.

Suggested implementation:

			// Marshal to JSON for comparison
			resultJSON, err := json.Marshal(result)
			if err != nil {
				t.Fatalf("Failed to marshal result: %v", err)
			}

			// Unmarshal both actual and expected into structured values to avoid brittle
			// string comparison that depends on JSON key ordering or formatting.
			var resultData []map[string]interface{}
			if err := json.Unmarshal(resultJSON, &resultData); err != nil {
				t.Fatalf("Failed to unmarshal result JSON: %v", err)
			}

			var expectedData []map[string]interface{}
			if err := json.Unmarshal([]byte(tt.expected), &expectedData); err != nil {
				t.Fatalf("Failed to unmarshal expected JSON: %v", err)
			}

			// Compare structured values so the test only fails on semantic differences.
			if !reflect.DeepEqual(resultData, expectedData) {
				t.Errorf("convertMessages() mismatch\nGot:      %s\nExpected: %s", string(resultJSON), tt.expected)
			}
  1. Ensure the file imports the reflect package at the top:
    • Add reflect to the existing import block, e.g.:
      import ( ... "encoding/json" "reflect" ... )
  2. If the expected field in the table tests is not JSON representing a []map[string]interface{}, adjust the resultData/expectedData types (e.g. to []any or a concrete struct) to match the actual shape of convertMessages output and the expected JSON.

},
},
// The tool_calls will have arguments converted to JSON string
// Note: JSON field order follows struct definition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add coverage for messages with tool_call_id (tool result messages)

Since convertMessages handles ToolCalls and ToolCallID separately, please add a test case for a tool result message (e.g., Role = "tool" with ToolCallID set) and assert that the output map includes the correct "tool_call_id". This will exercise the tool-result path and protect against regressions.

}
}

func TestEnsureDataURIPrefix(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding an empty-string case for ensureDataURIPrefix

You already cover prefixed data URIs and http/https URLs. Please also add a case for an empty-string input, which currently becomes just the "data:image/jpeg;base64," prefix, so this behavior is explicitly tested and guarded against future changes.

Suggested implementation:

func TestEnsureDataURIPrefix(t *testing.T) {
	tests := []struct {
		name     string
		input    string
		expected string
	}{
		{
			name:     "jpeg data uri",
			input:    "data:image/jpeg;base64,abc123",
			expected: "data:image/jpeg;base64,abc123",
		},
		{
			name:     "http url",
			input:    "http://example.com/image.jpg",
			expected: "http://example.com/image.jpg",
		},
		{
			name:     "https url",
			input:    "https://example.com/image.jpg",
			expected: "https://example.com/image.jpg",
		},
		{
			name:     "raw base64 without prefix",
			input:    "abc123",
			expected: "data:image/jpeg;base64,abc123",
		},
		{
			name:     "empty string",
			input:    "",
			// Current behavior: empty input becomes just the data URI prefix.
			expected: "data:image/jpeg;base64,",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			result := ensureDataURIPrefix(tt.input)
			if result != tt.expected {
				t.Errorf("ensureDataURIPrefix(%q) = %q, want %q", tt.input, result, tt.expected)
			}
		})
	}
}

If the concrete test cases or field names in TestEnsureDataURIPrefix differ from what I inferred, adjust the SEARCH block to match the existing tests slice and insert the empty string case in the same style:

  • Keep name, input, and expected field names consistent.
  • Ensure the expected value for the empty-string input is "data:image/jpeg;base64,", reflecting the current behavior you want to guard.

Comment on lines +157 to +166
func TestConvertMessages_PreservesOrder(t *testing.T) {
messages := []Message{
{Role: "system", Content: "You are a helpful assistant"},
{Role: "user", Content: "Hello"},
{Role: "assistant", Content: "Hi there!"},
{Role: "user", Content: "What's in this image?", Images: []string{"data:image/jpeg;base64,abc123"}},
}

result := convertMessages(messages)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Strengthen multimodal assertions in TestConvertMessages_PreservesOrder

To improve robustness, also assert the structure of the two content parts (e.g., first has type == "text" with the expected text, second has type == "image_url" with the expected URL) so the test validates the multimodal content array contents, not just its length.

Suggested implementation:

func TestConvertMessages_PreservesOrder(t *testing.T) {
	messages := []Message{
		{Role: "system", Content: "You are a helpful assistant"},
		{Role: "user", Content: "Hello"},
		{Role: "assistant", Content: "Hi there!"},
		{Role: "user", Content: "What's in this image?", Images: []string{"data:image/jpeg;base64,abc123"}},
	}

	result := convertMessages(messages)

	if len(result) != 4 {
		t.Fatalf("expected 4 converted messages, got %d", len(result))
	}

	// Verify roles are preserved in order
	expectedRoles := []string{"system", "user", "assistant", "user"}
	for i, role := range expectedRoles {
		if result[i].Role != role {
			t.Fatalf("message %d role mismatch: got %q, want %q", i, result[i].Role, role)
		}
	}

	// Verify multimodal content structure for the last user message
	last := result[3]

	// We expect a multimodal content array with two parts: text then image_url
	if len(last.Content) != 2 {
		t.Fatalf("expected last message to have 2 content parts, got %d", len(last.Content))
	}

	// First part: text
	textPart, ok := last.Content[0].(openai.ChatCompletionMessageContentPartText)
	if !ok {
		t.Fatalf("expected first content part to be text, got %T", last.Content[0])
	}
	if textPart.Type != "text" {
		t.Fatalf("expected first content part type %q, got %q", "text", textPart.Type)
	}
	if textPart.Text != "What's in this image?" {
		t.Fatalf("unexpected text content: got %q, want %q", textPart.Text, "What's in this image?")
	}

	// Second part: image_url
	imagePart, ok := last.Content[1].(openai.ChatCompletionMessageContentPartImage)
	if !ok {
		t.Fatalf("expected second content part to be image_url, got %T", last.Content[1])
	}
	if imagePart.Type != "image_url" {
		t.Fatalf("expected second content part type %q, got %q", "image_url", imagePart.Type)
	}
	if imagePart.ImageURL.URL != "data:image/jpeg;base64,abc123" {
		t.Fatalf("unexpected image_url content URL: got %q, want %q", imagePart.ImageURL.URL, "data:image/jpeg;base64,abc123")
	}
  1. This change assumes:
    • result is a slice of a struct with a Role string field and a Content []interface{} (or similar) field.
    • Content elements are concrete types openai.ChatCompletionMessageContentPartText and openai.ChatCompletionMessageContentPartImage from the github.com/sashabaranov/go-openai package, with fields:
      • Type string
      • Text string on the text part
      • ImageURL.URL string on the image part.
  2. If your actual types differ (e.g., []openai.ChatCompletionMessagePart with a Type discriminator instead of Go interfaces), adjust the type assertions and field accesses accordingly:
    • Replace the type assertions with direct indexing (e.g. last.Content[0].Type == openai.ChatMessagePartTypeText).
    • Replace imagePart.ImageURL.URL with the correct field path used in your code.
  3. Ensure the test file imports the openai package if it is not already imported:
    • Add openai "github.com/sashabaranov/go-openai" (or the correct alias/path) to the imports of pkg/ollama/http_handler_test.go.

Images: []string{"/9j/4AAQSkZJRgABAQEBLA...."},
},
},
// Should auto-add the data URI prefix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding a case for multiple raw base64 images without prefix

You already cover a single raw base64 image (no prefix) being normalized via ensureDataURIPrefix in convertMessages. Since this also runs for multiple images, please add a test where Images contains at least two raw base64 strings without prefixes, and assert that each resulting image_url.url is correctly prefixed. This helps catch any off‑by‑one or loop issues in the multimodal path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants