fix(openrouter-images): handle per-provider images[i] payload shapes #39
FunJim wants to merge 1 commit into OpenRouterTeam:main
Google Gemini returns each entry in message.images[] as a plain string
(raw base64 or a full data: URL), but OpenAI image models return objects
of the form {type:"image_url", image_url:{url:"data:image/png;base64,..."}}.
The previous generate.ts and edit.ts called .startsWith directly on the
element, which threw 'images[i].startsWith is not a function' for any
non-Gemini image-capable model (e.g. openai/gpt-5.4-image-2,
openai/gpt-5-image, openai/gpt-5-image-mini).
Normalize the per-element shape before building the data: URL, with
fallbacks for image_url.url, url, b64_json, and data. Also document the
provider differences in SKILL.md so future tooling can handle the same
case correctly.
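The mismatch described above can be illustrated with a small sketch. The names `ImageEntry`, `ImageUrlObject`, `oldPrefixCheck`, and `tryOldPath` are hypothetical labels for this example and are not taken from generate.ts or edit.ts; the payload values are placeholders, not real image data.

```typescript
// Hypothetical type names; the two shapes follow the cases described above.
type ImageUrlObject = { type: "image_url"; image_url: { url: string } };
type ImageEntry = string | ImageUrlObject;

// Placeholder payloads (not real image data).
const geminiStyle: ImageEntry = "iVBORw0KGgo";
const openaiStyle: ImageEntry = {
  type: "image_url",
  image_url: { url: "data:image/png;base64,iVBORw0KGgo" },
};

// Mimics the old code path, which treated every entry as a string.
function oldPrefixCheck(entry: ImageEntry): boolean {
  return (entry as any).startsWith("data:");
}

// Runs the old check and reports whether it returned or threw.
function tryOldPath(entry: ImageEntry): string {
  try {
    return `ok: ${oldPrefixCheck(entry)}`;
  } catch (err) {
    return `threw: ${(err as Error).message}`;
  }
}

console.log(tryOldPath(geminiStyle));
console.log(tryOldPath(openaiStyle));
```

The string entry passes through the old check, while the object entry raises a TypeError because objects have no startsWith method.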
Verified end-to-end:
- google/gemini-3.1-flash-image-preview (string shape) — generate + edit
- openai/gpt-5.4-image-2 (object shape) — generate
Perry's Review
Fixes a real crash when using OpenAI image models with the openrouter-images skill by normalizing message.images[i] from a string-or-object union before calling .startsWith. The fallback chain (image_url.url → url → b64_json → data) covers all known provider shapes and exits cleanly on unrecognized ones.
Verdict: 💬 Comments
Details
CI: no required checks configured ✅
Findings:
🟡 Suggestion — SKILL.md documentation snippet (Provider Response Shapes section, const dataUrl line): str can be undefined here. If raw is an object with none of image_url?.url, url, b64_json, or data set, the ?? chain resolves to undefined, and str.startsWith("data:") immediately throws TypeError: Cannot read properties of undefined (reading 'startsWith'). An implementor copying the snippet verbatim would get a cryptic crash instead of a clear error. The actual generate.ts and edit.ts already guard correctly with ?? "" and an explicit process.exit(1) — the snippet should match:
const str =
  typeof raw === "string"
    ? raw
    : raw?.image_url?.url ?? raw?.url ?? raw?.b64_json ?? raw?.data ?? "";
if (!str) {
  console.error("Error: unrecognized image payload shape:", JSON.stringify(raw).slice(0, 300));
  process.exit(1);
}
const dataUrl = str.startsWith("data:") ? str : `data:image/png;base64,${str}`;
- nit: generate.ts line 55 is missing the as any cast that edit.ts line 68 already has (const raw = images[i] as any). TypeScript strict-mode error, no runtime impact with tsx.
⚠️ Note: GitHub's PR review API does not support inline comments on fork PRs when the reviewer app is not installed on the fork account (FunJim/skills). Findings above are in the review body instead.
Codex (openai/gpt-5.5): skipped — small tier (54 LoC)
Research: OpenRouter docs confirm message.images[] should return {image_url:{url:"data:..."}} objects. Gemini's plain-string deviation is an undocumented provider quirk. The fallback chain is correct and complete for all documented shapes.
Test coverage: no unit tests for the normalization logic — end-to-end verified by the PR author. A normalizeImagePayload() unit test would catch future provider regressions.
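A unit test along the lines suggested here might look like the following sketch. It assumes the inline logic is extracted into a function; `normalizeImagePayload` is the name proposed in this review, but its signature and the null return for unrecognized shapes are assumptions made for testability, not code from the PR.

```typescript
import { strictEqual } from "node:assert";

// Hypothetical stand-in for the inline logic in generate.ts and edit.ts.
// Returns null instead of exiting the process so it can be tested.
function normalizeImagePayload(raw: unknown): string | null {
  let str: unknown = raw;
  if (typeof raw === "object" && raw !== null) {
    const obj = raw as Record<string, any>;
    // Same fallback order as the review snippet above.
    str = obj.image_url?.url ?? obj.url ?? obj.b64_json ?? obj.data;
  }
  if (typeof str !== "string" || str.length === 0) return null;
  // Assumes PNG when only raw base64 is supplied.
  return str.startsWith("data:") ? str : `data:image/png;base64,${str}`;
}

// Table of [label, input, expected]; payloads are placeholders.
const cases: Array<[string, unknown, string | null]> = [
  ["plain base64 string", "AAAA", "data:image/png;base64,AAAA"],
  ["data URL string", "data:image/jpeg;base64,BBBB", "data:image/jpeg;base64,BBBB"],
  [
    "image_url object",
    { type: "image_url", image_url: { url: "data:image/png;base64,CCCC" } },
    "data:image/png;base64,CCCC",
  ],
  ["b64_json object", { b64_json: "DDDD" }, "data:image/png;base64,DDDD"],
  ["unknown object", { foo: "bar" }, null],
  ["undefined entry", undefined, null],
];

for (const [label, input, expected] of cases) {
  strictEqual(normalizeImagePayload(input), expected, label);
}
console.log(`${cases.length} cases passed`);
```

A table of cases keeps the cost of covering a new provider shape to one added row.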
Unresolved threads: none
Tier: small (54 LoC)
Problem
openrouter-images/scripts/generate.ts and edit.ts assume every entry in message.images[] is a string and call .startsWith on it directly. That holds for Google Gemini image models (the current default), but OpenAI image models (e.g. openai/gpt-5.4-image-2) return each entry as an object:
{ "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } }
So invoking the skill against any OpenAI image model crashes with
images[i].startsWith is not a function
and never saves an output. This is the same provider listed by GET /api/v1/models as image-capable, so it's a real path users hit when they pass --model openai/....
Fix
Normalize the per-element shape before building the data: URL, with fallbacks for the various wrappers OpenRouter / providers may emit (image_url.url, url, b64_json, and data).
Applied to both generate.ts and edit.ts. Also added a Provider Response Shapes section to SKILL.md so future tooling that consumes the chat-completions image modality response handles the same case.
No behavior change for the existing default model: Gemini still returns a string and follows the original code path.
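As a rough sketch of the normalization described in this section: the helper below follows the fallback order named in the commit message and review. The name `toDataUrl` and the `RawImage` type are hypothetical, and throwing an Error (rather than logging and exiting, as the scripts reportedly do) is a choice made here so the example is reusable outside a CLI.

```typescript
// Hypothetical type covering the shapes discussed above; all fields optional.
type RawImage =
  | string
  | {
      type?: string;
      image_url?: { url?: string };
      url?: string;
      b64_json?: string;
      data?: string;
    };

// Hypothetical helper: returns a data: URL or throws on an unknown shape.
function toDataUrl(raw: RawImage | null | undefined): string {
  const str =
    typeof raw === "string"
      ? raw
      : raw?.image_url?.url ?? raw?.url ?? raw?.b64_json ?? raw?.data ?? "";
  if (!str) {
    // JSON.stringify(undefined) yields undefined at runtime, hence the fallback.
    const preview = (JSON.stringify(raw) ?? String(raw)).slice(0, 300);
    throw new Error(`unrecognized image payload shape: ${preview}`);
  }
  // Assumes PNG when only raw base64 is supplied.
  return str.startsWith("data:") ? str : `data:image/png;base64,${str}`;
}

// Usage with placeholder payloads for the string and object shapes.
console.log(toDataUrl("AAAA"));
console.log(
  toDataUrl({ type: "image_url", image_url: { url: "data:image/png;base64,BBBB" } }),
);
```

String entries skip the fallback chain entirely, which matches the statement that the default Gemini path is unchanged.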
Verification
End-to-end runs against the modified scripts (this branch):
- google/gemini-3.1-flash-image-preview (default, string shape): generate.ts and edit.ts both produce a real image file
- openai/gpt-5.4-image-2 (object shape, the original bug): generate.ts now produces a real PNG (1024x1024) instead of crashing
Files changed
- skills/openrouter-images/scripts/generate.ts
- skills/openrouter-images/scripts/edit.ts
- skills/openrouter-images/SKILL.md (new Provider Response Shapes section)