fix(openrouter-images): handle per-provider images[i] payload shapes #39

Open
FunJim wants to merge 1 commit into OpenRouterTeam:main from FunJim:fix/openrouter-images-multi-provider-shapes

@FunJim FunJim commented May 6, 2026

Problem

openrouter-images/scripts/generate.ts and edit.ts assume every entry in message.images[] is a string and call .startsWith on it directly. That holds for Google Gemini image models (the current default), but OpenAI image models (e.g. openai/gpt-5.4-image-2) return each entry as an object:

{ "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } }

As a result, invoking the skill against any OpenAI image model crashes with images[i].startsWith is not a function and never saves an output. GET /api/v1/models lists these models as image-capable, so users hit this path whenever they pass --model openai/....
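
For readers who want to see the failure in isolation, here is a minimal sketch of the two entry shapes described above. The base64 payloads are short placeholders, and oldToDataUrl and tryOld are hypothetical names that only mirror the string-only assumption; they are not the scripts' actual code.

```typescript
// Placeholder entries: real payloads are full base64 images.
const geminiEntry: unknown = "iVBORw0KGgo";
const openaiEntry: unknown = {
  type: "image_url",
  image_url: { url: "data:image/png;base64,iVBORw0KGgo" },
};

// Hypothetical stand-in for the pre-fix logic: treats every entry as a string.
function oldToDataUrl(entry: unknown): string {
  const s = entry as string; // unchecked cast, so objects slip through
  return s.startsWith("data:") ? s : `data:image/png;base64,${s}`;
}

// Runs the old logic and reports a thrown error instead of crashing.
function tryOld(entry: unknown): string {
  try {
    return oldToDataUrl(entry);
  } catch (err) {
    return `threw: ${(err as Error).message}`;
  }
}

console.log(tryOld(geminiEntry)); // string shape: yields a data: URL
console.log(tryOld(openaiEntry)); // object shape: reports a TypeError
```

The string entry goes through unchanged, while the object entry fails at the .startsWith call, which matches the crash reported here.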

Fix

Normalize the per-element shape before building the data: URL, with fallbacks for the various wrappers OpenRouter / providers may emit:

const raw = images[i];
const str =
  typeof raw === 'string'
    ? raw
    : raw?.image_url?.url ?? raw?.url ?? raw?.b64_json ?? raw?.data;
const dataUrl = str.startsWith('data:') ? str : `data:image/png;base64,${str}`;

Applied to both generate.ts and edit.ts. Also added a Provider Response Shapes section to SKILL.md so future tooling that consumes the chat-completions image modality response handles the same case.

No behavior change for the existing default model — Gemini still returns a string and follows the original code path.

Verification

End-to-end runs against the modified scripts (this branch):

  • google/gemini-3.1-flash-image-preview (default, string shape) — generate.ts and edit.ts both produce a real image file
  • openai/gpt-5.4-image-2 (object shape, the original bug) — generate.ts now produces a real PNG (1024x1024) instead of crashing
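
The PR does not say how the output was confirmed to be a real PNG. One way such a check could be scripted is sketched below: decode the data: URL, compare the 8-byte PNG signature, and read the width and height from the IHDR chunk. pngDimensions is a hypothetical helper, and the sample is a synthetic 24-byte header, not output from either model.

```typescript
import { Buffer } from "node:buffer";

// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: returns PNG dimensions, or null if the input is not
// a base64 data: URL whose payload begins with a PNG signature and IHDR chunk.
function pngDimensions(dataUrl: string): { width: number; height: number } | null {
  const comma = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || comma === -1) return null;
  const bytes = Buffer.from(dataUrl.slice(comma + 1), "base64");
  if (bytes.length < 24) return null;
  if (!bytes.subarray(0, 8).equals(PNG_SIGNATURE)) return null;
  if (bytes.toString("ascii", 12, 16) !== "IHDR") return null;
  // IHDR stores width and height as big-endian 32-bit integers.
  return { width: bytes.readUInt32BE(16), height: bytes.readUInt32BE(20) };
}

// Synthetic header only (signature + IHDR length/type + 1024x1024), for illustration.
const header = Buffer.alloc(24);
PNG_SIGNATURE.copy(header, 0);
header.writeUInt32BE(13, 8); // IHDR data length
header.write("IHDR", 12, "ascii");
header.writeUInt32BE(1024, 16);
header.writeUInt32BE(1024, 20);
const sample = `data:image/png;base64,${header.toString("base64")}`;

console.log(pngDimensions(sample)); // { width: 1024, height: 1024 }
```

This only inspects the header; it does not validate chunk CRCs or the image data itself.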

Files changed

  • skills/openrouter-images/scripts/generate.ts
  • skills/openrouter-images/scripts/edit.ts
  • skills/openrouter-images/SKILL.md (new Provider Response Shapes section)

Google Gemini returns each entry in message.images[] as a plain string
(raw base64 or a full data: URL), but OpenAI image models return objects
of the form {type:"image_url", image_url:{url:"data:image/png;base64,..."}}.

The previous generate.ts and edit.ts called .startsWith directly on the
element, which threw 'images[i].startsWith is not a function' for any
non-Gemini image-capable model (e.g. openai/gpt-5.4-image-2,
openai/gpt-5-image, openai/gpt-5-image-mini).

Normalize the per-element shape before building the data: URL, with
fallbacks for image_url.url, url, b64_json, and data. Also document the
provider differences in SKILL.md so future tooling can handle the same
case correctly.

Verified end-to-end:
- google/gemini-3.1-flash-image-preview (string shape) — generate + edit
- openai/gpt-5.4-image-2 (object shape) — generate

@perry-the-pr-reviewer perry-the-pr-reviewer Bot left a comment

Test review body — Perry diagnostic check

@perry-the-pr-reviewer perry-the-pr-reviewer Bot left a comment

Perry's Review

Fixes a real crash when using OpenAI image models with the openrouter-images skill by normalizing message.images[i] from a string-or-object union before calling .startsWith. The fallback chain (image_url.url → url → b64_json → data) covers all known provider shapes and exits cleanly on unrecognized ones.

Verdict: 💬 Comments

Details

CI: no required checks configured ✅

Findings:

🟡 Suggestion — SKILL.md documentation snippet (Provider Response Shapes section, const dataUrl line): str can be undefined here. If raw is an object with none of image_url?.url, url, b64_json, or data set, the ?? chain resolves to undefined, and str.startsWith("data:") immediately throws TypeError: Cannot read properties of undefined (reading 'startsWith'). An implementor copying the snippet verbatim would get a cryptic crash instead of a clear error. The actual generate.ts and edit.ts already guard correctly with ?? "" and an explicit process.exit(1) — the snippet should match:

const str =
  typeof raw === "string"
    ? raw
    : raw?.image_url?.url ?? raw?.url ?? raw?.b64_json ?? raw?.data ?? "";
if (!str) {
  console.error("Error: unrecognized image payload shape:", JSON.stringify(raw).slice(0, 300));
  process.exit(1);
}
const dataUrl = str.startsWith("data:") ? str : `data:image/png;base64,${str}`;

  • nit: generate.ts line 55 is missing the as any cast that edit.ts line 68 already has (const raw = images[i] as any). This is a TypeScript strict-mode error only; it has no runtime impact under tsx.

⚠️ Note: GitHub's PR review API does not support inline comments on fork PRs when the reviewer app is not installed on the fork account (FunJim/skills). Findings above are in the review body instead.

Codex (openai/gpt-5.5): skipped — small tier (54 LoC)

Research: OpenRouter docs confirm message.images[] should return {image_url:{url:"data:..."}} objects. Gemini's plain-string deviation is an undocumented provider quirk. The fallback chain is correct and complete for all documented shapes.

Test coverage: no unit tests for the normalization logic — end-to-end verified by the PR author. A normalizeImagePayload() unit test would catch future provider regressions.
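
A sketch of what such a test could look like follows. It assumes the inline logic is extracted into a normalizeImagePayload() function that throws instead of calling process.exit(1); both the extraction and the ImageEntry type are hypothetical and not part of this PR. The "AAAA" payloads are placeholders.

```typescript
// Entry shapes named in this PR; every object field is treated as optional.
type ImageEntry =
  | string
  | {
      type?: string;
      image_url?: { url?: string };
      url?: string;
      b64_json?: string;
      data?: string;
    };

// Hypothetical extraction of the fallback chain; throws on unrecognized shapes.
function normalizeImagePayload(raw: ImageEntry | null | undefined): string {
  const str =
    typeof raw === "string"
      ? raw
      : raw?.image_url?.url ?? raw?.url ?? raw?.b64_json ?? raw?.data ?? "";
  if (!str) {
    const preview = String(JSON.stringify(raw)).slice(0, 300);
    throw new Error(`unrecognized image payload shape: ${preview}`);
  }
  // Bare base64 is assumed to be PNG, as in the PR's snippet.
  return str.startsWith("data:") ? str : `data:image/png;base64,${str}`;
}

// One case per shape in the fallback chain.
const png = "data:image/png;base64,AAAA";
const cases: Array<[ImageEntry, string]> = [
  ["AAAA", png],
  [png, png],
  [{ type: "image_url", image_url: { url: png } }, png],
  [{ url: "data:image/jpeg;base64,AAAA" }, "data:image/jpeg;base64,AAAA"],
  [{ b64_json: "AAAA" }, png],
  [{ data: "AAAA" }, png],
];
for (const [input, expected] of cases) {
  const actual = normalizeImagePayload(input);
  if (actual !== expected) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

// An object with none of the known fields should fail loudly.
let rejected = false;
try {
  normalizeImagePayload({});
} catch {
  rejected = true;
}
if (!rejected) throw new Error("empty object should be rejected");
console.log("all normalization cases passed");
```

Typing the entry as a union also removes the need for the as any cast mentioned in the nit, though that is a design choice rather than something this PR does.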

Unresolved threads: none

Tier: small (54 LoC)
