diff --git a/.changeset/fal-billable-units-usage.md b/.changeset/fal-billable-units-usage.md new file mode 100644 index 000000000..cce58450e --- /dev/null +++ b/.changeset/fal-billable-units-usage.md @@ -0,0 +1,11 @@ +--- +'@tanstack/ai-event-client': minor +'@tanstack/ai-fal': minor +'@tanstack/ai': minor +--- + +Surface fal's billed units as `result.usage`. The fal adapters now read fal's `x-fal-billable-units` response header off the result fetch and expose the billed quantity (`usage.unitsBilled`) on the generation result, so consumers can compute exact media-generation cost without wrapping `fetch` themselves. + +- `TokenUsage` gains an optional `unitsBilled` field for usage-based (non-token) billing, denominated in the provider's priced unit. +- `falImage`, `falAudio`, `falVideo`, `falSpeech`, and `falTranscription` populate `result.usage.unitsBilled` when fal reports it. +- `VideoUrlResult` gains an optional `usage` slot; `getVideoJobStatus` now emits the `video:usage` event and returns `usage` when the completed result reports billed units. 
diff --git a/docs/config.json b/docs/config.json index f5552703f..e3fc3b712 100644 --- a/docs/config.json +++ b/docs/config.json @@ -242,17 +242,20 @@ { "label": "Audio Generation", "to": "media/audio-generation", - "addedAt": "2026-04-23" + "addedAt": "2026-04-23", + "updatedAt": "2026-06-08" }, { "label": "Image Generation", "to": "media/image-generation", - "addedAt": "2026-04-15" + "addedAt": "2026-04-15", + "updatedAt": "2026-06-08" }, { "label": "Video Generation", "to": "media/video-generation", - "addedAt": "2026-04-15" + "addedAt": "2026-04-15", + "updatedAt": "2026-06-08" }, { "label": "Generation Hooks", diff --git a/docs/media/audio-generation.md b/docs/media/audio-generation.md index 65937ca21..0dc64a405 100644 --- a/docs/media/audio-generation.md +++ b/docs/media/audio-generation.md @@ -118,7 +118,10 @@ interface AudioGenerationResult { duration?: number } // Canonical TokenUsage (same shape as chat), present when the provider - // reports it (e.g. Gemini Lyria via generateContent). + // reports it (e.g. Gemini Lyria via generateContent). Usage-billed providers + // (fal) instead surface `usage.unitsBilled` — the real billed quantity read + // from fal's `x-fal-billable-units` result header. Multiply by the endpoint's + // unit price (fal pricing API) for the exact cost. usage?: TokenUsage } ``` diff --git a/docs/media/image-generation.md b/docs/media/image-generation.md index d8af2e816..18552ae8c 100644 --- a/docs/media/image-generation.md +++ b/docs/media/image-generation.md @@ -203,7 +203,8 @@ interface ImageGenerationResult { images: GeneratedImage[] // Array of generated images // Canonical TokenUsage (same shape as chat). Token-billed models also surface // a per-modality breakdown on `promptTokensDetails` (e.g. text vs image input - // tokens for gpt-image-1). + // tokens for gpt-image-1). Usage-billed providers (fal) instead surface + // `usage.unitsBilled` — see the note below. 
usage?: TokenUsage } @@ -214,6 +215,25 @@ interface GeneratedImage { } ``` +> **Cost tracking (fal):** fal bills by usage-based units rather than tokens. The +> fal image adapter surfaces the real billed quantity as `usage.unitsBilled` +> (read from fal's `x-fal-billable-units` result header). Multiply it by the +> endpoint's unit price from +> `GET https://api.fal.ai/v1/models/pricing?endpoint_id=…` for the exact cost — +> no `fetch` interceptor needed. + +```typescript +const result = await generateImage({ + adapter: falImage('fal-ai/flux/dev'), + prompt: 'a serene mountain lake', +}) + +if (result.usage?.unitsBilled != null) { + const cost = result.usage.unitsBilled * unitPrice // unitPrice from fal pricing API + console.log(`Billed ${result.usage.unitsBilled} units (~$${cost})`) +} +``` + ## Model Availability ### OpenAI Models diff --git a/docs/media/video-generation.md b/docs/media/video-generation.md index 4af93f020..ae325f95b 100644 --- a/docs/media/video-generation.md +++ b/docs/media/video-generation.md @@ -408,7 +408,7 @@ const { jobId } = await generateVideo({ ## Response Types -> **Note:** The interfaces below are the underlying adapter-level types. The `getVideoJobStatus()` helper returns a single merged object, `{ status, progress?, url?, error? }` — it does not return `jobId` or `expiresAt`. +> **Note:** The interfaces below are the underlying adapter-level types. The `getVideoJobStatus()` helper returns a single merged object, `{ status, progress?, url?, error?, usage? }` — it does not return `jobId` or `expiresAt`. ### VideoJobResult (from create) @@ -437,9 +437,20 @@ interface VideoUrlResult { jobId: string url: string // URL to download/stream the video expiresAt?: Date // When the URL expires + // Usage for the completed generation, when the adapter reports it. fal + // populates `usage.unitsBilled` from its `x-fal-billable-units` header. 
+ usage?: TokenUsage } ``` +> **Cost tracking (fal):** fal bills media generation by usage-based units +> rather than tokens. The fal adapters surface the real billed quantity as +> `usage.unitsBilled` (denominated in the endpoint's priced unit). Combine it +> with the endpoint's unit price from +> `GET https://api.fal.ai/v1/models/pricing?endpoint_id=…` to compute the exact +> cost (`unitsBilled * unitPrice`). The same `usage.unitsBilled` is surfaced +> on image, audio, speech, and transcription results. + ## Model Variants | Model | Description | Use Case | diff --git a/examples/ts-react-media/src/components/ImageGenerator.tsx b/examples/ts-react-media/src/components/ImageGenerator.tsx index 484df42c9..3a7c5b0e2 100644 --- a/examples/ts-react-media/src/components/ImageGenerator.tsx +++ b/examples/ts-react-media/src/components/ImageGenerator.tsx @@ -209,13 +209,24 @@ export default function ImageGenerator({ {modelResult.status === 'success' && modelResult.result && modelResult.result.images.length > 0 && ( -
- {`Generated -
+ <> +
+ {`Generated +
+ {modelResult.result.usage?.unitsBilled != null && ( +

+ Billed {modelResult.result.usage.unitsBilled} fal unit + {modelResult.result.usage.unitsBilled === 1 + ? '' + : 's'}{' '} + — multiply by the endpoint unit price for USD cost +

+ )} + )} ) diff --git a/examples/ts-react-media/src/components/VideoGenerator.tsx b/examples/ts-react-media/src/components/VideoGenerator.tsx index 13fd69a8b..712981dfb 100644 --- a/examples/ts-react-media/src/components/VideoGenerator.tsx +++ b/examples/ts-react-media/src/components/VideoGenerator.tsx @@ -20,7 +20,7 @@ type JobState = model: string progress?: number | undefined } - | { status: 'completed'; url: string } + | { status: 'completed'; url: string; unitsBilled?: number } | { status: 'error'; message: string } interface VideoGeneratorProps { @@ -95,7 +95,11 @@ export default function VideoGenerator({ setJobStates((prev) => ({ ...prev, - [model]: { status: 'completed', url: url }, + [model]: { + status: 'completed', + url: url, + unitsBilled: urlResult.usage?.unitsBilled, + }, })) } else if (status.status === 'processing') { setJobStates((prev) => ({ @@ -387,15 +391,24 @@ export default function VideoGenerator({ )} {state.status === 'completed' && ( -
-
+ <> +
+
+ {state.unitsBilled != null && ( +

+ Billed {state.unitsBilled} fal unit + {state.unitsBilled === 1 ? '' : 's'} — multiply by the + endpoint unit price for USD cost +

+ )} + )} ) diff --git a/examples/ts-react-media/src/lib/server-functions.ts b/examples/ts-react-media/src/lib/server-functions.ts index 21029e74a..455226e5d 100644 --- a/examples/ts-react-media/src/lib/server-functions.ts +++ b/examples/ts-react-media/src/lib/server-functions.ts @@ -29,11 +29,17 @@ export const generateImageFn = createServerFn({ method: 'POST' }) }) } case 'xai/grok-imagine-image': { + // NOTE: fal's generated `size` type for this model only offers + // `16:9_1K` / `16:9_4K`, but the live API rejects those resolutions + // ("Input should be '1k' or '2k'") — fal's published enum is out of + // sync with its API, so `'16:9_4K'` type-checks yet 422s at runtime. + // Pass aspect_ratio via modelOptions and let the endpoint pick its + // default resolution, which both type-checks and works at runtime. return generateImage({ adapter: falImage('xai/grok-imagine-image'), prompt: data.prompt, numberOfImages: 1, - size: '16:9_4K', + modelOptions: { aspect_ratio: '16:9' }, }) } case 'fal-ai/flux-2/klein/9b': { diff --git a/packages/ai-event-client/src/index.ts b/packages/ai-event-client/src/index.ts index 59d10d62e..1af14ca87 100644 --- a/packages/ai-event-client/src/index.ts +++ b/packages/ai-event-client/src/index.ts @@ -223,6 +223,18 @@ export interface TokenUsage { completionTokensDetails?: CompletionTokensDetails /** Duration in seconds for duration-based billing (e.g., Whisper transcription) */ durationSeconds?: number + /** + * Number of priced units actually billed, for usage-based (non-token) billing. + * This is a bare count, not a cost and not a unit name — the unit itself + * (megapixels, seconds, images, …) is provider-defined and not carried here; + * providers typically expose it via a separate pricing API. Surfaced for media + * generation, where there are no tokens: fal returns this count in its + * `x-fal-billable-units` response header. Multiply by the unit price to get the + * exact cost (`unitsBilled * unitPrice`). 
The unit-priced analogue of + * `durationSeconds` (the time-priced case); both are quantities, distinct from + * the monetary `cost` / `costDetails`. + */ + unitsBilled?: number /** Provider-specific usage details not covered by standard fields */ providerUsageDetails?: TProviderDetails /** Provider-reported cost for the request, when available. */ diff --git a/packages/ai-fal/src/adapters/audio.ts b/packages/ai-fal/src/adapters/audio.ts index 5eed67365..ddd007be6 100644 --- a/packages/ai-fal/src/adapters/audio.ts +++ b/packages/ai-fal/src/adapters/audio.ts @@ -1,8 +1,10 @@ import { fal } from '@fal-ai/client' import { BaseAudioAdapter } from '@tanstack/ai/adapters' import { + buildFalUsage, configureFalClient, deriveAudioContentType, + takeBillableUnits, generateId as utilGenerateId, } from '../utils' import type { OutputType, Result } from '@fal-ai/client' @@ -133,6 +135,8 @@ export class FalAudioAdapter extends BaseAudioAdapter< throw new Error('Audio URL not found in fal audio generation response') } + const usage = buildFalUsage(takeBillableUnits(response.requestId)) + return { id: response.requestId || this.generateId(), model: this.model, @@ -140,6 +144,7 @@ export class FalAudioAdapter extends BaseAudioAdapter< url: audioUrl, contentType: deriveAudioContentType(contentType, audioUrl), }, + ...(usage ? 
{ usage } : {}), } } } diff --git a/packages/ai-fal/src/adapters/image.ts b/packages/ai-fal/src/adapters/image.ts index 11dcbaeb2..503affefb 100644 --- a/packages/ai-fal/src/adapters/image.ts +++ b/packages/ai-fal/src/adapters/image.ts @@ -1,6 +1,11 @@ import { fal } from '@fal-ai/client' import { BaseImageAdapter } from '@tanstack/ai/adapters' -import { configureFalClient, generateId as utilGenerateId } from '../utils' +import { + buildFalUsage, + configureFalClient, + takeBillableUnits, + generateId as utilGenerateId, +} from '../utils' import { mapSizeToFalFormat } from '../image/image-provider-options' import type { OutputType, Result } from '@fal-ai/client' import type { FalClientConfig } from '../utils' @@ -120,10 +125,13 @@ export class FalImageAdapter extends BaseImageAdapter< ) } + const usage = buildFalUsage(takeBillableUnits(response.requestId)) + return { id: response.requestId || this.generateId(), model: this.model, images, + ...(usage ? { usage } : {}), } } diff --git a/packages/ai-fal/src/adapters/speech.ts b/packages/ai-fal/src/adapters/speech.ts index 4a4646b41..ea7f67d47 100644 --- a/packages/ai-fal/src/adapters/speech.ts +++ b/packages/ai-fal/src/adapters/speech.ts @@ -2,8 +2,10 @@ import { fal } from '@fal-ai/client' import { BaseTTSAdapter } from '@tanstack/ai/adapters' import { arrayBufferToBase64, + buildFalUsage, configureFalClient, extractUrlExtension, + takeBillableUnits, generateId as utilGenerateId, } from '../utils' import type { OutputType, Result } from '@fal-ai/client' @@ -133,12 +135,15 @@ export class FalSpeechAdapter extends BaseTTSAdapter< safeUrlExtension || contentTypeMime?.split('/')[1] || 'wav' const format = rawFormat === 'mpeg' ? 'mp3' : rawFormat + const usage = buildFalUsage(takeBillableUnits(response.requestId)) + return { id: response.requestId || this.generateId(), model: this.model, audio: base64, format, contentType: contentTypeMime || `audio/${format}`, + ...(usage ? 
{ usage } : {}), } } } diff --git a/packages/ai-fal/src/adapters/transcription.ts b/packages/ai-fal/src/adapters/transcription.ts index 586f4aacf..9fa769292 100644 --- a/packages/ai-fal/src/adapters/transcription.ts +++ b/packages/ai-fal/src/adapters/transcription.ts @@ -1,8 +1,10 @@ import { fal } from '@fal-ai/client' import { BaseTranscriptionAdapter } from '@tanstack/ai/adapters' import { + buildFalUsage, configureFalClient, dataUrlToBlob, + takeBillableUnits, generateId as utilGenerateId, } from '../utils' import type { OutputType, Result } from '@fal-ai/client' @@ -151,12 +153,15 @@ export class FalTranscriptionAdapter< (data.inferred_languages as Array | undefined)?.[0] || (data.languages as Array | undefined)?.[0] + const usage = buildFalUsage(takeBillableUnits(response.requestId)) + return { id: response.requestId || this.generateId(), model: this.model, text, ...(language !== undefined ? { language } : {}), ...(segments !== undefined ? { segments } : {}), + ...(usage ? { usage } : {}), } } } diff --git a/packages/ai-fal/src/adapters/video.ts b/packages/ai-fal/src/adapters/video.ts index 662b4f60f..05b006069 100644 --- a/packages/ai-fal/src/adapters/video.ts +++ b/packages/ai-fal/src/adapters/video.ts @@ -1,6 +1,11 @@ import { fal } from '@fal-ai/client' import { BaseVideoAdapter } from '@tanstack/ai/adapters' -import { configureFalClient, generateId as utilGenerateId } from '../utils' +import { + buildFalUsage, + configureFalClient, + takeBillableUnits, + generateId as utilGenerateId, +} from '../utils' import { mapVideoSizeToFalFormat } from '../video/video-provider-options' import type { VideoGenerationOptions, @@ -163,9 +168,12 @@ export class FalVideoAdapter extends BaseVideoAdapter< throw new Error('Video URL not found in response') } + const usage = buildFalUsage(takeBillableUnits(result.requestId)) + return { jobId, url, + ...(usage ? 
{ usage } : {}), } } diff --git a/packages/ai-fal/src/utils/billing.ts b/packages/ai-fal/src/utils/billing.ts new file mode 100644 index 000000000..8df762baf --- /dev/null +++ b/packages/ai-fal/src/utils/billing.ts @@ -0,0 +1,117 @@ +import type { TokenUsage } from '@tanstack/ai' + +/** + * Response header fal sets on a queue *result* fetch carrying the real billed + * quantity for the generation, denominated in the endpoint's priced unit. + */ +const FAL_BILLABLE_UNITS_HEADER = 'x-fal-billable-units' + +/** + * Response header fal sets carrying the request id. The fal client surfaces this + * same value as `Result.requestId`, so keying captured billable units by it + * guarantees the adapter's lookup matches the fetch the units came from — no URL + * parsing or global correlation registry of our own design needed. + */ +const FAL_REQUEST_ID_HEADER = 'x-fal-request-id' + +/** + * Upper bound on retained, not-yet-consumed billable-unit entries. Each + * successful generation reads-and-deletes its entry (see {@link takeBillableUnits}), + * so this only guards against an unbounded leak when a result fetch records units + * but the adapter never resolves (e.g. it throws before reading). When the cap is + * exceeded the oldest entry is evicted (Map preserves insertion order). + */ +const MAX_PENDING_ENTRIES = 256 + +const billableUnitsByRequestId = new Map<string, number>() + +/** + * Parse the `x-fal-billable-units` header value into a finite number. Returns + * `undefined` for a missing or non-numeric value so callers can skip attaching + * usage rather than surfacing `NaN`. + */ +export function parseBillableUnits(value: string | null): number | undefined { + if (value == null || value === '') return undefined + const parsed = Number(value) + return Number.isFinite(parsed) ? parsed : undefined +} + +/** + * Record the billable units carried by a fal result response, keyed by the + * request id from the same response. 
Reading headers does not consume the body, + * so the response can be returned to the fal client untouched. + */ +export function recordBillableUnitsFromResponse(response: Response): void { + const units = parseBillableUnits( + response.headers.get(FAL_BILLABLE_UNITS_HEADER), + ) + if (units == null) return + const requestId = response.headers.get(FAL_REQUEST_ID_HEADER) + if (!requestId) return + if ( + billableUnitsByRequestId.size >= MAX_PENDING_ENTRIES && + !billableUnitsByRequestId.has(requestId) + ) { + const oldest = billableUnitsByRequestId.keys().next().value + if (oldest !== undefined) billableUnitsByRequestId.delete(oldest) + } + billableUnitsByRequestId.set(requestId, units) +} + +/** + * Read and remove the billable units recorded for a request id. Removing on read + * keeps the registry from growing across the lifetime of the process. + */ +export function takeBillableUnits( + requestId: string | undefined, +): number | undefined { + if (!requestId) return undefined + const units = billableUnitsByRequestId.get(requestId) + if (units !== undefined) billableUnitsByRequestId.delete(requestId) + return units +} + +/** + * Build a {@link TokenUsage} carrying fal's billed quantity. Media generation has + * no tokens, so the token fields are zero and the real billing signal rides on + * `unitsBilled` — mirroring how the duration-billed transcription adapters + * surface `durationSeconds`. Returns `undefined` when no units were captured so + * callers can omit `usage` entirely. + */ +export function buildFalUsage( + unitsBilled: number | undefined, +): TokenUsage | undefined { + if (unitsBilled == null) return undefined + return { + promptTokens: 0, + completionTokens: 0, + totalTokens: 0, + unitsBilled, + } +} + +/** + * Wrap a fetch so every fal request's response is inspected for the + * `x-fal-billable-units` header before being returned untouched. 
Installed as + * fal's `config.fetch`, which (unlike a global `responseHandler`) is honoured for + * every request — the fal client forces `resultResponseHandler` per queue + * operation, clobbering any configured response handler. + * + * `baseFetch` is the underlying implementation to delegate to (defaults to the + * global `fetch`). Injecting it keeps usage capture working when a caller + * supplies a custom fetch — a proxy, instrumentation, or a test mock — without + * mutating any global. + */ +export function createBillingFetch( + baseFetch: typeof fetch = globalThis.fetch, +): typeof fetch { + return async (input, init) => { + const response = await baseFetch(input, init) + try { + recordBillableUnitsFromResponse(response) + } catch { + // Capturing usage must never break the underlying request. + } + return response + } +} diff --git a/packages/ai-fal/src/utils/client.ts b/packages/ai-fal/src/utils/client.ts index ad54f68f5..c98461d62 100644 --- a/packages/ai-fal/src/utils/client.ts +++ b/packages/ai-fal/src/utils/client.ts @@ -1,9 +1,17 @@ import { fal } from '@fal-ai/client' import { generateId as _generateId, getApiKeyFromEnv } from '@tanstack/ai-utils' +import { createBillingFetch } from './billing' export interface FalClientConfig { apiKey: string proxyUrl?: string + /** + * Override the underlying fetch used for fal requests. The adapter wraps it to + * read the `x-fal-billable-units` header (see ./billing.ts), so usage capture + * still works. Defaults to the global `fetch`. Useful for proxying, + * instrumentation, or pointing requests at a mock in tests. + */ + fetch?: typeof fetch } export function getFalApiKeyFromEnv(): string { @@ -14,6 +22,10 @@ export function configureFalClient(config?: FalClientConfig): void { const apiKey = config?.apiKey ?? 
getFalApiKeyFromEnv() fal.config({ credentials: apiKey, + // Wrap the (optionally overridden) fetch to read fal's + // `x-fal-billable-units` header off every response so adapters can surface + // the billed quantity as `result.usage`. See ./billing.ts. + fetch: createBillingFetch(config?.fetch), ...(config?.proxyUrl ? { proxyUrl: config.proxyUrl } : {}), }) } diff --git a/packages/ai-fal/src/utils/index.ts b/packages/ai-fal/src/utils/index.ts index a103c5061..379e31757 100644 --- a/packages/ai-fal/src/utils/index.ts +++ b/packages/ai-fal/src/utils/index.ts @@ -8,3 +8,5 @@ export { deriveAudioContentType, type FalClientConfig, } from './client' + +export { takeBillableUnits, buildFalUsage } from './billing' diff --git a/packages/ai-fal/tests/audio-adapter.test.ts b/packages/ai-fal/tests/audio-adapter.test.ts index d95216bcd..02e6d8ca7 100644 --- a/packages/ai-fal/tests/audio-adapter.test.ts +++ b/packages/ai-fal/tests/audio-adapter.test.ts @@ -2,6 +2,18 @@ import { beforeEach, describe, expect, it, vi } from 'vitest' import { generateAudio } from '@tanstack/ai' import { falAudio } from '../src/adapters/audio' +import { recordBillableUnitsFromResponse } from '../src/utils/billing' + +function seedBillableUnits(requestId: string, units: string) { + recordBillableUnitsFromResponse( + new Response(null, { + headers: { + 'x-fal-request-id': requestId, + 'x-fal-billable-units': units, + }, + }), + ) +} // Declare mocks at module level let mockSubscribe: any @@ -255,6 +267,7 @@ describe('Fal Audio Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), }) }) @@ -266,6 +279,7 @@ describe('Fal Audio Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), proxyUrl: '/api/fal/proxy', }) }) @@ -329,4 +343,50 @@ describe('Fal Audio Adapter', () => { expect(result.audio.contentType).toBe('audio/wav') }) + + it('surfaces fal billable units as usage', async 
() => { + seedBillableUnits('req-billed-audio', '2.5') + mockSubscribe.mockResolvedValueOnce({ + data: { + audio: { + url: 'https://fal.media/files/billed.mp3', + content_type: 'audio/mpeg', + }, + }, + requestId: 'req-billed-audio', + }) + + const result = await generateAudio({ + adapter: createAdapter(), + prompt: 'billed track', + modelOptions: { lyrics_prompt: DEFAULT_LYRICS }, + }) + + expect(result.usage).toEqual({ + promptTokens: 0, + completionTokens: 0, + totalTokens: 0, + unitsBilled: 2.5, + }) + }) + + it('omits usage when fal does not report billable units', async () => { + mockSubscribe.mockResolvedValueOnce({ + data: { + audio: { + url: 'https://fal.media/files/unbilled.mp3', + content_type: 'audio/mpeg', + }, + }, + requestId: 'req-unbilled-audio', + }) + + const result = await generateAudio({ + adapter: createAdapter(), + prompt: 'unbilled track', + modelOptions: { lyrics_prompt: DEFAULT_LYRICS }, + }) + + expect(result.usage).toBeUndefined() + }) }) diff --git a/packages/ai-fal/tests/billing.test.ts b/packages/ai-fal/tests/billing.test.ts new file mode 100644 index 000000000..d35c834dd --- /dev/null +++ b/packages/ai-fal/tests/billing.test.ts @@ -0,0 +1,136 @@ +import { describe, expect, it, vi } from 'vitest' +import { + buildFalUsage, + createBillingFetch, + parseBillableUnits, + recordBillableUnitsFromResponse, + takeBillableUnits, +} from '../src/utils/billing' + +function resultResponse( + headers: Record<string, string>, + body: BodyInit | null = null, +): Response { + return new Response(body, { headers }) +} + +describe('parseBillableUnits', () => { + it('parses integer and fractional values', () => { + expect(parseBillableUnits('4')).toBe(4) + expect(parseBillableUnits('1.5')).toBe(1.5) + expect(parseBillableUnits('0')).toBe(0) + }) + + it('returns undefined for missing or non-numeric values', () => { + expect(parseBillableUnits(null)).toBeUndefined() + expect(parseBillableUnits('')).toBeUndefined() + 
expect(parseBillableUnits('not-a-number')).toBeUndefined() + }) +}) + +describe('recordBillableUnitsFromResponse / takeBillableUnits', () => { + it('records billable units keyed by the request id header', () => { + recordBillableUnitsFromResponse( + resultResponse({ + 'x-fal-request-id': 'req-record-1', + 'x-fal-billable-units': '6', + }), + ) + + expect(takeBillableUnits('req-record-1')).toBe(6) + }) + + it('removes the entry on read so it is consumed once', () => { + recordBillableUnitsFromResponse( + resultResponse({ + 'x-fal-request-id': 'req-record-2', + 'x-fal-billable-units': '3', + }), + ) + + expect(takeBillableUnits('req-record-2')).toBe(3) + expect(takeBillableUnits('req-record-2')).toBeUndefined() + }) + + it('ignores responses without a billable-units header', () => { + recordBillableUnitsFromResponse( + resultResponse({ 'x-fal-request-id': 'req-record-3' }), + ) + + expect(takeBillableUnits('req-record-3')).toBeUndefined() + }) + + it('ignores responses with billable units but no request id', () => { + // No request id means there is nothing to correlate the units against. 
+ expect(() => + recordBillableUnitsFromResponse( + resultResponse({ 'x-fal-billable-units': '9' }), + ), + ).not.toThrow() + }) + + it('returns undefined for an unknown or missing request id', () => { + expect(takeBillableUnits('never-recorded')).toBeUndefined() + expect(takeBillableUnits(undefined)).toBeUndefined() + }) +}) + +describe('buildFalUsage', () => { + it('returns undefined when no units were captured', () => { + expect(buildFalUsage(undefined)).toBeUndefined() + }) + + it('surfaces billable units on a zero-token usage object', () => { + expect(buildFalUsage(4)).toEqual({ + promptTokens: 0, + completionTokens: 0, + totalTokens: 0, + unitsBilled: 4, + }) + }) + + it('preserves a zero billed quantity', () => { + expect(buildFalUsage(0)).toEqual({ + promptTokens: 0, + completionTokens: 0, + totalTokens: 0, + unitsBilled: 0, + }) + }) +}) + +describe('createBillingFetch', () => { + it('records the header and returns the response with its body intact', async () => { + const mockResponse = resultResponse( + { + 'x-fal-request-id': 'req-fetch-1', + 'x-fal-billable-units': '7', + 'content-type': 'application/json', + }, + JSON.stringify({ ok: true }), + ) + // Inject the underlying fetch directly — no global to stub or restore. + const baseFetch = vi.fn().mockResolvedValue(mockResponse) + + const billingFetch = createBillingFetch(baseFetch) + const returned = await billingFetch( + 'https://queue.fal.run/fal-ai/flux/requests/req-fetch-1', + ) + + expect(returned).toBe(mockResponse) + expect(baseFetch).toHaveBeenCalledTimes(1) + // Reading headers must not consume the body — the fal client still parses it. 
+ await expect(returned.json()).resolves.toEqual({ ok: true }) + expect(takeBillableUnits('req-fetch-1')).toBe(7) + }) + + it('passes through responses without the billable-units header', async () => { + const mockResponse = resultResponse({ 'x-fal-request-id': 'req-fetch-2' }) + const baseFetch = vi.fn().mockResolvedValue(mockResponse) + + const billingFetch = createBillingFetch(baseFetch) + await billingFetch('https://queue.fal.run/fal-ai/flux/requests/req-fetch-2') + + expect(takeBillableUnits('req-fetch-2')).toBeUndefined() + }) +}) diff --git a/packages/ai-fal/tests/image-adapter.test.ts b/packages/ai-fal/tests/image-adapter.test.ts index 9c7c5b49c..b429f1000 100644 --- a/packages/ai-fal/tests/image-adapter.test.ts +++ b/packages/ai-fal/tests/image-adapter.test.ts @@ -2,6 +2,23 @@ import { beforeEach, describe, expect, it, vi } from 'vitest' import { generateImage } from '@tanstack/ai' import { falImage } from '../src/adapters/image' +import { recordBillableUnitsFromResponse } from '../src/utils/billing' + +/** + * Seed the billing registry the way the wrapped `config.fetch` would after a + * real fal result fetch: the result response carries both the request id and the + * billable-units header. The adapter then looks the units up by `requestId`. 
+ */ +function seedBillableUnits(requestId: string, units: string) { + recordBillableUnitsFromResponse( + new Response(null, { + headers: { + 'x-fal-request-id': requestId, + 'x-fal-billable-units': units, + }, + }), + ) +} // Declare mocks at module level let mockSubscribe: any @@ -219,6 +236,7 @@ describe('Fal Image Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), }) }) @@ -230,6 +248,7 @@ describe('Fal Image Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), proxyUrl: '/api/fal/proxy', }) }) @@ -299,4 +318,39 @@ describe('Fal Image Adapter', () => { 'https://fal.media/files/direct-string.png', ) }) + + it('surfaces fal billable units as usage', async () => { + seedBillableUnits('req-billed-img', '4') + mockSubscribe.mockResolvedValueOnce({ + data: { images: [{ url: 'https://fal.media/files/billed.png' }] }, + requestId: 'req-billed-img', + }) + + const result = await generateImage({ + adapter: createAdapter(), + prompt: 'billed image', + }) + + expect(result.usage).toEqual({ + promptTokens: 0, + completionTokens: 0, + totalTokens: 0, + unitsBilled: 4, + }) + }) + + it('omits usage when fal does not report billable units', async () => { + mockSubscribe.mockResolvedValueOnce( + createMockImageResponse([ + { url: 'https://fal.media/files/unbilled.png' }, + ]), + ) + + const result = await generateImage({ + adapter: createAdapter(), + prompt: 'unbilled image', + }) + + expect(result.usage).toBeUndefined() + }) }) diff --git a/packages/ai-fal/tests/speech-adapter.test.ts b/packages/ai-fal/tests/speech-adapter.test.ts index e42b4e985..f172400a5 100644 --- a/packages/ai-fal/tests/speech-adapter.test.ts +++ b/packages/ai-fal/tests/speech-adapter.test.ts @@ -214,6 +214,7 @@ describe('Fal Speech Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), }) }) @@ -225,6 +226,7 @@ 
describe('Fal Speech Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), proxyUrl: '/api/fal/proxy', }) }) diff --git a/packages/ai-fal/tests/transcription-adapter.test.ts b/packages/ai-fal/tests/transcription-adapter.test.ts index 4da78961b..c88fddfec 100644 --- a/packages/ai-fal/tests/transcription-adapter.test.ts +++ b/packages/ai-fal/tests/transcription-adapter.test.ts @@ -219,6 +219,7 @@ describe('Fal Transcription Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), }) }) @@ -230,6 +231,7 @@ describe('Fal Transcription Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), proxyUrl: '/api/fal/proxy', }) }) diff --git a/packages/ai-fal/tests/utils.test.ts b/packages/ai-fal/tests/utils.test.ts index ec163afef..954a8b67f 100644 --- a/packages/ai-fal/tests/utils.test.ts +++ b/packages/ai-fal/tests/utils.test.ts @@ -43,6 +43,7 @@ describe('configureFalClient', () => { expect(configSpy).toHaveBeenCalledWith({ credentials: 'test-key', + fetch: expect.any(Function), proxyUrl: '/api/fal/proxy', }) }) @@ -53,7 +54,10 @@ describe('configureFalClient', () => { configureFalClient({ apiKey: 'test-key' }) - expect(configSpy).toHaveBeenCalledWith({ credentials: 'test-key' }) + expect(configSpy).toHaveBeenCalledWith({ + credentials: 'test-key', + fetch: expect.any(Function), + }) }) }) diff --git a/packages/ai-fal/tests/video-adapter.test.ts b/packages/ai-fal/tests/video-adapter.test.ts index 08e1cec06..7bf5ce466 100644 --- a/packages/ai-fal/tests/video-adapter.test.ts +++ b/packages/ai-fal/tests/video-adapter.test.ts @@ -2,6 +2,18 @@ import { beforeEach, describe, expect, it, vi } from 'vitest' import { generateVideo } from '@tanstack/ai' import { falVideo } from '../src/adapters/video' +import { recordBillableUnitsFromResponse } from '../src/utils/billing' + +function 
seedBillableUnits(requestId: string, units: string) { + recordBillableUnitsFromResponse( + new Response(null, { + headers: { + 'x-fal-request-id': requestId, + 'x-fal-billable-units': units, + }, + }), + ) +} // Declare mocks at module level let mockQueueSubmit: any @@ -282,6 +294,39 @@ describe('Fal Video Adapter', () => { 'Video URL not found in response', ) }) + + it('surfaces fal billable units as usage', async () => { + seedBillableUnits('job-billed', '12') + mockQueueResult.mockResolvedValueOnce({ + data: { + video: { url: 'https://fal.media/files/billed.mp4' }, + }, + requestId: 'job-billed', + }) + + const result = await createAdapter().getVideoUrl('job-billed') + + expect(result.url).toBe('https://fal.media/files/billed.mp4') + expect(result.usage).toEqual({ + promptTokens: 0, + completionTokens: 0, + totalTokens: 0, + unitsBilled: 12, + }) + }) + + it('omits usage when fal does not report billable units', async () => { + mockQueueResult.mockResolvedValueOnce({ + data: { + video: { url: 'https://fal.media/files/unbilled.mp4' }, + }, + requestId: 'job-unbilled', + }) + + const result = await createAdapter().getVideoUrl('job-unbilled') + + expect(result.usage).toBeUndefined() + }) }) describe('client configuration', () => { @@ -290,6 +335,7 @@ describe('Fal Video Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), }) }) @@ -301,6 +347,7 @@ describe('Fal Video Adapter', () => { expect(mockConfig).toHaveBeenCalledWith({ credentials: 'my-api-key', + fetch: expect.any(Function), proxyUrl: '/api/fal/proxy', }) }) diff --git a/packages/ai/skills/ai-core/media-generation/SKILL.md b/packages/ai/skills/ai-core/media-generation/SKILL.md index 09a552b73..b9c4c1a2c 100644 --- a/packages/ai/skills/ai-core/media-generation/SKILL.md +++ b/packages/ai/skills/ai-core/media-generation/SKILL.md @@ -343,10 +343,38 @@ const { generate, result, jobId, videoStatus, isLoading } = useGenerateVideo({ 
console.log(`${status.status} (${status.progress}%)`), }) -// videoStatus: { jobId, status, progress?, url?, error? } +// videoStatus: { jobId, status, progress?, url?, error?, usage? } // result (on completion): { url } ``` +### 6. Cost tracking (fal billable units) + +fal bills media generation by usage-based units, not tokens. Every fal media +adapter (`falImage`, `falAudio`, `falSpeech`, `falTranscription`, `falVideo`) +surfaces the real billed quantity on the result as `usage.unitsBilled`, read +from fal's `x-fal-billable-units` response header — no `fetch` interceptor +needed. It rides on the canonical `TokenUsage` shape (token fields are `0` for +media), mirroring how duration-billed transcription surfaces `durationSeconds`. + +```typescript +import { generateImage } from '@tanstack/ai' +import { falImage } from '@tanstack/ai-fal' + +const result = await generateImage({ + adapter: falImage('fal-ai/flux/dev'), + prompt: 'a serene mountain lake', +}) + +// usage.unitsBilled is the priced quantity. Multiply by the endpoint unit +// price (GET https://api.fal.ai/v1/models/pricing?endpoint_id=…) for exact cost. +if (result.usage?.unitsBilled != null) { + const cost = result.usage.unitsBilled * unitPrice +} +``` + +For video, the units arrive with the completed result: `getVideoJobStatus()` +returns `usage` and emits a `video:usage` devtools event when fal reports it. 
+ --- ## Common Hook API diff --git a/packages/ai/src/activities/generateVideo/index.ts b/packages/ai/src/activities/generateVideo/index.ts index cee2339f7..4e0e48896 100644 --- a/packages/ai/src/activities/generateVideo/index.ts +++ b/packages/ai/src/activities/generateVideo/index.ts @@ -15,6 +15,7 @@ import type { DebugOption } from '../../logger/types' import type { VideoAdapter } from './adapter' import type { StreamChunk, + TokenUsage, VideoJobResult, VideoStatusResult, VideoUrlResult, @@ -380,6 +381,7 @@ async function* runStreamingVideoGeneration< status: 'completed', url: urlResult.url, expiresAt: urlResult.expiresAt, + ...(urlResult.usage ? { usage: urlResult.usage } : {}), }, timestamp: Date.now(), } as StreamChunk @@ -454,6 +456,7 @@ export async function getVideoJobStatus< progress?: number url?: string error?: string + usage?: TokenUsage }> { const { adapter, jobId } = options const requestId = createId('video-status') @@ -487,10 +490,19 @@ export async function getVideoJobStatus< duration: Date.now() - startTime, timestamp: Date.now(), }) + if (urlResult.usage) { + aiEventClient.emit('video:usage', { + requestId, + model: adapter.model, + usage: urlResult.usage, + timestamp: Date.now(), + }) + } return { status: statusResult.status, progress: statusResult.progress, url: urlResult.url, + ...(urlResult.usage ? { usage: urlResult.usage } : {}), } } catch (error) { const errorMessage = diff --git a/packages/ai/src/types.ts b/packages/ai/src/types.ts index e57f860ba..6034e83ca 100644 --- a/packages/ai/src/types.ts +++ b/packages/ai/src/types.ts @@ -1656,6 +1656,12 @@ export interface VideoUrlResult { url: string /** When the URL expires, if applicable */ expiresAt?: Date + /** + * Usage information for the completed generation, when the adapter can report + * it. For usage-based providers (e.g. fal) this carries `unitsBilled` — the + * real billed quantity — so consumers can compute exact cost. 
+ */ + usage?: TokenUsage } // ============================================================================ diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index ab84c8703..7f403d64a 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -1862,6 +1862,9 @@ importers: '@tanstack/ai-elevenlabs': specifier: workspace:* version: link:../../packages/ai-elevenlabs + '@tanstack/ai-fal': + specifier: workspace:* + version: link:../../packages/ai-fal '@tanstack/ai-gemini': specifier: workspace:* version: link:../../packages/ai-gemini diff --git a/testing/e2e/global-setup.ts b/testing/e2e/global-setup.ts index f869df01a..2af2b5bd0 100644 --- a/testing/e2e/global-setup.ts +++ b/testing/e2e/global-setup.ts @@ -66,6 +66,14 @@ export default async function globalSetup() { // `promptTokensDetails.cachedTokens` / `completionTokensDetails.reasoningTokens`. mock.mount('/openai-usage-details', openaiUsageDetailsMount()) + // fal billable-units capture. aimock doesn't model fal's queue protocol + // (submit → poll status → fetch result) or its `x-fal-billable-units` / + // `x-fal-request-id` result headers, so this mount hand-rolls the three queue + // round-trips and stamps the billing headers on the result fetch. The + // companion api.fal-billable-units route redirects fal's hardcoded + // queue.fal.run URLs here and asserts the units reach `result.usage`. + mock.mount('/fal-queue', falQueueMount()) + await mock.start() console.log(`[aimock] started on port 4010`) ;(globalThis as any).__aimock = mock @@ -466,6 +474,83 @@ function openaiUsageDetailsMount(): Mountable { } } +/** + * Request id and billed quantity the fal queue mount reports. Exported-by-value + * to the companion route/spec via the literal below — kept in one place so the + * assertion and the mock can't drift. 
+ */ +const FAL_E2E_REQUEST_ID = 'fal-req-e2e' +const FAL_E2E_BILLABLE_UNITS = '4' + +/** + * Mimics fal's queue protocol for a single image generation: + * POST /{appId} → submit, returns request_id + * GET /{appId}/requests/{id}/status → poll, returns COMPLETED + * GET /{appId}/requests/{id} → result, returns the image payload + * with `x-fal-request-id` and + * `x-fal-billable-units` headers + * The billing fetch installed by @tanstack/ai-fal reads those headers off the + * result fetch and the adapter surfaces them as `result.usage.unitsBilled`. + */ +function falQueueMount(): Mountable { + return { + async handleRequest( + req: http.IncomingMessage, + res: http.ServerResponse, + // Mount prefix (/fal-queue) is stripped; pathname is `/{appId}/...`. + pathname: string, + ): Promise<boolean> { + const isResultPath = + req.method === 'GET' && /\/requests\/[^/]+$/.test(pathname) + const isStatusPath = req.method === 'GET' && pathname.endsWith('/status') + const isSubmitPath = + req.method === 'POST' && !pathname.includes('/requests/') + + if (isSubmitPath) { + await drainBody(req) + res.statusCode = 200 + res.setHeader('Content-Type', 'application/json') + res.end( + JSON.stringify({ + request_id: FAL_E2E_REQUEST_ID, + status: 'IN_QUEUE', + }), + ) + return true + } + + if (isStatusPath) { + res.statusCode = 200 + res.setHeader('Content-Type', 'application/json') + res.end( + JSON.stringify({ + status: 'COMPLETED', + request_id: FAL_E2E_REQUEST_ID, + }), + ) + return true + } + + if (isResultPath) { + res.statusCode = 200 + res.setHeader('Content-Type', 'application/json') + // The two headers the feature hangs on: the billed quantity, and the + // request id the adapter correlates it against.
+ res.setHeader('x-fal-request-id', FAL_E2E_REQUEST_ID) + res.setHeader('x-fal-billable-units', FAL_E2E_BILLABLE_UNITS) + res.end( + JSON.stringify({ + images: [{ url: 'https://fal.media/files/e2e-billed.png' }], + }), + ) + return true + } + + return false + }, + } +} + function buildToolPlusServerToolEvents(): Array> { const messageId = 'msg_bug_604' const model = 'claude-sonnet-4-5' diff --git a/testing/e2e/package.json b/testing/e2e/package.json index af0e4ad99..bbf9c9203 100644 --- a/testing/e2e/package.json +++ b/testing/e2e/package.json @@ -20,6 +20,7 @@ "@tanstack/ai-anthropic": "workspace:*", "@tanstack/ai-client": "workspace:*", "@tanstack/ai-elevenlabs": "workspace:*", + "@tanstack/ai-fal": "workspace:*", "@tanstack/ai-gemini": "workspace:*", "@tanstack/ai-grok": "workspace:*", "@tanstack/ai-groq": "workspace:*", diff --git a/testing/e2e/src/routeTree.gen.ts b/testing/e2e/src/routeTree.gen.ts index 165f14d4d..b58d9e55e 100644 --- a/testing/e2e/src/routeTree.gen.ts +++ b/testing/e2e/src/routeTree.gen.ts @@ -37,6 +37,7 @@ import { Route as ApiMcpServerRouteImport } from './routes/api.mcp-server' import { Route as ApiMcpManagedTestRouteImport } from './routes/api.mcp-managed-test' import { Route as ApiMcpLifecycleTestRouteImport } from './routes/api.mcp-lifecycle-test' import { Route as ApiImageRouteImport } from './routes/api.image' +import { Route as ApiFalBillableUnitsRouteImport } from './routes/api.fal-billable-units' import { Route as ApiChatRouteImport } from './routes/api.chat' import { Route as ApiAudioRouteImport } from './routes/api.audio' import { Route as ApiArktypeToolWireRouteImport } from './routes/api.arktype-tool-wire' @@ -192,6 +193,11 @@ const ApiImageRoute = ApiImageRouteImport.update({ path: '/api/image', getParentRoute: () => rootRouteImport, } as any) +const ApiFalBillableUnitsRoute = ApiFalBillableUnitsRouteImport.update({ + id: '/api/fal-billable-units', + path: '/api/fal-billable-units', + getParentRoute: () => rootRouteImport, +} 
as any) const ApiChatRoute = ApiChatRouteImport.update({ id: '/api/chat', path: '/api/chat', @@ -265,6 +271,7 @@ export interface FileRoutesByFullPath { '/api/arktype-tool-wire': typeof ApiArktypeToolWireRoute '/api/audio': typeof ApiAudioRouteWithChildren '/api/chat': typeof ApiChatRoute + '/api/fal-billable-units': typeof ApiFalBillableUnitsRoute '/api/image': typeof ApiImageRouteWithChildren '/api/mcp-lifecycle-test': typeof ApiMcpLifecycleTestRoute '/api/mcp-managed-test': typeof ApiMcpManagedTestRoute @@ -306,6 +313,7 @@ export interface FileRoutesByTo { '/api/arktype-tool-wire': typeof ApiArktypeToolWireRoute '/api/audio': typeof ApiAudioRouteWithChildren '/api/chat': typeof ApiChatRoute + '/api/fal-billable-units': typeof ApiFalBillableUnitsRoute '/api/image': typeof ApiImageRouteWithChildren '/api/mcp-lifecycle-test': typeof ApiMcpLifecycleTestRoute '/api/mcp-managed-test': typeof ApiMcpManagedTestRoute @@ -348,6 +356,7 @@ export interface FileRoutesById { '/api/arktype-tool-wire': typeof ApiArktypeToolWireRoute '/api/audio': typeof ApiAudioRouteWithChildren '/api/chat': typeof ApiChatRoute + '/api/fal-billable-units': typeof ApiFalBillableUnitsRoute '/api/image': typeof ApiImageRouteWithChildren '/api/mcp-lifecycle-test': typeof ApiMcpLifecycleTestRoute '/api/mcp-managed-test': typeof ApiMcpManagedTestRoute @@ -391,6 +400,7 @@ export interface FileRouteTypes { | '/api/arktype-tool-wire' | '/api/audio' | '/api/chat' + | '/api/fal-billable-units' | '/api/image' | '/api/mcp-lifecycle-test' | '/api/mcp-managed-test' @@ -432,6 +442,7 @@ export interface FileRouteTypes { | '/api/arktype-tool-wire' | '/api/audio' | '/api/chat' + | '/api/fal-billable-units' | '/api/image' | '/api/mcp-lifecycle-test' | '/api/mcp-managed-test' @@ -473,6 +484,7 @@ export interface FileRouteTypes { | '/api/arktype-tool-wire' | '/api/audio' | '/api/chat' + | '/api/fal-billable-units' | '/api/image' | '/api/mcp-lifecycle-test' | '/api/mcp-managed-test' @@ -515,6 +527,7 @@ export 
interface RootRouteChildren { ApiArktypeToolWireRoute: typeof ApiArktypeToolWireRoute ApiAudioRoute: typeof ApiAudioRouteWithChildren ApiChatRoute: typeof ApiChatRoute + ApiFalBillableUnitsRoute: typeof ApiFalBillableUnitsRoute ApiImageRoute: typeof ApiImageRouteWithChildren ApiMcpLifecycleTestRoute: typeof ApiMcpLifecycleTestRoute ApiMcpManagedTestRoute: typeof ApiMcpManagedTestRoute @@ -733,6 +746,13 @@ declare module '@tanstack/react-router' { preLoaderRoute: typeof ApiImageRouteImport parentRoute: typeof rootRouteImport } + '/api/fal-billable-units': { + id: '/api/fal-billable-units' + path: '/api/fal-billable-units' + fullPath: '/api/fal-billable-units' + preLoaderRoute: typeof ApiFalBillableUnitsRouteImport + parentRoute: typeof rootRouteImport + } '/api/chat': { id: '/api/chat' path: '/api/chat' @@ -888,6 +908,7 @@ const rootRouteChildren: RootRouteChildren = { ApiArktypeToolWireRoute: ApiArktypeToolWireRoute, ApiAudioRoute: ApiAudioRouteWithChildren, ApiChatRoute: ApiChatRoute, + ApiFalBillableUnitsRoute: ApiFalBillableUnitsRoute, ApiImageRoute: ApiImageRouteWithChildren, ApiMcpLifecycleTestRoute: ApiMcpLifecycleTestRoute, ApiMcpManagedTestRoute: ApiMcpManagedTestRoute, diff --git a/testing/e2e/src/routes/api.fal-billable-units.ts b/testing/e2e/src/routes/api.fal-billable-units.ts new file mode 100644 index 000000000..d271b5c10 --- /dev/null +++ b/testing/e2e/src/routes/api.fal-billable-units.ts @@ -0,0 +1,70 @@ +import { createFileRoute } from '@tanstack/react-router' +import { generateImage } from '@tanstack/ai' +import { falImage } from '@tanstack/ai-fal' + +const LLMOCK_DEFAULT_BASE = process.env.LLMOCK_URL || 'http://127.0.0.1:4010' + +/** fal hardcodes its queue endpoint; we redirect these to the aimock mount. */ +const FAL_QUEUE_PREFIX = 'https://queue.fal.run/' + +/** + * Drives the fal image adapter against the `/fal-queue` aimock mount, which + * stamps `x-fal-billable-units` on the result fetch. 
The companion spec asserts + * those units reach `result.usage.unitsBilled` — proving the adapter's billing + * capture forwards fal's real billed quantity. + * + * fal's queue URLs are not configurable, so we pass a per-request `fetch` to the + * adapter that rewrites `queue.fal.run` requests to the mock. This is scoped to + * the adapter instance (no global mutation), so concurrent requests can't + * interfere with each other. Non-fal requests pass through untouched. + */ +export const Route = createFileRoute('/api/fal-billable-units')({ + server: { + handlers: { + POST: async () => { + const mockBase = `${LLMOCK_DEFAULT_BASE}/fal-queue/` + const redirectFetch = (( + input: RequestInfo | URL, + init?: RequestInit, + ) => { + const url = + typeof input === 'string' + ? input + : input instanceof URL + ? input.href + : input.url + if (url.startsWith(FAL_QUEUE_PREFIX)) { + return fetch(mockBase + url.slice(FAL_QUEUE_PREFIX.length), init) + } + return fetch(input, init) + }) as typeof fetch + + try { + const adapter = falImage('fal-ai/flux/dev', { + apiKey: 'fal-e2e-dummy', + fetch: redirectFetch, + }) + const result = await generateImage({ + adapter, + prompt: 'a billed image', + }) + return new Response( + JSON.stringify({ ok: true, usage: result.usage }), + { + status: 200, + headers: { 'Content-Type': 'application/json' }, + }, + ) + } catch (error) { + return new Response( + JSON.stringify({ + ok: false, + error: error instanceof Error ? error.message : String(error), + }), + { status: 200, headers: { 'Content-Type': 'application/json' } }, + ) + } + }, + }, + }, +}) diff --git a/testing/e2e/tests/fal-billable-units.spec.ts b/testing/e2e/tests/fal-billable-units.spec.ts new file mode 100644 index 000000000..026072e43 --- /dev/null +++ b/testing/e2e/tests/fal-billable-units.spec.ts @@ -0,0 +1,40 @@ +import { test, expect } from './fixtures' + +/** + * Verifies that fal's `x-fal-billable-units` result header reaches + * `result.usage.unitsBilled`. 
The `/api/fal-billable-units` route drives the + * fal image adapter against the `/fal-queue` aimock mount, which mimics fal's + * queue protocol (submit → poll → result) and stamps the billing headers on the + * result fetch. The adapter's `config.fetch` reads those headers and surfaces + * the real billed quantity on the generation result — the end-to-end proof that + * media-generation spend is recoverable through the SDK without an app-side + * `fetch` interceptor. + */ +test.describe('fal — billable units', () => { + test('x-fal-billable-units reaches result.usage.unitsBilled', async ({ + request, + }) => { + const res = await request.post('/api/fal-billable-units') + expect(res.ok()).toBe(true) + + const { ok, usage, error } = (await res.json()) as { + ok: boolean + error?: string + usage?: { + promptTokens?: number + completionTokens?: number + totalTokens?: number + unitsBilled?: number + } + } + + expect(error ?? null).toBeNull() + expect(ok).toBe(true) + expect(usage).toMatchObject({ + promptTokens: 0, + completionTokens: 0, + totalTokens: 0, + unitsBilled: 4, + }) + }) +})
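Reviewer note: the docs in this change say to multiply `usage.unitsBilled` by the endpoint's unit price to get the exact cost. The snippet below is a minimal sketch of that arithmetic on the consumer side, not code from this PR. `BilledUsage` is a local mirror of the usage shape asserted in the tests, `costFromUsage` is a hypothetical helper, and the unit price of `0.25` is a made-up value; a real price would have to come from fal's pricing API.

```typescript
// Local mirror of the usage shape asserted in the tests above. It is declared
// here (not imported from @tanstack/ai) so the sketch stays self-contained.
interface BilledUsage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
  unitsBilled?: number
}

// Hypothetical helper: returns undefined when no billable units were reported,
// mirroring the "omits usage" case in the adapter tests.
function costFromUsage(
  usage: BilledUsage | undefined,
  unitPrice: number,
): number | undefined {
  if (usage?.unitsBilled == null) return undefined
  return usage.unitsBilled * unitPrice
}

// Usage shaped like the image-adapter test expectation (unitsBilled: 4).
const billed: BilledUsage = {
  promptTokens: 0,
  completionTokens: 0,
  totalTokens: 0,
  unitsBilled: 4,
}

// 0.25 is an assumed unit price, used only for illustration.
const cost = costFromUsage(billed, 0.25)
const noCost = costFromUsage(undefined, 0.25)
console.log(cost, noCost)
```

Returning `undefined` instead of `0` keeps "fal reported nothing" distinguishable from "fal billed zero units", which matches how the adapters omit `usage` entirely when the header is absent.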