Feature Request
Extend the OpenAITelemetryPlugin to support token tracking and pricing information for cached tokens.
Background
Currently, the OpenAITelemetryPlugin tracks token usage and calculates the associated cost. However, OpenAI prices cached input tokens at a lower rate than regular input tokens, so treating the entire prompt as regular tokens overstates the cost whenever caching is in play.
Requested Enhancement
- Update the plugin to distinguish between regular and cached tokens.
- Track cached tokens separately and calculate/report cost accurately based on their distinct rates.
- Factor in OpenAI pricing models that treat cached tokens differently, and expose the split and totals in pricing information/output.
- Update relevant reporting and metrics methods to show cached token counts and costs distinctly.
Impacted Code References:
- PricesData.cs / ModelPrices – only has Input and Output pricing, no CachedInput field
- PricesData.cs / CalculateCost() – does not account for cached tokens
- OpenAITelemetryPlugin.cs / RecordUsageMetrics() – passes the entire PromptTokens value for cost
- OpenAITelemetryPlugin.cs / GetReportModelUsageInfo() – reads and reports cached tokens, but doesn't factor them into the cost logic
- LanguageModelPricingLoader.cs – supports prices only for input and output fields. Add support for cached_input if available
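A minimal sketch of what the extended cost calculation could look like. The ModelPrices shape, the CachedInput property, and the CalculateCost signature below are assumptions for illustration, not the actual types in PricesData.cs:

```csharp
// Hypothetical sketch: ModelPrices extended with an optional cached-input rate.
// All rates are assumed to be per 1M tokens, matching OpenAI's published pricing.
public record ModelPrices(decimal Input, decimal Output, decimal? CachedInput = null)
{
    public decimal CalculateCost(long promptTokens, long cachedTokens, long completionTokens)
    {
        // Cached tokens are billed at the cached rate when one is configured;
        // otherwise fall back to the regular input rate (today's behavior).
        var cachedRate = CachedInput ?? Input;
        // PromptTokens from the API includes cached tokens, so subtract them
        // to avoid double-counting.
        var regularPromptTokens = promptTokens - cachedTokens;

        return (regularPromptTokens * Input
            + cachedTokens * cachedRate
            + completionTokens * Output) / 1_000_000m;
    }
}
```

For example, with Input = 2.5, CachedInput = 1.25, and Output = 10 (per 1M tokens), a request with 1,000 prompt tokens of which 400 were cached, plus 200 completion tokens, would cost (600 × 2.5 + 400 × 1.25 + 200 × 10) / 1,000,000 = 0.004.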
Benefits
- Improved reporting and transparency for users with cache-aware pricing.
- More accurate cost tracking for workloads with a mix of cached and non-cached usage.
- Better insights in telemetry and exported reports for complex scenarios.