Feature Request
Extend the OpenAITelemetryPlugin to support token tracking and pricing information for cached tokens.
Background
Currently, the OpenAITelemetryPlugin tracks token usage and calculates the associated cost. However, OpenAI prices cached input tokens at a lower rate than regular input tokens, so treating the entire prompt as regular tokens overstates the cost whenever caching is in play.
Requested Enhancement
- Update the plugin to distinguish between regular and cached tokens.
- Track cached tokens separately and calculate/report cost accurately based on their distinct rates.
- Factor in OpenAI pricing models that treat cached tokens differently, and expose the split and totals in pricing information/output.
- Update relevant reporting and metrics methods to show cached token counts and costs distinctly.
Impacted Code References:
- PricesData.cs / ModelPrices – only has Input and Output pricing, no CachedInput field
- PricesData.cs / CalculateCost() – does not account for cached tokens
- OpenAITelemetryPlugin.cs / RecordUsageMetrics() – passes the entire PromptTokens value for cost
- OpenAITelemetryPlugin.cs / GetReportModelUsageInfo() – reads and reports cached tokens, but doesn't factor them into the cost logic
- LanguageModelPricingLoader.cs – supports prices only for input and output fields. Add support for cached_input if available
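A minimal sketch of what the extended cost calculation could look like. The ModelPrices shape, the CachedInput property, and the CalculateCost signature below are assumptions for illustration, not the actual types in PricesData.cs:

```csharp
// Hypothetical sketch: ModelPrices extended with an optional cached-input rate.
// All rates are assumed to be per 1M tokens, matching OpenAI's published pricing.
public record ModelPrices(decimal Input, decimal Output, decimal? CachedInput = null)
{
    public decimal CalculateCost(long promptTokens, long cachedTokens, long completionTokens)
    {
        // Cached tokens are billed at the cached rate when one is configured;
        // otherwise fall back to the regular input rate (today's behavior).
        var cachedRate = CachedInput ?? Input;
        // PromptTokens from the API includes cached tokens, so subtract them
        // to avoid double-counting.
        var regularPromptTokens = promptTokens - cachedTokens;

        return (regularPromptTokens * Input
            + cachedTokens * cachedRate
            + completionTokens * Output) / 1_000_000m;
    }
}
```

For example, with Input = 2.5, CachedInput = 1.25, and Output = 10 (per 1M tokens), a request with 1,000 prompt tokens of which 400 were cached, plus 200 completion tokens, would cost (600 × 2.5 + 400 × 1.25 + 200 × 10) / 1,000,000 = 0.004.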
Benefits
- Improved reporting and transparency for users with cache-aware pricing.
- More accurate cost tracking for workloads with a mix of cached and non-cached usage.
- Better insights in telemetry and exported reports for complex scenarios.