Track xAI TTS costs from API response usage data #362
Parse the usage.cost_in_usd_ticks field from xAI JSON TTS responses and store the actual dollar cost (1 tick = $1e-9) in saveAudioFile usage and insertUsageEvent. The v1/speech route now uses the exact xAI-reported cost instead of the estimated pricing-table amount for Grok voices. Backward-compatible: binary audio responses (no JSON body) continue to work without cost info; cost is only extracted when the response Content-Type is application/json. https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u
Claude finished @gianpaj's task in 2m 47s
Code Review
Summary
The PR cleanly threads
Higher priority
Lower priority
No security issues spotted: cost data flows from a trusted upstream, is parsed defensively, and never round-trips back to the user. Credit deduction still uses the local estimator, so a manipulated xAI response can't drain credits.
Code Review
This pull request implements precise cost tracking for xAI (Grok) text-to-speech by extracting cost_in_usd_ticks from API responses and calculating the dollar amount across the generate-voice and V1 speech routes. The generateXaiTts utility was updated to handle JSON responses, and tests were added to verify the new tracking logic. Feedback suggests removing a toFixed(6) call in the V1 route to prevent precision loss and maintain consistency with the internal route. Additionally, the V1 route's usage metadata should be updated to include the codec for parity between endpoints.
- Remove toFixed(6) to preserve full nanotick precision for dollar amounts
- Add codec to Grok voice usage event metadata for parity with generate-voice route
- Hoist grokCodec variable so it's accessible at the insertUsageEvent call site
- Update test assertion to match both changes
https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u
Pull request overview
This PR updates the xAI (Grok) TTS integration to ingest the usage.cost_in_usd_ticks value returned by xAI and use it to record more accurate cost/usage data (with backward compatibility when the cost field is absent).
Changes:
- Updated `generateXaiTts()` to handle JSON responses containing base64 audio plus `usage.cost_in_usd_ticks`.
- Updated `/api/v1/speech` and `/api/generate-voice` to propagate `costInUsdTicks` into usage metadata and compute `dollarAmount` from ticks.
- Updated route tests to mock xAI JSON responses and assert that cost metadata is persisted and usage events include dollar amounts.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| apps/web/lib/tts/xai.ts | Adds JSON-response parsing for audio + usage cost ticks and returns costInUsdTicks to callers. |
| apps/web/app/api/v1/speech/route.ts | Captures Grok tick costs and uses them to compute dollarAmount (fallback to estimated pricing otherwise). |
| apps/web/app/api/generate-voice/route.ts | Captures Grok tick costs and stores them into audio file usage + usage event metadata. |
| apps/web/tests/v1-speech.test.ts | Updates xAI mock to return JSON with cost ticks and asserts persistence/usage event fields. |
| apps/web/tests/generate-voice.test.ts | Updates xAI mock to return JSON with cost ticks and asserts persistence/usage event fields. |
…sing
- Add usdTicksToDollarAmount() to lib/tts/xai.ts so both routes use the
same tick-to-dollar conversion and no inline constants can drift
- Make JSON content-type check case-insensitive (.toLowerCase().includes('json'))
to handle variants like 'application/json; charset=utf-8'
- Validate decoded base64 audio buffer is non-empty before returning
- Fix generate-voice route to only spread dollarAmount when defined
(avoid explicit null; let insertUsageEvent handle the default)
- Update both tests to import and use usdTicksToDollarAmount
https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u
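The checks listed in this commit can be sketched as follows. This is a minimal illustration, not the code in `lib/tts/xai.ts`: the helper names `isJsonContentType`, `extractCostInUsdTicks`, and `assertNonEmptyAudio` and the `XaiUsageBody` type are hypothetical, and only the `usage.cost_in_usd_ticks` field comes from the PR.

```typescript
// Hypothetical body shape: only `usage.cost_in_usd_ticks` is named in the PR.
interface XaiUsageBody {
  usage?: { cost_in_usd_ticks?: unknown };
}

// Case-insensitive check that also accepts 'application/json; charset=utf-8'.
function isJsonContentType(contentType: string | null): boolean {
  return (contentType ?? '').toLowerCase().includes('json');
}

// Returns the tick count only for JSON responses; a binary audio response
// (no JSON body) yields undefined so callers can fall back to estimates.
function extractCostInUsdTicks(
  contentType: string | null,
  body: XaiUsageBody | undefined,
): number | undefined {
  if (!isJsonContentType(contentType) || body === undefined) return undefined;
  const ticks = body.usage?.cost_in_usd_ticks;
  return typeof ticks === 'number' ? ticks : undefined;
}

// Rejects an empty decoded audio buffer before it is returned to the route.
function assertNonEmptyAudio(bytes: Uint8Array): Uint8Array {
  if (bytes.length === 0) {
    throw new Error('Decoded audio buffer is empty');
  }
  return bytes;
}
```

Keeping the content-type test in one small function means both routes treat header variants the same way.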
…icks
- usdTicksToDollarAmount now rounds to 6 decimal places to match the numeric(12,6) DB column, preventing silent Postgres rounding that would produce different values than what the app logged / computed in-memory
- Add validateTicks() to coerce cost_in_usd_ticks to undefined when the value is not a finite non-negative number, preventing NaN propagation if xAI ever returns a string, null, or unexpected type for that field
https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u
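A minimal sketch of the two helpers this commit describes. The function names come from the commit message, but the bodies are assumptions based only on the stated behavior (1 tick = $1e-9, six decimal places, finite non-negative numbers only). The constant `TICKS_PER_USD` is a hypothetical name.

```typescript
// Assumption from the PR description: 1 tick = $1e-9, so 1e9 ticks = $1.
const TICKS_PER_USD = 1e9;

// Coerces anything that is not a finite, non-negative number to undefined,
// so strings, null, NaN, and negative values never reach the conversion.
function validateTicks(value: unknown): number | undefined {
  return typeof value === 'number' && Number.isFinite(value) && value >= 0
    ? value
    : undefined;
}

// Converts ticks to dollars rounded to 6 decimal places, the scale of a
// numeric(12,6) column, so the in-memory value matches what Postgres stores.
function usdTicksToDollarAmount(ticks: number): number {
  return Number((ticks / TICKS_PER_USD).toFixed(6));
}
```

Validating before converting keeps the conversion function total over its input: callers pass either a checked number or skip the call entirely.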
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
? usdTicksToDollarAmount(grokCostInUsdTicks)
: calculateExternalApiDollarAmount({
    sourceType: 'api_tts',
    provider,

const grokDollarAmount =
  grokCostInUsdTicks === undefined
    ? undefined
    : usdTicksToDollarAmount(grokCostInUsdTicks);

const GROK_CHAR_BUCKET = 50;
const GROK_CREDITS_PER_BUCKET = 50;

// our cost
// $15.00 / 1M characters
// ~4 credits per character (~4,000 credits per minute of audio)
// Gross margin:
// 99.5%

- **Gpro voices**: ~1 credit per token (based on text and audio length)
- **Grok voices**: 4 credits per 100 characters
- **Grok voices**: 100 credits per 100 characters

expect(response.status).toBe(200);
expect(json.data).toHaveLength(3);
expect(json.data[0].id).toBe('gpro');
expect(json.data[2].id).toBe('xai');

const xaiResponseBuffer = new Uint8Array([10, 20, 30, 40]).buffer;
const xaiAudioBytes = new Uint8Array([10, 20, 30, 40]);
const xaiAudioBase64 = Buffer.from(xaiAudioBytes).toString('base64');
const xaiCostInUsdTicks = 1950;

expect(json.url).toContain('files.sexyvoice.ai');
expect(json.url).toContain('.mp3');

const expectedDollarAmount = usdTicksToDollarAmount(xaiCostInUsdTicks);
Replace 1950/1650 tick values (which round away to 0.000002, causing 2.6% precision loss) with 15_000_000 (= $0.000015 exactly at 6dp, matching the fallback pricing rate). Also update the v1/speech primary Grok test to use a JSON mock response and add assertions on saveAudioFileAdmin and insertUsageEvent for costInUsdTicks and dollarAmount, so the cost-tracking paths are actually exercised. https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u
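The precision loss mentioned above can be checked with a short calculation. This is an illustrative sketch, assuming 1 tick = $1e-9 and rounding to 6 decimal places as stated elsewhere in this PR. `roundingErrorPercent` is a hypothetical helper, not part of the codebase.

```typescript
// Relative error (in percent) introduced by rounding a tick amount
// to 6 decimal places of a dollar. Assumes 1 tick = $1e-9.
function roundingErrorPercent(ticks: number): number {
  const exact = ticks / 1e9;
  const rounded = Number(exact.toFixed(6));
  return (Math.abs(rounded - exact) / exact) * 100;
}

// 1950 ticks is $0.00000195 exactly, which rounds to $0.000002.
const lossFor1950 = roundingErrorPercent(1950); // about 2.56

// A whole multiple of 1,000 ticks has no digits beyond the sixth decimal,
// so rounding leaves it unchanged.
const lossFor15M = roundingErrorPercent(15_000_000); // 0
```

Picking test fixtures that are whole multiples of 1,000 ticks keeps the expected `dollarAmount` exact, so assertions do not depend on rounding behavior.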
…ET changed to 50
GROK_CHAR_BUCKET/GROK_CREDITS_PER_BUCKET were halved (100→50) upstream,
making estimateGrokCredits("Hello world") return 50 instead of 100.
https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u
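The bucket arithmetic behind this test change can be sketched as below. The constants and the name `estimateGrokCredits` appear in the PR, but the body is an assumption: it rounds partial buckets up, which is consistent with "Hello world" (11 characters) costing one full bucket.

```typescript
const GROK_CHAR_BUCKET = 50;
const GROK_CREDITS_PER_BUCKET = 50;

// Assumption: partial buckets are rounded up, so any non-empty text is
// billed for at least one bucket.
function estimateGrokCredits(text: string): number {
  const buckets = Math.ceil(text.length / GROK_CHAR_BUCKET);
  return buckets * GROK_CREDITS_PER_BUCKET;
}

const helloCredits = estimateGrokCredits('Hello world'); // 50
const longerCredits = estimateGrokCredits('a'.repeat(51)); // 100
```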
sourceId: 'test-audio-file-id',
model: 'xai',
dollarAmount: expectedDollarAmount,
creditsUsed: 50,
metadata: expect.objectContaining({

// $4.20 / 1M characters
// ~1 credit per character (~1,000 credits per minute of audio)
// $15.00 / 1M characters
// ~4 credits per character (~4,000 credits per minute of audio)
response_format: z
  .enum(['wav', 'mp3'])
  .optional()
  .describe('Audio format. Default depends on model'),
  .describe('Audio format. `xai` supports both `mp3` and `wav`.'),
style: z
  .string()
  .optional()
  .describe('Emotion/style variant (e.g., "happy", "sad", "whisper")'),
  .describe(
    'Ignored for `xai` requests. Use Grok speech tags in `input` instead.',
…k, fix stale comments
- v1/speech: restore missing `model` arg in calculateExternalApiDollarAmount — without it the key lookup falls back to ZERO_PRICE for all providers
- generate-voice: fall back to calculateGrokTtsDollarAmount(text) when xAI doesn't return cost_in_usd_ticks (binary audio path), so cost tracking has no gaps; also include dollarAmount in saveAudioFile for all Grok voices
- utils.ts: replace stale "~4 credits per char / Gross margin 99.5%" comment with accurate description of the 50-char bucket billing logic
- docs: update billing description to reflect 50-char buckets (was "100 per 100")
- api-v1-meta test: use set membership (toContain) instead of hard-coded index assertions so ordering changes don't break the test
https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u
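The fallback described for the generate-voice route can be sketched like this. It is an illustration under stated assumptions, not the route's code: `resolveGrokDollarAmount` is a hypothetical name, and the body of `calculateGrokTtsDollarAmount` is a stand-in that assumes the $15.00 per 1M characters rate quoted in the code comments of this PR.

```typescript
// Stand-in for the estimator named in the commit message. The rate is an
// assumption taken from the "$15.00 / 1M characters" comment in the diff.
function calculateGrokTtsDollarAmount(text: string): number {
  return Number(((text.length * 15) / 1_000_000).toFixed(6));
}

// Prefer the cost reported by xAI; fall back to the estimate when the
// response carried no cost_in_usd_ticks (for example, binary audio).
function resolveGrokDollarAmount(
  text: string,
  costInUsdTicks: number | undefined,
): number {
  return costInUsdTicks === undefined
    ? calculateGrokTtsDollarAmount(text)
    : Number((costInUsdTicks / 1e9).toFixed(6)); // 1 tick = $1e-9
}
```

With this shape every Grok request produces a `dollarAmount`, so usage events have no gaps regardless of which response format xAI returns.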
Summary
Updated the xAI TTS integration to capture and track actual cost data from xAI's API response instead of relying on estimated pricing. The xAI API now returns `cost_in_usd_ticks` in the response, which is converted to USD and stored for accurate billing and usage tracking.
Changes
- `generateXaiTts()` now parses JSON responses from the xAI API and extracts `cost_in_usd_ticks` from the usage object
- The `/api/v1/speech` and `/api/generate-voice` routes use actual xAI costs when available, falling back to estimated pricing for other providers
- `costInUsdTicks` is propagated into usage metadata and used to calculate a precise `dollarAmount` (1 tick = $0.000_000_001) for Grok voices
How to test
- Run `pnpm test -- v1-speech.test.ts generate-voice.test.ts`
- Check that `cost_in_usd_ticks` is extracted from xAI responses
- Check that `saveAudioFileAdmin` and `insertUsageEvent` are called with the correct cost data in both test files
Scope
Checklist
Notes for reviewers
The changes maintain backward compatibility by only using xAI's reported costs for Grok voices when available. Other TTS providers continue to use the estimated pricing table. The conversion from ticks to USD uses `toFixed(6)` for precision in the v1 API route to match typical currency formatting.
https://claude.ai/code/session_014T6dJTtWEyjF7Nok7K3t2u