
fix(client): preserve cache token counts when message_delta omits them #215

Open
sumleo wants to merge 2 commits into anthropics:main from sumleo:fix/cache-token-null-merge


sumleo commented Jun 17, 2026


Summary

The Microsoft.Extensions.AI streaming bridge loses the prompt-caching token counts reported by message_start whenever the terminal message_delta omits the cache fields (which is the common case).

In both AsIChatClient bridges, the message_delta handler replaced the accumulated usage wholesale:

usageDetails = ToUsageDetails(deltaUsage);

MessageDeltaUsage.CacheReadInputTokens and CacheCreationInputTokens are nullable and are usually absent on message_delta. ToUsageDetails maps those nulls straight through, so the cache counts populated from message_start are wiped out: CachedInputTokenCount becomes null and InputTokenCount/TotalTokenCount are understated.

Root cause

  • src/Anthropic/AnthropicClientExtensions.cs (RawMessageDeltaEvent handler, ~line 638)
  • src/Anthropic/Services/Beta/Messages/AnthropicBetaClientExtensions.cs (BetaRawMessageDeltaEvent handler, ~line 361)

Fix

Replace the wholesale assignment with a per-field, null-safe merge: only the delta's non-null cache values override what message_start already populated; when the delta omits a cache field, the existing value is kept and the derived input/total counts are recomputed accordingly.

This mirrors how the SDK already merges streaming usage elsewhere:

  • src/Anthropic/Helpers/MessageContentAggregator.cs (null-checked overwrite of CacheCreationInputTokens/CacheReadInputTokens, lines ~70-80)
  • FallbackStreamSplicer.Backfill (src/Anthropic/Helpers/FallbackStreamSplicer.cs, ~line 889)
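
For readers skimming the thread, the per-field, null-safe merge described under "Fix" can be sketched roughly as follows. This is only an illustration: `DeltaUsage`, `UsageSnapshot`, and `UsageMerge` are hypothetical stand-ins, not the SDK's actual types or the code in this PR, and the sketch assumes the reported input count is the sum of the uncached, cache-creation, and cache-read tokens.

```csharp
using System;

// Hypothetical stand-in for the usage carried by message_delta.
// The cache fields (and InputTokens) are nullable because the event may omit them.
public sealed record DeltaUsage(
    long? InputTokens,
    long OutputTokens,
    long? CacheCreationInputTokens,
    long? CacheReadInputTokens);

// Hypothetical stand-in for the usage accumulated so far (seeded from message_start).
public sealed record UsageSnapshot(
    long InputTokens,
    long OutputTokens,
    long CacheCreationInputTokens,
    long CacheReadInputTokens)
{
    // Assumption: the reported input count includes both kinds of cache tokens.
    public long TotalInput => InputTokens + CacheCreationInputTokens + CacheReadInputTokens;

    public long Total => TotalInput + OutputTokens;
}

public static class UsageMerge
{
    // A non-null delta field overrides the accumulated value;
    // a null (omitted) field keeps whatever message_start reported.
    public static UsageSnapshot Merge(UsageSnapshot start, DeltaUsage delta) =>
        new(
            delta.InputTokens ?? start.InputTokens,
            delta.OutputTokens,
            delta.CacheCreationInputTokens ?? start.CacheCreationInputTokens,
            delta.CacheReadInputTokens ?? start.CacheReadInputTokens);
}
```

Because the derived totals are computed from the merged fields rather than stored, they are recomputed automatically once the cache counts survive the merge.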

Test

Added GetStreamingResponseAsync_DeltaWithoutCacheTokens_PreservesStartCacheTokens to AnthropicClientExtensionsTestsBase (so it runs for both the standard and Beta bridges). It streams a message_start carrying cache tokens followed by a message_delta that omits the cache fields, and asserts the final usage still reports CachedInputTokenCount, CacheCreationInputTokens, and the correct InputTokenCount/TotalTokenCount.

These are hand-written bridge files (not generated), so the change is local to them.

@sumleo sumleo requested a review from a team as a code owner June 17, 2026 00:39

sumleo commented Jun 18, 2026

Author

Hi @anthropics, gentle nudge on this when you have a moment. It's a small, self-contained prompt-caching fix, and I'm happy to rebase or tweak anything if that would make review easier. Thanks for the project and your time!

// The cache token fields are nullable and are usually omitted on the
// terminal message_delta, so merge them with the values already
// populated from message_start instead of replacing wholesale.
usageDetails = ToUsageDetails(deltaUsage, usageDetails);

Contributor


Rather than passing usageDetails, you can use local variables to track CacheCreationInputTokens and CacheReadInputTokens (and also InputTokens and ServerToolUse since those are nullable too).

Author


Done in b346a3d. Both the stable and beta streaming paths now capture InputTokens, CacheCreationInputTokens, CacheReadInputTokens, and ServerToolUse from message_start into locals and fall back to them when the terminal message_delta omits them, instead of round-tripping through the previously built UsageDetails. That also removes the helper that re-read CacheCreationInputTokens back out of AdditionalCounts, and covers the InputTokens/ServerToolUse omission you noted (both nullable). csharpier and the build are clean; I couldn't launch the test runner locally (the testhost gets killed in my environment), so I'm relying on CI to run the suite.
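
A rough sketch of the locals-based approach described in this reply, for anyone following along without the diff open. All type and member names here (`StreamEvent`, `StartEvent`, `DeltaEvent`, `FinalUsage`, `StreamUsageTracker`) are hypothetical stand-ins rather than the SDK's event types or the code in b346a3d, and ServerToolUse is left out for brevity; it would be tracked the same way.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, simplified stream events; not the SDK's real types.
public abstract record StreamEvent;

public sealed record StartEvent(
    long InputTokens,
    long? CacheCreationInputTokens,
    long? CacheReadInputTokens) : StreamEvent;

public sealed record DeltaEvent(
    long? InputTokens,
    long OutputTokens,
    long? CacheCreationInputTokens,
    long? CacheReadInputTokens) : StreamEvent;

public sealed record FinalUsage(
    long? InputTokens,
    long OutputTokens,
    long? CacheCreation,
    long? CacheRead);

public static class StreamUsageTracker
{
    public static FinalUsage? Track(IEnumerable<StreamEvent> events)
    {
        // Locals hold the last known value of each nullable field.
        long? inputTokens = null;
        long? cacheCreation = null;
        long? cacheRead = null;
        FinalUsage? final = null;

        foreach (var ev in events)
        {
            switch (ev)
            {
                case StartEvent s:
                    // Seed the locals from message_start.
                    inputTokens = s.InputTokens;
                    cacheCreation = s.CacheCreationInputTokens;
                    cacheRead = s.CacheReadInputTokens;
                    break;

                case DeltaEvent d:
                    // Fall back to the locals when the delta omits a field.
                    inputTokens = d.InputTokens ?? inputTokens;
                    cacheCreation = d.CacheCreationInputTokens ?? cacheCreation;
                    cacheRead = d.CacheReadInputTokens ?? cacheRead;
                    final = new FinalUsage(inputTokens, d.OutputTokens, cacheCreation, cacheRead);
                    break;
            }
        }

        return final;
    }
}
```

Keeping the raw values in locals means the final usage is built once from source fields, so nothing has to be read back out of an already-converted usage object.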

… merge

Track the nullable cumulative usage fields (InputTokens, CacheCreationInputTokens,
CacheReadInputTokens, ServerToolUse) reported on message_start in local variables
and fall back to them when the terminal message_delta omits them, instead of
round-tripping through the previously built UsageDetails.

This also covers InputTokens and ServerToolUse (both nullable) being omitted on
the delta, and removes the helper that re-parsed CacheCreationInputTokens back out
of AdditionalCounts. Same change applied to the beta path.