Fix avg_decode calculation when decode_time is 0 for single tokens #13
KhudaBuxMagsi wants to merge 1 commit into
Conversation
rajpratham1 left a comment
This is a focused bug fix that addresses an edge case in the reporting logic without changing the inference pipeline itself. Previously, requests that produced only a single token had decode_time = 0 but were still included in the average calculation, which could skew the reported metric. Excluding those entries from the average while handling the "no valid decode times" case is a sensible improvement.
Good catch: the bug is real and the fix is correct.
What the bug was
sum(r["decode_time"] for r in results if r["tokens"] > 0) / successful
The fix
Using a separate …
Minor notes
The core change is solid. ✅
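To make the skew concrete, here is a small sketch with made-up numbers. The `results` list and its values are hypothetical; only the field names `tokens` and `decode_time` and the old expression come from the snippet quoted in this thread.

```python
# Hypothetical per-request results (values invented for illustration).
results = [
    {"tokens": 10, "decode_time": 2.0},
    {"tokens": 1, "decode_time": 0.0},  # single token, so decode_time is 0
    {"tokens": 20, "decode_time": 4.0},
]

# Counts every request that produced at least one token.
successful = sum(1 for r in results if r["tokens"] > 0)

# Old expression: the single-token request adds 0 to the sum
# but still adds 1 to the denominator.
old_avg_decode = sum(r["decode_time"] for r in results if r["tokens"] > 0) / successful

# Average over only the requests that actually spent time decoding.
decoded = [r["decode_time"] for r in results if r["decode_time"] > 0]
decoded_only_avg = sum(decoded) / len(decoded)

print(old_avg_decode, decoded_only_avg)
```

With these numbers the old expression reports 2.0 s per request, while the two requests that actually decoded average 3.0 s.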
Description
In collect_stream_silent (Line 184), decode_time is explicitly set to 0 if token_count <= 1. However, during the metrics aggregation inside the run function, avg_decode was dividing the total sum by successful requests (which counts any request where tokens > 0).
This skewed the final Avg decode_time/request metric whenever single-token responses occurred, as their 0.0s decode times were included in the summation but still counted as full requests in the denominator.
Changes
Added successful_decode_reqs to explicitly count only requests where decode_time > 0.
Updated the avg_decode calculation to safely divide by successful_decode_reqs.
Kept avg_tokens dependent on the overall successful counter.
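The aggregation described above could look roughly like the sketch below. This is not the code from the diff: the `summarize` helper, the shape of `results`, and the `0.0` fallback when no request has a positive decode time are all assumptions made for illustration.

```python
from typing import Dict, List


def summarize(results: List[Dict[str, float]]) -> Dict[str, float]:
    """Hypothetical helper mirroring the aggregation described in this PR."""
    # Requests that produced at least one token.
    successful = sum(1 for r in results if r["tokens"] > 0)

    # Only requests with a positive decode time count toward avg_decode.
    decode_times = [r["decode_time"] for r in results if r["decode_time"] > 0]
    successful_decode_reqs = len(decode_times)

    # Assumption: fall back to 0.0 when there is nothing to average,
    # which also avoids a ZeroDivisionError.
    avg_decode = (
        sum(decode_times) / successful_decode_reqs if successful_decode_reqs else 0.0
    )

    # avg_tokens still uses the overall successful counter.
    total_tokens = sum(r["tokens"] for r in results if r["tokens"] > 0)
    avg_tokens = total_tokens / successful if successful else 0.0

    return {"avg_decode": avg_decode, "avg_tokens": avg_tokens}


sample = [
    {"tokens": 10, "decode_time": 2.0},
    {"tokens": 1, "decode_time": 0.0},  # single-token response
    {"tokens": 19, "decode_time": 4.0},
]
summary = summarize(sample)
print(summary)
```

Filtering on `decode_time > 0` rather than `tokens > 1` keeps the denominator tied to the same condition that decides whether a value contributes to the sum, so the two cannot drift apart.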