Fix avg_decode calculation when decode_time is 0 for single tokens by KhudaBuxMagsi · Pull Request #13 · baidu/Unlimited-OCR

KhudaBuxMagsi · 2026-06-24T06:39:07Z

Description

In collect_stream_silent (Line 184), decode_time is explicitly set to 0 if token_count <= 1. However, during the metrics aggregation inside the run function, avg_decode was dividing the total sum by successful requests (which counts any request where tokens > 0).

This skewed the final Avg decode_time/request metric whenever single-token responses occurred, as their 0.0s decode times were included in the summation but still counted as full requests in the denominator.

Changes

Added successful_decode_reqs to explicitly count only requests where decode_time > 0.
Updated the avg_decode calculation to safely divide by successful_decode_reqs.
Kept avg_tokens dependent on the overall successful counter.

rajpratham1

This is a focused bug fix that addresses an edge case in the reporting logic without changing the inference pipeline itself. Previously, requests that produced only a single token had decode_time = 0 but were still included in the average calculation, which could skew the reported metric. Excluding those entries from the average while handling the "no valid decode times" case is a sensible improvement.

kushdab · 2026-06-26T00:04:18Z

Good catch — the bug is real and the fix is correct.

What the bug was

avg_decode was computed as:

sum(r["decode_time"] for r in results if r["tokens"] > 0) / successful

successful counts every request with tokens > 0, which includes single-token responses where decode_time is explicitly set to 0.0 (by the token_count <= 1 branch in collect_stream_silent). Those zero-time entries dragged the average down without representing a real decode step.

The fix

Using a separate successful_decode_reqs counter — requests where decode_time > 0 — is exactly the right semantics: "average decode time, among requests that actually had a decode phase." The if successful_decode_reqs > 0 guard covers the edge case where every request in the batch returned a single token (prints 0.0s, which is accurate).

Minor notes

Line 334 adds a trailing space before the newline on the main() line, and the final line is missing a newline (\ No newline at end of file). Worth cleaning both up before merge.
When avg_decode = 0.0 is printed because all requests were single-token, the metric is slightly misleading at a glance. Optionally printing "N/A (all single-token)" for that case would improve clarity, but that is a polish item, not a correctness issue.

The core change is solid. ✅

Fix avg_decode calculation when decode_time is 0 for single tokens

295282b

shreyasmene06 mentioned this pull request Jun 24, 2026

[bug+feature] infer.py: hardcoded ngram params, no --resume, no structured results, image-mode validation, tmpdir leak #17

Open

rajpratham1 approved these changes Jun 25, 2026

View reviewed changes

kushdab mentioned this pull request Jun 26, 2026

Make batch inference resumable and leak-free #36

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix avg_decode calculation when decode_time is 0 for single tokens#13

Fix avg_decode calculation when decode_time is 0 for single tokens#13
KhudaBuxMagsi wants to merge 1 commit into
baidu:mainfrom
KhudaBuxMagsi:fix-avg-decode-calculation

KhudaBuxMagsi commented Jun 24, 2026

Uh oh!

rajpratham1 left a comment

Uh oh!

kushdab commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

KhudaBuxMagsi commented Jun 24, 2026

Description

Changes

Uh oh!

rajpratham1 left a comment

Choose a reason for hiding this comment

Uh oh!

kushdab commented Jun 26, 2026

What the bug was

The fix

Minor notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants