feat: expose cumulative token usage per ExecuteTools run by localai-bot · Pull Request #55 · mudler/cogito

localai-bot · 2026-06-04T08:22:28Z

Summary

Adds a way to read the cumulative token usage of a whole ExecuteTools run (and therefore of each spawned sub-agent), where previously only the last LLM call's usage was retained on Status.LastUsage.

New Status.CumulativeUsage LLMUsage field, summed across every LLM call in a run.
A small counting LLM decorator (newCountingLLM) that accumulates usage from both CreateChatCompletion and Ask. It preserves StreamingLLM so wrapping never disables the streaming code path.
ExecuteTools wraps its llm once (after the sub-agent fallback agentLLM is captured, so a sub-agent's tokens are not folded into the parent) and stamps the total onto the returned fragment via a deferred named return — covering all exit paths.

Because each sub-agent runs its own ExecuteTools and runAgent assigns that fragment to AgentState.Fragment before firing the completion callback, AgentState.Fragment.Status.CumulativeUsage is populated per sub-agent for embedders to report.

Known limitation

The bundled stream clients don't yet populate StreamEvent.Usage on the done event, so streaming-path tokens read as zero until they request usage from the API (e.g. StreamOptions{IncludeUsage: true}). The non-streaming path is fully counted. This is documented in usage_counter.go.

Test Plan

usage_counter_internal_test.go: decorator sums both Ask + CreateChatCompletion; newCountingLLM yields a StreamingLLM iff the inner LLM is one.
tools_cumulative_test.go: a multi-call ExecuteTools run reports CumulativeUsage equal to the summed dispensed usage and greater than LastUsage.
Streaming regression (TestAskWithStreaming*) passes — the wrap preserves streaming.
Full non-e2e suite green. (One pre-existing background-spawn mock race is unrelated and present at the base commit.)

🤖 Generated with Claude Code

Add an exported Background flag on AgentState, set true for background spawns (spawn_agent background=true) and false for foreground ones. This lets embedders tell unattended background work apart from a foreground sub-agent whose result is consumed inline, e.g. to auto-notify on completion only for background agents.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
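An embedder's completion callback might combine the Background flag with the per-agent cumulative usage roughly as follows. The struct layouts here are simplified stand-ins, and the ID field and completionNotice helper are hypothetical names introduced for this sketch.

```go
package main

import "fmt"

// Simplified stand-ins for the types named in the PR.
type LLMUsage struct{ TotalTokens int }
type Status struct{ CumulativeUsage LLMUsage }
type Fragment struct{ Status Status }

// AgentState: Background and Fragment come from the PR text; ID is hypothetical.
type AgentState struct {
	ID         string
	Background bool
	Fragment   Fragment
}

// completionNotice returns a message for background agents and "" for
// foreground ones, whose result is consumed inline by the caller.
func completionNotice(s AgentState) string {
	if !s.Background {
		return ""
	}
	return fmt.Sprintf("agent %s finished: %d tokens", s.ID, s.Fragment.Status.CumulativeUsage.TotalTokens)
}

func main() {
	bg := AgentState{ID: "a1", Background: true}
	bg.Fragment.Status.CumulativeUsage.TotalTokens = 420
	fg := AgentState{ID: "a2"}
	fmt.Printf("%q %q\n", completionNotice(bg), completionNotice(fg))
}
```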

…16)

* docs: design spec for sub-agent completion stats line
  Spec for rendering a Claude-Code-style completion summary (tools · cumulative tokens · elapsed) when a spawned sub-agent finishes, in both the TUI and the plain CLI. Cumulative token tracking requires a small cogito-side accumulator exposed on the returned fragment.
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs: implementation plan for sub-agent completion stats line
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* build: replace cogito with local worktree during dev
  Temporary: lets nib build against the un-released CumulativeUsage change. Removed once cogito is tagged (see plan Task 9).
* feat(chat): AgentEvent run-stats fields and formatters
* feat(chat): time sub-agents and populate completion run-stats
* feat(tui): show sub-agent run-stats on the completion marker
* feat(cli): show sub-agent run-stats on the completion line
* build: bump cogito to v0.10.1-0.20260604082319-fe7fd5de11d1; drop local replace
  cogito's CumulativeUsage (mudler/cogito#55) is now released on main, so nib depends on the published pseudo-version and the dev-time local replace is removed.
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mudler and others added 4 commits June 2, 2026 21:14

feat: add per-run token usage accumulator (CumulativeUsage)

3af3bd7

Adds a counting LLM decorator and a CumulativeUsage field on Status so a full ExecuteTools run's token usage can be summed and exposed. Preserves StreamingLLM so wrapping does not disable streaming.

refactor: harden counting stream forwarder and document usage limits

cd0a989

Buffer the forwarded stream channel and make the send context-aware to match the client convention and prevent goroutine leaks. Document that streaming-path usage is unpopulated by the bundled clients today.

feat: expose cumulative token usage on the ExecuteTools result

54f027b

Wraps the run LLM in a counting decorator and stamps the summed usage onto the returned fragment's Status.CumulativeUsage via a deferred named return, covering all exit paths. Each sub-agent run reports its own total.

localai-bot mentioned this pull request Jun 4, 2026

feat: show sub-agent run-stats (tools · tokens · time) on completion mudler/nib#16

Merged

5 tasks

mudler merged commit fe7fd5d into main Jun 4, 2026
2 of 3 checks passed

mudler merged 4 commits into main from feat/cumulative-usage
