Source: PR #271:
Separately, deepseek's overall verbosity (the 888k tokens across Search+Verify) is a model characteristic, not addressed here.
Impact: /deep-research has no per-stage or per-run token ceiling. A verbose model can spend hundreds of thousands of tokens in the Search and Verify fan-out before the user sees anything, with no warning and no abort threshold. #271 fixed the Synthesize loop but the spend in earlier stages is uncapped.
Fix sketch:
- Track cumulative tokens per stage in the deep-research harness (the workflow engine already counts per-agent usage).
- Add a configurable budget (e.g. CLAWCODEX_DEEP_RESEARCH_TOKEN_BUDGET, default a few hundred k). When it is exceeded, stop launching new search/verify agents, log what was skipped, and proceed to Synthesize with what was collected.
- Surface the spend in the progress UI so the user sees runaway cost early.
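
A minimal sketch of the budget check described above, not the harness's actual code. Only the environment variable name comes from this issue. The 300,000 default, the TokenBudget class, read_budget, run_fanout, and the run_agent callable (assumed to return an output plus its token count) are hypothetical names for illustration. Agents run sequentially here to keep the logic readable; a real fan-out would check the budget before launching each concurrent agent.

```python
import os
from dataclasses import dataclass, field

# Assumption: the issue only says "a few hundred k"; 300_000 is a placeholder.
DEFAULT_BUDGET = 300_000
ENV_VAR = "CLAWCODEX_DEEP_RESEARCH_TOKEN_BUDGET"


def read_budget(env=None):
    """Read the budget from the environment, falling back to the default."""
    env = os.environ if env is None else env
    try:
        value = int(env.get(ENV_VAR, ""))
    except ValueError:
        return DEFAULT_BUDGET
    return value if value > 0 else DEFAULT_BUDGET


@dataclass
class TokenBudget:
    """Hypothetical per-run budget that tracks cumulative tokens per stage."""
    limit: int
    spent_by_stage: dict = field(default_factory=dict)
    skipped: list = field(default_factory=list)

    @property
    def spent(self):
        return sum(self.spent_by_stage.values())

    def record(self, stage, tokens):
        self.spent_by_stage[stage] = self.spent_by_stage.get(stage, 0) + tokens

    def exhausted(self):
        return self.spent >= self.limit


def run_fanout(stage, tasks, run_agent, budget):
    """Run tasks until the budget is spent; record the rest as skipped."""
    results = []
    for task in tasks:
        if budget.exhausted():
            # Stop launching new agents, but remember what was dropped
            # so it can be logged before moving on to Synthesize.
            budget.skipped.append((stage, task))
            continue
        output, tokens = run_agent(task)
        budget.record(stage, tokens)
        results.append(output)
    return results
```

Because the check happens before each launch, the run can overshoot the limit by at most one agent's usage (or one batch of in-flight agents in a concurrent harness). The spent_by_stage mapping is the figure a progress UI could display, and skipped is what would be logged before proceeding to Synthesize.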