feat(qoderwork): intercept token usage via QODER_WORKER_RUNTIME_PATH wrapper#80
feat(qoderwork): intercept token usage via QODER_WORKER_RUNTIME_PATH wrapper#80fangxiu-wf wants to merge 3 commits into
Conversation
|
CLA Not Signed The Contributor License Agreement (CLA) check is currently pending on this PR ( @fangxiu-wf please sign the CLA via the CLA assistant badge in the comment above, or visit https://cla-assistant.io/alibaba/loongsuite-pilot. Once signed, the Automated check by github-manager-bot |
Co-authored-by: multica-agent <github@multica.ai>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>
e1875c2 to
78b9ff8
Compare
ralf0131
left a comment
There was a problem hiding this comment.
Summary
Well-structured token interception approach via QODER_WORKER_RUNTIME_PATH wrapper, with good test coverage (38 tests). However, one critical safety issue: the wrapper throws (instead of degrading gracefully) when the real runtime path isn't found, which could crash QoderWork's worker thread entirely. Several privacy concerns around full system prompt capture also need attention before merge.
Findings
- [CRITICAL]
assets/hooks/qoderwork-runtime-wrapper.mjs:115— Throws if the real QoderWork runtime is not found at any hard-coded candidate path. Because this wrapper is loaded as QODER_WORKER_RUNTIME_PATH, throwing here crashes the worker thread and breaks QoderWork entirely. The comment claims the SDK will fall back to ProcessTransport, but the throw prevents that. Degrade gracefully (log and return) so the wrapper cannot disrupt normal agent operation. - [WARNING]
assets/hooks/qoderwork-runtime-wrapper.mjs:53— Globally overriding JSON.parse and performing synchronous file I/O (fs.appendFileSync) on the parse hot path blocks the worker event loop. Every SSE chunk and internal JSON parse pays this cost. Consider buffering writes asynchronously or using a writable stream with backpressure to avoid stalling the agent. - [WARNING]
assets/hooks/qoderwork-runtime-wrapper.mjs:81— JSON.stringify is globally overridden to capture the full system prompt content in plaintext and append it to a log file. This is privacy-sensitive: system prompts may contain user instructions, project context, or secrets. Store only a hash/identifier or redact before writing, and document the data retention policy. - [WARNING]
assets/hooks/qoderwork-runtime-wrapper.mjs:87— MIN_SYSTEM_PROMPT_LENGTH only filters by length; it does not redact or exclude the system prompt. A 101-character system prompt is still captured verbatim. Review whether full system prompt capture is necessary for token accounting. - [WARNING]
src/inputs/qoder-work-trace/qoder-work-trace-input.ts:592— The wrapper captures reasoning_tokens, but applyInterceptUsage and applyInterceptCacheReasoning never write it to the LLM span. This leaves AGENT token sum potentially misaligned when reasoning tokens are present and makes the wrapper field unused. - [INFO]
src/inputs/qoder-work-trace/qoder-work-trace-input.ts:598— Linear scan (interceptData.tokens.find) per response is O(n*m). For long sessions this becomes expensive. Consider building a Map<id, token> once when intercept data is loaded. - [INFO]
assets/hooks/qoderwork-runtime-wrapper.mjs:73— appendFileSync opens and closes the file descriptor on every token record. Combined with the JSON.parse hot path, this creates unnecessary syscall overhead. Holding a persistent fd or batching writes would reduce overhead. - [INFO]
assets/hooks/qoderwork-runtime-wrapper.mjs:56— The detection heuristic (result.usage && result.choices !== undefined && result.id) may match non-LLM JSON objects that happen to have these fields. A more specific check (e.g., result.object === 'chat.completion' or model presence) would reduce false positives. - [INFO]
src/inputs/qoder-work-trace/qoder-work-trace-input.ts:603— totalTokens || fallback treats a legitimate 0 total as missing. Use ?? or an explicit undefined check to preserve zero values when reported by the provider. - [INFO]
assets/hooks/qoderwork-hook-processor.mjs:354— Using || for responseId fallback treats empty string or '0' as missing. Prefer ?? for id values so a valid but falsy message.id is not silently discarded. - [INFO]
deploy/installer-opensource.sh:916— QoderWork wrapper injection is Darwin-only; Linux and Windows paths are skipped. Documented in code, but cross-platform users will not get token intercept coverage.
Automated review by github-manager-bot
背景
QoderWork 采集场景下,pilot 无法直接拿到 LLM 调用的 token 用量(尤其
cache_read.input_tokens),导致 AGENT span 的 token sum 与各 LLM span 之和无法对齐,也无法回填缓存命中数。Issue AGE-378 要求通过拦截 QoderWork 子进程的请求/响应来补齐 token 字段。核心改动
assets/hooks/qoderwork-runtime-wrapper.mjs:通过QODER_WORKER_RUNTIME_PATH包装 QoderWork 子进程,拦截 LLM 请求/响应,将每次调用的prompt_tokens/completion_tokens/cached_tokens写入 intercept JSONL。deploy/installer-opensource.sh中inject_qodercli_token_intercept安装钩子。src/inputs/qoder-trace/intercept-token-reader.ts:readInterceptData增加filename形参,按 chatcmpl id 索引。src/inputs/qoder-work-trace/qoder-work-trace-input.ts:在 LLM span 上回填gen_ai.usage.cache_read.input_tokens,保持 AGENT token sum 严格等于 sum(LLM)。reasoning_tokens:该字段不在 OTel GenAI 规范中,已通过 revert 移除,避免污染下游 OLAP 列。修改文件
inject_qodercli_token_intercept安装步骤readInterceptData加filename形参验证结果
单元测试(提交前本地反向核验)
E2E(tester 报告 comment b6fdb253,coordinator 复核 PASS)
真实 QoderWork.app GUI (macOS, PID 33177) 触发多轮 ReAct + 工具对话(非替代、非伪造),install commit
1.0.0_e1875c2,trace 10 spans、3 STEP / 2 TOOL。复现命令:
validate-trace verdict:
PASS— 133 checks | 34 pass | 99 warn | 0 error | 0 skipped(99 WARN 全为*.should.*可选属性缺失,非阻断)。6 项 checklist:
gen_ai.input.messages/gen_ai.output.messages非空gen_ai.usage.cache_read.input_tokens回填reasoning_tokens不出现在 LLM span(反向)AGENT token sum 严格校验:
total=105338 / input=104979 / output=359 / cache_read=69796与sum(LLM.*)逐字段精确等。