feat(qoderwork): intercept token usage via QODER_WORKER_RUNTIME_PATH wrapper #80

Open
fangxiu-wf wants to merge 3 commits into main from fangxiu-wf/qoderwork-token-intercept

@fangxiu-wf

Collaborator

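Background

In the QoderWork collection scenario, pilot cannot directly obtain the token usage of LLM calls (in particular cache_read.input_tokens). As a result, the AGENT span's token sum cannot be aligned with the sum over the LLM spans, and the cache-hit count cannot be backfilled. Issue AGE-378 asks for the token fields to be filled in by intercepting the requests and responses of the QoderWork subprocess.

Core changes

  • Add assets/hooks/qoderwork-runtime-wrapper.mjs: wraps the QoderWork subprocess via QODER_WORKER_RUNTIME_PATH, intercepts LLM requests/responses, and writes the prompt_tokens / completion_tokens / cached_tokens of each call to the intercept JSONL.
  • Add the inject_qodercli_token_intercept install hook to deploy/installer-opensource.sh.
  • Modify src/inputs/qoder-trace/intercept-token-reader.ts: readInterceptData gains a filename parameter and indexes records by chatcmpl id.
  • Modify src/inputs/qoder-work-trace/qoder-work-trace-input.ts: backfill gen_ai.usage.cache_read.input_tokens on LLM spans, keeping the AGENT token sum strictly equal to sum(LLM).
  • Writing reasoning_tokens on LLM spans: this field is not part of the OTel GenAI specification, so the change was removed via a revert to avoid polluting downstream OLAP columns.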
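As a rough illustration of the backfill described in the list above, the sketch below parses intercept JSONL text, indexes the records by completion id, and sets gen_ai.usage.cache_read.input_tokens on a matching LLM span. The InterceptToken and LlmSpan shapes and both function names are hypothetical; only the token field names and the attribute key come from this description. Skipping records with cached_tokens of 0 is an assumption chosen to match the checklist row where STEP1 does not trigger a backfill.

```typescript
// Hypothetical shapes for illustration; these are not the project's real types.
interface InterceptToken {
  id: string;
  prompt_tokens: number;
  completion_tokens: number;
  cached_tokens: number;
}

interface LlmSpan {
  responseId: string;
  attributes: Record<string, number | string>;
}

// Parse JSONL text into a Map keyed by completion id.
// Blank and malformed lines are skipped rather than aborting the whole read.
function parseInterceptJsonl(text: string): Map<string, InterceptToken> {
  const byId = new Map<string, InterceptToken>();
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const rec = JSON.parse(line) as Partial<InterceptToken>;
      if (typeof rec.id === "string") {
        byId.set(rec.id, {
          id: rec.id,
          prompt_tokens: rec.prompt_tokens ?? 0,
          completion_tokens: rec.completion_tokens ?? 0,
          cached_tokens: rec.cached_tokens ?? 0,
        });
      }
    } catch {
      // Malformed line: ignore it and keep reading.
    }
  }
  return byId;
}

// Backfill the cache-read attribute when a matching record reports cached tokens.
// Returns true only if the attribute was written.
function backfillCacheRead(
  span: LlmSpan,
  byId: Map<string, InterceptToken>,
): boolean {
  const rec = byId.get(span.responseId);
  if (!rec || rec.cached_tokens <= 0) return false;
  span.attributes["gen_ai.usage.cache_read.input_tokens"] = rec.cached_tokens;
  return true;
}
```

Building the Map once keeps each span lookup constant-time, instead of rescanning the record list for every span.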

Modified files

| File | Description |
| --- | --- |
| assets/hooks/qoderwork-runtime-wrapper.mjs | New: subprocess wrapper that emits the intercept JSONL |
| assets/hooks/qoderwork-hook-processor.mjs | Minor adjustments to work with the new wrapper |
| deploy/installer-opensource.sh | New inject_qodercli_token_intercept install step |
| src/inputs/qoder-trace/intercept-token-reader.ts | readInterceptData: new filename parameter |
| src/inputs/qoder-work-trace/qoder-work-trace-input.ts | Backfill cache_read on LLM spans; align the AGENT token sum |
| tests/unit/hooks/qoderwork/hook-processor.test.mjs | New unit tests |
| tests/unit/inputs/intercept-token-reader.test.ts | New unit tests |
| tests/unit/inputs/qoder-work-trace-input.test.ts | Extended unit tests for backfill and token sum checks |

Verification results

Unit tests (cross-checked locally before submission)

npx vitest run tests/unit/hooks/qoderwork/ \
  tests/unit/inputs/qoder-work-trace-input.test.ts \
  tests/unit/inputs/intercept-token-reader.test.ts
# → Test Files 3 passed (3) | Tests 38 passed (38)

npx tsc --noEmit
# → 0 error

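E2E (tester report in comment b6fdb253; coordinator re-check: PASS)

A multi-round ReAct + tool conversation was triggered in the real QoderWork.app GUI (macOS, PID 33177), not a substitute or a mock. Install commit 1.0.0_e1875c2; the trace contains 10 spans, with 3 STEP / 2 TOOL.

Reproduction commands: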

multica attachment download 019efe91-bf58-78c4-b578-71aa6f97b67a -o /tmp/pilot-e2e/
multica attachment download 019efe91-bf70-72c0-b3c5-ef071f3a66b2 -o /tmp/pilot-e2e/
node /tmp/loongsuite-pilot/scripts/validate-trace.mjs \
  --input /tmp/pilot-e2e/fx-pilot-qoder-work-2026-06-25.jsonl \
  --format json --output /tmp/pilot-e2e/validate-report.json

validate-trace verdict: PASS (133 checks | 34 pass | 99 warn | 0 error | 0 skipped). All 99 WARNs are missing optional *.should.* attributes and are non-blocking.

6-item checklist:

| # | Check | Result |
| --- | --- | --- |
| 1 | STEP ≥ 3 | 3 ✅ |
| 2 | TOOL ≥ 2 | 2 ✅ |
| 3 | validate-trace 0 ERROR | 0 ERROR / 0 SKIPPED ✅ |
| 4 | LLM gen_ai.input.messages / gen_ai.output.messages non-empty | 3/3 LLM spans ✅ |
| 5 | gen_ai.usage.cache_read.input_tokens backfilled | STEP2=34813 and STEP3=34983 match exactly; STEP1 has cached=0, so no backfill is triggered ✅ |
| 6 | reasoning_tokens absent from LLM spans (negative check) | 0 reasoning.* attributes on 3/3 LLM spans; the revert is clean ✅ |

Strict AGENT token sum check: total=105338 / input=104979 / output=359 / cache_read=69796, each exactly equal to sum(LLM.*) field by field.
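The field-by-field comparison can be sketched as follows. The Usage shape and the function names are hypothetical and do not reflect the project's real types. The only real figures used are the per-STEP cache_read values from the checklist (0, 34813, 34983), which add up to the reported 69796; the reported totals are also internally consistent, since 104979 + 359 = 105338.

```typescript
// Hypothetical usage shape for illustration only.
interface Usage {
  input: number;
  output: number;
  cacheRead: number;
}

// Sum each usage field across all LLM spans.
function sumUsage(spans: Usage[]): Usage {
  return spans.reduce(
    (acc, s) => ({
      input: acc.input + s.input,
      output: acc.output + s.output,
      cacheRead: acc.cacheRead + s.cacheRead,
    }),
    { input: 0, output: 0, cacheRead: 0 },
  );
}

// Return the names of fields where the AGENT span disagrees with sum(LLM).
// An empty array means the strict check passes.
function mismatchedFields(agent: Usage, llmSpans: Usage[]): string[] {
  const sum = sumUsage(llmSpans);
  return (Object.keys(sum) as (keyof Usage)[]).filter(
    (k) => agent[k] !== sum[k],
  );
}

// cache_read per STEP as reported in the checklist.
const cacheReadPerStep = [0, 34813, 34983];
const cacheReadTotal = cacheReadPerStep.reduce((a, b) => a + b, 0);
```

Returning the list of mismatched field names, rather than a single boolean, makes a failing check easier to diagnose.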

@CLAassistant

CLAassistant commented Jun 25, 2026


CLA assistant check
All committers have signed the CLA.

@ralf0131

Collaborator

CLA Not Signed

The Contributor License Agreement (CLA) check is currently pending on this PR (license/cla: Contributor License Agreement is not signed yet.). This PR cannot be merged until the CLA is signed.

@fangxiu-wf please sign the CLA via the CLA assistant badge in the comment above, or visit https://cla-assistant.io/alibaba/loongsuite-pilot. Once signed, the license/cla status will turn green.


Automated check by github-manager-bot

fangxiu-wf and others added 2 commits June 25, 2026 20:36
Co-authored-by: multica-agent <github@multica.ai>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
fangxiu-wf force-pushed the fangxiu-wf/qoderwork-token-intercept branch from e1875c2 to 78b9ff8 on June 25, 2026 12:36

ralf0131 left a comment

Collaborator

Summary

Well-structured token interception approach via QODER_WORKER_RUNTIME_PATH wrapper, with good test coverage (38 tests). However, one critical safety issue: the wrapper throws (instead of degrading gracefully) when the real runtime path isn't found, which could crash QoderWork's worker thread entirely. Several privacy concerns around full system prompt capture also need attention before merge.

⚠️ CLA not signed — this PR cannot be approved/merged until the contributor signs the CLA (see the CLA reminder comment above).

Findings

  • [CRITICAL] assets/hooks/qoderwork-runtime-wrapper.mjs:115 — Throws if the real QoderWork runtime is not found at any hard-coded candidate path. Because this wrapper is loaded as QODER_WORKER_RUNTIME_PATH, throwing here crashes the worker thread and breaks QoderWork entirely. The comment claims the SDK will fall back to ProcessTransport, but the throw prevents that. Degrade gracefully (log and return) so the wrapper cannot disrupt normal agent operation.
  • [WARNING] assets/hooks/qoderwork-runtime-wrapper.mjs:53 — Globally overriding JSON.parse and performing synchronous file I/O (fs.appendFileSync) on the parse hot path blocks the worker event loop. Every SSE chunk and internal JSON parse pays this cost. Consider buffering writes asynchronously or using a writable stream with backpressure to avoid stalling the agent.
  • [WARNING] assets/hooks/qoderwork-runtime-wrapper.mjs:81 — JSON.stringify is globally overridden to capture the full system prompt content in plaintext and append it to a log file. This is privacy-sensitive: system prompts may contain user instructions, project context, or secrets. Store only a hash/identifier or redact before writing, and document the data retention policy.
  • [WARNING] assets/hooks/qoderwork-runtime-wrapper.mjs:87 — MIN_SYSTEM_PROMPT_LENGTH only filters by length; it does not redact or exclude the system prompt. A 101-character system prompt is still captured verbatim. Review whether full system prompt capture is necessary for token accounting.
  • [WARNING] src/inputs/qoder-work-trace/qoder-work-trace-input.ts:592 — The wrapper captures reasoning_tokens, but applyInterceptUsage and applyInterceptCacheReasoning never write it to the LLM span. This leaves AGENT token sum potentially misaligned when reasoning tokens are present and makes the wrapper field unused.
  • [INFO] src/inputs/qoder-work-trace/qoder-work-trace-input.ts:598 — Linear scan (interceptData.tokens.find) per response is O(n*m). For long sessions this becomes expensive. Consider building a Map<id, token> once when intercept data is loaded.
  • [INFO] assets/hooks/qoderwork-runtime-wrapper.mjs:73 — appendFileSync opens and closes the file descriptor on every token record. Combined with the JSON.parse hot path, this creates unnecessary syscall overhead. Holding a persistent fd or batching writes would reduce overhead.
  • [INFO] assets/hooks/qoderwork-runtime-wrapper.mjs:56 — The detection heuristic (result.usage && result.choices !== undefined && result.id) may match non-LLM JSON objects that happen to have these fields. A more specific check (e.g., result.object === 'chat.completion' or model presence) would reduce false positives.
  • [INFO] src/inputs/qoder-work-trace/qoder-work-trace-input.ts:603 — totalTokens || fallback treats a legitimate 0 total as missing. Use ?? or an explicit undefined check to preserve zero values when reported by the provider.
  • [INFO] assets/hooks/qoderwork-hook-processor.mjs:354 — Using || for responseId fallback treats an empty string or a numeric 0 as missing. Prefer ?? for id values so a valid but falsy message.id is not silently discarded.
  • [INFO] deploy/installer-opensource.sh:916 — QoderWork wrapper injection is Darwin-only; Linux and Windows paths are skipped. Documented in code, but cross-platform users will not get token intercept coverage.
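Two of the findings above, graceful degradation when the runtime is not found and preserving a legitimate zero with ??, can be sketched as follows. Both function names and the injected exists probe are hypothetical and are not taken from the PR. In a real wrapper the probe would likely be fs.existsSync, and the caller would log and return when no path is found.

```typescript
// Return the first candidate path that exists, or undefined instead of throwing.
// The `exists` probe is injected so the lookup can be tested without a filesystem.
function findRuntimePath(
  candidates: string[],
  exists: (path: string) => boolean,
): string | undefined {
  for (const candidate of candidates) {
    try {
      if (exists(candidate)) return candidate;
    } catch {
      // A failing probe is treated as "not found" so the wrapper never crashes.
    }
  }
  return undefined;
}

// Keep a provider-reported total of 0; fall back only when the value is
// null or undefined. With `||`, a reported 0 would be replaced by the fallback.
function resolveTotalTokens(
  reported: number | null | undefined,
  fallback: number,
): number {
  return reported ?? fallback;
}
```

A caller would check the result of findRuntimePath and, on undefined, log a warning and skip interception so that the host process continues with its normal behavior.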

Automated review by github-manager-bot

3 participants