[otel] OpenTelemetry phase usage events と usage 分析を追加#785
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughこのPRはオプトインのフェーズ粒度usageイベント出力を追加します。OpenTelemetryスパンからフェーズ使用量レコードを生成してJSONLへ追記するSpanProcessor、収集ログを集計するCLI、実行時の配線、ドキュメント、およびE2E/ユニットテストを含みます。 ChangesPhase usage events observability feature
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
e2e/specs/observability.e2e.tsParsing error: /e2e/specs/observability.e2e.ts was not found by the project service. Consider either including it in the tsconfig.json or including it in allowDefaultProject. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Note Unit test generation is a public access feature. Expect some limitations and changes as we gather feedback and continue to improve it. Generating unit tests... This may take up to 20 minutes. |
|
✅ Created PR with unit tests: #786 |
|
動作確認メモです。 PR #785 の phase usage events について、ローカルで 確認内容:
Codex provider では mock provider と違い、usage が取得できていました。 確認できた phase usage records: 今回の実行では structured output 判定で完了したため、 ローカル確認の範囲では、phase usage events の出力と |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/core/workflow/engine/ArpeggioRunner.ts (1)
200-205: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win例外時の usage 情報の一貫性を改善する。
runWithPhaseSpanの catch ブロック(Line 187-194)ではproviderUsageにusageMissing: trueを設定していますが、この catch ブロックでは設定されていません。一貫性のため、ここでもproviderUsageを設定することを推奨します。♻️ 提案する修正
return { batchIndex: batch.batchIndex, content: '', success: false, error: lastError, + providerUsage: { + usageMissing: true, + reason: USAGE_MISSING_REASONS.NOT_AVAILABLE, + }, };ファイル冒頭に import を追加:
+import { USAGE_MISSING_REASONS } from '../../logging/contracts.js';🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/core/workflow/engine/ArpeggioRunner.ts` around lines 200 - 205, The catch block in runWithPhaseSpan in ArpeggioRunner.ts returns a result without setting providerUsage (unlike the earlier catch that sets usageMissing: true), causing inconsistent usage metadata; update the returned object in this catch to include providerUsage with usageMissing: true (and any minimal structure expected by the caller) alongside batchIndex, content, success, and error so both failure paths produce consistent providerUsage information.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@e2e/specs/observability.e2e.ts`:
- Around line 125-127: The test reads monitor.json into the variable monitor and
currently checks for '"takt.run.id"' via JSON.stringify, which is brittle;
instead assert the parsed object contains that key directly. Update the
assertions around monitor (from observability.e2e.ts) to treat monitor as an
object and use a direct property existence check (e.g.,
Object.prototype.hasOwnProperty.call(monitor, 'takt.run.id') or a testing helper
like expect(monitor).toHaveProperty('takt.run.id')) and optionally validate the
property's value/type rather than relying on JSON.stringify.
In `@src/__tests__/otelFoundation.test.ts`:
- Line 218: The test currently asserts spanProcessors length equals 2, but
createSpanProcessorState() always returns [sessionLogSpanProcessor,
usageEventsSpanProcessor], so instead change the test to assert the behavior of
registration: when usageEventsPhase is false, verify that the
usageEventsSpanProcessor (or its exporter/register method) is not called; when
true, verify it is called — locate assertions around
foundation.constructedOptions and replace the length check with spies/mocks on
usageEventsSpanProcessor.register (or the exporter.register function used by
createSpanProcessorState) and assert call/non-call accordingly.
In `@src/infra/observability/usageEventsSpanProcessor.ts`:
- Around line 18-85: The instance-level hasReportedWriteFailure hides write
errors for different runId files; change hasReportedWriteFailure into a per-run
map (e.g. hasReportedWriteFailureByRun: Map<string, boolean>) and update
safeAppend to use options.runId (or options.phaseUsageLogPath) as the key when
checking/setting the flag; also update register's returned unregister function
to remove the flag for that runId and clear the map in shutdown. Ensure you
modify the UsageEventsSpanProcessor constructor/fields, safeAppend, register
(returned deleter), and shutdown to operate on the per-run map instead of a
single boolean.
---
Outside diff comments:
In `@src/core/workflow/engine/ArpeggioRunner.ts`:
- Around line 200-205: The catch block in runWithPhaseSpan in ArpeggioRunner.ts
returns a result without setting providerUsage (unlike the earlier catch that
sets usageMissing: true), causing inconsistent usage metadata; update the
returned object in this catch to include providerUsage with usageMissing: true
(and any minimal structure expected by the caller) alongside batchIndex,
content, success, and error so both failure paths produce consistent
providerUsage information.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 87d7c8ed-a551-42ab-b14d-f7526f47732f
📒 Files selected for processing (37)
README.mddocs/README.ja.mddocs/configuration.ja.mddocs/configuration.mddocs/observability.ja.mddocs/observability.mddocs/testing/e2e.mde2e/specs/observability.e2e.tspackage.jsonsrc/__tests__/analyze-usage-command.test.tssrc/__tests__/logging-contracts.test.tssrc/__tests__/otelFoundation.test.tssrc/__tests__/phaseUsageEvent.test.tssrc/__tests__/usageEventsSpanProcessor.test.tssrc/__tests__/workflowSpans.test.tssrc/agents/judge-status-usecase.tssrc/agents/structured-caller/prompt-based-structured-caller.tssrc/commands/analyze-usage.tssrc/core/logging/contracts.tssrc/core/logging/phaseUsageEvent.tssrc/core/logging/providerEvent.tssrc/core/workflow/arpeggio/types.tssrc/core/workflow/engine/ArpeggioRunner.tssrc/core/workflow/engine/ParallelRunner.tssrc/core/workflow/engine/StepExecutor.tssrc/core/workflow/engine/team-leader-part-runner.tssrc/core/workflow/observability/workflowSpans.tssrc/core/workflow/report-phase-runner.tssrc/core/workflow/status-judgment-phase.tssrc/core/workflow/types.tssrc/features/tasks/execute/workflowExecutionBootstrap.tssrc/infra/fs/index.tssrc/infra/fs/jsonl.tssrc/infra/observability/otelFoundation.tssrc/infra/observability/usageEventsSpanProcessor.tssrc/shared/types/provider.tsvitest.config.e2e.mock.ts
…vents fix(observability): capture Claude usage for phase events
|
動作確認メモです。 PR #785 の phase usage events について、ローカルで 確認内容:
どちらの provider でも 今回の実行では structured output 判定で完了したため、 |
|
補足です。 上の動作確認では TAKT 側の計装経路は共通です。
一方で、両 provider は provider 側の実行経路が異なります。
そのため、Claude 側で返される raw usage の粒度・内部プロンプト・ツール呼び出しの包み方・structured output の扱い・session/context の載せ方が完全に同一とは限りません。また、今回の確認は別々の実行なので、生成内容や judge の中間出力も完全一致ではありません。 したがって、今回の token 数差は「TAKT の計装が provider ごとに不公平に記録している」ことを示すものではなく、 この動作確認の主目的は、両 provider で phase usage events が欠損せず |
概要
テスト
Closes #704
Closes #705
Summary by CodeRabbit
新機能
ドキュメント
テスト