1 change: 1 addition & 0 deletions README.md
@@ -284,6 +284,7 @@ See [External Integrations](./docs/external-integrations.md) for other community
| [Tutorial](./docs/tutorial.md) | Improve one example over three phases while queuing, running, and inspecting tasks |
| [CLI Reference](./docs/cli-reference.md) | All commands and options |
| [Configuration](./docs/configuration.md) | Global and project settings |
| [Observability](./docs/observability.md) | Phase-level usage events and analysis workflow |
| [Design Philosophy](./docs/design-philosophy.md) | Why TAKT is built around workflows, facets, feedback loops, and traceability |
| [Workflow Guide](./docs/workflows.md) | Creating and customizing workflows |
| [Builtin Catalog](./docs/builtin-catalog.md) | All builtin workflows and personas |
1 change: 1 addition & 0 deletions docs/README.ja.md
@@ -287,6 +287,7 @@ npx create-takt-sdd
| [チュートリアル](./tutorial.ja.md) | 3 フェーズで題材を改良しながら、タスクを積み、実行し、結果を確認する流れ |
| [CLI Reference](./cli-reference.ja.md) | 全コマンド・オプション |
| [Configuration](./configuration.ja.md) | グローバル設定・プロジェクト設定 |
| [Observability](./observability.ja.md) | phase 粒度の usage events と集計 workflow |
| [設計思想](./design-philosophy.ja.md) | TAKT が workflow、facet、フィードバックループ、追跡性を重視する理由 |
| [Workflow Guide](./workflows.ja.md) | workflow の作成・カスタマイズ |
| [Builtin Catalog](./builtin-catalog.ja.md) | ビルトイン workflow・persona の一覧 |
5 changes: 3 additions & 2 deletions docs/configuration.ja.md
@@ -3,6 +3,7 @@
[English](./configuration.md)

このドキュメントは TAKT の全設定オプションのリファレンスです。クイックスタートについては [README](../README.md) を参照してください。
phase 粒度の usage events と集計方法は [Observability Guide](./observability.ja.md) を参照してください。

## グローバル設定

@@ -172,7 +173,7 @@ interactive_preview_steps: 3 # インタラクティブモードでの step プ
| `workflow_runtime_prepare` | object | `{ custom_scripts: false }` | ランタイム prepare ポリシー(ビルトインプリセットは常に許可) |
| `workflow_command_gates` | object | `{ custom_scripts: false }` | workflow YAML command quality gate ポリシー |
| `sync_conflict_resolver` | object | `{ auto_approve_tools: false }` | sync conflict resolver ポリシー |
-| `observability` | object | 無効 | OpenTelemetry foundation の opt-in 設定。`enabled` で SDK を初期化し、`monitor` は workflow metric を `.takt/runs/<run>/monitor.json` に出力し、`session_log_exporter` は span 由来の shadow session log を出力します。`usage_events_phase` は後続変更向けの予約フラグです。 |
+| `observability` | object | 無効 | OpenTelemetry foundation の opt-in 設定。`enabled` で SDK を初期化し、`monitor` は workflow metric を `.takt/runs/<run>/monitor.json` に出力し、`session_log_exporter` は span 由来の shadow session log を出力します。`usage_events_phase` は phase 粒度の usage events を `.takt/runs/<run>/logs/<session>-usage-events.phase.jsonl` に出力します。 |

## プロジェクト設定

@@ -235,7 +236,7 @@ concurrency: 2 # このプロジェクトでの takt run 並列
| `workflow_runtime_prepare` | object | - | ランタイム prepare ポリシー(グローバルを上書き) |
| `workflow_command_gates` | object | - | workflow YAML command quality gate ポリシー(グローバルを上書き) |
| `sync_conflict_resolver` | object | - | sync conflict resolver ポリシー(グローバルを上書き) |
-| `observability` | object | - | プロジェクトレベルの OpenTelemetry opt-in 上書き。`enabled` で SDK を初期化し、`monitor` は workflow metric を `.takt/runs/<run>/monitor.json` に出力し、`session_log_exporter` は span 由来の shadow session log を出力します。`usage_events_phase` は後続変更向けの予約フラグです。 |
+| `observability` | object | - | プロジェクトレベルの OpenTelemetry opt-in 上書き。`enabled` で SDK を初期化し、`monitor` は workflow metric を `.takt/runs/<run>/monitor.json` に出力し、`session_log_exporter` は span 由来の shadow session log を出力します。`usage_events_phase` は phase 粒度の usage events を `.takt/runs/<run>/logs/<session>-usage-events.phase.jsonl` に出力します。 |

プロジェクト設定の値は、両方が設定されている場合にグローバル設定を上書きします。

5 changes: 3 additions & 2 deletions docs/configuration.md
@@ -3,6 +3,7 @@
[日本語](./configuration.ja.md)

This document is a reference for all TAKT configuration options. For a quick start, see the main [README](../README.md).
For phase-level usage events and analysis, see the [Observability Guide](./observability.md).

## Global Configuration

@@ -172,7 +173,7 @@ interactive_preview_steps: 3 # Step previews in interactive mode (0-10, default
| `workflow_runtime_prepare` | object | `{ custom_scripts: false }` | Runtime prepare policy (builtin presets always allowed) |
| `workflow_command_gates` | object | `{ custom_scripts: false }` | Workflow YAML command quality gate policy |
| `sync_conflict_resolver` | object | `{ auto_approve_tools: false }` | Sync conflict resolver policy |
-| `observability` | object | disabled | Opt-in OpenTelemetry foundation. `enabled` initializes the SDK, `monitor` writes workflow metrics to `.takt/runs/<run>/monitor.json`, `session_log_exporter` writes a shadow session log from spans, and `usage_events_phase` is reserved for a later change. |
+| `observability` | object | disabled | Opt-in OpenTelemetry foundation. `enabled` initializes the SDK, `monitor` writes workflow metrics to `.takt/runs/<run>/monitor.json`, `session_log_exporter` writes a shadow session log from spans, and `usage_events_phase` writes phase-level usage events to `.takt/runs/<run>/logs/<session>-usage-events.phase.jsonl`. |

## Project Configuration

@@ -235,7 +236,7 @@ concurrency: 2 # Parallel task count for takt run in this project
| `workflow_runtime_prepare` | object | - | Runtime prepare policy (overrides global) |
| `workflow_command_gates` | object | - | Workflow YAML command quality gate policy (overrides global) |
| `sync_conflict_resolver` | object | - | Sync conflict resolver policy (overrides global) |
-| `observability` | object | - | Project-level OpenTelemetry opt-in override. `enabled` initializes the SDK, `monitor` writes workflow metrics to `.takt/runs/<run>/monitor.json`, `session_log_exporter` writes a shadow session log from spans, and `usage_events_phase` is reserved for a later change. |
+| `observability` | object | - | Project-level OpenTelemetry opt-in override. `enabled` initializes the SDK, `monitor` writes workflow metrics to `.takt/runs/<run>/monitor.json`, `session_log_exporter` writes a shadow session log from spans, and `usage_events_phase` writes phase-level usage events to `.takt/runs/<run>/logs/<session>-usage-events.phase.jsonl`. |

Project config values override global config when both are set.

74 changes: 74 additions & 0 deletions docs/observability.ja.md
@@ -0,0 +1,74 @@
# Observability

[English](./observability.md)

TAKT の observability は opt-in です。無効時は workflow 実行、session log、provider events、既存の `logging.usage_events` 出力の挙動を変えません。

## Phase Usage Events を有効化する

`~/.takt/config.yaml` または `.takt/config.yaml` に次を追加します。

```yaml
observability:
  enabled: true
  usage_events_phase: true
```

phase 粒度の usage events は次に出力されます。

```text
.takt/runs/<run>/logs/<session>-usage-events.phase.jsonl
```

この出力は既存の `logging.usage_events` とは別ファイルです。`logs/<session>-usage-events.jsonl` は置き換えません。

## イベント粒度

record は workflow phase ごとに分かれます。

| Phase | 意味 |
|-------|------|
| `phase1_execute` | step 本体の実行 |
| `phase2_report` | output contract / report 生成 |
| `phase3_structured` | structured output による status judgment |
| `phase3_tag` | tag fallback による status judgment |
| `phase3_fallback` | AI judge fallback による status judgment |

usage を取得できない場合は `usage_missing: true` と reason を記録します。分析コマンドでは missing usage を 0 token として扱わず、token 統計から除外します。

## Usage を集計する

先に build します。

```bash
npm run build
```

その後、ファイルまたは run directory を渡して集計します。

```bash
npm run analyze:usage -- .takt/runs/<run>/logs/*-usage-events.phase.jsonl
npm run analyze:usage -- .takt/runs/<run>
```

デフォルト出力は `step x phase x provider x model` で集計した Markdown table です。

CSV が必要な場合は `--format csv` を使います。

```bash
npm run analyze:usage -- --format csv .takt/runs/<run> > usage.csv
```

出力列は次の通りです。

| Column | 意味 |
|--------|------|
| `step` / `phase` / `provider` / `model` | 集計キー |
| `runs` | unique な `run_id` 数 |
| `calls` | phase usage record 数 |
| `missing` | usage を取得できなかった record 数 |
| `input_tokens` / `output_tokens` / `total_tokens` | usage を取得できた record の token 合計 |
| `cached_input_tokens` / `cache_creation_input_tokens` / `cache_read_input_tokens` | cache 関連 token 合計 |
| `avg_total_tokens` / `median_total_tokens` / `stddev_total_tokens` | missing usage を除外した call 単位の total token 統計 |

before/after 比較では、それぞれの run directory 群に対して別々にコマンドを実行し、出力された table または CSV を比較します。
74 changes: 74 additions & 0 deletions docs/observability.md
@@ -0,0 +1,74 @@
# Observability

[日本語](./observability.ja.md)

TAKT observability is opt-in. When disabled, workflow execution, session logs, provider events, and the existing `logging.usage_events` output keep their current behavior.

## Enable Phase Usage Events

Add this to `~/.takt/config.yaml` or `.takt/config.yaml`:

```yaml
observability:
  enabled: true
  usage_events_phase: true
```

This writes phase-level usage events to:

```text
.takt/runs/<run>/logs/<session>-usage-events.phase.jsonl
```

The phase usage stream is separate from the existing `logging.usage_events` file. It does not replace `logs/<session>-usage-events.jsonl`.

## Event Granularity

Records are grouped by workflow phase:

| Phase | Meaning |
|-------|---------|
| `phase1_execute` | Main step execution |
| `phase2_report` | Output contract/report generation |
| `phase3_structured` | Structured status judgment |
| `phase3_tag` | Tag fallback status judgment |
| `phase3_fallback` | AI judge fallback status judgment |

When usage cannot be obtained, the record is written with `usage_missing: true` and a reason. The analysis command does not treat missing usage as zero tokens; it excludes those records from token statistics.
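
The handling of missing usage described above could be sketched roughly as follows. This is an illustration only, not the analyzer's actual code: the record shape is inferred from field names mentioned in this change, and `usage.total_tokens` as well as the helper names `parsePhaseUsage` and `totalsByPhase` are assumptions.

```typescript
// Sketch only: record shape assumed from fields named in this document.
// `usage.total_tokens` is a hypothetical field name.
interface PhaseUsageRecord {
  step: string;
  phase: string;
  usage_missing: boolean;
  usage?: { total_tokens?: number };
}

interface PhaseTotals {
  calls: number;       // every record counts as a call
  missing: number;     // records without usable usage
  totalTokens: number; // sum over records that do have usage
}

// Parse JSONL text into records, skipping blank lines.
function parsePhaseUsage(jsonl: string): PhaseUsageRecord[] {
  return jsonl
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as PhaseUsageRecord);
}

// Sum tokens per phase; missing usage is counted separately, never added as 0.
function totalsByPhase(records: PhaseUsageRecord[]): Record<string, PhaseTotals> {
  const totals: Record<string, PhaseTotals> = {};
  for (const record of records) {
    const entry =
      totals[record.phase] ??
      (totals[record.phase] = { calls: 0, missing: 0, totalTokens: 0 });
    entry.calls += 1;
    const tokens = record.usage?.total_tokens;
    if (record.usage_missing || typeof tokens !== 'number') {
      entry.missing += 1;
    } else {
      entry.totalTokens += tokens;
    }
  }
  return totals;
}

// Hypothetical sample lines, not real TAKT output.
const sampleJsonl = [
  '{"step":"plan","phase":"phase1_execute","usage_missing":false,"usage":{"total_tokens":120}}',
  '{"step":"plan","phase":"phase1_execute","usage_missing":true,"usage":{}}',
  '{"step":"plan","phase":"phase2_report","usage_missing":false,"usage":{"total_tokens":30}}',
].join('\n');

const phaseTotals = totalsByPhase(parsePhaseUsage(sampleJsonl));
console.log(phaseTotals);
```

Keeping `missing` as its own counter means a phase with many unavailable usage records stays visible instead of silently lowering the token totals.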

## Analyze Usage

Build the project first:

```bash
npm run build
```

Then aggregate one or more files or run directories:

```bash
npm run analyze:usage -- .takt/runs/<run>/logs/*-usage-events.phase.jsonl
npm run analyze:usage -- .takt/runs/<run>
```

The default output is a Markdown table grouped by `step x phase x provider x model`.
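
As a rough illustration of that grouping, the sketch below builds a `step x phase x provider x model` key and counts calls and unique runs per key. It assumes that the model comes from a `provider_model` field and that each record carries a `run_id`; the interface and the helpers `aggregationKey` and `countByKey` are hypothetical and may not match the real analyzer.

```typescript
// Sketch only: field names are assumptions based on this document.
interface UsageRow {
  run_id: string;
  step: string;
  phase: string;
  provider: string;
  provider_model: string;
}

interface GroupCounts {
  runs: number;  // unique run_id values seen for this key
  calls: number; // number of records for this key
}

// Build the aggregation key `step x phase x provider x model`.
function aggregationKey(row: UsageRow): string {
  return [row.step, row.phase, row.provider, row.provider_model].join(' x ');
}

// Count records and distinct run ids per aggregation key.
function countByKey(rows: UsageRow[]): Record<string, GroupCounts> {
  const seenRuns: Record<string, Record<string, true>> = {};
  const counts: Record<string, GroupCounts> = {};
  for (const row of rows) {
    const key = aggregationKey(row);
    const runsForKey = seenRuns[key] ?? (seenRuns[key] = {});
    runsForKey[row.run_id] = true;
    const entry = counts[key] ?? (counts[key] = { runs: 0, calls: 0 });
    entry.calls += 1;
    entry.runs = Object.keys(runsForKey).length;
  }
  return counts;
}

// Hypothetical rows, not real TAKT output.
const baseRow = { step: 'plan', provider: 'mock', provider_model: 'mock-model' };
const groupCounts = countByKey([
  { ...baseRow, run_id: 'run-a', phase: 'phase1_execute' },
  { ...baseRow, run_id: 'run-a', phase: 'phase1_execute' },
  { ...baseRow, run_id: 'run-b', phase: 'phase1_execute' },
  { ...baseRow, run_id: 'run-a', phase: 'phase2_report' },
]);
console.log(groupCounts);
```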

Use CSV output for spreadsheets or downstream scripts:

```bash
npm run analyze:usage -- --format csv .takt/runs/<run> > usage.csv
```

The output columns are:

| Column | Meaning |
|--------|---------|
| `step` / `phase` / `provider` / `model` | Aggregation key |
| `runs` | Unique `run_id` count |
| `calls` | Number of phase usage records |
| `missing` | Records with unavailable usage |
| `input_tokens` / `output_tokens` / `total_tokens` | Token totals for records with usage |
| `cached_input_tokens` / `cache_creation_input_tokens` / `cache_read_input_tokens` | Cache-related token totals |
| `avg_total_tokens` / `median_total_tokens` / `stddev_total_tokens` | Per-call total token statistics, excluding missing usage |

For before/after comparisons, run the command separately for each set of run directories and compare the resulting tables or CSV files.
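
The per-call statistics columns can be illustrated with a small sketch. It assumes a population standard deviation (the document does not say whether the analyzer uses population or sample variance), and `tokenStats` is a hypothetical helper, not the analyzer's API.

```typescript
interface TokenStats {
  avg: number;
  median: number;
  stddev: number;
}

// Compute avg / median / population stddev; returns undefined for no samples.
function tokenStats(values: number[]): TokenStats | undefined {
  if (values.length === 0) {
    return undefined;
  }
  const sorted = [...values].sort((a, b) => a - b);
  const avg = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const variance =
    sorted.reduce((sum, v) => sum + (v - avg) ** 2, 0) / sorted.length;
  return { avg, median, stddev: Math.sqrt(variance) };
}

// Hypothetical per-call totals for one aggregation key; null marks missing usage.
const callTotals: Array<number | null> = [2, 4, null, 4, 4, 5, 5, null, 7, 9];

// Missing usage is dropped before computing statistics, not counted as 0.
const presentTotals = callTotals.filter((v): v is number => v !== null);
const usageStats = tokenStats(presentTotals);
console.log({
  calls: callTotals.length,
  missing: callTotals.length - presentTotals.length,
  usageStats,
});
```

If the two missing calls were counted as zero tokens, the average in this sample would drop from 5 to 4, which is the distortion that excluding them avoids.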
7 changes: 7 additions & 0 deletions docs/testing/e2e.md
@@ -99,6 +99,13 @@ E2Eテストを追加・変更した場合は、このドキュメントも更
    - Run `takt --task 'Create a short report and finish' --workflow e2e/fixtures/workflows/report-judge.yaml --provider mock`.
    - Set `TAKT_MOCK_SCENARIO=e2e/fixtures/scenarios/report-judge.json`.
    - Confirm that the output contains `Workflow completed`.
- Observability file outputs (`e2e/specs/observability.e2e.ts`)
  - Purpose: confirm that all local observability file outputs can be enabled through config alone.
  - LLM: not called (fixed to `--provider mock`)
  - Steps (user actions/commands):
    - In the E2E `config.yaml`, set `observability.enabled: true`, `usage_events_phase: true`, `monitor: true`, and `session_log_exporter: true`.
    - Run `takt --task 'Create a short report and finish' --workflow e2e/fixtures/workflows/report-judge.yaml --provider mock`.
    - Confirm that `*-usage-events.phase.jsonl`, `*-otel-session-shadow.jsonl`, and `monitor.json` are written under the run directory.
- Add task (`e2e/specs/add.e2e.ts`)
  - Purpose: confirm that `takt add` can generate a task file from an Issue reference.
  - LLM: not called (fixed to `provider: mock` + `TAKT_MOCK_SCENARIO`)
159 changes: 159 additions & 0 deletions e2e/specs/observability.e2e.ts
@@ -0,0 +1,159 @@
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { readdirSync, readFileSync } from 'node:fs';
import { dirname, join, resolve } from 'node:path';
import { fileURLToPath } from 'node:url';
import {
  createIsolatedEnv,
  updateIsolatedConfig,
  type IsolatedEnv,
} from '../helpers/isolated-env';
import { createLocalRepo, type LocalRepo } from '../helpers/test-repo';
import { runTakt } from '../helpers/takt-runner';
import { copyWorkflowFixtureToRepo } from '../helpers/local-workflow-fixture';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

type JsonRecord = Record<string, unknown>;

// Parse a JSONL file into records, skipping empty lines.
function readJsonl(path: string): JsonRecord[] {
  return readFileSync(path, 'utf-8')
    .trim()
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as JsonRecord);
}

function isJsonRecord(value: unknown): value is JsonRecord {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// True if any metric data point carries a string `takt.run.id` attribute.
function monitorHasRunIdAttribute(monitor: JsonRecord): boolean {
  const scopeMetrics = monitor.scopeMetrics;
  if (!Array.isArray(scopeMetrics)) {
    return false;
  }
  return scopeMetrics.some((scopeMetric) => {
    if (!isJsonRecord(scopeMetric) || !Array.isArray(scopeMetric.metrics)) {
      return false;
    }
    return scopeMetric.metrics.some((metric) => {
      if (!isJsonRecord(metric) || !Array.isArray(metric.points)) {
        return false;
      }
      return metric.points.some((point) => {
        if (!isJsonRecord(point) || !isJsonRecord(point.attributes)) {
          return false;
        }
        return (
          Object.prototype.hasOwnProperty.call(point.attributes, 'takt.run.id') &&
          typeof point.attributes['takt.run.id'] === 'string'
        );
      });
    });
  });
}

// Return the first run directory (in sorted order) under .takt/runs.
function firstRunRoot(repoPath: string): string {
  const runsDir = join(repoPath, '.takt', 'runs');
  const runDirs = readdirSync(runsDir).sort();
  const runDir = runDirs[0];
  if (!runDir) {
    throw new Error('Run directory not found');
  }
  return join(runsDir, runDir);
}

// Find the log file in <runRoot>/logs whose name ends with the given suffix.
function findLogFile(runRoot: string, suffix: string): string {
  const logsDir = join(runRoot, 'logs');
  const entries = readdirSync(logsDir);
  const file = entries.find((entry) => entry.endsWith(suffix));
  if (!file) {
    throw new Error(`Log file not found: *${suffix}; logs: ${entries.join(', ')}`);
  }
  return join(logsDir, file);
}

// When updating E2E tests, also update docs/testing/e2e.md
describe('E2E: Observability file outputs (mock)', () => {
  let isolatedEnv: IsolatedEnv;
  let testRepo: LocalRepo;

  beforeEach(() => {
    isolatedEnv = createIsolatedEnv();
    updateIsolatedConfig(isolatedEnv.taktDir, {
      observability: {
        enabled: true,
        usage_events_phase: true,
        monitor: true,
        session_log_exporter: true,
      },
    });
    testRepo = createLocalRepo();
  });

  afterEach(() => {
    try {
      testRepo.cleanup();
    } catch {
      // best-effort
    }
    try {
      isolatedEnv.cleanup();
    } catch {
      // best-effort
    }
  });

  it('should write phase usage events, shadow session log, and monitor JSON from config only', () => {
    const workflowPath = copyWorkflowFixtureToRepo(
      testRepo.path,
      resolve(__dirname, '../fixtures/workflows/report-judge.yaml'),
    );
    const scenarioPath = resolve(__dirname, '../fixtures/scenarios/report-judge.json');

    const result = runTakt({
      args: [
        '--task', 'Create a short report and finish',
        '--workflow', workflowPath,
        '--provider', 'mock',
      ],
      cwd: testRepo.path,
      env: {
        ...isolatedEnv.env,
        TAKT_MOCK_SCENARIO: scenarioPath,
      },
      timeout: 240_000,
    });

    expect(result.exitCode).toBe(0);

    const runRoot = firstRunRoot(testRepo.path);
    const phaseUsagePath = findLogFile(runRoot, '-usage-events.phase.jsonl');
    const shadowLogPath = findLogFile(runRoot, '-otel-session-shadow.jsonl');
    const monitorPath = join(runRoot, 'monitor.json');

    const phaseUsageRecords = readJsonl(phaseUsagePath);
    expect(phaseUsageRecords.length).toBeGreaterThan(0);
    const phases = new Set(phaseUsageRecords.map((record) => record.phase));
    expect(phases.has('phase1_execute')).toBe(true);
    expect(phases.has('phase2_report')).toBe(true);
    expect([...phases].some((phase) => typeof phase === 'string' && phase.startsWith('phase3_'))).toBe(true);
    expect(phaseUsageRecords[0]).toEqual(expect.objectContaining({
      step: expect.any(String),
      provider: 'mock',
      provider_model: expect.any(String),
      step_type: 'agent',
      usage_missing: expect.any(Boolean),
      usage: expect.any(Object),
    }));

    const shadowRecords = readJsonl(shadowLogPath);
    expect(shadowRecords.some((record) => record.type === 'workflow_start')).toBe(true);
    expect(shadowRecords.some((record) => record.type === 'workflow_complete')).toBe(true);

    const monitor = JSON.parse(readFileSync(monitorPath, 'utf-8')) as JsonRecord;
    expect(monitor).toBeTruthy();
    expect(monitorHasRunIdAttribute(monitor)).toBe(true);
  }, 240_000);
});
1 change: 1 addition & 0 deletions package.json
@@ -12,6 +12,7 @@
"scripts": {
"build": "tsc && mkdir -p dist/shared/prompts/en dist/shared/prompts/ja dist/shared/i18n dist/core/runtime/presets && cp src/shared/prompts/en/*.md dist/shared/prompts/en/ && cp src/shared/prompts/ja/*.md dist/shared/prompts/ja/ && cp src/shared/i18n/labels_en.yaml src/shared/i18n/labels_ja.yaml dist/shared/i18n/ && cp src/core/runtime/presets/*.sh dist/core/runtime/presets/",
"watch": "tsc --watch",
"analyze:usage": "node dist/commands/analyze-usage.js",
"test": "vitest run",
"test:watch": "vitest",
"test:e2e": "tmp=\"$(mktemp -t takt-e2e.XXXXXX)\"; npm run test:e2e:mock >\"$tmp\" 2>&1; code=$?; cat \"$tmp\"; if grep -q \"error connecting to api.github.com\" \"$tmp\"; then echo \"[takt] GitHub connectivity error detected in E2E output\"; code=1; fi; rm -f \"$tmp\"; if [ \"$code\" -eq 0 ]; then msg='test:e2e passed'; else msg=\"test:e2e failed (exit=$code)\"; fi; if command -v osascript >/dev/null 2>&1; then osascript -e \"display notification \\\"$msg\\\" with title \\\"takt\\\" subtitle \\\"E2E\\\"\" >/dev/null 2>&1 || true; fi; echo \"[takt] $msg\"; exit $code",