nrslib · nrslib · Feb 12, 2026 · Feb 11, 2026 · Feb 11, 2026 · Feb 11, 2026
diff --git a/.github/workflows/auto-tag.yml b/.github/workflows/auto-tag.yml
@@ -86,13 +86,21 @@ jobs:
       - name: Verify dist-tags
         run: |
           PACKAGE_NAME=$(node -p "require('./package.json').name")
-          LATEST=$(npm view "${PACKAGE_NAME}" dist-tags.latest)
-          NEXT=$(npm view "${PACKAGE_NAME}" dist-tags.next || true)
 
-          echo "latest=${LATEST}"
-          echo "next=${NEXT}"
+          for attempt in 1 2 3 4 5; do
+            LATEST=$(npm view "${PACKAGE_NAME}" dist-tags.latest)
+            NEXT=$(npm view "${PACKAGE_NAME}" dist-tags.next || true)
 
-          if [ "${{ steps.npm-tag.outputs.tag }}" = "latest" ] && [ "${LATEST}" != "${NEXT}" ]; then
-            echo "Expected next to match latest on stable release, but they differ."
-            exit 1
-          fi
+            echo "Attempt ${attempt}: latest=${LATEST}, next=${NEXT}"
+
+            if [ "${{ steps.npm-tag.outputs.tag }}" != "latest" ] || [ "${LATEST}" = "${NEXT}" ]; then
+              echo "Dist-tags verified."
+              exit 0
+            fi
+
+            if [ "$attempt" -eq 5 ]; then
+              echo "::warning::dist-tags not synced after 5 attempts (latest=${LATEST}, next=${NEXT}). Registry propagation may be delayed."
+              exit 0
+            fi
+            sleep $((attempt * 10))
+          done
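
Reviewer note: the retry policy in the loop above can be read as two small rules, sketched here in TypeScript. This is a minimal sketch, not code from the workflow. `DistTags`, `backoffSeconds`, `isVerified`, and `verifyDistTags` are hypothetical names, and the `fetchTags` and `sleep` callbacks are assumptions standing in for `npm view` and `sleep`.

```typescript
// Hypothetical names throughout; this mirrors the shell loop's policy only.
type DistTags = { latest: string; next: string };

// Delay after a failed attempt: attempt * 10 seconds (10, 20, 30, 40).
function backoffSeconds(attempt: number): number {
  return attempt * 10;
}

// Verified when this is not a stable release, or when `next` matches `latest`.
function isVerified(releaseTag: string, tags: DistTags): boolean {
  return releaseTag !== "latest" || tags.latest === tags.next;
}

async function verifyDistTags(
  releaseTag: string,
  fetchTags: () => Promise<DistTags>, // assumption: wraps `npm view`
  sleep: (seconds: number) => Promise<void>, // assumption: wraps `sleep`
  maxAttempts = 5,
): Promise<"verified" | "warned"> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const tags = await fetchTags();
    if (isVerified(releaseTag, tags)) return "verified";
    // No sleep after the final attempt; the loop falls through to the warning.
    if (attempt < maxAttempts) await sleep(backoffSeconds(attempt));
  }
  // Like the workflow, a persistent mismatch is a warning, not a failure.
  return "warned";
}
```

Under this schedule the step waits at most 10 + 20 + 30 + 40 = 100 seconds before it downgrades the mismatch to a warning.
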
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,46 @@ All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 
+## [0.13.0-alpha.1] - 2026-02-13
+
+### Added
+
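+- **Team Leader movement**: A new movement type in which a team leader agent dynamically decomposes a task into subtasks (Parts) within a movement and runs multiple part agents in parallel. Supports the `team_leader` configuration (persona, maxParts, timeoutMs, partPersona, partEdit, partPermissionMode) (#244)
+- **Structured Output**: Introduces JSON Schema-based structured output for agent calls. Adds three schemas to `builtins/schemas/`: task decomposition (decomposition), rule evaluation (evaluation), and status judgment (judgment). Supported by both the Claude and Codex providers (#257)
+- **`backend` builtin piece**: New piece specialized for backend development, with parallel expert reviews for backend, security, and QA
+- **`backend-cqrs` builtin piece**: New backend development piece specialized for CQRS+ES, with parallel expert reviews for CQRS+ES, security, and QA
+- **Part timeout via AbortSignal**: Adds timeout control and a parent-linked AbortSignal to Team Leader part execution
+- **Agent use-case layer**: Consolidates the agent-call use cases (`decomposeTask`, `executeAgent`, `evaluateRules`) in `agent-usecases.ts` and centralizes structured-output injection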
+
+### Changed
+
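+- **BREAKING: Public API cleanup**: Significantly narrows the public API in `src/index.ts`. Internal implementation details (session management, Claude/Codex client details, utility functions, etc.) are no longer exported, leaving a stable, minimal API surface (#257)
+- **Phase 3 judgment logic overhaul**: Removes `JudgmentDetector` / `FallbackStrategy` and consolidates them into the structured-output-based `status-judgment-phase.ts`, improving judgment stability and maintainability (#257)
+- **Report phase retry improvement**: When the Report Phase (Phase 2) fails, it is now retried automatically in a new session (#245)
+- **Unified Ctrl+C shutdown**: Removes `sigintHandler.ts` and consolidates it into `ShutdownManager`. The three-stage control (graceful shutdown → timeout → forced termination) is now shared across all providers (#237)
+- Added design token and theme scope guidance to the frontend knowledge
+- Improved the architecture knowledge (both en and ja)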
+
+### Fixed
+
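+- Fixed checkout of an existing branch failing on clone: `--branch` is now passed to `git clone --shared`, and the remote is removed afterwards
+- Removed `#` from branch names that reference an Issue (`takt/#N/slug` → `takt/N/slug`)
+- Removed the dependency on deprecated tools in the OpenCode report phase and moved to permission-centered control (#246)
+- Removed unnecessary exports to keep the public API consistent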
+
+### Internal
+
+- Added Team Leader tests (engine-team-leader, team-leader-schema-loader, task-decomposer)
+- Added structured output tests (parseStructuredOutput, claude-executor-structured-output, codex-structured-output, provider-structured-output, structured-output E2E)
+- Added ShutdownManager unit tests
+- Added AbortSignal unit tests (abort-signal, claude-executor-abort-signal, claude-provider-abort-signal)
+- Added Report Phase retry unit tests (report-phase-retry)
+- Added public API export unit tests (public-api-exports)
+- Significantly expanded E2E tests: cycle-detection, model-override, multi-step-sequential, pipeline-local-repo, report-file-output, run-sigint-graceful, session-log, structured-output, task-status-persistence
+- Refactored E2E test helpers (extracted a common setup function)
+- Removed the `judgment/` directory (JudgmentDetector, FallbackStrategy)
+- Added the `ruleIndex.ts` utility (1-based → 0-based index conversion)
+
 ## [0.12.1] - 2026-02-11
 
 ### Fixed
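
Reviewer note: the changelog entry on part timeouts describes an AbortSignal that fires on either a per-part timeout or a parent abort. One way this could look is sketched below, using only the standard `AbortController` API. `PartSignal` and `createPartSignal` are hypothetical names for illustration. They are not taken from the TAKT source, whose actual implementation may differ.

```typescript
// Hypothetical helper; illustrative only, not TAKT's actual API.
interface PartSignal {
  signal: AbortSignal;
  dispose: () => void; // clears the timer and detaches from the parent
}

function createPartSignal(
  parent: AbortSignal | undefined,
  timeoutMs: number,
): PartSignal {
  const controller = new AbortController();

  // Abort the part when its own time budget runs out.
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  // Abort the part when the parent is aborted (or already was).
  const onParentAbort = (): void => controller.abort();
  if (parent?.aborted) {
    controller.abort();
  } else {
    parent?.addEventListener("abort", onParentAbort, { once: true });
  }

  const dispose = (): void => {
    clearTimeout(timer);
    parent?.removeEventListener("abort", onParentAbort);
  };

  return { signal: controller.signal, dispose };
}
```

A caller would pass `signal` to the part's agent call and invoke `dispose()` once the part settles, so a finished part leaves no pending timer or parent listener behind.
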

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -371,19 +371,30 @@ Files: `.takt/logs/{sessionId}.jsonl`, with `latest.json` pointer. Legacy `.json
 
 **Instruction auto-injection over explicit placeholders.** The instruction builder auto-injects `{task}`, `{previous_response}`, `{user_inputs}`, and status rules. Templates should contain only step-specific instructions, not boilerplate.
 
-**Persona prompts contain only domain knowledge.** Persona prompt files (`builtins/{lang}/personas/*.md`) must contain only domain expertise and behavioral principles — never piece-specific procedures. Piece-specific details (which reports to read, step routing, specific templates with hardcoded step names) belong in the piece YAML's `instruction_template`. This keeps personas reusable across different pieces.
-
-What belongs in persona prompts:
-- Role definition ("You are a ... specialist")
-- Domain expertise, review criteria, judgment standards
-- Do / Don't behavioral rules
-- Tool usage knowledge (general, not piece-specific)
-
-What belongs in piece `instruction_template`:
-- Step-specific procedures ("Read these specific reports")
-- References to other steps or their outputs
-- Specific report file names or formats
-- Comment/output templates with hardcoded review type names
+**Faceted prompting: each facet has a dedicated file type.** TAKT assembles agent prompts from 4 facets. Each facet has a distinct role. When adding new rules or knowledge, place content in the correct facet.
+
+```
+builtins/{lang}/
+  personas/     — WHO: identity, expertise, behavioral habits
+  policies/     — HOW: judgment criteria, REJECT/APPROVE rules, prohibited patterns
+  knowledge/    — WHAT TO KNOW: domain patterns, anti-patterns, detailed reasoning with examples
+  instructions/ — WHAT TO DO NOW: step-specific procedures and checklists
+```
+
+| Deciding where to place content | Facet | Example |
+|--------------------------------|-------|---------|
+| Role definition, AI habit prevention | Persona | "Leaving replaced code → Prohibited" |
+| Actionable REJECT/APPROVE criterion | Policy | "Exporting internal implementation from the public API → REJECT" |
+| Detailed reasoning, REJECT/OK table with examples | Knowledge | "Public API Surface" section |
+| This-step-only procedure or checklist | Instruction | "Review focus: soundness of structure and design..." |
+| Workflow structure, facet assignment | Piece YAML | `persona: coder`, `policy: coding`, `knowledge: architecture` |
+
+Key rules:
+- Persona files are reusable across pieces. Never include piece-specific procedures (report names, step references)
+- Policy REJECT lists are what reviewers enforce. If a criterion is not in the policy REJECT list, reviewers will not catch it — even if knowledge explains the reasoning
+- Knowledge provides the WHY behind policy criteria. Knowledge alone does not trigger enforcement
+- Instructions are bound to a single piece step. They reference procedures, not principles
+- Piece YAML `instruction_template` is for step-specific details (which reports to read, step routing, output templates)
 
 **Separation of concerns in piece engine:**
 - `PieceEngine` - Orchestration, state management, event emission
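
Reviewer note: the faceted-prompting rule added to CLAUDE.md above says a prompt is assembled from four facets, each with its own role. A minimal sketch of that composition follows. The `Facets` type, the `assemblePrompt` function, the section headings, and the ordering are all assumptions made for illustration. TAKT's real instruction builder is not shown in this diff.

```typescript
// Hypothetical types and function; TAKT's real prompt builder may differ.
interface Facets {
  persona: string; // WHO: identity, expertise, behavioral habits
  policy: string; // HOW: judgment criteria, REJECT/APPROVE rules
  knowledge: string; // WHAT TO KNOW: domain patterns and reasoning
  instruction: string; // WHAT TO DO NOW: step-specific procedure
}

function assemblePrompt(facets: Facets): string {
  // Assumed order and headings, chosen only for this sketch.
  const sections: Array<[string, string]> = [
    ["Persona", facets.persona],
    ["Policy", facets.policy],
    ["Knowledge", facets.knowledge],
    ["Instruction", facets.instruction],
  ];
  return sections
    .filter(([, body]) => body.trim().length > 0) // skip empty facets
    .map(([title, body]) => `## ${title}\n${body.trim()}`)
    .join("\n\n");
}
```

Keeping each facet a separate input is what lets a persona be reused across pieces while the instruction changes per step.
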

diff --git a/README.md b/README.md
@@ -474,6 +474,8 @@ TAKT includes multiple builtin pieces:
 | `unit-test` | Unit test focused piece: test analysis → test implementation → review → fix. |
 | `e2e-test` | E2E test focused piece: E2E analysis → E2E implementation → review → fix (Vitest-based E2E flow). |
 | `frontend` | Frontend-specialized development piece with React/Next.js focused reviews and knowledge injection. |
+| `backend` | Backend-specialized development piece with backend, security, and QA expert reviews. |
+| `backend-cqrs` | CQRS+ES-specialized backend development piece with CQRS+ES, security, and QA expert reviews. |
 
 **Per-persona provider overrides:** Use `persona_providers` in config to route specific personas to different providers (e.g., coder on Codex, reviewers on Claude) without duplicating pieces.
 

diff --git a/builtins/en/knowledge/architecture.md b/builtins/en/knowledge/architecture.md
@@ -18,6 +18,26 @@
 - No circular dependencies
 - Appropriate directory hierarchy
 
+**Operation Discoverability:**
+
+When calls to the same generic function are scattered across the codebase with different purposes, it becomes impossible to understand what the system does without grepping every call site. Group related operations into purpose-named functions within a single module. Reading that module should reveal the complete list of operations the system performs.
+
+| Judgment | Criteria |
+|----------|----------|
+| REJECT | Same generic function called directly from 3+ places with different purposes |
+| REJECT | Understanding all system operations requires grepping every call site |
+| OK | Purpose-named functions defined and collected in a single module |
+
+**Public API Surface:**
+
+Public APIs should expose only domain-level functions and types. Do not export infrastructure internals (provider-specific functions, internal parsers, etc.).
+
+| Judgment | Criteria |
+|----------|----------|
+| REJECT | Infrastructure-layer functions exported from public API |
+| REJECT | Internal implementation functions callable from outside |
+| OK | External consumers interact only through domain-level abstractions |
+
 **Function Design:**
 
 - One responsibility per function
@@ -299,19 +319,18 @@ Correct handling:
 
 ## DRY Violation Detection
 
-Detect duplicate code.
+Eliminate duplication by default. When logic is essentially the same and should be unified, apply DRY. Do not judge mechanically by count.
 
 | Pattern | Judgment |
 |---------|----------|
-| Same logic in 3+ places | Immediate REJECT - Extract to function/method |
-| Same validation in 2+ places | Immediate REJECT - Extract to validator function |
-| Similar components 3+ | Immediate REJECT - Create shared component |
-| Copy-paste derived code | Immediate REJECT - Parameterize or abstract |
-
-AHA principle (Avoid Hasty Abstractions) balance:
-- 2 duplications → Wait and see
-- 3 duplications → Extract immediately
-- Different domain duplications → Don't abstract (e.g., customer validation vs admin validation are different)
+| Essentially identical logic duplicated | REJECT - Extract to function/method |
+| Same validation duplicated | REJECT - Extract to validator function |
+| Essentially identical component structure | REJECT - Create shared component |
+| Copy-paste derived code | REJECT - Parameterize or abstract |
+
+When NOT to apply DRY:
+- Different domains: Don't abstract (e.g., customer validation vs admin validation are different things)
+- Superficially similar but different reasons to change: Treat as separate code
 
 ## Spec Compliance Verification
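
Reviewer note: the Operation Discoverability rule added above asks for purpose-named functions collected in one module instead of scattered calls to a generic function. A minimal sketch of that shape follows. `runAgent` and `AgentRequest` are hypothetical stand-ins for a generic call. `decomposeTask` and `evaluateRules` borrow names from the changelog in this PR, but their bodies here are illustrative only.

```typescript
// Hypothetical module sketch; `runAgent` stands in for any generic call.
type AgentRequest = { purpose: string; prompt: string };

// Generic function: in a real layout this would stay unexported,
// so callers cannot reach it directly.
function runAgent(request: AgentRequest): string {
  return `[${request.purpose}] ${request.prompt}`;
}

// Purpose-named operations: reading this list reveals every way
// the system uses the generic call.
function decomposeTask(task: string): string {
  return runAgent({ purpose: "decompose", prompt: task });
}

function evaluateRules(output: string): string {
  return runAgent({ purpose: "evaluate", prompt: output });
}
```

With this layout a reviewer can check the REJECT criteria by reading one module, and only the purpose-named functions need to be part of the public surface.
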
 

diff --git a/builtins/en/knowledge/frontend.md b/builtins/en/knowledge/frontend.md
@@ -224,6 +224,54 @@ Signs to make separate components:
 - Added variant is clearly different from original component's purpose
 - Props specification becomes complex on the usage side
 
+### Theme Differences and Design Tokens
+
+When the same functional components need different visuals, manage the difference with design tokens + theme scope.
+
+Principles:
+- Define color, spacing, radius, shadow, and typography as tokens (CSS variables)
+- Apply role/page-specific differences by overriding tokens in a theme scope (e.g. `.consumer-theme`, `.admin-theme`)
+- Do not hardcode hex colors (`#xxxxxx`) in feature components
+- Keep logic differences (API/state) separate from visual differences (tokens)
+
+```css
+/* tokens.css */
+:root {
+  --color-bg-page: #f3f4f6;
+  --color-surface: #ffffff;
+  --color-text-primary: #1f2937;
+  --color-border: #d1d5db;
+  --color-accent: #2563eb;
+}
+
+.consumer-theme {
+  --color-bg-page: #f7f8fa;
+  --color-accent: #4daca1;
+}
+```
+
+```tsx
+// same component, different look by scope
+<div className="consumer-theme">
+  <Button variant="primary">Submit</Button>
+</div>
+```
+
+Operational rules:
+- Implement shared UI primitives (Button/Card/Input/Tabs) using tokens only
+- In feature views, use theme-common utility classes (e.g. `surface`, `title`, `chip`) to avoid duplicated styling logic
+- For a new theme, follow: "add tokens -> override by scope -> reuse existing components"
+
+Review checklist:
+- No copy-pasted hardcoded colors/spacings
+- No duplicated components per theme for the same UI behavior
+- No API/state-management changes made solely for visual adjustments
+
+Anti-patterns:
+- Creating `ButtonConsumer`, `ButtonAdmin` for styling only
+- Hardcoding colors in each feature component
+- Changing response shaping logic when only the theme changed
+
 ## Abstraction Level Evaluation
 
 **Conditional branch bloat detection:**

diff --git a/builtins/en/personas/coder.md b/builtins/en/personas/coder.md
@@ -33,4 +33,5 @@ You are the implementer. Focus on implementation, not design decisions.
 - Making design decisions arbitrarily → Report and ask for guidance
 - Dismissing reviewer feedback → Prohibited
 - Adding backward compatibility or legacy support without being asked → Absolutely prohibited
+- Leaving replaced code/exports after refactoring → Prohibited (remove unless explicitly told to keep)
 - Layering workarounds that bypass safety mechanisms on top of a root cause fix → Prohibited
diff --git a/builtins/en/piece-categories.yaml b/builtins/en/piece-categories.yaml
@@ -9,7 +9,10 @@ piece_categories:
   🎨 Frontend:
     pieces:
       - frontend
-  ⚙️ Backend: {}
+  ⚙️ Backend:
+    pieces:
+      - backend
+      - backend-cqrs
   🔧 Expert:
     Full Stack:
       pieces: