[codex] Merge editor foundation runtime governance into dev #1554
Merged
oscharko merged 29 commits into dev on Jun 26, 2026
…1493)

* feat(editor): safe apply-edits and patch workflow for agents

Harden the agent-native editor action path (Issue #1296 scaffolding) so agent applyTextEdits/applyPatch actions are safe: structured hash/version conflicts, overlap/invalid-edit rejection, user-visible non-destructive dirty-buffer conflicts, undo/redo preservation, and workspace containment.

Deterministic-first: every safety check runs in the server preflight (fail-closed, before queuing); the browser applies through Monaco's controlled value path (undo-preserving) and surfaces conflicts and an explicit patch review (Accept/Reject) to the user.

- contracts: INVALID_EDITS/OUT_OF_SCOPE conflict codes; pure validators validateAgentTextEdits (inverted/negative/overlap), isContainedAgentPath, isEditorAgentConflictCode (additive, leaf-safe, root-surface-neutral).
- server: preflight containment + edit-structure + patch validation (reuses keiko-tools validatePatch + keiko-workspace containment); applyPatch is server-validated and normalized to contract textEdits for in-buffer review; preflight conflicts emitted over SSE for editor visibility (AC3).
- ui: AgentConflictBanner (role=alert, .ai-danger), patch review surface (Accept/Reject), structured INVALID_EDITS on overlap, read-only guard, session-scoped conflict surfacing.
- tests: contract/server unit + UI component + a11y + e2e (real app/BFF).
- docs: ADR-0058.

ci/codeql/dependency-review enabled for the integration branch.

Refs #1394
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(editor): satisfy lint gate in #1394 e2e spec

Extract module-scope helpers (registerAgentSnapshot, postApplyPatch) and patch constants, and remove unused declarations, so each e2e test function is within the max-lines-per-function limit and eslint passes with --max-warnings=0. Behavior is unchanged: all 5 scenarios, assertions, and test names are identical (e2e 5/5 green).

Refs #1394
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
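The edit-structure checks named in the commit above (inverted, negative, overlap) could look roughly like the sketch below. The `TextEdit` offset shape, the `EditCheck` result type, and the name `checkTextEdits` are assumptions for illustration; the real `validateAgentTextEdits` in keiko-contracts may use line/column ranges and a different result type.

```typescript
// Hypothetical edit shape: half-open offset range [start, end) plus replacement text.
interface TextEdit {
  start: number;
  end: number;
  newText: string;
}

type EditCheck =
  | { ok: true }
  | { ok: false; code: "INVALID_EDITS"; reason: "negative" | "inverted" | "overlap" };

// Pure validator: rejects negative offsets, inverted ranges, and overlapping edits.
function checkTextEdits(edits: readonly TextEdit[]): EditCheck {
  for (const e of edits) {
    if (e.start < 0 || e.end < 0) {
      return { ok: false, code: "INVALID_EDITS", reason: "negative" };
    }
    if (e.end < e.start) {
      return { ok: false, code: "INVALID_EDITS", reason: "inverted" };
    }
  }
  // Sort a copy by start offset so overlap detection is one linear pass.
  const sorted = edits.slice().sort((a, b) => a.start - b.start || a.end - b.end);
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i].start < sorted[i - 1].end) {
      return { ok: false, code: "INVALID_EDITS", reason: "overlap" };
    }
  }
  return { ok: true };
}
```

Adjacent edits (one ending where the next begins) pass in this sketch; only ranges that share at least one character are treated as overlapping.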
…actions, events (#1391) (#1495) * feat(editor): agent editor public contracts for sessions, snapshots, actions, events (#1391) Ratify and harden the public, schema-first agent-editor contract (packages/keiko-contracts/src/editor-agent.ts) to satisfy Issue #1391's acceptance criteria. The module was scaffolded by #1296 and extended by #1394/ADR-0058; this change closes the remaining AC gaps without building a parallel subsystem. - AC1: snapshot text defaults to `none` (DEFAULT_EDITOR_AGENT_SNAPSHOT_TEXT_MODE plus a parse-boundary default; a present-but-invalid value is still rejected). - AC2: write actions require a version/hash precondition in addition to the mandatory idempotency key — new EDITOR_AGENT_WRITE_ACTION_TYPES / isEditorAgentWriteActionType / editorAgentActionHasWritePrecondition / editorAgentWritePreconditionError; the BFF rejects a blind write with the structured PRECONDITION_REQUIRED conflict (ordered last in the write chain). - AC3: structured error-code taxonomy exported as EditorAgentConflictCode and EDITOR_AGENT_CONFLICT_CODES (including PRECONDITION_REQUIRED); the conflict banner handles the new code (Dismiss-only). - AC4: isEditorAgentEvent validates every event kind at the SSE trust boundary; barrel exports plus surface pins; the browser bridge adopts the guard. Docs: ADR-0059 and docs/editor-agent-contracts.md (public API semantics). Tests: contract validators for every action/result/event shape, default-none, write-precondition, taxonomy, and schema-version compatibility; server PRECONDITION_REQUIRED preflight tests. Refs #1391 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(editor): address adversarial review findings for #1391 agent editor contracts Resolve the must-fix and disposition findings from the multi-lens verification: - MF1: make the isEditorAgentConflictCode exhaustive test cover all seven codes (it claimed "six" and omitted PRECONDITION_REQUIRED). 
- MF2: validate the editor-agent:action SSE frame with isEditorAgentEvent in the browser bridge, symmetric with the result listener — the action stream drives the write actions, so it is the more consequential trust boundary. Test harnesses now emit the full event envelope the server sends. - Validate the conflict sub-object in isEditorAgentActionResult (a result cannot smuggle an out-of-taxonomy conflict code past the guard). - Add a contract test that a precondition with an invalid/empty hash is rejected (no pseudo-hash blind write). - ADR-0059: status Proposed -> Accepted; D5 covers both SSE streams and the conflict-detail validation; document the version/hash comparison's dependency on the bridge-supplied snapshot revision; register ADR-0059 in docs/adr/README. Refs #1391 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
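A minimal sketch of the AC2 rule (a write action needs a version or hash precondition in addition to its idempotency key). The `AgentAction` shape, the set of write action types, and the lowercase-hex SHA-256 hash format are assumptions made for this example, not the shipped contract in editor-agent.ts.

```typescript
// Hypothetical action shape, for illustration only.
interface AgentAction {
  type: string;
  idempotencyKey: string;
  precondition?: { version?: number; hash?: string };
}

// Assumed subset of write action types.
const WRITE_ACTION_TYPES: ReadonlySet<string> = new Set([
  "applyTextEdits",
  "applyPatch",
  "format",
  "save",
]);

// Assumed hash format: 64 lowercase hex characters.
const HEX_SHA256 = /^[0-9a-f]{64}$/;

function hasWritePrecondition(action: AgentAction): boolean {
  const p = action.precondition;
  if (p === undefined) return false;
  const versionOk =
    typeof p.version === "number" && Number.isInteger(p.version) && p.version >= 0;
  // An empty or malformed hash does not count, so it cannot enable a blind write.
  const hashOk = typeof p.hash === "string" && HEX_SHA256.test(p.hash);
  return versionOk || hashOk;
}

function writePreconditionError(action: AgentAction): "PRECONDITION_REQUIRED" | null {
  if (!WRITE_ACTION_TYPES.has(action.type)) return null; // non-write actions need none
  return hasWritePrecondition(action) ? null : "PRECONDITION_REQUIRED";
}
```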
…ents (#1392) (#1498)

* feat(editor): agent editor session registry, action queue, and SSE events (#1392)

Implement the live BFF control plane that lets agents discover editor sessions and queue actions for a connected browser bridge, building on #1391/#1394.

- contract: additive NO_ACTIVE_BRIDGE conflict code + TIMED_OUT/QUEUE_FULL lifecycle-failure taxonomy (EditorAgentActionFailure, guards, barrel exports)
- server: new agentSessionRegistry.ts owns the session registry, per-session live bridge tracking, a bounded self-healing action queue with deadlines, result correlation, and scoped event fan-out; agentRoutes.ts delegates and adds the NO_ACTIVE_BRIDGE liveness gate (last in preflight) + QUEUE_FULL (429)
- ui: the bridge SSE connection carries its sessionId for server-side liveness; AgentConflictBanner renders the new code
- tests: registry unit, route integration (8 scenarios), security at the real HTTP boundary (CSRF/host/containment), and a no-raw-source-in-logs guard
- docs: ADR-0060 + docs/editor-agent-session-registry.md

AC1 NO_ACTIVE_BRIDGE structured unavailable; AC2 queued actions time out with queue cleanup; AC3 CSRF/same-origin reused; AC4 OUT_OF_SCOPE containment reused; AC5 the server never mutates editor state.

Refs #1392
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(editor): harden agent queue against cross-session and unbounded growth (#1392)

Adversarial review fixes on top of the registry:

- correctness/security (high): the action queue is now keyed PER SESSION (sessionId -> actionId -> entry), so a result, timeout, or re-queue for one session can never clear another session's slot or timer. reportResult correlates strictly by (sessionId, actionId); a mismatched/cross-session result is fanned out for visibility without touching the queue. This also removes the separate depth counter (depth = inner map size) and its drift.
- correctness (medium): a second admit for an actionId already in flight for a session is rejected (409) instead of silently superseding the first action's deadline, so AC2 self-healing holds.
- security (medium): the idempotency cache is bounded (FIFO, max 1024) and stores only a SHA-256 hash of the request body: no unbounded growth and no retained raw action content (text edits, patch bodies).
- security (low): the session registry is bounded; on overflow the oldest idle session (no live bridge, no pending actions) is evicted, never active work.

Tests: cross-session result isolation, duplicate-actionId rejection (registry + route), and session-registry eviction.

Refs #1392
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
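The per-session keying described above (sessionId -> actionId -> entry, depth taken from the inner map size, duplicate admits rejected) can be sketched as follows. The class name, method names, and result strings are hypothetical, and deadline timers are left out to keep the example short; this is not the code in agentSessionRegistry.ts.

```typescript
interface QueueEntry {
  deadline: number; // epoch millis; timer handling is omitted in this sketch
}

type AdmitResult = "queued" | "duplicate" | "queue-full";

class SessionActionQueue {
  // sessionId -> actionId -> entry. Depth is the inner map size, so no counter can drift.
  private readonly bySession = new Map<string, Map<string, QueueEntry>>();
  private readonly maxDepth: number;

  constructor(maxDepth: number) {
    this.maxDepth = maxDepth;
  }

  admit(sessionId: string, actionId: string, deadline: number): AdmitResult {
    let inner = this.bySession.get(sessionId);
    if (inner === undefined) {
      inner = new Map<string, QueueEntry>();
      this.bySession.set(sessionId, inner);
    }
    if (inner.has(actionId)) return "duplicate"; // a route could map this to 409
    if (inner.size >= this.maxDepth) return "queue-full"; // a route could map this to 429
    inner.set(actionId, { deadline });
    return "queued";
  }

  // Correlates strictly by (sessionId, actionId); returns true only if a slot was cleared.
  reportResult(sessionId: string, actionId: string): boolean {
    const inner = this.bySession.get(sessionId);
    if (inner === undefined || !inner.delete(actionId)) return false;
    if (inner.size === 0) this.bySession.delete(sessionId);
    return true;
  }

  depth(sessionId: string): number {
    return this.bySession.get(sessionId)?.size ?? 0;
  }
}
```

Because the lookup always starts from the session's own inner map, a result reported under the wrong session finds nothing to delete and leaves every other session's slot in place.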
…#1393) (#1502) Completes the browser-side EditorAgentBridge for #1393: extracts the inline SSE/register/dispatch/result wiring into a thin editorAgentBridge module (pure dispatchEditorAgentAction + useEditorAgentBridge) and finishes moveTab/splitPane/setSelection by delegating to existing controllers/policies. No change to the frozen #1391 contract. ci + ui + CodeQL + cross-platform smokes green.
…it (Refs #1395) (#1508) * feat(editor): agent editor action governance, policy, and bounded audit (Refs #1395) Add an explicit allow/deny/review-required policy taxonomy and a bounded, content-free audit trail for agent editor actions, with a read-only "recent agent actions" UI surface. Reuses the existing evidence/redaction and governance primitives rather than a parallel stack (ADR-0062). - keiko-contracts: new editor-agent-governance leaf — effect-class taxonomy, deterministic fail-closed policy classifier, content-free audit record + pure builder (no raw source/secret can enter by construction). - keiko-server: bounded in-memory per-session audit ledger; policy decision + audit emission wired into the action route; sensitive-path (deny-list) write denial across all write actions (surfaced as OUT_OF_SCOPE); GET /api/editor/agent/audit read feed. - keiko-ui: EditorAgentActionsPanel recent-actions surface (a11y, no globals.css, CSP-safe), bridge onAgentActivity refresh hook. - docs: ADR-0062. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * docs(editor): governance doc + Playwright browser evidence for #1395 (Refs #1395) - docs/editor-agent-governance.md: policy taxonomy, audit schema, storage decision, UI surface, and limitations. - tests/e2e/editor-agent-1395.spec.ts + config + npm script: real-app browser evidence that the policy denies sensitive-path writes (OUT_OF_SCOPE), admits contained writes for review, and serves the content-free audit feed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(editor): close adversarial-review test gaps + distinguish audit load failure (Refs #1395) Adversarial multi-lens review (4/5 SHIP, tests SHIP_WITH_FIXES) confirmed test gaps and one UX gap; no correctness blocker. Fixes: - AC2: also assert format/save (not just applyTextEdits) deny on a sensitive path. - AC1: assert the audit record is not duplicated on idempotent replay (bounded). 
- AC3: add an AWS-key redaction case so the test proves the full redactor, not one pattern. - AC4: the panel now distinguishes a load failure from an empty feed (distinct message + role=status), instead of silently showing "no actions" on a failed fetch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
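As a rough illustration of the allow/deny/review-required taxonomy with fail-closed defaults, the sketch below classifies an action by effect class and path. The effect-class names, the action-to-effect table, and the sensitive-path rules (`.git`, `.env*`, `*.pem`) are assumptions for this example; the shipped classifier in the editor-agent-governance leaf may differ.

```typescript
type PolicyDecision = "allow" | "deny" | "review-required";
type EffectClass = "read" | "navigate" | "write";

// Assumed action-to-effect table; unknown action types are deliberately absent.
const EFFECT_BY_ACTION: ReadonlyMap<string, EffectClass> = new Map<string, EffectClass>([
  ["getSnapshot", "read"],
  ["setSelection", "navigate"],
  ["applyTextEdits", "write"],
  ["applyPatch", "write"],
  ["format", "write"],
  ["save", "write"],
]);

// Assumed deny-list: any .git segment, any .env* segment, or a .pem file.
function isSensitivePath(relativePath: string): boolean {
  return relativePath
    .split("/")
    .some((seg) => seg === ".git" || seg.startsWith(".env") || seg.endsWith(".pem"));
}

// Deterministic and fail-closed: anything unrecognized is denied.
function classifyAgentAction(actionType: string, relativePath: string): PolicyDecision {
  const effect = EFFECT_BY_ACTION.get(actionType);
  if (effect === undefined) return "deny";
  if (effect !== "write") return "allow";
  return isSensitivePath(relativePath) ? "deny" : "review-required";
}
```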
…(Refs #1395) (#1511) The recent-actions panel mounts once per editor pane, so a role="status" on its load-error message collided with getByRole("status") queries in the existing EditorWidget test-generation tests (multiple status roles). Announce the failure via aria-live="polite" instead — no queryable status role, same SR behavior. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
) Single-sources the absolute->root-relative file-identifier conversion behind a tested keiko-contracts contract (editor-workspace-path, ADR-0063) and routes EditorWidget, workspaceActions, and AppShell through it, so the editor never sends an absolute path to the BFF (the absolute-path editor load failure). Server validation unchanged. Refs #1374.
…ss (#1514) playwright.issue-1394-editor-agent.config.ts and tests/e2e/editor-agent-1394.spec.ts existed without an npm script, unlike the sibling per-issue editor e2e configs (test:e2e:editor-fidelity-1295/-1296, test:e2e:editor-governance-1395). Add test:e2e:editor-agent-1394 following the exact same pattern. These per-issue editor e2e configs are ad-hoc evidence harnesses and are not referenced by any CI workflow (only test:e2e:smoke runs in ci.yml); no CI wiring changes. Verified: `npm run test:e2e:editor-agent-1394 -- --list` discovers all 5 tests in the spec. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…1515)

Hardens the existing editor layout system so multi-tab strips, split panes, and recursive resize behave like a stable IDE layout.

- Re-home the per-pane dirty index onto the committed layout via a pure reconcileEditorDirtyByPane helper, so a dirty tab keeps its marker, active selection, and unsaved-changes prompt as it moves between panes and no orphaned flag survives a collapsed pane (AC3). Fixes a latent data-loss path where a moved dirty tab could close without a save prompt and a save could target the wrong pane.
- Expose the WAI-ARIA window-splitter pattern on split resizers (role=separator, aria-orientation, aria-valuemin/max/now) with no visible DOM change (AC5).
- Document the reducer invariants in editor-layout.ts and ADR-0063.
- Add regression coverage: contract serialize-reload round trip (AC1); jsdom tests for reload persistence, drag-changes-only-layout (AC2), dirty-tab move (AC3), empty-pane collapse (AC4), and separator semantics (AC5); an EditorWidget accessibility smoke; and a packaged-app Playwright spec (editor-layout-1375) for split creation, nested resize, keyboard reorder/move, and reload persistence.

No contract wire shape changes; the dirty reconciliation lives in the UI layer.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…1516) * test(editor): strengthen AC3 close-prompt coverage and e2e persistence sync (Refs #1375) Adversarial review follow-ups: - Add a jsdom regression that closing a file from the pane a dirty tab LEFT raises no false unsaved-changes dialog, directly exercising the orphaned-flag close-prompt path (dirtyFilesForPane). Mutation-verified: fails if the dirty reconciliation is removed. - e2e: wait for the layout-persistence effect to flush the reordered order to localStorage before reload, and assert persistence at the storage layer. - Reference ADR-0064 from the editorDirtyState module doc. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(editor): characterize shared-file dirty re-spread on reconcile (Refs #1375) Closes the last adversarial-review coverage nit: a split-view file open in two panes whose flag was cleared in only one pane is re-homed onto both by reconcileEditorDirtyByPane, because the Monaco model is shared by (root,file) so the buffer is unsaved in every pane that shows it (ADR-0064 D2). Documents the intended union semantics; no production change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
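The re-homing and union semantics described for reconcileEditorDirtyByPane can be pictured with the sketch below. It assumes a simplified model (a map from pane id to the file ids shown in that pane, and a map from pane id to dirty file ids); the function name and both shapes are hypothetical and are not the helper's real signature.

```typescript
type PaneId = string;
type FileId = string;

// Recomputes the per-pane dirty index against the committed layout.
// A file flagged dirty in any pane is dirty wherever it is shown (shared buffer),
// and flags for panes that no longer show the file are dropped.
function reconcileDirtyByPane(
  layout: ReadonlyMap<PaneId, readonly FileId[]>,
  dirtyByPane: ReadonlyMap<PaneId, ReadonlySet<FileId>>,
): Map<PaneId, Set<FileId>> {
  const dirtyFiles = new Set<FileId>();
  dirtyByPane.forEach((files) => files.forEach((f) => dirtyFiles.add(f)));

  const result = new Map<PaneId, Set<FileId>>();
  layout.forEach((tabs, pane) => {
    const dirtyHere = new Set<FileId>(tabs.filter((f) => dirtyFiles.has(f)));
    if (dirtyHere.size > 0) result.set(pane, dirtyHere);
  });
  return result;
}
```

In this model, moving a dirty tab from pane A to pane B leaves no flag on A and a flag on B, and a file open in two panes ends up flagged in both.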
…#1376) (#1518)

Audit and complete the editor's unsaved-work protection across tab close, pane close, root change, window close, reload, and disk-conflict scenarios.

- Wire the previously dead `reload-file` dirty-close reason: a conflict reload over a dirty buffer now routes through an explicit modal acknowledgement that reuses the close-flow dialog surface, instead of overwriting the buffer outright (D1, AC1).
- Delete the hot-exit snapshot on an explicit Discard close from the EditorWidget side, so discarded edits no longer resurface as a recovery offer on reopen (AC5).
- Reserve the incoming snapshot's bytes in prune() and exclude its overwritten predecessor, so a write cannot push hot-exit storage past the 8 MiB cap (D2).
- Offer recovery on content difference alone, dropping a redundant content-hash clause that could only ever suppress a legitimate recovery offer (AC3).
- Turn the disk-conflict "Compare" into a true side-by-side diff via the existing EditorDiffSurface rather than a prose notice (AC4).
- Add contract, store, and component regression coverage, including a regression asserting no native window.confirm in editor close flows (D4); document the decisions, including the window-close scope boundary, in ADR-0065.

Window close stays protected by the hot-exit snapshot plus the beforeunload guard rather than a shell-level dialog, which would require editing the desktop window manager outside this issue's write-ownership.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…(Refs #1377) (#1519) Establish the editor browser quality backbone (Epic #1491): a reusable e2e helper library, a consolidated baseline regression matrix, a named runnable gate, the test-design doc, and ADR-0066. Test-infrastructure only; no product code is changed and no quality gate is weakened. - tests/e2e/support/editorWorkspace.ts: shared fixtures, window seeding, layout drivers, persisted-state + IndexedDB readers, page-error collector, consolidating the helpers the per-issue editor specs each re-implemented. - tests/e2e/editor-baseline-1377.spec.ts: 10-scenario deterministic matrix driving the real packaged app (tree-open, tabs, reorder persistence, split/resize, dirty buffer, dirty-close guard, hot-exit recovery, empty state, load-failure, and keyboard/focus accessibility). No timing hacks. - playwright.issue-1377-editor-baseline.config.ts + test:e2e:editor-baseline-1377 npm script: the named, locally-runnable browser quality gate. - docs/keiko-editor/1377-editor-browser-regression-gate.md: test-design doc, regression matrix, helper API, a11y coverage, performance/memory budget cross-references. - docs/adr/ADR-0066-...md: the gate decision record. Refs #1377 Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…age (Refs #1376) (#1520)

Adversarial review disposition for the dirty-buffer/hot-exit/recovery hardening:

- Serialize hot-exit store writes/deletes through one promise chain so an explicit Discard's delete cannot race an in-flight dirty-write and resurrect the snapshot.
- Delete the hot-exit snapshot when a dirty conflict reload is confirmed, so the reload does not immediately re-offer the just-discarded edits.
- Capture the on-disk content when recovery is offered and diff against that baseline, so Compare stays accurate even if the buffer is edited before Compare opens.
- Move focus to the compare view's primary action and suppress the recovery banner while comparing so its actions are not duplicated.
- Replace the vacuous prune predecessor test with one that fails without the exclusion; add the AC3 content-hash false-negative guard, edit-before-compare, reload-confirm focus/Escape, reload-confirm snapshot deletion, and axe checks for the new modal and compare panel.
- The gating release-smoke @smoke test now exercises the conflict reload-confirm and the dirty-tab-close dialog (Cancel preserves the buffer; no native confirm).
- ADR-0065: document store-mutation serialization, the disk-baseline compare, the single-key two-pane limitation, the oversized-snapshot behavior, and the agent-conflict reload scope boundary.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
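The "one promise chain" serialization from the first item above is a general pattern; a minimal sketch follows. The class and method names are hypothetical, an in-memory Map stands in for the real hot-exit store, and the `simulatedTicks` parameter only exists to mimic a slow write.

```typescript
// Runs store mutations strictly one at a time, in the order they were requested.
class SerializedSnapshotStore {
  private tail: Promise<void> = Promise.resolve();
  private readonly snapshots = new Map<string, string>();

  private enqueue(op: () => Promise<void>): Promise<void> {
    const run = this.tail.then(() => op());
    // Keep the chain usable after a failed op; the caller still sees the rejection via `run`.
    this.tail = run.catch(() => undefined);
    return run;
  }

  write(key: string, content: string, simulatedTicks = 0): Promise<void> {
    return this.enqueue(async () => {
      // Stand-in for a slow asynchronous storage write.
      for (let i = 0; i < simulatedTicks; i++) await Promise.resolve();
      this.snapshots.set(key, content);
    });
  }

  delete(key: string): Promise<void> {
    return this.enqueue(async () => {
      this.snapshots.delete(key);
    });
  }

  has(key: string): boolean {
    return this.snapshots.has(key);
  }
}
```

With this shape, a `delete` issued while a `write` is still in flight always runs after that write finishes, so the discarded snapshot cannot be written back afterwards.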
…s run (#1521) Lands the post-merge fixes #1519 raced past: the baseline 'tab' selector now targets the dirty-bearing .ed-tab span via :has(.ed-tab-hit[data-tip]), so the dirty-buffer, dirty-close, and recovery scenarios run; adds the dirty-close dialog aria-labelledby a11y assertion. Test-infrastructure only. Refs #1377.
… map (#1379) Reuse-and-ratify per ADR-0067: canonical editor language mode map in keiko-contracts; exhaustive describeLanguageCapabilities() (AC2); additive content-free languageCapability on the agent snapshot (AC4); registry-driven UI ratified (AC1); plaintext degrade pinned (AC3). Additive only; schema versions stay "1". Refs #1379
… baseline (#1380) (#1526) Make the explicit Format command + provider status reflect what is actually reachable in the browser, and register the syntax grammars every supported language needs, per ADR-0068 (Epic #1491). - D1: new strict contracts leaf `editor-builtin-capabilities.ts` — the single editor-tier source of truth for per-language browser built-in capabilities (syntaxHighlighting, bracketMatching, documentFormatting source), exhaustive over EDITOR_LANGUAGE_MODE_IDS and coherence-pinned by test. - D3/D5 (AC1/AC5): editorMonacoRuntime registers the css/scss/less/html basic-languages grammars so those buffers tokenise; EditorRuntimeWidget derives Format availability from the registry (monaco-builtin for json/css/scss/less/html; keiko-language-service for ts/js; none otherwise) so yaml/markdown no longer advertise a formatter the browser cannot reach. - D4 (AC5): a content-free `formatting` status field, fed the same degraded-aware value that gates the Format button + its dynamic aria-label, so command and status can never disagree. - D7: the Keiko formatting bridge selector stays ts/js only (no double-registration with Monaco's bundled json/css/html workers); pinned by a coherence test. - AC3/AC4: reuse the #1201 failure-safe bridge (EMPTY_EDITS on failure) — added tests, no behaviour change. - D4 deliverable: packaged-app Playwright browser evidence (editor-formatting-1380.spec.ts) for language mode, syntax highlighting, dark-theme/tokenisation after reload, per-source Format availability, and the no-formatter failure-safe path. Additive only; no server change, no contract-shape change, schema versions stay "1". Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* feat(editor): governed LSP process manager foundation (#1381) Long-lived supervised stdio LSP process layer for keiko-server: process lifecycle, in-house LSP base-protocol JSON-RPC codec, request deadlines, response byte caps, cancellation, restart throttling, workspace-root executable containment, content-free lifecycle status, and graceful shutdown — proven with fake/test providers only (no real language server, no container or remote supervision, no model intelligence). This is the ADR-0043 amendment that ADR-0045 D2 required before any real language server ships: an injected LspSpawnFn (never the buffering runCommand) with isCommandAllowed deny-by-default preflight (I5), env allowlist, ephemeral HOME, workspace-root executable containment (I2), a maxFrameBytes frame-reader hard-reject (I3), content-free lifecycle attestation with deepRedactStrings (I4), and a networkPolicy field that accepts future enforced-egress wrapping (I1). Provider status maps onto the existing LanguageProviderDescriptor via lspStatusToProviderDescriptor with zero UI change; the read-only GET /api/editor/lsp/status feed is env-gated default-off. No vscode-jsonrpc dependency; the existing unavailable external-LSP descriptors and registry are untouched. 
- contracts: new strict leaf lsp-process.ts (status/config/error-code and content-free lifecycle-event vocabulary + pure mappings) - server: editor/lsp/ subsystem (frame codec, JSON-RPC client, transport, node adapter, lifecycle manager, restart throttle, language provider, lifecycle ledger, status route) + fake LSP harness and real-subprocess fixture; route registered in routes.ts - docs: ADR-0069 Refs #1381 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(editor): address LSP process manager review findings (#1381) Adversarial security/correctness/performance review disposition: - frame codec (I3 + perf): cap the pre-header region at MAX_HEADER_BYTES so a server flooding bytes without a header terminator cannot grow the read buffer unbounded; accumulate chunks in a list and concat at most once per frame (removes per-chunk O(n^2) Buffer.concat). - lifecycle: set `exited` before the dispose early-return so a prompt child exit during graceful shutdown resolves escalateKill without burning the full grace; dispose the superseded transport on restart so in-flight requests reject immediately (DISPOSED) instead of waiting out the request deadline. - restart robustness: track a monotonic childGeneration so a late exit/error from a child already superseded by a restart cannot trigger a spurious CRASHED transition or consume a stale throttle credit. - node adapter: correct the security-boundary comment (preflight/containment/env allowlist are the manager's responsibility) and add a defense-in-depth absolute-path assertion in defaultLspSpawnFn. - repair a control character (NUL) that had replaced the space in the bare-name executable check `name.includes(" ")`; restore the literal space so the file is UTF-8 text and the space-rejection is correct. 
- tests: dedicated restart-throttle unit tests (boundary mutation coverage), header-overflow and many-small-chunks codec tests, in-flight-crash rejection, stale-generation no-op, tightened oversized-frame assertion; remove the dead "crash" behavior from the fake harness union. - docs: ADR-0069 status -> Accepted (pending human review). Refs #1381 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
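The frame-codec hardening above (a capped pre-header region, a frame-size hard-reject, and a single concatenation per frame) can be sketched against the LSP base protocol's `Content-Length` framing. The `FrameReader` class, its error strings, and the 4096-byte header cap are assumptions for this example, not the server's codec.

```typescript
const MAX_HEADER_BYTES = 4096; // assumed value; the commit does not state the real cap

class FrameReader {
  private header = "";
  private bodyParts: Uint8Array[] = [];
  private bodyRemaining = -1; // -1 means the reader is still inside the header
  private readonly maxFrameBytes: number;

  constructor(maxFrameBytes: number) {
    this.maxFrameBytes = maxFrameBytes;
  }

  // Feeds one chunk and returns every frame body completed by it.
  push(chunk: Uint8Array): Uint8Array[] {
    const frames: Uint8Array[] = [];
    let i = 0;
    while (i < chunk.length) {
      if (this.bodyRemaining < 0) {
        // Header bytes are ASCII, so byte-wise decoding is enough here.
        this.header += String.fromCharCode(chunk[i++]);
        if (this.header.length > MAX_HEADER_BYTES) throw new Error("HEADER_TOO_LARGE");
        if (this.header.endsWith("\r\n\r\n")) this.beginBody();
      } else {
        const take = Math.min(this.bodyRemaining, chunk.length - i);
        this.bodyParts.push(chunk.subarray(i, i + take)); // collect slices, no copy yet
        i += take;
        this.bodyRemaining -= take;
      }
      if (this.bodyRemaining === 0) frames.push(this.finishFrame());
    }
    return frames;
  }

  private beginBody(): void {
    const match = /^Content-Length: (\d+)\r\n/im.exec(this.header);
    if (match === null) throw new Error("MISSING_CONTENT_LENGTH");
    const length = Number(match[1]);
    if (length > this.maxFrameBytes) throw new Error("FRAME_TOO_LARGE");
    this.header = "";
    this.bodyRemaining = length;
  }

  // Copies the collected slices into one buffer, once per frame.
  private finishFrame(): Uint8Array {
    const total = this.bodyParts.reduce((n, part) => n + part.length, 0);
    const body = new Uint8Array(total);
    let offset = 0;
    for (const part of this.bodyParts) {
      body.set(part, offset);
      offset += part.length;
    }
    this.bodyParts = [];
    this.bodyRemaining = -1;
    return body;
  }
}
```

A peer that streams bytes without ever sending the `\r\n\r\n` terminator hits `HEADER_TOO_LARGE` after the cap, and a declared length above `maxFrameBytes` is rejected before any body byte is buffered.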
Implements the local runtime capability detector, API route, tests, and documentation for Issue #1385.
…#1546)

Implements the governed command runner for Issue #1387: a closed catalog of test/build/run tasks discovered from package scripts, executed through the single governed spawn boundary (keiko-tools runCommand) with workspace containment, output byte caps, timeout, cancellation, content-free evidence, and dual-layer secret redaction. Reuses the ADR-0043/ADR-0006 spawn boundary and the ADR-0018 terminal manager pattern rather than adding a second execution path; a run names a discovered task id, never free-form argv.

- contracts: command-runner wire contract (catalog, run request/result, events, COMMAND_TASK_RULES allowlist) + hand-rolled validators; adds the additive "command-run" EvidenceTaskType
- server: CommandRunnerManager (package-script discovery + governed execution composing runCommand) + routes (/api/commands/catalog|runs|events) + content-free run evidence, wired into deps/routes
- ui: CommandsWidget task surface + commands-api client + window registration
- tests: contracts/server/ui unit + integration coverage and an e2e browser smoke for catalog discovery, run, and structured result
- docs: command-runner security notes

Refs #1387
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Addresses review findings on PR #1546 (security audit + verifier + PR review):

- contracts: cap taskId (<=256) and requestId (<=128, token charset) at the parse boundary so an oversized/non-token id cannot reach the manager, audit ledger, or SSE fan-out (the 16 KB body cap was the only prior backstop) + tests
- server: document the per-run catalog re-discovery as an intentional untrusted-taskId re-validation; add a test proving a throwing SSE subscriber does not block fan-out to other subscribers
- routes: assert Layer-2 redaction is applied to every SSE event frame
- ui: render the project path as the Tasks window subtitle (connectionUtils)
- e2e: deterministically mock the SSE channel and assert a run lifecycle event reaches the bounded events log
- docs: correct the audit-evidence note (the standard manifest workspaceRoot path is retained; argv values, output bytes, and secrets are excluded)

Refs #1387
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
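A parse-boundary cap of the kind described in the contracts item might look like the sketch below. The length limits come from the commit text; the `parseRunRequest` name, the request shape, and the exact token character set are assumptions, since the commit only says "token charset".

```typescript
const MAX_TASK_ID_LENGTH = 256;
const MAX_REQUEST_ID_LENGTH = 128;
// Assumed token charset for illustration.
const TOKEN = /^[A-Za-z0-9._:-]+$/;

interface RunRequest {
  taskId: string;
  requestId: string;
}

// Returns a validated request, or null so the caller can reject before any manager logic runs.
function parseRunRequest(body: unknown): RunRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const { taskId, requestId } = body as { taskId?: unknown; requestId?: unknown };
  if (typeof taskId !== "string" || taskId.length === 0 || taskId.length > MAX_TASK_ID_LENGTH) {
    return null;
  }
  if (
    typeof requestId !== "string" ||
    requestId.length > MAX_REQUEST_ID_LENGTH ||
    !TOKEN.test(requestId)
  ) {
    return null;
  }
  return { taskId, requestId };
}
```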
#1388) (#1550)

* feat(runtime): add container engine detection and governed execution pilot (#1388)

Adds container-backed execution as an optional progressive enhancement with graceful degradation when no engine is available (ADR-0070, Epic #1491).

- contracts: container-runtime wire contract (engine status/capability, execution policy, closed ContainerTask catalog, run request/result/events, 11 failure reasons, CONTAINER_TASK_RULES allowlist) + hand-rolled validators; reuses the RuntimeCapabilityState vocabulary; adds the additive "container-run" EvidenceTaskType
- server: opt-in ACTIVE containerEngineDetector (docker/podman version+info through the single runCommand boundary into structured non-throwing states) + ContainerRunnerManager with a server-frozen `docker run` argv (--rm --network none --read-only --cap-drop ALL --security-opt no-new-privileges --pids-limit/--memory/--cpus --pull never, read-only /workspace mount, no privileged, no socket mount) + /api/containers/{capability,catalog,runs,events} routes + content-free run evidence, wired into deps/routes/index. The #1385 metadata-only detector is left unchanged.
- docs: ADR-0070 governing the new Docker-socket trust boundary (NIST SP 800-190 + Docker rootless), indexed in docs/adr/README.md

Refs #1388
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(ui): add container status widget with graceful degradation (#1388)

Adds the container capability surface for Issue #1388: a status widget that always renders, showing the structured engine state and a remediation hint when no engine is available (role="status", graceful degradation: never blocks the surface), and the allowlisted catalog tasks with a run control only when an engine is available (AC1/AC2). Mirrors the #1387 CommandsWidget for placement, registration, api-client style, and accessibility.

- lib/container-api.ts: typed /api/containers/* fetch client (CSRF, no body logging)
- ContainerStatusWidget + jest-axe a11y tests (zero serious/critical)
- WindowsRegistry/descriptor-meta/index registration for the containerStatus window
- e2e: unavailable-state browser smoke proving Keiko stays fully usable with no engine

Refs #1388
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(runtime): add container execution security notes (#1388)

Documents the container execution trust model for Issue #1388 (ADR-0070): deny-by-default allowlist, server-frozen argv, read-only workspace mount, network-isolated container, dropped capabilities, no-new-privileges, no auto pull, resource limits, content-free audit evidence, the opt-in active detection probe, and the NIST SP 800-190 control mapping. Cross-references the new probe from the #1385 local-runtime-capabilities doc.

Refs #1388
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
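To make the "server-frozen argv" idea concrete, here is a sketch that assembles a `docker run` argument list from the flags listed in the commit. The function name, the `ContainerLimits` shape, and the use of `--mount ... readonly` for the read-only /workspace mount are assumptions; the commit does not show how ContainerRunnerManager builds its argv.

```typescript
interface ContainerLimits {
  pidsLimit: number;
  memory: string; // e.g. "512m"
  cpus: string; // e.g. "1.0"
}

// Only image, workspace root, limits, and the allowlisted command vary;
// every hardening flag is fixed here and cannot be influenced by a request.
function buildDockerRunArgv(
  image: string,
  workspaceRoot: string,
  limits: ContainerLimits,
  command: readonly string[],
): string[] {
  return [
    "run",
    "--rm",
    "--network", "none",
    "--read-only",
    "--cap-drop", "ALL",
    "--security-opt", "no-new-privileges",
    "--pids-limit", String(limits.pidsLimit),
    "--memory", limits.memory,
    "--cpus", limits.cpus,
    "--pull", "never",
    // Assumed mount form; a workspace path containing a comma would need extra handling.
    "--mount", `type=bind,source=${workspaceRoot},target=/workspace,readonly`,
    image,
    ...command,
  ];
}
```

Passing the result as an argument array to a no-shell spawn keeps each value a single argv entry, so nothing in `image` or `command` is interpreted by a shell.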
* feat(git-delivery): add governed Git action contracts, policy packs, and risk semantics (#471) (#1503) Introduce the typed contract foundation for Epic #470 governed Git delivery: three keiko-contracts leaf modules (git-delivery, git-delivery-policy, git-delivery-provider) defining the 10-kind action model, the 4-class risk taxonomy with data-driven severity, the approval-intent model, the lifecycle envelope (resolved inputs / policy decision / approval requirement / preview / result / evidence ref), repo and org policy packs with a deterministic fail-closed evaluator, and provider-neutral branch-protection / checks / pull-request / merge-readiness interfaces. Adds ADR-0058, an operator-facing governance doc, and a keiko-tools boundary test proving the read-only terminal baseline still denies every Git mutation so governed write authority lives only behind these typed contracts (AC5). Enables CI and CodeQL on the feat integration branch by mirroring the existing feat/prompt-enhancer-1307 trigger and protected-branch-gate entries. Refs #471 Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> * fix(git-delivery): harden #471 contract validators per adversarial review (#1506) Stricter parse-time enum guards (actionKind, provider-capability) and typed GitDeliveryRemoteTargetPolicy push-target patterns, with 10 regression tests. Follow-up to #1503. Refs #471. * feat(git-delivery): governed Git mutation execution kernel (Refs #472) (#1509) Implement the deterministic preflight and mutation orchestration kernel for governed local Git writes (Issue #472, Epic #470), consuming the #471 contracts. - Lifecycle orchestrator (runGitMutation): resolve, preflight, preview, policy, execute, result — the single execution authority over local mutation kinds. - Deterministic preflight evaluators over a content-free worktree snapshot, with typed findings (blocking/advisory severity; user-actionable/internal remediation); idempotent reruns by construction. 
- Narrow local Git adapter: a typed port with NO generic exec method, a closed governed command table, a dedicated allowlist, and pure argv builders with flag-injection guards. The Node adapter runs plans through the existing keiko-tools no-shell spawn boundary. - Structured failure taxonomy (policy-block / preflight-block / execution-failure / provider-failure / recovery-required) consumable without string parsing. - Idempotency journal (records successes only) and safe-retry semantics. - ADR-0059 + docs; barrel exports + surface pins; ./internal/git-mutation subpath. The read-only terminal git baseline is preserved and machine-checked complementary to the governed write surface. Remote/provider execution (push/PR/merge) is deferred to #476-#478 behind a separate gateway. Refs #472 Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> * feat(git-delivery): add approval orchestration, preview manifests, and action-sheet UX contracts (#473) (#1513) Implement the approval and preview presentation layer for governed Git delivery (Issue #473, Epic #470) on top of the #471 contracts and the #472 execution kernel. - keiko-contracts: new strict-leaf git-delivery-action-sheet.ts projecting the kernel's content-free facts into a UI-safe GitDeliveryActionSheet — a three-state union (ready-to-execute / waiting-for-approval / blocked), an approval summary with mandatory/optional/impossible necessity, a content-free preview manifest (branch targets, mutation scope, remote impact, PR/merge/branch-protection/checks state, expected blockers), a policy/preflight/provider-not-ready blocked-cause classification, and recovery hints with a suggested governed recovery strategy. Includes the wire request type and pure assemblers/guards/parsers. - keiko-server: POST /api/git-delivery/action-sheet — a read-only/computational BFF endpoint that runs the pure kernel phases (evaluateGitPreflight + evaluateGitPolicy over TRUSTED server policy packs) and projects them into a sheet. 
  Default-false deployment capability gate; CSRF + body-cap + strict-key + secret-shape + unsafe-format-char (Trojan-Source) rejection; deep-redacted response; expiry-aware approval demotion (clock parity with the #472 kernel).
- keiko-ui: desktop GitDeliveryActionSheetCard rendering ready/blocked/recovery states with alertdialog semantics, focus management, a status live region, not-colour-alone labelling, and jest-axe coverage; fetchGitDeliveryActionSheet client.
- ADR-0060 + docs/git-delivery/governed-git-approval-surface.md; regenerated ADR-0051 visual-regression proofs (evidence/1300) after the additive .gdas-* globals.css block.

Authority stays server-side; the action sheet is a pure projection over backend facts, never a second policy system. No mutation executes in this slice (local execution is #472; remote push/PR/merge are #476-#478). Content-free throughout.

Refs #473
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): governed Git mutation evidence ledger, audit export, and recovery metadata (Refs #474) (#1517)

* feat(git-delivery): add governed Git mutation evidence ledger, audit export, and recovery metadata (Refs #474)

Deliver the audit and compliance backbone for Epic #470 governed Git delivery so mutating repository actions become inspectable, exportable, and supportable after execution (Issue #474).

- keiko-contracts: new strict-leaf `git-delivery-evidence.ts` defining the content-free `GitDeliveryEvidenceRecord`, exportable `GitDeliveryAuditPacket`, AC1 outcome-class vocabulary, the net-new retrospective three-way `GitDeliveryRecoveryDisposition` (retryable/user-fixable/policy-forbidden/none), and total exhaustive recovery-disposition derivations. Reuses the #473 recovery action-hint and #471 strategy vocabularies (no parallel subsystem).
- keiko-tools: pure `buildGitDeliveryEvidenceRecord` projecting the #472 `GitMutationLifecycleResult` into a record for EVERY terminal outcome; hashes remote identifiers, the provider external id, and the repo identity (content-free by construction).
- keiko-server: bounded, date-bucketed append-only evidence ledger (`mutationEvidenceLedger.ts`, redact-then-persist, fail-closed, never throws) and a capability-gated `GET /api/git-delivery/evidence` audit-export route with re-redaction on read.
- ADR-0061 and docs/git-delivery/governed-git-evidence-ledger.md.

Tests: 47 new (contracts 13, tools 21, server 13) proving AC1 completeness, AC2/AC5 no-secret-leak, AC3 recovery classification, AC4 export. Producer wiring of the kernel into a live execution route remains deferred to #476-#478.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(git-delivery): harden #474 evidence ledger per adversarial review (Refs #474)

Adversarial 5-lens review (security/architecture/correctness/tests/redaction-skeptic; 3 SHIP, 1 SHIP_WITH_FIXES, no blockers). Confirmed fixes:

- correctness (medium): effectiveBlockReason no longer attaches an eagerly-evaluated policy block reason to a preflight-blocked outcome (the policy gate never fired); preflight blocks now carry no policy blockReason. Regression test added.
- audit integrity: strip bidirectional/zero-width/BOM format characters from echoed branch names so a crafted ref cannot visually spoof an audit row. Test added.
- guard: isGitDeliveryAuditPacket now verifies recordCount === records.length.
- export honesty: the audit packet now carries a bounded-window limitation note; multi-day cross-bucket + window-exclusion route tests added.
- docs: soften the SHA-256/redactor wording (redactor is a secret-shape backstop, not a catch-all; the primary control is by-construction hashing); document that approver ids and branch names are deliberately retained governance provenance; correct ADR-0061 builder deps (deps.hash), correlation mechanism, server filenames, and the stale boundary-review note.

Disposition: raw approver ids are intentional governance provenance (AC2 requires preserving approval provenance); not a leak.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): governed local branch, staging, and commit flows with commit-intent composition (Refs #475) (#1523)

* feat(git-delivery): commit-message policy + commit-intent contracts and branch-switch kind (Refs #475)

Adds two strict keiko-contracts leaves (git-commit-policy.ts, git-commit-intent.ts) and the governed branch-switch action kind.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(adr): ADR-0062 governed local git flows and commit-intent (Refs #475)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): branch-switch kernel, read-only worktree snapshot reader, commit-intent summary (Refs #475)

Adds the governed branch-switch command across the #472 kernel (adapter argv, orchestrator dispatch, preflight switch-target-missing), a read-only worktree snapshot reader on the internal git-mutation subpath, and the pure commit-intent change summarizer. Threads the new finding code through the action-sheet projection recovery-hint table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): governed local branch/staging/commit execution + preview routes (Refs #475)

Adds the BFF execution surface for the #475 local flows: branch create/switch + staging routes and a read-only commit preview + governed commit execute route.
Each runs the #472 kernel (preflight + policy + approval) over a server-built live snapshot, enforces the commit-message policy before the kernel, surfaces commit-intent quality warnings, and records evidence through the #474 ledger. Capability- and CSRF-gated, content-free, fail-closed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): governed local git flow desktop surface (Refs #475)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(git-delivery): governed local git flows reference (Refs #475)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(git-delivery): browser e2e evidence for governed commit no-bypass + fix window persistence (Refs #475)

Adds a deterministic Playwright spec (real packaged app, governed routes intercepted) proving the browser commit path surfaces the governed message-policy block and cannot bypass /commit/execute (AC5 browser half). Fixes the governedGit window persistence from evidence-reference (which strips the slash-bearing project path → broken empty window on reload) to fs-reference, matching the path-carrying files/editor windows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* style(git-delivery): satisfy strict lint in #475 test files (Refs #475)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* test(git-delivery): restore #475 coverage gate + validator-totality hardening (Refs #475) (#1525)

* test(git-delivery): raise branch coverage + harden commit-message validator totality (Refs #475)

Adversarial-review follow-ups: guard new RegExp in the issue-key check so the pure validateGitCommitMessage stays total (and fails closed) on a malformed operator-configured pattern.
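The `new RegExp` guard described in the follow-up above can be illustrated with a minimal TypeScript sketch. This is not the repository's `validateGitCommitMessage`; the function name `checkIssueKey`, the result shape, and the reason strings are hypothetical and chosen only to show how a validator can stay total and fail closed when an operator-configured pattern does not compile.

```typescript
// Hypothetical result shape; the real contract types are not shown in this PR text.
type IssueKeyCheck =
  | { ok: true }
  | { ok: false; reason: "invalid-pattern" | "missing-issue-key" };

// Assumption: the operator supplies the issue-key pattern as a plain string.
function checkIssueKey(message: string, operatorPattern: string): IssueKeyCheck {
  let pattern: RegExp;
  try {
    // new RegExp throws a SyntaxError when the pattern source is malformed.
    pattern = new RegExp(operatorPattern);
  } catch {
    // Fail closed: a broken pattern rejects the message instead of throwing
    // out of the validator or silently allowing the commit.
    return { ok: false, reason: "invalid-pattern" };
  }
  return pattern.test(message)
    ? { ok: true }
    : { ok: false, reason: "missing-issue-key" };
}
```

Returning a typed result for every input, including the malformed-pattern case, is what keeps a pure validator total: callers never need a surrounding try/catch, and a misconfiguration shows up as a block rather than as a bypass.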
Add an execution-core integration test (real git through the default seams) + route branch-coverage for worktree-unavailable, approval-required, allowEmpty, the real branch/staging specs, and validation paths; drop the unused isStringArray guard. Restores keiko-server branches above its ratchet floor.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
(cherry picked from commit 1a99e4e)

* fix(test): use the real in-memory store in the resolveProjectWorkspace test (Refs #475)

The fake Project literal did not satisfy the UiStore type under the full tsc --noEmit (which checks test files); use createInMemoryUiStore so the test is type-safe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): add governed remote publish gateway for push, upstream, and protected-target awareness (#1527)

Add the governed publish layer (Issue #476, Epic #470) that turns local commit completion into safe remote delivery. `git push` becomes a controlled publish workflow with explicit preview, policy enforcement, recovery semantics, and evidence capture for allowed and blocked attempts.

- keiko-tools: new `git-publish-gateway.ts` (pure): `GitPushCommand`, the narrow `GitRemotePublishAdapter` port, a dedicated push-only allowlist, `buildPushArgv` (refuses force), the publish-rejection taxonomy, and the `runGitPublish` orchestrator producing a kernel-shaped lifecycle result. New `git-publish-node.ts` Node executor classifies rejections from git output. Push preflight gains a `non-fast-forward` finding.
- keiko-server: new `pushExecution.ts` (`executeGovernedPublish`, default-safe publish policy pack, preview/response projections) + `pushRoutes.ts` (`/api/git-delivery/push/preview` read-only, `/push/execute` governed).
- keiko-ui: a Publish section in `GovernedGitFlowCard`.
- Protected/shared targets are blocked by policy (stricter than user branches); force push is blocked by default (publish risk ceiling + argv refusal).
- ADR-0063; integration, route, unit, UI, and packaged-app browser evidence.

Refs #476
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): governed GitHub pull request command center and metadata orchestration (Refs #477) (#1531)

Deliver the governed pull request layer for Epic #470 (ADR-0064): turn a published branch into a review-ready GitHub PR through an explicit, governed workflow. It is a parallel execution authority to the #476 publish gateway, never an extension of it.

- contracts: new strict leaf git-pull-request.ts, a provider-neutral, content-free readiness model (objectExists vs reviewReady + severity-ranked blockers), deterministic metadata synthesis, draft-vs-ready recommendation, reviewer/label/linkage suggestions, and a neutral rejection taxonomy with exhaustive disposition/error-code tables. No cross-package imports, no provider field names.
- tools: git-pr-gateway.ts (pure PR gateway: GitPullRequestCommand carrying title/body, narrow two-method adapter port, dedicated `gh api` allowlist with NO merge/delete, pure argv builders, GitHub-error classifier, effective policy, runGitPullRequest returning a kernel-shaped lifecycle the #474 evidence builder consumes unchanged) + git-pr-node.ts (Node `gh api` executor; gh reads its own token, Keiko never does).
- server: prExecution.ts + prRoutes.ts (read-only preview + governed execute) under a default-safe KEIKO_DEFAULT_PR_POLICY_PACK; content-free evidence (only byte lengths), capability-gated behind KEIKO_GIT_DELIVERY_ENABLED.
- ui: GovernedPullRequestCard.tsx command center (editable metadata draft, readiness panel, recommendation, normalized failures). Inline CSS vars (globals.css untouched); WCAG 2.2 AA (aria-live, text-not-colour).
- tests: contracts/tools/server/ui unit + a11y suites; browser e2e proving the PR path reaches the governed BFF routes with no bypass (evidence under docs/git-delivery/evidence/477).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* feat(git-delivery): governed merge gateway, protected-branch enforcement, and guided recovery (Refs #478) (#1534)

The merge layer of Epic #470 (ADR-0065): a THIRD parallel `gh api` execution authority that turns a review-ready pull request into a merged base branch as a governed release decision. Merge cannot execute until preflight, policy + final approval, and a provider readiness gate all pass; the provider's own enforcement is the backstop.

- keiko-contracts: NEW leaf git-merge.ts, a provider-neutral merge-readiness model (severity-ranked blockers reusing GitDeliveryMergeBlockReason + lifecycle states), strategy eligibility (policy ∩ provider capability), merge recommendation, and the rejection taxonomy with exhaustive error-code / recovery-disposition tables. Pure; existing contracts unchanged.
- keiko-tools: git-merge-gateway.ts (GitMergeCommand, narrow two-method GitMergeAdapter, dedicated `gh api` merge allowlist, argv builders, ordered GitHub merge-error classifier, mergeable_state mapper, runGitMerge 3-gate orchestrator producing a kernel-shaped lifecycle) + git-merge-node.ts (Node `gh api` executor: readiness reads + merge PUT + guarded non-fatal branch DELETE; token read by gh, never by Keiko).
- keiko-server: gitDelivery/mergeExecution.ts (approval-gated KEIKO_DEFAULT_MERGE_POLICY_PACK; preview/execute projections carrying per-blocker recovery info) + mergeRoutes.ts (read-only preview + governed execute, capability gate, validation, #474 evidence). Route group registered.
- keiko-ui: GovernedMergeCard.tsx (window "governedMerge": eligible-strategy selector, readiness/recovery panel, final high-risk confirmation, outcome banner; inline CSS vars, globals.css untouched) + api client + window registry.
- Docs: ADR-0065, docs/git-delivery/governed-merge.md.
- Tests: contracts (38), tools gateway/node (55), server routes (13), UI card + a11y (17); browser e2e (playwright.issue-478) proving no merge-anyway bypass, with evidence under docs/git-delivery/evidence/478.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* docs(git-delivery): add Issue #479 closure evidence (#1536)

Refs #479

* docs(git-delivery): finalize Epic #470 closeout after child-issue reconciliation (Refs #470) (#1538)

The #479 closeout summary recorded that Epic #470 must remain open until the GitHub issue and project-board records for #472, #477, and #478 were reconciled with their already-merged code. #477 and #478 have since been closed; this finalizes the remaining reconciliation of #472 (governed Git mutation execution kernel, PR #1509 / 401b08a) and retires the closure-gating note now that all nine child issues (#471-#479) are closed and every Definition-of-Done and Expected-Verification item is evidenced.

Refs #470
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* fix(git-delivery): harden governed merge audit gaps (Refs #470) (#1542)

* test(ui): refresh design-system evidence after git delivery integration

* feat(editor): route Git status actions to governed delivery

* feat(editor): complete runtime git command hub

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
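One of the #474 hardening fixes in the commit log above strips bidirectional, zero-width, and BOM format characters from echoed branch names so a crafted ref cannot visually spoof an audit row. A minimal TypeScript sketch of that idea follows. The function name `stripFormatChars` and the exact character set are assumptions made for illustration; the repository's actual helper and character list are not shown in this PR text.

```typescript
// Assumed character set (not taken from the repository):
//   U+061C          Arabic letter mark
//   U+200B-U+200F   zero-width space/joiners and LRM/RLM marks
//   U+202A-U+202E   bidirectional embedding and override controls
//   U+2060          word joiner
//   U+2066-U+2069   bidirectional isolate controls
//   U+FEFF          byte order mark / zero-width no-break space
const FORMAT_CHARS = /[\u061C\u200B-\u200F\u202A-\u202E\u2060\u2066-\u2069\uFEFF]/g;

// Remove invisible format characters before a ref name is echoed into a record.
function stripFormatChars(ref: string): string {
  return ref.replace(FORMAT_CHARS, "");
}

// Example: a right-to-left override hidden inside a branch name is dropped.
const cleaned = stripFormatChars("release/\u202E1.0");
console.log(cleaned === "release/1.0");
```

Stripping rather than rejecting keeps the audit row writable for every outcome, which matches the ledger's stated goal of recording every terminal result; a stricter variant could reject such refs at preflight instead.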
68 tasks
Summary
Refs #1491.
This PR brings the #1491 agent-native editor foundation and runtime governance branch into `dev` after the production-readiness audit PR #1553 was merged back into the feature branch.

Included scope:
- ... `dev` branch, including voice exports and design-system proof regeneration

Audit and Evidence
- ... `dev` conflicts.

Local Validation
- npm run typecheck
- npm run lint
- npm run arch:check
- npm run arch:check:negative
- npm run test:coverage:quality
- npm run build --workspace @oscharko-dev/keiko-ui
- npm run check:git-delivery-evidence
- npm run check:editor-doc-links
- npm audit --audit-level=moderate --workspace @oscharko-dev/keiko-ui

#1491 should be closed only after this PR is green and merged into `dev`.