feat(runtime): controlled test/build/run command executor (#1387) by oscharko · Pull Request #1546 · oscharko-dev/Keiko

oscharko · 2026-06-26T09:19:32Z

Summary

Implements the governed test/build/run command executor for Issue #1387. Keiko can run a closed catalog of workspace tasks (discovered from package.json scripts) through governed local processes with cancellation, output caps, and audit metadata. A run names a discovered task id — never free-form argv — and executes through the single governed spawn boundary (keiko-tools runCommand, ADR-0043 D2), inheriting no-shell execution, name-allowlisted env, ephemeral HOME, workspace-contained cwd, byte-capped + redacted output, timeout (SIGTERM→SIGKILL), and AbortController cancellation.

Refs #1387

Scope

In scope: command policy contract; BFF command runner + catalog/result/event routes; UI command surface; tests and security notes.
Out of scope: arbitrary shell terminal; background daemon management; remote execution; container execution (separate child issue).

Reuse And No-Duplication

Existing Keiko functionality was inspected before implementation.
This PR reuses, extends, generalizes, or consolidates existing functionality where practical.
Any new implementation is limited to a documented capability gap in the linked issue.
This PR does not introduce a parallel execution/policy/evidence/UI subsystem where an existing subsystem can be extended.
Refactoring or consolidation was considered when existing functionality was close but not shaped for this change.

Reuse / gap rationale: the runner reuses runCommand (the only spawn boundary, ADR-0043 D2), DEFAULT_SANDBOX_POLICY, readWorkspaceFile + path containment (keiko-workspace), EvidenceStore (keiko-evidence), deepRedactStrings, and mirrors the ADR-0018 TerminalExecutionManager/routes/evidence pattern. The genuine gap — absent today — is a task-oriented layer: package-script discovery, a separate narrow executor allowlist (COMMAND_TASK_RULES, distinct from DEFAULT_COMMAND_RULES/TERMINAL_COMMAND_RULES), discovered-task-id validation (closes the npm run <untrusted-name> RCE), and a structured result with failureReason. The read-only terminal surface is an arbitrary-operand inspection tool and does not satisfy this scope.

Acceptance criteria mapping

Issue AC	Where
Only allowed commands can run	`COMMAND_TASK_RULES` + discovered-task-id validation; unknown id → `TASK_NOT_FOUND` before spawn (`command-runner.ts`, tests)
Commands can be cancelled	`AbortController` per run; `DELETE /api/commands/runs/:runId`; UI Cancel armed via SSE `run-started`/`requestId` (manager + routes + widget tests)
Output bounded, does not freeze UI	`runCommand` byte cap + `truncated`; UI renders already-capped output (truncation test, e2e)
Results include exit code, duration, truncation, failure reason	`CommandTaskRunResult` (contract + tests)
Agents can reuse the same command API	policy + result types in `keiko-contracts`; executor is the shared `CommandRunnerManager` composing `runCommand` (no second path)

Deliverables

Command policy contract — keiko-contracts/src/command-runner.ts
BFF command runner — keiko-server/src/command-runner.ts
Command result and event routes — keiko-server/src/command-runner-routes.ts
UI command surface — keiko-ui/.../CommandsWidget.tsx + lib/commands-api.ts
Tests and security notes — unit/integration/e2e + docs/command-runner/security-notes.md

Verification

Required:

Required GitHub checks expected to pass before merge.
Local verification commands listed below.
Reuse/extension/generalization evidence listed above.

Local verification:

npm run typecheck                  PASS (build:packages + check:package-graph + tsc)
npm run lint                       PASS
npm run arch:check                 PASS (0 dependency violations; import-policy PASS)
npm run arch:check:negative        PASS (40 negative fixtures caught)
npm run test:coverage:quality      PASS (12571 tests; keiko-server 87.55% / keiko-contracts within floor)
npm run test:coverage:ui + check:coverage:ui   PASS (keiko-ui 88.99% lines)
npm run check:version-consistency  PASS
npm run build:ui + check:editor-bundle-size --require-static-export   PASS
playwright --grep @command-runner-1387   1 passed (real Studio shell, mocked BFF)
new tests: contracts 19, server manager 16, routes 13, evidence 3, ui 13, descriptor-meta 7
command-runner file coverage: server manager 97.59% L / 98.80% B; errors/evidence 100%; routes 96.42% L; contract 93.38% L

Select only what applies:

UI behavior verified by component (jsdom + axe) tests and a Playwright browser smoke.
Core logic covered by unit, integration, and fixture tests.
Security-sensitive change reviewed for trust boundaries, secrets, external calls, and generated artifacts (see docs/command-runner/security-notes.md).
Not applicable items are explained below.

Not applicable rationale:

Supply-chain/package-surface: no new package dependencies or exports entries (additive index.ts re-exports only).
Release release:check: no release-path semantics changed.

Review And Closure

The PR implements only the linked issue scope.
Issue acceptance criteria and closure evidence are updated only where evidence exists.
Delivery board status updated (In Progress; PR Open to follow).
Use Resolves #1387 only when this PR should close the issue — kept as Refs #1387; the issue is closed manually after merge once every Definition-of-Done gate is satisfied.

Risk Notes

Execution is gated to discovered package-script task ids mapped to a frozen npm run <script> argv; script bodies are the project's own trusted workspace content, run under the minimized env + containment of runCommand.
Dual-layer secret redaction (env-value in runCommand; structural in the BFF for response and SSE frames); audit evidence is content-free (counts/enums only, taskType: "command-run").
commands window is restorable (fs-reference) and holds only a project-path reference; no secrets persisted.

🤖 Generated with Claude Code

Implements the governed command runner for Issue #1387: a closed catalog of test/build/run tasks discovered from package scripts, executed through the single governed spawn boundary (keiko-tools runCommand) with workspace containment, output byte caps, timeout, cancellation, content-free evidence, and dual-layer secret redaction. Reuses the ADR-0043/ADR-0006 spawn boundary and the ADR-0018 terminal manager pattern rather than adding a second execution path; a run names a discovered task id, never free-form argv. - contracts: command-runner wire contract (catalog, run request/result, events, COMMAND_TASK_RULES allowlist) + hand-rolled validators; adds the additive "command-run" EvidenceTaskType - server: CommandRunnerManager (package-script discovery + governed execution composing runCommand) + routes (/api/commands/catalog|runs|events) + content-free run evidence, wired into deps/routes - ui: CommandsWidget task surface + commands-api client + window registration - tests: contracts/server/ui unit + integration coverage and an e2e browser smoke for catalog discovery, run, and structured result - docs: command-runner security notes Refs #1387 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Addresses review findings on PR #1546 (security audit + verifier + PR review): - contracts: cap taskId (<=256) and requestId (<=128, token charset) at the parse boundary so an oversized/non-token id cannot reach the manager, audit ledger, or SSE fan-out (the 16 KB body cap was the only prior backstop) + tests - server: document the per-run catalog re-discovery as an intentional untrusted-taskId re-validation; add a test proving a throwing SSE subscriber does not block fan-out to other subscribers - routes: assert Layer-2 redaction is applied to every SSE event frame - ui: render the project path as the Tasks window subtitle (connectionUtils) - e2e: deterministically mock the SSE channel and assert a run lifecycle event reaches the bounded events log - docs: correct the audit-evidence note (the standard manifest workspaceRoot path is retained; argv values, output bytes, and secrets are excluded) Refs #1387 Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

oscharko merged commit 8be49af into feat/keiko-agent-native-editor-foundation-and-runtime Jun 26, 2026
12 checks passed

oscharko deleted the claude/issue-1387-command-executor branch June 26, 2026 09:33

oscharko mentioned this pull request Jun 26, 2026

fix(runtime): harden command runner per review (#1387) #1548

Merged

2 tasks

oscharko mentioned this pull request Jun 26, 2026

Controlled test, build, and run command executor #1387

Closed

73 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(runtime): controlled test/build/run command executor (#1387)#1546

feat(runtime): controlled test/build/run command executor (#1387)#1546
oscharko merged 1 commit into
feat/keiko-agent-native-editor-foundation-and-runtimefrom
claude/issue-1387-command-executor

oscharko commented Jun 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

oscharko commented Jun 26, 2026

Summary

Scope

Reuse And No-Duplication

Acceptance criteria mapping

Deliverables

Verification

Review And Closure

Risk Notes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant