v0.8.61: Default independent shell and verifier work to background jobs #3212

## Problem

The runtime still runs too much shell, verifier, and worker work in the foreground. Agents start checks, searches, builds, verifier gates, or child workers and then wait, even when the result is not an immediate dependency. The main turn feels stuck, and useful parallel progress is lost.

This is the main UX pain point: "allowed shell" is not enough if the agent starts allowed work and then blocks its entire turn waiting for it.

Current CodeWhale already has important substrate:

- `exec_shell` supports `background = true` and returns a task id.
- `task_shell_start` wraps shell work as a background task.
- `exec_shell_wait` and `task_shell_wait` can inspect a background task.
- Turn cancellation during `exec_shell_wait` can leave the background process running.
- Tool-batch tests already prove background shell/verifier starts can join parallel read-only work.

The missing slice is making this the default path the model actually chooses, and making waits nonblocking unless the model explicitly asks for a barrier.

## Target Design

Independent shell and verifier work should default to background execution with visible job tracking and automatic completion notification. Foreground execution should be reserved for short commands whose result is required before the next decision.
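
As a rough illustration of that policy, the sketch below classifies a request as foreground or background. The `ShellRequest` fields, the `schedule` function, and the 10-second threshold are all hypothetical and are not part of CodeWhale; they only make the rule concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    FOREGROUND = "foreground"
    BACKGROUND = "background"


@dataclass(frozen=True)
class ShellRequest:
    command: str
    result_needed_now: bool   # is the result required before the next decision?
    expected_seconds: float   # rough duration estimate (hypothetical input)


# Hypothetical cutoff for "short"; a real runtime would tune or configure this.
LONG_RUNNING_SECONDS = 10.0


def schedule(req: ShellRequest) -> Mode:
    """Foreground only for short, dependent commands; background otherwise."""
    if req.result_needed_now and req.expected_seconds <= LONG_RUNNING_SECONDS:
        return Mode.FOREGROUND
    return Mode.BACKGROUND


quick = schedule(ShellRequest("git status", True, 1.0))
independent = schedule(ShellRequest("cargo test", False, 120.0))
long_dependent = schedule(ShellRequest("cargo build", True, 300.0))
```

Note that a long command goes to the background even when its result is needed; the caller would then use an explicit blocking wait at the point of dependency.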

Rules of thumb for the model/runtime:

- Start long or independent commands in the background immediately.
- Keep inspecting, editing, or coordinating while they run.
- Let completion arrive automatically into the transcript/status stream.
- Poll only when current progress is genuinely needed before continuing.
- Use a deliberate blocking wait only at final verification gates or true dependencies.
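
The rules above can be sketched as a small job registry: starting a job returns a task id immediately, the caller keeps working, and completion arrives as an event rather than through polling. `JobRegistry`, `StartResult`, and `Completion` are hypothetical names for illustration, not the actual runtime types; the hint string is the wording proposed for tool descriptions in this issue.

```python
import queue
import subprocess
import sys
import threading
from dataclasses import dataclass

# Wording taken from the proposed tool description; the field names are assumptions.
BACKGROUND_HINT = (
    "returns immediately; you will be notified when done; "
    "do not poll/wait unless you need early output"
)


@dataclass
class StartResult:
    task_id: str
    hint: str


@dataclass
class Completion:
    task_id: str
    exit_code: int
    output: str


class JobRegistry:
    """Hypothetical in-memory job tracker, not CodeWhale's real implementation."""

    def __init__(self):
        self.events = queue.Queue()  # completion events, one per finished job
        self._next_id = 0

    def start(self, argv):
        """Launch argv in the background and return without waiting for it."""
        self._next_id += 1
        task_id = f"job-{self._next_id}"
        proc = subprocess.Popen(
            argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )

        def watch():
            # Runs on a helper thread; posts the completion event when the job ends.
            output, _ = proc.communicate()
            self.events.put(Completion(task_id, proc.returncode, output))

        threading.Thread(target=watch, daemon=True).start()
        return StartResult(task_id, BACKGROUND_HINT)


registry = JobRegistry()
started = registry.start([sys.executable, "-c", "print('verifier ok')"])

# The agent keeps doing unrelated work while the job runs.
notes = [f"inspected file {n}" for n in range(3)]

# Deliberate barrier, used here only because the result is now needed.
done = registry.events.get(timeout=30)
```

In a real runtime the completion event would be pushed into the transcript/status stream; the final `events.get` stands in for a verification gate.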

This should apply to:

- `exec_shell` / `task_shell_start`
- `run_verifiers` and cargo-style gates
- Fleet workers and the user-facing "sub-agent" surface once it is backed by Fleet
- structured user questions / steering where the parent can keep working safely

## Fit With Current CodeWhale

- Change model-visible tool descriptions/results so background tasks explicitly say: "returns immediately; you will be notified when done; do not poll/wait unless you need early output."
- Change wait tool defaults toward snapshot/nonblocking behavior. Today `exec_shell_wait` defaults to `wait = true`; make the safer default `wait = false`, or add a canonical nonblocking read path and teach the model to use it.
- Make `task_shell_wait` and `exec_shell_wait` visibly different from `join`: they inspect progress by default; blocking requires an explicit flag such as `block = true` / `wait = true`.
- When a foreground command times out or looks long-running, the recovery hint should prefer "restart in background and continue other work," not merely "rerun with a longer timeout."
- Do not let `agent_eval block:true`, foreground Agent/Fleet workers, or foreground verifier gates become the default for independent evidence collection.
- Preserve existing user controls: `/jobs`, cancel, wait, output tail, and foreground-to-background detach.
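
A minimal sketch of the proposed wait semantics, assuming a hypothetical `BackgroundTask` wrapper: `wait()` returns a snapshot by default and blocks only when `block=True` is passed. The class, its method names, and the snapshot keys are illustrative assumptions, not the existing `exec_shell_wait` / `task_shell_wait` API.

```python
import os
import subprocess
import sys
import tempfile


class BackgroundTask:
    """Hypothetical wrapper around one background process and its log file."""

    def __init__(self, argv):
        fd, self._path = tempfile.mkstemp(suffix=".log")
        os.close(fd)
        # Append mode keeps the child's writes at the end of the file, and the
        # reader below opens its own handle so offsets are never shared.
        self._sink = open(self._path, "ab")
        self._proc = subprocess.Popen(
            argv, stdout=self._sink, stderr=subprocess.STDOUT
        )

    def wait(self, block=False, timeout=None):
        """Snapshot by default; block only when the caller asks for a barrier."""
        if block:
            self._proc.wait(timeout=timeout)
        exit_code = self._proc.poll()  # None while the process is still running
        with open(self._path, "r", errors="replace") as log:
            tail = log.read()[-2000:]
        return {
            "running": exit_code is None,
            "exit_code": exit_code,
            "output_tail": tail,
        }

    def close(self):
        """Release the log file; call only after the process has finished."""
        self._sink.close()
        os.remove(self._path)


task = BackgroundTask(
    [sys.executable, "-c", "import time; time.sleep(1); print('build finished')"]
)
snapshot = task.wait()                      # inspects progress, returns at once
final = task.wait(block=True, timeout=30)   # explicit barrier
task.close()
```

Keeping the blocking path behind an explicit flag makes an accidental stall impossible: a caller that forgets the flag gets a progress snapshot instead of a hung turn.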

## Acceptance Criteria

- Shell execution has a clear scheduling policy: foreground only when dependent, background for long-running or independent work.
- The model/tool descriptions nudge toward background execution for builds, tests, verifiers, servers, broad searches, polling, sleeps, and long diagnostics.
- Starting a background shell/verifier returns a task id immediately and includes metadata stating that a completion notification will arrive automatically.
- Parallel read-only shell commands can run concurrently when safe.
- `run_verifiers` and cargo-style gates can start while the agent continues inspecting files or implementing unrelated follow-up work.
- A cancelled/interrupted wait does not kill the underlying background job unless explicitly requested.
- Background task completion produces a transcript/status event the agent can consume without manual polling.
- Tests cover background-start hints, wait default behavior, wait/cancel behavior, automatic-completion metadata, and at least one parallel read-only shell scenario.
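
One way the wait/cancel criterion could be exercised is sketched below. A timed-out wait stands in for a cancelled or interrupted one; the helper `wait_for` is hypothetical, and the real tests would go through the runtime's own tools rather than `subprocess` directly.

```python
import subprocess
import sys


def wait_for(proc, timeout):
    """Blocking wait that gives up after `timeout` seconds and never kills the job."""
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return None  # the wait was abandoned; the job itself keeps running


# A long-running stand-in for a build or verifier gate.
job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

interrupted = wait_for(job, timeout=0.2)   # simulated cancelled wait
still_running = job.poll() is None         # the job must have survived it

job.terminate()                            # explicit cancel, requested separately
exit_code = job.wait(timeout=10)
```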

## Related

- #3211 permission profiles and execution defaults
- #3213 model-facing runtime capability prompt
- #3146 activity metadata rows
- #2982 busy/free display
- #3096 headless worker runtime
- #3154 Agent Fleet control plane

