v0.9.0: Always-on stateful agent identity across workrooms, repos, and fleet runs #3210

@Hmbown

Problem

CodeWhale's agent state is still tied too closely to a single turn, TUI session, or worker run. The target product needs an always-on, stateful agent that lives across workrooms, repos, GitHub issues, mobile/chat surfaces, and Fleet/WhaleFlow runs.

This is related to Fleet, but not identical. Fleet gives durable worker execution. The always-on agent identity is the durable operator/persona that watches work, wakes up on events, remembers commitments, routes tasks to workers, and can be resumed or steered from any authorized surface.

Product direction

An always-on CodeWhale agent should be able to:

  • Own an inbox across workrooms, mentions, approvals, failures, and GitHub events.
  • Remember active commitments, watched repos/branches, blocked decisions, and follow-up timers.
  • Wake on durable events rather than relying on a user keeping a TUI turn alive.
  • Delegate work to Fleet/WhaleFlow workers with explicit role/model/tool policies.
  • Be steered from TUI, Runtime API, mobile, or a chat bridge without losing state.
  • Explain what it is watching, what it is doing, what is blocked, and what needs human approval.
  • Survive process restart, laptop sleep, host reboot, or chat bridge reconnect.

Proposed abstraction

Add a typed AgentProfile / AgentIdentity layer:

  • Stable id, display name, persona, owner, allowed workspaces/repos, trust level, and auth boundaries.
  • Provider/model policy: explicit route, inherit, or the route-effective auto policy from #3205 (v0.8.61: Add route-effective model inventory and auto fleet model selector).
  • Tool/fleet role policy: scout, implementer, reviewer, verifier, operator, summarizer.
  • Subscriptions: workroom mentions, GitHub issues/PRs/checks, fleet run barriers, failed receipts, timers, approvals, and manually pinned threads.
  • Durable state: active commitments, next checks, blockers, watched refs, recent receipts, memory refs, and compact summary.
  • Lifecycle: active, sleeping, paused, needs-human, degraded, failed, disabled.
  • Controls: pause, resume, sleep, wake now, handoff, clear subscription, export state.
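To make the shape of the proposed layer concrete, here is a minimal sketch of the profile as typed data. It is an illustration only, not CodeWhale's actual schema: the class names `Lifecycle`, `FleetRole`, `ModelPolicy`, the field names, the `"restricted"` trust level, and the `can_touch` helper are all assumptions chosen to mirror the bullets above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Lifecycle(Enum):
    # Lifecycle states listed in the proposal.
    ACTIVE = "active"
    SLEEPING = "sleeping"
    PAUSED = "paused"
    NEEDS_HUMAN = "needs-human"
    DEGRADED = "degraded"
    FAILED = "failed"
    DISABLED = "disabled"


class FleetRole(Enum):
    # Tool/fleet roles listed in the proposal.
    SCOUT = "scout"
    IMPLEMENTER = "implementer"
    REVIEWER = "reviewer"
    VERIFIER = "verifier"
    OPERATOR = "operator"
    SUMMARIZER = "summarizer"


@dataclass(frozen=True)
class ModelPolicy:
    # mode is "explicit", "inherit", or "auto" (hypothetical spelling).
    mode: str = "inherit"
    route: Optional[str] = None

    def __post_init__(self) -> None:
        if self.mode not in {"explicit", "inherit", "auto"}:
            raise ValueError(f"unknown model policy mode: {self.mode}")
        # Only an explicit policy names a route; the others must not.
        if (self.mode == "explicit") != (self.route is not None):
            raise ValueError("route is required for, and only for, 'explicit'")


@dataclass
class AgentProfile:
    id: str
    display_name: str
    owner: str
    persona: str = ""
    allowed_repos: frozenset = frozenset()
    trust_level: str = "restricted"  # hypothetical default
    model_policy: ModelPolicy = ModelPolicy()
    roles: frozenset = frozenset({FleetRole.OPERATOR})
    subscriptions: List[str] = field(default_factory=list)
    lifecycle: Lifecycle = Lifecycle.SLEEPING

    def can_touch(self, repo: str) -> bool:
        # Auth boundary check: the agent may only act on allowed repos.
        return repo in self.allowed_repos
```

Keeping the policy and lifecycle as closed enumerations means an unknown state fails at load time instead of silently reaching a scheduler.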

Runtime requirements

  • State is persisted outside a model context window and has schema versions/migrations.
  • The agent reads compact state summaries and artifact refs, not raw full transcripts by default.
  • Event-driven wakeups are debounced, idempotent, and replay-safe after restart.
  • Wakeups respect approvals, sandbox/trust level, provider budgets, and route health.
  • Agent identity and worker execution remain separate: the always-on agent may schedule fleet workers, but each worker still has its own receipt and artifact boundary.
  • No secrets are stored in agent state; secrets are referenced through configured secret refs or existing auth stores.
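The persistence and replay-safety requirements above can be sketched with a small SQLite-backed wakeup store. This is one possible approach under stated assumptions, not the project's design: the table layout, the `event_key` idempotency key, the `payload_ref` artifact reference, and the function names `open_store`, `enqueue`, and `consume_next` are all hypothetical.

```python
import sqlite3

SCHEMA_VERSION = 1  # hypothetical; bumped whenever the layout changes


def open_store(path: str) -> sqlite3.Connection:
    """Open the store and apply any pending schema migration."""
    db = sqlite3.connect(path)
    version = db.execute("PRAGMA user_version").fetchone()[0]
    if version < 1:
        db.execute(
            "CREATE TABLE IF NOT EXISTS wakeups ("
            " event_key TEXT PRIMARY KEY,"   # idempotency key per event
            " agent_id TEXT NOT NULL,"
            " payload_ref TEXT NOT NULL,"    # a reference, never a transcript or secret
            " consumed INTEGER NOT NULL DEFAULT 0)"
        )
        db.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
        db.commit()
    return db


def enqueue(db: sqlite3.Connection, agent_id: str, event_key: str, payload_ref: str) -> bool:
    """Persist a wakeup; a replayed event with the same key is ignored."""
    cur = db.execute(
        "INSERT OR IGNORE INTO wakeups (event_key, agent_id, payload_ref)"
        " VALUES (?, ?, ?)",
        (event_key, agent_id, payload_ref),
    )
    db.commit()
    return cur.rowcount == 1  # True only when the row was newly inserted


def consume_next(db: sqlite3.Connection, agent_id: str):
    """Return the oldest unconsumed wakeup and mark it consumed, or None."""
    with db:  # commits on success, rolls back on error
        row = db.execute(
            "SELECT event_key, payload_ref FROM wakeups"
            " WHERE agent_id = ? AND consumed = 0 ORDER BY rowid LIMIT 1",
            (agent_id,),
        ).fetchone()
        if row is None:
            return None
        db.execute("UPDATE wakeups SET consumed = 1 WHERE event_key = ?", (row[0],))
        return row
```

Because the wakeup is marked consumed before any handler runs, this sketch gives at-most-once delivery on a single connection. A real runtime would likely also need the handler itself to be idempotent, plus locking if several processes share the store.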

Acceptance criteria

  • A design doc/RFC defines AgentProfile, AgentState, subscriptions, lifecycle states, and how they connect to workrooms and Fleet.
  • Runtime API can list agent identities, read bounded state, pause/resume an agent, and trigger a manual wakeup.
  • CLI/TUI exposes at least a minimal status surface: active agents, what each is watching, current state, and blocked/needs-human count.
  • A CI-safe test proves an agent subscription wakeup is persisted, consumed once, and not duplicated after manager/runtime restart.
  • A smoke test proves a paused agent does not schedule new workers, then resumes and processes a queued wakeup exactly once.
  • A fleet-backed task can report a receipt to the owning always-on agent, and the agent state records the receipt ref rather than copying the full transcript.
  • Docs explain how this differs from normal sub-agents, Fleet workers, and one-off codewhale exec runs.
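As an illustration of the pause/resume criterion above, the following is a toy in-memory model and a smoke test against it. It assumes nothing about the real runtime: `SketchAgent`, its methods, and the event key `"check-failed:42"` are invented for this sketch, and "scheduling a worker" is reduced to appending to a list.

```python
from collections import deque


class SketchAgent:
    """Toy stand-in for an always-on agent (hypothetical, in-memory only)."""

    def __init__(self) -> None:
        self.paused = False
        self.queue = deque()   # wakeups waiting to be processed
        self.seen = set()      # event keys already accepted (deduplication)
        self.scheduled = []    # stands in for workers handed to Fleet

    def wake(self, event_key: str) -> None:
        # Duplicate deliveries of the same event are dropped.
        if event_key in self.seen:
            return
        self.seen.add(event_key)
        self.queue.append(event_key)
        self._drain()

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False
        self._drain()

    def _drain(self) -> None:
        # A paused agent keeps wakeups queued and schedules nothing.
        while self.queue and not self.paused:
            self.scheduled.append(self.queue.popleft())


def smoke_test_pause_resume() -> bool:
    agent = SketchAgent()
    agent.pause()
    agent.wake("check-failed:42")
    agent.wake("check-failed:42")  # duplicate delivery while paused
    assert agent.scheduled == []   # no workers scheduled while paused
    agent.resume()
    assert agent.scheduled == ["check-failed:42"]  # processed once
    agent.resume()                 # a second resume must not replay it
    assert agent.scheduled == ["check-failed:42"]
    return True
```

A real smoke test would exercise the persisted queue and the Runtime API rather than an in-memory object, but the assertions it needs to make have the same shape.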

Non-goals for the first version

  • No autonomous destructive actions without approval.
  • No hidden hosted memory service.
  • No always-on behavior enabled by default for every install.
  • No model-weight training loop.

Labels

  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • external-memory: External memory, context substrate, and long-running agent state
  • pod-workflows: Pod-style background workflow monitoring and grouped agent orchestration
  • reliability: Reliability, flaky behavior, retries, fallbacks, and robustness
  • subagents: Sub-agent orchestration, lifecycle, and completion handling
  • ux: User experience, interaction, or presentation polish
  • v0.9.0: Targeting v0.9.0
  • workflow-runtime: Workflow IR, executor, control flow, and replay runtime
