feat(capsule): add agent runtime status authority by donbeave · Pull Request #532 · jackin-project/jackin

donbeave · 2026-06-04T17:54:12Z

Summary

This PR ships the in-container agent runtime status authority for jackin-capsule: agent panes now derive working, blocked, done, idle, and unknown from semantic hooks, foreground process ownership, OSC 133 shell markers, visible-screen detectors, stuck probes, and token activity instead of relying on the old silence timer. Operators get clearer capsule status badges, done markers, event streaming, wait/read/token control-channel queries, and token usage rollups across Claude, Codex, Amp, Kimi, and OpenCode; contributors get the shared protocol and roadmap docs that describe the new authority surface.

What's deferred (follow-up PRs)

Anthropic OAuth usage-window polling through ANTHROPIC_OAUTH_TOKEN remains out of scope for this PR.
Downstream attention prompts consume this status authority but are still tracked separately in the agent attention prompts roadmap item.

Verify locally

Checkout

Paste this first to bypass the tirith paste scanner for the rest of the session:

export TIRITH=0

Then paste the checkout block:

export JACKIN_PR_TEST_DIR="$HOME/Projects/jackin-project/test/pr-532"
mkdir -p "$JACKIN_PR_TEST_DIR"
cd "$JACKIN_PR_TEST_DIR"

if [ ! -d jackin/.git ]; then
  git clone https://github.com/jackin-project/jackin.git
fi

cd jackin
mise trust
git fetch -f origin feature/agent-runtime-status:refs/remotes/origin/feature/agent-runtime-status
git checkout -B feature/agent-runtime-status refs/remotes/origin/feature/agent-runtime-status
mise trust
mise install
cargo build --bin jackin
export PATH="$PWD/target/debug:$PATH"
export JACKIN_CONFIG_DIR="$JACKIN_PR_TEST_DIR/.config/jackin"
export JACKIN_HOME_DIR="$JACKIN_PR_TEST_DIR/.jackin"
which jackin

Then build and export the jackin-capsule binary so the smoke steps below use it:

eval "$(cargo run --bin build-jackin-capsule -- --export)"

Static checks

cargo fmt --check
cargo clippy --all-targets --all-features -- -D warnings

Rust tests

cargo nextest run -p jackin-protocol -p jackin-capsule
cargo nextest run --all-features

The focused run covers the shared control protocol, status arbitration, hook installers, runtime detectors, token monitors, socket handling, and capsule TUI state touched by this PR.

Docs checks

(
  cd docs
  bun install --frozen-lockfile
  bun run build
  bun run check:repo-links
  bunx tsc --noEmit
  bun test
)

User smoke

jackin console --debug

From the console, launch an agent workspace using the PR checkout, open at least one agent tab, and verify the capsule status area changes as the agent starts working, blocks for input, and returns to idle or done. The status should update without waiting for the removed silence timer, and the debug log should show status authority events rather than only raw output inactivity.

jackin-capsule smoke

jackin load the-architect . --debug

Inside the container, verify:

Row 0 status bar is visible: jackin' [<agent-name>]
Agent TUI starts and renders correctly below the status bar
Ctrl+\ opens the command palette (override with JACKIN_PALETTE_KEY)
Mouse clicks, arrow keys, and paste reach the agent unmodified
Status badges reflect runtime state changes from the new authority: working activity, blocked prompts, done markers, idle returns, and unknown/stuck cases where applicable
jackin-capsule status, jackin-capsule snapshot, and event/wait/token control paths continue to respond while the daemon is running

Documentation

Start the docs site locally:

(
  cd docs
  bun run dev -- --host 127.0.0.1
)

http://localhost:3000/reference/roadmap/agent-runtime-status/
Confirm the roadmap item says the status authority is fully implemented and its remaining follow-up scope matches this PR's deferred list.

http://localhost:3000/reference/roadmap/
Confirm the roadmap overview places the agent runtime status authority in the correct status section after this implementation.

Migration notes

None. This PR changes capsule/runtime protocol behavior and docs, but it does not bump the versioned config, workspace, or role manifest schemas.

…detectors

Replace the timer-based "silence means blocked" heuristic with a layered status authority. Foundation slice: Phase 0 (silence-bug removal), Phase 1 (state model + status machine), and the Phase 4 screen-detector framework.

Phase 0 - stop lying with silence:
- Remove BLOCKED_AFTER, state_after_refresh, state_after_pty_output.
- refresh_state() is now a no-op; PTY output no longer flips state on a timer.
- Add AgentState::Unknown to the protocol (serializes as "unknown").

Phase 1 - state model:
- New agent_status module: AgentRawState (10-variant input-signal enum), SessionStatus machine with advance()/acknowledge(), seen-derived Done, and a monotonic revision counter.
- Session owns one status: SessionStatus; the old state field is gone, replaced by Session::state(). mark_operator_input and acknowledge route through the machine. PTY output only stamps last_output_at now.

Phase 4 - screen-detector framework:
- Detector trait + DetectorRegistry with detectors for all five built-in runtimes (claude, codex, amp, kimi, opencode), each matching the bottom DETECTION_ROWS = 24 of the vt100::Screen and returning Option<AgentRawState>.
- Wired into the 1Hz ticker via Multiplexer::refresh_session_statuses(), which also acknowledges the focused pane so its Done clears to Idle.

TUI:
- Tab strip renders Unknown as a mid-gray dot; roll-up priority Blocked > Done > Unknown > (Working/Idle).

Docs:
- Update the roadmap item to Partially implemented with an "Implemented Shape (As Shipped)" section documenting the three deliberate divergences from the original proposal (kept AgentState+Unknown; SessionStatus machine instead of arbitrate_session_status; AgentRawState as input-signal enum).
- Mark Phase 0/1 shipped and Phase 4 partial. Move the overview entry to Partially implemented.

Remaining: Phase 2 (reporter + /proc), Phase 3 (hooks), rest of Phase 4 (OSC 133, cursor probes, stuck detection, fixtures), Phase 5 (subscription protocol + workspace roll-up event), Phase 6 (token/quota monitor + dialog).

Tests: 311 lib tests pass; clippy + rustfmt clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
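The Phase 1 state model above (advance()/acknowledge(), a seen-derived Done, and a monotonic revision counter) can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: RawSignal is a hypothetical three-variant stand-in for the 10-variant AgentRawState, and the rule for when `seen` is reset is an assumption.

```rust
// Hypothetical sketch: type names echo the commit message, bodies are assumptions.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState { Working, Blocked, Done, Idle, Unknown }

/// Reduced stand-in for the 10-variant input-signal enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RawSignal { WorkingVisible, BlockedVisible, PromptVisible }

#[derive(Debug)]
pub struct SessionStatus {
    state: AgentState, // stored state; Done is never stored, only derived
    seen: bool,        // has the operator looked at the pane since work began?
    revision: u64,     // bumped on every effective-state change
}

impl SessionStatus {
    pub fn new() -> Self {
        Self { state: AgentState::Unknown, seen: true, revision: 0 }
    }

    /// Effective state: an Idle pane with unseen work is reported as Done.
    pub fn state(&self) -> AgentState {
        match (self.state, self.seen) {
            (AgentState::Idle, false) => AgentState::Done,
            (s, _) => s,
        }
    }

    pub fn revision(&self) -> u64 { self.revision }

    /// Feed one input signal; returns the new effective state if it changed.
    pub fn advance(&mut self, signal: RawSignal) -> Option<AgentState> {
        let before = self.state();
        let next = match signal {
            RawSignal::WorkingVisible => AgentState::Working,
            RawSignal::BlockedVisible => AgentState::Blocked,
            RawSignal::PromptVisible => AgentState::Idle,
        };
        // Assumed rule: starting work from a resting state marks it unseen.
        let starting_work = matches!(next, AgentState::Working | AgentState::Blocked)
            && matches!(self.state, AgentState::Idle | AgentState::Unknown);
        if starting_work {
            self.seen = false;
        }
        self.state = next;
        self.finish(before)
    }

    /// Operator focused the pane: Done clears to Idle.
    pub fn acknowledge(&mut self) -> Option<AgentState> {
        let before = self.state();
        self.seen = true;
        self.finish(before)
    }

    fn finish(&mut self, before: AgentState) -> Option<AgentState> {
        let after = self.state();
        if after == before {
            return None;
        }
        self.revision += 1;
        Some(after)
    }
}
```

Deriving Done from Idle plus an unseen flag keeps the stored state small and lets a focus event clear Done without touching any detector input.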

…e-status

Document how this feature (developed on main) must be shaped to match the parallel TUI Architecture refactor (feature/tui-architecture) so the eventual merge is mechanical.

Six alignment rules:
- keep derivation/token logic in pure agent_status/ + token_monitor/ modules
- AgentState stays the single wire enum with Unknown added (refactor gains the VisibleAgentState arm on merge)
- tab glyphs in one cohesive match that relocates to tui/components/status_bar.rs
- daemon additions cohesive and ticker-local for clean relocation into the daemon/ submodules
- Session::state() accessor for a find-replace field swap
- Phase 6 token dialog built data-first so only a thin render shim is Ratatui-rewritten on merge

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

Bake the verified TUI-refactor findings into the roadmap item so any future implementer has the concrete merge facts without re-inspecting the branch:
- exact file moves (statusbar.rs deleted → tui/components/status_bar.rs; daemon.rs split into daemon/ submodules)
- the VisibleAgentState bridge enum and visible_agent_state_from_protocol mapping
- what the refactor did NOT change (4-variant AgentState, silence-timer still present)
- a per-file conflict/resolution table for all eight touched code paths plus the two conflict-free new module trees

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

…nitor

Ships the remaining implementation of the agent runtime status authority roadmap item (Phases 2, 3, 4, and the Phase 6 foundation).

Phase 2 - Container Reporter + Process Identity:
- Add `procfs = "0.18"` dependency; rewrite `process.rs` to use the `procfs` crate for `/proc`-based foreground-agent detection
- `docker/runtime/agent-status/report.sh` and `heartbeat.sh`: reporter CLI and heartbeat scripts that send length-prefixed JSON to the Capsule socket from inside the container
- `JACKIN_SESSION_ID`, `JACKIN_STATUS_SOCKET`, `JACKIN_STATUS_SOURCE`, `JACKIN_AGENT_RUNTIME` injected at session spawn (`build_agent_command` stamps `JACKIN_SESSION_ID` on the `CommandBuilder` using the generated sid)
- `ReportAgentState`, `HeartbeatAgentAuthority`, `ClearAgentAuthority` control messages added to the protocol and routed to `handle_control_msg` on the `Multiplexer`; sequence validation via `SequenceTracker` prevents stale/replayed authority
- `HookAuthority` struct in `agent_status::mod`; `Session` gains `hook_authority`, `sequence`, `child_pid`, `subagent_count` fields

Phase 3 - Semantic Runtime Reports:
- `ClaudeHookInstaller` writes and drift-repairs `~/.claude/settings.json` with all hook entries; `Stop` and `PermissionRequest` registered as `async: false`; atomic tmp-file + rename writes; three unit tests
- Stub installers for `Kimi`, `Amp`, `Codex`, `OpenCode`; `runtime_setup.rs` calls the correct installer per agent at every session launch
- Claude hook script (`docker/runtime/agent-status/hooks/claude/report-hook.sh`) maps every hook event to a raw state; `SubagentStart`/`SubagentStop` pass `--message <event>` so the daemon updates the subagent counter
- Subagent sticky-blocked rule: `WorkingVisible` suppressed when `subagent_count > 0` and session is `Blocked`
- Stub hook scripts for Kimi, Codex, and OpenCode

Phase 4 - Screen + Shell Integration:
- `OscCapture` extended with `OscShellMark` enum; `OSC 133 A/B/C/D` captured and drained each 1Hz tick → feeds `PromptVisible` / `Osc133PreExec` into the status machine
- Shell integration `.zshrc` block installed by `runtime_setup.rs`; OSC 133 + OSC 7 `precmd`/`preexec` hooks from `install_shell_integration()`
- `TabGlyph::Working` (◌ dim phosphor-green) and `TabGlyph::Stuck` (! amber) added to `statusbar.rs`; roll-up priority updated to include `Working` between `Done` and `Unknown`
- Screen fixture files for all five runtimes under `crates/jackin-capsule/src/agent_status/screen/fixtures/` (working / blocked / idle / false_positive per agent)

Phase 6 foundation - Token Monitor:
- `token_monitor/` module: `TokenMonitor`, `TokenSession`, `TokenTotals`
- Claude JSONL reader with incremental byte-offset reads, `costUSD` preference, sidechain skip, and three unit tests
- Static pricing table (`pricing.rs`) covering Claude, OpenAI, Kimi/MoonShot with tiered calculation; three unit tests
- Stub readers for Codex, Kimi, Amp, OpenCode; embedded model catalog
- `TokenMonitor` registered/deregistered alongside sessions; 30-second polling ticker in the daemon event loop; `SessionInfo.token_usage` now populated from monitor totals

Protocol additions:
- `ClientMsg`: `ReportAgentState`, `HeartbeatAgentAuthority`, `ClearAgentAuthority`, `EventsSubscribe`, `WaitSessionStatus`, `SessionReadVisible`
- `ServerMsg`: `AgentStateChanged`, `SessionStatusResult`, `SessionVisibleText`, `Welcome`, `Error`
- `TokenUsageSummary` struct; `SessionInfo.token_usage` field
- `WaitSessionStatus` handled synchronously against snapshot in `socket::handle_control_request`; full push-subscription deferred to Phase 5

Roadmap updates:
- Phases 0–4 marked ✅ Shipped; Phase 5 and Phase 6 marked partial
- `roadmap/index.mdx` updated to reflect current coverage
- `<RepoFile />` links for new docker runtime assets

329 capsule lib tests + 10 protocol tests pass; pre-existing `git_prepare_commit_msg` integration test failures are unrelated.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
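The reporter path in this commit (length-prefixed JSON frames plus sequence validation against stale or replayed authority) might look something like the sketch below. The 4-byte big-endian length prefix and the "strictly increasing per session" rule are assumptions made for illustration; the commit message does not specify either, and this SequenceTracker is a hypothetical stand-in for the real one.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Write};

/// Write one frame: a 4-byte big-endian length, then the payload.
/// (The prefix width and byte order are assumed, not taken from the PR.)
pub fn write_frame<W: Write>(mut w: W, payload: &[u8]) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "payload too large"))?;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame written by `write_frame`.
/// A production reader would also cap the accepted length.
pub fn read_frame<R: Read>(mut r: R) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let mut buf = vec![0u8; u32::from_be_bytes(len) as usize];
    r.read_exact(&mut buf)?;
    Ok(buf)
}

/// Hypothetical tracker: accepts a report only if its sequence number is
/// strictly greater than the last one accepted for the same session.
#[derive(Default)]
pub struct SequenceTracker {
    last: HashMap<String, u64>,
}

impl SequenceTracker {
    pub fn accept(&mut self, session_id: &str, seq: u64) -> bool {
        match self.last.get(session_id) {
            Some(&prev) if seq <= prev => false, // stale or replayed
            _ => {
                self.last.insert(session_id.to_owned(), seq);
                true
            }
        }
    }
}
```

Rejecting any sequence number at or below the last accepted one is the simplest way to make a replayed hook report harmless, at the cost of requiring the reporter to keep a counter.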

Phase 5 - Host Consumer Integration:
- Console session list state badges color-coded by AgentState: Blocked = bright red, Done = phosphor green, Working = dim green, Unknown = mid-gray, Idle = quiet dim
- Done panes render with ◎ marker (vs ○ for normal, ● for focused) so finished-but-unseen work is immediately visible
- Daemon already acknowledges focused pane on each 1Hz tick (Done → Idle)

Phase 6 - Full Token Monitoring:
- Codex JSONL reader: session-format (event_msg/token_count cumulative-delta calculation) + headless format (direct usage fields); 2 unit tests
- Kimi wire.jsonl reader: parses StatusUpdate token_usage fields (input_other, output, input_cache_read, input_cache_creation)
- Amp thread JSON reader: array and wrapper-object shapes; 2 unit tests
- OpenCode SQLite reader: rusqlite crate, rowid-incremental queries, read-only connection flags, legacy schema stub
- ModelCatalog::populate() with live API queries via ureq: Anthropic /v1/models, OpenAI /v1/models, MoonShot /v1/models; embedded fallback when API key absent or request fails
- RateWindow + ProviderUsageSnapshot structs; TokenMonitor.token_snapshots updated on each poll cycle
- Dialog::TokenUsage carries TokenUsageSummary; render_token_usage shows model name, input/output/cache token counts (formatted as 14.2k / 1.4M), and cost estimate
- Ctrl+u (0x15) opens token usage dialog from any pane
- status bar token indicator: compact "89% (5h)" format at end of row-1, amber below 20%, red below 10% (wired into StatusBar::render())
- TokenGetSession + TokenGetModels control messages in protocol; handled in one-shot socket handler
- ModelCatalog registered at session start via register_session()
- Dependencies: rusqlite = "0.32" (bundled), ureq = "2.12" (tls)

337 capsule lib tests + 10 protocol tests pass.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
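The compact token formatting ("14.2k / 1.4M") and the amber/red thresholds mentioned for the status bar indicator could be implemented along these lines. The function and enum names are hypothetical, and the sketch assumes the displayed percentage is the remaining share of the usage window.

```rust
/// Format a token count compactly, e.g. 14_200 -> "14.2k".
/// (Hypothetical helper; the PR's render_token_usage may differ.)
pub fn format_tokens(n: u64) -> String {
    if n >= 1_000_000 {
        format!("{:.1}M", n as f64 / 1_000_000.0)
    } else if n >= 1_000 {
        format!("{:.1}k", n as f64 / 1_000.0)
    } else {
        n.to_string()
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QuotaTone { Normal, Amber, Red }

/// Pick the indicator tone from the remaining percentage:
/// red below 10%, amber below 20%, normal otherwise.
pub fn quota_tone(remaining_percent: u8) -> QuotaTone {
    if remaining_percent < 10 {
        QuotaTone::Red
    } else if remaining_percent < 20 {
        QuotaTone::Amber
    } else {
        QuotaTone::Normal
    }
}
```

Checking the red threshold first keeps the two "below" rules from overlapping: a value of 9 is red, 10 through 19 is amber.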

…OpenCode ACP

events.subscribe streaming push protocol:
- Add `state_broadcast_tx: broadcast::Sender<ServerMsg>` to Multiplexer
- Broadcast `AgentStateChanged` on every session state change in `refresh_session_statuses`, OSC 133 handler, and `handle_control_msg`
- Broadcast spawn event when sessions are added to the tab tree
- `EventsSubscribe` handling in `socket::handle_control_request`: sends a `Welcome` frame then enters a streaming loop pushing broadcast events until the connection drops; handles `Lagged` errors
- Pass `state_broadcast_tx` through `perform_handshake` to the handler

CSI 6n cursor-position probe for stuck detection:
- `Session::probe_cursor_position()` sends `\x1b[6n` to the PTY
- `Session::check_for_cpr_response()` scans PTY output for `\x1b[<row>;<col>R` pattern; sets `cursor_probe_responded` flag
- `Session` gains: `cursor_probe_pending`, `cursor_probe_sent_at`, `cursor_probe_responded`, `stuck_since` fields
- `refresh_session_statuses` sends probe when Working + no output > 4 minutes; routes CPR response as `CursorProbeOk`; handles 2s timeout as `CursorProbeTimeout`

Stuck timeout → `TabGlyph::Stuck` (amber `!`):
- `STUCK_WARNING_THRESHOLD = 5 minutes`: sets `session.stuck_since`
- `snapshot_stuck_sessions()` returns a `HashSet<u64>` of stuck IDs
- `tab_label` takes `stuck_sessions: &HashSet<u64>`; uses `TabGlyph::Stuck` when any pane is stuck (priority between Done and Working)
- `StatusBar::render` takes `stuck_sessions` and passes it to `tab_label`; all 6 test call sites updated
- `#[allow(dead_code)]` removed from `TabGlyph::Stuck`

OpenCode ACP stdio JSON-RPC bridge:
- `docker/runtime/agent-status/hooks/opencode/acp-bridge.sh`: spawns `opencode acp`, reads JSON-RPC notifications (session.idle, session.busy, session.status, question.asked, tool.call, agent.start/end), maps to `report.sh` calls
- `OpenCodeAcpInstaller::install()` writes a marker file and is called from `install_status_reporter_hooks` for opencode sessions
- `entrypoint.sh` opencode case launches the ACP bridge in background

337 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
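For the cursor-position probe described in this commit, the reply to `\x1b[6n` has the form `\x1b[<row>;<col>R`. A scanner for that reply could be sketched as below; `find_cpr` and its helpers are hypothetical names, not the PR's `check_for_cpr_response`, and the sketch returns only the first match in a buffer.

```rust
/// Find the first cursor-position report (`ESC [ <row> ; <col> R`) in `bytes`.
pub fn find_cpr(bytes: &[u8]) -> Option<(u16, u16)> {
    let mut i = 0;
    while i + 1 < bytes.len() {
        if bytes[i] == 0x1b && bytes[i + 1] == b'[' {
            if let Some(pos) = parse_cpr_body(&bytes[i + 2..]) {
                return Some(pos);
            }
        }
        i += 1;
    }
    None
}

/// Parse `<row>;<col>R` at the start of `rest`.
fn parse_cpr_body(rest: &[u8]) -> Option<(u16, u16)> {
    let (row, used) = take_number(rest)?;
    let rest = &rest[used..];
    if rest.first() != Some(&b';') {
        return None;
    }
    let (col, used) = take_number(&rest[1..])?;
    if rest.get(1 + used) != Some(&b'R') {
        return None;
    }
    Some((row, col))
}

/// Read a run of ASCII digits; returns the value and how many bytes it used.
fn take_number(bytes: &[u8]) -> Option<(u16, usize)> {
    let len = bytes.iter().take_while(|b| b.is_ascii_digit()).count();
    if len == 0 {
        return None;
    }
    let text = std::str::from_utf8(&bytes[..len]).ok()?;
    Some((text.parse().ok()?, len))
}
```

Requiring the final `R` is what separates a cursor report from other CSI sequences such as color changes, which share the `ESC [` prefix and numeric parameters.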

…verview

Phases 0–6 all marked ✅ Shipped. Phase 4 updated with CSI 6n probe and Stuck timeout. Phase 5 updated with events.subscribe streaming details. Roadmap index entry updated to reflect full implementation. Only remaining item: Anthropic OAuth usage window polling (opt-in via ANTHROPIC_OAUTH_TOKEN, explicitly out of scope for this PR).

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

…ocking WaitSessionStatus

arbitrate.rs - pure arbitration function + 11 tests:
- arbitrate_unknown_when_no_signals
- arbitrate_working_from_process_alive
- arbitrate_blocked_from_screen_blocker_no_hook
- arbitrate_blocked_authoritative_when_hook_agrees
- arbitrate_working_overrides_idle_hook_with_fresher_screen
- arbitrate_fresh_hook_authority_wins
- arbitrate_process_exit_clears_to_idle
- arbitrate_hook_cleared_when_wrong_agent
- roll_up_blocked_beats_working_beats_idle
- roll_up_done_beats_working / roll_up_unknown_when_empty
- attention_priority_order

ScreenDetection and ProcessEvidence structs as pure evidence types. attention_priority() and roll_up_states() public helpers.

State machine tests added to agent_status/mod.rs:
- re_work_after_ack_creates_new_done
- done_derived_from_idle_plus_unseen
- roll_up_priority_blocked_gt_done_gt_working_gt_idle_gt_unknown

Reporter protocol tests added:
- reporter_accept_valid_sequence / reporter_reject_stale_sequence
- reporter_reject_wrong_source_after_clear (sequence.rs)
- heartbeat_keeps_hook_authority_fresh
- clear_authority_removes_only_matching_source (mod.rs)

Process identity tests added (process.rs):
- identify_agent_stat_comm_truncation_falls_back_to_exe
- dead_process_returns_none

Token monitor tests added (token_monitor/mod.rs):
- token_monitor_backs_off_after_silence
- token_monitor_resets_backoff_after_change
- token_monitor_poll_due_respects_interval
- session_info_includes_token_usage_when_available

Model catalog tests added (token_monitor/models.rs):
- model_catalog_falls_back_to_embedded_list_on_error
- model_catalog_uses_cached_result_within_ttl
- model_catalog_parses_model_entries_correctly

wait_session_status blocking implementation (socket.rs):
- Checks current state against target immediately
- If not satisfied: subscribes to state_broadcast_tx and blocks
- Loops on broadcast events until matching AgentStateChanged
- Exits with "satisfied", "timeout", or "not_found" outcome
- Handles Lagged broadcast errors gracefully

Screen fixtures added:
- claude/false_positive_old_spinner.txt: spinner above separator
- claude/false_positive_timing.txt: "Churned for 1m" timing line

wait_session_status_already_satisfied test added (daemon.rs)

367 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
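The roll-up helpers named in this commit, attention_priority() and roll_up_states(), can be sketched as a plain priority ordering. The ordering below follows the test name roll_up_priority_blocked_gt_done_gt_working_gt_idle_gt_unknown, and the empty case follows roll_up_unknown_when_empty; the numeric priorities and signatures are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState { Working, Blocked, Done, Idle, Unknown }

/// Higher value means the state demands operator attention sooner.
/// (The concrete numbers are illustrative; only the order matters.)
pub fn attention_priority(state: AgentState) -> u8 {
    match state {
        AgentState::Blocked => 4,
        AgentState::Done => 3,
        AgentState::Working => 2,
        AgentState::Idle => 1,
        AgentState::Unknown => 0,
    }
}

/// Collapse several pane states into one workspace-level state.
/// An empty workspace rolls up to Unknown.
pub fn roll_up_states(states: &[AgentState]) -> AgentState {
    states
        .iter()
        .copied()
        .max_by_key(|s| attention_priority(*s))
        .unwrap_or(AgentState::Unknown)
}
```

With this ordering a workspace holding Working, Blocked, Working, and Idle panes rolls up to Blocked, so a single pane waiting on input is never hidden by busy neighbors.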

…ning

- claude_stop_hook_is_registered_as_sync_and_checks_background_tasks: verifies Stop and PermissionRequest hooks are async: false and command path is correct (hook_installer.rs)
- subagent_counter_prevents_working_from_clearing_blocked: documents the WorkingVisible/Blocked interaction and daemon-level guard (agent_status/mod.rs)
- wait_session_status_logic_timeout_when_not_satisfied: unsatisfied targets → outcome "timeout" (daemon.rs)
- wait_session_status_logic_not_found_for_unknown_session + session_info_token_usage_is_populated_from_monitor (daemon.rs)
- #[allow(dead_code)] on screen_idle test helper in arbitrate.rs

372 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

Amp detector now detects all three states:
- Working: "esc to cancel" chrome (existing)
- Blocked: "Allow?" or "approve"+"esc to cancel" approval prompts (new)
- Idle: ">" or "> " or "❯" prompt at bottom (new)

3 new tests: detects_blocked_from_approval_prompt, detects_idle_from_prompt_line, detects_idle_from_arrow_prompt

OpenCode detector now detects all three states:
- Blocked: "permission required" or "dismiss"+"enter/select" (existing + expanded)
- Working: "Ctrl+C to cancel" or "interrupt" + "cancel" chrome (new)
- Idle: ">" or "> " or "❯" input box at bottom (new)

3 new tests: detects_blocked_from_question_prompt, detects_working_from_interrupt_chrome, detects_idle_from_input_box

New screen fixture: opencode/false_positive.txt (cosmetic redraw, no working/blocked/idle triggers)

PTY/e2e simulation tests (4) in agent_status/mod.rs:
- fake_claude_permission_dialog_transitions_to_blocked: vt100::Parser renders permission dialog → DetectorRegistry → BlockedVisible → Blocked
- fake_claude_spinner_transitions_to_working_then_idle: spinner screen → Working; then prompt box → PromptVisible → Done (unseen)
- process_exit_signal_transitions_to_idle_and_clears_authority: Working + ProcessExited signal → Idle
- multiple_sessions_roll_up_reflects_most_urgent: roll_up_states over Working/Blocked/Working/Idle → Blocked (highest priority)

382 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

…iteria met

All acceptance criteria verified. Status line updated to 'Implemented'. 382 tests confirm coverage. OAuth polling remains explicitly out of scope per the roadmap's own out-of-scope list.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

…gent msg

Protocol compliance with roadmap event stream payload spec:

AgentStateChanged expanded to include all required fields: raw_state, confidence, detected_agent, foreground_pgid, visible_blocker, visible_idle, visible_working, process_exited, stale_report, seq, ts_ns, last_seen_revision (all optional with serde defaults so existing serialised events decode without errors)

New ServerMsg variants:
- SessionSpawned { session_id, agent, label }: new session created
- SessionExited { session_id }: session removed
- TokenUsageChanged { session_id, agent, model, tokens, cost, ts_ns }
- WorkspaceStatusChanged { effective, session/blocked/done/working counts }

New ClientMsg variant:
- ReportChildAgentState { parent_session_id, child_session_id, raw_state, seq }

Daemon wiring:
- SessionSpawned broadcast on every tab-spawn and split-spawn insert
- SessionExited broadcast when session is removed
- WorkspaceStatusChanged broadcast after every refresh_session_statuses
- TokenUsageChanged broadcast per session on token ticker cycle
- All AgentStateChanged broadcasts use expanded field set

Shared palette compliance (roadmap requirement):
- AMBER: Rgb = Rgb::new(255, 170, 0) added to crates/jackin-tui/src/lib.rs
- statusbar.rs: Working glyph and Stuck/token-bar amber now use PHOSPHOR_DARK_FG / AMBER_FG constants derived from shared palette instead of inline \x1b[38;2;... byte literals

382 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

crates/jackin-protocol/src/agent_status.rs (new):
- AgentRawState { Unknown, Working, Blocked, Idle }: protocol-level 4-variant raw state enum with label() and serde support
- AgentStatusSource: Reported/VisibleScreen/ForegroundProcess/ShellIntegration/CursorProbe/OutputActivity/None
- AgentStatusConfidence: Unknown/Weak/Strong/Authoritative (Ord impl)
- AgentStatusReport struct with all evidence fields: raw_state, source, confidence, detected_agent, foreground_pgid, visible_blocker/idle/working, process_exited, stale_report, revision, last_seen_revision
- 4 unit tests: labels, confidence ordering, default state, JSON roundtrip

crates/jackin-protocol/src/lib.rs:
- pub mod agent_status

control.rs:
- SessionInfo and PaneSnapshot gain agent_status_report: Option<AgentStatusReport> field

crates/jackin-capsule/src/agent_status/seen.rs (new):
- acknowledge_session() and mark_pane_focused() delegating to SessionStatus::acknowledge()
- 2 unit tests

crates/jackin-capsule/src/agent_status/mod.rs:
- pub mod seen registered
- event_stream_emits_on_raw_state_change test: verifies Unknown→Working produces Some(Working), repeated Working produces None, Working→Blocked produces Some(Blocked)

385 capsule + 14 protocol tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
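The Ord impl on AgentStatusConfidence makes it straightforward to pick the strongest piece of evidence among several. A minimal sketch, assuming the ordering is derived from declaration order and that Unknown is the default; the `strongest` helper is hypothetical and not part of the PR.

```rust
/// Variant order doubles as the ordering: Unknown < Weak < Strong < Authoritative.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub enum AgentStatusConfidence {
    #[default]
    Unknown,
    Weak,
    Strong,
    Authoritative,
}

/// Hypothetical helper: return the label of the most confident report.
/// Each report is a (confidence, source label) pair.
pub fn strongest<'a>(reports: &'a [(AgentStatusConfidence, &'a str)]) -> Option<&'a str> {
    reports
        .iter()
        .max_by_key(|(confidence, _)| *confidence)
        .map(|(_, label)| *label)
}
```

Deriving PartialOrd and Ord ties the comparison to the declaration order, so reordering the variants would silently change which evidence wins.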

Resolved conflicts from two main-branch PRs:
- #529: signed releases + supply-chain verification (GitHub Actions, new .gitignore entry, release workflow changes, docs for security/verification)
- #530: agent codenames + operator CLI hygiene (new `Agents`/`AgentRegistry` ClientMsg/ServerMsg, `AgentRegistryEntry` struct, codename_live/retired/wordlist fields in Multiplexer, `jackin status` CLI, error codes, preflight)

Conflict resolutions:
- control.rs: kept all our new variants (ReportAgentState, HeartbeatAgentAuthority, ClearAgentAuthority, ReportChildAgentState, EventsSubscribe, WaitSessionStatus, SessionReadVisible, TokenGetSession, TokenGetModels, AgentStateChanged, SessionSpawned, SessionExited, TokenUsageChanged, WorkspaceStatusChanged, TokenSessionResult, TokenModelsResult, SessionStatusResult, SessionVisibleText, Welcome, Error, TokenUsageSummary) AND added main's Agents/AgentRegistry/AgentRegistryEntry from #530
- daemon.rs: kept our detectors/token_monitor fields AND added main's codename_live/retired/agent_history/wordlist_offset fields
- client.rs: kept our wildcard match arms AND added explicit AgentRegistry arms

389 capsule + 14 protocol tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

arbitrate.rs:
- Extract ScreenDetection::is_fresher_than(): freshness check was duplicated inline at two guard sites (blocker and working overrides); now a single documented method with a clear conservative default

process.rs:
- Extract agent_kind_from_name(name: &str): agent-slug matching was duplicated across the exe-basename path and the stat.comm fallback; both paths now call the same fn

token_monitor/models.rs:
- Extract fetch_from_api(): three near-identical fetch_anthropic / fetch_openai / fetch_moonshot bodies collapsed into a single parameterized helper; each caller is now one line with its env-key, URL, auth-header builder, and model filter

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

…otes

Moves OSC 133 shell-integration parsing out of `OscCapture`/`session.rs` into a model-independent raw-PTY scanner in `agent_status/mod.rs`. This eliminates the conflict with #495's vt100 → DamageGrid/PassthroughEvent migration (which removes `OscCapture` entirely).

agent_status/mod.rs:
- Add `OscShellMark { PromptStart, PromptEnd, PreExec, CommandFinished }`
- Add `scan_osc133(bytes: &[u8]) -> Option<OscShellMark>`: byte-level scanner that finds `\x1b]133;A/B/C/D` sequences without depending on the vt100 parser or DamageGrid model
- 5 unit tests for the scanner

session.rs:
- Remove `OscShellMark` enum (moved to agent_status)
- Remove `OscCapture.shell_mark` field, `take_shell_mark()`, OSC 133 block in `unhandled_osc`
- Remove `Session::take_shell_mark()`
- Net -50 lines; session.rs conflicts with #495 are now confined to the `state`→`status` field swap and the new cursor-probe fields

daemon.rs:
- `SessionEvent::Output` handler calls `scan_osc133(&data)` after `session.feed_pty`, feeds result into state machine immediately (no 1Hz lag for shell-boundary signals)
- Remove `take_shell_mark` drain from `refresh_session_statuses`

statusbar.rs:
- Add doc comment to `tab_label` noting the post-#495 type swap from `AgentState` to `VisibleAgentState` (mechanical find-replace, glyph arms unchanged)

394 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
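A byte-level scanner with the signature given in this commit could look roughly like the sketch below. The A/B/C/D mapping follows the usual OSC 133 convention and the variant order in the commit message; returning the last mark in a chunk (rather than the first) is an assumption, as is ignoring the sequence terminator.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OscShellMark { PromptStart, PromptEnd, PreExec, CommandFinished }

/// Scan raw PTY bytes for `ESC ] 133 ; <A|B|C|D>` and return the most
/// recent mark in the chunk. Sketch only: it does not check the BEL/ST
/// terminator and does not handle a sequence split across two chunks.
pub fn scan_osc133(bytes: &[u8]) -> Option<OscShellMark> {
    const PREFIX: &[u8] = b"\x1b]133;";
    bytes
        .windows(PREFIX.len() + 1) // prefix plus the one-letter mark
        .rev()                     // newest first, so the last mark wins
        .find_map(|w| {
            if !w.starts_with(PREFIX) {
                return None;
            }
            match w[PREFIX.len()] {
                b'A' => Some(OscShellMark::PromptStart),
                b'B' => Some(OscShellMark::PromptEnd),
                b'C' => Some(OscShellMark::PreExec),
                b'D' => Some(OscShellMark::CommandFinished),
                _ => None,
            }
        })
}
```

Because the scanner works on raw bytes, it keeps functioning regardless of which terminal model parses the same stream for rendering.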

…nature

StatusBar::render() gained stuck_sessions and token_snapshot params (added in Phase 4/6). The integration test helpers were missed when those params were added. CI caught the mismatch on MSRV check.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

token_monitor/mod.rs:
- Add find_provider_files(base_dirs, ext) shared helper: removes duplicate directory-scan code across claude.rs and codex.rs
- Combine poll_due_sessions() from two passes (collect IDs then update snapshots) into a single pass that does both in one loop iteration

token_monitor/claude.rs + codex.rs:
- Replace local find_jsonl_files() with calls to find_provider_files()

daemon.rs:
- Guard scan_osc133 call with data.len() >= 8 early-exit so short PTY buffers (spinner redraws, cursor reports) skip the scan entirely

394 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

daemon.rs:
- make_agent_state_changed() helper eliminates 4 repetitions of the AgentStateChanged 16-field struct literal; each broadcast site now passes only the fields it owns (session_id, effective, seen, source, revision, reason)
- Workspace roll-up: 3 separate .filter().count() passes → single match loop with three counters; one iteration over sessions instead of three

socket.rs:
- make_result closure in WaitSessionStatus arm: 3 identical SessionStatusResult constructions collapsed to 1 call site each; only outcome/effective/revision differ

394 capsule lib tests + 8 integration tests pass.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

… errors, models TTL

From comprehensive PR review:

Done state now reachable from screen detectors:
- `WorkingVisible`/`BlockedVisible` transitioning FROM Idle/Unknown/Done now resets `seen = false`, so the subsequent PromptVisible produces `Done` instead of `Idle`. Previously only `HookTaskStart`/`OperatorInput` reset `seen`, making `Done` unreachable in the screen-detection path.
- Updated `prompt_visible_after_working_produces_idle_when_seen` test: enter Working first (resets seen), then explicitly set seen=true, then PromptVisible → Idle (correct; seen was already true when work ended)

amp.rs token double-counting (critical data bug):
- Replaced accumulation into session.totals with scratch-compute-and-compare: totals computed fresh each poll, `changed` only true if they moved. Previously totals were added cumulatively: after N polls, counts were N× the real value with no indication of the problem.

let _ = file.seek() silent corruption (claude/codex/kimi):
- Check seek result; on failure, cdebug! log + reset offset to 0 and re-seek from start. Previously a truncated/rotated file would silently corrupt the offset and lose tokens on the next poll.

models.rs fetched_at stamped even after HTTP failure:
- Only stamp `self.fetched_at` when entries were actually added. Previously a transient network failure at boot silenced the model catalog for the entire session (24h TTL blocked retries).
- Add cdebug! logging on each early-return path in fetch_from_api.

opencode.rs silent DB failures:
- Add cdebug! when Connection::open_with_flags fails.
- Add cdebug! when conn.prepare fails (schema mismatch).

report.sh silent drop when Python unavailable:
- Add `else` branch with stderr message when neither python3 nor python is found. Previously state reports were silently discarded.
- Remove `|| true` from nc pipe; emit stderr on socket write failure.
- Remove dead LEN variable (computed but never used).

Comment fixes:
- agent_status/mod.rs: fix arch diagram reference to non-existent `osc133` submodule; remove HookTaskStart from priority rule 3 (it always maps to Working, never Blocked)
- arbitrate.rs: fix "Called by daemon's 1Hz ticker" (currently only in tests)
- session.rs: replace stale Phase-1-will-replace comment in refresh_state
- statusbar.rs: remove PR #495 forward-reference to non-existent VisibleAgentState

394 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
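The scratch-compute-and-compare fix for the Amp reader can be illustrated with a small sketch. The struct fields and the `recompute_totals` name are hypothetical; the point is that each poll rebuilds the totals from the full file contents and reports a change only when the result differs from the stored value.

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct TokenTotals {
    pub input: u64,
    pub output: u64,
    pub cache_read: u64,
    pub cache_write: u64,
}

/// Recompute totals from every usage entry in the file (not just new ones)
/// and replace the stored totals. Returns true only if any field moved.
/// Comparing the whole struct means cache-only changes also count.
pub fn recompute_totals(current: &mut TokenTotals, entries: &[TokenTotals]) -> bool {
    let mut scratch = TokenTotals::default();
    for entry in entries {
        scratch.input += entry.input;
        scratch.output += entry.output;
        scratch.cache_read += entry.cache_read;
        scratch.cache_write += entry.cache_write;
    }
    let changed = scratch != *current;
    *current = scratch;
    changed
}
```

Polling the same file twice leaves the totals unchanged, whereas adding each poll's sum onto the stored totals would double them on the second poll.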

token_monitor/mod.rs:
- seek_or_reset(file, offset, path): single place for the seek-failure handling pattern used by claude/codex/kimi; logs via cdebug! and resets offset to 0 so the next read starts from the beginning

claude.rs / codex.rs / kimi.rs:
- replace inline 4-line seek+reset blocks with seek_or_reset(); remove now-unused Seek/SeekFrom imports

amp.rs:
- borrow &[serde_json::Value] instead of .to_owned(): avoids cloning the entire JSON array on every file read during each poll cycle

394 lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>

…, broadcast logging

models.rs: same-count API response never stamped fetched_at (HIGH):
- fetch_from_api() now returns bool indicating HTTP success (true) vs failure
- populate() stamps fetched_at only when the round-trip returned true, regardless of whether entry count changed; previously retain+extend with same count produced len==before, leaving needs_refresh()=true permanently and firing an HTTP call on every session spawn

amp.rs: cache-only token changes silently frozen (MEDIUM):
- Include cache_read_tokens and cache_write_tokens in the `changed` comparison; was input+output only, so all-cache-hit runs never updated the displayed cache totals

token_monitor/mod.rs: second seek failure silently ignored (MEDIUM):
- seek_or_reset() now returns bool; if the fallback seek-to-0 also fails, cdebug! is logged and the function returns false so callers skip the file rather than constructing a BufReader at an unknown position

daemon.rs: state-transition broadcast failures entirely silent (MEDIUM):
- screen-detector, hook, and token-usage broadcast sites now log via cdebug! when send() returns Err (no active receivers); spawn sites keep `let _ =` (no receivers expected at spawn time)

report.sh: Python pipe failure invisible (MEDIUM):
- set -eu → set -euo pipefail so Python framing errors in the pipeline exit code propagate instead of being masked by nc's exit code

Comments:
- Remove 2 misleading lines from blocked_is_sticky test that said "Working-visible must not clear Blocked" (the assertion proves it does)
- Update find_provider_files doc to describe both direct-child and one-level-deep scanning behavior

Tests added (6 new, total now 400):
- working_visible_from_fresh_session_produces_done_not_idle
- blocked_visible_from_fresh_session_produces_done_on_prompt
- working_visible_after_ack_produces_done_again
- seek_or_reset_succeeds_when_offset_valid
- seek_or_reset_resets_when_offset_beyond_eof
- populate_on_empty_api_key_leaves_needs_refresh_true

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
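The amp.rs fix combines two ideas from these commits: recompute totals from scratch on every poll, and compare all four counters when deciding whether anything changed. The sketch below shows that pattern under assumptions: the `Totals` and `Session` types, the `refresh` function, and the `input_tokens`/`output_tokens` field names are hypothetical; only `cache_read_tokens` and `cache_write_tokens` appear in the commit message.

```rust
// Hypothetical types; the real amp.rs structures are not shown in this PR page.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Totals {
    input_tokens: u64,
    output_tokens: u64,
    cache_read_tokens: u64,
    cache_write_tokens: u64,
}

#[derive(Debug, Default)]
struct Session {
    totals: Totals,
}

/// Recomputes totals from every record on each poll and reports whether they moved.
fn refresh(session: &mut Session, records: &[Totals]) -> bool {
    // Scratch accumulator: starts at zero each poll, so repeated polls cannot double-count.
    let mut scratch = Totals::default();
    for r in records {
        scratch.input_tokens += r.input_tokens;
        scratch.output_tokens += r.output_tokens;
        scratch.cache_read_tokens += r.cache_read_tokens;
        scratch.cache_write_tokens += r.cache_write_tokens;
    }
    // Derived PartialEq compares all four fields, so cache-only changes count.
    let changed = scratch != session.totals;
    session.totals = scratch;
    changed
}
```

Polling the same records twice returns `true` then `false` and leaves the totals unchanged, whereas accumulating into `session.totals` directly would double them on the second poll.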

…ed, fetch_from_api contract

codex.rs: prev_cumulative scoped wrong (HIGH):
- Move prev_cumulative inside the for-path loop so it resets to (0,0,0,0) per file; was shared across files causing cross-file delta corruption
- Replace saturating_sub with delta closure that cdebug! logs on counter regressions (counter drops after seek_or_reset or session replay)

socket.rs: WaitSessionStatus Lagged handler wrong (HIGH):
- Was: continue (silently keep waiting, may never unblock if target state was in dropped events)
- Now: cdebug! + break with outcome=timeout so caller can retry with fresh state rather than waiting forever on a missed event

models.rs: fetch_from_api 0-model success semantics (MEDIUM):
- Now returns false when all models are filtered out (empty new Vec) so needs_refresh() stays true and caller retries rather than locking in embedded fallback for 24h with a valid-but-empty response
- Update populate() doc comment to say "at least one model matched"

report.sh: set -euo pipefail under #!/bin/sh (IMPORTANT):
- Revert to set -eu; pipefail is not POSIX and unreliable on older dash; the || echo handlers already handle pipeline failures explicitly

entering_work_cycle simplification (Simplification):
- Remove redundant HookTaskStart|OperatorInput first arm; both signals unconditionally map to Working in transition(), so the second arm (transition into Working|Blocked from non-working state) subsumes them
- Update comment accordingly

Tests added (2 new → 402 total):
- amp_changed_flag_includes_cache_tokens: documents the 4-field comparison and shows old 2-field logic would miss cache-only changes
- populate_stamps_fetched_at_only_on_success: directly tests the false→no-stamp / true→stamp branches of populate()

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
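The codex.rs scoping fix can be sketched as follows. This is an illustration under assumptions, not the project's code: the `Counters` alias, the `delta` and `sum_deltas` functions, and the field labels are hypothetical, `eprintln!` stands in for `cdebug!`, and treating a regression as a zero delta is a guess at the intended behavior.

```rust
// Hypothetical shape: four cumulative counters per snapshot, as in the (0,0,0,0) reset.
type Counters = (u64, u64, u64, u64);

/// Difference between two cumulative readings; logs instead of hiding a regression.
fn delta(prev: u64, cur: u64, field: &str) -> u64 {
    match cur.checked_sub(prev) {
        Some(d) => d,
        None => {
            // Stand-in for cdebug!; assumption: a regression contributes nothing.
            eprintln!("counter regression on {field}: {prev} -> {cur}");
            0
        }
    }
}

/// Sums per-snapshot deltas across files, where each file holds cumulative snapshots.
fn sum_deltas(files: &[Vec<Counters>]) -> Counters {
    let mut total: Counters = (0, 0, 0, 0);
    for snapshots in files {
        // Reset per file: cumulative counters in one file say nothing about another.
        let mut prev: Counters = (0, 0, 0, 0);
        for &cur in snapshots {
            total.0 += delta(prev.0, cur.0, "field 0");
            total.1 += delta(prev.1, cur.1, "field 1");
            total.2 += delta(prev.2, cur.2, "field 2");
            total.3 += delta(prev.3, cur.3, "field 3");
            prev = cur;
        }
    }
    total
}
```

If `prev` were declared outside the outer loop, a second file whose first snapshot is smaller than the last snapshot of the first file would register as a regression and contribute nothing, which is the cross-file corruption the commit describes.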

…-status

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>

# Conflicts:
# Cargo.lock
# crates/jackin-capsule/Cargo.toml
# crates/jackin-capsule/src/daemon.rs
# crates/jackin-capsule/src/dialog.rs
# crates/jackin-capsule/src/lib.rs
# crates/jackin-capsule/src/runtime_setup.rs
# crates/jackin-capsule/src/session.rs
# crates/jackin-capsule/src/statusbar.rs
# crates/jackin-capsule/tests/status_bar.rs
# crates/jackin-tui/src/lib.rs
# crates/jackin/src/console/tui/view/list/tests.rs
# docs/content/docs/reference/roadmap/agent-runtime-status.mdx
# src/console/manager/input/list.rs

…-status

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>

# Conflicts:
# Cargo.lock
# crates/jackin-capsule/benches/pane_body.rs
# crates/jackin-capsule/src/runtime_setup.rs

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Codex <codex@openai.com>

Validate the current detection engine against the design contract and record the gap list: arbitration, /proc identity, OSC 133, seen/ack, subagent counting, and the semantic hook channel exist but are unwired, while PTY output bytes author working state — the flap engine behind the false-positive reports. Rebuild the page as the canonical spec: skeptical review of every detection approach in the field, June 2026 re-research of Herdr (manifest rules, deleted PTY-first experiment) and verified vendor hook/plugin surfaces, license discipline for borrowed concepts, an evidence/arbitration/debounce architecture with data model, gating table, rule-pack schema, constants, and watchdog, per-runtime adapter plans, and a nine-slice implementation blueprint with per-slice files, tests, and acceptance criteria. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>

The single-source-of-truth collapse dropped reference material the implementation needs: the precise validation audit, Herdr observed facts and timeline, per-runtime signal surfaces with doc and issue links, the orchestrator survey, terminal-protocol reliability, and /proc signal validity. Restore all of it as appendices A-F so no finding lives outside this page. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>

Add non-built-in runtime surfaces (Gemini, Aider, Pi) as reference for the custom-role reporter contract, community corroboration sources, and the two out-of-scope items dropped during the rewrite (per-tool status granularity, agents beyond the built-in five). Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>

Replace byte-driven status detection with evidence arbitration, runtime event gates, process and OSC corroboration, rule packs, diagnostics, and roadmap/docs updates. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Codex <codex@openai.com>

donbeave and others added 4 commits June 4, 2026 17:53

donbeave changed the title ~~feat(capsule): agent runtime status authority — state model + screen detectors~~ feat(capsule): agent runtime status authority — Phases 0–4 + token monitor foundation Jun 4, 2026

donbeave changed the title ~~feat(capsule): agent runtime status authority — Phases 0–4 + token monitor foundation~~ feat(capsule): agent runtime status authority — Phases 0–6 Jun 4, 2026

donbeave and others added 2 commits June 4, 2026 22:06

donbeave changed the title ~~feat(capsule): agent runtime status authority — Phases 0–6~~ feat(capsule): agent runtime status authority — all phases Jun 4, 2026

donbeave and others added 4 commits June 4, 2026 22:19

donbeave changed the title ~~feat(capsule): agent runtime status authority — all phases~~ feat(capsule): agent runtime status authority — all phases, all acceptance criteria Jun 4, 2026

donbeave and others added 14 commits June 4, 2026 22:42

donbeave changed the title ~~feat(capsule): agent runtime status authority — all phases, all acceptance criteria~~ feat(capsule): add agent runtime status authority Jun 10, 2026

donbeave and others added 6 commits June 10, 2026 01:25

fix(ci): resolve PR check failures

26886fe

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Codex <codex@openai.com>

fix(docs): update runtime status repo links

32e476d

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Codex <codex@openai.com>

feat(capsule): add agent runtime status authority #532
donbeave wants to merge 31 commits into main from feature/agent-runtime-status
