feat(capsule): add agent runtime status authority#532
Open
donbeave wants to merge 31 commits into
…detectors
Replace the timer-based "silence means blocked" heuristic with a layered status authority. Foundation slice: Phase 0 (silence-bug removal), Phase 1 (state model + status machine), and the Phase 4 screen-detector framework.
Phase 0 — stop lying with silence:
- Remove BLOCKED_AFTER, state_after_refresh, state_after_pty_output.
- refresh_state() is now a no-op; PTY output no longer flips state on a timer.
- Add AgentState::Unknown to the protocol (serializes as "unknown").
Phase 1 — state model:
- New agent_status module: AgentRawState (10-variant input-signal enum), SessionStatus machine with advance()/acknowledge(), seen-derived Done, and a monotonic revision counter.
- Session owns one status: SessionStatus; the old state field is gone, replaced by Session::state(). mark_operator_input and acknowledge route through the machine. PTY output only stamps last_output_at now.
Phase 4 — screen-detector framework:
- Detector trait + DetectorRegistry with detectors for all five built-in runtimes (claude, codex, amp, kimi, opencode), each matching the bottom DETECTION_ROWS = 24 of the vt100::Screen and returning Option<AgentRawState>.
- Wired into the 1Hz ticker via Multiplexer::refresh_session_statuses(), which also acknowledges the focused pane so its Done clears to Idle.
TUI:
- Tab strip renders Unknown as a mid-gray dot; roll-up priority Blocked > Done > Unknown > (Working/Idle).
Docs:
- Update the roadmap item to Partially implemented with an "Implemented Shape (As Shipped)" section documenting the three deliberate divergences from the original proposal (kept AgentState+Unknown; SessionStatus machine instead of arbitrate_session_status; AgentRawState as input-signal enum). Mark Phase 0/1 shipped and Phase 4 partial. Move the overview entry to Partially implemented.
Remaining: Phase 2 (reporter + /proc), Phase 3 (hooks), rest of Phase 4 (OSC 133, cursor probes, stuck detection, fixtures), Phase 5 (subscription protocol + workspace roll-up event), Phase 6 (token/quota monitor + dialog).
Tests: 311 lib tests pass; clippy + rustfmt clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
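For readers who want a concrete picture of the Phase 1 machine, the following is a minimal sketch of a status holder with advance()/acknowledge(), a seen-derived Done, and a revision counter. It is not the PR's code: `RawSignal` is a hypothetical four-variant stand-in for the ten-variant AgentRawState, the field names are assumptions, and the choice to clear `seen` on any work signal is a simplification.

```rust
/// Operator-facing state; mirrors the AgentState names used in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Unknown,
    Working,
    Blocked,
    Idle,
    Done,
}

/// Hypothetical subset of input signals (the PR's AgentRawState has ten variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RawSignal {
    WorkingVisible,
    BlockedVisible,
    PromptVisible,
    OperatorInput,
}

#[derive(Debug)]
pub struct SessionStatus {
    base: AgentState, // never stores Done; Done is derived in state()
    seen: bool,       // has the operator looked since the last work cycle?
    revision: u64,    // bumped only when the visible state changes
}

impl SessionStatus {
    pub fn new() -> Self {
        Self { base: AgentState::Unknown, seen: true, revision: 0 }
    }

    /// Done is derived: Idle that the operator has not yet seen.
    pub fn state(&self) -> AgentState {
        match (self.base, self.seen) {
            (AgentState::Idle, false) => AgentState::Done,
            (state, _) => state,
        }
    }

    pub fn revision(&self) -> u64 {
        self.revision
    }

    /// Feed one input signal; returns the new state only if it changed.
    pub fn advance(&mut self, signal: RawSignal) -> Option<AgentState> {
        let before = self.state();
        match signal {
            RawSignal::WorkingVisible | RawSignal::OperatorInput => {
                self.base = AgentState::Working;
                self.seen = false; // assumption: any work cycle clears `seen`
            }
            RawSignal::BlockedVisible => {
                self.base = AgentState::Blocked;
                self.seen = false;
            }
            RawSignal::PromptVisible => self.base = AgentState::Idle,
        }
        self.bump_if_changed(before)
    }

    /// Operator looked at the pane: Done collapses to Idle.
    pub fn acknowledge(&mut self) -> Option<AgentState> {
        let before = self.state();
        self.seen = true;
        self.bump_if_changed(before)
    }

    fn bump_if_changed(&mut self, before: AgentState) -> Option<AgentState> {
        let after = self.state();
        if after == before {
            return None;
        }
        self.revision += 1;
        Some(after)
    }
}
```

In this sketch a Working session that reaches a prompt reports Done until acknowledge() is called, and repeated identical signals leave the revision untouched.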
…e-status Document how this feature (developed on main) must be shaped to match the parallel TUI Architecture refactor (feature/tui-architecture) so the eventual merge is mechanical. Six alignment rules: keep derivation/token logic in pure agent_status/ + token_monitor/ modules, AgentState stays the single wire enum with Unknown added (refactor gains the VisibleAgentState arm on merge), tab glyphs in one cohesive match that relocates to tui/components/status_bar.rs, daemon additions cohesive and ticker-local for clean relocation into the daemon/ submodules, Session::state() accessor for a find-replace field swap, and Phase 6 token dialog built data-first so only a thin render shim is Ratatui-rewritten on merge. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
Bake the verified TUI-refactor findings into the roadmap item so any future implementer has the concrete merge facts without re-inspecting the branch: exact file moves (statusbar.rs deleted → tui/components/status_bar.rs; daemon.rs split into daemon/ submodules), the VisibleAgentState bridge enum and visible_agent_state_from_protocol mapping, what the refactor did NOT change (4-variant AgentState, silence-timer still present), and a per-file conflict/resolution table for all eight touched code paths plus the two conflict-free new module trees. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
…nitor
Ships the remaining implementation of the agent runtime status authority roadmap item (Phases 2, 3, 4, and the Phase 6 foundation).
Phase 2 — Container Reporter + Process Identity:
- Add `procfs = "0.18"` dependency; rewrite `process.rs` to use the `procfs` crate for `/proc`-based foreground-agent detection
- `docker/runtime/agent-status/report.sh` and `heartbeat.sh` — reporter CLI and heartbeat scripts that send length-prefixed JSON to the Capsule socket from inside the container
- `JACKIN_SESSION_ID`, `JACKIN_STATUS_SOCKET`, `JACKIN_STATUS_SOURCE`, `JACKIN_AGENT_RUNTIME` injected at session spawn (`build_agent_command` stamps `JACKIN_SESSION_ID` on the `CommandBuilder` using the generated sid)
- `ReportAgentState`, `HeartbeatAgentAuthority`, `ClearAgentAuthority` control messages added to the protocol and routed to `handle_control_msg` on the `Multiplexer`; sequence validation via `SequenceTracker` prevents stale/replayed authority
- `HookAuthority` struct in `agent_status::mod`; `Session` gains `hook_authority`, `sequence`, `child_pid`, `subagent_count` fields
Phase 3 — Semantic Runtime Reports:
- `ClaudeHookInstaller` writes and drift-repairs `~/.claude/settings.json` with all hook entries; `Stop` and `PermissionRequest` registered as `async: false`; atomic tmp-file + rename writes; three unit tests
- Stub installers for `Kimi`, `Amp`, `Codex`, `OpenCode`; `runtime_setup.rs` calls the correct installer per agent at every session launch
- Claude hook script (`docker/runtime/agent-status/hooks/claude/report-hook.sh`) maps every hook event to a raw state; `SubagentStart`/`SubagentStop` pass `--message <event>` so the daemon updates the subagent counter
- Subagent sticky-blocked rule: `WorkingVisible` suppressed when `subagent_count > 0` and session is `Blocked`
- Stub hook scripts for Kimi, Codex, and OpenCode
Phase 4 — Screen + Shell Integration:
- `OscCapture` extended with `OscShellMark` enum; `OSC 133 A/B/C/D` captured and drained each 1Hz tick → feeds `PromptVisible` / `Osc133PreExec` into the status machine
- Shell integration `.zshrc` block installed by `runtime_setup.rs`; OSC 133 + OSC 7 `precmd`/`preexec` hooks from `install_shell_integration()`
- `TabGlyph::Working` (◌ dim phosphor-green) and `TabGlyph::Stuck` (! amber) added to `statusbar.rs`; roll-up priority updated to include `Working` between `Done` and `Unknown`
- Screen fixture files for all five runtimes under `crates/jackin-capsule/src/agent_status/screen/fixtures/` (working / blocked / idle / false_positive per agent)
Phase 6 foundation — Token Monitor:
- `token_monitor/` module: `TokenMonitor`, `TokenSession`, `TokenTotals`
- Claude JSONL reader with incremental byte-offset reads, `costUSD` preference, sidechain skip, and three unit tests
- Static pricing table (`pricing.rs`) covering Claude, OpenAI, Kimi/MoonShot with tiered calculation; three unit tests
- Stub readers for Codex, Kimi, Amp, OpenCode; embedded model catalog
- `TokenMonitor` registered/deregistered alongside sessions; 30-second polling ticker in the daemon event loop; `SessionInfo.token_usage` now populated from monitor totals
Protocol additions:
- `ClientMsg`: `ReportAgentState`, `HeartbeatAgentAuthority`, `ClearAgentAuthority`, `EventsSubscribe`, `WaitSessionStatus`, `SessionReadVisible`
- `ServerMsg`: `AgentStateChanged`, `SessionStatusResult`, `SessionVisibleText`, `Welcome`, `Error`
- `TokenUsageSummary` struct; `SessionInfo.token_usage` field
- `WaitSessionStatus` handled synchronously against snapshot in `socket::handle_control_request`; full push-subscription deferred to Phase 5
Roadmap updates:
- Phases 0–4 marked ✅ Shipped; Phase 5 and Phase 6 marked partial
- `roadmap/index.mdx` updated to reflect current coverage
- `<RepoFile />` links for new docker runtime assets
329 capsule lib tests + 10 protocol tests pass; pre-existing `git_prepare_commit_msg` integration test failures are unrelated.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
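The sequence validation mentioned under Phase 2 can be illustrated with a small sketch. This is an assumption about how a tracker of this kind might work, not the PR's `SequenceTracker`: it keys on a source string and accepts only strictly increasing sequence numbers, so a stale or replayed report is rejected. The `accept` method name and the per-source map are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical tracker: remembers the highest sequence number seen per source.
#[derive(Debug, Default)]
pub struct SequenceTracker {
    last_seq: HashMap<String, u64>,
}

impl SequenceTracker {
    /// Returns true and records `seq` only if it is newer than anything
    /// previously accepted from `source`; equal or older values are rejected.
    pub fn accept(&mut self, source: &str, seq: u64) -> bool {
        match self.last_seq.get(source) {
            Some(&last) if seq <= last => false,
            _ => {
                self.last_seq.insert(source.to_owned(), seq);
                true
            }
        }
    }
}
```

Rejecting equal sequence numbers as well as lower ones is what blocks replays; a reporter that restarts would need to resume from a higher number or have its authority cleared first.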
Phase 5 — Host Consumer Integration:
- Console session list state badges color-coded by AgentState: Blocked = bright red, Done = phosphor green, Working = dim green, Unknown = mid-gray, Idle = quiet dim
- Done panes render with ◎ marker (vs ○ for normal, ● for focused) so finished-but-unseen work is immediately visible
- Daemon already acknowledges focused pane on each 1Hz tick (Done → Idle)
Phase 6 — Full Token Monitoring:
- Codex JSONL reader: session-format (event_msg/token_count cumulative-delta calculation) + headless format (direct usage fields); 2 unit tests
- Kimi wire.jsonl reader: parses StatusUpdate token_usage fields (input_other, output, input_cache_read, input_cache_creation)
- Amp thread JSON reader: array and wrapper-object shapes; 2 unit tests
- OpenCode SQLite reader: rusqlite crate, rowid-incremental queries, read-only connection flags, legacy schema stub
- ModelCatalog::populate() with live API queries via ureq: Anthropic /v1/models, OpenAI /v1/models, MoonShot /v1/models; embedded fallback when API key absent or request fails
- RateWindow + ProviderUsageSnapshot structs; TokenMonitor.token_snapshots updated on each poll cycle
- Dialog::TokenUsage carries TokenUsageSummary; render_token_usage shows model name, input/output/cache token counts (formatted as 14.2k / 1.4M), and cost estimate
- Ctrl+u (0x15) opens token usage dialog from any pane
- status bar token indicator: compact "89% (5h)" format at end of row-1, amber below 20%, red below 10% (wired into StatusBar::render())
- TokenGetSession + TokenGetModels control messages in protocol; handled in one-shot socket handler
- ModelCatalog registered at session start via register_session()
- Dependencies: rusqlite = "0.32" (bundled), ureq = "2.12" (tls)
337 capsule lib tests + 10 protocol tests pass.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
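The display rules named above (token counts such as 14.2k / 1.4M, and the compact "89% (5h)" indicator that turns amber below 20% and red below 10%) could look roughly like this. The function and enum names are hypothetical, and reading the percentage as "quota remaining" is an assumption the commit text does not spell out.

```rust
/// Compact token count: 14_200 -> "14.2k", 1_400_000 -> "1.4M".
/// Values just under a unit boundary (for example 999_950) are not special-cased.
pub fn format_tokens(n: u64) -> String {
    if n >= 1_000_000 {
        format!("{:.1}M", n as f64 / 1_000_000.0)
    } else if n >= 1_000 {
        format!("{:.1}k", n as f64 / 1_000.0)
    } else {
        n.to_string()
    }
}

/// Hypothetical severity levels for the status-bar indicator.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QuotaLevel {
    Normal,
    Amber,
    Red,
}

/// Assumes the percentage is the share of quota still remaining.
pub fn quota_level(remaining_pct: u8) -> QuotaLevel {
    if remaining_pct < 10 {
        QuotaLevel::Red
    } else if remaining_pct < 20 {
        QuotaLevel::Amber
    } else {
        QuotaLevel::Normal
    }
}

/// Compact indicator text, e.g. "89% (5h)".
pub fn indicator(remaining_pct: u8, window: &str) -> String {
    format!("{remaining_pct}% ({window})")
}
```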
…OpenCode ACP
events.subscribe streaming push protocol:
- Add `state_broadcast_tx: broadcast::Sender<ServerMsg>` to Multiplexer
- Broadcast `AgentStateChanged` on every session state change in `refresh_session_statuses`, OSC 133 handler, and `handle_control_msg`
- Broadcast spawn event when sessions are added to the tab tree
- `EventsSubscribe` handling in `socket::handle_control_request`: sends a `Welcome` frame then enters a streaming loop pushing broadcast events until the connection drops; handles `Lagged` errors
- Pass `state_broadcast_tx` through `perform_handshake` to the handler
CSI 6n cursor-position probe for stuck detection:
- `Session::probe_cursor_position()` sends `\x1b[6n` to the PTY
- `Session::check_for_cpr_response()` scans PTY output for `\x1b[<row>;<col>R` pattern; sets `cursor_probe_responded` flag
- `Session` gains: `cursor_probe_pending`, `cursor_probe_sent_at`, `cursor_probe_responded`, `stuck_since` fields
- `refresh_session_statuses` sends probe when Working + no output > 4 minutes; routes CPR response as `CursorProbeOk`; handles 2s timeout as `CursorProbeTimeout`
Stuck timeout → `TabGlyph::Stuck` (amber `!`):
- `STUCK_WARNING_THRESHOLD = 5 minutes`: sets `session.stuck_since`
- `snapshot_stuck_sessions()` returns a `HashSet<u64>` of stuck IDs
- `tab_label` takes `stuck_sessions: &HashSet<u64>`; uses `TabGlyph::Stuck` when any pane is stuck (priority between Done and Working)
- `StatusBar::render` takes `stuck_sessions` and passes it to `tab_label`; all 6 test call sites updated
- `#[allow(dead_code)]` removed from `TabGlyph::Stuck`
OpenCode ACP stdio JSON-RPC bridge:
- `docker/runtime/agent-status/hooks/opencode/acp-bridge.sh`: spawns `opencode acp`, reads JSON-RPC notifications (session.idle, session.busy, session.status, question.asked, tool.call, agent.start/end), maps to `report.sh` calls
- `OpenCodeAcpInstaller::install()` writes a marker file and is called from `install_status_reporter_hooks` for opencode sessions
- `entrypoint.sh` opencode case launches the ACP bridge in background
337 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
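As an illustration of the CPR scan described above, here is a small byte-level sketch that looks for a `\x1b[<row>;<col>R` reply in a chunk of PTY output. It is not `Session::check_for_cpr_response()`; the function names are hypothetical, and returning the parsed position (rather than setting a flag) is a choice made for the example.

```rust
/// Scan a chunk of PTY output for a cursor-position report (ESC [ row ; col R)
/// and return the first well-formed (row, col) pair found.
pub fn find_cpr(bytes: &[u8]) -> Option<(u16, u16)> {
    bytes
        .windows(2)
        .enumerate()
        .filter(|(_, w)| w[0] == 0x1b && w[1] == b'[')
        .find_map(|(i, _)| parse_cpr_body(&bytes[i + 2..]))
}

/// Parse "row;colR" at the start of `rest`; any other CSI sequence yields None.
fn parse_cpr_body(rest: &[u8]) -> Option<(u16, u16)> {
    let end = rest.iter().position(|&b| b == b'R')?;
    let body = std::str::from_utf8(&rest[..end]).ok()?;
    let (row, col) = body.split_once(';')?;
    Some((parse_digits(row)?, parse_digits(col)?))
}

/// Accept only non-empty, all-digit fields so signs and stray bytes are rejected.
fn parse_digits(s: &str) -> Option<u16> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    s.parse().ok()
}
```

Other CSI sequences ahead of the reply (such as a color reset) are skipped because their bodies fail the digit check, so the scan continues to the next ESC [.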
…verview Phases 0–6 all marked ✅ Shipped. Phase 4 updated with CSI 6n probe and Stuck timeout. Phase 5 updated with events.subscribe streaming details. Roadmap index entry updated to reflect full implementation. Only remaining item: Anthropic OAuth usage window polling (opt-in via ANTHROPIC_OAUTH_TOKEN, explicitly out of scope for this PR). Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
…ocking WaitSessionStatus
arbitrate.rs — pure arbitration function + 11 tests:
- arbitrate_unknown_when_no_signals
- arbitrate_working_from_process_alive
- arbitrate_blocked_from_screen_blocker_no_hook
- arbitrate_blocked_authoritative_when_hook_agrees
- arbitrate_working_overrides_idle_hook_with_fresher_screen
- arbitrate_fresh_hook_authority_wins
- arbitrate_process_exit_clears_to_idle
- arbitrate_hook_cleared_when_wrong_agent
- roll_up_blocked_beats_working_beats_idle
- roll_up_done_beats_working / roll_up_unknown_when_empty
- attention_priority_order
ScreenDetection and ProcessEvidence structs as pure evidence types. attention_priority() and roll_up_states() public helpers.
State machine tests added to agent_status/mod.rs:
- re_work_after_ack_creates_new_done
- done_derived_from_idle_plus_unseen
- roll_up_priority_blocked_gt_done_gt_working_gt_idle_gt_unknown
Reporter protocol tests added:
- reporter_accept_valid_sequence / reporter_reject_stale_sequence
- reporter_reject_wrong_source_after_clear (sequence.rs)
- heartbeat_keeps_hook_authority_fresh
- clear_authority_removes_only_matching_source (mod.rs)
Process identity tests added (process.rs):
- identify_agent_stat_comm_truncation_falls_back_to_exe
- dead_process_returns_none
Token monitor tests added (token_monitor/mod.rs):
- token_monitor_backs_off_after_silence
- token_monitor_resets_backoff_after_change
- token_monitor_poll_due_respects_interval
- session_info_includes_token_usage_when_available
Model catalog tests added (token_monitor/models.rs):
- model_catalog_falls_back_to_embedded_list_on_error
- model_catalog_uses_cached_result_within_ttl
- model_catalog_parses_model_entries_correctly
wait_session_status blocking implementation (socket.rs):
- Checks current state against target immediately
- If not satisfied: subscribes to state_broadcast_tx and blocks
- Loops on broadcast events until matching AgentStateChanged
- Exits with "satisfied", "timeout", or "not_found" outcome
- Handles Lagged broadcast errors gracefully
Screen fixtures added:
- claude/false_positive_old_spinner.txt — spinner above separator
- claude/false_positive_timing.txt — "Churned for 1m" timing line
wait_session_status_already_satisfied test added (daemon.rs)
367 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
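The two public helpers named in this commit, attention_priority() and roll_up_states(), might be shaped like the sketch below. The signatures and numeric ranks are assumptions; the ordering (Blocked > Done > Working > Idle > Unknown, with Unknown for an empty workspace) is taken from the test names listed above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Unknown,
    Idle,
    Working,
    Done,
    Blocked,
}

/// Higher number means the operator should look at it sooner.
/// The numeric values are illustrative; only their order matters.
pub fn attention_priority(state: AgentState) -> u8 {
    match state {
        AgentState::Blocked => 4,
        AgentState::Done => 3,
        AgentState::Working => 2,
        AgentState::Idle => 1,
        AgentState::Unknown => 0,
    }
}

/// Collapse many pane states into the single most urgent one.
/// An empty workspace rolls up to Unknown.
pub fn roll_up_states(states: &[AgentState]) -> AgentState {
    states
        .iter()
        .copied()
        .max_by_key(|s| attention_priority(*s))
        .unwrap_or(AgentState::Unknown)
}
```

Keeping the ranking in one function means the tab strip and any workspace-level event can share a single ordering instead of each encoding its own.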
…ning
- claude_stop_hook_is_registered_as_sync_and_checks_background_tasks: verifies Stop and PermissionRequest hooks are async: false and command path is correct (hook_installer.rs)
- subagent_counter_prevents_working_from_clearing_blocked: documents the WorkingVisible/Blocked interaction and daemon-level guard (agent_status/mod.rs)
- wait_session_status_logic_timeout_when_not_satisfied: unsatisfied targets → outcome "timeout" (daemon.rs)
- wait_session_status_logic_not_found_for_unknown_session + session_info_token_usage_is_populated_from_monitor (daemon.rs)
- #[allow(dead_code)] on screen_idle test helper in arbitrate.rs
372 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
Amp detector now detects all three states:
- Working: "esc to cancel" chrome (existing)
- Blocked: "Allow?" or "approve"+"esc to cancel" approval prompts (new)
- Idle: ">" or "> " or "❯" prompt at bottom (new)
3 new tests: detects_blocked_from_approval_prompt,
detects_idle_from_prompt_line,
detects_idle_from_arrow_prompt
OpenCode detector now detects all three states:
- Blocked: "permission required" or "dismiss"+"enter/select" (existing + expanded)
- Working: "Ctrl+C to cancel" or "interrupt" + "cancel" chrome (new)
- Idle: ">" or "> " or "❯" input box at bottom (new)
3 new tests: detects_blocked_from_question_prompt,
detects_working_from_interrupt_chrome,
detects_idle_from_input_box
New screen fixture: opencode/false_positive.txt (cosmetic redraw,
no working/blocked/idle triggers)
PTY/e2e simulation tests (4) in agent_status/mod.rs:
- fake_claude_permission_dialog_transitions_to_blocked: vt100::Parser
renders permission dialog → DetectorRegistry → BlockedVisible → Blocked
- fake_claude_spinner_transitions_to_working_then_idle: spinner screen
→ Working; then prompt box → PromptVisible → Done (unseen)
- process_exit_signal_transitions_to_idle_and_clears_authority: Working
+ ProcessExited signal → Idle
- multiple_sessions_roll_up_reflects_most_urgent: roll_up_states over
Working/Blocked/Working/Idle → Blocked (highest priority)
382 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
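The Amp rules listed in this commit can be sketched as a plain-text matcher. This is only an approximation: the real detectors read a vt100::Screen, while this example takes rows as `&[&str]`, and the check order (Blocked before Working, because an approval prompt also shows "esc to cancel") is an assumption about how the overlap is resolved. `Detected` and `detect_amp` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Detected {
    Working,
    Blocked,
    Idle,
}

/// Only the bottom rows of the screen are inspected, as in the PR.
const DETECTION_ROWS: usize = 24;

/// Classify an Amp screen from its text rows (top to bottom).
pub fn detect_amp(rows: &[&str]) -> Option<Detected> {
    let start = rows.len().saturating_sub(DETECTION_ROWS);
    let tail = &rows[start..];
    let has = |needle: &str| tail.iter().any(|row| row.contains(needle));

    // Approval prompts win over the working chrome they share text with.
    if has("Allow?") || (has("approve") && has("esc to cancel")) {
        return Some(Detected::Blocked);
    }
    if has("esc to cancel") {
        return Some(Detected::Working);
    }

    // Idle: the last non-empty row is a bare prompt marker.
    let last = tail.iter().rev().map(|row| row.trim()).find(|row| !row.is_empty())?;
    if last == ">" || last == "❯" {
        return Some(Detected::Idle);
    }
    None
}
```

Returning None for anything unrecognised leaves the decision to other evidence sources instead of guessing a state from cosmetic redraws.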
…iteria met All acceptance criteria verified. Status line updated to 'Implemented'. 382 tests confirm coverage. OAuth polling remains explicitly out of scope per the roadmap's own out-of-scope list. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
…gent msg
Protocol compliance with roadmap event stream payload spec:
AgentStateChanged expanded to include all required fields:
raw_state, confidence, detected_agent, foreground_pgid,
visible_blocker, visible_idle, visible_working, process_exited,
stale_report, seq, ts_ns, last_seen_revision (all optional with
serde defaults so existing serialised events decode without errors)
New ServerMsg variants:
SessionSpawned { session_id, agent, label } — new session created
SessionExited { session_id } — session removed
TokenUsageChanged { session_id, agent, model, tokens, cost, ts_ns }
WorkspaceStatusChanged { effective, session/blocked/done/working counts }
New ClientMsg variant:
ReportChildAgentState { parent_session_id, child_session_id, raw_state, seq }
Daemon wiring:
SessionSpawned broadcast on every tab-spawn and split-spawn insert
SessionExited broadcast when session is removed
WorkspaceStatusChanged broadcast after every refresh_session_statuses
TokenUsageChanged broadcast per session on token ticker cycle
All AgentStateChanged broadcasts use expanded field set
Shared palette compliance (roadmap requirement):
AMBER: Rgb = Rgb::new(255, 170, 0) added to crates/jackin-tui/src/lib.rs
statusbar.rs: Working glyph and Stuck/token-bar amber now use
PHOSPHOR_DARK_FG / AMBER_FG constants derived from shared palette
instead of inline \x1b[38;2;... byte literals
382 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
crates/jackin-protocol/src/agent_status.rs (new):
- AgentRawState { Unknown, Working, Blocked, Idle } — protocol-level
4-variant raw state enum with label() and serde support
- AgentStatusSource — Reported/VisibleScreen/ForegroundProcess/
ShellIntegration/CursorProbe/OutputActivity/None
- AgentStatusConfidence — Unknown/Weak/Strong/Authoritative (Ord impl)
- AgentStatusReport struct — all evidence fields: raw_state, source,
confidence, detected_agent, foreground_pgid, visible_blocker/idle/
working, process_exited, stale_report, revision, last_seen_revision
- 4 unit tests: labels, confidence ordering, default state, JSON roundtrip
crates/jackin-protocol/src/lib.rs: pub mod agent_status
control.rs: SessionInfo and PaneSnapshot gain
agent_status_report: Option<AgentStatusReport> field
crates/jackin-capsule/src/agent_status/seen.rs (new):
- acknowledge_session() and mark_pane_focused() delegating to
SessionStatus::acknowledge()
- 2 unit tests
crates/jackin-capsule/src/agent_status/mod.rs:
- pub mod seen registered
- event_stream_emits_on_raw_state_change test: verifies Unknown→Working
produces Some(Working), repeated Working produces None, Working→Blocked
produces Some(Blocked)
385 capsule + 14 protocol tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
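The confidence ordering described for AgentStatusConfidence can be illustrated with a derived `Ord`, where variant declaration order gives Unknown < Weak < Strong < Authoritative. Whether the PR derives or hand-writes the impl is not stated, so the derive is an assumption; `Evidence` and `strongest` are hypothetical helpers added for the example.

```rust
/// Declaration order defines the derived ordering: Unknown is weakest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum AgentStatusConfidence {
    Unknown,
    Weak,
    Strong,
    Authoritative,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentRawState {
    Unknown,
    Working,
    Blocked,
    Idle,
}

/// Hypothetical pairing of a raw state with how much it should be trusted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Evidence {
    pub raw_state: AgentRawState,
    pub confidence: AgentStatusConfidence,
}

/// Pick the most trusted piece of evidence, if any.
pub fn strongest(evidence: &[Evidence]) -> Option<Evidence> {
    evidence.iter().copied().max_by_key(|e| e.confidence)
}
```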
Resolved conflicts from two main-branch PRs:
- #529: signed releases + supply-chain verification (GitHub Actions, new .gitignore entry, release workflow changes, docs for security/verification)
- #530: agent codenames + operator CLI hygiene (new `Agents`/`AgentRegistry` ClientMsg/ServerMsg, `AgentRegistryEntry` struct, codename_live/retired/wordlist fields in Multiplexer, `jackin status` CLI, error codes, preflight)
Conflict resolutions:
- control.rs: kept all our new variants (ReportAgentState, HeartbeatAgentAuthority, ClearAgentAuthority, ReportChildAgentState, EventsSubscribe, WaitSessionStatus, SessionReadVisible, TokenGetSession, TokenGetModels, AgentStateChanged, SessionSpawned, SessionExited, TokenUsageChanged, WorkspaceStatusChanged, TokenSessionResult, TokenModelsResult, SessionStatusResult, SessionVisibleText, Welcome, Error, TokenUsageSummary) AND added main's Agents/AgentRegistry/AgentRegistryEntry from #530
- daemon.rs: kept our detectors/token_monitor fields AND added main's codename_live/retired/agent_history/wordlist_offset fields
- client.rs: kept our wildcard match arms AND added explicit AgentRegistry arms
389 capsule + 14 protocol tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
arbitrate.rs:
- Extract ScreenDetection::is_fresher_than() — freshness check was duplicated inline at two guard sites (blocker and working overrides); now a single documented method with a clear conservative default
process.rs:
- Extract agent_kind_from_name(name: &str) — agent-slug matching was duplicated across the exe-basename path and the stat.comm fallback; both paths now call the same fn
token_monitor/models.rs:
- Extract fetch_from_api() — three near-identical fetch_anthropic / fetch_openai / fetch_moonshot bodies collapsed into a single parameterized helper; each caller is now one line with its env-key, URL, auth-header builder, and model filter
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…otes
Moves OSC 133 shell-integration parsing out of `OscCapture`/`session.rs` into a model-independent raw-PTY scanner in `agent_status/mod.rs`. This eliminates the conflict with #495's vt100 → DamageGrid/PassthroughEvent migration (which removes `OscCapture` entirely).
agent_status/mod.rs:
- Add `OscShellMark { PromptStart, PromptEnd, PreExec, CommandFinished }`
- Add `scan_osc133(bytes: &[u8]) -> Option<OscShellMark>` — byte-level scanner that finds `\x1b]133;A/B/C/D` sequences without depending on the vt100 parser or DamageGrid model
- 5 unit tests for the scanner
session.rs:
- Remove `OscShellMark` enum (moved to agent_status)
- Remove `OscCapture.shell_mark` field, `take_shell_mark()`, OSC 133 block in `unhandled_osc`
- Remove `Session::take_shell_mark()`
- Net -50 lines; session.rs conflicts with #495 are now confined to the `state`→`status` field swap and the new cursor-probe fields
daemon.rs:
- `SessionEvent::Output` handler calls `scan_osc133(&data)` after `session.feed_pty`, feeds result into state machine immediately (no 1Hz lag for shell-boundary signals)
- Remove `take_shell_mark` drain from `refresh_session_statuses`
statusbar.rs:
- Add doc comment to `tab_label` noting the post-#495 type swap from `AgentState` to `VisibleAgentState` — mechanical find-replace, glyph arms unchanged
394 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
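A byte-level scanner with the signature given in this commit might look like the following sketch. The enum and signature come from the commit text; the A/B/C/D mapping follows the variant order, and returning the last mark in the buffer (so the most recent shell boundary wins) is an assumption about a detail the message does not state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OscShellMark {
    PromptStart,
    PromptEnd,
    PreExec,
    CommandFinished,
}

/// Find OSC 133 marks (ESC ] 133 ; A/B/C/D) in raw PTY bytes without a
/// terminal parser. Returns the last mark seen in the buffer.
pub fn scan_osc133(bytes: &[u8]) -> Option<OscShellMark> {
    const PREFIX: &[u8] = b"\x1b]133;";
    let mut last = None;
    for (i, window) in bytes.windows(PREFIX.len()).enumerate() {
        if window != PREFIX {
            continue;
        }
        // The byte right after the prefix selects the mark; parameters such
        // as the exit code after "D;" are ignored in this sketch.
        let mark = match bytes.get(i + PREFIX.len()).copied() {
            Some(b'A') => OscShellMark::PromptStart,
            Some(b'B') => OscShellMark::PromptEnd,
            Some(b'C') => OscShellMark::PreExec,
            Some(b'D') => OscShellMark::CommandFinished,
            _ => continue,
        };
        last = Some(mark);
    }
    last
}
```

Because the scan only needs the raw byte slice, it can run directly in the output handler, which is what removes the dependency on the vt100 or DamageGrid model.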
…nature StatusBar::render() gained stuck_sessions and token_snapshot params (added in Phase 4/6). The integration test helpers were missed when those params were added. CI caught the mismatch on MSRV check. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
token_monitor/mod.rs:
- Add find_provider_files(base_dirs, ext) shared helper — removes duplicate directory-scan code across claude.rs and codex.rs
- Combine poll_due_sessions() from two passes (collect IDs then update snapshots) into a single pass that does both in one loop iteration
token_monitor/claude.rs + codex.rs:
- Replace local find_jsonl_files() with calls to find_provider_files()
daemon.rs:
- Guard scan_osc133 call with data.len() >= 8 early-exit so short PTY buffers (spinner redraws, cursor reports) skip the scan entirely
394 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
daemon.rs:
- make_agent_state_changed() helper eliminates 4 repetitions of the AgentStateChanged 16-field struct literal; each broadcast site now passes only the fields it owns (session_id, effective, seen, source, revision, reason)
- Workspace roll-up: 3 separate .filter().count() passes → single match loop with three counters; one iteration over sessions instead of three
socket.rs:
- make_result closure in WaitSessionStatus arm — 3 identical SessionStatusResult constructions collapsed to 1 call site each; only outcome/effective/revision differ
394 capsule lib tests + 8 integration tests pass.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
… errors, models TTL
From comprehensive PR review:
Done state now reachable from screen detectors:
- `WorkingVisible`/`BlockedVisible` transitioning FROM Idle/Unknown/Done now resets `seen = false`, so the subsequent PromptVisible produces `Done` instead of `Idle`. Previously only `HookTaskStart`/`OperatorInput` reset `seen`, making `Done` unreachable in the screen-detection path.
- Updated `prompt_visible_after_working_produces_idle_when_seen` test: enter Working first (resets seen), then explicitly set seen=true, then PromptVisible → Idle (correct; seen was already true when work ended)
amp.rs token double-counting (critical data bug):
- Replaced accumulation into session.totals with scratch-compute-and-compare: totals computed fresh each poll, `changed` only true if they moved. Previously totals were added cumulatively — after N polls, counts were N× the real value with no indication of the problem.
let _ = file.seek() silent corruption (claude/codex/kimi):
- Check seek result; on failure, cdebug! log + reset offset to 0 and re-seek from start. Previously a truncated/rotated file would silently corrupt the offset and lose tokens on the next poll.
models.rs fetched_at stamped even after HTTP failure:
- Only stamp `self.fetched_at` when entries were actually added. Previously a transient network failure at boot silenced the model catalog for the entire session (24h TTL blocked retries).
- Add cdebug! logging on each early-return path in fetch_from_api.
opencode.rs silent DB failures:
- Add cdebug! when Connection::open_with_flags fails.
- Add cdebug! when conn.prepare fails (schema mismatch).
report.sh silent drop when Python unavailable:
- Add `else` branch with stderr message when neither python3 nor python is found. Previously state reports were silently discarded.
- Remove `|| true` from nc pipe; emit stderr on socket write failure.
- Remove dead LEN variable (computed but never used).
Comment fixes:
- agent_status/mod.rs: fix arch diagram reference to non-existent `osc133` submodule; remove HookTaskStart from priority rule 3 (it always maps to Working, never Blocked)
- arbitrate.rs: fix "Called by daemon's 1Hz ticker" (currently only in tests)
- session.rs: replace stale Phase-1-will-replace comment in refresh_state
- statusbar.rs: remove PR #495 forward-reference to non-existent VisibleAgentState
394 capsule lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
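The scratch-compute-and-compare fix for the Amp reader can be sketched as below. The struct fields and the `refresh_totals` name are hypothetical; the point is that each poll recomputes totals from the whole file and overwrites the stored value, so re-reading the same file N times cannot multiply the counts.

```rust
/// Hypothetical four-field token totals.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct TokenTotals {
    pub input: u64,
    pub output: u64,
    pub cache_read: u64,
    pub cache_write: u64,
}

/// Recompute totals from every per-message usage entry in the thread file,
/// replace the stored totals, and report whether anything moved.
pub fn refresh_totals(stored: &mut TokenTotals, per_message: &[TokenTotals]) -> bool {
    let mut fresh = TokenTotals::default();
    for usage in per_message {
        fresh.input += usage.input;
        fresh.output += usage.output;
        fresh.cache_read += usage.cache_read;
        fresh.cache_write += usage.cache_write;
    }
    // Whole-struct comparison, so cache-only changes also count as a change.
    let changed = fresh != *stored;
    *stored = fresh;
    changed
}
```

Polling the same file twice yields `changed == true` once and `false` afterwards, with the stored totals identical both times.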
token_monitor/mod.rs: seek_or_reset(file, offset, path) — single place for the seek-failure handling pattern used by claude/codex/kimi; logs via cdebug! and resets offset to 0 so the next read starts from the beginning
claude.rs / codex.rs / kimi.rs: replace inline 4-line seek+reset blocks with seek_or_reset(); remove now-unused Seek/SeekFrom imports
amp.rs: borrow &[serde_json::Value] instead of .to_owned() — avoids cloning the entire JSON array on every file read during each poll cycle
394 lib tests pass; workspace builds clean.
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…, broadcast logging
models.rs — same-count API response never stamped fetched_at (HIGH):
- fetch_from_api() now returns bool indicating HTTP success (true) vs failure
- populate() stamps fetched_at only when the round-trip returned true, regardless of whether entry count changed; previously retain+extend with same count produced len==before, leaving needs_refresh()=true permanently and firing an HTTP call on every session spawn
amp.rs — cache-only token changes silently frozen (MEDIUM):
- Include cache_read_tokens and cache_write_tokens in the `changed` comparison; was input+output only, so all-cache-hit runs never updated the displayed cache totals
token_monitor/mod.rs — second seek failure silently ignored (MEDIUM):
- seek_or_reset() now returns bool; if the fallback seek-to-0 also fails, cdebug! is logged and the function returns false so callers skip the file rather than constructing a BufReader at an unknown position
daemon.rs — state-transition broadcast failures entirely silent (MEDIUM):
- screen-detector, hook, and token-usage broadcast sites now log via cdebug! when send() returns Err (no active receivers); spawn sites keep let _ = (no receivers expected at spawn time)
report.sh — Python pipe failure invisible (MEDIUM):
- set -eu → set -euo pipefail so Python framing errors in the pipeline exit code propagate instead of being masked by nc's exit code
Comments:
- Remove 2 misleading lines from blocked_is_sticky test that said "Working-visible must not clear Blocked" (the assertion proves it does)
- Update find_provider_files doc to describe both direct-child and one-level-deep scanning behavior
Tests added (6 new, total now 400):
- working_visible_from_fresh_session_produces_done_not_idle
- blocked_visible_from_fresh_session_produces_done_on_prompt
- working_visible_after_ack_produces_done_again
- seek_or_reset_succeeds_when_offset_valid
- seek_or_reset_resets_when_offset_beyond_eof
- populate_on_empty_api_key_leaves_needs_refresh_true
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
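A bool-returning seek_or_reset in the spirit of this commit could be sketched as follows. It is generic over `std::io::Seek` so it can be exercised with an in-memory cursor, uses `eprintln!` as a stand-in for cdebug!, and takes a `label` instead of a path. One deliberate assumption: an offset past the end of the stream is treated as a failure up front, since a plain seek beyond EOF normally succeeds on regular files and would not trigger a reset by itself.

```rust
use std::io::{Seek, SeekFrom};

/// Position `stream` at `*offset`, falling back to the start when the offset
/// is no longer valid (for example after truncation or rotation).
/// Returns false only when even the fallback seek fails, so the caller can
/// skip the file instead of reading from an unknown position.
pub fn seek_or_reset<S: Seek>(stream: &mut S, offset: &mut u64, label: &str) -> bool {
    // Learn the current length by seeking to the end.
    let len = match stream.seek(SeekFrom::End(0)) {
        Ok(len) => len,
        Err(err) => {
            eprintln!("{label}: cannot determine length: {err}");
            return false;
        }
    };
    if *offset > len {
        eprintln!("{label}: offset {} beyond length {len}, resetting to 0", *offset);
        *offset = 0;
    }
    match stream.seek(SeekFrom::Start(*offset)) {
        Ok(_) => true,
        Err(err) => {
            eprintln!("{label}: seek to {} failed ({err}), retrying from 0", *offset);
            *offset = 0;
            stream.seek(SeekFrom::Start(0)).is_ok()
        }
    }
}
```

A caller would typically wrap the stream in a buffered reader only after this returns true, and skip the file for the current poll otherwise.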
…ed, fetch_from_api contract
codex.rs — prev_cumulative scoped wrong (HIGH):
- Move prev_cumulative inside the for-path loop so it resets to (0,0,0,0) per file; was shared across files causing cross-file delta corruption
- Replace saturating_sub with delta closure that cdebug! logs on counter regressions (counter drops after seek_or_reset or session replay)
socket.rs — WaitSessionStatus Lagged handler wrong (HIGH):
- Was: continue (silently keep waiting, may never unblock if target state was in dropped events)
- Now: cdebug! + break with outcome=timeout so caller can retry with fresh state rather than waiting forever on a missed event
models.rs — fetch_from_api 0-model success semantics (MEDIUM):
- Now returns false when all models are filtered out (empty new Vec) so needs_refresh() stays true and caller retries rather than locking in embedded fallback for 24h with a valid-but-empty response
- Update populate() doc comment to say "at least one model matched"
report.sh — set -euo pipefail under #!/bin/sh (IMPORTANT):
- Revert to set -eu; pipefail is not POSIX and unreliable on older dash; the || echo handlers already handle pipeline failures explicitly
entering_work_cycle simplification (Simplification):
- Remove redundant HookTaskStart|OperatorInput first arm; both signals unconditionally map to Working in transition(), so the second arm (transition into Working|Blocked from non-working state) subsumes them
- Update comment accordingly
Tests added (2 new → 402 total):
- amp_changed_flag_includes_cache_tokens: documents the 4-field comparison and shows old 2-field logic would miss cache-only changes
- populate_stamps_fetched_at_only_on_success: directly tests the false→no-stamp / true→stamp branches of populate()
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
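The Codex cumulative-delta fix can be shown in miniature. This sketch tracks a single counter rather than the four-tuple the commit mentions, uses `eprintln!` in place of cdebug!, and assumes a regression contributes zero (matching the old saturating_sub result while making the drop visible). `total_tokens` and its input shape are hypothetical.

```rust
/// Delta between two cumulative readings. A regression (the counter went
/// backwards) is logged and contributes nothing, which is an assumption.
fn delta(prev: u64, current: u64, field: &str) -> u64 {
    match current.checked_sub(prev) {
        Some(d) => d,
        None => {
            eprintln!("counter regression on {field}: {prev} -> {current}");
            0
        }
    }
}

/// Sum usage across files, where each file holds cumulative readings.
/// `prev_cumulative` is declared inside the per-file loop so it resets to 0
/// for every file; sharing it across files would corrupt the deltas.
pub fn total_tokens(files: &[Vec<u64>]) -> u64 {
    let mut total = 0u64;
    for readings in files {
        let mut prev_cumulative = 0u64;
        for &current in readings {
            total += delta(prev_cumulative, current, "total_tokens");
            prev_cumulative = current;
        }
    }
    total
}
```

With readings `[10, 25]` in one file and `[5, 8]` in another, the per-file reset gives 25 + 8 = 33; a shared `prev_cumulative` would have treated the second file's first reading as a regression and produced 28.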
…-status

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>

# Conflicts:
#	Cargo.lock
#	crates/jackin-capsule/Cargo.toml
#	crates/jackin-capsule/src/daemon.rs
#	crates/jackin-capsule/src/dialog.rs
#	crates/jackin-capsule/src/lib.rs
#	crates/jackin-capsule/src/runtime_setup.rs
#	crates/jackin-capsule/src/session.rs
#	crates/jackin-capsule/src/statusbar.rs
#	crates/jackin-capsule/tests/status_bar.rs
#	crates/jackin-tui/src/lib.rs
#	crates/jackin/src/console/tui/view/list/tests.rs
#	docs/content/docs/reference/roadmap/agent-runtime-status.mdx
#	src/console/manager/input/list.rs
…-status

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>

# Conflicts:
#	Cargo.lock
#	crates/jackin-capsule/benches/pane_body.rs
#	crates/jackin-capsule/src/runtime_setup.rs
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Codex <codex@openai.com>
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Codex <codex@openai.com>
Validate the current detection engine against the design contract and record the gap list: arbitration, /proc identity, OSC 133, seen/ack, subagent counting, and the semantic hook channel exist but are unwired, while PTY output bytes author working state — the flap engine behind the false-positive reports. Rebuild the page as the canonical spec: skeptical review of every detection approach in the field, June 2026 re-research of Herdr (manifest rules, deleted PTY-first experiment) and verified vendor hook/plugin surfaces, license discipline for borrowed concepts, an evidence/arbitration/debounce architecture with data model, gating table, rule-pack schema, constants, and watchdog, per-runtime adapter plans, and a nine-slice implementation blueprint with per-slice files, tests, and acceptance criteria. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
The single-source-of-truth collapse dropped reference material the implementation needs: the precise validation audit, Herdr observed facts and timeline, per-runtime signal surfaces with doc and issue links, the orchestrator survey, terminal-protocol reliability, and /proc signal validity. Restore all of it as appendices A-F so no finding lives outside this page. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
Add non-built-in runtime surfaces (Gemini, Aider, Pi) as reference for the custom-role reporter contract, community corroboration sources, and the two out-of-scope items dropped during the rewrite (per-tool status granularity, agents beyond the built-in five). Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
Replace byte-driven status detection with evidence arbitration, runtime event gates, process and OSC corroboration, rule packs, diagnostics, and roadmap/docs updates. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Codex <codex@openai.com>
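As a rough illustration of what evidence arbitration means in the commit above, the sketch below lets each source submit a status claim and takes the claim from the strongest source. The Source variants, their ranking, and the arbitrate function are hypothetical; the PR's actual data model, gating table, and debounce are not reproduced here:

```rust
/// Status values named in the PR summary.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Working,
    Blocked,
    Done,
    Idle,
    Unknown,
}

/// Hypothetical evidence sources, declared weakest to strongest so the
/// derived Ord ranks them. The real ranking may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Source {
    ScreenRule,
    Osc133,
    Process,
    Hook,
}

/// One claim about the session's status from one source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Evidence {
    pub source: Source,
    pub status: Status,
}

/// Take the claim from the strongest source; no evidence means Unknown.
/// Raw PTY output is deliberately not a source here, so output bytes
/// alone can never author a state.
pub fn arbitrate(evidence: &[Evidence]) -> Status {
    evidence
        .iter()
        .max_by_key(|e| e.source)
        .map(|e| e.status)
        .unwrap_or(Status::Unknown)
}
```

In this sketch a hook reporting Blocked outranks a screen rule reporting Working, and an empty evidence set resolves to Unknown rather than to a guess.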
Summary
This PR ships the in-container agent runtime status authority for jackin-capsule: agent panes now derive working, blocked, done, idle, and unknown from semantic hooks, foreground process ownership, OSC 133 shell markers, visible-screen detectors, stuck probes, and token activity instead of relying on the old silence timer. Operators get clearer capsule status badges, done markers, event streaming, wait/read/token control-channel queries, and token usage rollups across Claude, Codex, Amp, Kimi, and OpenCode; contributors get the shared protocol and roadmap docs that describe the new authority surface.
What's deferred (follow-up PRs)
ANTHROPIC_OAUTH_TOKEN remains out of scope for this PR.
Verify locally
Checkout
Paste this first to bypass the tirith paste scanner for the rest of the session:
export TIRITH=0
Then paste the checkout block:
Then build and export the jackin-capsule binary so the smoke steps below use it:
Static checks
Rust tests
The focused run covers the shared control protocol, status arbitration, hook installers, runtime detectors, token monitors, socket handling, and capsule TUI state touched by this PR.
Docs checks
User smoke
From the console, launch an agent workspace using the PR checkout, open at least one agent tab, and verify the capsule status area changes as the agent starts working, blocks for input, and returns to idle or done. The status should update without waiting for the removed silence timer, and the debug log should show status authority events rather than only raw output inactivity.
jackin-capsule smoke
jackin load the-architect . --debug
Inside the container, verify:
- jackin' [<agent-name>]
- Ctrl+\ opens the command palette (override with JACKIN_PALETTE_KEY)
- jackin-capsule status, jackin-capsule snapshot, and event/wait/token control paths continue to respond while the daemon is running
Documentation
Start the docs site locally:
(
  cd docs
  bun run dev -- --host 127.0.0.1
)
http://localhost:3000/reference/roadmap/agent-runtime-status/
Confirm the roadmap item says the status authority is fully implemented and its remaining follow-up scope matches this PR's deferred list.
http://localhost:3000/reference/roadmap/
Confirm the roadmap overview places the agent runtime status authority in the correct status section after this implementation.
Migration notes
None. This PR changes capsule/runtime protocol behavior and docs, but it does not bump the versioned config, workspace, or role manifest schemas.