feat(capsule): add agent runtime status authority#532

Open: donbeave wants to merge 31 commits into main from feature/agent-runtime-status
@donbeave donbeave commented Jun 4, 2026

Summary

This PR ships the in-container agent runtime status authority for jackin-capsule. Agent panes now derive their working, blocked, done, idle, and unknown states from semantic hooks, foreground process ownership, OSC 133 shell markers, visible-screen detectors, stuck probes, and token activity instead of relying on the old silence timer. Operators get clearer capsule status badges, done markers, event streaming, wait/read/token control-channel queries, and token usage rollups across Claude, Codex, Amp, Kimi, and OpenCode. Contributors get the shared protocol and roadmap docs that describe the new authority surface.

What's deferred (follow-up PRs)

  • Anthropic OAuth usage-window polling through ANTHROPIC_OAUTH_TOKEN remains out of scope for this PR.
  • Downstream attention prompts consume this status authority but are still tracked separately in the agent attention prompts roadmap item.

Verify locally

Checkout

Paste this first to bypass the tirith paste scanner for the rest of the session:

export TIRITH=0

Then paste the checkout block:

export JACKIN_PR_TEST_DIR="$HOME/Projects/jackin-project/test/pr-532"
mkdir -p "$JACKIN_PR_TEST_DIR"
cd "$JACKIN_PR_TEST_DIR"

if [ ! -d jackin/.git ]; then
  git clone https://github.com/jackin-project/jackin.git
fi

cd jackin
mise trust
git fetch -f origin feature/agent-runtime-status:refs/remotes/origin/feature/agent-runtime-status
git checkout -B feature/agent-runtime-status refs/remotes/origin/feature/agent-runtime-status
mise trust
mise install
cargo build --bin jackin
export PATH="$PWD/target/debug:$PATH"
export JACKIN_CONFIG_DIR="$JACKIN_PR_TEST_DIR/.config/jackin"
export JACKIN_HOME_DIR="$JACKIN_PR_TEST_DIR/.jackin"
which jackin

Then build and export the jackin-capsule binary so the smoke steps below use it:

eval "$(cargo run --bin build-jackin-capsule -- --export)"

Static checks

cargo fmt --check
cargo clippy --all-targets --all-features -- -D warnings

Rust tests

cargo nextest run -p jackin-protocol -p jackin-capsule
cargo nextest run --all-features

The focused run covers the shared control protocol, status arbitration, hook installers, runtime detectors, token monitors, socket handling, and capsule TUI state touched by this PR.

Docs checks

(
  cd docs
  bun install --frozen-lockfile
  bun run build
  bun run check:repo-links
  bunx tsc --noEmit
  bun test
)

User smoke

jackin console --debug

From the console, launch an agent workspace using the PR checkout, open at least one agent tab, and verify the capsule status area changes as the agent starts working, blocks for input, and returns to idle or done. The status should update without waiting for the removed silence timer, and the debug log should show status authority events rather than only raw output inactivity.

jackin-capsule smoke

jackin load the-architect . --debug

Inside the container, verify:

  • Row 0 status bar is visible: jackin' [<agent-name>]
  • Agent TUI starts and renders correctly below the status bar
  • Ctrl+\ opens the command palette (override with JACKIN_PALETTE_KEY)
  • Mouse clicks, arrow keys, and paste reach the agent unmodified
  • Status badges reflect runtime state changes from the new authority: working activity, blocked prompts, done markers, idle returns, and unknown/stuck cases where applicable
  • jackin-capsule status, jackin-capsule snapshot, and event/wait/token control paths continue to respond while the daemon is running

Documentation

Start the docs site locally:

(
  cd docs
  bun run dev -- --host 127.0.0.1
)

http://localhost:3000/reference/roadmap/agent-runtime-status/
Confirm the roadmap item says the status authority is fully implemented and its remaining follow-up scope matches this PR's deferred list.

http://localhost:3000/reference/roadmap/
Confirm the roadmap overview places the agent runtime status authority in the correct status section after this implementation.

Migration notes

None. This PR changes capsule/runtime protocol behavior and docs, but it does not bump the versioned config, workspace, or role manifest schemas.

donbeave and others added 4 commits June 4, 2026 17:53
…detectors

Replace the timer-based "silence means blocked" heuristic with a layered
status authority. Foundation slice: Phase 0 (silence-bug removal), Phase 1
(state model + status machine), and the Phase 4 screen-detector framework.

Phase 0 — stop lying with silence:
- Remove BLOCKED_AFTER, state_after_refresh, state_after_pty_output.
- refresh_state() is now a no-op; PTY output no longer flips state on a timer.
- Add AgentState::Unknown to the protocol (serializes as "unknown").

Phase 1 — state model:
- New agent_status module: AgentRawState (10-variant input-signal enum),
  SessionStatus machine with advance()/acknowledge(), seen-derived Done,
  and a monotonic revision counter.
- Session owns one status: SessionStatus; the old state field is gone,
  replaced by Session::state(). mark_operator_input and acknowledge route
  through the machine. PTY output only stamps last_output_at now.

Phase 4 — screen-detector framework:
- Detector trait + DetectorRegistry with detectors for all five built-in
  runtimes (claude, codex, amp, kimi, opencode), each matching the bottom
  DETECTION_ROWS = 24 of the vt100::Screen and returning Option<AgentRawState>.
- Wired into the 1Hz ticker via Multiplexer::refresh_session_statuses(),
  which also acknowledges the focused pane so its Done clears to Idle.

TUI:
- Tab strip renders Unknown as a mid-gray dot; roll-up priority
  Blocked > Done > Unknown > (Working/Idle).

Docs:
- Update the roadmap item to Partially implemented with an "Implemented Shape
  (As Shipped)" section documenting the three deliberate divergences from the
  original proposal (kept AgentState+Unknown; SessionStatus machine instead of
  arbitrate_session_status; AgentRawState as input-signal enum). Mark Phase 0/1
  shipped and Phase 4 partial. Move the overview entry to Partially implemented.

Remaining: Phase 2 (reporter + /proc), Phase 3 (hooks), rest of Phase 4
(OSC 133, cursor probes, stuck detection, fixtures), Phase 5 (subscription
protocol + workspace roll-up event), Phase 6 (token/quota monitor + dialog).

Tests: 311 lib tests pass; clippy + rustfmt clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…e-status

Document how this feature (developed on main) must be shaped to match the
parallel TUI Architecture refactor (feature/tui-architecture) so the eventual
merge is mechanical. Six alignment rules: keep derivation/token logic in pure
agent_status/ + token_monitor/ modules, AgentState stays the single wire enum
with Unknown added (refactor gains the VisibleAgentState arm on merge), tab
glyphs in one cohesive match that relocates to tui/components/status_bar.rs,
daemon additions cohesive and ticker-local for clean relocation into the
daemon/ submodules, Session::state() accessor for a find-replace field swap,
and Phase 6 token dialog built data-first so only a thin render shim is
Ratatui-rewritten on merge.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
Bake the verified TUI-refactor findings into the roadmap item so any future
implementer has the concrete merge facts without re-inspecting the branch:
exact file moves (statusbar.rs deleted → tui/components/status_bar.rs;
daemon.rs split into daemon/ submodules), the VisibleAgentState bridge enum
and visible_agent_state_from_protocol mapping, what the refactor did NOT
change (4-variant AgentState, silence-timer still present), and a per-file
conflict/resolution table for all eight touched code paths plus the two
conflict-free new module trees.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…nitor

Ships the remaining implementation of the agent runtime status authority
roadmap item (Phases 2, 3, 4, and the Phase 6 foundation).

Phase 2 — Container Reporter + Process Identity:
- Add `procfs = "0.18"` dependency; rewrite `process.rs` to use the
  `procfs` crate for `/proc`-based foreground-agent detection
- `docker/runtime/agent-status/report.sh` and `heartbeat.sh` — reporter
  CLI and heartbeat scripts that send length-prefixed JSON to the Capsule
  socket from inside the container
- `JACKIN_SESSION_ID`, `JACKIN_STATUS_SOCKET`, `JACKIN_STATUS_SOURCE`,
  `JACKIN_AGENT_RUNTIME` injected at session spawn (`build_agent_command`
  stamps `JACKIN_SESSION_ID` on the `CommandBuilder` using the generated sid)
- `ReportAgentState`, `HeartbeatAgentAuthority`, `ClearAgentAuthority`
  control messages added to the protocol and routed to `handle_control_msg`
  on the `Multiplexer`; sequence validation via `SequenceTracker` prevents
  stale/replayed authority
- `HookAuthority` struct in `agent_status::mod`; `Session` gains
  `hook_authority`, `sequence`, `child_pid`, `subagent_count` fields

Phase 3 — Semantic Runtime Reports:
- `ClaudeHookInstaller` writes and drift-repairs `~/.claude/settings.json`
  with all hook entries; `Stop` and `PermissionRequest` registered as
  `async: false`; atomic tmp-file + rename writes; three unit tests
- Stub installers for `Kimi`, `Amp`, `Codex`, `OpenCode`; `runtime_setup.rs`
  calls the correct installer per agent at every session launch
- Claude hook script (`docker/runtime/agent-status/hooks/claude/report-hook.sh`)
  maps every hook event to a raw state; `SubagentStart`/`SubagentStop` pass
  `--message <event>` so the daemon updates the subagent counter
- Subagent sticky-blocked rule: `WorkingVisible` suppressed when
  `subagent_count > 0` and session is `Blocked`
- Stub hook scripts for Kimi, Codex, and OpenCode

Phase 4 — Screen + Shell Integration:
- `OscCapture` extended with `OscShellMark` enum; `OSC 133 A/B/C/D`
  captured and drained each 1Hz tick → feeds `PromptVisible` /
  `Osc133PreExec` into the status machine
- Shell integration `.zshrc` block installed by `runtime_setup.rs`;
  OSC 133 + OSC 7 `precmd`/`preexec` hooks from `install_shell_integration()`
- `TabGlyph::Working` (◌ dim phosphor-green) and `TabGlyph::Stuck` (!
  amber) added to `statusbar.rs`; roll-up priority updated to include
  `Working` between `Done` and `Unknown`
- Screen fixture files for all five runtimes under
  `crates/jackin-capsule/src/agent_status/screen/fixtures/`
  (working / blocked / idle / false_positive per agent)

Phase 6 foundation — Token Monitor:
- `token_monitor/` module: `TokenMonitor`, `TokenSession`, `TokenTotals`
- Claude JSONL reader with incremental byte-offset reads, `costUSD`
  preference, sidechain skip, and three unit tests
- Static pricing table (`pricing.rs`) covering Claude, OpenAI, Kimi/MoonShot
  with tiered calculation; three unit tests
- Stub readers for Codex, Kimi, Amp, OpenCode; embedded model catalog
- `TokenMonitor` registered/deregistered alongside sessions; 30-second
  polling ticker in the daemon event loop; `SessionInfo.token_usage` now
  populated from monitor totals

Protocol additions:
- `ClientMsg`: `ReportAgentState`, `HeartbeatAgentAuthority`,
  `ClearAgentAuthority`, `EventsSubscribe`, `WaitSessionStatus`,
  `SessionReadVisible`
- `ServerMsg`: `AgentStateChanged`, `SessionStatusResult`,
  `SessionVisibleText`, `Welcome`, `Error`
- `TokenUsageSummary` struct; `SessionInfo.token_usage` field
- `WaitSessionStatus` handled synchronously against snapshot in
  `socket::handle_control_request`; full push-subscription deferred to Phase 5

Roadmap updates:
- Phases 0–4 marked ✅ Shipped; Phase 5 and Phase 6 marked partial
- `roadmap/index.mdx` updated to reflect current coverage
- `<RepoFile />` links for new docker runtime assets

329 capsule lib tests + 10 protocol tests pass; pre-existing
`git_prepare_commit_msg` integration test failures are unrelated.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
@donbeave donbeave changed the title from "feat(capsule): agent runtime status authority — state model + screen detectors" to "feat(capsule): agent runtime status authority — Phases 0–4 + token monitor foundation" on Jun 4, 2026
Phase 5 — Host Consumer Integration:
- Console session list state badges color-coded by AgentState: Blocked
  = bright red, Done = phosphor green, Working = dim green, Unknown =
  mid-gray, Idle = quiet dim
- Done panes render with ◎ marker (vs ○ for normal, ● for focused) so
  finished-but-unseen work is immediately visible
- Daemon already acknowledges focused pane on each 1Hz tick (Done → Idle)

Phase 6 — Full Token Monitoring:
- Codex JSONL reader: session-format (event_msg/token_count cumulative-
  delta calculation) + headless format (direct usage fields); 2 unit tests
- Kimi wire.jsonl reader: parses StatusUpdate token_usage fields
  (input_other, output, input_cache_read, input_cache_creation)
- Amp thread JSON reader: array and wrapper-object shapes; 2 unit tests
- OpenCode SQLite reader: rusqlite crate, rowid-incremental queries,
  read-only connection flags, legacy schema stub
- ModelCatalog::populate() with live API queries via ureq:
  Anthropic /v1/models, OpenAI /v1/models, MoonShot /v1/models;
  embedded fallback when API key absent or request fails
- RateWindow + ProviderUsageSnapshot structs; TokenMonitor.token_snapshots
  updated on each poll cycle
- Dialog::TokenUsage carries TokenUsageSummary; render_token_usage shows
  model name, input/output/cache token counts (formatted as 14.2k / 1.4M),
  and cost estimate
- Ctrl+u (0x15) opens token usage dialog from any pane
- Status bar token indicator: compact "89% (5h)" format at end of row-1,
  amber below 20%, red below 10% (wired into StatusBar::render())
- TokenGetSession + TokenGetModels control messages in protocol; handled
  in one-shot socket handler
- ModelCatalog registered at session start via register_session()
- Dependencies: rusqlite = "0.32" (bundled), ureq = "2.12" (tls)
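As a rough illustration of the compact token formatting ("14.2k / 1.4M") and the indicator thresholds listed above, here is a minimal sketch. The function names, the `IndicatorLevel` enum, and the reading of the percentage as "quota remaining" are assumptions made for the example; the PR's `render_token_usage` and `StatusBar::render()` may differ.

```rust
/// Formats a token count compactly: thousands as "k", millions as "M",
/// one decimal place each. Counts below 1000 are printed as-is.
fn format_tokens(n: u64) -> String {
    if n >= 1_000_000 {
        format!("{:.1}M", n as f64 / 1_000_000.0)
    } else if n >= 1_000 {
        format!("{:.1}k", n as f64 / 1_000.0)
    } else {
        n.to_string()
    }
}

/// Hypothetical colour level for the status bar token indicator.
#[derive(Debug, PartialEq, Eq)]
enum IndicatorLevel {
    Normal,
    Amber,
    Red,
}

/// Assumption: the percentage is quota remaining, so lower is worse.
/// Red below 10%, amber below 20%, normal otherwise.
fn indicator_level(percent_remaining: u8) -> IndicatorLevel {
    if percent_remaining < 10 {
        IndicatorLevel::Red
    } else if percent_remaining < 20 {
        IndicatorLevel::Amber
    } else {
        IndicatorLevel::Normal
    }
}
```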

337 capsule lib tests + 10 protocol tests pass.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
@donbeave donbeave changed the title from "feat(capsule): agent runtime status authority — Phases 0–4 + token monitor foundation" to "feat(capsule): agent runtime status authority — Phases 0–6" on Jun 4, 2026
donbeave and others added 2 commits June 4, 2026 22:06
…OpenCode ACP

events.subscribe streaming push protocol:
- Add `state_broadcast_tx: broadcast::Sender<ServerMsg>` to Multiplexer
- Broadcast `AgentStateChanged` on every session state change in
  `refresh_session_statuses`, OSC 133 handler, and `handle_control_msg`
- Broadcast spawn event when sessions are added to the tab tree
- `EventsSubscribe` handling in `socket::handle_control_request`:
  sends a `Welcome` frame then enters a streaming loop pushing
  broadcast events until the connection drops; handles `Lagged` errors
- Pass `state_broadcast_tx` through `perform_handshake` to the handler

CSI 6n cursor-position probe for stuck detection:
- `Session::probe_cursor_position()` sends `\x1b[6n` to the PTY
- `Session::check_for_cpr_response()` scans PTY output for
  `\x1b[<row>;<col>R` pattern; sets `cursor_probe_responded` flag
- `Session` gains: `cursor_probe_pending`, `cursor_probe_sent_at`,
  `cursor_probe_responded`, `stuck_since` fields
- `refresh_session_statuses` sends probe when Working + no output >
  4 minutes; routes CPR response as `CursorProbeOk`; handles 2s
  timeout as `CursorProbeTimeout`
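The `\x1b[<row>;<col>R` scan described above could look roughly like the following. This is a sketch of the pattern match only; `find_cpr` and `parse_cpr_body` are hypothetical names, and the real `Session::check_for_cpr_response()` sets a flag rather than returning coordinates.

```rust
/// Scans a PTY output chunk for a cursor position report
/// (ESC [ <row> ; <col> R) and returns the first (row, col) found.
fn find_cpr(bytes: &[u8]) -> Option<(u16, u16)> {
    let mut i = 0;
    while i + 1 < bytes.len() {
        if bytes[i] == 0x1b && bytes[i + 1] == b'[' {
            if let Some(pos) = parse_cpr_body(&bytes[i + 2..]) {
                return Some(pos);
            }
        }
        i += 1;
    }
    None
}

/// Parses "<row>;<col>R" at the start of `rest`. Any other CSI final
/// byte (for example the `J` in a clear-screen sequence) yields None.
fn parse_cpr_body(rest: &[u8]) -> Option<(u16, u16)> {
    let end = rest
        .iter()
        .position(|b| !(b.is_ascii_digit() || *b == b';'))?;
    if rest[end] != b'R' {
        return None;
    }
    let body = std::str::from_utf8(&rest[..end]).ok()?;
    let (row, col) = body.split_once(';')?;
    Some((row.parse().ok()?, col.parse().ok()?))
}
```

A response split across two PTY reads would be missed by this sketch; handling that would need a small carry-over buffer.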

Stuck timeout → `TabGlyph::Stuck` (amber `!`):
- `STUCK_WARNING_THRESHOLD = 5 minutes`: sets `session.stuck_since`
- `snapshot_stuck_sessions()` returns a `HashSet<u64>` of stuck IDs
- `tab_label` takes `stuck_sessions: &HashSet<u64>`; uses
  `TabGlyph::Stuck` when any pane is stuck (priority between Done
  and Working)
- `StatusBar::render` takes `stuck_sessions` and passes it to
  `tab_label`; all 6 test call sites updated
- `#[allow(dead_code)]` removed from `TabGlyph::Stuck`

OpenCode ACP stdio JSON-RPC bridge:
- `docker/runtime/agent-status/hooks/opencode/acp-bridge.sh`: spawns
  `opencode acp`, reads JSON-RPC notifications (session.idle,
  session.busy, session.status, question.asked, tool.call,
  agent.start/end), maps to `report.sh` calls
- `OpenCodeAcpInstaller::install()` writes a marker file and is
  called from `install_status_reporter_hooks` for opencode sessions
- `entrypoint.sh` opencode case launches the ACP bridge in background

337 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…verview

Phases 0–6 all marked ✅ Shipped. Phase 4 updated with CSI 6n probe and
Stuck timeout. Phase 5 updated with events.subscribe streaming details.
Roadmap index entry updated to reflect full implementation.

Only remaining item: Anthropic OAuth usage window polling (opt-in via
ANTHROPIC_OAUTH_TOKEN, explicitly out of scope for this PR).

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
@donbeave donbeave changed the title from "feat(capsule): agent runtime status authority — Phases 0–6" to "feat(capsule): agent runtime status authority — all phases" on Jun 4, 2026
donbeave and others added 4 commits June 4, 2026 22:19
…ocking WaitSessionStatus

arbitrate.rs — pure arbitration function + 11 tests:
- arbitrate_unknown_when_no_signals
- arbitrate_working_from_process_alive
- arbitrate_blocked_from_screen_blocker_no_hook
- arbitrate_blocked_authoritative_when_hook_agrees
- arbitrate_working_overrides_idle_hook_with_fresher_screen
- arbitrate_fresh_hook_authority_wins
- arbitrate_process_exit_clears_to_idle
- arbitrate_hook_cleared_when_wrong_agent
- roll_up_blocked_beats_working_beats_idle
- roll_up_done_beats_working / roll_up_unknown_when_empty
- attention_priority_order
ScreenDetection and ProcessEvidence structs as pure evidence types.
attention_priority() and roll_up_states() public helpers.
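For readers unfamiliar with the roll-up, a minimal sketch of the two helpers follows. The ordering used here (Blocked > Done > Working > Idle > Unknown, with an empty input rolling up to Unknown) is taken from the test names in this commit; the signatures and numeric ranks are assumptions, and the tab-strip glyph priority described elsewhere in the PR is a separate concern.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Working,
    Blocked,
    Done,
    Idle,
    Unknown,
}

/// Higher number means the state needs operator attention more urgently.
fn attention_priority(state: AgentState) -> u8 {
    match state {
        AgentState::Blocked => 4,
        AgentState::Done => 3,
        AgentState::Working => 2,
        AgentState::Idle => 1,
        AgentState::Unknown => 0,
    }
}

/// Collapses many session states into the single most urgent one.
/// An empty slice rolls up to Unknown.
fn roll_up_states(states: &[AgentState]) -> AgentState {
    states
        .iter()
        .copied()
        .max_by_key(|s| attention_priority(*s))
        .unwrap_or(AgentState::Unknown)
}
```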

State machine tests added to agent_status/mod.rs:
- re_work_after_ack_creates_new_done
- done_derived_from_idle_plus_unseen
- roll_up_priority_blocked_gt_done_gt_working_gt_idle_gt_unknown

Reporter protocol tests added:
- reporter_accept_valid_sequence / reporter_reject_stale_sequence
- reporter_reject_wrong_source_after_clear (sequence.rs)
- heartbeat_keeps_hook_authority_fresh
- clear_authority_removes_only_matching_source (mod.rs)

Process identity tests added (process.rs):
- identify_agent_stat_comm_truncation_falls_back_to_exe
- dead_process_returns_none

Token monitor tests added (token_monitor/mod.rs):
- token_monitor_backs_off_after_silence
- token_monitor_resets_backoff_after_change
- token_monitor_poll_due_respects_interval
- session_info_includes_token_usage_when_available

Model catalog tests added (token_monitor/models.rs):
- model_catalog_falls_back_to_embedded_list_on_error
- model_catalog_uses_cached_result_within_ttl
- model_catalog_parses_model_entries_correctly

wait_session_status blocking implementation (socket.rs):
- Checks current state against target immediately
- If not satisfied: subscribes to state_broadcast_tx and blocks
- Loops on broadcast events until matching AgentStateChanged
- Exits with "satisfied", "timeout", or "not_found" outcome
- Handles Lagged broadcast errors gracefully
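The check-then-subscribe flow above can be sketched without an async runtime. The example below uses `std::sync::mpsc` as a stand-in for the daemon's broadcast channel, so it does not model `Lagged` errors; `wait_for_state` and its parameters are hypothetical, and treating a closed channel as a timeout is a simplification chosen for the sketch.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Working,
    Blocked,
    Idle,
}

/// `current` is None when the session id is unknown.
/// Returns one of the three outcome strings named in the commit message.
fn wait_for_state(
    current: Option<AgentState>,
    target: AgentState,
    events: &Receiver<AgentState>,
    timeout: Duration,
) -> &'static str {
    // Step 1: check the current snapshot before blocking at all.
    match current {
        None => return "not_found",
        Some(s) if s == target => return "satisfied",
        Some(_) => {}
    }
    // Step 2: block on state-change events until the target or the deadline.
    let deadline = Instant::now() + timeout;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match events.recv_timeout(remaining) {
            Ok(s) if s == target => return "satisfied",
            Ok(_) => continue,
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => {
                return "timeout";
            }
        }
    }
}
```

Checking the snapshot first means a caller that asks for a state the session is already in returns immediately, which is what the `wait_session_status_already_satisfied` test name suggests.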

Screen fixtures added:
- claude/false_positive_old_spinner.txt — spinner above separator
- claude/false_positive_timing.txt — "Churned for 1m" timing line

wait_session_status_already_satisfied test added (daemon.rs)

367 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…ning

- claude_stop_hook_is_registered_as_sync_and_checks_background_tasks:
  verifies Stop and PermissionRequest hooks are async: false and command
  path is correct (hook_installer.rs)
- subagent_counter_prevents_working_from_clearing_blocked: documents
  the WorkingVisible/Blocked interaction and daemon-level guard
  (agent_status/mod.rs)
- wait_session_status_logic_timeout_when_not_satisfied: unsatisfied
  targets → outcome "timeout" (daemon.rs)
- wait_session_status_logic_not_found_for_unknown_session +
  session_info_token_usage_is_populated_from_monitor (daemon.rs)
- #[allow(dead_code)] on screen_idle test helper in arbitrate.rs

372 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
Amp detector now detects all three states:
- Working: "esc to cancel" chrome (existing)
- Blocked: "Allow?" or "approve"+"esc to cancel" approval prompts (new)
- Idle: ">" or "> " or "❯" prompt at bottom (new)
3 new tests: detects_blocked_from_approval_prompt,
             detects_idle_from_prompt_line,
             detects_idle_from_arrow_prompt

OpenCode detector now detects all three states:
- Blocked: "permission required" or "dismiss"+"enter/select" (existing + expanded)
- Working: "Ctrl+C to cancel" or "interrupt" + "cancel" chrome (new)
- Idle: ">" or "> " or "❯" input box at bottom (new)
3 new tests: detects_blocked_from_question_prompt,
             detects_working_from_interrupt_chrome,
             detects_idle_from_input_box

New screen fixture: opencode/false_positive.txt (cosmetic redraw,
no working/blocked/idle triggers)

PTY/e2e simulation tests (4) in agent_status/mod.rs:
- fake_claude_permission_dialog_transitions_to_blocked: vt100::Parser
  renders permission dialog → DetectorRegistry → BlockedVisible → Blocked
- fake_claude_spinner_transitions_to_working_then_idle: spinner screen
  → Working; then prompt box → PromptVisible → Done (unseen)
- process_exit_signal_transitions_to_idle_and_clears_authority: Working
  + ProcessExited signal → Idle
- multiple_sessions_roll_up_reflects_most_urgent: roll_up_states over
  Working/Blocked/Working/Idle → Blocked (highest priority)

382 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…iteria met

All acceptance criteria verified. Status line updated to 'Implemented'.
382 tests confirm coverage. OAuth polling remains explicitly out of scope
per the roadmap's own out-of-scope list.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
@donbeave donbeave changed the title from "feat(capsule): agent runtime status authority — all phases" to "feat(capsule): agent runtime status authority — all phases, all acceptance criteria" on Jun 4, 2026
donbeave and others added 14 commits June 4, 2026 22:42
…gent msg

Protocol compliance with roadmap event stream payload spec:

AgentStateChanged expanded to include all required fields:
  raw_state, confidence, detected_agent, foreground_pgid,
  visible_blocker, visible_idle, visible_working, process_exited,
  stale_report, seq, ts_ns, last_seen_revision (all optional with
  serde defaults so existing serialised events decode without errors)

New ServerMsg variants:
  SessionSpawned { session_id, agent, label } — new session created
  SessionExited { session_id } — session removed
  TokenUsageChanged { session_id, agent, model, tokens, cost, ts_ns }
  WorkspaceStatusChanged { effective, session/blocked/done/working counts }

New ClientMsg variant:
  ReportChildAgentState { parent_session_id, child_session_id, raw_state, seq }

Daemon wiring:
  SessionSpawned broadcast on every tab-spawn and split-spawn insert
  SessionExited broadcast when session is removed
  WorkspaceStatusChanged broadcast after every refresh_session_statuses
  TokenUsageChanged broadcast per session on token ticker cycle
  All AgentStateChanged broadcasts use expanded field set

Shared palette compliance (roadmap requirement):
  AMBER: Rgb = Rgb::new(255, 170, 0) added to crates/jackin-tui/src/lib.rs
  statusbar.rs: Working glyph and Stuck/token-bar amber now use
    PHOSPHOR_DARK_FG / AMBER_FG constants derived from shared palette
    instead of inline \x1b[38;2;... byte literals

382 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
crates/jackin-protocol/src/agent_status.rs (new):
- AgentRawState { Unknown, Working, Blocked, Idle } — protocol-level
  4-variant raw state enum with label() and serde support
- AgentStatusSource — Reported/VisibleScreen/ForegroundProcess/
  ShellIntegration/CursorProbe/OutputActivity/None
- AgentStatusConfidence — Unknown/Weak/Strong/Authoritative (Ord impl)
- AgentStatusReport struct — all evidence fields: raw_state, source,
  confidence, detected_agent, foreground_pgid, visible_blocker/idle/
  working, process_exited, stale_report, revision, last_seen_revision
- 4 unit tests: labels, confidence ordering, default state, JSON roundtrip
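One way to get the confidence ordering described above is to derive `Ord` on an enum whose variants are declared weakest first. Whether the PR derives or hand-writes its `Ord` impl is not stated, so treat the derive and the `strongest` helper below as assumptions for illustration.

```rust
/// Variants are declared weakest to strongest, so the derived ordering
/// gives Unknown < Weak < Strong < Authoritative.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AgentStatusConfidence {
    Unknown,
    Weak,
    Strong,
    Authoritative,
}

/// Hypothetical helper: picks the highest confidence among several
/// pieces of evidence, defaulting to Unknown when there is none.
fn strongest(evidence: &[AgentStatusConfidence]) -> AgentStatusConfidence {
    evidence
        .iter()
        .copied()
        .max()
        .unwrap_or(AgentStatusConfidence::Unknown)
}
```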

crates/jackin-protocol/src/lib.rs: pub mod agent_status

control.rs: SessionInfo and PaneSnapshot gain
  agent_status_report: Option<AgentStatusReport> field

crates/jackin-capsule/src/agent_status/seen.rs (new):
- acknowledge_session() and mark_pane_focused() delegating to
  SessionStatus::acknowledge()
- 2 unit tests

crates/jackin-capsule/src/agent_status/mod.rs:
- pub mod seen registered
- event_stream_emits_on_raw_state_change test: verifies Unknown→Working
  produces Some(Working), repeated Working produces None, Working→Blocked
  produces Some(Blocked)

385 capsule + 14 protocol tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
Resolved conflicts from two main-branch PRs:
- #529: signed releases + supply-chain verification (GitHub Actions, new
  .gitignore entry, release workflow changes, docs for security/verification)
- #530: agent codenames + operator CLI hygiene (new `Agents`/`AgentRegistry`
  ClientMsg/ServerMsg, `AgentRegistryEntry` struct, codename_live/retired/
  wordlist fields in Multiplexer, `jackin status` CLI, error codes, preflight)

Conflict resolutions:
- control.rs: kept all our new variants (ReportAgentState, HeartbeatAgentAuthority,
  ClearAgentAuthority, ReportChildAgentState, EventsSubscribe, WaitSessionStatus,
  SessionReadVisible, TokenGetSession, TokenGetModels, AgentStateChanged,
  SessionSpawned, SessionExited, TokenUsageChanged, WorkspaceStatusChanged,
  TokenSessionResult, TokenModelsResult, SessionStatusResult, SessionVisibleText,
  Welcome, Error, TokenUsageSummary) AND added main's Agents/AgentRegistry/
  AgentRegistryEntry from #530
- daemon.rs: kept our detectors/token_monitor fields AND added main's
  codename_live/retired/agent_history/wordlist_offset fields
- client.rs: kept our wildcard match arms AND added explicit AgentRegistry arms

389 capsule + 14 protocol tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
arbitrate.rs:
- Extract ScreenDetection::is_fresher_than() — freshness check was
  duplicated inline at two guard sites (blocker and working overrides);
  now a single documented method with a clear conservative default

process.rs:
- Extract agent_kind_from_name(name: &str) — agent-slug matching was
  duplicated across the exe-basename path and the stat.comm fallback;
  both paths now call the same fn

token_monitor/models.rs:
- Extract fetch_from_api() — three near-identical fetch_anthropic /
  fetch_openai / fetch_moonshot bodies collapsed into a single
  parameterized helper; each caller is now one line with its
  env-key, URL, auth-header builder, and model filter

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…otes

Moves OSC 133 shell-integration parsing out of `OscCapture`/`session.rs`
into a model-independent raw-PTY scanner in `agent_status/mod.rs`. This
eliminates the conflict with #495's vt100 → DamageGrid/PassthroughEvent
migration (which removes `OscCapture` entirely).

agent_status/mod.rs:
- Add `OscShellMark { PromptStart, PromptEnd, PreExec, CommandFinished }`
- Add `scan_osc133(bytes: &[u8]) -> Option<OscShellMark>` — byte-level
  scanner that finds `\x1b]133;A/B/C/D` sequences without depending on
  the vt100 parser or DamageGrid model
- 5 unit tests for the scanner
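A byte-level scanner with the signature given above might look like the sketch below. The enum and function signature come from this commit message; the A/B/C/D to variant mapping follows the listed variant order, and returning the last mark in the buffer when several are present is an assumption, since the message does not say which one wins.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OscShellMark {
    PromptStart,
    PromptEnd,
    PreExec,
    CommandFinished,
}

/// Finds `ESC ] 133 ; <A|B|C|D>` in raw PTY bytes without any terminal
/// model. Assumption: when several marks appear, the last one is returned.
fn scan_osc133(bytes: &[u8]) -> Option<OscShellMark> {
    const PREFIX: &[u8] = b"\x1b]133;";
    let mut found = None;
    let mut i = 0;
    // The byte after the prefix must exist, hence the strict `<`.
    while i + PREFIX.len() < bytes.len() {
        if bytes[i..].starts_with(PREFIX) {
            let mark = match bytes[i + PREFIX.len()] {
                b'A' => Some(OscShellMark::PromptStart),
                b'B' => Some(OscShellMark::PromptEnd),
                b'C' => Some(OscShellMark::PreExec),
                b'D' => Some(OscShellMark::CommandFinished),
                _ => None,
            };
            if mark.is_some() {
                found = mark;
            }
            i += PREFIX.len() + 1;
        } else {
            i += 1;
        }
    }
    found
}
```

The shortest complete sequence (prefix, one mark byte, and a BEL terminator) is 8 bytes long. A sequence split across two PTY reads is not detected by this sketch.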

session.rs:
- Remove `OscShellMark` enum (moved to agent_status)
- Remove `OscCapture.shell_mark` field, `take_shell_mark()`, OSC 133 block
  in `unhandled_osc`
- Remove `Session::take_shell_mark()`
- Net -50 lines; session.rs conflicts with #495 are now confined to
  the `state`→`status` field swap and the new cursor-probe fields

daemon.rs:
- `SessionEvent::Output` handler calls `scan_osc133(&data)` after
  `session.feed_pty`, feeds result into state machine immediately
  (no 1Hz lag for shell-boundary signals)
- Remove `take_shell_mark` drain from `refresh_session_statuses`

statusbar.rs:
- Add doc comment to `tab_label` noting the post-#495 type swap from
  `AgentState` to `VisibleAgentState` — mechanical find-replace, glyph
  arms unchanged

394 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…nature

StatusBar::render() gained stuck_sessions and token_snapshot params
(added in Phase 4/6). The integration test helpers were missed when
those params were added. CI caught the mismatch on MSRV check.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
token_monitor/mod.rs:
- Add find_provider_files(base_dirs, ext) shared helper — removes
  duplicate directory-scan code across claude.rs and codex.rs
- Combine poll_due_sessions() from two passes (collect IDs then update
  snapshots) into a single pass that does both in one loop iteration

token_monitor/claude.rs + codex.rs:
- Replace local find_jsonl_files() with calls to find_provider_files()

daemon.rs:
- Guard scan_osc133 call with data.len() >= 8 early-exit so short
  PTY buffers (spinner redraws, cursor reports) skip the scan entirely

394 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
daemon.rs:
- make_agent_state_changed() helper eliminates 4 repetitions of the
  AgentStateChanged 16-field struct literal; each broadcast site now
  passes only the fields it owns (session_id, effective, seen, source,
  revision, reason)
- Workspace roll-up: 3 separate .filter().count() passes → single
  match loop with three counters; one iteration over sessions instead
  of three
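The roll-up change above (three `.filter().count()` passes replaced by one loop) can be illustrated as follows. `WorkspaceCounts` and `count_states` are hypothetical names for this sketch; the real code works on session objects rather than a plain slice of states.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Working,
    Blocked,
    Done,
    Idle,
    Unknown,
}

#[derive(Debug, Default, PartialEq, Eq)]
struct WorkspaceCounts {
    blocked: usize,
    done: usize,
    working: usize,
}

/// Counts blocked, done, and working sessions in a single pass.
fn count_states(states: &[AgentState]) -> WorkspaceCounts {
    let mut counts = WorkspaceCounts::default();
    for state in states {
        match state {
            AgentState::Blocked => counts.blocked += 1,
            AgentState::Done => counts.done += 1,
            AgentState::Working => counts.working += 1,
            // Idle and Unknown are not counted in this sketch.
            AgentState::Idle | AgentState::Unknown => {}
        }
    }
    counts
}
```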

socket.rs:
- make_result closure in WaitSessionStatus arm — 3 identical
  SessionStatusResult constructions collapsed to 1 call site each;
  only outcome/effective/revision differ

394 capsule lib tests + 8 integration tests pass.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
… errors, models TTL

From comprehensive PR review:

Done state now reachable from screen detectors:
- `WorkingVisible`/`BlockedVisible` transitioning FROM Idle/Unknown/Done
  now resets `seen = false`, so the subsequent PromptVisible produces
  `Done` instead of `Idle`. Previously only `HookTaskStart`/`OperatorInput`
  reset `seen`, making `Done` unreachable in the screen-detection path.
- Updated `prompt_visible_after_working_produces_idle_when_seen` test:
  enter Working first (resets seen), then explicitly set seen=true, then
  PromptVisible → Idle (correct; seen was already true when work ended)
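The `seen` rule described above is easier to follow in code. The sketch below is a heavily reduced model of the status machine: `RawSignal` is a hypothetical three-variant subset of the PR's `AgentRawState`, and the field names are assumptions. It only shows how Done is derived from "idle plus unseen" and how starting work from Idle, Unknown, or Done resets `seen`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Working,
    Blocked,
    Done,
    Idle,
    Unknown,
}

/// Hypothetical subset of the raw input signals.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RawSignal {
    WorkingVisible,
    BlockedVisible,
    PromptVisible,
}

#[derive(Debug)]
struct SessionStatus {
    raw: AgentState, // never Done; Done is derived in `state()`
    seen: bool,
    revision: u64,
}

impl SessionStatus {
    fn new() -> Self {
        Self { raw: AgentState::Unknown, seen: true, revision: 0 }
    }

    /// Effective state: an idle session whose result is unseen reads as Done.
    fn state(&self) -> AgentState {
        if self.raw == AgentState::Idle && !self.seen {
            AgentState::Done
        } else {
            self.raw
        }
    }

    /// Applies a signal; returns the new effective state only if it changed.
    fn advance(&mut self, signal: RawSignal) -> Option<AgentState> {
        let before = self.state();
        match signal {
            RawSignal::WorkingVisible | RawSignal::BlockedVisible => {
                // Work starting from Idle/Unknown/Done marks the result unseen.
                if matches!(self.raw, AgentState::Idle | AgentState::Unknown) {
                    self.seen = false;
                }
                self.raw = if signal == RawSignal::WorkingVisible {
                    AgentState::Working
                } else {
                    AgentState::Blocked
                };
            }
            RawSignal::PromptVisible => self.raw = AgentState::Idle,
        }
        self.bump_if_changed(before)
    }

    /// Operator looked at the pane: Done collapses to Idle.
    fn acknowledge(&mut self) -> Option<AgentState> {
        let before = self.state();
        self.seen = true;
        self.bump_if_changed(before)
    }

    /// Increments the monotonic revision only on an effective change.
    fn bump_if_changed(&mut self, before: AgentState) -> Option<AgentState> {
        let after = self.state();
        if after == before {
            return None;
        }
        self.revision += 1;
        Some(after)
    }
}
```

In this model, Working followed by PromptVisible yields Done, while acknowledging the pane during the work and then seeing the prompt yields Idle, matching the updated test described above.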

amp.rs token double-counting (critical data bug):
- Replaced accumulation into session.totals with scratch-compute-and-compare:
  totals computed fresh each poll, `changed` only true if they moved.
  Previously totals were added cumulatively — after N polls, counts were
  N× the real value with no indication of the problem.

let _ = file.seek() silent corruption (claude/codex/kimi):
- Check seek result; on failure, cdebug! log + reset offset to 0 and
  re-seek from start. Previously a truncated/rotated file would silently
  corrupt the offset and lose tokens on the next poll.

models.rs fetched_at stamped even after HTTP failure:
- Only stamp `self.fetched_at` when entries were actually added.
  Previously a transient network failure at boot silenced the model
  catalog for the entire session (24h TTL blocked retries).
- Add cdebug! logging on each early-return path in fetch_from_api.

opencode.rs silent DB failures:
- Add cdebug! when Connection::open_with_flags fails.
- Add cdebug! when conn.prepare fails (schema mismatch).

report.sh silent drop when Python unavailable:
- Add `else` branch with stderr message when neither python3 nor python
  is found. Previously state reports were silently discarded.
- Remove `|| true` from nc pipe; emit stderr on socket write failure.
- Remove dead LEN variable (computed but never used).

Comment fixes:
- agent_status/mod.rs: fix arch diagram reference to non-existent `osc133`
  submodule; remove HookTaskStart from priority rule 3 (it always maps to
  Working, never Blocked)
- arbitrate.rs: fix the "Called by daemon's 1Hz ticker" comment (the function is currently called only from tests)
- session.rs: replace stale Phase-1-will-replace comment in refresh_state
- statusbar.rs: remove PR #495 forward-reference to non-existent VisibleAgentState

394 capsule lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
token_monitor/mod.rs: seek_or_reset(file, offset, path) — single place for the
seek-failure handling pattern used by claude/codex/kimi; logs via cdebug! and
resets offset to 0 so the next read starts from the beginning

claude.rs / codex.rs / kimi.rs: replace inline 4-line seek+reset blocks with
seek_or_reset(); remove now-unused Seek/SeekFrom imports

amp.rs: borrow &[serde_json::Value] instead of .to_owned() — avoids cloning the
entire JSON array on every file read during each poll cycle

394 lib tests pass; workspace builds clean.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…, broadcast logging

models.rs — same-count API response never stamped fetched_at (HIGH):
- fetch_from_api() now returns a bool: true on HTTP success, false on failure
- populate() stamps fetched_at whenever the round-trip returned true,
  regardless of whether the entry count changed. Previously retain+extend
  with the same count produced len==before, which left needs_refresh()=true
  permanently and fired an HTTP call on every session spawn

amp.rs — cache-only token changes silently frozen (MEDIUM):
- Include cache_read_tokens and cache_write_tokens in the `changed`
  comparison; was input+output only, so all-cache-hit runs never updated
  the displayed cache totals
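
A rough sketch of the scratch-compute-and-compare approach with all four fields in the `changed` comparison. The `Totals` struct and `refresh_totals` function are hypothetical names; the real amp.rs parses per-message usage out of JSON rather than taking a slice like this:

```rust
#[derive(Clone, Copy, Default, PartialEq, Eq, Debug)]
struct Totals {
    input: u64,
    output: u64,
    cache_read: u64,
    cache_write: u64,
}

/// Recomputes totals from scratch on every poll and reports whether any of
/// the four fields moved. Nothing is accumulated across polls, so polling
/// the same data N times cannot inflate the counts.
fn refresh_totals(current: &mut Totals, per_message: &[Totals]) -> bool {
    let mut fresh = Totals::default();
    for m in per_message {
        fresh.input += m.input;
        fresh.output += m.output;
        fresh.cache_read += m.cache_read;
        fresh.cache_write += m.cache_write;
    }
    // Derived PartialEq compares all four fields, so cache-only changes count.
    let changed = fresh != *current;
    *current = fresh;
    changed
}
```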

token_monitor/mod.rs — second seek failure silently ignored (MEDIUM):
- seek_or_reset() now returns bool; if the fallback seek-to-0 also fails,
  cdebug! is logged and the function returns false so callers skip the file
  rather than constructing a BufReader at an unknown position
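
A minimal sketch of the helper's contract, written generically over `Seek` so it can be exercised without a real file. The explicit past-end-of-file check is an assumption added here (seeking past EOF normally succeeds on regular files), the `path` parameter is omitted, and `eprintln!` stands in for `cdebug!`:

```rust
use std::io::{Seek, SeekFrom};

/// Positions `src` at `*offset`. If the offset is unusable, resets it to 0
/// and seeks to the start. Returns false only when the fallback seek also
/// fails, so the caller can skip the file.
pub fn seek_or_reset<S: Seek>(src: &mut S, offset: &mut u64) -> bool {
    // Assumption: an offset beyond the current length means the file was
    // truncated or rotated since the last poll.
    let in_range = matches!(src.seek(SeekFrom::End(0)), Ok(len) if *offset <= len);
    if in_range && src.seek(SeekFrom::Start(*offset)).is_ok() {
        return true;
    }
    eprintln!("seek to {} failed or is past EOF; resetting to 0", *offset);
    *offset = 0;
    src.seek(SeekFrom::Start(0)).is_ok()
}
```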

daemon.rs — state-transition broadcast failures entirely silent (MEDIUM):
- screen-detector, hook, and token-usage broadcast sites now log via
  cdebug! when send() returns Err (no active receivers); spawn sites
  keep let _ = (no receivers expected at spawn time)

report.sh — Python pipe failure invisible (MEDIUM):
- set -eu → set -euo pipefail so that a Python framing failure propagates
  through the pipeline's exit status instead of being masked by nc's exit code

Comments:
- Remove 2 misleading lines from blocked_is_sticky test that said
  "Working-visible must not clear Blocked" (the assertion proves it does)
- Update find_provider_files doc to describe both direct-child and
  one-level-deep scanning behavior

Tests added (6 new, total now 400):
- working_visible_from_fresh_session_produces_done_not_idle
- blocked_visible_from_fresh_session_produces_done_on_prompt
- working_visible_after_ack_produces_done_again
- seek_or_reset_succeeds_when_offset_valid
- seek_or_reset_resets_when_offset_beyond_eof
- populate_on_empty_api_key_leaves_needs_refresh_true

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…ed, fetch_from_api contract

codex.rs — prev_cumulative scoped wrong (HIGH):
- Move prev_cumulative inside the for-path loop so it resets to (0,0,0,0)
  per file; was shared across files causing cross-file delta corruption
- Replace saturating_sub with delta closure that cdebug! logs on
  counter regressions (counter drops after seek_or_reset or session replay)
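
The per-file scoping can be illustrated with a small sketch. The `Counters` alias, `sum_deltas`, and `delta` are hypothetical; treating a regression as a zero delta is an assumption, and `eprintln!` stands in for `cdebug!`:

```rust
/// Four cumulative counters as reported by one snapshot in a session file.
type Counters = [u64; 4];

/// Logs a counter regression and contributes nothing for it.
fn delta(prev: u64, cur: u64) -> u64 {
    match cur.checked_sub(prev) {
        Some(d) => d,
        None => {
            eprintln!("counter regressed from {} to {}", prev, cur);
            0
        }
    }
}

/// Sums per-snapshot deltas across files. `prev` is declared inside the
/// file loop, so each file starts again from (0, 0, 0, 0).
fn sum_deltas(files: &[Vec<Counters>]) -> Counters {
    let mut total = [0u64; 4];
    for snapshots in files {
        let mut prev: Counters = [0; 4];
        for cur in snapshots {
            for i in 0..4 {
                total[i] += delta(prev[i], cur[i]);
            }
            prev = *cur;
        }
    }
    total
}
```

With a `prev` shared across files, the second file's first snapshot would be compared against the first file's last one and its tokens would be lost.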

socket.rs — WaitSessionStatus Lagged handler wrong (HIGH):
- Was: continue (silently keep waiting, may never unblock if target state
  was in dropped events)
- Now: cdebug! + break with outcome=timeout so caller can retry with
  fresh state rather than waiting forever on a missed event

models.rs — fetch_from_api 0-model success semantics (MEDIUM):
- Now returns false when all models are filtered out (empty new Vec)
  so needs_refresh() stays true and caller retries rather than locking
  in embedded fallback for 24h with a valid-but-empty response
- Update populate() doc comment to say "at least one model matched"
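
A sketch of the stamping contract under stated assumptions: the `Catalog` type is hypothetical, the fetch is passed in as a closure instead of performing HTTP, and `None` models a failed round-trip:

```rust
use std::time::{Duration, Instant};

const TTL: Duration = Duration::from_secs(24 * 60 * 60);

#[derive(Default)]
struct Catalog {
    entries: Vec<String>,
    fetched_at: Option<Instant>,
}

impl Catalog {
    fn needs_refresh(&self) -> bool {
        match self.fetched_at {
            Some(t) => t.elapsed() >= TTL,
            None => true,
        }
    }

    /// Stamps `fetched_at` only when the fetch succeeded and at least one
    /// model matched; failures and empty results leave a retry possible.
    fn populate(&mut self, fetch: impl FnOnce() -> Option<Vec<String>>) {
        let ok = match fetch() {
            Some(models) if !models.is_empty() => {
                self.entries = models;
                true
            }
            _ => false,
        };
        if ok {
            self.fetched_at = Some(Instant::now());
        }
    }
}
```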

report.sh — set -euo pipefail under #!/bin/sh (IMPORTANT):
- Revert to set -eu; pipefail is not POSIX and unreliable on older dash;
  the || echo handlers already handle pipeline failures explicitly

entering_work_cycle simplification (Simplification):
- Remove redundant HookTaskStart|OperatorInput first arm; both signals
  unconditionally map to Working in transition(), so the second arm
  (transition into Working|Blocked from non-working state) subsumes them
- Update comment accordingly

Tests added (2 new → 402 total):
- amp_changed_flag_includes_cache_tokens: documents the 4-field
  comparison and shows old 2-field logic would miss cache-only changes
- populate_stamps_fetched_at_only_on_success: directly tests the
  false→no-stamp / true→stamp branches of populate()

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
…-status

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>

# Conflicts:
#	Cargo.lock
#	crates/jackin-capsule/Cargo.toml
#	crates/jackin-capsule/src/daemon.rs
#	crates/jackin-capsule/src/dialog.rs
#	crates/jackin-capsule/src/lib.rs
#	crates/jackin-capsule/src/runtime_setup.rs
#	crates/jackin-capsule/src/session.rs
#	crates/jackin-capsule/src/statusbar.rs
#	crates/jackin-capsule/tests/status_bar.rs
#	crates/jackin-tui/src/lib.rs
#	crates/jackin/src/console/tui/view/list/tests.rs
#	docs/content/docs/reference/roadmap/agent-runtime-status.mdx
#	src/console/manager/input/list.rs
…-status

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>

# Conflicts:
#	Cargo.lock
#	crates/jackin-capsule/benches/pane_body.rs
#	crates/jackin-capsule/src/runtime_setup.rs
@donbeave donbeave changed the title feat(capsule): agent runtime status authority — all phases, all acceptance criteria feat(capsule): add agent runtime status authority Jun 10, 2026
donbeave and others added 6 commits June 10, 2026 01:25
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>
Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>
Validate the current detection engine against the design contract and
record the gap list: arbitration, /proc identity, OSC 133, seen/ack,
subagent counting, and the semantic hook channel exist but are unwired,
while PTY output bytes still author the working state. That byte-driven
path is the flap engine behind the false-positive reports.

Rebuild the page as the canonical spec: skeptical review of every
detection approach in the field, June 2026 re-research of Herdr
(manifest rules, deleted PTY-first experiment) and verified vendor
hook/plugin surfaces, license discipline for borrowed concepts, an
evidence/arbitration/debounce architecture with data model, gating
table, rule-pack schema, constants, and watchdog, per-runtime adapter
plans, and a nine-slice implementation blueprint with per-slice files,
tests, and acceptance criteria.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
The single-source-of-truth collapse dropped reference material the
implementation needs: the precise validation audit, Herdr observed
facts and timeline, per-runtime signal surfaces with doc and issue
links, the orchestrator survey, terminal-protocol reliability, and
/proc signal validity. Restore all of it as appendices A-F so no
finding lives outside this page.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
Add non-built-in runtime surfaces (Gemini, Aider, Pi) as reference for
the custom-role reporter contract, community corroboration sources, and
the two out-of-scope items dropped during the rewrite (per-tool status
granularity, agents beyond the built-in five).

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Claude <noreply@anthropic.com>
Replace byte-driven status detection with evidence arbitration, runtime event gates, process and OSC corroboration, rule packs, diagnostics, and roadmap/docs updates.

Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com>
Co-authored-by: Codex <codex@openai.com>