feat(agent): detect availability via session/new probe and assistant-first identity #500
Merged
added 30 commits
June 15, 2026 17:57
The availability scheduler runs `try_connect_custom_agent` every 5 minutes for every agent, spawning a CLI subprocess and tearing it down once the ACP handshake completes (or fails). For wrapper CLIs that fork a long-lived grandchild (`npm exec openclaw --acp` is the production case), cleanup was leaking the grandchild because:

* `kill_on_drop(true)` on the tokio Command only signals the direct child (the npm exec wrapper), not its grandchild.
* The probe relied on `drop(protocol)` for the success path and on no explicit cleanup for the handshake-fail path, so `proc.kill` was never called.
* `CliAgentProcess::kill` itself short-circuited and returned Ok the moment the leader exited within the grace period, so even when callers did invoke it, no group-wide SIGKILL was sent.

Result: dozens of zombie `openclaw-acp` processes accumulated per day under the 5-minute scheduler.

Fix:

1. `CliAgentProcess::kill` now always issues a group-wide SIGKILL after the grace period, even when the leader has already exited. `force_kill` already maps ESRCH to success, so the sweep is idempotent for already-reaped trees.
2. `try_connect_custom_agent` calls `proc.kill` on every outcome (success, ACP failure, handshake timeout) by hoisting the spawn out of the inner future and running cleanup unconditionally after the timeout race resolves.
3. New regression test `probe_kills_grandchild_left_behind_by_wrapper` exercises the exact wrapper-grandchild shape from production and asserts the grandchild is reaped before the probe returns.
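A minimal sketch of the group-wide kill described in this commit, using only the standard library on Unix. The names `spawn_in_own_group` and `force_kill_group` are hypothetical and are not taken from the PR; the real `CliAgentProcess::kill` also waits out a grace period first, which is omitted here. The sketch assumes the child is started as the leader of its own process group, so that a grandchild forked by a wrapper inherits the same group id.

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};

// Declared directly so the sketch needs no external crate; on Unix targets
// the C library that std links against provides this symbol.
unsafe extern "C" {
    fn kill(pid: i32, sig: i32) -> i32;
}

const SIGKILL: i32 = 9; // value on Linux and macOS
const ESRCH: i32 = 3; // "no such process"

/// Spawn `cmd` as the leader of a fresh process group (hypothetical helper).
/// Descendants forked by the child inherit that group id.
fn spawn_in_own_group(cmd: &mut Command) -> io::Result<Child> {
    cmd.process_group(0).spawn()
}

/// Send SIGKILL to every process in group `pgid` (hypothetical helper).
/// A group that no longer exists (ESRCH) counts as success, so the call
/// can be repeated safely after the tree has already been reaped.
fn force_kill_group(pgid: i32) -> io::Result<()> {
    if pgid <= 1 {
        // Negating 0 or 1 would target the caller's own group or every process.
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "refusing to signal pgid <= 1",
        ));
    }
    // SAFETY: kill(2) takes two plain integers and touches no Rust memory.
    let rc = unsafe { kill(-pgid, SIGKILL) };
    if rc == 0 {
        return Ok(());
    }
    let err = io::Error::last_os_error();
    if err.raw_os_error() == Some(ESRCH) {
        Ok(())
    } else {
        Err(err)
    }
}
```

Because the group id equals the leader's pid when `process_group(0)` is used, a caller can pass `child.id() as i32` to `force_kill_group` on every exit path, whether or not the leader has already exited.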
added 25 commits
June 22, 2026 16:32
…ficeAI/AionCore into feat/agent-connection-testing-phase2 * 'feat/agent-connection-testing-phase2' of github.com:iOfficeAI/AionCore:
…ackend
An assistant's agent_status was matched to its agent row by `backend`
only. aionrs (the built-in Rust agent) has a NULL `backend` and is keyed
by `agent_type` ("aionrs"), so every aionrs-backed assistant failed to
resolve a row and was mislabelled Missing/unavailable.
Match the agent row on `backend == effective_backend` OR
`agent_type.serde_name() == effective_backend`, so aionrs assistants
resolve to the real aionrs row and reflect its actual status.
Add a regression test covering an aionrs assistant (row with NULL backend,
agent_type Aionrs, Online) resolving to Online instead of Missing.
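The matching rule above can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: `AgentRow`, `Status`, the `Acp` and `Offline` variants, and `resolve_status` are hypothetical, and `serde_name` is written by hand here so the example has no dependencies.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AgentType {
    Aionrs,
    Acp, // hypothetical second variant for illustration
}

impl AgentType {
    /// Hand-written stand-in for the serialized name of the variant.
    fn serde_name(self) -> &'static str {
        match self {
            AgentType::Aionrs => "aionrs",
            AgentType::Acp => "acp",
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Online,
    Offline, // hypothetical; the commit only names Online and Missing
    Missing,
}

/// Hypothetical shape of an agent row; `backend` is NULL for aionrs.
struct AgentRow {
    backend: Option<String>,
    agent_type: AgentType,
    status: Status,
}

/// Match on `backend` first, or fall back to the agent type's serialized
/// name, so a row with a NULL backend can still be found.
fn resolve_status(rows: &[AgentRow], effective_backend: &str) -> Status {
    rows.iter()
        .find(|r| {
            r.backend.as_deref() == Some(effective_backend)
                || r.agent_type.serde_name() == effective_backend
        })
        .map(|r| r.status)
        .unwrap_or(Status::Missing)
}
```

With a backend-only comparison the aionrs row could never match, since `None` is never equal to `Some("aionrs")`; the second clause is what lets it resolve.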
…-testing-phase2
# Conflicts:
#	crates/aionui-conversation/src/service.rs
…-testing-phase2
# Conflicts:
#	crates/aionui-conversation/src/service.rs
#	crates/aionui-conversation/src/service_test.rs
#	crates/aionui-conversation/src/turn_orchestrator.rs
#	crates/aionui-cron/tests/service_integration.rs
#	crates/aionui-team/src/test_utils.rs
#	crates/aionui-team/tests/session_service_integration.rs
This reverts commit e2ee532.
kaizhou-lab pushed a commit that referenced this pull request Jun 25, 2026
🤖 I have created a release *beep* *boop*

---

## [0.1.37](v0.1.36...v0.1.37) (2026-06-25)

### Features

* **agent:** detect availability via session/new probe and assistant-first identity ([#500](#500)) ([6c9a721](6c9a721))
* **conversation:** add cursor pagination for messages ([#515](#515)) ([ba76273](ba76273))

### Bug Fixes

* **agent:** classify ACP and provider errors ([#518](#518)) ([ef573d0](ef573d0))
* **aionrs:** adapt runtime guard config ([#510](#510)) ([464f453](464f453))
* **conversation:** recover dead ACP turns after agent process loss ([#514](#514)) ([e0ce4f4](e0ce4f4))
* **db:** repair legacy handoff schema drift ([#516](#516)) ([292e5f2](292e5f2))
* validate skill frontmatter as yaml ([#512](#512)) ([6b46055](6b46055))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Summary
This PR started as agent connection testing phase 2, but it now also completes the backend side of the assistant-first identity migration. The combined scope is: accurate agent availability detection, assistant/agent identity unification, cron assistant-first persistence, and a clearer `/api/assistants` contract.
Agent availability detection
* … `session/new` startup instead of inferring availability from static metadata.
* … `needs_auth` state after successful startup.
* … `aionrs` availability on resolved provider configuration instead of backend labels alone.
Assistant-first identity and schema cleanup
* … `agent_metadata.id` as the canonical concrete agent binding behind assistants.
* … `agent_metadata.id` instead of ambiguous backend/preset fields.
* … `backend` to explicit `acp_backend`.
* … `agent.id` from `/api/assistants`; the top-level `agent_id` is the only concrete agent binding in the response.
* … `/api/assistants.agent` as runtime metadata only: `type`, `source`, and optional `acp_backend`.
Cron jobs
* … `agent_config.assistant_id`.
* … `cron_jobs.agent_type`/`agent_config.backend` dependence from current write/read paths.
* … `assistant_id`.
Team, conversation, channel, and ACP session flows
Tests
* `/api/assistants` exposes `agent_id` plus `agent.acp_backend`, and no longer exposes nested `agent.id` or `agent.backend`.
Testing
* `just push` passed after merging latest `origin/main`.
Closes #499
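For reference, the `/api/assistants` item shape described in the summary could look roughly like the sketch below. The struct names, field types, sample values, and the hand-rolled `to_json` (which does no string escaping) are all assumptions made for illustration; only the field names `agent_id`, `agent`, `type`, `source`, and `acp_backend` come from the description.

```rust
/// Runtime metadata nested under `agent`. Field types are assumed.
#[derive(Debug, Clone, PartialEq)]
struct AgentRuntime {
    r#type: String,
    source: String,
    acp_backend: Option<String>, // optional per the description
}

/// Hypothetical response item: the top-level `agent_id` is the only
/// concrete agent binding; there is no nested `agent.id` or `agent.backend`.
#[derive(Debug, Clone, PartialEq)]
struct AssistantResponse {
    agent_id: String,
    agent: AgentRuntime,
}

impl AssistantResponse {
    /// Dependency-free JSON rendering for the sketch only (no escaping).
    fn to_json(&self) -> String {
        // `acp_backend` is emitted only when present.
        let acp = match &self.agent.acp_backend {
            Some(b) => format!(",\"acp_backend\":\"{}\"", b),
            None => String::new(),
        };
        format!(
            "{{\"agent_id\":\"{}\",\"agent\":{{\"type\":\"{}\",\"source\":\"{}\"{}}}}}",
            self.agent_id, self.agent.r#type, self.agent.source, acp
        )
    }
}
```

Keeping the binding at the top level means a client reads one field to find the concrete agent, while `agent` stays descriptive runtime metadata.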