
fix(runtime): reap detached box's full cgroup tree on stop/remove #876

Draft
G4614 wants to merge 2 commits into boxlite-ai:main from G4614:fix/detached-box-cgroup-reap

@G4614 G4614 commented Jun 26, 2026

Contributor

Problem

Running boxlite stop / rm -f on a detached box returns success and removes
the box record, but leaves its VM running as an untracked orphan. This is the
runtime root cause of detached-box accumulation (the "backlog") on long-lived
hosts (#137/#141).

Cause: kill_process(recorded_pid) only SIGKILLs the outer bwrap launcher,
which is the pid recorded in shim.pid. A box's process tree is
outer bwrap → inner pid-ns bwrap → shim → VM. Since #851 stopped applying
--die-with-parent to detached boxes, killing the outer launcher leaves the
inner pid-ns tree alive.

Fix — reap the box's whole cgroup by id

  • cgroup::kill_cgroup(box_id): write 1 to <cgroup>/cgroup.kill
    (cgroup v2). This tears down every process in the box's cgroup
    atomically, regardless of pid-namespace / process-group structure.
    Best-effort and idempotent: a no-op when the cgroup is empty or absent.
  • remove_box force path: kill_cgroup(id) runs before the single-pid
    fallback; it is keyed on id, so it reaps even after recovery cleared
    state.pid.
  • stop: kill_cgroup runs as a final step after graceful guest shutdown;
    it is a no-op once the box has exited cleanly.
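As an illustration of the first bullet, here is a minimal sketch of the cgroup.kill write. It is not the PR's code: the function name kill_cgroup_at and the choice to pass the cgroup directory directly (instead of resolving it from a box id) are assumptions made so the example runs without a real cgroup hierarchy.

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// Hypothetical helper: `cgroup_dir` stands in for the box's cgroup v2
/// directory, which the real code would derive from the box id.
/// Returns true only if the kill request was written.
pub fn kill_cgroup_at(cgroup_dir: &Path) -> bool {
    let kill_file = cgroup_dir.join("cgroup.kill");
    // Open without `create`: if the file is missing, the cgroup is absent
    // (or the kernel lacks cgroup.kill), so this is a best-effort no-op.
    let mut file = match OpenOptions::new().write(true).open(&kill_file) {
        Ok(file) => file,
        Err(_) => return false,
    };
    // On a real cgroup v2 hierarchy, writing "1" asks the kernel to
    // SIGKILL every process in this cgroup and its descendants.
    file.write_all(b"1").is_ok()
}
```

Calling it twice is harmless: the second call either writes to an already-empty cgroup or finds the directory gone and returns false.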

kill_process stays as the fallback for hosts without the cgroup jailer
(disabled / macOS seatbelt), where the box tree is a single shim process.
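The ordering described above (cgroup first, recorded pid as fallback) can be sketched as follows. The Reaped enum, the reap_box name, and the closure parameters are all hypothetical; whether the real remove_box skips kill_process after a successful cgroup kill is also an assumption.

```rust
/// Which mechanism reaped the box (illustrative names only).
#[derive(Debug, PartialEq, Eq)]
pub enum Reaped {
    Cgroup,
    SinglePid,
    Nothing,
}

/// Try the cgroup-wide kill first, keyed on the box id; fall back to the
/// recorded pid only when no cgroup was killed.
pub fn reap_box(
    box_id: &str,
    recorded_pid: Option<u32>,
    kill_cgroup: impl Fn(&str) -> bool,
    kill_process: impl Fn(u32) -> bool,
) -> Reaped {
    // Keyed on id, so this works even when `recorded_pid` is None
    // (for example after recovery cleared the stored pid).
    if kill_cgroup(box_id) {
        return Reaped::Cgroup;
    }
    // No cgroup jailer on this host: the box tree is a single process.
    match recorded_pid {
        Some(pid) if kill_process(pid) => Reaped::SinglePid,
        _ => Reaped::Nothing,
    }
}
```

Passing the two kill functions as closures keeps the sketch testable without root or a running box.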

Tests

  • Unit (CI, no root): kill_cgroup on an absent cgroup is a no-op
    (returns false); this locks the best-effort contract for the no-jailer /
    macOS path.
  • Integration (Linux, VM): detached_box_force_remove_reaps_whole_tree
    boots a detached box, asserts a live tree, calls remove(force), and
    asserts that 0 box processes remain (/proc scan by box id). Reproduced
    two-sided as root: with the rm -f reap disabled it FAILS (left: 2,
    right: 0, i.e. the leaked inner bwrap + shim); restored, it passes.
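For readers unfamiliar with the "/proc scan by box id" check, a rough sketch of such a scan is below. The function name count_box_processes is hypothetical, and matching the box id against each process's cmdline is an assumption; the actual test may identify box processes differently. The root directory is a parameter so the sketch can run against a fake tree.

```rust
use std::fs;
use std::path::Path;

/// Count processes under `proc_root` (normally /proc) whose command line
/// mentions `box_id`. Matching on cmdline is an assumption of this sketch.
pub fn count_box_processes(proc_root: &Path, box_id: &str) -> usize {
    let entries = match fs::read_dir(proc_root) {
        Ok(entries) => entries,
        Err(_) => return 0,
    };
    entries
        .flatten()
        .filter(|entry| {
            // Only all-digit directory names are pids; skip "self" etc.
            let name = entry.file_name();
            name.to_str().map_or(false, |n| {
                !n.is_empty() && n.bytes().all(|b| b.is_ascii_digit())
            })
        })
        .filter(|entry| {
            // cmdline holds NUL-separated argv. A process can exit during
            // the scan, so an unreadable file counts as "not a match".
            fs::read(entry.path().join("cmdline"))
                .map(|raw| String::from_utf8_lossy(&raw).contains(box_id))
                .unwrap_or(false)
        })
        .count()
}
```

A test in this style would assert a non-zero count after boot and a count of 0 after remove(force).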

Verified (root, repeated)

Boxes boot on the test host once euid 0 bypasses the jailer's RLIMIT_NPROC
ceiling.

| path | before | after |
| --- | --- | --- |
| rm -f detached ×5 | 3 procs | 0 each |
| stop detached ×2 | 2 procs | 0 each |
| non-detached run --rm | - | prints output, exit 0, 0 leak |

Compiles; cargo fmt clean.

⚠️ Not run: the full cargo/CI integration suites (need a KVM runner with
process headroom). macOS seatbelt path falls back to kill_process and is
unchanged.

Related

The test-side counterpart (recovery tests reaping their own stranded detached
trees) is #870. This PR fixes the underlying production leak that those tests
were tripping over.

🤖 Generated with Claude Code


coderabbitai Bot commented Jun 26, 2026


Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


G4614 and others added 2 commits June 26, 2026 14:14
A detached box's `stop` / `rm -f` left its VM running. `kill_process` only
SIGKILLs the recorded pid — the outer bwrap launcher — and since boxlite-ai#851 stopped
applying `--die-with-parent` to detached boxes, the inner pid-ns tree (inner
bwrap + shim + VM) survives. The command returns success and the box record is
removed, so the orphan VM is untracked — the root of detached-box accumulation
on long-lived hosts (boxlite-ai#137/boxlite-ai#141).

The whole tree lives in the box's cgroup, so reap it by id:
- add `cgroup::kill_cgroup(box_id)`: write "1" to `<cgroup>/cgroup.kill`
  (cgroup v2); reaps every process in the box's cgroup atomically, regardless
  of pid-namespace / process-group structure. Best-effort, idempotent, no-op
  when the cgroup is gone/empty or absent.
- `remove_box` force path: `kill_cgroup` before the single-pid fallback; keyed
  on id so it reaps even after recovery cleared `state.pid`.
- `stop`: `kill_cgroup` as a final step after graceful shutdown — a no-op once
  the box has exited cleanly.
`kill_process` stays as the fallback for hosts without the cgroup jailer
(disabled / macOS seatbelt), where the box tree is a single shim process.

Tests:
- cgroup unit test (no root, CI): `kill_cgroup` on an absent cgroup is a no-op
  (returns false) — locks the best-effort contract for the no-jailer / macOS
  path.
- detach integration test (Linux, VM): `detached_box_force_remove_reaps_whole_tree`
  asserts `rm -f` leaves zero box processes (scans /proc by box id). Reproduced
  two-sided as root: with the rm-f reap disabled the test FAILS (left: 2,
  right: 0 — the leaked inner bwrap + shim); restored, it passes.

Verified as root (boxes boot once euid 0 bypasses the jailer RLIMIT_NPROC
ceiling), repeated: rm -f detached x5 -> 3 procs -> 0; stop detached x2 -> 2 ->
0; non-detached run --rm prints output, exit 0, 0 leak (no regression).
Compiles; cargo fmt clean. Full CI integration suites need a KVM runner and
were not run here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

`jailer::cgroup` is `#[cfg(target_os = "linux")]`, so the unconditional
`kill_cgroup` calls in remove_box and stop() failed to compile on macOS
(`error[E0433]: cannot find 'cgroup' in 'jailer'`). Gate both call sites with
`#[cfg(target_os = "linux")]`, matching the existing cross-platform jailer call
convention; macOS keeps the `kill_process` single-pid path (no pid-namespace
nesting there, so it already reaps the box). Linux behavior unchanged.
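The platform gating described in this commit can be sketched roughly as below. The module and function bodies are stand-ins rather than the PR's code: the stub kill_cgroup and kill_process kill nothing, and remove_box_force_steps is a hypothetical name that only records which paths ran.

```rust
// Stand-in for `jailer::cgroup`, which only exists on Linux.
#[cfg(target_os = "linux")]
mod cgroup {
    /// Stub: pretends the cgroup was killed.
    pub fn kill_cgroup(_box_id: &str) -> bool {
        true
    }
}

/// Stub for the single-pid path, available on every platform.
fn kill_process(_pid: u32) -> bool {
    true
}

/// Records which reap paths ran, in order.
pub fn remove_box_force_steps(box_id: &str, pid: Option<u32>) -> Vec<&'static str> {
    let mut steps = Vec::new();

    // Gated call site: on non-Linux targets the `cgroup` module is not
    // compiled, so an ungated call would fail with E0433.
    #[cfg(target_os = "linux")]
    let cgroup_killed = cgroup::kill_cgroup(box_id);
    #[cfg(not(target_os = "linux"))]
    let cgroup_killed = {
        let _ = box_id; // unused without the cgroup path
        false
    };
    if cgroup_killed {
        steps.push("kill_cgroup");
    }

    // The single-pid path stays in place on all platforms.
    if let Some(pid) = pid {
        if kill_process(pid) {
            steps.push("kill_process");
        }
    }
    steps
}
```

Gating the let statements, rather than the whole function, keeps one shared body for both platforms and leaves the kill_process path untouched.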
@G4614 G4614 force-pushed the fix/detached-box-cgroup-reap branch from 01cadc6 to 4b5c0b7 Compare June 26, 2026 14:14