Container parity with VM agentic-dev landed in 2026.5.0 under the
#181 epic (issues #182–#186). The dashboard, REST surface, and AIWG
bridge treat containers as first-class workloads alongside QEMU/KVM
VMs — same lifecycle vocabulary, same loadout selector, same mission
dispatch flow.
This document is the reference for operators picking a runtime and for
integrators wiring container instances into the AIWG bridge. The Rust
source of truth is
management/src/docker_runtime.rs;
the HTTP surface that wraps it is
management/src/http/containers.rs.
docker_runtime is the single chokepoint for Docker shell-outs. Every
container lifecycle operation funnels through these functions:

| Symbol | Purpose |
|---|---|
| `DockerMonitorConfig` | Poll cadence + orphan-age threshold; loaded from env (`DOCKER_MONITOR_ENABLED`, `DOCKER_POLL_INTERVAL_SECS`, `DOCKER_ORPHANED_AGE_SECS`). |
| `ContainerInfo` / `ContainerStatus` | Normalized `docker ps` row: `Running`, `Stopped`, or `Other(raw)`. `finished_at` populated for stopped containers. |
| `SpawnOpts` | `env: Vec<(String,String)>`, `labels: Vec<(key, value)>`, `mounts: Vec<(host, container)>`, `network: Option<String>`, `cmd: Vec<String>`. |
| `list_containers()` | `docker ps -a --filter label=agentic-sandbox=true`. Managed containers only; we never surface containers we did not spawn. |
| `spawn_container(name, image, opts)` | Runs `docker run -d --label agentic-sandbox=true --name {name} --add-host host.docker.internal:host-gateway …`. Returns the container ID. |
| `start_container(name)` / `stop_container(name, timeout)` | Idempotent lifecycle verbs over the same label-filtered set. |
| `remove_container(id)` | `docker rm -f` on a single ID. |
| `get_container_by_name(name)` | Convenience lookup over `list_containers()`. |
| `spawn_docker_monitor(config, metrics)` | Background task: polls every `poll_interval_secs`, emits `container.*` lifecycle events, sweeps orphans older than `orphaned_age_secs`. |

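
To make the `DockerMonitorConfig` row concrete, here is a minimal sketch of loading the three environment variables. The struct name `MonitorConfigSketch`, its field names, the accepted boolean spellings, the disabled-by-default choice, and the 30 s poll default are all assumptions for illustration; only the variable names and the 1 h orphan-age default come from this document.

```rust
use std::time::Duration;

/// Hypothetical mirror of `DockerMonitorConfig`. Field names and the
/// `enabled` / poll-interval defaults are assumptions, not the real struct.
#[derive(Debug, Clone, PartialEq)]
pub struct MonitorConfigSketch {
    pub enabled: bool,
    pub poll_interval: Duration,
    pub orphaned_age: Duration,
}

impl MonitorConfigSketch {
    /// Build the config from any key lookup, for example
    /// `|k| std::env::var(k).ok()`. Unset or unparsable values fall back
    /// to the defaults below.
    pub fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Self {
        // Parse a seconds value, falling back to `default` on any failure.
        let secs = |key: &str, default: u64| {
            get(key)
                .and_then(|v| v.trim().parse::<u64>().ok())
                .unwrap_or(default)
        };
        let enabled = get("DOCKER_MONITOR_ENABLED")
            .map(|v| matches!(v.trim().to_ascii_lowercase().as_str(), "1" | "true" | "yes"))
            .unwrap_or(false); // assumed default (sweep described as opt-in)
        Self {
            enabled,
            // 30 s is an assumed default; the document does not state one.
            poll_interval: Duration::from_secs(secs("DOCKER_POLL_INTERVAL_SECS", 30)),
            // 1 h default as stated in this document.
            orphaned_age: Duration::from_secs(secs("DOCKER_ORPHANED_AGE_SECS", 3600)),
        }
    }
}
```

Taking a lookup closure instead of reading `std::env` directly keeps the parsing testable without mutating process-wide state; production code would pass `|k| std::env::var(k).ok()`.
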
The --add-host host.docker.internal:host-gateway is unconditional on
Linux. Without it the in-container agent's default
MANAGEMENT_SERVER=host.docker.internal:8120 does not resolve and the
container starts but immediately fails its first gRPC dial. Docker
no-ops the flag on Mac/Windows where the host gateway is native.
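
The sketch below shows how a `docker run` argument list of the shape described above might be assembled from `SpawnOpts`-like fields. `SpawnOptsSketch` and `docker_run_args` are hypothetical names, and the ordering of the optional flags is an assumption; the fixed prefix (`-d`, the `agentic-sandbox=true` label, `--name`, and the unconditional `--add-host`) follows the table entry for `spawn_container`.

```rust
/// Hypothetical mirror of `SpawnOpts`; the real struct lives in
/// `management/src/docker_runtime.rs` and may differ.
#[derive(Debug, Default, Clone)]
pub struct SpawnOptsSketch {
    pub env: Vec<(String, String)>,
    pub labels: Vec<(String, String)>,
    pub mounts: Vec<(String, String)>, // (host path, container path)
    pub network: Option<String>,
    pub cmd: Vec<String>,
}

/// Assemble the arguments that follow `docker` on the command line.
pub fn docker_run_args(name: &str, image: &str, opts: &SpawnOptsSketch) -> Vec<String> {
    let mut args: Vec<String> = vec![
        "run".into(),
        "-d".into(),
        "--label".into(),
        "agentic-sandbox=true".into(),
        "--name".into(),
        name.into(),
        // Always present, so host.docker.internal resolves on Linux.
        "--add-host".into(),
        "host.docker.internal:host-gateway".into(),
    ];
    for (k, v) in &opts.labels {
        args.push("--label".into());
        args.push(format!("{}={}", k, v));
    }
    for (k, v) in &opts.env {
        args.push("-e".into());
        args.push(format!("{}={}", k, v));
    }
    for (host, container) in &opts.mounts {
        args.push("-v".into());
        args.push(format!("{}:{}", host, container));
    }
    if let Some(net) = &opts.network {
        args.push("--network".into());
        args.push(net.clone());
    }
    // Image comes after all flags; anything after it is the container command.
    args.push(image.into());
    args.extend(opts.cmd.iter().cloned());
    args
}
```

The resulting vector can be handed to `std::process::Command::new("docker").args(&args)`, which passes each element as a separate argument and so avoids shell quoting issues.
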
Both runtimes register against the same OutputAggregator, speak the
same gRPC contract from agent-rs, and surface in the same dashboard
sidebar. They differ where the substrate differs.

| Dimension | VM (QEMU/KVM) | Container (Docker) |
|---|---|---|
| Isolation | Full hardware virtualization. Kernel boundary between host and workload. | Process namespace. Shared kernel. |
| Startup time | 30–90 s cold (cloud-init runs once); 5–15 s warm. | 1–3 s typical for `agentic/agent:dev`-derived images. |
| Resource overhead | ~512 MB RAM floor per VM (kernel + systemd + journald). Dedicated virtual disk. | ~50 MB RAM floor. Layered filesystem; no per-instance kernel. |
| Network | Libvirt-managed bridge (`192.168.122.0/24` default). Per-VM IP. `agentshare` profile gets `--network none` for isolation. | Docker bridge or `--network host`. Reaches the host via `host.docker.internal:host-gateway`. |
| Persistence | Disk image survives `virsh destroy`; only `provision-vm.sh --destroy` wipes it. | Container filesystem is ephemeral unless mounts are bound. Use `mounts: [(host_path, /workdir)]` for persistence. |
| AIWG framework install | Baked into the cloud-init seed by `provision-vm.sh` via loadout. | Baked into the image at build time; claude / codex / opencode images rebase onto `agentic/agent:dev`. |
| Operator escape hatch | `virsh console`, `ssh agent@<ip>`. | `docker exec -it <name> bash`. |
| Crash recovery | `crash_loop.rs` detector triggers `provision-vm.sh` rebuild. See `crash-loop.md`. | Monitor sweeps stopped containers older than `orphaned_age_secs` (default 1 h). No auto-rebuild; the operator decides. |

Pick a VM when:

- The workload runs untrusted code, downloads arbitrary binaries, or needs to exercise kernel features the container runtime forbids (raw sockets, ptrace of arbitrary PIDs, loading kernel modules).
- The workload needs to survive container daemon restarts independent of host reboot.
- The mission persists for hours and the storage cost of a virtual disk is acceptable.
- The mission needs the `agentshare` `--network none` isolation tier (forensics / red-team profiles).

Pick a container when:

- The workload is a short-lived agent task (minutes to ~1 h).
- Fast iteration matters: rebuild the image once, spawn dozens of fresh instances.
- The toolchain in `agentic/agent:dev` is sufficient (Python via uv, Node via fnm, Go, Rust via rustup, ripgrep/fd/bat/jq/delta/xh, cmake/ninja/meson, aider pinned to Python 3.12, `gh` + `gh copilot`).
- The provider image (claude / codex / opencode) is one of the rebased variants that already speak the agent protocol.

Container images are layered: a shared dev toolchain at the bottom, provider-specific images on top.

| Image | Purpose | Built from |
|---|---|---|
| `agentic/agent:dev` | Shared dev toolchain layer. Mirrors the agentic-dev VM profile's apt/uv/fnm/rustup package set. `/etc/profile.d` snippet stabilizes `PATH` across login shells. | Debian base + AIWG bootstrap. See `CHANGELOG.md` 2026.5.0 entry for #182. |
| `agentic/claude:latest` | Claude Code CLI on top of `agentic/agent:dev`. | Rebased onto shared base for parity (#183). |
| `agentic/codex:latest` | OpenAI Codex CLI on top of `agentic/agent:dev`. | Rebased onto shared base (#184). |
| `agentic/opencode:latest` | OpenCode CLI on top of `agentic/agent:dev`. | Rebased onto shared base (#185). |
| `agentic/automation-control:latest` | Blueprint for orchestrator-driven TUI control sessions. Includes Codex, Aider, shared dev tools, and `agentic-provider-inventory` without bundling credentials. | Extends `agentic/codex:latest` (#346). |

The CI smoke matrix (#186) builds each image and asserts:

- `python --version`, `node --version`, `go version`, `cargo --version` all resolve.
- `rg --version`, `fd --version`, `bat --version`, `jq --version`, `xh --version`, `grpcurl --version` all resolve.
- The agent binary inside the image dials the management server and registers within the smoke window.

Use agentic/automation-control:latest when an external orchestrator needs a general-purpose sandbox session it can observe, search, and drive through the PTY control plane. The image intentionally does not embed secrets or auto-launch provider login flows. Start with the credential-free probe, then use the low-churn Codex wrapper when a browser observer or external orchestrator needs to read the TUI:

    agentic-provider-inventory
    agentic-codex-automation

`agentic-codex-automation` runs Codex with `TERM=xterm`, `NO_COLOR=1`, and `--no-alt-screen`. Set `AGENTIC_CODEX_WORKDIR` when the session should start outside the current directory.
Then launch provider TUIs only after the orchestrator has satisfied its credential and Controller-input policy gates.
Container lifecycle lives at /api/v1/containers/*, mirroring the
shape of /api/v1/vms/*. The full list is in
management/src/http/containers.rs;
the relevant endpoints are:

- `GET /api/v1/containers`: list managed containers.
- `POST /api/v1/containers`: create + spawn (auto-injects agent bootstrap env, generates 256-bit secret).
- `GET /api/v1/containers/{name}`: single-container detail.
- `POST /api/v1/containers/{name}/start`: start a stopped container.
- `POST /api/v1/containers/{name}/stop`: graceful stop with timeout.
- `DELETE /api/v1/containers/{name}`: `docker rm -f`.

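
For integrators writing a client, the endpoint list above can be captured as a small route table. This is a sketch of a hypothetical client-side helper (`ContainerOp` and `route` are not part of the management server), and it assumes container names are already URL-safe.

```rust
/// Lifecycle operations exposed under `/api/v1/containers`.
/// The enum is a hypothetical client-side helper, not server code.
pub enum ContainerOp<'a> {
    List,
    Create,
    Get(&'a str),
    Start(&'a str),
    Stop(&'a str),
    Delete(&'a str),
}

impl ContainerOp<'_> {
    /// Return the HTTP method and request path for this operation.
    /// Names are interpolated as-is; callers must pass URL-safe names.
    pub fn route(&self) -> (&'static str, String) {
        const BASE: &str = "/api/v1/containers";
        match self {
            ContainerOp::List => ("GET", BASE.to_string()),
            ContainerOp::Create => ("POST", BASE.to_string()),
            ContainerOp::Get(name) => ("GET", format!("{}/{}", BASE, name)),
            ContainerOp::Start(name) => ("POST", format!("{}/{}/start", BASE, name)),
            ContainerOp::Stop(name) => ("POST", format!("{}/{}/stop", BASE, name)),
            ContainerOp::Delete(name) => ("DELETE", format!("{}/{}", BASE, name)),
        }
    }
}
```

Keeping method and path together in one place makes it harder for a client to drift from the server's route shape, for example by sending `DELETE` to the `/stop` path.
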
The image catalog endpoint
(GET /api/v1/container-images →
management/src/http/container_images.rs:56)
returns the curated provider image set the dashboard offers in the
Create dialog.
PTY exec inside a container is not part of this surface — that
lives behind the pty-ws/v1 binding (see pty-rendering.md)
which attaches to whatever the container entrypoint produces via the
existing in-container agent path.
When AIWG_SERVE_ENDPOINT is set, the management server registers
itself as an A2A executor (see aiwg-executor.md).
Mission dispatch lands at
POST /api/v1/sessions/:id/dispatch and routes to either a VM or a
container depending on the session's recorded runtime.

- The bridge does not know or care which runtime backs an instance. It addresses by `(instance_id, session_id)`.
- The dispatch handler resolves the instance to its runtime, then invokes the matching session-attach path.
- Container instances participate in the same `mission.*` event vocabulary the executor contract emits: `mission.dispatched`, `mission.completed`, `mission.failed`. The events stream over the same `/ws/executors/{id}` channel.

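
The resolve-then-attach flow described above can be sketched as a lookup followed by a `match` on the recorded runtime. `InstanceRegistry`, `Runtime`, `DispatchError`, and the returned attach labels are hypothetical placeholders; the real handler's types and attach paths are not shown in this document.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Runtime {
    Vm,
    Container,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    UnknownInstance(String),
}

/// Hypothetical registry mapping an instance id to its recorded runtime.
pub struct InstanceRegistry {
    runtimes: HashMap<String, Runtime>,
}

impl InstanceRegistry {
    pub fn new() -> Self {
        Self { runtimes: HashMap::new() }
    }

    pub fn record(&mut self, instance_id: &str, runtime: Runtime) {
        self.runtimes.insert(instance_id.to_string(), runtime);
    }

    /// The caller addresses by (instance_id, session_id) only; the runtime
    /// is resolved here and selects the attach path. The returned string
    /// is a placeholder for invoking that path.
    pub fn dispatch(&self, instance_id: &str, session_id: &str) -> Result<String, DispatchError> {
        let runtime = self
            .runtimes
            .get(instance_id)
            .ok_or_else(|| DispatchError::UnknownInstance(instance_id.to_string()))?;
        let attach = match runtime {
            Runtime::Vm => "vm-attach",
            Runtime::Container => "container-attach",
        };
        Ok(format!("{}:{}/{}", attach, instance_id, session_id))
    }
}
```

Because the caller never names the runtime, adding another substrate would only extend the `Runtime` enum and the `match`, leaving the bridge-facing signature unchanged.
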
The 2026.5.0 server_hello capability banner (#190) advertises both
runtimes; AIWG's replayCapable gate flips on for either.
Today's published loadout profiles (images/qemu/profiles/) are
VM-only — agentic-dev.yaml, agentic-dev-cloud-init.yaml. The
loadout schema described in LOADOUTS.md is reused
for container instances by setting an explicit runtime on the
manifest. Container "loadouts" today are effectively the choice of
image (for example agentic/claude:latest, agentic/codex:latest, agentic/opencode:latest, or agentic/automation-control:latest) plus mounts and env;
the formal runtime: field unifies that with the VM profile syntax.
When provisioning a container from the dashboard:

- Pick Container in the Runtime dropdown of the Unified Instances Create dialog (#178).
- Pick an image from the curated `agentic/*:latest` provider list.
- Optionally add bind mounts (host path → `/workdir`-style container path) for persistence. v2 admin Docker provision accepts `mounts` as `host_path:container_path` strings.
- The dashboard issues `POST /api/v1/containers` with the auto-injected agent bootstrap env (`AGENT_ID`, `AGENT_SECRET`, `MANAGEMENT_SERVER`).

For v2 admin Docker provision, agentshare: true creates a per-instance host
workspace under AGENTIC_SANDBOX_DOCKER_WORKSPACE_ROOT or
/var/lib/agentic-sandbox/workspaces and bind-mounts it at /workspace.
If a caller supplies an explicit /workspace mount, that mount wins. Docker
AgentCards advertise adapter-command/v1 only when a /workspace mount is
available, so orchestrators can treat the extension as a live capability
contract rather than an unconditional server feature.
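
The mount rules above (string parsing, the `agentshare` default workspace, explicit-mount precedence, and the capability gate) can be sketched as three small functions. The function names are hypothetical, the parser assumes no extra mount options after the container path, and the per-instance directory layout `{root}/{instance_id}` is an assumption; the document only says the workspace is created under the root.

```rust
/// Parse a `host_path:container_path` string. Assumes a single `:`
/// separator with no trailing options; returns None on malformed input.
pub fn parse_mount(spec: &str) -> Option<(String, String)> {
    let (host, container) = spec.split_once(':')?;
    if host.is_empty() || !container.starts_with('/') {
        return None;
    }
    Some((host.to_string(), container.to_string()))
}

/// Apply the `agentshare` rule: add a per-instance workspace at
/// `/workspace` unless the caller already supplied one.
/// `workspace_root` stands in for AGENTIC_SANDBOX_DOCKER_WORKSPACE_ROOT.
pub fn resolve_mounts(
    mounts: &[(String, String)],
    agentshare: bool,
    instance_id: &str,
    workspace_root: Option<&str>,
) -> Vec<(String, String)> {
    let mut out = mounts.to_vec();
    let has_workspace = out.iter().any(|(_, c)| c == "/workspace");
    if agentshare && !has_workspace {
        let root = workspace_root.unwrap_or("/var/lib/agentic-sandbox/workspaces");
        // Directory naming under the root is an assumption.
        let host = format!("{}/{}", root.trim_end_matches('/'), instance_id);
        out.push((host, "/workspace".to_string()));
    }
    out
}

/// `adapter-command/v1` is advertised only when a `/workspace` mount exists.
pub fn advertises_adapter_command(mounts: &[(String, String)]) -> bool {
    mounts.iter().any(|(_, c)| c == "/workspace")
}
```

Deriving the capability from the resolved mount list, rather than from the `agentshare` flag, keeps the AgentCard accurate when a caller supplies its own `/workspace` mount without setting the flag.
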

- Orphan cleanup is opt-in. `DOCKER_MONITOR_ENABLED=false` disables the background sweep entirely. Default prefix filter is `task-` so operator-spawned `agent-*` containers are never auto-deleted (regression-proofed in 2026.5.0 after a5c897f / 005e471 / 24e1cf9 / 2e76a0d / 9dd7711).
- Stop ≠ delete. The dashboard's Stop button calls `POST /api/v1/containers/{name}/stop` and leaves the container in Stopped state so the operator can restart or inspect it. Force-off goes through the same path with timeout 0.
- Container metrics flow into the same `Metrics` aggregator as VM metrics (`management/src/telemetry/metrics.rs`). See `telemetry.md` for the label scheme.
- Lifecycle events emit through the same SSE stream as VM events (`/api/v1/events?follow=true`). See `transport-audit.md`.

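
The orphan-sweep rule in the first note can be expressed as a single predicate: the name matches the prefix filter, the container is stopped, and it has been stopped longer than the threshold. This is a sketch with hypothetical types (`ContainerRow`, `Status`, `is_orphan`); `stopped_for` stands in for an age derived from `finished_at`, and the strict `>` comparison is an assumption.

```rust
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Running,
    Stopped,
    Other(String),
}

/// Hypothetical, simplified view of a managed container row.
pub struct ContainerRow {
    pub name: String,
    pub status: Status,
    /// Time since the container stopped; None if it never stopped.
    pub stopped_for: Option<Duration>,
}

/// True only for stopped containers whose name starts with `prefix`
/// and that have been stopped for longer than `orphaned_age`.
pub fn is_orphan(row: &ContainerRow, prefix: &str, orphaned_age: Duration) -> bool {
    row.name.starts_with(prefix)
        && row.status == Status::Stopped
        && row.stopped_for.map_or(false, |age| age > orphaned_age)
}
```

With the default `task-` prefix, an operator-spawned `agent-*` container fails the first check regardless of age, which is the property the regression fixes protect.
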

- `LOADOUTS.md`: loadout manifest schema (VM today, container `runtime:` field per #178).
- `aiwg-executor.md`: full AIWG bridge contract.
- `pty-rendering.md`: PTY attach over the `pty-ws/v1` binding (works against containers via in-container agent).
- `crash-loop.md`: VM-specific auto-remediation (container parity is operator-driven for now).
- `telemetry.md`, `transport-audit.md`: observability for either runtime.
- `CHANGELOG.md`: 2026.5.0 entry for the #181 epic.