feat(sdk): parallel pipeline fan-out + typed JSON report with metrics #156
Merged
added 5 commits
June 28, 2026 15:27
Base::run_parallel runs steps as concurrent copy-on-write MicroVM forks (bounded, collect-all, input-ordered) and returns a dependency-free JSON Report. StepResult gains separated stdout/stderr, duration_ms, and metrics parsed from `::metric <key>=<value>` guest-stdout lines (a scoring channel for matrix/selection workloads). Steps now take &self (atomic fork counter), so fan-out no longer needs hand-rolled threads. RAII removes each fork on every path and the base snapshot on Drop (--force). Box/snapshot names carry per-process+instance entropy to prevent cross-pipeline name collisions; output is capped to bound report memory under large fan-out; infra failures use a distinct sentinel. 15 unit tests + doctest, clippy clean; validated end-to-end on a real /dev/kvm host (real boot, CoW fork-per-step, metrics, JSON report, 0 leaks).
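The `::metric <key>=<value>` channel described in this commit can be illustrated with a small parser. This is only a sketch: the helper name `parse_metrics`, the choice of `f64` for values, and the rule that malformed lines are silently skipped are assumptions, not the SDK's confirmed behavior.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not the SDK's API): collect `::metric <key>=<value>`
/// lines from a step's stdout. Values are assumed to be numeric (f64).
fn parse_metrics(stdout: &str) -> BTreeMap<String, f64> {
    let mut metrics = BTreeMap::new();
    for line in stdout.lines() {
        // Only lines that begin with the `::metric ` marker are considered.
        let Some(rest) = line.trim().strip_prefix("::metric ") else { continue };
        // A metric line must contain `=` separating key and value.
        let Some((key, value)) = rest.split_once('=') else { continue };
        let key = key.trim();
        if key.is_empty() {
            continue;
        }
        // Assumption: non-numeric values are ignored rather than reported.
        if let Ok(parsed) = value.trim().parse::<f64>() {
            metrics.insert(key.to_string(), parsed);
        }
    }
    metrics
}
```

A step that prints `::metric score=0.93` among its ordinary output would yield one entry under this sketch; every other stdout line is left untouched for the report.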
… + infra retry
Backs the parallel pipeline with real-microVM coverage and hardens it for sustained, highly-concurrent churn:
- sweep_orphans(): reclaim ci-base-* boxes/snapshots left by a SIGKILL/OOM'd pipeline process (its RAII never ran), matched by the dead owner pid embedded in the resource name; never touches a live peer's resources.
- WarmBase::infra_retries (default 2): retry a fork that hits a TRANSIENT infra failure (restore/start/boot); the step's command never ran, so re-forking is idempotent. Keeps sustained high-concurrency churn green.
- tests/integration_kvm.rs (5 #[ignore] tests): warm+fork+exec, cache hit, parallel order/metrics, fork isolation, leak-freeness, sweep crash-recovery.
- tests/soak_kvm.rs: sustained fork-eval churn stays leak-free and RSS-stable; leak gates are process-scoped (robust to a concurrent pipeline on the host).
- ci.yml: run both under the integration-kvm (real /dev/kvm) gate.
Validated on a real KVM host (a3s-box 2.5.1): integration 5/5; soak 1500 fork-evals across 75 generations leak-free, RSS +512 KiB; 0 orphans after.
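The orphan-matching rule behind sweep_orphans() (reclaim only resources whose embedded owner pid is dead) might look roughly like the following. The name layout `ci-base-<pid>-<suffix>` and the helpers `owner_pid`, `orphans`, and `pid_is_alive` are assumptions made for illustration; the commit only states that the name starts with `ci-base-` and embeds the owner pid.

```rust
/// Assumed layout: `ci-base-<owner_pid>-<suffix>`. The real layout is not
/// shown in this PR, so treat this parser as hypothetical.
fn owner_pid(name: &str) -> Option<u32> {
    let rest = name.strip_prefix("ci-base-")?;
    let pid_text = rest.split('-').next()?;
    pid_text.parse().ok()
}

/// Return the names that are safe to reclaim: pipeline resources whose
/// owner process is no longer alive. Names that do not parse are skipped.
fn orphans<'a>(names: &[&'a str], is_alive: impl Fn(u32) -> bool) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| match owner_pid(name) {
            Some(pid) => !is_alive(pid),
            None => false, // not ours, or unparsable: never touch it
        })
        .collect()
}

/// Linux-only liveness probe: a live process has a `/proc/<pid>` directory.
fn pid_is_alive(pid: u32) -> bool {
    std::path::Path::new(&format!("/proc/{pid}")).exists()
}
```

Passing the liveness check in as a closure keeps the matching logic testable without real processes; on a host, `orphans(&names, pid_is_alive)` would be the natural call. A bare pid check can misjudge a recycled pid, which is one reason per-instance entropy in the name is useful alongside it.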
- a3s-box-ci: a dep-free bin (in a3s-box-sdk) bridging any agent/tool to the pipeline. `run [SPEC|-]` parses a line-based spec -> run_parallel -> JSON Report (exit 0 iff passed); `sweep` reclaims crashed-pipeline orphans. This is what lets a3s-code / Claude Code / Codex drive the pipeline from a script.
- warm_base now retries a transient infrastructure failure too (DRY'd with the per-step fork via a shared retry_infra), so concurrent same-image warms stay robust under load.
Validated end-to-end on real KVM (runner: real pipeline + sweep; a3s-code drives it via session.program through the QuickJS runtime). 19 unit/bin tests + clippy.
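A shared retry helper of the kind this commit calls retry_infra could be sketched as below. The signature, the `StepError` enum, and its variant names are assumptions; the only behavior taken from the commit text is that transient infra failures (restore/start/boot) are retried a bounded number of times and everything else is returned immediately.

```rust
/// Hypothetical error type: only the transient/other split comes from the PR.
#[derive(Debug, PartialEq)]
enum StepError {
    /// Restore/start/boot failed before the command ran; safe to re-fork.
    TransientInfra(String),
    /// Any other failure; returned to the caller without retrying.
    Fatal(String),
}

/// Run `attempt` once, then up to `retries` more times while it keeps
/// failing with a transient infra error. Other outcomes return at once.
fn retry_infra<T>(
    retries: u32,
    mut attempt: impl FnMut() -> Result<T, StepError>,
) -> Result<T, StepError> {
    let mut left = retries;
    loop {
        match attempt() {
            Err(StepError::TransientInfra(_)) if left > 0 => left -= 1,
            other => return other,
        }
    }
}
```

With `retries = 2` (the stated default for `WarmBase::infra_retries`), this sketch makes at most three attempts. Retrying is only sound because the step's command never ran in the failed attempts.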
…e corruption
RootfsCache::prune (called after a cache-miss put) evicted LRU entries with no
in-use guard, so it could remove_dir_all a cache entry that a CONCURRENT box was
using as its overlayfs lowerdir — that box's mount(2) then failed with ENOENT
('No such file or directory (os error 2)'), persisting through retries since the
backing was gone. Two pipelines from the same image collapse onto one cache key,
so this hit any concurrent same-image workload.
Fix: apply the same in-use guard that SnapshotStore::prune already applies to
live CoW lowers. Each overlay box records the cache key it holds in
<box_dir>/.rootfs-cache-key (removed with the box dir); prune skips any
still-referenced key (prune_protecting, with prune as the empty-set wrapper).
Found via a concurrent-pipeline chaos test driven through a3s-code; verified on
a real /dev/kvm host (concurrency scenario: ~50% failure -> reliably green).
41 rootfs-cache unit tests + clippy clean.
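The guard described above can be modeled in memory as an LRU eviction that skips protected keys. This is a sketch under assumptions: the real RootfsCache works on directories and its budget and signatures are not shown here, so `Entry`, `max_entries`, and both function signatures are illustrative only.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory stand-in for an on-disk cache entry.
struct Entry {
    key: String,
    last_used: u64,
}

/// Evict least-recently-used entries until at most `max_entries` remain,
/// never evicting a key in `protected`. Returns the evicted keys.
fn prune_protecting(
    entries: &mut Vec<Entry>,
    max_entries: usize,
    protected: &HashSet<String>,
) -> Vec<String> {
    // Oldest first, so eviction proceeds in LRU order.
    entries.sort_by_key(|e| e.last_used);
    let mut evicted = Vec::new();
    let mut i = 0;
    while entries.len() > max_entries && i < entries.len() {
        if protected.contains(&entries[i].key) {
            i += 1; // still referenced by a live box: leave it alone
        } else {
            evicted.push(entries.remove(i).key);
        }
    }
    evicted
}

/// The unguarded form is the same routine with an empty protected set.
fn prune(entries: &mut Vec<Entry>, max_entries: usize) -> Vec<String> {
    prune_protecting(entries, max_entries, &HashSet::new())
}
```

In this model the cache may stay over budget when every remaining entry is protected, which is the intended trade: a temporarily oversized cache is preferable to deleting a lowerdir out from under a mounted box.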
What
Makes the a3s-box-sdk programmable-CI pipeline usable for matrix / parallel CI and produces a machine-readable result.
- `Base::run_parallel(steps, max_concurrency) -> Report`: runs steps concurrently as isolated copy-on-write MicroVM forks, bounded by `max_concurrency`, collect-all (every step runs; results returned in input order). Built on `std::thread::scope` + a work queue, so still dependency-free.
- `Report` / `StepResult` with `to_json()` (dep-free encoder): separated `stdout`/`stderr`, `duration_ms`, `cached`, `allow_failure`, and `metrics` parsed from `::metric <key>=<value>` lines a step prints to stdout (a scoring channel for matrix/selection workloads).
- Steps take `&self` (atomic fork counter). The previous `&mut self` made the documented "fan out with threads" impossible through the borrow checker.
- RAII cleanup: each fork is removed on every path, and the base snapshot is removed on `Drop` with `--force`.
- Box/snapshot names carry per-process + instance entropy, so two pipelines with the same `image`+`setup` (or two processes on one host) can no longer collide and tear down each other's live boxes.
- Output is capped (`WarmBase::max_output`, default 1 MiB) so one chatty/looping step can't balloon an in-memory report during large fan-out.
- Distinct `INFRA_FAILURE` exit sentinel so a scorer can tell "never ran (infra)" from "ran and failed".
Why
The pipeline could previously only run steps sequentially (`step` took `&mut self`), so a3s-box's cheap (~110 ms) CoW fork couldn't be used for matrix/parallel CI, and a result was only a per-call `exit_code` + concatenated `logs`, not a comparable, machine-readable signal. This is the building block for fan-out CI and selection/scoring workloads.
Breaking
`StepResult.logs` (field) is removed, replaced by `stdout`/`stderr`. Use `StepResult::combined()` for the old concatenated view. (Documented under `### Changed`.)
Validation
- `cargo test -p a3s-box-sdk`: 15 unit tests + doctest; `cargo clippy` clean.
- Validated end-to-end on a real `/dev/kvm` host (a3s-box 2.5.1): real microVM boot, CoW fork-per-step, parallel fan-out, `::metric` extraction from a live guest, typed JSON report, and zero leaked boxes/snapshots.
Dependency-free throughout; no new crates.
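The bounded, collect-all, input-ordered fan-out that the description attributes to `std::thread::scope` plus a work queue can be sketched generically. This is not the SDK's `Base::run_parallel`: the free function below, its generic parameters, and the `run` closure standing in for "fork a MicroVM and execute the step" are all assumptions for illustration.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::thread;

/// Run `run` over every input on at most `max_concurrency` worker threads.
/// Every input is processed, and outputs come back in input order.
fn run_parallel<T, R, F>(inputs: Vec<T>, max_concurrency: usize, run: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let n = inputs.len();
    // Work queue of (input index, input); workers pull from the front.
    let queue: Mutex<VecDeque<(usize, T)>> =
        Mutex::new(inputs.into_iter().enumerate().collect());
    // One result slot per input, filled in by index to preserve order.
    let results: Mutex<Vec<Option<R>>> = Mutex::new((0..n).map(|_| None).collect());
    // At least one worker when there is work, never more workers than jobs.
    let workers = max_concurrency.max(1).min(n);

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // The queue lock is released before the step itself runs.
                let job = queue.lock().unwrap().pop_front();
                let Some((index, input)) = job else { break };
                let output = run(input);
                results.lock().unwrap()[index] = Some(output);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every queued job ran"))
        .collect()
}
```

Scoped threads let the workers borrow the queue and the closure without `Arc` or `'static` bounds, which is also why a `&self` step method is enough for fan-out. In this sketch a panic inside `run` propagates out of `thread::scope`; a collect-all pipeline would more likely record the failure in that step's result instead.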