forkd's planning horizon. Items are grouped into milestones; each milestone ends with a deliberate marketing / community pulse so the next milestone's priorities are driven by real user feedback rather than guesses.
Tracked work for individual items lives in GitHub issues.
The bottleneck right now is not raw performance — it's that the project speaks only to Python / numpy workloads, and the first-time setup is a ~10 minute rootfs build. Both are fixable inside one milestone.
Browser fan-out is the second-largest AI-agent workload shape after Python (Anthropic computer-use, OpenAI browsing, every coding-agent that uses Playwright/Puppeteer). Cold-start of a headless Chromium container is 2–3 s; fork-from-warm should put it at ~10 ms — a 100–300× win on a load that the project doesn't currently serve at all.
Done when:
- ✓ Recipe builds a parent rootfs from the official Playwright image with a headless Chromium already running in the parent. (scaffold)
- ✓
forkd-agentships a JS bridge:sb.eval(<js>)against any child routes to the warmed Chromium via the warmup subprocess. (#30) - ✓
forkd fork --tag pwb -n 3 --per-child-netnsclean spawns on Firecracker, 56 ms wall-clock on the dev box; per-childsb.eval("return await page.title()")returns in 10–82 ms. Two gotchas surfaced: parent VM needs ≥ 2 GiB (Chromium OOMs on the default 512), per-child cgroup ceiling ≥ 2560 MiB. Both documented in the recipe README and supported by a newforkd snapshot --mem-size-mibflag. - ☐ Larger N (50, 100) and benchmark numbers in
bench/+ a row in the README comparison table.
Risk resolved: Chromium-via-snapshot-restore works without GPU / timer regressions, but is memory-hungry. Memory tuning is the operational gotcha to call out in the marketing pulse.
Right now forkd run requires a ~10 minute first-time rootfs build
and warmup per recipe. A public registry of pre-built parent
snapshots (memory.bin + rootfs.ext4 + vmstate.json + manifest,
hosted as OCI artifacts on a bucket) collapses that to a single
forkd pull (~30 s).
Done when:
- ✓ CLI ships
forkd pack,forkd unpack,forkd pull,forkd push,forkd images. Pack format v1 documented inline inhub.rs(tar.zst + manifest.toml + per-file sha256). (#29) - ✓ Hosting decided: GitHub Releases instead of an R2/S3 bucket —
free, free egress, stable URLs, no infra to operate. Registry
index served from
raw.githubusercontent.com/deeplethe/forkd/main/registry.json(default--hubURL). The originalforkd-hub.deeplethe.comcustom-domain plan is dropped as unnecessary. - ✓ 9 snapshot-producing recipes published:
python-numpy,playwright-browser,postgres-fixture,coding-agent,nodejs,e2b-codeinterpreter,jupyter-kernel,langgraph-react,coding-agent-fork.forkd pull deeplethe/python-numpy && forkd fork --tag python-numpy -n 100works end-to-end from a clean Ubuntu host without running any recipe build script. (agent-workbenchis the one remaining recipe with abuild.sh; its base image pull fromghcr.io/agent-infra/sandbox:latestis rate-limit-sensitive and will land in a follow-up.) - ✓ README quickstart leads with
forkd pull— bothREADME.mdandREADME-zh.md.from-imageandfrom-sourceare reframed as "Alternative:" sections. (#165)
Risk: base-image redistribution licensing per recipe. Triaged 1-time at start; expected clear since all our base images are Apache / BSD / MIT.
The week M1.1 and M1.2 are done, before starting M2: ship one substantial English blog post ("Forking Playwright at 10 ms"), Show HN, and equivalents on r/MachineLearning, r/LocalLLaMA, XHS / 知乎. The point is to collect external signal that determines whether M2 priorities still make sense.
The two items below are pre-selected as M2 candidates, but their order and scope must be re-confirmed after the M1.3 marketing pulse — if real users surface a more urgent gap, that takes priority.
Today, modifying a parent (e.g. pip install a new package) means
re-running the whole snapshot pipeline. Firecracker already supports
snapshot_type: "Diff"; we just need to surface it as a chain
(base → +pandas → +sklearn) at the controller / CLI / registry
layer. Dev iteration drops from ~10 s + several-GB I/O to seconds.
Done when:
forkd snapshot diff --from <tag> --tag <new-tag>produces a diff < 100 MB for anapt installorpip installdelta.- Restore time on a 3-snapshot chain is within 10% of the base snapshot's restore time.
- Snapshot Hub MVP (M1.2) understands chains: pulling a diff also pulls its parents.
Risk: Firecracker diff-snapshot has some vmstate fields that don't chain cleanly. Highest-risk item in the roadmap; budget +1 week if the first 2 weeks hit an edge case.
Allow a workload to pause at an arbitrary execution point and fork into K children, each with a variant of some input (env var, random seed, prompt). This is the primitive that AI-agent researchers actually want — tree search over agent decision space — and no competing project offers it.
Done when:
state.fork(k=5, vary={"temperature": [...]})exposed in the Python SDK.- One end-to-end demo: an LLM agent forks at a tool-call decision point into 5 branches, returns the highest-scoring continuation.
- Demo gif + blog post + HN-ready.
Risk: this is a product project, not a pure engineering one. If the demo doesn't sing, the work is wasted. Should only be greenlit if M1.3 produces an audience to demo to.
Tracked from README.md#Status. Not committed to a date; expected
to be unblocked by paying or prospective enterprise users surfacing
specific gaps.
- Default-deny egress per child netns.
cpu.max,io.max,pids.maxquotas beyond the existingmemory.max.- TLS termination in the daemon (currently rely on reverse proxy).
- Multi-node scheduling (one daemon → multiple hosts).
- Third-party security audit.