Open Dynamic Workflow (odw) is a script-driven workflow runner for
agent-authored JavaScript workflows.
It is zero install — nothing is scaffolded into your project. Run odw guide
for the full self-contained usage guide, then write a workflow script and run it:
odw guide # how to author + run (self-contained)
odw exec --script wf.js --input-file input.json --backend pandacode
odw exec --resume latest
odw runs show latestinput.json (optional) contains the goal and any prompts for slots the
workflow declares; it is exposed to the script as args.
The JavaScript workflow owns phases, branching, fan-out, intermediate results, checkpoints, and synthesis. ODW provides the direct runner, node logs, resume state, schema validation, and a PandaCode bridge for the Claude / Codex / Bamboo executor runtimes.
Agent or CLI caller
-> odw exec --script <workflow.js>
-> phase(...)
-> agent(prompt, { label, phase, runtime, provider?, model?, schema?, schemaDescription?, agentType? })
-> .odw/runs/<run_id>/events.jsonl + state.json
-> pandacode claude|codex|bamboo exec
Claude Code remains an optional caller and compatibility surface through /odw
and /workflows. It is not required for the core path. Worker failures are also
part of the contract: failed workers return structured error feedback instead of
unclassified prose.
ODW is a pure orchestration runtime: the only executor backend is pandacode
(plus mock for token-free smoke tests). All Codex/Claude/Bamboo execution —
including persistent Claude terminals and domestic-model Bamboo runs — is owned
by PandaCode; ODW just dispatches agent(prompt, { runtime, provider, model })
nodes to pandacode <runtime> exec (single-shot).
Pick whichever fits your situation:
One-shot (odw + pandacode together): ./install.sh builds and installs
both — odw (this repo) and the pandacode executor it dispatches to (found
next to this repo, on PATH, or via PANDACODE_DIR / PANDACODE_REPO).
Or install each piece yourself:
cargo install --path . # put `odw` on PATH (recommended)
cargo build --release # or just build: ./target/release/odwZero install — nothing to scaffold. odw never writes files into your project. The binary is self-documenting, so any agent can use it straight from the CLI:
odw guide # the full self-contained authoring + run guide (read this first)
odw doctor # check node + the pandacode executor are wired up
odw spec | odw contract # machine-readable API types + the authoring contractodw finds the pandacode binary automatically when it sits next to odw (the
workspace builds both into the same dir, whether cargo install or cargo build). Only set ODW_PANDACODE_BIN=/path/to/pandacode (or --pandacode-bin)
if yours lives elsewhere.
For agents: run odw guide. It is the single self-contained entry point —
what odw is, the full authoring API, how to run, and the gotchas. No skill to
install, no files to read; the same content any AI (Claude, codex, or otherwise)
gets straight from the CLI. odw spec adds the TypeScript types; odw contract
the full authoring contract.
export const meta = { name: "ship-feature" };
phase("Implement", "Codex implements the change");
const implPrompt = promptSlot("implement", {
input: args,
required_schema: ".odw/schemas/codex-result.schema.json"
});
const result = await agent(implPrompt, {
label: "codex-implement",
phase: "Implement",
runtime: "codex",
agentType: "odw-codex-coder",
schema: ".odw/schemas/codex-result.schema.json",
schemaDescription: "Final response is the implementation result: changed files and verification evidence."
});
checkpoint("after-implement", { ok: result.ok });
return { ok: result.ok !== false };odw exec --script ./ship-feature.js --input-file workflow-input.json --backend pandacode
odw runs show latest
odw exec --resume latestAfter writing a workflow, generate a self-contained HTML execution graph:
odw report --script ./ship-feature.js --open
odw report --run latest --openodw report --script performs a token-free mock dry run, reads the emitted event
stream, and derives the graph automatically. odw report --run <id> renders an
existing mock or real run. Use --input for the mock run payload and --out to
choose the HTML path.
The report shows a Mermaid execution graph on the left, coloured by runtime
(codex, claude, bamboo), and node details on the right: model, prompt,
status, tokens, and duration.
Node-level fan-out uses the runner API, not ad hoc orchestration:
const reviews = await parallel(batches.map((batch, index) => () =>
agent(promptSlot("review_batch", {
batch,
index,
required_schema: ".odw/schemas/security-finding.schema.json"
}, `
Role:
Read-only evidence-backed reviewer.
Input batch:
{{context}}
Task:
Review this batch and return only evidence-backed findings.
Constraints:
Do not edit files. Cite exact evidence.
Output schema:
.odw/schemas/security-finding.schema.json
Done criteria:
The batch is either clean or every finding has file evidence.
Failure contract:
If blocked or failed, return .odw/schemas/error-feedback.schema.json.
`), {
id: `review-${index}`,
label: `review ${batch.name}`,
phase: "Review",
agentType: "odw-security-reviewer",
schema: ".odw/schemas/security-finding.schema.json",
schemaDescription: "Final response is the evidence-backed review result for this one batch.",
retry: { maxAttempts: 2 }
})
));Inside Claude Code:
/odw-audit src/routes for missing auth checks
/odw-ship implement the agreed billing permission fix
/odw-flow decompose this feature into parallel Codex tasks
Claude should load .odw/framework/workflow-api.d.ts and write or adapt a
workflow with this shape:
phase("Research", "read files");
const result = await agent(prompt, {
label: "research",
phase: "Research",
agentType: "odw-researcher"
});For workflow-node parallelism, use parallel([() => agent(...), ...]).
ODW emits parallel group start/done events, caps concurrency at 16, preserves
result order, and joins before returning. For item streams, use
pipeline(items, ...stages). For dynamic task decomposition, let an upstream
node return a task array, then map it with fanout(tasks, (task) => agent(...)).
workflow: a JavaScript module that an agent writes or loads, withexport const metafollowed by top-level async workflow code. Input is available asargs. Workflow scripts are orchestration-only (no direct filesystem/shell access); file, shell, and code-edit work must go throughagent(...)executor nodes. The vm context gives that separation and a determinism guard, not a security boundary — treat workflows as trusted code you author, not a way to run untrusted scripts.phase: a named stage emitted to live logs and.odw/runs/*/events.jsonl.agent: a node invocation. The aligned call shape isagent(prompt, { label, phase, runtime, model, schema, schemaDescription, retry, agentType }). ODW does not create default nodes or require fixed agent types; each call in workflow code is the node, andagentTypeis only an optional author-defined routing/tag value.promptSlot: workflow scripts declare prompt slots; real runs injectinput.prompts.<slot>. Suggested template literals are for mock smoke tests or explicit caller opt-in.pandacode: convenience namespace —pandacode.codex(prompt),pandacode.claude(prompt),pandacode.bamboo(prompt, { provider }),pandacode.exec(runtime, prompt); each is justagent(prompt, { runtime }).runtime: with--backend pandacode, ordinaryagent(...)nodes dispatch toruntime: "claude",runtime: "codex", orruntime: "bamboo"while the workflow remains the caller and orchestrator. Bamboo nodes (domestic models) requireprovider(orbambooProvider) such asdeepseek,xiaomi,kimi,zhipu,minimax,qwen, orstepfun; provider is invalid on non-Bamboo runtimes. Enable a provider by setting its API key in the environment — the provider-specific var (DEEPSEEK_API_KEY,KIMI_API_KEY,QWEN_API_KEY,ZHIPU_API_KEY,MINIMAX_API_KEY,XIAOMI_API_KEY,STEPFUN_API_KEY) or the genericPANDACODE_BAMBOO_API_KEY. Without a key the node returns a structuredmissing API keyfailure;odw doctorshows whether bamboo is configured.checkpoint: persists resume state and emits a checkpoint event.pipeline:pipeline(items, ...stages)runs each item through sequential stages while items fan out.parallel: a workflow node-level fan-out/join viaparallel([() => agent(...), ...]); keep concurrency at or below 16.fanout: dynamic node fan-out from structured upstream output viafanout(items, (item, index) => agent(...)).schema: schema use is opt-in. If a workflow author passesagent(..., { schema, schemaDescription, retry }), ODW appends the full JSON Schema as a final-response-only contract, validates that final response, and on a mismatch injects the mismatch context into the node prompt and retries. Retry is NOT automatic: a node attempts once by default (maxAttempts:1), unlike the built-in tool which keeps retrying on mismatch — passretry: { maxAttempts: N }(ormaxAttempts: N) for built-in-style retry, else a single non-conforming reply returns structuredschema_mismatchfeedback immediately. Route schema nodes to claude (coding agents like codex/ bamboo are unreliable at structured output). Ifschemais omitted, ODW applies no default schema.error feedback:.odw/schemas/error-feedback.schema.jsonis the standard result when a worker, command, schema, or CodexCTL step fails.
odw-orchestrator plans and routes. The workflow script owns executable
branching, fan-out, loops, intermediate state, and final aggregation.
- run:
odw exec --script <workflow.js> --input <json> --backend <mock|pandacode> - optional Claude run:
/odw,/odw-audit,/odw-ship,/odw-flow - watch:
odw runs show latest; optional Claude watch:/workflows - pause/resume:
odw exec --resume latest - stop: stop the invoking process, or use
/workflowsfor Claude-launched runs - restart node: direct exec resumes by the stable
prompt + optionscache key; completed nodes are skipped from state (editing a node's prompt re-runs it) - live logs:
odw execstreams node progress - local journals:
odw runs listandodw runs show latest
odw guide # self-contained authoring + run guide
odw doctor # check node + pandacode are wired up
odw spec # framework spec + TypeScript API types
odw contract # full authoring contract
odw capabilities # machine-readable capability map
odw exec --script wf.js --input '{"goal":"x"}' --backend mock # token-free dry run
odw exec --script wf.js --backend pandacode # real run
odw exec --resume latest
odw report --script wf.js --open # HTML execution-graph preview
odw runs list
odw runs show latestodw exec is the direct runner. It streams workflow, phase, node, checkpoint,
error, and exit events, writes .odw/runs/<run_id>/events.jsonl, and persists
node resume state in .odw/runs/<run_id>/state.json. The only executor backend
is pandacode (plus mock for token-free smoke tests).
Codex runs through PandaCode, single-shot. A node with
agent(prompt, { runtime: "codex" }) dispatches to pandacode codex exec, which
runs to completion. PandaCode owns Codex account handling, logs, and model
discovery; the workflow script stays the orchestrator.
const result = await agent(implementationPrompt, {
runtime: "codex",
agentType: "odw-codex-coder",
schema: ".odw/schemas/codex-result.schema.json"
});If the executor fails, workers return .odw/schemas/error-feedback.schema.json
with a classified error (category, retryability, next action) instead of burying
the failure in prose.
Bamboo domestic-model runs also go through PandaCode. ODW only dispatches the node; PandaCode owns provider credentials, model selection, logs, token usage, and execution.
const result = await agent(prompt, {
runtime: "bamboo",
provider: "deepseek"
});
const same = await pandacode.bamboo(prompt, { provider: "deepseek" });The dispatch is pandacode bamboo exec --provider <provider> .... Supported
provider names are deepseek, xiaomi, kimi, zhipu, minimax, qwen, and
stepfun. Passing provider with runtime: "claude" or runtime: "codex" is
an authoring error.
Bamboo is a tool-using coding agent — it shines at reading/writing files and running commands. Its node result is a prose summary of what it did, not raw content, so it is a poor fit for nodes whose value IS the answer:
- Prose deliverables (a review, an analysis) can fail with
missing JSON object in model responsewhen the agent answers in text instead of a tool call. schema:structured-output nodes also tend to fail — a schema does not fix it (verified: a schema'd bamboo classification returned its answer as prose — "...negative with score 0.95..." — and missed the schema on every retry).
Route answer-shaped nodes to runtime: "claude" / "codex"; keep bamboo for the
file/command work it is built for. See issue #5.
ODW's script runtime matches the Claude Code built-in Workflow tool on these runtime behaviors:
- Cores-aware concurrency.
parallel/pipeline/fanoutcap atmin(16, cpuCores - 2); a 1000-agent-per-run backstop guards runaway loops. - Determinism guard. Inside a workflow script,
Date.now(),Math.random(), and arglessnew Date()throw (they break resume). Deterministic forms (new Date(ts),Date.parse, all otherMath.*) still work. Runner internals keep using the real clock. isolation: "worktree". Set it on anagent(...)node to run its executor in a throwaway git worktree branched fromcwd, so file-mutating agents in aparallel(...)group do not conflict. The worktree is removed on success, error, or timeout. Requirescwdto be a git repo.- Real
budget. Seedargs.budget.total(tokens).budget.spent()sums each node's total token usage (input + output + cache + reasoning) from PandaCode reports. This differs from the built-in tool, whosespent()counts output tokens only — and the gap is large for coding-agent nodes, whose cost is dominated by the input harness (a trivial Bamboo node can report ~19k total but <300 output). So a budget loop ported from the built-in exhausts far sooner here; size budgets in total tokens, not output. Nodes without token usage report 0 and mark the budgetapprox;budget.remaining()tracks it; once spent reaches total, the nextagent(...)throws.spentpersists across--resumeand is not double-counted for cached nodes. It is best-effort, not a hard cap: under concurrency, in-flight nodes still finish, so a run can overshoot by up to ~concurrency × per-node tokens(as the built-in tool's budget also does). workflow(nameOrRef, args). Run a saved/sibling workflow inline as one step. It shares this run's agent counter, concurrency caps, budget, and state. 1 level only: a sub-workflow that callsworkflow()throws. Names resolve to.claude/workflows/<name>.js,odw-<name>.js, or a relative/absolute path.meta.whenToUseand per-phasemodel.meta.phases[].modelsets a default model that a phase's agents inherit when they omitoptions.model.
export const meta = {
name: "ship-with-budget",
whenToUse: "implement a change under a token budget with isolated workers",
phases: [{ title: "Build", detail: "codex implements", model: "opus" }]
};
phase("Build");
const reviews = await parallel(batches.map((b, i) => () =>
agent(promptSlot("review", { b }), { id: `rev-${i}`, isolation: "worktree" })
));
if (budget.remaining() !== null && budget.remaining() < 50_000) {
return { ok: true, note: "budget nearly exhausted" };
}
const sub = await workflow("synthesize", { reviews });These behaviors are self-verified: node scripts/selftest.mjs runs odw against
crafted mock workflows and asserts every parity feature (token-free,
deterministic). It is wired into cargo test as the parity_selftest
integration test, so the gate fails if any parity behavior regresses.
- Heterogeneous executors. Each node picks its own runtime/provider/model, so
one
parallel(...)can fan a task across codex + claude + several domestic models at once — the built-in tool runs claude subagents only. - Persistent, offline observability. Every run writes
events.jsonland a standalone HTML execution graph: per-node runtime, the resolved model (even when the script left it implicit), real token count, prompt, status, and duration. The built-in tool's progress tree is ephemeral. - Structured failures, not exceptions. A node that exhausts its retries
returns
{ ok: false, error: { category, message } }instead of throwing, so a script can inspect why a node failed (result.ok === false,result.error.category). A thunk that throws still resolves tonullinsideparallel/pipeline, matching the built-in.filter(Boolean)idiom — so use.filter(r => r && r.ok !== false)when you want to drop failed nodes too. - claude token accounting. claude nodes report token usage (parsed from the
Claude Code session transcript), so
budgetcounts them like codex/bamboo.
Honest tradeoffs — reach for the built-in Workflow when these matter:
- Zero setup. It runs inside Claude Code with no install; ODW needs the
odwbinary, Node, and (for real runs) PandaCode + runtime CLIs on PATH. - Live progress. Its progress tree updates in-terminal as agents run; ODW's
HTML graph is rendered after the run (the live signal is
events.jsonl). - Real subagent types.
agentTypethere selects a custom Claude subagent with its own system prompt and toolset; ODW has no subagent registry, soagentTypeis only a routing/tag value —runtime(codex/claude/bamboo) is the real selector. Prose/analysis/structured-output nodes should targetclaude. - Automatic schema retry. It keeps re-prompting until structured output
validates; ODW attempts once by default (
maxAttempts:1) — passretry. - Output-token budgets. Its
budget.spent()counts output tokens (matching Claude Code's+Nkmetering); ODW counts total tokens, so the same ceiling trips much sooner here. Size ODW budgets in total tokens.
Implemented now:
- Rust CLI named
odw - Open Dynamic Workflow project pack installer
- direct workflow script contract
- direct JavaScript runner through
odw exec - prompt-slot injection for node prompts
- complex flow starter with dynamic fan-out, join, parallel review, quality gate, and bounded rework loop
- framework
.d.tsand runtime contract docs - project-level Claude Code agent types
- slash commands and starter workflow scripts
- worker output schemas
- saved workflow artifact evidence reader
- live run journals under
.odw/runs - checkpointed direct resume with
odw exec --resume - single-shot Codex execution through PandaCode (
runtime: "codex") - Bamboo provider dispatch through PandaCode (
runtime: "bamboo", provider) - structured error feedback schema and failure analyst agent
- validation, doctor, unit tests, clippy checks
Remaining product work:
- richer reusable workflow template library
- live
runs watchview over the journal
The gate every change must pass (also enforced by CI in .github/workflows/):
cargo build --all-targets
cargo test
cargo clippy --all-targets -- -D warnings
node scripts/selftest.mjsscripts/selftest.mjs is the parity self-test — token-free, mock-backed, and the
fastest way to confirm the runtime still matches the built-in Workflow contract.
See CONTRIBUTING.md for details.
MIT — see LICENSE.