Build an MVP of a dynamic-workflow skill that teaches a local coding agent to create and run code-based orchestration workflows for large, repetitive, long-running, parallelizable, or verification-heavy tasks.
The MVP is not a standalone product and not a full CLI framework. It is a skill folder containing:
- `SKILL.md`: instructions that tell a coding agent when and how to use the dynamic workflow pattern.
- A minimal JavaScript runtime template: `workflow-runtime.mjs`.
- A small set of reference docs.
- A few sample workflows that a coding agent can copy and adapt.
The coding agent should be able to read this skill, copy the runtime into the target repo, write a task-specific workflow script, run it, and summarize the results from the generated artifacts.
The dynamic workflow pattern separates deterministic orchestration from fuzzy agent work.
Deterministic workflow code handles:
- File enumeration
- Batching
- Queueing
- Parallelism
- State persistence
- Resume behavior
- Shell command execution
- Serialized state writes
- Diff generation
- Writing results
- Test or benchmark execution
- Acceptance checks based on objective metrics
- Report generation
Agents handle:
- Local reasoning
- Text editing
- Code editing
- Review
- Hypothesis generation
- Summarization
- Strategy suggestions
The workflow owns the loop. Agents own local reasoning or local edits. The current chat session should not hold all intermediate state.
The MVP should allow a coding agent to perform workflows like:
- Proofread hundreds of Markdown/text files.
- Review many files in a codebase and produce structured findings.
- Run repeated code/test/build/benchmark optimization loops.
- Run simple auto-research style loops where an agent proposes a change, deterministic commands evaluate it, and the workflow accepts or rejects the result.
The MVP must support:
- A skill folder layout.
- A single-file Node.js runtime with no required npm dependencies.
- Task-specific workflow scripts written as `.mjs` files.
- Configurable CLI agents.
- Agent prompt input via stdin or prompt file.
- Agent output as text or JSON.
- Basic JSON extraction/parsing.
- Lightweight schema validation for machine-consumed agent JSON.
- Parallel execution with bounded concurrency.
- Pipeline execution over many items.
- Optional per-item state persistence for workflows that need item-level resume.
- Resume by skipping completed items when the workflow uses item state.
- Shell command execution with timeout.
- Run artifact directory with logs, prompts, outputs, state, and report.
- Example workflows.
Do not implement these in the MVP:
- A polished standalone CLI.
- A web UI or TUI.
- Distributed workers.
- Durable database beyond JSON/JSONL files.
- Full JSON Schema validation library dependency; the runtime should keep a small dependency-free schema subset.
- Complex sandboxing.
- Full git worktree management.
- Automated dependency installation.
- Full self-modifying workflow runtime.
- Cross-machine execution.
- Token/cost accounting.
- Model provider SDK integration.
The MVP should remain simple enough that a coding agent can inspect and modify it easily.
Create the following folder:
skills/dynamic-workflow/
  SKILL.md
  references/
    concepts.md
    when-to-use.md
    runtime-contract.md
    agent-contract.md
    patterns.md
    safety-and-isolation.md
  runtime/
    workflow-runtime.mjs
  examples/
    proofread-directory.workflow.mjs
    review-codebase.workflow.mjs
    benchmark-optimize.workflow.mjs
    auto-research-simple.workflow.mjs
The exact parent folder may depend on the local coding agent's skill system. The implementation should keep the internal structure above.
This skills/dynamic-workflow/ folder is the reusable skill installation. It is separate from .dynamic-workflows/, which is created inside each target repo when the skill is used.
When a coding agent uses this skill in a target repo, it should create:
.dynamic-workflows/
  agents.json
  runtime/
    workflow-runtime.mjs
  workflows/
    <task-name>.workflow.mjs
  runs/
    <workflow-name>/
      run.json
      events.jsonl
      items.json        # optional item checkpoint
      report.md
      prompts/
      outputs/
      errors/
      shell/
      diffs/
      artifacts/
For the earliest MVP, the default run directory should be stable per workflow, not timestamped per invocation. Reusing one directory makes resume behavior obvious and avoids requiring a separate run registry.
<workflow-name> means safeName(name), not the raw workflow name.
items.json is a standard optional checkpoint file. It is needed for workflows that want item-level resume, such as "process these 400 files and skip the 217 already completed files." Workflows that only run a single command, produce one report, or manage custom state do not need to use item state. The runtime may create an empty items.json for consistency, but workflow correctness should not depend on every workflow using it.
The target repo should contain a configurable agent adapter file:
{
  "agents": {
    "editor": { "preset": "claude" },
    "reviewer": { "preset": "codex" },
    "coder": { "preset": "pi", "timeoutMs": 1800000 }
  }
}
The preset values are examples. The skill must explain that the user or coding agent should adapt them to the locally available agent CLI.
Supported built-in presets:
- `claude`: `claude -p`; structured calls use `--output-format json`, schema calls use `--json-schema`.
- `codex`: `codex exec --ephemeral --skip-git-repo-check -s read-only -`; structured calls add `--json`.
- `pi`: `pi -p`; schema calls rely on prompt contract plus runtime validation.
Preset defaults can be overridden per agent by setting normal adapter fields. A string value is shorthand: "editor": "codex" means "editor": { "preset": "codex" }.
Supported `input` values:
- `stdin`
- `file`
Supported `output` values:
- `text`
- `json`
- `codex-json`
- `claude-json`
Optional adapter fields:
- `preset`: built-in adapter preset: `claude`, `codex`, or `pi`.
- `jsonCommand`: optional shell command to use for structured output calls. Use this for local agent CLI JSON flags such as `--json` or `--output-format json`.
- `schemaCommand`: optional shell command to use when `agent(..., { schema })` is used. Use `{schema}` to inject shell-quoted schema JSON.
- `timeoutMs`: default timeout for this agent.
- `inheritEnv`: whether the subprocess inherits the current environment. Default `true` for local CLI compatibility.
- `env`: extra environment variables to add or override.
Use output: "codex-json" for codex exec --json, which emits JSONL events. The runtime should extract the last completed agent_message text before applying normal JSON parsing and schema validation.
Use output: "claude-json" for claude -p --output-format json. The runtime should return structured_output when present, otherwise result.
The skill must warn that inherited environment variables can expose secrets to agent subprocesses.
When the user asks for a large task, the coding agent should:
- Detect that the task is suitable for dynamic workflow.
- Create `.dynamic-workflows/` in the target repo if missing.
- Copy `runtime/workflow-runtime.mjs` from the skill into `.dynamic-workflows/runtime/`.
- Create `.dynamic-workflows/agents.json` if missing.
- Create a task-specific workflow under `.dynamic-workflows/workflows/`.
- Run the workflow using Node.js:
  node .dynamic-workflows/workflows/<task-name>.workflow.mjs
- Inspect `.dynamic-workflows/runs/<task-name>/report.md` and other artifacts.
- Summarize what happened to the user.
The coding agent should not try to process hundreds of files directly inside the chat session.
SKILL.md must be short, directive, and optimized for coding-agent behavior.
It should include:
- A description header.
- Trigger conditions.
- Required behavior.
- The key workflow principle.
- The default implementation sequence.
- References to detailed docs.
- Warnings about safety and state.
---
name: dynamic-workflow
description: Use when a task is too large, repetitive, parallelizable, long-running, or verification-heavy to complete directly in the current chat. Create code-based workflows that call CLI agents, keep state on disk, and resume safely.
---
# Dynamic Workflow Skill
Use this skill when the user's task has one or more of these properties:
- Many independent items: files, tests, documents, issues, URLs, candidates, modules, examples.
- Long-running loops: optimize until a metric improves, run repeated experiments, repeatedly test and fix.
- Parallelizable subtasks: one agent per file, module, candidate, or review dimension.
- Need for verification: writer agent followed by reviewer agent, code change followed by tests, benchmark result followed by acceptance check.
- Context too large for one conversation.
- The user explicitly asks for workflow, orchestration, batching, fan-out, auto research, or dynamic workflow.
Do not try to complete such tasks directly in the current conversation. Instead, create a workflow script.
## Required behavior
1. Create `.dynamic-workflows/` in the current repo if it does not exist.
2. Copy or create the minimal runtime at `.dynamic-workflows/runtime/workflow-runtime.mjs`.
3. Create `.dynamic-workflows/agents.json` if missing.
4. Write a task-specific workflow under `.dynamic-workflows/workflows/`.
5. Use deterministic code for enumeration, batching, state, retries, writing files, shell commands, and report generation.
6. Use agents only for fuzzy judgment, editing, summarization, code changes, hypothesis generation, or review.
7. All agent calls must have explicit input and output contracts.
8. Prefer JSON output from agents when possible.
9. For item-based workflows, persist item status to disk so the workflow can resume.
10. At the end, summarize from workflow artifacts, not from memory.
## Key principle
The workflow owns the loop. Agents own local reasoning or local edits. The current chat session only creates, starts, monitors, and summarizes the workflow.
## Before writing a workflow
Read:
- `references/runtime-contract.md`
- `references/agent-contract.md`
- `references/patterns.md`
- `references/safety-and-isolation.md`
Use examples as templates.

The MVP runtime should export these functions:
createWorkflow(options)
agent(agentName, options)
shell(command, options)
parallel(tasks, options)
pipeline(items, stages, options)
globFiles(options)
readText(filePath)
writeText(filePath, content)
appendText(filePath, content)
diffText(before, after)
safeName(name)
Optional but useful:
readJson(filePath, fallback)
writeJson(filePath, value)
appendJsonl(filePath, value)
fileExists(filePath)
ensureDir(dirPath)

const wf = createWorkflow({
  name,
  runDir,
  concurrency,
  resume
});
Options:
{
  name: string,
  runDir?: string,
  concurrency?: number,
  resume?: boolean
}
Defaults:
concurrency = 4
resume = true
runDir = `.dynamic-workflows/runs/${safeName(name)}`
MVP run directories are stable per workflow by default. Re-running the same workflow reuses the same artifact directory and can resume from prior item state. A workflow can still pass a custom `runDir` if it wants isolated artifacts for a one-off run.
`resume` controls item-state loading:
- `resume: true`: load existing `items.json` if present.
- `resume: false`: start with empty item state for this invocation and overwrite `items.json` on the first item-state write.
The runtime should not delete the run directory automatically. If a completely clean artifact directory is needed, the user or top-level coding agent can remove it explicitly or provide a different runDir.
The returned workflow object exposes:
{
  name,
  runDir,
  concurrency,
  state,
  run(fn),
  event(type, data),
  getItemState(key),
  setItemState(key, value),
  isItemDone(key),
  markItemDone(key, value),
  markItemFailed(key, errorOrValue),
  writeReport(markdown)
}
`state` is the in-memory item checkpoint object with the same shape as `items.json`, usually `{ items: {} }`. It is not intended to be a general workflow database. Workflows that need custom state should write explicit artifacts under `artifacts/` or their own files.
wf.run(fn) should:
- Create the run directory.
- Create subdirectories: `prompts/`, `outputs/`, `errors/`, `shell/`, `diffs/`, `artifacts/`.
- Load existing item state if `resume` is true and `items.json` exists.
- Otherwise initialize empty in-memory item state.
- Write or overwrite `run.json` for the latest invocation.
- Append `run_started` to `events.jsonl`.
- Execute `fn`.
- On success, append `run_completed`.
- On failure, append `run_failed` and rethrow.
For item-based workflows, use a simple JSON checkpoint file:
.dynamic-workflows/runs/<workflow-name>/items.json
items.json is not required for every workflow. It is the runtime's standard per-item progress file for workflows that have stable item keys and need resumable processing. Examples:
- Proofreading many files should use item state.
- Reviewing many source files should use item state.
- A single benchmark command may not need item state.
- An experiment loop may use custom artifacts instead of item state, unless each experiment has a stable key.
Shape:
{
  "items": {
    "docs/file1.md": {
      "status": "done",
      "updatedAt": "2026-05-30T12:00:00.000Z",
      "result": {
        "status": "updated",
        "summary": "Fixed spelling and punctuation."
      }
    }
  }
}
Recommended item statuses:
- `pending`
- `running`
- `done`
- `failed`
- `skipped`
- `rejected`
The MVP can keep this simple: only done items are skipped on resume.
For terminal outcomes that should not be retried, such as "unchanged", "updated", or "reviewer rejected this edit", prefer top-level status: "done" with a detailed result.status. Use top-level failed for items that should be retried. Use top-level rejected or skipped only when the workflow explicitly wants those statuses to remain non-done for future policy decisions.
The workflow decides whether to skip an item by calling wf.isItemDone(key). The runtime should not hide items automatically from pipeline() or parallel() because each workflow may have different retry, rejected, or stale-result rules.
Failed items should be recorded with status failed, usually through wf.markItemFailed(key, errorOrValue) or wf.setItemState(key, ...). wf.isItemDone(key) must return true only for status: "done", so failed items are naturally retried on the next run when resume: true unless the workflow explicitly chooses a different policy.
When item-state helpers are used under parallel() or pipeline(), the runtime should serialize in-process writes to items.json so concurrent item completions do not overwrite each other. A simple promise-chained write mutex is enough for MVP: enqueue each state mutation, update the in-memory object, write JSON to a temporary file, then rename it over items.json. The MVP does not need cross-process locking; running the same workflow directory from two separate Node.js processes at the same time is unsupported.
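
The promise-chained write mutex described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the runtime's actual code: `createItemStore` and the `.tmp` suffix are hypothetical names chosen for illustration, and cross-process locking is deliberately left out.

```javascript
import fs from "node:fs/promises";

// Hypothetical helper: createItemStore is an illustrative name, not a runtime export.
function createItemStore(filePath) {
  const state = { items: {} };   // same shape as items.json
  let queue = Promise.resolve(); // tail of the in-process write chain

  function setItemState(key, value) {
    // Each mutation waits for the previous one, so writes never interleave.
    const op = queue.then(async () => {
      state.items[key] = { ...value, updatedAt: new Date().toISOString() };
      const tmpPath = `${filePath}.tmp`; // assumption: temp file sits next to the target
      await fs.writeFile(tmpPath, JSON.stringify(state, null, 2));
      await fs.rename(tmpPath, filePath); // swap the finished file over items.json
    });
    // Swallow the error on the chain only, so one failed write does not block later ones.
    queue = op.catch(() => {});
    return op; // the caller still observes the failure
  }

  // Only top-level status "done" counts as done, so failed items are retried.
  const isItemDone = (key) => state.items[key]?.status === "done";

  return { state, setItemState, isItemDone };
}
```

Because every mutation updates the shared in-memory object before serializing it, each file write contains all earlier completions, which is what keeps concurrent `pipeline()` items from overwriting each other.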
Every important operation should append a JSON line to:
events.jsonl
Example events:
{"type":"run_started","at":"2026-05-30T12:00:00.000Z","name":"proofread-directory"}
{"type":"agent_started","at":"...","agent":"editor","label":"edit:docs/a.md"}
{"type":"agent_completed","at":"...","agent":"editor","label":"edit:docs/a.md","durationMs":12345}
{"type":"item_done","at":"...","key":"docs/a.md","result":{"status":"updated"}}
{"type":"run_completed","at":"..."}
The event log is append-only and should be useful for debugging.
Because the default run directory is reused, events.jsonl may contain events from multiple invocations of the same workflow. Each invocation should append its own run_started and terminal run_completed or run_failed event.
const result = await agent("editor", {
  label,
  prompt,
  cwd,
  schema,
  timeoutMs,
  retries
});
Options:
{
  label: string,
  prompt: string,
  cwd?: string,
  schema?: object,
  timeoutMs?: number,
  retries?: number
}
Behavior:
- Load `.dynamic-workflows/agents.json`.
- Find the named agent.
- If `schema` is provided, append the schema contract to the prompt and use structured output mode.
- Write the prompt to `prompts/<safe-label>.md`.
- Resolve any built-in preset, then run `schemaCommand` when `schema` is provided and the adapter provides one, otherwise run `jsonCommand` when structured output mode is active and available, otherwise run `command`.
- If `input` is `stdin`, pass prompt to stdin.
- If `input` is `file`, create a prompt file and replace `{promptFile}` in the command string. If the placeholder is missing, throw a configuration error.
- Capture stdout and stderr.
- Write stdout to `outputs/<safe-label>.stdout.txt`.
- Write stderr to `errors/<safe-label>.stderr.txt`.
- If output mode is `codex-json`, extract the last completed Codex `agent_message` text from the JSONL event stream.
- If output mode is `claude-json`, parse Claude Code's JSON envelope and return `structured_output` when present, otherwise `result`.
- If output mode is `json` or `schema` is provided, parse JSON from stdout or from the extracted agent message text.
- If JSON parsing fails, attempt to extract the first JSON object or array from stdout, choosing whichever valid JSON region appears first by position.
- If `schema` is provided, validate the parsed JSON against the runtime's lightweight schema subset.
- Return parsed JSON or raw text.
Supported schema keywords are type, required, properties, items, enum, additionalProperties, nullable, minItems, maxItems, minLength, and maxLength. This validates shape only. Workflow code must still compute deterministic facts such as actual changed files, diffs, command exit codes, and parsed metrics.
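
A dependency-free validator for this keyword subset can stay quite small. The sketch below is one possible shape, not the runtime's definitive implementation: the name `validateSchema`, the `$`-rooted path strings, and the error message wording are all assumptions made for illustration.

```javascript
// Sketch of a validator for the keyword subset listed above.
// validateSchema and its error-string format are illustrative assumptions.
function typeOf(value) {
  if (value === null) return "null";
  return Array.isArray(value) ? "array" : typeof value;
}

// Returns an array of error strings; an empty array means the value matched.
function validateSchema(value, schema, at = "$") {
  if (value === null && schema.nullable) return [];
  if (schema.type) {
    const ok = schema.type === "integer"
      ? Number.isInteger(value)
      : typeOf(value) === schema.type;
    if (!ok) return [`${at}: expected ${schema.type}, got ${typeOf(value)}`];
  }
  const errors = [];
  if (schema.enum && !schema.enum.includes(value)) errors.push(`${at}: value not in enum`);
  if (typeof value === "string") {
    if (value.length < (schema.minLength ?? 0)) errors.push(`${at}: shorter than minLength`);
    if (value.length > (schema.maxLength ?? Infinity)) errors.push(`${at}: longer than maxLength`);
  } else if (Array.isArray(value)) {
    if (value.length < (schema.minItems ?? 0)) errors.push(`${at}: fewer than minItems`);
    if (value.length > (schema.maxItems ?? Infinity)) errors.push(`${at}: more than maxItems`);
    if (schema.items) {
      value.forEach((child, i) =>
        errors.push(...validateSchema(child, schema.items, `${at}[${i}]`)));
    }
  } else if (typeOf(value) === "object") {
    const props = schema.properties ?? {};
    for (const key of schema.required ?? []) {
      if (!Object.hasOwn(value, key)) errors.push(`${at}.${key}: required property is missing`);
    }
    for (const [key, child] of Object.entries(value)) {
      if (Object.hasOwn(props, key)) {
        errors.push(...validateSchema(child, props[key], `${at}.${key}`));
      } else if (schema.additionalProperties === false) {
        errors.push(`${at}.${key}: unexpected property`);
      }
    }
  }
  return errors;
}
```

Returning a list of errors rather than throwing lets `agent()` include every problem in the retry artifact for a failed attempt.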
For MVP, support command as a string and run it through the shell:
spawn(command, { shell: true })
This is less safe than argv-array execution but simpler: any text interpolated into the command string is interpreted by the shell. Document the risk.
A future npm package can support argv arrays.
Agent subprocesses inherit the current environment by default unless their adapter sets inheritEnv: false. Adapter env values should be merged into the spawned process environment.
For input: "file", the runtime should replace every {promptFile} occurrence with a shell-quoted absolute prompt-file path. The command string should contain the unquoted placeholder, for example agent-cli --prompt {promptFile}.
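
The placeholder substitution can be sketched as below. This assumes a POSIX shell (single-quote quoting does not apply to `cmd.exe`), and `shellQuote` and `injectPromptFile` are hypothetical helper names, not part of the runtime contract. The caller is assumed to pass an already absolute path.

```javascript
// Assumption: POSIX sh quoting. Wrap in single quotes and escape embedded ones.
function shellQuote(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: replaces every {promptFile} with the quoted path.
function injectPromptFile(command, absolutePromptPath) {
  if (!command.includes("{promptFile}")) {
    // Configuration error: not a transient failure, so it should not be retried.
    throw new Error('input: "file" requires a {promptFile} placeholder in the command');
  }
  const quoted = shellQuote(absolutePromptPath);
  // A replacer function avoids special "$" patterns being expanded in the path.
  return command.replaceAll("{promptFile}", () => quoted);
}
```

Keeping the placeholder unquoted in `agents.json` matters here: if the command already wrapped `{promptFile}` in quotes, the injected quotes would nest and break paths containing spaces.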
retries defaults to 0 and means additional attempts after the first attempt. For example, retries: 2 allows up to 3 attempts total.
Retry only transient execution failures:
- Non-zero exit code.
- Timeout.
- JSON parse or extraction failure when adapter `output` is `json`, `codex-json`, `claude-json`, or `schema` is provided.
- Schema validation failure when `agent(..., { schema })` is used.
Do not retry configuration errors such as a missing agent name, unsupported adapter option, or missing {promptFile} placeholder for input: "file".
Between retries, wait with simple linear backoff such as 500ms * attemptNumber. If more than one attempt occurs, prompt/stdout/stderr artifacts should include the attempt number so failed attempts are auditable.
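
One way to express these retry rules is a small wrapper around a single attempt. The following is a sketch under assumptions: `withRetries`, `ConfigError`, and the injectable `sleep` option are hypothetical names introduced here so the backoff can be exercised without real delays.

```javascript
// Hypothetical marker class for errors that must never be retried.
class ConfigError extends Error {}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// attemptFn receives the 1-based attempt number, which it can use to name
// per-attempt prompt/stdout/stderr artifacts.
async function withRetries(attemptFn, { retries = 0, baseDelayMs = 500, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    try {
      return await attemptFn(attempt);
    } catch (err) {
      if (err instanceof ConfigError) throw err; // configuration errors fail fast
      lastError = err;
      // Linear backoff: 500ms, 1000ms, ... and no wait after the final attempt.
      if (attempt <= retries) await sleep(baseDelayMs * attempt);
    }
  }
  throw lastError;
}
```

With `retries: 2` this makes at most three attempts and waits 500 ms and then 1000 ms between them.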
If timeout expires:
- Kill the subprocess.
- Write an error event.
- Throw an error.
Every agent prompt should include:
- Local objective.
- Input data.
- Allowed actions.
- Forbidden actions.
- Required output format.
- Semantic payload fields expected by the workflow.
- Acceptance criteria.
Example:
You are proofreading one Markdown file.
Rules:
- Fix spelling and grammar errors.
- Preserve meaning.
- Preserve Markdown structure.
- Do not modify code blocks.
- Do not rewrite style unnecessarily.
- Return the corrected full file text and a short summary.
The workflow should pass the machine-readable shape through agent(..., { schema }), for example:
const edit = await agent("editor", {
  label: "edit:docs/a.md",
  prompt,
  schema: {
    type: "object",
    required: ["correctedText", "summary"],
    properties: {
      correctedText: { type: "string" },
      summary: { type: "string" }
    },
    additionalProperties: false
  },
  retries: 1
});
Schema validation confirms output shape, not truth. The workflow still computes deterministic facts such as whether text changed, actual changed files, command exit codes, diffs, and metrics.
const result = await shell("npm test -- --json", {
  cwd,
  timeoutMs,
  label,
  json
});
Options:
{
  cwd?: string,
  timeoutMs?: number,
  label?: string,
  json?: boolean,
  env?: object
}
Returns:
{
  ok: boolean,
  exitCode: number,
  stdout: string,
  stderr: string,
  json?: any,
  durationMs: number
}
If `json: true`, parse stdout as JSON or extract JSON from stdout using the same loose JSON parser as `agent()`.
- Write stdout/stderr to `shell/<safe-label>.stdout.txt` and `shell/<safe-label>.stderr.txt` when a workflow context exists.
- Append shell events to `events.jsonl`.
const results = await parallel(
  tasks,
  { concurrency: 4, stopOnError: false }
);
`tasks` is an array of async functions.
Options:
{
  concurrency?: number,
  stopOnError?: boolean
}
Behavior:
- Execute up to `concurrency` tasks at a time.
- Preserve result order by input task index.
- Always return result envelopes when `stopOnError` is false:
{
  ok: true,
  index: 0,
  value: "..."
}
{
  ok: false,
  index: 0,
  error: "..."
}
- If `stopOnError` is true, throw on the first observed task failure. Already-running tasks are not forcibly cancelled, but no ordered result array is returned.
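
The contract above can be met with a small worker-pool loop. This is a minimal sketch rather than the required implementation; the only assumption beyond the text is that `error` carries the thrown error's message string.

```javascript
async function parallel(tasks, { concurrency = 4, stopOnError = false } = {}) {
  const results = new Array(tasks.length);
  let next = 0;        // index of the next task to claim
  let stopped = false; // set once a failure is seen in stopOnError mode

  async function worker() {
    while (!stopped && next < tasks.length) {
      const index = next++; // claim a slot; results stay in input order
      try {
        results[index] = { ok: true, index, value: await tasks[index]() };
      } catch (err) {
        if (stopOnError) {
          stopped = true; // other workers finish their current task, then exit
          throw err;      // rejects the Promise.all below immediately
        }
        results[index] = { ok: false, index, error: String(err?.message ?? err) };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each worker holds at most one task in flight, so the number of workers is the concurrency bound, and no task is started after a failure in `stopOnError` mode.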
const results = await pipeline(items, [stage1, stage2, stage3], {
  concurrency: 4,
  stopOnError: false
});
Options match `parallel()`:
{
  concurrency?: number,
  stopOnError?: boolean
}
For each item:
- Run stage 1.
- Pass result to stage 2.
- Pass result to stage 3.
- Return final result.
Across items, run pipelines concurrently with bounded concurrency.
`pipeline()` returns result envelopes in input item order when `stopOnError` is false:
{
  ok: true,
  index: 0,
  item,
  value
}
{
  ok: false,
  index: 0,
  item,
  stage: 1,
  error: "..."
}
If any stage throws for an item, later stages for that item do not run. `stage` is the zero-based stage index that failed. If `stopOnError` is true, throw on the first observed item failure. Already-running item pipelines are not forcibly cancelled.
This is better than barrier-style parallel stages for large item sets.
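
A per-item pipeline avoids barriers because each item moves to its next stage as soon as its own previous stage finishes. The sketch below illustrates that under one assumption the text does not settle: each stage is called as `stage(value, { item, index })`, with the first stage receiving the item itself.

```javascript
async function pipeline(items, stages, { concurrency = 4, stopOnError = false } = {}) {
  const results = new Array(items.length);
  let next = 0;
  let stopped = false;

  // Run every stage for one item, feeding each result into the next stage.
  async function runItem(index) {
    const item = items[index];
    let value = item;
    for (let stage = 0; stage < stages.length; stage++) {
      try {
        // Assumed stage signature: (previousValue, { item, index }).
        value = await stages[stage](value, { item, index });
      } catch (err) {
        if (stopOnError) {
          stopped = true;
          throw err;
        }
        // Later stages are skipped; record the zero-based failing stage.
        return { ok: false, index, item, stage, error: String(err?.message ?? err) };
      }
    }
    return { ok: true, index, item, value };
  }

  async function worker() {
    while (!stopped && next < items.length) {
      const index = next++;
      results[index] = await runItem(index);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With a slow reviewer stage, for example, fast items are already being reviewed while slower items are still in the editor stage, instead of all items waiting for the slowest edit.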
No npm dependency in MVP.
Implement a simple recursive file finder that supports:
- Root directory.
- File extensions.
- Ignore directories.
const files = await globFiles({
  root: "docs",
  extensions: [".md", ".txt"],
  ignoreDirs: ["node_modules", ".git", ".dynamic-workflows"]
});
Do not implement full glob syntax in MVP unless easy.
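
A recursive finder with these three options needs only `node:fs/promises` and `node:path`. The sketch below is one possible shape; the default values, the sorted return order, and treating an empty `extensions` list as "match everything" are assumptions made here, not requirements from the spec.

```javascript
import fs from "node:fs/promises";
import path from "node:path";

async function globFiles({
  root = ".",
  extensions = [], // assumption: an empty list matches every file
  ignoreDirs = ["node_modules", ".git", ".dynamic-workflows"]
} = {}) {
  const found = [];

  async function walk(dir) {
    const entries = await fs.readdir(dir, { withFileTypes: true });
    for (const entry of entries) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        // Ignore by directory name at any depth.
        if (!ignoreDirs.includes(entry.name)) await walk(fullPath);
      } else if (entry.isFile()) {
        const matches = extensions.length === 0 || extensions.includes(path.extname(entry.name));
        if (matches) found.push(fullPath);
      }
    }
  }

  await walk(root);
  // Sorting gives stable item keys across runs, which helps resume.
  return found.sort();
}
```

Symbolic links are skipped in this sketch because they report neither `isDirectory()` nor `isFile()` from `readdir`, which also avoids link cycles.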
Required helpers:
readText(filePath)
writeText(filePath, content)
appendText(filePath, content)
readJson(filePath, fallback)
writeJson(filePath, value)
appendJsonl(filePath, value)
ensureDir(dirPath)
fileExists(filePath)
safeName(name)
`safeName` should convert labels into filesystem-safe names.
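
As an illustration, `safeName` could be as small as the sketch below. The allowed character set, the 80-character cap, and the `"unnamed"` fallback are assumptions chosen for this example; the spec only requires that the result be filesystem-safe.

```javascript
function safeName(name) {
  const cleaned = String(name)
    .replace(/[^A-Za-z0-9._-]+/g, "-") // collapse each run of unsafe characters into one dash
    .replace(/^[-.]+|[-.]+$/g, "")     // trim leading/trailing dashes and dots (blocks "." and "..")
    .slice(0, 80);                     // assumption: cap length to keep paths short
  return cleaned || "unnamed";         // assumption: fallback for labels with no safe characters
}
```

For example, the label `edit:docs/a.md` becomes `edit-docs-a.md`. Distinct labels can map to the same safe name (`a/b` and `a:b` both become `a-b`), so workflows should keep labels unique after sanitizing.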
MVP implementation can be simple:
- Write both versions to artifacts if needed.
- Return a rough line-level diff.
No dependency required.
Simple output is acceptable:
- old line
+ new line
This is mainly for reviewer prompts and reports.
diffText() output is display-oriented, not a machine-parseable patch contract. Workflows that need to apply changes should use structured JSON, explicit file writes, or a real patch artifact generated by the workflow.
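
A rough line-level diff does not need a full diff algorithm. The sketch below uses a deliberately simple technique, trimming the common leading and trailing lines and listing everything in between as removed then added. It is not an LCS or Myers diff, so scattered edits show up as one large changed region.

```javascript
function diffText(before, after) {
  const a = before.split("\n");
  const b = after.split("\n");

  // Skip lines that are identical at the start of both texts.
  let start = 0;
  while (start < a.length && start < b.length && a[start] === b[start]) start++;

  // Skip lines that are identical at the end, without crossing the prefix.
  let endA = a.length;
  let endB = b.length;
  while (endA > start && endB > start && a[endA - 1] === b[endB - 1]) {
    endA--;
    endB--;
  }

  const lines = [];
  for (let i = start; i < endA; i++) lines.push(`- ${a[i]}`);
  for (let i = start; i < endB; i++) lines.push(`+ ${b[i]}`);
  return lines.join("\n"); // empty string when the texts are identical
}
```

This keeps the output small for the common proofreading case of a few nearby edits, which is enough for a reviewer prompt or a report line.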
File:
examples/proofread-directory.workflow.mjs
Purpose:
- Enumerate Markdown/text files.
- For each file, call an editor agent.
- Call reviewer agent on proposed correction.
- Write accepted changes.
- Persist item state.
- Generate report.
Expected workflow behavior:
- Find files under `docs/` with `.md` and `.txt` extensions.
- Skip files already marked done.
- Read original file.
- Send original content to editor agent.
- Editor returns:
{
  "changed": true,
  "correctedText": "...",
  "summary": "..."
}
- If unchanged, mark item done.
- If changed, generate diff.
- Send original, corrected text, and diff to reviewer agent.
- Reviewer returns:
{
  "accept": true,
  "reason": "...",
  "finalText": "..."
}
- If accepted, write `finalText` to file.
- If rejected, do not write.
- Mark item done with result.
- Write report with counts:
  - total files
  - updated
  - unchanged
  - rejected
  - failed
File:
examples/review-codebase.workflow.mjs
Purpose:
- Enumerate source files.
- Ask reviewer agents to produce structured findings.
- Aggregate findings.
- Generate report.
The workflow should not modify files.
Agent output:
{
  "findings": [
    {
      "severity": "low|medium|high",
      "file": "src/example.ts",
      "line": 123,
      "title": "...",
      "description": "...",
      "suggestion": "..."
    }
  ]
}
Report should group by severity.
File:
examples/benchmark-optimize.workflow.mjs
Purpose:
- Run baseline benchmark command.
- Create an isolated sandbox copy of the target code under the run artifact directory.
- Ask coder agent to implement one optimization inside the sandbox only.
- Run tests in the sandbox.
- Run benchmark in the sandbox.
- Record the candidate diff and accept/reject result based on tests and benchmark improvement.
MVP behavior:
- Do not implement git worktrees.
- Do not modify the original repo.
- Copy only configured target paths into `artifacts/benchmark-sandbox/`.
- Run the coder agent with `cwd` set to the sandbox.
- Restrict the prompt to explicitly allowed files or directories.
- Generate a candidate diff artifact comparing the original target paths to the sandbox.
- Write a report with baseline metric, candidate metric, test result, improvement percentage, and accept/reject decision.
- Never automatically apply, commit, or revert changes in the original repo.
If the candidate is accepted, the example should still leave application to the human or top-level coding agent. Its job is to produce an auditable candidate patch plus benchmark evidence.
File:
examples/auto-research-simple.workflow.mjs
Purpose:
- Demonstrate an experiment loop.
- Agent proposes a hypothesis.
- Agent proposes a target-file change or patch.
- Shell runs evaluation command.
- Workflow compares metric.
- Workflow records accept/reject.
MVP should avoid complex self-modifying strategy code. It should document the future pattern but not fully implement it. The example should make clear that Auto Research is only one pattern built on Dynamic Workflow, not the central purpose of the skill.
The skill and examples must teach these rules:
- Prefer workflows that read input, ask agents for structured output, and let deterministic code write results.
- Do not let agents directly modify many files unless necessary.
- For text proofreading, have agents return corrected text; workflow writes it.
- For code changes, require tests before accepting.
- For optimization, require objective metrics before accepting.
- Do not expose secrets to agent subprocesses unless explicitly required.
- Do not put API keys in prompts.
- Be explicit about whether agent subprocesses inherit environment variables.
- Do not allow arbitrary destructive shell commands in generated workflows.
- For code-edit workflows, prefer patch/edit-plan generation over uncontrolled direct edits.
- Keep run artifacts for auditability.
- Make workflows resumable when they process many items or long loops.
The MVP should document but not fully implement agentCode().
The intended future pattern:
root workflow: stable, not edited during current run
strategy modules: allowed to be edited by agents after validation
target files: edited as part of experiments
Example future structure:
.dynamic-workflows/
  workflows/
    auto-research.workflow.mjs
  strategies/
    propose.mjs
    review.mjs
  prompts/
    propose.md
    review.md
Rule:
Agents may propose edits to strategy modules for the next generation of the workflow, but the current root workflow should not mutate itself during execution.
The MVP can include this in references/patterns.md as a future pattern.
The runtime should be a single ESM file.
Use only Node.js built-in modules:
- `node:fs/promises`
- `node:path`
- `node:child_process`
- `node:os`
- `node:crypto`
No external dependencies.
Minimum supported Node.js version: Node 18+.
The runtime should be readable and easy for a coding agent to modify.
// workflow-runtime.mjs
import fs from "node:fs/promises";
import path from "node:path";
import { spawn } from "node:child_process";
import os from "node:os";
import crypto from "node:crypto";
let currentWorkflow = null;
export function createWorkflow(options) { ... }
export async function agent(agentName, options) { ... }
export async function shell(command, options = {}) { ... }
export async function parallel(tasks, options = {}) { ... }
export async function pipeline(items, stages, options = {}) { ... }
export async function globFiles(options) { ... }
export async function readText(filePath) { ... }
export async function writeText(filePath, content) { ... }
export async function appendText(filePath, content) { ... }
export async function readJson(filePath, fallback = null) { ... }
export async function writeJson(filePath, value) { ... }
export async function appendJsonl(filePath, value) { ... }
export async function ensureDir(dirPath) { ... }
export async function fileExists(filePath) { ... }
export function safeName(name) { ... }
export function diffText(before, after) { ... }
Use `currentWorkflow` so helper functions can write logs into the active run directory.
Implement helper:
function parseJsonLoose(text) {
  try {
    return JSON.parse(text);
  } catch {}
  // Try fenced code block first.
  // Then try the first valid {...} or [...] region by position.
  // Throw if still invalid.
}
This does not need to be perfect. It should be good enough for common agent outputs.
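
The three commented steps in the helper could be filled in roughly as follows. This is a sketch, not the definitive parser: `findBalancedEnd` is a hypothetical helper, the fence marker is built at runtime only so the snippet contains no literal fence, and the scan is quadratic in the worst case, which is assumed acceptable for typical agent output sizes.

```javascript
const FENCE = "`".repeat(3); // three backticks, built at runtime

// Hypothetical helper: index of the bracket closing the one at `start`, or -1.
// Brackets inside JSON strings are ignored.
function findBalancedEnd(text, start) {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{" || ch === "[") depth++;
    else if (ch === "}" || ch === "]") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

function parseJsonLoose(text) {
  try {
    return JSON.parse(text);
  } catch {}

  // Try a fenced code block first.
  const fenced = new RegExp(FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)" + FENCE, "i").exec(text);
  if (fenced) {
    try {
      return JSON.parse(fenced[1]);
    } catch {}
  }

  // Then try the first valid {...} or [...] region by position.
  for (let start = 0; start < text.length; start++) {
    if (text[start] !== "{" && text[start] !== "[") continue;
    const end = findBalancedEnd(text, start);
    if (end === -1) continue;
    try {
      return JSON.parse(text.slice(start, end + 1));
    } catch {}
  }

  // Throw if still invalid, so agent() can treat it as a retryable failure.
  throw new Error("parseJsonLoose: no valid JSON found");
}
```

Candidate regions that balance but do not parse, such as a bracketed word in prose, are skipped, so the first region that is actually valid JSON wins.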
Each workflow should produce report.md.
The runtime should provide:
await wf.writeReport(markdown)
Example report sections:
# Proofread Directory Report
- Total files: 312
- Updated: 147
- Unchanged: 151
- Rejected: 9
- Failed: 5
## Updated Files
| File | Summary |
|---|---|
| docs/a.md | Fixed spelling and punctuation. |
## Rejected Files
| File | Reason |
|---|---|
| docs/b.md | Reviewer found meaning changed. |
## Failed Files
| File | Error |
|---|---|
`references/concepts.md` must explain:
- Current chat is not the worker.
- Workflow script is the manager.
- Agents are workers.
- State lives on disk.
- Results are structured.
- Verification is a stage, not a vibe.
`references/when-to-use.md` must list trigger conditions:
- More than about 20 independent items.
- Repeated experiment loops.
- Benchmark/test/build optimization.
- Broad audit.
- Migration.
- Many files or URLs.
- Need for reviewers.
- Need for resumability.
`references/runtime-contract.md` must document runtime APIs, stable default run directories, resume behavior, and optional item-state helpers.
`references/agent-contract.md` must document prompt and output contract rules.
`references/patterns.md` must document these patterns:
- map-only
- map-review-write
- planner-reviewer-coder
- test-fix-loop
- benchmark-accept-reject
- auto-research-loop
- strategy-self-improvement as future pattern
`references/safety-and-isolation.md` must document safety rules and destructive-operation warnings.
The MVP is complete when all of the following are true.
- The `dynamic-workflow` skill folder exists.
- `SKILL.md` exists and contains trigger conditions and required behavior.
- All reference docs exist.
- Runtime file exists.
- At least four examples exist.
- `workflow-runtime.mjs` can be imported from an example workflow.
- `createWorkflow().run()` creates or reuses the stable workflow run directory.
- Events are appended to `events.jsonl`.
- Item state is saved to `items.json` when item-state helpers are used.
- Concurrent item-state writes are serialized in process.
- `agent()` can call a configured CLI command.
- `agent()` writes prompt, stdout, and stderr artifacts.
- `agent()` documents and respects adapter environment settings.
- `agent()` supports built-in `claude`, `codex`, and `pi` presets that can be overridden by normal adapter fields.
- `agent()` uses `jsonCommand` for structured output calls when configured.
- `agent()` validates schema-backed JSON outputs and retries transient validation failures.
- `agent()` retry behavior is defined and auditable through attempt artifacts.
- `shell()` can run a command and capture stdout/stderr/exit code.
- `parallel()` respects concurrency and returns ordered result envelopes.
- `pipeline()` processes many items with bounded concurrency and returns ordered result envelopes.
- `globFiles()` can recursively find files by extension.
- `writeReport()` writes `report.md`.
- `proofread-directory.workflow.mjs` is runnable after configuring agents.
- `review-codebase.workflow.mjs` is runnable after configuring agents.
- `benchmark-optimize.workflow.mjs` demonstrates baseline/test/benchmark flow in a sandbox and does not modify the original repo.
- `auto-research-simple.workflow.mjs` demonstrates propose/change/evaluate/record flow.
- The default run directory is stable per workflow name.
- If `resume` is true and an item is marked done in `items.json`, rerunning the same workflow can skip it.
- If `resume` is false, the workflow starts with empty item state for that invocation.
- Failed items should be marked `failed`, not `done`, and `isItemDone()` should return false for them.
- A coding agent should be able to read `SKILL.md` and examples, then create a new task-specific workflow without needing additional explanation.
Create a fake agent script:
// fake-agent.mjs
let input = "";
process.stdin.on("data", chunk => input += chunk);
process.stdin.on("end", () => {
  console.log(JSON.stringify({
    changed: false,
    correctedText: input,
    summary: "fake agent did nothing"
  }));
});
Create a fake reviewer script:
// fake-reviewer.mjs
let input = "";
process.stdin.on("data", chunk => input += chunk);
process.stdin.on("end", () => {
  console.log(JSON.stringify({
    accept: true,
    reason: "fake reviewer accepted",
    finalText: input
  }));
});
Configure:
{
  "agents": {
    "editor": {
      "command": "node .dynamic-workflows/fake-agent.mjs",
      "input": "stdin",
      "output": "json",
      "timeoutMs": 60000
    },
    "reviewer": {
      "command": "node .dynamic-workflows/fake-reviewer.mjs",
      "input": "stdin",
      "output": "json",
      "timeoutMs": 60000
    }
  }
}
Run proofread workflow on a small test directory.
Expected:
- No files changed.
- Reviewer should not be called because the fake editor returns `changed: false`.
- Events written.
- Prompts and outputs written.
- Report generated.
Create 10 fake tasks where each sleeps for 1 second. Run with concurrency 5. Expected wall time should be around 2 seconds, not 10 seconds.
Run proofread workflow over 3 files. Stop after 1 file manually or simulate failure. Rerun the same workflow with the default stable run directory and resume: true. Previously done items should be skipped.
Run a workflow that calls:
node -e "console.log(JSON.stringify({ok:true, metric:123}))"
Expected:
- `shell(..., { json: true })` returns parsed JSON.
Recommended implementation sequence for the coding agent:
- Create folder structure.
- Write `SKILL.md`.
- Write reference docs as concise markdown.
- Implement `workflow-runtime.mjs` with file helpers and event logging.
- Implement stable run directories, `resume`, and serialized item-state writes.
- Implement `shell()`.
- Implement `agent()` with prompt artifacts, retries, timeout handling, and JSON extraction.
- Implement `parallel()` with ordered result envelopes.
- Implement `pipeline()` with ordered result envelopes and per-item stage failure handling.
- Implement `globFiles()`.
- Implement `diffText()`.
- Write `proofread-directory.workflow.mjs`.
- Write fake-agent manual tests.
- Write the remaining examples.
- Update docs based on examples.
Keep the MVP boring.
Prefer:
- One readable runtime file.
- Plain JavaScript ESM.
- No dependencies.
- Explicit prompts.
- Explicit contracts.
- Simple JSON files.
- Simple run artifacts.
- Examples that coding agents can copy.
Avoid:
- Premature npm package design.
- Complex plugin architecture.
- Hidden magic.
- Runtime self-modification.
- Hard-coding one agent provider.
- Provider SDKs.
- Complex JSON schema validation.
The purpose of the MVP is to validate the pattern: can a coding agent reliably create useful workflow scripts for large tasks?
After the MVP works on several real tasks, consider extracting the runtime into an npm package.
Potential future package structure:
@dynamic-workflow/core
@dynamic-workflow/cli
@dynamic-workflow/agents
Future features:
- Real CLI: `init`, `run`, `resume`, `report`.
- Explicit run IDs and run history management.
- Better schema validation with Zod or Ajv.
- Git worktree support.
- File locks.
- Resource locks.
- Cost/token tracking.
- HTML report.
- Human approval gates.
- Strategy module hot-swapping.
- Stronger sandboxing.
- TUI/monitoring UI.
Do not implement these in MVP unless absolutely necessary.