Dynamic Workflow Skill MVP Tech Spec

1. Purpose

Build an MVP of a dynamic-workflow skill that teaches a local coding agent to create and run code-based orchestration workflows for large, repetitive, long-running, parallelizable, or verification-heavy tasks.

The MVP is not a standalone product and not a full CLI framework. It is a skill folder containing:

SKILL.md: instructions that tell a coding agent when and how to use the dynamic workflow pattern.
A minimal JavaScript runtime template: workflow-runtime.mjs.
A small set of reference docs.
A few sample workflows that a coding agent can copy and adapt.

The coding agent should be able to read this skill, copy the runtime into the target repo, write a task-specific workflow script, run it, and summarize the results from the generated artifacts.

2. Core Idea

The dynamic workflow pattern separates deterministic orchestration from fuzzy agent work.

Deterministic workflow code handles

File enumeration
Batching
Queueing
Parallelism
State persistence
Resume behavior
Shell command execution
Serialized state writes
Diff generation
Writing results
Test or benchmark execution
Acceptance checks based on objective metrics
Report generation

Agents handle

Local reasoning
Text editing
Code editing
Review
Hypothesis generation
Summarization
Strategy suggestions

Key principle

The workflow owns the loop. Agents own local reasoning or local edits. The current chat session should not hold all intermediate state.

3. MVP Goals

The MVP should allow a coding agent to perform workflows like:

Proofread hundreds of Markdown/text files.
Review many files in a codebase and produce structured findings.
Run repeated code/test/build/benchmark optimization loops.
Run simple auto-research-style loops in which an agent proposes a change, deterministic commands evaluate it, and the workflow accepts or rejects the result.

The MVP must support:

A skill folder layout.
A single-file Node.js runtime with no required npm dependencies.
Task-specific workflow scripts written as .mjs files.
Configurable CLI agents.
Agent prompt input via stdin or prompt file.
Agent output as text or JSON.
Basic JSON extraction/parsing.
Lightweight schema validation for machine-consumed agent JSON.
Parallel execution with bounded concurrency.
Pipeline execution over many items.
Optional per-item state persistence for workflows that need item-level resume.
Resume by skipping completed items when the workflow uses item state.
Shell command execution with timeout.
Run artifact directory with logs, prompts, outputs, state, and report.
Example workflows.

4. Non-goals for MVP

Do not implement these in the MVP:

A polished standalone CLI.
A web UI or TUI.
Distributed workers.
Durable database beyond JSON/JSONL files.
Full JSON Schema validation library dependency; the runtime should keep a small dependency-free schema subset.
Complex sandboxing.
Full git worktree management.
Automated dependency installation.
Full self-modifying workflow runtime.
Cross-machine execution.
Token/cost accounting.
Model provider SDK integration.

The MVP should remain simple enough that a coding agent can inspect and modify it easily.

5. Expected Skill Folder Structure

Create the following folder:

skills/dynamic-workflow/
  SKILL.md
  references/
    concepts.md
    when-to-use.md
    runtime-contract.md
    agent-contract.md
    patterns.md
    safety-and-isolation.md
  runtime/
    workflow-runtime.mjs
  examples/
    proofread-directory.workflow.mjs
    review-codebase.workflow.mjs
    benchmark-optimize.workflow.mjs
    auto-research-simple.workflow.mjs

The exact parent folder may depend on the local coding agent's skill system. The implementation should keep the internal structure above.

This skills/dynamic-workflow/ folder is the reusable skill installation. It is separate from .dynamic-workflows/, which is created inside each target repo when the skill is used.

6. Target Project Runtime Layout

When a coding agent uses this skill in a target repo, it should create:

.dynamic-workflows/
  agents.json
  runtime/
    workflow-runtime.mjs
  workflows/
    <task-name>.workflow.mjs
  runs/
    <workflow-name>/
      run.json
      events.jsonl
      items.json         # optional item checkpoint
      report.md
      prompts/
      outputs/
      errors/
      shell/
      diffs/
      artifacts/

For the earliest MVP, the default run directory should be stable per workflow, not timestamped per invocation. Reusing one directory makes resume behavior obvious and avoids requiring a separate run registry.

<workflow-name> means safeName(name), not the raw workflow name.

items.json is a standard optional checkpoint file. It is needed for workflows that want item-level resume, such as "process these 400 files and skip the 217 already completed files." Workflows that only run a single command, produce one report, or manage custom state do not need to use item state. The runtime may create an empty items.json for consistency, but workflow correctness should not depend on every workflow using it.

`agents.json`

The target repo should contain a configurable agent adapter file:

{
  "agents": {
    "editor": { "preset": "claude" },
    "reviewer": { "preset": "codex" },
    "coder": { "preset": "pi", "timeoutMs": 1800000 }
  }
}

The preset values are examples. The skill must explain that the user or coding agent should adapt them to the locally available agent CLI.

Supported built-in presets:

claude: claude -p; structured calls use --output-format json, schema calls use --json-schema.
codex: codex exec --ephemeral --skip-git-repo-check -s read-only -; structured calls add --json.
pi: pi -p; schema calls rely on prompt contract plus runtime validation.

Preset defaults can be overridden per agent by setting normal adapter fields. A string value is shorthand: "editor": "codex" means "editor": { "preset": "codex" }.
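
The shorthand and override rules above can be sketched as a small resolver. This is a minimal illustration, not the runtime's actual code: resolveAgentConfig and the PRESETS table are hypothetical names, and the input/output defaults attached to each preset are assumptions (only the command strings come from the preset list above).

```javascript
// Hypothetical preset table. The command strings come from the preset list
// above; the input/output defaults are assumptions made for this sketch.
const PRESETS = {
  claude: { command: "claude -p", input: "stdin", output: "text" },
  codex: {
    command: "codex exec --ephemeral --skip-git-repo-check -s read-only -",
    input: "stdin",
    output: "text"
  },
  pi: { command: "pi -p", input: "stdin", output: "text" }
};

// Expand the string shorthand, then merge adapter fields over preset defaults.
function resolveAgentConfig(agentName, rawAdapter) {
  if (rawAdapter === undefined) {
    throw new Error(`Agent "${agentName}" is not defined in agents.json`);
  }
  const adapter =
    typeof rawAdapter === "string" ? { preset: rawAdapter } : { ...rawAdapter };
  let defaults = {};
  if (adapter.preset !== undefined) {
    if (!Object.hasOwn(PRESETS, adapter.preset)) {
      throw new Error(`Agent "${agentName}" uses unknown preset "${adapter.preset}"`);
    }
    defaults = PRESETS[adapter.preset];
  }
  // Adapter fields win over preset defaults.
  return { ...defaults, ...adapter };
}
```

With this shape, resolveAgentConfig("editor", "codex") and resolveAgentConfig("editor", { preset: "codex" }) return the same object, and a field such as timeoutMs set on the adapter replaces any preset default.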

Supported input values:

stdin
file

Supported output values:

text
json
codex-json
claude-json

Optional adapter fields, in addition to command, input, and output:

preset: built-in adapter preset: claude, codex, or pi.
jsonCommand: optional shell command to use for structured output calls. Use this for local agent CLI JSON flags such as --json or --output-format json.
schemaCommand: optional shell command to use when agent(..., { schema }) is used. Use {schema} to inject shell-quoted schema JSON.
timeoutMs: default timeout for this agent.
inheritEnv: whether the subprocess inherits the current environment. Default true for local CLI compatibility.
env: extra environment variables to add or override.

Use output: "codex-json" for codex exec --json, which emits JSONL events. The runtime should extract the last completed agent_message text before applying normal JSON parsing and schema validation.

Use output: "claude-json" for claude -p --output-format json. The runtime should return structured_output when present, otherwise result.
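
A minimal sketch of the claude-json unwrapping rule follows. The helper name unwrapClaudeEnvelope is hypothetical, and the sketch assumes only the two envelope fields named above (structured_output and result); any other fields in the envelope are ignored.

```javascript
// Sketch: unwrap the JSON envelope printed by `claude -p --output-format json`.
// Only the structured_output and result fields named in this spec are assumed.
function unwrapClaudeEnvelope(stdout) {
  let envelope;
  try {
    envelope = JSON.parse(stdout);
  } catch (err) {
    throw new Error(`claude-json: stdout is not valid JSON (${err.message})`);
  }
  if (envelope === null || typeof envelope !== "object" || Array.isArray(envelope)) {
    throw new Error("claude-json: expected a JSON object envelope");
  }
  // Prefer structured_output when present; otherwise fall back to result.
  if (envelope.structured_output !== undefined && envelope.structured_output !== null) {
    return envelope.structured_output;
  }
  if (envelope.result === undefined) {
    throw new Error("claude-json: envelope has neither structured_output nor result");
  }
  return envelope.result;
}
```

The returned value then goes through the normal JSON parsing and schema validation steps when the call is structured.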

The skill must warn that inherited environment variables can expose secrets to agent subprocesses.

7. Main User Flow

When the user asks for a large task, the coding agent should:

Detect that the task is suitable for dynamic workflow.
Create .dynamic-workflows/ in the target repo if missing.
Copy runtime/workflow-runtime.mjs from the skill into .dynamic-workflows/runtime/.
Create .dynamic-workflows/agents.json if missing.
Create a task-specific workflow under .dynamic-workflows/workflows/.
Run the workflow using Node.js:

node .dynamic-workflows/workflows/<task-name>.workflow.mjs

Inspect .dynamic-workflows/runs/<task-name>/report.md and other artifacts.
Summarize what happened to the user.

The coding agent should not try to process hundreds of files directly inside the chat session.

8. `SKILL.md` Requirements

SKILL.md must be short, directive, and optimized for coding-agent behavior.

It should include:

A description header.
Trigger conditions.
Required behavior.
The key workflow principle.
The default implementation sequence.
References to detailed docs.
Warnings about safety and state.

Draft `SKILL.md`

---
name: dynamic-workflow
description: Use when a task is too large, repetitive, parallelizable, long-running, or verification-heavy to complete directly in the current chat. Create code-based workflows that call CLI agents, keep state on disk, and resume safely.
---

# Dynamic Workflow Skill

Use this skill when the user's task has one or more of these properties:

- Many independent items: files, tests, documents, issues, URLs, candidates, modules, examples.
- Long-running loops: optimize until a metric improves, run repeated experiments, repeatedly test and fix.
- Parallelizable subtasks: one agent per file, module, candidate, or review dimension.
- Need for verification: writer agent followed by reviewer agent, code change followed by tests, benchmark result followed by acceptance check.
- Context too large for one conversation.
- The user explicitly asks for workflow, orchestration, batching, fan-out, auto research, or dynamic workflow.

Do not try to complete such tasks directly in the current conversation. Instead, create a workflow script.

## Required behavior

1. Create `.dynamic-workflows/` in the current repo if it does not exist.
2. Copy or create the minimal runtime at `.dynamic-workflows/runtime/workflow-runtime.mjs`.
3. Create `.dynamic-workflows/agents.json` if missing.
4. Write a task-specific workflow under `.dynamic-workflows/workflows/`.
5. Use deterministic code for enumeration, batching, state, retries, writing files, shell commands, and report generation.
6. Use agents only for fuzzy judgment, editing, summarization, code changes, hypothesis generation, or review.
7. All agent calls must have explicit input and output contracts.
8. Prefer JSON output from agents when possible.
9. For item-based workflows, persist item status to disk so the workflow can resume.
10. At the end, summarize from workflow artifacts, not from memory.

## Key principle

The workflow owns the loop. Agents own local reasoning or local edits. The current chat session only creates, starts, monitors, and summarizes the workflow.

## Before writing a workflow

Read:

- `references/runtime-contract.md`
- `references/agent-contract.md`
- `references/patterns.md`
- `references/safety-and-isolation.md`

Use examples as templates.

9. Runtime Public API

The MVP runtime should export these functions:

createWorkflow(options)
agent(agentName, options)
shell(command, options)
parallel(tasks, options)
pipeline(items, stages, options)
globFiles(options)
readText(filePath)
writeText(filePath, content)
appendText(filePath, content)
diffText(before, after)
safeName(name)

Optional but useful:

readJson(filePath, fallback)
writeJson(filePath, value)
appendJsonl(filePath, value)
fileExists(filePath)
ensureDir(dirPath)

10. `createWorkflow(options)`

Signature

const wf = createWorkflow({
  name,
  runDir,
  concurrency,
  resume
});

Options

{
  name: string,
  runDir?: string,
  concurrency?: number,
  resume?: boolean
}

Defaults:

concurrency = 4
resume = true
runDir = `.dynamic-workflows/runs/${safeName(name)}`

MVP run directories are stable per workflow by default. Re-running the same workflow reuses the same artifact directory and can resume from prior item state. A workflow can still pass a custom runDir if it wants isolated artifacts for a one-off run.

resume controls item-state loading:

resume: true: load existing items.json if present.
resume: false: start with empty item state for this invocation and overwrite items.json on the first item-state write.

The runtime should not delete the run directory automatically. If a completely clean artifact directory is needed, the user or top-level coding agent can remove it explicitly or provide a different runDir.

Returned object

{
  name,
  runDir,
  concurrency,
  state,
  run(fn),
  event(type, data),
  getItemState(key),
  setItemState(key, value),
  isItemDone(key),
  markItemDone(key, value),
  markItemFailed(key, errorOrValue),
  writeReport(markdown)
}

state is the in-memory item checkpoint object with the same shape as items.json, usually { items: {} }. It is not intended to be a general workflow database. Workflows that need custom state should write explicit artifacts under artifacts/ or their own files.

Behavior

wf.run(fn) should:

Create the run directory.
Create subdirectories:
- prompts/
- outputs/
- errors/
- shell/
- diffs/
- artifacts/
Load existing item state if resume is true and items.json exists.
Otherwise initialize empty in-memory item state.
Write or overwrite run.json for the latest invocation.
Append run_started to events.jsonl.
Execute fn.
On success, append run_completed.
On failure, append run_failed and rethrow.

11. State Model

For item-based workflows, use a simple JSON checkpoint file:

.dynamic-workflows/runs/<workflow-name>/items.json

items.json is not required for every workflow. It is the runtime's standard per-item progress file for workflows that have stable item keys and need resumable processing. Examples:

Proofreading many files should use item state.
Reviewing many source files should use item state.
A single benchmark command may not need item state.
An experiment loop may use custom artifacts instead of item state, unless each experiment has a stable key.

Shape:

{
  "items": {
    "docs/file1.md": {
      "status": "done",
      "updatedAt": "2026-05-30T12:00:00.000Z",
      "result": {
        "status": "updated",
        "summary": "Fixed spelling and punctuation."
      }
    }
  }
}
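
One way to keep this checkpoint consistent under concurrency is to chain every write onto the previous one. The sketch below is an illustration under assumptions: createItemState is a hypothetical factory, persist is a hypothetical async callback (for example, one that calls writeJson on items.json), and the shape stored for failed items is a guess, since the spec only fixes the shape for done items.

```javascript
// Sketch of in-memory item state with serialized persistence.
// `persist` is a hypothetical async callback, e.g. (state) => writeJson(path, state).
function createItemState(persist, initial = { items: {} }) {
  const state = initial;
  let queue = Promise.resolve(); // tail of the write chain

  function save() {
    // Snapshot synchronously so each write reflects the state at request time.
    const snapshot = JSON.stringify(state);
    const next = queue.then(() => persist(JSON.parse(snapshot)));
    queue = next.catch(() => {}); // a failed write must not block later writes
    return next;
  }

  function setItem(key, status, result) {
    state.items[key] = { status, updatedAt: new Date().toISOString(), result };
    return save();
  }

  return {
    state,
    getItemState: (key) => state.items[key],
    isItemDone: (key) => state.items[key]?.status === "done",
    markItemDone: (key, result) => setItem(key, "done", result),
    // Assumed shape for failures: the error message under result.error.
    markItemFailed: (key, error) =>
      setItem(key, "failed", { error: String(error?.message ?? error) })
  };
}
```

Because only one persist call is in flight at a time, two pipeline items finishing together cannot interleave partial writes of items.json.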

12. Event Log

Every important operation should append a JSON line to:

events.jsonl

Example events:

{"type":"run_started","at":"2026-05-30T12:00:00.000Z","name":"proofread-directory"}
{"type":"agent_started","at":"...","agent":"editor","label":"edit:docs/a.md"}
{"type":"agent_completed","at":"...","agent":"editor","label":"edit:docs/a.md","durationMs":12345}
{"type":"item_done","at":"...","key":"docs/a.md","result":{"status":"updated"}}
{"type":"run_completed","at":"..."}

The event log is append-only and should be useful for debugging.

Because the default run directory is reused, events.jsonl may contain events from multiple invocations of the same workflow. Each invocation should append its own run_started and terminal run_completed or run_failed event.

13. `agent(agentName, options)`

Signature

const result = await agent("editor", {
  label,
  prompt,
  cwd,
  schema,
  timeoutMs,
  retries
});

Options

{
  label: string,
  prompt: string,
  cwd?: string,
  schema?: object,
  timeoutMs?: number,
  retries?: number
}

Behavior

Load .dynamic-workflows/agents.json.
Find the named agent.
If schema is provided, append the schema contract to the prompt and use structured output mode.
Write the prompt to prompts/<safe-label>.md.
Resolve any built-in preset, then choose the command to run: schemaCommand when schema is provided and the adapter defines one; otherwise jsonCommand when structured output mode is active and the adapter defines one; otherwise command.
If input is stdin, pass prompt to stdin.
If input is file, create a prompt file and replace {promptFile} in the command string. If the placeholder is missing, throw a configuration error.
Capture stdout and stderr.
Write stdout to outputs/<safe-label>.stdout.txt.
Write stderr to errors/<safe-label>.stderr.txt.
If output mode is codex-json, extract the last completed Codex agent_message text from the JSONL event stream.
If output mode is claude-json, parse Claude Code's JSON envelope and return structured_output when present, otherwise result.
If output mode is json or schema is provided, parse JSON from stdout or from the extracted agent message text.
If strict JSON parsing fails, attempt to extract the first JSON object or array from the same text (stdout or the extracted agent message), choosing whichever valid JSON region appears first by position.
If schema is provided, validate the parsed JSON against the runtime's lightweight schema subset.
Return parsed JSON or raw text.

Supported schema keywords are type, required, properties, items, enum, additionalProperties, nullable, minItems, maxItems, minLength, and maxLength. This validates shape only. Workflow code must still compute deterministic facts such as actual changed files, diffs, command exit codes, and parsed metrics.
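
A validator for this keyword subset can stay small. The following is a sketch under assumptions rather than the definitive implementation: validateSchema is a hypothetical name, nullable is treated as "null is allowed in addition to type", "integer" is accepted as a type, and enum comparison is limited to primitive values.

```javascript
// Returns an array of error messages; an empty array means the value matches.
function validateSchema(value, schema, path = "$") {
  if (value === null) {
    const nullOk =
      schema.nullable === true || schema.type === "null" || schema.type === undefined;
    return nullOk ? [] : [`${path}: null is not allowed`];
  }

  const actual = Array.isArray(value) ? "array" : typeof value;
  if (schema.type !== undefined) {
    const typeOk =
      schema.type === "integer" ? Number.isInteger(value) : schema.type === actual;
    if (!typeOk) return [`${path}: expected ${schema.type}, got ${actual}`];
  }

  const errors = [];
  // Assumption: enum options are primitives, so strict equality is enough.
  if (schema.enum && !schema.enum.some((option) => option === value)) {
    errors.push(`${path}: value is not one of the enum options`);
  }
  if (actual === "string") {
    if (schema.minLength !== undefined && value.length < schema.minLength) {
      errors.push(`${path}: shorter than minLength ${schema.minLength}`);
    }
    if (schema.maxLength !== undefined && value.length > schema.maxLength) {
      errors.push(`${path}: longer than maxLength ${schema.maxLength}`);
    }
  }
  if (actual === "array") {
    if (schema.minItems !== undefined && value.length < schema.minItems) {
      errors.push(`${path}: fewer than minItems ${schema.minItems}`);
    }
    if (schema.maxItems !== undefined && value.length > schema.maxItems) {
      errors.push(`${path}: more than maxItems ${schema.maxItems}`);
    }
    if (schema.items) {
      value.forEach((element, i) => {
        errors.push(...validateSchema(element, schema.items, `${path}[${i}]`));
      });
    }
  }
  if (actual === "object") {
    const properties = schema.properties ?? {};
    for (const key of schema.required ?? []) {
      if (!Object.hasOwn(value, key)) {
        errors.push(`${path}.${key}: required property is missing`);
      }
    }
    for (const [key, child] of Object.entries(value)) {
      if (Object.hasOwn(properties, key)) {
        errors.push(...validateSchema(child, properties[key], `${path}.${key}`));
      } else if (schema.additionalProperties === false) {
        errors.push(`${path}.${key}: unexpected property`);
      }
    }
  }
  return errors;
}
```

Returning a list of path-prefixed messages, instead of throwing on the first mismatch, makes the failure easy to log in events.jsonl and to feed into a retry.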

Command parsing

For MVP, support command as a string and run it through the shell:

spawn(command, { shell: true })

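This is less safe than argv-array execution but simpler: the whole string is interpreted by the shell, so any value interpolated into it must be shell-quoted. Document the risk.

A future npm package could support argv arrays.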

Agent subprocesses inherit the current environment by default unless their adapter sets inheritEnv: false. Adapter env values should be merged into the spawned process environment.

For input: "file", the runtime should replace every {promptFile} occurrence with a shell-quoted absolute prompt-file path. The command string should contain the unquoted placeholder, for example agent-cli --prompt {promptFile}.
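
The placeholder rule can be sketched as follows. The names ConfigError, shellQuote, and injectPromptFile are hypothetical, and the quoting shown is POSIX sh single-quote quoting; a Windows cmd.exe shell would need a different strategy, which this sketch does not attempt.

```javascript
// Hypothetical error type for non-retryable configuration problems.
class ConfigError extends Error {}

// POSIX sh quoting: wrap in single quotes and escape embedded single quotes.
function shellQuote(value) {
  return `'${String(value).replace(/'/g, `'\\''`)}'`;
}

// Replace every {promptFile} occurrence with the shell-quoted path.
// The caller is expected to pass an absolute path.
function injectPromptFile(command, promptFilePath) {
  if (!command.includes("{promptFile}")) {
    throw new ConfigError(
      'input is "file" but the command has no {promptFile} placeholder'
    );
  }
  // A replacer function avoids special handling of "$" in the replacement.
  return command.replaceAll("{promptFile}", () => shellQuote(promptFilePath));
}
```

For example, injectPromptFile("agent-cli --prompt {promptFile}", "/tmp/run/prompts/edit a.md") produces agent-cli --prompt '/tmp/run/prompts/edit a.md', so a path containing spaces still reaches the CLI as one argument.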

Retries

retries defaults to 0 and means additional attempts after the first attempt. For example, retries: 2 allows up to 3 attempts total.

Retry only transient execution failures:

Non-zero exit code.
Timeout.
JSON parse or extraction failure when adapter output is json, codex-json, claude-json, or schema is provided.
Schema validation failure when agent(..., { schema }) is used.

Do not retry configuration errors such as a missing agent name, unsupported adapter option, or missing {promptFile} placeholder for input: "file".

Between retries, wait with simple linear backoff such as 500ms * attemptNumber. If more than one attempt occurs, prompt/stdout/stderr artifacts should include the attempt number so failed attempts are auditable.
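
The retry rules above can be sketched as a small wrapper. This is an illustration, not the runtime's code: withRetries, ConfigError, and the baseDelayMs option are hypothetical names, and the attempt callback is assumed to throw on any retryable failure (non-zero exit, timeout, parse failure, or schema failure).

```javascript
// Hypothetical error type for failures that must never be retried.
class ConfigError extends Error {}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs attemptFn(attemptNumber) up to 1 + retries times.
// attemptNumber starts at 1 so artifacts can be named per attempt.
async function withRetries(attemptFn, { retries = 0, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    try {
      return await attemptFn(attempt);
    } catch (err) {
      if (err instanceof ConfigError) throw err; // configuration errors fail fast
      lastError = err;
      if (attempt <= retries) {
        await sleep(baseDelayMs * attempt); // linear backoff: 500ms, 1000ms, ...
      }
    }
  }
  throw lastError;
}
```

Passing the attempt number into the callback lets the caller write artifacts such as a per-attempt stdout file, which keeps failed attempts auditable.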

Timeout

If the timeout expires:

Kill the subprocess.
Write an error event.
Throw an error.

14. Agent Output Contract

Every agent prompt should include:

Local objective.
Input data.
Allowed actions.
Forbidden actions.
Required output format.
Semantic payload fields expected by the workflow.
Acceptance criteria.

Example:

You are proofreading one Markdown file.

Rules:
- Fix spelling and grammar errors.
- Preserve meaning.
- Preserve Markdown structure.
- Do not modify code blocks.
- Do not rewrite style unnecessarily.
- Return the corrected full file text and a short summary.

The workflow should pass the machine-readable shape through agent(..., { schema }), for example:

const edit = await agent("editor", {
  label: "edit:docs/a.md",
  prompt,
  schema: {
    type: "object",
    required: ["correctedText", "summary"],
    properties: {
      correctedText: { type: "string" },
      summary: { type: "string" }
    },
    additionalProperties: false
  },
  retries: 1
});

Schema validation confirms output shape, not truth. The workflow still computes deterministic facts such as whether text changed, actual changed files, command exit codes, diffs, and metrics.

15. `shell(command, options)`

Signature

const result = await shell("npm test -- --json", {
  cwd,
  timeoutMs,
  label,
  json
});

Options

{
  cwd?: string,
  timeoutMs?: number,
  label?: string,
  json?: boolean,
  env?: object
}

Return value

{
  ok: boolean,
  exitCode: number,
  stdout: string,
  stderr: string,
  json?: any,
  durationMs: number
}

If json: true, parse stdout as JSON, or extract JSON from stdout using the same loose JSON parser as agent(), and return the parsed value in the json field.

Behavior

Write stdout/stderr to shell/<safe-label>.stdout.txt and shell/<safe-label>.stderr.txt when a workflow context exists.
Append shell events to events.jsonl.
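
The subprocess core that shell() and agent() could share is sketched below, using child_process.spawn with shell: true as described in this spec. The helper name runCommand and its stdin option are assumptions, and artifact writing and event logging are left out. One known limitation: with shell: true the kill signal reaches the shell process, and a grandchild process may outlive it.

```javascript
import { spawn } from "node:child_process";

// Run a command string through the shell, capture output, enforce a timeout.
function runCommand(command, { cwd, env, timeoutMs = 0, stdin = "" } = {}) {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now();
    const child = spawn(command, {
      shell: true,
      cwd,
      env: { ...process.env, ...env } // inherit, then add or override
    });
    let stdout = "";
    let stderr = "";
    let timer = null;

    if (timeoutMs > 0) {
      timer = setTimeout(() => {
        child.kill("SIGKILL");
        reject(new Error(`Timed out after ${timeoutMs}ms: ${command}`));
      }, timeoutMs);
    }

    child.stdout.on("data", (chunk) => { stdout += chunk; });
    child.stderr.on("data", (chunk) => { stderr += chunk; });
    child.stdin.on("error", () => {}); // the child may exit before reading stdin
    child.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.on("close", (code) => {
      clearTimeout(timer);
      // After a timeout the promise is already rejected; this resolve is a no-op.
      resolve({
        ok: code === 0,
        exitCode: code,
        stdout,
        stderr,
        durationMs: Date.now() - startedAt
      });
    });
    child.stdin.end(stdin);
  });
}
```

A non-zero exit resolves with ok: false instead of rejecting, so workflow code can inspect stdout and stderr before deciding whether the failure matters; only a timeout or a spawn error rejects.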

16. `parallel(tasks, options)`

Signature

const results = await parallel(
  tasks,
  { concurrency: 4, stopOnError: false }
);

Input

tasks is an array of async functions.

Options

{
  concurrency?: number,
  stopOnError?: boolean
}

Behavior

Execute up to concurrency tasks at a time.
Preserve result order by input task index.
Always return result envelopes when stopOnError is false:

{
  ok: true,
  index: 0,
  value: "..."
}

{
  ok: false,
  index: 0,
  error: "..."
}

If stopOnError is true, throw on the first observed task failure. Already-running tasks are not forcibly cancelled, but no ordered result array is returned.
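
A bounded-concurrency runner matching this contract can be sketched as below. It is one possible implementation under the stated behavior: errors are stored as message strings in the envelope, and with stopOnError the returned promise rejects as soon as a failure is observed while in-flight tasks are left to finish on their own.

```javascript
// Sketch of parallel(): run async task functions with a concurrency limit.
function parallel(tasks, { concurrency = 4, stopOnError = false } = {}) {
  return new Promise((resolve, reject) => {
    const results = new Array(tasks.length);
    let next = 0;      // index of the next task to start
    let settled = 0;   // number of tasks that have finished
    let stopped = false;

    if (tasks.length === 0) {
      resolve(results);
      return;
    }

    function launch() {
      if (stopped || next >= tasks.length) return;
      const index = next++;
      Promise.resolve()
        .then(() => tasks[index]())
        .then(
          (value) => {
            results[index] = { ok: true, index, value };
          },
          (err) => {
            if (stopOnError) {
              stopped = true; // start no further tasks
              reject(err);
              return;
            }
            results[index] = { ok: false, index, error: String(err?.message ?? err) };
          }
        )
        .then(() => {
          settled++;
          if (stopped) return;
          if (settled === tasks.length) resolve(results);
          else launch(); // a free slot picks up the next task
        });
    }

    const limit = Math.max(1, Math.min(Math.floor(concurrency), tasks.length));
    for (let i = 0; i < limit; i++) launch();
  });
}
```

Results are written by index, so the returned array follows input order even when later tasks finish first.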

17. `pipeline(items, stages, options)`

Signature

const results = await pipeline(items, [stage1, stage2, stage3], {
  concurrency: 4,
  stopOnError: false
});

Options match parallel():

{
  concurrency?: number,
  stopOnError?: boolean
}

Behavior

For each item:

Run stage 1.
Pass result to stage 2.
Pass result to stage 3.
Return final result.

Across items, run pipelines concurrently with bounded concurrency.

pipeline() returns result envelopes in input item order when stopOnError is false:

{
  ok: true,
  index: 0,
  item,
  value
}

{
  ok: false,
  index: 0,
  item,
  stage: 1,
  error: "..."
}

If any stage throws for an item, later stages for that item do not run. stage is the zero-based index of the stage that failed, so stage: 1 in the envelope above means the second stage failed. If stopOnError is true, throw on the first observed item failure. Already-running item pipelines are not forcibly cancelled.

This is better than barrier-style parallel stages for large item sets, because a slow item in one stage does not hold back other items from moving on to later stages.
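
The per-item pipelining described in this section can be sketched as follows. The stage signature is an assumption: each stage is called as stage(value, { item, index }), where value is the item itself for the first stage and the previous stage's result afterwards.

```javascript
// Sketch of pipeline(): each item flows through all stages independently.
function pipeline(items, stages, { concurrency = 4, stopOnError = false } = {}) {
  // Runs every stage for one item and always resolves to an envelope.
  async function runItem(item, index) {
    let value = item;
    for (let stage = 0; stage < stages.length; stage++) {
      try {
        value = await stages[stage](value, { item, index });
      } catch (err) {
        return { ok: false, index, item, stage, error: String(err?.message ?? err) };
      }
    }
    return { ok: true, index, item, value };
  }

  return new Promise((resolve, reject) => {
    const results = new Array(items.length);
    let next = 0;
    let finished = 0;
    let stopped = false;

    if (items.length === 0) {
      resolve(results);
      return;
    }

    function launch() {
      if (stopped || next >= items.length) return;
      const index = next++;
      runItem(items[index], index).then((envelope) => {
        if (stopped) return;
        if (!envelope.ok && stopOnError) {
          stopped = true; // in-flight items keep running but are ignored
          reject(new Error(
            `Item ${index} failed at stage ${envelope.stage}: ${envelope.error}`
          ));
          return;
        }
        results[index] = envelope;
        finished++;
        if (finished === items.length) resolve(results);
        else launch();
      });
    }

    const limit = Math.max(1, Math.min(Math.floor(concurrency), items.length));
    for (let i = 0; i < limit; i++) launch();
  });
}
```

There is no barrier between stages: an item enters its second stage as soon as its own first stage finishes, regardless of how far other items have progressed.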

18. `globFiles(options)`

No npm dependency in MVP.

Implement a simple recursive file finder that supports:

Root directory.
File extensions.
Ignore directories.

Signature

const files = await globFiles({
  root: "docs",
  extensions: [".md", ".txt"],
  ignoreDirs: ["node_modules", ".git", ".dynamic-workflows"]
});

Do not implement full glob syntax in the MVP unless it can be done in a few lines without adding a dependency.
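
A dependency-free finder with these three options might look like the sketch below. Matching extensions case-insensitively and returning sorted paths are assumptions added here; sorted output keeps item keys stable between runs, which helps resume.

```javascript
import { readdir } from "node:fs/promises";
import path from "node:path";

// Recursively collect files under `root` whose extension is in `extensions`,
// skipping any directory whose name appears in `ignoreDirs`.
async function globFiles({ root = ".", extensions = [], ignoreDirs = [] } = {}) {
  const ignored = new Set(ignoreDirs);
  const wanted = new Set(extensions.map((ext) => ext.toLowerCase()));
  const found = [];

  async function walk(dir) {
    const entries = await readdir(dir, { withFileTypes: true });
    for (const entry of entries) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        if (!ignored.has(entry.name)) await walk(fullPath);
      } else if (entry.isFile()) {
        const ext = path.extname(entry.name).toLowerCase();
        // An empty extension list means "match every file".
        if (wanted.size === 0 || wanted.has(ext)) found.push(fullPath);
      }
    }
  }

  await walk(root);
  return found.sort();
}
```

Symbolic links are neither followed nor returned in this sketch, because entry.isDirectory() and entry.isFile() are both false for them; that avoids cycles at the cost of skipping linked files.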

19. File Helpers

Required helpers:

readText(filePath)
writeText(filePath, content)
appendText(filePath, content)
readJson(filePath, fallback)
writeJson(filePath, value)
appendJsonl(filePath, value)
ensureDir(dirPath)
fileExists(filePath)
safeName(name)

safeName should convert labels into filesystem-safe names.
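
The spec does not fix the exact mapping, so the following is only one plausible policy: keep letters, digits, dots, underscores, and hyphens; replace every other run of characters with an underscore; strip leading and trailing dots and underscores; and cap the length. The maxLength parameter and the "unnamed" fallback are assumptions.

```javascript
// Sketch of safeName(): map an arbitrary label to a filesystem-safe name.
function safeName(name, maxLength = 120) {
  const cleaned = String(name)
    .replace(/[^A-Za-z0-9._-]+/g, "_") // path separators, colons, spaces, ...
    .replace(/^[._]+|[._]+$/g, "")     // no hidden files or "../" remnants
    .slice(0, maxLength);
  return cleaned || "unnamed";
}
```

Under this policy "edit:docs/a.md" becomes "edit_docs_a.md". Distinct labels can collide (for example "docs/a.md" and "docs_a.md"), so a runtime that needs uniqueness could append a short hash of the original label.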

20. `diffText(before, after)`

MVP implementation can be simple:

Write both versions to artifacts if needed.
Return a rough line-level diff.

No dependency required.

Simple output is acceptable:

- old line
+ new line

This is mainly for reviewer prompts and reports.

diffText() output is display-oriented, not a machine-parseable patch contract. Workflows that need to apply changes should use structured JSON, explicit file writes, or a real patch artifact generated by the workflow.
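
A rough line-level diff needs no dependency. The sketch below uses a longest-common-subsequence table, which is one reasonable choice and not something the spec mandates; unchanged lines are prefixed with two spaces, which is an assumption about the display format.

```javascript
// Sketch of diffText(): display-oriented line diff based on an LCS table.
function diffText(before, after) {
  const a = before.split("\n");
  const b = after.split("\n");

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table, emitting removals before additions at each change.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(`  ${a[i]}`);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push(`- ${a[i++]}`);
    } else {
      out.push(`+ ${b[j++]}`);
    }
  }
  while (i < a.length) out.push(`- ${a[i++]}`);
  while (j < b.length) out.push(`+ ${b[j++]}`);
  return out.join("\n");
}
```

The table uses memory proportional to the product of the two line counts, which is fine for typical documents but worth guarding for very large files.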

21. Example Workflow 1: Proofread Directory

File:

examples/proofread-directory.workflow.mjs

Purpose:

Enumerate Markdown/text files.
For each file, call an editor agent.
Call reviewer agent on proposed correction.
Write accepted changes.
Persist item state.
Generate report.

Expected workflow behavior:

Find files under docs/ with .md and .txt extensions.
Skip files already marked done.
Read original file.
Send original content to editor agent.
Editor returns:

{
  "changed": true,
  "correctedText": "...",
  "summary": "..."
}

If changed is false, or correctedText is identical to the original, mark item done as unchanged.
If changed, generate diff.
Send original, corrected text, and diff to reviewer agent.
Reviewer returns:

{
  "accept": true,
  "reason": "...",
  "finalText": "..."
}

If accepted, write finalText to file.
If rejected, do not write.
Mark item done with result.
Write report with counts:

total files
updated
unchanged
rejected
failed
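
The per-file decision steps above can be sketched in isolation from file I/O. In this illustration, editor and reviewer are hypothetical async callbacks that would wrap agent() calls, and diff stands in for diffText(); reading the file, writing finalText, and marking item state remain the workflow's job.

```javascript
// Sketch of the per-file decision logic for the proofread example.
async function proofreadOne({ original, editor, reviewer, diff }) {
  const edit = await editor(original);

  // Deterministic check: do not rely on the agent's flag alone.
  if (!edit.changed || edit.correctedText === original) {
    return { status: "unchanged", summary: edit.summary };
  }

  const review = await reviewer({
    original,
    correctedText: edit.correctedText,
    diff: diff(original, edit.correctedText)
  });
  if (!review.accept) {
    return { status: "rejected", reason: review.reason };
  }
  // The caller writes finalText to disk; this function has no side effects.
  return { status: "updated", summary: edit.summary, finalText: review.finalText };
}

// Tally per-file outcomes for the report counts.
function countByStatus(results) {
  const counts = { total: results.length, updated: 0, unchanged: 0, rejected: 0, failed: 0 };
  for (const { status } of results) {
    if (status !== "total" && Object.hasOwn(counts, status)) counts[status] += 1;
  }
  return counts;
}
```

Keeping the decision pure makes it easy to exercise with fake agents before any real CLI is configured.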

22. Example Workflow 2: Review Codebase

File:

examples/review-codebase.workflow.mjs

Purpose:

Enumerate source files.
Ask reviewer agents to produce structured findings.
Aggregate findings.
Generate report.

The workflow should not modify files.

Agent output:

{
  "findings": [
    {
      "severity": "low|medium|high",
      "file": "src/example.ts",
      "line": 123,
      "title": "...",
      "description": "...",
      "suggestion": "..."
    }
  ]
}

Report should group by severity.

23. Example Workflow 3: Benchmark Optimize

File:

examples/benchmark-optimize.workflow.mjs

Purpose:

Run baseline benchmark command.
Create an isolated sandbox copy of the target code under the run artifact directory.
Ask coder agent to implement one optimization inside the sandbox only.
Run tests in the sandbox.
Run benchmark in the sandbox.
Record the candidate diff and accept/reject result based on tests and benchmark improvement.

MVP behavior:

Do not implement git worktrees.
Do not modify the original repo.
Copy only configured target paths into artifacts/benchmark-sandbox/.
Run the coder agent with cwd set to the sandbox.
Restrict the prompt to explicitly allowed files or directories.
Generate a candidate diff artifact comparing the original target paths to the sandbox.
Write a report with baseline metric, candidate metric, test result, improvement percentage, and accept/reject decision.
Never automatically apply, commit, or revert changes in the original repo.

If the candidate is accepted, the example should still leave application to the human or top-level coding agent. Its job is to produce an auditable candidate patch plus benchmark evidence.
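
The accept/reject decision can be reduced to a small deterministic function. The sketch below assumes a lower-is-better metric such as milliseconds and a hypothetical minImprovementPct threshold; the spec fixes neither, so both should be adapted to the benchmark at hand.

```javascript
// Sketch of the accept/reject rule for one candidate.
// Assumptions: lower metric values are better; minImprovementPct is hypothetical.
function evaluateCandidate({ baseline, candidate, testsPassed, minImprovementPct = 1 }) {
  if (!(baseline > 0) || !Number.isFinite(candidate)) {
    return { accept: false, improvementPct: null, reason: "invalid metric" };
  }
  const improvementPct = ((baseline - candidate) / baseline) * 100;
  if (!testsPassed) {
    return { accept: false, improvementPct, reason: "tests failed" };
  }
  if (improvementPct < minImprovementPct) {
    return { accept: false, improvementPct, reason: "improvement below threshold" };
  }
  return { accept: true, improvementPct, reason: "tests passed and metric improved" };
}
```

For a baseline of 200 and a candidate of 150 with passing tests, the improvement is (200 - 150) / 200 * 100 = 25 percent, so the candidate is accepted under the default threshold. The decision is only recorded in the report; applying the patch stays with the human or top-level coding agent.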

24. Example Workflow 4: Simple Auto Research

File:

examples/auto-research-simple.workflow.mjs

Purpose:

Demonstrate an experiment loop.
Agent proposes a hypothesis.
Agent proposes a target-file change or patch.
Shell runs evaluation command.
Workflow compares metric.
Workflow records accept/reject.

The MVP should avoid complex self-modifying strategy code. It should document the future pattern but not fully implement it. The example should make clear that Auto Research is only one pattern built on Dynamic Workflow, not the central purpose of the skill.

25. Safety Rules

The skill and examples must teach these rules:

Prefer workflows that read input, ask agents for structured output, and let deterministic code write results.
Do not let agents directly modify many files unless necessary.
For text proofreading, have agents return corrected text; workflow writes it.
For code changes, require tests before accepting.
For optimization, require objective metrics before accepting.
Do not expose secrets to agent subprocesses unless explicitly required.
Do not put API keys in prompts.
Be explicit about whether agent subprocesses inherit environment variables.
Do not allow arbitrary destructive shell commands in generated workflows.
For code-edit workflows, prefer patch/edit-plan generation over uncontrolled direct edits.
Keep run artifacts for auditability.
Make workflows resumable when they process many items or long loops.

26. Agent-code Pattern for Future Extension

The MVP should document but not fully implement agentCode().

The intended future pattern:

root workflow: stable, not edited during current run
strategy modules: allowed to be edited by agents after validation
target files: edited as part of experiments

Example future structure:

.dynamic-workflows/
  workflows/
    auto-research.workflow.mjs
  strategies/
    propose.mjs
    review.mjs
  prompts/
    propose.md
    review.md

Rule:

Agents may propose edits to strategy modules for the next generation of the workflow, but the current root workflow should not mutate itself during execution.

The MVP can include this in references/patterns.md as a future pattern.

27. Implementation Requirements for `workflow-runtime.mjs`

The runtime should be a single ESM file.

Use only Node.js built-in modules:

node:fs/promises
node:path
node:child_process
node:os
node:crypto

No external dependencies.

Minimum supported Node.js version: 18.

The runtime should be readable and easy for a coding agent to modify.

28. Suggested Runtime Implementation Outline

// workflow-runtime.mjs

import fs from "node:fs/promises";
import path from "node:path";
import { spawn } from "node:child_process";
import os from "node:os";
import crypto from "node:crypto";

let currentWorkflow = null;

export function createWorkflow(options) { ... }
export async function agent(agentName, options) { ... }
export async function shell(command, options = {}) { ... }
export async function parallel(tasks, options = {}) { ... }
export async function pipeline(items, stages, options = {}) { ... }
export async function globFiles(options) { ... }
export async function readText(filePath) { ... }
export async function writeText(filePath, content) { ... }
export async function appendText(filePath, content) { ... }
export async function readJson(filePath, fallback = null) { ... }
export async function writeJson(filePath, value) { ... }
export async function appendJsonl(filePath, value) { ... }
export async function ensureDir(dirPath) { ... }
export async function fileExists(filePath) { ... }
export function safeName(name) { ... }
export function diffText(before, after) { ... }

Use currentWorkflow so helper functions can write logs into the active run directory.

29. Minimal JSON Extraction

Implement helper:

function parseJsonLoose(text) {
  try {
    return JSON.parse(text);
  } catch {}

  // Try fenced code block first.
  // Then try the first valid {...} or [...] region by position.
  // Throw if still invalid.
}

This does not need to be perfect. It should be good enough for common agent outputs.
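
A fuller version of the helper, following the three steps in the outline above, might look like this sketch. The helper names tryParse and findBalancedEnd are hypothetical, and the fence marker is built with repeat() only so that the example itself contains no literal fence.

```javascript
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(
  FENCE + "(?:json)?[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE,
  "i"
);

function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch {
    return { ok: false };
  }
}

// Returns the index of the bracket closing the one at `start`, or -1.
// Brackets inside JSON strings are ignored.
function findBalancedEnd(text, start) {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{" || ch === "[") depth++;
    else if (ch === "}" || ch === "]") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

function parseJsonLoose(text) {
  // 1. The whole text is JSON.
  const direct = tryParse(text.trim());
  if (direct.ok) return direct.value;

  // 2. JSON inside the first fenced code block.
  const fenced = FENCED_BLOCK.exec(text);
  if (fenced) {
    const inner = tryParse(fenced[1].trim());
    if (inner.ok) return inner.value;
  }

  // 3. First balanced {...} or [...] region, by position, that parses.
  for (let start = 0; start < text.length; start++) {
    if (text[start] !== "{" && text[start] !== "[") continue;
    const end = findBalancedEnd(text, start);
    if (end === -1) continue;
    const candidate = tryParse(text.slice(start, end + 1));
    if (candidate.ok) return candidate.value;
  }
  throw new Error("No valid JSON found in text");
}
```

The scan is quadratic in the worst case, which is acceptable for typical agent outputs; a thrown error here is what the retry logic treats as a JSON extraction failure.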

30. Reporting

Each workflow should produce report.md.

The runtime should provide:

await wf.writeReport(markdown)

Example report sections:

# Proofread Directory Report

- Total files: 312
- Updated: 147
- Unchanged: 151
- Rejected: 9
- Failed: 5

## Updated Files

| File | Summary |
|---|---|
| docs/a.md | Fixed spelling and punctuation. |

## Rejected Files

| File | Reason |
|---|---|
| docs/b.md | Reviewer found meaning changed. |

## Failed Files

| File | Error |
|---|---|
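
A report with this layout can be generated by deterministic code from the collected results. In the sketch below, buildProofreadReport and its input shape (an array of objects with file, status, and optional summary, reason, or error) are hypothetical; the resulting string would be passed to wf.writeReport().

```javascript
// Escape characters that would break a Markdown table cell.
function cell(text) {
  return String(text ?? "").replace(/\|/g, "\\|").replace(/\s*\r?\n\s*/g, " ");
}

function section(title, header, rows) {
  const lines = [`## ${title}`, "", `| File | ${header} |`, "|---|---|"];
  for (const [file, detail] of rows) {
    lines.push(`| ${cell(file)} | ${cell(detail)} |`);
  }
  return lines.join("\n");
}

// `results` is a hypothetical array of { file, status, summary?, reason?, error? }.
function buildProofreadReport(results) {
  const by = (status) => results.filter((r) => r.status === status);
  const updated = by("updated");
  const unchanged = by("unchanged");
  const rejected = by("rejected");
  const failed = by("failed");
  return [
    "# Proofread Directory Report",
    "",
    `- Total files: ${results.length}`,
    `- Updated: ${updated.length}`,
    `- Unchanged: ${unchanged.length}`,
    `- Rejected: ${rejected.length}`,
    `- Failed: ${failed.length}`,
    "",
    section("Updated Files", "Summary", updated.map((r) => [r.file, r.summary])),
    "",
    section("Rejected Files", "Reason", rejected.map((r) => [r.file, r.reason])),
    "",
    section("Failed Files", "Error", failed.map((r) => [r.file, r.error])),
    ""
  ].join("\n");
}
```

Building the report from recorded results, instead of from agent prose, keeps the counts consistent with items.json and events.jsonl.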

31. Reference Docs Content

`references/concepts.md`

Must explain:

Current chat is not the worker.
Workflow script is the manager.
Agents are workers.
State lives on disk.
Results are structured.
Verification is a stage, not a vibe.

`references/when-to-use.md`

Must list trigger conditions:

More than about 20 independent items.
Repeated experiment loops.
Benchmark/test/build optimization.
Broad audit.
Migration.
Many files or URLs.
Need for reviewers.
Need for resumability.

`references/runtime-contract.md`

Must document runtime APIs, stable default run directories, resume behavior, and optional item-state helpers.

`references/agent-contract.md`

Must document prompt and output contract rules.

`references/patterns.md`

Must document these patterns:

map-only
map-review-write
planner-reviewer-coder
test-fix-loop
benchmark-accept-reject
auto-research-loop
strategy-self-improvement as future pattern

`references/safety-and-isolation.md`

Must document safety rules and destructive-operation warnings.

32. Acceptance Criteria

The MVP is complete when all of the following are true.

Skill structure

The dynamic-workflow skill folder exists.
SKILL.md exists and contains trigger conditions and required behavior.
All reference docs exist.
Runtime file exists.
At least four examples exist.

Runtime

workflow-runtime.mjs can be imported from an example workflow.
createWorkflow().run() creates or reuses the stable workflow run directory.
Events are appended to events.jsonl.
Item state is saved to items.json when item-state helpers are used.
Concurrent item-state writes are serialized within the process.
agent() can call a configured CLI command.
agent() writes prompt, stdout, and stderr artifacts.
agent() documents and respects adapter environment settings.
agent() supports built-in claude, codex, and pi presets that can be overridden by normal adapter fields.
agent() uses jsonCommand for structured output calls when configured.
agent() validates schema-backed JSON outputs and retries transient validation failures.
agent() retry behavior is defined and auditable through attempt artifacts.
shell() can run a command and capture stdout/stderr/exit code.
parallel() respects concurrency and returns ordered result envelopes.
pipeline() processes many items with bounded concurrency and returns ordered result envelopes.
globFiles() can recursively find files by extension.
writeReport() writes report.md.

Example workflows

proofread-directory.workflow.mjs is runnable after configuring agents.
review-codebase.workflow.mjs is runnable after configuring agents.
benchmark-optimize.workflow.mjs demonstrates baseline/test/benchmark flow in a sandbox and does not modify the original repo.
auto-research-simple.workflow.mjs demonstrates propose/change/evaluate/record flow.

Resume

The default run directory is stable per workflow name.
If resume is true and an item is marked done in items.json, rerunning the same workflow can skip it.
If resume is false, the workflow starts with empty item state for that invocation.
Failed items should be marked failed, not done, and isItemDone() should return false for them.

Documentation

A coding agent should be able to read SKILL.md and examples, then create a new task-specific workflow without needing additional explanation.

33. Suggested Manual Test Plan

Test 1: Fake agent

Create a fake agent script:

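// fake-agent.mjs
// Reads the whole prompt from stdin and echoes it back unchanged as JSON.
let input = "";
process.stdin.setEncoding("utf8"); // decode multi-byte characters split across chunks
process.stdin.on("data", chunk => input += chunk);
process.stdin.on("end", () => {
  console.log(JSON.stringify({
    changed: false,
    correctedText: input,
    summary: "fake agent did nothing"
  }));
});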

Create a fake reviewer script:

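// fake-reviewer.mjs
// Reads the whole prompt from stdin and always accepts it as JSON.
let input = "";
process.stdin.setEncoding("utf8"); // decode multi-byte characters split across chunks
process.stdin.on("data", chunk => input += chunk);
process.stdin.on("end", () => {
  console.log(JSON.stringify({
    accept: true,
    reason: "fake reviewer accepted",
    finalText: input
  }));
});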

Configure:

{
  "agents": {
    "editor": {
      "command": "node .dynamic-workflows/fake-agent.mjs",
      "input": "stdin",
      "output": "json",
      "timeoutMs": 60000
    },
    "reviewer": {
      "command": "node .dynamic-workflows/fake-reviewer.mjs",
      "input": "stdin",
      "output": "json",
      "timeoutMs": 60000
    }
  }
}

Run proofread workflow on a small test directory.

Expected:

No files changed.
Reviewer should not be called because the fake editor returns changed: false.
Events written.
Prompts and outputs written.
Report generated.

Test 2: Parallelism

Create 10 fake tasks that each sleep for 1 second. Run them with concurrency 5. Wall time should be around 2 seconds (two rounds of five tasks), not 10 seconds.
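The timing check can be sketched without the runtime. In the sketch below, `runLimited` is a hypothetical stand-in for `parallel()` and `measureWallTime` is an illustrative helper; neither name comes from the runtime, and the real `parallel()` may be implemented differently.

```javascript
// Hypothetical stand-in for the runtime's parallel(); illustrative only.
// Runs task functions with at most `concurrency` in flight and returns
// results in input order.
async function runLimited(tasks, concurrency) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const index = next++; // claim the next task; safe on a single-threaded event loop
      results[index] = await tasks[index]();
    }
  }
  const workerCount = Math.min(concurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Builds `taskCount` fake tasks that each sleep `sleepMs`, then measures wall time.
async function measureWallTime({ taskCount, sleepMs, concurrency }) {
  const tasks = Array.from({ length: taskCount }, (_, i) => async () => {
    await sleep(sleepMs);
    return i;
  });
  const started = Date.now();
  const results = await runLimited(tasks, concurrency);
  return { elapsedMs: Date.now() - started, results };
}

// Test 2 as specified: 10 tasks x 1 second at concurrency 5.
measureWallTime({ taskCount: 10, sleepMs: 1000, concurrency: 5 }).then(
  ({ elapsedMs }) => console.log(`elapsed: ${elapsedMs} ms`)
);
```

If the reported time is close to 10 seconds, the tasks are running serially. A time well under 2 seconds suggests the concurrency limit is not being enforced.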

Test 3: Resume

Run the proofread workflow over 3 files. Stop it manually after 1 file, or simulate a failure. Rerun the same workflow with the default stable run directory and resume: true. Items already marked done should be skipped.
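A minimal sketch of the behavior this test looks for. It assumes an items.json shape of `{ [itemId]: { status: "done" } }`, which is an illustration rather than the runtime's contract. `runOnce`, `demo`, and the `stopAfter` option are hypothetical names used only to simulate an interrupted run.

```javascript
// Hypothetical sketch of the Test 3 check. The items.json shape used here,
// { [itemId]: { status: "done" } }, is an assumption, not the runtime's contract.
async function runOnce({ stateFile, items, resume, stopAfter = Infinity }) {
  // Dynamic import keeps the sketch loadable as either ESM or CommonJS.
  const fs = await import("node:fs/promises");
  let state = {};
  if (resume) {
    try {
      state = JSON.parse(await fs.readFile(stateFile, "utf8"));
    } catch {
      // No readable prior state: start empty.
    }
  }
  const processed = [];
  const skipped = [];
  for (const id of items) {
    if (state[id]?.status === "done") {
      skipped.push(id); // finished in an earlier run
      continue;
    }
    if (processed.length >= stopAfter) break; // simulate an interrupted run
    processed.push(id); // real work (agent call, file write) would happen here
    state[id] = { status: "done" };
    await fs.writeFile(stateFile, JSON.stringify(state, null, 2)); // persist after every item
  }
  return { processed, skipped };
}

// Runs the scenario from Test 3 against a throwaway temp directory.
async function demo() {
  const fs = await import("node:fs/promises");
  const os = await import("node:os");
  const path = await import("node:path");
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "resume-demo-"));
  const stateFile = path.join(dir, "items.json");
  const items = ["a.md", "b.md", "c.md"];
  const first = await runOnce({ stateFile, items, resume: true, stopAfter: 1 });
  const second = await runOnce({ stateFile, items, resume: true });
  return { first, second };
}

demo().then((result) => console.log(JSON.stringify(result)));
```

In this sketch the first run processes only a.md, and the second run skips a.md and processes the remaining two files. Persisting state after every item, rather than once at the end, is what lets an interrupted run be resumed.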

Test 4: Shell

Run a workflow that calls:

node -e "console.log(JSON.stringify({ok:true, metric:123}))"

Expected:

shell(..., { json: true }) returns parsed JSON.
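The sketch below shows what this check exercises: run a command, capture stdout, stderr, and the exit code, then parse stdout as JSON. `runJsonCommand` and its return shape are hypothetical stand-ins and are not the runtime's `shell()`. It takes an executable and an argument array, which sidesteps shell quoting.

```javascript
// Hypothetical stand-in for shell(..., { json: true }); illustrative only.
// Runs an executable with arguments, captures stdout/stderr/exit code,
// and parses stdout as JSON when the command exits successfully.
async function runJsonCommand(file, args) {
  // Dynamic import keeps the sketch loadable as either ESM or CommonJS.
  const { spawn } = await import("node:child_process");
  return new Promise((resolve, reject) => {
    const child = spawn(file, args, { stdio: ["ignore", "pipe", "pipe"] });
    let stdout = "";
    let stderr = "";
    child.stdout.setEncoding("utf8");
    child.stderr.setEncoding("utf8");
    child.stdout.on("data", (chunk) => (stdout += chunk));
    child.stderr.on("data", (chunk) => (stderr += chunk));
    child.on("error", reject); // e.g. executable not found
    child.on("close", (exitCode) => {
      let json = null;
      if (exitCode === 0) {
        try {
          json = JSON.parse(stdout);
        } catch {
          // stdout was not valid JSON: leave json as null
        }
      }
      resolve({ exitCode, stdout, stderr, json });
    });
  });
}

// The command from Test 4, run with the current Node.js binary.
runJsonCommand(process.execPath, [
  "-e",
  "console.log(JSON.stringify({ok:true, metric:123}))",
]).then((result) => console.log(result.json));
```

Returning the raw stdout alongside the parsed value keeps failures debuggable when a command prints something other than JSON.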

34. Implementation Order

Recommended implementation sequence for the coding agent:

Create folder structure.
Write SKILL.md.
Write reference docs as concise markdown.
Implement workflow-runtime.mjs with file helpers and event logging.
Implement stable run directories, resume, and serialized item-state writes.
Implement shell().
Implement agent() with prompt artifacts, retries, timeout handling, and JSON extraction.
Implement parallel() with ordered result envelopes.
Implement pipeline() with ordered result envelopes and per-item stage failure handling.
Implement globFiles().
Implement diffText().
Write proofread-directory.workflow.mjs.
Write fake-agent manual tests.
Write the remaining examples.
Update docs based on examples.

35. Important Design Constraints

Keep the MVP boring.

Prefer:

One readable runtime file.
Plain JavaScript ESM.
No dependencies.
Explicit prompts.
Explicit contracts.
Simple JSON files.
Simple run artifacts.
Examples that coding agents can copy.

Avoid:

Premature npm package design.
Complex plugin architecture.
Hidden magic.
Runtime self-modification.
Hard-coding one agent provider.
Provider SDKs.
Complex JSON schema validation.

The purpose of the MVP is to validate the pattern: can a coding agent reliably create useful workflow scripts for large tasks?

36. Future Extensions After MVP

After the MVP works on several real tasks, consider extracting the runtime into an npm package.

Potential future package structure:

@dynamic-workflow/core
@dynamic-workflow/cli
@dynamic-workflow/agents

Future features:

Real CLI: init, run, resume, report.
Explicit run IDs and run history management.
Better schema validation with Zod or Ajv.
Git worktree support.
File locks.
Resource locks.
Cost/token tracking.
HTML report.
Human approval gates.
Strategy module hot-swapping.
Stronger sandboxing.
TUI/monitoring UI.

Do not implement these in the MVP unless absolutely necessary.
