Releases: heggria/pi-taskflow
pi-taskflow 0.0.28
Granular-reuse release. v0.0.27 proved the incremental-recompute cost win — v0.0.28 makes it far larger and trivial to opt into. Invalidation drops from whole-flow to per-phase and per-item, and a single flag flips an entire flow into cross-run reuse.
⚡ Smarter invalidation
- Per-phase fingerprint (`v3:phasefp`): edit one phase, and only it and its dependents re-run. An independent sibling keeps its cache hit.
- Per-item `map` caching: change 1 of N items, and only that item re-executes. The other N−1 are served for $0.
🎛️ One-flag opt-in
- `incremental` flag (flow-level and `run` arg): defaults every phase to cross-run reuse, no per-phase annotation needed. Per-phase `cache` and blocked types (gate/approval/loop/tournament) still win; the default stays the safe `run-only`.
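The precedence in this bullet can be sketched as a small resolver. This is an illustration under assumptions, not the project's code: the `Scope` values, the `PhaseLike` shape, and `resolveCacheScope` are hypothetical names. The notes do not say how an explicit `cache` on a blocked type is treated, so the sketch simply checks blocked types first.

```typescript
// Hypothetical types for illustration; the real schema may differ.
type Scope = "off" | "run-only" | "cross-run";

interface PhaseLike {
  type: string;             // e.g. "agent", "map", "gate"
  cache?: { scope: Scope }; // explicit per-phase setting, if any
}

// Phase types the notes list as blocked from defaulted cross-run reuse.
const BLOCKED_TYPES = new Set<string>(["gate", "approval", "loop", "tournament"]);

function resolveCacheScope(phase: PhaseLike, incremental: boolean): Scope {
  // Assumption: blocked types stay on the safe default regardless of the flag.
  if (BLOCKED_TYPES.has(phase.type)) return "run-only";
  // An explicit per-phase cache setting wins over the flow-level flag.
  if (phase.cache) return phase.cache.scope;
  // The flag only changes the default; without it the safe default stays.
  return incremental ? "cross-run" : "run-only";
}
```

With the flag off, every call returns the same scope it would have returned before the flag existed, which is the "default stays safe" property.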
🔍 See what got reused
- Reuse reporting: the end-of-run report and `/tf recompute` now show reused-vs-executed counts plus a per-phase Why trace (▲ rerun · ✂ cutoff · ✓ reused · ✗ failed), with `← causedBy`.
Implementation details & soundness
- `v3:phasefp` hashes the phase plus its transitive `dependsOn ∪ from` closure, replacing the whole-flow `v2:flowdef` hash. `cacheKeys` emits a 4-tier read ladder (`v3:phasefp` write → `v2:flowdef` → bare flowdef → legacy, all read-only) so the upgrade is additive: no miss-storm for unchanged flows. Fail-open: any per-phase error degrades that phase to the whole-flow hash. Falls back to whole-flow when per-phase invalidation can't be statically guaranteed (flow-wide `contextSharing`, any `shareContext` phase in the closure, `join: "any"`, or sub-flow inner phases). See `extensions/flowir/phasefp.ts`.
- Per-item keys omit the structural fingerprint (which hashes the whole `over` source) so changing one item no longer moves every key; they fold `[phase.id, it.agent, model, it.task]` + the world-state tail. Disabled (whole-map only) under `run-only`/`off` scope, `shareContext` / flow-wide `contextSharing`, or inside a runtime-generated sub-flow.
- `phaseFingerprint` now strips `cache`, `retry`, `concurrency`, `final`: none changes a phase's output, so a no-op config tweak no longer falsely invalidates.
- Reuse reporting reports a dollar figure only for within-run reuse (where prior usage is preserved); cross-run hits are counted without an invented saving. See `summarizeReuse` / `RecomputeDecision` in `extensions/runtime.ts`.
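A minimal sketch of the per-phase idea, assuming a simplified phase shape: hash a phase together with everything reachable through `dependsOn` and `from`, so an edit to an unrelated sibling leaves the fingerprint untouched. The `Phase` interface, the `closure` helper, and the FNV-1a stand-in hash are assumptions for illustration; the real logic lives in `extensions/flowir/phasefp.ts` and may differ.

```typescript
// Hypothetical phase shape; only `dependsOn` and `from` are named in the notes.
interface Phase {
  id: string;
  task: string;
  dependsOn?: string[];
  from?: string[];
}

// FNV-1a (32-bit), a stand-in for whatever hash the project actually uses.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Collect the phase plus everything it transitively depends on.
function closure(flow: Map<string, Phase>, id: string, seen: Set<string> = new Set()): Set<string> {
  if (seen.has(id)) return seen;
  seen.add(id);
  const phase = flow.get(id);
  if (!phase) return seen;
  const deps = (phase.dependsOn ?? []).concat(phase.from ?? []);
  for (const dep of deps) closure(flow, dep, seen);
  return seen;
}

function phaseFingerprint(flow: Map<string, Phase>, id: string): string {
  // Sort ids so the fingerprint does not depend on traversal order.
  const ids = Array.from(closure(flow, id)).sort();
  const parts = ids.map((pid) => JSON.stringify(flow.get(pid) ?? { id: pid }));
  return "v3:phasefp:" + fnv1a(parts.join("\n"));
}
```

Because only the closure is hashed, editing a phase outside that closure cannot move the key, while editing any upstream phase does.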
Tests: 804 → 846 (+42) across 46 files — new cache-phasefp, cache-peritem, incremental-flag, reuse-summary suites. Typecheck clean.
Upgrade: drop-in. The 4-tier cache ladder reads old entries; nothing to migrate.
pi-taskflow 0.0.27
[0.0.27] — 2026-06-25
Evidence release: the incremental-recompute cost win is now proven, not
asserted. v0.0.25 made `/tf recompute` trustworthy and v0.0.26 made the
dependency contract under it real — but the only cascade test re-ran every
phase, so "rerun only what changed" had no regression proof. This release pins
the two ways recompute actually saves money, closing the flagship's open
acceptance criterion (the prerequisite for ever flipping recompute on by
default).
Added
- Flagship cost-win tests (`test/recompute.test.ts`):
  - Partial cascade: `rerun < full`. A diamond where one branch shares no
    edge with the changed seed proves the unrelated phase is reused (0 tokens),
    never re-run, and the rerun set is strictly smaller than the full flow.
  - Early-cutoff propagation. Re-seeding a phase whose output is unchanged
    cuts off its entire transitive downstream: only the seed spends a token,
    every descendant hits its cache. This is the "changed a file that didn't
    actually affect the result ⇒ near-zero rerun" guarantee.
- Tests: 802 → 804 (+2).
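The two guarantees pinned by these tests can be illustrated with a small recompute loop. This is a sketch under assumptions, not the code in `test/recompute.test.ts`: `PhaseNode`, `recompute`, and the string-valued outputs are hypothetical, and the input is assumed to be topologically sorted.

```typescript
interface PhaseNode {
  id: string;
  dependsOn: string[];
}

// `order` must be topologically sorted. `run` recomputes one phase's output.
function recompute(
  order: PhaseNode[],
  seeds: Set<string>,
  cached: Map<string, string>,
  run: (id: string) => string,
): { rerun: string[]; reused: string[] } {
  const changed = new Set<string>(); // phases whose output actually differs
  const rerun: string[] = [];
  const reused: string[] = [];
  for (const node of order) {
    const stale = seeds.has(node.id) || node.dependsOn.some((d) => changed.has(d));
    if (!stale) {
      reused.push(node.id); // partial cascade: untouched branches stay cached
      continue;
    }
    rerun.push(node.id);
    const out = run(node.id);
    // Early cutoff: an identical output does not invalidate dependents.
    if (out !== cached.get(node.id)) changed.add(node.id);
    cached.set(node.id, out);
  }
  return { rerun, reused };
}
```

On a diamond whose `right` branch has no edge to the seed, a changed seed reruns `seed`, `left`, and `join` but never `right`; a seed whose output comes back unchanged reruns only itself.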
Changed
- README test count and feature line refreshed (was stale at 702/34 files):
  now 804 tests across 42 files, with `incremental recompute` and
  `FlowIR compile seam` listed among the headline capabilities.
Notes
- Scope held deliberately. Two further H2 ideas (flipping `run` to
  auto-recompute by default, and precise `ir-changed` / map item-level reuse)
  are not in this release. The first changes every user's `run` behavior
  (a kernel-level, post-M5 decision); the latter two are scoped-out in the
  roadmap (§6) as later RFCs. Shipping the proof first keeps each step
  independently releasable.
pi-taskflow 0.0.26
[0.0.26] — 2026-06-25
Foundation release: the convergence roadmap's H1 lands — a real FlowIR
compile seam (M1), a declared dependency plane (M2), and a
backward-compatible cache-key migration. v0.0.25 made incremental recompute
trustworthy; this release makes the contract underneath it real: the
recompute frontier now reasons over observed ∪ declared dependencies, the
flow definition compiles through a typed IR surface instead of an inlined
hash, and folding the definition into the cache key no longer evicts every
pre-existing cross-run entry.
Added
- FlowIR compile seam (M1). New `extensions/flowir/{index,translate,meta}.ts`
  exposes `compileTaskflowToIR(def) → { ir, meta, hash, usedFallbackHash, warnings, errors }`,
  a typed, never-throwing projection of a desugared flow
  into a content-addressed IR. The runtime now routes `flowDefHash` through this
  seam instead of inlining it. `translate` is currently a 1:1 stub projection
  (so `usedFallbackHash` is `true` and the hash equals the vendored
  `flowDefHash`); it becomes the genuine overstory compiler once that kernel is
  vendored, at which point the cache-key version advances `v2: → v3:`.
- `/tf ir <flow>` command + `ir` tool action. Renders the compiled IR plus
  its hash and any structured `CompileError[]`: zero tokens, no LLM.
- Declared dependency plane (M2). `compileTaskflowToIR` synthesizes per-phase
  `DeclaredDeps { reads, writes }` from interpolation refs
  (`task`/`over`/`when`/`until`/`eval`/`branches`/`with`/`context`) and
  `dependsOn`, attaches them to `ir.meta.declaredDeps`, and persists them to
  `RunState`. `/tf recompute` now computes its stale frontier over
  union(observed ∪ declared) rather than observed-only, so a dependency that
  was declared but never interpolated at runtime is no longer missed.
- Tests: 753 → 802 (+49) across new suites: `flowir.test.ts`,
  `flowir-declared.test.ts`, `stale-union.test.ts` (incl. a 500-iteration
  property test proving the union frontier is never narrower than observed-only),
  `recompute-union.test.ts`, `cache-migration.test.ts`, plus `e2e-flowir.mts`
  and `e2e-cache-migration.mts`.
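To make the union frontier concrete, here is a small sketch that merges observed and declared reads before computing what is stale. The `Deps` type and both function names are hypothetical; the sketch shows only why the union can never be narrower than the observed-only frontier.

```typescript
type Deps = Map<string, Set<string>>; // phase id -> ids it reads

// Merge observed and declared reads per phase.
function unionDeps(observed: Deps, declared: Deps): Deps {
  const out: Deps = new Map();
  for (const source of [observed, declared]) {
    source.forEach((reads, phase) => {
      const merged = out.get(phase) ?? new Set<string>();
      reads.forEach((r) => merged.add(r));
      out.set(phase, merged);
    });
  }
  return out;
}

// Every phase that reads, directly or transitively, something in `changed`.
function staleFrontier(deps: Deps, changed: Set<string>): Set<string> {
  const stale = new Set<string>(changed);
  let grew = true;
  while (grew) {
    grew = false;
    deps.forEach((reads, phase) => {
      if (stale.has(phase)) return;
      let hit = false;
      reads.forEach((r) => {
        if (stale.has(r)) hit = true;
      });
      if (hit) {
        stale.add(phase);
        grew = true;
      }
    });
  }
  return stale;
}
```

A phase that declares a read on `audit` (say in a `when` guard that was skipped at runtime) is missed by the observed-only frontier and caught by the union.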
Fixed
- Cache-key migration no longer evicts existing cross-run entries. Folding
  `flowdef:` into the key previously invalidated every pre-existing cross-run
  cache entry on upgrade (a one-time miss-storm). `cacheKey` is now versioned
  (`v2:flowdef:`) with a 3-tier lookup: new key → bare `flowdef:` key →
  legacy (no-flowdef) key. Old entries still hit for one release cycle; there is
  no write-through on a fallback hit (legacy entries age out naturally), and
  every tier still includes `flow:${name}` so two different flows can never
  collide.
- Declared plane and recompute guard now see `loop.until` and `gate.eval`.
  `collectRefs` skipped `until` (loop convergence) and `eval[]` (gate zero-token
  checks), so a dependency expressed only in those fields was absent from the
  declared plane and from the `dryRun:false` unobserved-dependency guard. Both
  are now scanned. (Closes the two MEDIUM findings from the H1 risk review.)
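The 3-tier lookup can be pictured as an ordered list of candidate keys that is probed from newest to oldest. The key layout below is an assumption made for illustration (only the tier order and the `flow:${name}` component come from the notes), and `candidateKeys` / `lookup` are hypothetical names.

```typescript
// Hypothetical key layout: tier order and the flow scope are from the notes,
// the separators and argument names are not.
function candidateKeys(flowName: string, flowDefHash: string, inputHash: string): string[] {
  const scope = `flow:${flowName}`;
  return [
    `v2:flowdef:${flowDefHash}|${scope}|${inputHash}`, // tier 1: current, versioned
    `flowdef:${flowDefHash}|${scope}|${inputHash}`,    // tier 2: bare flowdef
    `${scope}|${inputHash}`,                           // tier 3: legacy, no flowdef
  ];
}

interface Hit {
  value: string;
  tier: number;
}

function lookup(store: Map<string, string>, keys: string[]): Hit | undefined {
  for (let i = 0; i < keys.length; i++) {
    const value = store.get(keys[i]);
    // No write-through: a fallback hit is returned without copying it to tier 1.
    if (value !== undefined) return { value, tier: i + 1 };
  }
  return undefined;
}
```

Since every candidate embeds the flow name, a legacy entry written by one flow cannot satisfy a lookup from another, whichever tier matches.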
Compatibility
- Backward compatible. `RunState.flowDefHash` and `RunState.declaredDeps`
  are optional, so pre-0.0.26 run states load unchanged. A compile/hash failure
  fails open: `usedFallbackHash` stays set, cross-run cache is disabled for that
  run, and the key degrades to a flow-scoped (collision-free) form. The one
  observable change on upgrade is a single re-execution of in-flight phases
  whose stored `inputHash` predates the `v2:` prefix.
pi-taskflow 0.0.25
Release v0.0.25: incremental recompute is now trustworthy. Closes three silent correctness holes in /tf recompute: when/eval readSet capture, loop self-read scheduler deadlock, and the unobserved-dependency guard for real recomputation. The difference between looks-incremental and provably-incremental.
pi-taskflow 0.0.24
[0.0.24] — 2026-06-23
Feature release: `/tf compile` turns the declared DAG into a Mermaid
diagram plus a verification overlay for 0 tokens. A picture of the plan, a
structural audit of the plan, and a GitHub-pastable artifact, all from the
same JSON.
Added
- `compile` action for the `taskflow` tool and the `/tf compile <name>`
  command. Renders the flow as a Mermaid `flowchart`, overlays verification
  issues onto the nodes (red = error, amber = warning, green border = final),
  and emits a markdown document suitable for READMEs / issues / PRs.
- Distinct shapes for every phase kind: agent ▭, parallel/map/flow ⊐, reduce ▽,
  gate ◇, approval ⏸, loop ↻, tournament ⬡. Guards become edge labels;
  `join: "any"` becomes dotted edges.
- Reuses the existing `verifyTaskflow` graph analysis, so every dead-end,
  unreachable node, gate-exhaustion, budget overflow, concurrency warning, and
  guard contradiction is painted directly on the diagram.
- Zero runtime dependencies; the compiler is a pure function with no LLM calls.
- Tests: 670 → 702 (+32) in `test/compile.test.ts`: structural assertions on
  the emitted Mermaid tokens (no third-party parser dependency;
  render-correctness is validated by shape/edge/class assertions).
Fixed
- Id collisions no longer merge nodes. Two distinct phase ids that
  sanitize to the same Mermaid token (e.g. `audit-each` and `audit_each`) are
  now disambiguated with a `_2` suffix instead of collapsing into one node with
  an accidental self-loop.
- Markdown-injection hardening. Free-form strings (flow name, description,
  verification messages) are neutralized before interpolation, so a
  multi-line / bracket-laden name can no longer break out of the H1 heading or
  spawn a second blockquote.
- `/tf compile <name>` now schema-validates first, matching the tool action:
  a malformed saved flow yields a clean error instead of a half-rendered
  diagram. An optional `lr`/`td` suffix selects diagram direction.
- Backslashes are now escaped inside Mermaid labels.
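The id-collision fix can be sketched as follows. The sanitizer rule (anything outside `[A-Za-z0-9_]` becomes `_`) and the `mermaidIds` name are assumptions; only the `_2` suffix behavior for `audit-each` / `audit_each` is taken from the notes.

```typescript
// Map phase ids to unique Mermaid-safe tokens.
function mermaidIds(phaseIds: string[]): Map<string, string> {
  const used = new Set<string>();
  const out = new Map<string, string>();
  for (const id of phaseIds) {
    // Assumed sanitizer: anything outside [A-Za-z0-9_] becomes "_".
    const base = id.replace(/[^A-Za-z0-9_]/g, "_");
    let token = base;
    // On collision, append _2, _3, ... until the token is unused.
    for (let n = 2; used.has(token); n++) token = `${base}_${n}`;
    used.add(token);
    out.set(id, token);
  }
  return out;
}
```

Tracking the used tokens, rather than the raw ids, also covers the case where a later phase is literally named like an earlier disambiguated token.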
pi-taskflow 0.0.23
[0.0.23] — 2026-06-11
Feature release: the Shared Context Tree — an opt-in mechanism that gives
subagents a horizontal blackboard and a vertical supervision tree, so fan-out
items can reuse expensive context instead of re-reading it, and a node can
delegate work at runtime and have its children report back. Validated with six
real end-to-end runs (real `pi`, real models) including a recursive org tree
and a large 5-way audit that converges through a loop + gate.
Added
- Shared Context Tree (opt-in). Set `shareContext: true` on a phase (or
  `contextSharing: true` at the flow level) to give its subagent four extra
  tools backed by a per-run, file-based blackboard:
  - `ctx_write(key, value)` / `ctx_read(key?)`: a horizontal blackboard. A
    node publishes a finding; siblings/descendants reuse it (own > ancestors >
    completed-others on key conflict; a running sibling's half-written findings
    stay hidden). Stops fan-out items from re-reading the same files.
  - `ctx_report(summary, structured?)` / `ctx_spawn(assignments[])`: a
    vertical supervision tree. A node reports up, and delegates child work at
    runtime; the runtime runs each child (isolated) after the node finishes and
    folds their reports into the phase output.
  - New module `extensions/context-store.ts` reuses the run store's
    atomic-write + file-lock primitives (per-node findings files, no global
    lock contention).
  - All bookkeeping is fail-open (it can never sink a phase); the blackboard
    is size-bounded (256 KB/value, 256 keys/node), depth-capped (5), and cleaned
    up with the run. Fully backward-compatible: flows that don't opt in are
    byte-for-byte unaffected.
- `ctx_spawn` accepts a sub-graph, not just flat tasks. An assignment is now
  either `{task, agent?}` or `{subflow, defaultAgent?}` where `subflow` is an
  inline Taskflow (a dependency-bearing DAG with `map`/`gate`/`reduce`). The
  spawned subflow reuses the same `validateTaskflow` + `verifyTaskflow` +
  nested-`executeTaskflow` machinery as `flow{def}`; spawn-subflows and `flow{def}`
  share one `MAX_DYNAMIC_NESTING` counter (a `def:spawn-*` `_stack` frame), and
  spawned child token/cost usage is folded into the parent phase for honest budget
  accounting. A bad subflow fails open with a diagnostic.
- Tests: 608 → 670 (+62) across 33 files, incl. `context-store`,
  `context-tree`, `spawn-xor`, `spawn-subflow`, `spawn-subflow-nesting`,
  `workspace`, `workspace-isolation`.
- Workspace isolation (`cwd` keywords). A phase's `cwd` now accepts three
  reserved keywords that make the runtime allocate an isolated working directory
  for the phase's subagent and tear it down afterwards:
  - `"temp"`: an ephemeral dir under the OS tmpdir, removed when the phase ends.
  - `"dedicated"`: a persistent dir under the run state
    (`runs/ws/<runId>/<phaseId>`), kept for inspection and deterministic per
    phase so a resume reuses the same dir.
  - `"worktree"`: a real `git worktree` on a throwaway branch off `HEAD`,
    removed (`git worktree remove --force` + branch delete) when the phase ends;
    for changes you want to diff / commit / discard in isolation.
  - New module `extensions/workspace.ts` (zero deps: `fs.mkdtemp` + `git` via
    `child_process`). Fail-open: a failed allocation degrades to the base
    cwd (`worktree` → `temp` when not a git repo) and records a `warnings`
    diagnostic, so a phase never fails to run because of isolation. Security:
    the keywords are rejected at validation in LLM-authored sub-flows
    (`flow{def}` / `ctx_spawn` subflow) so generated plans cannot allocate
    worktrees or temp dirs that mutate the repo. A literal path is passed
    through unchanged (fully backward-compatible).
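The blackboard's read precedence (own > ancestors > completed-others, with a running sibling's findings hidden) can be sketched like this. Everything here is hypothetical: the `Finding` and `NodeInfo` shapes, the `readKey` name, and the choice to list ancestors nearest-first are assumptions, and the real store in `extensions/context-store.ts` is file-based rather than in-memory.

```typescript
interface Finding {
  key: string;
  value: string;
  owner: string; // id of the node that wrote it
}

interface NodeInfo {
  id: string;
  ancestors: string[]; // assumed to be ordered nearest first
  status: "running" | "completed";
}

function readKey(
  reader: NodeInfo,
  nodes: Map<string, NodeInfo>,
  findings: Finding[],
  key: string,
): string | undefined {
  const matches = findings.filter((f) => f.key === key);
  // 1. The reader's own finding wins.
  const own = matches.find((f) => f.owner === reader.id);
  if (own) return own.value;
  // 2. Then the nearest ancestor that published the key.
  for (const ancestorId of reader.ancestors) {
    const hit = matches.find((f) => f.owner === ancestorId);
    if (hit) return hit.value;
  }
  // 3. Then any other node, but only once it has completed, so a running
  //    sibling's half-written findings stay hidden.
  const other = matches.find((f) => nodes.get(f.owner)?.status === "completed");
  return other?.value;
}
```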
Fixed
- `map`/`parallel` fan-out items that call `ctx_spawn` were silently
  orphaned. The post-run spawn-drain only covered single-agent/gate/reduce
  phases (keyed on the base phase id), but fan-out items run with suffixed node
  ids (`audit-0`…`audit-4`) and were never drained, so their queued children never
  ran (5 orphaned intents, 0 children, in a real e2e). Each fan-out item now
  drains its own node and runs + folds its spawned children (reports + usage),
  fail-open. Regression test added.
- Workspace override no longer leaks across isolation boundaries (found by
  the pre-release adversarial review). `runInlineSubflow` and the gate
  `onBlock:retry` upstream re-execution both spread `...deps` without clearing
  the parent's `_cwdOverride`, so a spawned subflow / re-run upstream dep could
  be force-pinned to the parent phase's isolated dir. Both now strip the
  override (a spawned subflow still inherits the parent's dir as its base cwd,
  consistent with `flow{def}`, but no longer ignores an inner phase's own cwd).
  The triplicated `effCwd` formula was extracted into one `resolveEffCwd()`
  helper (the divergence was the root cause). `runs/ws/` dedicated-workspace
  dirs are now reclaimed by the terminal-run cleanup, and `rmrf()` gained a
  path-containment guard (defense-in-depth).
pi-taskflow 0.0.22
[0.0.22] — 2026-06-10
Dogfooding release. The `dogfood-full` self-audit taskflow (which itself
exercises all 9 phase types + when/join/retry/budget/cache/eval/flow-def/
loop/tournament/approval) ran against the codebase and surfaced these fixes.
Added
- Live auto-refresh for the `/tf runs` panel. The run-history panel was a static snapshot taken when opened, so a background (detached) run's progress never updated while watching. It now polls run state on a 1s interval and re-renders only when a run's status/`updatedAt` actually changes, so phase progress (including `map`/`parallel` `subProgress` like `24/24`) updates live. The user's selection follows the same `runId` across refreshes, a green `● live` tag shows while any run is running, and the refresh timer is cleared on close (`dispose()`) and `unref`'d so it never keeps the event loop alive. Fully backward-compatible: without live hooks the panel renders statically as before.
  - 5 new tests (`test/runs-view.test.ts`): refresh-on-change, no-render-when-unchanged, dispose-stops-timer, selection-follows-runId, back-compat-no-hooks.
Fixed
- `safeParse` now prefers a `json`-tagged fence in multi-fence output. When an LLM phase emitted an evidence block (e.g. `` ```typescript ``) before the `` ```json `` payload, the old single-match regex grabbed the first fence, failed to parse, and the balanced-bracket fallback was misled by braces in the prose. `safeParse` returned `undefined` and any downstream `map` phase failed with `'over' did not resolve to an array`. It now scans every fenced block and tries `json`-tagged ones first, then untagged. (3 new multi-fence tests.)
- Unresolved interpolation refs are surfaced as phase warnings. `interpolate()` returns `missing[]` (placeholders with no source), but the runtime discarded it on the main task path, so `{args.typo}` or a `{steps.x.output}` without `dependsOn` was silently left intact in the dispatched task. The `interpolate.ts` doc comment promised "a recorded warning" that no code produced. The runtime now logs `[taskflow] phase X: unresolved refs ...` and attaches the message to `PhaseState.warnings` (persisted in the run record, visible in `/tf runs`). Doc comment corrected to match.
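The multi-fence preference described for `safeParse` can be illustrated with a reduced sketch. `parseFenced` is a hypothetical name, it omits the balanced-bracket fallback the real function has, and the fence delimiter is built with `repeat` only so the example does not embed a literal triple-backtick run.

```typescript
// Three backticks, built indirectly so this example contains no literal fence.
const TICKS = "`".repeat(3);

function parseFenced(output: string): unknown {
  // Matches: opening fence, optional language tag, newline, lazy body, closing fence.
  const fence = new RegExp(TICKS + "([A-Za-z0-9_-]*)[ \\t]*\\r?\\n([\\s\\S]*?)" + TICKS, "g");
  const tagged: string[] = [];
  const untagged: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = fence.exec(output)) !== null) {
    const lang = m[1].toLowerCase();
    if (lang === "json") tagged.push(m[2]);
    else if (lang === "") untagged.push(m[2]);
    // Blocks tagged with another language (e.g. typescript) are skipped.
  }
  // json-tagged blocks are tried first, then untagged ones.
  for (const body of tagged.concat(untagged)) {
    try {
      return JSON.parse(body);
    } catch {
      // Not valid JSON; try the next candidate block.
    }
  }
  return undefined;
}
```

An evidence block full of braces placed before the payload no longer matters, because it is never a parse candidate.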
pi-taskflow 0.0.21
[0.0.21] — 2026-06-10
Added
- Per-step context pre-read in shorthand modes. Single, chain, and tasks shorthand steps now accept `context` (file paths) and `contextLimit`, desugared directly onto the generated phases. This eliminates `O(N²)` file exploration without writing the full DSL. In parallel `tasks` mode all branches share the deduped union of step contexts; chain steps each carry their own context. A top-level `context` in chain mode produces a warning (no unsupported flow-level default). Context-file changes automatically invalidate phase caches.
Fixed
- Headless approval safety. Approval phases now auto-reject (not auto-approve) when running in detached/background/CI mode, preventing silent bypass of human gates.
- Step-reference validator accepts transitive ancestors. The step-reference checker previously raised false positives on valid DAGs where dependencies span multiple levels of ancestry. Ancestor transitive closure is now fully resolved.
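The step-reference fix amounts to checking references against the transitive ancestor set rather than direct parents only. A minimal sketch, assuming the flow has already been validated as a DAG; `ancestorsOf` and `invalidRefs` are hypothetical names, not the validator's actual API.

```typescript
// Compute every transitive ancestor of each phase from its direct dependsOn.
function ancestorsOf(deps: Record<string, string[]>): Map<string, Set<string>> {
  const memo = new Map<string, Set<string>>();
  const visit = (id: string, path: Set<string>): Set<string> => {
    const done = memo.get(id);
    if (done) return done;
    const acc = new Set<string>();
    if (path.has(id)) return acc; // cycle guard; a valid DAG never hits this
    path.add(id);
    for (const parent of deps[id] ?? []) {
      acc.add(parent);
      visit(parent, path).forEach((a) => acc.add(a));
    }
    path.delete(id);
    memo.set(id, acc);
    return acc;
  };
  Object.keys(deps).forEach((id) => visit(id, new Set<string>()));
  return memo;
}

// A step reference is valid if its target is any ancestor, not just a direct parent.
function invalidRefs(id: string, refs: string[], anc: Map<string, Set<string>>): string[] {
  const mine = anc.get(id) ?? new Set<string>();
  return refs.filter((r) => !mine.has(r));
}
```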
pi-taskflow 0.0.20
[0.0.20] — 2026-06-10
Added
- Background (detached) execution: `detach: true`. Run a taskflow in a detached child process without blocking the current session. Pass `detach: true` and get a `runId` back immediately; the flow executes in the background, persisting state to the store. Status is polled via `/tf runs`, and `resume` works as normal.
  - `extensions/detached-runner.ts` (new): lightweight child-process entry script that reads serialized context, calls `executeTaskflow`, and persists terminal state.
  - `extensions/index.ts`: `detach: Boolean` parameter on the taskflow tool + child-process spawn logic (records PID in `RunState`).
  - `extensions/store.ts`: `RunState` gains `pid?: number` + `detached?: boolean` fields; `isProcessAlive(pid)` stale-PID helper.
  - Design: entry-point spawn wrapper with zero changes to the 1340-line `runtime.ts` core, no new phase type, no DSL version bump, fully backward-compatible.
  - Approval phases auto-reject in background mode. Idle watchdog kills stalled children. Stale PID detection via signal-0 probe.
  - 8 new tests (`test/detached.test.ts`): process-alive, PID persistence, end-to-end detached, crash→failed, resume after failure, stale PID, backward compat.
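A signal-0 probe of the kind mentioned above is commonly written as below in Node.js. This is a sketch of the general technique, not the helper in `extensions/store.ts`; treating `EPERM` as "alive" is an assumption about the desired behavior.

```typescript
// Signal 0 performs the existence/permission check without delivering a signal.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user;
    // any other error (typically ESRCH) means there is no such process.
    return (err as { code?: string }).code === "EPERM";
  }
}
```

A stored PID can be reused by the operating system after the original process exits, so a probe like this is a staleness hint rather than proof that the same run is still executing.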
Fixed
- `approvalView` initialization robustness: throws a clear error when the approval view module is unavailable, preventing silent failures in detached/background mode.
pi-taskflow 0.0.19
[0.0.19] — 2026-06-10
Documentation
- Closed the SKILL coverage gap: the LLM can now author every shipped feature. A schema-vs-SKILL.md audit (`docs/internal/skill-coverage-audit.md`, machine-checked + cross-adversarial reviewed) found several implemented + tested features that were undocumented in the LLM-facing skill, so the model never generated them. All ~46 user-facing schema fields are now documented across SKILL.md + configuration.md.
  - SKILL.md: phase-type table now lists all 9 types (added `loop`, `tournament`) with a “details” column pointing each to its section; new Loop phases (`until`/`maxIterations`/`convergence`) and Tournament phases (`variants`/`judge`/`mode`/`judgeAgent`) sections; `eval` (zero-token machine gate) and `onBlock: "retry"` (self-healing rework loop) folded into the Gate section; cross-run `cache` pointer + `optional` + static `branches` notes.
  - SKILL.md: new Operating a run section covering the run lifecycle (`running → completed/blocked/failed/paused`), cache-aware resume, when to resume vs. re-run, budget-mid-run behavior, and run inspection. Clarified action semantics (`define` vs `name`, save scope/collision, `verify`/`agents` actions).
  - configuration.md: new §2.1 Context pre-reading (`context`/`contextLimit`: resolution order, per-file 8000-char cap, 200k total cap) and §8 Cross-run caching (`cache.scope`, `ttl`, full `fingerprint` prefix table for git/glob/glob!/file/env). Fixed a stale “5 phase types” → 9 cross-file drift.
- Every documented JSON example validates against the live schema; all run-status/resume claims verified against the runtime (`blocked` is terminal; `paused`/`failed` are resumable). 560 tests pass, zero regression.
CI
- GitHub Packages publish is now best-effort (`continue-on-error`) so an unscoped-package 404 there can never block the npm publish or the GitHub Release.