Commit bb5d0ac

Dogfood Workflow for Intent sync
1 parent d99d765 commit bb5d0ac

20 files changed

Lines changed: 2083 additions & 348 deletions

.pnpmfile.cjs

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
const workflowPackageVersions = {
  '@tanstack/workflow-core': '0.0.3',
  '@tanstack/workflow-runtime': '0.0.1',
}

function readPackage(pkg) {
  if (
    pkg.name === '@tanstack/workflow-runtime' ||
    pkg.name === '@tanstack/workflow-store-drizzle-postgres' ||
    pkg.name === '@tanstack/workflow-netlify'
  ) {
    pkg.dependencies = {
      ...pkg.dependencies,
      ...Object.fromEntries(
        Object.entries(workflowPackageVersions).filter(([name]) =>
          pkg.dependencies?.[name]?.startsWith('workspace:'),
        ),
      ),
    }
  }

  return pkg
}

module.exports = {
  hooks: {
    readPackage,
  },
}
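
To make the effect of the hook above concrete, here is a minimal TypeScript sketch that mirrors the same rewrite as a pure function and applies it to a made-up manifest. The `Manifest` type, the `rewriteWorkspaceDeps` name, and the sample manifest (including `some-other-dep`) are illustrative assumptions; only the version map and the `workspace:` filter come from the file itself. Unlike the real hook, this sketch returns a copy instead of mutating its argument.

```ts
// Illustrative sketch only: a pure-function mirror of the readPackage hook.
type Manifest = { name: string; dependencies?: Record<string, string> }

// Same pinned versions as the committed .pnpmfile.cjs.
const workflowPackageVersions: Record<string, string> = {
  '@tanstack/workflow-core': '0.0.3',
  '@tanstack/workflow-runtime': '0.0.1',
}

// Packages whose published manifests still carry workspace ranges.
const patchedPackages = new Set([
  '@tanstack/workflow-runtime',
  '@tanstack/workflow-store-drizzle-postgres',
  '@tanstack/workflow-netlify',
])

function rewriteWorkspaceDeps(pkg: Manifest): Manifest {
  if (!patchedPackages.has(pkg.name)) return pkg

  // Keep only the pinned entries that the manifest declares as workspace ranges.
  const pinned = Object.fromEntries(
    Object.entries(workflowPackageVersions).filter(([name]) =>
      pkg.dependencies?.[name]?.startsWith('workspace:'),
    ),
  )

  return { ...pkg, dependencies: { ...pkg.dependencies, ...pinned } }
}

// Hypothetical manifest, shaped like the problem described in the notes.
const sampleManifest: Manifest = {
  name: '@tanstack/workflow-netlify',
  dependencies: {
    '@tanstack/workflow-core': 'workspace:*',
    '@tanstack/workflow-runtime': 'workspace:*',
    'some-other-dep': '^1.0.0',
  },
}

const rewrittenManifest = rewriteWorkspaceDeps(sampleManifest)
console.log(rewrittenManifest.dependencies)
```

Dependencies that are not declared with a `workspace:` range are left untouched, so the hook becomes a no-op once the packages are republished with normal ranges.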
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
# Workflow Adapter Research: Intent Sync POC

This note captures what the TanStack.com Intent sync POC needs from the
Workflow library after the runtime/store/host packages landed.

## What Got Better

The main abstraction we wanted now exists upstream:

- `@tanstack/workflow-core` owns deterministic workflow replay primitives.
- `@tanstack/workflow-runtime` owns workflow registration, schedule
  materialization, run lifecycle, leases, timers, signals, approvals, and
  bounded sweeps.
- `@tanstack/workflow-store-drizzle-postgres` owns durable Postgres
  persistence.
- `@tanstack/workflow-netlify` owns the Netlify scheduled sweep handler.
- `@tanstack/workflow-vercel` owns the Vercel cron sweep handler.

That moves host/runtime machinery out of TanStack.com. The app now only needs:

```ts
export const intentProcessWorkflow = createWorkflow({
  id: 'intent-process-workflow',
  input,
}).handler(async (ctx) => {
  const versions = await ctx.step('select-pending-versions', () =>
    selectPendingIntentVersions({ limit: ctx.input.batchSize }),
  )

  for (const version of versions) {
    await ctx.step(`process-version:${version.id}`, () =>
      processIntentVersion(version.id),
    )
  }
})
```

and one registration provider:

```ts
export function createIntentWorkflowRegistrations() {
  return {
    'intent-process-workflow': {
      load: async () => intentProcessWorkflow,
      schedules: [{ schedule: every.minutes(15) }],
    },
  }
}
```

## Domain Boundary

The Intent registry already has durable business state:

- `pending`
- `synced`
- `failed`

Workflow should not replace those domain statuses. Workflow should make
orchestration durable and observable:

- which scheduled bucket ran
- which workflow steps replayed
- which step failed
- which run is paused, finished, errored, or claimable

This is why the process workflow uses one step to select work and one stable
step per package version. Partial failures are local to the version step and do
not abort the entire batch.
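
The stable-step-ID idea can be illustrated with a toy replay log. This is a sketch under stated assumptions, not the `@tanstack/workflow-core` implementation: `ToyContext`, the `Map`-backed log, the hard-coded version IDs, and the choice to record a failure as a `'failed'` step result are all hypothetical. It shows a failing version leaving the rest of the batch intact, and a second pass over the same log replaying recorded results instead of re-running the work.

```ts
// Toy model only: step results are recorded by step ID and replayed on re-entry.
type SyncStatus = 'synced' | 'failed'

class ToyContext {
  private readonly log: Map<string, unknown>

  constructor(log: Map<string, unknown>) {
    this.log = log
  }

  async step<T>(id: string, fn: () => T | Promise<T>): Promise<T> {
    // A recorded result means the step already ran: replay it, do not re-execute.
    if (this.log.has(id)) return this.log.get(id) as T
    const result = await fn()
    this.log.set(id, result)
    return result
  }
}

async function processBatch(
  log: Map<string, unknown>,
  processVersion: (id: string) => Promise<void>,
): Promise<Record<string, SyncStatus>> {
  const ctx = new ToyContext(log)
  // Hypothetical pending versions; the real step queries the Intent tables.
  const versions = await ctx.step('select-pending-versions', () => ['a', 'b', 'c'])
  const statuses: Record<string, SyncStatus> = {}

  for (const id of versions) {
    // One stable step ID per version keeps a failure local to that version.
    statuses[id] = await ctx.step<SyncStatus>(`process-version:${id}`, async () => {
      try {
        await processVersion(id)
        return 'synced'
      } catch {
        return 'failed'
      }
    })
  }
  return statuses
}

async function demoReplay() {
  const log = new Map<string, unknown>()
  const calls: string[] = []
  const flaky = async (id: string) => {
    calls.push(id)
    if (id === 'b') throw new Error('tarball fetch failed')
  }

  const first = await processBatch(log, flaky) // executes a, b, c
  const replayed = await processBatch(log, flaky) // replays everything from the log
  return { first, replayed, calls }
}

demoReplay().then((result) => console.log(result))
```

In this toy, version `b` fails while `a` and `c` still complete, and the second pass adds no new calls because every step ID is already in the log.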

## Serverless Runtime Model

Netlify and Vercel adapters should treat cron as a wake-up signal:

1. materialize due schedule buckets
2. claim due scheduled runs
3. claim due timers
4. run bounded workflow slices
5. return before the host timeout

The app should not own:

- cron parsing
- deterministic bucket IDs
- run ID construction
- timer sweeping
- lease ownership
- duplicate scheduled delivery handling
- event collection defaults
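
Steps 4 and 5 of the wake-up sequence above (run bounded slices, return before the host timeout) can be sketched as a deadline-aware loop. This is an assumption-laden illustration, not how `runtime.sweep()` is implemented: `boundedSweep`, `Slice`, `SweepOptions`, and the fixed per-slice cost estimate are hypothetical names and simplifications.

```ts
// Illustrative sketch only: stop starting new work once the time budget is nearly spent.
type Slice = () => Promise<void>

interface SweepOptions {
  deadlineMs: number // absolute time by which the handler must have returned
  maxSliceMs: number // assumed upper bound on the duration of one slice
  now?: () => number // injectable clock so the loop can be tested
}

async function boundedSweep(
  queue: Slice[],
  { deadlineMs, maxSliceMs, now = Date.now }: SweepOptions,
): Promise<{ ran: number; remaining: number }> {
  let ran = 0
  // Start a slice only if a worst-case slice would still finish before the deadline.
  while (queue.length > 0 && now() + maxSliceMs <= deadlineMs) {
    const slice = queue.shift()!
    await slice()
    ran++
  }
  // Whatever is left stays queued for the next cron wake-up.
  return { ran, remaining: queue.length }
}

async function demoSweep() {
  let clock = 0 // fake clock in milliseconds
  const queue: Slice[] = Array.from({ length: 5 }, () => async () => {
    clock += 4000 // each pretend slice takes 4 seconds
  })
  return boundedSweep(queue, { deadlineMs: 10000, maxSliceMs: 4000, now: () => clock })
}

demoSweep().then((result) => console.log(result))
```

With a 10 second budget and 4 second slices, the loop runs two slices and leaves three for the next wake-up rather than risking a third that would overrun the deadline.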

## Remaining Library Gaps

Package publication currently leaks monorepo internals. The published
`@tanstack/workflow-runtime`, `@tanstack/workflow-netlify`, and
`@tanstack/workflow-store-drizzle-postgres` manifests still contain
`workspace:*` dependencies. External apps need normal dependency ranges in
published packages.

The Drizzle/Postgres store ships `ensureSchema()`, but production apps often
want generated or exported migration SQL. Copying SQL from package internals is
easy to get wrong. The store package should expose a stable migration artifact
or a Drizzle schema helper.

The runtime store supports `claimStaleRuns`, but the sweep path should clearly
document whether stale running runs are reclaimed by `runtime.sweep()` today or
whether hosts need a separate stale-run pass.

Admin visibility APIs are enough for a POC (`listRuns`, `getRunTimeline`), but
a future adapter story should include a small generic status panel pattern so
every app does not redraw the same table.
Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
# Intent Sync Workflow POC

TanStack.com dogfoods TanStack Workflow for the Intent registry background
sync. The app owns workflow definitions and domain idempotency. The Workflow
runtime, Netlify adapter, and Postgres store own scheduling, replay, timers,
leases, and status visibility.

## Runtime Shape

The user-land workflow code lives in
`src/utils/intent-workflows.server.ts`.

- `intent-discover-workflow`
  - Step: `discover-intent-packages`
  - Operation: npm/GitHub package discovery and version enqueueing
  - Schedule: registered in the runtime as `intent-discover-every-6h`

- `intent-process-workflow`
  - Step: `select-pending-versions`
  - Per-version steps: `process-version:${version.id}`
  - Operation: tarball skill extraction and synced/failed marking
  - Schedule: registered in the runtime as `intent-process-every-15m`

Runtime registration is composed in `src/utils/workflow-registrations.server.ts`.
The site-wide runtime lives in `src/utils/workflow-runtime.server.ts`:

```ts
export const workflowRuntime = createAppWorkflowRuntime()
```

That runtime uses:

- `@tanstack/workflow-runtime` for workflow registration, schedules, leases,
  timers, and sweeps
- `@tanstack/workflow-store-drizzle-postgres` for durable Postgres persistence
- `@tanstack/workflow-netlify` for the Netlify scheduled sweep handler

## Netlify Wake-Up

Netlify does not run a long-lived worker. It only wakes the runtime.

`netlify/functions/workflow-sweep-background.ts` runs every 5 minutes:

```ts
export default createNetlifyWorkflowSweepHandler({ runtime: workflowRuntime })
export const config = createNetlifyWorkflowSweepConfig({
  schedule: '*/5 * * * *',
})
```

Each sweep materializes due schedule buckets and starts deterministic workflow
runs from the durable store. The default scheduled run ID shape is:

```txt
${workflowId}:${scheduleId}:${bucketTimestamp}
```

Examples:

- `intent-discover-workflow:intent-discover-every-6h:1779796800000`
- `intent-process-workflow:intent-process-every-15m:1779796800000`

The scheduled function is deliberately stateless. If it is delivered late or
twice, the runtime/store should claim each bucket once.
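
The claim-once behavior follows from the run ID being a pure function of the schedule and its time bucket. The sketch below assumes buckets are aligned by flooring the delivery time to the schedule interval, which is consistent with the example IDs above but is not confirmed as the runtime's actual algorithm. `bucketTimestamp`, `scheduledRunId`, and `ToyBucketStore` are hypothetical names, and the in-memory `Set` stands in for a uniqueness guarantee in the durable store.

```ts
// Illustrative sketch only: deterministic run IDs make duplicate deliveries harmless.
const FIFTEEN_MINUTES_MS = 15 * 60 * 1000

// Assumption: a bucket is the delivery time floored to the schedule interval.
function bucketTimestamp(nowMs: number, intervalMs: number): number {
  return Math.floor(nowMs / intervalMs) * intervalMs
}

// Matches the documented shape ${workflowId}:${scheduleId}:${bucketTimestamp}.
function scheduledRunId(workflowId: string, scheduleId: string, bucketTs: number): string {
  return `${workflowId}:${scheduleId}:${bucketTs}`
}

// Stand-in for the durable store: the first claim wins, later claims are rejected.
class ToyBucketStore {
  private claimed = new Set<string>()

  claim(runId: string): boolean {
    if (this.claimed.has(runId)) return false
    this.claimed.add(runId)
    return true
  }
}

const bucketStore = new ToyBucketStore()
const bucketStart = 1779796800000

// Two cron deliveries land in the same 15-minute bucket, one slightly late.
const onTimeRunId = scheduledRunId(
  'intent-process-workflow',
  'intent-process-every-15m',
  bucketTimestamp(bucketStart + 1000, FIFTEEN_MINUTES_MS),
)
const lateRunId = scheduledRunId(
  'intent-process-workflow',
  'intent-process-every-15m',
  bucketTimestamp(bucketStart + 200000, FIFTEEN_MINUTES_MS),
)

const firstClaim = bucketStore.claim(onTimeRunId)
const duplicateClaim = bucketStore.claim(lateRunId)
console.log({ onTimeRunId, lateRunId, firstClaim, duplicateClaim })
```

Both deliveries resolve to the same run ID, so only the first claim starts a run and the handler itself needs no state.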

## Durable Store

Workflow persistence is generic and shared by every workflow in the app. The
Intent sync does not get custom workflow runtime tables.

The runtime store tables are:

- `workflow_runs`
- `workflow_run_states`
- `workflow_event_locks`
- `workflow_events`
- `workflow_timers`
- `workflow_signal_deliveries`
- `workflow_schedules`
- `workflow_schedule_buckets`

The app schema mirrors the upstream store schema in `src/db/schema.ts`. The SQL
migration is `drizzle/migrations/0000_workflow_run_store.sql`.

Note: `drizzle/migrations` is ignored in this repo, so that migration must be
force-added if this branch is staged.

## Intent Queue

The Intent domain queue is unchanged:

- `intent_package_versions.sync_status = 'pending'` means discovered but not
  indexed.
- `sync_status = 'synced'` means skills were extracted and stored.
- `sync_status = 'failed'` means the row is still retryable by later process
  runs.

Workflow records orchestration progress and replay events. Intent tables remain
the business-state source of truth.

## Admin Visibility

The existing Intent admin page lists recent workflow runs by calling the
workflow store's `listRuns` API for the two Intent workflow IDs. This is minimal
visibility for scheduled sync status without exposing a generic workflow admin
surface yet.

## Manual Testing

Use the repo-local workflow CLI to test workflow behavior without Netlify cron:

```bash
pnpm workflow list-workflows
pnpm workflow ensure-schema
pnpm workflow list-runs --workflow-id intent-process-workflow
```

Start one workflow run directly:

```bash
pnpm workflow start intent-process-workflow \
  --input '{"batchSize":1,"source":"admin"}' \
  --events
```

Exercise the same model as the Netlify scheduled function:

```bash
pnpm workflow sweep --events false
```

Inspect a run:

```bash
pnpm workflow timeline <run-id> --events false
```

For automated unit tests, instantiate `createAppWorkflowRuntime` with
`inMemoryWorkflowExecutionStore()` and pass only the workflow registrations
under test. That keeps Workflow user-land tests independent of Postgres,
Netlify, npm, and GitHub.

## Current Gap

The published workflow adapter packages currently reference internal workflow
dependencies with `workspace:*`. TanStack.com uses `.pnpmfile.cjs` to rewrite
those package manifests to published versions during install. That file should
be removed once the Workflow packages are republished with normal dependency
ranges.
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
CREATE TABLE IF NOT EXISTS "workflow_runs" (
  "run_id" text PRIMARY KEY,
  "workflow_id" text NOT NULL,
  "workflow_version" text,
  "status" text NOT NULL,
  "input" jsonb NOT NULL,
  "output" jsonb,
  "error" jsonb,
  "waiting_for" jsonb,
  "pending_approval" jsonb,
  "wake_at" bigint,
  "lease_owner" text,
  "lease_expires_at" bigint,
  "created_at" bigint NOT NULL,
  "updated_at" bigint NOT NULL
);
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "workflow_runs_status_idx" ON "workflow_runs" ("status", "updated_at");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "workflow_runs_lease_idx" ON "workflow_runs" ("status", "lease_expires_at");
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "workflow_run_states" (
  "run_id" text PRIMARY KEY,
  "workflow_id" text NOT NULL,
  "workflow_version" text,
  "status" text NOT NULL,
  "input" jsonb NOT NULL,
  "output" jsonb,
  "error" jsonb,
  "waiting_for" jsonb,
  "pending_approval" jsonb,
  "created_at" bigint NOT NULL,
  "updated_at" bigint NOT NULL
);
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "workflow_event_locks" (
  "run_id" text PRIMARY KEY,
  "created_at" bigint NOT NULL
);
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "workflow_events" (
  "run_id" text NOT NULL,
  "event_index" integer NOT NULL,
  "event_type" text NOT NULL,
  "step_id" text,
  "event" jsonb NOT NULL,
  "created_at" bigint NOT NULL,
  CONSTRAINT "workflow_events_run_id_event_index_pk" PRIMARY KEY ("run_id", "event_index")
);
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "workflow_events_type_idx" ON "workflow_events" ("run_id", "event_type");
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "workflow_timers" (
  "run_id" text NOT NULL,
  "signal_id" text NOT NULL,
  "workflow_id" text NOT NULL,
  "workflow_version" text,
  "wake_at" bigint NOT NULL,
  "lease_owner" text,
  "lease_expires_at" bigint,
  CONSTRAINT "workflow_timers_run_id_signal_id_pk" PRIMARY KEY ("run_id", "signal_id")
);
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "workflow_timers_due_idx" ON "workflow_timers" ("wake_at", "lease_expires_at");
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "workflow_signal_deliveries" (
  "run_id" text NOT NULL,
  "signal_id" text NOT NULL,
  "created_at" bigint NOT NULL,
  CONSTRAINT "workflow_signal_deliveries_run_id_signal_id_pk" PRIMARY KEY ("run_id", "signal_id")
);
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "workflow_schedules" (
  "schedule_id" text PRIMARY KEY,
  "workflow_id" text NOT NULL,
  "workflow_version" text,
  "schedule" jsonb NOT NULL,
  "overlap_policy" text NOT NULL,
  "input" jsonb,
  "next_fire_at" bigint,
  "enabled" boolean NOT NULL,
  "updated_at" bigint NOT NULL
);
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "workflow_schedules_due_idx" ON "workflow_schedules" ("enabled", "next_fire_at");
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "workflow_schedule_buckets" (
  "schedule_id" text NOT NULL,
  "bucket_id" text NOT NULL,
  "workflow_id" text NOT NULL,
  "workflow_version" text,
  "run_id" text NOT NULL,
  "fire_at" bigint NOT NULL,
  "input" jsonb,
  "overlap_policy" text NOT NULL,
  "status" text NOT NULL,
  "lease_owner" text,
  "lease_expires_at" bigint,
  "started_at" bigint,
  CONSTRAINT "workflow_schedule_buckets_schedule_id_bucket_id_pk" PRIMARY KEY ("schedule_id", "bucket_id")
);
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "workflow_schedule_buckets_lease_idx" ON "workflow_schedule_buckets" ("status", "fire_at", "lease_expires_at");
