Epic: Restate-backed Effect-native workflow engine (alternative to the faithful binding) #764

@schickling-assistant

Problem / motivation

The shipped @overeng/restate-effect (#757) is, by decision 0001, a thin, faithful binding: Restate's own model — Services, Virtual Objects, Workflows, and the durable ctx primitives (ctx.run, ctx.sleep, awakeables, durable promises, keyed state, service-to-service calls) — surfaced as Effect-returning combinators. Restate's programming model is the API surface; handler code is coupled to Restate's model on purpose, for mechanical sympathy with the engine.

This epic proposes the inverse, complementary package: take an Effect-native workflow API (workflows / activities / durable steps / signals / child-workflows expressed in pure Effect terms — the shape of @effect/workflow) and run it on Restate as the durable-execution engine underneath. Here the surface is Effect's own workflow abstractions; Restate provides the durable substrate (journaling, deterministic replay, retries, suspension, durable state, awakeables) as a pluggable backend — an implementation detail rather than the surface.

Why this is worth exploring:

  • Effect-idiomatic authoring above Restate's model. Author durable workflows in Effect's own vocabulary (Workflow.make / Activity.make / DurableClock / DurableDeferred) rather than learning Restate's construct taxonomy.
  • Portability across durable backends. @effect/workflow already abstracts execution behind a WorkflowEngine service; the same workflow can run on the in-memory engine, the cluster engine, or (this proposal) a Restate-backed engine — chosen at the Layer level, not in the workflow code.
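
As a rough illustration of the portability point, the sketch below shows workflow code that depends only on an engine interface, with the backend picked by the caller. It is a toy model under loud assumptions: `ToyEngine`, `makeMemoryEngine`, `makeJournalingEngine`, and `greetWorkflow` are hypothetical names invented here, not the `WorkflowEngine` service from @effect/workflow, and no Effect or Restate APIs are used.

```typescript
// Toy model only. ToyEngine and both factories are hypothetical names,
// not the @effect/workflow WorkflowEngine API.
interface ToyEngine {
  readonly name: string
  // Runs `effect` at most once per (executionId, activityName) and memoizes the result.
  runActivity<A>(executionId: string, activityName: string, effect: () => A): A
}

// Backend 1: results live in process memory only.
const makeMemoryEngine = (): ToyEngine => {
  const results = new Map<string, unknown>()
  return {
    name: "memory",
    runActivity: <A>(executionId: string, activityName: string, effect: () => A): A => {
      const key = `${executionId}/${activityName}`
      if (!results.has(key)) results.set(key, effect())
      return results.get(key) as A
    },
  }
}

// Backend 2: stand-in for a durable backend. It appends every first-time
// result to a caller-owned journal array.
const makeJournalingEngine = (journal: Array<{ key: string; value: unknown }>): ToyEngine => ({
  name: "journaling",
  runActivity: <A>(executionId: string, activityName: string, effect: () => A): A => {
    const key = `${executionId}/${activityName}`
    const hit = journal.find((entry) => entry.key === key)
    if (hit !== undefined) return hit.value as A
    const value = effect()
    journal.push({ key, value })
    return value
  },
})

// The workflow depends on the interface only, so the backend is chosen by the caller.
const greetWorkflow = (engine: ToyEngine, executionId: string, who: string): string =>
  engine.runActivity(executionId, "build-greeting", () => `hello ${who}`).toUpperCase()

const toyJournal: Array<{ key: string; value: unknown }> = []
const fromMemory = greetWorkflow(makeMemoryEngine(), "exec-1", "ada")
const fromJournal = greetWorkflow(makeJournalingEngine(toyJournal), "exec-1", "ada")
```

In the real proposal the equivalent swap would happen by providing a different Layer for the engine service; the toy passes the engine as a plain argument only to stay dependency-free.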

Contrast with @overeng/restate-effect

These are complementary, not competitors — different altitudes, and this proposal deliberately does not change the binding.

| | @overeng/restate-effect (#757) | This proposal |
| --- | --- | --- |
| Surface | Restate's model (Services / VOs / Workflows / ctx.*) | Effect's workflow model (Workflow/Activity/DurableClock/DurableDeferred) |
| Restate's role | The programming model and the engine | The engine underneath an Effect-native API |
| Portability | None (coupled to Restate by design) | Backend-portable across WorkflowEngine impls |
| Decision 0001 stance | This is the faithful binding | This is the "@effect/workflow-as-the-engine" path 0001 explicitly rejected for the binding |

Decision 0001 and the vision ("What This Is Not") rule out a pluggable-engine facade for the binding. That rejection is what makes this a separate package: the binding stays faithful; this explores the engine-underneath path as its own thing. Crucially, the binding has already solved the hard sub-problems this layer would reuse rather than reimplement — so a Restate-backed engine can sit as a higher layer on top of the binding's lower layers:

  • Determinism layer (decision 0004) — journaled Clock/Random + explicit durable waits.
  • Serde — Schema-based wire/journal encoding incl. field redaction.
  • Endpoint / boundary — scoped Layer, error-channel → terminal-error transport, cancellation ↔ interruption.
  • OTel bridge — one coherent trace, exactly-once-on-replay emission.
  • Docker-free testing harness — native restate-server as a scoped Layer (plus an in-memory test context).

Design space / open questions

  1. Build on @effect/workflow's engine interface vs. define our own. @effect/workflow already exposes a WorkflowEngine service with register / execute / poll / interrupt / resume / activityExecute / deferredResult / deferredDone / scheduleClock, and a lower-level WorkflowEngine.makeUnsafe(...) constructor explicitly intended for custom backends (ClusterWorkflowEngine and WorkflowEngine.layerMemory are the two in-tree implementations). The leading option is a third implementation of that same service, backed by Restate — reusing Effect's Workflow/Activity/DurableClock/DurableDeferred authoring surface unchanged. Open question: is that interface a good fit for Restate's grain, or do we want a narrower bespoke surface?
  2. Mapping Effect workflow primitives onto Restate. Candidate mapping to validate:
    • register / execute → a Restate Workflow run + ingress submit/attach.
    • activityExecute (run-once unless retried) → ctx.run (journaled, exactly-once side effect).
    • scheduleClock / DurableClock.sleep → ctx.sleep durable timer.
    • DurableDeferred (token / await / succeed) → awakeables + durable promises.
    • child-workflow / fan-out → Restate service-to-service calls and Virtual Objects for per-key state.
    • interrupt / resume / poll → Restate cancellation, suspend/resume, ingress attach/output.
  3. Reconciling two replay models. @effect/workflow has its own execution/replay semantics; Restate journals at the ctx boundary. The open question is where the single source of replay truth lives — the goal is to reuse the binding's determinism layer (decision 0004) so journaled Clock/Random and explicit durable waits stay correct-by-construction rather than introducing a second replay mechanism that fights Restate's.
  4. Goal framing: portability vs. ergonomics. Is the primary win backend-portability (same workflow on memory / cluster / Restate) or Effect-ergonomics (a nicer authoring surface that happens to run on Restate)? This shapes whether we conform exactly to @effect/workflow's contract or diverge.
  5. Layering on the binding. Confirm the engine can consume the binding's serde / endpoint / OTel / testing as lower layers without reaching around them.
  6. Deployment / runtime story. How a workflow app is served and registered against a restate-server, how versioning/upgrades interact with Effect-side workflow definitions, and the single-binary vs. cluster operational model.
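
To make open questions 2 and 3 more concrete, here is a minimal sketch of journal-driven replay in which the journal is the single source of truth. It is a toy under stated assumptions: `ToyCtx`, `Suspended`, `execute`, and `orderWorkflow` are hypothetical names, and the code is neither Restate's SDK nor @effect/workflow. `run` stands in for the activityExecute → ctx.run candidate mapping, and `awaitSignal` for the DurableDeferred → awakeable candidate mapping.

```typescript
// Toy replay model. Every name below is hypothetical and for illustration only.
type Journal = Map<string, unknown>

// Thrown to unwind the workflow when it must wait for an external signal.
class Suspended {
  readonly waitingOn: string
  constructor(waitingOn: string) {
    this.waitingOn = waitingOn
  }
}

interface ToyCtx {
  // Journaled side effect: executes once, later replays read the journal.
  run<A>(name: string, sideEffect: () => A): A
  // Durable wait: returns the journaled signal value or suspends the execution.
  awaitSignal<A>(name: string): A
}

const makeCtx = (journal: Journal): ToyCtx => ({
  run: <A>(name: string, sideEffect: () => A): A => {
    const key = `run:${name}`
    if (!journal.has(key)) journal.set(key, sideEffect())
    return journal.get(key) as A
  },
  awaitSignal: <A>(name: string): A => {
    const key = `signal:${name}`
    if (!journal.has(key)) throw new Suspended(name)
    return journal.get(key) as A
  },
})

type Outcome<A> =
  | { readonly _tag: "Done"; readonly value: A }
  | { readonly _tag: "Suspended"; readonly waitingOn: string }

// Re-runs the workflow from the top; already-journaled steps are not re-executed.
const execute = <A>(journal: Journal, workflow: (ctx: ToyCtx) => A): Outcome<A> => {
  try {
    return { _tag: "Done", value: workflow(makeCtx(journal)) }
  } catch (e) {
    if (e instanceof Suspended) return { _tag: "Suspended", waitingOn: e.waitingOn }
    throw e
  }
}

let chargeCount = 0
const orderWorkflow = (ctx: ToyCtx): string => {
  const receipt = ctx.run("charge", () => {
    chargeCount += 1
    return `receipt-${chargeCount}`
  })
  const approver = ctx.awaitSignal<string>("approval")
  return `${receipt} approved by ${approver}`
}

const journal: Journal = new Map()
const first = execute(journal, orderWorkflow) // suspends on the missing signal
journal.set("signal:approval", "ada") // an external party resolves the signal
const second = execute(journal, orderWorkflow) // replays "charge", then completes
```

The first execution suspends after the charge has been journaled; the second replays from the top without charging again. A second replay mechanism layered on top would need to agree with this journal on every step, which is the conflict open question 3 aims to avoid.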

Acceptance criteria

Start with a design spike, not a full build:

  • A design spike documenting the primitive mapping (open question 2) and how Effect-native replay reconciles with Restate journaling (open question 3), with a clear recommendation on build-on-@effect/workflow vs. bespoke (open question 1).
  • A VRS proposal (vision + initial decisions) capturing the engine-underneath stance and its relationship to decision 0001 — explicitly complementary, not a replacement.
  • A small POC: exactly one Effect-native workflow (one workflow + one activity + one durable sleep + one durable-deferred signal) running durably on a native restate-server via a Restate-backed WorkflowEngine, reusing the binding's determinism layer and Docker-free testing harness.
  • A clear feedback loop: the POC runs under the existing native-server integration lane (no Docker), so "the Effect-native workflow ran durably on Restate" is an automated, repeatable check.

Out of scope for v1: full coverage of every @effect/workflow feature, child-workflow trees, serverless targets, production-grade upgrade/versioning. Those follow only if the spike + POC validate the approach.

Posted on behalf of @schickling
| field | value |
| --- | --- |
| agent_name | 🥇 cl2-pyrite |
| agent_session_id | a71a7fca-0ccf-427e-9e57-51aea841dc74 |
| agent_tool | Claude Code |
| agent_tool_version | 2.1.165 |
| agent_runtime | Claude Code 2.1.165 |
| agent_model | claude-opus-4-8 |
| runtime_profile | /nix/store/sz4ll7nq7qbwcsw65pw13w5hw61lnvk5-coding-agent-runtime-profile/share/coding-agents/profile.json |
| skills_manifest | /nix/store/nhbhipdhwmcqh669bpr15g39hr17cqbb-agent-skills-corpus/share/agent-skills/manifest.json |
| worktree | effect-utils/schickling/2026-06-08-restate-effect |
| machine | dev3 |
| tooling_profile | dotfiles@7360c0d |
