Add workflow DAG reporting to nf-tower plugin by ewels · Pull Request #7218 · nextflow-io/nextflow

ewels · 2026-06-10T21:30:37Z

Summary

Serialize the run execution DAG faithfully and send it to Seqera Platform once at run start (on the begin request), so Platform (or any consumer) can render a live execution graph.

The plugin's responsibility is deliberately minimal and stable: it serializes session.getDag() and sends it as-is — no reduction, filtering or reinterpretation. All interpretation (collapsing operator/channel vertices to a process-only graph, building a subworkflow hierarchy, choosing a layout) happens consumer-side, where it can evolve without a Nextflow release. The DAG is small (KB-scale even for large pipelines), so it's sent whole.

What's in here

New core serializer — modules/nextflow/src/main/groovy/nextflow/dag/DagSerializer.groovy

static Map toMap(DAG dag) mirrors the DAG model 1:1:

{
  "dag": {
    "vertices": [ { "id": "<stable id>", "type": "PROCESS", "label": "<vertex label>", "scope": "<enclosing workflow>" } ],
    "edges":    [ { "source": "<id>", "target": "<id>", "label": "<optional channel label>" } ]
  }
}

Every vertex type (PROCESS/OPERATOR/ORIGIN/NODE) and every edge is emitted — operator/channel vertices between processes are preserved, never dropped.
It lives in the core nextflow.dag package alongside the existing DOT/GEXF/Mermaid renderers because DAG.Type, addVertex and createVertex are @PackageScope — placing it there keeps it @CompileStatic-clean and lets the faithful unit test construct PROCESS+OPERATOR vertices. nf-tower just invokes it.

TowerObserver — adds renderDag(session) (null-safe; swallows errors so DAG reporting can never abort a run) and wires the result into makeBeginReq as an optional, additive top-level dag field (omitted when empty/unavailable). No existing fields changed.

Key decisions

Lifecycle point: onFlowBegin. Verified via ScriptRunner.run(): scriptLoader.runScript() builds the entire dataflow network (and thus the DAG) before session.fireDataflowNetwork() → notifyFlowBegin(). CH.broadcast(), which runs after onFlowBegin, uses GPars' native queue.into()/MixOp and adds no NodeMarker DAG vertices — so the DAG is structurally complete at begin. Sent exactly once.
FQ process name source: Vertex.label directly. When a process runs inside a subworkflow, BindableDef clones it with prefix + SCOPE_SEP + name, so process.name (stored as Vertex.label) is already the fully-qualified name (e.g. RNASEQ:ALIGN:STAR). We also emit scope (Vertex.workflow, the enclosing subworkflow) so the consumer never has to guess. The stable edge join uses Vertex.id (the AtomicLong, stable across normalize() unlike the positional v0/v1 name).
normalize() before serializing — resolves channel-derived edge labels and fills ORIGIN/NODE boundary vertices, matching the file renderers. Idempotent, so it's safe alongside GraphObserver's own normalize() at completion.

Backwards compatibility

The dag field is optional and additive. Older Platform/consumers ignore the unknown field, and the plugin never fails if the server doesn't accept it. No existing fields changed.

Tests

./gradlew :plugins:nf-tower:test and :nextflow:test — all green.

DagSerializerTest (core): faithful vertex/edge serialization with an operator between two processes (asserts it's preserved), FQ name + scope exposure, boundary-vertex fill-in, null/empty handling.
TowerObserverTest: dag present in begin request, omitted when null, renderDag empty/exception null-safety, populated-DAG operator-chain check.
TowerJsonGeneratorTest: the dag map round-trips through the real wire generator with operators and FQ labels/scope intact.

E2E: ran a pipeline with a named subworkflow (SUB) via ./launch.sh … -with-dag — the DAG showed FQ names SUB:FOO/SUB:BAR, the intervening map operator, boundary vertices, and labeled edges (ch, collected) — exactly the data DagSerializer emits.

🤖 Generated with Claude Code

Serialize the run execution DAG faithfully and send it to Seqera Platform once at run start, so consumers can render a live execution graph. A new `nextflow.dag.DagSerializer` mirrors the DAG model one-to-one (every vertex and edge, no filtering or reduction) into a plain map with the shape `{vertices:[{id,type,label,scope}], edges:[{source,target,label}]}`. It lives in the core `nextflow.dag` package alongside the existing DOT/GEXF/Mermaid renderers, where `DAG.Type`/`addVertex`/`createVertex` (package-scoped) are accessible. For PROCESS vertices `label` is the fully-qualified process name (scope prefix included) and `scope` is the enclosing subworkflow, so the FQ name stays reconstructable. Edges join via the stable per-vertex id. `TowerObserver` serializes the DAG in `makeBeginReq` as an optional, additive `dag` field on the begin payload. The DAG is complete at `onFlowBegin`: the script body builds the full dataflow network before `fireDataflowNetwork` fires the begin event, and `CH.broadcast()` (which runs afterwards) adds no DAG vertices. Serialization is null-safe and swallows errors so DAG reporting can never disrupt a run; older consumers ignore the unknown field. Note: the consumer side (accepting `dag` on begin, persisting it, reducing and serving it for the graph view) is a separate seqeralabs/platform PR; the two should land together. Signed-off-by: Phil Ewels <phil.ewels@seqera.io>

netlify · 2026-06-10T21:30:42Z

✅ Deploy Preview for nextflow-docs-staging ready!

Name	Link
🔨 Latest commit	`178aabe`
🔍 Latest deploy log	https://app.netlify.com/projects/nextflow-docs-staging/deploys/6a2aa810c7bc330008086390
😎 Deploy Preview	https://deploy-preview-7218--nextflow-docs-staging.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

pditommaso

This is a good idea — sending the DAG faithfully at run start and leaving all interpretation to the consumer is the right call, and the design claims hold up against the code. I verified the key points locally:

DAG complete at onFlowBegin — confirmed: scriptLoader.runScript() builds the full dataflow network before session.fireDataflowNetwork(), and notifyFlowBegin() fires first there. The subsequent CH.broadcast() (queue bridging and topic MixOp) never calls NodeMarker.addOperatorNode, so no vertices are added after the snapshot. Serialization also happens synchronously on the main thread before igniters run, so no concurrent-mutation risk.
normalize() idempotency — confirmed, safe alongside GraphObserver's own normalize() at completion.
Wire format — TowerJsonGenerator's scheme map only truncates listed paths, it's not a whitelist, so the dag field passes through as-is.
Tests — ran DagSerializerTest, TowerObserverTest and TowerJsonGeneratorTest on this branch, all green. Coverage is genuinely good: the operator-preservation property is asserted at three layers (core serializer, observer, real wire generator round-trip).

To address

renderDag catches Exception, but the DAG invariant throws AssertionError. DAG.normalizeMissingVertices() enforces its invariant with a Groovy assert (DAG.groovy:314), which throws AssertionError — an Error, not an Exception — so it would escape the catch and abort the run, contradicting the stated "can never abort a run" contract. Unlikely in practice, but worth catching Throwable (or Exception | AssertionError) to make the contract bulletproof.

Nice to have (non-blocking)

"Stable id" wording is slightly misleading. Vertex.nextID is a static JVM-global AtomicLong — ids aren't deterministic across runs and won't match between a run and its -resume. They're only stable within one payload, which is sufficient for edge joins, but a one-word doc tweak ("payload-scoped id") would prevent a consumer treating them as cross-run identity.
DagSerializer.toMap() mutates its argument via normalize() — a side effect in a method named toMap. Documented and idempotent, so acceptable, but a serialize(DAG) name or an explicit note would make the mutation harder to miss.
Raw LinkedHashMap/Map types in DagSerializer — Map<String,Object> would be cleaner under @CompileStatic.
entry.scope = v.workflow ?: null deliberately converts the empty-string default from createVertex to null — correct, but a short comment would explain the ?:.

…docs - renderDag now catches Throwable, not just Exception: DAG.normalize() enforces its invariant with a Groovy assert (an AssertionError), so the 'DAG reporting can never abort a run' contract now holds for that case too. - Clarify that vertex ids are payload-scoped (static JVM-global counter, not deterministic across runs or a -resume), not cross-run stable. - Document the in-place normalize() side effect of DagSerializer.toMap(). - Parameterize raw Map/LinkedHashMap types as Map<String,Object>. - Comment the empty-string-to-null scope coercion. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Phil Ewels <phil.ewels@seqera.io>

ewels · 2026-06-11T12:17:01Z

Thanks for the thorough review — all points addressed in f65bf8e.

To address

renderDag now catches Throwable. You're right that DAG.normalize() → normalizeMissingVertices() enforces its invariant with a Groovy assert, which throws AssertionError (an Error, not an Exception) and would have escaped the old catch(Exception). Switched to catch(Throwable) with a comment explaining why, so the "DAG reporting can never abort a run" contract holds even in that case.

Nice to have

"Stable id" wording. Reworded the javadoc to "payload-scoped per-vertex id" and added a note that it comes from a static JVM-global AtomicLong, so it's not deterministic across runs and won't match between a run and its -resume — identity only within a single payload.
toMap() side effect. Added a Side effect note in the javadoc spelling out the in-place normalize() mutation and that idempotency keeps it safe alongside GraphObserver. Kept the toMap name (renaming touches the call site plus three test layers) since the documented-note option was acceptable.
Raw types. toMap now returns Map<String,Object> and the internal collections are parameterized too.
Empty-string scope coercion. Added a comment explaining that createVertex defaults workflow to "" and the ?: coerces that root scope back to null.

DagSerializerTest, TowerObserverTest and TowerJsonGeneratorTest all still green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Phil Ewels <phil.ewels@seqera.io>

bentsherman

I think Nextflow should just send a map of process name -> upstream processes. Everything else (operators, channel names, etc) are internal details that are not appropriate for exposure via API. That is, it limits our ability to change the DAG structure because external systems like platform are now dependent on it

ewels · 2026-06-11T18:59:34Z

Hah, I started off with an implementation like that and then rolled it back to be generic instead. Basically the same logic, but in reverse - I was thinking if we just sent the whole DAG then Nextflow could be agnostic and we would be better future proofed for anything we want to do without needing further Nextflow changes.

pditommaso reviewed Jun 11, 2026

View reviewed changes

docs: note DagSerializer in nextflow.dag developer docs

178aabe

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Phil Ewels <phil.ewels@seqera.io>

ewels requested a review from a team as a code owner June 11, 2026 12:20

bentsherman requested changes Jun 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add workflow DAG reporting to nf-tower plugin#7218

Add workflow DAG reporting to nf-tower plugin#7218
ewels wants to merge 3 commits into
masterfrom
dag-reporting-nf-tower

ewels commented Jun 10, 2026 •

edited

Loading

Uh oh!

netlify Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

pditommaso left a comment

Uh oh!

ewels commented Jun 11, 2026

Uh oh!

bentsherman left a comment

Uh oh!

ewels commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ewels commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's in here

Key decisions

Backwards compatibility

Tests

Uh oh!

netlify Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for nextflow-docs-staging ready!

Uh oh!

pditommaso left a comment

Choose a reason for hiding this comment

To address

Nice to have (non-blocking)

Uh oh!

ewels commented Jun 11, 2026

Uh oh!

bentsherman left a comment

Choose a reason for hiding this comment

Uh oh!

ewels commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ewels commented Jun 10, 2026 •

edited

Loading

netlify Bot commented Jun 10, 2026 •

edited

Loading