ADR: Remote pipeline inclusion by bentsherman · Pull Request #7213 · nextflow-io/nextflow

bentsherman · 2026-06-10T00:19:29Z

This PR adds an ADR for remote pipeline inclusion, aka "meta-pipelines".

It describes an approach for including remote pipelines into a meta-pipeline in a way that preserves dataflow concurrency between pipeline inputs/outputs.

It discusses alternative approaches such as pipeline chaining / nf-cascade and why they don't satisfy certain use cases (preserving dataflow concurrency).

It also walks through a basic example of fetchngs -> rnaseq.

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

netlify · 2026-06-10T00:19:39Z

✅ Deploy Preview for nextflow-docs-staging ready!

Name	Link
🔨 Latest commit	`09c9e96`
🔍 Latest deploy log	https://app.netlify.com/projects/nextflow-docs-staging/deploys/6a2ae8fb06f1bf00081bb8b5
😎 Deploy Preview	https://deploy-preview-7213--nextflow-docs-staging.netlify.app

ewels · 2026-06-10T06:03:33Z

Great write up, thanks for this Ben!

As you might expect, I'm most concerned about the params. You characterise it as a one-off cost which is mitigated by LLMs, however that doesn't take into account updates to included pipelines (a core functionality with included modules). The params drift with updates would be dangerous and a constant source of dev work.

I'd still love to look into how we could bulk import nested config and apply it at root level. Even if it is a separate import + apply mechanism (eg. like config profiles in a sense?). I think without it, the use of the meta pipeline functionality is substantially limited.

jorgee · 2026-06-10T06:45:48Z

+3. No use of project-level assets (`projectDir`, `bin`, `lib`) within the core workflow. Module-level assets can be used through the module `resources/` bundle and `moduleDir`.
+4. Declare software dependencies (`container`, `conda`) in the process definition, not in config.
+5. No default `ext` settings in config -- specify these defaults in the process definition or use explicit process inputs. Otherwise, any default `ext` settings must be replicated manually in the meta-pipeline.
+6. No plugin functions within the core workflow.


It's not clear to me why some of these best practices matter or what goes wrong if they aren't followed; it might be good to add an example.

Following these guidelines makes it so that when you include the core workflow and its dependent modules/subworkflows, it is self-contained

For example:

if the core workflow uses project-level assets like bin or lib, I have to remember to copy them into the meta-pipeline

if the core workflow uses a param directly and I import that into the meta-pipeline, I have to remember to define the same param (with the same meaning) in the meta-pipeline

and so on
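
To make the params point concrete, here is a minimal sketch in current Nextflow syntax. All names in it (`ALIGN`, `RNASEQ_CORE`, `params.input`, `params.aligner`, and the samplesheet columns `sample` and `fastq`) are hypothetical and not taken from nf-core/rnaseq. The core workflow receives everything through `take:`, so a meta-pipeline could include it and feed the same inputs from an upstream workflow instead of from params.

```groovy
// Hypothetical process used by the core workflow.
process ALIGN {
    input:
    tuple val(meta), path(reads)
    val aligner

    output:
    tuple val(meta), path("${meta.id}.txt")

    script:
    """
    echo "aligning ${reads} with ${aligner}" > ${meta.id}.txt
    """
}

// Core workflow: self-contained, never reads `params` directly.
workflow RNASEQ_CORE {
    take:
    ch_samples  // channel of [ meta, reads ]
    aligner     // plain value, e.g. 'star'

    main:
    ALIGN(ch_samples, aligner)

    emit:
    aligned = ALIGN.out
}

// Entry workflow (the "shell"): the only place that touches params.
workflow {
    ch_samples = channel
        .fromPath(params.input)
        .splitCsv(header: true)
        .map { row -> [ [id: row.sample], file(row.fastq) ] }

    RNASEQ_CORE(ch_samples, params.aligner)
}
```

If `RNASEQ_CORE` read `params.aligner` internally instead, any pipeline including it would have to define a param with the same name and meaning.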

pinin4fjords · 2026-06-10T08:19:21Z

+    > results/output-rnaseq.json
+```
+
+While pipeline chaining has always been possible in theory, new language features such as [workflow outputs](20251020-workflow-outputs.md) and [record types](20260306-record-types.md) make it much more practical. Each pipeline can define a structured output which can be passed to the next pipeline via JSON. Mismatches between an upstream output and downstream input (e.g. missing columns, different column names) can be resolved by a small adapter pipeline.


I remain to be convinced of the point of pipeline chaining if we can trivially make meta pipelines.

Agreed, I think pipeline chaining is used because metapipelines don't work right now. If they did, the number of pipeline chains drops.

That's not to say they're never useful, but it's much less common.

Two main use cases:

Run major pipeline (sarek, rnaseq) and add a few auxiliary processes

Daisy chain two pipelines (fetchngs -> rnaseq)

Both are solved better by metapipelines than pipeline chaining.

The main use case for daisy chaining is actually wiring nextflow up to non Nextflow tools, e.g. Nextflow into an ETL system. In this case structured inputs and outputs are still very useful.

at this point the value prop of pipeline chaining appears to be low development overhead (just plug A into B)

Well, chaining has development overhead, it's quite a faff, all we have to do is bring meta-pipeline dev under that faff level

pinin4fjords · 2026-06-10T08:42:51Z

As you might expect, I'm most concerned about the params.

Agreed. Feel like we need some sort of auto-import of the params of child workflows, so e.g. they appear automatically in Platform, and I could say e.g. meta.rnaseq.pseudoaligner = 'kallisto' in the meta pipeline's nextflow.config to override.

Then some auto-assembly of docs as well.

Basically we need to standardise at the nextflow level where a bunch of the non-nextflow pieces need to live.

adamrtalbot · 2026-06-10T09:29:55Z

+3. No use of project-level assets (`projectDir`, `bin`, `lib`) within the core workflow. Module-level assets can be used through the module `resources/` bundle and `moduleDir`.
+4. Declare software dependencies (`container`, `conda`) in the process definition, not in config.
+5. No default `ext` settings in config -- specify these defaults in the process definition or use explicit process inputs. Otherwise, any default `ext` settings must be replicated manually in the meta-pipeline.
+6. No plugin functions within the core workflow.


Plugin support feels like a requirement, functions like a webhook or logging statement could be critical for the workflow. The main challenge might be supporting multiple versions (e.g. WORKFLOW1 uses plugin@1.2.3 and WORKFLOW2 uses plugin@2.4.1), but maybe we can just say "ONE PLUGIN ONLY"

plugins for webhooks / logging typically live outside the core workflow. so the meta-pipeline would just import the core workflow logic and decide whether to include those plugins in its own shell

I have yet to see a plugin that is actually used in a workflow's core logic, although it's certainly possible. Most plugins provide third-party integrations at the pipeline boundary

I agree but this might become more popular with the plugin registry + vibe coding.

Sounds like premature optimization on my part; it's easier to just tell people to be careful and deal with it if it becomes a problem.

sure, that's why I call them out as best practices instead of hard rules. you can use a plugin function as long as you remember to declare it in the meta-pipeline config

adamrtalbot · 2026-06-10T09:33:08Z

+2. No `publishDir` -- use the `output` block.
+3. No use of project-level assets (`projectDir`, `bin`, `lib`) within the core workflow. Module-level assets can be used through the module `resources/` bundle and `moduleDir`.
+4. Declare software dependencies (`container`, `conda`) in the process definition, not in config.
+5. No default `ext` settings in config -- specify these defaults in the process definition or use explicit process inputs. Otherwise, any default `ext` settings must be replicated manually in the meta-pipeline.


ext.args is soooooo powerful, yet clearly breaks the interface contract for processes.

I still think we should promote args to a directive and it will solve a number of these issues (process.args) 😉 .

process {
    args "--concise"
    // etc...
}

// main.nf
my_process(ch_inputs, args: "--verbose")

// nextflow.config
process.withName 'my_process' {
    args = "--verbose"
}

both ext.args and process.args can work, as long as the default value for the arg is defined in the process definition rather than in config
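
As a rough illustration using today's `ext.args` (not the proposed `args` directive), the default can live in the process script so that it travels with the process when the workflow is included elsewhere. `MY_PROCESS` and `my_tool` are hypothetical names, and the command is wrapped in `echo` so the sketch runs without the tool installed.

```groovy
process MY_PROCESS {
    input:
    path reads

    output:
    path 'out.txt'

    script:
    // The default is defined here, next to the command it affects.
    def args = task.ext.args ?: '--concise'
    """
    echo my_tool ${args} ${reads} > out.txt
    """
}
```

The including pipeline's config then carries only optional overrides, not defaults the process depends on:

```groovy
// nextflow.config of the including pipeline
process {
    withName: 'MY_PROCESS' {
        ext.args = '--verbose'
    }
}
```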

the core problem is that when I import a workflow, Nextflow doesn't know which config is "tied" to that workflow

adamrtalbot · 2026-06-10T09:33:47Z

+5. No default `ext` settings in config -- specify these defaults in the process definition or use explicit process inputs. Otherwise, any default `ext` settings must be replicated manually in the meta-pipeline.
+6. No plugin functions within the core workflow.
+
+For process directives, it is helpful to distinguish *what* is executed vs *how* it is executed. Directives that affect the *what* (`container`, `ext` settings) should be owned by the process definition. Directives that affect the *how* (`cpus`, `memory`, `executor`, `queue`, `errorStrategy`) should be owned by the meta-pipeline.


I don't understand the distinction here.

in other words, some directives affect the task result while others don't
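
A small config sketch of the "how" side, assuming a hypothetical process `MY_PROCESS` and a hypothetical queue name. None of these settings should change the task result, so under this split they would belong to the meta-pipeline, while `container` and any `ext` defaults stay in the process definition.

```groovy
// nextflow.config of the meta-pipeline: directives that affect how tasks run
process {
    executor      = 'slurm'
    queue         = 'long'     // hypothetical queue name
    errorStrategy = 'retry'

    withName: 'MY_PROCESS' {
        cpus   = 8
        memory = '32 GB'
    }
}
```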

adamrtalbot · 2026-06-10T09:36:20Z

+
+Alternatively, these core plugin dependencies could be specified in the pipeline spec under `requires.plugins`. When installing a pipeline, Nextflow could copy these plugin declarations into the meta-pipeline config and/or spec.
+
+Since this use case is rare -- plugin functions are typically used in the entry workflow outside the core workflow -- it can be deferred in the first iteration.


With more private plugin registries, I expect more utility methods in plugins (e.g. updateLims(sampleId, status)), but maybe this is premature optimization.

A LIMS integration sounds like something that could live outside the core workflow

adamrtalbot · 2026-06-10T09:38:35Z

+    > results/output-rnaseq.json
+```
+
+While pipeline chaining has always been possible in theory, new language features such as [workflow outputs](20251020-workflow-outputs.md) and [record types](20260306-record-types.md) make it much more practical. Each pipeline can define a structured output which can be passed to the next pipeline via JSON. Mismatches between an upstream output and downstream input (e.g. missing columns, different column names) can be resolved by a small adapter pipeline.


Agreed, I think pipeline chaining is used because metapipelines don't work right now. If they did, the number of pipeline chains drops.

That's not to say they're never useful, but it's much less common.

Two main use cases:

Run major pipeline (sarek, rnaseq) and add a few auxiliary processes

Daisy chain two pipelines (fetchngs -> rnaseq)

Both are solved better by metapipelines than pipeline chaining.

The main use case for daisy chaining is actually wiring nextflow up to non Nextflow tools, e.g. Nextflow into an ETL system. In this case structured inputs and outputs are still very useful.

adamrtalbot · 2026-06-10T09:41:31Z

+
+The Nextflow-in-Nextflow approach treats the included pipeline as a *black box* -- it preserves the exact pipeline behavior (core workflow + entry workflow + config) while forfeiting dataflow composition (separate dataflow graphs).
+
+An ideal solution might combine the best of both: compose pipelines into a single dataflow graph (white box) while inheriting each pipeline's params, outputs, and config so they need not be replicated (black box). We considered such a model, where an included pipeline contributes its shell as namespaced, overridable defaults, but rejected it. Dataflow composition fundamentally requires exposing the core workflow as a set of channel ports, so the white-box mechanism is unavoidable; inheritance would only layer implicit behavior on top of it. That behavior comes at a steep cost: it relocates a one-time *write* cost (boilerplate) into a recurring *read* cost (hidden defaults, auto-bound arguments, auto-published outputs), burdens every tool that must now understand it (linter, type checker, config resolution, resume), and conflicts with the frozen-island philosophy that otherwise governs vendored code.


I agree with this. The added complexity is enormous.

@ewels @pinin4fjords @adamrtalbot

Pulling everyone into this thread to talk about auto-inheritance

As you might expect, I'm most concerned about the params. You characterise it as a one-off cost which is mitigated by LLMs, however that doesn't take into account updates to included pipelines (a core functionality with included modules). The params drift with updates would be dangerous and a constant source of dev work.

That's fair, but not my main point. The core problem is this -- if you want to preserve dataflow concurrency between pipelines, then you can't really just auto-import params into the meta-pipeline. You have to define which params are replaced with inter-pipeline wiring vs exposed to the top-level. That amounts to just writing the meta-workflow.

The development overhead is what it is. I suggest the AI skill just as an idea. I'm sure it could also handle updates. All of that is better than having loads of hidden behavior that makes the meta-pipeline impossible to reason about

I'd still love to look into how we could bulk import nested config and apply it at root level. Even if it is a separate import + apply mechanism (eg. like config profiles in a sense?). I think without it, the use of the meta pipeline functionality is substantially limited.

Not sure I understand this point. Most of the config is just standard boilerplate, so it doesn't make sense to auto-import it because you will just get lots of duplicate config

Unless you are talking about ext config. That will depend on whether we can move the default ext settings into the process definition

Building on what Adam said:

In a scenario where I update my workflow from v1.1 to v1.2, an update to params should be explicit in the input block, not implicit and I hope it doesn't change too much.

The nice thing about an explicit meta-pipeline definition is that when I update the included pipeline, the linter / language server will immediately pick up on any inconsistencies, because it's just regular code. I'm not sure the tooling would be able to do that if there was a lot of implicit behavior

adamrtalbot · 2026-06-10T09:43:26Z

+    }
+
+    // perform RNAseq analysis
+    multiqc_report = NFCORE_RNASEQ( ch_samples )


Side note - I would remove MultiQC from all nf-core pipelines and put them in the metapipelines, i.e. no MultiQC repeats, but that's a matter of opinion.

FETCHNGS(ch_inputs)
RNASEQ(FETCHNGS.out)
MULTIQC(RNASEQ.out.qc_files)

I was wondering about that. Wasn't sure if you would want a meta-pipeline to produce one multiqc report per pipeline or just one for the whole thing

adamrtalbot · 2026-06-10T09:49:30Z

As you might expect, I'm most concerned about the params.

Agreed. Feel like we need some sort of auto-import of the params of child workflows, so e.g. they appear automatically in Platform, and I could say e.g. meta.rnaseq.pseudoaligner = 'kallisto' in the meta pipeline's nextflow.config to override.

Then some auto-assembly of docs as well.

Basically we need to standardise at the nextflow level where a bunch of the non-nextflow pieces need to live.

I disagree. Having unpredictable global scope params blocks is just weird and if we were designing Nextflow today we would never include this behaviour. In other languages, globals need to be used with caution and are generally not advised. Having random params.foo.bar.baz with no way of validating or checking is just "something you have to know", instead of being clear to the author.

In a scenario where I update my workflow from v1.1 to v1.2, an update to params should be explicit in the input block, not implicit and I hope it doesn't change too much.

If we really want to make them importable, we could add a dedicated params block to the workflow definition:

workflow THING {
   params:
        foo: Int
        bar: Bool
        baz: String

   take:
   // etc
}

but this doesn't feel very different to:

record ThingParams {
    foo: Int
    bar: Bool
    baz: String
}

workflow THING {
   take:
       params: ThingParams

   // etc
}

adamrtalbot · 2026-06-10T09:55:21Z

My main concern here is versioning of imported workflows. Do we include a lock file or something to ensure consistency or just trust in the files that are copied into the workflow code?

bentsherman · 2026-06-10T13:47:35Z

+
+When a pipeline is included, it is vendored into the meta-pipeline project under `workflows/<scope>/<name>/`. Included pipelines are isolated -- each included pipeline has its own `modules/` and `workflows/` directories. This way, two pipelines can use different versions of the same module without compromising reproducibility.
+
+Included pipelines should be committed to the meta-pipeline repository. The pipeline should have a *pipeline spec* (`nextflow_spec.json`) which specifies the pipeline version, so that Nextflow can track local changes.


@adamrtalbot

My main concern here is versioning of imported workflows. Do we include a lock file or something to ensure consistency or just trust in the files that are copied into the workflow code?

See here. Like modules, we will likely want to have some sort of checksum verification (e.g. .pipeline-info)

I guess the simplest way would be to commit the entire pipeline, even though only the core workflow will be used. Then you can have a single checksum for the entire pipeline directory

It's probably still useful to keep the pipeline shell in the meta-pipeline repo, since e.g. your agent will want to refer to it when updating the meta-pipeline

nf-core copy+pastes modules for subworkflows and it works well!

bentsherman · 2026-06-10T14:00:17Z

+```groovy
+include { NFCORE_FETCHNGS } from 'nf-core/fetchngs'
+include { NFCORE_RNASEQ } from 'nf-core/rnaseq'


For anyone feeling adventurous, here is what Claude and I came up with while exploring auto-inheritance:

include { NFCORE_FETCHNGS } from 'nf-core/fetchngs'
include { NFCORE_RNASEQ } from 'nf-core/rnaseq'

params {
    input: Path                    // meta entry point
    strandedness: String = 'auto'  // one new knob
    // aligner / fasta / ... inherited from rnaseq's params, override on CLI as --rnaseq.fasta=...
}

workflow {
    main:
    ch_ids = channel.fromPath(params.input).splitCsv()
    ch_samples = NFCORE_FETCHNGS( ch_ids )
    ch_samples = ch_samples.map { r -> r + record(strandedness: params.strandedness) }

    // rnaseq.* params automagically passed to rnaseq workflow via named arguments
    NFCORE_RNASEQ( samples: ch_samples )

    // no publish/output blocks: each pipeline's outputs publish under <output-dir>/<pipeline>/
    // question: what if I don't want to publish something (e.g. fetchngs output)?
}

Feel free to take it and run with it...

Exactly the way I was thinking. We just namespace the children's params

pditommaso

Thanks for putting this together, Ben — the dataflow-composition motivation and the rejection of the runtime-inheritance hybrid are both nicely argued. A few thoughts to share before this moves past draft:

1. The key technical challenge could be expanded. At its core this proposes a mechanism to include a fully-fledged Nextflow workflow into another, mimicking how we already include modules and sub-workflows. The part I'd love to see fleshed out is how channels and values get bound into the included workflow's inputs. The ADR sets out the policy (params live at the top level, the core workflow consumes everything via take:) but doesn't yet describe the binding mechanics: how a scalar value vs. a streaming channel is bound at the call site, the value-channel/queue-channel broadcast semantics, and whether a typed take: can accept a bare value type like String/Path. The example here (take: aligner: String) also reads a bit differently from the typed-workflows ADR, where every take: input is a channel type. Since this binding question largely determines feasibility, it'd be great to work it out explicitly.

2. The nomenclature can be better shaped. The document moves between "meta-pipeline" and "remote pipeline", and I think the framing could be sharpened. Terms like workflow modularisation / workflow inclusion / workflow composition might describe what's happening (composing one workflow into another) more directly than introducing a new "meta-pipeline" category.

3. There's some overlap with existing sub-workflow inclusion. Once you discard the entry workflow, params, and output block and import only the core workflow, what's left looks a lot like a sub-workflow. It'd be helpful to clarify how this differs from including a remote sub-workflow, and what the main benefit is that justifies a separate mechanism (separate storage layout, a new nextflow_spec.json, a separate CLI, etc.).

4. A possible framing. I'd lean toward framing the next step as enabling remote sub-workflows — the natural progression after remote modules (processes). Module (process) → sub-workflow → composition feels like a clean, incremental story that reuses the conventions we already have, rather than introducing a "pipeline" as a new top-level artifact with its own resolution rules, storage path, and spec file. If we get remote sub-workflow inclusion right, "meta-pipelines" might largely fall out of it as a usage pattern rather than a new concept.

bentsherman · 2026-06-10T19:01:50Z

@pditommaso thanks for the review

The part I'd love to see fleshed out is how channels and values get bound into the included workflow's inputs.

There isn't much to say here because it just works like normal. In the appendix example, NFCORE_RNASEQ is just a named workflow. The meta-pipeline calls it the same way that rnaseq would call it. The only difference is that some inputs might come from upstream outputs instead of params.

... how a scalar value vs. a streaming channel is bound at the call site, the value-channel/queue-channel broadcast semantics, and whether a typed take: can accept a bare value type like String/Path.

A workflow take can be a channel, a dataflow value, or a regular value. This is how it has always worked
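
For readers less familiar with this, a minimal sketch of the three kinds of take in one named workflow. The workflow `GREET` and its inputs are hypothetical and chosen only for illustration.

```groovy
workflow GREET {
    take:
    ch_names    // queue channel: one greeting per emitted name
    ch_greeting // dataflow value (value channel): reused for every name
    punct       // regular value: an ordinary String

    main:
    ch_out = ch_names
        .combine(ch_greeting)
        .map { name, greeting -> "${greeting}, ${name}${punct}" }

    emit:
    greetings = ch_out
}

workflow {
    GREET(channel.of('Ada', 'Grace'), channel.value('Hello'), '!')
    GREET.out.greetings.view()
}
```

The call site looks the same whether an argument comes from a param or from an upstream workflow's output, which is the sense in which binding "just works like normal".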

The document moves between "meta-pipeline" and "remote pipeline", and I think the framing could be sharpened. Terms like workflow modularisation / workflow inclusion / workflow composition might describe what's happening (composing one workflow into another) more directly than introducing a new "meta-pipeline" category.

"Meta-pipeline" is the top-line feature that everyone is after, but the only actual new feature proposed by the ADR is "remote pipeline inclusion" -- how to install a pipeline as a component and keep it in sync with the source. This is why the ADR is titled "Remote pipeline inclusion". Once you have that, everything else is just normal workflow composition and convention.

They are distinct concepts -- the ADR does not treat them as interchangeable.

Once you discard the entry workflow, params, and output block and import only the core workflow, what's left looks a lot like a sub-workflow. It'd be helpful to clarify how this differs from including a remote sub-workflow, and what the main benefit is that justifies a separate mechanism (separate storage layout, a new nextflow_spec.json, a separate CLI, etc.).

The core workflow looks like a subworkflow because it is a subworkflow 😄

The only new thing that we introduce here is installing a pipeline into a project as a component and keeping it in sync with the remote source (either from Git or the registry). For that you likely need a pipeline spec (version, checksum) and a CLI (installing, updating). I just haven't spelled all that out yet because the bigger question right now is how to minimize developer overhead

I'd lean toward framing the next step as enabling remote sub-workflows — the natural progression after remote modules (processes). Module (process) → sub-workflow → composition feels like a clean, incremental story that reuses the conventions we already have, rather than introducing a "pipeline" as a new top-level artifact with its own resolution rules, storage path, and spec file. If we get remote sub-workflow inclusion right, "meta-pipelines" might largely fall out of it as a usage pattern rather than a new concept.

Looks like you arrived at the same place as me. Remote workflows are the real feature, meta-pipelines emerge naturally as a convention on top.

I'm not sure whether it's worth trying to distinguish between pipelines / workflows / subworkflows. They're all basically the same thing. Especially if we add the ability to execute named workflows directly (#7208). The difference boils down to boilerplate, which we want to minimize anyway

This is why I just talk about "remote pipeline inclusion", because when I import a workflow, I don't really care whether that workflow is a "pipeline" like rnaseq or a "subworkflow" like BAM_STATS_SAMTOOLS. Workflow composition works the same way either way.

Happy to rename the ADR to "remote workflow inclusion" to align with the workflow keyword.

ewels · 2026-06-10T21:36:25Z

My main concern here is versioning of imported workflows. Do we include a lock file or something to ensure consistency or just trust in the files that are copied into the workflow code?

Modules have a .moduleinfo file with a hash to allow checking that stuff wasn't modified. I think I saw something similar mentioned here for pipelines / workflows?

ewels · 2026-06-10T21:38:16Z

Having unpredictable global scope params blocks is just weird and if we were designing Nextflow today we would never include this behaviour. In other languages, globals need to be used with caution and are generally not advised.

@adamrtalbot agreed, I never said global. I would love it if the pipeline config is imported within a dedicated scope and treated as a baseline default. Then the importing pipeline can override anything, but doesn't need to duplicate config that isn't being changed.

Doing this would not be trivial. The only way I can think of is to do something fairly radical like rendering the config at import time and saving that to a locked config file somewhere. Or some other crazy mechanism.

adamrtalbot · 2026-06-11T09:32:40Z

Having unpredictable global scope params blocks is just weird and if we were designing Nextflow today we would never include this behaviour. In other languages, globals need to be used with caution and are generally not advised.

@adamrtalbot agreed, I never said global. I would love it if the pipeline config is imported within a dedicated scope and treated as a baseline default. Then the importing pipeline can override anything, but doesn't need to duplicate config that isn't being changed.

Doing this would not be trivial. The only way I can think of is to do something fairly radical like rendering the config at import time and saving that to a locked config file somewhere. Or some other crazy mechanism.

Config or params? In my mind they are very different concepts, I was referring to parameters here.

adamrtalbot · 2026-06-11T09:49:40Z

Happy to rename the ADR to "remote workflow inclusion" to align with the workflow keyword.

I agree with this. They're all workflows*, the only thing that separates a "pipeline" from a subworkflow is perception.

*except the anonymous entry workflow, which is where the sticky point about params and config comes in 😉

ewels · 2026-06-11T14:55:56Z

Config or params? In my mind they are very different concepts, I was referring to parameters here.

Ideally params, but might need to be config for all the ext stuff..?

Happy to rename the ADR to "remote workflow inclusion" to align with the workflow keyword.

Yeah as it stands I think this basically boils down to the functionality we already have with nf-core subworkflows, right? Which is quite far from what I think of as meta-pipelines. Still good to have and useful..

bentsherman · 2026-06-11T16:00:25Z

Yeah as it stands I think this basically boils down to the functionality we already have with nf-core subworkflows, right?

Can the nf-core tooling install a workflow from a pipeline repo? e.g. NFCORE_RNASEQ from nf-core/rnaseq? I think that is the main thing that this ADR adds

bentsherman · 2026-06-11T17:02:45Z

+// module
+include { BWA_MEM } from 'nf-core/bwa/mem'
+
+// pipeline
+include { NFCORE_RNASEQ } from 'nf-core/rnaseq'


One point that makes me hesitant to reframe the ADR as "remote workflow inclusion" -- here we are referencing the pipeline by name (nf-core/rnaseq)

It could be the GitHub repo or an entity in the Nextflow registry, but either way, the pipeline itself plays a role in facilitating the inclusion. Even if we only include the core workflow (NFCORE_RNASEQ), we likely need to store the entire pipeline code in the meta-pipeline repo, because that is the thing that is versioned

As a user, I will want to know that my meta-pipeline is using a specific pipeline version (e.g. nf-core/rnaseq 3.3.0), so in effect we have to say that we are including the entire pipeline

ADR: Meta-pipelines

f97e3d7

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

bentsherman requested review from ewels and pditommaso June 10, 2026 00:19

bentsherman added this to the 26.10 milestone Jun 10, 2026

jorgee reviewed Jun 10, 2026

pinin4fjords reviewed Jun 10, 2026

jorgee reviewed Jun 10, 2026

Comment thread adr/20260608-remote-pipeline-inclusion.md

adamrtalbot reviewed Jun 10, 2026

bentsherman added 2 commits June 10, 2026 08:10

Define fetchngs/rnaseq pipelines in appendix example [ci skip]

9841151

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

Improve description of configuration in appendix example

c0911de

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

bentsherman commented Jun 10, 2026

Relax language in "Best practices for included pipelines" [ci skip]

286ac8d

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

bentsherman mentioned this pull request Jun 10, 2026

docs: Clarify distinction between modules and script inclusion #7217

Merged

1 task

pditommaso requested changes Jun 10, 2026

bentsherman changed the title ~~ADR: Meta-pipelines~~ ADR: Remote pipeline inclusion Jun 11, 2026

Consolidate pipeline chaining, nf-cascade into one section

09c9e96

Signed-off-by: Ben Sherman <bentshermann@gmail.com>

bentsherman commented Jun 11, 2026
