Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -13,6 +13,7 @@ null/
# Builds
build/
*.egg-info/
!modules/**/build/

# Pesky vscode
.vscode/
28 changes: 28 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,33 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [v1.2.0] - 2026-05-20

Update that lets users optionally map reads with Bowtie 2 instead of BWA-MEM2 and use Artic primers as default. Also adds a parameter to better control ONT variant masking.

### `Added`

- Illumina read mapping with Bowtie 2 as an alternative (instead of BWA-MEM2) in response to [issue #37](https://github.com/phac-nml/measeq/issues/37). [PR #39](https://github.com/phac-nml/measeq/pull/39)

- `align_bowtie2` added as an optional parameter to map Illumina data with Bowtie 2 instead of BWA-MEM2.

- [Artic primers](https://doi.org/10.1101/2024.12.20.629611) for MeV were added as a profile. [PR #39](https://github.com/phac-nml/measeq/pull/39)

- This allows running the pipeline with the Artic primers mapped to the pipeline's preset references (D8, B3, and A genotypes) [issue #38](https://github.com/phac-nml/measeq/issues/38).
- To run the pipeline with this profile, use `nextflow run phac-nml/measeq -profile artic_primers,<docker/singularity>` with the other normal parameters you would use.

- New `min_mask_freq_c3` parameter for nanopore data to better control when sites are masked as Ns in noisier regions/datasets [PR #42](https://github.com/phac-nml/measeq/pull/42)
- Default is `0.30`
- Allows adjusting the acceptable amount of site variation before an N is used to mask the site
- By default, a call with `0.15` alt frequency and a quality score of 3 is now NOT N masked, whereas before it was

> [!NOTE]
> If you use `-profile artic_primers`, then there is no need to use `--amplicon` as it is automatically passed.

### `Fixes`

- Ambiguous regions that don't include the Ref base are now properly tracked in the report and include both alts [PR #41](https://github.com/phac-nml/measeq/pull/41)

## [v1.1.0] - 2026-03-27

Update focusing on fixing small inconsistencies between final reports and output consensus sequences along with a few small bugfixes and exposing some more parameter options available to the user
@@ -285,6 +312,7 @@ Small addition of Picard MarkDuplicates workflow along with some new tests

- MeaSeq pipeline created and initial code added

[v1.2.0]: https://github.com/phac-nml/measeq/releases/tag/1.2.0
[v1.1.0]: https://github.com/phac-nml/measeq/releases/tag/1.1.0
[v1.0.1]: https://github.com/phac-nml/measeq/releases/tag/1.0.1
[v1.0.0]: https://github.com/phac-nml/measeq/releases/tag/1.0.0
22 changes: 15 additions & 7 deletions README.md
@@ -22,14 +22,24 @@
- [Troubleshooting](#troubleshooting)
- [Credits](#credits)
- [Citations](#citations)
- [Contributing](#legal)
- [Contributing](#contributing)
- [Legal](#legal)

## Current Updates

### _2026-03-27_ Summary
### _2026-05-20_ Summary

Full release version 1.1.0! Pipeline supports equivalent Illumina and Nanopore workflows allowing whole genome or amplicon sequencing analysis. The MeaSeq workflow generates whole genome consensus sequences, N450 sequences and reporting information, DSId hashing and assigning, and a final QC report. It can be run with a single reference or with the genotyping predictions and a config setup containing a users preferred references.
Full release version 1.2.0! Pipeline supports equivalent Illumina and Nanopore workflows allowing whole genome or amplicon sequencing analysis. The MeaSeq workflow generates whole genome consensus sequences, N450 sequences and reporting information, DSId hashing and assigning, and a final QC report. It can be run with a single reference or with the genotyping predictions and a config setup containing a user's preferred references.

Changes in `v1.2.0` include the addition of [Bowtie 2](https://github.com/BenLangmead/bowtie2) as an alternative read mapping tool (instead of BWA-MEM2) and support for the [Artic primers](https://www.biorxiv.org/content/10.1101/2024.12.20.629611v1) mapped to the pipeline's preset references (D8, B3, and A genotypes).

#### Preprint

If you find this pipeline useful, please cite our preprint as:

> Evaluation of MeaSeq: comprehensive analysis and reporting of measles virus whole genome sequences.
> Darian T Hole, Ahmed Abdalla, Vanessa Zubach, Molly Pratt, Stephanie Van Driel, Samar Ashfaq, Joanne Hiebert, Ana T Duggan
> bioRxiv 2026.05.12.724559; doi: https://doi.org/10.64898/2026.05.12.724559

#### Genotype Predictions

@@ -45,8 +55,6 @@ Full release version 1.1.0! Pipeline supports equivalent Illumina and Nanopore w

- Updating the final report and maintaining best practices/tool updates as they are released

- Writing a quick summary paper of the process and uses for reporting

- For IRIDA-Next, we're hoping to evaluate generic viral pipeline options (or create one) and merge in virus specific post-processing stages
- So measeq post-processing would end up included there

@@ -156,9 +164,9 @@ Additional or local models can also be used, you just have to provide a path to

#### Variant Quality Filtering and Masking Mixed Sites

In addition to calling variants with Clair3, the Nanopore pipeline will mask sites that are of lower quality (Default: 2 < QUAL < 7) or have an allele frequency below 60% with an N in the final consensus. These masked sites can be found in the final HTML report or under the `results/vcf/artic/<sample>.fail.vcf` file.
In addition to calling variants with Clair3, the Nanopore pipeline will mask sites that are of lower quality (Default: 2 < QUAL < 7) or have a non-consensus level allele frequency (Default: 30% < AF < 60%) with an N in the final consensus. These masked sites can be found in the final HTML report or under the `results/vcf/artic/<sample>.fail.vcf` file.

To adjust this behaviour, you can set the `--min_variant_qual_c3` and `--min_allele_freq_c3` parameters. Setting them both to 0 will essentially turn of variant filtering other than for indels and low depth sites
To adjust this behaviour, you can set the `--min_variant_qual_c3`, `--min_allele_freq_c3`, and `--min_mask_freq_c3` parameters. Setting them all to 0 will essentially turn off variant filtering other than for indels and low depth sites, so the consensus then relies solely on Clair3's calls.
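
The thresholds above can be read as a three-way decision per variant site. The Python sketch below is only an illustration of that reading, not the pipeline's actual code: the function name `classify_site`, the `qual_floor` argument (standing in for the lower QUAL bound of 2, for which no parameter is named here), and the choice to leave sites at or below the mask frequency unmasked are all assumptions, picked so that the result agrees with the changelog example of a `0.15` alt frequency call with a quality score of 3.

```python
from typing import Literal

Decision = Literal["pass", "mask", "ignore"]


def classify_site(
    qual: float,
    af: float,
    min_variant_qual: float = 7.0,  # assumed to mirror --min_variant_qual_c3
    min_allele_freq: float = 0.60,  # assumed to mirror --min_allele_freq_c3
    min_mask_freq: float = 0.30,    # assumed to mirror --min_mask_freq_c3
    qual_floor: float = 2.0,        # hypothetical: lower QUAL bound from the text
) -> Decision:
    """Classify one variant call under the assumed masking rules.

    "ignore": the call is dropped and the site is not masked.
    "mask":   the site is written as N in the consensus.
    "pass":   the variant is applied to the consensus.
    """
    # Calls at or below the quality floor or the mask frequency are treated
    # as too weak to justify masking the site (an assumption of this sketch).
    if qual <= qual_floor or af <= min_mask_freq:
        return "ignore"
    # Calls inside either "grey zone" (2 < QUAL < 7 or 30% < AF < 60% by
    # default) are masked with an N.
    if qual < min_variant_qual or af < min_allele_freq:
        return "mask"
    return "pass"


if __name__ == "__main__":
    # (QUAL, AF) pairs chosen to hit each branch with the default thresholds.
    for qual, af in [(3, 0.15), (20, 0.45), (5, 0.90), (20, 0.90)]:
        print(f"QUAL={qual} AF={af:.2f} -> {classify_site(qual, af)}")
```

Under these assumptions, passing `min_mask_freq=0.0` reproduces the older behaviour in which the `0.15` frequency, quality 3 call is masked, and setting all three thresholds to 0 lets any call above the quality floor pass.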

### Reference Assignment
