Strange truncation of fastq files when demultiplexing short Illumina inserts libraries #90

Description

@j0n0curry

Hi there,

I have been using your library to demultiplex Illumina libraries (384 per ONT run, one per well), which are constructed similarly to TruSeq paired indexes.

The first sets we tested worked very well. These were targeted PCR libraries with an insert size range of 200 to 600 bp and an intentional bias, to see whether the input ratio would be picked up, and it was.

However, we are also interested in the performance of the indexing plates, both for cross-contamination and for total library read counts per index, which we then compare back to the original Illumina library run. For this we used a 90 bp insert (x 4, one per quadrant of the plate) with a final amplicon 165 bp long. The combined demultiplexing report from the 800-plus files gives a relatively realistic number of index pairs found. Strangely, however, the demultiplexed fastq.gz files (one per well of the 384-well plate) contain only short reads, with a mean length of 26 bases +/- 40 bp. The input fastq files had been filtered to the 180 to 400 bp range and pre-mapped to verify the number of mappable reads. After Anglerfish demultiplexing, the reads give low or no alignment (bowtie2 or MinKNOW medaka). FastQC showed that they were too short, and BLAST did not find many that aligned with the insert or with the i5 or i7 region.
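For anyone who wants to reproduce the length comparison without FastQC, something along these lines should work. It is only a minimal sketch, assuming standard four-line FASTQ records (no wrapped sequence lines); the function names are illustrative and not part of Anglerfish.

```python
import gzip
import statistics
import sys


def fastq_read_lengths(path):
    """Return the length of every read in a FASTQ file (plain or gzipped)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = []
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Assumption: four lines per record, so the sequence is
            # always the second line of each group of four.
            if line_number % 4 == 1:
                lengths.append(len(line.strip()))
    return lengths


def summarize_lengths(lengths):
    """Summarize read lengths as count, mean, standard deviation, min and max."""
    if not lengths:
        return {"reads": 0}
    return {
        "reads": len(lengths),
        "mean": statistics.mean(lengths),
        "stdev": statistics.pstdev(lengths),  # population standard deviation
        "min": min(lengths),
        "max": max(lengths),
    }


if __name__ == "__main__":
    # Print one summary per file given on the command line.
    for fastq_path in sys.argv[1:]:
        print(fastq_path, summarize_lengths(fastq_read_lengths(fastq_path)))
```

Running it on the filtered input file and on one of the demultiplexed per-well files side by side should show whether the reads are shortened during demultiplexing or were already short going in.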

I have attached a zipped file containing: the Anglerfish output / a text file with the settings used (I have tried with and without -lenient) / the filtered fastq file / the resulting demultiplexed reads from this file / a copy of the sample sheet with all 384 indexes. I am using v0.6.1 on Ubuntu under WSL with Python 3.10, and minimap2 is installed. I am unsure whether very short amplicons cause problems for the pipeline. The fragment range includes single and duplicated reads, as expected.

anglerfish_test.zip

Thanks very much for this pipeline! We are hoping to pipe to this for real-time demultiplexing.
