biodiversitycellatlas/bca_preprocessing: Citations

Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.

Pipeline tools

  • Alevin

    Srivastava, A., Malik, L., Smith, T. et al. Alevin efficiently estimates accurate gene abundances from dscRNA-seq data. Genome Biol 20, 65 (2019).

  • Alevin-fry

    He, D., Zakeri, M., Sarkar, H. et al. Alevin-fry unlocks rapid, accurate and memory-frugal quantification of single-cell RNA-seq data. Nat Methods 19, 316–322 (2022).

  • CellBender

    Stephen J Fleming, Mark D Chaffin, Alessandro Arduini et al. Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender. Nature Methods, 2023. https://doi.org/10.1038/s41592-023-01943-7

  • CellSweep

    Maya Caskey, Joseph Rich, Ryan Weber et al. Single-Cell Genomics Decontamination with CellSweep. bioRxiv 2026.03.04.709349, doi: https://doi.org/10.64898/2026.03.04.709349.

  • fastp

    Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018 Sep 1;34(17):i884-i890. doi: 10.1093/bioinformatics/bty560. PubMed PMID: 30423086; PubMed Central PMCID: PMC6129281.

  • FastQC

    Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data.

  • featureCounts

    Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014 Apr 1;30(7):923-30. doi: 10.1093/bioinformatics/btt656. Epub 2013 Nov 13. PubMed PMID: 24227677.

  • Kraken2

    Wood, D. E., Lu, J., & Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome Biology, 20(1), 257. https://doi.org/10.1186/s13059-019-1891-0

  • matplotlib

    J. D. Hunter. (2007). Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, vol. 9, no. 3, pp. 90-95. https://ieeexplore.ieee.org/document/4160265

  • MultiQC

    Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.

  • numpy

    Harris, C.R., Millman, K.J., van der Walt, S.J. et al. Array programming with NumPy. Nature 585, 357–362 (2020). https://doi.org/10.1038/s41586-020-2649-2

  • pandas

    pandas development team. (2021). pandas-dev/pandas: Pandas (v2.1.0rc0). Zenodo. https://doi.org/10.5281/zenodo.605272

  • pigz

    The pigz development team. (2023). pigz: A parallel implementation of gzip for modern multi-processor, multi-core machines.

  • pysam

    pysam: a Python module for reading and manipulating SAM/BAM files.

  • Salmon

    Patro, R., Duggal, G., Love, M. et al. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14, 417–419 (2017).

  • SAMtools

    Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. doi: 10.1093/bioinformatics/btp352. Epub 2009 Jun 8. PubMed PMID: 19505943; PubMed Central PMCID: PMC2723002.

  • seqtk

    Li, H. et al. (2023). Seqtk: a fast and lightweight tool for processing sequences in the FASTA or FASTQ format.

  • Simpleaf

    He, D., Patro, R. simpleaf: a simple, flexible, and scalable framework for single-cell data processing using alevin-fry, Bioinformatics 39, 10 (2023).

  • STARsolo

    Benjamin Kaminow, Dinar Yunusov, Alexander Dobin. STARsolo: accurate, fast and versatile mapping/quantification of single-cell and single-nucleus RNA-seq data. BioRxiv 2021.05.05.442755 (2021).

Git modules

  • 10x_saturate

    Zolotarov, G., Elek, A. (2024). 10x_saturate: Compute sample saturation curve by downsampling.

  • GeneExt

    Zolotarov, G., Grau-Bové, X., & Sebé-Pedrós, A. GeneExt: a gene model extension tool for enhanced single-cell RNA-seq analysis. bioRxiv 2023.12.05.570120. https://doi.org/10.1101/2023.12.05.570120

  • pavianCore

    Florian P Breitwieser, Steven L Salzberg, Pavian: interactive analysis of metagenomics data for microbiome studies and pathogen identification, Bioinformatics, Volume 36, Issue 4, February 2020, Pages 1303–1304, https://doi.org/10.1093/bioinformatics/btz715

External software/pipelines

Software packaging/containerisation tools

  • Anaconda

    Anaconda Software Distribution. Computer software. Vers. 2-2.4.0. Anaconda, Nov. 2016. Web.

  • Bioconda

    Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.

  • BioContainers

    da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

  • Docker

    Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.

  • Singularity

    Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.