This repository contains the code and pipelines used to characterise Enterococcus faecium strain E745. The project uses a multi-omics approach to examine the genome structure, the resistome, and the metabolic dependencies between host and pathogen in human serum.
Data Source: The clinical samples used in this study were obtained from:
Zhang, X., de Maat, V., Guzmán Prieto, A. M., Prajsnar, T. K., Bayjanov, J. R., de Been, M., Rogers, M. R. C., Bonten, M. J. M., Mesnage, S., Willems, R. J. L., & van Schaik, W. (2017). RNA-seq and Tn-seq reveal fitness determinants of vancomycin-resistant Enterococcus faecium during growth in human serum. BMC Genomics, 18(1), 893. https://doi.org/10.1186/s12864-017-4299-9
The pipeline in this repository covers the genome analysis from start to finish:
- De novo Assembly: Hybrid assembly (Illumina + Nanopore/PacBio).
- Annotation: Structural and functional annotation (Prokka/EggNOG).
- Comparative Resistome: Identifying mobile genetic elements (MGEs) and AMR genes.
- Transcriptomics/Fitness: Combining RNA-Seq differential expression analysis with Tn-Seq essentiality.
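
As a rough illustration of the last step, combining the two data types can be thought of as intersecting two gene lists: genes induced in serum (RNA-Seq) and genes whose disruption reduces fitness in serum (Tn-Seq). The sketch below is not the project's actual code; the `GeneEvidence` record, the function name, the thresholds and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneEvidence:
    """Per-gene summary from both assays (hypothetical record layout)."""
    gene: str
    rnaseq_log2fc: float  # expression change in serum, e.g. a DESeq2 log2 fold change
    rnaseq_padj: float    # adjusted p-value for the expression change
    tnseq_log2fc: float   # change in transposon-mutant abundance in serum
    tnseq_padj: float     # adjusted p-value for the abundance change


def serum_fitness_candidates(records, alpha=0.05, min_up=1.0, max_tn=-1.0):
    """Return genes that are induced in serum AND needed for growth there.

    The cut-offs are illustrative defaults, not values taken from the study.
    """
    hits = []
    for r in records:
        induced = r.rnaseq_padj < alpha and r.rnaseq_log2fc >= min_up
        needed = r.tnseq_padj < alpha and r.tnseq_log2fc <= max_tn
        if induced and needed:
            hits.append(r.gene)
    return sorted(hits)


# Made-up values for illustration only; these are not results from this project.
example = [
    GeneEvidence("geneA", 2.3, 0.001, -3.1, 0.002),  # induced and needed
    GeneEvidence("geneB", 2.0, 0.010, 0.1, 0.800),   # induced, but no fitness defect
    GeneEvidence("geneC", 0.2, 0.600, -2.5, 0.001),  # fitness defect, but not induced
]
print(serum_fitness_candidates(example))  # ['geneA']
```

Requiring evidence from both assays is a deliberately strict choice; genes supported by only one data type may still be biologically relevant and can be reported separately.
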
The analysis runs as an automated pipeline, from raw sequence data to functional interpretation.
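
One way to picture the flow from raw data to interpretation is as a dependency graph of stages. The sketch below uses Python's standard-library `graphlib` to derive a valid execution order. The stage names and their dependencies are assumptions chosen to mirror the steps listed in this README; they are not taken from the repository's actual workflow definition.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names; each stage maps to the stages it depends on.
STAGES = {
    "quality_control": set(),
    "assembly": {"quality_control"},
    "assembly_validation": {"assembly"},
    "annotation": {"assembly"},
    "comparative_resistome": {"annotation"},
    "rnaseq_differential_expression": {"quality_control", "annotation"},
    "tnseq_fitness": {"quality_control", "annotation"},
    "functional_interpretation": {
        "comparative_resistome",
        "rnaseq_differential_expression",
        "tnseq_fitness",
    },
}


def execution_order(stages):
    """Return one order in which every stage runs after its dependencies."""
    return list(TopologicalSorter(stages).static_order())


order = execution_order(STAGES)
for step, name in enumerate(order, start=1):
    print(f"{step}. {name}")
```

Independent stages (for example the RNA-Seq and Tn-Seq branches) may appear in either order, so the exact listing can vary while still respecting every dependency.
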
The following tool stack is used to support reproducible, high-fidelity analysis:
| Category | Tools & Packages |
|---|---|
| Quality Control | FastQC v0.12.1-Java-17, Trimmomatic v0.39-Java-17 |
| Genome Assembly | SPAdes v4.2.0-GCC-13.3.0, Flye v2.9.6-GCC-13.3.0, Canu v2.3-GCCcore-13.3.0-Java-17, Bandage v0.9.0 |
| Assembly Validation | BUSCO v5.8.2-gfbf-2024a, QUAST v5.3.0, MUMmer v4.0.1-GCCcore-13.3.0 |
| Annotation | Prokka v1.14.5-gompi-2024a, EggNOG-mapper v2.1.13-gfbf-2024a |
| Variant Calling | BWA-MEM, BCFtools, IGV |
| Transcriptomics | HTSeq, DESeq2, R |
| Fitness Profiling | featureCounts, samtools |
| Comparative Genomics | BLAST, ACT, PlasmidFinder, ResFinder |
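
When recording the software environment for a run, it can help to split entries like those in the table into a tool name, a version and a trailing build suffix. The sketch below does that with a regular expression. Treating the text after the version (for example `GCC-13.3.0`) as an environment-module build suffix is an assumption, and `ToolSpec`, `parse_tool` and `parse_cell` are hypothetical helper names, not part of this repository.

```python
import re
from typing import NamedTuple, Optional


class ToolSpec(NamedTuple):
    name: str
    version: Optional[str]  # None when the table lists no version
    suffix: Optional[str]   # assumed build suffix, e.g. "GCC-13.3.0"


# "<name> v<version>[-<suffix>]", where the version part is optional.
_ENTRY = re.compile(
    r"^(?P<name>\S+)(?:\s+v(?P<version>\d[\w.]*)(?:-(?P<suffix>.+))?)?$"
)


def parse_tool(entry: str) -> ToolSpec:
    """Parse one tool entry such as 'SPAdes v4.2.0-GCC-13.3.0'."""
    match = _ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"Unrecognised tool entry: {entry!r}")
    return ToolSpec(match["name"], match["version"], match["suffix"])


def parse_cell(cell: str) -> list:
    """Parse a comma-separated table cell into a list of ToolSpec records."""
    return [parse_tool(part) for part in cell.split(",") if part.strip()]


for spec in parse_cell("SPAdes v4.2.0-GCC-13.3.0, Bandage v0.9.0, BWA-MEM"):
    print(spec)
```

Entries without a version, such as `BWA-MEM`, come back with `version=None`, which makes unpinned tools easy to spot.
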
- Hybrid Assembly: Produced a reference genome assembly of ~2.93 Mb with high-coverage plasmid mapping.
- AMR Characterization: Confirmed chromosomal resistance SNPs (such as in the gyrA gene) and plasmid-mediated antibiotic resistance (vanHAX).
- Host-Adaptation: Identified essential metabolic pathways (nucleotide biosynthesis) by combining RNA-Seq and Tn-Seq.
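
Assembly size figures like the one above are normally read from a QUAST report. For readers unfamiliar with the metrics, the following minimal sketch shows the arithmetic behind total length and N50. The function name is hypothetical and the contig lengths are toy numbers, not values from the E745 assembly.

```python
def assembly_stats(contig_lengths):
    """Compute basic contiguity metrics from a list of contig lengths (bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("At least one contig length is required")

    total = sum(lengths)
    running = 0
    n50 = lengths[-1]
    # N50: length of the contig at which the cumulative sum of the
    # longest contigs first reaches half of the total assembly length.
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "contigs": len(lengths),
        "total_bp": total,
        "largest_bp": lengths[0],
        "n50_bp": n50,
    }


# Toy lengths for illustration only.
print(assembly_stats([100, 400, 200, 300]))
# {'contigs': 4, 'total_bp': 1000, 'largest_bp': 400, 'n50_bp': 300}
```
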
For more information about the project, visit the Wiki page.
Victor Guillermo Cornejo Villanueva | Junior Bioinformatician Intern Master’s Project at Uppsala University
This project is licensed under the MIT License.