Genome-scale metabolic modeling of lactic acid bacteria (LAB) consortia in legume-based fermentation. Master's project, CEB, Universidade do Minho.
This repository contains the Jupyter notebooks, notes, results and supporting materials developed for a bioinformatics project on modeling lactic acid bacteria (LAB) consortia in legume-based fermentation systems.
The project aims to use genome-scale metabolic models (GSMMs) to study microbial interactions in plant-based matrices, with a focus on chickpea and fava bean fermentations. The main goal is to investigate how selected LAB strains may contribute to the mitigation of off-flavors and to the identification of favorable fermentation conditions through computational modeling.
The workflow combines:
- curation and standardization of GEMs;
- definition of legume-based growth media;
- interaction analysis using SMETANA;
- community simulation using MICOM.

The work is organized into five objectives:
- GEM curation — curate and standardize iLP728 and Koduru2022, evaluating quality (mass balance, blocked reactions, biomass consistency) with memote before and after correction.
- Legume medium construction — translate chickpea and fava bean composition (sugars, amino acids, organic acids, lipids) into exchange bounds in COBRApy, using literature values and an approximation of the Sauer equation.
- Interaction analysis — screen pairwise cross-feeding potential between the two LAB with SMETANA.
- Community modeling — simulate the LAB consortium in both legume matrices with MICOM, comparing community behavior against solo-FBA baselines and assessing sensitivity to relative abundance.
- Off-flavor analysis — interpret predicted flux distributions for compounds tied to known off-flavor pathways and evaluate which interactions plausibly contribute to mitigation.
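
As a rough illustration of the medium-construction objective, the sketch below converts a composition table (grams of each compound per litre of matrix) into millimolar amounts and caps them as uptake bounds keyed by exchange reaction ID. The BiGG-style exchange IDs, the example concentrations, the `scale` factor and the `max_uptake` cap are all illustrative assumptions. They are not the values or the equation used in the project, whose actual logic lives in legume_medium1_v2.py.

```python
# Molar masses in g/mol (standard values).
# The exchange reaction IDs are assumed BiGG-style names, not taken from the project.
MOLAR_MASS = {
    "EX_glc__D_e": 180.16,  # D-glucose
    "EX_sucr_e": 342.30,    # sucrose
}


def composition_to_bounds(grams_per_litre, scale=1.0, max_uptake=10.0):
    """Map g/L concentrations to capped uptake bounds.

    `scale` is a hypothetical factor turning mmol/L into a flux bound; it
    stands in for whatever conversion the project actually applies.
    """
    bounds = {}
    for exchange_id, grams in grams_per_litre.items():
        mmol_per_litre = grams / MOLAR_MASS[exchange_id] * 1000.0
        bounds[exchange_id] = min(mmol_per_litre * scale, max_uptake)
    return bounds


# Made-up concentrations, for illustration only.
example = composition_to_bounds({"EX_glc__D_e": 0.9008, "EX_sucr_e": 34.23})
print(example)
```

In COBRApy, a dictionary of this shape (exchange reaction ID mapped to a maximum uptake rate) can be assigned to `model.medium`, provided the IDs exist in the model.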
docs/
Final deliverables: the final article (Artigo_Final_pg59766.pdf, LNCS format), the intermediate report submitted earlier in the project (Artigo_Intercalar_pg59766.pdf), and presentation slides (ppt_SaraSousa_pg59766.pdf).
models/raw/
GSMMs as originally published/downloaded, before any curation. Kept for traceability.
models/curated/
Curated and harmonized versions of iLP728 and Koduru2022 used in all downstream analyses.
notebooks/
All analysis code, one notebook per objective:
obj1_* notebooks curate the two GEMs and document why the third candidate strain was excluded; obj2 builds the chickpea/fava bean media; obj3 runs SMETANA; obj4_obj5 runs the MICOM community simulations and the off-flavor flux analysis.
reports/memote/
Automated model quality reports generated by memote. _v2 files are post-curation reports; files without _v2 are pre-curation.
results/
Numerical outputs and figures from the notebooks, named by objective. obj4_* covers community growth rates, abundance sweeps, cross-feeding predictions and solo-vs-consortium comparisons. obj5_* covers the off-flavor perturbation analysis. smetana_* covers the three SMETANA interaction scores (MU, MIP, MRO).
Root files
legume_medium1_v2.py and utils.py are shared Python modules imported by the notebooks — medium construction logic and general helper functions, respectively.
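
The `_v2` naming convention used in reports/memote/ makes it easy to pair each pre-curation report with its post-curation counterpart. The sketch below does this from a list of file names. The file names shown are hypothetical, and the helper `pair_reports` is not part of utils.py.

```python
from pathlib import PurePath


def pair_reports(filenames):
    """Group report files into {model_name: {"pre": file, "post": file}}."""
    pairs = {}
    for name in sorted(filenames):
        stem = PurePath(name).stem
        if stem.endswith("_v2"):
            # A _v2 suffix marks a post-curation report.
            model, stage = stem[: -len("_v2")], "post"
        else:
            model, stage = stem, "pre"
        pairs.setdefault(model, {})[stage] = name
    return pairs


# Hypothetical file names; the actual names in reports/memote/ may differ.
reports = pair_reports(["iLP728.html", "iLP728_v2.html", "Koduru2022_v2.html"])

# Models for which one of the two reports is absent.
missing = [model for model, found in reports.items() if set(found) != {"pre", "post"}]
```

A check like `missing` can flag a model whose pre- or post-curation report was not generated before the two are compared.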
- Download or clone the repository, keeping the full folder structure inside one parent folder.
- Set up the environment and install the required packages.
- Install Gurobi.
- Run the notebooks in notebooks/ in order: obj1_* → obj2_* → obj3_* → obj4_obj5_*. Each depends on outputs from the previous one.
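
Because each notebook depends on the outputs of the previous one, the run order can also be scripted. The following is a minimal sketch that sorts notebook file names by their objective prefix and builds a `jupyter nbconvert` command for each. The notebook file names are hypothetical, and the sketch assumes nbconvert is installed in the environment.

```python
# Execution order of the notebook prefixes, as given in the steps above.
STAGE_ORDER = ["obj1", "obj2", "obj3", "obj4_obj5"]


def stage_index(filename):
    """Return the position of a notebook's stage, or None if it matches no stage."""
    for index, prefix in enumerate(STAGE_ORDER):
        if filename.startswith(prefix):
            return index
    return None


def ordered_notebooks(filenames):
    """Keep only .ipynb files with a known prefix, sorted by stage then name."""
    staged = [(stage_index(f), f) for f in filenames if f.endswith(".ipynb")]
    return [f for _, f in sorted(item for item in staged if item[0] is not None)]


def nbconvert_command(path):
    """Build the command that executes one notebook in place."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", path]


# Hypothetical file names, for illustration only.
names = ["obj3_smetana.ipynb", "obj1_curation.ipynb", "notes.txt",
         "obj4_obj5_micom.ipynb", "obj2_media.ipynb"]
run_order = ordered_notebooks(names)
commands = [nbconvert_command(f"notebooks/{name}") for name in run_order]
```

Each entry in `commands` can then be passed to `subprocess.run(command, check=True)` from the repository root, so that a failing notebook stops the chain before later objectives run on stale outputs.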
- Python, COBRApy, Genome-scale metabolic models (GSMMs), SMETANA, MICOM, ReFramed
Sara Sousa (PG59766)
Supervisor: Óscar Dias