MetaWorks 2.0

A flexible, web-based control node for managing bioinformatics pipeline runs (ESV/OTU analysis) across local machines, servers, and HPC clusters.

Features

Modern Web UI (work in progress): Vue 3 + Vite-based interface with real-time updates
Multi-Environment Support: Run locally, on servers, or on HPC clusters
Flexible Runtimes: Conda, Docker, and Apptainer (Singularity) support
Workflow Presets: Pre-configured templates for COI, 16S, and custom analyses
Schema-Driven Configuration: Dynamic config forms generated from the pipeline schema
Real-Time Monitoring: Live progress tracking, log streaming, and status updates
Scheduler Integration: Pluggable scheduler architecture (SLURM planned)

Quick Start

Prerequisites

Python 3.9+
Node.js 18+ (only needed to build the UI; end users don't need it)
Conda/Mamba (optional, for local runs)
Docker/Apptainer (optional, for containerized runs)

Installation

Clone and set up:

git clone https://github.com/Hajibabaei-Lab/MetaWorks-2.0.git
cd MetaWorks-2.0

Create conda environment:

conda env create -f environment.yml
conda activate MetaWorks

Build the UI (one-time setup):

cd frontend
npm install
npm run build
cd ..

Start the server:

uvicorn api.main:app --host 0.0.0.0 --port 8000

Open in browser: Navigate to http://localhost:8000

That's it! The web interface is now ready to use.

Using Docker Compose

cd deploy
cp .env.example .env
docker compose up --build

Access the UI at http://localhost:8080.

Architecture

Control Node Pattern

MetaWorks uses a control node architecture:

Web UI: Vue 3 SPA served from frontend/
API Server: FastAPI backend managing run lifecycle
Job Manager: Handles scheduler integration and job execution
Runtime Layer: Supports Conda, Docker, and Apptainer for running pipelines

This separation allows the control node to run anywhere (laptop, server, HPC) while the actual pipeline runs execute on appropriate compute resources.
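As a rough illustration of what a runtime layer of this kind might do, the sketch below wraps one pipeline command for each of the three supported runtimes. It is not taken from the MetaWorks code base: the function name `wrap_command`, the image name `metaworks:latest`, and the file `metaworks.sif` are hypothetical. Only the general CLI forms (`conda run -n`, `docker run --rm`, `apptainer exec`) and the `MetaWorks` environment name from the installation steps are assumed.

```python
from typing import List


def wrap_command(runtime: str, command: List[str], target: str) -> List[str]:
    """Wrap a pipeline command for the chosen runtime (hypothetical helper).

    `target` is a conda environment name for "conda", or an image
    reference/path for "docker" and "apptainer".
    """
    if runtime == "conda":
        # `conda run -n <env> <cmd>` executes the command inside a named environment.
        return ["conda", "run", "-n", target, *command]
    if runtime == "docker":
        # `docker run --rm <image> <cmd>` runs the command in a throwaway container.
        return ["docker", "run", "--rm", target, *command]
    if runtime == "apptainer":
        # `apptainer exec <image> <cmd>` runs the command inside a container image.
        return ["apptainer", "exec", target, *command]
    raise ValueError(f"unsupported runtime: {runtime}")


# The same Snakemake invocation, wrapped three ways.
pipeline = ["snakemake", "--cores", "4"]
conda_cmd = wrap_command("conda", pipeline, "MetaWorks")
docker_cmd = wrap_command("docker", pipeline, "metaworks:latest")  # hypothetical image
apptainer_cmd = wrap_command("apptainer", pipeline, "metaworks.sif")  # hypothetical file
```

Keeping the wrapping in one place means the job manager can hand the resulting argument list to whichever scheduler is in use without knowing which runtime was chosen.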

Split Frontend Deployment

The backend exposes a stable /api surface and can optionally still serve the legacy static UI. The recommended deployment runs the standalone frontend in its own container and proxies /api/* to FastAPI, so the runner remains usable on its own, without the web app.
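The routing rule in a split deployment like this can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's proxy configuration: the function name `route_request` and the example paths are hypothetical.

```python
def route_request(path: str) -> str:
    """Decide which service handles a request path (hypothetical helper).

    Anything under /api goes to the FastAPI backend; everything else is
    served by the standalone frontend container.
    """
    if path == "/api" or path.startswith("/api/"):
        return "backend"
    return "frontend"


# Example paths (hypothetical): one API call, one page load.
api_target = route_request("/api/runs")
page_target = route_request("/index.html")
```

Matching on `/api/` with the trailing slash avoids sending unrelated paths that merely begin with the same letters to the backend.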

Documentation

Deployment Guide - Deploy to local, server, or HPC environments
Configuration Guide - System and user configuration
Module Standards - Creating custom modules
Remote API Usage - Using the API programmatically

Usage Overview

Submitting a Run

Choose a workflow preset (COI Standard, 16S Microbiome, or Custom)
Configure parameters:
- Runtime type (Conda, Docker, Apptainer)
- Input directory and sample source
- Resource requirements (cores, memory)
Edit config sections (optional):
- Click "Load [ESV/OTU] sections" to see available parameters
- Modify fields as needed; help tooltips explain each option
- Only changed values are sent with the run
Upload assets (optional):
- Upload classifier and adapter files via the UI
- Reference them in your config
Submit to scheduler and monitor progress
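The "only changed values are sent" behaviour can be pictured as a diff between the schema defaults and the edited form. The sketch below is one way to compute such a diff, under the assumption that config sections are nested dictionaries; the function name `changed_values` and the keys `min_length`, `threads`, and `identity` are hypothetical and not taken from the MetaWorks schema.

```python
from typing import Any, Dict


def changed_values(defaults: Dict[str, Any], edited: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries of `edited` that differ from `defaults`.

    Nested dictionaries (config sections) are compared recursively, and
    sections with no changes are left out entirely.
    """
    diff: Dict[str, Any] = {}
    for key, value in edited.items():
        base = defaults.get(key)
        if isinstance(value, dict) and isinstance(base, dict):
            nested = changed_values(base, value)
            if nested:
                diff[key] = nested
        elif key not in defaults or value != base:
            diff[key] = value
    return diff


# Hypothetical defaults and a form in which one field was edited.
defaults = {"ESV": {"min_length": 100, "threads": 4}, "OTU": {"identity": 0.97}}
edited = {"ESV": {"min_length": 150, "threads": 4}, "OTU": {"identity": 0.97}}
overrides = changed_values(defaults, edited)  # {"ESV": {"min_length": 150}}
```

Sending only the overrides keeps the submitted run small and lets untouched parameters follow the pipeline defaults.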

Monitoring Runs

Auto-refresh: The run list refreshes automatically every 5 seconds
Progress tracking: Percentage complete, current step, time estimates
Log streaming: Real-time log output in the browser
Actions: Cancel, download logs, download artifacts, delete runs
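The same 5-second refresh can be reproduced by a script that watches a run from outside the browser. The sketch below shows the polling loop only, with the status lookup passed in as a function so that no endpoint is assumed. The name `poll_run`, the `state` field, and the terminal state names are hypothetical; the actual response format is covered by the Remote API Usage documentation.

```python
import time
from typing import Any, Callable, Dict, List

# Hypothetical names for states after which polling should stop.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_run(
    fetch_status: Callable[[], Dict[str, Any]],
    interval: float = 5.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, Any]]:
    """Call `fetch_status` every `interval` seconds until the run finishes.

    Returns every status seen, in order. Polling stops at a terminal
    state or after `max_polls` lookups, whichever comes first.
    """
    history: List[Dict[str, Any]] = []
    for _ in range(max_polls):
        status = fetch_status()
        history.append(status)
        if status.get("state") in TERMINAL_STATES:
            break
        sleep(interval)
    return history


# Demonstration with canned responses and a recording stand-in for sleep.
responses = iter([
    {"state": "running", "progress": 10},
    {"state": "running", "progress": 60},
    {"state": "completed", "progress": 100},
])
sleeps: List[float] = []
history = poll_run(lambda: next(responses), sleep=sleeps.append)
```

Injecting `fetch_status` and `sleep` keeps the loop testable without a running server; in real use `fetch_status` would wrap an HTTP request to the API.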

Development Mode

For UI development with hot-reload:

# Terminal 1: Vite dev server
cd frontend
npm run dev

# Terminal 2: API server
uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload

Project Structure

MetaWorks-2.0/
├── api/                    # FastAPI backend (routes, services, schemas)
├── config/                 # Pipeline defaults and marker presets
├── deploy/                 # Docker Compose split deployment
├── docs/                   # Documentation
├── frontend/               # Vue 3 + TypeScript SPA
├── lib/                    # Config management, runtime builders, exceptions
├── tests/                  # pytest test suite (162 tests)
├── workflow/               # Snakemake pipeline (rules/, scripts/, profiles/)
├── Makefile                # Dev, test, lint, build commands
└── environment.yml         # Conda environment

Deployment Options

Local Development

Run on your laptop for testing and development:

conda activate MetaWorks
uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload

Production Server

Deploy the recommended split stack:

cd deploy
docker compose up -d --build

For the quickest smoke test after startup, submit a run against /MetaWorks/tests/testing_data, which is already bundled into the backend image.

HPC Cluster

Deploy on an HPC cluster using one of several approaches:

Dedicated control node with shared storage
SSH tunneling from local machine
Interactive job on compute node
Reverse proxy with authentication

See Deployment Guide for detailed instructions.

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

See Module Standards for guidance on creating new pipeline modules.

License

GNU General Public License v3.0

Citation

If you use MetaWorks in your research, please cite the MetaWorks paper: Porter, T. M., & Hajibabaei, M. (2022). MetaWorks: A flexible, scalable bioinformatic pipeline for high-throughput multi-marker biodiversity assessments. PLOS ONE, 17(9), e0274260. doi: 10.1371/journal.pone.0274260

You can also cite this repository: Teresita M. Porter. (2020, June 25). MetaWorks: A Multi-Marker Metabarcode Pipeline (Version v1.10.0). Zenodo. http://doi.org/10.5281/zenodo.4741407

If you use this dataflow for making COI taxonomic assignments, please cite the COI classifier publication: Porter, T. M., & Hajibabaei, M. (2018). Automated high throughput animal CO1 metabarcode classification. Scientific Reports, 8, 4226.

If you use the pseudogene filtering methods, please cite the pseudogene publication: Porter, T.M., & Hajibabaei, M. (2021). Profile hidden Markov model sequence analysis can help remove putative pseudogenes from DNA barcoding and metabarcoding datasets. BMC Bioinformatics, 22: 256.

If you use the RDP classifier, please cite the publication: Wang, Q., Garrity, G. M., Tiedje, J. M., & Cole, J. R. (2007). Naive Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy. Applied and Environmental Microbiology, 73(16), 5261–5267. doi:10.1128/AEM.00062-07

Last updated: May 2026

Acknowledgments

Hajibabaei Lab
Terri Porter
Alex Song
Contributors and community members

Support

Open an issue on GitHub
Check the documentation
Contact the development team
