graeter-group/hydrolysis-qmmm-workflow

Hydrolysis of Collagen Triplehelix vs. Single Peptide

Workflow

source setup.sh

Open index.qmd and run the steps sequentially.

Project Overview

This is a scientific computing project for QM/MM simulations of collagen hydrolysis. It combines Python analysis pipelines (Hamilton) with the Quarto documentation framework for reproducible computational research. The project sets up, runs, and analyzes molecular dynamics simulations that compare hydrolysis of the collagen triple helix with hydrolysis of a single peptide.

Key Commands

Environment Setup

source setup.sh  # Sets up environment and activates appropriate venv based on hostname

Testing

make tests  # Runs pytest on src/ directory
python -m pytest src  # Alternative test command

Code Quality

black src/  # Format Python code (black is included in dependencies)

Documentation and Analysis

make preview  # Start Quarto preview server
quarto preview  # Alternative preview command

make docs  # Build full documentation with quartodoc and quarto

make analysis  # Run Python Hamilton pipeline (replaces R targets)

Development Utilities

make fix  # Fix permissions on docs directory
make sync  # Sync targets data using scripts/sync-targets-data.sh

Architecture Overview

Python dependencies are managed via uv with pyproject.toml.

Core Python Package (src/)

  • analysis.py: Hamilton-based data pipeline
  • workflows.py: High-level workflow functions for checking simulations
  • analysis_utils.py: Utility functions for reading simulation data
  • steps.py: Simulation and analysis steps
  • operations.py: Individual simulation and analysis operations
  • parsing.py: Data parsing utilities
  • mdatools.py: MDAnalysis-based molecular dynamics analysis tools
  • coords.py: Coordinate manipulation and geometric calculations
  • utils.py: General utility functions
  • settings.py: Configuration and constants for how the simulations are set up
  • units.py, constants.py: Scientific constants and unit conversions
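
As an illustration of the kind of conversion a units module in a QM/MM workflow typically needs: CP2K reports energies in Hartree and lengths in Bohr, while GROMACS works in kJ/mol and nm. The sketch below is not taken from units.py or constants.py; the function and constant names are assumptions, and the factors are rounded CODATA 2018 values.

```python
# Hypothetical helpers; names are illustrative and not the repository's API.
HARTREE_TO_KJ_PER_MOL = 2625.4996  # 1 Hartree in kJ/mol (CODATA 2018, rounded)
BOHR_TO_NM = 0.0529177211          # 1 Bohr in nm (CODATA 2018, rounded)


def hartree_to_kj_per_mol(energy_hartree: float) -> float:
    """Convert an energy from Hartree (atomic units) to kJ/mol."""
    return energy_hartree * HARTREE_TO_KJ_PER_MOL


def bohr_to_nm(length_bohr: float) -> float:
    """Convert a length from Bohr (atomic units) to nanometres."""
    return length_bohr * BOHR_TO_NM


def barrier_kj_per_mol(e_reactant_hartree: float, e_ts_hartree: float) -> float:
    """Energy difference between a transition state and a reactant, in kJ/mol."""
    return hartree_to_kj_per_mol(e_ts_hartree - e_reactant_hartree)
```

Keeping the factors in one place means every analysis step converts with the same values.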

Data Pipeline Architecture

The project uses Python (Hamilton) for data analysis workflows:

  • Python Pipeline: src/analysis.py
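
For readers unfamiliar with the Hamilton style: each function defines a node named after the function, and its parameter names declare which upstream nodes or inputs it depends on. The sketch below imitates that idea with a tiny standard-library resolver. It does not use Hamilton itself, and the node names (`raw_energies`, `mean_energy`) and the input format are assumptions, not contents of src/analysis.py.

```python
import inspect
from typing import Any, Callable


def raw_energies(energy_file_lines: list[str]) -> list[float]:
    """Parse the second whitespace-separated column of each line as an energy."""
    return [float(line.split()[1]) for line in energy_file_lines]


def mean_energy(raw_energies: list[float]) -> float:
    """Average energy; depends on the `raw_energies` node via its parameter name."""
    return sum(raw_energies) / len(raw_energies)


def resolve(target: str, nodes: dict[str, Callable[..., Any]], inputs: dict[str, Any]) -> Any:
    """Compute `target` by recursively resolving parameter names to nodes or inputs."""
    if target in inputs:
        return inputs[target]
    func = nodes[target]
    kwargs = {
        name: resolve(name, nodes, inputs)
        for name in inspect.signature(func).parameters
    }
    return func(**kwargs)


# Register the functions as nodes, keyed by function name.
NODES = {f.__name__: f for f in (raw_energies, mean_energy)}
result = resolve("mean_energy", NODES, {"energy_file_lines": ["0 1.0", "1 3.0"]})
```

Hamilton's driver does this resolution for a whole module. The resolver here only shows why function and parameter names matter in such a pipeline.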

Quarto Documentation System

  • _quarto.yml: Main Quarto configuration
  • index.qmd: Project overview and TODO tracking
  • thesis.qmd: Primary analysis notebook with Python code execution that generates the figures
  • reference/: API documentation generated by quartodoc
  • Outputs to docs/ directory as website

Simulation Data Structure

  • envs/: Environment files (JSON) containing simulation metadata
  • assets/: Templates for molecular dynamics simulations (GROMACS, CP2K)
  • tmp/: Temporary simulation outputs and intermediate files
  • data/: Analysis results and processed data
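
A minimal sketch of how the JSON environment files in envs/ could be collected into one dictionary for analysis. The schema of these files is not described here, so the function name `load_envs` and the keying by file stem are assumptions, not the repository's implementation.

```python
import json
from pathlib import Path
from typing import Any


def load_envs(env_dir: str) -> dict[str, Any]:
    """Read every *.json file in `env_dir` and key its content by file stem."""
    envs: dict[str, Any] = {}
    for path in sorted(Path(env_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            envs[path.stem] = json.load(fh)
    return envs
```

Calling `load_envs("envs")` from the project root would then give one entry per environment file.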

Dependencies

  • Scientific Python stack: numpy, pandas, matplotlib, seaborn, plotnine (installed via uv)
  • Molecular dynamics: MDAnalysis, gromacs
  • Documentation: Quarto, quartodoc
  • Data pipelines: Hamilton (Python)
  • Computational chemistry: cp2k-input-tools, kimmdy, gromacs

Important Notes

  • Environment setup varies by hostname (cascade cluster, local workstation, laptop)
  • The project combines QM/MM simulations with statistical analysis
  • Simulation templates in assets/ are critical for reproducible runs
  • Python environment needs to be properly configured (see setup.sh)
  • Tests are located in src/tests/
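
The hostname-dependent setup mentioned above is handled by setup.sh, a shell script. The selection logic might look roughly like the following Python sketch. The hostname prefixes and venv paths are hypothetical placeholders, not values from setup.sh.

```python
import socket

# Hypothetical mapping from hostname prefix to virtual environment directory.
VENV_BY_HOST_PREFIX = {
    "cascade": ".venv-cluster",      # assumed name for the cluster environment
    "workstation": ".venv-desktop",  # assumed name for a local workstation
}
DEFAULT_VENV = ".venv"               # assumed fallback, e.g. for a laptop


def select_venv(hostname: str) -> str:
    """Return the venv directory for the first matching hostname prefix."""
    for prefix, venv in VENV_BY_HOST_PREFIX.items():
        if hostname.startswith(prefix):
            return venv
    return DEFAULT_VENV


if __name__ == "__main__":
    print(select_venv(socket.gethostname()))
```

Matching on a prefix rather than the full hostname lets numbered cluster nodes share one entry.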

About

Workflow for QM/MM simulations of collagen triplehelix and single peptide hydrolysis
