This repository includes material for the Physalia workshop on Generalized Linear Latent Variable Models, 7-10 July 2026. Feel free to share, alter, or re-use this material with appropriate referencing of this repository.
Workshop webpage: https://www.physalia-courses.org/courses-workshops/gllvm/
Since the 1950s, ecologists have used ordination methods for analysis of data on ecological communities. In recent years, research has shown that classical ordination methods (PCA, PCoA, RDA, CA, CCA, NMDS, etc.) which rely on distance measures have various unfavourable properties. Warton et al. (2012) showed that distance-based methods confound location and dispersion effects, and O'Hara and Kotze (2010) demonstrated that log-transforming count data is generally inappropriate. Classical methods also lack random effects, uncertainty quantification, predictive capacity, and a coherent unified framework.
Hui et al. (2015) suggested the Generalized Linear Latent Variable Modeling (GLLVM) framework as a modern alternative for ecological multivariate analysis. GLLVMs can be seen as a multivariate extension of GL(M)Ms, inheriting many useful properties of both statistical models and ordination methods. General references on latent variable models include Skrondal and Rabe-Hesketh (2004) and Bartholomew et al. (2011).
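
To make the "multivariate extension" concrete, one common way of writing a GLLVM is sketched here. The notation is an assumption chosen for illustration and is not taken from the workshop material: $y_{ij}$ is the response of species $j$ at site $i$, $g$ is a link function, $\mathbf{x}_i$ holds the covariates at site $i$, and $\mathbf{z}_i$ is a vector of $d$ latent variables.

$$
g\left(\mathrm{E}[y_{ij} \mid \mathbf{z}_i]\right) = \beta_{0j} + \mathbf{x}_i^\top \boldsymbol{\beta}_j + \mathbf{z}_i^\top \boldsymbol{\gamma}_j, \qquad \mathbf{z}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_d)
$$

Here $\beta_{0j}$ is a species-specific intercept, $\boldsymbol{\beta}_j$ are species-specific covariate effects, and $\boldsymbol{\gamma}_j$ are species loadings on the latent variables. Without the latent term, the model reduces to a separate GLM per species. Without the covariate term, the latent variables act as ordination axes, and the loadings induce residual correlation between species.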
This workshop teaches GLLVMs through a mix of lectures and practicals, building from multispecies GLMs and GLMMs through JSDMs to model-based ordination and beyond. Basic familiarity with GLMs and the R programming language is assumed. The material of my Physalia workshop on Generalised Linear Models can be found here. Gavin Simpson's Physalia workshop on classical multivariate analysis (github here) can serve as an introduction to some of the material in this course.
Please make sure to update your R installation prior to the workshop. Most of the code used in the workshop should also work on older versions of R, but some of the R packages used might not be available or fully functional.
You can find an R installation for your operating system here.
Sessions run from 14:00 to 20:00, Tuesday to Friday, and consist of a mix of lectures, in-class discussion, and practical exercises over Zoom.
- Introduction to model-based community analysis
- Multispecies Generalised Linear Models
- Multispecies Generalised Linear Mixed Models
- Model checking and comparison
- Hierarchical environmental responses
- Joint Species Distribution Models
- Predicting species richness and diversity
- Unimodal response models
- Extensions: spatial/temporal autocorrelation and mixed response types
- Own data analysis and wrap-up
| Day | Time | Subject |
|---|---|---|
| Tuesday | 14:00 - 14:45 | Introduction to model-based community analysis |
| | 14:45 - 15:45 | Vector Generalised Linear Models |
| | 15:45 - 16:00 | Break |
| | 16:00 - 17:00 | Practical 1: Fitting multispecies GLMs |
| | 17:00 - 17:30 | Vector Generalised Linear Mixed Models |
| | 17:30 - 18:15 | Break |
| | 18:15 - 18:45 | Model checking and comparison |
| | 18:45 - 20:00 | Practical 2: Multispecies GLMMs and diagnostics |
| Wednesday | 14:00 - 14:45 | Hierarchically modelling environmental responses |
| | 14:45 - 15:45 | Practical 3: Traits and phylogeny |
| | 15:45 - 16:00 | Break |
| | 16:00 - 16:45 | Joint Species Distribution Models |
| | 16:45 - 17:45 | Practical 4: Joint Species Distribution Models |
| | 17:45 - 18:30 | Break |
| | 18:30 - 19:15 | Predicting species richness and diversity |
| | 19:15 - 20:00 | Practical 5: Predicting diversity |
| Thursday | 14:00 - 14:45 | Model-based ordination |
| | 14:45 - 15:45 | Practical 6: Model-based ordination |
| | 15:45 - 16:00 | Break |
| | 16:00 - 16:45 | Ordination with covariates |
| | 16:45 - 17:45 | Article reanalysis |
| | 17:45 - 18:30 | Break |
| | 18:30 - 19:15 | Conditioning and nested designs |
| | 19:15 - 20:00 | Practical 7: Conditioning and partial ordination |
| Friday | 14:00 - 14:45 | Unimodal response models |
| | 14:45 - 15:30 | Practical 8: Unimodal responses |
| | 15:30 - 15:45 | Break |
| | 15:45 - 16:30 | Extensions: spatial/temporal and mixed response types |
| | 16:30 - 17:30 | Practical 9: Extensions |
| | 17:30 - 18:15 | Break |
| | 18:15 - 20:00 | Own data analysis and wrap-up |
| gllvm argument | Function | Accepted structures | Data |
|---|---|---|---|
| `formula` | Fixed and random species-specific effects | lme4-type formula (e.g. `~ x1 + (0+x2\|1)`) | `X`: environmental variables |
| `lv.formula` | Specifies fixed or random effects in the ordination | lme4-type formula (e.g. `~x1 + x2` or `~(0+x1 + x2\|1)`) | `X`: covariates for the latent variables |
| `row.eff` | Includes fixed and random species-common effects | glmmTMB-type formula, alternatively `"fixed"` or `"random"` | `studyDesign`: any categorical or continuous covariates |
| `lvCor` | For group-level unconstrained ordination or to introduce correlation structure among unconstrained latent variables | lme4-type formula | `studyDesign` |
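
As a rough illustration of how the arguments listed above fit together, the sketch below simulates a small count dataset and passes `formula`, `row.eff`, `X`, and `studyDesign` to `gllvm()`. The data, the variable names (`x1`, `x2`, `plot`, `sp1`, ...), and the choice of a Poisson family with two latent variables are all assumptions made for this example, not workshop material. The exact behaviour of these arguments may differ between gllvm versions, so check `?gllvm` for the version you have installed. The model is only fitted if the gllvm package is available.

```r
# Simulated stand-in data (hypothetical; not the workshop data)
set.seed(1)
n_sites <- 30
n_species <- 8

# X: environmental covariates, one row per site
X <- data.frame(x1 = rnorm(n_sites), x2 = rnorm(n_sites))

# studyDesign: each site belongs to one of five plots
study_design <- data.frame(plot = factor(rep(1:5, each = 6)))

# Poisson counts with a species-specific response to x1
slopes <- rnorm(n_species, sd = 0.5)
eta <- outer(X$x1, slopes)  # n_sites x n_species linear predictor
y <- matrix(rpois(n_sites * n_species, lambda = exp(eta)),
            nrow = n_sites, ncol = n_species,
            dimnames = list(NULL, paste0("sp", seq_len(n_species))))

# Formulas corresponding to two rows of the table
species_formula <- ~ x1        # formula: species-specific fixed effect of x1
row_formula <- ~ (1 | plot)    # row.eff: random intercept per plot, common to all species

# Fit only when gllvm is installed, so the script runs either way
if (requireNamespace("gllvm", quietly = TRUE)) {
  library(gllvm)
  fit <- gllvm(
    y = y, X = X, studyDesign = study_design,
    formula = species_formula,
    row.eff = row_formula,
    family = "poisson",
    num.lv = 2                 # two unconstrained latent variables
  )
  print(fit)
}
```

The `lv.formula` and `lvCor` arguments are left out here to keep the sketch short. According to the table, they take lme4-type formulas in the same way, with covariates read from `X` and `studyDesign` respectively.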
The gllvm R package is the primary focus of this workshop, but several other packages implement related methods for model-based multivariate analysis of community data. A detailed overview with examples is available in this presentation and accompanying practical.
| Package | Description |
|---|---|
| mvabund | Multivariate GLMs for community data; hypothesis testing via resampling |
| Hmsc | Hierarchical Model of Species Communities; Bayesian JSDM framework |
| sjSDM | Joint Species Distribution Models via deep learning |
| boral | Bayesian ordination and regression analysis using latent variables |
| ecoCopula | Copula-based models for multivariate abundance data |
| glmmTMB | Generalised linear mixed models via TMB; flexible random effects |
The animation below shows the variational approximation converging to the final solution when fitting a GLLVM to the spider dataset. The ordination plots settle as the algorithm iterates toward the maximum of the approximate likelihood.
