Wading Bird Population Forecasting

A comparative forecasting framework for Everglades wading bird populations using dynamic GAMs and traditional time series models.

Overview

This project implements sliding window cross-validation to compare multiple forecasting approaches for six wading bird species in the Florida Everglades. Models incorporate water management covariates and evaluate predictions using both numeric (CRPS) and ordinal (RPS) metrics.

Species analyzed:

Great Blue Heron (gbhe)
Great Egret (greg)
Roseate Spoonbill (rosp)
Snowy Egret (sneg)
White Ibis (whib)
Wood Stork (wost) spatial hierarchy:

all
 └── subregion (8 regions: 1, 2a, 2b, 3an, 3as, 3ase, 3b, inlandenp)
      └── colony (33 viable colonies ≥15 years)

Installation

# Required packages
install.packages(c(
  "config",        # Configuration management
  "conflicted",    # Namespace conflict handling
  "distributional",# Vectorised probability distributions
  "dplyr",         # Data manipulation
  "fable",         # Traditional time series models
  "feasts",        # Feature extraction
  "ggplot2",       # Figures
  "glue",          # String formatting
  "remotes",       
  "scales", 
  "tidyr",         # Data structure
  "tsibble",       # Time series tibbles
  "verification",  # RPS scoring
  "zoo"            # Interpolation
))

# Install from GitHub
remotes::install_github("nicholasjclark/mvgam")  # Dynamic GAMs
remotes::install_github("weecology/wader")        # Bird count data
remotes::install_github("weecology/edenr")        # Everglades water data

Strucutre

.
├── main.R                          # Main execution script
├── config.yml                      # Configuration settings
├── data_functions.R                # Data loading and preparation
├── evaluation.R                    # Cross-validation and metrics
├── plotting.R                      # Visualization functions
├── final_year_plots.R              # Forecast vs actual for latest year
├── models/
│   ├── mvgam_baseline.R           # Baseline mvgam model
│   ├── mvgam_ar.R                 # AR model with water covariates
│   ├── mvgam_ar_exog.R            # AR with polynomial covariates
│   ├── mvgam_species_specific.R   # Species-specific smooth responses
│   ├── mvgam_trait.R              # Trait-based VAR model
│   └── fable_models.R             # ARIMA, TSLM, ARIMA-exog
└── results/
    ├── RDS_results/               # Saved model outputs (.rds)
    └── *.png                      # Generated plots

Workflow

#config.R choose

dgam/fable models you want to use.
run fable and dgam
ordinal evaluation or not
ordinal breaks
training and testing windown size


main.R:
  ↓
if sliding_window_breaks = FALSE:
    Compute breaks once from full dataset
    ↓
    filter_ordinal_years(ordinal_years) → subset if needed
    ↓
    precomputed_breaks = [species-specific quantiles]
else:
    precomputed_breaks = NULL
  ↓
Pass to fit_sliding_window()
  ↓
For each window:
    ↓
    if precomputed_breaks exists:
        Use fixed breaks
    else:
        Compute from training window
        ↓
        filter_ordinal_years(ordinal_years) → subset if needed

Evaluations

CRPS (numeric)

Continuous Ranked Probability Score
↓
Measures: How close is predicted distribution to actual count?
Lower = Better

RPS (Ordinal)

Ranked Probability Score
↓
Step 1: Define categories from training data quantiles
        Low: < 33rd percentile
        Medium: 33rd - 50th percentile  
        High: 50th - 67th percentile
        Very High: > 67th percentile
↓
Step 2: Convert predictions to category probabilities
        P(Low) = pnorm(threshold_low, mean=prediction, sd=pred_sd)
        P(Medium) = pnorm(threshold_med, ...) - P(Low)
        ...
↓
Step 3: Score predicted probabilities vs actual category
        Lower = Better
        
#skill score
skill_score = 1 - (model_score / baseline_score)

> 0: Better than baseline
= 0: Same as baseline
< 0: Worse than baseline

Visualization

Four plots generated per model framework:

- CRPS Skill Over Time - Time series of skill scores by species
- RPS Skill Over Time - Ordinal accuracy evolution
- Combined Metrics - All metrics in faceted grid
- Best Model Counts - Winner frequency by species

All settings controlled via config.yml. Key options:

default:
  spatial:
    level: all              # "all", "subregion", or "colony"
    run_by_region: false    # true = separate model per spatial unit
    min_years_required: 10  # Minimum years to include a spatial unit

  models:
    mvgam: [baseline, ar, ar_exog, species_specific, trait]
    fable: [baseline, arima, tslm, arima_exog]

  run_mvgam: true           # Enable/disable mvgam framework
  run_fable: true           # Enable/disable fable framework
  use_ordinal: true         # true = CRPS + RPS, false = CRPS only

  ordinal_breaks: [0.33, 0.50, 0.67]   # Category boundaries (quantiles)
  sliding_window_breaks: true           # true = recompute per window
  ordinal_years: All                    # Years used to define breaks

  train_years: 20           # Training window size
  test_years: 2             # Test window size
  chains: 4                 # MCMC chains
  burnin: 1500              # MCMC warmup iterations
  samples: 1500             # MCMC sampling iterations

Name		Name	Last commit message	Last commit date
Latest commit History 55 Commits
models		models
results		results
.gitignore		.gitignore
3as_depth_data.csv		3as_depth_data.csv
EvergladesForecasting.qmd		EvergladesForecasting.qmd
README.md		README.md
_quarto.yml		_quarto.yml
compare_fable_mvgam.R		compare_fable_mvgam.R
config.yml		config.yml
data_functions.R		data_functions.R
eden_covariates.csv		eden_covariates.csv
evaluation.R		evaluation.R
everglades-forecasts.R		everglades-forecasts.R
final_year_plots.R		final_year_plots.R
forecast_plot.R		forecast_plot.R
indicator-forecasts.R		indicator-forecasts.R
indicator-forecasts.qmd		indicator-forecasts.qmd
inlandenp_depth_data.csv		inlandenp_depth_data.csv
main.R		main.R
mvgam-model.qmd		mvgam-model.qmd
plot_all_forecasts.R		plot_all_forecasts.R
plotting.R		plotting.R
scenario_forecast.R		scenario_forecast.R
time-series.qmd		time-series.qmd
water-data.R		water-data.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Wading Bird Population Forecasting

Overview

Installation

Strucutre

Workflow

Evaluations

CRPS (numeric)

RPS (Ordinal)

Visualization

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Wading Bird Population Forecasting

Overview

Installation

Strucutre

Workflow

Evaluations

CRPS (numeric)

RPS (Ordinal)

Visualization

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages