RidhimaGupta4/FreshWaterMacroInvertebrates

Macroinvertebrate Abundance Trends in England’s Rivers (1990–2024)

📌 Project Overview

This project delivers a production-ready statistical framework for assessing national macroinvertebrate trends using 30+ years of Environment Agency (EA) open data. The primary challenge addressed is measurement heterogeneity: reconciling legacy categorical placeholders (e.g., AB indices) with modern numeric counts and sub-sampled data.

The pipeline fits Seasonal Censored-Normal Generalized Additive Mixed Models (CNORM GAMMs) to produce season-aware national indicators that respect interval uncertainty and phenology.
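As a sketch of what respecting interval uncertainty can mean in a censored-normal model: suppose observation $i$ is only known to lie in $[L_i, U_i]$ on the modelling scale, with mean $\mu_i$ and standard deviation $\sigma$. The exact model specification is not given in this README, so the linear predictor at the end is an assumption inferred from the feature list, not the project's confirmed formula. An interval-censored record ($L_i < U_i$) contributes

$$
\ell_i = \log\left[\Phi\left(\frac{U_i - \mu_i}{\sigma}\right) - \Phi\left(\frac{L_i - \mu_i}{\sigma}\right)\right],
$$

while an exact record ($L_i = U_i$) contributes the usual normal log-density

$$
\ell_i = \log\left[\frac{1}{\sigma}\,\phi\left(\frac{L_i - \mu_i}{\sigma}\right)\right],
$$

where $\Phi$ and $\phi$ are the standard normal distribution and density functions. One plausible form of the mean, with a season-specific smooth of year and a site random intercept, is

$$
\mu_i = \beta_{\mathrm{season}(i)} + f_{\mathrm{season}(i)}(\mathrm{year}_i) + b_{\mathrm{site}(i)}, \qquad b_s \sim N(0, \sigma_b^2).
$$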

🛠️ Technical Stack

  • Language: R
  • Modeling: mgcv (censored GAMMs with REML smoothness selection)
  • Data Engineering: Apache Arrow (for high-performance Parquet processing)
  • Visualization: ggplot2, gratia, and sf for spatial mapping

🔬 Key Statistical Features

  • Interval-Censored Likelihood: Every observation is treated as an interval $[L_i, U_i]$ on a variance-stabilizing $\sqrt{\cdot}$ scale to account for one-significant-figure rounding and categorical bins.
  • Seasonal Phenology: Separate thin-plate regression splines for Spring and Autumn to capture divergent life-cycle dynamics.
  • Precision Weighting: Weights $w_i \propto 1 / (\text{width}_i + 1)$, where $\text{width}_i$ is the width of interval $i$, ensure that exact counts carry more statistical weight than broad categorical ranges.
  • Spatial Heterogeneity: Integrated site-level random intercepts to produce site-marginal national curves, ensuring trends reflect population-level change rather than site-specific noise.
  • Model Triangulation: Validated findings across four aligned models: Presence/Absence (Binomial), Ordered Categorical (OCAT), Censored-Poisson, and the headline CNORM.
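
The features above can be sketched end to end on simulated data. This is a minimal illustration, not the project's actual pipeline: the bin edges, the 2002 cutoff for legacy records, the column names, the choice to measure interval width on the square-root scale, and the basis dimension `k = 8` are all assumptions made for the example. It relies on the `cnorm` family in mgcv, which accepts a two-column response where unequal finite entries denote interval censoring.

```r
library(mgcv)
set.seed(42)

# --- Simulated stand-in data (not Environment Agency records) ---------------
n_sites <- 20
years   <- 1990:2024
dat <- expand.grid(site = factor(seq_len(n_sites)), year = years,
                   season = factor(c("Spring", "Autumn")))
site_eff <- rnorm(n_sites, sd = 0.5)
mu <- 3 + 0.04 * (dat$year - 1990) + site_eff[as.integer(dat$site)] +
  ifelse(dat$season == "Autumn", 0.3, 0)
count <- round(pmax(rnorm(nrow(dat), mean = mu, sd = 0.7), 0)^2)

# --- Interval construction ---------------------------------------------------
# Assumption: records before a hypothetical cutoff year only report a bin.
legacy <- dat$year < 2002
breaks <- c(0, 1, 10, 100, 1000, 10000)   # hypothetical bin edges [lo, hi)
bin    <- findInterval(count, breaks)
lower  <- ifelse(legacy, breaks[bin], count)
upper  <- ifelse(legacy, breaks[bin + 1] - 1, count)

# Variance-stabilizing square-root scale
dat$lo <- sqrt(lower)
dat$hi <- sqrt(upper)

# --- Precision weights: w_i proportional to 1 / (width_i + 1) ---------------
# Assumption: width is measured on the square-root scale.
dat$w <- 1 / ((dat$hi - dat$lo) + 1)
dat$w <- dat$w / mean(dat$w)              # normalise to mean 1

# --- Censored-normal GAMM ----------------------------------------------------
# Two-column response: equal entries = exact, unequal = interval censored.
dat$y <- cbind(dat$lo, dat$hi)
fit <- gam(y ~ season + s(year, by = season, bs = "tp", k = 8) +
             s(site, bs = "re"),
           family = cnorm(), weights = w, data = dat, method = "REML")

# --- Seasonal curves with the site random intercept excluded ----------------
newd <- expand.grid(year = years, season = levels(dat$season))
newd$season <- factor(newd$season, levels = levels(dat$season))
newd$site   <- factor(levels(dat$site)[1], levels = levels(dat$site))
newd$fit_sqrt <- as.numeric(predict(fit, newdata = newd, exclude = "s(site)"))
head(newd)
```

Excluding `s(site)` at prediction time is one simple reading of a site-marginal curve; the project may average over sites differently. Predictions are on the square-root scale and would need back-transforming before being reported as abundances.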

📊 Findings Summary

Across five focal families (Aphelocheiridae, Brachycentridae, Cordulegastridae, Odontoceridae, Potamanthidae), the analysis reveals a characteristic "dip and recovery":

  • General Trend: Significant national increase through the 1990s to a mid-2000s peak, followed by stabilization and a recent recovery into the 2020s.
  • Spatial Insight: For the Aphelocheiridae family, approximately 97% of studied sites showed positive abundance shifts between the early ($\le2005$) and recent ($\ge2006$) windows.
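
A share of sites with positive shifts, like the one quoted above, can be computed along the following lines. This README does not say how a site's shift is defined, so this sketch assumes the simplest version: the difference between a site's mean abundance in the recent window and in the early window, restricted to sites observed in both. The function name, column names, and toy data are hypothetical.

```r
# Share of sites whose mean abundance is higher in the recent window
# (year > split_year) than in the early window (year <= split_year).
site_shift_share <- function(df, split_year = 2005) {
  window <- factor(ifelse(df$year <= split_year, "early", "recent"),
                   levels = c("early", "recent"))
  # Site x window matrix of means; NA where a site has no records in a window
  means <- tapply(df$abundance, list(df$site, window), mean)
  both  <- means[stats::complete.cases(means), , drop = FALSE]
  shift <- both[, "recent"] - both[, "early"]
  list(n_sites = length(shift), share_positive = mean(shift > 0))
}

# Toy data: site D has no recent records and is dropped
toy <- data.frame(
  site      = c("A", "A", "B", "B", "C", "C", "D"),
  year      = c(1995, 2015, 2000, 2020, 2003, 2010, 1998),
  abundance = c(2, 5, 4, 3, 1, 6, 7)
)
res <- site_shift_share(toy)
res$n_sites         # 3
res$share_positive  # 2/3: sites A and C increase, site B decreases
```

In practice the comparison would more likely use model-based site estimates than raw means, since raw means ignore the interval censoring described earlier.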

About

National abundance trend analysis of England’s river macroinvertebrates (1990–2024). Implements a custom R pipeline using Arrow for big data and seasonal interval-censored GAMMs (mgcv) to reconcile legacy categorical records with modern numeric counts.
