Statistical sample matching analysis to assess the effectiveness of Spanish protected areas in conserving aboveground biomass carbon (2017-2024).
1. Setup environment:
conda env create -f env.yml
conda activate biomass_pas
2. Run analysis:
# Quick test (1 replica)
python src/runner.py 1
# Full analysis (100 replicas)
python src/runner.py 100
3. Export results:
python src/export_results.py
Results will be in the results/ directory as CSV and Parquet files.
Compares carbon trajectories inside vs outside protected areas using statistical matching across five scenarios:
- No Control - Random pairs (baseline)
- Biomass Only - Match on initial biomass
- Biomass + Climate - Add climate variables
- Biomass + Climate + Topography - Add elevation and slope
- Biomass + Climate + Topography + Accessibility - Add accessibility/remoteness
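The five scenarios are nested: each one matches on everything the previous one used, plus one more group of covariates. A minimal sketch of that structure follows. The names `GROUPS` and `SCENARIOS` are illustrative assumptions and are not taken from src/matching_scenarios.py.

```python
# Hypothetical sketch of the nested scenario structure; names are assumptions,
# not the identifiers used in src/matching_scenarios.py.
GROUPS = ("biomass", "climate", "topography", "accessibility")

# "No Control" matches on nothing, so pairs are effectively random.
SCENARIOS = {"No Control": ()}

# Each later scenario uses the first i covariate groups.
for i in range(1, len(GROUPS) + 1):
    if i == 1:
        label = "Biomass Only"
    else:
        label = " + ".join(g.capitalize() for g in GROUPS[:i])
    SCENARIOS[label] = GROUPS[:i]

for label, groups in SCENARIOS.items():
    print(f"{label}: {list(groups)}")
```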
For each scenario, the code:
- Creates matched PA/non-PA pairs with similar characteristics
- Calculates biomass trends (2017-2024) for each pair
- Compares PA vs non-PA carbon trajectories
- Repeats across 100 replicas for statistical robustness
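The steps above can be illustrated on synthetic data. The sketch below assumes greedy 1:1 nearest-neighbour matching on standardized covariates and a least-squares linear trend per pixel. Both are stand-ins chosen for illustration, and the functions `match_pairs` and `linear_trend` are hypothetical, not the implementation in src/core_analysis.py.

```python
import numpy as np


def match_pairs(pa_cov, ctrl_cov):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Covariates are standardized so that no single variable dominates
    the Euclidean distance. Returns one control index per PA pixel.
    """
    if len(ctrl_cov) < len(pa_cov):
        raise ValueError("need at least as many control pixels as PA pixels")
    both = np.vstack([pa_cov, ctrl_cov])
    mu, sd = both.mean(axis=0), both.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant covariates
    pa_z, ctrl_z = (pa_cov - mu) / sd, (ctrl_cov - mu) / sd

    available = np.ones(len(ctrl_z), dtype=bool)
    matches = []
    for row in pa_z:
        dist = np.linalg.norm(ctrl_z - row, axis=1)
        dist[~available] = np.inf  # each control pixel is used only once
        j = int(np.argmin(dist))
        available[j] = False
        matches.append(j)
    return np.array(matches)


def linear_trend(years, biomass):
    """Least-squares slope per pixel; biomass has shape (n_pixels, n_years)."""
    centered = years - years.mean()  # centering leaves the slope unchanged
    return np.polyfit(centered, biomass.T, deg=1)[0]


# Synthetic example: PA pixels gain 0.5 units/year, controls gain 0.2.
rng = np.random.default_rng(0)
years = np.arange(2017, 2025)
pa_cov = rng.normal(size=(20, 3))
ctrl_cov = rng.normal(size=(200, 3))
pa_biomass = 50.0 + 0.5 * (years - 2017) + np.zeros((20, 1))
ctrl_biomass = 50.0 + 0.2 * (years - 2017) + np.zeros((200, 1))

matches = match_pairs(pa_cov, ctrl_cov)
effect = np.mean(
    linear_trend(years, pa_biomass) - linear_trend(years, ctrl_biomass)[matches]
)
print(f"mean PA minus non-PA trend: {effect:.3f}")
```

Repeating this with a different random draw of candidate pixels for each replica gives a distribution of the trend difference rather than a single estimate.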
├── src/
│ ├── runner.py # Main analysis script
│ ├── core_analysis.py # Clustering and matching functions
│ ├── matching_scenarios.py # Five matching scenarios
│ ├── dataloaders.py # Data loading functions
│ ├── utils.py # Statistics and output
│ ├── export_results.py # Convert results to CSV/Parquet
│ ├── config.py # Configuration parameters
│ └── visualization/ # Plotting scripts
├── data/ # Input data (biomass, climate, PAs)
├── results/ # Output files
├── env.yml # Conda environment
└── README.md
Required data (place in data/ directory):
- Biomass time series (2017-2024, 100 m resolution)
- PA boundaries shapefile
- Climate baseline and anomalies
- Topography (elevation, slope)
- Accessibility layer
See the Zenodo repository (https://doi.org/10.5281/zenodo.17610642) for detailed data descriptions and access to the full dataset.
Edit src/config.py to change:
- Edge buffer size (default: 1 km two-sided)
- Cluster size (default: 2000 pixels)
- Pool size (default: 50 pixels)
- PA stratification type (designation or size)
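For orientation, the defaults listed above could be grouped as in the sketch below. The class and field names are hypothetical and will not match src/config.py, so check that file for the real parameter names. Only the default values and the 100 m pixel size come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchingConfig:
    """Hypothetical container for the documented defaults (names are assumptions)."""

    edge_buffer_km: float = 1.0      # two-sided edge buffer, per the README
    cluster_size_px: int = 2000      # pixels per cluster
    pool_size_px: int = 50           # pixels per pool
    stratification: str = "designation"
    pixel_size_m: float = 100.0      # biomass raster resolution

    def __post_init__(self):
        # "both" is listed as a command-line option in this README.
        allowed = ("designation", "size", "both")
        if self.stratification not in allowed:
            raise ValueError(f"stratification must be one of {allowed}")

    @property
    def edge_buffer_px(self) -> int:
        """Edge buffer width expressed in raster pixels."""
        return round(self.edge_buffer_km * 1000 / self.pixel_size_m)


cfg = MatchingConfig()
print(cfg.edge_buffer_px, cfg.cluster_size_px, cfg.pool_size_px)
```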
# Run with different stratifications
python src/runner.py 100 --stratification designation # By PA type (default)
python src/runner.py 100 --stratification size # By PA size
python src/runner.py 100 --stratification both # Both (2x computation)
# Quick test
python src/runner.py 1
Main results saved to results/pa_effectiveness_matching_results.npz
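The .npz file can be inspected directly with NumPy before exporting. The array names stored in the archive are not documented here, so this sketch only lists whatever it finds. `summarize_npz` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz archive.

    Uses NumPy's default allow_pickle=False, so an archive that stores
    Python objects would raise an error here instead of being unpickled.
    """
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


results_path = Path("results/pa_effectiveness_matching_results.npz")
if results_path.exists():
    for name, (shape, dtype) in summarize_npz(results_path).items():
        print(f"{name}: shape={shape}, dtype={dtype}")
else:
    print(f"{results_path} not found; run src/runner.py first")
```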
After running export_results.py:
- pool_results.parquet - Pool-level results (main dataset)
- replica_summaries.csv - Summary per replica
- overall_statistics.csv - Aggregated statistics
- cluster_quality.parquet - Matching quality metrics
See results/README.md for details on output files.
- Python 3.10+
- Main packages: numpy, pandas, geopandas, rasterio, scikit-learn, scipy
- See env.yml for the complete list