Performance benchmarking suite for the CROC ocean modeling ecosystem (CrocoDash, mom6_forge, xESMF/ESMF regridding).

| Suite | File(s) | Data needed | What it measures |
|---|---|---|---|
| xESMF weight generation | `xesmf/bench_weights_generate.py` | None (synthetic) | `xe.Regridder()` construction time + RSS, grid→grid and grid→locstream, up to ~1 M source points |
| xESMF regrid application | `xesmf/bench_regrid_apply.py` | None (synthetic) | `regridder(ds)` time across grid sizes, time depths, and methods |
| ESMF weight generation | `esmf/bench_weights_generate.py` | None (synthetic) | `esmpy.Regrid()` construction time + RSS (raw ESMF, same sizes as the xESMF suite) |
| ESMF regrid application | `esmf/bench_regrid_apply.py` | None (synthetic) | `esmpy.Regrid()(src, dst)` time (raw ESMF, loop over `ntime` steps) |
| OBC forcing pipeline | `crocodash/bench_obc.py` | Cached GLORYS (GLADE) | REGRID + MERGE phases of `process_conditions()`, no GET; varies `step_days` chunk size |
| Runoff mapping | `mom6_forge/bench_runoff_mapping.py` | ESMF mesh files (GLADE) | `gen_rof_maps()`: nearest-neighbour and smoothed NN mapping between ROF and OCN meshes |
| Raw data access health | `crocodash/bench_raw_data_access.py` | Credentials / GLADE | Connectivity check for GLORYS (RDA + Copernicus), GEBCO, GLOFAS, MOM6 output; returns 1 (up) / 0 (down) |
| Bathymetry pipeline | `mom6_forge/bench_topo.py` | GEBCO (GLADE) | `Topo.set_from_dataset()`: GEBCO regrid + fill across grid sizes |
| Import times | `crocodash/bench_imports.py` | None | `import CrocoDash.case`, `mom6_forge.grid`/`topo`/`vgrid`; catches import-time regressions |

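
For orientation, ASV chooses what to record from a method's name prefix: `time_` methods are timed, `peakmem_` methods report peak resident memory, `track_` methods report a returned number, and `timeraw_` methods return a code string that is timed in a fresh process. The sketch below is illustrative only and is not taken from this suite; the class names, the `params` values, and the workloads are hypothetical stand-ins.

```python
import os


class SyntheticGridSuite:
    """Hypothetical suite: timing and peak memory across grid sizes."""

    # ASV runs setup() and each benchmark once per parameter value.
    params = [10, 100]
    param_names = ["n"]

    def setup(self, n):
        # Build an n x n synthetic grid; stands in for a real source grid.
        self.grid = [[float(i * n + j) for j in range(n)] for i in range(n)]

    def time_row_sums(self, n):
        # `time_` prefix: ASV records how long this call takes.
        return [sum(row) for row in self.grid]

    def peakmem_copy(self, n):
        # `peakmem_` prefix: ASV records peak resident memory for this call.
        return [row[:] for row in self.grid]


class ImportSuite:
    def timeraw_import_json(self):
        # `timeraw_` prefix: the returned code string is timed in a new
        # process, so the module is not already cached in sys.modules.
        return "import json"


class HealthSuite:
    unit = "up"

    def track_root_exists(self):
        # `track_` prefix: ASV records the returned number (1 = up, 0 = down).
        return 1 if os.path.exists(os.sep) else 0
```

A health check in this style shows up on the dashboard as a 0/1 series rather than a timing, which matches the up/down convention in the table.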
Verify your environment is ready:

```bash
conda activate CrocoDash
bash scripts/configure.sh
```

No path configuration is needed: `asv.conf.json` points to the public CrocoDash GitHub repo, so it works identically on GLADE and in CI.

Use `scripts/run_bench.sh`. It automatically detects the CrocoDash commit from your active editable install and passes the correct `--set-commit-hash`:

```bash
conda activate CrocoDash

# Run all benchmarks (full timing)
bash scripts/run_bench.sh

# Quick sanity check (one rep per benchmark, less accurate but fast)
bash scripts/run_bench.sh --quick

# Run a specific suite or class
bash scripts/run_bench.sh --bench "CrocoDashImports" --quick
bash scripts/run_bench.sh --bench "XESMFWeightsGenerate"
```

Any extra arguments are passed straight through to `asv run`. Each run produces a result file named `<croco-hash>-existing-....json` in `results/derecho/`, tagged to the exact CrocoDash commit checked out locally.

Why the wrapper instead of asv run directly? With environment_type: "existing", ASV silently discards results unless --set-commit-hash is passed. The wrapper handles this automatically and uses your local CrocoDash HEAD (not GitHub's main), so benchmarking older commits works correctly.
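
The wrapper's core idea can be sketched in a few lines. This is not the contents of `scripts/run_bench.sh`; it is a minimal Python illustration that assumes the CrocoDash checkout directory is known and that `git` is on the `PATH`. The function names `local_head` and `build_asv_command` are hypothetical.

```python
import subprocess
import sys


def local_head(repo_dir):
    """Return the full hash of the commit checked out in repo_dir."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return out.stdout.strip()


def build_asv_command(commit_hash, extra_args=()):
    """Assemble `asv run` with the hash pinned; extra args pass through."""
    return [
        sys.executable, "-m", "asv", "run",
        "--set-commit-hash", commit_hash,
        *extra_args,
    ]


# Hypothetical usage (paths are placeholders):
#   cmd = build_asv_command(local_head("/path/to/CrocoDash"), ["--quick"])
#   subprocess.run(cmd, check=True)
```

Resolving the hash from the local checkout rather than from the remote is what lets results for older commits be recorded under the right revision.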
On Derecho/GLADE, submit as a PBS job for data-dependent benchmarks:

```bash
qsub scripts/pbs_submit.sh
```

To populate the regression timeline with real per-version data, benchmark across several CrocoDash commits:

```bash
cd /path/to/CrocoDash
# Full hashes of the five most recent commits (unambiguous for --set-commit-hash)
COMMITS=($(git log --format=%H -5))

cd /path/to/SeaSloth
for HASH in "${COMMITS[@]}"; do
    # Check out each commit in the editable install, then benchmark it
    git -C /path/to/CrocoDash checkout --quiet "$HASH"
    python -m asv run --quick --bench "CrocoDashImports" --set-commit-hash "$HASH"
done
# Restore the CrocoDash working tree
git -C /path/to/CrocoDash checkout main
```

Commit and push the accumulated results so CI can publish the updated timeline:

```bash
git add results/
git commit -m "bench: import times across CrocoDash commits"
git push
```

Fill in paths in `benchmarks/data_config.json` to enable benchmarks that need real data:

| Key | Benchmark | What to put |
|---|---|---|
| `gebco_path` | Bathymetry pipeline | Path to `GEBCO_2024.nc` |
| `mesh_pairs[*].rof_mesh` / `ocn_mesh` | Runoff mapping | Paths to ESMF mesh NetCDF files |
| `obc_config_path` | OBC pipeline | Path to a CrocoDash case config YAML |
| `obc_step_days_dirs` | OBC pipeline | Three pre-staged raw GLORYS folders (one per `step_days`) |

Benchmarks that need data raise NotImplementedError in setup() when files are absent — ASV marks them n/a automatically.
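
The skip pattern described above might look like the following. This is a hedged sketch, not the suite's actual code: it assumes `data_config.json` is a flat JSON object with `gebco_path` at the top level, and `load_data_config`, `TopoBenchSketch`, and the placeholder workload are hypothetical.

```python
import json
import os


def load_data_config(path):
    """Return the parsed config, or an empty dict when the file is absent."""
    if not os.path.isfile(path):
        return {}
    with open(path) as fh:
        return json.load(fh)


class TopoBenchSketch:
    """Hypothetical data-dependent benchmark; not the suite's real class."""

    config_path = "benchmarks/data_config.json"

    def setup(self):
        cfg = load_data_config(self.config_path)
        gebco = cfg.get("gebco_path") or ""
        if not os.path.isfile(gebco):
            # ASV treats NotImplementedError raised in setup() as a skip.
            raise NotImplementedError("gebco_path is unset or missing")
        self.gebco_path = gebco

    def time_file_size(self):
        # Placeholder workload; a real benchmark would regrid and fill here.
        return os.path.getsize(self.gebco_path)
```

Checking for the file in `setup()` keeps the benchmark methods free of guard logic, and machines without the data report the benchmark as skipped rather than failed.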
```bash
bash scripts/publish.sh
```

Builds two pages in `.asv/html/`:

- `index.html`: snapshot bar charts per benchmark class (generated by `scripts/generate_report.py`)
- `asv_timeline.html`: ASV's commit-timeline view for spotting regressions (needs 2+ commits to show data)

The live dashboard deploys to GitHub Pages on every push to main.
First-time GitHub Pages setup: Settings → Pages → Source → GitHub Actions.