CROCODILE-CESM/SeaSloth

SeaSloth

Performance benchmarking suite for the CROC ocean modeling ecosystem (CrocoDash, mom6_forge, xESMF/ESMF regridding).

What's Benchmarked

| Suite | File(s) | Data needed | What it measures |
| --- | --- | --- | --- |
| xESMF weight generation | xesmf/bench_weights_generate.py | None (synthetic) | xe.Regridder() construction time + RSS, grid→grid and grid→locstream, up to ~1 M source points |
| xESMF regrid application | xesmf/bench_regrid_apply.py | None (synthetic) | regridder(ds) time across grid sizes, time depths, and methods |
| ESMF weight generation | esmf/bench_weights_generate.py | None (synthetic) | esmpy.Regrid() construction time + RSS (raw ESMF, same sizes as the xESMF suite) |
| ESMF regrid application | esmf/bench_regrid_apply.py | None (synthetic) | esmpy.Regrid()(src, dst) time (raw ESMF, looping over ntime steps) |
| OBC forcing pipeline | crocodash/bench_obc.py | Cached GLORYS (GLADE) | REGRID + MERGE phases of process_conditions(), no GET; varies step_days chunk size |
| Runoff mapping | mom6_forge/bench_runoff_mapping.py | ESMF mesh files (GLADE) | gen_rof_maps(): nearest-neighbour and smoothed NN mapping between ROF and OCN meshes |
| Raw data access health | crocodash/bench_raw_data_access.py | Credentials / GLADE | Connectivity check for GLORYS (RDA + Copernicus), GEBCO, GLOFAS, MOM6 output; returns 1 (up) / 0 (down) |
| Bathymetry pipeline | mom6_forge/bench_topo.py | GEBCO (GLADE) | Topo.set_from_dataset(): GEBCO regrid + fill across grid sizes |
| Import times | crocodash/bench_imports.py | None | import CrocoDash.case, mom6_forge.grid/topo/vgrid; catches import-time regressions |
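
For readers new to ASV, the measurement kinds in the table correspond to ASV's method-name prefixes: time_ methods are timed, peakmem_ methods record peak resident memory, and track_ methods record whatever number they return. The sketch below is not taken from the SeaSloth suites. SyntheticGridSuite, EndpointHealth, and is_reachable are hypothetical names, example.org is a placeholder host, and a plain list of lists stands in for a real grid.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return 1 if a TCP connection to (host, port) succeeds, else 0."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return 1
    except OSError:
        return 0


class SyntheticGridSuite:
    # ASV runs every benchmark method once per parameter value.
    params = [10, 100]
    param_names = ["n"]

    def setup(self, n):
        # Synthetic n x n "grid"; setup is not part of the timed region.
        self.grid = [[float(i * j) for j in range(n)] for i in range(n)]

    def time_row_sums(self, n):
        # time_*: ASV reports how long this call takes.
        [sum(row) for row in self.grid]

    def peakmem_copy(self, n):
        # peakmem_*: ASV reports the peak resident memory of the process.
        [row[:] for row in self.grid]


class EndpointHealth:
    unit = "up"

    def track_example_endpoint(self):
        # track_*: ASV records the returned number (1 = up, 0 = down).
        # Hypothetical endpoint, not one of the services SeaSloth checks.
        return is_reachable("example.org", 443)
```

A class like this would be picked up by ASV from the benchmark directory named in asv.conf.json and could be selected with --bench "SyntheticGridSuite".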

First-time Setup

Verify your environment is ready:

conda activate CrocoDash
bash scripts/configure.sh

No path configuration needed — asv.conf.json points to the public CrocoDash GitHub repo so it works identically on GLADE and in CI.

Running Benchmarks

Use scripts/run_bench.sh — it automatically detects the CrocoDash commit from your active editable install and passes the correct --set-commit-hash:

conda activate CrocoDash

# Run all benchmarks (full timing)
bash scripts/run_bench.sh

# Quick sanity check (one rep per benchmark, less accurate but fast)
bash scripts/run_bench.sh --quick

# Run a specific suite or class
bash scripts/run_bench.sh --bench "CrocoDashImports" --quick
bash scripts/run_bench.sh --bench "XESMFWeightsGenerate"

Any extra arguments are passed straight through to asv run. Each run produces a result file named <croco-hash>-existing-....json in results/derecho/, tagged to the exact CrocoDash commit checked out locally.

Why the wrapper instead of asv run directly? With environment_type: "existing", ASV silently discards results unless --set-commit-hash is passed. The wrapper handles this automatically and uses your local CrocoDash HEAD (not GitHub's main), so benchmarking older commits works correctly.
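
The wrapper's two jobs (find the local CrocoDash HEAD, then pass it to asv run) can be sketched in Python as follows. This is an illustration under assumptions, not the contents of scripts/run_bench.sh: detect_commit and build_asv_command are hypothetical helpers, and the path to the CrocoDash checkout is supplied by the caller rather than discovered from the editable install.

```python
import subprocess
import sys


def detect_commit(repo_path):
    """Return the full hash of HEAD in the Git checkout at repo_path."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def build_asv_command(commit_hash, extra_args=()):
    """Assemble an `asv run` command that records results for commit_hash."""
    return [
        sys.executable, "-m", "asv", "run",
        "--set-commit-hash", commit_hash,
        *extra_args,  # forwarded unchanged, e.g. ["--quick"]
    ]
```

For example, subprocess.call(build_asv_command(detect_commit("/path/to/CrocoDash"), ["--quick"])) would run a quick pass tagged to whatever commit is checked out at that path.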

On Derecho/GLADE, submit the data-dependent benchmarks as a PBS job:

qsub scripts/pbs_submit.sh

Tracking Multiple CrocoDash Commits

To populate the regression timeline with real per-version data, benchmark across several CrocoDash commits:

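cd /path/to/CrocoDash
# Collect the full hashes of the five most recent commits, newest first.
# (Abbreviated hashes from --oneline may not match commits when ASV publishes.)
COMMITS=($(git log --format=%H -5))

cd /path/to/SeaSloth
for HASH in "${COMMITS[@]}"; do
    # Check out each commit in the editable install, then benchmark it.
    git -C /path/to/CrocoDash checkout --quiet "$HASH"
    python -m asv run --quick --bench "CrocoDashImports" --set-commit-hash "$HASH"
done

# Return the CrocoDash checkout to main.
git -C /path/to/CrocoDash checkout main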

Commit and push the accumulated results so CI can publish the updated timeline:

git add results/
git commit -m "bench: import times across CrocoDash commits"
git push

HPC Data-Dependent Benchmarks

Fill in paths in benchmarks/data_config.json to enable benchmarks that need real data:

| Key | Benchmark | What to put |
| --- | --- | --- |
| gebco_path | Bathymetry pipeline | Path to GEBCO_2024.nc |
| mesh_pairs[*].rof_mesh / ocn_mesh | Runoff mapping | Paths to ESMF mesh NetCDF files |
| obc_config_path | OBC pipeline | Path to a CrocoDash case config YAML |
| obc_step_days_dirs | OBC pipeline | Three pre-staged raw GLORYS folders (one per step_days) |

Benchmarks that need data raise NotImplementedError in setup() when files are absent — ASV marks them n/a automatically.
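
The skip pattern can be sketched as below, assuming the gebco_path key from the table and a config file at benchmarks/data_config.json. TopoFromGebco and load_data_config are hypothetical names, and the timed method is a placeholder rather than the real Topo.set_from_dataset() call.

```python
import json
import os


def load_data_config(path):
    """Return the parsed config, or an empty dict if the file is missing."""
    if not os.path.isfile(path):
        return {}
    with open(path) as f:
        return json.load(f)


class TopoFromGebco:
    # Assumed location, relative to the directory ASV is run from.
    config_path = "benchmarks/data_config.json"

    def setup(self):
        gebco_path = load_data_config(self.config_path).get("gebco_path")
        if not gebco_path or not os.path.isfile(gebco_path):
            # ASV treats NotImplementedError raised in setup() as a skip.
            raise NotImplementedError("gebco_path is unset or the file is missing")
        self.gebco_path = gebco_path

    def time_file_size(self):
        # Placeholder workload standing in for the real bathymetry pipeline.
        os.path.getsize(self.gebco_path)
```

Keeping the check in setup() means a machine without GLADE access still runs the synthetic suites and simply skips the rest.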

Dashboard

bash scripts/publish.sh

Builds two pages in .asv/html/:

  • index.html — snapshot bar charts per benchmark class (generated by scripts/generate_report.py)
  • asv_timeline.html — ASV's commit-timeline view for spotting regressions (needs 2+ commits to show data)

The live dashboard deploys to GitHub Pages on every push to main.

First-time GitHub Pages setup: Settings → Pages → Source → GitHub Actions.
