SeaSloth

Performance benchmarking suite for the CROC ocean modeling ecosystem (CrocoDash, mom6_forge, xESMF/ESMF regridding).

What's Benchmarked

| Suite | File(s) | Data needed | What it measures |
| --- | --- | --- | --- |
| xESMF weight generation | `xesmf/bench_weights_generate.py` | None (synthetic) | `xe.Regridder()` construction time + RSS, grid→grid and grid→locstream, up to ~1 M source points |
| xESMF regrid application | `xesmf/bench_regrid_apply.py` | None (synthetic) | `regridder(ds)` time across grid sizes, time depths, and methods |
| ESMF weight generation | `esmf/bench_weights_generate.py` | None (synthetic) | `esmpy.Regrid()` construction time + RSS; raw ESMF, same sizes as the xESMF suite |
| ESMF regrid application | `esmf/bench_regrid_apply.py` | None (synthetic) | `esmpy.Regrid()(src, dst)` time; raw ESMF, loop over ntime steps |
| OBC forcing pipeline | `crocodash/bench_obc.py` | Cached GLORYS (GLADE) | REGRID + MERGE phases of `process_conditions()`, no GET; varies step_days chunk size |
| Runoff mapping | `mom6_forge/bench_runoff_mapping.py` | ESMF mesh files (GLADE) | `gen_rof_maps()`: nearest-neighbour and smoothed NN mapping between ROF and OCN meshes |
| Raw data access health | `crocodash/bench_raw_data_access.py` | Credentials / GLADE | Connectivity check for GLORYS (RDA + Copernicus), GEBCO, GLOFAS, MOM6 output; returns 1 (up) / 0 (down) |
| Bathymetry pipeline | `mom6_forge/bench_topo.py` | GEBCO (GLADE) | `Topo.set_from_dataset()`: GEBCO regrid + fill across grid sizes |
| Import times | `crocodash/bench_imports.py` | None | `import CrocoDash.case`, `mom6_forge.grid/topo/vgrid`; catches import-time regressions |
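
For orientation, ASV picks the measurement type from a benchmark method's name prefix: `time_*` for wall-clock time, `peakmem_*` for peak memory, `timeraw_*` for a code string timed in a fresh interpreter (handy for imports), and `track_*` for an arbitrary returned number. The sketch below illustrates the last two. The class names, the `json` import target, and the `DATA_ROOT` path are hypothetical stand-ins, not SeaSloth's actual benchmarks.

```python
import os

# Hypothetical location standing in for a dataset root on GLADE.
DATA_ROOT = "/path/to/data"


class ImportTimeSketch:
    """`timeraw_*` methods return source code that ASV times in a fresh process."""

    def timeraw_import_json(self):
        # A fresh interpreter means the module is not already in sys.modules.
        return "import json"


class DataAccessSketch:
    """`track_*` methods return a number that ASV records as-is."""

    unit = "up"  # label ASV attaches to the tracked value

    def track_data_root_reachable(self):
        # 1 means reachable, 0 means not, matching the table's up/down convention.
        return 1 if os.path.exists(DATA_ROOT) else 0
```

A real connectivity check would probe the remote service rather than a local path; the 1/0 return convention is the part that carries over.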

First-time Setup

Verify your environment is ready:

conda activate CrocoDash
bash scripts/configure.sh

No path configuration needed — asv.conf.json points to the public CrocoDash GitHub repo so it works identically on GLADE and in CI.

Running Benchmarks

Use scripts/run_bench.sh — it automatically detects the CrocoDash commit from your active editable install and passes the correct --set-commit-hash:

conda activate CrocoDash

# Run all benchmarks (full timing)
bash scripts/run_bench.sh

# Quick sanity check (one rep per benchmark, less accurate but fast)
bash scripts/run_bench.sh --quick

# Run a specific suite or class
bash scripts/run_bench.sh --bench "CrocoDashImports" --quick
bash scripts/run_bench.sh --bench "XESMFWeightsGenerate"

Any extra arguments are passed straight through to asv run. Each run produces a result file named <croco-hash>-existing-....json in results/derecho/, tagged to the exact CrocoDash commit checked out locally.

Why the wrapper instead of asv run directly? With environment_type: "existing", ASV silently discards results unless --set-commit-hash is passed. The wrapper handles this automatically and uses your local CrocoDash HEAD (not GitHub's main), so benchmarking older commits works correctly.
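
The wrapper's core logic can be pictured roughly as follows. This is only a sketch, not the contents of scripts/run_bench.sh: the function names and the repository path argument are hypothetical, and it assumes `git` is on the PATH.

```python
import subprocess


def head_commit(repo_path):
    """Return the full hash of the commit currently checked out in repo_path."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def build_asv_command(commit_hash, extra_args=()):
    """Assemble an `asv run` call that records results under commit_hash."""
    return [
        "python", "-m", "asv", "run",
        "--set-commit-hash", commit_hash,
        *extra_args,  # forwarded unchanged, e.g. --quick or --bench NAME
    ]
```

Something like `subprocess.run(build_asv_command(head_commit("/path/to/CrocoDash"), ["--quick"]), check=True)` would then mirror `bash scripts/run_bench.sh --quick`.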

On Derecho/GLADE, submit data-dependent benchmarks as a PBS job:

qsub scripts/pbs_submit.sh

Tracking Multiple CrocoDash Commits

To populate the regression timeline with real per-version data, benchmark across several CrocoDash commits:

cd /path/to/CrocoDash
# Full hashes of the five most recent commits
COMMITS=($(git log --format=%H -5))

cd /path/to/SeaSloth
for HASH in "${COMMITS[@]}"; do
    git -C /path/to/CrocoDash checkout --quiet "$HASH"
    python -m asv run --quick --bench "CrocoDashImports" --set-commit-hash "$HASH"
done

git -C /path/to/CrocoDash checkout main
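
One hazard of the loop above is that an interrupted run leaves the CrocoDash checkout on a detached HEAD. A sketch of the same loop with a guaranteed restore follows. The function name and the injectable `run` parameter are hypothetical, introduced so the logic can be exercised without touching a real repository.

```python
import subprocess


def bench_commits(hashes, repo, bench, restore_ref="main", run=subprocess.run):
    """Check out each commit in repo, benchmark it, and always restore restore_ref."""
    try:
        for commit in hashes:
            run(["git", "-C", repo, "checkout", "--quiet", commit], check=True)
            run(
                ["python", "-m", "asv", "run", "--quick",
                 "--bench", bench, "--set-commit-hash", commit],
                check=True,
            )
    finally:
        # Runs even if a checkout or benchmark fails part-way through.
        run(["git", "-C", repo, "checkout", "--quiet", restore_ref], check=True)
```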

Commit and push the accumulated results so CI can publish the updated timeline:

git add results/
git commit -m "bench: import times across CrocoDash commits"
git push

HPC Data-Dependent Benchmarks

Fill in paths in benchmarks/data_config.json to enable benchmarks that need real data:

| Key | Benchmark | What to put |
| --- | --- | --- |
| `gebco_path` | Bathymetry pipeline | Path to `GEBCO_2024.nc` |
| `mesh_pairs[*].rof_mesh` / `ocn_mesh` | Runoff mapping | Paths to ESMF mesh NetCDF files |
| `obc_config_path` | OBC pipeline | Path to a CrocoDash case config YAML |
| `obc_step_days_dirs` | OBC pipeline | Three pre-staged raw GLORYS folders (one per step_days) |

Benchmarks that need data raise NotImplementedError in setup() when files are absent — ASV marks them n/a automatically.
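
A minimal sketch of that skip pattern, assuming `gebco_path` is a top-level key in benchmarks/data_config.json. The class name, the `load_data_path` helper, and the placeholder benchmark body are hypothetical, not the real bench_topo.py.

```python
import json
import os

CONFIG_PATH = os.path.join("benchmarks", "data_config.json")


def load_data_path(key, config_path=CONFIG_PATH):
    """Return the configured path for key, or None if unset or missing on disk."""
    if not os.path.exists(config_path):
        return None
    with open(config_path) as f:
        value = json.load(f).get(key)
    if isinstance(value, str) and value and os.path.exists(value):
        return value
    return None


class TopoSketch:
    def setup(self):
        self.gebco_path = load_data_path("gebco_path")
        if self.gebco_path is None:
            # ASV treats NotImplementedError raised in setup() as "skip this benchmark".
            raise NotImplementedError("gebco_path is not configured or the file is absent")

    def time_placeholder(self):
        # Stand-in for the real work; only checks that the file is still present.
        os.path.exists(self.gebco_path)
```

Keeping the check in `setup()` rather than at import time means benchmarks that need no data still load and run on machines without GLADE access.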

Dashboard

bash scripts/publish.sh

Builds two pages in .asv/html/:

index.html — snapshot bar charts per benchmark class (generated by scripts/generate_report.py)
asv_timeline.html — ASV's commit-timeline view for spotting regressions (needs 2+ commits to show data)

The live dashboard deploys to GitHub Pages on every push to main.

First-time GitHub Pages setup: Settings → Pages → Source → GitHub Actions.
