Expand ZarrWriter test coverage; add Zarr to benchmarking + read mode by glwagner · Pull Request #5610 · CliMA/Oceananigans.jl

glwagner · 2026-05-20T19:44:20Z

Summary

Follow-up to #5605. Two independent additions in one PR:

- Tests: surface grid-type round-trip gaps in `ZarrWriter` (no fixes yet, by design).
- Benchmarks: add `ZarrWriter` to `benchmarking/`, add a new `--mode=read` axis, and report on-disk size and chunk shape for Zarr.

Test coverage (`test/test_zarr_writer.jl`)

New Phase 8 testset sweeps round-trip behaviour across grid types beyond RectilinearGrid:

| Grid | Marker |
| --- | --- |
| `LatitudeLongitudeGrid` (regular) | `@test` |
| `LatitudeLongitudeGrid` (stretched) | `@test` |
| `TripolarGrid` | `@test_broken` |
| `RotatedLatitudeLongitudeGrid` | `@test_broken` |
| `ImmersedBoundaryGrid` over `RectilinearGrid` | `@test_broken` |
| `ImmersedBoundaryGrid` over `LatitudeLongitudeGrid` | `@test_broken` |
| `ImmersedBoundaryGrid` over `TripolarGrid` | `@test_broken` |

Local run: 110 pass, 5 broken, 0 errored. Once the follow-ups land, the `@test_broken` rows will report unexpected passes, which is the cue to switch them to `@test`.
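For readers less familiar with the two markers in the table, here is a minimal sketch of how Julia's `Test` standard library treats them. The two `*_round_trip` functions are hypothetical stand-ins for illustration, not functions from this PR.

```julia
using Test

# Hypothetical stand-ins for the real round-trip checks in the test file.
supported_round_trip() = true
unsupported_round_trip() = error("grid serialization not implemented")

results = @testset "marker semantics (sketch)" begin
    # Counted as a pass because the expression returns true.
    @test supported_round_trip()

    # Recorded as Broken because the expression throws (a `false` result
    # would also be recorded as Broken), so the testset does not fail.
    @test_broken unsupported_round_trip()
end
```

If the expression inside `@test_broken` starts returning `true`, `Test` records an unexpected pass as an error rather than a pass, which is the signal to change the marker.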

Known gaps (intentional, not fixed in this PR)

- `OrthogonalSphericalShellGrid` / `TripolarGrid` / `RotatedLatitudeLongitudeGrid` have no `constructor_arguments` method, so grid serialization throws at write time.
- `ImmersedBoundaryGrid` serialization throws at read time ("Immersed-boundary reconstruction not yet implemented for Zarr." at `ext/OceananigansZarrExt/zarr_writer.jl:388`). Inside the writer, an `UndefVarError: GridFittedBottom` is also reachable when the underlying-grid serialization succeeds.

Filed for a separate PR.

Benchmarks (`benchmarking/`)

- `Project.toml`: adds `Zarr` (0.10).
- `result.jl`: `IOBenchmarkResult` gains `chunk_shape::Union{Nothing, Vector{Int}}`. New `ReadBenchmarkResult` reports bulk-read, iteration, and per-snapshot timings.
- `utils.jl`:
  - `path_size(p)`: recursive `walkdir` for directory stores (needed for Zarr's `DirectoryStore`).
  - `zarr_chunk_shape(path, name)`: reads back the spatial chunk shape from the written `.zarray`.
  - `run_io_benchmark` accepts `output_format = "zarr"` (alongside `"jld2"` and `"netcdf"`) and a `zarr_chunks` kwarg.
  - `run_read_benchmark` writes a source dataset using `run_io_benchmark`, then times `FieldTimeSeries(path, name)` construction plus `fts[n]` iteration.
- `OceananigansBenchmarks.jl`: `using Zarr`; re-exports `path_size`, `run_read_benchmark`, `ReadBenchmarkResult`.
- `run_benchmarks.jl`:
  - `--output_format` now accepts `zarr`.
  - `--format`, `--zarr_chunks`, `--read_variable` added.
  - `--mode=read` dispatches to `run_read_benchmark`.
  - Summary table + markdown report extended with `Size` and `Chunks` columns.
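The two `utils.jl` helpers listed above might look roughly like the following. This is a sketch under stated assumptions, not the PR's implementation: the `_sketch` names are hypothetical, the metadata is assumed to be Zarr v2 with a flat `"chunks": [...]` entry, and the chunk values are returned in the order stored on disk with no attempt to drop a time axis or reorder dimensions.

```julia
# Sketch only: names carry a `_sketch` suffix to distinguish them from the
# helpers this PR adds to `utils.jl`.

# Total size in bytes of a single file or of every file under a directory.
function path_size_sketch(path::AbstractString)
    isfile(path) && return filesize(path)
    total = 0
    for (root, _, files) in walkdir(path)
        for file in files
            total += filesize(joinpath(root, file))
        end
    end
    return total
end

# Chunk shape recorded in `<array_dir>/.zarray`, or `nothing` if unavailable.
# A regex keeps the sketch dependency-free; a JSON parser would be more robust.
function chunk_shape_sketch(array_dir::AbstractString)
    metadata = joinpath(array_dir, ".zarray")
    isfile(metadata) || return nothing
    m = match(r"\"chunks\"\s*:\s*\[([^\]]*)\]", read(metadata, String))
    m === nothing && return nothing
    entries = filter(!isempty, strip.(split(m.captures[1], ',')))
    return parse.(Int, entries)
end

# Usage on a throwaway directory laid out like a tiny directory store.
store = mktempdir()
array_dir = joinpath(store, "T")
mkpath(array_dir)
write(joinpath(array_dir, ".zarray"), """{"chunks": [1, 16, 32], "zarr_format": 2}""")
write(joinpath(array_dir, "0.0.0"), zeros(UInt8, 100))

store_bytes = path_size_sketch(store)       # bytes across the whole store
chunks = chunk_shape_sketch(array_dir)      # chunk shape as stored in metadata
```

Returning `nothing` when no `.zarray` is found mirrors the `Union{Nothing, Vector{Int}}` type of `chunk_shape`, so non-Zarr formats can leave the field empty.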

--mode=data_forced is deferred to a future PR.
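To make the three read-mode timings concrete (bulk read, iteration, per snapshot), here is a minimal timing sketch. Everything in it is an assumption for illustration: `ReadTimingsSketch` and its field names are not taken from `ReadBenchmarkResult`, and `load` stands in for a closure such as `() -> FieldTimeSeries(path, name)`.

```julia
# Hypothetical result container; the field names are illustrative only.
struct ReadTimingsSketch
    bulk_read_seconds::Float64
    iteration_seconds::Float64
    per_snapshot_seconds::Float64
end

# `load` must return an object that supports `series[n]` for n in 1:n_snapshots.
function time_reads_sketch(load, n_snapshots::Integer)
    t0 = time_ns()
    series = load()
    bulk = (time_ns() - t0) / 1e9

    checksum = 0.0
    t1 = time_ns()
    for n in 1:n_snapshots
        checksum += sum(series[n])  # touch the data so each snapshot is read
    end
    iteration = (time_ns() - t1) / 1e9

    return ReadTimingsSketch(bulk, iteration, iteration / n_snapshots), checksum
end

# Stand-in data: three in-memory "snapshots" instead of a dataset on disk.
timings, checksum = time_reads_sketch(() -> [fill(Float64(n), 4, 4) for n in 1:3], 3)
```

Timing construction separately from iteration keeps the cost of opening the dataset apart from the cost of pulling each snapshot, which may differ a lot between formats.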

Test plan

- `julia --project=test -e 'using Pkg; Pkg.test("Oceananigans", test_args=["zarr_writer"])'`: Phase 8 reports 110 pass / 5 broken locally on CPU.
- `--mode=io --output_format=zarr` produces `chunk_shape` and `total_output_size_bytes` in JSON; markdown renders `Size` + `Chunks` columns.
- `--mode=io --output_format=jld2|netcdf` regression-clean; `chunk_shape` is `null`.
- `--mode=read --format=zarr|jld2|netcdf` all produce a `ReadBenchmarkResult`.
- `--mode=benchmark` regression-clean (no IO).
- GPU CI sweep.
- Distributed CI (`test_distributed_zarr_writer.jl`): untouched here, still `RectilinearGrid`-only.

🤖 Generated with Claude Code

Test side (`test/test_zarr_writer.jl`)

Adds a new Phase 8 testset that sweeps round-trip behaviour across grid types beyond `RectilinearGrid`:

- `LatitudeLongitudeGrid` (regular + stretched): asserted `@test`.
- `TripolarGrid`, `RotatedLatitudeLongitudeGrid`: `@test_broken`. These have no `constructor_arguments` method, so grid serialization throws at write time.
- `ImmersedBoundaryGrid` over `RectilinearGrid`, `LatitudeLongitudeGrid`, and `TripolarGrid`: `@test_broken`. The Zarr extension's `reconstruct_zarr_grid` explicitly throws for IBG.

Tests are scoped to surface the gaps without fixing them. Follow-ups will land `constructor_arguments` for OSSG / Tripolar / Rotated, and full IBG serialization (mask + bottom_height) in a separate PR; when they land, the `@test_broken` calls can be flipped to `@test`.

Benchmark side (`benchmarking/`)

- `Project.toml`: adds `Zarr` (0.10).
- `result.jl`: `IOBenchmarkResult` gains `chunk_shape`. Adds `ReadBenchmarkResult`, modeled on `IOBenchmarkResult` but reporting bulk read, iteration, and per-snapshot timings.
- `utils.jl`: adds `path_size` (recursive walkdir for directory stores), `zarr_chunk_shape` (reads spatial chunks back from the written `.zarray`), extends `run_io_benchmark` to dispatch `ZarrWriter` for `output_format = "zarr"`, and adds `run_read_benchmark`, which writes a source dataset then times `FieldTimeSeries(path, name)` construction plus a `fts[n]` iteration loop.
- `OceananigansBenchmarks.jl`: re-exports `path_size`, `run_read_benchmark`, and `ReadBenchmarkResult`; adds `using Zarr`.
- `run_benchmarks.jl`:
  - `--output_format` now accepts `zarr`.
  - `--format` (read mode), `--zarr_chunks`, `--read_variable` added.
  - New `--mode=read` branch dispatching to `run_read_benchmark`.
  - Summary table and markdown report extended with `Size` and `Chunks` columns; read entries render with per-snapshot units.

Verified locally on CPU:

- `test/test_zarr_writer.jl`: 110 pass, 5 broken (the new `@test_broken` cases), 0 errored.
- `--mode=io` with `--output_format=zarr|jld2|netcdf` all produce expected JSON + markdown.
- `--mode=read` with `--format=zarr|jld2|netcdf` all produce expected JSON + markdown.
- `--mode=benchmark` regression-clean.

Out of scope (deferred per design):

- `constructor_arguments` for OSSG family.
- `ImmersedBoundaryGrid` serialization (mask + bottom_height).
- `--mode=data_forced` (continuous disk reads during stepping).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Per @glwagner: let CI fail loudly on the known-broken grids so the gap is unambiguous in build status.

Every grid in the sweep is now asserted with plain `@test`. The 5 currently-broken rows (Tripolar, RotatedLatLon, and the three ImmersedBoundaryGrid variants) will stay red until the OSSG + NetCDF support PR lands `constructor_arguments` for the OSSG family and serializes the immersed boundary.

Also widens the `try`/`catch` inside `zarr_round_trip` to cover model construction, since `HydrostaticFreeSurfaceModel` over `TripolarGrid` throws an `ArgumentError` about variable substepping in the auto-picked `SplitExplicitFreeSurface`. Previously that escaped the catch and reported the row as ERROR rather than FAIL.

Local CPU result: 110 pass, 5 fail, 0 errored, 0 broken.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
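The effect of widening the `try`/`catch` can be sketched as follows. In Julia's `Test` standard library, an exception that escapes outside a `@test` expression is recorded as an error, while a `@test` on a `false` value is recorded as a failure. The helper names below are hypothetical stand-ins and do not reproduce `zarr_round_trip`.

```julia
# Hypothetical stand-ins: `build_model_sketch` plays the role of a model
# constructor that rejects the grid, as described above for `TripolarGrid`.
build_model_sketch() = throw(ArgumentError("stand-in for a constructor failure"))
write_and_read_back_sketch(model) = true

function round_trip_sketch()
    try
        model = build_model_sketch()   # construction now sits inside the try
        return write_and_read_back_sketch(model)
    catch err
        @warn "round trip failed" exception = err
        return false                   # a later `@test` on this value fails
    end
end

passed = round_trip_sketch()
```

With construction outside the `try` block, the `ArgumentError` would propagate to the enclosing testset instead of being turned into a `false` result.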

glwagner and others added 4 commits May 20, 2026 13:43

Merge branch 'main' into zarr-writer-grid-tests-and-benchmarks

ca825f7

Merge branch 'main' into zarr-writer-grid-tests-and-benchmarks

39b4f13

glwagner wants to merge 4 commits into main from zarr-writer-grid-tests-and-benchmarks
