Add dev-01deg_jra55_ryf+wombatlite configuration#286

Open
dougiesquire wants to merge 13 commits into dev-01deg_jra55_ryf+wombatlite from 270-01deg_jra55_ryf+wombatlite

@dougiesquire dougiesquire commented Mar 7, 2026

Collaborator

This PR:

This is preparation for the first official release of this configuration.

Still to do:

  • Move inputs out of prerelease area
  • Add comment to redundant section of field_table
  • Update performance info
  • Update checksums

@dougiesquire

Collaborator Author

!test repro commit

github-actions Bot commented Mar 7, 2026

❌ The Bitwise Reproducibility Check Failed ❌

When comparing:

  • 270-01deg_jra55_ryf+wombatlite (checksums created using commit 21aad8a), against
  • dev-01deg_jra55_ryf+wombatlite (checksums in commit 6c759fd)

🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.

Further information

The experiment can be found on Gadi at /scratch/tm70/repro-ci/experiments/access-om2-configs/pr286/270-01deg_jra55_ryf+wombatlite/21aad8a0a9cce9674588aec94550be8f3f8f3e11, and the test results at https://github.com/ACCESS-NRI/access-om2-configs/runs/66134313760.

The checksums generated by this !test command are found in the testing/checksum directory of https://github.com/ACCESS-NRI/access-om2-configs/actions/runs/22797073144/artifacts/5810794199.

The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum

Test summary:
test_repro_historical
test_repro_determinism
test_repro_restart

@dougiesquire

Collaborator Author

This does not complete 1 month within the 5 hour walltime limit (the max for this number of CPUs).

Approximately:

  • WOMBAT legacy (10 tr, ~10 equations) is about 1.8-1.9x slower than running without BGC. Just adding 10 passive tracers slows the model by about 1.6x.
  • WOMBATlite (15 tr, ~100 equations) is about 3-3.4x slower than running without BGC. Just adding 15 passive tracers slows the model by about 2x.

For reference:

  • BLING (7 tr, ~50 equations) is about 2x slower than running without BGC. Just adding 7 passive tracers slows the model by about 1.4x.

There are some low-hanging opportunities for code refactoring to improve WOMBATlite performance, but so far I've only been able to squeeze out ~7%.
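
A rough way to read the numbers above is to ask how much of each model's added runtime comes from simply carrying the extra tracers, and how much is left for the BGC equations themselves. The sketch below does that arithmetic. It assumes the passive-tracer cost measured on its own is the same inside the full BGC run, which the comment does not state, and `SLOWDOWNS` and `transport_share` are hypothetical names used only for illustration.

```python
# Slowdown factors relative to a run without BGC, taken from the comment above.
# Each entry: (total slowdown range with BGC, slowdown from passive tracers alone).
SLOWDOWNS = {
    "WOMBAT legacy": ((1.8, 1.9), 1.6),
    "WOMBATlite": ((3.0, 3.4), 2.0),
    "BLING": ((2.0, 2.0), 1.4),
}


def transport_share(total, passive):
    """Fraction of the added runtime explained by carrying the tracers alone.

    Assumption: the passive-tracer overhead is unchanged when the BGC
    source terms are switched on.
    """
    return (passive - 1.0) / (total - 1.0)


for name, ((low, high), passive) in SLOWDOWNS.items():
    # A larger total slowdown leaves a smaller share for tracer transport.
    smallest = transport_share(high, passive)
    largest = transport_share(low, passive)
    print(f"{name}: {smallest:.0%} to {largest:.0%} of the added runtime is tracer transport")
```

Under that assumption, roughly half or less of the WOMBATlite overhead is tracer transport, which is consistent with looking for savings in the BGC code itself.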

@access-hive-bot

This pull request has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/cosima-twg-meeting-minutes-2026/5906/8

Relative to non-BGC config:
Before this change:
- 3.5x slower
- 3.5x costlier
After this change:
- 2.0x slower
- 2.6x costlier
@dougiesquire dougiesquire force-pushed the 270-01deg_jra55_ryf+wombatlite branch from e2b853d to 6056897 Compare March 12, 2026 03:34
@dougiesquire

Collaborator Author

!test repro commit

dougiesquire commented Mar 12, 2026

Collaborator Author

> This does not complete 1 month within the 5 hour walltime limit (the max for this number of CPUs).

I've moved this configuration to Sapphire Rapids nodes and increased the number of CPUs used by the ocean (see 6056897). With these changes this configuration is 2.0x slower and 2.6x costlier than the equivalent non-BGC config (previously 3.5x slower and 3.5x costlier). 1 month now easily completes within the walltime limit.
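
The gap between the slowdown and the cost figures can be read as a change in the resources charged per hour of walltime. The sketch below works that out from the ratios quoted above. It assumes cost scales as walltime times (CPUs times charge rate), which is an assumption of this illustration and not something stated in the thread, and `resource_ratio` is a hypothetical helper.

```python
def resource_ratio(cost_ratio, slowdown_ratio):
    """Charged resources per walltime hour, BGC config relative to non-BGC config.

    Assumes cost = walltime x (CPUs x charge rate), so that
    cost_ratio = slowdown_ratio x resource_ratio.
    """
    return cost_ratio / slowdown_ratio


# Ratios quoted in the comment above, before and after the move to Sapphire Rapids.
before = resource_ratio(cost_ratio=3.5, slowdown_ratio=3.5)
after = resource_ratio(cost_ratio=2.6, slowdown_ratio=2.0)

print(f"before: {before:.2f}x resources per walltime hour")
print(f"after:  {after:.2f}x resources per walltime hour")
```

Read this way, the new layout spends about 1.3x the resources per walltime hour of the non-BGC config in exchange for cutting the slowdown from 3.5x to 2.0x.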

@github-actions

!test Command Failed ❌ See https://github.com/ACCESS-NRI/access-om2-configs/actions/runs/22985609834

@dougiesquire

Collaborator Author

!test repro commit

@dougiesquire dougiesquire self-assigned this Mar 12, 2026
@github-actions

❌ The Bitwise Reproducibility Check Failed ❌

When comparing:

  • 270-01deg_jra55_ryf+wombatlite (checksums created using commit 8109a11), against
  • dev-01deg_jra55_ryf+wombatlite (checksums in commit 6c759fd)

🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.

Further information

The experiment can be found on Gadi at /scratch/tm70/repro-ci/experiments/access-om2-configs/pr286/270-01deg_jra55_ryf+wombatlite/8109a11dc41b40dfb45c1f1eaf7bbd6ba71348e5, and the test results at https://github.com/ACCESS-NRI/access-om2-configs/runs/66742886479.

The checksums generated by this !test command are found in the testing/checksum directory of https://github.com/ACCESS-NRI/access-om2-configs/actions/runs/22987403441/artifacts/5884709069.

The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum

Test summary:
test_repro_historical
test_repro_determinism
test_repro_restart

@dougiesquire dougiesquire marked this pull request as ready for review March 13, 2026 02:17
@dougiesquire

Collaborator Author

@anton-seaice this is ready for review. It probably makes sense to review #280 first as I've added some comments there that may help answer some questions that might come up.

This configuration uses 62 Sapphire Rapids (SR) nodes with no unused CPUs (the previous Cascade Lake (CL) layout left 7 CPUs unused).
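
For anyone adjusting the layout later, the node count and idle CPUs can be checked with a few lines of arithmetic. This is a minimal sketch: the cores-per-node values are assumptions (104 for SR and 48 for CL, which should be confirmed against current NCI documentation), and `node_usage` is a hypothetical helper, not part of payu or the configuration.

```python
import math

# Cores per node are assumptions of this sketch; confirm them against the
# current NCI Gadi documentation before relying on the results.
CORES_PER_NODE = {"SR": 104, "CL": 48}


def node_usage(ncpus_requested, node_type):
    """Return (whole nodes needed, CPUs left idle on those nodes)."""
    cores = CORES_PER_NODE[node_type]
    nodes = math.ceil(ncpus_requested / cores)
    return nodes, nodes * cores - ncpus_requested


# 62 fully used SR nodes correspond to 62 * 104 CPUs under the assumption above.
print(node_usage(62 * 104, "SR"))

# Hypothetical CL request that is 7 CPUs short of filling 10 nodes.
print(node_usage(10 * 48 - 7, "CL"))
```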

The failing QA checks will be fixed by ACCESS-NRI/model-config-tests#204

@dougiesquire

Collaborator Author

!test repro commit

@github-actions

❌ The Bitwise Reproducibility Check Failed ❌

When comparing:

  • 270-01deg_jra55_ryf+wombatlite (checksums created using commit 6479ddf), against
  • dev-01deg_jra55_ryf+wombatlite (checksums in commit 6c759fd)

🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.

Further information

The experiment can be found on Gadi at /scratch/tm70/repro-ci/experiments/access-om2-configs/pr286/270-01deg_jra55_ryf+wombatlite/6479ddfcf473a6c389f660a244682ab0a60998cc, and the test results at https://github.com/ACCESS-NRI/access-om2-configs/runs/67178662554.

The checksums generated by this !test command are found in the testing/checksum directory of https://github.com/ACCESS-NRI/access-om2-configs/actions/runs/23128443861/artifacts/5937557785.

The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum

Test summary:
test_repro_historical
test_repro_determinism
test_repro_restart

anton-seaice previously approved these changes Mar 16, 2026
@dougiesquire dougiesquire force-pushed the 270-01deg_jra55_ryf+wombatlite branch from 1c270f7 to 9fb1862 Compare June 3, 2026 04:31
@dougiesquire

Collaborator Author

!test repro commit

github-actions Bot commented Jun 4, 2026

❌ The Bitwise Reproducibility Check Failed ❌

When comparing:

  • 270-01deg_jra55_ryf+wombatlite (checksums created using commit b4246c1), against
  • dev-01deg_jra55_ryf+wombatlite (checksums in commit 6c759fd)

🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.

Further information

The experiment can be found on Gadi at /scratch/tm70/repro-ci/experiments/access-om2-configs/pr286/270-01deg_jra55_ryf+wombatlite/b4246c129e7d14449788b2d2b0c8b17fe3d3ac27, and the test results at https://github.com/ACCESS-NRI/access-om2-configs/runs/79421011290.

The checksums generated by this !test command are found in the testing/checksum directory of https://github.com/ACCESS-NRI/access-om2-configs/actions/runs/26919516182/artifacts/7399648563.

The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum

Test summary:
test_repro_historical
test_repro_determinism
test_repro_restart
test_repro_payu_setup

@dougiesquire

Collaborator Author

I've run this for 6 months, and Pearse and I are happy with the WOMBAT output.

@dougiesquire dougiesquire requested a review from anton-seaice June 4, 2026 01:11
4 participants