Add dev-01deg_jra55_ryf+wombatlite configuration #286
|
!test repro commit |
|
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.
Further information
The experiment can be found on Gadi at
The checksums generated by this
The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum
Test summary: |
|
This does not complete 1 month within the 5 hour walltime limit (the max for this number of CPUs). Approximately:
For reference:
There are some low-hanging opportunities for code refactoring to improve WOMBATlite performance, but so far I've only been able to squeeze out ~7%. |
|
This pull request has been mentioned on ACCESS Hive Community Forum. There might be relevant details there: https://forum.access-hive.org.au/t/cosima-twg-meeting-minutes-2026/5906/8 |
Relative to non-BGC config:
Before this change:
- 3.5x slower
- 3.5x costlier
After this change:
- 2.0x slower
- 2.6x costlier
e2b853d to 6056897
|
!test repro commit |
I've moved this configuration to Sapphire Rapids nodes and increased the number of CPUs used by the ocean (see 6056897). With these changes, this configuration is 2.0x slower and 2.6x costlier than the equivalent non-BGC config (previously 3.5x slower and 3.5x costlier). 1 month now easily completes within the walltime limit. |
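As a rough check on the figures above, here is a minimal sketch of how the slowdown and cost ratios relate. It assumes that cost is simply walltime multiplied by a per-hour resource charge, which is a simplification introduced here rather than something stated in this PR. The function name `implied_charge_ratio` is hypothetical.

```python
def implied_charge_ratio(slowdown: float, cost_ratio: float) -> float:
    """Per-hour charge of the BGC config relative to the non-BGC config.

    Assumption (not from the PR): cost = walltime * per-hour charge,
    so cost_ratio = slowdown * charge_ratio.
    """
    return cost_ratio / slowdown


# Figures quoted in this PR, relative to the non-BGC config.
before = implied_charge_ratio(slowdown=3.5, cost_ratio=3.5)
after = implied_charge_ratio(slowdown=2.0, cost_ratio=2.6)

print(f"before: {before:.2f}x per-hour charge")
print(f"after:  {after:.2f}x per-hour charge")
```

Under that assumption, the change trades a higher per-hour charge (about 1.3x the non-BGC config) for a shorter run, which is consistent with the overall cost ratio dropping from 3.5x to 2.6x.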
|
❌ |
|
!test repro commit |
|
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.
Further information
The experiment can be found on Gadi at
The checksums generated by this
The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum
Test summary: |
|
@anton-seaice this is ready for review. It probably makes sense to review #280 first, as I've added some comments there that may help answer questions that might come up.
This configuration uses 62 SR nodes, with no unused CPUs (down from 7 unused CL CPUs).
The failing QA checks will be fixed by ACCESS-NRI/model-config-tests#204 |
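To illustrate what "no unused CPUs" means for node packing, here is a small sketch. It assumes 104 cores per Sapphire Rapids node, which is not stated in this PR, and the helper `node_usage` is hypothetical. The CPU totals are derived only from the 62-node figure above and are not the actual layout of this configuration.

```python
def node_usage(cpus_requested: int, cores_per_node: int) -> tuple[int, int]:
    """Return (whole nodes needed, CPUs left idle on those nodes)."""
    nodes = -(-cpus_requested // cores_per_node)  # integer ceiling division
    unused = nodes * cores_per_node - cpus_requested
    return nodes, unused


SR_CORES_PER_NODE = 104  # assumption: cores per Sapphire Rapids node

# A request that exactly fills 62 nodes leaves nothing idle.
full = node_usage(62 * SR_CORES_PER_NODE, SR_CORES_PER_NODE)

# Illustrative only: a request 7 CPUs short still occupies 62 nodes.
short = node_usage(62 * SR_CORES_PER_NODE - 7, SR_CORES_PER_NODE)

print(full, short)
```

Whole nodes are allocated either way, so any shortfall in the requested CPU count shows up as idle cores on the last node.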
|
!test repro commit |
|
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.
Further information
The experiment can be found on Gadi at
The checksums generated by this
The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum
Test summary: |
8b9dd21 to 1c270f7
1c270f7 to 9fb1862
|
!test repro commit |
|
❌ The Bitwise Reproducibility Check Failed ❌ When comparing:
🔧 The new checksums will be committed to this PR, if they differ from what is on this branch.
Further information
The experiment can be found on Gadi at
The checksums generated by this
The checksums compared against are found here https://github.com/ACCESS-NRI/access-om2-configs/tree/6c759fdcf40b08d256bcef88ec387ac5abad4ef4/testing/checksum
Test summary: |
|
I've run this for 6 months, and Pearse and I are happy with the WOMBAT output.
This PR:
.travis.yml from 0.1 deg configs #325
This is preparation for the first official release of this configuration.
Still to do:
field_table