Releases · ai2cm/ace

11 May 20:29

brianhenn

v2026.5.1

12b9103

2026.5.1 Latest

Note: to keep PyPI versioning consistent, this release includes and supersedes the 2026.5.0 release.

What's Changed

Fine-Tuning & Checkpoint Resume

New config options make it easier to resume or fine-tune from existing checkpoints:

OptimizationConfig.resume_optimizer_ckpt_path: restore optimizer state when fine-tuning (#1043)
EMAConfig.resume_ema_ckpt_path: resume from an EMA checkpoint (#1118)
CheckpointStepperConfig: load stepper config directly from a checkpoint (#1103)
Optimizer/EMA state is now included in epoch checkpoints (ckpt_{epoch:04d}.tar) (#1104)
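
The epoch checkpoint pattern above zero-pads the epoch number to four digits. A minimal sketch of how such a file name is formed, assuming an ordinary Python format string; the helper name `epoch_checkpoint_name` is hypothetical and not part of fme:

```python
def epoch_checkpoint_name(epoch: int) -> str:
    """Return the file name for an epoch checkpoint."""
    # Pattern taken from the release note: ckpt_{epoch:04d}.tar
    return f"ckpt_{epoch:04d}.tar"


print(epoch_checkpoint_name(7))    # ckpt_0007.tar
print(epoch_checkpoint_name(123))  # ckpt_0123.tar
```

The `04d` spec sets a minimum width, so epochs above 9999 are not truncated; they simply produce a longer name.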

Ensemble Inference

Initial ensemble (IC ensemble) support added to the evaluator and inference aggregators (#709)

New Models & Architecture

filter_preserves_global_mean option added to SFNO (#1100)
SecondaryModuleStepConfig / SecondaryModuleStep: compose a secondary module during training steps (#1073)

Coupled Model

Stochastic CoupledStepper training (#750)
Randomly sampled LossContributions.n_steps (#869)
optimize_last_step_only added to coupled LossContributionsConfig (#868)

Diagnostics

Power spectrum diagnostics logged in the inference entrypoint (#1078, #1079)
Weather eval entrypoint replaced with a more general additional_inference list (#1096)

Data Processing

Time subsetting can now be configured prior to time coarsening (#1055)
PRMSL added to X-SHiELD data processing configurations (#1036)

Bug Fixes

Clamped the SSR calculation, which was producing NaNs that were silently dropped from W&B (#1088)
Worked around xarray StringDType serialization error (#1086)
Signal handler now exits with a nonzero code (#1068)
IceCorrectorConfig correctly registered in CorrectorSelector registry (#1044)

Breaking Changes

TrainStepperConfig.train_n_forward_steps renamed to TrainStepperConfig.n_forward_steps — all train YAML configs must update this field (#1052)
TrainConfig.n_forward_steps removed (was deprecated; use stepper_training.n_forward_steps) (#1052)
TrainConfig.weather_evaluation: WeatherEvaluationConfig | None replaced by TrainConfig.additional_inference: list[AdditionalInferenceConfig] (#1096)
Sub-aggregator record_batch(time, data) interface replaced by record_batch(data: InferenceBatchData) (#1097)
StepLoss.forward() now returns LossOutput instead of torch.Tensor; call .total() for the scalar (#1020)
fme.diffusion package removed (#1084)
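
The two n_forward_steps changes (#1052) can be applied mechanically to a train config that has been loaded into a plain dict. The sketch below is not an fme utility: the helper name `migrate_n_forward_steps` is hypothetical, and it assumes TrainStepperConfig sits under the top-level `stepper_training` key. It does not touch the weather_evaluation to additional_inference change, whose new schema is not described here.

```python
import copy


def migrate_n_forward_steps(config: dict) -> dict:
    """Return a copy of a train config dict that uses the renamed field."""
    new = copy.deepcopy(config)
    # Assumption: TrainStepperConfig lives under the "stepper_training" key.
    stepper_training = dict(new.get("stepper_training") or {})

    # TrainStepperConfig.train_n_forward_steps -> n_forward_steps
    if "train_n_forward_steps" in stepper_training:
        stepper_training["n_forward_steps"] = stepper_training.pop(
            "train_n_forward_steps"
        )

    # TrainConfig.n_forward_steps was removed; move it under stepper_training
    # unless a value is already set there.
    if "n_forward_steps" in new:
        top_level = new.pop("n_forward_steps")
        stepper_training.setdefault("n_forward_steps", top_level)

    if stepper_training:
        new["stepper_training"] = stepper_training
    return new


old = {"n_forward_steps": 2, "stepper_training": {"train_n_forward_steps": 4}}
print(migrate_n_forward_steps(old))
# {'stepper_training': {'n_forward_steps': 4}}
```

When both the removed top-level field and the renamed field are present, the sketch keeps the value that was already under `stepper_training`; that precedence is a design choice of this example, not documented behavior.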

Full Changelog: v2026.4.0...v2026.5.1

05 May 17:29

jpdunc23

v2026.5.0

709c4c3

v2026.5.0

What's Changed

Allow stochastic CoupledStepper training by @jpdunc23 in #750
Ensure IceCorrectorConfig is added to CorrectorSelector registry by @William-gregory in #1044
Add link to colab notebook example by @oliverwm1 in #1047
Add optimize_last_step_only to coupled LossContributionsConfig by @jpdunc23 in #868
Update dataset path for era5 data processing config by @mcgibbon in #1051
Enable configuring time subsetting prior to time coarsening by @spencerkclark in #1055
Add script to copy gcs data to weka by @mcgibbon in #1058
Remove deprecated n_forward_steps from TrainConfig by @mcgibbon in #1052
Downscaling loss weighting by @AnnaKwa in #1056
Refactor DiffusionModel.generate by @AnnaKwa in #1060
Set explicit 30-minute timeout on all init_process_group calls by @mcgibbon in #1069
Allow Distributed.get_instance() without context for single-rank by @mcgibbon in #1070
Add Claude Code transcript logging as pr comment by @mcgibbon in #1072
Fork DISCO convolution with FFT-based contraction into fme/core/disco by @mcgibbon in #1066
Add new test_ice_train script to catch bugs which affect GraphCast and IceCorrector by @William-gregory in #1054
Extract TensorDictAccumulator primitive by @mcgibbon in #1074
Remove barrier from ModelTorchDistributed.shutdown() by @mcgibbon in #1067
Randomly sampled coupled LossContributions.n_steps by @jpdunc23 in #869
Exit with nonzero code in signal handler by @mcgibbon in #1068
Set seed in fme.coupled training integration test by @jpdunc23 in #1075
Updates repo name in perlmutter example make-venv by @yikwill in #1063
Update baseline Beaker budget to "atec-climate" by @brianhenn in #1080
Log power spectrum diagnostics in inference entrypoint by @spencerkclark in #1078
from_state methods give on-device, memory-decoupled objects by @mcgibbon in #1061
Add SecondaryModuleStepConfig and SecondaryModuleStep by @mcgibbon in #1073
Remove unused fme/diffusion folder by @oliverwm1 in #1084
Implement get_dataset for power spectrum aggregators by @spencerkclark in #1079
Have losses return non-reduced tensors by @Arcomano1234 in #1020
Work around xarray StringDType serialization error by @spencerkclark in #1086
Fix/n ensemble attributes batch data by @Arcomano1234 in #1085
DenoisingMoEPredictor by @AnnaKwa in #1071
Decouple noise floor argo workflow from fme by @spencerkclark in #1083
Add bottleneck_attention option to diffusion model config by @AnnaKwa in #1094
Add PRMSL to X-SHiELD data processing configurations by @spencerkclark in #1036
Add IC ensemble ability to evaluator and update inference aggregators for ensembles by @Arcomano1234 in #709
Replace weather eval with additional inference by @mcgibbon in #1096
Add isotropic Morlet filter basis for DISCO by @mcgibbon in #1093
Enable DenoisingMoEPredictor in inference by @AnnaKwa in #1059
Add script to extract Stepper checkpoints from a CoupledStepper checkpoint by @jpdunc23 in #1105
Fix latents, inputs order in mixture of experts generate by @AnnaKwa in #1106
Remove scripts/test_distributed_context.py given #1070 by @jpdunc23 in #1107
Uniform sub-aggregator interface via InferenceBatchData by @mcgibbon in #1097
Add filter_preserves_global_mean option to SFNO by @mcgibbon in #1100
Clamp SSR calculation that was leading to NaNs and silently dropped from Wandb by @Arcomano1234 in #1088
Add CheckpointStepperConfig to load stepper config from checkpoint by @mcgibbon in #1103
Thin coordinator: move builder logic to config.build() by @mcgibbon in #1098

New Contributors

@William-gregory made their first contribution in #1044
@yikwill made their first contribution in #1063

Full Changelog: v2026.4.0...v2026.5.0

Contributors

jpdunc23, spencerkclark, and 7 other contributors

09 Apr 16:53

elynnwu

v2026.4.0

c147fc5

v2026.4.0

Release date: April 9, 2026

What's Changed

A subset of changes is listed here; see the full changelog for more detail: v2026.1.1...v2026.4.0

⚠️ Breaking Changes

fme.ace and fme.coupled training configs: Training-only fields (loss, optimize_last_step_only, n_ensemble, parameter_init, train_n_forward_steps) have been removed from StepperConfig and must now be set under a new top-level stepper_training: TrainStepperConfig field. Existing training configs will need to be updated. (#862)
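
For configs held as plain dicts, the move described above amounts to lifting the listed keys out of the stepper section. A minimal sketch under stated assumptions: the helper name `split_training_fields` is hypothetical, the stepper section is assumed to sit under a top-level `stepper` key, and the values in the example are placeholders rather than real fme settings.

```python
import copy

# Training-only fields named in the release note (#862).
TRAINING_ONLY_FIELDS = (
    "loss",
    "optimize_last_step_only",
    "n_ensemble",
    "parameter_init",
    "train_n_forward_steps",
)


def split_training_fields(config: dict) -> dict:
    """Return a copy with training-only fields moved to stepper_training."""
    new = copy.deepcopy(config)
    # Assumption: StepperConfig is stored under the "stepper" key.
    stepper = new.get("stepper", {})
    moved = {k: stepper.pop(k) for k in TRAINING_ONLY_FIELDS if k in stepper}
    if moved:
        new.setdefault("stepper_training", {}).update(moved)
    return new


old = {"stepper": {"loss": "placeholder", "n_ensemble": 2, "other": "kept"}}
print(split_training_fields(old))
# {'stepper': {'other': 'kept'}, 'stepper_training': {'loss': 'placeholder', 'n_ensemble': 2}}
```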

New Config Options

metrics_log_dir on LoggingConfig: Log W&B scalar metrics to a local JSONL file on disk in addition to W&B. (#992)
Configurable inference step logging: Control which inference steps are logged to W&B. (#883)
ValidationConfig on InferenceEvaluatorConfig (fme.ace): Optionally run a validation pass before inference and log metrics to step 0 of the W&B run. (#878)
LRTuningConfig on TrainConfig (fme.ace, fme.coupled, fme.diffusion): Automatically tune the learning rate at configurable epochs by running short isolated comparison trials between the current and a candidate LR — no restarts required. (#930)
prescribed_prognostic_names on SingleModuleStepperConfig: Override named prognostic variables with ground-truth values at each inference/eval timestep. Intended to be set via stepper_override in eval configs. (#810)
Optional left/two-tailed PDF metrics for downscaling training. (#994)
LossVsNoiseAggregator for downscaling: Tracks loss as a function of noise level during diffusion training. (#1025)
Configurable training noise distribution. (#874)
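
A JSONL metrics file such as the one metrics_log_dir produces can be read with the standard library alone. The sketch below assumes only that each line is one JSON object; the file name `metrics.jsonl` and the keys `step` and `loss` are illustrative placeholders, since the actual schema is not given in these notes.

```python
import json
import tempfile
from pathlib import Path


def read_metrics(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with Path(path).open() as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


# Self-contained demo using a temporary file with placeholder records.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "metrics.jsonl"
    sample.write_text('{"step": 0, "loss": 1.5}\n{"step": 1, "loss": 1.2}\n')
    records = list(read_metrics(sample))

print(len(records))         # 2
print(records[-1]["loss"])  # 1.2
```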

Deprecations

sea_ice_thickness_name on SeaIceFractionConfig (ocean corrector): Deprecated in favor of the more general zero_where_ice_free_names list, which supports correcting multiple outputs. (#843)
CascadePredictor (downscaling): Deprecated and removed. (#970)
Topography pathway on downscaling DataLoaderConfig / PairedDataLoaderConfig: Deprecated; use StaticInputs instead. (#926)

Notable Behavioral Fixes

CheckpointModelConfig.build now returns the module in eval mode by default. (#1019)
Final-epoch checkpoints are now always saved (previously could be skipped). (#1015)

09 Apr 17:28

frodre

v2026.1.1

ea74b16

HiRO-ACE Release

This release is the official milestone for our team's change to fully open development, and includes the latest updates for HiRO-ACE, as described in our paper. HiRO-ACE is a two-stage emulation framework for generating 3 km resolution precipitation outputs: a stochastic climate emulator (ACE2S) generates 100 km climate simulations, and a downscaling model (HiRO) generates 3 km precipitation outputs from them.

See the docs for a quickstart on installation and use, and our Hugging Face repo for the models and some sample data to run on.

Open Development

Previously, the ACE repo solely held updates related to papers and releases, while most of the development happened behind the scenes in a separate repository. This made it harder for external collaborators to contribute and for users to track development progress. We hope the changeover to all of our development happening here brings us closer to users, facilitating easier paths for soliciting feedback, issues, and development from outside of our group.

Updates

Don't upload big maps by @AnnaKwa in #707
Add StaticInputs class by @AnnaKwa in #713
Add hiro ckpt train config by @AnnaKwa in #721
Provide backwards compatibility for list-type BatchLabels by @mcgibbon in #722
Serialize static inputs with downscaling model by @AnnaKwa in #727
Beaker CI test via gantry by @brianhenn in #723
Remove filter repo tools by @brianhenn in #729
Add training configs for ACE2S used in HiRO - ACE manuscript by @Arcomano1234 in #710
Pass model static inputs to dataset build calls at generation by @AnnaKwa in #728
Call optimizer autocast in stepper predict generator by @mcgibbon in #733
Ensure topography is on device in downscaling inference by @AnnaKwa in #731
Coupled stepper config removes deprecated crps_training key by @elynnwu in #734
Samudra bugfix: Use circular padding for longitude axis by @elynnwu in #735
Add additional diagnostics of the OHC budget by @jpdunc23 in #737
Prevent backpropagation anomalies in energy corrector by @spencerkclark in #724
Fix bug causing step sampler to be ignored by @mcgibbon in #742
Add a contributing guideline by @oliverwm1 in #730
Increase timeout of NCCL collective operations to 20 minutes by @jpdunc23 in #746
Add docs page for downscaling inference by @AnnaKwa in #743
Vendorize Apache 2.0 Nvidia Downscaling Code by @frodre in #748
Enforce lat bounds (-88 deg, 88 deg) by @AnnaKwa in #740
Bump version v2026.1.1 for HiRO-ACE release by @frodre in #751

Full Changelog: v2026.1.0...v2026.1.1

Contributors

frodre, jpdunc23, and 7 other contributors

21 Jan 19:20

frodre

v2026.1.0

0fc1eab

v2026.1.0

Release marking the switch to open development for the ai2cm team.

What's Changed

Docs CI update by @brianhenn in #711
Bump version to 2026.1.0 by @brianhenn in #715

Full Changelog: https://github.com/ai2cm/ace/commits/v2026.1.0

Contributors

brianhenn

07 Nov 21:20

brianhenn

2025.11.0

633ec47

2025.11.0

Release date: November 7, 2025
Full Changelog: 2025.10.0...2025.11.0

What's Changed

Based on user feedback, we updated the versions of two fme dependencies: torch-harmonics (0.7.4 --> 0.8.0) and imageio (<2.27.0 --> >2.28.1).

16 Oct 18:36

brianhenn

2025.10.0

1beed63

2025.10.0

Release date: October 16, 2025
Full Changelog: 2025.7.0...2025.10.0

What's Changed

This release includes the capability to run coupled models (such as those emulating the atmosphere, ocean, and sea ice!) via entrypoints in fme.coupled. We have provided documentation for running inference using coupled model weights.

The deprecated legacy training configuration format (SingleModuleStepperConfig) has been removed in this release. However, breaking changes have been avoided and backwards compatibility has been maintained with existing saved models for most cases.

15 Jul 05:28

brianhenn

2025.7.0

1382d5f

2025.7.0

What's Changed

This release includes major internal refactors and improved documentation. The previous training configuration format has been deprecated and will be removed in a future release. However, breaking changes have been avoided and backwards compatibility has been maintained with existing saved models for most cases.

Version updates:

Python 3.11 and torch 2.7.1

Internal refactors:

The fme package has been moved one level up (i.e., away from the legacy fme/fme/... layout and to fme/ace/ and fme/core/ instead).

Increased modularity for ML emulation:

Training configuration is now based around a more flexible StepperConfig; the legacy SingleModuleStepperConfig is deprecated and will be removed in a future release.
The stepper config now supports the modular step framework, allowing composable steps for ML emulation.

Experimental features:

Samudra, a global ocean emulator developed by M2LInES, is now fully integrated into Ai2's full model framework. An example production workflow for training and running Samudra is currently under development and will be included in the upcoming release.

Documentation

Added an improved quickstart.rst focused around the models saved in our Hugging Face collection.

Full Changelog: 2024.12.0...2025.7.0

17 Dec 01:12

oliverwm1

2024.12.0

2dceb95

2024.12.0

What's Changed

This release contains many internal changes to the ACE code. However, there are no breaking changes to the configuration options accessible through the entrypoints of the fme package (i.e. fme.ace.train, fme.ace.inference and fme.ace.evaluator).

The following lists are not complete; they highlight the changes most likely to be relevant to users.

Bug fixes:

Resolved a transient bug that sometimes occurred in XarrayDataset when trying to read the image shape from a scalar field.
When using n_repeats greater than 1, XarrayDataset now correctly increments the values in the returned time arrays.

New features:

ACE works on Apple Silicon! Set the environment variable FME_USE_MPS=1 to use the PyTorch MPS backend. Make sure to have the latest version of PyTorch installed. This gives about a 5x speedup over running on CPU (tested on a MacBook Pro M3 Max).
Added perturbations to sea surface temperature during inference (see ForcingDataLoaderConfig.perturbations).
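
One way to set FME_USE_MPS for a single run without changing the parent shell is to pass a modified environment to a child process. The helper name `env_with_mps` is hypothetical, and the command line shown in the comment is an assumption about how the entrypoint is invoked; check the docs for the exact form.

```python
import os


def env_with_mps(base=None):
    """Return a copy of the environment with FME_USE_MPS=1 set."""
    env = dict(os.environ if base is None else base)
    env["FME_USE_MPS"] = "1"
    return env


# Example launch (not executed here; arguments are an assumption):
# import subprocess, sys
# subprocess.run(
#     [sys.executable, "-m", "fme.ace.inference", "config.yaml"],
#     env=env_with_mps(),
# )

print(env_with_mps({"PATH": "/usr/bin"}))
# {'PATH': '/usr/bin', 'FME_USE_MPS': '1'}
```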

Refactors:

Deduplicated some inference code by using generics. The fme.ace.inference and fme.ace.evaluator entrypoints now share more code.

Full Changelog: 2024.9.0...2024.12.0

10 Oct 21:13

oliverwm1

2024.9.0

06afe52

2024.9.0

What's Changed

Update README to link to zenodo repo with checkpoint by @oliverwm1 in #3
New public release of FME code by @oliverwm1 in #5
Fix instruction for installing from GitHub by @oliverwm1 in #7
Add readthedocs config by @mcgibbon in #6
Add docs badge and link by @oliverwm1 in #8
Add link to zenodo archive with checkpoint by @oliverwm1 in #9
Add link to E3SMv2-trained paper and checkpoint by @oliverwm1 in #12
Add link to published EAMv2 paper in JGR-ML by @jpdunc23 in #16
Add missing init files by @oliverwm1 in #17
Update for PyPI release by @frodre in #20

New Contributors

@oliverwm1 made their first contribution in #3
@mcgibbon made their first contribution in #6
@jpdunc23 made their first contribution in #16
@frodre made their first contribution in #20

Full Changelog: 2023.12.0...2024.9.0

Contributors

frodre, jpdunc23, and 2 other contributors
