Skip to content

fix: mask CMIP fill values (1e20) before monthly max/min aggregation#457

Closed
rbeucher wants to merge 2 commits into
mainfrom
fix/tasmax-fill-value-masking
Closed

fix: mask CMIP fill values (1e20) before monthly max/min aggregation#457
rbeucher wants to merge 2 commits into
mainfrom
fix/tasmax-fill-value-masking

Conversation

@rbeucher

Copy link
Copy Markdown
Member

Problem

When data is loaded with decode_cf=False (as done throughout the pipeline to preserve lazy/Dask computation), fill values such as 1e20 are not automatically converted to NaN. This caused calculate_monthly_maximum and calculate_monthly_minimum to return 1e20 as the aggregated result when any fill value was present in the input data — most notably for tasmax and tasmin.

Fix

In both calculate_monthly_minimum and calculate_monthly_maximum in calc_utils.py:

  1. Detect _FillValue / missing_value from both attrs and encoding
  2. Mask those values via .where(da != fill_val) before calling .resample().max()/.min()
  3. Preserves lazy computation — xarray.where is a lazy operation

Tests

Added two regression tests to test_derivations_calc_utils.py:

  • test_masks_fill_values_in_attrs — verifies 1e20 in attrs is masked and does not appear in results
  • test_masks_fill_values_in_encoding — verifies fill values in encoding are also handled

All 58 existing tests continue to pass.

When data is loaded with decode_cf=False (as done throughout the pipeline
to preserve lazy computation), fill values such as 1e20 are NOT automatically
converted to NaN. This caused calculate_monthly_maximum and
calculate_monthly_minimum to return 1e20 as the aggregated result when any
fill value was present in the input data (e.g. tasmax/tasmin).

Fix: detect _FillValue / missing_value from attrs and encoding, and mask
those values via .where() before calling .resample().max()/.min().

Add regression tests for both attrs-based and encoding-based fill values.
@codecov

codecov Bot commented Jun 23, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 77.77778% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.7%. Comparing base (9316f47) to head (f0d1543).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/access_moppy/derivations/calc_utils.py 77.8% 4 Missing ⚠️

❌ Your patch check has failed because the patch coverage (77.8%) is below the target coverage (90.0%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@          Coverage Diff          @@
##            main    #457   +/-   ##
=====================================
  Coverage   76.7%   76.7%           
=====================================
  Files         31      31           
  Lines       5927    5946   +19     
  Branches    1094    1097    +3     
=====================================
+ Hits        4546    4562   +16     
- Misses      1131    1135    +4     
+ Partials     250     249    -1     
Flag Coverage Δ
unit 76.7% <77.8%> (+<0.1%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@rbeucher rbeucher closed this Jun 24, 2026
@rbeucher rbeucher deleted the fix/tasmax-fill-value-masking branch June 24, 2026 01:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant