Skip to content

Fix Oyr ocean resampling pipeline (enable resampling, yearly time/bounds/filename, calendar)#463

Open
rhaegar325 wants to merge 4 commits into
mainfrom
yearlydata-aggregate-calculation-issue
Open

Fix Oyr ocean resampling pipeline (enable resampling, yearly time/bounds/filename, calendar)#463
rhaegar325 wants to merge 4 commits into
mainfrom
yearlydata-aggregate-calculation-issue

Conversation

@rhaegar325

Copy link
Copy Markdown
Collaborator

PR Summary: Fix Oyr ocean resampling pipeline (enable resampling, yearly time/bounds/filename, calendar)

Overview

This change set fixes a chain of defects that prevented ACCESS-ESM1-5 ocean yearly
(Oyr)
variables from being CMORised correctly, and tightens CF/WCRP compliance of the
resampled output. All fixes are in the producer (access_moppy); each is covered by unit
tests and verified end-to-end against real model output with the WCRP CMIP6
compliance-checker.

Problems fixed

# Symptom Root cause Files
1 Ocean Oyr/Omon files have a dangling time:bounds but no time_bnds variable (compliance ATTR004/VAR004) Ocean bounds calculation was commented out; method lived only on the atmosphere class base.py, atmosphere.py, ocean.py (landed earlier, #455)
2 Oyr output still had 12 monthly steps instead of yearly Driver never passed enable_resampling to the ocean CMORisers, and Ocean_CMORiser_OM2/OM3.__init__ did not accept it driver.py, ocean.py
3 Single-year input crashed: "Need at least 2 time points to infer time bounds" calculate_time_bounds could not infer frequency from one point utilities.py, base.py
4 Yearly time coordinate landed on 31 Dec instead of mid-year resample labels bins with the YE (year-end) boundary utilities.py
5 Resampled output lost time:units/calendar (CF-invalid) decode_cf moved units into encoding and they were never restored utilities.py
6 Yearly filename included a month (010107-...) Filename builder lumped yearly into the monthly YYYYMM branch vocabulary_processors.py
7 Non-chronological (glob) file inputs crashed: "Index must be monotonic for resampling" Out-of-order multi-file concat produced a non-monotonic time axis; the sort ran too late utilities.py
8 [TIME001] yearly midpoint off by 1 day; time_bnds vs time calendar mismatch The non-CF "GREGORIAN" label was read by cftime as the Julian/Gregorian standard calendar, while the file declares proleptic_gregorian utilities.py

Changes

src/access_moppy/driver.py

  • ACCESS_ESM_CMORiser.enable_resampling default FalseTrue (no-op unless there is
    a genuine frequency mismatch).
  • Forward validate_frequency / enable_resampling / resampling_method to both
    Ocean_CMORiser_OM2 and Ocean_CMORiser_OM3.

src/access_moppy/ocean.py

  • Ocean_CMORiser_OM2/OM3.__init__ accept and forward the three resampling parameters to
    the base class (the link was previously broken at the subclass level).

src/access_moppy/utilities.py

  • _shift_resampled_time_to_period_midpoint: centre the resampled time coordinate on the
    period midpoint (CMOR convention), reusing the existing per-period bounds helpers.
  • resample_dataset_temporal: sort by time before resampling (monotonicity); normalise the
    "GREGORIAN" calendar label to "proleptic_gregorian" before decoding; restore the CF
    units/calendar encoding (numeric) after resampling.
  • _normalise_calendar_name: map only the non-CF "GREGORIAN" to "proleptic_gregorian".
  • calculate_time_bounds: add a freq_hint fallback for single-point axes; normalise the
    calendar before date arithmetic.

src/access_moppy/base.py

  • _target_frequency_hint: derive a daily/monthly/yearly hint from the CMOR table.
  • Pass that hint into calculate_time_bounds for the time axis.

src/access_moppy/vocabulary_processors.py

  • Add a yearly branch to the filename builder: Oyr time range is year-only (YYYY-YYYY).

Behaviour change

  • enable_resampling now defaults to True. Resampling is a no-op when the input
    frequency already matches the target (e.g. monthly→monthly), and only triggers on a real
    mismatch (e.g. monthly input for an Oyr table). Pass enable_resampling=False to opt
    out. Sea-ice is unaffected (the driver does not wire the flag into it); no existing test
    depended on the old default.

Tests

  • test_utilities.py:
    • TestResampleTimeMidpoint: midpoint on ~2 July (incl. leap year), CF units preserved,
      unsorted input handled.
    • test_single_point_with_freq_hint, test_gregorian_label_treated_as_proleptic,
      test_normalise_calendar_name.
  • test_vocabulary_processors.py: test_generate_filename_yearly_year_only.
  • Full local suites green: test_utilities / test_ocean / test_atmosphere /
    test_driver / test_vocabulary_processors (the 7 failing test_cmip7_* cases are a
    pre-existing CMIP7 CV submodule issue, unrelated to this change).
module use /g/data/xp65/public/modules && module load conda/analysis3
PYTHONPATH=src python -m pytest tests/unit/test_utilities.py tests/unit/test_ocean.py \
  tests/unit/test_driver.py tests/unit/test_vocabulary_processors.py -q

Compliance verification

Run against a real CMORised file
(no3_Oyr_ACCESS-ESM1-5_historical_r1i1p1f1_gn_0101-0110.nc):

  • time_bnds present (ATTR004/VAR004 cleared).
  • [TIME001] time-squareness passes — time[0] = 36706.5, the proleptic_gregorian
    yearly midpoint the checker expects.
  • ✅ CF §7.1 time_bnds/time calendar mismatch cleared (both proleptic_gregorian).
  • ✅ Yearly filename is YYYY-YYYY.

@codecov

codecov Bot commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.14815% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 77.2%. Comparing base (7142b03) to head (8eab7f2).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/access_moppy/utilities.py 97.2% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main    #463     +/-   ##
=======================================
+ Coverage   76.9%   77.2%   +0.3%     
=======================================
  Files         31      31             
  Lines       5974    6024     +50     
  Branches    1107    1119     +12     
=======================================
+ Hits        4594    4648     +54     
+ Misses      1131    1126      -5     
- Partials     249     250      +1     
Flag Coverage Δ
unit 77.2% <98.1%> (+0.3%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@rbeucher

Copy link
Copy Markdown
Member

Can you link the issues that this will close?

@rhaegar325

Copy link
Copy Markdown
Collaborator Author

solved #454 #461

@rbeucher

Copy link
Copy Markdown
Member

How much of that also applied to ESM1-6 and CMIP7. I think we should check to make sure that we don't just use code for CMIP6.

@rbeucher

Copy link
Copy Markdown
Member

Most of this PR should apply to ACCESS-ESM1-6 and CMIP7, since the resampling/time-bounds/calendar fixes are in shared code paths used after CMIP7→CMIP6 mapping. Two things to double-check: CMIP7 yearly filename formatting (this change looks CMIP6-only), and OM2 source_id acceptance for ACCESS-ESM1-6 so ocean CMORisation isn’t blocked independently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants