feat: native xarray support via `ds.canyonb` accessor (v0.3.0) by RaphaelBajon · Pull Request #7 · RaphaelBajon/canyonbpy

RaphaelBajon · 2026-03-04T05:26:06Z

Summary

This PR adds native xarray support to canyonbpy through a Dataset accessor registered as `ds.canyonb`. Users can now run CANYON-B predictions directly on an `xr.Dataset` without extracting variables by hand. Results come back as an `xr.Dataset` that shares the dimensions and coordinates of the input, so they can be merged straight back into it.

import xarray as xr
import canyonbpy  # ds.canyonb is now available on any xr.Dataset

results = ds.canyonb.predict(param=["pH", "NO3"])
ds_enriched = xr.merge([ds, results])

Motivation

Working with ocean model output or Argo float data almost always means xr.Dataset objects. Previously, users had to manually extract each variable into a numpy array, build a dictionary, call canyonb(), then figure out how to reassign coordinates on the outputs. This PR removes all of that friction.

The accessor pattern (used by argopy, cf_xarray, xoak, etc.) is the idiomatic xarray-native approach: it lives on the object, is auto-discoverable, and groups related functionality under a single namespace without polluting the top-level package API.
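For readers unfamiliar with the pattern, here is a minimal sketch of how an xarray Dataset accessor is registered in general. It is not the code in `canyonbpy/accessor.py`: the accessor name `demo_stats`, the class `DemoStatsAccessor`, and its method are hypothetical and exist only for illustration.

```python
import xarray as xr


# "demo_stats" is a hypothetical accessor name chosen for illustration only.
@xr.register_dataset_accessor("demo_stats")
class DemoStatsAccessor:
    """Toy accessor: xarray passes the Dataset to __init__ on first access."""

    def __init__(self, xarray_obj: xr.Dataset) -> None:
        self._ds = xarray_obj

    def n_variables(self) -> int:
        """Return the number of data variables in the wrapped Dataset."""
        return len(self._ds.data_vars)


ds = xr.Dataset({"temperature": ("depth", [10.0, 8.5, 4.2])})
n_vars = ds.demo_stats.n_variables()
```

Because registration happens when the decorated class is defined, importing the module that contains it is enough to make the namespace available on every Dataset.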

Changes

New files

| File | Description |
| --- | --- |
| `canyonbpy/accessor.py` | `CanyonBAccessor`, registered as `ds.canyonb` via `@xr.register_dataset_accessor` |
| `canyonbpy/tests/test_accessor.py` | 13 tests covering registration, `predict()`, `converter()` |

Modified files

| File | Description |
| --- | --- |
| `canyonbpy/preprocessing.py` | Implements `DatasetToNumpy` (replaces the empty stub) |
| `canyonbpy/__init__.py` | Imports `accessor` module to trigger registration on import; bumps version to `0.3.0` |
| `docs/user-guide/advanced-features.md` | Replaces the "not yet implemented" warning with full xarray section |
| `docs/version-history.md` | Adds v0.3.0 entry |

API

`ds.canyonb.predict()` — main entry point

Returns an xr.Dataset with the same dimensions and coordinates as the source.

import xarray as xr
import canyonbpy

# Default variable names: time, latitude, longitude, pressure, temperature, salinity, doxy
results = ds.canyonb.predict()

# Select parameters
results = ds.canyonb.predict(param=["pH", "AT", "NO3"])

# Custom measurement errors
results = ds.canyonb.predict(epres=1.0, etemp=0.01, epsal=0.01)

# Merge back into source dataset
ds_enriched = xr.merge([ds, results])
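To illustrate why results that share the input's dimensions and coordinates merge cleanly, the sketch below wraps a flat vector of values onto the grid of an existing variable. The helper `pack_like` is hypothetical and is not part of canyonbpy, and the `pH` numbers are placeholders rather than CANYON-B output.

```python
import numpy as np
import xarray as xr


def pack_like(template: xr.DataArray, flat_values: np.ndarray, name: str) -> xr.DataArray:
    """Reshape a flat vector onto the dims and coords of `template` (hypothetical helper)."""
    values = np.asarray(flat_values).reshape(template.shape)
    return xr.DataArray(values, dims=template.dims, coords=template.coords, name=name)


ds = xr.Dataset(
    {"temperature": (("profile", "depth"), np.array([[10.0, 8.0, 4.0], [12.0, 9.0, 5.0]]))},
    coords={"profile": [0, 1], "depth": [5.0, 50.0, 500.0]},
)

# Placeholder values, one per input point; these are NOT real CANYON-B predictions.
placeholder_ph = np.linspace(7.8, 8.1, ds["temperature"].size)

results = xr.Dataset({"pH": pack_like(ds["temperature"], placeholder_ph, "pH")})
ds_enriched = xr.merge([ds, results])
```

Since both datasets carry identical `profile` and `depth` coordinates, `xr.merge` aligns them without introducing any new index values.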

Custom variable names via `var_map`

Only keys that differ from the defaults need to be supplied. Argo BGC delayed-mode example:

var_map = {
    "temp": "TEMP_ADJUSTED",
    "psal": "PSAL_ADJUSTED",
    "doxy": "DOXY_ADJUSTED",
    "pres": "PRES_ADJUSTED",
    "lat":  "LATITUDE",
    "lon":  "LONGITUDE",
}
results = ds.canyonb.predict(var_map=var_map, param=["pH", "NO3"])

Default mapping:

| `canyonb` argument | Default dataset variable |
| --- | --- |
| `gtime` | `time` |
| `lat` | `latitude` |
| `lon` | `longitude` |
| `pres` | `pressure` |
| `temp` | `temperature` |
| `psal` | `salinity` |
| `doxy` | `doxy` |
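One plausible way to get the "only supply what differs" behaviour is to overlay the user's mapping on the defaults. The sketch below assumes that approach; `resolve_var_map` is a hypothetical helper, and the rejection of unknown keys is an added assumption, not something this PR states.

```python
# Defaults copied from the default-mapping table.
DEFAULT_VAR_MAP = {
    "gtime": "time",
    "lat": "latitude",
    "lon": "longitude",
    "pres": "pressure",
    "temp": "temperature",
    "psal": "salinity",
    "doxy": "doxy",
}


def resolve_var_map(var_map=None):
    """Return the defaults with user overrides applied (hypothetical helper)."""
    overrides = var_map or {}
    # Assumption: keys that are not canyonb arguments are treated as mistakes.
    unknown = set(overrides) - set(DEFAULT_VAR_MAP)
    if unknown:
        raise KeyError(f"Unknown canyonb argument(s): {sorted(unknown)}")
    return {**DEFAULT_VAR_MAP, **overrides}


resolved = resolve_var_map({"temp": "TEMP_ADJUSTED", "psal": "PSAL_ADJUSTED"})
```

Here `resolved["temp"]` points at `TEMP_ADJUSTED` while `resolved["gtime"]` still falls back to `time`.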

`ds.canyonb.converter()` — low-level access

Returns the underlying DatasetToNumpy instance for cases where you need to inspect or modify numpy arrays before running the neural network.

from canyonbpy import canyonb

conv = ds.canyonb.converter()
inputs = conv.to_dict()         # dict[str, np.ndarray]
shape = conv.original_shape()   # e.g. (n_prof, n_depth)

results = canyonb(**inputs, param=["pH"])
ph_grid = results["pH"].reshape(shape)
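The flatten-then-reshape round trip can be sketched with plain xarray and numpy. This is an illustration of the general idea, not the `DatasetToNumpy` implementation; it assumes a per-profile `latitude` that has to be expanded to match a 2-D `pressure` grid.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "latitude": ("profile", [45.0, 46.0]),
        "pressure": (("profile", "depth"), np.array([[5.0, 50.0, 500.0], [5.0, 50.0, 500.0]])),
    }
)

# Expand the per-profile latitude so it has one value per (profile, depth) point.
lat_2d, pres_2d = xr.broadcast(ds["latitude"], ds["pressure"])
shape = pres_2d.shape

# Flatten to 1-D for a point-wise model, then restore the grid afterwards.
lat_flat = lat_2d.values.ravel()
pres_flat = pres_2d.values.ravel()
restored = pres_flat.reshape(shape)
```

Keeping the original shape around is what makes it possible to map point-wise outputs back onto the `(n_prof, n_depth)` grid.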

Implementation notes

Scalar `_cim` outputs: for carbonate parameters (AT, CT, pH, pCO2), `canyonb` returns `_cim` (measurement uncertainty) as a scalar because `cvalcimeas = inputsigma[i]**2` is a fixed constant rather than a per-point value. `_pack_results` handles this by broadcasting scalar/size-1 arrays to `original_shape` via `np.full`, rather than attempting a reshape.
No circular imports: `accessor.py` imports `canyonb` from `.core` inside the `predict()` method body, avoiding any circular dependency at module load time.
No breaking changes: the existing `canyonb()` numpy API is untouched.
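The scalar-broadcast case can be illustrated with a small standalone function. This is a sketch of the technique described in the first note, not the body of `_pack_results`; the name `to_original_shape` and the sample value `0.005` are hypothetical.

```python
import numpy as np


def to_original_shape(values, original_shape):
    """Reshape per-point arrays; fill the grid when only a single value is given."""
    arr = np.asarray(values)
    if arr.size == 1:
        # A scalar or size-1 array cannot be reshaped to a larger grid, so repeat it.
        return np.full(original_shape, arr.item())
    return arr.reshape(original_shape)


shape = (2, 3)
per_point = to_original_shape(np.arange(6.0), shape)
constant = to_original_shape(0.005, shape)
```

Calling `reshape` on the scalar directly would raise a `ValueError`, which is why the size-1 case needs its own branch.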

Tests

pytest canyonbpy/tests/test_accessor.py -v

tests/test_accessor.py::TestAccessorRegistration::test_accessor_is_available   PASSED
tests/test_accessor.py::TestAccessorRegistration::test_accessor_type           PASSED
tests/test_accessor.py::TestPredict::test_returns_dataset                      PASSED
tests/test_accessor.py::TestPredict::test_expected_variables_present           PASSED
tests/test_accessor.py::TestPredict::test_unrequested_params_absent            PASSED
tests/test_accessor.py::TestPredict::test_output_shape_preserved               PASSED
tests/test_accessor.py::TestPredict::test_output_dims_match_input              PASSED
tests/test_accessor.py::TestPredict::test_result_is_mergeable                  PASSED
tests/test_accessor.py::TestPredict::test_custom_var_map_argo                  PASSED
tests/test_accessor.py::TestPredict::test_results_consistent_with_canyonb     PASSED
tests/test_accessor.py::TestPredict::test_custom_errors                        PASSED
tests/test_accessor.py::TestConverter::test_returns_dataset_to_numpy           PASSED
tests/test_accessor.py::TestConverter::test_converter_with_var_map             PASSED

Checklist

CanyonBAccessor implemented in canyonbpy/accessor.py
DatasetToNumpy implemented in canyonbpy/preprocessing.py (stub → full implementation)
Accessor auto-registered on import canyonbpy (no extra import needed for the user)
Scalar _cim outputs handled correctly in _pack_results
All 13 new tests passing
Existing test suite unaffected
Documentation updated (advanced-features.md, version-history.md)
__version__ bumped to 0.3.0
No breaking changes to the existing canyonb() API

feat: add xarray accessor ds.canyonb (v0.3.0)

8606909

RaphaelBajon mentioned this pull request Mar 4, 2026

xarray integration and netcdf files #6

Closed

RaphaelBajon merged commit e483dfc into main Mar 4, 2026
4 checks passed

RaphaelBajon mentioned this pull request May 28, 2026

Feat: add CO2CONTENT.m algo, #9

Merged

RaphaelBajon merged 1 commit into main from feature/xarray-accessor
