feat: native xarray support via ds.canyonb accessor (v0.3.0) #7
Merged
Summary
This PR adds native xarray support to canyonbpy by implementing a Dataset accessor registered as ds.canyonb. Users can now run CANYON-B predictions directly on an xr.Dataset without any manual variable extraction, and get results back as an xr.Dataset sharing the same dimensions and coordinates, ready to merge.

Motivation
Working with ocean model output or Argo float data almost always means working with xr.Dataset objects. Previously, users had to manually extract each variable into a numpy array, build a dictionary, call canyonb(), then figure out how to reassign coordinates on the outputs. This PR removes all of that friction.

The accessor pattern (used by argopy, cf_xarray, xoak, etc.) is the idiomatic xarray-native approach: it lives on the object, is auto-discoverable, and groups related functionality under a single namespace without polluting the top-level package API.

Changes
New files
- canyonbpy/accessor.py: CanyonBAccessor, registered as ds.canyonb via @xr.register_dataset_accessor
- canyonbpy/tests/test_accessor.py: tests for predict() and converter()

Modified files
- canyonbpy/preprocessing.py: implements DatasetToNumpy (replaces the empty stub)
- canyonbpy/__init__.py: imports the accessor module to trigger registration on import; bumps version to 0.3.0
- docs/user-guide/advanced-features.md
- docs/version-history.md

API
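For readers new to the accessor pattern, here is a minimal sketch of how a Dataset accessor is registered and how it can return output on the source's dimensions and coordinates. Only xr.register_dataset_accessor is real xarray API here; the accessor name demo, the class DemoAccessor, and the body of its predict() are hypothetical stand-ins and not the canyonbpy implementation.

```python
import numpy as np
import xarray as xr


@xr.register_dataset_accessor("demo")  # hypothetical name, not the real "canyonb"
class DemoAccessor:
    """Toy accessor: xarray passes the Dataset to __init__ on first access."""

    def __init__(self, xarray_obj):
        self._ds = xarray_obj

    def predict(self):
        # Stand-in for a model call: any array with the input's shape works here.
        temp = self._ds["temperature"]
        values = temp.values * 2.0
        # Rebuild the output on the source's dims and coords.
        out = xr.DataArray(values, dims=temp.dims, coords=temp.coords, name="doubled")
        return out.to_dataset()


ds = xr.Dataset(
    {"temperature": (("time", "depth"), np.array([[10.0, 8.0], [11.0, 9.0]]))},
    coords={"time": [0, 1], "depth": [5.0, 50.0]},
)
result = ds.demo.predict()
```

Because the output shares the source's coordinates, it can be combined with the input afterwards, for example with xr.merge([ds, result]).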
ds.canyonb.predict(): main entry point
Returns an xr.Dataset with the same dimensions and coordinates as the source.

Custom variable names via var_map
Only keys that differ from the defaults need to be supplied. Argo BGC delayed-mode example:
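As a rough sketch of the override behaviour, the snippet below overlays user-supplied keys on the default mapping listed in this PR. The helper resolve_var_map and the Argo-style variable names are assumptions made for illustration; the accessor's actual internals and the names used in a given Argo file may differ.

```python
# Defaults as listed in this PR: canyonb argument -> dataset variable name.
DEFAULT_VAR_MAP = {
    "gtime": "time",
    "lat": "latitude",
    "lon": "longitude",
    "pres": "pressure",
    "temp": "temperature",
    "psal": "salinity",
    "doxy": "doxy",
}


def resolve_var_map(var_map=None):
    """Hypothetical helper: overlay user overrides on the defaults."""
    overrides = var_map or {}
    unknown = set(overrides) - set(DEFAULT_VAR_MAP)
    if unknown:
        raise KeyError(f"unknown canyonb arguments: {sorted(unknown)}")
    return {**DEFAULT_VAR_MAP, **overrides}


# Illustrative Argo-style names (assumed; not taken from the PR).
argo_overrides = {
    "gtime": "JULD",
    "lat": "LATITUDE",
    "lon": "LONGITUDE",
    "pres": "PRES_ADJUSTED",
    "temp": "TEMP_ADJUSTED",
    "psal": "PSAL_ADJUSTED",
    "doxy": "DOXY_ADJUSTED",
}
resolved = resolve_var_map(argo_overrides)

# Supplying a single key leaves every other default in place.
partial = resolve_var_map({"doxy": "DOXY_ADJUSTED"})

# Assumed call shape on a real dataset: ds.canyonb.predict(var_map=argo_overrides)
```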
Default mapping (canyonb argument: dataset variable name):
- gtime: time
- lat: latitude
- lon: longitude
- pres: pressure
- temp: temperature
- psal: salinity
- doxy: doxy

ds.canyonb.converter(): low-level access
Returns the underlying DatasetToNumpy instance for cases where you need to inspect or modify numpy arrays before running the neural network.

Implementation notes
- _cim outputs: for carbonate parameters (AT, CT, pH, pCO2), canyonb returns _cim (measurement uncertainty) as a scalar because cvalcimeas = inputsigma[i]**2 is a fixed constant rather than a per-point value. _pack_results handles this by broadcasting scalar/size-1 arrays to original_shape via np.full, rather than attempting a reshape.
- accessor.py imports canyonb from .core inside the predict() method body, avoiding any circular dependency at module load time.
- The canyonb() numpy API is untouched.

Tests
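A minimal sketch of the scalar _cim handling described in the implementation notes, with the kind of shape check a test might make. The function pack_value is a hypothetical stand-in for the per-output step of _pack_results, not the code in this PR; it only illustrates filling the grid with np.full when an output has a single value, and reshaping otherwise.

```python
import numpy as np


def pack_value(value, original_shape):
    """Hypothetical stand-in for the per-output step of _pack_results."""
    arr = np.asarray(value, dtype=float)
    if arr.size == 1:
        # Scalar or size-1 output (e.g. a constant _cim): fill the full grid.
        return np.full(original_shape, arr.item())
    # Per-point output: restore the original grid shape.
    return arr.reshape(original_shape)


shape = (2, 3)
per_point = pack_value(np.arange(6.0), shape)  # one value per grid point
constant = pack_value(0.25, shape)             # single value for the whole grid
```

A plain reshape of a size-1 array to (2, 3) would raise a ValueError, which is why the single-value case is handled separately.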
Checklist
- CanyonBAccessor implemented in canyonbpy/accessor.py
- DatasetToNumpy implemented in canyonbpy/preprocessing.py (stub → full implementation)
- Accessor registered on import canyonbpy (no extra import needed for the user)
- _cim outputs handled correctly in _pack_results
- Docs updated (advanced-features.md, version-history.md)
- __version__ bumped to 0.3.0
- Existing canyonb() API untouched