
Switch to writing files in CDF5 (NETCDF3_64BIT_DATA) format #333

Merged
xylar merged 8 commits into E3SM-Project:main from xylar:write-cdf5 on Jun 10, 2025


@xylar xylar commented Jun 3, 2025

Collaborator

This is E3SM's preferred format for large files. Since E3SM can't support NETCDF4 format, it is the only option that we can be sure will work for both small and large meshes, and it saves us the hassle of having different formats for different mesh sizes.

Good performance does require first writing a NETCDF4 file from xarray and then converting to NETCDF3_64BIT_DATA. This is handled by a new version of mpas_tools, see MPAS-Dev/MPAS-Tools#633.
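The two-step write described above can be sketched roughly as follows. This is an illustration only, not the `mpas_tools` implementation: the helper names `cdf5_convert_command` and `write_cdf5` are hypothetical, and using NCO's `ncks -5` for the conversion step is an assumption about one way to produce a CDF5 file.

```python
import os
import subprocess
import tempfile


def cdf5_convert_command(src, dst):
    """Build an NCO command that rewrites ``src`` as a CDF5 file ``dst``.

    Hypothetical helper; assumes ``ncks`` is available on the PATH.
    """
    # -O overwrites dst if it exists; -5 requests CDF5 (NETCDF3_64BIT_DATA)
    return ['ncks', '-O', '-5', src, dst]


def write_cdf5(ds, filename):
    """Write an xarray.Dataset as NETCDF4 first, then convert it to CDF5."""
    out_dir = os.path.dirname(os.path.abspath(filename))
    # Keep the intermediate file next to the target so both share a disk
    fd, tmp = tempfile.mkstemp(suffix='.nc', dir=out_dir)
    os.close(fd)
    try:
        # Step 1: the fast write from xarray, in NETCDF4 format
        ds.to_netcdf(tmp, format='NETCDF4', engine='netcdf4')
        # Step 2: convert the intermediate file to NETCDF3_64BIT_DATA
        subprocess.run(cdf5_convert_command(tmp, filename), check=True)
    finally:
        os.remove(tmp)
```

In polaris itself, callers would not need a helper like this, since the description says the conversion is handled inside the new version of `mpas_tools`.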

Copilot summary

This pull request introduces changes to improve NetCDF file handling and streamline the codebase in polaris. Key updates include modifying the default NetCDF format, refactoring imports for clarity, and adding functionality to set default I/O types for streams. Below is a breakdown of the most important changes:

NetCDF File Handling Improvements:

  • Updated the default NetCDF format to NETCDF3_64BIT_DATA in polaris/default.cfg to align with improved compatibility and performance. The output engine was also switched from scipy to netcdf4.
  • Added logic in polaris/model_step.py to set the default I/O type for streams when the format is NETCDF3_64BIT_DATA.
  • Introduced the set_default_io_type function in polaris/streams.py to set the io_type attribute for streams and immutable streams, except for immutable streams named 'mesh'.
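As a rough sketch of how the three changes above might fit together, the example below reads the format from a config and, when it is NETCDF3_64BIT_DATA, sets `io_type` on every stream except the immutable stream named 'mesh'. Several details are assumptions rather than facts from this PR: the `[io]` section and option names, the `'pnetcdf,cdf5'` value, the example stream names, and the use of the standard-library XML parser. The body of `set_default_io_type` is an illustration of the described behavior, not the code in polaris/streams.py.

```python
import configparser
import xml.etree.ElementTree as ET

# Assumed layout of the relevant options in polaris/default.cfg
CONFIG_TEXT = """
[io]
format = NETCDF3_64BIT_DATA
engine = netcdf4
"""

# Made-up streams file used only for this example
STREAMS_XML = """
<streams>
  <immutable_stream name="mesh" type="input" filename_template="init.nc"/>
  <immutable_stream name="restart" type="input;output" filename_template="rst.nc"/>
  <stream name="output" type="output" filename_template="output.nc"/>
</streams>
"""


def set_default_io_type(streams, io_type='pnetcdf,cdf5'):
    """Set io_type on all streams except the immutable 'mesh' stream."""
    for stream in streams.findall('stream'):
        stream.set('io_type', io_type)
    for stream in streams.findall('immutable_stream'):
        # The 'mesh' stream is skipped, as described in the summary
        if stream.get('name') != 'mesh':
            stream.set('io_type', io_type)


config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)
streams = ET.fromstring(STREAMS_XML.strip())

# Only touch the streams when the CDF5 format has been requested
if config.get('io', 'format') == 'NETCDF3_64BIT_DATA':
    set_default_io_type(streams)

io_types = {s.get('name'): s.get('io_type') for s in streams}
print(io_types)
```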

Codebase Refactoring:

  • Refactored imports in polaris/mesh/spherical.py, replacing xarray with an alias xr and adding write_netcdf for improved readability and functionality.
  • Updated instances of xarray.DataArray and xarray.open_dataset to use the alias xr for consistency in polaris/mesh/spherical.py.
  • Enhanced the run method in polaris/mesh/spherical.py to rewrite temporary mesh files in the desired NetCDF format using write_netcdf.

Checklist

  • User's Guide has been updated
  • Developer's Guide has been updated
  • API documentation in the Developer's Guide (api.md) has any new or modified class, method and/or functions listed
  • Documentation has been built locally and changes look as expected
  • Testing comment in the PR documents testing used to verify the changes

@xylar xylar self-assigned this Jun 3, 2025
@xylar xylar added the framework, ocean, and enhancement labels and removed the framework label Jun 3, 2025
@xylar xylar force-pushed the write-cdf5 branch 3 times, most recently from 61ac0cf to 8873379 Compare June 6, 2025 12:21
@xylar xylar marked this pull request as ready for review June 6, 2025 12:52
xylar added 6 commits June 7, 2025 01:31
This is E3SM's preferred format for large files.  Since E3SM
can't support `NETCDF4` format, it is the only option that we
can be sure will work for both small and large meshes, and saves
us the hassle of having different formats for different mesh sizes.

Good performance does require first writing a `NETCDF4` file from
`xarray` and then converting to `NETCDF3_64BIT_DATA`.  This is
handled by a new version of `mpas_tools`.
This ensures that they have the desired format.
xylar commented Jun 9, 2025

Collaborator Author

Testing

I ran the pr suite and confirmed that all tests were bit-for-bit (BFB) identical to a recent pr suite baseline generated before this functionality was added. (The overflow test was added after I generated the baseline, so I could not check that test.)

xylar commented Jun 10, 2025

Collaborator Author

I'm going to merge this. If it turns out to be a problem, we can move back to NETCDF3_64BIT for files that aren't too large. But I would like to at least try to have a consistent format for now.

@xylar xylar merged commit a8bdd58 into E3SM-Project:main Jun 10, 2025
5 checks passed
@xylar xylar deleted the write-cdf5 branch June 12, 2025 12:53

Labels

enhancement New feature or request ocean Related to the ocean component
