ESMcat

ESMcat is a Python package for working with large climate model datasets. It provides a catalogue system for indexing and filtering CMIP6 datasets.


Requirements


Installation

Install directly from GitHub:

pip install git+https://github.com/scotthosking/ESMcat.git

Or clone and install in editable mode for development:

git clone https://github.com/scotthosking/ESMcat.git
cd ESMcat
pip install -e .

Machine configuration

ESMcat uses a config file at ~/.esmcat/config.json to know which machine you are on, and therefore which dataset paths and catalogue files to use.

Set your machine on first use:

import esmcat as ecat
ecat.set_config('jasmin')

This writes {"machine": "jasmin"} to ~/.esmcat/config.json. ESMcat will then load datasets_jasmin.json from the package for dataset root paths and directory structures.

To check your current config:

ecat.get_config()
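
The config file can also be inspected directly, for instance when debugging which dataset file will be picked up. The following is a minimal sketch, assuming only what is described above (a JSON file containing a "machine" key, and a dataset file named datasets_{machine}.json). The helper name read_machine_config is hypothetical and is not part of ESMcat:

```python
import json
from pathlib import Path

# Default location described above. The helper below is a hypothetical
# illustration, not an ESMcat function.
DEFAULT_CONFIG = Path.home() / ".esmcat" / "config.json"


def read_machine_config(config_path=DEFAULT_CONFIG):
    """Return (machine, datasets_filename) read from an ESMcat-style config file."""
    with open(config_path) as f:
        config = json.load(f)
    machine = config["machine"]  # e.g. "jasmin"
    # Name of the dataset definition file that corresponds to this machine
    return machine, f"datasets_{machine}.json"
```

In normal use, ecat.get_config() remains the supported way to check the setting.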

To add support for a new machine, create a datasets_{machine}.json file in the esmcat/ package directory following the same structure as datasets_jasmin.json.


Catalogue files

ESMcat uses pre-built catalogue files (Parquet format) stored in ~/.esmcat/. These index the files available on your system for each dataset.

Bundled catalogue files are included in the package under esmcat/catalogues/ and are copied to ~/.esmcat/ automatically on first use. Currently bundled:

| Dataset | Coverage |
| --- | --- |
| cmip6 | CMIP and ScenarioMIP activities |

To rebuild a catalogue from scratch (e.g. after new data has been added to the archive):

ecat.catalogue(dataset='cmip6', refresh=True)

Existing CSV catalogues are automatically migrated to Parquet on first use.
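
To see which catalogue files are present, and whether any CSV files are still waiting to be migrated, the catalogue directory can be listed by hand. This is a rough sketch only: the helper name list_catalogue_files is hypothetical, and the .parquet and .csv file extensions are assumptions, since the exact file names are not documented here.

```python
from pathlib import Path


def list_catalogue_files(directory=Path.home() / ".esmcat"):
    """Return the Parquet and CSV file names found in a catalogue directory.

    Hypothetical helper: the '.parquet' and '.csv' extensions are assumed.
    """
    directory = Path(directory)
    if not directory.is_dir():
        # Nothing has been copied or built yet
        return {"parquet": [], "csv": []}
    return {
        "parquet": sorted(p.name for p in directory.glob("*.parquet")),
        "csv": sorted(p.name for p in directory.glob("*.csv")),
    }
```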


Usage

Filter the CMIP6 catalogue

import esmcat as ecat

df = ecat.catalogue(dataset='cmip6',
                    Experiment='historical',
                    Var=['tas', 'pr'],
                    CMOR='Amon')

print(df.head())

Use CMOR to select frequency and realm (e.g. Amon for monthly atmosphere, day for daily).
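
The filtered result behaves like a pandas DataFrame in the examples here (df.head(), .iloc, .iterrows()), so it can be narrowed further with ordinary pandas operations. The sketch below uses a small hand-made table in place of real catalogue output; the rows are illustrative only, and the column names follow the "Available columns" table. It keeps native-grid rows and then finds the models that provide every requested variable:

```python
import pandas as pd

# Toy stand-in for a catalogue result; these rows are illustrative, not real output.
df = pd.DataFrame({
    "Model": ["HadGEM3-GC31-LL", "HadGEM3-GC31-LL", "CNRM-ESM2-1"],
    "Experiment": ["historical", "historical", "historical"],
    "CMOR": ["Amon", "Amon", "Amon"],
    "Var": ["tas", "pr", "tas"],
    "Grid": ["gn", "gn", "gr"],
})

# Keep only rows on the native grid
native = df[df["Grid"] == "gn"]

# Find models that provide every requested variable
wanted = {"tas", "pr"}
vars_per_model = native.groupby("Model")["Var"].apply(set)
complete_models = sorted(
    model for model, available in vars_per_model.items() if wanted <= available
)
```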

Available columns

| Column | Description | Example values |
| --- | --- | --- |
| MIP | CMIP6 activity | CMIP, ScenarioMIP |
| Centre | Modelling centre | MOHC, CNRM-CERFACS |
| Model | Model name | HadGEM3-GC31-LL, CNRM-ESM2-1 |
| Experiment | Experiment ID | historical, ssp245, ssp585 |
| RunID | Ensemble member | r1i1p1f1, r2i1p1f2 |
| CMOR | CMOR table (encodes frequency and realm) | Amon, Omon, day, fx |
| Var | Variable name | tas, pr, tos |
| Grid | Grid label | gn (native), gr (regridded) |
| Version | Data version | v20190621 |
| StartDate | Start date of files (YYYYMMDD) | 19500101 |
| EndDate | End date of files (YYYYMMDD) | 21001231 |
| Path | Relative path to data directory | |
| DataFiles | Semicolon-separated list of filenames | |
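
As an illustration of how the Path, DataFiles, StartDate and EndDate columns fit together, the sketch below turns one catalogue row into full file paths and date objects. It is not ESMcat code: the helper name unpack_row is hypothetical, and root stands for the dataset root path that ESMcat would take from datasets_{machine}.json.

```python
from datetime import datetime
from pathlib import Path


def unpack_row(row, root):
    """Return (file_paths, start_date, end_date) for one catalogue row.

    Hypothetical helper. `row` is any mapping with the columns listed above;
    `root` is the dataset root directory on the current machine.
    """
    # DataFiles is a semicolon-separated list of file names inside Path
    names = [name for name in str(row["DataFiles"]).split(";") if name]
    paths = [Path(root) / row["Path"] / name for name in names]
    # Dates are stored as YYYYMMDD. This assumes they are valid Gregorian
    # dates; model calendars with, say, 30 February would need other handling.
    start = datetime.strptime(str(row["StartDate"]), "%Y%m%d").date()
    end = datetime.strptime(str(row["EndDate"]), "%Y%m%d").date()
    return paths, start, end
```

In practice ecat.open_dataset() resolves the paths itself; a helper like this is only useful for checking by hand what a row points to.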

Open a dataset

Pass a single-row catalogue entry to ecat.open_dataset() to load it as an Xarray Dataset. Multiple files (e.g. a variable split across decades) are combined automatically via xarray.open_mfdataset.

import esmcat as ecat

catlg = ecat.catalogue(dataset='cmip6',
                       Experiment='historical',
                       Var='tas',
                       CMOR='Amon',
                       Model='HadGEM3-GC31-LL',
                       RunID='r1i1p1f3')

ds = ecat.open_dataset(catlg.iloc[0])
print(ds)

To open several entries (for example, after filtering with Var=['tas', 'pr']), loop over the catalogue rows:

for _, row in catlg.iterrows():
    ds = ecat.open_dataset(row)
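
The loop above overwrites ds on each iteration, so only the last dataset is kept. One way to retain them all is to collect the results in a dictionary keyed by a few catalogue columns. A minimal sketch follows; the helper name open_all and its key_columns parameter are hypothetical, and the opener argument is where ecat.open_dataset would be passed in real use:

```python
def open_all(rows, opener, key_columns=("Model", "Var")):
    """Open every catalogue row and return {key_tuple: dataset}.

    Hypothetical helper. `rows` is an iterable of mapping-like rows and
    `opener` is a callable such as ecat.open_dataset.
    """
    datasets = {}
    for row in rows:
        # Build a key such as ("HadGEM3-GC31-LL", "tas") from the chosen columns
        key = tuple(row[column] for column in key_columns)
        datasets[key] = opener(row)
    return datasets


# Intended usage with a catalogue DataFrame (requires esmcat and data access):
# datasets = open_all((row for _, row in catlg.iterrows()), ecat.open_dataset)
```

If two rows share the same key (for example, two ensemble members of one model), the later one replaces the earlier, so add columns such as RunID to key_columns when needed.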

ecat.open_dataset requires Xarray and access to the underlying data files.

Read everything (bypass default filters)

df = ecat.catalogue(dataset='cmip6', read_everything=True)

Adding or editing datasets

Dataset configurations (root paths, directory structures, filename structures) are defined in datasets_{machine}.json. To add support for a new dataset on an existing machine, add an entry to the relevant JSON file following the same structure as the existing ones.


Contributors ✨

Thanks go to these wonderful people (emoji key):

Scott Hosking 💻
TomBracegirdle 💻
Tony Phillips 💻
Charles H. Simpson 💻

This project follows the all-contributors specification. Contributions of any kind welcome!