CopyCat is a BMI module that retrieves catchment discharge values from provided NWM channel_rt output files, rather than computing them itself. It is designed to facilitate "hybrid" ngen model runs that simulate only a subset of catchments but need sensible upstream flows from outside the domain to produce reasonable simulation output.
For example, this could be used to run a NextGen model only within a rectangular domain run by WOFS (using its QPF forcing) and pull in contributing flows from the latest short-range full continental NWM run to fill in the boundary conditions for the simulation.
In the future, as long as the NextGen-based NWM produces channel_rt output in the same format, this module could be used to implement localized, more computationally intensive or higher-resolution "overlay" model runs that pull boundary flows from the continental-scale model. If the output format changes, this module should be updated to support the same use case with the new NWM outputs.
- Language: Python
- Dependencies: Python>=3.9, NumPy, xarray, pyyaml, netCDF4
- Status: ALPHA - undergoing frequent change, no guarantee of compatibility between versions, documentation may easily become outdated!
Install with pip directly from GitHub:
pip install git+https://github.com/mattw-nws/CopyCat
To use with NGIAB, you may need to make a new image that includes CopyCat. A simple Dockerfile like this will do:
FROM docker.io/awiciroh/ciroh-ngen-image
RUN dnf install git -y && dnf clean all && rm -rf /var/cache/dnf
RUN uv pip install git+https://github.com/mattw-nws/CopyCat
Build the image like so:
docker build -f Dockerfile -t ciroh-ngen-image-copycat:latest .
If you are using guide.sh, you can then specify this custom image using the -i option:
./guide.sh -p -i localhost/ciroh-ngen-image-copycat
Provide a config file for the module. One config file is designed to be usable by multiple instances of the module within a simulation.
Example:
start_time: "2020-08-11 01:00:00"
source_base: https://nomads.ncep.noaa.gov/pub/data/nccf/com/nwm/v3.0/nwm.20251001/medium_range_blend/nwm.t06z.medium_range_blend.channel_rt.f001.conus.nc
crosswalk: ./config/copycat_crosswalk.dbm
cache_dir: /tmp/copycat
- start_time: The start time of the simulation, in ISO8601 format.
- source_base: A URL or file path that points to the channel_rt NetCDF file containing data corresponding to the start_time. CopyCat will walk forward the fXXX number in the file name to find subsequent files. It does not have to be an f001 file, but there need to be enough remaining channel_rt files in the same location to complete the duration of the simulation!
  - TODO: In the future this should support a directory path/URL or even just "nomads" or "nodd" and be able to figure out the correct file(s) to use.
- crosswalk (optional): A dbm file with keys corresponding to catchment IDs (integers, as bytes) and values corresponding to NWM reach IDs (integers, as bytes). Needed if you specify catchment_num for any instances instead of feature_id.
  - NOTE: Supported crosswalk formats may be added/changed in the future!
- cache_dir (optional): A path on the local filesystem to store downloaded/copied channel_rt NetCDF files. This is mainly used when source_base is a URL, but may be useful with a path if the files are stored on slow local storage (e.g. slow NFS mount or s3fs).
  - You can re-use cache_dirs for multiple simulations or multiple runs of a simulation.
  - A cache_dir is always shared by multiple instances of CopyCat within a single simulation.
  - It should be possible to run multiple parallel simulations (e.g. for ensemble runs) using the same cache_dir if they all share the same time range.
  - However, do not use a single cache_dir for multiple simultaneous simulations of different time ranges. It is highly likely that CopyCat instances in one simulation will end up waiting for an instance in another simulation to download a desired channel_rt file, which will never happen because the file is not in the time range of the other simulation!
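The fXXX walk described for source_base can be pictured with a short sketch. This is not CopyCat's actual implementation: the helper name forecast_file_for_step is hypothetical, and the sketch assumes one channel_rt file per forecast hour with a zero-padded ".fNNN." token in the file name.

```python
import re

# Matches the forecast-hour token, e.g. ".f001." in a channel_rt file name.
_FHOUR = re.compile(r"\.f(\d+)\.")


def forecast_file_for_step(source_base: str, step: int) -> str:
    """Return the path/URL of the file `step` forecast hours after source_base.

    Hypothetical helper (ASSUMPTION): one file per hour, zero-padded fXXX token.
    """
    matches = list(_FHOUR.finditer(source_base))
    if not matches:
        raise ValueError(f"no fXXX forecast-hour token in {source_base!r}")
    m = matches[-1]  # the last occurrence sits in the file name itself
    width = len(m.group(1))  # preserve the original zero padding
    hour = int(m.group(1)) + step
    return f"{source_base[:m.start(1)]}{hour:0{width}d}{source_base[m.end(1):]}"


base = "nwm.t06z.medium_range_blend.channel_rt.f001.conus.nc"
third_file = forecast_file_for_step(base, 2)
```

Because the start file does not have to be f001, the same arithmetic applies to any starting hour; what matters is that every file the simulation duration requires exists at the same location.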
Generally, CopyCat will be used for specific catchments within a NextGen realization, as in the example fragment below. CopyCat currently only provides one output variable: Q, in m^3/s.
It accepts two inputs, catchment_num and feature_id, both integers. Only one of these inputs is expected to be provided; if both are provided, the behavior is undefined. If catchment_num is provided as input, you must configure the crosswalk location in the config file. Generally, and for simplicity, these inputs will be provided within the realization config via the model_params mechanism.
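If you need to build a crosswalk file for catchment_num lookups, Python's standard dbm module is one way to do it. The sketch below rests on an assumption that should be checked against the CopyCat source: it reads "integers, as bytes" as the decimal string of the ID encoded to bytes. The helper names and the ID pair are illustrative only.

```python
import dbm
import os
import tempfile


def write_crosswalk(path: str, cat_to_reach: dict[int, int]) -> None:
    """Write catchment ID -> NWM reach ID pairs to a dbm file.

    ASSUMPTION: IDs are stored as ASCII decimal strings encoded to bytes.
    """
    with dbm.open(path, "c") as db:  # "c" creates the file if it is missing
        for cat_id, reach_id in cat_to_reach.items():
            db[str(cat_id).encode()] = str(reach_id).encode()


def lookup_reach(path: str, cat_id: int) -> int:
    """Read a reach ID back using the same (assumed) key encoding."""
    with dbm.open(path, "r") as db:
        return int(db[str(cat_id).encode()].decode())


# Round trip in a temporary directory; the IDs are illustrative.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "copycat_crosswalk.dbm")
    write_crosswalk(path, {2164301: 538737})
    reach = lookup_reach(path, 2164301)
```

Note that the dbm backend Python selects can differ between systems, and some backends append their own file extensions, so it is safest to create the crosswalk in the same environment that will run the simulation.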
Example realization config fragment:
...
"catchments": {
    "cat-2164301": {
        "formulations": [{
            "name": "bmi_python",
            "params": {
                "name": "bmi_python",
                "python_type": "copycatbmi.CopyCat",
                "model_type_name": "CopyCat",
                "model_params": {
                    "feature_id": 538737
                },
                "uses_forcing_file": false,
                "init_config": "./config/copycat_config.yaml",
                "allow_exceed_end_time": true,
                "main_output_variable": "Q"
            }
        }],
        "forcing": {
            "path": "./forcings/forcings.nc",
            "provider": "NetCDF",
            "enable_cache": false
        }
    }
},
...
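When many boundary catchments need a CopyCat formulation, writing each entry by hand gets tedious. The following sketch generates entries shaped like the fragment above from a mapping of catchment names to feature_id values. The function names are hypothetical, and the field values are copied from the example rather than taken from any ngen schema.

```python
import json


def copycat_formulation(feature_id: int, init_config: str) -> dict:
    """Build one bmi_python formulation entry that points at CopyCat."""
    return {
        "name": "bmi_python",
        "params": {
            "name": "bmi_python",
            "python_type": "copycatbmi.CopyCat",
            "model_type_name": "CopyCat",
            "model_params": {"feature_id": feature_id},
            "uses_forcing_file": False,
            "init_config": init_config,
            "allow_exceed_end_time": True,
            "main_output_variable": "Q",
        },
    }


def copycat_catchments(feature_ids: dict[str, int], init_config: str, forcing: dict) -> dict:
    """Map each catchment name to a CopyCat formulation plus a forcing block."""
    return {
        cat: {
            "formulations": [copycat_formulation(fid, init_config)],
            "forcing": dict(forcing),  # copy so entries do not share one dict
        }
        for cat, fid in feature_ids.items()
    }


forcing = {"path": "./forcings/forcings.nc", "provider": "NetCDF", "enable_cache": False}
catchments = copycat_catchments(
    {"cat-2164301": 538737}, "./config/copycat_config.yaml", forcing
)
fragment = json.dumps({"catchments": catchments}, indent=4)
print(fragment)
```

All generated entries share one init_config path, consistent with the config file being designed for use by multiple instances of the module.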
COMING SOON...
This is Alpha level software. Consult the Issues page for outstanding bugs.
If you have questions, concerns, bug reports, etc., please file an issue in this repository's Issue Tracker.
General instructions on how to contribute can be found at CONTRIBUTING.