Python types and utilities for Zarr Conventions Metadata.
zarr-cm provides typed Python support for the published Zarr conventions:
| Convention | Module | Description |
|---|---|---|
| proj | zarr_cm.proj (also zarr_cm.geo_proj) | Coordinate reference system information |
| spatial | zarr_cm.spatial | Spatial coordinate metadata |
| multiscales | zarr_cm.multiscales | Multiscale pyramid layout |
| license | zarr_cm.license | License specifiers |
| uom | zarr_cm.uom | Units of measurement |

Each module provides:
- TypedDict types for convention-specific metadata
- create: create convention metadata
- insert: add convention metadata to a Zarr attributes dict
- extract: remove and return convention metadata from an attributes dict
- validate: check runtime invariants the type system cannot express

See the docs for more information.
pip install zarr-cm

from zarr_cm import geo_proj
# Create convention metadata
data = geo_proj.create(code="EPSG:4326")
print(data)
#> {'proj:code': 'EPSG:4326'}
# Validate
print(geo_proj.validate({"proj:code": "EPSG:4326"}))
#> {'proj:code': 'EPSG:4326'}
# Insert into an attributes dict
attrs = {"foo": "bar"}
result = geo_proj.insert(attrs, data)
print(result)
"""
{
    'foo': 'bar',
    'proj:code': 'EPSG:4326',
    'zarr_conventions': [
        {
            'uuid': 'f17cb550-5864-4468-aeb7-f3180cfb622f',
            'schema_url': 'https://raw.githubusercontent.com/zarr-conventions/proj/5ca5b2f92e5c7245f957d9128b289ee535f0720d/schema.json',
            'spec_url': 'https://github.com/zarr-conventions/proj/blob/5ca5b2f92e5c7245f957d9128b289ee535f0720d/README.md',
            'name': 'proj:',
            'description': 'Coordinate reference system information for geospatial data',
        }
    ],
}
"""
# Extract it back out
remaining, extracted = geo_proj.extract(result)
print(remaining)
#> {'foo': 'bar'}
print(extracted)
#> {'proj:code': 'EPSG:4326'}

Upstream Zarr conventions sometimes change their field shapes in place,
keeping the same uuid but altering required keys and cardinalities. To let you
both author data at the current spec and still read data written against an
earlier draft, the revisioned conventions (spatial, proj, multiscales)
expose package-local revision labels ordered oldest → newest. Today spatial
and proj ship r2 and r3, while multiscales ships only r2; more are
added as upstream conventions evolve.
Each revision pins its emitted schema_url/spec_url to the upstream commit
SHA it was snapshotted from, so a written document is self-describing: the
uuid says which convention, and the pinned schema_url says which
revision. Writes default to the latest revision; reads auto-detect the revision
from the document's schema_url (overridable with a revision= argument).
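
The read-side detection described above can be pictured roughly as follows. This is a minimal standalone sketch, not zarr-cm's actual implementation: the `detect_revision` helper, the `REVISION_BY_SCHEMA_URL` table, and the `example.invalid` URLs are all hypothetical stand-ins, and only the uuid is taken from the proj example earlier in this README.

```python
from typing import Any, Mapping, Optional

# Hypothetical table mapping pinned schema URLs to package-local revision
# labels. The real URLs and the real lookup live inside zarr-cm; these are
# placeholders for illustration only.
REVISION_BY_SCHEMA_URL = {
    "https://example.invalid/proj/1111111/schema.json": "r2",
    "https://example.invalid/proj/2222222/schema.json": "r3",
}

# uuid of the proj convention, as emitted in the insert() example.
PROJ_UUID = "f17cb550-5864-4468-aeb7-f3180cfb622f"


def detect_revision(
    attrs: Mapping[str, Any], uuid: str, revision: Optional[str] = None
) -> str:
    """Return the revision label for the convention identified by ``uuid``."""
    if revision is not None:
        return revision  # an explicit override wins over auto-detection
    for entry in attrs.get("zarr_conventions", []):
        if entry.get("uuid") == uuid:  # the uuid says which convention
            url = entry.get("schema_url")  # the pinned URL says which revision
            if url in REVISION_BY_SCHEMA_URL:
                return REVISION_BY_SCHEMA_URL[url]
            raise ValueError(f"unrecognized schema_url: {url!r}")
    raise ValueError(f"convention {uuid} is not declared in zarr_conventions")


doc = {
    "proj:code": "EPSG:4326",
    "zarr_conventions": [
        {
            "uuid": PROJ_UUID,
            "schema_url": "https://example.invalid/proj/1111111/schema.json",
        }
    ],
}
print(detect_revision(doc, PROJ_UUID))
#> r2
print(detect_revision(doc, PROJ_UUID, revision="r3"))
#> r3
```

Keying the lookup on the full pinned URL rather than on the uuid alone is what lets two revisions of one convention coexist, since both share the same uuid.
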
The revision labels start at r2, not r1. Earlier drafts of spatial,
proj, and multiscales did exist, but the only schema_url they could carry
was upstream's refs/tags/v1/schema.json, and that tag was never published
on any of these repositories (their first and only release tag is v0.1). That
URL has therefore always returned 404, which is non-conformant with the Zarr
Conventions spec's requirement that schema_url resolve to the convention's
schema. For multiscales the draft was worse than dangling: its schema
const-requires the refs/tags/v1 URL, so no schema_url value could both
resolve and validate.
Rather than ship a revision whose self-describing URL is permanently broken,
r1 was dropped from all three conventions. The surviving revisions keep their
r2/r3 labels (labels are package-local and never appear in emitted
documents, so renumbering would only churn the public type names without
changing behavior).