21 changes: 9 additions & 12 deletions doc/internals/zarr-encoding-spec.rst
@@ -27,7 +27,7 @@ NetCDF data model in Zarr.
Dimension Encoding in Zarr Formats
-----------------------------------

-Xarray encodes array dimensions differently depending on the Zarr format version:
+Xarray encodes/decodes array dimensions differently depending on the Zarr format version:

**Zarr V2 Format:**
Xarray uses a special Zarr array attribute: ``_ARRAY_DIMENSIONS``. The value of this
@@ -43,9 +43,9 @@ When accessing arrays with zarr-python, this information is available in the arr
metadata but not in the attributes dictionary.

When reading a Zarr group, Xarray looks for dimension information in the appropriate
-location based on the format version, raising an error if it can't be found. The
+location based on the inferred format version, raising an error if it can't be found. The
dimension information is used to define the variable dimension names and then
-(for Zarr V2) removed from the attributes dictionary returned to the user.
+(for Zarr V2) is removed from the attributes dictionary returned to the user.
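
For readers of this hunk, the Zarr V2 behaviour it describes can be sketched roughly as follows. This is an illustration only: a plain dict stands in for a Zarr array's attributes, and `split_v2_dims` is a hypothetical helper, not an xarray function.

```python
DIMENSION_KEY = "_ARRAY_DIMENSIONS"  # attribute name given in the spec text


def split_v2_dims(attrs):
    """Return (dims, user_attrs) for a Zarr V2 array's attributes.

    Hypothetical helper for illustration; not part of xarray.
    """
    if DIMENSION_KEY not in attrs:
        raise KeyError(f"missing {DIMENSION_KEY!r}; cannot determine dimension names")
    dims = tuple(attrs[DIMENSION_KEY])
    # Hide the encoding attribute from the attributes handed back to the user.
    user_attrs = {k: v for k, v in attrs.items() if k != DIMENSION_KEY}
    return dims, user_attrs


raw_attrs = {"_ARRAY_DIMENSIONS": ["time", "lat"], "units": "K"}
dims, user_attrs = split_v2_dims(raw_attrs)
```

The sketch copies the attributes rather than mutating them, so the stored metadata stays intact while the user-facing dictionary no longer shows the encoding key.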

CF Conventions
--------------
@@ -59,17 +59,14 @@ used to describe metadata in NetCDF and Zarr.
Compatibility and Reading
-------------------------

-Because of these encoding choices, Xarray cannot read arbitrary Zarr arrays, but only
-Zarr data with valid dimension metadata. Xarray supports:
+Because of these encoding choices, Xarray cannot read arbitrary Zarr groups, but only
+Zarr groups with valid dimension metadata. Xarray supports:
Member

I would say "Zarr groups containing arrays with valid dimension metadata" here

Author

Fixed (0b067d9) - thanks @shoyer!


-- Zarr V2 arrays with ``_ARRAY_DIMENSIONS`` attributes
-- Zarr V3 arrays with ``dimension_names`` metadata
-- `NCZarr <https://docs.unidata.ucar.edu/nug/current/nczarr_head.html>`_ format
-  (dimension names are defined in the ``.zarray`` file)
+1. Zarr V3 groups with ``dimension_names`` metadata
+2. Zarr V2 groups with ``_ARRAY_DIMENSIONS`` attributes
Member

I think these should still be "arrays" not "groups" here, because the metadata/attributes are on the array, not the group

Author

Fixed (0b067d9) - thank you!

+3. `NCZarr <https://docs.unidata.ucar.edu/nug/current/nczarr_head.html>`_ format (dimension names are defined in the ``dimrefs`` field in the custom ``.zarray`` file)

-After decoding the dimension information and assigning the variable dimensions,
-Xarray proceeds to [optionally] decode each variable using its standard CF decoding
-machinery used for NetCDF data.
+Xarray checks each of these three conventions, in the order given above, when looking for dimension name metadata. Note that while Xarray can read NCZarr groups, it currently does not write NCZarr groups. After decoding the dimension information and assigning the variable dimensions, Xarray proceeds to [optionally] decode each variable using its standard CF decoding machinery used for NetCDF data.
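
The lookup order in the new paragraph (V3 metadata, then the V2 attribute, then NCZarr) might be sketched like this. It is a rough illustration on plain dicts, not the code in `xarray/backends/zarr.py`: `find_dimension_names` and its signature are hypothetical, and the enclosing `_NCZARR_ARRAY` key and the path-like `dimrefs` values are assumptions that the diff does not state.

```python
import posixpath


def find_dimension_names(array_metadata, attrs, zarray=None):
    """Return (convention, dims) using the three-step lookup order.

    All arguments are plain dicts standing in for real Zarr objects;
    the function and its signature are hypothetical.
    """
    # 1. Zarr V3: names live in the array metadata.
    if "dimension_names" in array_metadata:
        return "zarr_v3", tuple(array_metadata["dimension_names"] or ())
    # 2. Zarr V2: names live in a special attribute.
    if "_ARRAY_DIMENSIONS" in attrs:
        return "zarr_v2", tuple(attrs["_ARRAY_DIMENSIONS"])
    # 3. NCZarr: names are listed under ``dimrefs`` in ``.zarray``.
    #    Assumption: ``dimrefs`` sits under ``_NCZARR_ARRAY`` and holds
    #    path-like names such as "/lat".
    try:
        dimrefs = (zarray or {})["_NCZARR_ARRAY"]["dimrefs"]
    except KeyError:
        raise KeyError("no dimension name metadata found") from None
    return "nczarr", tuple(posixpath.basename(ref) for ref in dimrefs)


v3 = find_dimension_names({"dimension_names": ["x", "y"]}, {})
v2 = find_dimension_names({}, {"_ARRAY_DIMENSIONS": ["x"]})
nc = find_dimension_names({}, {}, {"_NCZARR_ARRAY": {"dimrefs": ["/lat", "/lon"]}})
```

Returning the matched convention alongside the names makes the precedence visible: if both V3 metadata and a V2 attribute are present, the V3 names are used.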

Finally, it's worth noting that Xarray writes (and attempts to read)
"consolidated metadata" by default (the ``.zmetadata`` file), which is another
3 changes: 3 additions & 0 deletions doc/whats-new.rst
@@ -29,6 +29,9 @@ Bug Fixes
- Ensure that ``keep_attrs='drop'`` and ``keep_attrs=False`` remove attrs from result, even when there is
only one xarray object given to ``apply_ufunc`` (:issue:`10982` :pull:`10997`).
By `Julia Signell <https://github.com/jsignell>`_.
+- Slightly amend `Xarray's Zarr Encoding Specification doc <https://docs.xarray.dev/en/latest/internals/zarr-encoding-spec.html>`_ for clarity, and provide a code comment in ``xarray.backends.zarr._get_zarr_dims_and_attrs`` referencing the doc (:issue:`8749` :pull:`11013`).
+  By `Ewan Short <https://github.com/eshort0401>`_.


Documentation
~~~~~~~~~~~~~
3 changes: 3 additions & 0 deletions xarray/backends/zarr.py
@@ -355,6 +355,9 @@ def _determine_zarr_chunks(enc_chunks, var_chunks, ndim, name):


def _get_zarr_dims_and_attrs(zarr_obj, dimension_key, try_nczarr):
+    # Check for attributes and dimension name metadata as discussed in the Zarr encoding
+    # specification https://docs.xarray.dev/en/stable/internals/zarr-encoding-spec.html
+
    # Zarr V3 explicitly stores the dimension names in the metadata
    try:
        # if this exists, we are looking at a Zarr V3 array