Add data cataloging tutorial

VeckoTheGecko · VeckoTheGecko · commit ed7984cffb28 · 2026-06-17T16:32:42.000+02:00
diff --git a/intermediate/cataloging.ipynb b/intermediate/cataloging.ipynb
@@ -0,0 +1,61 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "1a1c33fb",
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "source": [
+    "# Data cataloging for Xarray\n",
+    "\n",
+    ":::{admonition} Under construction\n",
+    "This notebook is still very much under construction.\n",
+    ":::\n",
+    "\n",
+    "**Goals:** At the end of this tutorial, you'll have an overview of what data cataloging is, why it is done, and what tools are available. TODO: Refine goal\n",
+    "\n",
+    "## What is cataloging? Why is it useful?\n",
+    "\n",
+    "- Many different ways to open Xarray datasets\n",
+    "    - From file\n",
+    "        - NetCDF\n",
+    "        - Zarr\n",
+    "        - Icechunk store\n",
+    "    - From remote URLs\n",
+    "    - From custom engines (see tutorial x)\n",
+    "- Copying all of this from script to script is TIRING\n",
+    "- It's a data engineering problem (and one that sits at the org level)\n",
+    "- What if we could map out the datasets that we use at the org level and expose them as a single collection? A ✨catalog✨, if you will 🧐\n",
+    "\n",
+    "\n",
+    "## \n",
+    "\n",
+    "\n",
+    "## Packages\n",
+    "\n",
+    "- odc-stac\n",
+    "- stackstac\n",
+    "- xpystac\n",
+    "- lazycogs(?)\n",
+    "- intake v2\n",
+    "\n",
+    "\n",
+    "## More resources\n",
+    "\n",
+    "https://guide.cloudnativegeo.org/cookbooks/zarr-stac-report/data-consumers/\n",
+    "\n",
+    "\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}