Commit ed7984c

Add data cataloging tutorial
1 parent 5e2a15c commit ed7984c

1 file changed

Lines changed: 61 additions & 0 deletions

intermediate/cataloging.ipynb

@@ -0,0 +1,61 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "1a1c33fb",
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "source": [
+    "# Data cataloging for Xarray\n",
+    "\n",
+    ":::{admonition} Under construction\n",
+    "This notebook is very much still under construction\n",
+    ":::\n",
+    "\n",
+    "**Goals:** At the end of this tutorial, you'll have an overview of what data cataloging is, why it is done, and what tools are available. TODO: Refine goal\n",
+    "\n",
+    "## What is cataloging? Why is it useful?\n",
+    "\n",
+    "- Many different ways to open Xarray datasets\n",
+    "  - From file\n",
+    "    - NetCDF\n",
+    "    - Zarr\n",
+    "  - From Icechunk store\n",
+    "  - From remote URLs\n",
+    "  - From custom engines (see tutorial x)\n",
+    "- Copying all of this from script to script is TIRING\n",
+    "- It's a data engineering problem (and one that sits at the org level)\n",
+    "- What if we could map out the datasets that we use at the org level, and expose that as a collection of datasets? A ✨catalog✨ if you will 🧐\n",
+    "\n",
+    "\n",
+    "## \n",
+    "\n",
+    "\n",
+    "## Packages\n",
+    "\n",
+    "- odc-stac\n",
+    "- stackstac\n",
+    "- xpystac\n",
+    "- lazycogs(?)\n",
+    "- intake v2\n",
+    "\n",
+    "\n",
+    "## More resources\n",
+    "\n",
+    "https://guide.cloudnativegeo.org/cookbooks/zarr-stac-report/data-consumers/\n",
+    "\n",
+    "\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
