Skip to content

Commit 19a2eca

Browse files
author
Test User
committed
Add S3 Tables (Iceberg) API support
Implements the MinIO S3 Tables API for Iceberg table operations: - TablesClient for managing warehouses, namespaces, and tables - Full Iceberg REST catalog API support - DataFusion integration for query pushdown (optional feature) - Table scan planning and execution - Advanced operations: commit, rename, multi-table transactions
1 parent 4befb98 commit 19a2eca

File tree

193 files changed

+45248
-17
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

193 files changed

+45248
-17
lines changed

Cargo.toml

Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,9 @@ readme = "README.md"
1010
keywords = ["object-storage", "minio", "s3"]
1111
categories = ["api-bindings", "web-programming::http-client"]
1212

13+
[package.metadata.docs.rs]
14+
features = ["datafusion", "puffin-compression"]
15+
1316
[features]
1417
default = ["default-tls", "default-crypto", "http2"]
1518
default-tls = ["reqwest/default-tls"]
@@ -22,6 +25,10 @@ ring = ["dep:ring"]
2225
# Gracefully falls back to HTTP/1.1 when the server doesn't support it.
2326
http2 = ["reqwest/http2"]
2427
localhost = []
28+
# Puffin compression support for Iceberg table compression
29+
puffin-compression = ["dep:zstd", "dep:lz4_flex"]
30+
# DataFusion integration for query pushdown support
31+
datafusion = ["dep:datafusion", "dep:arrow", "dep:parquet", "dep:object_store", "dep:tokio"]
2532

2633
[workspace.dependencies]
2734
uuid = "1.19"
@@ -58,6 +65,7 @@ lazy_static = "1.5"
5865
log = { workspace = true }
5966
md5 = "0.8"
6067
multimap = "0.10"
68+
once_cell = "1.21"
6169
percent-encoding = "2.3"
6270
url = "2.5"
6371
regex = "1.12"
@@ -71,6 +79,16 @@ xmltree = "0.12"
7179
http = { workspace = true }
7280
thiserror = "2.0"
7381
typed-builder = "0.23"
82+
# DataFusion integration (optional, for query pushdown)
83+
datafusion = { version = "51.0", optional = true }
84+
arrow = { version = "57.1", optional = true }
85+
parquet = { version = "57.1", features = ["snap"], optional = true }
86+
object_store = { version = "0.12", optional = true }
87+
tokio = { workspace = true, optional = true, features = ["rt-multi-thread"] }
88+
# Puffin compression (optional, for Iceberg table compression)
89+
zstd = { version = "0.13", optional = true }
90+
lz4_flex = { version = "0.11", optional = true }
91+
plotters = "0.3.7"
7492

7593
[dev-dependencies]
7694
minio-common = { path = "./common" }
@@ -81,6 +99,17 @@ clap = { version = "4.5", features = ["derive"] }
8199
rand = { workspace = true, features = ["small_rng"] }
82100
quickcheck = "1.0"
83101
criterion = "0.8"
102+
# DataFusion benchmark dependencies (also available as optional feature)
103+
object_store = { version = "0.12", features = ["aws"] }
104+
futures = "0.3"
105+
# Iceberg-rust for proper manifest file creation in benchmarks
106+
iceberg = { version = "0.7", features = ["storage-s3"] }
107+
iceberg-catalog-rest = "0.7"
108+
# Arrow/Parquet versions matching iceberg-rust 0.7 (v55.1)
109+
# Use package aliasing to avoid conflicts with datafusion's arrow/parquet
110+
arrow-array-55 = { version = "55.1", package = "arrow-array" }
111+
arrow-schema-55 = { version = "55.1", package = "arrow-schema" }
112+
parquet-55 = { version = "55.1", package = "parquet", features = ["async"] }
84113

85114
[lib]
86115
name = "minio"
@@ -101,6 +130,45 @@ name = "append_object"
101130
[[example]]
102131
name = "load_balancing_with_hooks"
103132

133+
[[example]]
134+
name = "tables_quickstart"
135+
path = "examples/s3tables/tables_quickstart.rs"
136+
137+
[[example]]
138+
name = "tables_stress_throughput_saturation"
139+
path = "examples/s3tables/tables_stress_throughput_saturation.rs"
140+
141+
[[example]]
142+
name = "tables_stress_sustained_load"
143+
path = "examples/s3tables/tables_stress_sustained_load.rs"
144+
145+
[[example]]
146+
name = "tables_stress_state_chaos"
147+
path = "examples/s3tables/tables_stress_state_chaos.rs"
148+
149+
[[example]]
150+
name = "sdk_comparison_benchmark"
151+
path = "examples/datafusion/sdk_comparison_benchmark.rs"
152+
required-features = ["datafusion"]
153+
154+
[[example]]
155+
name = "s3tables_pushdown_benchmark"
156+
path = "examples/datafusion/s3tables_pushdown_benchmark.rs"
157+
required-features = ["datafusion"]
158+
159+
[[example]]
160+
name = "simd_benchmark"
161+
path = "examples/s3tables/simd_benchmark.rs"
162+
163+
[[example]]
164+
name = "simd_benchmark_full"
165+
path = "examples/s3tables/simd_benchmark_full.rs"
166+
required-features = ["datafusion"]
167+
168+
[[example]]
169+
name = "storage_pushdown_benchmark"
170+
path = "examples/s3tables/storage_pushdown_benchmark.rs"
171+
104172
[[bench]]
105173
name = "s3-api"
106174
path = "benches/s3/api_benchmarks.rs"
Lines changed: 128 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,128 @@
1+
# DataFusion Integration Status
2+
3+
Last updated: 2025-12-16
4+
5+
## Overview
6+
7+
The DataFusion integration is **production-ready for most use cases** with documented limitations below.
8+
9+
## Implementation Status
10+
11+
| Component | Status | Location |
12+
|-----------|--------|----------|
13+
| TableProvider | Complete | `src/s3tables/datafusion/table_provider.rs` |
14+
| Filter Translation | Complete (40+ operators) | `src/s3tables/datafusion/filter_translator.rs` |
15+
| ObjectStore | Complete | `src/s3tables/datafusion/object_store.rs` |
16+
| Partition Pruning | Complete | `src/s3tables/datafusion/partition_pruning.rs` |
17+
| Residual Filters | Complete | `src/s3tables/datafusion/residual_filter_exec.rs` |
18+
| Column Statistics | Complete | `src/s3tables/datafusion/column_statistics.rs` |
19+
20+
## Known Bugs
21+
22+
### 1. Regex Character Classes Incorrectly Handled
23+
24+
- **Location**: `src/s3tables/datafusion/filter_translator.rs:1498`
25+
- **Severity**: Medium
26+
- **Status**: Known issue, test marked `#[ignore]`
27+
- **Description**: Character classes like `[0-9]` are incorrectly treated as exact literal matches instead of regex patterns
28+
- **Impact**: Complex regex patterns in WHERE clauses may produce incorrect filter pushdown
29+
- **Workaround**: Avoid character classes in regex patterns; use simpler patterns
30+
31+
## Known Limitations
32+
33+
### 1. Async Query Planning NOT Supported
34+
35+
- **Location**: `src/s3tables/datafusion/table_provider.rs:627-630`
36+
- **Severity**: High
37+
- **Status**: Not implemented
38+
- **Description**: When server returns `PlanningStatus::Submitted` (indicating async planning), the client returns an error instead of polling for completion
39+
- **Impact**: Large queries that trigger async planning on the server will fail
40+
- **Fix Required**: Implement polling infrastructure to wait for async planning completion
41+
42+
### 2. LIMIT Clause: Client-Side Only (By Design)
43+
44+
- **Location**: `src/s3tables/datafusion/table_provider.rs:554-568`
45+
- **Severity**: Low (optimization only)
46+
- **Status**: Implemented as client-side optimization
47+
- **Description**: The `limit` parameter is now passed through to DataFusion's execution plan for early termination. However, the **Iceberg REST API does NOT support server-side LIMIT pushdown**, so all matching files are still returned by `plan_table_scan()`.
48+
- **Impact**: DataFusion stops reading once enough rows are collected (client-side early termination), but the server still identifies all matching files.
49+
- **Workaround**: None needed - client-side optimization is automatic when using LIMIT clause
50+
- **Note**: Server-side LIMIT support would require changes to the Apache Iceberg REST API specification
51+
52+
### 3. Unsupported Filter Expressions
53+
54+
The following expression types cannot be pushed to the server:
55+
56+
- Scalar functions (UPPER, LOWER, TRIM, etc.)
57+
- Aggregate functions (COUNT, SUM, AVG, etc.)
58+
- Subqueries
59+
- Window functions
60+
- Complex nested function calls
61+
- Cast expressions (except in binary comparison context)
62+
63+
These expressions become residual filters evaluated client-side.
64+
65+
## Code Quality TODOs
66+
67+
### 1. Clone Operation Investigation
68+
69+
- **Location**: `src/s3tables/datafusion/object_store.rs:174`
70+
- **Priority**: Low
71+
- **Description**: Developer left TODO asking "why clone here?"
72+
- **Action**: Investigate if clone can be eliminated for performance
73+
74+
## Documentation Gaps
75+
76+
- [ ] Create architecture guide for DataFusion integration
77+
- [ ] Create troubleshooting guide for common issues
78+
- [ ] Document performance tuning recommendations
79+
- [ ] Add examples for complex query patterns
80+
81+
## Test Coverage
82+
83+
- **Unit tests**: 200+ tests covering filter translation, pushdown, residual handling
84+
- **Integration tests**: Feature-gated with `#[cfg(feature = "datafusion")]`
85+
- **Known gap**: Regex character class test marked `#[ignore]`
86+
87+
## Version Compatibility
88+
89+
- DataFusion: 51.0
90+
- Arrow: 57.1
91+
- Parquet: 57.1
92+
- object_store: 0.12
93+
94+
## Feature Flag
95+
96+
Enable DataFusion support with:
97+
98+
```toml
99+
[dependencies]
100+
minio = { version = "...", features = ["datafusion"] }
101+
```
102+
103+
Or build with:
104+
105+
```bash
106+
cargo build --features datafusion
107+
```
108+
109+
## Performance Expectations
110+
111+
| Filter Selectivity | Expected Speedup | Data Reduction |
112+
|-------------------|------------------|----------------|
113+
| 10% pass rate | ~5x | 90% |
114+
| 50% pass rate | ~2x | 50% |
115+
| 90% pass rate | Minimal | 10% |
116+
117+
## Priority Action Items
118+
119+
1. **High**: Implement async query planning support (polling for `PlanningStatus::Submitted`)
120+
2. **Medium**: Fix regex character class pattern detection
121+
3. **Low**: Investigate clone optimization in object_store.rs
122+
123+
## Recently Completed
124+
125+
- **LIMIT clause optimization**: Client-side early termination now supported (2025-12-16)
126+
- The `limit` parameter is passed through to DataFusion's ParquetExec
127+
- DataFusion stops reading once enough rows are collected
128+
- Note: Server-side LIMIT is not supported by Iceberg REST API specification

0 commit comments

Comments
 (0)