@AndreaBozzo AndreaBozzo commented Dec 18, 2025

Which issue does this PR close?

Closes #1780

What changes are included in this PR?

This PR adds support for configuring the catalog type dynamically per engine in sqllogictest schedule files.

Changes:

  • Add catalog_type and catalog_properties fields to engine configuration
  • Support memory catalog (default) and all catalog types from iceberg-catalog-loader (rest, glue, hms, sql, s3tables)
  • Add iceberg-catalog-loader dependency to the workspace

Example configuration:

[engines]
df = { type = "datafusion", catalog_type = "rest", catalog_properties = { uri = "http://localhost:8181" } }

[[steps]]
engine = "df"
slt = "test.slt"

If no catalog_type is specified, the engine defaults to the memory catalog for backward compatibility.
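
The defaulting rule above can be sketched roughly as follows. This is a minimal, std-only illustration and not the code in this PR: `EngineConfig`, `CatalogChoice`, and `resolve_catalog` are hypothetical names, and the real implementation hands non-memory types to iceberg-catalog-loader rather than returning an enum.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one engine's parsed entry in the schedule file.
#[derive(Debug, Default)]
struct EngineConfig {
    /// Value of `catalog_type`, if the schedule file set one.
    catalog_type: Option<String>,
    /// Key/value pairs from `catalog_properties` (empty when omitted).
    catalog_properties: HashMap<String, String>,
}

/// Hypothetical result of deciding which catalog an engine should build.
#[derive(Debug, PartialEq)]
enum CatalogChoice {
    /// In-memory catalog, used when nothing (or "memory") is configured.
    Memory,
    /// Any other type (e.g. "rest"), carried along with its properties.
    External {
        kind: String,
        properties: HashMap<String, String>,
    },
}

/// Fall back to the memory catalog when `catalog_type` is absent, so
/// schedule files written before this option existed keep working.
fn resolve_catalog(config: &EngineConfig) -> CatalogChoice {
    match config.catalog_type.as_deref() {
        None | Some("memory") => CatalogChoice::Memory,
        Some(kind) => CatalogChoice::External {
            kind: kind.to_string(),
            properties: config.catalog_properties.clone(),
        },
    }
}
```

Under these assumptions, an engine entry with no `catalog_type` resolves to `CatalogChoice::Memory`, while the `rest` example above resolves to `CatalogChoice::External` carrying the `uri` property.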

Are these changes tested?

Yes, unit tests added for:

  • Engine loading with default catalog
  • Engine loading with explicit memory catalog configuration

Note: PR body updated after the maintainer's review and feedback.

@AndreaBozzo
Contributor Author

The macos-latest CI job failed due to a transient timeout in arduino/setup-protoc@v3:

Error: Request timeout: /repos/protocolbuffers/protobuf/releases?page=1

This is an intermittent infrastructure issue with GitHub Actions, not related to the code changes in this PR. Re-running the failed job should resolve it.

Contributor

@liurenjie1024 liurenjie1024 left a comment

Thanks @AndreaBozzo for this PR. Catalog config should be engine-specific, not global.


/// Configuration for a catalog in the schedule file.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CatalogConfig {

Catalog config should be engine-specific, i.e., it should be part of an engine's configuration rather than in a common section.

@AndreaBozzo
Contributor Author

Sorry, I should have figured that out. I'll work on it and amend the commit when done!

This PR implements issue apache#1780 by allowing each engine in the sqllogictest
framework to configure its own catalog.

Changes:
- Remove global [catalog] section from schedule parsing
- Each engine now creates its own catalog based on engine-specific config
- DataFusionEngine reads 'catalog_type' and 'catalog_properties' from config
- Default catalog type is 'memory' with a temp warehouse for testing
- Support for all catalog types via iceberg-catalog-loader (rest, glue, hms, sql, s3tables)

Example configuration:
```toml
[engines]
df = { type = "datafusion", catalog_type = "rest", catalog_properties = { uri = "http://localhost:8181" } }
```

Closes apache#1780
@AndreaBozzo AndreaBozzo force-pushed the feat/sqllogictest-dynamic-catalog branch from 7e8d26f to 4239a20 Compare December 20, 2025 10:19
@AndreaBozzo
Contributor Author

I've updated the PR to make the catalog configuration engine-specific, as suggested. I considered adding explicit error handling for NamespaceAlreadyExists/TableAlreadyExists instead of silently ignoring errors in test setup, but kept it simple to stay within scope.

Let me know if you'd prefer that change, and thank you for your patience.
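
For reference, the stricter handling I had in mind would look roughly like this. It is only a sketch under assumptions: `SetupError` and `ignore_already_exists` are hypothetical stand-ins written for illustration, not the error type or helpers used in the crate.

```rust
/// Hypothetical stand-in for the errors test setup can hit; the real
/// catalog error type is different and carries more detail.
#[derive(Debug, PartialEq)]
enum SetupError {
    NamespaceAlreadyExists,
    TableAlreadyExists,
    /// Anything else, e.g. a connection failure.
    Other(String),
}

/// Treat "already exists" as success, since setup only needs the namespace
/// or table to be present, and propagate every other error to the caller.
fn ignore_already_exists(result: Result<(), SetupError>) -> Result<(), SetupError> {
    match result {
        Err(SetupError::NamespaceAlreadyExists) | Err(SetupError::TableAlreadyExists) => Ok(()),
        other => other,
    }
}
```

The difference from the current behavior is that unrelated failures would fail the test setup instead of being dropped.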


Development

Successfully merging this pull request may close these issues.

Support configuring datafusion catalog in sqllogictest framework
