[SPARK-33902][SQL] Support CREATE TABLE LIKE for V2 #54809

Open
viirya wants to merge 7 commits into apache:master from viirya:create-table-like-v2

@viirya (Member) commented Mar 14, 2026

What changes were proposed in this pull request?

Previously, CREATE TABLE LIKE was implemented only via CreateTableLikeCommand, which bypassed the V2 catalog pipeline entirely. This meant:

  • 3-part names (catalog.namespace.table) caused a parse error
  • 2-part names targeting a V2 catalog caused NoSuchDatabaseException

This PR adds a V2 execution path for CREATE TABLE LIKE:

  • Grammar: change tableIdentifier (2-part max) to identifierReference (N-part) for both target and source, consistent with all other DDL commands
  • Parser: emit CreateTableLike (new V2 logical plan) instead of CreateTableLikeCommand directly
  • ResolveCatalogs: resolve the target UnresolvedIdentifier to ResolvedIdentifier
  • ResolveSessionCatalog: route back to CreateTableLikeCommand when both target and source are V1 tables/views in the session catalog (V1->V1 path)
  • DataSourceV2Strategy: convert CreateTableLike to new CreateTableLikeExec
  • CreateTableLikeExec: physical exec that copies schema and partitioning from the resolved source Table and calls TableCatalog.createTable()
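
The routing described in the list above can be pictured with a small toy model. This is only a sketch in plain Python, not Spark's Scala code: the function name `choose_path` and the `source_is_v1` flag are hypothetical, and the model assumes that anything resolved to `spark_catalog` counts as the session catalog.

```python
SESSION_CATALOG = "spark_catalog"  # name of Spark's built-in session catalog


def choose_path(target_catalog: str, source_catalog: str, source_is_v1: bool = True) -> str:
    """Pick the node that handles CREATE TABLE LIKE in this toy model.

    Hypothetical helper: it mirrors the rule stated in the PR description,
    not the actual ResolveSessionCatalog / DataSourceV2Strategy code.
    """
    both_in_session = (
        target_catalog == SESSION_CATALOG and source_catalog == SESSION_CATALOG
    )
    if both_in_session and source_is_v1:
        # V1 -> V1: fall back to the existing V1 command.
        return "CreateTableLikeCommand"
    # Any other combination goes through the new V2 physical exec.
    return "CreateTableLikeExec"


print(choose_path("spark_catalog", "spark_catalog"))  # CreateTableLikeCommand
print(choose_path("testcat", "spark_catalog"))        # CreateTableLikeExec
print(choose_path("spark_catalog", "testcat", source_is_v1=False))  # CreateTableLikeExec
```

The last call corresponds to a V2 source in a non-session catalog with a session-catalog target, which still takes the V2 exec path.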

Why are the changes needed?

CREATE TABLE LIKE was implemented solely via CreateTableLikeCommand, a V1-only command that bypasses the DataSource V2 analysis pipeline entirely. As a result, it was impossible to use CREATE TABLE LIKE to create a table in a non-session V2 catalog (e.g., testcat.dst). A 2-part name like testcat.dst was misinterpreted as database testcat in the session catalog and threw NoSuchDatabaseException, while a 3-part name like testcat.ns.dst was a parse error because the grammar accepted only a 2-part tableIdentifier.

This change routes CREATE TABLE LIKE through the standard V2 DDL pipeline so that V2 catalog targets are fully supported, while preserving the existing V1 behavior when both target and source resolve to the session catalog.
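
To make the two failure modes concrete, here is a toy comparison of the old and new name handling. It is a plain-Python sketch under stated assumptions, not Spark's resolver: `resolve_v1` and `resolve_v2` are hypothetical names, the current catalog is assumed to be `spark_catalog`, and the current namespace is assumed to be `default`.

```python
def resolve_v1(parts):
    """Old behavior (toy model): at most [database, table], always the session catalog."""
    if len(parts) > 2:
        # Stands in for the parse error on 3-part names.
        raise SyntaxError("tableIdentifier accepts at most 2 parts")
    namespace = tuple(parts[:-1]) or ("default",)
    return ("spark_catalog", namespace, parts[-1])


def resolve_v2(parts, catalogs, current_namespace=("default",)):
    """New behavior (toy model): a leading part naming a registered catalog selects it."""
    if len(parts) > 1 and parts[0] in catalogs:
        return (parts[0], tuple(parts[1:-1]), parts[-1])
    namespace = tuple(parts[:-1]) or current_namespace
    return ("spark_catalog", namespace, parts[-1])


catalogs = {"testcat"}  # assumed set of registered V2 catalogs

# Old path: "testcat" is read as a database in the session catalog,
# which is the lookup that ended in NoSuchDatabaseException.
print(resolve_v1(["testcat", "dst"]))                  # ('spark_catalog', ('testcat',), 'dst')

# New path: the same name targets the V2 catalog, and 3-part names parse.
print(resolve_v2(["testcat", "dst"], catalogs))        # ('testcat', (), 'dst')
print(resolve_v2(["testcat", "ns", "dst"], catalogs))  # ('testcat', ('ns',), 'dst')
```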

Does this PR introduce any user-facing change?

Yes. The CREATE TABLE LIKE DDL command now supports V2 catalog targets, including multi-part names such as testcat.ns.dst.

How was this patch tested?

  • CreateTableLikeSuite: new integration tests covering V2 target with V1/V2 source, cross-catalog, views as source, IF NOT EXISTS, property behavior, and V1 fallback regression
  • DDLParserSuite: updated existing create table like test to match the new CreateTableLike plan shape; added 3-part name parsing test

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6

@viirya changed the title from "[][SQL] Support CREATE TABLE LIKE for V2" to "[SPARK-XXXXX][SQL] Support CREATE TABLE LIKE for V2" Mar 14, 2026
@viirya changed the title from "[SPARK-XXXXX][SQL] Support CREATE TABLE LIKE for V2" to "[SPARK-55994][SQL] Support CREATE TABLE LIKE for V2" Mar 14, 2026
@viirya changed the title from "[SPARK-55994][SQL] Support CREATE TABLE LIKE for V2" to "[SPARK-33902][SQL] Support CREATE TABLE LIKE for V2" Mar 15, 2026
@viirya force-pushed the create-table-like-v2 branch from 638846e to 6e695fe on March 15, 2026 20:45
viirya and others added 7 commits March 15, 2026 18:51
## What changes were proposed in this pull request?

Previously, `CREATE TABLE LIKE` was implemented only via `CreateTableLikeCommand`,
which bypassed the V2 catalog pipeline entirely. This meant:
- 3-part names (catalog.namespace.table) caused a parse error
- 2-part names targeting a V2 catalog caused `NoSuchDatabaseException`

This PR adds a V2 execution path for `CREATE TABLE LIKE`:

- Grammar: change `tableIdentifier` (2-part max) to `identifierReference`
  (N-part) for both target and source, consistent with all other DDL commands
- Parser: emit `CreateTableLike` (new V2 logical plan) instead of
  `CreateTableLikeCommand` directly
- `ResolveCatalogs`: resolve the target `UnresolvedIdentifier` to
  `ResolvedIdentifier`
- `ResolveSessionCatalog`: route back to `CreateTableLikeCommand` when both
  target and source are V1 tables/views in the session catalog (V1->V1 path)
- `DataSourceV2Strategy`: convert `CreateTableLike` to new `CreateTableLikeExec`
- `CreateTableLikeExec`: physical exec that copies schema and partitioning from
  the resolved source `Table` and calls `TableCatalog.createTable()`

## How was this patch tested?

- `CreateTableLikeSuite`: new integration tests covering V2 target with V1/V2
  source, cross-catalog, views as source, IF NOT EXISTS, property behavior,
  and V1 fallback regression
- `DDLParserSuite`: updated existing `create table like` test to match the new
  `CreateTableLike` plan shape; added 3-part name parsing test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two tests covering the case where the source is a V2 table in a
non-session catalog and the target resolves to the session catalog.
These exercise the CreateTableLikeExec → V2SessionCatalog path and
confirm that schema and partitioning are correctly propagated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two tests to CreateTableLikeSuite documenting that pure V2 catalogs
(e.g. InMemoryCatalog) accept any provider string without validation,
while V2SessionCatalog rejects non-existent providers by delegating to
DataSource.lookupDataSource. This is consistent with how CreateTableExec
handles the USING clause for other V2 DDL commands.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…CREATE TABLE LIKE

Two new tests covering previously untested code paths in CreateTableLikeExec:
- Source provider is copied to V2 target as PROP_PROVIDER when no USING override
  is given, consistent with how CreateTableExec handles other V2 DDL.
- CHAR(n)/VARCHAR(n) types declared on a V1 source are preserved in the V2
  target via CharVarcharUtils.getRawSchema, not collapsed to StringType.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
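
The provider rule tested in the commit above can be sketched as follows. This is a toy model in plain Python, not the code in CreateTableLikeExec: `target_provider_props` is a hypothetical helper, and the `"provider"` key is assumed to be the value of `TableCatalog.PROP_PROVIDER`.

```python
PROP_PROVIDER = "provider"  # assumed value of TableCatalog.PROP_PROVIDER


def target_provider_props(source_provider, using_override=None):
    """Return the provider property for the new table in this toy model.

    An explicit USING clause wins; otherwise the source provider is copied.
    """
    provider = using_override if using_override is not None else source_provider
    return {PROP_PROVIDER: provider} if provider is not None else {}


print(target_provider_props("parquet"))                        # {'provider': 'parquet'}
print(target_provider_props("parquet", using_override="orc"))  # {'provider': 'orc'}
```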
Add inline comment explaining the six reasons withConstraints is
intentionally omitted: V1 behavior parity, ForeignKey cross-catalog
dangling references, constraint name collision risk, validation status
semantics on empty tables, NOT NULL already captured in nullability,
and PostgreSQL precedent (INCLUDING CONSTRAINTS opt-in). Also notes
the path forward if constraint copying is added in the future.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clarify that V1 tables (CatalogTable) have no constraint objects at all
since CHECK/PRIMARY KEY/UNIQUE/FOREIGN KEY are V2-only concepts added in
Spark 4.1.0, rather than saying CreateTableLikeCommand "never copied"
them, which implies an intentional decision rather than absence of the
feature.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed identifiers

After the CREATE TABLE LIKE V2 change, the target and source identifiers
in CreateTableLikeCommand are now fully qualified (spark_catalog.default.*)
because ResolvedV1Identifier explicitly adds the catalog component via
ident.asTableIdentifier.copy(catalog = Some(catalog.name)), and
ResolvedV1TableIdentifier returns t.catalogTable.identifier which also
includes the catalog. Update the analyzer golden file accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@viirya force-pushed the create-table-like-v2 branch from 6e695fe to 6e3053c on March 16, 2026 01:51
@aokolnychyi (Contributor) commented

I'll take a look later today.
