
[SPARK-56811][UI] Restore sub-execution grouping on the SQL tab listing #55787

Open · yaooqinn wants to merge 2 commits into apache:master from yaooqinn:SPARK-56811

Conversation

**@yaooqinn** (Member) commented May 10, 2026:

### What changes were proposed in this pull request?

This PR restores sub-execution grouping on the SQL tab listing that was silently dropped when the listing was switched to server-side pagination in SPARK-56140.

- **Backend** (`SqlResource.sqlTable`): accept a new `groupSubExecution` query parameter (default = cluster config `spark.ui.groupSQLSubExecutionEnabled`). When enabled, paginate over root executions only and embed each root's children as `subExecutions: [...]` on the row. Sub-executions whose root is missing from the filtered set surface as roots so they are never hidden. Fix `recordsTotal` / `recordsFiltered` to count root rows in grouped mode so the DataTables "Showing X to Y of Z entries" matches the visible rows.
- **Frontend** (`allexecutionspage.js`): read the existing `group-sub-exec-config` data attribute, forward `groupSubExecution=true|false` to the server, and append a trailing **Sub Executions** column showing a `+N sub` toggle on roots that have children. Toggling expands the row using DataTables `row().child()` with a nested table, matching the SPARK-41752 / Spark 4.1 layout.
- **CSS** (`webui-dataTables.css`): minimal styling for the toggle link and the nested child table (indent + tertiary background).
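The grouped-mode rule above can be sketched as a small standalone model (illustrative only: `partitionRoots` and the plain-object execution shape here are stand-ins for the actual `SqlResource` logic, not its code):

```javascript
// Illustrative model: an execution is shown as a root row if it is its own
// root, or if its root is absent from the filtered set (the "orphan
// promotion" case); remaining rows are embedded under their root as
// subExecutions.
function partitionRoots(filtered) {
  const presentIds = new Set(filtered.map((e) => e.id));
  const roots = [];
  const childrenByRoot = new Map();
  for (const e of filtered) {
    if (e.id === e.rootExecutionId || !presentIds.has(e.rootExecutionId)) {
      roots.push(e); // true root, or orphan promoted to a root
    } else {
      const siblings = childrenByRoot.get(e.rootExecutionId) || [];
      siblings.push(e);
      childrenByRoot.set(e.rootExecutionId, siblings);
    }
  }
  return roots.map((r) => ({
    ...r,
    subExecutions: childrenByRoot.get(r.id) || [],
  }));
}
```

For example, given executions with (id, rootExecutionId) pairs (1, 1), (2, 1), and (3, 99), ids 1 and 3 come back as roots (3 is an orphan whose root is absent from the set) and 2 is embedded under 1.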

### Why are the changes needed?

SPARK-41752 introduced sub-execution grouping in Spark 4.1, and SPARK-55875 carried it over when the listing moved to client-side DataTables. SPARK-56140 then switched the listing to server-side pagination but didn't carry over the grouping logic: every sub-execution now shows up as its own flat row, regressing the UX for queries such as `CACHE TABLE` and nested CTAS.

### Does this PR introduce _any_ user-facing change?

Yes: the SQL tab listing again folds sub-executions under their root, with a `+N sub` toggle to expand. Default behaviour is controlled by the existing `spark.ui.groupSQLSubExecutionEnabled` config (default `true`).

The screenshots below were taken from the same workload (`SELECT`, `CACHE TABLE`, broadcast join, `DROP TABLE`, `CREATE TABLE … AS`); `CACHE TABLE` and the inner `CREATE TABLE` each spawn a sub-execution.

**Before** (flat listing, post-SPARK-56140): sub-executions appear as flat sibling rows; e.g. ids 2/3/4 are all `CACHE TABLE people_eng_cached` and ids 7/8/9 are all the nested CTAS.

*(screenshot: before)*

**After** (grouped, collapsed): only roots are listed, with a trailing `+1 sub` toggle on rows that own a sub-execution.

*(screenshot: after-collapsed)*

**After** (grouped, expanded): clicking `+1 sub` expands a nested table inline.

*(screenshot: after-expanded)*

### How was this patch tested?

- New unit test in `SqlResourceWithActualMetricsSuite` covering both grouped and flat modes against a session that runs `CACHE TABLE` (which produces a root + sub-execution pair).
- Existing `SqlResourceWithActualMetricsSuite` and `AllExecutionsPageWithInMemoryStoreSuite` continue to pass.
- Manual verification in a local Spark UI (screenshots above): ran the same workload twice, once with `spark.ui.groupSQLSubExecutionEnabled=false` (flat) and once with the default `true` (grouped), and confirmed both modes render correctly.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: GitHub Copilot CLI 1.0.44-2 with Claude Opus 4.7 (Extra high reasoning)

* Accepts DataTables server-side parameters (start, length, order, search)
* and returns paginated results with recordsTotal/recordsFiltered counts.
*
* When `groupSubExecution=true` (default = `spark.ui.sql.groupSubExecutionEnabled`),

**Contributor:**

The actual config (from `core/src/main/scala/org/apache/spark/internal/config/UI.scala`) is `spark.ui.groupSQLSubExecutionEnabled`. Fix the docstring so users don't waste time searching for a non-existent key.

"with no filter, recordsTotal should match returned root count")
assert(groupedRecordsFiltered === groupedRows.size,
"with no filter, recordsFiltered should match returned root count")
groupedRows.foreach { row =>

**Contributor:**

This asserts that every grouped row satisfies `id == rootExecutionId`, but the backend (line 133-134 of `SqlResource.scala`) deliberately surfaces orphan sub-executions (whose parent is missing from the filtered set) as roots, and those orphans have `id != rootExecutionId`. The test passes today only because `CACHE TABLE` happens not to produce orphans and the test runs with no filter (so every parent is present); the assertion encodes a property that holds only in the unfiltered case but is documented as a universal "grouped row is a root" invariant.

}
}

test("SPARK-56811: sqlTable groups sub-executions under their root execution") {

**Contributor:**

The orphan branch on line 134 of `SqlResource.scala` is one of the most subtle pieces of new logic: a sub appears under its root unless the root is filtered out, in which case the sub surfaces. The new test doesn't exercise this path. Worth a second test case using a `search[value]` that matches the sub but not the root (or a status filter that produces the same shape).
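The shape of that suggested test case could look like this, against a simplified in-memory model (not the real suite; `visibleRows` and the execution objects are illustrative stand-ins):

```javascript
// A search that matches the sub-execution's description but not its root's:
// the root drops out of the filtered set, so the sub must be promoted to a
// top-level row rather than silently vanish.
function visibleRows(allExecs, search) {
  const filtered = allExecs.filter((e) => e.description.includes(search));
  const ids = new Set(filtered.map((e) => e.id));
  return filtered.filter(
    (e) => e.id === e.rootExecutionId || !ids.has(e.rootExecutionId)
  );
}

const execs = [
  { id: 1, rootExecutionId: 1, description: "CACHE TABLE t" },
  { id: 2, rootExecutionId: 1, description: "inner insert for t" },
];
// "inner" matches only the sub; its root (id 1) is filtered out,
// so id 2 surfaces as an orphan root and stays visible.
```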

.filter(_.nonEmpty)
val needsFilter = searchValue.isDefined || statusFilter.isDefined

// Always load all execs once. The KVStore-level pagination optimization

**Contributor** (suggested replacement for the comment):

// Always load all execs once. We need the full set to (a) identify orphan
// sub-executions whose root is filtered out and (b) count root rows for
// `recordsTotal`. `sqlStore.executionsList()` is already a full
// materialization, so there is no separate "KVStore-pagination" path being
// disabled here.

// CACHE TABLE produces a root execution plus an inner sub-execution that
// shares its rootExecutionId. This is the canonical case where the SQL
// listing should fold the sub row under the root rather than flattening it.
spark.sql("CREATE OR REPLACE TEMP VIEW spark_56811 AS SELECT id FROM RANGE(10)")

**Contributor:**

`CACHE TABLE spark_56811_cached` leaves a cached table behind that persists into subsequent tests in this suite. Wrap with `withTempView` and an explicit `UNCACHE TABLE`, or add a `try { ... } finally { spark.sql("UNCACHE TABLE IF EXISTS spark_56811_cached") }` block.
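The cleanup pattern being asked for can be sketched standalone (the SQL calls are stubbed as log entries so it runs by itself; `runCacheTest` is a hypothetical name):

```javascript
// The UNCACHE must execute even when an assertion in the test body throws;
// try/finally guarantees that, while a plain sequential cleanup would be
// skipped as soon as the body fails.
function runCacheTest(sql, body) {
  sql("CACHE TABLE spark_56811_cached"); // setup that leaves state behind
  try {
    body();
  } finally {
    sql("UNCACHE TABLE IF EXISTS spark_56811_cached"); // always runs
  }
}
```

Even if `body` throws, the last statement issued is the `UNCACHE`, so no cached table leaks into later tests.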

val aaData = page.map(execToRow)
// Counts: when grouping, totals reflect root-only counts so DataTables shows
// "Showing X to Y of Z entries" matching the rows the user actually sees.
val recordsTotal = if (groupSubExec) {

**Contributor:**

When `groupSubExec && needsFilter`, this does a fresh O(N) `Set` construction plus an O(N) count on `allExecs`. The work is the same flavor as the filtered-set partition above (line 132-135) but on a different domain (all execs vs. filtered execs), so it can't be folded directly. Consider precomputing both partitions (filtered + unfiltered) up front behind a single helper, or memoizing `allRootIds` so the unfiltered count and the filtered set both consult the same lazily-built data.
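The memoization idea can be sketched as follows (a standalone illustrative model; `makeRootIndex` is a hypothetical helper, not the proposed Scala code):

```javascript
// Build the id set lazily, once, so both the unfiltered recordsTotal count
// and any later consumers consult the same data instead of re-scanning
// allExecs per request branch.
function makeRootIndex(allExecs) {
  let presentIds = null;
  return {
    ids() {
      if (presentIds === null) {
        presentIds = new Set(allExecs.map((e) => e.id)); // built at most once
      }
      return presentIds;
    },
    rootCount() {
      const ids = this.ids();
      return allExecs.filter(
        (e) => e.id === e.rootExecutionId || !ids.has(e.rootExecutionId)
      ).length;
    },
  };
}
```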

{
data: "queryId", name: "queryId", title: "Query ID",
orderable: false,
render: function (data, type) {

**Contributor:**

`data` (the `queryId`) is interpolated into both a `title` attribute and visible text without `escapeHtml`. The other rendered columns (`description`, `errorMessage`) use `escapeHtml`. Today `queryId` is a UUID, so this is safe in practice, but the inconsistency is the kind of thing that bites later when a schema changes. Consider wrapping in `escapeHtml(data)` to match the rest.
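A minimal sketch of the suggested fix (the `escapeHtml` helper and the `<span>` markup here are illustrative; the real page has its own helper and column markup):

```javascript
// Escape once and reuse for both the title attribute and the visible text,
// so a non-UUID queryId can never break out of the attribute or inject tags.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderQueryId(data) {
  const safe = escapeHtml(data);
  return '<span title="' + safe + '">' + safe + "</span>";
}
```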

if (type !== "display") return "";
var subs = row.subExecutions || [];
if (subs.length === 0) return "";
return '<a href="#" class="toggle-sub-exec">' +

**Contributor:**

The disclosure is an `<a href="#">` with a click handler, functionally a button. Screen readers see it as a generic link with no expanded/collapsed state. Worth adding:

- `role="button"` (so AT announces it as a control, not navigation),
- `aria-expanded="false"` on render, toggled to `"true"` / `"false"` in the click handler alongside the `shown` class,
- preferably `aria-controls=` referencing the generated child row id.

Existing components in the Spark UI are already inconsistent with such accessibility standards, and we can unify and standardize this in subsequent iterations.
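A sketch of what those attributes could look like in the render and click path (`renderToggle`, `toggleExpanded`, and the `childRowId` scheme are hypothetical names for illustration, not the PR's actual code):

```javascript
// Render the disclosure with an explicit control role and collapsed state;
// childRowId would be the generated id of the nested child-table row.
function renderToggle(subCount, childRowId) {
  return '<a href="#" class="toggle-sub-exec" role="button" ' +
    'aria-expanded="false" aria-controls="' + childRowId + '">+' +
    subCount + ' sub</a>';
}

// In the click handler, flip aria-expanded alongside the existing shown class.
function toggleExpanded(anchor) {
  const expanded = anchor.getAttribute("aria-expanded") === "true";
  anchor.setAttribute("aria-expanded", expanded ? "false" : "true");
}
```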

val page = if (length > 0) sortedRoots.slice(start, start + length) else sortedRoots

// Convert to Java-compatible row data; embed sub-executions when grouping
val aaData = page.map { exec =>

**Contributor:**

A single `CACHE TABLE` on a large nested DAG, or a deeply nested CTAS, can produce dozens of sub-executions under one root. The current design embeds every sub of every visible root in the same JSON response, so a page of 20 roots could carry hundreds of sub rows even if the user never expands any disclosure. For most workloads this is fine, but it is worth noting in the PR description, and worth considering a follow-up that lazy-loads sub rows on first expand (a separate `?rootId=N` endpoint) if users report payload-size issues.

html += '<tr><td><a href="' + basePath + '/SQL/execution/?id=' +
child.id + '">' + child.id + '</a></td>';
html += '<td>' + statusBadge(child.status) + '</td>';
html += '<td>' + escapeHtml(child.description || "") + '</td>';

**Member:**

Parent rows use `descriptionHtml()`, which wraps the description in a link to the execution detail page, but child rows use plain `escapeHtml()`. Is this intentional, or should child rows also use `descriptionHtml({ id: child.id, description: child.description })`?
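A hypothetical shape of that fix (`childDescriptionHtml` and this `escapeHtml` are illustrative stand-ins, not the page's actual helpers):

```javascript
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render the child description as a link to the execution detail page,
// mirroring the affordance parent rows get from descriptionHtml().
function childDescriptionHtml(basePath, child) {
  return '<a href="' + basePath + '/SQL/execution/?id=' + child.id + '">' +
    escapeHtml(child.description || "") + "</a>";
}
```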

Address review feedback on PR apache#55787:

Backend (SqlResource.scala):
- Fix docstring config key: spark.ui.groupSQLSubExecutionEnabled
- Replace inline KVStore-pagination comment with a clearer explanation
  of why allExecs is materialized once (orphan promotion + recordsTotal)
- Extract the root/orphan partition predicate into
  SqlResource.partitionRoots so the filtered and unfiltered branches
  share the same implementation

Frontend (allexecutionspage.js):
- Escape queryId before rendering the title and short-form text
- Add a11y attributes on the sub-execution toggle: role=button,
  aria-expanded, aria-controls pointing at the child table id
- Render sub-execution descriptions via descriptionHtml so they match
  the parent row affordance instead of plain escapeHtml

Tests (SqlResourceWithActualMetricsSuite.scala):
- Wrap the SPARK-56811 grouping test in try/finally to UNCACHE the
  table even if assertions fail
- Generalize per-row assertion to allow orphan rows
- Add a synchronous unit test for SqlResource.partitionRoots that
  exercises true roots, grouped subs, and an orphan sub whose parent
  is absent from the input set

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
