Conversation

@shuoweil
Contributor

@shuoweil shuoweil commented Nov 10, 2025

This PR introduces several improvements to the anywidget display mode. In anywidget mode, a DataFrame with a multi-index is now rendered with a nested "semi-exploding" view, similar to the BigQuery web UI.

Added Index and Multi-Index DataFrame Example to Notebook:

  • The notebooks/dataframes/anywidget_mode.ipynb notebook now includes a new section demonstrating how to display a DataFrame with a MultiIndex, using real data from the PyPI public dataset. This extends the notebook's examples to more complex data structures.

Added Index and Multi-Index System Tests:

  • The tests were made more robust by replacing brittle, generic string assertions with precise checks for unique data values within rendered HTML table cells.
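
A cell-level check of the kind described above could look like the following minimal sketch. It uses only the standard library's `html.parser`; the `CellCollector` class, the `cell_texts` helper, and the sample markup are illustrative assumptions, not the PR's actual test code.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> and <th> cell (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.cells = []      # finished cell texts, in document order
        self._buffer = None  # list of text chunks while inside a cell, else None

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


def cell_texts(html: str) -> list[str]:
    """Return the stripped text of each table cell in the given HTML."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


# Made-up markup standing in for the widget's rendered table.
sample_html = (
    "<table><tr><th>project</th><th>downloads</th></tr>"
    "<tr><td>pandas</td><td>12</td></tr>"
    "<tr><td>pandas-gbq</td><td>3</td></tr></table>"
)

cells = cell_texts(sample_html)

# Brittle: a substring check also matches inside "pandas-gbq".
assert "pandas" in sample_html
# Precise: the value must be the full text of exactly one cell.
assert cells.count("pandas") == 1
assert cells.count("pandas-gbq") == 1
```

Comparing whole cell texts means a value that only appears as part of another cell, or in an attribute, no longer satisfies the assertion.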

See an example here: screen/6nyCvGjzpRwM2nW

Fixes #<459515995> 🦕

@shuoweil shuoweil requested a review from tswast November 10, 2025 22:11
@shuoweil shuoweil self-assigned this Nov 10, 2025
@shuoweil shuoweil requested review from a team as code owners November 10, 2025 22:11
@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Nov 10, 2025
@shuoweil shuoweil changed the title feat: Enhance anywidget notebook with Index and MultiIndex example and improve test robustness feat: Display index and mutiindex in anywidget mode Nov 10, 2025
@shuoweil shuoweil marked this pull request as draft November 11, 2025 19:13
@shuoweil shuoweil removed the request for review from tswast November 11, 2025 19:13
@shuoweil shuoweil force-pushed the shuowei-anywidget-index-testcase branch from 441bb94 to 33fd6e3 Compare November 11, 2025 20:30
@shuoweil shuoweil requested a review from tswast November 11, 2025 21:41
@shuoweil shuoweil marked this pull request as ready for review November 11, 2025 21:42
@shuoweil shuoweil changed the title feat: Display index and mutiindex in anywidget mode feat: Display index and multi-index in anywidget mode Nov 12, 2025
Comment on lines 32 to 80
def _calculate_rowspans(dataframe: pd.DataFrame) -> list[list[int]]:
    """Calculates the rowspan for each cell in a MultiIndex DataFrame.

    Args:
        dataframe (pd.DataFrame):
            The DataFrame for which to calculate index rowspans.

    Returns:
        list[list[int]]:
            A list of lists, where each inner list corresponds to an index level
            and contains the rowspan for each row at that level. A value of 0
            indicates that the cell should not be rendered (it's covered by a
            previous rowspan).
    """
    if not isinstance(dataframe.index, pd.MultiIndex):
        # If not a MultiIndex, no rowspans are needed for the index itself.
        # Return a structure that indicates each index cell should be rendered once.
        return [[1] * len(dataframe.index)] if dataframe.index.nlevels > 0 else []

    rowspans: list[list[int]] = []
    for level_idx in range(dataframe.index.nlevels):
        current_level_spans: list[int] = []
        current_value = None
        current_span = 0

        for i in range(len(dataframe.index)):
            value = dataframe.index.get_level_values(level_idx)[i]

            if value == current_value:
                current_span += 1
                current_level_spans.append(0)  # Mark as covered by previous rowspan
            else:
                # If new value, finalize previous span and start a new one
                if current_span > 0:
                    # Update the rowspan for the start of the previous span
                    current_level_spans[i - current_span] = current_span
                current_value = value
                current_span = 1
                current_level_spans.append(0)  # Placeholder, will be updated later

        # Finalize the last span
        if current_span > 0:
            current_level_spans[len(dataframe.index) - current_span] = current_span

        rowspans.append(current_level_spans)

    return rowspans
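
To make the return shape concrete, here is a small standalone sketch of the same run-length idea. It works on a plain list of index tuples (the shape that `pd.MultiIndex.tolist()` produces) so that it runs without pandas. `rowspans_for` is a hypothetical name; this is an illustration of the output format, not the PR's implementation.

```python
from itertools import groupby


def rowspans_for(index_tuples: list[tuple]) -> list[list[int]]:
    """Per level: the run length at the first row of each run, 0 for covered rows."""
    if not index_tuples:
        return []
    n_levels = len(index_tuples[0])
    result: list[list[int]] = []
    for level in range(n_levels):
        values = [row[level] for row in index_tuples]
        spans: list[int] = []
        # groupby yields consecutive runs of equal values.
        for _, run in groupby(values):
            length = len(list(run))
            spans.append(length)              # first row of the run carries the rowspan
            spans.extend([0] * (length - 1))  # remaining rows are covered
        result.append(spans)
    return result


index_tuples = [("a", 1), ("a", 2), ("b", 2), ("b", 3)]
print(rowspans_for(index_tuples))
# [[2, 0, 2, 0], [1, 2, 0, 1]]
```

As in the quoted function, each level is processed independently of the others, so the value `2` at the second level is merged into one span even though its two rows belong to different first-level values (`a` and `b`).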


Collaborator

While this makes for a somewhat cleaner view, I think we'd be better off not implementing rowspans. For example, sorting by the various index columns is less intuitive if the index isn't rendered the same as other columns.

table_html.append(' <tr style="text-align: left;">')

# Add index headers
for name in dataframe.index.names:
Collaborator

Let's try to actually refer to the original BigFrames DataFrame when deciding which index columns to render or not. In particular, we should not render the index if the BigFrames DataFrame has a NULL index.
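
One way to read this suggestion is sketched below. Everything here is hypothetical: `header_labels` is an invented helper, and the `has_index` flag stands in for however the BigFrames DataFrame reports a NULL index, which this thread does not specify.

```python
from typing import Optional, Sequence


def header_labels(
    column_names: Sequence[str],
    index_names: Sequence[Optional[str]],
    has_index: bool,
) -> list[str]:
    """Build the header row; index headers are emitted only when an index exists."""
    labels: list[str] = []
    if has_index:
        # Unnamed index levels get an empty header cell.
        labels.extend(name if name is not None else "" for name in index_names)
    labels.extend(column_names)
    return labels


# With an index: index level names come first, then the data columns.
print(header_labels(["downloads"], ["project", "version"], has_index=True))
# With a NULL index (assumed flag): only the data columns are rendered.
print(header_labels(["downloads"], [None], has_index=False))
```

Passing the decision in as an explicit flag keeps the rendering code from inferring index presence from the downloaded pandas frame, which always carries some index.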

@shuoweil shuoweil marked this pull request as draft December 5, 2025 18:53
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. and removed size: l Pull request size is large. labels Dec 17, 2025
@shuoweil shuoweil changed the title feat: Display index and multi-index in anywidget mode feat: Display custom multi-index in anywidget mode Dec 17, 2025
Labels

api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: xl Pull request size is extra large.

2 participants