GH-49255: Fix pandas deprecation warnings in Categorical tests #49271

Open
shashbha14 wants to merge 1 commit into apache:main from shashbha14:gh-49255-fix-pandas-deprecation
@shashbha14 shashbha14 commented Feb 13, 2026

Fixes the pandas deprecation warnings we're seeing in the test suite.

What was happening

Pandas started warning when you create a Categorical with values that aren't in the categories list. We had a few places in the tests doing this:

  • test_category: Creating cat_strings_with_na with categories ['foo', 'bar'] but the data includes 'qux'
  • test_category_implicit_from_pandas: Two places creating Categoricals with ['a', 'b', 'c'] but only allowing ['a', 'b'] in categories

What I changed

Instead of passing categories directly to pd.Categorical(), I:

  1. Create the Categorical first with all the values
  2. Then use .set_categories() to restrict it to what we want

This is the recommended way to do it and avoids the warnings.
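
A minimal sketch of the two-step pattern described above. The values and variable names are illustrative assumptions, not the actual fixtures in the PyArrow tests:

```python
import pandas as pd

# Illustrative data: 'qux' is deliberately absent from the categories we want.
values = ["foo", None, "bar", "qux"]

# Step 1: build the Categorical from all values; categories are inferred.
cat_all = pd.Categorical(values)

# Step 2: restrict to the wanted categories. Values outside the new
# categories become missing (code -1), the same as the existing None.
cat_restricted = cat_all.set_categories(["foo", "bar"])

print(list(cat_restricted.categories))
print(cat_restricted.codes.tolist())
```

The result should match what the tests previously built by passing `categories=` directly: `'qux'` ends up as a missing value rather than a category.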

Testing

  • Tests still pass (functionality unchanged)
  • No more deprecation warnings
  • No linter errors
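
One way to spot-check the "no more deprecation warnings" claim locally is to record warnings around the new pattern. This is a sketch under the assumption that the relevant warning is a `DeprecationWarning` or `FutureWarning` subclass; it is not how the PyArrow test suite itself is configured:

```python
import warnings

import pandas as pd

with warnings.catch_warnings(record=True) as caught:
    # Record every warning instead of letting default filters hide it.
    warnings.simplefilter("always")
    cat = pd.Categorical(["a", "b", "c"]).set_categories(["a", "b"])

# Keep only deprecation-style warnings (assumed categories, see above).
deprecations = [
    w for w in caught
    if issubclass(w.category, (DeprecationWarning, FutureWarning))
]
print(len(deprecations))
```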

Fixes #49255

print("LEN:", len(loaded_array))
print("RSS: {}MB".format(pa.total_allocated_bytes() >> 20))

Security considerations for untrusted IPC data
Member

@shashbha14 this is not related to Categorical tests. Would you like to update this PR?

Contributor Author

Thanks for catching that! I've removed the unrelated docs commit. The PR now only contains the pandas Categorical deprecation warnings fix.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Feb 13, 2026
…H-49255)

Replace pd.Categorical() calls whose data contains values missing
from the given categories list with the recommended pattern:
create the Categorical first, then use .set_categories() to restrict.

Fixes deprecation warnings:
- test_category: cat_strings_with_na
- test_category_implicit_from_pandas: two Categorical instances

Fixes apache#49255
@shashbha14 shashbha14 force-pushed the gh-49255-fix-pandas-deprecation branch from 80babf4 to ab9ee88 Compare February 13, 2026 09:51
@github-actions github-actions bot added awaiting change review Awaiting change review and removed Component: Documentation awaiting changes Awaiting changes labels Feb 13, 2026
Development

Successfully merging this pull request may close these issues.

[Python] Fix DeprecationWarnings in PyArrow tests

2 participants