[SPARK-55986][PYTHON] Upgrade black to 26.3.1 #54782

Closed
LuciferYang wants to merge 9 commits into apache:master from LuciferYang:black-26.3.1

@LuciferYang
Contributor

LuciferYang commented Mar 13, 2026

What changes were proposed in this pull request?

This PR upgrades black from 23.12.1 to 26.3.1.

Why are the changes needed?

To fix https://github.com/apache/spark/security/dependabot/172

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

LuciferYang changed the title from "Upgrade black to 26.3.1" to "[SPARK-XXXXX][PYTHON] Upgrade black to 26.3.1" on Mar 13, 2026
@LuciferYang
Contributor Author

test first

LuciferYang marked this pull request as draft on March 13, 2026 05:50
LuciferYang changed the title from "[SPARK-XXXXX][PYTHON] Upgrade black to 26.3.1" to "[SPARK-55986][PYTHON] Upgrade black to 26.3.1" on Mar 13, 2026
LuciferYang marked this pull request as ready for review on March 13, 2026 15:35
@LuciferYang
Contributor Author

cc @HyukjinKwon Although the intention is to fix https://github.com/apache/spark/security/dependabot/172, this would result in a large number of files being re-formatted. Please help make a decision on whether to proceed with the upgrade. Thanks ~

@LuciferYang
Contributor Author

All tests passed

@dongjoon-hyun
Member

Merged to master. Thank you, @LuciferYang.

@LuciferYang
Contributor Author

Thank you @dongjoon-hyun

@gaogaotiantian
Contributor

I think we should (soon) move to ruff as the formatter, since we are already using it as the linter. That is one less dependency. Also, ruff is written in Rust and is much faster than black. I have that on my list but have not had the chance to finish it (I did have concerns about the large code refactoring). Do you think it's a good time to just jump to ruff so people don't need to deal with the refactoring twice?

LuciferYang deleted the black-26.3.1 branch on March 17, 2026 08:04
@LuciferYang
Contributor Author

LuciferYang commented Mar 17, 2026

> I think we should (soon) move to ruff as the formatter, since we are already using it as the linter. That is one less dependency. Also, ruff is written in Rust and is much faster than black. I have that on my list but have not had the chance to finish it (I did have concerns about the large code refactoring). Do you think it's a good time to just jump to ruff so people don't need to deal with the refactoring twice?

ruff is indeed extremely fast, roughly 100 times faster than black (on a PySpark project, ruff took 0.12 seconds while black took 12.9 seconds). Moreover, it requires minimal changes to the toolchain. The downside is that it results in a massive number of code changes: 761 files, with +31K/-10K lines. We would need to do this during a low-traffic period and add the resulting commit to the .git-blame-ignore-revs file.

cc @dongjoon-hyun @HyukjinKwon @zhengruifeng WDYT?
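
For reference, the .git-blame-ignore-revs step could look roughly like the sketch below. It is only an illustration, not Spark's actual tooling: the helper name `add_ignore_rev` and the comment text are hypothetical, and it assumes a SHA-1 repository where each entry is a full 40-character commit hash on its own line, with `#` lines treated as comments.

```python
import re
from pathlib import Path

# Assumption: SHA-1 repository, so a full commit hash is 40 lowercase hex characters.
_FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def add_ignore_rev(ignore_file: Path, commit_sha: str, note: str) -> bool:
    """Append commit_sha (preceded by a '# note' comment) to ignore_file.

    Returns False and leaves the file untouched if the hash is already listed.
    """
    if not _FULL_SHA.match(commit_sha):
        raise ValueError("expected a full 40-character lowercase hex commit hash")

    text = ignore_file.read_text() if ignore_file.exists() else ""
    if commit_sha in (line.strip() for line in text.splitlines()):
        return False

    with ignore_file.open("a") as f:
        # Make sure the new entry starts on its own line.
        if text and not text.endswith("\n"):
            f.write("\n")
        f.write(f"# {note}\n{commit_sha}\n")
    return True
```

Contributors would then opt in locally with `git config blame.ignoreRevsFile .git-blame-ignore-revs`, or pass `--ignore-revs-file .git-blame-ignore-revs` to `git blame`, so that the bulk reformatting commit is skipped when attributing lines.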

@gaogaotiantian
Contributor

@LuciferYang you probably ran ruff directly without setting the line length. I already opened PR #54840 to replace black with ruff, limiting it to the files that black reformats now and setting the line length to 100. That PR changes 134 files (some of them infra) with 1000+ modified lines, which is a smaller change than this PR upgrading black.

I think we should do it sooner rather than later: people who have not worked on Spark very recently will then experience only one big code refactoring (black + ruff). If we postpone it, more people will experience it twice.
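
To make the line-length point concrete, here is a minimal sketch of running the ruff formatter with an explicit line length of 100 instead of its default. The helper names and the example path are hypothetical and are not taken from #54840; the only things assumed about ruff are the `ruff format` subcommand and its `--line-length` and `--check` options.

```python
import shutil
import subprocess
from typing import Sequence


def ruff_format_cmd(
    paths: Sequence[str], line_length: int = 100, check: bool = True
) -> list[str]:
    """Build a `ruff format` command line with an explicit line length."""
    cmd = ["ruff", "format", "--line-length", str(line_length)]
    if check:
        # --check reports files that would change without rewriting them.
        cmd.append("--check")
    cmd.extend(paths)
    return cmd


def run_ruff_format(
    paths: Sequence[str], line_length: int = 100, check: bool = True
) -> int:
    """Run ruff if it is installed and return its exit code."""
    if shutil.which("ruff") is None:
        raise RuntimeError("ruff is not installed or not on PATH")
    return subprocess.run(ruff_format_cmd(paths, line_length, check)).returncode
```

For example, `run_ruff_format(["python/pyspark"])` (a placeholder path) would report which files differ at line length 100. Restricting `paths` to the files black already formats is what keeps the diff small.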

@dongjoon-hyun
Member

@gaogaotiantian and @LuciferYang, we had better discuss this on #54840 instead of here. Otherwise, most people will not be aware of it. 😄

The Apache Spark community is already one of the biggest communities. Given that, please send an email with the title [FYI] to the dev@spark mailing list about #54840 to get more attention and feedback on your PR, @gaogaotiantian. Although I support the idea, that's the Apache Spark way.

@gaogaotiantian
Contributor

> please send an email with the title [FYI] to dev@spark mailing list about #54840

Sure! Will do.
