[SPARK-52785][PYTHON] Simplifying super() syntax in PySpark#54790

Open
simolanayak wants to merge 3 commits into apache:master from simolanayak:master

@simolanayak simolanayak commented Mar 13, 2026

What changes were proposed in this pull request?

Replaces 9 instances of the super(Class, cls) syntax with the simpler super().
The affected files are:

  • _globals.py
  • sql/connect/plan.py
  • sql/pandas/serializers.py
  • sql/tests/test_udf.py
  • streaming/tests/test_listener.py
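
For illustration only, here is a minimal sketch of the kind of rewrite described above. The class names are hypothetical and are not taken from the PySpark files listed; the sketch only assumes the common case where the class passed to super() is the same class the method is defined in, which is the case where the two spellings behave identically.

```python
class LegacySingleton:
    """Hypothetical class using the explicit two-argument form."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            # Explicit form: names the class this method is defined in.
            cls._instance = super(LegacySingleton, cls).__new__(cls)
        return cls._instance


class Singleton:
    """Same hypothetical class after the simplification."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            # Zero-argument form: Python fills in the enclosing class
            # and the first argument (cls) automatically.
            cls._instance = super().__new__(cls)
        return cls._instance


legacy_a, legacy_b = LegacySingleton(), LegacySingleton()
new_a, new_b = Singleton(), Singleton()
```

Both variants return the same cached instance on repeated calls, so the rewrite is safe here. It stops being safe when the class named in the explicit call is not the enclosing class.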

Why are the changes needed?

Issue SPARK-52785
Remove Python 2 support

Does this PR introduce any user-facing change?

No behavior changes

How was this patch tested?

GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

…les _globals.py, sql/connect/plan.py sql/pandas/serializers.py sql/tests/test_udf.py streaming/tests/test_listener.py. 7 instances.
@huaxingao
Contributor

@simolanayak Can we keep the PR title short (e.g., SPARK-52785: Simplify super() usage in PySpark) and move the file list / ‘N instances’ details into the What changes were proposed section? The current title is a bit long.

@simolanayak simolanayak changed the title from "SPARK-52785: removing super(Class, cls) syntax from python/pyspark" to "SPARK-52785: simplifying super() syntax in pyspark" Mar 13, 2026
@simolanayak simolanayak changed the title from "SPARK-52785: simplifying super() syntax in pyspark" to "SPARK-52785: simplifying super() syntax in PySpark" Mar 13, 2026
Member

@HyukjinKwon HyukjinKwon left a comment

Contributor

@huaxingao huaxingao left a comment

LGTM! Thanks @simolanayak for your contribution!

@gaogaotiantian
Contributor

gaogaotiantian commented Mar 13, 2026

Most of the changes in this PR are incorrect in the sense that they are not semantically equivalent to the original code. I did this once before and fixed most of the super() usages. The ones I left out were intentional.

Basically

class A(B):
    def f(self):
        # equivalent to
        # super(A, self).f()
        # but NOT
        # super(B, self).f()
        super().f()

So we can only change the calls that explicitly name the class the method is defined in (not one of its base classes). I believe at least a few of the remaining ones intentionally skip the methods of one of the parent classes.
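
To make the difference concrete, here is a small sketch with hypothetical classes (none of these names come from PySpark). It assumes a three-level hierarchy in which one leaf class deliberately names its parent in super() to skip that parent's method, and another leaf uses the zero-argument form.

```python
class Base:
    def f(self):
        return ["Base.f"]


class Middle(Base):
    def f(self):
        return ["Middle.f"] + super().f()


class LeafExplicit(Middle):
    def f(self):
        # Naming the parent class starts the lookup *after* Middle in
        # the MRO, so Middle.f is skipped and Base.f is called directly.
        return ["LeafExplicit.f"] + super(Middle, self).f()


class LeafZeroArg(Middle):
    def f(self):
        # Zero-argument super() means super(LeafZeroArg, self), so the
        # lookup starts after LeafZeroArg and Middle.f runs.
        return ["LeafZeroArg.f"] + super().f()


explicit_calls = LeafExplicit().f()
zero_arg_calls = LeafZeroArg().f()
```

Here `explicit_calls` contains no "Middle.f" entry while `zero_arg_calls` does, so replacing `super(Middle, self)` with `super()` inside `LeafExplicit` would change behavior.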

That being said, a few of the changes are correct, so we should keep those.

…tests fail: serializers.py, test_udf.py, test_listener.py
@gaogaotiantian
Contributor

I believe the ones in serializers should be reverted too.

@simolanayak simolanayak marked this pull request as draft March 14, 2026 01:07
@simolanayak simolanayak marked this pull request as ready for review March 14, 2026 06:10
@HyukjinKwon HyukjinKwon changed the title from "SPARK-52785: simplifying super() syntax in PySpark" to "[SPARK-52785][PYTHON] Simplifying super() syntax in PySpark" Mar 15, 2026
@HyukjinKwon
Member

Let's also keep the PR description updated:

_globals.py
sql/connect/plan.py
sql/pandas/serializers.py
sql/tests/test_udf.py
streaming/tests/test_listener.py
