
Conversation

@gaogaotiantian
Contributor

What changes were proposed in this pull request?

This is a follow-up to the reverted SPARK-54450. We still want this feature because it is useful; we just don't want to interrupt people's workflows.

We are now more careful about detecting test modules. If we fail to import a module that is not the test module itself (e.g., py4j), we pass the whole string through as-is and let the actual test run attempt it; a sketch of this logic follows below.
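For illustration, here is a minimal sketch of that detection-and-fallback idea, assuming hypothetical names (`split_test_name` is illustrative, not the actual `run-tests.py` code):

```python
import importlib
import warnings


def split_test_name(name):
    """Split "pkg.mod.Class.test_x" into ("pkg.mod", "Class.test_x").

    Tries progressively shorter prefixes until one imports. If an import
    fails because of a missing dependency (e.g. py4j) rather than the
    candidate module itself, detection gives up and returns the whole
    string, since the test runner sets up its own environment and may
    still succeed.
    """
    parts = name.split(".")
    for end in range(len(parts), 0, -1):
        prefix = ".".join(parts[:end])
        try:
            importlib.import_module(prefix)
            return prefix, ".".join(parts[end:])
        except ImportError as e:
            if e.name is None or prefix.startswith(e.name):
                # The prefix itself is not an importable module; try a
                # shorter prefix (e.g. drop a trailing class/method name).
                continue
            # The failure came from a dependency such as py4j, not from
            # the candidate module: warn and pass the name through as-is.
            warnings.warn(
                f"Could not import {prefix} ({e}); "
                f"passing {name!r} through unchanged."
            )
            return name, ""
    return name, ""
```

With a complete environment, `pyspark.sql.tests.test_dataframe.DataFrameTests.test_range` splits into a module and a test id; with py4j missing, the whole string is handed to the runner unchanged.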

Why are the changes needed?

If the user starts the script with an unusual Python, the full environment might not be set up. However, bin/pyspark still works because it has its own environment-manipulation mechanism, so failing to import the test module in run-tests.py does not necessarily mean the test cannot run.

We want to preserve backward compatibility while introducing the new convenience feature.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually confirmed that the warning is generated and that the tests still run when py4j is not importable.

Was this patch authored or co-authored using generative AI tooling?

No

Comment on lines +39 to +45
You can run individual tests by using the ``--testnames`` option. For example,

.. code-block:: bash

    python/run-tests --testnames pyspark.sql.tests.test_dataframe
    python/run-tests --testnames pyspark.sql.tests.test_dataframe.DataFrameTests.test_range
Contributor


Shall we document which versions support this? If users are on older Spark versions, this won't work, right?

Contributor Author


Well, this is a dev-only thing. It won't impact any end user.
