Improve python_install test reliability on Windows
#17177
base: main
Conversation
Force-pushed from `155f3fd` to `efcbc01` (commit: "python_install test reliability on Windows")
```toml
[[profile.default.overrides]]
platform = 'cfg(target_os = "windows")'
filter = 'test(python_install::)'
```
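For context, a sketch of what the full override presumably looks like once serialisation is added; the `threads-required` line is my assumption based on the discussion of serialising these tests, not part of the quoted diff:

```toml
# Sketch only: platform/filter are from the diff above; threads-required
# is assumed. Setting it to "num-test-threads" reserves every test
# thread, so matching tests effectively run one at a time.
[[profile.default.overrides]]
platform = 'cfg(target_os = "windows")'
filter = 'test(python_install::)'
threads-required = "num-test-threads"
```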
There are also python upgrade and python find tests that use managed versions. Can we tie this to the managed-python feature instead somehow?
I can't see a way to do it per-feature: https://nexte.st/docs/filtersets/reference/
We could just select more tests manually.
I think we'd need to move the tests to a specific package or something. I just worry this is brittle long-term.
(the native_auth filter is also bothersome and janky)
Yeah putting them in a specific package would let us filter on them. Maybe something for an issue and a separate PR though?
I will add python_upgrade. But for python_find, there are only 8 tests there that seem to install Python, so it seems heavy-handed to add the whole module. There are also a bunch of other places in the tests where python_install or python_upgrade are called.
One option we could consider is tagging the tests via their name, e.g. by putting any such tests within a `py_install` submodule and then matching on `::py_install::`?
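A sketch of that submodule-tagging idea, assuming nextest filtersets match on the full test path including module names; the `py_install` name is hypothetical:

```toml
# Hypothetical: if install-heavy tests were grouped under py_install
# submodules, one filterset could match them across all test files,
# instead of enumerating python_install::, python_upgrade::, etc.
[[profile.default.overrides]]
platform = 'cfg(target_os = "windows")'
filter = 'test(::py_install::)'
threads-required = "num-test-threads"
```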
Looks like it's only half working now...? (after the change to using …)
Original results, with extra columns for max-threads and for serialising all the python_install tests:
Some of this could be noise, but some tests seem to run slowly alongside a bunch of other tests, and some seem to be very slow only alongside other python_install tests; most fall into the latter group. As a bonus, however, in a full run with no filters there seems to be almost no overhead to running these all serialised. (Updated with change %)

Summary
Results:
Comparing https://github.com/astral-sh/uv/actions/runs/20342580132/job/58446553156?pr=17177 against https://github.com/astral-sh/uv/actions/runs/20338690535/job/58431797575.
Overall test time went down from 279.163s to 258.427s, presumably unrelated though, as this change should be marginally slower. Here are some local tests (on Linux):
Filtered run (45 tests):

```
Summary [  34.452s] 45 tests run: 44 passed (12 slow), 1 failed, 3314 skipped

threads-required = "num-test-threads":
Summary [ 117.991s] 45 tests run: 44 passed (1 slow), 1 failed, 3314 skipped

threads-required = 4:
Summary [  48.793s] 45 tests run: 44 passed (4 slow), 1 failed, 3314 skipped
```

Full run:

```
Summary [ 371.911s] 3357 tests run: 3356 passed (49 slow), 1 failed, 2 skipped

threads-required = "num-test-threads":
Summary [ 451.323s] 3357 tests run: 3356 passed (26 slow), 1 failed, 2 skipped

threads-required = 4:
Summary [ 386.213s] 3357 tests run: 3356 passed (31 slow), 1 failed, 2 skipped
```

(The failed test is not related.)
Conclusion
I think this is a good way to gain more reliability for these tests on all platforms, but I am not certain that we should be setting this override in the config. Realistically you want something more like `num-test-threads / 3` rather than just `4`.
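For illustration, the middle-ground override would look something like the following. As far as I can tell, `threads-required` only accepts an integer or the strings `"num-cpus"` / `"num-test-threads"`, so a ratio like `num-test-threads / 3` is not directly expressible and would have to be approximated with a fixed value:

```toml
# Sketch: reserve a fixed number of threads per python_install test
# rather than fully serialising them. The value 4 is the one tried in
# the local runs above; "num-test-threads / 3" is an assumption about
# the desired behaviour, not something nextest supports to my knowledge.
[[profile.default.overrides]]
filter = 'test(python_install::)'
threads-required = 4
```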