Pin Python in CI integration tests to prevent intermittent CodSpeed segfaults in walltime #105
Merged
GuillaumeLagrange merged 1 commit into master on Feb 6, 2026
Merging this PR will degrade performance by 18.32%
6dca575 to a618ede
a618ede to c3a194a
Contributor
FWIW Python 3.13.12 is similarly affected: Works well with 3.13.11, so I pinned to the patch version and opened that PR to confirm. Should I create an issue?
Contributor Author
@edgarrmondragon Thank you very much for the report, I have created the issue where we'll clarify. Please let us know if you find other versions that are affected.
adriencaccia approved these changes Feb 6, 2026
cheeeee pushed a commit to cheeeee/bytewax that referenced this pull request Mar 6, 2026
All bytewax Py<T> references are properly guarded with SafePy (15 struct fields, 5 Drop impls, PICKLE_MODULE static). The SIGSEGV comes from pytest-codspeed's walltime profiler shutdown conflicting with Python 3.14 finalization (CodSpeedHQ/pytest-codspeed#105). Benchmark data is captured before the shutdown crash, so catch exit 139 and emit a warning instead of failing the job.
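The "catch exit 139 and emit a warning" step described above could be sketched as follows. This is a minimal illustration, not the bytewax workflow itself: the function names `is_segfault` and `run_tolerating_segfault` are hypothetical, and the warning is printed with the GitHub Actions `::warning::` workflow command on the assumption that the job runs there. Note that this downgrades every SIGSEGV exit, so it only makes sense when, as the commit message states, the benchmark data has already been captured before the crash.

```python
import signal
import subprocess
import sys

# Exit status a shell reports for a process killed by SIGSEGV (128 + 11 = 139).
SHELL_SIGSEGV_STATUS = 128 + signal.SIGSEGV


def is_segfault(returncode: int) -> bool:
    """Return True if the return code denotes death by SIGSEGV.

    subprocess reports a signal-killed child as the negated signal number,
    while a shell wrapper reports 128 + signal number.
    """
    return returncode in (-signal.SIGSEGV, SHELL_SIGSEGV_STATUS)


def run_tolerating_segfault(cmd: list[str]) -> int:
    """Run cmd (hypothetical helper); turn a SIGSEGV exit into a warning.

    Any other return code is passed through unchanged, so ordinary
    test failures still fail the job.
    """
    returncode = subprocess.run(cmd).returncode
    if is_segfault(returncode):
        # GitHub Actions workflow command that surfaces a warning annotation.
        print("::warning::process exited with SIGSEGV; treating as non-fatal")
        return 0
    return returncode


if __name__ == "__main__":
    # Example: wrap whatever command is passed on the command line.
    sys.exit(run_tolerating_segfault(sys.argv[1:]))
```

Passing other return codes through unchanged keeps the workaround narrow: only the known shutdown crash is tolerated.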
cheeeee added a commit to cheeeee/bytewax that referenced this pull request Mar 6, 2026
* Fix benchmark workspace permission errors on self-hosted runners

* Upgrade actions to latest versions, improve caching and sccache
  - Upgrade checkout v3→v6, cache v4→v5, upload-artifact v4→v7, download-artifact v4→v8, setup-just v2→v3, dawidd6 v6→v16
  - Add restore-keys to all Cargo caches for warm restarts
  - Add cache-dependency-glob to all setup-uv calls
  - Set CARGO_INCREMENTAL=0 for sccache compatibility
  - Add retention-days: 7 to CI wheel artifacts

* Bump Rust 1.74.1 → 1.85.0, update dependencies and Dockerfile
  - Rust toolchain: 1.74.1 → 1.85.0
  - Widen serde/tokio/fastrand Cargo.toml pins, cargo update resolves 174 packages (tokio 1.50, serde 1.0.228, chrono 0.4.44, etc.)
  - pre-commit-hooks: v4.4.0 → v5.0.0
  - Dockerfile: rust:1.68-bullseye → 1.85-bookworm, distroless debian11 → debian12, install maturin via pip instead of obsolete konstin2/maturin:v0.12.6 image

* Fix repo-checks OOM, pin benchmarks to Intel runner
  - repo-checks: replace `uv pip sync -e .` (OOM during in-process cargo build) with maturin-action wheel build + filtered dev deps install. Adds sccache, sudo wrapper, workspace permission fix.
  - benchmarks: add Intel label to runs-on for perf consistency

* Fix maturin-action directory collision and uv hardlink issues
  - Clean stale /__w/_temp/run-maturin-action.sh before builds (new runners had a directory at this path, causing Docker bind mount failures: "is a directory: permission denied")
  - Set UV_LINK_MODE=copy for container jobs to avoid hardlink failures across overlay filesystem boundaries (prevents cache corruption like missing wheel METADATA)

* Fix concurrent maturin collisions, sccache timeout, uv cache corruption
  - Isolate maturin temp dir per matrix job (RUNNER_TEMP unique path) to prevent concurrent jobs on the same runner from colliding on /__w/_temp/run-maturin-action.sh
  - Disable sccache for repo-checks (server can't start in containers)
  - Retry uv pip sync with cache clean on failure to recover from corrupted wheel metadata in shared uv cache

* Fix maturin-action stale directory cleanup for cross-compile jobs

  Replace broken RUNNER_TEMP env var override (resolved at parse time, not runtime) with targeted cleanup that only removes the maturin temp script path when it's a stale directory from a previous crashed job. Remove unnecessary isolation step from benches.yml (x86_64-only builds don't use Docker).

* Ensure /__w symlink on every runner for cross-compile Docker builds

  maturin-action creates the build script at $RUNNER_TEMP but Docker bind-mounts it via /__w/_temp/ path. The /__w -> /opt/actions-runner/_work symlink was only created by pre-pull (on one runner). Cross-compile jobs on other runners fail because Docker can't resolve the path and auto-creates a directory instead. Fix by ensuring the symlink exists on each runner before the maturin-action step.

* Fix SIGSEGV in Python 3.13/3.14: wrap PICKLE_MODULE static in SafePy

  The static GILOnceCell<Py<PyModule>> at pyo3_extensions.rs:17 was the only bare Py<T> in a global/static context not wrapped in SafePy. During Python 3.13+ interpreter finalization, its drop calls Py_DECREF on an already-freed type object, causing SIGSEGV (exit 139). Wrapping in SafePy<PyModule> checks Py_IsFinalizing() before dropping.

* Handle CodSpeed walltime SIGSEGV on Python 3.13+ benchmarks

  All bytewax Py<T> references are properly guarded with SafePy (15 struct fields, 5 Drop impls, PICKLE_MODULE static). The SIGSEGV comes from pytest-codspeed's walltime profiler shutdown conflicting with Python 3.14 finalization (CodSpeedHQ/pytest-codspeed#105). Benchmark data is captured before the shutdown crash, so catch exit 139 and emit a warning instead of failing the job.

---------

Co-authored-by: Nick Bozhenko <nick.bozhenko@lotusflare.com>
All observed crashes occurred with Python 3.14.3, although it did not crash every time.
Pin for now to unblock other PRs, and investigate later.
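As a complement to pinning, a CI job could check at startup whether it is running on a patch release reported as affected in this thread. The sketch below is illustrative only: `AFFECTED_VERSIONS` and the helper names are hypothetical, and the set contains just the two versions mentioned here (3.14.3 and 3.13.12), not a confirmed or complete list.

```python
import sys

# Patch releases reported as affected in this thread. Illustrative only:
# the set would need updating as further reports come in.
AFFECTED_VERSIONS = {(3, 14, 3), (3, 13, 12)}


def is_affected(version: tuple[int, int, int]) -> bool:
    """Return True if the (major, minor, micro) triple is a reported one."""
    return version in AFFECTED_VERSIONS


def current_version() -> tuple[int, int, int]:
    """Return the running interpreter's (major, minor, micro) triple."""
    return tuple(sys.version_info[:3])


if __name__ == "__main__":
    if is_affected(current_version()):
        print("warning: this Python patch release was reported as affected")
```

Matching on the full patch triple mirrors the pin itself: 3.13.11 was reported to work while 3.13.12 did not, so a minor-version check would be too coarse.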