Skip to content

Comments

fix: deflake //rs/sns/integration_tests:upgrade_canister_test#8993

Merged
jasonz-dfinity merged 1 commit intomasterfrom
ai/deflake-upgrade_canister_test
Feb 23, 2026
Merged

fix: deflake //rs/sns/integration_tests:upgrade_canister_test#8993
jasonz-dfinity merged 1 commit intomasterfrom
ai/deflake-upgrade_canister_test

Conversation

@basvandijk
Copy link
Collaborator

@basvandijk basvandijk commented Feb 22, 2026

//rs/sns/integration_tests:upgrade_canister_test was flaky this week:

$ bazel run //ci/githubstats:query -- top 1 flaky% --week --include  //rs/sns/integration_tests:upgrade_canister_test
...
┍━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┑
│    │ label                                            │   total │   non_success │   flaky │   timeout │   fail │   non_success% │   flaky% │   timeout% │   fail% │   impact │   total duration │   duration_p90 │ owners          │
┝━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━━┿━━━━━━━━┿━━━━━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━┥
│  0 │ //rs/sns/integration_tests:upgrade_canister_test │     174 │             4 │       4 │         0 │      0 │            2.3 │      2.3 │          0 │       0 │    20:44 │         15:01:54 │           5:11 │ governance-team │
┕━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┙

The test failed 3 times in a similar way:

$ bazel run //ci/githubstats:query -- last --flaky --week //rs/sns/integration_tests:upgrade_canister_test --columns=last_started_at,errors
...
╒════╤═════════════════════════╤════════════════════════════════════════════════════════════════════════════════════════╕
│    │   last started at (UTC) │ errors per attempt                                                                     │
╞════╪═════════════════════════╪════════════════════════════════════════════════════════════════════════════════════════╡
│  0 │ Fri 2026-02-20 15:31:47 │ 1: -- Test timed out at 2026-02-20 15:31:46 UTC --                                     │
├────┼─────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┤
│  1 │ Wed 2026-02-18 14:42:46 │ 1: test result: FAILED. 6 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; ... │
├────┼─────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┤
│  2 │ Mon 2026-02-16 17:47:55 │ 1: test result: FAILED. 6 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; ... │
├────┼─────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────┤
│  3 │ Mon 2026-02-16 12:05:56 │ 1: test result: FAILED. 6 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; ... │
╘════╧═════════════════════════╧════════════════════════════════════════════════════════════════════════════════════════╛

Specifically the test_upgrade_canister_proposal_execution_fail failed 3 times on the following assert:

        assert!(
            decision_age_s < EXPECTED_SNS_DAPP_CANISTER_UPGRADE_TIME_SECONDS,
            "decision_age_s: {decision_age_s}, proposal: {proposal:?}"
        );

where EXPECTED_SNS_DAPP_CANISTER_UPGRADE_TIME_SECONDS = 60 with an error like:

thread 'test_upgrade_canister_proposal_execution_fail' panicked at rs/sns/integration_tests/src/upgrade_canister.rs:451:9:
decision_age_s: 71, proposal: ...

Claude Opus 4.6 made the following Root Cause Analysis and accompanying fix:

Root Cause

The test_upgrade_canister_proposal_execution_fail test used SystemTime::now() (wall clock) to compute how long ago the proposal was decided/failed. However, the state machine's internal time (used for proposal timestamps) is initialized at test start via with_current_time() and does not advance during the SNS setup phase, which compiles WASM binaries and takes 60-100+ seconds of wall-clock time.

This caused decision_age_s to equal the variable setup time rather than the actual proposal processing time, exceeding the 60-second threshold (EXPECTED_SNS_DAPP_CANISTER_UPGRADE_TIME_SECONDS).

From flaky run logs, the observed values were:

  • decision_age_s: 103 (setup took 101.3s)
  • decision_age_s: 69 (setup took 67.1s)
  • decision_age_s: 71 (setup took 69.1s)

Fix

Replace the wall-clock-based age_s() function with timestamp consistency checks that compare decided_timestamp_seconds and failed_timestamp_seconds against proposal_creation_timestamp_seconds. All three timestamps live in the same state machine time domain, eliminating the wall-clock dependency entirely.


Skill: .claude/skills/fix-flaky-tests/SKILL.md.

Replace wall-clock-based age assertions with timestamp consistency
checks using the proposal's own internal timestamps.

Root cause: The test_upgrade_canister_proposal_execution_fail test
used SystemTime::now() (wall clock) to compute how long ago the
proposal was decided/failed. However, the state machine's internal
time (used for proposal timestamps) is initialized at test start and
does not advance during the SNS setup phase, which compiles WASM
binaries and takes 60-100+ seconds of wall-clock time. This caused
decision_age_s to equal the variable setup time rather than the actual
proposal processing time, exceeding the 60-second threshold.

Fix: Compare decided_timestamp_seconds and failed_timestamp_seconds
against proposal_creation_timestamp_seconds (all in the same state
machine time domain), eliminating the wall-clock dependency entirely.

Skill: .claude/skills/fix-flaky-tests/SKILL.md
@github-actions github-actions bot added the fix label Feb 22, 2026
@basvandijk basvandijk marked this pull request as ready for review February 22, 2026 19:38
@basvandijk basvandijk requested a review from a team as a code owner February 22, 2026 19:38
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request changes code owned by the Governance team. Therefore, make sure that
you have considered the following (for Governance-owned code):

  1. Update unreleased_changelog.md (if there are behavior changes, even if they are
    non-breaking).

  2. Are there BREAKING changes?

  3. Is a data migration needed?

  4. Security review?

How to Satisfy This Automatic Review

  1. Go to the bottom of the pull request page.

  2. Look for where it says this bot is requesting changes.

  3. Click the three dots to the right.

  4. Select "Dismiss review".

  5. In the text entry box, respond to each of the numbered items in the previous
    section, declare one of the following:

  • Done.

  • $REASON_WHY_NO_NEED. E.g. for unreleased_changelog.md, "No
    canister behavior changes.", or for item 2, "Existing APIs
    behave as before.".

Brief Guide to "Externally Visible" Changes

"Externally visible behavior change" is very often due to some NEW canister API.

Changes to EXISTING APIs are more likely to be "breaking".

If these changes are breaking, make sure that clients know how to migrate, how to
maintain their continuity of operations.

If your changes are behind a feature flag, then, do NOT add entrie(s) to
unreleased_changelog.md in this PR! But rather, add entrie(s) later, in the PR
that enables these changes in production.

Reference(s)

For a more comprehensive checklist, see here.

GOVERNANCE_CHECKLIST_REMINDER_DEDUP

@jasonz-dfinity jasonz-dfinity added this pull request to the merge queue Feb 23, 2026
Merged via the queue into master with commit c205f23 Feb 23, 2026
44 checks passed
@jasonz-dfinity jasonz-dfinity deleted the ai/deflake-upgrade_canister_test branch February 23, 2026 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants