fix: deflake //rs/nns/integration_tests:integration_tests_src/node_provider_remuneration by basvandijk · Pull Request #8981 · dfinity/ic

basvandijk · 2026-02-21T15:36:09Z

The test //rs/nns/integration_tests:integration_tests_src/node_provider_remuneration is slightly flaky:

$ bazel run //ci/githubstats:query -- top 1 flaky --include //rs/nns/integration_tests:integration_tests_src/node_provider_remuneration
...
┍━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━━┯━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┑
│    │ label                                                                       │   total │   non_success │   flaky │   timeout │   fail │   non_success% │   flaky% │   timeout% │   fail% │   impact │   total duration │   duration_p90 │ owners          │
┝━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━━┿━━━━━━━━┿━━━━━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━┥
│  0 │ //rs/nns/integration_tests:integration_tests_src/node_provider_remuneration │     402 │             5 │       5 │         0 │      0 │            1.2 │      1.2 │          0 │       0 │    21:45 │  1 days 05:08:42 │           4:21 │ governance-team │
┕━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━┷━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┙

Claude Opus 4.6 made the follow Root Cause Analysis and accompanying fix:

Root Cause

The test was flaky because governance's automated reward minting flow requires many inter-canister calls (NRC, CMC, Ledger, self-call) spanning 8+ ticks, but only 2 ticks were provided after each monthly metrics loop. Additionally, the retry loops used advance_time() without tick(), so governance's run_periodic_tasks never actually ran during retries.

Failure Patterns

Detailed Flow

After feeding 31 days of daily node metrics, governance detects it is time to mint rewards during the final wait_for_nrc_metrics_sync ticks
The NRC may not have synced all needed days yet, causing the mint to fail with No metrics found for day X
After wait_for_nrc_metrics_sync exits, only 2 explicit ticks were provided - insufficient for the full inter-canister call chain
The retry loops called advance_time() but never tick(), so governance never ran periodic tasks during retries

Fix

Added a wait_for_rewards_minting() helper that keeps calling tick() until governance completes the full reward minting flow
Replaced all insufficient 2-tick + non-ticking-retry patterns with calls to this helper
Removed seconds_advanced_test_2 tracking that compensated for the now-removed retry loops

Skill: .claude/skills/fix-flaky-tests/SKILL.md.

…ovider_remuneration The test was flaky because governance's automated reward minting flow requires many inter-canister calls (NRC, CMC, Ledger, self-call) spanning 8+ ticks, but only 2 ticks were provided after each monthly metrics loop. Additionally, the retry loops used advance_time() without tick(), so governance's run_periodic_tasks never actually ran during retries. Root cause: - After feeding 31 days of daily node metrics in the inner loop, governance detects it's time to mint rewards during the final wait_for_nrc_metrics_sync ticks - The NRC may not have synced all needed days yet at that point, causing the mint to fail with "No metrics found for day X" - After wait_for_nrc_metrics_sync exits, only 2 explicit ticks were provided - insufficient for the full inter-canister call chain - The subsequent retry loops called advance_time() and query() but never tick(), so governance never got another chance to run periodic tasks and complete the minting Fix: - Add a wait_for_rewards_minting() helper that keeps calling tick() until governance successfully completes the full reward minting flow (detected by a change in the most_recent_monthly_node_provider_rewards timestamp) - Replace all insufficient 2-tick + non-ticking-retry patterns with calls to this helper - Remove the seconds_advanced_test_2 tracking that compensated for time advanced in the now-removed retry loops

github-actions

This pull request changes code owned by the Governance team. Therefore, make sure that
you have considered the following (for Governance-owned code):

Update unreleased_changelog.md (if there are behavior changes, even if they are
non-breaking).
Are there BREAKING changes?
Is a data migration needed?
Security review?

How to Satisfy This Automatic Review

Go to the bottom of the pull request page.
Look for where it says this bot is requesting changes.
Click the three dots to the right.
Select "Dismiss review".
In the text entry box, respond to each of the numbered items in the previous
section, declare one of the following:

Done.
$REASON_WHY_NO_NEED. E.g. for unreleased_changelog.md, "No
canister behavior changes.", or for item 2, "Existing APIs
behave as before.".

Brief Guide to "Externally Visible" Changes

"Externally visible behavior change" is very often due to some NEW canister API.

Changes to EXISTING APIs are more likely to be "breaking".

If these changes are breaking, make sure that clients know how to migrate, how to
maintain their continuity of operations.

If your changes are behind a feature flag, then, do NOT add entrie(s) to
unreleased_changelog.md in this PR! But rather, add entrie(s) later, in the PR
that enables these changes in production.

Reference(s)

For a more comprehensive checklist, see here.

GOVERNANCE_CHECKLIST_REMINDER_DEDUP

No canister changes.

rs/nns/integration_tests/src/node_provider_remuneration.rs

…vider_remuneration

Address review comment from jasonz-dfinity: add advance_time(1s) before each tick() in wait_for_rewards_minting so that timer-based periodic tasks are also triggered, making the test more robust and realistic.

basvandijk · 2026-03-04T20:17:05Z

@jasonz-dfinity since the PR is already 2 weeks old I asked Claude Opus 4.6 if it still addresses the flaky runs that happened in the last week:

PR #8981 Flaky Test Analysis

Flaky Runs Summary (last week: 6 runs, 8 failing attempts)

Date	Test that failed	Error
Feb 25	`test_list_node_provider_rewards`	"No metrics found for day 2026-04-18"
Feb 26 (x2)	`test_automated_node_provider_remuneration`	Assertion at line 928: rewards content mismatch
Feb 27 (x2)	`test_automated_node_provider_remuneration`	Same assertion at line 928
Mar 2	`test_list_node_provider_rewards`	"No metrics found for day 2026-10-17"
Mar 4 (x2)	`test_automated_node_provider_remuneration`	Same assertion at line 928

Two Failure Patterns — Same Root Cause

Both patterns trace back to the Node Rewards Canister (NRC) not being synced when governance tries to mint rewards:

Pattern A (test_list_node_provider_rewards, 2/6 runs): Governance's minting fails outright with "No metrics found for day X" because NRC hasn't synced that day's data.
Pattern B (test_automated_node_provider_remuneration, 4/6 runs): The logs show "Metrics and registry are not synced up to to_date" errors. On master, only 2 explicit tick() calls + 10 execute_ingress calls (from the retry loop) give governance ~12 heartbeat opportunities, which isn't always enough for NRC to sync + governance to complete the full inter-canister minting flow (NRC → CMC → Ledger → self-call).

PR #8981's Fix

The PR introduces wait_for_rewards_minting(), which loops up to 500 iterations of advance_time(1s) + tick(), returning as soon as a new rewards timestamp appears. This is applied to both tests.

Yes, PR #8981 still addresses both failure patterns:

Pattern A: 500 ticks (vs. 2) gives NRC ample time to sync metrics before governance's minting flow succeeds.
Pattern B: 500 explicit tick() calls run ALL canisters' heartbeats each iteration, letting NRC catch up and governance retry and complete the full async minting chain. The old retry loop relied on execute_ingress heartbeats which weren't sufficient.

One Minor Concern

The PR removes the seconds_advanced_test_2 compensation for the third reward period. The old code tracked 600s of extra time from the retry loop; the PR's wait_for_rewards_minting adds a small untracked amount of extra time (a few seconds). This is much smaller than the old 600s and unlikely to cause a day-boundary shift in expected_reward_days_covered, but in principle the untracked extra seconds from wait_for_rewards_minting could affect the third reward period's day count calculation if the test runs right at a UTC day boundary. This is a low-probability edge case.

Conclusion

The PR is still relevant and should substantially reduce or eliminate the observed flakiness. The reviewer @jasonz-dfinity's suggestion to include advance_time in each tick iteration (already incorporated) also makes the fix more robust for a potential future heartbeat-to-timer migration.

github-actions bot added the fix label Feb 21, 2026

clippy fix

cb82d60

basvandijk marked this pull request as ready for review February 21, 2026 15:58

basvandijk requested a review from a team as a code owner February 21, 2026 15:58

github-actions bot previously requested changes Feb 21, 2026

View reviewed changes

github-actions bot added the @governance-team label Feb 21, 2026

jasonz-dfinity reviewed Feb 23, 2026

View reviewed changes

rs/nns/integration_tests/src/node_provider_remuneration.rs Show resolved Hide resolved

basvandijk added 2 commits March 4, 2026 19:45

Merge remote-tracking branch 'origin/master' into ai/deflake-node_pro…

f68f96e

…vider_remuneration

fix: advance time in wait_for_rewards_minting loop

20b2baa

Address review comment from jasonz-dfinity: add advance_time(1s) before each tick() in wait_for_rewards_minting so that timer-based periodic tasks are also triggered, making the test more robust and realistic.

basvandijk requested a review from jasonz-dfinity March 4, 2026 20:21

jasonz-dfinity approved these changes Mar 4, 2026

View reviewed changes

jasonz-dfinity added this pull request to the merge queue Mar 4, 2026

Merged via the queue into master with commit ff394c4 Mar 4, 2026
41 checks passed

jasonz-dfinity deleted the ai/deflake-node_provider_remuneration branch March 4, 2026 22:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: deflake //rs/nns/integration_tests:integration_tests_src/node_provider_remuneration#8981

fix: deflake //rs/nns/integration_tests:integration_tests_src/node_provider_remuneration#8981
jasonz-dfinity merged 4 commits intomasterfrom
ai/deflake-node_provider_remuneration

basvandijk commented Feb 21, 2026 •

edited

Loading

Uh oh!

github-actions bot left a comment

Uh oh!

Uh oh!

basvandijk commented Mar 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

basvandijk commented Feb 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Root Cause

Failure Patterns

Detailed Flow

Fix

Uh oh!

github-actions bot left a comment

Choose a reason for hiding this comment

How to Satisfy This Automatic Review

Brief Guide to "Externally Visible" Changes

Reference(s)

Uh oh!

Uh oh!

basvandijk commented Mar 4, 2026

PR #8981 Flaky Test Analysis

Flaky Runs Summary (last week: 6 runs, 8 failing attempts)

Two Failure Patterns — Same Root Cause

PR #8981's Fix

One Minor Concern

Conclusion

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

basvandijk commented Feb 21, 2026 •

edited

Loading