[update] fix: stop control plane check during split variant post-services tests#3978

Open
Valkyrie00 wants to merge 1 commit into openstack-k8s-operators:main from Valkyrie00:fix/OSPCIX-1381-stop-bg-tests-before-split-intermediate

@Valkyrie00 Valkyrie00 commented Jun 3, 2026

Contributor

Context

This is a targeted fix to unblock the job, which is currently failing due to a race condition between the continuous control plane check and the tobiko tests.

A more comprehensive solution, such as adding a pause/resume mechanism to the control plane check scripts or reworking how background tests interact with the split update variant, could be evaluated as a follow-up.

Problem

Some jobs fail because tobiko tests encounter VMs in BUILD state during the post-services test phase.

The root cause is a race condition: the continuous control plane check (cifmw_update_control_plane_check) runs workload_launch.sh in an infinite loop, creating and tearing down VMs every few seconds.

In the split variant, tests run between the services update and system update phases while this loop is still active.
Tobiko lists all servers, finds a VM mid-creation, and fails:
novaclient.exceptions.Conflict: Cannot 'reboot' instance ... while it is in vm_state building (HTTP 409)
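
To make the failure mode concrete, here is a minimal, self-contained simulation of the race. It does not call Nova or tobiko: the Conflict class is a local stand-in for novaclient.exceptions.Conflict, and the Server type, the reboot_all helper, and the server names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


class Conflict(Exception):
    """Local stand-in for novaclient.exceptions.Conflict (HTTP 409)."""


@dataclass
class Server:
    name: str
    vm_state: str  # e.g. "active" or "building"


def reboot(server: Server) -> None:
    # A reboot is rejected while the instance is still being built.
    if server.vm_state == "building":
        raise Conflict(
            f"Cannot 'reboot' instance {server.name} "
            f"while it is in vm_state {server.vm_state} (HTTP 409)"
        )


def reboot_all(servers: list[Server]) -> list[str]:
    """Reboot every listed server, as a test that lists all servers might."""
    rebooted = []
    for server in servers:
        reboot(server)
        rebooted.append(server.name)
    return rebooted


# A stable server (hypothetical name) plus one created by the background loop.
test_servers = [Server("tobiko-vm", "active")]
transient = Server("workload-loop-vm", "building")

# With the loop stopped, only stable servers are visible.
quiet_result = reboot_all(test_servers)

# With the loop running, the listing can include a VM that is mid-creation.
try:
    reboot_all(test_servers + [transient])
    race_error = None
except Conflict as exc:
    race_error = str(exc)
```

The second call fails only because the listing happens to include the transient VM, which is why the failure is intermittent across jobs.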

Proposed solution

Stop the control plane check before running the post-services tests in update_variant_split.yml, then restart it so it monitors the system update phase as well.

The ping test is not stopped because it only runs ping against an existing VM and does not create any new OpenStack resources.
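
The intended ordering can be sketched as follows. This is not the content of update_variant_split.yml, which is an Ansible file; it is an illustrative Python model under assumptions. The BackgroundCheck class, the paused helper, and the event strings are hypothetical. Restarting the check even when the tests raise is a choice made in this sketch, not something the description states.

```python
from contextlib import contextmanager


class BackgroundCheck:
    """Hypothetical handle for a background check; records start/stop events."""

    def __init__(self, name, events):
        self.name = name
        self.events = events
        self.running = False

    def start(self):
        self.running = True
        self.events.append(f"start {self.name}")

    def stop(self):
        self.running = False
        self.events.append(f"stop {self.name}")


@contextmanager
def paused(check):
    """Stop a check for the duration of a block, then restart it."""
    check.stop()
    try:
        yield
    finally:
        # Sketch-only choice: restart even if the block raises.
        check.start()


events = []
control_plane = BackgroundCheck("control-plane-check", events)
ping = BackgroundCheck("ping-test", events)

control_plane.start()
ping.start()
events.append("services update")

with paused(control_plane):
    # The ping test keeps running: it creates no new OpenStack resources.
    ping_running_during_tests = ping.running
    control_plane_running_during_tests = control_plane.running
    events.append("post-services tests")

events.append("system update")
```

The recorded events show the control plane check stopped only around the post-services tests and active again before the system update phase.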

Each control plane check run produces PID-scoped log files (control-plane-test-<PID>.log), so the stop/restart cycle creates two separate runs, each validated independently, with no log collision.
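
As a small illustration of why PID-scoped names avoid collisions, the sketch below builds log paths using the pattern quoted above. The log_path helper, the "logs" directory, and the two PIDs are hypothetical; the real scripts may build the name differently.

```python
from pathlib import Path


def log_path(log_dir: Path, pid: int) -> Path:
    """Build a PID-scoped log path following the control-plane-test-<PID>.log pattern."""
    return log_dir / f"control-plane-test-{pid}.log"


# Hypothetical PIDs: one run before the post-services tests, one after.
first_run = log_path(Path("logs"), 4021)
second_run = log_path(Path("logs"), 4388)
```

Because the restarted check is a new process with a different PID, it writes to a different file than the first run.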

Closes: OSPCIX-1381
Assisted-By: Cursor

openshift-ci Bot commented Jun 3, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@Valkyrie00 Valkyrie00 changed the title from "[DNM] [update] Stop control plane check during split variant post-services tests" to "[DNM] [update] fix: stop control plane check during split variant post-services tests" Jun 4, 2026
jistr previously approved these changes Jun 15, 2026

@jistr jistr left a comment

Contributor

/lgtm

ciecierski previously approved these changes Jun 15, 2026
@Valkyrie00 Valkyrie00 dismissed stale reviews from ciecierski and jistr via c8c6dfb June 15, 2026 23:00
@Valkyrie00 Valkyrie00 force-pushed the fix/OSPCIX-1381-stop-bg-tests-before-split-intermediate branch from c722046 to c8c6dfb Compare June 15, 2026 23:00
@openshift-ci openshift-ci Bot removed the lgtm label Jun 15, 2026

openshift-ci Bot commented Jun 15, 2026

Contributor

New changes are detected. LGTM label has been removed.

openshift-ci Bot commented Jun 15, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jistr. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…tests

In the split update variant, the continuous control plane check
runs in the background creating and deleting VMs in a loop. When
post-services tests (tobiko) execute between the services and system
update phases, they discover transient VMs in BUILD state and fail
with HTTP 409 conflicts.

Stop the control plane check before running the post-services tests
and restart it afterward so it continues monitoring during the system
update phase. The ping test is left running since it only pings an
existing VM and does not create new resources.

Closes: OSPCIX-1381
Signed-off-by: Vito Castellano <vcastell@redhat.com>
@Valkyrie00 Valkyrie00 force-pushed the fix/OSPCIX-1381-stop-bg-tests-before-split-intermediate branch from c8c6dfb to fa56392 Compare June 15, 2026 23:01
@Valkyrie00 Valkyrie00 changed the title from "[DNM] [update] fix: stop control plane check during split variant post-services tests" to "[update] fix: stop control plane check during split variant post-services tests" Jun 16, 2026
@Valkyrie00 Valkyrie00 marked this pull request as ready for review June 16, 2026 21:33

3 participants