Conversation

@bfournie
Contributor

Fixes multiple issues with the behaviour of the UI-driven installation for Agent OVE, i.e. the ISO_NO_REGISTRY test.

  • Added a delay before starting the installation. Without this delay the ignition_config_overrides are not set, since the IGNITION_CONFIG_OVERRIDE feature had not yet been read from the InfraEnv
  • Handle progression from the Cluster screen to allow cluster creation to complete and the Operators page to load
  • Fix Host Discovery page loading to proceed when the Next button is available

In addition, debug support has been added to save the cluster and host status retrieved from the API, along with additional screenshots.
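A minimal sketch of the delay described in the first bullet, as added before filling the cluster-details form (the one-minute sleep and logrus message match the diff hunks discussed below):

```go
// Wait before entering cluster details so the backend has time to read the
// IGNITION_CONFIG_OVERRIDE feature from the InfraEnv; without this the
// ignition_config_overrides are silently dropped.
logrus.Info("Waiting 1 minute to allow backend initialization...")
time.Sleep(1 * time.Minute)
```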

@openshift-ci

openshift-ci bot commented Jan 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rwsu for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bfournie
Contributor Author

/test e2e-agent-sno-ipv6

@openshift-ci

openshift-ci bot commented Jan 26, 2026

@bfournie: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-agent-sno-ipv6 | 8ca2b7a | link | false | /test e2e-agent-sno-ipv6 |
| ci/prow/e2e-agent-ha-dualstack | 8ca2b7a | link | false | /test e2e-agent-ha-dualstack |
| ci/prow/e2e-agent-compact-ipv4-iso-no-registry | 8ca2b7a | link | false | /test e2e-agent-compact-ipv4-iso-no-registry |

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


```go
// Wait 1 minute before entering cluster details to allow backend initialization
// This mimics manual usage where a user takes time to read and fill in the form
logrus.Info("Waiting 1 minute to allow backend initialization...")
```
Member

From a pure logging point of view, until a new LOGS_DIR env var is set (inherited from the dev-scripts calling environment) and configured for the current logger, I don't think the logs will be visible anywhere.
Note: the dev-scripts LOGS_DIR already points to a folder gathered at the end of the job and published as CI artifacts, so it will be picked up automatically (aka don't log any sensitive info)

Member

```go
logOutputDir     = os.Getenv("LOGDIR")
...
func initLogger() {
	logFileName := "ui_driven_cluster_installation.log"
	logBaseDir := "."
	if logOutputDir != "" {
		logBaseDir = logOutputDir
	}

	// Write logs both to stdout and to a file under the gathered log directory.
	file, err := os.OpenFile(filepath.Join(logBaseDir, logFileName), os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		panic(err)
	}
	logrus.SetOutput(io.MultiWriter(os.Stdout, file))
}
```

```go
	screenshotPath = filepath.Join(cwd, ocpDir)
}

// Wait 1 minute before entering cluster details to allow backend initialization
```
Member

What is the exact problem here? What is the exact backend initialization we're waiting for?
Is go-rod too fast? Btw note that SlowMotion could also be used to slow it down, i.e.:

```go
browser := rod.New().
	ControlURL(url).
	SlowMotion(500 * time.Millisecond).
	MustConnect()
```


```go
func main() {
	logrus.Info("Launching headless browser...")
	time.Sleep(1 * time.Minute)
```
Member

Why 1 minute? As soon as the UI is available the user should be able to jump on it and start filling the details
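A minimal sketch of that user-like wait with go-rod, assuming a hypothetical selector for the cluster-name input (the real form IDs would need to be checked against the UI):

```go
// Instead of a fixed one-minute sleep, wait (with a generous timeout) until
// the cluster-details form is actually interactable, as a user would.
// "#form-input-name-field" is a hypothetical selector.
page.Timeout(2 * time.Minute).MustElement("#form-input-name-field")
logrus.Info("Cluster details form is ready")
```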

```go
	// No error page, cluster loaded successfully
	logrus.Infof("Operators page loaded successfully (attempt %d)", attempt)
	break
}
```
Member

@bfournie I think at this point this project may deserve some relevant refactoring to make it maintainable and simpler (also easier to change in case of future enhancements of the UI). I like the original @pawanpinjarkar idea of having one method per page, and the main should really look like something extremely simple (meta-code); see the sketch after this comment:

```go
initLogger()
page := initBrowser()

fillAndSubmitClusterDetails(...)
fillAndSubmitVirtualizationBundle(...)
fillAndSubmitHostDiscovery(...)
fillAndSubmitVerifyStorage(...)
// (and so on)
```

Every page-submitting method should have its own logic to understand whether an error happened or not, and then fill the required fields.
And the logic should be based only on the UI elements (as the user would do)
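A minimal sketch of one such per-page method under this pattern (the signature and selectors are hypothetical, not the PR's actual code):

```go
// fillAndSubmitClusterDetails fills the cluster-details form and advances.
// All waiting is driven by UI elements, as a user would do.
func fillAndSubmitClusterDetails(page *rod.Page, name, baseDomain string) {
	page.Timeout(2 * time.Minute).
		MustElement("#form-input-name-field"). // hypothetical selector
		MustInput(name)
	page.MustElement("#form-input-baseDnsDomain-field"). // hypothetical selector
		MustInput(baseDomain)
	page.MustElementR("button", "Next").MustWaitEnabled().MustClick()
}
```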

logrus.Info("Select virtualization bundle")

// Save API state for debugging
saveClusterAPIState(screenshotPath, "02-operators")
Member

I don't think this is required, and it potentially overlaps with agent-gather. We must have a single way of debugging issues; I think we already agreed on adding the SSH key to the ISO to let agent-gather do its work


```go
// Wait for operators page to finish loading before clicking Next
logrus.Info("Waiting for operators page to finish loading...")
time.Sleep(5 * time.Second)
```
Member

Also, this wait here doesn't seem strictly required
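A minimal sketch of what could replace it, letting go-rod decide when the page has settled (MustWaitStable and MustWaitLoad are existing go-rod page methods):

```go
// Instead of a fixed 5-second sleep, wait until the DOM stops mutating
// (or use page.MustWaitLoad() to wait only for the load event).
page.MustWaitStable()
```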

```go
ticker := time.NewTicker(30 * time.Second)
defer ticker.Stop()

go func() {
```
Member

I don't think we should do this kind of action here at all (polling the assisted-service API to infer the status). As the user would do, the expected approach is to wait until:

  1. An error is shown in a popup
  2. The Next button becomes enabled
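A minimal sketch of that UI-driven wait using go-rod's Race helper (the error-popup selector is hypothetical; Race, WaitEnabled, and rod.Try are existing go-rod APIs):

```go
// Wait, as a user would, until either an error popup appears or the
// Next button becomes enabled; no API polling involved.
err := rod.Try(func() {
	page.Timeout(10 * time.Minute).Race().
		ElementR("button", "Next").MustHandle(func(e *rod.Element) {
			e.MustWaitEnabled().MustClick()
		}).
		Element(".pf-c-alert.pf-m-danger"). // hypothetical error-popup selector
		MustDo()
})
if err != nil {
	logrus.Fatalf("installation did not progress: %v", err)
}
```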

@bfournie
Contributor Author

/hold
Marking as hold for now

@openshift-ci openshift-ci bot added the do-not-merge/hold label ("Indicates that a PR should not merge because someone has issued a /hold command.") on Jan 27, 2026
@bfournie
Contributor Author

Some of the screen progression issues seem to be resolved in 4.22 so closing this for now
/close

@openshift-ci openshift-ci bot closed this Jan 27, 2026
@openshift-ci

openshift-ci bot commented Jan 27, 2026

@bfournie: Closed this PR.


In response to this:

Some of the screen progression issues seem to be resolved in 4.22 so closing this for now
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
