-
Notifications
You must be signed in to change notification settings - Fork 202
OCPBUGS-74430: Fix screen progression issues with agent OVE UI #1842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-74430: Fix screen progression issues with agent OVE UI #1842
Conversation
Fixes multiple issues with the behaviour of the UI-driven installation for Agent OVE, i.e. ISO_NO_REGISTRY test. - Added a delay before starting the installation. Without this delay the ignition_config_overrides do not set since the IGNITION_CONFIG_OVERRIDE feature had not yet been read from InfraEnv - Handle progression from the Cluster screen to allow cluster creation to complete and Operators page to load - Fix Host Discovery page loading to proceed when Next button is available In addition debug has been added to save the cluster and host status retrieved from the API along with additional screenshots.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/test e2e-agent-sno-ipv6 |
|
@bfournie: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
|
||
| // Wait 1 minute before entering cluster details to allow backend initialization | ||
| // This mimics manual usage where a user takes time to read and fill in the form | ||
| logrus.Info("Waiting 1 minute to allow backend initialization...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Froma pure logging point of view, until a new LOGS_DIR env var will be set (inherited from the dev-scripts calling environment), and configured for the current logger, I don't think the logs will be visible anywhere
Note: the ds LOGS_DIR already points to a folder gathered at the end of the job and published as CI artifacts, so in that way it will be automatically taken (aka don't log any sensible info)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logOutputDir = os.Getenv("LOGDIR")
...
func initLogger() {
logFileName := "ui_driven_cluster_installation.log"
logBaseDir := "."
if logOutputDir != "" {
logBaseDir = logOutputDir
}
file, err := os.OpenFile(filepath.Join(logBaseDir, logFileName), os.O_CREATE|os.O_WRONLY, 0644)
if err != nil {
panic(err)
}
logrus.SetOutput(io.MultiWriter(os.Stdout, file))
}
| screenshotPath = filepath.Join(cwd, ocpDir) | ||
| } | ||
|
|
||
| // Wait 1 minute before entering cluster details to allow backend initialization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the exact problem here? What is the exact backed initialization we're waiting for?
Is it go-rod too fast? Btw note that the SlowMotion could be also used to slow it down, ie:
browser := rod.New().
ControlURL(url).
SlowMotion(500 * time.Millisecond).
MustConnect()
|
|
||
| func main() { | ||
| logrus.Info("Launching headless browser...") | ||
| time.Sleep(1 * time.Minute) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 1 minute? As soon as the UI is available the user should be able to jump on it and start filling the details
| // No error page, cluster loaded successfully | ||
| logrus.Infof("Operators page loaded successfully (attempt %d)", attempt) | ||
| break | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bfournie I think at this point this project may deserve some relevant refactoring to make it maintenable and simpler (also to change in case in future enh of the UI). I like the original @pawanpinjarkar ideas of having 1 method per page, and the main should really looks like something extremely simple (meta-code):
initLogger()
page := initBrowser()
fillAndSubmitClusterDetails(...)
fillAndSubmitVirtualizationBundle(...)
fillAndSubmitHostDiscovery(...)
fillAndSubmitVerifyStorage(...)
(and so on)
Every page submitting method should have its own logic to understand if an error happened or not, and then fill the required fields.
And the logic should be based only on the UI elements (as the user would do)
| logrus.Info("Select virtualization bundle") | ||
|
|
||
| // Save API state for debugging | ||
| saveClusterAPIState(screenshotPath, "02-operators") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is required, and potentially overlapping with the agent-gather. We must have a single way of debugging issues, I think we already agreed on adding the SSH key to the ISO for letting the agent-gather to do its work
|
|
||
| // Wait for operators page to finish loading before clicking Next | ||
| logrus.Info("Waiting for operators page to finish loading...") | ||
| time.Sleep(5 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this wait here seems not strictly required
| ticker := time.NewTicker(30 * time.Second) | ||
| defer ticker.Stop() | ||
|
|
||
| go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should do at all this kind of actions here (polling the AS to infer the status). As the user would do, the expected approach is to wait until:
- An error is shown in a popup
- The next button becomes enabled
|
/hold |
|
Some of the screen progression issues seem to be resolved in 4.22 so closing this for now |
|
@bfournie: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Fixes multiple issues with the behaviour of the UI-driven installation for Agent OVE, i.e. ISO_NO_REGISTRY test.
In addition debug has been added to save the cluster and host status retrieved from the API along with additional screenshots.