MON-4473: Migrate Prometheus targets discovering from Endpoints to EndpointsSlices #823

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:master from machine424:endpo on Mar 14, 2026

@machine424 machine424 commented Jan 22, 2026

This PR migrates Prometheus service discovery from the deprecated Endpoints API to the EndpointSlice API by:

  • Setting serviceDiscoveryRole: EndpointSlice on ServiceMonitors.
  • Granting Prometheus endpointslices permissions.
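
As a rough illustration of the first bullet, a ServiceMonitor opting into EndpointSlice-based discovery might look like the sketch below. Only the `serviceDiscoveryRole: EndpointSlice` line and the `authentication-operator` name come from this PR; the namespace, port name, interval, and label selector are assumptions made up for the example, not the contents of the actual manifest.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: authentication-operator                   # one of the ServiceMonitors touched by this PR
  namespace: openshift-authentication-operator    # assumed namespace, for illustration only
spec:
  # The field this PR sets: discover targets via EndpointSlice objects
  # instead of the deprecated Endpoints objects.
  serviceDiscoveryRole: EndpointSlice
  endpoints:
  - port: https          # assumed port name
    scheme: https
    interval: 30s        # assumed scrape interval
  selector:
    matchLabels:
      app: authentication-operator   # assumed label selector
```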

We're taking a conservative approach by keeping the existing endpoints permissions alongside the new endpointslices ones. This provides a safety net in case any ServiceMonitors, whether deployed from this repo or from another source, still rely on the same Role and were missed during the migration.

That said, since both resources provide essentially the same data, keeping both isn't meaningfully more permissive from a security standpoint.
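
To make the conservative approach concrete, here is a minimal sketch of a `prometheus-k8s` Role that keeps the existing `endpoints` permission and adds `endpointslices` under the `discovery.k8s.io` API group. The rule contents follow what is described in this PR; the namespace is an assumption, and the real manifest defines one such Role per relevant namespace.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: openshift-authentication-operator   # assumed namespace, for illustration only
rules:
# Existing rule, kept as a safety net for ServiceMonitors that still use Endpoints.
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
# New rule: EndpointSlice objects live in discovery.k8s.io, not the core group.
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - get
  - list
  - watch
```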

These changes target OpenShift 4.22+ and should not be backported to earlier releases.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 22, 2026

openshift-ci-robot commented Jan 22, 2026

@machine424: This pull request references MON-4473 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

coderabbitai bot commented Jan 22, 2026

Walkthrough

Added EndpointSlice support: Prometheus RBAC Roles were extended to allow endpointslices access and ServiceMonitor resources were configured to use EndpointSlice-based service discovery.

Changes

| Cohort / File(s) | Summary |
|---|---|
| RBAC EndpointSlice Support: manifests/0000_90_cluster-authentication-operator_01_prometheusrbac.yaml | Added apiGroups: ["discovery.k8s.io"] with resources: ["endpointslices"] and verbs: ["get","list","watch"] to the prometheus-k8s Role entries across relevant namespaces. |
| ServiceMonitor EndpointSlice Configuration: manifests/0000_90_cluster-authentication-operator_02_servicemonitor.yaml | Added serviceDiscoveryRole: EndpointSlice to three ServiceMonitor resources (authentication-operator, oauth-openshift, openshift-oauth-apiserver). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


@openshift-ci openshift-ci bot requested review from ibihim and liouk January 22, 2026 16:07
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
manifests/0000_90_cluster-authentication-operator_01_prometheusrbac.yaml (1)

12-18: EndpointSlice RBAC uses the wrong API group.
endpointslices live in the discovery.k8s.io API group, not the core "" API group. As written, the rules do not authorize Prometheus to list or watch EndpointSlices, so endpoint discovery will fail.

Proposed fix (apply to all three Roles)
@@
   resources:
   - services
   - endpoints
-  - endpointslices
   - pods
   verbs:
   - get
   - list
   - watch
+- apiGroups:
+  - discovery.k8s.io
+  resources:
+  - endpointslices
+  verbs:
+  - get
+  - list
+  - watch
@@
   resources:
   - services
   - endpoints
-  - endpointslices
   - pods
   verbs:
   - get
   - list
   - watch
+- apiGroups:
+  - discovery.k8s.io
+  resources:
+  - endpointslices
+  verbs:
+  - get
+  - list
+  - watch
@@
   resources:
   - services
   - endpoints
-  - endpointslices
   - pods
   verbs:
   - get
   - list
   - watch
+- apiGroups:
+  - discovery.k8s.io
+  resources:
+  - endpointslices
+  verbs:
+  - get
+  - list
+  - watch

Also applies to: 35-41, 58-64

@machine424
Contributor Author

good catch

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@manifests/0000_90_cluster-authentication-operator_01_prometheusrbac.yaml`:
- Around line 22-29: Fix the typo in the Role rules for the discovery.k8s.io
apiGroup: replace the invalid verb "watchg" with "watch" for the rules granting
access to resources "endpointslices" (the three occurrences currently showing
"watchg"); update each Role/Rule block that lists the verb to use "watch" so
Prometheus can properly watch EndpointSlice changes.

liouk commented Jan 23, 2026

/retest-required

openshift-ci-robot commented Feb 9, 2026

@machine424: This pull request references MON-4473 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

@machine424
Contributor Author

/retest-required

4 similar comments

@machine424
Contributor Author

/verified by existing tests
/jira refresh

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Mar 9, 2026
@openshift-ci-robot
Contributor

@machine424: This PR has been marked as verified by existing tests.

openshift-ci-robot commented Mar 9, 2026

@machine424: This pull request references MON-4473 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

liouk commented Mar 9, 2026

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 9, 2026

openshift-ci bot commented Mar 9, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liouk, machine424

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 9, 2026

openshift-ci bot commented Mar 12, 2026

@machine424: The following test failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-gcp-operator-disruptive | a622252 | link | true | /test e2e-gcp-operator-disruptive |

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD eca6050 and 2 for PR HEAD a622252 in total

@openshift-merge-bot openshift-merge-bot bot merged commit 0ad22f8 into openshift:master Mar 14, 2026
14 of 15 checks passed