[multiple] nova05epsilon: adjust Ceph for BM SNO DCN cases#3773

Open
bogdando wants to merge 3 commits into openstack-k8s-operators:main from bogdando:nova05epsilon

Conversation

@bogdando

@bogdando bogdando commented Mar 17, 2026

Contributor
  1. Adjust ceph.yml post_stage_run hook for DCN conventions

    The ceph.yml post_stage_run hook (via cifmw_ceph_client role) writes Ceph config files to
    cifmw_ceph_client_fetch_dir (default /tmp/). This template reads those files and provides them as base64-
    encoded values under data.ceph_conf (DCN convention).

  2. Allow overriding ssh and storage_mgmt

    To allow BM SNO with ceph using custom ceph CIDR values,
    make ssh_network_range and storage_mgmt_network_range overridable via
    cifmw_ceph_ssh_network_range and cifmw_ceph_storage_mgmt_network_range.
    Both are set in set_fact which clobbers extra vars, so we use the
    cifmw_ indirection with default() to preserve original defaults.

    NOTE: storage_network_range also needs this treatment.
    It used to be commented out in set_fact, so this change may need
    extra testing with the Ceph CI jobs.

    Also gather network facts for IP-to-host mapping.

  3. Fix Swift by Ceph RGW on SNO setup

    On SNO with a single EDPM compute (single-host CephHCI), the Ceph
    ingress service (haproxy/keepalived) is not deployed because the
    ceph_rgw.yml.j2 spec template only creates it for multi-host clusters.

    Parameterize the RGW port to correct the Keystone Swift endpoint for SNO.
    Change the VIP detection logic so that if cifmw_cephadm_rgw_vip
    is pre-set (e.g. to the host's storage IP for SNO cases), it is
    preserved. Otherwise it falls back to cifmw_cephadm_vip
    (the ingress VIP) as before.

    Users will be able to choose between VIP:8080 and host_ip:8082 accordingly.
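
The VIP fallback described in item 3 can be sketched in Python as follows. This is only an illustration of the selection logic, not the Ansible code changed by this PR: the function name `pick_rgw_endpoint` and its parameters are hypothetical, `rgw_vip` stands in for cifmw_cephadm_rgw_vip, `ingress_vip` stands in for cifmw_cephadm_vip, and the 8080 default is taken from the VIP:8080 case mentioned above.

```python
from typing import Optional


def pick_rgw_endpoint(
    rgw_vip: Optional[str],
    ingress_vip: Optional[str],
    rgw_port: int = 8080,
) -> str:
    """Return "host:port" for the Swift (RGW) endpoint.

    A pre-set rgw_vip (for example the host's storage IP on SNO) is kept.
    Otherwise the ingress VIP is used, as before.
    """
    host = rgw_vip or ingress_vip
    if not host:
        raise ValueError("neither an RGW VIP nor an ingress VIP is set")
    return f"{host}:{rgw_port}"


# Multi-host case: no pre-set RGW VIP, so the ingress VIP and default port apply.
multi_host = pick_rgw_endpoint(None, "192.0.2.10")

# SNO case: the host's storage IP is pre-set and the port is overridden.
sno = pick_rgw_endpoint("198.51.100.5", "192.0.2.10", rgw_port=8082)
```

The addresses are documentation-range placeholders. The sketch does not handle IPv6 literals or CIDR suffixes, which a real endpoint builder would need to consider.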

Jira: OSPRH-27641
Generated-by: claude-4.6-opus-high

@openshift-ci

openshift-ci Bot commented Mar 17, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@github-actions

github-actions Bot commented Apr 2, 2026


This PR is stale because it has been open for over 15 days with no activity.
Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions Bot added the Stale label Apr 2, 2026
@github-actions github-actions Bot closed this Apr 9, 2026
@bogdando bogdando reopened this Apr 23, 2026
@openshift-ci

openshift-ci Bot commented Apr 23, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign brjackma for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bogdando bogdando changed the title from "nova05: populate ceph_conf from files by ceph.yml" to "[ci_gen_kustomize_values] nova05epsilon: ceph conf" Apr 23, 2026
@bogdando bogdando removed the Stale label Apr 23, 2026
@github-actions

github-actions Bot commented May 9, 2026


This PR is stale because it has been open for over 15 days with no activity.
Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions Bot added the Stale label May 9, 2026
@bogdando bogdando removed the Stale label May 11, 2026
@bogdando bogdando changed the title from "[ci_gen_kustomize_values] nova05epsilon: ceph conf" to "[multiple] nova05epsilon: adjust Ceph for BM SNO cases" May 14, 2026
@bogdando bogdando changed the title from "[multiple] nova05epsilon: adjust Ceph for BM SNO cases" to "[multiple] nova05epsilon: adjust Ceph for BM SNO DCN cases" May 14, 2026
@bogdando bogdando requested review from danpawlik, fmount and fultonj May 14, 2026 08:27
@bogdando bogdando force-pushed the nova05epsilon branch 3 times, most recently from 84314e0 to f9d684c May 14, 2026 09:32
@centosinfra-prod-github-app


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/188c70ab54dc4f6bb3ab99792c2ac3c4

❌ openstack-k8s-operators-content-provider NODE_FAILURE Node(set) request 100-0000095747 failed in 0s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal-minor-update SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ podified-multinode-hci-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
✔️ cifmw-pod-zuul-files SUCCESS in 4m 49s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 03s
✔️ cifmw-pod-k8s-snippets-source SUCCESS in 5m 00s
✔️ cifmw-pod-pre-commit SUCCESS in 9m 08s
✔️ cifmw-architecture-validate-hci SUCCESS in 4m 47s
✔️ cifmw-molecule-ci_gen_kustomize_values SUCCESS in 6m 22s
✔️ cifmw-molecule-reproducer SUCCESS in 14m 04s

@centosinfra-prod-github-app


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/6347082e77b543c9b6f59af6ea456921

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 50m 26s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 29m 07s
❌ cifmw-crc-podified-edpm-baremetal FAILURE in 39m 01s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 1h 52m 02s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 51m 40s
✔️ cifmw-pod-zuul-files SUCCESS in 6m 13s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 11m 34s
✔️ cifmw-pod-k8s-snippets-source SUCCESS in 7m 25s
✔️ cifmw-pod-pre-commit SUCCESS in 10m 37s
✔️ cifmw-architecture-validate-hci SUCCESS in 5m 43s
✔️ cifmw-molecule-ci_gen_kustomize_values SUCCESS in 11m 49s

@bogdando

Contributor Author

recheck cifmw-crc-podified-edpm-baremetal

@fultonj

fultonj commented May 18, 2026

Contributor

This looks like it should be fine to me. Let's confirm with a test project.

Nit: The commit message says something about "Add computes to zuul inventory" but I don't see that in the changed files.

@fmount

fmount commented May 18, 2026

Contributor

> This looks like it should be fine to me. Let's confirm with a test project.

+1 I agree, looks ok but testprojects w/ the Ceph related scenarios will confirm.

@centosinfra-prod-github-app


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/ffd020108dc0482b9c6b06007b8f1ac5

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 51m 29s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 26m 06s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 48m 08s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 1h 57m 56s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 49m 57s
❌ cifmw-pod-zuul-files FAILURE in 5m 00s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 10m 06s
✔️ cifmw-pod-k8s-snippets-source SUCCESS in 6m 20s
✔️ cifmw-pod-pre-commit SUCCESS in 9m 50s
✔️ cifmw-architecture-validate-hci SUCCESS in 4m 37s
✔️ cifmw-molecule-ci_gen_kustomize_values SUCCESS in 6m 36s

@centosinfra-prod-github-app


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/7828c9f5bdcc411aa28b4c039f43deb6

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 23m 57s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 27m 39s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 37m 33s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 10m 29s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 44m 07s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 45s
✔️ noop SUCCESS in 0s
❌ cifmw-pod-ansible-test FAILURE in 7m 47s
✔️ cifmw-pod-k8s-snippets-source SUCCESS in 5m 17s
✔️ cifmw-pod-pre-commit SUCCESS in 9m 18s
✔️ cifmw-architecture-validate-hci SUCCESS in 4m 58s
✔️ cifmw-molecule-ci_gen_kustomize_values SUCCESS in 6m 35s

@bogdando

bogdando commented Jun 8, 2026

Contributor Author

recheck cifmw-pod-ansible-test

@centosinfra-prod-github-app


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/2c48d42cfac64f35bcc0975ba84c2350

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 25m 30s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 32m 45s
❌ cifmw-crc-podified-edpm-baremetal NODE_FAILURE Node(set) request 099-0000115138 failed in 0s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 07m 12s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 53m 26s
✔️ cifmw-pod-zuul-files SUCCESS in 6m 56s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 36s
✔️ cifmw-pod-k8s-snippets-source SUCCESS in 7m 40s
✔️ cifmw-pod-pre-commit SUCCESS in 10m 26s
✔️ cifmw-architecture-validate-hci SUCCESS in 6m 01s
✔️ cifmw-molecule-ci_gen_kustomize_values SUCCESS in 7m 49s
❌ cifmw-molecule-ci_local_storage FAILURE in 13m 17s

Comment thread hooks/playbooks/fix_swift_endpoint.yml Outdated
Comment thread roles/ci_local_storage/molecule/default/prepare.yml Outdated
@bogdando bogdando requested a review from fmount June 11, 2026 16:00
bogdando added 2 commits June 11, 2026 18:03
The ceph.yml post_stage_run hook (via cifmw_ceph_client role) writes
Ceph config files to cifmw_ceph_client_fetch_dir (default /tmp/).
This template reads those files and provides them as base64-encoded
values under data.ceph_conf (DCN convention).

Generated-by: claude-4.6-opus-high
Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>
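
The step described in this commit message (read the fetched Ceph files and expose them base64-encoded under data.ceph_conf) can be sketched in Python as follows. This is an illustration only, not the Jinja2 template from the PR: the function name `build_ceph_conf`, the `*.conf` and `*.keyring` patterns, and the choice to key the values by file name are all assumptions.

```python
import base64
from pathlib import Path


def build_ceph_conf(fetch_dir: str, patterns=("*.conf", "*.keyring")) -> dict:
    """Map each matching file in fetch_dir to its base64-encoded content.

    fetch_dir plays the role of cifmw_ceph_client_fetch_dir; the patterns
    are hypothetical and only serve the example.
    """
    encoded = {}
    for pattern in patterns:
        for path in sorted(Path(fetch_dir).glob(pattern)):
            # Read raw bytes so binary-safe content survives the encoding.
            encoded[path.name] = base64.b64encode(path.read_bytes()).decode("ascii")
    # Nest the result under data.ceph_conf, following the DCN convention
    # named in the commit message.
    return {"data": {"ceph_conf": encoded}}
```

Calling `build_ceph_conf("/tmp")` would pick up whatever matching files the hook left in the default fetch directory; files that match neither pattern are ignored.
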
To allow BM SNO with ceph using custom ceph CIDR values,
make ssh_network_range and storage_mgmt_network_range overridable via
cifmw_ceph_ssh_network_range and cifmw_ceph_storage_mgmt_network_range.
Both are set in set_fact which clobbers extra vars, so we use the
cifmw_ indirection with default() to preserve original defaults.

NOTE: storage_network_range also needs this treatment.
It used to be commented out in set_fact, so this change may need
extra testing with the Ceph CI jobs.

Also gather network facts for IP-to-host mapping.

Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>
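
The override pattern from this commit message (an optional cifmw_ variable wins, otherwise the original value is kept) can be sketched in Python as follows. This is a rough analogue of a default() lookup, not the set_fact task from the PR: the function name `effective_ranges` and the CIDR values in the usage lines are hypothetical.

```python
# Maps each fact name to the cifmw_ variable that may override it.
OVERRIDES = {
    "ssh_network_range": "cifmw_ceph_ssh_network_range",
    "storage_mgmt_network_range": "cifmw_ceph_storage_mgmt_network_range",
}


def effective_ranges(user_vars: dict, computed: dict) -> dict:
    """Return the ranges to use, preferring user overrides when defined.

    computed holds the original defaults; an override applies only when
    its cifmw_ key is present in user_vars.
    """
    return {
        name: user_vars.get(override, computed[name])
        for name, override in OVERRIDES.items()
    }


# Hypothetical defaults and a single user override.
defaults = {
    "ssh_network_range": "192.168.122.0/24",
    "storage_mgmt_network_range": "172.20.0.0/24",
}
custom = effective_ranges({"cifmw_ceph_ssh_network_range": "10.0.0.0/24"}, defaults)
```

With no overrides the function returns the defaults unchanged, which matches the commit's stated goal of preserving the original values.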
@centosinfra-prod-github-app


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://gateway-cloud-softwarefactory.apps.ocp.cloud.ci.centos.org/zuul/t/rdoproject.org/buildset/740cd2414f1c42698d90efcb81525826

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 16m 28s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 25m 38s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 37m 02s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 03m 35s
❌ podified-multinode-hci-deployment-crc FAILURE in 1h 04m 43s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 05s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 15s
✔️ cifmw-pod-k8s-snippets-source SUCCESS in 5m 03s
❌ cifmw-pod-pre-commit FAILURE in 8m 18s
✔️ cifmw-architecture-validate-hci SUCCESS in 5m 31s
✔️ cifmw-molecule-ci_gen_kustomize_values SUCCESS in 6m 52s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 5m 15s

On SNO with a single EDPM compute (single-host CephHCI), the Ceph
ingress service (haproxy/keepalived) is not deployed because the
ceph_rgw.yml.j2 spec template only creates it for multi-host clusters.

Parameterize the RGW port to correct the Keystone Swift endpoint for SNO.
Change the VIP detection logic so that if cifmw_cephadm_rgw_vip
is pre-set (e.g. to the host's storage IP for SNO cases), it is
preserved. Otherwise it falls back to cifmw_cephadm_vip
(the ingress VIP) as before.

Users will be able to choose between
VIP:8080 and host_ip:8082 accordingly.

Signed-off-by: Bohdan Dobrelia <bdobreli@redhat.com>
@bogdando

Contributor Author

The downstream testproject passed the tempest tests; the horizon test failures are unrelated.


3 participants