[OSDOCS-15316]: Creating a hosted cluster on bare metal #96275

Open
wants to merge 1 commit into
base: main

lahinson
Contributor

@lahinson lahinson commented Jul 17, 2025

@openshift-ci openshift-ci bot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Jul 17, 2025
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch from a51c7f8 to 0c0d363 Compare July 17, 2025 17:00
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch 2 times, most recently from 09a5173 to 4b08e25 Compare July 17, 2025 17:23
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch 3 times, most recently from 73ffe57 to daa7c37 Compare July 17, 2025 19:39
@openshift-ci openshift-ci bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) and removed the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) Jul 17, 2025
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch from daa7c37 to e5561df Compare July 17, 2025 19:57
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch 6 times, most recently from aba8843 to 30573c3 Compare July 18, 2025 15:49
@@ -20,6 +20,8 @@ As you create a hosted cluster, keep the following guidelines in mind:

- The most common service publishing strategy is to expose services through a load balancer. That strategy is the preferred method for exposing the Kubernetes API server. If you create a hosted cluster by using the web console or by using {rh-rhacm-title}, to set a publishing strategy for a service besides the Kubernetes API server, you must manually specify the `servicePublishingStrategy` information in the `HostedCluster` custom resource.

We suggest using a load balancer, but by default the hcp create agent command creates bare-metal hosted clusters with node port; we should indicate this here. We should also explain how to change the servicePublishingStrategy in the section on creating bare-metal hosted clusters. Otherwise, we leave users to research this on their own, without even a reference to where to look. @jparrill ptal

Contributor Author

I updated the wording here, but I still need help describing what users should do to change the servicePublishingStrategy.

@racedo racedo left a comment

Add information about changing the publishing strategy to the one recommended for bare metal (load balancer), and explain that it is not the default.
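
As a possible starting point for the missing guidance, the sketch below prints a `HostedCluster` fragment that publishes the Kubernetes API server through a load balancer. The field layout (`spec.services`, `service: APIServer`, `type: LoadBalancer`) is an assumption built around the `servicePublishingStrategy` field named in the guideline text, and the function name `api_lb_fragment` is hypothetical. This thread does not settle whether the fragment is applied at creation time or afterward, so the sketch only emits the YAML.

```shell
#!/bin/sh
# Hypothetical helper: prints an assumed HostedCluster fragment that
# publishes the Kubernetes API server through a load balancer.
# The field names below are assumptions to verify against the
# HostedCluster API reference before they go into the docs.
api_lb_fragment() {
  cat <<'EOF'
spec:
  services:
  - service: APIServer
    servicePublishingStrategy:
      type: LoadBalancer
EOF
}

# Print the fragment so that it can be reviewed or merged into a manifest.
api_lb_fragment
```

Other services would need their own entries in the same list, each with its own strategy.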

<9> Specify the availability policy for the hosted control plane components. Supported options are `SingleReplica` and `HighlyAvailable`. The default value is `HighlyAvailable`.
-<10> Specify the supported {product-title} version that you want to use, for example, `4.19.0-multi`. If you are using a disconnected environment, replace `<ocp_release_image>` with the digest image. To extract the {product-title} release image digest, see _Extracting the {product-title} release image digest_.
+<10> Specify the supported {product-title} version that you want to use, for example, `4.19.0-multi`. If you are using a disconnected environment, replace `<ocp_release_image>` with the digest image. To extract the {product-title} release image digest, see "Extracting the {product-title} release image digest".
<11> Specify the node pool replica count, for example, `3`. You must specify the replica count as `0` or greater to create the same number of replicas. Otherwise, no node pools are created.

We don't explain that, before nodes can be added to this cluster, they must already have been added to an InfraEnv (through the web UI, for example), and we don't explain how to do that. This leaves the user with a workflow that always fails to complete, because at this point the user hasn't been told how to add nodes to the inventory first.

Contributor Author

I added a new procedure before this procedure about adding bare-metal nodes to a hardware inventory.

@racedo racedo left a comment

We have to explain the requirement to have nodes in the inventory before running this command; otherwise, the command fails.
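
One way to illustrate this requirement is a small pre-check that compares the number of hosts in the inventory with the requested node pool replica count. This is a minimal sketch: the function name `enough_hosts` is hypothetical, and the `oc get agents` invocation in the closing comment, including the `<infraenv_namespace>` placeholder, is an assumption about how the inventory would be listed rather than something this PR confirms.

```shell
#!/bin/sh
# Hypothetical pre-check: reads one inventory host per line on stdin and
# compares the count with the requested node pool replica count ($1).
enough_hosts() {
  wanted=$1
  # Count the non-empty input lines.
  found=$(grep -c .)
  if [ "$found" -lt "$wanted" ]; then
    echo "only $found host(s) in inventory, $wanted requested" >&2
    return 1
  fi
  echo "$found host(s) available for $wanted replica(s)"
}

# Assumed usage against a management cluster (not verified in this thread):
#   oc get agents -n <infraenv_namespace> -o name | enough_hosts 3
```

The function reads from stdin so that it can be exercised without a cluster, for example with `printf 'a\nb\nc\n' | enough_hosts 3`.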

@racedo racedo left a comment

Add an example of adding zone labels to the worker nodes before creating hosted clusters, and explain that the labels are mandatory (the current text suggests that creation works without them, with the pods scheduled on a single node).

@@ -20,6 +20,8 @@ As you create a hosted cluster, keep the following guidelines in mind:

- The most common service publishing strategy is to expose services through a load balancer. That strategy is the preferred method for exposing the Kubernetes API server. If you create a hosted cluster by using the web console or by using {rh-rhacm-title}, to set a publishing strategy for a service besides the Kubernetes API server, you must manually specify the `servicePublishingStrategy` information in the `HostedCluster` custom resource.

- Ensure that you meet the requirements described in "Preparing to deploy {hcp} on bare metal", which includes requirements related to infrastructure, firewalls, ports, and services.

In the "Preparing to deploy hosted control planes on bare metal" in "Prerequisites to configure a management cluster" it's mentioned "You must add the topology.kubernetes.io/zone label to your bare-metal hosts on your management cluster. Ensure that each host has a unique value for topology.kubernetes.io/zone. Otherwise, all of the hosted control plane pods are scheduled on a single node, causing a single point of failure."

In my experience, if you don't add these labels, the hosted cluster creation command never completes. We should give an example of how to do that. For example, on your management cluster, run:

oc label node [worker-node-1] topology.kubernetes.io/zone=zone1
oc label node [worker-node-2] topology.kubernetes.io/zone=zone2  
oc label node [worker-node-3] topology.kubernetes.io/zone=zone3
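
After labeling, the uniqueness requirement quoted above could be checked with something like the following sketch. The function name `check_unique_zones` is hypothetical, and the `oc get nodes` invocation in the closing comment is an assumed way to produce the "node zone" pairs; the function itself only processes text from stdin.

```shell
#!/bin/sh
# Hypothetical check: reads "node zone" pairs on stdin (one per line) and
# fails if a node has no zone or if two nodes share the same zone value.
check_unique_zones() {
  awk '
    NF == 0 { next }
    NF < 2 || $2 == "<none>" {
      printf "missing zone label: %s\n", $1; bad = 1; next
    }
    ($2 in seen) {
      printf "duplicate zone %s: %s and %s\n", $2, seen[$2], $1; bad = 1; next
    }
    { seen[$2] = $1 }
    END { exit (bad ? 1 : 0) }
  '
}

# Assumed usage against a management cluster (not verified in this thread):
#   oc get nodes --no-headers \
#     -o custom-columns='NAME:.metadata.name,ZONE:.metadata.labels.topology\.kubernetes\.io/zone' \
#     | check_unique_zones
```

The check prints nothing and returns 0 when every node has a distinct zone value.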

Contributor Author

Added to the prerequisites ✅

@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch from 30573c3 to cc2d20e Compare July 22, 2025 15:20
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch from cc2d20e to 549af30 Compare July 22, 2025 16:01
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch from 549af30 to e15611b Compare July 22, 2025 18:17
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch 4 times, most recently from 531204b to 1104fb6 Compare July 22, 2025 21:01
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch 3 times, most recently from 5dbf4ce to 007db9a Compare July 23, 2025 15:15
@lahinson lahinson force-pushed the osdocs-15316-hcp-bm-improvements branch from 007db9a to d7dd5ba Compare July 23, 2025 17:05

openshift-ci bot commented Jul 23, 2025

@lahinson: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
