[release-4.19] add systemd services for configuration after start #1062

Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -11,3 +11,4 @@ podman-remote/
.sw[a-p]
crc-cluster-kube-apiserver-operator
crc-cluster-kube-controller-manager-operator
systemd/crc-dnsmasq.sh
35 changes: 35 additions & 0 deletions createdisk-library.sh
@@ -223,6 +223,7 @@ function prepare_hyperV() {
echo 'CONST{virt}=="microsoft", RUN{builtin}+="kmod load hv_sock"' > /etc/udev/rules.d/90-crc-vsock.rules
EOF
}

function prepare_qemu_guest_agent() {
local vm_ip=$1

@@ -392,3 +393,37 @@ function remove_pull_secret_from_disk() {
esac
}

function copy_systemd_units() {
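    # render systemd/crc-dnsmasq.sh from dnsmasq.sh.template, substituting the apps domain for this bundle type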
case "${BUNDLE_TYPE}" in
"snc"|"okd")
export APPS_DOMAIN="apps-crc.testing"
envsubst '${APPS_DOMAIN}' < systemd/dnsmasq.sh.template > systemd/crc-dnsmasq.sh
unset APPS_DOMAIN
;;
"microshift")
export APPS_DOMAIN="apps.crc.testing"
envsubst '${APPS_DOMAIN}' < systemd/dnsmasq.sh.template > systemd/crc-dnsmasq.sh
unset APPS_DOMAIN
;;
esac

${SSH} core@${VM_IP} -- 'mkdir -p /home/core/systemd-units && mkdir -p /home/core/systemd-scripts'
${SCP} systemd/crc-*.service core@${VM_IP}:/home/core/systemd-units/
${SCP} systemd/crc-*.sh core@${VM_IP}:/home/core/systemd-scripts/

case "${BUNDLE_TYPE}" in
"snc"|"okd")
${SCP} systemd/ocp-*.service core@${VM_IP}:/home/core/systemd-units/
${SCP} systemd/ocp-*.sh core@${VM_IP}:/home/core/systemd-scripts/
;;
esac

${SSH} core@${VM_IP} -- 'sudo cp /home/core/systemd-units/* /etc/systemd/system/ && sudo cp /home/core/systemd-scripts/* /usr/local/bin/'
${SSH} core@${VM_IP} -- 'ls /home/core/systemd-scripts/ | xargs -t -I % sudo chmod +x /usr/local/bin/%'
${SSH} core@${VM_IP} -- 'sudo restorecon -rv /usr/local/bin'

# enable all the copied .service units
${SSH} core@${VM_IP} -- 'ls /home/core/systemd-units/*.service | xargs basename -a | xargs sudo systemctl enable'

${SSH} core@${VM_IP} -- 'rm -rf /home/core/systemd-units /home/core/systemd-scripts'
}
13 changes: 13 additions & 0 deletions createdisk.sh
@@ -151,6 +151,8 @@ fi

# Beyond this point, packages added to the ADDITIONAL_PACKAGES variable won’t be installed in the guest
install_additional_packages ${VM_IP}
copy_systemd_units

cleanup_vm_image ${VM_NAME} ${VM_IP}

# Enable cloud-init service
@@ -173,6 +175,17 @@ fi

podman_version=$(${SSH} core@${VM_IP} -- 'rpm -q --qf %{version} podman')

# Disable cloud-init network config
${SSH} core@${VM_IP} 'sudo bash -x -s' << EOF
cat << EFF > /etc/cloud/cloud.cfg.d/05_disable-network.cfg
network:
config: disabled
EFF
EOF

# Disable cloud-init hostname update
${SSH} core@${VM_IP} -- 'sudo sed -i "s/^preserve_hostname: false$/preserve_hostname: true/" /etc/cloud/cloud.cfg'

# Cleanup cloud-init config
${SSH} core@${VM_IP} -- "sudo cloud-init clean --logs"

34 changes: 34 additions & 0 deletions docs/self-sufficient-bundle.md
@@ -0,0 +1,34 @@
# Self-sufficient bundles

Since release 4.19.0 of OpenShift Local, the bundles generated by `snc` contain additional systemd services that provision the cluster, removing the need for an
outside entity to do so. An outside process still needs to create some files at pre-defined locations inside the VM for the systemd services to do their work.

## Systemd services and their input files

The following table lists the systemd services and the locations of the files they need to provision the cluster; users of SNC must create those files (a hypothetical example follows the table).

| Systemd unit | Runs for (ocp, MicroShift, both) | Input files location | Marker env variables |
| :----------------------------: | :------------------------------: | :----------------------------------: | :------------------: |
| `crc-cluster-status.service` | both | none | none |
| `crc-pullsecret.service` | both | /opt/crc/pull-secret | none |
| `crc-dnsmasq.service` | both | none | none |
| `crc-routes-controller.service`| both | none | none |
| `ocp-cluster-ca.service` | ocp | /opt/crc/custom-ca.crt | CRC_CLOUD=1 |
| `ocp-clusterid.service` | ocp | none | none |
| `ocp-custom-domain.service` | ocp | none | CRC_CLOUD=1 |
| `ocp-growfs.service` | ocp | none | none |
| `ocp-userpasswords.service` | ocp | /opt/crc/pass_{kubeadmin, developer} | none |
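
As a hypothetical example, an outside process (such as `crc` or `crc-cloud`) could seed these input files over SSH before the services run; `<vm-ip>` and the credential values below are placeholders:

```bash
# Illustrative only: seed the input files listed in the table above.
scp pull-secret.json core@<vm-ip>:/tmp/pull-secret
ssh core@<vm-ip> 'sudo mkdir -p /opt/crc && sudo mv /tmp/pull-secret /opt/crc/pull-secret'
ssh core@<vm-ip> 'echo -n "kubeadmin-pass" | sudo tee /opt/crc/pass_kubeadmin >/dev/null'
ssh core@<vm-ip> 'echo -n "developer-pass" | sudo tee /opt/crc/pass_developer >/dev/null'
```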

In addition to the above services, `ocp-cluster-ca.path`, `crc-pullsecret.path`, and `ocp-userpasswords.path` monitor the filesystem paths
used by their `*.service` counterparts and start the corresponding service when the path becomes available.
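
A minimal hypothetical sketch of such a path unit (the actual units ship in the bundle; the watched path comes from the table above):

```ini
# crc-pullsecret.path — illustrative sketch, not the shipped unit
[Unit]
Description=Watch for a pull secret at /opt/crc/pull-secret

[Path]
# start crc-pullsecret.service once the file exists
PathExists=/opt/crc/pull-secret
Unit=crc-pullsecret.service

[Install]
WantedBy=multi-user.target
```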

> [!NOTE]
> A "marker env variable" is set using an env file; if the required env variable is not set, the unit is skipped.
> Some units run only when `CRC_CLOUD=1` is set; these are needed only when using the bundles with crc-cloud.
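
The units in this PR read the env file from `/etc/systemd/system/crc-env` (see the `EnvironmentFile=` lines in the unit files below). A plausible sketch of its contents, assuming both marker variables are in use:

```bash
# /etc/systemd/system/crc-env — hypothetical example; the actual contents are
# written by the consumer of the bundle (crc or crc-cloud)
CRC_CLOUD=1              # run the crc-cloud-only units (ocp-cluster-ca, ocp-custom-domain)
CRC_NETWORK_MODE_USER=0  # 'system' network mode: crc-dnsmasq.sh writes the dnsmasq config
```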

The systemd services are heavily based on the [`clustersetup.sh`](https://github.com/crc-org/crc-cloud/blob/main/pkg/bundle/setup/clustersetup.sh) script found in the `crc-cloud` project.

## Naming convention for the systemd unit files

Systemd units needed for both 'OpenShift' and 'MicroShift' are named `crc-*.service`, units needed only for 'OpenShift' are named
`ocp-*.service`, and any future units needed only for 'MicroShift' should be named `ucp-*.service`.

13 changes: 13 additions & 0 deletions systemd/crc-cluster-status.service
@@ -0,0 +1,13 @@
[Unit]
Description=CRC Unit checking if cluster is ready
After=kubelet.service ocp-clusterid.service ocp-cluster-ca.service ocp-custom-domain.service
After=crc-pullsecret.service

[Service]
Type=oneshot
Restart=on-failure
ExecStart=/usr/local/bin/crc-cluster-status.sh
RemainAfterExit=true

[Install]
WantedBy=multi-user.target
43 changes: 43 additions & 0 deletions systemd/crc-cluster-status.sh
@@ -0,0 +1,43 @@
#!/bin/bash

set -x

export KUBECONFIG=/opt/kubeconfig

function check_cluster_healthy() {
WAIT="authentication|console|etcd|ingress|openshift-apiserver"

until oc get co > /dev/null 2>&1
do
sleep 2
done

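# column 3 of 'oc get co' output is AVAILABLE; False for any watched operator means not ready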
for i in $(oc get co | grep -P "$WAIT" | awk '{ print $3 }')
do
if [[ $i == "False" ]]
then
return 1
fi
done
return 0
}

rm -rf /tmp/.crc-cluster-ready

COUNTER=0
CLUSTER_HEALTH_SLEEP=8
CLUSTER_HEALTH_RETRIES=500

while ! check_cluster_healthy
do
sleep $CLUSTER_HEALTH_SLEEP
if [[ $COUNTER == $CLUSTER_HEALTH_RETRIES ]]
then
# this runs at the top level of the script, not in a function, so exit rather than return
exit 1
fi
((COUNTER++))
done

# need to set a marker to let `crc` know the cluster is ready
touch /tmp/.crc-cluster-ready

18 changes: 18 additions & 0 deletions systemd/crc-dnsmasq.service
@@ -0,0 +1,18 @@
[Unit]
Description=CRC Unit for configuring dnsmasq
Wants=ovs-configuration.service
After=ovs-configuration.service
Before=kubelet-dependencies.target
StartLimitIntervalSec=30

[Service]
Type=oneshot
Restart=on-failure
EnvironmentFile=/etc/systemd/system/crc-env
ExecStartPre=/bin/systemctl start ovs-configuration.service
ExecStart=/usr/local/bin/crc-dnsmasq.sh
ExecStartPost=/usr/bin/systemctl restart NetworkManager.service
ExecStartPost=/usr/bin/systemctl restart dnsmasq.service

[Install]
WantedBy=kubelet-dependencies.target
12 changes: 12 additions & 0 deletions systemd/crc-pullsecret.service
@@ -0,0 +1,12 @@
[Unit]
Description=CRC Unit for adding pull secret to cluster
After=kubelet.service
StartLimitIntervalSec=90sec

[Service]
Type=oneshot
Restart=on-failure
ExecStart=/usr/local/bin/crc-pullsecret.sh

[Install]
WantedBy=multi-user.target
21 changes: 21 additions & 0 deletions systemd/crc-pullsecret.sh
@@ -0,0 +1,21 @@
#!/bin/bash

set -x

source /usr/local/bin/crc-systemd-common.sh
export KUBECONFIG="/opt/kubeconfig"

wait_for_resource secret

# check if existing pull-secret is valid if not add the one from /opt/crc/pull-secret
existingPsB64=$(oc get secret pull-secret -n openshift-config -o jsonpath="{['data']['\.dockerconfigjson']}")
existingPs=$(echo "${existingPsB64}" | base64 -d)

echo "${existingPs}" | jq -e '.auths'

if [[ $? != 0 ]]; then
pullSecretB64=$(cat /opt/crc/pull-secret | base64 -w0)
oc patch secret pull-secret -n openshift-config --type merge -p "{\"data\":{\".dockerconfigjson\":\"${pullSecretB64}\"}}"
rm -f /opt/crc/pull-secret
fi

12 changes: 12 additions & 0 deletions systemd/crc-routes-controller.service
@@ -0,0 +1,12 @@
[Unit]
Description=CRC Unit starting routes controller
Wants=network-online.target gvisor-tap-vsock.service sys-class-net-tap0.device
After=sys-class-net-tap0.device network-online.target kubelet.service gvisor-tap-vsock.service

[Service]
Type=oneshot
EnvironmentFile=/etc/systemd/system/crc-env
ExecStart=/usr/local/bin/crc-routes-controller.sh

[Install]
WantedBy=multi-user.target
16 changes: 16 additions & 0 deletions systemd/crc-routes-controller.sh
@@ -0,0 +1,16 @@
#!/bin/bash

set -x

if [[ ${CRC_NETWORK_MODE_USER} -eq 0 ]]; then
echo -n "network-mode 'system' detected: skipping routes-controller pod deployment"
exit 0
fi

source /usr/local/bin/crc-systemd-common.sh
export KUBECONFIG=/opt/kubeconfig

wait_for_resource pods

oc apply -f /opt/crc/routes-controller.yaml

12 changes: 12 additions & 0 deletions systemd/crc-systemd-common.sh
@@ -0,0 +1,12 @@
# $1 is the resource to check
# $2 is an optional maximum retry count; default 20
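# Usage: wait_for_resource <resource> [max_retry]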
function wait_for_resource() {
local retry=0
local max_retry=${2:-20}
until oc get "$1" > /dev/null 2>&1
do
[ $retry == $max_retry ] && exit 1
sleep 5
((retry++))
done
}
26 changes: 26 additions & 0 deletions systemd/dnsmasq.sh.template
@@ -0,0 +1,26 @@
#!/bin/bash

set -x

if [[ ${CRC_NETWORK_MODE_USER} -eq 1 ]]; then
echo -n "network-mode 'user' detected: skipping dnsmasq configuration"
exit 0
fi

hostName=$(hostname)
hostIp=$(hostname --all-ip-addresses | awk '{print $1}')

cat << EOF > /etc/dnsmasq.d/crc-dnsmasq.conf
listen-address=$hostIp
expand-hosts
log-queries
local=/crc.testing/
domain=crc.testing
address=/${APPS_DOMAIN}/$hostIp
address=/api.crc.testing/$hostIp
address=/api-int.crc.testing/$hostIp
address=/$hostName.crc.testing/$hostIp
EOF

/bin/systemctl enable --now dnsmasq.service
/bin/nmcli conn modify --temporary ovs-if-br-ex ipv4.dns $hostIp,1.1.1.1
11 changes: 11 additions & 0 deletions systemd/ocp-cluster-ca.service
@@ -0,0 +1,11 @@
[Unit]
Description=CRC Unit setting custom cluster ca
After=kubelet.service ocp-clusterid.service

[Service]
Type=oneshot
Restart=on-failure
ExecStart=/usr/local/bin/ocp-cluster-ca.sh

[Install]
WantedBy=multi-user.target
91 changes: 91 additions & 0 deletions systemd/ocp-cluster-ca.sh
@@ -0,0 +1,91 @@
#!/bin/bash

# The steps followed to generate CA and replace system:admin cert are from:
# https://access.redhat.com/solutions/5286371
# https://access.redhat.com/solutions/6054981

set -x

source /usr/local/bin/crc-systemd-common.sh
export KUBECONFIG="/opt/kubeconfig"

wait_for_resource configmap

custom_ca_path=/opt/crc/custom-ca.crt
external_ip_path=/opt/crc/eip

if [ ! -f ${custom_ca_path} ]; then
echo "Cert bundle /opt/crc/custom-ca.crt not found, generating one..."
# generate a ca bundle and use it, overwrite custom_ca_path
CA_SUBJ="/OU=openshift/CN=admin-kubeconfig-signer-custom"
openssl genrsa -out /tmp/custom-ca.key 4096
openssl req -x509 -new -nodes -key /tmp/custom-ca.key -sha256 -days 365 -out "${custom_ca_path}" -subj "${CA_SUBJ}"
fi

if [ ! -f /opt/crc/pass_kubeadmin ]; then
echo "kubeadmin password file not found"
exit 1
fi

PASS_KUBEADMIN="$(cat /opt/crc/pass_kubeadmin)"
oc create configmap client-ca-custom -n openshift-config --from-file=ca-bundle.crt=${custom_ca_path}
oc patch apiserver cluster --type=merge -p '{"spec": {"clientCA": {"name": "client-ca-custom"}}}'
oc create configmap admin-kubeconfig-client-ca -n openshift-config --from-file=ca-bundle.crt=${custom_ca_path} \
--dry-run=client -o yaml | oc replace -f -

rm -f /opt/crc/custom-ca.crt

# create CSR
openssl req -new -newkey rsa:4096 -nodes -keyout /tmp/newauth-access.key -out /tmp/newauth-access.csr -subj "/CN=system:admin"

# use > rather than >> so a retried run does not append a duplicate manifest
cat << EOF > /tmp/newauth-access-csr.yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
name: newauth-access
spec:
signerName: kubernetes.io/kube-apiserver-client
groups:
- system:authenticated
request: $(cat /tmp/newauth-access.csr | base64 -w0)
usages:
- client auth
EOF

oc create -f /tmp/newauth-access-csr.yaml

until oc adm certificate approve newauth-access > /dev/null 2>&1
do
echo "Unable to approve the csr newauth-access"
sleep 5
done

cluster_name=$(oc config view -o jsonpath='{.clusters[0].name}')
apiserver_url=$(oc config view -o jsonpath='{.clusters[0].cluster.server}')

if [ -f "${external_ip_path}" ]; then
apiserver_url=api.$(cat "${external_ip_path}").nip.io
fi

updated_kubeconfig_path=/opt/crc/kubeconfig

oc get csr newauth-access -o jsonpath='{.status.certificate}' | base64 -d > /tmp/newauth-access.crt
oc config set-credentials system:admin --client-certificate=/tmp/newauth-access.crt --client-key=/tmp/newauth-access.key --embed-certs --kubeconfig="${updated_kubeconfig_path}"
oc config set-context system:admin --cluster="${cluster_name}" --namespace=default --user=system:admin --kubeconfig="${updated_kubeconfig_path}"
oc get secret localhost-recovery-client-token -n openshift-kube-controller-manager -ojsonpath='{.data.ca\.crt}'| base64 -d > /tmp/bundle-ca.crt
oc config set-cluster "${cluster_name}" --server="${apiserver_url}" --certificate-authority=/tmp/bundle-ca.crt \
--kubeconfig="${updated_kubeconfig_path}" --embed-certs

echo "Logging in again to update $KUBECONFIG with kubeadmin token"
COUNTER=0
MAXIMUM_LOGIN_RETRY=500
until oc login --insecure-skip-tls-verify=true -u kubeadmin -p "$PASS_KUBEADMIN" https://api.crc.testing:6443 --kubeconfig /opt/crc/newkubeconfig > /dev/null 2>&1
do
if [ $COUNTER == $MAXIMUM_LOGIN_RETRY ]; then
echo "Unable to login to the cluster..., installation failed."
exit 1
fi
echo "Logging into OpenShift with updated credentials try $COUNTER, hang on...."
sleep 5
((COUNTER++))
done