diff --git a/_topic_maps/_topic_map_ms.yml b/_topic_maps/_topic_map_ms.yml index bd7f6a5759ec..fdd91440b1cf 100644 --- a/_topic_maps/_topic_map_ms.yml +++ b/_topic_maps/_topic_map_ms.yml @@ -215,8 +215,8 @@ Topics: File: microshift-embed-apps-offline-use - Name: Embedding applications tutorial File: microshift-embedding-apps-tutorial -- Name: Creating application or workload health check scripts - File: microshift-greenboot-workload-scripts +- Name: Using greenboot for workload health checks + File: microshift-greenboot-workload-health-checks - Name: Automating application management with GitOps File: microshift-gitops - Name: Pod security authentication and authorization diff --git a/microshift_install_get_ready/microshift-greenboot.adoc b/microshift_install_get_ready/microshift-greenboot.adoc index b5ae906d8a6b..1901f0812568 100644 --- a/microshift_install_get_ready/microshift-greenboot.adoc +++ b/microshift_install_get_ready/microshift-greenboot.adoc @@ -6,19 +6,14 @@ include::_attributes/attributes-microshift.adoc[] toc::[] -Greenboot is the generic health check framework for the `systemd` service on `rpm-ostree` systems such as {op-system-ostree-first}. This framework is included in {microshift-short} installations with the `microshift-greenboot` and `greenboot-default-health-checks` RPM packages. +Learn about how greenboot health checks are used with {microshift-short}. -Greenboot health checks run at various times to assess system health and automate a rollback on `rpm-ostree` systems to the last healthy state in cases of software trouble, for example: - -* Default health check scripts run each time the system starts. -* In addition the to the default health checks, you can write, install, and configure application health check scripts to also run every time the system starts. -* Greenboot can reduce your risk of being locked out of edge devices during updates and prevent a significant interruption of service if an update fails. 
-* When a failure is detected, the system boots into the last known working configuration using the `rpm-ostree` rollback capability. This feature is especially useful automation for edge devices where direct serviceability is either limited or non-existent. - -A {microshift-short} application health check script is included in the `microshift-greenboot` RPM. The `greenboot-default-health-checks` RPM includes health check scripts verifying that DNS and `ostree` services are accessible. You can create your own health check scripts for the workloads you are running. You can write one that verifies that an application has started, for example. +include::modules/microshift-greenboot-dir-use-for-scripts.adoc[leveloffset=+1] include::modules/microshift-greenboot-dir-structure.adoc[leveloffset=+1] +include::modules/microshift-greenboot-included-health-checks.adoc[leveloffset=+1] + include::modules/microshift-greenboot-microshift-health-script.adoc[leveloffset=+1] include::modules/microshift-greenboot-systemd-journal-data.adoc[leveloffset=+1] @@ -40,4 +35,4 @@ include::modules/microshift-greenboot-check-update.adoc[leveloffset=+1] [id="additional-resources_microshift-greenboot_{context}"] [role="_additional-resources_microshift-greenboot"] == Additional resources -* xref:../microshift_running_apps/microshift-greenboot-workload-scripts.adoc#microshift-greenboot-workload-scripts[Greenboot workload health check scripts] \ No newline at end of file +* xref:../microshift_running_apps/microshift-greenboot-workload-health-checks.adoc#microshift-greenboot-workload-health-checks[Greenboot workload health checks] \ No newline at end of file diff --git a/microshift_running_apps/microshift-greenboot-workload-health-checks.adoc b/microshift_running_apps/microshift-greenboot-workload-health-checks.adoc new file mode 100644 index 000000000000..c62151deecef --- /dev/null +++ b/microshift_running_apps/microshift-greenboot-workload-health-checks.adoc @@ -0,0 +1,26 @@ 
+:_mod-docs-content-type: ASSEMBLY +[id="microshift-greenboot-workload-health-checks"] += Using greenboot for application and workload health checks +include::_attributes/attributes-microshift.adoc[] +:context: microshift-greenboot-workload-health-checks + +toc::[] + +You can use greenboot health checks to assess the health of your workloads and applications. + +include::modules/microshift-greenboot-how-workload-health-checks-work.adoc[leveloffset=+1] + +include::modules/microshift-greenboot-health-check-command.adoc[leveloffset=+1] + +include::modules/microshift-greenboot-app-health-check-script.adoc[leveloffset=+1] + +include::modules/microshift-greenboot-workload-health-script-ex.adoc[leveloffset=+2] + +include::modules/microshift-greenboot-testing-workload-script.adoc[leveloffset=+2] + +[role="_additional-resources"] +== Additional resources + +* xref:../microshift_install_get_ready/microshift-greenboot.adoc#microshift-greenboot[The greenboot health check] + +* xref:../microshift_running_apps/microshift-applications.adoc#microshift-applying-manifests-example_applications-microshift[Auto applying manifests] diff --git a/microshift_running_apps/microshift-greenboot-workload-scripts.adoc b/microshift_running_apps/microshift-greenboot-workload-scripts.adoc deleted file mode 100644 index 145906bd2c59..000000000000 --- a/microshift_running_apps/microshift-greenboot-workload-scripts.adoc +++ /dev/null @@ -1,25 +0,0 @@ -:_mod-docs-content-type: ASSEMBLY -[id="microshift-greenboot-workload-scripts"] -= Greenboot workload health check scripts -include::_attributes/attributes-microshift.adoc[] -:context: microshift-greenboot-workload-scripts - -toc::[] - -Greenboot health check scripts are helpful on edge devices where direct serviceability is either limited or non-existent. You can create health check scripts to assess the health of your workloads and applications. 
These additional health check scripts are useful components of software problem checks and automatic system rollbacks. - -A {microshift-short} health check script is included in the `microshift-greenboot` RPM. You can also create your own health check scripts based on the workloads you are running. For example, you can write one that verifies that a service has started. - -include::modules/microshift-greenboot-how-workload-health-check-scripts-work.adoc[leveloffset=+1] - -include::modules/microshift-greenboot-included-health-checks.adoc[leveloffset=+1] - -include::modules/microshift-greenboot-create-health-check-script.adoc[leveloffset=+1] - -include::modules/microshift-greenboot-testing-workload-script.adoc[leveloffset=+1] - -[id="additional-resources_microshift-greenboot-workload-scripts_{context}"] -[role="_additional-resources"] -== Additional resources -* xref:../microshift_install_get_ready/microshift-greenboot.adoc#microshift-greenboot[The greenboot health check] -* xref:../microshift_running_apps/microshift-applications.adoc#microshift-applying-manifests-example_applications-microshift[Auto applying manifests] diff --git a/microshift_updating/microshift-update-options.adoc b/microshift_updating/microshift-update-options.adoc index 64d460e6ed2b..14f0066370ea 100644 --- a/microshift_updating/microshift-update-options.adoc +++ b/microshift_updating/microshift-update-options.adoc @@ -51,7 +51,7 @@ To begin a {microshift-short} update by embedding in a {op-system-ostree} image, To understand more about greenboot, see the following documentation: * xref:../microshift_install_get_ready/microshift-greenboot.adoc#microshift-greenboot[The greenboot health check] -* xref:../microshift_running_apps/microshift-greenboot-workload-scripts.adoc#microshift-greenboot-workload-scripts[Greenboot workload health check scripts] +* xref:../microshift_running_apps/microshift-greenboot-workload-health-checks.adoc#microshift-greenboot-workload-health-checks[Greenboot workload health 
checks] [id="microshift-update-options-manual-rpm-updates_{context}"] === Manual RPM updates @@ -91,4 +91,4 @@ You can update {op-system-ostree} or {op-system-base} and update {microshift-sho * xref:../microshift_updating/microshift-update-rpms-ostree.adoc#microshift-update-rpms-ostree[Applying updates on an OSTree system] * xref:../microshift_updating/microshift-update-rpms-manually.adoc#microshift-update-rpms-manually[Applying updates manually with RPMs] * xref:../microshift_install_get_ready/microshift-greenboot.adoc#microshift-greenboot[The greenboot system health check] -* xref:../microshift_running_apps/microshift-greenboot-workload-scripts.adoc#microshift-greenboot-workload-scripts[Greenboot workload scripts] +* xref:../microshift_running_apps/microshift-greenboot-workload-health-checks.adoc#microshift-greenboot-workload-health-checks[Greenboot workload health checks] diff --git a/modules/microshift-greenboot-app-health-check-script.adoc b/modules/microshift-greenboot-app-health-check-script.adoc new file mode 100644 index 000000000000..b5f055c58411 --- /dev/null +++ b/modules/microshift-greenboot-app-health-check-script.adoc @@ -0,0 +1,19 @@ +//Module included in the following assemblies: +// +//* microshift_running_apps/microshift-greenboot-workload-health-checks.adoc + +:_mod-docs-content-type: CONCEPT +[id="microshift-greenboot-app-health-check-script_{context}"] += How to create a health check script for your application + +You can create workload or application health check scripts in the text editor of your choice. Save the scripts in the `/etc/greenboot/check/required.d` directory. When a script in the `/etc/greenboot/check/required.d` directory exits with an error, greenboot triggers a reboot in an attempt to heal the system. + +[NOTE] +==== +Any script in the `/etc/greenboot/check/required.d` directory triggers a reboot if it exits with an error. 
+==== + +If your health check logic requires any post-check steps, you can also create additional scripts and save them in the relevant greenboot directories. For example: + +* You can also place shell scripts you want to run after a boot has been declared successful in `/etc/greenboot/green.d`. +* You can place shell scripts you want to run after a boot has been declared failed in `/etc/greenboot/red.d`. For example, if you have steps to heal the system before restarting, you can create scripts for your use case and place them in the `/etc/greenboot/red.d` directory. diff --git a/modules/microshift-greenboot-check-update.adoc b/modules/microshift-greenboot-check-update.adoc index 5215f582f511..3ad258d15b5b 100644 --- a/modules/microshift-greenboot-check-update.adoc +++ b/modules/microshift-greenboot-check-update.adoc @@ -16,11 +16,10 @@ Access the output of greenboot health check scripts in the system log after an u ---- $ sudo grub2-editenv - list | grep ^boot_success ---- - ++ .Example output for a successful update [source,terminal] ---- -boot_success=1 +boot_success=1 <1> ---- - -If your command returns `boot_success=0`, either the greenboot health check is still running, or the update is a failure. \ No newline at end of file +<1> If your command returns `boot_success=0`, either the greenboot health check is still running, or the update is a failure. 
diff --git a/modules/microshift-greenboot-create-health-check-script.adoc b/modules/microshift-greenboot-create-health-check-script.adoc deleted file mode 100644 index 3a558fc20d0f..000000000000 --- a/modules/microshift-greenboot-create-health-check-script.adoc +++ /dev/null @@ -1,113 +0,0 @@ -//Updated title and ID: -//Module included in the following assemblies: -// -//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc - -:_mod-docs-content-type: CONCEPT -[id="microshift-greenboot-app-health-check-script_{context}"] -= How to create a health check script for your application - -You can create workload or application health check scripts in the text editor of your choice using the example in this documentation. Save the scripts in the `/etc/greenboot/check/required.d` directory. When a script in the `/etc/greenboot/check/required.d` directory exits with an error, greenboot triggers a reboot in an attempt to heal the system. - -[NOTE] -==== -Any script in the `/etc/greenboot/check/required.d` directory triggers a reboot if it exits with an error. -==== - -If your health check logic requires any post-check steps, you can also create additional scripts and save them in the relevant greenboot directories. For example: - -* You can also place shell scripts you want to run after a boot has been declared successful in `/etc/greenboot/green.d`. -* You can place shell scripts you want to run after a boot has been declared failed in `/etc/greenboot/red.d`. For example, if you have steps to heal the system before restarting, you can create scripts for your use case and place them in the `/etc/greenboot/red.d` directory. - -[id="microshift-greenboot-about-workload-health-check-script-example_{context}"] -== About the workload health check script example - -The following example uses the {microshift-short} health check script as a template. 
You can use this example with the provided libraries as a guide for creating basic health check scripts for your applications. - -[id="microshift-greenboot-app-health-check-basic-prereqs_{context}"] -=== Basic prerequisites for creating a health check script - -* The workload must be installed. -* You must have root access. - -[id="microshift-greenboot-app-health-check-ex-reqs_{context}"] -=== Example and functional requirements - -You can start with the following example health check script. Modify it for your use case. In your workload health check script, you must complete the following minimum steps: - -* Set the environment variables. -* Define the user workload namespaces. -* List the expected pod count. - -[IMPORTANT] -==== -Choose a name prefix for your application that ensures it runs after the `40_microshift_running_check.sh` script, which implements the {product-title} health check procedure for its core services. -==== - -.Example workload health check script -[source, bash] ----- -# #!/bin/bash -set -e - -SCRIPT_NAME=$(basename $0) -PODS_NS_LIST=( ) -PODS_CT_LIST=( ) -# Update these two lines with at least one namespace and the pod counts that are specific to your workloads. Use the kubernetes where your workload is deployed. - -# Set greenboot to read and execute the workload health check functions library. -source /usr/share/microshift/functions/greenboot.sh - -# Set the exit handler to log the exit status. -trap 'script_exit' EXIT - -# Set the script exit handler to log a `FAILURE` or `FINISHED` message depending on the exit status of the last command. -# args: None -# return: None -function script_exit() { - [ "$?" -ne 0 ] && status=FAILURE || status=FINISHED - echo $status -} - -# Set the system to automatically stop the script if the user running it is not 'root'. 
-if [ $(id -u) -ne 0 ] ; then - echo "The '${SCRIPT_NAME}' script must be run with the 'root' user privileges" - exit 1 -fi - -echo "STARTED" - -# Set the script to stop without reporting an error if the MicroShift service is not running. -if [ $(systemctl is-enabled microshift.service 2>/dev/null) != "enabled" ] ; then - echo "MicroShift service is not enabled. Exiting..." - exit 0 -fi - -# Set the wait timeout for the current check based on the boot counter. -WAIT_TIMEOUT_SECS=$(get_wait_timeout) - -# Set the script to wait for the pod images to be downloaded. -for i in ${!PODS_NS_LIST[@]}; do - CHECK_PODS_NS=${PODS_NS_LIST[$i]} - - echo "Waiting ${WAIT_TIMEOUT_SECS}s for pod image(s) from the ${CHECK_PODS_NS} namespace to be downloaded" - wait_for ${WAIT_TIMEOUT_SECS} namespace_images_downloaded -done - -# Set the script to wait for pods to enter ready state. -for i in ${!PODS_NS_LIST[@]}; do - CHECK_PODS_NS=${PODS_NS_LIST[$i]} - CHECK_PODS_CT=${PODS_CT_LIST[$i]} - - echo "Waiting ${WAIT_TIMEOUT_SECS}s for ${CHECK_PODS_CT} pod(s) from the ${CHECK_PODS_NS} namespace to be in 'Ready' state" - wait_for ${WAIT_TIMEOUT_SECS} namespace_pods_ready -done - -# Verify that pods are not restarting by running, which could indicate a crash loop. -for i in ${!PODS_NS_LIST[@]}; do - CHECK_PODS_NS=${PODS_NS_LIST[$i]} - - echo "Checking pod restart count in the ${CHECK_PODS_NS} namespace" - namespace_pods_not_restarting ${CHECK_PODS_NS} -done ----- diff --git a/modules/microshift-greenboot-dir-structure.adoc b/modules/microshift-greenboot-dir-structure.adoc index 1a1cf251e048..b96dd3b82d98 100644 --- a/modules/microshift-greenboot-dir-structure.adoc +++ b/modules/microshift-greenboot-dir-structure.adoc @@ -46,4 +46,4 @@ If you customize the values of any environment variable in the `/etc/greenboot/g * To retain customizations when building system images with {microshift-short}, add the `greenboot.conf` file to a blueprint. 
* To retain customizations when using an RPM installation, apply changes to the `greenboot.conf` file after you install {microshift-short} and greenboot RPMs. -==== \ No newline at end of file +==== diff --git a/modules/microshift-greenboot-dir-use-for-scripts.adoc b/modules/microshift-greenboot-dir-use-for-scripts.adoc new file mode 100644 index 000000000000..0a84e5a8f187 --- /dev/null +++ b/modules/microshift-greenboot-dir-use-for-scripts.adoc @@ -0,0 +1,18 @@ +// Module included in the following assemblies: +// +// * microshift_install_get_ready/microshift-greenboot.adoc + +:_mod-docs-content-type: CONCEPT +[id="microshift-greenboot-dir-use-for-scripts_{context}"] += How greenboot uses directories to run scripts + +Greenboot is the generic health check framework for the `systemd` service on `rpm-ostree` systems such as {op-system-ostree-first}. This framework is included in {microshift-short} installations with the `microshift-greenboot` and `greenboot-default-health-checks` RPM packages. + +Greenboot health checks run at various times to assess system health and automate a rollback on `rpm-ostree` systems to the last healthy state in cases of software trouble, for example: + +* Default health check scripts run each time the system starts. +* In addition to the default health checks, you can write, install, and configure application health check scripts that also run every time the system starts. +* Greenboot can reduce your risk of being locked out of edge devices during updates and prevent a significant interruption of service if an update fails. +* When a failure is detected, the system boots into the last known working configuration using the `rpm-ostree` rollback capability. This automation is especially useful for edge devices where direct serviceability is either limited or non-existent. + +A {microshift-short} application health check script is included in the `microshift-greenboot` RPM. 
The `greenboot-default-health-checks` RPM includes health check scripts verifying that DNS and `ostree` services are accessible. You can create your own health check scripts for the workloads you are running. You can write one that verifies that an application has started, for example. diff --git a/modules/microshift-greenboot-health-check-command.adoc b/modules/microshift-greenboot-health-check-command.adoc new file mode 100644 index 000000000000..4ff07af169df --- /dev/null +++ b/modules/microshift-greenboot-health-check-command.adoc @@ -0,0 +1,73 @@ +//Module included in the following assemblies: +// +//* microshift_running_apps/microshift-greenboot-workload-health-checks.adoc + +:_mod-docs-content-type: REFERENCE +[id="microshift-greenboot-microshift-health-check-command_{context}"] += How to use the {microshift-short} health check command + +The `microshift healthcheck` command checks whether a workload of the provided type exists and verifies its +status for the specified timeout duration. The number of ready replicas, that is, pods, must match the expected number. + +To run the `microshift healthcheck` command successfully, meet the following prerequisites: + +* Execute commands from a root user account. +* Enable the {microshift-short} service. + +You can add the following options to the `microshift healthcheck` command: + +* `-v=2` to increase verbosity of the output +* `--timeout="${WAIT_TIMEOUT_SECS}s"` to override the default 300s timeout value +* `--namespace `_<namespace>_` to specify the namespace of the workloads +* `--deployments `_<deployment>_` to check the readiness of a specific deployment ++ +.Example command +[source,terminal] +---- +$ sudo microshift healthcheck -v=2 --timeout="300s" --namespace busybox --deployments busybox-deployment +---- ++ +.Example output +[source,text] +---- +??? I0410 08:54:03.766578 5898 service.go:29] microshift.service is enabled +??? I0410 08:54:03.766699 5898 service.go:31] Waiting 5m0s for microshift.service to be ready +??? 
I0410 08:54:03.768794 5898 service.go:38] microshift.service is ready +??? I0410 08:54:03.770585 5898 utils.go:34] Waiting for 1 goroutines +??? I0410 08:54:03.770955 5898 workloads.go:94] Waiting 5m0s for deployment/busybox-deployment in busybox +??? I0410 08:54:03.777830 5898 workloads.go:132] Deployment/busybox-deployment in busybox is ready +??? I0410 08:54:03.777858 5898 healthcheck.go:75] Workloads are ready +---- + +The `microshift healthcheck` command also accepts the following additional parameters to specify other kinds +of workloads: + +* `--daemonsets` +* `--statefulsets` +* These options take a comma-delimited list of resources, for example, `--daemonsets ovnkube-master,ovnkube-node`. + +Alternatively, a `--custom` option can be used with a `JSON` string, for example: + +[source,terminal] +---- +$ sudo microshift healthcheck --custom '{"openshift-storage":{"deployments": + ["lvms-operator"], "daemonsets": ["vg-manager"]}, "openshift-ovn-kubernetes": + {"daemonsets": ["ovnkube-master", "ovnkube-node"]}}' +---- + +.Example output +[source,text] +---- +??? I0410 08:54:25.291059 5979 service.go:29] microshift.service is enabled +??? I0410 08:54:25.291167 5979 service.go:31] Waiting 5m0s for microshift.service to be ready +??? I0410 08:54:25.293188 5979 service.go:38] microshift.service is ready +??? I0410 08:54:25.294331 5979 workloads.go:58] Waiting 5m0s for daemonset/ovnkube-node in openshift-ovn-kubernetes +??? I0410 08:54:25.294351 5979 workloads.go:58] Waiting 5m0s for daemonset/ovnkube-master in openshift-ovn-kubernetes +??? I0410 08:54:25.294331 5979 workloads.go:58] Waiting 5m0s for daemonset/vg-manager in openshift-storage +??? I0410 08:54:25.294341 5979 workloads.go:94] Waiting 5m0s for deployment/lvms-operator in openshift-storage +??? I0410 08:54:25.309739 5979 workloads.go:89] Daemonset/ovnkube-node in openshift-ovn-kubernetes is ready +??? I0410 08:54:25.310213 5979 workloads.go:89] Daemonset/vg-manager in openshift-storage is ready +??? 
I0410 08:54:25.310731 5979 workloads.go:132] Deployment/lvms-operator in openshift-storage is ready +??? I0410 08:54:25.311017 5979 workloads.go:89] Daemonset/ovnkube-master in openshift-ovn-kubernetes is ready +??? I0410 08:54:25.311189 5979 healthcheck.go:52] Workloads are ready +---- diff --git a/modules/microshift-greenboot-health-check-log.adoc b/modules/microshift-greenboot-health-check-log.adoc index 40e3a855e47e..c2461f6485d7 100644 --- a/modules/microshift-greenboot-health-check-log.adoc +++ b/modules/microshift-greenboot-health-check-log.adoc @@ -16,7 +16,7 @@ You can manually access the output of health checks in the system log by using t ---- $ sudo journalctl -o cat -u greenboot-healthcheck.service ---- - ++ .Example output of a failed health check [source,terminal] ---- @@ -34,4 +34,4 @@ Waiting 300s for MicroShift service to be active and not failed FAILURE ... ... ----- \ No newline at end of file +---- diff --git a/modules/microshift-greenboot-how-workload-health-check-scripts-work.adoc b/modules/microshift-greenboot-how-workload-health-check-scripts-work.adoc deleted file mode 100644 index a53cbf301d27..000000000000 --- a/modules/microshift-greenboot-how-workload-health-check-scripts-work.adoc +++ /dev/null @@ -1,26 +0,0 @@ -//Module included in the following assemblies: -// -//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc - -:_mod-docs-content-type: CONCEPT -[id="microshift-greenboot-how-workload-health-check-scripts-work_{context}"] -= How workload health check scripts work - -The workload or application health check script described in this tutorial uses the {microshift-short} health check functions that are available in the `/usr/share/microshift/functions/greenboot.sh` file. This enables you to reuse procedures already implemented for the {microshift-short} core services. - -The script starts by running checks that the basic functions of the workload are operating as expected. 
To run the script successfully: - -* Execute the script from a root user account. -* Enable the {microshift-short} service. - -The health check performs the following actions: - -* Gets a wait timeout of the current boot cycle for the `wait_for` function. -* Calls the `namespace_images_downloaded` function to wait until pod images are available. -* Calls the `namespace_pods_ready` function to wait until pods are ready. -* Calls the `namespace_pods_not_restarting` function to verify pods are not restarting. - -[NOTE] -==== -Restarting pods can indicate a crash loop. -==== \ No newline at end of file diff --git a/modules/microshift-greenboot-how-workload-health-checks-work.adoc b/modules/microshift-greenboot-how-workload-health-checks-work.adoc new file mode 100644 index 000000000000..af973afae800 --- /dev/null +++ b/modules/microshift-greenboot-how-workload-health-checks-work.adoc @@ -0,0 +1,30 @@ +//Module included in the following assemblies: +// +//* microshift_running_apps/microshift-greenboot-workload-health-checks.adoc + +:_mod-docs-content-type: CONCEPT +[id="microshift-greenboot-how-workload-health-checks-work_{context}"] += How workload health checks work + +Greenboot health checks are helpful on edge devices where direct serviceability is either limited or non-existent. You can use greenboot health checks to assess the health of your workloads and applications. These additional health checks are useful for software problem detection and automatic system rollbacks. + +Workload or application health checks can use the {microshift-short} basic health check functions already implemented for the {microshift-short} core services. Creating your own comprehensive scripts for your applications is recommended. For example, you can write a script that verifies that a service has started. + +You can also use the `microshift healthcheck` command, which checks that the basic functions of the workload are operating as expected. 
+ +[IMPORTANT] +==== +The following functions related to checking workload health in `/usr/share/microshift/functions/greenboot.sh` are deprecated and planned for removal in a future release: + +* `wait_for` +* `namespace_images_downloaded` +* `namespace_deployment_ready` +* `namespace_daemonset_ready` +* `namespace_pods_ready` +* `namespace_pods_not_restarting` +* `print_failure_logs` +* `log_failure_cmd` +* `log_script_exit` +* `lvmsDriverShouldExist` +* `csiComponentShouldBeDeploy` +==== diff --git a/modules/microshift-greenboot-included-health-checks.adoc b/modules/microshift-greenboot-included-health-checks.adoc index 61a0bc8b5277..c9ae7a34dfc8 100644 --- a/modules/microshift-greenboot-included-health-checks.adoc +++ b/modules/microshift-greenboot-included-health-checks.adoc @@ -1,12 +1,12 @@ //Module included in the following assemblies: // -//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc +//* microshift_running_apps/microshift-greenboot-workload-health-checks.adoc :_mod-docs-content-type: CONCEPT [id="microshift-greenboot-included-health-checks_{context}"] = Included greenboot health checks -Health check scripts are available in `/usr/lib/greenboot/check`, a read-only directory in RPM-OSTree systems. The following health checks are included with the `greenboot-default-health-checks` framework. +Health check scripts are available in `/usr/lib/greenboot/check`, a read-only directory in {op-system-ostree-first} {op-system-image} systems. The following health checks are included with the `greenboot-default-health-checks` framework. 
* Check if repository URLs are still DNS solvable: + diff --git a/modules/microshift-greenboot-microshift-health-script.adoc b/modules/microshift-greenboot-microshift-health-script.adoc index 752b761199e1..7757c46c2053 100644 --- a/modules/microshift-greenboot-microshift-health-script.adoc +++ b/modules/microshift-greenboot-microshift-health-script.adoc @@ -30,29 +30,14 @@ The `40_microshift_running_check.sh` health check script only performs validatio |Next |`exit 1` -|Wait for Kubernetes API health endpoints to be working and receiving traffic +|For each core namespace, wait for readiness of the workload |Next |`exit 1` - -|Wait for any pod to start -|Next -|`exit 1` - -|For each core namespace, wait for images to be pulled -|Next -|`exit 1` - -|For each core namespace, wait for pods to be ready -|Next -|`exit 1` - -|For each core namespace, check if pods are not restarting -|`exit 0` -|`exit 1` |=== [id="validation-wait-period_{context}"] == Validation wait period + The wait period in each validation is five minutes by default. After the wait period, if the validation has not succeeded, it is declared a failure. This wait period is incrementally increased by the base wait period after each boot in the verification loop. * You can override the base-time wait period by setting the `MICROSHIFT_WAIT_TIMEOUT_SEC` environment variable in the `/etc/greenboot/greenboot.conf` configuration file. For example, you can change the wait time to three minutes by resetting the value to 180 seconds, such as `MICROSHIFT_WAIT_TIMEOUT_SEC=180`. 
diff --git a/modules/microshift-greenboot-prerollback-log.adoc b/modules/microshift-greenboot-prerollback-log.adoc index e7beb9c314c3..141521fa4555 100644 --- a/modules/microshift-greenboot-prerollback-log.adoc +++ b/modules/microshift-greenboot-prerollback-log.adoc @@ -7,7 +7,7 @@ [id="microshift-greenboot-access-prerollback-check_{context}"] = Accessing prerollback health check output in the system log -You can access the output of health check scripts in the system log. For example, check the results of a prerollback script using the following procedure. +You can access the output of health check scripts in the system log. For example, check the results of a pre-rollback script using the following procedure. .Procedure @@ -17,7 +17,7 @@ You can access the output of health check scripts in the system log. For example ---- $ sudo journalctl -o cat -u redboot-task-runner.service ---- - ++ .Example output of a prerollback script [source,terminal] ---- @@ -47,3 +47,8 @@ Script '40_microshift_pre_rollback.sh' SUCCESS FINISHED redboot-task-runner.service: Deactivated successfully. ---- ++ +[NOTE] +==== +In case of a rollback, the pre-rollback script runs the `sudo microshift-cleanup-data --ovn` command to prepare the system for a potential software downgrade. 
+==== diff --git a/modules/microshift-greenboot-testing-workload-script.adoc b/modules/microshift-greenboot-testing-workload-script.adoc index 89ee066fd156..483c9d4ebbf1 100644 --- a/modules/microshift-greenboot-testing-workload-script.adoc +++ b/modules/microshift-greenboot-testing-workload-script.adoc @@ -1,11 +1,13 @@ //Module included in the following assemblies: // -//* microshift_running_apps/microshift-greenboot-workload-scripts.adoc +//* microshift_running_apps/microshift-greenboot-workload-health-checks.adoc :_mod-docs-content-type: PROCEDURE [id="microshift-greenboot-test-workload-health-check-script_{context}"] = Testing a workload health check script +The output of the greenboot workload health check script varies with the host system type. Example outputs for {op-system-full} system types are included for reference only. + .Prerequisites * You have root access. @@ -13,11 +15,6 @@ * You created a health check script for the workload. * The {microshift-short} service is enabled. -[NOTE] -==== -The output of the greenboot workload health check script varies with the host system type. Example outputs are included for your reference. -==== - .Procedure . To test that greenboot is running a health check script file, reboot the host by running the following command: @@ -39,40 +36,189 @@ $ sudo journalctl -o cat -u greenboot-healthcheck.service {microshift-short} core service health checks run before the workload health checks. ==== + -.Example output for a RHEL for Edge system +.Example output for an {op-system-image} system [source,terminal] ---- +Starting greenboot Health Checks Runner... +Running Required Health Check Scripts... +Script '00_required_scripts_start.sh' SUCCESS +Running Wanted Health Check Scripts... +Script '00_wanted_scripts_start.sh' SUCCESS +Running Required Health Check Scripts... +-------------------- +DEPRECATION NOTICE: +/usr/share/microshift/functions/greenboot.sh is now deprecated and will be removed in future release. 
+Planned removal: MicroShift 4.21 +As a replacement consider using 'microshift healthcheck' command +-------------------- +STARTED GRUB boot variables: boot_success=0 -boot_indeterminate=0 Greenboot variables: GREENBOOT_WATCHDOG_CHECK_ENABLED=true -MICROSHIFT_WAIT_TIMEOUT_SEC=600 +MICROSHIFT_GREENBOOT_FAIL_MARKER=/run/microshift-greenboot-healthcheck-failed System installation type: -ostree +bootc System installation status: -* rhel 19619bd269094510180c845c44d0944fd9aa15925376f249c4d680a3355e51ae.0 - Version: 9.4 - origin refspec: edge:rhel-9.4-microshift-4.18 +bootcHost +??? I0403 11:54:30.526488 979 service.go:29] microshift.service is enabled +??? I0403 11:54:30.527145 979 service.go:31] Waiting 10m0s for microshift.service to be ready +??? I0403 11:58:52.530299 979 service.go:38] microshift.service is ready +??? I0403 11:58:52.532292 979 net.go:79] host gateway IP address: 192.168.112.125 +??? I0403 11:58:52.555077 979 microshift_core_workloads.go:71] vgs reported: {"report":[{"vg":[{"vg_name":"rhel"}]}],"log":[]} +??? I0403 11:58:52.555138 979 microshift_core_workloads.go:93] Detected 1 volume group (rhel) - LVMS is expected +??? I0403 11:58:52.555143 979 microshift_core_workloads.go:126] Configured optional CSI components: [] +??? I0403 11:58:52.555147 979 microshift_core_workloads.go:117] At least one CSI Component is enabled +??? I0403 11:58:52.555770 979 utils.go:34] Waiting for 9 goroutines +??? I0403 11:58:52.555791 979 workloads.go:94] Waiting 10m0s for deployment/service-ca in openshift-service-ca +??? I0403 11:58:52.555890 979 workloads.go:58] Waiting 10m0s for daemonset/ovnkube-master in openshift-ovn-kubernetes +??? I0403 11:58:52.555999 979 workloads.go:94] Waiting 10m0s for deployment/router-default in openshift-ingress +??? I0403 11:58:52.556096 979 workloads.go:58] Waiting 10m0s for daemonset/dns-default in openshift-dns +??? I0403 11:58:52.556244 979 workloads.go:58] Waiting 10m0s for daemonset/ovnkube-node in openshift-ovn-kubernetes +??? 
I0403 11:58:52.556330 979 workloads.go:94] Waiting 10m0s for deployment/lvms-operator in openshift-storage +??? I0403 11:58:52.556382 979 workloads.go:58] Waiting 10m0s for daemonset/vg-manager in openshift-storage +??? I0403 11:58:52.556425 979 workloads.go:94] Waiting 10m0s for deployment/csi-snapshot-controller in kube-system +??? I0403 11:58:52.556474 979 workloads.go:58] Waiting 10m0s for daemonset/node-resolver in openshift-dns +??? I0403 11:58:52.574284 979 workloads.go:89] Daemonset/ovnkube-node in openshift-ovn-kubernetes is ready +??? I0403 11:58:52.574344 979 workloads.go:89] Daemonset/dns-default in openshift-dns is ready +??? I0403 11:59:12.871058 979 workloads.go:89] Daemonset/node-resolver in openshift-dns is ready +??? I0403 11:59:12.871621 979 workloads.go:89] Daemonset/ovnkube-master in openshift-ovn-kubernetes is ready +??? I0403 11:59:12.871748 979 workloads.go:132] Deployment/csi-snapshot-controller in kube-system is ready +??? I0403 11:59:25.175015 979 workloads.go:132] Deployment/service-ca in openshift-service-ca is ready +??? I0403 11:59:42.559264 979 workloads.go:132] Deployment/lvms-operator in openshift-storage is ready +??? I0403 11:59:52.557786 979 workloads.go:132] Deployment/router-default in openshift-ingress is ready +??? I0403 11:59:52.558489 979 workloads.go:89] Daemonset/vg-manager in openshift-storage is ready +??? I0403 11:59:52.558505 979 healthcheck.go:28] MicroShift is ready +Script '40_microshift_running_check.sh' SUCCESS +-------------------- +DEPRECATION NOTICE: +/usr/share/microshift/functions/greenboot.sh is now deprecated and will be removed in future release. 
+Planned removal: MicroShift 4.21 +As a replacement consider using 'microshift healthcheck' command +-------------------- +STARTED +GRUB boot variables: +boot_success=0 +Greenboot variables: +GREENBOOT_WATCHDOG_CHECK_ENABLED=true +MICROSHIFT_GREENBOOT_FAIL_MARKER=/run/microshift-greenboot-healthcheck-failed +System installation type: +bootc +System installation status: +bootcHost +??? I0403 11:59:52.750474 4059 service.go:29] microshift.service is enabled +??? I0403 11:59:52.750873 4059 service.go:31] Waiting 10m0s for microshift.service to be ready +??? I0403 11:59:52.752273 4059 service.go:38] microshift.service is ready +??? I0403 11:59:52.753263 4059 utils.go:34] Waiting for 1 goroutines +??? I0403 11:59:52.753393 4059 workloads.go:94] Waiting 10m0s for deployment/kserve-controller-manager in redhat-ods-applications +??? I0403 12:00:02.755475 4059 workloads.go:132] Deployment/kserve-controller-manager in redhat-ods-applications is ready +??? I0403 12:00:02.755605 4059 healthcheck.go:75] Workloads are ready +Script '41_microshift_running_check_ai_model_serving.sh' SUCCESS +-------------------- +DEPRECATION NOTICE: +/usr/share/microshift/functions/greenboot.sh is now deprecated and will be removed in future release. +Planned removal: MicroShift 4.21 +As a replacement consider using 'microshift healthcheck' command +-------------------- +STARTED +GRUB boot variables: +boot_success=0 +Greenboot variables: +GREENBOOT_WATCHDOG_CHECK_ENABLED=true +MICROSHIFT_GREENBOOT_FAIL_MARKER=/run/microshift-greenboot-healthcheck-failed +System installation type: +bootc +System installation status: +bootcHost +??? I0403 12:00:02.896949 4128 service.go:29] microshift.service is enabled +??? I0403 12:00:02.897208 4128 service.go:31] Waiting 10m0s for microshift.service to be ready +??? I0403 12:00:02.899492 4128 service.go:38] microshift.service is ready +??? I0403 12:00:02.900279 4128 utils.go:34] Waiting for 2 goroutines +??? 
I0403 12:00:02.900363 4128 workloads.go:94] Waiting 10m0s for deployment/istiod-openshift-gateway-api in openshift-gateway-api +??? I0403 12:00:02.900948 4128 workloads.go:94] Waiting 10m0s for deployment/servicemesh-operator3 in openshift-gateway-api +??? I0403 12:00:42.913338 4128 workloads.go:132] Deployment/servicemesh-operator3 in openshift-gateway-api is ready +??? I0403 12:01:12.902297 4128 workloads.go:132] Deployment/istiod-openshift-gateway-api in openshift-gateway-api is ready +??? I0403 12:01:12.902418 4128 healthcheck.go:75] Workloads are ready +Script '41_microshift_running_check_gateway_api.sh' SUCCESS +-------------------- +DEPRECATION NOTICE: +/usr/share/microshift/functions/greenboot.sh is now deprecated and will be removed in future release. +Planned removal: MicroShift 4.21 +As a replacement consider using 'microshift healthcheck' command +-------------------- +STARTED +GRUB boot variables: +boot_success=0 +Greenboot variables: +GREENBOOT_WATCHDOG_CHECK_ENABLED=true +MICROSHIFT_GREENBOOT_FAIL_MARKER=/run/microshift-greenboot-healthcheck-failed +System installation type: +bootc +System installation status: +bootcHost +??? I0403 12:01:13.057998 4772 service.go:29] microshift.service is enabled +??? I0403 12:01:13.058107 4772 service.go:31] Waiting 10m0s for microshift.service to be ready +??? I0403 12:01:13.059839 4772 service.go:38] microshift.service is ready +??? I0403 12:01:13.060617 4772 utils.go:34] Waiting for 2 goroutines +??? I0403 12:01:13.060644 4772 workloads.go:58] Waiting 10m0s for daemonset/dhcp-daemon in openshift-multus +??? I0403 12:01:13.060686 4772 workloads.go:58] Waiting 10m0s for daemonset/multus in openshift-multus +??? I0403 12:01:13.069341 4772 workloads.go:89] Daemonset/multus in openshift-multus is ready +??? I0403 12:01:13.069450 4772 workloads.go:89] Daemonset/dhcp-daemon in openshift-multus is ready +??? 
I0403 12:01:13.069503 4772 healthcheck.go:75] Workloads are ready +Script '41_microshift_running_check_multus.sh' SUCCESS +-------------------- +DEPRECATION NOTICE: +/usr/share/microshift/functions/greenboot.sh is now deprecated and will be removed in future release. +Planned removal: MicroShift 4.21 +As a replacement consider using 'microshift healthcheck' command +-------------------- +STARTED +GRUB boot variables: +boot_success=0 +Greenboot variables: +GREENBOOT_WATCHDOG_CHECK_ENABLED=true +MICROSHIFT_GREENBOOT_FAIL_MARKER=/run/microshift-greenboot-healthcheck-failed +System installation type: +bootc +System installation status: +bootcHost +??? I0403 12:01:13.206381 4804 service.go:29] microshift.service is enabled +??? I0403 12:01:13.206583 4804 service.go:31] Waiting 10m0s for microshift.service to be ready +??? I0403 12:01:13.207979 4804 service.go:38] microshift.service is ready +??? I0403 12:01:13.208717 4804 utils.go:34] Waiting for 2 goroutines +??? I0403 12:01:13.208779 4804 workloads.go:94] Waiting 10m0s for deployment/catalog-operator in openshift-operator-lifecycle-manager +??? I0403 12:01:13.209285 4804 workloads.go:94] Waiting 10m0s for deployment/olm-operator in openshift-operator-lifecycle-manager +??? I0403 12:01:13.215578 4804 workloads.go:132] Deployment/catalog-operator in openshift-operator-lifecycle-manager is ready +??? I0403 12:01:13.215673 4804 workloads.go:132] Deployment/olm-operator in openshift-operator-lifecycle-manager is ready +??? I0403 12:01:13.215684 4804 healthcheck.go:75] Workloads are ready +Script '50_microshift_running_check_olm.sh' SUCCESS +Running Wanted Health Check Scripts... +Finished greenboot Health Checks Runner. ---- + -.Example output for an image mode for RHEL system -[source,terminal] +.Example partial output for a {op-system-ostree} system +[source,terminal,subs="+attributes"] ---- +#... 
GRUB boot variables: boot_success=0 +boot_indeterminate=0 Greenboot variables: GREENBOOT_WATCHDOG_CHECK_ENABLED=true MICROSHIFT_WAIT_TIMEOUT_SEC=600 System installation type: -bootc +ostree System installation status: -bootcHost +* rhel 19619bd269094510180c845c44d0944fd9aa15925376f249c4d680a3355e51ae.0 + Version: {op-system-version} + origin refspec: edge:rhel-{op-system-version}-microshift-{product-version} +#... ---- + -.Example output for an RPM system +.Example partial output for an RPM system [source,terminal] ---- +#... GRUB boot variables: boot_success=1 boot_indeterminate=0 @@ -82,4 +228,5 @@ System installation type: RPM System installation status: Not an ostree / bootc system +#... ---- diff --git a/modules/microshift-greenboot-updates-workloads.adoc b/modules/microshift-greenboot-updates-workloads.adoc index 4329369b7ad7..66e70fe6ca70 100644 --- a/modules/microshift-greenboot-updates-workloads.adoc +++ b/modules/microshift-greenboot-updates-workloads.adoc @@ -13,4 +13,4 @@ Health check scripts for updates are installed into the `/etc/greenboot/check/re [IMPORTANT] ==== Wait until after an update is declared valid before starting third-party workloads. If a rollback is performed after workloads start, you can lose data. Some third-party workloads create or update data on a device before an update is complete. Upon rollback, the file system reverts to its state before the update. 
-==== \ No newline at end of file +==== diff --git a/modules/microshift-greenboot-workload-health-script-ex.adoc b/modules/microshift-greenboot-workload-health-script-ex.adoc new file mode 100644 index 000000000000..58a903d82c70 --- /dev/null +++ b/modules/microshift-greenboot-workload-health-script-ex.adoc @@ -0,0 +1,55 @@ +//Module included in the following assemblies: +// +//* microshift_running_apps/microshift-greenboot-workload-health-checks.adoc + +:_mod-docs-content-type: REFERENCE +[id="microshift-greenboot-workload-health-check-script-ex_{context}"] += Workload max duration or timeout script example + +The following example uses the {microshift-short} core services health check script as a template. + +[id="microshift-greenboot-app-health-check-basic-prereqs_{context}"] +== Basic prerequisites for creating a health check script + +* The workload must be installed. +* You must have root access. + +[id="microshift-greenboot-app-health-check-ex_{context}"] +== Example and functional requirements + +You can start with the following example health check script. Extend it for your use case. In your custom workload health check script, you must define the relevant namespace, deployment, `daemonset`, and `statefulset`. + +[IMPORTANT] +==== +Choose a name prefix for your application health check script that ensures the script runs after the `40_microshift_running_check.sh` script, which implements the {microshift-short} health check procedure for its core services. 
+==== + +.Example greenboot health check script +[source, bash] +---- +#!/bin/bash +set -e + +SCRIPT_NAME=$(basename "$0") + +# Load the workload health check functions library +source /usr/share/microshift/functions/greenboot.sh + +# Stop the script if the user running it is not 'root' +if [ "$(id -u)" -ne 0 ] ; then + echo "The '${SCRIPT_NAME}' script must be run with 'root' user privileges" + exit 1 +fi + +echo "STARTED" + +# Set the wait timeout for the current check based on the boot counter +WAIT_TIMEOUT_SECS=$(get_wait_timeout) + +/usr/bin/microshift healthcheck -v=2 --timeout="${WAIT_TIMEOUT_SECS}s" --namespace busybox --deployments busybox-deployment +---- + +[IMPORTANT] +==== +Functions related to checking workload health previously included in the `/usr/share/microshift/functions/greenboot.sh` script file are deprecated. You can write a custom script, or use the `microshift healthcheck` command with various options instead. See "How workload health check scripts work" for more information. +==== diff --git a/modules/microshift-greenboot-workloads-validation.adoc b/modules/microshift-greenboot-workloads-validation.adoc index d73692812cd2..a26a6d2a4197 100644 --- a/modules/microshift-greenboot-workloads-validation.adoc +++ b/modules/microshift-greenboot-workloads-validation.adoc @@ -16,9 +16,10 @@ After a successful start, greenboot sets the variable `boot_success=` to `1` in ---- $ sudo grub2-editenv - list | grep ^boot_success ---- - ++ .Example output for a successful system start [source,terminal] ---- -boot_success=1 ----- \ No newline at end of file +boot_success=1 <1> +---- +<1> If your command returns `boot_success=0`, either the greenboot health check is still running, or the update failed.
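
Reviewer note: the `boot_success` callout above could be illustrated with a small helper. The following is a minimal sketch, not part of this change. The function name `interpret_boot_success` and its three status labels are hypothetical, chosen only for illustration. It reads the output of `grub2-editenv - list` on standard input, so it can be exercised without root access.

```bash
#!/bin/bash
# Hypothetical helper (not shipped with MicroShift or greenboot):
# reads `grub2-editenv - list` output on stdin and prints a status label.
interpret_boot_success() {
    local value
    # Extract the value of the last boot_success= line, if one exists
    value=$(sed -n 's/^boot_success=//p' | tail -n 1)
    case "${value}" in
        1) echo "healthy" ;;             # greenboot declared the boot successful
        0) echo "pending-or-failed" ;;   # checks still running, or the update failed
        *) echo "unknown" ;;             # variable missing or unexpected value
    esac
}

# Example usage on a host (requires root):
#   sudo grub2-editenv - list | interpret_boot_success
```

Because a value of `0` is ambiguous, a caller would likely combine this with the `greenboot-healthcheck.service` journal output before treating the result as a failure.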