OS update is never triggered #1676

Open
ldevulder opened this issue Feb 11, 2025 · 3 comments
Labels
area/upgrade kind/bug Something isn't working

Comments

@ldevulder
Contributor

What steps did you take and what happened:
I created a new OS channel pointing to newer image versions, then created an update group to upgrade the OS on the nodes. The upgrade never started.

What did you expect to happen:
The OS image on the nodes is updated from SL Micro 6.0 to 6.1.

Anything else you would like to add:
The elemental-operator logs are attached: upgrade.log.

Environment:

  • Elemental release version (use cat /etc/os-release):
NAME="SL-Micro"
VERSION="6.0"
VERSION_ID="6.0"
PRETTY_NAME="SUSE Linux Micro 6.0"
ID="sl-micro"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sl-micro:6.0"
HOME_URL="https://www.suse.com/products/micro/"
DOCUMENTATION_URL="https://documentation.suse.com/sl-micro/6.0/"
IMAGE_REPO="registry.suse.com/suse/sl-micro/6.0/baremetal-os-container"
IMAGE_TAG="2.1.3-4.7"
IMAGE="registry.suse.com/suse/sl-micro/6.0/baremetal-os-container:2.1.3-4.7"
TIMESTAMP=20241107104849
GRUB_ENTRY_NAME="SUSE Linux Micro"
  • Rancher version: Prime v2.10.2
  • Kubernetes version (use kubectl version): v1.30.6+k3s1 for Rancher Manager and v1.30.8+k3s1 for the Elemental cluster
  • Cloud provider or hardware configuration: Virtual Machines.

How to reproduce the issue:

  • Deploy Rancher Manager Stable or Prime version
  • Create a simple Elemental cluster (1 node is enough) with Stable version (operator v1.6.5, OS channel v6.0)
  • Add the Staging OS channel:
apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSVersionChannel
metadata:
  name: sl-micro-6.1-baremetal-channel
  namespace: fleet-default
spec:
  options:
    image: registry.opensuse.org/isv/rancher/elemental/staging/containers/rancher/elemental-channel/sl-micro:6.1-baremetal
  syncInterval: 1h
  type: custom
  • Create an update group:
apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSImage
metadata:
  name: upgrade-os-to-6.1
  namespace: fleet-default
spec:
  clusterTargets:
    - clusterName: test-cluster
  managedOSVersionName: baremetal-v2.2.0-3.7-os
  • Everything looks good, but the update is never triggered
@ldevulder ldevulder added area/upgrade kind/bug Something isn't working labels Feb 11, 2025
@ldevulder ldevulder moved this to 🗳️ To Do in Elemental Feb 11, 2025
@ldevulder
Contributor Author

I tested with the osImage option instead of managedOSVersionName, and it worked, but only with the dev image, not the staging one:

apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSImage
metadata:
  name: upgrade-os
  namespace: fleet-default
spec:
  clusterTargets:
    - clusterName: test-cluster
  osImage: registry.opensuse.org/isv/rancher/elemental/dev/containers/suse/sl-micro/6.1/baremetal-os-container:latest

I checked, and the staging image is available on registry.opensuse.org, just like the dev one.

@ldevulder
Contributor Author

I retested today: it worked with update-os-to-6-1 as the update group name, but it never starts with update-to-os-6.1. So it is clearly not a name-length issue; the dot in the name is the cause.

This looks more like a Rancher Manager issue, no?
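
As a stopgap, the group name can be made dot-free before the resource is created. A minimal sketch, assuming that replacing each dot with a hyphen is acceptable for the name in question; the helper `dotless_name` is hypothetical and not part of Elemental or Rancher:

```python
def dotless_name(name: str) -> str:
    """Return `name` with every dot replaced by a hyphen.

    Hypothetical helper for illustration only: it handles dots and
    nothing else, and does not validate the rest of the name.
    """
    return name.replace(".", "-")


if __name__ == "__main__":
    # The group name from the original report, made dot-free.
    print(dotless_name("upgrade-os-to-6.1"))
```

Two names that differ only by a dot versus a hyphen would collide after this rewrite, so it only suits names chosen by hand.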

@fgiudici
Member

The issue is the dot (.) in the resource name.
It traces back to the system-upgrade-controller resources created under the hood.

I tried creating a ManagedOSImage resource named test.name.

Checking the system-upgrade-controller logs on the child cluster:

kubectl -n cattle-system logs deployment/system-upgrade-controller

time="2025-02-18T08:42:50Z" level=error msg="error syncing 'cattle-system/os-upgrader-test.name': handler system-upgrade-controller: failed to create cattle-system/apply-os-upgrader-test.name-on-m-9f544b7d-e272-4fe5-9db2-2494f8 batch/v1, Kind=Job for system-upgrade-controller cattle-system/os-upgrader-test.name: Job.batch \"apply-os-upgrader-test.name-on-m-9f544b7d-e272-4fe5-9db2-2494f8\" is invalid: [spec.template.spec.volumes[2].name: Invalid value: \"secret-os-upgrader-test.name\": must not contain dots, spec.template.spec.containers[0].volumeMounts[2].name: Not found: \"secret-os-upgrader-test.name\", spec.template.spec.initContainers[0].volumeMounts[2].name: Not found: \"secret-os-upgrader-test.name\"], requeuing"

So I guess we should not allow dots (.) in upgrade group names (ManagedOSImage resources),
or at the very least document this limitation.
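
One way such a check could look, as a rough sketch and not the operator's actual validation: Kubernetes requires volume names to be DNS-1123 labels (lowercase alphanumerics and `-`, at most 63 characters), whereas resource names may be DNS-1123 subdomains and can therefore contain dots. The `secret-os-upgrader-` prefix is taken from the log above and is assumed to apply to every ManagedOSImage name; the function names are hypothetical.

```python
import re

# DNS-1123 label: lowercase alphanumerics or '-', starting and ending
# with an alphanumeric, at most 63 characters. Volume names must match.
DNS1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_LABEL_LEN = 63

# Prefix observed in the system-upgrade-controller log; assumed here
# to be prepended to every ManagedOSImage name.
VOLUME_PREFIX = "secret-os-upgrader-"


def is_dns1123_label(value: str) -> bool:
    """Return True if `value` is a valid DNS-1123 label."""
    return (
        len(value) <= MAX_LABEL_LEN
        and DNS1123_LABEL.fullmatch(value) is not None
    )


def check_group_name(name: str) -> list[str]:
    """Return the problems found in a ManagedOSImage name (empty if none)."""
    problems = []
    if "." in name:
        problems.append("name must not contain dots")
    volume_name = VOLUME_PREFIX + name
    if not is_dns1123_label(volume_name):
        problems.append(
            f"derived volume name {volume_name!r} is not a valid DNS-1123 label"
        )
    return problems


if __name__ == "__main__":
    for candidate in ("update-os-to-6-1", "test.name"):
        print(candidate, check_group_name(candidate))
```

Checking the derived volume name, not just the dot, also catches names that become too long once the prefix is added.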
