Skip to content

OCPNODE-3372: Update defaultRuntime doc to show options and set default to crun #2370

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

ngopalak-redhat
Copy link

@ngopalak-redhat ngopalak-redhat commented Jun 17, 2025

This PR addresses an issue in the generated documentation for the defaultRuntime option

Previously, the documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/machine_apis/containerruntimeconfig-machineconfiguration-openshift-io-v1 did not list crun and runc as valid options for defaultRuntime.

While crun became the default runtime in OpenShift 4.18 (as per release notes: https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/release_notes/ocp-4-18-release-notes), the internal constant reflecting this default was not updated, though the functionality itself was correct.

This change updates the internal documentation comments and the relevant constant to accurately reflect:

Valid options: crun and runc
Default runtime: crun

Based on the review I have also added enum validation so that the failure can be caught earlier. Unit tests are added to ensure that it doesn't break any existing functionality.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 17, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jun 17, 2025

@ngopalak-redhat: This pull request references OCPNODE-3372 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.20.0" version, but no target version was set.

In response to this:

This PR addresses an issue in the generated documentation for the defaultRuntime option

Previously, the documentation at https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/machine_apis/containerruntimeconfig-machineconfiguration-openshift-io-v1 did not list crun and runc as valid options for defaultRuntime.

While crun became the default runtime in OpenShift 4.18 (as per release notes: https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/release_notes/ocp-4-18-release-notes), the internal constant reflecting this default was not updated, though the functionality itself was correct.

This change updates the internal documentation comments and the relevant constant to accurately reflect:

Valid options: crun and runc
Default runtime: crun

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

openshift-ci bot commented Jun 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 17, 2025
Copy link
Contributor

openshift-ci bot commented Jun 17, 2025

Hello @ngopalak-redhat! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jun 17, 2025
@ngopalak-redhat ngopalak-redhat marked this pull request as ready for review June 17, 2025 04:35
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 17, 2025
@ngopalak-redhat
Copy link
Author

/assign @sairameshv @haircommander

@ngopalak-redhat
Copy link
Author

/test lint

@ngopalak-redhat ngopalak-redhat force-pushed the ngopalak/OCPNODE-3372_defaultRuntime branch from 0684334 to 6cf242b Compare June 17, 2025 05:28
@ngopalak-redhat
Copy link
Author

/retest

@ngopalak-redhat ngopalak-redhat force-pushed the ngopalak/OCPNODE-3372_defaultRuntime branch from 6cf242b to 4a4c7f8 Compare June 17, 2025 09:26
@openshift-ci openshift-ci bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 17, 2025
@ngopalak-redhat
Copy link
Author

/retest-required

@ngopalak-redhat
Copy link
Author

/test images

@ngopalak-redhat
Copy link
Author

/test minor-images

@ngopalak-redhat
Copy link
Author

/test images

@ngopalak-redhat
Copy link
Author

/test minor-images

@ngopalak-redhat
Copy link
Author

/retest-required

@ngopalak-redhat
Copy link
Author

/test e2e-azure

@ngopalak-redhat
Copy link
Author

/test okd-scos-e2e-aws-ovn

@ngopalak-redhat
Copy link
Author

/test minor-e2e-upgrade-minor

@ngopalak-redhat
Copy link
Author

/test okd-scos-e2e-aws-ovn

@ngopalak-redhat
Copy link
Author

Because the logic previously existed in MCO that would cause failures if this isn't set correctly it should be fine to restrict this validation at admission time.

That being said, we should ensure that we properly test that this change ratchets correctly. Here is an example of a ratcheting validation test:

- name: Should allow changing other fields when a persisted value is no longer valid
initialCRDPatches:
- op: remove
path: /spec/versions/0/schema/openAPIV3Schema/properties/spec/properties/nonZeroDefault/minimum # Minimum is normally 8
initial: |
apiVersion: example.openshift.io/v1
kind: StableConfigType
spec:
nonZeroDefault: 1
immutableField: foo
updated: |
apiVersion: example.openshift.io/v1
kind: StableConfigType
spec:
nonZeroDefault: 1
immutableField: foo
optionalImmutableField: foo
expected: |
apiVersion: example.openshift.io/v1
kind: StableConfigType
spec:
nonZeroDefault: 1
immutableField: foo
optionalImmutableField: foo

Addressed by adding ratchets test

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
spec:
containerRuntimeConfig:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should defaultRuntime be set to the empty string value here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@everettraven I've reverted the omitempty removal and kept defaultRuntime as a non-pointer type.

My reasoning is:

  • MCO handles both unset and empty string values for defaultRuntime as a default to crun.
  • Changing it to a pointer would require extensive code changes. May be across MCO and API repos.
  • Keeping omitempty ensures the field is omitted when unset, avoiding a confusing empty string for users.
  • If a user explicitly sets it to an empty string, it will be included in the manifest, which is acceptable as MCO handles it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MCO handles both unset and empty string values for defaultRuntime as a default to crun.

Without a pointer, MCO cannot tell the difference between unset and the empty string.

Does the empty string mean anything to crun?

Keeping omitempty ensures the field is omitted when unset, avoiding a confusing empty string for users.

This is called discoverability in our APIs. When a controller generates a configuration API object, we try to generate a complete set of fields so that a user can easily see from the object, where there are options they may not have known about (they can discover the new options).

If a user explicitly sets it to an empty string, it will be included in the manifest, which is acceptable as MCO handles it.

But it won't round trip. As soon as a structured client writes the object, the field will be removed from etcd and future reads will not render the "" value. This means that an unstructured apply would never reach a stable state when the controller is responding to the writes from the unstructured client.

Either remove omitempty, or remove "" from the enum of valid values. Removing an enum value is a breaking change btw, so if this is already shipped and the enum already allows "", you should remove omitempty as this is the least breaking change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @JoelSpeed . I have removed omitempty and updated test cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoelSpeed can you review? Happy to address any comments you have

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are unstructured calls, unless there's an explicit default, I wouldn't expect the defaultRuntime to be added as part of this test

@ngopalak-redhat
Copy link
Author

/retest-required

@@ -3,7 +3,7 @@ name: "ContainerRuntimeConfig"
crdName: containerruntimeconfigs.machineconfiguration.openshift.io
tests:
onCreate:
- name: Should be able to create a minimal ContainerRuntimeConfig
- name: Should be able to create a minimal ContainerRuntimeConfig and default runtime set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't modify the existing case, add a new case

apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
spec:
containerRuntimeConfig:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are unstructured calls, unless there's an explicit default, I wouldn't expect the defaultRuntime to be added as part of this test

- name: Should be able to update from runc to crun with other parameters
initialCRDPatches:
- op: remove
path: /spec/versions/0/schema/openAPIV3Schema/properties/spec/properties/containerRuntimeConfig/properties/defaultRuntime/enum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the enum on this one? runc is a valid choice no?

- name: Should not be able to update from default to invalid value
initialCRDPatches:
- op: remove
path: /spec/versions/0/schema/openAPIV3Schema/properties/spec/properties/containerRuntimeConfig/properties/defaultRuntime/enum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the enum on thius one, "" is a valid choice no?

// When omitted, this means no opinion and the platform is left to choose a reasonable default,
// which is subject to change over time. Currently, the default is `crun`.
// +kubebuilder:validation:Enum=crun;runc;""
// +kubebuilder:default:=""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is interesting, this creates discoverability even from unstructured clients, we haven't done this before

What's the motivation?

// When set to `crun`, OpenShift will use crun to execute the container
// When omitted, this means no opinion and the platform is left to choose a reasonable default,
// which is subject to change over time. Currently, the default is `crun`.
// +kubebuilder:validation:Enum=crun;runc;""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced we actually want to add "" to the enum here. Are we aiming for discoverability? Who creates this object? Is it generally created by a controller/structured client? Or do we expect end users to create these?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoelSpeed Before diving into other comments, let's clarify the defaultRuntime field. Our primary goal is to inform users that only 'runc' and 'crun' are valid options, as they currently aren't aware. We also expect users to directly modify this field to change the node's underlying runtime.

Removing omitempty and handling "" as an invalid value could break existing user configurations if they previously set defaultRuntime to "".

Given this, I see these paths forward:

  1. Skip Enum Validation for now: Document 'runc' and 'crun' as the allowed values, and let MCO manage it as before.
  2. Remove "" and omitempty (breaking change): Proceed with stricter validation, accepting this might break a small number of existing users who explicitly set ""."
  3. The current change which uses "" as an enum option so that the existing users aren't broken.

Copy link
Contributor

@everettraven everettraven Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Joel and I discussed this a bit further offline, and it looks like I missed the nuance of this field already having been set as omitempty previously meaning that explicitly setting the empty value ("") and completely omitting the field are treated exactly the same by the kube-apiserver (i.e users aren't broken if the enum doesn't include "").

That means we should:

  • Remove the "" value from the enum
  • Remove the default marker
  • Add omitempty back
  • Keep the messaging about default behavior when omitted in the GoDoc

My apologies for the confusion here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngopalak-redhat What is the difference between this field being omitted (not present at all), and ""? Is there a semantic difference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell from the docs, it mentions only runc and crun as valid values, https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/machine_configuration/machine-configs-custom

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@everettraven When we remove the "" option in the enum and keep omitempty, it leads to an error if a customer has previously set defaultRuntime to "". You can see the specific error below:

                     Type: "FieldValueNotSupported",
                           Message: "Unsupported value: \"\": supported values: \"crun\", \"runc\"",
                           Field: "spec.containerRuntimeConfig.defaultRuntime",

My take is that this breakage should be acceptable. We've never actually instructed customers to set defaultRuntime to an empty string. Therefore, any existing configurations with "" are likely not intended
What are your thoughts?

@JoelSpeed crun is the default now. Omitting or "" for defaultRuntime will result in choosing crun.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it leads to an error if a customer has previously set defaultRuntime to ""

Which version of OpenShift did you test this with? Ratcheting configuration should allow the previous value to be persisted on the latest few versions of OpenShift.

Therefore, any existing configurations with "" are likely not intended
What are your thoughts?

My concern is that, the way the field is structured today, allows an unstructured client (think kubectl apply) to set the field to "", but a structured client (think the controller) who updates the object, will strip the value (because of the omitempty). This means that "" doesn't round trip currently, and anyone who has tried to set the value to "", I expect has seen it removed after they have tried to set it. In theory that could be tested.

@JoelSpeed crun is the default now. Omitting or "" for defaultRuntime will result in choosing crun.

This implies to me that the value "" should not be in the enum. Though please don't update right now, @deads2k had some comments about this and I'll talk this through with him tomorrow

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets continue with this and remove "". We can't see anywhere that we told users to do this, and if they did, it wouldn't round trip through a structured client today either

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoelSpeed I have removed "". Updated the PR. Please review

@ngopalak-redhat ngopalak-redhat force-pushed the ngopalak/OCPNODE-3372_defaultRuntime branch from 81e5b29 to 73dfbf9 Compare July 10, 2025 04:01
kind: ContainerRuntimeConfig
spec:
containerRuntimeConfig:
defaultRuntime: crun
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of removing the enum in the ractheting tests is to set the value to something invalid. Set this to docker in initial`, then create three different updated, One to change another field and show it remains, one to show it can be removed, one to update it to a valid value.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

spec:
containerRuntimeConfig:
logLevel: info
- name: Should be able to update from runc to crun with other parameters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to remove the enum from the initial CRD validation for this test? You're not actually testing ratcheting unless the initial value has something that is not valid

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines 142 to 166
- name: Should be able to update other parameters with invalid defaultRuntime
initialCRDPatches:
- op: remove
path: /spec/versions/0/schema/openAPIV3Schema/properties/spec/properties/containerRuntimeConfig/properties/defaultRuntime/enum
initial: |
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
spec:
containerRuntimeConfig:
defaultRuntime: docker
logLevel: fatal
updated: |
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
spec:
containerRuntimeConfig:
defaultRuntime: docker
logLevel: info
expected: |
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
spec:
containerRuntimeConfig:
defaultRuntime: docker
logLevel: info
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a correct use of ratcheting, but you also need to show you can transition from an invalid to valid value, and from an invalid value to absent

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 11, 2025
Copy link
Contributor

openshift-ci bot commented Jul 11, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, JoelSpeed, ngopalak-redhat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 11, 2025
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 674ad74 and 2 for PR HEAD 9334ab7 in total

2 similar comments
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 674ad74 and 2 for PR HEAD 9334ab7 in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 674ad74 and 2 for PR HEAD 9334ab7 in total

Copy link
Contributor

openshift-ci bot commented Jul 14, 2025

@ngopalak-redhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-azure 9334ab7 link false /test e2e-azure
ci/prow/e2e-aws-serial-2of2 9334ab7 link true /test e2e-aws-serial-2of2

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 674ad74 and 2 for PR HEAD 9334ab7 in total

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants