diff --git a/enhancements/authentication/pod-security-admission-enforcement.md b/enhancements/authentication/pod-security-admission-enforcement.md index 3fc64ca3ba..fa88271eb8 100644 --- a/enhancements/authentication/pod-security-admission-enforcement.md +++ b/enhancements/authentication/pod-security-admission-enforcement.md @@ -27,25 +27,25 @@ superseded-by: [] This enhancement introduces a **new cluster-scoped API**, changes to the relevant controllers and to the `OpenShiftPodSecurityAdmission` `FeatureGate` to gradually roll out [Pod Security Admission (PSA)](https://kubernetes.io/docs/concepts/security/pod-security-admission/) enforcement [in OpenShift](https://www.redhat.com/en/blog/pod-security-admission-in-openshift-4.11). Enforcement means that the `PodSecurityAdmissionLabelSynchronizationController` sets the `pod-security.kubernetes.io/enforce` label on Namespaces, and the PodSecurityAdmission plugin enforces the `Restricted` [Pod Security Standard (PSS)](https://kubernetes.io/docs/concepts/security/pod-security-standards/). -By “gradually,” it means that these changes happen in separate steps. +By “gradually”, it means that these changes happen in separate steps. The new API offers users the option to manipulate the outcome by enforcing the `Privileged` or `Restricted` PSS directly. The suggested default decision is `Conditional`, which only progresses if no potentially failing workloads are found. -The progression starts with the `PodSecurityAdmissionLabelSynchronizationController` labeling **Namespaces** for enforcement and finishes with the **Global Configuration** of `PodSecurity` being set to `Restricted` by default. +The progression starts with the `PodSecurityAdmissionLabelSynchronizationController` labeling **Namespaces** and finishes with the **Global Configuration**. 
This enhancement expands the ["PodSecurity admission in OpenShift"](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/enhancements/authentication/pod-security-admission.md) and ["Pod Security Admission Autolabeling"](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/enhancements/authentication/pod-security-admission-autolabeling.md) enhancements. ## Motivation After introducing Pod Security Admission and Autolabeling based on SCCs, some clusters were found to have Namespaces with Pod Security violations. -Over the last few releases, the number of clusters with failing workloads has dropped significantly. +Over the last few releases, the number of clusters with violating workloads has dropped significantly. Although these numbers are now quite low, it is essential to avoid any scenario where users end up with failing workloads. To ensure a smooth and safe transition, this proposal uses a gradual, conditional rollout based on the new API. This approach also provides an overview of which Namespaces could contain failing workloads. ### Goals -1. Start the process of rolling out Pod Security Admission enforcement. +1. Roll out Pod Security Admission enforcement. 2. Minimize the risk of breakage for existing workloads. 3. Allow users to remain in “privileged” mode for a couple of releases. @@ -53,18 +53,16 @@ This approach also provides an overview of which Namespaces could contain failin 1. Enabling the PSA label-syncer to evaluate workloads with user-based SCC decisions. 2. Providing a detailed list of every Pod Security violation in a Namespace. -3. Allowing the user to move from enforcing the global config to relying solely on the PSA label syncer to set the `enforce` label. +3. Moving seamlessly back and forth between different progressions. ## Proposal -This section outlines the necessary changes for a safe, stepwise rollout of Pod Security Admission enforcement.
- ### User Stories As a System Administrator: - I want to transition to enforcing Pod Security Admission only if the cluster would have no failing workloads. -- If there are workloads in certain Namespaces that would fail under enforcement, I want to be able to identify which Namespaces need to be fixed. -- If I encounter issues with the Pod Security Admission transition, I want to opt out (remain privileged) across my clusters until I can fix the issues. +- If there are workloads in certain Namespaces that would fail under enforcement, I want to be able to identify which Namespaces need to be investigated. +- If I encounter issues with the Pod Security Admission transition, I want to opt out (remain privileged) across my clusters until later. ### Current State @@ -86,13 +84,14 @@ To allow a safer rollout of enforcement, the following steps are proposed: The feature flag `OpenShiftPodSecurityAdmission` being enabled is a pre-condition for this process to start. It will also serve as a break-glass option. -If unexpected failures occur, the rollout will be reverted by removing the `FeatureGate` from the default `FeatureSet`. +If the progression causes failures for users, the rollout will be reverted by removing the `FeatureGate` from the default `FeatureSet`. #### Examples +Examples of failing workloads include: + - **Category 1**: Namespaces with workloads that use user-bound SCCs (workloads created directly by a user) without meeting the `Restricted` PSS. - **Category 2**: Namespaces that do not have the `pod-security.kubernetes.io/enforce` label and whose workloads would not satisfy the `Restricted` PSS. - Possible cases include: 1. Namespaces with `security.openshift.io/scc.podSecurityLabelSync: "false"` and no `pod-security.kubernetes.io/enforce` label set. 2. `openshift-` prefixed Namespaces (not necessarily created or managed by OpenShift teams). 
@@ -109,7 +108,7 @@ It will let administrators: ### Release Timing The gradual process will span three releases: -- **Release `n-1`**: Introduce the new API, diagnostics for identifying violating Namespaces, and enable the PSA label syncer to remove enforce labels from release `n`. +- **Release `n-1`**: Introduce the new API, improve diagnostics for identifying violating Namespaces and enable the PSA label syncer to remove enforce labels from its release `n` version. - **Release `n`**: Permit the `PodSecurityAdmissionLabelSynchronizationController` to set enforce labels if there are no workloads that would fail. - **Release `n+2`**: Enable the PodSecurity configuration to enforce `restricted` if there are no workloads that would fail. @@ -126,7 +125,7 @@ Additional diagnostics are required to confirm the full set of potential causes. #### New SCC Annotation: `security.openshift.io/ValidatedSCCSubjectType` -The annotation `openshift.io/scc` currently indicates which SCC admitted a workload, but it does not distinguish **how** the SCC was granted—whether through a user or a Pod’s ServiceAccount. +The annotation `openshift.io/scc` currently indicates which SCC admitted a workload, but it does not distinguish **how** the SCC was granted — whether through a user or a Pod’s ServiceAccount. A new annotation will help determine if a ServiceAccount with the required SCCs was used, or if a user created the workload out of band. Because the PSA label syncer does not track user-based SCCs itself, it cannot fully assess labeling under those circumstances. @@ -177,8 +176,8 @@ import ( // in the cluster. Unlike PSATargetMode, which reflects the user’s desired or “target” // setting, PSAEnforcementMode describes the effective mode currently active. // -// The modes define a progression from no enforcement, to label-based enforcement, -// to label-based plus global config enforcement. enforcement mode for Pod Security Admission rollout. 
+// The modes define a progression from no enforcement, through label-based enforcement +// to label-based with global config enforcement. type PSAEnforcementMode string const ( @@ -213,7 +212,7 @@ const ( // TargetModeConditional indicates that the user is willing to let the cluster // automatically enforce a stricter enforcement once there are no violating Namespaces. - // If violations exist, the cluster stays in "Privileged" until those are resolved. + // If violations exist, the cluster stays in its previous state until those are resolved. // This allows a gradual move towards label and global config enforcement without // immediately breaking workloads that are not yet compliant. TargetModeConditional PSATargetMode = "Conditional" @@ -256,13 +255,17 @@ type PSAEnforcementConfigSpec struct { // PSAEnforcementConfigStatus defines the observed state of Pod Security // Admission enforcement. type PSAEnforcementConfigStatus struct { + // enforcementMode indicates the effective Pod Security Admission enforcement + // mode in the cluster. Unlike spec.targetMode, which expresses the desired mode, + // enforcementMode reflects the actual state after considering any existing + // violations or user overrides. EnforcementMode PSAEnforcementMode `json:"enforcmentMode"` // violatingNamespaces is a list of namespaces that can initially block the // cluster from fully enforcing a "Restricted" mode. Administrators should // review each listed Namespace to fix any issues to enable strict enforcement. // - // If a cluster is already in "Restricted" mode and new violations emerge, + // If a cluster is already in a more "Restricted" mode and new violations emerge, // it remains in "Restricted" until the user explicitly switches to // "spec.mode = Privileged". 
// @@ -328,7 +331,7 @@ Below is a table illustrating the expected behavior when the `FeatureGate` `Open A cluster that uses `spec.targetMode = Conditional` can revert to `Privileged` only if the user explicitly sets `spec.targetMode = Privileged`. A cluster in `spec.mode = Conditional` that starts with `status.EnforcementMode = Privileged` may switch to a more restrictive enforcement mode as soon as there are no violations. -To manage the timing of this rollout, an administrator can set `spec.mode = Privileged` and later switch it to Conditional when ready. +To manage the timing of this rollout, an administrator can set `spec.mode = Privileged` and later switch it to `Conditional` when ready. `status.violatingNamespaces` lists the Namespaces that would fail if `status.enforcementMode` were `LabelEnforcement` or `FullEnforcement`. The reason field helps identify whether the PSA label syncer or the PodSecurity config is the root cause. @@ -337,8 +340,15 @@ Administrators must query the kube-apiserver (or use the [cluster debugging tool ### Implementation Details - The `PodSecurityReadinessController` in the `cluster-kube-apiserver-operator` will manage the new API. -- If the `FeatureGate` is removed, the cluster must revert to its previous state. -- The [`Config Observer controller`](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/operator/configobservation/configobservercontroller/observe_config_controller.go#L131) must be updated for the `PodSecurity` configuration. +- If the `FeatureGate` is removed from the current `FeatureSet`, the cluster must revert to its previous state. +- The [`Config Observer Controller`](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/operator/configobservation/configobservercontroller/observe_config_controller.go#L131) must be updated to watch for the new API alongside the `FeatureGate`. 
+ +#### PodSecurityReadinessController + +The `PodSecurityReadinessController` will manage the `PSAEnforcementConfig` API. +It already collects most of the data needed to determine whether a Namespace would fail enforcement, which it uses to create a [`ClusterFleetEvaluation`](https://github.com/openshift/enhancements/blob/master/dev-guide/cluster-fleet-evaluation.md). +With the `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation, it will be able to evaluate all Namespaces for workloads that would fail if enforcement were applied. +With the `security.openshift.io/ValidatedSCCSubjectType` annotation, it can categorize violations more precisely and produce a more accurate `ClusterFleetEvaluation`. #### PodSecurity Configuration @@ -349,7 +359,7 @@ It must watch both the `status.enforcementMode` and the `FeatureGate` to make de The PSA label syncer will watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` feature gate. If `status.enforcementMode` is `LabelEnforcement` or `FullEnforcement` and `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label. -Otherwise, it will refrain from setting that label and remove any enforce labels it owns. +Otherwise, it will refrain from setting that label and remove any existing enforce labels it owns. Because the ability to set `pod-security.kubernetes.io/enforce` is introduced in release `n`, the ability to remove that label must exist in release `n-1`. Otherwise, the cluster will be unable to revert to its previous state.
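
The label syncer rule described above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the controller's actual code: the constant names, the `labelAction` type, and `decideEnforceLabel` are hypothetical, and only the mode values (`Privileged`, `LabelEnforcement`, `FullEnforcement`) and the label key come from the proposal text.

```go
package main

import "fmt"

// PSAEnforcementMode mirrors the type name used in the proposal.
// The constant names below are assumptions made for this sketch.
type PSAEnforcementMode string

const (
	modePrivileged       PSAEnforcementMode = "Privileged"
	modeLabelEnforcement PSAEnforcementMode = "LabelEnforcement"
	modeFullEnforcement  PSAEnforcementMode = "FullEnforcement"
)

// enforceLabel is the Namespace label the syncer sets or removes.
const enforceLabel = "pod-security.kubernetes.io/enforce"

// labelAction is a hypothetical result type describing what the syncer
// should do with the enforce label on a Namespace.
type labelAction string

const (
	actionSet    labelAction = "set"
	actionRemove labelAction = "remove"
	actionNone   labelAction = "none"
)

// decideEnforceLabel (hypothetical helper) applies the rule from the text:
// the label is set only when the feature gate is enabled and the effective
// mode is LabelEnforcement or FullEnforcement. In every other case the
// syncer removes an enforce label it owns and otherwise does nothing.
func decideEnforceLabel(mode PSAEnforcementMode, gateEnabled, syncerOwnsLabel bool) labelAction {
	if gateEnabled && (mode == modeLabelEnforcement || mode == modeFullEnforcement) {
		return actionSet
	}
	if syncerOwnsLabel {
		return actionRemove
	}
	return actionNone
}

func main() {
	// Gate enabled and label enforcement active: the label is set.
	fmt.Println(enforceLabel, decideEnforceLabel(modeLabelEnforcement, true, false))
	// Gate removed (break-glass) while the syncer owns a label: the label is removed.
	fmt.Println(enforceLabel, decideEnforceLabel(modeFullEnforcement, false, true))
	// Privileged mode with no owned label: nothing to do.
	fmt.Println(enforceLabel, decideEnforceLabel(modePrivileged, true, false))
}
```

Keeping the decision separate from the Namespace update makes the revert path easy to check: the removal branch is the behaviour that must already ship in release `n-1`.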