From 930c8a41ce274648fc274c5383f8ad0f836810cb Mon Sep 17 00:00:00 2001 From: Krzysztof Ostrowski Date: Fri, 24 Jan 2025 21:50:05 +0100 Subject: [PATCH 1/4] enhancements/authentication: psa enforcment --- .../pod-security-admission-enforcement.md | 428 ++++++++++++++++++ 1 file changed, 428 insertions(+) create mode 100644 enhancements/authentication/pod-security-admission-enforcement.md diff --git a/enhancements/authentication/pod-security-admission-enforcement.md b/enhancements/authentication/pod-security-admission-enforcement.md new file mode 100644 index 0000000000..5b91372845 --- /dev/null +++ b/enhancements/authentication/pod-security-admission-enforcement.md @@ -0,0 +1,428 @@ +--- +title: psa-enforcement-config +authors: + - "@ibihim" +reviewers: + - "@liouk" + - "@everettraven" +approvers: + - "@deads2k" + - "@sjenning" +api-approvers: + - "@deads2k" + - "@JoelSpeed" +creation-date: 2025-01-23 +last-updated: 2025-01-23 +tracking-link: + - https://issues.redhat.com/browse/ +see-also: + - "/enhancements/authentication/pod-security-admission.md" +replaces: [] +superseded-by: [] +--- + +# Pod Security Admission Enforcement Config + +## Summary + +This enhancement introduces a **new cluster-scoped API**, changes to the relevant controllers and to the `OpenShiftPodSecurityAdmission` `FeatureGate` to gradually roll out [Pod Security Admission (PSA)](https://kubernetes.io/docs/concepts/security/pod-security-admission/) enforcement [in OpenShift](https://www.redhat.com/en/blog/pod-security-admission-in-openshift-4.11). +Enforcement means that the `PodSecurityAdmissionLabelSynchronizationController` sets the `pod-security.kubernetes.io/enforce` label on Namespaces, and the PodSecurityAdmission plugin enforces the `Restricted` [Pod Security Standard (PSS)](https://kubernetes.io/docs/concepts/security/pod-security-standards/). +By “gradually”, it means that these changes happen in separate steps. 
+ +The new API offers users the option to manipulate the outcome by enforcing the `Privileged` or `Restricted` PSS directly. +The suggested default decision is `Conditional`, which only progresses if no potentially failing workloads are found. +The progression starts with the `PodSecurityAdmissionLabelSynchronizationController` labeling **Namespaces** and finishes with the **Global Configuration**. + +This enhancement expands the ["PodSecurity admission in OpenShift"](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/enhancements/authentication/pod-security-admission.md) and ["Pod Security Admission Autolabeling"](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/enhancements/authentication/pod-security-admission-autolabeling.md) enhancements. + +## Motivation + +After introducing Pod Security Admission and Autolabeling based on SCCs, some clusters were found to have Namespaces with Pod Security violations. +Over the last few releases, the number of clusters with violating workloads has dropped significantly. +Although these numbers are now quite low, it is essential to avoid any scenario where users end up with failing workloads. +To ensure a smooth and safe transition, this proposal uses a gradual, conditional rollout based on the new API. +This approach also provides an overview of which Namespaces could contain failing workloads. + +### Goals + +1. Rolling out Pod Security Admission enforcement. +2. Minimize the risk of breakage for existing workloads. +3. Allow users to remain in “privileged” mode for a couple of releases. + +### Non-Goals + +1. Enabling the PSA label-syncer to evaluate workloads with user-based SCC decisions. +2. Providing a detailed list of every Pod Security violation in a Namespace. +3. Moving seamlessly between different progressions back and forth. 
+ +## Proposal + +### User Stories + +As a System Administrator: +- I want to transition to enforcing Pod Security Admission only if the cluster would have no failing workloads. +- If there are workloads in certain Namespaces that would fail under enforcement, I want to be able to identify which Namespaces need to be investigated. +- If I encounter issues with the Pod Security Admission transition, I want to opt out (remain privileged) across my clusters until later. + +### Current State + +When the `OpenShiftPodSecurityAdmission` feature flag is enabled today: +- The [PodSecurity configuration](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/cmd/render/render.go#L350-L358) for the kube-apiserver enforces `restricted` across the cluster. +- The [`PodSecurityAdmissionLabelSynchronizationController`](https://github.com/openshift/cluster-policy-controller/blob/327d3cbd82fd013a9d5d5733eb04cc0dcd97aec5/pkg/cmd/controller/psalabelsyncer.go#L17-L52) automatically sets the `pod-security.kubernetes.io/enforce` label. + +This all-or-nothing mechanism makes it difficult to safely enforce PSA while minimizing disruptions. + +### Gradual Process + +To allow a safer rollout of enforcement, the following steps are proposed: + +1. **Label Enforce** + The `PodSecurityAdmissionLabelSynchronizationController` will set the `pod-security.kubernetes.io/enforce` label on Namespaces, provided the cluster would have no failing workloads. + +2. **Global Config Enforce** + Once all viable Namespaces are labeled successfully (or satisfy PSS `restricted`), the cluster will set the `PodSecurity` configuration for the kube-apiserver to `Restricted`, again only if there would be no failing workloads. + +The feature flag `OpenShiftPodSecurityAdmission` being enabled is a pre-condition for this process to start. +It will also serve as a break-glass option. 
+If the progression causes failures for users, the rollout will be reverted by removing the `FeatureGate` from the default `FeatureSet`. + +#### Examples + +Examples of failing workloads include: + +- **Category 1**: Namespaces with workloads that use user-bound SCCs (workloads created directly by a user) without meeting the `Restricted` PSS. +- **Category 2**: Namespaces that do not have the `pod-security.kubernetes.io/enforce` label and whose workloads would not satisfy the `Restricted` PSS. + Possible cases include: + 1. Namespaces with `security.openshift.io/scc.podSecurityLabelSync: "false"` and no `pod-security.kubernetes.io/enforce` label set. + 2. `openshift-` prefixed Namespaces (not necessarily created or managed by OpenShift teams). + +### User Control and Insights + +To allow user influence over this gradual transition, a new API called `PSAEnforcementConfig` is introduced. +It will let administrators: +- Force `Restricted` enforcement, ignoring potential violations. +- Remain in `Privileged` mode, regardless of whether violations exist or not. +- Let the cluster evaluate the state and automatically enforce `Restricted` if no workloads would fail. +- Identify Namespaces that would fail enforcement. + +### Release Timing + +The gradual process will span three releases: +- **Release `n-1`**: Introduce the new API, improve diagnostics for identifying violating Namespaces and enable the PSA label syncer to remove enforce labels from its release `n` version. +- **Release `n`**: Permit the `PodSecurityAdmissionLabelSynchronizationController` to set enforce labels if there are no workloads that would fail. +- **Release `n+1`**: Enable the PodSecurity configuration to enforce `restricted` if there are no workloads that would fail. + +Here, `n` could be OCP `4.19`, assuming it is feasible to backport the API and diagnostics to earlier versions. 
+ +## Design Details + +### Improved Diagnostics + +The [ClusterFleetEvaluation](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/dev-guide/cluster-fleet-evaluation.md) revealed that certain clusters would fail enforcement without clear explanations. +A likely root cause is that the [`PodSecurityAdmissionLabelSynchronizationController`](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) (PSA label syncer) does not label Namespaces that rely on user-based SCCs. +In some cases, the evaluation was impossible because PSA labels had been overwritten by users. +Additional diagnostics are required to confirm the full set of potential causes. + +#### New SCC Annotation: `security.openshift.io/ValidatedSCCSubjectType` + +The annotation `openshift.io/scc` currently indicates which SCC admitted a workload, but it does not distinguish **how** the SCC was granted — whether through a user or a Pod’s ServiceAccount. +A new annotation will help determine if a ServiceAccount with the required SCCs was used, or if a user created the workload out of band. +Because the PSA label syncer does not track user-based SCCs itself, it cannot fully assess labeling under those circumstances. + +To address this, the proposal introduces: + +```go +// ValidatedSCCSubjectTypeAnnotation indicates the subject type that allowed the +// SCC admission. This can be used by controllers to detect potential issues +// between user-driven SCC usage and the ServiceAccount-driven SCC usage. +ValidatedSCCSubjectTypeAnnotation = "security.openshift.io/validated-scc-subject-type" +``` + +This annotation will be set by the [`SecurityContextConstraint` admission](https://github.com/openshift/apiserver-library-go/blob/60118cff59e5d64b12e36e754de35b900e443b44/pkg/securitycontextconstraints/sccadmission/admission.go#L138) plugin. 
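As a rough illustration of how a controller might consume the proposed annotation, the sketch below classifies a workload from its annotations alone. The annotation key comes from the proposal; the values `user` and `serviceaccount` and the helper `classifySCCSubject` are assumptions made for this example, since the proposal does not spell out the value set.

```go
package main

import "fmt"

// Annotation key as proposed in this enhancement.
const validatedSCCSubjectTypeAnnotation = "security.openshift.io/validated-scc-subject-type"

// Assumed annotation values; the proposal does not define them.
const (
	subjectTypeUser           = "user"
	subjectTypeServiceAccount = "serviceaccount"
)

// classifySCCSubject is a hypothetical helper that reports how the SCC of a
// workload was granted, based only on the Pod's annotations.
func classifySCCSubject(annotations map[string]string) string {
	switch annotations[validatedSCCSubjectTypeAnnotation] {
	case subjectTypeUser:
		return "user-based SCC"
	case subjectTypeServiceAccount:
		return "ServiceAccount-based SCC"
	default:
		// Annotation missing or carrying a value this sketch does not know.
		return "unknown"
	}
}

func main() {
	podAnnotations := map[string]string{
		"openshift.io/scc":                "anyuid",
		validatedSCCSubjectTypeAnnotation: subjectTypeUser,
	}
	fmt.Println(classifySCCSubject(podAnnotations))
}
```

Treating a missing annotation as `unknown` rather than as a ServiceAccount-based admission keeps workloads admitted before the annotation existed from being misclassified.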
+ +#### Set PSS Annotation: `security.openshift.io/MinimallySufficientPodSecurityStandard` + +The PSA label syncer must set the `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation. +Because users can modify `pod-security.kubernetes.io/warn` and `pod-security.kubernetes.io/audit`, these labels do not reliably indicate the minimal standard. +The new annotation ensures a clear record of the minimal PSS that would be enforced if the `pod-security.kubernetes.io/enforce` label were set. + +#### Update the PodSecurityReadinessController + +By adding these annotations, the [`PodSecurityReadinessController`](https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/pkg/operator/podsecurityreadinesscontroller/podsecurityreadinesscontroller.go) can more accurately identify potentially failing Namespaces and understand their root causes: + +- With the `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation, it can evaluate Namespaces that lack the `pod-security.kubernetes.io/enforce` label but have user-overridden warn or audit labels. +- With `ValidatedSCCSubjectType`, the controller can classify issues arising from user-based SCC workloads separately. + Many of the remaining clusters with violations appear to involve workloads admitted by user SCCs. + +### Secure Rollout + +The Proposal section indicates that enforcement will be introduced first at the Namespace level and later at the global (cluster-wide) level. +In addition to adjusting how the `OpenShiftPodSecurityAdmission` `FeatureGate` behaves, administrators need visibility and control throughout this transition. +A new API is necessary to provide this flexibility. + +#### New API + +This API offers a gradual way to roll out Pod Security Admission enforcement to clusters. +It gives users the ability to influence the rollout and see feedback on which Namespaces might violate Pod Security standards. 
+ +```go +package v1alpha1 + +import ( + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" +) + +// PSAEnforcementMode indicates the actual enforcement state of Pod Security Admission +// in the cluster. Unlike PSATargetMode, which reflects the user’s desired or “target” +// setting, PSAEnforcementMode describes the effective mode currently active. +// +// The modes define a progression from no enforcement, through label-based enforcement +// to label-based with global config enforcement. +type PSAEnforcementMode string + +const ( + // PSAEnforcementModePrivileged indicates that no Pod Security restrictions + // are effectively applied. + // This aligns with a pre-rollout or fully "privileged" cluster state, + // where neither enforce labels are set nor the global config enforces "Restricted". + PSAEnforcementModePrivileged PSAEnforcementMode = "Privileged" + + // PSAEnforcementModeLabel indicates that the cluster is enforcing Pod Security + // labels at the Namespace level (via the PodSecurityAdmissionLabelSynchronizationController), + // but the global kube-apiserver configuration is still "Privileged." + PSAEnforcementModeLabel PSAEnforcementMode = "LabelEnforcement" + + // PSAEnforcementModeFull indicates that the cluster is enforcing + // labels at the Namespace level, and the global configuration has been set + // to "Restricted" on the kube-apiserver. + // This represents full enforcement, where both Namespace labels and the global config + // enforce Pod Security Admission restrictions. + PSAEnforcementModeFull PSAEnforcementMode = "FullEnforcement" +) + +// PSATargetMode reflects the user’s chosen (“target”) enforcement level. +type PSATargetMode string + +const ( + // PSATargetModePrivileged indicates that the user wants no Pod Security + // restrictions applied. The desired outcome is that the cluster remains + // in a fully privileged (pre-rollout) state, ignoring any label enforcement + // or global config changes. 
+ PSATargetModePrivileged PSATargetMode = "Privileged" + + // PSATargetModeConditional indicates that the user is willing to let the cluster + // automatically enforce a stricter enforcement once there are no violating Namespaces. + // If violations exist, the cluster stays in its previous state until those are resolved. + // This allows a gradual move towards label and global config enforcement without + // immediately breaking workloads that are not yet compliant. + PSATargetModeConditional PSATargetMode = "Conditional" + + // PSATargetModeRestricted indicates that the user wants the strictest possible + // enforcement, causing the cluster to ignore any existing violations and + // enforce "Restricted" anyway. This reflects a final, fully enforced state. + PSATargetModeRestricted PSATargetMode = "Restricted" +) + +// PSAEnforcementConfig is the config for the PSA enforcement. +type PSAEnforcementConfig struct { + metav1.TypeMeta `json:",inline"` + metav1.ObjectMeta `json:"metadata,omitempty"` + + // spec holds user-settable values for configuring Pod Security Admission + // enforcement + Spec PSAEnforcementConfigSpec `json:"spec"` + + // status communicates the targeted enforcement mode, including any discovered + // issues in Namespaces. + Status PSAEnforcementConfigStatus `json:"status"` +} + +// PSAEnforcementConfigSpec defines the desired configuration for Pod Security +// Admission enforcement. +type PSAEnforcementConfigSpec struct { + // targetMode is the user-selected Pod Security Admission enforcement level. + // Valid values are: + // - "Privileged": ensures the cluster runs with no restrictions + // - "Conditional": defers the decision to cluster-based evaluation + // - "Restricted": enforces the strictest Pod Security admission + // + // If this field is not set, it defaults to "Conditional". 
+ // + // +kubebuilder:default=Conditional + TargetMode PSATargetMode `json:"targetMode"` +} + +// PSAEnforcementConfigStatus defines the observed state of Pod Security +// Admission enforcement. +type PSAEnforcementConfigStatus struct { + // enforcementMode indicates the effective Pod Security Admission enforcement + // mode in the cluster. Unlike spec.targetMode, which expresses the desired mode, + // enforcementMode reflects the actual state after considering any existing + // violations or user overrides. + EnforcementMode PSAEnforcementMode `json:"enforcementMode"` + + // violatingNamespaces is a list of namespaces that can initially block the + // cluster from fully enforcing a "Restricted" mode. Administrators should + // review each listed Namespace to fix any issues to enable strict enforcement. + // + // If a cluster is already in a more "Restricted" mode and new violations emerge, + // it remains in "Restricted" until the user explicitly switches to + // "spec.targetMode = Privileged". + // + // To revert "Restricted" mode, Administrators need to set + // spec.targetMode to "Privileged". + // + // +optional + ViolatingNamespaces []ViolatingNamespace `json:"violatingNamespaces,omitempty"` +} + +// ViolatingNamespace provides information about a namespace that cannot comply +// with the chosen enforcement mode. +type ViolatingNamespace struct { + // name is the Namespace that has been flagged as potentially violating if + // enforced. + Name string `json:"name"` + + // reason is a textual description explaining why the Namespace is incompatible + // with the requested Pod Security mode and highlights which mode is affected. 
+ // + // Possible values are: + // - PSAConfig: Misconfigured OpenShift Namespace + // - PSAConfig: PSA label syncer disabled + // - PSALabel: ServiceAccount with insufficient SCCs + // + // +optional + Reason string `json:"reason,omitempty"` +} +``` + +`Privileged` and `Restricted` each ignore cluster feedback and strictly enforce their respective modes: + +- `Privileged` -> `Privileged` +- `Restricted` -> `FullEnforcement` + +When `Conditional` is selected, enforcement depends on whether there are violating Namespaces and on the current release. + +- In `n` and `n+1`: It only progresses from `Privileged` to `LabelEnforcement`, if there would be no PSA label syncer violations. +- In `n+1`: It only progresses from `LabelEnforcement` to `FullEnforcement`, if there would be no PodSecurity config violations. + +Below is a table illustrating the expected behavior when the `FeatureGate` `OpenShiftPodSecurityAdmission` is enabled: + +| spec.targetMode | violations found | release | status.enforcementMode | +| ----------------- | ---------------- | ------- | ---------------------- | +| Restricted | none | n - 1 | Privileged | +| Restricted | found | n - 1 | Privileged | +| Privileged | none | n - 1 | Privileged | +| Privileged | found | n - 1 | Privileged | +| Conditional | none | n - 1 | Privileged | +| Conditional | found | n - 1 | Privileged | +| Restricted | none | n | FullEnforcement | +| Restricted | found | n | FullEnforcement | +| Privileged | none | n | Privileged | +| Privileged | found | n | Privileged | +| Conditional | none | n | LabelEnforcement | +| Conditional | found | n | LabelEnforcement | +| Restricted | none | n + 1 | FullEnforcement | +| Restricted | found | n + 1 | FullEnforcement | +| Privileged | none | n + 1 | Privileged | +| Privileged | found | n + 1 | Privileged | +| Conditional | none | n + 1 | FullEnforcement | +| Conditional | found | n + 1 | Privileged | + +A cluster that uses `spec.targetMode = Conditional` can revert to `Privileged` only 
 if the user explicitly sets `spec.targetMode = Privileged`. +A cluster in `spec.targetMode = Conditional` that starts with `status.enforcementMode = Privileged` may switch to a more restrictive enforcement mode as soon as there are no violations. +To manage the timing of this rollout, an administrator can set `spec.targetMode = Privileged` and later switch it to `Conditional` when ready. + +`status.violatingNamespaces` lists the Namespaces that would fail if `status.enforcementMode` were `LabelEnforcement` or `FullEnforcement`. +The `reason` field helps identify whether the PSA label syncer or the PodSecurity config is the root cause. +Administrators must query the kube-apiserver (or use the [cluster debugging tool](https://github.com/openshift/cluster-debug-tools)) to pinpoint specific workloads. + +### Implementation Details + +- The `PodSecurityReadinessController` in the `cluster-kube-apiserver-operator` will manage the new API. +- If the `FeatureGate` is removed from the current `FeatureSet`, the cluster must revert to its previous state. +- The [`Config Observer Controller`](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/operator/configobservation/configobservercontroller/observe_config_controller.go#L131) must be updated to watch for the new API alongside the `FeatureGate`. + +#### PodSecurityReadinessController + +The `PodSecurityReadinessController` will manage the `PSAEnforcementConfig` API. +It already collects most of the data needed to determine whether a Namespace would fail enforcement, which it uses to create a [`ClusterFleetEvaluation`](https://github.com/openshift/enhancements/blob/master/dev-guide/cluster-fleet-evaluation.md). +With the `security.openshift.io/MinimallySufficientPodSecurityStandard` annotation, it will be able to evaluate all Namespaces for workloads that would fail if enforcement were applied. 
+With the `security.openshift.io/ValidatedSCCSubjectType`, it can categorize violations more accurately and create a more accurate `ClusterFleetEvaluation`. + +#### PodSecurity Configuration + +A Config Observer in the `cluster-kube-apiserver-operator` manages the Global Config for the kube-apiserver, adjusting behavior based on the feature flag. +It must watch both the `status.enforcementMode` and the `FeatureGate` to make decisions. + +#### PSA Label Syncer + +The PSA label syncer will watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` feature gate. +If `status.enforcementMode` is `LabelEnforcement` or `FullEnforcement` and `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label. +Otherwise, it will refrain from setting that label and remove any enforce labels it owns if existent. + +Because the ability to set `pod-security.kubernetes.io/enforce` is introduced in release `n`, the ability to remove that label must exist in release `n-1`. +Otherwise, the cluster will be unable to revert to its previous state. + +## Open Questions + +### Fresh Installs + +Needs to be evaluated. The System Administrator needs to pre-configure the new API’s `spec.targetMode`, choosing whether the cluster will be `privileged`, `restricted`, or `conditional` during a fresh install. + +### Impact on HyperShift + +Needs to be evaluated. + +### Baseline Clusters + +The current suggestion differentiates between `restricted` and `privileged` PSS. +It may be possible to introduce an intermediate step and set the cluster to `baseline` instead. + +## Test Plan + +The PSA label syncer currently maps SCCs to PSS through a hard-coded rule set, and the PSA version is set to `latest`. +This setup risks becoming outdated if the mapping logic changes upstream. +To protect user workloads, an end-to-end test should fail if the mapping logic no longer behaves as expected. 
+Ideally, the PSA label syncer would use the `podsecurityadmission` package directly. +Otherwise, it can't be guaranteed that all possible SCCs are mapped correctly. + +## Graduation Criteria + +- If `status.enforcementMode = LabelEnforcement` rolls out on most clusters with no adverse effects, `status.enforcementMode = FullEnforcement` can be enabled in the subsequent release. +- If the majority of users have `status.enforcementMode = FullEnforcement`, then upgrades can be blocked on clusters that do not reach that state. + +## Upgrade / Downgrade Strategy + +### On Upgrade + +See the [Release Timing](#release-timing) section for the overall upgrade strategy. + +### On Downgrade + +See the earlier references, including the [PSA Label Syncer](#psa-label-syncer) subsection in the [Implementation Details](#implementation-details) section, for the downgrade strategy. + +## New Installation + +The default for new installs is `Conditional`, to prompt administrators toward adopting `Restricted`. + +A fresh install should not have any violating Namespaces. +Therefore, as `spec.targetMode` is not set to `Privileged`, the cluster would move to `status.enforcementMode = LabelEnforcement` or `status.enforcementMode = FullEnforcement`. +An administrator can also configure the cluster to start in `Privileged` if desired. + +## Operational Aspects + +- If a cluster is set to `Conditional` and has initial violations, those may be resolved one by one. + Once all violations are resolved, the cluster may immediately transition to `Restricted`. + Some administrators may prefer managing this switch manually. +- After a cluster switches to a stricter `status`, no violating workloads should be possible. + If a violating workload appears, there is no automatic fallback to a more privileged state, thus avoiding additional kube-apiserver restarts. 
+- Administrators facing issues in a cluster already set to a stricter enforcement can change `spec.targetMode` to `Privileged` to halt enforcement for other clusters. +- ClusterAdmins must ensure that directly created workloads (user-based SCCs) have correct `securityContext` settings. + Updating default workload templates can help. +- To identify specific problems in a violating Namespace, administrators can query the kube-apiserver: + + ```bash + kubectl label --dry-run=server --overwrite namespace "$NAMESPACE" \ + pod-security.kubernetes.io/enforce=$MINIMALLY_SUFFICIENT_POD_SECURITY_STANDARD + ``` From cee640711e806ec9b721c60867e80d51ee840004 Mon Sep 17 00:00:00 2001 From: Krzysztof Ostrowski Date: Mon, 17 Feb 2025 15:10:44 +0100 Subject: [PATCH 2/4] enhancements/auth: modified until Design Details, before big removal --- .../pod-security-admission-enforcement.md | 55 ++++++------------- 1 file changed, 18 insertions(+), 37 deletions(-) diff --git a/enhancements/authentication/pod-security-admission-enforcement.md b/enhancements/authentication/pod-security-admission-enforcement.md index 5b91372845..3d39b7a4b8 100644 --- a/enhancements/authentication/pod-security-admission-enforcement.md +++ b/enhancements/authentication/pod-security-admission-enforcement.md @@ -25,13 +25,10 @@ superseded-by: [] ## Summary -This enhancement introduces a **new cluster-scoped API**, changes to the relevant controllers and to the `OpenShiftPodSecurityAdmission` `FeatureGate` to gradually roll out [Pod Security Admission (PSA)](https://kubernetes.io/docs/concepts/security/pod-security-admission/) enforcement [in OpenShift](https://www.redhat.com/en/blog/pod-security-admission-in-openshift-4.11). 
-Enforcement means that the `PodSecurityAdmissionLabelSynchronizationController` sets the `pod-security.kubernetes.io/enforce` label on Namespaces, and the PodSecurityAdmission plugin enforces the `Restricted` [Pod Security Standard (PSS)](https://kubernetes.io/docs/concepts/security/pod-security-standards/). -By “gradually”, it means that these changes happen in separate steps. +This enhancement introduces a **new cluster-scoped API** and changes to the relevant controllers to roll out [Pod Security Admission (PSA)](https://kubernetes.io/docs/concepts/security/pod-security-admission/) enforcement [in OpenShift](https://www.redhat.com/en/blog/pod-security-admission-in-openshift-4.11). +Enforcement means that the `PodSecurityAdmissionLabelSynchronizationController` sets the `pod-security.kubernetes.io/enforce` label on Namespaces, and the PodSecurityAdmission plugin enforces the `Restricted` [Pod Security Standard (PSS)](https://kubernetes.io/docs/concepts/security/pod-security-standards/) globally on Namespaces without any label. -The new API offers users the option to manipulate the outcome by enforcing the `Privileged` or `Restricted` PSS directly. -The suggested default decision is `Conditional`, which only progresses if no potentially failing workloads are found. -The progression starts with the `PodSecurityAdmissionLabelSynchronizationController` labeling **Namespaces** and finishes with the **Global Configuration**. +The new API allows users to either enforce the `Restricted` PSS or maintain `Privileged` PSS for several releases. Eventually, all clusters will be required to use `Restricted` PSS. 
 This enhancement expands the ["PodSecurity admission in OpenShift"](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/enhancements/authentication/pod-security-admission.md) and ["Pod Security Admission Autolabeling"](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/enhancements/authentication/pod-security-admission-autolabeling.md) enhancements. @@ -40,8 +37,12 @@ This enhancement expands the ["PodSecurity admission in OpenShift"](https://gith After introducing Pod Security Admission and Autolabeling based on SCCs, some clusters were found to have Namespaces with Pod Security violations. Over the last few releases, the number of clusters with violating workloads has dropped significantly. Although these numbers are now quite low, it is essential to avoid any scenario where users end up with failing workloads. -To ensure a smooth and safe transition, this proposal uses a gradual, conditional rollout based on the new API. -This approach also provides an overview of which Namespaces could contain failing workloads. + +To ensure a safe transition, this proposal suggests that if potentially failing workloads are detected in release `n`, the operator moves into `Upgradeable=false`. +The user would need to either resolve the potential failures or set the enforcing mode to `Privileged` for now in order to be able to upgrade. +In the following release `n+1`, the controller will then do the actual enforcement, if `Restricted` is set. + +An overview of the Namespaces with failures will be listed in the API's status, which should help the user fix any issues. ### Goals @@ -53,7 +54,6 @@ This approach also provides an overview of which Namespaces could contain failin 1. Enabling the PSA label-syncer to evaluate workloads with user-based SCC decisions. 2. Providing a detailed list of every Pod Security violation in a Namespace. -3. 
Moving seamlessly between different progressions back and forth. ## Proposal @@ -66,25 +66,16 @@ As a System Administrator: ### Current State -When the `OpenShiftPodSecurityAdmission` feature flag is enabled today: +When the `OpenShiftPodSecurityAdmission` feature flag is enabled: - The [PodSecurity configuration](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/cmd/render/render.go#L350-L358) for the kube-apiserver enforces `restricted` across the cluster. - The [`PodSecurityAdmissionLabelSynchronizationController`](https://github.com/openshift/cluster-policy-controller/blob/327d3cbd82fd013a9d5d5733eb04cc0dcd97aec5/pkg/cmd/controller/psalabelsyncer.go#L17-L52) automatically sets the `pod-security.kubernetes.io/enforce` label. -This all-or-nothing mechanism makes it difficult to safely enforce PSA while minimizing disruptions. - -### Gradual Process - -To allow a safer rollout of enforcement, the following steps are proposed: +### Rollout -1. **Label Enforce** - The `PodSecurityAdmissionLabelSynchronizationController` will set the `pod-security.kubernetes.io/enforce` label on Namespaces, provided the cluster would have no failing workloads. +Release `n` will introduce the new API. It will default to `Restricted` PSS in its `spec` and list violating Namespaces with potentially failing workloads in the `status` field. +If violating Namespaces are detected, the Operator will move into `Upgradeable=false`. To be able to upgrade, the user needs to resolve the violations or set the `spec` to `Privileged`. -2. **Global Config Enforce** - Once all viable Namespaces are labeled successfully (or satisfy PSS `restricted`), the cluster will set the `PodSecurity` configuration for the kube-apiserver to `Restricted`, again only if there would be no failing workloads. - -The feature flag `OpenShiftPodSecurityAdmission` being enabled is a pre-condition for this process to start. 
-It will also serve as a break-glass option. -If the progression causes failures for users, the rollout will be reverted by removing the `FeatureGate` from the default `FeatureSet`. +In Release `n+1` the controller will enforce PSA, if the `spec` is set to `Restricted`. If the `spec` is set to `Privileged` or the `FeatureGate` of `OpenShiftPodSecurityAdmission` is disabled, the controllers won't enforce. The `FeatureGate` will act as a break-glass option. #### Examples @@ -98,21 +89,11 @@ Examples of failing workloads include: ### User Control and Insights -To allow user influence over this gradual transition, a new API called `PSAEnforcementConfig` is introduced. +To allow user influence over this transition, a new API called `PSAEnforcementConfig` is introduced. It will let administrators: -- Force `Restricted` enforcement, ignoring potential violations. -- Remain in `Privileged` mode, regardless of whether violations exist or not. -- Let the cluster evaluate the state and automatically enforce `Restricted` if no workloads would fail. -- Identify Namespaces that would fail enforcement. - -### Release Timing - -The gradual process will span three releases: -- **Release `n-1`**: Introduce the new API, improve diagnostics for identifying violating Namespaces and enable the PSA label syncer to remove enforce labels from its release `n` version. -- **Release `n`**: Permit the `PodSecurityAdmissionLabelSynchronizationController` to set enforce labels if there are no workloads that would fail. -- **Release `n+1`**: Enable the PodSecurity configuration to enforce `restricted` if there are no workloads that would fail. - -Here, `n` could be OCP `4.19`, assuming it is feasible to backport the API and diagnostics to earlier versions. +- Enable PSA enforcement by leaving the `spec` set to `Restricted` and having no violating Namespaces. +- Block PSA enforcement by setting the `spec` to `Privileged`. +- Get insights into which Namespaces would fail in order to resolve the issues. 
## Design Details From 2aabd1b6bc6bed4160968f8d9688b937d63c37da Mon Sep 17 00:00:00 2001 From: Krzysztof Ostrowski Date: Tue, 18 Feb 2025 12:50:27 +0100 Subject: [PATCH 3/4] enhancements/authentication: trim down psa-enforcement to modified API --- .../pod-security-admission-enforcement.md | 272 ++++++------------ 1 file changed, 89 insertions(+), 183 deletions(-) diff --git a/enhancements/authentication/pod-security-admission-enforcement.md b/enhancements/authentication/pod-security-admission-enforcement.md index 3d39b7a4b8..d374fa5f41 100644 --- a/enhancements/authentication/pod-security-admission-enforcement.md +++ b/enhancements/authentication/pod-security-admission-enforcement.md @@ -40,6 +40,7 @@ Although these numbers are now quite low, it is essential to avoid any scenario To ensure a safe transition, this proposal suggests that if a potential failure of workloads is being detected in release `n`, that the operator moves into `Upgradeable=false`. The user would need to either resolve the potential failures or set the enforcing mode to `Privileged` for now in order to be able to upgrade. +`Privileged` will keep the cluster in the previous state, the non enforcing state. In the following release `n+1`, the controller will then do the actual enforcement, if `Restricted` is set. An overview of the Namespaces with failures will be listed in the API's status, should help the user to fix any issues. @@ -64,46 +65,20 @@ As a System Administrator: - If there are workloads in certain Namespaces that would fail under enforcement, I want to be able to identify which Namespaces need to be investigated. - If I encounter issues with the Pod Security Admission transition, I want to opt out (remain privileged) across my clusters until later. 
-### Current State - -When the `OpenShiftPodSecurityAdmission` feature flag is enabled: -- The [PodSecurity configuration](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/cmd/render/render.go#L350-L358) for the kube-apiserver enforces `restricted` across the cluster. -- The [`PodSecurityAdmissionLabelSynchronizationController`](https://github.com/openshift/cluster-policy-controller/blob/327d3cbd82fd013a9d5d5733eb04cc0dcd97aec5/pkg/cmd/controller/psalabelsyncer.go#L17-L52) automatically sets the `pod-security.kubernetes.io/enforce` label. - -### Rollout - -Release `n` will introduce the new API. It will default to `Restricted` PSS in its `spec` and list violating Namespaces with potentially failing workloads in the `status` field. -If violating Namespaces are detected, the Operator will move into `Upgradeable=false`. To be able to upgrade, the user needs to resolve the violations or set the `spec` to `Privileged`. - -In Release `n+1` the the controller will enforce PSA, if the `spec` is set to `Restricted`. If the `spec` is set to `Privileged` or the `FeatureGate` of `OpenShiftPodSecurityAdmission` is disable, the controllers won't enforce. The `FeatureGate` will act as a break-glass option. - -#### Examples - -Examples of failing workloads include: - -- **Category 1**: Namespaces with workloads that use user-bound SCCs (workloads created directly by a user) without meeting the `Restricted` PSS. -- **Category 2**: Namespaces that do not have the `pod-security.kubernetes.io/enforce` label and whose workloads would not satisfy the `Restricted` PSS. - Possible cases include: - 1. Namespaces with `security.openshift.io/scc.podSecurityLabelSync: "false"` and no `pod-security.kubernetes.io/enforce` label set. - 2. `openshift-` prefixed Namespaces (not necessarily created or managed by OpenShift teams). 
- -### User Control and Insights - -To allow user influence over this transition, a new API called `PSAEnforcementConfig` is introduced. -It will let administrators: -- Enable PSA enforcement by leaving the `spec` to `Restricted` and having no violating Namespaces. -- Block PSA enforcement by setting the `spec` to `Privileged`. -- Get insights which Namespaces would fail in order to resolve the issues. - ## Design Details ### Improved Diagnostics -The [ClusterFleetEvaluation](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/dev-guide/cluster-fleet-evaluation.md) revealed that certain clusters would fail enforcement without clear explanations. +To predict more accurately which Namespaces are violating, meaning that they contain workloads that would fail, the diagnostics need to be improved. + +The [ClusterFleetEvaluation](https://github.com/openshift/enhancements/blob/61581dcd985130357d6e4b0e72b87ee35394bf6e/dev-guide/cluster-fleet-evaluation.md) revealed that certain clusters would fail enforcement. +While it is possible to determine whether a workload would fail, the reason is not always clear. A likely root cause is that the [`PodSecurityAdmissionLabelSynchronizationController`](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) (PSA label syncer) does not label Namespaces that rely on user-based SCCs. -In some cases, the evaluation was impossible because PSA labels had been overwritten by users. +In some other cases, the evaluation was impossible because PSA labels had been overwritten by users. Additional diagnostics are required to confirm the full set of potential causes. +While the root causes need to be identified in some cases, the result of identifying a violating Namespace is understood. 
+ #### New SCC Annotation: `security.openshift.io/ValidatedSCCSubjectType` The annotation `openshift.io/scc` currently indicates which SCC admitted a workload, but it does not distinguish **how** the SCC was granted — whether through a user or a Pod’s ServiceAccount. @@ -135,16 +110,14 @@ By adding these annotations, the [`PodSecurityReadinessController`](https://gith - With `ValidatedSCCSubjectType`, the controller can classify issues arising from user-based SCC workloads separately. Many of the remaining clusters with violations appear to involve workloads admitted by user SCCs. -### Secure Rollout -The Proposal section indicates that enforcement will be introduced first at the Namespace level and later at the global (cluster-wide) level. -In addition to adjusting how the `OpenShiftPodSecurityAdmission` `FeatureGate` behaves, administrators need visibility and control throughout this transition. -A new API is necessary to provide this flexibility. +### New API -#### New API - -This API offers a gradual way to roll out Pod Security Admission enforcement to clusters. -It gives users the ability to influence the rollout and see feedback on which Namespaces might violate Pod Security standards. +This API supports users in enforcing PSA. +As this is a transitory process, this API will lose its usefulness once PSA enforcement isn't optional anymore. +The API offers users two things: +- the ability to halt enforcement and +- the ability to identify failing Namespaces. ```go package v1alpha1 @@ -153,105 +126,51 @@ import ( metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" ) -// PSAEnforcementMode indicates the actual enforcement state of Pod Security Admission -// in the cluster. Unlike PSATargetMode, which reflects the user’s desired or “target” -// setting, PSAEnforcementMode describes the effective mode currently active. 
-// -// The modes define a progression from no enforcement, through label-based enforcement -// to label-based with global config enforcement. +// PSAEnforcementMode defines the Pod Security Standard that should be applied. type PSAEnforcementMode string const ( - // PSAEnforcementModePrivileged indicates that no Pod Security restrictions - // are effectively applied. - // This aligns with a pre-rollout or fully "privileged" cluster state, - // where neither enforce labels are set nor the global config enforces "Restricted". + // PSAEnforcementModePrivileged indicates that the cluster should not enforce PSA restrictions and stay in Privileged mode. PSAEnforcementModePrivileged PSAEnforcementMode = "Privileged" - - // PSAEnforcementModeLabel indicates that the cluster is enforcing Pod Security - // labels at the Namespace level (via the PodSecurityAdmissionLabelSynchronizationController), - // but the global kube-apiserver configuration is still "Privileged." - PSAEnforcementModeLabel PSAEnforcementMode = "LabelEnforcement" - - // PSAEnforcementModeFull indicates that the cluster is enforcing - // labels at the Namespace level, and the global configuration has been set - // to "Restricted" on the kube-apiserver. - // This represents full enforcement, where both Namespace labels and the global config - // enforce Pod Security Admission restrictions. - PSAEnforcementModeFull PSAEnforcementMode = "FullEnforcement" + // PSAEnforcementModeRestricted indicates that the cluster should enforce PSA restrictions, if no violating Namespaces are found. + PSAEnforcementModeRestricted PSAEnforcementMode = "Restricted" ) -// PSATargetMode reflects the user’s chosen (“target”) enforcement level. -type PSATargetMode string - -const ( - // PSATargetModePrivileged indicates that the user wants no Pod Security - // restrictions applied. 
The desired outcome is that the cluster remains - // in a fully privileged (pre-rollout) state, ignoring any label enforcement - // or global config changes. - PSATargetModePrivileged PSATargetMode = "Privileged" - - // PSATargetModeConditional indicates that the user is willing to let the cluster - // automatically enforce a stricter enforcement once there are no violating Namespaces. - // If violations exist, the cluster stays in its previous state until those are resolved. - // This allows a gradual move towards label and global config enforcement without - // immediately breaking workloads that are not yet compliant. - PSATargetModeConditional PSATargetMode = "Conditional" - - // PSATargetModeRestricted indicates that the user wants the strictest possible - // enforcement, causing the cluster to ignore any existing violations and - // enforce "Restricted" anyway. This reflects a final, fully enforced state. - PSATargetModeRestricted PSATargetMode = "Restricted" -) - -// PSAEnforcementConfig is the config for the PSA enforcement. +// PSAEnforcementConfig is a config that supports the user in the PSA enforcement transition. +// The spec struct enables a user to stop the PSA enforcement, if necessary. +// The status struct supports the user in identifying obstacles in PSA enforcement. type PSAEnforcementConfig struct { metav1.TypeMeta `json:",inline"` metav1.ObjectMeta `json:"metadata,omitempty"` - // spec holds user-settable values for configuring Pod Security Admission - // enforcement + // spec is a configuration option that enables the customer to influence the PSA enforcement outcome. Spec PSAEnforcementConfigSpec `json:"spec"` - // status communicates the targeted enforcement mode, including any discovered - // issues in Namespaces. + // status reflects the cluster status wrt PSA enforcement. Status PSAEnforcementConfigStatus `json:"status"` } -// PSAEnforcementConfigSpec defines the desired configuration for Pod Security -// Admission enforcement. 
+// PSAEnforcementConfigSpec is a configuration option that enables the customer to influence the PSA enforcement outcome. type PSAEnforcementConfigSpec struct { - // targetMode is the user-selected Pod Security Admission enforcement level. - // Valid values are: - // - "Privileged": ensures the cluster runs with no restrictions - // - "Conditional": defers the decision to cluster-based evaluation - // - "Restricted": enforces the strictest Pod Security admission + // enforcementMode gives the user different options: + // - Restricted enables the cluster to move to PSA enforcement, if there are no violating Namespaces detected. + // If violating Namespaces are found, the operator moves into "Upgradeable=false". + // - Privileged enables the cluster to opt out of PSA enforcement for now and it resolves the operator status of "Upgradeable=false" in case of violating Namespaces. // - // If this field is not set, it defaults to "Conditional". - // - // +kubebuilder:default=Conditional - TargetMode PSATargetMode `json:"targetMode"` + // defaults to "Restricted" + EnforcementMode PSAEnforcementMode `json:"enforcementMode"` } -// PSAEnforcementConfigStatus defines the observed state of Pod Security -// Admission enforcement. +// PSAEnforcementConfigStatus is a struct that signals to the user whether the cluster is going to start with PSA enforcement and whether there are any violating Namespaces. type PSAEnforcementConfigStatus struct { - // enforcementMode indicates the effective Pod Security Admission enforcement - // mode in the cluster. Unlike spec.targetMode, which expresses the desired mode, - // enforcementMode reflects the actual state after considering any existing - // violations or user overrides. + // enforcementMode indicates if PSA enforcement will happen: + // - "Restricted" indicates that enforcement is possible and will happen. 
+ // - "Privileged" indicates that enforcement will not happen: + // - either it is not desired or + // - it isn't possible without potentially breaking workloads. EnforcementMode PSAEnforcementMode `json:"enforcementMode"` - // violatingNamespaces is a list of namespaces that can initially block the - // cluster from fully enforcing a "Restricted" mode. Administrators should - // review each listed Namespace to fix any issues to enable strict enforcement. - // - // If a cluster is already in a more "Restricted" mode and new violations emerge, - // it remains in "Restricted" until the user explicitly switches to - // "spec.mode = Privileged". - // - // To revert "Restricted" mode the Administrators need to set the - // PSAEnfocementMode to "Privileged". + // violatingNamespaces lists Namespaces that are violating. They need to be resolved in order to move to Restricted. // // +optional ViolatingNamespaces []ViolatingNamespace `json:"violatingNamespaces,omitempty"` @@ -277,52 +196,29 @@ type ViolatingNamespace struct { } ``` -`Privileged` and `Restricted` each ignore cluster feedback and strictly enforce their respective modes: - -- `Privileged` -> `Privileged` -- `Restricted` -> `FullEnforcement` - -When `Conditional` is selected, enforcement depends on whether there are violating Namespaces and on the current release. - -- In `n` and `n+1`: It only progresses from `Privileged` to `LabelEnforcement`, if there would be no PSA label syncer violations. -- In `n+1`: It only progresses from `LabelEnforcement` to`FullEnforcement`, if there would be no PodSecurity config violations. 
- -Below is a table illustrating the expected behavior when the `FeatureGate` `OpenShiftPodSecurityAdmission` is enabled: - -| spec.targetMode | violations found | release | status.enforcementMode | -| ----------------- | ---------------- | ------- | ---------------------- | -| Restricted | none | n - 1 | Privileged | -| Restricted | found | n - 1 | Privileged | -| Privileged | none | n - 1 | Privileged | -| Privileged | found | n - 1 | Privileged | -| Conditional | none | n - 1 | Privileged | -| Conditional | found | n - 1 | Privileged | -| Restricted | none | n | FullEnforcement | -| Restricted | found | n | FullEnforcement | -| Privileged | none | n | Privileged | -| Privileged | found | n | Privileged | -| Conditional | none | n | LabelEnforcement | -| Conditional | found | n | LabelEnforcement | -| Restricted | none | n + 1 | FullEnforcement | -| Restricted | found | n + 1 | FullEnforcement | -| Privileged | none | n + 1 | Privileged | -| Privileged | found | n + 1 | Privileged | -| Conditional | none | n + 1 | FullEnforcement | -| Conditional | found | n + 1 | Privileged | - -A cluster that uses `spec.targetMode = Conditional` can revert to `Privileged` only if the user explicitly sets `spec.targetMode = Privileged`. -A cluster in `spec.mode = Conditional` that starts with `status.EnforcementMode = Privileged` may switch to a more restrictive enforcement mode as soon as there are no violations. -To manage the timing of this rollout, an administrator can set `spec.mode = Privileged` and later switch it to `Conditional` when ready. - -`status.violatingNamespaces` lists the Namespaces that would fail if `status.enforcementMode` were `LabelEnforcement` or `FullEnforcement`. -The reason field helps identify whether the PSA label syncer or the PodSecurity config is the root cause. -Administrators must query the kube-apiserver (or use the [cluster debugging tool](https://github.com/openshift/cluster-debug-tools)) to pinpoint specific workloads. 
+The following table shows the expected outcomes: + +| `spec.enforcementMode` | length of `status.violatingNamespaces` | `status.enforcementMode` | `OperatorStatus` | +| ---------------------- | -------------------------------------- | ------------------------ | ----------------- | +| Privileged | More than 0 | Privileged | AsExpected | +| Privileged | 0 | Privileged | AsExpected | +| Restricted | More than 0 | Privileged | Upgradeable=False | +| Restricted | 0 | Restricted | AsExpected | + +A user who encounters entries in `status.violatingNamespaces` is expected to: + +- resolve the violating Namespaces in order to be able to upgrade or +- set the `spec.enforcementMode=Privileged` and resolve the violating Namespaces later. + +If a user manages several clusters and there are well-known violating Namespaces, the `spec.enforcementMode=Privileged` can be set as a precaution. ### Implementation Details - The `PodSecurityReadinessController` in the `cluster-kube-apiserver-operator` will manage the new API. -- If the `FeatureGate` is removed from the current `FeatureSet`, the cluster must revert to its previous state. -- The [`Config Observer Controller`](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/operator/configobservation/configobservercontroller/observe_config_controller.go#L131) must be updated to watch for the new API alongside the `FeatureGate`. +- The [`Config Observer Controller`](https://github.com/openshift/cluster-kube-apiserver-operator/blob/218530fdea4e89b93bc6e136d8b5d8c3beacdd51/pkg/operator/configobservation/configobservercontroller/observe_config_controller.go#L131) must be updated to watch for the new API's status alongside the `FeatureGate`. 
+- The [`PodSecurityAdmissionLabelSynchronizationController`](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/cmd/controller/psalabelsyncer.go#L17-L50) must be updated to watch for the new API's status alongside the `FeatureGate`. +- If the `FeatureGate` `OpenShiftPodSecurityAdmission` is removed from the current `FeatureSet`, the cluster must revert to its previous state. + It serves as a break-glass mechanism. #### PodSecurityReadinessController @@ -333,23 +229,23 @@ With the `security.openshift.io/ValidatedSCCSubjectType`, it can categorize viol #### PodSecurity Configuration -A Config Observer in the `cluster-kube-apiserver-operator` manages the Global Config for the kube-apiserver, adjusting behavior based on the feature flag. -It must watch both the `status.enforcementMode` and the `FeatureGate` to make decisions. +A Config Observer in the `cluster-kube-apiserver-operator` manages the Global Config for the kube-apiserver, adjusting behavior based on the `OpenShiftPodSecurityAdmission` `FeatureGate`. +It must watch both the `status.enforcementMode` for `Restricted` and the `FeatureGate` `OpenShiftPodSecurityAdmission` to be enabled to make decisions. -#### PSA Label Syncer +#### PodSecurityAdmissionLabelSynchronizationController -The PSA label syncer will watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` feature gate. -If `status.enforcementMode` is `LabelEnforcement` or `FullEnforcement` and `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label. +The [PodSecurityAdmissionLabelSynchronizationController (PSA label syncer)](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) must watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` `FeatureGate`. 
+If `spec.enforcementMode` is `Restricted` and the `FeatureGate` `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label. Otherwise, it will refrain from setting that label and remove any enforce labels it owns if existent. -Because the ability to set `pod-security.kubernetes.io/enforce` is introduced in release `n`, the ability to remove that label must exist in release `n-1`. +Because the ability to set `pod-security.kubernetes.io/enforce` is introduced, the ability to remove that label must exist in the release before. Otherwise, the cluster will be unable to revert to its previous state. ## Open Questions ### Fresh Installs -Needs to be evaluated. The System Administrator needs to pre-configure the new API’s `spec.targetMode`, choosing whether the cluster will be `privileged`, `restricted`, or `conditional` during a fresh install. +Needs to be evaluated. The System Administrator needs to pre-configure the new API’s `spec.enforcementMode`, choosing whether the cluster will be `Privileged` or `Restricted` during a fresh install. ### Impact on HyperShift @@ -357,9 +253,14 @@ Needs to be evaluated. ### Baseline Clusters -The current suggestion differentiates between `restricted` and `privileged` PSS. +The current suggestion differentiates between `Restricted` and `Privileged` PSS. It may be possible to introduce an intermediate step and set the cluster to `baseline` instead. +### Enforce PSA label syncer, fine-grained + +It would be possible to enforce only the `pod-security.kubernetes.io/enforce` labels on Namespaces without enforcing it globally through the `PodSecurity` configuration given to the kube-apiserver. +It would also be possible to enforce `pod-security.kubernetes.io/enforce` labels only on Namespaces that are known not to fail. 
+ ## Test Plan The PSA label syncer currently maps SCCs to PSS through a hard-coded rule set, and the PSA version is set to `latest`. 
@@ -370,37 +271,42 @@ Otherwise, it can't be guaranteed that all possible SCCs are mapped correctly. ## Graduation Criteria -- If `status.enforcementMode = LabelEnforcement` rolls out on most clusters with no adverse effects, `status.enforcementMode = FullEnforcement` can be enabled in the subsequent release. -- If the majority of users have `status.enforcementMode = FullEnforcement`, then upgrades can be blocked on clusters that do not reach that state. +If `spec.enforcementMode = Restricted` rolls out on most clusters with no adverse effects, the ability to avoid `Upgradeable=false` with violating Namespaces by setting `spec.enforcementMode = Privileged` will be removed. ## Upgrade / Downgrade Strategy ### On Upgrade -See the [Release Timing](#release-timing) section for the overall upgrade strategy. +The API needs to be introduced before the controllers start to use it: + +- Release `n-1`: + - Backport the API. +- Release `n`: + - Enable the `PodSecurityReadinessController` to use the API by setting its `status`. +- Release `n+1`: + - Enable the `PodSecurityAdmissionLabelSynchronizationController` and `Config Observer Controller` to enforce, if: + - there are no potentially failing workloads (indicated by violating Namespaces) and + - the `OpenShiftPodSecurityAdmission` `FeatureGate` is enabled. + - Enable the `OpenShiftPodSecurityAdmission` `FeatureGate`. ### On Downgrade -See the earlier references, including the [PSA Label Syncer](#psa-label-syncer) subsection in the [Implementation Details](#implementation-details) section, for the downgrade strategy. +The changes made by enforcement need to be revertible: -## New Installation +- Release `n`: The `PodSecurityAdmissionLabelSynchronizationController` needs to be able to remove the enforcement labels that will be set in `n+1`. -The default for new installs is `Conditional`, to prompt administrators toward adopting `Restricted`. 
+## New Installation -A fresh install should not have any violating Namespaces. -Therefore, as `spec.targetMode` is not set to `Privileged`, the cluster would move to `status.enforcementMode = LabelEnforcement` or `status.enforcementMode = FullEnforcement`. -An administrator can also configure the cluster to start in `Privileged` if desired. +TBD ## Operational Aspects -- If a cluster is set to `Conditional` and has initial violations, those may be resolved one by one. - Once all violations are resolved, the cluster may immediately transition to `Restricted`. - Some administrators may prefer managing this switch manually. -- After a cluster switches to a stricter `status`, no violating workloads should be possible. - If a violating workload appears, there is no automatic fallback to a more privileged state, thus avoiding additional kube-apiserver restarts. -- Administrators facing issues in a cluster already set to a stricter enforcement can change `spec.targetMode` to `Privileged` to halt enforcement for other clusters. +- Administrators facing issues in a cluster already set to a stricter enforcement can change `spec.enforcementMode` to `Privileged` to halt enforcement for other clusters. - ClusterAdmins must ensure that directly created workloads (user-based SCCs) have correct `securityContext` settings. + They can't rely on the `PodSecurityAdmissionLabelSynchronizationController`, which only watches ServiceAccount-based RBAC. Updating default workload templates can help. +- The evaluation of the cluster happens once every 4 hours with a throttled client in order to avoid a denial of service on clusters with a large number of Namespaces. + As a result, it can take several hours to identify a violating Namespace. 
- To identify specific problems in a violating Namespace, administrators can query the kube-apiserver: ```bash From 96149019c577c07703fb54bf8e7e6aed4c0aed02 Mon Sep 17 00:00:00 2001 From: Krzysztof Ostrowski Date: Thu, 20 Feb 2025 15:31:00 +0100 Subject: [PATCH 4/4] enhancements/authentication: add explanations - List which namespaces aren't managed by the PSA label syncer - Explain, why we want to force restricted PSA enforcement eventually. - Add a guide on how to handle violations. --- .../pod-security-admission-enforcement.md | 89 ++++++++++++++++++- 1 file changed, 86 insertions(+), 3 deletions(-) diff --git a/enhancements/authentication/pod-security-admission-enforcement.md b/enhancements/authentication/pod-security-admission-enforcement.md index d374fa5f41..1729a0a55d 100644 --- a/enhancements/authentication/pod-security-admission-enforcement.md +++ b/enhancements/authentication/pod-security-admission-enforcement.md @@ -39,12 +39,16 @@ Over the last few releases, the number of clusters with violating workloads has Although these numbers are now quite low, it is essential to avoid any scenario where users end up with failing workloads. To ensure a safe transition, this proposal suggests that if a potential failure of workloads is being detected in release `n`, that the operator moves into `Upgradeable=false`. -The user would need to either resolve the potential failures or set the enforcing mode to `Privileged` for now in order to be able to upgrade. +The user would need to resolve the potential failures, set a higher PSS label for that Namespace, or set the enforcing mode to `Privileged` for now in order to be able to upgrade. `Privileged` will keep the cluster in the previous state, the non enforcing state. In the following release `n+1`, the controller will then do the actual enforcement, if `Restricted` is set. An overview of the Namespaces with failures will be listed in the API's status, should help the user to fix any issues. 
+The temporary `Privileged` mode (opt-out from PSA enforcement) exists solely to facilitate a smooth transition. Once the vast majority of clusters have adapted their workloads to operate under `Restricted` PSS, maintaining the option to run in `Privileged` mode would undermine OpenShift's security objectives. + +OpenShift strives to offer the highest security standards. Enforcing PSS ensures that OpenShift is at least as secure as upstream Kubernetes and that OpenShift complies with upstream security best practices. + ### Goals 1. Rolling out Pod Security Admission enforcement. @@ -79,7 +83,7 @@ Additional diagnostics are required to confirm the full set of potential causes. While the root causes need to be identified in some cases, the result of identifying a violating Namespace is understood. -#### New SCC Annotation: `security.openshift.io/ValidatedSCCSubjectType` +#### New SCC Annotation: `security.openshift.io/validated-scc-subject-type` The annotation `openshift.io/scc` currently indicates which SCC admitted a workload, but it does not distinguish **how** the SCC was granted — whether through a user or a Pod’s ServiceAccount. A new annotation will help determine if a ServiceAccount with the required SCCs was used, or if a user created the workload out of band. @@ -235,9 +239,19 @@ It must watch both the `status.enforcementMode` for `Restricted` and the `Featur #### PodSecurityAdmissionLabelSynchronizationController The [PodSecurityAdmissionLabelSynchronizationController (PSA label syncer)](https://github.com/openshift/cluster-policy-controller/blob/master/pkg/psalabelsyncer/podsecurity_label_sync_controller.go) must watch the `status.enforcementMode` and the `OpenShiftPodSecurityAdmission` `FeatureGate`. -If `spec.enforcementMode` is `Restricted` and the `FeatureGate` `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label. 
+If `status.enforcementMode` is `Restricted` and the `FeatureGate` `OpenShiftPodSecurityAdmission` is enabled, the syncer will set the `pod-security.kubernetes.io/enforce` label on Namespaces that it manages. Otherwise, it will refrain from setting that label and remove any enforce labels it owns if existent. +Namespaces that are **not managed** by the `PodSecurityAdmissionLabelSynchronizationController` are Namespaces that: + +- Are prefixed with `openshift`. +- Have the label `security.openshift.io/scc.podSecurityLabelSync=false`. +- Have the `pod-security.kubernetes.io/enforce` label set manually. +- Are a run-level zero Namespace: + - `kube-system`, + - `default` or + - `kube-public`. + Because the ability to set `pod-security.kubernetes.io/enforce` is introduced, the ability to remove that label must exist in the release before. Otherwise, the cluster will be unable to revert to its previous state. @@ -301,6 +315,8 @@ TBD ## Operational Aspects +### In general + - Administrators facing issues in a cluster already set to a stricter enforcement can change `spec.enforcementMode` to `Privileged` to halt enforcement for other clusters. - ClusterAdmins must ensure that directly created workloads (user-based SCCs) have correct `securityContext` settings. They can't rely on the `PodSecurityAdmissionLabelSynchronizationController`, which only watches ServiceAccount-based RBAC. 
@@ -313,3 +329,70 @@ TBD kubectl label --dry-run=server --overwrite $NAMESPACE --all \ pod-security.kubernetes.io/enforce=$MINIMALLY_SUFFICIENT_POD_SECURITY_STANDARD ``` + +### Setting the `pod-security.kubernetes.io/enforce` label manually + +To assess if your Namespace is capable of running with the `Restricted` PSS, run this: + +```bash + kubectl label --dry-run=server --overwrite ns $NAMESPACE \ + pod-security.kubernetes.io/enforce=restricted +``` + +To assess if your Namespace is capable of running with the `Baseline` PSS, run this: + +```bash + kubectl label --dry-run=server --overwrite ns $NAMESPACE \ + pod-security.kubernetes.io/enforce=baseline +``` + +If both commands return warning messages, the Namespace needs `Privileged` PSS in its current state. +It can be useful to read the warning messages to identify fields in the Pod manifest that could be adjusted to meet a stricter security standard. + +To set the label, remove the `--dry-run=server` flag. + +### Resolving Violating Namespaces + +There are different reasons why the built-in solution can't set the PSS properly in the Namespace. + +#### Namespace name starts with `openshift` + +*Hint: The `openshift` prefix is reserved for OpenShift and the PSA label syncer will not set the `pod-security.kubernetes.io/enforce` label.* + +The Namespace that is listed as violating has a name that starts with `openshift`. +It happens that guides or scripts create Namespaces with the `openshift` prefix. +Another root cause is that the team that owns the Namespace did not set the required PSA labels. +This should not happen, and could indicate that an outdated version is being used. + +To solve the issue: + + - If the Namespace is being created by the user: + - creating a Namespace with the `openshift` prefix isn't supported, so + - the user should recreate the Namespace with a different name or + - if that isn't possible, set the `pod-security.kubernetes.io/enforce` label manually. 
+ - If the Namespace is owned by OpenShift: + - Check for updates. + - If up to date: report as a bug. + +#### Namespace has disabled PSA synchronization + +The Namespace has disabled [PSA synchronization](https://docs.openshift.com/container-platform/4.17/authentication/understanding-and-managing-pod-security-admission.html#security-context-constraints-psa-opting_understanding-and-managing-pod-security-admission). +This can be identified by checking for the label `security.openshift.io/scc.podSecurityLabelSync=false` in the Namespace manifest. + +To solve the issue: + + - Enable PSA synchronization with `security.openshift.io/scc.podSecurityLabelSync=true` or + - Set the `pod-security.kubernetes.io/enforce` label manually. + +#### Namespace workload doesn't use ServiceAccount SCC + +The Namespace workload doesn't use a ServiceAccount SCC, but receives its SCCs from the executing user. +This usually happens when a workload isn't running through a deployment with a properly set up ServiceAccount. +One way to verify this is to check the `security.openshift.io/validated-scc-subject-type` annotation on the Pod manifest. + +To solve the issue: + + - Update the ServiceAccount to be able to use the necessary SCCs. + The necessary SCC can be identified in the annotation `openshift.io/scc` of the existing workloads. + After that is done, the PSA label syncer will update the PSA labels. + - Otherwise set the `pod-security.kubernetes.io/enforce` label manually.