[Proposal] - Dynamic Cache for Gracefully Handling RBAC Changes #2176

Open
@everettraven

Problem Statement: As an operator author I want to develop an operator that can handle changing permissions so that cluster admins can use Role Based Access Control (RBAC) to scope the permissions given to my operator.

To let cluster admins scope the permissions given to operators, we need to provide operator authors with an easy way to handle dynamically changing permissions.

Background/Context

There have recently been questions posed in various avenues regarding the ability to scope operator permissions. The existing cache implementations have a limitation: they cannot reliably handle changes in RBAC, and they require specific permissions to be granted to the controller's ServiceAccount at all times. If the permissions are removed, the controller will either crash or enter an infinite loop that blocks reconciliation.

This proposal introduces a cache implementation that would handle changes in RBAC dynamically. It covers the advantages and disadvantages and introduces an existing Proof-of-Concept.
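
To make the failure mode above concrete, here is a minimal sketch in Go of the difference between retrying forever and giving up cleanly when RBAC denies a watch. It uses only the standard library and is not taken from controller-runtime or the PoC: `ErrForbidden`, `ErrUnavailable`, `WatchStarter`, `WatchResult`, and `StartWatch` are all hypothetical names. In a real controller the permission check would more likely be `apierrors.IsForbidden(err)` from `k8s.io/apimachinery/pkg/api/errors`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrForbidden is a hypothetical stand-in for a 403 from the API server.
var ErrForbidden = errors.New("forbidden: ServiceAccount lacks list/watch permission")

// ErrUnavailable is a hypothetical stand-in for a transient failure.
var ErrUnavailable = errors.New("API server temporarily unavailable")

// WatchStarter is a hypothetical function that tries to start one watch.
type WatchStarter func() error

// WatchResult describes how an attempt to establish a watch ended.
type WatchResult struct {
	Started  bool // the watch is running
	Denied   bool // RBAC denied the watch; the caller should drop the informer
	Attempts int  // how many times the starter was invoked
}

// StartWatch retries transient failures up to maxAttempts, but returns
// immediately on a permission error instead of retrying forever.
func StartWatch(start WatchStarter, maxAttempts int) (WatchResult, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	res := WatchResult{}
	var lastErr error
	for res.Attempts < maxAttempts {
		res.Attempts++
		err := start()
		if err == nil {
			res.Started = true
			return res, nil
		}
		if errors.Is(err, ErrForbidden) {
			// Retrying cannot succeed until an admin changes RBAC.
			res.Denied = true
			return res, err
		}
		lastErr = err
	}
	return res, fmt.Errorf("watch not started after %d attempts: %w", res.Attempts, lastErr)
}
```

A caller that sees `Denied` set could remove the informer and keep reconciling the resources it can still read, rather than crashing or blocking. How the permission is later re-detected is left open here.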

Advantages

For Operator/Controller Authors

  • Makes controllers more resilient to changes in RBAC
  • Dynamically add/remove informers as needed
  • A single caching layer that can manage both cluster scoped and namespace scoped informers
  • Operators/controllers created with this can be seen as more secure, since their permissions can be scoped
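
As a rough illustration of the "dynamically add/remove informers" and "single caching layer" points above, the following Go sketch keeps cluster-scoped and namespace-scoped informers in one registry and prunes those that are no longer permitted. It is an assumption-laden sketch, not the API of the PoC or of controller-runtime: `InformerKey`, `ScopedCache`, and every method on it are hypothetical, and the `allowed` callback stands in for whatever permission check a real implementation would use.

```go
package main

import (
	"sort"
	"sync"
)

// InformerKey identifies one informer. An empty Namespace means cluster scope.
type InformerKey struct {
	Resource  string
	Namespace string
}

// ScopedCache is a hypothetical registry of running informers.
// Each entry maps a key to the function that stops that informer.
type ScopedCache struct {
	mu        sync.Mutex
	informers map[InformerKey]func()
}

// NewScopedCache returns an empty registry.
func NewScopedCache() *ScopedCache {
	return &ScopedCache{informers: make(map[InformerKey]func())}
}

// Add registers an informer if the key is not present yet.
// It reports whether the informer was added.
func (c *ScopedCache) Add(key InformerKey, stop func()) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.informers[key]; exists {
		return false
	}
	c.informers[key] = stop
	return true
}

// Remove stops and forgets an informer. It reports whether the key existed.
func (c *ScopedCache) Remove(key InformerKey) bool {
	c.mu.Lock()
	stop, ok := c.informers[key]
	delete(c.informers, key)
	c.mu.Unlock()
	if ok && stop != nil {
		stop() // called outside the lock so stop may use the cache
	}
	return ok
}

// Keys returns the registered keys sorted by resource, then namespace.
func (c *ScopedCache) Keys() []InformerKey {
	c.mu.Lock()
	defer c.mu.Unlock()
	keys := make([]InformerKey, 0, len(c.informers))
	for k := range c.informers {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		if keys[i].Resource != keys[j].Resource {
			return keys[i].Resource < keys[j].Resource
		}
		return keys[i].Namespace < keys[j].Namespace
	})
	return keys
}

// Prune removes every informer for which allowed returns false and
// returns the removed keys in sorted order.
func (c *ScopedCache) Prune(allowed func(InformerKey) bool) []InformerKey {
	var removed []InformerKey
	for _, key := range c.Keys() {
		if !allowed(key) && c.Remove(key) {
			removed = append(removed, key)
		}
	}
	return removed
}
```

For example, if an admin narrows the operator's permissions to a single namespace, a call such as `cache.Prune(func(k InformerKey) bool { return k.Namespace == "team-a" })` would stop the cluster-scoped informer and any informers in other namespaces while leaving the `team-a` ones running.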

For Cluster Admins

  • Allows an operator/controller's permissions to be configured following the least-privilege principle and gives cluster admins more control over what an operator/controller can and cannot do.

For the Operator/Controller Ecosystem

  • There are numerous industries where security is a big factor in allowing certain software to run on clusters. The extra security brought by being able to scope the permissions of an operator/controller opens up the opportunity for these industries to start adopting the Operator pattern.

There are likely more advantages that are not listed here.

Disadvantages

For Operator/Controller Authors

  • Using this caching layer will likely result in authors having to adopt a new pattern for establishing watches
  • Introduces some new complexity for authors to ensure that their implementation works in various conditions

There are likely more disadvantages that are not listed here.

Proof of Concept

I have worked on a few iterations of a PoC for this, but the latest and most promising one can be found here: https://github.com/everettraven/telescopia

There is also a sample operator that uses the telescopia library here: https://github.com/everettraven/scoped-operator-poc/tree/poc/telescopia (specifically the poc/telescopia branch). There is a demo in the README that I highly recommend taking a look at to get a better idea of how an operator/controller may behave with this concept implemented.


I would love to get feedback and thoughts on the advantages and disadvantages of this, as well as on whether implementing this would be of value to the controller-runtime project and the community.

I look forward to discussing this further!

Labels: lifecycle/frozen (indicates that an issue or PR should not be auto-closed due to staleness)
