proposal: minimal vs standard containers #126

Closed
ardentperf opened this issue Dec 27, 2024 · 4 comments

ardentperf commented Dec 27, 2024

today, CNPG provides two operand container images: postgresql-containers and postgis-containers based on two respective upstream containers maintained by the docker community.

there are already several requests for specific extensions, and i'm sure more will come. i would propose that CNPG changes its strategy to simply provide two images: a "minimal" image with only the bare minimum extensions which is suitable for users to fork and customize with their own list of extensions, and a "standard" image with a liberal policy of including any extension which is packaged for Debian by CNPG.

this policy would encourage contribution to Postgres Debian packaging as an upstream (if desired extensions aren't there already), consolidate maintenance in the debian ecosystem, provide an easy way for CNPG users to get an image with most extensions they want, and still provide a minimal small image as a base for customization.

i don't think there needs to be a separate image for postgis; i think it can be included in a standard image (similar to what cloud providers do). at some point in the future there may be concerns about the size getting too bloated, but i think we can cross that bridge when we come to it (and discuss whether there should be some third middle option). in the meantime if we only add extensions that are explicitly requested by a user and available in Debian repos, then i think the pace of growth will be manageable.

thoughts?

@gbartolini gbartolini self-assigned this Dec 27, 2024
@gbartolini gbartolini added the enhancement New feature or request label Dec 27, 2024
@gbartolini gbartolini moved this to Todo in Roadmap Dec 27, 2024
Contributor

gbartolini commented Dec 27, 2024

Hi @ardentperf,

Thank you for your thoughtful proposal. I fully understand it, and I agree that this is a crucial moment for the community to make a decision.

That said, I believe it's important to provide some additional context regarding the future direction of the project, particularly in terms of container image requirements and extensions.

Requirements

With the introduction of CNPG-I and the experimental plugin for Barman Cloud, we are moving toward eliminating Barman Cloud as a strict requirement for CloudNativePG. In upcoming versions, we plan to transition to a plugin-based backup and recovery system, which will allow us to separate PostgreSQL operands from Barman Cloud operands. This change will enable us to leverage Docker Hub's official PostgreSQL images directly.

Extensions

We are actively involved in the development of the extension_control_path feature for PostgreSQL 18. Our long-term vision is for each extension to be distributed as a self-contained container image. CloudNativePG would then be able to rely on Kubernetes' VolumeMount functionality (currently in alpha) to mount immutable extension images as volumes and configure PostgreSQL to find them. David Wheeler has provided an excellent overview of the current status of this initiative here: [RFC on Extension Packaging and Lookup](https://justatheory.com/2024/11/rfc-extension-packaging-lookup/).

Thoughts

In light of the long-term roadmap for extensions, here's where I currently stand:

  1. We can create minimal CNPG images based on the official Docker Hub PostgreSQL images. The only modification is to adjust UID and GID from 999 to 26.
  2. We could also create standard CNPG images that include all Debian-compatible extensions and are licensing-compliant, as you suggested.

Would you be interested in contributing to this project?

What do other maintainers and component owners think?

My main concern is how these images would be supported on a purely community-driven basis.

Author

ardentperf commented Dec 30, 2024

> In upcoming versions, we plan to transition to a plugin-based backup and recovery system, which will allow us to separate PostgreSQL operands from Barman Cloud operands. This change will enable us to leverage Docker Hub's official PostgreSQL images directly.
>
> ...
>
> Our long-term vision is for each extension to be distributed as a self-contained container image. CloudNativePG would then be able to rely on Kubernetes' VolumeMount functionality (currently in alpha) to mount immutable extension images as volumes and configure PostgreSQL to find them.

Is the idea that the kubernetes Cluster definition yaml would no longer accept only one image, but would accept a base image and also a list of extension images? And is the idea that eventually CNPG wouldn't provide base images at all anymore, but only a set of "extension" images that can be mounted (sidecar-style?) alongside the official docker postgres base image in the pod?

If someday CNPG didn't provide base images, then what about failover slots and backup software? Those two seem pretty fundamental and I feel like people probably shouldn't deploy postgres on CNPG without them. I feel like those might be the two that merit being installed in a minimal image along with the UID change. By the way, why does the UID need to be changed from 999 to 26 for CNPG?

Side note... I'm also glad that CNPG explicitly sets the default locale back to C rather than en-US, which is the default in the official docker image. I'm probably tuned into the collation problems more than most, having been involved in that for a while. But anyway... setting locale back to C is done in CNPG code at cluster provisioning time, not in the docker image itself, so it's not relevant to how docker images are built.

> In light of the long-term roadmap for extensions, here's where I currently stand:
>
>   1. We can create minimal CNPG images based on the official Docker Hub PostgreSQL images. The only modification is to adjust UID and GID from 999 to 26.
>   2. We could also create standard CNPG images that include all Debian-compatible extensions and are licensing-compliant, as you suggested.

It's also worth considering size. When I looked just now, I think the three extensions we currently install from repos (audit, vector and failover slots) consume 2MB out of a total around 570MB in the container. Barman together with all its python dependencies is huge (130MB); hopefully that shrinks a bit with the move to packaging and CNPG-I. As we begin looking at more extensions, I do want to be cognizant of size. I think it's ok for the "standard" container to be a bit larger in order to provide an easy "batteries included" user experience, and we can document examples of how to create smaller images by building on the "minimal" container for people who need to keep the size down.
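
To put those numbers in proportion, here is a quick back-of-the-envelope sketch in Python. It only reuses the approximate figures quoted above (they are not fresh measurements), and the variable and function names are purely illustrative.

```python
# Approximate figures from the comment above, in MB (not re-measured).
TOTAL_MB = 570       # whole operand container image
EXTENSIONS_MB = 2    # audit, vector and failover slots combined
BARMAN_MB = 130      # Barman plus its Python dependencies


def share(part_mb: float, total_mb: float) -> float:
    """Return part_mb as a percentage of total_mb."""
    return 100.0 * part_mb / total_mb


extensions_pct = share(EXTENSIONS_MB, TOTAL_MB)
barman_pct = share(BARMAN_MB, TOTAL_MB)
# Hypothetical image size if Barman were dropped and nothing else changed.
without_barman_mb = TOTAL_MB - BARMAN_MB

print(f"extensions: {extensions_pct:.1f}% of the image")
print(f"barman:     {barman_pct:.1f}% of the image")
print(f"image without barman: ~{without_barman_mb} MB")
```

On these rough numbers the three extensions are well under 1% of the image while Barman is a bit over a fifth, so moving backup tooling out of the image would likely matter far more for size than adding a handful of small extensions.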

Yes, this is a project I'd be interested in contributing to.

@gbartolini
Contributor

> Is the idea that the kubernetes Cluster definition yaml would no longer accept only one image, but would accept a base image and also a list of extension images?

Not really. The concept of the image as it is now will not change (at least it is not in the current plans).

> And is the idea that eventually CNPG wouldn't provide base images at all anymore, but only a set of "extension" images that can be mounted (sidecar-style?) alongside the official docker postgres base image in the pod?

The main idea is to move towards light and efficient image handling: an operand image with just the bare minimum Postgres components required to run (so ... Postgres, ideally, with core extensions). This means that Barman Cloud will not be required anymore.

> If someday CNPG didn't provide base images, then what about failover slots and backup software?

Backup and Recovery software will be managed as plugins. This has been our attempt to provide our preferred way (volume snapshots and Barman Cloud) as well as allowing third parties to develop their favourite backup/recovery tool - and possibly support them. Barman Cloud will be supported as an official CloudNativePG plugin in the future.

Extensions will be managed the current way (add them to the base image) as well as, hopefully, through the new "extension_control_path" configuration option and the VolumeMount feature in Kubernetes, with self-contained images (1 extension = 1 container image with dependencies). The latter, as you can imagine, will take years to consolidate.

> Those two seem pretty fundamental and I feel like people probably shouldn't deploy postgres on CNPG without them.

The above should guarantee what you are asking.

> It's also worth considering size. When I looked just now, I think the three extensions we currently install from repos (audit, vector and failover slots) consume 2MB out of a total around 570MB in the container. Barman together with all its python dependencies is huge (130MB); hopefully that shrinks a bit with the move to packaging and CNPG-I. As we begin looking at more extensions, I do want to be cognizant of size. I think it's ok for the "standard" container to be a bit larger in order to provide an easy "batteries included" user experience, and we can document examples of how to create smaller images by building on the "minimal" container for people who need to keep the size down.

Agree. Size matters and ... removing Barman Cloud from the image is also a step in that direction.

> Yes, this is a project I'd be interested in contributing to.

It is worth talking with the maintainers about this then.

It is important, though, to clarify supportability. It must be clear who is responsible for supporting users when a certain extension or combination of extensions is not working, for handling updates, and so on. This could easily become an activity that takes a lot of time, and IMHO it is not fair to expect a community to do that (my 2 cents).

What do you think?

@ardentperf
Author

Let's move the discussion to #132 and close this, in order to keep it on a single thread. I think this whole discussion fits well in that new issue.

@github-project-automation github-project-automation bot moved this from Todo to Done in Roadmap Jan 12, 2025