feat!: Add working conversion webhook with cert rotation #1066

Open: wants to merge 42 commits into base: main

Conversation

Member

@sbernauer sbernauer commented Jun 30, 2025

Description

Part of stackabletech/issues#642

A working usage example can be found in stackabletech/zookeeper-operator#958 (mainly look at rust/operator-binary/src/main.rs).

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added

@sbernauer sbernauer changed the title feat(webhook): Add functioning conversion webhook with cert rotation feat(webhook): Add working conversion webhook with cert rotation Jul 2, 2025
Comment on lines +29 to +31
pub const WEBHOOK_CA_LIFETIME: Duration = Duration::from_minutes_unchecked(3);
pub const WEBHOOK_CERTIFICATE_LIFETIME: Duration = Duration::from_minutes_unchecked(2);
pub const WEBHOOK_CERTIFICATE_ROTATION_INTERVAL: Duration = Duration::from_minutes_unchecked(1);
Member Author

Reminder to bump these before merging. They are currently set this low to make testing easier.
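
When bumping these values, the ordering between them matters: the rotation interval has to be shorter than the certificate lifetime, and the certificate should not outlive its CA. A minimal sketch of such a sanity check follows. It uses `std::time::Duration` rather than the `Duration` type from the diff, and `lifetimes_are_consistent` is a hypothetical helper that is not part of this PR.

```rust
use std::time::Duration;

/// Hypothetical helper, not part of the PR: returns `true` when a new leaf
/// certificate is always issued before the previous one expires, and the leaf
/// certificate does not outlive the CA that signed it.
pub fn lifetimes_are_consistent(
    ca_lifetime: Duration,
    certificate_lifetime: Duration,
    rotation_interval: Duration,
) -> bool {
    rotation_interval < certificate_lifetime && certificate_lifetime <= ca_lifetime
}
```

With the testing values from the diff (3, 2 and 1 minutes) the check passes. It would fail if the rotation interval were raised above the certificate lifetime.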

@sbernauer sbernauer moved this to Development: In Progress in Stackable Engineering Jul 2, 2025
@sbernauer sbernauer moved this from Development: In Progress to Development: In Review in Stackable Engineering Jul 2, 2025
@sbernauer sbernauer moved this from Development: In Review to Development: Waiting for Review in Stackable Engineering Jul 2, 2025
@Techassi Techassi self-requested a review July 3, 2025 06:43
@Techassi Techassi moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Jul 3, 2025
Member

@Techassi Techassi left a comment

Partial review, I didn't look at the CertificateResolver yet.

@Techassi Techassi changed the title feat(webhook): Add working conversion webhook with cert rotation feat!: Add working conversion webhook with cert rotation Jul 3, 2025
@sbernauer sbernauer requested a review from Techassi July 4, 2025 10:19
Member

@Techassi Techassi left a comment

I left a bunch of comments.

I encountered the ConversionWebhookServer which is missing the changes we talked about a few weeks back. As such, it doesn't make a whole lot of sense to continue the review before these changes are implemented. I've sent you an appropriate patch as a private message which should be a good starting point for the changes we discussed.

Comment on lines +428 to +431
"--operator-namespace",
"stackable-operators",
"--operator-service-name",
"foo-operator",
Member

note: I still feel like all these CLI unit tests are pretty much useless and should be removed to speed up the testing runs.

Comment on lines +16 to +27
use kube::{
Api, Client, ResourceExt,
api::{Patch, PatchParams},
};
use snafu::{OptionExt, ResultExt, Snafu};
use stackable_operator::cli::OperatorEnvironmentOptions;
use tokio::{sync::mpsc, try_join};
use tracing::instrument;
use x509_cert::{
Certificate,
der::{EncodePem, pem::LineEnding},
};
Member

note: Please combine these imports with the others above.

Member Author

hmm, what imports exactly can be combined? rustfmt groups them like this

Member

Yeah this is sub-optimal formatting sadly. You need to manually move them to the top imports above line 11.

@sbernauer sbernauer requested a review from Techassi August 11, 2025 15:36
Member

@Techassi Techassi left a comment

We are mostly there, just some small things left.

@@ -4,6 +4,17 @@ All notable changes to this project will be documented in this file.

## [Unreleased]

### Added

- BREAKING: Add two new required CLI arguments: `--operator-namespace` and `-operator-service-name`.
Member

Suggested change
- - BREAKING: Add two new required CLI arguments: `--operator-namespace` and `-operator-service-name`.
+ - BREAKING: Add two new required CLI arguments: `--operator-namespace` and `--operator-service-name`.

RunWebhookServer { source: WebhookError },

#[snafu(display("failed to receive certificate from channel"))]
ReceiverCertificateFromChannel,
Member

Suggested change
- ReceiverCertificateFromChannel,
+ ReceiveCertificateFromChannel,

Comment on lines +73 to +75
/// The environment the operator is running in, notably the namespace and service name it is
/// reachable at.
pub operator_environment: OperatorEnvironmentOptions,
Member

note: Let's avoid pulling in a type from stackable_operator here. Instead, split it into two separate fields.

Suggested change
- /// The environment the operator is running in, notably the namespace and service name it is
- /// reachable at.
- pub operator_environment: OperatorEnvironmentOptions,
+ pub namespace: String,
+ pub service_name: String,
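
To illustrate why the namespace and service name are all the webhook needs here, the following sketch derives a subject alternative DNS name from the two plain fields. `WebhookLocation` and its method are hypothetical names used only for illustration. The `<service>.<namespace>.svc` form is the name the Kubernetes API server uses when it calls a webhook through a Service reference. Other names, such as ones that include a cluster domain, are left out because they depend on the cluster setup.

```rust
/// Hypothetical stand-in for the two fields suggested above; not a type from the PR.
pub struct WebhookLocation {
    pub namespace: String,
    pub service_name: String,
}

impl WebhookLocation {
    /// Returns the DNS name the serving certificate has to be valid for when
    /// the API server reaches the webhook via a Service reference.
    pub fn subject_alternative_dns_names(&self) -> Vec<String> {
        vec![format!("{}.{}.svc", self.service_name, self.namespace)]
    }
}
```

Using the values from the CLI test earlier in this review (`stackable-operators` and `foo-operator`), this yields `foo-operator.stackable-operators.svc`.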

@@ -2,4 +2,5 @@
//! purposes.
mod conversion;

- pub use conversion::*;
+ pub use conversion::{ConversionWebhookError, ConversionWebhookOptions, ConversionWebhookServer};
+ pub use kube::core::conversion::ConversionReview;
Member

note: This is already re-exported via stackable_webhook::servers::conversion::ConversionReview. As such, please remove it from here again.

Comment on lines +123 to +142
let certificate = ca
.generate_ecdsa_leaf_certificate(
"Leaf",
"webhook",
subject_alterative_dns_names.iter().map(|san| san.as_str()),
WEBHOOK_CERTIFICATE_LIFETIME,
)
.context(GenerateLeafCertificateSnafu)?;

let certificate_der = certificate
.certificate_der()
.context(EncodeCertificateDerSnafu)?;
let private_key_der = certificate
.private_key_der()
.context(EncodePrivateKeyDerSnafu)?;
let certificate_key =
CertifiedKey::from_der(vec![certificate_der], private_key_der, &tls_provider)
.context(DecodeCertifiedKeyFromDerSnafu)?;

Ok((certificate.certificate().clone(), Arc::new(certificate_key)))
Member

suggestion: Rename this to certificate_pair to better indicate what this is.

Suggested change
- let certificate = ca
- .generate_ecdsa_leaf_certificate(
- "Leaf",
- "webhook",
- subject_alterative_dns_names.iter().map(|san| san.as_str()),
- WEBHOOK_CERTIFICATE_LIFETIME,
- )
- .context(GenerateLeafCertificateSnafu)?;
- let certificate_der = certificate
- .certificate_der()
- .context(EncodeCertificateDerSnafu)?;
- let private_key_der = certificate
- .private_key_der()
- .context(EncodePrivateKeyDerSnafu)?;
- let certificate_key =
- CertifiedKey::from_der(vec![certificate_der], private_key_der, &tls_provider)
- .context(DecodeCertifiedKeyFromDerSnafu)?;
- Ok((certificate.certificate().clone(), Arc::new(certificate_key)))
+ let certificate_pair = ca
+ .generate_ecdsa_leaf_certificate(
+ "Leaf",
+ "webhook",
+ subject_alterative_dns_names.iter().map(|san| san.as_str()),
+ WEBHOOK_CERTIFICATE_LIFETIME,
+ )
+ .context(GenerateLeafCertificateSnafu)?;
+ let certificate_der = certificate_pair
+ .certificate_der()
+ .context(EncodeCertificateDerSnafu)?;
+ let private_key_der = certificate_pair
+ .private_key_der()
+ .context(EncodePrivateKeyDerSnafu)?;
+ let certificate_key =
+ CertifiedKey::from_der(vec![certificate_der], private_key_der, &tls_provider)
+ .context(DecodeCertifiedKeyFromDerSnafu)?;
+ Ok((certificate_pair.certificate().clone(), Arc::new(certificate_key)))

Comment on lines +74 to +81
let (cert, certified_key) = Self::generate_new_cert(subject_alterative_dns_names.clone())
.await
.context(GenerateNewCertificateSnafu)?;

cert_tx
.send(cert)
.await
.map_err(|_err| CertificateResolverError::SendCertificateToChannel)?;
Member

note: I feel like this can be moved into its own function, because we repeat the exact same code below.
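
A rough sketch of what such an extracted function could look like is shown below. It is deliberately dependency-free: it uses the synchronous `std::sync::mpsc` channel instead of the tokio channel from the diff, takes the generation step as a closure, and `generate_and_publish` and `RotateError` are hypothetical names, not identifiers from this PR.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical error type for this sketch; the PR uses snafu-based errors instead.
#[derive(Debug)]
pub enum RotateError<E> {
    /// Generating the certificate failed.
    Generate(E),
    /// The receiving side of the channel was dropped.
    Send,
}

/// Generates a new certificate, publishes it on the channel and returns the
/// matching key, so the initial setup and every later rotation can share one
/// code path.
pub fn generate_and_publish<C, K, E>(
    generate: impl FnOnce() -> Result<(C, K), E>,
    cert_tx: &Sender<C>,
) -> Result<K, RotateError<E>> {
    let (cert, certified_key) = generate().map_err(RotateError::Generate)?;
    cert_tx.send(cert).map_err(|_err| RotateError::Send)?;
    Ok(certified_key)
}
```

In the async code from the diff, the equivalent function would presumably be `async`, await both the generation and the `send` call, and map failures into `CertificateResolverError`.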

///
/// Usually, this struct is not constructed manually, but instead by calling
- /// [`Options::builder()`] or [`OptionsBuilder::default()`].
+ /// [`WebhookOptions::builder()`] or [`OptionsBuilder::default()`].
#[derive(Debug, Default)]
pub struct OptionsBuilder {
Member

note: We renamed Options to WebhookOptions. As such, I think we should also rename the builder accordingly.

Suggested change
- pub struct OptionsBuilder {
+ pub struct WebhookOptionsBuilder {

### Added

- BREAKING: Re-write the `ConversionWebhookServer`.
It can now do CRD conversions, handle multiple CRDs and takes care of reconciling the CRDs ([#1066]).
Member

note: I still feel like the webhook should not take care of reconciling the CRDs, but the operator should do it instead. Let's tackle this in a follow-up PR and leave it as is to get something out of the door.

@@ -4,6 +4,23 @@ All notable changes to this project will be documented in this file.

## [Unreleased]

### Added
Member

note: I feel like this should go under the "Changed" section instead. We additionally need to mention the Options renames.

Comment on lines +13 to +14
Also, `TlsServer::new` now returns an additional `mpsc::Receiver<Certificate>`, so that the caller
can get notified about certificate rotations happening ([#1066]).
Member

note: (Addition to the comment above) This would be the perfect trigger for the operator (the caller of this function) to reconcile the CRDs.
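
The idea could be sketched roughly as follows, with the caller owning the receiving end and running its own reconcile step for every rotated certificate. This is only an illustration under stated assumptions: it uses `std::sync::mpsc` rather than the tokio `mpsc::Receiver<Certificate>` mentioned in the changelog, and `reconcile_on_rotation` and its `reconcile` callback are hypothetical.

```rust
use std::sync::mpsc::Receiver;

/// Hypothetical caller-side loop: runs `reconcile` once for every certificate
/// announced on the channel and returns how many rotations were handled.
pub fn reconcile_on_rotation<C>(cert_rx: Receiver<C>, mut reconcile: impl FnMut(&C)) -> usize {
    let mut rotations = 0;
    // `iter()` blocks until the next certificate arrives and ends once all
    // senders have been dropped.
    for cert in cert_rx.iter() {
        reconcile(&cert);
        rotations += 1;
    }
    rotations
}
```

In the operator, the callback would presumably patch the new CA bundle into the CRDs' conversion webhook configuration. That part is omitted here because it depends on the follow-up design discussed above.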

Labels
None yet
Projects
Status: Development: In Review

3 participants