Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NE-1790: Enable Dynamic Configuration Manager feature gate #1159

Merged
merged 2 commits into from
Nov 21, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,6 @@ require (
// github.com/operator-framework/operator-sdk.
replace (
bitbucket.org/ww/goautoneg => github.com/munnerz/goautoneg v0.0.0-20120707110453-a547fc61f48d
github.com/openshift/api => github.com/openshift/api v0.0.0-20241004095111-b1f700bdd8d2
github.com/openshift/api => github.com/openshift/api v0.0.0-20241018083007-4f6053f954b0
k8s.io/client-go => k8s.io/client-go v0.31.1
)
4 changes: 2 additions & 2 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -993,8 +993,8 @@ github.com/opencontainers/runtime-spec v0.1.2-0.20190507144316-5b71a03e2700/go.m
github.com/opencontainers/runtime-spec v0.1.2-0.20190618234442-a950415649c7/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-spec v1.0.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/opencontainers/runtime-tools v0.0.0-20181011054405-1d69bd0f9c39/go.mod h1:r3f7wjNzSs2extwzU3Y+6pKfobzPh+kKFJ3ofN+3nfs=
github.com/openshift/api v0.0.0-20241004095111-b1f700bdd8d2 h1:wCRdyt+nHnJsfuLMJF9RW1JK8G4Gvo//gBy6bZI5USE=
github.com/openshift/api v0.0.0-20241004095111-b1f700bdd8d2/go.mod h1:Shkl4HanLwDiiBzakv+con/aMGnVE2MAGvoKp5oyYUo=
github.com/openshift/api v0.0.0-20241018083007-4f6053f954b0 h1:9CBNaPGycU2dDzq0XoRIqxH0vHZezKDfbINx8e5zH0I=
github.com/openshift/api v0.0.0-20241018083007-4f6053f954b0/go.mod h1:Shkl4HanLwDiiBzakv+con/aMGnVE2MAGvoKp5oyYUo=
github.com/openshift/build-machinery-go v0.0.0-20200211121458-5e3d6e570160/go.mod h1:1CkcsT3aVebzRBzVTSbiKSkJMsC/CASqxesfqEMfJEc=
github.com/openshift/client-go v0.0.0-20200116152001-92a2713fa240/go.mod h1:4riOwdj99Hd/q+iAcJZfNCsQQQMwURnZV6RL4WHYS5w=
github.com/openshift/client-go v0.0.0-20240405120947-c67c8325cdd8 h1:HGfbllzRcrJBSiwzNjBCs7sExLUxC5/1evnvlNGB0Cg=
Expand Down
1 change: 1 addition & 0 deletions pkg/operator/controller/ingress/controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -209,6 +209,7 @@ type Config struct {
RouteExternalCertificateEnabled bool
IngressControllerLBSubnetsAWSEnabled bool
IngressControllerEIPAllocationsAWSEnabled bool
IngressControllerDCMEnabled bool
}

// reconciler handles the actual ingress reconciliation logic in response to
Expand Down
43 changes: 40 additions & 3 deletions pkg/operator/controller/ingress/deployment.go
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,11 @@ const (

RouterHAProxyConfigManager = "ROUTER_HAPROXY_CONFIG_MANAGER"

RouterHAProxyMaxDynamicServers = "ROUTER_MAX_DYNAMIC_SERVERS"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does there need to be any change for this in the flags https://github.com/openshift/router/blob/master/pkg/cmd/infra/router/template.go#L185?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Why? This flag accepts an environment varible as an alternative way of using it. This one is limited to 1 as minimal value and defaults to 5. I set the envvar explicitly to 1 so no default value is used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that you changed the minimum for BlueprintRoutePoolSize, and wondered why the default didn't need to change for MaxDynamicServers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to deliver an MVP in 4.18, we decided to turn off blueprint routes and pre-allocate 1 dynamic server per backend. openshift/enhancements#1687 explains the rationale.

RouterHAProxyMaxDynamicServersDefaultValue = 1

RouterHAProxyBlueprintRoutePoolSize = "ROUTER_BLUEPRINT_ROUTE_POOL_SIZE"
Comment on lines +87 to +90
Copy link
Contributor

@candita candita Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two env variables (ROUTER_MAX_DYNAMIC_SERVERS and ROUTER_BLUEPRINT_ROUTE_POOL_SIZE) are not found in our haproxy.config template yet and I don't see a PR that adds them. Does that matter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not used in the config template, they are only for the configuration of the router's DCM behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but it's strange that they'd need to be env variables in that case.

Copy link
Contributor

@Miciah Miciah Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operator usually uses environment variables rather than command-line switches for openshift-router, for which reason I asked Andrey to use environment variables in this case too (#1159 (comment)).


RouterHAProxyContstats = "ROUTER_HAPROXY_CONTSTATS"

RouterHAProxyThreadsEnvName = "ROUTER_THREADS"
Expand Down Expand Up @@ -121,7 +126,7 @@ func (r *reconciler) ensureRouterDeployment(ci *operatorv1.IngressController, in
if err != nil {
return false, nil, fmt.Errorf("failed to determine if proxy protocol is needed for ingresscontroller %s/%s: %v", ci.Namespace, ci.Name, err)
}
desired, err := desiredRouterDeployment(ci, r.config.IngressControllerImage, ingressConfig, infraConfig, apiConfig, networkConfig, proxyNeeded, haveClientCAConfigmap, clientCAConfigmap, clusterProxyConfig, r.config.RouteExternalCertificateEnabled)
desired, err := desiredRouterDeployment(ci, r.config.IngressControllerImage, ingressConfig, infraConfig, apiConfig, networkConfig, proxyNeeded, haveClientCAConfigmap, clientCAConfigmap, clusterProxyConfig, r.config.RouteExternalCertificateEnabled, r.config.IngressControllerDCMEnabled)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if there is anything else to change, I would suggest refactoring the desiredRouterDeployment signature to pass in r.config instead of three different parameters in r.config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this can be done in a separate PR.

if err != nil {
return haveDepl, current, fmt.Errorf("failed to build router deployment: %v", err)
}
Expand Down Expand Up @@ -243,7 +248,7 @@ func headerValues(values []operatorv1.IngressControllerHTTPHeader) string {
}

// desiredRouterDeployment returns the desired router deployment.
func desiredRouterDeployment(ci *operatorv1.IngressController, ingressControllerImage string, ingressConfig *configv1.Ingress, infraConfig *configv1.Infrastructure, apiConfig *configv1.APIServer, networkConfig *configv1.Network, proxyNeeded bool, haveClientCAConfigmap bool, clientCAConfigmap *corev1.ConfigMap, clusterProxyConfig *configv1.Proxy, routeExternalCertificateEnabled bool) (*appsv1.Deployment, error) {
func desiredRouterDeployment(ci *operatorv1.IngressController, ingressControllerImage string, ingressConfig *configv1.Ingress, infraConfig *configv1.Infrastructure, apiConfig *configv1.APIServer, networkConfig *configv1.Network, proxyNeeded bool, haveClientCAConfigmap bool, clientCAConfigmap *corev1.ConfigMap, clusterProxyConfig *configv1.Proxy, routeExternalCertificateEnabled, ingressControllerDCMEnabled bool) (*appsv1.Deployment, error) {
deployment := manifests.RouterDeployment()
name := controller.RouterDeploymentName(ci)
deployment.Name = name.Name
Expand Down Expand Up @@ -523,6 +528,15 @@ func desiredRouterDeployment(ci *operatorv1.IngressController, ingressController
LoadBalancingAlgorithm string `json:"loadBalancingAlgorithm"`
DynamicConfigManager string `json:"dynamicConfigManager"`
ContStats string `json:"contStats"`
// maxDynamicServers specifies the number of dynamic server slots that will be added to each route.
// This setting can only be configured if dynamicConfigManager is `true`.
// The default value is 1, with a maximum of 10.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The default value is 1, with a maximum of 10.
// The default value is 1.

// Dynamic server slots help to avoid reloads during endpoint scale-ups. The more dynamic servers
// added, the fewer reloads required when scaling up.
// Note, however, that dynamic servers consume memory even when not enabled.
// Use this analysis of the memory usage to assess the impact of different numbers of servers:
// https://gist.github.com/frobware/2b527ce3f040797909eff482a4776e0b
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: When this becomes suppported, we should add a link to an OpenShift repo.

MaxDynamicServers string `json:"maxDynamicServers"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be an int, not a string. Users may be confused that they'd have to specify 10000 as "10000" here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I just followed the same pattern as for the other overrides. Two of them (dynamicConfigManager,contStats) could be bools because bool is a valid JSON type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I think we have precedent for string-based bools. Choosing the string type for numbers allows malformatting, like "100K", which will fail later.

}
if len(ci.Spec.UnsupportedConfigOverrides.Raw) > 0 {
if err := json.Unmarshal(ci.Spec.UnsupportedConfigOverrides.Raw, &unsupportedConfigOverrides); err != nil {
Expand Down Expand Up @@ -567,13 +581,36 @@ func desiredRouterDeployment(ci *operatorv1.IngressController, ingressController
})
}

enableDCM := false
if ingressControllerDCMEnabled {
enableDCM = true
}
dynamicConfigOverride := unsupportedConfigOverrides.DynamicConfigManager
if v, err := strconv.ParseBool(dynamicConfigOverride); err == nil && v {
if v, err := strconv.ParseBool(dynamicConfigOverride); err == nil {
// Config override can still be used to opt out from DCM.
enableDCM = v
}

if enableDCM {
env = append(env, corev1.EnvVar{
Name: RouterHAProxyConfigManager,
Value: "true",
}, corev1.EnvVar{
Name: RouterHAProxyBlueprintRoutePoolSize,
Value: "0",
})

// Max dynamic servers override is only taken into account when DCM is enabled.
routerHAProxyMaxDynamicServersValue := RouterHAProxyMaxDynamicServersDefaultValue
if v, err := strconv.Atoi(unsupportedConfigOverrides.MaxDynamicServers); err == nil && v > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this allows users to set it to a value > 1, which we don't yet support. Should there be a validation to make sure it doesn't exceed 1?

Copy link
Contributor Author

@alebedev87 alebedev87 Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 offers the maximum benefit with minimal downside. However, there may be some "brave customers" who’d like to go beyond this. I wanted to provide some middle ground between the recommended default and experimental usage.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out we allow value > 1 in the unsupportedConfigOverride.

Copy link
Contributor

@candita candita Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding converting string to int here, see https://github.com/openshift/cluster-ingress-operator/pull/1159/files#r1852667015 And then I guess there is the edge case where someone might try to specify a number greater than int size or greater than the HAProxy size allowed (if any).

routerHAProxyMaxDynamicServersValue = v
}
env = append(env, corev1.EnvVar{
Name: RouterHAProxyMaxDynamicServers,
Value: strconv.Itoa(routerHAProxyMaxDynamicServersValue),
})
}

contStats := unsupportedConfigOverrides.ContStats
if v, err := strconv.ParseBool(contStats); err == nil && v {
env = append(env, corev1.EnvVar{
Expand Down
Loading