-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CORS-2818: Adding ability to render Cloud LB IPs #286
Conversation
@sadasu: This pull request references CORS-2818 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@sadasu: This pull request references CORS-2818 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@sadasu: This pull request references CORS-2818 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
pkg/config/node.go
Outdated
@@ -437,7 +443,7 @@ func GetConfig(kubeconfigPath, clusterConfigPath, resolvConfPath string, apiVips | |||
} else { | |||
ingressVip = nil | |||
} | |||
newNode, err := getNodeConfig(kubeconfigPath, clusterConfigPath, resolvConfPath, apiVip, ingressVip, apiPort, lbPort, statPort) | |||
newNode, err := getNodeConfig(kubeconfigPath, clusterConfigPath, resolvConfPath, apiVip, ingressVip, apiPort, lbPort, statPort, apiLBIPs[0], apiIntLBIPs[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it always work? What if [0]
does not exist? Can we get some basic unit tests please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding unit tests. To answer the other question, apiLBIPs[0], apiIntLBIPs[0] would be "" by default. It will have actual values only when the userProvisionedDNS
feature is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I miss something obvious but I don't see that... You call GetConfig
by passing []net.IP{}
as apiLBIPs
and then you reference apiLBIPs[0]
. This does not work, try running that code
a := []net.IP{}
fmt.Println(a[0])
and the output is not ""
but a panic runtime error: index out of range [0] with length 0
.
The initialization code is
cloudIntLBIPs, err := cmd.Flags().GetIPSlice("cloud-int-lb-ips")
if err != nil {
cloudIntLBIPs = []net.IP{}
}
meaning if you don't have the feature enabled you pass empty list, not ""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mkowalski ! I needed to update the logic in that part of the code too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really recommend starting from unit tests and then proceed with the implementation... What I can see in the current code is that it is still panicking because []net.IP{}
and nil
are different entities (i.e. empty slice and nil
is not the same in golang).
Look at the following code
a := []net.IP{}
if a == nil {
fmt.Println("nil")
} else {
fmt.Println(a[0])
}
or run it via playground, e.g. https://go.dev/play/p/UbrxBrUstTw
/retest-required Retesting after openshift/installer#7671 merged. |
/retest |
/hold Some fundamental flaws in error handling, lets work on them first |
/retest |
1 similar comment
/retest |
334f418
to
5c404e4
Compare
/test e2e-metal-ipi-ovn-ipv6 |
1 similar comment
/test e2e-metal-ipi-ovn-ipv6 |
@sadasu: This pull request references CORS-2818 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@sadasu: This pull request references CORS-2818 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
a671579
to
d431f2d
Compare
/retest-required |
pkg/config/cloud_lb_config_test.go
Outdated
}) | ||
}) | ||
}) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good, I would only see some additional test cases for mainly your function being called with the following
- empty API LB
- empty API INT
- empty Ingress
- empty Node
This is to make sure it handles missing and/or incorrect input correctly and does not crash the whole application when someone calls it with insane input
var apiLBIP, apiIntLBIP, ingressIP net.IP | ||
nodes := []Node{} | ||
ipCount := 0 | ||
if len(clusterLBConfig.ApiIntLBIPs) > len(clusterLBConfig.IngressLBIPs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is it possible that we have different number of IPs? This is a honest question, in the IPI loadbalancer we do there is always a requirement that you have the same number of IPs for API and for Ingress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the cloud IPI case with BYO DNS, there are internal and external Ingress LBs. That is one way to have multiple entries for Ingress. For private clusters, we don't expect to have any IPs for API LB IPs, only API-Int and Ingress LBs are expected to be running. This is a way to future-proof our implementation for all combinations of LB IPs.
if i < len(clusterLBConfig.ApiIntLBIPs) { | ||
apiIntLBIP = clusterLBConfig.ApiIntLBIPs[i] | ||
} else { | ||
apiIntLBIP = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in L718 you wrote var apiIntLBIP net.ip
you do not need to assign it to nil
explicitly as it's the value you get from start
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apiIntLBIP and other LB IP values are being set within a loop starting at L726 so its value could have been !nil in the previous iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I see it now... In that case this deserves at least a comment in Simple English directly in the code or some kind of a simplification...
As I read this whole loop now (L726-L730) is that when you finish the loop, you want apiIntLBIP
to have a value of the last element of the clusterLBConfig.ApiIntLBIPs
slice. Is that correct? If yes, in pseudocode that would be
apiIntLBIP = get_last_element(clusterLBConfig.ApiIntLBIPs)
Do I get the desired behaviour right? If yes, I'd rather do something like
apiIntLBIP = clusterLBConfig.ApiIntLBIPs[len(clusterLBConfig.ApiIntLBIPs)-1]
instead of iteration in which every time we assign a value only to ignore it afterwards and use the last assignment.
If my reasoning is wrong and your desired behaviour is different then ignore what I wrote, but that means a comment is more than desired
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to follow the precedence set by https://github.com/openshift/baremetal-runtimecfg/blob/master/pkg/config/node.go#L421-L447. There are definitely other ways to implement this too.
Adding a comment for clarity.
if len(clusterLBConfig.ApiLBIPs) != 0 && i < len(clusterLBConfig.ApiLBIPs) { | ||
apiLBIP = clusterLBConfig.ApiLBIPs[i] | ||
} else { | ||
apiLBIP = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, no need for an explicit nil-assignment
if len(clusterLBConfig.IngressLBIPs) != 0 && i < len(clusterLBConfig.IngressLBIPs) { | ||
ingressIP = clusterLBConfig.IngressLBIPs[i] | ||
} else { | ||
ingressIP = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, no need for an explicit nil-assignment
pkg/monitor/corednsmonitor.go
Outdated
if err != nil { | ||
return err | ||
} | ||
// Populate cloud LB IP addresses for platforms where the cloud LBs | ||
// have already been configured | ||
newConfig, _ = config.PopulateCloudLBIPAddresses(clusterLBConfig, newConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not ignore errors here (anywhere, in general)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. Updated.
pkg/config/node.go
Outdated
@@ -694,3 +713,80 @@ func PopulateNodeAddresses(kubeconfigPath string, node *Node) { | |||
} | |||
} | |||
} | |||
|
|||
func getConfigWithCloudLBIPs(kubeconfigPath, clusterConfigPath, resolvConfPath string, clusterLBConfig ClusterLBConfig) (node Node, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic of iterating over the length of structs and then accessing them using [i]
deserves some test as it has a potential to explode if by any chance we access non-existing index. I see here some kind of min()
inside of max()
so it does feel that we access the safer way now, but a test would be a guarantee for that
Hey, reviewed the last revision. The overall design seems okay, there is a couple of language-specific nits + more test cases needed (mainly for error handling and edge cases of passing empty values) |
eb8eefe
to
ff22640
Compare
/retest |
On some cloud platforms, the user cannot use the cloud's default DNS solution. They are expected to use their own custom DNS solution that is external to the cluster. OpenShift is not allowed to configure this DNS solution. In this scenario, these same customers want to continue using the cloud provided Load Balancers (LBs). OpenShift is expected to continue configuing the cloud LBs for API, API-Int and Ingress access. Since the LB information is not available, before cluster installation, the user cannot configure their custom DNS solution before cluster installation. So, to support this mode, OpenShift needs to start its in-cluster CoreDNS based DNS solution for API, API-Int and Ingress resolution so that cluster installation is successful. The customer is expected to configure their DNS solution post-install. At this the time, the cloud's API and API-Int LBs are configured by the Installer and its value is not expected to change during the life of the cluster. The Ingress operator continues to handle Ingress LB configuration. Based on enhancement: openshift/enhancements#1468
@sadasu: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/hold cancel /cc @cybertron |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mkowalski, sadasu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[ART PR BUILD NOTIFIER] This PR has been included in build ose-baremetal-runtimecfg-container-v4.16.0-202312061430.p0.g0ff1f6f.assembly.stream for distgit baremetal-runtimecfg. |
Add the ability to render and monitor Cloud LB IPs. This capability is used only when
userProvisionedDNS
is enabled on some (AWS, GCP and Azure) cloud platforms via install-config.On some cloud platforms (AWS, Azure, GCP), the user cannot use the cloud's default DNS solution. They are expected to use their own custom DNS solution that is external to the cluster. OpenShift is not allowed to configure this DNS solution. In this scenario, these same customers want to continue using the cloud provided Load Balancers (LBs).
OpenShift is expected to continue configuing the cloud LBs for API, API-Int and Ingress access. Since the LB information is not available, before cluster installation, the user cannot configure their custom DNS solution before cluster installation.
So, to support this mode, OpenShift needs to start its in-cluster CoreDNS based DNS solution for API, API-Int and Ingress resolution so that cluster installation is successful. The customer is expected to configure their DNS solution post-install.
At this the time, the cloud's API and API-Int LBs are configured by the Installer and its value is not expected to change during the life of the cluster. The Ingress operator continues to handle Ingress LB configuration.
Based on enhancement: openshift/enhancements#1468