-
Notifications
You must be signed in to change notification settings - Fork 1.5k
CORS-4212: AWS: Add the ability to configure throughput on GP3 volumes #10132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@tthvo: This pull request references CORS-4212 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-aws-ovn-serial-1of2 |
|
@tthvo: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c7bb9e60-caf7-11f0-854d-0994ba12b72e-0 |
|
/label platform/aws |
|
/test e2e-aws-ovn-public-subnets e2e-aws-ovn-public-ipv4-pool e2e-aws-ovn-custom-iam-profile e2e-aws-overlay-mtu-ovn-1200 |
|
/cc @JoelSpeed |
|
/retest Test failures seem fine (unrelated) 👀 |
Payload passed this time 👍 |
|
@tthvo: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Used the build: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/release-openshift-origin-installer-launch-aws-modern/1993960674010599424 Verified by the script locally: https://github.com/openshift/release/pull/71426/files#diff-5e156492e6ec7b6cc1782f27eb74071576df36e33d3caf3f92f73084af22a03f cd /Users/weli/works/oc-swarm/release && SHARED_DIR=/tmp/work4-shared CLUSTER_PROFILE_DIR=/tmp/work4-shared bash ci-operator/step-registry/cucushift/installer/check/aws/rootvolume/cucushift-installer-check-aws-rootvolume-commands.sh
Cluster ID: weli-test-8fm49
Region: us-east-1
Expected compute throughput: 1000 MiB/s
Expected control plane throughput: 1200 MiB/s
Expected worker rootVolume size: 120 GiB
Expected control-plane rootVolume size: 150 GiB
Checking worker nodes
PASS: ip-10-0-39-111.ec2.internal volume vol-034246304384809e8 type=gp3 size=120GiB iops=5000 throughput=1000MiB/s
PASS: ip-10-0-4-150.ec2.internal volume vol-05f55a2f17af0cf1c type=gp3 size=120GiB iops=5000 throughput=1000MiB/s
PASS: ip-10-0-64-225.ec2.internal volume vol-08f7acd7aa708a280 type=gp3 size=120GiB iops=5000 throughput=1000MiB/s
Checking control plane nodes
PASS: ip-10-0-30-251.ec2.internal volume vol-097a2aef4a22e5f3f type=gp3 size=150GiB iops=5000 throughput=1200MiB/s
PASS: ip-10-0-51-181.ec2.internal volume vol-084b2e4af55d5e006 type=gp3 size=150GiB iops=5000 throughput=1200MiB/s
PASS: ip-10-0-64-82.ec2.internal volume vol-0aab6f02caafe824c type=gp3 size=150GiB iops=5000 throughput=1200MiB/s
==========================================
Test Summary
All root volume throughput checks passed./verified by liweinan |
|
@liweinan: This PR has been marked as verified by In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
pkg/asset/machines/aws/machines.go
Outdated
| Encrypted: pointer.Bool(true), | ||
| KMSKey: machineapi.AWSResourceReference{ARN: pointer.String(in.root.KMSKeyARN)}, | ||
| } | ||
| if in.root.Throughput != nil && *in.root.Throughput > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If an end users explicitly sets this to zero, wouldn't the validateThroughput catch it before we get here?
I think the nil check is probably sufficient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we even need the nil check, we can just assign the value, nil or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it is a bit of a tricky situation: The validateThroughput only runs when the machine pool defines the EBS volume type (i.e. similar to how validateIOPS and validateVolumeSize). See here and block below:
if p.EC2RootVolume.Type != "" {
allErrs = append(allErrs, validateVolumeSize(p, fldPath)...)
allErrs = append(allErrs, validateIOPS(p, fldPath)...)
allErrs = append(allErrs, validateThroughput(p, fldPath)...)
}This makes sense because the volume type is optional as we set the default type later when generating the machine manifests. But it means the installer can skip validating throughput.
I think everyone's suggestion is the right way 👍 and the problem is that the validation should catch this. So, I updated the PR to set the volume type default (if empty) early so that the throughput validation path runs, avoiding any 0 throughput value.
PTAL 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we set the default type later when generating the machine manifests
By this, I meant the code block below 👇 The commit here just uses the same logic to set the volume type somewhere earlier.
installer/pkg/asset/machines/worker.go
Lines 102 to 117 in fd5a518
| func defaultAWSMachinePoolPlatform(poolName string) awstypes.MachinePool { | |
| defaultEBSType := awstypes.VolumeTypeGp3 | |
| // gp3 is not offered in all local-zones locations used by Edge Pools. | |
| // Once it is available, it can be used as default for all machine pools. | |
| // https://aws.amazon.com/about-aws/global-infrastructure/localzones/features | |
| if poolName == types.MachinePoolEdgeRoleName { | |
| defaultEBSType = awstypes.VolumeTypeGp2 | |
| } | |
| return awstypes.MachinePool{ | |
| EC2RootVolume: awstypes.EC2RootVolume{ | |
| Type: defaultEBSType, | |
| Size: decimalRootVolumeSize, | |
| }, | |
| } | |
| } |
|
/approve this lgtm, but some nits for discussion from joel & me |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: patrickdillon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
GP3 volumes have the ability to configure throughput from 125 MiB/s to 2000 MiB/s. This allows the ability to set this at install time in the install-config. https://issues.redhat.com/browse/CORS-4212
Set the default type EBS volume for machine pools: - Controlplane, arbiter and worker pool default to gp3 volume. - Edge pool default to gp2 volume. The default decision is taken from existing code [0]. This commit just makes the defaulting earlier. Reference [0] https://github.com/openshift/installer/blob/fd5a518e4951510b82705eee8184b3dd4f2723b2/pkg/asset/machines/worker.go#L102-L117
Thank you! I updated the PR. PTAL again 🙏 |
|
/test e2e-aws-ovn-shared-vpc-edge-zones |
Descriptions
GP3 volumes have the ability to configure throughput from 125 MiB/s to 2000 MiB/s. This allows the ability to set this at install time in the install-config.
This is a take-2 of #9945 after it was reverted in #10131 due to aws serial test failure.
Notes
The following new changes are introduced:
openshift/api. It does break the convention with iops, but at least it gives us an option to represent specified vs left-empty.nilcheck is performed before passing throughput to CAPA and MAPI.