DNM Checking CI job result after enabling disk IOPS/RW limitation #2961

Draft
wants to merge 2 commits into main

@danpawlik danpawlik commented May 8, 2025

The CI still occasionally hits issues related to "noisy neighbors". We can still ask the infra team to apply limits inside the flavor, but until we know what quota can be set, let's do it via systemd. This should work for many services and for CRC, since kubelet is configured with the systemd cgroup driver in /etc/kubernetes/kubelet.conf and should therefore respect the limits.

More information about the values that were set is in the commit message and in PR [1].

[1] https://review.rdoproject.org/r/c/config/+/57582/

Depends-On: openstack-k8s-operators/openstack-operator#1434
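
As an illustration of the systemd approach described above, here is a minimal sketch that renders a unit drop-in with I/O limits. The directives `IOReadIOPSMax=`, `IOWriteIOPSMax=`, `IOReadBandwidthMax=` and `IOWriteBandwidthMax=` are standard systemd resource-control settings. The helper name `render_io_limit_dropin`, the device path `/dev/vda`, and the example values are assumptions for this sketch and are not taken from the change in [1], which may set the limits differently.

```python
def render_io_limit_dropin(device: str, iops: int, bandwidth: str) -> str:
    """Render a systemd drop-in that caps disk IOPS and read/write bandwidth.

    device:    block device path the limits apply to (hypothetical example value)
    iops:      maximum read and write operations per second
    bandwidth: maximum read and write bandwidth, e.g. "250M"
               (systemd parses the K/M/G/T suffixes to the base of 1024)
    """
    lines = [
        "[Service]",
        f"IOReadIOPSMax={device} {iops}",
        f"IOWriteIOPSMax={device} {iops}",
        f"IOReadBandwidthMax={device} {bandwidth}",
        f"IOWriteBandwidthMax={device} {bandwidth}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; the right limits depend on the CI hardware.
    print(render_io_limit_dropin("/dev/vda", 15000, "250M"), end="")
```

The output could be written to a drop-in file for the unit or slice that should be limited, followed by a `systemctl daemon-reload`. Which unit or slice to target is left open here.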

openshift-ci bot commented May 8, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci bot commented May 8, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danpawlik

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4dd7e96f32d9444684ac5ec2fd9b7778

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 56m 17s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 09m 31s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 36m 13s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 39m 41s
✔️ cifmw-multinode-tempest SUCCESS in 1h 33m 07s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 15s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 00s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 42s
cifmw-multinode-kuttl FAILURE in 2h 24m 25s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 16m 05s
✔️ build-push-container-cifmw-client SUCCESS in 20m 58s

@danpawlik

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4b2b7e9d11314a3ea6242b61367a51df

openstack-k8s-operators-content-provider FAILURE in 7m 21s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ podified-multinode-hci-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-multinode-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 05s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 54s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 50s
cifmw-multinode-kuttl FAILURE in 2h 23m 36s
ci-framework-openstack-meta-content-provider FAILURE in 8m 29s
✔️ build-push-container-cifmw-client SUCCESS in 21m 00s

@danpawlik

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/eb552d8ff1bd47cba9b68e96e530c51c

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 53m 43s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 11m 12s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 39m 20s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 32m 54s
✔️ cifmw-multinode-tempest SUCCESS in 1h 29m 45s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 02s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 21s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 55s
cifmw-multinode-kuttl FAILURE in 2h 01m 10s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 14m 53s
✔️ build-push-container-cifmw-client SUCCESS in 17m 45s

@danpawlik

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/dfedc4772b52442babf1e25b2d542ba6

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 54m 41s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 11m 50s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 36m 45s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 38m 28s
✔️ cifmw-multinode-tempest SUCCESS in 1h 30m 16s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 07s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 02s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 50s
cifmw-multinode-kuttl FAILURE in 2h 26m 08s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 16m 26s
✔️ build-push-container-cifmw-client SUCCESS in 21m 27s

@danpawlik danpawlik force-pushed the check-iops-limitation branch from aeded39 to 3ac70f4 Compare May 9, 2025 08:24
@danpawlik

recheck

@danpawlik

recheck

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/72e20fc6686e48509120326f8b730559

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 50m 10s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 11m 35s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 32m 15s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 33m 52s
✔️ cifmw-multinode-tempest SUCCESS in 1h 35m 51s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 26s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 50s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 01s
cifmw-multinode-kuttl TIMED_OUT in 2h 40m 58s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 47m 48s
✔️ build-push-container-cifmw-client SUCCESS in 22m 13s

@danpawlik

recheck

@danpawlik

Increased the IOPS and RW limits in https://review.rdoproject.org/r/c/config/+/57595. The job was only a few minutes away from finishing. Eh.
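
To compare how long a job such as cifmw-multinode-kuttl ran across the rechecks, the Zuul result lines posted in this thread can be parsed into seconds. The following is a rough sketch that assumes the line format seen in the reports here (`<job> <RESULT> in <Nh Nm Ns>`, optionally prefixed by a status emoji). The names `JobResult` and `parse_result_line` are hypothetical and not part of Zuul.

```python
import re
from typing import NamedTuple, Optional

# Optional emoji prefix, job name, upper-case result, optional "in 1h 02m 03s".
LINE_RE = re.compile(
    r"^(?:[^\w\s]+\s+)?"
    r"(?P<job>[\w.-]+)\s+"
    r"(?P<result>[A-Z_]+)"
    r"(?:\s+in\s+(?P<duration>(?:\d+[hms]\s*)+))?"
)
SECONDS_PER_UNIT = {"h": 3600, "m": 60, "s": 1}


class JobResult(NamedTuple):
    job: str
    result: str
    seconds: Optional[int]  # None when the line carries no duration (e.g. SKIPPED)


def parse_result_line(line: str) -> Optional[JobResult]:
    """Parse one Zuul report line; return None if it does not look like one."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    duration = match.group("duration")
    seconds = None
    if duration is not None:
        seconds = sum(
            int(value) * SECONDS_PER_UNIT[unit]
            for value, unit in re.findall(r"(\d+)([hms])", duration)
        )
    return JobResult(match.group("job"), match.group("result"), seconds)


if __name__ == "__main__":
    print(parse_result_line("cifmw-multinode-kuttl TIMED_OUT in 2h 41m 30s"))
```

Collecting the parsed durations per job over several buildsets makes it easier to see whether a changed limit moves a job closer to or further from its timeout.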

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f509e8861f054906b93c36547388136d

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 53m 23s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 12m 24s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 33m 25s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 38m 08s
✔️ cifmw-multinode-tempest SUCCESS in 1h 30m 51s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 43s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 50s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 04s
cifmw-multinode-kuttl TIMED_OUT in 2h 41m 30s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 58m 03s
✔️ build-push-container-cifmw-client SUCCESS in 21m 15s

@danpawlik danpawlik force-pushed the check-iops-limitation branch 2 times, most recently from 2e96cee to a2fb65e Compare May 12, 2025 09:14
Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/ci-framework for 2961,a2fb65e01eab327a93a881f9d64e81edddac59e9

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ca4fb80ccbcf419581e1feef033ebe80

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 51m 11s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 10m 28s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 29m 06s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 33m 07s
✔️ cifmw-multinode-tempest SUCCESS in 1h 37m 22s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 09s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 38s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 11s
cifmw-multinode-kuttl TIMED_OUT in 2h 40m 30s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 49m 28s
✔️ build-push-container-cifmw-client SUCCESS in 21m 25s

@danpawlik

15k IOPS is not enough.

The kubelet service logs provide a lot of useful information
that might help identify the root cause of a failing job.

Signed-off-by: Daniel Pawlik <[email protected]>
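
For illustration, kubelet logs can be pulled from the systemd journal with `journalctl`. The sketch below assumes it runs on the node where the kubelet unit lives (for example inside the CRC VM). The helper names `kubelet_journal_command` and `save_kubelet_logs` are hypothetical, and the commit itself may collect the logs in a different way.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def kubelet_journal_command(since: Optional[str] = None) -> List[str]:
    """Build the journalctl command that dumps kubelet unit logs."""
    cmd = ["journalctl", "--unit", "kubelet", "--no-pager"]
    if since is not None:
        # Any time expression journalctl accepts, e.g. "2 hours ago".
        cmd += ["--since", since]
    return cmd


def save_kubelet_logs(dest: Path, since: Optional[str] = None) -> Path:
    """Run journalctl and store its output in dest; raises if journalctl fails."""
    result = subprocess.run(
        kubelet_journal_command(since),
        capture_output=True,
        text=True,
        check=True,
    )
    dest.write_text(result.stdout)
    return dest


if __name__ == "__main__":
    # Only prints the command, so it is safe to run anywhere.
    print(" ".join(kubelet_journal_command(since="2 hours ago")))
```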
@danpawlik danpawlik force-pushed the check-iops-limitation branch from b0d0e5e to fc44237 Compare May 12, 2025 12:00
@danpawlik

recheck

@danpawlik

recheck

@danpawlik

danpawlik commented May 12, 2025

@danpawlik

recheck

rdoproject pushed a commit to rdo-infra/review.rdoproject.org-config that referenced this pull request May 13, 2025
After a few tests [1], the CI job passes without increasing the timeout
when the RW (read/write) limit is set to 250MB, so let's set that value
as the default.
Also enable the disk limitation by default.

[1] openstack-k8s-operators/ci-framework#2961

Change-Id: I3b2e81711145d398430b3830ed541040123f4535
Signed-off-by: Daniel Pawlik <[email protected]>