OCPBUGS-61215: Tweak iptables-alerter to try to avoid crictl bug #2802
@danwinship: This pull request references Jira Issue OCPBUGS-61215, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/retest-required
faace85 to 1bf470b
/verified by @danwinship no e2e test, tested by hand
/retest-required
@danwinship: This PR has been marked as verified by In response to this:
/lgtm
# have any iptables-using pods anyway, do a pre-scan of all (non-hostnetwork)
# namespaces without using crictl, and bail out early if we don't find anything
iptables_output=""
for netns_pid in $(lsns -t net -o pid -nr | sort -u | grep -v '^1$'); do
note for others: '^1$' excludes pid 1 :)
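To illustrate the point about the anchored pattern: because of the `^` and `$`, `grep -v '^1$'` drops only the line that is exactly `1`, not PIDs that merely contain a 1. A minimal sketch, using a made-up PID list in place of real `lsns -t net -o pid -nr` output (the function name `filter_netns_pids` is hypothetical, not from the PR):

```shell
#!/bin/bash
# Hypothetical helper mirroring the filter stage of the loop in the diff:
# de-duplicate the PID list, then drop the line that is exactly "1".
filter_netns_pids() {
    sort -u | grep -v '^1$'
}

# Made-up sample input standing in for lsns output; note the duplicate "10".
printf '1\n10\n1234\n10\n' | filter_netns_pids
# prints:
# 10
# 1234
```

PID 1 lives in the host network namespace, which is presumably why it is the one entry skipped here.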
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danwinship, martinkennelly The full list of commands accepted by this bot can be found here. The pull request process is described here
/test e2e-aws-ovn-serial-2of2 Unrelated disruption
/test e2e-aws-ovn-upgrade Unrelated - job reached timeout limit.
@danwinship: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/override ci/prow/e2e-ovn-ipsec-step-registry unrelated, very failure-prone jobs
@danwinship: Overrode contexts on behalf of danwinship: ci/prow/e2e-aws-ovn-serial-2of2, ci/prow/e2e-ovn-ipsec-step-registry In response to this:
339bfc9 into openshift:master
@danwinship: Jira Issue Verification Checks: Jira Issue OCPBUGS-61215 Jira Issue OCPBUGS-61215 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓 In response to this:
/cherry-pick release-4.20
@danwinship: new pull request created: #2811 In response to this:
Fix included in accepted release 4.21.0-0.nightly-2025-10-02-215712
We keep getting reports of iptables-alerter causing CPU usage alerts. We debugged this at one point and found that crictl was suddenly using a ton of CPU and RAM for no apparent reason. We weren't able to reproduce it beyond that, and it's not really worth spending a lot of time trying to fix, since crictl is not in the critical path of any normal pods anyway. I had tried improving things by using less crictl before (#2404), but we're still getting reports.

This changes the script to not use crictl until after we've determined that some pod somewhere on the node is using iptables, which should hopefully be "never".
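The pre-scan idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual script from this PR: the helper names `dump_rules` and `any_netns_uses_iptables` are hypothetical, and treating any `-A` line in `iptables-save` output as "this namespace uses iptables" is an assumed heuristic. It also assumes `nsenter` and `iptables-save` are available and that the script runs as root.

```shell
#!/bin/bash
# Hypothetical helper: print the iptables rules visible in the network
# namespace of PID $1. stdin is redirected so the command cannot consume
# the PID list being read by the caller's loop.
dump_rules() {
    nsenter -t "$1" -n iptables-save 2>/dev/null </dev/null
}

# Read PIDs from stdin, one per line. Return 0 as soon as any of their
# network namespaces contains a rule (assumption: rule lines in
# iptables-save output start with "-A"); return 1 if none do.
any_netns_uses_iptables() {
    local pid
    while read -r pid; do
        if dump_rules "$pid" | grep -q '^-A'; then
            return 0
        fi
    done
    return 1
}

# Intended use on a node (left commented out, since it needs root):
# if ! lsns -t net -o pid -nr | sort -u | grep -v '^1$' | any_netns_uses_iptables; then
#     exit 0   # nothing uses iptables, so crictl is never invoked
# fi
```

The point of the ordering is that the cheap `lsns`/`nsenter` pass runs first, and the more expensive crictl lookups would only run if this check succeeds.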