Making FDP IPv4 QE jobs default #69292
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.19-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.20-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.20-e2e-aws-ovn-fdp-qe pull-ci-openshift-ovn-kubernetes-release-4.19-e2e-aws-ovn-fdp-qe |
|
@deepsm007: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.19-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.20-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/lgtm |
|
@anuragthehatter Do we need @martinkennelly's approval for this PR to merge? There is not much code change apart from the flags and the timeout. |
|
@anuragthehatter How many times have you executed this job, and what's the pass rate? Thanks
Executed 3 times in the rehearsals above, on different releases. It's 100% on all 3 of those jobs. |
|
/lgtm |
I think there's a command to run it many times to look for flakes. The reason I ask is that I saw it fail recently on a d/s merge, and it was a flake. I trust your opinion here, and if we are wrong we can revert this PR. |
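As a rough illustration of why three clean rehearsals say little about flakiness, the sketch below computes how often a job with a given per-run flake rate would still pass every run in a row. The flake rates are hypothetical values picked for illustration, not measurements from this job, and the runs are assumed to be independent.

```python
def prob_all_pass(flake_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass, given a per-run flake rate."""
    if not 0.0 <= flake_rate <= 1.0:
        raise ValueError("flake_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - flake_rate) ** runs


if __name__ == "__main__":
    # Hypothetical flake rates; 3 runs matches the rehearsals in this thread,
    # 10 runs matches the "run it many times" suggestion.
    for flake_rate in (0.05, 0.10, 0.20):
        for runs in (3, 10):
            p = prob_all_pass(flake_rate, runs)
            print(f"flake rate {flake_rate:.0%}, {runs} runs: all pass with p={p:.3f}")
```

Under these assumptions, a job that flakes one run in five would still show 3 out of 3 passes about half the time, which is why a larger number of runs is more informative.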
|
@martinkennelly Yes. That d/s merge hit a timeout in the IPsec use case, which usually takes 45-52 minutes across various platforms; this PR addresses that. |
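A small worked check of the numbers quoted in this PR: the 60 min timeout from the PR description against the 45-52 minute IPsec runtime mentioned above. Whether the timeout covers only that use case or a larger step is not stated in the thread, so the headroom shown here is an upper bound on the slack, not a guarantee.

```python
TIMEOUT_MIN = 60               # timeout set by this PR, per the PR description
IPSEC_RUNTIME_MIN = (45, 52)   # observed runtime range quoted in this thread


def headroom_minutes(timeout_min: int, runtime_min: int) -> int:
    """Minutes left before the timeout fires for a run of the given length."""
    return timeout_min - runtime_min


if __name__ == "__main__":
    fastest, slowest = IPSEC_RUNTIME_MIN
    print(f"headroom on a fast run: {headroom_minutes(TIMEOUT_MIN, fastest)} min")
    print(f"headroom on a slow run: {headroom_minutes(TIMEOUT_MIN, slowest)} min")
```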
|
4.21, node gone missing. I think the AWS cluster could be using spot instances. SDN: OCP-40908:SDN Allow from hostnetwork policy to allow traffic from hostnetwork pods on LoadBalancerService endpoint strategy cluster
{failed OCP-40908:SDN Allow from hostnetwork policy to allow traffic from hostnetwork pods on LoadBalancerService endpoint strategy cluster failed
Scenario: OCP-40908:SDN Allow from hostnetwork policy to allow traffic from hostnetwork pods on LoadBalancerService endpoint strategy cluster
Given I store the ready and schedulable nodes in the :nodes clipboard
Message:
nodes 'ip-10-0-25-232.us-east-2.compute.internal' not found (BushSlicer::ResourceNotFoundError)
./lib/openshift/resource.rb:66:in `get_checked'
./lib/openshift/resource.rb:130:in `get_cached_prop'
./lib/openshift/node.rb:110:in `ready?'
./features/step_definitions/node.rb:76:in `block (2 levels) in '
./features/step_definitions/node.rb:76:in `select'
./features/step_definitions/node.rb:76:in `/^I store the( schedulable| ready and schedulable)?( windows)? (node|master|worker)s in the(?: :(\S+))? clipboard(?: excluding "(.+?)")?$/'
features/tierN/networking/network-policy.feature:2578:in `I store the ready and schedulable nodes in the :nodes clipboard'
|
Hmm, hostnetwork pod use cases are also failing due to OTP migration, need |
|
I'll be away for a week, but can you folks handle any fixes and retry here? Since I found issues on the first run, you may have to run this job many times to shake out flakes. Unfortunately there is no aggregate for rehearsals, it seems... idk why... it's needed. I looked at the docs for rehearsal commands and didn't find any aggregate :/ @jluhrsen do you know any good command to run the main-3 jobs for each release 10 times? |
Sure @martinkennelly. Yep, I figured our OTE migration has again impacted a few more cases. I will fix them and have a PR up soon: https://github.com/openshift/openshift-tests-private/pull/27809. Once that merges we can rehearse again. A few points to note:
|
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.21-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@anuragthehatter: requesting more than one rehearsal in one comment is not supported. If you would like to rehearse multiple specific jobs, please separate the job names by a space in a single command. |
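To illustrate the rule in the bot message above, here is a minimal sketch that builds one `/pj-rehearse` comment covering several branches, with job names separated by spaces. The job-name template is inferred from the job names used in this thread; the helper name `rehearse_command` is hypothetical and not part of any Prow tooling.

```python
# Template inferred from the job names posted in this thread (an assumption,
# not an official naming rule).
JOB_TEMPLATE = "pull-ci-openshift-ovn-kubernetes-{branch}-e2e-aws-ovn-fdp-qe"


def rehearse_command(branches: list[str]) -> str:
    """Return a single /pj-rehearse comment listing one job per branch."""
    if not branches:
        raise ValueError("at least one branch is required")
    jobs = [JOB_TEMPLATE.format(branch=branch) for branch in branches]
    # One command, job names separated by spaces, as the bot message asks.
    return "/pj-rehearse " + " ".join(jobs)


if __name__ == "__main__":
    print(rehearse_command(["release-4.20", "release-4.19"]))
```

The printed line matches the shape of the two-job command posted earlier in this conversation.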
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.17-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.17-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.21-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/test generated-config |
|
/pj-rehearse more |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.22-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse pull-ci-openshift-ovn-kubernetes-release-4.21-e2e-aws-ovn-fdp-qe |
|
@anuragthehatter: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@martinkennelly Based on the various changes and commits over the last several weeks, I am confident that we have achieved almost 100% stability across releases. Unfortunately we had OTE library migrations which caused flakes, and an upstream migration will also happen in the near future which might cause flakes again; the ERT team, along with QE, will step in to fix those if needed. From a stability POV, we have achieved consistent stability as tested here. Env install failures will always be out of our control :) Please note: we can ignore the 4.22 runs at the moment. QE sets the Polarion tags and makes the agents needed for a future release only once we officially enter that release, so 4.22 does not yet have the backend QE infra ready to run the tests properly. Let me know if you have any comments; otherwise it should be good to merge now. Also, if we see any flake in the future, it will be tracked via Tracker |
Resolved conflicts in ovn-kubernetes config files by accepting PR changes to make FDP IPv4 QE jobs default (always_run: true) with 60 min timeout.
|
[REHEARSALNOTIFIER]
Interacting with pj-rehearse
Comment: Once you are satisfied with the results of the rehearsals, comment: |
|
@anuragthehatter: The following tests failed, say
|
Opened #71975 |
Making FDP IPv4 QE jobs default and adding a TIMEOUT of 60 min (after discussion with the OCP QE infra maintainers) to accommodate the IPsec enable/disable/test use case, which takes ~50 min and is currently causing interruptions in recent FDP workflows.
#68237 got messed up due to a bulk merge and force pushes, sorry. Opened this new clean one.
cc @martinkennelly @jluhrsen