62221 Parallelise the performance tests #8190

Draft · 23 commits to be merged into trunk

@johnbillion (Member) commented Jan 25, 2025

This change introduces a job matrix for the "current", "before", and "base" performance tests to replace the current behaviour of running them sequentially in a single job.

This speeds up the overall performance workflow run by 18-20 minutes 🕐.
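For readers unfamiliar with job matrices, the general shape is roughly as follows. This is only a minimal sketch and not the workflow file from this PR: the job ID, the `subject` matrix key, and the placeholder step are all hypothetical.

```yaml
# Hypothetical sketch; names and steps are illustrative, not taken from this PR.
jobs:
  performance:
    name: "Performance (${{ matrix.subject }})"
    runs-on: ubuntu-latest
    strategy:
      # Let the other subjects finish even if one of them fails.
      fail-fast: false
      matrix:
        subject: [ current, before, base ]
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      # Placeholder for the real setup and test steps.
      - name: Run performance tests for one subject
        run: echo "Running the ${{ matrix.subject }} tests in a fresh environment"
```

Each matrix entry becomes its own job on its own runner, so the three test subjects run in parallel and none of them inherits state from another.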

Reasoning

  1. The tests currently run in the order "current", then "before", then "base". This means the database upgrade routine between the tests is actually being asked to perform a downgrade, which is not supported. In a PR, the "before" test is potentially being performed with a database schema from the proposed change. This problem is causing the tests in #7333 (Trac #21022, "Switch to using bcrypt for hashing passwords") to fail unnecessarily because of a change to the structure of data in the database which is not supported in the "before" or "base" version of the codebase.
  2. This change removes the potential for tests to interfere with one another. The current tests rely on running the database upgrade, flushing the cache, and deleting transients between test runs, which doesn't guarantee that the "before" and "base" tests are running in a clean environment compared to the "current" test. By separating the tests into separate jobs, every test run starts with a clean environment and uses the same setup steps.
  3. The original intention of performing the tests sequentially in the same job was to reduce the chance that an environmental factor on GitHub Actions affects the comparison between the "current" and "before" tests that gets reported in a PR. I think in practice this is unlikely and uncommon, and the benefit of having tests that complete significantly faster outweighs this concern. This has no effect on the test reporting that's sent to codevitals.run.
  4. Bonus: Nicely grouped jobs on the GitHub Actions workflow run screens.

Notes

  • The "v2" workflow files are needed so the tests on old branches continue to work as they do currently.
  • Published "base" test results can be seen at https://webhook.cool/at/yellow-ice-88/. I'm using webhook.cool to avoid spamming codevitals.run with results from this PR.
  • The e2e tests, unit tests, and build tests are disabled in this PR only, to prevent them from running unnecessarily here. See #8147 (Add some more paths config to workflow files), which aims to fix that properly.

Future enhancements

  • The "base" run result should be cached so it theoretically only ever needs to run once per release and then all subsequent performance runs pull its cached results. Needs more work that's outside of the scope of this change.
  • The "before" run result should be cached so multiple pushes to a PR don't unnecessarily re-run the same tests. Same as above, this needs more work so I will look into it at a later date.
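One possible direction for the "before" caching, as a rough sketch only: it assumes the "before" results can be written to a single file and that the PR's base commit SHA is a suitable cache key. Neither assumption has been verified against this workflow, and the step IDs and file path are hypothetical.

```yaml
# Hypothetical steps inside the "before" job; the path and cache key are assumptions.
steps:
  - name: Restore cached "before" results
    id: before-results
    uses: actions/cache@v4
    with:
      path: artifacts/before-results.json
      key: perf-before-${{ github.event.pull_request.base.sha }}

  # Only run the tests when no cached result exists for this base commit.
  - name: Run the "before" tests
    if: steps.before-results.outputs.cache-hit != 'true'
    run: echo "Cache miss, run the tests and write artifacts/before-results.json"
```

On a cache miss, `actions/cache` saves the path at the end of the job, so later pushes to the same PR would restore the result as long as the base commit is unchanged.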

Todo

  • Update inline docs in the workflow files
  • Reinstate the comparison
  • Reinstate the reporting
  • Add logic to only run the base test on a push to trunk

Trac ticket: https://core.trac.wordpress.org/ticket/62221

@johnbillion
Member Author

@swissspidy @desrosj @sirreal @joemcgill What are your thoughts on this approach? The PR still needs some tweaks but the bulk of it is ready.

Latest results here: https://github.com/WordPress/wordpress-develop/actions/runs/13012122443

@swissspidy
Member

> The original intention of performing the tests sequentially in the same job was to reduce the chance that an environmental factor on GitHub Actions affects the comparison between the "current" and "before" tests that gets reported in a PR. I think in practice this is unlikely and uncommon

IIRC @dmsnell looked into that before and there can actually be quite some fluctuation, even depending on the time of day.
