Performance tests: Make them faster #30021

Open · ockham opened this issue Mar 19, 2021 · 3 comments

@ockham (Contributor) commented Mar 19, 2021

aka Increase performance tests performance 😄

The perf tests currently take around 40 minutes to run on a given PR. That time is spent cloning the git repo, checking out the trunk branch, running npm install && npm run build:packages, running the tests, and then switching to the current branch and repeating the install/build/test steps (see performance.js).

I think we can cut this down significantly by being a bit smarter:

  • We change the perf CI workflow as follows:
    • We don't just run it on PRs, but also on push to trunk.
    • We don't clone the repo; we only run the tests on the current branch.
    • We upload the test results from the current branch as a build artifact (maybe in CSV format).
    • We move the logic that compares results to a new job in the same workflow. If the current branch isn't trunk, we use GH's API to find the latest run of the performance workflow for trunk that completed successfully, and download the test results artifact. We then use that, and the artifact from the current branch, to create the table of test results.

Since we'd only run the perf tests for the current branch on each PR (while fetching the results for trunk from GH), I think this should cut the perf workflow's runtime roughly in half, to around 20 minutes -- probably even a bit less than that, since we'd also skip some overhead (repo cloning).
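To make the comparison-job idea a bit more concrete, here's a rough sketch of how it could locate and download the latest trunk results via the GitHub REST API (using @octokit/rest). The workflow file name and the `performance-results` artifact name are assumptions for illustration, not something we currently ship:

```js
// Sketch only -- not an actual implementation. Finds the latest successful
// trunk run of the performance workflow and downloads its results artifact.
const { Octokit } = require( '@octokit/rest' );

async function fetchTrunkResults( token ) {
	const octokit = new Octokit( { auth: token } );
	const repo = { owner: 'WordPress', repo: 'gutenberg' };

	// Latest successful run of the perf workflow on trunk.
	const { data: runs } = await octokit.rest.actions.listWorkflowRuns( {
		...repo,
		workflow_id: 'performance.yml',
		branch: 'trunk',
		status: 'success',
		per_page: 1,
	} );
	const run = runs.workflow_runs[ 0 ];

	// Locate the results artifact uploaded by that run.
	const { data: artifacts } = await octokit.rest.actions.listWorkflowRunArtifacts( {
		...repo,
		run_id: run.id,
	} );
	const artifact = artifacts.artifacts.find(
		( a ) => a.name === 'performance-results' // hypothetical artifact name
	);

	// Download the artifact as a zip archive.
	const { data: zip } = await octokit.rest.actions.downloadArtifact( {
		...repo,
		artifact_id: artifact.id,
		archive_format: 'zip',
	} );
	return zip;
}
```

In the actual workflow, this could presumably also be handled by an existing action for downloading artifacts from other workflow runs, rather than hand-rolled API calls.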


In a second step, this can probably be extended to the release case (comparing the current and previous GB release branches, and the current WordPress wp/ release branch). The major hurdle there might be the lifespan of build artifacts: the wp/ release branch artifact could be quite old and might already have been removed by GH by then. We might need a fallback to recreate it on the fly.


This is notably a CI-first approach; I'm not quite sure how best to backport it to the locally run bin/plugin/cli.js perf (performance.js) script. Arguably, it doesn't make sense to download test results created on a CI machine and compare them to results created locally on a developer's system, since the difference in machine specs could totally skew the results. Right now, I'm leaning towards only changing the CI workflow (performance.yml) and leaving the local script as-is -- potentially changing it later. One option would be to dumb down the local script to only compare test results fetched from CI (rather than running the actual perf tests itself).
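As an illustration of that "compare only" mode, assuming the downloaded result sets are flat objects of metric name → milliseconds (the actual format produced by performance.js may well differ):

```js
// Sketch only: build a comparison table from two already-downloaded result
// sets. Assumes each set is a flat { metricName: milliseconds } object.
function compareResults( trunkResults, branchResults ) {
	const rows = {};
	for ( const metric of Object.keys( trunkResults ) ) {
		const trunk = trunkResults[ metric ];
		const branch = branchResults[ metric ];
		rows[ metric ] = {
			'trunk (ms)': trunk,
			'branch (ms)': branch,
			'delta (%)': ( ( ( branch - trunk ) / trunk ) * 100 ).toFixed( 1 ),
		};
	}
	console.table( rows );
}
```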


This could be a fun little project to learn a bit more about GitHub Actions. I wouldn't mind mentoring someone doing this, rather than implementing it myself. Any takers? 😬 cc/ @david-szabo97 @rafaelgalani

Curious to hear thoughts from @WordPress/gutenberg-core.

@youknowriad (Contributor) commented Mar 19, 2021

We move the logic that compares results to a new job in the same workflow. If the current branch isn't trunk, we use GH's API to find the latest run of the performance workflow for trunk that completed successfully, and download the test results artifact. We then use that, and the artifact from the current branch, to create the table of test results.

I guess this whole proposal is based on this premise, but unfortunately I don't really think it's a good idea. Two jobs running on CI aren't guaranteed to get the same environment (same CPU power, same memory, ...), which means the results aren't comparable across jobs; this is why we need to run both branches in the same CI job.

Actually, even the way we iterate 3 times is not optimal. Instead of doing:

  • branch1 x3
  • branch2 x3

We should be doing:

  • branch1
  • branch2
  • branch1
  • branch2
  • branch1
  • branch2

This would ensure the numbers are stable enough and the environments match as closely as possible.
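Something like this, roughly (the helpers here are hypothetical, not the actual functions in performance.js):

```js
// Sketch only: interleave test rounds across branches so every branch is
// measured under (nearly) the same machine conditions.
const ROUNDS = 3;

async function runInterleaved( branches ) {
	const results = {};
	for ( let round = 0; round < ROUNDS; round++ ) {
		for ( const branch of branches ) {
			await checkoutAndBuild( branch ); // hypothetical helper
			const run = await runPerformanceTests(); // hypothetical helper
			( results[ branch ] = results[ branch ] || [] ).push( run );
		}
	}
	// e.g. take the median per metric across rounds afterwards.
	return results;
}
```

In practice the per-branch builds could presumably be cached so that only the test runs alternate, rather than rebuilding on every round.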

@youknowriad (Contributor) commented Mar 19, 2021

Another problem I can think of with the proposed approach: a PR that changes the "performance e2e test file" itself (i.e. changes the rules of measurement) would produce completely inconsistent results across branches.

@swissspidy (Member) commented Mar 19, 2021

Those are some good points.

While reading this, I wondered: do we really need to run the performance tests for every single PR?
If not, there are other ways to improve the experience here, like only running them for PRs with a certain label, or skipping them for unrelated changes (e.g. PRs that only modify docs or PHP files).
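For the "skip unrelated changes" part, one option (besides a plain paths / paths-ignore filter on the workflow trigger) could be a small gate step, e.g. via actions/github-script, along these lines; the skip patterns are just examples:

```js
// Sketch only: decide whether a PR actually needs a perf run, e.g. inside an
// actions/github-script step (which provides `github`, `context`, and `core`).
// Pagination is omitted for brevity.
const { data: files } = await github.rest.pulls.listFiles( {
	owner: context.repo.owner,
	repo: context.repo.repo,
	pull_number: context.issue.number,
	per_page: 100,
} );

const needsPerfRun = files.some(
	( file ) =>
		! file.filename.startsWith( 'docs/' ) &&
		! file.filename.endsWith( '.php' )
);

// Downstream jobs could condition on this output to skip the expensive run.
core.setOutput( 'run-perf-tests', String( needsPerfRun ) );
```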
