Changes include:
- Enable parallelism for the scheduled and manual workflows
- Change length of commit hashes in order to align with GitHub's behavior
- Remove unnecessary cleanup step
[skip ci]
This commit changes the following:
- The "workspace" and "state_bucket" config options are now set so that the benchmark uses a separate state per environment ("workspace" in the benchmark's terminology) via Terraform's remote state feature. As a side effect, cancelling a workflow run no longer leaves dangling components in the AWS account; everything is properly destroyed and then recreated during the next run.
- The displayed GitHub compare link always truncates the commit hashes to 6 characters
- Some code simplification
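
As an illustration of the compare link change, a minimal POSIX shell sketch could look like the following. The helper names, the placeholder "owner/repo" and hashes, and the use of GitHub's three-dot compare syntax are assumptions made for this example, not the benchmark's actual code.

```shell
# Hypothetical helper: keep only the first 6 characters of a commit hash.
short_hash() {
    printf '%.6s' "$1"
}

# Hypothetical helper: build a GitHub compare link.
# $1: "owner/repo", $2: base commit hash, $3: head commit hash
compare_url() {
    printf 'https://github.com/%s/compare/%s...%s\n' \
        "$1" "$(short_hash "$2")" "$(short_hash "$3")"
}

# Placeholder values, not real commits.
compare_url "owner/repo" "0123456789abcdef" "fedcba9876543210"
```

Truncating in one small helper keeps the link format in a single place if the length has to change again.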
[skip ci]
It's not needed anymore since it's now always turned on, and the exact usage (whether to use --enable-opcache or not) is automatically detected: 0b116647c7
[skip-ci]
- Ubuntu is updated to 24.04
- Log files are also uploaded as artifact
- The baseline commit is now correctly set to the merge base commit when the workflow is manually started on a PR
"{{ inputs.opcache || '1' }}" doesn't work the way I thought: I assumed it falls back to 1 only if the input is not set (i.e. when a scheduled workflow runs). The bug is fixed by overwriting the env vars in bash for manual workflows.
- Do not run micro_bench.php by default (it would take a very long time to get accurate results)
- Measure instruction count by default during the scheduled runs
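
The manual-workflow override of the opcache env var mentioned above could be sketched roughly as follows. The commit does not say which input value triggered the unwanted fallback, so this assumes the `||` expression also falls back when the input is set to a value the expression treats as falsy. `resolve_opcache` and `INPUT_OPCACHE` are hypothetical names; `GITHUB_EVENT_NAME` is the event name that GitHub Actions exposes to steps.

```shell
# Hypothetical helper: decide the effective PHP_OPCACHE value.
# $1: event name (e.g. "schedule" or "workflow_dispatch")
# $2: raw value of the manual workflow input (may be empty)
resolve_opcache() {
    if [ "$1" = "workflow_dispatch" ] && [ -n "$2" ]; then
        # Manual run: take the input literally, even if it is 0.
        printf '%s\n' "$2"
    else
        # Scheduled run (or empty input): use the default.
        printf '1\n'
    fi
}

# INPUT_OPCACHE is an assumed variable carrying the raw workflow input.
PHP_OPCACHE="$(resolve_opcache "${GITHUB_EVENT_NAME:-schedule}" "${INPUT_OPCACHE:-}")"
export PHP_OPCACHE
```

Doing the fallback in the shell makes the "not set" check explicit instead of relying on how the workflow expression evaluates truthiness.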
First of all, the last successful build predates opcache being made required, so the PHP_OPCACHE setting should be 2 to enable it manually.
Second, the manual flow should comment on the PR of the triggering repo (github.repository), not the repo of the benchmarked code (env.REPOSITORY).
https://wiki.php.net/rfc/make_opcache_required removed the --enable-opcache option, and this change creates a problem for the benchmark: the master branch (containing the RFC implementation) cannot use the deprecated options and config anymore, while earlier versions must still use them.
Therefore, the benchmark had to introduce the PHP_OPCACHE=2 config value (3455b34856) to signal that opcache still has to be manually enabled. After the next benchmark run, PHP_OPCACHE for the previous PHP version has to be switched back to "1".
[skip-ci]
This PR integrates https://github.com/kocsismate/php-version-benchmarks/ into the CI as a nightly job running every day at 12:30 AM UTC. Roughly, the following happens: the benchmark suite spins up an AWS EC2 instance via Terraform, runs the tests according to the configuration, and then the results are committed to the https://github.com/kocsismate/php-version-benchmark-results repository.
To keep the results as stable as possible, the CPU, kernel and other settings of the AWS instance are fine-tuned:
- Hyper-threading is disabled
- Turbo boost is disabled
- C states of the CPU are limited: https://docs.aws.amazon.com/linux/al2/ug/processor_state_control.html#baseline-perf
- The workload is dedicated to a single core by using taskset according to Intel's recommendations (https://web.archive.org/web/20210614053522/https://01.org/node/3774)
- An io2 SSD volume with provisioned IOPS (https://docs.aws.amazon.com/ebs/latest/userguide/provisioned-iops.html#io2-block-express) is attached to the instance so that IO performance is nearly constant
- The instance is dedicated so that the noisy neighbor effect is eliminated: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/dedicated-instance.html
- ASLR is disabled (Disable ASLR for benchmark #13769)
Customizing the CPU is only supported by metal instances among recent instance types according to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html, so in the end, a c7i.metal-24xl instance is used in the eu-west-1 region.
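
For orientation, some of the tuning steps listed above map onto standard Linux interfaces roughly as in the sketch below. This is a dry run that only prints commands, because the real ones need root on suitable hardware. Whether the benchmark applies the settings this way (rather than, say, through instance options or kernel boot parameters) is not stated here, so treat the mapping as an assumption. `print_tuning_commands`, the core index and `bench.php` are hypothetical; C-state limiting is left out because the linked AWS guide configures it through boot parameters.

```shell
# Hypothetical dry-run helper: print the tuning commands instead of running them.
# $1: index of the CPU core reserved for the workload, rest: workload command
print_tuning_commands() {
    core="$1"
    shift
    # Disable hyper-threading (SMT) through sysfs.
    echo "echo off > /sys/devices/system/cpu/smt/control"
    # Disable turbo boost (intel_pstate driver).
    echo "echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo"
    # Disable ASLR system-wide.
    echo "sysctl -w kernel.randomize_va_space=0"
    # Pin the workload to a single core.
    echo "taskset -c $core $*"
}

print_tuning_commands 1 php bench.php
```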
The benchmark suite compares the performance of the latest commit of the master branch at the time the benchmark runs with the last commit of master from the day before yesterday. I.e. if the benchmark runs tomorrow morning at 2 AM, the latest commit will be benchmarked against the last commit pushed yesterday. This makes it possible to spot notable regressions (or improvements) in time. The end goal is to send notifications about any significant change for further analysis. The reason the benchmark is also run for previous commits (even though they may already have been measured the day before) is to make the results less sensitive to changes in the environment or in the benchmark suite itself: if AWS upgrades the OS, or if the code under test is modified, the numbers will likely be affected and the previous results invalidated.
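
Picking the baseline commit by date could be sketched with plain git as shown below. This is only an illustration: `baseline_commit` is a hypothetical helper, the caller chooses the cutoff, and the benchmark suite may select the commit differently.

```shell
# Hypothetical helper: newest commit on a branch that was committed before a cutoff.
# $1: path to a git checkout, $2: branch name, $3: cutoff date in a format git understands
baseline_commit() {
    git -C "$1" rev-list -1 --before="$3" "$2"
}

# Example call (path and date are placeholders):
#   baseline_commit /path/to/checkout master "2024-01-02 00:00:00 +0000"
```

Note that `--before` filters on the committer date, so rebased or cherry-picked commits are ordered by when they landed on the branch rather than when they were authored.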