* tree-wide: Replace `WRONG_PARAM_COUNT` by `ZEND_WRONG_PARAM_COUNT()`
This is a direct alias.
* tree-wide: Replace `ZEND_WRONG_PARAM_COUNT()` by its definition
This macro was hiding control flow (the return statement) and thus was
particularly unhygienic.
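To illustrate why a macro that hides a `return` is hard to read, here is a minimal sketch. All names in it (`SKETCH_WRONG_PARAM_COUNT`, `sketch_wrong_param_count`, the two handlers) are hypothetical stand-ins and not the Zend API; the real macro body may differ in detail.

```c
/* Hypothetical stand-ins for illustration only; not the Zend API. */
static int warnings_emitted = 0;

static void sketch_wrong_param_count(void)
{
    warnings_emitted++; /* stands in for raising the error */
}

/* Before: the macro contains a `return`, invisible at the call site. */
#define SKETCH_WRONG_PARAM_COUNT() \
    do { sketch_wrong_param_count(); return; } while (0)

static void handler_with_macro(int argc, int *result)
{
    if (argc != 1) {
        SKETCH_WRONG_PARAM_COUNT(); /* leaves the function here */
    }
    *result = 1;
}

/* After: the macro is replaced by its definition, so the early
 * return is visible to the reader. */
static void handler_expanded(int argc, int *result)
{
    if (argc != 1) {
        sketch_wrong_param_count();
        return;
    }
    *result = 1;
}
```

Both handlers behave identically; only the second one shows its control flow at the call site.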
Three optimizations:
- If the entire string is returned, we don't need to duplicate it.
- Use packed filling logic.
- Use fast construction of strings. This is useful when splitting
strings on length=1. In that case I get a 6x speedup in the code
below.
Bench:
```php
$x = str_repeat('A', 100);
for ($i = 0; $i < 1000000; $i++) {
    str_split($x, 10);
}
```
On an i7-4790:
```
Benchmark 1: ./sapi/cli/php x.php
Time (mean ± σ): 160.1 ms ± 6.4 ms [User: 157.3 ms, System: 1.8 ms]
Range (min … max): 155.6 ms … 184.7 ms 18 runs
Benchmark 2: ./sapi/cli/php_old x.php
Time (mean ± σ): 202.6 ms ± 4.0 ms [User: 199.1 ms, System: 1.9 ms]
Range (min … max): 197.4 ms … 209.2 ms 14 runs
Summary
./sapi/cli/php x.php ran
1.27 ± 0.06 times faster than ./sapi/cli/php_old x.php
```
The performance gain increases with smaller lengths.
Since we're returning a new string, we can use RETURN_NEW_STR(). We can
also use zend_string_efree() for the strings that we replace, because
they have a refcount of 1 (RC1).
* random: Remove `php_random_status`
Since 162e1dce98, the `php_random_status` struct contains just a single
`void*`. This results in needless indirection when accessing the engine
state and hurts readability, because of the additional non-meaningful
`->state` references and the local helper variables they require.
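A rough sketch of the indirection being removed, under the assumption that the wrapper looks roughly like a struct holding one `void*`. The type and function names here (`sketch_status`, `sketch_engine_state`, `next_via_status`, `next_direct`) are hypothetical and do not mirror the actual ext/random declarations.

```c
#include <stdint.h>

/* Hypothetical engine state, for illustration only. */
typedef struct {
    uint64_t counter;
} sketch_engine_state;

/* Hypothetical wrapper that contains nothing but a pointer. */
typedef struct {
    void *state;
} sketch_status;

/* Before: every access hops through the wrapper first. */
static uint64_t next_via_status(sketch_status *status)
{
    sketch_engine_state *s = status->state; /* extra helper variable */
    return ++s->counter;
}

/* After: the state pointer is passed around directly. */
static uint64_t next_direct(void *state)
{
    sketch_engine_state *s = state;
    return ++s->counter;
}
```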
There is also a small, but measurable performance benefit:

    <?php
    $e = new Random\Engine\Xoshiro256StarStar(0);
    $r = new Random\Randomizer($e);

    for ($i = 0; $i < 15; $i++)
        var_dump(strlen($r->getBytes(100000000)));

goes from roughly 3.85s down to 3.60s.
The names of the `status` variables have not yet been touched to keep the diff
small. They will be renamed to the more appropriate `state` in a follow-up
cleanup commit.
* Introduce `php_random_algo_with_state`
While __php_mempcpy is only used by ext/standard/crypt_sha*, the
mempcpy "pattern" is used everywhere.
This commit removes __php_mempcpy, adds zend_mempcpy and transforms
open-coded parts into function calls.
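As a sketch of the mempcpy "pattern": copy `n` bytes, then continue writing right after them. `sketch_mempcpy` and the two `join_*` helpers below are hypothetical names for illustration; the exact signature of `zend_mempcpy` is not taken from the source and may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: like memcpy(), but returns the position
 * just past the copied bytes instead of the start of dest. */
static void *sketch_mempcpy(void *dest, const void *src, size_t n)
{
    return (char *)memcpy(dest, src, n) + n;
}

/* Open-coded variant: memcpy() followed by a manual pointer bump.
 * The caller must provide a buffer large enough for both strings
 * plus the terminating NUL. */
static size_t join_open_coded(char *out, const char *a, const char *b)
{
    char *p = out;
    size_t la = strlen(a), lb = strlen(b);

    memcpy(p, a, la);
    p += la;
    memcpy(p, b, lb);
    p += lb;
    *p = '\0';
    return (size_t)(p - out);
}

/* Same logic with the helper: one call per appended piece. */
static size_t join_with_mempcpy(char *out, const char *a, const char *b)
{
    char *p = out;

    p = sketch_mempcpy(p, a, strlen(a));
    p = sketch_mempcpy(p, b, strlen(b));
    *p = '\0';
    return (size_t)(p - out);
}
```

Both functions produce the same buffer contents; the helper only removes the repeated "copy, then advance" boilerplate.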
The current implementation uses a nested loop (for + goto), which has
complexity O(|s1| * |s2|). If we instead use a lookup table, the
complexity drops to O(|s1| + |s2|).
This is conceptually the same strategy that common C library
implementations such as glibc and musl use.
A variant that uses a bitvector instead of a table also gives a speed-up,
but the table variant was about 1.34x faster than the bitvector variant.
On microbenchmarks the table variant easily gave a 5x speedup.
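The lookup-table idea can be sketched as follows, assuming the function in question is a span-style scan in the spirit of `strspn()`/`strcspn()` (the text above does not name it). The names `sketch_strspn` and `sketch_strcspn` are hypothetical, and this is not the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Length of the initial segment of s1 consisting only of bytes in s2.
 * Building the table costs O(|s2|), the scan costs O(|s1|). */
static size_t sketch_strspn(const char *s1, size_t len1,
                            const char *s2, size_t len2)
{
    bool in_set[256] = { false };
    size_t n = 0;

    for (size_t i = 0; i < len2; i++) {
        in_set[(unsigned char)s2[i]] = true;
    }
    while (n < len1 && in_set[(unsigned char)s1[n]]) {
        n++;
    }
    return n;
}

/* Complement: length of the initial segment with no bytes from s2. */
static size_t sketch_strcspn(const char *s1, size_t len1,
                             const char *s2, size_t len2)
{
    bool in_set[256] = { false };
    size_t n = 0;

    for (size_t i = 0; i < len2; i++) {
        in_set[(unsigned char)s2[i]] = true;
    }
    while (n < len1 && !in_set[(unsigned char)s1[n]]) {
        n++;
    }
    return n;
}
```

Each byte of `s1` is checked with a single table load, which replaces the inner loop over `s2` of the nested-loop version.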
This can bring a 1.4-1.5% performance improvement in the Symfony
benchmark.
Closes GH-12431.