This new implementation uses the new encoding conversion filters.
Aside from fewer LOC and (hopefully) improved readability,
the differences are as follows:
BEHAVIOR CHANGES:
- The old implementation used signed arithmetic when operating
on the 'convmap'. This meant that results could be surprising when
using convmap entries with 1 in the MSB. Further, types like 'int'
were used rather than those with a specific bit width, such as
'int32_t'. This meant that results could also depend on the
platform width of an 'int'.
Now unsigned arithmetic is used, with explicit bit widths.
- Similarly, while converting decimal numeric entities, the
legacy implementation would ensure that the value never overflowed
INT_MAX, and if it did, the entity would be treated as invalid
and passed through unconverted.
However, that again means that results depend on the platform
size of an 'int'. So now, we use a value with explicit bit width
(32 bits) to hold the value of a deconverted decimal entity, and
ensure that the entity value does not overflow that.
Further, because we are using an UNSIGNED 32-bit value rather
than a signed one, the ceiling for how large a decimal entity
can be is higher now.
All of this will probably not affect anyone, since Unicode
codepoints above U+10FFFF are invalid anyways. To see the
difference, you need to be using a text encoding like UCS-4,
which allows huge 'codepoints'.
- If it saw something which looked like a hex entity, but
turned out not to be a valid numeric entity, the old
implementation would sometimes convert the hexadecimal
digits a-f to A-F (uppercase). The new implementation passes
invalid numeric entities through without performing case
conversion.
- The old implementation of mb_encode_numericentity was
limited in how many decimal/hex digits it could emit.
If a text encoding like UCS-4 was in use, where 'codepoints'
can have huge values (larger than the valid range
stipulated by the Unicode standard), it would not error
out on a 'codepoint' whose value was too large for it,
but would rather mangle the value and emit a numeric
entity which decoded to some other random codepoint.
The new implementation is able to emit enough digits to
express any value which fits in 32 bits.
PERFORMANCE:
Based on micro-benchmarks run on my development machine:
Decoding numeric HTML entities is about 4 times faster, for
both decimal and hexadecimal entities, across a variety of
input string lengths. Encoding is about 3 times faster.
With request timeouts configured, php-fpm occasionally prints the
following warning:
WARNING: failed to acquire scoreboard
This is happens when php-fpm checks the child scoreboards for timeouts,
but fails to acquire a lock immediately. As this can (and does) occur
during normal operation, this commit downgrades this to a notice.
Closes#9019.
Switch statements may generate a large number of exit points. Once the max
number of exit points is reached, get_exit_addr() returns NULL. This was not
checked, and this resulted in a jump table with some 0 addresses.
Not such as fix but taking more precautions.
Indeed, the arc4random has two little flaws in this platform,
one already caught upfront by the extension (ie size 0), also
internal use of ccrng_generate which can silently fail in few rare
cases.
Closes#7824.
Implements https://wiki.php.net/rfc/partially-supported-callables-expand-deprecation-notices
so that uses of "self" and "parent" in is_callable() and callable
type constraints now raise a deprecation notice, independent of the
one raised when and if the callable is actually invoked.
A new flag is added to the existing check_flags parameter of
zend_is_callable / zend_is_callable_ex, for use in internal calls
that would otherwise repeat the notice multiple times. In particular,
arguments to internal function calls are checked first based on
arginfo, and then again during ZPP, so the former suppresses the
deprecation notice.
Some existing tests which raised this deprecation have been updated
to avoid the syntax, but the existing version retained for maximum
regression coverage until it is made an error.
With thanks to Juliette Reinders Folmer for the RFC and initial
investigation.
Closes GH-8823.
We add support for creating `VT_ERROR` variants via `__construct()`,
and allow casting to int via `variant_cast()` and `variant_set_type()`.
We do not, however, allow type conversion by other means, to avoid
otherwise easily introduced type confusion. VB(A) also only allows
explicit type conversion.
We also introduce `DISP_E_PARAMNOTFOUND` which might be the most
important `scode` for this purpose, since this allows to skip optional
parameters in method calls.
Closes GH-8886.