* Fix use-after-free in FE_FREE with GC interaction
When FE_FREE with ZEND_FREE_ON_RETURN frees the loop variable during
an early return from a foreach loop, the live range for the loop
variable incorrectly extended past the FE_FREE to the normal loop
end. This caused GC to access the already-freed loop variable when
it ran after the RETURN opcode, resulting in a use-after-free.
Fix by splitting the ZEND_LIVE_LOOP range when an FE_FREE with
ZEND_FREE_ON_RETURN is encountered:
- One range covers the early return path up to the FE_FREE
- A separate range covers the normal loop end FE_FREE
- Multiple early returns create multiple separate ranges
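The splitting described above can be illustrated with a minimal sketch. It is a toy model, not the Zend Engine's implementation: the `toy_*` opcodes, structs, and function are hypothetical names introduced only for this example, and the opcode stream is assumed to be linear, with each flagged FE_FREE followed by its RETURN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy opcodes; not the Zend Engine's real definitions. */
enum toy_opcode { TOY_FE_RESET, TOY_FE_FETCH, TOY_FE_FREE, TOY_RETURN, TOY_OTHER };

struct toy_op {
    enum toy_opcode opcode;
    int free_on_return; /* models ZEND_FREE_ON_RETURN on an FE_FREE */
};

/* The loop variable is live for ops in the half-open interval [start, end). */
struct toy_range {
    size_t start;
    size_t end;
};

/*
 * Collect the live ranges of the loop variable defined by ops[def].
 * Each FE_FREE closes the current range. An FE_FREE flagged free_on_return
 * belongs to an early return, so the next range starts after the RETURN that
 * follows it. The unflagged FE_FREE is the normal loop end and stops the scan.
 * Returns the number of ranges written to out (at most max_ranges).
 */
size_t toy_loop_var_ranges(const struct toy_op *ops, size_t n_ops, size_t def,
                           struct toy_range *out, size_t max_ranges)
{
    size_t count = 0;
    size_t start = def + 1;

    assert(def < n_ops);

    for (size_t i = def + 1; i < n_ops; i++) {
        if (ops[i].opcode != TOY_FE_FREE) {
            continue;
        }
        if (start < i && count < max_ranges) {
            out[count].start = start;
            out[count].end = i;
            count++;
        }
        if (!ops[i].free_on_return) {
            break; /* normal loop end: the variable is dead from here on */
        }
        /* Skip the early-return path; the variable is already freed there. */
        while (i < n_ops && ops[i].opcode != TOY_RETURN) {
            i++;
        }
        start = i + 1;
    }
    return count;
}
```

In this model, a loop body with one early return yields two ranges, and every additional early return adds one more. No range covers a RETURN that follows an early FE_FREE, so a GC run at that point would not see the freed variable as live.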
* Split the live-ranges of loop variables again
b0af9ac733 removed the live-range splitting of foreach variables, but it added handling only to ZEND_HANDLE_EXCEPTION.
This was sort-of elegant, until 8258b7731b revealed that it would leak the return variable, which required some more special handling.
Live tmpvar rooting was later added in 52cf7ab8a2, but it did not take already-freed loop variables into account. These can also occur during ZEND_RETURN and cannot be trivially accounted for without even more complicated handling in the zend_gc_*_tmpvars() functions.
This commit also proposes a simpler way of tracking the loop end in loopvar freeing ops: handle it directly during live range computation rather than during compilation, eliminating the need for opcache to handle it specifically.
Further, opcache used live_ranges in its basic block computation in the past, which it no longer does. This complication is therefore no longer necessary, and the approach should now actually be simpler.
Closes #20766.
Signed-off-by: Bob Weinand <bobwei9@hotmail.com>
Co-authored-by: Gustavo Lopes <mail@geleia.net>
Attributes may themselves contain elements which can have a doc comment of
their own (namely Closures). A doc comment before the attribute list is
generally understood as belonging to the symbol having the attributes.
Fixes php/php-src#20895.
Internal enums can be cloned and compared, unlike user enums, because we didn't set default_object_handlers when registering internal enums.
Fix by setting default_object_handlers when registering internal enums.
Fixes GH-20914
Closes GH-20915
Over the last few years, I refactored mbstring to perform encoding conversion
a buffer at a time, rather than a single byte at a time. This resulted in a
huge performance increase.
After the refactoring, the old "byte-at-a-time" code was retained for two
reasons:
1) It was used by the mailparse PECL extension.
2) It was used to implement mb_strcut for some text encodings.
However, after reviewing mailparse's use of mbstring, it is clear that
mailparse only relies on mbstring for decoding of QPrint, and possibly
Base64. It does not use the byte-at-a-time conversion code for any
other encoding.
Further, mb_strcut only relies on the byte-at-a-time conversion code
for a limited number of legacy text encodings, such as ISO-2022-JP,
HZ, UTF-7, etc.
Hence, we can remove over 5000 lines of unused code without breaking
anything. This will help to reduce binary size, and make the mbstring
codebase easier to navigate for new contributors.
The legacy mbfl_strcut function is only used to implement mb_strcut
for legacy text encodings which 1) do not use a fixed number of bytes
per codepoint, 2) do not have an 'mblen_table' which can be used to
quickly determine the codepoint length of a byte sequence, and 3) do
not have a specialized 'mb_cut' function which implements mb_strcut
for that text encoding.
Remove unused code from mbfl_strcut, and leave only what is currently
needed for the implementation of mb_strcut.
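The three conditions above amount to a simple predicate. The following is a minimal sketch under stated assumptions: `struct toy_encoding`, its fields, and `toy_needs_legacy_strcut` are hypothetical names for illustration and do not reflect the real layout of mbstring's encoding descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut callback signature, for illustration only. */
typedef size_t (*toy_cut_fn)(const unsigned char *str, size_t from, size_t len);

/* Hypothetical descriptor; the real mbstring structs are laid out differently. */
struct toy_encoding {
    const char *name;
    unsigned bytes_per_codepoint;     /* 0 when the width is variable */
    const unsigned char *mblen_table; /* NULL when no length table exists */
    toy_cut_fn mb_cut;                /* NULL when no specialized cut exists */
};

/*
 * True only when none of the faster strategies applies: the encoding has
 * no fixed width, no length table, and no specialized cut function.
 */
bool toy_needs_legacy_strcut(const struct toy_encoding *enc)
{
    assert(enc != NULL);
    return enc->bytes_per_codepoint == 0
        && enc->mblen_table == NULL
        && enc->mb_cut == NULL;
}
```

Only encodings for which this predicate holds would still reach the legacy path, which is why the rest of the function can be dropped.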
The element may still be in use in other places, so the linking pointers
should be kept consistent. If they are not, the "move forward" code in
the sample test will read a stale, dangling pointer.
Closes GH-20885.
In the following optimization:
JMPZ(X,L1) JMP(L2) L1: -> JMPNZ(X,L2) NOP
L1 must not be followed by another block, so that it may safely be followed by
the block containing the JMPNZ. get_next_block() is used to verify L1 is the
direct follower. This function also skips empty blocks, including live, empty
target blocks, which will then implicitly follow the new follow block. This will
result in L1 being followed by two separate blocks, which is not possible.
Resolve this by making get_next_block() stop at target blocks.
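A minimal sketch of the intended lookup behaviour follows. It is a toy model rather than opcache's code: `struct toy_block` and `toy_get_next_block` are hypothetical, and blocks are assumed to sit in a plain array in layout order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical basic block; not opcache's real block structure. */
struct toy_block {
    int len;        /* number of opcodes in the block; 0 means empty */
    bool is_target; /* true if some jump targets this block */
};

/*
 * Return the index of the block that follows `block` in layout order,
 * skipping empty blocks that nothing jumps to. An empty block that is a
 * jump target is NOT skipped, so it is reported as the follower.
 * Returns -1 when there is no following block.
 */
int toy_get_next_block(const struct toy_block *blocks, int count, int block)
{
    assert(block >= 0 && block < count);

    int next = block + 1;
    while (next < count && blocks[next].len == 0 && !blocks[next].is_target) {
        next++;
    }
    return next < count ? next : -1;
}
```

With this behaviour, an optimization that requires L1 to be the direct follower no longer matches when a live, empty target block sits in between, so that target block cannot end up as a second follower.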
Fixes OSS-Fuzz #472563272
Closes GH-20850