For x ? y : z style structures, the live range starts at z, but
may also hold the value of y. Make sure that the refcounting check
takes this into account, by checking the type of a potential phi
user.
Failure to rebind such closures is not necessarily related to them
being created by `ReflectionFunctionAbstract::getClosure()`, so we fix
the error message.
Closes GH-6424.
- Reject otherwise valid kuten codes which don't map to anything in JIS X 0208.
- Handle truncated multi-byte characters as an error.
- Convert Shift-JIS 0x7E to Unicode 0x203E (overline) as recommended by the
Unicode Consortium, and as iconv does.
- Convert Shift-JIS 0x5C to Unicode 0xA5 (yen sign) as recommended by the
Unicode Consortium, and as iconv does.
(NOTE: This will affect PHP scripts which use an internal encoding of
Shift-JIS! PHP assigns a special meaning to 0x5C, the backslash. For example,
it is used for escapes in double-quoted strings. Mapping the Shift-JIS yen
sign to the Unicode yen sign means the yen sign will not be usable for
C escapes in double-quoted strings. Japanese PHP programmers who want to
write their source code in Shift-JIS for some strange reason will have to
use the JIS X 0208 backlash or 'REVERSE SOLIDUS' character for their C
escapes.)
- Convert Unicode 0x5C (backslash) to Shift-JIS 0x815F (reverse solidus).
- Immediately handle error if first Shift-JIS byte is over 0xEF, rather than
waiting to see the next byte. (Previously, the value used was 0xFC, which is
the limit for the 2nd byte and not the 1st byte of a multi-byte character.)
- Don't allow 'control characters' to appear in the middle of a multi-byte
character.
The test case for bug 47399 is now obsolete. That test assumed that a number
of Shift-JIS byte sequences which don't map to any character were 'valid'
(because the byte values were within the legal ranges).
Let's test the current behavior here. It might not be right, but
it's long-standing behavior.
Nearly missed an assertion failure here because the test was
XFAILed...
While fixing bugs in mbstring, one of my new test cases failed with a strange
error message stating: 'Warning: Undefined array key 1...', when clearly the
array key had been set properly.
GDB'd that sucker and found that JIT'd PHP code was calling directly into
`zend_hash_add_new` (which was not converting the numeric string key to an
integer properly). But where was that code coming from? I examined the disasm,
looked up symbols to figure out where call instructions were going, then grepped
the codebase for those function names. It soon became clear that the disasm I
was looking at was compiled from `zend_jit_fetch_dim_w_helper`.
This broke one old test (Zend/tests/multibyte_encoding_003.phpt), which used
a PHP script encoded as UTF-16. The problem was that to terminate the test
script, we need the text: "\n--EXPECT--". Out of that text, the terminating
newline (0x0A byte) becomes part of the resulting test script; but a bare
0x0A byte with no 0x00 is not valid UTF-16.
Since we now treat truncated UTF-16 characters as erroneous, an extra '?' is
appended to the output as an 'illegal character' marker.
Really, if we are running PHP scripts which are treated as encoded in UTF-16
or some other arbitrary text encoding (not ASCII), and the script is not
actually a valid string in that encoding, inserting '?' characters into the
code which the PHP interpreter runs is a bad thing to do. In such cases, the
script shouldn't be treated as UTF-16 (or whatever) at all.
I wonder if mbstring's encoding detection is being used in 'non-strict' mode?
This makes a number of related changes to the generator tree
management, that should hopefully make it easier to understand,
more robust and faster for the common linear-chain case. Fixes
https://bugs.php.net/bug.php?id=80240, which was the original
motivation here.
* Generators now only add a ref to their direct parent.
* Nodes only store their children, not their leafs, which avoids
any need for leaf updating. This means it's no longer possible
to fetch the child for a certain leaf, which is something we
only needed in one place (update_current). If multi-children
nodes are involved, this will require doing a walk in the other
direction (from leaf to root). It does not affect the common
case of single-child nodes.
* The root/leaf pointers are now seen as a pair. One leaf generator
can point to the current root. If a different leaf generator is
used, we'll move the root pointer over to that one. Again, this
is a cache to make the common linear chain case fast, trees may
need to scan up the parent link.
Closes GH-6344.
As filenames are no longer interned, we need to keep a reference
to the zend_string to make sure it isn't freed.
To avoid a nominal source compatibility break, create a new member
in the globals.
We need to perform trait scope fixup for both methods involved
in the inheritance check. For that purpose we already need to
thread through a separate fn scope through the entire inheritance
checking machinery.
If we have an undefined variable and null is not accepted by the
return type, we want to throw just the undef var error.
In this case this lead to an infinite loop, because we overwrite
the exception opline in SAVE_OPLINE and it does not get reset
when chaining into a previous exception. Add an assertiong to
catch this case earlier.
If the RHS has INDIRECT elements, we do not those to be added to
the LHS verbatim. As we're using UPDATE_INDIRECT, we might even
create a nested INDIRECT that way.
This is a side-quest of oss-fuzz #26245.