The new SSE2-based implementation of mb_check_encoding for UTF-8 is
about 10% faster for 0-5 byte strings, more than 3 times faster for
~100-byte strings, and just under 4 times faster for ~10,000-byte
strings.
I believe it may be possible to make this function much faster again.
Some possible directions for further performance optimization include:
• If other ISA extensions like AVX or AVX-512 are available, use a
similar algorithm, but process text in blocks of 32 or 64 bytes
(instead of 16 bytes).
• If other SIMD ISA extensions are available, use the greater variety
of available instructions to make some of the checks tighter.
• Even if only SSE/SSE2 are available, find clever ways to squeeze
instructions out of the hot path. This would probably require a lot
of perusing instruction manuals and thinking hard about which SIMD
instructions could be used to perform the same checks with fewer
instructions.
• Find a better algorithm, possibly one where more checks could be
combined (just as the current algorithm combines the checks for
certain overlong code units and reserved codepoints).
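
To give a rough feel for the block-wise approach discussed above, here is
a minimal sketch that checks 16 bytes at a time using SSE2 intrinsics. It
only handles the simplest case (is the whole buffer plain ASCII?) and is
not the actual mb_check_encoding code; the name is_all_ascii is
hypothetical. With AVX2 the same shape would presumably use 32-byte loads.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte in buf[0..len) is < 0x80. */
static int is_all_ascii(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    /* Process 16-byte blocks. _mm_movemask_epi8 gathers the high bit of
     * each of the 16 bytes into a 16-bit mask, so a non-zero mask means
     * at least one non-ASCII byte is present in the block. */
    for (; i + 16 <= len; i += 16) {
        __m128i block = _mm_loadu_si128((const __m128i *)(buf + i));
        if (_mm_movemask_epi8(block) != 0) {
            return 0;
        }
    }

    /* Scalar loop for the remaining 0-15 bytes. */
    for (; i < len; i++) {
        if (buf[i] & 0x80) {
            return 0;
        }
    }
    return 1;
}
```

A real validator would continue with the multi-byte checks when the mask
is non-zero instead of returning early.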
We're in the case of ZEND_JMPZ_EX or ZEND_JMPNZ_EX. The opcode gets
overwritten first, and only afterwards does the code check whether we're
in a JMPZ or JMPNZ case. This results in a wrong optimization.
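
The ordering problem can be sketched in isolation as follows. This is a
simplified illustration, not the optimizer code itself: the enum, the
struct and both function names are hypothetical stand-ins.

```c
/* Hypothetical, simplified opcodes; not the real Zend definitions. */
enum opcode { OP_JMP, OP_JMPZ_EX, OP_JMPNZ_EX };

struct op {
    enum opcode opcode;
};

/* Buggy order: the opcode is overwritten before it is inspected, so the
 * comparison reads the new value and can never match. */
static int rewrite_buggy(struct op *o)
{
    o->opcode = OP_JMP;
    return o->opcode == OP_JMPZ_EX; /* always 0 */
}

/* Fixed order: remember which case we are in, then overwrite. */
static int rewrite_fixed(struct op *o)
{
    int was_jmpz = (o->opcode == OP_JMPZ_EX);
    o->opcode = OP_JMP;
    return was_jmpz;
}
```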
Close GH-10329
This code path was only triggered if inst->cd == NULL. But the freeing
only happens if inst->cd != NULL. There is nothing to free here, so
remove this code. In fact, let's get rid of the goto too, to make the
code easier to read.
* PHP-8.2:
Fix wrong flags check for compression method in phar_object.c
Fix missing check for xmlTextWriterEndElement
Fix substr_replace with slots in repl_ht being UNDEF
* PHP-8.1:
Fix wrong flags check for compression method in phar_object.c
Fix missing check for xmlTextWriterEndElement
Fix substr_replace with slots in repl_ht being UNDEF
I found this issue using a static analysis tool, which reported that the condition was always false.
We can see that flags is assigned in the switch statement above, but a mistake was made in the comparison.
Closes GH-10328
Signed-off-by: George Peter Banyard <girgias@php.net>
xmlTextWriterEndElement returns -1 if the call fails. There was already
a check for retval, but the return value wasn't assigned to retval. The
other caller of xmlTextWriterEndElement is in
xmlwriter_write_element_ns, which does the check correctly.
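
The pattern can be sketched without libxml2. In the illustration below,
end_element_stub is a hypothetical stand-in for a call that returns -1 on
failure; it is not the real xmlTextWriterEndElement, and the two wrapper
functions are likewise made up for the example.

```c
/* Hypothetical stand-in for a writer call: returns -1 on failure. */
static int end_element_stub(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Buggy: the result is discarded, so retval keeps its earlier value and
 * the failure check can never trigger. Returns 1 for "success". */
static int close_buggy(int should_fail)
{
    int retval = 0;
    end_element_stub(should_fail);
    if (retval == -1) {
        return 0;
    }
    return 1;
}

/* Fixed: assign the return value before checking it. */
static int close_fixed(int should_fail)
{
    int retval = end_element_stub(should_fail);
    if (retval == -1) {
        return 0;
    }
    return 1;
}
```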
Closes GH-10324
Signed-off-by: George Peter Banyard <girgias@php.net>
The check that was supposed to detect whether the array slot was UNDEF
was wrong and never triggered. This resulted in a replacement with the
empty string or the wrong string instead of the correct one. The correct
check pattern can be observed higher up in the function's code.
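
For illustration only, here is a simplified sketch of skipping undefined
slots while walking a replacement array. The slot_type enum, the slot
struct and next_replacement are hypothetical; the real code operates on
Zend hash table buckets and zvals, which are not modelled here.

```c
#include <stddef.h>

/* Hypothetical, simplified slot representation. */
enum slot_type { SLOT_UNDEF, SLOT_STRING };

struct slot {
    enum slot_type type;
    const char *str;
};

/* Returns the string of the next defined slot at or after *pos and
 * advances *pos past it. Returns "" once no defined slot remains. */
static const char *next_replacement(const struct slot *slots, size_t n,
                                    size_t *pos)
{
    /* Skip holes: test the slot's type tag, not its payload. */
    while (*pos < n && slots[*pos].type == SLOT_UNDEF) {
        (*pos)++;
    }
    if (*pos < n) {
        return slots[(*pos)++].str;
    }
    return "";
}
```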
Closes GH-10323
Signed-off-by: George Peter Banyard <girgias@php.net>