1
0
mirror of https://github.com/php/php-src.git synced 2026-04-11 18:13:00 +02:00
Commit Graph

63400 Commits

Author SHA1 Message Date
Alex Dowad
b189aaacc2 Tweaks for accelerated implementation of mb_strlen for UTF-8
On longer strings, this gives a small speed boost of 10% or less.
2023-01-17 10:07:53 +02:00
Alex Dowad
3ae4779305 Add accelerated (SIMD-based) implementation of mb_check_encoding for UTF-8
The new SSE2-based implementation of mb_check_encoding for UTF-8 is
about 10% faster for 0-5 byte strings, more than 3 times faster for
~100-byte strings, and just under 4 times faster for ~10,000-byte
strings.

I believe it may be possible to make this function much faster again.
Some possible directions for further performance optimization include:

• If other ISA extensions like AVX or AVX-512 are available, use a
  similar algorithm, but process text in blocks of 32 or 64 bytes
  (instead of 16 bytes).
• If other SIMD ISA extensions are available, use the greater variety
  of available instructions to make some of the checks tighter.
• Even if only SSE/SSE2 are available, find clever ways to squeeze
  instructions out of the hot path. This would probably require a lot
  of perusing instruction mauals and thinking hard about which SIMD
  instructions could be used to perform the same checks with fewer
  instructions.
• Find a better algorithm, possibly one where more checks could be
  combined (just as the current algorithm combines the checks for
  certain overlong code units and reserved codepoints).
2023-01-17 10:07:53 +02:00
Kamil Tekiela
da550e7762 MYSQL_ATTR_USE_BUFFERED_QUERY is a bool attribute (#10320) 2023-01-16 13:11:38 +00:00
Kamil Tekiela
38dfd20526 Remove main() from mysqli warning (#10321) 2023-01-16 13:10:27 +00:00
Niels Dossche
9006f06a84 Remove dead cleanup code (#10333)
This code path was only triggered if inst->cd == NULL. But the freeing
only happens if inst->cd != NULL. There is nothing to free here, so
remove this code. In fact, let's get rid of the goto too to make the
code more clear to read.
2023-01-16 12:54:35 +00:00
Dmitry Stogov
c010e8fb02 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix GH-10271: Incorrect arithmetic calculations when using JIT
2023-01-16 14:52:14 +03:00
Dmitry Stogov
757e269b89 Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Fix GH-10271: Incorrect arithmetic calculations when using JIT
2023-01-16 14:51:42 +03:00
Dmitry Stogov
42eed7bb4e Fix GH-10271: Incorrect arithmetic calculations when using JIT 2023-01-16 14:51:26 +03:00
Christoph M. Becker
c8955c078a Revert GH-10220
Cf. <https://github.com/php/php-src/pull/10220#issuecomment-1383739816>.

This reverts commit ecc880f491.
This reverts commit 588a07f737.
This reverts commit f377e15751.
This reverts commit b4ba16fe18.
This reverts commit 694ec1deea.
This reverts commit 6b34de8eba.
This reverts commit aa1cd02a43.
This reverts commit 308fd311ea.
This reverts commit 16203b53e1.
This reverts commit 738fb5ca54.
This reverts commit 9fdbefacd3.
This reverts commit cd4a7c1d90.
This reverts commit 928685eba2.
This reverts commit 01e5ffc85c.
2023-01-16 12:27:33 +01:00
Christoph M. Becker
2f4973fd88 Revert GH-10279
Cf. <https://github.com/php/php-src/pull/10220#issuecomment-1383739816>.

This reverts commit 45a128c9de.
This reverts commit 1eb71c3f15.
This reverts commit 492523a779.
This reverts commit c7a4633891.
This reverts commit 308adb915c.
This reverts commit cd27d5e07f.
This reverts commit c5933409b4.
This reverts commit 46371f4eb3.
This reverts commit 623e2e9fc6.
This reverts commit e7434c1247.
This reverts commit d28d323ca2.
This reverts commit 1a067b84ee.
This reverts commit a55c0c5fc3.
This reverts commit b5aeb3a4d4.
This reverts commit f061a035e4.
This reverts commit b088575119.
This reverts commit b1d48774a7.
This reverts commit 94f9a20ce6.
This reverts commit 4831e48708.
This reverts commit cd985de190.
This reverts commit 9521d21681.
This reverts commit d6136151e9.
2023-01-16 12:25:59 +01:00
Christoph M. Becker
bf1cfc0753 Revert GH-10300
Cf. <https://github.com/php/php-src/pull/10220#issuecomment-1383739816>.

This reverts commit 68ada76f9a.
his reverts commit 45384c6e20.
This reverts commit ef7fbfd710.
This reverts commit 9b9ea0d7c6.
This reverts commit f15747c26b.
This reverts commit e883ba93c4.
This reverts commit 7e87551c37.
This reverts commit 921274d2b8.
This reverts commit fc1f528e5e.
This reverts commit 0961715cda.
This reverts commit a93f264526.
This reverts commit 72dd94e1c6.
This reverts commit 29b2dc8964.
This reverts commit 05c7653bba.
This reverts commit 5190e5c260.
This reverts commit 6b55bf228c.
This reverts commit 184b4a12d3.
This reverts commit 4c31b7888a.
This reverts commit d44e9680f0.
This reverts commit 4069a5c43f.
2023-01-16 12:22:54 +01:00
Dmitry Stogov
0d011e4626 Revert "Merge branch 'PHP-8.0' into PHP-8.1"
This reverts commit 0116864cd3, reversing
changes made to 1f715f5658.
2023-01-16 11:15:30 +03:00
George Peter Banyard
540e5104df Drop key_suffix parameter in php_url_encode_hash_ex()
The suffix was always constant and the same value between calls and depends on a prefix being needed
2023-01-15 16:00:18 +00:00
George Peter Banyard
c9b8d1bfaa Use zend_string* instead of char* and size_t pair for key_prefix 2023-01-15 16:00:18 +00:00
George Peter Banyard
76eaff080a Use a zend_string* for arg_sep in php_url_encode_hash_ex()
This prevent a repeated strlen() call for known information
2023-01-15 16:00:18 +00:00
George Peter Banyard
20a6638e22 Extract scalar url encoding into its own function 2023-01-15 16:00:18 +00:00
George Peter Banyard
7d33a30b40 Handle floats directly in http_build_query() 2023-01-15 16:00:17 +00:00
George Peter Banyard
ec7c7a7550 Add more tests for http_build_query()
Some with unusual types like resource and null
A lot more tests for objects
2023-01-15 16:00:17 +00:00
George Peter Banyard
c177ea91d4 Move http_build_query() tests to the HTTP test folder 2023-01-15 16:00:17 +00:00
George Peter Banyard
5624cbbed1 Merge branch 'PHP-8.2'
* PHP-8.2:
  Fix wrong flags check for compression method in phar_object.c
  Fix missing check for xmlTextWriterEndElement
  Fix substr_replace with slots in repl_ht being UNDEF
2023-01-15 15:43:57 +00:00
George Peter Banyard
ec377c687d Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Fix wrong flags check for compression method in phar_object.c
  Fix missing check for xmlTextWriterEndElement
  Fix substr_replace with slots in repl_ht being UNDEF
2023-01-15 15:43:34 +00:00
Niels Dossche
347b7c3628 Fix wrong flags check for compression method in phar_object.c
I found this issue using static analysis tools, it reported that the condition was always false.
We can see that flags is assigned in the switch statement above, but a mistake was made in the comparison.

Closes GH-10328

Signed-off-by: George Peter Banyard <girgias@php.net>
2023-01-15 15:35:35 +00:00
Niels Dossche
11a1feb0d7 Fix missing check for xmlTextWriterEndElement
xmlTextWriterEndElement returns -1 if the call fails. There was already
a check for retval, but the return value wasn't assigned to retval. The
other caller of xmlTextWriterEndElement is in
xmlwriter_write_element_ns, which does the check correctly.

Closes GH-10324

Signed-off-by: George Peter Banyard <girgias@php.net>
2023-01-15 15:34:43 +00:00
Niels Dossche
4bbbe6d652 Fix substr_replace with slots in repl_ht being UNDEF
The check that was supposed to check whether the array slot was UNDEF
was wrong and never triggered. This resulted in a replacement with the
empty string or the wrong string instead of the correct one. The correct
check pattern can be observed higher up in the function's code.

Closes GH-10323

Signed-off-by: George Peter Banyard <girgias@php.net>
2023-01-15 15:31:34 +00:00
Max Kellermann
d44e9680f0 ext/opcache/ZendAccelerator.h: add missing include for "INIT_FUNC_ARGS" 2023-01-15 15:07:58 +00:00
Niels Dossche
a60c6ee0ac Mark constant static arrays in function bodies actually as const (#10325) 2023-01-15 14:51:31 +00:00
Niels
e951202a69 Remove useless check, search_str is always true here (#10322) 2023-01-15 00:32:51 +01:00
Niels
6ab503814d Make array_pad's $length warning less confusing (#10149)
Remove array_pad's arbitrary length restriction

The error message was wrong; it *is* possible to use a larger length.
Furthermore, there is an arbitrary restriction on the new array's
length.

Fix both by checking the length against HT_MAX_SIZE.
2023-01-14 12:15:56 +01:00
David CARLIER
690db97c6d intl extension couple of micro optimisations for error edge cases. (#10044)
making c++ compile time few enums ranges.
2023-01-14 07:26:05 +00:00
Max Kellermann
7473b86f10 build/php.m4: remove test for integer types (#10304)
These are mandatory in C99, so it's a pointless waste of time to check
for them.

(Actually, the fixed-size integer types are not mandatory, but if they
are really not available on some theoretical system, PHP's fallbacks
won't work either, so nothing is gained from this check.)
2023-01-13 11:51:15 +00:00
Max Kellermann
061fcdb0a5 ext/opcache: use C11 atomics for "restart_in" (#10276)
Cheaper than fcntl(F_SETLK).  The same is done already on Windows, so
if that works, why not use it everywhere?  (Of course, only if the
compiler supports this C11 feature.)

As a bonus, the code in this commit also works on C++ via C++11
std::atomic, just in case somebody adds some C++ code to the opcache
extension one day.
2023-01-13 00:02:35 +01:00
David Carlier
9198e8894b socket DF flag on UDP socket via IP_MTU_DISCOVER on Linux and IP_DONTFRAGMENT on FreeBSD for path MTU discovery purpose.
idea proposal via ml :
https://marc.info/?l=php-internals&m=167329288509393&w=2

Close GH-10282
2023-01-12 22:22:30 +00:00
David Carlier
55d19eee49 posix adding posix_fpathconf.
follow-up on GH-10238 but with the file descriptor flavor.

Close GH-10253
2023-01-12 22:15:31 +00:00
Tim Düsterhus
0116864cd3 Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Revert "Make build work with newer OpenSSL"
  [ci skip] Next release will be 8.0.28
  [ci skip] Prepare for PHP 8.0.27 GA
2023-01-12 21:48:23 +01:00
Tim Düsterhus
013e0f98ac Merge branch 'PHP-8.2'
* PHP-8.2:
  unserialize: Strictly check for `:{` at object start (#10214)
2023-01-12 19:57:22 +01:00
Tim Düsterhus
f2e8c5da90 unserialize: Strictly check for :{ at object start (#10214)
* unserialize: Strictly check for `:{` at object start

* unserialize: Update CVE tests

It's unlikely that the object syntax error contributed to the actual CVE. The
CVE is rather caused by the incorrect object serialization data of the `C`
format. Add a second string without such a syntax error to ensure that path is
still executed as well to ensure the CVE is absent.

* Fix test expectation in gmp/tests/bug74670.phpt

No changes to the input required, because the test actually is intended to
verify the behavior for a missing `}`, it's just that the report position changed.

* NEWS

* UPGRADING
2023-01-12 19:55:54 +01:00
George Peter Banyard
410e78651a Merge branch 'PHP-8.2'
* PHP-8.2:
  Use absolute paths in OPCache tests when calling `opcache_compile_file()`
2023-01-12 15:48:13 +00:00
George Peter Banyard
31fd34aa4c Merge branch 'PHP-8.1' into PHP-8.2
* PHP-8.1:
  Use absolute paths in OPCache tests when calling `opcache_compile_file()`
2023-01-12 15:48:01 +00:00
Thomas Gerbet
1f715f5658 Use absolute paths in OPCache tests when calling opcache_compile_file()
This make sure the tests do not fail if they are not run from the
repository root.

Closes GH-10266

Signed-off-by: George Peter Banyard <girgias@php.net>
2023-01-12 15:47:24 +00:00
Alex Dowad
a90358639d Implement conditional casing for Greek letter sigma when title-casing text 2023-01-12 17:41:11 +02:00
Alex Dowad
290efe842d Adjust code which checks if encoding is ISO-8859-9 when converting case
Instead of checking the 'encoding number' to see if we are converting
case for ISO-8859-9 text, compare pointers instead.

This should free up 1 register in php_unicode_convert_case.
2023-01-12 17:41:11 +02:00
Alex Dowad
39b46a5398 Implement Unicode conditional casing rules for Greek letter sigma
The capital Greek letter sigma (Σ) should be lowercased as σ except
when it appears at the end of a word; in that case, it should be
lowercased as the special form ς.

This rule is included in the Unicode data file SpecialCasing.txt.
The condition for applying the rule is called "Final_Sigma" and is
defined in Unicode technical report 21. The rule is:

• For the special casing form to apply, the capital letter sigma must
  be preceded by 0 or more "case-ignorable" characters, preceded by
  at least 1 "cased" character.
• Further, capital sigma must NOT be followed by 0 or more
  case-ignorable characters and then at least 1 cased character.

"Case-ignorable" characters include certain punctuation marks, like
the apostrophe, as well as various accent marks. There are actually
close to 500 different case-ignorable characters, including accent marks
from Cyrillic, Hebrew, Armenian, Arabic, Syriac, Bengali, Gujarati,
Telugu, Tibetan, and many other alphabets. This category also includes
zero-width spaces, codepoints which indicate RTL/LTR text direction,
certain musical symbols, etc.

Since the rule involves scanning over "0 or more" of such
case-ignorable characters, it may be necessary to scan arbitrarily far
to the left and right of capital sigma to determine whether the special
lowercase form should be used or not. However, since we are trying to
be both memory-efficient and CPU-efficient, this implementation limits
how far to the left we will scan. Generally, we scan up to 63 characters
to the left looking for a "cased" character, but not more.

When scanning to the right, we go up to the end of the string if
necessary, even if it means scanning over thousands of characters.

Anyways, it is almost impossible to imagine that natural text will
include "words" with more than 63 successive apostrophes (for example)
followed by a capital sigma.

Closes GH-8096.
2023-01-12 17:41:11 +02:00
Max Kellermann
24b311bdd7 ext/opcache/zend_shared_alloc: rename _register_xlat_entry() params
The name "new" happens to be a C++ keyword, which was the my reason to
rethink those names.

The "xlat_table" is not only used to translate pointers for persisting
scripts to shared memory, but is also used to annoate pointers
(e.g. by the JIT to associate an op_array with its jit_extension).

The names "old" and "new" aren't good for that; often, there's nothing
"old" or "new" about them.  It's actually a generic lookup table, and
"old" shall be named "key" (which it is called internally already),
and "new" is renamed to simply "value".
2023-01-12 15:14:05 +00:00
Max Kellermann
b47bfd698d ext/opcache: C++ compatibility
Just in case somebody includes those headers from C++ code.  The same
already exists in other opcache headers.
2023-01-12 15:14:05 +00:00
Max Kellermann
623e2e9fc6 ext/opcache/zend_accelerator_hash: include cleanup 2023-01-12 15:12:45 +00:00
Max Kellermann
cd985de190 ext/standard/md5: include cleanup 2023-01-12 15:12:45 +00:00
Alex Dowad
4427b2e1ab Mark UTF-8 strings emitted by mbstring functions as valid UTF-8
We now have a couple of mbstring functions which have fast paths for
strings marked as 'valid UTF-8'. Later, we may likely have more. So
that these fast paths can be used more frequently, mark UTF-8 strings
emitted by mbstring as 'valid UTF-8'. This is always a correct thing
to do, because mbstring never returns invalid UTF-8 as the result of
a conversion (or similar) operation.

Internally, we do have a conversion mode which deliberately emits
invalid UTF-8 in some cases. (This is done to prevent unwanted matches
when we are converting strings to UTF-8 before performing matching
operations on them.) For such strings, don't set the 'valid UTF-8' flag.
It probably wouldn't hurt anything to set it, because strings generated
using that special conversion mode should *never* be returned to
userland, and I don't think we do anything with them which cares about
the IS_STR_VALID_UTF8 flag... but still, it would likely cause
confusion for developers.
2023-01-11 17:08:27 +02:00
Tim Düsterhus
e7c0f4e816 random: Rely on free(NULL) being safe for random status freeing (#10246)
* random: Rely on `free(NULL)` being safe for random status freeing

* random: Restructure `php_random_status_free()` to not early-return
2023-01-10 18:46:57 +01:00
Derick Rethans
cc4e958932 Merge branch 'PHP-8.2' 2023-01-10 15:16:42 +00:00
Derick Rethans
f340854a30 Merge branch 'PHP-8.1' into PHP-8.2 2023-01-10 15:16:32 +00:00