archived-php-src

mirror of https://github.com/php/php-src.git synced 2026-04-21 23:18:13 +02:00

Author	SHA1	Message	Date
Max Kellermann	4831e48708	Zend/zend_system_id: include cleanup	2023-01-12 15:12:45 +00:00
Max Kellermann	cd985de190	ext/standard/md5: include cleanup	2023-01-12 15:12:45 +00:00
Max Kellermann	9521d21681	main/php_globals.h: add missing include for PHPAPI	2023-01-12 15:12:45 +00:00
Max Kellermann	d6136151e9	Zend/zend_build.h: include php_config.h Without this, the macros ZTS, ZEND_DEBUG and PHP_COMPILER_ID may be unavailable.	2023-01-12 15:12:45 +00:00
Jakub Zelenka	da4775f071	Merge branch 'PHP-8.2'	2023-01-12 13:55:47 +00:00
Jakub Zelenka	1b48a5c802	Fix ASAN reported leak in FPM config test This happens because config test does not shutdown SAPI. In addition this commit also fixes few failures when running FPM tests under root. Closes GH-10296	2023-01-12 13:52:33 +00:00
Alex Dowad	4427b2e1ab	Mark UTF-8 strings emitted by mbstring functions as valid UTF-8 We now have a couple of mbstring functions which have fast paths for strings marked as 'valid UTF-8'. Later, we may likely have more. So that these fast paths can be used more frequently, mark UTF-8 strings emitted by mbstring as 'valid UTF-8'. This is always a correct thing to do, because mbstring never returns invalid UTF-8 as the result of a conversion (or similar) operation. Internally, we do have a conversion mode which deliberately emits invalid UTF-8 in some cases. (This is done to prevent unwanted matches when we are converting strings to UTF-8 before performing matching operations on them.) For such strings, don't set the 'valid UTF-8' flag. It probably wouldn't hurt anything to set it, because strings generated using that special conversion mode should never be returned to userland, and I don't think we do anything with them which cares about the IS_STR_VALID_UTF8 flag... but still, it would likely cause confusion for developers.	2023-01-11 17:08:27 +02:00
Tim Düsterhus	e7c0f4e816	random: Rely on `free(NULL)` being safe for random status freeing (#10246) * random: Rely on `free(NULL)` being safe for random status freeing * random: Restructure `php_random_status_free()` to not early-return	2023-01-10 18:46:57 +01:00
George Peter Banyard	d7f624258d	Merge branch 'PHP-8.2' * PHP-8.2: fix: indirect_return compilation warning	2023-01-10 15:23:44 +00:00
George Peter Banyard	c936c02119	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: fix: indirect_return compilation warning	2023-01-10 15:23:35 +00:00
Kévin Dunglas	55514a1119	fix: indirect_return compilation warning Closes GH-10274 Signed-off-by: George Peter Banyard <girgias@php.net>	2023-01-10 15:23:15 +00:00
Derick Rethans	cc4e958932	Merge branch 'PHP-8.2'	2023-01-10 15:16:42 +00:00
Derick Rethans	f340854a30	Merge branch 'PHP-8.1' into PHP-8.2	2023-01-10 15:16:32 +00:00
Derick Rethans	d12ba111e0	Fixed GH-10218: DateTimeZone fails to parse time zones that contain the "+" character	2023-01-10 15:15:49 +00:00
David Carlier	61cf7d49ab	posix_pathconf throwing ValueError on empty path	2023-01-10 15:03:11 +00:00
Max Kellermann	ecc880f491	Zend/zend_execute: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	588a07f737	Zend/zend_multibyte: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	f377e15751	Zend/zend_ptr_stack: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	b4ba16fe18	Zend/zend_object_handlers: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	694ec1deea	Zend/zend_{operators,variables}: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	6b34de8eba	sapi/*: add missing includes	2023-01-10 14:19:03 +00:00
Max Kellermann	aa1cd02a43	Zend/zend_fibers: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	308fd311ea	ext/{standard,json,random,...}: add missing includes	2023-01-10 14:19:03 +00:00
Max Kellermann	16203b53e1	main: add missing includes	2023-01-10 14:19:03 +00:00
Max Kellermann	738fb5ca54	Zend/zend_smart_str: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	9fdbefacd3	main/s[np]printf: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	cd4a7c1d90	Zend/zend_ini: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	928685eba2	Zend/zend_signal: include cleanup	2023-01-10 14:19:03 +00:00
Max Kellermann	01e5ffc85c	UPGRADING.INTERNALS: mention the header cleanups	2023-01-10 14:19:03 +00:00
Tim Düsterhus	13b82eef84	random: Randomizer::getFloat(): Fix check for empty open intervals (#10185) * random: Randomizer::getFloat(): Fix check for empty open intervals The check for invalid parameters for the IntervalBoundary::OpenOpen variant was not correct: If two consecutive doubles are passed as parameters, the resulting interval is empty, resulting in an uint64 underflow in the γ-section implementation. Instead of checking whether `$min < $max`, we must check that there is at least one more double between `$min` and `$max`, i.e. it must hold that: nextafter($min, $max) != $max Instead of duplicating the comparatively complicated and expensive `nextafter` logic for a rare error case we instead return `NAN` from the γ-section implementation when the parameters result in an empty interval and thus underflow. This allows us to reliably detect this specific error case after the fact, but without modifying the engine state. It also provides reliable error reporting for other internal functions that might use the γ-section implementation. * random: γ-section: Also check that min is smaller than max This extends the empty-interval check in the γ-section implementation with a check that min is actually the smaller of the two parameters. * random: Use PHP_FLOAT_EPSILON in getFloat_error.phpt Co-authored-by: Christoph M. Becker <cmbecker69@gmx.de>	2023-01-10 10:16:33 +01:00
Christoph M. Becker	4280431050	Merge branch 'PHP-8.2' * PHP-8.2: Adapt ext/intl tests for ICU 72.1	2023-01-09 14:10:42 +01:00
Christoph M. Becker	435dc5ef1c	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Adapt ext/intl tests for ICU 72.1	2023-01-09 14:09:43 +01:00
Christoph M. Becker	a9e7b90cc2	Adapt ext/intl tests for ICU 72.1 This version replaces SPACEs before the meridian with NARROW NO-BREAK SPACEs. Thus, we split the affected test cases as usual. (cherry picked from commit `8dd51b462d`) Fixes GH-10262.	2023-01-09 14:08:40 +01:00
Dmitry Stogov	ce861373b9	Merge branch 'PHP-8.2' * PHP-8.2: Fix incorrect optimization of ASSIGN_OP may lead to incorrect result (sub assign -> pre dec conversion for null values)	2023-01-09 13:53:35 +03:00
Dmitry Stogov	9abc2108fa	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Fix incorrect optimization of ASSIGN_OP may lead to incorrect result (sub assign -> pre dec conversion for null values)	2023-01-09 13:53:19 +03:00
Dmitry Stogov	4d4a53beee	Fix incorrect optimization of ASSIGN_OP may lead to incorrect result (sub assign -> pre dec conversion for null values)	2023-01-09 13:51:57 +03:00
Dmitry Stogov	f8b9312709	Merge branch 'PHP-8.2' * PHP-8.2: ext/opcache/jit/zend_jit_trace: fix memory leak in _compile_root_trace() (#10146)	2023-01-09 09:51:12 +03:00
Dmitry Stogov	d13b3b6aa7	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: ext/opcache/jit/zend_jit_trace: fix memory leak in _compile_root_trace() (#10146)	2023-01-09 09:51:00 +03:00
Max Kellermann	bcc5d268f6	ext/opcache/jit/zend_jit_trace: fix memory leak in _compile_root_trace() (#10146) A copy of this piece of code exists in zend_jit_compile_side_trace(), but there, the leak bug does not exist. This bug exists since both copies of this piece of code were added in commit `4bf2d09ede`	2023-01-09 09:50:30 +03:00
Alex Dowad	b4cbaabd9b	Add fast SSE2-based implementation of mb_strlen for known-valid UTF-8 strings One small piece of this was obtained from Stack Overflow. According to Stack Overflow's Terms of Service, all user-contributed code on SO is provided under a Creative Commons license. I believe this license is compatible with the code being included in PHP. Benchmarking results (UTF-8 only, for strings which have already been checked using mb_check_encoding): For very short (0-5 byte) strings, mb_strlen is 12% faster. The speedup gets greater and greater on longer input strings; for strings around 100KB, mb_strlen is 23 times faster. Currently the 'fast' code is gated behind a GC flag check which ensures it is only used on strings which have already been checked for UTF-8 validity. This is because the accelerated code will return different results on some invalid UTF-8 strings.	2023-01-09 07:50:40 +02:00
Christoph M. Becker	60102c3228	Merge branch 'PHP-8.2' * PHP-8.2: Fix recently introduced gh10251.phpt	2023-01-08 18:28:34 +01:00
Christoph M. Becker	6faeb9571d	Fix recently introduced gh10251.phpt As of PHP 8.2.0, creation of dynamic properties is deprecated, so we slap a `AllowDynamicProperties` attribute on the class.	2023-01-08 18:07:21 +01:00
George Peter Banyard	3b8327a4e3	Merge branch 'PHP-8.2' * PHP-8.2: Fix GH-10251: Assertion `(flag & (1<<3)) == 0' failed. Fix GH-9710: phpdbg memory leaks by option "-h"	2023-01-08 16:12:21 +00:00
George Peter Banyard	e308dc0635	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Fix GH-10251: Assertion `(flag & (1<<3)) == 0' failed. Fix GH-9710: phpdbg memory leaks by option "-h"	2023-01-08 16:11:46 +00:00
Niels Dossche	d03025bf59	Fix GH-10251: Assertion `(flag & (1<<3)) == 0' failed. zend_get_property_guard previously assumed that at least "str" has a pre-computed hash. This is not always the case, for example when a string is created by bitwise operations, its hash is not set. Instead of forcing a computation of the hashes, drop the hash comparison. Closes GH-10254 Co-authored-by: Changochen <changochen1@gmail.com> Signed-off-by: George Peter Banyard <girgias@php.net>	2023-01-08 16:09:59 +00:00
Niels Dossche	8ff2b6abb2	Fix GH-9710: phpdbg memory leaks by option "-h" Closes GH-10237 Signed-off-by: George Peter Banyard <girgias@php.net>	2023-01-08 16:07:00 +00:00
Alex Dowad	092ad3e462	Optimize branch structure of UTF-8 decoder routine I like the asm which gcc -O3 generates on this modified code... and guess what: my CPU likes it too! (The asm is noticeably tighter, without any extra operations in the path which dispatches to the code for decoding a 1-byte, 2-byte, 3-byte, or 4-byte character. It's just CMP, conditional jump, CMP, conditional jump, CMP, conditional jump. ...Though I was admittedly impressed to see gcc could implement the boolean expression `c >= 0xC2 && c <= 0xDF` with just 3 instructions: add, CMP, then conditional jump. Pretty slick stuff there, guys.) Benchmark results: UTF-8, short - to UTF-16LE faster by 7.36% (0.0001 vs 0.0002) UTF-8, short - to UTF-16BE faster by 6.24% (0.0001 vs 0.0002) UTF-8, medium - to UTF-16BE faster by 4.56% (0.0003 vs 0.0003) UTF-8, medium - to UTF-16LE faster by 4.00% (0.0003 vs 0.0003) UTF-8, long - to UTF-16BE faster by 1.02% (0.0215 vs 0.0217) UTF-8, long - to UTF-16LE faster by 1.01% (0.0209 vs 0.0211)	2023-01-08 17:27:19 +02:00
Alex Dowad	d8b5b9fa55	Add unit tests for mb_str_split/mb_substr on MacJapanese encoding MacJapanese has a somewhat unusual feature that when mapped to Unicode, many characters map to sequences of several codepoints. Add test cases demonstrating how mb_str_split and mb_substr behave in this situation. When adding these tests, I found the behavior of mb_substr was wrong due to an inconsistency between the string "length" as measured by mb_strlen and the number of native MacJapanese characters which mb_substr would count when iterating over the string using the mblen_table. This has been fixed. I believe that mb_strstr will also return wrong results in some cases for MacJapanese. I still need to come up with unit tests which demonstrate the problem and figure out how to fix it.	2023-01-08 17:23:47 +02:00
Alex Dowad	cca4ca6d3d	Remove 'fast path' using mblen_table from mb_get_strlen (it's actually a slow path) Various mbstring legacy text encodings have what is called an 'mblen_table'; a table which gives the length of a multi-byte character using a lookup on the first byte value. Several mbstring functions have a 'fast path' which uses this table when it is available. However, it turns out that iterating through a string using the mblen_table is surprisingly slow. I found that by deleting this 'fast path' from mb_strlen, while mb_strlen becomes a few percent slower on very small strings (0-5 bytes), very large performance gains can be achieved on medium to long input strings. Part of the reason for this is because our text decoding filters are so much faster now. Here are some benchmarks: EUC-KR, short (0-5 chars) - master faster by 11.90% (0.0000 vs 0.0000) EUC-JP, short (0-5 chars) - master faster by 10.88% (0.0000 vs 0.0000) BIG-5, short (0-5 chars) - master faster by 10.66% (0.0000 vs 0.0000) UTF-8, short (0-5 chars) - master faster by 8.91% (0.0000 vs 0.0000) CP936, short (0-5 chars) - master faster by 6.27% (0.0000 vs 0.0000) UHC, short (0-5 chars) - master faster by 5.38% (0.0000 vs 0.0000) SJIS, short (0-5 chars) - master faster by 5.20% (0.0000 vs 0.0000) UTF-8, medium (~100 chars) - new faster by 127.51% (0.0004 vs 0.0002) UTF-8, long (~10000 chars) - new faster by 87.94% (0.0319 vs 0.0170) UTF-8, very long (~100000 chars) - new faster by 88.25% (0.3199 vs 0.1699) SJIS, medium (~100 chars) - new faster by 208.89% (0.0004 vs 0.0001) SJIS, long (~10000 chars) - new faster by 253.57% (0.0319 vs 0.0090) CP936, medium (~100 chars) - new faster by 126.08% (0.0004 vs 0.0002) CP936, long (~10000 chars) - new faster by 200.48% (0.0319 vs 0.0106) EUC-KR, medium (~100 chars) - new faster by 146.71% (0.0004 vs 0.0002) EUC-KR, long (~10000 chars) - new faster by 212.05% (0.0319 vs 0.0102) EUC-JP, medium (~100 chars) - new faster by 186.68% (0.0004 vs 0.0001) EUC-JP, long (~10000 chars) - new faster by 295.37% (0.0320 vs 0.0081) BIG-5, medium (~100 chars) - new faster by 173.07% (0.0004 vs 0.0001) BIG-5, long (~10000 chars) - new faster by 269.19% (0.0319 vs 0.0086) UHC, medium (~100 chars) - new faster by 196.99% (0.0004 vs 0.0001) UHC, long (~10000 chars) - new faster by 256.39% (0.0323 vs 0.0091) This does raise the question: is using the 'mblen_table' worthwhile for other mbstring functions, such as mb_str_split? The answer is yes, it is worthwhile; you see, while mb_strlen only needs to decode the input string but not re-encode it, when mb_str_split is implemented using the conversion filters, it needs to both decode the string and then re-encode it. This means that there is more potential to gain performance by using the 'mblen_table'. Benchmarking shows that in a few cases, mb_str_split becomes faster when the 'mblen_table fast path' is deleted, but in the majority of cases, it becomes slower.	2023-01-08 17:23:47 +02:00
Niels	58d741c042	Remove unnecessary NULL-checks on ctx (#10256) ctx can never be zero in these functions because they are dispatched virtually by looking up their entries in ctx. Furthermore, 2 of these checks never actually worked because ctx was dereferenced before ctx was NULL-checked.	2023-01-08 12:09:20 +01:00

130854 Commits