archived-php-src

mirror of https://github.com/php/php-src.git synced 2026-04-13 19:14:16 +02:00

Author	SHA1	Message	Date
Alex Dowad	ef114f94b9	Simplify code for conversion of UHC to Unicode I was hoping to get some performance gains here, but the performance is just the same as before, +/- a fraction of a percent.	2023-01-04 18:18:22 +02:00
Max Kellermann	efd5ecb0f2	Zend/Optimizer/zend_inference: make several pointers const This allows removing several deconst casts from the JIT.	2023-01-04 12:59:16 +00:00
Alex Dowad	3b5072f6f6	Use smart_str in mb_http_input rather than mbfl_memory_device For many years, the code has contained a TODO comment indicating that the original author had wanted to do this. Using smart_str makes the code shorter and cleaner, and it is another step towards removing a bunch of legacy mbstring code which will soon be unneeded.	2023-01-03 09:10:13 +02:00
Alex Dowad	0e7160b836	Implement mb_detect_encoding using fast text conversion filters

Regarding the optional 3rd `strict` argument to mb_detect_encoding, the documentation states:

    Controls the behaviour when string is not valid in any of the listed encodings. If strict is set to false, the closest matching encoding will be returned; if strict is set to true, false will be returned.

(Ref: https://www.php.net/manual/en/function.mb-detect-encoding.php)

Because of bugs in the implementation, mb_detect_encoding did not always behave according to this description when `strict` was false. For example:

    <?php
    echo var_export(mb_detect_encoding("\xc0\x00", "UTF-8", false));
    // Before this commit, prints: false
    // After this commit, prints: 'UTF-8'

Because `strict` is false in the above example, mb_detect_encoding should return the 'closest matching encoding', which is UTF-8, since that is the only candidate encoding. (Incidentally, this example shows that using mb_detect_encoding with a single candidate encoding in non-strict mode is useless.)

The new implementation fixes this bug. It also fixes another problem with the old implementation as regards non-strict detection mode:

The old implementation would stop processing of the input string using a particular candidate encoding as soon as it saw an error in that encoding, even in non-strict mode. This means that it could not really detect the 'closest matching encoding'; rather, what it would return in non-strict mode was 'the encoding in which the first decoding error is furthest from the beginning of the input string'.

In non-strict mode, the new implementation continues trying to process the input string to its end even after seeing an error. This makes it possible to determine in which candidate encoding the string has the smallest number of errors, i.e. the 'closest matching encoding'.

Rejecting candidate encodings as soon as it saw an error gave the old implementation a marked performance advantage in non-strict mode; however, the new implementation still beats it in most cases. Here are a few sample microbenchmark results:

    UTF-8, ~100 codepoints, strict mode
      Old: 0.080s (100,000 calls)
      New: 0.026s (100,000 calls)

    UTF-8, ~100 codepoints, non-strict mode
      Old: 0.079s (100,000 calls)
      New: 0.033s (100,000 calls)

    UTF-8, ~10000 codepoints, strict mode
      Old: 6.708s (60,000 calls)
      New: 1.383s (60,000 calls)

    UTF-8, ~10000 codepoints, non-strict mode
      Old: 6.705s (60,000 calls)
      New: 3.044s (60,000 calls)

Notice that the old implementation had almost identical performance between strict and non-strict mode, while the new suffers a significant performance penalty for non-strict detection. This is the cost of implementing the behavior specified in the documentation.

A couple more sample results:

    SJIS, ~10000 codepoints, strict mode
      Old: 4.563s
      New: 1.084s

    SJIS, ~10000 codepoints, non-strict mode
      Old: 4.569s
      New: 2.863s

This is the only case I found where the new implementation loses:

    UTF-16LE, ~10000 codepoints, non-strict mode
      Old: 1.514s
      New: 2.813s

The reason is because the test strings happened to be invalid right from the first few bytes for all the candidate encodings except for UTF-16LE; so the old implementation would immediately reject all those encodings and only process the entire string in UTF-16LE.

I believe mb_detect_encoding could be made much faster if we identified good criteria for when to reject candidate encodings before reaching the end of the input string.	2023-01-03 09:10:10 +02:00
Alex Dowad	953864661a	Implement php_mb_zend_encoding_converter using fast text conversion filters	2023-01-03 09:02:21 +02:00
Alex Dowad	88c99afdac	Implement mb_str_split using fast text conversion filters There is no great difference between the old and new code for text encodings which either 1) use a fixed number of bytes per codepoint or 2) for which we have an 'mblen' table which enables us to find the length of a multi-byte character using a table lookup indexed by the first byte value. The big difference is for other text encodings, where we have to actually decode the string to split it. For such text encodings, such as ISO-2022-JP and UTF-16, I measured a speedup of 50%-120% over the previous implementation.	2023-01-03 09:02:21 +02:00
Alex Dowad	a9a672048b	Implement mb_output_handler using fast text conversion filters	2023-01-03 09:02:21 +02:00
David Carlier	2a8cecdc3d	Merge branch 'PHP-8.2'	2023-01-02 16:55:54 +00:00
David Carlier	acb1af802d	Merge branch 'PHP-8.1' into PHP-8.2	2023-01-02 16:55:03 +00:00
Niels Dossche	d5f0362e59	Fix GH-10202: posix_getgr(gid|nam)_basic.phpt fail The issue was that passwd was empty for the issue reporter, but the test expected passwd to be non-empty. An empty passwd can occur if there is no (encrypted) group password set up.	2023-01-02 16:54:47 +00:00
Max Kellermann	10d43c40dd	ext/opcache/zend_shared_alloc: change "locked" check to assertion Calling zend_shared_alloc() without holding the lock is always a bug, not a fatal runtime error.	2023-01-02 15:49:04 +00:00
Max Kellermann	e1a25ff2ed	ext/opcache/zend_shared_alloc: add assertions on "locked" flag Let the PHP process crash if a bug causes incorrect locking calls.	2023-01-02 15:49:04 +00:00
Christoph M. Becker	0aa1fdf28d	Fix variation5-win32(-mb).phpt wrt. parallel test execution Each test should use its own temporary filenames to avoid issues when the tests are executed in parallel[1]. We also silence the `unlink()` calls in the CLEAN section just in case. And while we're at it, we also remove the erroneous comment; there is no symlinking involved for the Windows test variants. [1] <https://github.com/php/php-src/pull/10175#issuecomment-1366809933> Closes GH-10189.	2022-12-30 17:47:58 +01:00
George Peter Banyard	11f6022365	Merge branch 'PHP-8.2' * PHP-8.2: Fix GH-10187: Segfault in stripslashes() with arm64 Fix memory leak in posix_ttyname()	2022-12-30 16:43:05 +00:00
George Peter Banyard	e6c9b176d4	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Fix GH-10187: Segfault in stripslashes() with arm64 Fix memory leak in posix_ttyname()	2022-12-30 16:42:45 +00:00
Niels Dossche	4c9375e504	Fix GH-10187: Segfault in stripslashes() with arm64 Closes GH-10188 Co-authored-by: todeveni <toni.viemero@iki.fi> Signed-off-by: George Peter Banyard <girgias@php.net>	2022-12-30 16:40:56 +00:00
George Peter Banyard	c2b0be5570	Fix memory leak in posix_ttyname() Closes GH-10190	2022-12-30 16:24:28 +00:00
Alex Dowad	f40c3fca88	Improve mb_detect_encoding's recognition of Turkish text Add 4 codepoints commonly used to write Turkish text to our table of 'commonly used' Unicode codepoints. These are: • U+011F LATIN SMALL LETTER G WITH BREVE • U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE • U+0131 LATIN SMALL LETTER DOTLESS I • U+015F LATIN SMALL LETTER S WITH CEDILLA	2022-12-30 14:22:46 +02:00
Tim Düsterhus	3e48e52d93	Register parameter attributes via stub in ext/zend_test (#10183)	2022-12-29 23:17:02 +01:00
Alex Dowad	8b37c4ea5e	Merge branch 'PHP-8.2' * PHP-8.2: Allow 'h' and 'k' flags to be combined for mb_convert_kana	2022-12-29 20:39:22 +02:00
Alex Dowad	f7a19181d7	Allow 'h' and 'k' flags to be combined for mb_convert_kana The 'h' flag makes mb_convert_kana convert zenkaku hiragana to hankaku katakana; 'k' makes it convert zenkaku katakana to hankaku katakana. When working on the implementation of mb_convert_kana, I added some additional checks to catch combinations of flags which do not make sense; but there is no conflict between 'h' and 'k' (they control conversions for two disjoint ranges of codepoints) and this combination should not have been restricted. Thanks to the GitHub user 'akira345' for reporting this problem. Closes GH-10174.	2022-12-29 20:38:01 +02:00
David Carlier	383053c4aa	Merge branch 'PHP-8.2'	2022-12-29 12:22:21 +00:00
David Carlier	07bf42df41	Merge branch 'PHP-8.1' into PHP-8.2	2022-12-29 12:21:13 +00:00
Max Kellermann	e217138b40	ext/opcache/jit/zend_jit_trace: add missing lock for EXIT_INVALIDATE Commit `6c25413183` added the flag ZEND_JIT_EXIT_INVALIDATE which resets the trace handlers in zend_jit_trace_exit(), but forgot to lock the shared memory section. This could cause another worker process who still saw the ZEND_JIT_TRACE_JITED flag to schedule ZEND_JIT_TRACE_STOP_LINK, but when it arrived at the ZEND_JIT_DEBUG_TRACE_STOP, the handler was already reverted by the first worker process and thus zend_jit_find_trace() fails. This in turn generated a bogus jump offset in the JITed code, crashing the PHP process.	2022-12-29 12:20:56 +00:00
Dmitry Stogov	ca5f668f7c	Added missed return	2022-12-29 12:40:46 +03:00
David Carlier	f7a28c4145	Merge branch 'PHP-8.2'	2022-12-26 21:19:23 +00:00
David Carlier	381d0ddc20	Merge branch 'PHP-8.1' into PHP-8.2	2022-12-26 21:18:31 +00:00
Max Kellermann	b26b758952	ext/opcache/jit: handle zend_jit_find_trace() failures Commit `6c25413` added the flag ZEND_JIT_EXIT_INVALIDATE which resets the trace handlers in zend_jit_trace_exit(), but forgot to consider that on ZEND_JIT_TRACE_STOP_LINK, this changed handler gets passed to zend_jit_find_trace(), causing it to fail, either by returning 0 (results in bogus data) or by aborting due to ZEND_UNREACHABLE(). In either case, this crashes the PHP process. I'm not quite sure how to fix this multi-threading problem properly; my suggestion is to just fail the zend_jit_trace() call. After all, the whole ZEND_JIT_EXIT_INVALIDATE fix was about reloading modified scripts, so there's probably no point in this pending zend_jit_trace() call.	2022-12-26 21:17:19 +00:00
Dmitry Stogov	f922597b51	Merge branch 'PHP-8.2' * PHP-8.2: Fix memory leak because of incorrect optimization	2022-12-26 13:22:02 +03:00
Dmitry Stogov	0464524292	Fix memory leak because of incorrect optimization Fixes oss-fuzz #54488	2022-12-26 13:20:55 +03:00
George Peter Banyard	59f0fe5f16	Merge branch 'PHP-8.2'	2022-12-23 16:29:39 +00:00
Niels Dossche	a24659e70c	Update test for changed behaviour of GMP constructor Closed GH-10160 Signed-off-by: George Peter Banyard <girgias@php.net>	2022-12-23 16:29:14 +00:00
Ilija Tovilo	292f69b345	Merge branch 'PHP-8.2' * PHP-8.2: Add a regression test for auto_globals_jit=0 with preloading on	2022-12-22 17:42:37 +01:00
Ilija Tovilo	db48f49888	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Add a regression test for auto_globals_jit=0 with preloading on	2022-12-22 17:42:27 +01:00
Niels Dossche	bbad29b9c1	Add a regression test for auto_globals_jit=0 with preloading on	2022-12-22 17:42:11 +01:00
David Carlier	9c2572565a	sockets: add TCP_QUICKACK constant. Gives tighter control over ACK delays; the difference is that the setting is `volatile`, as the kernel can turn it off if it is not explicitly set otherwise on the socket. Closes GH-10145.	2022-12-22 14:50:33 +00:00
Ilija Tovilo	08fb7f93a1	Merge branch 'PHP-8.2' * PHP-8.2: Initialize ping_auto_globals_mask to prevent undefined behaviour	2022-12-22 15:00:14 +01:00
Ilija Tovilo	c714e626c8	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Initialize ping_auto_globals_mask to prevent undefined behaviour	2022-12-22 15:00:00 +01:00
Niels Dossche	c4487b7a12	Initialize ping_auto_globals_mask to prevent undefined behaviour Closes GH-10121	2022-12-22 14:59:24 +01:00
Niels	7b2c3c11b2	Cleanup redundant lookups in phar_object.c (#10150)	2022-12-22 13:00:28 +00:00
Arnaud Le Blanc	c46a0ce198	Merge branch 'PHP-8.2' * PHP-8.2: [ci skip] NEWS [ci skip] NEWS ext/opcache/jit/zend_jit: fix inverted bailout value in zend_runtime_jit() (#10144)	2022-12-21 14:56:26 +01:00
Arnaud Le Blanc	f1c345394b	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: [ci skip] NEWS ext/opcache/jit/zend_jit: fix inverted bailout value in zend_runtime_jit() (#10144)	2022-12-21 14:55:36 +01:00
Max Kellermann	d3a6eedf4a	ext/opcache/jit/zend_jit: fix inverted bailout value in zend_runtime_jit() (#10144) In the "catch" block, do_bailout must be set to true, not false, or else zend_bailout() never gets called.	2022-12-21 14:53:21 +01:00
Derick Rethans	0ec8733bf4	Merge branch 'PHP-8.2'	2022-12-20 16:07:02 +00:00
Derick Rethans	6b212b6dee	Merge branch 'PHP-8.1' into PHP-8.2	2022-12-20 16:06:55 +00:00
Derick Rethans	d19a70c9a0	Fix GH-9891: DateTime modify with unixtimestamp (@) must work like setTimestamp	2022-12-20 14:41:13 +00:00
Christoph M. Becker	a23e837f16	Merge branch 'PHP-8.2' * PHP-8.2: Force extension loading for new test	2022-12-19 16:17:02 +01:00
Christoph M. Becker	1abc1645dd	Merge branch 'PHP-8.1' into PHP-8.2 * PHP-8.1: Force extension loading for new test	2022-12-19 16:15:24 +01:00
Christoph M. Becker	da5cbca23e	Force extension loading for new test	2022-12-19 16:14:00 +01:00
Christoph M. Becker	0cbc49b3c2	Merge branch 'PHP-8.2' * PHP-8.2: Skip newly added test on 32bit platforms	2022-12-19 16:08:57 +01:00
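
The strict versus non-strict behaviour described in commit 0e7160b836 can be illustrated with a small C sketch. This is not the mbstring implementation: `error_counter`, `ascii_errors`, `utf8_errors` and `detect` are hypothetical names, the UTF-8 check is deliberately simplified, and the real detector also weighs other heuristics (such as the table of commonly used codepoints mentioned in commit f40c3fca88). It only shows the idea of scanning the whole input for every candidate and picking the one with the fewest errors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface: count decoding errors for one candidate encoding. */
typedef size_t (*error_counter)(const unsigned char *s, size_t len);

/* ASCII: every byte >= 0x80 is an error. */
static size_t ascii_errors(const unsigned char *s, size_t len)
{
    size_t errors = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 0x80) errors++;
    }
    return errors;
}

/* Simplified UTF-8 check. Rejects bytes that can never be a lead byte
 * (0xC0, 0xC1, 0xF5..0xFF), stray continuation bytes, and truncated or
 * broken sequences. It does NOT check overlong 3/4-byte forms,
 * surrogates, or values above U+10FFFF. */
static size_t utf8_errors(const unsigned char *s, size_t len)
{
    size_t errors = 0, i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t need;
        if (c < 0x80) need = 0;
        else if (c >= 0xC2 && c <= 0xDF) need = 1;
        else if (c >= 0xE0 && c <= 0xEF) need = 2;
        else if (c >= 0xF0 && c <= 0xF4) need = 3;
        else { errors++; i++; continue; }   /* invalid lead byte */

        size_t j = 1;
        while (j <= need && i + j < len && (s[i + j] & 0xC0) == 0x80) j++;
        if (j <= need) errors++;            /* sequence ended too early */
        i += j;                             /* keep scanning after the error */
    }
    return errors;
}

/* Returns the index of the chosen candidate, or -1 if none qualifies.
 * strict != 0: only candidates with zero errors qualify.
 * strict == 0: the candidate with the fewest errors wins
 *              (the 'closest matching encoding'); ties go to the earlier one. */
static int detect(const error_counter *cands, size_t n,
                  const unsigned char *s, size_t len, int strict)
{
    assert(n > 0);
    int best = -1;
    size_t best_errors = 0;
    for (size_t k = 0; k < n; k++) {
        size_t e = cands[k](s, len);
        if (strict && e > 0) continue;
        if (best < 0 || e < best_errors) {
            best = (int)k;
            best_errors = e;
        }
    }
    return best;
}
```

With a single UTF-8 candidate and the two input bytes 0xC0 0x00, `detect` yields -1 in strict mode and 0 (the only candidate) in non-strict mode, which parallels the `mb_detect_encoding("\xc0\x00", "UTF-8", false)` example from the commit message. Because every candidate is scanned to the end, non-strict mode does strictly more work than an early-exit approach, which is the trade-off the benchmark figures in that commit reflect.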


63310 Commits