mirror of https://github.com/php/php-src.git synced 2026-04-03 22:22:18 +02:00
Commit Graph

129026 Commits

Author SHA1 Message Date
Derick Rethans
0dbedb3dbd Fixed memory leaks with DatePeriod::__unserialize 2022-07-18 23:58:19 +01:00
Eric Norris
09237f6126 Update request startup error messages 2022-07-18 23:19:59 +01:00
Ilija Tovilo
7aadbcb8f4 GH-8344 Fetch properties of enums in const expressions 2022-07-18 23:52:28 +02:00
Jakub Zelenka
922371f3b1 Do not send X-Powered-By if headers sent (#9039)
Co-authored-by: Eric Norris <erictnorris@gmail.com>
2022-07-18 18:01:05 +01:00
root
d8fc05c05e Add FILTER_FLAG_GLOBAL_RANGE to filter Global IPs as per RFC 6890 2022-07-18 17:56:05 +01:00
Mikhail Galanin
ffdf25a270 Add "error_log_mode" setting 2022-07-18 15:41:28 +01:00
Derick Rethans
7db9c2a2c3 Fixed typo in configure message 2022-07-18 15:18:08 +01:00
Felix Wiedemann
db5f6713ee FPM: Downgrade occasional "failed to acquire scoreboard" warning
With request timeouts configured, php-fpm occasionally prints the
following warning:

   WARNING: failed to acquire scoreboard

This happens when php-fpm checks the child scoreboards for timeouts,
but fails to acquire a lock immediately.  As this can (and does) occur
during normal operation, this commit downgrades this to a notice.
2022-07-18 14:40:39 +01:00
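The pattern this commit message describes, trying a lock without blocking and treating a miss as routine, can be sketched roughly as follows. This is a Python illustration only, not php-fpm's C code: `check_timeouts` and the logger name are hypothetical, and because Python's `logging` has no "notice" level, `info` stands in for it.

```python
import logging
import threading

log = logging.getLogger("fpm_sketch")  # hypothetical logger name


def check_timeouts(scoreboard_lock: threading.Lock) -> bool:
    """Try to inspect a child's scoreboard without waiting for its lock."""
    if not scoreboard_lock.acquire(blocking=False):
        # Contention is expected during normal operation, so this is
        # logged below warning level ("info" stands in for "notice").
        log.info("failed to acquire scoreboard")
        return False
    try:
        # ... the per-child timeout check would go here ...
        return True
    finally:
        scoreboard_lock.release()
```

A caller that gets `False` back simply skips that child for this round and tries again on the next periodic check.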
David Carlier
079221b30e Merge branch 'PHP-8.1' 2022-07-18 14:40:07 +01:00
David CARLIER
9a8ae45c4b Revert "FPM: Downgrade occasional "failed to acquire scoreboard" warning"
This reverts commit 3040f75f43.
2022-07-18 14:21:54 +01:00
Alex Dowad
76a92c26e3 mb_decode_numericentity decodes valid entities which are truncated at end of string
Since mb_decode_numericentity does not require all HTML entities
to end with ';', but allows them to be terminated by ANY non-digit
character, it doesn't make sense that valid entities which butt
up against the end of the input string are not converted.

As it turned out, supporting this case also made it possible
to simplify the code nicely.
2022-07-18 15:11:47 +02:00
Alex Dowad
5d6bd557b3 mb_decode_numericentity converts entities which immediately follow a valid/invalid entity
Thanks to Kamil Tieleka for suggesting that some of the behaviors of
the legacy implementation which the new mb_decode_numericentity
implementation took care to maintain were actually bugs and should
be fixed. Thanks also to Trevor Rowbotham for providing a link to
the HTML specification, showing how HTML numeric entities should
be interpreted.

mb_decode_numericentity now processes numeric entities in the
following situations where the old implementation would not:

- &<ENTITY> (for example, &&#65;)
- &#<ENTITY>
- &#x<ENTITY>
- <VALID BUT UNTERMINATED DECIMAL ENTITY><ENTITY> (for example, &#65&#65;)
- <VALID BUT UNTERMINATED HEX ENTITY><ENTITY>
- <INVALID AND UNTERMINATED DECIMAL ENTITY><ENTITY> (it does not matter why
  the first entity is invalid; the value could be too big, it could have
  too many digits, or it could not match the 'convmap' parameter)
- <INVALID AND UNTERMINATED HEX ENTITY><ENTITY>

This is consistent with the way that web browsers process
HTML entities.
2022-07-18 15:11:32 +02:00
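The situations listed in this commit message can be reproduced with a small decoder. The sketch below is written in Python and is not the mbstring implementation: it ignores the 'convmap' parameter entirely, assumes every value up to U+10FFFF is convertible, and accepts only a lowercase `x` for hex entities. `decode_numeric_entities` is a hypothetical name.

```python
import re

# "&#" + decimal digits, or "&#x" + hex digits; the trailing ";" is optional,
# so an entity may be ended by any other character or by the end of input.
_ENTITY = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));?")

MAX_CODEPOINT = 0x10FFFF  # assumption: no convmap, accept all of Unicode


def decode_numeric_entities(text: str) -> str:
    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        value = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if value > MAX_CODEPOINT:
            return match.group(0)  # invalid entity: pass through unchanged
        return chr(value)

    # Scanning resumes right after each match, so an entity that directly
    # follows a valid or invalid one is still examined.
    return _ENTITY.sub(replace, text)
```

With this sketch, `decode_numeric_entities("&&#65;")` returns `"&A"`, `decode_numeric_entities("&#65&#65;")` returns `"AA"`, and an over-large first entity such as `"&#99999999999&#65;"` comes back as `"&#99999999999A"`.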
Alex Dowad
30bfeef48d mbfl_strwidth does not need to use legacy conversion filters now
...Because we have the new (faster) conversion filters now for
ALL text encodings supported by mbstring.
2022-07-18 15:11:32 +02:00
Alex Dowad
40f5048aa7 Fix new conversion filter for UUEncode
This code (written by yours truly) was very broken on input
strings long enough to require processing in multiple chunks.
Fuzzing revealed this very quickly; after initial rework,
further fuzzing also found a couple of very obscure bugs in
corner cases.
2022-07-18 15:11:32 +02:00
Alex Dowad
5fee30b630 Fix new conversion filter for QPrint (same order of check as legacy code)
Because of checking for maximum line length *before* certain other checks,
the new conversion filter for QPrint could produce different results from
the old one in some cases. This was discovered while fuzzing the new
implementation of mb_decode_numericentity.
2022-07-18 15:11:32 +02:00
Alex Dowad
3cf432798e Fix new conversion filter for CP50220 (multi-codepoint kana at end of buffer)
If two codepoints which needed to be collapsed into a single kuten code
were separated, with one at the end of one buffer and the other at the
beginning of the next buffer, they were not converted correctly.
This was discovered while fuzzing the new implementation of
mb_decode_numericentity.
2022-07-18 15:11:31 +02:00
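The class of bug described here, a pair of codepoints split across two buffers, is usually avoided by holding back the last codepoint of a chunk when it might combine with the next one. The Python sketch below illustrates that idea only. It is not the mbstring filter, the two-entry tables are illustrative, and `convert_chunks` is a hypothetical name.

```python
from typing import Iterable, Iterator

# Illustrative tables: a halfwidth kana followed by a voiced mark collapses
# into one fullwidth kana; on its own it maps to the unvoiced fullwidth kana.
COMBINED = {("\uFF76", "\uFF9E"): "\u30AC"}  # halfwidth KA + mark -> GA
SINGLE = {"\uFF76": "\u30AB"}                # halfwidth KA -> KA
FIRST_HALVES = {first for first, _ in COMBINED}


def convert_chunks(chunks: Iterable[str]) -> Iterator[str]:
    pending = ""  # one held-back character that may combine with the next
    for chunk in chunks:
        out = []
        for ch in chunk:
            if pending:
                pair = COMBINED.get((pending, ch))
                pending_char, pending = pending, ""
                if pair is not None:
                    out.append(pair)
                    continue
                out.append(SINGLE.get(pending_char, pending_char))
            if ch in FIRST_HALVES:
                pending = ch  # hold back, possibly across a chunk boundary
            else:
                out.append(SINGLE.get(ch, ch))
        yield "".join(out)
    if pending:
        yield SINGLE.get(pending, pending)  # flush at end of input
```

Because `pending` survives from one chunk to the next, `"".join(convert_chunks(["a\uFF76", "\uFF9Eb"]))` gives the same result as converting the unsplit string.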
Alex Dowad
7559bf77d2 Fix new conversion filters for mobile SJIS variants ('0' at end of buffer)
Previously, I had adjusted this code so that if a character which could
be part of a special Docomo/Softbank/KDDI 'keypad' emoji appeared at
the end of one buffer, and the 'keypad' character appeared at the
beginning of the next, they would still be combined. However, this
broke the handling of such a character appearing at the end of one
buffer, and a character which is NOT 'keypad' appearing at the
beginning of the next.

This was found while fuzzing the new implementation of
mb_decode_numericentity.
2022-07-18 15:11:31 +02:00
Alex Dowad
fa83a8e15e Fix new conversion filter for HTML entities
While fuzzing the new mb_decode_numericentity implementation, I discovered
that the fast conversion filter for 'HTML-ENTITIES' did not correctly
handle an empty named entity ('&;'), nor did it correctly handle
invalid named entities whose names were a prefix of a valid entity.
Also, it did not correctly handle the case where a named entity is
truncated and another named entity starts abruptly.
2022-07-18 15:11:31 +02:00
Alex Dowad
9c3972fb3d Fix legacy conversion filter for HZ 2022-07-18 15:11:31 +02:00
Alex Dowad
1526bab6d0 Fix legacy conversion filter for GB18030 2022-07-18 15:11:31 +02:00
Alex Dowad
6938e35122 Fix legacy conversion filter for CP50220 2022-07-18 15:11:31 +02:00
Alex Dowad
1662f7f79f Fix legacy conversion filter for UTF-7 2022-07-18 15:11:31 +02:00
Alex Dowad
c8e4f313fa Fix legacy conversion filter for ISO-2022-KR
When I was working on this code before, it really, really
looked like the index into `uhc3_ucs_table` could never
overrun the size of the table. Why did I get this wrong?
Don't know. Anyways, libfuzzer tore away my illusions
and unequivocally demonstrated that the index CAN be
larger than the size of the table.
2022-07-18 15:11:31 +02:00
Alex Dowad
cebb8009c6 Fix legacy conversion filters for... almost all 8-bit text encodings 2022-07-18 15:11:31 +02:00
Alex Dowad
2eff19e38f Fix legacy conversion filter for HTML entities 2022-07-18 15:11:31 +02:00
Alex Dowad
87b71595ba Fix legacy conversion filter for Base64 2022-07-18 15:11:31 +02:00
Alex Dowad
7ece8f18b0 Fix legacy conversion filter for MacJapanese 2022-07-18 15:11:31 +02:00
Alex Dowad
d7bab66135 Fix legacy conversion filter for SJIS-2004 2022-07-18 15:11:31 +02:00
Alex Dowad
31cbb7a3a5 Fix legacy conversion filter for QPrint 2022-07-18 15:11:30 +02:00
Alex Dowad
048f6cbcde Fix legacy conversion filter for JIS 2022-07-18 15:11:30 +02:00
Alex Dowad
91969e908f New implementation of mb_{de,en}code_numericentity
This new implementation uses the new encoding conversion filters.
Aside from fewer LOC and (hopefully) improved readability,
the differences are as follows:

BEHAVIOR CHANGES:

- The old implementation used signed arithmetic when operating
on the 'convmap'. This meant that results could be surprising when
using convmap entries with 1 in the MSB. Further, types like 'int'
were used rather than those with a specific bit width, such as
'int32_t'. This meant that results could also depend on the
platform width of an 'int'.

Now unsigned arithmetic is used, with explicit bit widths.

- Similarly, while converting decimal numeric entities, the
legacy implementation would ensure that the value never overflowed
INT_MAX, and if it did, the entity would be treated as invalid
and passed through unconverted.

However, that again means that results depend on the platform
size of an 'int'. So now, we use a value with explicit bit width
(32 bits) to hold the value of a deconverted decimal entity, and
ensure that the entity value does not overflow that.

Further, because we are using an UNSIGNED 32-bit value rather
than a signed one, the ceiling for how large a decimal entity
can be is higher now.

All of this will probably not affect anyone, since Unicode
codepoints above U+10FFFF are invalid anyways. To see the
difference, you need to be using a text encoding like UCS-4,
which allows huge 'codepoints'.

- If it saw something which looked like a hex entity, but
turned out not to be a valid numeric entity, the old
implementation would sometimes convert the hexadecimal
digits a-f to A-F (uppercase). The new implementation passes
invalid numeric entities through without performing case
conversion.

- The old implementation of mb_encode_numericentity was
limited in how many decimal/hex digits it could emit.
If a text encoding like UCS-4 was in use, where 'codepoints'
can have huge values (larger than the valid range
stipulated by the Unicode standard), it would not error
out on a 'codepoint' whose value was too large for it,
but would rather mangle the value and emit a numeric
entity which decoded to some other random codepoint.
The new implementation is able to emit enough digits to
express any value which fits in 32 bits.

PERFORMANCE:

Based on micro-benchmarks run on my development machine:

Decoding numeric HTML entities is about 4 times faster, for
both decimal and hexadecimal entities, across a variety of
input string lengths. Encoding is about 3 times faster.
2022-07-18 15:11:30 +02:00
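Two of the behavior changes above concern integer width and signedness. The Python sketch below illustrates them under stated assumptions: `parse_decimal_entity_value` and `as_int32` are hypothetical helpers, not functions from mbstring, and `INT32_MAX` stands in for `INT_MAX` on a platform where 'int' is 32 bits wide.

```python
UINT32_MAX = 0xFFFFFFFF
INT32_MAX = 0x7FFFFFFF  # stand-in for INT_MAX where 'int' is 32 bits


def parse_decimal_entity_value(digits: str, ceiling: int = UINT32_MAX):
    """Accumulate decimal digits; return None once the value exceeds ceiling."""
    value = 0
    for d in digits:
        value = value * 10 + (ord(d) - ord("0"))
        if value > ceiling:
            return None  # entity is invalid; a caller would pass it through
    return value


def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    value &= UINT32_MAX
    return value - (1 << 32) if value > INT32_MAX else value
```

With the unsigned ceiling, `parse_decimal_entity_value("3000000000")` returns `3000000000`, while passing `INT32_MAX` as the ceiling rejects the same digits. `as_int32(0x80000000)` returns `-2147483648`, which shows why a range comparison against a convmap entry with 1 in the MSB can give surprising results under signed arithmetic.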
Dmitry Stogov
c6eb5dc5fd Fix possible crash in case of exception
Fixes oss-fuzz #49068
2022-07-18 15:40:11 +03:00
David CARLIER
f6aa7a4960 [ci skip] Follow-up on #8914, usage comments addition. 2022-07-18 13:28:04 +01:00
Dmitry Stogov
34b11a7524 Fix memory leaks in
Zend/tests/type_declarations/union_types/inheritance.phpt introduced by f24548e217
2022-07-18 15:26:04 +03:00
Dmitry Stogov
f24548e217 Fix invalid free() during type persistence
Fixes oss-fuzz #49042
2022-07-18 15:11:02 +03:00
David Carlier
d0962859f4 Merge branch 'PHP-8.1' 2022-07-18 12:41:24 +01:00
David Carlier
edb173c200 Merge branch 'PHP-8.0' into PHP-8.1 2022-07-18 12:40:47 +01:00
Felix Wiedemann
3040f75f43 FPM: Downgrade occasional "failed to acquire scoreboard" warning
With request timeouts configured, php-fpm occasionally prints the
following warning:

   WARNING: failed to acquire scoreboard

This happens when php-fpm checks the child scoreboards for timeouts,
but fails to acquire a lock immediately.  As this can (and does) occur
during normal operation, this commit downgrades this to a notice.
Closes #9019.
2022-07-18 12:40:16 +01:00
Dmitry Stogov
71814e9d99 Merge branch 'PHP-8.1'
* PHP-8.1:
  Fix type inference
2022-07-18 14:20:41 +03:00
Dmitry Stogov
82d3ad64df Fix type inference
Fixes oss-fuzz #48908
2022-07-18 14:20:06 +03:00
Máté Kocsis
f0d536844f Declare ext/mysqli constants in stubs (#8811) 2022-07-18 13:00:35 +02:00
Remi Collet
af72d6e5d9 no need for attributes on legacy 2022-07-18 12:44:29 +02:00
Remi Collet
ee1d6188cf cleanup unused 2022-07-18 12:40:28 +02:00
Arnaud Le Blanc
a6856760c2 [ci skip] NEWS 2022-07-18 12:36:54 +02:00
Arnaud Le Blanc
02a0a8ae26 Merge branch 'PHP-8.1'
* PHP-8.1:
  [ci skip] NEWS
  Fix JIT crash with large number of match/switch arms (#8961)
2022-07-18 12:36:13 +02:00
Arnaud Le Blanc
4b38779a48 [ci skip] NEWS 2022-07-18 12:35:24 +02:00
Arnaud Le Blanc
f2381ae4ba Fix JIT crash with large number of match/switch arms (#8961)
Switch statements may generate a large number of exit points. Once the max
number of exit points is reached, get_exit_addr() returns NULL. This was not
checked, and this resulted in a jump table with some 0 addresses.
2022-07-18 12:34:20 +02:00
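The failure mode in this commit message, an allocator that returns a null value once a limit is reached and a caller that stores it unchecked, can be modeled in a few lines. This Python sketch is not the JIT's code: `ExitPoints`, `build_jump_table`, and the limit of 4 are hypothetical, and the message does not say how the real fix reacts to a NULL, so the sketch simply abandons the table.

```python
from typing import List, Optional

MAX_EXIT_POINTS = 4  # hypothetical small limit for the demonstration


class ExitPoints:
    def __init__(self) -> None:
        self.count = 0

    def get_exit_addr(self) -> Optional[int]:
        """Return a fake nonzero address, or None once the limit is reached."""
        if self.count >= MAX_EXIT_POINTS:
            return None
        self.count += 1
        return 0x1000 + 16 * self.count


def build_jump_table(exits: ExitPoints, arms: int) -> Optional[List[int]]:
    table = []
    for _ in range(arms):
        addr = exits.get_exit_addr()
        if addr is None:
            return None  # give up rather than store a null address
        table.append(addr)
    return table
```

Without the `addr is None` check, a switch with more arms than the limit would produce a table whose tail entries are null, which corresponds to the "0 addresses" mentioned above.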
Dmitry Stogov
26d890e6ba Merge branch 'PHP-8.1'
* PHP-8.1:
  Fix type inference for FETCH_DI_UNSET
2022-07-18 13:15:12 +03:00
Dmitry Stogov
b734d45626 Merge branch 'PHP-8.0' into PHP-8.1
* PHP-8.0:
  Fix type inference for FETCH_DI_UNSET
2022-07-18 13:15:03 +03:00
Dmitry Stogov
bd30eff5de Fix type inference for FETCH_DI_UNSET
Fixes oss-fuzz #48507
2022-07-18 13:14:15 +03:00