mirror of https://github.com/php/php-src.git synced 2026-04-02 13:43:02 +02:00

60005 Commits

Author SHA1 Message Date
Nikita Popov
1e012ecb3f Fix bug #81405: Restore old PDO::PARAM_* values
Doctrine hardcodes the values of these constants, avoid changing
them.

Closes GH-7445.
2021-09-01 13:54:41 +02:00
Derick Rethans
b0dd55b11c Fixed test - the expected result was wrong 2021-08-31 17:19:28 +01:00
Nikita Popov
5adfcfe746 Merge branch 'PHP-8.0'
* PHP-8.0:
  Avoid dangling pointer in curl header.str
2021-08-31 17:25:38 +02:00
Nikita Popov
db055fdb89 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Avoid dangling pointer in curl header.str
2021-08-31 17:24:30 +02:00
Alexey Zamorov
8c292a2f9d Avoid dangling pointer in curl header.str
If buf_len is zero, this would leave behind a dangling pointer
to an already released header.str. Make sure this can't happen
by always overwriting the pointer.

Closes GH-7376.
2021-08-31 17:23:58 +02:00
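The fix described in 8c292a2f9d can be sketched roughly as follows. This is a minimal illustration, not the ext/curl code: `header_buf` and `header_buf_set` are hypothetical names, and the only point shown is that the pointer is overwritten on every path, including the `buf_len == 0` one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the stored header; not the real php_curl struct. */
typedef struct {
    char  *str;  /* owned copy of the last header block, or NULL */
    size_t len;
} header_buf;

/* Replace the stored header with a copy of buf[0..buf_len). */
void header_buf_set(header_buf *h, const char *buf, size_t buf_len)
{
    assert(h != NULL);

    free(h->str);
    /* Overwrite the pointer unconditionally, so that no path (including
     * buf_len == 0) leaves h->str pointing at released memory. */
    h->str = NULL;
    h->len = 0;

    if (buf_len > 0) {
        h->str = malloc(buf_len + 1);
        if (h->str == NULL) {
            return; /* allocation failed; h stays empty but valid */
        }
        memcpy(h->str, buf, buf_len);
        h->str[buf_len] = '\0';
        h->len = buf_len;
    }
}
```

Resetting the pointer before the length check, rather than inside it, is what removes the dangling case.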
Nikita Popov
4ba7e5b24d Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix curl_copy_handle() with CURLINFO_HEADER_OUT
2021-08-31 17:09:34 +02:00
Nikita Popov
416dd524f9 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix curl_copy_handle() with CURLINFO_HEADER_OUT
2021-08-31 17:09:07 +02:00
Nikita Popov
30e791ed56 Fix curl_copy_handle() with CURLINFO_HEADER_OUT
The CURLOPT_DEBUGDATA will point to the old curl handle after
copying. Update it to point to the new handle.

We don't separately store whether CURLINFO_HEADER_OUT is enabled,
so I'm doing this unconditionally. It should be harmless if
CURLOPT_DEBUGFUNCTION is not used.
2021-08-31 17:06:41 +02:00
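The problem fixed in 30e791ed56 is an instance of a general pattern: a structure that stores a pointer back to itself as callback data must have that pointer refreshed after a copy. The sketch below assumes a hypothetical `handle` type whose `debug_data` field plays the role of CURLOPT_DEBUGDATA; it does not use libcurl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle; debug_data stands in for CURLOPT_DEBUGDATA. */
typedef struct handle {
    int            id;
    struct handle *debug_data; /* user data handed to the debug callback */
} handle;

void handle_init(handle *h, int id)
{
    assert(h != NULL);
    h->id = id;
    h->debug_data = h; /* callback data refers back to this handle */
}

/* Copy src into dst, then re-point the callback data at the copy. */
void handle_copy(handle *dst, const handle *src, int new_id)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;           /* dst->debug_data still points at src here */
    dst->id = new_id;
    dst->debug_data = dst; /* updated unconditionally, as in the commit */
}
```

Without the last assignment in `handle_copy`, a callback fired for the copy would receive the original handle.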
Nikita Popov
992b5f2e08 Make it easier to run curl tests standalone
Fall back to PHP_BINARY if TEST_PHP_EXECUTABLE not given.
2021-08-31 16:53:43 +02:00
Derick Rethans
2bf451b925 Upgrade timelib to 2021.08, which addresses some defects and performance issues
- Fixed bug #80998 (Missing second with inverted interval).
- Speed up finding timezone offset information.
2021-08-31 15:29:48 +01:00
Nikita Popov
14f599ea7d Use zend_long for resource ID
Currently, resource IDs are limited to 32-bits. As resource IDs
are not reused, this means that resource ID overflow for
long-running processes is very possible.

This patch switches resource IDs to use zend_long instead, which
means that on 64-bit systems, 64-bit resource IDs will be used.
This makes resource ID overflow practically impossible.

The tradeoff is an 8 byte increase in zend_resource size.

Closes GH-7436.
2021-08-31 14:58:59 +02:00
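To put the overflow argument of 14f599ea7d into numbers, here is a small back-of-the-envelope helper. The function name and the allocation rate used below are assumptions for illustration; the commit itself gives no rate.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a never-reused, monotonically increasing ID counter
 * reaches max_id, at a constant number of allocations per second. */
uint64_t seconds_until_overflow(uint64_t max_id, uint64_t ids_per_second)
{
    assert(ids_per_second > 0);
    return max_id / ids_per_second;
}
```

At an assumed 1000 resources per second, a signed 32-bit counter (`INT32_MAX`) lasts 2147483 seconds, a little under 25 days, while a signed 64-bit counter (`INT64_MAX`) lasts on the order of 292 million years.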
Dmitry Stogov
6871a49b66 Fix timelib_parse_zone() performance problem.
This makes "new DateTimeZone("Europe/London");" 170 times faster.

This is a hotfix for https://github.com/derickr/timelib/pull/99
2021-08-31 15:29:05 +03:00
Nikita Popov
32d48212ea Support generating internal enum decl from stubs 2021-08-31 14:19:37 +02:00
Alex Dowad
df32267494 Add more tests for UTF7-IMAP text conversion 2021-08-31 13:41:34 +02:00
Alex Dowad
16a1e0a219 In UTF7-IMAP, reject the 2nd part of surrogate pair if it appears unexpectedly 2021-08-31 13:41:34 +02:00
Alex Dowad
355464935d Add another test for UTF-7 text conversion 2021-08-31 13:41:34 +02:00
Alex Dowad
51b6c687db Add another test for GB18030 text conversion 2021-08-31 13:41:34 +02:00
Alex Dowad
a0415b22ab Add more tests for CP5022{0,1,2} text conversion 2021-08-31 13:41:34 +02:00
Alex Dowad
e3f6a9fbfe CP5022{0,1,2} supports 'IBM extension' codes from ku 115-119
mbstring has always had the conversion tables to support CP932 codes
in ku 115-119, and the conversion code for CP5022x has an 'if' clause
specifically to handle such characters... but that 'if' clause was dead
code, since a guard clause earlier in the same function prevented it
from accepting 2-byte characters with a starting byte of 0x93-0x97.

Adjust the guard clause so that these characters can be converted as
the original author apparently intended.

The code which handles ku 115-119 is the part which reads:

    } else if (s >= cp932ext3_ucs_table_min && s < cp932ext3_ucs_table_max) {
      w = cp932ext3_ucs_table[s - cp932ext3_ucs_table_min];
2021-08-31 13:41:34 +02:00
Alex Dowad
671dcee01e Add test for mb_str_split on UCS-2 text 2021-08-31 13:41:34 +02:00
Alex Dowad
f303fc8a9b Use bool in mbfl_filt_conv_output_hex (rather than int) 2021-08-31 13:41:34 +02:00
Alex Dowad
776296e12f mbstring no longer provides 'long' substitutions for erroneous input bytes
Previously, mbstring had a special mode whereby it would convert
erroneous input byte sequences to output like "BAD+XXXX", where "XXXX"
would be the erroneous bytes expressed in hexadecimal. This mode could
be enabled by calling `mb_substitute_character("long")`.

However, accurately reproducing input byte sequences from the cached
state of a conversion filter is often tricky, and this significantly
complicates the implementation. Further, the means used for passing
the erroneous bytes through to where the "BAD+XXXX" text is generated
only allows for up to 3 bytes to be passed, meaning that some erroneous
byte sequences are truncated anyways.

More to the point, a search of publicly available PHP code indicates
that nobody is really using this feature anyways.

Incidentally, this feature also provided error output like "JIS+XXXX"
if the input 'should have' represented a JISX 0208 codepoint, but it
decodes to a codepoint which does not exist in the JISX 0208 charset.
Similarly, specific error output was provided for non-existent
JISX 0212 codepoints, and likewise for JISX 0213, CP932, and a few
other charsets. All of that is now consigned to the flames.

However, "long" error markers also include a somewhat more useful
"U+XXXX" marker for Unicode codepoints which were successfully
decoded from the input text, but cannot be represented in the output
encoding. Those are still supported.

With this change, there is no need to use a variety of special values
in the high bits of a wchar to represent different types of error
values. We can (and will) just use a single error value. This will be
equal to -1.

One complicating factor: Text conversion functions return an integer to
indicate whether the conversion operation should be immediately
aborted, and the magic 'abort' marker is -1. Also, almost all of these
functions would return the received byte/codepoint to indicate success.
That doesn't work with the new error value; if an input filter detects
an error and passes -1 to the output filter, and the output filter
returns it back, that would be taken to mean 'abort'.

Therefore, amend all these functions to return 0 for success.
2021-08-31 13:41:34 +02:00
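The return-value change described at the end of 776296e12f can be illustrated with a two-stage toy filter chain. Everything here is a simplified sketch: `ERR_MARKER`, `sink`, `input_filter` and `output_filter` are hypothetical names, and the "encoding" is plain 7-bit ASCII rather than any real mbstring filter.

```c
#include <assert.h>
#include <stddef.h>

#define ERR_MARKER (-1) /* hypothetical name for the single error value */

typedef struct {
    int    out[16];
    size_t n;
} sink;

/* Output stage: returns 0 on success and -1 only to request an abort. */
int output_filter(int c, sink *s)
{
    assert(s != NULL);
    if (s->n >= sizeof s->out / sizeof s->out[0]) {
        return -1; /* no room left: abort the conversion */
    }
    /* An error marker is replaced by a substitute character. */
    s->out[s->n++] = (c == ERR_MARKER) ? '?' : c;
    return 0; /* success, even when c was ERR_MARKER */
}

/* Input stage for 7-bit ASCII: bytes >= 0x80 are erroneous. */
int input_filter(unsigned char byte, sink *s)
{
    int c = (byte < 0x80) ? (int)byte : ERR_MARKER;
    return output_filter(c, s);
}
```

If `output_filter` returned its argument on success, as the old convention did, passing `ERR_MARKER` through would hand -1 back to the caller and be mistaken for an abort request. Returning 0 keeps the two meanings apart.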
Go Kudo
eaac77f4e7 Fix nested namespaced typed property in gen_stub.php (#7418)
Properly escape namespaced class name in property types.
2021-08-31 11:56:39 +02:00
Nikita Popov
5b2ddf5a17 Export zend_use_resource_as_offset()
Use a common implementation to generate this error message, as
we do so in quite a few places dealing with array keys.
2021-08-31 10:58:01 +02:00
Máté Kocsis
70f516d3e8 Make default value more explicit 2021-08-31 10:19:05 +02:00
Máté Kocsis
5256798d88 Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix default value of $flags in oci_fetch_all()
2021-08-31 10:14:19 +02:00
Máté Kocsis
26aa54e098 Fix default value of $flags in oci_fetch_all() (#7429) 2021-08-31 10:05:24 +02:00
Dmitry Stogov
dad5cfa868 Rename ZREG_FCARG1x/ZREG_FCARG1a into ZREG_FCARG1 2021-08-30 20:38:52 +03:00
Christoph M. Becker
24fe7f08b5 Merge branch 'PHP-8.0'
* PHP-8.0:
  Fix #81400: Unterminated string in dns_get_record() results
2021-08-30 18:55:16 +02:00
Christoph M. Becker
fcbe737218 Merge branch 'PHP-7.4' into PHP-8.0
* PHP-7.4:
  Fix #81400: Unterminated string in dns_get_record() results
2021-08-30 18:52:40 +02:00
Christoph M. Becker
edab9ad205 Fix #81400: Unterminated string in dns_get_record() results
If we assemble a zend_string manually, we need to end it with a NUL
byte ourselves.

We also fix the size calculation for that zend_string; there is no need
for the extra byte for each part, and we don't have to multiply by two,
since we're using DnsQuery_A(), not DnsQuery_W() (in which case we
would have to do the character set conversion, anyway).  This avoids
over-allocation, and the need to explicitly set the string length.

Finally, we use the proper access macro for zend_strings.

Closes GH-7427.
2021-08-30 18:49:39 +02:00
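The two points made in edab9ad205 (terminate a manually assembled string yourself, and allocate exactly the summed length plus one byte) can be sketched with a hypothetical length-prefixed string type. `lstring` and `lstring_join` are illustrative names only; they are loosely modelled on the idea of a zend_string and do not use the Zend API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed string with inline storage. */
typedef struct {
    size_t len;
    char   val[]; /* len bytes followed by a NUL terminator */
} lstring;

/* Concatenate count byte-string parts into one newly allocated lstring. */
lstring *lstring_join(const char *const *parts, const size_t *lens, size_t count)
{
    assert(count == 0 || (parts != NULL && lens != NULL));

    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += lens[i];
    }

    /* Exactly total bytes plus one for the terminator: no per-part
     * extra byte and no doubling for wide characters. */
    lstring *s = malloc(sizeof *s + total + 1);
    if (s == NULL) {
        return NULL;
    }

    char *p = s->val;
    for (size_t i = 0; i < count; i++) {
        memcpy(p, parts[i], lens[i]);
        p += lens[i];
    }
    *p = '\0'; /* assembled by hand, so terminated by hand */
    s->len = total;
    return s;
}
```

The caller owns the result and releases it with `free()`.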
Dmitry Stogov
f1f4403dc2 Fixed register allocation when ADD/SUB/MUL two references in tracing JIT
The bug was introduced by 7690fa0bd8 and
led to a failure in `make test TESTS="-d opcache.jit=1254 --repeat 3 ext/date/tests/bug30096.phpt"`
2021-08-30 19:41:39 +03:00
Denis Ryabov
d3a6054d44 Fix/improve handling of escaping in ini parser
Quoting from UPGRADING:

- A leading dollar in a quoted string can now be escaped: "\${" will now be
  interpreted as a string with contents `${`.

- Backslashes in double quoted strings are now more consistently treated as
  escape characters. Previously, "foo\\" followed by something other than a
  newline was not considered as a terminated string. It is now interpreted as a
  string with contents `foo\`. However, as an exception, the string "foo\"
  followed by a newline will continue to be treated as a valid string with
  contents `foo\` rather than an unterminated string. This exception exists to
  support naive uses of Windows file paths as "C:\foo\".

Closes GH-7420.
2021-08-30 16:59:22 +02:00
Alex Dowad
15ba73cee3 Add more tests for UTF-8 text conversion 2021-08-30 16:29:58 +02:00
Alex Dowad
51a32ccaf4 Add another test for UTF-16LE 2021-08-30 16:29:58 +02:00
Alex Dowad
7472c82c45 Add tests for UCS-4 text conversion 2021-08-30 16:29:58 +02:00
Alex Dowad
79015b23aa Add tests for UCS-2 text encoding 2021-08-30 16:29:58 +02:00
Alex Dowad
34ef8f3ca2 Add tests for '7bit' and '8bit' text encodings in mbstring 2021-08-30 16:29:58 +02:00
Alex Dowad
97f8495e0f UCS-4 conversion does not pass BOM through to output
This is to match the way that we handle UCS-2. When a BOM is found at
the beginning of a 'UCS-2' string (NOT 'UCS-2BE' or 'UCS-2LE'), we take
note of the intended byte order and handle the string accordingly, but
do NOT emit a BOM to the output. Rather, we just use the default byte
order for the requested output encoding.

Some might argue that if the input string used a BOM, and we are
emitting output in a text encoding where both big-endian and
little-endian byte orders are possible, we should include a BOM in the
output string. To such hypothetical debaters of minutiae, I can only
offer you a shoulder shrug. No reasonable program which handles UCS-2
and UCS-4 text should require a BOM.

Really, the concept of the BOM is a poor idea and should not have been
included in Unicode. Standardizing on a single byte order would have
been much better, similar to 'network byte order' for the Internet
Protocol. But this is not the place to speak at length of such things.
2021-08-30 16:29:58 +02:00
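The BOM handling described in 97f8495e0f (note the byte order, consume the BOM, do not emit it) might look roughly like this for a standalone decoder. `ucs4_decode` is a hypothetical helper, not an mbstring function; it assumes big-endian when no BOM is present and, to stay short, simply ignores 1 to 3 leftover trailing bytes instead of reporting them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode len bytes of UCS-4 into out (capacity cap code points).
 * A leading BOM selects the byte order and is consumed, not emitted.
 * Returns the number of code points written. */
size_t ucs4_decode(const unsigned char *in, size_t len, uint32_t *out, size_t cap)
{
    assert(in != NULL && out != NULL);

    int    little_endian = 0;
    size_t i = 0, n = 0;

    if (len >= 4) {
        if (in[0] == 0x00 && in[1] == 0x00 && in[2] == 0xFE && in[3] == 0xFF) {
            i = 4; /* big-endian BOM: skip it */
        } else if (in[0] == 0xFF && in[1] == 0xFE && in[2] == 0x00 && in[3] == 0x00) {
            little_endian = 1;
            i = 4; /* little-endian BOM: skip it */
        }
    }

    for (; i + 4 <= len && n < cap; i += 4) {
        const unsigned char *b = in + i;
        out[n++] = little_endian
            ? ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) | ((uint32_t)b[1] << 8) | b[0]
            : ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) | ((uint32_t)b[2] << 8) | b[3];
    }
    return n;
}
```

Because the BOM never reaches `out`, a later encoding step is free to use the default byte order of whatever output encoding was requested.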
Alex Dowad
e6f1a72235 Add test suite for mobile variants of UTF-8 (and fix bugs) 2021-08-30 16:29:58 +02:00
Alex Dowad
1865576694 Add test suite for EUC-JP-WIN (or EUC-JP-MS) text encoding (and fix bugs) 2021-08-30 16:29:58 +02:00
Alex Dowad
6a693d2d33 Remove useless variable: mbfl_encoding_utf8_kddi_a_aliases 2021-08-30 16:29:58 +02:00
Alex Dowad
d4561894ea Extraneous trailing UCS-4 bytes are treated as error 2021-08-30 16:29:58 +02:00
Alex Dowad
0de4d6872e Add more tests for SJIS-2004 text conversion 2021-08-30 16:29:58 +02:00
Alex Dowad
c7d47cbb4c Add more tests for SJIS text conversion 2021-08-30 16:29:58 +02:00
Alex Dowad
299690a1cf Add more tests for ISO-2022-JP/JIS7/JIS8 text conversion 2021-08-30 16:29:58 +02:00
Alex Dowad
b2be85d11a Add more tests for ISO-2022-JP-MS text conversion 2021-08-30 16:29:58 +02:00
Alex Dowad
ae4c956089 Add more tests for ISO-2022-JP-KDDI text conversion 2021-08-30 16:29:58 +02:00
Alex Dowad
51e0d323e4 ISO-2022-JP-MS treats truncated multi-byte chars as error
Sigh. I included tests which were intended to check this case in the
test suite for ISO-2022-JP-MS, but those tests were faulty and didn't
actually test what they were supposed to.

Fixing the tests revealed that there were still bugs in this area.
2021-08-30 16:29:58 +02:00
Alex Dowad
57a81af041 ISO-2022-JP-KDDI text conversion doesn't swallow PUA codepoints
There was a bit of legacy code here which looks like the original author
of mbstring intended to allow conversion of Unicode Private Use Area
codepoints to ISO-2022-JP-KDDI. However, that code never worked.
It set the output variable to values which were not matched by any
of the 'if' clauses below, which meant that nothing was actually
emitted to the output. In other words, if one tried to convert Unicode
to ISO-2022-JP-KDDI, and the Unicode string contained PUA codepoints,
they would be quietly 'swallowed' and disappear.

I don't know what ISO-2022-JP-KDDI byte sequences the author wanted
to map those PUA codepoints to, and anyways, this use case is so obscure
that there is little point in worrying about it. However, it is better
to remove the non-functioning code than to leave it in.

This means that if one now tries to convert PUA codepoints to
ISO-2022-JP-KDDI, those codepoints will be treated as erroneous rather
than silently ignored.
2021-08-30 16:29:58 +02:00
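The behaviour change in 57a81af041 (unmappable codepoints are reported instead of silently dropped) can be shown with a toy encoder. This is only a sketch under loud assumptions: the target "encoding" is plain ASCII, and `encoder` and `encode_codepoint` are hypothetical names unrelated to the ISO-2022-JP-KDDI implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned char out[16];
    size_t        n;      /* bytes written so far */
    size_t        errors; /* codepoints that could not be represented */
} encoder;

/* Encode one codepoint into a toy ASCII-only target encoding. */
void encode_codepoint(encoder *e, uint32_t cp)
{
    assert(e != NULL && e->n < sizeof e->out);

    if (cp < 0x80) {
        e->out[e->n++] = (unsigned char)cp;
        return;
    }
    /* Everything else, including Private Use Area codepoints such as
     * U+E000..U+F8FF, has no mapping here. Count it as an error and emit
     * a substitute, rather than returning without writing anything. */
    e->errors++;
    e->out[e->n++] = '?';
}
```

The old behaviour corresponds to a version of the final branch that neither writes a byte nor records an error, which is how such codepoints could vanish from the output.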