mirror of https://github.com/php/php-src.git synced 2026-03-24 00:02:20 +01:00

2647 Commits

Author SHA1 Message Date
Gina Peter Banyard
f40b356ad9 Use smart_str_append() if we have a zend_string* (#21414) 2026-03-21 17:06:14 +00:00
Peter Kokot
58acc671db ext/mbstring: Fix deprecation warning (#21363)
This fixes the PHP deprecation warning:

    PHP Deprecated:  Implicit conversion from float 2048.96875 to int
    loses precision in .../ext/mbstring/gen_rare_cp_bitvec.php on line 9
2026-03-07 16:16:59 +01:00
Alexandre Daubois
11a95749b1 Convert more zend_parse_parameters_none() to fast ZPP (#21330) 2026-03-04 14:07:46 +01:00
Peter Kokot
f17c5ad83b Windows build: Add new function CHECK_HEADER() (#21191)
The current function `CHECK_HEADER_ADD_INCLUDE()` automatically defines
`HAVE_<HEADER_NAME_H>` preprocessor macros, which makes it difficult to
sync with other build systems, especially when some `HAVE_` macro is used
in the code and this function defines it but Autotools doesn't.

The new `CHECK_HEADER()` function behaves similarly, except that it doesn't
define the `HAVE_<HEADER_NAME_H>` preprocessor macro.

This removes the following unused compile definitions:

HAVE_ARGON2_H
HAVE_AVIF_H
HAVE_BZLIB_H
HAVE_CAPSTONE_CAPSTONE_H
HAVE_CURL_EASY_H
HAVE_DB_H
HAVE_DECODE_H
HAVE_DEPOT_H
HAVE_EDITLINE_READLINE_H
HAVE_ENCHANT_H
HAVE_ENCODE_H
HAVE_FFI_H
HAVE_FIREBIRD_INTERFACE_H
HAVE_FT2BUILD_H
HAVE_GD_H
HAVE_GLIB_H
HAVE_GMP_H
HAVE_HTTPD_H
HAVE_IBASE_H
HAVE_IR_IR_H
HAVE_KECCAKHASH_H
HAVE_LBER_H
HAVE_LDAP_H
HAVE_LIBEXSLT_EXSLT_H
HAVE_LIBINTL_H
HAVE_LIBPQ_FE_H
HAVE_LIBTIDY_TIDY_H
HAVE_LIBXML_PARSER_H
HAVE_LIBXML_TREE_H
HAVE_LIBXML_XMLWRITER_H
HAVE_LIBXSLT_XSLT_H
HAVE_LMDB_H
HAVE_MBSTRING_H
HAVE_MYSQL_H
HAVE_ONIGURUMA_H
HAVE_OPENSSL_SSL_H
HAVE_PNG_H
HAVE_SNMP_H
HAVE_SODIUM_H
HAVE_SQLITE3_H
HAVE_SQLITE3EXT_H
HAVE_SYBFRONT_H
HAVE_TIDY_H
HAVE_TIDY_TIDY_H
HAVE_TIDYBUFFIO_H
HAVE_TIMELIB_CONFIG_H
HAVE_UNICODE_USPOOF_H
HAVE_UNICODE_UTF_H
HAVE_XPM_H
HAVE_ZIP_H
HAVE_ZIPCONF_H
HAVE_ZLIB_H

The following compile definitions are defined explicitly:

- HAVE_ICONV_H
- HAVE_MSCOREE_H
- HAVE_SQL_H
- HAVE_SQLEXT_H

Additionally, the `SETUP_OPENSSL()` function doesn't accept the 6th
argument anymore.
2026-03-03 20:06:40 +01:00
Arshid
f46bc8e3a7 ext/mbstring: Replace RETVAL_TRUE/RETVAL_FALSE with RETVAL_BOOL (#21276) 2026-02-27 06:26:11 +09:00
Alex Dowad
115ea486ac Merge branch 'PHP-8.5' 2026-02-17 06:51:21 +09:00
Alex Dowad
e106d688c2 Merge branch 'PHP-8.4' into PHP-8.5 2026-02-17 06:48:37 +09:00
Jordi Kroon
37c5a13d67 replace alloca with do_alloca in mb_guess_encoding_for_strings
This avoids a crash in cases where the list of candidate encodings is so huge
that alloca would fail. Such crashes have been observed when the list of
encodings was larger than around 208,000 entries.
2026-02-17 06:46:42 +09:00
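As a rough illustration of the idea behind commit 37c5a13d67 above, the sketch below keeps a scratch array on the stack when it is small and falls back to the heap when it is large. It is not the Zend `do_alloca` macro: `STACK_SLOTS`, `lowest_demerit_index` and the fixed-size buffer are hypothetical stand-ins chosen for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: up to this many entries live on the stack. */
#define STACK_SLOTS 256

/*
 * Copies `n` demerit scores into scratch space and returns the index of the
 * lowest one, or -1 if `n` is 0 or the heap allocation fails.
 */
long lowest_demerit_index(const unsigned *demerits, size_t n)
{
    unsigned stack_buf[STACK_SLOTS];
    unsigned *scratch = stack_buf;
    bool use_heap = n > STACK_SLOTS;

    if (n == 0) {
        return -1;
    }
    if (use_heap) {
        /* calloc lets the allocator check n * sizeof(*scratch) for overflow. */
        scratch = calloc(n, sizeof(*scratch));
        if (scratch == NULL) {
            return -1;
        }
    }

    memcpy(scratch, demerits, n * sizeof(*scratch));

    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (scratch[i] < scratch[best]) {
            best = i;
        }
    }

    /* Only heap memory needs an explicit release. */
    if (use_heap) {
        free(scratch);
    }
    return (long)best;
}
```

With a very large `n`, the request goes to the heap and a failed allocation is reported as `-1` instead of overrunning the stack.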
Ilija Tovilo
cb51737f41 Merge branch 'PHP-8.5'
* PHP-8.5:
  Tweak zend.max_allowed_stack_size for gh20836_stack_limit.phpt
2026-02-03 00:55:05 +01:00
Ilija Tovilo
9e96c5ff39 Merge branch 'PHP-8.4' into PHP-8.5
* PHP-8.4:
  Tweak zend.max_allowed_stack_size for gh20836_stack_limit.phpt
2026-02-03 00:54:56 +01:00
Ilija Tovilo
1f57d04648 Tweak zend.max_allowed_stack_size for gh20836_stack_limit.phpt
Fixes GH-21086
2026-02-03 00:54:25 +01:00
Ilija Tovilo
6173a9a109 VAR|TMP overhaul (GH-20628)
The aim of this PR is twofold:

- Reduce the number of highly similar TMP|VAR handlers
- Avoid ZVAL_DEREF in most of these cases

This is achieved by guaranteeing that all zend_compile_expr() calls, as well as
all other compile calls with BP_VAR_{R,IS}, will result in a TMP variable. This
implies that the result will not contain an IS_INDIRECT or IS_REFERENCE value,
which was mostly already the case, with two exceptions:

- Calls to return-by-reference functions. Because return-by-reference functions
  are quite rare, this is solved by delegating the DEREF to the RETURN_BY_REF
  handler, which will examine the stack to check whether the caller expects a
  VAR or TMP to understand whether the DEREF is needed. Internal functions will
  also need to adjust by calling the zend_return_unwrap_ref() function.

- By-reference assignments, including both $a = &$b, as well as $a = [&$b]. When
  the result of these expressions is used in a BP_VAR_R context, the reference
  is unwrapped via a ZEND_QM_ASSIGN opcode beforehand. This is exceptionally
  rare.

Closes GH-20628
2026-01-31 19:44:56 +01:00
Arnaud Le Blanc
65b4073922 Include the actual stub name in generated arginfo headers (#20993) 2026-01-21 20:57:00 +01:00
Alex Dowad
7ad406a4b9 Fix crash in mb_substr with MacJapanese encoding
Thanks to the GitHub user vi3tL0u1s (Viet Hoang Luu) for reporting this issue.

The MacJapanese legacy text encoding has a very unusual property; it is possible for a string
to encode more codepoints than it has bytes. In some corner cases, this resulted in a situation
where the implementation code for mb_substr() would allocate a buffer of size -1. As you can
probably imagine, that doesn't end well.

Fixes GH-20832.
2026-01-18 20:07:12 +09:00
Alexandre Daubois
b391c28f90 Merge branch 'PHP-8.5'
* PHP-8.5:
  Fix GH-20836: Stack overflow in mb_convert_variables with recursive array references (#20839)
2026-01-14 20:11:31 +01:00
Alexandre Daubois
32803687fe Merge branch 'PHP-8.4' into PHP-8.5
* PHP-8.4:
  Fix GH-20836: Stack overflow in mb_convert_variables with recursive array references (#20839)
2026-01-14 20:10:30 +01:00
Alexandre Daubois
2c112e3696 Fix GH-20836: Stack overflow in mb_convert_variables with recursive array references (#20839) 2026-01-14 20:07:11 +01:00
Alex Dowad
c34b84ed81 Remove unused conversion code from mbstring
Over the last few years, I refactored mbstring to perform encoding conversion
a buffer at a time, rather than a single byte at a time. This resulted in a
huge performance increase.

After the refactoring, the old "byte-at-a-time" code was retained for two
reasons:

1) It was used by the mailparse PECL extension.
2) It was used to implement mb_strcut for some text encodings.

However, after reviewing mailparse's use of mbstring, it is clear that
mailparse only relies on mbstring for decoding of QPrint, and possibly
Base64. It does not use the byte-at-a-time conversion code for any
other encoding.

Further, mb_strcut only relies on the byte-at-a-time conversion code
for a limited number of legacy text encodings, such as ISO-2022-JP,
HZ, UTF-7, etc.

Hence, we can remove over 5000 lines of unused code without breaking
anything. This will help to reduce binary size, and make the mbstring
codebase easier to navigate for new contributors.
2026-01-13 11:43:44 +09:00
Alex Dowad
11bec6b92f Remove some now-unused code from mbfl_strcut
The legacy mbfl_strcut function is only used to implement mb_strcut
for legacy text encodings which 1) do not use a fixed number of bytes
per codepoint, 2) do not have an 'mblen_table' which can be used to
quickly determine the codepoint length of a byte sequence, and 3) do
not have a specialized 'mb_cut' function which implements mb_strcut
for that text encoding.

Remove unused code from mbfl_strcut, and leave only what is currently
needed for the implementation of mb_strcut.
2026-01-13 11:43:44 +09:00
Alex Dowad
79b52042e3 Use fast path in more cases when doing case folding with mb_convert_case
mbstring's Unicode case conversion is table-driven, using Minimal Perfect Hash tables.
However, for small codepoint values, we bypass the hashtable lookup and just use
hard-coded conversion logic (i.e. adding or subtracting 0x20 from the appropriate
ASCII range).

For upcasing and downcasing, we had already optimized the conditional which sends
execution down this fast path, to use the fast path for as many codepoint values
as possible. However, for case folding, this had not been done.

This will give a small performance boost for case-folding Unicode text which
includes non-breaking spaces, symbols like ¥ or ™, or accented Latin
characters (used in many European languages).
2026-01-10 13:10:59 +09:00
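The fast-path idea in commit 79b52042e3 above can be sketched roughly as follows. This is a simplified assumption, not the mbstring code: `fold_table` is a three-entry stand-in for the real hash tables, and the `0xB5` cutoff is picked for this example because U+00B5 is the first codepoint above ASCII with a case-folding mapping. The actual condition in mbstring may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real lookup tables: a few sample pairs. */
static const struct { uint32_t from, to; } fold_table[] = {
    { 0x00B5, 0x03BC }, /* MICRO SIGN -> GREEK SMALL LETTER MU */
    { 0x00C9, 0x00E9 }, /* LATIN E WITH ACUTE, capital -> small */
    { 0x0410, 0x0430 }, /* CYRILLIC A, capital -> small */
};

/* Slow path: search the table, return the codepoint unchanged on a miss. */
static uint32_t fold_slow(uint32_t cp)
{
    for (size_t i = 0; i < sizeof(fold_table) / sizeof(fold_table[0]); i++) {
        if (fold_table[i].from == cp) {
            return fold_table[i].to;
        }
    }
    return cp;
}

uint32_t fold_codepoint(uint32_t cp)
{
    /* Fast path: below U+00B5, only A-Z change under case folding. */
    if (cp < 0xB5) {
        if (cp >= 0x41 && cp <= 0x5A) {
            return cp + 0x20;
        }
        return cp;
    }
    return fold_slow(cp);
}
```

In this sketch, a non-breaking space (U+00A0) or ¥ (U+00A5) returns immediately without touching the table.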
Niels Dossche
e4098da58a Merge branch 'PHP-8.5'
* PHP-8.5:
  Fix GH-20833: mb_str_pad() divide by zero if padding string is invalid in the encoding
2026-01-05 20:01:59 +01:00
Niels Dossche
171b52c98f Merge branch 'PHP-8.4' into PHP-8.5
* PHP-8.4:
  Fix GH-20833: mb_str_pad() divide by zero if padding string is invalid in the encoding
2026-01-05 20:01:54 +01:00
Niels Dossche
03113b09ce Fix GH-20833: mb_str_pad() divide by zero if padding string is invalid in the encoding
If the padding string is not valid in the given encoding,
mb_get_strlen() can return 0.

Closes GH-20834.
2026-01-05 20:01:25 +01:00
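To illustrate the class of bug fixed in commit 03113b09ce above, here is a minimal sketch of a padding calculation that refuses a zero-length pad string before dividing. `pad_plan` and its parameters are hypothetical. How the real fix reports the error to the caller is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Splits the padding needed to reach `target_len` into whole repetitions of
 * the pad string plus a remainder. Returns false when `pad_len` is 0, which
 * would otherwise cause a division by zero.
 */
bool pad_plan(size_t input_len, size_t target_len, size_t pad_len,
              size_t *full_repeats, size_t *remainder)
{
    if (pad_len == 0) {
        return false;
    }

    /* No padding is needed when the input is already long enough. */
    size_t missing = target_len > input_len ? target_len - input_len : 0;

    *full_repeats = missing / pad_len;
    *remainder = missing % pad_len;
    return true;
}
```

All lengths here are counted in characters of the target encoding, which is why an invalid pad string can end up with a length of 0.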
Gina Peter Banyard
c727f4d6c5 ext/standard/mail: use zend_string* for extra_cmd param of php_mail() 2025-12-27 23:26:58 +00:00
Niels Dossche
f20701416d mbstring: Transform RETURN_STR(zend_string_init_fast(...)) to RETURN_STRINGL_FAST(...) (#20779)
This is a dedicated API which is cleaner.
2025-12-26 12:15:25 +01:00
Yuya Hamada
64dd933a06 Merge branch 'PHP-8.4' into PHP-8.5 2025-12-15 10:58:49 +09:00
Yuya Hamada
355a4b5e61 Merge branch 'PHP-8.3' into PHP-8.4 2025-12-15 10:57:21 +09:00
Yuya Hamada
0056d013bf Fix GH-20674 mb_decode_mimeheader does not handle separator
`?=  =?` is skipped if long term, so skip space character.
Add test case from RFC2047 and fix last pattern
See: https://www.ietf.org/rfc/rfc2047#section-8
2025-12-15 10:55:17 +09:00
Yuya Hamada
85913fc61b Fix GH-20674 mb_decode_mimeheader does not handle separator
`?=  =?` is skipped if long term, so skip space character.
Add test case from RFC2047 and fix last pattern
See: https://www.ietf.org/rfc/rfc2047#section-8
2025-12-15 10:52:03 +09:00
Heran Yang
1f3fe93eff Add GB18030-2022 to default encoding list for zh-CN (#20604)
GB18030-2022 is the current official standard, superseding the previous 2005 and 2000 versions. It is essential for modern Chinese text processing for the following reasons:

    1. Superset Relationship: GB18030 is a strict superset of CP936 (GBK) and EUC-CN (GB2312). Using GB18030 as the detection target covers all characters in these older encodings while enabling support for a much wider range of characters.
    2. Extended Character Coverage: The 2022 standard includes significant updates, covering over 87,000 characters. It adds support for CJK Extensions (C, D, E, F, G) and updates mappings for rare characters that were previously mapped to the Private Use Area (PUA) in the 2005 version. This is critical for correctly handling names containing rare characters (e.g., in banking or government data).
    3. Backward Compatibility: It is safe to promote GB18030-2022 as the preferred encoding. Files encoded in EUC-CN or CP936 are valid GB18030 streams.

This PR adds GB18030-2022 to the default encoding list for CN.
2025-12-12 11:58:37 +09:00
Tobias Vorwachs
6b197ee4ed mbstring: fix missing copying of detect_order_list to current_detect_order_list on ini_set('mbstring.detect_order', string)
Closes GH-20523.
2025-12-01 20:47:57 +09:00
Niels Dossche
c0cf84158f Merge branch 'PHP-8.5'
* PHP-8.5:
  Fix GH-20492: mbstring compile warning due to non-strings
  Fix GH-20491: SLES15 compile error with mbstring oniguruma
2025-11-20 19:26:54 +01:00
Niels Dossche
929e7177f1 Merge branch 'PHP-8.4' into PHP-8.5
* PHP-8.4:
  Fix GH-20492: mbstring compile warning due to non-strings
  Fix GH-20491: SLES15 compile error with mbstring oniguruma
2025-11-20 19:26:48 +01:00
Niels Dossche
10ac41f158 Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix GH-20492: mbstring compile warning due to non-strings
  Fix GH-20491: SLES15 compile error with mbstring oniguruma
2025-11-20 19:23:36 +01:00
Niels Dossche
159ef1401c Fix GH-20492: mbstring compile warning due to non-strings
This is a partial backport of ea69276f, but without changing public
headers as that's not allowed at this point.

Closes GH-20494.
2025-11-20 19:17:55 +01:00
Niels Dossche
a1912e3cdd Fix GH-20491: SLES15 compile error with mbstring oniguruma
The issue is specific to SLES15.
Arguably this should be reported to them as it seems to me they meddled
with the oniguruma source code.

The definition in oniguruma.h on that platform looks like this (same as upstream):
```c
ONIG_EXTERN
int onig_error_code_to_str PV_((OnigUChar* s, int err_code, ...));
```

Where `PV_` is defined as (differs):
```c
#ifndef PV_
#ifdef HAVE_STDARG_PROTOTYPES
# define PV_(args) args
#else
# define PV_(args) ()
#endif
#endif
```

So that means that `HAVE_STDARG_PROTOTYPES` is unset.
This can be set if we define `HAVE_STDARG_H`,
which we can do because PHP requires at least C99 in which the header
is always available.
We could also use an autoconf check, but this isn't really necessary as
it will always succeed.
2025-11-20 19:17:17 +01:00
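The effect of the `PV_` macro described in commit a1912e3cdd above can be reproduced in isolation. In this sketch, `sum_ints` is a hypothetical variadic function, and the way `HAVE_STDARG_PROTOTYPES` is derived from `HAVE_STDARG_H` is an assumption modelled on the commit message, not copied from the oniguruma header.

```c
#include <stdarg.h>

/* Defined up front, as the fix does for the PHP build. */
#define HAVE_STDARG_H 1

/* Assumed derivation: the header enables prototypes when stdarg.h exists. */
#ifdef HAVE_STDARG_H
# define HAVE_STDARG_PROTOTYPES 1
#endif

#ifndef PV_
# ifdef HAVE_STDARG_PROTOTYPES
#  define PV_(args) args   /* keeps the full parameter list */
# else
#  define PV_(args) ()     /* drops it, leaving an empty parameter list */
# endif
#endif

/* Expands to: int sum_ints(int count, ...); */
int sum_ints PV_((int count, ...));

/* Adds up `count` int arguments passed after `count`. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

If `HAVE_STDARG_H` were left undefined here, the declaration would collapse to `int sum_ints();`, which no longer matches the variadic definition.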
Niels Dossche
94c256f997 Properly silence set-but-unused-var warning 2025-11-15 18:53:12 +01:00
Niels Dossche
fee4e1889f mbstring: Avoid pointless refcounted copy (#20325)
These scalars can use the ZVAL_COPY_VALUE variant instead of ZVAL_COPY
because they don't need refcounting.
2025-10-29 17:23:13 +01:00
Tim Düsterhus
753f287a37 mbstring: Use true / false instead of 1 / 0 for bool parameters
Changes done with Coccinelle:

    @r1@
    identifier F;
    identifier p;
    typedef bool;
    parameter list [n1] PL1;
    parameter list [n2] PL2;
    @@

    F(PL1, bool p, PL2) {
    ...
    }

    @r2@
    identifier r1.F;
    expression list [r1.n1] EL1;
    expression list [r1.n2] EL2;
    @@

    F(EL1,
    (
    - 1
    + true
    |
    - 0
    + false
    )
    , EL2)
2025-09-24 18:51:40 +02:00
Tim Düsterhus
af7340a265 mbstring: Use true / false instead of 1 / 0 when assigning to bool
Changes done with Coccinelle:

    @@
    bool b;
    @@

    - b = 0
    + b = false

    @@
    bool b;
    @@

    - b = 1
    + b = true
2025-09-24 18:51:40 +02:00
Gina Peter Banyard
93676a0425 ext/standard: Deprecate passing string which are not one byte long to ord() (#19440)
RFC: https://wiki.php.net/rfc/deprecations_php_8_5#deprecate_passing_string_which_are_not_one_byte_long_to_ord

Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
2025-09-14 11:42:59 +01:00
tekimen
edc2671227 ext/mbstring: Update to Unicode 17.0 (#19796)
Updates UCD to Unicode 17.0 (released 2025 Sep).
2025-09-13 08:07:51 +09:00
Jorg Adam Sowa
1e02099e6a ext/mbstring: Use internal_encoding INI setting instead of mb_internal_encoding() in tests (#19663)
Moves the usage of `mb_internal_encoding()` to INI section for the tests not testing the encoding/function itself, but the other mbstring/iconv functions.
2025-09-03 11:34:12 +01:00
Niels Dossche
be2889411a Merge branch 'PHP-8.4'
* PHP-8.4:
  Fix GH-19397: mb_list_encodings() can cause crashes on shutdown
2025-08-08 20:33:00 +02:00
Niels Dossche
db3f6d0bf0 Merge branch 'PHP-8.3' into PHP-8.4
* PHP-8.3:
  Fix GH-19397: mb_list_encodings() can cause crashes on shutdown
2025-08-08 20:32:55 +02:00
Niels Dossche
cc93bbb765 Fix GH-19397: mb_list_encodings() can cause crashes on shutdown
The request shutdown does not necessarily hold the last reference, if
there is still a CV that refers to the array.

Closes GH-19405.
2025-08-08 20:32:29 +02:00
Gina Peter Banyard
105c1e9896 tree: use zend_str_has_nul_byte() API (#19336) 2025-07-31 23:57:27 +01:00
Niels Dossche
719419a6e5 Fix unterminated string GCC warnings in mbstring (#19192)
Necessary for Werror builds
2025-07-23 11:49:16 +02:00
DanielEScherzer
07f1cfd9b0 Deprecate producing output in a user output handler (#19067)
https://wiki.php.net/rfc/deprecations_php_8_4
2025-07-09 21:20:58 -07:00
Gina Peter Banyard
c7778641dd ext/mbstring: Remove ZPP tests 2025-06-23 13:58:31 +02:00