1
0
mirror of https://github.com/php/php-src.git synced 2026-04-16 20:41:18 +02:00
Commit Graph

2 Commits

Author SHA1 Message Date
Nikita Popov
a8a9e93e9a Revert/fix mb_substitute_character() codepoint checks
The introduced checks did not treat "non-Unicode" encodings correctly,
because they treated the passed integer as encoded in the internal
encoding in that case, while in actuality the substitute character
is always a Unicode codepoint.

Additionally checking the codepoint against the internal encoding
is not correct in any case, because the substitution character must
be mapped in the *target* encoding of the conversion, which does
not necessarily coincide with the internal encoding (the internal
encoding is the default *source* encoding, not *target* encoding).

This reverts the checks back to simple range checks, but in a way
that still resolves #69079: Characters outside the Basic
Multilingual Plane are now accepted and Surrogate Codepoints are
rejected. A distinction between UTF-8 and non-UTF-8 encodings is
not made for surrogate checks (as in the original patch), as
surrogates are always illegal on their own. Specifying a surrogate
as substitution character would only make sense if you could
specify a substitution string with more than one character --
however we do not support that.
2017-08-03 21:12:41 +02:00
Masaki Kagaya
aa7eceb3a5 added php_mb_check_code_point for mb_substitute_character 2015-02-19 22:49:13 +09:00