1
0
mirror of https://github.com/php/php-src.git synced 2026-04-14 03:22:58 +02:00
Commit Graph

363 Commits

Author SHA1 Message Date
Nikita Popov
dee5a450d9 Fixed bug #77165
Also add some helper macros for PROTECT/UNPROTECT that check for
IMMUTABLE. These checks are needed for nearly any use of
PROTECT/UNPROTECT.
2018-11-15 17:16:39 +01:00
Nikita Popov
09c7108f74 Fix mb_strrpos() with encoding passed as 3rd param 2018-10-29 18:56:17 +01:00
Nikita Popov
1151554668 Remove the "auto" encoding
"auto" is only meaningful in functions which accept an encoding
*list* and support encoding detection. These functions have
explicit checks for "auto". It cannot be used as a standalone
encoding in any meaningful capacity, so I'm dropping it entirely.
2018-10-17 12:50:24 +02:00
Nikita Popov
56665a1b17 Fixed bug #77025
Implements 8bit conversions equivalently to iso-8859-1 conversions.
This seems quite dubious to me, but seems to match the previous
behavior.

It might make more sense to map the characters into a private area
instead, so that the 8bit encoding is treated as binary data with
no case conversions (including no case conversions in the ascii
range).
2018-10-17 12:38:31 +02:00
Peter Kokot
b746e69887 Sync leading and final newlines in *.phpt sections
This patch adds missing newlines, trims multiple redundant final
newlines into a single one, and trims redundant leading newlines in all
*.phpt sections.

According to POSIX, a line is a sequence of zero or more non-' <newline>'
characters plus a terminating '<newline>' character. [1] Files should
normally have at least one final newline character.

C89 [2] and later standards [3] mention a final newline:
"A source file that is not empty shall end in a new-line character,
which shall not be immediately preceded by a backslash character."

Although it is not mandatory for all files to have a final newline
fixed, a more consistent and homogeneous approach brings less of commit
differences issues and a better development experience in certain text
editors and IDEs.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
[2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2
[3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2
2018-10-15 04:32:30 +02:00
Peter Kokot
782352c54a Trim trailing whitespace in *.phpt 2018-10-14 19:45:12 +02:00
Peter Kokot
85290bbfcc Convert CRLF line endings to LF
This patch simplifies line endings tracked in the Git repository and
syncs them to all include the LF style instead of the CRLF files.

Newline characters:
- LF (\n) (*nix and Mac)
- CRLF (\r\n) (Windows)
- CR (\r) (old Mac, obsolete)

To see which line endings are in the index and in the working copy the
following command can be used:
`git ls-files --eol`

Git additionally provides `.gitattributes` file to specify if some files
need to have specific line endings on all platforms (either CRLF or LF).

Changed files shouldn't cause issues on modern Windows platforms because
also Git can do output conversion is core.autocrlf=true is set on
Windows and use CRLF newlines in all files in the working tree.

Unless CRLF files are tracked specifically, Git by default tracks all
files in the index using LF newlines.
2018-10-13 11:23:20 +02:00
Nikita Popov
26f82a7706 Fixed bug #76958 2018-10-02 16:13:51 +02:00
Christoph M. Becker
a003af5b62 Add missing skip conditions
mbstring can be built without mbregex support, in which case these
tests would fail.  Thus we add respective skip conditions.
2018-08-05 00:01:35 +02:00
Christoph M. Becker
5dc74d9e70 Merge branch 'PHP-7.2' into PHP-7.3
* PHP-7.2:
  Fix #76704: mb_detect_order return value varies based on argument type
2018-08-04 13:50:48 +02:00
Christoph M. Becker
db8bcdba80 Merge branch 'PHP-7.1' into PHP-7.2
* PHP-7.1:
  Fix #76704: mb_detect_order return value varies based on argument type
2018-08-04 12:57:05 +02:00
Christoph M. Becker
c00f5e6531 Fix #76704: mb_detect_order return value varies based on argument type
php_mb_parse_encoding_list() and php_mb_parse_encoding_array() are
supposed to return SUCCESS and FAILURE, not 1 and 0, respectively.
2018-08-04 12:51:57 +02:00
Nikita Popov
e6016ab20d Deprecate undocumented mbereg_* aliases
Part of https://wiki.php.net/rfc/deprecations_php_7_3.
2018-07-21 22:34:09 +02:00
ju1ius
8f1782678e adds support for named subpatterns to mb_ereg_replace
Named subpatterns are now passed to `mb_ereg_replace_callback`.

This commit also adds a subset of the oniguruma back-reference syntax
for replacements:
* `\k<name>` and `\k'name'` for named subpatterns.
* `\k<n>` and `\k'n'` for numbered subpatterns
These last two notations allow referencing numbered groups where n > 9.
2018-07-06 23:34:54 +02:00
ju1ius
212f56b7ca adds support for named captures to mb_ereg & mb_ereg_search
`mb_ereg`, `mb_ereg_search_regs` & `mb_ereg_search_getregs`
returned only numbered capturing groups.
Now they return both numbered and named capturing groups.
Fixes Bug #72704.
2018-07-06 23:34:54 +02:00
Nikita Popov
a7101415cb Merge branch 'PHP-7.2' 2018-06-28 23:06:08 +02:00
Nikita Popov
00c0d7702c Merge branch 'PHP-7.1' into PHP-7.2 2018-06-28 23:05:09 +02:00
Marcus Schwarz
bf5a802f5a Fixed bug #76532 (excessive memory usage in mb_strimwidth) 2018-06-28 23:02:28 +02:00
Anatol Belski
3b07c6cf87 Skip tests when Oniguruma is disabled 2018-06-11 17:44:34 +02:00
Nikita Popov
9d63f4dec1 Fixed bug #76319
While at it, also make sure that mbstring case conversion takes
into account the specified substitution character and substitution
mode.
2018-05-25 11:33:13 +02:00
Christoph M. Becker
9004985273 Merge branch 'PHP-7.2'
* PHP-7.2:
  Fix #75944: Wrong cp1251 detection
2018-03-19 14:48:10 +01:00
Christoph M. Becker
cd2912af5e Merge branch 'PHP-7.1' into PHP-7.2
* PHP-7.1:
  Fix #75944: Wrong cp1251 detection
2018-03-19 14:34:09 +01:00
Christoph M. Becker
47461368ca Fix #75944: Wrong cp1251 detection
`\xFF` is a valid character of CP-1251.
2018-03-19 14:24:27 +01:00
Christoph M. Becker
ef01ec08f0 Merge branch 'PHP-7.2'
* PHP-7.2:
  Fix #62545: wrong unicode mapping in some charsets
2018-03-11 18:05:08 +01:00
Christoph M. Becker
2b02e6dff3 Merge branch 'PHP-7.1' into PHP-7.2
* PHP-7.1:
  Fix #62545: wrong unicode mapping in some charsets
2018-03-11 17:54:45 +01:00
Christoph M. Becker
01ea314e8c Fix #62545: wrong unicode mapping in some charsets
Undefined characters are best mapped to Unicode REPLACEMENT characters.
2018-03-11 17:38:28 +01:00
Gabriel Caruso
e1cc4863d9 Remove duplicated tests 2018-02-22 13:03:21 +01:00
Gabriel Caruso
ded3d984c6 Use EXPECT instead of EXPECTF when possible
EXPECTF logic in run-tests.php is considerable, so let's avoid it.
2018-02-20 21:53:48 +01:00
Gabriel Caruso
21e3b0c70c Remove trailing whitespace in inc files 2018-02-10 19:20:23 +01:00
Gabriel Caruso
2d48d734a2 Fix some misspellings 2018-02-06 16:59:00 +01:00
Gabriel Caruso
fef879a2d6 Use bool instead of boolean while throwing a type error
PHP requires boolean typehints to be written "bool" and disallows
"boolean" as an alias. This changes the error messages to match
the actual type name and avoids confusing messages like "must be
of type boolean, boolean given".

This a followup to ce1d69a1f6, which
implements the same change for integer->int.
2018-02-04 23:09:40 +01:00
Gabriel Caruso
ce1d69a1f6 Use int instead of integer in type errors
PHP requires integer typehints to be written "int" and does not
allow "integer" as an alias. This changes type error messages to
match the actual type name and avoids confusing messages like
"must be of the type integer, integer given".
2018-02-04 19:08:23 +01:00
Stanislav Malyshev
3616b6b935 Cleanup some tests - remove unnecessary sections
Also unify credits - all are under --CREDITS-- now.
2018-02-04 02:21:40 -08:00
Gabriel Caruso
c6c9e71a5b Add missing SKIPIF sections 2018-02-03 13:54:34 +01:00
Nat Zimmermann
478af26d84 Update mb_preferred_mime_name tests 2018-01-26 22:25:18 +01:00
Nat Zimmermann
6fb78e3017 Add unknown encoding warning test for mb_encoding_aliases 2018-01-26 22:25:18 +01:00
Colin O'Dell
201930106d Add test for negative lengths in mb_strcut() 2017-11-22 22:47:55 +01:00
Colin O'Dell
830d87b86e Add tests for mb_language() 2017-11-22 22:47:55 +01:00
Dmitry Stogov
cb9d81ef4f Refactored recursion pretection 2017-10-06 01:34:50 +03:00
Nikita Popov
840b77c02e Merge branch 'PHP-7.2' 2017-08-04 22:20:11 +02:00
Nikita Popov
6b73b2d6eb Check for empty string in mb_ord() 2017-08-04 22:20:05 +02:00
Nikita Popov
4e4ec31e2e Merge branch 'PHP-7.2' 2017-08-04 13:02:44 +02:00
Nikita Popov
353f7bf461 Also check for invalid codepoints in mb_ord()
And return false in that case, instead of returning 0x3f...
2017-08-04 13:01:03 +02:00
Nikita Popov
5caf05f6c5 Merge branch 'PHP-7.2' 2017-08-03 22:41:15 +02:00
Nikita Popov
e53162a32b Return false on invalid codepoint in mb_chr()
Instead of returning the encoding of the current substitution
character. This allows a robust check for the failure case. The
substitution character (especially the default of "?") is also
a valid output of mb_chr() for a valid input (for "?" that would be
0x3f), so it's a bad choice for an error value.
2017-08-03 22:36:42 +02:00
Nikita Popov
41e9ba6333 Always use Unicode codepoints in mb_ord() and mb_chr()
Previously mb_chr() had two different encoding-dependent behaviors:
 * For "Unicode-encodings" it took a Unicode codepoint and returned
   its encoded representation.
 * Otherwise it returned a big-endian binary encoding of the passed
   integer.

Now the input is always interpreted as a Unicode codepoint. If
a big-endian binary encoding is what you want, you don't need
mbstring to implement that.
2017-08-03 22:14:00 +02:00
Nikita Popov
c98714f19e Merge branch 'PHP-7.2' 2017-08-03 21:57:35 +02:00
Nikita Popov
fb9bf5b64b Revert/fix substitution character fallback
The introduced checks were not correct in two respects:
 * It was checked whether the source encoding of the string matches
   the internal encoding, while the actually relevant encoding is
   the *target* encoding.
 * Even if the correct encoding is used, the checks are still too
   conservative. Just because something is not a "Unicode-encoding"
   does not mean that it does not map any non-ASCII characters.

I've reverted the added checks and instead adjusted mbfl_convert
to first try to use the provided substitution character and if
that fails, perform the fallback to '?' at that point. This means
that any codepoint mapped in the target encoding should now be
correctly supported and anything else should fall back to '?'.
2017-08-03 21:53:59 +02:00
Nikita Popov
3d948d77d1 Merge branch 'PHP-7.2' 2017-08-03 21:17:26 +02:00
Nikita Popov
a8a9e93e9a Revert/fix mb_substitute_character() codepoint checks
The introduced checks did not treat "non-Unicode" encodings correctly,
because they treated the passed integer as encoded in the internal
encoding in that case, while in actuality the substitute character
is always a Unicode codepoint.

Additionally checking the codepoint against the internal encoding
is not correct in any case, because the substitution character must
be mapped in the *target* encoding of the conversion, which does
not necessarily coincide with the internal encoding (the internal
encoding is the default *source* encoding, not *target* encoding).

This reverts the checks back to simple range checks, but in a way
that still resolves #69079: Characters outside the Basic
Multilingual Plane are now accepted and Surrogate Codepoints are
rejected. A distinction between UTF-8 and non-UTF-8 encodings is
not made for surrogate checks (as in the original patch), as
surrogates are always illegal on their own. Specifying a surrogate
as substitution character would only make sense if you could
specify a substitution string with more than one character --
however we do not support that.
2017-08-03 21:12:41 +02:00