1
0
mirror of https://github.com/php/php-src.git synced 2026-04-19 05:51:02 +02:00
Commit Graph

127 Commits

Author SHA1 Message Date
George Peter Banyard
363d87f256 Fix [-Wmissing-field-initializers] compiler warning in mbstring
Add missing NULL pointer for mbfl_convert_vtbl struct.
2020-02-21 13:19:09 +01:00
Nikita Popov
7d170eb295 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix shift ub in mbstring
  Restore digit check in mb_decode_numericentity()
2020-01-30 10:08:21 +01:00
Nikita Popov
9aadcb18e1 Restore digit check in mb_decode_numericentity()
I replaced it with a multiplication overflow check in
18599f9c52. However, we need both,
because the code for restoring the number can't handle numbers
with many leading zeros right now and I don't feel like teaching it.
2020-01-30 10:07:01 +01:00
Nikita Popov
b2c8abe951 Merge branch 'PHP-7.4'
* PHP-7.4:
  Better overflow check for entity decoding
2020-01-29 16:08:55 +01:00
Nikita Popov
18599f9c52 Better overflow check for entity decoding
Check for multiplication overflow rather than number of digits.
2020-01-29 16:08:46 +01:00
Nikita Popov
bc32cce6a2 Merge branch 'PHP-7.4'
* PHP-7.4:
  Fix recovery of large entities in mb_decode_numericentity()
2020-01-29 11:49:27 +01:00
Nikita Popov
91f878779c Fix recovery of large entities in mb_decode_numericentity()
Make sure we don't overflow the integer.
2020-01-29 11:48:34 +01:00
Nikita Popov
c2d0a413cc Also use zend_memnrstr in mbfl_strpos 2020-01-24 11:42:54 +01:00
Nikita Popov
d504ad5717 Base mbfl_strpos on zend_memnstr
The same algorithm is also used by zend_memnstr, but it also has
a fast-path for short strings / needles, where a more naive
search performs better.
2020-01-24 11:29:34 +01:00
Nikita Popov
73b31302ed Extract calculation of offset from pointer 2020-01-24 11:15:58 +01:00
Nikita Popov
ce6169832f Move offset error checking into mbfl_strpos
This avoids calculating the full length only in order to validate
the offset, as mbfl_strpos needs to find the offset internally
anyway.
2020-01-24 10:50:02 +01:00
Nikita Popov
0f6d223ddb Add #defines for mbfl_strpos error conditions 2020-01-24 10:02:41 +01:00
George Peter Banyard
483efc7e50 Allow empty needles in mb_strpos and mb_strstr function family.
MBstring analogous implementation to 6d578482a9

Closes GH-4977
2020-01-07 22:53:35 +01:00
George Peter Banyard
0a173abc84 Revert "Remove dead code in libmbfl, memory device"
Stop trusting CLion's automatic dead code detection because it seems to be wrong
more often than not.
This reverts commit 612c86db3e.
2019-12-13 18:43:14 +01:00
George Peter Banyard
612c86db3e Remove dead code in libmbfl, memory device 2019-12-13 17:58:37 +01:00
Nikita Popov
7b152990b6 Don't short-circuit MBFL_OUTPUTFILTER_ILLEGAL_MODE_NONE
Make sure we always go through mbfl_filt_conv_illegal_output(), so
that the number of illegal characters gets counted.
2019-08-09 16:33:21 +02:00
Peter Kokot
9219e56063 Remove redundant memory.h file
The memory.h file is part of the pre-C89 era and is on today's systems
only a simple wrapper for including the final string.h header file.
2019-05-11 19:47:54 +02:00
Peter Kokot
92ac598aab Remove local variables
This patch removes the so called local variables defined per
file basis for certain editors to properly show tab width, and
similar settings. These are mainly used by Vim and Emacs editors
yet with recent changes the once working definitions don't work
anymore in Vim without custom plugins or additional configuration.
Neither are these settings synced across the PHP code base.

A simpler and better approach is EditorConfig and fixing code
using some code style fixing tools in the future instead.

This patch also removes the so called modelines for Vim. Modelines
allow Vim editor specifically to set some editor configuration such as
syntax highlighting, indentation style and tab width to be set in the
first line or the last 5 lines per file basis. Since the php test
files have syntax highlighting already set in most editors properly and
EditorConfig takes care of the indentation settings, this patch removes
these as well for the Vim 6.0 and newer versions.

With the removal of local variables for certain editors such as
Emacs and Vim, the footer is also probably not needed anymore when
creating extensions using ext_skel.php script.

Additionally, Vim modelines for setting php syntax and some editor
settings has been removed from some *.phpt files.  All these are
mostly not relevant for phpt files neither work properly in the
middle of the file.
2019-02-03 21:03:00 +01:00
Nikita Popov
ad6738e886 Merge branch 'PHP-7.3' 2018-10-17 12:51:17 +02:00
Nikita Popov
1151554668 Remove the "auto" encoding
"auto" is only meaningful in functions which accept an encoding
*list* and support encoding detection. These functions have
explicit checks for "auto". It cannot be used as a standalone
encoding in any meaningful capacity, so I'm dropping it entirely.
2018-10-17 12:50:24 +02:00
Nikita Popov
e4ccc1a29f Merge branch 'PHP-7.3' 2018-10-17 12:40:39 +02:00
Nikita Popov
56665a1b17 Fixed bug #77025
Implements 8bit conversions equivalently to iso-8859-1 conversions.
This seems quite dubious to me, but seems to match the previous
behavior.

It might make more sense to map the characters into a private area
instead, so that the 8bit encoding is treated as binary data with
no case conversions (including no case conversions in the ascii
range).
2018-10-17 12:38:31 +02:00
Peter Kokot
1ad08256f3 Sync leading and final newlines in source code files
This patch adds missing newlines, trims multiple redundant final
newlines into a single one, and trims redundant leading newlines.

According to POSIX, a line is a sequence of zero or more non-' <newline>'
characters plus a terminating '<newline>' character. [1] Files should
normally have at least one final newline character.

C89 [2] and later standards [3] mention a final newline:
"A source file that is not empty shall end in a new-line character,
which shall not be immediately preceded by a backslash character."

Although it is not mandatory for all files to have a final newline
fixed, a more consistent and homogeneous approach brings less of commit
differences issues and a better development experience in certain text
editors and IDEs.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
[2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2
[3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2
2018-10-14 12:56:38 +02:00
Peter Kokot
1c850bfcca Sync leading and final newlines in source code files
This patch adds missing newlines, trims multiple redundant final
newlines into a single one, and trims redundant leading newlines.

According to POSIX, a line is a sequence of zero or more non-' <newline>'
characters plus a terminating '<newline>' character. [1] Files should
normally have at least one final newline character.

C89 [2] and later standards [3] mention a final newline:
"A source file that is not empty shall end in a new-line character,
which shall not be immediately preceded by a backslash character."

Although it is not mandatory for all files to have a final newline
fixed, a more consistent and homogeneous approach brings less of commit
differences issues and a better development experience in certain text
editors and IDEs.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
[2] https://port70.net/~nsz/c/c89/c89-draft.html#2.1.1.2
[3] https://port70.net/~nsz/c/c99/n1256.html#5.1.1.2
2018-10-14 12:55:24 +02:00
Peter Kokot
37c329d715 Trim trailing whitespace in source code files 2018-10-13 14:17:28 +02:00
Peter Kokot
3362620b5f Trim trailing whitespace in source code files 2018-10-13 14:16:33 +02:00
Nikita Popov
aec6421409 Merge branch 'PHP-7.3' 2018-10-02 16:14:36 +02:00
Nikita Popov
9cfd8f43c2 Don't fall back to vtbl_pass if no matching vtbl found
If we don't know how to convert between two encodings, make sure
we error instead of ignoring the issue.

Explicitly use vtbl_pass if we are round-tripping wchar->wchar or
8bit->8bit. Fingers crossed that nothing else relies on the
vtbl_pass fallback...
2018-10-02 16:07:22 +02:00
Peter Kokot
d3ca28f569 Remove HAVE_STRING_H
The C89 standard and later defines the `<string.h>` header as part of
the standard headers [1] and on current systems it is always present.

Code included also `<strings.h>` header as an alterinative in some
files. This kind of check was relevant on some older systems where the
`<strings.h>` file included definitions for the C89 compliant
`<string.h>`. Today such alternative check is not required anymore. The
`<strings.h>` file is part of the POSIX definition these days.

Also Autoconf suggests doing this and relying on C89 or above [2] and [3].

This patch also cleans few unused `<strings.h>` inclusions in the libmbfl.

[1]: https://port70.net/~nsz/c/c89/c89-draft.html#4.1.2
[2]: http://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
[3]: https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-18 05:32:08 +02:00
Peter Kokot
7dd62811ce Remove HAVE_STDLIB_H
The C89 and later standard defines the `<stdlib.h>` header as part of
the standard headers [1] and on current systems it is always present
and the `HAVE_STDLIB_H` symbol can be removed.

Also Autoconf suggests doing this and relying on C89 or above [2] and [3].

[1] https://port70.net/~nsz/c/c89/c89-draft.html#4.1.2
[2] http://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
[3] https://www.gnu.org/software/autoconf/manual/autoconf-2.69/autoconf.html
2018-09-16 20:53:53 +02:00
Peter Kokot
6c1ff61a36 Remove HAVE_STDDEF_H
The `<stddef.h>` header file is part of the standard C89 headers [1] and
on current systems there is no need for a manual check if header is
present.

Since PHP requires at least C89 the `HAVE_STDDEF_H` symbol isn't defined
by Autoconf anywhere else anymore [2] and accross the PHP source code
the header is included unconditionally already.

This patch syncs this also for the bundled libmbfl which is maintaned as
a fork in php-src.

Refs:
[1] https://port70.net/~nsz/c/c89/c89-draft.html#4.1.2
[2] https://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
2018-09-05 11:51:19 +02:00
Peter Kokot
255e29f3bc Remove unused libmbfl build system related files
PHP build system already builds necessary files also from libmbfl
directory using the mbstring config.m4 file.
2018-07-29 10:07:32 +02:00
Peter Kokot
8d3f8ca12a Remove unused Git attributes ident
The $Id$ keywords were used in Subversion where they can be substituted
with filename, last revision number change, last changed date, and last
user who changed it.

In Git this functionality is different and can be done with Git attribute
ident. These need to be defined manually for each file in the
.gitattributes file and are afterwards replaced with 40-character
hexadecimal blob object name which is based only on the particular file
contents.

This patch simplifies handling of $Id$ keywords by removing them since
they are not used anymore.
2018-07-25 00:53:25 +02:00
Nikita Popov
a7101415cb Merge branch 'PHP-7.2' 2018-06-28 23:06:08 +02:00
Nikita Popov
00c0d7702c Merge branch 'PHP-7.1' into PHP-7.2 2018-06-28 23:05:09 +02:00
Marcus Schwarz
bf5a802f5a Fixed bug #76532 (excessive memory usage in mb_strimwidth) 2018-06-28 23:02:28 +02:00
Joe Watkins
c898349e16 fixes PR #2722, no clue how it broke ... 2017-09-06 11:13:27 +01:00
shinemotec@gmail.com
9b77615608 fixed mbstring extension compiled broken with archlinux 2017-09-06 09:50:08 +01:00
Nikita Popov
f24db7686e Optimize mb_ord()
Don't perform a full encoding conversion into UCS4-BE, instead only
perform an input conversion into a wchar device.
2017-08-04 22:22:58 +02:00
Nikita Popov
633a471ba0 Store input and output filters in mbfl encodings
For functions like mb_chr() and mb_ord() just looking up the
input/output filter for the encoding dominates the runtime. This
commit stores the input/output filter for an encoding in the
mbfl encoding structure, so it can be looked up directly, rather
than scanning through filter function lists.
2017-08-04 22:22:58 +02:00
Nikita Popov
e20fbd43ba Separate mbfl filters into three categories
Input filters, output filters and special filters.
2017-08-04 22:22:58 +02:00
Nikita Popov
c98714f19e Merge branch 'PHP-7.2' 2017-08-03 21:57:35 +02:00
Nikita Popov
fb9bf5b64b Revert/fix substitution character fallback
The introduced checks were not correct in two respects:
 * It was checked whether the source encoding of the string matches
   the internal encoding, while the actually relevant encoding is
   the *target* encoding.
 * Even if the correct encoding is used, the checks are still too
   conservative. Just because something is not a "Unicode-encoding"
   does not mean that it does not map any non-ASCII characters.

I've reverted the added checks and instead adjusted mbfl_convert
to first try to use the provided substitution character and if
that fails, perform the fallback to '?' at that point. This means
that any codepoint mapped in the target encoding should now be
correctly supported and anything else should fall back to '?'.
2017-08-03 21:53:59 +02:00
Nikita Popov
445e13b149 Add MBFL_SUBSTR_TO_END mode to mbfl_substr
This takes the substr from the offset to the end of the string.
This avoids pointless searching for the end position and also
saves us a length calculation in the strstr family of functions.
2017-07-23 23:17:12 +02:00
Anatol Belski
7496bad2ac adjust datatype, used for position handling 2017-07-23 16:37:31 +02:00
Nikita Popov
42ff1aa86c Fix overflow checks in mbfl_memory_device
Also prune out some duplicate code and use strlen() and memcpy()
instead of ad-hoc reimplementations. Remove multiplications by
sizeof(unsigned char), which wrongly imply that this can be
anything but 1.
2017-07-23 11:55:43 +02:00
Anatol Belski
0eea41b6c4 add missing header 2017-07-23 00:23:02 +02:00
Anatol Belski
61784bcb71 sync libmbfl allocator with the size_t changes 2017-07-22 23:53:00 +02:00
Anatol Belski
e0825ec60f Mitigation for ssize_t issue in 22a5f554a8
and some more
2017-07-22 22:34:16 +02:00
Nikita Popov
a319063aae Only write single terminating byte
As far as I could determine this is sufficient. It avoids
reallocating the buffer, if it was perfectly allocated beforehand.
2017-07-20 21:41:52 +02:00