Trying to allocate a `zend_string` with a length only slighty smaller
than `SIZE_MAX` causes an integer overflow; we make sure that this
doesn't happen by catering to the maximal overhead of a `zend_string`.
Closes GH-7597.
We backport the respective upstream fix[1] to our bundled pcre2lib plus
the follow-up fix[2] for a functional regression.
[1] <dc5f966635>
[2] <e7af7efaa1>
Closes GH-7573.
Trimming a potentially over-allocated string appears to be reasonable,
so we drop the condition altogether.
We also re-allocate twice the size needed in the first place, and not
roughly tripple the size.
Closes GH-7231.
The way to fix it is to disable certain match start optimizaions. The
observed performance impact appears negligible ATM, compared to the
functional regression revealed.
A possible side effect might occur if a pattern uses (*COMMIT) or
(*MARK), which is however not a very broadly used syntax in PHP. Still
this should be observed and handled by possibly adding a possibility to
reverse PCRE2_NO_START_OPTIMIZE on the user side.
One test shows a behavior change, where instead of int 0 the match
would produce an error and return false. Except strict comparison
is used, this should be acceptable.
Signed-off-by: Anatol Belski <ab@php.net>
(cherry picked from commit d188ca7688)
Signed-off-by: Anatol Belski <ab@php.net>
The compile context is shared between patterns, so we need to set
the character tables unconditionally in case we switched from
a non-C locale to the C locale.
PCRE only validates the string starting from the start offset
(minus maximum look-behind, but let's ignore that), so we can
only remember that the string is fully valid UTF-8 is the original
start offset is zero.
We need not just the whole string to be UTF-8, but the start
position to be on a character boundary as well. Check this by
looking for a continuation byte.
A new function `pcre_get_compiled_regex_cache_ex()` is introduced,
which allows to compile regexp pattern using the "C" locale instead
of a current locale.
This will be needed to replace setlocale() usage in fileinfo,
which is not thread-safe.
Print a more informative message that indicates that this is
likely a permission issue, and also indicate that pcre.jit=0
can be used to work around it.
Also automatically disable the JIT, so that this message is
only shown once.
See bug #78630.
This is intended to fix the primary issue from bug #77260.
Prior to macOS 10.14 multiple MAP_JIT segments were not permitted,
leading to mmap failures and corresponding "no more memory" errors
on macOS 10.13.