The following sequence of actions was happening which caused a null
pointer dereference:
1. debug_backtrace() returns an array
2. The concatenation to $c will transform the array to a string via
`zval_get_string_func` for op2 and output a warning.
Note that zval op1 is of type string due to the first do-while
sequence.
3. The warning of an implicit "array to string conversion" triggers
the ob_start callback to run. This code transform $c (==op1) to a long.
4. The code below the 2 do-while sequences assume that both op1 and op2
are strings, but this is no longer the case. A dereference of the
string will therefore result in a null pointer dereference.
The solution used here is to work with the zend_string directly instead
of with the ops.
For the tests:
Co-authored-by: changochen1@gmail.com
Co-authored-by: cmbecker69@gmx.de
Co-authored-by: yukik@risec.co.jp
Closes GH-10049.
In 6fc8d014df, pakutoma added specialized validity checking functions
for some legacy text encodings like ISO-2022-JP and UTF-7. These
check functions perform a more strict validity check than the encoding
conversion functions for the same text encodings. For example, the
check function for ISO-2022-JP verifies that the string ends in the
correct state required by the specification for ISO-2022-JP.
These check functions are already being used to make detection of text
encoding more accurate when 'strict' detection mode is enabled.
However, since the default is 'non-strict' detection (a bad API design
but we're stuck with it now), most users will not benefit from
pakutoma's work. I was previously reluctant to enable this new logic
for non-strict detection mode. My intention was to reduce the scope of
behavior changes, since almost *any* behavior change may affect *some*
user in a way we don't expect.
However, we definitely have users whose (production) code was broken
by the changes I made in 28b346bc06, and enabling pakutoma's check
functions for non-strict detection mode would un-break it. (See
GH-10192 as an example.) The added checks do also make sense.
In non-strict detection mode, we will not immediately reject candidate
encodings whose validity check function returns false; but they will
be much less likely to be selected. However, failure of the validity
check function is weighted less heavily than an encoding error detected
by the encoding conversion function.
The documentation for mb_detect_encoding says that this function
"Detects the most likely character encoding for string `string` from an
ordered list of candidates".
Prior to 28b346bc06, mb_detect_encoding did not really attempt to
determine the "most likely" text encoding for the input string. It
would just return the first candidate encoding for which the string was
valid. In 28b346bc06, I amended this function so that it uses heuristics
to try to guess which candidate encoding is "most likely".
However, the caller did not have any way to indicate which candidate
text encoding(s) they consider to be more likely, in case the
heuristics applied are inconclusive. In the language of Bayesian
probability, there was no way for the caller to indicate their 'prior'
assignment of probabilities.
Further, the documentation for mb_detect_encoding also says that the
second parameter `encodings` is "a list of character encodings to try,
in order". The documentation clearly implies that the order of
the `encodings` argument should be significant.
Therefore, amend mb_detect_encoding so that while it still uses
heuristics to guess the most likely text encoding for the input string,
it favors those which are earlier in the list of candidate encodings.
One complication is that many callers of mb_detect_encoding use it
in this way:
mb_detect_encoding($string, mb_list_encodings());
In a majority of cases, this is bad code; mb_detect_encoding will both
be much slower and the results will be less reliable than if a smaller
list of candidates is used. However, since such code already exists and
people are using it in production, we should not unnecessarily break it.
The order of candidate encodings obviously does not express any prior
belief of which candidates are more likely in this case, and treating
it as if it did will degrade the accuracy of the result.
Since mb_list_encodings now returns a single, immutable array on each
call, we can avoid that problem by turning off the new behavior when
we receive the array of encodings returned by mb_list_encodings.
This implementation means that if the user does this:
$a = mb_list_encodings();
mb_detect_encoding($string, $a);
...then the order of candidate encodings will not be considered.
However, if the user explicitly initializes their own array of all
supported legacy text encodings, then the order *will* be considered.
The other functions which also follow this new behavior are:
• mb_convert_variables
• mb_convert_encoding (when multiple candidate input encodings are
listed)
Other places where "detection" (or really "guessing") of text encoding
may be performed include:
• mb_send_mail
• Zend engine, when determining the encoding of a PHP script
• mbstring processing of HTTP request contents, when http_input INI
parameter is set to a list
In these cases, the new logic based on order of candidate encodings
is *not* enabled. It *might* be logical to consider the order of
candidate encodings in some or all of these cases, but I'm not sure if
that is true, so it seems wiser to avoid more behavior changes than is
necessary. Further, ever since the new encoding detection heuristics
were implemented in 28b346bc06, we have not received any complaints of
user code being broken in these areas. So I am reluctant to "fix what
isn't broken".
Well, some might say that applying the new detection heuristics
to mb_send_mail, etc. in 28b346bc06 was "fixing what wasn't broken",
but (cough cough) I don't have any comment on that...
This will allow us to easily check in other mbstring functions if the
list of all supported encodings, returned by mb_list_encodings, is
passed in as input to another function.
Co-authored-by: Ilija Tovilo <ilija.tovilo@me.com>
We're setting the encoding from PHP_FUNCTION(mb_strpos), but mbfl_strpos would
discard it, setting it to mbfl_encoding_pass, making zend_memnrstr fail due to a
null-pointer exception.
Fixes GH-11217
Closes GH-11220
Normally, we add classes without parents (and no interfaces or traits) directly
to the class map, early binding the class. However, if the same class has
already been registered, we would instead just add a ZEND_DECLARE_CLASS
instruction and let the handler throw a duplicate class declaration exception.
However, with opcache, if on the next request the files are included in the
opposite order, we won't perform early binding. To fix this, create a
ZEND_DECLARE_CLASS_DELAYED instruction instead and handle classes without
parents accordingly, skipping any linking for classes that are already linked in
delayed early binding.
Fixes GH-8846
Once code is emitted to JIT buffer, hint the hardware to
demote the corresponding cache lines to more distant level
so other CPUs can access them more quickly.
This gets nearly 1% performance gain on our workload.
Signed-off-by: Xue,Wang <xue1.wang@intel.com>
Signed-off-by: Tao,Su <tao.su@intel.com>
Signed-off-by: Hu,chen <hu1.chen@intel.com>
This is to prevent after free accessing of the child event that might
happen when child is killed and the message is delivered at that same
time.
Also fixes GH-10889 and properly fixes GH-8517 that was not previously
fixed correctly.
php_stream_read() may return less than the requested amount of bytes by
design. This patch introduces a static function for exif which reads
from the stream in a loop until all the requested bytes are read.
For the test: Co-authored-by: dotpointer
Closes GH-10924.
If we bind the class to the runtime slot even if we're not the ones who have
performed early binding we'll miss the redeclaration error in the
ZEND_DECLARE_CLASS_DELAYED handler.
Closes GH-11226
In older versions of GCC (<=4.5) designated initializers would not accept member
names nested inside anonymous structures. Instead, we need to use a positional
member wrapped in {}.
Fixes GH-11063
Closes GH-11212