Setting up an empty default handler is not only useless, but actually
harmful, since internal entity-references are not resolved anymore.
From the libexpat docs[1]:
| Setting the handler with this call has the side effect of
| turning off expansion of references to internally defined general
| entities. Instead these references are passed to the default
| handler.
[1] <https://www.xml.com/pub/1999/09/expat/reference.html#setdefhandler>
PHP requires boolean typehints to be written "bool" and disallows
"boolean" as an alias. This changes the error messages to match
the actual type name and avoids confusing messages like "must be
of type boolean, boolean given".
This a followup to ce1d69a1f6, which
implements the same change for integer->int.
PHP requires integer typehints to be written "int" and does not
allow "integer" as an alias. This changes type error messages to
match the actual type name and avoids confusing messages like
"must be of the type integer, integer given".
The issue is caused by an integer overflow when the `long` passed as
XML_OPTION_SKIP_TAGSTART is assigned to `xml_parser::toffset` which is
declared as `int`. We can simply work around this issue, by clipping
resulting negative values to 0 (and raising a notice in this case), because
the reasonable range for this value is certainly catered to by positive
`int`s.
However, there still remains the issue that `xml_parser::toffset` is later
added to `char *`s, which can cause OOB reads, so we make sure that the
upper bound never exceeds the strlen(). We eschew optimizing `SKIP_TAGSTART`
wrt. to the potentially duplicate strlen() call, because that code path is
unexpected anyway.
bug32001.phpt has a high failure rate for the submitted reports. According to
several samples it seems the iconv implementation of glibc 2.12 (released
2010-05) is the culprit. It seems appropriate to skip the test for such old
versions.
of reported malformed sequences). (Gustavo)
#Made a public interface for get_next_char/utf-8 in trunk to use in utf8_decode.
#In PHP 5.3, trunk's get_next_char was copied to xml.c because 5.3's
#get_next_char is different and is not prepared to recover appropriately from
#errors.
of reported malformed sequences). (Gustavo)
#Made a public interface for get_next_char/utf-8 in trunk to use in utf8_decode.
#In PHP 5.3, trunk's get_next_char was copied to xml.c because 5.3's
#get_next_char is different and is not prepared to recover appropriately from
#errors.