In master I use ZEND_DIAGNOSTIC_IGNORED_START, but that doesn't exist on
8.2 or 8.3 (8.3 has a similar macro though).
So to unbreak CI I just made a variation of this directly in the
php_libxml.h header.
See 683e787860 (commitcomment-134301083)
Fixes GHSA-3qrf-m4j2-pcrr.
To parse a document with libxml2, you first need to create a parsing context.
The parsing context contains parsing options (e.g. XML_NOENT to substitute
entities) that the application (in this case PHP) can set.
Unfortunately, libxml2 also supports providing default set options.
For example, if you call xmlSubstituteEntitiesDefault(1) then the XML_NOENT
option will be added to the parsing options every time you create a parsing
context **even if the application never requested XML_NOENT**.
Third party extensions can override these globals, in particular the
substitute entity global. This causes entity substitution to be
unexpectedly active.
Fix it by setting the parsing options to a sane known value.
For API calls that depend on global state we introduce
PHP_LIBXML_SANITIZE_GLOBALS() and PHP_LIBXML_RESTORE_GLOBALS().
For other APIs that work directly with a context we introduce
php_libxml_sanitize_parse_ctxt_options().
It's possible to categorise the failures into 2 categories:
- Changed error message. In this case we either duplicate the test and
modify the error message. Or if the change in error message is
small, we use the EXPECTF matchers to make the test compatible with both
old and new versions of libxml2.
- Missing warnings. This is caused by a change in libxml2 where the
parser started using SAX APIs internally [1]. In this case the
error_type passed to php_libxml_internal_error_handler() changed from
PHP_LIBXML_ERROR to PHP_LIBXML_CTX_WARNING because it internally
started to use the SAX handlers instead of the generic handlers.
However, for the SAX handlers the current input stack is empty, so
nothing is actually printed. I fixed this by falling back to a
regular warning without a filename & line number reference, which
mimicks the old behaviour. Furthermore, this change now also shows
an additional warning in a test which was previously hidden.
[1] 9a82b94a94
Closes GH-11162.
The libxml based XML functions accepting a filename actually accept
URIs with possibly percent-encoded characters. Percent-encoded NUL
bytes lead to truncation, like non-encoded NUL bytes would. We catch
those, and let the functions fail with a respective warning.
This version of libxml introduced quite a few changes. Most of
them are differences in error reporting, while some also change
behavior, e.g. null bytes are no longer supported and xinclude
recursion is limited.
Closes GH-7030. Closes GH-7046.
Co-authored-by: Nikita Popov <nikic@php.net>
1. Update: http://www.php.net/license/3_01.txt to https, as there is anyway server header "Location:" to https.
2. Update few license 3.0 to 3.01 as 3.0 states "php 5.1.1, 4.1.1, and earlier".
3. In some license comments is "at through the world-wide-web" while most is without "at", so deleted.
4. fixed indentation in some files before |
The `encoding` attribute of the XML declaration is optional; it is good
practice to use external encoding information where available if it is
missing. Thus, we check for `charset` info of `Content-Type` headers,
and see whether the encoding is supported.
We cater to trailing parameters and quoted-strings, but not to escaped
backslashes and quotes in quoted-strings, since no known character
encoding contains these anyway.
Co-authored-by: Michael Wallner <mike@php.net>
Closes GH-6747.
We're starting to see a mix between uses of zend_bool and bool.
Replace all usages with the standard bool type everywhere.
Of course, zend_bool is retained as an alias.
This method was used to protect code against XXE processing attacks.
Since PHP now requires libxml >= 2.9.0 external entity loading no longer
needs to be disabled to prevent these attacks. It is disabled by default.
Also, the method has an unwanted side effect that causes a lot of
confusion: Parsing XML data from resources like files is no longer possible.
Closes GH-5867.
Since libxml version 2.9.0 external entity loading is disabled by default.
Bumping the version requirement means that XML processing in PHP is no
longer vulnerable to XXE processing attacks by default.