* PHP-8.3:
Fix GH-16777: Calling the constructor again on a DOM object after it is in a document causes UAF
Fix GH-16808: Segmentation fault in RecursiveIteratorIterator->current() with a xml element input
We need to perform all sanity checks before doing any modification.
I don't have a reliable and easy test for this on 8.2, but I have one
for 8.4.
Closes GH-16598.
This is already forbidden by libxml, but this condition isn't properly
checked; so the return value and lack of error makes it seem like it
worked while it actually didn't. Furthermore, this can break assumptions
and assertions later on.
Closes GH-16596.
The invalid parent condition can actually happen because PHP's DOM is
allows to get children of e.g. attributes; something normally not
possible.
Closes GH-16597.
libxml2 2.13.0 introduced some relevant changes regarding the treatment
of file paths on Windows[1]. Thus we un-xfail bug69753.phpt and its
companion, and we adjust dom004.phpt. And we also disable the
workaround for erroneous file:/ URIs on Windows.
[1] <8ab1b122c4>
Closes GH-16536.
If the input contains NUL bytes then the length doesn't match the actual
duplicated string's length. Note that libxml can't handle this properly
anyway so we just reject NUL bytes and too long strings.
Closes GH-16467.
3 issues:
1) RETURN_NULL() was used via the macro NODE_GET_OBJ(), but the function
returns false on failure and cannot return null according to its
stub.
2) The struct layout of the different implementors of libxml only
guarantees overlap between the node pointer and the document
reference, so accessing the std zend_object may not work.
3) DOC_GET_OBJ() wasn't using ZSTR_VAL().
Closes GH-16307.
There are three connected subtle issues:
1) The fast path didn't correctly handle the case where the decoder
requests more data. This caused a bogus additional replacement
sequence to be outputted when encountering an incomplete sequence at
the edges of a buffer.
2) The finishing of decoding incorrectly assumed that the fast path
cannot be in a state where the last few bytes were an incomplete
sequence, but this is not true as shown by test 08.
3) The finishing of decoding could output bytes twice because it called
into dom_process_parse_chunk() twice without clearing the decoded
data. However, calling twice is not even necessary as the entire
buffer cannot be filled up entirely.
Closes GH-16226.
The reference counts of the internal document pointer are mismanaged.
In the case of fragments the refcount may be increased too much, while
for other cases the document reference may not be applied to all
children.
This bug existed for a long time and this doesn't reproduce (easily)
on 8.2 due to other bugs. Furthermore 8.2 will enter security mode soon,
and this change may be too risky.
Fixes GH-16150.
Fixed GH-16152.
Closes GH-16178.
Unfortunately, old DOM allows attributes to be used as parent nodes.
Only text nodes and entities are allowed as children for these types of
nodes, because that's the constraint DOM and libxml give us.
Closes GH-16156.
We can use `memcmp()` directly and skip some of the logic handling
in `zend_binary_strcmp()`. `perf record` shows a reduction for
`dom_html5_serializes_as_void()` from 3.12% to 0.77%.
This never did anything in lower versions, but on master this crashes
because the virtual properties don't have backing storage. Just forbid
it since it was useless to begin with.
Closes GH-15891.