This merges all usages of emitting an offset TypeError into a new ZEND_API function
zend_illegal_container_offset(const zend_string* container, const zval *offset, int type);
Where the container should represent the type on which the access is attempted (e.g. string, array)
The offset zval that is used, where the error message will display its type
The type of access, which should be a BP_VAR_* constant, to get special message for isset/empty/unset
The length is passed to xmlStrndup(), which also adds 1, and adds a null
terminator past the end. It worked because the length is not actually
stored. Strings in libxml2 are null terminated. Passing the length just
avoids a call to strlen().
A few callers remove all children of a node. The way it was done in
node.c was unsafe, because it left nodep->last dangling. It just happens
to not crash if xmlNodeSetContent() is called immediately afterwards.
The manual refers to the DOM Level 3 Core spec which says:
"On setting, this creates a Text node with the unparsed contents of the
string. I.e. any characters that an XML processor would recognize as
markup are instead treated as literal text."
PHP is expanding entities when DOMAttr::value is set, which is
non-compliant and is a difference in behaviour compared to browser DOM
implementations.
So, when value is set, remove all children of the attribute node. Then
create a single text node and insert that as the only child of the
attribute.
Add tests.
* PHP-8.2:
Fix bug #77686: Removed elements are still returned by getElementById
Fix bug #81642: DOMChildNode::replaceWith() bug when replacing a node with itself
Fix bug #67440: append_node of a DOMDocumentFragment does not reconcile namespaces
* PHP-8.1:
Fix bug #77686: Removed elements are still returned by getElementById
Fix bug #81642: DOMChildNode::replaceWith() bug when replacing a node with itself
Fix bug #67440: append_node of a DOMDocumentFragment does not reconcile namespaces
From the moment an ID is created, libxml2's behaviour is to cache that element,
even if that element is not yet attached to the document. Similarly, only upon
destruction of the element the ID is actually removed by libxml2.
Since libxml2 has such behaviour deeply ingrained in the library, and uses the
cache for various purposes, it seems like a bad idea and lost cause to fight it.
Instead, we'll simply walk the tree upwards to check if the node is attached to
the document.
Closes GH-11369.
The test was amended from the original issue report. For the test:
Co-authored-by: php@deep-freeze.ca
The problem is that the regular dom_reconcile_ns() only works on a
single node. We actually have to reconciliate the whole tree in case a
fragment was added. This also required to move some code around such
that this special case could be handled separately.
Closes GH-11362.
* Implement iteration cache, item cache and length cache for node list iteration
The current implementation follows the spec requirement that the list
must be "live". This means that changes in the document must be
reflected in the existing node lists without requiring the user to
refetch the node list.
The consequence is that getting any item, or the length of the list,
always starts searching from the root element of the node list. This
results in O(n) time to get any item or the length. If there's a for
loop over the node list, this means the iterations will take O(n²) time
in total. This causes real-world performance issues with potential for
downtime (see GH-11308 and its references for details).
We fix this by introducing a caching strategy. We cache the last
iterated object in the iterator, the last requested item in the node
list, and the last length computation. To invalidate the cache, we
simply count the number of modifications made to the containing
document. If the modification number does not match what the number was
during caching, we know the document has been modified and the cache is
invalid. If this ever overflows, we saturate the modification number and
don't do any caching anymore. Note that we don't check for overflow on
64-bit systems because it would take hundreds of years to overflow.
Fixes GH-11308.
warning: check of ‘*resource.scheme’ for NULL after already dereferencing it [-Wanalyzer-deref-before-check]
186 | use_ssl = resource->scheme && (ZSTR_LEN(resource->scheme) > 4) && ZSTR_VAL(resource->scheme)[4] == 's';
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Although resource->scheme is already dereferenced on line 163 in the IF condition
It's a type confusion bug. `zend_make_callable` may change the function name
of the fci to become an array, causing a crash in debug mode on
`zval_ptr_dtor_str(&fci.function_name);` in `dom_xpath_ext_function_php`.
On a production build it doesn't crash but only causes a leak, because
the array elements are not destroyed, only the array container itself
is. We can use the nogc variant because it cannot contain cycles, the
potential array can only contain 2 strings.
Closes GH-11350.
I was surprised to see that getting the stricterror property showed in
in the Callgrind profile of some tests. Turns out we sometimes allocate
them. Fix this by returning the default in case no changes were made yet.
Closes GH-11345.
* PHP-8.2:
Fix DOMElement::append() and DOMElement::prepend() hierarchy checks
Fix spec compliance error for DOMDocument::getElementsByTagNameNS
Fix GH-11336: php still tries to unlock the shared memory ZendSem with opcache.file_cache_only=1 but it was never locked
Fix GH-11338: SplFileInfo empty getBasename with more than one slash
* PHP-8.1:
Fix DOMElement::append() and DOMElement::prepend() hierarchy checks
Fix spec compliance error for DOMDocument::getElementsByTagNameNS
Fix GH-11336: php still tries to unlock the shared memory ZendSem with opcache.file_cache_only=1 but it was never locked
Fix GH-11338: SplFileInfo empty getBasename with more than one slash
We could end up in an invalid hierarchy, resulting in infinite loops and
eventual crashes if we don't check for the DOM hierarchy validity.
Closes GH-11344.
Spec link: https://dom.spec.whatwg.org/#concept-getelementsbytagnamens
Spec says we should match any namespace when '*' is provided. This was
however not the case: elements that didn't have a namespace were not
returned. This patch fixes the error by modifying the namespace check.
Closes GH-11343.
I chose to check for the value of lock_file instead of checking the
file_cache_only, because it is probably a little bit faster and we're
going to access the lock_file variable anyway. It's also more generic.
Closes GH-11341.
Not explicitly documenting the possibility of returning DOMElement causes
the Intelephense linter (a popular PHP linter with ~9 million downloads:
https://marketplace.visualstudio.com/items?itemName=bmewburn.vscode-intelephense-client)
to think this code is bad:
$xp->query("whatever")->item(0)->getAttribute("foo");
DOMNode does not have getAttribute (while DOMElement does).
Documenting the DOMElement return type should fix Intelephense's linter.
Closes GH-11342.
We can't directly call xmlNodeSetContent, because it might encode the string
through xmlStringLenGetNodeList for types
XML_DOCUMENT_FRAG_NODE, XML_ELEMENT_NODE, XML_ATTRIBUTE_NODE.
In these cases we need to use a text node to avoid the encoding.
For the other cases, we *can* rely on xmlNodeSetContent because it is either
a no-op, or handles the content without encoding and clears the properties
field if needed.
The test was taken from the issue report, for the test:
Co-authored-by: ThomasWeinert <thomas@weinert.info>
Closes GH-10245.