If nothing was added to a smart_str, the interned empty string is
returned, and therefore ZVAL_NEW_STR is wrong as it'll set the
REFCOUNTED flag.
Closes GH-20773.
* uri: Use the “includes credentials” rule for WhatWg user/password getters
The URL serializing algorithm from the WHATWG URL Standard uses an “includes
credentials” rule to decide whether or not to include the `@` in the output,
indicating the presence of a userinfo component in RFC 3986 terminology. Use
this rule to determine whether or not an empty username or password should be
returned as the empty string (present but empty) or NULL (not present).
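As a rough illustration of the rule (not the actual ext/uri code), the getters can be sketched in plain C. The struct and the `sketch_*` names are hypothetical, and the sketch assumes that username and password are always stored as strings, possibly empty, as in the WHATWG URL model:

```c
#include <stddef.h>

/* Hypothetical, simplified URL record: both fields are always non-NULL
 * strings and may be empty. */
struct sketch_url {
    const char *username;
    const char *password;
};

/* WHATWG "includes credentials": username or password is not the empty string. */
static int sketch_includes_credentials(const struct sketch_url *url)
{
    return url->username[0] != '\0' || url->password[0] != '\0';
}

/* NULL means "not present"; a non-NULL result may still be "" (present but empty). */
static const char *sketch_username_read(const struct sketch_url *url)
{
    return sketch_includes_credentials(url) ? url->username : NULL;
}

static const char *sketch_password_read(const struct sketch_url *url)
{
    return sketch_includes_credentials(url) ? url->password : NULL;
}
```

With this sketch, a URL that only carries a password reports an empty (but present) username, while a URL without any userinfo reports NULL for both.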
* uri: Use ZVAL_STRINGL_FAST in `whatwg_(username|password)_read()`
This nicely sidesteps the undefined behavior of passing a `(NULL, 0)` pair
without needing manual logic.
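For context, the "manual logic" being avoided looks roughly like the following plain C sketch. `sketch_copy_bytes()` is a hypothetical helper and not part of the Zend API; it only shows why the zero-length case needs to be handled before the pointer is handed to `memcpy()`:

```c
#include <stddef.h>
#include <string.h>

/* Copies len bytes from src into dst and NUL-terminates the result.
 * dst must have room for len + 1 bytes. For len == 0 the source pointer
 * is never passed to memcpy(), so a (NULL, 0) pair is safe. */
static void sketch_copy_bytes(char *dst, const char *src, size_t len)
{
    if (len == 0) {
        dst[0] = '\0'; /* src may be NULL here */
        return;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
}
```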
* NEWS
It makes sense to restrict the types used for $errors.
This can also improve the types for static analysis tools as they can
now rely on the array being a list of this class type.
Closes GH-19781.
* uri: Fix handling of the `errors == NULL && !silent` case for uri_parser_whatwg
Previously, when `errors` was `NULL`, the `errors` pointer was used to set the
`$errors` property when throwing the exception, leading to a crash. Use a local
zval to pass the errors to the Exception and copy it into the `errors` input
when it is non-`NULL`.
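The pattern can be sketched in plain C as follows. This is an illustration under assumptions, not the ext/uri code: `struct sketch_errors`, `sketch_parse()` and `sketch_last_reported` (standing in for the thrown exception) are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for the collected parse errors. */
struct sketch_errors {
    int count;
};

/* Stand-in for "the exception": records the error count it was given. */
static int sketch_last_reported = -1;

/* errors_out is optional and may be NULL. */
static int sketch_parse(const char *input, struct sketch_errors *errors_out, int silent)
{
    struct sketch_errors local = { 0 };
    int failed = (input == NULL || input[0] == '\0');

    if (failed) {
        local.count = 1;
    }
    if (failed && !silent) {
        /* Report from the local copy; errors_out is never dereferenced here. */
        sketch_last_reported = local.count;
    }
    if (errors_out != NULL) {
        *errors_out = local; /* copy out only when the caller asked for it */
    }
    return failed ? -1 : 0;
}
```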
* uri: Only pass the `errors` zval when interested in it in `php_uri_instantiate_uri()`
Passing it unconditionally is no longer necessary since the previous commit and
is also a layering violation, since `php_uri_instantiate_uri()` should not care
how `parse_uri()` works internally.
* uri: Use `ZVAL_EMPTY_ARRAY()` when no parsing errors are available
* uri: Avoid redundant refcounting in error handling of uri_parser_whatwg
* NEWS
RFC 3986 technically allows arbitrarily large integers as port numbers, but our
implementation is unable to deal with that, since it expects the port to fit
`zend_long`. Reject those explicitly instead of misinterpreting them.
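A minimal sketch of such an explicit check, in plain C with `long` standing in for `zend_long`. `sketch_port_to_long()` is a hypothetical helper and not the function used in ext/uri:

```c
#include <limits.h>
#include <stddef.h>

/* Converts a run of ASCII digits into a long.
 * Returns 0 on success, -1 if a non-digit is found or the value would not
 * fit into a long. *out is only written on success. An empty run yields 0. */
static int sketch_port_to_long(const char *digits, size_t len, long *out)
{
    long value = 0;

    for (size_t i = 0; i < len; i++) {
        if (digits[i] < '0' || digits[i] > '9') {
            return -1;
        }
        int d = digits[i] - '0';
        if (value > (LONG_MAX - d) / 10) {
            return -1; /* value * 10 + d would overflow */
        }
        value = value * 10 + d;
    }

    *out = value;
    return 0;
}
```

The overflow test runs before the multiplication, so the signed arithmetic itself never overflows.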
* uri: Fix double-free when assigning `$errors` by reference fails
`ZEND_TRY_ASSIGN_REF_ARR()` apparently consumes the to-be-assigned value even
when it fails.
* uri: Fix leak of parsed URI when assigning soft errors by reference fails
This is not reproducible, because the URI object will still be referenced by
Lexbor’s mraw instance and then cleanly destroyed at the end of the request.
* NEWS
There were two issues with the previous implementation of normalization:
- `php_raw_url_decode_ex()` would be used to modify a string with RC >1.
- The return value of `php_raw_url_decode_ex()` was not used, resulting in
incorrect string lengths when percent-encoded characters are decoded.
Additionally there was a bogus assertion that verified that strings returned
from the read handlers are RC =2, which was not the case for the
`parse_url`-based parser when repeatedly retrieving a component even without
normalization happening. Remove that assertion, since its usefulness is
questionable. Any obvious data type issues with read handlers should be
detectable when testing during development.
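To illustrate the two issues listed above, here is a plain C sketch of a percent-decoder that writes into a caller-owned buffer and returns the decoded length. `sketch_percent_decode()` is a hypothetical stand-in, not `php_raw_url_decode_ex()` itself; the point is only that the result is shorter than the input whenever something is decoded, so the return value has to be used, and that the shared source is left untouched:

```c
#include <stddef.h>

static int sketch_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes %XX sequences from src into dst and returns the decoded length.
 * dst must have room for len + 1 bytes; src is not modified. */
static size_t sketch_percent_decode(char *dst, const char *src, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        int hi = -1, lo = -1;
        if (src[i] == '%' && i + 2 < len
            && (hi = sketch_hex_value(src[i + 1])) >= 0
            && (lo = sketch_hex_value(src[i + 2])) >= 0) {
            dst[out++] = (char)(hi * 16 + lo);
            i += 2; /* skip the two hex digits */
        } else {
            dst[out++] = src[i]; /* copy anything else verbatim */
        }
    }

    dst[out] = '\0';
    return out;
}
```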
Calling `lexbor_mraw_clean()` after a specific number of parses will destroy
the data for any live `Uri\WhatWg\Url` objects, effectively resulting in a
use-after-free.
Fix the issue by removing the periodic `lexbor_mraw_clean()` call. Instead we
implement `php_uri_parser_whatwg_free()`. This also requires moving the
destruction of the lexbor structures from RSHUTDOWN to POST_ZEND_DEACTIVATE to
prevent a use-after-free in `php_uri_parser_whatwg_free()` since otherwise the
mraw would already have been destroyed.
* uri: Streamline implementation of `uriparser_parse_uri_ex()`
Avoid the use of a macro and streamline the logic.
* uri: Improve exceptions for `Uri\Rfc3986\Uri`
* uri: Allow empty URIs for RFC3986
* NEWS
* uri: Improve ext/uri/tests/004.phpt for empty URIs
`zend_enum_new()` is not intended to be used “at runtime”, since it will create
a new object, breaking the singleton property. Instead
`zend_enum_get_case_cstr()` must be used.