mirror of https://github.com/php/php-src.git synced 2026-03-24 00:02:20 +01:00

11 Commits

Author SHA1 Message Date
Máté Kocsis
9743977f92 Fix GH-20366 ext/uri: Do not throw ValueError on null-byte (#20489) 2025-11-19 20:41:27 +01:00
Tim Düsterhus
e23c6222da uri: Clean up naming of remaining public symbols (#19917)
* uri: Rename `uri_object_t` to `php_uri_object`

* uri: Rename `uri_(read|write)_component_*` to `php_uri_property_(read|write)_*_helper`

* uri: Rename `URI_SERIALIZED_PROPERTY_NAME` to `PHP_URI_SERIALIZE_URI_FIELD_NAME`

* uri: Rename `uri_internal_t` to `php_uri_internal`

* uri: Use proper `php_uri_ce_` prefix for all CEs

* uri: Make the object handlers `static` and remove them from the header
2025-09-23 09:19:56 +02:00
Tim Düsterhus
707f78528c uri: Inline parser and uri into uri_object_t (#19906)
* uri: Inline `uri_internal_from_obj()` and `Z_URI_INTERNAL_P()`

Changes performed with Coccinelle with some minor adjustments in places where
it choked due to macros:

    @@
    expression e;
    @@

    - Z_URI_INTERNAL_P(e)
    + &Z_URI_OBJECT_P(e)->internal

    @@
    expression e;
    @@

    - uri_internal_from_obj(e)
    + &uri_object_from_obj(e)->internal
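
For readers unfamiliar with the pattern, here is a minimal sketch of what the
inlined expression does. All `demo_*` names are hypothetical stand-ins and not
the real php-src definitions; the sketch only assumes that the extension state
sits in front of an embedded engine object, which is recovered via `offsetof`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structs live in php-src. */
typedef struct { int refcount; } demo_zend_object;
typedef struct { const void *parser; void *uri; } demo_uri_internal;

typedef struct {
    demo_uri_internal internal; /* extension state comes first */
    demo_zend_object std;       /* embedded engine object comes last */
} demo_uri_object;

/* Recover the enclosing object from a pointer to its embedded `std` member. */
static inline demo_uri_object *demo_uri_object_from_obj(demo_zend_object *obj)
{
    return (demo_uri_object *)((char *)obj - offsetof(demo_uri_object, std));
}

/* What a dedicated "internal from obj" helper amounts to: one member access. */
static inline demo_uri_internal *demo_uri_internal_of(demo_zend_object *obj)
{
    return &demo_uri_object_from_obj(obj)->internal;
}
```

Since the helper is only a member access on top of the object lookup, spelling
it out at the call site loses nothing.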

* uri: Inline definition of `URI_ASSERT_INITIALIZATION()`

While a `NULL` pointer to `zend_object` would result in `->internal` also
sitting at `0`, this is not a particularly useful assertion to have. Instead
just assert that we have a parsed `->uri` available.

Changes made with Coccinelle + some manual adjustments:

    @@
    uri_internal_t *u;
    expression e;
    @@

     u = &Z_URI_OBJECT_P(e)->internal
    ... when != u
    - URI_ASSERT_INITIALIZATION(u);
    + ZEND_ASSERT(u->uri != NULL);

    @@
    uri_internal_t *u;
    expression e;
    @@

     u = &uri_object_from_obj(e)->internal
    ... when != u
    - URI_ASSERT_INITIALIZATION(u);
    + ZEND_ASSERT(u->uri != NULL);
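
A small sketch of the reasoning, using hypothetical `demo_*` types rather than
the real ones: the address of an embedded member is derived from the object
pointer, so for any valid object only the `->uri` check carries information.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real php-src definitions. */
typedef struct { const void *parser; void *uri; } demo_internal;
typedef struct { demo_internal internal; int std; } demo_object;

/* Returns 1 when the object holds a parsed URI, 0 otherwise. */
static inline int demo_is_initialized(const demo_object *obj)
{
    /* `u` points inside `*obj`; for a valid `obj` it can never be NULL,
     * so testing `u != NULL` would not detect anything useful. */
    const demo_internal *u = &obj->internal;
    return u->uri != NULL;
}
```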

* uri: Inline `parser` and `uri` into `uri_object_t`

After this, `uri_internal_t` will only be used when interacting with the URI
parsers without having a full-blown URI object.

Changes made with Coccinelle and some manual adjustments:

    @@
    identifier u;
    expression e;
    @@

    - uri_internal_t *u = &e->internal;
    + uri_object_t *u = e;

    @@
    uri_object_t *u;
    identifier t;
    @@

    - u->internal.t
    + u->t
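
The effect of the second rule can be pictured with the following sketch. The
`demo_*` structs are assumptions made for illustration and only mirror the
shape described above: two fields that used to be reached through `.internal`
now live directly in the object.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int refcount; } demo_zend_object;

/* Before: parser and URI state are reached through a nested member. */
typedef struct {
    struct { const void *parser; void *uri; } internal;
    demo_zend_object std;
} demo_uri_object_nested;

/* After: the same two fields are direct members of the object. */
typedef struct {
    const void *parser;
    void *uri;
    demo_zend_object std;
} demo_uri_object_flat;

/* `u->internal.uri` before the change ... */
static inline void *demo_nested_uri(const demo_uri_object_nested *o)
{
    return o->internal.uri;
}

/* ... becomes `u->uri` afterwards. */
static inline void *demo_flat_uri(const demo_uri_object_flat *o)
{
    return o->uri;
}
```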

* uri: Fix outdated `internal` naming for `uri_object_t`

* uri: Clean up naming of `uri_object_t` variables
2025-09-22 11:46:14 +02:00
Tim Düsterhus
fe0263f344 uri: Throw UriError for unexpected failures (#19850)
* uri: Add `UriError`

* uri: Throw `UriError` for unexpected failures in uri_parser_rfc3986

This is a follow-up for php/php-src#19779 which updated the error *messages*
for the non-syntax errors, but did not update the exception class, still
implying it's related to invalid URIs.

Given that we don't know ourselves if these are reachable in practice, they
cannot be meaningfully handled by a user of PHP. Thus this should be an `Error`
according to our exception policy.

* uri: Throw `UriError` when unable to recompose URIs

* uri: Throw `UriError` when unable to read component

* NEWS
2025-09-17 19:46:19 +02:00
Tim Düsterhus
3e9caf5338 uri: Optimize php_uri_get_*() (#19807)
* uri: Do not check the return value of `uri_property_handler_from_internal_uri()`

It's impossible for this function to return `NULL`, since it will always return
a positive offset into a struct.
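
To illustrate why the check is dead code, consider this sketch. The `demo_*`
names and the struct layout are hypothetical; the only assumption is that the
handlers are members of a parser struct and the function returns the address
of one of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the parser and its per-property handlers. */
typedef struct {
    int (*read)(void *uri);
    int (*write)(void *uri, int value);
} demo_property_handler;

typedef struct {
    const char *name;            /* precedes the handlers in this sketch */
    demo_property_handler scheme;
    demo_property_handler host;
    demo_property_handler path;
} demo_parser;

typedef enum { DEMO_PROP_SCHEME, DEMO_PROP_HOST, DEMO_PROP_PATH } demo_property_name;

/* Always yields the address of a member, i.e. parser plus a fixed offset. */
static inline const demo_property_handler *
demo_handler_for(const demo_parser *parser, demo_property_name name)
{
    switch (name) {
        case DEMO_PROP_SCHEME: return &parser->scheme;
        case DEMO_PROP_HOST:   return &parser->host;
        case DEMO_PROP_PATH:   return &parser->path;
    }
    abort(); /* not reached for valid property names */
}
```

For a valid `parser` the result points into the parser struct itself, so a
`NULL` check on it can never fire.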

* uri: Optimize `php_uri_get_*()`

Currently the `php_uri_get_*()` functions call into `php_uri_get_property()`
with a constant `php_uri_property_name`. This name will then be used to look up
the correct property handler by a function in a different compilation unit.

Improve this by making `uri_property_handler_from_internal_uri` take a
`php_uri_parser` rather than a `uri_internal_t`, defining it in a header as
inlinable (and renaming it to better match its updated purpose).

This allows the compiler to fully inline `php_uri_get_property()`, such that no
dynamic lookups will need to happen.
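
A rough sketch of the idea, with hypothetical `demo_*` names that are not the
actual API: because the lookup is `static inline` and receives a constant
property name, an optimizing compiler can typically reduce the getter to a
direct call through one known struct member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { DEMO_PROP_SCHEME, DEMO_PROP_HOST } demo_property_name;
typedef const char *(*demo_read_fn)(const void *uri);
typedef struct { demo_read_fn read; } demo_property_handler;
typedef struct {
    const char *name;
    demo_property_handler scheme;
    demo_property_handler host;
} demo_parser;

/* Header-style lookup: visible to every caller, so a constant `name`
 * lets the compiler fold the switch away. */
static inline const demo_property_handler *
demo_handler_by_name(const demo_parser *parser, demo_property_name name)
{
    switch (name) {
        case DEMO_PROP_SCHEME: return &parser->scheme;
        case DEMO_PROP_HOST:   return &parser->host;
    }
    abort(); /* not reached for valid property names */
}

/* A toy URI representation and parser used only for this sketch. */
typedef struct { const char *scheme; const char *host; } demo_uri;

static const char *demo_read_scheme(const void *uri)
{
    return ((const demo_uri *)uri)->scheme;
}

static const char *demo_read_host(const void *uri)
{
    return ((const demo_uri *)uri)->host;
}

static const demo_parser demo_toy_parser = {
    "demo", { demo_read_scheme }, { demo_read_host }
};

/* Getter with a constant property name: no runtime name lookup is needed. */
static inline const char *demo_get_host(const demo_parser *parser, const void *uri)
{
    return demo_handler_by_name(parser, DEMO_PROP_HOST)->read(uri);
}
```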

* uri: Eliminate `php_uri_get_property()` entirely

Spelling out the effective implementation explicitly is not much longer than
going through `php_uri_get_property()`, but much more explicit in what is
happening.
2025-09-11 23:05:07 +02:00
Tim Düsterhus
04deb4df62 uri: Do not pass uri_internal_t to property handlers (#19805)
* uri: Do not pass `uri_internal_t` to property handlers

Within an individual property handler, the `parser` is already implicitly
known, which just leaves the `->uri` field which must contain the entire state
necessary for the handlers to work with.

Pass the `->uri` directly. It avoids one pointer indirection, since the
handlers do not need to follow the pointer to the `uri_internal_t` just to
follow the pointer to the URI state. Instead the URI pointer can directly be
passed using a register with the dereferences (if necessary) happening in the
caller, providing more insight for the compiler to work with.

It also makes it more convenient to use the handlers directly for code that
already knows that it needs a specific URI parser, since no `uri_internal_t`
needs to be constructed to store the already-known information about which
parser to use.
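
The difference between the two handler shapes can be sketched as follows. The
`demo_*` types and functions are hypothetical and only model the indirection
described above; they are not the real handler signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy URI state and the wrapper that pairs it with a parser. */
typedef struct { const char *host; } demo_uri;
typedef struct { const void *parser; void *uri; } demo_uri_internal;

/* Before: the handler receives the wrapper and follows two pointers. */
static inline const char *demo_read_host_old(const demo_uri_internal *internal)
{
    const demo_uri *uri = internal->uri; /* first hop: wrapper to URI state */
    return uri->host;                    /* second hop: URI state to field */
}

/* After: the caller passes the URI state itself. */
static inline const char *demo_read_host_new(const void *uri)
{
    return ((const demo_uri *)uri)->host;
}
```

Code that already holds a `demo_uri` can call `demo_read_host_new()` directly,
without building a wrapper just to name the parser it already knows.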

* uri: Use local variable for the URI in `uri_get_debug_properties()`

This makes the code a little less verbose.
2025-09-11 21:58:03 +02:00
Tim Düsterhus
26eac7de17 uri: Clean up naming of public symbols (#19794)
* uri: Rename `uri_recomposition_mode_t` to `php_uri_recomposition_mode`

* uri: Align the names of the `php_uri_recomposition_mode` values

* uri: Rename `uri_component_read_mode_t` to `php_uri_component_read_mode`

* uri: Align the names of the `php_uri_component_read_mode` values

* uri: Rename `uri_property_name_t` to `php_uri_property_name`

* uri: Align the names of the `php_uri_property_name` values

* uri: Rename `uri_property_handler_t` to `php_uri_property_handler`

* uri: Rename `uri_(read|write)_t` to `php_uri_property_handler_(read|write)`

* uri: Rename `php_uri_property_handler`’s `(read|write)_func` to `read|write`

The `_func` is implied by the data type and the name of the struct.

* uri: Rename `uri_parser_t` to `php_uri_parser`

* uri: Shorten the names of `php_uri_parser` fields

The `_uri` suffix is implied, because this is a URI parser.
2025-09-11 12:10:41 +02:00
Tim Düsterhus
b90ab8119e uri: Call the proper clone_obj handler in uri_write_component_ex() (#19649)
* uri: Call the proper `clone_obj` handler in `uri_write_component_ex()`

For external URI implementations it's possible that the `->clone_obj` handler
does not match `uri_clone_obj_handler()`. Use the handler of the object instead
of making assumptions.
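
As an illustration only, the following sketch models the dispatch with
hypothetical `demo_*` types (a simplified object with a handler table, not the
Zend Engine structures): the clone goes through whatever handler the object
itself carries.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_object demo_object;
typedef struct { demo_object *(*clone_obj)(demo_object *); } demo_handlers;
struct demo_object { const demo_handlers *handlers; int tag; };

/* Clone used by the built-in implementation in this sketch (tags copies 1). */
static demo_object *demo_builtin_clone(demo_object *obj)
{
    static demo_object copy;
    copy = *obj;
    copy.tag = 1;
    return &copy;
}

/* Clone supplied by a hypothetical external implementation (tags copies 2). */
static demo_object *demo_external_clone(demo_object *obj)
{
    static demo_object copy;
    copy = *obj;
    copy.tag = 2;
    return &copy;
}

static const demo_handlers demo_builtin_handlers = { demo_builtin_clone };
static const demo_handlers demo_external_handlers = { demo_external_clone };

/* Dispatch through the object's own handler instead of hardcoding one. */
static inline demo_object *demo_clone_for_write(demo_object *obj)
{
    return obj->handlers->clone_obj(obj);
}
```

Hardcoding `demo_builtin_clone()` here would silently pick the wrong clone for
objects that carry `demo_external_handlers`.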

* uri: Call `RETVAL_OBJ(new_object)` early in `uri_write_component_ex()`

This allows removing some error handling logic.

* uri: Remove now-useless declaration of `uri_clone_obj_handler` from php_uri_common.h
2025-09-06 20:53:09 +02:00
Máté Kocsis
e9c92a9739 ext/uri: Use the term "URI parser" instead of "URI handler" (#19530) 2025-08-21 07:23:47 +02:00
Máté Kocsis
5a9f5a6514 Add the Uri\Rfc3986\Uri class to ext/uri without wither support (#18836)
Relates to #14461 and https://wiki.php.net/rfc/url_parsing_api

Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
Co-authored-by: Tim Düsterhus <tim@tideways-gmbh.com>
2025-07-05 10:00:20 +02:00
Máté Kocsis
3399235bec Add Uri\WhatWg classes to ext/uri (#18672)
Relates to #14461 and https://wiki.php.net/rfc/url_parsing_api

Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
Co-authored-by: Tim Düsterhus <tim@tideways-gmbh.com>
2025-06-10 10:18:22 +02:00