Add internal URI handling API #19073
Some first remarks. Did not yet look at everything.
static zend_string *parse_url_uri_to_string(void *uri, uri_recomposition_mode_t recomposition_mode, bool exclude_fragment)
{
	ZEND_UNREACHABLE();
}
Might be better to simply NULL the pointer in the uri_handler_t struct instead.
I got the same comment from @DanielEScherzer in the original PR, and I told him that I would like to avoid making the handlers optional if possible: that way the existence of a handler doesn't have to be checked before each use, which benefits both maintainability and performance.
The parse_url based implementation is special because it's not directly exposed to userland: it's just an internal URI "backend" kept for BC, and these handlers aren't necessarily needed for now. We could of course expose the to_string handlers later for 3rd party extensions if we want to. Then the code should probably be changed to something else.
A function that triggers undefined behavior when called (this is what ZEND_UNREACHABLE implies for production builds) and not having a function (i.e. dereferencing a NULL pointer when trying to call the function) are functionally the same. In both cases the PHP binary will do something bad (ideally just crash).
Thus it seems to be preferable to clearly indicate that the handler is not available by using NULL rather than pretending there is a handler when calling it is unsafe.
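To make the NULL-handler suggestion concrete, here is a minimal standalone sketch. Everything prefixed with demo_ is hypothetical and only mimics the shape of a handler table; it is not the PR's uri_handler_t, and the real struct has more members. It assumes callers are willing to check the pointer before calling it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified handler table (not the PR's uri_handler_t). */
typedef struct {
	const char *name;
	/* NULL means "this backend cannot clone a URI". */
	void *(*clone_uri)(void *uri);
} demo_uri_handler_t;

/* Stand-in clone implementation for a backend that supports cloning. */
static void *demo_identity_clone(void *uri)
{
	return uri;
}

/* The legacy backend advertises the missing capability with NULL
 * instead of a stub that must never be called. */
static const demo_uri_handler_t demo_legacy_handler = { "parse_url", NULL };
static const demo_uri_handler_t demo_full_handler = { "full", demo_identity_clone };

/* Callers can query availability explicitly. */
static bool demo_handler_can_clone(const demo_uri_handler_t *handler)
{
	return handler->clone_uri != NULL;
}

/* Returns NULL when the backend has no clone handler. */
static void *demo_clone_uri(const demo_uri_handler_t *handler, void *uri)
{
	if (!demo_handler_can_clone(handler)) {
		return NULL;
	}
	return handler->clone_uri(uri);
}
```

The trade-off discussed above is visible here: the check in demo_clone_uri is the extra branch the author wants to avoid, while the NULL member makes the missing capability detectable rather than undefined behavior.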
static void *parse_url_clone_uri(void *uri)
{
	ZEND_UNREACHABLE();
}
Ditto
ext/uri/php_uri.c
Outdated
if (uri_handler_name == NULL) {
	return uri_handler_by_name("parse_url", sizeof("parse_url") - 1);
}
Defaulting to parse_url in a new API is probably not a good idea. Instead the “legacy” users should just pass "parse_url" explicitly.
Defaulting to parse_url here works because that is indeed the default wherever php_uri_get_handler() is called; the other "backends" can only be used if the config is explicitly passed (not null).
The other reason why I opted for this approach is that it would be inconvenient to create and free a new zend_string whenever the legacy implementation is needed, and I wanted to avoid adding a known string just for this purpose, or exposing the C string based uri_handler_by_name function instead.
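The two positions in this thread can be compared with a small standalone sketch. All demo_ names and the "other_backend" entry are placeholders, and plain C strings stand in for the zend_string the real function takes; this is an illustration of the design choice, not the PR's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backend descriptor and registry. */
typedef struct {
	const char *name;
} demo_backend_t;

static const demo_backend_t demo_backends[] = {
	{ "parse_url" },
	{ "other_backend" }, /* placeholder name */
};

/* Length-based lookup, similar in spirit to a C string based by-name helper. */
static const demo_backend_t *demo_backend_by_name(const char *name, size_t name_len)
{
	for (size_t i = 0; i < sizeof(demo_backends) / sizeof(demo_backends[0]); i++) {
		if (strlen(demo_backends[i].name) == name_len
				&& memcmp(demo_backends[i].name, name, name_len) == 0) {
			return &demo_backends[i];
		}
	}
	return NULL;
}

/* Variant A: a NULL name silently selects the legacy backend. */
static const demo_backend_t *demo_get_backend_defaulting(const char *name)
{
	if (name == NULL) {
		return demo_backend_by_name("parse_url", sizeof("parse_url") - 1);
	}
	return demo_backend_by_name(name, strlen(name));
}

/* Variant B: no implicit default; legacy callers must pass "parse_url". */
static const demo_backend_t *demo_get_backend_explicit(const char *name)
{
	if (name == NULL) {
		return NULL;
	}
	return demo_backend_by_name(name, strlen(name));
}
```

With variant B, a legacy call site reads demo_get_backend_explicit("parse_url"), which makes the choice of backend visible at the call site at the cost of constructing the name there.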
I've looked at this again and I must say that I'm having trouble meaningfully reviewing this. It adds a large amount of code with unclear purpose and confusing (to me) naming.
PHPAPI zend_result php_uri_get_scheme(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_SCHEME, read_mode, zv);
}

PHPAPI zend_result php_uri_get_username(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_USERNAME, read_mode, zv);
}

PHPAPI zend_result php_uri_get_password(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_PASSWORD, read_mode, zv);
}

PHPAPI zend_result php_uri_get_host(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_HOST, read_mode, zv);
}

PHPAPI zend_result php_uri_get_port(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_PORT, read_mode, zv);
}

PHPAPI zend_result php_uri_get_path(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_PATH, read_mode, zv);
}

PHPAPI zend_result php_uri_get_query(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_QUERY, read_mode, zv);
}

PHPAPI zend_result php_uri_get_fragment(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_FRAGMENT, read_mode, zv);
}
The purpose of these new helpers is not clear to me. It feels like just another layer of indirection that moves the enum into the function name. There's also already uri_property_handler_from_internal_uri(), so why doesn't it work here?
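The indirection being questioned can be reduced to the following standalone sketch. The demo_ types and functions are hypothetical and far simpler than the PR's uri_internal_t and zval based API; the point is only that the per-component wrapper adds no behavior beyond fixing the enum argument.

```c
#include <stddef.h>

/* Hypothetical component selector and URI container. */
typedef enum {
	DEMO_COMPONENT_SCHEME,
	DEMO_COMPONENT_HOST,
	DEMO_COMPONENT_COUNT
} demo_component_t;

typedef struct {
	const char *components[DEMO_COMPONENT_COUNT];
} demo_uri_t;

/* Generic getter keyed by the enum: returns 0 on success, -1 on failure. */
static int demo_get_component(const demo_uri_t *uri, demo_component_t component, const char **out)
{
	if ((int)component < 0 || (int)component >= DEMO_COMPONENT_COUNT
			|| uri->components[component] == NULL) {
		return -1;
	}
	*out = uri->components[component];
	return 0;
}

/* Per-component wrapper: the enum value moves into the function name. */
static int demo_get_scheme(const demo_uri_t *uri, const char **out)
{
	return demo_get_component(uri, DEMO_COMPONENT_SCHEME, out);
}
```

A caller of demo_get_scheme(&uri, &out) gets exactly what demo_get_component(&uri, DEMO_COMPONENT_SCHEME, &out) returns, which is the reviewer's point; the wrappers mainly buy a narrower public surface if the enum itself is meant to stay internal.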