Add internal URI handling API #19073

Draft: wants to merge 2 commits into base: master

Conversation

kocsismate
Member

No description provided.

Member

@TimWolla TimWolla left a comment

Some first remarks. Did not yet look at everything.

Comment on lines +459 to +453
static zend_string *parse_url_uri_to_string(void *uri, uri_recomposition_mode_t recomposition_mode, bool exclude_fragment)
{
	ZEND_UNREACHABLE();
}
Member

Might be better to simply NULL the pointer in the uri_handler_t struct instead.

Member Author

I got the same comment from @DanielEScherzer in the original PR, and I told him that I would like to avoid making the handlers optional if possible, because this way the existence of the handlers doesn't have to be checked before they are used. That is advantageous for both maintainability and performance.

The parse_url based implementation is special because it's not directly exposed to userland: it's just an internal URI "backend" kept for BC, and these handlers aren't necessarily needed for now. We could of course expose the to_string handlers later for 3rd party extensions if we want to. In that case the code should probably be changed to something else.

Member

A function that triggers undefined behavior when called (this is what ZEND_UNREACHABLE implies for production builds) and not having a function (i.e. dereferencing a NULL pointer when trying to call the function) are functionally the same. In both cases the PHP binary will do something bad (ideally just crash).

Thus it seems to be preferable to clearly indicate that the handler is not available by using NULL rather than pretending there is a handler when calling it is unsafe.
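To make the suggestion concrete, here is a minimal, self-contained sketch of the NULL-pointer approach. It does not use the real `uri_handler_t` from the PR: `demo_uri_handler`, `demo_try_clone`, and the `"full"` backend are hypothetical names invented for illustration, and the struct is reduced to a single optional handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the handler struct discussed above. */
typedef struct demo_uri_handler {
	const char *name;
	/* NULL means "this backend does not provide a clone handler". */
	void *(*clone_uri)(void *uri);
} demo_uri_handler;

/* Placeholder clone for the demo: returns the same pointer instead of copying. */
static void *demo_identity_clone(void *uri)
{
	return uri;
}

/* The legacy backend advertises the missing capability with NULL. */
static const demo_uri_handler demo_legacy_handler = { "parse_url", NULL };
/* A hypothetical backend that does implement the handler. */
static const demo_uri_handler demo_full_handler = { "full", demo_identity_clone };

/* Returns true and stores the result in *out only if the handler exists. */
static bool demo_try_clone(const demo_uri_handler *handler, void *uri, void **out)
{
	assert(handler != NULL && out != NULL);
	if (handler->clone_uri == NULL) {
		return false; /* capability absent: a defined, checkable outcome */
	}
	*out = handler->clone_uri(uri);
	return true;
}
```

With this shape a caller can test for the capability before using it, which is not possible when the slot holds a stub whose body is unreachable. The cost is the extra NULL check that the author wanted to avoid.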

Comment on lines +428 to +431
static void *parse_url_clone_uri(void *uri)
{
	ZEND_UNREACHABLE();
}
Member

Ditto

Comment on lines 111 to 114
if (uri_handler_name == NULL) {
	return uri_handler_by_name("parse_url", sizeof("parse_url") - 1);
}

Member

Defaulting to parse_url in a new API is probably not a good idea. Instead the “legacy” users should just pass "parse_url" explicitly.

Member Author

Defaulting to parse_url works here because that is indeed the default wherever php_uri_get_handler() is called; the other "backends" can only be used if the config is explicitly passed (not null).

The other reason why I opted for this approach is that it would be inconvenient to create and free a new zend_string when the legacy implementation is needed, and I wanted to avoid adding a known string just for this purpose, or exposing the C string based uri_handler_by_name function instead.
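The two positions in this thread can be compared side by side in a small sketch. It is a simplification under stated assumptions: it uses plain C strings rather than `zend_string`, and `demo_handler`, `demo_handler_by_name`, `demo_get_handler_defaulting`, `demo_get_handler_explicit`, and the `"demo_backend"` entry are hypothetical names, not the PR's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_handler {
	const char *name;
} demo_handler;

static const demo_handler demo_handlers[] = {
	{ "parse_url" },
	{ "demo_backend" }, /* hypothetical second backend */
};

/* Length-based lookup, similar in spirit to the call shown in the diff. */
static const demo_handler *demo_handler_by_name(const char *name, size_t len)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(demo_handlers) / sizeof(demo_handlers[0]); i++) {
		const char *candidate = demo_handlers[i].name;
		if (strlen(candidate) == len && memcmp(candidate, name, len) == 0) {
			return &demo_handlers[i];
		}
	}
	return NULL;
}

/* Variant A: a NULL name silently selects the legacy backend. */
static const demo_handler *demo_get_handler_defaulting(const char *name)
{
	if (name == NULL) {
		return demo_handler_by_name("parse_url", sizeof("parse_url") - 1);
	}
	return demo_handler_by_name(name, strlen(name));
}

/* Variant B: callers must name the backend; a NULL name is rejected. */
static const demo_handler *demo_get_handler_explicit(const char *name)
{
	if (name == NULL) {
		return NULL;
	}
	return demo_handler_by_name(name, strlen(name));
}
```

Variant A spares legacy callers from building a name, which is the convenience argument made above. Variant B keeps the legacy choice visible at every call site, which is the reviewer's concern.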

Member

@TimWolla TimWolla left a comment

I've looked at this again and I must say that I'm having trouble meaningfully reviewing this. It adds a large amount of code with unclear purpose and confusing (to me) naming.

Comment on lines +148 to +186
PHPAPI zend_result php_uri_get_scheme(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_SCHEME, read_mode, zv);
}

PHPAPI zend_result php_uri_get_username(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_USERNAME, read_mode, zv);
}

PHPAPI zend_result php_uri_get_password(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_PASSWORD, read_mode, zv);
}

PHPAPI zend_result php_uri_get_host(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_HOST, read_mode, zv);
}

PHPAPI zend_result php_uri_get_port(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_PORT, read_mode, zv);
}

PHPAPI zend_result php_uri_get_path(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_PATH, read_mode, zv);
}

PHPAPI zend_result php_uri_get_query(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_QUERY, read_mode, zv);
}

PHPAPI zend_result php_uri_get_fragment(const uri_internal_t *internal_uri, uri_component_read_mode_t read_mode, zval *zv)
{
	return php_uri_get_property(internal_uri, URI_PROPERTY_NAME_FRAGMENT, read_mode, zv);
}
Member

The purpose of these new helpers is not clear to me. It feels like just another layer of indirection that moves the enum into the function name. There's also already uri_property_handler_from_internal_uri(); why doesn't it work here?
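The indirection in question can be shown with a stripped-down sketch. None of the names below come from the PR: `demo_property`, `demo_uri`, `demo_get_property`, and `demo_get_scheme` are hypothetical stand-ins, and the components are plain C strings rather than zvals.

```c
#include <assert.h>
#include <stddef.h>

typedef enum demo_property {
	DEMO_PROPERTY_SCHEME,
	DEMO_PROPERTY_HOST,
	DEMO_PROPERTY_COUNT
} demo_property;

/* Hypothetical parsed URI: one optional string per component. */
typedef struct demo_uri {
	const char *components[DEMO_PROPERTY_COUNT];
} demo_uri;

/* Generic getter: the component is selected by an enum argument. */
static int demo_get_property(const demo_uri *uri, demo_property property, const char **out)
{
	assert(uri != NULL && out != NULL);
	if ((unsigned)property >= DEMO_PROPERTY_COUNT || uri->components[property] == NULL) {
		return -1; /* unknown or absent component */
	}
	*out = uri->components[property];
	return 0;
}

/* Per-component wrapper: it only fixes the enum argument in its name. */
static int demo_get_scheme(const demo_uri *uri, const char **out)
{
	return demo_get_property(uri, DEMO_PROPERTY_SCHEME, out);
}
```

In this sketch both calls return the same result, so the wrapper adds a name without adding behavior. Whether that is worth one function per component is the question raised above.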
