 Usage is straightforward::

-   >>> from hyperlink import URL
-   >>> url = URL.from_text(u'http://github.com/mahmoud/hyperlink?utm_source=docs')
+   >>> import hyperlink
+   >>> url = hyperlink.parse(u'http://github.com/mahmoud/hyperlink?utm_source=docs')
    >>> url.host
    u'github.com'
    >>> secure_url = url.replace(scheme=u'https')
    >>> secure_url.get('utm_source')[0]
    u'docs'

-As seen here, the API revolves around the lightweight and immutable
-:class:`URL` type, documented below.
+Hyperlink's API centers on the :class:`DecodedURL` type, which wraps
+the lower-level :class:`URL`, both of which can be returned by the
+:func:`parse()` convenience function.
+
 """  # noqa: E501

 import re
@@ -1971,13 +1973,25 @@ def remove(
 EncodedURL = URL  # An alias better describing what the URL really is

+_EMPTY_URL = URL()
+

 class DecodedURL(object):
-    """DecodedURL is a type meant to act as a higher-level interface to
-    the URL. It is the `unicode` to URL's `bytes`. `DecodedURL` has
-    almost exactly the same API as `URL`, but everything going in and
-    out is in its maximally decoded state. All percent decoding is
-    handled automatically.
+    """
+    :class:`DecodedURL` is a type designed to act as a higher-level
+    interface to :class:`URL` and the recommended type for most
+    operations. By analogy, :class:`DecodedURL` is the
+    :class:`unicode` to URL's :class:`bytes`.
+
+    :class:`DecodedURL` automatically handles encoding and decoding
+    all its components, such that all inputs and outputs are in a
+    maximally-decoded state. Note that this means, for some special
+    cases, a URL may not "roundtrip" character-for-character, but this
+    is considered a good tradeoff for the safety of automatic
+    encoding.
+
+    Otherwise, :class:`DecodedURL` has almost exactly the same API as
+    :class:`URL`.

     Where applicable, a UTF-8 encoding is presumed. Be advised that
     some interactions can raise :exc:`UnicodeEncodeErrors` and
@@ -1991,9 +2005,20 @@ class DecodedURL(object):
         lazy (bool): Set to True to avoid pre-decode all parts of the URL to
             check for validity. Defaults to False.

+    .. note::
+
+      The :class:`DecodedURL` initializer takes a :class:`URL` object,
+      not URL components, like :class:`URL`. To programmatically
+      construct a :class:`DecodedURL`, you can use this pattern:
+
+      >>> print(DecodedURL().replace(scheme=u'https',
+      ... host=u'pypi.org', path=(u'projects', u'hyperlink')).to_text())
+      https://pypi.org/projects/hyperlink
+
+    .. versionadded:: 18.0.0
     """

-    def __init__(self, url, lazy=False):
+    def __init__(self, url=_EMPTY_URL, lazy=False):
         # type: (URL, bool) -> None
         self._url = url
         if not lazy:
@@ -2353,22 +2378,29 @@ def __dir__(self):
 def parse(url, decoded=True, lazy=False):
     # type: (Text, bool, bool) -> Union[URL, DecodedURL]
-    """Automatically turn text into a structured URL object.
+    """
+    Automatically turn text into a structured URL object.
+
+    >>> url = parse(u"https://github.com/python-hyper/hyperlink")
+    >>> print(url.to_text())
+    https://github.com/python-hyper/hyperlink

     Args:
-        url (Text): A string representation of a URL.
+        url (str): A text string representation of a URL.

         decoded (bool): Whether or not to return a :class:`DecodedURL`,
             which automatically handles all
             encoding/decoding/quoting/unquoting for all the various
-            accessors of parts of the URL, or an :class:`EncodedURL`,
+            accessors of parts of the URL, or a :class:`URL`,
             which has the same API, but requires handling of special
             characters for different parts of the URL.

         lazy (bool): In the case of `decoded=True`, this controls
             whether the URL is decoded immediately or as accessed. The
             default, `lazy=False`, checks all encoded parts of the URL
             for decodability.
+
+    .. versionadded:: 18.0.0
     """
     enc_url = EncodedURL.from_text(url)
     if not decoded:
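
The new `DecodedURL` docstring warns that a URL may not "roundtrip" character-for-character once its parts are decoded and re-encoded. A minimal sketch of that effect follows, using only the standard library's `urllib.parse` rather than hyperlink itself. The helper names `decode_segment` and `encode_segment` are hypothetical and are not part of hyperlink's API; the sketch assumes UTF-8, as the docstring does.

```python
from urllib.parse import quote, unquote


def decode_segment(encoded):
    """Percent-decode one path segment, assuming UTF-8 (hypothetical helper)."""
    return unquote(encoded, encoding="utf-8", errors="strict")


def encode_segment(decoded):
    """Percent-encode one path segment, escaping every reserved character."""
    return quote(decoded, safe="")


# The escape below uses lowercase hex digits, which is valid but not canonical.
original = "caf%c3%a9"
decoded = decode_segment(original)    # the text a decoded interface would expose
reencoded = encode_segment(decoded)   # quote() emits uppercase hex digits

# The two encoded forms name the same text but differ character-for-character.
same_text = decode_segment(reencoded) == decoded
same_chars = reencoded == original
```

Here `same_text` is true while `same_chars` is false: the decoded value survives the trip, but the exact spelling of the escape does not. This is the kind of tradeoff the docstring describes.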