11 changes: 6 additions & 5 deletions .github/copilot-instructions.md
@@ -38,7 +38,7 @@ All URL components follow this pattern: property decorator + cached_property for
- Comprehensive property testing for all URL components
- HTTP method testing (when possible)
- Optional dependency tests use `@pytest.mark.skipif` decorators
- README examples are automatically tested using pytest-markdown-docs
- doctest examples are automatically tested using pytest-markdown-docs

### Development Workflow

@@ -59,12 +59,12 @@ make test
# Run unit tests only
make test-unit

# Run README tests only
# Run doc-tests only
make test-doctest

# Or use uv directly
uv run pytest tests/
uv run pytest README.md --markdown-docs
uv run pytest tests/doctests.md --markdown-docs
```
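
As a rough illustration of what running pytest with `--markdown-docs` amounts to, the sketch below pulls Python fences out of a Markdown string and executes each one. It is not the plugin's implementation: `run_markdown_snippets` is a hypothetical helper, and running every fence in its own fresh namespace is an assumption made for simplicity.

```python
import re

TICKS = "`" * 3  # built indirectly so this example can itself sit inside a fence
FENCE = re.compile(rf"^{TICKS}python[^\n]*\n(.*?)^{TICKS}", re.DOTALL | re.MULTILINE)


def run_markdown_snippets(markdown_text: str) -> int:
    """Execute every Python fence found in the text and return how many ran."""
    snippets = FENCE.findall(markdown_text)
    for source in snippets:
        # A failing assert inside a snippet propagates as AssertionError.
        exec(compile(source, "<markdown snippet>", "exec"), {})
    return len(snippets)


sample = f"Intro\n{TICKS}python\nassert 1 + 1 == 2\n{TICKS}\n{TICKS}bash\nexit 1\n{TICKS}\n"
ran = run_markdown_snippets(sample)  # only the Python fence is executed
```

Fences in other languages (the `bash` one above) are skipped, which is why shell commands in a Markdown file do not affect the doctest run.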

### Building and Packaging
@@ -86,7 +86,7 @@ make help
- **Build System**: `uv` with `hatchling` backend for modern Python packaging

### CI Configuration
GitHub Actions tests against Python 3.9-3.13 using `uv sync` and matrix strategy. Both unit tests and README doctests must pass.
GitHub Actions tests against Python 3.9-3.13 using `uv sync` and matrix strategy. Both unit tests and doctests must pass.

## Code Conventions

@@ -108,5 +108,6 @@ GitHub Actions tests against Python 3.9-3.13 using `uv sync` and matrix strategy
### File Structure
- `urlpath/__init__.py`: Single-file module with all classes
- `tests/test_url.py`: Comprehensive pytest test suite
- `README.md`: Extensive examples with automated pytest validation
- `README.md`: Overview of the library, feature tour, and usage examples
- `tests/doctests.md`: Extensive examples with automated pytest validation
- `conftest.py`: pytest configuration for test discovery and path setup
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
@@ -36,5 +36,5 @@ jobs:
run: uv sync --group dev
- name: Run unit tests
run: uv run pytest tests/
- name: Run README tests
run: uv run pytest README.md --markdown-docs
- name: Run doctest snippets
run: uv run pytest tests/doctests.md --markdown-docs
8 changes: 5 additions & 3 deletions Makefile
@@ -4,6 +4,8 @@
# Use copy mode to avoid filesystem reflink issues
export UV_LINK_MODE = copy

DOC_TESTS = tests/doctests.md

help: ## Show this help message
@echo 'Usage: make [target]'
@echo ''
@@ -21,8 +23,8 @@ test: test-unit test-doctest ## Run all tests
test-unit: ## Run unit tests
uv run --group dev pytest tests/

test-doctest: ## Run doctests from README
uv run --group dev pytest README.md --markdown-docs
test-doctest: ## Run doctests from doctests.md
uv run --group dev pytest $(DOC_TESTS) --markdown-docs

build: ## Build package
uv build
@@ -43,4 +45,4 @@ check: ## Verify code quality (format, lint, type check, test)
uv run --group dev ruff format --check
uv run --group dev ruff check
uv run --group dev mypy urlpath/ tests/
uv run --group dev pytest tests/ README.md --markdown-docs
uv run --group dev pytest tests/ $(DOC_TESTS) --markdown-docs
279 changes: 156 additions & 123 deletions README.md
@@ -1,160 +1,193 @@
# URLPath

URLPath provides URL manipulator class that extends [`pathlib.PurePath`](https://docs.python.org/3/library/pathlib.html#pure-paths).
URLPath turns raw URLs into first-class objects that behave like `pathlib` paths and `requests` sessions at the same time. Build, query, and call URLs with an expressive, chainable API.

[![Tests](https://github.com/brandonschabell/urlpath/actions/workflows/test.yml/badge.svg)](https://github.com/brandonschabell/urlpath/actions/workflows/test.yml)
[![PyPI version](https://img.shields.io/pypi/v/urlpath.svg)](https://pypi.python.org/pypi/urlpath)
[![Downloads](https://pepy.tech/badge/urlpath)](https://pepy.tech/project/urlpath)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python Versions](https://img.shields.io/pypi/pyversions/urlpath.svg)](https://pypi.org/project/urlpath/)

## Dependencies
## Features

* Python 3.9–3.14
* [Requests](http://docs.python-requests.org/)
* [JMESPath](https://pypi.org/project/jmespath/) (Optional)
* [WebOb](http://webob.org/) (Optional)
- Compose URLs with `pathlib` semantics: join segments, inspect components, and normalise paths.
- Access and mutate parts of the URL (`scheme`, `netloc`, `userinfo`, `query`, `fragment`, etc.) with fluent helpers.
- Treat query strings as multidicts, rebuild them from dicts/objects, or append additional parameters without losing order.
- Make HTTP requests directly from any `URL` (`get`, `post`, `patch`, `put`, `delete`) and fetch JSON with optional JMESPath filtering.
- Keep callers inside a known root using `JailedURL` guards.
- Accept familiar inputs: strings, bytes, `urllib.parse` results, `webob.Request`, and other `PathLike` objects.

## Install
## How to install

```bash
pip install urlpath
```
### Dependencies

## Examples
* **Python 3.9–3.14**
* **[Requests](http://docs.python-requests.org/)** - required for HTTP verbs.
* **[JMESPath](https://pypi.org/project/jmespath/)** - optional, enables filtered `get_json` responses.
* **[WebOb](http://webob.org/)** - optional, allows constructing URLs directly from `webob.Request` instances.

## Quick start

```python
from urlpath import URL

# Create URL object
url = URL(
'https://username:[email protected]:1234/path/to/file.ext?field1=1&field2=2&field1=3#fragment')

# Representation
assert str(url) == 'https://username:[email protected]:1234/path/to/file.ext?field1=1&field2=2&field1=3#fragment'
assert url.as_uri() == 'https://username:[email protected]:1234/path/to/file.ext?field1=1&field2=2&field1=3#fragment'
assert url.as_posix() == 'https://username:[email protected]:1234/path/to/file.ext?field1=1&field2=2&field1=3#fragment'

# Access pathlib.PurePath compatible properties
assert url.drive == 'https://username:[email protected]:1234'
assert url.root == '/'
assert url.anchor == 'https://username:[email protected]:1234/'
assert url.path == '/path/to/file.ext'
assert url.name == 'file.ext'
assert url.suffix == '.ext'
assert url.suffixes == ['.ext']
assert url.stem == 'file'
assert url.parts == ('https://username:[email protected]:1234/', 'path', 'to', 'file.ext')
assert str(url.parent) == 'https://username:[email protected]:1234/path/to'

# Access scheme
assert url.scheme == 'https'

# Access netloc
assert url.netloc == 'username:[email protected]:1234'
assert url.username == 'username'
assert url.password == 'password'
assert url.hostname == 'secure.example.com'
assert url.port == 1234

# Access query
assert url.query == 'field1=1&field2=2&field1=3'
assert url.form_fields == (('field1', '1'), ('field2', '2'), ('field1', '3'))
assert 'field1' in url.form
assert url.form.get_one('field1') == '1'
assert url.form.get_one('field3') is None

# Access fragment
assert url.fragment == 'fragment'

# Path operations
assert str(url / 'suffix') == 'https://username:[email protected]:1234/path/to/file.ext/suffix'
assert str(url / '../../rel') == 'https://username:[email protected]:1234/path/to/file.ext/../../rel'
assert str((url / '../../rel').resolve()) == 'https://username:[email protected]:1234/path/rel'
assert str(url / '/') == 'https://username:[email protected]:1234/'
assert str(url / 'http://example.com/') == 'http://example.com/'

# Replace components
assert str(url.with_scheme('http')) == 'http://username:[email protected]:1234/path/to/file.ext?field1=1&field2=2&field1=3#fragment'
assert str(url.with_netloc('www.example.com')) == 'https://www.example.com/path/to/file.ext?field1=1&field2=2&field1=3#fragment'
assert str(url.with_userinfo('joe', 'pa33')) == 'https://joe:[email protected]:1234/path/to/file.ext?field1=1&field2=2&field1=3#fragment'
assert str(url.with_hostinfo('example.com', 8080)) == 'https://username:[email protected]:8080/path/to/file.ext?field1=1&field2=2&field1=3#fragment'
assert str(url.with_fragment('new fragment')) == 'https://username:[email protected]:1234/path/to/file.ext?field1=1&field2=2&field1=3#new fragment'
assert str(url.with_components(username=None, password=None, query='query', fragment='frag')) == 'https://secure.example.com:1234/path/to/file.ext?query#frag'

# Replace query
assert str(url.with_query({'field3': '3', 'field4': [1, 2, 3]})) == 'https://username:[email protected]:1234/path/to/file.ext?field3=3&field4=1&field4=2&field4=3#fragment'
assert str(url.with_query(field3='3', field4=[1, 2, 3])) == 'https://username:[email protected]:1234/path/to/file.ext?field3=3&field4=1&field4=2&field4=3#fragment'
assert str(url.with_query('query')) == 'https://username:[email protected]:1234/path/to/file.ext?query#fragment'
assert str(url.with_query(None)) == 'https://username:[email protected]:1234/path/to/file.ext#fragment'

# Amend query
assert str(url.with_query(field1='1').add_query(field2=2)) == 'https://username:[email protected]:1234/path/to/file.ext?field1=1&field2=2#fragment'
api = URL("https://api.example.com/v1")
user = api / "users" / "123"

# Manipulate components just like pathlib
assert user.path == "/v1/users/123"
assert user.parent == URL("https://api.example.com/v1/users")

# Tweak and inspect the query string
endpoint = user.with_query(include=["profile", "activity"]).add_query(page=2)
assert str(endpoint) == "https://api.example.com/v1/users/123?include=profile&include=activity&page=2"

# Call the URL with requests
response = endpoint.get()
if response.ok:
data = endpoint.get_json(keys="user.profile") # Optional JMESPath filter
```

### HTTP requests
## Path-aware URL composition

URLPath provides convenient methods for making HTTP requests:
`URL` subclasses `pathlib.PurePath` to give you intuitive operations:

```python
from urlpath import URL
url = URL("https://username:[email protected]:1234/path/to/file.ext?field1=1#fragment")

url.drive # 'https://username:[email protected]:1234'
url.anchor # 'https://username:[email protected]:1234/'
url.parts # ('https://username:[email protected]:1234/', 'path', 'to', 'file.ext')
url.name # 'file.ext'
url.suffixes # ['.ext']
url.parent # URL('https://username:[email protected]:1234/path/to')

# Slash-join works the way pathlib users expect
assert str(url / "reports" / "2024.json") == "https://username:[email protected]:1234/path/to/file.ext/reports/2024.json"
assert str((url / "../templates").resolve()) == "https://username:[email protected]:1234/path/to/templates"

# Absolute joins or constructor segments reset the path
assert str(url / "/reset/path") == "https://username:[email protected]:1234/reset/path"
assert str(URL("https://example.com/base", "/fresh")) == "https://example.com/fresh"
```

# GET request
url = URL('https://httpbin.org/get')
response = url.get()
assert response.status_code == 200

# POST request
url = URL('https://httpbin.org/post')
response = url.post(data={'key': 'value'})
assert response.status_code == 200

# DELETE request
url = URL('https://httpbin.org/delete')
response = url.delete()
assert response.status_code == 200

# PATCH request
url = URL('https://httpbin.org/patch')
response = url.patch(data={'key': 'value'})
assert response.status_code == 200

# PUT request
url = URL('https://httpbin.org/put')
response = url.put(data={'key': 'value'})
assert response.status_code == 200
Use the fluent `with_*` helpers to surgically update components:

```python
url = URL("http://www.example.com/path/to/file.exe?query#frag")
url = url.with_scheme("https").with_userinfo("user", "secret")
assert str(url) == "https://user:[email protected]/path/to/file.exe?query#frag"
assert url.hostname == "www.example.com"
```

### Jail
## Query and fragment helpers

URLPath keeps queries ordered and exposes them through a WebOb-style multidict:

```python
from urlpath import URL
url = URL("http://www.example.com/form")
form_url = url.with_query({"field1": ["value1", "value2"], "field2": "hello, world"})

form_url.form.get("field1") # ("value1", "value2")
"field2" in form_url.form # True

# Append without losing the existing parameters
extended = form_url.add_query(field3="value3")
assert extended.query == "field1=value1&field1=value2&field2=hello%2C+world&field3=value3"

root = 'http://www.example.com/app/'
current = 'http://www.example.com/app/path/to/content'
url = URL(root).jailed / current
assert str(url / '/root') == 'http://www.example.com/app/root'
assert str((url / '../../../../../../root').resolve()) == 'http://www.example.com/app/root'
assert str(url / 'http://localhost/') == 'http://www.example.com/app/'
assert str(url / 'http://www.example.com/app/file') == 'http://www.example.com/app/file'
# Swap out the fragment without touching the rest of the URL
assert str(url.with_fragment("section-3")) == "http://www.example.com/form#section-3"
```
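
For comparison, the same ordered, multi-valued query encoding can be sketched with the standard library alone. This uses `urllib.parse` rather than URLPath's own code, and `build_query` is a hypothetical helper name:

```python
from urllib.parse import parse_qsl, urlencode


def build_query(params: dict) -> str:
    """Encode a mapping into a query string, repeating keys for list values."""
    # doseq=True expands each sequence into one key=value pair per item.
    return urlencode(params, doseq=True)


query = build_query({"field1": ["value1", "value2"], "field2": "hello, world"})
pairs = parse_qsl(query)  # ordered (key, value) pairs, duplicate keys preserved
```

Here `query` starts with the two `field1` pairs in insertion order, and `pairs` round-trips them back as separate tuples, which is the multidict-style view the `form` property is described as offering.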

### Trailing separator will be retained
## HTTP requests & JSON extraction

Every `URL` instance can issue HTTP requests via `requests`:

```python
from urlpath import URL
url = URL("https://httpbin.org/anything")
response = url.post(json={"hello": "world"})
response.raise_for_status()

# Fetch JSON and optionally apply a JMESPath expression
reporting_api = URL("https://api.example.com/reports")
document = reporting_api.get_json(query={"status": "active"}, keys="items[*].name")
# => ["Quarterly", "Annual"]
```

Pass a compiled JMESPath expression instead of a string when you need to reuse filters:

```python
import jmespath

url = URL('http://www.example.com/path/with/trailing/sep/')
assert str(url).endswith('/')
assert url.trailing_sep == '/'
assert url.name == 'sep'
assert url.path == '/path/with/trailing/sep/'
assert url.parts[-1] == 'sep'

url = URL('http://www.example.com/path/without/trailing/sep')
assert not str(url).endswith('/')
assert url.trailing_sep == ''
assert url.name == 'sep'
assert url.path == '/path/without/trailing/sep'
assert url.parts[-1] == 'sep'
expr = jmespath.compile("users[*].age")
ages = URL("https://api.example.com/users").get_json(keys=expr)
```

`jmespath` is optional; install it to enable filtered lookups (`pip install urlpath[jmespath]`).

## Constrain navigation with jailed URLs

`JailedURL` confines joins and resolutions to a particular origin, preventing escapes:

```python
root = URL("https://www.example.com/app/")
current = root.jailed / "path/to/content"

assert str(current / "appendix") == "https://www.example.com/app/path/to/content/appendix"
assert str((current / "/root").resolve()) == "https://www.example.com/app/root"
assert str(current / "https://malicious.test") == "https://www.example.com/app/"
```
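
The confinement idea can be sketched on plain path strings. This is a minimal illustration, not `JailedURL`'s implementation; `resolve_in_jail` is a hypothetical helper, and it assumes both `current` and `target` are interpreted relative to the jail root:

```python
import posixpath


def resolve_in_jail(jail_root: str, current: str, target: str) -> str:
    """Resolve target against current without ever climbing above jail_root."""
    # An absolute target restarts at the jail root; a relative one extends current.
    segments = [] if target.startswith("/") else [s for s in current.split("/") if s]
    for segment in target.split("/"):
        if segment in ("", "."):
            continue
        if segment == "..":
            if segments:
                segments.pop()  # ".." at the jail root is silently ignored
            continue
        segments.append(segment)
    return posixpath.join(jail_root, *segments)


escaped = resolve_in_jail("/app", "path/to/content", "../../../../../../root")
absolute = resolve_in_jail("/app", "path/to/content", "/root")
```

Both calls end up under `/app`, because excess `..` segments are dropped instead of walking past the root.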

You can also wrap an incoming `webob.Request` to mirror the application's mount point:

```python
import webob
from urlpath import JailedURL

request = webob.Request.blank(
"/docs/page",
base_url="https://docs.example.com",
environ={"SCRIPT_NAME": "/knowledge-base"},
)

jailed = JailedURL(request)
assert str(jailed) == "https://docs.example.com/knowledge-base/docs/page"
assert str(jailed.chroot) == "https://docs.example.com/knowledge-base"
```

## Works with familiar URL sources

The constructor accepts many canonical URL representations:

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

URL(urlsplit("https://example.com/from-split"))
URL(PurePosixPath("path/segment")) # usable when joining onto a local path
URL(b"https://example.com/from-bytes")
URL(webob.Request.blank("/resource", base_url="https://example.com"))
```

## Encoding-aware by default

IDNs and percent-encoding are handled for you:

```python
url = URL("http://www.xn--alliancefranaise-npb.nu/")
url.hostname # "www.alliancefran\u00e7aise.nu"

encoded = URL("http://example.com/name").with_name("\u65e5\u672c\u8a9e/\u540d\u524d")
# str(encoded) == "http://example.com/%E6%97%A5%E6%9C%AC%E8%AA%9E%2F%E5%90%8D%E5%89%8D"
```
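
The two conversions above can be reproduced with the standard library, which may help when checking expected values by hand. This is a sketch using Python's built-in `idna` codec and `urllib.parse.quote`, not necessarily the calls URLPath makes internally; `decode_hostname` and `encode_segment` are hypothetical names:

```python
from urllib.parse import quote


def decode_hostname(hostname: str) -> str:
    """Turn an ASCII (punycode) hostname into its Unicode form."""
    return hostname.encode("ascii").decode("idna")


def encode_segment(segment: str) -> str:
    """Percent-encode one path segment, including any '/' it contains."""
    # safe="" means the slash is escaped too, so it cannot split the segment.
    return quote(segment, safe="")


host = decode_hostname("www.xn--alliancefranaise-npb.nu")
name = encode_segment("\u65e5\u672c\u8a9e/\u540d\u524d")
```

Escaping the slash is what keeps a name containing `/` as a single path segment rather than two.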

## Testing the examples

You can find additional examples in the doctest script at [doctests.md](tests/doctests.md).

See the [test suite](tests/test_url.py) for more usage patterns and edge cases.

Run `make test` to execute tests and ensure the published examples stay up to date.