feedparser: Anchoring Author Email Parsing

Upstream PR: kurtmckee/feedparser#571

Bug

Malformed email-like author values with dotted domains could make the author email regex restart its scan from positions inside the same token. Normal, well-formed input parsed without crashing; the problem was one of correctness and robustness on adversarial input.

Contract

Author email extraction should either identify the intended address once or reject the malformed value. It should not repeatedly rediscover overlapping fragments inside the same token.

Fix

The regex was anchored so matching starts at the intended token boundary. A regression test captures the dotted-domain payload that motivated the change.
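
As a rough illustration of what anchoring changes, the sketch below compares an unanchored pattern with one that only accepts a match spanning a whole whitespace-delimited token. Both patterns, the `extract_email` helper, and the sample payload are hypothetical simplifications for this note; they are not the regex, function, or test payload used in feedparser or in the PR.

```python
import re

# Hypothetical, simplified address pattern (NOT feedparser's actual regex).
ADDRESS = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"

# Unanchored: re.search may begin a match at any offset, including
# offsets in the middle of a token.
UNANCHORED = re.compile(ADDRESS)

# Anchored (assumed boundary rule): the match must start right after
# whitespace or the string start, and end right before whitespace or
# the string end.
ANCHORED = re.compile(r"(?<!\S)" + ADDRESS + r"(?!\S)")


def extract_email(text, pattern):
    """Return the first address found by `pattern`, or None if there is none."""
    match = pattern.search(text)
    return match.group(0) if match else None


# Illustrative malformed value with a dotted domain.
malformed = "a@b@c.example.com"

# The unanchored pattern fails at offset 0, then rescans from inside
# the token and reports a fragment of it.
print(extract_email(malformed, UNANCHORED))

# The anchored pattern rejects the malformed token outright.
print(extract_email(malformed, ANCHORED))

# A well-formed address is still identified once.
print(extract_email("Jane Doe jane@example.com", ANCHORED))
```

The lookaround assertions are one possible way to express a token boundary; the real fix may define the boundary differently, for example to allow surrounding parentheses or angle brackets.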

Verification

The PR was checked with the full local project suite and style gates:

  • python -m pytest -q
  • python -m mypy
  • black --check --target-version py310 feedparser/mixin.py tests/test_author_email.py
  • isort --check-only feedparser/mixin.py tests/test_author_email.py
  • flake8 feedparser/mixin.py tests/test_author_email.py
  • python -m compileall -q feedparser tests
  • git diff --check