
fix: mitigate SSRF risk in report URL validity checks#82

Closed
DavidJBianco wants to merge 1 commit into main from codex/fix-unvalidated-url-checks-in-evaluator

@DavidJBianco

Collaborator

Motivation

  • The asynchronous URL validity metric sent unvalidated HEAD requests to every http(s) URL extracted from reports, which enabled SSRF and internal-network probing when run on untrusted inputs.
  • The goal is to prevent outbound requests to internal or metadata endpoints while preserving external URL validation functionality.

Description

  • Added an ipaddress import and a helper, is_safe_public_url, for evaluate_url_validity_async in evaluations/research-agent-team-eval/evaluator.py. The helper rejects URLs with a non-HTTP(S) scheme, a missing host, a localhost or .local hostname, or an IP-literal host that is private, loopback, link-local, multicast, reserved, or unspecified.
  • Filtered the extracted URLs to only probe safe_urls, recorded blocked_urls count in results, and updated feedback to mention skipped private/internal URLs.
  • Disabled redirect following on HEAD requests (allow_redirects=False) to reduce redirect-based SSRF exposure.
  • Kept sampling behavior (max 20 checks) but applied it only to eligible safe URLs so external URL validation is preserved.
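
The checks described above can be sketched roughly as follows. This is a minimal reconstruction from the description, not the code in the commit: the body of is_safe_public_url, the select_urls_to_probe helper, the MAX_URL_CHECKS name, and the "take the first 20" sampling rule are all assumptions made for illustration.

```python
import ipaddress
from urllib.parse import urlparse

MAX_URL_CHECKS = 20  # assumed name; the description only says "max 20 checks"


def is_safe_public_url(url: str) -> bool:
    """Return True if the URL looks eligible for an outbound HEAD probe."""
    try:
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower().rstrip(".")
    except ValueError:
        # urlparse raises on malformed input such as an unclosed IPv6 bracket.
        return False

    if parsed.scheme not in ("http", "https"):
        return False
    if not host:
        return False
    if host == "localhost" or host.endswith(".local"):
        return False

    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Hostname is not a direct IP literal; keep it eligible.
        return True

    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def select_urls_to_probe(urls):
    """Hypothetical helper: filter first, then cap the number of probes.

    Returns (urls_to_check, blocked_count). Taking the first MAX_URL_CHECKS
    entries is an assumption; the real sampling strategy is not shown here.
    """
    safe_urls = [u for u in urls if is_safe_public_url(u)]
    blocked_urls = len(urls) - len(safe_urls)
    return safe_urls[:MAX_URL_CHECKS], blocked_urls
```

Filtering before sampling means blocked URLs do not use up any of the 20 probe slots, which is how external URL validation stays intact.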

Testing

  • Successfully compiled the modified module with python -m compileall evaluations/research-agent-team-eval/evaluator.py.
  • Attempted pytest -q tests/unit_tests/test_oauth_url_validation.py with PYTHONPATH=., but collection failed due to a missing external dependency (autogen_agentchat) in the environment, so unit tests could not be fully executed here.

Codex Task

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50a5aeff2d

Comment on lines +1015 to +1017

    except ValueError:
        # Hostname is not a direct IP literal; keep it eligible.
        pass


[P1] Resolve hostnames before classifying URLs as safe

The new SSRF guard only blocks IP literals; any hostname that is not directly parseable as an IP is marked eligible (except ValueError: ... pass). That means attacker-controlled domains (for example DNS names that resolve to 169.254.169.254 or RFC1918 targets) will still be fetched by session.head, so internal-network probing remains possible when evaluating untrusted reports. Resolve hostnames and reject non-global resolved addresses before making outbound requests.
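
One way the suggested resolution step might look is sketched below. It is not part of this PR; hostname_resolves_publicly and all_addresses_global are hypothetical names, and the sketch uses the blocking socket.getaddrinfo for brevity, whereas an async evaluator would more likely call the event loop's getaddrinfo.

```python
import ipaddress
import socket


def all_addresses_global(addresses):
    """True only if the list is non-empty and every address is public."""
    parsed = []
    for addr in addresses:
        # Drop any IPv6 zone suffix such as "%eth0" before parsing.
        parsed.append(ipaddress.ip_address(addr.split("%")[0]))
    # Multicast is rejected explicitly rather than relying on is_global alone.
    return bool(parsed) and all(
        ip.is_global and not ip.is_multicast for ip in parsed
    )


def hostname_resolves_publicly(host, port=443):
    """Resolve host and require every returned address to be public."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except (socket.gaierror, UnicodeError):
        # Unresolvable or malformed hostnames are treated as unsafe.
        return False
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    addresses = [info[4][0] for info in infos]
    return all_addresses_global(addresses)
```

A check like this narrows the gap but does not close it on its own: the HTTP client performs its own lookup afterwards, so a hostname whose DNS answer changes between the check and the request could still reach an internal address unless the connection is pinned to the validated IP.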


@DavidJBianco

Collaborator Author

Merged into dev via #84 (batch-a-quick-fixes, commit 05b4499). Note: public hostnames that DNS-resolve to private IPs are not blocked by the is_safe_public_url() filter. That is acceptable for an offline eval tool, but worth a follow-up if the evaluator ever runs in an automated pipeline.
