Harden inline creative rendering #459

Open
prk-Jr wants to merge 4 commits into main from fix/401-harden-inline-creative-rendering

prk-Jr (Collaborator) commented Mar 7, 2026

Summary

  • Harden the core requestAds renderer so untrusted inline creatives cannot escape the iframe sandbox or execute retained dangerous markup.
  • Fail closed on malformed or sanitized-away creatives with structured rejection metadata while avoiding raw creative HTML in logs.
  • Add regression coverage for sandbox permissions, dangerous URI/style payloads, malformed creatives, and accepted safe markup.

Changes

| File | Change |
|---|---|
| crates/js/lib/package.json | Add dompurify as a runtime dependency for core creative sanitization. |
| crates/js/lib/package-lock.json | Lock the new DOMPurify dependency and its transitive package metadata. |
| crates/js/lib/src/core/render.ts | Sanitize untrusted creative HTML, reject malformed or dangerous markup, and tighten iframe sandbox permissions. |
| crates/js/lib/src/core/request.ts | Route every inline creative through the sanitizer before srcdoc injection and add structured render/rejection logging metadata. |
| crates/js/lib/test/core/render.test.ts | Cover sandbox tokens, accepted safe markup, rejected dangerous URI/style payloads, malformed creatives, and empty sanitization results. |
| crates/js/lib/test/core/request.test.ts | Cover safe request-path rendering plus fail-closed behavior for dangerous, malformed, and empty creatives without logging raw HTML. |

Closes

Closes #401

Test plan

  • cargo test --workspace
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo fmt --all -- --check
  • JS tests: cd crates/js/lib && npx vitest run
  • JS format: cd crates/js/lib && npm run format
  • Docs format: cd docs && npm run format
  • WASM build: cargo build --bin trusted-server-fastly --release --target wasm32-wasip1
  • Manual testing via fastly compute serve
  • Other: cd crates/js/lib && npm run build

Checklist

  • Changes follow CLAUDE.md conventions
  • No unwrap() in production code — use expect("should ...")
  • Uses tracing macros (not println!)
  • New code has tests
  • No secrets or credentials committed

prk-Jr self-assigned this Mar 7, 2026

aram356 (Collaborator) left a comment

Summary

Solid security hardening — removing allow-scripts/allow-same-origin from the sandbox, rejecting dangerous creatives rather than silently sanitizing, and fixing String.replace $-sequence injection. However, slot clearing before validation creates a regression where rejected bids blank the slot, and the URI detection has gaps that could cause both false negatives (data:image/svg+xml) and false positives (benign URI attribute removals).

Blocking

🔧 wrench

  • Slot blanking on rejection: container.innerHTML = '' runs before sanitization — rejected creatives destroy existing slot content. In multi-bid scenarios, a later rejected bid erases an earlier successful render. (request.ts:96)
  • Missing bid.adm guard: The bid.adm && check from main was removed, so bids with missing/empty/malformed adm enter the render path, clear the slot, then get rejected. (request.ts:51)
  • Narrow data: URI pattern: Only blocks data:text/html, missing data:text/xml, data:application/xhtml+xml, and data:image/svg+xml (SVG can embed <script>). (render.ts:35)
  • Over-aggressive URI attribute flagging: isDangerousRemoval flags any removed URI attribute as dangerous regardless of value, causing false rejections for benign creatives. Inconsistent with hasDangerousMarkup which correctly checks the value. (render.ts:108)
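
The data: URI gap in the third bullet comes down to checking the media type against a list rather than matching a single prefix. A minimal sketch of that check, written in Rust for illustration rather than the TypeScript used in render.ts; the name `is_dangerous_data_uri` and the constant are hypothetical, and the deny list contains only the types named above:

```rust
/// Media types that can carry active content when loaded from a `data:` URI.
/// Assumption: this list mirrors the types named in the review and is not exhaustive.
const DANGEROUS_DATA_MIME_TYPES: [&str; 4] = [
    "text/html",
    "text/xml",
    "application/xhtml+xml",
    "image/svg+xml",
];

/// Hypothetical helper: returns true when `value` is a `data:` URI whose
/// media type is on the deny list above.
fn is_dangerous_data_uri(value: &str) -> bool {
    // Leading whitespace is ignored and the scheme is matched case-insensitively.
    let lowered = value.trim_start().to_ascii_lowercase();
    let Some(rest) = lowered.strip_prefix("data:") else {
        return false;
    };
    // The media type ends at the first `;` (parameters) or `,` (payload).
    let mime = rest.split([';', ',']).next().unwrap_or("").trim();
    DANGEROUS_DATA_MIME_TYPES.iter().any(|m| *m == mime)
}
```

Under this sketch a `data:image/png` value is left alone while `data:image/svg+xml` is flagged, which also addresses the fourth bullet: the decision depends on the attribute value, not on the attribute name alone.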

Non-blocking

🤔 thinking

  • 3.8x bundle size increase: DOMPurify is statically imported into tsjs-core.js (8,964 B → 34,160 B raw, 3,788 B → 12,940 B gzip). The build uses inlineDynamicImports: true so lazy import() won't help. Since the policy is reject-only, hasDangerousMarkup (native <template> parser) already does the full detection. Consider removing DOMPurify entirely or moving sanitization server-side.
  • Static-only creative contract without rollout guard: Removing allow-scripts + allow-same-origin and rejecting script-bearing markup is a major behavioral shift. Most DSP creatives use JavaScript for tracking, viewability, and click handling. Consider a strict-render feature flag (default off) with rejection metrics, rolled out by seat/publisher.

♻️ refactor

  • Inconsistent sandbox policy: <form> is in DANGEROUS_TAG_NAMES (rejected) but allow-forms is in CREATIVE_SANDBOX_TOKENS (permitted). Remove allow-forms or stop rejecting <form>. (render.ts:38)
  • hasDangerousMarkup lacks intent documentation: The post-sanitization re-scan is a valid safety net for sanitizer bugs, but the comment doesn't explain why DOMPurify output is being re-scanned. (render.ts:119)

⛏ nitpick

  • srcdoc in URI_ATTRIBUTE_NAMES: srcdoc is HTML content, not a URI. DOMPurify already strips it. (render.ts:33)

🌱 seedling

  • Missing test coverage: (1) multi-bid same slot where one bid is rejected, (2) sanitizer-unavailable path, (3) data:image/svg+xml with embedded script, (4) explicit test documenting script-based creatives are intentionally rejected.

👍 praise

  • buildCreativeDocument $-sequence fix: Function callbacks in String.replace prevent replacement pattern injection. Well-tested. (render.ts:337)
  • Structured rejection logging: Rejection logs include metadata without leaking raw creative HTML. Tests verify no raw HTML in log output. (request.ts:100)

CI Status

  • cargo fmt: PASS
  • cargo test: PASS
  • vitest: PASS
  • format-typescript: PASS
  • format-docs: PASS
  • CodeQL: PASS

ChristianPavilonis (Collaborator) left a comment

No new findings in this review pass.

aram356 (Collaborator) left a comment

Summary

Adds server-side HTML sanitization for inline ad creatives via lol_html, tightens the client-side iframe sandbox, and improves render-path logging. The sanitizer architecture (server strips dangerous markup, client validates type/emptiness, sandbox enforces no-script) is sound, but the fail-open fallback paths and unsanitized <style> content need to be addressed before merge.

Blocking

🔧 wrench

  • Fail-open on oversized input: sanitize_creative_html returns raw markup when input exceeds MAX_CREATIVE_SIZE — should return empty string (creative.rs:355)
  • Fail-open on parse errors: Raw markup returned when lol_html fails to parse — should return empty string (creative.rs:464)
  • <style> element content not sanitized: Inline style attributes are checked but <style> blocks pass through with expression(), @import, etc. (creative.rs:402)

❓ question

  • Is preserving <style> elements intentional?: <link> is stripped but <style> is allowed — inconsistent treatment (creative.rs:393)

Non-blocking

🤔 thinking

  • Proxy path skips sanitization: CreativeHtmlProcessor (in proxy.rs) only runs through rewrite_creative_html, not sanitize_creative_html. Probably intentional since proxied pages may legitimately need scripts/iframes, but worth documenting the trust boundary difference.
  • removedCount always 0 on client: Client-side sanitization fields are always identity/zero, could mislead operators (render.ts:71)

♻️ refactor

  • data-src and srcset not checked: Missing from the URI attribute check list for defense-in-depth (creative.rs:413)

🌱 seedling

  • Missing sanitizer + rewriter integration test: No test runs both in sequence as formats.rs does

📝 note

  • Sandbox removes allow-scripts: Deliberate defense-in-depth but will break creatives relying on inline JS for click tracking, viewability, or animation — worth validating with real-world ad creatives

⛏ nitpick

  • unwrap_or("") on infallible split: .split().next() can never be None (creative.rs:323)

CI Status

All 10 checks pass: cargo fmt, cargo test, vitest, format-typescript, format-docs, CodeQL, Analyze (actions, javascript-typescript x2, rust).

markup.len(),
MAX_CREATIVE_SIZE
);
return markup.to_owned();

🔧 wrench — Fail-open: oversized input bypasses sanitization

When markup.len() > MAX_CREATIVE_SIZE, the raw unsanitized markup is returned unchanged. An attacker who controls creative content could exceed the 1 MiB limit, causing <script>, on* handlers, etc. to pass through to the client.

Fix — fail closed by returning an empty string:

if markup.len() > MAX_CREATIVE_SIZE {
    log::warn!("sanitize_creative_html: creative too large; rejecting");
    return String::new();
}

The test sanitize_returns_unchanged_when_over_size_limit would need updating to assert an empty string.

// rewriter is in an error state and may produce garbage output.
if rewriter.write(markup.as_bytes()).is_err() || rewriter.end().is_err() {
    log::warn!("sanitize_creative_html: html parse error; returning markup unchanged");
    return markup.to_owned();

🔧 wrench — Fail-open: parse errors bypass sanitization

If rewriter.write() or rewriter.end() fails, the raw unsanitized markup is returned. A crafted input that triggers a lol_html parse error would skip all sanitization, and the browser's more forgiving HTML parser would execute the payload.

Fix — fail closed:

if rewriter.write(markup.as_bytes()).is_err() || rewriter.end().is_err() {
    log::warn!("sanitize_creative_html: html parse error; rejecting markup");
    return String::new();
}
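
Both fail-open paths flagged in this review reduce to the same rule: anything other than a successful sanitization pass yields an empty string. A minimal sketch of that shape follows. `sanitize_fail_closed` and the injected `rewrite` closure are hypothetical stand-ins for the lol_html pass in creative.rs, and the 1 MiB value is assumed from the size-limit comment:

```rust
/// Size cap assumed from the review text ("the 1 MiB limit").
const MAX_CREATIVE_SIZE: usize = 1024 * 1024;

/// Hypothetical wrapper: `rewrite` stands in for the real lol_html pass and
/// returns `Err` on any parse failure.
fn sanitize_fail_closed<F>(markup: &str, rewrite: F) -> String
where
    F: FnOnce(&str) -> Result<String, String>,
{
    // Oversized input is rejected outright instead of being passed through.
    if markup.len() > MAX_CREATIVE_SIZE {
        return String::new();
    }
    // A parse error also yields an empty creative, never the raw markup.
    rewrite(markup).unwrap_or_default()
}
```

Funnelling both cases through one return path makes it harder for a later change to reintroduce a branch that returns `markup.to_owned()`.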

// <script>). Attribute mutations on removed elements are benign — lol_html
// discards the tag — but the handler still fires. This is intentional and
// harmless.
element!("*", |el| {

🔧 wrench: <style> element content not sanitized

The element!("*", ...) handler strips dangerous inline style attributes, but <style> element content passes through unchecked:

<style>div { background: expression(alert(document.cookie)) }</style>

@import url(https://evil.example/...) inside <style> blocks is also unrestricted. Consider stripping <style> elements entirely (like <link>) or adding a content handler to scan for dangerous patterns.
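
If a content handler is preferred over stripping, the scan could look roughly like the sketch below. The helper name `style_has_dangerous_css` and the pattern list are assumptions based on the patterns mentioned in this review, not code from the PR. Substring matching of this kind can be bypassed with CSS escapes or comments, so removing <style> entirely remains the more conservative option.

```rust
/// Substrings treated as dangerous inside `<style>` text.
/// Assumption: the list is illustrative and taken from patterns named in the review.
const DANGEROUS_CSS_PATTERNS: [&str; 4] =
    ["expression(", "@import", "-moz-binding", "javascript:"];

/// Hypothetical helper: returns true when the CSS text contains any pattern above.
fn style_has_dangerous_css(css: &str) -> bool {
    // Lowercase and drop whitespace so `EXPRESSION (` and similar variants match.
    let normalized: String = css
        .chars()
        .filter(|c| !c.is_whitespace())
        .map(|c| c.to_ascii_lowercase())
        .collect();
    DANGEROUS_CSS_PATTERNS
        .iter()
        .any(|pattern| normalized.contains(*pattern))
}
```

A handler built on this would collect the text of each <style> element and remove the element when the function returns true.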

el.remove();
Ok(())
}),
element!("link", |el| {

❓ question: Is preserving <style> elements intentional?

<link> is stripped but <style> is allowed through. This is inconsistent — <style> can carry CSS injection (expression(), -moz-binding), @import for external resources, and data exfiltration via url(). Is this for creative rendering fidelity, or an oversight?

originalLength,
sanitizedHtml: creativeHtml,
sanitizedLength: originalLength,
removedCount: 0,

🤔 thinking: sanitizedLength and removedCount are always identity/zero

Client-side sanitizeCreativeHtml always returns removedCount: 0 and sanitizedLength === originalLength since it only checks type/emptiness. These get logged on render, which could mislead operators. Consider removing these fields from the client types or commenting that the delta is always 0 here.

el.remove_attribute(attr);
}

for attr_name in &[

♻️ refactor: data-src and srcset not checked for dangerous URI schemes

The URI attribute list misses srcset and data-src. While data-src needs JS to activate (blocked by sandbox) and javascript: in srcset has limited browser support, defense-in-depth would include them.
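
Unlike the other URI attributes, srcset holds a comma-separated list of candidates, so it needs a per-candidate check. A rough sketch, assuming hypothetical helpers `has_dangerous_scheme` and `srcset_has_dangerous_url` and an illustrative scheme list; the naive comma split does not fully parse data: URIs that contain commas, but the leading segment still carries the scheme and media type:

```rust
/// Hypothetical helper: true when a single URL starts with a rejected scheme.
/// Assumption: the prefix list is illustrative, not the PR's actual list.
fn has_dangerous_scheme(url: &str) -> bool {
    let lowered = url.trim_start().to_ascii_lowercase();
    ["javascript:", "data:text/html", "data:image/svg+xml"]
        .iter()
        .any(|prefix| lowered.starts_with(*prefix))
}

/// Hypothetical helper: checks every candidate in a `srcset` value.
/// Each candidate is a URL followed by an optional width or density descriptor.
fn srcset_has_dangerous_url(srcset: &str) -> bool {
    srcset
        .split(',')
        // Keep only the URL part of each candidate; empty candidates are skipped.
        .filter_map(|candidate| candidate.split_whitespace().next())
        .any(has_dangerous_scheme)
}
```

data-src holds a single URL, so it could be passed to `has_dangerous_scheme` directly.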

.trim_start_matches("data:")
.split([';', ','])
.next()
.unwrap_or("");

⛏ nitpick: unwrap_or("") on .split().next(), which can never be None

.split().next() always yields at least one element. Could use .expect("should have at least one split segment") per project conventions, but the unwrap_or fallback is harmless.

Development

Successfully merging this pull request may close these issues.

Unsanitized creative HTML injected into iframe with weakened sandbox

3 participants