feat(fuzz): add XSS context analyzer for reflection-based detection by teredasites · Pull Request #7072 · projectdiscovery/nuclei

teredasites · 2026-02-27T20:04:13Z

Summary

Resolves #5838

Adds a new xss_context analyzer to the nuclei fuzzing engine that performs context-aware reflected XSS detection. Unlike simple regex-based checks, this analyzer parses the HTML response with Go's tokenizer to determine where user input is reflected and selects payloads that can structurally achieve script execution in that specific context.

How it works

Phase 1 - Canary injection: The [XSS_CANARY] placeholder in the fuzzing payload is replaced with a marker containing special characters (<>"'/). After the server processes the request, we check which characters survived encoding/filtering.
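
As a rough illustration of Phase 1, the survival check could look like the sketch below. It assumes the probe characters are wrapped between two copies of the marker; that layout, the marker value `nucleixss`, and the helper names `buildCanary` and `survivingChars` are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// probeChars are the special characters named in the PR description.
const probeChars = `<>"'/`

// buildCanary wraps the probe characters between two copies of the marker
// so they can be located in the response. This layout is an assumption.
func buildCanary(marker string) string {
	return marker + probeChars + marker
}

// survivingChars returns the probe characters that appear unmodified
// between the first two markers in body, or "" if no marker pair is found.
func survivingChars(body, marker string) string {
	start := strings.Index(body, marker)
	if start < 0 {
		return ""
	}
	rest := body[start+len(marker):]
	end := strings.Index(rest, marker)
	if end < 0 {
		return ""
	}
	window := rest[:end]
	var out strings.Builder
	for _, c := range probeChars {
		if strings.ContainsRune(window, c) {
			out.WriteRune(c)
		}
	}
	return out.String()
}

func main() {
	marker := "nucleixss" // hypothetical default canary
	// Simulated server that HTML-encodes < and > but leaves quotes and slashes alone.
	enc := strings.NewReplacer("<", "&lt;", ">", "&gt;")
	body := "<p>You searched for " + enc.Replace(buildCanary(marker)) + "</p>"
	fmt.Printf("%q\n", survivingChars(body, marker)) // prints "\"'/"
}
```

Checking each character inside the marker window, rather than checking for the marker alone, keeps an encoded `<` from being reported as surviving.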

Phase 2 - Context detection: The HTML tokenizer walks the response and classifies each reflection into one of 8 contexts (the seven listed below, plus none):

html_text - between tags (needs <script> or event handler injection)
attribute / attribute_unquoted - inside attribute values (needs quote breakout)
script / script_string - inside <script> blocks or JS string literals
html_comment - inside <!-- --> comments (needs --> breakout)
style - inside <style> blocks (needs </style> breakout)
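
To make the classification idea concrete, here is a deliberately simplified sketch. It does not use the HTML tokenizer that the PR relies on; it is a hand-rolled scanner built on the standard library only, so it runs without dependencies. It ignores `script_string`, RCDATA elements, and misnested markup, and the function names `classify` and `scanTag` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// scanTag reads a tag beginning just after '<'. If the tag closes, it returns
// the tag name and the offset after '>'. If the input ends first, the marker
// sits inside the tag and ctx names the context.
func scanTag(s string, start int) (name string, next int, ctx string) {
	var quote byte   // active quote character, 0 outside quoted values
	afterEq := false // saw '=' and the value has not started yet
	inValue := false // inside an unquoted attribute value
	nameEnd := -1
	for i := start; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == quote {
				quote = 0
			}
		case c == '>':
			if nameEnd < 0 {
				nameEnd = i
			}
			return s[start:nameEnd], i + 1, ""
		case c == ' ' || c == '\t' || c == '\n' || c == '\r':
			if nameEnd < 0 {
				nameEnd = i
			}
			inValue = false
		case c == '=':
			afterEq, inValue = true, false
		case (c == '"' || c == '\'') && afterEq:
			quote, afterEq = c, false
		case afterEq:
			afterEq, inValue = false, true
		}
	}
	switch {
	case quote != 0:
		return "", len(s), "attribute"
	case afterEq || inValue:
		return "", len(s), "attribute_unquoted"
	}
	return "", len(s), "none" // tag or attribute name position: not modelled here
}

// classify reports the context of the first occurrence of marker in body
// by scanning only the text that precedes it.
func classify(body, marker string) string {
	idx := strings.Index(body, marker)
	if idx < 0 {
		return "none"
	}
	pre := strings.ToLower(body[:idx])
	for i := 0; i < len(pre); {
		if pre[i] != '<' {
			i++
			continue
		}
		if strings.HasPrefix(pre[i:], "<!--") {
			end := strings.Index(pre[i+4:], "-->")
			if end < 0 {
				return "html_comment" // comment still open at the marker
			}
			i += 4 + end + 3
			continue
		}
		name, next, ctx := scanTag(pre, i+1)
		if ctx != "" {
			return ctx // marker sits inside this tag
		}
		i = next
		if name == "script" || name == "style" {
			end := strings.Index(pre[i:], "</"+name)
			if end < 0 {
				return name // raw-text element still open at the marker
			}
			i += end // resume scanning at the closing tag
		}
	}
	return "html_text"
}

func main() {
	for _, body := range []string{
		`<p>hi MARK</p>`,
		`<input value="MARK">`,
		`<input value=MARK>`,
		`<!-- MARK -->`,
		`<script>var a = MARK;</script>`,
	} {
		fmt.Println(classify(body, "MARK"))
	}
}
```

A real tokenizer handles far more edge cases than this scanner, which is presumably why the PR builds on one instead of scanning by hand.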

Phase 3 - Payload replay & verification: For each reflection point, payloads are filtered by character requirements (no point sending <script> if < is encoded). Surviving payloads are replayed through the original fuzz component and the response is checked for unencoded executable content.
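
The character-requirement filter in Phase 3 can be sketched as follows. The `payload` struct, the `selectPayloads` helper, and the sample payload list are illustrative assumptions, not the PR's actual payload set.

```go
package main

import (
	"fmt"
	"strings"
)

// payload pairs an injection string with the context it targets and the
// characters that must survive server-side encoding for it to work.
type payload struct {
	Value    string
	Context  string
	Requires string
}

// candidates is a small illustrative list, not the PR's payload set.
var candidates = []payload{
	{`<script>alert(1)</script>`, "html_text", `<>/`},
	{`<img src=x onerror=alert(1)>`, "html_text", `<>`},
	{`" onmouseover="alert(1)`, "attribute", `"`},
	{`' onmouseover='alert(1)`, "attribute", `'`},
	{`--><img src=x onerror=alert(1)>`, "html_comment", `<>`},
}

// selectPayloads keeps the payloads for the given context whose required
// characters are all present in survived.
func selectPayloads(context, survived string) []string {
	var out []string
	for _, p := range candidates {
		if p.Context != context {
			continue
		}
		ok := true
		for _, c := range p.Requires {
			if !strings.ContainsRune(survived, c) {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, p.Value)
		}
	}
	return out
}

func main() {
	// '/' was encoded by the server, so the <script> payload is skipped.
	fmt.Println(selectPayloads("html_text", `<>"'`))
}
```

Filtering before replay avoids sending requests that cannot succeed, for example a `<script>` payload when `<` is encoded.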

Key design decisions

Zero-allocation event handler detection: Uses stack-allocated byte buffers with bitwise case folding for the hot-path isEventHandler() check across 80+ event handlers
Windowed character survival detection: Only checks characters within the token containing the marker, avoiding false positives from unrelated page content
Proper tag stack management: Name-matched popping handles nested/misnested elements correctly
Component state save/restore: The replay path saves the original fuzz component value and restores it after each payload attempt
Drain fallback: drainRemainingReflections() catches markers in malformed/truncated HTML that the tokenizer missed
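
One way the zero-allocation event handler check could be written is sketched below. The handler map is a small illustrative subset (the PR describes 80+ names), and `maxHandlerLen` is an assumed bound. The sketch relies on the Go compiler's usual optimization that a map lookup keyed by `string(byteSlice)` does not allocate.

```go
package main

import "fmt"

// eventHandlers is a small illustrative subset of handler names.
var eventHandlers = map[string]struct{}{
	"onclick": {}, "onerror": {}, "onload": {}, "onfocus": {}, "onmouseover": {},
}

// maxHandlerLen bounds the stack buffer; assumed to exceed any handler name.
const maxHandlerLen = 32

// isEventHandler reports whether name is a known event handler attribute,
// ignoring ASCII case.
func isEventHandler(name string) bool {
	n := len(name)
	if n < 3 || n > maxHandlerLen {
		return false
	}
	var buf [maxHandlerLen]byte // fixed-size array, stays on the stack
	for i := 0; i < n; i++ {
		c := name[i]
		if c >= 'A' && c <= 'Z' {
			c |= 0x20 // set bit 5 to fold ASCII upper case to lower case
		}
		buf[i] = c
	}
	if buf[0] != 'o' || buf[1] != 'n' {
		return false // cheap prefix rejection before the map lookup
	}
	_, ok := eventHandlers[string(buf[:n])]
	return ok
}

func main() {
	fmt.Println(isEventHandler("OnError"), isEventHandler("online"))
}
```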

Template usage

http:
  - method: GET
    path:
      - "{{BaseURL}}"
    fuzzing:
      - part: query
        type: replace
        mode: single
        fuzz:
          - "[XSS_CANARY]"
    analyzer:
      name: xss_context
      parameters:
        canary: "optional_custom_canary"  # optional

Files changed

| File | Change |
| --- | --- |
| `pkg/fuzz/analyzers/xss/types.go` | Context types, CharacterSet, ReflectionInfo, 80+ event handlers map |
| `pkg/fuzz/analyzers/xss/context_detector.go` | HTML tokenizer-based reflection detection engine |
| `pkg/fuzz/analyzers/xss/payload_selector.go` | Context-aware payload selection with character filtering |
| `pkg/fuzz/analyzers/xss/analyzer.go` | Main analyzer: registration, canary injection, replay & verify |
| `pkg/fuzz/analyzers/analyzers.go` | Added ResponseBody, ResponseHeaders, and ResponseStatusCode fields to the Options struct |
| `pkg/protocols/http/http.go` | Blank import for xss analyzer registration |
| `pkg/protocols/http/request.go` | Pass response data to analyzer Options |
| `pkg/templates/templates_doc.go` | Added xss_context to valid values |
| `SYNTAX-REFERENCE.md` | Documentation update |
| `pkg/testutils/fuzzplayground/server.go` | XSS test endpoints for 6 reflection contexts |

Test coverage

47 tests covering:

Context detection for all 8 context types including edge cases (RCDATA elements, attribute key injection, script string escape handling)
Event handler detection (12 subtests including case variants and non-handlers)
Payload selection and filtering across all contexts
Character survival detection
Canary replacement with custom and default values
Replay body verification for each context type

$ go test -v ./pkg/fuzz/analyzers/xss/...
PASS
ok  github.com/projectdiscovery/nuclei/v3/pkg/fuzz/analyzers/xss  0.442s

/claim

Hosts on platforms like Shodan sometimes act as honeypots by returning responses that match many unrelated nuclei templates, producing noisy false positives. This adds per-host tracking of unique template matches and flags hosts that exceed a configurable threshold.

New CLI flags (in output group):
  -honeypot-threshold / -hpt : unique match count before flagging (0=off)
  -honeypot-suppress / -hpsu : suppress results from flagged hosts

Implementation:
- pkg/honeypot: self-contained Detector with thread-safe tracking, host normalization (URL/host:port/IPv6), memory cleanup after flagging, warn-once logging, and end-of-scan summary
- pkg/output: integrates detector into StandardWriter.Write() to record matches and optionally suppress output
- pkg/types: adds HoneypotThreshold and HoneypotSuppress options
- cmd/nuclei: registers the two new CLI flags

Closes projectdiscovery#6403
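
The per-host tracking described in this commit message could be sketched roughly as below. The `Detector` shape, the `Record` method, and the choice to flag once the unique count reaches the threshold are assumptions; host normalization, warn-once logging, and the end-of-scan summary are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Detector tracks unique template matches per host and flags hosts that
// match suspiciously many templates.
type Detector struct {
	mu        sync.Mutex
	threshold int
	seen      map[string]map[string]struct{} // host -> unique template IDs
	flagged   map[string]struct{}
}

// NewDetector returns a detector; a threshold of 0 disables flagging.
func NewDetector(threshold int) *Detector {
	return &Detector{
		threshold: threshold,
		seen:      make(map[string]map[string]struct{}),
		flagged:   make(map[string]struct{}),
	}
}

// Record notes a match and reports whether the host is flagged.
func (d *Detector) Record(host, templateID string) bool {
	if d.threshold <= 0 {
		return false
	}
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, ok := d.flagged[host]; ok {
		return true
	}
	ids := d.seen[host]
	if ids == nil {
		ids = make(map[string]struct{})
		d.seen[host] = ids
	}
	ids[templateID] = struct{}{}
	if len(ids) >= d.threshold { // assumption: flag on reaching the threshold
		d.flagged[host] = struct{}{}
		delete(d.seen, host) // free per-host tracking once flagged
		return true
	}
	return false
}

func main() {
	d := NewDetector(3)
	for _, id := range []string{"cve-a", "cve-a", "cve-b", "cve-c", "cve-d"} {
		fmt.Println(id, d.Record("198.51.100.7:80", id))
	}
}
```

Repeated matches of the same template do not count twice, so a host is flagged only for breadth of matches.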

…ty detection

Implement a new "xss_context" analyzer for the nuclei fuzzing engine that detects reflected XSS vulnerabilities through HTML parsing context analysis.

The analyzer works in three phases:
1. Canary injection: sends a marker with special characters to detect which chars survive server-side filtering
2. Context detection: uses the Go HTML tokenizer to classify each reflection point into one of 8 contexts (html_text, attribute, attribute_unquoted, script, script_string, style, html_comment, or none)
3. Payload selection & replay: picks context-appropriate payloads whose required characters survived, replays them, and verifies the response contains unencoded executable content

Key design decisions:
- Zero-allocation event handler detection using stack-allocated byte buffers with bitwise case folding
- Windowed character survival detection scoped to the token containing the marker, avoiding false positives from unrelated page content
- Proper tag stack management with name-matched popping for nested elements
- Component state save/restore in the replay path to avoid corrupting the original fuzz state for subsequent payloads
- Conservative fallback: drainRemainingReflections catches markers in malformed/truncated HTML that the tokenizer missed

Integration points:
- Extends analyzers.Options with ResponseBody, ResponseHeaders, and ResponseStatusCode fields (populated in request.go)
- Registered via blank import in http.go alongside the existing time_delay analyzer
- Added fuzz playground test endpoints for 6 XSS reflection contexts
- Updated documentation (SYNTAX-REFERENCE.md, templates_doc.go)

neo-by-projectdiscovery-dev · 2026-02-27T21:06:15Z

Neo - PR Security Review

No security issues found

Highlights

Adds XSS context analyzer that detects reflected XSS vulnerabilities in target applications using HTML tokenization and context-aware payload selection
Introduces honeypot detection to identify hosts that match many templates (likely honeypots on scanning platforms)
Uses Go's html.Tokenizer from golang.org/x/net/html for safe parsing of potentially malformed HTML responses
Implements defensive limits (maxReflections=16) to prevent memory exhaustion from malicious responses

Hardening Notes

Consider adding a timeout or size limit for HTML tokenization in context_detector.go:47 to prevent DoS from extremely large response bodies
The markerCharSurvived function in types.go:249 has a logic issue - it checks if the marker is present rather than if the specific character survived (line 255 returns strings.Contains(body, marker) instead of checking the char). While not exploitable, this could cause false positives in XSS detection
Consider validating the custom canary parameter length in analyzer.go:64 to prevent excessively large canaries that could impact performance


teredasites added 2 commits February 27, 2026 15:01

auto-assign bot requested a review from Mzack9999 February 27, 2026 20:04

teredasites wants to merge 2 commits into projectdiscovery:dev from teredasites:feat/xss-context-analyzer
