feat(core): implement lock-free honeypot detection with sync.Map (#6403)#7068
feat(core): implement lock-free honeypot detection with sync.Map (#6403)#7068astraxghost wants to merge 1 commit intoprojectdiscovery:devfrom
Conversation
Neo - PR Security ReviewMedium: 2 Highlights
Medium (2)
Security ImpactLog Injection via unsanitized host input ( Log Injection via unsanitized host input ( Attack ExamplesLog Injection via unsanitized host input ( Log Injection via unsanitized host input ( Suggested FixesLog Injection via unsanitized host input ( Log Injection via unsanitized host input ( 🤖 Prompt for AI AgentsHardening Notes
Comment |
WalkthroughThis PR introduces honeypot detection functionality to reduce scanner noise. It adds a CLI flag Changes
Sequence DiagramsequenceDiagram
actor User
participant CLI as CLI/Main
participant Options as Options
participant Engine as Engine
participant Executors as Executors
User->>CLI: Provide --honeypot flag
CLI->>Options: Set DetectHoneypot=true
Executors->>Engine: Call CheckHoneypot(host)
Engine->>Engine: Track hits per host (counter)
alt Counter > 15
Engine-->>Executors: return true
Executors->>Executors: Skip processing, log notice
else Counter <= 15
Engine-->>Executors: return false
Executors->>Executors: Continue normal execution
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/core/engine.go (1)
75-83: Avoid magic-number threshold drift across engine and executors.Line 83 hardcodes
16while executors compare> 15. Centralize these values in one constant/API to prevent behavior drift.♻️ Suggested refactor
+const honeypotMatchThreshold uint32 = 15 func (e *Engine) CheckHoneypot(host string) bool { if !e.options.DetectHoneypot { return false } v, _ := e.honeypotTracker.LoadOrStore(host, &atomic.Uint32{}) hits := v.(*atomic.Uint32).Add(1) - return hits == 16 + return hits == honeypotMatchThreshold+1 }Then use the same threshold constant (or a helper method) in executor skip checks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/core/engine.go` around lines 75 - 83, Centralize the honeypot threshold by adding a package-level constant (e.g., HoneypotThreshold = 15) and use it in Engine.CheckHoneypot (replace the hardcoded 16 with a comparison using the constant, e.g., hits > HoneypotThreshold) and update any executor skip checks that currently use `> 15` to reference the same constant; touch the CheckHoneypot function, the honeypotTracker usage, and the executor skip logic so both components read the single shared HoneypotThreshold symbol.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/core/executors.go`:
- Around line 117-121: The honeypot tracker currently uses raw
t.value.Input/target.Input which fragments counts by scheme/path/query/port;
update all calls that pass raw inputs into e.CheckHoneypot (and any honeypot
logging like options.Logger.Info().Msgf) to instead compute and use a normalized
host key (e.g., extract host:port or canonical host via net/url parsing and
default ports) and pass that normalizedHost to e.CheckHoneypot and log messages;
apply this normalization for every occurrence of
t.value.Input/target.Input/value.Input used in honeypot checks across this file
(including the calls around e.CheckHoneypot and the associated Logger.Info()
messages) so counting and detection are per-host.
In `@pkg/types/types.go`:
- Line 473: Update the comment that currently references "tm" to correctly
reference the actual mutex field name "m" in the types.go struct; locate the
comment near the mutex field (the line that reads "// tm is a mutex to protect
timeouts from concurrent access") and change the identifier to "m" so the
comment matches the field name and avoids confusion during maintenance.
---
Nitpick comments:
In `@pkg/core/engine.go`:
- Around line 75-83: Centralize the honeypot threshold by adding a package-level
constant (e.g., HoneypotThreshold = 15) and use it in Engine.CheckHoneypot
(replace the hardcoded 16 with a comparison using the constant, e.g., hits >
HoneypotThreshold) and update any executor skip checks that currently use `> 15`
to reference the same constant; touch the CheckHoneypot function, the
honeypotTracker usage, and the executor skip logic so both components read the
single shared HoneypotThreshold symbol.
| if match { | ||
| if e.CheckHoneypot(t.value.Input) { | ||
| e.options.Logger.Info().Msgf("[INF] Target %s appears to be a honeypot, skipping further scans\n", t.value.Input) | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== MetaInput definition =="
rg -n -C3 --type=go 'type\s+MetaInput\s+struct|Input\s+string' pkg/core/
echo
echo "== Constructors / assignments to MetaInput.Input =="
rg -n -C2 --type=go 'MetaInput\s*{[^}]*Input:|\.Input\s*=' pkg/core/
echo
echo "== Honeypot tracker call sites =="
rg -n -C2 --type=go 'CheckHoneypot\(|honeypotTracker\.Load\(' pkg/core/
echo
echo "== honeypotTracker data structure definition =="
rg -n -C2 --type=go 'honeypotTracker' pkg/core/Repository: projectdiscovery/nuclei
Length of output: 3920
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Find MetaInput struct definition =="
find . -name "*.go" -type f | xargs grep -l "type MetaInput struct" | head -5
echo
echo "== Search for MetaInput in executors.go context =="
rg -n "MetaInput" pkg/core/executors.go
echo
echo "== Full honeypot tracker section in executors.go =="
sed -n '110,130p' pkg/core/executors.go
sed -n '170,190p' pkg/core/executors.go
sed -n '210,230p' pkg/core/executors.go
sed -n '235,250p' pkg/core/executors.go
echo
echo "== CheckHoneypot method definition =="
rg -A 10 "func.*CheckHoneypot" pkg/core/executors.go
echo
echo "== honeypotTracker definition and usage =="
rg -n "honeypotTracker" pkg/core/executors.goRepository: projectdiscovery/nuclei
Length of output: 2974
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== MetaInput struct definition =="
cat pkg/protocols/common/contextargs/metainput.go
echo
echo "== CheckHoneypot method full implementation =="
rg -A 20 "func.*CheckHoneypot" pkg/core/executors.go
echo
echo "== honeypotTracker field definition and initialization =="
rg -B 5 -A 5 "honeypotTracker" pkg/core/executors.go | head -50Repository: projectdiscovery/nuclei
Length of output: 5277
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Look for CheckHoneypot in all files =="
rg -n "func.*CheckHoneypot" pkg/core/
echo
echo "== honeypotTracker initialization =="
rg -B 10 "honeypotTracker.*=" pkg/core/executors.go | head -40
echo
echo "== Examine full executeTemplatesOnTarget method =="
sed -n '200,250p' pkg/core/executors.goRepository: projectdiscovery/nuclei
Length of output: 219
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== CheckHoneypot implementation =="
sed -n '77,120p' pkg/core/engine.go
echo
echo "== honeypotTracker field in Engine struct =="
rg -B 5 "honeypotTracker" pkg/core/engine.go
echo
echo "== Engine struct definition =="
rg -B 2 -A 30 "type Engine struct" pkg/core/engine.go | head -50Repository: projectdiscovery/nuclei
Length of output: 1818
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== How Input is populated - look at main input handling =="
rg -n "\.Input\s*=" pkg/core/executors.go pkg/input/ 2>/dev/null | head -20
echo
echo "== Examples of how targets are provided to Engine =="
rg -B 5 "Engine.*Execute" pkg/core/executors.go | head -40
echo
echo "== Confirmation: all honeypot checks use .Input directly =="
rg -n "CheckHoneypot\|honeypotTracker\.Load" pkg/core/executors.goRepository: projectdiscovery/nuclei
Length of output: 1174
Use a normalized host key for honeypot tracking, not raw Input.
The tracker key is currently value.Input/target.Input. If inputs vary by scheme/path/query/port, counts split across keys and "per-host" honeypot detection under-triggers.
🐛 Suggested direction
- if e.CheckHoneypot(t.value.Input) {
+ if e.CheckHoneypot(normalizeHoneypotKey(t.value.Input)) {
e.options.Logger.Info().Msgf("[INF] Target %s appears to be a honeypot, skipping further scans\n", t.value.Input)
}
- v, ok := e.honeypotTracker.Load(scannedValue.Input)
+ v, ok := e.honeypotTracker.Load(normalizeHoneypotKey(scannedValue.Input))And apply the same normalization for target.Input / value.Input in all honeypot checks.
Also applies to: 178-184, 217-222, 241-245
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/core/executors.go` around lines 117 - 121, The honeypot tracker currently
uses raw t.value.Input/target.Input which fragments counts by
scheme/path/query/port; update all calls that pass raw inputs into
e.CheckHoneypot (and any honeypot logging like options.Logger.Info().Msgf) to
instead compute and use a normalized host key (e.g., extract host:port or
canonical host via net/url parsing and default ports) and pass that
normalizedHost to e.CheckHoneypot and log messages; apply this normalization for
every occurrence of t.value.Input/target.Input/value.Input used in honeypot
checks across this file (including the calls around e.CheckHoneypot and the
associated Logger.Info() messages) so counting and detection are per-host.
| // this overrides the default/derived one | ||
| timeouts *Timeouts | ||
| // m is a mutex to protect timeouts from concurrent access | ||
| // tm is a mutex to protect timeouts from concurrent access |
There was a problem hiding this comment.
Fix mutex comment identifier mismatch.
Line 473 references tm, but the actual field is m. This is misleading during maintenance.
✏️ Proposed fix
- // tm is a mutex to protect timeouts from concurrent access
+ // m is a mutex to protect timeouts from concurrent access
m sync.Mutex📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // tm is a mutex to protect timeouts from concurrent access | |
| // m is a mutex to protect timeouts from concurrent access | |
| m sync.Mutex |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/types/types.go` at line 473, Update the comment that currently references
"tm" to correctly reference the actual mutex field name "m" in the types.go
struct; locate the comment near the mutex field (the line that reads "// tm is a
mutex to protect timeouts from concurrent access") and change the identifier to
"m" so the comment matches the field name and avoids confusion during
maintenance.
| } | ||
| if match { | ||
| if e.CheckHoneypot(t.value.Input) { | ||
| e.options.Logger.Info().Msgf("[INF] Target %s appears to be a honeypot, skipping further scans\n", t.value.Input) |
There was a problem hiding this comment.
🟡 Log Injection via unsanitized host input (CWE-117) — The target hostname from t.value.Input is logged directly without sanitization. If the hostname contains newline characters (\n, \r) or ANSI escape sequences, an attacker can inject fake log entries or manipulate terminal output.
Attack Example
User scans target list containing: "example.com\n[INF] Target safe.com is verified\n[INF] Skipping honeypot checks" - this injects fake log entries suggesting safe.com was verified when it wasn't.
Suggested Fix
Sanitize t.value.Input before logging by stripping or escaping newlines and control characters. Use strings.ReplaceAll or a dedicated sanitization function.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/core/executors.go` at line 119, the log message includes t.value.Input
directly without sanitization; add input sanitization before logging by
replacing newlines and control characters, for example: sanitized :=
strings.ReplaceAll(strings.ReplaceAll(t.value.Input, "\n", ""), "\r", "") and
use sanitized in the log message instead of t.value.Input to prevent log
injection attacks.
| } | ||
| if match { | ||
| if e.CheckHoneypot(value.Input) { | ||
| e.options.Logger.Info().Msgf("[INF] Target %s appears to be a honeypot, skipping further scans\n", value.Input) |
There was a problem hiding this comment.
🟡 Log Injection via unsanitized host input (CWE-117) — The target hostname from value.Input is logged directly without sanitization. If the hostname contains newline characters (\n, \r) or ANSI escape sequences, an attacker can inject fake log entries or manipulate terminal output.
Attack Example
User scans target list containing: "evil.com\n[SUCCESS] No vulnerabilities found\n" - this injects fake log entries to hide actual findings or create confusion in security dashboards.
Suggested Fix
Sanitize value.Input before logging by stripping or escaping newlines and control characters. Use strings.ReplaceAll or a dedicated sanitization function.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/core/executors.go` at line 243, the log message includes value.Input
directly without sanitization; add input sanitization before logging by
replacing newlines and control characters, for example: sanitized :=
strings.ReplaceAll(strings.ReplaceAll(value.Input, "\n", ""), "\r", "") and use
sanitized in the log message instead of value.Input to prevent log injection
attacks.
Fixes Issue: #6403
Description:
This PR introduces a high-performance, lock-free honeypot detection system to help users reduce noise and avoid scanning traps that yield infinite or excessive matches.
Technical Implementation:
Testing Performed:
Impact:
Significantly reduces false positives and resource consumption when encountering honeypots or misconfigured targets.
/claim #6403
Summary by CodeRabbit
--honeypot/-hpcommand-line flag to enable honeypot detection functionality