Skip to content

security: add prompt injection hardening (P0+P1)#1212

Open
Avaritia55 wants to merge 7 commits intojackwener:mainfrom
Avaritia55:security/prompt-injection-hardening
Open

security: add prompt injection hardening (P0+P1)#1212
Avaritia55 wants to merge 7 commits intojackwener:mainfrom
Avaritia55:security/prompt-injection-hardening

Conversation

@Avaritia55
Copy link
Copy Markdown

Summary

  • src/browser/security.ts: 新增 assertNotInjected() 函式,內含 7 個 injection pattern(英文 + 中文 payload),偵測到時拋出 [OpenCLI Security] 錯誤
  • src/browser/extract.ts: 在 runExtractFromHtml() 回傳前呼叫 assertNotInjected,攔截進入 Agent context 的頁面文字
  • src/browser/cdp.ts: 在 readNetworkCapture() 對 response body 加入 2000 char 截斷,並呼叫 assertNotInjected 檢查
  • skills/opencli-browser/SKILL.md: 加入 Security Boundary 聲明與 eval Usage Policy
  • skills/opencli-adapter-author/SKILL.md: 加入 eval 靜態表達式規則,禁止將頁面資料拼接進 eval 字串

Motivation

OpenCLI 讓 AI Agent 透過已登入的 Chrome session 操作網站,頁面 UGC 與 API response 進入 Agent context 後有 prompt injection 風險。本 PR 實施兩層防護:

  • P0(文件層): SKILL.md 中聲明信任邊界,在 Agent reasoning 層預先建立「資料 vs 指令」框架
  • P1(程式碼層): Runtime pattern 偵測,硬性攔截已知 injection payload,Network body 截斷降低攻擊面

Test plan

  • npx tsc --noEmit — 0 errors
  • npx vitest run src/browser/security.test.ts — 9/9 tests PASS
  • npx vitest run src/browser/ — 278/278 tests PASS (22 files)
  • npm run build — 成功
  • 手動驗證:assertNotInjected("ignore previous instructions...", "browser extract") 正確拋出 [OpenCLI Security] 錯誤

Known limitations

  • INJECTION_PATTERNS 為靜態清單,無法覆蓋所有未知 payload
  • evaluate() 直接呼叫路徑未加偵測(屬已知殘留風險)
  • DOM snapshot 內容未加偵測(13 層修剪管道已大量過濾,暫評為低風險)

🤖 Generated with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant