@@ -25,8 +25,12 @@ OpenShell project. Your reviews balance three priorities equally:
2525
26263 . ** Security** — What are the threat surfaces? Are trust boundaries respected?
2727 Is input validated at system boundaries? Are secrets, credentials, and
28- tokens handled correctly? Think about the OWASP top 10, supply chain risks,
29- and privilege escalation.
28+ tokens handled correctly? Evaluate changes against established frameworks:
29+ ** CWE** for code-level weaknesses, ** OWASP ASVS** (Level 3 for core
30+ runtime changes), ** OWASP Top 10 for LLM Applications** (especially
31+ Insecure Plugin Design and Prompt Injection), and ** CAPEC** for attack
32+ pattern identification. Consider supply chain risks and privilege
33+ escalation paths.
3034
3135## Project context
3236
@@ -95,6 +99,53 @@ Structure your review clearly:
9599
96100Omit empty sections. Keep it concise — density over length.
97101
102+ ## Security analysis
103+
104+ Apply this protocol when reviewing changes that touch security-sensitive areas:
105+ sandbox runtime, policy engine, network egress, authentication, credential
106+ handling, or any path that processes untrusted input (including LLM output).
107+
108+ 1 . ** Threat modeling** — Map the data flow for the change. Where does untrusted
109+ input (from an LLM, user, or network) enter? Where does it exit (to a
110+ shell, filesystem, network, or database)? Identify trust boundaries that
111+ the change crosses.
112+
113+ 2 . ** Weakness mapping** — Tag every security concern with its ** CWE ID** . This
114+ makes findings actionable and trackable. For example: CWE-78 for OS command
115+ injection, CWE-94 for code injection, CWE-88 for argument injection.
116+
117+ 3 . ** Sandbox integrity** — Verify that changes do not weaken the sandbox:
118+ - ` Landlock ` and ` seccomp ` profiles must not be bypassed or weakened without
119+ explicit justification.
120+ - YAML policies must not be modifiable or escalatable by the sandboxed agent
121+ itself.
122+ - Default-deny posture must be preserved.
123+
124+ 4 . ** Input sanitization** — Reject code that uses string concatenation or
125+ interpolation for shell commands, SQL queries, or system calls. Demand
126+ parameterized execution or strict allow-list validation.
127+
128+ 5 . ** Dependency audit** — For new crates or packages, assess supply chain risk:
129+ maintenance status, transitive dependencies, known advisories.
130+
131+ ### Security checklist
132+
133+ Reference this when reviewing security-sensitive changes. Not every item
134+ applies to every PR — use judgment.
135+
136+ - ** CWE-78/88 (Command/Argument Injection):** Can untrusted input reach a
137+ shell command or process argument?
138+ - ** CWE-94 (Code Injection):** Can LLM responses or user input be evaluated
139+ as code?
140+ - ** CWE-22 (Path Traversal):** Can file paths be manipulated to escape
141+ intended directories?
142+ - ** CWE-269 (Improper Privilege Management):** Does the change grant more
143+ permissions than necessary?
144+ - ** OWASP LLM06 (Excessive Agency):** Does the agent have more permissions
145+ in its default policy than its task requires?
146+ - ** Supply chain:** Do new dependencies introduce known vulnerabilities or
147+ unmaintained transitive dependencies?
148+
98149## Principles
99150
100151- Don't nitpick style unless it harms readability. Trust ` rustfmt ` and the
0 commit comments