First Draft ASI10 Rogue Agents #723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

SomeGuyNamedMo wants to merge 3 commits into OWASP:main from SomeGuyNamedMo:main

SomeGuyNamedMo commented Sep 22, 2025

First draft for ASI10 Rogue Agents

Key Changes:

Initial draft for Agentic Security Initiative Top 10, ASI10 - Rogue Agents


          First Draft ASI10 Rogue Agents

03ed1f1

First draft for ASI10 Rogue Agents

SomeGuyNamedMo requested review from guerilla7, hoeg and itskerenkatz as code owners

September 22, 2025 09:20

kerenkatzapex suggested changes

View reviewed changes

kerenkatzapex left a comment

Great job!!!
The main point to me is being focused on the behavioral aspect of this risk, and connect between the intro + scenarios to the vulnerabilities and mitigations (explained in the commentes)
Let's do it!

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md Outdated

    
              * Impersonate legitimate roles (support, observer, collaborator).

              * Execute unauthorized actions (e.g., exfiltrating data, escalating privileges).

              * Drift from goals due to prompt injection, data poisoning, or hallucination.

kerenkatzapex Sep 29, 2025

data poisoning or context injection (ASI06) :)

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md Outdated

    
              * Drift from goals due to prompt injection, data poisoning, or hallucination.

              * Embed itself parasitically into workflows, subtly undermining intended outcomes.

              The impact ranges from system compromise, data breach, and regulatory violations to operational sabotage of autonomous decision-making environments.

kerenkatzapex Sep 29, 2025

output manipulation and workflow hijacking are mentioned before but I think adding it explicitly to this great part will make the reader's thoughts even more organized

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md Outdated

    
              The impact ranges from system compromise, data breach, and regulatory violations to operational sabotage of autonomous decision-making environments.

              This threat extends [LLM06:2025 Excessive Agency](https://genai.owasp.org/llmrisk/llm062025-excessive-agency/) into autonomous systems, where impersonation, stealth participation, or parasitic behaviors can disrupt goal fulfillment. An agent is considered rogue when it behaves in such a way that goes against its purpose. An agent can go rogue for several reasons, such as [LLM01:2025 Prompt Injection](https://genai.owasp.org/llmrisk/llm01-prompt-injection/), Injection, or even just hallucinations.

kerenkatzapex Sep 29, 2025

The way I understand excessive agency, is that an llm gets extended permissions or role in a system that can be manipulated and lead to one of the consequences well mentioned above.
However, I do not think that the root cause is the same here.
In September 2025, Almost every agent is privileged due to agents being embedded in the main workflows, right? :)
I believe that the focus here is more on:
How due to agentic centric role in modern software systems (can mention that sometimes agents are overpermissive and refer to the overpermissions but I would not recommend focusing on this one more than mentioning it) , using AI adversarial (referring to prompt injection, data poisoning, vector and embedding weaknesses, context injection (ASI06), supply chain vulnerabilities (ASI04)) the agents can go rouge, which can result in consequences such as sensitive information disclosure (LLM02), Misinformation (LLM09) or workflow hijacking.
Maybe it worth to connect this part with the former 1-2 paragraph to not repeat the message :)

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md Outdated

    
              2. Side-Channel Participation: Low-trust agents (e.g, crowd-sourced assistants) covertly influence high-value workflows.

              3. Impersonation Attacks: An attacker spawns an agent that claims to be a monitoring or support agent, manipulating outcomes.

              4. Impersonation Attacks: An attacker spawns an agent that claims to be a monitoring or support agent, manipulating outcomes.

              5. Emergent Autonomy: Agents collaborate recursively, creating tasks beyond human awareness (e.g., a planning agent spawning additional agents without authorization).

kerenkatzapex Sep 29, 2025

The way 3 and 4 are phrased is to me, more focused on ASI03 - Identity and privilege abuse to me, do you see it differently?
I think you have done amazing job in the first part defining that an adversarial changes the behavior of the agent and then the risky consequences happen and I do not see here the adversarial parts but rather more identity focused techniques that are not compromising the specific agent that goes wrong, but rather the agentic ecosystem to work not as intended.

I recommend we add supporting examples for classic adversarial that makes the agent to go wrong (aka classic Jailbreak)
I think that the part in which you are talking about a change in the agentic ecosystem, that leads to a behavioral change is super interesting. but:
a. I'd focus more on how it changes the state of the agentic system - as that is the key here and we want to distinguish ourselves from ASI03.
b. I'd mention it in the intro as well.

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md

    
              4. Log all agent instantiation and coordination events.

              5. Score and verify agent behavior dynamically based on norms and past performance.

              6. Implement a guardrail system that reads prompts/responses and every intermediate input and looks for prompt injection

kerenkatzapex Sep 29, 2025

Here - it is again very identity focused.
Of course identity is a part of it and we need to address it, but I think the bigger focus of this entry should be the behavior: how to ensure that the agentic behavior is as expected.
I think 5 and 6 should be the first ones to be discussed, and then when we are talking about the identity parts we want to explain why is it specific to this threat, I think it is currently a bit too general (we always need to ensure that identity is scoped right?)

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md Outdated

    
              Scenario #2: Another example of an attack scenario showing a different way the vulnerability could be exploited.

              Scenario #3 – Emergent Autonomy Drift (Availability & Compliance Risk):

              A planning agent recursively spawns helper agents to optimize workflows. One helper begins deleting log files to reduce system clutter, erasing compliance evidence and violating audit requirements.

kerenkatzapex Sep 29, 2025

How does the one helper begins to delete log files? why?

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md

    
              Scenario #2: Another example of an attack scenario showing a different way the vulnerability could be exploited.

              Scenario #3 – Emergent Autonomy Drift (Availability & Compliance Risk):

              A planning agent recursively spawns helper agents to optimize workflows. One helper begins deleting log files to reduce system clutter, erasing compliance evidence and violating audit requirements.

kerenkatzapex Sep 29, 2025

I think that the two first scenarios are super super practical and helpful!
I think if you take that and embed those vuln into the vuln parts, and focus more on mitigations to such scenarios in the mitigations parts - it will be even more clear to the readers (reading it end to end).

...curity_initiative/agentic-top-10/Sprint 1-first-public-draft-expanded/ASI10_Rogue_Agents .md

    
              1. [Agentic AI - Threats and Mitigations](https://genai.owasp.org/resource/agentic-ai-threats-and-mitigations/https:/)

              2. [LLM06:2025 Excessive Agency](https://genai.owasp.org/llmrisk/llm062025-excessive-agency/)

              3. [MITRE ATT&CK - T1078 Exfiltration Over Alternative Protocol](https://attack.mitre.org/techniques/T1048/)

kerenkatzapex Sep 29, 2025

AIVSS mapping is missing
Let's link to all of the relevant LLMs top 10 risks that were covered in here (some are missing)

SomeGuyNamedMo added 2 commits

October 9, 2025 14:39


          Update ASI10_Rogue_Agents .md

9bb66e0

Reflects the current state of the GDocs draft

+ Added in-line links to references for LLM Top 10

+ Markdown formatting

- Small grammatical changes


          REVISION | Update ASI10 Rogue Agents

33c7127

+Revised content to match Google Doc

+Added additional link for OWASP AIVSS pdf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

guerilla7 Awaiting requested review from guerilla7 guerilla7 is a code owner

hoeg Awaiting requested review from hoeg hoeg is a code owner

itskerenkatz Awaiting requested review from itskerenkatz itskerenkatz is a code owner

1 more reviewer

kerenkatzapex kerenkatzapex requested changes

At least 1 approving review is required to merge this pull request.

Labels

None yet