Update ASI06_Memory_and_Context_Poisoning .md #718
Signed-off-by: Joshua Beck <[email protected]>
Hi team!
Great work.
I added a few comments.
Do not forget to add a comparison to our existing frameworks.
Signed-off-by: Joshua Beck <[email protected]>
@itskerenkatz - I have added changes based on your feedback. Can we get this PR merged first and then add the references to other OWASP works afterwards? I am not sure I can make that change now, and I would like to handle it in a separate PR.
**Access Control & Retention Policies:**
* Limit access to trusted sources only, using authentication and authorization for user access, and curated data streams for ingesting potentially dangerous data.
* Apply context-aware policies so an agent only accesses memory relevant to its current task.
* Limit retention durations based on data sensitivity to reduce long-term risk.
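As a rough illustration of how the three bullets above could combine into a single read-time check, here is a minimal sketch. The `MemoryEntry` fields, the `RETENTION` windows, and the `readable_entries` helper are all hypothetical names chosen for this example; they are not part of the draft or of any particular agent framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical retention windows per sensitivity label (illustrative values only).
RETENTION = {
    "low": timedelta(days=90),
    "medium": timedelta(days=30),
    "high": timedelta(days=1),
}


@dataclass(frozen=True)
class MemoryEntry:
    text: str
    task: str            # task scope the entry was written under
    source: str          # origin of the content
    sensitivity: str     # key into RETENTION
    created_at: datetime


def readable_entries(entries, current_task, trusted_sources, now):
    """Return only the entries the agent may read for the current task."""
    allowed = []
    for entry in entries:
        if entry.source not in trusted_sources:
            continue  # untrusted origin: never surfaced to the agent
        if entry.task != current_task:
            continue  # outside the current task scope
        if now - entry.created_at > RETENTION[entry.sensitivity]:
            continue  # past its retention window
        allowed.append(entry)
    return allowed
```

Filtering at read time keeps the policy in one place; a real deployment would likely also delete expired entries from storage rather than only hiding them.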
For prevention, an additional point of view:
Temporal Drift Monitoring: Detect slow memory poisoning by watching for behavioral, goal, or plan drift over time. Treat memory like a cache and evict entries at regular intervals with strong forget policies.
Mohsin ([email protected])
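One possible reading of the temporal drift suggestion above, sketched under assumptions: the agent's behavior is summarized as a numeric score per interaction (how that score is computed is out of scope here), and memory entries carry a write timestamp. `DriftMonitor` and `evict_expired` are hypothetical names for this example.

```python
from collections import deque


class DriftMonitor:
    """Flags slow drift by comparing a recent window of scores to a fixed baseline."""

    def __init__(self, baseline, window=5, threshold=0.2):
        self.baseline = baseline
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, score):
        """Record a score; return True once the window mean strays past the threshold."""
        self.recent.append(score)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough history to judge yet
        mean = sum(self.recent) / len(self.recent)
        return abs(mean - self.baseline) > self.threshold


def evict_expired(entries, now, ttl_seconds):
    """Cache-style forgetting: keep only (timestamp, value) pairs within the TTL."""
    return [(ts, value) for ts, value in entries if now - ts <= ttl_seconds]
```

Averaging over a window rather than alerting on single scores is what lets this catch gradual poisoning while tolerating one-off noise; the window size and threshold would need tuning per agent.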
3. Systemic Misalignment and Backdoors: Memory poisoning can have more subtle and severe consequences than simply producing wrong results. A poisoned LLM can take on a new, malicious persona, deviating from its intended purpose. Attackers can also use this technique to install a backdoor, such as a secret instruction that remains inactive until a certain trigger phrase is entered. When the LLM encounters this phrase, it carries out the disguised malicious instructions, such as producing destructive code or transmitting sensitive data.

4. Cascading Failures and Data Exfiltration: A single poisoned memory entry in a sophisticated multi-agent system (MAS) might have a domino effect, resulting in a cascading failure. One agent may retrieve corrupted data and then share it with others, causing the system to become unstable. Malicious instructions can also be planted in memory as persistent instructions, leading the LLM to access sensitive user or enterprise data and send it to an attacker. This data exfiltration poses a significant risk because the model may hold legitimate access to data repositories and then be manipulated into using that access maliciously.
Proposing as a cause:
Cognitive Drift refers to the slow, unintended divergence of an agent’s internal understanding or memory from the real world due to:
- Context accumulation noise (imprecise memory blending over time)
- Incomplete or partial rollbacks after memory poisoning or error correction
- Benign feedback loops (e.g. agents "confirming" each other's summaries or plans)
- Stale or decayed memory vectors causing off-target retrievals
- Summary hallucinations in memory compression steps (e.g. distillation of chat history)
Unlike direct attacks, cognitive drift is unintentional and often goes undetected until it triggers a cascade, leading to systemic failure without a clear attacker.
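To make one of the causes listed above concrete, the benign feedback loop in which agents "confirm" each other, here is a small sketch. It assumes each confirmation records the original evidence it relied on; that provenance field and the function names are hypothetical and are not taken from the draft.

```python
def independent_roots(confirmations):
    """Return the distinct root sources behind a claim.

    Each confirmation is an (agent_id, root_source) pair, where root_source
    is the original evidence the agent relied on (an assumed provenance field).
    """
    return {root for _agent, root in confirmations}


def is_echo(confirmations, min_roots=2):
    """True when several agents agree but the agreement traces to too few roots."""
    agents = {agent for agent, _root in confirmations}
    return len(agents) >= 2 and len(independent_roots(confirmations)) < min_roots
```

For example, three agents that all restate a summary derived from the same document would be flagged, while two agents citing different documents would not.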
Signed-off-by: Joshua Beck <[email protected]>
Changes have been added, and this is in a good initial state. I would like to get the PR merged so that further comments can be handled in additional PRs rather than cluttering this one. If we want a different approach, that's fine; please let me know!
[ASI06 V1]
Main work done in Google Doc, transferring to here for review and comments.