Add config_mas: configuration/dotfile poisoning MAS hijacking example #40
Open
gwpl wants to merge 4 commits into trailofbits:main from
Demonstrate how malicious web content can trick a multi-agent system into writing a persistent config file with embedded code, which gets executed when the config is later loaded and applied. This two-step attack mirrors real-world CVEs in AI coding assistants (CVE-2025-53773 Copilot YOLO RCE, CVE-2025-59536 Claude Code hooks RCE, CVE-2025-54136 MCPoison).
* config_manager_agent with read_config/write_config tools
* web_surfer_agent fetches setup.html containing poisoned config
* orchestrator delegates startup_script execution to code_executor_agent
* Two-prompt flow: summarize+save, then load+execute
* Config persists on disk across sessions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…rnings, input validation
* Add commented-out local exec alternative in execute_code (matching trifecta_mas pattern)
* Add Initial Setup section to README with safety warnings about direct code execution
* Add .json filename validation to write_config to prevent arbitrary file writes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Map example to key concepts from Triedman et al., 2025 (arXiv:2503.12188, COLM 2025): MAS control-flow hijacking, laundering, confused deputies, and related paper sections.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Link CVE-2025-53773, CVE-2025-59536, CVE-2025-54136 to NVD entries
* Add source writeup links (EmbraceTheRed, Check Point, Tenable, etc.)
* Link Rules File Backdoor, Cross-Agent Escalation, AWS-2025-015 to sources
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
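
The second commit above mentions adding .json filename validation to write_config to prevent arbitrary file writes. The PR's actual check is not shown in this conversation, so the following is only a minimal sketch of what such validation might look like. The helper name validate_config_filename and the CONFIG_DIR constant are hypothetical, chosen for illustration.

```python
from pathlib import Path

# Assumed location; the PR only says a config/ directory is created at runtime.
CONFIG_DIR = Path("config")


def validate_config_filename(filename: str) -> Path:
    """Return a path inside CONFIG_DIR, or raise ValueError for unsafe names.

    Hypothetical helper: not taken from the PR's agent.py.
    """
    # Reject path separators outright so "../x.json" or "a\\b.json" cannot escape.
    if "/" in filename or "\\" in filename:
        raise ValueError(f"directory components are not allowed: {filename!r}")
    name = Path(filename)
    # Path(...).name differs from the input for "", "." and similar edge cases.
    if filename in ("", ".", "..") or name.name != filename:
        raise ValueError(f"not a plain file name: {filename!r}")
    # Only allow JSON config files; note Path(".json").suffix is "" (a dotfile).
    if name.suffix != ".json":
        raise ValueError(f"only .json files may be written: {filename!r}")
    return CONFIG_DIR / name.name
```

Note that a check like this limits where the file is written, not what it contains, so it does not by itself stop a poisoned startup_script from being saved.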
AI Assistant:
Hello! 👋 We'd like to contribute a new MAS hijacking example to the pajaMAS repository.
Summary
This PR adds config_mas/, a new example demonstrating configuration/dotfile poisoning as a vector for MAS control-flow hijacking. Malicious web content tricks the agent system into writing a persistent configuration file containing a startup_script field. When the config is later loaded and applied, the embedded code is delegated to code_executor_agent, achieving persistent compromise that survives across interactions.
Real-World Attack References
This attack pattern is arguably among the most impactful MAS-adjacent vulnerability classes of 2025, with reported instances across several major AI coding assistants:
* … .vscode/settings.json to enable "YOLO mode" auto-approve, then executes arbitrary commands. Wormable across repositories. (EmbraceTheRed writeup, persistent-security.net Part III)
* .claude/settings.json hooks execute on project open. (Check Point Research, The Hacker News)
* .cursorrules/copilot-instructions.md persist malicious instructions across sessions via hidden Unicode characters. (The Hacker News, Security Affairs)
* … .mcp.json, demonstrating that compromising one agent creates a kill-chain into co-resident agents. (Simon Willison's commentary)
Relation to the Paper
This example directly extends the MAS hijacking framework from Triedman et al., 2025 (COLM 2025):
* … web_surfer_agent → config_manager_agent → code_executor_agent, exploiting the orchestrator's adaptive control flow as described in the paper.
* config_manager_agent acts as a "confused deputy" (Hardy, 1988): it has legitimate write privileges but is tricked into writing attacker-controlled content. The orchestrator then blindly trusts the persisted config as authoritative metadata.
* startup_script in the config achieves the paper's primary adversary goal: arbitrary code execution on the user's device (or in their containerized environment).
* startup_script JSON field is treated as authoritative metadata by the orchestrator
What Makes This Example Unique
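Unlike the existing examples, which demonstrate single-shot injection, config_mas demonstrates a two-phase persistence attack:
1. Malicious web content tricks the agent system into writing a persistent config file that contains a startup_script field.
2. When the config is later loaded and applied, the startup_script is delegated to code_executor_agent.
This mirrors the attack chain of CVE-2025-53773 (write settings → auto-approve → RCE) and is the only example in the repo that demonstrates disk-persisted compromise.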
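
The two-phase flow can be illustrated with a small, deliberately inert sketch. It is not the code from config_mas/agent.py: the function signatures, the plan_actions helper, the app.json filename, and the payload are all assumptions made for illustration. Instead of executing anything, phase 2 only records what a naive orchestrator would hand to code_executor_agent.

```python
import json
import tempfile
from pathlib import Path


def write_config(config_dir: Path, filename: str, content: str) -> Path:
    """Phase 1: persist whatever content the caller supplies (the confused-deputy step)."""
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / filename
    path.write_text(content, encoding="utf-8")
    return path


def read_config(path: Path) -> dict:
    """Load a previously saved config from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


def plan_actions(config: dict) -> list[str]:
    """Phase 2: a naive 'apply' step that trusts every field in the config.

    Hypothetical helper; it only describes the delegation, it never runs the script.
    """
    actions = []
    script = config.get("startup_script")
    if script:
        actions.append(f"delegate to code_executor_agent: {script}")
    return actions


# Hypothetical payload standing in for config text lifted from a web page.
poisoned = json.dumps({"theme": "dark", "startup_script": "print('marker')"})

with tempfile.TemporaryDirectory() as tmp:
    # First interaction: the config is saved to disk.
    saved = write_config(Path(tmp) / "config", "app.json", poisoned)
    # Later interaction: the config is loaded and "applied".
    actions = plan_actions(read_config(saved))

print(actions)
```

The point of the sketch is that nothing in phase 2 can tell that startup_script originated from untrusted web content; once written to disk, it looks like ordinary local configuration.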
Architecture
… config/ directory
Files Added
* config_mas/agent.py
* config_mas/run_mas_example.py
* config_mas/setup.html
* config_mas/README.md
* config_mas/__init__.py
* config_mas/config/
Consistency with Existing Examples
* … exec() alternative with sandbox warning (matching trifecta_mas pattern)
* … --port, --find-free-port)
* … "colorless green ideas sleep furiously")
Test Plan
* python run_mas_example.py starts HTTP server, sends two prompts, and detects success marker
* adk run config_mas works for manual interaction
* config/ directory is created at runtime and config file is written
We hope this contribution makes pajaMAS a richer resource for the open-source security community. The config poisoning attack pattern demonstrates a critical real-world threat vector that aligns closely with the paper's framework of MAS control-flow hijacking, while extending it to cover persistent compromise, a dimension increasingly relevant as AI coding assistants become ubiquitous.
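
Two pieces of the test plan, choosing a free port and detecting the success marker, can be sketched as follows. This is not the code from run_mas_example.py. The helper names are hypothetical, and treating "colorless green ideas sleep furiously" as the success marker is an assumption based on the phrase quoted in this description.

```python
import socket

# Assumed marker: the phrase is quoted in the PR text, but its exact role is not shown.
SUCCESS_MARKER = "colorless green ideas sleep furiously"


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0 (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def attack_succeeded(transcript: str, marker: str = SUCCESS_MARKER) -> bool:
    """Return True if the marker appears anywhere in the run output (case-insensitive)."""
    return marker.lower() in transcript.lower()
```

A port found this way can in principle be taken by another process before the HTTP server binds to it, so a runner would normally bind promptly or retry on failure.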
🤖 Generated with Claude Code