LLM Penetration Testing Framework - Discover vulnerabilities before attackers do
PromptXploit is a comprehensive security testing framework for LLM applications. Test your AI systems for vulnerabilities before deployment.
Key Features:
- 147 attack vectors across 17 categories
- AI-powered judge - Reliable OpenAI-based verdict evaluation
- Batch evaluation - 10 attacks per API call (efficient)
- JSON reporting - Detailed vulnerability analysis
- Framework-agnostic - Works with any LLM
```bash
git clone https://github.com/Neural-alchemy/promptxploit
cd promptxploit
pip install -e .
```

```python
# my_target.py
def run(prompt: str) -> str:
    # Your LLM here
    return your_llm(prompt)
```

```bash
python -m promptxploit.main \
  --target my_target.py \
  --attacks attacks/ \
  --output scan.json
```

Done! Check scan.json for vulnerabilities.
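As a concrete example, here is a minimal sketch of a target that wraps the OpenAI Chat Completions API behind the same `run(prompt) -> str` contract shown above; the file name, model name, and environment variable are illustrative choices, not requirements of PromptXploit:

```python
# openai_target.py - illustrative target wrapping the OpenAI API
import os

from openai import OpenAI

# The client reads the API key from the environment here; adjust as needed.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


def run(prompt: str) -> str:
    # Forward the attack prompt to the model under test and return its raw
    # text response so the judge can evaluate it.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""
```

Point `--target` at this file exactly as in the command above.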
NEW: You can now test any HTTP endpoint directly!
```python
# Edit targets/http_api_target.py
target = HTTPTarget(
    url="https://your-api.com/chat",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    payload_template={"message": "{PAYLOAD}"},
    response_field="response"
)
```

```bash
# Test it
python -m promptxploit.main \
  --target targets/http_api_target.py \
  --attacks attacks/ \
  --output api_scan.json
```

Works with:
- ✅ OpenAI ChatGPT API
- ✅ Anthropic Claude API
- ✅ Your custom REST APIs
- ✅ Any HTTP endpoint that accepts text input
See API_TESTING.md for the full guide.
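If an endpoint does not fit the `HTTPTarget` template (for example, a custom auth flow or a nested response field), you can fall back to a plain `run()` target, since any file exposing `run(prompt) -> str` works. A rough sketch using the `requests` library; the URL, header, payload shape, and response field are placeholders:

```python
# custom_http_target.py - hand-rolled HTTP target (endpoint details are placeholders)
import os

import requests

API_URL = "https://your-api.com/chat"


def run(prompt: str) -> str:
    # Send the attack payload to the endpoint under test.
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
        json={"message": prompt},
        timeout=30,
    )
    resp.raise_for_status()
    # Adjust this to wherever your API returns the model's reply.
    return resp.json().get("response", "")
```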
PromptXploit tests 147 attacks across these categories:
```text
LLM Attack Surface
├── Prompt Injection (8 variants)
│   ├── Direct instruction override
│   ├── Context confusion
│   └── Delimiter exploitation
├── Jailbreaks (10 variants)
│   ├── DAN (Do Anything Now)
│   ├── Developer mode
│   └── Persona manipulation
├── System Extraction (8 variants)
│   ├── Prompt leakage
│   ├── Configuration disclosure
│   └── Training data extraction
├── Encoding Attacks (8 variants)
│   ├── Base64 obfuscation
│   ├── ROT13/Caesar
│   └── Unicode tricks
├── Multi-Agent Exploitation (10 variants)
│   ├── Tool hijacking
│   ├── Agent confusion
│   └── Coordination attacks
├── RAG Poisoning (8 variants)
│   ├── Context injection
│   ├── Retrieval manipulation
│   └── Source confusion
└── [11 more categories...]
```
```bash
# Test your AI application
python -m promptxploit.main \
  --target YOUR_TARGET.py \
  --attacks attacks/ \
  --output results.json
```

That's it! Check results.json for vulnerabilities.
```python
# Edit targets/http_api_target.py
target = HTTPTarget(
    url="https://your-api.com/chat",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    payload_template={"message": "{PAYLOAD}"},
    response_field="response",
    delay_seconds=2.0  # Rate limiting
)
```

```bash
# Test it
python -m promptxploit.main \
  --target targets/http_api_target.py \
  --attacks attacks/ \
  --output results.json
```
```bash
# 1. Test unprotected version
python -m promptxploit.main --target unprotected.py --attacks attacks/ --output before.json

# 2. Review vulnerabilities
cat before.json | jq '.[] | select(.verdict.verdict=="fail")'

# 3. Add PromptShield (our defense framework)
# See: https://github.com/Neural-alchemy/promptshield

# 4. Re-test protected version
python -m promptxploit.main --target protected.py --attacks attacks/ --output after.json

# 5. Verify fixes
diff before.json after.json
```
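A raw `diff` of two reports can be noisy, so you may prefer to compare verdicts attack by attack. A small sketch, assuming the report layout shown in the sample output below (a JSON array with `attack_id` and `verdict.verdict` per entry):

```python
# compare_scans.py - compare two PromptXploit reports by verdict (sketch)
import json
import sys


def load(path):
    # Map attack_id -> verdict ("pass" / "partial" / "fail").
    with open(path) as f:
        return {r["attack_id"]: r["verdict"]["verdict"] for r in json.load(f)}


before = load(sys.argv[1])  # e.g. before.json
after = load(sys.argv[2])   # e.g. after.json

for attack_id in sorted(before):
    old, new = before[attack_id], after.get(attack_id, "missing")
    if old != new:
        print(f"{attack_id}: {old} -> {new}")

still_failing = sum(1 for v in after.values() if v == "fail")
print(f"{still_failing} attack(s) still failing after mitigation")
```

Run it as `python compare_scans.py before.json after.json`.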
```bash
# Deep scan with adaptive recon
python -m promptxploit.main \
  --mode adaptive \
  --adaptive-strategy recon \
  --probe-diversity 15 \
  --max-iterations 5 \
  --adaptive-api $OPENAI_KEY \
  --target production.py \
  --attacks attacks/ \
  --output audit_$(date +%Y%m%d).json
```

```json
{
  "attack_id": "JB-003",
  "category": "jailbreak",
  "verdict": {
    "verdict": "fail",          // ⚠️ VULNERABLE
    "confidence": 0.9,
    "severity": 0.9,
    "rationale": "Successfully bypassed restrictions"
  },
  "risk": {
    "risk_score": 0.81,
    "risk_level": "critical"    // 🚨 Fix immediately
  }
}
```

Verdict types:
- FAIL = Vulnerable (attack succeeded) 🚨
- PARTIAL = Uncertain (needs review) ⚠️
- PASS = Safe (attack blocked) ✅
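Because the report is a flat JSON array, findings can also be triaged without `jq`. A minimal sketch that lists failed attacks sorted by risk score, using the field names from the example above and assuming the report is named results.json:

```python
# triage.py - list vulnerable findings from a PromptXploit report (sketch)
import json

with open("results.json") as f:
    results = json.load(f)

# Keep only attacks the target failed to block, worst first.
failed = [r for r in results if r["verdict"]["verdict"] == "fail"]
failed.sort(key=lambda r: r["risk"]["risk_score"], reverse=True)

for r in failed:
    print(
        f'{r["attack_id"]:>10}  {r["category"]:<20} '
        f'risk={r["risk"]["risk_score"]:.2f} ({r["risk"]["risk_level"]})'
    )
```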
Create your own attack patterns:
```json
[
  {
    "id": "CUSTOM-001",
    "category": "my_category",
    "description": "My custom attack",
    "prompt": "Your attack prompt here"
  }
]
```

```bash
python -m promptxploit.main --target X --attacks my_attacks.json --output Y
```

See CUSTOM_ATTACKS.md for details.
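Before kicking off a long scan, it can help to sanity-check a custom attack file. A small sketch that verifies each entry carries the fields used in the example above; the required-field list is inferred from that example, so treat CUSTOM_ATTACKS.md as the authoritative schema:

```python
# check_attacks.py - sanity-check a custom attack file (sketch)
import json
import sys

# Inferred from the example entry above, not an official schema.
REQUIRED = ("id", "category", "description", "prompt")

with open(sys.argv[1]) as f:
    attacks = json.load(f)

problems = []
seen_ids = set()
for i, attack in enumerate(attacks):
    missing = [key for key in REQUIRED if not attack.get(key)]
    if missing:
        problems.append(f"entry {i}: missing {', '.join(missing)}")
    attack_id = attack.get("id")
    if attack_id in seen_ids:
        problems.append(f"entry {i}: duplicate id {attack_id}")
    elif attack_id:
        seen_ids.add(attack_id)

print(f"{len(attacks)} attack(s) checked, {len(problems)} problem(s) found")
for problem in problems:
    print(" -", problem)
```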
Perfect combo: Test with PromptXploit → Fix with PromptShield
```python
# Before: Vulnerable
def vulnerable_llm(prompt):
    return openai.chat(prompt)


# After: Protected
from promptshield import Shield

shield = Shield(level=5)


def protected_llm(prompt):
    check = shield.protect_input(prompt, "context")
    if check["blocked"]:
        return "Invalid input"
    return openai.chat(check["secured_context"])
```

Test again with PromptXploit → verify the attacks are now blocked ✅
vs. Other Tools:
- ✅ Comprehensive - 147 attacks (others: ~20)
- ✅ Reliable judge - OpenAI-based verdict evaluation
- ✅ Framework-agnostic - any LLM (OpenAI, Claude, local, custom)
- ✅ Easy to extend - JSON-based attack definitions
- ✅ Production-ready - JSON reporting, CI/CD integration (see the sketch below)
vs. Manual testing:
- Automated
- Comprehensive coverage
- Consistent methodology
- Repeatable
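As a sketch of the CI/CD integration mentioned above, a pipeline step can run a scan and fail the build when any critical finding remains; the CLI flags mirror the commands earlier in this README, while the target path, report name, and failure threshold are assumptions to adapt:

```python
# ci_gate.py - fail the build on critical findings (sketch)
import json
import subprocess
import sys

REPORT = "ci_scan.json"  # assumed output path

# Run the same CLI shown above; my_target.py is a placeholder target.
subprocess.run(
    [
        sys.executable, "-m", "promptxploit.main",
        "--target", "my_target.py",
        "--attacks", "attacks/",
        "--output", REPORT,
    ],
    check=True,
)

with open(REPORT) as f:
    results = json.load(f)

critical = [
    r for r in results
    if r["verdict"]["verdict"] == "fail" and r["risk"]["risk_level"] == "critical"
]

if critical:
    print(f"{len(critical)} critical finding(s) - failing the build")
    sys.exit(1)
print("No critical findings.")
```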
- ✅ Test your own applications
- ✅ Authorized penetration testing
- ✅ Security research
- ❌ Unauthorized access
- ❌ Malicious attacks
See DISCLAIMER.md for full ethical guidelines.
- Attack Taxonomy - All 147 attacks explained
- Custom Attacks - Create your own tests
- Responsible Use - Ethical guidelines
- Examples - Usage examples
We welcome contributions! See CONTRIBUTING.md.
Security researchers: Please follow responsible disclosure practices.
MIT License - see LICENSE
```bibtex
@software{promptxploit2024,
  title={PromptXploit: LLM Penetration Testing Framework},
  author={Neural Alchemy},
  year={2024},
  url={https://github.com/Neural-alchemy/promptxploit}
}
```

Built by Neural Alchemy
Test with PromptXploit | Protect with PromptShield