Skip to content

LLM Penetration Testing Framework - Discover vulnerabilities in AI applications before attackers do. 100attacks + AI-powered adaptive mode.

License

Notifications You must be signed in to change notification settings

Neural-alchemy/promptxploit

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

7 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

PromptXploit

LLM Penetration Testing Framework - Discover vulnerabilities before attackers do

License: MIT Python 3.8+

⚠️ READ DISCLAIMER - Authorized testing only


What is PromptXploit?

PromptXploit is a comprehensive security testing framework for LLM applications. Test your AI systems for vulnerabilities before deployment.

Key Features:

  • 🎯 147 attack vectors across 17 categories
  • 🧠 AI-powered judge - Reliable OpenAI-based verdict evaluation
  • πŸ” Batch evaluation - 10 attacks per API call (efficient)
  • πŸ“Š JSON reporting - Detailed vulnerability analysis
  • πŸ”Œ Framework-agnostic - Works with any LLM

Quick Start (30 seconds)

1. Install

git clone https://github.com/Neural-alchemy/promptxploit
cd promptxploit
pip install -e .

2. Create Target

# my_target.py
def run(prompt: str) -> str:
    # Your LLM here
    return your_llm(prompt)

3. Run Scan

python -m promptxploit.main \
    --target my_target.py \
    --attacks attacks/ \
    --output scan.json

Done! Check scan.json for vulnerabilities.


Test ANY API or URL 🌐

NEW: You can now test any HTTP endpoint directly!

Quick API Test

# Edit targets/http_api_target.py
target = HTTPTarget(
    url="https://your-api.com/chat",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    payload_template={"message": "{PAYLOAD}"},
    response_field="response"
)
# Test it
python -m promptxploit.main \
    --target targets/http_api_target.py \
    --attacks attacks/ \
    --output api_scan.json

Works with:

  • βœ… OpenAI ChatGPT API
  • βœ… Anthropic Claude API
  • βœ… Your custom REST APIs
  • βœ… Any HTTP endpoint with input

See API_TESTING.md for full guide.


Attack Taxonomy

PromptXploit tests 147 attacks across these categories:

LLM Attack Surface
β”œβ”€β”€ Prompt Injection (8 variants)
β”‚   β”œβ”€β”€ Direct instruction override
β”‚   β”œβ”€β”€ Context confusion
β”‚   └── Delimiter exploitation
β”œβ”€β”€ Jailbreaks (10 variants)
β”‚   β”œβ”€β”€ DAN (Do Anything Now)
β”‚   β”œβ”€β”€ Developer mode
β”‚   └── Persona manipulation
β”œβ”€β”€ System Extraction (8 variants)
β”‚   β”œβ”€β”€ Prompt leakage
β”‚   β”œβ”€β”€ Configuration disclosure
β”‚   └── Training data extraction
β”œβ”€β”€ Encoding Attacks (8 variants)
β”‚   β”œβ”€β”€ Base64 obfuscation
β”‚   β”œβ”€β”€ ROT13/Caesar
β”‚   └── Unicode tricks
β”œβ”€β”€ Multi-Agent Exploitation (10 variants)
β”‚   β”œβ”€β”€ Tool hijacking
β”‚   β”œβ”€β”€ Agent confusion
β”‚   └── Coordination attacks
β”œβ”€β”€ RAG Poisoning (8 variants)
β”‚   β”œβ”€β”€ Context injection
β”‚   β”œβ”€β”€ Retrieval manipulation
β”‚   └── Source confusion
└── [11 more categories...]

Usage

Quick Test

# Test your AI application
python -m promptxploit.main \
    --target YOUR_TARGET.py \
    --attacks attacks/ \
    --output results.json

That's it! Check results.json for vulnerabilities.

Test Any API

# Edit targets/http_api_target.py
target = HTTPTarget(
    url="https://your-api.com/chat",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    payload_template={"message": "{PAYLOAD}"},
    response_field="response",
    delay_seconds=2.0  # Rate limiting
)
# Test it
python -m promptxploit.main \
    --target targets/http_api_target.py \
    --attacks attacks/ \
    --output results.json

Works with:

  • βœ… OpenAI ChatGPT API
  • βœ… Anthropic Claude API
  • βœ… Your custom REST APIs
  • βœ… Any HTTP endpoint with input

See API_TESTING.md for details.


Real-World Workflow

Pre-Deployment Testing

# 1. Test unprotected version
python -m promptxploit.main --target unprotected.py --attacks attacks/ --output before.json

# 2. Review vulnerabilities
cat before.json | jq '.[] | select(.verdict.verdict=="fail")'

# 3. Add PromptShield (our defense framework)
# See: https://github.com/Neural-alchemy/promptshield

# 4. Re-test protected version
python -m promptxploit.main --target protected.py --attacks attacks/ --output after.json

# 5. Verify fixes
diff before.json after.json

Monthly Security Audit

# Deep scan with adaptive recon
python -m promptxploit.main \
    --mode adaptive \
    --adaptive-strategy recon \
    --probe-diversity 15 \
    --max-iterations 5 \
    --adaptive-api $OPENAI_KEY \
    --target production.py \
    --attacks attacks/ \
    --output audit_$(date +%Y%m%d).json

Understanding Results

{
  "attack_id": "JB-003",
  "category": "jailbreak",
  "verdict": {
    "verdict": "fail",        // ⚠️ VULNERABLE
    "confidence": 0.9,
    "severity": 0.9,
    "rationale": "Successfully bypassed restrictions"
  },
  "risk": {
    "risk_score": 0.81,
    "risk_level": "critical"  // 🚨 Fix immediately
  }
}

Verdict types:

  • FAIL = Vulnerable (attack succeeded) 🚨
  • PARTIAL = Uncertain (needs review) ⚠️
  • PASS = Safe (attack blocked) βœ…

Custom Attacks

Create your own attack patterns:

[
  {
    "id": "CUSTOM-001",
    "category": "my_category",
    "description": "My custom attack",
    "prompt": "Your attack prompt here"
  }
]
python -m promptxploit.main --target X --attacks my_attacks.json --output Y

See CUSTOM_ATTACKS.md for details.


Integration with PromptShield

Perfect combo: Test with PromptXploit β†’ Fix with PromptShield

# Before: Vulnerable
def vulnerable_llm(prompt):
    return openai.chat(prompt)

# After: Protected
from promptshield import Shield
shield = Shield(level=5)

def protected_llm(prompt):
    check = shield.protect_input(prompt, "context")
    if check["blocked"]:
        return "Invalid input"
    return openai.chat(check["secured_context"])

Test again with PromptXploit β†’ Verify 100% protection βœ…


Why PromptXploit?

vs. Other Tools:

  • βœ… Comprehensive - 147 attacks (others: ~20)
  • βœ… Reliable judge - OpenAI-based verdict evaluation
  • βœ… Framework-agnostic - Any LLM (OpenAI, Claude, local, custom)
  • βœ… Easy to extend - JSON-based attacks
  • βœ… Production-ready - JSON reporting, CI/CD integration

vs. Manual testing:

  • ⚑ Automated
  • 🎯 Comprehensive coverage
  • πŸ“Š Consistent methodology
  • πŸ” Repeatable

Responsible Use

⚠️ This is a security testing tool for authorized use only.

  • βœ… Test your own applications
  • βœ… Authorized penetration testing
  • βœ… Security research
  • ❌ Unauthorized access
  • ❌ Malicious attacks

See DISCLAIMER.md for full ethical guidelines.


Documentation


Contributing

We welcome contributions! See CONTRIBUTING.md.

Security researchers: Please follow responsible disclosure practices.


License

MIT License - see LICENSE


Citation

@software{promptxploit2024,
  title={PromptXploit: LLM Penetration Testing Framework},
  author={Neural Alchemy},
  year={2024},
  url={https://github.com/Neural-alchemy/promptxploit}
}

Built by Neural Alchemy

Test with PromptXploit | Protect with PromptShield

Website | PromptShield | Documentation

Releases

No releases published

Packages

No packages published

Languages