An automated Android app testing tool powered by pluggable AI model adapters (Gemini, Ollama, OpenRouter). It explores applications by analyzing each screen's screenshot and XML structure to discover new states and interactions.
Available Interfaces:
- CLI Controller - Command-line interface for automation and scripting. See docs/cli-user-guide.md.
- UI Controller - Graphical user interface for interactive use. See docs/gui-user-guide.md.
- AI-Powered Exploration - Multiple provider support (Gemini, Ollama, OpenRouter)
- Intelligent State Management - Visual and structural hashing for unique screen identification
- Loop Detection - Prevents repetitive patterns
- Traffic Capture - Optional network monitoring via PCAPdroid during crawl (saves .pcap files)
- Video Recording - Optional screen recording of entire crawl session (saves .mp4 files)
- MobSF Integration - Optional automatic static security analysis after crawl completion
- Focus Areas - Customizable privacy-focused testing targets
- Comprehensive Reporting - PDF reports with crawl analysis
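The state management and loop detection features above can be illustrated with a minimal sketch. It assumes exact SHA-256 hashes of the XML hierarchy and the screenshot bytes, plus a sliding-window repeat counter. The function names, the `LoopDetector` class, and the thresholds are hypothetical and are not taken from the project's code. The real implementation may use a perceptual image hash instead.

```python
import hashlib
from collections import deque


def structural_hash(xml_source: str) -> str:
    """Hash the UI hierarchy text after collapsing runs of whitespace."""
    normalized = " ".join(xml_source.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def visual_hash(screenshot_bytes: bytes) -> str:
    """Hash the raw screenshot bytes (exact match only, not perceptual)."""
    return hashlib.sha256(screenshot_bytes).hexdigest()


def screen_id(xml_source: str, screenshot_bytes: bytes) -> str:
    """Combine both hashes into one short identifier for a screen."""
    combined = structural_hash(xml_source) + visual_hash(screenshot_bytes)
    return hashlib.sha256(combined.encode("ascii")).hexdigest()[:16]


class LoopDetector:
    """Flags a loop when one screen id repeats too often in a recent window."""

    def __init__(self, window: int = 10, max_repeats: int = 3):
        self.recent = deque(maxlen=window)  # oldest visits fall off automatically
        self.max_repeats = max_repeats

    def record(self, sid: str) -> bool:
        """Record a visit and return True if the screen now counts as a loop."""
        self.recent.append(sid)
        return self.recent.count(sid) >= self.max_repeats
```

A crawler could call `record(screen_id(xml, png))` after every step and switch strategy when it returns `True`.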
- Google Gemini - Cloud-based multimodal model with excellent image understanding
- Ollama - Local models (supports vision-capable variants like llama3.2-vision)
- OpenRouter - Cloud router to top models via OpenAI-compatible API
{
"AI_PROVIDER": "ollama",
"DEFAULT_MODEL_TYPE": "llama3.2-vision",
"OLLAMA_BASE_URL": "http://localhost:11434"
}
Vision-capable Ollama models: llama3.2-vision, llava, bakllava
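As a rough illustration of how a pluggable adapter could be selected from the configuration above, the sketch below maps `AI_PROVIDER` to an adapter class. The class names, the `generate` method, and the registry are assumptions made for this example. They do not reflect the actual contents of domain/model_adapters.py, and the adapters here only echo the prompt instead of calling a provider.

```python
from dataclasses import dataclass
from typing import Dict, Type


@dataclass
class ModelAdapter:
    """Hypothetical common interface shared by all providers."""
    model: str
    base_url: str = ""

    def generate(self, prompt: str) -> str:
        raise NotImplementedError


class OllamaAdapter(ModelAdapter):
    def generate(self, prompt: str) -> str:
        # A real adapter would send the prompt to the server at self.base_url.
        return f"[ollama:{self.model}] {prompt}"


class GeminiAdapter(ModelAdapter):
    def generate(self, prompt: str) -> str:
        return f"[gemini:{self.model}] {prompt}"


class OpenRouterAdapter(ModelAdapter):
    def generate(self, prompt: str) -> str:
        return f"[openrouter:{self.model}] {prompt}"


# Registry keyed by the AI_PROVIDER config value.
ADAPTERS: Dict[str, Type[ModelAdapter]] = {
    "ollama": OllamaAdapter,
    "gemini": GeminiAdapter,
    "openrouter": OpenRouterAdapter,
}


def build_adapter(config: dict) -> ModelAdapter:
    """Pick the adapter class named by AI_PROVIDER and configure it."""
    provider = config["AI_PROVIDER"].lower()
    if provider not in ADAPTERS:
        raise ValueError(f"Unknown AI_PROVIDER: {provider!r}")
    # Only the local Ollama provider needs a base URL in this sketch.
    base_url = config.get("OLLAMA_BASE_URL", "") if provider == "ollama" else ""
    return ADAPTERS[provider](model=config["DEFAULT_MODEL_TYPE"], base_url=base_url)
```

Keeping the provider lookup in one registry means adding a provider only requires a new class and one dictionary entry.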
The system uses Appium-Python-Client for direct mobile device interaction. No external server is required beyond the standard Appium server.
{
"APPIUM_SERVER_URL": "http://127.0.0.1:4723"
}
Note: Ensure the Appium server is running on the configured port (default: 4723).
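To make the Appium setup more concrete, here is a minimal sketch of assembling W3C capabilities for an Android session. The capability keys are standard Appium/UiAutomator2 names. The `build_capabilities` function, its parameters, and the choice of `noReset` are assumptions for illustration and may differ from infrastructure/capability_builder.py.

```python
def build_capabilities(device_id: str, app_package: str, app_activity: str = "") -> dict:
    """Return a W3C capability dict for an Android UiAutomator2 session."""
    caps = {
        "platformName": "Android",
        # Vendor-specific capabilities carry the "appium:" prefix under W3C rules.
        "appium:automationName": "UiAutomator2",
        "appium:udid": device_id,
        "appium:appPackage": app_package,
        "appium:noReset": True,  # assumption: keep app data between sessions
    }
    if app_activity:
        caps["appium:appActivity"] = app_activity
    return caps
```

With Appium-Python-Client, a dict like this would typically be loaded into an options object and passed to `webdriver.Remote` together with `APPIUM_SERVER_URL`.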
- run_cli.py - CLI entry point
- run_ui.py - GUI entry point
- cli/main.py - CLI command orchestration
- core/crawler.py - Main crawling logic and state transitions
- domain/agent_assistant.py - AI-driven action orchestration
- domain/model_adapters.py - Unified AI provider integration
- domain/agent_tools.py - Device interaction tools
- infrastructure/appium_helper.py - Core Appium session management
- infrastructure/device_detection.py - Device/emulator detection
- infrastructure/capability_builder.py - W3C capability building
- domain/screen_state_manager.py - State tracking and transitions
- Observe - Capture screenshot and XML representation
- Reason - Analyze screen elements and available actions
- Plan - Determine optimal next action
- Act - Execute action via agent tools
- Observe Again - Receive feedback and adapt
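The cycle above can be sketched as a small loop. This is an illustrative outline only: `run_exploration` and its three callables are hypothetical stand-ins for the screenshot/XML capture, the AI model call, and the agent tools, and the stopping rule (a `None` action or a step limit) is an assumption.

```python
from typing import Any, Callable, List, Optional


def run_exploration(
    observe: Callable[[], Any],
    decide: Callable[[Any, List[dict]], Optional[Any]],
    act: Callable[[Any], Any],
    max_steps: int = 5,
) -> List[dict]:
    """Run observe -> reason/plan -> act until decide returns None or steps run out."""
    history: List[dict] = []
    for step in range(max_steps):
        observation = observe()                 # Observe: screenshot + XML
        action = decide(observation, history)   # Reason and Plan: choose next action
        if action is None:
            break                               # nothing left worth exploring
        result = act(action)                    # Act: execute via device tools
        # The recorded result is the feedback available on the next iteration.
        history.append({"step": step, "action": action, "result": result})
    return history
```

Passing the history into `decide` is what lets the planner adapt to earlier outcomes.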
Detailed CLI usage, command reference, and examples are in the dedicated CLI user guide: docs/cli-user-guide.md
For GUI usage and interactive workflows, see the GUI user guide: docs/gui-user-guide.md
Simplified two-layer configuration system:
- Secrets (API keys): Environment variables only (never stored in SQLite)
- Everything else: SQLite only (int, str, bool, float values)
On first launch, simple type defaults are automatically populated into SQLite from module constants. Complex types (dict, list) are excluded and remain in code only.
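The two layers could work roughly as follows. This is a sketch under assumptions: the `config` table schema, the function names, and any default keys passed in are hypothetical. Only the behavior (simple types go to SQLite, complex types are skipped, secrets come from the environment) follows the description above.

```python
import os
import sqlite3
from typing import Mapping, Optional

SIMPLE_TYPES = (int, str, bool, float)


def populate_defaults(conn: sqlite3.Connection, defaults: dict) -> None:
    """Insert simple-typed defaults without overwriting values already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS config "
        "(key TEXT PRIMARY KEY, value TEXT, type TEXT)"
    )
    for key, value in defaults.items():
        if not isinstance(value, SIMPLE_TYPES):
            continue  # dict/list defaults stay in code only
        conn.execute(
            "INSERT OR IGNORE INTO config (key, value, type) VALUES (?, ?, ?)",
            (key, str(value), type(value).__name__),
        )
    conn.commit()


def get_secret(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Read a secret from environment variables only, never from SQLite."""
    env = os.environ if env is None else env
    return env.get(name)
```

`INSERT OR IGNORE` keeps user-edited values intact when defaults are populated again on a later launch.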
Environment Variables (.env):
GEMINI_API_KEY=your_gemini_key
OPENROUTER_API_KEY=your_openrouter_key
OLLAMA_BASE_URL=http://localhost:11434
MOBSF_API_KEY=your_mobsf_key
PCAPDROID_API_KEY=your_pcapdroid_key
Note: All non-secret configuration values are stored in SQLite (config.db). Secrets are read from environment variables only and never persisted to disk.
System Variables:
ANDROID_HOME=C:/Users/youruser/AppData/Local/Android/Sdk
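A quick way to confirm that `ANDROID_HOME` points at a usable SDK is to look for the ADB binary under its `platform-tools` folder, which is where the Android SDK installs it. The `find_adb` helper below is a hypothetical sketch, not part of the project.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_adb(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the adb path under ANDROID_HOME/platform-tools, or None if missing."""
    env = os.environ if env is None else env
    sdk_root = env.get("ANDROID_HOME")
    if not sdk_root:
        return None
    exe = "adb.exe" if os.name == "nt" else "adb"  # Windows binaries end in .exe
    candidate = Path(sdk_root) / "platform-tools" / exe
    return candidate if candidate.is_file() else None
```

A `None` result suggests either the variable is unset or the SDK platform tools are not installed at that location.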
Session-based output per device/app run:
output_data/<device_id>_<app_package>_<timestamp>/
├── screenshots/
├── annotated_screenshots/
├── database/<app_package>_crawl_data.db
├── traffic_captures/ # PCAP files (if traffic capture enabled)
├── video/ # Video recordings (if video recording enabled)
├── logs/
├── reports/
├── mobsf_scan_results/ # MobSF analysis results (if MobSF analysis enabled)
└── extracted_apk/
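The layout above could be created with a few lines of `pathlib`. In this sketch the folder names come from the tree shown, while the `create_session_dir` function and the `YYYYMMDD_HHMMSS` timestamp format are assumptions, since the exact format is not stated here.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

SESSION_SUBDIRS = [
    "screenshots", "annotated_screenshots", "database", "traffic_captures",
    "video", "logs", "reports", "mobsf_scan_results", "extracted_apk",
]


def create_session_dir(root: str, device_id: str, app_package: str,
                       now: Optional[datetime] = None) -> Path:
    """Create <root>/<device_id>_<app_package>_<timestamp>/ with its subfolders."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # assumed timestamp format
    session = Path(root) / f"{device_id}_{app_package}_{stamp}"
    for name in SESSION_SUBDIRS:
        (session / name).mkdir(parents=True, exist_ok=True)
    return session
```

Calling `create_session_dir("output_data", "emulator-5554", "com.example.app")` would produce one self-contained folder per run.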
App info caches (stable, reusable):
output_data/app_info/<device_id>/
├── device_<device_id>_all_apps.json
└── device_<device_id>_filtered_health_apps.json
- Python 3.8+
- Node.js & npm (for Appium)
- Android SDK with ADB
- MobSF (Docker or native)
- PCAPdroid (for traffic capture)
- Ollama (for local AI models)
- Enable Developer options (tap Build number 7 times)
- Enable USB debugging
- Connect via USB and authorize ADB
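Once the steps above are done, `adb devices` should list the device with the state `device` rather than `unauthorized`. The sketch below parses that command's text output. The helper names are hypothetical, and the project's own device detection may work differently.

```python
from typing import Dict, List


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Map each serial in `adb devices` output to its reported state."""
    devices: Dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header line, and "* daemon ..." status lines.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def authorized_devices(output: str) -> List[str]:
    """Return serials whose state is 'device', meaning connected and authorized."""
    return [serial for serial, state in parse_adb_devices(output).items()
            if state == "device"]
```

The output text could come from `subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout`.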
MobSF (Mobile Security Framework) must be installed and running before enabling MobSF analysis. For installation instructions, see the official MobSF documentation.
# Basic (ephemeral)
docker run -d --name mobsf -p 8000:8000 opensecurity/mobile-security-framework-mobsf:latest
# With persistent storage (Windows)
mkdir C:\mobsf\uploads, C:\mobsf\signatures
docker run -d --name mobsf -p 8000:8000 `
-v "C:\mobsf\uploads:/home/mobsf/Mobile-Security-Framework-MobSF/uploads" `
-v "C:\mobsf\signatures:/home/mobsf/Mobile-Security-Framework-MobSF/signatures" `
opensecurity/mobile-security-framework-mobsf:latest
Note: For native installation or other setup methods, refer to the official MobSF installation guide.
{
"ENABLE_MOBSF_ANALYSIS": true,
"MOBSF_API_URL": "http://localhost:8000/api/v1",
"MOBSF_API_KEY": "YOUR_API_KEY_HERE"
}
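To show how these settings might be used, the sketch below builds (but does not send) a scan request. It assumes that MobSF's REST API accepts the API key in an `Authorization` header and a form-encoded `hash` field at the `scan` endpoint under `MOBSF_API_URL`. Check the MobSF API documentation for the version you run. The `build_scan_request` helper is hypothetical.

```python
import urllib.parse
import urllib.request


def build_scan_request(api_url: str, api_key: str, file_hash: str) -> urllib.request.Request:
    """Prepare a POST to <api_url>/scan for a previously uploaded file hash."""
    url = api_url.rstrip("/") + "/scan"
    body = urllib.parse.urlencode({"hash": file_hash}).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": api_key},  # assumed auth scheme, see lead-in
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen(req)` once MobSF is running and the file has been uploaded.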