A comprehensive automated security analysis platform that combines multiple static analysis tools with AI-powered vulnerability enrichment to identify and report security issues across multi-language codebases.
This vulnerability scanner integrates industry-standard security tools with locally hosted LLMs to provide context-aware vulnerability detection and remediation guidance. It analyzes codebases written in Python, Java, C/C++, and PHP and generates detailed Excel reports of the findings.
- Multi-Language Support: Analyze Python, Java, C/C++, and PHP codebases
- Comprehensive Scanning: Integrates 6 industry-standard security tools
- AI-Enhanced Analysis: Uses local LLMs (CodeLlama, DeepSeek Coder) to assess each finding
- Detailed Reporting: Generates structured Excel reports with severity classifications, CWE/CVE mappings, and remediation guidance
- False Positive Reduction: LLM-powered validation to minimize noise in security findings
- Privacy-Focused: All analysis runs locally with no external API dependencies (except optional NVD integration)
vulnerability-scanner/
├── llm_analyzer.py # Ollama integration for AI-powered analysis
├── vulnerability_scanner.py # Multi-tool orchestration engine
├── excel_reporter.py # Report generation and formatting
└── run_analysis.py # Main workflow coordinator
| Tool | Language | Purpose |
|---|---|---|
| Semgrep | Universal | Pattern-based static analysis across all languages |
| Bandit | Python | Python-specific vulnerability detection |
| Safety | Python | Dependency vulnerability scanning |
| Flawfinder | C/C++ | Security-focused C/C++ static analyzer |
| Cppcheck | C/C++ | General purpose C/C++ code analysis |
| SpotBugs | Java | Java bytecode static analyzer |
- CodeLlama 13B: High-quality analysis with detailed explanations (~10s per query)
- DeepSeek Coder 6.7B: Faster analysis for large codebases (~5s per query)
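As a rough illustration of how either model might be queried, the sketch below sends one finding to Ollama's local HTTP endpoint (`/api/generate` on the default port 11434). The prompt wording, the finding keys, and the helper names `build_prompt`, `build_payload`, and `ask_ollama` are assumptions for illustration and are not taken from `llm_analyzer.py`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_prompt(finding: dict) -> str:
    """Turn one scanner finding into a plain-text prompt (wording is illustrative)."""
    return (
        "You are a security reviewer. Assess this static-analysis finding.\n"
        f"Tool: {finding['tool']}\n"
        f"File: {finding['file']}:{finding['line']}\n"
        f"Message: {finding['message']}\n"
        "Say whether it is likely a true positive and suggest a fix."
    )


def build_payload(finding: dict, model: str = "codellama:13b") -> dict:
    """Request body for /api/generate; stream=False asks for a single JSON reply."""
    return {"model": model, "prompt": build_prompt(finding), "stream": False}


def ask_ollama(finding: dict, model: str = "codellama:13b", timeout: float = 120.0) -> str:
    """Send the finding to a running Ollama server and return the generated text."""
    body = json.dumps(build_payload(finding, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Passing `model="deepseek-coder:6.7b"` selects the smaller model, trading some detail for the shorter response time listed above.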
- Python 3.8+
- Ollama installed and running
- Required system packages: git, pip
- Clone the repository

      git clone https://github.com/VISHNU0906/vulnerability-scanner.git
      cd vulnerability-scanner

- Install Python dependencies

      pip install -r requirements.txt

- Install security scanning tools

      # Semgrep
      pip install semgrep
      # Bandit
      pip install bandit
      # Safety
      pip install safety
      # Flawfinder
      pip install flawfinder
      # Cppcheck (system package)
      sudo apt-get install cppcheck   # Debian/Ubuntu
      # or
      brew install cppcheck           # macOS

- Download LLM models

      ollama pull codellama:13b
      ollama pull deepseek-coder:6.7b

- Verify installation

      export PATH="$HOME/.local/bin:$PATH"
      python3 run_analysis.py --verify
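What `--verify` checks is not spelled out here. A minimal sketch of one plausible check, confirming that each tool's executable can be found on `PATH`, might look like the following. The executable names and the `missing_tools` helper are assumptions for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names are assumed to match the tool names used in this README.
REQUIRED_TOOLS = ["semgrep", "bandit", "safety", "flawfinder", "cppcheck", "ollama"]


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools whose executables cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it is `shutil.which`.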
For enhanced CVE mapping, set up the National Vulnerability Database API:
    export NVD_API_KEY="your_api_key_here"

Get your free API key at NVD API.
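A minimal sketch of how the key might be used, assuming the NVD CVE API 2.0 endpoint and its `apiKey` request header. The `build_nvd_request` helper is hypothetical and not part of the scanner.

```python
import os
import urllib.parse
import urllib.request

NVD_CVE_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def build_nvd_request(cwe_id: str, results_per_page: int = 5) -> urllib.request.Request:
    """Build a request for CVEs associated with one CWE (e.g. "CWE-89")."""
    query = urllib.parse.urlencode({"cweId": cwe_id, "resultsPerPage": results_per_page})
    request = urllib.request.Request(f"{NVD_CVE_URL}?{query}")
    api_key = os.environ.get("NVD_API_KEY")
    if api_key:
        # The key is optional; it is sent as a header only when it is set.
        request.add_header("apiKey", api_key)
    return request
```

The returned request can be passed to `urllib.request.urlopen` to perform the lookup.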
- Organize your codebase by language:

      analysis/
      ├── java/     # Java source files
      ├── python/   # Python source files
      ├── cpp/      # C/C++ source files
      └── php/      # PHP source files

- Run the analysis:

      python3 run_analysis.py

- Review the generated report:
  - Excel report saved to the reports/ directory
  - Contains vulnerability details, severity ratings, and remediation guidance
    # Scan specific language
    python3 run_analysis.py --language python
    # Use faster LLM model
    python3 run_analysis.py --model deepseek-coder:6.7b
    # Custom output directory
    python3 run_analysis.py --output ./custom_reports
    # Skip LLM enrichment (faster, less detailed)
    python3 run_analysis.py --no-llm

Generated Excel reports include the following columns:
| Column | Description |
|---|---|
| S.No | Serial number |
| File Path | Relative path to vulnerable file |
| Line Number(s) | Affected code lines |
| Vulnerability Type | Classification (e.g., SQL Injection, XSS) |
| Severity | Critical / High / Medium / Low |
| CWE ID | Common Weakness Enumeration identifier |
| CVE ID | Common Vulnerabilities and Exposures (if applicable) |
| Description | Detailed explanation of the vulnerability |
| Affected Code | Snippet showing vulnerable code |
| Recommended Fix | Concrete remediation steps and example code |
| OWASP Category | Mapping to OWASP Top 10 categories |
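To make the column layout concrete, the sketch below orders findings by severity and flattens them into rows that match the columns listed above. It stops short of writing a workbook. The dictionary keys and the `to_rows` helper are assumptions, not the format used by `excel_reporter.py`.

```python
from typing import Dict, List

REPORT_COLUMNS = [
    "S.No", "File Path", "Line Number(s)", "Vulnerability Type", "Severity",
    "CWE ID", "CVE ID", "Description", "Affected Code", "Recommended Fix",
    "OWASP Category",
]

# Lower rank sorts first, so Critical findings lead the report.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def to_rows(findings: List[Dict[str, str]]) -> List[List[str]]:
    """Return a header row plus one row per finding, most severe first."""
    ordered = sorted(
        findings,
        # Unknown or missing severities sort after Low.
        key=lambda f: SEVERITY_RANK.get(f.get("Severity", ""), len(SEVERITY_RANK)),
    )
    rows = [REPORT_COLUMNS]
    for serial, finding in enumerate(ordered, start=1):
        # Missing fields (e.g. no CVE) become empty cells; S.No is assigned here.
        rows.append([str(serial)] + [finding.get(col, "") for col in REPORT_COLUMNS[1:]])
    return rows
```

Each row could then be appended to a worksheet with a spreadsheet library such as openpyxl, for example via `Worksheet.append(row)`.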
- Code Scanning: Each security tool scans the codebase according to its specialty
- Result Aggregation: All findings are collected and normalized
- LLM Enrichment: Local AI models analyze each finding to:
  - Validate and reduce false positives
  - Generate detailed vulnerability descriptions
  - Map to CWE/CVE standards
  - Suggest specific remediation code
  - Classify severity levels
- Report Generation: Structured Excel report with all enriched findings
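The aggregation step can be pictured as mapping each tool's JSON output onto one shared record. The sketch below does this for Bandit and Semgrep JSON reports, reading the fields those tools emit in their `results` lists. The `Finding` record, the function names, and the Semgrep severity mapping are hypothetical, and the scanner's own normalization may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    tool: str
    file: str
    line: int
    rule: str
    severity: str
    message: str


def from_bandit(report: dict) -> List[Finding]:
    """Normalize the `results` list of a Bandit JSON report."""
    return [
        Finding("bandit", r["filename"], r["line_number"], r["test_id"],
                r["issue_severity"].capitalize(), r["issue_text"])
        for r in report.get("results", [])
    ]


# Semgrep's INFO/WARNING/ERROR levels mapped to report levels (an assumption).
SEMGREP_SEVERITY = {"INFO": "Low", "WARNING": "Medium", "ERROR": "High"}


def from_semgrep(report: dict) -> List[Finding]:
    """Normalize the `results` list of a Semgrep JSON report."""
    return [
        Finding("semgrep", r["path"], r["start"]["line"], r["check_id"],
                # Severities outside the mapping fall back to Medium.
                SEMGREP_SEVERITY.get(r["extra"]["severity"], "Medium"),
                r["extra"]["message"])
        for r in report.get("results", [])
    ]
```

With every tool reduced to the same record, later stages such as LLM enrichment and reporting only need to handle one shape of input.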
- Small projects (<1000 files): ~5-10 minutes
- Medium projects (1000-5000 files): ~15-30 minutes
- Large projects (>5000 files): ~45+ minutes
Performance varies with codebase size, the selected LLM, and hardware.
    # Check status
    ollama list
    # Start service
    ollama serve

    export PATH="$HOME/.local/bin:$PATH"

    chmod +x run_analysis.py
    chmod -R 755 scripts/

Switch to the faster DeepSeek model:

    python3 run_analysis.py --model deepseek-coder:6.7b

Contributions are welcome! Please feel free to submit issues, fork the repository, and create pull requests for any improvements.
- Additional language support (Go, Rust, JavaScript, etc.)
- Integration with more security scanning tools
- Enhanced LLM prompting strategies
- Performance optimizations
- UI/Dashboard for report visualization
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with industry-leading security tools: Semgrep, Bandit, Safety, Flawfinder, Cppcheck, SpotBugs
- Powered by Ollama for local LLM inference
- Inspired by the need for comprehensive, privacy-focused security analysis tools
For questions, suggestions, or collaboration opportunities, please open an issue on GitHub.
Note: This tool is designed to assist in identifying potential security vulnerabilities but should not replace professional security audits. Always review findings manually and conduct thorough testing before deploying to production.