VISHNU0906/vulnerability-scanner

Vulnerability Scanner

A comprehensive automated security analysis platform that combines multiple static analysis tools with AI-powered vulnerability enrichment to identify and report security issues across multi-language codebases.

Overview

This vulnerability scanner combines industry-standard security tools with local LLMs to provide context-aware vulnerability detection and remediation guidance. It analyzes codebases written in Python, Java, C/C++, and PHP, and generates detailed Excel reports with actionable security findings.

Features

  • Multi-Language Support: Analyze Python, Java, C/C++, and PHP codebases
  • Comprehensive Scanning: Integrates 6 industry-standard security tools
  • AI-Enhanced Analysis: Uses local LLMs (CodeLlama, DeepSeek Coder) for intelligent vulnerability assessment
  • Detailed Reporting: Generates structured Excel reports with severity classifications, CWE/CVE mappings, and remediation guidance
  • False Positive Reduction: LLM-powered validation to minimize noise in security findings
  • Privacy-Focused: All analysis runs locally with no external API dependencies (except optional NVD integration)

Architecture

Core Components

vulnerability-scanner/
├── llm_analyzer.py          # Ollama integration for AI-powered analysis
├── vulnerability_scanner.py # Multi-tool orchestration engine
├── excel_reporter.py        # Report generation and formatting
└── run_analysis.py          # Main workflow coordinator

Integrated Security Tools

Tool         Language    Purpose
Semgrep      Universal   Pattern-based static analysis across all languages
Bandit       Python      Python-specific vulnerability detection
Safety       Python      Dependency vulnerability scanning
Flawfinder   C/C++       Security-focused C/C++ static analyzer
Cppcheck     C/C++       General purpose C/C++ code analysis
SpotBugs     Java        Java bytecode static analyzer
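
To make the table concrete, here is a minimal sketch of how a scanner could pick tools for a given language. The names `TOOLS_BY_LANGUAGE`, `UNIVERSAL_TOOLS`, and `tools_for` are hypothetical and are not taken from this repository; the mapping itself is transcribed from the table above.

```python
from typing import Dict, List

# Language -> language-specific tools, transcribed from the table above.
# The dictionary keys are illustrative; the real project may name them differently.
TOOLS_BY_LANGUAGE: Dict[str, List[str]] = {
    "python": ["bandit", "safety"],
    "cpp": ["flawfinder", "cppcheck"],
    "java": ["spotbugs"],
    "php": [],  # the table lists no PHP-specific tool, so only Semgrep applies
}

# Semgrep is listed as "Universal", so it is applied to every language.
UNIVERSAL_TOOLS: List[str] = ["semgrep"]


def tools_for(language: str) -> List[str]:
    """Return the tools that would run for one language, universal tools first."""
    key = language.lower()
    if key not in TOOLS_BY_LANGUAGE:
        raise ValueError(f"unsupported language: {language!r}")
    return UNIVERSAL_TOOLS + TOOLS_BY_LANGUAGE[key]
```

For example, `tools_for("php")` returns only `["semgrep"]`, which reflects that PHP coverage in the table comes from Semgrep alone.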

LLM Models

  • CodeLlama 13B: High-quality analysis with detailed explanations (~10s per query)
  • DeepSeek Coder 6.7B: Faster analysis for large codebases (~5s per query)
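
The per-query figures above give a rough way to budget LLM enrichment time. The sketch below assumes one sequential query per finding, which this README does not state; the function name is hypothetical.

```python
# Approximate per-query latency in seconds, taken from the list above.
SECONDS_PER_QUERY = {
    "codellama:13b": 10.0,
    "deepseek-coder:6.7b": 5.0,
}


def estimate_enrichment_minutes(num_findings: int, model: str) -> float:
    """Estimate enrichment time, assuming one sequential LLM query per finding."""
    if num_findings < 0:
        raise ValueError("num_findings must be non-negative")
    return num_findings * SECONDS_PER_QUERY[model] / 60.0
```

Under that assumption, 120 findings would take about 20 minutes with CodeLlama 13B and about 10 minutes with DeepSeek Coder 6.7B. Actual times depend on hardware and prompt length.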

Installation

Prerequisites

  • Python 3.8+
  • Ollama installed and running
  • Required system packages: git, pip

Setup

  1. Clone the repository

    git clone https://github.com/VISHNU0906/vulnerability-scanner.git
    cd vulnerability-scanner
  2. Install Python dependencies

    pip install -r requirements.txt
  3. Install security scanning tools

    # Semgrep
    pip install semgrep
    
    # Bandit
    pip install bandit
    
    # Safety
    pip install safety
    
    # Flawfinder
    pip install flawfinder
    
    # Cppcheck (system package)
    sudo apt-get install cppcheck  # Debian/Ubuntu
    # or
    brew install cppcheck          # macOS
    
    # SpotBugs (Java) is not covered by the commands above;
    # it is a Java tool and needs to be installed separately
  4. Download LLM models

    ollama pull codellama:13b
    ollama pull deepseek-coder:6.7b
  5. Verify installation

    export PATH="$HOME/.local/bin:$PATH"
    python3 run_analysis.py --verify
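
What `--verify` checks internally is not documented here. As an independent sanity check, the sketch below looks for the tool executables on `PATH` using the standard library. The list of command names is an assumption based on the tools installed above; SpotBugs is left out because its launcher name depends on how it was installed.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Executable names assumed from the installation steps above.
REQUIRED_COMMANDS = ("semgrep", "bandit", "safety", "flawfinder", "cppcheck", "ollama")


def find_commands(commands: Iterable[str] = REQUIRED_COMMANDS) -> Dict[str, Optional[str]]:
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


def missing_commands(commands: Iterable[str] = REQUIRED_COMMANDS) -> List[str]:
    """Return the command names that could not be found on PATH."""
    return [name for name, path in find_commands(commands).items() if path is None]
```

If `missing_commands()` returns a non-empty list, the `export PATH` line above is the first thing to check, since `pip install --user` places executables in `$HOME/.local/bin`.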

Optional: NVD API Integration

For enhanced CVE mapping, set up the National Vulnerability Database API:

export NVD_API_KEY="your_api_key_here"

Request a free API key from the NVD (National Vulnerability Database) website.
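
How the scanner itself calls NVD is not shown in this README. As a rough illustration, the sketch below builds a request against the public NVD CVE API 2.0, reading the key from `NVD_API_KEY` and sending it in the `apiKey` header. The function name is hypothetical, and nothing is sent over the network until the request is opened.

```python
import os
import urllib.parse
import urllib.request
from typing import Optional

# Public NVD CVE API 2.0 endpoint.
NVD_CVE_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def build_nvd_request(cve_id: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a lookup request for a single CVE ID."""
    query = urllib.parse.urlencode({"cveId": cve_id})
    request = urllib.request.Request(f"{NVD_CVE_URL}?{query}")
    # Fall back to the environment variable exported above.
    key = api_key if api_key is not None else os.environ.get("NVD_API_KEY")
    if key:
        request.add_header("apiKey", key)
    return request
```

To actually perform the lookup, pass the result to `urllib.request.urlopen(request, timeout=30)` and parse the JSON body. Requests without a key are still accepted by NVD but are rate-limited more strictly.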

Usage

Basic Scan

  1. Organize your codebase by language:

    analysis/
    ├── java/    # Java source files
    ├── python/  # Python source files
    ├── cpp/     # C/C++ source files
    └── php/     # PHP source files
    
  2. Run the analysis:

    python3 run_analysis.py
  3. Review the generated report:

    • Excel report saved to reports/ directory
    • Contains vulnerability details, severity ratings, and remediation guidance
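
The directory layout in step 1 can be checked before a scan with a short script like the one below. The file-extension sets are assumptions made for illustration; the scanner may recognize a different set.

```python
from pathlib import Path
from typing import Dict, List

# Assumed extensions per language directory; adjust to match your codebase.
EXTENSIONS = {
    "java": {".java"},
    "python": {".py"},
    "cpp": {".c", ".h", ".cpp", ".hpp", ".cc"},
    "php": {".php"},
}


def discover_sources(root: str) -> Dict[str, List[Path]]:
    """List source files under root/<language>/, grouped by language."""
    base = Path(root)
    found: Dict[str, List[Path]] = {}
    for language, suffixes in EXTENSIONS.items():
        lang_dir = base / language
        if not lang_dir.is_dir():
            found[language] = []  # a missing directory simply yields no files
            continue
        found[language] = sorted(
            p for p in lang_dir.rglob("*")
            if p.is_file() and p.suffix.lower() in suffixes
        )
    return found
```

Calling `discover_sources("analysis")` shows how many files each language directory contributes, which helps when estimating scan time.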

Advanced Options

# Scan specific language
python3 run_analysis.py --language python

# Use faster LLM model
python3 run_analysis.py --model deepseek-coder:6.7b

# Custom output directory
python3 run_analysis.py --output ./custom_reports

# Skip LLM enrichment (faster, less detailed)
python3 run_analysis.py --no-llm
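
For readers who want to wrap or script these options, the flags above could be modeled with `argparse` roughly as follows. This is a sketch, not the project's actual parser: the flag names come from this README, while the defaults and the accepted `--language` values are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="run_analysis.py")
    parser.add_argument(
        "--language",
        choices=["python", "java", "cpp", "php"],  # assumed to match the analysis/ subdirectories
        default=None,
        help="scan only one language (default: all)",
    )
    parser.add_argument("--model", default="codellama:13b",  # assumed default model
                        help="Ollama model tag used for enrichment")
    parser.add_argument("--output", default="reports",
                        help="directory for the generated Excel report")
    parser.add_argument("--no-llm", action="store_true",
                        help="skip LLM enrichment")
    parser.add_argument("--verify", action="store_true",
                        help="check the installation and exit")
    return parser
```

Parsing `["--language", "python", "--no-llm"]` with this parser yields `language="python"`, `no_llm=True`, and the default `output="reports"`.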

Report Structure

Generated Excel reports include the following columns:

Column               Description
S.No                 Serial number
File Path            Relative path to vulnerable file
Line Number(s)       Affected code lines
Vulnerability Type   Classification (e.g., SQL Injection, XSS)
Severity             Critical / High / Medium / Low
CWE ID               Common Weakness Enumeration identifier
CVE ID               Common Vulnerabilities and Exposures (if applicable)
Description          Detailed explanation of the vulnerability
Affected Code        Snippet showing vulnerable code
Recommended Fix      Concrete remediation steps and example code
OWASP Category       Mapping to OWASP Top 10 categories
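
One way to picture a report row is as a small record type whose fields line up with the columns above. The `Finding` class and `to_rows` helper below are hypothetical and are not the data model used by `excel_reporter.py`; only the column names are taken from this README.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Column headers, in the order listed above.
COLUMNS = [
    "S.No", "File Path", "Line Number(s)", "Vulnerability Type", "Severity",
    "CWE ID", "CVE ID", "Description", "Affected Code", "Recommended Fix",
    "OWASP Category",
]


@dataclass
class Finding:
    """Illustrative record for one report row (the serial number is added on output)."""
    file_path: str
    lines: str
    vulnerability_type: str
    severity: str
    cwe_id: str = ""
    cve_id: str = ""  # left empty when no CVE applies
    description: str = ""
    affected_code: str = ""
    recommended_fix: str = ""
    owasp_category: str = ""


def to_rows(findings: Iterable[Finding]) -> List[list]:
    """Return a header row followed by one row per finding, numbered from 1."""
    rows: List[list] = [list(COLUMNS)]
    for serial, f in enumerate(findings, start=1):
        rows.append([
            serial, f.file_path, f.lines, f.vulnerability_type, f.severity,
            f.cwe_id, f.cve_id, f.description, f.affected_code,
            f.recommended_fix, f.owasp_category,
        ])
    return rows
```

The resulting list of rows can be handed to a spreadsheet library, for example by appending each row to an openpyxl worksheet.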

How It Works

  1. Code Scanning: Each security tool scans the codebase according to its specialty
  2. Result Aggregation: All findings are collected and normalized
  3. LLM Enrichment: Local AI models analyze each finding to:
    • Validate and reduce false positives
    • Generate detailed vulnerability descriptions
    • Map to CWE/CVE standards
    • Suggest specific remediation code
    • Classify severity levels
  4. Report Generation: Structured Excel report with all enriched findings
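
The four steps above can be sketched as a small pipeline. Everything here is illustrative: the finding dictionaries, the severity alias table, and the `enrich` callback (which stands in for the LLM step and returns `None` for a false positive) are assumptions, not the project's actual code.

```python
from typing import Callable, Dict, List, Optional

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]

# Assumed mapping from tool-specific labels (e.g. Semgrep's ERROR/WARNING/INFO)
# onto the report's severity scale.
SEVERITY_ALIASES = {"error": "High", "warning": "Medium", "info": "Low"}


def normalize(tool: str, raw: Dict) -> Dict:
    """Convert one tool-specific finding into a common shape."""
    label = str(raw.get("severity", "")).strip()
    severity = SEVERITY_ALIASES.get(label.lower(), label.capitalize())
    if severity not in SEVERITY_ORDER:
        severity = "Medium"  # assumed fallback for unknown labels
    return {
        "tool": tool,
        "file": raw["file"],
        "line": int(raw["line"]),
        "message": raw.get("message", ""),
        "severity": severity,
    }


def run_pipeline(
    raw_by_tool: Dict[str, List[Dict]],
    enrich: Callable[[Dict], Optional[Dict]],
) -> List[Dict]:
    """Aggregate, normalize, enrich, and sort findings by severity."""
    findings = [
        normalize(tool, raw)
        for tool, items in raw_by_tool.items()
        for raw in items
    ]
    kept = []
    for finding in findings:
        result = enrich(finding)  # None means "judged a false positive"
        if result is not None:
            kept.append(result)
    kept.sort(key=lambda f: (SEVERITY_ORDER.index(f["severity"]), f["file"], f["line"]))
    return kept
```

Passing `enrich=lambda f: f` reproduces the `--no-llm` idea of skipping enrichment while still producing a normalized, sorted list.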

Performance

  • Small projects (<1000 files): ~5-10 minutes
  • Medium projects (1000-5000 files): ~15-30 minutes
  • Large projects (>5000 files): ~45+ minutes

Performance varies with codebase size, the selected LLM, and hardware.

Troubleshooting

Ollama service not running

# Check status
ollama list

# Start service
ollama serve
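
The same check can be scripted. Ollama listens on port 11434 by default, so a plain TCP connection attempt is enough to tell whether the service is up. The helper name below is hypothetical, and the host and port need adjusting if your setup differs.

```python
import socket

# Ollama's default listen address; change these if you configured it differently.
OLLAMA_HOST = "127.0.0.1"
OLLAMA_PORT = 11434


def is_listening(host: str = OLLAMA_HOST, port: int = OLLAMA_PORT, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

If `is_listening()` returns `False`, start the service with `ollama serve` and try again.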

Tools not found in PATH

export PATH="$HOME/.local/bin:$PATH"

Permission errors

chmod +x run_analysis.py
chmod -R 755 scripts/

LLM queries timing out

Switch to the faster DeepSeek model:

python3 run_analysis.py --model deepseek-coder:6.7b
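
If you drive the models from your own script, the same fallback idea can be automated. The sketch below tries each model in order and moves on when a query times out. The `query` callable is a placeholder you supply, and it is assumed to raise `TimeoutError` on a timeout; this is not how `llm_analyzer.py` is known to behave.

```python
from typing import Callable, Sequence, Tuple

# Slower, more detailed model first; faster model as the fallback.
DEFAULT_MODELS = ("codellama:13b", "deepseek-coder:6.7b")


def query_with_fallback(
    prompt: str,
    query: Callable[[str, str], str],
    models: Sequence[str] = DEFAULT_MODELS,
) -> Tuple[str, str]:
    """Return (model, answer) from the first model that answers without timing out."""
    last_error = None
    for model in models:
        try:
            return model, query(model, prompt)
        except TimeoutError as exc:
            last_error = exc  # remember the failure and try the next model
    raise TimeoutError(f"all models timed out: {', '.join(models)}") from last_error
```

Only timeouts trigger the fallback; any other exception from `query` propagates immediately, so genuine errors are not hidden.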

Contributing

Contributions are welcome! Please feel free to submit issues, fork the repository, and create pull requests for any improvements.

Areas for contribution:

  • Additional language support (Go, Rust, JavaScript, etc.)
  • Integration with more security scanning tools
  • Enhanced LLM prompting strategies
  • Performance optimizations
  • UI/Dashboard for report visualization

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Built with industry-leading security tools: Semgrep, Bandit, Safety, Flawfinder, Cppcheck, SpotBugs
  • Powered by Ollama for local LLM inference
  • Inspired by the need for comprehensive, privacy-focused security analysis tools

Contact

For questions, suggestions, or collaboration opportunities, please open an issue on GitHub.


Note: This tool is designed to assist in identifying potential security vulnerabilities but should not replace professional security audits. Always review findings manually and conduct thorough testing before deploying to production.
