AI-PhishGuard is an AI-powered phishing email detection tool with a Streamlit dashboard. It uses Natural Language Processing (NLP) and Machine Learning to classify emails as Phishing or Safe, and supports URL extraction, batch analysis, model retraining, and analytics.
- Single Email Analysis: Check if an email is phishing or safe.
- Batch Analysis: Upload a CSV of multiple emails for bulk detection.
- Train Your Model: Upload a labeled dataset to retrain the AI.
- Theme Support: Dark, Light, Cyberpunk, and Matrix modes.
- Analytics Dashboard: Pie charts, bar charts, and downloadable history.
AI-PhishGuard/
│
├── phishing_detector.py    # Main dashboard script
├── phishing_model.pkl      # Pre-trained ML model
├── requirements.txt        # Required Python packages
├── logo.png                # App logo
├── README.md               # Documentation
└── sample_dataset.csv      # Example dataset for training & testing
- Clone the repository:
git clone https://github.com/yourusername/AI-PhishGuard.git
cd AI-PhishGuard
- Install dependencies:
pip install -r requirements.txt
- Run the app:
streamlit run phishing_detector.py
- Single Email Analysis
  - Select Check Email in the sidebar.
  - Paste the email text in the input box.
  - Click Analyze Email to get results.
- Batch Email Analysis
  - Select Batch Analysis in the sidebar.
  - Upload a CSV file containing an email_body column.
  - View predictions and download results.
- Train a Model
  - Select Train Model in the sidebar.
  - Upload a CSV file with columns email_body and label (1 = Phishing, 0 = Safe).
  - The model will be retrained and saved as phishing_model.pkl.
- View Analytics
  - Select History & Analytics to see:
    - Phishing vs Safe email distribution
    - URL count distribution
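The batch and analytics steps above can be pictured with a small sketch. It is not the code in phishing_detector.py: `classify_batch`, `extract_urls`, `summarize`, and the URL regex are hypothetical names and choices made for illustration, and `predict` stands in for whatever callable the loaded model exposes. Only the email_body column and the Phishing/Safe labels come from this README.

```python
import csv
import io
import re
from collections import Counter

# Assumption: a simple http/https matcher; the app's real URL extraction may differ.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def extract_urls(text):
    """Return URLs found in text, with trailing punctuation trimmed."""
    return [url.rstrip(".,;:!?)\"'") for url in URL_PATTERN.findall(text)]


def classify_batch(csv_text, predict):
    """Classify every row of a CSV that has an email_body column.

    predict is any callable mapping a list of texts to a list of 0/1 labels.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if "email_body" not in (reader.fieldnames or []):
        raise ValueError("CSV must contain an email_body column")
    bodies = [row["email_body"] or "" for row in reader]
    labels = predict(bodies)
    return [
        {
            "email_body": body,
            "prediction": "Phishing" if label == 1 else "Safe",
            "url_count": len(extract_urls(body)),
        }
        for body, label in zip(bodies, labels)
    ]


def summarize(results):
    """Return (verdict distribution, URL count distribution) for the charts."""
    verdicts = Counter(r["prediction"] for r in results)
    url_counts = Counter(r["url_count"] for r in results)
    return verdicts, url_counts
```

The two counters returned by `summarize` correspond to the Phishing vs Safe distribution and the URL count distribution listed under View Analytics.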
Example sample_dataset.csv:
| email_body | label |
|---|---|
| Dear user, your account will be suspended unless you verify now at http://phishy-bank-login.com | 1 |
| Meeting scheduled for tomorrow at 10 AM. Please confirm your attendance. | 0 |
| URGENT: Your PayPal account has been locked. Click here to restore access: https://secure-paypal-reset.com | 1 |
Column Descriptions:
- email_body: Text of the email
- label:
  - 1 = Phishing
  - 0 = Safe
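Before uploading a dataset for training, it can help to check that it matches the format described above. The following is a minimal sketch using only the standard library; `validate_dataset` is a hypothetical helper and not part of the app.

```python
import csv
import io

REQUIRED_COLUMNS = {"email_body", "label"}
VALID_LABELS = {"0", "1"}  # 1 = Phishing, 0 = Safe


def validate_dataset(csv_text):
    """Return a list of problems found in a labeled dataset (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing column(s): {', '.join(sorted(missing))}"]
    problems = []
    # Data rows start at line 2 because line 1 is the header.
    for row_number, row in enumerate(reader, start=2):
        if not (row["email_body"] or "").strip():
            problems.append(f"row {row_number}: empty email_body")
        if (row["label"] or "").strip() not in VALID_LABELS:
            problems.append(f"row {row_number}: label must be 0 or 1")
    return problems
```

An empty list means every row has a non-empty email_body and a label of 0 or 1.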
- Type: scikit-learn Pipeline
- Steps:
  - CountVectorizer
    - Converts email text into numerical feature vectors (Bag-of-Words approach).
    - Token pattern: Words with at least 2 letters.
    - Lowercase conversion enabled.
  - Multinomial Naive Bayes Classifier
    - Classes: 0 = Safe, 1 = Phishing.
    - Trained on labeled email datasets.
- Usage:
  - Loaded at runtime using joblib.load()
  - Takes processed text as input
  - Returns prediction as Phishing or Safe
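The pipeline described above could look roughly like the sketch below. The exact settings in the shipped phishing_model.pkl are not documented here, so the token pattern is an assumption: it reads "words with at least 2 letters" as letters only, whereas scikit-learn's default pattern `(?u)\b\w\w+\b` also accepts digits and underscores. The helper names `build_pipeline`, `train_and_save`, and `classify` are hypothetical.

```python
import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

LABEL_NAMES = {0: "Safe", 1: "Phishing"}


def build_pipeline():
    """Bag-of-words features followed by a Multinomial Naive Bayes classifier."""
    return Pipeline([
        # Assumed pattern: tokens of two or more ASCII letters, lowercased first.
        ("vectorizer", CountVectorizer(lowercase=True,
                                       token_pattern=r"(?u)\b[a-zA-Z]{2,}\b")),
        ("classifier", MultinomialNB()),
    ])


def train_and_save(texts, labels, model_path="phishing_model.pkl"):
    """Fit the pipeline on labeled emails and persist it with joblib."""
    model = build_pipeline()
    model.fit(texts, labels)
    joblib.dump(model, model_path)
    return model


def classify(model_path, email_body):
    """Load the saved pipeline and return 'Phishing' or 'Safe' for one email."""
    model = joblib.load(model_path)
    return LABEL_NAMES[int(model.predict([email_body])[0])]
```

Because the vectorizer lives inside the pipeline, the saved file carries both the vocabulary and the classifier, so raw email text can be passed straight to `predict` after loading. Only load .pkl files from sources you trust, since unpickling can execute arbitrary code.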