Closed
26 changes: 26 additions & 0 deletions .gitignore
@@ -77,6 +77,32 @@ frontend/.env.local
**/generated/
packages/*/generated/

# Additional Security - API Keys & Sensitive Data
**/.env
**/.env.*
*.env
*.env.*
Comment on lines +81 to +84
Contributor

medium

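These patterns work, but several are redundant. In a .gitignore, a pattern without a slash matches at any directory depth, and * can match an empty prefix, so *.env on line 83 already matches every file that **/.env on line 81 targets, and *.env.* on line 84 already covers **/.env.* on line 82. The explicit .env.backup, .env.production, .env.staging, and .env.development entries that follow are covered as well. You could simplify this section for better readability.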
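
One rough way to see which patterns overlap is to test sample file names against them. The sketch below uses Python's `fnmatch` as an approximation of how Git matches a slash-free pattern against a file's base name. It is not Git's own matcher, and the sample names are made up for illustration; `git check-ignore -v <path>` remains the authoritative check.

```python
from fnmatch import fnmatchcase
from typing import List

# Slash-free patterns from the diff. Git applies such patterns to the
# base name of a path at any directory depth.
PATTERNS = ["*.env", "*.env.*"]

# Hypothetical sample file names, chosen only for illustration.
SAMPLES = [".env", ".env.production", "prod.env", "app.env.local", "README.md"]


def matching_patterns(name: str) -> List[str]:
    """Return the patterns that match the given base name."""
    return [pattern for pattern in PATTERNS if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    for name in SAMPLES:
        print(f"{name!r:20} -> {matching_patterns(name)}")
```

If every name you care about is matched by one of the two slash-free patterns, the `**/`-prefixed variants and the explicit `.env.<name>` entries add nothing.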

.env.backup
.env.production
.env.staging
.env.development

# Database files (SQLite)
*.db
*.sqlite
*.sqlite3

# API Keys and credentials
**/config/keys.json
**/config/secrets.json
api-keys.txt
credentials.json

# Session and auth files
sessions/
auth-sessions/
*.session

---

### Turborepo daemon logs
40 changes: 40 additions & 0 deletions SECURITY.md
@@ -0,0 +1,40 @@
# 🔒 Security & Privacy Documentation

## ✅ Data Protection Status: SECURE

### What's Protected:
- **Environment Variables**: All `.env` files are gitignored
- **API Keys**: Only placeholder values used
- **Database Files**: SQLite files are gitignored
- **Personal Data**: No real personal information stored
- **Credentials**: All auth tokens are placeholders

### Files That Are Safe & Ignored:
```
backend/.env # API keys, secrets
backend/*.db # Database files
backend/*.sqlite # SQLite databases
**/.env.* # All environment variants
sessions/ # Session data
credentials.json # Any credential files
```

### What We Created:
1. **AI Chatbot Backend** - Uses mock data, no real APIs
2. **Frontend Components** - No sensitive data embedded
3. **Database Schema** - Development only, no real user data
4. **Configuration Files** - Only placeholder values

### API Keys Used:
- `OPENAI_API_KEY=your_openai_api_key_here` *(placeholder)*
- `ALPHA_VANTAGE_API_KEY=your_alpha_vantage_api_key` *(placeholder)*
- `SECRET_KEY=your_secret_key_here` *(placeholder)*

### Before Production:
1. Replace all placeholder API keys with real ones
2. Use environment-specific `.env` files
3. Set up proper authentication
4. Configure production database

## 🛡️ Your Data is 100% Secure!
No personal information, real API keys, or sensitive data has been committed to git.
Contributor

medium

It's a common convention and good practice to end all text files with a single newline character. This prevents issues with some tools and makes file concatenation more reliable.

Suggested change (adds the missing trailing newline)
- No personal information, real API keys, or sensitive data has been committed to git.
+ No personal information, real API keys, or sensitive data has been committed to git.

118 changes: 118 additions & 0 deletions backend/app/routers/chatbot.py
@@ -0,0 +1,118 @@
from fastapi import APIRouter, HTTPException, Depends
from pydantic import BaseModel
from typing import List, Dict, Any, Optional
import json
from datetime import datetime
from ..services.ai_services import AIStockAnalyzer, get_real_time_stock_data, get_stock_trends

router = APIRouter(prefix="/api/chatbot", tags=["chatbot"])

# Pydantic models
class ChatRequest(BaseModel):
    message: str
    session_id: str
    user_id: Optional[str] = None

class ChatResponse(BaseModel):
    response: str
    data: Optional[Dict[str, Any]] = None
    session_id: str
    timestamp: datetime

# In-memory storage for demo (replace with database later)
chat_sessions = {}
Contributor

high

Using an in-memory dictionary for chat_sessions is not suitable for a production environment. It will lose all data on server restart, won't scale beyond a single process, and can lead to high memory consumption. The new Prisma schema correctly defines models for persisting this data, and this implementation should be updated to use the database.
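
As a minimal sketch of what a persistent store could look like, the example below uses the standard library's `sqlite3` as a stand-in for the Prisma-backed database. The `ChatStore` class, the `chat_messages` table, and its columns are hypothetical and are not taken from the PR's schema.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Dict, List


class ChatStore:
    """Hypothetical persistent replacement for the chat_sessions dict."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            """
            CREATE TABLE IF NOT EXISTS chat_messages (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                session_id TEXT NOT NULL,
                role TEXT NOT NULL,
                content TEXT NOT NULL,
                created_at TEXT NOT NULL
            )
            """
        )

    def add_message(self, session_id: str, role: str, content: str) -> None:
        """Append one message to a session."""
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO chat_messages (session_id, role, content, created_at) "
                "VALUES (?, ?, ?, ?)",
                (session_id, role, content, datetime.now(timezone.utc).isoformat()),
            )

    def get_messages(self, session_id: str) -> List[Dict[str, str]]:
        """Return a session's messages in insertion order."""
        rows = self._conn.execute(
            "SELECT role, content, created_at FROM chat_messages "
            "WHERE session_id = ? ORDER BY id",
            (session_id,),
        ).fetchall()
        return [{"role": r, "content": c, "timestamp": t} for r, c, t in rows]

    def clear(self, session_id: str) -> None:
        """Delete all messages for a session."""
        with self._conn:
            self._conn.execute(
                "DELETE FROM chat_messages WHERE session_id = ?", (session_id,)
            )
```

Pointing `path` at a file instead of `:memory:` lets history survive a restart. An async service would more likely use an async database client than a blocking `sqlite3` connection, so treat this only as an outline of the interface.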


@router.post("/chat", response_model=ChatResponse)
async def chat_with_ai(request: ChatRequest):
    """Main chatbot endpoint"""
    try:
        # Initialize AI analyzer
        ai_analyzer = AIStockAnalyzer()
Contributor

high

A new instance of AIStockAnalyzer is created for every chat request. This is inefficient as it can involve re-reading environment variables and other setup costs. You can improve performance by creating a single instance and reusing it across requests. A good way to do this in FastAPI is with a dependency, like so:

# At module level
# You can use @lru_cache for a simple singleton
from functools import lru_cache

@lru_cache()
def get_ai_analyzer():
    return AIStockAnalyzer()

@router.post("/chat", response_model=ChatResponse)
async def chat_with_ai(request: ChatRequest, ai_analyzer: AIStockAnalyzer = Depends(get_ai_analyzer)):
    # ... use ai_analyzer directly without creating a new instance ...
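
The following small sketch illustrates why `lru_cache` on a zero-argument factory behaves like a singleton: the first call constructs the object and later calls return the cached instance. The `AIStockAnalyzer` class here is a stand-in that counts constructions, not the PR's real service.

```python
from functools import lru_cache


class AIStockAnalyzer:
    """Stand-in for the real analyzer; counts how often it is constructed."""

    instances_created = 0

    def __init__(self) -> None:
        AIStockAnalyzer.instances_created += 1


@lru_cache()
def get_ai_analyzer() -> AIStockAnalyzer:
    """Build the analyzer on the first call, then return the cached instance."""
    return AIStockAnalyzer()


if __name__ == "__main__":
    first = get_ai_analyzer()
    second = get_ai_analyzer()
    print(first is second)  # both names refer to the same object
    print(AIStockAnalyzer.instances_created)
```

In tests, `get_ai_analyzer.cache_clear()` resets the cached instance so each test can start from a fresh analyzer.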


        # Analyze user query
        analysis = await ai_analyzer.analyze_stock_query(request.message)

        # Fetch required data based on analysis
        data = await fetch_stock_data(analysis)

        # Generate AI response
        response = await ai_analyzer.generate_response(request.message, data)

        # Store chat in memory (later: save to database)
        if request.session_id not in chat_sessions:
            chat_sessions[request.session_id] = []

        chat_sessions[request.session_id].extend([
            {
                "role": "user",
                "content": request.message,
                "timestamp": datetime.now()
            },
            {
                "role": "assistant",
                "content": response,
                "timestamp": datetime.now(),
                "data": data
            }
        ])

        return ChatResponse(
            response=response,
            data=data,
            session_id=request.session_id,
            timestamp=datetime.now()
Comment on lines +49 to +63
Contributor

medium

You are calling datetime.now() multiple times within this block (lines 49, 54, 63). This can result in slightly different timestamps for the user message, assistant message, and the final response. It's better to capture the timestamp once at the beginning of the request and reuse the same datetime object for consistency.
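
A minimal sketch of the single-timestamp approach, assuming a hypothetical helper `build_chat_entries` that is not part of the PR: the caller captures the time once and passes the same `datetime` object to every record.

```python
from datetime import datetime
from typing import Any, Dict, List


def build_chat_entries(
    message: str, response: str, data: Dict[str, Any], now: datetime
) -> List[Dict[str, Any]]:
    """Build the user and assistant entries so that both share one timestamp."""
    return [
        {"role": "user", "content": message, "timestamp": now},
        {"role": "assistant", "content": response, "timestamp": now, "data": data},
    ]


if __name__ == "__main__":
    now = datetime.now()  # captured once per request
    entries = build_chat_entries("Price of AAPL?", "Here is the data.", {}, now)
    # The same `now` would also be passed as the response timestamp.
    print(entries[0]["timestamp"] == entries[1]["timestamp"])
```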

        )

    except Exception as e:
        print(f"Chat error: {e}")
        raise HTTPException(status_code=500, detail=f"Chat error: {str(e)}")
Contributor

high

Returning the raw exception message str(e) to the client can leak sensitive internal implementation details, which is a security risk. It's better to return a generic error message and log the full exception details on the server for debugging.

Suggested change
- raise HTTPException(status_code=500, detail=f"Chat error: {str(e)}")
+ raise HTTPException(status_code=500, detail="An internal error occurred while processing your request.")
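
The server-side logging half of this advice could look roughly like the sketch below. To keep it runnable without FastAPI, `HTTPException` is a local stand-in for `fastapi.HTTPException`, and `risky_operation` is a hypothetical failure; only the standard `logging` module is used.

```python
import logging

logger = logging.getLogger("chatbot")

GENERIC_ERROR = "An internal error occurred while processing your request."


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException so the sketch runs without FastAPI."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def risky_operation() -> None:
    """Hypothetical failure whose message contains internal details."""
    raise RuntimeError("connection to db-internal:5432 refused")


def handle_chat() -> str:
    try:
        risky_operation()
        return "ok"
    except Exception:
        # logger.exception records the message and the full traceback on the server.
        logger.exception("Chat request failed")
        # The client only ever sees the generic message.
        raise HTTPException(status_code=500, detail=GENERIC_ERROR)
```

This also replaces the `print` call with a logger, so the traceback ends up wherever the application's logging is configured to go.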


@router.get("/sessions/{session_id}")
async def get_chat_history(session_id: str):
    """Get chat history for a session"""
    if session_id in chat_sessions:
        return {"messages": chat_sessions[session_id]}
    return {"messages": []}

@router.delete("/sessions/{session_id}")
async def clear_chat_session(session_id: str):
    """Clear a chat session"""
    if session_id in chat_sessions:
        del chat_sessions[session_id]
    return {"message": "Session cleared"}

async def fetch_stock_data(analysis: Dict[str, Any]) -> Dict[str, Any]:
    """Fetch relevant stock data based on AI analysis"""
    data = {}

    try:
        if analysis["action"] == "get_price":
            for symbol in analysis.get("symbols", [])[:5]:  # Limit to 5 symbols
                stock_data = await get_real_time_stock_data(symbol)
                data[symbol] = stock_data

        elif analysis["action"] == "get_trends":
            for symbol in analysis.get("symbols", [])[:3]:  # Limit to 3 symbols for trends
                trends = await get_stock_trends(symbol, analysis.get("time_range", 30))
                data[f"{symbol}_trends"] = trends

        elif analysis["action"] == "market_summary":
            # Get summary of major indices
            major_stocks = ["AAPL", "GOOGL", "MSFT"]
            for symbol in major_stocks:
                stock_data = await get_real_time_stock_data(symbol)
                data[symbol] = stock_data

        # If no symbols found, provide general market info
        if not data and analysis["action"] in ["get_price", "get_trends"]:
            data["info"] = "No specific stocks mentioned. Try asking about stocks like AAPL, GOOGL, MSFT, or TSLA."

    except Exception as e:
        data["error"] = f"Error fetching stock data: {str(e)}"

    return data
Comment on lines +84 to +113
Contributor

medium

This function contains several hardcoded values, such as symbol limits ([:5], [:3]) and the list of major_stocks. These should be defined as constants at the module level or moved to a configuration file. This will improve maintainability and make them easier to find and change in the future.
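
One possible shape for this refactor is sketched below. The constant names and the `symbols_for` helper are hypothetical; the values are the ones hardcoded in the function under review.

```python
from typing import Any, Dict, List

# Hypothetical constant names; the values mirror the literals in the PR.
MAX_PRICE_SYMBOLS = 5
MAX_TREND_SYMBOLS = 3
DEFAULT_TIME_RANGE = 30
MAJOR_STOCKS = ("AAPL", "GOOGL", "MSFT")


def symbols_for(analysis: Dict[str, Any]) -> List[str]:
    """Pick the symbols to fetch for an analysis, applying the per-action limit."""
    action = analysis.get("action")
    symbols = analysis.get("symbols", [])
    if action == "get_price":
        return symbols[:MAX_PRICE_SYMBOLS]
    if action == "get_trends":
        return symbols[:MAX_TREND_SYMBOLS]
    if action == "market_summary":
        return list(MAJOR_STOCKS)
    return []
```

With the limits named in one place, changing a cap or the default symbol list no longer requires reading through the branching logic.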


@router.get("/health")
async def chatbot_health():
    """Health check endpoint"""
    return {"status": "healthy", "service": "AI Stock Chatbot"}