An automated intelligence pipeline that monitors security news, AI research, BSV/Bastion developments, and CVE feeds. It filters items by keyword, summarises them with the Claude API, and delivers structured reports to Obsidian twice daily. Built as part of the LeightonSec SOC Toolkit.
- Layer: Ingestion
- Receives from: External RSS feeds, CVE databases
- Feeds into: Analyst (Obsidian inbox), future Unified Dashboard
- Gap it fills: Automated threat intelligence and research ingestion
- pipeline.py: Main orchestrator, keyword filtering, report generation
- fetcher.py: RSS ingestion, domain whitelist, rate limiting
- summariser.py: Claude API summarisation, prompt injection hardened
- deduplicator.py: Cross-run deduplication, 1-day expiry window
- scheduler.py: Twice daily automation (07:00 and 19:00)
- run.sh: Shell wrapper for launchd auto-scheduler
- logs/seen_urls.json: Dedup cache (gitignored)
- reports/: Local report storage (gitignored)
✅ Complete and live — LeightonSec/intel-pipeline
✅ Four feed categories: security, ai_research, bsv_bastion, cve
✅ Domain whitelist security layer
✅ Rate limiting between requests
✅ Keyword filtering and deduplication (1-day window)
✅ Claude API summarisation with prompt injection protection
✅ Obsidian inbox delivery
✅ launchd auto-scheduler via run.sh wrapper
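The cross-run deduplication with a 1-day window listed above could look roughly like the following. This is a minimal sketch, not the code in deduplicator.py: the function names `prune` and `filter_new`, the constant `EXPIRY_SECONDS`, and the cache shape (URL mapped to a first-seen timestamp) are all assumptions made for illustration.

```python
import time
from typing import Dict, List, Optional, Tuple

EXPIRY_SECONDS = 24 * 60 * 60  # 1-day window (hypothetical constant name)


def prune(seen: Dict[str, float], now: float) -> Dict[str, float]:
    """Return a copy of the cache without entries older than the window."""
    return {url: ts for url, ts in seen.items() if now - ts < EXPIRY_SECONDS}


def filter_new(
    urls: List[str], seen: Dict[str, float], now: Optional[float] = None
) -> Tuple[List[str], Dict[str, float]]:
    """Return (urls not seen within the window, updated cache)."""
    now = time.time() if now is None else now
    cache = prune(seen, now)
    fresh = []
    for url in urls:
        if url not in cache:
            fresh.append(url)
            cache[url] = now  # also suppresses repeats within the same run
    return fresh, cache
```

In this sketch the caller would load the cache from logs/seen_urls.json before the run and write the returned cache back afterwards, for example with `json.load` and `json.dump`.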
- Fix arxiv feed returning 0 items
- Build v2 multi-agent pipeline — LangGraph, 6 agents (see v2 Architecture below)
- Add feedback loop — rate articles, pipeline learns from ratings
- Slack/email alerts for HIGH-severity items
- Integration with Unified Dashboard
- Six agents: FetcherAgent → FilterAgent → DeduplicatorAgent → EnricherAgent (severity scoring) → SummariserAgent → ReporterAgent
- Conditional routing: skip summarisation if no new items
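The six-agent flow and its conditional route can be sketched in plain Python before any LangGraph wiring. Everything here is illustrative: the state keys, the `step` helper, the placeholder severity rule, and the choice to route straight to the reporter when nothing is new are assumptions, and the summariser is a stub rather than a Claude call.

```python
from typing import Any, Dict

State = Dict[str, Any]  # shared state handed from agent to agent


def step(name: str, state: State, **updates: Any) -> State:
    """Return a new state with updates applied and the agent name traced."""
    return {**state, **updates, "trace": state["trace"] + [name]}


def fetcher_agent(state: State) -> State:
    # Stand-in for RSS fetching: copies pre-supplied items.
    return step("fetch", state, items=list(state["raw_items"]))


def filter_agent(state: State) -> State:
    kept = [i for i in state["items"]
            if any(k in i["title"].lower() for k in state["keywords"])]
    return step("filter", state, items=kept)


def deduplicator_agent(state: State) -> State:
    kept = [i for i in state["items"] if i["url"] not in state["seen"]]
    return step("dedup", state, items=kept)


def enricher_agent(state: State) -> State:
    # Placeholder severity rule, purely hypothetical.
    scored = [{**i, "severity": "HIGH" if "cve" in i["title"].lower() else "LOW"}
              for i in state["items"]]
    return step("enrich", state, items=scored)


def summariser_agent(state: State) -> State:
    # Stub: a real agent would call the summarisation API here.
    return step("summarise", state, summaries=[i["title"] for i in state["items"]])


def reporter_agent(state: State) -> State:
    count = len(state.get("summaries", []))
    return step("report", state, report=f"{count} new items")


def run(state: State) -> State:
    for agent in (fetcher_agent, filter_agent, deduplicator_agent):
        state = agent(state)
    if state["items"]:  # conditional route: only enrich and summarise new items
        state = summariser_agent(enricher_agent(state))
    return reporter_agent(state)
```

Keeping each agent as a pure function from state to state should make the later move to a graph framework mostly a matter of registering the same functions as nodes and expressing the `if` as a conditional edge.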
- Python, feedparser, requests, beautifulsoup4
- Anthropic Claude API (claude-haiku-4-5-20251001)
- LangGraph (installed, v2 not yet integrated)
- schedule, python-dotenv
- API key in .env — never committed
- .env, logs/, reports/, venv/ all gitignored
- Domain whitelist — only approved sources fetched
- Rate limiting between all requests
- Claude system prompt hardened against RSS prompt injection
- Outbound only — nothing external reaches Obsidian
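Two of the controls above, the domain whitelist and rate limiting, can be illustrated with the standard library alone. This is a sketch under assumptions, not the logic in fetcher.py: `ALLOWED_DOMAINS`, `is_allowed`, and `RateLimiter` are hypothetical names, the example domains are placeholders, and requiring HTTPS is an extra assumption not stated in the notes.

```python
import time
from typing import Callable, Optional
from urllib.parse import urlparse

# Placeholder domains; the real list lives with the feed definitions.
ALLOWED_DOMAINS = {"example.com", "feeds.example.org"}


def is_allowed(url: str) -> bool:
    """Accept only HTTPS URLs whose host is a whitelisted domain or subdomain."""
    parsed = urlparse(url)
    if parsed.scheme != "https":  # assumption: plain HTTP is rejected
        return False
    host = (parsed.hostname or "").lower()
    # Match on a dot boundary so "example.com.evil.net" does not pass.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


class RateLimiter:
    """Enforce a minimum gap between consecutive requests."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

A fetch loop would call `is_allowed(url)` before any network access and `limiter.wait()` immediately before each request. The clock and sleep functions are injectable so the limiter can be tested without real delays.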
- All feed sources defined in RSS_FEEDS dict in fetcher.py
- Keywords defined in USER_KEYWORDS list in pipeline.py
- Obsidian path set in OBSIDIAN_PATH in pipeline.py
- Reports always named Intel-YYYY-MM-DD-AM/PM.md
- Never reduce dedup expiry below 1 day
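The report naming convention above (Intel-YYYY-MM-DD-AM/PM.md) can be expressed as a small helper. This is a sketch: `report_name` and `report_path` are hypothetical names, and choosing AM or PM from the hour of the run is an assumption consistent with the 07:00 and 19:00 schedule.

```python
from datetime import datetime
from pathlib import Path


def report_name(now: datetime) -> str:
    """Build the report filename for the run that starts at `now`."""
    period = "AM" if now.hour < 12 else "PM"
    return f"Intel-{now:%Y-%m-%d}-{period}.md"


def report_path(obsidian_path: str, now: datetime) -> Path:
    """Join the Obsidian inbox directory and the report filename."""
    return Path(obsidian_path) / report_name(now)
```

For example, `report_name(datetime(2025, 1, 5, 7, 0))` gives `Intel-2025-01-05-AM.md`, and the same date at 19:00 gives `Intel-2025-01-05-PM.md`.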