
Instagram Scraper

A modular, scalable Instagram scraper built with Instaloader.

Setup

Install dependencies:
Update config/config.json with your Instagram credentials, SMTP settings, and proxies (e.g., from free-proxy-list.net).
Run the scraper:

For profiles: python main.py --profile username
For hashtags: python main.py --hashtag hashtag
For scheduling: python main.py --schedule
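
The three run modes and the config file could be wired together roughly as follows. This is only a sketch: the flag names come from the commands above, but treating them as mutually exclusive, the helper names build_parser and load_config, and the top-level config keys (instagram, smtp, proxies) are assumptions, since the actual layout of main.py and config/config.json is not shown here.

```python
import argparse
import json
from pathlib import Path

# Hypothetical section names; the real config/config.json layout may differ.
REQUIRED_KEYS = ("instagram", "smtp", "proxies")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three documented run modes."""
    parser = argparse.ArgumentParser(prog="main.py", description="Instagram scraper")
    # Assumption: exactly one mode is chosen per invocation.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--profile", metavar="USERNAME", help="scrape one profile")
    mode.add_argument("--hashtag", metavar="HASHTAG", help="scrape one hashtag")
    mode.add_argument("--schedule", action="store_true", help="run scheduled jobs")
    return parser


def load_config(path: Path) -> dict:
    """Read the JSON config and fail early if a top-level section is missing."""
    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing sections: {', '.join(missing)}")
    return config
```

With this shape, an entry point would call build_parser().parse_args() and load_config(Path("config/config.json")) before dispatching on whichever of args.profile, args.hashtag, or args.schedule is set.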

Features

Multi-threading for faster scraping
SQLite storage for efficient metadata management
Proxy rotation to avoid IP bans
Rotating file logs to manage disk space
Error handling with email notifications
Scheduling with logged execution times and rate limit mitigation
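
To illustrate how proxy rotation and multi-threading can fit together, here is a minimal sketch. It does not reflect the project's actual code: ProxyRotator, scrape_target, and scrape_all are hypothetical names, and scrape_target is a placeholder that only records which proxy a target was given instead of calling Instaloader.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class ProxyRotator:
    """Hand out proxies round-robin; safe to share between worker threads."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._lock = threading.Lock()

    def next_proxy(self):
        # The lock keeps two threads from advancing the cycle at the same time.
        with self._lock:
            return next(self._cycle)


def scrape_target(target, rotator):
    """Placeholder for real scraping: returns the target and its assigned proxy."""
    return target, rotator.next_proxy()


def scrape_all(targets, rotator, max_workers=4):
    """Run scrape_target on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda t: scrape_target(t, rotator), targets))
```

Sharing one rotator across all workers spreads requests evenly over the proxy list, which is the point of rotating in the first place.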

Troubleshooting

401 Unauthorized: Usually a sign of rate limiting. Wait 15–30 minutes before retrying, route requests through proxies, or increase the request delays in config/config.json.
IDE Errors: Invalidate caches in PyCharm (File > Invalidate Caches / Restart).
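Dependencies: If pip on macOS refuses to install with an externally-managed-environment error (PEP 668), install into a virtual environment, or pass --break-system-packages to pip as a last resort.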
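
For the 401 case, the "wait and retry" advice can be automated with exponential backoff. The helper below is a sketch under assumptions, not part of this project: retry_with_backoff and its parameters are hypothetical, and the exception types worth retrying depend on what the scraping call actually raises. The 1800-second cap corresponds to the 30-minute upper bound mentioned above.

```python
import random
import time


def retry_with_backoff(func, *, retries=4, base_delay=2.0, max_delay=1800.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and try again.

    The wait doubles on each attempt (base_delay * 2**attempt), is capped at
    max_delay seconds, and gets up to 10% random jitter added on top.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The sleep parameter exists so the waiting can be replaced in tests; in normal use the default time.sleep applies.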

Compliance

Respect Instagram’s terms of service, including rate limits and data usage policies.