A modular, scalable Instagram scraper built with Instaloader.
- Install dependencies.
- Update `config/config.json` with your Instagram credentials, SMTP settings, and proxies (e.g., from free-proxy-list.net).
- Run the scraper:
  - For profiles: `python main.py --profile username`
  - For hashtags: `python main.py --hashtag hashtag`
  - For scheduling: `python main.py --schedule`
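
As an illustration of the three run modes, here is a minimal sketch of how a `main.py` entry point could parse these flags with `argparse`. Only the flag names come from the commands above. The function names `build_parser` and `describe`, and the choice to make the modes mutually exclusive, are assumptions rather than the project's actual code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three run modes (sketch, not the real main.py)."""
    parser = argparse.ArgumentParser(description="Instagram scraper (sketch)")
    # Assumption: exactly one mode is chosen per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--profile", metavar="USERNAME", help="scrape one profile")
    mode.add_argument("--hashtag", metavar="HASHTAG", help="scrape one hashtag")
    mode.add_argument("--schedule", action="store_true", help="run on a schedule")
    return parser


def describe(argv: list[str]) -> str:
    """Return a short label for the mode selected by argv."""
    args = build_parser().parse_args(argv)
    if args.profile:
        return f"profile:{args.profile}"
    if args.hashtag:
        return f"hashtag:{args.hashtag}"
    return "schedule"
```

Calling `describe(["--profile", "username"])` selects the profile mode. Passing two modes at once makes `argparse` exit with a usage error under this sketch's assumption.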
- Multi-threading for faster scraping
- SQLite storage for efficient metadata management
- Proxy rotation to avoid IP bans
- Rotating file logs to manage disk space
- Error handling with email notifications
- Scheduling with logged execution times and rate limit mitigation
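
Two of the features above, proxy rotation and SQLite metadata storage, can be sketched with the standard library alone. This is a simplified illustration, not the project's implementation. The proxy URLs, the `posts` table layout, and the function names `proxy_rotation`, `open_store`, and `save_post` are all hypothetical.

```python
import itertools
import sqlite3

# Hypothetical proxy URLs; real ones would come from the config file.
PROXIES = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]


def proxy_rotation(proxies):
    """Return an iterator that yields the proxies round-robin, forever."""
    return itertools.cycle(proxies)


def open_store(path=":memory:"):
    """Open a SQLite database and create the (hypothetical) posts table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "shortcode TEXT PRIMARY KEY, owner TEXT NOT NULL, taken_at TEXT)"
    )
    return conn


def save_post(conn, shortcode, owner, taken_at):
    """Store one post's metadata; a repeated shortcode is silently skipped."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO posts (shortcode, owner, taken_at) VALUES (?, ?, ?)",
            (shortcode, owner, taken_at),
        )
```

Using the shortcode as the primary key means that re-scraping a profile does not duplicate rows. Each request would take `next(rotation)` to pick its proxy.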
- 401 Unauthorized: Wait 15–30 minutes, use proxies, or adjust delays in `config.json`.
- IDE Errors: Invalidate caches in PyCharm (File > Invalidate Caches / Restart).
- Dependencies: If pip on macOS refuses to install because the environment is externally managed (PEP 668), pass `--break-system-packages`.
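
The "wait and retry" advice for 401 responses can be automated. The sketch below retries with a capped, jittered backoff between 15 and 30 minutes. The `RateLimited` exception and the `call_with_backoff` helper are hypothetical stand-ins. The scraper's real error type and its delay settings in `config.json` may differ.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the error raised on a 401 response."""


def call_with_backoff(fetch, base_delay=900.0, max_delay=1800.0,
                      attempts=3, sleep=time.sleep):
    """Call fetch(); on RateLimited, wait and retry, up to `attempts` calls."""
    for attempt in range(attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller handle it
            # Double the wait each attempt, add up to 10% jitter,
            # then cap the result at max_delay (30 minutes by default).
            jitter = random.uniform(1.0, 1.1)
            sleep(min(base_delay * 2 ** attempt * jitter, max_delay))
```

The `sleep` parameter is injectable so the logic can be exercised without real waiting. In normal use the default `time.sleep` applies.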
Respect Instagram’s terms of service, including rate limits and data usage policies.