PlaystoreScraper

Collects the unique email addresses and names of developers whose apps have a number of downloads at or below a configurable threshold. Progress is shown in real time and also written to emails.txt. Visited links are recorded in visited.txt so that the same link isn't visited twice, which keeps the code from getting stuck in a loop and producing duplicates.
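
The visited-link bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `load_visited` and `mark_visited` are hypothetical, and the sketch assumes visited.txt holds one link per line.

```python
from pathlib import Path

VISITED_FILE = Path("visited.txt")  # file name taken from the README


def load_visited(path=VISITED_FILE):
    """Return the set of links recorded by earlier runs (empty if no file yet)."""
    if not path.exists():
        return set()
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def mark_visited(link, visited, path=VISITED_FILE):
    """Record a link if it is new; return True when it still needs crawling."""
    if link in visited:
        return False  # already seen, skip it to avoid loops and duplicates
    visited.add(link)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(link + "\n")  # append so progress survives a restart
    return True
```

A crawler loop would call `load_visited()` once at startup and only follow a link when `mark_visited(link, visited)` returns True.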

Pre-requisites

  1. Geckodriver (its location must be in the PATH environment variable)
  2. Firefox
  3. Selenium (pip install selenium)
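
A quick way to confirm the three prerequisites is a small check script like the one below. It is only a sketch: `check_prerequisites` is a hypothetical helper, and the Firefox check assumes the executable is named `firefox` and is on PATH, which is typical on Linux but often not the case on Windows or macOS.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report which prerequisites can be located on this machine."""
    return {
        # geckodriver must be discoverable through PATH
        "geckodriver": shutil.which("geckodriver") is not None,
        # assumption: the Firefox executable is called "firefox" and is on PATH
        "firefox": shutil.which("firefox") is not None,
        # selenium is installed with: pip install selenium
        "selenium": importlib.util.find_spec("selenium") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```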

Customizable Options

  1. installThreshold = 500000
    • Only considers apps whose number of installs is less than or equal to 500000
  2. emailsNeeded = 200
    • Stops once a list of 200 emails has been collected
  3. scrollTimeout = 3
    • While a page is loading and the code is looking for more items on it, waits 3 seconds before scrolling down
  4. openBrowser = True
    • If True, opens a Firefox browser window and shows the crawl as it progresses; if False, only the results are shown in the console
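
The options above might be used along these lines. This is a hedged sketch rather than the project's implementation: `parse_installs`, `is_candidate`, and `make_driver` are hypothetical helpers, the install-label formats ("500,000+", "1M+") are assumptions about what the store page displays, and mapping `openBrowser = False` to headless Firefox is one plausible reading of that option.

```python
import re

# Option values as listed in the README
installThreshold = 500000
emailsNeeded = 200
scrollTimeout = 3
openBrowser = True

_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_LABEL = re.compile(r"(\d+(?:,\d{3})*(?:\.\d+)?)\s*([KMB]?)\+?")


def parse_installs(label):
    """Convert a label such as '500,000+' or '1M+' to an int (formats assumed)."""
    match = _LABEL.fullmatch(label.strip().upper())
    if not match:
        return None
    number = float(match.group(1).replace(",", ""))
    return int(number * _SUFFIX[match.group(2)])


def is_candidate(label, threshold=installThreshold):
    """True when the install count is known and at or below the threshold."""
    count = parse_installs(label)
    return count is not None and count <= threshold


def make_driver(show_browser=openBrowser):
    """Create a Firefox driver; run headless when the browser should stay hidden."""
    # Imported here so the helpers above work without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    if not show_browser:
        options.add_argument("-headless")
    return webdriver.Firefox(options=options)
```

With these defaults, `is_candidate("500,000+")` is accepted because the threshold is inclusive, while `is_candidate("1M+")` is rejected.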