Install the required packages:
pip install beautifulsoup4
pip install requests-html
Run:
python main.py
and the default parameters will be used:
website = "https://www.searchenginejournal.com"
keywords_file = "keywords.txt"
Or pass the arguments in the console:
python main.py https://www.searchenginejournal.com keywords.txt
The 'keywords.txt' argument is optional and can be omitted.
As a result, you will receive two files:
- total.csv - a CSV file containing each keyword with its total number of search results,
- urls.csv - a CSV file with every URL collected from Google Search.
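For illustration, total.csv might look something like this (the column names and values here are hypothetical; check the generated file for the actual layout):

keyword,total_results
seo tools,1250000
link building,487000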
The program uses the Scraper class, which loops over the given keywords and collects website links from Google Search. Links are collected only if they belong to the given domain (i.e., are directly connected with the given website), as sketched below.
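A minimal sketch of that idea, assuming the requests-html package installed above. The class name Scraper comes from the project, but the method names, the search URL, and the filtering details are illustrative assumptions, not the actual implementation:

from urllib.parse import quote_plus
from requests_html import HTMLSession

SEARCH_URL = "https://www.google.com/search?q={query}"  # assumed endpoint

class Scraper:
    def __init__(self, website, keywords):
        self.website = website      # only links inside this domain are kept
        self.keywords = keywords    # list of search phrases
        self.session = HTMLSession()

    def run(self):
        """Search every keyword and collect links belonging to the domain."""
        results = {}
        for keyword in self.keywords:
            response = self.session.get(SEARCH_URL.format(query=quote_plus(keyword)))
            # absolute_links yields every absolute URL on the results page;
            # keep only those directly connected with the given website.
            results[keyword] = {url for url in response.html.absolute_links
                                if url.startswith(self.website)}
        return results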
A good improvement would be to add a random user-agent generator and to allow the use of a proxy, for example:
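A hedged sketch of that improvement, again using requests-html; the user-agent strings and the proxy address are placeholders, not part of the project:

import random
from requests_html import HTMLSession

# Placeholder user-agent strings; a real generator could instead draw
# from a library such as fake-useragent.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]

def fetch(url, proxy=None):
    """GET a URL with a random User-Agent, optionally through a proxy."""
    session = HTMLSession()
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    # requests-style proxy mapping, e.g. proxy="http://127.0.0.1:8080"
    proxies = {"http": proxy, "https": proxy} if proxy else None
    return session.get(url, headers=headers, proxies=proxies)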