Chrome-Based-Web-Scraper

A simple web crawler intended to crawl several research-paper publishing websites and extract relevant data. The data is extracted from manuscripts that are available online in PDF form. A query consists of keywords, which are matched against the titles of the research articles. Each search result contains the author's name, affiliation, and email address.
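The matching and extraction steps described above can be sketched with the standard library alone. This is only an illustration and is not taken from Web_Scraper.py: the helper names `title_matches` and `extract_emails` are hypothetical, the rule that every keyword must appear in the title is an assumption, and the email pattern is a simplified one that will miss unusual addresses.

```python
import re

# Simplified email pattern (an assumption; it does not cover every valid address).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def title_matches(title, keywords):
    """Return True if every keyword occurs in the title, ignoring case.

    Requiring all keywords (rather than any) is an assumption.
    """
    lowered = title.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


def extract_emails(text):
    """Return the unique email addresses found in text, in order of appearance."""
    found = []
    for address in EMAIL_RE.findall(text):
        if address not in found:
            found.append(address)
    return found
```

In this sketch, `text` would be the plain text already extracted from a manuscript PDF; the PDF-to-text step itself is not shown.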

Tools:

Scrapy:

Scrapy is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general-purpose web crawler.

Selenium:

Selenium is an open-source tool for automating web browsers. Through its Python bindings, the same commands can drive different browsers, despite the differences in how each browser is designed.

Required:

ChromeDriver matching the version of your Chrome browser: https://chromedriver.chromium.org/downloads
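To check that a downloaded driver fits the installed browser, the two version strings can be compared. The sketch below assumes that agreement on the major version number is sufficient, and that the inputs look like the output of `chrome --version` and `chromedriver --version`; the helper names `major_version` and `versions_compatible` are hypothetical.

```python
import re


def major_version(version_text):
    """Extract the leading major version number from a version string."""
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def versions_compatible(chrome_text, driver_text):
    """Return True if browser and driver share a major version (assumed rule)."""
    return major_version(chrome_text) == major_version(driver_text)
```

For example, a browser reporting `Google Chrome 114.0.5735.90` and a driver reporting `ChromeDriver 114.0.5735.16` would be treated as compatible under this rule.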

typecaster/Chrome-Based-Web-Scraper

Folders and files

Latest commit

History

Repository files navigation

Chrome-Based-Web-Scraper

Tools :

Scrapy:

Selenium:

Required:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages