This project consists of two main parts. The first is a Python script that implements a word search algorithm on a 2D board. The second is a web scraper that extracts issue reports from the Apache Camel project's Jira and stores them in a CSV file. The scraper uses the BeautifulSoup and Selenium libraries to handle static and dynamic web content, respectively.
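The word search part can be sketched as a depth-first search that starts from every cell and backtracks when the path stops matching. This is a hedged sketch of the standard technique, not necessarily the script's exact implementation; the function name `exist` is an assumption.

```python
def exist(board, word):
    """Return True if `word` can be traced through orthogonally adjacent cells,
    using each cell at most once (standard word-search DFS with backtracking)."""
    rows, cols = len(board), len(board[0])

    def dfs(r, c, i):
        if i == len(word):
            return True
        if r < 0 or r >= rows or c < 0 or c >= cols or board[r][c] != word[i]:
            return False
        saved, board[r][c] = board[r][c], "#"  # mark the cell as visited
        found = any(dfs(r + dr, c + dc, i + 1)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
        board[r][c] = saved  # restore the cell when backtracking
        return found

    return any(dfs(r, c, 0) for r in range(rows) for c in range(cols))
```

Marking visited cells in place (and restoring them on backtrack) avoids allocating a separate visited matrix for each search path.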
In this project, BeautifulSoup parses the static HTML of each issue page, while Selenium handles the dynamic content (the comments), which is loaded by JavaScript and cannot be retrieved with a plain HTTP request. This split was chosen deliberately to illustrate the difference between static and dynamic content in web scraping, and how different tools suit each. In a real project, it would be more efficient to load the entire page (both static and dynamic content) with Selenium and then parse the page source with BeautifulSoup; this reduces the number of requests to the server and can speed up scraping. For the purpose of this demonstration, however, the two-tool approach works fine.
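The static side of that split might look like the sketch below. The CSS selectors (`#summary-val`, `#description-val`) are assumptions based on Jira's typical markup, not verified against the live site, and `parse_issue` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

def parse_issue(html):
    """Extract the summary and description from the static HTML of an issue page.
    Selectors are assumed from common Jira markup and may need adjusting."""
    soup = BeautifulSoup(html, "html.parser")
    summary = soup.select_one("#summary-val")
    description = soup.select_one("#description-val")
    return {
        "summary": summary.get_text(strip=True) if summary else None,
        "description": description.get_text(strip=True) if description else None,
    }

# Comments are injected by JavaScript, so parsing the raw HTTP response would
# miss them; that is where Selenium comes in, e.g.:
#   driver.get(url)
#   parse_issue(driver.page_source)
```

Passing `driver.page_source` into the same parser is the "more efficient" combined approach described above: one browser-rendered fetch, then pure BeautifulSoup parsing.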
This project respects the robots.txt file of the Apache Jira website. Please note that the robots.txt file can change over time, and this crawler was designed in accordance with the robots.txt file as of January 12, 2024. Future users of this code should check the current robots.txt file and adjust the crawler behavior as necessary to respect any changes.
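One way to guard against robots.txt changing after the January 12, 2024 snapshot is to check permissions at run time with the standard library. This is a sketch; the `allowed` helper and the wildcard user agent are assumptions, not part of the original crawler.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, url, user_agent="*"):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# In the crawler, robots_txt would be fetched once from the site
# (e.g. https://issues.apache.org/robots.txt) before any scraping begins.
```

Calling `allowed(...)` before each request keeps the crawler compliant even if the rules change between runs.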