
Commit 4da8295

Add updated resources
1 parent 16be0cf commit 4da8295

File tree: 3 files changed, +49 −17 lines

web-scraping-bs4/README.md

Lines changed: 35 additions & 1 deletion
@@ -1,3 +1,37 @@
 # Build a Web Scraper With Requests and Beautiful Soup
 
-This repository contains [`scrape_jobs.py`](https://github.com/realpython/materials/blob/master/web-scraping-bs4/scrape_jobs.py), which is the sample script built in the Real Python tutorial on how to [Build a Web Scraper With Requests and Beautiful Soup](https://realpython.com/beautiful-soup-web-scraper-python/).
+This repository contains `scraper.py`, which is the sample script built in the Real Python tutorial on how to [Build a Web Scraper With Requests and Beautiful Soup](https://realpython.com/beautiful-soup-web-scraper-python/).
+
+## Installation and Setup
+
+1. Create a Python virtual environment
+
+```sh
+$ python -m venv venv/
+$ source venv/bin/activate
+(venv) $
+```
+
+2. Install the requirements
+
+```sh
+(venv) $ pip install -r requirements.txt
+```
+
+## Run the Scraper
+
+Run the scraper script:
+
+```sh
+(venv) $ python scraper.py
+```
+
+You'll see the filtered and formatted Python job listings from the Fake Python job board printed to your console.
+
+## About the Author
+
+Martin Breuss - Email: [email protected]
+
+## License
+
+Distributed under the MIT license. See ``LICENSE`` for more information.
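
The README's "Run the Scraper" step can be sanity-checked end to end. Below is a minimal smoke-test sketch, not part of the commit: it assumes you're in the `web-scraping-bs4/` directory with the requirements installed, and simply runs `scraper.py` and prints the start of its output.

```python
# Hypothetical smoke test for the README's "Run the Scraper" step;
# not part of this commit. Assumes scraper.py sits in the current
# working directory and its requirements are installed.
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "scraper.py"],
    capture_output=True,
    text=True,
)
result.check_returncode()  # Raises CalledProcessError on a non-zero exit
print(result.stdout[:300])  # Show the first few job listings
```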

web-scraping-bs4/requirements.txt

Lines changed: 7 additions & 7 deletions
@@ -1,7 +1,7 @@
-beautifulsoup4==4.9.3
-certifi==2020.12.5
-chardet==4.0.0
-idna==2.10
-requests==2.25.1
-soupsieve==2.2.1
-urllib3==1.26.4
+beautifulsoup4==4.12.3
+certifi==2024.8.30
+charset-normalizer==3.3.2
+idna==3.10
+requests==2.32.3
+soupsieve==2.6
+urllib3==2.2.3
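
Note that the update swaps `chardet` for `charset-normalizer`, which newer `requests` releases bundle for encoding detection. A quick, hypothetical way to confirm the active environment matches these pins (not part of the commit):

```python
# Check the installed packages against the pins in requirements.txt.
# importlib.metadata is in the standard library from Python 3.8 onward.
from importlib.metadata import version

pins = {
    "beautifulsoup4": "4.12.3",
    "certifi": "2024.8.30",
    "charset-normalizer": "3.3.2",
    "idna": "3.10",
    "requests": "2.32.3",
    "soupsieve": "2.6",
    "urllib3": "2.2.3",
}

for package, pinned in pins.items():
    installed = version(package)
    status = "OK" if installed == pinned else f"MISMATCH (got {installed})"
    print(f"{package}=={pinned}: {status}")
```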

web-scraping-bs4/scrape_jobs.py renamed to web-scraping-bs4/scraper.py

Lines changed: 7 additions & 9 deletions
@@ -7,22 +7,20 @@
 soup = BeautifulSoup(page.content, "html.parser")
 results = soup.find(id="ResultsContainer")
 
-# Look for Python jobs
-print("PYTHON JOBS\n==============================\n")
 python_jobs = results.find_all(
     "h2", string=lambda text: "python" in text.lower()
 )
-python_job_elements = [
+
+python_job_cards = [
     h2_element.parent.parent.parent for h2_element in python_jobs
 ]
 
-for job_element in python_job_elements:
-    title_element = job_element.find("h2", class_="title")
-    company_element = job_element.find("h3", class_="company")
-    location_element = job_element.find("p", class_="location")
+for job_card in python_job_cards:
+    title_element = job_card.find("h2", class_="title")
+    company_element = job_card.find("h3", class_="company")
+    location_element = job_card.find("p", class_="location")
     print(title_element.text.strip())
     print(company_element.text.strip())
     print(location_element.text.strip())
-    link_url = job_element.find_all("a")[1]["href"]
+    link_url = job_card.find_all("a")[1]["href"]
     print(f"Apply here: {link_url}\n")
-    print()
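
The hunk above starts at line 7, so the imports and request setup aren't shown. For context, here's a sketch of how the complete, updated `scraper.py` plausibly reads: the first few lines (the imports, the Fake Python job board URL, and the GET request) are assumed from the tutorial rather than taken from this diff, so check the repository for the authoritative version.

```python
# Sketch of the complete updated scraper.py. The imports, URL, and
# GET request below are assumed from the tutorial and are not part
# of this diff.
import requests
from bs4 import BeautifulSoup

URL = "https://realpython.github.io/fake-jobs/"
page = requests.get(URL)

soup = BeautifulSoup(page.content, "html.parser")
results = soup.find(id="ResultsContainer")

# Match only <h2> elements whose text mentions "python", case-insensitively
python_jobs = results.find_all(
    "h2", string=lambda text: "python" in text.lower()
)

# Climb from each matching <h2> up to its enclosing job card element
python_job_cards = [
    h2_element.parent.parent.parent for h2_element in python_jobs
]

for job_card in python_job_cards:
    title_element = job_card.find("h2", class_="title")
    company_element = job_card.find("h3", class_="company")
    location_element = job_card.find("p", class_="location")
    print(title_element.text.strip())
    print(company_element.text.strip())
    print(location_element.text.strip())
    # The second <a> in each card links to the job's application page
    link_url = job_card.find_all("a")[1]["href"]
    print(f"Apply here: {link_url}\n")
```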
