This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Multiple job types for Indeed, urgent keywords column (#56)
* enh(indeed): mult job types
* feat(jobs): urgent kws
* fix(indeed): use new session obj per request
* fix: emails as comma separated in output
* fix: put num urgent words in output
* chore: readme
1 parent 628f4de, commit e5353e6
Showing 12 changed files with 268 additions and 244 deletions.
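The commit message mentions an urgent-keywords column and comma-separated emails, but the hunks shown here do not include the scraper code that computes them. The following is a minimal sketch of how such values could be derived from a job description. The keyword list, the email regex, and the names `URGENT_KEYWORDS`, `count_urgent_words`, and `extract_emails` are all assumptions made for illustration and are not taken from jobspy.

```python
import re
from typing import Optional

# Hypothetical keyword list; the commit does not show the real one.
URGENT_KEYWORDS = ("urgent", "immediate", "asap")

# Deliberately simple email pattern, good enough for a sketch.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def count_urgent_words(description: str) -> int:
    """Count whole-word, case-insensitive occurrences of the urgent keywords."""
    pattern = r"\b(?:" + "|".join(map(re.escape, URGENT_KEYWORDS)) + r")\b"
    return len(re.findall(pattern, description, flags=re.IGNORECASE))


def extract_emails(description: str) -> Optional[str]:
    """Return unique emails joined by commas, or None if none are found."""
    found = EMAIL_PATTERN.findall(description)
    unique = list(dict.fromkeys(found))  # de-duplicate, keep first-seen order
    return ",".join(unique) if unique else None


text = "Urgent hire, start ASAP. Contact hr@example.com or jobs@example.com."
print(count_urgent_words(text))  # 2
print(extract_emails(text))      # hr@example.com,jobs@example.com
```

Returning a single string rather than a list mirrors the `emails (str)` field in the JobPost schema changed by this commit.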
@@ -33,37 +33,19 @@ _Python version >= [3.10](https://www.python.org/downloads/release/python-3100/)

```python
from jobspy import scrape_jobs
import pandas as pd

jobs: pd.DataFrame = scrape_jobs(
jobs = scrape_jobs(
    site_name=["indeed", "linkedin", "zip_recruiter"],
    search_term="software engineer",
    location="Dallas, TX",
    results_wanted=10,

    country_indeed='USA'  # only needed for indeed

    # use if you want to use a proxy
    # proxy="http://jobspy:[email protected]:20001",
    # offset=25  # use if you want to start at a specific offset
)
print(f"Found {len(jobs)} jobs")
print(jobs.head())
jobs.to_csv("jobs.csv", index=False)

# formatting for pandas
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', 50)  # set to 0 to see full job url / desc

# 1 output to console
print(jobs)

# 2 display in Jupyter Notebook (1. pip install jupyter 2. jupyter notebook)
# display(jobs)

# 3 output to .csv
# jobs.to_csv('jobs.csv', index=False)

# 4 output to .xlsx
# output to .xlsx
# jobs.to_excel('jobs.xlsx', index=False)

```
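Because this commit emits emails as a single comma-separated string, that field may contain the CSV delimiter when the results are written to a `.csv` file. The sketch below uses only the standard library `csv` module (not pandas or jobspy) to show that such a field is quoted on write and can be split back into a list on read. The column names `emails` and `num_urgent_words` come from the JobPost schema in this commit; the `title` column and all row values are invented for illustration.

```python
import csv
import io

# Made-up rows shaped like the new output columns.
rows = [
    {"title": "Backend Engineer",
     "emails": "hr@example.com,jobs@example.com",
     "num_urgent_words": 2},
    {"title": "Data Analyst", "emails": "", "num_urgent_words": 0},
]

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "emails", "num_urgent_words"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the data back and split the emails field into individual addresses.
parsed = list(csv.DictReader(io.StringIO(csv_text)))
emails = parsed[0]["emails"].split(",")
print(emails)
```

The default quoting mode wraps any field containing a comma in double quotes, so the multi-email value survives a round trip as one column.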
@@ -117,6 +99,9 @@ JobPost
│ ├── max_amount (int)
│ └── currency (enum)
└── date_posted (date)
└── emails (str)
└── num_urgent_words (int)
└── is_remote (bool) - just for Indeed at the moment
```

### Exceptions

@@ -6,23 +6,23 @@
    search_term="software engineer",
    location="Dallas, TX",
    results_wanted=50,  # be wary: the higher it is, the more likely you'll get blocked (a rotating proxy should work, though)
    country_indeed='USA',
    country_indeed="USA",
    offset=25  # start jobs from an offset (use if search failed and want to continue)
    # proxy="http://jobspy:[email protected]:20001",
)

# formatting for pandas
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', 50)  # set to 0 to see full job url / desc
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)
pd.set_option("display.width", None)
pd.set_option("display.max_colwidth", 50)  # set to 0 to see full job url / desc

# 1: output to console
print(jobs)

# 2: output to .csv
jobs.to_csv('./jobs.csv', index=False)
print('outputted to jobs.csv')
jobs.to_csv("./jobs.csv", index=False)
print("outputted to jobs.csv")

# 3: output to .xlsx
# jobs.to_excel('jobs.xlsx', index=False)

@@ -1,6 +1,6 @@
[tool.poetry]
name = "python-jobspy"
version = "1.1.12"
version = "1.1.13"
description = "Job scraper for LinkedIn, Indeed & ZipRecruiter"
authors = ["Zachary Hampton <[email protected]>", "Cullen Watson <[email protected]>"]
homepage = "https://github.com/cullenwatson/JobSpy"
