Skip to content

Commit 2440b0d

Browse files
committed
Merge branch 'main' of github.com:ed-donner/llm_engineering
2 parents 05dbbeb + 6835737 commit 2440b0d

15 files changed

+3986
-0
lines changed

requirements.txt

+2
Original file line numberDiff line numberDiff line change
@@ -40,3 +40,5 @@ bitsandbytes
4040
psutil
4141
setuptools
4242
speedtest-cli
43+
sentence_transformers
44+
feedparser
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,316 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "1c6700cb-a0b0-4ac2-8fd5-363729284173",
6+
"metadata": {},
7+
"source": [
8+
"# AI-Powered Resume Analyzer for Job Postings"
9+
]
10+
},
11+
{
12+
"cell_type": "markdown",
13+
"id": "a2fa4891-b283-44de-aa63-f017eb9b140d",
14+
"metadata": {},
15+
"source": [
16+
"This tool is designed to analyze resumes against specific job postings, offering valuable insights such as:\n",
17+
"\n",
18+
"- Identification of skill gaps\n",
19+
"- Keyword matching between the CV and the job description\n",
20+
"- Tailored recommendations for CV improvement\n",
21+
"- An alignment score reflecting how well the CV fits the job\n",
22+
"- Personalized feedback \n",
23+
"- Job market trend insights\n",
24+
"\n",
25+
"An example of the tool's output can be found [here](https://tvarol.github.io/sideProjects/AILLMAgents/output.html)."
26+
]
27+
},
28+
{
29+
"cell_type": "code",
30+
"execution_count": null,
31+
"id": "8a6a34ea-191f-4c54-9793-a3eb63faab23",
32+
"metadata": {},
33+
"outputs": [],
34+
"source": [
35+
"# Imports\n",
36+
"import os\n",
37+
"import io\n",
38+
"import time\n",
39+
"import requests\n",
40+
"import PyPDF2\n",
41+
"from dotenv import load_dotenv\n",
42+
"from IPython.display import Markdown, display\n",
43+
"from openai import OpenAI\n",
44+
"from ipywidgets import Textarea, FileUpload, Button, VBox, HTML"
45+
]
46+
},
47+
{
48+
"cell_type": "code",
49+
"execution_count": null,
50+
"id": "04bbe1d3-bacc-400c-aed2-db44699e38f3",
51+
"metadata": {},
52+
"outputs": [],
53+
"source": [
54+
"# Load environment variables\n",
55+
"load_dotenv(override=True)\n",
56+
"api_key = os.getenv('OPENAI_API_KEY')\n",
57+
"\n",
58+
"# Check the key\n",
59+
"if not api_key:\n",
60+
" print(\"No API key was found!!!\")\n",
61+
"else:\n",
62+
" print(\"API key found and looks good so far!\")"
63+
]
64+
},
65+
{
66+
"cell_type": "code",
67+
"execution_count": null,
68+
"id": "27bfcee1-58e6-4ff2-9f12-9dc5c1aa5b5b",
69+
"metadata": {},
70+
"outputs": [],
71+
"source": [
72+
"openai = OpenAI()"
73+
]
74+
},
75+
{
76+
"cell_type": "markdown",
77+
"id": "c82e79f2-3139-4520-ac01-a728c11cb8b9",
78+
"metadata": {},
79+
"source": [
80+
"## Using a Frontier Model GPT-4o Mini for This Project\n",
81+
"\n",
82+
"### Types of Prompts\n",
83+
"\n",
84+
"Models like GPT4o have been trained to receive instructions in a particular way.\n",
85+
"\n",
86+
"They expect to receive:\n",
87+
"\n",
88+
"**A system prompt** that tells them what task they are performing and what tone they should use\n",
89+
"\n",
90+
"**A user prompt** -- the conversation starter that they should reply to"
91+
]
92+
},
93+
{
94+
"cell_type": "code",
95+
"execution_count": null,
96+
"id": "0da158ad-c3a8-4cef-806f-be0f90852996",
97+
"metadata": {},
98+
"outputs": [],
99+
"source": [
100+
"# Define our system prompt \n",
101+
"system_prompt = \"\"\"You are a powerful AI model designed to assist with resume analysis. Your task is to analyze a resume against a given job posting and provide feedback on how well the resume aligns with the job requirements. Your response should include the following: \n",
102+
"1) Skill gap identification: Compare the skills listed in the resume with those required in the job posting, highlighting areas where the resume may be lacking or overemphasized.\n",
103+
"2) Keyword matching between a CV and a job posting: Match keywords from the job description with the resume, determining how well they align. Provide specific suggestions for missing keywords to add to the CV.\n",
104+
"3) Recommendations for CV improvement: Provide actionable suggestions on how to enhance the resume, such as adding missing skills or rephrasing experience to match job requirements.\n",
105+
"4) Alignment score: Display a score that represents the degree of alignment between the resume and the job posting.\n",
106+
"5) Personalized feedback: Offer tailored advice based on the job posting, guiding the user on how to optimize their CV for the best chances of success.\n",
107+
"6) Job market trend insights, provide broader market trends and insights, such as in-demand skills and salary ranges.\n",
108+
"Provide responses that are concise, clear, and to the point. Respond in markdown.\"\"\""
109+
]
110+
},
111+
{
112+
"cell_type": "code",
113+
"execution_count": null,
114+
"id": "ebdb34b0-85bd-4e36-933a-20c3c42e833b",
115+
"metadata": {},
116+
"outputs": [],
117+
"source": [
118+
"# The job posting and the CV are required to define the user prompt\n",
119+
"# The user will input the job posting as text in a box here\n",
120+
"# The user will upload the CV in PDF format, from which the text will be extracted\n",
121+
"\n",
122+
"# You might need to install PyPDF2 via pip if it's not already installed\n",
123+
"# !pip install PyPDF2\n",
124+
"\n",
125+
"# Create widgets - to create a box for the job posting text\n",
126+
"job_posting_area = Textarea(\n",
127+
" placeholder='Paste the job posting text here...',\n",
128+
" description='Job Posting:',\n",
129+
" disabled=False,\n",
130+
" layout={'width': '800px', 'height': '300px'}\n",
131+
")\n",
132+
"\n",
133+
"# Define file upload for CV\n",
134+
"cv_upload = FileUpload(\n",
135+
" accept='.pdf', # Only accept PDF files\n",
136+
" multiple=False, # Only allow single file selection\n",
137+
" description='Upload CV (PDF)'\n",
138+
")\n",
139+
"\n",
140+
"status = HTML(value=\"<b>Status:</b> Waiting for inputs...\")\n",
141+
"\n",
142+
"# Create Submit Buttons\n",
143+
"submit_cv_button = Button(description='Submit CV', button_style='success')\n",
144+
"submit_job_posting_button = Button(description='Submit Job Posting', button_style='success')\n",
145+
"\n",
146+
"# Initialize variables to store the data\n",
147+
"# This dictionary will hold the text for both the job posting and the CV\n",
148+
"# It will be used to define the user_prompt\n",
149+
"for_user_prompt = {\n",
150+
" 'job_posting': '',\n",
151+
" 'cv_text': ''\n",
152+
"}\n",
153+
"\n",
154+
"# Functions\n",
155+
"def submit_cv_action(change):\n",
156+
"\n",
157+
" if not for_user_prompt['cv_text']:\n",
158+
" status.value = \"<b>Status:</b> Please upload a CV before submitting.\"\n",
159+
" \n",
160+
" if cv_upload.value:\n",
161+
" # Get the uploaded file\n",
162+
" uploaded_file = cv_upload.value[0]\n",
163+
" content = io.BytesIO(uploaded_file['content'])\n",
164+
" \n",
165+
" try:\n",
166+
" pdf_reader = PyPDF2.PdfReader(content) \n",
167+
" cv_text = \"\"\n",
168+
" for page in pdf_reader.pages: \n",
169+
" cv_text += page.extract_text() \n",
170+
" \n",
171+
" # Store CV text in for_user_prompt\n",
172+
" for_user_prompt['cv_text'] = cv_text\n",
173+
" status.value = \"<b>Status:</b> CV uploaded and processed successfully!\"\n",
174+
" except Exception as e:\n",
175+
" status.value = f\"<b>Status:</b> Error processing PDF: {str(e)}\"\n",
176+
"\n",
177+
" time.sleep(0.5) # Short pause between upload and submit messages to display both\n",
178+
" \n",
179+
" if for_user_prompt['cv_text']:\n",
180+
" #print(\"CV Submitted:\")\n",
181+
" #print(for_user_prompt['cv_text'])\n",
182+
" status.value = \"<b>Status:</b> CV submitted successfully!\"\n",
183+
" \n",
184+
"def submit_job_posting_action(b):\n",
185+
" for_user_prompt['job_posting'] = job_posting_area.value\n",
186+
" if for_user_prompt['job_posting']:\n",
187+
" #print(\"Job Posting Submitted:\")\n",
188+
" #print(for_user_prompt['job_posting'])\n",
189+
" status.value = \"<b>Status:</b> Job posting submitted successfully!\"\n",
190+
" else:\n",
191+
" status.value = \"<b>Status:</b> Please enter a job posting before submitting.\"\n",
192+
"\n",
193+
"# Attach actions to buttons\n",
194+
"submit_cv_button.on_click(submit_cv_action)\n",
195+
"submit_job_posting_button.on_click(submit_job_posting_action)\n",
196+
"\n",
197+
"# Layout\n",
198+
"job_posting_box = VBox([job_posting_area, submit_job_posting_button])\n",
199+
"cv_buttons = VBox([submit_cv_button])\n",
200+
"\n",
201+
"# Display all widgets\n",
202+
"display(VBox([\n",
203+
" HTML(value=\"<h3>Input Job Posting and CV</h3>\"),\n",
204+
" job_posting_box, \n",
205+
" cv_upload,\n",
206+
" cv_buttons,\n",
207+
" status\n",
208+
"]))"
209+
]
210+
},
211+
{
212+
"cell_type": "code",
213+
"execution_count": null,
214+
"id": "364e42a6-0910-4c7c-8c3c-2ca7d2891cb6",
215+
"metadata": {},
216+
"outputs": [],
217+
"source": [
218+
"# Now define user_prompt using for_user_prompt dictionary\n",
219+
"# Clearly label each input to differentiate the job posting and CV\n",
220+
"# The model can parse and analyze each section based on these labels\n",
221+
"user_prompt = f\"\"\"\n",
222+
"Job Posting: \n",
223+
"{for_user_prompt['job_posting']}\n",
224+
"\n",
225+
"CV: \n",
226+
"{for_user_prompt['cv_text']}\n",
227+
"\"\"\""
228+
]
229+
},
230+
{
231+
"cell_type": "markdown",
232+
"id": "3b51dda0-9a0c-48f4-8ec8-dae32c29da24",
233+
"metadata": {},
234+
"source": [
235+
"## Messages\n",
236+
"\n",
237+
"The API from OpenAI expects to receive messages in a particular structure.\n",
238+
"Many of the other APIs share this structure:\n",
239+
"\n",
240+
"```\n",
241+
"[\n",
242+
" {\"role\": \"system\", \"content\": \"system message goes here\"},\n",
243+
" {\"role\": \"user\", \"content\": \"user message goes here\"}\n",
244+
"]"
245+
]
246+
},
247+
{
248+
"cell_type": "code",
249+
"execution_count": null,
250+
"id": "3262c0b9-d3de-4e4f-b535-a25c0aed5783",
251+
"metadata": {},
252+
"outputs": [],
253+
"source": [
254+
"# Define messages with system_prompt and user_prompt\n",
255+
"def messages_for(system_prompt_input, user_prompt_input):\n",
256+
" return [\n",
257+
" {\"role\": \"system\", \"content\": system_prompt_input},\n",
258+
" {\"role\": \"user\", \"content\": user_prompt_input}\n",
259+
" ]"
260+
]
261+
},
262+
{
263+
"cell_type": "code",
264+
"execution_count": null,
265+
"id": "2409ac13-0b39-4227-b4d4-b4c0ff009fd7",
266+
"metadata": {},
267+
"outputs": [],
268+
"source": [
269+
"# And now: call the OpenAI API. \n",
270+
"response = openai.chat.completions.create(\n",
271+
" model = \"gpt-4o-mini\",\n",
272+
" messages = messages_for(system_prompt, user_prompt)\n",
273+
")\n",
274+
"\n",
275+
"# Response is provided in Markdown and displayed accordingly\n",
276+
"display(Markdown(response.choices[0].message.content))"
277+
]
278+
},
279+
{
280+
"cell_type": "code",
281+
"execution_count": null,
282+
"id": "86ab71cf-bd7e-45f7-9536-0486f349bfbe",
283+
"metadata": {},
284+
"outputs": [],
285+
"source": [
286+
"## If you would like to save the response content as a Markdown file, uncomment the following lines\n",
287+
"#with open('yourfile.md', 'w') as file:\n",
288+
"# file.write(response.choices[0].message.content)\n",
289+
"\n",
290+
"## You can then run the line below to create output.html which you can open on your browser\n",
291+
"#!pandoc yourfile.md -o output.html"
292+
]
293+
}
294+
],
295+
"metadata": {
296+
"kernelspec": {
297+
"display_name": "Python 3 (ipykernel)",
298+
"language": "python",
299+
"name": "python3"
300+
},
301+
"language_info": {
302+
"codemirror_mode": {
303+
"name": "ipython",
304+
"version": 3
305+
},
306+
"file_extension": ".py",
307+
"mimetype": "text/x-python",
308+
"name": "python",
309+
"nbconvert_exporter": "python",
310+
"pygments_lexer": "ipython3",
311+
"version": "3.11.11"
312+
}
313+
},
314+
"nbformat": 4,
315+
"nbformat_minor": 5
316+
}

0 commit comments

Comments
 (0)