87 changes: 87 additions & 0 deletions Playground/Basil/Readme.md
@@ -0,0 +1,87 @@
# Resume Job Matcher

A notebook-based prototype that ranks job postings against a resume using a combination of text similarity and skill overlap.

## What It Does

This project compares a resume with a set of job descriptions and produces ranked matches. It uses two signals:

- TF-IDF + cosine similarity for overall text relevance
- Skill extraction and overlap for a more targeted match score

The notebook then combines those scores into a final prototype score and exports the results.
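The text-relevance signal can be sketched with scikit-learn as follows. This is a minimal sketch, not the notebook's actual code; the sample strings are illustrative stand-ins for `data/resume.txt` and `data/jobs.csv`:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative inputs; the notebook reads these from data/ instead.
resume = "Python SQL machine learning dashboards and reporting"
jobs = [
    "Looking for Python and SQL experience building dashboards",
    "Seeking a pastry chef with cake decorating experience",
]

# Fit one shared vocabulary over the resume and all postings, then
# compare the resume (row 0) against every job row.
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([resume] + jobs)
text_scores = cosine_similarity(matrix[0:1], matrix[1:]).flatten()
```

Each entry of `text_scores` is the cosine similarity between the resume and one posting, so the data-adjacent job scores well above the unrelated one.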

## Project Structure

```text
resume_job_matcher/
├── data/
│   ├── jobs.csv
│   └── resume.txt
├── notebooks/
│   └── resume_job_matching.ipynb
├── outputs/
│   ├── final_prototype_results.csv
│   ├── ranked_jobs.csv
│   └── prototype_results_chart.png
├── src/
└── requirements.txt
```
Comment on lines +16 to +29 — Copilot AI, Apr 5, 2026:

> The README project structure block appears to describe a different top-level folder (`resume_job_matcher/`) and includes a `src/` directory that doesn't exist under `Playground/Basil`. Please update the structure example to match the actual committed layout (or add the missing folder) to avoid misleading run instructions.

## Notebook Workflow

The notebook is organized into clear sections:

1. Imports and setup
2. Load input data
3. Text preprocessing
4. Text similarity matching
5. Skill extraction and overlap
6. Final scoring and ranking
7. Reporting, visualization, and export
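The skill-overlap step (section 5 above) can be sketched as a set intersection. The skill vocabulary below is hypothetical; the notebook's actual list may differ:

```python
# Hypothetical skill vocabulary; the notebook's actual list may differ.
SKILLS = {"python", "sql", "machine learning", "power bi", "tableau", "excel"}

def extract_skills(text: str) -> set[str]:
    """Return the vocabulary skills that appear in the text."""
    lowered = text.lower()
    return {skill for skill in SKILLS if skill in lowered}

def skill_overlap(resume: str, job: str) -> float:
    """Fraction of the job's required skills also found in the resume."""
    job_skills = extract_skills(job)
    if not job_skills:
        return 0.0
    return len(extract_skills(resume) & job_skills) / len(job_skills)
```

Dividing by the job's skill count (rather than the resume's) means the score reflects how much of the posting's ask the candidate covers.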

## Requirements

Install the Python packages listed in `requirements.txt`.

Typical dependencies include:

- pandas
- nltk
- scikit-learn
- matplotlib

## How to Run

1. Make sure the input files are available in `data/`.
2. Open `notebooks/resume_job_matching.ipynb`.
3. Run the cells from top to bottom.
4. Review the ranked matches and generated outputs in `outputs/`.

## Inputs

- `data/resume.txt`: Plain-text resume used as the matching profile
- `data/jobs.csv`: Job postings dataset with job titles, companies, and descriptions

## Outputs

The notebook writes the following files to `outputs/`:

- `ranked_jobs.csv`: Jobs ranked by similarity score
- `final_prototype_results.csv`: Final combined results with match and skill scores
- `prototype_results_chart.png`: Bar chart of final prototype scores
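The ranking-and-export step can be sketched with pandas. The column names here are hypothetical, and a `StringIO` buffer stands in for the notebook's `outputs/ranked_jobs.csv` target:

```python
import io

import pandas as pd

# Hypothetical columns; the notebook's real schema may differ.
results = pd.DataFrame(
    {
        "title": ["Data Analyst", "AI Engineer Intern", "BI Analyst"],
        "final_score": [0.82, 0.64, 0.71],
    }
)

# Rank descending by the combined score before writing.
ranked = results.sort_values("final_score", ascending=False).reset_index(drop=True)

# The notebook writes to outputs/ranked_jobs.csv; a buffer stands in here.
buffer = io.StringIO()
ranked.to_csv(buffer, index=False)
```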

## Matching Logic

The ranking is based on a weighted score:

- 70% text match score
- 30% skill overlap score

This keeps the prototype simple while still reflecting both broad relevance and concrete skill alignment.
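Assuming both signals are normalised to the [0, 1] range, the weighting reduces to a one-line combination:

```python
TEXT_WEIGHT = 0.7
SKILL_WEIGHT = 0.3

def final_score(text_score: float, skill_score: float) -> float:
    """Combine the two signals with the prototype's 70/30 weighting."""
    return TEXT_WEIGHT * text_score + SKILL_WEIGHT * skill_score
```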

## Notes

- The notebook downloads the NLTK `punkt` and `stopwords` resources on first run.
- If you change the resume or job data, rerun the notebook from the top to refresh all outputs.
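The first-run downloads mentioned above can be made idempotent with a guard like the following. This is a sketch; the notebook may simply call `nltk.download` unconditionally:

```python
import nltk

# Download punkt and stopwords only when they are missing locally,
# so reruns skip the network round trip.
for path, name in [("tokenizers/punkt", "punkt"), ("corpora/stopwords", "stopwords")]:
    try:
        nltk.data.find(path)
    except LookupError:
        nltk.download(name, quiet=True)
```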

6 changes: 6 additions & 0 deletions Playground/Basil/data/jobs.csv
@@ -0,0 +1,6 @@
job_id,title,company,description
1,Data Analyst,Company A,"Looking for a candidate with experience in Python, SQL, dashboards, reporting, and data visualization. Knowledge of Tableau or Power BI is preferred."
2,Junior Data Scientist,Company B,"Seeking a graduate with experience in machine learning, Python, data preprocessing, feature engineering, and model evaluation."
3,Machine Learning Intern,Company C,"The ideal applicant has worked on machine learning projects, computer vision tasks, Python programming, and model training workflows."
4,AI Engineer Intern,Company D,"Looking for students with exposure to deep learning, NLP, Python, data pipelines, and practical AI project experience."
5,Business Intelligence Analyst,Company E,"Candidates should have knowledge of SQL, reporting, dashboard creation, Power BI, Excel, and stakeholder communication."
85 changes: 85 additions & 0 deletions Playground/Basil/data/resume.txt
@@ -0,0 +1,85 @@
Basil Behanan
Tel: +61 468410310
Email: [email protected]
LinkedIn: LinkedIn
Comment on lines +2 to +4 — Copilot AI, Apr 5, 2026:

> This file includes personal identifiable information (phone number, email). Please remove/redact PII before committing (e.g., replace with a synthetic/sample resume) to avoid leaking sensitive data in the repository history.

Suggested change:
  - Tel: +61 468410310
  - Email: Basilbehanan5@gmail.com
  - LinkedIn: LinkedIn
  + Tel: +61 400 000 000
  + Email: sample.resume@example.com
  + LinkedIn: linkedin.com/in/sample-profile

Career Profile
Final-year Computer Science student majoring in Data Science with experience applying data analytics, machine learning, and data engineering techniques to solve complex problems. Skilled in extracting, transforming, and analysing large datasets using Python, SQL, and statistical modelling tools to generate actionable insights. Experienced working in collaborative technical teams to develop analytical solutions, improve data workflows, and communicate technical findings to technical and non-technical stakeholders.
Key Skills
Programming & Analytics
Python, SQL, R, C++, C#, Excel, Power BI
Data Science & Machine Learning
TensorFlow, Scikit-learn, OpenCV, statistical modelling, feature engineering, model evaluation
Data Engineering & Infrastructure
ETL workflows, Docker, Git, GitHub Actions (CI/CD), Google Cloud Storage, AWS S3, Kubernetes
Data Systems & Visualisation
Power BI, MySQL, Jupyter Lab, Excel analytics
Concepts
Data lifecycle management, distributed computing, machine learning pipelines, data quality validation, reproducible experimentation

Education

Deakin University Burwood, VIC
Bachelor of Computer Science Graduation Date: June 2026
Major: Data Science
Achievements: Above 75 WAM
Relevant Coursework: Intro to Data Science and AI, Machine Learning, Database Fundamentals, Linear Algebra for Data Analysis, Discrete Mathematics, Data Wrangling, Applied Algebra and Statistics

Professional Experience

Superstat Richmond, VIC
Computer Vision and Sports Analytics Intern Jul 2025 – Sep 2025
• Prepared and validated over 2,000 annotated AFL video frames to support machine learning model development and analytical workflows.
• Contributed to computer vision classification and OCR pipelines achieving ~90% and ~95% accuracy respectively.
• Implemented CI/CD workflows using GitHub Actions, automating testing, dependency installation, and validation checks for machine learning experiments.
• Standardised development environments using Docker, improving reproducibility across machine learning experimentation.
• Utilised AWS S3 for dataset storage and cloud-based experimentation workflows.
• Collaborated closely with machine learning engineers to evaluate models and refine data preparation and validation processes.
Leadership Experience

RedBack Operations - Project Orion (Sports AI Pipeline) Burwood, VIC
Team Lead – Sports Analytics & Annotation Jul 2025 – Sep 2025
• Led a team of five analysts to build a scalable annotation pipeline supporting machine learning training across 10,000+ video frames.
• Designed structured data schemas and quality-control processes, improving dataset reliability by 30%+.
• Managed dataset storage and access workflows using Google Cloud Storage to support distributed team collaboration.
• Implemented Git-based version control workflows, coordinating dataset and code updates across teams through GitHub repositories.
• Collaborated cross-functionally with engineering and analytics teams to ensure datasets aligned with model validation requirements.
• Produced documentation and onboarding guides to streamline contributor onboarding and workflow consistency.
Project Experience

Distributed Recommendation System (Parallel ML Processing)
• Built distributed Python-based recommendation engine using MPI, reducing execution time by 38%.
• Designed scalable preprocessing workflows and evaluation pipelines.
• Applied collaborative and content-based filtering techniques.
Financial Data Analysis - Cryptocurrency Market Study
• Performed exploratory data analysis on Bitcoin market data using Python (NumPy, Matplotlib) to analyse volatility trends and trading behaviour.
• Generated statistical summaries and visualisations to interpret financial data patterns and market fluctuations.


Health Data Analytics - BMI Population Analysis
• Analysed NHANES health datasets using Python and NumPy to study BMI distribution across population groups.
• Applied statistical analysis, correlation matrices, scatterplot analysis, and data visualisation techniques.
Machine Learning – IoT Cyber-Attack Classification
• Developed machine learning models to detect cyber-attacks within IoT networks using supervised learning techniques.
• Evaluated model performance using accuracy, precision, recall, and confusion matrix metrics.
Predictive Analytics – Heart Disease Prediction System
• Designed a machine learning pipeline combining feature selection, PCA dimensionality reduction, and ensemble learning models.
• Implemented models including Random Forest, Logistic Regression, and XGBoost for predictive health analytics.
Courses and Certificates

• Data Analysis with Python (freeCodeCamp)
• Intro to C++ (LCC Computer Education)

Referees

Available upon request


1 change: 1 addition & 0 deletions Playground/Basil/data/temp
@@ -0,0 +1 @@

Copilot AI, Apr 5, 2026:

> The file data/temp is empty and looks like a placeholder/accidental artifact. Please remove it, or if it's meant to keep the directory in git, rename to a conventional placeholder like `.gitkeep` with an explanatory comment in the README.

Suggested change:
  + This placeholder file exists only to keep the `data` directory in version control.
  + If repository-wide file operations are allowed, prefer renaming this file to `.gitkeep`
  + and documenting that convention in the README.