Open
62 commits
b100c5a
raw data
maypri20 Sep 22, 2025
c1f35ee
jupyter lab cleaned csv
juliaalcribeiro Sep 22, 2025
094d16c
testing
juliaalcribeiro Sep 22, 2025
cfdb18e
updated jupyter notebook
juliaalcribeiro Sep 22, 2025
efea323
cleaning data
maypri20 Sep 23, 2025
d0b0379
Deleted .virtual_documents/^folder
juliaalcribeiro Sep 23, 2025
ffb04cf
Merge pull request #2 from MBengochea/julia
juliaalcribeiro Sep 23, 2025
fd29e44
Deleted unwanted files
maypri20 Sep 23, 2025
41a40bc
Deleted .virtual_documents/ folder
maypri20 Sep 23, 2025
7aaaebe
Merge branch 'main' into maypri20
maypri20 Sep 23, 2025
3f97942
Merge pull request #1 from MBengochea/maypri20
maypri20 Sep 23, 2025
33a4511
Jupyter notebook 23_09 - morning
juliaalcribeiro Sep 23, 2025
8a5e83c
Merge pull request #3 from MBengochea/julia
juliaalcribeiro Sep 23, 2025
54f41d5
Update boardgames_df_julia_2309.ipynb
MBengochea Sep 23, 2025
3f57185
Added Streamlit dependency via uv
MBengochea Sep 23, 2025
cabc382
Streamlit changes
MBengochea Sep 23, 2025
acfc5c6
two notebooks added and ERD jpeg
RCastanheira03 Sep 23, 2025
755f1ad
Merge pull request #4 from MBengochea/Ricardo
RCastanheira03 Sep 23, 2025
0109c33
update
RCastanheira03 Sep 23, 2025
788fb71
update v2
RCastanheira03 Sep 23, 2025
d3052d8
App files changed
MBengochea Sep 23, 2025
f685600
Merge pull request #5 from MBengochea/Mauricio
MBengochea Sep 23, 2025
f5d6acb
Merge pull request #6 from MBengochea/Ricardo
RCastanheira03 Sep 23, 2025
0902e6f
Update
MBengochea Sep 23, 2025
bda4269
Merge pull request #7 from MBengochea/Mauricio
MBengochea Sep 23, 2025
3d748f7
Streamlit changes in app.py
MBengochea Sep 24, 2025
1b42cf5
Merge pull request #8 from MBengochea/Mauricio
MBengochea Sep 24, 2025
3ff7f39
Corrected app code
juliaalcribeiro Sep 24, 2025
fd8aa79
Corrected app code, cleaned db, deleted raw data copy
juliaalcribeiro Sep 24, 2025
fdc8c7a
Merge pull request #9 from MBengochea/julia
juliaalcribeiro Sep 24, 2025
47522fb
yaml file updates
juliaalcribeiro Sep 24, 2025
bd0f38d
Merge pull request #10 from MBengochea/julia
juliaalcribeiro Sep 24, 2025
171712b
requirements.txt
juliaalcribeiro Sep 24, 2025
ebfb427
Merge pull request #11 from MBengochea/julia
juliaalcribeiro Sep 24, 2025
b0260a1
csv files
maypri20 Sep 24, 2025
75cf036
csv files'
maypri20 Sep 24, 2025
465cae8
Merge pull request #12 from MBengochea/maypri20
maypri20 Sep 24, 2025
93a0597
added csv files for SQL, added corresponding notebooks
RCastanheira03 Sep 24, 2025
4440c3f
Merge pull request #13 from MBengochea/Ricardo
RCastanheira03 Sep 24, 2025
add657c
testing new csv file
RCastanheira03 Sep 24, 2025
19a8187
Fix error with app.py code after ignacio's help, was HEAD >>>> in the…
MBengochea Sep 24, 2025
7a64b67
Merge pull request #14 from MBengochea/Mauricio
MBengochea Sep 24, 2025
7224b8d
code for updated csvs
RCastanheira03 Sep 25, 2025
5282b8f
Merge pull request #15 from MBengochea/Ricardo
RCastanheira03 Sep 25, 2025
7839f83
julia 2409 backup
juliaalcribeiro Sep 25, 2025
71ac5f1
julia 2409 backup
juliaalcribeiro Sep 25, 2025
721126e
julia 2409 backup
juliaalcribeiro Sep 25, 2025
1ea2bca
Merge pull request #19 from MBengochea/julia
juliaalcribeiro Sep 25, 2025
aaebbb6
sql tables creation at jupyter notebook
juliaalcribeiro Sep 25, 2025
62d78d3
Merge pull request #20 from MBengochea/julia
juliaalcribeiro Sep 25, 2025
dff6c50
sql database creation script
juliaalcribeiro Sep 25, 2025
a6c69d7
Merge pull request #21 from MBengochea/julia
juliaalcribeiro Sep 25, 2025
775556d
Update code_for_graphs_ricardo.ipynb
MBengochea Sep 25, 2025
7010a49
SQL data base script and CSV games table
juliaalcribeiro Sep 26, 2025
0d062f7
Merge pull request #22 from MBengochea/julia
juliaalcribeiro Sep 26, 2025
93a6517
modifications
maypri20 Sep 26, 2025
0682c57
Merge pull request #23 from MBengochea/main
maypri20 Sep 26, 2025
ec4b6c6
conflict
maypri20 Sep 26, 2025
6e9e17c
Correlation charts
MBengochea Sep 26, 2025
f57f14d
Merge branch 'main' into maypri20
maypri20 Sep 26, 2025
210ec09
final submission
maypri20 Sep 29, 2025
5a8c0dc
Merge branch 'maypri20' of https://github.com/MBengochea/Project_w4 i…
maypri20 Sep 29, 2025
175 changes: 124 additions & 51 deletions README.md
@@ -1,77 +1,150 @@
# Project overview
...
# **🎲 Board Games Analysis & Recommendation Engine**

# Installation
## **Project Overview**

1. **Clone the repository**:
This project analyzes a **board games dataset** to uncover trends and build useful tools for players and enthusiasts.

```bash
git clone https://github.com/YourUsername/repository_name.git
```
* The data was **cleaned, organized, and stored in a SQL database** for efficient querying.

2. **Install UV**
* A **filtering-based recommendation engine** (Python \+ Streamlit) helps users discover games based on:

If you're a MacOS/Linux user type:
* Ratings

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
* Playtime

If you're a Windows user open an Anaconda Powershell Prompt and type :
* Number of players

```bash
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```
* Complexity

3. **Create an environment**
* Visual charts highlight:

```bash
uv venv
```
* Top-rated games

3. **Activate the environment**
* Most-played games

If you're a MacOS/Linux user type (if you're using a bash shell):
* Most-wishlisted games

```bash
source ./venv/bin/activate
```
Overall, this project combines **data engineering, analysis, and visualization** to deliver practical insights and personalized recommendations.

If you're a MacOS/Linux user type (if you're using a csh/tcsh shell):
📑 **Presentation**: [Board Game Database](https://docs.google.com/presentation/d/1pBzKzOzYKCxy7n_QEZRLTVhZnCo_pdmCDucYisod0L8/edit?usp=sharing)

```bash
source ./venv/bin/activate.csh
```
---

If you're a Windows user type:
## **Data Sources**

```bash
.\venv\Scripts\activate
```
* **Board Game CSV file**: Dataset with \~2000 rows

4. **Install dependencies**:
* **YAML configuration**: Used to dynamically manage file paths and input/output references

```bash
uv pip install -r requirements.txt
```
---

# Questions
...
## **Figures**

# Dataset
...
* **ER Diagram**: A visual representation of the database logical structure

## Main dataset issues
---

- ...
- ...
- ...
## **Libraries Used**

## Solutions for the dataset issues
...
* pandas

# Conclussions
...
* streamlit

* seaborn

* matplotlib

* PyYAML

---

## **Jupyter Notebooks**

### **`code_recommendation_engine.ipynb`**

* Load the database

* Drop irrelevant columns and clean `age` column

* Export cleaned database to CSV

* Build recommendation engine based on:

* Minimum age

* Number of players

* Level of complexity

* Playing time

* Order results by average rating
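
The filtering steps listed above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the column names are borrowed from `app/app.py` in this same change, while the `recommend` function, its parameters, and the sample rows are hypothetical. It also simplifies complexity to a single upper bound.

```python
import pandas as pd


def recommend(games, playtime, players, youngest_age, max_complexity, top_n=5):
    """Return up to top_n games that pass every filter, best rated first."""
    mask = (
        games["min_playtime"].le(playtime)
        & games["max_playtime"].ge(playtime)
        & games["min_players"].le(players)
        & games["max_players"].ge(players)
        & games["minimum_age"].le(youngest_age)
        & games["complexity"].le(max_complexity)
    )
    return games[mask].sort_values("avg_rating", ascending=False).head(top_n)


# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame(
    {
        "boardgame": ["Game A", "Game B", "Game C"],
        "min_playtime": [30, 60, 20],
        "max_playtime": [60, 120, 45],
        "min_players": [2, 1, 2],
        "max_players": [4, 4, 6],
        "minimum_age": [8, 12, 10],
        "complexity": [2.0, 3.5, 1.5],
        "avg_rating": [7.5, 8.2, 7.9],
    }
)

picks = recommend(sample, playtime=45, players=3, youngest_age=10, max_complexity=2.5)
print(picks[["boardgame", "avg_rating"]])
```

Combining the conditions into one boolean mask keeps each criterion on its own line, so a filter can be relaxed or removed without touching the others.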

### **`code_for_graphs.ipynb`**

Exploratory Data Analysis (EDA) with visualizations:

* Top 10 most played games

* Top 10 trending/wishlisted games

* Top 10 rated games

* Correlation matrix
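
As a rough sketch of how the "top 10" tables and the correlation matrix could be computed with pandas: the helper names and sample data below are hypothetical, and because the columns that hold play counts and wishlist counts are not shown here, the column name is passed in as a parameter.

```python
import pandas as pd


def top_n(games, column, n=10):
    """Return the n rows with the largest values in the given column."""
    return games.nlargest(n, column)


def numeric_correlation(games):
    """Pairwise Pearson correlation between the numeric columns only."""
    return games.select_dtypes(include="number").corr()


# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame(
    {
        "boardgame": ["Game A", "Game B", "Game C"],
        "avg_rating": [7.5, 8.2, 7.9],
        "complexity": [2.0, 3.5, 1.5],
    }
)

best = top_n(sample, "avg_rating", n=2)
corr = numeric_correlation(sample)
print(best)
print(corr)
```

The resulting `corr` frame can be passed to a heatmap function such as `seaborn.heatmap` for plotting.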

### **`importing_csv_sql.ipynb`**

* Create and import tables based on the ER diagram

---

## **SQL Scripts**

* **boardsgames\_schema\_V2.sql**: Defines table creation, primary keys, and foreign key relationships

* Data imported using the SQL Data Import Wizard

---

## **App**

* **app.py**: Contains the Streamlit app with the recommendation engine

Run locally with:

`streamlit run app.py`

---

## **How to Run the Project**

**Clone the repository**

`git clone <repo_url>`

`cd <repo_name>`

**Create a virtual environment (recommended)**

`python -m venv venv`

`source venv/bin/activate # Mac/Linux`

`venv\Scripts\activate # Windows`

**Install dependencies**

`pip install -r requirements.txt`

**Run the Streamlit app**

`streamlit run app.py`

Open the **localhost URL** in your browser to interact with the app.

---

## **Contributors**

Julia • Ricardo • Mauricio • Priyanka

# Next steps
...
Binary file added anaconda_projects/db/project_filebrowser.db
Binary file not shown.
73 changes: 73 additions & 0 deletions app/app.py
@@ -0,0 +1,73 @@
import streamlit as st
import pandas as pd
import requests
import xml.etree.ElementTree as ET
import yaml

try:
    with open("../config.yaml", "r") as file:
        config = yaml.safe_load(file)
except FileNotFoundError:
    # Stop here: without the config, the dataset path below is unknown.
    st.error("YAML configuration file not found!")
    st.stop()

# --- Load your dataset ---
boardgames_df = pd.read_csv(config['output_data']['file'])

# --- Image Fetcher using BoardGameGeek XML API ---
def fetch_bgg_image(game_name):
    search_url = f"https://boardgamegeek.com/xmlapi2/search?query={game_name}&type=boardgame"
    try:
        search_response = requests.get(search_url)
        root = ET.fromstring(search_response.content)
        first_item = root.find("item")
        if first_item is not None:
            game_id = first_item.attrib["id"]
            thing_url = f"https://boardgamegeek.com/xmlapi2/thing?id={game_id}&stats=1"
            thing_response = requests.get(thing_url)
            thing_root = ET.fromstring(thing_response.content)
            image_tag = thing_root.find(".//image")
            if image_tag is not None:
                return image_tag.text
    except Exception as e:
        print(f"Error fetching image for {game_name}: {e}")
    return None

# --- Streamlit UI ---
st.title("🎲 Board Game Recommender")
st.write("Find the best board games for your session based on your preferences.")

# --- User Inputs ---
playtime = st.slider("⏱️ Desired playtime (minutes)", 0, 1200, 30)
number_players = st.slider("👥 Number of players", 1, 100, 2)
min_age = st.slider("🧒 Age of youngest player", 4, 18, 12)
difficulty_level = st.selectbox("🧠 Desired difficulty level", [1, 2, 3, 4], format_func=lambda x: ["Easy", "Medium", "Hard", "Very Hard"][x-1])
complexity = float(difficulty_level)

# --- Filter Logic ---
filtered_df = boardgames_df[
    (boardgames_df['min_playtime'] <= playtime) &
    (boardgames_df['max_playtime'] >= playtime) &
    (boardgames_df['min_players'] <= number_players) &
    (boardgames_df['max_players'] >= number_players) &
    (boardgames_df['minimum_age'] <= min_age) &
    (boardgames_df['complexity'] <= complexity + 1) &
    (boardgames_df['complexity'] >= complexity)
]

filtered_df_ranked = filtered_df.sort_values(by="avg_rating", ascending=False)

# --- Display Results ---
if filtered_df_ranked.empty:
    st.warning("⚠️ No games match all criteria. Try relaxing one or more inputs.")
else:
    st.subheader("🔥 Top 5 Matching Games")
    for _, row in filtered_df_ranked.head(5).iterrows():
        st.markdown(f"### {row['boardgame']}")
        st.write(f"⭐ Rating: {row['avg_rating']} | 👥 Players: {row['min_players']}–{row['max_players']} | ⏱️ Playtime: {row['min_playtime']}–{row['max_playtime']} mins | 🧠 Complexity: {row['complexity']}")
        img_url = fetch_bgg_image(row['boardgame'])
        if img_url:
            st.image(img_url, width=250)
        else:
            st.info("No image found.")
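
A note on the image lookup in `fetch_bgg_image`: the game name is interpolated into the search URL as-is, so names that contain spaces, `&`, or `:` may produce a malformed query. The sketch below shows one possible way to encode the query and to keep XML parsing separate from the network call so it can be checked offline. The helper names and the sample response are hypothetical, and only the search endpoint already used in `app.py` is assumed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://boardgamegeek.com/xmlapi2/search"


def build_search_url(game_name):
    """Build the search URL with the game name safely percent-encoded."""
    query = urlencode({"query": game_name, "type": "boardgame"})
    return f"{SEARCH_ENDPOINT}?{query}"


def first_item_id(search_xml):
    """Return the id attribute of the first <item>, or None if there is none."""
    root = ET.fromstring(search_xml)
    item = root.find("item")
    return item.attrib.get("id") if item is not None else None


# Hypothetical response body, shaped like the XML that app.py parses.
SAMPLE_XML = (
    '<items total="1">'
    '<item type="boardgame" id="13"><name type="primary" value="Catan"/></item>'
    "</items>"
)

print(build_search_url("Ticket to Ride: Europe"))
print(first_item_id(SAMPLE_XML))
```

With `requests`, passing `params={"query": game_name, "type": "boardgame"}` to `requests.get` performs the same encoding.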

5 changes: 3 additions & 2 deletions config.yaml
@@ -1,5 +1,6 @@
input_data:
  file: "../data/raw/raw_data_file.csv"
  file: "../data/raw/boardgames_df_raw.csv"

output_data:
  file: "../data/clean/cleaned_data_file.csv"
  file: "../data/clean/boardgames_df_cleaned.csv"
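
Both paths in `config.yaml` begin with `../`, and `app.py` opens `"../config.yaml"`, so they presumably resolve against whatever directory the app or notebook is launched from. A minimal sketch of anchoring such a path to a known base directory instead, assuming the paths are meant relative to a folder one level below the repository root (such as `app/`); the `resolve_data_path` helper and the `/repo` prefix are hypothetical:

```python
import os
from pathlib import Path


def resolve_data_path(base_dir, relative_path):
    """Join a config-relative path onto base_dir and collapse any '..' parts."""
    return Path(os.path.normpath(Path(base_dir) / relative_path))


# In app.py the base could be the script's own folder:
#   base_dir = Path(__file__).resolve().parent
example = resolve_data_path("/repo/app", "../data/clean/boardgames_df_cleaned.csv")
print(example)
```

This would let `streamlit run app/app.py` work from the repository root as well as from inside `app/`.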
