isg75 · maypri20 · Sep 22, 2025
diff --git a/README.md b/README.md
@@ -1,77 +1,150 @@
-# Project overview
-...
+# **🎲 Board Games Analysis & Recommendation Engine**
 
-# Installation
+## **Project Overview**
 
-1. **Clone the repository**:
+This project analyzes a **board games dataset** to uncover trends and build useful tools for players and enthusiasts.
 
-```bash
-git clone https://github.com/YourUsername/repository_name.git
-```
+* The data was **cleaned, organized, and stored in a SQL database** for efficient querying.
 
-2. **Install UV**
+* A **filtering-based recommendation engine** (Python \+ Streamlit) helps users discover games based on:
 
-If you're a MacOS/Linux user type:
+  * Ratings
 
-```bash
-curl -LsSf https://astral.sh/uv/install.sh | sh
-```
+  * Playtime
 
-If you're a Windows user open an Anaconda Powershell Prompt and type :
+  * Number of players
 
-```bash
-powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
-```
+  * Complexity
 
-3. **Create an environment**
+* Visual charts highlight:
 
-```bash
-uv venv 
-```
+  * Top-rated games
 
-3. **Activate the environment**
+  * Most-played games
 
-If you're a MacOS/Linux user type (if you're using a bash shell):
+  * Most-wishlisted games
 
-```bash
-source ./venv/bin/activate
-```
+Overall, this project combines **data engineering, analysis, and visualization** to deliver practical insights and personalized recommendations.
 
-If you're a MacOS/Linux user type (if you're using a csh/tcsh shell):
+📑 **Presentation**: [Board Game Database](https://docs.google.com/presentation/d/1pBzKzOzYKCxy7n_QEZRLTVhZnCo_pdmCDucYisod0L8/edit?usp=sharing)
 
-```bash
-source ./venv/bin/activate.csh
-```
+---
 
-If you're a Windows user type:
+## **Data Sources**
 
-```bash
-.\venv\Scripts\activate
-```
+* **Board Game CSV file**: Dataset with \~2000 rows
 
-4. **Install dependencies**:
+* **YAML configuration**: Used to dynamically manage file paths and input/output references
 
-```bash
-uv pip install -r requirements.txt
-```
+---
 
-# Questions 
-...
+## **Figures**
 
-# Dataset 
-...
+* **ER Diagram**: A visual representation of the database's logical structure
 
-## Main dataset issues
+---
 
-- ...
-- ...
-- ...
+## **Libraries Used**
 
-## Solutions for the dataset issues
-...
+* pandas
 
-# Conclussions
-...
+* streamlit
+
+* seaborn
+
+* matplotlib
+
+* PyYAML
+
+---
+
+## **Jupyter Notebooks**
+
+### **`code_recommendation_engine.ipynb`**
+
+* Load the database
+
+* Drop irrelevant columns and clean the `age` column
+
+* Export cleaned database to CSV
+
+* Build recommendation engine based on:
+
+  * Minimum age
+
+  * Number of players
+
+  * Level of complexity
+
+  * Playing time
+
+* Order results by average rating
+
+### **`code_for_graphs.ipynb`**
+
+Exploratory Data Analysis (EDA) with visualizations:
+
+* Top 10 most played games
+
+* Top 10 trending/wishlisted games
+
+* Top 10 rated games
+
+* Correlation matrix
+
+### **`importing_csv_sql.ipynb`**
+
+* Create and import tables based on the ER diagram
+
+---
+
+## **SQL Scripts**
+
+* **boardsgames\_schema\_V2.sql**: Defines table creation, primary keys, and foreign key relationships
+
+* Data imported using the SQL Data Import Wizard
+
+---
+
+## **App**
+
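+* **app/app.py**: Contains the Streamlit app with the recommendation engine
+
+Run locally from inside the `app/` directory, because the app opens `../config.yaml` relative to the working directory:
+
+`streamlit run app.py`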
+
+---
+
+## **How to Run the Project**
+
+**Clone the repository**
+
+ `git clone <repo_url>`
+
+`cd <repo_name>`
+
+**Create a virtual environment (recommended)**
+
+ `python -m venv venv`
+
+`source venv/bin/activate   # Mac/Linux`
+
+`venv\Scripts\activate      # Windows`
+
+**Install dependencies**
+
+ `pip install -r requirements.txt`
+
+**Run the Streamlit app (from inside the `app/` directory)**
+
+ `streamlit run app.py`
+
+Open the **localhost URL** in your browser to interact with the app.
+
+---
+
+## **Contributors**
+
+Julia • Ricardo • Mauricio • Priyanka
 
-# Next steps
-...
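The recommendation rule that the README describes (filter on age, players, complexity, and playing time, then order by average rating) can be sketched without any dependencies. This is only an illustration, not the notebook's code: the field names mirror the columns used in `app/app.py`, while the `Game` class, the `recommend` function, and the sample rows are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    # Field names mirror the columns used in app/app.py.
    boardgame: str
    min_players: int
    max_players: int
    min_playtime: int
    max_playtime: int
    minimum_age: int
    complexity: float
    avg_rating: float


def recommend(games, playtime, players, youngest_age, level, top_n=5):
    """Return up to top_n games matching every criterion, best rated first."""
    matches = [
        g for g in games
        if g.min_playtime <= playtime <= g.max_playtime
        and g.min_players <= players <= g.max_players
        and g.minimum_age <= youngest_age
        and level <= g.complexity <= level + 1
    ]
    return sorted(matches, key=lambda g: g.avg_rating, reverse=True)[:top_n]


# Hypothetical sample rows, invented for illustration only.
SAMPLE = [
    Game("Alpha", 2, 4, 30, 60, 8, 1.5, 7.1),
    Game("Beta", 1, 5, 45, 90, 10, 1.9, 8.0),
    Game("Gamma", 3, 6, 60, 120, 14, 3.2, 8.4),
]

picks = recommend(SAMPLE, playtime=45, players=3, youngest_age=12, level=1)
print([g.boardgame for g in picks])
```

Keeping the rule in a plain function like this makes it easy to unit-test separately from the Streamlit widgets.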
diff --git a/anaconda_projects/db/project_filebrowser.db b/anaconda_projects/db/project_filebrowser.db
diff --git a/app/app.py b/app/app.py
@@ -0,0 +1,73 @@
+import streamlit as st
+import pandas as pd
+import requests
+import xml.etree.ElementTree as ET
+import yaml
+
+try:
+    with open("../config.yaml", "r") as file:
+        config = yaml.safe_load(file)
+except FileNotFoundError as err:
+    raise RuntimeError("YAML configuration file '../config.yaml' not found!") from err
+
+# --- Load your dataset ---
+boardgames_df = pd.read_csv(config['output_data']['file'])
+
+# --- Image Fetcher using BoardGameGeek XML API ---
+def fetch_bgg_image(game_name):
+    search_url = "https://boardgamegeek.com/xmlapi2/search"
+    try:
+        search_response = requests.get(search_url, params={"query": game_name, "type": "boardgame"}, timeout=10)
+        root = ET.fromstring(search_response.content)
+        first_item = root.find("item")
+        if first_item is not None:
+            game_id = first_item.attrib["id"]
+            thing_url = f"https://boardgamegeek.com/xmlapi2/thing?id={game_id}&stats=1"
+            thing_response = requests.get(thing_url, timeout=10)
+            thing_root = ET.fromstring(thing_response.content)
+            image_tag = thing_root.find(".//image")
+            if image_tag is not None:
+                return image_tag.text
+    except Exception as e:
+        print(f"Error fetching image for {game_name}: {e}")
+    return None
+
+# --- Streamlit UI ---
+st.title("🎲 Board Game Recommender")
+st.write("Find the best board games for your session based on your preferences.")
+
+# --- User Inputs ---
+playtime = st.slider("⏱️ Desired playtime (minutes)", 0, 1200, 30)
+number_players = st.slider("👥 Number of players", 1, 100, 2)
+# Age range 4-18, default 12; games whose minimum age exceeds this value are filtered out.
+min_age = st.slider("🧒 Age of youngest player", 4, 18, 12)
+difficulty_level = st.selectbox("🧠 Desired difficulty level", [1, 2, 3, 4], format_func=lambda x: ["Easy", "Medium", "Hard", "Very Hard"][x-1])
+complexity = float(difficulty_level)
+
+# --- Filter Logic ---
+filtered_df = boardgames_df[
+    (boardgames_df['min_playtime'] <= playtime) &
+    (boardgames_df['max_playtime'] >= playtime) &
+    (boardgames_df['min_players'] <= number_players) &
+    (boardgames_df['max_players'] >= number_players) &
+    (boardgames_df['minimum_age'] <= min_age) &
+    (boardgames_df['complexity'] <= complexity + 1) &
+    (boardgames_df['complexity'] >= complexity)
+]
+
+filtered_df_ranked = filtered_df.sort_values(by="avg_rating", ascending=False)
+
+# --- Display Results ---
+if filtered_df_ranked.empty:
+    st.warning("⚠️ No games match all criteria. Try relaxing one or more inputs.")
+else:
+    st.subheader("🔥 Top 5 Matching Games")
+    for _, row in filtered_df_ranked.head(5).iterrows():
+        st.markdown(f"### {row['boardgame']}")
+        st.write(f"⭐ Rating: {row['avg_rating']} | 👥 Players: {row['min_players']}–{row['max_players']} | ⏱️ Playtime: {row['min_playtime']}–{row['max_playtime']} mins | 🧠 Complexity: {row['complexity']}")
+        img_url = fetch_bgg_image(row['boardgame'])
+        if img_url:
+            st.image(img_url, width=250)
+        else:
+            st.info("No image found.")
+
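Two parts of `fetch_bgg_image` can be checked without network access: building the search URL and pulling the first item's id out of the response. The sketch below uses only the standard library. The helper names `build_search_url` and `first_game_id` are hypothetical, and `SAMPLE_XML` is an invented payload shaped like a search result, not real API output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://boardgamegeek.com/xmlapi2/search"


def build_search_url(game_name):
    """Build the search URL with the game name safely percent-encoded."""
    query = urlencode({"query": game_name, "type": "boardgame"})
    return f"{SEARCH_ENDPOINT}?{query}"


def first_game_id(search_xml):
    """Return the id of the first <item> in a search response, or None."""
    first_item = ET.fromstring(search_xml).find("item")
    return None if first_item is None else first_item.get("id")


# Illustrative payload shaped like a search response; not real API output.
SAMPLE_XML = """<items total="1">
  <item type="boardgame" id="123">
    <name type="primary" value="Example Game"/>
  </item>
</items>"""

print(build_search_url("Ticket to Ride: Europe"))
print(first_game_id(SAMPLE_XML))
print(first_game_id('<items total="0"/>'))
```

Encoding the query matters because game titles often contain spaces, colons, or ampersands, which would otherwise produce a malformed URL.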
diff --git a/config.yaml b/config.yaml
@@ -1,5 +1,6 @@
 input_data:
-  file: "../data/raw/raw_data_file.csv"
+  file: "../data/raw/boardgames_df_raw.csv"
 
 output_data:
-  file: "../data/clean/cleaned_data_file.csv"
+  file: "../data/clean/boardgames_df_cleaned.csv"
+
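Both paths in `config.yaml` start with `../`, so they resolve against the current working directory. They therefore only point at `data/` when the code is launched from a folder one level below the repository root. One possible way to make this independent of the launch directory is to resolve the paths against a known base folder. This is a sketch under that assumption: `resolve_from` and the `/repo/app` location are hypothetical.

```python
import os
from pathlib import Path


def resolve_from(base_dir, relative_path):
    """Join relative_path onto base_dir and collapse any '..' segments."""
    return Path(os.path.normpath(os.path.join(base_dir, relative_path)))


# In app/app.py the base could be Path(__file__).resolve().parent;
# "/repo/app" is a hypothetical location used only for illustration.
cleaned_csv = resolve_from("/repo/app", "../data/clean/boardgames_df_cleaned.csv")
print(cleaned_csv)
```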