diff --git a/02_activities/assignments/Cohort_8/Assignment2.md b/02_activities/assignments/Cohort_8/Assignment2.md
index 47118b2ba..a0c981f2b 100644
--- a/02_activities/assignments/Cohort_8/Assignment2.md
+++ b/02_activities/assignments/Cohort_8/Assignment2.md
@@ -45,16 +45,36 @@ There are several tools online you can use, I'd recommend [Draw.io](https://www.
**HINT:** You do not need to create any data for this prompt. This is a logical model (ERD) only.
+
+
+
+
+
#### Prompt 2
We want to create employee shifts, splitting up the day into morning and evening. Add this to the ERD.
+
+
+See image-prompt1-2.png (the ERD covers Prompts 1 and 2).
+
+
+
+
#### Prompt 3
The store wants to keep customer addresses. Propose two architectures for the CUSTOMER_ADDRESS table, one that will retain changes, and another that will overwrite. Which is type 1, which is type 2?
**HINT:** search type 1 vs type 2 slowly changing dimensions.
```
-Your answer...
+
+
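+image-prompt3.png
+Type 1 overwrites the existing address row, so no history is kept.
+Type 2 retains changes by adding a new row for each address change.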
+
+
+
+
```
***
@@ -88,6 +108,15 @@ Find the NULLs and then using COALESCE, replace the NULL with a blank for the fi
-
+Answer
+
+SELECT
+product_name || ', ' || COALESCE(product_size, '') || ' (' || COALESCE(product_qty_type, 'unit') || ')'
+FROM product;
+
+
+
+
#### Windowed Functions
1. Write a query that selects from the customer_purchases table and numbers each customer’s visits to the farmer’s market (labeling each market date with a different number). Each customer’s first visit is labeled 1, second visit is labeled 2, etc.
@@ -95,12 +124,56 @@ You can either display all rows in the customer_purchases table, with the counte
**HINT**: One of these approaches uses ROW_NUMBER() and one uses DENSE_RANK().
+
+answer
+
+SELECT
+    customer_id,
+    market_date,
+    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY market_date) AS visit_number
+FROM customer_purchases;
+
+
+
2. Reverse the numbering of the query from a part so each customer’s most recent visit is labeled 1, then write another query that uses this one as a subquery (or temp table) and filters the results to only the customer’s most recent visit.
+answer
+
+SELECT
+    customer_id,
+    market_date,
+    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY market_date DESC) AS visit_number
+FROM customer_purchases;
+
+
+query2
+SELECT *
+FROM (
+    SELECT
+        customer_id,
+        market_date,
+        DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY market_date DESC) AS visit_number
+    FROM customer_purchases
+) AS ranked_visits
+WHERE visit_number = 1;
+
+
+
+
3. Using a COUNT() window function, include a value along with each row of the customer_purchases table that indicates how many different times that customer has purchased that product_id.
-
+
+answer
+
+SELECT
+ customer_id,
+ product_id,
+ market_date,
+ COUNT(*) OVER (PARTITION BY customer_id, product_id) AS product_purchase_count
+FROM customer_purchases;
+
#### String manipulations
1. Some product names in the product table have descriptions like "Jar" or "Organic". These are separated from the product name with a hyphen. Create a column using SUBSTR (and a couple of other commands) that captures these, but is otherwise NULL. Remove any trailing or leading whitespaces. Don't just use a case statement for each product!
@@ -110,10 +183,28 @@ You can either display all rows in the customer_purchases table, with the counte
**HINT**: you might need to use INSTR(product_name,'-') to find the hyphens. INSTR will help split the column.
+
+answer
+
+SELECT
+ product_name,
+ CASE
+ WHEN INSTR(product_name, '-') > 0 THEN TRIM(SUBSTR(product_name, INSTR(product_name, '-') + 1))
+ ELSE NULL
+ END AS description
+FROM product;
+
2. Filter the query to show any product_size value that contain a number with REGEXP.
-
+answer
+
+SELECT *
+FROM product
+WHERE product_size REGEXP '[0-9]';
+
+
#### UNION
1. Using a UNION, write a query that displays the market dates with the highest and lowest total sales.
@@ -121,6 +212,35 @@ You can either display all rows in the customer_purchases table, with the counte
***
+
+answer
+
+
+WITH date_sales AS (
+ SELECT
+ market_date,
+ SUM(quantity * cost_to_customer_per_qty) AS total_sales
+ FROM customer_purchases
+ GROUP BY market_date
+),
+ranked_dates AS (
+ SELECT
+ market_date,
+ total_sales,
+ RANK() OVER (ORDER BY total_sales DESC) AS best_rank,
+ RANK() OVER (ORDER BY total_sales ASC) AS worst_rank
+ FROM date_sales
+)
+SELECT market_date, total_sales, 'Best Day' AS category
+FROM ranked_dates
+WHERE best_rank = 1
+
+UNION
+
+SELECT market_date, total_sales, 'Worst Day' AS category
+FROM ranked_dates
+WHERE worst_rank = 1;
+
## Section 3:
You can start this section following *session 5*.
@@ -139,11 +259,71 @@ Steps to complete this part of the assignment:
-
+
+answer
+
+
+WITH vendor_products AS (
+    SELECT DISTINCT
+        vi.vendor_id,
+        vi.product_id,
+        vi.original_price,
+        v.vendor_name,
+        p.product_name
+    FROM vendor_inventory vi
+    JOIN vendor v ON vi.vendor_id = v.vendor_id
+    JOIN product p ON vi.product_id = p.product_id
+),
+all_customers AS (
+    SELECT customer_id FROM customer
+)
+SELECT
+    vp.vendor_name,
+    vp.product_name,
+    SUM(5 * vp.original_price) AS total_revenue
+FROM vendor_products vp
+CROSS JOIN all_customers ac
+GROUP BY vp.vendor_name, vp.product_name
+ORDER BY vp.vendor_name, vp.product_name;
+
#### INSERT
1. Create a new table "product_units". This table will contain only products where the `product_qty_type = 'unit'`. It should use all of the columns from the product table, as well as a new column for the `CURRENT_TIMESTAMP`. Name the timestamp column `snapshot_timestamp`.
+answer
+
+
+
+CREATE TABLE product_units AS
+SELECT
+ p.*,
+ CURRENT_TIMESTAMP AS snapshot_timestamp
+FROM product p
+WHERE product_qty_type = 'unit';
+
+
2. Using `INSERT`, add a new row to the product_unit table (with an updated timestamp). This can be any product you desire (e.g. add another record for Apple Pie).
+
+answer
+
+
+INSERT INTO product_units (
+ product_id,
+ product_name,
+ product_size,
+ product_category_id,
+ product_qty_type,
+ snapshot_timestamp
+)
+VALUES (
+    999,
+    'Apple Pie',
+    '10 inch',
+    7,
+    'unit',
+    CURRENT_TIMESTAMP
+);
+
-
#### DELETE
@@ -152,6 +332,10 @@ Steps to complete this part of the assignment:
**HINT**: If you don't specify a WHERE clause, [you are going to have a bad time](https://imgflip.com/i/8iq872).
-
+answer
+
+DELETE FROM product_units
+WHERE product_id = 999;
#### UPDATE
1. We want to add the current_quantity to the product_units table. First, add a new column, `current_quantity` to the table using the following syntax.
@@ -163,3 +347,29 @@ ADD current_quantity INT;
Then, using `UPDATE`, change the current_quantity equal to the **last** `quantity` value from the vendor_inventory details.
**HINT**: This one is pretty hard. First, determine how to get the "last" quantity per product. Second, coalesce null values to 0 (if you don't have null values, figure out how to rearrange your query so you do.) Third, `SET current_quantity = (...your select statement...)`, remembering that WHERE can only accommodate one column. Finally, make sure you have a WHERE statement to update the right row, you'll need to use `product_units.product_id` to refer to the correct row within the product_units table. When you have all of these components, you can run the update statement.
+
+
+
+ALTER TABLE product_units
+ADD current_quantity INT;
+
+SELECT product_id,
+       COALESCE(quantity, 0) AS last_quantity
+FROM (
+    SELECT product_id,
+           quantity,
+           ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY market_date DESC) AS rn
+    FROM vendor_inventory
+) t
+WHERE rn = 1;
+
+UPDATE product_units
+SET current_quantity = COALESCE((
+    SELECT vi.quantity
+    FROM vendor_inventory vi
+    WHERE vi.product_id = product_units.product_id
+    ORDER BY vi.market_date DESC
+    LIMIT 1
+), 0);
+
+
diff --git a/02_activities/assignments/Cohort_8/Baruni Prabaharan - Assignment 1 Logical Diagram.xlsx b/02_activities/assignments/Cohort_8/Baruni Prabaharan - Assignment 1 Logical Diagram.xlsx
new file mode 100644
index 000000000..c19ec190c
Binary files /dev/null and b/02_activities/assignments/Cohort_8/Baruni Prabaharan - Assignment 1 Logical Diagram.xlsx differ
diff --git a/02_activities/assignments/Cohort_8/Baruni Prabaharan Assignment 1.db b/02_activities/assignments/Cohort_8/Baruni Prabaharan Assignment 1.db
new file mode 100644
index 000000000..bb787a662
Binary files /dev/null and b/02_activities/assignments/Cohort_8/Baruni Prabaharan Assignment 1.db differ
diff --git a/02_activities/assignments/Cohort_8/Baruni Prabaharan Assignment 1.sqbpro b/02_activities/assignments/Cohort_8/Baruni Prabaharan Assignment 1.sqbpro
new file mode 100644
index 000000000..976ebc152
--- /dev/null
+++ b/02_activities/assignments/Cohort_8/Baruni Prabaharan Assignment 1.sqbpro
@@ -0,0 +1,4 @@
+SELECT *
+FROM customer_purchases
+WHERE product_id IN (4, 9);
+
diff --git a/02_activities/assignments/Cohort_8/assignment1.sql b/02_activities/assignments/Cohort_8/assignment1.sql
index c992e3205..7c65a7792 100644
--- a/02_activities/assignments/Cohort_8/assignment1.sql
+++ b/02_activities/assignments/Cohort_8/assignment1.sql
@@ -4,17 +4,22 @@
--SELECT
/* 1. Write a query that returns everything in the customer table. */
-
+SELECT * FROM customer;
/* 2. Write a query that displays all of the columns and 10 rows from the cus- tomer table,
sorted by customer_last_name, then customer_first_ name. */
+SELECT * FROM customer
+ORDER BY customer_last_name, customer_first_name
+LIMIT 10;
--WHERE
/* 1. Write a query that returns all customer purchases of product IDs 4 and 9. */
-
+SELECT *
+FROM customer_purchases
+WHERE product_id IN (4, 9);
/*2. Write a query that returns all customer purchases and a new calculated column 'price' (quantity * cost_to_customer_per_qty),
@@ -23,29 +28,65 @@ filtered by customer IDs between 8 and 10 (inclusive) using either:
2. one condition using BETWEEN
*/
-- option 1
+SELECT *, (quantity * cost_to_customer_per_qty) AS price
+FROM customer_purchases
+WHERE customer_id >= 8 AND customer_id <= 10;
--- option 2
+-- option 2
+SELECT *, (quantity * cost_to_customer_per_qty) AS price
+FROM customer_purchases
+WHERE customer_id BETWEEN 8 AND 10;
--CASE
/* 1. Products can be sold by the individual unit or by bulk measures like lbs. or oz.
Using the product table, write a query that outputs the product_id and product_name
columns and add a column called prod_qty_type_condensed that displays the word “unit”
if the product_qty_type is “unit,” and otherwise displays the word “bulk.” */
-
+SELECT
+product_id,
+product_name,
+CASE
+WHEN product_qty_type = 'unit' THEN 'unit'
+ELSE 'bulk'
+END AS prod_qty_type_condensed
+FROM product;
/* 2. We want to flag all of the different types of pepper products that are sold at the market.
add a column to the previous query called pepper_flag that outputs a 1 if the product_name
contains the word “pepper” (regardless of capitalization), and otherwise outputs 0. */
+SELECT
+product_id,
+product_name,
+CASE
+WHEN product_qty_type = 'unit' THEN 'unit'
+ELSE 'bulk'
+END AS prod_qty_type_condensed,
+CASE
+WHEN LOWER(product_name) LIKE '%pepper%' THEN 1
+ELSE 0
+END AS pepper_flag
+FROM product;
+
--JOIN
/* 1. Write a query that INNER JOINs the vendor table to the vendor_booth_assignments table on the
vendor_id field they both have in common, and sorts the result by vendor_name, then market_date. */
+SELECT
+v.vendor_id,
+v.vendor_name,
+vba.market_date,
+vba.booth_number
+FROM vendor AS v
+INNER JOIN vendor_booth_assignments AS vba
+ON v.vendor_id = vba.vendor_id
+ORDER BY v.vendor_name, vba.market_date;
+
@@ -55,6 +96,13 @@ vendor_id field they both have in common, and sorts the result by vendor_name, t
-- AGGREGATE
/* 1. Write a query that determines how many times each vendor has rented a booth
at the farmer’s market by counting the vendor booth assignments per vendor_id. */
+SELECT
+vendor_id,
+COUNT(*) AS booth_rental_count
+FROM vendor_booth_assignments
+GROUP BY vendor_id
+ORDER BY booth_rental_count DESC;
+
@@ -63,7 +111,17 @@ sticker to everyone who has ever spent more than $2000 at the market. Write a qu
of customers for them to give stickers to, sorted by last name, then first name.
HINT: This query requires you to join two tables, use an aggregate function, and use the HAVING keyword. */
-
+SELECT
+c.customer_id,
+c.customer_first_name,
+c.customer_last_name,
+SUM(cp.quantity * cp.cost_to_customer_per_qty) AS total_spent
+FROM customer_purchases AS cp
+INNER JOIN customer AS c
+ON cp.customer_id = c.customer_id
+GROUP BY c.customer_id, c.customer_first_name, c.customer_last_name
+HAVING SUM(cp.quantity * cp.cost_to_customer_per_qty) > 2000
+ORDER BY c.customer_last_name, c.customer_first_name;
--Temp Table
@@ -77,6 +135,13 @@ When inserting the new vendor, you need to appropriately align the columns to be
-> To insert the new row use VALUES, specifying the value you want for each column:
VALUES(col1,col2,col3,col4,col5)
*/
+CREATE TABLE temp.new_vendor AS
+SELECT *
+FROM vendor;
+INSERT INTO temp.new_vendor (vendor_id, vendor_name, vendor_type, vendor_owner_first_name, vendor_owner_last_name)
+VALUES (10, 'Thomass Superfood Store', 'Fresh Focused', 'Thomas', 'Rosenthal');
+
+
@@ -85,7 +150,11 @@ VALUES(col1,col2,col3,col4,col5)
HINT: you might need to search for strfrtime modifers sqlite on the web to know what the modifers for month
and year are! */
-
+SELECT
+customer_id,
+strftime('%m', market_date) AS month,
+strftime('%Y', market_date) AS year
+FROM customer_purchases;
/* 2. Using the previous query as a base, determine how much money each customer spent in April 2022.
@@ -94,3 +163,11 @@ Remember that money spent is quantity*cost_to_customer_per_qty.
HINTS: you will need to AGGREGATE, GROUP BY, and filter...
but remember, STRFTIME returns a STRING for your WHERE statement!! */
+SELECT
+customer_id,
+SUM(quantity * cost_to_customer_per_qty) AS total_spent
+FROM customer_purchases
+WHERE strftime('%m', market_date) = '04'
+AND strftime('%Y', market_date) = '2022'
+GROUP BY customer_id;
+
diff --git a/02_activities/assignments/Cohort_8/assignment2.sql b/02_activities/assignments/Cohort_8/assignment2.sql
index c2743d3b7..0594253b8 100644
--- a/02_activities/assignments/Cohort_8/assignment2.sql
+++ b/02_activities/assignments/Cohort_8/assignment2.sql
@@ -21,6 +21,11 @@ The `||` values concatenate the columns into strings.
Edit the appropriate columns -- you're making two edits -- and the NULL rows will be fixed.
All the other rows will remain the same. */
+-- Answer
+
+SELECT
+product_name || ', ' || COALESCE(product_size, '') || ' (' || COALESCE(product_qty_type, 'unit') || ')'
+FROM product;
@@ -35,17 +40,46 @@ each new market date for each customer, or select only the unique market dates p
HINT: One of these approaches uses ROW_NUMBER() and one uses DENSE_RANK(). */
+SELECT
+    customer_id,
+    market_date,
+    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY market_date) AS visit_number
+FROM customer_purchases;
/* 2. Reverse the numbering of the query from a part so each customer’s most recent visit is labeled 1,
then write another query that uses this one as a subquery (or temp table) and filters the results to
only the customer’s most recent visit. */
+SELECT
+    customer_id,
+    market_date,
+    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY market_date DESC) AS visit_number
+FROM customer_purchases;
+
+
+-- query2
+SELECT *
+FROM (
+    SELECT
+        customer_id,
+        market_date,
+        DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY market_date DESC) AS visit_number
+    FROM customer_purchases
+) AS ranked_visits
+WHERE visit_number = 1;
+
+
/* 3. Using a COUNT() window function, include a value along with each row of the
customer_purchases table that indicates how many different times that customer has purchased that product_id. */
-
+SELECT
+ customer_id,
+ product_id,
+ market_date,
+ COUNT(*) OVER (PARTITION BY customer_id, product_id) AS product_purchase_count
+FROM customer_purchases;
-- String manipulations
/* 1. Some product names in the product table have descriptions like "Jar" or "Organic".
@@ -60,9 +94,21 @@ Remove any trailing or leading whitespaces. Don't just use a case statement for
Hint: you might need to use INSTR(product_name,'-') to find the hyphens. INSTR will help split the column. */
+SELECT
+ product_name,
+ CASE
+ WHEN INSTR(product_name, '-') > 0 THEN TRIM(SUBSTR(product_name, INSTR(product_name, '-') + 1))
+ ELSE NULL
+ END AS description
+FROM product;
+
+
/* 2. Filter the query to show any product_size value that contain a number with REGEXP. */
+SELECT *
+FROM product
+WHERE product_size REGEXP '[0-9]';
-- UNION
@@ -77,6 +123,31 @@ with a UNION binding them. */
+WITH date_sales AS (
+ SELECT
+ market_date,
+ SUM(quantity * cost_to_customer_per_qty) AS total_sales
+ FROM customer_purchases
+ GROUP BY market_date
+),
+ranked_dates AS (
+ SELECT
+ market_date,
+ total_sales,
+ RANK() OVER (ORDER BY total_sales DESC) AS best_rank,
+ RANK() OVER (ORDER BY total_sales ASC) AS worst_rank
+ FROM date_sales
+)
+SELECT market_date, total_sales, 'Best Day' AS category
+FROM ranked_dates
+WHERE best_rank = 1
+
+UNION
+
+SELECT market_date, total_sales, 'Worst Day' AS category
+FROM ranked_dates
+WHERE worst_rank = 1;
+
/* SECTION 3 */
@@ -91,7 +162,28 @@ Think a bit about the row counts: how many distinct vendors, product names are t
How many customers are there (y).
Before your final group by you should have the product of those two queries (x*y). */
-
+WITH vendor_products AS (
+    SELECT DISTINCT
+        vi.vendor_id,
+        vi.product_id,
+        vi.original_price,
+        v.vendor_name,
+        p.product_name
+    FROM vendor_inventory vi
+    JOIN vendor v ON vi.vendor_id = v.vendor_id
+    JOIN product p ON vi.product_id = p.product_id
+),
+all_customers AS (
+    SELECT customer_id FROM customer
+)
+SELECT
+    vp.vendor_name,
+    vp.product_name,
+    SUM(5 * vp.original_price) AS total_revenue
+FROM vendor_products vp
+CROSS JOIN all_customers ac
+GROUP BY vp.vendor_name, vp.product_name
+ORDER BY vp.vendor_name, vp.product_name;
-- INSERT
/*1. Create a new table "product_units".
@@ -99,17 +191,40 @@ This table will contain only products where the `product_qty_type = 'unit'`.
It should use all of the columns from the product table, as well as a new column for the `CURRENT_TIMESTAMP`.
Name the timestamp column `snapshot_timestamp`. */
-
+CREATE TABLE product_units AS
+SELECT
+ p.*,
+ CURRENT_TIMESTAMP AS snapshot_timestamp
+FROM product p
+WHERE product_qty_type = 'unit';
/*2. Using `INSERT`, add a new row to the product_units table (with an updated timestamp).
This can be any product you desire (e.g. add another record for Apple Pie). */
+INSERT INTO product_units (
+ product_id,
+ product_name,
+ product_size,
+ product_category_id,
+ product_qty_type,
+ snapshot_timestamp
+)
+VALUES (
+    999,
+    'Apple Pie',
+    '10 inch',
+    7,
+    'unit',
+    CURRENT_TIMESTAMP
+);
-- DELETE
/* 1. Delete the older record for the whatever product you added.
HINT: If you don't specify a WHERE clause, you are going to have a bad time.*/
+DELETE FROM product_units
+WHERE product_id = 999;
@@ -131,5 +246,28 @@ Finally, make sure you have a WHERE statement to update the right row,
When you have all of these components, you can run the update statement. */
+ALTER TABLE product_units
+ADD current_quantity INT;
+
+SELECT product_id,
+       COALESCE(quantity, 0) AS last_quantity
+FROM (
+    SELECT product_id,
+           quantity,
+           ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY market_date DESC) AS rn
+    FROM vendor_inventory
+) t
+WHERE rn = 1;
+
+UPDATE product_units
+SET current_quantity = COALESCE((
+    SELECT vi.quantity
+    FROM vendor_inventory vi
+    WHERE vi.product_id = product_units.product_id
+    ORDER BY vi.market_date DESC
+    LIMIT 1
+), 0);
+
+
diff --git a/02_activities/assignments/Cohort_8/image-prompt1-2.png b/02_activities/assignments/Cohort_8/image-prompt1-2.png
new file mode 100644
index 000000000..80d5fb7a6
Binary files /dev/null and b/02_activities/assignments/Cohort_8/image-prompt1-2.png differ
diff --git a/02_activities/assignments/Cohort_8/image-prompt3.png b/02_activities/assignments/Cohort_8/image-prompt3.png
new file mode 100644
index 000000000..6e2bf26dd
Binary files /dev/null and b/02_activities/assignments/Cohort_8/image-prompt3.png differ