This repository was archived by the owner on Apr 27, 2024. It is now read-only.

Commit fd619a4

Add files via upload
1 parent 21f3c67 commit fd619a4

37 files changed

+116866
-0
lines changed

AB Testing/AB test new menu.pdf

723 KB
Binary file not shown.

AB Testing/readme.md

+45
@@ -0,0 +1,45 @@
1+
## The Business Problem ☕ 🏪
2+
Round Roasters is an upscale coffee chain with locations in the western United States of America. The past few years have resulted in stagnant growth at the coffee chain, and a new management team was put in place to reignite growth at their stores.
3+
4+
The first major growth initiative is to introduce gourmet sandwiches to the menu, along with limited wine offerings. The new management team believes that a television advertising campaign is crucial to drive people into the stores with these new offerings.
5+
6+
However, the television campaign will require a significant boost in the company’s marketing budget, with an unknown return on investment (ROI). Additionally, there is concern that current customers will not buy into the new menu offerings.
7+
8+
To minimize risk, the management team decides to test the changes, supported by new television advertising, in two cities. Denver and Chicago were chosen because the stores in these two cities (or markets) perform similarly to stores across the entire chain, so performance in these two markets should be a good proxy for how well the updated menu will perform chain-wide.
9+
10+
The test ran for a period of 12 weeks (2016-April-29 to 2016-July-21) where five stores in each of the test markets offered the updated menu along with television advertising.
11+
12+
The comparative period is the same date range one year earlier (2015-April-29 to 2015-July-21).
13+
14+
You’ve been asked to analyze the results of the experiment to determine whether the menu changes should be applied to all stores. The predicted impact on profitability should be enough to justify the increased marketing budget: profit growth in the test stores, measured against the comparative period, should exceed profit growth in the control stores by at least 18%. This difference is known as incremental lift. In the data, profit is represented by the gross_margin variable.
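One way to read the incremental lift requirement is as treatment growth minus control growth, each measured from the comparative period to the test period. The sketch below follows that reading; the function names and all dollar figures are hypothetical, and only the 18% threshold and the `gross_margin` metric come from the brief.

```python
def growth(current: float, prior: float) -> float:
    """Fractional change from the comparative period to the test period."""
    return (current - prior) / prior


def incremental_lift(treat_test: float, treat_comp: float,
                     ctrl_test: float, ctrl_comp: float) -> float:
    """Treatment growth minus control growth, as a fraction."""
    return growth(treat_test, treat_comp) - growth(ctrl_test, ctrl_comp)


# Hypothetical gross_margin totals, for illustration only.
lift = incremental_lift(
    treat_test=130_000.0, treat_comp=100_000.0,  # treatment stores: +30%
    ctrl_test=105_000.0, ctrl_comp=100_000.0,    # control stores: +5%
)
meets_threshold = lift >= 0.18  # 18% threshold from the brief
```

With these made-up totals the treatment stores grow by 30% and the control stores by 5%, so the lift is about 25 percentage points and the threshold is met.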
15+
16+
You have been able to gather three data files to use for your analysis:
17+
18+
- Transaction data for all stores from 2015-January-21 to 2016-August-18
19+
- A listing of all Round Roasters stores
20+
- A listing of the 10 stores (5 in each market) that were used as test markets.
21+
22+
##### Plan Your Analysis
23+
To perform the correct analysis, you will need to prepare a data set. Before preparing the data, plan what the data set must contain. Here are a few questions to get you started:
24+
25+
- What is the performance metric you’ll use to evaluate the results of your test?
26+
- What is the test period?
27+
- At what level (day, week, month, etc.) should the data be aggregated?
28+
29+
##### Clean Up Your Data
30+
In this step, you should prepare the data for steps 3 and 4. You should aggregate the transaction data to the appropriate level and filter on the appropriate date ranges. You can assume that there is no missing, incomplete, duplicate, or dirty data. You’re ready to move on to the next step when you have weekly transaction data for all stores.
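As a rough illustration of the weekly roll-up, the sketch below sums `gross_margin` per store per week, counting weeks in 7-day blocks from the start of the test period. The row layout `(store_id, day, gross_margin)` and the sample rows are assumptions, not the actual file schema; the dates come from the brief.

```python
from collections import defaultdict
from datetime import date

TEST_START = date(2016, 4, 29)  # test period start, from the brief
TEST_END = date(2016, 7, 21)    # test period end, from the brief


def week_index(day: date, start: date = TEST_START) -> int:
    """1-based week number, counted in 7-day blocks from `start`."""
    return (day - start).days // 7 + 1


def weekly_gross_margin(rows, start: date = TEST_START, end: date = TEST_END):
    """Sum gross_margin per (store_id, week) for rows dated within [start, end]."""
    totals = defaultdict(float)
    for store_id, day, gross_margin in rows:
        if start <= day <= end:
            totals[(store_id, week_index(day, start))] += gross_margin
    return dict(totals)


# Hypothetical transaction rows: (store_id, day, gross_margin).
rows = [
    ("S1", date(2016, 4, 29), 10.0),
    ("S1", date(2016, 5, 5), 5.0),   # still inside week 1
    ("S1", date(2016, 5, 6), 7.0),   # first day of week 2
    ("S2", date(2016, 7, 21), 3.0),  # last day of the test period, week 12
    ("S2", date(2016, 8, 1), 99.0),  # outside the test period, dropped
]
weekly = weekly_gross_margin(rows)
```

The same function can be pointed at the comparative period by passing the 2015 start and end dates.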
31+
32+
##### Match Treatment and Control Units
33+
In this step, you should create the trend and seasonality variables, and use them along with your other control variable(s) to match two control units to each treatment unit. Treatment stores should be matched to control stores in the same region. Note: Calculate the number of transactions per store per week and use 12 periods to calculate trend and seasonality.
34+
35+
Apart from trend and seasonality...
36+
37+
- What control variables should be considered? Note: Only consider variables in the RoundRoastersStore file.
38+
- What is the correlation between each potential control variable and your performance metric? (Example of correlation matrix below)
39+
- What control variables will you use to match treatment and control stores?
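Once the control variables are chosen, the matching step can be sketched as a nearest-neighbor search within each region. The version below is a minimal illustration that assumes each store is already described by a numeric vector on a comparable scale (for example trend, seasonality, and one store attribute); the store IDs, region names, and values are hypothetical. It matches with replacement, so one control store may serve more than one treatment store.

```python
import math


def match_controls(treatment, candidates, n_matches=2):
    """For each treatment store, pick the n nearest candidates in the same region.

    Both arguments map store_id -> (region, feature_vector).
    Distance is Euclidean; ties are broken by store_id.
    """
    matches = {}
    for t_id, (t_region, t_vec) in treatment.items():
        same_region = [
            (math.dist(t_vec, c_vec), c_id)
            for c_id, (c_region, c_vec) in candidates.items()
            if c_region == t_region
        ]
        matches[t_id] = [c_id for _, c_id in sorted(same_region)[:n_matches]]
    return matches


# Hypothetical stores: (region, (trend, seasonality)).
treatment = {"T1": ("West", (0.0, 0.0))}
candidates = {
    "C1": ("West", (1.0, 0.0)),
    "C2": ("West", (0.2, 0.1)),
    "C3": ("West", (3.0, 3.0)),
    "C4": ("Central", (0.0, 0.0)),  # identical values, but wrong region
}
matched = match_controls(treatment, candidates)
```

Here `T1` is paired with `C2` and `C1`; `C4` is excluded despite matching exactly because it sits in a different region.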
40+
41+
42+
##### Analysis and Writeup
43+
Conduct your A/B analysis and create a short report outlining your results and recommendations.
44+
45+

AB Testing/round-roaster-stores.csv

+134
Large diffs are not rendered by default.
Binary file not shown.

Capstone/readme.md

+47
@@ -0,0 +1,47 @@
1+
## Store Format for Existing Stores 🌳 🗺️ 🕰️
2+
Your company currently has 85 grocery stores and is planning to open 10 new stores at the beginning of the year. Currently, all stores use the same store format for selling their products. Up until now, the company has treated all stores similarly, shipping the same amount of product to each store. This is beginning to cause problems as stores are suffering from product surpluses in some product categories and shortages in others. You've been asked to provide analytical support to make decisions about store formats and inventory planning.
3+
4+
##### Determining Store Format
5+
To remedy the product surplus and shortages, the company wants to introduce different store formats. Each store format will have a different product selection in order to better match local demand. The actual building sizes will not change, just the product selection and internal layouts. The terms "formats" and "segments" will be used interchangeably throughout this project. You’ve been asked to:
6+
7+
- Determine the optimal number of store formats based on sales data.
8+
- Sum sales data by StoreID and Year
9+
- Use percentage sales per category per store for clustering (category sales as a percentage of total store sales).
10+
- Use only 2015 sales data.
11+
- Use a K-means clustering model.
12+
- Segment the 85 current stores into the different store formats.
13+
- Use the StoreSalesData.csv and StoreInformation.csv files.
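The clustering input described above (category sales as a percentage of total store sales) and a basic K-means pass can be sketched as follows. This is a plain Lloyd's algorithm with a fixed, deterministic initialization, not a tuned implementation, and it does not address choosing the number of formats. The category names and sales figures are hypothetical.

```python
import math


def category_shares(sales_by_category):
    """Each category's share of total store sales, ordered by category name."""
    total = sum(sales_by_category.values())
    return [sales_by_category[c] / total for c in sorted(sales_by_category)]


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm using the first k points as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2015 sales by category for four stores.
stores = {
    "A": {"Produce": 80.0, "Dry_Grocery": 20.0},
    "B": {"Produce": 10.0, "Dry_Grocery": 90.0},
    "C": {"Produce": 150.0, "Dry_Grocery": 50.0},
    "D": {"Produce": 30.0, "Dry_Grocery": 170.0},
}
points = [category_shares(s) for s in stores.values()]
labels, centroids = kmeans(points, k=2)
```

Using shares rather than raw sales keeps large stores from dominating the distance calculation, which is the reason the brief asks for percentages.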
14+
15+
## Store Format for New Stores
16+
The grocery store chain has 10 new stores opening at the beginning of the year. The company wants to determine which store format each of the new stores should have. However, we don’t have sales data for these new stores yet, so we’ll have to determine the format using each new store’s demographic data.
17+
18+
##### Determine the Store Format for New Stores
19+
You’ve been asked to:
20+
21+
- Develop a model that predicts which segment a store falls into based on the demographic and socioeconomic characteristics of the population that resides in the area around each new store.
22+
- Use a 20% validation sample with Random Seed = 3 when creating samples with which to compare the accuracy of the models. Make sure to compare a decision tree, forest, and boosted model.
23+
- Use the model to predict the best store format for each of the 10 new stores.
24+
- Use the StoreDemographicData.csv file, which contains the information for the area around each store.
25+
26+
**Note:** In a real world scenario, you could use PCA to reduce the number of predictor variables. However, there is no need to do so in this project. You can leave all predictor variables in the model.
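The 20% validation sample can be illustrated with a seeded shuffle. This is only a sketch of the idea: the function name is hypothetical, and a seed of 3 in Python's `random` module will not reproduce the sample drawn by any other tool, so results will differ from those obtained elsewhere.

```python
import random


def split_validation(records, validation_fraction=0.20, seed=3):
    """Shuffle reproducibly, then hold out a fraction for validation."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder IDs standing in for the 85 existing stores.
store_ids = list(range(1, 86))
train, validation = split_validation(store_ids)
```

All three candidate models (decision tree, forest, boosted) should be fit on `train` and scored on the same `validation` set so that their accuracies are comparable.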
27+
28+
## Forecasting
29+
Fresh produce has a short life span, and due to increasing costs, the company wants to have an accurate monthly sales forecast.
30+
31+
##### Forecasting Produce Sales
32+
You’ve been asked to prepare a monthly forecast for produce sales for the full year of 2016 for both existing and new stores. To do so, follow the steps below.
33+
34+
**Note:** Use a 6-month holdout sample for the TS Compare tool. The data set is short, so a 12-month holdout would remove too much of the data used to fit the model.
35+
36+
- Step 1: To forecast produce sales for existing stores, aggregate produce sales across all stores by month and create a forecast.
37+
38+
- Step 2: To forecast produce sales for new stores:
39+
40+
- Forecast produce sales (not total sales) for the average store (rather than the aggregate) for each segment.
41+
- Multiply the average store produce sales forecast by the number of new stores in that segment.
42+
- For example, if the forecasted average store produce sales for segment 1 for March is 10,000, and there are 4 new stores in segment 1, the forecast for the new stores in segment 1 would be 40,000.
43+
- Sum the new stores produce sales forecasts for each of the segments to get the forecast for all new stores.
44+
- Step 3: Sum the forecasts of the existing and new stores together for the total produce sales forecast.
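The arithmetic in Steps 2 and 3 can be sketched as below. The segment 1 figures (10,000 per average store, 4 new stores) repeat the example above; every other number, including how the remaining new stores are split across segments, is hypothetical.

```python
def new_store_forecast(avg_store_forecast, new_store_counts):
    """Per month: sum over segments of (average-store forecast x new stores)."""
    months = len(next(iter(avg_store_forecast.values())))
    return [
        sum(
            avg_store_forecast[seg][m] * new_store_counts.get(seg, 0)
            for seg in avg_store_forecast
        )
        for m in range(months)
    ]


def total_forecast(existing, new):
    """Step 3: add the existing-store and new-store forecasts month by month."""
    return [e + n for e, n in zip(existing, new)]


# One month shown for brevity; a full forecast would have 12 entries per list.
avg_store_forecast = {1: [10_000.0], 2: [8_000.0]}  # segment 2 value is made up
new_store_counts = {1: 4, 2: 6}                     # segment 2 count is made up
existing_forecast = [500_000.0]                     # made-up aggregate forecast

new_forecast = new_store_forecast(avg_store_forecast, new_store_counts)
combined = total_forecast(existing_forecast, new_forecast)
```

For the single month shown, segment 1 contributes 40,000 and segment 2 contributes 48,000, giving 88,000 for new stores and 588,000 overall.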
45+
46+
47+
