28 commits
abb5f2d
Brenda_Day_2
Jul 15, 2025
c876d7f
Merge pull request #1 from Brenvillag/Brenda
Brenvillag Jul 15, 2025
3228f60
Day 2 Damian
dejmengit Jul 15, 2025
ac3ab64
Merge pull request #2 from Brenvillag/dejmen
dejmengit Jul 15, 2025
5c118c4
Day 2 Delmar
DelmarB Jul 15, 2025
2a071ed
Merge pull request #3 from Brenvillag/delmar
DelmarB Jul 15, 2025
7e9bd32
day2 sherin
Sherinskuruvilla Jul 15, 2025
713410d
Merge pull request #4 from Brenvillag/sherin
Sherinskuruvilla Jul 15, 2025
ec28d32
W6_Day2_Brenda
Jul 22, 2025
7437fe0
Merge pull request #5 from Brenvillag/brenda
Brenvillag Jul 22, 2025
c9f904d
W6_day2_sherin
Sherinskuruvilla Jul 22, 2025
ac5f7fa
Merge pull request #6 from Brenvillag/sherin
Sherinskuruvilla Jul 22, 2025
30b25b5
wk6-day1
dejmengit Jul 22, 2025
7ef05b6
Merge pull request #7 from Brenvillag/dejmen
dejmengit Jul 22, 2025
d22326b
w6_d2_delmar
DelmarB Jul 22, 2025
9f49599
Merge pull request #8 from Brenvillag/delmar
DelmarB Jul 22, 2025
06af823
Merge branch 'main' into delmar
DelmarB Jul 23, 2025
12fb35b
Merge pull request #9 from Brenvillag/delmar
DelmarB Jul 23, 2025
2ded043
Merge pull request #10 from Brenvillag/delmar
DelmarB Jul 24, 2025
4889450
Final_push_brenda
Jul 25, 2025
2ff5cd1
Updated README.md
Jul 25, 2025
b817743
Merge pull request #11 from Brenvillag/brenda
Brenvillag Jul 25, 2025
835761e
Final push
DelmarB Jul 25, 2025
4c9d93d
Merge pull request #12 from Brenvillag/delmar
DelmarB Jul 25, 2025
15ba9ff
Final push
Sherinskuruvilla Jul 25, 2025
96e6e8d
Merge pull request #13 from Brenvillag/sherin
Sherinskuruvilla Jul 25, 2025
c8cab64
final push dejmen
dejmengit Jul 25, 2025
c623275
Merge pull request #14 from Brenvillag/dejmen
dejmengit Jul 25, 2025
12 changes: 12 additions & 0 deletions .virtual_documents/notebooks/Untitled.ipynb
@@ -0,0 +1,12 @@
import pandas as pd

pd.read_csv('Users/Dejmen/Desktop/Ironhack/week5/Day1/vanguard-ab-test/data/raw/df_final_demo.txt', sep="\t")


df = pd.read_csv("../data/raw/df_final_demo.txt")


df.head(20)



164 changes: 113 additions & 51 deletions README.md
@@ -1,77 +1,139 @@
# Project overview
...
# 🌍 Vanguard A/B Testing: Website Redesign Performance Analysis

# Installation
## Objective
This project uses A/B testing to compare the performance of a new website design against the existing version. Our goal is to determine, through formal statistical hypothesis testing, whether the new design improves key user behavior metrics such as completion rate and time efficiency. We also aim to uncover usability issues in the new design for further refinement.

1. **Clone the repository**:
---

```bash
git clone https://github.com/YourUsername/repository_name.git
```
## 🔍 Hypothesis

2. **Install UV**
We hypothesize that the new website design improves user performance across several key indicators, including:

If you're a MacOS/Linux user type:
### A higher completion rate

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
### Lower error (reversal) rates

If you're a Windows user open an Anaconda Powershell Prompt and type :
### Shorter time spent on steps, indicating better usability

```bash
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```
We will apply statistical hypothesis tests (two-sample t-tests, plus a z-test for completion rates) to compare performance metrics between users assigned to the old design (control group) and the new design (test group).
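
As a rough illustration of the two-sample comparison, the sketch below computes a t statistic with only the standard library. It uses Welch's variant (no equal-variance assumption), which is an assumption on our side; the notebooks may use the pooled form. The function name `welch_t` and the sample durations are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t_stat = diff / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical step durations in seconds, for illustration only.
control_durations = [30.0, 42.0, 35.0, 50.0, 38.0]
test_durations = [28.0, 33.0, 31.0, 40.0, 29.0]
t_stat, dof = welch_t(control_durations, test_durations)
print(f"t = {t_stat:.3f}, dof = {dof:.2f}")
```

The p-value then comes from the t distribution with `dof` degrees of freedom; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.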

3. **Create an environment**
---

```bash
uv venv
```
## Funnel Structure
The user journey consists of three sequential steps, followed by a final "Confirm" step, representing successful completion. Users may proceed forward or move backward in the process. Backward navigation (step reversal) may indicate confusion or design inefficiencies.

3. **Activate the environment**
This funnel structure provides the framework for defining and analyzing all metrics.
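
A minimal sketch of how a step reversal could be detected from one visit's ordered steps. The step names are taken from the `process_step` values in `data/clean/avg_step_duration_for_tableau.csv`; the function names and the example visits are hypothetical, and the project's own functions may differ.

```python
# Funnel order, using the process_step labels found in the clean data.
STEP_ORDER = ["start", "step_1", "step_2", "step_3", "confirm"]
STEP_INDEX = {name: position for position, name in enumerate(STEP_ORDER)}


def count_reversals(steps):
    """Count moves from a later funnel step back to an earlier one."""
    positions = [STEP_INDEX[step] for step in steps]
    return sum(1 for prev, cur in zip(positions, positions[1:]) if cur < prev)


def is_clean_completion(steps):
    """True when the visit reaches 'confirm' without any backward move."""
    return "confirm" in steps and count_reversals(steps) == 0


# Hypothetical visits, for illustration only.
straight_visit = ["start", "step_1", "step_2", "step_3", "confirm"]
wobbly_visit = ["start", "step_1", "step_2", "step_1", "step_2", "step_3", "confirm"]
print(count_reversals(straight_visit), count_reversals(wobbly_visit))
```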

If you're a MacOS/Linux user type (if you're using a bash shell):
---

```bash
source .venv/bin/activate
```
## Primary Metric
- Quality Visits Leading to Confirmation

If you're a MacOS/Linux user type (if you're using a csh/tcsh shell):
A "quality visit" is defined as a session in which the user completes all steps and reaches the final "Confirm" stage.

### KPIs

```bash
source .venv/bin/activate.csh
```
#### Key Performance Indicators (KPIs)

If you're a Windows user type:
Each KPI below was subjected to hypothesis testing to assess whether the differences between the old and new designs are statistically significant.

```bash
.venv\Scripts\activate
```
##### Completion Rate without error regardless of prior error visits
The proportion of users who reach the final "Confirm" step in a single visit without any step reversal. The test group's rate is compared against the control rate plus a 5% relative lift, the threshold for cost-effectiveness.
Hypothesis test: Z-test
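
The threshold test can be sketched as a one-sided two-proportion z-test whose null hypothesis is shifted by the required lift. The pooled standard error and the function name `lift_z_test` are assumptions; the counts are the ones in `data/clean/clean_flow.csv`.

```python
import math


def lift_z_test(success_c, n_c, success_t, n_t, min_lift=0.05):
    """One-sided z-test of H1: p_test - p_control > min_lift * p_control."""
    p_c = success_c / n_c
    p_t = success_t / n_t
    required = min_lift * p_c  # relative lift expressed as a rate difference
    pooled = (success_c + success_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = ((p_t - p_c) - required) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail P(Z >= z)
    return z, p_value


# Clean completions (0 errors): Control 7683 / 26271, Test 8944 / 29908.
z, p_value = lift_z_test(7683, 26271, 8944, 29908)
print(f"z = {z:.4f}, one-sided p = {p_value:.4f}")
```

Under these assumptions the result should land close to the Z-statistic and one-sided P-value recorded in `data/clean/completions_stat_summary_clean.csv` (-2.0788 and 0.9812), so the null hypothesis is not rejected for the zero-error scenario.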

4. **Install dependencies**:
##### Time Spent on Each Step
The average time users spend at each step of the funnel, analyzed by age group.
Hypothesis test: T-Test

```bash
uv pip install -r requirements.txt
```
##### Error Rate (Step Reversals)
The proportion of users, by age group, who move backward in the process flow (from a later step to an earlier one) and fail to complete the final step, computed separately for the control and test groups.
Hypothesis test: T-Test
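
For a quick descriptive view (not the hypothesis test itself), the per-age-group gap between the two variations can be computed as below. The rates are copied from `data/clean/age_group_error.csv` and rounded to four decimals; the variable names are hypothetical.

```python
# Error rates by age group, rounded from data/clean/age_group_error.csv.
control = {"0-30": 0.1556, "30-39": 0.1594, "40-49": 0.1731, "50-59": 0.1912,
           "60-69": 0.1996, "70-79": 0.2220, "80+": 0.2477}
test = {"0-30": 0.1656, "30-39": 0.1790, "40-49": 0.1919, "50-59": 0.2250,
        "60-69": 0.2362, "70-79": 0.2743, "80+": 0.2854}

# Positive gap means the test design has the higher error rate.
gaps = {group: round(test[group] - control[group], 4) for group in control}
widest = max(gaps, key=gaps.get)

for group, gap in gaps.items():
    print(f"{group:>5}: {gap:+.4f}")
print("widest gap:", widest)
```

In this data the test variation has the higher error rate in every age group, with the widest gap in the 70-79 group.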

# Questions
...
#### Age Group Engagement
Comparison of the average session duration (in seconds) across the defined age groups (<30, 30–39, 40–49, 50–59, 60–69, 70–79, 80+) between the control and test groups.
Visualization: bar plot
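
The age buckets could be assigned with a small helper like the one below. This is a sketch: the function name `age_group` is hypothetical, and the labels use a plain hyphen, whereas the clean CSV files are not fully consistent in how they spell the ranges.

```python
def age_group(age):
    """Map a client's age (int or float) to one of the seven age buckets."""
    if age < 30:
        return "<30"
    if age >= 80:
        return "80+"
    lower = int(age // 10) * 10  # e.g. 47.5 -> 40
    return f"{lower}-{lower + 9}"


# Hypothetical ages, for illustration only.
for example_age in (22, 30, 47.5, 79.9, 80):
    print(example_age, "->", age_group(example_age))
```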

# Dataset
...
#### Expected Outcomes
- Statistically confirm whether the new design improves user engagement and conversion.
- Ensure that engagement remains consistent across age groups.
- Identify steps in the funnel where users experience friction (e.g., high reversal rates or time delays) and provide actionable redesign suggestions.

## Main dataset issues

- ...
- ...
- ...
## 🧾 Dataset Description

## Solutions for the dataset issues
...
### 🧱 Raw Datasets:

- **df_final_demo.txt**
- **df_final_experiment_clients.txt**
- **df_final_web_data_pt_1.txt**
- **df_final_web_data_pt_2.txt**

### Dataset obstacles:
- **df_final_experiment_clients.txt**
  - ~20,000 rows were deleted due to multiple NaN values
- **df_final_demo.txt**
  - 14 rows were deleted because all of their columns were NaN

### Final
> Note: Each txt file was first exported to its own csv file. After cleaning, the individual csv files were merged into one table, which made querying easier.
- **merged_df_clean.csv**

---

## 💻 Technologies Used

| Area                 | Tools/Technologies                  |
|----------------------|-------------------------------------|
| Data Manipulation    | Python (Pandas, NumPy)              |
| Data Visualization   | Matplotlib (pyplot), Seaborn        |
| Documentation        | Jupyter Notebook, Markdown, GitHub  |
| Version Control      | Git, GitHub, Anaconda PowerShell    |
| Statistical Analysis | SciPy, statsmodels                  |


---

## 📦 Deliverables

- ✅ [Repository "vanguard-ab-test" on GitHub](https://github.com/Brenvillag/vanguard-ab-test)
- ✅ [Raw dataset](https://github.com/data-bootcamp-v4/lessons/tree/main/5_6_eda_inf_stats_tableau/project/files_for_project)
- ✅ Jupyter Notebook with cleaned and documented dataset (`merged_df_clean.csv`)
- ✅ Jupyter Notebook that calls the functions
- ✅ Python `.py` file with the functions
- ✅ Tableau file
- ✅ [Group 1 Trello Project Page](https://trello.com/b/xIrQ1kK7/vanguard-ab-test)
- ✅ README documentation: README.md
- ✅ [Group 1 Presentation](https://docs.google.com/presentation/d/1Z9yE8gTMzNdZwtDIAucWzqTXzSzvAsn52Qk6IsR0oF4/edit?usp=sharing)


---

## 👨‍💼 Target Audience

- **Target Market**: Age groups. The design has to help older clients reach the "Confirm" step
- **Stakeholders**: Suggest changes or accept the new design
- **Analysts / Web designers**: Offer a clean dataset for further projects or optimizations

---

## 🛠️ Future Work
- **Assist in web design optimizations**: The notebooks and the Python functions file are ready to reuse after later design changes

---

## 👥 Contributors

- Brenda Villaverde
- Damian Witkowski
- Sherin Kuruvilla
- Delmar Bumanglag

---

## 🌐 We have proven our hypothesis
### WE NEED TO CHANGE THIS
📢 *The web design does contribute to a faster confirm rate per visit.*

# Conclusions
...

# Next steps
...
15 changes: 15 additions & 0 deletions data/clean/age_group_error.csv
@@ -0,0 +1,15 @@
Variation,age_group,error_rate
Control,0-30,0.15558953697647318
Control,30-39,0.15940402768893042
Control,40-49,0.17310098148834152
Control,50-59,0.19123746897912958
Control,60-69,0.19960180306261735
Control,70-79,0.22200764133244416
Control,80+,0.24773730196068816
Test,0-30,0.16564013002027975
Test,30-39,0.17899731432486105
Test,40-49,0.19186831800873064
Test,50-59,0.22495122646676183
Test,60-69,0.23618682235425859
Test,70-79,0.27431886823911966
Test,80+,0.28537193092450375
71 changes: 71 additions & 0 deletions data/clean/avg_step_duration_for_tableau.csv
@@ -0,0 +1,71 @@
process_step,Variation,age_group,avg_step_duration_seconds
confirm,Control,<30,88.78953626634959
confirm,Control,30–39,98.81361892583121
confirm,Control,40–49,119.84432809773124
confirm,Control,50–59,133.71443708609272
confirm,Control,60–69,165.97391304347826
confirm,Control,70–79,194.9187165775401
confirm,Control,80+,206.316091954023
confirm,Test,<30,96.64970145009951
confirm,Test,30–39,95.47562371252003
confirm,Test,40–49,108.9019344438474
confirm,Test,50–59,139.2750573036049
confirm,Test,60–69,170.72394881170018
confirm,Test,70–79,193.96587301587303
confirm,Test,80+,193.47602739726028
start,Control,<30,124.17042939353696
start,Control,30–39,123.16122082585278
start,Control,40–49,168.35644310474754
start,Control,50–59,164.11613406079502
start,Control,60–69,164.76671289875173
start,Control,70–79,177.45447705041386
start,Control,80+,162.98187311178248
start,Test,<30,126.47676837725382
start,Test,30–39,131.79662423907027
start,Test,40–49,151.10570005534035
start,Test,50–59,149.25849762066622
start,Test,60–69,156.62193095809488
start,Test,70–79,182.8992684299381
start,Test,80+,153.2340425531915
step_1,Control,<30,29.697157267308572
step_1,Control,30–39,34.593635486981675
step_1,Control,40–49,37.48498031903874
step_1,Control,50–59,45.53220648698036
step_1,Control,60–69,52.142279845091764
step_1,Control,70–79,66.15290669272106
step_1,Control,80+,65.53932584269663
step_1,Test,<30,31.161000179888468
step_1,Test,30–39,30.739241265557055
step_1,Test,40–49,33.76131687242798
step_1,Test,50–59,39.101942305482126
step_1,Test,60–69,44.34618217530076
step_1,Test,70–79,50.809257185516984
step_1,Test,80+,45.43069306930693
step_2,Control,<30,24.26086956521739
step_2,Control,30–39,28.077482876712327
step_2,Control,40–49,34.75361653272101
step_2,Control,50–59,44.883597883597886
step_2,Control,60–69,49.07524752475248
step_2,Control,70–79,57.26813655761024
step_2,Control,80+,67.81428571428572
step_2,Test,<30,38.98378254910918
step_2,Test,30–39,38.70398338682273
step_2,Test,40–49,38.680789798436855
step_2,Test,50–59,51.74283293320153
step_2,Test,60–69,56.66799265605875
step_2,Test,70–79,65.04033379694019
step_2,Test,80+,84.8649193548387
step_3,Control,<30,85.31499429874573
step_3,Control,30–39,92.84223664503246
step_3,Control,40–49,104.78492893537141
step_3,Control,50–59,110.91185112634672
step_3,Control,60–69,75.60493827160494
step_3,Control,70–79,73.74145616641901
step_3,Control,80+,84.42574257425743
step_3,Test,<30,92.35365853658537
step_3,Test,30–39,93.17980022197558
step_3,Test,40–49,109.72405251424325
step_3,Test,50–59,111.91771708683473
step_3,Test,60–69,82.81814901677792
step_3,Test,70–79,82.87806205770278
step_3,Test,80+,92.08994708994709
3 changes: 3 additions & 0 deletions data/clean/clean_flow.csv
@@ -0,0 +1,3 @@
Group,users_completed,total_users,clean_completion_rate
Control,7683,26271,29.25
Test,8944,29908,29.91
Empty file removed data/clean/cleaned_data_file.csv
Empty file.
5 changes: 5 additions & 0 deletions data/clean/completions_stat_summary_clean.csv
@@ -0,0 +1,5 @@
Scenario,Group,Completion Rate (0 errors),Completion Rate (Confirmed),Observed Difference,Required Difference for 5% Lift,Z-statistic,P-value (one-sided),Statistical Conclusion,Interpretation
0 errors,Control,29.2452%,,n/a,n/a,n/a,n/a,n/a,n/a
0 errors,Test,29.9050%,,0.6599%,1.4623%,-2.0788,0.9812,Fail to reject null,No cost-effective improvement
Confirm,Control,,59.2288%,n/a,n/a,n/a,n/a,n/a,n/a
Confirm,Test,,65.1966%,5.9678%,2.9614%,7.3403,0.0,Reject the null hypothesis,Test shows cost-effective improvement (lift > 5%).