A/B testing is a method commonly used in marketing, product development, and web design to compare two versions (A and B) of a single variable to determine which one performs better. For example, it might compare two different webpage layouts to see which one results in higher conversion rates. A typical test proceeds as follows:
- Hypothesis Formulation: Define what you want to test (e.g., a new button colour).
- Splitting the Sample: Randomly divide your audience into Group A (control) and Group B (variant); a random-assignment sketch follows this list.
- Running the Test: Expose Group A to the control version and Group B to the variant version.
- Data Collection: Measure the performance of each group using predefined metrics.
- Analysis: Use statistical methods to determine whether there is a significant difference between the two groups (see the significance-test sketch after this list).
- Conclusion: Decide whether to implement the change based on the results.
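A minimal sketch of the splitting step, assuming each user has a stable string identifier; hashing the id together with an experiment name (both names here are hypothetical) gives a deterministic, roughly 50/50 random split without having to store assignments:

```python
import hashlib

def assign_group(user_id: str, experiment: str = "button_colour_test") -> str:
    """Deterministically assign a user to 'control' or 'variant'."""
    # Hash the experiment name and user id so the split is stable and ~uniform.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100          # map the hash to a bucket 0-99
    return "control" if bucket < 50 else "variant"

# The same user always lands in the same group for a given experiment.
print(assign_group("user_42"))
```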
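For a conversion-rate metric, the analysis step often comes down to a two-proportion z-test. A sketch using statsmodels, with made-up conversion counts purely for illustration:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical results: conversions and visitors for [control, variant]
conversions = [120, 145]
visitors = [2400, 2380]

# Two-sided test of H0: both versions have the same conversion rate
z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)

print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the difference is statistically significant.")
else:
    print("Fail to reject H0: no significant difference detected.")
```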
For these comparisons to be valid, the test must satisfy several assumptions and conditions:

- Randomisation: Each participant has an equal chance of being assigned to either the control or the variant group. This ensures the groups are comparable, so any difference in outcome can be attributed to the change being tested.
- Independence: Observations are independent of each other; the behaviour of one user does not influence the behaviour of another.
- Sufficient Sample Size: The sample must be large enough to detect a meaningful difference between the control and the variant. Use power analysis to determine the required sample size (a sizing sketch follows this list).
- Consistency: Apart from the element being tested, all other conditions should remain the same for both groups, so that observed differences reflect the change rather than other factors.
- Stable Environment: The external environment should remain stable during the test period; major external events (e.g., holidays, news stories) can influence user behaviour and confound the results.
- No Peeking: Avoid looking at the results before the test is complete. Stopping the test early can lead to incorrect conclusions due to statistical noise (a simulation of this effect follows the list).
- Sensitive Metrics: The chosen metrics should be sensitive enough to detect changes caused by the variant; select metrics that are directly impacted by the change being tested.
- Homogeneous Population: The test groups should be comparable, with differences between user segments (e.g., demographics) evenly distributed between the control and variant groups.
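The power-analysis point can be made concrete. A sketch using statsmodels, assuming a baseline conversion rate of 5% and a minimum detectable uplift to 6% (both numbers are illustrative, not recommendations):

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.05   # assumed current conversion rate
target = 0.06     # smallest uplift worth detecting

# Cohen's h effect size for comparing two proportions
effect_size = proportion_effectsize(target, baseline)

# Per-group sample size for a two-sided test at 5% significance and 80% power
n_per_group = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.80,
    ratio=1.0,                 # equal group sizes
    alternative="two-sided",
)
print(f"Required sample size per group: {n_per_group:.0f}")
```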
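The cost of peeking can also be shown by simulation: if a z-test is re-run after every batch of users and the test is stopped at the first p < 0.05, the false-positive rate climbs well above the nominal 5% even when control and variant are identical. A rough sketch (all parameters are illustrative):

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(0)
true_rate = 0.05                      # identical conversion rate in both groups
peeks = range(1000, 10001, 1000)      # check after every 1,000 users per group
n_sims, early_stops = 500, 0

for _ in range(n_sims):
    a = rng.random(10_000) < true_rate    # simulated conversions, control
    b = rng.random(10_000) < true_rate    # simulated conversions, variant
    for n in peeks:
        _, p = proportions_ztest([a[:n].sum(), b[:n].sum()], [n, n])
        if p < 0.05:                      # stop at the first "significant" peek
            early_stops += 1
            break

print(f"False-positive rate with peeking: {early_stops / n_sims:.1%}")
```

With no real difference between the groups, the printed rate comes out well above the nominal 5%, illustrating why a fixed-horizon test should not be stopped early.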