What can you learn from A/B Testing?

We have been running a few tests to find out what makes visitors click links inside and below an article. When you have tools to measure user engagement at page level, make use of them. You will be surprised to learn that a lot of your assumptions are wrong. It hurts, but it is going to be one of the best learning experiences.

For example, we were under the impression that using brand names in anchor text would boost click-through rate. Let us take two examples for our test (these were not the actual tests, but we used similar branded vs. non-branded links).

Version A

Download Professor X’s lecture on Decision making using Macroeconomics Models (Professor X is a thought leader in the field of Decision making using Macroeconomics models. He has helped over 524 companies make informed decisions with his models)

Vs.

Version B

Download 54 mins lecture on Decision making using Macroeconomics Models

If you go by common sense, a bio of the professor below the branded link should ensure a high click-through rate for Version A. But visitors' lack of attention takes precedence over common sense. Although Version A looked like the more compelling call to action, the link in Version B got over 67% more clicks.

Why?

This is where we have to learn to draw conclusions from the experiment. If you don’t take away anything else from this article, just remember to do this one exercise.

Answer the following questions:

1) What is the conclusion from the experiment?

In calls to action where a brand’s popularity is uncertain, a detailed explanation of the brand below the link will not produce a higher click-through rate.

2) What have you learned about the audience?

The audience doesn’t have the patience or attention to read details about the brand. It is better to provide a general call to action with a simple value proposition. In this case: 54 mins lecture on Decision making using Macroeconomics Models.

Are we certain that this behaviour holds for the whole audience?

That is where Statistical Significance is so important.

Statistical Significance

The above case was a simple A/B test with two links. The first was a branded call to action with details provided below the link. The second was a non-branded call to action with no details. For A/B tests, the number of visitors required to reach statistical significance is relatively low, but you should still treat the results of such experiments with doubt.

Doubt = Null Hypothesis

Statistician Ronald Fisher formalized this doubt as the null hypothesis. The null hypothesis states that no difference exists between the groups being compared, or that a single variable is no different from zero. It is presumed to be true until statistical evidence rejects it in favour of an alternative hypothesis.

In the above case, the null hypothesis would be something like: “In the above experiment, the difference in click-through rate is due to chance rather than to the use of a non-branded call to action without explanation.”

How will you prove the contrary?

You have to use a large enough sample size and collect data until the result reaches statistical significance.

What should be the Sample Size?

Although the generic rule of thumb is that you should have 1000 conversions on each version, a more scientific way of calculating the sample size is the following:

1) Define the variable for conversion (clicks, sign ups etc.)

2) What is a substantial difference in conversion? (For A/B tests, it is any difference greater than 10%.)

In the above case, if version A has a click through rate of 2% and B a CTR of 15%, we can say that there is a substantial difference in CTR.

3) What is the baseline conversion?

For the past 5 months, we have been getting between 1.4% and 2.6% conversion on Version A, with the median conversion at 2%. In this case, we will take the baseline conversion as 2%.

4) Use Power Analysis

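The basic principle of power analysis is to calculate the probability that the test will detect a difference between the two versions when a real difference exists. This probability is called the power of the test. If the sample size gives the test at least an 80% probability of detecting a real difference of the size chosen in step 2, the sample size is accepted.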

Don’t worry about the mechanics of Power Analysis. Most A/B testing tools have Power Analysis built in.
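To make the four steps above concrete, here is a minimal sketch of the kind of calculation such tools perform. It assumes a two-sided test at the 5% significance level with 80% power, uses the standard normal-approximation formula for comparing two proportions, and treats the 10% difference as relative to the 2% baseline. The function name sample_size_per_version and its parameters are our own illustration, not the API of any particular tool, and a real tool may use a different formula.

```python
import math
from statistics import NormalDist


def sample_size_per_version(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per version (normal approximation).

    baseline      -- baseline conversion rate, e.g. 0.02 for 2%
    relative_lift -- smallest relative improvement worth detecting,
                     e.g. 0.10 for 10% (assumed to be relative, not absolute)
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    p_bar = (p1 + p2) / 2

    # Critical values of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)          # desired power

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Baseline of 2% from step 3, looking for a 10% relative improvement.
n = sample_size_per_version(baseline=0.02, relative_lift=0.10)
print(f"Visitors needed per version: {n}")
```

With a low baseline and a small improvement, the required sample runs into the tens of thousands of visitors per version. Looking for a larger improvement brings the number down sharply.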

How do you use Statistical Significance to reject the Null Hypothesis?

Calculating statistical significance in detail is beyond the scope of this article, and there are online tools that perform the calculation. But remember this one number: 5%. This number is the significance level. A significance level of 0.05 means we accept at most a 5% probability of seeing a difference this large purely by chance when no real difference exists. The calculation returns a probability called the p-value. If the p-value is less than 0.05, we can reject the null hypothesis.
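For readers who want to see what such a calculation looks like, here is a minimal sketch using a two-proportion z-test, one common way to compare two click-through rates. The visitor and click counts are hypothetical numbers chosen to resemble a 67% difference; they are not the data from our experiment. The function name two_proportion_p_value is our own, and the tool you use may apply a different test.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(clicks_a, visitors_a, clicks_b, visitors_b):
    """Two-sided p-value for the difference between two click-through rates.

    Assumes at least one click and one non-click overall, otherwise the
    standard error below would be zero.
    """
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b

    # Under the null hypothesis both versions share one underlying rate.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))

    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: A gets 200 clicks and B gets 334 clicks,
# each from 10,000 visitors.
p_value = two_proportion_p_value(200, 10_000, 334, 10_000)

if p_value < 0.05:
    print("Reject the null hypothesis")
else:
    print("Cannot reject the null hypothesis yet; keep collecting data")
```

The same difference in rates measured on a much smaller sample would give a larger p-value, which is why the sample size matters.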

The A/B testing tool that we are using collects the following information:

Existing conversion rate (%)

Expected improvement in conversion rate (%)

Number of combinations (variations)

Average number of daily visitors

Percent visitors included in test

Based on the above values, it calculates the total number of days to run the test.
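We do not know the exact formula the tool uses, but the arithmetic is likely close to the following sketch. It assumes the required number of visitors per variation has already been worked out from the conversion rate and the expected improvement, and that traffic is split evenly between variations. The function name days_to_run and all the input values are hypothetical.

```python
import math


def days_to_run(visitors_per_variation, variations, daily_visitors, percent_included):
    """Estimate how many days a test needs to collect enough visitors."""
    # Only part of the daily traffic takes part in the test.
    visitors_in_test_per_day = daily_visitors * percent_included / 100
    total_needed = visitors_per_variation * variations
    return math.ceil(total_needed / visitors_in_test_per_day)


# Hypothetical inputs: 80,000 visitors needed per variation, 2 variations,
# 5,000 visitors a day, half of them included in the test.
days = days_to_run(
    visitors_per_variation=80_000,
    variations=2,
    daily_visitors=5_000,
    percent_included=50,
)
print(f"Run the test for about {days} days")
```

Including a larger share of visitors in the test, or testing fewer variations, shortens the run.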

If you don’t remember anything from this article, just remember to answer the following questions after the experiment.

1) What is the conclusion from the experiment?

2) What have you learned about your audience?

You might draw the wrong conclusions, or the behaviour of your audience might change in the next experiment. But if you don’t learn from each experiment, the assumptions that you make in the next one will be even further from reality.