Spotlight on... A/B testing: Is it really dead?

Summary

It has become received wisdom in the hotel industry and beyond that A/B testing should be employed on every website in order to optimize performance. In reality, the majority of websites do not have the required traffic or resources to run a statistically significant A/B test.

As the A/B test provider for some of the world's largest hotel groups, Triptease has industry-leading knowledge of A/B testing on hotel websites. In this report we highlight common misconceptions, explain the differences between hotels and OTAs when it comes to testing, lay out the mechanics of our own A/B test process, and explain how to avoid an invalid test.


Contents

Introduction
1. What is an A/B test?
2. How to run a test
3. Ensuring a valid test
4. Hotels vs. OTAs
5. Mythbusting
Conclusion
Sources & further reading


"A/B testing is dead"*

In February 2018, A/B testing platformOptimizely called time on their free A/B testtool in a move heralded by many as the deathknell of indiscriminate split-traffic testing. 

For many years, businesses and individualshave turned to A/B testing as the catch-allsolution for conversion optimization, the'scientific' nature of the test leading many tobelieve that they were gathering ironcladevidence for the profitability of whateverchange they were trying to introduce. 

What many marketers ignored, however, wasthe fact that setting up scientifically valid 

Introduction

experiments is a complex process. Notmincing their words, Venture Beat havesuggested that "trained statisticians wouldtear the average marketer's A/B test toshreds."

The vogue for A/B testing has taken particularhold in the hotel industry. It is no secret thatA/B testing is central to the business model ofOTAs like Booking.com and Expedia.Hoteliers have sought to emulate thepractices of their biggest rivals by adopting asimilar approach, facilitated by third-partytechnology vendors offering A/B tests as 'partof the deal'.

* Yvonne Koleczek, 'Optimizely's decision to ditch its free plan suggests A/B website testing is dead,' VentureBeat


The problem, however, is that the majority of hotel websites have nowhere near the traffic required to perform a statistically significant test. Poorly set up A/B tests based on inappropriate hypotheses with inadequate amounts of traffic are all too common - and they're wasting hoteliers' valuable time.

Does that mean, though, that A/B testing as a whole is redundant - just because it's been overused and misapplied?

As providers of a software platform to some of the world's largest hotel groups, we are regularly asked to set up tests on the websites of our clients. With accuracy being such a core element of our value proposition, performing the right kind of testing is something we spend a lot of time thinking about. And, often, the right kind of testing is indeed a properly hypothesized, correctly set up, professionally monitored A/B test.

So, we wouldn't agree that A/B testing is dead. We do, however, celebrate the growing acceptance of the fact that A/B tests are not a catch-all solution that anyone can benefit from. The over-simplified version of A/B testing that businesses have been sold for the best part of a decade encourages them to look at their data in the wrong way.

Testing is not about tweaking a widget and waiting for your conversion rate to increase. It's about understanding your data, organizing it correctly, coming up with a hypothesis about your users' behavior, and introducing a test once you have accounted for all the variables.

Every test should start from the point of talking to and understanding the end users of your product. The A/B test itself can be a powerful method of determining the impact of a change, but it can't give you that initial insight into what you need to change.


Performing tests and analyzing data is extremely powerful, but more often than not it should be the end point of a journey that starts with talking to a guest.

A/B testing isn't dead - but it does need to be reconsidered. We've put this report together with the help of Triptease's dedicated testing squad, who know everything there is to know about conversion analytics for hotel websites (and then some).

Time to rethink everything you thought you knew about A/B testing.

About the report

Author: Lily McIlwain, Content Manager
Contributors: Mark Farragher, Data Analyst; Tom Damant, Data Analyst


1. What is an A/B test?

The most common form of A/B test is a controlled experiment involving two variants, run to see whether one variant performs better than the other for a given metric.

An example would be testing two versions of a website homepage - one with a 'Book Now' button and one without - to see which version experiences a higher conversion rate.

The idea is that by the end of the test, you will be able to see which version performs more successfully and is likelier to drive improvement in your given metric. You can then make the appropriate change on your website.

In an A/B test there are two variations of your website live at the same time. The control is the original version, and the variant contains the desired change. As users arrive on the website they are entered into one of the two testing groups and see only one of the two variations.

The ratio of traffic between groups is open to determination, but it is most often a 50:50 split.
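As a concrete illustration, here is a minimal sketch of how a testing tool might implement that assignment - hashing a visitor ID so that the same user always sees the same variation. The function name and 50:50 default are our own assumptions for the example, not a description of any particular vendor's tool:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, split: float = 0.5) -> str:
    """Deterministically bucket a visitor so the same user always
    lands on the same side of the test on every visit."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map the hash to [0, 1]
    return "control" if bucket < split else "variant"

print(assign_variant("visitor-123", "book-now-test"))  # stable across calls
```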


The important feature of an A/B test, and what differentiates it from other common forms of website testing, is that it gives you a control group, which brings it in line with scientific experimentation. It follows that there is no point in doing an A/B test which is not carried out rigorously and scientifically.

There are many other types of test hotels can perform if they don't have the resources to methodically carry out a scientifically sound test. The scientific, 'objective' nature of an A/B test does not make it the only valid way of testing your website.

If you have the resources and appropriate volume of traffic, though, A/B testing can and should be a valuable part of your optimization arsenal.

"The majority of hotels do not have the necessary traffic to run a valid A/B test on their website."
- Tom Damant, Data Analyst at Triptease


Case study: Best Western UK

In 2017, Best Western UK carried out an A/B test of the impact of Triptease's Price Check on their website. The test was positive, demonstrating a 4.8% uplift in conversion rate. We spoke to Jim Muir, Head of Marketing at Best Western UK, to get his view on A/B testing.

Why did you decide to A/B test the Triptease platform on your site?

"We are a membership organisation and weneed to be certain that any investment in ourwebsite is going to deliver a positive ROI forour members. Due to the platform being athird-party tool, we wanted to be able tovalidate the claims put forward by Tripteaseas well as monitor any potential performanceissues.

"By rolling this out as an A/B test we wereable to confirm the uplift, how big it was andcalculate a likely ROI to further our businesscase internally."

In general, what do you A/B test and why?

"On www.bestwestern.co.uk we like to test asmuch as we can to document what works,measure how much of an impact a change hashad as well as what the next develop-

9

Page 10: Is it really dead? Spotlight on A/B testing · "A/B testing is dead"* In February 2018, A/B testing platform Optimizely called time on their free A/B test tool in a move heralded

-ment steps may be based off these findings.This allows us to remove opinion, testhypotheses and deliver what works best forour users.

"Examples of previous tests include variousredesigns and tweaks to our hotel detailpages, the checkout process, location searchresults as well as key landing pages such asthe homepage. Whenever we invest in thewebsite, we consider the fact that we'respending the members' money - so we needto be sure that any investment has a positiveROI."

What did you hope the Triptease platform would achieve on your website?

"The Triptease platform provides a great toolfor users to compare and confirm priceswithout the need to leave the direct website.With the launch of our Book Direct campaign

campaign we needed further ways ofpromoting this message to our audience.Triptease provided the best tool to furtherpromote and prove the message to our users.

"Our objectives were simple: reduce the needfor web visitors to shop around = improvedconversion rate = more revenue through ourwebsite = lower cost of sale for our memberhotels."

How did you find the process of working with the Triptease A/B testing team?

"Our setup was slightly more complicatedthan a usual integration because we wantedto show different prices based on a user'smember state (if a user is logged in as arewards member, we reveal our exclusivepricing). Despite the potential complications,the Triptease team were great to work withthroughout."


2. How to run a test


2(i) Choosing a provider

There are a couple of routes you can go down when setting up an A/B test. Many people choose to employ a third party to perform the entire test for them.

Others - generally those working with smaller amounts of traffic - prefer to use self-serve tools with which they set up, monitor and analyze the test themselves. One such (free) tool was Optimizely's Starter plan, now consigned to history. Most providers offer both an enterprise and a self-serve solution.

There's a lot we could say about the wisdom of using self-serve A/B test tools, but in the interests of space it's probably best to say that they're unlikely to drive any significant value for your business unless they're in the hands of a team who really know how to use them.

Most hotels depend on third parties for some or all of the features on their website. The risk of using a third-party A/B testing tool to monitor the impact of a third-party change on your website is that data will not be communicated fully between those two third parties.


At Triptease, we have built our own A/B testing tool designed to effectively monitor the impact of our platform on hotel websites. A huge benefit of this is that we have end-to-end visibility of every test we run, meaning we can avoid the problem just mentioned.

A typical A/B test for Triptease would measure the impact of Price Check (our price comparison messaging) on a hotel website's conversion rate. The fact that our analytics team have an in-depth knowledge of how the product works means that problems and oddities can be spotted much faster than by a third party coming to the product cold.

In addition, our extensive data means we can reliably ensure that every test runs completely fairly. For example, our tool is the only one which can check price parity on both sides of the test, drawing on the millions of data points we have developed that are highly specific to the hotel industry.

"Investing time and effort in really perfecting your A/B testing process pays dividends in the long run. The more you invest in getting your process and monitoring right, the easier your subsequent tests will be - and the greater faith you will have in the results you get."
- Tom Damant, Data Analyst at Triptease


2(ii) Deciding what to improve

From the outset, an A/B test should be about defining an area of importance to your business and working out how to improve it. The test should be centred on a particular KPI (key performance indicator) such as traffic, retention or conversion.

A good thing to remember when it comes to A/B tests is that, while you may learn a number of different things about your audience during the process, you should go into the test with one key improvement in mind. A/B tests are most successful when there is as little 'noise' surrounding them as possible; trying to improve five metrics with one test will likely dilute your focus and could muddy the validity of your data.

There are a number of ways you could go about deciding what to improve. If you have reliable access to your performance data, then you could use it to benchmark yourself against industry figures and see where you fall short. If you don't, it is just as valid to use a combination of your own judgment and qualitative data from your user base, such as surveys and feedback.


2(iii) Making a hypothesis

Every good experiment needs a hypothesis. An example of a hypothesis we would make at Triptease is as follows: "the presence of the Price Check widget on x hotel website will lead to a conversion rate increase of at least y%."

Not only does a hypothesis focus your attention on the metric you are trying to improve, it also forces you to take a step back and consider whether the experiment is really worth it. It's good to get out of the mindset of "I'll just run an A/B test and see what happens." When setting your hypothesis, bear in mind the importance of Occam's razor: the simplest theory, or the one that makes the fewest assumptions, is often the most powerful.

If you have relatively low traffic, there are few opportunities to undertake tests, so it is all the more important to prioritize hypotheses with the best logic first. Make an educated guess about the change most likely to improve the metric you've identified.


Hypothesis: a statement made on the basis of limited preliminary evidence that is used as a starting point for further investigation.


2(iv) The setup

Before setting anything live, you need to calculate some parameters for your A/B test. Determining your statistical significance, statistical power and minimum detectable effect allows you to determine the necessary sample size for your experiment.

SAMPLE SIZE

Taking a sample of your potential guests allows you to infer a result that is representative of your entire audience, but only when the sample size is big enough. The larger your sample size, the more meaningful your insights will be.


Think about flipping a coin. If you flipped it five times, you may well observe four heads. Based on this, you might decide that the coin is biased. However, if you flipped it a thousand times, you might observe 499 heads - and your conclusion would be very different. The conclusion in the second example is much more accurate than the first, thanks to the increased sample size.
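For readers who like to see this for themselves, the short simulation below (illustrative only) measures how much the observed heads rate swings at different sample sizes:

```python
import random
import statistics

def head_proportions(flips: int, trials: int = 1_000) -> list:
    """Observed proportion of heads for a fair coin, repeated `trials` times."""
    return [sum(random.random() < 0.5 for _ in range(flips)) / flips
            for _ in range(trials)]

for flips in (5, 100, 1_000):
    props = head_proportions(flips)
    print(f"{flips:>5} flips: spread (stdev) of observed rate "
          f"{statistics.stdev(props):.3f}")
# With 5 flips the observed rate swings wildly (stdev around 0.22, so
# four heads out of five is unremarkable); with 1,000 flips it hugs
# the true 0.5 (stdev around 0.016).
```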

One of the most common misconceptions about A/B testing is that anybody with a website can use it to improve their performance. In reality, an A/B test would not be appropriate for most small properties with a relatively low level of website traffic. In fact, it would take the average independent hotel two years to run a statistically significant A/B test where the goal is to measure conversion rate uplift.

Unfortunately, running an A/B test for that length of time would most likely result in some seriously dodgy data. Firstly, it would significantly increase the chance of a user seeing both sides of the test - and therefore render their behavior invalid. If a user clears their cookies after visiting a site running an A/B test, they could be entered into a different side of the test if they return.

Running a test for such a long time would also make it much harder to attribute any change in user behavior to the single action you're testing. The chance of events occurring which could invalidate your test increases the longer you run it.

2 years: the length of time it would take the average independent hotel to run a statistically significant A/B test


MINIMUM DETECTABLE EFFECT

For each experiment you run, you should have an idea of the primary metric that you're trying to improve. For hotels, this is likely to be website conversion rate (the rate at which visitors to your site go on to book a room).

Your current conversion rate is known as your baseline - let's call it 2%. The minimum detectable effect represents the smallest improvement over that baseline that you're interested in detecting. The larger your minimum detectable effect is, the smaller your sample size needs to be.

A good rule of thumb for deciding your experiment's minimum detectable effect is to ask yourself: "What is the smallest improvement I would need to see to make investing in this change worthwhile?"

13.3%: the highest baseline conversion rate seen in a Triptease test


STATISTICAL SIGNIFICANCE

Measuring statistical significance is crucial to a reliable and accurate A/B test. Significance is measured so that you have an understanding of the risk of making a 'false positive' error - in other words, thinking that your change is having an impact when it isn't in reality. The significance level is usually set at 0.05, which means that 5% of all 'winning' A/B test results may not actually be positive.

STATISTICAL POWER

This gives you a measure of how likely you are to make a 'false negative' error. That means believing that your change has not had a positive impact, when in fact it has. The default value for statistical power is 80%, which means that 20% of the time that you don't observe a winner there may actually have been one. In tests with low statistical power, it is likely that your result will not be positive - even if your change has had a genuine impact.

0.05: default significance level for A/B tests
80%: minimum level for statistical power


2(v) The A/A test

Once you've made your hypothesis and determined your sample size based on the factors above, it's a good idea to perform an A/A test before moving on to A/B.

The A/A test simply involves splitting your website traffic without making a change on either side. The point of this step is to check that traffic is splitting evenly across all segments - device, browser, country, etc. - but also to make sure that there are no underlying biases in your A/B testing tool.


You need to be sure that the traffic splitting is truly random. Most tools are built to achieve this, but there are all kinds of underlying technical problems that could cause a non-randomized split of traffic.

The idea of the A/A test is simple: no change has been made to your website, so conversion rates on either side of the test should be roughly equal (they are unlikely to be exactly equal unless your website experiences a huge amount of traffic).
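One simple randomness check - often called a sample ratio mismatch check - is to ask whether the observed split could plausibly have come from the intended 50:50 allocation. A minimal sketch, where the function name, threshold and visitor counts are illustrative assumptions:

```python
from statistics import NormalDist

def split_looks_random(n_control: int, n_variant: int,
                       expected: float = 0.5, alpha: float = 0.001) -> bool:
    """Is the observed traffic split consistent with the expected
    allocation? False suggests a sample ratio mismatch."""
    n = n_control + n_variant
    se = (expected * (1 - expected) / n) ** 0.5
    z = abs(n_control / n - expected) / se
    p_value = 2 * (1 - NormalDist().cdf(z))
    return p_value >= alpha  # False -> investigate the tool before trusting data

print(split_looks_random(50_200, 49_800))  # True: within random variation
print(split_looks_random(52_000, 48_000))  # False: something is skewing the split
```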


[Chart: running 'uplift' (%) on one side of an A/A test, plotted against day of test, days 1-15]

Don't judge too soon

This chart depicts the first 12 days of an A/A test we ran with a hotel group in 2017. The line represents the 'uplift' experienced by one side of the test (although of course, both sides were being shown exactly the same website at this point). As we can see, the change varies greatly in the first ten days of the test - which might lead the tester to believe that their test is set up incorrectly. In fact, we can see that the variants begin to converge on zero after 12 days, or around one thousand conversions.
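The same pattern is easy to reproduce. The illustrative simulation below runs an A/A test day by day and prints the running 'uplift'; both sides are identical, yet the early readings swing widely before settling (the traffic and conversion figures are assumptions):

```python
import random

random.seed(7)  # fixed seed so the run is reproducible
conv_a = conv_b = 0
daily_visitors, rate = 2_000, 0.02  # assumed traffic and conversion rate
for day in range(1, 15):
    conv_a += sum(random.random() < rate for _ in range(daily_visitors))
    conv_b += sum(random.random() < rate for _ in range(daily_visitors))
    uplift = (conv_b - conv_a) / conv_a * 100
    print(f"day {day:2}: running 'uplift' {uplift:+.1f}%")
# Early days typically show swings of several percent in either
# direction even though both sides are identical; the line flattens
# as the cumulative number of conversions grows.
```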


2(vi) The A/B test

Finally, the test itself. After having set a hypothesis, determined your parameters, and run an A/A test, you are ready to set the two variants of the website live and start to collect data about user behavior.

Constant vigilance is crucial at this stage. An A/B test is resource intensive and can be expensive to run, so you need to pay close attention to its performance in order to catch anything dodgy before it pollutes your data collection.

It can be useful to determine 'tolerance levels' in advance when it comes to issues such as uneven splitting or mismatched conversions. We talk through some recommended checks to perform during the test in the next section.

The length of each test depends on levels of traffic to the website and the desired minimum detectable effect. At Triptease, we recommend hotels should have at least five thousand conversions over a one-month period to be suitable for an A/B test.

Once your test has achieved statistical significance, you are ready to either accept or reject your hypothesis and make the indicated change on your website.


3. Ensuring a valid test


Nobody wants to run an A/B test only to find out after it's over that a mistake or technical problem has rendered the data invalid.

The complex ecosystem of hotel website technology means that there is unfortunately plenty of opportunity for things to go wrong, whether that be data not fully passing from the booking engine to the testing tool or a missing line of code on a specific page of the website.

At Triptease, each test is subject to a series of stringent checks at every stage of the process, from planning to completion. Only when every check has been completed can a test progress to the next stage, or 'gate'.


There are four types of validity threat that could make the results of a test invalid:

Instrumentation effect: the implementation of the code or test tool is affecting the sampling process.

History effect: external events beyond your control distort the sample (e.g. booking behavior is different during Christmas week).

Selection effect: the sample is not representative of the typical traffic over a booking cycle (e.g. during a promotion or special offer).

Novelty effect: it may take time for users to get used to changes in the user experience.


Fundamentally, you have to make sure that (a) what you wanted to happen has happened (the text you've changed, the product you've installed, etc. is actually appearing as it should to users) and (b) nothing else significant has changed, either on the website or in the behavior of users.

Taking a macro view of the test will help to sharpen your senses when it comes to untrustworthy data. Is overall traffic normal compared to other times of year? Is the baseline conversion rate in line with previous months? If not, you risk making a business decision based on a data set that isn't representative of your business as usual.

A whole host of things can interfere with user behavior and throw it off track from the norm. One of the most common ones we see in hotels running A/B tests is a coincidental marketing campaign, where the hotel makes a large marketing push at the same time as testing a change to their site.

CASE STUDY

During an A/B test on the impact of Price Check on a UK hotel group's website, our testing team noticed an unusual spike in traffic that could have introduced a selection effect. The 'deals' page on the group's website had been linked to from a popular UK promotions & discounts site, leading to an abnormally large amount of traffic.

11k: visitors on the day of the deal
5.2k: normal daily visitors

It was clear that the profile of traffic visiting the offers page was not representative of traffic over the typical booking cycle, so that traffic was excluded from the overall test analysis. Below we give three more examples of things to check during your test.


Does the user data match between different analytics platforms?

Most hotels will already be using another analytics platform (such as Google Analytics) to monitor their data before they begin an A/B test. An easy check to perform to ensure the validity of an A/B test is to make sure that the number of visits, pageviews and sessions is approximately the same in both Google Analytics and the A/B testing platform.

These numbers will likely vary a little for various reasons, such as time zone setup or different caching systems, so it makes sense to set a tolerance level for this check. At Triptease, we allow a variation of 5% between different analytics platforms.


Can your conversion data be corroborated?

In a similar vein, it is important to check that the number of bookings visible in the A/B test data matches the number provided by the booking engine or the party processing bookings on behalf of the hotel.

For tests in which the hypothesis centres on conversion rate improvement, it's crucial that these numbers match as closely as possible. At Triptease, we set our tolerance level for booking matches at 2% - meaning we allow a 2% variance between the data in the test and that from the booking engine.
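Both of these checks boil down to comparing counts from two systems against a tolerance. A minimal sketch - the function and the figures are illustrative, not our production code:

```python
def within_tolerance(count_a: int, count_b: int, tolerance: float) -> bool:
    """True if two platforms' counts differ by no more than `tolerance`."""
    if max(count_a, count_b) == 0:
        return True  # nothing recorded on either side
    return abs(count_a - count_b) / max(count_a, count_b) <= tolerance

# Sessions: Google Analytics vs. the A/B testing tool (5% tolerance)
print(within_tolerance(41_200, 39_800, tolerance=0.05))  # True: ~3.4% apart
# Bookings: test data vs. the booking engine (2% tolerance)
print(within_tolerance(1_000, 950, tolerance=0.02))      # False: investigate
```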


Has your page load time been affected?

While an A/B test is usually focused on altering one specific metric - for example, the booking engine conversion rate - it is important to keep an eye on the rest of the website's performance in order to make sure that the change made for the test isn't having any unwanted side-effects.

One thing we always watch out for in our internal testing is whether there's any variation in page load time between the two sides of the test. It is possible that a change made in the hope of improving conversion rate actually slows down the website's loading speed, which is likely to harm rather than help conversion performance. It also invalidates the test data to some extent; in effect, the 'B' side of the test is being exposed to two changes, which means you can't attribute 100% of any impact to one single change.
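One simple way to watch for this - sketched below with hypothetical load-time samples and a made-up 5% tolerance, not a description of our actual monitoring - is to compare mean load times between the two sides of the test:

```python
from statistics import mean

def load_time_regression(control_ms, variant_ms, tolerance: float = 0.05) -> bool:
    """Flag the variant if its mean page load time is more than
    `tolerance` (e.g. 5%) slower than the control's."""
    slowdown = mean(variant_ms) / mean(control_ms) - 1
    return slowdown > tolerance

# Hypothetical samples of page load times in milliseconds
control = [1180, 1250, 1100, 1320, 1210, 1150, 1270, 1190]
variant = [1350, 1420, 1280, 1500, 1390, 1330, 1460, 1370]
print(load_time_regression(control, variant))  # True: ~15% slower, investigate
```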


4. Hotels vs. OTAs


As a hotelier, it is hard to ignore the constant emphasis on testing from the likes of Booking.com and Expedia. It is no wonder that many hotels are convinced that A/B testing is the right choice for them - they have seen it employed to such great success by their biggest rivals.

At ITB 2018, Booking.com's CEO Gillian Tans made no bones about the extent to which her team are constantly iterating on their product.

"Everything you see in our product, in ouradvertising or on our website has beentested. If it's there, it's been tested."*


The idea of a 'perfectly tested product' is increasingly integral to the organizational identity of many OTAs, and Booking.com in particular. This creates a pressure on hotels that is difficult to ignore. There are several reasons, though, why hotels and OTAs aren't on the same playing field when it comes to testing. But that doesn't mean hoteliers should lose hope.

In this section we take you through some of the main differences between hotels and OTAs when it comes to A/B testing.

* Gillian Tans, 'ITB CEO Interview: Gillian Tans, CEO Booking.com, one-on-one with Philip C. Wolf' (ITB 2018)


Hotels vs. OTAs #1: Test sensitivity

OTAs have the necessary economies of scale to invest heavily in data infrastructure and achieve accurate testing results for minute UX (user experience) changes.

They can change the colour of a button and test the impact, because they can achieve a sample size that enables them to detect tiny uplifts. The vast majority of hotels don't have the necessary traffic to be able to do this.

While measuring tiny uplifts (such as a 0.1% increase in conversion rate) is worthwhile for huge OTAs, it almost certainly isn't for most hotels. Hoteliers should be focused on identifying areas for significant improvement rather than testing each button change.

1,500: number of engineers working at Booking.com (Gillian Tans, ITB 2018)


Hotels vs. OTAs #2: Data scale

Due to their scale and the breadth of their user base, OTAs have a far greater number of reference points by which test results can be assessed. Traffic to OTAs is made up of millions of different types of consumer, which means OTAs can test the impact of a change on any number of different user groups.

OTAs can see which changes have the biggest effect on which type of person, and why. The largest hotel groups can perform this kind of analysis, but a single property or a small group only has a very narrow traffic base to test on. It is difficult to capture many confounding variables on such a small base.

1.7m: number of properties on Booking.com as of March 2018


Hotels vs. OTAs #3: Knowing the guest

This section may feel like slightly dispiriting reading for hoteliers. But it shouldn't be. Hotel websites are about far more than just converting a guest. They're about providing a great experience.

One thing an OTA will never be able to do is actually meet and speak to the person who is booking a room through them. While you might not have millions of visitors to your website each month upon which to test, your hotel is full of potential research subjects.

One of the simplest things a hotelier can do is just ask each guest about their booking experience as they check in.

The hospitality industry is defined by human relationships. While OTAs are only focused on one specific part of the guest 'journey', hotels are necessarily invested in the entire trip - from booking, to pre-stay, to stay, to after the guest has gone. The knowledge and understanding that this gives hotels is often underestimated. A hotelier can apply their 'soft' knowledge of guests to how they optimize their website in a way that OTAs cannot.

A hotel website does not have to be all things to all people, as an OTA does. Really understanding your guests will set you well on the way to an optimized website.


5. Mythbusting


There's a great deal of misunderstanding floating around when it comes to A/B testing, and it's hindering the ability of many hoteliers to make meaningful changes to their websites.

Given the limited time available to most hoteliers, it's all the more important that the time they do have to spend on their website is spent efficiently and economically. For most smaller hoteliers, an A/B test would be neither efficient nor economical.

We've gathered some of the most common A/B testing misconceptions here.


"Whilst a lot of people have heard of or come across A/B testing before, many of them have very different expectations of the process than can actually be achieved in reality."
- Mark Farragher, Data Analyst at Triptease


Myth #1

I should A/B test every cosmetic change I make to my website.

Mythbusting:

The example often given of an A/B testing hypothesis is something like "my conversion rate will increase if I change the color of this button from red to green." Whilst changes to the superficial appearance of certain parts of your website do indeed have an impact on the experience of users, they are unlikely to be revenue drivers in the same way as fundamental changes to the structure, layout or composition of your site.

Analytics company Qubit published a meta-analysis of thousands of their experiments in 2017. Their results showed that changes grounded in behavioral psychology - social proof, abandonment recovery, etc. - had the biggest average impact across all of their tests. Qubit described these changes as those which "alter the users' perception of the product's value." So, a feature like Triptease's Price Check falls into this category (as it alters the user's perception of where they can obtain the best price), whereas a button color change does not.


Qubit's meta-analysis also demonstrated that cosmetic changes "do not constitute an effective strategy for increasing revenue." Though often the subject of A/B tests due to the relative ease with which businesses can implement them on their website (cosmetic changes often don't require the input of developers), Qubit suggests that "the probability that these simple UI changes have meaningful impact on revenue is very low."

Rather than looking to changes in colour or wording as change agents for your business, Qubit recommends just "choosing a design and sticking with it based on preference or through a qualitative process."

"Cosmetic changes, such as changing the color of buttons, do not constitute an effective strategy for increasing revenue [...] The probability that these simple UI changes have meaningful impact on revenue is very low."
- Qubit Digital Ltd


Myth #2

My test has shown a 10% uplift, so I'm going to see a 10% uplift in revenue from my website this year.

Mythbusting:

Unfortunately, this is not the case. The results of an A/B test only inform the tester of whether they should reject their null hypothesis (e.g. "making this change will not have a positive impact on my conversion rate").

The results do not tell you the exact amount by which your change will increase your chosen metric, only whether it will be increased at all. If your test is positive and you accept your alternate hypothesis (e.g. "making this change will have a positive impact on my conversion rate"), the actual uplift could well be higher or lower than the observed uplift. It is also possible that the actual uplift will vary with time.
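One way to see that uncertainty is to put a confidence interval around the observed uplift. The normal-approximation sketch below uses made-up numbers - a test that observed a '10% uplift' - and shows how wide the plausible range for the true uplift really is:

```python
from statistics import NormalDist

def uplift_confidence_interval(conv_a: int, n_a: int, conv_b: int, n_b: int,
                               confidence: float = 0.95) -> tuple:
    """Normal-approximation CI for the relative uplift (%) of B over A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff_low, diff_high = p_b - p_a - z * se, p_b - p_a + z * se
    return diff_low / p_a * 100, diff_high / p_a * 100

# Hypothetical test: 2.0% control vs. 2.2% variant - an observed "10% uplift"
low, high = uplift_confidence_interval(1_000, 50_000, 1_100, 50_000)
print(f"observed uplift 10%, 95% CI roughly {low:.0f}% to {high:.0f}%")
# The true uplift could plausibly be anywhere from ~1% to ~19%.
```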


"We often see people making the mistake of assuming that a 10% uplift (regardless of samplesize) observed during the test will guarantee an uplift of 10% in reality. People are often surprisedto see that A/B test results are not replicated on their bottom line. 

An A/B test provides confidence that a change will likely have a positive impact on a certainmetric. It does not guarantee a positive impact.

If you stop an A/B test when you see a positive uplift (instead of waiting for a sufficient samplesize), you increase the risk of implementing a change that will actually have a negative impact onyour key metric.”

Evolution AI

Consultants to Triptease A/B testing process

QUOTE

39


Myth #3

I've started my test but can already see my conversion rate is decreasing on the variant side - I should turn my test off.

Mythbusting:

This is not a good idea. Conversion rates are volatile and can fluctuate greatly in the short term. We've observed tests which started with strikingly different uplift values to the final result. It's very important that an A/B test is left to run long enough to even out the impact of random variations in your data.

In the early stages of a test, you should be focused on (a) the behavior of the test tool and whether it is working normally, and (b) the change you have made and whether it is appearing as you expected. The change in the metric itself is not as important at this stage.
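The risk of reacting to early numbers can be quantified by simulating 'peeking': checking an A/A test every day and stopping the moment the difference looks significant. In the illustrative sketch below (traffic and conversion figures are assumptions), that habit inflates the nominal 5% false positive rate to roughly 20%:

```python
import random
from statistics import NormalDist

def peeking_declares_winner(days: int = 30, visitors_per_day: int = 2_000,
                            rate: float = 0.02) -> bool:
    """Simulate an A/A test, 'peeking' daily and stopping at p < 0.05."""
    conv_a = conv_b = n = 0
    for _ in range(days):
        n += visitors_per_day
        conv_a += random.binomialvariate(visitors_per_day, rate)  # Python 3.12+
        conv_b += random.binomialvariate(visitors_per_day, rate)
        p_pool = (conv_a + conv_b) / (2 * n)
        se = (2 * p_pool * (1 - p_pool) / n) ** 0.5
        if se == 0:
            continue  # no conversions yet on either side
        z = abs(conv_a - conv_b) / n / se
        if 2 * (1 - NormalDist().cdf(z)) < 0.05:
            return True  # stopped early on a phantom 'winner'
    return False

trials = 1_000
fp_rate = sum(peeking_declares_winner() for _ in range(trials)) / trials
print(fp_rate)  # typically ~0.2: daily peeking turns 5% errors into ~20%
```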


Myth #4

I read about a hotel where they ran an A/B test on the position of the 'Book Now' button and saw a 15% uplift in conversion. I'm going to make the same change on my website to achieve the same uplift.

Mythbusting:

It's never wise to assume that one specific change made by another hotel will work in the exact same way for you. While we wholeheartedly recommend speaking to your peers, asking for (and giving) recommendations, and sharing advice, this should be used to inform your own thinking and strategy rather than as a blueprint for one-size-fits-all success. There are any number of reasons why a successful change on one website will not be successful on another: the website structure, the expectations of the audience, the type of hotel, even the timing of the change.

If you have the necessary scale and capability to conduct a test on the suggested change then this should always be your first port of call. If you don't, there is nothing to stop you trying it out (with a knowledge of the potential risks) if you really think it is the right decision for your audience. Just don't be surprised if your website doesn't experience exactly the same results as the example you read about.


Conclusion


We hope that this report has served to illuminate some of the complexities of A/B testing on hotel websites. While A/B testing won't be a viable option for every hotel, it is entirely a good thing that hoteliers in general are becoming much more aware of the importance of testing and optimizing their websites.

For those who do not have the necessary scale to A/B test, it may be useful to read Triptease's Spotlight on... Online Guest Experience from October 2017, where we cover alternate forms of website testing.

As the industry progresses towards a new era of automation and hyper-connectivity, it is all the more important for hoteliers to be paying attention to their data, whether in a testing environment or elsewhere. One key way that smaller hotels can get the most out of their data is by working with providers who can aggregate and benchmark an individual's performance against an entire database. At Triptease, we are working hard to surface the value of our data to every hotelier we work alongside.

Optimizing your website conversion rate should be a process that begins and ends with listening to your guests, whether that be by analyzing their user data or having an individual conversation with everyone who walks into your lobby.

A/B testing is not the only way of working out how to improve your website - but, if you're going to do it, you have to get it right.


Sources & further reading

Yvonne Koleczek, 'Optimizely's decision to ditch its free plan suggests A/B website testing is dead,' VentureBeat

Neil Patel & Joseph Putnam, 'The definitive guide to conversion optimization,' Quicksprout

Will Browne & Mike Swarbrick Jones, 'What works in e-commerce - a meta-analysis of 6700 online experiments,' Qubit Digital Ltd

Gillian Tans & Philip C. Wolf, 'ITB CEO interview,' ITB Berlin

Clare Hutchison, 'Spotlight on... Online Guest Experience,' Triptease
