Using machine learning to determine drivers of bounce and conversion


Velocity 2016, Santa Clara

Pat Meenan (@patmeenan)

Tammy Everts (@tameverts)

What we did (and why we did it)

Get the code: https://github.com/WPO-Foundation/beacon-ml

Deep learning

[Slide: neural-network diagram, layers connected by weights]

Random forest
• Lots of random decision trees

Vectorizing the data
• Everything needs to be numeric
• Strings converted to several inputs as yes/no (1/0)
• e.g. Device manufacturer: “Apple” would be a discrete input
• Watch out for input explosion (UA String)
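A sketch of that vectorizing step in plain Python — the field and column names here are illustrative, not from the talk's beacon-ml code:

```python
# Sketch: expand a categorical string field into one yes/no (1/0)
# input per distinct value (one-hot encoding).

def one_hot(records, field):
    """Turn records[field] into a binary column per distinct value."""
    values = sorted({r[field] for r in records})
    columns = ["%s=%s" % (field, v) for v in values]
    rows = [[1 if r[field] == v else 0 for v in values] for r in records]
    return columns, rows

records = [{"manufacturer": "Apple"},
           {"manufacturer": "Samsung"},
           {"manufacturer": "Apple"}]
columns, rows = one_hot(records, "manufacturer")
# columns -> ['manufacturer=Apple', 'manufacturer=Samsung']
# rows    -> [[1, 0], [0, 1], [1, 0]]
```

This is where the input explosion warning bites: a field with a handful of distinct values adds a handful of columns, but a raw UA string with thousands of distinct values would add thousands.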

Balancing the data
• 3% conversion rate
• 97% accurate by always guessing “no”
• Subsample the data for a 50/50 mix
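A minimal sketch of that 50/50 subsampling, assuming conversions are the rare class (labeled 1) and non-conversions the majority (labeled 0):

```python
import random

def balance(rows, labels, seed=42):
    """Subsample the majority class so both labels appear 50/50."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng = random.Random(seed)
    keep = pos + rng.sample(neg, len(pos))  # assumes 0s are the majority
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]

# With a ~3% conversion rate, most of the "no" rows get thrown away:
rows = [[i] for i in range(100)]
labels = [1] * 3 + [0] * 97
b_rows, b_labels = balance(rows, labels)
# 6 rows survive: 3 converted, 3 not
```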

Validation data
• Train on 80% of the data
• Validate on 20% to prevent overfitting
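A hand-rolled sketch of the 80/20 split; in practice scikit-learn's train_test_split does the same job in one call:

```python
import random

def split(rows, labels, train_frac=0.8, seed=0):
    """Shuffle, then hold out the tail for validation."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * train_frac)
    train, val = order[:cut], order[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in val], [labels[i] for i in val])

rows = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
x_train, y_train, x_val, y_val = split(rows, labels)
# 8 training rows, 2 validation rows
```

The validation rows never touch training; a model that scores well on them is less likely to have simply memorized the training set.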

Smoothing the data
• ML works best on normally distributed data

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)  # fit on training data only
x_val = scaler.transform(x_val)          # reuse the training mean/variance

Input/output relationships
• SSL highly correlated with conversions
• Long sessions highly correlated with not bouncing
• Remove correlated features from training
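One way to catch and drop such leaky inputs before training — a sketch of the idea, not the talk's actual code, with made-up column names:

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def drop_leaky(columns, rows, labels, threshold=0.9):
    """Keep only columns whose |correlation| with the label is below threshold."""
    keep = []
    for j, name in enumerate(columns):
        col = [r[j] for r in rows]
        if abs(pearson(col, labels)) < threshold:
            keep.append(name)
    return keep

columns = ["ssl", "timer_ms"]
rows = [[1, 5], [0, 7], [1, 7], [0, 6]]
labels = [1, 0, 1, 0]
keep = drop_leaky(columns, rows, labels)
# "ssl" mirrors the label exactly (correlation 1.0), so it is dropped
```

A feature that almost equals the answer (SSL for conversions, session length for bounces) makes the model look brilliant while teaching it nothing useful about the remaining inputs.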

Training deep learning

model = Sequential()
model.add(...)
model.compile(optimizer='adagrad',
              loss='binary_crossentropy',
              metrics=["accuracy"])
model.fit(x_train, y_train,
          nb_epoch=EPOCH_COUNT,
          batch_size=32,
          validation_data=(x_val, y_val),
          verbose=2,
          shuffle=True)

Training random forest

clf = RandomForestClassifier(n_estimators=FOREST_SIZE,
                             criterion='gini',
                             max_depth=None,
                             min_samples_split=2,
                             min_samples_leaf=1,
                             min_weight_fraction_leaf=0.0,
                             max_features='auto',
                             max_leaf_nodes=None,
                             bootstrap=True,
                             oob_score=False,
                             n_jobs=12,
                             random_state=None,
                             verbose=2,
                             warm_start=False,
                             class_weight=None)
clf.fit(x_train, y_train)

Feature importances
clf.feature_importances_
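A small sketch of turning clf.feature_importances_ into a ranked list. The names and scores below are invented for illustration — they are not the talk's measured results:

```python
def rank_features(names, importances, top=5):
    """Pair each input with its importance score and sort, biggest first."""
    ranked = sorted(zip(names, importances), key=lambda p: p[1], reverse=True)
    return ranked[:top]

# With a trained forest this would be:
#   rank_features(feature_names, clf.feature_importances_)
names = ["dom_ready", "num_scripts", "dns_lookup", "start_render"]
scores = [0.31, 0.22, 0.04, 0.05]  # hypothetical values
top2 = rank_features(names, scores, top=2)
# [('dom_ready', 0.31), ('num_scripts', 0.22)]
```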

What we learned

What’s in our beacon?
• Top-level – domain, timestamp, SSL
• Session – start time, length (in pages), total load time
• User agent – browser, OS, mobile ISP
• Geo – country, city, organization, ISP, network speed
• Bandwidth
• Timers – base, custom, user-defined
• Custom metrics
• HTTP headers
• Etc.

Conversion rate


Bounce rate


Finding 1: Number of scripts was a predictor… but not in the way we expected

[Chart: Number of scripts per page (median)]

Finding 2: When entire sessions were more complex, they converted less

Finding 3: Sessions that converted had 38% fewer images than sessions that didn’t

[Chart: Number of images per page (median)]

Finding 4: DOM ready was the greatest indicator of bounce rate

[Chart: DOM ready (median)]

Finding 5: Full load time was the second greatest indicator of bounce rate

[Chart: timers_loaded (median)]

Finding 6: Mobile-related measurements weren’t meaningful predictors of conversions

Finding 7: Some conventional metrics were (almost) meaningless, too

Feature importance (rank out of 93):
• DNS lookup – 79
• Start render – 69

Takeaways

1. YMMV
2. Do this with your own data
3. Gather your RUM data
4. Run the machine learning against it

Thanks!