Prediction - the future of game analytics - white paper

Prediction: The Future of Game Analytics (It's Already Here, It's Just Not Very Evenly Distributed)

WHITE PAPER

2014. Ninja Metrics

www.ninjametrics.com



Table Of Contents

Get Distributed
The Value of Big Data
Understanding Historical Data
Predictive Analytics
Getting Your Action Items On
Red Ball vs. Blue: Real-Time Analytics
Predictive is Not Real-Time
Data Models and Why Your Education Probably Wasn't Good Enough





Get Distributed

The futurist William Gibson wrote that "the future is already here, it's just not very evenly distributed."

That's both social commentary and cautionary tale in one. It tells us that some technological advances feel like science fiction simply because most people don't yet know about them or have access to them.

In this white paper, we will outline some of those advances in the field of data analytics for games, and encourage you to be among those with access.

The Value of Big Data

First, Understand Users

"Big Data" is the flavor of the moment, but it's not always clear what the term means. We've worked with dozens of game companies, and more often than not "big data" means the kitchen sink approach, i.e. "let's just collect everything in case we need it."

That's fair enough, but using big data intelligently involves more than just a big database. It means collecting the right kinds of data to answer specific questions that address specific business needs. It's all doable, but it requires planning and smart decisions up front that take into account the nature of your customers.

Doctor House Doesn't Believe His Patients and Neither Should You

You may be familiar with the popular TV show House. The principal character, Dr. Greg House, is a curmudgeonly mentat of a medical analyst who never trusts his patients to tell him the truth. Weighed down by their own egos, cultures, superstitions and fears, House's patients regularly lie to themselves and to him, obscuring key facts of their illness and making diagnosis difficult.

House teaches us that listening to your patients may not always be the best way to help them.

Listening to players talk about your game is flawed in the same way. First, there's the problem of sampling. Listening to the players who scream the loudest, i.e. your bored, lonely, raging forum trolls, does not yield a scientifically representative sample of all players. Their complaints are proof that a phenomenon exists, but not that it's common, or even important.

Secondly, these people may be flat out wrong in their observations, not because they're evil, stupid or angry (though those are always possible), but because they are a poor gauge of even their own behavior.

Consider the following classic case from social science: How do you know how much TV someone watches? The most common approach to answering this question is to simply ask them. The problem is that they tend to be wrong.

Ask yourself how many hours of TV you watched last week or last month. Your estimate will usually be wrong. People who actually watched 30 hours of TV will answer 25 hours, 32 hours, 35 hours, etc. Sometimes they are wrong due to ego and worrying about appearances, like House's patients, but sometimes they just can't estimate well.

Webb's Unobtrusive Observation

Eugene Webb (1966) was a pioneering social scientist in the 1960s who coined the term "unobtrusive observation." His insight was that it's a lot better to watch people do something and measure it yourself than to ask them to recall it. He also made sure that the watching didn't interfere with the behavior.

Webb's classic case looked at the popularity of museum exhibits. When asked which exhibit they liked the best, museum visitors would consistently give answers that made them sound intelligent and wise. For example, many would say they liked the exhibit on atomic structures even though this exhibit room was consistently devoid of visitors. The visitors were clearly lying to feel better about themselves and appear wiser.

So how could the museum learn the truth without following the guests around?

Webb's simple approach was novel and revolutionary. His team collected unobtrusive data by counting the number of nose smudges on the glass cases and the wear of the floor tiles in front of the exhibits. Those turned out to be far better gauges of actual popularity. It's a simple insight, and one that can be brought to bear in the realm of big data and games.

Game logs represent an analyst's dream for data quality and purity. If a game is instrumented intelligently, it will yield a flawless record of every action, transaction and interaction the players make. And this is what powers good big data. Once it's available, the challenge then becomes how to make intelligent use of it.

Understanding Historical Data

What Just Happened?

The past is the best building block for understanding the present as well as the future. On the simplest level, any game company needs to know how many customers it has, how much they've spent, and what they've done.

Ninja Metrics labels this part of our dashboard "Basic Metrics." It contains all of the basic metrics based on past behavior. Metrics such as Daily Active Users (DAU), Average Revenue per User (ARPU) and K-factor can all be found here.

Such basic metrics are invaluable for understanding aggregate trends, and if they can be augmented with add-on functionality like A/B testing and segmentation, they can become quite handy indeed. For example, if an A group gets one kind of content (or mechanic, or CRM intervention, etc.) and a B group another, then the ARPUs of those two groups can be compared to see which performed better.
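As a rough illustration of the metrics above, here is a minimal Python sketch that computes DAU and compares ARPU across an A group and a B group. The record layout, the sample values and the function names are assumptions made for this example; they are not the Ninja Metrics data format.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records: (day, user_id, ab_group, revenue_in_dollars).
events = [
    (date(2014, 3, 1), "u1", "A", 0.00),
    (date(2014, 3, 1), "u2", "B", 4.99),
    (date(2014, 3, 1), "u3", "A", 1.99),
    (date(2014, 3, 2), "u1", "A", 0.99),
    (date(2014, 3, 2), "u4", "B", 0.00),
]

def daily_active_users(events):
    """Count distinct users seen on each day."""
    users_by_day = defaultdict(set)
    for day, user_id, _group, _revenue in events:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}

def arpu_by_group(events):
    """Total revenue divided by distinct users, per A/B group."""
    revenue = defaultdict(float)
    users = defaultdict(set)
    for _day, user_id, group, amount in events:
        revenue[group] += amount
        users[group].add(user_id)
    return {group: revenue[group] / len(users[group]) for group in users}

dau = daily_active_users(events)
arpu = arpu_by_group(events)
```

With real data you would also want to check that a difference in ARPU between the groups is larger than chance variation before acting on it.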

Instrumenting for Historical Data

To get that historical data, instrumenting your game is critical. Instrumenting your game is all about making sure these metrics are actually recorded as part of the game.

Analytics companies will tell you which events to capture and how to report them, and it's pretty straightforward. For example, when an event happens (say a user logs in), the game system needs to be able to say "User 3482 logged in at time XX:XX:XX." The analytics company supplies a piece of code that then fires this event off to its servers, typically up in the cloud, where the algorithms and reporting are applied. This code is supplied in a library that essentially says "wherever you have your log-in event happening, put this line of code here." If the events are already instrumented, this process shouldn't take more than a day at most.
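To make "put this line of code here" concrete, the following Python sketch shows what an instrumented log-in and level-up might look like. `EventQueue`, `build_event` and the field names are hypothetical stand-ins invented for this example, not a real vendor library; an actual client would send the payload to the analytics service rather than just returning it.

```python
import json
import time

def build_event(user_id, event_name, timestamp=None, **properties):
    """Return one analytics event as a plain dictionary."""
    return {
        "user_id": user_id,
        "event": event_name,
        "timestamp": time.time() if timestamp is None else timestamp,
        "properties": properties,
    }

class EventQueue:
    """Hypothetical stand-in for a vendor library: it buffers events locally."""

    def __init__(self):
        self.pending = []

    def track(self, user_id, event_name, **properties):
        self.pending.append(build_event(user_id, event_name, **properties))

    def flush(self):
        """Serialize and clear the buffer; a real client would send this over the network."""
        payload = json.dumps(self.pending)
        self.pending = []
        return payload

analytics = EventQueue()

# The one line added wherever the game handles a log-in:
analytics.track(3482, "login")

# A level-up event, needed for metrics such as "players on level 8":
analytics.track(3482, "level_up", from_level=7, to_level=8)
```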

Obviously, if your analytics software requires data for a metric you haven't instrumented for, you'll need to work that into your development cycle. For example, if you want to know how many players are on level 8 and you don't collect an event like "advanced from level 7 to 8," that metric isn't going to show up. But in general, this is not a complicated process and simply requires time and planning.

So clearly, having a solid foundation of historical player data is hugely important. It supplies you with data to populate the familiar basic metrics of DAU, ARPU, ARPPU, Churn Rate and many others. But these metrics have one key limitation: they're based on historical data and therefore reactive by their very nature.

To be proactive, you need to peer into the future using big data and predictive analytics. It isn't easy, but it's not magic either.

Predictive Analytics

There's a Minority Report For a Reason

If you've seen the movie version of Philip K. Dick's Minority Report, you saw a future society where the police use people known as "precogs" to predict the future and stop crimes before they happen. It's a fun premise, and of course it goes spectacularly wrong. It turns out that predictions made by the precogs aren't 100% accurate, and when one of them screams out "NO!" it's considered a minority report. The minority report throws doubt on the validity of the prediction. In other words, the police might be arresting the wrong person.

It seems that with great power indeed comes great responsibility. And sometimes we screw it up.

So let's not screw this up.

Predictive Analytics is Not Magic But It's Not 100% Either

With today's advances in big data analytics, the ability to accurately predict player behavior is a reality.

The science can get confusing, but here's a fairly simple way of understanding it.

Let's say a computer watches all of the events that happen in a game and it starts to recognize patterns. Some patterns are repeated, while others aren't. When a pattern is repeated, the machine learns it and starts looking for it to occur again, this time making a prediction about what is going to come next.

Say the computer sees A-B-C-D, over and over. After a while it recognizes it. Then it sees A-B-C, and you ask it what is going to happen next. It says D, of course, but it can also tell you how likely this prediction is to be correct. How can it do that? Well, when it looked into the past, it wasn't always A-B-C-D. Occasionally it was A-B-C-Q. So the computer also starts to understand likelihood, and can tell you how often that guess has turned out right. That's the prediction. No magic, really.
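The A-B-C-D idea can be sketched in a few lines of Python. This is a deliberately simple frequency count over made-up sequences, intended only to illustrate the reasoning; it is not the modeling approach Ninja Metrics uses, and the function names are invented for this example.

```python
from collections import Counter, defaultdict

def learn_patterns(sequences, context_length=3):
    """Count which event followed each run of `context_length` events."""
    followers = defaultdict(Counter)
    for sequence in sequences:
        for i in range(len(sequence) - context_length):
            context = tuple(sequence[i:i + context_length])
            followers[context][sequence[i + context_length]] += 1
    return followers

def predict_next(followers, recent_events):
    """Return (most likely next event, share of past cases where it followed)."""
    counts = followers.get(tuple(recent_events))
    if not counts:
        return None, 0.0
    event, hits = counts.most_common(1)[0]
    return event, hits / sum(counts.values())

# Made-up history: A-B-C was followed by D three times and by Q once.
history = [list("ABCD"), list("ABCD"), list("ABCQ"), list("ABCD")]
model = learn_patterns(history)
prediction, likelihood = predict_next(model, ["A", "B", "C"])
```

With this made-up history the sketch predicts D with a likelihood of 0.75, because D followed A-B-C in three of the four past cases.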

How Right Is It?

There's a technique used to test and verify these predictions. It's called (bear with me here) cross-validation.

In its simplest form, here's how it works:

Let's say you have a really big data set. The computer takes the whole thing and splits it into two halves. With the first half it looks for patterns and builds up its model. By the end, it says "We see A-B-C and then D happens 75% of the time." Now let's take that model and see if it's accurate in the second, totally untouched half of the data. If A-B-C-D happens 75% of the time in this data as well, we all start feeling pretty good about the model's predictive power. But we start thinking of it as being 75% accurate, not 100%.
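The split-in-half check described above can be sketched as follows. The data is made up, and the "model" is just the most common follower of A-B-C; both are assumptions chosen to keep the example short. Fuller cross-validation repeats this with several different splits and averages the results.

```python
import random
from collections import Counter

def split_in_half(records, seed=0):
    """Shuffle a copy of the records and split it into two halves."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]

def most_common_follower(records):
    """'Train': find the event that most often follows A-B-C."""
    return Counter(records).most_common(1)[0][0]

def hit_rate(records, predicted):
    """Share of records where the predicted event actually followed."""
    return sum(1 for actual in records if actual == predicted) / len(records)

# Made-up data: each record is the event that followed one A-B-C run.
followers = ["D"] * 750 + ["Q"] * 250

train, test = split_in_half(followers)
predicted = most_common_follower(train)   # learned from the first half only
train_rate = hit_rate(train, predicted)   # how often it held in the first half
test_rate = hit_rate(test, predicted)     # how often it holds in the untouched half
```

If `test_rate` comes out close to `train_rate`, the pattern held up on data the model never saw, which is the reassurance the technique is meant to provide.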

Why not 100%?

The short answer is that the world has a lot of moving pieces. For example, you might make a prediction that Bob Smith will spend $20 tomorrow. But what if poor Bob gets hit by a bus or has a big breakup with his fiancée? Those events could very likely alter his behavior (especially the getting hit by a bus). And you certainly didn't plan for those data points in your model. So when Bob doesn't show up and spend his $20, you scratch your head. Was your model wrong? No, just incomplete.

Secondly, it's very easy to cheat with these models. For example, by ignoring false positives and false negatives in the data, anyone could just tell you that all of the players will spend $20 tomorrow, and they'd be guaranteed to have covered everyone who does. Counting only the players they were right about, they could report 100% accuracy.

This happens more often than you'd think in scientific circles, and when it happens it's labeled junk science. And rightly so.

Putting Trust in the F-Score

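So, to be responsible, Ninja Metrics advocates using something called an F-score. This takes one stat that accounts for false positives (precision) and another that accounts for false negatives (recall) and combines them with a harmonic mean, a kind of average that is pulled down by whichever of the two is worse. The result can be expressed as a percentage and is hard to game, because a model can't score well by ignoring one kind of error. To score well, you need genuinely good performance.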
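To show why this kind of score resists the "everyone will spend" trick, here is a small Python sketch using the standard F1 definition, the harmonic mean of precision and recall. The player counts are made up, and the source does not say which F-score variant is used in practice, so F1 is an assumption here.

```python
def f_score(predicted, actual):
    """F1: harmonic mean of precision and recall for boolean labels."""
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Made-up outcome: 10 of 100 players actually spend tomorrow.
actual = [True] * 10 + [False] * 90

# The "cheat": predict that every player will spend.
predict_everyone = [True] * 100

# A more careful model: flags 10 players, 8 of them correctly.
careful = [True] * 8 + [False] * 2 + [True] * 2 + [False] * 88

cheat_score = f_score(predict_everyone, actual)
careful_score = f_score(careful, actual)
```

The cheat catches every spender but is wrong about 90 of the 100 players it flags, so its score is about 0.18. The careful model scores 0.80.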

For reference, F-scores have been used for a long time in churn models by the telecommunication companies. For industries like these, where the data isn't particularly detailed, high quality F-scores tend to hit the .35 to .45 range on average.

With game data, you can get much higher F-scores because the data detail in gaming is really, really good. A .50 to .70 score is very good. A .80 to .90 is spectacular. Anything over that is rare indeed. Our models tend to hit .85 to .90 on average, and that's after 7 years of R&D and academic specialization in game player data algorithms.

Still, at the end of the day, seeing that confidence value in a predictive model is key. Any consultant or company that gives you a predictive value without that is no better than voodoo. Science is all about transparency and provability. Insist on it.

Getting Your Action Items On

Now that you have a value as well as a level of confidence in it, how do you turn it into actions?

In other words, how do you turn predictive insight into targeted promotions or player interventions?





Predicting Player Churn Rate

Say you're trying to predict player churn rate. You run the numbers and it comes out with an overall accuracy of .85 (we'll say % from here on out, but an F-score is best). An overall number is a great start, but it's not very actionable. What is actionable is a score for an individual player.

Let's consider two players, A and B:

Player A has a probability of quitting of 60% in the next week, and the model is 90% accurate.

Player B has a probability of quitting of 80% in the next week, and the model is 20% accurate.

There are a couple of ways of working with this.

One is simple. Just set a threshold percentage and say that anyone over this threshold is worth taking action on, period. That's fine, but it doesn't take into account the fact that you have scarce resources that you can devote to keeping that player. Also, it doesn't prevent you from offering an inappropriate promotion or intervention to a player that wasn't even planning on quitting the game in the first place. Imagine the ham-handedness of a "Don't Go!" promotion aimed at a loyal and happy player.

The second way of thinking about this is to multiply the two percentages together. Player A is 60% likely to quit, and we're 90% sure. Taken together, if we had, say, 100 players like this, about 54 of them (90% of 60%) would quit next week. It's a little like opportunity cost. Just consider these players 54% likely to leave and react appropriately. By the same arithmetic, Player B comes out at only 16% (20% of 80%), despite the higher raw probability.
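The multiplication can be written out as a short Python sketch. The function names and the 50% action threshold are assumptions chosen for illustration; the text does not prescribe a particular cutoff.

```python
def adjusted_churn_probability(churn_probability, model_accuracy):
    """Discount a churn prediction by how often the model is right."""
    return churn_probability * model_accuracy

def worth_acting_on(churn_probability, model_accuracy, threshold=0.5):
    """Apply a threshold to the discounted probability, not the raw one.

    The 0.5 default is a hypothetical cutoff for this example.
    """
    return adjusted_churn_probability(churn_probability, model_accuracy) >= threshold

player_a = adjusted_churn_probability(0.60, 0.90)  # Player A from the text
player_b = adjusted_churn_probability(0.80, 0.20)  # Player B from the text
```

Player A works out to roughly 0.54 and Player B to roughly 0.16, so with this cutoff only Player A would trigger an intervention.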

Predicting Player Lifetime Value

Prediction of a player's future spending has always been elusive. But with predictive modeling, a player's lifetime value, or LTV, can be reduced to just another predictive metric. You are predicting both spending and lifespan.

Let's say there are two new players:

Player C has an LTV of $150, a likelihood of churning out of 30%, and the model is saying it's 90% accurate.

Player D has an LTV of $80, a likelihood of churning out of 90%, and the model is likewise 90% accurate.

What are they each worth? First, both models are pretty accurate, so let's just take them as trustworthy and ignore the x .9 part.

Simply multiply the LTV by the churn probability. Player C is $150 x 30%, or $45 of expected value at stake, and Player D is $80 x 90%, or $72 of expected value at stake.

Which player do you want to spend resources to keep?

If all you're going to do is send a zero-cost email, by all means send it to them both. But when are you willing to give away something more costly like a free item, a free month of play or a fruit basket in the mail? Compare the cost of those interventions with the expected value and make a rational decision. If a fruit basket costs $50 and will save the player, then it makes rational sense to send it to Player D, but not C.
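The fruit basket decision is simple arithmetic, sketched here in Python. The function names are invented for this example, and the sketch keeps the text's simplifying assumption that the intervention reliably keeps the player.

```python
def expected_value_at_risk(ltv, churn_probability):
    """Revenue expected to be lost if nothing is done: LTV times churn probability."""
    return ltv * churn_probability

def intervention_pays_off(ltv, churn_probability, cost):
    """True when the value at risk exceeds the cost of the intervention.

    Assumes, as the text does, that the intervention reliably keeps the player.
    """
    return expected_value_at_risk(ltv, churn_probability) > cost

# Players C and D from the text: (LTV in dollars, churn probability).
players = {"C": (150.0, 0.30), "D": (80.0, 0.90)}
fruit_basket_cost = 50.0

decisions = {
    name: intervention_pays_off(ltv, churn, fruit_basket_cost)
    for name, (ltv, churn) in players.items()
}
```

Player C has $45 at risk, less than the $50 basket, so the basket does not pay off. Player D has $72 at risk, so it does. If an intervention only works some of the time, the value at risk would also need to be multiplied by its success rate before comparing it with the cost.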

Ninja Metrics' Katana software provides all of these numbers to enable exactly these kinds of transparent and rational decisions. We know that game developers will have their own tastes for how they handle interventions (community managers, customer service reps, email campaigns, push notification systems, etc.) and we support whatever systems we learn about. The best possible analytics system will allow you to set your own thresholds, decide on your own course of action, connect directly to the vendor or system for that action, then track the results.

Red Ball vs. Blue: Real-Time Analytics

Back to Minority Report: you may recall that their system etched the future criminal's name on a little wooden ball. If the ball was blue, the crime was going to occur some time off in the future. The police had time to plan a proper response. If it was red, the crime was imminent and the police had to react immediately.

In the game world, this would be like viewing a list of possible quitters with a high probability of quitting in the next 48 hours and then running a quick intervention to reduce churn.

With Ninja Metrics software, blue and red ball analysis is possible. You can use dashboards to monitor and review and think big picture, but you can also automate these decisions.

For example, if you know that Player X is going to quit in two weeks, great, you have some time for management to take action. But what about Player Y, who's quitting tomorrow? How nimble are you? You should have any "red ball" players set off a trigger that sends a notification to the right person.

For example, any time a player with an LTV over $5 is more than 60% likely to quit, with a confidence level of 70%, send an urgent email to Lisa Brown, the game's community manager.
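A trigger like the one just described might be sketched in Python as follows. The `Prediction` record, the function names and the sample players are hypothetical; the thresholds are the ones from the example above, with the 70% confidence level read as a minimum. Delivering the message (email, push notification and so on) is left to whatever system you already use.

```python
from collections import namedtuple

# Hypothetical per-player output of a predictive model.
Prediction = namedtuple(
    "Prediction", ["player_id", "ltv", "churn_probability", "confidence"]
)

def is_red_ball(p, min_ltv=5.0, min_churn=0.60, min_confidence=0.70):
    """The rule from the text: LTV over $5, churn over 60%, confidence of at least 70%."""
    return (
        p.ltv > min_ltv
        and p.churn_probability > min_churn
        and p.confidence >= min_confidence
    )

def urgent_alerts(predictions, recipient):
    """Build one alert message per red ball player; sending is left to the caller."""
    return [
        f"To {recipient}: player {p.player_id} is "
        f"{p.churn_probability:.0%} likely to quit"
        for p in predictions
        if is_red_ball(p)
    ]

# Made-up predictions for three players.
predictions = [
    Prediction("X", ltv=40.0, churn_probability=0.85, confidence=0.90),
    Prediction("Y", ltv=3.0, churn_probability=0.95, confidence=0.90),
    Prediction("Z", ltv=60.0, churn_probability=0.40, confidence=0.80),
]
alerts = urgent_alerts(predictions, "Lisa Brown")
```

In this made-up batch only player X passes all three conditions: Y's LTV is too low to justify an urgent alert, and Z is not likely enough to quit.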

Predictive is Not Real-Time

Nearly every analytics company says they offer real-time analytics. To some extent, this is statistical and marketing sleight of hand. Yes, we can all process your users and tell you how many logged in, or are on level 5, or have spent money. That's easy, and it can be as fast as a (very) big Excel spreadsheet.

But predictive modeling is definitely not instantaneous. It takes time to run these models. How long depends on the number of players and the number of data points you're considering, not to mention how well your analytics team deals with Hadoop and MapReduce!

If your game has 10,000 players, these models are going to be very fast. But if your game has 10 million players, it might take a few hours. And heavy duty models like Ninja Metrics' Social Value system take longer still.

The key is to consider the trade-off between the cost of running your models and your ability to quickly act on their results. We suggest running them daily because of the trade-offs of processing costs and many companies' inability to act on things any faster. Remember the red ball. So, if you're ninja nimble, and can afford extra costs, consider running your models more frequently.

It's about actionability at the end of the day. Given a specific result from the predictive models, would you act? If you had more actionable data, more often, would it be worth it to your business to run a promotion or intervention? If so, consider springing for it. If not, don't waste resources.





Data Models and Why Your Education Probably Wasn't Good Enough

Most marketers are trained in a business school, or perhaps some kind of social science program like communication, sociology, etc. In these programs, we learn useful statistical tools such as correlations, ANOVAs, and most often, regression models.

In a regression model, we have an outcome variable (dependent variable) and some number of predictors (independent variables). Applying this to calculating player churn, we might have a model that looks something like:

Quitting = Gender + Time Spent + Character Type + Error

And we'd take a look at the overall stats for the model and determine if we thought it was trustworthy enough. Maybe we'd look at the standard error of the model, maybe the r-squared, etc. If it's good enough, we'll take a look at the coefficients on the independent variables, see which ones reached statistical significance, how big they are, and in what direction. Fair enough.

Here's the problem.

Those models aren't very good. They're pretty good as far as they go: sometimes you get an r-squared of around .46 and you're reasonably confident given the tools you used, i.e. statistical models based on sampling.

But if we want models that reach accuracy levels of over .5, we're going to have to leave the world of old-school statistical modeling and get with big data.

And big data is the realm of computer science, not social science.

Do You Want the Good News or the Bad News?

The good news is that computer science has models with power that, frankly, beat the crap out of regular social science and b-school approaches. It's just night and day. You can now have models that hit 60%, 80%, sometimes 90%+ accuracy levels.

The bad news is that they don't look like regressions anymore and you need new training to understand them. The results come out in tables, if-then statements, rule sets and other long, unfathomable formats. Literally no human being can intuit much of it. Ninja Metrics has spent 7 years figuring out best practices in this new area of science and it's still challenging.

So how can a model be good if it's not even understandable?

Fair question. But do you really need to understand why it works, or is it good enough to understand that it just does, with a high degree of certainty?

Imagine you want to know if Player A is going to spend money next month. There's a black box there that will tell you, with 85% accuracy, if she will. But you can't know why. Or, there's another transparent box that will tell you, with 40% accuracy, and you can know why. Which box do you want?

From a practical, actionable point of view, it's actually an easy question to answer. If you're going to run interventions and test their effectiveness anyway, you're going to get the why eventually. And if you're going to send everyone the same email anyway, it's irrelevant.

Don't get me wrong. I'm a long-time modeler who likes to know the why. But if I can get 80%+ confidence levels without knowing the why, I'm happy to give it up. I would prefer to have other parts of my dashboard focus on "why" issues. It's the smarter entrance to the rabbit hole. And again, it's actionable.

If you're a larger gaming company, you may have a sharp analyst down in the BI department. That's great, but she's probably not running an automated model every day. And even if she is, does she have the question-asking and contextual skills (mostly right brain) of a social scientist as well as the hard-core big data skills of a computer scientist (mostly left brain)? In this early era of big data a person like this is rare and in extremely high demand.

And this is precisely why we built Ninja Metrics. It enables marketing, BI and the developers to ask the right questions and then automate all the answers.

Our Katana system is essentially a team of PhDs in a box, working daily.

If you have some PhDs on staff already, fantastic. They will take the tool even farther. Ninja Metrics supplies scads of new and powerful DVs and IVs for them to play with and investigate further, all in an easy to use, automated system. They'll have quick answers they can't come up with on their own, and will conceive of more uses for them than we will ever think up. Win-win.

References

Webb, E., Campbell, D., et al. (1966). Unobtrusive measures: Non-reactive research in the social sciences. Chicago: Rand McNally and Company.