Prediction: The Future of Game Analytics (It’s Already Here--It’s Just Not Very Evenly Distributed) WHITE PAPER © 2014. Ninja Metrics® www.ninjametrics.com


  • www.ninjametrics.com

    2014. Ninja Metrics

    www.ninjametrics.com | [email protected]

    Ninja Metrics - White Paper Prediction: The Future of Game Analytics | Page 2

    Table Of Contents

    Get Distributed

    The Value of Big Data

    Understanding Historical Data

    Predictive Analytics

    Getting Your Action Items On

    Red Ball vs. Blue: Real-Time Analytics

    Predictive is Not Real-Time

    Data Models and Why Your Education Probably Wasn't Good Enough


    Get Distributed

    The futurist William Gibson wrote that "the future is already here--it's just not very evenly distributed."

    That's both social commentary and cautionary tale in one. It tells us that some technological advances feel like science fiction simply because most people don't yet know about them or have access to them.

    In this white paper, we will outline some of those advances in the field of data analytics for games, and encourage you to be among those with access.

    The Value of Big Data

    First, Understand Users

    "Big Data" is the flavor of the moment, but it's not always clear what the term means. We've worked with dozens of game companies, and more often than not "big data" means the kitchen sink approach, i.e. "let's just collect everything in case we need it."

    That's fair enough, but using big data intelligently involves more than just a big database. It means collecting the right kinds of data that answer specific questions addressing specific business needs. It's all doable, but it requires planning and smart decisions up front that take into account the nature of your customers.

    Doctor House Doesn't Believe His Patients and Neither Should You

    You may be familiar with the popular TV show House. The principal character, Dr. Greg House, is a curmudgeonly mentat of a medical analyst who never trusts his patients to tell him the truth. Weighed down by their own egos, cultures, superstitions and fears, House's patients regularly lie to themselves and to him, obscuring key facts of their illness and making diagnosis difficult. House teaches us that listening to your patients may not always be the best way to help them.

    Listening to players talk about your game is flawed in the same way. First, there's the problem of sampling. Listening to players who scream the loudest, i.e. your bored, lonely, raging forum trolls, does not yield a scientifically representative sample of all players. Their complaints are proof that a phenomenon exists, but not that it's common, or even important.

    Secondly, these people may be flat out wrong in their observations, not because they're evil, stupid or angry (though those are always possible), but because they are a poor gauge of even their own behavior.

    Consider the following classic case from social science: How do you know how much TV someone watches? The most common approach to answering this question is to simply ask them. The problem is that they tend to be wrong.

    Ask yourself how many hours of TV you watched last week or last month. Your estimate will usually be wrong. Someone who actually watched 30 hours of TV might answer 25 hours, 32 hours, 35 hours, etc. Sometimes people are wrong due to ego and worrying about appearances, like House's patients, but sometimes they just can't estimate well.

    Webb's Unobtrusive Observation

    Eugene Webb was a pioneering social scientist in the 1960s who invented the term "unobtrusive observation" (Webb et al., 1966). His insight was that it's a lot better to watch people do something and measure it yourself than to ask them to recall it. He also made sure that the watching didn't interfere with the behavior.

    Webb's classic case looked at the popularity of museum exhibits. When asked which exhibit they liked the best, museum visitors would consistently give answers that made them sound intelligent and wise. For example, many would say they liked the exhibit on atomic structures even though this exhibit room was consistently devoid of visitors. The visitors were clearly lying to feel better about themselves and appear wiser.

    So how could the museum learn the truth without following the guests around?

    Webb's simple approach was novel and revolutionary. His team collected unobtrusive data by counting the number of nose smudges on the glass cases and the wear of the floor tiles in front of the exhibits. Those turned out to be far better gauges of actual popularity. It's a simple insight, and one that can be brought to bear in the realm of big data and games.

    Game logs represent an analyst's dream for data quality and purity. If a game is instrumented intelligently, it will yield a flawless record of every action, transaction and interaction the players make. And this is what powers good big data. Once it's available, the challenge then becomes how to make intelligent use of it.

    Understanding Historical Data

    What Just Happened?

    The past is the best building block for understanding the present as well as the future. On the simplest level, any game company needs to know how many customers it has, how much they've spent, and what they've done.

    Ninja Metrics labels this part of our dashboard "Basic Metrics." It contains all of the basic metrics based on past behavior. Factors such as Daily Active Users (DAU), Average Revenue per User (ARPU) and K-factor can all be found here.

    Such basic metrics are invaluable for understanding aggregate trends, and if they can be augmented with add-on functionality like A/B testing and segmentation, they can become quite handy indeed. For example, if an A group gets one kind of content (or mechanic, or CRM intervention, etc.) and a B group another, then the ARPUs of those two groups can be compared to see which performed better.
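    As a rough illustration of the A/B comparison described here, the sketch below computes ARPU for two groups and reports which one came out ahead. The user IDs and spend figures are invented for the example, and the function name `arpu` is our own, not part of any Ninja Metrics API.

```python
def arpu(revenue_by_user):
    """Average revenue per user: total revenue divided by number of users."""
    return sum(revenue_by_user.values()) / len(revenue_by_user)

# Hypothetical spend (in dollars) per user for each test group.
group_a = {"u1": 0.00, "u2": 4.99, "u3": 0.00, "u4": 9.99}
group_b = {"u5": 0.00, "u6": 0.00, "u7": 1.99, "u8": 0.99}

arpu_a = arpu(group_a)
arpu_b = arpu(group_b)
winner = "A" if arpu_a > arpu_b else "B"
print(f"ARPU A: ${arpu_a:.2f}, ARPU B: ${arpu_b:.2f}, better group: {winner}")
```

    With groups this small the difference could easily be noise. A real comparison would use far more players and check that the gap is statistically meaningful before acting on it.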

    Instrumenting for Historical Data

    To get that historical data, instrumenting your game is critical. Instrumenting your game is all about making sure these metrics are actually recorded as part of the game.

    Analytics companies will tell you which events to capture and how to report them, and it's pretty straightforward. For example, when an event happens (say a user logs in) the game system needs to be able to say "User 3482 logged in at time XX:XX:XX." The analytics company supplies a piece of code that then fires this event off to us, typically up in the cloud, where our algorithms and reporting are applied. This code is supplied in a library that essentially says "wherever you have your log-in event happening, put this line of code here." If the events are already instrumented, this process shouldn't take more than a day at most.

    Obviously, if your analytics software requires data for a metric you haven't instrumented for, you'll need to work that into your development cycle. For example, if you want to know how many players are on level 8 and you don't collect an event like "advanced from level 7 to 8," that metric isn't going to show up. But in general, this is not a complicated process and simply requires time and planning.
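    To make the idea of instrumentation concrete, here is a minimal sketch of event tracking. It is not the Ninja Metrics library: the `EventLog` class, the event names and the event fields are all assumptions made for illustration, and events are kept in memory where a real SDK would send them to a server.

```python
import json
import time

def make_event(user_id, event_type, **properties):
    """Build one analytics event as a plain dict (hypothetical schema)."""
    return {
        "user_id": user_id,
        "event": event_type,
        "timestamp": time.time(),  # seconds since the epoch
        "properties": properties,
    }

class EventLog:
    """Collects events in memory; a real SDK would send them to a server."""

    def __init__(self):
        self.events = []

    def track(self, user_id, event_type, **properties):
        event = make_event(user_id, event_type, **properties)
        self.events.append(event)
        return event

    def count(self, event_type, **match):
        """Count events of a type whose properties include all of `match`."""
        return sum(
            1 for e in self.events
            if e["event"] == event_type
            and all(e["properties"].get(k) == v for k, v in match.items())
        )

log = EventLog()
log.track(3482, "login")                                    # "User 3482 logged in"
log.track(3482, "level_advance", from_level=7, to_level=8)  # "advanced from level 7 to 8"
log.track(9001, "login")

print(json.dumps(log.events[1], indent=2))
print("advances to level 8:", log.count("level_advance", to_level=8))
```

    The point of the second `track` call is the one made in the text: if the level-advance event is never recorded, no later analysis can count the players who reached level 8.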

    So clearly, having a solid foundation of historical player data is hugely important. It supplies you with data to populate the familiar basic metrics of DAU, ARPU, ARPPU, Churn Rate and many others. But these metrics have one key limitation: they're based on historical data and therefore reactive by their very nature.

    To be proactive, you need to peer into the future using big data and predictive analytics. It isn't easy but it's not magic either.

    Predictive Analytics

    There's a Minority Report For a Reason

    If you've seen the movie version of Philip K. Dick's Minority Report, you saw a future society where the police use people known as "precogs" to predict the future and stop crimes before they happen. It's a fun premise, and of course it goes spectacularly wrong. It turns out that predictions made by the precogs aren't 100% accurate, and when one of them screams out "NO!" it's considered a "minority report." The minority report throws doubt on the validity of the prediction. In other words, the police might be arresting the wrong person.

    It seems that with great power indeed comes great responsibility. And sometimes we screw it up.

    So let's not screw this up.

    Predictive Analytics is Not Magic But It's Not 100% Either

    With today's advances in big data analytics, the ability to accurately predict player behavior is a reality.

    The science can get confusing but here's a fairly simple way of understanding it.

    Let's say a computer watches all of the events that happen in a game and it starts to recognize patterns. Some patterns are repeated, while others aren't. When the patterns are repeated, the machine learns that and starts looking for that pattern to occur again, this time starting to make a prediction about what is going to come next.

    Say the computer sees A-B-C-D, over and over. After a while it recognizes it. Then it sees A-B-C, and you ask it what is going to happen next. It says "D," of course, but it can also tell you how likely this prediction is to be correct. How can it do that? Well, when it looked into the past, it wasn't always A-B-C-D. Occasionally it was A-B-C-Q. So the computer also starts to understand likelihood, and can tell you how often that guess has turned out right. That's the prediction. No magic, really.
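    The A-B-C-D example can be sketched in a few lines. This is only a toy frequency counter, far simpler than any production predictive model, and the function names and sample history are invented for illustration.

```python
from collections import Counter, defaultdict

def learn_patterns(sequences, context_len=3):
    """Count which event follows each run of `context_len` events."""
    followers = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - context_len):
            context = tuple(seq[i:i + context_len])
            followers[context][seq[i + context_len]] += 1
    return followers

def predict_next(followers, context):
    """Return (most likely next event, observed likelihood) or (None, 0.0)."""
    counts = followers.get(tuple(context))
    if not counts:
        return None, 0.0
    event, hits = counts.most_common(1)[0]
    return event, hits / sum(counts.values())

# Hypothetical history: A-B-C is followed by D three times and by Q once.
history = ["ABCD", "ABCD", "ABCQ", "ABCD"]
model = learn_patterns(history)

event, likelihood = predict_next(model, "ABC")
print(f"After A-B-C, expect {event} with likelihood {likelihood:.0%}")
```

    The likelihood is nothing more than how often the guess was right in the past, which is exactly the intuition the paragraph above describes.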

    How Right Is It?

    There's a technique used to test and verify these predictions. It's called (bear with me here) cross-validation.

    Here's how it works in its simplest form, with a single split:

    Let's say you have a really big data set. The computer takes the whole thing and splits it into two halves. With the first half it looks for patterns and builds up its model. By the end, it says "We see A-B-C and then D happens 75% of the time." Now let's take that model and see if it's accurate in the second, totally untouched half of the data. If A-B-C-D happens 75% of the time in this data as well, we all start feeling pretty good about the predictive nature. But we start thinking of it being 75% accurate, not 100%.
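    The split-and-check procedure can be sketched as follows. The data set is synthetic (750 sequences ending in D and 250 ending in Q), the random seed is fixed only so the example is repeatable, and `rate_of_d` is a stand-in for a real model. Full cross-validation would repeat this with several different splits and average the results.

```python
import random
from collections import Counter

def rate_of_d(sequences):
    """Share of A-B-C sequences that end in D."""
    endings = Counter(seq[3] for seq in sequences if seq[:3] == "ABC")
    total = sum(endings.values())
    return endings["D"] / total if total else 0.0

# Hypothetical data set: 750 A-B-C-D sequences and 250 A-B-C-Q sequences.
data = ["ABCD"] * 750 + ["ABCQ"] * 250
random.Random(42).shuffle(data)  # fixed seed so the split is repeatable

half = len(data) // 2
train, holdout = data[:half], data[half:]

learned = rate_of_d(train)     # what the model "learns" from the first half
observed = rate_of_d(holdout)  # what actually happens in the untouched half
print(f"learned from first half: {learned:.2f}, seen in second half: {observed:.2f}")
```

    If the two numbers are close, the pattern generalizes. If the held-out half disagrees sharply with the first half, the model has learned something specific to its training data.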

    Why not 100%?

    The short answer is that the world has a lot of moving pieces. For example, you might make a prediction that Bob Smith will spend $20 tomorrow. But what if poor Bob gets hit by a bus or has a big breakup with his fiancée? Those events could very likely alter his behavior (especially the getting hit by a bus). And you certainly didn't plan for those data points in your model. So when Bob doesn't show up and spend his $20, you scratch your head. Was your model wrong? No, just incomplete.

    Secondly, it's very easy to cheat with these models. For example, by ignoring any false positives or false negatives in data sets, anyone could just tell you that all of the players will spend $20 tomorrow. And they guarantee that they'll have covered everyone who does. The ones they were accurate about they could report as 100% accurate.

    This happens more often than you'd think in scientific circles, and when it happens it's labeled junk science. And rightly so.

    Putting Trust in the F-Score

    So, to be responsible, Ninja Metrics advocates using something called an F-score. This takes one stat that is penalized by false positives (precision) and another that is penalized by false negatives (recall) and combines them into a single number, their harmonic mean. The result is a score between 0 and 1, often read as a percentage, and is extremely trustworthy. It can't be gamed by ignoring one kind of error. Now you need good performance.
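    The sketch below shows the most common form of the F-score, the F1 score, which is the harmonic mean of precision and recall. It also shows why the "everyone will spend" trick from the previous section stops working. The player labels are made up for the example.

```python
def f_score(predicted, actual):
    """F1: harmonic mean of precision and recall for boolean labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))      # true positives
    fp = sum(p and not a for p, a in zip(predicted, actual))  # false positives
    fn = sum(a and not p for p, a in zip(predicted, actual))  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # hurt by false positives
    recall = tp / (tp + fn)     # hurt by false negatives
    return 2 * precision * recall / (precision + recall)

# Hypothetical ground truth: 2 of 10 players actually spend tomorrow.
actual = [True, True] + [False] * 8

# The "cheat": predict that everyone spends. Every spender is covered,
# but the 8 false positives drag the score down.
everyone = [True] * 10
print(f"predict everyone: {f_score(everyone, actual):.2f}")

# A more honest model: flags 3 players, 2 of them correctly.
honest = [True, True, True] + [False] * 7
print(f"honest model:     {f_score(honest, actual):.2f}")
```

    Because the harmonic mean is dominated by the weaker of the two numbers, a model has to do well on both kinds of error to score well.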

    For reference, F-scores have been used for a long time in churn models by the telecommunication companies. For industries like these where the data isn't particularly detailed, high quality F-scores tend to hit the .35 to .45 range on average. With game data, you can get much higher F-scores because the data detail in gaming is really, really good. A .50 to .70 score is very good. A .80 to .90 is spectacular. Anything over that is rare indeed. Our models tend to hit .85 to .90 on average, and that's after 7 years of R&D and academic specialization in game player data algorithms.

    Still, at the end of the day, seeing that confidence value in a predictive model is

    key. Any consultant or company that gives you a predictive value without that

    is no better than voodoo. Science is all about transparency and provability.

    Insist on it.

    Getting Your Action Items On

    Now that you have a value as well as a level of confidence in it, how do you turn it into actions? In other words, how do you turn predictive insight into targeted promotions or player interventions?


    Predicting Player Churn Rate

    Say you're trying to predict player churn rate. You run the numbers and it comes out with an overall accuracy of .85 (we'll say % from here on out, but an F-score is best). An overall number is a great start but it's not very actionable. What is actionable is a score for an individual player.

    Let's consider two players, A and B:

    Player A has a probability of quitting of 60% in the next week, and the model is 90% accurate.

    Player B has a probability of quitting of 80% in the next week, and the model is 20% accurate.

    There are a couple of ways of working with this.

    One is simple. Just set a threshold percentage and say that anyone over this threshold is worth taking action on, period. That's fine, but it doesn't take into account the fact that you have scarce resources that you can devote to keeping that player. Also, it doesn't prevent you from offering an inappropriate promotion or intervention to a player that wasn't even planning on quitting the game in the first place. Imagine the ham-handedness of a "Don't Go!" promotion aimed at a loyal and happy player.

    The second way of thinking about this is to multiply the two percentages together. Player A is 60% likely, and we're 90% sure. Taken together, if we had say 100 players like this, 54 of them (90% of 60%) would quit next week. By the same arithmetic, Player B comes out at only 16% (20% of 80%), despite the higher raw probability. It's a little like opportunity cost. Just consider these players 54% and 16% likely to leave and react appropriately.
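    The multiplication can be written out for both players. The function name and data layout below are our own choices for illustration.

```python
def adjusted_churn(probability, model_accuracy):
    """Discount a churn probability by how much we trust the model."""
    return probability * model_accuracy

# Players A and B from the text.
players = {
    "A": {"churn_probability": 0.60, "model_accuracy": 0.90},
    "B": {"churn_probability": 0.80, "model_accuracy": 0.20},
}

for name, p in players.items():
    score = adjusted_churn(p["churn_probability"], p["model_accuracy"])
    print(f"Player {name}: {score:.0%} adjusted likelihood of quitting")
```

    Player B has the higher raw probability, but the low model accuracy pulls the adjusted figure well below Player A's.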

    Predicting Player Lifetime Value

    Prediction of a player's future spending has always been elusive. But with predictive modeling, a player's lifetime value, or LTV, can be reduced to just another predictive metric. You are predicting both spending and lifespan.

    Let's say there are two new players:

    Player C has an LTV of $150, a likelihood of churning out of 30%, and the model is saying it's 90% accurate.

    Player D has an LTV of $80, a likelihood of churning out of 90%, and the model is likewise 90% accurate.

    What are they each worth? First, both models are pretty accurate, so let's just take them as trustworthy and ignore the "x .9" part.


    Simply multiply the LTV by the churn probability. Player C is $150 x 30%, or $45 of expected value at risk, and Player D is $80 x 90%, or $72 of expected value at risk. That figure is the revenue you expect to lose if you do nothing.

    Which player do you want to spend resources to keep?

    If all you're going to do is send a zero-cost email, by all means send it to them both. But when are you willing to give away something more costly like a free item, a free month of play or a fruit basket in the mail? Compare the cost of those interventions with the expected value and make a rational decision. If a fruit basket costs $50 and will save the player, then it makes rational sense to send it to Player D, but not C.
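    The fruit basket decision can be expressed as a small calculation. As in the text, this sketch ignores the model accuracy term and assumes the intervention always saves the player. The function names are invented for illustration.

```python
def value_at_risk(ltv, churn_probability):
    """Expected revenue lost to churn: LTV times the chance of quitting."""
    return ltv * churn_probability

def worth_intervening(ltv, churn_probability, cost):
    """True when the expected loss exceeds the cost of the intervention.

    Assumes, as the text does, that the intervention always works.
    """
    return value_at_risk(ltv, churn_probability) > cost

# Players C and D from the text: (LTV in dollars, churn probability).
players = {"C": (150.0, 0.30), "D": (80.0, 0.90)}
FRUIT_BASKET_COST = 50.0

for name, (ltv, churn) in players.items():
    at_risk = value_at_risk(ltv, churn)
    decision = "send" if worth_intervening(ltv, churn, FRUIT_BASKET_COST) else "skip"
    print(f"Player {name}: ${at_risk:.2f} at risk, fruit basket: {decision}")
```

    A fuller version would also multiply by the chance that the intervention actually works, which lowers the break-even cost.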

    Ninja Metrics' Katana software provides all of these numbers to enable exactly these kinds of transparent and rational decisions. We know that game developers will have their own tastes for how they handle interventions (community managers, customer service reps, email campaigns, push notification systems, etc.) and we support whatever systems we learn about. The best possible analytics system will allow you to set your own thresholds, decide on your own course of action, connect directly to the vendor or system for that action, then track the results.

    Red Ball vs. Blue: Real-Time Analytics

    Back to Minority Report: you may recall that their system etched the future criminal's name on a little wooden ball. If the ball was blue, the crime was going to occur some time off in the future. The police had time to plan a proper response. If it was red, the crime was imminent and the police had to react immediately.

    In the game world, this would be like viewing a list of possible quitters with high probability of quitting in the next 48 hours and then running a quick intervention to reduce churn.

    With Ninja Metrics software, blue and red ball analysis is possible. You can use dashboards to monitor and review and think big picture, but you can also automate these decisions.

    For example, if you know that Player X is going to quit in two weeks, great, you have some time for management to take action. But what about Player Y, who's quitting tomorrow? How nimble are you? You should have any "red ball" players set off a trigger that sends notification to the right person.

    For example, any time a player with an LTV over $5 has a churn probability above 60%, with a model confidence level of at least 70%, send an urgent email to Lisa Brown, the game's community manager.
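    A trigger like this is easy to express as a rule. The sketch below uses the thresholds from the example (LTV over $5, churn probability above 60%, confidence of at least 70%). The player records and the alert format are hypothetical, and the alerts are only built as strings where a real system would hand them to an email or notification service.

```python
def is_red_ball(player, min_ltv=5.0, min_churn=0.60, min_confidence=0.70):
    """True when a player meets every threshold in the alert rule."""
    return (
        player["ltv"] > min_ltv
        and player["churn_probability"] > min_churn
        and player["confidence"] >= min_confidence
    )

def build_alerts(players, recipient):
    """Return one alert message per red ball player (nothing is sent)."""
    return [
        f"URGENT for {recipient}: player {p['id']} may quit "
        f"(churn {p['churn_probability']:.0%}, LTV ${p['ltv']:.2f})"
        for p in players
        if is_red_ball(p)
    ]

# Hypothetical predictions for three players.
players = [
    {"id": "X", "ltv": 40.0, "churn_probability": 0.75, "confidence": 0.80},
    {"id": "Y", "ltv": 3.0, "churn_probability": 0.90, "confidence": 0.95},   # LTV too low
    {"id": "Z", "ltv": 25.0, "churn_probability": 0.65, "confidence": 0.50},  # low confidence
]

alerts = build_alerts(players, "Lisa Brown")
for line in alerts:
    print(line)
```

    Keeping the thresholds as parameters makes it simple to tune the rule as you learn which alerts are worth acting on.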

    Predictive is Not Real-Time

    Nearly every analytics company says they offer "real-time" analytics. To some extent, this is statistical and marketing sleight of hand. Yes, we can all process your users and tell you how many logged in, or are on level 5, or have spent money. That's easy, and it can be as fast as a (very) big Excel spreadsheet.

    But predictive modeling is definitely not instantaneous. It takes time to run these models. How long depends on the number of players and the number of data points you're considering, not to mention how well your analytics team deals with Hadoop and MapReduce!

    If your game has 10,000 players, these models are going to be very fast. But if your game has 10 million players, it might take a few hours. And heavy duty models like Ninja Metrics' Social Value system take longer still.

    The key is to consider the trade-off between the cost of running your models and your ability to quickly act on their results. We suggest running them daily because of the trade-offs of processing costs and many companies' inability to act on things any faster. Remember the red ball. So, if you're ninja nimble, and can afford extra costs, consider running your models more frequently.

    It's about actionability at the end of the day. Given a specific result from the predictive models, would you act? If you had more actionable data, more often, would it be worth it to your business to run a promotion or intervention? If so, consider springing for it. If not, don't waste resources.


    Data Models and Why Your Education Probably Wasn't Good Enough

    Most marketers are trained in a business school, or perhaps some kind of social science program like communication, sociology, etc. In these programs, we learn useful statistical tools such as correlations, ANOVAs, and most often, regression models.

    In a regression model, we have an outcome variable (dependent variable) and some number of predictors (independent variables). Applying this to calculating player churn, we might have a model that looks something like:

    Quitting = Gender + Time Spent + Character Type + Error

    And we'd take a look at the overall stats for the model and determine if we thought it was trustworthy enough. Maybe we'd look at the standard error of the model, maybe the r-squared, etc. If it's good enough, we'll take a look at the coefficients on the independent variables, see which ones reached statistical significance and how big they are, and in what direction. Fair enough.
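    For readers who have not fit a regression by hand, here is a stripped-down sketch with a single predictor (time spent) instead of the three in the equation above. The data points are invented, and a real analysis would use a statistics package that also reports standard errors and significance tests.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variance in y that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: weekly hours played, and whether the player quit (1) or stayed (0).
hours = [1, 2, 3, 4, 6, 8, 10, 12]
quit_flag = [1, 1, 0, 1, 0, 0, 1, 0]

intercept, slope = fit_line(hours, quit_flag)
r2 = r_squared(hours, quit_flag, intercept, slope)
print(f"Quitting = {intercept:.2f} + ({slope:.3f} x Time Spent) + Error, r-squared = {r2:.2f}")
```

    The sign of the slope gives the direction of the effect and the r-squared says how much of the outcome the model accounts for, which are the two things the paragraph above describes checking.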

    Here's the problem.

    Those models aren't very good. They're pretty good sometimes: you get an r-squared of something like .46 and you're reasonably confident given the tools you used, i.e. statistical models based on sampling.

    But if we want models that reach accuracy levels of over .5, we're going to have to leave the world of old-school statistical modeling and get with big data.

    And big data is the realm of computer science, not social science.

    Do You Want the Good News or the Bad News?

    The good news is that computer science has models with power that, frankly, beat the crap out of regular social science and b-school approaches. It's just night and day. You can now have models that hit 60%, 80%, sometimes 90%+ accuracy levels.

    The bad news is that they don't look like regressions anymore and you need new training to understand them. The results come out in tables, if-then statements, rule sets and other long, unfathomable formats. Literally no human being can intuit much of it. Ninja Metrics has spent 7 years figuring out best practices in this new area of science and it's still challenging.

    So how can a model be good if it's not even understandable?

    Fair question. But do you really need to understand why it works, or is it good enough to know that it does, with a high degree of certainty?

    Imagine you want to know if Player A is going to spend money next month. There's a black box there that will tell you, with 85% accuracy, if she will. But you can't know why. Or, there's another transparent box that will tell you, with 40% accuracy, and you can know why. Which box do you want?

    From a practical, actionable point of view, it's actually an easy question to answer. If you're going to run interventions and test their effectiveness anyway, you're going to get the why eventually. And if you're going to send everyone the same email anyway, it's irrelevant.

    Don't get me wrong. I'm a long-time modeler who likes to know the why. But if I can get 80%+ confidence levels without knowing the why, I'm happy to give it up. I would prefer to have other parts of my dashboard focus on "why" issues. It's the smarter entrance to the rabbit hole. And again, it's actionable.

    If you're a larger gaming company, you may have a sharp analyst down in the BI department. That's great, but she's probably not running an automated model every day. And even if she is, does she have the question-asking and contextual skills (mostly right brain) of a social scientist as well as the hard-core big data skills of a computer scientist (mostly left brain)? In this early era of big data a person like this is rare and in extremely high demand.

    And this is precisely why we built Ninja Metrics. It enables marketing, BI and the developers to ask the right questions and then automate all the answers.

    Our Katana system is essentially a team of PhDs in a box, working daily.

    If you have some PhDs on staff already, fantastic. They will take the tool even farther. Ninja Metrics supplies scads of new and powerful dependent and independent variables (DVs and IVs) for them to play with and investigate further, all in an easy to use, automated system. They'll have quick answers they can't come up with on their own, and will conceive of more uses for them than we will ever think up. Win-win.

    References

    Webb, E., D. Campbell, et al. (1966). Unobtrusive measures: Non-reactive research in the social sciences. Chicago, Rand McNally and Company.
