Transcript
Page 1: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Michael Limcaco, Amazon Web Services

Page 4: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Content discovery … and the conversation around it … matter!

[1] http://www.slideshare.net/AmazonWebServices/maximizing-audience-engagement-in-media-delivery-med303-aws-reinvent-2013-28622676

[2] http://www.nielsen.com/content/corporate/us/en/press-room/2013/new-nielsen-research-indicates-two-way-causal-influence-between-.html

[3] http://www.google.com.au/think/research-studies/quantifying-movie-magic.html

Page 5: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Search

Watch

Listen

Play

Download

Purchase

Contact sales

Subscribe

Contact support

Cancel

Rate It

Review It

Upgrade It

Sharing

Tagging

Bookmarking

Social Sentiment

Page 6: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

• Descriptive

– Retrospective

– What happened or is happening

– Simple aggregations and counters

• Predictive

– Statistical forecast

– Predict a value in a dataset

– Machine learning

• Prescriptive (emergent)

– What should I do about it?

Descriptive

Predictive

Prescriptive

Page 7: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Machine Learning

Signals → Predictions

Page 8: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Recommendations

Clustering

Classification

Page 12: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Storage

Visualization & Analysis

R

Octave

MATLAB

Excel

DAS

GraphLab

Mahout

Spark MLlib

H2O

HBase

HDFS

RDBMS

SAN/NAS

KNIME

WEKA

Python Kits

Single Node | Big Data

Page 14: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Use Case 1

Page 16: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Spark | H2O

Recommendation | Clustering | Classification

Math Library

Hadoop

Map-Reduce

Page 17: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Estimate similar users and items

http://www.slideshare.net/tdunning/recommendation-techn

Page 18: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

User1 Thing1

User2 Thing2

User3 Thing3

User2 Thing4

User5 Thing1

User1 Thing2

User1 Thing3

Mike

Jon

Mary

Phil

Kris

Logs → History Matrix
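The step from logs to a history matrix can be sketched as follows. This is a minimal illustration, not the pipeline used in the talk: the log rows are copied from this slide, while the function names `build_history` and `to_dense` are hypothetical.

```python
from collections import defaultdict

# Interaction log copied from the slide: (user, item) pairs.
LOG = [
    ("User1", "Thing1"),
    ("User2", "Thing2"),
    ("User3", "Thing3"),
    ("User2", "Thing4"),
    ("User5", "Thing1"),
    ("User1", "Thing2"),
    ("User1", "Thing3"),
]


def build_history(log):
    """Group the log into a sparse history matrix: user -> set of items."""
    history = defaultdict(set)
    for user, item in log:
        history[user].add(item)
    return dict(history)


def to_dense(history):
    """Expand the sparse history into a 0/1 matrix with sorted row and column labels."""
    users = sorted(history)
    items = sorted({item for seen in history.values() for item in seen})
    matrix = [[1 if item in history[user] else 0 for item in items] for user in users]
    return users, items, matrix


history = build_history(LOG)
users, items, matrix = to_dense(history)
```

Each row of `matrix` is one user and each column one item; a 1 means the log contains at least one interaction between them.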

Page 19: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

History Matrix

2 8

2 4

8

4

Item-Item Matrix
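A rough sketch of how an item-item matrix can be derived from a history matrix: count, for every pair of items, how many users touched both. The sample history below is hypothetical and is not data from the talk; `cooccurrence` is an illustrative name.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(history):
    """Count, for each unordered item pair, how many users interacted with both."""
    counts = Counter()
    for seen in history.values():
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(seen), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical history matrix (user -> items), for illustration only.
example_history = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"B", "C"},
}
pairs = cooccurrence(example_history)
```

The counts are symmetric, so only one ordering of each pair is stored.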

Page 20: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

2 8

2 4

8

4

Item-Item Matrix

LLR

Indicators

(“Items Similar To This….”)
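The LLR step can be sketched with the log-likelihood ratio test on a 2x2 contingency table of co-occurrence counts. This is a minimal version under the usual formulation (k11 = users with both items, k12 and k21 = users with only one of them, k22 = users with neither); the function names are illustrative and the thresholding that turns scores into indicators is not shown.

```python
import math


def x_log_x(x):
    """x * ln(x), with the convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a group of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio score for a 2x2 co-occurrence table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


score = llr(1, 0, 0, 1)
```

For the table (1, 0, 0, 1) the score is 4·ln 2 ≈ 2.7726, which is the same value that appears in the spark-itemsimilarity output later in the deck. Item pairs that co-occur no more often than chance score close to zero and would not become indicators.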

Page 21: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Indicators

(“Items Similar To This….”)

Items Similar To This

Page 22: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Superman      Highlander, Dune
Star Wars     Raiders, Minority Report
Highlander    Superman
Mulan         Home Alone, Mermaid
Star Trek     …
…             …

4587          223, 5234
748           5345, 235
12            8234
245           9543, 7673
3456          4587
…             …

Index

Page 23: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Indicators

Page 24: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

748 Star Wars 45, 235

12 Highlander 8234

245 Mulan 9543, 7673

4587 Superman 12, 5234

3456 Star Trek 2458 …

Query

“12”

5345

3456

12
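One possible reading of this slide, sketched below: each indexed item carries its indicator IDs as a searchable field, and a query made of item IDs from a user's recent activity returns the items whose indicators match. The rows are copied from the slide (the Star Trek row is elided there and stays incomplete here); the `recommend` function and its scoring by number of matching indicators are assumptions, not the search engine's actual ranking.

```python
# Index rows copied from the slide: item ID, title, indicator IDs.
INDEX = [
    {"id": 748, "title": "Star Wars", "indicators": {45, 235}},
    {"id": 12, "title": "Highlander", "indicators": {8234}},
    {"id": 245, "title": "Mulan", "indicators": {9543, 7673}},
    {"id": 4587, "title": "Superman", "indicators": {12, 5234}},
    {"id": 3456, "title": "Star Trek", "indicators": {2458}},  # further IDs elided on the slide
]


def recommend(index, recent_item_ids, limit=10):
    """Return titles whose indicators overlap the query IDs, best match first."""
    recent = set(recent_item_ids)
    scored = []
    for doc in index:
        hits = len(doc["indicators"] & recent)
        if hits:
            scored.append((hits, doc["title"]))
    # More matching indicators first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


results = recommend(INDEX, [12])
```

With the query "12" (Highlander), the only row listing 12 as an indicator is Superman, so that is the single result.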

Page 26: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

users

Page 27: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

users

Media

platforms

Mobile

Search

Play

Buy

Rate

Recommendations

Page 29: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

https://github.com/apache/mahout

Page 30: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

movie-b movie-c:2.772588722239781 movie-a:2.772588722239781

movie-d ….

Indicators

(“Items Similar To This….”)

% mahout spark-itemsimilarity \
    -i input-folder/data.txt \
    -o output-folder/ \
    --filter1 buy -fc 1 -ic 2 \
    --filter2 view

Page 31: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Use Case 2

Page 32: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Classify (estimate) as Positive | Negative

http://www.slideshare.net/tdunning/recommendation-techn

Page 33: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

“I thought Star Wars Episode 28 was not without merit”

https://github.com/cyhex/streamcrab

Page 34: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

users

Page 35: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

users

Media

platforms

Mobile

Search

Play

Buy

Rate

Recommend

Social Media activity

Page 36: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Extract Features

Classify

Model Training

Positive | Negative

“I adored this movie”

“adore” = POSITIVE
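The extract-features, train, and classify stages can be sketched in plain Python. This is a deliberately simplified naive Bayes with add-one smoothing over word presence, not the TextBlob or NLTK implementation; `extract_features` and `TinyNaiveBayes` are hypothetical names, and the three training sentences are the example phrases that appear on the slides.

```python
import math
from collections import Counter, defaultdict


def extract_features(text):
    """Bag-of-words features: the set of lowercase tokens with punctuation stripped."""
    return {w.strip(".,!?\"'").lower() for w in text.split()} - {""}


class TinyNaiveBayes:
    def __init__(self, labeled_texts):
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in labeled_texts:
            self.label_counts[label] += 1
            for word in extract_features(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in extract_features(text):
                if word in self.vocab:
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = TinyNaiveBayes([
    ("I adored this movie", "Positive"),
    ("I love this movie", "Positive"),
    ("This makes me mad", "Negative"),
])
label = model.classify("I adored it")
```

Words never seen in training are ignored at classification time, so a sentence made only of unknown words falls back to the most frequent training label.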

Page 37: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Extract Features

Classify

Model Training

Positive | Negative

Page 38: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

http://www.nltk.org/book/ch06.html

TextBlob + Natural Language Toolkit (NLTK)

1

2

Page 39: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

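from textblob.classifiers import NaiveBayesClassifier

# Labeled training examples as (text, label) pairs
training_data = [('I love this movie', 'Positive'),
                 ('This makes me mad', 'Negative')]  # … more examples elided on the slide

# Train a Naive Bayes classifier on the labeled examples
my_classifier = NaiveBayesClassifier(training_data)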

Page 40: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

“I thought Star Wars Episode 29 was not without merit”

“Positive”

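from amazon_kclpy import kcl
import json, base64

class RecordProcessor(kcl.RecordProcessorBase):

    def process_records(self, records, checkpointer):
        # … (remaining handling elided on the slide)
        # The loop is implied by the slide's use of `record`
        for record in records:
            # Kinesis payloads arrive base64-encoded; decode to text
            inbound_tweet = base64.b64decode(record.get('data')).decode('utf-8')
            # my_classifier is the TextBlob classifier trained earlier
            sentiment = my_classifier.classify(inbound_tweet)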

Page 41: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Extract Features

Classify

Model Training

Positive | Negative

Page 43: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

12 2 7 85 1 997

Mulan

1 5 99 85 50 4

Mulan

1 2 3 4 5 6

Mulan

3 1 4 6 7 9

Mulan

Page 46: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Use Case 3

Page 47: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

This is a form of unsupervised learning
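As an illustration of unsupervised clustering, here is a small k-means sketch. The slides do not name the algorithm used, so k-means is an assumption chosen because it is a common starting point; the `VIEWERS` points are made-up two-feature examples, and the first k points are used as initial centroids only to keep the run deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = [list(p) for p in points[:k]]  # deterministic initialization
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical audience features, for illustration only.
VIEWERS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
assignment, centroids = kmeans(VIEWERS, k=2)
```

No labels are supplied at any point; the two groups emerge only from the distances between points, which is what makes this unsupervised.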

Page 48: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Segaran, Toby. Programming Collective Intelligence. Sebastopol: O’Reilly, 2007. Print.

Page 49: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6374152&isnumber=6374097

Page 51: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

R + H2O

Page 52: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

R + H2O

Data Science Desktop

Machine Learning Cluster

Page 53: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

R + H2O

% java -jar h2o.jar

Page 56: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Use Case 4

Page 58: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

Customer  Geo  Account Type  Account Age  Support Tickets  Minutes Streamed  Churn?
Mike      CA   Premium       120          10               240               TBD
John      CA   Basic         240          1                140               TBD
Ingrid    WA   Premium       60           5                1800              TBD
Mark      WA   Basic         30           0                0                 TBD
Usman     WA   Basic         720          0                360               TBD
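Before a classifier can fill in the Churn? column, rows like these are typically turned into numeric feature vectors. The sketch below shows only that preparation step, using the rows from the table; no model is trained and no churn values are predicted, since the source marks them TBD. The field names and the `one_hot` and `encode` helpers are hypothetical.

```python
# Rows copied from the table; the Churn? column is omitted because it is TBD.
CUSTOMERS = [
    {"customer": "Mike", "geo": "CA", "account_type": "Premium",
     "account_age": 120, "support_tickets": 10, "minutes_streamed": 240},
    {"customer": "John", "geo": "CA", "account_type": "Basic",
     "account_age": 240, "support_tickets": 1, "minutes_streamed": 140},
    {"customer": "Ingrid", "geo": "WA", "account_type": "Premium",
     "account_age": 60, "support_tickets": 5, "minutes_streamed": 1800},
    {"customer": "Mark", "geo": "WA", "account_type": "Basic",
     "account_age": 30, "support_tickets": 0, "minutes_streamed": 0},
    {"customer": "Usman", "geo": "WA", "account_type": "Basic",
     "account_age": 720, "support_tickets": 0, "minutes_streamed": 360},
]


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 list, one slot per category."""
    return [1 if value == c else 0 for c in categories]


def encode(row, geos, account_types):
    """Concatenate one-hot categorical fields with the numeric fields."""
    return (
        one_hot(row["geo"], geos)
        + one_hot(row["account_type"], account_types)
        + [row["account_age"], row["support_tickets"], row["minutes_streamed"]]
    )


GEOS = sorted({row["geo"] for row in CUSTOMERS})
ACCOUNT_TYPES = sorted({row["account_type"] for row in CUSTOMERS})
features = [encode(row, GEOS, ACCOUNT_TYPES) for row in CUSTOMERS]
```

Each vector has the layout [geo=CA, geo=WA, type=Basic, type=Premium, account age, support tickets, minutes streamed]; a labeled history of past customers in the same layout would be needed to train a churn model.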

Page 59: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

http://www.bigml.com

Page 62: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

AWS Marketplace

Software

• BigML

• Revolution R Enterprise

• PredictionIO

• Yhat

• Mortar

• Zementis

Page 65: (MED302) Leveraging Cloud-Based Predictive Analytics to Strengthen Audience Engagement | AWS re:Invent 2014

http://bit.ly/awsevals