Machine Learning at Comcast
November 10th, 2015
Andrew Leamon – Director
Chushi Ren – Software Engineer / Data Scientist
Engineering Analysis

H2O World - Machine Learning at Comcast - Andrew Leamon & Chushi Ren

About Comcast

Comcast brings together the best in media and technology. We drive innovation to create the world’s best entertainment and online experiences.

High Speed Internet

Video

IP Telephony

Home Security / Automation

Universal Parks

Media Properties

Machine Learning for X1 Features

Importance of Live TV

•  The average US household watches 3-5 hours of TV per day (Nielsen)
•  3x more than Netflix (BTIG Research, 4/2015)
•  4x more than videos on smartphones, tablets, and computers
•  50% of leisure time is spent watching TV!

[Chart: Netflix, Live TV, Online Video]

X1 Personalization

[Diagram labels]
•  Content information: content, images, logos
•  Subscriber information: catalogs, entitlements, channel lineups
•  Discovery: search, browse, recommend, personalize, voice control, menu
•  Millions of devices
•  Metadata providers, content providers, billing systems
•  Customer usage, purchases, deep metadata
•  Spark

Trending on X1 – Predict Popularity 24 Hours in Advance

•  Model: ensemble of gradient boosted decision trees
•  Input: statistics of program ratings, program metadata, channel info, …

[Chart: Number of Signals; labeled value 0.77; legend: marker = New Signal]
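As a rough illustration of the gradient boosting idea named on the Trending on X1 slide, the sketch below fits a small ensemble of boosted regression stumps in plain Python. It is not the production X1 model: the two features (yesterday's rating, a prime-time flag), the toy numbers, and the hyperparameters are all assumptions made for this example.

```python
from statistics import mean

def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv, rv = mean(left), mean(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    if best is None:  # every feature is constant: fall back to a constant stump
        m = mean(residuals)
        return (0, float("inf"), m, m)
    return best[1:]  # (feature index, threshold, left value, right value)

def predict_stump(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv

def fit_gbdt(X, y, n_trees=50, learning_rate=0.1):
    """Each new stump is fit to the residuals of the ensemble built so far."""
    base = mean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict_gbdt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# Hypothetical toy data: [rating yesterday, is prime time] -> rating tomorrow.
X = [[1.0, 0], [2.0, 0], [3.0, 1], [4.0, 1], [5.0, 1], [6.0, 0]]
y = [1.2, 2.1, 3.5, 4.4, 5.6, 5.9]

model = fit_gbdt(X, y)
baseline_mse = mse([mean(y)] * len(y), y)
boosted_mse = mse([predict_gbdt(model, row) for row in X], y)
```

On this toy set the boosted ensemble fits the training targets more closely than the constant mean prediction; a real model would be judged on held-out data.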

Real-time Recommendations

Program recommendations are updated every 20 sec (Spark Streaming). For more details and code samples, see our talk at the Spark Summit East, March 2015: https://spark-summit.org/east-2015/

[Pipeline diagram]
•  Live tune activity from Kafka
•  Batch: user clustering with KMeans
•  Real-time: top-K trending programs per cluster
•  Real-time program recommendations per user
•  User history from HDFS
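The real-time half of this flow can be sketched without Spark. Given a cluster id per user (computed offline, for example with KMeans) and a window of live tune events, count tunes per program within each cluster, then recommend the trending programs a user has not already watched. The user ids, program names, and helper functions below are hypothetical; the system described in the talk runs on Spark Streaming with Kafka and HDFS.

```python
from collections import Counter, defaultdict

def top_k_per_cluster(tune_events, user_cluster, k=2):
    """Count tune events per program within each user cluster; keep the top k."""
    counts = defaultdict(Counter)
    for user_id, program in tune_events:
        cluster = user_cluster.get(user_id)
        if cluster is not None:  # ignore users that were never clustered
            counts[cluster][program] += 1
    return {c: [p for p, _ in ctr.most_common(k)] for c, ctr in counts.items()}

def recommend(user_id, user_cluster, trending, history, k=2):
    """Trending programs in the user's cluster, minus what they already watched."""
    cluster = user_cluster.get(user_id)
    watched = history.get(user_id, set())
    return [p for p in trending.get(cluster, []) if p not in watched][:k]

# Batch output (assumed): cluster assignment per user and viewing history.
user_cluster = {"u1": 0, "u2": 0, "u3": 1, "u4": 1}
history = {"u1": {"News at 6"}}

# One window of live tune events as (user id, program) pairs.
tune_events = [
    ("u1", "News at 6"), ("u2", "News at 6"), ("u2", "Football"),
    ("u1", "Football"), ("u2", "News at 6"),
    ("u3", "Cartoons"), ("u4", "Cartoons"), ("u4", "Cooking Show"),
]

trending = top_k_per_cluster(tune_events, user_cluster, k=2)
recs_u1 = recommend("u1", user_cluster, trending, history)
recs_u3 = recommend("u3", user_cluster, trending, history)
```

In a streaming setting the same counting step would be rerun on each window (every 20 seconds in the talk), while the cluster assignments are refreshed by the batch job.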

Machine Learning to Improve Customer Care

Problem: Avoidable Truck Rolls (ATR)

A customer calls to report an issue with their service.

The customer service agent goes through the ITG to debug the problem with the customer over the phone.

When the agent cannot resolve the problem by phone, a truck roll is scheduled.

•  Examples of avoidable truck rolls:
  •  Reset modem
  •  Change remote battery
  •  Entitlement issue
•  Goal:
  •  Build a predictive model to prevent ATRs

ATR Machine Learning Pipeline

[Pipeline diagram]
•  Stages: feature extraction, feature selection, model training, model validation
•  Data: data source, training data, test data
•  Output: classifier
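The four stages of the ATR pipeline can be walked through end to end on toy data. The sketch below is only an illustration under stated assumptions: the record fields (modem_offline_minutes, remote_issue_reported, region_code), the variance-based feature selection, and the nearest-centroid classifier are all stand-ins chosen for brevity, not the features or model used in the talk.

```python
from statistics import mean, pvariance

def extract_features(record):
    """Feature extraction: turn one raw (hypothetical) call record into numbers."""
    return [
        float(record["modem_offline_minutes"]),
        1.0 if record["remote_issue_reported"] else 0.0,
        float(record["region_code"]),  # constant in the toy data below
    ]

def select_features(X):
    """Feature selection: keep indices of features that vary across the rows."""
    return [j for j in range(len(X[0])) if pvariance([row[j] for row in X]) > 0]

def train_centroids(X, y):
    """Model training: a nearest-centroid classifier, one mean vector per class."""
    return {
        label: [mean(col) for col in zip(*[r for r, l in zip(X, y) if l == label])]
        for label in set(y)
    }

def predict(centroids, row):
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

def make(minutes, remote, avoidable):
    return {"modem_offline_minutes": minutes, "remote_issue_reported": remote,
            "region_code": 7, "avoidable": avoidable}

# Data source: hypothetical records, split into training and test data.
records = [make(2, True, 1), make(240, False, 0), make(5, True, 1),
           make(180, False, 0), make(0, True, 1), make(300, False, 0),
           make(3, True, 1), make(210, False, 0)]
train, test = records[:6], records[6:]

X_train = [extract_features(r) for r in train]
keep = select_features(X_train)  # computed on training data only

def project(row):
    return [row[j] for j in keep]

classifier = train_centroids([project(r) for r in X_train],
                             [r["avoidable"] for r in train])

# Model validation: accuracy on the held-out test data.
hits = [predict(classifier, project(extract_features(r))) == r["avoidable"]
        for r in test]
accuracy = sum(hits) / len(hits)
```

Selecting features on the training rows only, and scoring on rows the model has not seen, keeps the validation number honest. With heavily skewed classes, precision and recall on the avoidable class would say more than accuracy.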

ATR Challenges

•  Skewed data: only a very small portion of truck rolls are avoidable
  •  Use the balance classes option (balance_classes) in H2O to upsample the minority class
  •  Subsemble
•  Information leakage: we use statistics of some features as features themselves, which can leak information
  •  Hold the current row out when computing the statistic
  •  Add random noise
•  Operationalizing the model
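Both mitigations listed above can be sketched in plain Python. The first function randomly duplicates minority-class rows until the classes are the same size, which is the general idea behind upsampling (H2O's own implementation may differ). The second assumes the leaking statistic is a per-category rate of the label, and computes it for each row from the other rows only, with optional Gaussian noise. The column names device and avoidable are hypothetical.

```python
import random
from collections import defaultdict

def upsample_minority(rows, label_key, rng):
    """Randomly duplicate smaller classes until every class has the same size."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced

def leave_one_out_rate(rows, category_key, label_key, rng, noise=0.0):
    """Per-row label rate among the OTHER rows of the same category."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        sums[row[category_key]] += row[label_key]
        counts[row[category_key]] += 1
    prior = sum(sums.values()) / len(rows)  # fallback for single-row categories
    encoded = []
    for row in rows:
        c = row[category_key]
        n_others = counts[c] - 1
        # Hold the current row out: subtract its own label before averaging.
        rate = (sums[c] - row[label_key]) / n_others if n_others else prior
        encoded.append(rate + rng.gauss(0.0, noise) if noise else rate)
    return encoded

rows = [
    {"device": "A", "avoidable": 1}, {"device": "A", "avoidable": 0},
    {"device": "A", "avoidable": 0}, {"device": "B", "avoidable": 0},
    {"device": "B", "avoidable": 0}, {"device": "C", "avoidable": 0},
]
rng = random.Random(0)

encoded = leave_one_out_rate(rows, "device", "avoidable", rng)
noisy = leave_one_out_rate(rows, "device", "avoidable", rng, noise=0.05)
balanced = upsample_minority(rows, "avoidable", rng)
```

The statistic is computed before upsampling here on purpose: duplicated rows would otherwise see copies of their own label among the "other" rows and reintroduce the leak.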

Machine Learning to Improve Customer Experience

Problem: Customer Experience Metric (CXE Metric)

In a CMTS (Cable Modem Termination System), ports are logically bonded to form a “Service Group” (SG).

SG Utilization = Customer experience?

Why Do We Need CXE Metric?

The CXE metric helps us:
•  Understand customers’ needs
•  Prioritize hardware deployment

Customer Experience Metric

•  Select features correlated with customer experience across different datasets
•  Join them and perform cleaning and aggregation
•  Cluster to form customer experience groups
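The three steps above can be sketched on toy data: join two datasets on a shared key, drop missing samples and average per key, then cluster the resulting feature vectors. Everything specific here is an assumption for illustration: the two features (utilization, error rate), the service-group ids, the choice of k = 2, and the use of a plain k-means with deterministic seeding.

```python
from collections import defaultdict
from statistics import mean

def join_and_aggregate(utilization, errors):
    """Join two per-sample datasets on service-group id; average each feature."""
    util_by_sg, err_by_sg = defaultdict(list), defaultdict(list)
    for sg, value in utilization:
        if value is not None:  # cleaning: skip missing samples
            util_by_sg[sg].append(value)
    for sg, value in errors:
        if value is not None:
            err_by_sg[sg].append(value)
    shared = sorted(set(util_by_sg) & set(err_by_sg))  # inner join on the key
    return {sg: (mean(util_by_sg[sg]), mean(err_by_sg[sg])) for sg in shared}

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded with the first k points for determinism."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return assignment, centroids

# Hypothetical samples as (service group id, value); None marks a missing reading.
utilization = [("sg1", 0.30), ("sg1", 0.34), ("sg2", 0.90), ("sg2", None),
               ("sg2", 0.94), ("sg3", 0.28), ("sg4", 0.88), ("sg5", 0.50)]
errors = [("sg1", 0.01), ("sg2", 0.20), ("sg3", 0.02),
          ("sg4", 0.25), ("sg4", None)]

features = join_and_aggregate(utilization, errors)
ids = list(features)
assignment, centroids = kmeans([features[sg] for sg in ids], k=2)
groups = dict(zip(ids, assignment))
```

With real data the features would usually be standardized before clustering so that no single feature dominates the distance, and k would be chosen by inspecting the resulting groups.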

Machine Learning for More Resilient & Reliable Products

The Evolution of Resiliency – Scale It!

System Errors

•  User experiences an issue

Customer Contact

•  Effort required

Agent Manually Fixes

•  Effort required

System Errors

•  User experiences an issue

Machine Learning

•  Intelligent scoring for solution

Automated / Suggested Fix

•  Issue resolved with lower effort

We can reduce effort for customers and for Customer Care by building intelligent systems.

Self Healing & Sharing Context

Machine Learning: The Promise / Challenge of Operationalization

Real-time Data + Operationalized Models -> Better Products

“However valuable these PhDs are, the organizations that have been lucky enough to secure these resources are realizing the limitations in human-powered data science: it’s simply not a scalable solution.”

“The commonality across all of these new technologies is that they offer something additional humans cannot provide: the power of scale. Organizations that do not have a strategic initiative to regularly and organically engage with its customers will be at a serious disadvantage. Soon, AI-driven engagement models that interpret data and intuitively interact with clients will be the norm.”

Harvard Business Review: “Data Scientists Don’t Scale”: https://hbr.org/2015/05/data-scientists-dont-scale

Challenges in Operation: Getting Data in Real-time

•  Various sources of data in different formats
•  Enable real-time queries on customer event data

Challenges in Operation: Computation in Real-time

•  Challenges
  •  Handle the heavy computation involved in transforming raw data
  •  Respond quickly to a large number of prediction requests
  •  Update the model with the latest data
•  Potential solution
  •  Spark + Sparkling Water

Tools & Infrastructure to integrate with Actual Products

Data

•  Real-time Production
•  Schema Management
•  Governance

Models

•  Versioning
•  Operationalization
•  Publishing / Deployment

Integration

•  Execution at Runtime
•  System APIs
•  Validation