Big Bang to New Economy - Gateway Analytics Network 2015


FROM THE BIG BANG TO THE NEW ECONOMY, A JOURNEY IN MAKING SENSE OF BIG DATA

Patrick Deglon
Director of Engineering, Analytics Area Tech Lead
pdeglon@motorola.com
linkd.in/pdeglon


from the Big Bang…

Image: CERN

13.8 billion years

5 billion years

1 billion years

300,000 years

2 min

0.0000000001 sec

10^-34 sec = 0.0…001 sec (34 zeros)

10^-43 sec = 0.0…001 sec (43 zeros)

From 1996 to 2002, I worked at CERN (the European Laboratory for Particle Physics) for my MS and PhD at the University of Geneva


Geneva Switzerland

Image: CERN

17-mile underground tunnel for the LEP & LHC accelerators

Source: CERN

Mont Blanc

Image: CERN. Source: CERN

Tape robot (Source: CERN)

PAW – Physics Analysis Workstation (Source: Wikipedia)

Data collection & analysis was done in Fortran. Advanced analysis/statistics was done through PAW. [1996-2002]


Example of a particle collision


Solving the puzzle… which particles go together?

?

Particles: A, B, C, D

1. AB + CD?
2. AC + BD?
3. AD + BC?


Solution: Big Data infrastructure enables large-scale computation, such as combining all possibilities (cross-product)
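A minimal sketch of the "combine all possibilities" step, assuming four detected particles labelled A to D as on the puzzle slide. The function name `pairings` is hypothetical and not from the talk; it only enumerates the candidate pairings, which a real analysis would then score.

```python
def pairings(particles):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not particles:
        yield []
        return
    first, rest = particles[0], particles[1:]
    for i, partner in enumerate(rest):
        # Pair `first` with each possible partner, then pair up what is left.
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


candidates = list(pairings(["A", "B", "C", "D"]))
for option in candidates:
    print(" + ".join(a + b for a, b in option))
```

For four particles this lists the three options from the slide; the number of pairings grows quickly with the number of particles, which is where the large-scale infrastructure comes in.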

Schematic View
• Statistical Noise
• Signal (particle resonance)

CERN Example (discovery of a new particle bb)
Source: http://www.atlas.ch/news/2011/ATLAS-discovers-its-first-new-particle.html

Size of the electron?

R < 5.1 × 10^-19 m ***

*** Patrick Deglon, Etude de la diffusion Bhabha avec le détecteur L3 au LEP, Th. phys. Genève, 2002; Sc. 3332


Extra dimension?

M_S > 1.1 TeV ***

[Diagram: e+ e- → e+ e- scattering with a graviton; labels: our universe in 4 dimensions, extra dimension]

*** Patrick Deglon, Etude de la diffusion Bhabha avec le détecteur L3 au LEP, Th. phys. Genève, 2002; Sc. 3332


… to the New Economy


Imagine a world...

… where information is ubiquitous (anytime & anywhere)


… where buildings can recognize your presence


… where even streetlights are connected to the Internet


Welcome to a connected world

Examples

#1 KPI reporting & Impact Measurement

#2 Marketing

#3 The cost of Big Data

#4 Human Resources

Example #1

KPI reporting & Impact Measurement


So, how is the business doing?

Key Performance Indicators

Simplified Business Flow
• Motorola Factory: # Shipments
• Distribution Channels: # Sales
• First Usage: # Activations

Google BigQuery

Motorola Cloud

Insights

...


Google Spreadsheet as a Reporting Engine


• Spreadsheet
• Google BigQuery
• HTML body in sheet
• Google Mail
• Google Apps Script
• Google Scheduler
• Google Charts


Enabling Self-Service Analytics

[Architecture diagram components]
• Google Analytics data
• Instrumentation (App Engine)
• ETL (App Engine)
• BigQuery datasets (SQL)
• Users, Reports (Datastore)
• Machine Learned Models
• Google App Engine
• Google Drive
• Tableau reports
• gChart + D3 + Tableau API
• Internal Users

42…so what?


Answer to the Ultimate Question of Life, The Universe, and Everything


Relativity
• Versus time (WoW, YoY, …)
• Versus plan (target, budget, forecast, …)
• Versus other products, customers, markets
• Versus competition
• Versus internal/external/social events
• Versus trend in other metrics
• …


Exception Reports (Illustrative Examples)

Data Issue

[Chart: Number of Active Users using their camera in US; # Active Users vs. time (day), with Normal Band]

Root Causes

● Some files don’t get loaded properly in BigQuery, creating gaps in user count.
● The instrumentation changed on the device
● Customer behavior

Business Issue

[Chart: Number of System Restarts; # System Restarts vs. time (day)]

Root Cause

A buggy Android app doesn’t handle the timezone change properly, crashing the devices.


Approach

1. Define a multi-dimensional cube with real data. For example: Product, Market, # Users taking a picture

2. Each cell then becomes a time series

3. Clean the data (remove seasonality, weekday cycle and any other known perturbation)

4. Fit trend and establish volatility band (2 std deviations)

5. Measure variance versus prediction for each cell (e.g. market/product/metric) and trigger an exception if outside band

6. Collect all exceptions into a matrix and apply fuzzy logic* to propose potential root causes

* Note: (Bayesian likelihood with knowledge base)

[Cube diagram: axes are markets (e.g. BR), products, metrics]
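Steps 2, 4 and 5 of the approach can be sketched for a single cell of the cube as follows. This is an illustration under assumptions, not the talk's implementation: the trend is taken to be a straight line fitted by least squares, the cleaning of step 3 and the fuzzy logic of step 6 are skipped, and the data and the names `fit_trend` and `exceptions` are made up.

```python
from statistics import mean, stdev


def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def exceptions(history, recent, n_sigma=2.0):
    """Return indexes (within `recent`) of points outside the volatility band."""
    slope, intercept = fit_trend(history)
    residuals = [y - (slope * x + intercept) for x, y in enumerate(history)]
    band = n_sigma * stdev(residuals)  # volatility band, 2 std deviations by default
    start = len(history)
    return [
        i
        for i, y in enumerate(recent)
        if abs(y - (slope * (start + i) + intercept)) > band
    ]


# One cell of the cube, e.g. (market, product, metric), as a daily time series.
history = [100, 103, 101, 106, 104, 109, 107, 112, 110, 115]
recent = [113, 118, 90]  # the last point simulates a gap in the loaded data
print(exceptions(history, recent))
```

Running the same check over every cell gives the matrix of exceptions that step 6 starts from.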

Measuring impact of initiatives

[Chart: Pre/Post analysis illustrative example (Simulation). Number of listings (0 to 35,000) from Aug 1st to Oct 1st, 2012 vs. 2011, with points A, B, C and D; annotations: Initiative launched, pre, post, Impact of the initiative]

• Used to measure the impact of an initiative in a full market or a market segment

• Randomized Test/Control group methodology is the gold standard in research

[Chart: A/B test illustrative example (Simulation). Number of purchases (0 to 450) from Aug 1st to Oct 1st for the test group and the control group; annotations: Initiative launched, Impact of the initiative]
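A hedged sketch of how the impact in a test/control comparison might be quantified, together with its statistical error. It assumes the two groups are independent and compares daily means after launch; the numbers are invented and `ab_impact` is a hypothetical helper, not something named in the talk.

```python
from math import sqrt
from statistics import mean, stdev


def ab_impact(test, control):
    """Return (lift, standard_error) for the difference in daily means."""
    lift = mean(test) - mean(control)
    # Standard error of a difference of two independent sample means.
    se = sqrt(stdev(test) ** 2 / len(test) + stdev(control) ** 2 / len(control))
    return lift, se


# Simulated daily purchases after the initiative launched (made-up numbers).
control_group = [300, 310, 295, 305, 290]
test_group = [350, 360, 345, 355, 340]
lift, se = ab_impact(test_group, control_group)
print(f"impact = {lift:.1f} +/- {se:.1f} purchases per day")
```

An impact that is several standard errors away from zero is unlikely to be noise; a pre/post comparison needs the same kind of error estimate, plus a baseline such as the previous year.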


Campaign Measurement

Campaigns
• Campaign Id
• Campaign Name
• Time range
• Set of Countries
• Set of Products

X

KPI
• Date
• Country
• Product
• KPI[]

Trend
• Campaign Id
• Date
• Total of KPI[]

Time Series Analysis

Summary
• Campaign Id
• Campaign Name
• Impact Measurement[]
• Statistical Error[]
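The Campaigns X KPI step can be read as a join that totals each KPI per campaign and date over the campaign's countries and products. The sketch below is one possible reading, with made-up records and a hypothetical `campaign_trend` helper; it keeps dates outside the campaign's time range on the assumption that the time series analysis needs them as a baseline.

```python
from collections import defaultdict
from datetime import date

# Hypothetical campaign definition, mirroring the fields on the slide.
campaigns = [
    {
        "id": "c1",
        "name": "Holiday promo",
        "start": date(2015, 1, 10),
        "end": date(2015, 1, 12),
        "countries": {"US", "BR"},
        "products": {"phone_x"},
    }
]

# Hypothetical KPI rows: one row per date, country and product.
kpi_rows = [
    {"date": date(2015, 1, 9), "country": "US", "product": "phone_x", "activations": 90},
    {"date": date(2015, 1, 10), "country": "US", "product": "phone_x", "activations": 120},
    {"date": date(2015, 1, 10), "country": "BR", "product": "phone_x", "activations": 60},
    {"date": date(2015, 1, 10), "country": "US", "product": "phone_y", "activations": 40},
    {"date": date(2015, 1, 10), "country": "IN", "product": "phone_x", "activations": 30},
]


def campaign_trend(campaigns, kpi_rows, metric):
    """Total one KPI per (campaign id, date) over the campaign's countries and products."""
    totals = defaultdict(int)
    for c in campaigns:
        for row in kpi_rows:
            if row["country"] in c["countries"] and row["product"] in c["products"]:
                totals[(c["id"], row["date"])] += row[metric]
    return dict(totals)


trend = campaign_trend(campaigns, kpi_rows, "activations")
for (campaign_id, day), total in sorted(trend.items()):
    print(campaign_id, day.isoformat(), total)
```

In a warehouse this would typically be a SQL join rather than a Python loop; the loop only makes the matching rule explicit.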


Example of one campaign cell measurement

Campaign Window


Campaign Measurement


Define Campaign

Run Campaign

Measure Impacts

Drive Insights


Descriptive Analytics

Predictive Analytics

Prescriptive Analytics

Holy Grail of Analytics


Example #2

Marketing

How many sales did my campaign generate?

Case study: Online Search

Natural/Organic Search (free)

Paid Search


Customer behaviors and Internet Marketing Investment

Which customer purchases are influenced by Marketing?

[Timeline diagram, Jan 1st to Mar 1st: purchases ($) and ad clicks for one customer. Behavioral purchases are uncorrelated to Marketing; the influenced purchase is correlated to Marketing. With an attribution window of X days after each click, 2 purchases are missing; with a window of Y days, all purchases are counted as incremental although 1 purchase is uncorrelated.]


Remember this physics problem?

?

Particles: A, B, C, D

1. AB + CD?
2. AC + BD?
3. AD + BC?


Solution: Big Data infrastructure enables large-scale computation, such as combining all possibilities (cross-product)

Schematic View
• Statistical Noise
• Signal (particle resonance)

Combining correlated and uncorrelated events produces a distribution with statistical noise (which is simple enough to extract) and the signal being sought.

CERN Example (discovery of a new particle bb)
Source: http://www.atlas.ch/news/2011/ATLAS-discovers-its-first-new-particle.html

Latency time for each click-purchase pair

User clicks on an ad banner at time = 0
User makes a purchase X days later

[Histogram: number of events (click-purchase pairs) vs. latency (days), from -14 to +14]
• Negative latency: purchase before click (no causality). Behavior only: level of behavioral purchases
• Positive latency: purchase after click (potential causality). Behavior & Internet Marketing impact: marketing incrementality (correlated purchases) on top of the level of behavioral purchases
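A minimal sketch of the latency idea: pair every click with every purchase of the same user, histogram the signed latency, and treat the negative side as the behavioral baseline. It assumes the behavioral level is symmetric around the click and ignores same-day pairs; the latencies are invented and `incremental_pairs` is a hypothetical name.

```python
from collections import Counter


def incremental_pairs(latencies_days, window=14):
    """Estimate marketing-driven click-purchase pairs from signed latencies.

    latency = purchase day - click day; negative means purchase before click.
    Returns (incremental, before, after).
    """
    hist = Counter(d for d in latencies_days if d != 0 and abs(d) <= window)
    before = sum(n for d, n in hist.items() if d < 0)  # behavior only
    after = sum(n for d, n in hist.items() if d > 0)   # behavior + marketing
    return after - before, before, after


# Signed latencies (days) for all click-purchase pairs, made-up values.
latencies = [-12, -9, -5, -3, -1, 1, 1, 1, 2, 2, 3, 4, 6, 9, 13]
incremental, before, after = incremental_pairs(latencies)
print(f"behavioral level: {before}, after click: {after}, incremental: {incremental}")
```

The subtraction plays the same role as removing the statistical noise under a particle resonance: what remains above the baseline is the signal.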


Method 1

Channel      Sales   ROI
Channel A    8%      +20%
Channel B    5%      -10%
Channel C    1%      +10%

• Reduce spend on channel B
• Invest in channel A
• When prioritizing, ignore channel C

<>

Method 2

Channel      Sales   ROI
Channel A    7%      -20%
Channel B    6%      +30%
Channel C    12%     +60%

• Reduce spend on channel A
• Invest heavily on channel C
• Marketing actually accounts for 25% of the site

… So what?


Consumer Heterogeneity and Paid Search Effectiveness: A Large Scale Field Experiment, Thomas Blake, Chris Nosko, Steven Tadelis

Case study: Online Search


So, what’s next?

Marketing 101

[2×2 matrix: Don’t Do Marketing / Do Marketing versus No Purchase / Purchase; legend: Cost, Direct Return, Incr Return]

Rule #1: Never, ever, spend money unless you really-really have to


So, what’s next?

[Chart: Output (Cost and Return (Revenues)) vs. Investment (costs), with Profit between the two curves]
• Max Profit: ΔReturn = ΔInvestment, i.e. marginal ROI = 0
• Max Sales, No Profit: Total ROI = 0

Rule #2: If you have to spend, spend up to the point of marginal return = 0


Marginal Return Chart

[Chart: ROI vs. Cumulative Cost, with spend buckets ordered from Spend Bucket 0 (most profitable) through Spend Bucket i to Spend Bucket N (least profitable)]
• High ROI: in-depth analysis required to validate
• Point of marginal return = 0 (maximum profit)
• Current Spend Level
• Area/initiatives/segments with negative profitability: cost reduction opportunity!
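The marginal return chart can be approximated by ranking spend buckets by ROI and cutting everything past the point where ROI turns negative. The sketch assumes ROI = (incremental return - cost) / cost per bucket, consistent with marginal ROI = 0 when ΔReturn = ΔInvestment; bucket names, figures and `optimal_spend` are illustrative only.

```python
def roi(bucket):
    """ROI of one bucket given as (name, cost, incremental_return)."""
    _, cost, incremental_return = bucket
    return (incremental_return - cost) / cost


def optimal_spend(buckets):
    """Rank buckets from most to least profitable and split at ROI = 0.

    Returns (kept, cut): `cut` holds the buckets with negative profitability.
    """
    ranked = sorted(buckets, key=roi, reverse=True)
    kept = [b for b in ranked if roi(b) >= 0]
    cut = [b for b in ranked if roi(b) < 0]
    return kept, cut


# (name, cost, incremental return), made-up numbers.
buckets = [
    ("search_brand", 100, 90),
    ("email", 50, 200),
    ("display", 200, 260),
    ("affiliates", 150, 120),
]
kept, cut = optimal_spend(buckets)
print("keep:", [name for name, _, _ in kept])
print("cost reduction opportunity:", [name for name, _, _ in cut])
```

The split is only as good as the incremental return estimates feeding it, which is why the chart flags very high ROI buckets for validation.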


Example #3

The cost of Big Data


What is my share of the pile?


Google Cloud Platform Cost

~ 0 > 0

How to determine who is costing how much?


How to track BigQuery usage?

Google does not provide a data feed on its customers’ usage of BigQuery. However, three APIs can help us:

• bigquery.projects.list: List all (visible) projects
• bigquery.jobs.list: List all the jobs in a specified project. Note: use projection = full to get the email of the user
• bigquery.jobs.get: Retrieve the specified job by ID

The queries are parsed to extract the underlying tables used, and the data is stored in the App Engine datastore as well as in BigQuery through the streaming API (close to real-time).
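The talk does not say how the queries are parsed. As a rough illustration only, a regular expression can pull table references that follow FROM or JOIN, including the bracketed `[project:dataset.table]` form; the pattern, the sample query and `tables_used` are assumptions, and the sketch does not handle comma-separated table lists or every SQL construct.

```python
import re

# Identifier after FROM or JOIN, with optional [brackets]; requires at least
# one ':' or '.' separator so that subqueries such as "FROM (" are ignored.
TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+\[?([\w\-]+(?:[:.][\w\-]+)+)\]?", re.IGNORECASE
)


def tables_used(query):
    """Return the distinct table references in a query, in order of appearance."""
    seen = []
    for name in TABLE_REF.findall(query):
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical query text, as it might come back from a job's configuration.
sql = """
SELECT d.model, COUNT(*) AS n
FROM [my-project:telemetry.camera_events] AS e
JOIN telemetry.devices AS d ON e.device_id = d.device_id
GROUP BY d.model
"""
print(tables_used(sql))
```

Storing one row per job and table makes questions such as "who uses this table?" a simple lookup.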


Beyond Queries, we also scan Tables

• bigquery.projects.list: List visible projects
• bigquery.datasets.list: List datasets within a project
• bigquery.tables.list: List tables within a dataset
• bigquery.tables.get: Get details about a table

[Diagram labels: datastore, queries information]


Enables Enlightenment Questions for an Analyst

• When was this table last refreshed?
• How often is it refreshed?
• How was it created?
• Underlying data sources/tables?
• Who created this table?
• Who knows how to use this table?
• Where can I find this great query I ran?
• Who knows how to use this tag/metric?
• How much bandwidth am I using?
• How much space are my tables using?
• How much does my usage cost?

Rick Hotten


How much bandwidth am I using in BigQuery?


BigQuery Pricing

Storage Cost
• $0.02 per GB per month
• $6.83 per TB per day

Query Cost
• On-demand: $5 per TB
• Reserved capacity: $20,000 per month for 5 GB/s unit, i.e. $1.58 per TB*

* Note: for continuous usage of the 5 GB/s bandwidth


How much does my usage of BigQuery cost?

Assuming that the Motorola bandwidth is elastic, i.e. we always pay for the optimal number of units (5 GB/s each), we can use $1.58 per TB as a proxy
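The $1.58 per TB proxy can be reproduced from the reserved-capacity price. The slides do not state the conversion, so the 30-day month and 1 TB = 1024 GB below are assumptions that happen to give the quoted figure; `usage_cost` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30   # assumption: not stated on the slide
GB_PER_TB = 1024      # assumption: binary terabytes

UNIT_PRICE_PER_MONTH = 20_000  # USD per reserved unit, from the pricing slide
UNIT_BANDWIDTH_GB_S = 5        # GB/s per reserved unit, from the pricing slide

# Data a fully used unit can scan in one month, in TB.
tb_per_month = UNIT_BANDWIDTH_GB_S * SECONDS_PER_DAY * DAYS_PER_MONTH / GB_PER_TB
reserved_price_per_tb = UNIT_PRICE_PER_MONTH / tb_per_month


def usage_cost(tb_scanned, price_per_tb=reserved_price_per_tb):
    """Approximate cost in USD of scanning `tb_scanned` terabytes."""
    return tb_scanned * price_per_tb


print(f"{tb_per_month:,.2f} TB per month per unit")
print(f"${reserved_price_per_tb:.2f} per TB at continuous usage")
print(f"${usage_cost(10):.2f} for a user scanning 10 TB")
```

Multiplying each user's scanned bytes by this rate gives the per-user share of the bill.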


Weekly Email to largest BQ users


Example #4

Human Resources


It’s time for your annual review

Annual Review Feedback


What is the optimal method to determine your key work partners for feedback, with objectivity and relevance?


Scraping the traces of your collaboration: Gmail and Google Calendar

• gmail.users.messages.list: List user emails (by page of 100)
• gmail.users.messages.get: Get the details of 1 email
• calendar.events.list: List events & metadata (by page of 100)

[Diagram label: datastore]

Scoring

1 pt = 30 min meeting = 10 emails
Weight is divided by number of participants

Example

Fred 34 pts
Nancy 24 pts
Daniel 17 pts
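The scoring rule can be sketched as below. It assumes "number of participants" means the other people in the meeting or on the email, excluding yourself, which the slide does not specify; the inputs are invented and `score_partners` is a hypothetical name.

```python
from collections import defaultdict

POINTS_PER_MEETING_MINUTE = 1 / 30  # 1 pt = 30 min meeting
POINTS_PER_EMAIL = 1 / 10           # 1 pt = 10 emails


def score_partners(meetings, emails):
    """Rank work partners by collaboration points.

    meetings: list of (minutes, [other participants])
    emails:   list of [other participants], one entry per email
    """
    scores = defaultdict(float)
    for minutes, people in meetings:
        for person in people:
            # The weight of a meeting is shared among its participants.
            scores[person] += minutes * POINTS_PER_MEETING_MINUTE / len(people)
    for people in emails:
        for person in people:
            scores[person] += POINTS_PER_EMAIL / len(people)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up calendar and mail traces.
meetings = [(60, ["Fred"]), (30, ["Fred", "Nancy"]), (30, ["Nancy", "Daniel"])]
emails = [["Fred"]] * 10 + [["Daniel", "Nancy"]] * 10
ranking = score_partners(meetings, emails)
for name, points in ranking:
    print(f"{name} {points:.1f} pts")
```

Dividing by the number of participants keeps large meetings and mass emails from dominating the ranking.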

Wrapping Up… CERN vs New Economy

CERN

• Write kilometers-long Fortran code
• Analysis can run for many hours… before a batch robot error
• Study billions of collision data
• Great depth of data structure & complexity
• Know your local expert for questions – but try to find the solution by yourself… much quicker
• Remove “bad runs” (unclean data batch)
• Transform a complex system into insights
• Communicate findings to conferences
• Strong competitive landscape (4 distinct experiments competing to be the first to publish, or to publish better results)

New Economy

• Write miles-long SQL code
• Queries can run for many hours… before a spool space error
• Study billions of customer data
• Great depth of data structure & complexity
• Know your local expert for questions – but try to find the solution by yourself… much quicker
• Remove “wackos” (non material transactions)
• Transform a complex system into insights
• Communicate recommendation to business review
• Strong competitive landscape
