MarketShare confidential and proprietary 1 Predictive Analytics as a Service Prateem Mandal, Tech Lead Architect November 02, 2015

H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Page 1: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

MarketShare confidential and proprietary 1

Predictive Analytics as a Service

Prateem Mandal, Tech Lead Architect

November 02, 2015

Page 2: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

MarketShare: A Decade of Decision Enablement

20-30% increase in marketing efficiency, 3-4% increase in revenue, 10-50% ROI

Science & Math
• Proprietary analytical techniques fuse market level, consumer and CRM techniques
• 100+ Data Scientists and PhDs with operational modeling experience
• 10 Patents and 36 patents pending for applied modeling and technology

Application
• 70K Scenarios generated by clients in 2014
• Predictive modeling on tens of TB of data per customer
• 1,000+ active users

Enablement
• Global CMO network
• Experience deploying technologies at over 100 Fortune 500 companies
• Deep strategic partnerships

Page 3: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Page 4: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Attribution Funnel creation; Post Processing; Modeling Feature Set; Transformed Modeling Feature Set; Feature Engineering Configs; ETL Configs; Model UAT; Automated Modeling

Page 5: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Modeling Feature Set; Transformed Modeling Feature Set

Page 6: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE (Online Events, Offline Events, Past Activity); LOOKBACK; EFFECT (Conversion: PURCHASE)

Page 7: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE (Online Events, Offline Events, Past Activity); events: EMAIL, SEARCH, DISPLAY, NEWSPAPER, TV, ACTIVITY; LOOKBACK; EFFECT (Conversion: PURCHASE)

Page 8: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; CHANNEL; EFFECT (Conversion: PURCHASE)

Page 9: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; CHANNEL; EFFECT (Conversion: PURCHASE)

SUB CHANNEL: EMAIL, BRAND AUDIO, BRAND VIDEO, BRAND, NON BRAND, NON BRAND, BRAND VIDEO, BRAND VIDEO

Page 10: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; CHANNEL; EFFECT (Conversion: PURCHASE)

SUB CHANNEL: EMAIL, BRAND AUDIO, BRAND VIDEO, BRAND, NON BRAND, NON BRAND, BRAND VIDEO, BRAND VIDEO

TACTIC: EMAIL, TARGETING, UNKNOWN, CON, PLA, TARGETING, NON TARGETING, TARGETING

Page 11: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; CHANNEL; LOOKBACK; EFFECT (Conversion: PURCHASE)

SUB CHANNEL: EMAIL, BRAND AUDIO, BRAND VIDEO, BRAND, NON BRAND, NON BRAND, BRAND VIDEO, BRAND VIDEO

TACTIC: EMAIL, TARGETING, UNKNOWN, CON, PLA, TARGETING, NON TARGETING, TARGETING

Page 12: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; CHANNEL; LOOKBACK; EFFECT (Conversion: PURCHASE)

SUB CHANNEL: EMAIL, BRAND AUDIO, BRAND VIDEO, BRAND, NON BRAND, NON BRAND, BRAND VIDEO, BRAND VIDEO

TACTIC: EMAIL, TARGETING, UNKNOWN, CON, PLA, TARGETING, NON TARGETING, TARGETING

ATTRIBUTION: 88%

Page 13: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; CHANNEL; LOOKBACK; EFFECT (PURCHASE)

Page 14: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; LOOKBACK; EFFECT

Page 15: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; LOOKBACK; EFFECT

Page 16: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; AGGREGATED USER SEQUENCE; EFFECT; METRIC 1, METRIC 2, ..., METRIC i; CHANNEL

Page 17: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER; USER SEQUENCE; CAUSE; EFFECT

$\mathrm{GHT}(x_i) = \dfrac{1}{1 + \min(x_i)}$

Page 18: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE; CAUSE; EFFECT

$\mathrm{GHT}(x_i) = \dfrac{1}{1 + \min(x_i)}$

Page 19: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE (values 5, 7, 13); CAUSE; EFFECT

$\mathrm{GHT}(x_i) = \dfrac{1}{1 + \min(x_i)}$

Page 20: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE (values 5, 7, 13); CAUSE; EFFECT; metric row (EFFECT, USER, METRIC 1, METRIC 2, ..., METRIC i) with inputs 13, 7, 5

$\mathrm{GHT}(x_i) = \dfrac{1}{1 + \min(x_i)}$

Page 21: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE (values 5, 7, 13); CAUSE; EFFECT; metric row (EFFECT, USER, METRIC 1, METRIC 2, ..., METRIC i) with inputs 13, 7, 5

$\mathrm{GHT}(x_i) = \dfrac{1}{1 + \min(x_i)}$

Metric value: $\dfrac{1}{1 + \min(13, 7, 5)}$

Page 22: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE (values 5, 7, 13); CAUSE; EFFECT; metric row (EFFECT, USER, METRIC 1, METRIC 2, ..., METRIC i) with inputs 13, 7, 5

$\mathrm{GHT}(x_i) = \dfrac{1}{1 + \min(x_i)}$

Metric value: $\dfrac{1}{6}$

Page 23: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE (values 5, 7, 13); CAUSE; EFFECT; metric row (EFFECT, USER, METRIC 1, METRIC 2, ..., METRIC i) with inputs 13, 7, 5

$\mathrm{GHT}(x_i) = \dfrac{1}{1 + \min(x_i)}$

Metric value: 0.1666667
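The GHT arithmetic on these slides can be reproduced in a few lines. This is a minimal sketch, assuming the inputs 5, 7 and 13 are non-negative gaps between events in the user sequence and the effect (the slides do not state the unit); the function name `ght` is a hypothetical label, not a name from the talk.

```python
def ght(values):
    """Compute 1 / (1 + min(values)), the GHT formula shown on the slide.

    Assumption: values are non-negative, so the denominator is never zero.
    """
    if not values:
        raise ValueError("GHT needs at least one value")
    return 1.0 / (1.0 + min(values))


# Inputs from the slide: 13, 7 and 5; the smallest value (5) drives the metric.
print(round(ght([13, 7, 5]), 7))  # 0.1666667
```

Because only the minimum enters the formula, the metric depends on the smallest input alone and shrinks toward zero as that value grows.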

Page 24: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Attribution Funnel creation; Post Processing

Page 25: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: AGGREGATED USER SEQUENCE; metric row (EFFECT, USER, METRIC 1, METRIC 2, METRIC i)

Values shown: 0.1666667

Page 26: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: AGGREGATED USER SEQUENCE; two metric rows (EFFECT, USER, METRIC 1, METRIC 2, METRIC i)

Values shown: 0.1666667; 0.1666667; 0, 0.999999, 1, 1, 0

Page 27: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: AGGREGATED USER SEQUENCE; two metric rows (EFFECT, USER, METRIC 1, METRIC 2, METRIC i)

Values shown: 0.1666667; 0.1666667; 0, 0.999999, 1, 1, 0

Attribution values shown: 0.88, 0.35, 0.41, 0.12

Page 28: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: AGGREGATED USER SEQUENCE; two metric rows (EFFECT, USER, METRIC 1, METRIC 2, METRIC i); attribution row (METRICS, EFFECT, USER, ATTR, M-ATTR)

Values shown: 0.1666667; 0.1666667; 0, 0.999999, 1, 1, 0

Attribution values shown: 0.88; 0.88, 0.35, 0.41, 0.12

Page 29: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Attribution Funnel creation; Post Processing

Page 30: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Column labels: METRICS, EFFECT, USER, ATTR, M-ATTR; USER, EFFECT, CAUSE

Values shown: 0.88

Page 31: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Column labels: CAUSE, EFFECT, USER, ATTR, M-ATTR, E-ATTR; METRICS, EFFECT, USER, ATTR, M-ATTR; USER, EFFECT, CAUSE

Values shown: 0.117, 0.117, 0.117; 0.35; 0.88; 0.88

Page 32: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Column labels: CAUSE, EFFECT, USER, ATTR, M-ATTR, E-ATTR; METRICS, EFFECT, USER, ATTR, M-ATTR; USER, EFFECT, CAUSE

Values shown: 0.000, 0.000, 0.350; 0.35; 0.88; 0.88
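One possible reading of the two preceding slides is that a metric-level attribution (M-ATTR, 0.35) is distributed over the three events behind that metric to give event-level attribution (E-ATTR), either equally (0.117 each) or entirely to the last event (0.000, 0.000, 0.350). The sketch below only reproduces those numbers under that assumption; the function names are hypothetical and the slides do not name the splitting rules.

```python
def split_equal(metric_attr, n_events):
    """Share a metric's attribution equally across its events (assumed rule)."""
    return [metric_attr / n_events] * n_events


def split_last_event(metric_attr, n_events):
    """Give the whole metric attribution to the final event (assumed rule)."""
    return [0.0] * (n_events - 1) + [metric_attr]


equal = [round(v, 3) for v in split_equal(0.35, 3)]
last = [round(v, 3) for v in split_last_event(0.35, 3)]
print(equal)  # [0.117, 0.117, 0.117]
print(last)   # [0.0, 0.0, 0.35]
```

Either rule preserves the metric-level total, so event-level values still add back up to the M-ATTR they came from.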

Page 33: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Column labels: CAUSE, EFFECT, USER, ATTR, M-ATTR, E-ATTR; METRICS, EFFECT, USER, ATTR, M-ATTR; USER, EFFECT, CAUSE

Values shown: 0.35, 0.41, 0.12; 0.88; 0.88

Page 34: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Column labels: CAUSE, EFFECT, USER, E-ATTR; USER, EFFECT, CAUSE

Page 35: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE; USER; EFFECT; CAUSE; MEASURES

CID  Channel  Campaign     Tactic         SID  Geo  Zip  EID  Product  Brand   Group   ATTR
E1   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .05
E2   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .20
E3   DISPLAY  BRAND VIDEO  NON TARGETING  U1   NA   123  A1   Prod1    Brand1  Group1  .10
E4   SEARCH   BRAND        UNKNOWN        U1   NA   123  A1   Prod1    Brand1  Group1  .18
E5   SEARCH   NON BRAND    CON            U1   NA   123  A1   Prod1    Brand1  Group1  .08
E6   SEARCH   NON BRAND    PLA            U1   NA   123  A1   Prod1    Brand1  Group1  .15
E7   STORE    NEWS         --             U1   NA   123  A1   Prod1    Brand1  Group1  .12

Page 36: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE; USER; EFFECT; CAUSE; MEASURES

CID  Channel  Campaign     Tactic         SID  Geo  Zip  EID  Product  Brand   Group   ATTR
E1   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .05
E2   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .20
E3   DISPLAY  BRAND VIDEO  NON TARGETING  U1   NA   123  A1   Prod1    Brand1  Group1  .10
E4   SEARCH   BRAND        UNKNOWN        U1   NA   123  A1   Prod1    Brand1  Group1  .18
E5   SEARCH   NON BRAND    CON            U1   NA   123  A1   Prod1    Brand1  Group1  .08
E6   SEARCH   NON BRAND    PLA            U1   NA   123  A1   Prod1    Brand1  Group1  .15
E7   STORE    NEWS         --             U1   NA   123  A1   Prod1    Brand1  Group1  .12

Reporting dimensions

Channel  Campaign     Geo  Product  Brand   ATTR
DISPLAY  BRAND VIDEO  NA   Prod1    Brand1  .05
DISPLAY  BRAND VIDEO  NA   Prod1    Brand1  .20
DISPLAY  BRAND VIDEO  NA   Prod1    Brand1  .10
SEARCH   BRAND        NA   Prod1    Brand1  .18
SEARCH   NON BRAND    NA   Prod1    Brand1  .08
SEARCH   NON BRAND    NA   Prod1    Brand1  .15
STORE    NEWS         NA   Prod1    Brand1  .12

Page 37: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE; USER; EFFECT; CAUSE; MEASURES

CID  Channel  Campaign     Tactic         SID  Geo  Zip  EID  Product  Brand   Group   ATTR
E1   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .05
E2   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .20
E3   DISPLAY  BRAND VIDEO  NON TARGETING  U1   NA   123  A1   Prod1    Brand1  Group1  .10
E4   SEARCH   BRAND        UNKNOWN        U1   NA   123  A1   Prod1    Brand1  Group1  .18
E5   SEARCH   NON BRAND    CON            U1   NA   123  A1   Prod1    Brand1  Group1  .08
E6   SEARCH   NON BRAND    PLA            U1   NA   123  A1   Prod1    Brand1  Group1  .15
E7   STORE    NEWS         --             U1   NA   123  A1   Prod1    Brand1  Group1  .12

Reporting dimensions

Channel  Campaign     Geo  Product  Brand   ATTR
DISPLAY  BRAND VIDEO  NA   Prod1    Brand1  .35
SEARCH   BRAND        NA   Prod1    Brand1  .18
SEARCH   NON BRAND    NA   Prod1    Brand1  .23
STORE    NEWS         NA   Prod1    Brand1  .12
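The roll-up from event-level attribution to reporting dimensions on this slide amounts to a group-by and sum. The following is a minimal sketch using the slide's numbers; the tuple layout and the function name `roll_up` are assumptions made for illustration, and the actual pipeline shown later in the deck runs on Hive rather than Python.

```python
from collections import defaultdict

# (channel, campaign, attribution) per event, copied from the slide's event table.
EVENTS = [
    ("DISPLAY", "BRAND VIDEO", 0.05),
    ("DISPLAY", "BRAND VIDEO", 0.20),
    ("DISPLAY", "BRAND VIDEO", 0.10),
    ("SEARCH", "BRAND", 0.18),
    ("SEARCH", "NON BRAND", 0.08),
    ("SEARCH", "NON BRAND", 0.15),
    ("STORE", "NEWS", 0.12),
]


def roll_up(events):
    """Sum attribution per (channel, campaign) reporting dimension."""
    totals = defaultdict(float)
    for channel, campaign, attr in events:
        totals[(channel, campaign)] += attr
    # Round to two decimals to match the precision shown on the slide.
    return {key: round(value, 2) for key, value in totals.items()}


for key, value in roll_up(EVENTS).items():
    print(key, value)
```

Dropping Tactic, Zip and Group from the key is what collapses seven event rows into four reporting rows, and the totals still sum to the user-level attribution of 0.88.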

Page 38: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Diagram labels: USER SEQUENCE; USER; EFFECT; CAUSE; MEASURES

CID  Channel  Campaign     Tactic         SID  Geo  Zip  EID  Product  Brand   Group   ATTR
E1   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .05
E2   DISPLAY  BRAND VIDEO  TARGETING      U1   NA   123  A1   Prod1    Brand1  Group1  .20
E3   DISPLAY  BRAND VIDEO  NON TARGETING  U1   NA   123  A1   Prod1    Brand1  Group1  .10
E4   SEARCH   BRAND        UNKNOWN        U1   NA   123  A1   Prod1    Brand1  Group1  .18
E5   SEARCH   NON BRAND    CON            U1   NA   123  A1   Prod1    Brand1  Group1  .08
E6   SEARCH   NON BRAND    PLA            U1   NA   123  A1   Prod1    Brand1  Group1  .15
E7   STORE    NEWS         --             U1   NA   123  A1   Prod1    Brand1  Group1  .12

Reporting dimensions

Channel  Campaign     Geo  Product  Brand   ATTR  SPEND
DISPLAY  BRAND VIDEO  NA   Prod1    Brand1  .35   10.5
SEARCH   BRAND        NA   Prod1    Brand1  .18   8.33
SEARCH   NON BRAND    NA   Prod1    Brand1  .23   2.75
STORE    NEWS         NA   Prod1    Brand1  .12   14.85

Page 45: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle – Engineering Aspects

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Attribution Funnel creation; Post Processing; Modeling Feature Set; Transformed Modeling Feature Set; Feature Engineering Configs; ETL Configs; Model UAT; Automated Modeling

Page 46: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

stack_parameters.xml

stack_transforms.xml

stack.hql

Code

Generator

BaaS

Config

Generator

stack_gen.sh

<set>

<name>EVT_MARKETING_SEG_DIMS_SET</name>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Paid Search”</value></elements>

</elements>

</set>

<select>

........

<param>METRICS_EVTSTACK</param>

........

</select>

........

<set name="EVT_MARKETING_SEG_DIMS">

<attr>channel_type</attr>

</set>

#!/bin/bash ................. cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc` .................

ORCHESTRATION

RUN

GET

PUT

ACTION

BACKEND

1

3

4

5

stack_gen_config.xml

2

<param>METRICS_EVTSTACK</param>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<param>METRICS_EVTSTACK</param>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

CREATE TABLE stackevtterms

AS

SELECT

.. AS stackmetric_hasdonebefore_event_display,

.. AS stackmetric_hasdonebefore_event_paidsearch,

......

FROM

......

<set name="EVT_STACKMETRICS">
<val enable="TRUE">HASDONEBEFORE</val>
<val enable="FALSE">NUM</val>
</set>

#!/bin/bash ................. cmdstatus=`hive -f $PHASE_NAME.hql` .................

6

Input Config

Generated Config
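The slide suggests that the generated Hive column names come from crossing the enabled metrics with the configured segment values. The sketch below mimics only that naming step; the real generator is the Java jar invoked by the shell script and is driven by the XML configs, so the function `column_names`, its inputs, and the lowercase-and-strip-spaces rule are assumptions inferred from the names shown on the slide.

```python
def column_names(segments, metrics):
    """Build one column name per enabled metric and segment.

    segments: list of tuples of segment values, e.g. [("Display",), ("Paid Search",)]
    metrics:  dict mapping metric name to its enable flag
    Both shapes are hypothetical stand-ins for the XML configs.
    """
    names = []
    for metric, enabled in metrics.items():
        if not enabled:
            continue  # mirrors enable="FALSE" in EVT_STACKMETRICS
        for segment in segments:
            # "Paid Search" -> "paidsearch"; multiple dimensions are joined by "_"
            suffix = "_".join(value.replace(" ", "").lower() for value in segment)
            names.append(f"stackmetric_{metric.lower()}_event_{suffix}")
    return names


print(column_names(
    [("Display",), ("Paid Search",)],
    {"HASDONEBEFORE": True, "NUM": False},
))
# ['stackmetric_hasdonebefore_event_display', 'stackmetric_hasdonebefore_event_paidsearch']
```

Under this scheme, adding a segment value, a second segmentation dimension, or enabling another metric changes the generated columns without touching the query template.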

Page 47: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Model UAT; Automated Modeling

Page 48: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Automated Model Search – A Genetic Algorithm Approach

Attribution Stack

Modelling Spec
• Initial set of variables
• Business rules

Model

Build N Choice Models

Model Evaluation and Ranking
• Business Rules
• Goodness of Fit
• Statistical diagnosis

Population Generational Iteration
• Top models cross-over to generate offspring models
• Variable mutation

• Bayesian priors
• Coeff bounds
• Attribution bounds

Finalized Model
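The loop described on this slide (build N models, rank them, cross over and mutate the top ones, add new candidates) can be sketched generically. This is a toy illustration and not MarketShare's implementation: a "model" here is just a 0/1 mask over candidate variables, and `score` is a made-up stand-in for fitting a choice model and ranking it by business rules, goodness of fit and diagnostics. All names and parameters are hypothetical.

```python
import random


def score(mask, weights, penalty=0.05):
    # Toy model score: reward the selected variables, charge a cost per variable.
    return sum(w for m, w in zip(mask, weights) if m) - penalty * sum(mask)


def crossover(a, b, rng):
    # Single-point cross-over of two parent variable masks.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(mask, rng, rate=0.2):
    # Flip each variable in or out of the model with probability `rate`.
    return [1 - m if rng.random() < rate else m for m in mask]


def random_mask(n, rng):
    return [rng.randint(0, 1) for _ in range(n)]


def genetic_search(weights, pop_size=14, generations=20, top_k=4, seed=0):
    rng = random.Random(seed)
    n = len(weights)
    population = [random_mask(n, rng) for _ in range(pop_size)]
    history = []  # best score seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: score(m, weights), reverse=True)
        top = ranked[:top_k]
        history.append(score(top[0], weights))
        offspring = [crossover(*rng.sample(top, 2), rng) for _ in range(top_k)]
        mutants = [mutate(m, rng) for m in top]
        fresh = [random_mask(n, rng) for _ in range(pop_size - 3 * top_k)]
        # Top models survive unchanged, so the best score never decreases.
        population = top + offspring + mutants + fresh
    best = max(population, key=lambda m: score(m, weights))
    return best, score(best, weights), history


best, best_score, history = genetic_search([0.3, -0.2, 0.5, 0.1, -0.4, 0.2])
print(best, round(best_score, 3))
```

Keeping the top models in every generation is a design choice of this sketch; it guarantees a non-decreasing best score, whereas the per-source maxima reported on the next slide move up and down between iterations.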

Page 49: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

GA Progression over Iterations

Max of modelScore by Model Source

Iterations  CrossOver    Mutation     New           Max Score
1                                     0.814674133   0.814674133
2           0.840250827  0.874642259  -0.142370371  0.874642259
3           0.86051811   0.843359676  0.726339045   0.86051811
4           0.861854898  0.850294643  0.434102915   0.861854898
5           0.868137103  0.851901188  0.709897779   0.868137103
6           0.890517983  0.873505914  0.835870771   0.890517983
7           0.890538416  0.857600064  0.538250044   0.890538416
8           0.900386103  0.877312775  0.389444869   0.900386103
9           0.896563775  0.861402806  0.227565829   0.896563775
10          0.893281436  0.866702979  0.429202581   0.893281436
11          0.907564542  0.869666755  0.738092262   0.907564542
12          0.904108687  0.887093232  0.850036487   0.904108687
13          0.904159874  0.884415807  0.860378596   0.904159874
14          0.905516755  0.898474717  0.4425833     0.905516755
15          0.911024584  0.904605785  0.776345059   0.911024584
16          0.911007328  0.905694898  0.496509904   0.911007328
17          0.910692312  0.898231697  0.815998341   0.910692312
18          0.895302999  0.901081899  0.45632258    0.901081899
19          0.907471425  0.860367793  0.675483196   0.907471425
20          0.904312199  0.872822381  0.787050984   0.904312199
Max Score   0.911024584  0.905694898  0.860378596   0.911024584

Model with Max Score is generated from {Cross-Over, Mutation} and not from {New}

Page 50: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Figure: Cross-Over Model Score by Iterations. First chart: Model Score (axis 0.8 to 0.92) against Iterations (2 to 20). Second chart: Model Score (axis 0 to 1) against Iterations (2 to 20), with series "C - Average of modelScore" and "C - StdDev of modelScore".

Page 51: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Figure: Model Score Improvement for Cross-over models over Iterations. Y axis: Model Score (-0.800 to 1.000). X axis: Models over Iterations (1 to 125).

Page 52: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Attribution Funnel creation; Post Processing; Modeling Feature Set; Transformed Modeling Feature Set; Feature Engineering Configs; ETL Configs; Model UAT; Automated Modeling

Page 53: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Analytics Life Cycle - Numbers

Diagram labels: 1 New Client; Model UAT; Onboarding; ETL; Feature Engineering; Modeling; Scoring & Attribution; Scenario Analysis & Reporting

Artifact labels: Attribution Funnel creation; Post Processing; Modeling Feature Set; Transformed Modeling Feature Set; Feature Engineering Configs; ETL Configs; Model UAT; Automated Modeling

Durations shown: 0-2 w 2 d; 2 h; 4 h; .5 h; 25 h; 1 d 3 h; 1.5 h; 2 h

Page 54: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Questions &

Discussion

Page 55: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

Additional

Engineering

Slides

Page 56: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

stack_parameters.xml

stack_transforms.xml

stack.hql

Code

Generator

BaaS

Config

Generator

stack_gen.sh

<set>

<name>EVT_MARKETING_SEG_DIMS_SET</name>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Paid Search”</value></elements>

</elements>

</set>

<select>

........

<param>METRICS_EVTSTACK</param>

........

</select>

........

<set name="EVT_MARKETING_SEG_DIMS">

<attr>channel_type</attr>

</set>

#!/bin/bash ................. cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc` .................

ORCHESTRATION

RUN

GET

PUT

ACTION

BACKEND

1

3

4

5

stack_gen_config.xml

2

<param>METRICS_EVTSTACK</param>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<param>METRICS_EVTSTACK</param>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

CREATE TABLE stackevtterms

AS

SELECT

.. AS stackmetric_hasdonebefore_event_display,

.. AS stackmetric_hasdonebefore_event_paidsearch,

......

FROM

......

<set name="EVT_STACKMETRICS">
<val enable="TRUE">HASDONEBEFORE</val>
<val enable="FALSE">NUM</val>
</set>

#!/bin/bash ................. cmdstatus=`hive -f $PHASE_NAME.hql` .................

6

Input Config

Generated Config

Page 57: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

stack_parameters.xml

stack_transforms.xml

stack.hql

Code

Generator

BaaS

Config

Generator

stack_gen.sh

<set>

<name>EVT_MARKETING_SEG_DIMS_SET</name>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Paid Search”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Organic Search”</value></elements>

</elements>

</set>

<select>

........

<param>METRICS_EVTSTACK</param>

........

</select>

........

<set name="EVT_MARKETING_SEG_DIMS">

<attr>channel_type</attr>

</set>

#!/bin/bash ................. cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc` .................

ORCHESTRATION

RUN

GET

PUT

ACTION

BACKEND

1

3

4

5

stack_gen_config.xml

2

<param>METRICS_EVTSTACK</param>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<param>METRICS_EVTSTACK</param>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

CREATE TABLE stackevtterms

AS

SELECT

.. AS stackmetric_hasdonebefore_event_display,

.. AS stackmetric_hasdonebefore_event_paidsearch,

.. AS stackmetric_hasdonebefore_event_organicsearch,

......

FROM

......

<set name="EVT_STACKMETRICS">
<val enable="TRUE">HASDONEBEFORE</val>
<val enable="FALSE">NUM</val>
</set>

#!/bin/bash ................. cmdstatus=`hive -f $PHASE_NAME.hql` .................

6

Input Config

Generated Config

Data change

Page 58: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

stack_parameters.xml

stack_transforms.xml

stack.hql

Code

Generator

BaaS

Config

Generator

stack_gen.sh

<set>

<name>EVT_MARKETING_SEG_DIMS_SET</name>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Online Display”</value></elements> </elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Direct Response Video”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Paid Search”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Brand”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Organic Search”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Unknown”</value></elements>

</elements>

</set>

<select>

........

<param>METRICS_EVTSTACK</param>

........

</select>

........

<set name="EVT_MARKETING_SEG_DIMS">

<attr>channel_type</attr>

<attr>evt_sub_channel</attr>

</set>

ORCHESTRATION

RUN

GET

PUT

ACTION

BACKEND

1

3

4

stack_gen_config.xml

2

<param>METRICS_EVTSTACK</param>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<param>METRICS_EVTSTACK</param>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

CREATE TABLE stackevtterms

AS

SELECT

.. AS stackmetric_hasdonebefore_event_display_onlinedisplay,

.. AS stackmetric_hasdonebefore_event_display_directresponsevideo,

.. AS stackmetric_hasdonebefore_event_paidsearch_brand,

.. AS stackmetric_hasdonebefore_event_organicsearch_unknown,

......

FROM

......

<set name="EVT_STACKMETRICS">
<val enable="TRUE">HASDONEBEFORE</val>
<val enable="FALSE">NUM</val>
</set>

#!/bin/bash ................. cmdstatus=`hive -f $PHASE_NAME.hql` .................

6

Input Config

Generated Config

Adding segmentation dimension

#!/bin/bash ................. cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc` ................. 5

Page 59: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

stack_parameters.xml

stack_transforms.xml

stack.hql

Code

Generator

BaaS

Config

Generator

stack_gen.sh

<set>

<name>EVT_MARKETING_SEG_DIMS_SET</name>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Online Display”</value></elements> </elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Direct Response Video”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Paid Search”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Brand”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Organic Search”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Unknown”</value></elements>

</elements>

</set>

<select>

........

<param>METRICS_EVTSTACK</param>

........

</select>

........

<set name="EVT_MARKETING_SEG_DIMS">

<attr>channel_type</attr>

<attr>evt_sub_channel</attr>

</set>

#!/bin/bash ................. cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc` .................

ORCHESTRATION

RUN

GET

PUT

ACTION

BACKEND

1

3

4

5

stack_gen_config.xml

2

<param>METRICS_EVTSTACK</param>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<param>METRICS_EVTSTACK</param>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

CREATE TABLE stackevtterms

AS

SELECT

.. AS stackmetric_hasdonebefore_event_display_onlinedisplay,

.. AS stackmetric_hasdonebefore_event_display_directresponsevideo,

.. AS stackmetric_hasdonebefore_event_paidsearch_brand,

.. AS stackmetric_hasdonebefore_event_organicsearch_unknown,

.. AS stackmetric_num_event_display_onlinedisplay,

.. AS stackmetric_num_event_display_directresponsevideo,

.. AS stackmetric_num_event_paidsearch_brand,

.. AS stackmetric_num_event_organicsearch_unknown,

......

FROM

......

<set name="EVT_STACKMETRICS">
<val enable="TRUE">HASDONEBEFORE</val>
<val enable="TRUE">NUM</val>
</set>

#!/bin/bash ................. cmdstatus=`hive -f $PHASE_NAME.hql` .................

6

Input Config

Generated Config

Selecting NUM metric

Page 60: H2O World - PAAS: Predictive Analytics offered as a Service - Prateem Mandal

stack_parameters.xml

stack_transforms.xml

stack.hql

Code

Generator

BaaS

Config

Generator

stack_gen.sh

<set>

<name>EVT_MARKETING_SEG_DIMS_SET</name>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Online Display”</value></elements> </elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Display”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Direct Response Video”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Paid Search”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Brand”</value></elements>

</elements>

<elements>

<elements><attribute>channel_type</attribute><value>“Organic Search”</value></elements>

<elements><attribute>evt_sub_channel</attribute><value>“Unknown”</value></elements>

</elements>

</set>

<select>

........

<param>METRICS_EVTSTACK</param>

........

</select>

........

<set name="EVT_MARKETING_SEG_DIMS">

<attr>channel_type</attr>

<attr>evt_sub_channel</attr>

</set>

#!/bin/bash ................. cmdstatus=`java -jar ${codegenerator_jar_dist_loc}/${code_generator_jar_name} -s ${common_stages_loc}/$PHASE_NAME -d ${script_loc}/$PHASE_NAME -c $input_schema_config_loc -g $code_generator_grammar_loc` .................

ORCHESTRATION

RUN

GET

PUT

ACTION

BACKEND

1

3

4

5

stack_gen_config.xml

2

<param>METRICS_EVTSTACK</param>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>GSM</name>

<defn>.....</defn>

</metric>

CREATE TABLE stackevtterms

AS

SELECT

.. AS stackmetric_hasdonebefore_event_display_onlinedisplay,

.. AS stackmetric_hasdonebefore_event_display_directresponsevideo,

.. AS stackmetric_hasdonebefore_event_paidsearch_brand,

.. AS stackmetric_hasdonebefore_event_organicsearch_unknown,

.. AS stackmetric_num_event_display_onlinedisplay,

.. AS stackmetric_num_event_display_directresponsevideo,

.. AS stackmetric_num_event_paidsearch_brand,

.. AS stackmetric_num_event_organicsearch_unknown,

.. AS stackmetric_gsm90_event_display_onlinedisplay,

.. AS stackmetric_gsm90_event_display_directresponsevideo,

......

.. AS stackmetric_gsm365_event_paidsearch_brand,

.. AS stackmetric_gsm365_event_organicsearch_unknown,

......

FROM

......

<set name="EVT_STACKMETRICS">

<val enable="TRUE">HASDONEBEFORE</val>

<val enable="TRUE">NUM</val>

<val enable="TRUE">GSM</val>

</set>

<param>METRICS_EVTSTACK</param>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<metric>

<name>GSM</name>

<defn>.....</defn>

</metric>

Input Config

Adding new metric

#!/bin/bash
.................
cmdstatus=`hive -f $PHASE_NAME.hql`
.................

Generated Config

<set name="GSM_STEP">

<val>90</val>

<val>180</val>

<val>365</val>

</set>
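The two sets above act as switches for the generator: EVT_STACKMETRICS names the metrics to emit and GSM_STEP lists the step values. As a minimal sketch of how such a config could be read, the Python below parses a simplified copy of those sets. The wrapping `<config>` element, the `read_set` helper, the `enable="FALSE"` value and the treatment of a missing `enable` attribute are all assumptions made for illustration; the slides do not show the real generator's parsing logic.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified config modeled on the slide fragments.
# The <config> wrapper and the FALSE flag are assumptions for this sketch.
CONFIG = """
<config>
  <set name="EVT_STACKMETRICS">
    <val enable="TRUE">HASDONEBEFORE</val>
    <val enable="TRUE">NUM</val>
    <val enable="FALSE">GSM</val>
  </set>
  <set name="GSM_STEP">
    <val>90</val>
    <val>180</val>
    <val>365</val>
  </set>
</config>
"""


def read_set(root, name, only_enabled=False):
    """Return the <val> texts of the named <set>, optionally skipping disabled ones."""
    values = []
    for node in root.findall(f"./set[@name='{name}']/val"):
        # A missing enable attribute is treated as enabled (an assumption).
        enabled = node.get("enable", "TRUE").upper() == "TRUE"
        if enabled or not only_enabled:
            values.append(node.text.strip())
    return values


root = ET.fromstring(CONFIG)
enabled_metrics = read_set(root, "EVT_STACKMETRICS", only_enabled=True)
gsm_steps = [int(v) for v in read_set(root, "GSM_STEP")]
print(enabled_metrics)
print(gsm_steps)
```

With a reader like this, selecting or adding a metric is a config change only: the generator would emit columns for whatever names come back as enabled.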


CREATE TABLE stackevtterms

AS

SELECT

.. AS stackmetric_hasdonebefore_event_display_onlinedisplay,

.. AS stackmetric_hasdonebefore_event_display_directresponsevideo,

.. AS stackmetric_hasdonebefore_event_paidsearch_brand,

.. AS stackmetric_hasdonebefore_event_organicsearch_unknown,

.. AS stackmetric_num_event_display_onlinedisplay,

.. AS stackmetric_num_event_display_directresponsevideo,

.. AS stackmetric_num_event_paidsearch_brand,

.. AS stackmetric_num_event_organicsearch_unknown,

.. AS stackmetric_gsm_event_display_onlinedisplay,

.. AS stackmetric_gsm_event_display_directresponsevideo,

.. AS stackmetric_gsm_event_paidsearch_brand,

.. AS stackmetric_gsm_event_organicsearch_unknown,

......

FROM

......

<set name="EVT_STACKMETRICS">

<val enable="TRUE">HASDONEBEFORE</val>

<val enable="TRUE">NUM</val>

<val enable="TRUE">GSM</val>

</set>

<set name="GSM_STEP">

<val>3</val>

</set>

Input Config

Adding new metric

#!/bin/bash
.................
cmdstatus=`hive -f $PHASE_NAME.hql`
.................

Generated Config

stackevtterms_(<STCKGEN_RP><SDATE>i)

-- This query calculates all the NUM event metrics (stack variables or stack metrics).
-- stackevtterms_<RP_START_i> is created separately for each RP, with RP_START in its name.
DROP TABLE stackevtterms_(<STCKGEN_RP><SDATE>i);
CREATE TABLE stackevtterms_(<STCKGEN_RP><SDATE>i) AS
SELECT
    <METRICS_EVTSTACK>
    <OTHER_METRICS_EVTSTACK>
    t2.rp_start AS rp_start,
    t2.rp_end AS rp_end,
    t1.userid AS userid,
    t1.actuuid AS actuuid,
    t1.<ACTIVITYGROUP> AS <ACTIVITYGROUP>,
    t1.refdate_rp AS refdate_rp,
    t1.response AS response
FROM tmpUsersActivityDate_(<STCKGEN_RP><SDATE>i) t1
LEFT OUTER JOIN elbis_(<STCKGEN_RP><SDATE>i) t2
    ON (t1.userid = t2.userid)
WHERE (t1.sample_flag = 1)
GROUP BY t2.rp_start, t2.rp_end, t1.userid, t1.actuuid, t1.<ACTIVITYGROUP>, t1.refdate_rp, t1.response;
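The template above leaves placeholders such as `<METRICS_EVTSTACK>` and `<ACTIVITYGROUP>` for the code generator to fill in. The Python below is a rough sketch of that substitution step on a shortened template. The `render` function, the `events` table, the `activity_group` value and the `COUNT(1)` metric body are hypothetical stand-ins (the slides elide the real metric definitions), and the table-name syntax `_(<STCKGEN_RP><SDATE>i)` is deliberately left out because its expansion rule is not shown.

```python
import re

# Shortened, hypothetical template using the same <NAME> placeholder style.
TEMPLATE = """SELECT
<METRICS_EVTSTACK>
t1.userid AS userid,
t1.<ACTIVITYGROUP> AS <ACTIVITYGROUP>
FROM events t1"""


def render(template, params):
    """Replace each <NAME> placeholder with params[NAME]; fail on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value for placeholder <{name}>")
        return params[name]

    # Placeholders are assumed to be upper-case identifiers in angle brackets.
    return re.sub(r"<([A-Z_]+)>", substitute, template)


sql = render(TEMPLATE, {
    # Hypothetical metric body; the real <defn> is elided on the slides.
    "METRICS_EVTSTACK": "COUNT(1) AS stackmetric_num_event_display_onlinedisplay,",
    "ACTIVITYGROUP": "activity_group",
})
print(sql)
```

Failing loudly on an unknown placeholder is a design choice for the sketch: a half-rendered query would otherwise reach Hive and fail there with a less useful error.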



<param>METRICS_EVTSTACK</param>

<metric>

<name>NUM</name>

<defn>.....</defn>

</metric>

<metric>

<name>HASDONEBEFORE</name>

<defn>.....</defn>

</metric>

<select>

........

<param>METRICS_EVTSTACK</param>

........

</select>

........


SUM(1 / (1 + DATEDIFF(refdate_rp,
    CASE WHEN (
        (channel_type IN ('Display') AND evt_sub_channel IN ('Online Display'))
        AND (evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp)
    ) THEN evt_time ELSE null END
) / 365)
) AS stackmetric_gsm365_event_display_onlinedisplay,
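To make the arithmetic of this expression concrete, here is a small Python sketch that mirrors it for one user and one segment. Events outside the window yield NULL in the SQL and are skipped by SUM, so only in-window events contribute `1 / (1 + age_in_days / step)`. The function name `gsm`, its parameters and the sample dates are assumptions made for illustration, and event times are treated as plain dates.

```python
from datetime import date


def gsm(ref_date, event_dates, step_days=365, window_days=60):
    """Sum 1 / (1 + age/step) over events that fall inside the look-back window."""
    total = 0.0
    for evt in event_dates:
        age = (ref_date - evt).days
        # Mirrors: evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp
        if 0 <= age < window_days + 1:
            total += 1.0 / (1.0 + age / step_days)
    return total


# Hypothetical sample: one event on the reference date, one 30 days earlier,
# and one 93 days earlier (outside the window, so it contributes nothing).
ref = date(2015, 11, 2)
events = [date(2015, 11, 2), date(2015, 10, 3), date(2015, 8, 1)]
value = gsm(ref, events)
print(value)
```

An event on the reference date contributes exactly 1, and older events contribute less, so the metric behaves like a recency-weighted count.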

SUM(1 / (1 + DATEDIFF(refdate_rp,
    CASE WHEN (
        <and> <xprod> <in>
            <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>attribute</element></xprodlist>
            <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist>
        </in> </xprod> </and>
        AND (evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp)
    ) THEN evt_time ELSE null END
) / <val>GSM_STEP</val>)
) AS stackmetric_gsm<val>GSM_STEP</val>_event_<xprod><xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist></xprod>,


<set>
  <name>EVT_MARKETING_SEG_DIMS_SET</name>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Online Display"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Display"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Direct Response Video"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Paid Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Brand"</value></elements>
  </elements>
  <elements>
    <elements><attribute>channel_type</attribute><value>"Organic Search"</value></elements>
    <elements><attribute>evt_sub_channel</attribute><value>"Unknown"</value></elements>
  </elements>
</set>

<set name="GSM_STEP">

<val>90</val>

<val>180</val>

<val>365</val>

</set>

SUM(1 / (1 + DATEDIFF(refdate_rp,
    CASE WHEN (
        <and> <xprod> <in>
            <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>attribute</element></xprodlist>
            <xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist>
        </in> </xprod> </and>
        AND (evt_time > refdate_rp - (60 + 1) AND evt_time <= refdate_rp)
    ) THEN evt_time ELSE null END
) / <val>GSM_STEP</val>)
) AS stackmetric_gsm<val>GSM_STEP</val>_event_<xprod><xprodlist>EVT_MARKETING_SEG_DIMS_SET<element>value</element></xprodlist></xprod>,

12 different columns will be generated from this single expression: 3 steps × 4 segments.
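The count of 12 comes from a cross product of the GSM_STEP values with the segments in EVT_MARKETING_SEG_DIMS_SET. The Python below sketches that expansion for the column aliases only. The `slug` and `column_names` helpers are hypothetical, and the naming rule (lower-case, non-alphanumeric characters dropped) is inferred from the generated column names shown on the slides rather than taken from the generator itself.

```python
from itertools import product

# Values copied from the GSM_STEP and EVT_MARKETING_SEG_DIMS_SET sets.
GSM_STEPS = [90, 180, 365]
SEGMENTS = [
    ("Display", "Online Display"),
    ("Display", "Direct Response Video"),
    ("Paid Search", "Brand"),
    ("Organic Search", "Unknown"),
]


def slug(text):
    """Lower-case the text and keep only letters and digits (inferred naming rule)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def column_names(steps, segments):
    """Expand every (step, segment) pair into one column alias."""
    return [
        f"stackmetric_gsm{step}_event_{slug(channel)}_{slug(sub_channel)}"
        for step, (channel, sub_channel) in product(steps, segments)
    ]


names = column_names(GSM_STEPS, SEGMENTS)
for name in names:
    print(name)
```

Adding a fourth step or a fifth segment to the config would grow the output to 16 or 15 columns without touching the expression template.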


......

......

.. AS stackmetric_gsm90_event_display_onlinedisplay,

.. AS stackmetric_gsm90_event_display_directresponsevideo,

.. AS stackmetric_gsm90_event_paidsearch_brand,

.. AS stackmetric_gsm90_event_organicsearch_unknown,

.. AS stackmetric_gsm180_event_display_onlinedisplay,

.. AS stackmetric_gsm180_event_display_directresponsevideo,

.. AS stackmetric_gsm180_event_paidsearch_brand,

.. AS stackmetric_gsm180_event_organicsearch_unknown,

.. AS stackmetric_gsm365_event_display_onlinedisplay,

.. AS stackmetric_gsm365_event_display_directresponsevideo,

.. AS stackmetric_gsm365_event_paidsearch_brand,

.. AS stackmetric_gsm365_event_organicsearch_unknown,

......