EPAM BI - Near Real-time Marketing Support System

EPAM BI Competency Center

Near Real-time Marketing Support System

Implementation Details

by Kiryl Sultanau, Yauheni Yushyn & Dzmitry Maskayeu

Preamble

Bidding Support

Improve Ad campaigns

NRT Data Visualization:
● Near real-time visualization of bids matched with clicks, leads etc.
● Detect the best Ad type/place/size/position... for different users/devices/regions...
● Quick reaction and better estimation for newly started Ad campaigns.

Improve Ad campaigns:
● Improve keyword campaigns with more relevant keywords and a better specialized target group, region etc.
● Create new short-term campaigns for special events or occasions
● Collect specific users and information about them

Prerequisites

External bidding info (impressions, clicks)

Ad Campaign info

Publisher log streams (impressions, click, search...

Other Dictionaries

Architectural Overview

Pipeline

Source data

ETL layer

Presentation layer

Stream Processing

server log [cookies, user_agent, city_id, log_type_id …]

Ad Exchange
- Google DoubleClick AdX
- TANX Alibaba
- Baidu
- Google Mobile...

JOIN

City: US city names

Log Type
- bid-impression
- bid-click
- site-open
- site-search
- site-impression
- site-click

Site Pages: Owner URL & google tag

User Tag: External URL & user search keyword

State: US state names

Keywords: User Keywords as union of google tags and user search keywords

Spark Cache

Kafka RDD → DataFrame (apply schema)
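The join of a server log record with the cached dictionaries can be sketched in plain Python, as a stand-in for the Spark DataFrame join on the slide. The dictionary contents, the numeric ids, the city-to-state link and the `enrich` helper are all hypothetical and only illustrate the shape of the step.

```python
# Hypothetical dictionary tables keyed by id; all values are illustrative only.
CITIES = {1: {"city": "Los Angeles", "state_id": 5}}
STATES = {5: "California"}
LOG_TYPES = {
    1: "bid-impression", 2: "bid-click", 3: "site-open",
    4: "site-search", 5: "site-impression", 6: "site-click",
}


def enrich(record, google_tags=(), search_keywords=()):
    """Join one log record with the dictionaries (left-join semantics).

    Unknown ids yield None instead of dropping the record. The keywords
    field is the union of google tags and user search keywords.
    """
    city = CITIES.get(record["city_id"], {})
    return {
        **record,
        "city": city.get("city"),
        "state": STATES.get(city.get("state_id")),
        "log_type": LOG_TYPES.get(record["log_type_id"]),
        "keywords": sorted(set(google_tags) | set(search_keywords)),
    }


raw = {"cookies": "abc", "user_agent": "UA", "city_id": 1, "log_type_id": 4}
joined = enrich(raw, google_tags=["suv"], search_keywords=["suv", "2015"])
```

Keeping the dictionaries in memory mirrors the "Spark Cache" label: the small lookup tables are loaded once and reused for every incoming record.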

Stream Processing

joined server log DataFrame → Parse User Agent String → DataFrame

Parsed attributes:
- Browser: Group, Manufacturer, Rendering engine, Version (major, minor), Name
- OS: Name, Platform, Device, Manufacturer
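A minimal sketch of the user agent parsing step, assuming a handful of regular expressions rather than whatever parser library the pipeline actually uses. The pattern lists and the `parse_user_agent` helper are hypothetical and cover only a few common browsers and operating systems.

```python
import re

# Order matters: more specific patterns come first
# (an Edge UA also contains "Chrome/", an iPhone UA contains "like Mac OS X").
BROWSER_PATTERNS = [
    ("Edge", re.compile(r"Edg/(\d+)\.(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)\.(\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)\.(\d+)")),
    ("Safari", re.compile(r"Version/(\d+)\.(\d+).*Safari/")),
]
OS_PATTERNS = [
    ("Android", re.compile(r"Android")),
    ("iOS", re.compile(r"iPhone|iPad")),
    ("Windows", re.compile(r"Windows NT")),
    ("Mac OS X", re.compile(r"Mac OS X")),
    ("Linux", re.compile(r"Linux")),
]


def parse_user_agent(ua):
    """Extract browser name, major/minor version and OS name from a UA string."""
    browser = {"name": None, "major": None, "minor": None}
    for name, pattern in BROWSER_PATTERNS:
        match = pattern.search(ua)
        if match:
            browser = {
                "name": name,
                "major": int(match.group(1)),
                "minor": int(match.group(2)),
            }
            break
    os_name = next((name for name, p in OS_PATTERNS if p.search(ua)), None)
    return {"browser": browser, "os": os_name}


chrome_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
parsed = parse_user_agent(chrome_ua)
```

The resulting fields would be appended as extra columns to the joined server log record.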

Stream Processing

joined server log + user agent DataFrame

JOIN

Cassandra table → UNPIVOT → DataFrame

Cassandra table columns: id, bid_click_kw, site_open_kw, site_click_kw, site_lead_kw, site_search_kw

joined server log + user agent + previous user behavior DataFrame
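The UNPIVOT step can be illustrated in plain Python, assuming each `*_kw` column of the Cassandra table holds a list of keywords for one user id. That storage layout and the `unpivot` helper are assumptions; the slide only names the columns.

```python
# Column names as listed on the slide.
KEYWORD_COLUMNS = (
    "bid_click_kw", "site_open_kw", "site_click_kw",
    "site_lead_kw", "site_search_kw",
)


def unpivot(row):
    """Turn one wide row into long (id, action, keyword) tuples.

    Missing or empty columns contribute no rows.
    """
    return [
        (row["id"], column[: -len("_kw")], keyword)
        for column in KEYWORD_COLUMNS
        for keyword in (row.get(column) or [])
    ]


wide = {
    "id": "u1",
    "bid_click_kw": ["suv"],
    "site_open_kw": None,
    "site_search_kw": ["suv", "crossover"],
}
long_rows = unpivot(wide)
```

The long form has one row per user, action and keyword, which makes it straightforward to join previous behavior onto the incoming log records by user id.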

Stream Processing

joined server log + user agent + previous user behavior DataFrame

→ joined server log + user agent + previous user behavior + target group marker

Saving data

Users Dimension

Analytics

Service API

Saving data

Visualisation

Discover

Visualize

Dashboard

Tags Analyser Tool
● real-time data
● slices by any collected metric (time, geo-location, action type, make, model, user behavior …)
● apply filters on the fly with ease
● combine and manage filters
● share dashboards
● add new visualisations on the fly
● do all of this from the UI

NRT Data Visualization

Question: How can we recognize users who are likely to bring profit to the provider?

Input data: Logs of searches and clicks on the site, plus logs from partner sites. The data will be merged and split into three parts: 60% training, 20% test, 20% validation.
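A minimal sketch of the 60/20/20 split, assuming a simple random shuffle with a fixed seed. The `split_60_20_20` helper is hypothetical, and the actual split may well be done with the ML package's own tooling.

```python
import random


def split_60_20_20(rows, seed=0):
    """Shuffle rows and split them into training, test and validation parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_train = len(rows) * 6 // 10
    n_test = len(rows) * 2 // 10
    train = rows[:n_train]
    test = rows[n_train:n_train + n_test]
    validation = rows[n_train + n_test:]  # remainder, roughly 20%
    return train, test, validation


train, test, validation = split_60_20_20(range(100))
```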

Features: The input variables used to train the model are region, city, and the user's actions and searches on the site.

Algorithms: Deep Learning algorithm from H2O package.

Evaluation: The model will be evaluated on the score: number of predicted clicks + N * number of predicted conversions.
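The evaluation score is simple arithmetic and can be written down directly. The value of N is not given in the slides, so the `n=10` below is purely illustrative, as are the counts.

```python
def evaluation_score(predicted_clicks, predicted_conversions, n):
    """Score = predicted clicks + N * predicted conversions."""
    return predicted_clicks + n * predicted_conversions


# Illustrative numbers only: 120 clicks, 8 conversions, N assumed to be 10.
score = evaluation_score(predicted_clicks=120, predicted_conversions=8, n=10)
```

A larger N makes the score favor models that find converting users over models that only find clicking users.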

Model Usage: Once trained, the model will receive data about a user and their actions on the site and will return the probability that this user will click on an ad.

Lead Prediction Using Machine Learning

NRT Bidding

region: LA, CA
sex: male
age: 31
stream: google.com > Edmunds.com > search: SUV

region: CA
tags: top, SUV, 2015
price: $90 CPM
limit: $200/day

region: CA, NY
tags: SUV, crossover
price: $70 CPM
limit: $300/day
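One way to read the example above is that the first block describes a user and the other two describe campaigns competing for that user. Under that assumption, a minimal matching sketch could look as follows. The `Campaign` type, the campaign names "A" and "B", and the rule "highest CPM among matching campaigns within the daily limit" are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Campaign:
    name: str
    regions: frozenset
    tags: frozenset
    cpm_usd: float          # price per 1000 impressions
    daily_limit_usd: float


def pick_campaign(user_region, user_keywords, campaigns, spent_today):
    """Return the eligible campaign with the highest CPM, or None.

    A campaign is eligible if it targets the user's region, shares at least
    one tag with the user's keywords, and can still afford one impression.
    """
    keywords = {k.lower() for k in user_keywords}
    eligible = [
        c for c in campaigns
        if user_region in c.regions
        and keywords & {t.lower() for t in c.tags}
        and spent_today.get(c.name, 0.0) + c.cpm_usd / 1000 <= c.daily_limit_usd
    ]
    return max(eligible, key=lambda c: c.cpm_usd, default=None)


campaigns = [
    Campaign("A", frozenset({"CA"}), frozenset({"top", "SUV", "2015"}), 90.0, 200.0),
    Campaign("B", frozenset({"CA", "NY"}), frozenset({"SUV", "crossover"}), 70.0, 300.0),
]
winner = pick_campaign("CA", ["SUV"], campaigns, spent_today={})
```

With an empty spend record, the user from CA searching for "SUV" matches both campaigns and the higher-priced one is selected. Once that campaign exhausts its daily limit, the other becomes the winner.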

Crawl Social Networks (events, places, posts, feeds...)

Crawl Social Networks (attendees, followers, likers...)
