Computational advertising


Kira Radinsky

Slides based on material from the paper “Bandits for Taxonomies: A Model-based Approach” by Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabarti, Vanja Josifovski, in SDM 200

The Content Match Problem

[Figure labels: Ads, Ads DB, Advertisers]

Ad impression: showing an ad to a user



Ad click: a user click leads to revenue for the ad server and the content provider


The Content Match Problem: Match ads to pages to maximize clicks


Maximizing the number of clicks means:
• For each webpage, find the ad with the best Click-Through Rate (CTR)
• But without wasting too many impressions in learning this.

Background: Bandits

[Figure: bandit “arms” with unknown payoff probabilities 𝑝1, 𝑝2, 𝑝3]

Pull arms sequentially so as to maximize the total expected reward
• Estimate payoff probabilities
• Bias the estimation process towards ‘better’ arms.

Background: Bandit Solutions

Try 1: Greedy solution:
• Compute the sample mean of an arm ‘A’ by dividing the total reward received from the arm by the number of times the arm has been pulled.
• At each time step, choose the arm with the highest sample mean.

Try 2: Naïve solution:
• Pull each arm an equal number of times

Epsilon-greedy strategy:
• The best arm (highest sample mean) is selected for a proportion 1 − ε of the trials.
• Another arm is randomly selected (with uniform probability) for a proportion ε of the trials.
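The epsilon-greedy strategy above can be sketched as a small simulation. This is only an illustration under stated assumptions: rewards are Bernoulli clicks, and the names `epsilon_greedy`, `true_ctrs`, and `epsilon` are hypothetical, not taken from the slides.

```python
import random


def epsilon_greedy(true_ctrs, n_pulls, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms; returns (pulls, rewards) per arm."""
    rng = random.Random(seed)
    n_arms = len(true_ctrs)
    pulls = [0] * n_arms      # number of times each arm was pulled
    rewards = [0] * n_arms    # total reward (clicks) received from each arm

    def sample_mean(arm):
        # Assumption of this sketch: an arm that was never pulled has mean 0.0.
        return rewards[arm] / pulls[arm] if pulls[arm] else 0.0

    for _ in range(n_pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                 # explore: uniform random arm
        else:
            arm = max(range(n_arms), key=sample_mean)   # exploit: best sample mean
        pulls[arm] += 1
        rewards[arm] += 1 if rng.random() < true_ctrs[arm] else 0
    return pulls, rewards
```

With `epsilon=0.0` the sketch reduces to the greedy solution, which in this setup never leaves the first arm; with `epsilon=1.0` every pull is uniform exploration.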

Background: Bandits

[Figure: pages (Webpage 1, Webpage 2, Webpage 3) and ads]

Bandit “arms” are ads

Background: Bandits

[Figure: matrix of webpages by ads, annotated “One instance of the MAB problem” and “Unknown CTR”]

Content Match = A matrix
• Each row is a bandit
• Each cell has an unknown CTR

Background: Bandits

[Figure: arms labeled Priority 1, Priority 2, Priority 3, with annotations “Allocation” and “Estimation”]

Bandit Policy:
1. Assign priority to each arm
2. “Pull” arm with max priority and observe reward
3. Update priorities
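The assign / pull / update loop can be sketched as follows. The slides do not say which priority function is used, so this sketch swaps in the well-known UCB1 index as one possible choice; `ucb1_priority`, `run_policy`, and `true_ctrs` are hypothetical names, and rewards are assumed to be Bernoulli clicks.

```python
import math
import random


def ucb1_priority(clicks, pulls, total_pulls):
    """Priority = sample mean + exploration bonus (UCB1); unpulled arms go first."""
    if pulls == 0:
        return float("inf")
    return clicks / pulls + math.sqrt(2.0 * math.log(total_pulls) / pulls)


def run_policy(true_ctrs, n_rounds, seed=0):
    """Run the priority-based policy on simulated arms; returns (clicks, pulls)."""
    rng = random.Random(seed)
    n_arms = len(true_ctrs)
    clicks = [0] * n_arms
    pulls = [0] * n_arms
    for t in range(1, n_rounds + 1):
        # 1. Assign a priority to each arm.
        priorities = [ucb1_priority(clicks[a], pulls[a], t) for a in range(n_arms)]
        # 2. Allocation: pull the arm with max priority and observe the reward.
        arm = max(range(n_arms), key=priorities.__getitem__)
        reward = 1 if rng.random() < true_ctrs[arm] else 0
        # 3. Estimation: update the statistics that drive the priorities.
        pulls[arm] += 1
        clicks[arm] += reward
    return clicks, pulls
```

Any other index that rewards both a high sample mean and a low pull count could replace `ucb1_priority` without changing the loop.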

Background: Bandits

Why not simply apply a bandit policy directly to the problem?
• Converges too slowly, given the number of MAB instances and the number of arms per instance
• Additional structure is available, and we wish to use it.

Multi-level Policy

[Figure labels: Ads classes, Webpages classes]

Consider only two levels.

Multi-level Policy

[Figure: matrix with ad parent classes (Apparel, Computers, Travel) split into ad child classes along one axis and Apparel, Computers, Travel along the other; annotations: “Block”, “One MAB problem instance”]

Idea: CTRs in a block are homogeneous

Multi-level Policy

CTRs in a block are homogeneous
• Used in allocation (picking an ad for each new page)
• Used in estimation (updating priorities after each observation)

Multi-level Policy - Allocation

[Figure: a new page (“?”) passes through the page classifier into the block matrix with classes A, C, T on both axes]

• Classify webpage → page class, parent page class
• Run bandit on ad parent classes → pick one ad parent class
• The two steps above result in a block
• Run bandit among cells → pick one ad class
• (In general, continue from root to leaf → final ad)


Bandits at higher levels:
• Use aggregated information
• Have fewer bandit arms
→ Quickly figure out the best ad parent class
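A two-level allocation of this kind might look like the sketch below. It assumes the page classifier already exists and hands over the page's parent and child class as strings. The bandit at each level is a UCB1-style index chosen here for illustration, and the taxonomy contents and all names (`AD_CHILDREN`, `TwoLevelAllocator`, `allocate`, `update`) are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical two-level ad taxonomy: ad parent class -> ad child classes.
AD_CHILDREN = {
    "Apparel": ["Shoes", "Jackets"],
    "Computers": ["Laptops", "Printers"],
    "Travel": ["Flights", "Hotels"],
}


class TwoLevelAllocator:
    def __init__(self, ad_children):
        self.ad_children = ad_children
        # [clicks, impressions], keyed by (page class, ad class) at each level.
        self.block_stats = defaultdict(lambda: [0, 0])  # (page parent, ad parent)
        self.cell_stats = defaultdict(lambda: [0, 0])   # (page child, ad child)

    @staticmethod
    def _pick(stats, row, arms):
        """Pick the arm with the highest UCB1-style priority for this row."""
        total = sum(stats[(row, a)][1] for a in arms) + 1

        def priority(arm):
            clicks, imps = stats[(row, arm)]
            if imps == 0:
                return float("inf")  # try every arm at least once
            return clicks / imps + math.sqrt(2.0 * math.log(total) / imps)

        return max(arms, key=priority)

    def allocate(self, page_parent, page_child):
        # Upper level: few arms, aggregated statistics -> pick an ad parent class.
        ad_parent = self._pick(self.block_stats, page_parent, list(self.ad_children))
        # Lower level: bandit among the cells of the chosen block -> pick an ad class.
        ad_child = self._pick(self.cell_stats, page_child, self.ad_children[ad_parent])
        return ad_parent, ad_child

    def update(self, page_parent, page_child, ad_parent, ad_child, clicked):
        # Record the observation at both the block level and the cell level.
        for stats, key in ((self.block_stats, (page_parent, ad_parent)),
                           (self.cell_stats, (page_child, ad_child))):
            stats[key][0] += int(clicked)
            stats[key][1] += 1
```

Because the upper-level bandit has only a handful of arms and pools all observations of a block, it can settle on a good ad parent class after far fewer impressions than a flat bandit over every ad class would need.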


Multi-level Policy - Estimation

CTRs in a block are homogeneous
• Observations from one cell also give information about others in the block.
• How can we model this dependence?


Shrinkage Model

[Figure: block matrix with a cell annotated “#clicks in cell” and “#impressions in cell”]

All cells in a block come from the same distribution


• Intuitively, this leads to shrinkage of cell CTRs towards block CTRs

[Equation labels: Estimated CTR, Beta prior (“block CTR”), Observed CTR]
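One way to make the shrinkage concrete is the standard Beta-Binomial posterior mean, sketched below. It assumes that clicks in a cell are Binomial given the cell CTR and that the block supplies a Beta(alpha, beta) prior; the slides do not show how the prior parameters are fitted, so `alpha` and `beta` are simply inputs here, and the function names are hypothetical.

```python
def shrunk_ctr(clicks, impressions, alpha, beta):
    """Posterior-mean CTR of a cell under a Beta(alpha, beta) block prior."""
    return (alpha + clicks) / (alpha + beta + impressions)


def shrinkage_weight(impressions, alpha, beta):
    """Weight placed on the block CTR; it decays as the cell gathers impressions."""
    return (alpha + beta) / (alpha + beta + impressions)
```

Under these assumptions the estimate equals w · (block CTR) + (1 − w) · (observed CTR), where the block CTR is alpha / (alpha + beta) and w is `shrinkage_weight`. For example, with alpha = 2 and beta = 98 (block CTR 0.02), a cell with 1 click in 10 impressions has an observed CTR of 0.1 but an estimated CTR of 3/110 ≈ 0.027. A cell with no impressions falls back to the block CTR.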

Experiments (S. Pandey et al. 2007)

Taxonomy Structure

Depth 0: Root
Depth 1: 20 nodes
Depth 2: 221 nodes
Depth 7: ~7000 nodes

Use these 2 levels


• Data collected over a 1-day period
• Collected from only one server, under some other ad-matching rules (not our bandit)
• ~229M impressions
• CTR values have been linearly transformed for purposes of confidentiality


[Plot: clicks vs. number of pulls]

Multi-level gives much higher #clicks!


[Plot: mean-squared error vs. number of pulls]

Multi-level gives much better MSE: it learnt more from its explorations.

Conclusions

• In a CTR-guided system, exploration is a key component.

• The short-term penalty for exploration needs to be limited (exploration budget).

• Most exploration mechanisms use a weighted combination of the predicted CTR (average) and the CTR uncertainty (variance).

• Exploration in a reduced-dimensional space: the class hierarchy.

• Top-down traversal of the hierarchy to determine the class of the ad to show.
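The weighted combination of predicted CTR and CTR uncertainty mentioned in the conclusions can be illustrated with a small scoring function. This is only one possible form, assuming a Beta posterior with a uniform Beta(1, 1) prior; `exploration_score` and its `weight` parameter are hypothetical names.

```python
import math


def exploration_score(clicks, impressions, weight=1.0):
    """Predicted CTR plus `weight` times its standard deviation (Beta(1, 1) prior)."""
    a = 1.0 + clicks
    b = 1.0 + (impressions - clicks)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1.0))
    return mean + weight * math.sqrt(variance)
```

With `weight=0.0` the score is the predicted CTR alone. A larger `weight` spends more impressions on uncertain ads, so it acts as a knob on the exploration budget.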
