
Randomness and Uncertainty in Portfolio Methods for Inventory Allocation in Online Advertising

ERIC BAX

Yahoo!

KRISHNA PRASAD CHITRAPURA

Yahoo!

SACHIN GARG

Yahoo!

and

RAGAVENDRAN GOPALAKRISHNAN

Caltech

In markets for online advertising, advertisers may post bids that they pay only when a user responds to an ad. Market-makers estimate response rates for each ad and multiply by the bid to estimate expected revenue for showing the ad. For each advertising opportunity, called an ad call, the market-maker selects an ad that maximizes estimated expected revenue. Actual revenue deviates from estimated expected revenue for two reasons: (a) uncertainty introduced by errors in estimation of response rates and (b) random fluctuations in response rates from their expected values.

This paper outlines a method to allocate a set of ad calls over a set of ads. The method mediates a tradeoff between maximizing estimated expected revenue for publishers and minimizing estimated variance for that revenue. The method accounts for uncertainty as well as randomness as sources of variability. The paper also demonstrates the surprising result that using portfolio allocation to reduce variance can also increase revenue, making portfolio allocation useful even if the publisher is risk-neutral.

Categories and Subject Descriptors: [Electronic Commerce]: Online Auctions; [Electronic Commerce]: Online Advertising

General Terms: Electronic Commerce

Additional Key Words and Phrases: internet advertising, portfolio optimization, portfolio theory, explore exploit, uncertainty

1. INTRODUCTION

Online content providers show ads to generate revenue [Varian 2009; 2006; Edelman et al. 2007; Lahaie and Pennock 2007]. Some advertisers pay only when a user responds to an ad. So the expected payoff for showing the ad is the advertiser payment times the user response rate. The content provider or a market-maker can estimate the user response rate to calculate an estimated expected payoff.

A market-maker may segment content and the corresponding ad calls into markets. The segmentation may be based on factors such as themes like sports and entertainment, more detailed topics such as basketball camps and beekeeping, and audience factors such as age, gender, locale, and previous viewing and buying behavior. The market-maker may estimate response rates for ads on a per-market basis.

Authors’ emails: {ebax, pkrishna, gsachin}@yahoo-inc.com, [email protected]

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

© 20??? ACM 0000-0000/20???/0000-0001 $5.00

ACM Journal Name, Vol. ?, No. ??, ???? 20???, Pages 1–0??.

If a market-maker selects the ad with maximum estimated expected payoff and shows it on all content in a market, the content provider incurs a risk that the actual payoff may be less than the estimated expected payoff for the ad. The actual payoff differs from the estimated expected payoff due to (a) error in estimation of user response rate and (b) random fluctuations in user response rate. To reduce these risks, a market-maker can select a portfolio of ads for each market, showing each ad in the portfolio on a portion of the ad calls.

This paper outlines a method to show a mix of ads that maximizes estimated expected payoffs over ad calls subject to a bound on the variance of payoffs. The method uses techniques from portfolio optimization [Markowitz 1952; Lintner 1965; Sharpe 1964; Tobin 1958]. Standard portfolio optimization accounts for random fluctuations from expected payoffs, but it assumes that the statistics of those fluctuations are known. This paper addresses the fact that, for online advertising, the statistics are estimated. In general, for online ad selection, optimal portfolios are shaped by a combination of errors in estimating response rates and of fluctuations in response rates.

The rest of this paper is organized as follows. Section 2 defines a formal model that is the basis for our results. Section 3 describes how to optimize for a combination of mean revenue and variance in revenue under the formal model. Section 4 compares the optimization problem to similar portfolio problems in finance. Section 5 analyzes the roles of uncertainty, randomness, and response rates in selecting a portfolio of ads. Section 6 presents results from simulations, showing the tradeoffs between optimizing for mean and variance of revenue and demonstrating that, in some situations, selecting a portfolio to decrease estimated variance actually increases actual expected revenue. Section 7 concludes with a discussion of directions for future work.

2. FORMAL MODEL

Let m be the number of ad calls and let n be the number of ads. An allocation vector k = (k_1, ..., k_n) specifies the number of ad calls to allocate to each ad. The goal is to select an optimal allocation vector k* that mediates a tradeoff between maximizing expected revenue and minimizing variance of revenue.

Assume each ad i in the market is generated by a distribution over possible ads. Each possible ad has a response rate. Let R_i be the distribution of response rates over ads drawn from the distribution that generates ad i. Let random variable S_i be the response rate for ad i. Let random variable X_i(S_i) be the revenue from showing ad i on an ad call. For example, if an advertiser pays a bid b only when the ad elicits a response, then

\[
X_i(S_i) =
\begin{cases}
b & \text{with probability } S_i, \\
0 & \text{with probability } 1 - S_i.
\end{cases}
\]

Define random variables X_hi(S_i), for h in {1, ..., m} and i in {1, ..., n}, to be the revenue X_i(S_i) if ad call h is allocated to ad i. (Random variables X_hi(S_i) are realizations of random variable X_i(S_i).) Response rate S_i is drawn once and determines a distribution for revenue for all draws of X_i(S_i): X_1i, ..., X_mi. But X_i(S_i) is redrawn i.i.d. according to that distribution for each ad call. Think of it as drawing a coin from a bag of coins to determine the response probability S_i for ad i, then tossing that coin once for each ad call to determine the values X_1i, ..., X_mi.
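The coin-from-a-bag description can be sketched in Python. This is only an illustration under stated assumptions, not code from the paper: the function name `simulate_ad_revenue`, the Gaussian "bag of coins", and the parameter values are all chosen for the example.

```python
import random

def simulate_ad_revenue(bid, mean_rate, sd_rate, num_calls, rng):
    """Draw S_i once, then draw the revenue X_hi independently for each ad call h.

    Assumption for illustration: the "bag of coins" is a Gaussian over
    response rates, clipped to the interval [0, 1].
    """
    # Stage 1 (uncertainty): the response rate is drawn once per ad.
    s = min(1.0, max(0.0, rng.gauss(mean_rate, sd_rate)))
    # Stage 2 (randomness): one Bernoulli(s) toss per ad call.
    revenues = [bid if rng.random() < s else 0.0 for _ in range(num_calls)]
    return s, revenues

rng = random.Random(0)  # fixed seed so the run is repeatable
s, revenues = simulate_ad_revenue(
    bid=1.0, mean_rate=0.001, sd_rate=0.0001, num_calls=10_000, rng=rng
)
print(f"drawn response rate: {s:.6f}, total revenue: {sum(revenues):.2f}")
```

Because the same drawn rate `s` applies to every ad call for the ad, all of that ad's revenues move together when `s` is high or low, while the individual tosses remain independent given `s`.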

The goal is to select an allocation k* to maximize expected return subject to controls on variance of returns. The expectation and variance are over S = (S_1, ..., S_n) and X = (X_11, ..., X_1n, ..., X_m1, ..., X_mn). However, S and X are unknown at the time of portfolio allocation, so their statistics must be estimated.

Let R be the distribution for S. Call R the actual prior. Let R̂ be an estimated prior distribution for S. Such a distribution can be based on historical response rates for ads. Using Bayesian statistics, let D be an estimated posterior distribution for S. Statistics for this estimated posterior can be computed from the estimated prior R̂ and the past performance of ads 1 to n. (Refer to [Berger 1985] for methods to compute statistics for the estimated posterior.) The statistics for the estimated posterior D and the bids for the ads determine the statistics for the estimated posterior for X, since revenue scales linearly with the bids.

A note on notation: expectations, variances, and covariances are over the distributions of the random variables in subscripts. For example, E_S is expectation over the distribution of S. Similarly, Var_{S,X} is variance over the joint distribution of (S, X).

3. OPTIMAL ALLOCATION

Let k be an allocation. Assume, without loss of generality, that the first k_1 ad calls are allocated to ad 1, the next k_2 are allocated to ad 2, etc. Then the revenue for allocation k is

\[
r(k, S, X) = \sum_{i=1}^{n} \; \sum_{h = k_1 + \cdots + k_{i-1} + 1}^{k_1 + \cdots + k_i} X_{hi}(S_i).
\]

So the allocation optimization problem (AOP) is:

\[
\max_{k} \; E_{S,X}\, r(k, S, X)
\]

subject to

\[
\mathrm{Var}_{S,X}\, r(k, S, X) \le d,
\]

where d is a specified bound on variance, and

\[
\forall i : k_i \ge 0, \quad \text{and} \quad \sum_{i=1}^{n} k_i = m.
\]

Since expectations are linear, the expected revenue is

\[
E_{S,X}\, r(k, S, X) = \sum_{i=1}^{n} k_i \, E_{S_i, X_i} X_i(S_i). \tag{1}
\]

The variance of revenue is:

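\[
\mathrm{Var}_{S,X}\, r(k, S, X) = \sum_{i=1}^{n} \sum_{j=1}^{n} k_i k_j \, \mathrm{Cov}_{S_i, S_j}\!\left[ E_{X_i} X_i(S_i),\; E_{X_j} X_j(S_j) \right] + \sum_{i=1}^{n} k_i \, E_{S_i} \mathrm{Var}_{X_i} X_i(S_i). \tag{2}
\]

(See Appendix B for the proof.)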
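As a worked instance of Equations (1) and (2), the sketch below evaluates both for the pay-per-response example of Section 2, where X_i(S_i) is a bid times a Bernoulli(S_i) draw. It assumes independent response rates across ads, so the cross-ad covariance terms vanish. The function name and the inputs (mean and variance of each S_i) are illustrative choices, not the paper's code.

```python
def revenue_mean_and_variance(k, bids, rate_means, rate_vars):
    """Evaluate Equations (1) and (2) for independent pay-per-response ads.

    Assumptions for illustration: X_i(S_i) = bid * Bernoulli(S_i), and the
    S_i are independent across ads, so only the i == j covariance terms remain.
    """
    mean = 0.0
    variance = 0.0
    for k_i, bid, mu, s2 in zip(k, bids, rate_means, rate_vars):
        expected_payoff = bid * mu                     # E_{S,X} X_i(S_i)
        uncertainty = bid * bid * s2                   # Var_S[E_X X_i(S_i)]
        # E_S[Var_X X_i(S_i)] = bid^2 * E[S(1 - S)] = bid^2 * (mu - mu^2 - s2)
        randomness = bid * bid * (mu - mu * mu - s2)
        mean += k_i * expected_payoff
        variance += k_i * k_i * uncertainty + k_i * randomness
    return mean, variance

# One ad with a $1 bid, response rate mean 0.001 and standard deviation 0.0001,
# shown on 1000 ad calls.
mean, variance = revenue_mean_and_variance(
    k=[1000], bids=[1.0], rate_means=[0.001], rate_vars=[0.0001 ** 2]
)
print(mean, variance)
```

The first variance term grows with the square of the allocation and the second grows linearly, which is the distinction the later sections rely on.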

Define matrix A as

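\[
a_{ij} = \mathrm{Cov}_{S_i, S_j}\!\left[ E_{X_i} X_i(S_i),\; E_{X_j} X_j(S_j) \right],
\]

and define vectors b and c:

\[
b_i = E_{S_i}\!\left[ \mathrm{Var}_{X_i} X_i(S_i) \right] \quad \text{and} \quad c_i = E_{S_i, X_i} X_i(S_i).
\]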

(Please excuse the abuse of notation: the symbol b represents the advertiser’s bid elsewhere in this paper.) The allocation optimization problem (AOP) can be stated as

\[
\max_{k} \; c^T k
\]

subject to

\[
k^T A k + b^T k \le d,
\]

\[
k \ge 0 \quad \text{and} \quad \mathbf{1}^T k = m.
\]

Call this the matrix allocation problem (MAP). This is a convex quadratic programming problem, which can be solved by any of a number of available quadratic programming (QP) solvers, employing techniques such as Wolfe’s method [Wolfe 1959; Franklin 1980].

Alternatively, use a parameter q ∈ [0, ∞) to express how much to weight average returns versus variance. Solve the problem

\[
\min_{k} \; k^T A k + b^T k - q\, c^T k
\]

subject to

\[
k \ge 0 \quad \text{and} \quad \mathbf{1}^T k = 1.
\]

Call this the q-weighted matrix allocation problem (QMAP). This convex quadratic programming problem is in a form that is convenient for many QP solvers. (For general background on allocation problems, refer to [Franklin 1980].)
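To make the QMAP statement concrete, here is a small pure-Python sketch that solves it by projected gradient descent. This technique is swapped in for illustration only; it is not Wolfe’s method and not the solver used in the paper, and a production system would more likely call a dedicated QP solver. The names `project_to_simplex` and `solve_qmap`, the step-size rule, and the toy inputs are assumptions. The matrix A is assumed symmetric positive semidefinite.

```python
def project_to_simplex(v, total):
    """Euclidean projection of v onto {k : k >= 0, sum(k) = total}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - total) / j
        if u_j - t > 0:      # holds for a prefix of the sorted values
            theta = t        # keep the threshold from the last valid j
    return [max(x - theta, 0.0) for x in v]

def solve_qmap(A, b, c, q, total=1.0, steps=5000):
    """Minimize k^T A k + b^T k - q c^T k subject to k >= 0, sum(k) = total."""
    n = len(c)
    k = [total / n] * n
    # The gradient 2 A k + b - q c is Lipschitz with constant at most
    # 2 * (largest absolute row sum of A); use the reciprocal as step size.
    lipschitz = 2.0 * max(sum(abs(a) for a in row) for row in A)
    step = 1.0 / lipschitz if lipschitz > 0 else 1.0
    for _ in range(steps):
        grad = [
            2.0 * sum(A[i][j] * k[j] for j in range(n)) + b[i] - q * c[i]
            for i in range(n)
        ]
        k = project_to_simplex([k[i] - step * grad[i] for i in range(n)], total)
    return k

# Toy problem: two uncorrelated ads with equal variance; only ad 0 pays.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
c = [1.0, 0.0]
for q in (0.0, 1.0, 10.0):
    print(q, solve_qmap(A, b, c, q))
```

In this toy problem the allocation moves from an even split at q = 0 toward the higher-paying ad as q grows, which mirrors the role the text assigns to q.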

4. RELATIONSHIP TO PORTFOLIO ALLOCATION IN FINANCE

The inventory allocation problems MAP and QMAP have the same form as the standard portfolio allocation problem in finance [Markowitz 1952; Lintner 1965; Sharpe 1964; Tobin 1958; Fabozzi et al. 2007]. In finance, an investor seeks to allocate funds among investments, with the goals of achieving high expected returns and low variance of returns. In our scenario, a publisher seeks to allocate ad calls among ads, with the goals of achieving high expected revenue and low variance of revenue.

Like financial investors selecting a portfolio, using MAP and QMAP causes revenue-seeking risk-averse publishers to:

(1) Allocate more ad calls to ads that have higher expected revenues.
(2) Allocate more ad calls to ads that have less variance of revenues.
(3) Diversify: spread ad calls more evenly over multiple ads to reduce variance of revenue when ads with less correlated revenues are available.

The first point is obvious. The second point leads us to explore factors that determine variance in the next section. On the third point, we examine diversification and covariance next in this section.

To examine diversification, suppose there are r ads available, and they have independent and identical distributions S_i and X_i(S_i). Then all allocations k_1, ..., k_r of k ad calls have the same expected revenue. Because of independence,

\[
\forall i \ne j : \; k_i k_j \, \mathrm{Cov}_{S_i, S_j}\!\left[ E_{X_i} X_i(S_i),\; E_{X_j} X_j(S_j) \right] = 0.
\]

So the variance of revenue is

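\[
\sum_{i=1}^{r} \left( k_i^2 \, \mathrm{Var}_{S_i}\!\left[ E_{X_i} X_i(S_i) \right] + k_i \, E_{S_i}\!\left[ \mathrm{Var}_{X_i} X_i(S_i) \right] \right).
\]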

The first term is variance due to uncertainty about the ad’s response rate. The second term is variance due to random differences between actual and observed response rates. Let the uncertainty terms

\[
\mathrm{Var}_{S_1}\!\left[ E_{X_1} X_1(S_1) \right] = \ldots = \mathrm{Var}_{S_r}\!\left[ E_{X_r} X_r(S_r) \right] = \alpha.
\]

Let the randomness terms

\[
E_{S_1}\!\left[ \mathrm{Var}_{X_1} X_1(S_1) \right] = \ldots = E_{S_r}\!\left[ \mathrm{Var}_{X_r} X_r(S_r) \right] = \beta.
\]

(The terms are equal because the distributions of S_i and X_i(S_i) are assumed to be identical over ads.) If all ad calls are allocated to a single ad, then the variance is k²α + kβ. If the ad calls are distributed uniformly over the ads, then the variance is (1/r)k²α + kβ. So diversification reduces variance caused by uncertainty.
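The two variance expressions can be checked numerically with a short sketch. The function name and the values of α, β, k, and r below are hypothetical and chosen only to illustrate the comparison.

```python
def portfolio_variance(allocation, alpha, beta):
    """Variance of revenue for i.i.d. ads: sum of k_i^2 * alpha + k_i * beta."""
    return sum(k_i * k_i * alpha + k_i * beta for k_i in allocation)

k, r = 1000, 10            # hypothetical: 1000 ad calls, 10 identical ads
alpha, beta = 1e-8, 1e-3   # hypothetical uncertainty and randomness terms

single = portfolio_variance([k] + [0] * (r - 1), alpha, beta)
uniform = portfolio_variance([k / r] * r, alpha, beta)

print("single ad:", single)
print("uniform  :", uniform)
```

Only the uncertainty part shrinks under the uniform split (by a factor of r); the randomness part kβ is the same for both allocations.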

Now consider the role of covariance. In our model, randomness is independent among ad calls and among ads. In contrast, uncertainty is completely correlated among ad calls allocated to the same ad, and it may also be correlated among different ads. In practice, uncertainty becomes correlated among ads when they “share” learning. For example, in a tree-based model for learning response rates, the ad calls and responses for each ad may influence response rate estimates for other ads in the same branch of the tree. As a result, differences between estimated and actual expected response rates are likely to become correlated for ads that are neighbors in the tree.

Empirical data can be used to estimate covariance of expected returns among ads. A model for the covariance can be based on whether ads share a branch or sub-branch in a tree model for response rate estimation [Agarwal et al. 2007; Dudik et al. 2007; Gelman and Hill 2007], are in the same cluster in a cluster model [Regelson and Fain 2006], have similar scores for factors in a factor-based model [Agarwal and Chen 2009; Weinberger et al. 2009; Richardson et al. 2007], or use the same rules in a rule-based model [Dembczynski et al. 2008]. The model for covariance can be trained by using converged response rate estimates for a population of experienced ads as proxies for actual response rates and observing how differences between early estimates and converged estimates are correlated for ads that “share” learning.

Similar to investments in finance, if two new ads have the same estimated expected returns that are at or near market levels, but one has returns that are less correlated with the other ads that have high expected returns, then that one has more value to the publisher. The publisher can reduce variance while maintaining returns by adding that ad to the portfolio.

5. FACTORS THAT INFLUENCE VARIANCE

This section focuses on variance of returns for an allocation of ad calls in a market to some ad. To focus on individual allocations, assume in this section that covariances with respect to all S_i and S_j are zero: ∀i ≠ j, a_ij = 0. (This occurs when ad response rates are estimated independently.)

This section is organized as follows. The first three subsections address aspects of ad response rates and revenues that contribute to variance. The last subsection combines these aspects in a model of variance that allows us to examine contributions from different factors.

5.1 Uncertainty and Randomness

The portion of variance in revenue due to each ad i in the portfolio allocation is

\[
k_i^2 \, \mathrm{Var}_{S_i}\!\left[ E_{X_i} X_i(S_i) \right] + k_i \, E_{S_i}\!\left[ \mathrm{Var}_{X_i} X_i(S_i) \right].
\]

The first term is variance due to uncertainty. The second term is variance due to randomness. Variance due to uncertainty scales with the square of allocated ad calls k_i. Variance due to randomness scales linearly. Uncertainty increases with allocation size because the actual response rate S_i is drawn once and applies to all ad calls allocated to ad i, making their revenues correlated. In contrast, deviations in revenue due to differences between actual and observed response rates are independent from ad call to ad call.

5.2 Price Types and Response Rates

Offers with different price types compete for the same ad calls in marketplaces for display advertising such as Yahoo’s RightMedia exchange. Typical price types include CPM (cost per mille), where the advertiser pays per ad call, CPC (cost per click), where the advertiser pays only when a user clicks on the ad, and CPA (cost per action), where the advertiser pays only when the user responds to the ad with some specified action. This paper focuses on CPC and CPA ads, where the expected revenue for showing an ad depends on the rate of response: a click for CPC ads or an action for CPA ads.

Response rates vary. CPC response rates typically have orders of magnitude 1/100 or 1/1000. CPA response rates are typically lower because actions, such as making a purchase or even filling out a form after clicking on an ad, are typically rarer than clicks.

CPC and CPA offers compete on the basis of expected revenue. Let b be the bid, the amount an advertiser pays per click or action. Let p be the response rate. Then the expected revenue per ad call is bp. If c is the expected revenue required to be competitive, then advertisers have to bid at least

\[
b = \frac{c}{p}.
\]

So CPA advertisers typically have higher bids than CPC advertisers.

5.3 Learning and Uncertainty

As ads are shown to users, response data are collected, which decreases uncertainty about response rates. Suppose an ad has actual response rate p and obtains u responses from being allocated to v ad calls. Treating each ad call as a Bernoulli trial [Feller 1968] with success probability p, u/v is an unbiased estimator:

\[
E\!\left[ \frac{u}{v} \right] = p,
\]

with standard deviation

\[
\sigma = \sqrt{\frac{p(1-p)}{v}} \approx \sqrt{\frac{p}{v}}.
\]

(Since response rates p are small, 1 − p is approximately one.) Based on the approximation

\[
E\left| \frac{u}{v} - p \right| \approx \sigma,
\]

the relative error is

\[
\frac{E\left| \frac{u}{v} - p \right|}{p} \approx \frac{\sigma}{p} \approx \frac{1}{\sqrt{vp}}. \tag{3}
\]



So smaller response rates require proportionally more samples to achieve the same relative uncertainty. (Note that vp is the expected number of responses. So, for example, 100 responses produce an estimate with about 10% relative error.)
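Equation (3) is easy to tabulate. The sketch below is illustrative only; the function names are invented here, and the example rates are the typical CPC and CPA magnitudes mentioned in the text.

```python
import math

def relative_error(v, p):
    """Approximate relative error of the estimate u/v, from Equation (3)."""
    return 1.0 / math.sqrt(v * p)

def learning_calls_needed(p, target):
    """Ad calls v for which 1 / sqrt(v * p) equals the target relative error."""
    return 1.0 / (p * target ** 2)

for p in (0.001, 0.0001):  # typical CPC-like and CPA-like response rates
    err = relative_error(100_000, p)
    need = learning_calls_needed(p, 0.10)
    print(f"p={p}: error after 100,000 calls = {err:.3f}, "
          f"calls for 10% error = {need:,.0f}")
```

Dividing the response rate by ten multiplies the number of ad calls needed for a given relative error by ten, which is the proportionality the paragraph above describes.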

This is a simplified view of response rate estimation and learning. In practice, response rate estimation systems “share” learning among similar ads and ad calls, which usually makes them more effective. However, in practice, response rates need not be stationary. Also, ad calls used to estimate response rates may not be drawn from the same distribution as ad calls to which the estimates are applied. For example, ad calls that have been allocated to a new ad may be over-concentrated on some times of day or geographic areas. So there are reasons for actual systems to perform both better and worse than the simple model we use here.

5.4 Model

Now we develop a simplified model of variance to examine relationships between uncertainty, randomness, response rates, and learning. For ad i:

—Let k be the number of ad calls allocated to ad i in the current session.
—Let b be the advertiser’s bid, the amount paid per click or conversion.
—Let p be the (unknown) actual response rate.
—Let d be the standard deviation in the estimate of p as a fraction of p.

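Then the portion of portfolio variance due to ad i,

\[
k_i^2 \, \mathrm{Var}_{S_i}\!\left[ E_{X_i} X_i(S_i) \right] + k_i \, E_{S_i}\!\left[ \mathrm{Var}_{X_i} X_i(S_i) \right],
\]

is

\[
\approx k^2 d^2 b^2 p^2 + k b^2 p. \tag{4}
\]

To account for bids needing to be higher if response rates are lower, substitute b = c/p:

\[
= k^2 d^2 c^2 + \frac{k c^2}{p}.
\]

To include the effects of learning, substitute d = 1/√(vp):

\[
= \frac{k^2 c^2}{vp} + \frac{k c^2}{p}.
\]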

Based on this model, an ad contributes more variance when the allocation k is larger, when the response rate p is smaller, and when fewer learning ad calls v have been used to estimate the response rate.

Since the first term represents uncertainty and the second term randomness, the ratio of uncertainty to randomness is

\[
\frac{k}{v} : 1.
\]



So, for example, when the number of learning ad calls v is about 10 times the allocation k for the present session, uncertainty accounts for about 10% of the variance in revenue.
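The closing example can be reproduced with the two terms of the simplified model. This is a sketch with an invented function name and hypothetical input values; only the ratio k/v matters for the result.

```python
def variance_terms(k, c, p, v):
    """Uncertainty and randomness terms of the simplified variance model."""
    uncertainty = k * k * c * c / (v * p)  # k^2 c^2 / (v p)
    randomness = k * c * c / p             # k c^2 / p
    return uncertainty, randomness

# Hypothetical values: learning ad calls v are 10 times the allocation k.
uncertainty, randomness = variance_terms(k=10_000, c=0.001, p=0.001, v=100_000)
ratio = uncertainty / randomness                   # equals k / v
share = uncertainty / (uncertainty + randomness)   # uncertainty's share of variance
print(f"ratio = {ratio:.3f}, uncertainty share = {share:.3f}")
```

With k/v = 1/10 the share of variance due to uncertainty is 1/11, or roughly 9%, consistent with the "about 10%" stated above.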

6. SIMULATIONS

This section presents results of simulations to explore how portfolio allocation for online ad inventory affects average revenue, standard deviation of revenue, and selectivity, meaning the tendency to award ad calls to ads that offer the greatest actual expected revenue. Selectivity is important because consistently allocating more ad calls to an ad with maximum expected revenue encourages competitive bidding among advertisers.

The simulations focus on markets for display advertising that have a mix of CPC and CPA ads. These results should also apply to markets that have only CPA ads, with a variety of definitions of an action and hence a variety of response rates. This is the typical case for CPA-dominated markets in display, where an action may mean anything from filling out a form to completing a major purchase. The parameter values used in the simulations are based on observed values, though actual display advertising markets host a wide variety of auctions, with different numbers of offers, different combinations of offer price types, and varying response rates.

In the simulations, ad response rates S_i are independent of each other. So we use the following notation for distributions. For each ad i, let R_i be the actual prior distribution for S_i. Let R̂_i be the estimated prior, and let D_i be the estimated posterior distribution.

Each simulation is based on a set of actual priors R_i and estimated priors R̂_i. Each simulation follows the steps:

(1) Generate ad response rates S_i at random based on actual priors R_i.
(2) For each ad, randomly generate a series of 100,000 “learning” ad calls, with response rate S_i, and record the number of responses.
(3) For each ad, compute an estimated posterior D_i based on the ad’s estimated prior R̂_i and numbers of learning ad calls and responses, as detailed in Appendix 0??.
(4) Use QMAP to allocate ad calls over the ads, based on statistics over the estimated posteriors D_i.
(5) Record the actual expected revenue: r_q ≡ ∑_i k_i S_i b_i, where k_i is the number of ad calls allocated to ad i and b_i is the bid for ad i. This is the expected revenue achieved by the QMAP allocation.
(6) Record the estimated standard deviation of revenue, which is the square root of the estimated variance in the QMAP allocation.
(7) Record the selectivity, which is defined as the fraction of ad calls awarded to any ad with maximum actual expected revenue.
(8) For comparison, record the ideal expected revenue: r* = max_i S_i b_i m, where b_i is the bid for ad i. This would be the expected revenue if perfect knowledge of response rates could be used to select an ad with maximum actual expected revenue.
(9) Also for comparison, identify a “single winner” ad: an ad with maximum estimated revenue based on the estimated posteriors D_i. (The revenue estimate is the bid times the mean of the estimated posterior.) This is the ad that would be selected to receive all ad calls, based on the available information, if there were no portfolio allocation. Record the actual expected revenue r_s and selectivity (zero or one) for the single winner ad.
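The quantities recorded in steps (5), (7), (8), and (9) can be sketched as a single helper. This is an illustrative reading of the definitions above, not the authors' simulation code; the function name, the dictionary keys, and the two-ad example values are assumptions.

```python
def evaluate_allocation(k, rates, bids, est_means):
    """Compute r_q, r*, r_s, and selectivities for one simulated market.

    k         -- ad calls allocated to each ad (for example, by QMAP)
    rates     -- actual response rates S_i
    bids      -- bids b_i
    est_means -- means of the estimated posteriors D_i
    """
    m = sum(k)
    actual = [s * b for s, b in zip(rates, bids)]  # actual expected revenue per call
    best = max(actual)
    r_q = sum(k_i * a for k_i, a in zip(k, actual))                       # step (5)
    selectivity = sum(k_i for k_i, a in zip(k, actual) if a == best) / m  # step (7)
    r_star = best * m                                                     # step (8)
    estimated = [mu * b for mu, b in zip(est_means, bids)]
    winner = max(range(len(k)), key=lambda i: estimated[i])               # step (9)
    r_s = actual[winner] * m
    single_selectivity = 1.0 if actual[winner] == best else 0.0
    return {
        "r_q": r_q, "r_star": r_star, "r_s": r_s,
        "selectivity": selectivity, "single_selectivity": single_selectivity,
    }

# Hypothetical two-ad market: a $1 CPC-like ad and a $10 CPA-like ad.
result = evaluate_allocation(
    k=[60, 40], rates=[0.0010, 0.00012], bids=[1.0, 10.0],
    est_means=[0.0011, 0.0001],
)
print(result)
```

In this made-up example the estimates rank the wrong ad first, so the single winner has selectivity zero while the split allocation still sends 40% of ad calls to the truly best ad.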

Each simulation computes the QMAP allocation and collects results for all q values in 0 to 1500 with increments of 25, the values 1750, 2000, 3000, 4000, 5000, 7500, the values 10,000 to 35,000 with increments of 5000, and the values 40,000 to 100,000 with increments of 10,000. (The plots only show results for q up to 20,000, because beyond that value the results change very little.) Each plot in this section shows results averaged over 10,000 simulations.
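For readers reproducing the sweep, the grid of q values described above can be written out directly. The variable name `q_values` is an illustrative choice.

```python
q_values = (
    list(range(0, 1501, 25))                      # 0 to 1500 in steps of 25
    + [1750, 2000, 3000, 4000, 5000, 7500]        # individually listed values
    + list(range(10_000, 35_001, 5_000))          # 10,000 to 35,000 in steps of 5000
    + list(range(40_000, 100_001, 10_000))        # 40,000 to 100,000 in steps of 10,000
)
print(len(q_values), q_values[:3], q_values[-3:])
```

The grid is dense for small q, where the allocations change quickly, and sparse for large q, where the text notes that results change very little.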

Each simulation uses 20 ads: 10 CPC ads and 10 CPA ads. The CPC ads have $1 bids and actual priors R_i = N(0.001, 0.0001), Gaussians with mean 0.001 and standard deviation 0.0001. The CPA ads have $10 bids and actual priors R_i = N(0.0001, 0.00001), so that their revenues have the same distributions as the CPC ad revenues.

There are simulations using several estimated priors R̂_i for response rates:

—Uniform: The prior is uniform over [0, 1]. Using this prior simulates focusing on the empirical performance for the ads and ignoring the distribution of response rates for past ads in the same marketplace in estimating response rates for the present set of ads.

—Approximate: The prior is uniform over [µ − 4σ, µ + 4σ], where µ and σ are the mean and standard deviation of the actual priors R_i. Using this prior simulates having and using approximate knowledge about the generating distributions for ad response rates.

—Exact: The prior is the actual distribution used to generate response rates: N(0.001, 0.0001) for CPC ads and N(0.0001, 0.00001) for CPA ads. Using this prior simulates having exact knowledge of the generating distributions, an ideal that does not occur in practice.
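For the uniform prior, the estimated posterior has a well-known closed form: a uniform prior over [0, 1] is a Beta(1, 1) distribution, so after u responses in v ad calls the posterior is Beta(u + 1, v − u + 1). The sketch below computes its mean and variance. It covers only the uniform case and is not necessarily the procedure in the paper's appendix; the function name is an assumption.

```python
def uniform_prior_posterior(responses, calls):
    """Posterior mean and variance of a response rate under a uniform prior.

    With a Beta(1, 1) prior and binomial data, the posterior is
    Beta(responses + 1, calls - responses + 1).
    """
    a = responses + 1
    b = calls - responses + 1
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance

# Hypothetical learning outcome: 100 responses in 100,000 learning ad calls.
mean, variance = uniform_prior_posterior(100, 100_000)
print(f"posterior mean = {mean:.6f}, posterior sd = {variance ** 0.5:.6f}")
```

Multiplying the posterior mean by the bid gives the revenue estimate used to pick a single winner, and bid squared times the posterior variance gives the uncertainty term for that ad.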

Figures 1 and 2 show how QMAP mediates the tradeoff between estimated expectation and estimated variance of revenues. As expected, as q increases, estimated expectations increase, and so do estimated variances. Figure 1 shows estimated expected revenues as fractions of ideal revenues r*. Note that estimated expected revenues exceed ideal revenues as q increases. This is because selecting ads based on expected revenues estimated from some learning ad calls introduces selection bias [Bax and Romero 2009; Galton 1886], where ads with the highest revenue estimates are likely to have over-estimates. Figure 2 shows estimated standard deviations of revenues, which are square roots of variances.
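The selection bias mentioned above can be seen in a small experiment: when several ads share the same true response rate, the largest of their noisy estimates is, on average, above that rate. The sketch below is illustrative; the function name, the seed, and the parameter values are assumptions and are much smaller than the paper's simulation settings so that it runs quickly.

```python
import random

def mean_max_estimate(true_rate, calls, num_ads, trials, rng):
    """Average, over trials, of the largest empirical response rate among ads."""
    total = 0.0
    for _ in range(trials):
        estimates = []
        for _ in range(num_ads):
            # Empirical rate from `calls` Bernoulli(true_rate) learning ad calls.
            responses = sum(1 for _ in range(calls) if rng.random() < true_rate)
            estimates.append(responses / calls)
        total += max(estimates)  # the ad a revenue maximizer would pick
    return total / trials

rng = random.Random(1)  # fixed seed so the run is repeatable
true_rate = 0.01
biased = mean_max_estimate(true_rate, calls=1000, num_ads=10, trials=100, rng=rng)
print(f"true rate = {true_rate}, mean of winning estimate = {biased:.4f}")
```

Every ad here is equally good, so any gap between the winning estimate and the true rate is pure over-estimation of the kind that inflates the estimated expected revenues in Figure 1.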

The following subsections focus on actual (rather than estimated) revenue and selectivity for the QMAP allocations. There is a subsection for each estimated prior R̂_i: uniform, approximate, and exact. The results in the subsections show that controlling estimated variance can increase actual revenue, and the effect is strongest for the least accurate estimated priors.


Fig. 1. Estimated Expected Revenue. [Plot of revenue (fraction of ideal revenue) against q, for q from 0 to 20,000. Series: single winner and portfolio, each for the uniform, approximate, and exact priors.]

Fig. 2. Estimated Standard Deviation of Revenue. [Plot of standard deviation against q, for q from 0 to 20,000. Series: single winner and portfolio, each for the uniform, approximate, and exact priors.]



6.1 Uniform Prior

Figures 3, 4, 5, and 6 show results for using a uniform prior over [0, 1] to estimate distributions for returns as inputs to portfolio optimization. Figure 3 shows revenues r_q for QMAP allocations for various q values and revenue r_s for selecting a single winner. Single-winner revenue is plotted as a horizontal line because it does not depend on q. The revenues are shown as a fraction of ideal revenue r*. Figure 4 shows how selectivity varies with q. Recall that selectivity is defined as the fraction of ad calls allocated to an ad with maximum actual expected value.

For q near zero, QMAP expected revenue and selectivity are smaller than for selecting a single winner, because QMAP emphasizes controlling estimated variance over maximizing estimated revenue. In contrast, for the largest values of q, QMAP emphasizes estimated revenue over estimated variance, allocating most ad calls to one or a few ads with the highest estimated revenues. As a result, QMAP actual revenue and selectivity are close to single-winner actual revenue and selectivity. The surprise is around q = 1000. Here, QMAP achieves higher actual expected revenue and selectivity than selecting a single winner.

Figures 5 and 6 show why this occurs. The figures show the fractions of ad calls that QMAP allocates to different ads. Both figures separate the ads by class, with the CPC allocation above the middle solid line and the CPA allocation below.

Figure 5 shows allocations ordered by shares of the ad calls within each class. For example, in Figure 5, the distance from the top of the figure to the top solid line shows the average (over iterations) of the allocation to the CPC ad with the greatest share of the ad calls.

Figure 6 shows allocations ordered by actual expected revenues within each class. For example, the distance from the middle solid line down to the nearest dotted line in Figure 6 shows the average allocation to the CPA ad with the greatest actual expected revenue. (Ties are broken randomly.)

Observe how the allocations between CPC and CPA ads change with q. For q = 0, the allocations are nearly equal within classes, as QMAP seeks to reduce estimated variance. As q increases into the hundreds, QMAP allocates more ad calls to CPC ads. To see why, recall from Equation (3) that the standard deviation of the response rate as a fraction of response rate scales with 1/√(vp), where p is the response rate and v is the number of learning ad calls. Since both ad classes received the same number of learning ad calls, the higher response rates for CPC ads indicate less standard deviation in response rate as a fraction of response rate. Since expected revenues are response rate scaled by bid, CPC ads have revenues with smaller random fluctuations from their expectations. So QMAP favors CPC ads in order to reduce variance of returns by reducing random fluctuations.

As q reaches into the thousands, though, QMAP favors CPA ads. For these q values, QMAP emphasizes increasing estimated expectation over decreasing estimated variance of revenue. Because the standard deviation of the response rate as a fraction of response rate scales with 1/√(vp), the estimates of response rates for the CPA ads are less accurate than the estimates for CPC ads. So the CPA ads include the ads with the greatest and the least estimated expected revenues. By selecting an ad to maximize estimated expected revenue, QMAP selects a CPA ad. But it selects based on misestimation more than on actual value offered by the ad.


Fig. 3. Actual Expected Revenue – Uniform Prior. [Plot of revenue (fraction of ideal revenue) against q, for q from 0 to 20,000. Series: single winner revenue r_s and portfolio revenue r_q.]

To see this, compare the CPA sections on the right sides of Figures 5 and 6. Figure 5 shows that QMAP allocates the majority of ad calls to a single CPA ad with greatest estimated revenue. Figure 6 shows that the selected ad is almost as likely to have medium or low actual revenue as high actual revenue.

For q near 1000, QMAP balances the allocation between CPC and CPA ads. For these values of q, QMAP emphasizes estimated revenue, but not so much that it selects based mostly on misestimation. As a result, QMAP maximizes its actual expected revenue and selectivity and also outperforms the single-winner allocation.

6.2 Approximate Prior

Figures 7, 8, 9, and 10 show results for using approximate priors to form estimated posterior distributions for ad revenues, which, in turn, produce statistics for input to portfolio optimization. Recall that the actual priors used to generate ad response rates are Gaussians, with µ = 0.001 for the CPC ads, µ = 0.0001 for the CPA ads, and σ = µ/10 for both. We use approximate priors that are uniform over [µ − 4σ, µ + 4σ], so almost all actual response rates fall in the covered range. The intention is to simulate using historical ad response rates in each category to estimate the generating prior.

These figures show results that are similar to the results for using a uniform prior. Figures 7 and 8 show that when QMAP achieves maximum actual expected revenue and selectivity, QMAP outperforms selecting a single winner. Figures 9 and 10 show a similar evolution of allocations as in Figures 5 and 6, with allocations nearly balanced within classes for q = 0, CPC allocations increasing initially, and then CPA allocations increasing as q increases. Using the approximate prior leads to less dramatic fluctuations in allocations between CPC and CPA ads and also to less dramatic increases in revenue and selectivity over selecting a single winner.



Fig. 4. Selectivity – Uniform Prior. [Plot of selectivity against q, for q from 0 to 20,000. Series: single winner selectivity and portfolio selectivity.]

Fig. 5. QMAP Allocations Ordered by Shares of Ad Calls – Uniform Prior. [Plot of cumulative % allocation against q, for q from 0 to 20,000. Regions: CPC allocations and CPA allocations.]

6.3 Exact Prior

Figures 11, 12, 13, and 14 show results for using the actual generating priors to form estimated posterior distributions for ad revenues. In this case, both the estimating and generating priors are Gaussians with µ = 0.001 for the CPC ads, µ = 0.0001 for the CPA ads, and σ = µ/10 for both. The intention is to simulate having extremely accurate knowledge of the priors.

Fig. 6. QMAP Allocations Ordered by Actual Values – Uniform Prior. [Plot of cumulative % allocation against q, for q from 0 to 20,000. Regions: CPC allocations and CPA allocations.]

Fig. 7. Actual Expected Revenue – Approximate Prior. [Plot of revenue (fraction of ideal revenue) against q, for q from 0 to 20,000.]

Fig. 8. Selectivity – Approximate Prior. [Plot of selectivity versus q for 0 ≤ q ≤ 20000.]

Fig. 9. QMAP Allocations Ordered by Shares of Ad Calls – Approximate Prior. [Plot of cumulative % allocation versus q for 0 ≤ q ≤ 20000; series: CPC allocations and CPA allocations.]

Fig. 10. QMAP Allocations Ordered by Actual Values – Approximate Prior. [Plot of cumulative % allocation versus q for 0 ≤ q ≤ 20000; series: CPC allocations and CPA allocations.]

Fig. 11. Actual Expected Revenue – Exact Prior. [Plot of revenue (fraction of ideal revenue) versus q for 0 ≤ q ≤ 20000.]

Figures 11 and 12 show QMAP actual expected revenue and selectivity very close to single-winner actual expected revenue and selectivity for q > 5000. Figures 13 and 14 show that the allocations favor CPC ads for q > 0. By heavily favoring CPC ads, QMAP usually forgoes selecting the revenue-maximizing ad when it is among the CPA ads. However, since the revenue estimates are more accurate for CPC than CPA ads, favoring CPC ads can increase the likelihood of QMAP selecting an ad with greatest actual expected revenue. Figure 14 shows this. From the right side of the figure, observe that QMAP awards about 40% of ad calls to a CPC ad with greatest actual expected revenue, out of about 90% of ad calls allocated to all CPC ads. So about 45% of ad calls that QMAP awards to any CPC ad are awarded to a CPC ad that maximizes value. In contrast, only about 25% of ad calls that QMAP awards to any CPA ad are awarded to one offering maximum value among CPA ads.
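The conditional shares quoted above are ratios of approximate values read from Figure 14. As a worked check (the 10% total share for CPA ads is inferred here as the remainder after roughly 90% goes to CPC ads, which is an assumption of this sketch rather than a number stated in the text):

```latex
\[
\frac{\text{share to a best CPC ad}}{\text{share to all CPC ads}}
  \approx \frac{0.40}{0.90} \approx 0.44,
\qquad
\text{share to a best CPA ad} \approx 0.25 \times 0.10 = 0.025 .
\]
```

The first ratio is the "about 45%" in the text. Under the inferred 10% CPA share, the second expression suggests that only about 2.5% of all ad calls go to a CPA ad offering maximum value among CPA ads.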



Fig. 12. Selectivity – Exact Prior. [Plot of selectivity versus q for 0 ≤ q ≤ 20000.]

Fig. 13. QMAP Allocations Ordered by Shares of Ad Calls – Exact Prior. [Plot of cumulative % allocation versus q for 0 ≤ q ≤ 20000; series: CPC allocations and CPA allocations.]

7. DISCUSSION

Fig. 14. QMAP Allocations Ordered by Actual Values – Exact Prior. [Plot of cumulative % allocation versus q for 0 ≤ q ≤ 20000; series: CPC allocations and CPA allocations.]

This paper describes a technique to allocate inventory among buyers in online advertising. The technique mediates a tradeoff between increasing estimated expected revenue and decreasing estimated variance of revenue. The estimated variance accounts for both randomness and uncertainty. Simulations show that using the method to control estimated variance of revenue can increase actual expected revenue by preventing the exchange from selecting a single winner based on an over-estimate of value.

The results of this paper suggest some steps to improve exchanges for online advertising:

—Use priors and a Bayesian approach to estimate response probabilities.

—Make the priors as accurate as possible. Use historical data to categorize ads, deduce the histograms or functional forms of priors, and fit any parameters.

—Apply portfolio optimization. Experiment with parameter settings such as q to optimize [Box et al. 2005] for a combination of actual expected revenue, selectivity, and, if desired, actual variance of revenue.
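The portfolio optimization step in the list above can be sketched as a small quadratic program. The objective used here, minimizing k'Ck − q·m'k over allocations k that are nonnegative and sum to one, is an assumed form chosen so that q = 0 gives the minimum-variance allocation and large q approaches a single winner; the paper's own formulation of QMAP may differ in scaling or constraints. The solver (projected gradient descent), the function names, and the example statistics are likewise illustrative assumptions.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {k : k_i >= 0, sum(k) = 1}."""
    sorted_desc = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, value in enumerate(sorted_desc, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / j
        if value - candidate > 0:       # holds for a prefix of the sorted values
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

def allocate(mean, cov, q, steps=5000):
    """Minimize k'Ck - q * m'k over the simplex by projected gradient descent.

    mean: estimated expected revenue per ad call for each ad (hypothetical input)
    cov:  estimated covariance matrix of those revenues (list of lists)
    q:    weight on expected revenue relative to variance (assumed role of q)
    """
    n = len(mean)
    k = [1.0 / n] * n
    # The maximum absolute row sum bounds the largest eigenvalue of a symmetric C,
    # so this is a safe upper bound on the gradient's Lipschitz constant.
    lipschitz = 2.0 * max(sum(abs(c) for c in row) for row in cov)
    rate = 1.0 / lipschitz if lipschitz > 0 else 1.0
    for _ in range(steps):
        grad = [2.0 * sum(cov[i][j] * k[j] for j in range(n)) - q * mean[i]
                for i in range(n)]
        k = project_to_simplex([k[i] - rate * grad[i] for i in range(n)])
    return k

# Made-up statistics for three ads with uncorrelated revenues.
means = [1.0, 0.9, 0.8]
covariance = [[0.04, 0.0, 0.0],
              [0.0, 0.04, 0.0],
              [0.0, 0.0, 0.01]]
min_variance_allocation = allocate(means, covariance, q=0.0)
revenue_heavy_allocation = allocate(means, covariance, q=100.0)
```

In this toy example, q = 0 spreads ad calls in inverse proportion to the variances, while a large q places essentially all ad calls on the ad with the highest estimated mean. Sweeping q between these extremes is one way to run the experiments over parameter settings that the list suggests.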

Though actual expected revenue, selectivity, and actual variance of revenue cannot be measured directly a priori, they can be estimated over time. Actual revenue is a good proxy for actual expected revenue when averaged over many ad calls. Selectivity can be estimated retrospectively, after ads accumulate enough ad calls to accurately estimate their response rates. Variance of revenue can be estimated from a time series of actual revenue for an ad over time.
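A minimal sketch of the last estimate mentioned above: given a series of revenue observations for one ad, compute the sample mean and the unbiased sample variance. Treating the observations as independent and identically distributed (no trend or seasonality) is an assumption of this sketch, and the function name and data are hypothetical.

```python
import statistics

def revenue_mean_and_variance(revenue_series):
    """Return the sample mean and unbiased sample variance of a revenue series."""
    if len(revenue_series) < 2:
        raise ValueError("need at least two observations to estimate variance")
    return statistics.mean(revenue_series), statistics.variance(revenue_series)

# Hypothetical revenue per period for one ad (made-up numbers).
period_revenue = [1.0, 2.0, 3.0, 4.0]
mean_revenue, revenue_variance = revenue_mean_and_variance(period_revenue)
```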

One direction for future work is to extend the portfolio allocation technique to operate in concert with an explore-exploit method [Gittins 1979; Gittins and Jones 1979; Gittins 1989; Auer et al. 2002; Langford et al. 2002; Vermorel and Mohri 2005; Agarwal et al. 2009; Audibert et al. 2007; Mnih et al. 2008]. (Explore-exploit methods are often called multi-armed bandit techniques.) The technique in this paper can play the role of an exploitation method, offering the side benefit of performing some exploration by allocating ad calls to multiple ads. However, there is no guarantee that the learning will be systematic or in any way optimal. One potential approach to extend the portfolio allocation method to include systematic exploration is to add terms to the ad revenue expectations and variances to account for distributions of future revenues due to learning more about response rates. For more on the value of learning, please refer to [Vermorel and Mohri 2005].

Exploration is investing ad calls to learn whether ads’ response rates warrant exploiting them by including them in future portfolio allocations. In a sense, exploration to determine an ad’s revenue statistics is investing in the option to exploit the ad should it be determined to contribute value to the portfolio. In future research, it would be interesting to explore how this is similar to investing in a call option [Hull 2008] in a financial market. In both cases, an upfront investment secures a right to decide whether to make another investment after more information is obtained. In a financial market, the information is revealed over time. In online advertising, the investment buys the information. In both cases, the downside risk is limited to the amount of the initial investment. In a financial market, this occurs when an option is out of the money. In online advertising, it occurs when the ad is discovered to have such poor revenue statistics that it should not be allocated ad calls.
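The limited-downside property mentioned above can be written out for a call option held to expiry. The symbols (premium c, strike price K, and asset price A_T at expiry) are notation introduced here for illustration and do not appear elsewhere in the paper:

```latex
\[
\text{profit} = \max(A_T - K,\, 0) - c \;\ge\; -c .
\]
```

The loss equals the premium c exactly when A_T ≤ K, that is, when the option expires out of the money. In the advertising analogy, c corresponds to the ad calls invested in exploration.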

Another direction for future work is to extend methods to accommodate uncertainty in portfolio analysis for financial markets to portfolio analysis for online advertising markets [Jorion 1986; Jobson et al. 1979; Vasicek 1973]. It should be useful to apply James-Stein corrections [Bock 1975; Brown 1966; Stein 1955; James and Stein 1961] or similar shrinkage methods to estimates of the means, variances, and covariances of revenue distributions for ads. Such methods may be useful even if the market maker selects a single winner. In this case, the portfolio uncertainty corrections can be applied to the statistics for all ads, and then the single winner can be selected to maximize the corrected mean revenue. Conversely, it would be interesting to explore whether controlling for estimated variance of returns has the effect of increasing actual expected returns in the financial markets, as we have seen for simulations in online advertising markets.
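To make the shrinkage idea above concrete, here is a minimal sketch of a positive-part James-Stein correction that pulls estimated mean revenues toward their grand mean. It assumes that every estimate carries the same known noise variance and that there are at least four ads; both assumptions, along with the function name and the numbers, are illustrative and are not part of the paper's method.

```python
def james_stein_toward_mean(estimates, noise_variance):
    """Positive-part James-Stein shrinkage of estimates toward their grand mean.

    Assumes each estimate is the true mean plus noise with the same known
    variance `noise_variance`. With fewer than four estimates the shrinkage
    factor is not defined in this form, so the inputs are returned unchanged.
    """
    p = len(estimates)
    if p < 4:
        return list(estimates)
    grand_mean = sum(estimates) / p
    spread = sum((x - grand_mean) ** 2 for x in estimates)
    if spread == 0.0:
        return list(estimates)
    factor = max(0.0, 1.0 - (p - 3) * noise_variance / spread)
    return [grand_mean + factor * (x - grand_mean) for x in estimates]

# Made-up estimated mean revenues for five ads, with unit noise variance.
raw_estimates = [1.0, 2.0, 3.0, 4.0, 5.0]
corrected = james_stein_toward_mean(raw_estimates, noise_variance=1.0)
winner = max(range(len(corrected)), key=lambda i: corrected[i])
```

With a common noise variance, the correction shrinks every estimate by the same factor and so preserves the ranking of ads. A change in the single winner would require a variant with ad-specific variances, which shrinks noisier estimates more strongly; that variant is not shown here.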

The field of robust optimization focuses on optimization under uncertainty. Some robust optimization approaches address strict uncertainty [Sniedovich 2007] (a term from decision theory [French 1988]), where the probabilities of possible outcomes are completely unknown. Others address less uncertain problems, where the distribution over possible outcomes is unknown but restricted to some set of distributions [Ben-Tal and Nemirovski 1998; Ben-Haim 2005; Chen et al. 2007]. In this paper, we began by examining the effect on risk [French 1988] of drawing a distribution over outcomes (corresponding to S) from a distribution over distributions (corresponding to R). Then we used simulations to explore the effect of having imperfect information about the distribution over distributions. This introduces a form of uncertainty beyond risk, but not as severe as strict uncertainty. In the future, it would be interesting to apply the methods of robust optimization to ad allocation problems with uncertainty about the parameters (such as the number of ad calls available) as well as the payoffs. For more on robust optimization for portfolio problems, refer to [Goldfarb and Iyengar 2003; Schottle 2007; Fabozzi et al. 2007].


A. VARIANCE OF PAYOFF FOR AN ALLOCATION

Theorem A.1.

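\[
\operatorname{Var}_{S,X}\, r(k,S,X) = \sum_{i=1}^{n} \sum_{j=1}^{n} k_i k_j \operatorname{Cov}_{S_i,S_j}\!\left[ E_{X_i} X_i(S_i),\; E_{X_j} X_j(S_j) \right] + \sum_{i=1}^{n} k_i\, E_{S_i} \operatorname{Var}_{X_i} X_i(S_i).
\]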

Proof. Use the well-known equality [Feller 1968] for variance, $\operatorname{Var} X = E X^2 - (E X)^2$:

\[
\operatorname{Var}_{S,X}\, r(k,S,X) = E_{S,X}\, r(k,S,X)^2 - [E_{S,X}\, r(k,S,X)]^2.
\]

For the first term, separate expectations for S and X, and apply the equality $E X^2 = \operatorname{Var} X + (E X)^2$:

\[
= E_S[\operatorname{Var}_X\, r(k,S,X)] + E_S [E_X\, r(k,S,X)]^2 - [E_{S,X}\, r(k,S,X)]^2. \tag{5}
\]

Now expand the three terms one at a time. For the first term, use the definition of $r(k,S,X)$:

\[
E_S[\operatorname{Var}_X\, r(k,S,X)] = E_S \operatorname{Var}_X \sum_{i=1}^{n} \; \sum_{h=k_1+\cdots+k_{i-1}}^{k_1+\cdots+k_i} X_{hi}(S_i).
\]

Since payoffs are i.i.d. with respect to X,

\[
E_S[\operatorname{Var}_X\, r(k,S,X)] = E_S \sum_{i=1}^{n} k_i \operatorname{Var}_{X_i} X_i(S_i) = \sum_{i=1}^{n} k_i\, E_{S_i}[\operatorname{Var}_{X_i} X_i(S_i)].
\]

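This is the last term on the RHS of the equation in the statement of the theorem.

Next, expand the second term of Equation (5). Use the definition of $r(k,S,X)$:

\[
E_S [E_X\, r(k,S,X)]^2 = E_S\!\left[ \left( E_X \sum_{i=1}^{n} \; \sum_{h=k_1+\cdots+k_{i-1}}^{k_1+\cdots+k_i} X_{hi}(S_i) \right) \left( E_X \sum_{j=1}^{n} \; \sum_{g=k_1+\cdots+k_{j-1}}^{k_1+\cdots+k_j} X_{gj}(S_j) \right) \right].
\]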

Since payoffs are i.i.d. with respect to X,

\[
= E_S \left[ \sum_{i=1}^{n} k_i\, E_{X_i} X_i(S_i) \right] \left[ \sum_{j=1}^{n} k_j\, E_{X_j} X_j(S_j) \right].
\]

Distribute ES and multiply the sums term-by-term.

\[
= \sum_{i=1}^{n} \sum_{j=1}^{n} k_i k_j\, E_S\!\left( E_{X_i} X_i(S_i) \cdot E_{X_j} X_j(S_j) \right). \tag{6}
\]

Now expand the third term of Equation (5). Substitute in the expectation of $r(k,S,X)$ from Equation (1).

\[
-[E_{S,X}\, r(k,S,X)]^2 = -\left[ \sum_{i=1}^{n} k_i\, E_{S_i,X_i} X_i(S_i) \right]^2
\]

Expand the square.

\[
= -\sum_{i=1}^{n} \sum_{j=1}^{n} k_i k_j\, E_{S_i,X_i} X_i(S_i)\; E_{S_j,X_j} X_j(S_j).
\]

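Apply the equality for covariance [Feller 1968]: $\operatorname{Cov}(X,Y) = E XY - (E X)(E Y)$.

\[
= -\sum_{i=1}^{n} \sum_{j=1}^{n} k_i k_j \left[ E_{S_i,S_j}\!\left( E_{X_i} X_i(S_i) \cdot E_{X_j} X_j(S_j) \right) - \operatorname{Cov}_{S_i,S_j}\!\left( E_{X_i} X_i(S_i),\; E_{X_j} X_j(S_j) \right) \right]
\]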

Carry through the sign and separate the expectation and covariance terms.

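\[
= -\sum_{i=1}^{n} \sum_{j=1}^{n} k_i k_j\, E_{S_i,S_j}\!\left( E_{X_i} X_i(S_i) \cdot E_{X_j} X_j(S_j) \right) + \sum_{i=1}^{n} \sum_{j=1}^{n} k_i k_j \operatorname{Cov}_{S_i,S_j}\!\left( E_{X_i} X_i(S_i),\; E_{X_j} X_j(S_j) \right).
\]

The first term cancels Equation (6). The second term completes the RHS of the statement of the theorem.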
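As a sanity check on Theorem A.1, the following sketch compares the formula with a brute-force computation of the variance on a tiny example. It assumes each payoff is a bid times a Bernoulli response with rate S_i, payoffs are independent given S, and S takes a few discrete joint values; the function names, bids, and scenario probabilities are made up for illustration.

```python
import itertools
import math

def binomial_pmf(trials, successes, rate):
    """Probability of `successes` responses in `trials` independent ad calls."""
    return math.comb(trials, successes) * rate ** successes * (1.0 - rate) ** (trials - successes)

def brute_force_variance(k, bids, scenarios):
    """Var of total payoff via E[r^2] - (E[r])^2, enumerating all response counts."""
    first = 0.0
    second = 0.0
    for prob, rates in scenarios:
        for counts in itertools.product(*(range(ki + 1) for ki in k)):
            weight = prob
            for ki, count, rate in zip(k, counts, rates):
                weight *= binomial_pmf(ki, count, rate)
            payoff = sum(bid * count for bid, count in zip(bids, counts))
            first += weight * payoff
            second += weight * payoff * payoff
    return second - first * first

def theorem_variance(k, bids, scenarios):
    """Right-hand side of Theorem A.1 for bid-times-Bernoulli payoffs."""
    n = len(k)
    # E_S of the conditional expected payoff per ad call, E_X X_i(S_i) = bid_i * S_i.
    mean_payoff = [sum(p * bids[i] * rates[i] for p, rates in scenarios) for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            cross = sum(p * bids[i] * rates[i] * bids[j] * rates[j] for p, rates in scenarios)
            total += k[i] * k[j] * (cross - mean_payoff[i] * mean_payoff[j])  # covariance term
    for i in range(n):
        # E_S of the conditional variance, Var_X X_i(S_i) = bid_i^2 * S_i * (1 - S_i).
        within = sum(p * bids[i] ** 2 * rates[i] * (1.0 - rates[i]) for p, rates in scenarios)
        total += k[i] * within
    return total

# Two ads, k = (2, 3) ad calls, and three joint scenarios (probability, response rates).
ad_calls = (2, 3)
example_bids = (1.0, 2.5)
example_scenarios = [(0.5, (0.1, 0.3)), (0.3, (0.2, 0.1)), (0.2, (0.4, 0.4))]
direct = brute_force_variance(ad_calls, example_bids, example_scenarios)
formula = theorem_variance(ad_calls, example_bids, example_scenarios)
```

The two values should agree up to floating-point rounding, because the theorem is an instance of the law of total variance: the double sum is the variance over S of the conditional expected payoff, and the single sum is the expectation over S of the conditional variance.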

REFERENCES

Agarwal, D., Broder, A. Z., Chakrabarti, D., and Diklic, D. 2007. Estimating rates of rare events at multiple resolutions. KDD 2007, 16–25.

Agarwal, D., Chen, B., and Elango, P. 2009. Explore/exploit schemes for web content optimization. To appear in Proceedings of IEEE International Conference on Data Mining 2009.

Agarwal, D. and Chen, B.-C. 2009. Regression-based latent factor models. KDD 2009, 19–28.

Audibert, J.-Y., Munos, R., and Szepesvari, C. 2007. Variance estimates and exploration function in multi-armed bandit. CERTIS Research Report 07-31.

Auer, P., Cesa-Bianchi, N., and Fischer, P. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47, 235–256.

Bax, E. and Romero, J. 2009. Comparing predicted prices. Caltech/Yahoo SISL Conference 2009.

Ben-Haim, Y. 2005. Value at risk with info-gap uncertainty. Journal of Risk Finance 6, 5, 388–403.

Ben-Tal, A. and Nemirovski, A. 1998. Robust convex optimization. Mathematics of Operations Research 23, 4, 769–805.

Berger, J. O. 1985. Statistical Decision Theory and Bayesian Analysis (2nd Edition). Springer.

Bock, M. E. 1975. Minimax estimators of the mean of a multivariate distribution. Annals of Statistics 3, 1, 209–218.

Box, G. E. P., Hunter, J. S., and Hunter, W. G. 2005. Statistics for Experimenters. John Wiley and Sons.

Brown, L. D. 1966. On the inadmissibility of invariant estimators of one or more location parameters. Annals of Mathematical Statistics 37, 1087–1136.

Chen, X., Sim, M., and Sun, P. 2007. A robust optimization perspective of stochastic programming. Operations Research 55, 6, 1058–1071.

Dembczynski, K., Kotlowski, W., and Weiss, D. 2008. Predicting ads’ click-through rate with decision rules. WWW 2008.



Dudik, M., Blei, D. M., and Schapire, R. E. 2007. Hierarchical maximum entropy density estimation. 249–256.

Edelman, B., Ostrovsky, M., and Schwarz, M. 2007. Internet advertising and the generalized second-price auction: selling billions of dollars worth of keywords. American Economic Review 97, 242–259.

Fabozzi, F. J., Kolm, P. N., Pachamanova, D. A., and Focardi, S. M. 2007. Portfolio Optimization and Management. Wiley Finance.

Feller, W. 1968. An Introduction to Probability Theory and Its Applications. John Wiley and Sons, New York, Chichester, Brisbane, Toronto, Singapore.

Franklin, J. 1980. Methods of Mathematical Economics. Springer-Verlag, New York, Heidelberg, Berlin.

French, S. D. 1988. Decision Theory. Ellis Horwood.

Galton, F. 1886. Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute of Great Britain and Ireland, 246–263.

Gelman, A. and Hill, J. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Gittins, J. C. 1979. Bandit processes and dynamic allocation indices. Journal of the Royal Statistical Society, Series B (Methodological) 41, 2, 148–177.

Gittins, J. C. 1989. Multi-Armed Bandit Allocation Indices. John Wiley and Sons, New York.

Gittins, J. C. and Jones, D. M. 1979. A dynamic allocation index for the discounted multiarmed bandit problem. Biometrika 66, 3, 561–565.

Goldfarb, D. and Iyengar, G. 2003. Robust portfolio selection problems. Mathematics of Operations Research 28, 1–38.

Hull, J. 2008. Options, Futures, and Other Derivatives, 7th Ed. Prentice Hall.

James, W. and Stein, C. 1961. Estimation with quadratic loss. Proc. Fourth Berkeley Symp. Math. Statist. Prob. 1, 361–379.

Jobson, J. D., Korkie, B., and Ratti, V. 1979. Improved estimation for Markowitz portfolios using James-Stein type estimators. Proc. of the American Stat. Assoc., Business and Economics Statistics Section 41, 279–284.

Jorion, P. 1986. Bayes-Stein estimation for portfolio analysis. Journal of Financial and Quantitative Analysis, 279–292.

Lahie, S. and Pennock, D. M. 2007. Revenue analysis of a family of ranking rules for keyword auctions. ACM Conference on Electronic Commerce, 50–56.

Langford, J., Zinkevich, M., and Kakade, S. 2002. Competitive analysis of the explore/exploit tradeoff. Proceedings of the Nineteenth International Conference on Machine Learning, 339–346.

Lintner, J. 1965. The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. The Review of Economics and Statistics 47, 1, 13–39.

Markovitz, H. M. 1952. Portfolio selection. Journal of Finance 7, 1, 77–91.

Mnih, V., Szepesvari, C., and Audibert, J.-Y. 2008. Empirical Bernstein stopping. Proceedings of the 25th International Conference on Machine Learning, 672–679.

Regelson, M. and Fain, D. 2006. Predicting click-through rate using keyword clusters.

Richardson, M., Dominowska, E., and Ragno, R. 2007. Predicting clicks: estimating the click-through rate for new ads.

Schottle, K. 2007. Robust Optimization with Application in Asset Management. Ph. D. Thesis, Technische Universität München.

Sharpe, W. F. 1964. Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance 19, 3, 425–442.

Sniedovich, M. 2007. The art and science of modeling decision-making under severe uncertainty. Decision Making in Manufacturing and Services 1, 1-2, 111–136.

Stein, C. 1955. Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proceedings of the Third Berkeley Symp. on Prob. and Stat. 1, 197–206.



Tobin, J. 1958. Liquidity preference as behavior towards risk. The Review of Economic Studies 25, 65–86.

Varian, H. R. 2006. Position auctions. International Journal of Industrial Organization 25, 1163–1178.

Varian, H. R. 2009. Online ad auctions. American Economic Review 99, 430–434.

Vasicek, O. 1973. A note on using cross-sectional information on Bayesian estimation of security betas. Journal of Finance 28, 1233–1239.

Vermorel, J. and Mohri, M. 2005. Multi-armed bandit algorithms and empirical evaluation. Machine Learning: ECML 2005 3720, 437–448.

Weinberger, K., Dasgupta, A., Langford, J., Smola, A., and Attenberg, J. 2009. Feature hashing for large-scale multitask learning. 1113–1120.

Wolfe, P. 1959. The simplex method for quadratic programming. Econometrica 27, 382–398.

