
Competitive Solutions for Online Financial Problems

RAN EL-YANIV

Institute of Computer Science, The Hebrew University, email: <[email protected]>

This article surveys results concerning online algorithms for solving problems related to the management of money and other assets. In particular, the survey focuses on search, replacement, and portfolio selection problems.

Categories and Subject Descriptors: F.1.2 [Computation by Abstract Devices]: Modes of Computation—online computation, probabilistic computation; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems—sorting and searching; G.2.M [Discrete Mathematics]: Miscellaneous; G.3 [Mathematics of Computation]: Probability and Statistics; J.1 [Computer Applications]: Administrative Data Processing—financial

General Terms: Algorithms, Measurement, Performance, Theory

1. INTRODUCTION

This article¹ surveys work related to online financial problems, that is, online problems (i.e., single-player decision problems under uncertainty) related to the management of money and other assets. More precisely, we are concerned with online tasks whose natural modeling is in terms of monetary cost or profit functions. The family of such problems (and their related applications) is extremely broad, and the related literature is vast and originates from a number of disciplines including economics, finance, operations research, and decision theory.

In all these fields much of the related work is Bayesian. This means that all uncertain events are modeled by means of stochastic processes that are typically assumed to be known to the decision maker. Then the goal is to optimize the average-case performance (often with respect to a utility function). Indeed, for financial problems this approach has been dominant over the last several decades and has led to the development of a rich mathematical theory.

In contrast, in this survey we are concerned with non-Bayesian analyses of online financial problems while focusing on those using the competitive-ratio (worst-case) optimality criterion (defined in Section 2). This criterion provides a complementary type of analysis for online problems that can lead to different types of algorithms and a different perspective.

There are vastly fewer papers on worst-case competitive analysis of financial problems than on Bayesian analyses of these problems. Even so, owing to space limitations, we focus here on only a subset of the work done in this field of study.

¹ Parts of this article are based on Ch. 14 of Online Computation and Competitive Analysis by A. Borodin and R. El-Yaniv, Cambridge University Press, 1998.

Permission to make digital / hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and / or a fee. © 1998 ACM 0360-0300/98/0300–0028 $5.00

ACM Computing Surveys, Vol. 30, No. 1, March 1998

Many of the financial problems studied in the literature and all of those discussed in this survey are a variant or an application of one of the following elementary problems, which are all described in terms of a one-player game against an adversary (sometimes called nature). The player is called the online player. Each one of these games takes place during some time horizon that may be continuous or discrete, divided into time periods. In all the games the online player’s general objective is to minimize (resp., maximize) some monetary cost (resp., profit) function.

—Search problems. The online player is searching for the maximum (resp., minimum) price in a sequence of prices that unfolds sequentially. At the beginning of each time period the player can pay some sampling cost to obtain a price quotation $p$, after which the player has to decide whether to accept $p$ or continue sampling more prices. The game ends when the player accepts some price. The player’s total return is the accepted price $p$ minus the sum of all sampling costs incurred. Typical applications are: searches for jobs and employees, search for the best price of some product or asset, and the like.

—Replacement problems. At each time period the player is engaged in one activity. Associated with each activity is a pair of numbers called the changeover cost and the flow rate. During the periods in which the player is engaged in some activity his budget is depleted at a rate corresponding to this activity’s flow rate. From time to time new activities are offered as possible replacements for the current one. If the player decides to change to a new activity he must pay the associated changeover cost. The question is when to switch from one activity to another so that the total cost, consisting of the sum of all flows and changeover costs, is minimized. Typical applications are: equipment replacement, job replacement, supplier replacement, mortgage refinancing, and the like.

—Portfolio selection. In the basic setup of this problem the online player has some initial wealth and at the start of each period must decide how to reallocate her current wealth among the available investment opportunities, which may be commodities, securities, and their derivatives. The value of each investment opportunity changes from period to period in an uncertain manner and the goal of the player is to maximize her total wealth (determined by the final value of her holdings). In a more realistic version of this problem, each reallocation incurs a commission that is typically proportional to the money flow required to execute the reallocation.

—Leasing problems. The player needs to use some equipment during a number of time periods. This number is not known in advance but is made known online. Specifically, at the start of each period it becomes known whether the equipment will be needed in the current period and the player must choose whether to buy the equipment for a price $b$ or to rent it for a price $r$, with $r < b$. The game is over when the player purchases the equipment or no longer needs the equipment. Note that leasing problems can be presented as rudimentary forms of replacement problems. Due to lack of space the leasing problem is not discussed here. The reader is referred to the papers by El-Yaniv et al. [1993] and Irani and Ramanathan [1994] that study this problem using two different approaches.

CONTENTS

1. INTRODUCTION
   1.1 General Modeling Issues
   1.2 Optimality Criteria
   1.3 Descriptive Versus Prescriptive Theories
   1.4 Survey Organization
2. PRELIMINARIES
   2.1 Online Problems and Competitive Analysis
   2.2 Viewing Online Problems as Two-Player Games
   2.3 Competitive Analysis
3. SEARCH PROBLEMS
   3.1 The Elementary Search Problem
   3.2 Applications of Search
   3.3 Basic Features of Bayesian Results
   3.4 Competitive Search Algorithms
4. REPLACEMENT PROBLEMS
   4.1 A General Replacement Problem
   4.2 Applications of the Replacement Problem
   4.3 On Some Bayesian Solutions for the Replacement Problem
   4.4 Competitive Replacement Algorithms for the Elementary Continuous-Time Finite-Horizon Problem
   4.5 Discrete-Time Replacement with Multiple Permanent Replacement Options
5. TWO-WAY TRADING AND PORTFOLIO SELECTION
   5.1 The Elementary Portfolio Selection Problem
   5.2 Buy-and-Hold Versus Market Timing
   5.3 Online Portfolio Selection
   5.4 Other Performance Measures
   5.5 Results for the Two-Way Trading Problem
   5.6 Other Statistical Adversaries and “Money-Making” Algorithms
   5.7 The Fixed Fluctuation Model
   5.8 Online Portfolio Selection, Results for Arbitrary m
6. COMPETITIVE RISK MANAGEMENT
   6.1 One-Way Trading Revisited: Analysis with Risk Management
7. CONCLUDING REMARKS AND DIRECTIONS FOR FUTURE RESEARCH
ACKNOWLEDGMENTS
REFERENCES

1.1 General Modeling Issues

Each of the preceding problems attempts to model a particular economic situation in which an individual or an institution may be repeatedly involved within the modern market. Clearly, each of these problems abstracts away many factors while attempting to capture some of the essence of the decision-making task in question. With respect to each problem, various general or problem-specific features can be considered and incorporated in the model. In particular, the following features may be considered: (i) finite versus infinite time horizons; (ii) discrete versus continuous time; (iii) discounting versus nondiscounting; and (iv) consideration of various utility functions that represent the player’s subjective rate of diminishing marginal worth. Clearly, the particular model chosen to formulate the problem determines the degree of its relevance to the corresponding real-life problem, whereas the addition of more features typically trades off its mathematical tractability.

The analyses discussed in this article typically concern the simpler models of the corresponding problems. Thus they may be used and should be judged as starting points for more elaborate models and results.

1.2 Optimality Criteria

This survey is oriented towards results concerning competitive algorithms. That is, our primary concern is the analysis and design of online algorithms using the competitive-ratio decision criterion for uncertainty (definition and discussion follow in Section 2). A broader view encompasses (financial) decision making under uncertainty using other criteria such as the minimax cost (maximin return) and others. Indeed, we also discuss analyses of online algorithms via other optimality criteria including some criteria that relax the strict uncertainty assumptions typical of “pure” competitive analysis. Due to space limitations, Bayesian solutions are not surveyed at all, although occasionally we point out key ideas or relevant features of Bayesian solutions that may bring further insight or motivation.

1.3 Descriptive Versus Prescriptive Theories

Descriptive decision-making theories attempt to analyze and predict how individuals or institutions act. In contrast, prescriptive (or normative) theories attempt to describe how rational decision makers should act ideally. Although there may be an overlap between descriptive and prescriptive theories, it is not in general expected that they will coincide. Of course, both kinds of theories have their own independent merits. For example, if one wishes to predict the behavior of a certain market based on the aggregate actions of the individual decision makers, then descriptive theories should be employed. On the other hand, prescriptive theories attempt to advise decision makers how to optimize their performance, sometimes even against their intuitive understanding. The “competitive” approach discussed in this survey is, for the most part, a prescriptive approach.

1.4 Survey Organization

This survey is organized as follows. Section 2 briefly presents the definitions and notation used throughout. This section also briefly discusses the competitive analysis approach, its advantages, and drawbacks. Sections 3 through 5 survey results concerning these basic financial problems in the order previously stated. For each problem (except for the leasing problem) there is a special section that includes: a formal description of the basic problem and some of its variants, a discussion of some basic features of the problem and its notable applications, and a survey of selected known solutions. In Section 6 we present a recent generalization of the competitive analysis framework that allows risk management, a required feature in financial decision making. We conclude with Section 7, pointing out directions for future research.

2. PRELIMINARIES

2.1 Online Problems and Competitive Analysis

Consider a cost-minimization problem $\mathcal{P}$ consisting of a set $\mathcal{I}$ of inputs and a cost function $C$. Associated with each input $I \in \mathcal{I}$ is a set of feasible outputs $F(I)$. For each input $I$ and a feasible output $O \in F(I)$, the cost associated with $I$ and $O$ is $C(I, O) \in \mathbb{R}^{+}$. Let ALG be any algorithm for $\mathcal{P}$. Denote by ALG[I] a feasible output produced by ALG given the input $I$. The cost incurred by ALG is denoted $\mathrm{ALG}(I) = C(I, \mathrm{ALG}[I])$. In the problems considered here each input $I$ is a finite sequence, $I = i_1, i_2, \ldots, i_n$, and a corresponding feasible output is a finite sequence $O = o_1, o_2, \ldots, o_n$. An online algorithm for $\mathcal{P}$ must produce a feasible output in stages such that at the $j$th stage the algorithm is presented with the $j$th component of the input and must produce the $j$th component of (a feasible) output before the rest of the input is made known. Denote by OPT an optimal offline algorithm for $\mathcal{P}$. That is, for each input $I$,

$$\mathrm{OPT}(I) = \min_{O \in F(I)} C(I, O).$$

An online algorithm ALG is $c$-competitive (or “attains a competitive ratio $c$”) if there exists a constant $a$ such that for each input $I$,

$$\mathrm{ALG}(I) \le c \cdot \mathrm{OPT}(I) + a.$$

The smallest $c$ such that ALG is $c$-competitive is called ALG’s competitive ratio. Thus a $c$-competitive algorithm is guaranteed to incur a cost no larger than $c$ times the smallest possible cost (in hindsight) for each input sequence up to the additive constant $a$. For profit maximization problems the competitive ratio is defined analogously. Specifically, with ALG(I) (resp., OPT(I)) denoting the profit (or return) accrued by ALG (resp., OPT), ALG is $c$-competitive if there exists a constant $a$ such that for all $I$,

$$c \cdot \mathrm{ALG}(I) \ge \mathrm{OPT}(I) + a.$$

For bounded problems where the cost (resp., profit) function is bounded we require in the preceding definitions that the constant $a$ be zero.
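As a minimal sketch of the cost-minimization definition, the following Python fragment checks the inequality ALG(I) ≤ c · OPT(I) + a over a finite set of test inputs. The function name `is_c_competitive` and the leasing-style instance (rent at 1 per period, buy at `B`, with a break-even rule) are illustrative assumptions and are not taken from this survey. A finite check of this kind can refute a claimed ratio but cannot prove it.

```python
def is_c_competitive(alg_cost, opt_cost, inputs, c, a=0.0):
    """Return True if ALG(I) <= c * OPT(I) + a for every I in `inputs`."""
    return all(alg_cost(I) <= c * opt_cost(I) + a for I in inputs)


# Hypothetical leasing instance: renting costs 1 per period, buying costs B.
B = 10


def break_even_cost(n):
    """Online rule (assumed for illustration): rent for B - 1 periods, then buy."""
    return n if n < B else (B - 1) + B


def offline_cost(n):
    """Offline optimum: knowing n, either rent throughout or buy at once."""
    return min(n, B)


horizons = range(1, 50)  # each input I is the number of periods of use
print(is_c_competitive(break_even_cost, offline_cost, horizons, c=2))
print(is_c_competitive(break_even_cost, offline_cost, horizons, c=1.5))
```

On this test set the rule passes the check for c = 2 and fails it for c = 1.5, since at n = 10 the online cost is 19 against an offline cost of 10.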

The extension of the competitive-ratio definition to randomized online algorithms is straightforward under the assumption that the adversary generating the input sequences is oblivious to the random choices made by the online player.² Specifically, the definition of the competitive ratio with respect to an oblivious adversary is the same as the preceding with E[ALG(I)] replacing ALG(I), where E[·] is the expectation with respect to the random choices made by ALG.

² Such an adversary is called an oblivious adversary. Other kinds of adversaries are of adaptive type (see Ben-David et al. [1990]). In this survey we consider only oblivious adversaries.

2.2 Viewing Online Problems as Two-Player Games

When we optimize through the competitive ratio measure, it is often convenient to view an online problem as the following two-player game. The first player is the online player (running the online algorithm). The second player is called the adversary or the offline player. The online player chooses an (online) algorithm ALG and makes it known to the adversary. Then, based on ALG, the adversary chooses an input so as to maximize the competitive ratio. The online player’s objective is to minimize the competitive ratio (which means that this game is a zero-sum game). Thus determining competitive-optimal algorithms is equivalent to determining an optimal strategy for the first player and the best possible competitive ratio is the value of this game, which in general is obtained using randomized online algorithms.

2.3 Competitive Analysis

The use of the competitive ratio for the evaluation of online algorithms is called competitive analysis.³ Competitive analysis was first used by computer scientists in the ’70s in connection with approximation algorithms for NP-complete problems (see, e.g., Graham [1966], Johnson [1973], Johnson et al. [1974], and Yao [1980]) and was more explicitly formulated in the ’80s in the seminal work of Sleator and Tarjan [1985] on list accessing and paging algorithms. Since then, competitive analysis has been used extensively to analyze and design online algorithms for many online optimization problems related to computer systems. Within the theoretical computer science community, the competitive ratio has gained much recognition and has become a standard approach for the analysis of online problems.

One common argument against the use of competitive algorithms is that they are inherently risk-averse as they are optimized with respect to worst-case event sequences. This argument is certainly valid in cases where decision makers possess reliable information about stochastic processes that determine the uncertainties. In this case the use of competitive algorithms that blatantly ignore this information may lead to inferior performance relative to Bayesian algorithms. Nevertheless, in many instances decision makers do not possess such information. In some cases, by investing sufficient resources they may acquire such information, whereas in other situations, often due to the complexities of event sequences, not much can be learned about the underlying stochastic process. Whatever the reason for the absence of such information, competitive algorithms offer reasonable initial solutions upon which more elaborate algorithms may be constructed after additional information is determined.

Although the baseline competitive analysis approach is purely worst-case, it can be extended to a framework that can utilize and exploit forecasts while allowing risk control (see Section 6), a desirable feature especially in financial decision making. Nevertheless, this generalized competitive analysis with risk management is still a worst-case approach that dispenses with probabilistic assumptions. This property can be favorable for risk-averse financial decision makers who may prefer somewhat inferior but guaranteed performance to better average performance.

3. SEARCH PROBLEMS

3.1 The Elementary Search Problem

The online player is searching for the maximum price in a sequence of prices that unfolds sequentially. At the beginning of each time period $i$ the player can pay some sampling cost $c_i \ge 0$ to obtain a price quotation $p_i$, after which the player has to decide whether to accept $p_i$ or continue sampling more prices. The game ends when the player accepts some price $p_j$ and the total return is then

$$p_j - \sum_{1 \le i \le j} c_i .$$

Search for the minimum price is similarly defined with respect to a cost function where the total cost of the player is $p_j + \sum_{1 \le i \le j} c_i$.

³ This term was coined by Karlin et al. [1988].

This elementary search problem has many extensions and variations. A search problem is termed with recall if all or some number $k$ of the most recent price offers are retained and the player may choose any of the retained offers. We distinguish between search problems with and without recall and with and without discounting. Also, we distinguish between discrete and continuous time problems. In either case we assume a finite time horizon but distinguish between searches of known and unknown duration. In the case of unknown duration we assume that the player is informed just before the last period that the game will end, thus giving the player the opportunity to accept the last price if he or she has not accepted any price earlier.

3.2 Applications of Search

Search is a most fundamental feature of economic markets. Let us mention some of its basic applications.

3.2.1 Job and Employee Search. These two applications are of major importance to labor markets. In the job search application, the player is seeking employment. In each period the job seeker obtains one job offer (which corresponds to the preceding price quotation). The offer can be interpreted as the lifetime earning from the job. The sampling cost corresponds to the cost of generating an offer; it includes all expenditures such as advertising and transportation and may also include the loss incurred from being currently unemployed. Thus each sampling cost may be modeled as a constant $c$.

Just as prospective employees are searching for jobs, employers are searching for employees to fill job vacancies. In the employee search application, the player is an employer searching for an employee with certain identifiable characteristics that should correspond to the employee’s productivity (modeled via the preceding “price quotation”).⁴ This modeling has an underlying assumption that there are such characteristics that are correlated to the employee’s productivity and that these characteristics can be tested by the employer.

3.2.2 Search for the Lowest Price of Goods. Here the player considers the purchase of some goods that are sold at different stores at different prices. The player can elicit quotations from the various sellers by paying the sampling fee that takes into account traveling or phone costs and the time wasted for sampling, and may include a penalty of managing without the goods since the start of the search. Like the job (and employee) search applications, this application is of fundamental value to economics because optimal searching rules determine the demand function that sellers face, which in turn determines (some of) the nature of the markets themselves.

3.2.3 One-Way Trading. In this application the player is a trader who is considering the exchange of some initial wealth $w_0$, given in some currency (say, dollars), to some other asset or currency (say, yen). Each period starts when a new price quotation is made available and the trader must decide whether to accept it or wait for a better price. In this application the sampling cost is typically negligible, as prices are widely available in banks, newspapers, and quotation services. Nevertheless, the trader is required to pay transaction fees (e.g., to a financial institution) that are typically some fixed percentage of the return. Note that in this application the trader may partition the initial wealth and exchange $w_0$ sequentially in parts. One-way trading algorithms can be applied in various economic situations. For instance, consider a fund manager who decides to change the position of a portfolio and enter (or exit) some market (in which case $w_0$ is the part of her wealth allocated to the new position). Another natural instance related to foreign exchange is when the player, for the purpose of emigrating to a foreign country, sells his local property in order to exchange the local currency received into the foreign one.

⁴ This application is closely related to the well-known secretary problem [Freeman 1983; Ajtai et al. 1995], the difference being that in secretary problems the objective is typically to accept one or more “secretaries” of best ordinal value among an ordered set of all secretaries.

3.3 Basic Features of Bayesian Results

The Bayesian approach derives search strategies that are dependent on a prior distribution of prices that is usually assumed to be known to the player. Also, it is typically assumed that this distribution is fixed throughout the search and that prices are independent observations drawn from this distribution. The theory developed is very rich and relies on mathematical tools from the theory of optimal stopping [Chow et al. 1971]. The reader is referred to the excellent surveys by Lippman and McCall [1976; 1981] that discuss Bayesian solutions for a wide array of search variants. One notable feature of Bayesian optimal search algorithms (applicable to many problem variants) is that they have the following structure: based on the problem parameters (in particular, the probability distribution assumed) there is a single fixed critical number, the reservation price, such that the optimal policy is to reject all prices below the reservation price and to accept any offer above it. Reservation prices change dynamically in the case of finite horizon and known duration (and they are nondecreasing in problems without recall; the case of recall is more complicated and not fully solved). Another issue of interest when pursuing Bayesian solutions is the expected number of price quotations required until stopping.

The least acceptable assumption of the Bayesian search models is that the probability distribution of prices is fully known to the player. Several models attempt to relax this assumption. For example, Rosenfield and Shapiro [1981] studied cases where the distribution itself is a random variable and the player knows the probability distribution of some of its moments (say, the price distribution is unknown but known to be normal and the distribution of its mean is known). In various cases such relaxed models do not admit optimal solutions with a reservation price (see Lippman and McCall [1976]).

3.4 Competitive Search Algorithms

In this section we describe various competitive solutions to variants of the search problem without recall in which the sampling cost is zero or is a fixed percentage of the return. With respect to the competitive ratio these two variants are easily seen to be equivalent. For the rest of this section assume that the sampling cost is zero. In all the variants we consider we assume only that prices are drawn from some finite interval $[m, M]$. Call the ratio $w = M/m$ the global fluctuation ratio. In one of the variants the online player knows the values $m$ and $M$ and in another variant the player knows only $w$. We seek to determine the optimal competitive performance in terms of these parameters.

Suppose that $m$ and $M$ are known to the player. The optimal deterministic solution is the following reservation price policy (denoted RPP): accept the first price greater than or equal to $p^* = \sqrt{Mm}$. This strategy is $\sqrt{w}$-competitive and its analysis is trivial. Using a balancing argument, $p^*$ should be chosen to equate the performance ratio, offline to online, corresponding to the two events: the maximum price encountered, $p_{\max}$, will be greater than $p^*$; and $p_{\max} \le p^*$. It follows that $p^*$ is the solution $p$ of $M/p = p/m$. The preceding reservation policy is optimal for both an infinite and finite time horizon and when the duration is known or unknown. It is not hard to see that if only $w$ is known to the player, then no competitive ratio smaller than the trivial one, $w$, is achievable by a deterministic algorithm.
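The reservation price policy can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than code from the survey: the function names are hypothetical, and the player is assumed to accept the last price of a finite sequence when no price reaches the reservation price. The helper samples random price sequences in [m, M] and records the largest offline-to-online ratio observed, which should never exceed √(M/m).

```python
import math
import random


def rpp_return(prices, m, M):
    """Reservation price policy: accept the first price >= sqrt(M * m).

    Assumption: if no price reaches the reservation price, the player
    accepts the last price of the (finite) sequence.
    """
    p_star = math.sqrt(M * m)
    for p in prices:
        if p >= p_star:
            return p
    return prices[-1]


def worst_observed_ratio(m, M, trials=1000, length=20, seed=0):
    """Largest offline/online ratio seen over random price sequences in [m, M]."""
    rng = random.Random(seed)
    worst = 1.0
    for _ in range(trials):
        prices = [rng.uniform(m, M) for _ in range(length)]
        worst = max(worst, max(prices) / rpp_return(prices, m, M))
    return worst


m, M = 1.0, 100.0
print(worst_observed_ratio(m, M) <= math.sqrt(M / m))
```

Random sampling only probes the bound; the two cases of the balancing argument (p_max above or below p*) are what establish it.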

A considerable improvement is obtained by using randomization.⁵ First we show how a simple randomized strategy due to Levin [1994] attains a competitive ratio of $O(\log w)$. For simplicity, assume first that $w = 2^k$ (for some integer $k$). For $i = 0, 1, \ldots, k-1$ denote by $S(i)$ the deterministic reservation price policy with reservation price $m2^i$. Levin’s randomized strategy, denoted $S$, is a uniform probability mixture over $\{S(i)\}_i$; that is, $S$ chooses $S(i)$ with probability $1/k$. It is easy to show that there exists a function $f(w)$ such that $S$ is $[f(w) \log w]$-competitive with $f(w)$ greater than but approaching 1 as $w$ grows. Let $p_{\max}$ be the (posteriori) maximum price obtained and let $j$ be an integer such that $m2^j \le p_{\max} < m2^{j+1}$. The optimal offline return is $p_{\max}$. For each $i \le j$ the strategy $S(i)$ chooses a price not smaller than $m2^i$ and for all $i > j$, $S(i)$ obtains at least $m$. Summing $m2^i$ over $i \le j$ gives $m(2^{j+1} - 1)$, and the remaining $k - j - 1$ strategies each obtain at least $m$. It follows that $S$ will obtain at least $(2^{j+1} + k - j - 2)\,m/k$ on average. The resulting competitive ratio is greater than, but approaching, $k = \log w$.

Exactly the same bound holds even if the player does not know the values of $m$ and $M$ and knows only $w$. Here, however, the strategies $S(i)$ are set after the first price $p_1$ is revealed (in which case $S(i)$ has reservation price $p_1 2^i$). For an arbitrary $w$ (not a power of two) the same method gives $O(k)$-competitive performance where $b^k = w$ ($b$ real and $k$ a positive integer). Recall that for this variant (only $w$ is known) no deterministic strategy can achieve a competitive ratio smaller than $w$.
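The uniform mixture over reservation price policies can be sketched as follows for the variant with known m and w = 2^k. The function names are hypothetical, and, as an assumption of this sketch, a policy whose reservation price is never reached accepts the last price. The expected return is computed exactly by averaging over the k deterministic policies rather than by sampling.

```python
def reservation_return(prices, reserve):
    """Return of the policy that accepts the first price >= reserve.

    Assumption: if the reserve is never met, the last price is accepted.
    """
    for p in prices:
        if p >= reserve:
            return p
    return prices[-1]


def levin_expected_return(prices, m, k):
    """Expected return of the uniform mixture over S(0), ..., S(k-1),
    where S(i) uses reservation price m * 2**i and has probability 1/k."""
    return sum(reservation_return(prices, m * 2**i) for i in range(k)) / k


# Prices climb through the thresholds and stop just below M = m * 2**k.
m, k = 1, 4
prices = [1, 2, 4, 8, 15.99]
expected = levin_expected_return(prices, m, k)
print(expected, max(prices) / expected)
```

On this rising sequence each S(i) stops exactly at its own threshold, so the mixture earns (1 + 2 + 4 + 8)/4 while the offline optimum is close to 16; the resulting ratio is slightly above k = 4, in line with the bound being approached from above.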

This simple randomized strategy can be modified to work even when the player does not know $w$. In this case it can be shown that there exists a function $c(\varepsilon)$ such that a competitive ratio of $O(c(\varepsilon) \cdot \log^{1+\varepsilon}(w))$ can be obtained for every positive $\varepsilon$, where $w$ is the posteriori global fluctuation ratio.

For the variants where $m$ and $M$ are known or only $w$ is known to the player (known or unknown duration), the competitive ratio of $O(\log w)$ is within a constant multiplicative factor of the best that can be obtained against an oblivious adversary. This is established in El-Yaniv et al. [1993], which presents strictly optimal algorithms for these variants. The presentation of the results of this paper is simplified when described as a one-way trading game. Consider a trader with initial wealth of $1. The trader is presented with a sequence of exchange rates $p_1, p_2, \ldots$, where $p_i$ gives the exchange rate, yen per dollar, for the $i$th period (say, day). It is assumed that all rates are drawn from $[m, M]$. On each day $i$ the trader can exchange any amount $s_i$ of his remaining dollars for yen. Upon completion the total return of the trader is the amount of yen he accumulated. It is assumed that arbitrary fractions of dollars can be traded.

Notice that in the search problem the online player must accept one price and in the one-way trading problem the trader can partition his initial wealth and trade the parts sequentially, each part at different exchange rates. Nevertheless, these problems are closely related and, in fact, equivalent in the following sense. Any deterministic one-way trading algorithm can be interpreted as a randomized search algorithm and vice versa. This follows from the fact that any deterministic one-way trading algorithm that trades the initial wealth in parts is equivalent (in terms of returns) to a randomized trading algorithm and vice versa. Specifically, suppose that a deterministic algorithm trades a fraction $s_i$ of its initial wealth at the $i$th period, $\sum_i s_i = 1$. The quantity $s_i$ can be interpreted as the probability of trading the entire wealth at the $i$th period. Clearly, the average return of this randomized algorithm equals the return of the deterministic algorithm. Thus the competitive ratio of the deterministic algorithm is exactly the average competitive ratio of the randomized algorithm against an oblivious adversary. On the other hand, using Kuhn’s [1953] theorem of game theory, any randomized one-way trading algorithm is equivalent to a mixed randomized algorithm (i.e., a probability distribution over deterministic algorithms), and by linearity of expectation (and using the fact that the sum of all traded amounts is 1) this mixed algorithm is equivalent to a deterministic algorithm that trades the initial wealth in parts. It follows that an optimal deterministic one-way trading algorithm is an optimal randomized search algorithm. This implies that randomization cannot improve the competitive performance in one-way trading. In contrast, randomization is advantageous for search. When fixed sampling costs are introduced (i.e., the player pays some constant for each price quotation she obtains) there is no longer an equivalence between deterministic one-way trading algorithms and randomized search (and one-way trading) algorithms. The reason is that the randomized algorithm may spend less on sampling on the average.

⁵ Note that in general randomization does not give the player any advantage if it is assumed that the price distribution is known and an average-case performance is sought (i.e., the Bayesian approach).
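The first direction of this equivalence can be illustrated numerically. In the sketch below (hypothetical function names; a fixed price sequence and fixed fractions are assumed, whereas a real online algorithm would choose each fraction from the prices seen so far), the deterministic trader converts the fraction s_i at rate p_i, and the randomized searcher accepts p_i with probability s_i. The expected return of the latter is estimated by sampling and compared with the exact return of the former.

```python
import random


def trading_return(prices, fractions):
    """Deterministic one-way trading: convert fraction s_i of $1 at rate p_i."""
    return sum(s * p for s, p in zip(fractions, prices))


def sampled_search_return(prices, fractions, trials=100_000, seed=0):
    """Randomized search: accept price p_i with probability s_i.

    Returns the average accepted price over `trials` independent runs.
    """
    rng = random.Random(seed)
    accepted = rng.choices(prices, weights=fractions, k=trials)
    return sum(accepted) / trials


prices = [100.0, 120.0, 90.0]
fractions = [0.2, 0.5, 0.3]  # must sum to 1
print(trading_return(prices, fractions))
print(sampled_search_return(prices, fractions))
```

The two printed values agree up to sampling error, which is the content of the statement that the deterministic return equals the average return of the randomized algorithm.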

3.4.1 Threat-Based Algorithms. The optimal performance is obtained by algorithms that obey the following threat-based policy. Let $c$ be any competitive ratio that can be attained by a deterministic trading algorithm. Assume that $c$ is known to the trader. For each such $c$ the corresponding threat-based policy consists of the rules:

Rule 1. Consider trading dollars for yen only when the current rate is the highest seen so far;

Rule 2. Whenever you convert dollars, convert just enough to ensure that a competitive ratio $c$ would be obtained if an adversary dropped the exchange rate to the minimum possible rate⁶ and kept it there throughout the game.

Thus algorithms prescribed by this policy convert dollars to yen based on the threat that the exchange rate will drop permanently to the minimum possible rate. For each attainable competitive ratio $c$ the corresponding threat-based algorithm can be shown to be $c$-competitive. Intuitively, this statement can be justified as follows. Consider the first trade (exchange rate is $p_1$). Since the current exchange rate is the highest seen so far, the algorithm considers a trade. The competitive ratio $c$ is attainable (by some deterministic trading algorithm), and therefore there exists some $s \ge 0$ such that the ratio $c$ will still be attainable if $s$ dollars are traded for yen. Furthermore, the chosen amount of dollars $s$ is such that the ratio $c$ is so far guaranteed even if there is a permanent drop of the exchange rate and no further trades are conducted (except for one last trade converting the remaining dollars with the minimum possible exchange rate). In particular, there is no need to consider any exchange rate smaller than $p_1$. Similar arguments can be used to justify the choice of the amounts chosen for the rest of the trades and may convince the reader that this policy induces a $c$-competitive algorithm. A sketch of a more formal analysis follows.

Assume the problem variant with known duration n and known m and M. We now show how the optimal threat-based algorithm (denoted THREAT) for this variant can be derived. Any exchange rate that is not a global maximum at the time it is revealed to the trader is ignored (Rule 1). Hence we can assume w.l.o.g. that the exchange rate sequence consists of an initial segment of successive maxima of length $k \le n$. In order to realize a threat the adversary may choose $k < n$ and choose $p_{k+1} = p_{k+2} = \cdots = p_n = m$.

6 The "minimum possible rate" is defined with respect to the information known to the trader. That is, it is m if m is known, and it is p/w if only w is known and p is the highest price seen so far.

Initially, assume that the optimal competitive ratio attainable by THREAT, $c^*$, is known. For each $i = 0, 1, \ldots, n$, set $D_i$ and $Y_i$ to be the number of remaining dollars and the number of accumulated yen, respectively, just after the $i$th period. By assumption the trader starts with $D_0 = 1$ dollar and $Y_0 = 0$ yen. Let $s_i = D_{i-1} - D_i$ be the number of dollars traded at the $i$th period, $i = 1, 2, \ldots, n$. Thus $Y_i = \sum_{j=1}^{i} s_j p_j$. The ratio $c^*$ is attained by algorithm THREAT and therefore, by Rule 2, it must be that the amounts $s_i$ are chosen such that for any $1 \le i \le n$,

$$\frac{p_i}{Y_i + m \cdot D_i} = \frac{p_i}{(Y_{i-1} + s_i p_i) + m \cdot (D_{i-1} - s_i)} \le c^*. \tag{1}$$

Here the denominator of the left-hand side represents the return of THREAT if an adversary dropped the exchange rate to m, and the numerator of the left-hand side is the return of OPT for such an exchange rate sequence. By Rule 2, THREAT must spend the minimal $s_i$ that satisfies inequality (1). The left-hand side is decreasing with $s_i$. Since $s_i$ is chosen to achieve the target competitive ratio $c^*$, and since the trader must spend the minimum possible amount (in order to leave as much as possible for higher rates), we replace the inequality in (1) with equality. Solving the resulting equality for $s_i$ we obtain

$$s_i = \frac{p_i - c^* \cdot (Y_{i-1} + m D_{i-1})}{c^* \cdot (p_i - m)}. \tag{2}$$

From (1) (with equality) we also obtain the relation:

$$Y_i + m \cdot D_i = p_i / c^*. \tag{3}$$

Closed-form expressions for the $s_i$s can now be obtained. From (2) at $i = 1$ (where $Y_0 = 0$ and $D_0 = 1$) we get $s_1 = \frac{1}{c^*} \cdot \frac{p_1 - m c^*}{p_1 - m}$. From (2) and (3), with $i - 1$ replacing $i$, we obtain $Y_{i-1} + m D_{i-1} = p_{i-1}/c^*$. Hence for $i > 1$ we have $s_i = \frac{1}{c^*} \cdot \frac{p_i - p_{i-1}}{p_i - m}$.
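Rules 1 and 2 together with equation (2) can be turned into a short simulation. This is only a sketch under stated assumptions: the function name `threat_trade` is hypothetical, the target ratio c is assumed to be attainable for the given bounds, negative amounts are clamped to zero (no trade is needed while the rate is below c times m), and any dollars left after the last period are converted at the final rate.

```python
def threat_trade(rates, m, c):
    """Run the threat-based policy on a sequence of exchange rates.

    rates: exchange rates (yen per dollar), each assumed to be at least m
    m:     known lower bound on the exchange rate
    c:     target competitive ratio, assumed to be attainable
    Returns (total yen, list of dollars traded in each period).
    """
    dollars, yen = 1.0, 0.0
    best = float("-inf")
    trades = []
    for p in rates:
        s = 0.0
        if p > best:                      # Rule 1: only new maxima are considered
            best = p
            if p > m:
                # Rule 2, via equation (2): minimal amount that keeps ratio c
                s = (p - c * (yen + m * dollars)) / (c * (p - m))
                s = min(max(s, 0.0), dollars)
        dollars -= s
        yen += s * p
        trades.append(s)
    # The game ends: remaining dollars are converted at the last rate.
    yen += dollars * rates[-1]
    trades[-1] += dollars
    return yen, trades

# Hypothetical instance with m = 1, M = 2 and c = 1.28, slightly above c*(1, 2).
rates = [1.2, 1.5, 1.1, 1.8, 1.0]
c = 1.28
yen, trades = threat_trade(rates, m=1.0, c=c)
ratio = max(rates) / yen                  # OPT trades everything at the maximum rate
print(yen, ratio)
```

On this instance the sequence ends with a drop to m, which is exactly the threat, so the achieved ratio meets the target c rather than beating it.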

It remains of course to determine $c^*$, the optimal competitive ratio attainable by algorithm THREAT. For any sequence of k exchange rate maxima it must be that the $s_i$s satisfy $\sum_{1 \le i \le k} s_i \le D_0 = 1$. If the value of k is known to the trader, then the optimal choice of $s_i$s is such that no dollars will remain after the last transaction. That is, the optimal competitive ratio has the property that $\sum_{1 \le i \le k} s_i = 1$. Substituting into this equation the expressions determined for the $s_i$s, one can obtain

$$c^* = c^*(k, m, p_1, p_2, \ldots, p_k) = 1 + \frac{p_1 - m}{p_1} \cdot \sum_{i=2}^{k} \frac{p_i - p_{i-1}}{p_i - m}.$$
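The closed form for $c^*(k, m, p_1, \ldots, p_k)$ can be sanity-checked numerically: with this ratio, the amounts $s_i$ should sum to exactly one dollar. A minimal sketch, with hypothetical function names and a made-up two-maxima instance (it assumes $p_1 > m$ and an increasing sequence of maxima):

```python
def threat_ratio(maxima, m):
    """c*(k, m, p_1..p_k) for an increasing sequence of exchange rate maxima."""
    p1 = maxima[0]
    tail = sum((p - q) / (p - m) for q, p in zip(maxima, maxima[1:]))
    return 1.0 + (p1 - m) / p1 * tail

def threat_amounts(maxima, m, c):
    """Dollar amounts s_1..s_k from the closed-form expressions for the s_i."""
    p1 = maxima[0]
    amounts = [(p1 - m * c) / (c * (p1 - m))]
    amounts += [(p - q) / (c * (p - m)) for q, p in zip(maxima, maxima[1:])]
    return amounts

maxima, m = [1.5, 1.8], 1.0               # hypothetical instance
c_star = threat_ratio(maxima, m)
amounts = threat_amounts(maxima, m, c_star)
print(c_star, amounts, sum(amounts))
```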

Denote by $c^*_n(m, M)$ the optimal competitive ratio for the n-day game. Thus,

$$c^*_n(m, M) = \max_{\substack{k \le n \\ m \le p_1 < p_2 < \cdots < p_k \le M}} c^*(k, m, p_1, p_2, \ldots, p_k).$$

We skip this maximization routine and discuss the end result. An explicit expression for $c^*_n(m, M)$ cannot be obtained but it can be shown that $c^*_n(m, M)$ is the unique root $c^*$ of the equation

$$c = n \cdot \left(1 - \left(\frac{m(c - 1)}{M - m}\right)^{1/n}\right). \tag{4}$$

Thus $c^*_n(m, M)$ is the minimum competitive ratio attainable by algorithm THREAT for an n-day game. On the other hand, it can be shown that $c^*_n(m, M)$ is a lower bound on the competitive ratio of any randomized algorithm against an oblivious adversary for an n-day trading game [El-Yaniv et al. 1992]. Hence $c^*_n(m, M)$ is the exact competitive ratio for the trading problem of known duration and known bounds (m and M). The fact that randomization cannot help in this problem is perhaps somewhat surprising but clear given that any deterministic trading algorithm that partitions the initial wealth and performs multiple trades is equivalent to a randomized one-way trading (and search) algorithm with the same performance.
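Although equation (4) has no explicit solution, its root is easy to approximate numerically. The sketch below uses plain bisection on the interval [1, M/m]; the function name and the tolerance are illustrative choices, and it assumes 0 < m < M.

```python
def threat_ratio_n_days(n, m, M, tol=1e-12):
    """Approximate the root of c = n * (1 - (m*(c-1)/(M-m))**(1/n)) by bisection."""
    def g(c):
        return c - n * (1.0 - (m * (c - 1.0) / (M - m)) ** (1.0 / n))

    lo, hi = 1.0, M / m        # g(1) = 1 - n <= 0 and g(M/m) = M/m > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for n in (5, 10, 20):
    print(n, round(threat_ratio_n_days(n, 1.0, 2.0), 4))
```

Bisection suffices here because the difference between the two sides of (4) is increasing in c on this interval, so the sign change brackets a single root.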

The preceding solution implies a solution for one-way trading with unknown duration. The threat-based algorithm corresponding to the ratio $c^*(m, M) = \lim_{n \to \infty} c^*_n(m, M)$ can handle any finite number of days while attaining a competitive ratio of $c^*(m, M)$. However, when the duration is not known (it is made known online on the last day) the adversary can choose an arbitrarily large n, thus forcing a competitive ratio approaching $c^*(m, M)$ (note that $c^*_n(m, M)$ is strictly increasing with n). $c^*(m, M)$ can be shown to be the unique root $c^*$ of

$$c = \ln \frac{M - m}{m(c - 1)}.$$

Clearly, $c^*(m, M) = \Theta(\ln w)$.

The paper by El-Yaniv et al. [1992] studies two other variants of the one-way trading problem in which the trader knows only the global fluctuation $w = M/m$ but not the actual values of m and M (known and unknown duration). The value of $c^*_n(w)$, the optimal competitive ratio of this trading game (known duration, n-day game), is determined and is shown to be

$$c^*_n(w) = w \left(1 - (w - 1) \left(\frac{w - 1}{w^{n/(n-1)} - 1}\right)^{n}\right). \tag{5}$$

Here again, this optimal performance is attainable by algorithm THREAT. Not surprisingly, $c^*_n(w)$ is monotone increasing with n and w and the optimal competitive ratio of the unknown duration game $c^*(w)$ can be shown to be

$$c^*(w) = w - \frac{w - 1}{w^{1/(w-1)}} = \Theta(\ln w).$$

In order to get some feel for the actual competitive ratios obtained, it is interesting to observe some numerical examples. Table I summarizes the competitive ratio of the algorithms discussed in this section for some values of w. We see that the optimal threat-based algorithm is always significantly superior to all other algorithms. Note that the deterministic algorithm RPP is superior to Levin's algorithm for small values of w. Nevertheless, recall that the growth rate of the competitive ratio of Levin's algorithms is almost the logarithm of the growth rate of the competitive ratio of the deterministic algorithm.

Table I. Numerical Examples of Competitive Ratios for Some Search and One-Way Trading Algorithms (unknown duration)

| Algorithm | w = 1.5 | w = 2 | w = 4 | w = 8 | w = 16 | w = 32 |
|---|---|---|---|---|---|---|
| RPP (m, M known) | 1.22 | 1.41 | 2 | 2.82 | 4 | 5.65 |
| LEVIN's (only w known) | 1.5 | 2 | 2.66 | 3.42 | 4.26 | 5.16 |
| THREAT (only w known) | 1.27 | 1.50 | 2.11 | 2.80 | 3.53 | 4.28 |
| THREAT (m, M known) | 1.15 | 1.28 | 1.60 | 1.97 | 2.38 | 2.83 |
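Three of the rows of Table I can be recomputed from formulas. The sketch below is illustrative only: the function names are hypothetical, the RPP row is assumed to equal the square root of w (which matches the tabulated values), the row for THREAT with only w known uses the closed form for $c^*(w)$, and the row for THREAT with m and M known solves $c = \ln((w - 1)/(c - 1))$, which is the equation for $c^*(m, M)$ rewritten with $w = M/m$. Levin's algorithm is omitted because its ratio is not given in closed form in this section.

```python
import math

def rpp_ratio(w):
    """Assumed ratio of RPP: the square root of the fluctuation ratio w."""
    return math.sqrt(w)

def threat_ratio_w_only(w):
    """Closed form c*(w) = w - (w - 1) / w**(1/(w - 1)), unknown duration."""
    return w - (w - 1.0) / w ** (1.0 / (w - 1.0))

def threat_ratio_m_M(w, tol=1e-12):
    """Root of c = ln((w - 1)/(c - 1)) on (1, w), found by bisection."""
    lo, hi = 1.0, w
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - math.log((w - 1.0) / (mid - 1.0)) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for w in (1.5, 2.0, 4.0, 8.0, 16.0, 32.0):
    print(w, round(rpp_ratio(w), 2),
          round(threat_ratio_w_only(w), 2),
          round(threat_ratio_m_M(w), 2))
```

Small differences in the last digit relative to the table are possible, since the tabulated values appear to be truncated rather than rounded in places.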

It is also interesting to consider the rate of increase of the optimal competitive ratio as a function of the number of trading days n. It is possible to show that this function grows very quickly to its asymptote. Nevertheless, there is still a slight advantage to playing short games. For instance, already at the 20th period $c^*_n(1, 2)$ almost approaches its asymptote, $c^*(1, 2) \approx 1.278$ (which is equivalent to guaranteeing 78.2% of the optimal offline return); at $n = 10$ the ratio achieved is 1.26 (79.3% of OPT), and at $n = 5$ the ratio is 1.24 (80.6% of OPT).

3.4.2 Other Results. El-Yaniv et al. [1992] study two other variants of the trading game: a continuous-time model and a model in which the adversary chooses a probability distribution that determines the exchange rates (this probability distribution is made known to the trader). We note that the continuous-time model is significantly simpler to analyze but its precise formulation is more involved.

4. REPLACEMENT PROBLEMS

4.1 A General Replacement Problem

At each time the online player must be engaged in a single activity. Associated with each activity a is its cost c(a) and its flow rate f(a) (which may also be a function of time, $f(a) = f(a, t)$). Throughout the time period in which the player is engaged in activity a, she pays money at the rate f(a). From time to time new activities are offered as possible replacements to the current activity. If at some time t the player chooses to replace the current activity a with a new one a', she pays a replacement (or changeover) cost of $r_t(a, a')$, where $r_t$ is a real-valued function parameterized by time (for $t = 0$ the replacement cost is $r_t(a')$, where a' is the initial activity chosen). For any sequence of activities $a_0, a_1, \ldots, a_k$ chosen at times $t_0, t_1, \ldots, t_k$, the total cost incurred by the player up to time T is

$$r_0(a_0) + \sum_{i=1}^{k} \left( (t_{i+1} - t_i) f(a_i) + r_{t_i}(a_{i-1}, a_i) \right),$$

where $t_0 = 0$ and $t_{k+1} = T$.

We note that this general replacement model generalizes the Metrical Task System model of Borodin et al. [1992], which abstracts many online problems.

4.1.1 Some Problem Variants. This section discusses two special instances of the previous general replacement problem using the following notation. In the continuous-time (resp., discrete-time) variant of the problem, at each time t (resp., at the start of each period i) the online player is presented with a finite set $R_t$ (resp., $R_i$) containing some number of activities offered as a possible replacement to the current one. $R_t$ is called the replacement set of time t (resp., $R_i$ is called the $i$th replacement set). We first consider a continuous-time finite-horizon model such that at each time there may be only a single replacement alternative to the current one; that is, for all t, $|R_t| = 1$. Also in this simple variant all replacement costs equal a constant C; that is, for all activities a, b, $r_t(a, b) = C$. This problem variant is referred to as the elementary continuous-time finite-horizon replacement problem.

In the second variant we assume discrete time and that for all activities a, b, $r_t(a, b) = c(b)$. Also we require that for all i, $R_i \subseteq R_{i+1}$. That is, an activity that is once offered is always available thereafter (i.e., replacement options are permanent). This variant is referred to as the discrete-time replacement problem with multiple permanent replacement options. In contrast to the first variant, here we allow multiple replacement alternatives and varying replacement costs. We note that the assumption of permanent replacement options together with the assumption of discrete time simplifies the problem. Indeed, despite the added complication that replacement costs may vary, this variant allows for somewhat easier analysis and much better competitive ratios. Finally, notice that the preceding replacement model also captures the leasing problem as a special case. Specifically, at each time there are optional activities: one with a small replacement cost that corresponds to renting, and another with a higher replacement cost that corresponds to buying. It is also given that the game ends as soon as the player chooses the activity that corresponds to buying.

4.2 Applications of the Replacement Problem

The replacement problem has various interesting applications, in all of which the basic question is when to switch from one activity, investment, or facility, to another more rewarding one, when there is a cost associated with making the switch. Some striking examples of particular applications are the following.

4.2.1 Equipment and Machine Replacement. Here the player needs to use some piece of equipment throughout the time horizon. For its regular use, the equipment incurs some operating, production, and/or maintenance cost. From time to time, due to a priori unknown economic events, technological improvements, and/or equipment deterioration, the player can and may wish to switch to different or newer equipment that incurs a lower operating cost (or higher payoff). Some examples of equipment for which this application is relevant are cars, computers, industrial machinery, and the like. The same formulation applies of course to more abstract types of "equipment" such as jobs. In many of these examples the operating cost can be approximated by a fixed-rate payment flow (e.g., gasoline consumption rate, salary, etc.). Other applications may require more elaborate modeling. For instance, in order to model deterioration, the maintenance may be required to be some monotone nondecreasing function (rather than a constant).

4.2.2 Supplier Replacement. A firm is purchasing goods at a constant rate from one supplier. The cost of purchasing the same goods from other suppliers varies with time. The firm can switch to another supplier but at a certain cost. The cost of this switchover can be approximated by some constant that takes into account the paperwork, the wasted time, and possibly the costs involved in breaking the contract with the first supplier.

4.2.3 The Menu-Cost Problem. Many firms are constantly faced with the problem of when to adjust prices of the goods or services they offer. Due to inflationary markets and/or other economic events, the firm may wish to update its price menu to reflect the goods' or services' "real" values in order to increase its overall payoff. Each of these price adjustments, which correspond to our (flow) changeovers, incurs some fixed cost to physically update the "menu," advertise, and so on.

4.2.4 Mortgage Refinancing. In this application the flow rate corresponds to the mortgage payment rate, which is based on a fixed interest rate (and the principal). Among the popular mortgages available in North American markets are those where refinancing costs are fixed (sometimes called zero-point mortgages). These fixed costs correspond to the paperwork and time overhead required by the switch to another mortgage and thus are best modeled by a fixed changeover cost. In other kinds of (fixed-rate) mortgages a changeover cost is some fixed percentage of the principal.

4.3 On Some Bayesian Solutions for the Replacement Problem

The literature related to online replacement problems is quite extensive. The typical assumption is that the flow rate function follows a particular (usually simple) stochastic process that may or may not be known to the online player. Let us describe two examples.

Derman [1963] studies a simplified discrete-time replacement problem where the analogue of our flow rate function is a piecewise constant function in which the next value is determined via a simple one-stage Markov process.

Sheshinski and Weiss [1993] study price-adjustment policies solving the menu-cost problem under the assumption that "real" prices are determined by the following two-state process. During each state the price level is changed at a fixed rate (in particular, they assume that in one state the price is fixed and in the second state the price increases at a fixed rate). The duration of each state is an independent exponentially distributed random variable. Note that in both these examples (and in most other analyses of this kind) the optimal policy is heavily dependent on the stochastic assumptions.

4.4 Competitive Replacement Algorithms for the Elementary Continuous-Time Finite-Horizon Problem

In this section we discuss competitive solutions for the elementary replacement problem with continuous time and finite horizon [El-Yaniv and Karp 1997]. Recall that in this version of the replacement problem all changeover costs equal a constant C. Also in this variant there is only one possible replacement activity for the current one. Hence we denote the flow rate of the activity offered at time t by f(t). The player starts with the initial activity paying at flow rate f(0) and may choose any number k of changeover times, $0 < t_1 < t_2 < \cdots < t_k < T$. For each such changeover time $t_i$, the player pays a changeover cost C and throughout the interval $[t_i, t_{i+1})$, $i = 0, 1, \ldots, k$, his payment flow is at the rate $f(t_i)$. (By convention, take $t_0 = 0$ and $t_{k+1} = T$.) For each particular choice of changeover times the total cost incurred by the player, composed of payment flows and changeover costs, is

$$kC + \sum_{i=0}^{k} (t_{i+1} - t_i) f(t_i).$$

Any choice of k and (k) changeover times is called a replacement policy. Of course we are interested in replacement policies that minimize the total cost. Given f(t) and C, it is straightforward to compute an optimal offline replacement policy and OPT(f), the optimal offline cost, via (continuous) dynamic programming [Bellman 1955].
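The total cost of a replacement policy in the elementary problem is simple to evaluate once the changeover times are fixed. A minimal sketch follows; the function name `replacement_cost` and the linear flow rate used in the example are illustrative assumptions.

```python
def replacement_cost(changeover_times, flow, C, T):
    """Total cost k*C + sum over i of (t_{i+1} - t_i) * f(t_i), with t_0 = 0, t_{k+1} = T.

    changeover_times: the chosen times t_1 < ... < t_k, all strictly inside (0, T)
    flow:             function giving the flow rate f(t) of the activity offered at time t
    """
    times = [0.0] + sorted(changeover_times) + [T]
    k = len(changeover_times)
    flow_cost = sum((t_next - t) * flow(t) for t, t_next in zip(times, times[1:]))
    return k * C + flow_cost

# Hypothetical flow rate that decreases linearly from 2 to 1 over [0, 1].
flow = lambda t: 2.0 - t
cost_one_switch = replacement_cost([0.5], flow, C=1.0, T=1.0)
cost_no_switch = replacement_cost([], flow, C=1.0, T=1.0)
print(cost_one_switch, cost_no_switch)
```

In this example the single changeover does not pay off: the saved flow (0.25) is smaller than the changeover cost of 1.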

As in the search problem, we assume that the flow rate function is bounded such that for all t, $m \le f(t) \le M$, where $m, M \in \mathbb{R}$ and $0 \le m < M$. Furthermore, in this online variant of the replacement problem the player must determine the changeover times online without knowledge of future values of the flow rate function. Thus we assume that at each time t the player knows f only over the interval [0, t]. Let S be any online replacement policy and denote its total cost with respect to the flow f by S(f).

Problem reduction. By suitably scaling the time and cost axes, we may assume that $C = T = 1$. Specifically, given an initial problem setup with parameters m', M', T', and C', we map each flow rate $x' \in [m', M']$ to $x = x'T'/C'$ and each time $t' \in [0, T']$ to $t'/T'$. Thus the new problem setup is given by $M = M'T'/C'$, $m = m'T'/C'$, and $T = C = 1$. It is not hard to see that this scaling preserves the competitive ratio. After scaling, we further assume that $m + 1 < M$. For $M \le m + 1$ the problem is trivial in the sense that the online player can always achieve a "perfect" competitive ratio of 1. For the rest of this section we consider the reduced replacement problem with $C = T = 1$ so that the only relevant parameters are m and M.
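The reason the scaling preserves the competitive ratio is that it divides the cost of every policy, online or offline, by the same constant C'. The sketch below checks this on a made-up instance; the helper names, the flow function, and the parameter values are all hypothetical.

```python
def replacement_cost(changeover_times, flow, C, T):
    """k*C plus the flow paid on each interval [t_i, t_{i+1}), t_0 = 0, t_{k+1} = T."""
    times = [0.0] + sorted(changeover_times) + [T]
    flow_cost = sum((t_next - t) * flow(t) for t, t_next in zip(times, times[1:]))
    return len(changeover_times) * C + flow_cost

def rescale_flow(flow, C, T):
    """Flow rate of the reduced problem (C = T = 1): x = x' * T' / C' at time t = t' / T'."""
    return lambda t: flow(t * T) * T / C

# Hypothetical original setup: C' = 4, T' = 10, flow falling from 3 to 1.
orig_flow = lambda t: 3.0 - 0.2 * t
C0, T0 = 4.0, 10.0
times0 = [2.5, 5.0]

unit_flow = rescale_flow(orig_flow, C0, T0)
unit_times = [t / T0 for t in times0]

orig_cost = replacement_cost(times0, orig_flow, C0, T0)
unit_cost = replacement_cost(unit_times, unit_flow, 1.0, 1.0)
print(orig_cost, unit_cost, unit_cost * C0)
```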



4.4.1 Some Types of Online Replacement Policies. Perhaps the most naive online replacement policies that are still interesting are the following class of time-independent policies. A policy in this class is a sequence of constant changeover thresholds that are fixed over time independent of the flow rate function. Specifically, a time-independent policy is a decreasing sequence of real numbers,

$$M \ge M_1 > M_2 > \cdots > M_k \ge m.$$

The interpretation is that the online player changes over for the $i$th time when the flow rate decreases to the level of (or below) $M_i$.

A more sophisticated class of policies is the following class of time-dependent or refusal policies (we use both terms). A refusal policy is defined as a sequence $\{M_i(t)\}_{i=1}^{k}$ of functions such that $M_i(t) : [0, 1] \to [m, M] \cup \{-1\}$. Each of these functions is nonincreasing and for all i and t, $M_i(t) > M_{i+1}(t)$. Here again the interpretation is that the online player changes over for the $i$th time at the first instance t when $f(t) \le M_i(t)$ but refuses to change over as long as $f(t) > M_i(t)$. A particular subclass of simple refusal strategies is the one where each $M_i$ is a constant except for one step at some time $b_i$ from which the function remains at $-1$; that is,

$$M_i(t) = \begin{cases} M_i & \text{if } t \le b_i; \\ -1 & \text{otherwise.} \end{cases}$$

Call such a strategy a constant-threshold refusal policy. Any such strategy is thus specified by the two sequences $\{M_i\}$ and $\{b_i\}$. Notice that a time-independent policy is a rudimentary form of a constant threshold refusal policy where the $b_i$s are all 1. Here we focus on time-independent and constant threshold refusal policies. Nevertheless, note that more sophisticated strategies would make use of the history of flow rates. Somewhat surprisingly, it turns out that in all instances of the preceding replacement problem it is possible to obtain optimal or approximately optimal online performance using only time-independent and constant threshold refusal policies (without resorting to history-dependent policies).

4.4.2 A Lower Bound. A lower bound on the competitive ratio of any deterministic policy for this variant of the replacement problem is obtained in El-Yaniv and Karp [1997]. We now sketch the essential ideas of this bound. Consider an adversary that can choose the flow rate functions from a restricted family consisting of functions that start at time zero at the rate M, then drop "instantaneously" (i.e., during an infinitesimally short time) and continuously to some rate m chosen by the adversary, and then "jump" back to the maximum possible rate M and remain there. We may assume that $m \le M - 1$, since no sensible strategy will change over for any flow rate larger than $M - 1$ (the changeover cost is 1). It can be shown that against such functions the optimal online performance can be attained by a time-independent policy. Intuitively this is due to the fact that all replacement opportunities occur during an infinitesimally short time period.

For any choice of m the optimal offline cost is $m + 1$, since the optimal offline algorithm changes over to the rate m paying 1 for the changeover penalty. Let $S = \{M_i\}_{i=1}^{k}$ be a time-independent policy and assume that it is r-competitive. It can be shown that S must satisfy the relations:

$$M_i + i = r(M_{i+1} + 1), \qquad i = 0, 1, \ldots, k \tag{6}$$

(by convention we set $M_0 = M$, $M_{k+1} = m$). Intuitively, the reason is that if an adversary wishes m to lie in some interval $[M_{i+1}, M_i]$, it will pick $m = M_{i+1} + \epsilon$ (for some small $\epsilon$), in which case the optimal offline cost will be (arbitrarily close to) $M_{i+1} + 1$ and the online cost will be $M_i + i$ (the online strategy will change over i times). On the other hand, the strategy defined by the recurrence relation (6) is r-competitive provided that $M_{k+1} = m$. From properties of this recurrence relation, it can be shown that if S is r-competitive, then the number of thresholds in S is

$$k = (m + 1)(r - 1). \tag{7}$$

A closed-form expression for the recurrence relation (6), with $M_0 = M$, is

$$M_i = \left(M + \frac{r^2}{(r - 1)^2}\right) r^{-i} + \frac{i}{r - 1} - \frac{r^2}{(r - 1)^2}. \tag{8}$$

The optimal competitive ratio $r^*$ can now be determined using (7) and (8). It is the minimum r that solves the equation $M_{k+1} = m$. It can be shown that for a fixed m, $r^* = \Theta(\ln M / \ln \ln M)$. Since $r^*$ is the optimal competitive ratio with respect to restricted flow-rate functions, it is a lower bound on the competitive ratio of any replacement policy against unrestricted flow-rate functions.
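The closed form (8) can be checked against the recurrence (6) numerically. The sketch below rewrites (6) as $M_{i+1} = (M_i + i)/r - 1$ and assumes that the leading constant in (8) is fixed by the initial condition $M_0 = M$; the function names and the values of M and r are illustrative.

```python
def thresholds_by_recurrence(M, r, count):
    """M_0..M_count from M_i + i = r * (M_{i+1} + 1), starting at M_0 = M."""
    values = [float(M)]
    for i in range(count):
        values.append((values[-1] + i) / r - 1.0)
    return values

def threshold_closed_form(M, r, i):
    """Closed form: (M + a) * r**(-i) + i/(r - 1) - a, with a = r^2 / (r - 1)^2."""
    a = r * r / (r - 1.0) ** 2
    return (M + a) * r ** (-i) + i / (r - 1.0) - a

M, r = 100.0, 2.0                           # hypothetical parameters
recurrence = thresholds_by_recurrence(M, r, 5)
closed = [threshold_closed_form(M, r, i) for i in range(6)]
print(recurrence)
print(closed)
```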

4.4.3 A Characterization of Competitive Refusal Policies. El-Yaniv and Karp [1997] establish a characterization theorem for constant threshold refusal policies that have a decreasing refusal time sequence $\{b_i\}$ (i.e., if the policy refuses to change over for the first time, it will refuse to change over for the rest of the game). Given any refusal policy S (of the preceding type) with k changeover thresholds and a real number $r > 1$, the theorem specifies the following conditions, C1 and C2, such that they are both satisfied if and only if S is r-competitive.

C1 for all i and j with $0 \le i \le j \le k$,

$$M_i b_{j+1} + M_j (1 - b_{j+1}) + j \le r \cdot \min \left\{ \begin{array}{l} M_0, \\ M_{i+1} + 1, \\ M_0 b_{j+1} + m(1 - b_{j+1}) + 1, \\ M_{i+1} b_{j+1} + m(1 - b_{j+1}) + 2 \end{array} \right\};$$

C2 for all i and j with $0 \le i < j \le k$,

$$M_i b_j + M_j (1 - b_j) + j \le r \cdot \min \left\{ \begin{array}{l} M_0, \\ M_{i+1} + 1, \\ M_0 b_j + m(1 - b_j) + 1, \\ M_{i+1} b_j + m(1 - b_j) + 2 \end{array} \right\}.$$

The proof of this theorem is obtained by bounding from below the optimal offline cost via a linear form of variables chosen by the adversary (these variables determine the flow function). It follows that the set of feasible choices for the adversary is a polytope. By a convexity argument, it is then sufficient to consider only corner points of this polytope. In each corner point most of the coordinates (i.e., variables) are zero. Then with respect to the corner points we obtain simple expressions for the offline costs (as they appear in the right-hand side of C1 and C2); in particular, each possible cost includes at most two changeovers.

4.4.4 Upper Bounds. Using the preceding characterization theorem it is possible to obtain various interesting upper bounds for this replacement problem. We now describe a policy that achieves a competitive ratio that for sufficiently large M is within a constant factor of $r^*$ for every positive m.

Consider the constant threshold time-independent policy $\{M_i\}_{i=1}^{k}$, where the sequence of changeover thresholds $\{M_i\}$ is defined by the following recurrence relation. For each $r > 1$, set $k = r$. Define

$$\begin{cases} M_0 = M; \\ M_{i+1} = \dfrac{M_i + k}{r} - 1, \quad \text{integer } i \ge 1. \end{cases} \tag{9}$$

It is possible to show that for every positive m and for sufficiently large r, the sequence defined by (9) decreases below m within k steps. Call an r for which $M_{k+1} \le m$ and $M_k > m$ good. For each r, each $m > 0$, and each $M > m + 1$, let $S^*_r(m, M)$ denote the policy $\{M_i\}$ (as defined by (9)). Now, by considering $S^*_r(m, M)$ as a (degenerate) time-dependent policy, we can apply the previous characterization conditions and prove that $S^*_r(m, M)$ is r-competitive for all good r and almost all values of M. Specifically, assume that $M > r/(r - 1) = k/(r - 1)$. This implies that $M_1 + 1 - M \le 0$, which means that $\min\{M_i + 1, M\} = M_i + 1$. We use this fact later. Let us now specialize the conditions C1 and C2 of the characterization result to the case where $b_{k+1} = 0$ and $b_i = 1$, $i = 1, 2, \ldots, k$, that is, when the (degenerate) refusal policy is a time-independent policy. For this r, C1 and C2 reduce to the condition: for all $0 \le i \le j \le k$,

$$M_i + j \le r \cdot \min\{M_0, M_{i+1} + 1\}.$$

But this condition readily holds by the definition of the $M_i$s and the fact that $M_1 + 1 \le M$. For each (sufficiently large) M, define r(M) to be the minimum (infimum) good r. Then it is possible to show that for a fixed m, $r(M) = \Theta(\ln M / \ln \ln M)$, which has the same asymptotic growth as $r^*$, the lower bound for the problem.

El-Yaniv and Karp [1997] also consider a constant threshold refusal policy and prove its (strict) optimality whenever $\sqrt{M/(m + 1)} \le (m + 2)/(m + 1)$ or $m = 0$. The proof relies on the preceding characterization result but is much more involved.

4.5 Discrete-Time Replacement with Multiple Permanent Replacement Options

In this section we only state the results of a recent paper by Azar et al. [1996] that studies the discrete-time replacement problem variant with multiple permanent replacement options. Recall that in this problem variant we require that $|R_i| > 0$ and that for all i, $R_i \subseteq R_{i+1}$ (see the notation of Section 4.1). That is, an activity that is once offered is always available thereafter.

4.5.1 The Convex Problem Variant. An instance of this replacement problem is called convex if for each i and $b_1, b_2 \in R_i$, if $f(b_1) < f(b_2)$, then $r(b_1) \ge r(b_2)$. For the convex variant Azar et al. [1996] obtain a simple 7-competitive algorithm.

4.5.2 The Nonconvex Variant. An instance of this problem that is not convex in the previous sense is called nonconvex. Nonconvex instances turn out to be markedly harder. Azar et al. [1996] introduce an algorithm that attains the competitive ratio:

$$O\left(\min\{\log(c\, r_{\max}),\ \log\log(c\, f_{\max}),\ \log(c\, n_{\max})\}\right),$$

with $r_{\max}$ being the ratio between the maximum and minimum replacement costs, $f_{\max}$ the ratio between the maximum and minimum flow rates, $n_{\max}$ the total number of replacement alternatives presented to the player throughout the game, and c some constant. Azar et al. also present a nemesis request sequence of replacement sets that forces the following competitive ratio on every online algorithm.

$$\Omega\left(\min\left\{\log(c\, r_{\max}),\ \frac{\log\log(c\, f_{\max})}{\log\log\log(c\, f_{\max})},\ \frac{\log(c\, n_{\max})}{\log\log(c\, n_{\max})}\right\}\right),$$

where c is some constant. Thus, as a function of $r_{\max}$, their algorithm attains a competitive ratio that is within a constant factor of the best possible.

5. TWO-WAY TRADING AND PORTFOLIO SELECTION

5.1 The Elementary Portfolio Selection Problem

Consider a market of m securities: these can be stocks, bonds, foreign currencies, or commodities. Let $\vec{p}_i = (p_{i1}, p_{i2}, \ldots, p_{im})$ denote a vector of prices where for each $j = 1, 2, \ldots, m$, $p_{ij}$ denotes the number of units of the $j$th security that can be bought for one dollar at the start of the $i$th trading period, $i = 1, 2, \ldots$. The "local" currency, say, dollars, may or may not be one of the m securities. This local currency is referred to as cash. The change in security prices during the $i$th period is represented as a vector $\vec{x}_i = (x_{i1}, x_{i2}, \ldots, x_{im})$ where for each i and j, $x_{ij} = p_{ij}/p_{(i+1)j}$. The quantity $x_{ij}$ is called the price relative of security j (of the $i$th period). Thus an investment of d dollars in the $j$th security just before the start of the $i$th period yields $d x_{ij}$ dollars by the end of the $i$th trading period.

An investment in the market, or portfolio, is specified as the proportion of dollar wealth currently invested in each of the m securities. Specifically, we represent a portfolio as a probability distribution $\vec{b} = (b_1, b_2, \ldots, b_m)$, where $b_i \ge 0$ and $\sum_i b_i = 1$. Consider a portfolio $\vec{b}_1$ invested just before the first period. By the start of the second period this portfolio yields

$$\vec{b}_1^{\,t} \cdot \vec{x}_1 = \sum_{j=1}^{m} b_{1j} x_{1j}$$

dollars per each initial dollar invested. At this stage the investment can be cashed in and adjusted, say, by reinvesting the entire current wealth in some other proportion $\vec{b}_2$, and so on. Assuming an initial wealth of $1, the compounded return of a sequence of portfolios, $B = \vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n$ with respect to a sequence of market price relatives $X = \vec{x}_1, \ldots, \vec{x}_n$ is defined as

$$R(B, X) = \prod_{i=1}^{n} \vec{b}_i^{\,t} \cdot \vec{x}_i = \prod_{i=1}^{n} \sum_{j=1}^{m} b_{ij} x_{ij}.$$

A portfolio selection algorithm is any sequence of portfolios specifying how to reinvest the current wealth from period to period. The compounded return of a portfolio selection algorithm ALG with respect to a market sequence X is denoted by ALG(X) = R(ALG, X). The portfolio selection problem revolves around the question of identifying and analyzing portfolio selection algorithms. Of course, here we are mainly concerned with online portfolio selection algorithms.
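The compounded return R(B, X) is a product of per-period inner products, which a few lines of code make concrete. In this minimal sketch the function name and the two-period, two-security market are illustrative assumptions.

```python
def compounded_return(portfolios, price_relatives):
    """R(B, X): product over periods of the inner product b_i . x_i, starting from $1."""
    wealth = 1.0
    for b, x in zip(portfolios, price_relatives):
        wealth *= sum(b_j * x_j for b_j, x_j in zip(b, x))
    return wealth

# Hypothetical market with m = 2 securities (cash and one stock) over n = 2 periods.
B = [(0.5, 0.5), (1.0, 0.0)]      # split evenly, then move everything to cash
X = [(1.0, 2.0), (1.0, 0.5)]      # stock doubles, then halves
wealth = compounded_return(B, X)
print(wealth)
```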

It is important to notice that the preceding basic portfolio selection problem is only a crude approximation of the corresponding real-life problem. For practical purposes, perhaps the most important factor missing in this model is transaction costs (or alternatively, bid-ask spreads). (Note that in Section 5.8.7 we add this feature into the portfolio selection model.) Another simplification of this model is to assume that money and units of securities are arbitrarily divisible. Other important missing factors are taxes and interest rates that are typically important issues in investment planning. Finally, this model does not allow many investment opportunities that exist in a modern market, starting with short selling and ending with myriad derivative instruments such as futures, options, and the like. Nevertheless, this model is rich enough in itself and makes a reasonable foundation for studying some of the essential questions related to portfolio selection.

5.1.1 The Two-Way Trading Problem. The special problem instance with $m = 2$, where one of the two securities is cash, has received special attention and turns out to be sufficiently interesting in itself. We call this particular instance the two-way trading problem. Note that in the two-way trading problem we assume that the prices and price relatives of the cash are always 1. In the special case of the two-way trading problem we sometimes refer to portfolio selection algorithms simply as (two-way) trading algorithms.

5.2 Buy-and-Hold Versus Market Timing

Financial agents study and use a large variety of portfolio selection strategies, including the "slow" and almost static strategies typically used by mutual fund managers that select and buy some portfolio and hold it for quite a long time. Such strategies rely on the natural tendency of securities to increase their value due to natural economic forces. For instance, stocks pay dividends and increase their prices as the underlying firms succeed in their businesses. Such "slow" strategies are generally called buy-and-hold (BAH). Thus the essence of devising a BAH strategy is the particular selection of assets to hold and investors attempt to diversify their holdings in order to reduce the risk (variance) of their portfolios.⁷

In contrast, some financial agents use more aggressive strategies that buy and sell securities more frequently, sometimes even many times during one day. Such strategies mainly attempt to take advantage of securities' price fluctuations and are called market-timing strategies. Of course, in the long run every buy-and-hold strategy is also a market-timing strategy. After all, sooner or later owners of securities want to realize the monetary value of the assets they hold. Hence, whether a strategy is buy-and-hold or market timing is relative to the time horizon considered.

The fact that market-timing strategies have the potential for enormous returns is not surprising. Consider the following illustration due to Shilling [1992]. A $1 portfolio invested in the Dow Jones Industrial Average in January 1946 was worth $116 at the end of 1991 (including reinvestment of dividends but excluding tax deductions). This is equivalent to 11.2% compound annual gain. A market-timing strategy that was lucky enough to be out of the market during the 50 weakest months in that 552-month period but otherwise was fully invested using the same fixed portfolio would return $2,541, or a 19% annual gain. Furthermore, an offline strategy which during the 50 weakest months was in a short position would have returned $44,967, or a 26.9% annual gain. The reader is referred to the comprehensive survey by Merton [1981] for classical theories and ideas regarding buy-and-hold and market-timing strategies.

5.2.1 The Constant Rebalanced Algorithm. An algorithm that is often used in practice and is considered in this survey is the constant rebalanced algorithm that invests in a fixed portfolio $\mathbf{b} = (b_1, b_2, \ldots, b_m)$ at the start of each trading period. This algorithm, denoted $\mathrm{CBAL}_{\mathbf{b}}$, is clearly a market-timing strategy. The performance of the optimal offline CBAL is always at least as good as that of the optimal offline BAH, but usually it is significantly better (see, e.g., the examples in Cover [1991]). The reason is that the optimal offline BAH only performs as well as the best security in the market, whereas the optimal offline CBAL also takes advantage of fluctuations in the market, giving rise to exponential returns. Of course, CBAL makes as many as $m$ transactions at the start of each trading period to adjust its portfolio, whereas BAH performs two transactions during the entire trading period. Although in the models considered here this large number of transactions is not a consideration, it definitely becomes significant when transaction costs are introduced.
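The gap between constant rebalancing and buy-and-hold can be illustrated with a minimal Python sketch. The market below is hypothetical (one security that alternately doubles and halves, plus cash), price relatives are treated as per-period wealth multipliers, and the function names are illustrative only.

```python
from math import prod

def cbal_return(b, market):
    """Compounded return of the constant rebalanced portfolio b (initial wealth 1)."""
    return prod(sum(bi * xi for bi, xi in zip(b, x)) for x in market)

def best_bah_return(market):
    """Return of holding the single best security throughout (optimal offline BAH)."""
    m = len(market[0])
    return max(prod(x[j] for x in market) for j in range(m))

# Hypothetical market: security 1 alternately doubles and halves, security 2 is cash.
market = [(2.0, 1.0), (0.5, 1.0)] * 10      # 20 trading periods

bah = best_bah_return(market)               # neither security gains overall
cbal = cbal_return((0.5, 0.5), market)      # grows by 1.5 * 0.75 = 1.125 every two periods
print(bah, cbal)
```

In this sketch neither security gains over the whole sequence, so buy-and-hold ends where it started, while the half-and-half rebalanced portfolio compounds a gain from the fluctuations alone.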

5.3 Online Portfolio Selection

Let ALG be an online portfolio-selection algorithm. Recall that ALG($X$) denotes the compounded return of ALG with respect to the market sequence $X = \mathbf{x}_1, \ldots, \mathbf{x}_n$ of price-relative vectors and starting with an initial wealth of one dollar. The competitive ratio of ALG is

$$\sup_X \frac{\mathrm{OPT}(X)}{\mathrm{ALG}(X)},$$

where OPT is an optimal offline portfolio selection algorithm. By considering the simpler one-way trading problem (see Section 3.4), the paper by El-Yaniv et al. [1992] obtains a lower bound of $c_2^*(\varphi)^{n/2}$ for portfolio selection ($m = 2$), where the constant $c_2^*(\varphi) > 1$ is the optimal bound for a one-way two-day trading game in which the player knows the global fluctuation ratio $\varphi$ (see Equation (5)). This simple lower bound is essentially obtained via a decomposition of the two-way trading problem into a sequence of one-way trading games, using the known lower bound for one-way trading games. Note that El-Yaniv et al. [1992] present an upper bound of $c^*(\varphi)^n$ for the case $m = 2$, again based on a straightforward decomposition of the two-way trading game into a sequence of one-way games.

7 The reader is referred to Bodie et al. [1993] for a thorough treatment of this topic, as well as other issues in elementary mathematical finance.

Thus, for the online portfolio selection problem, one cannot hope for competitive ratios that are subexponential in $n$, the number of trading periods. Also, to obtain bounded ratios one must assume lower and upper bounds on the securities’ prices (or other equivalent constraints). Nevertheless, one should keep in mind that for typical market sequences the optimal offline algorithm accrues astronomical returns (which are also typically exponential in $n$), so the preceding lower bound does not exclude possibilities of large returns to the online player. On the other hand, notice that competitive online strategies can lose money (i.e., end the game with fewer dollars than the initial wealth), whereas the optimal offline strategy will never lose.

5.4 Other Performance Measures

Worst-case studies of online portfolio selection strategies consider performance measures other than the competitive ratio. These performance measures differ from the standard competitive ratio optimization in (some of) these ways:

—considering a restricted kind of offline benchmark algorithm, for example, some kind of a “static” offline algorithm (such as the optimal offline CBAL) instead of the standard, unrestricted optimal offline algorithm;

—imposing more constraints on the adversary; for example, considering adversaries that must choose their worst-case market sequence while conforming to some additional (statistical) parameters. Such adversaries are called statistical adversaries [Raghavan 1992]8;

—using decision criteria other than the competitive ratio, for example, the maximin or the minimax regret (definitions follow);

—instead of measuring the compounded return of the strategy (online or offline), measuring returns via some utility functions. For example, one popular measure is the exponential growth rate defined (for an algorithm ALG and a market sequence $X$) as $\frac{1}{n}\log(\mathrm{ALG}(X))$, where $n = |X|$.

Let us now formally define other measures that are discussed later. As usual, ALG denotes an online algorithm and OPT denotes an offline algorithm. Let $U$ denote any utility function. With respect to ALG, OPT, and $U$ (and assuming profit maximization) we now distinguish among three adversaries. First, a competitive-ratio adversary chooses a market sequence $X^*$ such that

$$X^* = \arg\max_X \frac{U(\mathrm{OPT}(X))}{U(\mathrm{ALG}(X))}.$$

The maximin adversary chooses a market sequence $X^*$ such that

$$X^* = \arg\min_X U(\mathrm{ALG}(X)).$$

The minimax-regret adversary chooses $X^*$ with

$$X^* = \arg\max_X \{U(\mathrm{OPT}(X)) - U(\mathrm{ALG}(X))\}.$$

8 Notice that most of the adversaries considered so far are “minimal” forms of statistical adversaries (e.g., some of the adversaries assumed in the one-way trading problem must conform to lower and upper bounds on prices).



Applying additional restrictions on the adversary simply means that we constrain the preceding maximizations (minimizations) with respect to these additional restrictions that specify what feasible market sequences are. Notice that the use of the exponential growth-rate utility in conjunction with the minimax-regret adversary is equivalent to using the competitive-ratio adversary with the identity utility function.
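The equivalence noted above follows because the regret in growth rate, (1/n) log OPT(X) minus (1/n) log ALG(X), equals (1/n) log(OPT(X)/ALG(X)), a monotone function of the competitive ratio. The following Python sketch checks this numerically on a small hypothetical feasible set; the choice of ALG (a half-and-half constant rebalanced trader), the two-valued price relatives, and all names are assumptions made for illustration.

```python
from itertools import product
from math import log, prod

def opt(xs):
    # Offline two-way trading benchmark: hold the security exactly when it gains.
    return prod(max(1.0, x) for x in xs)

def alg(xs, b=0.5):
    # Illustrative online algorithm: keep a constant fraction b in the security.
    return prod(b * x + 1.0 - b for x in xs)

def growth_rate(wealth, n):
    return log(wealth) / n

n = 6
sequences = list(product((2.0, 0.5), repeat=n))   # hypothetical feasible set

ratio = {xs: opt(xs) / alg(xs) for xs in sequences}
regret = {xs: growth_rate(opt(xs), n) - growth_rate(alg(xs), n) for xs in sequences}

# Growth-rate regret is log(ratio)/n on every sequence, so both adversaries
# rank the feasible sequences identically.
max_gap = max(abs(regret[xs] - log(ratio[xs]) / n) for xs in sequences)
worst_by_ratio = max(ratio.values())
worst_by_regret = max(regret.values())
print(max_gap, worst_by_ratio, worst_by_regret)
```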

We note that most of the preceding variations apply in general to any online problem. There are various intuitive motivations for the use of any of these variations. For instance, the use of restricted statistical adversaries is motivated by the desire to avoid considering unrealistic adverse market sequences that are very unlikely to occur in practice. The reader is referred to El-Yaniv [1996], where the rationality of the competitive ratio as well as a comparison to some of the other performance measures is studied. Nevertheless, we note that the practical utility of all these worst-case performance measures is not yet well understood and should be supported by means of experimental studies.

5.5 Results for the Two-Way Trading Problem

Here we focus on several results for the two-way trading problem, that is, the portfolio-selection problem with $m = 2$ securities such that one of them is cash. In this special case, since the cash prices and price relatives are always 1, a market sequence is specified by a sequence of the security prices $p_1, p_2, \ldots$ (or price relatives).

5.5.1 Games Against Mean-Variance Statistical Adversaries. Raghavan [1992] considers a market in which the security price sequence $p_1, p_2, \ldots$ maintains a known mean of 1 and standard deviation $\sigma \in [0, 1]$. He also imposes the restrictions that (i) prices are drawn from the interval $[1 - c, 1 + c]$ where $c \le 1$ is some constant and (ii) $p_1 = p_n = 1$, where $n$ is the number of trading periods. With respect to a (statistical) maximin adversary conforming to the preceding restrictions he obtains the following results (proofs of the stated results are not given). First it is stated that any deterministic online algorithm cannot return more than $1 + \sigma$. He then considers two strategies: CBAL and the following averaging strategy, denoted AVE. Assuming $n$ trading periods, AVE invests $1/n$ dollars in the security at the start of each trading period. The following bounds hold for AVE for each feasible market sequence $X = p_1, p_2, \ldots, p_n$,

$$1 + \sigma^2(1 - c) \le \mathrm{AVE}(X) \le 1 + \sigma^2.$$

For $\mathrm{CBAL}_{\mathbf{b}}$ that maintains a portfolio $\mathbf{b} = (\frac{1}{2}, \frac{1}{2})$, the following bounds are stated,

$$1 + \sigma^3/3 \le \mathrm{CBAL}(X) \le 1 + \sigma^3/2.$$

It is stated that the $1 + \sigma^3/2$ upper bound holds for any constant rebalanced strategy (under the same assumptions). The bounds for algorithms AVE and CBAL can be obtained as follows. Consider a price sequence $X = p_1, p_2, \ldots, p_n$. Any dollar investment of AVE at the $i$th trading period returns $1/p_i$ dollars (since $p_n = 1$). Since AVE invests $1/n$ dollars in each period, the total return is $\frac{1}{n}\sum_i 1/p_i$. Hence, bounds on AVE’s total return can be obtained by maximizing this expression over all price vectors satisfying the mean-variance (standard deviation) constraints. Similarly, obtaining the bounds for CBAL can be reduced to such constrained optimization. Note that the return of CBAL for the market sequence $X$ is

$$\mathrm{CBAL}(X) = \frac{1}{2^{n-1}} \prod_{i=1}^{n-1} \left(\frac{p_i + p_{i+1}}{p_{i+1}}\right).$$
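The two return expressions used in this argument, the average of $1/p_i$ for AVE and the product formula for CBAL with $\mathbf{b} = (\frac{1}{2}, \frac{1}{2})$, can be evaluated directly. The Python sketch below does only that: the price sequence is hypothetical, it is not claimed to satisfy the mean-variance constraints, and the function names are illustrative.

```python
from math import prod

def ave_return(prices):
    """Total return of AVE: 1/n dollars invested at each price, with p_n = 1."""
    n = len(prices)
    return sum(1.0 / p for p in prices) / n

def cbal_half_return(prices):
    """Return of CBAL with b = (1/2, 1/2), using the product formula in the text."""
    n = len(prices)
    factors = ((prices[i] + prices[i + 1]) / prices[i + 1] for i in range(n - 1))
    return prod(factors) / 2 ** (n - 1)

prices = [1.0, 0.5, 1.0]            # hypothetical sequence with p_1 = p_n = 1
print(ave_return(prices))           # (1 + 2 + 1) / 3
print(cbal_half_return(prices))     # (3 * 1.5) / 4
```

Bounding either quantity over all feasible sequences then amounts to optimizing these functions subject to the constraints on the prices.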

5.6 Other Statistical Adversaries and “Money-Making” Algorithms

5.6.1 Money-Making Algorithms. Define the profit of a trading strategy as its final compounded return minus its initial wealth (a negative profit is called loss). The realization that any competitive trading algorithm may end the trading game with a loss motivated Chou et al. [1995a] to study models with additional constraints that guarantee positive profit for the online player with respect to market sequences for which the optimal offline algorithm accrues positive profits. An online trading algorithm that satisfies this property is called money-making. The models considered assume several kinds of statistical adversaries that must conform to prespecified constraints. Not surprisingly, in order to achieve the money-making property the assumptions on market sequences must be quite strong. Nevertheless, considering such models leads to quite interesting results. For the rest of this section we consider finite market sequences prescribed by a sequence of security prices $p_1, p_2, \ldots, p_n$, or alternatively by a sequence of the price relatives $x_1, x_2, \ldots, x_{n-1}$, where $x_i = p_i/p_{i+1}$, $i = 1, 2, \ldots, n - 1$. When referring to a market sequence $X$ we take the most convenient of these representations.

5.6.2 The $(n, \phi)$-Adversary. Fix some $n \ge 2$. Assume that each feasible price sequence is of length $n$ and impose the restriction that the optimal offline return associated with a feasible sequence is at least $\phi$ (clearly $\phi \ge 1$). The underlying assumption here is that true $(n, \phi)$ pairs exist in relevant real price sequences in the sense that such pairs can be statistically estimated from past markets with a reasonable degree of confidence. An adversary that is constrained to generate only such feasible sequences is called the $(n, \phi)$-adversary. Let $X = x_1, x_2, \ldots, x_{n-1}$ be a feasible sequence of price relatives. It is not hard to see that the optimal offline return $\phi$ is given by

$$\phi = \prod_{1 \le i \le n-1} \max\{1, x_i\}. \tag{10}$$

It follows that, for any such sequence of prices (price relatives), an online player knowing $\phi$ and $n$ at the start of the game can determine at the start of the $(i + 1)$st period, just after the $(i + 1)$st price is revealed ($i = 1, 2, \ldots$), what the optimal offline return $\phi_{n-i}$ would be for a new $(n - i)$-period game starting in that period, that is, if the optimal offline algorithm were to start in that period a “new game” (with initial wealth of 1) with respect to the suffix of the original market sequence, $x_{i+1}, \ldots, x_{n-1}$. Specifically, using (10) we have $\phi_{n-1} = \min\{\phi, \phi/x_1\}$ and

$$\phi_{n-j-1} = \min\{\phi_{n-j}, \phi_{n-j}/x_{j+1}\}.$$
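The bookkeeping just described, where the offline return still to be realized is updated by $\phi \leftarrow \min\{\phi, \phi/x\}$ after each revealed price relative $x$, can be sketched in a few lines of Python. The price relatives below are hypothetical and the function names are illustrative.

```python
from math import prod

def offline_return(xs):
    """Equation (10): optimal offline return for the price relatives xs."""
    return prod(max(1.0, x) for x in xs)

def remaining_offline_returns(phi, xs):
    """Yield the offline return of the remaining game after each revealed relative."""
    for x in xs:
        phi = min(phi, phi / x)     # dividing by max(1, x) removes the realized gain
        yield phi

xs = [2.0, 0.5, 4.0, 0.25]          # hypothetical price relatives
phi = offline_return(xs)
trace = list(remaining_offline_returns(phi, xs))
print(phi, trace)
```

Once every profitable fluctuation has occurred the tracked value reaches 1, which is how the online player learns that no further gains remain.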

Hence, against the $(n, \phi)$-adversary, the online player can track the “state” (i.e., current wealth accrued) of the optimal offline algorithm with a delay of one day. This property allows for a dynamic programming derivation of the optimal online algorithm. Denote by $R_n(\phi)$ the return of the optimal online algorithm S* (which knows the parameters $n$ and $\phi$). It is not hard to obtain the following recurrence relation for $R_n$.

$$R_n(\phi) = \sup_{0 \le b \le 1} \inf_{x \le \phi} \{(bx + 1 - b)\, R_{n-1}(\phi_{n-1})\}; \tag{11}$$

$$R_2(\phi) = \phi.$$

Notice that algorithm S* attempts to choose its best investment $b$ against the worst possible price relative $x$. The wealth obtained from the investment, $bx$, plus the remaining cash $1 - b$ is then reinvested optimally with respect to an $(n - 1)$-period game in which the optimal offline return is $\phi_{n-1}$, and so on.

A closed form for $R_n$ is probably beyond reach, but it is not hard to prove by induction on $n$ that S* is money-making. However, it is not difficult to obtain the following upper bound on $R_n(\phi)$ for any $\phi > 1$ and $n \ge 2$.

$$R_n(\phi) \le \frac{1}{1 - (1 - 1/\phi)^{n-1}}. \tag{12}$$

This bound can be obtained by considering the following restricted version of the $(n, \phi)$-adversary. In each period this adversary has two options: either to decrease the price by a factor of $\phi$ or to increase the price by a very large factor so that the dollar value of the previous investment becomes negligible. Once the adversary chooses the first option, there will be no further downward fluctuations since the optimal offline return of $\phi$ has been realized. Hence, if this is the case, the game is over. The optimal online return $R'_n(\phi)$ for this restricted game is

$$R'_n(\phi) = \max_{0 \le b \le 1} \min\{b\phi + 1 - b,\ (1 - b)\, R'_{n-1}(\phi)\},$$

and its closed form is the right-hand side of (12), which clearly bounds $R_n(\phi)$ from above.
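The claim that the restricted recurrence solves to the right-hand side of (12) can be checked numerically. The Python sketch below iterates the max-min by balancing its two operands; it assumes the base case $R'_2(\phi) = \phi$ (mirroring $R_2(\phi) = \phi$), and the function names are illustrative.

```python
def restricted_return(n, phi):
    """R'_n(phi) by iterating the max-min recurrence, assuming R'_2(phi) = phi."""
    r = phi
    for _ in range(n - 2):
        # b*phi + 1 - b increases in b while (1 - b)*r decreases in b, so the
        # max-min is attained where the two operands are equal.
        b = (r - 1.0) / (phi - 1.0 + r)
        r = (1.0 - b) * r
    return r

def closed_form(n, phi):
    """Right-hand side of (12)."""
    return 1.0 / (1.0 - (1.0 - 1.0 / phi) ** (n - 1))

for n in (2, 5, 20):
    print(n, restricted_return(n, 4.0), closed_form(n, 4.0))
```

Writing $u_n = 1/R'_n$, the balanced step gives $u_n = (1 - 1/\phi)\,u_{n-1} + 1/\phi$, whose solution with $u_2 = 1/\phi$ is $1 - (1 - 1/\phi)^{n-1}$.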

Using the approximation $(1 - 1/\phi)^{n-1} = (1 - 1/\phi)^{\phi((n-1)/\phi)} \approx e^{-(n-1)/\phi}$, for large $\phi$ we have

$$R'_n(\phi) \approx 1/(1 - e^{-(n-1)/\phi}),$$

and then the following relations are obtained:

—if $\phi = \omega(n)$, then $e^{-(n-1)/\phi} \approx 1 - (n - 1)/\phi$ and $R'_n(\phi) \approx \phi/(n - 1)$;

—if $\phi = \Theta(n)$, then $R'_n(\phi) \approx 1/(1 - e^{-c})$, where $c$ is some positive constant;

—if $\phi = o(n)$, then $R'_n(\phi)$ approaches 1 as $n \to \infty$.

Hence, although S* is “money-making,” the optimal online return $R_n(\phi)$ against the $(n, \phi)$-adversary can be a minuscule fraction of $\phi$. In terms of competitiveness, the competitive ratio of S* is not smaller than $\max\{n - 1, \phi\}$ and no upper bound was determined.

5.6.3 The General “Money-Making” Scheme Against Statistical Adversaries. The preceding derivation of an optimal online algorithm against the $(n, \phi)$-adversary gives rise to a general scheme for obtaining money-making optimal online algorithms with respect to any statistical adversary that is at least as restricted as the $(n, \phi)$-adversary. Specifically, for any collection of constraints $C$ that subsume the $(n, \phi)$ constraint, a relation similar to (11) obtains the corresponding optimal online algorithm when the constraints in $C$ are included appropriately in the recurrence.

5.6.4 Weaker Adversaries. Against the $(n, \phi)$-adversary the online player is forced to invest very small amounts in most trading periods, since the adversary can depreciate the value of most investments by increasing the price arbitrarily. Theoretically, such threats can be made until the second to last period. Such market sequences are of course unrealistic. By imposing additional constraints it is possible to reduce such threats. The following constraints can substantially reduce these unrealistic threats: (i) upper bounds on price relatives; (ii) minimum and maximum bounds on prices; (iii) upper bounds on the length of runs of monotone increasing (decreasing) prices; and (iv) other statistical parameters such as mean and higher moments of the empirical distribution observed. Note that the $(n, \phi)$ constraint is “minimal” in the sense that when one of these parameters is relinquished the money-making property cannot be achieved.

5.7 The Fixed-Fluctuation Model

In the fixed-fluctuation model all price relatives $x_i$ are in $\{\alpha, \alpha^{-1}\}$ where $\alpha > 1$. Assuming w.l.o.g. that the initial price is 1, it follows that all prices are in $\{\alpha^j : j \text{ integer}\}$. We now add the restriction that each feasible sequence of price relatives is of length $n$ and the number of downward (i.e., profitable) $\alpha^{-1}$-fluctuations is exactly $k$, $0 \le k \le n$. Hence the number of upward $\alpha$-fluctuations is $n - k$. Thus the “type” of each market sequence is specified by the number $k$. Call the adversary that produces such feasible sequences the $(\alpha, n, k)$-adversary. Clearly this $(\alpha, n, k)$-constraint subsumes the preceding $(n, \phi)$-constraint since the implied optimal offline return for each feasible market sequence is exactly $\phi = \alpha^k$. Hence it is possible to obtain the optimal online trading algorithm (against the $(\alpha, n, k)$-adversary) using the preceding “money-making” scheme. Chou et al. [1995a] study this optimal online algorithm and characterize some of its properties.

5.7.1 On the Fixed-Fluctuation Model and Time Scaling. Before we continue sketching the results of Chou et al. [1995a], let us consider the practical relevancy of the fixed-fluctuation model. Clearly “daily” price relatives are variable. Hence, to approximate the fixed-fluctuation model one can use a time-scaling approach where each trading period is of variable length such that the $(i + 1)$st price “tick” occurs at the first time instance after the $i$th price tick when a price that approximates $p_i\alpha$ or $p_i/\alpha$ is encountered (with $p_i$ denoting the price at the $i$th tick).9 One advantage of this fixed-fluctuation model is that the player may choose a suitable $\alpha$ to filter out “noisy” fluctuations (i.e., very small fluctuations that should be avoided when transaction costs are taken into consideration). Of course, the choice of the fluctuation parameter $\alpha$ should be correlated to the choice of the market type parameter $k$, which in general should be based on a rough prediction of the market trend within the horizon $n$.
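One possible reading of this time-scaling step is sketched below in Python. It assumes that a tick is recorded the first time the raw price has moved by at least a factor of $\alpha$ from the previous tick, which only approximates the model because the recorded price may overshoot $p_i\alpha$ or $p_i/\alpha$; the raw prices and function names are hypothetical.

```python
def alpha_ticks(prices, alpha):
    """Subsample a raw price series into ticks that differ by a factor of about alpha."""
    ticks = [prices[0]]
    for p in prices[1:]:
        last = ticks[-1]
        if p >= last * alpha or p <= last / alpha:
            ticks.append(p)         # first price that crosses either threshold
    return ticks

def price_relatives(ticks):
    """Price relatives under the convention x_i = p_i / p_{i+1} of Section 5.6.1."""
    return [ticks[i] / ticks[i + 1] for i in range(len(ticks) - 1)]

raw = [100.0, 100.2, 100.6, 101.1, 100.9, 100.0, 99.9]   # hypothetical quotes
ticks = alpha_ticks(raw, alpha=1.01)
print(ticks)
print(price_relatives(ticks))
```

Quotes that move by less than the chosen factor are skipped, which is the noise-filtering effect described above; a larger $\alpha$ skips more of them.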

5.7.2 The Optimal Strategy Against the $(\alpha, n, k)$-Adversary. Denote by $R(n, k) = R(\alpha, n, k)$ the return of the optimal online algorithm S** against the $(\alpha, n, k)$-adversary. Using the preceding scheme, a recurrence relation for $R(n, k)$ is easily obtained:

$$\begin{aligned}
R(n, k) &= \max_{0 \le b \le 1} \min \left\{ \begin{array}{l} (\alpha b + 1 - b)\, R(n - 1, k - 1), \\ (b\alpha^{-1} + 1 - b)\, R(n - 1, k) \end{array} \right\} \\
R(n, 0) &= 1 \\
R(n, n) &= \alpha^n.
\end{aligned} \tag{13}$$

Here the $b$ that attains the maximum in (13) is the first investment made by algorithm S**. Since the upper operand in the min operator in (13) is increasing with $b$ and the lower one is decreasing with $b$, by a balancing argument the optimal $b$, denoted $b^*$, equates both operands. Therefore,

$$b^* = \frac{R(n - 1, k) - R(n - 1, k - 1)}{(\alpha - 1)\, R(n - 1, k - 1) - (\alpha^{-1} - 1)\, R(n - 1, k)}.$$

Substituting $b^*$ for $b$ in either operand of the min in (13) and rearranging, we obtain the expression for the reciprocal $R^{-1}$:

$$R^{-1}(n, k) = \frac{1}{\alpha + 1} R^{-1}(n - 1, k - 1) + \frac{\alpha}{\alpha + 1} R^{-1}(n - 1, k). \tag{14}$$
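The reciprocal recurrence (14), together with the boundary values $R(n, 0) = 1$ and $R(n, n) = \alpha^n$, is straightforward to evaluate by memoized recursion. The Python sketch below does so and also computes the balancing fraction $b^*$; the factory function `make_solver` and the other names are illustrative, not part of the cited work.

```python
from functools import lru_cache

def make_solver(alpha):
    """Return (R, first_investment) for a fixed fluctuation parameter alpha > 1."""

    @lru_cache(maxsize=None)
    def R(n, k):
        # Boundary conditions of (13).
        if k == 0:
            return 1.0
        if k == n:
            return alpha ** n
        # Recurrence (14) on the reciprocals.
        inv = (1.0 / R(n - 1, k - 1) + alpha / R(n - 1, k)) / (alpha + 1.0)
        return 1.0 / inv

    def first_investment(n, k):
        # Balancing fraction b* for the first period, valid for 0 < k < n.
        lo, hi = R(n - 1, k - 1), R(n - 1, k)
        return (hi - lo) / ((alpha - 1.0) * lo - (1.0 / alpha - 1.0) * hi)

    return R, first_investment

R, first_investment = make_solver(2.0)
print(R(2, 1), first_investment(2, 1))
```

With $\alpha = 2$, $n = 2$, and $k = 1$, both operands of the min in (13) evaluate to the same value at $b^*$, which is the balancing property used in the derivation.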

It is possible to solve $R^{-1}$ in a closed form (in terms of partial binomial sums) as follows. Artificially extend the domain of $R^{-1}(n, j)$ such that

$$R^{-1}(n, j) = \begin{cases} 1 & j \le 0; \\ \alpha^{n - 2j} & j \ge n. \end{cases} \tag{15}$$

9 Of course, to allow such time scaling $\alpha$ must be chosen in accordance with the special properties of the market in question.

10 The name binomial tree is typically used in finance [Cox and Rubinstein 1985]. Note that the resulting graph is not a tree in the graph-theoretic sense, but a lattice.

(It is possible to prove by induction on $n$ that the extended recurrence is well defined.) Now consider the following graph (similar to the “Pascal triangle”) called a binomial tree,10 corresponding to an expansion of the extended recurrence $R^{-1}(n, k)$. In this graph, each node is labeled by some pair $(x, y)$ that corresponds to the value $R^{-1}(x, y)$. The root of this graph is $(n, k)$ and each node $(x, y)$ has two outgoing edges, one to its left child $(x - 1, y - 1)$ and the other to its right child $(x - 1, y)$. All the leaves are of the form $(1, k - (n - i))$, $i = 1, \ldots, n$, and their values are obtained by the extension (15). By (14), the value for each node $(x, y)$ is computed from the values of its children $(x - 1, y - 1)$ and $(x - 1, y)$. Notice that in each node $(x, y)$, $x$ corresponds to the height of the node (with the leaves at level 1 and the root at level $n$) and $y$ corresponds to the edge distance from the top-left to bottom-right diagonal (i.e., the sequence of edges connecting $(n, k)$ to $(1, k)$). Notice that all the paths to the same leaf have the same number of left-right (and right-left) moves. In particular, the path to the leaf $(1, k - (n - i))$ has $n - i$ top-left edges and $(n - 1) - (n - i) = i - 1$ top-right edges (there is a total of $n - 1$ edges in each path). Using (14), when we calculate the value $R^{-1}(n, k)$ each left move contributes a factor $q = 1/(1 + \alpha)$ and each right move contributes a factor $p = \alpha/(1 + \alpha)$ (notice that $q = 1 - p$). Hence the “weight” of the path to the leaf $(1, k - (n - i))$ is $p^{i-1}q^{n-i}$. Define

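$$B_p\{n; k\} = \sum_{0 \le i \le k} \binom{n}{i} p^i (1 - p)^{n-i},$$

the partial binomial sum. Abbreviating $z_i = k - (n - i)$, we thus have

$$\begin{aligned}
R^{-1}(n, k) &= \sum_{\text{leaf } (1, z_i)} R^{-1}(1, z_i) \cdot [\text{\# of paths to } (1, z_i)] \cdot [\text{weight of a path to } (1, z_i)] \\
&= \sum_{1 \le i \le n} R^{-1}(1, z_i) \binom{n-1}{i-1} p^{i-1} q^{n-i} \\
&= \sum_{z_i \le 0} R^{-1}(1, z_i) \binom{n-1}{i-1} p^{i-1} q^{n-i} + \sum_{z_i \ge 1} R^{-1}(1, z_i) \binom{n-1}{i-1} p^{i-1} q^{n-i} \\
&= \sum_{z_i \le 0} \binom{n-1}{i-1} p^{i-1} q^{n-i} + \sum_{z_i \ge 1} \alpha^{1 - 2z_i} \binom{n-1}{i-1} p^{i-1} q^{n-i} \\
&= \sum_{i=1}^{n-k} \binom{n-1}{i-1} p^{i-1} q^{n-i} + \alpha^{n-2k} \sum_{i=n-k+1}^{n} \binom{n-1}{i-1} q^{i-1} p^{n-i} \\
&= \sum_{i=0}^{n-k-1} \binom{n-1}{i} p^{i} q^{n-i-1} + \alpha^{n-2k} \sum_{i=0}^{k-1} \binom{n-1}{i} p^{i} q^{n-i-1} \\
&= B_p\{n - 1; n - k - 1\} + \alpha^{n-2k}\, B_p\{n - 1; k - 1\}.
\end{aligned}$$

Here the leaf values are $R^{-1}(1, z_i) = 1$ for $z_i \le 0$ and $R^{-1}(1, z_i) = \alpha^{1 - 2z_i}$ for $z_i \ge 1$; the fifth line uses $\alpha = p/q$, which gives $\alpha^{1 - 2z_i} p^{i-1} q^{n-i} = \alpha^{n-2k} q^{i-1} p^{n-i}$, and the sixth line reindexes the second sum by $i \mapsto n - i$.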
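As a sanity check, the closed form $R^{-1}(n, k) = B_p\{n-1; n-k-1\} + \alpha^{n-2k} B_p\{n-1; k-1\}$ can be compared numerically against the recurrence (14) with boundary values $R(n, 0) = 1$ and $R(n, n) = \alpha^n$. The Python sketch below does this for small $n$; the function names are illustrative, and a partial binomial sum with a negative upper limit is taken to be an empty sum.

```python
from functools import lru_cache
from math import comb

def partial_binomial_sum(p, n, k):
    """B_p{n; k}: sum over 0 <= i <= k of C(n, i) p^i (1 - p)^(n - i)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(min(k, n) + 1))

def closed_form_R(alpha, n, k):
    p = alpha / (1.0 + alpha)
    inv = (partial_binomial_sum(p, n - 1, n - k - 1)
           + alpha ** (n - 2 * k) * partial_binomial_sum(p, n - 1, k - 1))
    return 1.0 / inv

@lru_cache(maxsize=None)
def recursive_R(alpha, n, k):
    """R(n, k) from recurrence (14) with R(n, 0) = 1 and R(n, n) = alpha^n."""
    if k == 0:
        return 1.0
    if k == n:
        return alpha ** n
    inv = (1.0 / recursive_R(alpha, n - 1, k - 1)
           + alpha / recursive_R(alpha, n - 1, k)) / (alpha + 1.0)
    return 1.0 / inv

print(closed_form_R(2.0, 3, 1), recursive_R(2.0, 3, 1))   # both equal 9/7
```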

Using the preceding expression and tail estimates of the binomial distribution, Chou et al. [1995b] obtained the following asymptotic behavior of $R(n, k)$. Let $c \in [0, 1]$ and define

$$\gamma_x = \frac{x^x (1 - x)^{(1-x)} (1 + \alpha)}{\alpha^{(1-x)}}. \tag{16}$$

Then the following bounds hold (cruder but similar bounds were obtained earlier in Chou et al. [1995a]).

(1) if $0 \le c < q$, then $R(n, cn) \to 1$;
(2) if $c = q$, then $R(n, cn) \to 2$;
(3) if $q < c < p$, then $R(n, cn) = \Theta(\sqrt{n}\,\gamma_c^n)$;
(4) if $c = p$, then $R(n, cn) \to 2\gamma_c^n = 2\alpha^{(2c-1)n}$;
(5) if $p < c \le 1$, then $R(n, cn) \to \alpha^{(2c-1)n}$ ($R(n, cn) \ge \alpha^{(2c-1)n}$).



This result entails an interesting corollary. Consider the (optimal offline) BAH. If $c > \frac{1}{2}$, BAH invests its entire wealth on the first period and cashes it at the end of the game. Otherwise, BAH keeps its wealth in cash. The return of BAH, $R(\mathrm{BAH})$, is thus

$$R(\mathrm{BAH}) = \begin{cases} 1 & \text{if } c \le \frac{1}{2}; \\ \alpha^{n(2c-1)} & \text{if } c \ge \frac{1}{2}. \end{cases} \tag{17}$$

It follows that algorithm S** always outperforms BAH. Moreover, if $q < c < p$, S** performs exponentially better than BAH. Whenever $c \in [0, \frac{1}{2}]$ or $c \in [p, 1]$ this result readily follows from the bounds obtained for S** and the return of BAH. Whenever $c \in (\frac{1}{2}, p)$, it is not hard to see that $\gamma_c > \alpha^{2c-1}$ (the function $f(c) = \gamma_c/\alpha^{2c-1} = (c/\alpha)^c (1 - c)^{1-c} (1 + \alpha)$ is strictly decreasing in $(\frac{1}{2}, p)$ and $f(p) = 1$).

Thus, whenever the market is “stable,” in the sense that it does not exhibit a “major” trend but is nevertheless “active” (i.e., there are fluctuations), algorithm S** performs remarkably well (in particular, relative to BAH). Moreover, even if the market exhibits a slight unfavorable trend (i.e., $c \in (q, \frac{1}{2})$), algorithm S** still yields exponential returns. Notice that the size of this “profitable” interval $(q, p)$ increases with $\alpha$ and thus can be controlled by the online player. Nevertheless, for larger values of $\alpha$, because of the time scaling required to approximate the fixed-fluctuation model, algorithm S** may ignore many prices and therefore miss profitable transactions.

Chou et al. [1995b] also compare these bounds on the performance of S** to two more algorithms. First, they study the performance of the optimal constant rebalanced algorithm, denoted CBAL*, against the $(\alpha, n, k)$-adversary. Second, they study the following distributional model corresponding to the constraint that feasible market sequences must be of type $(n, k)$. Specifically, in this model a random price-relative sequence is chosen uniformly among all the feasible sequences (of length $n$ and with exactly $k$ downward fluctuations). With respect to this model they determine the optimal trading algorithm (i.e., optimal with respect to average return). Consider first the constant rebalanced algorithm. Let $b$ be the constant fraction of wealth invested by CBAL in the security. With respect to any feasible market sequence the return of CBAL is

$$(1 - b + b\alpha)^k (1 - b + b\alpha^{-1})^{n-k}.$$

The optimal value of $b$, denoted $b^*$, is easily determined to be $b^* = (\alpha c + c - 1)/(\alpha - 1)$, where $c = k/n$.
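The optimal fraction can be checked numerically. The Python sketch below evaluates the CBAL return $(1 - b + b\alpha)^k (1 - b + b\alpha^{-1})^{n-k}$, computes $b^*$ with $c = k/n$ taken as the fraction of profitable fluctuations, and compares the resulting return with $\gamma_c^n$ from (16). Clipping $b^*$ to $[0, 1]$ when the formula leaves that interval is an assumption of this sketch, and the function names are illustrative.

```python
def cbal_return(b, alpha, n, k):
    """Return of CBAL with fraction b in the security on any sequence of type (n, k)."""
    return (1.0 - b + b * alpha) ** k * (1.0 - b + b / alpha) ** (n - k)

def optimal_fraction(alpha, c):
    """b* = (alpha*c + c - 1) / (alpha - 1), clipped to [0, 1] (sketch assumption)."""
    b = (alpha * c + c - 1.0) / (alpha - 1.0)
    return min(1.0, max(0.0, b))

def gamma(alpha, c):
    """gamma_c as defined in (16)."""
    return c**c * (1.0 - c) ** (1.0 - c) * (1.0 + alpha) / alpha ** (1.0 - c)

alpha, n, k = 2.0, 20, 10
b_star = optimal_fraction(alpha, k / n)
print(b_star, cbal_return(b_star, alpha, n, k), gamma(alpha, k / n) ** n)
```

Substituting $b^*$ gives $1 - b^* + b^*\alpha = c(1 + \alpha)$ and $1 - b^* + b^*\alpha^{-1} = (1 - c)(1 + \alpha)/\alpha$, so for $q \le c \le p$ the optimal CBAL return is exactly $\gamma_c^n$.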

The distributional variant described can be presented equivalently as follows. Define $\ell = n - k$. Given the $(n, k)$ restriction, the probability of an $\alpha^{-1}$-fluctuation in the first period is $k/n$ and the probability of an $\alpha$-fluctuation is $\ell/n$. After the first fluctuation was (randomly) obtained, the remaining sequence must be of type $(n - 1, k - 1)$ or $(n - 1, k)$ depending on whether the first fluctuation was $\alpha^{-1}$ or $\alpha$, respectively, and so on. Denote by $R(n, \ell)$ the expected return of the optimal online algorithm when feasible markets are of type $(n, k)$. Thus

$$R(n, \ell) = \max_{0 \le b \le 1} \left\{ \frac{k}{n} (1 - b + b\alpha)\, R(n - 1, \ell) + \frac{\ell}{n} (1 - b + b\alpha^{-1})\, R(n - 1, \ell - 1) \right\}.$$

Since the return is a maximum of a linear function of $b$, it follows that $b$ is either 0 or 1. Note that $b$ is only the investment for the first period. Nevertheless, a nontrivial analysis presented in Chou et al. [1995b] shows that the optimal online algorithm is constant rebalanced. Therefore, given that the first investment is “all or nothing,” the optimal online strategy is BAH, which invests in the security iff $k \ge \ell$. We note that, with respect to a sequence of price relatives generated randomly by a sequence of i.i.d. random variables, it is known that the optimal algorithm is constant rebalanced [Breiman 1961].

The performances of the optimal constant rebalanced algorithm CBAL* and the optimal algorithm of the distributional problem variant (denoted DIST), plus the bounds for algorithm S**, are summarized in Table II (which also includes, for comparison, the return of the optimal offline BAH).

A striking fact immediately observed in this table is that the asymptotic returns of all three optimal online algorithms are similar. Denote the four regions of $c$, $(0, q)$, $(q, \frac{1}{2})$, $[\frac{1}{2}, p)$, and $(p, 1)$ by R1, R2, R3, and R4, respectively. For markets that exhibit significant trend (regions R1 and R4), the returns of all three algorithms are within a constant factor of BAH. In fact, in these regions the optimal constant rebalanced algorithm degenerates to BAH (and DIST acts in a similar way to BAH across all regions). For markets with no trend (regions R2 and R3), all three algorithms perform exponentially better than BAH.

These results also indicate that information about the “type” of the market can be almost as valuable as information about the distribution (note that in region R3, DIST outperforms S** and CBAL by a constant factor).

These results, together with the fact that both CBAL* and S** make transactions in almost every trading period, may indicate that the following is true whenever transaction costs are introduced. When the market is trendy, even if the trader has perfect prediction of the trend’s slope it may be beneficial to avoid market timing and follow the trend using BAH. On the other hand, the exponential advantage of market-timing strategies in nontrendy markets may very well compensate for the losses incurred by transaction costs.

Finally, we note that Chou et al. [1995b] show how to derive the return of S** using a standard method called binomial risk-neutral pricing (see Hull [1993]) that is usually used to price options and other derivative instruments. Their results imply that the binomial risk-neutral pricing method is equivalent to worst-case analysis.

Table II. Asymptotic Performance of CBAL*, S**, DIST, and BAH for “Trendy” and “Nontrendy” Markets

| Region of $c = k/n$ | CBAL* | S** | DIST | BAH |
|---|---|---|---|---|
| $(0, q)$ | $1$ | $1$ | $\Theta(1)$ | $1$ |
| $c = q$ | $1$ | $2$ | $\Theta(\sqrt{n})$ | $1$ |
| $(q, \frac{1}{2})$ | $\gamma_c^n$ | $\Theta(\sqrt{n} \cdot \gamma_c^n)$ | $\Theta(\sqrt{n} \cdot \gamma_c^n)$ | $1$ |
| $[\frac{1}{2}, p)$ | $\gamma_c^n$ | $\Theta(\sqrt{n} \cdot \gamma_c^n)$ | $\Theta(\sqrt{n} \cdot \gamma_c^n)$ | $\alpha^{(2c-1)n}$ |
| $c = p$ | $\alpha^{(2c-1)n}$ | $2\alpha^{(2c-1)n}$ | $\Theta(\sqrt{n} \cdot \alpha^{(2c-1)n})$ | $\alpha^{(2c-1)n}$ |
| $(p, 1)$ | $\alpha^{(2c-1)n}$ | $\alpha^{(2c-1)n}$ | $\Theta(\alpha^{(2c-1)n})$ | $\alpha^{(2c-1)n}$ |

5.7.3 On the Empirical Performance of S**. Chou [1994] reported on preliminary experimental results testing the performance of S**. He used two data sets, both of which included intraday prices for both US dollars versus Japanese yen and US dollars versus German marks. Data set A spanned one month and included all price quotations11 and data set B included prices at six-minute intervals during one year. In both cases $\alpha$ was set to 1 + 5/(initial exchange rate), since almost all changes in these data were of 5 points12; since such changes were small relative to the exchange rates, the additive changes could be reasonably well approximated by multiplicative changes of $\alpha$. The value of $n$ varied between 100 and 1,000, thus breaking the trading period into a sequence of short games with reinvestment. Finally, $k$ was naively chosen to be $n/2$ for all games. The returns of S** with respect to data set A were remarkably high (223% and 167% for USD/DM and USD/JY, respectively). The results with respect to data set B were marginal (104% and 111%). When transaction costs of 0.02% were introduced, the returns against data set A remained very high (133% and 134%) but against data set B, S** suffered severe losses (73% and 89%). Examinations of these data sets revealed that in data set A the assumption of $k = n/2$ was quite closely satisfied, whereas in data set B it was severely violated. One unsurprising drawback of S** that was found in these experiments is that S** reacts to and trades with every price tick.

11 In such data a new price “tick” occurs every 10–120 seconds.

12 A point is the smallest unit used to measure prices.

These results indicate that algorithm S** is quite robust with respect to the assumed parameters. Nevertheless, no theoretical results on the algorithm’s sensitivity to noise were presented. Note that sensitivity analyses are desired in all problems where one makes use of statistical adversaries.

Finally, it is important to emphasize that empirical results such as the preceding usually cannot be conclusive. In general, in order to exhibit informative empirical performance of financial algorithms one must quantify the risk associated with the exhibited levels of return.13

Chou et al. [1995b] reported on extensive statistical tests measuring the trends exhibited in price sequences as a function of $\alpha$ and $n$. In particular, their goal was to measure the empirical distribution of trend types in terms of the preceding four regions (R1, R2, R3, and R4) with varying values of $\alpha$ and $n$. The data used included 486 of the 500 stocks comprising the Standard and Poor 500 index (S&P 500) spanning a time period of about 30 years. The tests reveal the following relations. Call regions R2 and R3 the “nontrendy” regions and regions R1 and R4 the “trendy” regions. As $\alpha$ or $n$ grow, the fraction of sequences (of length $n$) in the nontrendy regions grows (and the fraction of sequences in region R3 is always larger than in R2). For small values of $\alpha$ (e.g., $1.005 \le \alpha \le 1.05$) the majority of sequences are in the trendy regions; for larger values of $\alpha$ the majority of sequences are in the nontrendy regions. From these results it follows that with respect to the stock market, the practical advantage of market-timing strategies like S** or constant rebalanced, using a fixed-fluctuation model, can be obtained only when using large values of $\alpha$ and $n$ (while using time scaling). On the other hand, if, in a particular market, such large values of $\alpha$ are not feasible, it may be wiser to remain “static” and use BAH. Note that for large values of $\alpha$ (say, $\alpha \ge 1.01$), transaction points cannot occur too often (which may be considered an advantage by some traders).

5.8 Online Portfolio Selection, Results for Arbitrary m

5.8.1 The Family of $\mu$-Weighted Portfolio-Selection Algorithms. Cover and Ordentlich [1996] define the following general class of $\mu$-weighted online portfolio-selection algorithms. Each algorithm in this class is specified by a probability measure $\mu$ over the set of all portfolios. The algorithm starts by hedging uniformly on all possible portfolios. Then it rebalances the weight it gives to various securities according to the past performance of constant rebalanced portfolios while putting more weight on the better performing ones. The probability measure $\mu$ provides an additional weighting mechanism that can favor particular portfolios.

13 There are various common measures of risk, for example, Sharpe’s ratio, given by $(E[R(\mathrm{ALG})] - R(\text{risk-free}))/\sigma(\mathrm{ALG})$, where $E[R(\mathrm{ALG})]$ and $\sigma(\mathrm{ALG})$ are the mean and standard deviation of ALG’s return and $R(\text{risk-free})$ is the risk-free rate of return [Von Neumann and Morgenstern 1944]. The reader is referred to any elementary finance book such as Bodie et al. [1993] that discusses measures of risk.



The introduction of the $\mu$-weighted algorithms requires some notation. Fix some positive integer $m$. Let $\mathcal{B}$ be the set of all possible portfolios over $m$ securities (i.e., $\mathcal{B}$ is the $(m-1)$-dimensional simplex). With respect to a market sequence $X = \vec{x}_1, \ldots, \vec{x}_n$, define $X_i = \vec{x}_1, \ldots, \vec{x}_i$, $i = 1, 2, \ldots, n$, the prefix of $X$ consisting of the first $i$ market vectors. With respect to $X_i$ define

$$R_i(\vec{b}) = R(\vec{b}, X_i) = \prod_{j=1}^{i} \vec{b}^t \cdot \vec{x}_j .$$

That is, $R_i(\vec{b})$ is the compounded return of a fixed portfolio $\vec{b}$ after $i$ trading periods. By convention, set $R_0(\vec{b}) = 1$, the initial wealth. (Alternatively, $R_i(\vec{b}) = \text{CBAL}_{\vec{b}}(X_i)$.)

Fix some market sequence $X$ and let $\mu$ be a probability measure over $\mathcal{B}$. An algorithm is a $\mu$-weighted portfolio selection algorithm if its $i$th period portfolio $\vec{b}_i$ is specified by

$$\vec{b}_i = \frac{\int_{\mathcal{B}} \vec{b}\, R_{i-1}(\vec{b})\, d\mu(\vec{b})}{\int_{\mathcal{B}} R_{i-1}(\vec{b})\, d\mu(\vec{b})}. \qquad (18)$$

Here the integral is the Lebesgue integral. Note that when the probability measure $\mu$ has a density function the Lebesgue integral can be expressed in terms of ordinary (multiple) integrals. Clearly, any $\mu$-weighted algorithm operates online.

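Let $X$ be any market sequence of length $n$ and let $\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n$ be the portfolios obtained by a $\mu$-weighted algorithm $\text{ALG}_\mu$. The expression for the compounded return $\text{ALG}_\mu(X)$ of the $\mu$-weighted algorithm $\text{ALG}_\mu$ can be easily simplified as:

$$
\begin{aligned}
\text{ALG}_\mu(X) &= \prod_{1 \le i \le n} \vec{b}_i^t \cdot \vec{x}_i \\
&= \prod_{1 \le i \le n} \frac{\left(\int_{\mathcal{B}} \vec{b}^t R_{i-1}(\vec{b})\, d\mu(\vec{b})\right) \cdot \vec{x}_i}{\int_{\mathcal{B}} R_{i-1}(\vec{b})\, d\mu(\vec{b})} \\
&= \prod_{1 \le i \le n} \frac{\int_{\mathcal{B}} \vec{b}^t \cdot \vec{x}_i\, R_{i-1}(\vec{b})\, d\mu(\vec{b})}{\int_{\mathcal{B}} R_{i-1}(\vec{b})\, d\mu(\vec{b})} \\
&= \prod_{1 \le i \le n} \frac{\int_{\mathcal{B}} R_i(\vec{b})\, d\mu(\vec{b})}{\int_{\mathcal{B}} R_{i-1}(\vec{b})\, d\mu(\vec{b})}. \qquad (19)
\end{aligned}
$$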

Hence, since the last product telescopes (and $R_0(\vec{b}) = 1$), we have

$$\text{ALG}_\mu(X) = \int_{\mathcal{B}} R_n(\vec{b})\, d\mu(\vec{b}) = \int_{\mathcal{B}} \text{CBAL}_{\vec{b}}(X)\, d\mu(\vec{b}). \qquad (20)$$

Notice that although the $\mu$-weighted algorithm adapts its portfolio online and gives more weight to the better performing (constant rebalanced) portfolios (see Equation (18)), its final return is simply a $\mu$-average of the returns of all constant rebalanced portfolios in $\mathcal{B}$.
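The telescoping argument behind Equation (20) can be checked numerically. The following is a minimal sketch, assuming a discrete measure concentrated on a finite grid of portfolios (so that the integrals in (18) become finite sums); the grid, the market sequence, and all function names are hypothetical and chosen only for illustration.

```python
from math import prod

def crp_wealth(b, X):
    """Compounded return R_i(b) of the constant rebalanced portfolio b on X."""
    return prod(sum(bj * xj for bj, xj in zip(b, x)) for x in X)

def mu_weighted_portfolio(portfolios, weights, history):
    """Equation (18) for a discrete measure: performance-weighted average portfolio."""
    perf = [w * crp_wealth(b, history) for b, w in zip(portfolios, weights)]
    total = sum(perf)
    m = len(portfolios[0])
    return [sum(p * b[j] for p, b in zip(perf, portfolios)) / total for j in range(m)]

def run_mu_weighted(portfolios, weights, X):
    """Trade online, period by period, and return the compounded wealth."""
    wealth = 1.0
    for i, x in enumerate(X):
        b_i = mu_weighted_portfolio(portfolios, weights, X[:i])
        wealth *= sum(bj * xj for bj, xj in zip(b_i, x))
    return wealth

# Hypothetical example: two securities, 11 grid portfolios, uniform weights.
grid = [(k / 10, 1 - k / 10) for k in range(11)]
weights = [1 / len(grid)] * len(grid)
X = [(1.0, 2.0), (1.0, 0.5), (1.0, 2.0), (1.0, 0.5)]

online = run_mu_weighted(grid, weights, X)
average = sum(w * crp_wealth(b, X) for b, w in zip(grid, weights))
print(online, average)  # the two values agree up to rounding
```

The online wealth and the weighted average of constant rebalanced returns coincide, as (20) predicts; a finer grid would approximate a continuous measure more closely.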

5.8.2 The Uniform-Weighted Algorithm. Denote by UNI the $\mu$-weighted algorithm corresponding to the uniform distribution over $\mathcal{B}$. Denote by CBAL-OPT the optimal offline constant rebalanced algorithm. That is, $\text{CBAL-OPT} = \text{CBAL}_{\vec{b}^*}$ where for any market sequence $X = \vec{x}_1, \ldots, \vec{x}_n$, the fixed portfolio $\vec{b}^*$ used by CBAL-OPT is

$$\vec{b}^* = \arg\max_{\vec{b}} \prod_{1 \le i \le n} \vec{b}^t \cdot \vec{x}_i .$$

Cover and Ordentlich show that with respect to the optimal offline constant rebalanced algorithm CBAL-OPT, and any market sequence $X$ of length $n$, the following holds.

$$\frac{\text{CBAL-OPT}(X)}{\text{UNI}(X)} \le \binom{n + m - 1}{m - 1} \le (n + 1)^{m-1}. \qquad (21)$$

Thus, in a restricted sense of competitiveness, algorithm UNI is $(n + 1)^{m-1}$-competitive (with respect to CBAL-OPT). Stated equivalently (but perhaps more astoundingly), using the preceding exponential growth-rate utility function and the minimax regret criterion, this result states that the minimax regret (of the exponential growth rates) of algorithm UNI is bounded above by $(m - 1)\log(n + 1)/n$. This means that the regret is diminishing when $n$ grows.
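To get a feeling for the magnitudes in bound (21), the two bounds and the corresponding regret bound can be tabulated. This is a minimal sketch; the function names are hypothetical, and the natural logarithm is an assumption (the base only rescales the regret).

```python
import math

def uni_ratio_bound(n, m):
    """Binomial bound of (21) on CBAL-OPT(X) / UNI(X)."""
    return math.comb(n + m - 1, m - 1)

def uni_simple_bound(n, m):
    """Looser closed-form bound (n + 1)^(m - 1) of (21)."""
    return (n + 1) ** (m - 1)

def uni_regret_bound(n, m):
    """Per-period regret bound (m - 1) log(n + 1) / n (natural log assumed)."""
    return (m - 1) * math.log(n + 1) / n

for n in (10, 100, 1000):
    print(n, uni_ratio_bound(n, 3), uni_simple_bound(n, 3), uni_regret_bound(n, 3))
```

The ratio bound grows polynomially in $n$, while the per-period regret bound shrinks toward zero.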

The proof of the bound (21) is interesting and is based on profound ideas. Somewhat unexpectedly, it is possible to transform the analysis into information-theoretic terms. Indeed, in proving this theorem we rely on some elementary results from information theory.

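We now sketch the proof of this bound. Let $[m] = \{1, 2, \ldots, m\}$. Denote by $J_n = (j_1, j_2, \ldots, j_n)$ any sequence of indices from $[m]$ (i.e., $J_n \in [m]^n$). Let $X = \vec{x}_1, \ldots, \vec{x}_n$ and let $\vec{b}$ be any portfolio. The analysis relies on the following trick that gives a different representation for the compounded return of the constant rebalanced algorithm $\text{CBAL}_{\vec{b}}$.

$$
\begin{aligned}
\text{CBAL}_{\vec{b}}(X) &= \prod_{1 \le i \le n} \vec{b}^t \cdot \vec{x}_i \\
&= \prod_{1 \le i \le n} \sum_{1 \le j \le m} b_j x_{ij} \\
&= \sum_{J_n \in [m]^n} \prod_{1 \le i \le n} b_{j_i} x_{i j_i}. \qquad (22)
\end{aligned}
$$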

Fix any market sequence $X$ of length $n$. Let $\vec{b}^*$ be the portfolio used by CBAL-OPT. Using the representation (22) for CBAL-OPT and UNI we have

$$\text{CBAL-OPT}(X) = \sum_{J_n \in [m]^n} \prod_{i=1}^{n} b^*_{j_i} x_{i j_i}; \qquad \text{UNI}(X) = \sum_{J_n \in [m]^n} \int_{\mathcal{B}} \prod_{i=1}^{n} b_{j_i} x_{i j_i}\, d\mu(\vec{b}),$$

where $\mu$ is a uniform probability measure over $\mathcal{B}$. The ratio of compound returns, $\text{CBAL-OPT}(X)/\text{UNI}(X)$, is now a ratio of two finite summations each involving $N = m^n$ nonnegative terms. It is not hard to see that if $a_1, \ldots, a_N \ge 0$, $b_1, \ldots, b_N \ge 0$, then

$$\frac{\sum_{1 \le i \le N} a_i}{\sum_{1 \le i \le N} b_i} \le \max_j \frac{a_j}{b_j}.$$

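Using this upper bound on the ratio of sums we have

$$
\begin{aligned}
\frac{\text{CBAL-OPT}(X)}{\text{UNI}(X)} &\le \max_{J_n \in [m]^n :\, \prod_i x_{i j_i} > 0} \frac{\prod_{i=1}^{n} b^*_{j_i} x_{i j_i}}{\int_{\mathcal{B}} \prod_{i=1}^{n} b_{j_i} x_{i j_i}\, d\mu(\vec{b})} \\
&= \max_{J_n \in [m]^n :\, \prod_i x_{i j_i} > 0} \frac{\prod_{i=1}^{n} b^*_{j_i}}{\int_{\mathcal{B}} \prod_{i=1}^{n} b_{j_i}\, d\mu(\vec{b})} \\
&\le \max_{J_n \in [m]^n} \frac{\prod_{i=1}^{n} b^*_{j_i}}{\int_{\mathcal{B}} \prod_{i=1}^{n} b_{j_i}\, d\mu(\vec{b})}. \qquad (23)
\end{aligned}
$$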

The right-hand side of (23) can now be upper bounded as follows using the “method of types” of information theory (see Cover and Thomas [1991], Chapter 12). For each vector $J_n$ and $r \in [m]$, denote by $n_r(J_n)$ the number of occurrences of $r$ in $J_n$ and by $\nu_r(J_n) = n_r(J_n)/n$ the proportion of these occurrences (i.e., the number of occurrences divided by $n$). In this way, the distribution $D(J_n) = (\nu_1(J_n), \nu_2(J_n), \ldots, \nu_m(J_n))$ specifies the “type” of the sequence $J_n$. Theorem 12.1.2 in Cover and Thomas [1991] then states that for any distribution (portfolio) $\vec{b}$ and any such $J_n$,

$$\prod_{1 \le i \le n} b_{j_i} \le \frac{1}{2^{nH(D(J_n))}},$$

where $H(D(J_n))$ is the Shannon entropy of $D(J_n)$ (i.e., $H(D(J_n)) = -\sum_i \nu_i(J_n) \log \nu_i(J_n)$). Equality is obtained at $\vec{b} = D(J_n)$. This result upper bounds the numerator of (23). The integral in the denominator can be expressed in a closed form as

$$\int_{\mathcal{B}} \prod_{1 \le i \le n} b_{j_i}\, d\mu(\vec{b}) = 1 \Big/ \left( \binom{n + m - 1}{m - 1} T(J_n) \right),$$

where $T(J_n)$ is the number of sequences of the type $D(J_n)$. Note that $\binom{n + m - 1}{m - 1}$ is the number of types $D(J_n)$. Theorem 12.1.3 in Cover and Thomas [1991] states that $T(J_n) \le 2^{nH(D(J_n))}$. From these bounds it readily follows that

$$\frac{\text{CBAL-OPT}(X)}{\text{UNI}(X)} \le \binom{n + m - 1}{m - 1},$$

and the upper bound $(n + 1)^{m-1}$ on the right-hand side follows from Theorem 12.1.1 in Cover and Thomas [1991].

5.8.3 Implementation of the Uniform-Weighted Algorithm. Cover and Ordentlich [1996] obtain recursive procedures to compute the preceding algorithm but the time and space requirements of this procedure grow as $n^{m-1}$ so it can be applied only to a small number of securities. Blum and Kalai [1997] offer the following randomized approximation to the uniform-weighted algorithm. As we see in (20), the algorithm UNI’s return is simply a uniform-weighted average of all constant rebalanced algorithms. This fact suggests the following approximation: among all portfolios choose $K$ uniformly at random. Then divide and invest the initial wealth equally between all $K$ chosen portfolios. Using standard tail bounds (e.g., Chebyshev’s inequality), it is not hard to show that using $K = (R - 1)/\varepsilon\delta^2$, where $R$ is the (empirical) “competitive ratio” of UNI (with respect to the optimum constant rebalanced portfolio), we can guarantee with probability at least $1 - \delta$ a return that is at least as large as $1 - \varepsilon$ times the return of algorithm UNI. Although in the worst case $R$ grows as $n^{m-1}$, experiments on historical stock-market data [Cover 1991; Helmbold et al. 1996] suggest that $R$ is typically much lower. Therefore, in using this approximation when $m$ is large, one should estimate $R$ empirically and use this algorithm when $R$ is sufficiently small.
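The randomized approximation described above can be sketched in a few lines. This is an illustrative sketch only: it assumes that normalizing independent exponential variates yields a uniformly distributed point of the simplex, it reads the sample size as $K = (R - 1)/(\varepsilon\delta^2)$ exactly as stated in the text, and all function names and parameter values are hypothetical.

```python
import math
import random

def crp_wealth(b, X):
    """Compounded return of the constant rebalanced portfolio b on X."""
    return math.prod(sum(bj * xj for bj, xj in zip(b, x)) for x in X)

def random_portfolio(m, rng):
    """Uniform sample from the simplex via normalized exponential variates."""
    e = [rng.expovariate(1.0) for _ in range(m)]
    s = sum(e)
    return [v / s for v in e]

def sample_size(R, eps, delta):
    """Number of portfolios K = (R - 1) / (eps * delta^2), as stated in the text."""
    return math.ceil((R - 1) / (eps * delta ** 2))

def sampled_uni_wealth(X, K, seed=0):
    """Split the initial wealth equally among K random constant rebalanced portfolios."""
    rng = random.Random(seed)
    m = len(X[0])
    return sum(crp_wealth(random_portfolio(m, rng), X) for _ in range(K)) / K

# Hypothetical three-security market over four periods.
X = [(1.1, 0.9, 1.0), (0.8, 1.3, 1.0), (1.2, 0.7, 1.0), (0.9, 1.2, 1.0)]
K = sample_size(R=2.0, eps=0.1, delta=0.5)
print(K, sampled_uni_wealth(X, K))
```

Because each sampled portfolio is run independently, the memory needed is proportional to $K m$ rather than to $n^{m-1}$.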

5.8.4 Dirichlet-Weighted Algorithms. We now consider $\mu$-weighted algorithms where $\mu$ is a Dirichlet distribution. A random vector $\vec{b} = (b_1, b_2, \ldots, b_m)$ has a Dirichlet distribution (sometimes called the multinomial beta distribution), denoted Dirichlet($\vec{\alpha}$), with parametric vector $\vec{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_m)$, $\alpha_i > 0$, if the probability density function of $\vec{b}$ with respect to $\vec{\alpha}$, $f_{\vec{\alpha}}(\vec{b})$, satisfies the following conditions. At any point $\vec{b} \in \mathbb{R}^m$ with $b_i \ge 0$, $\sum_i b_i = 1$,

$$f_{\vec{\alpha}}(\vec{b}) = \frac{\Gamma(\alpha_1 + \cdots + \alpha_m)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_m)}\, b_1^{\alpha_1 - 1} \cdots b_m^{\alpha_m - 1},$$

where $\Gamma$ is the gamma function.$^{14}$ At any other point $\vec{b}' \in \mathbb{R}^m$, $f_{\vec{\alpha}}(\vec{b}') = 0$. Thus $f_{\vec{\alpha}}(\vec{b})$ is positive only when the vector $\vec{b}$ is a distribution function, $\sum_i b_i = 1$. Hence, despite its appearance, $f_{\vec{\alpha}}(\vec{b})$ is not an $m$-dimensional p.d.f. but rather gives the joint p.d.f. of any $(m - 1)$-subset of the $m$ random variables $b_1, b_2, \ldots, b_m$ (and one of them is set by the relation $\sum_i b_i = 1$). Notice that the Dirichlet$(1, 1, \ldots, 1)$ density is the uniform density over the $(m - 1)$-dimensional simplex. Also, we note that the symmetric Dirichlet$(\alpha, \alpha, \ldots, \alpha)$ density with $\alpha < 1$ is strictly convex (with poles at the unit vectors). This means that if it is used by a $\mu$-weighted algorithm it gives more weight to “extremal” portfolios that invest in a small number of securities.
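The density above is easy to evaluate directly, which makes the two remarks about the uniform case and the weight on extremal portfolios concrete. The following is a minimal sketch; the function name and the tolerance used to test membership in the simplex are hypothetical, and points with a zero coordinate are assumed not to be evaluated when the matching parameter is below 1 (where the density has a pole).

```python
import math

def dirichlet_density(b, alpha, tol=1e-12):
    """Dirichlet(alpha) density at b; zero outside the simplex."""
    if any(x < 0 for x in b) or abs(sum(b) - 1.0) > tol:
        return 0.0
    norm = math.gamma(sum(alpha)) / math.prod(math.gamma(a) for a in alpha)
    return norm * math.prod(x ** (a - 1) for x, a in zip(b, alpha))

# Dirichlet(1, 1, 1) is constant on the simplex.
print(dirichlet_density((0.2, 0.3, 0.5), (1, 1, 1)))
# Dirichlet(1/2, 1/2) favors portfolios close to a single security.
print(dirichlet_density((0.5, 0.5), (0.5, 0.5)),
      dirichlet_density((0.01, 0.99), (0.5, 0.5)))
```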

$^{14}$ The gamma function is $\Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\, dt$. It can be shown that $\Gamma(1) = 1$ and that $\Gamma(x + 1) = x\Gamma(x)$. Thus if $n$ is an integer, $n \ge 1$, $\Gamma(n + 1) = n!$.

Denote by $\text{DIR}_{\vec{\alpha}}$ the $\mu$-weighted portfolio selection algorithm, with $\mu$ being the symmetric Dirichlet($\vec{\alpha}$). Thus, $\text{DIR}_{(1,1,\ldots,1)}$ is exactly algorithm UNI. For the Dirichlet$(\frac{1}{2}, \ldots, \frac{1}{2})$-weighted algorithm, Cover and Ordentlich [1996] show that for any market sequence $X$,

$$\frac{\text{CBAL-OPT}(X)}{\text{DIR}_{(1/2,1/2,\ldots,1/2)}(X)} \le \frac{\Gamma(1/2)\,\Gamma(n + m/2)}{\Gamma(m/2)\,\Gamma(n + 1/2)} \le 2(n + 1)^{(m-1)/2}.$$

Thus $\text{DIR}_{(1/2,1/2,\ldots,1/2)}$ is $(2(n + 1)^{(m-1)/2})$-competitive with respect to CBAL-OPT. Here again, stated in terms of the exponential growth-rate utility function and the minimax regret criterion, this bound implies that the minimax regret (of the exponential growth rates) of algorithm DIR is bounded above by $((m - 1)\log(2n + 2))/2n$.
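The gamma-function bound for the Dirichlet$(\frac{1}{2}, \ldots, \frac{1}{2})$ algorithm can be compared numerically with its simpler closed-form relaxation. The sketch below works with log-gamma values to avoid overflow for large $n$; the function names are hypothetical.

```python
import math

def dir_half_ratio_bound(n, m):
    """Gamma(1/2) Gamma(n + m/2) / (Gamma(m/2) Gamma(n + 1/2)), via log-gamma."""
    log_value = (math.lgamma(0.5) + math.lgamma(n + m / 2)
                 - math.lgamma(m / 2) - math.lgamma(n + 0.5))
    return math.exp(log_value)

def dir_half_simple_bound(n, m):
    """Closed-form relaxation 2 (n + 1)^((m - 1) / 2)."""
    return 2 * (n + 1) ** ((m - 1) / 2)

for n in (10, 100, 1000):
    print(n, dir_half_ratio_bound(n, 3), dir_half_simple_bound(n, 3))
```

For the same $n$ and $m$, both values are of order $n^{(m-1)/2}$, compared with order $n^{m-1}$ for the uniform-weighted bound.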

This result for algorithm DIR can be proved following the same line as the proof of the bound for algorithm UNI but it is somewhat more technical (since the integrals involved are more complex).

5.8.5 Extremal Mixture Algorithms. Equation (22) suggests the following interpretation. Fix some $J_n = (j_1, j_2, \ldots, j_n) \in [m]^n$ and consider the “extremal” algorithm, denoted $\text{EXT}_{J_n}$, that at time $i$ invests the entire wealth in the $j_i$th security. Let $\vec{b} = (b_1, \ldots, b_m)$ and consider the constant rebalanced algorithm $\text{CBAL}_{\vec{b}}$. It is possible to represent $\text{CBAL}_{\vec{b}}$ in terms of extremal algorithms as follows. Partition $\text{CBAL}_{\vec{b}}$’s initial wealth into $m^n$ portions, one for each of the $m^n$ sequences $J_n$ in $[m]^n$. The number of dollars in the part corresponding to $J_n$ is $w(J_n) = \prod_{i=1}^{n} b_{j_i}$. Notice that

$$\sum_{J_n \in [m]^n} w(J_n) = 1.$$

That is, $\{w(J_n)\}$ is a probability distribution (and a proper partition of the initial wealth). For each of the extremal algorithms $\text{EXT}_{J_n}$, maintain a separate investment “account” starting with an initial wealth of $w(J_n)$. For a market sequence $X$, the wealth accrued by $\text{EXT}_{J_n}$ is thus

$$\text{EXT}_{J_n}(X) = w(J_n) \prod_{i=1}^{n} x_{i j_i} = \prod_{i=1}^{n} b_{j_i} x_{i j_i}.$$

Hence, by (22) we see that the sum of wealth accrued by all the extremal algorithms is exactly $\text{CBAL}_{\vec{b}}(X)$.
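For small $m$ and $n$ the extremal decomposition can be verified by brute-force enumeration of all $m^n$ index sequences. The following is a minimal sketch with a hypothetical portfolio and market sequence; the function names are illustrative only.

```python
from itertools import product
from math import prod

def crp_wealth(b, X):
    """Compounded return of the constant rebalanced portfolio b on X."""
    return prod(sum(bj * xj for bj, xj in zip(b, x)) for x in X)

def extremal_accounts(b, X):
    """Map each index sequence J to (initial wealth w(J), final wealth of EXT_J)."""
    m, n = len(b), len(X)
    accounts = {}
    for J in product(range(m), repeat=n):
        w = prod(b[j] for j in J)                   # initial dollars assigned to EXT_J
        growth = prod(X[i][J[i]] for i in range(n)) # return of holding security j_i at time i
        accounts[J] = (w, w * growth)
    return accounts

b = (0.3, 0.7)
X = [(1.2, 0.9), (0.8, 1.1), (1.5, 1.0)]
accounts = extremal_accounts(b, X)
print(sum(w for w, _ in accounts.values()))      # total initial wealth
print(sum(f for _, f in accounts.values()), crp_wealth(b, X))
```

The initial weights sum to one and the final account balances sum to the constant rebalanced return, in agreement with representation (22).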

Exactly as for the one-way (and search) algorithms (see Section 3), we can interpret (and apply) this algorithm as a (mixed) randomized algorithm. Simply choose the extremal algorithm $\text{EXT}_{J_n}$ with probability $w(J_n)$. In this case, of course, $\text{CBAL}_{\vec{b}}(X)$ will be the expected return of this algorithm. From a computational point of view, operating the extremal mixture algorithm randomly saves most of the large memory required for operating the deterministic algorithm.

Using Equation (20) we can interpret algorithms UNI and DIR (and any $\mu$-weighted algorithm) analogously. All that is needed is to partition the initial wealth so that algorithm $\text{EXT}_{J_n}$ is assigned

$$w(J_n) = \int_{\mathcal{B}} \prod_{i=1}^{n} b_{j_i}\, d\mu(\vec{b})$$

(with $\mu$ being the corresponding probability measure).

The preceding representation (interpretation) of the constant rebalanced (and $\mu$-weighted) algorithms gives rise to the following general class of “extremal mixture” algorithms. For each $n$ the extremal mixture strategy is specified by a probability distribution $w$ over the set $\{J_n\}$ and invests in the extremal algorithm $\text{EXT}_{J_n}$ a fraction $w(J_n)$ of its initial wealth. From the previous discussion it follows that for each $n$, an optimal online extremal mixture algorithm performs at least as well as the best $\mu$-weighted algorithm.

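For each $n$ let $\text{OEXT}_n$ be the extremal mixture algorithm specified by the following probability distribution over the extremal algorithms. For each $J_n \in [m]^n$, set $q(J_n) = \prod_{i=1}^{m} n_i(J_n)^{n_i(J_n)}$, where $n_i(J_n)$ is the number of occurrences of $i$ in $J_n$. The distribution $w$ is then

$$w(J_n) = \frac{q(J_n)}{\sum_{J_n \in [m]^n} q(J_n)}.$$

Ordentlich and Cover [1996] prove that algorithm $\text{OEXT}_n$ is an optimal extremal mixture algorithm for each market sequence $X$ of length $n$. Furthermore, it is shown that for each such $X$,

$$\frac{\text{CBAL-OPT}(X)}{\text{OEXT}_n(X)} \le \sum_{\substack{n_1, \ldots, n_m \ge 0 \\ n_1 + \cdots + n_m = n}} \frac{n!}{\prod_{1 \le i \le m} n_i!} \prod_{1 \le i \le m} \left(\frac{n_i}{n}\right)^{n_i}, \qquad (24)$$

where the sum ranges over all types (each multinomial coefficient counts the sequences $J_n$ of the corresponding type), and in the worst case this bound is tight.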

Given the equivalence of deterministic and (mixed) randomized extremal mixture algorithms, the ratio (24) is a lower bound on the competitive ratio of any randomized extremal mixture algorithm (and $\mu$-weighted algorithm) against an oblivious adversary (with respect to CBAL-OPT).

For the case $m = 2$ this competitive ratio (24) of $\text{OEXT}_n$ (with respect to CBAL) reduces to

$$\sum_{0 \le k \le n} \binom{n}{k} \cdot \left(\frac{k}{n}\right)^{k} \cdot \left(\frac{n - k}{n}\right)^{n-k},$$

and using Stirling approximation can be shown to be approximately $\sqrt{n\pi/2} \approx 1.253\sqrt{n}$. Note that the competitive ratio of the $\text{DIR}_{(1/2,\ldots,1/2)}$ algorithm for $m = 2$ is no larger than $2\sqrt{n + 1}$, so the Dirichlet algorithm attains a competitive ratio that is within a factor of (approximately, for large $n$) 1.6 of algorithm $\text{OEXT}_n$.
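The $m = 2$ sum can be evaluated exactly and compared with the Stirling estimate $\sqrt{n\pi/2}$ and with the Dirichlet bound $2\sqrt{n + 1}$. The sketch below is illustrative; the function name is hypothetical, and each term is computed in the log domain (with the convention $0^0 = 1$) to stay within floating-point range.

```python
import math

def oext_ratio_m2(n):
    """Sum over k of C(n, k) (k/n)^k ((n-k)/n)^(n-k), for n >= 1."""
    total = 0.0
    for k in range(n + 1):
        log_term = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        if k > 0:
            log_term += k * math.log(k / n)
        if k < n:
            log_term += (n - k) * math.log((n - k) / n)
        total += math.exp(log_term)
    return total

for n in (10, 100, 1000):
    exact = oext_ratio_m2(n)
    print(n, exact, math.sqrt(n * math.pi / 2), 2 * math.sqrt(n + 1))
```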

5.8.6 Portfolio Selection with Side Information. Cover and Ordentlich [1996] extended the preceding portfolio-selection model to incorporate possible “side information” that the trader may have. Typically such “side information” may be based on various kinds of predictions of future values of market vectors. To model side information, we consider an “oracle” that announces a number $\text{info}_i \in \Sigma = \{1, 2, \ldots, k\}$ at the start of the $i$th trading period. The number announced represents an abstract state of some prediction apparatus. For example, in the best case $\text{info}_i$ identifies the best security in the $i$th period. Nevertheless, we assume that the online player does not know in advance the quality of the side information provided to her and, moreover, that the quality of the side information may vary during the trading period. Thus in order to benefit from such side information the trader must learn its quality during the trading period. Formally, with respect to an $n$-period trading game we define side information as any sequence $\text{info}_1, \text{info}_2, \ldots, \text{info}_n$ with $\text{info}_i \in \Sigma$.

Let $X = \vec{x}_1, \ldots, \vec{x}_n$ be any market sequence and $I = \text{info}_1, \text{info}_2, \ldots, \text{info}_n$ any side information. As usual, denote by $I_i$ the prefix of $I$ consisting of the first $i$ elements. We denote by $\text{ALG}(X|I)$ the return of the algorithm ALG with respect to $X$ given the side information $I$. Let $\vec{b}$ be any portfolio and $\ell \in \Sigma$. With respect to the side information $I_i$ and market sequence $X_i$, define

$$R_i(\vec{b}|\ell) = \prod_{j \le i :\, \text{info}_j = \ell} \vec{b}^t \vec{x}_j .$$

That is, $R_i(\vec{b}|\ell)$ is the compounded return of a constant rebalanced algorithm that is out of the market at all periods where the side information does not equal $\ell$ and is fully invested, using the portfolio $\vec{b}$, at all other periods.

Let $\mu$ be any probability measure. Using $R_i(\vec{b}|\cdot)$, we now define the $\mu$-weighted algorithm with side information as the algorithm that at the $i$th period uses the portfolio

$$\vec{b}_i(\text{info}_i) = \frac{\int_{\mathcal{B}} \vec{b}\, R_{i-1}(\vec{b}|\text{info}_i)\, d\mu(\vec{b})}{\int_{\mathcal{B}} R_{i-1}(\vec{b}|\text{info}_i)\, d\mu(\vec{b})}.$$

This is a generalization of the definition of the $\mu$-weighted algorithm. Clearly, when $k = |\Sigma| = 1$ (i.e., no side information) this definition reduces to (18).

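In a way analogous to the simplification obtained by formula (19), we now simplify the expression for the compounded return $\text{ALG}_\mu(X|I)$:

$$
\begin{aligned}
\text{ALG}_\mu(X|I) &= \prod_{i=1}^{n} \vec{b}_i^t(\text{info}_i)\, \vec{x}_i \\
&= \prod_{j=1}^{k}\ \prod_{i \le n :\, \text{info}_i = j} \vec{b}_i^t(j)\, \vec{x}_i \\
&= \prod_{j=1}^{k} \int_{\mathcal{B}} R_n(\vec{b}|j)\, d\mu(\vec{b}). \qquad (25)
\end{aligned}
$$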
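Identity (25) says that the algorithm with side information behaves like $k$ independent copies of the $\mu$-weighted algorithm, one per state. A minimal numerical check follows, again assuming a discrete measure on a hypothetical grid of portfolios; the side-information sequence and all names are illustrative.

```python
from math import prod

def dot(b, x):
    return sum(bj * xj for bj, xj in zip(b, x))

def state_wealth(b, X, I, state):
    """R_i(b | state): compound b only over periods whose side information equals state."""
    return prod(dot(b, x) for x, s in zip(X, I) if s == state)

def portfolio_with_side_info(portfolios, weights, X_past, I_past, state):
    """Performance-weighted portfolio using only past periods labeled with `state`."""
    perf = [w * state_wealth(b, X_past, I_past, state) for b, w in zip(portfolios, weights)]
    total = sum(perf)
    m = len(portfolios[0])
    return [sum(p * b[j] for p, b in zip(perf, portfolios)) / total for j in range(m)]

def run_with_side_info(portfolios, weights, X, I):
    wealth = 1.0
    for i, (x, s) in enumerate(zip(X, I)):
        b_i = portfolio_with_side_info(portfolios, weights, X[:i], I[:i], s)
        wealth *= dot(b_i, x)
    return wealth

grid = [(k / 10, 1 - k / 10) for k in range(11)]
weights = [1 / len(grid)] * len(grid)
X = [(1.0, 2.0), (1.0, 0.5), (1.0, 2.0), (1.0, 0.5), (1.0, 2.0)]
I = [1, 2, 1, 2, 1]  # hypothetical oracle: state 1 precedes a rise of the second security

online = run_with_side_info(grid, weights, X, I)
factored = prod(sum(w * state_wealth(b, X, I, s) for b, w in zip(grid, weights))
                for s in sorted(set(I)))
print(online, factored)  # agree up to rounding, as in (25)
```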

Although it is probably impossible to determine the performance ratio of the $\mu$-weighted algorithm with side information with respect to the optimal offline constant rebalanced algorithm (it is heavily dependent on the side information), it is possible to determine the performance ratio with respect to a more powerful offline algorithm that is also dependent on the side information. The advantage of this approach is that the dependence on the side information will “factor out” and we shall obtain a performance ratio that is dependent only on the cardinality of $\Sigma$.

Fix $\Sigma$. Let $B : \Sigma \to \mathcal{B}$ be any mapping from side information to portfolios. Define the state-constant rebalanced algorithm (with mapping $B$), denoted $\text{SCBAL}_B$, as the algorithm that at each period uses one of the portfolios $B(1), B(2), \ldots, B(k)$. Specifically, the algorithm invests according to the portfolio $B(\text{info}_i)$ during the $i$th trading period (for which the side information is $\text{info}_i$). The return of algorithm $\text{SCBAL}_B$ for a market sequence $X$ and side information sequence $I$ is thus given by

$$\text{SCBAL}_B(X|I) = \prod_{i=1}^{n} B^t(\text{info}_i) \cdot \vec{x}_i .$$

The optimal offline state-constant rebalanced algorithm, denoted SCBAL-OPT, is a state-constant rebalanced algorithm that optimizes its choice of the mapping $B$ based on advance knowledge of the market sequence $X$ and the side information $I$. That is, this algorithm uses the mapping $B^*$, where

$$B^* = \arg\max_{B} \prod_{i=1}^{n} B^t(\text{info}_i) \cdot \vec{x}_i .$$

Denote by $\text{UNI}(X|I)$ and $\text{SCBAL-OPT}(X|I)$ the compounded returns of the online uniform-weighted algorithm and the optimal offline state-constant rebalanced algorithm, respectively, for a market sequence $X$ given side information $I$. For each $j = 1, 2, \ldots, k$, set $n_j(I)$ to be the number of $j$s in $I$. Now for each market sequence $X$ and side information $I$,

$$\frac{\text{SCBAL-OPT}(X|I)}{\text{UNI}(X|I)} \le \prod_{j=1}^{k} (n_j(I) + 1)^{(m-1)} \le (n + 1)^{k(m-1)}.$$

To prove this bound, for each $j = 1, 2, \ldots, k$, denote by $X(j)$ the subsequence of $X$, $\vec{x}_{j_1}, \vec{x}_{j_2}, \ldots, \vec{x}_{j_l}$, where $\text{info}_{j_r} = j$ for all $1 \le r \le l$. By the definition of SCBAL-OPT and Equation (25) we have

$$\frac{\text{SCBAL-OPT}(X|I)}{\text{UNI}(X|I)} = \prod_{j=1}^{k} \frac{\text{SCBAL}_{B^*}(X(j))}{\text{UNI}(X(j))}.$$

Hence, by the known bound for algorithm UNI (21), applied with the sequences $X(j)$, we have

$$\prod_{j=1}^{k} \frac{\text{SCBAL}_{B^*}(X(j))}{\text{UNI}(X(j))} \le \prod_{j=1}^{k} (n_j(I) + 1)^{(m-1)} \le \prod_{j=1}^{k} (n + 1)^{(m-1)}.$$



With respect to a market sequence $X$ and side information sequence $I$, denote by $\text{DIR}_{(1/2,\ldots,1/2)}(X|I)$ the compounded return of the Dirichlet$(\frac{1}{2}, \ldots, \frac{1}{2})$ algorithm with side information. A derivation similar to the preceding (for algorithm UNI) proves the following result. For each market sequence $X$ and side information $I$,

$$\frac{\text{SCBAL-OPT}(X|I)}{\text{DIR}_{(1/2,\ldots,1/2)}(X|I)} \le \prod_{j=1}^{k} (n_j(I) + 1)^{(m-1)/2} \le 2^k (n + 1)^{k(m-1)/2}.$$

5.8.7 Portfolio Selection with Transaction Costs. All the analyses presented so far completely ignore transaction costs. We now briefly discuss a recent result of Blum and Kalai [1997] giving the competitiveness of the uniform weighted algorithm (algorithm UNI of Section 5.8.2) for a model with transaction costs.

We focus on the following, more common, proportional transaction-cost model.$^{15}$ In this model we charge a fixed percentage $\gamma$ of each amount transacted. We call $\gamma$ the commission rate.

Consider the behavior of a constant rebalanced algorithm $\text{CBAL}_{\vec{b}}$ in a market with commission rate $\gamma$. As before, $\text{CBAL}_{\vec{b}}$ rebalances its holdings so that its portfolio is $\vec{b}$-balanced at the start of each trading period. Let $\vec{b} = (b_1, \ldots, b_m) \ne \vec{b}' = (b'_1, \ldots, b'_m)$. If at the end of some period the portfolio of $\text{CBAL}_{\vec{b}}$ is $\vec{b}'$-balanced, then, in order to rebalance, for all $j = 1, 2, \ldots, m$, $\text{CBAL}_{\vec{b}}$ moves a fraction $|b'_j - b_j|$ of its wealth in or out of the $j$th security. It follows that the total (per dollar) transaction fee to rebalance is $\gamma \sum_{j=1}^{m} |b'_j - b_j|$.

Let $\text{UNI}_\gamma$ be the uniform-weighted algorithm that pays commissions at rate $\gamma$. Blum and Kalai proved the surprising result that in a market with commission rate $\gamma$, for any market sequence $X$ of length $n$,

$$\frac{\text{CBAL-OPT}(X)}{\text{UNI}_\gamma(X)} \le \binom{(1 + 2\gamma)n + m - 1}{m - 1} \le ((1 + 2\gamma)n + 1)^{m-1}. \qquad (26)$$

Starting with the representation (20), the proof of this result, given by Blum and Kalai, is different and somewhat more direct than Cover and Ordentlich’s proof (presented in Section 5.8.2).
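The per-dollar rebalancing fee $\gamma \sum_j |b'_j - b_j|$ of the proportional model can be computed as follows. This is a sketch under one added assumption that is not spelled out in the text: the end-of-period portfolio $\vec{b}'$ is taken to be the target portfolio after it has drifted with the period's price relatives. The function names and numbers are hypothetical.

```python
def drifted_portfolio(b, x):
    """Portfolio b' held at the end of a period with price relatives x (assumed drift rule)."""
    total = sum(bj * xj for bj, xj in zip(b, x))
    return [bj * xj / total for bj, xj in zip(b, x)]

def rebalancing_fee(b_target, b_current, gamma):
    """Per-dollar fee gamma * sum_j |b'_j - b_j| for moving from b_current back to b_target."""
    return gamma * sum(abs(c - t) for c, t in zip(b_current, b_target))

def simple_commission_bound(n, m, gamma):
    """Right-hand side of (26): ((1 + 2 gamma) n + 1)^(m - 1)."""
    return ((1 + 2 * gamma) * n + 1) ** (m - 1)

b = (0.5, 0.5)
b_end = drifted_portfolio(b, (2.0, 1.0))  # the first security doubled
print(b_end, rebalancing_fee(b, b_end, gamma=0.03))
print(simple_commission_bound(n=100, m=3, gamma=0.03))
```

With a commission rate of zero the bound reduces to $(n + 1)^{m-1}$, the bound (21) for the commission-free model.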

5.8.8 Other Results. There are quite a few other works concerning online portfolio selection that are not discussed in this survey. Here we briefly mention several other known results. Cover and Gluss [1986] consider a model where the set of possible market vectors of price relatives is finite, say, with cardinality $k$. Using the game-theoretic approachability-excludability theorem of Blackwell [1956], they obtain an online portfolio-selection algorithm whose exponential growth rate approaches that of the optimal offline constant rebalanced algorithm at (convergence) rate $(L\sqrt{k} + 1)\sqrt{2L^2 + 1}/\sqrt{n}$, where $L$ is a bound on the logarithm of the maximum price relative and $n$ is the length of the game.$^{16}$ Thus in the long run it is possible in this model to almost track the performance of the optimal offline constant rebalanced algorithm and in particular to outperform the optimal offline BAH strategy.

Cover and Ordentlich’s results regarding the $\mu$-weighted algorithms are a generalization of a result by Cover [1991] that the exponential growth rate of the uniform-weighted portfolio approaches, as $n$ grows, the exponential growth rate of the optimal offline constant rebalanced portfolio. This result assumes that all price relatives are bounded away from zero. In the same paper Cover presents a more refined analysis and bounds the competitive ratio of the uniform-weighted algorithm (with respect to CBAL) in terms of a sensitivity matrix $A$ measuring the “empirical volatility” (defined in the paper) of prices exhibited in the market sequence presented to the algorithm. In particular, the upper bound on the competitive ratio is $\sqrt{|A|}\,(n/(2\pi))^{(m-1)/2}/(m - 1)!$. The value of the determinant $|A|$ can be bounded by a constant if the prices of all securities are sufficiently volatile so that CBAL gives positive weights to all securities. Cover also provides some interesting experimental results demonstrating the performance of this algorithm. Jamshidian [1992] analyzed the performance of the uniform-weighted algorithm in a continuous-time model. The results obtained are similar to those obtained by Cover [1991].

$^{15}$ The alternative model of fixed transaction costs has not been studied much.

$^{16}$ Similar results can be obtained using the results of the recent paper by Hart and Mas-Colell [1997].

Following Cover and Ordentlich, Helmbold et al. [1996] presented an online portfolio selection algorithm based on techniques from statistical inference. This algorithm chooses the next $(i + 1)$st portfolio, $\vec{b}_{i+1}$, to maximize $\eta \log(\vec{b}_{i+1}^t \cdot \vec{x}_i) - D(\vec{b}_{i+1} \| \vec{b}_i)$, where $\eta$ is some constant that determines the adaptation rate and $D(\cdot \| \cdot)$ is the Kullback–Leibler dissimilarity measure.$^{17}$ The algorithm thus adapts to the recent history while keeping its next portfolio similar to the previous portfolio. Helmbold et al. present several computationally efficient approximation variants of this algorithm for the cases where $n$ is known or unknown and for a model with side information. All their bounds are comparable but inferior to the respective bounds obtained by Cover and Ordentlich. Nevertheless, it is shown that this algorithm empirically outperforms the uniform-weighted and Dirichlet-weighted algorithms on data sets consisting of several NYSE stock price sequences.$^{18}$ Given this empirical evidence and the fact that the time and space requirements of an approximation of this algorithm grow linearly in the number of stocks, this algorithm seems very attractive, despite its inferior theoretical guarantees.
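The approximation variants of Helmbold et al. are not reproduced in this survey. As an illustration only, the sketch below assumes the multiplicative “exponentiated gradient” form that results from replacing the logarithmic term of the objective by its first-order approximation around the current portfolio; the function name and the value of the learning rate are hypothetical.

```python
import math

def eg_update(b, x, eta):
    """One multiplicative update: b_j <- b_j * exp(eta * x_j / (b . x)), then normalize."""
    period_return = sum(bj * xj for bj, xj in zip(b, x))
    unnormalized = [bj * math.exp(eta * xj / period_return) for bj, xj in zip(b, x)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

# Hypothetical run: two securities, the first one repeatedly outperforms.
b = [0.5, 0.5]
for x in [(1.2, 0.9), (1.1, 1.0), (1.3, 0.8)]:
    b = eg_update(b, x, eta=0.05)
    print(b)
```

Each update costs time and memory linear in the number of securities, which is consistent with the linear growth mentioned above.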

Blum and Kalai [1997] report on experiments with the data sets of Cover (and Helmbold et al.) to test the performance of the uniform-weighted algorithm with transaction costs. On all data sets checked, the uniform-weighted algorithm gave inferior returns to the optimal offline constant rebalanced algorithm but still beat the market (i.e., the average stock) when the commission factor was less than 2%. On these experiments the uniform-weighted algorithm rebalanced monthly (thus ignoring most price ticks that occurred daily). Not surprisingly, it was found that rebalancing less frequently is beneficial when commission factors are high.

$^{17}$ For two $m$-ary probability vectors $\vec{u}$ and $\vec{v}$, $D(\vec{u} \| \vec{v}) = \sum_{i=1}^{m} u_i \log (u_i/v_i)$.

$^{18}$ These data sets were first used by Cover [1991] to evaluate the uniform-weighted algorithm.

Awerbuch et al. [1996] consider the following setting. The online player wishes to invest his entire wealth in one of the $m$ securities, hoping that the chosen security will be a “winner” that yields high dividends. The decision is irreversible and after the player has chosen one security the game is essentially over. If $D$ is the a posteriori dividend return of the best security, then the optimal expected return of the player is trivially $\Theta(D/m)$. Now consider a game such that in each period, each of the securities may issue a dividend of exactly \$1. Under the assumptions that the yield of the best security is $D \ge 3 \log m$ and that $D$ is known to the online player, Awerbuch et al. provide a selection strategy obtaining a return of at least $D/3 \log m$ with probability at least $1 - (3 \log m)/D - 2/m$. The strategy is the following. For each security $j$, $j = 1, 2, \ldots, m$, set $d(j, i)$ to be the cumulative number of dividends issued by the $j$th security within the first $i$ periods. At the $(i + 1)$st period choose the $j$th security with probability $m^{3d(j,i)/D - 2}$. This basic result is extended for multiple choices of securities. Finally, in the case where $D$ is not known in advance, the player can retain the same yield but the probability of success is decreased.

Auer et al. [1995] consider the following adversarial variant of the multi-armed bandit problem. The online player must repeatedly choose one among $m$ slot machines. The player’s goal is to maximize the total reward in a sequence of $n$ trials (no sampling costs are assumed). Using the minimax regret adversary, who before each round selects an $m$-ary vector of the current rewards, and assuming that all individual rewards are in $[a, b]$, Auer et al. prove that the expected regret (obtained by a randomized algorithm) is at most $O((b - a)n^{2/3}(m \log m)^{1/3})$. They also prove a lower bound of $\Omega(\sqrt{nm})$ on the regret of any algorithm.

6. COMPETITIVE RISK MANAGEMENT

One aspect that distinguishes financial decision making under uncertainty is the need of financial players to understand and control the risk inherent in their activities. This requirement to manage risk is common to all financial applications regardless of the markets involved, the instruments traded, and strategies employed. To manage the risk associated with decision making under uncertainty means to control the inherent tradeoff between risk and reward. The premise here is that investors require more potential reward for higher levels of exposure to risk.

The definitions of risk and reward and the interplay between them depend, first of all, upon the particular optimization model used. When we fix our optimization model, which typically fixes the definition of reward, there are quite a few reasonable definitions for risk. In general, it is widely accepted that risk should measure something like the “exposure” to (chance of) loss (see, e.g., MacCrimmon and Wehrung [1986]). Consider, for example, the classical Rothschild–Stiglitz [1970a, 1971] risk-reward model. In this model, the return of a strategy is given by a random variable $R$ and the associated reward is given by its expected value $E[R]$ under some probabilistic assumptions. Now consider two returns (random variables) $R_1$ and $R_2$ with equal expectations. Let their distribution functions be $F_1$ and $F_2$, respectively. In the Rothschild–Stiglitz model, the return $R_1$ is less risky than $R_2$ if and only if

$$\int_{-\infty}^{t} (F_2(x) - F_1(x))\, dx \ge 0, \quad \text{for all } t. \qquad (27)$$

Intuitively this means that $F_2$ has more weight in its tails.$^{19}$ This simple model is often approximated by measuring the risk associated with a given return via its variance (or standard deviation) (see Bodie et al. [1993]). There are of course other, more elaborate, measures of risk (see, e.g., Merton [1981]).
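For returns that take finitely many values, condition (27) can be checked exactly: both distribution functions are step functions, so the integral is piecewise linear in $t$ and it suffices to examine it at the support points. The following is a minimal sketch; representing a distribution as a dictionary from value to probability, and the function names, are assumptions made for illustration.

```python
def cdf(dist, x):
    """Distribution function of a discrete return given as {value: probability}."""
    return sum(p for v, p in dist.items() if v <= x)

def is_less_risky(dist1, dist2, tol=1e-12):
    """Check condition (27): the running integral of F2 - F1 never becomes negative."""
    points = sorted(set(dist1) | set(dist2))
    integral = 0.0
    for left, right in zip(points, points[1:]):
        # Both distribution functions are constant on [left, right).
        integral += (cdf(dist2, left) - cdf(dist1, left)) * (right - left)
        if integral < -tol:
            return False
    return True

# Two hypothetical returns with the same expectation 1.0.
safe = {1.0: 1.0}
risky = {0.0: 0.5, 2.0: 0.5}
print(is_less_risky(safe, risky), is_less_risky(risky, safe))
```

The check presupposes equal expectations, as in the text; it does not verify that assumption.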

An underlying ingredient of any optimization model that supports a measure of risk is the notion of a forecast. For example, in the preceding Rothschild–Stiglitz model, the forecast is of course the probability distribution of the return. The forecast ingredient is by definition completely lacking in the pure competitive analysis framework, which gives no flexibility to control risk and reward. The resulting optimal competitive algorithms are extremely risk-averse since they avoid making assumptions about future inputs. It is not surprising then that the empirical rewards of purely competitive algorithms can sometimes be low (see, e.g., Bachrach and El-Yaniv [1997]). In the statistical adversary approach (see Section 5.5) we do take risks by assuming that markets will obey some constraints but even this approach offers no mechanism to quantify and control the tradeoff between risk and reward.

$^{19}$ This measure of risk is directly tied to von Neumann–Morgenstern [1944] expected utility theory. Specifically, under the assumption that $E[R_1] = E[R_2]$, it can be shown that if $U$ is a concave (utility) function then condition (27) holds if and only if $E[U(R_1)] \ge E[U(R_2)]$.

Recognizing the need for risk management in financial applications, al-Binali [1997] offers the following framework that generalizes competitive analysis and allows for flexible risk management. The framework extends pure competitive analysis by introducing two ingredients: risk and forecast. Fix an online problem. For any algorithm ALG for this problem, denote by $\mathcal{C}(\text{ALG})$ the competitive ratio of ALG. Let $c^* = \inf_{\text{ALG}} \mathcal{C}(\text{ALG})$ be the optimal competitive ratio for this problem. Given any (online) algorithm ALG, define the risk $\mathcal{R}(\text{ALG})$ of ALG to be $\mathcal{R}(\text{ALG}) = \mathcal{C}(\text{ALG})/c^*$. Thus the risk of any algorithm is $\ge 1$ and the lower the risk, the better its performance guarantee. Next, define a forecast as any subset $F$ of the allowable input sequences. For example, the set of market sequences allowable for the $(a, n, p)$-adversary of Section 5.7.2 is a forecast.

The online player specifies a risk tolerance $T$. This means that the player is willing to use only algorithms from the set $\mathcal{T} = \{\text{ALG} : \mathcal{R}(\text{ALG}) \le T\}$. Each one of the algorithms in $\mathcal{T}$ thus has a competitive ratio $\le Tc^*$. Fix any forecast $F$. An optimal algorithm, according to this framework, is an algorithm from $\mathcal{T}$ that minimizes the competitive ratio, restricted to input sequences from $F$. Formally, we extend the function $\mathcal{C}(\cdot)$ to be parameterized by any subset $F \subseteq \Sigma^*$ of input sequences such that (say, with respect to a profit maximization problem)

$$\mathcal{C}_F(\text{ALG}) = \sup_{\sigma \in F} \frac{\text{OPT}(\sigma)}{\text{ALG}(\sigma)}.$$

Thus an optimal $T$-tolerant algorithm $\text{ALG}^*$ (for a player with a risk tolerance $T$) with respect to a forecast $F$ satisfies

$$\mathcal{C}_F(\text{ALG}^*) = \inf_{\text{ALG}} \{\mathcal{C}_F(\text{ALG}) : \text{ALG} \in \mathcal{T}\}.$$

Clearly this simple risk-reward framework provides a smooth generalization of the pure competitive analysis framework, which is the special case obtained with risk tolerance 1. It is hoped that by posing reasonable forecasts the player can boost performance significantly as long as input sequences conform to the forecast. Nevertheless, regardless of forecast, the player always keeps the risk within the desired tolerance. In the following section we present a concrete example.
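The selection rule of this framework is simple to state in code when only finitely many candidate algorithms are considered. The sketch below assumes that, for each candidate, both its unrestricted competitive ratio and its ratio restricted to the forecast are already known, and that $c^*$ is attained within the candidate set; the candidate names and all numbers are hypothetical.

```python
def optimal_tolerant(algorithms, T):
    """Pick the T-tolerant candidate with the smallest forecast-restricted ratio.

    `algorithms` maps a name to (competitive ratio, ratio restricted to the forecast).
    """
    c_star = min(c for c, _ in algorithms.values())
    # Keep only candidates whose risk C(ALG) / c* is within the tolerance T.
    admissible = {name: cf for name, (c, cf) in algorithms.items() if c / c_star <= T}
    best = min(admissible, key=admissible.get)
    return best, admissible[best]

candidates = {
    "pure": (2.0, 2.0),      # optimal worst case, gains nothing from the forecast
    "moderate": (2.4, 1.6),  # risk 1.2
    "bold": (3.0, 1.2),      # risk 1.5
}
for T in (1.0, 1.25, 2.0):
    print(T, optimal_tolerant(candidates, T))
```

With tolerance 1 the rule returns the purely competitive algorithm; larger tolerances admit algorithms that do better whenever the forecast holds.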

6.1 One-Way Trading Revisited: Analysis with Risk Management

To demonstrate the utility of the preceding risk management framework, al-Binali [1997] analyzes the one-way trading problem of Section 3.2.3 within this framework. Here we briefly describe some of the results.

Assume the one-way trading problem variant with known duration $n$, and known $m$ and $M$ (see Section 3.4.1). Fix any $\Delta \in [0, M - m]$ and consider the forecast $F$: the exchange rate will increase to at least $m + \Delta$ within the trading period. That is,

$$F = \{(p_1, p_2, \ldots, p_n) : \exists i \text{ s.t. } p_i \ge m + \Delta\}.$$

Now fix a risk tolerance $T$. It can be shown that the optimal $T$-tolerant algorithm is the following two-stage threat-based algorithm $\text{THREAT}_T$. In the first, conservative stage, $\text{THREAT}_T$ operates under the threat that the prediction is incorrect and attempts to achieve a competitive ratio of $Tc^*$. In particular, during this stage, $\text{THREAT}_T$ converts just enough dollars to guarantee this (nonoptimal) competitive ratio while saving resources, as much as possible, for the case where the forecast will come true. Then, if and when the forecast comes true (i.e., the exchange rate swings above $m + \Delta$), the algorithm begins its second stage. In this stage, the algorithm first computes the optimal restricted ratio $\inf_{\text{ALG}} \mathcal{C}_F(\text{ALG})$ and now trades, using this restricted ratio, under the threat that the exchange rate drops to $m$ and remains there for the rest of the game.



Based on the analysis described in Section 3.4.1 it can be shown that during its first stage (the forecast F has not been realized yet) algorithm $\mathrm{THREAT}_T$ trades only when the exchange rate hits a new high $p_i$, at which time it converts $s_i/T$ dollars to yen, where $s_i$ is given by (2). Then, as soon as $p_j \ge m + \Delta$ for some $j \le n$, the algorithm enters its second stage. Now the algorithm computes the optimal competitive ratio $c^*_F$ restricted to the forecast F, which is given by

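$$c^*_F = \frac{a_1 T c^* \bigl( \Delta (a_2 j + n - a_2 - j - a_2 n) - a_2 m \bigr)}{a_2 \bigl( \Delta (1 - a_1 - j + a_1 j - a_1 T c^*) - a_1 m + a_1^{j} m (1 - T c^*) \bigr)},$$

where $c^* = c^*_n(m, M)$ is given by (4) and

$$a_1 = \left( \frac{\Delta}{m (T c^* - 1)} \right)^{1/j}, \qquad a_2 = \left( \frac{M - m}{\Delta} \right)^{1/(n - j)}.$$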

Using $c^*_F$, $\mathrm{THREAT}_T$ trades $s_j = \frac{1}{p_j - m} \left( \frac{p_j}{c^*_F} - \frac{p_{j-1}}{T c^*_F} \right)$ dollars in the jth trading period, and $s_i = \frac{p_i - p_{i-1}}{c^*_F (p_i - m)}$ dollars in subsequent periods $i > j$.

Consider Table III where we give some numerical examples of the restricted competitive ratio obtained for a thirty-day game with $m = 1$, $M = 1.5$. Note that in this case the pure competitive ratio is 1.154. For a forecast F with $\Delta = 0.35$ that is realized in the jth period, $j = 10, 20$, the table gives the restricted competitive ratio obtained for tolerances $T = 1.01, 1.05, 1.1$.
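The closed-form expression for the restricted ratio given earlier in this section can be evaluated directly for the Table III parameters. The following is a hedged sketch rather than a reference implementation: the function name `restricted_ratio` is hypothetical, the pure competitive ratio c* is passed in as a number (the quoted value 1.154) instead of being computed from (4), and the last denominator term is read as $a_1^{j} m (1 - Tc^*)$, with $a_1$ raised to the power j. That reading is an interpretation of the printed formula; it yields values close to those tabulated.

```python
def restricted_ratio(T: float, j: int, n: int, m: float, M: float,
                     delta: float, c_star: float) -> float:
    """Evaluate the restricted competitive ratio c*_F.

    T      risk tolerance (T >= 1)
    j      period in which the forecast is realized (1 <= j < n)
    n      number of trading periods
    m, M   lower and upper bounds on the exchange rate
    delta  forecast parameter, 0 < delta <= M - m
    c_star pure competitive ratio c*_n(m, M), supplied by the caller
    """
    if not 1 <= j < n:
        raise ValueError("need 1 <= j < n")
    if not 0 < delta <= M - m:
        raise ValueError("need 0 < delta <= M - m")
    tc = T * c_star
    if tc <= 1:
        raise ValueError("need T * c_star > 1")

    a1 = (delta / (m * (tc - 1))) ** (1.0 / j)
    a2 = ((M - m) / delta) ** (1.0 / (n - j))

    numerator = a1 * tc * (delta * (a2 * j + n - a2 - j - a2 * n) - a2 * m)
    denominator = a2 * (
        delta * (1 - a1 - j + a1 * j - a1 * tc)
        - a1 * m
        + a1 ** j * m * (1 - tc)  # assumed reading: a1 to the power j
    )
    return numerator / denominator


if __name__ == "__main__":
    # Parameters quoted for Table III: n = 30, m = 1, M = 1.5, delta = 0.35.
    for T in (1.01, 1.05, 1.1):
        for j in (10, 20):
            value = restricted_ratio(T, j, 30, 1.0, 1.5, 0.35, 1.154)
            print(f"T={T} j={j} c*_F={value:.4f}")
```

Because c* is rounded to 1.154 here, the printed values should be expected to agree with the table only to about three decimal places.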

7. CONCLUDING REMARKS AND DIRECTIONS FOR FUTURE RESEARCH

The competitive approach is productive for a variety of financial problems. Although the full extent of its theoretical value and its empirical relevance are yet to be determined, it is already quite clear, from the examples in this survey, that this approach gives rise to interesting and nonobvious algorithms and analyses.

At the time of this writing, the work presented here is the state of the art in this area. Therefore, almost any improvement on the stated bounds and any extensions of the models considered will be desirable advances. Due to the nature of financial applications, one somewhat urgent issue that requires further attention is the power of the risk-reward framework of al-Binali (see Section 6). In particular, it would be important to consider some of the financial problems presented here in this framework while identifying realistic classes of forecasts.

Several other attractive financial problems can and should be studied within the competitive analysis framework. We conclude this survey with formulations of three online financial problems that to the best of our knowledge have not yet been studied using the competitive approach (variants of all these problems have been extensively studied within the Bayesian approach; see the following references).

Table III. Restricted Competitive Ratio $c^*_F$.†

T      j    $c^*_F$
1.01   10   1.140
1.01   20   1.143
1.05   10   1.114
1.05   20   1.115
1.1    10   1.0979
1.1    20   1.0975

† Obtained for the forecast F with $\Delta = 0.35$, $m = 1$, $M = 1.5$, and $n = 30$, for tolerances $T = 1.01, 1.05, 1.1$, and $j = 10, 20$ (note that for this game the (pure) competitive ratio is 1.154).

—Inventory management. The theory of online inventory management is concerned with decision problems of when and how much to buy of certain commodities whose future demand, for production or trade, is uncertain. For concreteness, consider the following problem formulation. The online player runs a store that sells a certain commodity. The player maintains an inventory in which he stores some items of this commodity. During each period a demand for some number of items of this commodity is placed. For each item sold the player earns some price c. At the end of each period the player must decide on a number of items to order so he can replenish his inventory. For each item ordered he pays some wholesale price w. The items ordered arrive after a delay of d periods. For each item stored in the inventory the player pays, per period, some holding cost h. The player's objective is to issue orders so that the total profit is maximized.

The literature on Bayesian inventory management is extensive. The reader is referred to Scarf [1963] and Stokey and Lucas [1989]. We note that preliminary competitive results for this problem were obtained by Karp et al. [1993].

—Insurance problems. In insurance problems the online player must decide whether to purchase an insurance contract guaranteeing that the insurer will pay the player a certain amount of money provided that a stipulated event occurs. For instance, consider the following elementary one-stage two-state insurance problem. There are two possible states of nature, one of which will prevail. In the first state the player is endowed with wealth $w_1$ whereas in the second, disaster state, the player is endowed with wealth $w_2$, $w_2 < w_1$. Before the state of nature is made known, the player can guarantee a compensation of c dollars, in the case where the second state of nature prevails, by paying the insurance company an amount of bc. The amounts $w_i$ and b are fixed and known to the online player before the decision is made. The player's goal is to maximize her wealth. This elementary one-stage two-state formulation can be generalized to a multistage multiple-state insurance game. Insurance problems are most fundamental to the economics of uncertainty. For Bayesian solutions to this problem the reader is referred to Lippman and McCall [1981] and references therein.

—Consumption problems. In consumption problems the online player must repeatedly decide how much of her current wealth to consume and how much to save for future consumption. Her goal is to maximize her total lifetime consumption. The uncertainties in this problem may arise in one or more of the following ways. First, the player's future income may vary in some unknown manner. Second, future investment appreciation (depreciation) rates may fluctuate in some uncertain way. Finally, the player's lifespan is unknown. Consider the following basic consumption problem. The player starts with some initial wealth $w_0$. At the start of the jth period, $j = 1, 2, \ldots$, the player receives an income $s_j$ and must choose some amount $c_j$ from her current wealth $s_j + w_{j-1}$ that she immediately consumes. The rest of the money is allocated for investment and grows at rate $1 + i_j$ throughout the jth period, so that the player's wealth at the end of the jth period is $w_j = (1 + i_j)(s_j + w_{j-1} - c_j)$. At the end of some period T, unknown in advance, the game ends. The player's goal is to maximize her total consumption throughout the game.

For some Bayesian results concerning variants of this problem, the reader is referred to Levhari and Mirman [1977], Mirman [1971], and Lippman and McCall [1981].
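The payoff structure of the insurance and consumption formulations above can be made concrete with a short sketch. It is offered only as an illustration: the function names are hypothetical, the insurance payoff assumes the premium bc is paid in both states of nature, and the consumption simulator replays fixed sequences of incomes, growth rates, and consumption choices rather than modeling an adversary or an online decision rule.

```python
from typing import Sequence, Tuple


def insurance_wealth(w1: float, w2: float, b: float, c: float) -> Tuple[float, float]:
    """Final wealth in (normal state, disaster state) after buying
    compensation c at premium rate b.

    Assumption: the premium b * c is paid up front in either state.
    """
    premium = b * c
    return w1 - premium, w2 + c - premium


def simulate_consumption(
    w0: float,
    incomes: Sequence[float],
    rates: Sequence[float],
    consumptions: Sequence[float],
) -> Tuple[float, float]:
    """Replay the wealth recursion w_j = (1 + i_j) * (s_j + w_{j-1} - c_j).

    Returns (total consumption, final wealth) after the last period.
    """
    if not (len(incomes) == len(rates) == len(consumptions)):
        raise ValueError("all sequences must have the same length")
    wealth = w0
    total = 0.0
    for s, i, c in zip(incomes, rates, consumptions):
        available = s + wealth
        if not 0 <= c <= available:
            raise ValueError("consumption must lie between 0 and current wealth")
        total += c
        wealth = (1 + i) * (available - c)  # the remainder is invested
    return total, wealth
```

For instance, with initial wealth 100, incomes of 10 in each of two periods, growth rates 0.1 and 0, and consumption choices 50 and 20, the recursion gives wealth 66 after the first period and 56 after the second, for a total consumption of 70. Any wealth left when the game ends is not counted, which is what makes the unknown horizon T matter.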

ACKNOWLEDGMENTS

My thanks to Thomas Cover and Erik Ordentlich for helping to obtain the bibliographic material. I thank Allan Borodin, Andrew Chou, Vincent Feltkamp, and Erik Ordentlich for many discussions and comments that helped me understand the material and greatly improved the presentation. I also thank Susanne Albers, Robert Aumann, Dean Foster, Sergiu Hart, Sageev Oore, Motty Perry, Steve Ponzio, Adi Rosen, and Asher Wolinsky for their useful remarks. Finally, I thank Brenda Brown, who proofread several versions of this article.

REFERENCES

AJTAI, M., MEGIDDO, N., AND WAARTS, O. 1995. Improved algorithms and analysis for secretary problems and generalizations. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 473–482.

AL-BINALI, S. 1997. The competitive analysis of risk taking with application to online trading. In Proceedings of the 38th Annual Symposium on Foundations of Computer Science.

AUER, P., CESA-BIANCHI, N., FREUND, Y., AND SCHAPIRE, R. E. 1995. Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 322–331.

AWERBUCH, B., AZAR, Y., FIAT, A., AND LEIGHTON, T. 1996. Making commitments in the face of uncertainty: How to pick a winner almost every time. In Proceedings of the 28th Annual ACM Symposium on Theory of Computing.

AZAR, Y., BARTAL, Y., FEUERSTEIN, E., FIAT, A., LEONARDI, S., AND ROSEN, A. 1996. On capital investment. In Proceedings of the 23rd ICALP, 429–441.

BACHRACH, R. AND EL-YANIV, R. 1997. Online list accessing algorithms and their applications: Recent empirical evidence. In Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, 53–62.

BELLMAN, R. 1955. Equipment replacement policy. J. Soc. Ind. Appl. Math. 3, 133–136.

BEN-DAVID, S., BORODIN, A., KARP, R., TARDOS, G., AND WIGDERSON, A. 1990. On the power of randomization in on-line algorithms. In Proceedings of the 22nd Symposium on Theory of Computing, 379–386. Also in Algorithmica 11 (1994), 2–14.

BLACKWELL, D. 1956. An analog of the minimax theorem for vector payoffs. Pacific J. Math. 6, 1–8.

BLUM, A. AND KALAI, A. 1997. Universal portfolios with and without transaction costs. In Tenth Annual Conference on Computational Learning Theory.

BODIE, Z., KANE, A., AND MARCUS, A. J. 1993. Investments. Irwin.

BORODIN, A., LINIAL, N., AND SAKS, M. 1992. An optimal online algorithm for metrical task systems. J. ACM 39, 745–763.

BREIMAN, L. 1961. Optimal gambling systems for favorable games. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1. University of California Press, Berkeley, 65–78.

CHOU, A. 1994. Optimal trading strategies vs. a statistical adversary. Master's Thesis, Massachusetts Institute of Technology.

CHOU, A., COOPERSTOCK, J., EL-YANIV, R., KLUGERMAN, M., AND LEIGHTON, T. 1995a. The statistical adversary allows optimal money-making trading strategies. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms.

CHOU, A., SHRIVASTAVA, A., AND SIDNEY, R. 1995b. On the power of magnitude. (To be published).

CHOW, Y. S., ROBBINS, H., AND SIEGMUND, D. 1971. The Theory of Optimal Stopping. Dover Publications.

COVER, T. M. 1991. Universal portfolios. Math. Finance 1, 1 (Jan.), 1–29.

COVER, T. M. AND GLUSS, D. H. 1986. Empirical Bayes stock market portfolios. Adv. Appl. Math. 7, 170–181.

COVER, T. M. AND ORDENTLICH, E. 1996. Universal portfolios with side information. IEEE Trans. Inf. Theor. (March).

COVER, T. M. AND THOMAS, J. A. 1991. Elements of Information Theory. John Wiley, New York.

COX, J. AND RUBINSTEIN, M. 1985. Options Markets. Prentice-Hall, Englewood Cliffs, NJ.

DERMAN, C. 1963. On optimal replacement rules when changes of the state are Markovian. In Mathematical Optimization Techniques, R. Bellman, Ed.

EL-YANIV, R. 1996. On the decision theoretic foundations of the competitive ratio. (To be published).

EL-YANIV, R., FIAT, A., KARP, R., AND TURPIN, G. 1992. Competitive analysis of financial games. In Proceedings of the 33rd Symposium on Foundations of Computer Science, 327–333.

EL-YANIV, R. AND KARP, R. M. 1997. Nearly optimal competitive online replacement policies. Math. Oper. Res. 22, 4, 814–839.

EL-YANIV, R., KANIEL, R., AND LINIAL, N. 1993. On the equipment rental problem. (To be published).

FREEMAN, P. R. 1983. The secretary problem and its extensions. Int. Stat. Rev. 51, 189–206.

GRAHAM, R. L. 1966. Bounds for certain multiprocessor anomalies. Bell Syst. Tech. J. 45, 1563–1581.

HART, S. AND MAS-COLELL, A. 1997. A simple adaptive procedure leading to correlated equilibrium. The Hebrew University of Jerusalem. Discussion paper 120. Jan.

HELMBOLD, D. P., SCHAPIRE, R. E., SINGER, Y., AND WARMUTH, M. K. 1996. On-line portfolio selection using multiplicative updates. In Machine Learning: Proceedings of the 13th International Conference, 243–251.

HULL, J. 1993. Options, Futures, and Other Derivative Securities. Prentice-Hall, Englewood Cliffs, NJ.

IRANI, S. AND RAMANATHAN, D. 1994. The problem of renting versus buying. (Unpublished manuscript).

JAMSHIDIAN, F. 1992. Asymptotically optimal portfolios. Math. Finance 2, 2, 131–150.

JOHNSON, D. S. 1973. Near-Optimal Bin Packing Algorithms. Ph.D. Thesis, Massachusetts Institute of Technology.

JOHNSON, D. S., DEMERS, A., ULLMAN, J. D., GAREY, M. R., AND GRAHAM, R. L. 1974. Worst-case performance bounds for simple one-dimensional packing algorithms. SIAM J. Comput. 3, 299–325.

KARLIN, A. R., MANASSE, M. S., RUDOLPH, L., AND SLEATOR, D. D. 1988. Competitive snoopy caching. Algorithmica 3, 1, 70–119.

KARP, R. M., OSTROVSKI, R., AND RABANI, Y. 1993. Personal communication.

KUHN, H. W. 1953. Extensive games and the problem of information. In Contributions to the Theory of Games, Vol. II, H. W. Kuhn and A. W. Tucker, Eds., Princeton University Press, Princeton, NJ, 193–216.

LEVHARI, D. AND MIRMAN, L. J. 1977. Saving and uncertainty with an uncertain horizon. J. Political Econ. 85, 265–281.

LEVIN, L. 1994. Personal communication, July.

LIPPMAN, S. A. AND MCCALL, J. J. 1976. The economics of job search: A survey. Econ. Inquiry (June), 155–189.

LIPPMAN, S. A. AND MCCALL, J. J. 1981. The economics of uncertainty: Selected topics and probabilistic methods. In Handbook of Mathematical Economics, Vol. 1, K. J. Arrow and M. D. Intriligator, Eds., Chapter 6, North-Holland, 211–284.

MACCRIMMON, K. R. AND WEHRUNG, D. A. 1986. Taking Risks: The Management of Uncertainty. The Free Press.

MERTON, R. C. 1981. On the microeconomic theory of investment under uncertainty. In Handbook of Mathematical Economics, Vol. 2, K. J. Arrow and M. D. Intriligator, Eds., Chapter 13, North-Holland, 601–669.

MIRMAN, L. J. 1971. Uncertainty and optimal consumption decisions. Econometrica 39, 179–185.

ORDENTLICH, E. AND COVER, T. M. 1996. On-line portfolio selection. In Proceedings of COLT 96. (To appear in Math. Oper. Res.).

RAGHAVAN, P. 1992. A statistical adversary for on-line algorithms. DIMACS Series in Discrete Math. Theor. Comput. Sci. 7, 79–83.

ROSENFIELD, D. B. AND SHAPIRO, R. D. 1981. Optimal price search with Bayesian extensions. J. Econ. Theor.

ROTHSCHILD, M. AND STIGLITZ, J. E. 1970. Increasing risk I: A definition. J. Econ. Theor. 2, 225–243.

ROTHSCHILD, M. AND STIGLITZ, J. E. 1971. Increasing risk II: Its economic consequences. J. Econ. Theor. 3, 66–84.

SCARF, H. E. 1963. A survey of analytic techniques in inventory theory. In Multistage Inventory Models and Techniques, H. E. Scarf, D. M. Gilford, and M. W. Shelly, Eds., Chapter 7, Stanford University Press, Stanford, CA, 185–225.

SHARPE, W. F. 1966. Mutual fund performance. J. Business (Jan.).

SHESHINSKI, E. AND WEISS, Y. 1993. Optimum pricing policy under stochastic inflation. In Optimal Pricing, Inflation and the Cost of Price Adjustment, E. Sheshinski and Y. Weiss, Eds., The MIT Press, Cambridge, MA.

SHILLING, A. G. 1992. Market timing: Better than a buy-and-hold strategy. Finan. Anal. J. (March–April), 46–50.

SLEATOR, D. AND TARJAN, R. E. 1985. Amortized efficiency of list update and paging rules. Commun. ACM 28, 202–208.

STOKEY, N. L. AND LUCAS, R. E., JR. 1989. Recursive Methods in Economic Dynamics, Chapter 13, Harvard University Press, Cambridge, MA, 389–413.

VON NEUMANN, J. AND MORGENSTERN, O. 1944. Theory of Games and Economic Behavior. Princeton University Press, Princeton, NJ.

YAO, A. C. 1980. New algorithms for bin packing. J. ACM 27, 207–227.

Received November 1996; revised August 1997; accepted November 1997


