A method for portfolio choice

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY
Appl. Stochastic Models Bus. Ind., 2003; 19:1–11 (DOI: 10.1002/asmb.482)


Robert Elliott¹,† and Juri Hinz²,*,‡

¹Faculty of Management, University of Calgary, Calgary, Alberta, Canada
²Mathematisches Institut, Universität Tübingen, 72076 Tübingen, Germany

SUMMARY

This paper shows how one can use the theory of hidden Markov models for portfolio optimization. We illustrate our method by a ball and urn experiment. An application to historical data is examined.

Copyright © 2003 John Wiley & Sons, Ltd.

KEY WORDS: Hidden Markov models; portfolio optimization; logarithmic utility

1. INTRODUCTION

Let $(S(t))_{t \ge 0}$ be an asset price given by a positive-valued continuous stochastic process on a probability space $(\Omega, \mathcal{F}, P)$. Suppose an investor follows a trading strategy, trying to maximize his wealth. In a real-world market each trading strategy involves only a finite number of trades. Moreover, due to transaction costs, traders set so-called take-profit and stop-loss margins and re-allocate their positions only after the price has changed by a certain amount. Hence, we restrict ourselves to strategies where trading occurs only at the pre-defined times $0 = t_0 < t_1 < \cdots < t_n$ given by

$$t_{k+1} := \inf\{\, t \ge t_k : S(t) \notin \,]d \cdot S(t_k),\; u \cdot S(t_k)[\,\}, \qquad t_0 = 0 \tag{1}$$
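The sampling rule (1) can be sketched on a discretely observed price path as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_times` and the example path are assumptions, and on discrete data the price may overshoot the band, so the sampled ratios are only approximately equal to $d$ or $u$.

```python
from typing import List, Sequence


def sample_times(prices: Sequence[float], d: float, u: float) -> List[int]:
    """Return indices at which the price has left the open band
    ]d * ref, u * ref[ around the price at the previous sampled index.

    Index 0 plays the role of t_0 = 0 in (1).
    """
    if not (0.0 < d < 1.0 < u):
        raise ValueError("need 0 < d < 1 < u")
    times = [0]
    ref = prices[0]
    for i in range(1, len(prices)):
        # The band is open, so touching a boundary already triggers a sample.
        if prices[i] <= d * ref or prices[i] >= u * ref:
            times.append(i)
            ref = prices[i]
    return times


# Hypothetical price path and a symmetric 1% band (d = 1/u).
path = [100.0, 100.5, 101.2, 100.9, 99.8, 98.9, 99.5, 100.2]
idx = sample_times(path, d=1 / 1.01, u=1.01)
```

The reference price is reset at every sampled index, which mirrors the recursive definition of $t_{k+1}$ in terms of $S(t_k)$.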

Here $d$ and $u$ with $0 < d < 1 < u$ are percentage price changes which trigger a portfolio re-allocation. At $t_k$ the investor's decision is based on some information $\mathcal{I}_k$ available at that time, that is, we consider a filtration $(\mathcal{I}_k)_{k=0}^{n-1}$. Here we suppose that no transaction costs are to be paid, the risky asset is arbitrarily divisible, and short positions are allowed. For simplicity, we suppose also that the interest rate is zero. This assumption is justified for a short time horizon. In general, the interest rate is transformed to zero by taking the riskless asset as numeraire. The action of the investor is described by a $(\mathcal{I}_k)_{k=0}^{n-1}$-adapted portfolio process $\pi = (\pi_k)_{k=0}^{n-1}$, where $\pi_k$ represents that part of the wealth which is invested in the stock immediately after $t_k$.

Received 23 April 2002; revised 11 May 2002.

*Correspondence to: Juri Hinz, Mathematisches Institut, Universität Tübingen, 72076 Tübingen, Germany
†E-mail: [email protected]
‡E-mail: [email protected]

Starting with the initial investment 1 and following the portfolio $\pi = (\pi_k)_{k=0}^{n-1}$, at the time $t_k$ the investor owns the wealth $X_k^{\pi}$, given recursively by

$$X_{k+1}^{\pi} = X_k^{\pi} + X_k^{\pi}\, \pi_k\, (v_{k+1} - 1), \qquad X_0^{\pi} = 1 \tag{2}$$

Here $(v_k := S(t_k)/S(t_{k-1}))_{k=1}^{n}$ are the multiplicative increments of the stock price.
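The wealth recursion (2) is straightforward to evaluate numerically. The following is a small sketch under the paper's assumptions (zero interest rate, no transaction costs); the function name `wealth_path` and the sample inputs are illustrative only.

```python
from typing import List, Sequence


def wealth_path(pi: Sequence[float], v: Sequence[float]) -> List[float]:
    """Evaluate X_0 = 1 and X_{k+1} = X_k + X_k * pi_k * (v_{k+1} - 1).

    pi[k] is the fraction of wealth held in the stock after time t_k,
    v[k] is the multiplicative price increment v_{k+1}.
    """
    if len(pi) != len(v):
        raise ValueError("pi and v must have the same length")
    wealth = [1.0]
    for fraction, increment in zip(pi, v):
        wealth.append(wealth[-1] * (1.0 + fraction * (increment - 1.0)))
    return wealth


# Half of the wealth in the stock, one up move (+10%) and one down move (-10%).
w = wealth_path([0.5, 0.5], [1.1, 0.9])
```

A fraction above 1 corresponds to borrowing and a negative fraction to a short position, both of which the model allows.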

To obtain an appropriate trading strategy, we first have to estimate a reliable statistical model for $(\Omega, (\mathcal{I}_k)_{k=0}^{n-1}, (v_k)_{k=1}^{n})$ using historical stock prices. More exactly, we have to solve two different problems in succession. Fix for $(\Omega, (\mathcal{I}_k)_{k=0}^{n-1}, (v_k)_{k=1}^{n})$ a parameterized family $(P^{\nu})_{\nu \in V}$ of distributions and denote by $E^{\nu}(\cdot)$ the expectation with respect to $P^{\nu}$.

Problem 1. (Identification)
For given historical data $\omega \in \Omega$, estimate those parameters $\hat{\nu}(\omega)$ which most likely explain the observation $\omega$.

Problem 2. (Optimization)
For $\nu \in V$, calculate the portfolio $\pi^{*}(\nu)$ for which the supremum of $\pi \mapsto E^{\nu}(\ln(X_n^{\pi}))$ is reached on the set

$$\{\pi = (\pi_k)_{k=0}^{n-1} : \pi \text{ is } (\mathcal{I}_k)_{k=0}^{n-1}\text{-adapted such that } E^{\nu}(|\ln(X_k^{\pi})|) < \infty \text{ for } k = 0, \ldots, n\} \tag{3}$$

Both tasks are very different. To solve the first problem, a wide range of statistical methods is available. The second problem admits at least a numerical solution using dynamic programming. However, solving them separately does not yield a satisfactory strategy, since each statistical estimation is an approximation and calibrating the parameter does not automatically imply an improvement in portfolio performance. Intuitively, the estimation should be portfolio efficient in the sense that the estimated parameters yield a strategy which gives the most profit when back-tested on historical data.

Definition 1

A parameter estimation $\Omega \to V$, $\omega \mapsto \hat{\nu}(\omega)$ is called portfolio efficient if

$$\hat{\nu}(\omega) \text{ is a maximum of } \nu \mapsto X_n^{\pi^{*}(\nu)}(\omega) \text{ for all } \omega \in \Omega \tag{4}$$

There is a special case where (4) trivially holds: the usual maximum-likelihood parameter estimation provides portfolio efficiency if all measures $P^{\nu}$ restricted to the final information $\mathcal{I}_{n-1} \vee \sigma(v_n)$ spanned by $\mathcal{I}_{n-1}$ and $v_n$ are absolutely continuous with respect to a reference measure $R$ on $\mathcal{I}_{n-1} \vee \sigma(v_n)$ and

$$\frac{dP^{\nu}|_{\mathcal{I}_{n-1} \vee \sigma(v_n)}}{dR} = X_n^{\pi^{*}(\nu)}, \qquad \forall \nu \in V \tag{5}$$

In the present work, we describe $(\Omega, (\mathcal{I}_k)_{k=0}^{n-1}, (P^{\nu})_{\nu \in V}, (v_k)_{k=1}^{n})$ in the context of hidden Markov theory and consider a class of models satisfying (5).

Remark 1

Note that due to the random time sampling, the wealth is maximized at a random time $t_n$. Clearly, this change leads to an optimality criterion different from the usual one. However, since utility is logarithmic, it turns out that the optimal portfolio, see (7), does not depend on $n$. Hence, an investor who follows this portfolio maximizes his wealth at each time $t_k$, which seems acceptable for practical purposes. But the random time sampling causes problems with non-time-homogeneous financial data, since dependencies observed in the original time scale are hard to incorporate in our setting. However, our method could work on homogeneous data, for instance, for foreign exchange rates.

Remark 2

Sampling asset prices, we implicitly assume multiplicative price dynamics, which may contradict the tick effect. However, even if the hidden Markov hypothesis is not appropriate to describe sampled prices (say, for an unfortunate choice of $d$ and $u$), portfolio efficiency ensures that we still obtain the best fit within a class of hidden Markov models for the purpose of portfolio optimization.

Remark 3

The advantage of our framework is that the estimated parameter $\hat{\nu}(\omega)$ maximizes $\nu \mapsto X_n^{\pi^{*}(\nu)}(\omega)$ and that this maximum is obtained in our setting, due to (5), by the standard EM-algorithm for maximum likelihood parameter estimation (see Reference [1]). Without (5) and the EM-algorithm, the maximization of the function $\nu \mapsto X_n^{\pi^{*}(\nu)}(\omega)$ is difficult since it involves the whole set of historical data, which is typically very large. Hence, using a maximizer of $\nu \mapsto X_n^{\pi^{*}(\nu)}(\omega)$ as a portfolio efficient parameter estimation method makes sense only if it can be calculated in practice. Unfortunately, this is not the case for non-logarithmic utility functions, for models with transaction costs, or for models with several risky assets.

Remark 4

To expose the idea of our approach, all proofs are omitted. However, the interested reader may find them in References [2, 3].

2. PORTFOLIO OPTIMIZATION BY HIDDEN MARKOV MODELS

At first glance, condition (5) appears unnatural and difficult to satisfy, but let us recall the continuous-time portfolio optimization problem [9], where the terminal wealth of the optimal logarithmic portfolio is in fact the Radon–Nikodym derivative of the true probability measure with respect to the risk-neutral measure. Of course, the situation is different in our setting. We can use the relation between the Black–Scholes and the Cox–Ross–Rubinstein models and the martingale representation to obtain the desired property (5) in a special class of models. The crucial point here is the time sampling, which makes the price movement appear similar to the Cox–Ross–Rubinstein approach. We now formulate the model assumptions.

We suppose that all times $t_0 < t_1 < \cdots < t_n$ are almost surely finite. Then $S(t_{k+1}) = S(t_k)\, v_{k+1}$ for all $k = 0, \ldots, n-1$, using the $\{d, u\}$-valued process $(v_k = S(t_k)/S(t_{k-1}))_{k=1}^{n}$. We shall agree that, based on the present information, no sure prediction of the next price movement is possible:

$$P(v_{k+1} = d \mid \mathcal{I}_k) \in \,]0, 1[ \quad \text{almost surely } \forall k = 0, \ldots, n-1 \tag{6}$$

In that case, as shown in Reference [2], the optimal logarithmic portfolio $\pi^{*} = (\pi_k^{*})_{k=0}^{n-1}$ is given by

$$\pi_k^{*} = \frac{P(v_{k+1} = u \mid \mathcal{I}_k)}{1 - d} - \frac{P(v_{k+1} = d \mid \mathcal{I}_k)}{u - 1} \quad \text{for all } k = 0, \ldots, n-1 \tag{7}$$
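Formula (7) can be checked numerically for a single step. The sketch below assumes a known conditional up-probability `p_up` (a hypothetical input) and compares the expected logarithmic growth at the fraction from (7) with nearby fractions; the function names are illustrative.

```python
import math


def log_optimal_fraction(p_up: float, d: float, u: float) -> float:
    """Fraction of wealth in the stock according to (7)."""
    p_down = 1.0 - p_up
    return p_up / (1.0 - d) - p_down / (u - 1.0)


def expected_log_growth(pi: float, p_up: float, d: float, u: float) -> float:
    """One-step expected log wealth for the fraction pi, using (2)."""
    return (p_up * math.log(1.0 + pi * (u - 1.0))
            + (1.0 - p_up) * math.log(1.0 + pi * (d - 1.0)))


# Hypothetical one-step setting: 60% chance of a +10% move, 40% of a -10% move.
best = log_optimal_fraction(0.6, d=0.9, u=1.1)
g_best = expected_log_growth(best, 0.6, 0.9, 1.1)
g_lower = expected_log_growth(best - 0.1, 0.6, 0.9, 1.1)
g_higher = expected_log_growth(best + 0.1, 0.6, 0.9, 1.1)
```

Because the expected log growth is strictly concave in the fraction, the value at `best` exceeds the values at both neighbouring fractions.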



For purposes of calculation, to model the investor's information, we introduce the process $(z_k)_{k=1}^{n}$ and suppose that

$$\mathcal{I}_k = \sigma(v_j, z_j : j \le k) \vee \sigma(z_{k+1}) \quad \text{for all } k = 0, \ldots, n-1 \tag{8}$$

Further, we suppose the two-component process $(y_k := (v_k, z_k))_{k=1}^{n}$ is the output process of a finite state hidden Markov model.

The main idea of a hidden Markov model is to observe the time series $(y_k)_{k=1}^{n}$ under the assumption that $(y_k)_{k=1}^{n}$ is random and depends on some background device which may operate in different regimes. The operating regime is not observed and changes like a Markov chain. In some situations, the random device and its operating regimes can be given a concrete physical meaning, but in many cases they are only an approximation of the real world. Let us illustrate the basic idea behind hidden Markov theory by considering the urn and ball experiment from [5], which we adapt for our purposes. Suppose that there is a finite set $S$ of urns. Within each urn there is a (large) number of black and white balls. The balls may have different sizes, not depending on their color. The urns are hidden. The physical process for obtaining observations is as follows: initially, one urn is selected randomly. At the first step, a random device selects the next urn. From this urn a ball is chosen at random, and its color and size are recorded as the observation. The ball is then replaced in the urn from which it was selected. At the next step, the device again selects an urn at random and a ball is chosen from that urn and is shown. This procedure is repeated $n$ times. The entire process generates a sequence $(\xi_k)_{k=0}^{n}$ of selected urns which is not observed, and an observed sequence $(y_k = (v_k, z_k))_{k=1}^{n}$ of ball colors $(v_k)_{k=1}^{n}$ and ball sizes $(z_k)_{k=1}^{n}$. Let us assume that the random device changes from the urn $\xi \in S$ to an urn $\chi \in S$ with a known probability $X_{\xi}(\chi)$ and that the joint probability distribution $Y_{\xi}$ of ball size and ball color from urn $\xi \in S$ is given. Clearly, the experiment is uniquely determined by the probability distribution $p = (P(\xi_0 = \xi))_{\xi \in S}$ of the initial urn, by the stochastic kernel $X = (X_{\xi}(\chi))_{\xi, \chi \in S}$ of the random device and by the stochastic kernel $Y = (Y_{\xi}(d\eta))_{\xi \in S}$ from $S$ to the set of all possible colors and sizes. The processes $(\xi_k)_{k=0}^{n}$ and $(y_k)_{k=1}^{n}$ form an example of a hidden Markov model (HMM).

The following questions may be treated efficiently by the tools from HMMs. Suppose we noted all colors and sizes of past choices $(v_j, z_j)_{j=1}^{k}$ and observe also the size $z_{k+1}$ of the current ball. How might we determine the distribution $\phi \mapsto P(v_{k+1} = \phi \mid \sigma(z_1, v_1, \ldots, z_k, v_k, z_{k+1}))$, where $\phi \in \{\text{black}, \text{white}\}$, of the current ball color $v_{k+1}$ conditioned on the observations? A second question concerns the model parameters. Suppose we observe a sequence $(v_j(\omega), z_j(\omega))_{j=1}^{n}$, but the parameters are not exactly known. Which triple $(p, X, Y)$ explains these observations best?

Note that these questions are connected to portfolio optimization by the following interpretation: at the step $k \in \{1, \ldots, n\}$, we consider instead of the ball color (black or white) the observation $v_k = S(t_k)/S(t_{k-1})$ of the last price movement (up or down). Instead of the ball size we may put any quantity $z_k$ (or even quantities, making $z_k$ multi-dimensional) which is observable at the time $t_{k-1}$. For example, $z_k$ could be the number of trades during $[t_{k-2}, t_{k-1}]$, or possibly the trading volume during that period. Generally speaking, we shall take for $z_k$ some path-dependent functionals of the stock price evaluated at the time $t_{k-1}$. Note that to calculate the optimal logarithmic portfolio as in (7) we require the same conditional probabilities which are sought in the first question. Furthermore, the identification of the model parameters $(p, X, Y)$ for a given sequence of stock price observations is the purpose of the second question. We shall see later conditions which ensure that the maximum likelihood estimation of the parameters $(p, X, Y)$ does have the desired property (5).



3. HIDDEN MARKOV MODELS

We now give the formal definition of a hidden Markov model. Let $(\Omega, \mathcal{F}, P)$ be a probability space. We introduce the system process $(\xi_k)_{k=0}^{n}$ on the finite state space $S$. Without loss of generality we can identify $S$ with the set $\{e_1, \ldots, e_{|S|}\}$ of unit vectors $e_i = (0, \ldots, 1, \ldots, 0)$ of $\mathbb{R}^{|S|}$. The output process $(y_k)_{k=1}^{n}$ takes values in the output space $\mathbb{R}^{d}$. We call $(\Omega, \mathcal{G}, P, (\xi_k)_{k=0}^{n}, (y_k)_{k=1}^{n})$ a hidden Markov model (HMM) if

(i) $(\xi_k)_{k=0}^{n}$ is Markovian with initial distribution $p$ and transition kernel $X$;

(ii) the distribution of $y_1, \ldots, y_n$ conditioned on $\xi_0 = x_0, \ldots, \xi_n = x_n$ is $\bigotimes_{j=1}^{n} Y_{x_{j-1}}$;

(iii) the measures $\{Y_{\xi} : \xi \in S\}$ are equivalent.

Note that a hidden Markov model is uniquely described by a triple $(p, X, Y)$ consisting of a probability distribution $p$ on $S$, a transition kernel $X$ on $S$, and a transition kernel $Y$ from $S$ to $\mathbb{R}^{d}$ as in (iii), since the joint distribution of $(\xi_0, (\xi_1, y_1), \ldots, (\xi_n, y_n))$ is determined by

$$P\Big(\{\xi_0 = x_0\} \cap \bigcap_{j=1}^{n} \{\xi_j = x_j,\; y_j \in C_j\}\Big) = P(\{\xi_0 = x_0\}) \prod_{j=1}^{n} X_{x_{j-1}}(\{x_j\})\, Y_{x_{j-1}}(C_j) \tag{9}$$

for all $x_0, \ldots, x_n \in S$, $C_1, \ldots, C_n \in \mathcal{B}(\mathbb{R}^{d})$. Given the parameter $(p, X, Y)$ of a HMM, $p$ is called the initial distribution, and $X$ and $Y$ are called the transition kernel and output kernel, respectively. Furthermore, each probability measure $\mu$ equivalent to $Y_{\xi}$ for $\xi \in S$ is called a representing measure of the HMM. Let $(\Omega, \mathcal{G}, P, (\xi_k)_{k=0}^{n}, (y_k)_{k=1}^{n})$ be a HMM. Define the filtrations $\mathcal{Y}, \mathcal{G}$ as

$$\mathcal{Y} := (\mathcal{Y}_k := \sigma(\{y_j : j \le k\}))_{k=0}^{n}, \qquad \mathcal{G} := (\mathcal{G}_k := \sigma(\{\xi_j, y_j : j \le k\}))_{k=0}^{n}$$

In this work, we consider HMMs with a two-component output process $(y_k = (v_k, z_k))_{k=1}^{n}$ where $(v_k)_{k=1}^{n}$ takes values in $\{d, u\}$ and $(z_k)_{k=1}^{n}$ is $\mathbb{R}^{d-1}$-valued. For such an output process, we introduce the filtration

$$\mathcal{I} := (\mathcal{I}_k := \mathcal{Y}_k \vee \sigma(z_{k+1}))_{k=0}^{n-1}$$

We shall suppose $(y_k)_{k=1}^{n}$ is obtained from stock price observations or, alternatively, as records from the urn and ball experiment.

A reason why HMMs are popular in applications is that they provide an efficient treatment for the calculation of $(\hat{h}_k := E(h_k \mid \mathcal{Y}_k))_{k=0}^{n}$ where $(h_k)_{k=0}^{n}$ is a $(\mathcal{G}_k)_{k=0}^{n}$-adapted process. The measure change technique yields a solution as $(\hat{h}_k := \gamma_k(h_k)/\gamma_k(1))_{k=0}^{n}$, where $(\gamma_k(h_k))_{k=0}^{n}$ and $(\gamma_k(1))_{k=0}^{n}$ are so-called unnormalized estimates. For many processes of interest, these unnormalized estimates are obtained recursively. We shall need such recursions for unnormalized estimates of the state $\xi_k$, of the number $J_k^{\xi,\chi} = \sum_{j=0}^{k-1} 1_{\{\xi_j = \xi,\, \xi_{j+1} = \chi\}}$ of jumps from $\xi$ to $\chi$, and also of the step number $T_k^{\xi,\phi} = \sum_{j=0}^{k-1} 1_{\{\xi_j = \xi,\, v_{j+1} = \phi\}}$ where the first observation component is $\phi$ and the previous state is $\xi$. To write down these recursions, we require the following notation: fix a representing measure $\mu$ and write $D(y_k) := \mathrm{diag}((dY_{\xi}/d\mu)(y_k))_{\xi \in S}$ for all $k = 1, \ldots, n$. The transposed stochastic matrix of the transition kernel is denoted by $A := (X_{\xi}(\chi))^{\top}_{\xi,\chi \in S}$ and $X_{\xi}(\cdot)$ stands for its column corresponding to $\xi \in S$. We cite from [4] the following formulas for all $k = 0, \ldots, n-1$, $\xi, \chi \in S$, and $\phi \in \{d, u\}$:

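$$\gamma_{k+1}(\xi_{k+1}) = A\, D(y_{k+1})\, \gamma_k(\xi_k) \tag{10}$$

$$\gamma_{k+1}(\xi_{k+1} J_{k+1}^{\xi,\chi}) = A\, D(y_{k+1})\, \gamma_k(\xi_k J_k^{\xi,\chi}) + \langle \xi, D(y_{k+1})\, \gamma_k(\xi_k) \rangle\, X_{\xi}(\chi)\, \chi \tag{11}$$

$$\gamma_{k+1}(\xi_{k+1} T_{k+1}^{\xi,\phi}) = A\, D(y_{k+1})\, \gamma_k(\xi_k T_k^{\xi,\phi}) + 1_{\{v_{k+1} = \phi\}} \langle \xi, D(y_{k+1})\, \gamma_k(\xi_k) \rangle\, X_{\xi}(\cdot) \tag{12}$$

These recursions are initialized at $\gamma_0(\xi_0) = p$, $\gamma_0(\xi_0 J_0^{\xi,\chi}) = 0$, $\gamma_0(\xi_0 T_0^{\xi,\phi}) = 0$, respectively. The quantities $\gamma_k(1)$, $\gamma_k(J_k^{\xi,\chi})$, $\gamma_k(T_k^{\xi,\phi})$ are obtained by summing up all entries:

$$\gamma_k(1) = \langle \gamma_k(\xi_k), \vec{1} \rangle, \qquad \gamma_k(J_k^{\xi,\chi}) = \langle \gamma_k(\xi_k J_k^{\xi,\chi}), \vec{1} \rangle, \qquad \gamma_k(T_k^{\xi,\phi}) = \langle \gamma_k(\xi_k T_k^{\xi,\phi}), \vec{1} \rangle \tag{13}$$

Finally, normalization yields the filtered variables:

$$\hat{\xi}_k = \frac{\gamma_k(\xi_k)}{\gamma_k(1)}, \qquad \hat{J}_k^{\xi,\chi} = \frac{\gamma_k(J_k^{\xi,\chi})}{\gamma_k(1)}, \qquad \hat{T}_k^{\xi,\phi} = \frac{\gamma_k(T_k^{\xi,\phi})}{\gamma_k(1)}, \qquad k = 0, \ldots, n \tag{14}$$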
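The state recursion (10) together with the normalization in (13) and (14) can be sketched in a few lines. This is an illustration only: the function name `filter_states` and the argument layout are assumptions, the densities $(dY_{\xi}/d\mu)(y_{k+1})$ are taken as given inputs, and the vector is rescaled at every step (a numerical convenience not discussed in the text, which leaves the normalized filter unchanged because the recursion is linear).

```python
from typing import List, Sequence


def filter_states(p0: Sequence[float],
                  X: Sequence[Sequence[float]],
                  densities: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the normalized filters for k = 0, ..., n.

    p0[i]           initial probability of state i
    X[i][j]         probability of moving from state i to state j
    densities[k][i] density of observation y_{k+1} under state i
    """
    m = len(p0)
    gamma = list(p0)
    filters = [list(p0)]
    for dens in densities:
        # D(y_{k+1}) gamma_k: weight each state by the observation density.
        weighted = [dens[i] * gamma[i] for i in range(m)]
        # A = X transposed: propagate the weighted vector one step ahead.
        gamma = [sum(X[i][j] * weighted[i] for i in range(m)) for j in range(m)]
        total = sum(gamma)  # plays the role of <gamma, 1> in (13)
        gamma = [g / total for g in gamma]
        filters.append(gamma)
    return filters


# Two states that never change; one observation nine times as likely under state 0.
f = filter_states([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1]])
```

Note the ordering: following (9), the observation $y_{k+1}$ is weighted with the density of the state at time $k$ before the transition is applied.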

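In what follows, we need also the occupation time $O_k^{\xi} := \sum_{j=0}^{k-1} 1_{\{\xi_j = \xi\}}$ of the state $\xi \in S$, given by $O_k^{\xi} = \sum_{\chi \in S} J_k^{\xi,\chi}$. This implies

$$\hat{O}_k^{\xi} = \sum_{\chi \in S} \hat{J}_k^{\xi,\chi} \tag{15}$$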

An important feature of hidden Markov modelling is that the model parameters may be re-estimated by the so-called EM-algorithm. It is applied in the following manner: suppose we are given a parameterized family $(P^{\nu})_{\nu \in V}$ such that for each $\nu \in V$ we know that $(\Omega, P^{\nu}, (\xi_k)_{k=0}^{n}, (y_k)_{k=1}^{n})$ is a HMM with parameter $(p, X^{\nu}, Y^{\nu})$ and

$$\text{the measures } Y_{\xi}^{\nu} \text{ are equivalent for all } \nu \in V,\; \xi \in S \tag{16}$$

In this case, we define the unique measure $R^{\mu}$ on $\mathcal{Y}_n$ such that the distribution of $(y_1, \ldots, y_n)$ with respect to $R^{\mu}$ is $\bigotimes_{1}^{n} \mu$. If we restrict each measure $P^{\nu}$ to

$$\mathcal{Y}_n = \mathcal{I}_{n-1} \vee \sigma(v_n) = \sigma(y_1, \ldots, y_n)$$

we see from (ii) and (iii) that $P^{\nu}|_{\mathcal{Y}_n}$ is equivalent to $R^{\mu}$ for all $\nu \in V$. Consequently, the observations $y_1(\omega), \ldots, y_n(\omega)$ define the likelihood function as $\nu \mapsto (dP^{\nu}|_{\mathcal{Y}_n}/dR^{\mu})(\omega)$. The EM-algorithm, which we shall not describe here in detail, calculates a sequence $(\nu_i)_{i \in \mathbb{N}}$ such that the corresponding values of the likelihood function are increasing. To use this property for portfolio optimization, we shall find a parameterized family of HMMs satisfying (16) such that there exists a representing measure $\mu$ where $R := R^{\mu}$ satisfies (5). The next proposition (see Reference [3]) characterizes such families.

Proposition 1

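Suppose that the family $(P^{\nu})_{\nu \in V}$ of measures on $\Omega$ is given such that $(\Omega, P^{\nu}, (\xi_k)_{k=0}^{n}, (y_k = (v_k, z_k))_{k=1}^{n})$ forms a HMM with parameter $(p, X^{\nu}, Y^{\nu})$ for each $\nu \in V$. Moreover, we assume that there exist a measure $\rho$ on $\mathbb{R}^{d-1}$ and, for each $\nu \in V$, a kernel $B^{\nu}$ from $S \times \mathbb{R}^{d-1}$ to $\{d, u\}$ such that for all $\nu \in V$

$$Y_{\xi}^{\nu}(d\phi, dz) = B_{\xi,z}^{\nu}(d\phi)\, \rho(dz) \quad \text{with } B_{\xi,z}^{\nu}(d) \in \,]0, 1[ \text{ for all } \xi \in S,\; z \in \mathbb{R}^{d-1} \tag{17}$$

Then $\mu := \left(\frac{u-1}{u-d}\, \delta_d + \frac{1-d}{u-d}\, \delta_u\right) \otimes \rho$ is equivalent to $Y_{\xi}^{\nu}$ for all $\xi \in S$, $\nu \in V$, and the terminal wealth of the optimal logarithmic portfolio $\pi^{*}$ satisfies

$$X_n^{\pi^{*}(\nu)} = \frac{dP^{\nu}|_{\mathcal{Y}_n}}{dR^{\mu}} \qquad \forall \nu \in V$$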
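The identity in Proposition 1 can be verified by hand in the simplest special case, sketched below as a numerical check. The assumptions here go beyond the text: a single hidden state, no size information, and a constant up-probability `p_up`. Under the reference measure the up move has mass $(1-d)/(u-d)$ and the down move has mass $(u-1)/(u-d)$, as in the definition of $\mu$. Function names are illustrative.

```python
from typing import Sequence


def terminal_wealth(moves: Sequence[float], p_up: float, d: float, u: float) -> float:
    """Wealth from (2) when the constant fraction from (7) is held."""
    pi = p_up / (1.0 - d) - (1.0 - p_up) / (u - 1.0)
    wealth = 1.0
    for v in moves:
        wealth *= 1.0 + pi * (v - 1.0)
    return wealth


def likelihood_ratio(moves: Sequence[float], p_up: float, d: float, u: float) -> float:
    """Density of the model against the reference measure on the same moves."""
    q_up = (1.0 - d) / (u - d)    # reference mass on the up move
    q_down = (u - 1.0) / (u - d)  # reference mass on the down move
    ratio = 1.0
    for v in moves:
        ratio *= p_up / q_up if v == u else (1.0 - p_up) / q_down
    return ratio


d, u = 0.9, 1.1
moves = [u, d, u, u]  # hypothetical sequence of sampled price moves
w = terminal_wealth(moves, 0.6, d, u)
r = likelihood_ratio(moves, 0.6, d, u)
```

Each wealth factor $1 + \pi^{*}(v - 1)$ equals the ratio of the model probability to the reference probability of the observed move, so the two products agree up to rounding.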

Let us comment on the crucial condition (17). Fix $\nu$ for the moment. Given $Z \in \mathcal{B}(\mathbb{R}^{d-1})$, we have $Y_{\xi}^{\nu}(\{d, u\} \times Z) = \int_{Z} \int_{\{d,u\}} B_{\xi,z}^{\nu}(d\phi)\, \rho(dz) = \rho(Z)$. This implies that the probability of the event $z_{k+1} \in Z$ does not depend on the hidden state $\xi_k$. Figure 1 illustrates by an urn and ball experiment a hidden Markov model satisfying (17). Here $\xi_k$ is the chosen urn at the time $k$, $v_{k+1}$ is the random ball color and $z_{k+1}$ is the random ball size from the urn $\xi_k$, and $\rho$ is interpreted as the ball size distribution, which is the same for each urn. Moreover, $B_{\xi,z}^{\nu}(d\phi)$ is the distribution of ball colour within urn $\xi$ and size group $z$. As explained above, (17) means that the observation of the ball size $z_{k+1}$ does not contain any information about which urn it is from, since the ball size distribution is the same for each urn. Still, it is useful for guessing the ball colour $v_{k+1}$. For example, suppose we know that $\xi_k$ is the right urn. Then, if the ball size $z_{k+1}$ is small, it is likely (with probability $7/8$; see Figure 1) that it is white.

Although the last proposition suggests modelling the price movements by a parameterized family of HMMs as specified in that proposition, it does not say anything about the parameterization. The crucial point here is an appropriate parameterization $\nu \mapsto B^{\nu}$. It is not easy to find a parameterization which is flexible enough and such that the EM-algorithm works. Currently, we know only one parameterization, which we present now.

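Let $S'$ be a set of orthogonal unit vectors of $\mathbb{R}^{|S'|}$, and assume that

$$(z_k)_{k=1}^{n} \text{ takes values in } U := \Big\{ z \in \mathbb{R}^{|S'|} \;\Big|\; \sum_{\xi' \in S'} \langle \xi', z \rangle = 1,\; \langle \xi', z \rangle > 0 \;\; \forall \xi' \in S' \Big\} \tag{18}$$

Further, suppose there exist stochastic kernels $V^{\nu}$ from $S \times S'$ to $\{d, u\}$, with $V_{\xi,\xi'}^{\nu}(d) \in \,]0, 1[$ for all $\nu \in V$ and $(\xi, \xi') \in S \times S'$, such that for all $\nu \in V$

$$B_{\xi,z}^{\nu} = \sum_{\xi' \in S'} \langle \xi', z \rangle\, V_{\xi,\xi'}^{\nu}, \qquad \forall \xi \in S,\; z \in U \tag{19}$$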
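In coordinates, (19) says that the colour distribution for a given urn is a mixture of the sub-urn distributions, weighted by the components of $z$. A minimal sketch for the down-probability follows; the function name `down_probability` is illustrative, and the numerical row is taken from the first row of the kernel $V$ reported in Section 5, combined with a hypothetical size vector.

```python
from typing import Sequence


def down_probability(z: Sequence[float], v_down_row: Sequence[float]) -> float:
    """Mixture (19) for one hidden state.

    z[c]          weight of size category c (a point of the set U in (18))
    v_down_row[c] probability of a down move in category c for this state
    """
    if len(z) != len(v_down_row):
        raise ValueError("z and v_down_row must have the same length")
    return sum(weight * prob for weight, prob in zip(z, v_down_row))


# Hypothetical size vector z = (0.25, 0.75) with the row (0.0807, 0.0640).
b_down = down_probability([0.25, 0.75], [0.0807, 0.0640])
```

Since the weights are positive and sum to one, the result always lies between the smallest and the largest entry of the row, so it stays inside $]0, 1[$ as (17) requires.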

Figure 1. The urn and ball experiment.

We may interpret (18) and (19) for the urn and ball experiment as follows: imagine that the balls are arranged into $|S'|$ size categories and each urn is divided into $|S'|$ sub-urns, each filled with balls of the corresponding category. Let us interpret the numbers $V_{\xi,\xi'}(\phi)$ as the probability that the ball from the sub-urn $\xi'$ of the urn $\xi$ is of the colour $\phi \in \{d, u\}$. Suppose that the size measurement is imprecise, that is, its result is a probability distribution $z \in U$ on the set $S'$ of size categories. In that case $B_{\xi,z}^{\nu}(\phi)$ is the probability that the ball from the urn $\xi \in S$, with size measurement result $z \in U$, is of the colour $\phi \in \{d, u\}$. The advantage of the stochastic kernel $Y^{\nu}$ from (17) satisfying (19) is that $V^{\nu}$ may be re-estimated. Unfortunately, this is not possible using a straightforward procedure. Therefore, we construct a tensor-like extension of the HMM $(\Omega, P^{\nu}, (\xi_k)_{k=0}^{n}, (y_k)_{k=1}^{n})$ with the state space $S$ and parameter $(p, X^{\nu}, Y^{\nu})$ satisfying (17) and (19) by introducing another HMM $(\tilde{\Omega}, \tilde{P}^{\nu}, (\tilde{\xi}_k)_{k=0}^{n}, (\tilde{y}_k)_{k=1}^{n})$ with the state space $\tilde{S} := S \times S'$. Its parameter $(\tilde{p}, \tilde{X}^{\nu}, \tilde{Y}^{\nu})$ is given as follows: put $\tilde{p} := p \otimes \varepsilon$, where $\varepsilon$ denotes the uniform distribution on $S'$, and define the kernel $\tilde{X}^{\nu}$ by $\tilde{X}_{\xi,\xi'}(\chi, \chi') = X_{\xi}(\chi)/|S'|$ for all $(\xi, \xi'), (\chi, \chi') \in \tilde{S}$. Then set $\tilde{Y}^{\nu}$ by

$$\tilde{Y}_{\xi,\xi'}^{\nu}(d(\phi, z)) = V_{\xi,\xi'}^{\nu}(d\phi) \otimes |S'|\, \langle \xi', z \rangle\, \lambda(dz), \qquad \forall (\xi, \xi') \in \tilde{S} \tag{20}$$

Here $\lambda$ stands for the Lebesgue measure on $U$ normalized by $\lambda(U) = 1$. Note that $\tilde{v}_{k+1}$ and $\tilde{z}_{k+1}$ are independent given the state $\tilde{\xi}_k$. This is seen from the factorization (20) of $\tilde{Y}_{\xi,\xi'}$. This extension is analogous to the urn and ball experiment with imprecise size measurement: each sub-urn is now considered as a proper urn. This explains the definition of $\tilde{S}$ and $\tilde{X}$. The kernel $\tilde{Y}$ and the representing measure $\tilde{\mu}$ are chosen such that the extended model reproduces the same Radon–Nikodym density in the sense of the following proposition from Reference [3].

Proposition 2

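Let $(p, X^{\nu}, Y^{\nu})$ and $(\tilde{p}, \tilde{X}^{\nu}, \tilde{Y}^{\nu})$ be as above, with corresponding HMMs $(\Omega, P^{\nu}, (\xi_k)_{k=0}^{n}, (y_k)_{k=1}^{n})$ and $(\tilde{\Omega}, \tilde{P}^{\nu}, (\tilde{\xi}_k)_{k=0}^{n}, (\tilde{y}_k)_{k=1}^{n})$. Then

$$\mu := \left(\frac{u-1}{u-d}\, \delta_d + \frac{1-d}{u-d}\, \delta_u\right) \otimes \rho, \qquad \tilde{\mu} := \left(\frac{u-1}{u-d}\, \delta_d + \frac{1-d}{u-d}\, \delta_u\right) \otimes \lambda \tag{21}$$

are representing measures with

$$\frac{d\tilde{P}^{\nu}|_{\tilde{\mathcal{Y}}_n}}{dR^{\tilde{\mu}}}(\tilde{\omega}) = \frac{dP^{\nu}|_{\mathcal{Y}_n}}{dR^{\mu}}(\omega) \quad \text{for all } \omega \in \Omega,\; \tilde{\omega} \in \tilde{\Omega} \text{ satisfying } (y_k(\omega) = \tilde{y}_k(\tilde{\omega}))_{k=1}^{n}$$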

4. CALCULATING THE OPTIMAL PORTFOLIO

In our framework, portfolio optimization consists of two stages. First, we determine the optimal HMM to describe the price behaviour. This requires parameter identification based on true historical data. Second, the optimal logarithmic portfolio is calculated within that model.

Step 1 (Parameter identification): Suppose that we are given so-called tick-by-tick historical data, that is, the data exhibit each contract with an exactly specified time and price for each transaction. Let us agree that the data are seen, approximately, as a continuous path. From that path, our observations $((\phi_k, z_k))_{k=1}^{n} \in (\{d, u\} \times U)^{n}$ are determined. Consider a family $(\Omega, (P^{\nu})_{\nu \in V}, (\xi_k)_{k=0}^{n}, (y_k)_{k=1}^{n})$, where each parameter set $(p, X^{\nu}, Y^{\nu})$ satisfies (17) and (19). From Proposition 2, we obtain the corresponding family $(\tilde{\Omega}, (\tilde{P}^{\nu})_{\nu \in V}, (\tilde{\xi}_k)_{k=0}^{n}, (\tilde{y}_k)_{k=1}^{n})$ with parameters $(\tilde{p}, \tilde{X}^{\nu}, \tilde{Y}^{\nu})$ for all $\nu \in V$. In both models, the observations are explained as a realization of the output process $(y_k(\omega) = (\phi_k, z_k) = \tilde{y}_k(\tilde{\omega}))_{k=1}^{n}$. Choose the representing measures $\mu$ and $\tilde{\mu}$ as in (21). The EM algorithm produces a parameter sequence $(\nu_i)_{i \in \mathbb{N}}$ which increases the densities $(dP^{\nu_i}/dR^{\mu})(\omega) = (d\tilde{P}^{\nu_i}/dR^{\tilde{\mu}})(\tilde{\omega})$. The passage from $(p, X^{\nu_i}, Y^{\nu_i})$ to $(p, X^{\nu_{i+1}}, Y^{\nu_{i+1}})$ is effected either by (a) the improvement of the transition kernel or by (b) the improvement of the output kernel. Usually, the steps (a) and (b) are repeated one after the other.

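(a) Here $Y^{\nu_{i+1}} := Y^{\nu_i}$, but the transition kernel is changed to

$$X_{\xi}^{\nu_{i+1}}(\chi) := \hat{J}_n^{\xi,\chi}(\omega) / \hat{O}_n^{\xi}(\omega) \quad \text{for all } \xi, \chi \in S$$

where $\hat{J}_n^{\xi,\chi}$, $\hat{O}_n^{\xi}$ are calculated within $(\Omega, P^{\nu_i}, (\xi_k)_{k=0}^{n}, (y_k)_{k=1}^{n})$ using (11), (13), (14) and (15).

(b) Here $X^{\nu_{i+1}} := X^{\nu_i}$, but to obtain $Y^{\nu_{i+1}}$, we replace $V_{\xi,\xi'}^{\nu_i}$ by

$$V_{\xi,\xi'}^{\nu_{i+1}}(\phi) := \hat{T}_n^{(\xi,\xi'),\phi}(\tilde{\omega}) / \hat{O}_n^{(\xi,\xi')}(\tilde{\omega}) \quad \text{for all } (\xi, \xi') \in \tilde{S},\; \phi \in \{d, u\}$$

where $\hat{T}_n^{(\xi,\xi'),\phi}$, $\hat{O}_n^{(\xi,\xi')}$ are determined within $(\tilde{\Omega}, \tilde{P}^{\nu_i}, (\tilde{\xi}_k)_{k=0}^{n}, (\tilde{y}_k)_{k=1}^{n})$ using (12)–(15).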
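Once the filtered jump counts are available, the update in step (a) is a row normalization, because by (15) the occupation time of a state is the sum of its jump counts. The sketch below assumes the matrix of filtered jump counts is already computed (the recursion (11) itself is not implemented here), and the function name `reestimate_transition` is illustrative.

```python
from typing import List, Sequence


def reestimate_transition(j_hat: Sequence[Sequence[float]]) -> List[List[float]]:
    """Step (a): divide each filtered jump count by the occupation time.

    j_hat[i][j] is the filtered number of jumps from state i to state j;
    the occupation time of state i is the row sum, as in (15).
    """
    new_kernel = []
    for row in j_hat:
        occupation = sum(row)
        if occupation <= 0.0:
            raise ValueError("state with zero filtered occupation time")
        new_kernel.append([count / occupation for count in row])
    return new_kernel


# Hypothetical filtered jump counts for a two-state chain.
x_new = reestimate_transition([[3.0, 1.0], [2.0, 2.0]])
```

Each row of the result sums to one, so the output is again a stochastic kernel. Step (b) has the same structure, with the step numbers from (12) in place of the jump counts.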

Step 2 (Portfolio calculation)

The parameter identification yields the HMM with parameter set $(p, X, Y)$ satisfying (17) and (19). The optimal logarithmic portfolio is calculated using (7). The conditional probabilities needed are calculated in [3] as

$$P(v_{k+1} = \phi \mid \mathcal{I}_k) = \langle (B_{\xi, z_{k+1}}(\phi))_{\xi \in S},\; \hat{\xi}_k \rangle \qquad \forall \phi \in \{d, u\},\; k = 0, \ldots, n-1$$

where the filtered state $\hat{\xi}_k$ is determined recursively by using (10), (13), and (14).
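Step 2 combines the filtered state, the mixture (19) and formula (7). The following sketch shows one way to put these together for a single time step; the function name `portfolio_weight` and all numerical inputs are hypothetical, and the filtered state is assumed to be supplied by a separate filtering routine.

```python
from typing import Sequence


def portfolio_weight(xi_hat: Sequence[float],
                     z_next: Sequence[float],
                     v_down: Sequence[Sequence[float]],
                     d: float,
                     u: float) -> float:
    """Optimal fraction (7) from the filtered state and the next size vector.

    xi_hat[i]    filtered probability of hidden state i
    z_next[c]    weight of size category c in z_{k+1}
    v_down[i][c] probability of a down move for state i and category c
    """
    # P(v_{k+1} = d | I_k) = < (B_{xi, z_{k+1}}(d))_xi , xi_hat >, with B from (19).
    p_down = sum(
        state_prob * sum(w * q for w, q in zip(z_next, row))
        for state_prob, row in zip(xi_hat, v_down)
    )
    p_up = 1.0 - p_down
    return p_up / (1.0 - d) - p_down / (u - 1.0)


# Hypothetical kernel with three states and two size categories.
kernel = [[0.2, 0.4], [0.5, 0.5], [0.9, 0.9]]
long_weight = portfolio_weight([1.0, 0.0, 0.0], [1.0, 0.0], kernel, d=0.9, u=1.1)
flat_weight = portfolio_weight([0.0, 1.0, 0.0], [0.5, 0.5], kernel, d=0.9, u=1.1)
```

With symmetric bounds and equal up and down probabilities the weight is zero, and it grows in magnitude as the bounds $d$ and $u$ move closer to 1.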

5. EXAMPLE

Let us examine how the portfolio optimization works by applying it to the Dow–Jones Industrial Average Index with sampling bounds $u := 1.0008$ and $d := u^{-1}$. In reality, the index price is not continuous, but the stopping times (1) are well defined. In our example, we chose the two-dimensional process $(z_k)_{k=1}^{n}$ to carry information about the number of ticks between two stopping times. Intuitively, $z_{k+1}$ should specify the rapidity of the last price change from $S(t_{k-1})$ to $S(t_k)$ compared to the previous changes, so we use a ranking procedure. Let $N_k$ be the number of ticks within $[t_{k-1}, t_k]$. We then define the process $(z_k)_{k=1}^{n}$ as in (18), taking values in the set of two-dimensional stochastic vectors, by

$$z_{k+1}^{1}(\omega) = r^{-1}\, |\{ j : N_k(\omega) > N_{k-j}(\omega),\; j = 1, \ldots, r \}|, \qquad z_{k+1}^{2}(\omega) = 1 - z_{k+1}^{1}(\omega) \qquad \forall \omega \in \Omega$$

where the rank size $r$ is set to $r := 20$. We sample the tick-by-tick data of two years from 1 January 1999 to 31 December 2000 to obtain $n = 41\,376$ observations $(v_k(\omega), z_k(\omega))_{k=1}^{n}$. A hidden Markov model with three states and uniform initial distribution was fitted to this data. After 50 iterations of the EM-algorithm we obtained the kernels

$$(X_{\xi}(\chi))_{\xi,\chi \in S} = \begin{bmatrix} 0.5829 & 0.2449 & 0.1722 \\ 0.3632 & 0.2715 & 0.3653 \\ 0.1789 & 0.2321 & 0.5890 \end{bmatrix}, \qquad (V_{\xi,\xi'}(d))_{(\xi,\xi') \in \tilde{S}} = \begin{bmatrix} 0.0807 & 0.0640 \\ 0.4834 & 0.5048 \\ 0.9139 & 0.9352 \end{bmatrix}$$
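The ranking procedure that defines $z_{k+1}$ from the tick counts can be sketched as follows. The function name `rank_feature` and the sample tick counts are illustrative assumptions; the text does not say how the first $r$ steps, for which fewer than $r$ earlier counts exist, are treated, so the sketch simply rejects them.

```python
from typing import Sequence, Tuple


def rank_feature(tick_counts: Sequence[int], k: int, r: int) -> Tuple[float, float]:
    """Return (z1, z2) for step k + 1 from the tick counts N_0, ..., N_k.

    z1 is the share of the r preceding counts N_{k-1}, ..., N_{k-r}
    that are strictly smaller than N_k, and z2 = 1 - z1.
    """
    if r <= 0 or k - r < 0:
        raise ValueError("need r >= 1 and at least r earlier tick counts")
    smaller = sum(1 for j in range(1, r + 1) if tick_counts[k] > tick_counts[k - j])
    z1 = smaller / r
    return z1, 1.0 - z1


# Hypothetical tick counts; the paper uses r = 20, here r = 4 for brevity.
z = rank_feature([5, 3, 8, 2, 7], k=4, r=4)
```

With the strict inequality, `z1` can equal 0 or 1, which lies on the boundary of the set $U$ in (18); the text does not state how such cases are handled.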

This model was run on a set of test data $(v_k(\omega'), z_k(\omega'))_{k=1}^{2851}$, obtained by sampling two months of the Dow–Jones Index from July 2 to August 31, 2001. Figure 2 shows for this case the index values and the wealth of the optimal portfolio, which is plotted on a logarithmic scale (hence, it starts at $0 = \ln(1)$). This calculation shows that the optimal logarithmic portfolio increases the wealth. However, it is a theoretical portfolio since the index itself is not tradable. Moreover, each practical implementation has to take into account bid-ask spreads, transaction costs, limited market liquidity, and non-divisibility of asset units. All these limitations are not considered here. Our approach shows rather that the rich and interesting structure of high-frequency financial data (as exposed in Reference [6]) can be recognized by using hidden Markov theory. Further experiments show that taking more than three hidden states does not increase the portfolio performance. Hence, a low-dimensional HMM, calibrated by portfolio efficient parameter estimation, may provide a reasonable statistical model for high-frequency financial data.

ACKNOWLEDGEMENTS

The authors wish to thank an anonymous referee for careful reading with many useful comments which we have utilized.

Figure 2. The Dow–Jones index and the optimal wealth.



REFERENCES

1. Elliott RJ, Aggoun L, Moore JB. Hidden Markov Models. Estimation and Control. Springer: New York, Berlin, Heidelberg, 1995.
2. Elliott RJ, Hinz J. Portfolio optimization, hidden Markov models, and technical analysis of P&F-Charts. International Journal of Theoretical and Applied Finance, to appear.
3. Elliott RJ, Hinz J. On portfolio efficiency of data modeling. Dresdner Schriften zur Mathematischen Stochastik 2002; 1.
4. Korn R. Optimal Portfolios. World Scientific: Singapore, 1997.
5. Rabiner LR. A tutorial on hidden Markov Models and selected applications in speech recognition. Proceedings of the IEEE 1989; 77(2).
6. Dacorogna M, Gencay R, Muller U, Olsen R, Pictet O. An Introduction to High Frequency Finance. Academic Press: New York, 2001.
7. Elliott RJ. Stochastic Calculus and Applications. Springer: Berlin, 1982.
8. Elliott RJ, Hunter WC, Jamieson BM. Drift and volatility estimation in discrete time. Journal of Economics and Dynamics Control 1997; 22(2):209–218.
9. Elliott RJ, van der Hoek J. An application of hidden Markov models to asset allocation problems. Finance and Stochastics 1997; 1(3):229–238.
10. Elliott RJ. Exact adaptive filters for Markov chains observed in Gaussian noise. Automatica 1994; 30(9):1399–1408.
11. Elliott RJ, Rishel Raymond W. Estimating the implicit interest rate of a risky asset. Stochastic Processes and Applications 1994; 49(2):199–206.
12. Juang BH, Rabiner LR. Hidden Markov models for speech recognition. Technometrics 1991; 33(3):251–272.
13. MacDonald IL, Zucchini W. Hidden Markov and Other Models for Discrete-Valued Time Series. Monographs on Statistics and Applied Probability, vol. 70. Chapman & Hall: London, 1997.


