Portfolio Management with Cost
Model using Multi Objective
Genetic Algorithms
A Thesis Submitted to
The Department of Frontier Informatics
by
Claus Aranha
56343
In Partial Fulfillment of the Requirements for the Degree
Master of Science
Supervisor: Hitoshi Iba
Graduate School of Frontier Sciences
The University of Tokyo
August 2007
Abstract
In this work, we study in depth the problem of Portfolio Optimization, and the
application of Genetic Algorithms to solve it. We discuss the limitations of current
approaches, which take into consideration neither multiple scenarios nor transaction
costs, and propose a modification of the Genetic Algorithm System for Portfolio
Optimization to address these issues.
A Financial Portfolio is a strategy of making investments in a number of
different applications, instead of focusing on a single security. The idea behind the
creation of a Financial Portfolio is that by investing in many different assets, the
investor can reduce the specific risk of each asset, while maintaining a target
return rate. Markowitz, in 1952, proposed a formal model for this strategy, which
is called the Modern Portfolio Theory (MPT).
Nowadays, there are a number of optimization techniques that propose to solve
the MPT's equations in order to find the risk/return optimal portfolio for any
given market. However, while the Markowitz model has a very elegant mathematical
construction, when we remove the restrictions that do not hold in the real world,
it becomes a very hard optimization problem.
A number of numerical techniques have been developed to solve this problem.
Among them, we highlight Genetic Algorithms (GA), random search
heuristics that have proved very appropriate for such large
problems. Current GA-based approaches, however, still tackle a limited version of
the problem, without complications like transaction costs.
In this work we extend the current GA algorithms for Portfolio Optimization,
aiming at a system that is able to overcome some of the shortcomings of its
predecessors. More specifically, we wish to address the modeling of a transaction
cost measure, and the influence of market changes over time.
To address the above questions, we develop two new techniques: Objective Sharing
and Seeding. We test these new techniques in experiments
using historical data from the NIKKEI and the NASDAQ market indexes.
From the results of our experiments we can see that it is possible to get consis-
tently high returns while reducing the distance of the optimal portfolios between
scenarios. It becomes apparent that our approach is a fruitful one, and can be used
to build models that are closer to the real world than other current techniques.
Acknowledgements
I'd like to thank Professor Hitoshi Iba and the Japanese Ministry of Education for
allowing me the opportunity of conducting my studies at the University of Tokyo.
This was a unique period of my academic life, in which I enjoyed many chances to
broaden my horizons.
Also, during these two and a half years, I always had the support of all my
fellow colleagues at the IBA laboratory. In particular, I would like to thank Nasimul
Noman, Makoto Iwashita and Yoshihiko Hasegawa for the many suggestions
they gave me regarding my work. Be assured that I took every last one of them
into consideration. I would also like to thank Daichi Ando and Shotaro Kamio for
all the practical guidance during my first months in Japan. Finally, I would like
to extend very special thanks to Toshihiko Yanase, who gave me lots of support,
both technical and moral, and made me feel less of a foreigner in the laboratory.
Outside academia, I would like to thank the many friends I met
in Japan, who made this period much more interesting. Among them I would
like to mention the Fabulous Four of Kashiwa Campus (Chaminda, Pamela and
Ahmet), Mrs. Saito and her friends in the graduate affairs division, and the
many Brazilian exchange students that I met through ASEBEX.
Last but not least, I would like to thank my parents for all the support they gave
me, and Marilia for all her patience and love; she was always quick to remind me
that reading Slashdot when I was supposed to be writing my dissertation is not
procrastination, but relaxation :-).
Kashiwanoha 5-1-5, Kashiwa-shi, Chiba 277-8561 Claus Aranha
August 1st, 2007
Table of Contents
Abstract ii
Acknowledgements iii
Table of Contents iv
1 Introduction 1
1.1 Financial Portfolios . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Portfolio Optimization and Management . . . . . . . . . . . . . . . 3
1.3 Traditional Optimization Methods . . . . . . . . . . . . . . . . . . . 4
1.4 Genetic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Proposal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Outline of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Financial Background 8
2.1 Markowitz Modern Portfolio Theory . . . . . . . . . . . . . . . . . 9
2.1.1 Efficient Portfolios . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.2 Riskless Asset . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.3 Capital Asset Pricing Model . . . . . . . . . . . . . . . . . . 14
2.2 Financial Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.1 Return Estimation . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.2 Risk Estimation . . . . . . . . . . . . . . . . . . . . . . . . . 17
3 Evolutionary Algorithm-based Optimization 20
3.1 Genetic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 Selection Strategies . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.2 Crossover Strategies . . . . . . . . . . . . . . . . . . . . . . 24
3.1.3 Mutation Strategies . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Multi Objective Genetic Algorithms . . . . . . . . . . . . . . . . . . 26
3.2.1 Aggregate Methods . . . . . . . . . . . . . . . . . . . . . . . 28
3.2.2 Objective Sharing Methods . . . . . . . . . . . . . . . . . . 28
3.2.3 Pareto-based Methods . . . . . . . . . . . . . . . . . . . . . 30
4 Related Work 31
4.1 Review of GA based methods . . . . . . . . . . . . . . . . . . . . . 31
4.1.1 Portfolio representation . . . . . . . . . . . . . . . . . . . . . 32
4.1.2 Asset selection . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.3 Market Model . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.4 Return Estimation . . . . . . . . . . . . . . . . . . . . . . . 34
4.1.5 Risk Estimation . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.1.6 Cost modeling . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.1.7 Fitness Functions . . . . . . . . . . . . . . . . . . . . . . . . 36
4.1.8 Evolutionary Strategies . . . . . . . . . . . . . . . . . . . . . 37
5 Multi-Objective Portfolio Management 38
5.1 Portfolio Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1.1 Portfolio Representation . . . . . . . . . . . . . . . . . . . . 39
5.1.2 Return Measure . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1.3 Risk Measure . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.1.4 Cost Measure . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.1.5 Assumptions of the model . . . . . . . . . . . . . . . . . . . 41
5.2 Evolutionary Optimization . . . . . . . . . . . . . . . . . . . . . . . 42
5.2.1 Genome Representation . . . . . . . . . . . . . . . . . . . . 42
5.2.2 Return Estimation . . . . . . . . . . . . . . . . . . . . . . . 44
5.2.3 Fitness Measures . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2.4 Evolutionary Strategy . . . . . . . . . . . . . . . . . . . . . 46
5.3 Portfolio Management . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.3.1 Seeding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.3.2 Objective Sharing . . . . . . . . . . . . . . . . . . . . . . . . 49
6 Experiments and Results 50
6.1 Data Used . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.1.1 NASDAQ 100 Index . . . . . . . . . . . . . . . . . . . . . . 51
6.1.2 NIKKEI 225 Index . . . . . . . . . . . . . . . . . . . . . . . 52
6.2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.2.1 Parameter Value . . . . . . . . . . . . . . . . . . . . . . . . 53
6.2.2 Effect of randomness in the results . . . . . . . . . . . . . . 54
6.2.3 Comparison and validation . . . . . . . . . . . . . . . . . . . 55
6.3 Parameter Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . 56
6.3.1 Objective Sharing Parameter analysis . . . . . . . . . . . . . 57
6.3.2 Time Length parameter analysis . . . . . . . . . . . . . . . . 58
6.3.3 Legacy Size parameter analysis . . . . . . . . . . . . . . . . 60
6.4 Full Length Returns experiment . . . . . . . . . . . . . . . . . . . . 62
6.5 Experiment Result Analysis . . . . . . . . . . . . . . . . . . . . . . 67
7 Conclusion 69
7.1 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
7.1.1 Cost Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
7.1.2 Representation . . . . . . . . . . . . . . . . . . . . . . . . . 71
Bibliography 72
Publication List 76
List of Figures
2.1 The feasible and efficient sets of portfolios, according to the MPTs
mean-variance model. . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Possible combinations between a riskless asset and an efficient port-
folio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1 A simple description of the Evolutionary Algorithm framework . . . 21
3.2 Example of k-points crossover with k = 2 . . . . . . . . . . . . . . . 25
3.3 The Pareto Optimal Set for an hypothetical problem . . . . . . . . 27
3.4 Differences between Objective Sharing approaches . . . . . . . . . . 29
5.1 Generating a portfolio from its genomic representation. . . . . . . . 44
5.2 Description of the Seeding method . . . . . . . . . . . . . . . . . . 48
6.1 Performance of the NASDAQ composite Index. . . . . . . . . . . . 52
6.2 Performance of the NIKKEI Index. . . . . . . . . . . . . . . . . . . 53
6.3 Total return as a function of time length for NASDAQ data. . . . . 59
6.4 Total return as a function of time length for NIKKEI data. . . . . . 59
6.5 Effect of Legacy Size on the NASDAQ dataset. . . . . . . . . . . . . 61
6.6 Effect of Legacy Size on the NIKKEI dataset. . . . . . . . . . . . . 61
6.7 Cumulative returns for the NASDAQ dataset . . . . . . . . . . . . 65
6.8 Distance values for the NASDAQ dataset . . . . . . . . . . . . . . . 65
6.9 Cumulative returns for the NIKKEI dataset . . . . . . . . . . . . . 66
6.10 Distance values for the NIKKEI dataset . . . . . . . . . . . . . . . 66
List of Tables
5.1 Parameters Used . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.1 Generic Parameters and Specific Parameters . . . . . . . . . . . . . 54
6.2 Values for Generic Parameters . . . . . . . . . . . . . . . . . . . . . 54
6.3 Sharpe Ratio and Distance as a function of Pos parameter. . . . . . 58
6.4 Results of Full Length Experiment - NASDAQ Dataset . . . . . . . 63
6.5 Results of Full Length Experiment - NIKKEI Dataset . . . . . . . . 63
Chapter 1
Introduction
With the popularization of the Internet and personal computers over the past decade
and a half, the trading of financial instruments has left the realm of the financial
institutions and become much more accessible. Many people nowadays trade
stocks from different markets as their secondary (sometimes even primary) means
of income. Market simulations open to the public are also common, and allow
people without much knowledge of financial theory to get an intuitive grasp of the
workings of the market place.
In this environment, interest in financial research, and its development
outside the academic circles of economists and mathematicians, has also flourished.
Improved computing power allows greater amounts of data to be processed,
faster. This leads to the development of computational tools to aid the
financial trader: for instance, more accurate methods for the prediction of future
trends (Nikolaev and Iba, 2001), the simulation of market behavior (Subramanian
et al., 2006), and, more recently, the automation of simpler tasks in the financial
market (Jiang and Szeto (2003), Aranha et al. (2007)).
In this work, we address the problem of Financial Portfolio Optimization from
a Genetic Algorithm perspective. This is a problem that has gathered a lot of
attention in recent years, yet poses many open questions. In this dissertation we'll
give an overview of the Portfolio Optimization problem and the most relevant work
related to it, and then address some of those open problems.
1.1 Financial Portfolios
The word "portfolio" usually means "collection". Thus we can have the portfolio
of a company, meaning the different products that this company produces, or the
portfolio of an artist, which is the collection of his significant works.
In the financial sense, a portfolio is a collection of different investments. Since
any market activity involves a certain amount of risk, it is often a good idea not
to put all of your resources into a single security. Also, different securities (like
bonds, stocks, options, foreign currency, etc.) have different characteristics regarding
liquidity, dividends, and appreciation rate. Because of this, an investment
strategy must take into account the characteristics of each of the many securities,
and divide the available resources accordingly.
The idea behind a portfolio spread is that different securities may show different
behaviors over time, even when their perceived risk and return are the same. For
instance, if two securities belong to two competing industries, some external factor
that negatively affects one of them may positively affect the other. If we had
concentrated all of our capital in one of the two securities, there would be a 50%
chance that we picked the right security, and a 50% chance that we picked the
wrong one. On the other hand, sharing the capital between the two securities
would mean that the losses in one of them would be offset by the gains in the
other.
For the day trader or other short-term investors, it might be more rewarding to
take the risk of concentrating their capital in a few lucrative investments. These
investors work with many transactions of small value, hoping to profit from local
fluctuations of individual securities. However, for longer-term investors and funds,
it is better to reduce risk as much as possible (while maintaining desired return
rates), so that large investments may be used to reap long-term rising trends in the
market. Also, a well diversified portfolio reduces the exposure to unexpected
events in particular markets that may lead to crashes.
1.2 Portfolio Optimization and Management
Markowitz formulated the Modern Portfolio Theory (Markowitz, 1987), which says
that the risk of an investment can be diminished by an appropriate diversification
of the assets that compose it. This diversification reduces the specific risk, leaving
only the systematic risk.
Specific Risk is the risk associated with a single asset. It arises from factors
particular to that asset. For instance, a sudden bankruptcy of one company may
plummet the value of the stocks associated with that company; the chance of
this happening would be considered the specific risk of those assets. Portfolio
Optimization is capable of reducing Specific Risk.
Systematic Risk, on the other hand, is the risk associated with the whole
market. Sudden crashes or bubbles may dramatically raise or lower the values of
all assets in a market, beyond the predicted values. No amount of diversification
is capable of preventing Systematic Risk.
The first challenge to applying the Modern Portfolio Theory is to design a
proper model in which to calculate the optimum portfolio. This model must be
precise enough to reflect the behavior of the market in the real world. However, it
cannot be so complex that it is not possible to analyze it. Over the years, many
different models have been proposed in an attempt to strike a balance between
these two factors.
After the model has been devised, it is necessary to develop an optimization
routine for this model, and to decide how to apply the results in the real world.
The optimization method will return a weighting for every available asset, so that
the investor knows how to distribute his capital in order to maximize the expected
return and minimize the risk of his portfolio.
In short, the problem of Portfolio Optimization can be thought of as a Resource
Management problem. The capital is the limited resource that must be distributed
between the available assets, in order to maximize the objective function.
Portfolio Management is an extension of the Portfolio Optimization problem.
It is defined as the task of achieving a desirable risk/return ratio on a dynamic
market in a continuous fashion, as opposed to the one-scenario nature of Portfolio
Optimization.
There are two added difficulties in the Portfolio Management problem. The
first one is that, as the market is dynamic, its properties change with time. The
risk and return values and characteristics of each asset are not constant, so
the method used to optimize the portfolio must be able to adapt itself to the
changing characteristics of the market. We find that Genetic Algorithms are quite
appropriate for this aspect of the task.
The second difficulty is the issue of cost. If the optimal portfolio in one scenario
is too different from the optimal portfolios in the scenarios immediately before and
after it, changing back and forth between optimal portfolios might be more costly
than picking a solution which is sub-optimal in some scenarios, but positive across all of them.
1.3 Traditional Optimization Methods
The model described by Markowitz's Modern Portfolio Theory has already been
solved in theory. Following that model, it can be proved that the optimal portfolio
is the Market Portfolio, which contains all assets available in the market, weighted
by capitalization.
However, this basic model has too many assumptions which do not hold in
practice. As we remove those assumptions, and add restrictions like cost, trading
limits, imperfect information, etc., the model becomes more and more complex,
and finding an optimal portfolio becomes harder.
More elaborate models of the problem have been solved by numerical optimization
methods, like quadratic programming or linear programming (ping Chen
et al., 2002). However, the biggest problem with numerical methods is that their
algorithms do not scale very well, and quickly become complex as the number
of assets grows.
For instance, optimization techniques for versions of the MPT require the correlation
values between all the assets at each iteration. This is computationally
very intensive, and grows exponentially with the size of the data.
In fact, reports on simulations of most Portfolio Optimization techniques limit
themselves to 10 or 20 assets at a time. The use of search heuristics is a possible
solution to this problem.
1.4 Genetic Algorithms
The Genetic Algorithm is a random search heuristic inspired by biological
evolution. The basic setup of a Genetic Algorithm is as follows:
Each possible solution to a problem is considered an individual. A population
of random individuals is generated, and each solution is tested for its goodness.
The best solutions in the population are mixed together, generating new individ-
uals. The worst solutions are removed from the population.
The heuristic behind Genetic Algorithms is that by mixing the best solutions,
we can direct the population to explore better regions of the search space. Genetic
Algorithms are a mature technique that has been researched for more than 20 years,
with practical applications in many fields.
One of the good characteristics of Genetic Algorithms is that they are able
to adapt to changes to the problem they are trying to solve. If the environment
changes (for example, if the data values change), the algorithm is able to find
a new optimal solution without any changes to its population or parameters, by
executing a few more iterations.
This makes Genetic Algorithms especially fit for problems where a solution must
be continually optimized in an evolving environment, like the Portfolio Management
problem we have described earlier.
1.5 Proposal
In this work, we study the Portfolio Optimization and the Portfolio Management
problem, and the research field of using the GA heuristic to solve those problems.
There have been many works in recent years that aim to use the power of Genetic
Algorithms to solve the Portfolio Optimization problem. However, each of these
works seems to approach the problem from a different angle, using GA in different
ways, and different models for the portfolio and the market.
In this dissertation we survey the many methods that have been developed, and
analyze the techniques proposed and the models assembled. From the results of this
analysis, we understand that the current methods for Portfolio Optimization lack
a proper approach to transaction costs, and to Portfolio Management.
We propose the use of the adaptive characteristic of GA to improve a simple
Portfolio Optimizer into a system able to manage a portfolio through multiple
scenarios. Our improvements consist mainly of two new ideas:
1. The insertion of successful individuals from previous scenarios into the current
generation; and
2. The use of a secondary goodness metric: the distance between two portfolios,
which we use to measure transaction costs.
In this work we explain both approaches in detail, and we perform experiments
to assess their effectiveness.
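To illustrate the second idea, one simple way to quantify the distance between two portfolios is the total turnover between their weight vectors. The thesis defines its own distance measure later (Chapter 5); the L1 form below is only an assumed example of how such a metric can stand in for transaction cost:

```python
def turnover_distance(old_w, new_w):
    """Sum of absolute weight changes between two portfolios: each unit
    of turnover implies buying or selling that fraction of the capital.
    (An assumed L1 example; the thesis's own measure is defined later.)"""
    assert abs(sum(old_w) - 1.0) < 1e-9 and abs(sum(new_w) - 1.0) < 1e-9
    return sum(abs(a - b) for a, b in zip(old_w, new_w))

# Rebalancing from an even four-asset split into a concentrated position:
d = turnover_distance([0.25, 0.25, 0.25, 0.25], [0.70, 0.10, 0.10, 0.10])
```

A distance of zero means no rebalancing (and hence no cost) is needed between scenarios, which is exactly what the secondary objective rewards.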
1.6 Outline of the Thesis
Chapter 2 explains the basic financial concepts necessary for the development of
this work. In this chapter we discuss our choice of portfolio model,
and define the calculations of risk, return, and transaction cost. We offer some
information on other possible models, and on some of the traditional approaches
to the portfolio optimization problem. The goal of this chapter is
to introduce all the mathematics necessary for this work to readers without a
financial background.
Chapter 3 gives a brief introduction to genetic algorithms, and then explains
how they can be used to solve optimization problems. We'll also discuss the
concept of Multi-Objective evolutionary algorithms, which can be used to solve
problems with more than one competing objective. The goal of this
chapter is to explain the concept of Genetic Algorithms for Optimization Problems
to readers without this familiarity.
Chapter 4 presents previous works from other researchers that take a similar
approach to ours. These works have influenced our current method, and
we discuss the more important ones in detail, pointing out their strengths, and
where they could be improved.
Chapter 5 explains our approach to portfolio management by use of GA. Here
we explain our model in detail, and how it is implemented in the system.
We show what the parameters of the system are, and describe how the values
of these parameters were decided. We describe the difference between using Genetic
Algorithms for Optimizing a portfolio and for Managing a portfolio, and argue
for the superiority of the latter method.
Chapter 6 shows the experiments that were performed to validate our proposal.
We have executed a simulation with real world historical data from Japanese and
American markets, and observed good results with the proposed method. The
methodology for the experiments is explained, and the results are shown and
discussed.
Chapter 7 proposes a reflection on the field of GA-based portfolio optimization
based on the results of our work. We explain how the information acquired here
can be useful to the field, both as a basis for applications and as insight for future
experiments.
Chapter 2
Financial Background
Financial Engineering is the application of economic theories to real-world
investment problems. Economic models have been developed describing the behavior
of many different kinds of securities under varied conditions. These models
propose elegant solutions that generate optimal strategies for investment in many
problems.
For instance, Markowitz's Modern Portfolio Theory (MPT) (Markowitz (1952) and
Markowitz (1987)) describes the behavior of investors when many assets
with different risks and returns are available. The main result is that the optimal
strategy is to invest in the market portfolio, which is composed of all available
assets, weighted by capitalization. In order to achieve higher returns or lower
risks, the market portfolio may be combined with a theoretical riskless asset.
From the MPT, the Capital Asset Pricing Model (CAPM) can be derived. The
CAPM is a model that allows the evaluation of assets and portfolios with relation
to the market portfolio.
Also, the MPT and the CAPM rely on knowledge of the market rates of return
and risk for all the assets. That kind of information, however, is not so readily
available. While risk as deviation is a well-known measure, there are
other risk models which can also be used. The choice of model
depends on which characteristics of risk the investor finds more important.
Return, as well, can be estimated in many different ways. The straightforward
moving average was used in the elaboration of the MPT, but we could use
fancier moving averages, like the weighted moving average or the geometric moving
average, or elaborate schemes based on evolutionary computing, in order to obtain
a more precise figure for the estimated return.
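As a small illustration of the difference between such estimators, the sketch below computes a plain moving average of returns and a linearly weighted variant over a price series. The linear 1..k weighting is an assumed example, not the thesis's choice:

```python
def returns(prices):
    """Period-by-period rates of return from a price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def simple_ma(prices, k):
    """Plain k-period moving average of the most recent returns."""
    window = returns(prices)[-k:]
    return sum(window) / len(window)

def weighted_ma(prices, k):
    """Linearly weighted variant: recent returns count more (weights
    1..k).  An assumed illustration of the fancier averages mentioned."""
    window = returns(prices)[-k:]
    weights = range(1, len(window) + 1)
    return sum(wt * r for wt, r in zip(weights, window)) / sum(weights)
```

On a price series that was flat and then rose, the weighted average reacts more strongly to the recent rise than the simple one, which is the whole point of the more elaborate estimators.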
In this chapter we will introduce the basics of Markowitz's MPT and the CAPM,
and then discuss some risk and return models.
Both the MPT and the CAPM make strong assumptions about the model.
Among those are: the investors have perfect information, they react perfectly to
market changes, and risk and return are the only relevant characteristics of an asset.
Also, trading costs and upper and lower bounds on trade volume are not included in
the model. By relaxing these assumptions, the solutions proposed in these models
become much harder computationally.
2.1 Markowitz Modern Portfolio Theory
A financial asset can be represented in terms of its return, and its risk.
The return of an asset is the relative change of its value over time - it indicates
whether the price of the asset has gone up (positive return), or down (negative
return). Financial applications are usually based on the idea of the Estimated
return of an asset. If the estimated return for a certain point in the future is high,
it is profitable to buy the asset now, to sell it at a higher value later.
The risk of an asset is the chance that the actual value of the return in the
future will be different from the expected return. In the Modern Portfolio Theory,
we use the variance of the rate of return as the measure of risk for one asset.
We model the market by defining $n$ tradeable assets. The rate of return of
each asset is denoted $r_n$, and the expected return rate is $E[r_n]$. The error for
each asset is defined as $\epsilon_n$.
A portfolio $W$ is a weighted composition with $n$ weights $(w_1, w_2, \ldots, w_n)$. Each
weight $w_i$ represents how much of the available capital will be allocated to asset
$i$. There are usually two restrictions on the values of $w_i$:

1. $\sum_{i=1}^{n} w_i = 1$

2. $w_i \geq 0$
The second restriction means that short-selling is not allowed. Short-selling
means selling assets that you have borrowed. It can be shown that the Modern
Portfolio models with and without short-selling are equivalent (Yuh-Dauh-Lyu,
2002), so we'll add this restriction for the sake of simplicity.
In this model, the return of the portfolio $W$ is given as the weighted sum of
the assets composing it:

$$r_W = \sum_{i=1}^{n} w_i r_i \qquad (2.1.1)$$

And the variance (risk) of the portfolio is calculated from the correlation of the risks
of the composing assets:

$$\sigma_W = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \sigma_{ij} \qquad (2.1.2)$$

Where $\sigma_{ii} = \sigma_i^2$ is the variance of $r_i$, and $\sigma_{ij}$ is the covariance between $r_i$ and $r_j$.
The calculation of the portfolio risk in equation 2.1.2 can then be
divided in two parts, one containing the variances of the returns, and another
containing the covariances of the returns:

$$\sigma_W = \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \sum_{i \neq j} w_i w_j \sigma_{ij} \qquad (2.1.3)$$
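Equations 2.1.1 through 2.1.3 can be checked numerically. The sketch below uses three hypothetical assets whose returns, volatilities and correlations are invented purely for illustration:

```python
# Three hypothetical assets (all figures are assumed examples):
w     = [0.5, 0.3, 0.2]          # portfolio weights, summing to 1
r     = [0.08, 0.12, 0.05]       # expected returns E[r_i]
sigma = [0.15, 0.25, 0.10]       # standard deviations sigma_i
corr  = [[1.0, 0.3, 0.1],
         [0.3, 1.0, 0.2],
         [0.1, 0.2, 1.0]]        # correlation matrix rho_ij

n = len(w)
# sigma_ij = rho_ij * sigma_i * sigma_j, so that sigma_ii = sigma_i^2
cov = [[corr[i][j] * sigma[i] * sigma[j] for j in range(n)] for i in range(n)]

# Equation (2.1.1): portfolio return as a weighted sum
r_W = sum(w[i] * r[i] for i in range(n))

# Equation (2.1.2): portfolio variance over all asset pairs
var_W = sum(w[i] * w[j] * cov[i][j] for i in range(n) for j in range(n))

# Equation (2.1.3): the same variance, split into its two parts
specific = sum(w[i] ** 2 * sigma[i] ** 2 for i in range(n))
shared   = sum(w[i] * w[j] * cov[i][j]
               for i in range(n) for j in range(n) if i != j)
```

The split in the last two lines reproduces the decomposition of equation 2.1.3: `specific` is the diagonal (variance) part and `shared` the off-diagonal (covariance) part, and together they equal `var_W`.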
The first part of equation 2.1.3 describes the portfolio risk component com-
posed by the risk of individual assets. It is called the specific risk. If there is no
correlation between the assets in the portfolio (the second part of the equation is
0), we can show that the specific risk can be reduced by diversification (adding
more assets to the portfolio):
Proof: Diversification reduces specific risk. For a portfolio with $n$ assets, assume
that $w_i = 1/n$, and that $\sigma_{ij} = 0$ for every $i \neq j$. The risk of the portfolio is
then:

$$\sigma_W = \sum_i \frac{\sigma_i^2}{n^2}$$

While the mean of the risks is:

$$\bar{\sigma} = \sum_i \frac{\sigma_i}{n}$$

Make the risk of all assets equal to the highest risk, $\sigma_{max}$. Then as $n$ increases
(as we add assets to the portfolio), the portfolio's risk will decrease towards zero,
while the mean of the risks remains the same:

$$\sigma_W = \sum_i \frac{\sigma_{max}^2}{n^2} = \frac{n\,\sigma_{max}^2}{n^2} = \frac{\sigma_{max}^2}{n}, \qquad
\bar{\sigma} = \sum_i \frac{\sigma_{max}}{n} = \frac{n\,\sigma_{max}}{n} = \sigma_{max}$$

for increasing values of $n$.
The second part of equation 2.1.3 describes the portfolio risk component
composed of the risk shared by the assets. It is called systematic risk. If the covariance
is different from zero, then the systematic risk cannot be reduced below the average
covariance by adding more assets:
Proof: Systematic risk cannot be reduced by diversification. For a portfolio with
$n$ assets, assume that $w_i = 1/n$, $\sigma_{ii} = s^2$ and $\sigma_{ij} = a$ for every $i$ and $j$. Then
the risk of the portfolio becomes:

$$\sigma_W = \sum_i \frac{s^2}{n^2} + \sum_{i \neq j} \frac{a}{n^2}
= \frac{n s^2}{n^2} + \frac{n(n-1)\,a}{n^2}
= \frac{s^2}{n} + a - \frac{a}{n}
= a + \frac{s^2 - a}{n}$$

So even if we increase $n$ (the number of assets), the risk won't become less
than the average covariance, $a$.
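Both limits can be verified numerically from the closed form just derived, under the same equal-weight assumptions:

```python
def portfolio_risk(n, s2, a):
    """sigma_W for n equally weighted assets with common variance s2 and
    common pairwise covariance a, using the closed form derived above:
    sigma_W = n*s2/n^2 + n*(n-1)*a/n^2 = a + (s2 - a)/n."""
    return n * s2 / n**2 + n * (n - 1) * a / n**2

# With a = 0 (pure specific risk) the portfolio risk vanishes as n grows;
# with a > 0 it converges to the average covariance a, not to zero.
```

The assertions below confirm the two behaviors: uncorrelated risk shrinks toward zero, while correlated risk stays floored at `a` no matter how many assets are added.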
2.1.1 Efficient Portfolios
The MPT makes two assumptions about the behavior of investors which can be
used to decide the efficiency (optimality) of a portfolio.
Figure 2.1: The feasible and efficient sets of portfolios, according to the MPT's
mean-variance model.
The first one is that they only care about risk and return. This means that,
even if the composing assets are different, two portfolios with the same measures
of risk and return are equal in the MPT.
The second one is that the investors are risk-averse. This means that given
two portfolios with the same mean return, the one with smaller risk/variance will
be considered more efficient.
Based on these two assumptions, we can define an efficient (optimal) portfolio,
according to the MPT model. Consider a two dimensional mapping where the
horizontal axis denotes the risk associated with a given portfolio, and the vertical
axis denotes the mean of its return (see figure 2.1).
Each point inside the shaded area indicates one of the possible portfolio
configurations. We can assume that this region, which we will call the feasible set, is
solid and convex to the left (Yuh-Dauh-Lyu, 2002). If we pick an arbitrary point,
moving up along the Y axis from that point means finding another portfolio with
a higher return for the same risk rate. Conversely, if we move left from a given
point, we'll find a portfolio that has a lower risk for the same mean return.
In other words, portfolios located further to the left or to the top of
the diagram are considered better, in that they provide a higher return for
a lower risk. We say that a given risk/return combination represented in this way
is efficient if there is no other portfolio that offers a higher return for the same
amount of risk, or less risk for the same return.
We call the set of efficient combinations of risk/return in a scenario the efficient
frontier. In figure 2.1, this combination is the upper left border of the feasible set.
Portfolios whose representation falls on the efficient frontier are said to be efficient
portfolios.
2.1.2 Riskless Asset
Normally, it is not possible to design a portfolio with a better return/risk ratio
than that on the efficient frontier. However, when we add a riskless asset to the
model, this restriction may be lifted.
A riskless asset, as the name implies, is a security whose deviation from the mean
return is 0. While perfectly riskless assets are a theoretical construct, in practice
we can get similar results with very low risk assets, like government bonds.
The presence of the riskless asset allows the investor to combine it with a
portfolio at the efficient frontier. According to Markowitz's 2-fund theorem, the
combination of two portfolios is linear: this means that if we have a portfolio Wc
composed of two portfolios (Wc = (1 − α)W1 + αW2), its possible return and risk
are rc = (1 − α)r1 + αr2 and σc = (1 − α)σ1 + ασ2.
Figure 2.2 describes this relationship more clearly. As can be seen, by combin-
ing the riskless asset and a portfolio in the efficient frontier, it is possible to hold
portfolios outside the feasible set.
In particular, if we find the line from the riskless asset that is tangent to
the feasible set, we have a new efficient frontier. If holding short positions
(borrowing) in the riskless asset is allowed, this line may extend beyond the
tangent point, allowing for higher returns with similar risk amounts when
compared with the feasible set.
In this way, finding the efficient frontier becomes the problem of finding the
tangent portfolio between the riskless asset and the feasible set.
Figure 2.2: Possible combinations between a riskless asset and an efficient portfolio
2.1.3 Capital Asset Pricing Model
If we imagine that all the investors in a market are rational, and possess the same
information about the risk and return values from the assets, we can draw some
conclusions about the nature of the efficient portfolio.
From the previous subsection we know that the efficient portfolio is the one
located at the point where the line drawn from the riskless asset is tangent to
the feasible set. This line is composed of the points representing different
balances between the efficient portfolio and the riskless asset.
The slope of the line between the riskless asset and a portfolio is given by:
Sr = (rp − rf) / σp (2.1.4)
where rp and σp are the portfolio's mean return and risk, respectively, and rf is
the riskless asset's rate of return. This value tells us the rate at which the return
rises when we take on more risk. In other words, it can be thought of as the price
of risk.
This rate is also known as the Sharpe Ratio. It can be used as an analysis tool
to determine how good a portfolio is, since it measures the amount of risk and
return that can be traded by balancing the portfolio with the riskless asset (Lin
and Gen (2007), Subramanian et al. (2006), Yan and Clack (2006)).
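As a concrete illustration, the Sharpe Ratio of eq. 2.1.4 takes only a few lines of code. This is a minimal sketch; the function name and example inputs are our own, not part of the cited works:

```python
def sharpe_ratio(mean_return, risk, riskless_rate):
    """Slope of the line from the riskless asset to the portfolio (eq. 2.1.4)."""
    return (mean_return - riskless_rate) / risk

# A portfolio returning 10% with risk 0.20, against a 2% riskless rate,
# earns roughly 0.4 units of excess return per unit of risk.
score = sharpe_ratio(0.10, 0.20, 0.02)
```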
By definition, the positions located on the efficient frontier hold the best risk/return
ratios. Consequently, we can assume that all the investors in the market will want
to hold only some combination of the riskless asset and the efficient portfolio. Since
all assets in the market must be held by somebody, a consequence of the MPT is
that the efficient portfolio is composed of all the assets available in the market,
weighted by their capitalization. We'll call this portfolio the market portfolio.
Using this model, we can also identify the contribution of each individual
asset i to the risk of the market portfolio.
βi = σi,M / σM² (2.1.5)
where σM² is the risk of the market portfolio, and σi,M is the covariance between
asset i and the market portfolio. This beta can be extended to express the risk
relationship between any portfolio and the market portfolio:
βp = Σi wi·βi (2.1.6)
Beta is another CAPM indicator that can be used to measure the risk/return
efficiency of a portfolio (Lipinski et al., 2007).
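Eqs. 2.1.5 and 2.1.6 can be sketched directly from their definitions. The helper names below are our own, and the sample-covariance convention (dividing by n − 1) is an assumption consistent with eq. 2.2.10 later in this chapter:

```python
def asset_beta(asset_returns, market_returns):
    """Beta of one asset: covariance with the market over market variance (eq. 2.1.5)."""
    n = len(asset_returns)
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns)) / (n - 1)
    var_m = sum((m - mean_m) ** 2 for m in market_returns) / (n - 1)
    return cov / var_m

def portfolio_beta(weights, betas):
    """Beta of a portfolio: weighted sum of its assets' betas (eq. 2.1.6)."""
    return sum(w * b for w, b in zip(weights, betas))
```

As a sanity check, the market portfolio's beta against itself is 1 by construction.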
2.2 Financial Engineering
In order to extract effective predictions from models like the MPT, we need nu-
merical methods to extract information from the data. We'll call the trading and
financial decision systems based on such methods Technical Trading.
In the Portfolio Optimization problem, two important technical measures are
the Return Estimation and the Risk Estimation. The return estimation must
be made based on the past values of the return. A precise calculation of this
estimation will allow the model to make more realistic predictions.
The risk estimation is based on the past variance of the asset, and represents
how reliable an asset is with relation to its estimated return. There are different
methods of calculating this measure, which carry with them different beliefs about
the nature of the risk.
2.2.1 Return Estimation
Moving averages are often used in technical analysis as a tool to analyze time
series data.
Moving average series can be calculated for any time series. In finance, it
is often applied to estimate future returns or trading volumes. Moving averages
smooth out short-term fluctuations, thus highlighting longer-term trends or cycles.
The threshold between short-term and long-term depends on the application, so
that the size of the moving average must be set on a case-by-case basis.
A simple moving average is the unweighted mean of the previous n data points.
For instance, if rt−1, rt−2, ..., rt−n are the values of the returns for the last n
periods, the simple moving average of these returns would be
SMAt = (rt−1 + rt−2 + ... + rt−n) / n (2.2.1)
A good characteristic of moving averages in general is that it is very simple, given
the value of the moving average for a point in time, to calculate the moving average
for the next point in time.
SMAt+1 = SMAt + (rt − rt−n) / n (2.2.2)
A problem with the simple moving average is that it introduces some lag
into the estimation, due to its use of past data points. This lag will
be greater the more the earlier data points diverge from current data.
One alternative to reduce this problem is to add weights to the Moving Aver-
ages. For example, the weighted moving average adds linear weights to the sum,
where more recent values have the highest weight, and older values have smaller
weights:
WMAt = (n·rt−1 + (n−1)·rt−2 + ... + 2·rt−n+1 + rt−n) / (n + (n−1) + ... + 2 + 1) (2.2.3)
However, this weighting strategy makes calculating the next moving average a bit
more complicated.
Tt+1 = Tt + rt − rt−n (2.2.4)
Numt+1 = Numt + n·rt − Tt (2.2.5)
WMAt+1 = Numt+1 / (n + (n−1) + ... + 2 + 1) (2.2.6)
where Tt is the running sum of the last n returns and Numt is the weighted
numerator of eq. 2.2.3.
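The direct form (eq. 2.2.3) and the incremental update (eqs. 2.2.4–2.2.6) can be sketched as below; the names `wma` and `wma_step` are our own:

```python
def wma(returns, n):
    """Linearly weighted moving average (eq. 2.2.3): newest value has weight n."""
    window = returns[-n:]                           # oldest ... newest
    numerator = sum((i + 1) * r for i, r in enumerate(window))
    return numerator / (n * (n + 1) // 2)           # n + (n-1) + ... + 1

def wma_step(numerator, total, newest, oldest, n):
    """One incremental update (eqs. 2.2.4-2.2.6); returns the new state."""
    total_next = total + newest - oldest            # eq. 2.2.4
    numerator_next = numerator + n * newest - total # eq. 2.2.5
    return numerator_next, total_next, numerator_next / (n * (n + 1) // 2)
```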
Another variant is the Volume Weighted Moving Average. In this version, we
weight each return based on the volume at that time. If vi is the volume at time
i, we have:
VWMAt = (vt−1rt−1 + vt−2rt−2 + ... + vt−nrt−n) / (vt−1 + vt−2 + ... + vt−n) (2.2.7)
2.2.2 Risk Estimation
Unlike the Return Estimation, the method chosen for the risk estimation
carries with it some suppositions about the model. By using different methods to
calculate the Risk, we are changing the underlying assumptions of the model.
In this way, the risk method must not be chosen purely according to the numer-
ical complexity of the calculations. The underlying ideas behind the risk definition
must also be carefully considered.
Variance
In the Modern Portfolio Theory, Markowitz uses the variance of the return of
an asset as its risk measure. The variance of a random variable, in this case,
the return over time of an asset, is the measure of how large the differences are
between the values of that variable.
In other words, if the Estimated Return of an asset is its Moving Average,
the Variance risk means how much error we can expect from this estimate. The
variance is defined as:
σ² = E[(R − E[R])²] (2.2.8)
which, in practice, can be computed for a limited data sample as:
σr² = (1 / (n − 1)) Σ(i = t−n to t−1) (ri − r̄)² (2.2.9)
where ri is the value of the return at time i, and r̄ is the average return of the
period (i.e., the moving average).
The way to calculate the Variance-risk of a portfolio is to use the weighted
co-variance between all the assets present in the portfolio. This is described in
eq. 2.1.2. The co-variance between two assets, r1 and r2, is calculated as follows:
Cov(r1, r2) = Σi (r1,i − r̄1)(r2,i − r̄2) / (N − 1) (2.2.10)
where r̄1 and r̄2 are the moving averages. This calculation increases in size with
the number of lines in the data, and must be done for every pair of assets in a
portfolio. So, for a portfolio with many assets, and a large historical dataset, the
calculation of the covariances becomes expensive quickly.
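A direct (unoptimized) sketch of this pairwise computation follows; the function names are our own, and `portfolio_variance` simply expands the double sum over all asset pairs from eq. 2.1.2:

```python
def covariance(x, y):
    """Sample covariance between two return series (eq. 2.2.10)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def portfolio_variance(weights, series):
    """Variance-risk of a portfolio: weighted covariance over all asset pairs."""
    return sum(wi * wj * covariance(si, sj)
               for wi, si in zip(weights, series)
               for wj, sj in zip(weights, series))
```

The nested loop makes the quadratic cost explicit: k assets require k² covariance evaluations, each linear in the length of the history.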
Value at Risk
Value at Risk (VaR) is another measure of risk used for financial assets and port-
folios. It gives a measure of how much of the value of the asset might decrease at
a certain probability over a certain time period.
To calculate the VaR, we need to draw a probability distribution function of
the possible return values of the portfolio over the next time period (1 day, 10
days, 1 month, etc.). From this probability distribution, we choose our target
probability, and find out how much value at most may be lost at that probability.
For example, consider a trading portfolio that is reported to have a 1-day
VaR of US$4 million at the 95% confidence level. This means that, under normal
market conditions, we can expect that, with a probability of 95%, a change in the
value of its portfolio would not result in a decrease of more than $4 million during
1 day, or, in other words, that, with a probability of 5%, the value of its portfolio
will decrease by $4 million or more during 1 day.
To calculate the return distribution, there are many different numerical meth-
ods. Some of the most common are the variance-covariance model, historical
simulation, and Monte Carlo simulation (Uludag et al., 2007).
The variance-covariance method starts from the assumption that the portfolio
returns follow some given distribution, such as the normal or Student's t. The distribution
and weights of the assets are calculated to find out the best-fitting parameters for
the distribution model. Once the values of the distribution model are found, the
loss for the desired probability is easily obtained.
The historical simulation method is more transparent, if a large amount of
historical data is available, but is also more computationally intensive. It involves
running the current portfolio across a set of historical price changes to yield a
distribution of changes in portfolio value, and computing a percentile (the VaR).
The benefits of this method are its simplicity to implement, and the fact that it
does not assume a normal distribution of asset returns.
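The historical-simulation percentile step can be sketched in a few lines. This is an illustrative simplification (no interpolation between order statistics), and the function name is our own:

```python
def historical_var(value_changes, confidence=0.95):
    """Historical-simulation VaR: the loss not exceeded with the given confidence.

    value_changes are hypothetical changes in portfolio value obtained by
    replaying past price moves against the current portfolio.
    """
    ordered = sorted(value_changes)                 # worst outcomes first
    cutoff = int((1 - confidence) * len(ordered))   # index of the percentile
    return -ordered[cutoff]                         # report loss as a positive number
```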
The Monte Carlo simulation consists of generating a large number of possible
scenarios based on the current data, and some market model. The varied results
from this simulation are ordered, and the lowest 5% (in a 95% VaR example) are
taken as the value at risk. It is then important to choose the model properly.
Monte Carlo simulation is generally used to compute VaR for portfolios con-
taining securities with non-linear returns (e.g. options), even though the
computational effort required is non-trivial. For portfolios without these complicated securities,
such as a portfolio of stocks, the variance-covariance method is perfectly suitable
and should probably be used instead.
Chapter 3
Evolutionary Algorithm-based
Optimization
Evolutionary Algorithms (EA) are a group of heuristics for random searches that
are inspired by the biological evolutionary process. They use the same mechanism
that is described in the theory of evolution: a population of individuals, through
crossover and mutation of its gene pool, is capable of adapting to changes in the
environment.
In EAs, the environment is the problem to be solved, and the population is a
group of solutions, a subset of the search space. The best individuals in a
population (dictated by a problem-specific fitness function) are mixed, generating
new individuals (see figure 3.1). In this way, the search is biased towards the
best individuals in the current population.
Algorithms inspired by evolutionary theory were first proposed around
1965, as a numerical optimization technique (Schwefel, 1981). A formulation
more similar to what is known today as genetic algorithms was proposed later,
in 1975 (J.H.Holland (1975), Jong (1975)).
While the first uses of Genetic Algorithms were for numerical optimization,
one of the main strengths of GA is its ability to adapt to changes in the
problem conditions. The population in a genetic algorithm can easily change
itself to adapt to changes in the fitness function. A good example of this can
be seen in the many approaches to co-evolution, where two evolving populations
compete with each other (Rosin and Belew, 1997).
Besides Genetic Algorithms, another technique which also fits under the Evolu-
tionary Algorithm paradigm is Genetic Programming. Both Genetic Algorithms
Figure 3.1: A simple description of the Evolutionary Algorithm framework
and Genetic Programming have in common the fact of being population-based,
and the use of genetic operators (crossover, mutation) in order to mix the
solutions and produce new ones. Genetic Programming evolves tree-structured
individuals, while Genetic Algorithms focus on array-based representations of
solutions.
This difference leads to many significant changes in the philosophy behind
crossover and generation of individuals between the two techniques. In this work,
we concentrate on Genetic Algorithms, although applying Genetic Programming
to the portfolio problem might yield interesting results.
Another limitation in the original EAs is related to the fitness function. The
fitness function guides the population towards the goal of the search. However,
when the search has multiple goals, especially when these goals are independent
or contradictory, it becomes difficult to create a fitness function that can properly
bias the population towards every objective.
Portfolio Management, which we address in this thesis, is an example of such
a problem. It has three goals: return, risk and transaction costs, of which the
first two are contradictory, and the third is independent. To address this kind of
problem, Multi Objective GA (MOGA) methods have been developed.
In this chapter we describe the basic concepts of Genetic Algorithms, and
Multi Objective Genetic Algorithms that were needed for the development of our
system. The first section describes the genetic algorithm and breaks it down into
its component parts, concentrating on the choices that have to be made to apply
GA to a given problem. Then, in the next section, we describe some basic techniques
of Multi Objective Genetic Algorithms, and show our choice for the current work.
3.1 Genetic Algorithms
A Genetic Algorithm is a black box heuristic for a Random Search. It can be
defined by a search space S with its candidate solutions s, a subset of the search
space P (the population), a fitness function f(s), a selection operator, and the
mixing operators m(s) and c(s1, s2), the mutation and crossover, respectively.
Each individual in the Population represents a solution to the problem. How
to represent the solution is an important question in the genetic algorithm. For
instance, if the solution for the problem is a set of real-valued parameters, a
common representation is to turn each parameter into its binary representation,
and one individual would be composed by the concatenated binary strings. Others
would create an individual as an array of real values, each value being considered
a gene in the individual. It is not yet known whether the binary or the real-
valued representation has any search advantage over the other (Yao, 1999).
A rough sketch of a generic genetic algorithm was drawn in figure 3.1. The
first step is to generate the initial subset P , also called the initial generation,
according to some rule. Then each member of the population is evaluated by f ,
which generates a fitness value for every member. Next, the fitness values are used
by the selection operator, which will remove individuals with worse fitness values
from the population. Finally, the remaining individuals are used by the mixing
operators to generate new individuals, which are used to complete P .
The heuristic behind the genetic algorithm is that by mixing and changing the
solutions with good performance, we can find better solutions than by randomly
searching through the solution space. In this regard, the selection operator is used
to prune bad solutions, the crossover operator generates new solutions that share
the characteristics of existing good solutions, and the mutation operator has the
role of introducing new information into the system (Vose, 1999).
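The loop described above can be sketched in a few lines. This is an illustrative toy (binary genomes, truncation selection, one-point crossover, bit-flip mutation, all arbitrary choices), not the system developed in this thesis:

```python
import random

def genetic_algorithm(fitness, genome_len, pop_size=20, generations=60,
                      mutation_rate=0.05):
    """Toy GA: keeps the best half each generation and refills the
    population with mutated crossover offspring of the survivors."""
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]             # selection prunes bad solutions
        children = []
        while len(survivors) + len(children) < pop_size:
            x, y = random.sample(survivors, 2)
            cut = random.randrange(1, genome_len)   # one-point crossover
            child = [1 - g if random.random() < mutation_rate else g
                     for g in x[:cut] + y[cut:]]    # bit-flip mutation
            children.append(child)
        pop = survivors + children                  # complete the population
    return max(pop, key=fitness)

# OneMax toy problem: the fitness is simply the number of 1-bits.
best = genetic_algorithm(sum, genome_len=16)
```

On this toy problem the population quickly converges towards the all-ones genome, illustrating how selection biases the search without any explicit gradient information.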
While Genetic Algorithms have met with good practical results in a variety
of areas, whether they are especially well suited to any particular class of problems
is a question that has not been properly answered yet. So, most applications
of genetic algorithms require extensive experimentation with Selection, Crossover
and Mutation strategies to tune the evolutionary system to the problem.
3.1.1 Selection Strategies
The selection operator performs the survival role in the evolutionary mech-
anism. The goal of this operator is to select from the current population the
individuals which will be used to construct the next population.
This selection is based on the fitness value of each individual in the
population. Individuals with a higher fitness score must have a higher probability
of being chosen to be recombined than individuals with a lower fitness score. The
difference between the reproduction probability of a low fitness individual and a
high fitness individual in a given selection strategy is called the selection pressure.
If the selection pressure is too high, then there is a risk that the population will
be dominated by a few individuals that are not optimal but have a higher fitness
than the rest of the current population, leading the search to a local optimum.
On the other hand, if the selection pressure is too low, the search will move too
randomly through the search space (Yao, 1999).
A simple selection strategy is the Fitness Proportional selection, also known
as Roulette Wheel. The probability for each individual in the population to be
selected is proportional to the relation between its fitness fi and the sum of the
fitness of all individuals in the population. While this method is quite simple, it
can build too much selection pressure if there are a few individuals with a relatively
high fitness in the population.
To address this problem, we have the Rank-based selection. In this selection
method, the individuals in the population are ordered by fitness, and the probabil-
ity of selection is proportional to the individual's rank. This removes the influence
of relative fitness values, and allows for a finer control of the selection pressure.
Both Rank-based and Fitness proportional Selection methods compare the
whole population at once. Tournament Selection, on the other hand, works with
small subsets of the population. A parametric number of elements is chosen at ran-
dom, then their fitness is compared, and the winner is chosen for reproduction.
The winning criteria of the tournament can be either deterministic (the highest
fitness is always the winner) or probabilistic (winning chance is proportional to
fitness). The tournament selection is appropriate for a parallel implementation,
since it breaks the selection problem into many small independent parts, and can
in this way speed up the processing of a genetic algorithm.
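The deterministic variant of tournament selection is particularly compact. A minimal sketch, with names of our own choosing:

```python
import random

def tournament_select(population, fitness, k=3):
    """Deterministic tournament: sample k individuals, return the fittest.
    Each tournament is independent, which makes them easy to run in parallel."""
    return max(random.sample(population, k), key=fitness)
```

A probabilistic variant would instead pick among the k contestants with probability proportional to their fitness.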
Finally, Elite selection is not a selection method per se, but a common addition
to other selection policies. The application of an elite policy means that, regardless
of mutation or crossover, one or more of the best individuals from the current
population are copied without changes into the next population. It can be used
to keep the current best solutions when using aggressive crossover and mutation
strategies.
3.1.2 Crossover Strategies
The role of the crossover operator is to perform exploitation of the search space,
by testing new solutions with characteristics of two good individuals (Vose (1999),
Yao (1999)).
Crossover is an operator that is applied to two members of the population
(the parents), to create one (or two) offspring with characteristics from both its
parents. The offspring individuals contain information from both parents. There
are three basic forms that the crossover operator usually takes.
The k-point crossover is one such form. From the genetic representation, k
points are chosen: p1, p2, ..., pk. If x and y are the parents' genetic vectors, the
offspring vector, c, will be composed by alternately copying elements from x and
y, changing from one parent to the other at each of the k points (see figure
3.2).
The Linear crossover (also known as Uniform crossover) is similar to the
Point 1 Point 2
Parent X
Offspring C
Parent Y
Figure 3.2: Example of k-points crossover with k = 2
k-points crossover. However, instead of copying blocks of elements from each
parent into the offspring, in the linear crossover each element is tested against a
probability of being copied from either parent (usually 50%).
The Intermediate Crossover is usually found only on real-valued GA. In this
form of crossover, each element is the average from the respective elements in both
parents. Thus, for an offspring c composed from parents x and y we have:
ci = αxi + (1 − α)yi
where α is a weight between both parents (usually 0.5).
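The three crossover forms described above can be sketched as follows (illustrative implementations with names of our own choosing):

```python
import random

def k_point_crossover(x, y, points):
    """Copy alternately from each parent, switching at every cut point."""
    child, parents, take, prev = [], (x, y), 0, 0
    for p in sorted(points) + [len(x)]:
        child.extend(parents[take][prev:p])
        take, prev = 1 - take, p
    return child

def uniform_crossover(x, y, p=0.5):
    """Each gene is copied from either parent independently."""
    return [xi if random.random() < p else yi for xi, yi in zip(x, y)]

def intermediate_crossover(x, y, alpha=0.5):
    """Real-valued crossover: element-wise weighted average of the parents."""
    return [alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y)]
```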
The choice of the crossover function influences what kind of information is
kept from parent to child individual. For instance, in the k-point crossover, long
blocks of information are copied from parent to offspring - if two elements that are
represented side by side are related, this relationship will be passed to the next
generation. Linear crossover, on the other hand, assumes that the position of the
genes holds no information.
Sometimes, the crossover function has to be modified to fit with specific re-
quirements of the problem. For instance, in (Aydemir et al., 2006) the crossover
function has severe restrictions on what places in the genome representation can
be cut to perform the k-point crossover. This is because the genome represents
a series of continuous movements by a robotic motor. In this situation, if the
crossover happens at an arbitrary location, it is possible to generate an invalid
genome.
While crossover is traditionally performed before mutation, if both operations
are independent, there is no statistical difference between doing one or the other
first (Vose, 1999).
3.1.3 Mutation Strategies
The mutation operator introduces new information into the population in the form
of new genes. While the crossover operation performs the exploitation role of the
genetic algorithm, the mutation performs the exploration.
Mutation usually takes place by having a parametric chance of mutating
each gene in an individual. If the representation is binary, the mutated genes are
flipped. If the representation is real-valued, the mutated genes are perturbed by a
value given by some distribution (for instance a small random value from a normal
distribution).
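Both mutation variants can be sketched in a couple of lines (illustrative names and default parameters of our own choosing):

```python
import random

def mutate_binary(genome, rate=0.01):
    """Flip each bit independently with a small probability."""
    return [1 - g if random.random() < rate else g for g in genome]

def mutate_real(genome, rate=0.01, sigma=0.1):
    """Perturb each gene with a small normally distributed value."""
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g
            for g in genome]
```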
3.2 Multi Objective Genetic Algorithms
The main decision that must be made when applying a Genetic Algorithm to a
problem is the design of the fitness function. A good fitness function will guarantee
that eventually the optimal desired solution will be reached, and that the algorithm
will not be locked into a local optimum of the search space.
However, for many practical problems, there are multiple goals to be achieved,
and these goals are often partially independent. When this is the case, a single
fitness function will not be able to accurately represent the efficiency of the solution
for every objective.
For example, in the Portfolio Problem as discussed in this work, we find po-
tentially three different objectives: the maximization of the estimated return of
the portfolio, the minimization of the risk, and the minimization of transaction
costs.
Figure 3.3: The Pareto Optimal Set for a hypothetical problem
It is actually very difficult in these problems to find a single solution which will
reach optimal values for every objective. Usually, we have to abandon the concept
of a single optimal solution in favor of a series of trade-offs between each goal.
This new definition of optimality is called the Pareto Optimum, and can be
defined as follows: a vector of decision variables x* ∈ F is Pareto-optimal if there
does not exist another x ∈ F such that fi(x) ≤ fi(x*) for all i = 1, ..., l and
fj(x) < fj(x*) for at least one j (assuming every objective is minimized).
In other words, a Pareto-optimal solution is one that cannot be improved in any
objective without becoming worse in at least one other objective. From
this definition we can also derive the concept of a Pareto Optimal Set, which is
the group of all Pareto-optimal solutions to a problem. Figure 3.3 illustrates the
Pareto Optimal Set for a bi-objective problem.
Genetic Algorithms have been regarded as an appropriate technique for multi-
objective problems. This is because GA is less sensitive to concave or discon-
tinuous search spaces and Pareto sets (Coello Coello, 2001). There are differ-
ent approaches, though, as to how to use GA with multi-objective problems.
Each method has different characteristics regarding complexity, efficiency, and
fine-tuning requirements. Here we will discuss the most popular ones.
3.2.1 Aggregate Methods
Aggregate methods try to solve the multiple objective scenario by designing a sin-
gle fitness function representative of all the goals in the problem. The most com-
mon example of an Aggregate Method for Multi Objective GA would be weighting.
In the weighting scheme, a fitness function is designed for each goal in the
problem, and the final fitness function is a weighted sum of the problem-specific
functions. This method is extremely simple to implement, and does not require any
changes to the basic logic of the Genetic Algorithm System.
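The weighting scheme amounts to a single line of code; the function name and example weights below are our own:

```python
def weighted_fitness(objective_scores, weights):
    """Aggregate several per-objective scores into one fitness value."""
    return sum(w * s for w, s in zip(weights, objective_scores))
```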
The problem with weighting and other aggregate methods is twofold. First,
by summing the values of different metrics together into a single fitness function,
it becomes very difficult to generate members of the Pareto Optimal Set when this
set is concave (Das and Dennis, 1997).
Second, it is not intuitive how the weights must be attributed to the different
fitness components, especially in problems where the different objectives are not
related, or have very different value scales. For instance, in the problem discussed
in this work, the risk metric and the distance metric are not related, and thus
there is not much meaning in trying to sum the two values together.
3.2.2 Objective Sharing Methods
Objective Sharing Methods modify the selection strategy of GA, by having the
different fitness functions, related to the different goals, affect the selection sepa-
rately.
In VEGA (Schaffer, 1985), for instance, the population is separated into a
number of subgroups equal to the number of objectives. Each subgroup undergoes
Figure 3.4: Differences between Objective Sharing approaches
a different fitness function, and then all groups are merged for the crossover.
Lexical ordering (Coello Coello, 2001) approaches this differently. Each goal
has its fitness function, but instead of using them all at the same time, each
generation a different fitness function is used. The way to choose which fitness
function to use every generation is an important design decision for this method.
The differences between the two methods can be visualized in figure 3.4.
Objective sharing methods do not show the same problems regarding concave
Pareto sets as aggregate methods do. However, they have some issues with speciation.
Speciation means that, due to the nature of separating the population and evolving
each part under a different metric, individuals with extreme values for one of the
objectives will appear more often than individuals with average values for all goals.
This can be a disadvantage when balance between the objectives is critical.
3.2.3 Pareto-based Methods
Pareto-based methods use the concept of dominance between solutions in order
to determine the fitness of individuals. We say that an individual a dominates an
individual b if, for at least one goal, a is better than b, and, for every other goal,
a is at least as good as b. Two individuals are mutually non-dominated if each
has at least one objective metric that is better than the other's. For example,
the Pareto Set described earlier in this section is composed of individuals that are
non-dominated by any other solution in the population.
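The dominance test can be sketched directly from this definition (function names are our own; we assume every objective is maximized):

```python
def dominates(a, b):
    """True if solution a dominates b, assuming every objective is maximized."""
    return (all(ai >= bi for ai, bi in zip(a, b))
            and any(ai > bi for ai, bi in zip(a, b)))

def non_dominated(solutions):
    """The subset of solutions not dominated by any other (the Pareto set)."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]
```

The quadratic filter in `non_dominated` is the naive approach; practical MOGA implementations use faster non-dominated sorting schemes.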
Most MOGA methods use the concept of dominance to establish a rank among
the individuals, and generate the fitness based on this rank. Thus, the values for
the goal metrics are not used directly to assign fitness values to the individuals.
Pareto-based techniques have the advantage of being very efficient, and are
reported to generate solutions in all positions along the Pareto frontier. However,
the fitness assignment step is difficult to design correctly, and is known to heavily
influence the efficiency of the method. Also, Pareto-based methods require an
extra step called fitness sharing. This step prevents the population from
concentrating in a single spot on the Pareto front. The effective implementation
of fitness sharing is still a topic of study (Horn, 1997).
Chapter 4
Related Work
Although the Modern Portfolio Theory was first proposed by Markowitz in the
1950s, and the CAPM in the 1980s, the first uses of Genetic Algorithms for
portfolio optimization are much more recent.
We trace the earliest works on this subject to 2002 (Jian and K.Y.Szeto, 2002).
These first works approach Portfolio Optimization as simply the problem of trans-
lating the model described by Markowitz into the Genetic Algorithm paradigm.
Later works build upon these models, by trying to add extra constraints that
make the model closer to real life applications, or by making changes to the Evolu-
tionary Algorithmic heuristic that yield better results in this particular problem.
Among the most interesting recent works are the articles by Yan and Clack (Yan
and Clack (2006), Yan and Clack (2007)), where they research the characteristics
of a specific subproblem of the portfolio optimization problem: Hedge Trading.
We believe that working with application niches might indeed turn up interesting
improvements to the heuristics.
So, we consider this to be a research topic still in an early stage of development.
Evidence of this is that almost no works approach the multiscenario portfolio
problem, limiting themselves to the single scenario problem initially proposed by
Markowitz. This encourages us to follow up this line in this work.
4.1 Review of GA based methods
The implementation of the Markowitz model for portfolio optimization can be
neatly divided into several independent blocks. This was demonstrated earlier, when
we described the problem, and can be observed in all of the previous works, where
the same division takes place.
The basis of the algorithm is the optimization of the portfolio weights in order
to maximize some utility function. While the utility function is most often the
Sharpe Ratio, other functions have been proposed in the literature.
Regardless of the utility function chosen, a number of estimation techniques
to analyze the data used in the function are necessary. Return estimation and
Risk estimation are the two obvious functions in this regard, but if other data
(like volume) are considered in the model, they will need their own estimation
methods.
Besides the model for the portfolio optimization, a lot of research has been
focused in the Genetic Operators. The use of MOGA instead of a single fitness
function for risk and return is the most obvious of the changes, but studies on the
most appropriate encoding have also shown interesting results.
We will detail below each of these building blocks for the system, describing the
most used approach, interesting diversions from this consensus, and our opinions
on the research done so far.
4.1.1 Portfolio representation
The most common way to represent a portfolio in a Genetic Algorithm is by using a
vector of real valued variables, one for each weight in the portfolio. Some examples
of this representation technique are Lipinski et al. (2007), Lin and Gen (2007) and
Hochreiter (2007).
The real-valued representation has the advantage of directly representing the
portfolio. So the transformation from the genome to the problem solution is very
straightforward.
There have been works on more indirect representations as well. Werner and
Fogarti (2002) suggest the use of Genetic Programming to generate rules that
calculate the values of the portfolio weights, instead of directly optimizing the
weights themselves. They propose that generating these rules improves the
adaptive characteristics of the portfolio. However, the work has not been continued.
Yan and Clack also use GP (Yan and Clack (2006), Yan and Clack (2007)). In
their work, the genetic programming generates rules for buying/selling each asset,
in a similar way to that performed in Automated Trading Strategies research. The
weights of the portfolio, in this work, are the same across all assets, and the only
difference is whether the particular asset is held in a short or long position.
4.1.2 Asset selection
Past works face a common problem of executing simulations on a small universe
of assets. Most works use only up to 20 assets at once to compose the portfolio.
Yan and Clack (2006), Yan and Clack (2007), Hochreiter (2007) are examples of
works where all assets are used at the same time to compose the portfolio.
Lipinski et al. (2007), on the other hand, use a larger universe of total
assets, but choose 10 assets at random from this universe to compose the portfolio.
A more interesting idea was proposed by Streichert et al. (2003). In their work,
an index array is used together with the real-valued array to select the assets that
will take part in the portfolio. This index array undergoes the evolutionary process
in the same way as the real-valued array. Selecting the assets in this way is
reported to provide better results than simply using all the assets in the portfolio.
In this dissertation, we use this last technique, whose details will be discussed
in the next chapter.
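As a minimal sketch of how such a hybrid genome could be decoded into a valid portfolio (the function name, argument layout, and normalization policy are our illustrative choices, not taken from Streichert et al.):

```python
def decode_portfolio(selection, raw_weights):
    """Decode a hybrid genome into portfolio weights.

    selection   -- list of 0/1 flags: 1 if the asset takes part in the portfolio
    raw_weights -- list of unnormalized real-valued weights, same length
    """
    # Only selected assets keep their weights; the rest are zeroed out.
    masked = [s * w for s, w in zip(selection, raw_weights)]
    total = sum(masked)
    if total == 0.0:
        # Degenerate genome with no selected assets: empty portfolio.
        return [0.0] * len(masked)
    # Normalize so the surviving weights sum to one.
    return [w / total for w in masked]

# Assets 0 and 2 are selected, each with equal raw weight.
w = decode_portfolio([1, 0, 1, 0], [0.5, 0.9, 0.5, 0.2])
```

Because the index array can switch assets on and off independently of the weight values, crossover can exchange the asset selection of two parents without disturbing their weight distributions.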
4.1.3 Market Model
Most works in this subject use simulations based on historical data in order to
test their results against the model. While the use of historical data allows for
a precise repetition of real-world events, it also has some limitations. The most
important of those is that simulations using past data do not react to the actions
of the portfolio system.
An alternative is presented by Subramanian et al. (2006), who perform
portfolio selection in an artificial market composed of many agents. When all
the agents that influence the market are represented in the simulation, the effect
of buying or selling large amounts of desired assets can be analyzed.
However, this approach has the disadvantage of being more artificial. The
number of agents that can be simulated is limited, and so is the accuracy of the
simulation. It therefore becomes difficult to gauge to what degree a system
that succeeds in an artificial market can also succeed in a real one.
In this work, we follow the general trend of using historical data simulation as
the model for the market.
4.1.4 Return Estimation
The most common method for Return Estimation in the Portfolio Optimization
context is the moving average (Yan and Clack (2006), Yan and Clack (2007),
Lipinski et al. (2007), Lin and Gen (2007)). The moving average method can
be simply described as the average of the most recent N prices in the available
historical data.
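As a sketch, assuming a list of historical per-period returns ordered oldest to newest (the function name and window handling are illustrative):

```python
def moving_average_return(returns, n):
    """Estimate the next-period return as the mean of the last n observations."""
    window = returns[-n:]          # most recent n values
    return sum(window) / len(window)

# Average of the three most recent observations.
est = moving_average_return([0.01, -0.02, 0.03, 0.02], n=3)
```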
Werner and Fogarti (2002) use a more sophisticated approach, with a least-squares
optimization method to model the value of the return of each asset in the
portfolio.
Neural networks are another source of predictions for the return values. Kwon et
al. (Kwon and Moon (2004), Kwon et al. (2004)) use a hybrid approach between
Evolutionary Computing and Neural Networks to calculate the return values of
different assets. Evolutionary Computation is used to optimize the weights of the
Neural Network. The inputs are the previous values of the stock price time series,
and the output is the future value.
Azzini and Tettamanzi (2006) also use neural networks with weights calibrated
by evolutionary computing to calculate the future price of a stock, and use these
values in a portfolio optimization system.
It is not known by how much the use of advanced methods for calculating
the estimated return can improve the results of MPT optimization. Since the
goal is not to reach a specific value, or to trade the portfolio at specific trade points,
predicting the exact value of the asset is not as important as predicting the general
trend of the market.
Using elaborate methods for Return Estimation adds complexity to the system
that makes it more difficult to evaluate, in isolation, the effect of adding cost to
the model. So, for this work, we use the more common and accepted method of
moving averages.
4.1.5 Risk Estimation
The standard deviation, as suggested by Markowitz, is the most used measure
of risk (Lin and Gen (2007), Hochreiter (2007)). Like for the expected return,
many works suggest other ways to calculate the risk for a given asset. However,
unlike the expected return, the alternate methods of risk calculation do not aim
at achieving more precise values.
Instead, alternate methods for calculating risk are actually changes in the MPT
model itself, putting extra emphasis on certain aspects of the assets that compose the
portfolio. For instance, Lipinski et al. (2007) and Subramanian et al. (2006) use
the semi-variance, which reduces the influence of positive variance, and increases
the influence of negative variance. By using this method for risk estimation, these
works reinforce the value of positive risk (the risk that the actual value of an asset
is above the value predicted). Whether this assumption is correct or not is beyond
the scope of our work, and requires a deeper knowledge of economics to correctly
assess.
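A minimal sketch of the semi-variance, under one common convention that sums squared deviations below the mean and divides by the total number of observations (conventions vary in the literature, so this is illustrative rather than the exact formula used by the cited works):

```python
def semivariance(returns):
    """Downside semi-variance: only returns below the mean contribute."""
    mean = sum(returns) / len(returns)
    downside = [(r - mean) ** 2 for r in returns if r < mean]
    return sum(downside) / len(returns)

# Symmetric sample: plain variance would count all four deviations,
# semi-variance counts only the two negative ones.
sv = semivariance([0.05, -0.05, 0.10, -0.10])
```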
Chen et al. (2002) also suggest the use of other models for risk. In this
particular paper, they introduce the linf function, which is similar to the risk
function used by Markowitz for the portfolio risk, but does not use the covariance
values between assets. This results in an average risk rate which is lower than the
one reported by works that use the Markowitz model.
4.1.6 Cost modeling
Cost measures are often cited in financial engineering works as necessary, but
they are rarely actually used in the simulations of those works. Portfolio
Optimization is no different.
Streichert et al. (2003) and Fyfe et al. (2005) choose two arbitrary, fixed values
for the cost, and arrive at quite different results due to this. Chen et al. (2002)
uses a function that varies the cost according to the volume traded.
It is difficult to correctly insert a measure of cost into the portfolio model. Stock
dealers may have different policies regarding the costs of financial transactions.
Often, an arbitrary value for cost is chosen, like in Streichert et al. (2003); Fyfe
et al. (2005). Sometimes, a more abstract approach is taken, and the cost value
is left unfixed (Chen et al. (2002)). Lin et al. (2005) build the cost directly into the
function that generates the return estimate of the portfolio. The cost values are
different for buying and selling.
In all these works, the costs are added to the final results, but not taken directly
into account by the evolutionary process. Also, all the above works are single
scenario, which means that the costs are considered against an initial position
where no assets are held.
In this work, we propose a more complete cost model, where an explicit objec-
tive function for cost is added to the evolutionary model, and the position held in
the previous scenario is taken into account.
4.1.7 Fitness Functions
The most common fitness function for Evolutionary solutions of the Portfolio
Optimization problem is the Sharpe Ratio (Yan and Clack (2006), Yan and Clack
(2007), Subramanian et al. (2006), Lin and Gen (2007), among others). The
Sharpe Ratio is suggested in basic books on finance as a way to compare a portfolio
to the Market Portfolio, the theoretical optimal portfolio. The Sharpe Ratio also
has the advantage of cleanly putting together risk and return in a single function,
but it is by no means the only fitness function used in the literature.
Hochreiter (2007) and Lin et al. (2005) use a weighted sum of the Risk
measure and the Return as the fitness function. If the risk of the portfolio is
$\sigma_P$, and the return is $r_P$, their fitness function would be something like
$f(x) = \alpha \sigma_P + (1 - \alpha) r_P$, where $\alpha$ is a balancing parameter
between the two goals.
This approach has the obvious disadvantage that a new parameter is added
to the system. But another serious problem with this function is that Risk and
Estimated Return are not two directly comparable values. Adding them directly is
similar to adding the speed of an object to its acceleration.
Streichert et al. (2003) suggest that risk and return should be evaluated sep-
arately, in a Multi-Objective GA architecture. In their work, they use the Pareto
front as an elite strategy to keep solutions with both high return and low risk in
the population during the evolution. Yan and Clack (2007) disagree with this
approach, stating that Pareto-based MOGA is not able to solve the risk-return
optimization, but they provide no evidence for this assertion.
4.1.8 Evolutionary Strategies
Most GA-based Portfolio Optimization systems follow a formula similar to the one
described by Lin and Gen (2007). However, as we stated before, in some works
different evolutionary strategies are pursued.
Yan and Clack (2006) and Yan and Clack (2007) study how to increase the ro-
bustness of the genome during the search. In the first study, the concept of genetic
diversity is used to improve robustness. They use the correlation values between
different solutions in order to increase the difference between the individuals of
the same generation.
In the second study, they change the selection step by adding different scenarios
for evaluation for each generation. This is done with the goal of reducing overfitting
and increasing robustness of the achieved solutions.
Chapter 5
Multi-Objective Portfolio
Management
In this work, we expand on well-known Genetic Algorithm-based Portfolio Op-
timization methods. Our approach is to pick the best ideas among the published
works to build a scenario-based portfolio optimization system. On top of this system,
we implement two ideas developed in this work to achieve multi-scenario Portfolio
Management.
The first one is Seeding, based on the idea of migration. Individuals from
previous scenarios are brought into the current scenario to breed with the new
population, tilting the search towards regions of the search space that were
successful in the past.
The second one is Objective Sharing, based on Multi-Objective GA techniques.
We define a new fitness function that measures the distance between the individuals
in the population and the portfolio position taken in the previous scenario. At
each generation, there is a chance that the standard optimization fitness function
will be replaced by this new fitness, directing the population towards solutions
that lie closer to the previous position.
Our goal is to use the self-adapting characteristics of Genetic Algorithms to
develop a portfolio selection strategy that is resistant to changes in the market.
In this chapter, we describe how we intend to achieve such a goal. The model
used to represent our problem is described, along with its assumptions and restric-
tions. Then, based on this model, we describe an optimization and management
system that generates effective portfolio strategies.
5.1 Portfolio Model
Assume a market with $N$ assets. For each asset $n \in \{1, \ldots, N\}$, $r^t_n$ is the given
historical return at time $t$ for that asset. Also, for each asset, $\sigma^t_n$ is the given
risk measure at time $t$ for that asset.
Based on this data, the system must generate investment portfolios that are
optimized on three different measures: minimizing the portfolio's risk, maximizing
the portfolio's return, and minimizing the cost of moving between the current
portfolio (time $t$) and the previously generated portfolio (time $t-1$).
5.1.1 Portfolio Representation
The output of the system is a portfolio configuration for each time $t$. These
portfolios are defined as sets, $W^t$, of $N$ weights, in which $w_n \in W^t$ is the weight
that gives the proportion of investment allocated to asset $n$.
We impose two restrictions on the values of the weights $w$:

$$\sum_{n=1}^{N} w_n = 1 \quad \text{and} \quad 0 \le w_n \le 1.$$

The first restriction indicates that the total of the weights cannot
exceed the total of resources, and the second restriction imposes that the values
of the weights must be non-negative.
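A GA operating on real-valued weight vectors must keep offspring inside these restrictions after crossover and mutation. A minimal repair-step sketch (the repair policy shown, clip then renormalize, is one plausible choice, not necessarily the one used in this work):

```python
def repair_weights(weights):
    """Repair a candidate weight vector so that 0 <= w_n and sum(w) == 1."""
    clipped = [max(0.0, w) for w in weights]       # enforce non-negativity
    total = sum(clipped)
    if total == 0.0:
        n = len(weights)
        return [1.0 / n] * n                       # fall back to a uniform portfolio
    return [w / total for w in clipped]            # enforce the sum-to-one restriction

# A mutated genome with a negative weight is mapped back into the feasible set.
repaired = repair_weights([0.2, -0.1, 0.6])
```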
5.1.2 Return Measure
The return of one asset in the model is given as its logarithmic return. To
calculate it, we use the closing prices of the asset at times $t$ and $t-1$ (indicated
by $P^t_n$ and $P^{t-1}_n$):

$$r^t_n = \log\left(\frac{P^t_n}{P^{t-1}_n}\right) \tag{5.1.1}$$
The return of a portfolio is given as the weighted sum of the returns of its
composing assets, and calculated as follows:

$$r^t_W = \sum_{n=1}^{N} w_n r^t_n \tag{5.1.2}$$
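Equations 5.1.1 and 5.1.2 can be sketched in a few lines of Python (illustrative, not the thesis implementation):

```python
import math

def log_return(price_now, price_prev):
    """Logarithmic return of one asset between t-1 and t (eq. 5.1.1)."""
    return math.log(price_now / price_prev)

def portfolio_return(weights, asset_returns):
    """Weighted sum of the asset returns (eq. 5.1.2)."""
    return sum(w * r for w, r in zip(weights, asset_returns))

r = log_return(110.0, 100.0)                      # one asset gained 10%
rp = portfolio_return([0.5, 0.5], [0.02, 0.04])   # equal-weight two-asset portfolio
```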
5.1.3 Risk Measure
The risk of one asset in the model is given as the variance of the return of that
asset. The risk up to a given time $t$ is calculated as the variance of all the data
available up to that time. Because of this, in practice some historical data must
be available prior to the time $t$ for which we want to calculate the portfolio.
The risk of the portfolio is given as the weighted sum of the risks of its composing
assets:

$$\sigma^t_W = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_{ij} \tag{5.1.3}$$
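Equation 5.1.3 is the quadratic form $w^{\mathsf{T}} \Sigma w$ over the covariance matrix of the asset returns; a direct sketch (names are illustrative):

```python
def portfolio_risk(weights, cov):
    """Portfolio variance per eq. 5.1.3.

    cov is the N x N covariance matrix of the asset returns, cov[i][j] = sigma_ij.
    """
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

# Two assets with variances 0.04 and 0.09 and covariance 0.01, equally weighted.
risk = portfolio_risk([0.5, 0.5], [[0.04, 0.01],
                                   [0.01, 0.09]])
```

Note how the off-diagonal covariance terms reward combining weakly correlated assets, which is the diversification effect at the heart of the Markowitz model.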
5.1.4 Cost Measure
One of the main novelties we introduce is the addition of transaction costs into
the portfolio model. Transaction costs are added when selling or buying an asset,
so that if an investor changes his position too much to follow the market, the
transaction costs may become higher than the difference in return of the new
position.
In practice, transaction costs depend on the policies of the broker. A flat rate
can be charged per transaction, or it can be proportional to the trade volume.
Different rates can be charged for different classes of assets.
We use, instead, an indirect approach to cost. We assume that the transaction
costs are based on the amount of an asset sold or bought, but do not set a fixed
value to this cost. Instead, the distance between the current portfolio and the
desired portfolio is used as the cost measure. This concept has not yet been used
in GA optimized portfolio approaches.
Following this definition, the distance is the amount by which the weights of two
different portfolios differ. We quantify the distance as the Euclidean distance of
the weight vectors:

$$d(W_a, W_b) = \sqrt{\sum_{n=1}^{N} \left(w_{a,n} - w_{b,n}\right)^2} \tag{5.1.4}$$
Higher distance values mean higher costs to move from one portfolio position
to another.
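The distance-based cost measure of Equation 5.1.4 is straightforward to compute; a sketch with an illustrative function name:

```python
import math

def portfolio_distance(wa, wb):
    """Euclidean distance between two weight vectors (eq. 5.1.4),
    used as a proxy for the transaction cost of moving between positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(wa, wb)))

# Moving all resources from one asset to another: the largest possible move.
d = portfolio_distance([1.0, 0.0], [0.0, 1.0])
```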
5.1.5 Assumptions of the model
The model, as presented, makes a number of assumptions that simplify the prob-
lem and do not hold in practice. We make those assumptions explicit here. Note
that these assumptions are also made by most of the current work in the field. The
removal of each restriction must be progressively addressed as the model evolves
over time. But an understanding of the effects of these restrictions and their
consequences allows us to use the results obtained by the system in the real world.
The effects of Volume in trading are not included. This means that there are
no limits on the minimal or maximal amount of any asset that can be held by the
investor. In practice, the minimal amount of an asset that can be held is usually
limited by the price of a single share of that asset, and there are often limits to
the maximum amount of an asset that can be held by a single individual. Also,
the price for buying or selling assets can be affected by the volume dealt. The
effects of volume trading have not been satisfactorily addressed in other works in
the same field.
By restricting the weights to non-negative values, we remove the possibility
of holding short positions in the portfolio. However, this is not unrealistic, as
it can be shown that models for the MPT with and without short-selling are
equivalent (Yuh-Dauh-Lyu (2002)). We add this restriction to the model for the
sake of simplicity.
The choice of the Euclidean Distance as a measure of cost also carries with
it a few assumptions about the model. According to the definition we use for
the function, many small changes in different weights result in a smaller distance
(cost) than a large change in only one weight.
However, the choice of the Euclidean Distance as a measure of cost has some
drawba