1
Statistical Estimation of High-Dimensional Portfolio Hiroyuki Oka*, Hiroshi Shiraishi Keio University Graduate School of Science and Technology Abstract Methods Results References Bai, Z., Liu, H., and Wong, W.-K. (2009). Enhancement of the applicability of markowitz’s portfolio optimization by utilizing random matrix theory. Mathematical Finance, 19(4):639–667. El Karoui, N. et al. (2010). High-dimensionality effects in the markowitz problem and other quadratic programs with linear constraints: Risk underestimation. The Annals of Statistics, 38(6):3487–3566. Bodnar, T., Parolya, N., and Schmid, W. (2014). Estimation of the global minimum variance portfolio in high dimensions. arXiv preprint arXiv:1406.0437. Fujikoshi, Y., Ulyanov, V. V., and Shimizu, R. (2010). Multivariate Statistics: High-Dimensional and Large-Sample Approximations (Wiley Series in Probabilityand Statistics). Wiley, 1 edition. Glombek, K. (2012). High-Dimensionality in Statistics and Portfolio Optimization. Josef Eul Verlag Gmbh. 0.5 0.6 0.7 0.8 0.4 0.2 0.0 0.2 efficient frontirer Portfolio standard deviation Portfolio mean 0.5 0.6 0.7 0.8 0.4 0.2 0.0 0.2 Objective n=100 sim100n[, 4] theta100n[1] Density 0 2 4 6 0.0 0.2 0.4 0.6 n=500 sim500n[, 4] theta500n[1] Density 0.5 0.0 0.5 1.0 1.5 0.0 0.5 1.0 1.5 n=1000 sim[, 4] theta[1] Density 0.4 0.2 0.0 0.2 0.4 0.6 0.8 0.0 0.5 1.0 1.5 2.0 2.5 Background Future Work We introduce a Markowitz’s mean-variance optimal portfolio estimator from d × n data matrix under high dimensional set- ting where d is the number of assets and n is the sample size. When d/n converges in (0, 1), we show inconsistency of the traditional estimator and propose a consistent estimator. Let X be a asset returns (r.v.), and w be portfolio weights. Then, optimal portfolio weights are the solution of following optimization problem. max wR d u(w)= E[w X ] - 1 2γ Var(w X ) subject to w 1 d =1 Here, γ is a positive constant depending on individual investor. And then, using expressions E[X ]= μ and Var(X )= Σ, expected value and variance of optimal portfolio are expressed as follows. Expected value μ opt and variance σ 2 opt of optimal portfolio return are expressed as follows. μ opt (γ ; θ )= γ θ 1 - θ 2 2 θ 3 + θ 2 θ 3 σ 2 opt (γ ; θ )= γ 2 θ 1 - θ 2 2 θ 3 + 1 θ 3 A set of (σ opt opt ) is called ”ecient frontier ”. Here, θ is θ = θ 1 θ 2 θ 3 = μ Σ -1 μ 1 d Σ -1 μ 1 d Σ -1 1 d Fig.1 Ecient frontierFollowing figure1 shows portfolio plot. Black points (·) are the portofolios in feasible area which can obtain by changing value of w. And red circles () shows optimal portfolios which can obtain by changing value of γ . This figure shows that rational investor prefer lower risk when mean is same value, and higher return when risk is same value. Purposeestimate ecient frontier in high dimension dfix, n →∞⇒ S -1 is consistent n, d →∞⇒ S -1 is inconsistent To estimate optimal portfolio, we should estimate optimal portfolio parametor θ . It is assumed that n data vectors X 1 ,..., X n is following unknown distribution which has mean vector μ, and covariance matrix Σ. Then, ˜ θ , estimator of parmetor θ , is defined as follows. ˜ θ = ˜ θ 1 ˜ θ 2 ˜ θ 3 = ¯ X S -1 ¯ X 1 d S -1 ¯ X 1 d S -1 1 d For some mathematical argument, we put some assumptions. So, we would like to derive asymptotic property of optimal portfolio on the following assumption. n →∞, d →∞, d n ρ (0, 1) ˆ μ opt (γ ; θ )= γ ˆ θ 1 - ˆ θ 2 2 ˆ θ 3 + ˆ θ 2 ˆ θ 3 , ˆ σ 2 opt (γ ; θ )= γ 2 ˆ θ 1 - ˆ θ 2 2 ˆ θ 3 + 1 ˆ θ 3 Fig.2 Consistency of estimatorsFig.2 shows histgrams of generated ˜ θ 1 and ˆ θ 1 . Red part is histgram of ˆ θ 1 , and blue part is histgram of ˜ θ 1 . In order of n = 100, 500, 1000 from the left, and this shows that ˆ θ 1 converge to true value θ 1 Fig.3 Asymptotic normality of n( ˆ θ 1 - θ 1 )Fig.3 shows histgram of n( ˆ θ 1 - θ 1 ) and blue curve line of asymptotic normal distribution. In order of n = 100, 500, 1000 from the left, and this shows that n( ˆ θ 1 - θ 1 ) converge to objective normal distribution. Simulation Study fix d/n =0.8 increase n = 100, 500, 1000, and assume X t i.i.d. N d (μ, Σ) components of μ are devided [-1, 1] into d equal parts Σ has diagonal components 1and the others 0.5 In this condition, generate ˜ θ 1 and ˆ θ 1 in 10000 times and confirm theoretical results. When an investor invest in d financial products, he consider how maximize the portfolio return for a given level of risk, defined as variance. Define optimal portfolio as follows. Because E[X ]= μ and Var(X )= Σ is generally unknown, it is considered that optimal portofolio should be estimated by d × n data matrix (X 1 ,..., X n ). We estimate μ by sample mean vector ¯ X , and Σ by sample covariance matrix S . ¯ X = 1 n n t=1 X t , S = 1 n n t=1 (X t - ¯ X )(X t - ¯ X ) In these days, because of expansion of market scale, the num- ber of assets d grows bigger. But, it is known that the bigger dimension size d grows, the worse estimator S -1 becomes. We introduce (n, d)-asymptotic properties of estimators of op- timal portfolio parametor θ . It is known that when X 1 ,..., X n i.i.d. (μ, Σ) and satisfy previous assumption 14, ˜ θ converges following value as n goes to infinity. ˜ θ 1 a.s. 1 1 - ρ α 1 + ρ 1 - ρ , ˜ θ 2 a.s. 1 1 - ρ α 2 , ˜ θ 3 a.s. 1 1 - ρ α 3 This shows that estimator using ¯ X and S is overestimated. So, we need to correct estimators. We propose the following estimator ˆ θ . Result1 Define estimator ˆ θ =( ˆ θ 1 , ˆ θ 2 , ˆ θ 3 ) as following expressions. ˆ θ 1 = 1 - d n ˜ θ 1 - d n , ˆ θ 2 = 1 - d n ˜ θ 2 , ˆ θ 3 = 1 - d n ˜ θ 3 Then, ˆ θ is consistent estimator of θ . This estimator ˆ θ has asymptotic normality. Result2 X 1 ,..., X n i.i.d. (μ, Σ) satisfy previous assump- tion 14. Then, n( ˆ θ - θ ) converges to normal distribution as n goes to infinity. n( ˆ θ - θ ) D N 3 (0, Ω) ((n, d)-asymptotic ) In this, Ω is following matrix. Ω = 1 1 - ρ 2α 2 1 +4α 1 +2ρ * * 2α 1 α 2 α 2 2 + α 1 α 3 + α 3 * 2α 1 α 3 2α 2 α 3 2α 2 3 Using this estimator ˆ θ , we make ecient frontier estimators ˆ μ opt and ˆ σ 2 opt as follows. These ecient frontier estimators ˆ μ opt and ˆ σ 2 opt has consis- tency. Result3 ˆ μ opt and ˆ σ 2 opt satisfy previous assumption 14. Then, ˆ μ opt and ˆ σ 2 opt converge to following values as n goes to infinity. ˆ μ opt a.s. μ opt , ˆ σ 2 opt a.s. σ 2 opt ((n, d)-asymptotic ) To derive asymptotic nomality of ˆ μ opt and ˆ σ 2 opt To derive confidential interval and test of ecient frontier To analyze ecient frontier from actual stock price data Assumption1 Let Z 1 ,..., Z n i.i.d. (0, I d ). Assume that entries of Z t are independent with 4+ moment. Assumption2 Then, data vectors X t can be expressed as X t = Σ 1 2 Z t + μ.(μ R d , Σ > 0) Assumption3 d is expressed with n, and d/n ρ (0, 1) (n →∞) is satisfied. We call this limit operation ”(n, d)-asymptotic ”. Assumption4 Assume that θ converge the following constants α 1 , α 2 , α 3 . θ 1 α 1 , θ 2 α 2 , θ 3 α 3 (d →∞) However, it is satisfied that α 1 , α 3 > 0, α 2 R, α 1 α 3 - α 2 2 > 0.

Keio University Graduate School of Science and Technologymath.bu.edu/keio2016/talks/Oka.pdfStatistical Estimation of High-Dimensional Portfolio Hiroyuki Oka*, Hiroshi Shiraishi Keio

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Keio University Graduate School of Science and Technologymath.bu.edu/keio2016/talks/Oka.pdfStatistical Estimation of High-Dimensional Portfolio Hiroyuki Oka*, Hiroshi Shiraishi Keio

Statistical Estimation of High-Dimensional Portfolio Hiroyuki Oka*, Hiroshi Shiraishi

Keio University Graduate School of Science and Technology

Abstract

Methods

Results

References Bai, Z., Liu, H., and Wong, W.-K. (2009). Enhancement of the applicability of markowitz’s portfolio optimization by utilizing random matrix theory. Mathematical Finance, 19(4):639–667. El Karoui, N. et al. (2010). High-dimensionality effects in the markowitz problem and other quadratic programs with linear constraints: Risk underestimation. The Annals of Statistics, 38(6):3487–3566. Bodnar, T., Parolya, N., and Schmid, W. (2014). Estimation of the global minimum variance portfolio in high dimensions. arXiv preprint arXiv:1406.0437. Fujikoshi, Y., Ulyanov, V. V., and Shimizu, R. (2010). Multivariate Statistics: High-Dimensional and Large-Sample Approximations (Wiley Series in Probabilityand Statistics). Wiley, 1 edition. Glombek, K. (2012). High-Dimensionality in Statistics and Portfolio Optimization. Josef Eul Verlag Gmbh.

0.5 0.6 0.7 0.8

−0.4

−0.2

0.0

0.2

efficient frontirer

Portfolio standard deviation

Portf

olio

mea

n

0.5 0.6 0.7 0.8

−0.4

−0.2

0.0

0.2

Objective

n=100

sim100n[, 4] − theta100n[1]

Den

sity

0 2 4 6

0.0

0.2

0.4

0.6

n=500

sim500n[, 4] − theta500n[1]

Den

sity

−0.5 0.0 0.5 1.0 1.5

0.0

0.5

1.0

1.5

n=1000

sim[, 4] − theta[1]

Den

sity

−0.4 −0.2 0.0 0.2 0.4 0.6 0.8

0.0

0.5

1.0

1.5

2.0

2.5

Background

Future Work

We introduce a Markowitz’s mean-variance optimal portfolio

estimator from d � n data matrix under high dimensional set-

ting where d is the number of assets and n is the sample size.

When d/n converges in (0, 1), we show inconsistency of the

traditional estimator and propose a consistent estimator.

Let X be a asset returns (r.v.), and w be portfolio weights.

Then, optimal portfolio weights are the solution of following

optimization problem.

��

�maxw�Rd

u(w) = E[w�X] � 12� Var(w�X)

subject to w�1d = 1

Here, � is a positive constant depending on individual investor.

And then, using expressions E[X] = µ and Var(X) = �,

expected value and variance of optimal portfolio are expressed

as follows.

Expected value µopt and variance �2opt of optimal portfolio

return are expressed as follows.

µopt(�; �) = �

��1 � �2

2

�3

�+

�2

�3

�2opt(�; �) = �2

��1 � �2

2

�3

�+

1

�3

A set of (�opt, µopt) is called ”e�cient frontier”. Here, � is

� =

��1

�2

�3

� =

�µ���1µ1�

d ��1µ1�

d ��11d

Fig.1 E�cient frontier:Following figure1 shows portfolio plot. Black points (·) are the portofolios in feasible area which can obtain by

changing value of w. And red circles (�) shows optimal portfolios which can obtain by changing value of �. This figure shows that rational

investor prefer lower risk when mean is same value, and higher return when risk is same value.

Purpose:estimate e�cient frontier in high dimension

• d:fix, n � � � S�1 is consistent

• n, d � � � S�1 is inconsistent

To estimate optimal portfolio, we should estimate optimal

portfolio parametor �. It is assumed that n data vectors

X1, . . . , Xn is following unknown distribution which has mean

vector µ, and covariance matrix �.

Then, �̃, estimator of parmetor �, is defined as follows.

�̃ =

���̃1

�̃2

�̃3

�� =

��X̄�S�1X̄

1d�S�1X̄

1d�S�11d

��

For some mathematical argument, we put some assumptions.

So, we would like to derive asymptotic property of optimal

portfolio on the following assumption.

n � �, d � �,d

n� � � (0, 1)

µ̂opt(�; �) = �

��̂1 � �̂2

2

�̂3

�+

�̂2

�̂3

, �̂2opt(�; �) = �2

��̂1 � �̂2

2

�̂3

�+

1

�̂3

Fig.2 Consistency of estimators:Fig.2 shows histgrams of generated �̃1 and �̂1. Red part is histgram

of �̂1, and blue part is histgram of �̃1. In order of n = 100, 500, 1000 from the left, and this shows that

�̂1 converge to true value �1

Fig.3 Asymptotic normality of�

n(�̂1 � �1):Fig.3 shows histgram of�

n(�̂1 � �1) and blue curve line

of asymptotic normal distribution. In order of n = 100, 500, 1000 from the left, and this shows that�

n(�̂1 � �1) converge to objective normal distribution.

Simulation Study

• fix d/n = 0.8

• increase n = 100, 500, 1000, and assume Xti.i.d.� Nd(µ,�)

• components of µ are devided [�1, 1] into d equal parts

• � has diagonal components 1、and the others 0.5

In this condition, generate �̃1 and �̂1 in 10000 times and confirm theoretical results.

When an investor invest in d financial products, he consider

how maximize the portfolio return for a given level of risk,

defined as variance. Define optimal portfolio as follows.

Because E[X] = µ and Var(X) = � is generally unknown, it

is considered that optimal portofolio should be estimated by

d � n data matrix (X1, . . . , Xn). We estimate µ by sample

mean vector X̄, and � by sample covariance matrix S.

X̄ =1

n

n�

t=1

Xt, S =1

n

n�

t=1

(Xt � X̄)(Xt � X̄)�

In these days, because of expansion of market scale, the num-

ber of assets d grows bigger. But, it is known that the bigger

dimension size d grows, the worse estimator S�1 becomes.

We introduce (n, d)-asymptotic properties of estimators of op-

timal portfolio parametor �. It is known that when

X1, . . . , Xni.i.d.� (µ,�) and satisfy previous assumption 1�4,

�̃ converges following value as n goes to infinity.

�̃1a.s.� 1

1 � ��1 +

1 � �, �̃2

a.s.� 1

1 � ��2, �̃3

a.s.� 1

1 � ��3

This shows that estimator using X̄ and S is overestimated.

So, we need to correct estimators. We propose the following

estimator �̂.

Result1   Define estimator �̂ = (�̂1, �̂2, �̂3)� as following

expressions.

�̂1 =

�1 � d

n

��̃1�

d

n, �̂2 =

�1 � d

n

��̃2, �̂3 =

�1 � d

n

��̃3

Then, �̂ is consistent estimator of �.

This estimator �̂ has asymptotic normality.

Result2 X1, . . . , Xni.i.d.� (µ,�) satisfy previous assump-

tion 1�4. Then,�

n(�̂ � �) converges to normal distribution

as n goes to infinity.

�n(�̂ � �)

D� N3(0,�) ((n, d)-asymptotic)

In this, � is following matrix.

� =1

1 � �

��2�2

1 + 4�1 + 2� � �2�1�2 �2

2 + �1�3 + �3 �2�1�3 2�2�3 2�2

3

��

Using this estimator �̂, we make e�cient frontier estimators

µ̂opt and �̂2opt as follows.

These e�cient frontier estimators µ̂opt and �̂2opt has consis-

tency.

Result3  µ̂opt and �̂2opt satisfy previous assumption 1�4.

Then, µ̂opt and �̂2opt converge to following values as n goes

to infinity.

µ̂opta.s.� µopt, �̂2

opta.s.� �2

opt ((n, d)-asymptotic)

• To derive asymptotic nomality of µ̂opt and �̂2opt

• To derive confidential interval and test of e�cient frontier

• To analyze e�cient frontier from actual stock price data

Assumption1 Let Z1, . . . , Zni.i.d.� (0, Id). Assume that entries of Zt are independent with 4 + � moment.

Assumption2 Then, data vectors Xt can be expressed as Xt = �12 Zt + µ. (µ � Rd,� > 0)

Assumption3 d is expressed with n, and d/n � � � (0, 1) (n � �) is satisfied.

We call this limit operation ”(n, d)-asymptotic”.

Assumption4 Assume that � converge the following constants �1, �2, �3.

�1 � �1, �2 � �2, �3 � �3 (d � �)

However, it is satisfied that �1, �3 > 0, �2 � R, �1�3 � �22 > 0.