MS&E 444 Kay Giesecke, April 7 2010 Project 4. Statistical Arbitrage MS&E 444 Investment Practice Spring 2010 Jeff Blokker [[email protected]][email protected]

MS&E 444 Kay Giesecke, April 7 2010

Project 4. Statistical Arbitrage

MS&E 444 Investment Practice

Spring 2010

Jeff Blokker [[email protected]]

Emile Chamoun [[email protected]]

Ibrahim Jreige [[email protected]]

Paris Georgoudis [[email protected]]

Sameh Galal [[email protected]]

MS&E 444 – Investment Practice

2 Factor Model

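• Statistical Arbitrage

– A standard model for the dynamics of the stock price is

$$\frac{dS_t}{S_t} = \mu\,dt + d\varepsilon_t$$

– This model can be enhanced by expanding the noise term

$$\frac{dS_t}{S_t} = \mu\,dt + \sum_{j=1}^{p} \beta_j F_t^{(j)} + d\varepsilon_t$$

– where $F_t^{(j)}$ are risk factors associated with the market. Written for the log price:

$$d(\log S_t) = \mu\,dt + \sum_{j=1}^{p} \beta_j F_t^{(j)} + d\varepsilon_t = \mu\,dt + \boldsymbol{\beta}\mathbf{F}_t + d\varepsilon_t$$

– In discrete time

$$r_i = \log P_i - \log P_{i-1} = \mu\,\Delta t + \boldsymbol{\beta}\mathbf{F}_i + \varepsilon_i$$

– Assume that $E(\mathbf{F}) = 0$, $\mathrm{cov}(\mathbf{F}) = \mathbf{I}$, $E(\boldsymbol{\varepsilon}) = 0$, and that $\mathbf{F}$ and $\boldsymbol{\varepsilon}$ are independent.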


3 Covariance of Log Returns

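– If we have n observations and p factors:

$$
\begin{aligned}
r_1 &= \mu_1 \Delta t + \beta_{11} F_t^{(1)} + \beta_{12} F_t^{(2)} + \dots + \beta_{1p} F_t^{(p)} + \varepsilon_1 \\
r_2 &= \mu_2 \Delta t + \beta_{21} F_t^{(1)} + \beta_{22} F_t^{(2)} + \dots + \beta_{2p} F_t^{(p)} + \varepsilon_2 \\
&\;\;\vdots \\
r_n &= \mu_n \Delta t + \beta_{n1} F_t^{(1)} + \beta_{n2} F_t^{(2)} + \dots + \beta_{np} F_t^{(p)} + \varepsilon_n
\end{aligned}
$$

– Or in matrix form

$$\mathbf{r} = \boldsymbol{\mu}\Delta t + \boldsymbol{\beta}\mathbf{F} + \boldsymbol{\varepsilon}$$

– Using

$$(\mathbf{r} - \boldsymbol{\mu}\Delta t)(\mathbf{r} - \boldsymbol{\mu}\Delta t)^T = \boldsymbol{\beta}\mathbf{F}(\boldsymbol{\beta}\mathbf{F})^T + \boldsymbol{\varepsilon}(\boldsymbol{\beta}\mathbf{F})^T + \boldsymbol{\beta}\mathbf{F}\boldsymbol{\varepsilon}^T + \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T$$

the covariance of the returns is

$$
\begin{aligned}
\mathrm{cov}(\mathbf{r}) &= E\left[(\mathbf{r} - \boldsymbol{\mu}\Delta t)(\mathbf{r} - \boldsymbol{\mu}\Delta t)^T\right] \\
&= \boldsymbol{\beta}E(\mathbf{F}\mathbf{F}^T)\boldsymbol{\beta}^T + E(\boldsymbol{\varepsilon}\mathbf{F}^T)\boldsymbol{\beta}^T + \boldsymbol{\beta}E(\mathbf{F}\boldsymbol{\varepsilon}^T) + E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T) \\
&= \boldsymbol{\beta}\boldsymbol{\beta}^T + \boldsymbol{\Psi}
\end{aligned}
$$

where $\boldsymbol{\Psi} = E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T)$. The cross terms vanish because $\mathbf{F}$ and $\boldsymbol{\varepsilon}$ are independent with zero mean, and $E(\mathbf{F}\mathbf{F}^T) = \mathbf{I}$.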
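The covariance identity for the factor model can be checked numerically. The following is a minimal sketch on synthetic data, assuming NumPy is available; the loadings, the diagonal form of Ψ, and the sample sizes are illustrative assumptions, not values from the project.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stocks, n_factors, n_samples = 4, 2, 200_000   # illustrative sizes (assumed)

beta = rng.normal(size=(n_stocks, n_factors))    # hypothetical factor loadings
psi_diag = rng.uniform(0.1, 0.5, n_stocks)       # Psi assumed diagonal for this sketch
psi = np.diag(psi_diag)
mu_dt = rng.normal(scale=0.01, size=n_stocks)    # drift term mu * dt

# F has zero mean and identity covariance; eps has zero mean and covariance Psi.
F = rng.standard_normal((n_factors, n_samples))
eps = np.sqrt(psi_diag)[:, None] * rng.standard_normal((n_stocks, n_samples))

r = mu_dt[:, None] + beta @ F + eps              # r = mu*dt + beta F + eps

sample_cov = np.cov(r)                           # rows of r are the variables
model_cov = beta @ beta.T + psi                  # beta beta^T + Psi

max_abs_err = np.max(np.abs(sample_cov - model_cov))
print(f"largest absolute deviation: {max_abs_err:.4f}")
```

The deviation shrinks as the number of samples grows, which is the behaviour the identity predicts.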


4 Principal Component Analysis

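• Principal Component Analysis

– Spectral decomposition of the covariance matrix

$$\mathrm{cov}(\mathbf{r}) = \boldsymbol{\beta}\boldsymbol{\beta}^T + \boldsymbol{\Psi} = \sum_{i=1}^{p} \lambda_i \mathbf{e}_i \mathbf{e}_i^T + \boldsymbol{\Psi}$$

where $(\lambda_i, \mathbf{e}_i)$ are the eigenvalue, eigenvector pairs.

• Noise Reduction

– We can approximate the model with a limited set of m eigenvectors, or principal components:

$$\hat{r} = \mu\,dt + \sum_{j=1}^{m} \beta_j F_t^{(j)} + d\varepsilon_t, \qquad m < p$$

– Using the eigenvectors with the largest eigenvalues keeps the components that contribute most to the variance in the data.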
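A truncated spectral decomposition of this kind could be sketched as follows, assuming NumPy. The function name `top_components` and the synthetic one-factor returns are illustrative assumptions only.

```python
import numpy as np

def top_components(returns, m):
    """Return the m largest eigenvalues and matching eigenvectors of the
    sample covariance of `returns` (shape: n_samples x n_stocks)."""
    cov = np.cov(returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]         # indices of the m largest
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(1)
# Synthetic returns: one strong common factor plus idiosyncratic noise.
market = rng.standard_normal((500, 1))
returns = market @ np.ones((1, 6)) + 0.3 * rng.standard_normal((500, 6))

lam, e = top_components(returns, m=2)

# Rank-m approximation sum_i lambda_i e_i e_i^T of the covariance.
cov_m = (e * lam) @ e.T

# Fraction of total variance captured by the m retained components.
explained = lam.sum() / np.trace(np.cov(returns, rowvar=False))
print(f"variance explained by 2 components: {explained:.3f}")
```

Because the synthetic data has a single dominant factor, most of the variance lands in the first component.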


5 Stability of Principal Components

• Comparison of the Stability/Evolution of the PCA

– 30 day initial data sample

– Moved forward one day at a time.

– The 10 largest eigenvectors are compared to those of the first sample using the dot product

$$\cos\theta_n = \mathbf{e}_0^T \mathbf{e}_n$$

• Two Subtle Problems

– 1. An eigenvector returned by PCA may be the negative (sign-flipped copy) of the corresponding vector in the first set.

– 2. Since the eigenvectors are returned in descending order of eigenvalue, a change in the relative magnitude of any components may swap their positions. Therefore, comparisons must be made carefully.

• Results

– The eigenvectors are relatively stable over time.

– After the first 10 eigenvectors they become more unstable.
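One way to handle both subtle problems when comparing eigenvectors is to take absolute values of the dot products (sign flips) and to search for the best match rather than comparing by position (swapped order). This is a simple sketch, assuming NumPy and unit-length eigenvectors stored as columns; `matched_cosines` is a hypothetical helper, and the slides do not say how the authors resolved these issues.

```python
import numpy as np

def matched_cosines(e0, en):
    """For each reference eigenvector (column of e0), return |cos(theta)| to
    its best-matching column of en.

    The absolute value removes the sign ambiguity; taking the maximum over
    columns tolerates eigenvectors that have swapped positions."""
    cos = np.abs(e0.T @ en)        # entry (i, j) = |e0_i . en_j|
    return cos.max(axis=1)

# Reference vectors e_1, e_2, and a later sample where they are swapped
# and one of them is negated.
e0 = np.eye(3)[:, :2]
en = np.column_stack([-e0[:, 1], e0[:, 0]])

cosines = matched_cosines(e0, en)
print(cosines)
```

A naive column-by-column dot product on the same inputs would report 0 for both vectors, even though the subspace has not changed.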



[Figure: eight histogram panels of cos(theta) between the eigenvectors of each shifted sample and those of the first sample. Panel 1: "Distribution of Eigen Vector #1", Cos(theta), Mean=0.99382. y-axis: Number of Vectors.]


8 Statistical Distance vs Time of Day

• Mahalanobis Distance

– The distance of a data point from the center of the distribution

$$MD(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}$$

• Procedure

– The training set was 100 days of 15 minute log return data.

– The distance of the next 10 data points was calculated.

– The training set was then shifted forward and the next 10 points measured.

– The data was sorted by time of day to find the times of day that generated the most outliers.
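The distance calculation for one window could be sketched like this, assuming NumPy and the standard square-root form of the Mahalanobis distance. The function name and the two-dimensional synthetic data are illustrative; a real run would use the training window of stock returns and would need the covariance estimate to be invertible.

```python
import numpy as np

def mahalanobis(train, points):
    """Mahalanobis distance of each row of `points` from the distribution
    estimated on `train` (both shaped n_samples x n_dims)."""
    mu = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)
    diff = points - mu
    # Solve cov * z = diff^T instead of forming the explicit inverse.
    z = np.linalg.solve(cov, diff.T)                # shape (n_dims, n_points)
    d2 = np.einsum("ij,ji->i", diff, z)             # diff_i^T cov^{-1} diff_i
    return np.sqrt(d2)

rng = np.random.default_rng(2)
train = rng.standard_normal((1000, 2)) * np.array([1.0, 2.0])

center = train.mean(axis=0)
new_points = np.vstack([center, center + 10.0])     # the center and a far outlier
dist = mahalanobis(train, new_points)
print(dist)
```

Sliding the training window forward and repeating the call on the next 10 points reproduces the procedure in the bullets.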


9 Distance of New Test Data from the Training Data

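[Figure: "Mahalanobis Distance of new Data Throughout the day". x-axis: Number of 15 Minute Intervals in Day (0 to 30). y-axis: Magnitude of Distance (x 10^4). Annotation: Mahalanobis Distance, $MD(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}$.]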

Conclusion – We can separate the market into two distinct time periods where the returns are generated by two different processes.


10 Generation of Residuals

• Partial Least Squares

– If X is the data set and Y is the variable to be regressed on the data

– then PCA analyzes $E(X^T X)$

– and PLS analyzes $E(X^T Y)$

1. PLS finds the matrix information associated with the first eigenvector

2. Subtracts this information from the covariance matrix

3. Then finds the information for the second eigenvector, etc.

• Procedure

– Test data: 100 day sample of 15 minute log returns on 500 stocks

– Predict the next 10 points of data using PLS with the 9 largest eigenvectors

– Test data moved forward

• Results

– Measure of fit

$$R^2 = 1 - \frac{\boldsymbol{\varepsilon}^T \boldsymbol{\varepsilon}}{\mathbf{y}^T \mathbf{y}}$$
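The find, subtract, repeat loop of PLS can be illustrated with the textbook single-response algorithm (PLS1 with deflation). This is a sketch assuming NumPy; it is not the authors' implementation, and `pls1_fit`, `pls1_predict`, and the synthetic data are hypothetical. The fit measure follows the slide's definition of R^2.

```python
import numpy as np

def pls1_fit(X, y, m):
    """Fit single-response PLS with m components; returns regression
    coefficients plus the means used for centering."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    n_features = X.shape[1]
    W = np.zeros((n_features, m))      # weight vectors
    P = np.zeros((n_features, m))      # X loadings
    q = np.zeros(m)                    # y loadings
    for k in range(m):
        w = Xk.T @ yk                  # direction of largest covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # scores for this component
        tt = t @ t
        P[:, k] = Xk.T @ t / tt
        q[k] = yk @ t / tt
        Xk = Xk - np.outer(t, P[:, k]) # subtract the information just extracted
        yk = yk - q[k] * t
        W[:, k] = w
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def pls1_predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

rng = np.random.default_rng(3)
X = rng.standard_normal((300, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.standard_normal(300)

coef, x_mean, y_mean = pls1_fit(X, y, m=5)
resid = y - pls1_predict(X, coef, x_mean, y_mean)
r2 = 1.0 - (resid @ resid) / (y @ y)   # measure of fit as defined on the slide
print(f"R^2 = {r2:.4f}")
```

With as many components as predictors, PLS1 reproduces ordinary least squares; using fewer components (9 or 10 out of roughly 500 in the slides) is what provides the noise reduction.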


11 PLS First 45 Minutes of Market Removed

[Figure: three panels.
(1) "Out of Sample Distribution of Residuals": Deviation of Residuals = 0.0016586, R^2 = 0.87011. y-axis: Number of Samples.
(2) "Q-Q Plot Out of Sample Residuals". x-axis: Standard Normal Quantiles. y-axis: Quantiles of Input Sample.
(3) "Out of Sample Residuals over time". Axes: Residuals, Time.]


12 PLS First 45 Minutes of the Market

[Figure: three panels.
(1) "Out of Sample Residuals over time". Axes: Residuals, Time.
(2) "Q-Q Plot Out of Sample Residuals". x-axis: Standard Normal Quantiles. y-axis: Quantiles of Input Sample.
(3) "Out of Sample Distribution of Residuals": Deviation of Residuals = 0.004329, R^2 = 0.7535. y-axis: Number of Samples.]


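13 Calibrating OU Process: Problem Setup

• Need to estimate κ, μ and σ in the OU-Process Equation:

$$dX_t = \kappa(\mu - X_t)\,dt + \sigma\,dW_t$$

• The discrete form of the solution of the SDE can be written as:

$$X_{t+1} = a + b\,X_t + \varepsilon_{t+1}$$

where:

$$a = \mu\left(1 - e^{-\kappa\Delta}\right), \qquad b = e^{-\kappa\Delta}, \qquad \varepsilon_{t+1} = \sigma\sqrt{\frac{1 - e^{-2\kappa\Delta}}{2\kappa}}\; N(0,1)$$

κ: coefficient of mean reversion

∆: discretization time step

μ: long term mean of the residuals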
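The discrete form lends itself to exact simulation, which is useful for testing any calibration routine against known parameters. A minimal sketch using only the Python standard library; the function name and the parameter values are illustrative assumptions.

```python
import math
import random

def simulate_ou(kappa, mu, sigma, delta, n, x0, seed=0):
    """Simulate n steps of an OU process with the exact discretization
    X_{t+1} = a + b X_t + eps, where a = mu (1 - b), b = exp(-kappa delta)."""
    rng = random.Random(seed)
    b = math.exp(-kappa * delta)
    a = mu * (1.0 - b)
    noise_sd = sigma * math.sqrt((1.0 - b * b) / (2.0 * kappa))
    path = [x0]
    for _ in range(n):
        path.append(a + b * path[-1] + noise_sd * rng.gauss(0.0, 1.0))
    return path

# Illustrative parameters (assumed, not taken from the project data).
path = simulate_ou(kappa=0.1, mu=1.0, sigma=0.2, delta=1.0, n=20000, x0=1.0)
sample_mean = sum(path) / len(path)
print(f"sample mean = {sample_mean:.3f}")
```

Over a long path the sample mean settles near μ, as expected for a mean-reverting process.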


14 Calibrating OU Process: OLS and MLE

• Least Squares: Basic idea: Fit parameters by minimizing the sum of squared error terms.

• Maximum Likelihood Estimation: Basic idea: Find parameters by maximizing the log-likelihood of the data.
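For the least squares route, regressing X_{t+1} on X_t gives estimates of a and b, which can then be inverted for κ, μ and σ. The sketch below uses only the standard library and assumes the estimated slope lies strictly between 0 and 1; `calibrate_ou_ols` and the test parameters are hypothetical. With Gaussian noise, the conditional maximum likelihood estimates of a and b coincide with these OLS estimates.

```python
import math
import random

def calibrate_ou_ols(path, delta):
    """Estimate (kappa, mu, sigma) by regressing X_{t+1} on X_t."""
    x, y = path[:-1], path[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope estimates exp(-kappa * delta)
    a = mean_y - b * mean_x            # intercept estimates mu * (1 - b)
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    kappa = -math.log(b) / delta       # requires 0 < b < 1
    mu = a / (1.0 - b)
    sigma = math.sqrt(resid_var * 2.0 * kappa / (1.0 - b * b))
    return kappa, mu, sigma

# Synthetic OU path with known parameters (illustrative values).
rng = random.Random(0)
true_kappa, true_mu, true_sigma, delta = 0.1, 1.0, 0.2, 1.0
b_true = math.exp(-true_kappa * delta)
sd = true_sigma * math.sqrt((1.0 - b_true ** 2) / (2.0 * true_kappa))
path = [true_mu]
for _ in range(20000):
    path.append(true_mu * (1.0 - b_true) + b_true * path[-1] + sd * rng.gauss(0.0, 1.0))

kappa_hat, mu_hat, sigma_hat = calibrate_ou_ols(path, delta)
print(kappa_hat, mu_hat, sigma_hat)
```

On short windows the slope estimate is noisier, which makes the recovered κ correspondingly less reliable.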


15 Main Issue

• OLS and MLE tend to produce similar results.

• However, MLE is known for overestimating the mean reversion speed κ. Example: Johnson, Thomas. “Approximating Optimal Trading Strategies Under Parameter Uncertainty: A Monte Carlo Approach”. Kellogg Business School. 2009.

– Main idea: MLE typically overestimates the mean reversion speed and, as a result, underestimates the noise σ.

– The paper compares a filtering trading strategy to MLE.

– Filtering outperforms MLE every time.

• Reason: Boguslavsky, Boguslavskaya. “Arbitrage Under Power”. February 2009.

– The MLE model suggests overly aggressive positions that can quickly lead the trader to bankruptcy.


16 Kalman Filtering

• Idea: a mathematical method that uses noisy measurements to produce estimates that tend to be closer to the true value of the variable of interest.
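The idea can be illustrated with a scalar Kalman filter applied to a noisy observation of an AR(1) state, the same discrete form as the OU process. This sketch assumes the model parameters are known and only filters the state; it does not perform the parameter estimation discussed in the slides. All names and values are illustrative, and only the standard library is used.

```python
import math
import random

def kalman_filter_ar1(obs, a, b, q, r, x0, p0):
    """Filter observations y_t = x_t + v_t (v ~ N(0, r)) of the state
    x_{t+1} = a + b x_t + w_t (w ~ N(0, q)). Returns filtered state means."""
    x, p = x0, p0
    filtered = []
    for y in obs:
        # Predict the next state and its variance.
        x_pred = a + b * x
        p_pred = b * b * p + q
        # Update with the new measurement.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        filtered.append(x)
    return filtered

rng = random.Random(0)
a, b, q, r = 0.1, 0.9, 0.01, 0.04      # assumed model parameters
x, states, obs = 1.0, [], []
for _ in range(5000):
    x = a + b * x + rng.gauss(0.0, math.sqrt(q))
    states.append(x)
    obs.append(x + rng.gauss(0.0, math.sqrt(r)))

filtered = kalman_filter_ar1(obs, a, b, q, r, x0=1.0, p0=q / (1.0 - b * b))
mse_raw = sum((o - s) ** 2 for o, s in zip(obs, states)) / len(states)
mse_filtered = sum((f - s) ** 2 for f, s in zip(filtered, states)) / len(states)
print(f"raw MSE = {mse_raw:.4f}, filtered MSE = {mse_filtered:.4f}")
```

The filtered estimates track the hidden state more closely than the raw measurements do.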


17 Comparison of Estimation Methods

• Parameter estimation by Kalman filtering produces more accurate estimates of the OU process parameters than either MLE or OLS.

• Major disadvantage of EM Algorithm: Might take a long time to converge, computationally intensive for large window sizes.

• Solution: Use MLE/OLS to produce initial guesses then use EM to refine estimation.


18 Optimal Trading of the Residuals-1

• Implement the Boguslavsky/Boguslavskaya strategy described in: “Optimal Arbitrage Trading” (2003).

• O-U process:

• Conditional Distribution:

• Utility Function

• Normalization Process : Let α be the control variable and W the wealth at time t:

• Value Function:


19 Optimal Trading of the Residuals-2

• Solve for the optimal control parameter using the HJB equation:

• Reduces to the PDE:

• Solution: Let τ be the time left for trading,


20 Results on EvA residuals

• ∆ ~ 1 min, γ = -0.5, initial wealth = 100,000

[Figure: two panels, "Cumulative Wealth" and "Optimal Trading Position".]

• Peak ~ 4,300,000

• End ~ 3,700,000


21 Results on Our residuals using EvA’s data-XOM

• ∆ ~ 15 min, initial wealth = 100,000

• Cumulative Wealth, γ = 0: Peak ~ 530,000, End ~ 490,000

• Cumulative Wealth, γ = -0.5: Peak ~ 520,000, End ~ 450,000


22 Incorporating TC-Separate Fund Allocation

• All wealth curves will lie between the red and green curves.

• Blue curve = no fixed cost: peak = 530,000, end = 490,000

• Green curve: peak = 470,000, end = 420,000

Legend: Blue = no cost, Green = 10*fixed cost, Red = 1*fixed cost


23 Trading Residuals in Practice

• Look at historical 15 minute data for ~500 stocks using a 100 day sliding window

• For every stock i at time t

– Generate a partial least squares representation with 10 components, using the returns of the remaining 499 stocks over the last 100 days (sliding window)

– Generate a residual return by removing the PLS approximation from the stock return

– Generate the residue-replicating portfolio weights

• P_i = [-β_1, -β_2, …, -β_{i-1}, 1, -β_{i+1}, …, -β_n]


24 Available Data at Time t

• Stock returns vector R(t)

• Residual returns vector R_residue(t)

• Residual means vector μ_residue(t)

• Residual standard deviations vector σ_residue(t)

• Residual replication matrix P(t)

– P_ij(t) is the weight of the jth stock in the portfolio replicating the ith residue

– If we have a residual positions vector V(t), the final investment portfolio will be V(t)P(t)
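The replication matrix and the netting of residual positions into stock positions could be sketched as follows, assuming NumPy. The three-stock beta values, returns and positions are made up for illustration, and any intercept in the regression is ignored.

```python
import numpy as np

def replication_matrix(betas):
    """Build P from a square array where betas[i, j] is the loading of
    stock j in the fit of stock i (the diagonal is ignored).

    Row i is [-beta_1, ..., 1 (at position i), ..., -beta_n]."""
    P = -np.array(betas, dtype=float)
    np.fill_diagonal(P, 1.0)
    return P

# Hypothetical loadings for three stocks.
betas = np.array([[0.0, 0.5, 0.2],
                  [0.4, 0.0, 0.1],
                  [0.3, 0.6, 0.0]])
P = replication_matrix(betas)

R = np.array([0.010, -0.004, 0.002])       # stock returns R(t)
residual_returns = P @ R                   # R_i minus its fitted value

V = np.array([1000.0, 0.0, -500.0])        # positions held in each residual
stock_positions = V @ P                    # net position per stock, V(t) P(t)
print(stock_positions)
```

Multiplying by P on the left nets offsetting hedges across residuals, so the trades actually placed can be smaller than the sum of the individual replicating portfolios.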


25 The Trading Strategy

• Evaluate the market every 15 minutes to look for strong deviations of residuals from their mean

– Enter positions that exceed an entry threshold

– Leave positions that cross the exit threshold

– Allocate a defined percentage of money equally across all open opportunities, subject to a minimum cash position percentage

• The dynamic rebalancing of the portfolio is based on the log-optimal portfolio growth strategy of volatility pumping
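The entry and exit rules can be read as a small state machine on a standardized score, (residual - mean) / standard deviation. The sketch below is one interpretation under that assumption; the slides do not spell out the exact rule, and the function name and threshold values are illustrative. Cash allocation and position sizing are not shown.

```python
def update_position(position, score, long_enter, long_exit, short_enter, short_exit):
    """Return the new position (+1 long, -1 short, 0 flat) for one residual.

    `score` is assumed to be the standardized deviation of the residual
    from its mean."""
    if position == 0:
        if score <= long_enter:        # residual far below its mean: buy
            return 1
        if score >= short_enter:       # residual far above its mean: sell
            return -1
    elif position == 1 and score >= long_exit:
        return 0                       # long position has reverted enough
    elif position == -1 and score <= short_exit:
        return 0                       # short position has reverted enough
    return position

# Illustrative thresholds and a hand-made sequence of scores.
thresholds = dict(long_enter=-2.7, long_exit=-1.5, short_enter=2.7, short_exit=1.5)
scores = [0.0, -3.0, -2.0, -1.0, 0.0, 3.0, 2.0, 1.0, 0.0]

position, positions = 0, []
for s in scores:
    position = update_position(position, s, **thresholds)
    positions.append(position)
print(positions)   # [0, 1, 1, 0, 0, -1, -1, 0, 0]
```

Keeping the exit threshold closer to the mean than the entry threshold gives the rule hysteresis, so a position is not opened and closed on the same small fluctuation.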


26 The Secret Sauce: Trading Parameters

• 6 parameters

– Long enter threshold, short enter threshold

– Long exit threshold, short exit threshold

– Minimum cash percentage

– Maximum single position percentage

• The trading algorithm is robust to the trading parameters (at least as far as I tested!)

• Divided the data set into a training period and a testing period, used the MATLAB Optimization Toolbox to find the parameters that maximize the Sharpe ratio over the training period, and applied the resulting parameters to the testing period

• This procedure can be applied continuously to periodically recalibrate the trading parameters
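The train-then-test calibration can be sketched with a plain grid search standing in for the MATLAB optimizer. Everything here is an illustrative assumption: the toy one-threshold rule, the synthetic mean-reverting data, and the per-period Sharpe ratio with a zero risk-free rate. Only the standard library is used.

```python
import math
import random

def sharpe_ratio(returns):
    """Per-period Sharpe ratio (mean / sample standard deviation),
    assuming a zero risk-free rate and no annualization."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((x - mean) ** 2 for x in returns) / (n - 1)
    return mean / math.sqrt(var) if var > 0 else 0.0

def strategy_returns(scores, next_returns, enter):
    """Toy rule: short when score > enter, long when score < -enter, else flat."""
    out = []
    for s, r in zip(scores, next_returns):
        pos = -1 if s > enter else (1 if s < -enter else 0)
        out.append(pos * r)
    return out

# Synthetic data in which high scores tend to be followed by negative returns.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(6000)]
next_returns = [0.01 * (-0.5 * s + rng.gauss(0.0, 1.0)) for s in scores]

split = 3000                                   # first half trains, second half tests
grid = [0.5, 1.0, 1.5, 2.0]
best_enter = max(
    grid,
    key=lambda e: sharpe_ratio(strategy_returns(scores[:split], next_returns[:split], e)),
)
test_sharpe = sharpe_ratio(strategy_returns(scores[split:], next_returns[split:], best_enter))
print(best_enter, round(test_sharpe, 3))
```

Re-running the search on a rolling training window is one way to carry out the periodic recalibration mentioned in the last bullet.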


27

[Figure: wealth multiples over time, with the training period and the test period marked. y-axis: Wealth Multiples.]

Long Enter threshold = -2.698336, Long Exit threshold = -1.500553, Short Enter threshold = 2.698336, Short Exit threshold = 1.500553, Minimum Cash Position = 84.418854%, Maximum Investment = 4.071198%, Sharpe ratio = 0.349356
