Introduction to Algorithmic Trading Strategies Lecture 3
Pairs Trading by Cointegration
Haksun Li [email protected]
www.numericalmethod.com
Outline Distance method Cointegration Stationarity DickeyβFuller tests
2
References Pairs Trading: A Cointegration Approach. Arlen David
Schmidt. University of Sydney. Finance Honours Thesis. November 2008.
Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Soren Johansen. Oxford University Press, USA. February 1, 1996.
3
Pairs Trading Definition: trade one asset (or basket) against another
asset (or basket) Long one and short the other
Intuition: For two closely related assets, they tend to βmove togetherβ (common trend). We want to buy the cheap one and sell the expensive one. Exploit short term deviation from long term equilibrium.
Try to make money from βspreadβ.
4
Spread π = π β π½π π½ hedge ratio cointegration coefficient
5
Dollar Neutral Hedge Suppose ES (S&P500 E-mini future) is at 1220 and each
point worth $50, its dollar value is about $61,000. Suppose NQ (Nasdaq 100 E-mini future) is at 1634 and each point worth $20, its dollar value is $32,680.
π½ = 6100032680
= 1.87.
π = πΈπΈ β 1.87 Γ ππ Buy Z = Buy 10 ES contracts and Sell 19 NQ contracts. Sell Z = Sell 10 ES contracts and Buy 19 NQ contracts.
6
Market Neutral Hedge Suppose ES has a beta of 1.25, NQ 1.11.
We use π½ = 1.251.11
= 1.13
7
Dynamic Hedge π½ changes with time, covariance, market conditions,
etc. Periodic recalibration.
8
Distance The distance between two time series: π = β π₯π β π¦π
2
π₯π, π¦π are the normalized prices. We choose a pair of stocks among a collection with the
smallest distance, π.
9
Distance Trading Strategy Sell Z if Z is too expensive. Buy Z if Z is too cheap. How do we do the evaluation?
10
Z Transform We normalize Z. The normalized value is called z-score.
π§ = π₯βοΏ½Μ οΏ½ππ₯
Other forms:
π§ = π₯βπΓοΏ½Μ οΏ½πΓππ₯
M, S are proprietary functions for forecasting.
11
A Very Simple Distance Pairs Trading Sell Z when z > 2 (standard deviations). Sell 10 ES contracts and Buy 19 NQ contracts.
Buy Z when z < -2 (standard deviations). Buy 10 ES contracts and Sell 19 NQ contracts.
12
Pros of the Distance Model Model free. No mis-specification. No mis-estimation. Distance measure intuitively captures the LOP idea.
13
Cons of the Distance Model Does not guarantee stationarity. Cannot predict the convergence time (expected
holding period). Ignores the dynamic nature of the spread process,
essentially treat the spread as i.i.d. Using more strict criterions works for equity. In fixed
income trading, we donβt have the luxury of throwing away many pairs.
14
Risks in Pairs Trading Long term equilibrium does not hold. Systematic market risk. Firm specific risk. Liquidity.
15
Stationarity These ad-hoc π½calibration does not guarantee the
single most important statistical property in trading: stationarity.
Strong stationarity: the joint probability distribution of π₯π‘ does not change over time.
Weak stationarity: the first and second moments do not change over time. Covariance stationarity
16
Cointegration Cointegration: select a linear combination of assets to
construct an (approximately) stationary portfolio. A stationary stochastic process is mean-reverting. Long when the spread/portfolio/basket falls
sufficiently below a long term equilibrium. Short when the spread/portfolio/basket rises
sufficiently above a long term equilibrium.
17
Objective Given two I(1) price series, we want to find a linear
combination such that: π§π‘ = π₯π‘ β π½π¦π‘ = π + ππ‘
ππ‘ is I(0), a stationary residue. π is the long term equilibrium. Long when π§π‘ < π β Ξ. Sell when π§π‘ > π + Ξ.
18
Stocks from the Same Industry Reduce market risk, esp., in bear market. Stocks from the same industry are likely to be subject to the
same systematic risk. Give some theoretical unpinning to the pairs trading. Stocks from the same industry are likely to be driven by the
same fundamental factors (common trends).
19
Cointegration Definition ππ‘~CI π, π if All components of ππ‘ are integrated of same order π. There exists a π½π‘ such that the linear combination,π½π‘ππ‘ =π½1π1π‘ + π½2π2π‘ + β―+ π½ππππ‘, is integrated of order π β π ,π > 0.
π½is the cointegrating vector, not unique.
20
Illustration for Trading Suppose we have two assets, both reasonably I(1), we
want to find π½ such that π = π + π½π is I(0), i.e., stationary.
In this case, we have π = 1, π = 1.
21
A Simple VAR Example π¦π‘ = π11π¦π‘β1 + π12π§π‘β1 + ππ¦π‘ π§π‘ = π21π¦π‘β1 + π22π§π‘β1 + ππ§π‘ Theorem 4.2, Johansen, places certain restrictions on
the coefficients for the VAR to be cointegrated. The roots of the characteristics equation lie on or outside
the unit disc.
22
Coefficient Restrictions
π11 = 1βπ22 βπ12π211βπ22
π22 > β1 π12π21 + π22 < 1
23
VECM (1) Taking differences π¦π‘ β π¦π‘β1 = π11 β 1 π¦π‘β1 + π12π§π‘β1 + ππ¦π‘ π§π‘ β π§π‘β1 = π21π¦π‘β1 + π22 β 1 π§π‘β1 + ππ§π‘
Ξπ¦π‘Ξπ§π‘
= π11 β 1 π12π21 π22 β 1
π¦π‘β1π§π‘β1 +
ππ¦π‘ππ§π‘
Substitution of π11
Ξπ¦π‘Ξπ§π‘
=βπ12π211βπ22
π12π21 π22 β 1
π¦π‘β1π§π‘β1 +
ππ¦π‘ππ§π‘
24
VECM (2) Ξπ¦π‘ = πΌπ¦ π¦π‘β1 β π½π§π‘β1 + ππ¦π‘ Ξπ§π‘ = πΌπ§ π¦π‘β1 β π½π§π‘β1 + ππ§π‘ πΌπ¦ = βπ12π21
1βπ22
πΌπ§ = π21
π½ = 1βπ22π21
, the cointegrating coefficient
π¦π‘β1 β π½π§π‘β1 is the long run equilibrium, I(0). πΌπ¦, πΌπ§ are the speed of adjustment parameters.
25
Interpretation Suppose the long run equilibrium is 0, Ξπ¦π‘, Ξπ§π‘ responds only to shocks.
Suppose πΌπ¦ < 0, πΌπ§ > 0, π¦π‘ decreases in response to a +ve deviation. π§π‘ increases in response to a +ve deviation.
26
Granger Representation Theorem If ππ‘is cointegrated, an VECM form exists. The increments can be expressed as a functions of the
dis-equilibrium, and the lagged increments. Ξππ‘ = πΌπ½β²ππ‘β1 + βππ‘Ξππ‘β1 + ππ‘ In our simple example, we have
Ξπ¦π‘Ξπ§π‘
=πΌπ¦πΌπ§
1 βπ½π¦π‘β1π§π‘β1 +
ππ¦π‘ππ§π‘
27
Granger Causality π§π‘ does not Granger Cause π¦π‘ if lagged values of Ξπ§π‘βπ do not enter the Ξπ¦π‘ equation.
π¦π‘ does not Granger Cause π§π‘ if lagged values of Ξπ¦π‘βπ do not enter the Ξπ§π‘ equation.
28
Test for Stationarity An augmented DickeyβFuller test (ADF) is a test for a unit
root in a time series sample. It is an augmented version of the DickeyβFuller test for a
larger and more complicated set of time series models. Intuition: if the series π¦π‘ is stationary, then it has a tendency to return to a
constant mean. Therefore large values will tend to be followed by smaller values, and small values by larger values. Accordingly, the level of the series will be a significant predictor of next period's change, and will have a negative coefficient.
If, on the other hand, the series is integrated, then positive changes and negative changes will occur with probabilities that do not depend on the current level of the series.
In a random walk, where you are now does not affect which way you will go next.
29
ADF Math Ξπ¦π‘ = πΌ + π½π½ + πΎπ¦π‘β1 + β Ξπ¦π‘βπ
πβ1π=1 + ππ‘
Null hypothesis π»0: πΎ = 0. (π¦π‘ non-stationary) πΌ = 0,π½ = 0 models a random walk. π½ = 0 models a random walk with drift.
Test statistics = πΎοΏ½π πΎοΏ½
, the more negative, the more reason to reject π»0 (hence π¦π‘ stationary).
SuanShu: AugmentedDickeyFuller.java
30
Engle-Granger Two Step Approach Estimate either π¦π‘ = π½10 + π½11π§π‘ + π1π‘ π§π‘ = π½20 + π½21π¦π‘ + π2π‘ As the sample size increase indefinitely, asymptotically a
test for a unit root in π1π‘ and π2π‘ are equivalent, but not for small sample sizes.
Test for unit root using ADF on either π1π‘ and π2π‘ . If π¦π‘ and π§π‘ are cointegrated, π½ super converges.
31
Engle-Granger Pros and Cons Pros: simple
Cons: This approach is subject to twice the estimation errors. Any
errors introduced in the first step carry over to the second step.
Work only for two I(1) time series.
32
Testing for Cointegration Note that in the VECM, the rows in the coefficient, Ξ ,
are NOT linearly independent.
Ξπ¦π‘Ξπ§π‘
=βπ12π211βπ22
π12π21 π22 β 1
π¦π‘β1π§π‘β1 +
ππ¦π‘ππ§π‘
βπ12π211βπ22
π12 Γ β 1βπ22π12
= π21 π22 β 1
The rank of Ξ determine whether the two assets π¦π‘ and π§π‘ are cointegrated.
33
VAR & VECM In general, we can write convert a VAR to an VECM. VAR (from numerical estimation by, e.g., OLS): ππ‘ = β π΄πππ‘βπ + ππ‘
ππ=1
Transitory form of VECM (reduced form) Ξππ‘ = Ξ ππ‘β1 + β ΞπΞππ‘βπ + ππ‘
πβ1π=1
Long run form of VECM Ξππ‘ = β Ξ₯πΞππ‘βπ
πβ1π=1 + Ξ ππ‘βπ + ππ‘
34
The Ξ Matrix Rank(Ξ ) = n, full rank The system is already stationary; a standard VAR model in
levels. Rank(Ξ ) = 0 There exists NO cointegrating relations among the time
series. 0 < Rank(Ξ ) < n Ξ = πΌπ½β² π½ is the cointegrating vector πΌ is the speed of adjustment.
35
Rank Determination Determining the rank of Ξ is amount to determining
the number of non-zero eigenvalues of Ξ . Ξ is usually obtained from (numerical VAR) estimation. Eigenvalues are computed using a numerical procedure.
36
Trace Statistics Suppose the eigenvalues of Ξ are:π1 > π2 > β― > ππ. For the 0 eigenvalues, ln 1 β ππ = 0. For the (big) non-zero eigenvalues, ln 1 β ππ is (very
negative). The likelihood ratio test statistics π π» π |π» π = βπβ log 1 β ππ
ππ=π+1
H0: rank β€ r; there are at most r cointegrating Ξ².
37
Test Procedure int r = 0;//rank for (; r <= n; ++r) { compute Q = π π» π |π» π ; If (Q > c.v.) {//compare against a critical value
break;//fail to reject the null hypothesis; rank found }
} r is the rank found
38
Decomposing Ξ Suppose the rank of Ξ = π. Ξ = πΌπ½β². Ξ is π Γ π. πΌ is π Γ π. π½β²is r Γ π.
39
Estimating π½ π½ can estimated by maximizing the log-likelihood
function in Chapter 6, Johansen. logL Ξ¨,πΌ,π½,Ξ©
Theorem 6.1, Johansen: π½ is found by solving the following eigenvalue problem: ππΈ11 β πΈ10πΈ00β1πΈ01 = 0
40
π½ Each non-zero eigenvalue Ξ» corresponds to a
cointegrating vector, which is its eigenvector. π½ = π£1, π£2,β― , π£π π½ spans the cointegrating space. For two cointegrating asset, there are only one π½ (π£1)
so it is unequivocal. When there are multiple π½, we need to add economic
restrictions to identify π½.
41
Trading the Pairs Given a space of (liquid) assets, we compute the
pairwise cointegrating relationships. For each pair, we validate stationarity by performing
the ADF test. For the strongly mean-reverting pairs, we can design
trading strategies around them.
42