Upload
zudora
View
46
Download
2
Tags:
Embed Size (px)
DESCRIPTION
STAT 497 LECTURE NOTE 11. VAR MODELS AND GRANGER CAUSALITY. VECTOR TIME SERIES. A vector series consists of multiple single series. Why we need multiple series? To be able to understand the relationship between several components To be able to get better forecasts. VECTOR TIME SERIES. - PowerPoint PPT Presentation
Citation preview
STAT 497LECTURE NOTE 11
VAR MODELS AND GRANGER CAUSALITY
1
VECTOR TIME SERIES
• A vector series consists of multiple single series.
• Why we need multiple series?– To be able to understand the relationship
between several components– To be able to get better forecasts
2
VECTOR TIME SERIES
• Price movements in one market can spread easily and instantly to another market. For this reason, financial markets are more dependent on each other than ever before. So, we have to consider them jointly to better understand the dynamic structure of global market. Knowing how markets are interrelated is of great importance in finance.
• For an investor or a financial institution holding multiple assets play an important role in decision making.
3
VECTOR TIME SERIES
4
VECTOR TIME SERIES
5
VECTOR TIME SERIES
6
7
8
VECTOR TIME SERIES
• Consider an m-dimensional time series Yt=(Y1,Y2,…,Ym)’. The series Yt is weakly stationary if its first two moments are time invariant and the cross covariance between Yit and Yjs for all i and j are functions of the time difference (st) only.
9
VECTOR TIME SERIES
• The mean vector:
• The covariance matrix function
10
mtE ,,, 21 Y
kkk
kkk
kkk
ECovk
mmmm
m
m
tkttkt
21
22221
11211
YYY,Y
VECTOR TIME SERIES
• The correlation matrix function:
where D is a diagonal matrix in which the i-th diagonal element is the variance of the i-th process, i.e.
• The covariance and correlation matrix functions are positive semi-definite.
11
kkk ij// 2121 DD
.0,,0,0 2211 mmdiag D
VECTOR WHITE NOISE PROCESS
• {at}~WN(0,) iff {at} is stationary with mean 0 vector and
12
o.w.
kk
,
0,
0
VECTOR TIME SERIES
• {Yt} is a linear process if it can be expressed as
where {j} is a sequence of mxn matrix whose entries are absolutely summable, i.e.
13
,~aa 0
0WN for Y t
jjtjt
jj l,i m.1,2,...,li, for 0
VECTOR TIME SERIES
• For a linear process, E(Yt)=0 and
14
,...,,, 210
kk
jjkj
MA (WOLD) REPRESENTATION
• For the process to be stationary, s should be square summable in the sense that each of the mxm sequence ij.s is square summable.
15
tt B aY
0s
ssBB where
AR REPRESENTATION
• For the process to be invertible, s should be absolute summable.
16
ttB aY
01
s
ssBB where
THE VECTOR AUTOREGRESSIVE MOVING AVERAGE (VARMA) PROCESSES
• VARMA(p,q) process:
17
tqtp BB aY
q
ppp
BBB
BBB where
10
10
pVARBq ttp aY 0
VMA(q)B0p tqt aY
VARMA PROCESS
• VARMA process is stationary, if the zeros of |p(B)| are outside the unit circle.
• VARMA process is invertible, if the zeros of |q(B)| are outside the unit circle.
18
BBB qptt 1 aY
ttpq
tt
BB
B
aY
aY
1
IDENTIFIBILITY PROBLEM
• Multiplying matrices by some arbitrary matrix polynomial may give us an identical covariance matrix. So, the VARMA(p,q) model is not identifiable. We cannot uniquely determine p and q.
19
IDENTIFIBILITY PROBLEM
• Example: VARMA(1,1) process
20
12
11
2
1
12
11
2
1
00
0
00
0
t,
t,
t,
t,
t,
t,
t,
t,
a
am
a
a
Y
Ym
Y
Y
t,
t,
t,
t,
a
amB
Y
YBm
2
1
2
1
10
1
10
1
t,
t,
t,
t,
t,
t,
a
aB
a
amBBm
Y
Y
2
1
2
11
2
1
10
1
10
1
10
1
MA()=VMA(1)
21
IDENTIFIBILITY
• To eliminate this problem, there are three methods suggested by Hannan (1969, 1970, 1976, 1979).– From each of the equivalent models, choose the
minimum MA order q and AR order p. The resulting representation will be unique if Rank(p(B))=m.
– Represent p(B) in lower triangular form. If the order of ij(B) for i,j=1,2,…,m, then the model is identifiable.
– Represent p(B) in a form p(B) =p(B)I where p(B) is a univariate AR(p). The model is identifiable if p0.
22
VAR(1) PROCESS
• Yi,t depends not only the lagged values of Yit but also the lagged values of the other variables.
• Always invertible.• Stationary if outside the unit
circle. Let =B1.
23
ttB aYI
0 BI
00 BI
The zeros of |IB| is related to the eigenvalues of .
VAR(1) PROCESS
• Hence, VAR(1) process is stationary if the eigenvalues of ; i, i=1,2,…,m are all inside the unit circle.
• The autocovariance matrix:
24
tkttkt
ttkttkt
E
EEk
aYYY
aYYYY
1
1
101
01
k,k
k,k k
VAR(1) PROCESS
• k=1,
25
0101 1
00
100010
1010
11
1
1
VAR(1) PROCESS
• Then,
26
BvecAC
vecIvec
ABCvec
product Kronecker where
10
00
7
6
2
1
4
3
71
64
23
XX vece.g.
Ba
BB
BA
mnm1
1n
aB
aa
e.g.
11
VAR(1) PROCESS
• Example:
27
5080
04031
30602011
2060
3011
2060
3011
21
2
1
.,.
..
....det
..
..
..
..ttt
I
I
aYY
The process is stationary.
VMA(1) PROCESS
• Always stationary.• The autocovariance function:
• The autocovariance matrix function cuts of after lag 1.
28
.,WN~ where t1ttt 0aaaY
0
o.w.,
k,
k,
k
0
1
1
VMA(1) PROCESS
• Hence, VMA(1) process is invertible if the eigenvalues of ; i, i=1,2,…,m are all inside the unit circle.
29
IDENTIFICATION OF VARMA PROCESSES
• Same as univariate case.• SAMPLE CORRELATION MATRIC FUNCTION:
Given a vector series of n observations, the sample correlation matrix function is
where ‘s are the crosscorrelation for the i-th and j-th component series.
• It is very useful to identify VMA(q).
30
kˆkˆ ij kˆ ij
SAMPLE CORRELATION MATRIC FUNCTION
• Tiao and Box (1981): They have proposed to use +, and . signs to show the significance of the cross correlations.
+ sign: the value is greater than 2 times the estimated standard error
sign: the value is less than 2 times the estimated standard error
. sign: the value is within the 2 times estimated standard error
31
PARTIAL AUTOREGRESSION OR PARTIAL LAG CORRELATION MATRIX FUNCTION
• They are useful to identify VAR order. The partial autoregression matrix function is proposed by Tiao and Box (1981) but it is not a proper correlation coefficient. Then, Heyse and Wei (1985) have proposed the partial lag correlation matrix function which is a proper correlation coefficient. Both of them can be used to identify the VARMA(p,q).
32
EXAMPLE OF VAR MODELING IN R
• “vars” package deals with VAR models.• Let’s consider the Canadian data for an
application of the model.• Canadian time series for labour productivity
(prod), employment (e), unemployment rate (U) and real wages (rw) (source: OECD database)
• Series is quarterly. The sample range is from the 1stQ 1980 until ¨ 4thQ 2000.
33
Canadian example
> library(vars) > data(Canada)> layout(matrix(1:4, nrow = 2, ncol = 2)) > plot.ts(Canada$e, main = "Employment", ylab = "", xlab = "") > plot.ts(Canada$prod, main = "Productivity", ylab = "", xlab = "") > plot.ts(Canada$rw, main = "Real Wage", ylab = "", xlab = "") > plot.ts(Canada$U, main = "Unemployment Rate", ylab = "", xlab = "")
34
35
• An optimal lag-order can be determined according to an information criteria or the final prediction error of a VAR(p) with the function VARselect().
> VARselect(Canada, lag.max = 5, type = "const")$selectionAIC(n) HQ(n) SC(n) FPE(n) 3 2 2 3• According to the more conservative SC(n) and
HQ(n) criteria, the empirical optimal lag-order is 2.
36
• In a next step, the VAR(2) is estimated with the function VAR() and as deterministic regressors a constant is included.
> var.2c <- VAR(Canada, p = 2, type = "const")> names(var.2c)[1] "varresult" "datamat" "y" "type" "p"[6] "K" "obs" "totobs" "restrictions" "call“> summary(var.2c)> plot(var.2c)
37
• The OLS results of the example are shown in separate tables 1 – 4 below. It turns out, that not all lagged endogenous variables enter significantly into the equations of the VAR(2).
38
39
40
The stability of the system of difference equations has to be checked. If the moduli of the eigenvalues of the companion matrix are less than one, the system is stable.> roots(var.2c)[1] 0.9950338 0.9081062 0.9081062 0.7380565 0.7380565 0.1856381 0.1428889 0.1428889
Although, the first eigenvalue is pretty close to unity, for the sake of simplicity, we assume a stable VAR(2)-process with a constant as deterministic regressor.
41
Restricted VARs
• From tables 1-4 it is obvious that not all regressors enter significantly.
• With the function restrict() the user has the option to re-estimate the VAR either by significance (argument method = ’ser’) or by imposing zero restrictions manually (argument method = ’manual’).
• In the former case, each equation is re-estimated separately as long as there are t-values that are in absolute value below the threshold value set by the function’s argument thresh.
• In the latter case, a restriction matrix has to be provided that consists of 0/1 values, thereby selecting the coefficients to be retained in the model. The function’s arguments are therefore:
42
> var2c.ser <- restrict(var.2c, method = "ser", thresh = 2)
> var2c.ser$restrictions
e.l1 prod.l1 rw.l1 U.l1 e.l2 prod.l2 rw.l2 U.l2 const
e 1 1 1 1 1 0 0 0 1
prod 0 1 0 0 1 0 1 1 1
rw 0 1 1 0 1 0 0 1 0
U 1 0 0 1 1 0 1 0 1
43
> B(var2c.ser)
44
45
46
Diagnostic testing
• In package ‘vars’ the functions for diagnostic testing are arch(), normality(), serial() and stability().
> var2c.arch <- arch(var.2c)
47
• The Jarque-Bera normality tests for univariate and multivariate series are implemented and applied to the residuals of a VAR(p) as well as separate tests for multivariate skewness and kurtosis (see Bera & Jarque [1980], [1981] and Jarque & Bera [1987] and Lutkepohl [2006]).
• The univariate versions of the Jarque-Bera test are applied to the residuals of each equation.
• A multivariate version of this test can be computed by using the residuals that are standardized by a Choleski decomposition of the variance-covariance matrix for the centered residuals.
48
> var2c.norm <- normality(var.2c, multivariate.only = TRUE)> var2c.norm$JBJB-Test (multivariate)Chi-squared = 5.094, df = 8, p-value = 0.7475$SkewnessSkewness only (multivariate)Chi-squared = 1.7761, df = 4, p-value = 0.7769$KurtosisKurtosis only (multivariate)Chi-squared = 3.3179, df = 4, p-value = 0.5061
49
• For testing the lack of serial correlation in the residuals of a VAR(p), a Portmanteau test and the LM test proposed by Breusch & Godfrey are implemented in the function serial().
> var2c.pt.asy <- serial(var.2c, lags.pt = 16, type = "PT.asymptotic")> var2c.pt.asy Portmanteau Test (asymptotic)Chi-squared = 205.3538, df = 224, p-value = 0.8092> var2c.pt.adj <- serial(var.2c, lags.pt = 16, type = "PT.adjusted")> var2c.pt.adj Portmanteau Test (adjusted)Chi-squared = 231.5907, df = 224, p-value = 0.3497
50
• The Breusch-Godfrey LM-statistic (see Breusch 1978, Godfrey 1978) is based upon the following auxiliary regressions:
> var2c.BG <- serial(var.2c, lags.pt = 16, type = "BG")> var2c.BG Breusch-Godfrey LM testChi-squared = 92.6282, df = 80, p-value = 0.1581> var2c.ES <- serial(var.2c, lags.pt = 16, type = "ES")> var2c.ES Edgerton-Shukur F testF statistic = 1.1186, df1 = 80, df2 = 199, p-value = 0.2648
51
• The stability of the regression relationships in a VAR(p) can be assessed with the function stability(). An empirical fluctuation process is estimated for each regression by passing the function’s arguments to the efp()-function contained in the package strucchange.
> args(stability)function (x, type = c("Rec-CUSUM", "OLS-CUSUM", "Rec-MOSUM","OLS-MOSUM", "RE", "ME", "Score-CUSUM", "Score-MOSUM", "fluctuation"),h = 0.15, dynamic = FALSE, rescale = TRUE)NULL> var2c.stab <- stability(var.2c, type = "OLS-CUSUM")> names(var2c.stab)[1] "stability" "names" "K"
52
53
ForecastingA predict-method for objects with class attribute varest is available. The n.ahead forecasts are computed recursively for the estimated VAR, beginning with h = 1, 2, . . . , n.ahead:
> var.f10 <- predict(var.2c, n.ahead = 10, ci = 0.95)> names(var.f10)[1] "fcst" "endog" "model" "exo.fcst"> class(var.f10)[1] "varprd"> plot(var.f10)> fanchart(var.f10)
54
55
56
GRANGER CAUSALITY
• In time series analysis, sometimes, we would like to know whether changes in a variable will have an impact on changes other variables.
• To find out this phenomena more accurately, we need to learn more about Granger Causality Test.
57
GRANGER CAUSALITY
• In principle, the concept is as follows:
• If X causes Y, then, changes of X happened first then followed by changes of Y.
58
GRANGER CAUSALITY
• If X causes Y, there are two conditions to be satisfied:
1. X can help in predicting Y. Regression of X on Y has a big R2
2. Y can not help in predicting X.
59
GRANGER CAUSALITY
• In most regressions, it is very hard to discuss causality. For instance, the significance of the coefficient in the regression
only tells the ‘co-occurrence’ of x and y, not that x causes y.
• In other words, usually the regression only tells us there is some ‘relationship’ between x and y, and does not tell the nature of the relationship, such as whether x causes y or y causes x.
60
iii xy
GRANGER CAUSALITY
• One good thing of time series vector autoregression is that we could test ‘causality’ in some sense. This test is first proposed by Granger (1969), and therefore we refer it Granger causality.
• We will restrict our discussion to a system of two variables, x and y. y is said to Granger-cause x if current or lagged values of y helps to predict future values of x. On the other hand, y fails to Granger-cause x if for all s > 0, the mean squared error of a forecast of xt+s based on (xt, xt−1, . . .) is the same as that is based on (yt, yt−1, . . .) and (xt, xt−1, . . .).
61
GRANGER CAUSALITY
• If we restrict ourselves to linear functions, x fails to Granger-cause x if
• Equivalently, we can say that x is exogenous in the time series sense with respect to y, or y is not linearly informative about future x.
62
,y,y,,x,xxEMSE,x,xxEMSE ttttststttst 111
GRANGER CAUSALITY
• A variable X is said to Granger cause another variable Y, if Y can be better predicted from the past of X and Y together than the past of Y alone, other relevant information being used in the prediction (Pierce, 1977).
63
GRANGER CAUSALITY• In the VAR equation, the example we
proposed above implies a lower triangular coefficient matrix:
Or if we use MA representations,
64
t
t
pt
pt
pp
p
t
t
t
t
a
a
y
x
y
x
c
c
y
x
2
1
2221
11
1
1
122
121
111
2
1 00
t
t
t
t
a
a
BB
B
y
x
2
1
2221
11
2
1 0
.,,BBB where ijijijij 01 021
022
011
2210
GRANGER CAUSALITY
• Consider a linear projection of yt on past, present and future x’s,
where E(etx ) = 0 for all t and . Then y fails to Granger-cause x iff dj = 0 for j = 1, 2, . . ..
65
0 1jt
jjtjjtjt exdxbcy
TESTING GRANGER CAUSALITYProcedure1) Check that both series are stationary in mean, variance
and covariance (if necessary transform the data via logs, differences to ensure this)
2) Estimate AR(p) models for each series, where p is large enough to ensure white noise residuals. F tests and other criteria (e.g. Schwartz or Akaike) can be used to establish the maximum lag p that is needed.
3) Re-estimate both model, now including all the lags of the other variable
4) Use F tests to determine whether, after controlling for past Y, past values of X can improve forecasts Y (and vice versa)
66
TEST OUTCOMES
1. X Granger causes Y but Y does not Granger cause X
2. Y Granger causes X but X does not Granger cause Y
3. X Granger causes Y and Y Granger causes X (i.e., there is a feedback system)
4. X does not Granger cause Y and Y does not Granger cause X
67
TESTING GRANGER CAUSALITY
• The simplest test is to estimate the regression which is based on
using OLS and then conduct a F-test of the null hypothesis
H0 : 1 = 2 = . . . = p = 0.
68
p
it
p
jjtjitit uyxcx
0 11
TESTING GRANGER CAUSALITY
2.Run the following regression, and calculate RSS (full model)
3.Run the following limited regression, and calculate RSS (Restricted model).
69
p
it
p
jjtjitit uyxcx
0 11
p
ititit uxcx
01
TESTING GRANGER CAUSALITY
4.Do the following F-test using RSS obtained from stages 2 and 3:
F = [{(n-k) /q }.{(RSSrestricted-RSSfull) / RSSfull}]
n: number of observationsk: number of parameters from full modelq: number of parameters from restricted model
70
TESTING GRANGER CAUSALITY
5. If H0 rejected, then X causes Y.
• This technique can be used in investigating whether or not Y causes X.
71
Example of the Usage of Granger Test
World Oil Price and Growth of US Economy• Does the increase of world oil price influence the
growth of US economy or does the growth of US economy effects the world oil price?
• James Hamilton did this study using the following model:
Zt= a0+ a1 Zt-1+...+amZt-m+b1Xt-1 +…bmXt-m+εt
Zt= ΔPt; changes of world price of oilXt= log (GNPt/ GNPt-1)
72
World Oil Price and Growth of US Economy
• There are two causalities that need to be observed:
(i) H0: Growth of US Economy does not influence world oil price
Full: Zt= a0+ a1 Zt-1+...+amZt-m+b1Xt-1 +…+bmXt-m+εt
Restricted: Zt= a0+ a1 Zt-1+...+amZt-m+ εt
73
World Oil Price and Growth of US Economy
(ii) H0 : World oil price does not influence growth of US Economy
• Full : Xt= a0+ a1 Xt-1+ …+amXt-m+ b1Zt-1+…+bmZt-m+ εt
• Restricted: Xt= a0+ a1 Xt-1+ …+amXt-m+ εt
74
World Oil Price and Growth of US Economy
• F Tests Results:1. Hypothesis that world oil price does not
influence US economy is rejected. It means that the world oil price does influence US economy .
2. Hypothesis that US economy does not affect world oil price is not rejected. It means that the US economy does not have effect on world oil price.
75
World Oil Price and Growth of US Economy
• Summary of James Hamilton’s Results
76
Null Hypothesis (H0) (I)F(4,86) (II)F(8,74)
I. Economic growth ≠→World Oil Price
0.58 0.71
II. World Oil Price≠→Economic
growth
5.55 3.28
World Oil Price and Growth of US Economy
• Remark: The first experiment used the data 1949-1972 (95 observations) and m=4; while the second experiment used data 1950-1972 (91 observations) and m=8.
77
Canadian exampleThe function causality() is now applied for investigating if the real wage and productivity is causal to employment and unemployment.> causality(var.2c, cause = c("rw", "prod"))$GrangerGranger causality H0: prod rw do not Granger-cause e UF-Test = 3.4529, df1 = 8, df2 = 292, p-value = 0.0008086$InstantH0: No instantaneous causality between: prod rw and e Udata: VAR object var.2cChi-squared = 2.5822, df = 4, p-value = 0.63
The null hypothesis of no Granger-causality from the real wage and labour productivity to employment and unemployment must be rejected; whereas the null hypothesis of non-instantenous causality cannot be rejected. This test outcome is economically plausible, given the frictions observed in labour markets.Instantaneous causality appears when we include the current information of variables
78
Chicken vs. Egg• This causality test is also can be used in
explaining which comes first: chicken or egg. More specifically, the test can be used in testing whether the existence of egg causes the existence of chicken or vise versa.
• Thurman and Fisher did this study using yearly data of chicken population and egg productions in the US from 1930 to1983
• The results:1. Egg causes the chicken.2. There is no evidence that chicken causes egg.
79
Chicken vs. Egg
• Remark: Hypothesis that egg has no effect on chicken population is rejected; while the other hypothesis that chicken has no effect on egg is not rejected. Why?
80
GRANGER CAUSALITY
• We have to be aware of that Granger causality does not equal to what we usually mean by causality. For instance, even if x1 does not cause x2, it may still help to predict x2, and thus Granger-causes x2 if changes in x1 precedes that of x2 for some reason.
• A naive example is that we observe that a dragonfly flies much lower before a rain storm, due to the lower air pressure. We know that dragonflies do not cause a rain storm, but it does help to predict a rain storm, thus Granger-causes a rain storm.
81