Financial Econometric Models III


Third Session, MSc 5th Year


Financial Econometric Models
Vincent JEANNIN – ESGF 5IFM

Q1 2012


Summary of the session (est. 3h)
• Reminder of Last Session
• Time Series Analysis Principles
• Auto Regressive Process
• Moving Average Process
• ARMA
• Conclusion



Reminder of Last Session

Be logical!


$Y_{\mathrm{Diff}} = \ln(Y)$

After a log transform, differencing is possible
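A minimal sketch in R of this transform; the price vector Y below is hypothetical, only to illustrate:

Y <- 100 * cumprod(1 + rnorm(100, 0, 0.01))  # hypothetical price series
YDiff <- diff(log(Y))   # log transform, then first difference (log returns)
plot(YDiff, type="l")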


Time can be a factor in a regression


Differencing can add value


Check ACF/PACF for autocorrelation

Time Series Analysis Principles


Reminders of the 3 steps

Identify

Fit

Forecast


Reminders of the 3 components

Trend

Seasonality

Residual


Lag

$B x_t = x_{t-1}$

Difference

$\Delta x_t = x_t - x_{t-1}$

Seasonal difference

$\Delta_{30} x_t = x_t - x_{t-30}$
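In R these operators correspond to simple base functions; a small sketch on a made-up series x:

x <- rnorm(60)                  # toy series
xlag <- c(NA, x[-length(x)])    # lag: B x_t = x_{t-1}
dx   <- diff(x)                 # difference: x_t - x_{t-1}
dx30 <- diff(x, lag = 30)       # seasonal difference at lag 30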


Difference the series to obtain a stationary series

Time series analysis and forecasting are simpler with stationary series

Different models apply depending on whether the series is stationary or heteroscedastic


Properties of stationary series

$(Y_1, Y_2, Y_3, \dots, Y_n)$ and $(Y_2, Y_3, Y_4, \dots, Y_{n+1})$ must have the same distribution

Distribution not time dependent

A rare occurrence in practice

Stationarity accepted if

$E(Y_t) = \mu$, constant over time

$\mathrm{Cov}(Y_t, Y_{t-n})$ depends only on $n$


Acceptable Shortcut

A series is stationary if the mean and the variance are stable

Which one is more likely to be stationary?
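A quick check of the shortcut in R on simulated data (assumption: a random walk versus its first difference): the mean of the random walk drifts, while the differenced series keeps a stable mean and variance.

set.seed(1)
rw <- cumsum(rnorm(200))           # random walk: not stationary
st <- diff(rw)                     # first difference: stationary
half <- 1:100
c(mean(rw[half]), mean(rw[-half])) # means drift apart over time
c(mean(st[half]), mean(st[-half])) # roughly constant mean
c(sd(st[half]), sd(st[-half]))     # roughly constant variance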


About the residuals…

They should be white noise!

Normality test

Get a first idea with

Skewness

Kurtosis

Proper tests: Kolmogorov–Smirnov, Durbin–Watson, Portmanteau, …
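A minimal diagnostic sketch in R on simulated residuals (eps here is a stand-in, not data from the course):

set.seed(42)
eps <- rnorm(200)                            # stand-in for model residuals
mean((eps - mean(eps))^3) / sd(eps)^3        # skewness, roughly 0 for a Gaussian
mean((eps - mean(eps))^4) / sd(eps)^4        # kurtosis, roughly 3 for a Gaussian
ks.test(eps, "pnorm", mean(eps), sd(eps))    # Kolmogorov-Smirnov normality test
Box.test(eps, lag = 10, type = "Ljung-Box")  # portmanteau autocorrelation test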

Auto Regressive Process


There is a correlation between current data and previous data

Main principle

$X_t = c + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \dots + \varphi_n X_{t-n} + \varepsilon_t$

$\varphi_i$: parameters of the model

$\varepsilon_t$: white noise

If the parameters are identified, prediction is straightforward

AR(n)
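To see what an AR process looks like before hunting for one in real data, it can be simulated; a sketch with arima.sim, where phi = 0.6 is an arbitrary choice:

set.seed(7)
Sim <- arima.sim(model = list(ar = 0.6), n = 200)  # AR(1) with phi = 0.6
par(mfrow = c(2, 1))
acf(Sim, 20)    # geometric decay
pacf(Sim, 20)   # cuts off after lag 1: the AR(1) signature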


DATA<-read.csv(file="C:/Users/vin/Desktop/Series1.csv",header=T)

plot(DATA$Val, type="l")

Let’s load some data


Is this a white noise?

hist(DATA$Val, breaks=20)


Probably not…

Portmanteau test

Test the autocorrelation of a series

If there is autocorrelation, data aren’t independently distributed

Let’s use Ljung–Box statistics

$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n-k}$

$\hat{\rho}_k$: autocorrelation at lag $k$

$H_0$: data are independently distributed
$H_1$: data aren’t independently distributed

Reject $H_0$ if $Q > \chi^2_{1-\alpha,h}$, the $(1-\alpha)$ quantile of a chi-square distribution with $h$ degrees of freedom
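In R, Box.test() defaults to the Box–Pierce variant (used on the next slide); the Ljung–Box version shown above must be requested explicitly. A sketch, where lag = 10 is an arbitrary choice of h:

Box.test(DATA$Val, lag = 10, type = "Ljung-Box")  # Ljung-Box Q up to lag h = 10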


> Box.test(DATA$Val)

Box-Pierce test

data: DATA$Val

X-squared = 188.3263, df = 1, p-value < 2.2e-16

H0 is rejected, the data aren’t independently distributed


Let’s try a regression and analyse residuals

TReg<-lm(DATA$Val~DATA$t)

plot(DATA$Val, type="l")

abline(TReg, col="blue")


eps<-resid(TReg)

ks.test(eps, "pnorm")

layout(matrix(1:4,2,2))

plot(TReg)


Box-Pierce test

data: eps

X-squared = 187.6299, df = 1, p-value < 2.2e-16

Residuals aren’t white noise

Regression rejected

Not a surprise, did the series look stationary?

What next then?


lag.plot(DATA$Val, 9, do.lines=FALSE)

Differencing looks promising


Does differencing produce a stationary series?

plot(diff(DATA$Val), type="l")


ACF & PACF

par(mfrow=c(2,1))

acf(diff(DATA$Val),20)

pacf(diff(DATA$Val),20)

ACF decreasing

PACF cancelling after order 1


Decreasing ACF

PACF cancels after order 1

Typically an Autoregressive Process

AR(1) ?


Modl<-ar(diff(DATA$Val),order.max=20)

plot(Modl$aic)

Let’s try to fit an AR(1) model

The likelihood for order 1 is significant
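ar() chooses the order by minimising the AIC, which plot(Modl$aic) visualises; the selected order can also be read off directly. A short sketch:

Modl$order                  # order selected by AIC
names(which.min(Modl$aic))  # same information, read from the AIC profile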


> ar(diff(DATA$Val),order.max=20)

Call:

ar(x = diff(DATA$Val), order.max = 20)

Coefficients:

1 2 3

0.5925 -0.1669 0.1385

Order selected 3 sigma^2 estimated as 0.8514

> ARDif<-diff(DATA$Val)

> ARDif[1]

[1] 0.3757723

We have our coefficient and standard deviation

We know the first term of our series

$y_t = 0.3757723 + 0.5925\, y_{t-1} + \varepsilon_t$

Here is our model


Need to test the residuals

Box.test(Modl$resid)

Box-Pierce test

data: Modl$resid

X-squared = 7e-04, df = 1, p-value = 0.9789

H0 accepted, residuals are independently distributed (white noise)

The differenced series is an AR(1)


> predict(arima(diff(DATA$Val), order = c(1,0,0)), n.ahead = 7)

$pred

Time Series:

Start = 193

End = 199

Frequency = 1

[1] -0.81359048 -0.43300609 -0.22850452 -0.11861853 -0.05957287 -0.02784553 -0.01079729

$se

Time Series:

Start = 193

End = 199

Frequency = 1

[1] 0.923352 1.048210 1.081582 1.091027 1.093739 1.094521 1.094747

[Plot: the series and its 7-step-ahead AR(1) forecast]


Another typical example?

You make the comments!


DATA<-read.csv(file="C:/Users/vin/Desktop/Series2.csv",header=T)

plot(DATA$Ser2, type="l")

hist(DATA$Ser2, breaks=20)


> Box.test(DATA$Ser2)

Box-Pierce test

data: DATA$Ser2

X-squared = 149.9227, df = 1, p-value < 2.2e-16

TReg<-lm(DATA$Ser2~DATA$t)

plot(DATA$Ser2, type="l")

abline(TReg, col="blue")


> eps<-resid(TReg)

> Box.test(eps)

Box-Pierce test

data: eps

X-squared = 148.5669, df = 1, p-value < 2.2e-16

> layout(matrix(1:4,2,2))

> plot(TReg)


> lag.plot(DATA$Ser2, 9, do.lines=FALSE)

Much less obvious, but there are clues of autoregression


par(mfrow=c(2,1))

plot(diff(DATA$Ser2), type="l")

plot(diff(DATA$Ser2, lag=2), type="l")


par(mfrow=c(2,1))
acf(diff(DATA$Ser2),20)
pacf(diff(DATA$Ser2),20)

ACF decreases 2 by 2

PACF cancels after order 2


First-order differencing, strong AR(2) clues

par(mfrow=c(1,1))

Modl<-ar(diff(DATA$Ser2),order.max=20)

plot(Modl$aic)


Parameter estimation

> ar(diff(DATA$Ser2),order.max=20)

Call:

ar(x = diff(DATA$Ser2), order.max = 20)

Coefficients:

1 2 3

0.5919 -0.8326 0.1086

Order selected 3 sigma^2 estimated as 0.877

> ARDif<-diff(DATA$Ser2)

> ARDif[1]

[1] 0.3757723


> predict(arima(diff(DATA$Ser2), order = c(2,0,0)), n.ahead = 7)

$pred

Time Series:

Start = 193

End = 199

Frequency = 1

[1] 0.4505213 2.0075741 0.6639701 -1.2321156 -1.1409989 0.3866745 1.0879588

$se

Time Series:

Start = 193

End = 199

Frequency = 1

[1] 0.9220713 1.0332515 1.1413067 1.2938326 1.2957576 1.3932158 1.4080266

[Plot: the series and its 7-step-ahead AR(2) forecast]

The more factors, the harder the prediction

> Box.test(Modl$resid)

Box-Pierce test

data: Modl$resid

X-squared = 0.0023, df = 1, p-value = 0.9619

Model accepted

The more factors there are, the more stationary the series needs to be for a good prediction

Moving Average Process


Stationary series with autocorrelation of the errors

Main principle

$X_t = \mu + Z_t + \varphi_1 Z_{t-1} + \varphi_2 Z_{t-2} + \dots + \varphi_n Z_{t-n}$

$\varphi_i$: parameters of the model

$Z_t$: white noise

More difficult to estimate than an AR(n)

MA(n)
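The next slides plot a series Data without showing where it comes from; a sketch that reproduces the setting with simulated data (theta = -0.46 is an assumption chosen to mimic the fit shown later):

set.seed(3)
Data <- arima.sim(model = list(ma = -0.46), n = 100)           # simulated MA(1) sample
Rslt <- arima(Data, order = c(0, 0, 1), include.mean = FALSE)  # fit MA(1), zero mean
Box.test(Rslt$residuals)                                       # residuals should be white noise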


plot(Data, type="l")

hist(Data, breaks=20)


acf(Data,20)

pacf(Data,20)

ACF & PACF suggest MA(1)

ACF cancels after order 1

PACF decays to 0


> Rslt<-arima(Data, order = c(0, 0, 1),include.mean = FALSE)
> Rslt

Call:

arima(x = Data, order = c(0, 0, 1), include.mean = FALSE)

Coefficients:

ma1

-0.4621

s.e. 0.0903

sigma^2 estimated as 0.937: log likelihood = -138.76, aic = 281.52

> Box.test(Rslt$residuals)

Box-Pierce test

data: Rslt$residuals

X-squared = 0, df = 1, p-value = 0.9967

It works: MA(1), zero mean, parameter -0.4621


Fore<-predict(Rslt, n.ahead=5)   # 5-step-ahead forecast

U = Fore$pred + 2*Fore$se        # upper ~95% band

L = Fore$pred - 2*Fore$se        # lower ~95% band

minx=min(Data,L)

maxx=max(Data,U)

ts.plot(Data, Fore$pred, col=1:2, ylim=c(minx,maxx))

lines(U, col="blue", lty="dashed")

lines(L, col="blue", lty="dashed")


Another typical example?

You make the comments!


plot(Data, type="l")

hist(Data, breaks=20)


> Rslt<-arima(Data, order = c(0, 0, 2),include.mean = FALSE)
> Rslt

Call:

arima(x = Data, order = c(0, 0, 2), include.mean = FALSE)

Coefficients:

ma1 ma2

-0.5365 0.6489

s.e. 0.0701 0.1044

sigma^2 estimated as 1.005: log likelihood = -142.74, aic = 291.48

> Box.test(Rslt$residuals)

Box-Pierce test

data: Rslt$residuals

X-squared = 0.0283, df = 1, p-value = 0.8664

MA(2)


Fore<-predict(Rslt, n.ahead=5)

U = Fore$pred + 2*Fore$se

L = Fore$pred - 2*Fore$se

minx=min(Data,L)

maxx=max(Data,U)

ts.plot(Data, Fore$pred, col=1:2, ylim=c(minx,maxx))

lines(U, col="blue", lty="dashed")

lines(L, col="blue", lty="dashed")

ARMA


The series is a function of past values plus current and past values of the noise

Main principle

ARMA(p,q)

Combines AR(p) & MA(q)
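In the notation of the previous slides, the ARMA(p,q) model writes

$X_t = c + \varphi_1 X_{t-1} + \dots + \varphi_p X_{t-p} + Z_t + \theta_1 Z_{t-1} + \dots + \theta_q Z_{t-q}$

A simulation sketch in R (the parameters are arbitrary, just to show that arima() recovers them):

set.seed(11)
Data <- arima.sim(model = list(ar = 0.7, ma = 0.75), n = 200)  # ARMA(1,1)
Rslt <- arima(Data, order = c(1, 0, 1), include.mean = FALSE)  # estimate phi and theta
Rslt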


plot(Data, type="l")

hist(Data, breaks=20)


Both ACF and PACF decrease exponentially after order 1


> Rslt<-arima(Data, order = c(1, 0, 1),include.mean = FALSE)

> Rslt

Call:

arima(x = Data, order = c(1, 0, 1), include.mean = FALSE)

Coefficients:

ar1 ma1

0.7214 0.7563

s.e. 0.0716 0.0721

sigma^2 estimated as 0.961: log likelihood = -141.13, aic = 288.27

> Box.test(Rslt$residuals)

Box-Pierce test

data: Rslt$residuals

X-squared = 0.0098, df = 1, p-value = 0.9213

ARMA(1,1) fits


> par(mfrow=c(1,1))

> Fore<-predict(Rslt, n.ahead=5)

> U = Fore$pred + 2*Fore$se

> L = Fore$pred - 2*Fore$se

> minx=min(Data,L)

> maxx=max(Data,U)


Identification can get tricky at this stage


What do you think?


> Rslt<-arima(Data, order = c(4, 0, 3),include.mean = FALSE)

> Rslt

Call:

arima(x = Data, order = c(4, 0, 3), include.mean = FALSE)

Coefficients:

ar1 ar2 ar3 ar4 ma1 ma2 ma3

0.2722 -0.5276 0.0202 -0.2663 0.8765 -0.4672 -0.5248

s.e. 0.2018 0.2308 0.1968 0.1546 0.1992 0.1690 0.1882

sigma^2 estimated as 1.140: log likelihood = -151.19, aic = 318.38

> Box.test(Rslt$residuals)

Box-Pierce test

data: Rslt$residuals

X-squared = 0.2953, df = 1, p-value = 0.5869

Data<-arima.sim(model=list(ar=c(0.5,-0.5,0.3,-0.3),ma=c(0.75,-0.5,-0.5)),n=100)

It was supposed to fit pretty well…


Identification can be difficult

The easiest model to identify is AR

Imagine when the series is not stationary…

Step-by-step approach: exploration, trials, …

Sometimes you find a satisfying model

Sometimes you don’t!


Conclusion

AR

MA

ARMA

Time series

Recommended