joshy29
Introduction to ARIMA with EViews tutorial
ARIMA
Autoregressive Integrated Moving Average
Introduction - ARMA
ARMA - Autoregressive Moving Average. Popularized by Box and Jenkins (1976) and also known as the Box-Jenkins model.
Used to develop a model that forecasts a variable based on its historical values. For example, the exchange rate at time t can be forecast from its values at times t-2 and t-5 plus stochastic error terms.
Two Common Processes
Autoregressive process.
Moving average process.
It is likely that a series X has characteristics of both AR and MA and is therefore ARMA. In general, an ARMA(p,q) process has p autoregressive terms and q moving average terms.
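Written out, the ARMA(p,q) process described above takes the following general form. The symbols are assumptions introduced here for illustration, not notation from the slides: c is a constant, the phi coefficients are the autoregressive parameters, the theta coefficients are the moving average parameters, and epsilon is the stochastic error term.

```latex
X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```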
Introduction - ARIMA
Most economic time series (e.g., GDP) are nonstationary; that is, they are integrated.
In general, if a time series has to be differenced d times, it is integrated of order d or I(d).
On the other hand, if d=0, the resulting I(0) process corresponds to a stationary time series.
Therefore, if we have to difference a time series d times to make it stationary and then apply the ARMA(p,q) model to it, we say that the original time series is ARIMA (p,d,q).
P, D, Q Model
P: the autoregressive parameter
D: the integrated parameter, i.e. the number of differencing passes
Q: the moving average parameter
For example: a (0,1,2) model has no autoregressive parameters, one differencing pass, and two moving average parameters.
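As a worked instance, a (0,1,2) model can be written as below. This is a sketch in assumed notation (X for the series, c for the constant, theta for the moving average parameters, epsilon for the error term); the slides do not give the equation themselves.

```latex
X_t - X_{t-1} = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}
```

The left-hand side is the series after one differencing pass (d = 1), and the right-hand side contains only moving average terms (p = 0, q = 2).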
The Constants in ARIMA
If there are no autoregressive parameters in the model, then the expected value of the constant is μ, the mean of the series.
If there are autoregressive parameters in the series, then the constant represents the intercept.
If the series is differenced, then the constant represents the mean or intercept of the differenced series.
Assumptions
Stationarity. No uncontrolled correlation. Arbitrary model lag order. No outliers. Randomly distributed shocks. Uncorrelated random errors.
Stationarity
The input series should have a constant mean, variance, and autocorrelation through time.
This assumption is tested through the Augmented Dickey-Fuller Test and non-stationarity is fixed through differencing.
No Uncontrolled Correlation
Autocorrelation means that the value of a given datum is largely determined by the value of the preceding datum in the series.
This assumption is tested with the Durbin-Watson coefficient, which ranges from 0 to 4. A value near 2 indicates no autocorrelation, a value near 0 indicates positive autocorrelation, and a value near 4 indicates negative autocorrelation.
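The Durbin-Watson coefficient is the sum of squared successive differences of the residuals divided by the sum of squared residuals. A minimal sketch in plain Python follows; the function name and the two residual series are hypothetical and chosen only to show the two ends of the 0 to 4 range.

```python
def durbin_watson(residuals):
    """Return the Durbin-Watson statistic for a sequence of residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared changes between consecutive residuals.
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    # Denominator: sum of squared residuals.
    den = sum(e ** 2 for e in residuals)
    if den == 0:
        raise ValueError("residuals are all zero")
    return num / den


# Hypothetical residuals that drift slowly (positive autocorrelation).
drifting = [-0.4, -0.3, -0.1, 0.1, 0.3, 0.4, 0.3, 0.1]
# Hypothetical residuals that flip sign every period (negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(durbin_watson(drifting))     # well below 2
print(durbin_watson(alternating))  # 3.5
```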
Arbitrary Model Lag Order
The researcher must have a theoretical basis to establish the face validity of the order of the model.
No Outliers
As in other forms of regression, outliers may affect conclusions strongly and misleadingly.
Randomly Distributed Shocks
If shocks are present in the time series, they are assumed to be randomly distributed with a mean of 0 and a constant variance.
Uncorrelated Random Errors
Residuals are randomly and normally distributed, have non-significant autocorrelations and partial autocorrelations, and have a mean of 0 and homogeneity of variance over time.
The Durbin-Watson test is the standard test for correlated error.
Procedure
Test for the assumptions.
The Box-Jenkins Methodology: Identification. Estimation. Diagnostic Checking. Forecasting.
Test for Stationarity
Visual plot. Correlogram. Unit root test.
Augmented Dickey-Fuller Test
Ho: Series is non-stationary
Ha: Series is stationary
If the ADF statistic is more negative than the critical value (that is, the absolute value of ADF > the absolute value of the critical value), reject Ho.
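The decision rule can be sketched in plain Python. This is a simplified Dickey-Fuller regression with a constant and no lagged differences, so it is not the full augmented test that EViews runs. The function name, the simulated series, and the approximate large-sample 5% critical value of -2.86 are assumptions for illustration.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic on gamma in  dy_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                        # lagged levels y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # first differences
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    gamma = sxy / sxx                                 # OLS slope
    alpha = dy_bar - gamma * x_bar                    # OLS intercept
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)         # standard error of slope
    return gamma / se_gamma


# Simulated data (assumption): the same shocks drive both series.
rng = random.Random(0)
shocks = [rng.gauss(0, 1) for _ in range(200)]
white_noise = shocks                # stationary by construction
random_walk = []                    # cumulative sum of shocks: has a unit root
level = 0.0
for e in shocks:
    level += e
    random_walk.append(level)

CRITICAL_5PCT = -2.86  # approximate value for the constant-only case (assumption)

for name, series in [("white noise", white_noise), ("random walk", random_walk)]:
    stat = dickey_fuller_stat(series)
    verdict = "reject Ho" if stat < CRITICAL_5PCT else "fail to reject Ho"
    print(f"{name}: statistic = {stat:.2f} -> {verdict}")
```

In practice the statistic and critical values reported by the software should be used, since the critical value depends on the sample size and on whether a constant or trend is included.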
Differencing
Differencing is a procedure which attempts to de-trend the data in order to control autocorrelation and achieve stationarity.
It does this by subtracting from each datum in a series the value of its predecessor.
The number of times a series needs to be differenced to achieve stationarity is reflected in the d parameter.
In order to determine the necessary level of differencing, one should examine the plot of the data and the autocorrelogram.
Caution: Some time series may require little or no differencing. An over-differenced series produces less stable coefficient estimates.
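The differencing pass described above can be sketched in a few lines of Python. The function name and the two toy series are hypothetical; they only show that one pass removes a linear trend while a quadratic trend needs two passes (d = 2).

```python
def difference(series, d=1):
    """Apply d differencing passes: each pass replaces y_t with y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


linear = [2, 5, 8, 11, 14]      # constant step of 3
quadratic = [1, 4, 9, 16, 25]   # squares of 1..5

print(difference(linear))          # [3, 3, 3, 3]
print(difference(quadratic))       # [3, 5, 7, 9]  (still trending)
print(difference(quadratic, d=2))  # [2, 2, 2]
```

Each pass shortens the series by one observation, which is one reason to avoid differencing more often than needed.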
Identification
Major tools: ACF and PACF
One autoregressive (p) parameter: ACF - exponential decay; PACF - spike at lag 1, no correlation for other lags.
Two autoregressive (p) parameters: ACF - a sine-wave shape pattern or a set of exponential decays; PACF - spikes at lags 1 and 2, no correlation for other lags.
One moving average (q) parameter: ACF - spike at lag 1, no correlation for other lags; PACF - damps out exponentially.
Two moving average (q) parameters: ACF - spikes at lags 1 and 2, no correlation for other lags; PACF - a sine-wave shape pattern or a set of exponential decays.
One autoregressive (p) and one moving average (q) parameter: ACF - exponential decay starting at lag 1; PACF - exponential decay starting at lag 1.
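To see these patterns numerically, the sample ACF and PACF can be computed directly. The sketch below uses plain Python and obtains the PACF from the ACF with the Durbin-Levinson recursion. The function names, the simulated AR(1) series, and its coefficient of 0.7 are assumptions for illustration and are not taken from the GDP data.

```python
import random


def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    rho = acf(series, max_lag)
    result = []
    phi_prev = []  # coefficients of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
        phi_prev = phi
    return result


# Simulated AR(1) series with coefficient 0.7 (assumption for illustration).
rng = random.Random(42)
x = [0.0]
for _ in range(499):
    x.append(0.7 * x[-1] + rng.gauss(0, 1))

sample_acf = acf(x, 5)
sample_pacf = pacf(x, 5)
print("ACF :", [round(v, 2) for v in sample_acf[1:]])
print("PACF:", [round(v, 2) for v in sample_pacf])
```

For a series like this one, the ACF should decay gradually while the PACF should show a single large value at lag 1, matching the one-autoregressive-parameter pattern listed above.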
Estimation
Approximate maximum likelihood method: the fastest method; should be used for very long time series (e.g., with more than 30,000 observations).
Approximate maximum likelihood method with backcasting: use this method first to establish initial parameter estimates that are very close to the actual final values.
Exact maximum likelihood method: may be inefficient when used to estimate parameters for seasonal models with long seasonal lags (e.g., with yearly lags of 365 days).
Diagnostic Checking
Test for the significance of the parameter estimates.
Use partial data to generate forecasts. Analysis of residuals.
Limitations
The ARIMA method is appropriate only for a time series that is stationary or can be made stationary by differencing.
At least 50 observations are recommended for the input data.
It is also assumed that the values of the estimated parameters are constant throughout the series.
Illustration
Background: US GDP data from 1970 – 1996
Frequency: quarterly
Extract Data From Excel
To Extract Data: File > Open > Foreign Data as Workfile > Choose File > Open
Step 1: Check for Stationarity
Visual Plot
To Plot Data: Series Box > View > Graph > Line
Step 1: Check for Stationarity
ADF / Unit Root Test
Series Box > View > Unit Root Test > Level
Step 2: Difference the Data
ADF Test
Series Box > View > Unit Root Test > 1st Difference
Step 3: Estimate the P and Q
Correlogram – ACF and PACF
Series Box > View > Correlogram > 1st Difference
Possible AR and MA models
And so on…
Step 4: Estimate several models
Quick > Estimate Equation > Type equation: u_s_gdp c ar(1) > OK
Quick > Estimate Equation > Type equation: u_s_gdp c ar(1) ma(1) > OK
Step 5: Determine the model
Model         AIC value   SC value
AR(1)         10.04537    10.10206
AR(1)MA(1)    9.985905    10.07094
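Smaller AIC and SC values indicate the preferred model. The comparison can be sketched in Python using the two pairs of values reported above; the dictionary layout and variable names are hypothetical.

```python
# Information criteria as reported in the table above.
criteria = {
    "AR(1)": {"AIC": 10.04537, "SC": 10.10206},
    "AR(1)MA(1)": {"AIC": 9.985905, "SC": 10.07094},
}

# Pick the model with the smallest value of each criterion.
best_by_aic = min(criteria, key=lambda model: criteria[model]["AIC"])
best_by_sc = min(criteria, key=lambda model: criteria[model]["SC"])

print("Lowest AIC:", best_by_aic)  # AR(1)MA(1)
print("Lowest SC: ", best_by_sc)   # AR(1)MA(1)
```

Here both criteria point to the same model; when they disagree, SC penalizes extra parameters more heavily and tends to favor the smaller model.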
Step 6: Checking
Use the t-test to check for the significance of the parameters.
Use the Durbin-Watson test to check for the autocorrelation of the error terms.