Time Series Analysis

Univariate Time Series


• Time Series Analysis

• Definition

A time series is a sequence of observations taken sequentially in time.

An intrinsic feature of a time series is that adjacent observations are typically dependent.

The nature of this dependence among observations of a time series is of considerable practical interest

Time Series Analysis is concerned with techniques for the analysis of this dependence

• Time Series Forecasting

Examine the past behavior of a time series in order to infer something about its future behavior.

A sophisticated and widely used technique for forecasting future demand.

Examples:

Univariate time series: AR, MA, ARMA, ARIMA, ARIMA-GARCH

Multivariate: VAR, Cointegration

• Univariate Time-series Models

The term refers to a time series that consists of single (scalar) observations recorded sequentially over equal time increments.

Univariate time-series analysis uses historical data on the variable of interest to construct a model that describes the behavior of that variable (the time series).

This model can subsequently be used for forecasting purposes.

It is an appropriate technique for forecasting high-frequency time series where data on independent variables are either non-existent or difficult to identify.

• Famous forecasting quotes

"I have seen the future and it is very much like the present, only longer." - Kehlog Albran, The Profit

This nugget of pseudo-philosophy is actually a concise description of statistical forecasting. We search for statistical properties of a time series that are constant in time - levels, trends, seasonal patterns, correlations and autocorrelations, etc. We then predict that those properties will describe the future as well as the present.

"Prediction is very difficult, especially if it's about the future." - Niels Bohr, Nobel laureate in Physics

This quote serves as a warning of the importance of validating a forecasting model out-of-sample. It is often easy to find a model that fits the past data well (perhaps too well!), but quite another matter to find a model that correctly identifies those patterns in the past data that will continue to hold in the future.

• Time series data

Secular trend: long-run pattern

Cyclical Fluctuation: expansion and contraction of overall economy (business cycle)

Seasonality: annual sales patterns tied to weather, traditions, customs

Irregular or random component

• Ex-Post vs. Ex-Ante Forecasts

How can we compare the forecast performance of our model?

There are two ways.

Ex Ante: Forecast into the future, wait for the future to arrive, and then compare the actual to the predicted

Ex Post: Fit your model over a shortened sample

Then forecast over a range of observed data, and compare actual and predicted values.

• Ex-Post and Ex-Ante Estimation & Forecast Periods

Suppose you have data covering the period 1980.Q1-2001.Q4. The sample divides into an ex-post estimation period, an ex-post forecast period, and an ex-ante forecast period (the future).

• Examining the In-Sample Fit

One thing that can be done, once you have fit your model, is to examine the in-sample fit.

That is, over the period of estimation, you can compare the actual to the fitted data

It can help to identify areas where your model is consistently under- or over-predicting, so that appropriate measures can be taken.

Simply estimate the equation and look at the residuals.

• Model Performance

RMSE = sqrt((1/n) Σ (f_i - x_i)^2): the root of the mean squared difference between forecast and actual; smaller is better.

MAE & MAPE: smaller is better.

The Theil inequality coefficient always lies between zero and one, where zero indicates a perfect fit.

Bias proportion (should be zero): how far is the mean of the forecast from the mean of the actual series?

• Model Performance

Variance proportion (should be zero): how far is the variation of the forecast from the variation of the actual series?

Covariance proportion (should be one): what portion of the forecast error is unsystematic (not predictable)?

If your forecast is "good", the bias and variance proportions should be small, so that most of the forecast error is concentrated in the covariance proportion.
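These accuracy measures are easy to compute directly. A minimal numpy sketch (not part of the slides; the function name and the U1 variant of Theil's coefficient are illustrative choices):

```python
import numpy as np

def forecast_metrics(actual, forecast):
    """Common forecast-accuracy measures; smaller is better for all of them.
    Note: MAPE is undefined if any actual value is zero."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    err = forecast - actual
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / actual)) * 100.0
    # Theil's U1 inequality coefficient: bounded between 0 (perfect fit) and 1
    theil_u = rmse / (np.sqrt(np.mean(forecast ** 2)) + np.sqrt(np.mean(actual ** 2)))
    return rmse, mae, mape, theil_u
```

A perfect forecast drives all four measures to zero, which is a quick sanity check on any implementation.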

• Autocorrelation function (ACF)

The autocorrelation function (ACF) of a random process describes the correlation between the process at different points in time. Let Xt be the value of the process at time t (where t may be an integer for a discrete-time process or a real number for a continuous-time process). If Xt has mean μ and variance σ^2, then the ACF is

ρ(k) = E[(Xt - μ)(Xt+k - μ)] / σ^2

• ACF & PACF

The partial autocorrelation at lag k is the regression coefficient on Yt-k when Yt is regressed on a constant, Yt-1, ..., Yt-k

This is a partial correlation since it measures the correlation of values that are k periods apart after removing the correlation from the intervening lags

Correlogram: Plot of ACF & PACF against lags
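As an illustration of what a correlogram plots, here is a minimal sample-ACF computation (a numpy sketch, not from the slides; statistical packages provide equivalent routines):

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_0..r_nlags of a series, using the
    conventional biased denominator (sum of squared deviations)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    return np.array([1.0] + [np.sum(xc[k:] * xc[:-k]) / denom
                             for k in range(1, nlags + 1)])
```

Plotting these values (and the analogous partial autocorrelations) against the lag gives the correlogram described above.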

• Stationary Time Series

A stochastic process is said to be stationary if its mean and variance are constant over time and the covariance between two time periods depends only on the distance, gap, or lag between the two periods and not on the actual time at which the covariance is computed.

In the time series literature, such a stochastic process is known as weakly stationary or covariance stationary.

In most practical situations, this type of stationarity suffices.

A time series is strictly stationary if all the moments of its probability distribution, and not just the first two (mean & variance), are invariant over time.

• Stationary Time Series

However, if the stationary process is normal, the weakly stationary process is also strictly stationary, as a normal stochastic process is fully specified by its first two moments, the mean and the variance.

Let Yt be a stochastic time series with properties:

Mean: E(Yt) = μ

Variance: var(Yt) = E(Yt - μ)^2 = σ^2

Covariance: γk = E[(Yt - μ)(Yt+k - μ)], the autocovariance between Yt and Yt+k, i.e. between two Y values k periods apart

If k = 0, we obtain γ0, which is simply the variance of Y

If k = 1, γ1 is the covariance between two adjacent values of Y

• Stationary Time Series

Now, if we shift the origin from Yt to Yt+m, the mean, variance and autocovariances of Yt+m must be the same as those of Yt.

Thus, if a time series is stationary, its mean, variance and autocovariances remain the same no matter at what point we measure them, i.e. they are time invariant.

Such a time series tends to return to its mean, a property called mean reversion.

• Non-stationary Series

A non-stationary time series will have a time-varying mean or variance or both.

For a non-stationary time series, we can study its behavior only for the time period under consideration.

Each set of time series data will therefore be for a particular episode

So it is not possible to generalize it to other time periods

Therefore, for the purpose of forecasting, non-stationary time series may be of little practical value

• Forecasting

Most statistical forecasting methods are based on the assumption that the time series can be rendered approximately stationary (i.e., "stationarized") through the use of mathematical transformations.

A stationarized series is relatively easy to predict: you simply predict that its statistical properties will be the same in the future as they have been in the past!

• Forecasting

The predictions for the stationarized series can then be "untransformed," by reversing whatever mathematical transformations were previously used, to obtain predictions for the original series.

The details are normally taken care of by software

Thus, finding the sequence of transformations needed to stationarize a time series often provides important clues in the search for an appropriate forecasting model.

• Random or White Noise Process

We call a stochastic process purely random, or white noise, if it has zero mean and constant variance and is serially uncorrelated.

The error term in the CLRM is assumed to be a white noise process: ut ~ iid(0, σ^2)

The random walk model is non-stationary in nature and is observed in asset prices, such as stock prices or exchange rates (discussed later).

• Trend: ACF & PACF

The ACF shows a definite pattern: it decreases with the lags. This means there is a trend in the data. Since the pattern does not repeat, we can conclude that the data do not show any seasonality.

• Seasonality

• Trend & Seasonality: ACF & PACF

The ACF plots clearly show a repeating pattern, indicating that the data are seasonal: there is periodicity after every 12 observations, i.e. the data show both seasonality and trend. The PACF plots also show seasonality and trend.

• Estimation and Removal of Trend & Seasonality

Classical Decomposition of a Time Series

Xt = mt + st + Yt

mt: trend component (deterministic, changes slowly with t)

st: seasonal component (deterministic, period d)

Yt: noise component (random, stationary)

Aim: Extract components mt and st, and hope that Yt will be stationary. Then focus on modeling Yt.

We may need to do preliminary transformations if the noise or amplitude of the seasonal fluctuations appear to change over time.

• Time series data, Xt = mt + st + Yt

Workflow: apply ACF, PACF and ADF tests. If the series is stationary (Xt = Yt), model Yt directly with AR, MA or ARMA, check that the residual series is white noise, estimate the AR/MA/ARMA parameters, and forecast Xt (in-sample/out of sample). If the series is non-stationary, de-trend and/or de-seasonalize to obtain a stationary series Yt, then model and forecast in the same way.

• Backward Shift Operator

The operator B plays an important role in the mathematics of TSA.

BXt=Xt-1 and in general BsXt = Xt-s

A polynomial in the lag operator takes the form θ(B) = 1 + θ1B + θ2B^2 + ... + θqB^q, where θ1, ..., θq are parameters

The roots of such a polynomial are defined as the q values of B which satisfy the polynomial equation θ(B) = 0

• Backward Shift Operator

If q = 1, θ(B) = 1 + θB = 0 gives B = -(1/θ)

A root is said to lie outside the unit circle if the modulus is greater than one

The first difference operator is defined as Δ = 1 - B, so ΔXt = Xt - Xt-1

Δ^2 = (1 - B)^2; more generally, Δ^d = (1 - B)^d is a d-th order polynomial in B

• Elimination of Trend

Nonseasonal model with trend: Xt = mt + Yt, E(Yt)=0

Methods:

Moving Average Smoothing

Exponential Smoothing

Spectral Smoothing

Polynomial Fitting

Differencing k times to eliminate trend

• Differencing k times to eliminate trend

Define the backward shift operator B as follows: BXt = Xt-1

We can remove trend by differencing, e.g. (1 - B)Xt = Xt - Xt-1, and (1 - B)^2 Xt = (1 - 2B + B^2)Xt = Xt - 2Xt-1 + Xt-2

It can be shown that a polynomial trend of degree k will be reduced to a constant by differencing k times, that is, by applying the operator (1 - B)^k to Xt

Given a sequence {xt}, we could therefore proceed by differencing repeatedly until the resulting series can plausibly be modeled as a realization of a stationary process.
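The claim that differencing k times reduces a degree-k polynomial trend to a constant can be checked numerically. A small numpy sketch on a noiseless quadratic (not from the slides):

```python
import numpy as np

t = np.arange(20, dtype=float)
x = 3.0 + 2.0 * t + 0.5 * t ** 2   # quadratic trend, degree k = 2, no noise

d1 = np.diff(x)         # (1 - B)x: a linear trend remains
d2 = np.diff(x, n=2)    # (1 - B)^2 x: reduced to a constant (here 2 * 0.5 = 1)
```

With noise added, the twice-differenced series would instead fluctuate around that constant, which is exactly the "stationarized" behavior we want.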

• Elimination of Seasonality

Seasonal model without trend: Xt = st + Yt, E(Yt) = 0

(a) Classical decomposition

Regress level variable (Y) on dummy variables (with or without intercept)

Calculate residuals

Add these residuals to mean value of Y

Resulting series is deseasonalized time series

(b) Differencing at lag d to eliminate period d

Since (1 - B^d)st = st - st-d = 0, differencing at lag d will eliminate a seasonal component of period d.

• Elimination of Trend+Seasonality

Elimination of both trend and seasonal components in a series can be achieved by using trend as well as seasonal differencing.

For example: (1-B)(1-B12)Xt
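A quick numerical check of this combined operator on a synthetic series with a linear trend plus an exact period-12 seasonal component (numpy sketch, not from the slides):

```python
import numpy as np

t = np.arange(120, dtype=float)
x = 0.3 * t + 10 * np.sin(2 * np.pi * t / 12)   # trend + period-12 seasonal

sd = x[12:] - x[:-12]   # (1 - B^12): removes the seasonal; here the trend part
                        # leaves the constant 0.3 * 12 = 3.6 (up to float error)
y = np.diff(sd)         # (1 - B): removes the remaining (constant) trend term
```

On noiseless data the result is constant and then zero; on real data the same two differencing steps leave a series that can plausibly be modeled as stationary noise.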

• Time series data, Xt = mt + st + Yt

Workflow: apply ACF, PACF and ADF tests. If the series is non-stationary, de-trend and/or de-seasonalize to obtain a stationary series; model the stationary series with AR, MA or ARMA, check that the residual series is white noise, estimate the parameters, and forecast Xt after re-transformation (in-sample/out of sample).

• Non-Seasonal & Seasonal AR, MA & ARMA Process

• Autoregressive Process

The AR(1) model specification is

Yt = m + φYt-1 + ut, {ut} ~ WN(0, σ^2)

(1 - φL)Yt = m + ut

Yt = (1 + φL + φ^2L^2 + ...)(m + ut)

Since a constant like m has the same value at all periods, application of lag operator any number of times simply reproduces the constant. So

• Autoregressive Process

Yt = (1 + φ + φ^2 + ...)m + (ut + φut-1 + φ^2ut-2 + ...)

E(Yt) = (1 + φ + φ^2 + ...)m

This expression only exists if the infinite geometric series has a limit

The necessary and sufficient condition is |φ| < 1

• Autoregressive Process

AR(2) process: yt = φ1yt-1 + φ2yt-2 + ut

AR(p) process: yt = φ1yt-1 + φ2yt-2 + ... + φpyt-p + ut

Defining the AR polynomial φ(L) = 1 - φ1L - ... - φpL^p,

we can write the AR(p) model concisely as: φ(L)yt = ut

• Autoregressive Process

It is sometimes difficult to distinguish between AR processes of different orders solely on the basis of correlograms.

A sharper discrimination is possible on the basis of the partial autocorrelation coefficients.

For an AR(p), the PACF vanishes for lags greater than p, while the ACF decays exponentially.
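The exponential ACF decay of an AR process is easy to see in simulation: for an AR(1) with coefficient φ, the theoretical lag-k autocorrelation is φ^k. A numpy sketch (synthetic data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
phi, n = 0.7, 5000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()   # AR(1) with phi = 0.7

yc = y - y.mean()

def acf(k):
    """Sample autocorrelation at lag k; theory says it should be near phi**k."""
    return float(np.dot(yc[k:], yc[:-k]) / np.dot(yc, yc))
```

With 5000 observations the sample values at lags 1 and 2 sit close to 0.7 and 0.49, the geometric decay described above.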

• Moving Average Process

In a pure MA process, a variable is expressed solely in terms of the current and previous white noise disturbances.

MA(1) process: yt = ut + θ1ut-1

MA(q) process: yt = ut + θ1ut-1 + ... + θqut-q, {ut} ~ WN(0, σ^2)

Defining the MA polynomial θ(L) = 1 + θ1L + ... + θqL^q, we can write the MA(q) model concisely as: yt = θ(L)ut

• Moving Average Process

For parameter identifiability reasons, and in analogy with the concept of causality for AR processes, we require that all roots of θ(L) be greater than 1 in magnitude.

The resulting process is said to be invertible

The PACF of an MA(q) decays exponentially

The ACF vanishes for lags beyond q

• The single negative spike at lag 1 in the ACF is an MA(1) signature
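This MA(1) cutoff behavior is easy to verify by simulation: the theoretical lag-1 autocorrelation of yt = ut + θut-1 is θ/(1 + θ^2), and the ACF is zero beyond lag 1. A numpy sketch (synthetic data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
theta = 0.6
u = rng.standard_normal(5001)
y = u[1:] + theta * u[:-1]     # MA(1): y_t = u_t + theta * u_{t-1}

yc = y - y.mean()

def acf(k):
    """Sample autocorrelation at lag k.
    Theory: rho(1) = theta / (1 + theta**2), rho(k) = 0 for k > 1."""
    return float(np.dot(yc[k:], yc[:-k]) / np.dot(yc, yc))
```

With a positive θ the lag-1 spike is positive; the negative spike mentioned above corresponds to a negative MA coefficient.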

• ARMA Process

We can put an AR(p) and an MA(q) process together to form the more general ARMA(p,q) process:

yt - φ1yt-1 - ... - φpyt-p = ut + θ1ut-1 + ... + θqut-q, where {ut} ~ WN(0, σ^2)

By definition, we require that {yt} be stationary. Using the compact AR & MA polynomial notation, we can write the ARMA(p,q) model as: φ(L)yt = θ(L)ut, {ut} ~ WN(0, σ^2)

• ARMA Process

For stationarity and invertibility, we require, as before, that all roots of φ(L) and θ(L) be greater than 1 in magnitude.

AR & MA are special cases: an AR(p) = ARMA(p,0), and an MA(q) = ARMA(0,q). For an ARMA process, the ACF & PACF both decay exponentially.

• Sample ACF/PACF

For an AR(p) the ACF decays geometrically, and the PACF is zero beyond lag p. The sample ACF/PACF should exhibit similar behavior, and significance at the 95% level can be assessed via the usual ±1.96/√n bounds.

For an MA(q) the PACF decays geometrically, and the ACF is zero beyond lag q. The sample ACF/PACF should exhibit similar behavior, and significance at the 95% level can also be assessed via the ±1.96/√n bounds.

For an ARMA(p,q), the ACF & PACF both decay exponentially.

Examining the sample ACF/PACF therefore can serve only as a guide in determining possible maximum values for p & q to be properly investigated via AICC.

• Sample ACF/PACF

The PACF shows a sharper "cutoff" than the ACF. In particular, the PACF has only two significant spikes, while the ACF has four.

Thus, the series displays an AR(2) signature

If we therefore set the order of the AR term to 2, i.e. fit an ARIMA(2,1,0) model, we obtain the following ACF and PACF plots for the residuals.

• Order Selection/Model Identification

In real-life data, there is usually no underlying true model. The question then becomes: how do we select an appropriate statistical model for a given data set? A breakthrough was made in the early 1970s by the Japanese statistician Akaike. Using ideas from information theory, he discovered a way to measure how far a candidate model is from the true model. We should therefore minimize the distance from the truth, and select the ARMA(p,q) model that minimizes Akaike's Information Criterion (AIC):

AIC = -2 ln L + 2(p + q + 1)

• Order Selection/Model Identification

where L denotes the likelihood evaluated at the MLEs of φ, θ, and σ^2, respectively. (Nowadays we actually use a bias-corrected version of AIC called AICC.) The first term in the AIC expression measures how well the model fits the data; the lower it is, the better the fit. The second term penalizes models with more parameters. Final model selection can then be based upon goodness-of-fit tests and model parsimony (simplicity). There are several other information criteria currently in use (SBC, FPE, SIC, MDL, etc.), but AIC and SBC seem to be the most popular.
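As a small illustration of information-criterion-based order selection, the sketch below fits AR(p) models by least squares to simulated AR(2) data and computes a Gaussian AIC for each candidate order (numpy only; ar_aic is an illustrative helper, not a library function, and real software would use ML estimation and AICC):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + rng.standard_normal()  # true AR(2)

def ar_aic(y, p):
    """Least-squares AR(p) fit; Gaussian AIC = n * log(sigma^2) + 2 * (p + 1)."""
    Y = y[p:]
    X = np.column_stack([y[p - j:len(y) - j] for j in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    sigma2 = np.mean(resid ** 2)
    return len(Y) * np.log(sigma2) + 2 * (p + 1)

aics = {p: ar_aic(y, p) for p in range(1, 5)}
best_p = min(aics, key=aics.get)   # order with the smallest AIC
```

The AR(1) fit is heavily penalized through its larger residual variance, so the criterion points at order 2 (or occasionally slightly higher, reflecting AIC's known tendency to overfit).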

• Non-stationary Time Series

- Unit Root
- ARIMA

• Random Walk Model

Although our interest is in stationary time series, we often encounter non-stationary time series.

Classic example: RWM (stock price, exchange rate)

It can be of two types:

Random walk without drift: Xt = Xt-1 + ut

Random walk with drift: Xt = δ + Xt-1 + ut

• Random walk without drift

X1 = X0 + u1; X2 = X1 + u2 = X0 + u1 + u2; in general, Xt = X0 + Σut

E(Xt) = X0 and var(Xt) = tσ^2

The mean value of X is its initial value, which is constant, but as t increases its variance increases indefinitely, thus violating the stationarity condition.

The RWM exhibits persistence of random shocks: the impact of a particular shock does not die away.

The RWM is said to have infinite memory.
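The result var(Xt) = tσ^2 can be checked by Monte Carlo simulation of many independent random walks (numpy sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal((2000, 500))   # 2000 replications of 500 white-noise shocks
rw = np.cumsum(u, axis=1)              # random walks X_t = X_{t-1} + u_t, X_0 = 0

# The shocks keep constant variance; the walk's variance grows roughly like t
var_early = rw[:, 49].var()            # cross-sectional variance at t = 50
var_late = rw[:, 499].var()            # cross-sectional variance at t = 500
```

With σ^2 = 1 the two cross-sectional variances come out near 50 and 500, while the white-noise columns themselves stay near variance 1 at every date.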

• Random walk with drift

Xt = δ + Xt-1 + ut, so ΔXt = δ + ut

Xt drifts upward or downward depending on δ being positive or negative.

RWM is an example of what is known as unit root process

• Unit Root Process

Say Xt = ρXt-1 + ut, -1 ≤ ρ ≤ 1

If ρ = 1, the process has a unit root: it becomes a RWM without drift and is non-stationary.

• Difference Stationary (DS) Process

If the trend of a time series is predictable and not variable, we call it a deterministic trend. If the trend is not predictable, we call it a stochastic trend.

Say Xt = b1 + b2t + b3Xt-1 + ut, ut ~ WN

If b1 = b2 = 0 and b3 = 1, we have a RWM without drift: non-stationary, but its first difference is stationary.

• Trend Stationary Process

If b1, b2 ≠ 0 and b3 = 0, then Xt = b1 + b2t + ut

This is called a TS process.

Though the mean is not constant, the variance is.

Once the values of b1 and b2 are known, the mean can be forecast perfectly.

Thus, if we subtract the mean of Xt from Xt, the resultant series will be stationary

• Dickey-Fuller unit root tests

Simple AR(1) model: xt = ρxt-1 + ut ... (1)

The null hypothesis of a unit root is H0: ρ = 1, with H1: ρ < 1

Subtracting xt-1 from both sides of (1), we get

xt - xt-1 = ρxt-1 - xt-1 + ut

Δxt = (ρ - 1)xt-1 + ut = δxt-1 + ut

Here the null hypothesis of a unit root is H0: δ = 0 and H1: δ < 0
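The regression form Δxt = δxt-1 + ut can be implemented directly. The sketch below computes the Dickey-Fuller t-statistic (simplest case: no constant, no lagged differences; df_tstat is an illustrative helper, not a library function) and contrasts a stationary AR(1) with a random walk:

```python
import numpy as np

def df_tstat(x):
    """t-statistic of delta-hat in the regression dx_t = delta * x_{t-1} + u_t.
    Large negative values are evidence against the unit-root null H0: delta = 0;
    the statistic must be compared with Dickey-Fuller, not normal, critical values."""
    dx = np.diff(x)
    xlag = x[:-1]
    delta = np.dot(xlag, dx) / np.dot(xlag, xlag)
    resid = dx - delta * xlag
    s2 = np.dot(resid, resid) / (len(dx) - 1)
    se = np.sqrt(s2 / np.dot(xlag, xlag))
    return delta / se

rng = np.random.default_rng(4)
u = rng.standard_normal(1000)
walk = np.cumsum(u)                    # unit-root process: true delta = 0
ar = np.zeros(1000)
for t in range(1, 1000):
    ar[t] = 0.5 * ar[t - 1] + u[t]     # stationary AR(1): true delta = -0.5
```

For the stationary series the statistic is strongly negative (rejecting the unit root), while for the random walk it stays near zero.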

• Detection of Unit Root: ADF Tests

The ADF test is conducted with the following model:

Δxt = α + βt + δxt-1 + Σi γiΔxt-i + ut

where Xt is the underlying variable at time t and ut is the error term. The lagged difference terms are introduced so that the errors are uncorrelated. For the above-specified model, the hypothesis of interest is H0: δ = 0

• ADF Tests in EViews

To begin, double click on the series name to open the series window, and choose View/Unit Root Test.

Specify whether you wish to test for a unit root in the level, first difference, or second difference of the series

Choose your exogenous regressors. You can choose to include a constant, a constant and linear trend, or neither

EViews can automatically select the lag length; alternatively, use AIC, SBC or other criteria.

• The null hypothesis of a unit root cannot be rejected

• Other Unit Root Tests

Phillips-Perron (1988) tests

GLS-detrended Dickey-Fuller tests (Elliott, Rothenberg, and Stock, 1996)

Kwiatkowski, Phillips, Schmidt, and Shin (KPSS, 1992) tests

Elliott, Rothenberg, and Stock point optimal tests (ERS, 1996)

Ng and Perron (NP, 2001) unit root tests

• Integrated Stochastic Process

The RWM is a specific case of a more general class of stochastic processes known as integrated processes.

The order of integration is the minimum number of times the series needs to be differenced to yield a stationary series.

The RWM is non-stationary, but its first difference is stationary: it is an I(1) series.

A stationary series is called an I(0) series; the first difference of an I(0) series still yields an I(0) series.

• ARIMA Models

An integrated process Xt is designated as ARIMA(p,d,q) if, after taking differences of order d, a stationary process Zt of the ARMA(p,q) type is obtained.

The ARIMA (p, d, q) model is expressed by the function

Zt = φ1Zt-1 + φ2Zt-2 + ... + φpZt-p + ut - θ1ut-1 - θ2ut-2 - ... - θqut-q

or φ(L)(1 - L)^d Xt = θ(L)ut

• Summary of ARMA/ARIMA modeling procedures

Perform preliminary transformations (if necessary) to stabilize the variance over time.

Detrend and deseasonalize the data (if necessary) to make the stationarity assumption look reasonable

Trend and seasonality are also characterized by ACFs that are slowly decaying and nearly periodic, respectively. The primary methods for achieving this are classical decomposition and differencing.

• Summary of ARMA/ARIMA modeling procedures

If the data look nonstationary without a well-defined trend or seasonality, an alternative to the above is to difference successively and use ADF tests.

Examine sample ACF & PACF to get an idea of potential p & q values. For an AR(p)/MA(q), the sample PACF/ACF cuts off after lag p/q

Estimate the coefficients for the promising models

• Summary of ARMA/ARIMA modeling procedures

From the fitted ML models above, choose the one with smallest AICC

Inspection of the standard errors of the coefficients at the estimation stage may reveal that some of them are not significant. If so, subset models can be fitted by constraining these coefficients to be zero at a second iteration of ML estimation.

Check the candidate models for goodness-of-fit by examining their residuals. This involves inspecting their ACF/PACF for departures from WN, and by carrying out the formal WN hypothesis tests

• Seasonal part of ARIMA model

The seasonal part of an ARIMA model has the same structure as the non-seasonal part: it may have an AR factor, an MA factor, and/or an order of differencing

In the seasonal part of the model, all of these factors operate across multiples of lag s (the number of periods in a season)

A seasonal ARIMA model is classified as an ARIMA(P,D,Q) model, where P=number of seasonal autoregressive (SAR) terms, D=number of seasonal differences, Q=number of seasonal moving average (SMA) terms

In identifying a seasonal model, the first step is to determine whether or not a seasonal difference is needed, in addition to or perhaps instead of a non-seasonal difference

• Seasonal part of ARIMA model

The seasonal models ARIMA(P, D, Q), which are not stationary but homogeneous of degree D, can be expressed as

Zt = Φ1Zt-s + Φ2Zt-2s + ... + ΦPZt-Ps + ut - Θ1ut-s - Θ2ut-2s - ...

or ΦP(L^s)(1 - L^s)^D Xt = δ + ΘQ(L^s)ut

The signature of pure SAR or pure SMA behavior is similar to the signature of pure AR or pure MA behavior, except that the pattern appears across multiples of lag s in the ACF and PACF

For example, a pure SAR(1) process has spikes in the ACF at lags s, 2s, 3s, etc., while the PACF cuts off after lag s

• Seasonal part of ARIMA modelConversely, a pure SMA(1) process has spikes in the PACF at lags s, 2s, 3s, etc., while the ACF cuts off after lag s

An SAR signature usually occurs when the autocorrelation at the seasonal period is positive, whereas an SMA signature usually occurs when the seasonal autocorrelation is negative
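The SAR signature (ACF spikes at multiples of lag s) shows up clearly in a simulation of a pure seasonal AR(1), yt = Φyt-s + ut with s = 12 (numpy sketch, synthetic data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(5)
s, Phi, n = 12, 0.8, 6000
y = np.zeros(n)
for t in range(s, n):
    y[t] = Phi * y[t - s] + rng.standard_normal()   # pure SAR(1), period 12

yc = y - y.mean()

def acf(k):
    """Sample autocorrelation at lag k.
    Theory: spikes Phi, Phi**2, ... at lags 12, 24, ..., and ~0 elsewhere."""
    return float(np.dot(yc[k:], yc[:-k]) / np.dot(yc, yc))
```

The positive autocorrelation at the seasonal lag here is exactly the positive-seasonal-autocorrelation case described above as an SAR signature.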

• General multiplicative seasonal models ARIMA (p, d, q) (P, D, Q)s

An integrated process Xt is designated as ARIMA(p,d,q) if, after taking differences of order d, a stationary ARMA(p,q) process Zt is obtained. The ARIMA(p,d,q) model is expressed by

Zt = φ1Zt-1 + φ2Zt-2 + ... + φpZt-p + ut - θ1ut-1 - θ2ut-2 - ... - θqut-q

or φ(L)(1 - L)^d Xt = θ(L)ut

The seasonal models ARIMA(P, D, Q), which are not stationary but homogeneous of degree D, can be expressed as

Zt = Φ1Zt-s + Φ2Zt-2s + ... + ΦPZt-Ps + ut - Θ1ut-s - Θ2ut-2s - ...

or ΦP(L^s)(1 - L^s)^D Xt = δ + ΘQ(L^s)ut

General multiplicative seasonal models, ARIMA (p, d, q) (P, D, Q)s

ΦP(L^s)φp(L)(1 - L^s)^D(1 - L)^d Xt = ΘQ(L^s)θq(L)ut

• ARIMA Model Building

Identification

This stage tries to identify an appropriate ARIMA model for the underlying stationary time series on the basis of the ACF and PACF. If the series is nonstationary, it is first transformed to covariance stationarity; one can then identify the possible values of the regular part of the model, i.e. the autoregressive order p and moving average order q of a univariate ARMA model, along with the seasonal part.

• ARIMA Model Building

Estimation: point estimates of the coefficients can be obtained by the method of maximum likelihood. Associated standard errors are also provided, suggesting which coefficients could be dropped.

Diagnostic checking: one should also examine whether the residuals of the model appear to be a white noise process.

Forecasting.

• MSARIMA (p, d, q)(P, D, Q)

MSARIMA (2,1,1)(0,0,1)

MSARIMA (0, 0,1)(1,1,0)

MSARIMA (1,0,1)(1,0,1)

MSARIMA (2,1,0)(0,1,1)