Time Series Analysis
Dr. Qingling Wu
UCL Geography
Office: G20 Pearson Building
Lecture outline
• Introduction to Time Series Analysis
• Decompose signals:
– Loess, STL
• Modeling Time Series
– Autoregressive Integrated Moving Average (ARIMA) Model
– Wavelet & Fourier Analysis
– Principal Component Analysis (for 2D)
Reading list
• Textbooks:
– Chandler and Scott (2011). Statistical Methods for Trend Detection and Analysis in the Environmental Sciences. Wiley, 368pp.
– Hamilton (1994). Time Series Analysis. Princeton University Press, 799pp.
– Jolliffe (2005). Principal Component Analysis. John Wiley & Sons, Ltd, 290pp.
– Wilks (2011). Statistical Methods in the Atmospheric Sciences, 3rd Ed. Chapter 9 (or Chapter 8 in 2nd Ed.)
– Mudelsee (2010). Climate Time Series Analysis: Classical Statistical and Bootstrap Methods. Vol. 42. Springer. (Very basic)
• Journal articles (time series in geographical and environmental studies):
– Wavelet: Grinsted, A., Moore, J. C., and Jevrejeva, S. (2004). "Application of the cross wavelet transform and wavelet coherence to geophysical time series." Nonlinear Processes in Geophysics 11(5/6): 561-566.
– PCA/EOF: Eastman, J. R. and Fulk, M. (1993). "Long sequence time series evaluation using standardized principal components." Photogrammetric Engineering and Remote Sensing 59(6): 991-996.
Categorization
• Continuous vs. Discrete
– Observations form a continuous time series
– Can digitize the values at a specified time interval to
obtain a discrete time series.
• Deterministic vs. Stochastic
– Physical laws described by Differential Equations?
– Anthropogenic climate forcings?
Categorization
• Stationary vs. Non-stationary
– Stationary, if statistical characteristics (e.g. mean,
variance) show no systematic change
Objectives of Time Series Analysis
• Description – Trend, seasonality/cyclicity, outliers, sudden changes or breaks
• Explanation – Using one TS to explain another
– May help understand the mechanisms
• Prediction – i.e. forecasting
• Control – TS often collected to improve control over a physical process
– Monitoring to alert when conditions exceed an a priori determined threshold
Basic Components in a Time Series
• All time-series data have three basic parts:
– A trend component (T)
• Long term change in mean
– A seasonal/cyclic component (S & C)
• Seasonality: i.e. annual variation. Can be deseasonalized
• Cyclicity: variation fixed in period. E.g. diurnal temp. variation
– An irregular component (I)
• Signal after removal of trend and seasonal/cyclic variations
Additive vs. Multiplicative
• Y = T + S + I, when the components are independent (additive model)
• Y = T × S × I, when the components are dependent (multiplicative model)
– $\ln(Y) = \ln(T \times S \times I) = \ln(T) + \ln(S) + \ln(I)$, so a log transform turns a multiplicative model into an additive one
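A minimal sketch of the two models with statsmodels' seasonal_decompose on a synthetic monthly series (the series, the period of 12, and the component shapes are illustrative assumptions):

```python
import numpy as np
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly series: trend T, seasonal factor S, irregular factor I
rng = np.random.default_rng(0)
t = np.arange(120)
T = 100 + 0.5 * t
S = 1 + 0.1 * np.sin(2 * np.pi * t / 12)
I = rng.normal(1.0, 0.02, t.size)
y = T * S * I                    # multiplicative: Y = T x S x I

# Decompose directly as multiplicative ...
mult = seasonal_decompose(y, model='multiplicative', period=12)
# ... or take logs and decompose additively: ln(Y) = ln(T) + ln(S) + ln(I)
addv = seasonal_decompose(np.log(y), model='additive', period=12)
print(mult.trend[60], np.exp(addv.trend[60]))   # comparable trend estimates
```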
Smoothing & Fitting
• Smoothing (Convolution)
– Moving Average
– Weighted Moving Average
– Exponential smoothing
• Fitting
– Linear fitting
– Polynomial fitting
– Exponential curve
Moving Average
Original series: $x_1, x_2, x_3, x_4, x_5, x_6, x_7$
3-point moving average: $\hat{x}_t = (x_{t-1} + x_t + x_{t+1}) / 3$
New series: $\hat{x}_2, \hat{x}_3, \hat{x}_4, \hat{x}_5, \hat{x}_6$
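A minimal sketch of the same 3-point moving average with NumPy convolution (the series values are illustrative):

```python
import numpy as np

x = np.array([2.0, 4.0, 3.0, 5.0, 6.0, 5.0, 7.0])   # x1 ... x7 (illustrative)

# mode='valid' keeps only full windows, i.e. the new series at t = 2 ... 6
ma3 = np.convolve(x, np.ones(3) / 3, mode='valid')
print(ma3)   # (x1+x2+x3)/3, (x2+x3+x4)/3, ..., (x5+x6+x7)/3
```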
LOWESS & LOESS
• Locally weighted scatterplot smoother (LOWESS)
– Cleveland 1979
• LOcal regrESSion (LOESS)
– Fitting simple models to localized subsets of the data
– At each point in the data set a low-degree polynomial is fitted to a subset of the data, with explanatory variable values near the point whose response is being estimated. The polynomial is fitted using weighted least squares, giving more weight to points near the point whose response is being estimated and less weight to points further away.
http://en.wikipedia.org/wiki/Local_regression
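A minimal LOWESS sketch with statsmodels (the noisy signal and the smoothing fraction frac are assumptions):

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(0, 0.3, x.size)   # noisy signal (illustrative)

# frac: fraction of the data used in each local weighted fit
smoothed = lowess(y, x, frac=0.2)            # returns columns [x, fitted y]
print(smoothed[:3])
```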
Seasonal-Trend decomposition using Loess (STL)
• Decompose series into trend, seasonal,
and random components using loess
• The seasonal component is found by
loess smoothing of the seasonal sub-
series (e.g. series of January values)
• xt is deseasonalized and smoothed to
find the trend
• Overall level subtracted from seasonal
series and added to the trend
• This process is repeated a few times
until convergence
• Remainder: the residual after the trend and seasonal components are removed
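A minimal STL sketch with statsmodels (the synthetic monthly series and period=12 are assumptions):

```python
import numpy as np
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(2)
t = np.arange(144)
y = 0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, t.size)

res = STL(y, period=12).fit()
# The series is recovered as trend + seasonal + remainder
print(np.allclose(y, res.trend + res.seasonal + res.resid))
```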
Autocorrelation
• Autocorrelation
– Correlation between a series and itself (lagged)
– The “effective sample size” is reduced in the presence of positive autocorrelation
• Autoregression
– whether the next value in the time series can be
predicted as some function of its previous values
Autocorrelation
• (Pearson) correlation coefficient, R
• Autocorrelation coefficient at lag k: $r(k) = \sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x}) \,/\, \sum_{t=1}^{n} (x_t - \bar{x})^2$
• What if k < 0? Since $r(-k) = r(k)$, only non-negative lags need to be considered
Autocorrelation function (ACF)
• Autocorrelation function (ACF) – A collection of {r(k)}
• Correlogram
– Strength of linear association between two quantities:
• intrinsically related, or both influenced by other factors
– Lack of correlation doesn’t imply lack of association; e.g. non-linear relationship
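A minimal correlogram sketch with statsmodels, including r(1) computed from the definition above (the AR(1)-style test series is an assumption):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(3)
x = np.zeros(500)
for t in range(1, x.size):
    x[t] = 0.7 * x[t - 1] + rng.normal()   # AR(1)-style series (illustrative)

r = acf(x, nlags=10)                        # ACF: r(0)=1, r(1), r(2), ...
print(np.round(r, 2))                       # decays roughly like 0.7**k

xbar = x.mean()                             # r(1) from the definition
r1 = np.sum((x[:-1] - xbar) * (x[1:] - xbar)) / np.sum((x - xbar) ** 2)
print(round(r1, 2))
```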
Partial Autocorrelation function (PACF)
• Wait …
If t0 is correlated with t1, and t1 is correlated with t2,
then t0 will necessarily be correlated with t2 also…
Partial Autocorrelation function (PACF)
• The partial autocorrelation αk is obtained as
coefficient βk from the regression
• This is an autoregressive (AR) process of order k
$x_t = \beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \dots + \beta_k x_{t-k}$
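A minimal sketch checking that the lag-k partial autocorrelation α_k matches the last coefficient β_k of that regression (the series and k = 2 are assumptions):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(4)
x = np.zeros(1000)
for t in range(1, x.size):
    x[t] = 0.7 * x[t - 1] + rng.normal()      # AR(1) series (illustrative)

k, n = 2, x.size
y = x[k:]                                      # regress x_t on its k lags
X = np.column_stack([x[k - j: n - j] for j in range(1, k + 1)])
beta = sm.OLS(y, sm.add_constant(X)).fit().params

print(round(beta[-1], 3))                      # beta_k from the regression
print(round(pacf(x, nlags=k, method='ols')[k], 3))  # alpha_k: the same value
```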
Cross-correlation function (CCF)
• measures the correlation
between observations of two
series at different lags
Autocorrelation is the cross-correlation of a signal with itself.
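A minimal CCF sketch with statsmodels (two illustrative series in which y lags x by 3 steps):

```python
import numpy as np
from statsmodels.tsa.stattools import ccf

rng = np.random.default_rng(5)
x = rng.normal(size=300)
y = np.roll(x, 3) + rng.normal(0, 0.5, x.size)   # y lags x by 3 steps

# ccf(y, x)[k] estimates the correlation between y_{t+k} and x_t
r = ccf(y, x)[:6]
print(np.round(r, 2))   # peaks near lag 3
```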
Autoregressive (AR) process
• Specifies that the output variable depends linearly
on its own previous values
• AR(p)
– Xt: a function of the past observations plus a purely
random process
– Order p depends on the cutoff in the PACF correlogram
– The PACF of an AR(p) process is zero at lag p + 1 and greater
Autoregressive (AR) Model AR(p)
• AR(0): no dependence between the terms
– White noise
• AR(1): only the previous term and the noise term
contribute to the output
– Markov process; with a unit coefficient, a random walk
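A minimal sketch simulating the three cases (the AR coefficient 0.7 is illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
e = rng.normal(size=500)                # purely random process (white noise)

ar0 = e.copy()                          # AR(0): white noise
ar1 = np.zeros_like(e)
walk = np.zeros_like(e)
for t in range(1, e.size):
    ar1[t] = 0.7 * ar1[t - 1] + e[t]    # AR(1) with coefficient 0.7
    walk[t] = walk[t - 1] + e[t]        # random walk: AR(1) with coefficient 1

print(ar0.std(), ar1.std(), walk.std())   # spread grows for the random walk
```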
Moving Average (MA) Model MA(q)
• xt is modeled as a function of the current and past
values of a purely random process, i.e. errors
– Modeling based on the previous error term(s)
– MA(q): a random shock at time t has effect until t+q
• Examples
– Variation caused by coupons in sales data
– Cooling due to a volcanic eruption, say, lasting for 2 yrs
Autoregressive & Moving Average model
ARMA(p,q)
• An autoregressive moving average process
combines both AR(p) and MA(q)
terms into a general model for
time series
• ARMA(p,q)
– p: order of the AR process
– q: order of the MA process
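A minimal ARMA(1,1) simulation with statsmodels' ArmaProcess (the coefficients are illustrative; note that the lag polynomials include the leading 1 and the AR coefficients enter negated):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# x_t = 0.7 x_{t-1} + e_t + 0.4 e_{t-1}, i.e. ARMA(1,1)
proc = ArmaProcess(ar=np.r_[1, -0.7], ma=np.r_[1, 0.4])
x = proc.generate_sample(nsample=500)

print(proc.isstationary, proc.isinvertible)
print(x[:5])
```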
How to choose model orders
• AR and MA signatures: If the PACF displays a
sharp cutoff while the ACF decays more slowly
(i.e., has significant spikes at higher lags), we say
that the stationarized series displays an "AR
signature," meaning that the autocorrelation
pattern can be explained more easily by adding
AR terms than by adding MA terms.
http://people.duke.edu/~rnau/411arim3.htm
Differencing
• First difference: $y_t = x_t - x_{t-1}$
• Second difference: $z_t = y_t - y_{t-1} = x_t - 2x_{t-1} + x_{t-2}$
• Differencing can also stationarize the series
– But only the mean: it cannot stationarize the variance
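A minimal differencing sketch with NumPy (the trend-plus-noise series is illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.arange(200)
x = 0.5 * t + rng.normal(0, 1, t.size)   # linear trend + noise

y = np.diff(x)        # first difference: y_t = x_t - x_{t-1}
z = np.diff(x, n=2)   # second difference: x_t - 2 x_{t-1} + x_{t-2}
print(round(y.mean(), 2))   # ~0.5 (the slope): the trend in the mean is gone
```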
Transformation
• Log() transform: multiplicative → additive
– Trend, seasonality, & errors are all transformed
– Non-stationary variance → stationary variance
Test for Stationarity
• The null hypothesis of the Augmented Dickey-Fuller (ADF) t-test is:
– H0: θ = 0 (i.e. the data needs to be differenced to make it stationary)
• versus the alternative hypothesis of
– H1: θ < 0 (i.e. the data is stationary and doesn’t need to be differenced)
http://faculty.smu.edu/tfomby/eco6375/BJ%20Notes/ADF%20Notes.pdf
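A minimal ADF sketch with statsmodels (the random-walk test series is an assumption):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(8)
walk = np.cumsum(rng.normal(size=500))     # random walk: non-stationary

stat, pvalue = adfuller(walk)[:2]
print(pvalue > 0.05)                       # True: fail to reject H0, difference
stat, pvalue = adfuller(np.diff(walk))[:2]
print(pvalue < 0.05)                       # True: differenced series is stationary
```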
ARIMA Model
• Autoregressive Integrated Moving Average
(ARIMA)
• ARIMA(p,d,q)
– d: order of the differencing applied to the original xt
• First-order differencing is usually sufficient
– so generally d = 1
• Random walk ARIMA(0,1,0)
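A minimal ARIMA fit with statsmodels (the order (1,1,1) and the drifting series are assumptions):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(9)
x = np.cumsum(0.3 + rng.normal(size=300))   # drifting, non-stationary series

res = ARIMA(x, order=(1, 1, 1)).fit()       # p=1, d=1, q=1
print(res.params)
print(res.forecast(steps=5))                # 5-step-ahead prediction
```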
Identifying d in ARIMA(p,d,q)
• Once you are happy that the mean and variance
are as constant as possible you can assign d a
value
– If d = 0 the data are stationary and there is no trend;
– If d = 1 the data need to be differenced once so the
linear trend is removed;
– If d = 2 the data need to be differenced twice so that the
linear and quadratic trends are removed.
Thomson UCL
Identifying p and q in ARIMA(p,d,q)
• Correlogram
• ACF & PACF Revisit
– ACF: correlation coefficients for consecutive lags
– PACF: partials out the intermediate autocorrelations and estimates the direct autocorrelation at a specific lag (e.g. lag 2)
Thomson UCL
Seasonal autoregressive integrated moving
average Model SARIMA(p,d,q)(P,D,Q)s
• Acknowledge seasonal component
• In a monthly series we expect xt to depend on xt-12 and perhaps xt-24 as well as on more recent non-seasonal values such as xt-1 and xt-2
• s: the period of seasonality (e.g. s = 12 for a monthly series)
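A minimal seasonal fit with statsmodels' SARIMAX (the orders and the synthetic monthly series are assumptions):

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(10)
t = np.arange(144)
y = 0.1 * t + 5 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, t.size)

# SARIMA(1,1,1)(1,1,1) with s = 12 for a monthly series
res = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12)).fit(disp=False)
print(res.aic)
print(res.forecast(steps=12))   # one seasonal cycle ahead
```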
Evaluating Models, Goodness of fit
• Akaike Information Criterion (AIC)
– A function of both the fit, via the sum of squared residuals (S), and the number of parameters p: $AIC = n \ln(S/n) + 2p$
• Bayesian Information Criterion (BIC)
– Punishes extra parameters more: $BIC = n \ln(S/n) + p \ln(n)$
Akaike, H. (1974). "A new look at the statistical model identification." IEEE Transactions on Automatic Control 19(6): 716-723.
Schwarz, G. (1978). "Estimating the dimension of a model." Annals of Statistics 6(2): 461-464.
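A minimal order-selection sketch comparing AIC and BIC over a small grid of ARMA orders (the grid and the true AR(1) test series are assumptions):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(11)
x = ArmaProcess(np.r_[1, -0.7], np.r_[1]).generate_sample(500)  # true AR(1)

for p in range(3):
    for q in range(3):
        res = ARIMA(x, order=(p, 0, q)).fit()
        print(p, q, round(res.aic, 1), round(res.bic, 1))
# BIC punishes extra parameters more, so it favours the smaller model
```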
Generalized Least Squares
• Assume the covariance of two errors depends only on their separation in time
• Define the autocorrelation at lag s as $\rho_s$, the correlation between two errors that are separated by |s| time periods
• Error covariance matrix: $\Sigma_{ij} = \sigma^2 \rho_{|i-j|}$
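A minimal GLS sketch with statsmodels, assuming AR(1)-style error autocorrelation $\rho_s = \rho^{|s|}$ (ρ = 0.6 and the trend regression are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
n, rho = 100, 0.6
t = np.arange(n, dtype=float)

e = np.zeros(n)                     # AR(1) errors: corr(e_i, e_j) = rho**|i-j|
for i in range(1, n):
    e[i] = rho * e[i - 1] + rng.normal()
y = 2.0 + 0.5 * t + e

X = sm.add_constant(t)
Sigma = rho ** np.abs(np.subtract.outer(t, t))   # error correlation matrix
res = sm.GLS(y, X, sigma=Sigma).fit()
print(res.params)                   # intercept and slope, with autocorrelation
```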
Spectral Analysis
• Period: a repeating pattern in a series
• Sine wave: the fundamental periodic signal
• Joseph Fourier (1768-1830)
– Good approximations to most periodic signals can be
achieved using sums of sine (and cosine) waves
• Spectral analysis
– based on sine waves and a decomposition of variation
in series into waves of various frequencies
Fourier Transformation
• Time domain to frequency domain
• A sine wave that makes m cycles in the series length is the mth harmonic
• Amplitude of the mth harmonic: $A_m = \sqrt{a_m^2 + b_m^2}$
– In general, instead of via a regression, the calculations above are usually performed with the fast Fourier transform (FFT) algorithm
• Fourier line spectrum: plot of $A_m$ as spikes against m
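A minimal line-spectrum sketch with NumPy's FFT (the two-harmonic test signal is an assumption):

```python
import numpy as np

n = 256
t = np.arange(n)
# Harmonics m = 4 and m = 10 with amplitudes 3 and 1.5 (illustrative)
x = 3 * np.sin(2 * np.pi * 4 * t / n) + 1.5 * np.cos(2 * np.pi * 10 * t / n)

c = np.fft.rfft(x)
A = 2 * np.abs(c) / n            # amplitude A_m of the mth harmonic
print(np.round(A[[4, 10]], 2))   # [3.0, 1.5]: spikes at m = 4 and m = 10
```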
Wavelet Analysis
• Generalization of Fourier transforms in filtering
• Wavelets are ‘small waves’, sinusoids are ‘big
waves’
• A function Ψ(·) is a wavelet if:
– Its integral is zero: $\int \Psi(u)\,du = 0$
– The integral of its square is unity: $\int \Psi^2(u)\,du = 1$
• Such properties hold, e.g., for $\Psi(u) = (\sin(2\pi u) - \sin(\pi u)) / (\pi u)$
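A minimal numerical check of the two conditions for the example wavelet above (the integration range is an assumption; convergence is slow because Ψ decays like 1/u):

```python
import numpy as np

u = np.linspace(-200, 200, 400001)
# Psi(u) = (sin(2 pi u) - sin(pi u)) / (pi u), via np.sinc(x) = sin(pi x)/(pi x)
psi = 2 * np.sinc(2 * u) - np.sinc(u)

du = u[1] - u[0]
print(round(np.sum(psi) * du, 2))        # ~0: the integral of Psi is zero
print(round(np.sum(psi ** 2) * du, 2))   # ~1: the integral of Psi^2 is unity
```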
Time Series in Remote Sensing
• 1D → 2D
• Spatial and temporal feature extraction
• PCA/EOF
– Principal Component Analysis (Empirical Orthogonal Function)
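A minimal PCA/EOF sketch with scikit-learn on a synthetic image time series (the stack shape and number of components are assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(13)
n_times, ny, nx = 24, 10, 10
stack = rng.normal(size=(n_times, ny, nx))   # image time series (illustrative)

X = stack.reshape(n_times, ny * nx)          # rows = time, columns = pixels

pca = PCA(n_components=3)
pcs = pca.fit_transform(X)                   # temporal scores (PCs)
eofs = pca.components_.reshape(3, ny, nx)    # spatial patterns (EOFs)
print(pcs.shape, eofs.shape, np.round(pca.explained_variance_ratio_, 2))
```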