Time series Alina Vorontseva
Time series - a series of data points indexed in time order.
Most commonly, a time series is a sequence taken at successive equally spaced points in time.
Main goals -
• to identify the nature of the phenomenon represented by the sequence of observations
• to predict future values of the time series
Time series analysis uncovers hidden insights that support informed decision making. Many companies use time series data to forecast next year's sales figures, website traffic, competitive position and much more.
• "I have seen the future and it is very much like the present, only longer."
Kehlog Albran, The Profit
• "Prediction is very difficult, especially if it's about the future."
Niels Bohr, Nobel laureate in Physics
Some definitions
Time series = Trend + Seasonality + Random noise (additive seasonality), or Trend × Seasonality + Random noise (multiplicative seasonality)
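The decomposition above can be illustrated with a small synthetic series. This is a minimal sketch using only the standard library; the helper name `compose`, the component shapes and the period of 12 are assumptions chosen for illustration.

```python
import math
import random


def compose(trend, seasonality, noise, multiplicative=False):
    """Combine three equally long component lists into one series."""
    if multiplicative:
        # Trend * Seasonality + Random noise
        return [t * s + e for t, s, e in zip(trend, seasonality, noise)]
    # Trend + Seasonality + Random noise
    return [t + s + e for t, s, e in zip(trend, seasonality, noise)]


n, period = 24, 12
rng = random.Random(0)  # fixed seed so the example is repeatable
trend = [100 + 2 * t for t in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]

# Additive seasonality: a fixed swing of +/- 10 units around the trend.
add_season = [10 * math.sin(2 * math.pi * t / period) for t in range(n)]
# Multiplicative seasonality: a swing of +/- 10 % of the trend.
mul_season = [1 + 0.1 * math.sin(2 * math.pi * t / period) for t in range(n)]

additive = compose(trend, add_season, noise)
multiplicative = compose(trend, mul_season, noise, multiplicative=True)
```

In the multiplicative case the seasonal swing grows with the level of the series, which is often the reason for choosing it.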
Stationary time series:
• Constant mean
• Constant variance
• Autocovariance that depends only on the lag, not on time
How to make a time series stationary?
• Detrending
• Deseasonalizing
• Differencing
• Seasonal differencing
• Transformation (e.g. log, root, Box-Cox)
• Remove outliers
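Three of the techniques listed above (differencing, seasonal differencing and a log transformation) can be sketched in a few lines. The function names and the toy data are assumptions for illustration only.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag]; lag=1 is ordinary differencing,
    lag=season length is seasonal differencing."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def log_transform(series):
    """Natural log of every value; requires strictly positive data."""
    return [math.log(x) for x in series]


linear_trend = [3, 5, 7, 9, 11]
print(difference(linear_trend))          # [2, 2, 2, 2]

seasonal = [10, 20, 30, 11, 21, 31]      # season length 3, slow upward drift
print(difference(seasonal, lag=3))       # [1, 1, 1]

exponential = [1, 10, 100, 1000]
print(difference(log_transform(exponential)))  # roughly constant steps
```

Differencing once removes a linear trend, and differencing at the seasonal lag removes a repeating pattern; the log transform turns exponential growth into linear growth so that differencing can then remove it.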
Why do we need stationarity?
• Model assumptions
• Meaningful sample statistics
Smoothing (moving average) - local averaging of data such that the nonsystematic components of individual observations cancel each other out
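A simple (trailing, unweighted) moving average is one way to do this local averaging. The sketch below is illustrative; the function name and the window size are assumptions.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


noisy = [1, 2, 3, 4, 5]
print(moving_average(noisy, 3))  # [2.0, 3.0, 4.0]
```

A wider window smooths more strongly but reacts more slowly to real changes in the series.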
• Autocorrelation - correlation between the time series and a lagged copy of itself, computed for each lag
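A minimal sketch of the sample autocorrelation at a given lag, using the common estimator that divides the lagged covariance by the overall variance. The function name and the test series are assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given non-negative lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance = sum(
        (series[i] - mean) * (series[i - lag] - mean) for i in range(lag, n)
    )
    return covariance / variance


alternating = [1, -1] * 10
# Neighbours always have opposite signs, values two steps apart are equal.
lag1 = autocorrelation(alternating, 1)   # strongly negative
lag2 = autocorrelation(alternating, 2)   # strongly positive
```

Plotting this value for lags 1, 2, 3, ... gives the correlogram (ACF plot) that is commonly used to spot seasonality and to choose model orders.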
Timeseries modelling
Model evaluation
• Log likelihood (maximize logarithm of the probability of the observed data coming from the estimated model)
• AIC - Akaike’s Information Criterion (minimize relative information lost when a given model is used to represent the process that generated the data)
• BIC - Bayesian Information Criterion (minimize)
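The three criteria above can be computed from a model's residuals. The sketch below assumes Gaussian errors with the variance estimated by maximum likelihood, and uses the textbook definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters; the function names are assumptions.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximised log likelihood assuming i.i.d. normal errors."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_likelihood, k):
    """Akaike's Information Criterion; smaller is better."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion; smaller is better."""
    return k * math.log(n) - 2 * log_likelihood


residuals = [1, -1, 1, -1]
ll = gaussian_log_likelihood(residuals)
print(aic(ll, k=2), bic(ll, k=2, n=len(residuals)))
```

Both criteria reward a high likelihood and penalise extra parameters; they are only meaningful for comparing models fitted to the same data.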
Naïve methods
• Predict last value
• Predict mean value
• Predict value from last season
• Predict value with linear slope from previous step
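The four naïve methods above might be sketched as follows. The function names are assumptions, and the last method is interpreted here as "last value plus the most recent change, repeated for each step ahead".

```python
def forecast_last(series, steps):
    """Repeat the last observed value."""
    return [series[-1]] * steps


def forecast_mean(series, steps):
    """Repeat the mean of all observed values."""
    mean = sum(series) / len(series)
    return [mean] * steps


def forecast_seasonal(series, steps, period):
    """Repeat the values of the last full season."""
    last_season = series[-period:]
    return [last_season[h % period] for h in range(steps)]


def forecast_linear(series, steps):
    """Extend the line through the last two observations."""
    slope = series[-1] - series[-2]
    return [series[-1] + slope * h for h in range(1, steps + 1)]


history = [1, 2, 3, 4, 5, 6]
print(forecast_last(history, 2))                 # [6, 6]
print(forecast_mean(history, 2))                 # [3.5, 3.5]
print(forecast_seasonal(history, 4, period=3))   # [4, 5, 6, 4]
print(forecast_linear(history, 2))               # [7, 8]
```

These forecasts are mainly useful as baselines: a more complex model should beat them before it is trusted.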
ARIMA
ARIMA =
AR (autoregression) + I (integration, i.e. differencing) + MA (moving average)
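AR(p): the current value is a linear combination of the p previous values plus a noise term
MA(q): the current value is a linear combination of the q previous noise terms (forecast errors) plus a new noise term
ARIMA(p,d,q): an ARMA(p,q) model applied to the series after differencing it d times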
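For reference, the standard textbook forms of these models are written out here. The notation is an assumption (c is a constant, φ and θ are coefficients, ε_t is white noise, and y'_t is the series differenced d times), chosen to match the constant c and the order d used in the long-term forecast cases.

```latex
\begin{align}
\text{AR}(p):\quad & y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t \\
\text{MA}(q):\quad & y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} \\
\text{ARIMA}(p,d,q):\quad & y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p}
  + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t
\end{align}
```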
ARIMA: long-term forecasts (c - constant term, d - order of differencing)
• If c=0 and d=0, the long-term forecasts will go to zero.
• If c=0 and d=1, the long-term forecasts will go to a non-zero constant.
• If c=0 and d=2, the long-term forecasts will follow a straight line.
• If c≠0 and d=0, the long-term forecasts will go to the mean of the data.
• If c≠0 and d=1, the long-term forecasts will follow a straight line.
• If c≠0 and d=2, the long-term forecasts will follow a quadratic trend.
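These cases can be checked numerically on the simplest possible model. The sketch below iterates point forecasts of an ARIMA(1,d,0) model with future noise set to zero; the function name, the coefficient 0.5 and the starting values are assumptions for illustration, not fitted values.

```python
def arima_1d0_forecast(history, c, phi, d, steps):
    """Point forecasts of y'_t = c + phi * y'_{t-1}, where y' is the
    series differenced d times. Needs at least d + 1 observations."""
    if len(history) < d + 1:
        raise ValueError("need at least d + 1 observations")
    # levels[k] holds the latest value of the k-th difference of the series.
    levels = []
    current = list(history)
    for _ in range(d + 1):
        levels.append(current[-1])
        current = [b - a for a, b in zip(current, current[1:])]

    forecasts = []
    for _ in range(steps):
        levels[d] = c + phi * levels[d]       # AR(1) step on the d-th difference
        for k in range(d - 1, -1, -1):        # integrate back to the original scale
            levels[k] += levels[k + 1]
        forecasts.append(levels[0])
    return forecasts


to_zero = arima_1d0_forecast([8.0], c=0.0, phi=0.5, d=0, steps=60)
to_mean = arima_1d0_forecast([8.0], c=1.0, phi=0.5, d=0, steps=60)
to_const = arima_1d0_forecast([0.0, 4.0], c=0.0, phi=0.5, d=1, steps=60)
line = arima_1d0_forecast([0.0, 4.0], c=1.0, phi=0.5, d=1, steps=60)
```

With these inputs `to_zero` decays towards 0, `to_mean` settles at c / (1 - phi) = 2, `to_const` levels off at a non-zero constant, and `line` ends up increasing by a fixed amount per step.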
Simple exponential smoothing
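A minimal sketch of the usual recursion, level_t = α·y_t + (1 − α)·level_{t−1}, where the final level is the forecast for every future step. Initialising the level with the first observation is an assumption; other initialisations are possible.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after the last observation,
    which is also the flat forecast for all future steps."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be in [0, 1]")
    level = series[0]  # assumption: start from the first observation
    for observation in series[1:]:
        level = alpha * observation + (1 - alpha) * level
    return level


print(simple_exponential_smoothing([0, 10], alpha=0.5))  # 5.0
```

With α close to 1 the forecast follows the latest observation; with α close to 0 it changes very slowly.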
Holt's linear trend method
• ℓ_t denotes an estimate of the level of the series at time t
• b_t denotes an estimate of the trend (slope) of the series at time t
• α is the smoothing parameter for the level, 0 ≤ α ≤ 1
• β* is the smoothing parameter for the trend, 0 ≤ β* ≤ 1
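A minimal sketch of Holt's method with the usual update equations: level = α·y + (1 − α)·(previous level + previous trend), trend = β*·(level − previous level) + (1 − β*)·previous trend, forecast h steps ahead = level + h·trend. The initial level and trend (first observation and first difference) are assumptions.

```python
def holt_linear_forecast(series, alpha, beta, steps):
    """Holt's linear trend method; returns forecasts for 1..steps ahead."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]               # assumption: initial level
    trend = series[1] - series[0]   # assumption: initial slope
    for observation in series[1:]:
        previous_level = level
        level = alpha * observation + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, steps + 1)]


# A perfectly linear series is extrapolated along the same line.
print(holt_linear_forecast([1, 3, 5, 7, 9], alpha=0.5, beta=0.5, steps=3))
```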
Variation: Exponential trend method
• b_t now represents an estimated growth rate (in relative terms rather than absolute) which is multiplied rather than added to the estimated level.
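The same idea in code, as a sketch under stated assumptions: the trend is a growth ratio that multiplies the level, so the forecast h steps ahead is level × growth^h. The initial values (first observation and the ratio of the first two observations) are assumptions, and the series must be strictly positive.

```python
def holt_exponential_forecast(series, alpha, beta, steps):
    """Holt's method with a multiplicative (exponential) trend."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]                # assumption: initial level
    growth = series[1] / series[0]   # assumption: initial growth ratio
    for observation in series[1:]:
        previous_level = level
        level = alpha * observation + (1 - alpha) * previous_level * growth
        growth = beta * (level / previous_level) + (1 - beta) * growth
    return [level * growth ** h for h in range(1, steps + 1)]


# A series that doubles every step keeps doubling in the forecast.
print(holt_exponential_forecast([1, 2, 4, 8], alpha=0.5, beta=0.5, steps=2))
```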
Holt-Winters seasonal method
• Additive method
• Multiplicative method
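A minimal sketch of the additive variant, which adds a seasonal component with its own smoothing parameter γ to Holt's level and trend. The initialisation (level from the first season's mean, trend from the difference between the first two season means, seasonal indices from the first season) is a simple heuristic assumed here for illustration; at least two full seasons of data are required.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, steps):
    """Additive Holt-Winters; returns forecasts for 1..steps ahead."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons")
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period                         # assumption
    trend = (sum(second) - sum(first)) / period ** 2    # assumption
    seasonals = [x - level for x in first]              # assumption

    for t in range(period, len(series)):
        observation = series[t]
        previous_level, previous_trend = level, trend
        seasonal = seasonals[t % period]
        level = (alpha * (observation - seasonal)
                 + (1 - alpha) * (previous_level + previous_trend))
        trend = beta * (level - previous_level) + (1 - beta) * previous_trend
        seasonals[t % period] = (
            gamma * (observation - previous_level - previous_trend)
            + (1 - gamma) * seasonal
        )

    n = len(series)
    return [
        level + h * trend + seasonals[(n + h - 1) % period]
        for h in range(1, steps + 1)
    ]


# A purely seasonal series with no trend repeats its pattern.
print(holt_winters_additive([10, 20, 30] * 3, period=3,
                            alpha=0.5, beta=0.5, gamma=0.5, steps=3))
```

The multiplicative method follows the same structure but divides the observation by the seasonal index instead of subtracting it, and multiplies the forecast by the index instead of adding it.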
Exponential smoothing methods
Useful tools
• Libraries for R, Python, Octave, Java, C++, etc…
• Excel advanced tools
• Other software:
References
• https://en.wikipedia.org/wiki/Time_series
• https://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/
• http://www.statsoft.com/Textbook/Time-Series-Analysis
• https://people.duke.edu/~rnau/411quote.htm
• https://www.otexts.org/fpp
• https://en.wikipedia.org/wiki/Category:Time_series_software