
Forecasting: principles and practice

Rob J Hyndman

2.2 Transformations

Outline

1 Variance stabilization

2 Box-Cox transformations

3 Back-transformation

4 Lab session 9


Variance stabilization

If the data show different variation at different levels of the series, then a transformation can be useful.

Denote original observations as $y_1, \dots, y_n$ and transformed observations as $w_1, \dots, w_n$.

Mathematical transformations for stabilizing variation

Square root   $w_t = \sqrt{y_t}$       ↓
Cube root     $w_t = \sqrt[3]{y_t}$    Increasing
Logarithm     $w_t = \log(y_t)$        strength

Logarithms, in particular, are useful because they are more interpretable: changes in a log value are relative (percent) changes on the original scale.
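The link between log changes and percent changes can be checked numerically. The following is a minimal sketch in base R using a made-up series that grows by 5% per period; the values and variable names are illustrative assumptions, not data from these slides.

```r
# Hypothetical series growing by exactly 5% per period (illustrative values only).
y <- 100 * 1.05^(0:5)

# Relative (percent) change on the original scale.
pct_change <- diff(y) / head(y, -1)

# Change in the log value between consecutive periods.
log_change <- diff(log(y))

# Each log difference equals log(1.05), which is close to the 5% relative change.
print(round(pct_change, 4))
print(round(log_change, 4))
```

The approximation is good for small changes and becomes looser as the relative change grows.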




















Variance stabilization

[Figures: time plots against Year (1960 to 1990s) of electricity production, square root electricity production, cube root electricity production, log electricity production, and inverse electricity production.]



Box-Cox transformations

Each of these transformations is close to a member of the family of Box-Cox transformations:

$$w_t = \begin{cases} \log(y_t), & \lambda = 0; \\ (y_t^\lambda - 1)/\lambda, & \lambda \ne 0. \end{cases}$$

$\lambda = 1$: (No substantive transformation)
$\lambda = \frac{1}{2}$: (Square root plus linear transformation)
$\lambda = 0$: (Natural logarithm)
$\lambda = -1$: (Inverse plus 1)
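To make the four special cases concrete, here is a minimal base R sketch of the Box-Cox family. The helper name `box_cox` and the input values are assumptions for illustration; this is not the `BoxCox()` function from the forecast package.

```r
# Hypothetical helper implementing the Box-Cox definition given above.
box_cox <- function(y, lambda) {
  if (lambda == 0) {
    log(y)
  } else {
    (y^lambda - 1) / lambda
  }
}

y <- c(1, 4, 9, 16)  # illustrative positive values

w_one  <- box_cox(y, 1)    # y - 1: a shift only, no substantive transformation
w_half <- box_cox(y, 0.5)  # 2 * (sqrt(y) - 1): square root plus linear transformation
w_zero <- box_cox(y, 0)    # natural logarithm
w_inv  <- box_cox(y, -1)   # 1 - 1/y: (negated) inverse plus 1

print(rbind(w_one, w_half, w_zero, w_inv))
```

Subtracting 1 and dividing by $\lambda$ only shifts and rescales the power transformation, so the shape of the transformed series is determined by the power itself.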








library(fpp2)  # assumed setup: attaches forecast, fma (the elec data) and ggplot2
autoplot(BoxCox(elec, lambda = 1/3))

[Figure: time plot of BoxCox(elec, lambda = 1/3) against Time, 1960 to 1990s.]


Box-Cox transformations

$y_t^\lambda$ for $\lambda$ close to zero behaves like logs.
If some $y_t = 0$, then we must have $\lambda > 0$.
If some $y_t < 0$, no power transformation is possible unless all $y_t$ are adjusted by adding a constant to all values.
Choose a simple value of $\lambda$. It makes explanation easier.
Results are relatively insensitive to the value of $\lambda$.
Often no transformation ($\lambda = 1$) is needed.
Transformation often makes little difference to forecasts but has a large effect on prediction intervals (PI).
Choosing $\lambda = 0$ is a simple way to force forecasts to be positive.
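The first point can be justified with a short series expansion. The sketch below assumes $y_t > 0$ and uses the Box-Cox form $(y_t^\lambda - 1)/\lambda$ from the definition earlier in these slides.

```latex
\begin{align*}
y_t^\lambda &= e^{\lambda \log y_t}
             = 1 + \lambda \log y_t + \tfrac{1}{2}\lambda^2 (\log y_t)^2 + \cdots \\
\frac{y_t^\lambda - 1}{\lambda}
            &= \log y_t + \tfrac{1}{2}\lambda (\log y_t)^2 + \cdots
             \;\longrightarrow\; \log y_t \quad \text{as } \lambda \to 0.
\end{align*}
```

This is why the $\lambda = 0$ case is defined as the logarithm: it is the continuous limit of the power case.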

Automated Box-Cox transformations

(BoxCox.lambda(elec))

## [1] 0.2654076

This attempts to balance the seasonal fluctuations and random variation across the series.
Always check the results.
A low value of $\lambda$ can give extremely large prediction intervals.



Back-transformation

We must reverse the transformation (or back-transform) to obtain forecasts on the original scale. The reverse Box-Cox transformations are given by

$$y_t = \begin{cases} \exp(w_t), & \lambda = 0; \\ (\lambda w_t + 1)^{1/\lambda}, & \lambda \ne 0. \end{cases}$$
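As an illustration of the reverse transformation, the base R sketch below back-transforms a point forecast and interval endpoints from the transformed scale. The helper name `inv_box_cox` and the numbers are assumptions chosen so the arithmetic is easy to follow; they are not output from any model in these slides.

```r
# Hypothetical helper implementing the reverse Box-Cox formula given above.
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) {
    exp(w)
  } else {
    (lambda * w + 1)^(1 / lambda)
  }
}

lambda <- 1/3

# Illustrative values on the transformed scale: lower limit, point forecast,
# upper limit. The interval is symmetric about the point forecast (21 +/- 3).
w <- c(lower = 18, point = 21, upper = 24)

# Back-transform each value: (w / 3 + 1)^3, i.e. approximately 343, 512, 729.
y <- inv_box_cox(w, lambda)
print(y)

# The interval is no longer symmetric on the original scale.
print(c(below = unname(y["point"] - y["lower"]),
        above = unname(y["upper"] - y["point"])))
```

Because the back-transformation is monotonic, the endpoints keep their order, but the distance above the point forecast becomes larger than the distance below it.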


Back-transformation

fit <- snaive(elec, lambda = 1/3)
autoplot(fit)

[Figure: "Forecasts from Seasonal naive method"; elec against Time, 1960 to 1990s, with 80% and 95% prediction interval levels.]


Back-transformation

autoplot(fit, include=120)

[Figure: the same forecasts restricted to the last 120 observations; elec against Time, with 80% and 95% prediction interval levels.]



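Back-transformation

Back-transformed point forecasts are medians (assuming the forecast distribution on the transformed scale is symmetric).
Back-transformed PI have the correct coverage.

Back-transformed means

Let $X$ have mean $\mu$ and variance $\sigma^2$.

Let $f(x)$ be the back-transformation function, and $Y = f(X)$.

$$\text{E}[Y] = \text{E}[f(X)] \approx f(\mu) + \frac{1}{2}\sigma^2 f''(\mu).$$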
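The approximation for the back-transformed mean follows from a second-order Taylor expansion of $f$ about $\mu$. The derivation below is a sketch that assumes $f$ is twice differentiable and that higher-order terms can be neglected.

```latex
\begin{align*}
f(X) &\approx f(\mu) + (X - \mu)\, f'(\mu) + \tfrac{1}{2}(X - \mu)^2 f''(\mu), \\
\text{E}[f(X)] &\approx f(\mu) + \text{E}[X - \mu]\, f'(\mu)
                 + \tfrac{1}{2}\,\text{E}\big[(X - \mu)^2\big] f''(\mu) \\
               &= f(\mu) + \tfrac{1}{2}\sigma^2 f''(\mu),
\end{align*}
```

since $\text{E}[X - \mu] = 0$ and $\text{E}[(X - \mu)^2] = \sigma^2$. The second derivative enters linearly, so the adjustment is positive when $f$ is convex at $\mu$.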




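Back-transformation

Box-Cox back-transformation:

$$y_t = \begin{cases} \exp(w_t), & \lambda = 0; \\ (\lambda w_t + 1)^{1/\lambda}, & \lambda \ne 0. \end{cases}$$

$$f(x) = \begin{cases} e^x, & \lambda = 0; \\ (\lambda x + 1)^{1/\lambda}, & \lambda \ne 0. \end{cases}$$

$$f''(x) = \begin{cases} e^x, & \lambda = 0; \\ (1 - \lambda)(\lambda x + 1)^{1/\lambda - 2}, & \lambda \ne 0. \end{cases}$$

$$\text{E}[Y] \approx \begin{cases} e^\mu \left[1 + \dfrac{\sigma^2}{2}\right], & \lambda = 0; \\ (\lambda\mu + 1)^{1/\lambda} \left[1 + \dfrac{\sigma^2 (1 - \lambda)}{2(\lambda\mu + 1)^2}\right], & \lambda \ne 0. \end{cases}$$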
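The size of the adjustment can be explored with a small base R sketch. The helper name `bias_adjusted_mean` and the values of `mu` and `sigma2` are assumptions for illustration; this is not the forecast package's internal code for `biasadj=TRUE`. For $\lambda = 0$ the comparison additionally assumes $X$ is normal, in which case $Y$ is log-normal with exact mean $e^{\mu + \sigma^2/2}$.

```r
# Hypothetical helper implementing the approximate mean formula given above.
bias_adjusted_mean <- function(mu, sigma2, lambda) {
  if (lambda == 0) {
    exp(mu) * (1 + sigma2 / 2)
  } else {
    (lambda * mu + 1)^(1 / lambda) *
      (1 + sigma2 * (1 - lambda) / (2 * (lambda * mu + 1)^2))
  }
}

mu <- 2         # illustrative mean on the transformed scale
sigma2 <- 0.04  # illustrative variance on the transformed scale

# Log case (lambda = 0): back-transformed median, approximate mean, exact mean.
median_log      <- exp(mu)
mean_log_approx <- bias_adjusted_mean(mu, sigma2, 0)
mean_log_exact  <- exp(mu + sigma2 / 2)  # exact only under the normality assumption
print(c(median = median_log, approx = mean_log_approx, exact = mean_log_exact))

# With lambda = 1 the transformation is just a shift, so no adjustment is made.
mean_shift <- bias_adjusted_mean(mu, sigma2, 1)
print(mean_shift)
```

The approximate and exact means agree closely when $\sigma^2$ is small, and both exceed the back-transformed median.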




Back-transformation

elec %>% snaive(lambda = 1/3, biasadj = FALSE) %>%
  autoplot(include = 120)

[Figure: seasonal naive forecasts of elec without bias adjustment; last 120 observations against Time, with 80% and 95% prediction interval levels.]



Back-transformation

elec %>% snaive(lambda = 1/3, biasadj = TRUE) %>%
  autoplot(include = 120)

[Figure: seasonal naive forecasts of elec with bias adjustment; last 120 observations against Time, with 80% and 95% prediction interval levels.]



Back-transformation

eggs %>% ses(lambda = 1/3, biasadj = FALSE) %>%
  autoplot()

[Figure: "Forecasts from Simple exponential smoothing" for eggs without bias adjustment; series against Time, 1920 to 1980s, with 80% and 95% prediction interval levels.]


Back-transformation

eggs %>% ses(lambda = 1/3, biasadj = TRUE) %>%
  autoplot()

[Figure: "Forecasts from Simple exponential smoothing" for eggs with bias adjustment; series against Time, 1920 to 1980s, with 80% and 95% prediction interval levels.]



Lab Session 9
