1046 Suhartono Statistics Final Paper Suhartono ITS

Embed Size (px)

Citation preview

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    1/10

    The Effect of Decomposition Method as Data Preprocessingon Neural Networks Model for Forecasting

    Trend and Seasonal Time Series

    SuhartonoStatistics Department, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia

    Email:[email protected];[email protected]

    Subanar!athematics Department, "ni#ersitas $ad%ah !ada, &ogyakarta, Indonesia

    Email: [email protected]

    ABSTRACT

    Recently, one of the central topics for the neural networks (NN) community is the

    issue of data preprocessing on the use of NN. In this paper, we will investigate this topic

    particularly on the effect of Decomposition method as data processing and the use of NN

    for modeling effectively time series with both trend and seasonal patterns. imited

    empirical studies on seasonal time series forecasting with neural networks show that some

    find neural networks are able to model seasonality directly and prior deseasonali!ation is

    not necessary, and others conclude "ust the opposite. In this research, we study

    particularly on the effectiveness of data preprocessing, including detrending and

    deseasonali!ation by applying Decomposition method on NN modeling and forecasting

    performance. #e use two kinds of data, simulation and real data. $imulation data are

    e%amined on multiplicative of trend and seasonality patterns. &he results are compared tothose obtained from the classical time series model. 'ur result shows that a combination

    of detrending and deseasonali!ation by applying Decomposition method is the effective

    data preprocessing on the use of NN for forecasting trend and seasonal time series.

    Keywords decomposition, data preprocessing, neural networks, trend, seasonality, time

    series, forecasting.

    1. NT!"D#$T"N

    !any business and economic time series are non'stationary time series that

    contain trend and seasonal #ariations. The trend is the long'term component that

    represents the gro(th or decline in the time series o#er an e)tended period o* time.Seasonality is a periodic and recurrent pattern caused by *actors such as (eather,

    holidays, or repeating promotions. +ccurate *orecasting o* trend and seasonal time

    series is #ery important *or e**ecti#e decisions in retail, marketing, production,

    in#entory control, personnel, and many other business sectors !akridakis and

    -heel(right, /012. Thus, ho( to model and *orecast trend and seasonal time

    mailto:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]:[email protected]
  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    2/10

    S"3+4T5N5 +NDS"6+N+4

    series has long been a ma%or research topic that has signi*icant practical

    implications.There are some *orecasting techni7ues that usually used to *orecast data time

    series (ith trend and seasonality, including additi#e and multiplicati#e methods.

    Those methods are -inter8s e)ponential smoothing, Decomposition, Time series

    regression, and +4I!+ models see e.g. 6o(erman and 589onnel //2 or

    3anke and 4eitsch //22. 4ecently, Neural Net(orks NN2 models are also used

    *or time series *orecasting see e.g. aashoek and ?an Di%k, AA22. Suhartono et al. AA2 did

    comparati#e study o* these methods by using airline data and concluded that there

    (as no best model satis*ies simultaneously in both training and testing data. They

    also recommended the possibility *or doing *urther research by combining some

    methods.

    The aim o* this paper is to de#elop ne( hybrid model by combining

    decomposition method as data preprocessing and NN model *or *orecasting trend

    and seasonal time series. The results are compared to +4I!+ models.

    %. M"DE&N' T!END (ND SE(S"N(& TME SE!ES

    !odeling trend and seasonal time series has been one o* the main research

    endea#ors *or decades. In the early /As, the decomposition model along (ith

    seasonal ad%ustment (as the ma%or research *ocus due to Bersons //, /2

    (ork on decomposing a seasonal time series. 3olt /12 and -inters /=A2

    de#eloped method *or *orecasting trend and seasonal time series based on the

    (eighted e)ponential smoothing. +mong them, the (ork by 6o) and Cenkins/1=2 on the seasonal +4I!+ model has had a ma%or impact on the practical

    applications to seasonal time series modeling. This model has per*ormed (ell in

    many real (orld applications and is still one o* the most (idely used seasonal

    *orecasting methods. !ore recently, NN ha#e been (idely used as a po(er*ul

    alternati#e to traditional time series modeling see e.g. 3ansen and Nelson AA2,

    Nelson et al. ///2, also hang et al. //022. -hile their ability to model

    comple) *unctional patterns in the data has been tested, their capability *or

    modeling seasonal time series is not systematically in#estigated.

    In this section, (e (ill gi#e a brie* re#ie( o* these *orecasting models,

    particularly seasonal +4I!+, decomposition method and NN model.

    %.1. Seasonal (!M( Model

    The seasonal +4I!+ model belongs to a *amily o* *le)ible linear time

    series models that can be used to model many di**erent types o* seasonal as (ell as

    nonseasonal time series. The seasonal +4I!+ model can be e)pressed as see e.g.

    6o) et al. //2, 9ryer /0=2, and -ei //A22:

    2

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    3/10

    The E**ect o* Decomposition !ethod as Data Breprocessing on Neural Net(orks F

    t$

    )*tD$d$

    +p ,,y,,,, 222.2.22 = , 2

    (here $ is the seasonal length, , is the back shi*t operator and t is ase7uence o* (hite noises (ith Gero mean and constant #ariance. 6o) and Cenkins/1=2 proposed a set o* e**ecti#e model building strategies *or seasonal +4I!+

    based on the autocorrelation structures in a time series.

    %.%. Decomposition Method

    The multiplicati#e decomposition model has been *ound to be use*ul (hen

    modeling time series that display increasing or decreasing seasonal #ariation

    6o(erman and 589onnel, //; chapter 12. The key assumption inherent in this

    model is that seasonality can be separated *rom other components o* the series. The

    multiplicati#e decomposition model is

    ttttt I-$&y = 2

    (here

    ty H the obser#ed #alue o* the time series in time period t t& H the trend component in time period t

    t$ H the seasonal component in time period t t- H the cyclical component in time period t tI H the irregular component in time period t .

    %.). Neural Networks Model

    Neural net(orks NN2 are a class o* *le)ible nonlinear models that can

    disco#er patterns adapti#ely *rom the data. Theoretically, it has been sho(n that

    gi#en an appropriate number o* nonlinear processing units, NN can learn *rom

    e)perience and estimate any comple) *unctional relationship (ith high accuracy.

    Empirically, numerous success*ul applications ha#e established their role *or

    pattern recognition and time series *orecasting.

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    4/10

    S"3+4T5N5 +NDS"6+N+4

    (here p is the number o* input nodes, * is the number o* hidden nodes, f is a

    sigmoid trans*er *unction such as the logistic:

    %e

    %f

    +

    =

    .

    .2, . 2

    I,,.,A,J *"" = is a #ector o* (eights *rom the hidden to output nodes andI,,@,.;,,.,A,J *"pii" == are (eights *rom the input to hidden nodes.

    Note that e7uation 2 indicates a linear trans*er *unction is employed in the outputnode.

    Output Layer

    (Dependent Var.)

    InputLayer (Lag Dependent Var.)

    Hidden Layer

    (q unit neurons)

    Figure 1. Architecture of neural network model with single hidden layer

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    5/10

    The E**ect o* Decomposition !ethod as Data Breprocessing on Neural Net(orks F

    The purpose o* this research is to pro#ide empirical e#idence on the

    comparati#e study o* many data preprocessing method in NN model *or *orecastingtrend and seasonal time series. The ma%or research 7uestions (e in#estigate is:

    Does data preprocessing has a great impact on the accuracy o* NN model *or

    *orecasting trend and seasonal time seriesL

    -hich data preprocessing is the most e**ecti#e on NN model *orr *orecasting

    model *or trend and seasonal time seriesL

    -e conduct empirical study (ith simulation and real data, the international

    airline passenger data, to address these 7uestions. This real data has been analyGed

    by many researchers, see *or e)ample Nam and Schae*er //2, 3ill et al. //=2,

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    6/10

    S"3+4T5N5 +NDS"6+N+4

    standard data preprocessing in NN *or the airline data by trans*orm detrend,

    deseasonal, and combination detrend'deseasonal data to NA,2 scale. Theper*ormance o* in'sample *it training data2 and out'sample *orecast testing data2

    is %udged by the commonly used error measures, the mean s7uared error !SE2 and

    ratio !SE to +4I!+ model.

    Figure %. Time series plot of simulation and real data

    ,. EMP!$(& !ES#&TS

    Table summariGes the result o* the impact o* some data preprocessing on

    NN *orecasting and report per*ormance measures across training and testing

    samples *or the simulation data. Numbers greater than one on column ratio indicate

    poorer *orecast per*ormance than comparable +4I!+ model, and #ice #ersa *or

    numbers less than one.

    6

    Testing data

    Training data

    Testing data

    Training data

    Simulation data

    Airline passenger data

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    7/10

    The E**ect o* Decomposition !ethod as Data Breprocessing on Neural Net(orks F

    Table 1. The result of the comparison between preprocessing data for

    FFNN and AR!A models" both in training and testing data"for the simulation data.

    Model andPreprocessing

    IN-SAMPLE !"AININ#$A!A%

    O&!-SAMPLE !ES!IN#$A!A%

    MSE"atio toA"IMA

    MSE"atio toA"IMA

    ARIMA model 0.0234!2 " 0.020"""0 "

    ##$$ model

    ("). %riginal Data a. Model 3&"&"('')

    . Model 3&"0&" (')

    (2). Detrend a. Model 3&2&"('')

    . Model 3&"0&" (')

    (3). Deseasonal

    . Model 3&3&" ('') (')

    (4). Detrend&Deseasonal a. Model 3&&" ('')

    . Model 3&"0&" (')

    0.0"!3"230.00*+03

    0.0"!00+20.00*!"3

    0.!32!

    0.00"00.003444

    0.!3+0.2

    0.!20.2*!

    23.!2

    0.2"+0."

    0.02432+*0.404"0!+

    0.0224""0.0!22*3

    2.*"!+

    0.00*4+44.30+++

    ".2"020.0*

    ".23.*

    "4.!+2

    0.4!22"4.2

    '% , t-e est model in training data (in&sample ore/ast) ''% , t-e est model in testing data (out&sample ore/ast)

    The results o* the impact o* some data preprocessing on NN *orecasting and reportper*ormance measures across training and testing samples *or the airline data are

    summariGed in table .

    Se#eral obser#ations can be made *rom table and .

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    8/10

    S"3+4T5N5 +NDS"6+N+4

    Table %. The result of the comparison between preprocessing data for

    FFNN and AR!A models" both in training and testing data"

    for the airline passenger data#

    Model andPreprocessing

    IN-SAMPLE !"AININ#$A!A%

    O&!-SAMPLE !ES!IN#$A!A%

    MSE"atio toA"IMA

    MSE"atio toA"IMA

    ARIMA model ++.+"+ " "2!.03 "

    ##$$ model and datatransorm to $(0")

    ("). %riginal Data a. Model 3&"&"('')

    . Model 3&"0&" (')

    (2). Detrend a. Model 3&4&"('')

    . Model 3&"0&" (')

    (3). Deseasonal a. Model 3&&" ('')

    . Model 3&"0&" (')

    (4). Detrend&Deseasonal a. Model 3&4&" ('')

    . Model 3&"0&" (')

    *2.+!2*2.3230

    !".002320.200

    2.2444"2.*04!

    3.40+"".3+42

    ".040.2*

    0.!**0.22!

    0.2+40."4

    0.3**0."2+

    "2"*.+"2**.0

    "!2.2!30.3

    42"+."+2*3*.30

    +2.*3"32."!

    0.!**3.4!0

    ".0*3.+!

    2.!2"!.0*

    0.3+2".003

    '% , t-e est model in training data (in&sample ore/ast) ''% , t-e est model in testing data (out&sample ore/ast)

    -. $"N$S"NS

    6ased on the results (e can conclude that the combination detrend'

    deseasonal based on the decomposition method2 as data preprocessing in

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    9/10

    The E**ect o* Decomposition !ethod as Data Breprocessing on Neural Net(orks F

    +tok, 4.!. and Suhartono, AAA. -omparison between Neural Networks, RI/

    o%01enkins and 2%ponential $moothing /ethods for &ime $eries3orecasting,4esearch 4eport, emlit: Institut Teknologi Sepuluh Nopember.

    6o), $.E.B. and $.!. Cenkins, /1=. &ime $eries nalysis 3orecasting and

    -ontrol, San ENT Bublishing 9o.

    9ybenko, $., /0/. O+ppro)imation by superpositions o* a sigmoidal *unctionP,

    /athematics of -ontrol, $ignals and $ystems,%, AQ.

  • 8/13/2019 1046 Suhartono Statistics Final Paper Suhartono ITS

    10/10

    S"3+4T5N5 +NDS"6+N+4

    Nelson, !., T. 3ill, T. 4emus and !. 589onnor, ///. OTime series *orecasting

    using NNs: Should the data be deseasonaliGed *irstLP, 1ournal of 3orecasting,10, pp. /Q=1.

    Bersons, -.!., //. OIndices o* business conditionsP,Review of 2conomics and

    $tatistics1, pp. QA1.

    Bersons, -.!., /. O9orrelation o* time seriesP,1ournal of merican $tatistical

    ssociation,10, pp. QA1.

    Suhartono, Subanar and S. $uritno, AAa. O+ 9omparati#e study o* *orecasting

    models *or trend and seasonal time series: Does comple) model al(ays yield

    better *orecast than simple modelsLP, 1urnal &eknik Industri 1urnal

    8eilmuan dan plikasi &eknik Industri, ?ol. , No. , pp. 'A.

    Suhartono, Subanar and S. 4eGeki, AAb. O