ERASMUS UNIVERSITY ROTTERDAM Erasmus School of …

ERASMUS UNIVERSITY ROTTERDAM

Erasmus School of Economics

Bachelor Thesis Econometrics & Operations Research

Forecasting Inflation With Local Linear Forest

Author: Pawan Rashmi

Student ID number: 510339

Supervisor: Dr.(Andrea) AA Naghi

Second assessor: Vermeulen, S.H.L.C.G

Date: July 4, 2021

Abstract

The stability of price and monetary supply is essential for an economy to thrive. In

order to regulate this, inflation should be adequately predicted by central banks. Recently

machine learning methods were considered, and according to Medeiros et al. (2021) Random

Forest(RF) provided the most accurate forecasts. In order to enhance these results, I applied

Local Linear Forest(LLF). The dataset is obtained from Medeiros et al. (2021) and contains

inflation as derived from Consumer Price Index and Personal Consumption Expenditures.

Furthermore, this dataset contains 122 macroeconomic variables. The results imply that LLF

does not significantly outperform RF. In contrary, occasionally RF significantly outperforms

LLF. Moreover, using LLF with Local Linear split does not increase the performance of LLF

such that it significantly performs better than RF. Finally, combining the forecasts of RF

and LLF only seems to improve the forecasts when the volatility of inflation is low.

The views stated in this thesis are those of the author and not necessarily those of the supervisor, second

assessor, Erasmus School of Economics or Erasmus University Rotterdam.

1

Table of contents

1 Introduction 3

2 Literature review 4

3 Data 5

4 Methodology 6

4.1 Random Forest (RF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

4.2 Local Linear Forest (LLF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

4.3 LASSO regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

4.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

4.5 Combining forecasts of Local Linear Forest and Random Forest . . . . . . . . . . 10

4.6 Variable Importance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

5 Results 11

5.1 LLF (CART split) vs RF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

5.2 LLF(LL) vs LLF(CART) vs RF . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

5.3 Combining Forecasts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

6 Conclusion & Discussion 19

Appendices 22

A Simulation study non-linearity RF vs LLF(CART) 22

B Graph analyses LLF(CART) vs RF for CPI 22

C PCE results 24

D code 27

2

1 Introduction

In 2020 when COVID hit, the governments around the world provided stimuli to give the econ-

omy a boost. The FED(central bank of the USA) increased the monetary supply significantly.

However, the FED needs to keep the inflation rate in check as it is crucial for the stability

of price and interest rates (Montes and Curi, 2015). For this reason, accurate forecasts are

essential and the backbone of economic policy. Recently, the use of big data and machine learn-

ing models is becoming increasingly popular in finance, as these techniques are able to gain

useful insights from large financial-related datasets (Fang and Zhang, 2016). The use of ma-

chine learning techniques in financial applications provides promising results. Chakraborty and

Joseph (2017) argue that generally machine learning models outperform traditional forecasting

methods. Also, Gu et al. (2018) reported that large economic gains could be achieved with the

use of machine learning methods as the rate of improvement is in some cases double compared

to traditional regression-based methods in the literature. Medeiros et al. (2021) implemented

multiple machine learning methods to make predictions about future inflation. They concluded

that Random Forest outperformed traditional methods such as Random Walk, Autoregressive

and Unobserved Components Stochastic Volatility (UCSV) models. Besides traditional meth-

ods, Random Forest also outperformed other machine learning methods such as Artificial Neural

Networks and Boosted Trees. However, according to Friedberg et al. (2020) Random Forest is

not able to exploit smoothness in the regression surface. For this reason, Random Forest is not

able to take advantage of strong local linear trends that might be present in data. Therefore,

Friedberg et al. (2020) proposed the method Local Linear Forest. This method uses Random

Forest as an adaptive kernel method in combination with Ridge regression to adjust for local

linear signals. In this research, I want to assess the performance of this method when forecasting

inflation and inspect whether there are linear smooth signals in the data that this method can

utilize.

All of this brings us to the first research question: ”How much will Local Linear Forest

improve upon Random Forest that is given by Medeiros et al. (2021) with the use of local

linear signals that present are in the data?” Furthermore, Random Forest uses the CART al-

gorithm to perform splits in the tree. Friedberg et al. (2020) proposed a different splitting

method called Local Linear split(LL). This splitting rule avoids using global linear effects in

the construction of the tree such that these linear effects can be modelled adequately during

prediction. Furthermore, Granger and Ramanathan (1984) argue that combining forecasts with

linear combinations can potentially outperform individual forecasts. With this, I also want to

state a second research question: ”How much will the performance of the forecasts increase by

3

using additional techniques such as Local Linear splitting rule or combining forecasts?”

I will now discuss what methods are applied in this research. First of all, Random Forest is

used, because Random Forest provided the best inflation predictions in Medeiros et al. (2021).

To potentially improve the predictions Local Linear Forest will be applied. Furthermore, LASSO

will be used to perform variable selection. Moreover, OLS will be used to combine the forecasts

made by RF and LLF. Additionally, Root Mean Squared Error, Mean Average Error and Median

Average Deviation are used as evaluation metrics. In order to verify whether the differences are

significant the Model Confidence Set(MCS) test will be used. Finally, a variable importance

method will be used. The data that is used to make forecasts is obtained from Medeiros et al.

(2021). This dataset contains 2 inflation metrics that are derived from CPI (Consumer price

index) and PCE (Personal Consumption Expenditures). Both metrics are forecasted. The

dataset contains 122 macroeconomic variables with 672 observations that range from January

1960 to December 2015. Hereby, January 1990 till December 2015 is used for out of sample

forecasting.

This paper is structured as follows. In section 3, the dataset is explained. In section 4, the

machine learning models and the evaluation techniques are given. In 5, the results are presented

that are obtained from the application of the methods. Finally, a general conclusion & discussion

is given in section 6.

2 Literature review

Forecasting inflation is popular in financial literature. Multiple famous models have been applied,

such as the Philips Curve, VAR models and multiple more as described in Faust and Wright

(2013). However, Medeiros et al. (2021) argued that there is no empirical evidence that these

models outperform the benchmark models significantly. This resulted in researchers searching for

more advanced models. As a result, machine learning methods were introduced in the inflation

forecasting literature. The recent rise in computational power also contributed to this (Friedman

et al., 2001). Moreover, machine learning models are more able to work with high dimensional

data (Friedman et al., 2001). Medeiros and Mendes (2016) presented that LASSO-based models

perform better than traditional autoregressive models. The authors; however, only considered

linear models with just 1 step ahead forecasting. Furthermore, recently non-linear models were

also considered, such as Artificial Neural Networks and Random Forest in Medeiros et al. (2021).

In this paper, these non-linear models were compared to the performance of traditional models as

benchmarks such as RW (Random Walk), AR (Autoregressive models) and UCSV (Unobserved

Components with Stochastic Volatility). Also, linear models were considered such as LASSO

4

and adaLASSO. Finally, LASSO and adaLASSO were also used to perform dimension reduction

for Random Forest. Out of these models, Random Forest performed the best. Indicating that

the data generating process of inflation with relation to the macroeconomic variables is non-

linear. However, the relation could contain smooth linear signals that Random Forest is unable

to utilize due to the non-linearity of the method. Therefore, there is a gap in the literature

where a method is considered that can utilize non-linear as well as linear signals to provide

enhanced forecasts. Subsequently, the contribution of this research is to perform the method

Local Linear Forest as described by Friedberg et al. (2020). This method will be able to analyse

whether there is a linear relationship in forecasting inflation with macroeconomic variables in

combination with a non-linear relationship.

3 Data

The dataset is obtained from Medeiros et al. (2021). This dataset contains 122 macroeconomic

variables with 672 monthly observations that range from January 1960 to December 2015. All

variables are transformed in order to achieve stationary. This is further described in the sup-

plementary materials of Medeiros et al. (2021), along with the definition and explanation of the

macroeconomic variables. Two variables are used to derive inflation namely, Consumer Price

Index (CPI) and Personal Consumption Expenditures(PCE). Both of these metrics will be fore-

casted. Inflation is as follows calculated from these two metrics: πt = log(Pt)− log(Pt−1), Pt is

a price index in period t. Also according to Medeiros et al. (2021) CPI and PCE contained a

large outlier in 2008 associated with a rapid decline in oil prices. This outlier is removed. The

forecasting sample ranges from January 1990 to December 2015. This sample is divided into

two subsamples namely January 1990 - December 2000 and January 2001 - December 2015 with

respectively 132 and 180 observations. The distinction is made because inflation as measured

by CPI and PCE is more volatile in the second subsample compared to the first subsample. In

Figure 1 inflation is presented in terms of CPI and PCE for the entire forecasting sample.

5

Figure 1: Comparison inflation predictions of RF and LLF over time for PCE

In this graph we can observe that inflation derived from PCE is more volatile compared to CPI.

Also in general we can state that both inflation metrics are more stable and persistent in the

first subsample compared to the second subsample. Inflation is especially unstable during the

periods of 1990, 2000 and 2008 this is due to the recessions that occurred.

The dataset that is used to feed the methods will also contain 4 principal components and

4 lags of all variables. In addition 4 autoregressive terms are added.

4 Methodology

In this section, the methods will be discussed that are used for forecasting inflation for 12 different

time horizons, namely 1 to 12 months. Furthermore, 3 evaluation metrics will be used to assess

the performance, namely Root Mean Squared Prediction Error (MSPE), Mean Absolute Error

(MAE) and Median Absolute Deviation (MAD). In order to test for significant differences in

performance for the different forecasting models the Model Confidence Set(MCS) will be used.

Also, a variable importance measurement is presented. Moreover, all predictions will be made

using a moving window. The length of the moving window differs for the two subsamples. The

first and second subsample contains 360−h−p−1 and 492−h−p−1 observations respectively.

Here h is the horizon and p is the number of lags. The code that is used to implement the

methods is given in GitHub (2021).

In Section 5 the results are given as ratio, hereby the results of each method are divided by

the results of the Random Walk Model. The Random Walk Model is defined as πt+h|t = πt. πt

is the true inflation on time t and πt+h|t is the forecasted inflation for time t+ h, h refers to the

horizon.

6

4.1 Random Forest (RF)

Random Forest builds multiple regression trees using subsampling and Bagging. First I will

explain what regression trees are. According to Friedman et al. (2001) regression trees split the

predictor space into J distinct regions that are non-overlapping. These regions are equivalent

to leaf nodes and the predictor space refers to the features. After constructing a regression tree

a test instance is predicted by taking the average of every training instance that is in the same

leaf node. The predictor space is split using the CART procedure that was first introduced

by Breiman et al. (1984). In this procedure, the split is chosen such that the sum of squared

errors in the child nodes C1 and C2 of parent node P is minimum. The errors are defined as

the difference between the response value Yi of training instance i and the mean of all response

values of each training instance in the corresponding child node as given by equation (1).

∑i:Xi∈C1

(Yi − Y1

)2+

∑i:Xi∈C2

(Yi − Y2

)2(1)

According to Friedman et al. (2001), regression trees suffer from high variance, outliers and

overfitting. For this reason, an extension is used of decision trees, namely Random Forest.

Random Forest uses Bagging (Breiman, 1996) and the Random Subspace method (Ho, 1998).

Bagging constructs multiple trees based on new training sets that are created by means of

bootstrapping, after which the average is taken over all the constructed trees. The Random

Subspace method randomly selects a subset of features that is used for splitting in order to

construct a tree. In a nutshell, this method reduces the correlation between the constructed

trees; hence Bagging and the Random Subspace method shrink the variance of the final result.

Implementation Random Forest will be implemented according to Medeiros et al. (2021)

with the RandomForest package in R.

4.2 Local Linear Forest (LLF)

The lack of smoothness in the regression surface is especially harmful to the performance of

regression trees according to Friedman et al. (2001). However, Friedberg et al. (2020) argued

that Local Linear Forest can exploit smooth signals in the prediction surface. To explain how

LLF works I am first going to explain how this method is trained. LLF is trained the same

way as RF using CART. However, LLF uses ”honesty”. According to Wager and Athey (2018)

procedure 1, honest trees are created by dividing the training set into two samples. The first

sample is used to define the structure of the tree by performing splits. The second sample is

used to define a neighborhood of a test instance. Hence, represents training instances that are

7

in the same leaf as the test instance. The dependent variable in the second subsample is not

used for splitting; however, the covariates can be used. Friedberg et al. (2020) argued that the

benefit of honesty is that it can control for overfitting.

During the construction of a tree, the Local Linear split technique could be used instead

of the splitting technique CART that is described in section 4.1. According to Friedberg et al.

(2020) splitting based on local regression works as follows. First, a Ridge regression is run in

parent node P to predict Yi from Xi. After which the estimated Yi is determined from XTi as

given in equation (2)

Yi = αP +XTi βP . (2)

Here α is a constant and β is a coefficient. After running the regression, we calculate the

difference between Yi and Yi as error. We then apply a CART split on the errors. In this

way, the different amount of errors; hence, small and large errors will be grouped separately.

According to Friedberg et al. (2020) the benefit of Local Linear split is that the global effects

will not be utilized during the construction of the tree; hence, these linear effects will be reserved

for prediction.

I will now explain how Local Linear Forest is used to predict instances. LLF uses RF as

an adaptive weight generator in order to make predictions. These forest weights are given in

equation (3)

αi (x0) =1

B

B∑b=1

1 {Xi ∈ Lb (x0)}|Lb (x0)|

. (3)

Here B refers to the amount of trees in a forest, L refers to a leaf node and x0 is a test instance.

The more often training instance Xi appears in the same leaf node as the test instance across

all trees in the forest the larger the weight will be. By construction it holds that 0 ≤ αi(x0) ≤ 1.

These weights are then used to minimize the errors of the following regression:

Y = µ(x0) + (X − x0)θ(x0) + ε. (4)

In this regression, Y is a vector that contains the dependent values for all training instances.

(X − x0) refers to the difference between the training instances and the test instance. The

parameters are µ, θ and represent a constant and a coefficient respectively. The optimal values

for these parameters are obtained by minimizing the errors. For this the definition in equation

(5) is used µ (x0)

θ (x0)

= argminµ,θ

{n∑i=1

αi(x0) (Yi − µ (x0)− (Xi − x0) θ (x0))2 + λ ‖θ (x0)‖22

}. (5)

In this equation, the magnitude of the errors is assessed by the pre-determined weights αi(x0).

The error for every training instance is the same as ei = Yi − µ(x0) − (Xi − x0)θ(x0). If the

8

training instance is often in the same leaf as the test instance that will cause the errors to weigh

more than if it is not the case, this results in a local linear correction. Furthermore, to prevent

overfitting a Ridge penalty is applied. The magnitude of the penalty is given by parameter

λ. According to Sarle (1996) a Ridge regression can prevent overfitting. Hence, it prevents

overfitting to the local linear trend. However, the optimal value for this penalty needs to be

tuned.

Implementation The variable selection method LASSO will be used to select a subset of

covariates that will perform the local linear correction and Local Linear split, this is because the

amount of variables in the dataset is quite large. LASSO is explained in section 4.3. Furthermore,

the optimal lambda parameter that is used for Ridge regression during prediction is determined

as follows. In every step, the prediction will be computed for the lambdas in the set

{0, 0.001, 0.01 ,0.1 ,1}. The error that relates to each prediction will be tracked, the parameter

value that results in the least amount of absolute errors over a moving window of 20 steps will

be used as a parameter to predict inflation. To initialise this process 20 predictions will be made

before the start. Finally, this method will be implemented using the grf R package.

4.3 LASSO regression

Usually, all features can be used in LLF. However, using feature selection on a high dimensional

dataset can significantly reduce errors and decrease computational time (Friedberg et al., 2020).

Because the dimensionality of the dataset used is large, LASSO will be applied to perform

feature selection. According to Fonti and Belitser (2017) LASSO performs an ordinary least

squares regression that penalized the sum of the absolute values of the coefficients. The goal of

this regression is to minimize the errors. During this process, LASSO shrinks the coefficients of

the non-relevant features to zero. The features of which the coefficients are larger than zero are

then selected.

Implementation The features that are selected by LASSO are used to determine Local Lin-

ear Forest predictions and perform Local Linear split. This method is implemented using the

HDeconometrics package in R. The built in lambda tuning is used from this package, this lambda

tuning is built to be used on time series data.

4.4 Evaluation

In order to evaluate the performance of the forecasts, I will use root mean squared prediction

error (RMSE), mean absolute error (MAE) and median absolute deviation (MAD). The formulas

9

are given respectively in equation (6),(7) and (8)

RMSEm,h =

√√√√ 1

T − T0 + 1

T∑t=T0

e2t,m,h (6)

MAEm,h =1

T − T0 + 1

T∑t=T0

|et,m,h| (7)

MADm,h = median [|et,m,h −median (et,m,h)|] . (8)

Hereby et,m,h is defined as et,m,h = πt − πt,m,h, where πt is the true inflation and πt,m,h is the

forecasted inflation made by model m with information t−h. T−T0 is the amount of predictions

made. According to Medeiros et al. (2021) RMSE & MAE are commonly used in the forecasting

literature. Additionally, both MAE and MAD are more robust estimates of errors, meaning that

all errors are penalized according to their magnitude. RMSE in contrast penalizes larger errors

more than smaller errors.

In order to determine whether the performance of distinct models are significantly different,

I will use the Model Confidence Set test described in Hansen et al. (2011). Bernardi and

Catania (2018) state ”this test consists of a sequence of statistical tests that construct a set of

superior models”. In this test the null hypothesis of equal predictability is not rejected at certain

significance level α. The errors of each model will be given as input. However, a transformation

function needs to be applied to the errors. I choose the square and absolute errors as loss

functions. Finally, this method will be implemented using the MCS R package.

4.5 Combining forecasts of Local Linear Forest and Random Forest

Combining the forecasts made by LLF and RF could potentially lead to enhanced results. There

are multiple methods that could be applied to combine forecasts. I will try different methods

with the ForecastComb package in R. This package implements 15 different methods and chooses

the method that has the lowest RMSE in the training set. A detailed explanation of each method

can be found in the documentation of the ForecastComb package. However, I observed that only

Ordinary Least Squares(OLS) was considered for combining forecasts. Therefore, I will give a

brief explanation about this method.

Using OLS for combining forecasts was first introduced by Granger and Ramanathan (1984).

The OLS regression that is used is given in equation (9). In this equation y refers to real values,

α represents a constant, wi refers to the weights given to the forecast of model i, fit are the

forecasts of model i and εt refers to the error.

yt = α+ wLLF fLLFt + wRF fRFt + εt (9)

10

The weights are determined using a moving window of 30 observations, after which the weights

are used to make a one step ahead combined forecast.

4.6 Variable Importance

To understand how the models perform we need to understand which variables contribute the

most to the performance of the model. For this, I will use the variable importance measurement

that is described in Fisher et al. (2019). The intuition for this method is that the change in

performance is measured after removing an explanatory variable. If the performance loss is large

then this will mean that the removed variables is important. Hence, the loss in performance

after removal of a variable is proportional to the importance of that variable. To implement this

method the DALEX R package is used. Also, RMSE is used as a loss function to determine the

performance loss.

5 Results

In this section, the performance of the different methods will be presented. The methods used

are: RF, LLF with CART split, LLF with Local Linear split and the forecast combing technique.

The forecasts for inflation are made over 2 subsamples that range from 1990 M01 to 2000 M12

and 2001 M01 to 2015M12. At first, a comparison will be made between the replicated RF,

RF as obtained from Medeiros et al. (2021) and LLF with CART split. This is done for the

entire forecast sample as well as the two subsamples separately. After which LLF with Local

Linear split is compared to LLF with CART split as well as the replicated RF. For this, the

forecasts for both subsamples will be assessed jointly. Finally, I will analyse if combining the

forecast made by the replicated RF and LLF with CART split will yield performance gains. For

this, the 2 subsamples will be analysed separately. Furthermore, I replicate the results of Table

2 in Medeiros et al. (2021), and Table S.12, S.13 and S.19 of the supplementary materials of

Medeiros et al. (2021).

Moreover, the error metrics such as RMSE, MAE and MAD that each model obtains are

divided by the error metrics of the Random Walk model; therefore, is given as ratio in each

table. Furthermore, only forecasting inflation in terms of CPI is presented in this section, the

results for PCE are given in appendix C, this is in line with Medeiros et al. (2021). Moreover,

the significance level for the MCS test is 0.10.

Now I will define how each model is referred to: LLF with CART split, LLF with Local

Linear split, RF that is replicated and RF results that are obtained from Medeiros et al. (2021)

are respectively referred to as LLF(CART), LLF(LL), RF(rep) and RF(med).

11

5.1 LLF (CART split) vs RF

In Table 1 the comparison is presented of inflation forecasts made by RF(rep), LLF(CART) and

the RF(med). The forecasts are made over the entire forecast sample for 12 horizons.

Table 1: Summary performance measures forecasting CPI

CPI 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

RF - rep 0.83 0.74 0.76 0.74 0.72 0.73 0.73 0.73 0.76 0.78 0.80 0.70

RF - med 0.84 0.73 0.71 0.74 0.71 0.72 0.72 0.71 0.72 0.76 0.77 0.68

LLF 0.79 0.78 0.76 0.77 0.72 0.72 0.73 0.73 0.77 0.81 0.81 0.70

MAE ratio

RF - rep 0.81 0.73 0.75 0.75 0.74 0.74 0.72 0.70 0.75 0.77 0.80 0.69

RF - med 0.81 0.72 0.71 0.75 0.73 0.73 0.70 0.68 0.72 0.75 0.77 0.67

LLF 0.78 0.75 0.74 0.77 0.75 0.74 0.73 0.70 0.77 0.81 0.82 0.69

MAD ratio

RF - rep 0.72 0.66 0.76 0.79 0.70 0.73 0.60 0.59 0.69 0.67 0.69 0.56

RF - med 0.70 0.63 0.77 0.84 0.75 0.73 0.65 0.64 0.73 0.69 0.71 0.58

LLF 0.77 0.72 0.71 0.84 0.73 0.69 0.65 0.60 0.74 0.80 0.72 0.63

MCS RF - rep vs LLF

sq 0.52 0.14 0.53 0.22 0.44 0.62 0.58 0.73 0.19 0.24 0.31 0.85

abs 0.83 0.39 0.89 0.17 0.51 0.87 0.51 0.98 0.04 0.05 0.35 0.86The lowest values as well as significant p values for MCS are in bold

From this table, we can observe that LLF(CART) performs better for the first horizon because

MAE and RMSE are lower than that of RF(med) and RF(ref). However, after the first horizon

RF(med) and RF(ref) outperform LLF(CART) in terms of RMSE and MAE. For horizon 3 and

6 we observe that LLF(CART) has lower MAD values. From the MCS test it follows that the

differences between the performance of LLF(CART) and RF are small, as the difference is only

significant for horizon 9 and 10. Hence, RF(rep) outperforms LLF(CART) only for horizon 9

and 10. This might indicate that RF generally performs better than LLF for larger horizons.

This is in line with the results of Medeiros et al. (2021) because they state that the performance

of RF becomes more evident for longer horizons. This could be because RF is good at modelling

non-linearity and longer horizons introduce more of it.

12

Table 2: measures forecasting CPI with RF & LLF for subsample 1

CPI 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

RF - rep 0.79 0.78 0.85 0.77 0.74 0.76 0.76 0.77 0.82 0.82 0.85 0.71

RF - med 0.79 0.78 0.85 0.77 0.73 0.76 0.76 0.77 0.82 0.82 0.85 0.72

LLF 0.77 0.83 0.85 0.82 0.76 0.75 0.78 0.77 0.83 0.87 0.86 0.73

MAE ratio

RF - rep 0.82 0.78 0.88 0.77 0.76 0.79 0.79 0.75 0.86 0.86 0.90 0.75

RF - med 0.82 0.78 0.88 0.77 0.76 0.79 0.78 0.75 0.86 0.86 0.89 0.76

LLF 0.78 0.82 0.86 0.80 0.78 0.78 0.81 0.76 0.88 0.93 0.93 0.77

MAD ratio

RF - rep 0.68 0.65 0.67 0.60 0.61 0.68 0.63 0.56 0.64 0.72 0.70 0.65

LLF 0.70 0.77 0.72 0.74 0.67 0.69 0.80 0.61 0.76 0.98 0.77 0.73

MCS RF - rep vs LLF

sq 0.58 0.06 0.74 0.01 0.13 0.62 0.40 0.92 0.63 0.04 0.65 0.66


In Table 2 the results for inflation forecasting over subsample 1 is given. From this we can

again observe that LLF(CART) has lower RMSE for the first horizon, after which RF(med)

seems to perform better. For the MAE performance metric, LLF has lower values for horizon 1,

3 and 6. Hence, LLF(CART) seems to be performing better for the MAE performance metric

compared to RMSE. From the MCS test with absolute loss function the difference between LLF

and RF(rep) is only significantly different for horizon 10, whereas the difference according to the

square loss function is significantly different for horizon 2, 4 and 10. Hence, RF(rep) performs

more often significantly better than LLF(CART) for the square loss function compared to the

absolute loss function. Furthermore, according to MAD, RF(rep) performs better because it

obtains constantly lower values.

13

Table 3: measures forecasting CPI with RF & LLF for subsample 2

CPI 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

RF - rep 0.86 0.72 0.69 0.73 0.71 0.70 0.71 0.71 0.71 0.76 0.76 0.68

RF - med 0.86 0.72 0.69 0.73 0.71 0.71 0.71 0.70 0.71 0.75 0.76 0.68

LLF 0.89 0.73 0.70 0.74 0.69 0.70 0.70 0.71 0.73 0.77 0.78 0.68

MAE ratio

RF - rep 0.81 0.70 0.66 0.73 0.72 0.70 0.67 0.67 0.67 0.71 0.73 0.64

RF - med 0.81 0.70 0.66 0.74 0.71 0.70 0.67 0.66 0.67 0.70 0.72 0.63

LLF 0.84 0.69 0.66 0.75 0.72 0.70 0.67 0.66 0.70 0.73 0.74 0.64

MAD ratio

RF - rep 0.75 0.67 0.82 0.92 0.77 0.76 0.58 0.62 0.72 0.64 0.69 0.50

LLF 0.84 0.68 0.70 0.92 0.78 0.68 0.55 0.58 0.72 0.66 0.69 0.56

MCS RF - rep vs LLF

sq 0.37 0.40 0.55 0.56 0.24 0.68 0.44 0.70 0.23 0.48 0.38 0.57

abs 0.47 0.97 0.83 0.36 0.85 0.96 0.71 0.76 0.05 0.47 0.58 0.84

The lowest values as well as significant p values for MCS are in bold

In Table 3 the results are given for inflation forecasting over the second subsample. We can

observe that LLF(CART) obtains a lower RMSE ratio than RF(med) and RF(rep) for horizon

5 and 7. However, as opposed to the first subsample LLF(CART) does not obtain the lowest

values for RMSE and MAE for the first horizon. Furthermore, from the MCS(abs) test it follows

that RF(rep) only performs significantly better than LLF for horizon 9. For the other horizons

the null hypothesis of equal predictability is not refuted, meaning that LLF and RF(rep) does

not perform significantly different from each other. However, the p-values for the MCS(abs)

seems to be higher compared to MCS(sq) with exception of horizon 4, 9 and 10. Indicating that

LLF(CART) suffers from larger mistakes in comparison with RF(rep). Further analysis of the

differences between LLF(CART) and RF are given in appendix B. In summary, for subsample

1 LLF(CART) performed more often worse than RF compared to subsample 2. Hence; we can

state that the differences in performance between LLF(CART) and RF decreased when more

volatility was present, this especially holds for smaller horizons.

Furthermore, in order to analyse which variables are responsible for the performance of

LLF(CART) and RF the variable importance method is applied. The results for RF and

LLF(CART) are given respectively in Figure 2 and 3. In these figures the variable name with

14

the lag in parenthesis is given on the y-axis and the loss in performance measured in RMSE

compared to the full model is given on the x-axis.

Figure 2: 10 most important variables for RF for horizon 9

From this plot we can observe that 3 of the 10 most important variables are autoregres-

sive terms. Moreover, PCE seems to be the most important measure for forecasting infla-

tion(CPI). AMDMUOx(unfilled orders for durable goods), CU$R0000$AC(CPI : Commodities)

and CUSR0000SAOL5(CPI: all items less medical care) also seem to be important.

Figure 3: 10 most important variables for LLF(CART) for horizon 9

However, for LLF(CART) PCE is not the most important variable anymore. The 4th lag of

HOUSTMW(Housing starts, Midwest) is the most important variable followed by the second lag

of HOUSTMW. CES0600000007 (Avg Weekly Hours : Goods-Producing) is the only other vari-

able that is important1 for LLF but not for RF. This could potentially mean that HOUSTMW

and CES0600000007 are responsible for local linear signals that LLF(CART) is able to utilize.

Nevertheless, these linear signals are not strong enough for LLF to outperform RF, in contrast

LLF(CART) underperforms RF. The most important variable for RF is PCE. Medeiros et al.

(2021) stated that PCE has a strong non-linear relationship to other macroeconomic variables.

1important in this context is defined as the top 10 most important variables

15

This might indicate that the LLF(CART) underperforms RF because LLF(CART) is unable

to model the non-linearity that PCE inhabits. In more general terms we might deduce that

LLF(CART) is not as proficient as RF in modelling non-linearity that is present in the data.

This statement is supported by the simulation that is performed in appendix A. From the results

of this simulation I found that RF is indeed better at modelling non-linearity in a large dataset.

Another relevant finding was that the ability of LLF(CART) to model non-linearity especially

decreases when the dataset is large.

5.2 LLF(LL) vs LLF(CART) vs RF

Furthermore, I will analyse whether Local Linear split yields performance gains over LLF with

CART split and RF(rep). For this the entire forecasting sample is considered. The results are

given in Table 4. From this table, it is observed that the evaluation metrics of LLF(CART)

and LLF(LL) seems to be identical. However, one could observe that MAE and MAD are

slightly lower for horizon 9, 10 and 11. For RSME this holds for horizon 9 and 11. These

differences are significant according to the MCS(sq) and MCS(abs) test. This might indicate

that LLF with Local Linear split enhances the performance of LLF with CART split for higher

horizons. Nevertheless, The differences between LLF(LL) and RF(rep) are not significantly

different. However, from Table 1 it was observed that LLF(CART) performs significantly worse

compared to RF(rep) for larger horizons, this is not the case anymore when Local Linear split

is applied.

Table 4: Results inflation(CPI) forecasting with LLF(LL)

CPI 1 2 3 4 5 6 7 8 9 10 11 12

RMSE 0.84 0.78 0.77 0.77 0.72 0.72 0.73 0.73 0.76 0.81 0.81 0.70

MAE 0.81 0.75 0.75 0.78 0.74 0.73 0.73 0.70 0.76 0.80 0.81 0.70

MAD 0.78 0.71 0.72 0.82 0.73 0.68 0.63 0.59 0.73 0.77 0.72 0.63

CART(LLF) vs LL(LLF)

sq 0.95 0.39 0.16 0.95 0.32 0.84 0.17 0.80 0.05 0.23 0.07 0.88

abs 0.48 0.76 0.21 0.73 0.23 0.63 0.37 0.72 0.01 0.08 0.07 0.37

MCS CART(RF) vs LL(LLF)

sq 0.49 0.13 0.19 0.19 0.36 0.61 0.74 0.68 0.46 0.36 0.46 0.91

abs 0.89 0.33 0.55 0.10 0.88 0.73 0.37 0.89 0.14 0.14 0.63 0.57The Lowest values as well as significant p values for MCS are in bold

In section 4.2 it is described that Local Linear split avoids splitting on global linear effects.

Hence, if variables are discovered that LLF(CART) frequently splits upon but not LLF(LL).

16

Then these variables could potentially contribute to the performance gain of LLF(LL). For this

I will analyse the variables chosen for the splits for horizon 9. This is because the performance

gain of LLF(LL) is the most evident for horizon 9 as the p-values are the lowest. The variables

that LLF(LL) does not frequently 2 split upon as compared to LLF(CART) are: CPI(All Items

Less Food), Total Business Inventories and PCE. Hence, these variables could potentially aid

the performance gain of LLF(LL) for longer horizons. In order to gain a deeper understanding

the variable importance plot is analysed that is given in Figure 4.

Figure 4: 10 most important variables for LLF(LL) for horizon 9

This plot seems to be identical to the variable importance plot of LLF(CART) because the top

10 most important variables are almost identical. HOUSTMW still takes first and second place,

the third lag of HOUSTMW also appears in the top 10. Hence, 3 of the 10 most important

variables are lags of HOUSTMW. CES0600000007 is now in 3rd place up from the 5th. RMSE

loss of these variables are also higher. Hence, variables that I suspected have strong linear signals

became even more important after applying LLF with Local Linear split. This could potentially

mean that LLF with Local Linear Split is more able to extract the global signals in the data

compared to CART split. The utilization of these strong linear effects seems to compensate

for the lack in non-linearity that LLF is able model, as LLF(LL) does not underperform RF

anymore for longer horizons.

In summary, variables such as CPI(All Items Less Food), Total Business Inventories and

PCE potentially improves the results. However, the performance gains due to HOUSTMW and

CES0600000007 seems to be more important.

2Frequently in this context is defined as more than 20000 splits cumulative over the entire sample for 3 depths

of the tree.

17

5.3 Combining Forecasts

Finally, the results are presented that are obtained from combining the forecast made by the

models LLF(CART) and RF. For this both subsamples are analysed separately. The results

of the first subsample are given in Table 5. We can observe that the RMSE of the combined

forecast is lower than the RMSE of LLF and RF in horizon 1, 2, 3 and 11. Furthermore,

for MAE the values of the combined forecasts are lower for every horizon except horizon 10.

This indicates that the combined forecast makes fewer smaller errors compared to LLF and

RF. For the MAD ratio RF generally performs the best. Moreover, from the MCS test it is

observed that the combined forecasts significantly outperforms LLF and RF for horizon 1, 2, 7

and 11. For the other horizons the difference in performance is not significantly different. In

the second subsample given in Table 6, the performance gain of the combined forecast is absent.

In contrary it performs significantly worse compared to LLF and RF for all horizons. The

difference in performance of the combined forecasts for subsample 1 in comparison to subsample

2 indicates that the combined forecast performs worse when inflation is more volatile.

Table 5: Comparison Combined Forecast(CPI) for subsample 1

CPI 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

Comb 0.68 0.71 0.88 0.79 0.77 0.83 0.77 0.78 0.85 0.94 0.93 0.76

RF 0.77 0.76 0.88 0.79 0.76 0.83 0.76 0.77 0.84 0.87 0.94 0.75

LLF 0.76 0.81 0.90 0.88 0.82 0.86 0.78 0.78 0.86 0.95 1.03 0.81

MAE ratio

Comb 0.73 0.71 0.85 0.77 0.75 0.82 0.77 0.74 0.88 0.91 0.94 0.79

RF 0.83 0.81 0.89 0.78 0.79 0.86 0.80 0.75 0.90 0.89 0.99 0.83

LLF 0.82 0.86 0.88 0.82 0.82 0.87 0.82 0.77 0.92 0.96 1.07 0.86

MAD ratio

Comb 0.68 0.64 0.74 0.74 0.62 0.83 1.00 0.56 1.01 0.79 0.74 0.65

RF 0.62 0.62 0.64 0.60 0.60 0.74 0.98 0.50 1.02 0.64 0.69 0.65

LLF 0.81 0.80 0.71 0.74 0.64 0.77 1.22 0.57 1.21 0.92 0.81 0.75

MCS RF - Comb

sq 0.01 0.07 0.96 0.99 0.81 0.99 0.56 0.76 0.64 0.17 0.65 0.84

abs 0.01 0.00 0.42 0.86 0.26 0.21 0.40 0.61 0.42 0.71 0.20 0.49

MCS LLF - Comb

sq 0.02 0.00 0.55 0.07 0.24 0.46 0.58 0.89 0.69 0.86 0.04 0.39


18

Table 6: Comparison Combined Forecast(CPI) for subsample 2

CPI 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

Comb 1.19 0.92 0.79 0.86 0.83 0.74 0.73 0.78 0.74 0.80 0.80 0.84

RF 0.87 0.73 0.68 0.71 0.70 0.69 0.70 0.70 0.71 0.76 0.76 0.68

LLF 0.90 0.74 0.69 0.71 0.68 0.68 0.68 0.69 0.72 0.76 0.76 0.67

MAE ratio

Comb 0.94 0.80 0.72 0.81 0.80 0.77 0.70 0.72 0.72 0.78 0.77 0.76

RF 0.82 0.70 0.65 0.71 0.72 0.69 0.65 0.66 0.67 0.71 0.71 0.64

LLF 0.85 0.70 0.65 0.71 0.71 0.67 0.63 0.65 0.68 0.71 0.70 0.63

MAD ratio

Comb 0.84 0.63 0.78 0.67 0.77 0.81 0.63 0.66 0.70 0.76 0.71 0.64

RF 0.80 0.73 0.83 0.76 0.80 0.77 0.57 0.63 0.73 0.66 0.69 0.49

LLF 0.91 0.67 0.72 0.76 0.79 0.67 0.49 0.60 0.65 0.70 0.64 0.56

MCS RF - Comb

sq 0.08 0.10 0.03 0.04 0.12 0.06 0.22 0.24 0.04 0.01 0.15 0.18

abs 0.12 0.08 0.02 0.01 0.08 0.01 0.10 0.08 0.02 0.00 0.05 0.10

MCS LLF - Comb

sq 0.08 0.08 0.02 0.04 0.04 0.01 0.01 0.17 0.17 0.09 0.26 0.17

abs 0.17 0.05 0.01 0.00 0.01 0.00 0.00 0.02 0.12 0.00 0.03 0.08


6 Conclusion & Discussion

The forecasting results over the entire forecasting sample imply that Local Linear Forest(LLF)

and Random Forest(RF) generally do not significantly differ in performance. However, RF

often outperforms LLF for higher horizons. From analysing, horizon 9 and the results of the

simulation, I found that this is because LLF(CART) is not able to model the non-linearity’s that

are present in the data as proficient as RF. All in all, I can state that LLF does not improve

upon the RF results as obtained from Medeiros et al. (2021). With this I have answered the first

research question that is: ”How much will Local Linear Forest improve upon Random Forest

that is given by Medeiros et al. (2021) with the use of local linear signals that present are in

the data.”. Moreover, using Local Linear split as splitting rule as opposed to CART seems

to improve the results of LLF for larger horizons. Subsequently, it follows that RF does not

19

significantly outperform LLF anymore for larger horizons, because LLF with Local Linear split

is more able to extract the linear signals that are present. This compensates for the lack in non-

linearity that LLF is able to model. Furthermore, combing the forecasts LLF with CART split

and RF does improve the forecasts for the first subsample as the performance metrics are lower

than the metrics of LLF and RF. Nevertheless, the combined forecast only performs significantly

better for forecasting horizon 1, 2, 7 and 11. For the second subsample the advantage of the

combined forecast is not present. In contrary the combined forecasts significantly underperforms

RF and LLF for almost all horizons. This could be due to the higher volatility of inflation in

the second subsample. In summary, I can state that Local Linear splitting does improve the

results of LLF for higher horizons and combining forecasts improves upon LLF and RF when

the volatility of inflation is low. With this the second research question is answered that is:

”How much will the performance of the forecasts increase by using additional techniques such

as Local Linear splitting or combining forecasts?”.

For future research, I would suggest to tune the parameters more. LLF uses a Ridge regres-

sion to perform the local linear correction. The parameter for this Ridge regression is optimized

with a moving window of 20 forecasts, the length of this moving window was chosen arbitrarily

and could be tuned. Furthermore, Local Linear split also performs a Ridge regression, the pa-

rameter for this is arbitrary chosen as 1 because optimizing this parameter and the parameter

for Local Linear correction is computationally expensive. Nevertheless, further optimizing this

could potentially yield more performance gains.

References

M. Bernardi and L. Catania. The model confidence set package for r. International Journal of

Computational Economics and Econometrics, 8(2):144–158, 2018.

L. Breiman. Bagging predictors. Machine learning, 24(2):123–140, 1996.

L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen. Classification and regression trees.

CRC press, 1984.

C. Chakraborty and A. Joseph. Machine learning at central banks. 2017.

B. Fang and P. Zhang. Big data in finance. In Big data concepts, theories, and applications,

pages 391–412. Springer, 2016.

J. Faust and J. H. Wright. Forecasting inflation. In Handbook of economic forecasting, volume 2,

pages 2–56. Elsevier, 2013.

20

A. Fisher, C. Rudin, and F. Dominici. All models are wrong, but many are useful: Learning a

variable’s importance by studying an entire class of prediction models simultaneously. Journal

of Machine Learning Research, 20(177):1–81, 2019.

V. Fonti and E. Belitser. Feature selection using lasso. VU Amsterdam Research Paper in

Business Analytics, 30:1–25, 2017.

R. Friedberg, J. Tibshirani, S. Athey, and S. Wager. Local linear forests. Journal of Computa-

tional and Graphical Statistics, pages 1–15, 2020.

J. Friedman, T. Hastie, R. Tibshirani, et al. The elements of statistical learning, volume 1.

Springer series in statistics New York, 2001.

GitHub. Code forecasting inflation with local linear forest. https://github.com/510339pr/

ForecastingInflationWithLocalLinearForest, 2021.

C. W. Granger and R. Ramanathan. Improved methods of combining forecasts. Journal of

forecasting, 3(2):197–204, 1984.

S. Gu, B. Kelly, and D. Xiu. Empirical asset pricing via machine learning. Technical report,

National bureau of economic research, 2018.

P. R. Hansen, A. Lunde, and J. M. Nason. The model confidence set. Econometrica, 79(2):

453–497, 2011.

T. K. Ho. The random subspace method for constructing decision forests. IEEE transactions

on pattern analysis and machine intelligence, 20(8):832–844, 1998.

M. C. Medeiros and E. F. Mendes. 1-regularization of high-dimensional time-series models with

non-gaussian and heteroskedastic errors. Journal of Econometrics, 191(1):255–271, 2016.

M. C. Medeiros, G. F. Vasconcelos, A. Veiga, and E. Zilberman. Forecasting inflation in a data-

rich environment: the benefits of machine learning methods. Journal of Business & Economic

Statistics, 39(1):98–119, 2021.

G. C. Montes and A. Curi. The importance of credibility for the conduct of monetary policy and

inflation control: theoretical model and empirical analysis for brazil under inflation targeting.

Planejamento e Polıticas Publicas, (46), 2015.

W. S. Sarle. Stopped training and other remedies for overfitting. Computing science and statis-

tics, pages 352–360, 1996.

21

https://github.com/510339pr/ForecastingInflationWithLocalLinearForest

https://github.com/510339pr/ForecastingInflationWithLocalLinearForest

S. Wager and S. Athey. Estimation and inference of heterogeneous treatment effects using

random forests. Journal of the American Statistical Association, 113(523):1228–1242, 2018.

Appendices

A Simulation study non-linearity RF vs LLF(CART)

In this simulation study I will analyse the performance of LLF(CART) in contrast to RF. The

data consists of 600 observations with 500 explanatory variables(X1, .., X500) that are generated

from U [0, 1]. The values for the dependent variable will be generated with equation 10, hereby

σ2 = 500. This equation is non-linear; hence, the dependent variable will have a non-linear

relationship with the explanatory variables.

Y = sin (πX1X2) + 20 (X3 − 0.5)2 +X34 +X4

5 + ε ε ∼ N(0, σ2) (10)

From the prediction results of this simulation I found that LLF(CART) obtains an RMSE of

4.79 and RF obtains a RMSE of 3.42. Hence RF perform much better than LLF(CART). This

indicates that RF is indeed better than LLF(CART) in modeling non-linearity’s in the data.

However, if the amount of explanatory variables is decreased to just 10 then the RMSE of

LLF(CART) drops to 3.80, the RMSE of RF slightly increased to 3.54. This implies that LLF

has more difficulty modelling non-linearity in large datasets compared to RF. If the amount

of explanatory variables is reduced significantly then LLF performs almost as good as RF in

extracting non-linearity. Nevertheless, RF is still better in modelling non-linearity. RF in this

simulation is implemented using the grf package, this is done to keep other effects such as

honesty constant.

B Graph analyses LLF(CART) vs RF for CPI

In order to gain a deeper understanding on how RF performs significantly better than LLF(CART)

the forecasts are plot for horizon 10. Horizon 10 is chosen because according to the MCS(sq)

and MCS(abs) test, the difference between LLF and RF is significant for the first subsample.

The graph is given in Figure 5.

22

Figure 5: Comparison inflation(CPI) predictions of RF and LLF for horizon 10

In this figure we observe that the predictions of RF(rep) are more stable in the first subsample

compared to LLF(CART). This stability seems to benefit RF(rep) as it made fewer errors with

respect to RMSE, MAE and MAD. The second subsample however is more volatile and the

advantage that RF(rep) had, seems to be deteriorated as the differences are not significant

anymore.

In contrast to this I will analyse the forecasts made by LLF(CART) & RF(rep) for horizon

1. As LLF seems to produce better forecasts compared to RF for subsample 1. The comparisons

of these forecasts are given in Figure 6.

Figure 6: Comparison inflation(CPI) predictions of RF and LLF for horizon 1

From this figure we can observe that RF is again more stable compared to LLF. However, this

seems to benefit the performance of LLF. Due to the adaptability of the model the forecasts are

more accurate. However, this adaptability advantage quickly deteriorates after the first horizon

and after the first subsample.

23

C PCE results

Table 7: Summary performance measures forecasting PCE

PCE 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

RF - rep 0.85 0.77 0.78 0.78 0.75 0.75 0.74 0.75 0.78 0.77 0.80 0.74

RF - med 0.86 0.76 0.74 0.76 0.73 0.73 0.72 0.72 0.75 0.75 0.78 0.71

LLF 0.84 0.78 0.77 0.79 0.76 0.76 0.76 0.76 0.79 0.78 0.81 0.75

MAE ratio

RF - rep 0.82 0.75 0.76 0.79 0.78 0.77 0.74 0.70 0.75 0.76 0.80 0.72

RF - med 0.82 0.74 0.73 0.77 0.77 0.75 0.72 0.68 0.73 0.74 0.78 0.70

LLF 0.81 0.76 0.75 0.80 0.77 0.76 0.76 0.71 0.77 0.77 0.83 0.75

MAD ratio

RF - rep 0.71 0.71 0.70 0.83 0.75 0.76 0.74 0.60 0.71 0.74 0.75 0.66

LLF 0.76 0.70 0.74 0.88 0.80 0.72 0.76 0.62 0.78 0.77 0.83 0.70

MCS RF - rep vs LLF

sq 0.79 0.28 0.86 0.56 0.28 0.14 0.03 0.08 0.73 0.01 0.13 0.03


Figure 7: Comparison inflation(PCE) predictions of RF and LLF for horizon 10

24

Figure 8: Comparison inflation(PCE) predictions of RF and LLF for horizon 1

Table 8: Results inflation(PCE) forecasting with LLF(LL)

PCE 1 2 3 4 5 6 7 8 9 10 11 12

RMSE 0.87 0.78 0.73 0.77 0.75 0.75 0.74 0.74 0.76 0.78 0.80 0.72

MAE 0.83 0.75 0.72 0.79 0.77 0.75 0.74 0.70 0.75 0.77 0.81 0.72

MAD 0.82 0.77 0.72 0.89 0.86 0.73 0.75 0.68 0.79 0.76 0.81 0.69

CART(LLF) vs LL(LLF)

sq 0.25 0.53 0.53 0.68 0.15 0.45 0.23 0.71 0.88 0.70 0.97 0.08

abs 0.13 0.76 0.86 0.90 0.14 0.41 0.65 0.83 0.83 0.09 0.26 0.55

MCS CART(RF) vs LL(LLF)

sq 0.97 0.32 0.79 0.49 0.11 0.11 0.04 0.08 0.60 0.00 0.16 0.09

abs 0.81 0.55 0.34 0.27 0.56 0.69 0.06 0.11 0.35 0.04 0.02 0.03


25

Table 9: Comparison combined forecast(PCE) for subsample 1

PCE 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

Comb 0.66 0.73 0.89 0.82 0.79 0.92 0.76 0.76 0.87 0.82 0.88 0.80

RF 0.79 0.76 0.88 0.83 0.83 0.93 0.78 0.79 0.88 0.82 0.91 0.86

LLF 0.72 0.74 0.92 0.88 0.86 0.93 0.80 0.79 0.90 0.85 0.93 0.90

MAE ratio

Comb 0.68 0.74 0.91 0.81 0.80 0.90 0.80 0.69 0.86 0.83 0.87 0.82

RF 0.79 0.76 0.91 0.83 0.87 0.94 0.84 0.74 0.88 0.83 0.93 0.90

LLF 0.72 0.73 0.91 0.88 0.87 0.92 0.85 0.76 0.91 0.85 0.94 0.93

MAD ratio

Comb 0.64 0.69 0.95 0.80 0.83 0.93 1.02 0.66 0.93 0.92 0.98 0.80

RF 0.60 0.67 0.77 0.78 0.75 0.85 0.90 0.60 0.76 0.84 0.89 0.84

LLF 0.65 0.70 0.87 0.85 0.86 0.83 0.86 0.69 0.94 0.90 0.93 0.93

MCS RF - Comb

sq 0.00 0.14 1.00 0.71 0.43 0.61 0.42 0.47 0.67 0.78 0.65 0.04

abs 0.00 0.17 0.86 0.28 0.08 0.23 0.21 0.24 0.39 0.46 0.22 0.02

MCS LLF - Comb

sq 0.06 0.54 0.67 0.04 0.05 0.58 0.19 0.29 0.24 0.38 0.34 0.01


26

Table 10: Comparison combined forecast(PCE) for subsample 2

PCE 1 2 3 4 5 6 7 8 9 10 11 12

RMSE ratio

Comb 1.23 1.17 0.76 0.76 0.75 0.74 0.72 0.76 0.76 0.79 0.85 0.74

RF 0.90 0.76 0.71 0.73 0.71 0.70 0.71 0.71 0.74 0.75 0.76 0.70

LLF 0.91 0.77 0.69 0.72 0.70 0.71 0.71 0.70 0.72 0.76 0.76 0.71

MAE ratio

Comb 0.95 0.88 0.71 0.78 0.81 0.75 0.69 0.69 0.74 0.76 0.81 0.71

RF 0.83 0.75 0.66 0.72 0.74 0.70 0.67 0.66 0.71 0.72 0.73 0.64

LLF 0.86 0.75 0.64 0.70 0.70 0.67 0.66 0.65 0.70 0.70 0.75 0.67

MAD ratio

Comb 0.77 0.73 0.71 0.80 0.90 0.75 0.64 0.65 0.77 0.84 0.83 0.70

RF 0.76 0.77 0.69 0.79 0.78 0.72 0.65 0.68 0.78 0.78 0.71 0.57

LLF 0.79 0.74 0.63 0.71 0.71 0.64 0.63 0.65 0.74 0.73 0.76 0.65

MCS RF - Comb

sq 0.11 0.02 0.04 0.04 0.91 0.03 0.03 0.01 0.06 0.03 0.05 0.20

abs 0.07 0.06 0.01 0.19 0.60 0.00 0.00 0.00 0.02 0.00 0.00 0.02

MCS LLF - Comb

sq 0.05 0.30 0.05 0.07 0.71 0.00 0.04 0.06 0.06 0.32 0.25 0.37


D code

The code that is used to implement the methods is given in GitHub (2021), hereby the code

provided by Medeiros et al. (2021) is also used. The repository consists of three files: ”first

sample”, ”second sample” and ”entire sample”. ”first sample” contains the data and code for

the first subsample. This file also contains two files, ”functions” and ”run”. ”functions” contains

the code that computes. ”run” contains the code that executes the functions. ”functions”

contains the following six R documents. ”func-LLF(CART)”: code to run LLF with CART

splitting algorithm; ”func-LLF(LL)”: code to run LLF with Local Linear splitting algorithm;

”func-RF”: code to run RF; ”func-(insert model) WITH VARIMP”: computes predictions of

models with variable importance measurement. ”second sample” follows the same structure as

the first sample only with the second sample data. ”entire sample” contains six r documents.

”combForecasts”: combines forecasts with OLS; ”graphs”: builts the graphs that are given in

this paper; ”Local linear signals”: checks what variable is often split upon; ”SimulationStudy”:

contains the code to replicate the simulation study; ”testing”: contains code that performs the

testing.

27

Documents

ERASMUS UNIVERSITY ROTTERDAM Erasmus School of …