
ORIGINAL ARTICLE

A hybrid conceptual cost estimating model using ANN and GA for power plant projects

Sanaz Tayefeh Hashemi1,2 • Omid Mahdi Ebadati E.3 • Harleen Kaur4

Received: 1 August 2016 / Accepted: 14 August 2017 / Published online: 29 August 2017

© The Natural Computing Applications Forum 2017

Abstract An accurate estimate of cost at completion helps managers decide whether to undertake a project given the cash in hand. MAPNA Group Co., a leading Iranian general contractor of power plant projects, is no exception. Cost prediction in these projects is of great importance because it helps managers keep the overall budget under control. The literature is reviewed and the influencing variables are identified. An artificial neural network model is then developed and combined with a genetic algorithm to select the best network architecture. According to the literature reviewed, almost all previous studies selected the network architecture through a process of trial and error, which makes the present method worthy of implementation. The best network architecture predicts project cost with an accuracy of 94.71%. A sensitivity analysis is then performed to test the significance of the model's input variables.

Keywords MAPNA Group Co. · Artificial neural network (ANN) · Genetic algorithm (GA) · Construction cost · Early-stage cost estimation

1 Introduction

Power plants are industrial sites constructed and operated mainly to generate and distribute electric power. MAPNA Group Co. is a leading Iranian company known as the first and largest general contractor of power plants in the Middle East and West Asia; besides thermal power plants, it also operates in other fields such as oil and gas, railways, wind farms and investment projects. Power plant projects are classified into different categories across the world. The main types of power plant project undertaken by MAPNA Group Co. are: gas turbine power plant (GTPP), combined cycle power plant (CCPP), combined cycle block (2 gas turbines + 1 steam turbine), hydroelectric, and Industrial Special (ISP) power plant projects.

The main concern of construction project managers is to undertake projects by allocating the required resources according to predefined criteria, of which cost is the most important, and to deliver economic facilities while meeting acceptable safety standards [1]. Cost management therefore plays a vital role in the project management process, and early-stage cost estimation is of great importance because of the uncertainties caused by the lack of accurate data at that stage.

Cost estimation is a heavily experience-oriented process in which several matters, such as the relevant influencing factors and their interrelationships, should be considered on the basis of adequate knowledge and expertise [2]. Traditional cost estimation methods in the construction industry have long been recognized as fraught with uncertainty and in need of major improvement in prediction accuracy. Artificial neural networks have addressed this need through their ability to learn from nonlinear, incomplete datasets and to predict novel cases with acceptable accuracy even at the beginning of projects [3].

✉ Harleen Kaur
[email protected]

1 Planning and Coordination Deputy of Power Division, MAPNA Group Co, #231, Mirdamad Blvd., 1918953651 Tehran, Iran

2 Department of Information Technology Management, Kharazmi University, #242 Somayeh St., Between Qarani & Vila, 15936-56311 Tehran, Iran

3 Department of Mathematics and Computer Science, Kharazmi University, #242, Somayeh St., Between Qarani & Vila, 15936-56311 Tehran, Iran

4 Department of Computer Science and Engineering, School of Engineering Sciences and Technology, Hamdard University, New Delhi 110062, India

Neural Comput & Applic (2019) 31:2143–2154

DOI 10.1007/s00521-017-3175-5

Meta-heuristic algorithms such as the genetic algorithm, which is based on simulated evolution, can also enhance the performance of artificial neural networks. Genetic algorithms have shown acceptable improvement in learning tasks as well as in optimization problems [4]. The main objective of this study is to propose a hybrid model, an artificial neural network optimized by a genetic algorithm, for estimating the cost of power plant projects.

This study considers the actual cost of finished power plant projects with respect to their specific characteristics from a financial point of view, rather than civil and construction factors. To the best of the authors' knowledge, existing studies on power plant projects have either implemented ANN alone [2] or compared ANN with conventional regression analysis [5]. The current study is the first in the power plant domain to estimate costs with an ANN whose architecture is obtained by combining it with GA to set each parameter, including the number of hidden layers, the number of nodes per hidden layer, and the corresponding weights and biases. The novelty of this research therefore lies in combining actual cost, viewed financially, with machine learning techniques in power plant projects, where the best ANN topology results from using GA to tune the network parameters.

The rest of the paper is structured as follows. Section 2 concisely reviews the literature on ANN and its hybrid models for predicting the cost of construction projects. Section 3 introduces the methodology and describes in detail the variables, the data gathering and the data preparation needed to apply the proposed method. Section 4 analyses the results of the proposed method and conducts a sensitivity analysis. Finally, Sect. 5 concludes the paper.

2 Related works

Nowadays, almost all businesses rely on large amounts of data, and in some cases predicting future performance from available past data is of great importance [6]. Prediction based on past data is not only crucial to business success but also unavoidable in today's competitive economy, and the construction industry is no exception. As noted by [7], cost estimation in the early stage of a project, when there is neither enough information nor a finalized scope of work, has a major impact on initial decision-making in construction projects. Providing project managers with accurate cost estimates before the project starts helps them consider adequate and appropriate alternatives. As projects progress, the level of accuracy increases because more information becomes available [8].

Conventional methods of predicting project costs are known to suffer from several deficiencies, including an inability to capture complex interrelationships among variables and neglect of inevitable uncertainties, and are therefore incapable of producing a reliable forecast of final cost [9]. In contrast, artificial neural networks, with their successful record in diverse forecasting problems, are among the most accurate and trustworthy models in use. Their ability to learn from incomplete datasets in order to predict unseen data, together with their capability of modeling a problem with little available data and approximating almost all continuous functions, has made them attractive for prediction problems [10].

The forecasting process of a neural network is divided into two parts. In the first, the network is provided with a set of data containing inputs and desired outputs; in the second, it tunes its parameters, including weights and biases, to reach the desired output by minimizing, in each iteration, the difference between the generated output and the desired output, known as the target [11].

ANNs have been widely used in optimization problems, where they offer an alternative to conventional methods. Ye et al. [12] used a specific type of neural network, the projection neural network, to estimate the parameters of multiple-input multiple-output (MIMO) models in predictive control problems. Furthermore, [13] implemented an approach called KDESOINN, a combination of kernel density estimation (KDE) and the self-organizing incremental neural network (SOINN), as a density function estimator for big data, in which the neural network learns from noisy online big data so that the data can be analyzed. In the aircraft industry, [14] compared multiple regression, GM(1,1), and a combination of GM(1,N) and a multi-layer perceptron (MLP) neural network for estimating development costs, of which the latter outperformed the others. In their study, the MLPNN is fed with GM(1,N) performance and simulation data to optimize the forecasting process. Neural networks have also been used in wind power generation [15], where a topographical feedforward neural network is applied to predict wind speed in areas where wind speed has not been measured. The neural network in that study predicted wind speed with an accuracy of 96.6%.

ANNs have also been used specifically in studies related to cost estimation. Cost estimation in different types of projects has been the main subject of many researchers, some of whom are listed below. Cost estimation is the process of applying the required art and technology to approximate the likely worth of a project based on the currently available data [16]. The application of ANN has been investigated in the manufacturing industry [17–23]. In software cost estimation, [24] investigated a novel approach called cuckoo search, inspired by the cuckoo's breeding mechanism, to select the best possible parameters of their cost estimation model.

A number of studies have also been conducted on construction projects. The construction industry is subject to uncertainties, of which cost overruns and delays are the most important consequences [25]. Several studies have therefore sought to predict project costs at an early stage in order to deal with these uncertainties proactively. Some have approached the problem with fuzzy logic [26–28]; others have used a hybrid model of CBR and GA [29], or compared an assembly-based data method with a historical-based data method [30]. Moreover, [31] developed ANN and support vector machine (SVM) models. The work of [32] comprises a backpropagation neural network for predicting the cost of school buildings with two architectures that differ in the number of inputs; the one with more inputs outperforms the other, which indicates the influence of including more significant input parameters on network performance. Furthermore, [33] developed six networks that take distinct completed intervals of construction projects as inputs and the remaining intervals up to project completion as outputs, as a substitute for traditional cost flow forecasting methods. In [34], multiple regression analysis is compared with artificial neural networks to show the superiority of neural networks in estimating construction project cost. In that study, the best architecture of the neural network is determined by trial and error, and the optimum network predicts the cost with a mean absolute percentage error of 16.6%. A comprehensive attempt to investigate the conditions under which artificial neural networks perform better in building projects was made by [35]. In their study, neural networks were examined with different numbers of inputs, various architectures, data transformation, data preparation, and different numbers of datasets. Finally, an analysis of variance (ANOVA) test was undertaken to study the significance of the differences among four input sets.

Kim et al. [36] conducted a comparative study of the advantages and disadvantages of three approaches, namely multiple regression analysis, artificial neural networks, and case-based reasoning (CBR), for estimating the cost of construction projects. They conclude that the last two methods perform much better than the first, while CBR is less time consuming and ANN produces results with smaller error. The best ANN architecture is again set by trial and error. They propose hybrid ANN models, especially a combination with a genetic algorithm, for future research. Gunaydın and Dogan [37] compared regression analysis with a backpropagation neural network (BP ANN) in early-stage cost estimation of structural systems of buildings, where the BP ANN performs better, with an accuracy of around 93%.

In [38], an ANN model was proposed for estimating the cost of highway projects and the associated future escalation, where the best network architecture is determined by trial and error. Their study strongly advocates ANN over the usual methods because of its superior performance. Another study, [11], outlines the dominance of ANN over traditional earned value management (EVM) methods in cost estimation of sample projects. Furthermore, [39] presents an ANN model to predict the cost of installation projects, where the architecture is selected by trial and error, leading to an accuracy of around 80%, which is better than that of traditional methods. In addition, [40] compared a BP ANN model with a regression-based one for building cost estimation; the former, whose best architecture is chosen after several trials, outperforms the latter. Further, [7] developed a BP ANN model to estimate the cost of building projects, in which the best architecture is again selected after examining different cases.

By comparing two types of neural networks, a multi-layer feedforward network and a general regression network, with a traditional cost estimation method, multiple regression analysis, [41] showed that neural networks are more reliable for tunnel construction cost prediction, with a reported accuracy of 95.35%. The work of [3] showed acceptable performance of BP ANN in cost prediction of building projects. Furthermore, [42] developed an ANN model for early cost estimation of building projects in the Gaza Strip with an accuracy of 94%. Roxas and Ongpeng [1] also developed an ANN model for cost estimation of building projects in the Philippines. Moreover, [43] applied BP ANN to estimate the cost of projects, with an accuracy of around 92%. In [44] and [2], ANN models were implemented for cost estimation in construction projects and water treatment plant projects, respectively, with errors of 28.2% in the former and 21.18% in the latter.

Some studies have investigated the performance of neural networks combined with a genetic algorithm and have reported a positive influence of the genetic algorithm on model performance. All except [45] support this finding. In [45], a neural network model based on Microsoft Excel is developed for cost estimation of highway projects, where the network weights are optimized by three different methods, namely Microsoft Excel Solver (simplex method), a genetic algorithm, and backpropagation, of which the simplex method eventually outperforms the others. As noted by [41], neural networks are data-driven and there is a strong correlation between the amount of training data and the model's accuracy. In addition, [46] presents a hybrid model of BP ANN with a genetic algorithm that overcomes the drawbacks of BP ANN used alone, including slow convergence and the risk of the network being trapped in local minima. It also yields a lower error rate than the BP ANN model alone.

The learning ability of an ANN depends strongly on its topology and initial weights, and choosing them heuristically is highly time consuming [47]. Because of its parallel search ability, its evolution of the best solution from a population of solutions, and its independence from prior knowledge about the problem, the genetic algorithm has been proposed as a method for selecting the best neural network topology in preference to other optimization methods [48]. This topic has been investigated in several studies [49, 50, 52–55]. In [49], the best architecture of a backpropagation neural network was determined by a hybrid model incorporating a genetic algorithm, which yields an error of around 2.62%. In addition, [50] suggested a hybrid model of genetic algorithm and backpropagation neural network to estimate the cost of building projects after comparing it with two other models, a BP ANN and a combination of ANN and genetic algorithm.

Recently, [5] examined forecasting the cost of hydroelectric power plant projects with an ANN-based model compared with a multiple regression model; three different architectures were generated and examined for the former in search of the best performance. They concluded that the ANN model is preferable in terms of forecasting error.

3 Research methodology

The research literature is reviewed to investigate the techniques applied to predicting cost at completion for projects of diverse types, construction projects in particular. The main factors affecting the cost of power plant projects are collected through interviews with experts in this domain. A hybrid model is then applied, consisting of an artificial neural network and a genetic algorithm, a meta-heuristic that optimizes the network's architecture. The research methodology is depicted in Fig. 1.

Fig. 1 Research methodology flowchart (boxes: Start; Conducting interviews with experts; Data preparation (classification and adjustment), giving escalation-based cost and inflation-based cost; Developing a hybrid model of ANN and GA; Choosing the best neural network architecture; Training the model; Testing the model; Sensitivity analysis test; End)

Historical data are needed to feed the model. They are collected, and each project is classified into one of four main types and divided into four leading phases. EPC projects, as their name suggests, are undertaken in three major phases: engineering, procurement, and civil and commissioning. In this study, because different indexes are needed to adjust civil works on the one hand and commissioning works on the other, the civil and commissioning phase is itself divided into two phases. The cost of each individual phase is updated to the latest possible year (here 2015) by applying the appropriate statistical indexes declared by the Central Bank of Iran and the Management and Planning Organization (refer to Circular No. 1-9706/54/2080). This step is carried out under optimistic, most likely, and pessimistic scenarios, and the final result is obtained with the program evaluation and review technique (PERT). A second way of updating the cost is based on inflation calculations (refer to the consumer price index (CPI) provided by the Central Bank of Iran). The results of these two methods are the ANN's target values, and the factors affecting project cost are the ANN's inputs. Note that the estimated project cost differs from the tender price, which contains other amounts, including the company's profit and contingency reserve, known as markup [51]. The best ANN architecture is selected by the GA, and the network is then trained and finally tested on its ability to predict the cost of new projects. As a final step, a sensitivity analysis is performed to measure the significance of each model input.

3.1 Variables’ identification

Factors affecting the final cost of construction projects were gathered through interviews with experts, from which nine major influencing variables were finalized. These factors are the input variables of the ANN. Five of them cause major changes in a project's final cost; they describe work packages that are executed depending on the content of the contract, and hence should be considered in the cost estimating process:

• Substation construction (X1).

• Piling and soil stabilization (X2).

• Main cooling system type (X3).

• Number of fuel oil storage tanks (X4).

• Fuel type (fuel oil, gas or both) (X5).

The remaining four factors, which are added to the variables above, mainly define the project's specifications:

• Power plant type (GTPP, CCPP, Block, and ISP) (X6).

• Project duration (in months) (X7).

• Number of units (X8).

• Project phase (engineering, procurement, civil and construction, and commissioning) (X9).

The last item is added so that the cost of each phase can be considered individually.

The indexes used in this study are retrieved from statistical data presented by the Central Bank of Iran. They consist of the escalation index and the exchange rate: the former is used for time adjustment in the engineering, civil and construction, and commissioning phases, and the latter for the procurement phase. A further index is the inflation index, with 2011 as the base year, which is used in the inflation-based method. The total cost of a project is the output of the network. Note that this cost is not a tender price, since the tender price contains other amounts, including profit and overheads, known as markup and indirect costs, respectively [51].

3.2 Historical data collection and preparation

Historical data for 39 projects are gathered in a database. These projects are classified into four main types: GTPP, CCPP, Block and ISP. The input data include both quantitative and qualitative variables. The quantitative variables remain unchanged, while the qualitative ones (X1, X2, X3, X5, X6, and X9) are transformed into scalar quantities and ordered by their influence on project cost, as stated in [34]. Hence, greater values indicate a greater effect on the actual cost of the project. For example, the procurement phase is regarded as the most expensive cost center, while the engineering phase is at the opposite end of the spectrum; the procurement phase is therefore assigned the largest number, here 4.
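The ordinal encoding of a qualitative variable can be illustrated with a minimal Python sketch for the project phase (X9). Only the two extremes come from the text (engineering lowest, procurement coded 4); the ranks given to the two middle phases and the names `PHASE_CODE` and `encode_phase` are assumptions made purely for illustration.

```python
# Ordinal codes for project phase (X9): a larger code means a larger
# expected contribution to project cost.
# Engineering (lowest) and procurement (4) follow the text; the relative
# order of the two middle phases is an ASSUMPTION for illustration only.
PHASE_CODE = {
    "engineering": 1,
    "commissioning": 2,           # assumed rank
    "civil_and_construction": 3,  # assumed rank
    "procurement": 4,
}


def encode_phase(phase: str) -> int:
    """Map a phase name to its ordinal code."""
    return PHASE_CODE[phase]
```

The same pattern would apply to the other qualitative inputs (X1, X2, X3, X5, X6), each with its own ordering by influence on cost.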

Time adjustment and location adjustment are performed according to the method discussed by [54]. Since the projects are undertaken in different provinces of Iran, the project cost is first adjusted by location indexes (provided in 1981 by the Management and Planning Organization, formerly known as the Budget and Planning Organization). The project cost is divided by the regional factor identified in the project's contract for the province in which the project is undertaken, and then multiplied by 1, the base location index associated with Tehran, the capital of Iran. Furthermore, each project is executed over a period that starts on the project's start date and ends on the PAC (Provisional Acceptance Certificate) date. Each of these two milestones has its own index, which is used in time adjustment. The time adjustment process uses the indexes at both the time of interest and the time of reference. Each year's index is monitored over four quarters. Time adjustment for each phase is done by two methods, escalation-based and inflation-based: the former is based on escalation indexes and applies the PERT technique with optimistic, most likely, and pessimistic scenarios, and the latter is based on inflation indexes.

The appropriate quarter for the start date of each phase is chosen on the premise that the project mainly starts when its civil phase is triggered. According to experts' opinions, the civil phase is negotiated and contracted about 3 months before the project start date. Since the engineering phase starts on average 2 or 3 months before the project start date, the best time for this phase is assumed to be one quarter earlier than the project start date. The procurement phase usually starts on the project start date, so that the project can be supplied with the needed materials and equipment. Finally, the commissioning phase starts 6 months after the project start date, since prerequisite tasks must be completed before this phase can commence. The related index assumptions are shown in Table 1.
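The phase timing assumptions above can be sketched as a lookup of the quarter whose index applies to each phase. This is an illustrative sketch only: the names are hypothetical, and it uses Gregorian calendar quarters for simplicity, whereas the paper does not state which calendar its quarterly indexes follow.

```python
from datetime import date

# Offset (in months) of each phase's index date relative to the project
# start date, as read from the text: civil about 3 months earlier,
# engineering one quarter earlier, procurement on the start date,
# commissioning 6 months later.
PHASE_OFFSET_MONTHS = {
    "civil_and_construction": -3,
    "engineering": -3,
    "procurement": 0,
    "commissioning": 6,
}


def index_quarter(project_start: date, phase: str) -> tuple:
    """Return (year, quarter) of the index to use for the given phase.

    ASSUMPTION: quarters are Gregorian calendar quarters (Q1 = Jan-Mar).
    """
    months = (project_start.year * 12 + (project_start.month - 1)
              + PHASE_OFFSET_MONTHS[phase])
    year, month0 = divmod(months, 12)  # month0 is 0-based (0 = January)
    return year, month0 // 3 + 1


# Example: a project starting in February 2012
start = date(2012, 2, 15)
print(index_quarter(start, "engineering"))    # (2011, 4)
print(index_quarter(start, "commissioning"))  # (2012, 3)
```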

Based on these hypotheses, the location adjustment is performed by Eqs. (1) and (2):

$$C_1 = \frac{\text{invoice}}{\text{regional factor}} \tag{1}$$

$$C_2 = \frac{\text{invoice} + \text{escalation}}{\text{regional factor}} \tag{2}$$

The adjusted cost of each phase in the escalation-based method under the pessimistic scenario ($C_P$) and the most likely scenario ($C_{ML}$), and in the inflation-based method ($C_I$), takes the project start date as the time of reference, whereas in the escalation-based method under the optimistic scenario ($C_O$) the time of reference is the PAC date of the project. The reason is that indexes grow over time and are therefore greater at the end of the project than at its start; because the reference index appears in the denominator, a smaller reference index leads to a greater adjusted cost. The adjusted cost for each method is calculated by Eqs. (3)–(6):

$$C_O = C_2 \times \frac{\text{Escalation Index (time of interest)}}{\text{Escalation Index (time of reference)}} \tag{3}$$

$$C_{ML} = C_1 \times \frac{\text{Escalation Index (time of interest)}}{\text{Escalation Index (time of reference)}} \tag{4}$$

$$C_P = C_2 \times \frac{\text{Escalation Index (time of interest)}}{\text{Escalation Index (time of reference)}} \tag{5}$$

$$C_I = C_1 \times \frac{\text{Inflation Index (time of interest)}}{\text{Inflation Index (time of reference)}} \tag{6}$$

The final cost in the escalation-based method is calculated by Eq. (7), based on the PERT technique:

$$C_E = \frac{C_O + 4\,C_{ML} + C_P}{6} \tag{7}$$

Finally, the cost obtained by applying the PERT technique to the escalation-based method, $C_E$, and the cost obtained from the inflation-based method, $C_I$, are the target values of the ANN. The nine aforementioned variables are the network's input variables.
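The adjustment chain of Eqs. (1)–(7) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: all function and parameter names are hypothetical, and the numbers in the usage example are invented, not project data.

```python
def location_adjust(invoice, regional_factor, escalation=0.0):
    """Eqs. (1)-(2): divide by the regional factor (Tehran base index = 1)."""
    return (invoice + escalation) / regional_factor * 1.0


def time_adjust(cost, index_interest, index_reference):
    """Eqs. (3)-(6): scale by index(time of interest) / index(time of reference)."""
    return cost * index_interest / index_reference


def pert(optimistic, most_likely, pessimistic):
    """Eq. (7): PERT weighted mean of the three scenarios."""
    return (optimistic + 4.0 * most_likely + pessimistic) / 6.0


def escalation_based_cost(invoice, escalation, regional_factor,
                          idx_interest, idx_start, idx_pac):
    """C_E for one phase; idx_* are escalation indexes at the named dates."""
    c1 = location_adjust(invoice, regional_factor)              # Eq. (1)
    c2 = location_adjust(invoice, regional_factor, escalation)  # Eq. (2)
    c_o = time_adjust(c2, idx_interest, idx_pac)     # optimistic: PAC date as reference
    c_ml = time_adjust(c1, idx_interest, idx_start)  # most likely: start date as reference
    c_p = time_adjust(c2, idx_interest, idx_start)   # pessimistic: start date as reference
    return pert(c_o, c_ml, c_p)


def inflation_based_cost(invoice, regional_factor, idx_interest, idx_start):
    """C_I for one phase; idx_* are inflation indexes (Eq. 6)."""
    return time_adjust(location_adjust(invoice, regional_factor),
                       idx_interest, idx_start)


# Invented numbers, for illustration only
print(escalation_based_cost(invoice=100.0, escalation=20.0, regional_factor=1.25,
                            idx_interest=300.0, idx_start=100.0, idx_pac=150.0))
# prints 240.0
```

With these invented inputs, the PAC-date index (150) is larger than the start-date index (100), so the optimistic scenario gives the smallest of the three adjusted costs, as the text explains.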

3.3 Neural network model design

Despite their black-box mechanism, neural networks have been widely used in prediction problems and have shown reasonable results, as seen in the literature. A hybrid model of backpropagation neural networks and genetic algorithm leads to more accurate predictions, prevents erroneous performance, and can thus overcome the shortcomings of the network used alone [23]. Each neuron applies the logistic (sigmoid) function of Eq. (8):

$$f(h) = \frac{1}{1 + e^{-h}} \tag{8}$$

where $h$ is obtained by multiplying the neuron's input values by their weights and adding the bias value. Because of the confidentiality of the data, and for better network performance, the data have been normalized into the range [0, 1] using Eq. (9). Given the input parameters of a project, the trained network is then capable of estimating the project's cost at completion (Table 2).

$$X_{\text{Normalized}} = \frac{X - \operatorname{Min}(X)}{\operatorname{Max}(X) - \operatorname{Min}(X)} \tag{9}$$
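Eqs. (8) and (9) translate directly into a few lines of Python. The sketch below uses only the standard library; the function names are hypothetical, and the guard against a constant column is an added assumption, since the paper does not discuss that case.

```python
import math


def sigmoid(h):
    """Eq. (8): logistic activation applied to the weighted sum h."""
    return 1.0 / (1.0 + math.exp(-h))


def min_max_normalize(values):
    """Eq. (9): rescale a list of numbers linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # ASSUMPTION: a constant column cannot be rescaled; fail loudly.
        raise ValueError("cannot normalize a constant column")
    return [(v - lo) / (hi - lo) for v in values]


print(sigmoid(0.0))                           # 0.5
print(min_max_normalize([10.0, 20.0, 30.0]))  # [0.0, 0.5, 1.0]
```

Because the sigmoid output lies in (0, 1), normalizing the targets to [0, 1] keeps them within the range the network can produce.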

3.4 Training and testing neural network model

The data are split into three parts: 60% are used for training the network, 20% for cross-validation, and the remaining 20% for testing the accuracy of the trained network. The validation set is not used for training the network. First, a primary ANN is initialized and then trained with the genetic algorithm, which continuously searches for better network architectures to achieve the lowest possible validation error. After the genetic algorithm has chosen the best network architecture, the model is examined by presenting novel cases, the test set, to the network.
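A 60/20/20 split of this kind might be sketched as below. The random shuffle, the fixed seed, and the function name are assumptions made for illustration; the paper does not say how records were assigned to the three sets.

```python
import random


def split_dataset(records, train_frac=0.6, val_frac=0.2, seed=0):
    """Split records into (train, validation, test) lists.

    ASSUMPTION: records are shuffled randomly before splitting; the test
    set receives whatever remains after the first two parts.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

The validation part is the one the genetic algorithm would score architectures on, while the test part is held back until the final architecture has been chosen.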

3.5 Genetic algorithm implementation

GA is responsible for choosing the best possible ANN architecture through its evolutionary computing capabilities. For this purpose, a global network is defined first. The idea behind the global network is to create a network once, rather than generating one for each individual of the population in each generation, which drastically saves computing time. The network is recalled whenever needed and modified according to each individual. It is constructed as large as possible, with 4 hidden layers and 8 nodes per layer, so that it covers all possible architectures within this range.

Table 1 Index assumptions

| Phase | Index | Remarks |
|---|---|---|
| Engineering | Escalation index | Index1 |
| Procurement | Exchange rate (EURO/IRR) and (USD/IRR) | Index2 |
| Civil and construction | Escalation index | Index3 |
| Commissioning | Escalation index | Index1 |

3.5.1 Initial population generation and encoding

A random initial population is generated, in which each chromosome contains 5 genes as follows:

1. Number of hidden layers

2. Number of nodes per each hidden layer

3. Input weights

4. Hidden layer weights

5. Biases

The chromosomes are populated with random continuous values in the range [-1, 1]. The structure of each individual in the population is depicted in Fig. 2.

3.5.2 Decoding process

The number of hidden layers is encoded by two random binary values generated in MATLAB. These values are the encoded form of the gene, in a form MATLAB can interpret, and must be decoded to construct the desired network. The decoding is a conversion from binary to decimal values (Fig. 3).

Similarly, the number of neurons in each hidden layer is encoded by three random binary values generated in MATLAB. The corresponding decoding is again a conversion from binary to decimal values (Fig. 4).

The other three genes are random values in the range [-1, 1] that initialize the network weights. They are used as generated, for as many input-layer, hidden-layer, and bias weights as are needed.
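The encoding and decoding described above can be sketched in Python (the paper itself works in MATLAB). The allele counts follow Fig. 2: 2 bits for the number of hidden layers, 3 bits for each of 4 layers, 9*8 input weights, 8*8*3+8 hidden-layer weights, and 8*4+1 biases. Two points are assumptions, because Figs. 3 and 4 are not reproduced here: that the decoded decimal value is shifted by +1 so that two bits cover 1 to 4 layers and three bits cover 1 to 8 nodes, and the dictionary layout of the chromosome. All names are hypothetical.

```python
import random

MAX_LAYERS, MAX_NODES, N_INPUTS = 4, 8, 9  # global network size and input count


def bits_to_int(bits):
    """Binary-to-decimal conversion, most significant bit first."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def random_chromosome(rng):
    """Build one individual with the five genes of Fig. 2."""
    return {
        "layer_bits": [rng.randint(0, 1) for _ in range(2)],
        "node_bits": [[rng.randint(0, 1) for _ in range(3)]
                      for _ in range(MAX_LAYERS)],
        "input_w": [rng.uniform(-1, 1) for _ in range(N_INPUTS * MAX_NODES)],
        "hidden_w": [rng.uniform(-1, 1)
                     for _ in range(MAX_NODES * MAX_NODES * (MAX_LAYERS - 1)
                                    + MAX_NODES)],
        "biases": [rng.uniform(-1, 1) for _ in range(MAX_NODES * MAX_LAYERS + 1)],
    }


def decode_architecture(chromosome):
    """Return (number of hidden layers, nodes per active hidden layer).

    ASSUMPTION: decoded values are offset by +1 (00 -> 1 layer, 111 -> 8 nodes).
    """
    n_layers = bits_to_int(chromosome["layer_bits"]) + 1
    nodes = [bits_to_int(b) + 1 for b in chromosome["node_bits"][:n_layers]]
    return n_layers, nodes


individual = random_chromosome(random.Random(0))
print(decode_architecture(individual))
```

Under this layout every individual has 2 + 12 + 72 + 200 + 33 = 319 alleles, whatever architecture it decodes to; weights belonging to inactive layers or nodes are simply unused.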

3.5.3 Network construction and objective function evaluation

For each individual in the population, a network is constructed that inherits its general characteristics from the global network, while the number of hidden layers, the nodes per layer, and the weights of the input layer, hidden layers, and biases are modified according to the chromosome. The network is then examined for its capability of predicting the cost with as low an error as possible. Each chromosome thus has a corresponding cost, and the chromosomes are sorted by it in ascending order.

3.5.4 Evaluating the population

The generated population is evaluated by the ability of each ANN to predict with reasonable accuracy. The performance of the ANN is therefore the main concern. It is measured by the mean squared error (MSE) of the results, which is in turn the fitness function of the problem. The final accuracy of the network is calculated by Eq. (10):

$$\text{Accuracy} = 100 - \sqrt{\text{MSE}} \times 100 \qquad (10)$$

The objective function that the problem seeks to optimize is then the reciprocal of the fitness function:

$$\text{Objective function} = \frac{1}{\text{MSE}} \qquad (11)$$
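As a small numerical illustration, the Python sketch below evaluates both quantities for the two validation errors reported in Sect. 4 (about 0.0059 before and 0.0047 after applying the GA). It assumes that Eq. (10) reads as 100 minus 100 times the square root of the MSE, with the MSE computed on normalized targets; the function names are hypothetical.

```python
import math


def accuracy_percent(mse):
    """Assumed reading of Eq. (10): 100 - sqrt(MSE) * 100."""
    return 100.0 - math.sqrt(mse) * 100.0


def objective(mse):
    """Eq. (11): the reciprocal of the MSE, so a lower error scores higher."""
    return 1.0 / mse


for mse in (0.0059, 0.0047):
    print(f"MSE={mse}: accuracy={accuracy_percent(mse):.2f}%, objective={objective(mse):.1f}")
```

Because the objective is the reciprocal of the error, a lower MSE yields a larger objective value, which is the form a fitness-proportionate selection operator needs.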

Table 2 Best model parameters based on trial and error

| Genetic algorithm parameters | Value | Network parameters | Value |
|---|---|---|---|
| Population | 100 | No. of training epochs | 1000 |
| Selection method | Roulette wheel | Training function | Trainlm |
| Crossover probability | 80% | – | – |
| Mutation probability | 2% | – | – |
| No. of generations | 50 | – | – |

| Gene | Number of alleles |
|---|---|
| Number of hidden layers | 2 |
| Number of nodes per each hidden layer | 3*4 |
| Input weights | 9*8 |
| Hidden layer weights | 8*8*3+8 |
| Biases weights | 8*4+1 |

Fig. 2 Genetic algorithm chromosome structure

3.5.5 Elitism and selection

About 25 percent of the population is reserved as elites. The rest of the population goes through the selection operator, here the roulette wheel. Roulette wheel selection is a fitness-proportionate method in which individuals with higher fitness values have a higher chance of being selected. It does not simply keep the best and discard the rest, so weak individuals still have a chance of being selected, which is considered an advantage of this selection method.
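The elitism and roulette wheel steps of Sect. 3.5.5 can be sketched as follows. This is an illustrative Python fragment, not the authors' MATLAB code: the function names, the seeded generator, and the randomly drawn MSE values are assumptions, and fitness is taken as 1/MSE as in Eq. (11).

```python
import random


def split_elites(fitness, elite_fraction=0.25):
    """Return (elite indices, remaining indices), ranked by descending fitness."""
    order = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)
    n_elite = int(round(elite_fraction * len(fitness)))
    return order[:n_elite], order[n_elite:]


def roulette_wheel(indices, fitness, k, rng):
    """Draw k indices with probability proportional to fitness (with replacement)."""
    weights = [fitness[i] for i in indices]
    return rng.choices(indices, weights=weights, k=k)


rng = random.Random(42)
# Hypothetical MSE values for a population of 100 individuals.
mse = [rng.uniform(0.003, 0.02) for _ in range(100)]
fitness = [1.0 / m for m in mse]

elite_idx, rest_idx = split_elites(fitness)
parents = roulette_wheel(rest_idx, fitness, k=len(rest_idx), rng=rng)
print(len(elite_idx), len(parents))
```

Sampling with replacement means a fit individual can be drawn several times, while a weak one keeps a small but non-zero probability of being chosen.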

3.5.6 Crossover and mutation operators

Crossover and mutation operators are applied in each generation. Each is triggered according to its probability value, as discussed earlier. New individuals are generated until the population pool for the next generation is filled. The new population is then evaluated with the objective function and evolves further until the final number of generations is reached.
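A minimal Python sketch of this step is given below, using the probabilities from Table 2 (80% crossover, 2% mutation). The paper does not state which crossover or mutation variants were used, so single-point crossover and per-allele resampling from [-1, 1] are assumptions of this sketch, as are all function names.

```python
import random


def crossover(parent_a, parent_b, rng, p_cross=0.8):
    """Single-point crossover, applied with probability p_cross (assumed variant)."""
    if len(parent_a) > 1 and rng.random() < p_cross:
        point = rng.randint(1, len(parent_a) - 1)
        return (parent_a[:point] + parent_b[point:],
                parent_b[:point] + parent_a[point:])
    return parent_a[:], parent_b[:]


def mutate(chromosome, rng, p_mut=0.02):
    """Resample each allele from [-1, 1] with probability p_mut (assumed variant)."""
    return [rng.uniform(-1.0, 1.0) if rng.random() < p_mut else gene
            for gene in chromosome]


def next_generation(parents, pool_size, rng):
    """Create children until the pool for the next generation is filled."""
    children = []
    while len(children) < pool_size:
        a, b = rng.sample(parents, 2)
        for child in crossover(a, b, rng):
            children.append(mutate(child, rng))
    return children[:pool_size]


rng = random.Random(1)
parents = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(10)]
children = next_generation(parents, pool_size=75, rng=rng)
print(len(children))
```

Only the continuous weight alleles are shown here; binary architecture alleles would need a bit-flip mutation instead of resampling.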

4 Results and analysis

The proposed method is implemented in MATLAB version 2014b. The cost data, prepared under the aforementioned assumptions, are gathered in Microsoft Excel and read in .dat format. The program is run 40 times, and the final results are summarized in Table 3. This table shows the networks chosen by the GA along with the corresponding ranges of error in predicting project cost. The results for each type of neural network, grouped by number of hidden layers, are then averaged according to Eq. (12), where $F_i$ is the frequency of observed error in each range and $x_i$ is the mean of the corresponding range. According to these results, the two-hidden-layer network yields a higher frequency of errors in the lower ranges and thus the lowest average error (6.99) of the three network types. Its best architecture is shown in Fig. 5.

| Binary code | Binary to decimal conversion, summation with 1 | Final value |
|---|---|---|
| 0 0 | +1 | 1 |
| 0 1 | +1 | 2 |
| 1 0 | +1 | 3 |
| 1 1 | +1 | 4 |

Fig. 3 Binary to decimal conversion (number of hidden layers)

| Binary code | Binary to decimal conversion, summation with 1 | Final value |
|---|---|---|
| 0 0 0 | +1 | 1 |
| 0 0 1 | +1 | 2 |
| 0 1 0 | +1 | 3 |
| 0 1 1 | +1 | 4 |
| 1 0 0 | +1 | 5 |
| 1 0 1 | +1 | 6 |
| 1 1 0 | +1 | 7 |
| 1 1 1 | +1 | 8 |

Fig. 4 Binary to decimal conversion (number of nodes per each hidden layer)

Table 3 Summary of results

| Range of error | One-hidden layer (%) | Two-hidden layer (%) | Three-hidden layer (%) |
|---|---|---|---|
| [4.4, 8] | 72.73 | 77.78 | 44.44 |
| [8, 11.7] | 18.18 | 22.22 | 44.44 |
| [11.7, 15.3] | 4.55 | – | – |
| [15.3, 18.9] | 4.55 | – | 11.11 |
| Average error | 7.67 | 6.99 | 9.02 |

Fig. 5 Best network architecture



$$\text{Average Error} = \frac{\sum F_i \times x_i}{100} \qquad (12)$$
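As a worked check of Eq. (12), the Python sketch below applies it to the frequencies in Table 3, taking the midpoint of each error range as its mean. The variable names are illustrative, and empty table cells are treated as a frequency of zero.

```python
# Error ranges and frequencies (%) copied from Table 3; "–" entries are taken as 0.
RANGES = [(4.4, 8.0), (8.0, 11.7), (11.7, 15.3), (15.3, 18.9)]
FREQUENCIES = {
    "one-hidden": [72.73, 18.18, 4.55, 4.55],
    "two-hidden": [77.78, 22.22, 0.0, 0.0],
    "three-hidden": [44.44, 44.44, 0.0, 11.11],
}


def average_error(frequencies, ranges):
    """Eq. (12): sum of frequency times range midpoint, divided by 100."""
    midpoints = [(low + high) / 2 for low, high in ranges]
    return sum(f * x for f, x in zip(frequencies, midpoints)) / 100


results = {name: round(average_error(freqs, RANGES), 2)
           for name, freqs in FREQUENCIES.items()}
print(results)
```

With the rounded range bounds printed in Table 3, this gives about 7.69, 7.01 and 9.03, within 0.02 of the reported 7.67, 6.99 and 9.02. The small gap is presumably due to rounding of the bounds, and the ranking of the three network types is unchanged.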

The process of improving the network architecture via the GA, which produced the aforementioned best network architecture, is depicted in Fig. 6, and the final accuracy of the network is calculated by Eq. (13).

$$\text{Accuracy} = 100 - \sqrt{\text{MSE}} \times 100 \qquad (13)$$

As shown in Fig. 6, the validation error decreases from around 0.0059 to 0.0047 when the GA is applied. The results of testing the model are depicted in Figs. 7 and 8. Eventually, the model is able to predict project cost with an accuracy of 94.71% (Figs. 9, 10, 11).

Fig. 6 Genetic algorithm performance

Fig. 7 Test data (output versus target)

Fig. 8 Test data adaptability

Fig. 9 Network validation performance

Fig. 10 Network error histogram



4.1 Sensitivity analysis

To measure the impact of the neural network's inputs on its performance, a sensitivity analysis is conducted based on the method stated by [40]. The network is run nine times, each time omitting one of the nine input parameters, to monitor the significance of that parameter. Results are shown in Fig. 12 as the ratio of the deteriorated MSE obtained without each parameter to the MSE of the original best network. As shown in Fig. 12, the type of power plant project is the most influential factor in project cost, whereas establishing a power distribution substation is the least important one. Furthermore, predicting project cost by considering it in phases will yield more accurate results.
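The ratio used in this analysis can be sketched as below. The Python fragment is only an illustration: in the real procedure the scoring function would retrain the network without the named input, whereas here it is a lookup into hypothetical MSE values that are not the paper's data. The function name and the placeholder input are likewise assumptions.

```python
def leave_one_out_ratios(input_names, score_without, mse_best):
    """Return {input name: MSE without that input / MSE of the best network}."""
    return {name: score_without(name) / mse_best for name in input_names}


# Hypothetical values for illustration only.
MSE_BEST = 0.004
MSE_WITHOUT = {
    "type of power plant project": 0.016,
    "power distribution substation": 0.0036,
    "placeholder input": 0.008,
}

ratios = leave_one_out_ratios(MSE_WITHOUT, MSE_WITHOUT.get, MSE_BEST)
ranking = sorted(ratios, key=ratios.get, reverse=True)
print(ranking)
```

A larger ratio marks a more influential input. A ratio below 1 means the error did not deteriorate when that input was removed.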

5 Conclusion

Early cost estimation is a vital process in project management, as it helps project managers make appropriate decisions before undertaking projects. This study contributes to this process by proposing a hybrid model consisting of an artificial neural network and an optimization algorithm for selecting the best network architecture. Historical data of MAPNA Group Co. power plant projects are gathered and processed through two methods, escalation-based and inflation-based: the former is calculated through the PERT technique, and the latter is the result of incorporating the inflation index. Hence, the model is provided with a target value calculated through two methods along with 9 input values. The best network architecture is a two-hidden-layer network with an accuracy of 94.71%. Finally, a sensitivity analysis is performed to explore the effect of the network's inputs on the final result. It shows that the type of power plant project is the most influential factor in predicting project cost.

Fig. 11 Network training regression

[Bar chart "Input Variables Sensitivity Analysis": nine bars with values 4.36, 4.03, 3.47, 3.26, 2.44, 1.70, 1.17, 1.02 and 0.82; vertical axis from 0.00 to 5.00]

Fig. 12 Input variables sensitivity analysis result



Acknowledgements The authors would like to acknowledge with great gratitude MAPNA Group Co., where the case study took place; Engineer Abolfazl Asgari, Vice President of the planning deputy, for continued support of this research; and Engineer Omid Mehdizadeh for his invaluable recommendations during the study, which would not have been possible without his supervision.

References

1. Roxas CLC, Ongpeng JMC (2014) An artificial neural network

approach to structural cost estimation of building projects in the

Philippines

2. Marzouk M, Elkadi M (2016) Estimating water treatment plants

costs using factor analysis and artificial neural networks. J Clean

Prod 112:4540–4549

3. Bala K, Ahmad Bustani S, Shehu Waziri B (2014) A computer-

based cost prediction model for institutional building projects in

Nigeria: an artificial neural network approach. J Eng Des Technol

12(4):519–530

4. Mitchell TM (1997) Machine learning. WCB, McGraw-Hill,

Boston

5. Gunduz M, Sahin HB (2015) An early cost estimation model for

hydroelectric power plant projects using neural networks and

multiple regression analysis. J Civil Eng Manag 21(4):470–477

6. Lin H (2014) An artificial neural network model for data pre-

diction. In: Advanced materials research. Trans Tech Publ

7. Arafa M, Alqedra M (2011) Early stage cost estimation of

buildings construction projects using artificial neural networks.

J Artif Intell 4(1):63–75

8. Sodikov J (2005) Cost estimation of highway projects in devel-

oping countries: artificial neural network approach. J Eastern

Asia Soc Transp Stud 6:1036–1047

9. Ahiaga-Dagbui DD, Smith SD (2012) Neural networks for

modelling the final target cost of water projects. In: Smith SD

(ed) Proceedings of the 28th annual ARCOM conference, 3–5

September 2012. Association of Researchers in Construction

Management, Edinburgh, UK, pp 307–316

10. Khashei M, Bijari M (2010) An artificial neural network (p, d, q)

model for timeseries forecasting. Expert Syst Appl

37(1):479–489

11. Iranmanesh SH, Zarezadeh M (2008) Application of artificial

neural network to forecast actual cost of a project to improve

earned value management system. In: World congress on science,

engineering and technology, pp 240–243

12. Ye Q, Lou X, Sheng L (2017) Generalized predictive control of a

class of MIMO models via a projection neural network. Neuro-

computing 234:192–197

13. Nakamura Y, Hasegawa O (2017) Nonparametric density esti-

mation based on self-organizing incremental neural network for

large noisy data. IEEE Trans Neural Netw Learn Syst 28(1):8–17

14. Xie N-M et al (2017) Estimating a civil aircraft’s development

cost with a GM (1, N) model and an MLP neural network. Grey

Syst Theory Appl 7(1):2–18

15. Lawan S et al (2017) Wind power generation via ground wind

station and topographical feedforward neural network (T-FFNN)

model for small-scale applications. J Clean Prod 143:1246–1259

16. Nussbaum DA, Mislick GK (2015) Cost estimation: methods and

tools. Wiley, Hoboken

17. Bode J (1998) Neural networks for cost estimation. Cost Eng

40(1):25–30

18. Shtub A, Versano R (1999) Estimating the cost of steel pipe

bending, a comparison between neural networks and regression

analysis. Int J Prod Econ 62(3):201–207

19. Duran O, Rodriguez N, Consalter LA (2009) Neural networks for

cost estimation of shell and tube heat exchangers. Expert Syst

Appl 36(4):7435–7440

20. Polat TK, Arslankaya S (2010) The cost forecasting application

in an enterprise with artificial neural networks. Proc World Congr

Eng 3:2258–2262

21. Duran O, Maciel J, Rodriguez N (2012) Comparisons between

two types of neural networks for manufacturing cost estimation of

piping elements. Expert Syst Appl 39(9):7788–7795

22. Zhai K, Jiang N, Pedrycz W (2013) Cost prediction method based

on an improved fuzzy model. Int J Adv Manuf Technol

65(5–8):1045–1053

23. Bai Z et al (2014) Predictive model of energy cost in steelmaking

process based on BP neural network. In: 2nd international con-

ference on software engineering, knowledge engineering and

information engineering (SEKEIE), Singapore, 2014

24. Kumari S, Pushkar S (2017) Software cost estimation using

cuckoo search. In: Advances in computational intelligence: pro-

ceedings of international conference on computational intelli-

gence 2015. Springer

25. Abdul-Rahman H, Wang C, Muhammad NAB (2011) Project

performance monitoring methods used in Malaysia and per-

spectives of introducing EVA as a standard approach. J Civil Eng

Manag 17(3):445–455

26. Cheng M-Y, Tsai H-C, Sudjono E (2010) Conceptual cost esti-

mates using evolutionary fuzzy hybrid neural network for pro-

jects in construction industry. Expert Syst Appl 37(6):4224–4231

27. He X et al (2011) Cost estimation of construction project using

fuzzy neural network model embedded with modified particle

optimizer. In: Advanced materials research. 2011. Trans Tech

Publ

28. El Sawalhi NI (2012) Modelling the parametric construction

project cost estimate using fuzzy logic. Int J Emerg Technol Adv

Eng 2(4):2250–2459

29. Kim KJ, Kim K (2010) Preliminary cost estimation model using

case-based reasoning and genetic algorithms. J Comput Civil Eng

24(6):499–505

30. Kim H-J, Seo Y-C, Hyun C-T (2012) A hybrid conceptual cost

estimating model for large building projects. Autom Constr

25:72–81

31. Wang Y-R, Yu C-Y, Chan H-H (2012) Predicting construction

cost and schedule success using artificial neural networks

ensemble and support vector machines classification models. Int J

Project Manage 30(4):470–478

32. Elhag T, Boussabaine A (1998) An artificial neural system for

cost estimation of construction projects. In: Proceedings of the

14th ARCOM annual conference

33. Boussabaine A, Kaka A (1998) A neural networks approach for

cost flow forecasting. Constr Manag Econ 16(4):471–479

34. Emsley MW et al (2002) Data modelling and the application of a

neural network approach to the prediction of total construction

costs. Constr Manag Econ 20(6):465–472

35. Setyawati BR, Sahirman S, Creese RC (2002) Neural networks

for cost estimation. In: AACE international transactions p. ES131

36. Kim G-H, An S-H, Kang K-I (2004) Comparison of construction

cost estimating models based on regression analysis, neural net-

works, and case-based reasoning. Build Environ

39(10):1235–1242

37. Gunaydın HM, Dogan SZ (2004) A neural network approach for

early cost estimation of structural systems of buildings. Int J

Project Manage 22(7):595–602

38. Wilmot CG, Mei B (2005) Neural network modeling of highway

construction costs. J Constr Eng Manag 131(7):765–771

39. Alex DP et al (2009) Artificial neural network model for cost

estimation: city of Edmonton’s water and sewer installation ser-

vices. J Constr Eng Manag 136(7):745–756



40. Tatari O, Kucukvar M (2011) Cost premium prediction of certi-

fied green buildings: a neural network approach. Build Environ

46(5):1081–1086

41. Petroutsatou K et al (2011) Early cost estimating of road tunnel

construction using neural networks. J Constr Eng Manag

138(6):679–687

42. El-Sawalhi NI, Shehatto O (2014) A neural network model for

building construction projects cost estimating. J Constr Eng Proj

Manag 4(4):9–16

43. Putra GAS, Triyono RA (2015) Neural network method for

instrumentation and control cost estimation of the EPC compa-

nies bidding proposal. Procedia Manuf 4:98–106

44. Hyari KH, Al-Daraiseh A, El-Mashaleh M (2015) Conceptual

cost estimation model for engineering services in public con-

struction projects. J Manag Eng 32(1):04015021

45. Hegazy T, Ayed A (1998) Neural network model for parametric

cost estimation of highway projects. J Constr Eng Manag

124(3):210–218

46. Feng GL, Li L (2013) Application of genetic algorithm and

neural network in construction cost estimate. In: Advanced

materials research. 2013. Trans Tech Publ

47. Paul S, Kumar S, Singh L (2012) Novel hybrid compact genetic

algorithm for simultaneous structure and parameter learning of

neural networks. In: 2012 IEEE congress on evolutionary

computation

48. ul Islam B et al (2014) Optimization of neural network archi-

tecture using genetic algorithm for load forecasting. In: 5th

international conference on intelligent and advanced systems

(ICIAS)

49. Kim G-H et al (2004) Neural network model incorporating a

genetic algorithm in estimating construction costs. Build Environ

39(11):1333–1340

50. Kim G, Seo D, Kang KI (2005) Hybrid models of neural net-

works and genetic algorithms for predicting preliminary cost

estimates. J Comput Civil Eng 19(2):208–211

51. Hegazy T (2002) Computer-based construction project manage-

ment. Prentice Hall, Upper Saddle River

52. Kaur H, Tao X (eds) (2014) ICTs and the Millennium develop-

ment goals: a United Nations perspective. Springer, New York

53. Kaur H et al (eds) (2017) Catalyzing development through ICT

adoption: the developing world experience. Springer Interna-

tional Publishing, Switzerland

54. Kaur H, Chauhan R, Wasan SK (2015) A Bayesian network model for

probability estimation. In: Encyclopedia of information science

55. Setyawati BR, Creese RC, Sahirman S (2003) Neural networks

for cost estimation (Part 2). In: AACE international transactions,

p. ES141

