Development and Testing of a Method for Forecasting Prices of Multi-Storey Buildings during the Early Design Stage: the Storey Enclosure Method Revisited

Franco Kai Tak Cheung

Doctor of Philosophy

School of Construction Management and Property

Queensland University of Technology

2005

To Ritz and my parents


Statement of Original Authorship

The work contained in this thesis has not been previously submitted for a

degree or diploma at any other higher education institution. To the best of my

knowledge and belief, the thesis contains no material previously published or written

by another person except where due reference is made.

Signature:

Date:


Abstract

Although design decisions made in the preliminary design stages of a building are more cost sensitive than those made at later stages, previous research suggests that the accuracy of building price forecasts improves only slightly as the design develops. However, established conventional forecasting methods lack measures of their own performance, which has inhibited the development of simpler early-stage techniques.

One early-stage price forecasting model, the Storey Enclosure Method, developed by James in 1954, uses the basic physical measurements of buildings to estimate building prices. Although James’ Storey Enclosure Model (JSEM) is not widely used in practice, it has been shown empirically, if rather crudely, to be a better model than other commonly used models. This research aims firstly to advance JSEM by using regression techniques and secondly to develop an objective approach for the assessment of model performance.

To accomplish the first research aim, this research uses data from 148

completed Hong Kong projects for four types of building: offices, private housing,

nursing homes, and primary and secondary schools. Sophisticated features of the

modelling exercise include the use of leave-one-out cross validation to simulate the

way in which forecasts are produced in practice and a dual stepwise selection

strategy that enhances the chance of identifying the best model. Two types of


regressed models from different candidate sets, the Regressed Model for James’

Storey Enclosure Method (RJSEM) and Regressed Model for Advanced Storey

Enclosure Method (RASEM), are developed accordingly.
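As an illustration only, the leave-one-out idea can be sketched in a few lines of Python. The sketch assumes a single predictor (for example, a floor area measurement) and an ordinary least-squares line with an intercept; the models in this thesis are selected from larger candidate sets, so the function names, the one-predictor form and the sign convention for percentage errors (forecast minus actual, divided by actual) are assumptions made for the example rather than the thesis's implementation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def leave_one_out_forecasts(xs, ys):
    """Forecast each project from a model fitted to all the other projects."""
    forecasts = []
    for i in range(len(xs)):
        # Hold project i out, as if it were the new project to be priced.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        forecasts.append(a + b * xs[i])
    return forecasts


def percentage_errors(forecasts, actuals):
    """Percentage error of each forecast (assumed sign: forecast minus actual)."""
    return [100.0 * (f - y) / y for f, y in zip(forecasts, actuals)]
```

Each forecast is produced without sight of the project being forecast, which is how the procedure mimics practice: the forecaster only ever has data from other, completed projects.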

In considering the RJSEM, RASEM, and the most commonly used alternative

early stage floor area and cube models, all of the models except JSEM are found to

be unbiased. The RJSEM and RASEM models are also examined for their

consistency using a structured approach that involves the use of both parametric and

non-parametric inference tests. This shows that although the RASEMs for different

building types are generally more consistent, they are not significantly better than the

other models. Finally, the combination of the forecasts that are generated from

different models to capture the different aspects of information from the models is

suggested as an alternative strategy for improving forecasting performance.
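A minimal sketch of these performance measures, under stated assumptions: bias is taken as the mean of the percentage errors, consistency as their sample standard deviation, and forecasts are combined by a weighted average that defaults to equal weights. The function names and the equal-weight default are illustrative choices for this example and are not taken from the thesis.

```python
from statistics import mean, stdev


def bias_and_consistency(pct_errors):
    """Return (mean, sample standard deviation) of the percentage errors."""
    return mean(pct_errors), stdev(pct_errors)


def combine_forecasts(forecast_sets, weights=None):
    """Weighted average of several models' forecasts for the same projects.

    forecast_sets: one list of forecasts per model, all of equal length.
    weights: one weight per model; equal weights are assumed if omitted.
    """
    k = len(forecast_sets)
    if weights is None:
        weights = [1.0 / k] * k
    return [
        sum(w * f for w, f in zip(weights, per_project))
        for per_project in zip(*forecast_sets)
    ]
```

A mean close to zero indicates an unbiased model, and a smaller standard deviation indicates a more consistent one; whether differences between models are significant still has to be decided by inference tests such as those named above.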


Acknowledgements

I am indebted to the following people for their time, help and contribution to

the production of this thesis.

Great appreciation is due to my former team mates at Levett & Bailey Ltd.,

including Hon Kong Yu, who allowed me to access the data; See Ping Wong, who

provided me with a valuable insight into estimating practices; and Anselm Chow,

who gave detailed explanations of the company’s recording system, and answered all

of my queries.

Gratitude is expressed to Dr. Derek Drew for suggesting the research topic

and for supervising this research project. Without his crucial suggestions, this

research would never have been started.

Thanks are due to all my colleagues at the City University of Hong Kong for

their support, and in particular to Professor Andrew Leung, Dr. S. M. Lo and Dr. S O

Cheung for their encouragement and advice, to Dr. Raymond Lee for introducing me

to the mathematical software Mathcad and to Dr. Eric Lee for giving me private

lessons on resampling methods.

Special acknowledgment is given to Dr. H P Lo, my local supervisor in Hong

Kong, who pointed me in the right direction for many of the statistical problems

encountered. His advice on the choice of techniques and proper mathematical


interpretation was particularly helpful. His patience in correcting my thinking is

greatly appreciated.

I am indebted to Professor Martin Skitmore for many things, such as his

extensive assistance, superb guidance, sharp advice, incredible patience and prompt

responses to my queries throughout this study. The time and effort that he spent

discussing my research during his visit to Hong Kong and my stays at

QUT are highly appreciated. Without his guidance and advice, I would not have been

able to proceed and bring the research to completion.


Table of Contents

STATEMENT OF ORIGINAL AUTHORSHIP
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1 INTRODUCTION

CHAPTER 2 COST FORECASTING IN PRACTICE: A REVIEW
  2.1 INTRODUCTION
  2.2 BUILDING ECONOMICS
  2.3 COST PLANNING AND CONTROL
  2.4 COST FORECASTING IN THE COST PLANNING AND CONTROL PROCESS
  2.5 DESIGN PROCESS AND DESIGNERS’ FORECASTS
  2.6 EARLY STAGE FORECASTING IN PRACTICE
  2.7 PROBLEMS OF EXISTING FORECASTING PRACTICE
    2.7.1 Misconception of the relationship between level of detail and forecasting accuracy
    2.7.2 Lack of theoretical background
    2.7.3 Lack of performance evaluation
    2.7.4 Inexplicability, unrelatedness and determinism
  2.8 SUMMARY

CHAPTER 3 DEVELOPMENT OF FORECASTING MODELS
  3.1 INTRODUCTION
  3.2 DEFINITION OF COST MODEL
  3.3 BRANDON’S “PARADIGM SHIFT”
    3.3.1 Black box versus realistic models
    3.3.2 Deterministic versus stochastic models
    3.3.3 Deductive versus inductive models
  3.4 MAJOR DIRECTIONS OF MODEL DEVELOPMENT
  3.5 LIMITATIONS OF COST MODELS
    3.5.1 Model assumptions
    3.5.2 Reliance on historical data for prediction
    3.5.3 Insufficiency of information and preparation time
    3.5.4 Reliance on expert judgment


  3.6 REVIEW OF COST MODELS IN USE
  3.7 SIGNIFICANT ITEMS ESTIMATION
  3.8 DISCUSSIONS ON RESEARCH OPPORTUNITIES
  3.9 STOREY ENCLOSURE METHOD
  3.10 REGRESSION ANALYSIS
  3.11 REVIEW OF MODEL PREDICTORS
  3.12 OCCAM’S RAZOR: PARSIMONY OF VARIABLES
  3.13 SUMMARY

CHAPTER 4 PERFORMANCE OF FORECASTING MODELS
  4.1 INTRODUCTION
  4.2 MEASURES OF FORECASTING ACCURACY
  4.3 BASE TARGET FOR FORECASTING ACCURACY
  4.4 OVERVIEW OF MODEL PERFORMANCE AT VARIOUS DESIGN STAGES
  4.5 SUMMARY

CHAPTER 5 METHODOLOGY
  5.1 INTRODUCTION
  5.2 RESEARCH FRAMEWORK
  5.3 TYPES OF QUANTITY MEASURED IN SINGLE-RATE FORECASTING MODELS
  5.4 SIMPLIFICATION OF JSEM
  5.5 IDENTIFICATION OF A PROBLEM
  5.6 DATA PREPARATION AND ENTRY
    5.6.1 Data sample
    5.6.2 Definition and classification of building types
    5.6.3 Treating of outliers
  5.7 MODEL BUILDING
    5.7.1 Dependent Variables
      5.7.1.1 Price Index Adjustment
      5.7.1.2 Other Adjustments
    5.7.2 Candidate variables
    5.7.3 Fitting Criterion
      5.7.3.1 Matrix Notation for Calculation of MSQ
    5.7.4 Reliability analysis
      5.7.4.1 Matrix Notation for Calculation of MSQ by Leave-one-out Method
    5.7.5 Selection Strategies
  5.8 MODEL ADJUSTMENT
    5.8.1 Exclusion of candidates
    5.8.2 Transformation of variables
  5.9 COMPARISON OF BEST MODEL WITH OTHER MODELS
    5.9.1 Choice of parametric and non-parametric inference
    5.9.2 Statistical inference for bias
    5.9.3 Statistical inference for consistency
  5.10 TOOLS FOR COMPUTATION
  5.11 SUMMARY

CHAPTER 6 ANALYSIS
  6.1 INTRODUCTION


  6.2 MODEL DEVELOPMENT
    6.2.1 Data Collected
    6.2.2 Candidates for Regression Models
    6.2.3 Response for Regression Models
    6.2.4 Selection of Predictors
      6.2.4.1 Selected Predictors for RJSEMs and RASEMs
    6.2.5 Model Transformation
  6.3 PERFORMANCE VALIDATION
    6.3.1 Forecasting Results
    6.3.2 Normality Testing
    6.3.3 Significance of Variable Transformation
    6.3.4 Comparisons of Models
      6.3.4.1 Models for Offices
      6.3.4.2 Models for Private Housing
      6.3.4.3 Models for Nursing Homes
      6.3.4.4 Models for Schools
      6.3.4.5 Discussions on model comparisons
  6.4 COMBINING FORECASTS
  6.5 SUMMARY

CHAPTER 7 CONCLUSIONS
  7.1 INTRODUCTION
  7.2 MODEL DEVELOPMENT
  7.3 PERFORMANCE VALIDATION
  7.4 COMBINING FORECASTS
  7.5 IMPLICATIONS FOR PRACTICE
  7.6 MODEL LIMITATIONS
  7.7 OPPORTUNITIES FOR FURTHER RESEARCH

BIBLIOGRAPHY

APPENDIX A: APPROVAL LETTER FOR ACCESS OF COST ANALYSES
APPENDIX B: TENDER PRICE INDICES AND COST TRENDS IN HONG KONG, MARCH 2004 (PUBLISHED BY LEVETT AND BAILEY CHARTERED QUANTITY SURVEYORS LTD.)
APPENDIX C: ORIGINAL DATA
APPENDIX D: FORECASTS BY CROSS VALIDATION USING CONVENTIONAL MODELS
APPENDIX E: ERRORS AND PERCENTAGE ERRORS OF FORECASTS
APPENDIX F: RESULTS OF COMBINING FORECASTS


List of Figures

FIGURE 2-1: MODEL OF DESIGN PROCESS (SOURCE: MAVER 1970 P. 200)

FIGURE 2-2: LEVEL OF INFLUENCE ON PROJECT COST (IN PER CENT) (SOURCE: BARRIE AND PAULSON 1978 P. 154)

FIGURE 2-3: DESIGNERS’ COMMITMENT TO EXPENDITURE (SOURCE: FERRY ET AL. 1999 P. 96)

FIGURE 5-1: RESEARCH FRAMEWORK FOR IDENTIFICATION, SELECTION AND VALIDATION OF PRICE MODELS

FIGURE 5-2: ALGORITHM FOR DUAL STEPWISE SELECTION

FIGURE 5-3: ALGORITHM FOR COMPARISONS OF VARIANCES OF PERCENTAGE ERRORS

FIGURE 6-1: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE FLOOR AREA MODEL FOR OFFICES

FIGURE 6-2: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE LRASEM FOR OFFICES

FIGURE 6-3: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE JSEM FOR PRIVATE HOUSING

FIGURE 6-4: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE FLOOR AREA MODEL FOR PRIVATE HOUSING

FIGURE 6-5: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE CUBE MODEL FOR PRIVATE HOUSING

FIGURE 6-6: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE RJSEM FOR NURSING HOMES

FIGURE 6-7: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE RASEM FOR NURSING HOMES

FIGURE 6-8: TESTS OF HOMOGENEITY OF VARIANCES USING BARTLETT’S TESTS, KRUSKAL WALLIS TESTS AND MANN-WHITNEY U TESTS


List of Tables

TABLE 2-1: MODEL SELECTION CRITERIA (EXTRACTED AND MODIFIED FROM FORTUNE AND HINKS 1998)

TABLE 3-1: CLASSIFICATION OF THIS RESEARCH ACCORDING TO NEWTON’S DESCRIPTIVE PRIMITIVES

TABLE 3-2: PREVIOUS STUDIES ON MODELLING TECHNIQUES AND APPLICATIONS ACCORDING TO NEWTON’S CLASSIFICATION

TABLE 3-3: SUMMARY OF ESTIMATING TECHNIQUES (EXTRACTED FROM SKITMORE & PATCHELL 1990)

TABLE 3-4: ADJUSTMENT FOR THE FACTORS AFFECTING THE ESTIMATES IN THE STOREY ENCLOSURE METHOD

TABLE 3-5: WEIGHTINGS AND INCLUSIONS FOR INDIVIDUAL COMPONENTS IN THE STOREY ENCLOSURE METHOD

TABLE 3-6: THE RESULTS OF TESTS FOR THE CUBE, FLOOR AREA AND STOREY ENCLOSURE METHODS IN JAMES’ STUDY (SOURCE: JAMES (1954))

TABLE 3-7: SUMMARY OF THE MODELS DEVELOPED BY THE POST-GRADUATE STUDENTS OF THE DEPARTMENT OF CIVIL ENGINEERING AT LOUGHBOROUGH UNIVERSITY OF TECHNOLOGY (EXTRACTED FROM MCCAFFER 1975)

TABLE 3-8: SUMMARY OF FORECASTING TARGETS AND INFLUENCING VARIABLES IN PREVIOUS EMPIRICAL STUDIES

TABLE 4-1: MEASURES OF PERFORMANCE OF FORECASTS (SOURCE: SKITMORE ET AL. 1990 P. 22)

TABLE 4-2: FACTORS AFFECTING QUALITY OF FORECASTS – SUMMARY OF EMPIRICAL EVIDENCE (EXTENDED FROM THE SIMILAR TABLE IN SKITMORE ET AL. (1990, P. 20-21))

TABLE 4-3: PERFORMANCE OF DESIGNERS’ FORECASTS REVIEWED BY ASHWORTH AND SKITMORE (1983)

TABLE 5-1: COEFFICIENTS AND VARIABLES DESIGNATED IN JSEM

TABLE 5-2: CLASSIFICATION OF BUILDING PROJECTS ACCORDING TO BUILDING TYPES

TABLE 5-3: LIST OF CANDIDATE VARIABLES


TABLE 6-2: INCLUDED CANDIDATES, EXCLUDED CANDIDATES AND SELECTED PREDICTORS FOR RJSEMS AND RASEMS

TABLE 6-3: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR OFFICES

TABLE 6-4: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR PRIVATE HOUSING

TABLE 6-5: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR NURSING HOMES

TABLE 6-6: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR SCHOOLS

TABLE 6-7: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR OFFICES

TABLE 6-8: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR PRIVATE HOUSING

TABLE 6-9: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR NURSING HOMES

TABLE 6-10: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR SCHOOLS

TABLE 6-11: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR OFFICE

TABLE 6-12: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR PRIVATE HOUSING

TABLE 6-13: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR NURSING HOMES

TABLE 6-14: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR SCHOOLS

TABLE 6-15: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR OFFICES

TABLE 6-16: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR PRIVATE HOUSING

TABLE 6-17: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR NURSING HOMES

TABLE 6-18: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR SCHOOLS

TABLE 6-19: SIGNS OF COEFFICIENTS FOR SELECTED PREDICTORS

TABLE 6-20: CONTRIBUTIONS OF FLOOR AREA RELATED PREDICTOR TO RESPONSE


TABLE 6-21: CONTRIBUTION OF NON-FLOOR AREA RELATED PREDICTORS TO RESPONSES

TABLE 6-22: SUMMARY OF MEANS AND STANDARD DEVIATIONS OF PERCENTAGE ERRORS

TABLE 6-23: RESULTS OF NORMALITY TESTS FOR PERCENTAGE ERRORS ACCORDING TO BUILDING AND MODEL TYPES

TABLE 6-24: ESTIMATED LAMBDA VALUES ACCORDING TO BUILDING AND MODEL TYPES (FOR MODELS NOT SATISFYING NORMALITY ASSUMPTION ONLY)

TABLE 6-25: TWO-SAMPLE F-TESTS AND MANN-WHITNEY U TEST BETWEEN REGRESSED MODELS WITH UNTRANSFORMED VARIABLES AND WITH LOGARITHMIC TRANSFORMED VARIABLES

TABLE 6-26: TWO-SAMPLE MANN-WHITNEY U-TESTS BETWEEN MODELS FOR OFFICE AND PRIVATE HOUSING

TABLE 6-27: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 1 MODELS








TABLE D-1: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR OFFICES

TABLE D-2: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR PRIVATE HOUSING

TABLE D-3: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR NURSING HOMES

TABLE D-4: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR SCHOOLS

TABLE E-1: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR OFFICES

TABLE E-2: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR PRIVATE HOUSING


TABLE E-3: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR NURSING HOMES

TABLE E-4: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR SCHOOLS

TABLE E-5: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR OFFICES

TABLE E-6: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR PRIVATE HOUSING

TABLE E-7: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR NURSING HOMES

TABLE E-8: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR SCHOOLS

TABLE F-1: COMBINED FORECASTS FOR GROUP 1 MODELS









Chapter 1 Introduction

Philosophy is a game with objectives and no rules. Mathematics is a game with rules and no objectives. Anonymous

The forecasting approach for the prediction of building prices that is used in

practice has been criticized for misconstruing the relationship between level of detail

and forecasting accuracy (Bennett et al. 1979), for lacking solid theoretical support

(Brandon 1982; Skitmore 1988; Bon 2001), for lacking performance evaluation

(Morrison 1983; Raftery 1984a; Fortune and Lees 1996), and for being inexplicable,

unrelated, and deterministic (Bowen et al. 1987). Although many alternative

approaches and new models have been developed, solid evidence from surveys that

have been conducted in different countries suggests that they are rarely used in

practice (Akintoye et al. 1992; Fortune and Lees 1996; Bowen and Edwards 1998).

The majority of studies of model development have chosen to focus on the uniqueness

of a new model and the way in which it is different from other models (Raftery 1984a;

Newton 1990).

In the early design stage of a building project, the freedom to modify the

scope, requirements, standards and design is very high. This alone will create

high uncertainty in building price despite the fact that the later decisions on tendering

http://proverb.taiwanonline.org/display.php?author=W.S.+Anglin&row=0


arrangement, procurement methods and number of tenderers to be invited, etc., and

the possible change in market conditions as design develops will also have serious

price implications. Although the design information available is very coarse and

limited in the early design stage, construction clients are generally eager to know the

likely building price. Very often, this price refers to the lowest tender price.

Conventionally, practising forecasters measure the total floor area from a few sketch

drawings and make a forecast using the floor area method (or the cube method before

the floor area method gained popularity). To make full use of information

extracted from sketches, James proposed a method as a rule of thumb, called the

Storey Enclosure Method, which he claimed takes into account the effect of physical

shape, the total floor area, the vertical positioning of the floor area, the storey heights

and the sinking of usable floor area below ground level (e.g. a basement) on building

prices. Like the floor area and cube methods, James’ method is a single rate

method which uses the storey enclosure area as the quantity for measurement. To

determine this area, the area for each floor, the external wall area, the basement wall

area and the roof area are first measured. Then, these measured areas are multiplied

by their associated weightings. Finally, the products of these areas and weightings

are summed and the total is the storey enclosure area.
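
The measurement steps described above can be sketched in code. This is a minimal illustration only: the function names, the weighting values and the building dimensions used with it are hypothetical placeholders, and they are not James’ published weightings, which are not reproduced here.

```python
def storey_enclosure_area(floor_areas, floor_weights,
                          external_wall_area, basement_wall_area, roof_area,
                          wall_weight=1.0, basement_wall_weight=1.0,
                          roof_weight=1.0):
    """Sum the weighted areas of a building to give a storey enclosure area.

    floor_areas and floor_weights are parallel sequences, one entry per
    floor, so that each floor can carry a weighting that depends on its
    vertical position. All default weightings are placeholders (assumed).
    """
    if len(floor_areas) != len(floor_weights):
        raise ValueError("one weighting is needed for each floor")
    # Multiply each measured floor area by its associated weighting.
    weighted_floors = sum(a * w for a, w in zip(floor_areas, floor_weights))
    # Add the weighted wall, basement wall and roof areas.
    return (weighted_floors
            + external_wall_area * wall_weight
            + basement_wall_area * basement_wall_weight
            + roof_area * roof_weight)


def single_rate_forecast(quantity, rate):
    """Single rate method: forecast price = measured quantity x unit rate."""
    return quantity * rate
```

With two floors of 100 square metres, hypothetical floor weightings of 2.0 and 2.5, 300 square metres of external wall, no basement and a 100 square metre roof, the function returns the sum of the weighted areas; multiplying that total by a unit rate taken from past projects gives the forecast.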

Although James’ Storey Enclosure Model (JSEM) has not been developed

empirically, its forecasting performance, together with that of two other conventional

models, the floor area and cube models, has been calculated with empirical data for

comparison. James’ study of 1954 is a pioneering study in model exploration that

attempts to show model advancement empirically. It is able to show that forecasts

that are produced by his model are nearer to actual tender prices than those that are

produced by the other two models, and that the range of price variation is reduced


accordingly. Despite the better performance demonstrated by James, JSEM serves

primarily as a textbook method for forecasting, rather than a method that is used in

practice. Clearly, JSEM is more complicated than the floor area and the cube

models in terms of measurement and ease of understanding. Moreover, there is a

major criticism that the weightings are based purely on experience

(Wilderness Group 1964; Seeley 1996 pp.161-162; Ashworth 1999 p.251).

JSEM, which is considered to be the most sophisticated model of all of the

single rate models, has been chosen for further development in this research. The

idea of using areas of different parts of a building as variables allows for model

exploration using regression analysis. The major problem of JSEM, that of a lack

of rigor, can be solved by using advanced modelling techniques for model

development, and statistical inferences for performance validation. By following a

rigorous approach of cross validation to the further development of JSEM and the

subsequent examination of the developed model by statistical testing, it is expected

that the model will achieve a balance between the requirements of theory (science)

and practicability (technology) for forecasting building prices.

With reference to the variables identified in the JSEM, the primary aim of

this research is to develop regressed models for forecasting the lowest tender prices

of multi-storey buildings in the early design stage using a systematic and logical

approach. To achieve this aim, this research adopts the cross validation approach

for modelling using regression analysis, as it has been shown to be markedly superior for

small data sets (Goutte 1997). The accuracy of statistical inference in cross

validation is preserved by dividing at random a sample of data into two sub-samples,

an exploratory sub-sample, which is used to select a statistical model for the data,

and a validatory sub-sample, which is used for formal statistical inference (Fox 1997).
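
The random division into exploratory and validatory sub-samples can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm developed in this thesis: the equal split, the fixed seed, the single rate model fitted through the origin and all function names are assumptions introduced for the example.

```python
import random


def split_sample(projects, exploratory_fraction=0.5, seed=0):
    """Randomly divide projects into exploratory and validatory sub-samples."""
    rng = random.Random(seed)      # fixed seed so that the split is repeatable
    shuffled = list(projects)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * exploratory_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_single_rate(sample):
    """Least-squares rate for price = rate * quantity (no intercept).

    Each item in sample is a (quantity, price) pair.
    """
    sum_xy = sum(q * p for q, p in sample)
    sum_xx = sum(q * q for q, _ in sample)
    return sum_xy / sum_xx


def validation_errors(rate, sample):
    """Percentage errors of forecasts for projects not used in fitting."""
    return [100.0 * (rate * q - p) / p for q, p in sample]
```

The model is selected on the exploratory sub-sample only, and the errors are then computed on the validatory sub-sample, so that the inference is not made on the same data that shaped the model.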


The cross validation algorithm developed in this study for modelling JSEM’s

variables is a significant contribution because it advances the model

building process. Although the data, i.e. the observed values for the candidates and

the response, used in this study are only for four different types of building projects,

the developed methodology for modelling is also applicable to data from other types

of buildings and other types of data. In revisiting James’ study, the specific

objectives of this research are: (1) to collect project data of multi-storey buildings; (2)

to classify the data according to the type of projects; (3) to develop a cross-validated

regression algorithm for model selection; (4) to generate regressed models of

different project types by the cross validation method using the variables in JSEM as

candidates; (5) to repeat (4) based on another set of variables that are modified from

the variables in JSEM.

It is hypothesised that the new regressed models will outperform the

conventional forecasting models, i.e. the JSEM, the floor area and cube models.

The secondary aim is to prove the hypothesis. To accomplish this, the forecasting

accuracy of the developed models has to be tested against that of the conventional

models. An algorithm for selecting the appropriate tests for the comparisons is

designed. The specific objectives concerning the statistical inference are: (1) to

measure the forecasting accuracy in terms of bias and consistency; (2) to compare

the forecasting accuracy of these models by the use of different parametric and

non-parametric tests; and (3) to group the models that show the same potency

together if the developed models do not perform significantly better than the

conventional models.
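
As a hedged illustration of objective (1), the two accuracy measures can be computed from percentage errors. The definitions below are assumptions made for this sketch (bias as the mean percentage error, consistency as the sample standard deviation of the percentage errors); the measures actually adopted in this research are described in Chapter 5, and the function names are hypothetical.

```python
from statistics import mean, stdev


def percentage_errors(forecasts, actuals):
    """Percentage error of each forecast relative to the actual price."""
    return [100.0 * (f - a) / a for f, a in zip(forecasts, actuals)]


def bias(errors):
    """Assumed measure of bias: the mean of the percentage errors."""
    return mean(errors)


def consistency(errors):
    """Assumed measure of consistency: the sample standard deviation
    of the percentage errors (needs at least two forecasts)."""
    return stdev(errors)
```

Under these assumed definitions, a model can be unbiased on average (a mean error near zero) and still be inconsistent (a wide spread of errors), which is why the two measures are reported separately.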


The thesis is divided into three parts. The background to this research is

presented in Chapters 2 to 4, the empirical work is contained in Chapters 5 and 6, and

the conclusions are presented in Chapter 7.

Due to the difference in cost sensitivity, early design decisions that are

strongly influenced by forecasting accuracy have a stronger impact on the final value

of a building. In Chapter 2, the significance of early stage forecasting and

forecasting in practice are reviewed. Despite this probable strong impact, it is

found that accuracy is rarely monitored in real life forecasting. There is also a lack

of theoretical support for the widely adopted forecasting methods, but by contrast

there are a variety of forecasting cost models that have been developed, mainly by

academia, arguably for the sake of publication only.

The development, use, classification and limitation of cost models are

summarised in Chapter 3. In particular, JSEM, as the model for further

development in this study, is extensively explained. This research applies

regression techniques to the variables in JSEM, and previous studies on the

application of similar techniques and the variables selected are also discussed. In

the process of model development, modellers always face a dilemma between

choosing a slightly complex model that is general but may be unrealistic, and

choosing a very complex model that is specific but may be unreliable. To resolve

this dilemma, the principle of parsimony in scientific theory and model development

is addressed.

Due to the limited information that is available for early-stage forecasting,

models are mainly operated in ‘black box’ mode. In the case of models that are

developed by regression, as in this research, performance validation is essential.


The different ways of measuring forecasting performance, and previous empirical

work on forecasting accuracy, are reviewed in Chapter 4.

The methodology is described in Chapter 5. The cost analyses for four

types of building were collected from a large quantity surveying practice in Hong

Kong. To employ the regression methodology for multi-storey buildings, the

number of variables in the original JSEM is trimmed down to a manageable level by

making an assumption about the variations in floor size. Some advanced features of

this methodology include the use of cross validation for reliability analysis, which

simulates the practical production of forecasts, and a dual stepwise selection strategy

that enhances the chance of identifying the best model. In the section concerning

the comparison of the models, two commonly used measures – bias and consistency

– are described, and statistical inference using parametric and non-parametric tests is

compared. To assist in the making of a proper statistical inference, a framework for

choosing the appropriate tests is also proposed.

The analysis in Chapter 6 contains three sections: model development,

performance validation, and combining forecasts. Eight regressed models were

developed according to two sets of candidates (one set from JSEM and one set that

was modified from JSEM) for the four types of building. The selected variables

were also transformed to seek a further improvement in forecasting accuracy. Each

regressed model was compared separately with the conventional models, and the

models possessing the same potency were grouped together. Finally, an approach

to combining forecasts to improve forecasting performance is demonstrated with

empirical data.


An overall summary, conclusions, and suggested further research are

presented in Chapter 7.


Chapter 2 Cost Forecasting in Practice: A Review

In science, it doesn't matter if you're wrong, as long as you're not stupid. In business, it doesn't matter if you're stupid, so long as you're not wrong.

Anonymous

2.1 Introduction

The total cost of a development includes the cost of land, building costs,

finance charges, legal charges, consultants’ fees, and so forth. In a broader sense, it

also includes the running, marketing, maintenance and repair costs. To manage the

economic aspect of a building development effectively, clients often employ

professionals of various disciplines, such as accountants, general practice surveyors

and quantity surveyors. Quantity surveyors, whose profession originates in the

measurement of the quantities of buildings, are responsible for giving advice on

building costs.

The economic aspects of building procurement play a very significant role,

because building cost is one of the major components of the total development cost,

next to the cost of land. Unlike the cost of land, which reflects the cost of

ownership and usage, building cost is determined by the building market through the


cost approach. In a market economy, it is the traded value to a contractor for the

procurement of a building.

At the design development stage, building cost planning and control is an

iterative process that is used to forecast the unknown building price based on

available drawings and specifications (i.e., costing a design) and to revise the

drawings and specifications to ensure that the building price falls within a

predetermined sum (i.e., designing to a cost) (Jaggar et al. 2002 pp.10-11). Design

decisions that are made during this process are crucial to the success of a project.

As the decisions that are made in the early design stages, especially before a detailed

design has been worked out, are more cost sensitive than those that are made in the

later stages, changes to design decisions in the later design stages or execution stage

may lead to serious redundancy. Thus, it is essential to produce an accurate cost

forecast, especially in the early design stage. The use of the right cost model is

therefore a fundamental concern.

The task of forecasting the cost of buildings is especially difficult because of

the heterogeneity of the design, procurement, and contractual arrangements; the

complexity of the resources and production methods that are involved; and the

lengthy cycle of building projects. The task of forecasting is, however, very

important in the design process, as design decisions are always made with reference

to the forecasts (outputs of the task) of building costs. An incorrect forecast will

inevitably lead to the ineffective use of resources.

In a typical building project, the quantity surveyor is held responsible for

giving strategic cost advice. The science and art of this function is what

distinguishes quantity surveying as a professional discipline (James 1954; Male 1990;


Connaughton and Meikle 1991; RICS 1992), and forecasting forms a core part of this

function. This chapter reviews the significance of early stage forecasting,

conventional forecasting practice, and the underlying problems of the traditional

forecasting approach.

2.2 Building Economics

To give professional cost advice, quantity surveyors should be well equipped

with knowledge of building economics. Building economics is the study of

economising the use of scarce resources throughout the development life cycle, from

conception to demolition (Bon 1989 p.5). It involves a combination of technical

skills, informal optimisations, cost accounting, cost control, price forecasting and

resource allocation (Raftery 1991 pp.4-5). In a broader sense, it can be considered

as a branch of general economics that involves the application of the techniques and

expertise of economics to the study of the construction firm, the construction process

and the construction industry (Hillebrandt 1985 p.3). The objective of seeking an

optimal allocation of resources for building clients distinguishes building economics

from cost and management accounting.

2.3 Cost Planning and Control

Practitioners refer to the process of applying economics principles to building

projects as cost planning and control. The purposes of cost planning and control are

to provide clients with a good value for money project, to achieve a balance of


design expenditure on various building components, and to keep expenditure within

the amount that is allowed by the client (Maver 1979; Karashenas 1984; Seeley 1996

p.6; Flanagan and Tate 1997 p.13; and Ashworth 1999 pp. 9-10). In practice, it

may involve the study of the client’s requirements, the possible effects on the

surrounding areas if the development is carried out, the relationship between space

and shape, the assessment of the initial cost, the reasons for, and methods of,

controlling costs and the estimation of the life of the building and materials

(Ashworth 1999 pp.9-10; Ferry et al. 1999, pp.26-28).

2.4 Cost Forecasting in the Cost Planning and

Control Process

To avoid ambiguities in the understanding of commonly used terms such as

‘cost planning’ and ‘cost forecasting’ for describing the activities involved, and

‘estimate’, ‘forecast’ and ‘prediction’ for describing the output produced, their

corresponding definitions are reviewed. The terms ‘cost planning’ and ‘cost

control’ are defined by Seeley (1996).

Cost planning – a systematic application of cost criteria to the design

process, so as to maintain in the first place a sensible and economic

relation between cost, quality and utility and appearance, and in the

second place, such overall control of proposed expenditure as

circumstances may dictate. (p. 22)


Cost control – all methods of controlling the cost of building projects

within the limits of a predetermined sum, throughout the design and

construction stage. (p. 23)

With reference to his definitions, there are four key elements in the process of

cost planning and control. First, it is necessary to produce a base figure as a cost

target or a cost limit. Second, the analysis of cost and the production of a probable

building cost is an iterative process. Third, the cost study requires the application

of knowledge of how to relate building design to building economics. Fourth, the

cost target, or cost limit, is used to monitor the probable building cost. In short, the

process is an iterative process that forecasts the building cost based on available

information such as drawings and specifications (costing a design) and revises the

drawings and specifications to ensure that the building cost falls within the limit of

a predetermined sum (designing to a cost) (Jaggar et al. 2002 p.11).

The terms ‘cost’ and ‘cost forecasting’ are defined in the first chapter of the

book, Cost Modelling, edited by Skitmore and Marston (1999):

Cost – the cost of the contract to the client. This is the value of the

lowest bid received for the contract, or the contract sum. (p. 18)

Cost forecasting – the process of forecasting the client’s cost. Cost

forecasting is a part of the cost evaluation (planning and control)

process. (p. 18)

Obviously, the cost of a building under a building contract formed between a

building owner and a contractor is different from the cost of production of that

building. Their relationship varies according to many unquantifiable factors such as


market condition and project risks, etc. As one person’s price is another person’s

cost, the terms ‘price’ and ‘cost’ of a building refer to the amount received by a

contractor and the amount paid by an owner respectively (Raftery 1991, pp. 30-32).

To avoid ambiguity, the terms ‘price’ and ‘cost’ are used synonymously throughout

this thesis in the sense of the cost to building owners.

According to the definitions of cost planning and cost forecasting, the latter

determines what the future cost will be, whereas the former determines what it

should be. Cost forecasting is an input to the cost planning process, or a

sub-process of cost planning. The importance of cost forecasting is often

understated compared with cost planning. Armstrong (1985 p.6) argues that

alternative plans can be compared only if reasonable forecasts can be made, and that

forecasting should be considered as being as important as planning.

The term ‘forecast’ is distinguished from ‘estimate’ and ‘prediction’. An

estimate is made of quantities that may exist before, during or after the event under

consideration. Forecasting requires a prior estimate, and is a subset of the

estimating task (Skitmore et al. 1990 p.3). An estimate of a future event must by

definition be a forecast, whereas an estimate of an event that is based on information

that contains the event itself is a prediction (Skitmore and Marston 1999 p. 19). In

statistical parlance, a prediction is an estimate of formulae. A forecast, however, is

an estimate of a similar value that is outside of the database (Skitmore et al. 1990

p.4). As this research concerns the development of cost models for the estimation

of future events, the term ‘forecast’ is used throughout the thesis.


2.5 Design Process and Designers’ Forecasts

Making appropriate design decisions is crucial to the success of a project,

because design changes that involve vast expenditure, future changes or variations in

decisions, especially after the commencement of construction, often lead to

redundancy and waste in terms of work completed and resources deployed. Some

decisions on design may have long-term consequences or may be unrecoverable.

Design decisions are solutions to problems of function, form, time and economy for

buildings (Peña and Parshell 2001). Referring to Figure 2-1, which exhibits the

process of the search for design solutions as an iterative process of analysis,

synthesis, and appraisal (Maver 1970), it can be seen that building cost forecasts are

used at the appraisal stage to assist in the making of decisions to achieve an

economic objective. These forecasts are also referred to as ‘designers’

forecasts’ (Ashworth and Skitmore 1983), as it is the building design that gives the

information for forecasting and determines whether value can be achieved at an

acceptable cost (Morton and Jaggar 1995 p. 9). As clients need reliable cost advice

to enable the assessment of the viability of a project as soon as is possible (Fortune

and Lees 1994), designers’ forecasts help to make them aware of their probable

financial commitments before any extensive design work is undertaken (Seeley 1996

p. 54).

The outline plan of work of the Royal Institute of British Architects (RIBA)

(RIBA 1991) divides the construction process into 12 stages. It gives a

comprehensive picture of the information that is required and the tasks that need to

be completed at each stage of work. There are four stages in between the

appointment of various professionals and the production of tender information.


They are the feasibility, outline proposal, sketch design and detailed design stages.

During these stages, quantity surveyors are responsible for producing designers’

forecasts according to the information that is provided, and the most important goal

of these forecasts is to give a forecasted value of the work that is as close to the

unknown value of the lowest tender as possible. Although designers’ forecasts at

different stages share the same goal, the levels of influence differ according to the

qualities of these forecasts. Figure 2-2 illustrates the level of influence of the

different project stages on project cost. It shows that the level of influence drops

drastically from the planning and design stage to the procurement and construction

stage, even though the percentage of actual cost spent in relation to the overall

building cost is small in the former stage. This is also reinforced by current studies,

which suggest that the commitment of a construction cost before a sketch design is

formalised may amount to over 80% of the final potential cost (Skitmore 1985; Ferry

et al. 1999 pp.95-96). Figure 2-3 shows the suggested accumulated commitment

expenditure against the design time. As demonstrated, early decisions are more

cost sensitive, and thus the quality of early stage forecasts plays a more influential

role in the final value of buildings than the quality of later forecasts.

Skitmore et al. (1990 p.5) suggested five primary determinants that affect the

quality of forecasting: the nature of the target, the information used, the forecasting

technique used, the feedback mechanism used and the person providing the forecast.

The forecasting technique for early stage forecasting is identified as the study area

for this research.


Figure 2-1: Model of design process (Source: Maver 1970 p.200)

Figure 2-2: Level of influence on project cost (in per cent) (Source: Barrie and

Paulson 1978 p. 154)


This figure is not available online. Please consult the hardcopy thesis available from the QUT Library


Figure 2-3: Designers’ commitment to expenditure (Source: Ferry et al. 1999 p.

96)

2.6 Early Stage Forecasting in Practice

Bennett et al. (1979) classified conventional designers’ forecasting techniques

into eight categories: cost limit calculation, floor area method, functional unit method,

elemental cost estimation, lump sum estimation, cost per square metre for

functional use, approximate quantities, and pricing the bill of quantities. None of

these techniques takes into account how building cost is actually incurred by

contractors. In traditional procurement, the responsibilities of design and

construction are separately carried out by two groups of professionals. Designers’

forecasts are usually prepared by professional quantity surveyors who do not

normally have access to the cost data in contractors’ accounts. These data show how

the actual cost of construction is incurred by a contractor. Due to the lack of these data,

forecasters can only refer to the prices and unit rates from returned tenders. Out of


the eight types of techniques involved, the most frequently used before the

preparation of a tender are the floor area method, elemental cost estimation and

approximate quantities. The first method assumes that the building price is

proportional to its floor area. The second method divides a building into a set of

elements and assumes that the cost of an element is proportional to the unit of

measurement that is defined for that element. The third method requires the

calculation of quantities of the major items of a building and pricing them by means

of composite unit rates. The popularity of these methods is unrelated to their

efficacy. Moreover, there is a major criticism about the lack of a theoretical

relationship for the application of these methods at different design stages.
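
The proportionality assumptions behind the first two methods can be sketched as below. This is an illustrative example only: the function names, the element names and the unit rates are hypothetical, and in practice the rates would be taken from returned tenders for comparable projects.

```python
def floor_area_forecast(gross_floor_area, rate_per_square_metre):
    """Floor area method: price is assumed proportional to floor area."""
    return gross_floor_area * rate_per_square_metre


def elemental_forecast(element_quantities, element_rates):
    """Elemental cost estimation: the cost of each element is assumed
    proportional to the unit of measurement defined for that element.

    Both arguments are dictionaries keyed by element name.
    """
    return sum(quantity * element_rates[name]
               for name, quantity in element_quantities.items())
```

For example, `floor_area_forecast(1000, 5000)` multiplies a 1,000 square metre floor area by an assumed rate of 5,000 per square metre, whereas `elemental_forecast` sums a separate quantity-times-rate product for each element, which needs more design information.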

Conventional forecasting techniques are applied when there is a trade-off

between forecasting accuracy and the time that is available for forecasting or

between forecasting accuracy and the adequacy of available forecasting information

(Taylor 1984). According to the results of surveys on the forecasting techniques that

are employed by practitioners in Nigeria (Akintoye et al. 1992), South Africa (Bowen

and Edwards 1998) and the United Kingdom (Fortune and Lees 1996; Fortune and

Hinks 1998), conventional forecasting methods are still in widespread use, and their

applications outnumber those of the newer models. The UK survey, which was

conducted by Fortune and Hinks (1998), also prioritised the model selection criteria

for practising forecasters. Table 2-1 shows the identified model selection criteria in

descending order of importance. The two highest-ranked criteria are the availability

of data and the data that is needed for a model.


Table 2-1: Model Selection Criteria (extracted and modified from Fortune and Hinks 1998)

Model selection criteria identified by Fortune and Hinks (1998) (UK), in descending order of importance:

1. Amount of project data availability
2. Data need for model
3. Forecasters understanding of the model
4. Time available for forecast preparation
5. Project type
6. Accuracy of model output
7. Forecasters experience of model in-use
8. Amount of risk in project decisions
9. Ease of model application
10. Feedback from previous forecasts
11. Complexity of the project
12. Speed of the model in use
13. Human resources required to operate model
14. Site characteristics of the project
15. Level of awareness of new models
16. Flexibility of model in use
17. Project size
18. Nature of the client
19. Market conditions
20. Cost of using model
21. Design consultants for the project
22. Quality levels required in the project
23. Availability of computers for use with model
24. Relationships between the forecaster and manager
25. Anticipated height of project
26. Geographic location of the project

Model selection criteria identified by Akintoye et al. (1992) (Nigeria): other criterion found: (1) Expected frequency of model use

Model selection criteria identified by Bowen and Edwards (1998) (South Africa): other criteria found: (1) Cost of Project; (2) Client Sophistication

Although the approximate quantities method is generally thought to be more

accurate (Fortune and Lees 1996), as it utilises more data, a recent study surprisingly

found the opposite result, that the floor area method is significantly more accurate than

the approximate quantities method (Skitmore and Drew 2003). Moreover, the

approximate quantities technique requires more detailed design information and more

time to prepare. By contrast, the floor area method is very rough, and requires much


less information and effort to produce. As design decisions that are made before the

completion of the sketch design stage are far more important than those that are made

afterwards (refer to Figures 2-2 and 2-3), it would be more worthwhile to spend time on

improving the early stage forecast, from the perspective of cost and benefit.

2.7 Problems of Existing Forecasting Practice

2.7.1 Misconception of the relationship between level of detail and

forecasting accuracy

Bennett et al. (1979) found that some forecasters applied quite detailed

estimating techniques at a very early stage of the design planning process without

taking into account the corresponding accuracy. Practising forecasters generally

believe that forecasts that are produced from more detailed quantities are more

accurate. Thus, forecasters usually attempt to measure quantities in as much detail

as possible within the limitations on the available data and allowable time. This

explains why forecasters ranked very highly the three model selection criteria of

“amount of project data available” (most important), “data needed for model”

(second most important) and “time available for forecast preparation” (fourth most

important) in the model selection criteria survey.

The conviction of practising forecasters that the more detailed a forecast is,

the higher its accuracy, remains a proposition only. Skitmore (1991) highlights the

need for the assessment of the performance of individual techniques:


“the standard construction price forecasting texts all assert that more

detailed forecasting techniques such as those using approximate

quantities are ipso facto necessarily of better quality than coarser

techniques such as the floor area method . . . very little research

seems to have been attempted in establishing the validity of this

assertion or of the relative quality of individual techniques.” (p. 12)

Ironically, empirical evidence of forecasting accuracy reveals that very little

improvement can be made to overall accuracy simply by increasing the level of detail

and the complexity of quantity-based methods (Ashworth and Skitmore 1983; Ross

1983; Morrison 1984; Beeston 1987). This could be due to the fact that factors

such as the type, size and shape of buildings that are not counted in quantity-based

methods have a greater significance, and that prices are closely related to market

forces and therefore, to an extent, are divorced from actual costs (Skitmore 1995).

More empirical studies on forecasting accuracy are reviewed in Chapter 4.

2.7.2 Lack of theoretical background

In the preface of the book, Building as an Economic Process: An Introduction

to Building Economics, Bon (1989) raises the question “why then is building

economics developing at such a sluggish pace, and what are the reasons for its lack

of professional recognition?” He opines that it is because the field lacks a

theoretical foundation. Although effort has been expended in the development of

advanced forecasting systems, theoretical development has not been forthcoming

(Skitmore 1988). With the assistance of information technology today, forecasting


researchers are now faced with an unmanageable amount of data but no theoretical

basis for analysis (Skitmore and Patchell 1990).

2.7.3 Lack of performance evaluation

Forecasters generally assume that a forecast is correct, and that the error is in

the difference between the forecast and tender price (Morrison 1983; Fortune and

Lees 1996). In his study of cost planning and the forecasting techniques that are

used in practice, Morrison (1983) finds that no forecaster in practice monitors their

own forecasting performance against received tenders. Forecasters are too

optimistic about their own forecasting performance, and pay very little explicit

attention to the confidence limits that are attached to the forecasted range of prices

within which the eventual outcome is expected to fall (Bowen and Edwards 1985a).

Practitioners often neglect the importance of producing accurate forecasts.

An opinion survey conducted amongst architects and quantity surveyors

found that a significant number of respondents expected a great degree of accuracy

from price forecasts that are produced by quantity surveyors (Bowen and Edwards

1985a). Empirical studies also show that clients are generally dissatisfied with the

quality of strategic cost advice that is provided by their professional advisors (Ellis

and Turner 1986; Proctor et al. 1993). These studies reveal that there is room for

forecasters to improve, that forecasters have traditionally had no awareness of their

own performance and that forecasters should monitor and find ways to improve the

quality of cost advice to satisfy the needs of their clients.
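
As a minimal sketch of what such monitoring could involve, the following Python fragment compares past forecasts with the tenders later received and reports the mean percentage error (bias) and the mean absolute percentage error. The project figures and the function name `forecast_performance` are hypothetical and are used only for illustration; they are not drawn from the studies cited above.

```python
from statistics import mean

def forecast_performance(forecasts, tenders):
    """Return (bias, mape) for forecasts measured against received tenders.

    bias: mean percentage error (positive means forecasts were too high on average)
    mape: mean absolute percentage error (size of the errors regardless of sign)
    """
    if len(forecasts) != len(tenders) or not forecasts:
        raise ValueError("need matched, non-empty forecast and tender lists")
    pct_errors = [(f - t) / t * 100.0 for f, t in zip(forecasts, tenders)]
    bias = mean(pct_errors)
    mape = mean([abs(e) for e in pct_errors])
    return bias, mape

# Hypothetical records (in millions) for five completed projects.
forecasts = [10.0, 22.0, 15.0, 30.0, 8.0]
tenders = [9.0, 20.0, 16.0, 25.0, 8.0]

bias, mape = forecast_performance(forecasts, tenders)
print(f"bias = {bias:+.1f}%, mean absolute error = {mape:.1f}%")
```

A record of this kind, kept project by project, is the least that would be needed before a forecaster could state confidence limits with any empirical support.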

2.7.4 Inexplicability, unrelatedness and determinism

The use of forecasting methods in practice is subjective, although research

studies on the formalisation of the model selection process have been carried out

(Fortune and Hinks 1997, 1998). Forecasters sometimes use a mixture of different

techniques to manage the forecasting task without a clear rationale. For example, a

forecaster may use the floor area method to forecast a part of the work for which they

have little data to refer to, and use the approximate quantities method for the rest of

the work for which more detailed data exists. These conventional methods were

mainly developed by rule of thumb without any attention being paid to the theory

behind them, and their use in combination is theoretically baseless.

The reliability of the forecasts that are produced by conventional methods

depends on the reliability of each quantity value, the reliability of each unit price rate

value, the number of items and the collinearity of the quantity and rate values

(Skitmore and Patchell 1990). It is doubtful, however, that unit price rates that are

derived deterministically from a number of historical projects can produce accurate

forecasts. Moreover, to use process-biased data (e.g. historical price rates, which

tend to reflect the utilisation of available resources) for design-biased forecasting

models would be to imply either that production methods do not differ, or that

differing production methods do not significantly affect cost, both of which are

patently untrue (Bowen and Edwards 1985a). Furthermore, the supposition that

forecasts will be accurate only if the quantities and unit price rates can be determined

ignores the variability of unit price rates (Flanagan and Norman 1983). There is no

explicit qualification with regard to the inherent variability and uncertainty of the

conventional models.
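
The way in which item reliability, the number of items and the correlation between items combine can be illustrated with a short worked derivation. It is offered as a sketch only and is not the formulation of Skitmore and Patchell (1990): the symbols $p_i$, $\mu$, $\sigma$ and $\rho$ are introduced here, the equal-item case is an assumption, and correlation between item prices is used as a simple stand-in for the collinearity effect.

```latex
% Total price as a sum of item prices p_i = q_i r_i
P = \sum_{i=1}^{N} q_i r_i = \sum_{i=1}^{N} p_i ,
\qquad
\operatorname{Var}(P) = \sum_{i=1}^{N} \operatorname{Var}(p_i)
  + 2 \sum_{i<j} \operatorname{Cov}(p_i, p_j).

% Assumed special case: every item has mean \mu and standard deviation \sigma,
% and every pair of items has the same correlation \rho
\operatorname{Var}(P) = N\sigma^{2} + N(N-1)\rho\sigma^{2}
  = N\sigma^{2}\,[\,1 + (N-1)\rho\,],

\mathrm{c.v.}(P) = \frac{\sqrt{\operatorname{Var}(P)}}{N\mu}
  = \frac{\sigma}{\mu}\,\sqrt{\frac{1 + (N-1)\rho}{N}} .
```

Under these assumptions, uncorrelated items ($\rho = 0$) give an overall coefficient of variation that falls in proportion to $1/\sqrt{N}$, whereas perfectly correlated items ($\rho = 1$) give no improvement however many items are measured.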

To conclude, conventional forecasting methods and approaches suffer from

their inexplicability, unrelatedness and determinism (Brandon 1982; Wilson 1982;

Taylor 1984; Bowen and Edwards 1985a, 1985b; Bowen et al. 1987). In short,

these approaches fail to explain the systems they purport to represent, fail to show

the relationship and interdependency between the variables and fail to consider the

variability and uncertainty of forecasting.

2.8 Summary

Building price forecasting is a sub-process of cost planning. It helps

decision makers to be aware of the probable financial commitments before extensive

design work is undertaken. After all, a decision to build can be put forth, or

alternative plans compared, only if a reasonable forecast can be made.

Although the forecasts that are made at different stages have similar functions,

their levels of influence are different, because a design decision that is made in the

early stages is more cost sensitive than the same decision made later. Thus, early

stage forecasts play an influential role in the final value of buildings.

At the early design stage, the type of information that is available for the use

of forecasting is usually very rough, and practising forecasters use simple single unit

methods, such as the floor area method, to accomplish the forecasting task.

Practitioners generally believe that accuracy is proportional to the level of detail of

the forecast. This perception is reflected in their choice of forecasting model, for

which they consider the amount of data available to be the most significant criterion.

Paradoxically, the simple floor area method is found to be more accurate than the

detailed approximate quantities method.

A few empirical surveys on forecasting practices have been undertaken in

different countries. They all show that conventional forecasting methods, such as

the floor area and approximate quantities methods, still dominate, despite the fact

that plenty of new alternative models have been developed. Several problems of

existing forecasting practices are identified. They include the misconception of the

relationship between level of detail and forecasting accuracy, the lack of theoretical

background, the lack of performance evaluation, and the inexplicability,

unrelatedness and determinism that are rooted in the forecasting approach.

Therefore, the direction of development for new models should focus on the features

of logical transparency (i.e., be theoretically supported), interdependence (i.e., show

the relationship between variables) and stochastic variability (i.e., allow the output to

be expressed in probability terms). The performance of new models should also be

measured empirically to demonstrate their forecasting ability.

Chapter 3 Development of Forecasting Models

If the moon's face is red, of water she speaks.

Saying of the Zuni Indians of the Southwest

3.1 Introduction

The first recorded forecasting method was the cube method, which was

invented about 200 years ago (Skitmore et al. 1990 p. xix). The more widely used

floor area method was developed around 1920 (Skitmore et al. 1990 p. xix).

Starting from the mid-1950s, more and more research has focused on the

development of alternative forecasting cost models. One of the pioneers, James,

developed the storey enclosure method in 1954 as an alternative to the floor area

and cube methods for early stage forecasting. As a method developed 50 years ago,

it possesses the inherent problems that are explained in Chapter 2. However, James

identifies some possible variables other than total floor area and building volume that

might influence building cost. These variables attempt to explain the variability of

building shapes, the vertical positioning of the floor areas, storey heights and the

presence of basements in the design of a building. The author also demonstrates

(although only through a very crude comparison) that the accuracy of his proposed

storey enclosure method is greater than that of the floor area and cube methods.

The storey enclosure model is considered to be the most sophisticated model of all of

the single price-rate models (as elaborated in Section 3.8) that are used for

forecasting in the early design stage, but despite the empirical evidence for the

performance of the storey enclosure model, it has not been widely used by practising

forecasters.

The conventional methods, such as the approximate quantities and elemental

cost methods, are the cost models that express building costs as a function of

quantities and unit rates. Extensive studies on the subject of cost modelling were

conducted in the mid-1970s. Researchers started to apply statistical techniques to

modelling. A wider variety of cost modelling techniques in the categories of

simulation, generation and optimisation have been developed in the past 30 years.

However, a lot of research on model development focuses on the way in which new

alternative models are different from other models, and stresses their uniqueness.

There is a lack of clear demonstration of the applicability of these models, which is

considered to be the biggest obstacle to their practical application.

3.2 Definition of Cost Model

The English word, ‘model’, comes from the Latin word, ‘exemplum’, which

means the manner, fashion, or example to be followed, a precedent and an example

of what may happen. A model is a representation of a structure, or an “organised

body of mutually connected and dependent parts” (Holes 1987). The etymology

suggests that a model only represents the general picture of what may occur. It is

clear enough from its definition that uncertainties do exist within it. A model that is

developed from historical information or experience can represent reality, but it does

not thereby become reality (Beeston 1974; Bowen 1984).

Seeley (1996 pp. 202-203) defines the word ‘model’ as “a procedure

developed to reflect, by means of derived processes, adequately acceptable output for

an established series of input data”. Therefore, a building cost forecasting model is

a system that produces forecasted prices (output) from historical data (input).

Beeston (1987 p. 46) considers that all forecasting methods can be described as cost

models, which are classified as in-place quantity-based, descriptive or realistic, and

their task may be to forecast the cost of a whole design or of an element of it, or to

calculate the cost effect of a design change.

Cost models are technical models that are used to assist in the evaluation of

the financial implication of building design decisions (Maver 1979). Skitmore and

Marston (1999 pp. 2-4) differentiate technical models from isomorphic models. The

former type features an important step in the abstraction of the most significant

influencing elements at the beginning of the model development process, whereas

the latter type involves the mapping of every influencing element within the results,

which is expensive and is not cost effective. As buildings are composed of

thousands of items, involve hundreds of companies in their production and take years

to complete, the number of elements that influence building costs is huge. Building

a cost model requires the selection of a sub-set of major influencing elements, which

is an exercise in cost-benefit trade-off. Even if the resources were available, it would

be impossible to construct an isomorphic model for building costs due to individual

variation between projects (Kenley and Wilson 1986).

The purposes of cost models are to forecast the total cost that the client will

have to pay for the building at any stage in the design evolution, to compare a range of

actual design alternatives at any stage in the design evolution, to compare a range of

possible design alternatives at any stage in the design evolution, and to forecast the

economic effects upon society of changes in design codes and regulations (Skitmore

and Marston 1999 p. 9).

3.3 Brandon’s “Paradigm Shift”

Although many experimental cost models were generated in the 1970s (for

example: Buchanan (1972), Regdon (1972), Kouskoulas and Koehn (1974), Braby

(1975), McCaffer (1975), Wilson and Templeman (1976), Flanagan and Norman

(1978)), few were able to challenge the existing forecasting approaches. Nobody had

probed the possibility that the existing forecasting models might actually be wrong

until Brandon (1982) addressed the need for a paradigm shift in building cost

research. He doubted the reliability of existing forecasting models, and urged the

development of a cost model that is founded on solid theory. With the assistance of

computer technology, which makes complicated calculation much easier than before,

simulation is suggested as the direction for further research investigation, because it

gives a better understanding of why certain costs arise.

This new approach sets out more explicit and sound criteria for model

development. Brandon’s view is inspiring and visionary. In response to Brandon’s

suggestion, Bowen and Edwards (1985a) review the existing paradigm, and address

who it is that needs a new paradigm and why it must be a new paradigm. The

authors believed that the new approach to cost modelling and price forecasting after

the shift would entail the recognition both of the continuing need for historically

derived data in the exploration of cost trends and relationships, and the recognition of

the importance of the building process by the incorporation of significant aspects of

resource utilisation into the estimation methods. They also believed that the new

approach would insist on inferential statements backed by statistically reliable data,

that the approach would be stochastic in creatively dealing with future uncertainty

through the use of probabilistic techniques, and that it would simulate reality and be

capable of demonstrating the strength and associative characteristics of the

relationships that exist between the factors involved. Forecasters would then profit

from the knowledge base that would be gained through their expert understanding of

the field, and be capable of using this systematically to provide logically coherent

solutions to cost modelling and price forecasting problems.

Beeston (1987) does not rule out the use of descriptive methods (that is,

those that contain variables that describe the design and its environment by

measurements of such factors as size, shape, type of construction and location),

despite their inherent deficiencies. He considers that they would be suitable both

for forecasting at the early planning stage and for forecasting the maintenance costs

of estates.

Both Beeston (1987 p. 18) and Bowen et al. (1987) suggest that the

development of modelling systems for the purpose of design economics should

attempt to represent as closely as possible the way in which costs are actually

incurred. As is highlighted in Chapter 2, the conventional approach is ill equipped

because of its inexplicability, unrelatedness and determinism, and thus the

development of new cost models should shift towards logical transparency,

interdependence and stochastic variability (Bowen et al. 1987).

3.3.1 Black box versus realistic models

There are two distinct ways of representing costs – the realistic approach and

the “black box” approach (Beeston 1983). The realistic approach attempts to

represent the ways in which costs arise, whereas the black box approach does not.

The former approach identifies all of the direct causes of cost, and measures them

directly. This involves the detailed comparison of methods and prototype structures,

and thus this approach has the best potential accuracy. However, the data that is

required for the realistic approach is extremely difficult, if not impossible, for

forecasters that represent clients to acquire (Hardcastle 1984). Although it is

possible, even at the early design stage when information is scant, to use the realistic

approach through the simulation of production operations (such as CPS (Bennett and

Ormerod 1984) and CASPAR (Thompson and Willmer 1985)), forecasters still

prefer to use black box models. This is partly because the way that cost is incurred

is not a perfect function of the building design, and thus forecasters have to make

additional assumptions to convert design information into production information if

the realistic approach is used. These additional assumptions will inevitably create

extra complications.

Thus, models for very early stage forecasting are unavoidably inexplicable,

but their performance can still be judged, and indeed, the justification for the black

box approach rests on its actual performance. It is measured by comparing the

output of the model that is based on the black box approach in response to certain

stimuli with the output of the prototype under the same stimuli.

Both the black box approach and the realistic approach have their raison

d'être. Choosing which of them to use depends on the purpose of the model

(Skitmore and Patchell 1990). The realistic approach needs structural validation to

test its soundness, but it has the benefit of being explanatory. However, the black

box approach uses model performance in model testing.
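
A performance test of this kind can be sketched as follows, assuming a single-rate floor area model (P = qr) and a small set of hypothetical projects. Each project is forecast from the mean rate of all the other projects and the forecast is then compared with the actual price. The leave-one-out procedure, the data and the function name `leave_one_out_ratios` are illustrative assumptions and are not taken from the sources cited.

```python
from statistics import mean, stdev

def leave_one_out_ratios(areas, prices):
    """Forecast each project from the mean price per unit area of all the others.

    Returns the forecast/actual ratio for every project in the sample.
    """
    ratios = []
    for k in range(len(areas)):
        other_rates = [prices[i] / areas[i] for i in range(len(areas)) if i != k]
        rate = mean(other_rates)        # single price rate from the other projects
        forecast = rate * areas[k]      # P = q * r
        ratios.append(forecast / prices[k])
    return ratios

# Hypothetical sample: gross floor area (m2) and actual tender price.
areas = [1200.0, 2500.0, 1800.0, 4000.0, 3100.0]
prices = [2.6e6, 5.0e6, 4.1e6, 7.6e6, 6.5e6]

ratios = leave_one_out_ratios(areas, prices)
cv = stdev(ratios) / mean(ratios) * 100.0
print(f"mean forecast/actual ratio = {mean(ratios):.3f}, c.v. = {cv:.1f}%")
```

Because every forecast is made without sight of the project being forecast, the resulting ratios describe how the model behaves on unseen work, which is the sense in which a black box model can be judged without being explained.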

3.3.2 Deterministic versus stochastic models

A model without a formal measure of uncertainty is, by definition, a

deterministic model. Conventional models generally only give a single-figure

estimate as their output without recognising the reality of the inherent variability and

uncertainty, and are thus deterministic models. The variability and uncertainty are

not formally assessed, but are more often dealt with intuitively by forecasters. By

contrast, if the duration and cost of activities or groups of activities are recognised as

being uncertain, then they will be modelled as stochastic variables using a

probabilistic approach (Bowen and Edwards 1985a). Formal measures of

uncertainty may be articulated as the associated coefficient of variation (as in

regression) or the cumulative frequency distribution (as in the Monte Carlo

Simulation) (Newton 1990). The application of probabilistic approaches to the

problems of building economics has been demonstrated through various studies such

as those of Spooner (1974), Mathur (1982), Wilson (1982) and Diekmann (1983).

Despite the different considerations of uncertainty that are discussed, the earlier

studies do not challenge the validity of the hidden assumptions, for example, that the

events that are simulated are independent events, and that the use of normal and

rectangular frequency distributions is appropriate in the application of the Monte

Carlo Simulation (Raftery 1984b). More recent works by Chau (1995a; 1995b) and

Wall (1997) validate these assumptions in their application of the Monte Carlo

Simulation. The test of underlying assumptions in the modelling process is an

indication of the sophistication of the simulation techniques.
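
The difference between a single-figure estimate and a stochastic one can be illustrated with a small Monte Carlo sketch. It assumes three hypothetical cost items, each drawn independently from a triangular distribution; both the independence and the choice of distribution are assumptions of exactly the kind that the text above says should be tested, and the function name `simulate_total_price` is introduced here for illustration.

```python
import random
from statistics import mean, stdev, quantiles

def simulate_total_price(items, runs=10_000, seed=1):
    """Monte Carlo simulation of a total price built from uncertain item costs.

    items: list of (low, mode, high) triples, one per cost item; each item is
           sampled independently from a triangular distribution (an assumption).
    Returns the list of simulated totals.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        # random.triangular takes its arguments in the order (low, high, mode)
        total = sum(rng.triangular(low, high, mode) for low, mode, high in items)
        totals.append(total)
    return totals

# Hypothetical three-item estimate (low, most likely, high), in millions.
items = [(4.0, 5.0, 7.0), (2.0, 2.5, 3.5), (1.0, 1.2, 1.8)]

deterministic = sum(mode for _, mode, _ in items)   # single-figure estimate
totals = simulate_total_price(items)
cv = stdev(totals) / mean(totals) * 100.0           # formal measure of uncertainty
deciles = quantiles(totals, n=10)                   # nine cut points: 10%, ..., 90%

print(f"deterministic estimate: {deterministic:.2f}")
print(f"simulated mean: {mean(totals):.2f}, c.v.: {cv:.1f}%")
print(f"10% to 90% range: {deciles[0]:.2f} to {deciles[-1]:.2f}")
```

The deterministic figure is simply the sum of the most likely values, whereas the simulation yields a distribution from which a coefficient of variation or a cumulative frequency curve can be read. Because the assumed distributions here are skewed towards higher costs, the simulated mean lies above the single-figure estimate.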

3.3.3 Deductive versus inductive models

Approaches to modelling cost in construction can also be classified as

deductive or inductive (Wilson 1982; Raftery 1984b). Models that are developed

from the former approach involve the analysis of cost data over design variables

(whichever are being considered) with the objective of deriving formal mathematical

expressions that succinctly relate a wide range of design-variable values to price.

This approach draws heavily upon the techniques of statistics, and of correlation and

least-squares regression in particular. Deductive models arise largely from the

following equation:

$$P = f_1(V_1, V_2, V_3, \ldots, V_n), \qquad (3.1)$$

where P is the forecasted price, which is a function, f1, of the design variables,

V1, V2, V3, … Vn. The crucial constraints to the deductive approach include the not

inconsiderable limitations of the statistical techniques that are available for

modelling, and the total dependence upon the suitability of the cost data used.

Inductive models do not involve the analysis of a set of given cost data, but

rather the synthesis of the costs of individual discrete design solutions from the

constituent components of the design. Inductive methods require the summation of

cost over some suitably defined set of subsystems that are appropriate to the building

design. The most detailed level of subsystem definition would be the individual

resources themselves, but several other levels of aggregation are in common use, for

example, operational activities and constructional elements. Inductive models arise

largely from the equation:

$$P' = \sum_{j=1}^{n} f_j(C_j), \qquad (3.2)$$

where P’ is the forecasted price, which is the summation of each cost

function fj of the resources committed, Cj, for j equal to 1 to n, where n is the total

number of subsystems that represent the prices.

In deductive models, the techniques of statistical inference are used to deduce

the relationships between building features or design models, whereas in inductive

models the resource implications of design decisions are calculated and aggregated to

measure economic performance. Thus, the former models are more relevant to the

early design stages and the latter models to the later design stages.
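
The two forms can be contrasted in a short Python sketch that follows the terminology used in this section. The deductive model is reduced to a least-squares line in a single design variable (gross floor area), and the inductive model takes each cost function to be quantity multiplied by unit rate. The data, the choice of a single variable and the function names `fit_deductive` and `price_inductive` are hypothetical simplifications.

```python
def fit_deductive(v, p):
    """Least-squares fit of P = a + b*V from historical (V, P) pairs.

    A one-variable instance of Equation 3.1.
    """
    n = len(v)
    v_bar = sum(v) / n
    p_bar = sum(p) / n
    sxx = sum((x - v_bar) ** 2 for x in v)
    sxy = sum((x - v_bar) * (y - p_bar) for x, y in zip(v, p))
    b = sxy / sxx
    a = p_bar - b * v_bar
    return a, b

def price_inductive(subsystems):
    """Sum the cost of each subsystem, as in Equation 3.2.

    Each cost function f_j is taken here to be quantity * unit rate.
    """
    return sum(quantity * rate for quantity, rate in subsystems)

# Hypothetical historical projects: gross floor area (m2) and tender price.
floor_areas = [1000.0, 2000.0, 3000.0, 4000.0]
prices = [2.1e6, 3.9e6, 6.1e6, 7.9e6]
a, b = fit_deductive(floor_areas, prices)
deductive_forecast = a + b * 2500.0     # forecast for a new 2500 m2 design

# Hypothetical breakdown of the new design: (quantity, unit rate) per subsystem.
subsystems = [(2500.0, 900.0), (1800.0, 1200.0), (600.0, 700.0)]
inductive_forecast = price_inductive(subsystems)

print(f"deductive: P = {a:.0f} + {b:.1f} * V gives {deductive_forecast:.0f}")
print(f"inductive: P' = {inductive_forecast:.0f}")
```

The deductive forecast needs only the floor area of the new design, which is why it suits the early stages, whereas the inductive forecast cannot be made until quantities for each subsystem have been measured.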

3.4 Major Directions of Model Development

Newton (1990) classifies nine descriptive primitives for cost modelling

studies: data, units, usage, approach, application, model, technique, assumptions and

uncertainty. Table 3-1 briefly explains the meaning of each primitive and its

corresponding classification criteria. The descriptive primitives for this research are

also exhibited in the table. Table 3-2 shows a summary of the reviewed research

studies on modelling techniques and applications according to Newton’s

classification. The number of studies on modelling techniques for early design

stages (feasibility and sketch design) exceeds the number for later design stages.

This circumstance seems reasonable because, as is discussed in Chapter 2, design

decisions are more cost sensitive at the early stage than they are at the later stage, and

the potential benefit of developing a good model for the early design stage is

therefore greater. Thus, the development of designers’ forecast models focuses on

their application in the early stages of design.

Skitmore and Patchell (1990) review all of the modelling techniques that have

been developed in the building and process plant industries. The authors

differentiate the various techniques one by one according to their characteristics or

primitives, which include the mathematical model, relevant contract type, general

accuracy, whether the technique itself is deterministic or probabilistic, the number of

variables, type of variables, the characteristic of quantities (derivation, deterministic

or probabilistic (quantity model), derivation database) and the characteristic of rates

(weighting, current, quantity trend and deterministic or probabilistic (rate model)).

A summary of the characteristics for the various techniques identified is shown in

Table 3-3.

To summarise the development of cost models, Skitmore and Patchell

conclude that research has developed with differing emphasis on all of the four

factors that influence estimation reliability, although much system development has

been centred at the item level involving the search for the best set of predictors of

tender price (regression analysis), the homogenisation of database contracts by

weighting or proximity measures (BCIS and Lu Qian system), the generation of

items and quantities from contract characteristics (Holes, Calculix and expert

systems such as ELSIE) and the quantification of overall estimate reliability from

assumed item reliability (probabilistic model (PERT-COST) and simulation).

Since the 1990s, a new class of tools, neural networks, has offered

an alternative approach to cost forecasting (Li 1995; Adeli and Wu 1998; Bode 1998;

Emsley et al. 2002; Kim et al. 2004). Neural network models are black-box in

nature and usually involve complicated algorithms. The superiority of neural

network models over other mathematical models lies in their ability to learn and

adapt their own representation during the model training process. Although many

studies have demonstrated their outstanding performance, especially in terms of error

reduction, it is doubtful that practising forecasters understand these models, or have

even heard of them. As suggested in Fortune and Lees’ report on the relative

performance of new and traditional cost models in strategic advice for clients (as

addressed in Section 2.6), the possible fact that many practising forecasters are not

well equipped to understand and use these models could be a big hurdle for

their real-life application.

Table 3-1: Classification of this research according to Newton’s descriptive primitives

| Descriptive primitive | Explanation | Suggested classification | Primitives of this research |
|---|---|---|---|
| Data | Whether the data specifically relates to a type of design proposal or not | Specific or non-specific | Specific |
| Units | Whether it is a unit in abstract form, a unit of finished works or a unit of as-built works | Abstracted, finished or as-built | Finished works (floor area, external wall (building perimeter and storey height) and roof area of final product) |
| Usage | Whether the purpose is for designers’ price estimation or builders’ bidding | Cost or price | Price |
| Approach | Whether it is implemented for estimation of the whole building cost or a particular component or part | Macro or micro | Macro approach |
| Application | When the model is applied in the design process | Feasibility, sketch, detailed, tender, throughout, non-construction | Feasibility (or very early sketch) |
| Model | Common classification of techniques | Simulation, generation or optimisation | Simulation |
| Technique (see also Table 3-2) | Type of technique used | Dynamic programming, expert system, functional dependency, linear programming, manual, Monte Carlo simulation, networks, parametric modelling, probability analysis, regression analysis | Regression analysis |
| Assumptions | Whether assumptions can be accessed or not | Explicit or implicit | Explicit |
| Uncertainty | Whether there is a formal measure of uncertainty or not | Stochastic or deterministic | Stochastic |

Table 3-2: Previous studies on modelling techniques and applications according to Newton’s classification

| Techniques | Application | Previous works |
|---|---|---|
| Dynamic programming | Feasibility | |
| | Sketch | Atkin (1987) |
| | Detailed | |
| | Tender | |
| | Throughout | |
| | Non-construction | |
| Expert system | Feasibility | |
| | Sketch | Brandon (1988), Lu (1988) |
| | Detailed | |
| | Tender | |
| | Throughout | |
| | Non-construction | |
| Functional dependency | Feasibility | Wilderness (1964), Thomsen (1965), Bathurst and Butler (1977), Flanagan and Norman (1978), Pegg (1984), Meijer (1987), Tan (1999) |
| | Sketch | DOE (1971), Townsend (1978), Moore and Brandon (1979), Powell and Chisnall (1981), Scholfield et al. (1982), Langston (1983), Newton (1983), Weight (1987), Boussabaine and Elhag (1999) |
| | Detailed | |
| | Tender | |
| | Throughout | Holes and Thomas (1982), Sidwell and Wottoon (1984), Berny and Howes (1987), Holes (1987), Woodhead et al. (1987) |
| | Non-construction | |
| Linear programming | Feasibility | |
| | Sketch | Russell and Choudhary (1980), Cusack (1985) |
| | Detailed | |
| | Tender | |
| | Throughout | |
| | Non-construction | |
| Manual | Feasibility | James (1954) |
| | Sketch | Dunican (1960), RICS (1964), Barrett (1970) |
| | Detailed | Gray (1982), PSA (1987), Munns and Al-Haimus (2000) |
| | Tender | |
| | Throughout | Kiiras (1987), Dreger (1988) |
| | Non-construction | |
| Monte Carlo simulation | Feasibility | |
| | Sketch | Mathur (1982), Pitt (1982), Wilson (1982), Bennett and Ormerod (1984) |
| | Detailed | |
| | Tender | Walker (1988) |
| | Throughout | |
| | Non-construction | Gehring and Narula (1986) |
| Networks | Feasibility | |
| | Sketch | |
| | Detailed | |
| | Tender | |
| | Throughout | Bowen et al. (1987), Brown (1987) |
| | Non-construction | |
| Parametric modelling | Feasibility | Tregenza (1972), Selinger (1988) |
| | Sketch | Nadel (1967), Meyrat (1969), Southwell (1971), Tregenza (1972), Brandon (1978), Selinger (1988), Warszawski (2003) |
| | Detailed | |
| | Tender | |
| | Throughout | |
| | Non-construction | Park (1988) |
| Probability analysis | Feasibility | |
| | Sketch | Zahry (1982), Cusack (1987), Pegg (1987) |
| | Detailed | |
| | Tender | Fine (1980) |
| | Throughout | Skitmore (1982) |
| | Non-construction | |
| Regression analysis | Feasibility | Buchanan (1972), Regdon (1972), Kouskoulas and Koehn (1974), Braby (1975), McCaffer (1975), Wilson and Templeman (1976), Bathurst and Butler (1977), McCaffer et al. (1984), Karshenas (1984) |
| | Sketch | Gould (1970), Buchanan (1972), Sierra (1982), Yokoyama and Tomiya (1988), Skitmore and Patchell (1990) |
| | Detailed | |
| | Tender | |
| | Throughout | Khosrowshahi (1988) |
| | Non-construction | |

Table 3-3: Summary of estimating techniques (Extracted from Skitmore & Patchell 1990)

| Estimate technique | Model | Relevant contract type | General accuracy (c.v.) | Deterministic / Probabilistic | Number of variables | Type of variables |
|---|---|---|---|---|---|---|
| Unit | P = qr | All | 25-30% | Deterministic | Single | Any comparable unit, e.g. tonne steelwork, metre pipeline |
| Graphical | P = fr(q) | Process plant | 15-30% | Deterministic | Few | Ditto |
| Functional Unit | P = qr | Buildings | 25-30% | Deterministic | Single | Ditto, e.g. number of beds, number of pupils |
| Parametric | P = fr(q1, q2, q3, …) | Process plant | 15-30% | Deterministic | Few | Process parameters, e.g. capacity, pressure, temperature, material, cost index |
| Exponent | $P_2 = P_1 (q_2 / q_1)^r$ | Process plant | 15-30% | Deterministic | Single | Size of plant or equipment, e.g. capacity |
| Factor | $P = \sum_{i=1}^{m} \mathit{fact}_i \sum_{i=1}^{N} q_i r_i$; a) m = 1 (Lang method); b) m > 1, fact1 ≠ fact2, etc. (Hand method); c) facti = U(αi, βi) (Chiltern method) | Process plant | 10-15% | Deterministic | Few | Any |
| Comparative | $P_2 = P_1 + \sum_{i=1}^{N} (p_{2i} - p_{1i})$ | All | 25-30% | Deterministic | Few | Depends on differences |
| Interpolation | P = qr | Buildings | 25-30% | Deterministic | Single | Gross floor area |
| Conference | P = f(P1, P2, …) | Process plant | ? | Deterministic | Any | Any |
| Floor Area | P = qr | Buildings | 20-30% | Deterministic | Single | Gross floor area |
| Cube | P = qr | Buildings | 20-45% (based on 86 cases) | Deterministic | Single | Volume |
| Storey Enclosure | P = qr | Buildings | 15-30% (based on 86 cases) | Deterministic | Single | Floor area, external wall area, basement wall area and roof area |
| BQ Pricing: (i) Conventional | $P = \sum_{i=1}^{N} q_i r_i$ | Construction | 10-20% (5-8% for builders) | Deterministic | Many (number of variables varies) | Quantities required under SMM |
| BQ Pricing: (ii) B Fine | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 15-20% | Deterministic | Many (number of variables varies) | Ditto |
| Significant Item Estimating | $P = \sum_{i=1}^{N} q_i r_i$ | PSA Buildings | 10-20% | Deterministic | Medium | Quantities required under SMM |
| Approximate Quantities: (i) Conventional | $P = \sum_{i=1}^{N} q_i r_i$ | Construction | 15-25% | Deterministic | Medium to many | Combining quantities and items required under SMM |

Table 3-3 (Cont’d): Summary of estimating techniques (Extracted from Skitmore & Patchell 1990)

Estimate Technique | Model | Relevant Contract Type | General Accuracy (c.v.) | Deterministic / Probabilistic | Number of variables | Type of variables
Approximate Quantities: (ii) Gleeda | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 15-25% | Deterministic | Few to medium | Ditto
(iii) Gilmore | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 15-25% | Deterministic | Few to medium | Ditto
(iv) Ross 1 | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 25% (based on 17 cases) | Deterministic / Probabilistic | Few to medium | Ditto
(v) Ross 2 | ∑ = +N iii rqp 1 | Buildings | 50% (based on 17 cases) | Deterministic / Probabilistic | Few to medium | Ditto
(vi) Ross 3 | ∑ = +N iii rqp 1 ($P_i = a + b q_i + e$, $e = N(0, \sigma^2)$) | Buildings | 30% (based on 17 cases) | | Few to medium | Ditto
Elemental | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 20-25% | Deterministic | Medium | BCIS/CI/SfB entities (UK), individual company manual (HK)
CPU | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 20-25% | Deterministic | Medium | Similar
Elsie | $P = \sum_{i=1}^{N} q_i r_i$ 2 | Offices | | Deterministic | Medium | DBE
Norms (schedule) | $P = \sum_{i=1}^{N} q_i r_i$ 2 | Buildings | 10-20% | Deterministic | Many (number of variables varies) | SMM type, e.g. PSA schedule
Regression | $P = a + \sum_{i=1}^{N} b_i q_i + e$, $e = N(0, \sigma^2)$ | All | 15-25% | Deterministic / Probabilistic | Few | Usually contract characteristics, e.g. floor area, no. of storeys
Lu Qian | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | ? | Deterministic | Few | Usually contract characteristics, e.g. floor area, no. of storeys
Resource (Scheduling Activity, Operational) | $P = \sum_{i=1}^{N} q_i r_i$ | All | 5-8% | Deterministic | Many (number of variables varies) | Resources, e.g. man hours, materials, plant
PERT-COST | $P = \sum_{i=1}^{N} p_i$, where $p_i = N(q_i r_i, \sigma_i^2)$ | All | N/A | Probabilistic | Number of variables varies | Usually time resources, e.g. man hours
CPS | ∑∑ == +=N ii N iii rnrtP 11, where $t_i = F(\mu_i, \sigma_i^2)$ | Buildings | 6.50% | Probabilistic | Usually few | Resources, e.g. man hours, materials, plant
Risk Estimating | $P = \sum_{i=1}^{N} q_i r_i$ | Construction | N/A | Probabilistic | Usually few | Any
Homogenised Estimating (BCIS on line) (BICPE etc.) | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | N/A | Deterministic | Any | Any
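
The difference between the deterministic form $P = \sum q_i r_i$ and the probabilistic PERT-COST form in the table can be illustrated with a short sketch. It is only an illustration under stated assumptions: the item prices are taken to be independent (so that their variances add), and the function names and all figures are hypothetical rather than taken from Skitmore and Patchell (1990).

```python
import math


def deterministic_price(quantities, rates):
    """Single-figure forecast P = sum(q_i * r_i)."""
    return sum(q * r for q, r in zip(quantities, rates))


def pert_cost_summary(quantities, rates, sigmas):
    """Mean, standard deviation and c.v. of P = sum(p_i), p_i ~ N(q_i * r_i, sigma_i^2).

    Assumption (not from the source): item prices are independent,
    so the variance of the total is the sum of the item variances.
    """
    mean = deterministic_price(quantities, rates)
    std = math.sqrt(sum(s * s for s in sigmas))
    return mean, std, std / mean


# Hypothetical items: quantities, unit rates and item standard deviations.
quantities = [100.0, 50.0, 20.0]
rates = [30.0, 80.0, 150.0]
sigmas = [300.0, 400.0, 0.0]

price = deterministic_price(quantities, rates)
mean, std, cv = pert_cost_summary(quantities, rates, sigmas)
print(f"deterministic P = {price:.0f}")
print(f"probabilistic P: mean = {mean:.0f}, std = {std:.0f}, c.v. = {cv:.1%}")
```

Under these assumptions the two forms share the same expected price; the probabilistic form additionally reports a spread, which is the quantity summarised by the coefficient of variation (c.v.) column of the table.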

3.5 Limitations of Cost Models

3.5.1 Model assumptions

Models are only ever a representation of reality, and forecasting models are

always non-isomorphic models that are simplifications of an actual system. Every

model has a set of inherent assumptions about problem boundaries, about what is or

is not significant and about how the user might best conceptualise a problem

(Newton 1990). Regardless of whether the assumptions of a model are explicit or

implicit, it is always possible to devise tests that show models to be deficient in some

way or other. This implies that models should be used with care, and should not be

pushed beyond the limits of their validity (Skitmore and Marston 1999).

Designers’ forecast models are structured to represent completed buildings or

their components. However, the price of a building or a component is properly

derived from the construction process and the resources that are employed.

To modify this kind of price data to suit a designers’ forecast model, an implicit

assumption must be made that the actual buildings in the data pool are so similar that

their production methods do not differ, or that differing production methods do not

significantly affect cost. Obviously, these assumptions are untrue (Bowen and

Edwards 1985a).

3.5.2 Reliance on historical data for prediction

All forecasting models demand historical data as inputs for prediction. The

Wilderness Group (1964, pp. 254-255) point out two limitations to using historical

data:

“it is almost impossible to find the actual buildings which are

sufficiently similar for their differences in cost to be related to

particular factors and . . . It was found impracticable merely with

historical data to isolate with any certainty and the effect upon

buildings cost of certain design factors individual to those buildings of

which the costs were examined”.

Moreover, Bowen and Edwards (1985a) criticise the mathematical modelling of

historical data, because it fails to reflect changes in technology over

time. It is also debatable whether backward-looking concepts that are based on

historical price data should be used for forecasting. Bon (1989 pp.61-62) explains

the problem:

“Ex ante or forward-looking concepts predominate in economics,

while ex post or backward-looking concepts are more prevalent in

accounting . . . Cost is often treated as the pre-eminent ex post

concept. However, no matter how accurate and exhaustive our

historical records, backward-looking concepts of cost are inadequate

for two reasons. First, at the moment of decision one is perforce

considering future costs. Second, the valuation of costs is impossible

without explicit account of opportunity cost - the satisfaction forgone . . .

The difficulties with cost forecasting based on historical data are

exacerbated in the case of long-lived capital goods, such as buildings.”

3.5.3 Insufficiency of information and preparation time

The preparation of a forecast relies heavily on information input from

external organisations such as the client’s brief and the designers’ layout plans, and

the information that is available within the organisation, such as historical price data.

It is quite common that the design information that is given at the early design stage

is ambiguous and contradictory. The very limited information and allowable time

for producing forecasts may force forecasters to make assumptions according to their

own subjective judgements. For instance, forecasters will usually rely on price data

that are derived from a sample of buildings that do not perfectly match the

characteristics of the proposed building or works if appropriate historical price data

are unavailable (Flanagan and Norman 1983).

3.5.4 Reliance on expert judgment

Forecasting is partly an art and partly a science. The science part involves

the use of modelling techniques and mathematics. The art part comes with the

exercising of professional judgement. Tversky and Kahneman (1974) suggest that

in making judgements in uncertain conditions, people in general do not follow the

calculus of chance or the statistical theory of prediction, but instead rely on a number

of simplifying strategies or heuristics that direct their judgements. Such heuristics

(rules of thumb) can sometimes lead to reasonable judgements and sometimes to

severe and systematic errors. The exercise of judgement therefore rests with the cost

forecaster, rather than with the forecasting model itself (Skitmore et al. 1990). Raftery

(1995) and Birnie (1995) point out that humans make mistakes when making

judgements, and state that more work is needed to understand the behavioural

processes that are involved. Empirical evidence shows that judgement has a

significant role within the formulation and transmission of early cost advice to clients

(Fortune and Lees 1996). As the exercise of judgement is a human cognitive

process, it can be subject to error, bias and heuristics.

3.6 Review of Cost Models in Use

In their research on the relative performance of new and traditional cost models,

Fortune and Lees (1996) study how often certain techniques are used and the extent

to which a lack of understanding influences that use. The studied techniques are classified into seven

categories: traditional (conventional) techniques, statistical techniques,

knowledge-based techniques, life cycle costing, resource- and process-based

techniques, risk analysis and value-related techniques. The authors reveal that the

use of conventional techniques outweighs the use of all of the other techniques, and

that these other techniques were not well understood by respondents.

A more recent study by Fortune and Hinks on models that are used by UK

quantity surveying practices also reinforces the notion that practitioners have not yet

answered the call of academia to adopt the new computer-based stochastic models that

are available in the assessment of project risk and uncertainty (Fortune and Hinks

1998). The study also indicates that, in the period 1993 to 1997, conventional models

that provide single-figure deterministic price forecasts had only a slightly reduced

incidence of use, whereas newer computer-based models had only a slightly increased

incidence of use, which suggests that the paradigm shift in the formulation of reliable

early cost advice has not yet been achieved in practice (Fortune and Hinks 1998). A

similar survey that studies the forecasting models that are used in South Africa also

indicates that conventional models remain firmly in the mainstream in application

(Bowen and Edwards 1998).

The demand for a move to a more scientific basis for forecasting appears to

come mainly from academia, rather than from practice (Bowen and Edwards 1985,

Raftery 1987 p. 53). For the sake of producing publications, academics (modellers)

have focused on the demonstration of how a newly developed model is different

from other models. However, the conservative attitude and the ignorance of

practitioners (forecasters) towards change and new knowledge create another hurdle

(Brandon 1982). To initiate a paradigm shift, academia will have to convince

practitioners by establishing and advertising the benefits that forecasters will enjoy

from these alternative approaches. This could include educating forecasters and

managers about how these new approaches can be applied and about how much

better they are than the conventional approaches, and the heightening of their

awareness of the inadequacy of the conventional approaches (Fortune and Lees

1996). A model that is new and mathematically sound for forecasting may not

necessarily be appropriate for implementation. Thorough studies on the benefits of

and strategies for putting a new model into practice are crucial to the acceptance of

new models or forecasting approaches.

3.7 Significant Items Estimation

Surveys that have been conducted in the UK and South Africa have reinforced

the fact that newer models are not popular in practice. More than 20 years after the

proposal of a paradigm shift, the idea remains a pipe dream, and the popularity of

conventional models remains unchanged. Perhaps the only new model that has

been put forth in practice (although it is still not well recognised), is the significant

items estimating model that was developed by the Property Services Agency (PSA)

in the UK.

Barnes (1971) investigates the implication of the proposition that different

values of rates have different degrees of reliability, and, specifically, that the

reliability of a product of quantity and rates is an increasing function of its value.

By assuming a constant coefficient of variation for each item, he shows that a

selective reduction in the number of low-valued items has a trivial effect on the

overall estimate reliability. The empirical evidence in favour of Barnes’

assumption is quite strong, and therefore its essence has been used to develop the

significant items method.
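
Barnes' argument can be made concrete with a small numerical sketch. It assumes, in addition to the constant coefficient of variation described above, that item errors are independent; this independence assumption, the function names and all item values are hypothetical and are not taken from Barnes (1971).

```python
import math


def total_cv(item_values, item_cv):
    """c.v. of a total made up of independent items that share one c.v."""
    total = sum(item_values)
    std = item_cv * math.sqrt(sum(v * v for v in item_values))
    return std / total


def low_item_shares(item_values, threshold):
    """Share of total value and of total variance held by items below the threshold.

    With a constant c.v., the variance of an item is proportional to the
    square of its value, so the common c.v. cancels out of the variance share.
    """
    low = [v for v in item_values if v < threshold]
    value_share = sum(low) / sum(item_values)
    variance_share = sum(v * v for v in low) / sum(v * v for v in item_values)
    return value_share, variance_share


# Hypothetical bill: three high-valued items and ten low-valued items.
item_values = [5000.0, 3000.0, 1500.0] + [50.0] * 10
ITEM_CV = 0.20

cv_all = total_cv(item_values, ITEM_CV)
value_share, variance_share = low_item_shares(item_values, threshold=1000.0)
print(f"c.v. of total: {cv_all:.3f}")
print(f"low-valued items: {value_share:.1%} of value, {variance_share:.3%} of variance")
```

In this hypothetical bill the low-valued items carry a small share of the value and a far smaller share of the variance, which is the sense in which omitting them has little effect on the reliability of the total.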

According to the outline that was published by the Department of

Environment of the UK government in 1987, the statement “some 80% of the value

of measured work on building projects is contained within 20% of the items in the

bills of quantities” was tested by analysing the prices in 40 bills of quantities. It

was found that 78% of the value was contained within the top 20% of items, which

broadly confirmed the 80/20 relationship. By restricting measurement and pricing to

the most significant items (the top 20% of items), and by using data with a reasonable

sample size, it should be possible to minimise the unreliability of the rates from the

bills of quantities (PSA 1987). The major benefits of the significant items

estimating model include the shorter forecasting time that is required due to its

concentration on fewer items, the improvement in accuracy, the improvement in

reliability (because its outputs are derived from data from a large sample) and the

flexibility it affords in allowing a move away from average rates towards varying

percentage additions for each trade (Allman 1988).
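
The 80/20 check described above can be sketched as follows. The function name and the ten item values are hypothetical; the sketch only shows the kind of calculation involved and does not reproduce the PSA analysis of 40 bills of quantities.

```python
def value_share_of_top_items(item_values, top_fraction=0.2):
    """Return the share of total value held by the highest-valued items."""
    ranked = sorted(item_values, reverse=True)
    # Number of items in the top fraction, with at least one item retained.
    top_count = max(1, round(top_fraction * len(ranked)))
    return sum(ranked[:top_count]) / sum(ranked)


# Hypothetical bill of quantities with ten priced items.
bill_items = [400.0, 25.0, 50.0, 10.0, 400.0, 5.0, 20.0, 50.0, 15.0, 25.0]
share = value_share_of_top_items(bill_items)
print(f"Top 20% of items hold {share:.0%} of the value")
```

Applying the same function to each bill in a sample, and averaging the results, would give a figure comparable with the 78% reported above.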

Munns and Al-Haimus (2000), in their work to refine the significant items

model, reveal that there is a lack of formal rules for the selection of work packages to

be used within the original significant items model, and therefore a potential to

overestimate the cost of projects. By using their new methodology for selecting

work packages and the refined technique that is known as the cost significant global

cost model, they demonstrate that there is a significant improvement in performance

over the original significant items model.

Although the use of the significant items model has shown significant

improvements in performance, the actual contribution to the overall value of a

building is limited, because it is a forecasting model for the later design stage. At

this stage, the design information is quite sophisticated, and there is little room for

cost saving or value enhancement. However, studies on the significant items model

give an empirically supported demonstration of how conventional models, such as

the approximate quantities method, can be further advanced by the use of statistical

techniques.

3.8 Discussions on Research Opportunities

Having reviewed the development of, and limitations to, forecasting models,

it seems reasonable to conclude that there is no universally agreed approach to

modelling building costs. There is no general agreement on the most useful set of

elements and functions for each of the model types, nor on how the models

themselves and their values should be derived, nor is there any agreement on the

nature of the functions that connect the cost with the various elements (Skitmore and

Marston 1999, p.19). In contrast, it appears that practising forecasters need

commonly agreed models. Since the existing models in use have become

conventions, forecasters can simply follow them to prepare estimates (even

though many of them were not rigorously developed) without the need to worry that their

choice of models will be challenged by other practitioners. However, the fact that

every forecaster is using a model does not automatically validate that model. The

conventional models deserve rigorous tests to justify their existence.

This research focuses on the study of conventional models used in the early

design stage. As explained in Section 2.5, earlier forecasts contribute more,

because the early decisions that they mainly influence are more cost sensitive. The

successful experience of applying the significant item model in practice provides

insights into the potential of developing a new model that is applicable to the early

design stage. According to the RIBA outline plan of works (RIBA 1991), this early

design stage corresponds to the period between the beginning of the feasibility stage

and midway through the sketch design stage. Before the beginning of this period,

referred to by the RIBA outline plan of works as the inception stage, there are no

drawings available. Forecasters have to make their best guess at their own discretion,

producing what is sometimes known in forecasters’ slang as “guestimates”. After the end of this

period, when there is more available information such as formal sketch layout plans

(as compared with those sketches produced during the early design stage), a few

sketch elevation plans, draft specifications, and perhaps the schedules of finishes,

doors and windows, forecasters can use more detailed methods (for example, the

elemental cost estimating method) for forecasting.

During the early design stage, information is often limited and can only be

extracted from a few sketches. Because of this, all the conventional methods used

by forecasters follow a single price-rate system. In contrast to the elemental

method, which is applied at a later stage, and the significant items and

approximate quantities methods, which are applied at an even later stage, single

price-rate methods inevitably produce more uncertain forecasts because of the lack

of information. However, the two early conventional methods, the cube and

floor area methods, appear unable to extract all the information available from the

sketches. The subsequent storey enclosure method, as described in Section 3.9,

shows sophistication in attempting to extract some further, arguably all the major,

information from the sketches, i.e. the area of each floor and the envelope area of a

building. Although also following a single price-rate system, the storey enclosure

method takes into account additional aspects of building design economics. To

avoid falling into the trap of developing something new without any theoretical base,

as criticised by Raftery (1984b) and Newton (1990), the storey enclosure model that

was developed by James in 1954 is chosen for further development. Although the

model shares many of the flaws of the single-unit deterministic models, it is

considered to be the most sophisticated model, and is worth refining. Skitmore and

Marston (1999, p.164) suggest that the model has considerable potential for further

development by statistical means. Ashworth (1999, p.251) also suggests that in the

past, credibility was a factor to be taken into account, but that it might be more

acceptable today to apply the storey enclosure model. After all, what is needed is

an approach that will harness the strengths and minimise the weaknesses of all that

has been developed to date (Bowen and Edwards 1985a).

Patchell (1987) suggests four criteria to be observed for cost advice at the

schematic or feasibility stage: cost accuracy from very preliminary information, a

flexible and quick response to various options, economy of production in man and

machine hours, and estimation and analysis on the same basis. These criteria are

very practical, and are used as the requirements for the forecasting models that are

developed in this research.

The models that are developed in this research share similar justifications

with the significant items model, but they are designed to be applied in the early

design stage. They should be understandable, easy to use, fairly accurate and

relatively reliable compared with the conventional forecasting models.

The accomplishment of this research relies on the use of empirical data for

both the model development and the assessment of model performance. The

emphasis here is on the purely empirical nature of the model, which is thought to be

the best way to avoid subjectivity, as the essence of good empirical research is to

minimise the role of the researcher in interpreting the results of the study (Skitmore

1988).

Like much of the research on empirical models, the modelling of prices in

this research follows mainstream research by using statistical techniques such as

regression analysis. Hypothetical models that contain various groups of variables

are derived by multiple regressions. The machinery of the approach is described in

Chapter 5.

3.9 Storey Enclosure Method

Generally, all of the conventional methods for forecasting building prices at

the early design stage are single-rate methods. Amongst them, the most commonly

used method is the floor area method. The floor area method is simple in

application, easily understood and produces a forecast quickly. However, it is also

considered to be too simple to take into account the different characteristics of

buildings. James (1954) criticised the floor area method as well as the cube method.

First, neither of the two methods is satisfactory for universal application. Second,

neither of the two methods reflects the cost implications of building shape, building

height and the number of storeys. Third, the two single price-rate methods have to

account separately for basements. Fourth, the cube method is sensitive to changing

unit rates. He proposed an alternative single-rate method, the storey enclosure

method, to overcome these limitations. This method takes into account various

important aspects of design in building price forecasting, whilst leaving the type of

structure and standard of finishes to be assessed in the price rate. The factors to be

considered and the adjustments in the methods to reflect those factors are shown in

Table 3-4. The method involves multiplying the measured value of each adjustment

in the table by an assigned weighting. The assigned weightings suggested by

James and inclusions for each component are shown in Table 3-5.

The summation of the products of each measured value of the adjustment and

its corresponding assigned weighting will form the storey enclosure area, which is

the unit quantity of the storey enclosure method. The product of an appropriate

single unit rate and the storey enclosure area will produce a forecasted price.

Equations (3.3) and (3.4), which represent the forecasting method, are shown.

$$P = S \cdot R \tag{3.3}$$

$$P = \left( \sum_{i=0}^{n} (2 + 0.15i) f_i + \sum_{i=0}^{n} p_i s_i + 2 \sum_{j=0}^{m} f'_j + 2.5 \sum_{j=0}^{m} p'_j s'_j + r \right) \cdot R, \tag{3.4}$$

where $P$ is the forecasted price, $S$ is the storey enclosure area, $R$ is the unit

rate, $f_i$ is the floor area at $i$ storeys above ground, $p_i$ is the perimeter of the external

wall at $i$ storeys above ground, $s_i$ is the storey height at $i$ storeys above ground, $n$ is

the total number of storeys above ground level, $m$ is the total number of floors below

ground level, $f'_j$ is the floor area at $j$ floors below ground level, $p'_j$ is the

perimeter of the external wall at $j$ storeys below ground level, $s'_j$ is the storey height

at $j$ storeys below ground level and $r$ is the roof area.
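
The calculation in Equations (3.3) and (3.4) can be sketched in a few lines of code. The function names, the list-based indexing (index 0 is taken to be the ground floor, and each basement level is one list entry) and the example building are assumptions made for illustration; the weightings are those suggested by James.

```python
def storey_enclosure_area(floors_above, floors_below, roof_area):
    """Storey enclosure area S using James' weightings.

    floors_above: list of (floor_area, perimeter, storey_height) tuples,
                  index 0 being the ground floor (assumed indexing).
    floors_below: list of the same tuples, one per level below ground.
    """
    area = roof_area  # roof weighting is 1
    for i, (floor_area, perimeter, height) in enumerate(floors_above):
        area += (2 + 0.15 * i) * floor_area  # floor weighting rises with height
        area += perimeter * height           # external wall weighting is 1
    for floor_area, perimeter, height in floors_below:
        area += 2 * floor_area               # floors below ground
        area += 2.5 * perimeter * height     # external walls below ground
    return area


def forecast_price(floors_above, floors_below, roof_area, unit_rate):
    """Forecast price P = S * R."""
    return storey_enclosure_area(floors_above, floors_below, roof_area) * unit_rate


# Hypothetical building: 20 m x 10 m on plan, three storeys and one basement,
# all with a 3 m storey height, and a hypothetical unit rate of 1,000 per unit of S.
storey = (200.0, 60.0, 3.0)
S = storey_enclosure_area([storey] * 3, [storey], roof_area=200.0)
P = forecast_price([storey] * 3, [storey], roof_area=200.0, unit_rate=1000.0)
print(f"S = {S:.0f}, P = {P:,.0f}")
```

For this hypothetical building the above-ground storeys contribute 580, 610 and 640 units respectively, the basement contributes 850 and the roof 200, giving a storey enclosure area of 2,880.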

Table 3-4: Adjustment for the factors affecting the estimates in the storey enclosure method

Factors affecting the estimates | Adjustment
Shape of building | By measuring the external wall area
Total floor area | By measuring the area of each floor
Vertical positioning of the floor area in a building | By using a greater multiplier to the floor area of a suspended floor positioned higher in a building
Storey heights of building | Proportion of floor and roof areas to the external wall
Overall Building heights | Ratio of roof area to external wall area
Extra cost of sinking usable floor area below ground level | By using increased multiplier for work below ground level

Table 3-5: Weightings and inclusions for individual components in the storey enclosure method

Components | Weighting Factors | Inclusions
Above Ground Components | |
Ground Floor | 2 | Internal partitioning, finishings, fitments, doors, etc., on the floor; a non-suspended floor; finishings on one side of it; and normal foundations to all vertical structural members in a single storey building including those of its external walls
Upper Floors | 2 + (0.15 x No. of Floors above Ground) | Internal partitioning, finishings, fitments, doors, etc., on the floor; a suspended load-carrying floor; finishings on both sides of it; vertical structural supports to it; and the further cost which arises, in the case of vertical structural floor supports to the lower floors of multi-storey buildings, from the need to support the additional transmitted load of all superimposed floors and the roof above them
Roof | 1 | A suspended roof and its (lighter-than-floor) load; finishings on both sides of it (one weatherproof); horizontal structural supports to it (such as beams and trusses); and vertical structural supports to it (such as walls and columns)
External Walls | 1 | A wall with weatherproof qualities; finishings on both sides of it; windows and external doors, etc.; and normal architectural features
Below Ground Components | |
Floors below Ground / External Walls below Ground | 2 / 2.5 | Displacement and disposal of earth; waterproof tanking and the loading skins to keep it in position; members of heavier construction than those required in equivalent positions above ground; finishings on one side of these members; internal partitioning finishes, fitments, door, etc.; and normal (in the basement sense) foundations to all vertical structural members in a single basement-storey building

In James’ study, the proposed storey enclosure method is applied to 86

tenders for different building types. The storey enclosure method is compared with

two other early stage methods – the superficial floor area method and the cube

method. James’ results of the tests for the cube, floor area and storey enclosure

methods are shown in Table 3-6. The estimates that are produced by the storey

enclosure method are nearer to the tender figures, and the range of price

variation is reduced accordingly. These results turn out to be statistically significant

(chi-square 5.99, 2df), with the storey enclosure and floor area methods being better

than the cube method (Skitmore 1991). There are some examples that show the use

of the storey enclosure method in the textbooks of cost planning (Cartlidge and

Mehrtens 1982; Seeley 1996 pp.160-162; Ashworth 1999 pp.250-251; Ferry et al.

1999). However, despite the many benefits that are demonstrated by James, the use

of the storey enclosure method remains very limited in practice. Survey results on

the use of conventional cost forecasting models in the UK reveal that less than 2% of

respondents made use of the storey enclosure method to provide strategic cost advice

to clients (Fortune and Lees 1989). However, another survey that was conducted

more recently in South Africa indicates that 27% of the respondents had used the

storey enclosure method in practice (Bowen and Edwards 1998).

Like any of the single-unit deterministic models, the storey enclosure method

suffers from the deficiencies of being inexplicable, unrelated and deterministic. Its

unpopularity is probably due to the fact that the weightings are not derived

empirically from proven data, but are based on experience (Wilderness Group 1964;

Ashworth 1999 p.251), that there is insufficient historical data support (Wilderness

Group 1964), that there are difficulties in obtaining an appropriate rate (Seeley 1996

pp.161-162), that the calculations that are involved are relatively complex (Seeley

1996 pp.161-162), and that the method provides no link with other forecasting

methods, such as the elemental or approximate quantity method that would be used

subsequently as the design develops.

Table 3-6: The results of tests for the cube, floor area and storey enclosure

methods in James’ study (Source: James (1954))

3.10 Regression Analysis

As there is no universal set of elements or variables for forecasting models,

the purpose of reviewing previous empirical research on the influencing variables

and forecasting targets is to consolidate a list of them for later use in model selection.

A review of the surrounding literature shows that the technique of regression analysis

has been widely used in the modelling of building prices. The technique of

regression analysis is statistically able to demonstrate the strength of the relationship

between two or more variables, for example, height and unit price. A variety of

applications of regression analysis in the forecasting of building cost have been

developed since the mid 1970s. Regression analysis has been used for modelling

the prices of building at three levels: the overall price, the price of building elements

and the price of components. At the first level, it has been used to model overall

prices for offices (Department of Environment 1971; Tregenza 1972; Flanagan and

Norman 1978; Karshenas 1984; Skitmore and Patchell 1990), schools (Moyles 1973),

houses (Neale 1973; Braby 1975; Khosrowshahi and Kaka 1996), homes for old

people (Baker 1974), lifts (Blackhall 1974), electrical services (Blackhall 1974),

motorway drainage (Coates 1974) and a few other types of building (Kouskoulas and

Koehn 1974). At the second level, it has been used to model the prices of elements

such as reinforced concrete frames (Buchanan 1972; Singh 1990) and building

services (Gould 1970). At the third level, it has been used to model the prices of

components such as the beams of suspended-roof steel

structures (Southwell 1971). This research concerns the modelling of overall

building prices.
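
As an illustration of how regression analysis expresses the strength of a relationship such as that between height and unit price, the following sketch fits a simple least-squares line and reports the coefficient of determination. The function name and the six data points are hypothetical and do not come from any of the studies cited above.

```python
def simple_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical sample: number of storeys and price per unit of floor area.
storeys = [5, 10, 15, 20, 25, 30]
unit_price = [1110.0, 1190.0, 1310.0, 1390.0, 1510.0, 1590.0]

slope, intercept, r_squared = simple_regression(storeys, unit_price)
print(f"unit price = {intercept:.0f} + {slope:.1f} x storeys (R^2 = {r_squared:.3f})")
```

The slope estimates the change in unit price per additional storey, and the coefficient of determination indicates how much of the variation in unit price the line accounts for in this sample.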

3.11 Review of Model Predictors

Of the conventional methods of forecasting for the early design stage, the

floor area method is the most widely used (Akintoye et al. 1992; Bowen and Edwards

1998; Fortune and Lees 1996). In this method, the floor area is presumed to be the

only variable that is directly proportional to the building price. Another frequently

addressed variable is the height of a building, and previous studies have expressed

this in different measures, such as the overall building height (Kouskoulas and

Koehn 1974; Karshenas 1984; Pegg 1987), number of storeys (Clark and Kingston

1930; Wilderness Group 1964; Buchanan 1969; Department of Environment 1971;

Buchanan 1972; Tregenza 1972; Steyert 1972; Neale 1973; Braby 1975; Flanagan

and Norman 1978; Singh 1990) and storey height (Wilderness Group 1964;

Buchanan 1969; Buchanan 1972; Moyles 1973). High-rise buildings are generally

more expensive to build than low-rise buildings, because the former require extra cost

for the special arrangements for servicing the building, particularly the upper floors,

and because the lower part of high rises is designed to carry the weight of the upper

floors and the extra wind load. The additional cost of working at a great height from

the ground when erecting the building, and the increasing area that is occupied by the

service core and circulation are also factors that increase the cost of high rises (Ferry et

al. 1999 p. 293).

The earliest work on the identification of the variable of building height was

undertaken in the United States. Clark and Kingston (1930) analyse the relative

costs of the major components of eight office buildings that range from 8 to 75

storeys on a hypothetical site. In general, they find that the unit building cost tends

to rise moderately with the building height.

In the UK, Stone (1963) reports a moderate rise in the unit building cost with

the building height for blocks of flats and maisonettes in London and other parts of

the UK. The Wilderness Group (1964) produced a series of schedules that detail

the costs of a steel frame for a structure. The spans, storey heights and number of

storeys vary and are manually priced. Their study is the first serious attempt within

the UK building industry to isolate the cost effects of fundamental design variables,

such as the number of storeys, storey height, the superimposed loading of suspended

floors, column spacing in the direction of the slab span and column spacing across

the slab span, taking into account the interacting cost effects of each variable upon

the others.

Tan (1999) cites a report that was prepared by Thomsen (1966) from the

United States that states that, except for the lower floors, the unit office building cost

is almost constant when the building height is varied. However, as details of the

simple simulation study are not given in Thomsen’s report, Tan warns that the results

should be interpreted with care.

A study that was conducted by the Department of Environment (1971) of the

UK government reports that the cost of local authority office blocks rises fairly

constantly by two per cent per floor as the height increases above four storeys.

Tregenza (1972) analyses the price per square metre of ten office buildings

that range from one to eighteen storeys high. The prices were rebased to January

1971 prices. A linear regression line was fitted and the result agrees with the

findings of earlier works that tall buildings tend to be more expensive than low

buildings with the same internal floor area. However, the sample was too small,

and the line was fitted purely by observation. Thus, it is doubtful whether it is

appropriate to interpret the relationship as being linear.

Buchanan (1972) uses the multiple regression technique for the development

of a model that represents the total cost of a reinforced concrete structure. The

model was developed from 38 reinforced concrete frame buildings that were

constructed by the Ministry of Public Building and Works of the UK between 1960

and 1968. The independent variables that are identified are the gross floor area,

storey height, number of storeys, average superimposed loading, shortest span,

longest span, slab concrete thickness and number of lifts.

Kouskoulas and Koehn (1974) represent the pre-design estimation of building

prices per square foot (price per area) as a function of six variables: building locality,

price index, building type, building height, building quality and building technology.

Karshenas (1984) regards the resulting pre-design estimation technique that is

devised by Kouskoulas and Koehn as being simple, fast and applicable to forecasts

for a wide variety of building construction projects, and opines that the methodology

might be generalised in a global sense. Kouskoulas and Koehn’s use of a raw

dependent variable with the inflation index as an independent variable is also

particularly interesting. They use a multiple regression methodology to derive the

single cost-estimation function from 40 sets of data on building contracts in the US.

Disregarding the possibility of obtaining a better-performing model by the

elimination of some of the variables, they insist on keeping all of the variables, as

they believe that the better result that is obtained by omitting some of the variables is

due to a bias in the data sample. This supposition is rather subjective. The final

model is tested on only two ex ante projects, and shows little forecasting bias. This

test sample is also considered to be too small to draw a reasonable conclusion from.

Unfortunately, no results on the performance of the reduced model are shown as a

comparison in their paper.

In Australia, Braby (1975) studied the relationship between the height of

buildings, as represented by the number of storeys, and the building price per floor

area in eighty buildings in Melbourne. Instead of classifying the data according to

the building type, as is typical in other studies, Braby divides the data according to its

location relative to the central business district (i.e., whether it is inside or outside of

the central business district). The results of the linear regression indicate that

building prices generally increase with the number of storeys. However, the results

are not conclusive because the fitted correlations are weak.

McCaffer (1975) summarises research work that was produced by

post-graduate students (Buchanan (1969); Gould (1970); Moyles (1973); Neale

(1973); Baker (1974); Blackhall (1974) and Coates (1974)) of the Department of Civil

Engineering at Loughborough University of Technology on the use of regression


analysis for forecasting. A summary of the models that were developed by the

post-graduate students is shown in Table 3-7. The paper raises an important

statistical concern about the deterioration of the performance of regression models in

actual forecasting using ex ante data from their validation performance using ex post

data. The experience of the author indicates that the coefficient of variation (as a

measure for model performance) increases by 25% to 50% when the derived model is

applied to data outside of its own database. Thus, a model with a coefficient of

variation of 10% in its validation will deteriorate by 15% to 20% when used for other

cases of a similar type. This study, although it does not show detailed calculations as

evidence, is particularly important to studies in the area of cost modelling, as it is the

first in the field to address the difference between ex post performance and ex ante

performance. In fact, except when using a more advanced approach to resampling

validation, such as cross validation (as is applied in this research) or

bootstrapping, it is crucial to measure both the ex post and ex ante performance in the

validation of a model to give a full picture of its performance.
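As an illustration of the gap between ex post and ex ante performance, the following minimal sketch fits a least-squares model to synthetic data and compares the coefficient of variation of the residuals computed on the fitting data with the figure obtained by leave-one-out cross validation. The data, the variable names and the definition of the coefficient of variation used here (root-mean-square residual as a percentage of the mean price) are assumptions made for illustration only; they are not taken from McCaffer (1975) or from the data of this research.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already holds an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def cv_percent(actual, predicted):
    """Root-mean-square residual as a percentage of the mean actual value."""
    resid = actual - predicted
    return 100.0 * np.sqrt(np.mean(resid ** 2)) / np.mean(actual)


def ex_post_and_loo(X, y):
    """Return (ex post c.v., leave-one-out c.v.) for an OLS model."""
    # Ex post: fit and evaluate on the same projects.
    ex_post = cv_percent(y, X @ fit_ols(X, y))
    # Leave-one-out: each project is predicted by a model fitted without it.
    n = len(y)
    loo_pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        loo_pred[i] = X[i] @ fit_ols(X[keep], y[keep])
    return ex_post, cv_percent(y, loo_pred)


# Synthetic projects (hypothetical values, not real tender data).
rng = np.random.default_rng(0)
n = 30
area = rng.uniform(1000, 20000, n)
storeys = rng.integers(2, 40, n).astype(float)
price = 500 * area + 40000 * storeys + rng.normal(0, 4e5, n)

X = np.column_stack([np.ones(n), area, storeys])
ex_post_cv, loo_cv = ex_post_and_loo(X, price)
print(f"ex post c.v.: {ex_post_cv:.2f}%  leave-one-out c.v.: {loo_cv:.2f}%")
```

Because each left-out project is predicted by a model that has not seen it, the leave-one-out figure cannot be smaller than the ex post figure for a least-squares fit, which mirrors the direction of the deterioration described above.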

Based on the theoretical study that was undertaken by Steyert (1972), who

suggests that the costs of the various elements of a building respond differently to

changes in the number of storeys, Flanagan and Norman (1978) further elaborate his

idea by suggesting that the cost components of a building can be split into four

categories: those that fall as the number of storeys increases, those that rise as the

number of storeys increases, those that are unaffected by height and those that fall

initially and then rise as the number of storeys increases. They use the learning

curve that was produced by the Committee of Housing in New York to illustrate that

every time the number of repetitions doubles, the output time declines by a fixed

percentage. Fifteen office buildings of more than two storeys that were built


between 1964 and 1975, including the ten that were used by Tregenza (1972), are

selected for curve fitting. They apply the regression analysis technique to model

the relationship between the building height and building price. By making the

assumption that other influencing variables, such as the quality of building,

geographical location, size of project, site characteristics and so forth, are constant,

the results show that the relationship between the price per square meter and the

number of storeys in an office is projected to be U-shaped.
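The doubling rule of the learning curve mentioned in the paragraph above is conventionally written as a power law. The following is a sketch in assumed notation ($T_n$ for the output time of the $n$-th repetition and $r$ for the fixed ratio retained at each doubling; neither symbol appears in the source):

```latex
T_{2n} = r\,T_{n}, \quad 0 < r < 1
\qquad \text{is satisfied by} \qquad
T_{n} = T_{1}\, n^{\log_{2} r}
```

For example, under an assumed $r = 0.9$, the output time falls to $0.9\,T_1$ at the second repetition and to $0.81\,T_1$ at the fourth.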

Karshenas (1984) uses data from 24 historical multi-storey office buildings in

the US to derive the mathematical relationship between price, overall building height

and average floor area (termed “typical floor area” in Karshenas’ paper). By

merely observing the points that are distributed on the chart of the average floor area

against the height, a set of contours that represents the constant per area price for

different heights and average floor areas is constructed on the chart. Based on the

shape of the contours, the author hypothesises that building price is a function of the

average floor area and overall building height:

$C = \alpha A^{\beta} H^{\gamma}$, (3.5)

where C is the building price, A is the average floor area, H is the overall

building height and α, β and γ are the constants.

By transforming both sides with a natural logarithm, the equation becomes:

$\ln C = \ln \alpha + \beta \ln A + \gamma \ln H$. (3.6)


This transformed hypothetical equation suits the methodology of the multiple

linear regression. To make building prices comparable, Karshenas updates all

prices to the base of March 1982, according to the price index. His derived model

of building price with the average area and overall height as variables is compared

with the floor area model using floor area unit rates from the published price book.

Unfortunately, the comparison does not pay attention to the deterioration problems

that are addressed by McCaffer. Thus, the conclusion that a better method has been

developed is not persuasive enough.
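The log-transformed form in Equation 3.6 can be estimated by ordinary least squares. The following is a minimal sketch on synthetic, noise-free data; the function names, the sample values and the constants (1200, 0.9 and 0.3) are hypothetical and are not taken from Karshenas (1984).

```python
import numpy as np


def fit_power_model(price, area, height):
    """Fit ln C = ln(alpha) + beta*ln A + gamma*ln H by ordinary least squares."""
    X = np.column_stack([np.ones(len(price)), np.log(area), np.log(height)])
    coef, *_ = np.linalg.lstsq(X, np.log(price), rcond=None)
    ln_alpha, beta, gamma = coef
    return float(np.exp(ln_alpha)), float(beta), float(gamma)


def predict_price(alpha, beta, gamma, area, height):
    """Back-transform to the multiplicative form C = alpha * A**beta * H**gamma."""
    return alpha * area ** beta * height ** gamma


# Hypothetical average floor areas and overall heights for six buildings.
area = np.array([500.0, 800.0, 1200.0, 1500.0, 2000.0, 2500.0])
height = np.array([20.0, 35.0, 50.0, 30.0, 80.0, 60.0])
# Prices generated exactly from the assumed constants, so the fit should recover them.
price = 1200.0 * area ** 0.9 * height ** 0.3

alpha, beta, gamma = fit_power_model(price, area, height)
print(alpha, beta, gamma)
```

With real data the fit would not be exact, and its forecasting performance would still need to be checked on projects outside the fitting sample.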

Based on a large sample of 1188 projects, Pegg (1984) identifies ten

variables that have a statistically significant effect on building price level: building price

date, location, selection of contractor, contract sum (≤£20,000 or >£20,000), building

function, measurement of structural steelwork, building height, form of contract, site

conditions and type of work. Within these significant variables, the only

quantitative variable is the building height. Skitmore and Marston (1999, p.252)

criticise the study for not giving a clear description of the method of analysis or of

levels of accuracy.

Apart from summarising the model development in the résumé that was cited

earlier in this review, Skitmore and Patchell (1990) also demonstrate the use of the

multiple regression analysis technique in the development of a forecasting model of

building price per gross floor area (GFA) based on six raw independent variables,

including the number of storeys. Data was extracted from 28 office buildings in the

UK for the period 1982 to 1988. The final model is a natural logarithmic

transformed model that is derived by forward stepwise regression. It contains three

chosen variables: the number of bidders, GFA and the contract period. Very


detailed empirical work is incorporated in this study, but no ex ante performance

validation is included.

Khosrowshahi and Kaka (1996) use multivariate regression analysis with an

improvised iterative method to develop forecasting models for the cost and duration

of housing projects. The objective of the paper is to develop building price and

duration forecasting models for both the contractor and the client. Fifteen variables

are taken into account, including the number of storeys, which is divided into three

groups (low, medium and high). Data from 54 housing projects in the UK in the

period 1981 to 1991 are used. Six of the fifteen candidate variables are selected by

multivariate regression analysis. These include one scale variable, ‘unit’, and five

categorical variables, ‘project operation’ (which comprises refurbishment, extension,

alteration and new); ‘project sub-type’ (whether sheltered, public or bungalow);

‘abnormality’ (which comprises access to site, poor communication, repeating

stoppages, sudden speed ups, transportation problems, time and cost yardsticks,

keeping occupation and unknown factor, contractor’s mistakes, various delays,

resource shortages, repeating variations, lack of presence and others); ‘starting

month’ (January, February, March, April, May, June, July, August, September,

October, November or December); and ‘horizontal access’ (whether good, fair or

poor). The final model may be problematic in application. In real life,

abnormalities cannot be assumed to be mutually exclusive and independent of each

other. The presence of more than one abnormality at the same time and the

interdependence of those abnormalities could easily undermine the model. Also,

no actual performance validation is shown in Khosrowshahi and Kaka’s paper.

An interesting commonality between Skitmore and Patchell’s final office

price model and Khosrowshahi and Kaka’s house model is that the variable that


represents the height of building was eliminated during the process of selecting the

variables. A summary of the forecasting targets (dependent variables) and the

influencing variables (independent variables) for the building price forecasting models

as reviewed in this section is shown in Table 3-8. The influencing variables in the

table are classified according to whether they are quantitative (measurable) or

qualitative (intangible, normally divided into various levels) in nature. Most of the

studies put quantitative and qualitative measures into their model. This approach is

acceptable, because these models all belong to the category of ‘black-box’

forecasting tools, which are validated solely on the performance accuracy of the

models. However, the ways in which the qualitative variables are chosen, defined

and divided into various levels, are mainly based on the experience of the modellers.

By defining the variables or the levels or scales differently, a rather different final

model may be produced. Thus, these models must be used with extra care.

Alternatively, this possible flaw can be avoided by employing models that use only

quantitative variables. Instead of putting the qualitative variables into the model, an

alternative approach is to group the data with similar qualitative characteristics

together to derive a model that explains only a particular set of qualitative

characteristics. This approach, however, produces more models than a generalised

approach does, and the grouping criteria are subjective.

Armstrong (2001 p.342-345) reviews the general principles for using

forecasting methods in published research. He concludes that a quantitative method

should be used if there is enough data. In consideration of the limited amount of

data that is available in the early design stage (because many of the qualitative

characteristics of a project are yet to be determined), the quantitative variables of floor


area, roof area, basement wall area and external wall area, as identified in JSEM, are

used in the model development in this research.

3.12 Occam’s Razor: Parsimony of Variables

For a given set of data, there are always an unlimited number of possible

explanatory models. If a model is too simple, then the model and its predictions

will be unrealistic, whereas if it is too complex, then the model will be specific but

its predictions unreliable (Edwards 2001 p.129). It has long been advocated that

“economists should follow the advice of natural scientists and others to keep their

models sophisticatedly simple, especially as simple models seem to work well in

practice” (Zellner 2001 p.4).

In the world of scientific modelling and theory development, scientists should

adopt the underlying principle of parsimony to distinguish a better model from others.

The principle of parsimony, also known as Occam’s razor, is attributed to the

mediaeval philosopher William of Occam, who suggested that “pluralitas non est

ponenda sine necessitate” (another version is “entia non sunt multiplicanda praeter

necessitatem”), which means entities should not be multiplied unnecessarily. Thus,

if there are two competing theories (or models in the context of this study) that both

describe the same characteristics of observed fact (data set), then the simpler of the

two should be adopted until more evidence comes along (Stangl 1997). Occam’s

razor is particularly important in the development of universal models, as the subject

domain of these models is of an unlimited complexity. Because of this complexity,

the chance of obtaining a manageable model is very slight if the modelling process


starts with a very complicated theoretical foundation. The same principle also

applies to the development and selection of early stage forecasting models, because

data that are obtained at the early stage are highly abstract and uncertain, and the use

of a complicated model will inevitably add unnecessary assumptions. The

discourse on scientific theory suggests that no theory can be totally validated, but

that any theory can be falsified by facts (Popper 1959 pp.78-92). Thus, science

operates according to the principle of parsimony.

To apply the principle of parsimony to model selection, Simon (2001 p.35)

suggests expressing parsimony as the ratio of the complexity of the data

set to the complexity of the formula set. In the context of the competition between

two models, the parsimony of the relationship of the data set to the simpler model

(e.g., a linear model that contains one explanatory variable) is greater than the

parsimony of its relationship with the more complex model (e.g., a linear model

containing two explanatory variables) if they both describe the data set equally well. By

the same token, the parsimony of the relationship of a model with a larger data set is

greater than the parsimony of the relationship of the model with a smaller data set, if the

same model describes the two data sets equally well.

To implement Occam’s razor, regression techniques can be applied to achieve

the parsimony of variables. This involves the development of a model through the

least-squares error method for a given domain (data for a particular type of building).

The goal of the final model is to produce accurate forecasts, and the criterion for the

selection of that model is the forecasting accuracy.
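One way to picture the selection of a parsimonious set of predictors by forecasting accuracy is to score every candidate subset of variables by its leave-one-out error and keep the best. The sketch below is an assumption-laden illustration, not the procedure of this research: the data are synthetic, `site_noise` is a deliberately irrelevant hypothetical variable, and an exhaustive subset search stands in for whatever selection routine is actually used.

```python
from itertools import combinations

import numpy as np


def loo_rmse(X, y):
    """Leave-one-out root-mean-square error of an OLS fit with an intercept."""
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(Xc[keep], y[keep], rcond=None)
        errors[i] = y[i] - Xc[i] @ coef
    return float(np.sqrt(np.mean(errors ** 2)))


def best_subset(X, y, names):
    """Return (score, names) of the non-empty subset with the smallest LOO RMSE."""
    results = []
    for k in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), k):
            score = loo_rmse(X[:, list(cols)], y)
            results.append((score, tuple(names[c] for c in cols)))
    return min(results)


# Synthetic projects: price depends on floor and external wall areas only.
rng = np.random.default_rng(1)
n = 40
floor_area = rng.uniform(1000, 20000, n)
wall_area = rng.uniform(500, 8000, n)
site_noise = rng.normal(0, 1, n)  # hypothetical variable with no real effect
price = 3.0 * floor_area + 2.0 * wall_area + rng.normal(0, 500, n)

X = np.column_stack([floor_area, wall_area, site_noise])
score, chosen = best_subset(X, price, ["floor_area", "wall_area", "site_noise"])
print(chosen, round(score, 1))
```

Scoring on left-out projects, rather than on the fitting data, is what allows an extra variable to be rejected: additional predictors always reduce the in-sample error, but they need not reduce the error on unseen data.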

Forecasting accuracy is an objective measure of the success of a model, and

is also the expected fit of unseen data in a domain. It plays a very important role in


the judgment of models, as models themselves can never give error-free forecasts.

More reviews on forecasting accuracy are contained in Chapter 4.

Table 3-7: Summary of the models developed by the post-graduate students of the Department of Civil Engineering at Loughborough University of Technology (extracted from McCaffer 1975)

| Author(s) | Subject of regression model | Variables | Model performance |
| --- | --- | --- | --- |
| Buchanan, J. S. (1969) | Reinforced concrete frame in buildings | Gross floor area, average load, shortest span, longest span, no. of floors, height between floors, slab concrete thickness and no. of lifts | More accurate for medium and high cost schemes rather than low cost schemes. |
| Gould, P. R. (1970) | Heating, ventilating and air conditioning services in buildings | Functions which described the heat and air flow through the building, the heat source and distance which it has to be ducted and shape | High accuracies at higher cost |
| Moyles, B. F. (1973) | System built school buildings | Floor area, area of external and internal walls, no. of rooms and functional units, area of corridors, storey height and no. of sanitary fittings | Generally, high accuracy |
| Neale, R. H. (1973) | Houses for private sale | Floor area, area of roof, area of garage, number of storeys, slope of site, unit cost of external finishes and cost of sanitary fittings, area and volume of kitchen units, site densities, regional factors, number of doors, area of walls, number of angles on plan, construction date and duration of development, and type of central heating | Only two cases fell outside the ±10% |
| Baker, J. (1974) | Residential apartment scheme for old people | Area of single units, double units, triple units, common rooms, Warden’s Flat, laundry, access corridors, number of lifts and garages and duration of contract | Coefficient of Variation (c.v.) of 9.16% |
| Blackhall, J. D. (1974) | Passenger lifts in office building | Contract date, dimensions of the car, no. of landings, length of travel, operating speed, type of control system and location of installation | |
| Blackhall, J. D. (1974) | Electrical services in buildings | No. of distribution boards, fused load, number of active ways, no. of socket and other outlets, voltage, contract date and a differentiation of whether the building was commercial or domestic use | |
| Coates, D. (1974) | Motorway drainage (including three models: (1) using porous pipes; (2) using helpline pipes and (3) using asbestos pipes) | Internal diameter, average depths and cost of pipes | (1) For porous pipe, c.v. of 12.8%; (2) For helpline pipe, c.v. of 9.2%; and (3) For asbestos pipe, c.v. of 6.9% |


Table 3-8: Summary of Forecasting Targets and Influencing Variables in Previous Empirical Studies

| Variables | Empirical studies |
| --- | --- |
| Forecasting Targets (Dependent Variables Used) | |
| Overall building price / Cost of reinforced concrete structure | James (1954); Buchanan (1972); Moyles (1973); Neale (1973); Baker (1974); Karshenas (1984); Singh (1990); Khosrowshahi and Kaka (1996) |
| Overall building price per square meter floor area | Department of Environment (1971); Tregenza (1972); Kouskoulas and Koehn (1974); Braby (1975); Flanagan and Norman (1978); Skitmore and Patchell (1990) |
| Influencing Factors (Independent Variables Used) | |
| Building type | Kouskoulas and Koehn (1974); Khosrowshahi and Kaka (1996) |
| Gross floor area | Buchanan (1972); Moyles (1973); Neale (1973); Skitmore and Patchell (1990) |
| Typical floor area | Karshenas (1984) |
| Number of storeys | Department of Environment (1971); Buchanan (1972); Tregenza (1972); Neale (1973); Braby (1975); Flanagan and Norman (1978); Singh (1990) |
| Overall height | Kouskoulas and Koehn (1974); Karshenas (1984) |
| Storey height | Buchanan (1972); Moyles (1973) |
| External wall area | James (1954); Moyles (1973) |
| Location index (No location index in Hong Kong) | Kouskoulas and Koehn (1974); Neale (1973) |
| Roof area | James (1954); Neale (1973) |
| Starting date | Neale (1973); Khosrowshahi and Kaka (1996) |
| Contract duration | Neale (1973); Baker (1974); Skitmore and Patchell (1990) |
| Area of garage | Neale (1973); Baker (1974) |
| Area of corridors | Buchanan (1972); Baker (1974) |
| Number of lifts | Buchanan (1972); Baker (1974) |
| Basement wall area | James (1954) |
| Average superimposed loading, shortest span, longest span, slab concrete thickness | Buchanan (1972) |
| Area of internal walls, number of rooms and functional units, number of sanitary fittings | Moyles (1973) |
| Slope of site, unit cost of external finishes, cost of sanitary fittings, area and volume of kitchen units, site densities, number of doors, area of walls, number of angles on plan, duration of development, type of central heating | Neale (1973) |
| Area of single units, double units, triple units, common rooms, Warden’s Flat, laundry | Baker (1974) |
| Price index, quality, building technology | Kouskoulas and Koehn (1974) |
| Quantities of constituents of concrete construction, structural scheme, section of beams, grade of concrete, grid location, grid size | Singh (1990) |
| Number of bidders | Skitmore and Patchell (1990) |
| Project operation, abnormality, and horizontal access | Khosrowshahi and Kaka (1996) |

Note: Bold typed variables are measurable.

3.13 Summary

A building price forecasting model is a system that produces forecasted prices

from historical data. It is a type of technical model that attempts to identify the

variables that have the most influence on building prices. Forecasting models can be

distinguished according to whether they are black box or realistic, deterministic or

stochastic, and deductive or inductive. A more detailed classification was prepared

by Newton using descriptive primitives. According to his classification, the final

models in this research are specific for individual types of building (Data); applicable


to finished works, i.e. equating price to a function of identified variables that

comprises floor and external wall areas and so forth (Units); represent the designer’s

price forecast (Usage); follow a macro approach, i.e., producing forecasts for the

whole building (Approach); are applied at the feasibility and sketch design stage

(Application); are simulation models in terms of the problem boundary, the variables

considered and the inter-relationships between the variables (Model); are generated

by regression analysis (Techniques); are based on explicit assumptions about defined

problem boundaries (Assumptions) and are stochastic in terms of their performance

assessment (Uncertainty).

The characteristics of different types of forecasting cost models are

summarised in Skitmore and Patchell’s study. The application of cost models is

highly restricted by the assumptions that lie beneath the models, their reliance on

historical data for predicting future events, the insufficiency of information and

preparation time, and their reliance on expert judgment. Many studies on the

development of new models are criticised for their overemphasis on the uniqueness

and innovativeness of the model, their neglect of the practicability of the model

and the lack of a clear demonstration of the benefit of the model, especially in terms

of their forecasting performance relative to the conventional models. To put forth

more advanced models in practice, their statistical significance and practical

significance are both crucial issues that should be addressed.

James’ storey enclosure model (JSEM), proposed in 1954, has been chosen

for further development. The original model uses some physical measurements,

such as floor area, roof area and elevation area of buildings to estimate building

prices. Although JSEM is not a widely used model in practice, and suffers from the

same inherent shortcomings as other early stage conventional forecasting models,


JSEM has been proved empirically to outperform other models. As the simplified

equation for JSEM for multi-storey buildings (as elaborated in Chapter 5) shows that

it can be considered as a problem of determining the best set of predictors, regression

analysis is employed to improve the JSEM further. The regressed models are

developed empirically, and are expected to be understandable, easy to use, fairly

accurate and reliable.

Predictors for regression models that have been used in previous studies are

reviewed. The two most commonly studied variables are the floor area, as

represented by the gross floor area or typical floor area, and the building height, as

represented by the number of storeys, overall height and storey height. The former

variable represents the costs of the horizontal elements of a building, whereas the

latter variable represents the costs of the vertical elements. In JSEM, the identified

variables include the areas of the floor, roof, basement walls and external walls.

The price of buildings can be expressed in an unlimited number of ways with

different mathematical functions and variables. Occam’s razor is addressed at the

end of this chapter because it is considered to be the most important principle for

model development. Taking this into account, the regression technique that is used

in this research is considered to be the means to achieve the necessary parsimony.


Chapter 4 Performance of Forecasting Models

The more unpredictable the world is the more we rely on predictions.

Steve Rivkin

4.1 Introduction

It is essential for modellers to demonstrate the benefits of a new forecasting

model or approach to practising forecasters before its launch. The fundamental

benefit that a new model should show is an improvement in forecasting performance.

For instance, this study hypothesises that the new regressed models outperform the

conventional models in terms of forecasting accuracy. Much research has been

conducted in the past on the subject of forecasting performance, and some of this

research has studied the determinants of forecasting performance.

The measures for forecasting performance include bias, consistency and

accuracy. The bias in forecasting that is produced by a model is generally

represented either by the average percentage difference between the designers’

forecast and the lowest tender sum, or the average ratio between them. Bias is the

most popular measure of performance. Consistency refers to the degree of variation

http://www.worldofquotes.com/author/Steve-Rivkin/1/index.html


around the average that is represented by standard deviations, and accuracy is the

combination of bias and consistency into a single quantity (Skitmore 1991 p.2).

4.2 Measures of Forecasting Accuracy

A naive definition of accuracy would be the absence of error, or the assertion

that the smaller the error, the higher the accuracy and vice versa (Flanagan and

Norman 1983). Accuracy measures are usually defined in terms of the ratio of the

lowest bid to a forecast, the ratio of a forecast to the lowest bid (the reciprocal of the

ratio of the lowest bid to a forecast), the percentage by which the lowest bid exceeds

a forecast, the percentage by which a forecast exceeds the lowest bid (the counterpart

of the percentage by which the lowest bid exceeds a forecast), the difference between

the lowest bid and a forecast, and the total number of “serious” errors. As the

percentage by which a forecast exceeds the lowest bid is a widely accepted

expression of error in practice and is a unit-free measure, it is used to measure

accuracy in this study.

Accuracy measures have two major components that need to be interpreted:

bias and consistency. Bias can be measured by the arithmetic mean, median,

Pearson r, Spearman’s rho and the coefficient of regression of the errors, the

percentage errors or the ratios described above. The first measure of bias uses

forecasts as the base of reference, which is suitable for the evaluation of the

forecasting performance of an individual forecaster or an individual company. The

second and third measures are statistically the same. The fourth measure does not


take into account the scale of the data, and data with large numbers might easily

dominate the comparisons.

Consistency can be measured by the standard deviation and coefficient of

variation of the errors, the percentage errors or the ratios described above. While

they both represent the degree of variation around the mean, the latter measure

adjusts for differences in the magnitudes of the means of the data sets.

Instead of measuring bias and consistency separately, accuracy can be

measured by a single quantity. The common combined measures, found mainly

in research on modelling, are the mean square error, the root mean square error

and the mean modulus (absolute) percentage error.
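The measures named above can be computed directly from a set of forecasts and lowest bids. The following minimal sketch uses made-up figures and hypothetical function names; it takes the percentage by which a forecast exceeds the lowest bid as the error, the mean as the measure of bias and the standard deviation as the measure of consistency. The population form of the standard deviation is used here only so that the identity between the mean square error, bias and consistency holds exactly.

```python
import math
import statistics


def percentage_errors(forecasts, lowest_bids):
    """Percentage by which each forecast exceeds the corresponding lowest bid."""
    return [100.0 * (f - b) / b for f, b in zip(forecasts, lowest_bids)]


def summarise(errors):
    """Bias, consistency and combined accuracy measures of percentage errors."""
    bias = statistics.fmean(errors)
    consistency = statistics.pstdev(errors)  # population standard deviation
    mse = statistics.fmean([e * e for e in errors])
    return {
        "bias": bias,
        "consistency": consistency,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mape": statistics.fmean([abs(e) for e in errors]),
    }


# Illustrative figures only (not data from this research).
forecasts = [105.0, 98.0, 120.0, 90.0]
lowest_bids = [100.0, 100.0, 100.0, 100.0]

errors = percentage_errors(forecasts, lowest_bids)
summary = summarise(errors)
print(summary)

# The mean square error combines the two components:
# mse = bias**2 + consistency**2 (with the population standard deviation).
```

The closing identity shows in what sense a combined measure folds bias and consistency into a single quantity: a model can reach the same mean square error through a large bias with little scatter or through no bias with wide scatter, which is why the two components are also reported separately.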

Skitmore et al. (1990 pp. 5-23) extensively review the measures of

performance of forecasts in the literature. The different representations of bias,

consistency and accuracy (combined accuracy measures) in previous research are

summarised in Table 4-1. The authors found that the consistency measures in terms

of the coefficient of variation of forecasts and the overall accuracy measures are

far less frequently used than bias measures.

Since all of the models compared in this study are generated and tested with

the same set of data, and the use of cross validation for modelling is likely to

produce mean percentage errors that are close to zero, the effect of the magnitude

differences mentioned earlier is likely to be small, which lessens the benefit of using

the coefficient of variation. To compare models deterministically, a forecasting

model that is less biased (e.g. smaller mean error) and more consistent (e.g. smaller

standard deviation of error), or more accurate (e.g. smaller mean square error) than

other models is preferable. However, the more sophisticated probabilistic


approach suggests that statistical inference should be used to conclude whether one

model is significantly better than the others. There are far more statistical inference

methods available for the measures using mean and standard deviation (or variance,

i.e. the square of standard deviation). Therefore, this study adopts the mean and

standard deviation of percentage error as the measures of forecasting performance.
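As a simple picture of what statistical inference on the mean adds to a deterministic comparison, the sketch below computes a paired t statistic for the percentage errors of two models on the same projects. It is one of many possible tests and is not necessarily the test adopted in this research; the error values and the function name are hypothetical.

```python
import math
import statistics


def paired_t_statistic(errors_a, errors_b):
    """t statistic and degrees of freedom for H0: mean(errors_a - errors_b) = 0."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical percentage errors of two models on the same five projects.
errors_model_a = [5.0, -2.0, 20.0, -10.0, 8.0]
errors_model_b = [3.0, -1.0, 12.0, -6.0, 4.0]

t_stat, dof = paired_t_statistic(errors_model_a, errors_model_b)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

With four degrees of freedom, the two-sided 5% critical value of Student's t is about 2.776, so the illustrative statistic of roughly 0.87 would not indicate a significant difference in mean error, even though the two mean errors differ numerically.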

Skitmore et al. (1990 pp. 5-23) also suggest that there are five primary

determinants that affect forecasting performance: the nature of the target, the

information used, the forecasting technique used, the feedback mechanism used and

the person who is providing the forecast. Except for the feedback mechanism, the

other factors have been well explored by many researchers. A summary of the

empirical evidence on the factors that affect forecasting quality is shown in Table 4-2.

The table is an extended version of a similar table that was prepared by Skitmore et

al. (1990 pp. 20-21), with more recent empirical studies incorporated.

One of the major inadequacies that is found from a review of the literature on

forecasting accuracy is that some of the evidence is not strong enough because of a

lack of tests for the significance of forecasting errors (Skitmore and Drew 2003).

According to Table 4-2, there are a few contradictory results, but these contradictions

might have occurred by chance and may not represent the true population (Gunner

1997 p.30-31).


Table 4-1: Measures of Performance of Forecasts (Source: Skitmore et al. 1990

p. 22)




Table 4-2: Factors affecting quality of forecasts – summary of empirical

evidence (extended from the similar table in Skitmore et al. (1990, p. 20-21))

Factor Researcher Evidence

(1) Nature of target

Contract works type

McCaffer (1975) Buildings more biased and more consistent than roads.

Harvey (1979) Different biases for buildings, non buildings, special trades, and others.

Morrison & Stevens (1980) Different bias and consistency for schools, new housing, housing modifications, and others.

Flanagan & Norman (1983) No bias differences between schools, new housing, housing modifications, and others.

Skitmore (1985) Different bias and consistency for school, housing, factory, health centre and offices.

Skitmore & Tan (1988) No bias or consistency differences for libraries, schools, council houses, offices and other buildings.

Skitmore et al (1990, pp. 79-87)

No bias or consistency differences for primary school, sheltered housing, offices, unit factories, health centres and other buildings

Quah, L. K. (1992) New works more consistent than refurbishment

Gunner and Skitmore (1999) No bias or consistency differences for commercial, non-commercial and residential buildings. Renovation works more biased and more consistent than new works.

Skitmore and Drew (2003) No bias or consistency differences for commercial, health, apartment, education and other. No bias or consistency differences for new and alterations works.

Contract size McCaffer (1975) No bias trend.

Harvey (1979) Bias reduces with size.

Morrison & Stevens (1980) Modulus error reduces with size. Consistency improves with sizes.

Flanagan & Norman (1983) Bias trend reversed between samples.

Wilson et al (1987) No linear bias trend.

Skitmore & Tan (1988) Bias reduces and consistency improves with size.

Skitmore (1988) No consistency trend.

Ogunlana and Thorpe (1991) Consistency reduces with larger contract size.

Cheong (1991, p.106) No consistency trend.

Thng (1989) Ditto.

Gunner and Skitmore (1999) Bias reduces and consistency improves with size.


Skitmore (2002) No bias or consistency trend.

Skitmore and Drew (2003) No bias or consistency trend.

Project size (area)

Skitmore and Drew (2003) No bias or consistency trend.

Contract conditions type

Wilson et al (1987) More bias for bill of quantities contracts.

Gunner and Skitmore (1999) (1) Bias difference between conditions of contract issued by Singapore Institute of Architects (SIA) and standard form (RHLB form). (2) Consistency difference between contract with a fluctuation provision and contract without.

Geographical location

Harvey (1979) Bias differences between Canadian regions.

Wilson et al (1987) No bias trend between Australian regions.

Ogunlana and Thorpe (1991) No conclusion although bias and consistency difference between regions of United Kingdom.

Nature of competition

Harvey (1979) Bias differences for individual bidders.

McCaffer (1975) Estimates higher with more bidders.

de Neufville et al (1977) Ditto.

Harvey (1979) Ditto. Inverse number of bidders gives best model.

Flanagan & Norman (1983) Estimates higher with more bidders.

Runeson & Bennett (1983) Ditto.

Hanscomb Association (1984) Estimates higher with more bidders. Non linear relationship.

Wilson et al (1987) Ditto.

Tan (1988) Ditto but not with UK data.

Ogunlana and Thorpe (1991) No bias and consistency trend.

Skitmore (2002) No consistency trend.


Prevailing economic climate

de Neufville et al (1977) Estimates higher in ‘bad’ years with lagged response rate.

Harvey (1979) Ditto.

Flanagan & Norman (1983) Ditto.

Morrison & Stevens (1980) Estimates lower in uncertain economic climate.

Ogunlana and Thorpe (1991) No significant relationship.

Gunner and Skitmore (1999) Estimates higher in ‘bad’ years with lagged response rate.

Price intensity Skitmore et al (1990, p.191) High value contracts were underestimated and low value contracts over estimated.

Gunner and Skitmore (1999) Ditto.

Contract period Skitmore (1988) No difference between groups of contract period

Gunner and Skitmore (1999) No conclusion due to different results obtained from using contract sum as the base for measurement of bias against contract sum minus provisional sums as the same.

Other project characteristics

Skitmore & Tan (1988) Bias reduces and consistency trend with contract period and basic plan shape.

Ogunlana and Thorpe (1991) Bias and consistency differences between design offices.

Gunner and Skitmore (1999) (1) Bias and consistency better for foreign than local (Singapore) contractors. (2) No conclusion although bias difference between foreign and local architects. (3) Consistency improves with increasing area. (4) Consistency better for private sector than public sector

Skitmore and Drew (2003) No bias or consistency trend with client type.



(2) Level of information

Number of priced items

Jupp & McMillan (1981) Slight bias reduction with price data.

Bennett (1987) Consistency differences between price data sources.

Skitmore (1985) No bias or consistency trend with price data. Increased bias and consistency with project information.

Gunner and Skitmore (1999) Consistency reduces as the number of items reduces.

Preliminaries percentage

Gunner and Skitmore (1999) No conclusion due to different results obtained from using contract sum as the base for measurement of bias against contract sum minus provisional sums as the same.

(3) Forecasting technique

James (1954) Consistency differences between cube, floor area and storey enclosure methods.

McCaffer (1975) Consistency better for regression methods than conventional.

Morrison & Stevens (1981) Simulation model has less bias and more consistency than conventional

Ross (1983) Consistency better with simpler techniques.

McCaffer et al (1984) Consistency of regression and method comparable with conventional.

Brandon et al (1988) Expert system has less bias and more consistency than conventional.

Munns and Al-Haimus (2000) Cost significant global model has less bias and more consistency than conventional.

Skitmore and Drew (2003) Bias and consistency differences between approximate quantities and floor area methods. Consistency better for floor area method.

(4) Use of feedback

No evidence available



(5) Ability of forecasters

Forecasters Jupp & McMillan (1981) Bias and consistency differences between subjects.

Morrison & Stevens (1980) Bias and consistency differences between offices.

Skitmore (1985) Bias and consistency differences between subjects.

Skitmore et al (1990) Bias and consistency differences between subjects.

Gunner and Skitmore (1999) No bias but consistency differences between subjects

Number of price forecasts

Gunner and Skitmore (1999) Bias reduces in proportion to number of price forecasts

4.3 Base Target for Forecasting Accuracy

Generally speaking, contractors derive a tender price by summing the

estimated total costs of production (including head office overheads and the cost of

finance) and their mark up. For the traditional procurement method, where the task

of design is separated from that of construction, design team members gain no access

to the details of the estimated costs of production or the allowed mark up in tenders.

The target of forecasts at the early design stage is the returned tender price,

rather than the final contract price, as the latter presents far too many unforeseeable


reasons and uncertainties, such as the possibility of contractual claims, that would

frustrate the forecasts of the final contract sum in the early design stage. There is,

however, a controversy between practice and academia about the use of returned

tenders as the forecasting target. Some suggest that the target should be the lowest

returned tender price (Morrison 1994; Ogunlana and Thorpe 1987) whilst others

suggest employing the mean (McCaffer 1976) or median of the returned tender

prices. The mean is proposed on the grounds that it is less
variable, and is therefore likely to be forecast more accurately. The proposed use of the

median simply derives from a conservative notion, though one that is widely

accepted by practicing forecasters, that the possibility of underestimation should be

avoided. As price models are used to forecast the market price (i.e. the unknown

value of the contract to contractors buying on the contract market) (Skitmore and

Marston 1999, p.20), the lowest returned tender price is chosen to be the forecasting

target in this study. After all, the major interest of the forecasting exercise is to

predict the probable market price, and the use of the mean or median is ill defined.

Moreover, the effect of using the lowest or the mean tender on the assessment of

accuracy is found to be small (Beeston 1983).

4.4 Overview of Model Performance at Various

Design Stages

The field of forecasting techniques has been studied for more than four

decades. The popularity of research on this subject is due to the inherent


shortcomings of the conventional models of forecasting, as is described in Chapter 2.

As is detailed in section 3.6 of Chapter 3, modellers should demonstrate the benefits

of a model before attempting to implement it in practice, and thus the assessment of

the performance of a newly developed model is essential. However, only a few

empirical studies demonstrate the performance of new models, in a relative sense,

compared with that of conventional models, and even fewer deal with performance

measurement seriously by the use of statistical inference.

Barnes (1971) suggests that the performance of designers’ forecasts at the

commencement of feasibility studies is between +20% to 40% of the coefficient of

variation (cv), which improves to +10% to 20% cv at the commencement of the

detailed design stage.

Beeston (1974) uses a hypothetical example to show that the variability of
designers' forecasts can, at best, be reduced to a level close to that of the
contractor's estimate. On the assumption that the variability of the designers'
forecast can be reduced to 5%, the coefficient of variation of the differences
between the forecast and the lowest tender would be 6%, and there would be no
further reduction. If a coefficient of variation of 6% could be achieved, then 60%
of the designers' forecasts would fall within 5% of the lowest tender, 90% would
fall within 10% and all of the forecasts would fall within 20%.
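The percentages quoted from Beeston can be checked approximately with a short calculation. The sketch below assumes that percentage errors are unbiased and normally distributed with a standard deviation equal to the 6% coefficient of variation; this distributional assumption and the function name `share_within` are illustrative and are not taken from Beeston (1974).

```python
import math


def share_within(limit_pct: float, cv_pct: float) -> float:
    """Share of forecasts whose error lies within +/- limit_pct.

    Assumes unbiased, normally distributed percentage errors with
    standard deviation cv_pct (an assumption made for this sketch).
    """
    z = limit_pct / cv_pct
    # For a standard normal variable Z, P(|Z| < z) = erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2.0))


for limit in (5, 10, 20):
    print(f"within {limit}%: {share_within(limit, 6.0):.3f}")
```

Under these assumptions the shares come out at roughly 60%, 90% and very nearly 100%, which is in line with the figures quoted above.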

Marr (1974) divides designers’ price forecasting into four stages: planning,

budget, schematics and preliminaries. Their corresponding adequate degrees of

accuracy are stated as 20-40% for planning, 15-30% for budget, 10-20% for

schematics and 8-15% for preliminaries, reducing to 5-10%. McCaffery (1978), in

his assessment of the forecasting accuracy for 15 schools, also makes a similar


division of stages, i.e., forecast, brief, sketch plan and detailed design. Their

corresponding cv are 17%, 10%, 9% and 6%.

McCaffer (1975) compares the quality of eight multiple regression statistical

models with that of other unspecified (conventional) models that are used by

practicing forecasters. Table 3-7 shows the performance of the eight methods.

Based on the assumption that the coefficient of variation of the forecast is likely to be

25% to 50% greater than the coefficient of variation of the prediction, as suggested

by the author, the multiple regression approach is proved to produce better quality

forecasts than the other (unspecified) methods that are adopted in practice.

Ross (1983) reveals some surprising results on the relationship of the

sophistication of models and their performance. Three models are compared in his

study. The first uses the simple average of the value of sections of construction

work from a set of bills of quantities for previous contracts. The second model uses

a regression procedure to predict the total value from the sectional values, and the

third model uses a regression on the unit value of items. The models are therefore

arranged in order of increasing use of information. However, the results reveal the

first model to be the most accurate, with a coefficient of variation of 24.5%, followed

by the second model with a cv of 30.49%, and the third method with a cv of 52.66%,

which suggests, controversially, that the more sophisticated methods that utilise more

of the available data produce less accurate results.

Ashworth and Skitmore (1983) review the literature that is concerned with the

forecasting accuracy of conventional models. Their cited references are shown in

Table 4-3. They draw the important conclusions that certain types of project are

associated with higher degrees of accuracy, and that the estimation accuracy is found


to be 15% to 20% cv in the early design stages, which only improves to 13% to 18% at

the tender stage.

McCaffer et al. (1984) suggest a more sophisticated approach to forecasting

based on the element unit rate method. This approach involves the use of 32

different models together with a criterion for selecting the most appropriate model to

match the characteristics of the target. The reported consistency of this method is

between 10% and 19% cv, which is at least comparable to that of conventional

methods.

The research of Brandon et al. (1988) suggests the use of a developed expert

system for forecasting. The performance of the expert system for early stage

forecasting for office projects is reported to be within 5% of that predicted by the

expert forecaster, and the system provides a forecast within 10% of the lowest bid,

which is much better than that achieved by the average forecaster.

Skitmore and Drew (2003) reveal significant differences in both bias

(ANOVA test, p=0.021) and consistency (Bartlett’s test, p=0.030) between the

approximate quantities and the floor area method. Similar to the surprising result

found by Ross, the approximate quantities method (with 14.27% cv) that utilizes

more data is found to be less accurate than the floor area method (with 10.87% cv).

Table 4-3: Performance of designers’ forecasts reviewed by Ashworth and Skitmore (1983)

4.5 Summary

The bias of the forecasts that are produced by a model is calculated as the
arithmetic mean of the percentage differences between the designers’ forecasts and
the lowest tender sums. Expressing the errors as percentages gives a unit-free
measure of forecasting error. Consistency is the degree of variation around the
average, which is represented by the standard deviation of the percentage errors.
Both bias and consistency are chosen as the accuracy measures in this study.
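The two measures can be illustrated with a minimal sketch. The forecast and tender figures below are hypothetical, the function names are introduced only for this example, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which form is used.

```python
from statistics import mean, stdev


def percentage_errors(forecasts, lowest_tenders):
    """Percentage difference between each forecast and its lowest tender."""
    return [100.0 * (f - t) / t for f, t in zip(forecasts, lowest_tenders)]


def bias_and_consistency(forecasts, lowest_tenders):
    """Return (bias, consistency) of a set of forecasts.

    bias        = arithmetic mean of the percentage errors
    consistency = standard deviation of the percentage errors
                  (sample form, which is an assumption of this sketch)
    """
    errors = percentage_errors(forecasts, lowest_tenders)
    return mean(errors), stdev(errors)


# Hypothetical figures, for illustration only.
forecasts = [105.0, 98.0, 120.0, 90.0]
lowest_tenders = [100.0, 100.0, 100.0, 100.0]
bias, consistency = bias_and_consistency(forecasts, lowest_tenders)
print(f"bias = {bias:.2f}%, consistency = {consistency:.2f}%")
```

A positive bias indicates that the forecasts overestimate the lowest tender on average, and a smaller consistency value indicates less scatter around that average.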

There are different options as to the choice of forecasting target. The lowest

returned tender price is considered to be a more appropriate forecasting target than

the mean or median of returned tender prices, as the major concern of the forecasting

exercise is to predict the probable market price to be paid by clients, which is often

the lowest bid in a tendering exercise.

Results from studies on bias and consistency tend to be contradictory, rather

than conclusive. Although significance tests can provide strong evidence to show

whether one model prevails over the others, and as a result can demonstrate the

major benefit of using the model in terms of its accurate performance, a review of

forecasting studies finds that sufficient significance testing is lacking.

Empirical studies on forecasting accuracy in different stages suggest that

there is little improvement in accuracy as a building project proceeds from the early

design stage to the detailed design stage. Paradoxically, there are two studies, one

by Ross and the other by Skitmore and Drew, which provide evidence that rougher

models are more accurate than more sophisticated models.

Chapter 5 Methodology

By three methods we may learn wisdom: first, by reflection which is noblest; second, by imitation, which is the easiest; and third, by experience, which is the bitterest. Confucius

5.1 Introduction

James’ Storey Enclosure Model (JSEM) has been chosen for further

development because it is considered to be more sophisticated than other

conventional models in terms of the number of variables contained therein (such as

the floor area for each floor, basement area, external wall area and roof area of a

building) and the rationale behind the use of these variables (i.e. the consideration of

certain design factors, such as the shape of buildings, the vertical positioning of floor

areas, the storey heights and the cost of sinking storeys below ground in estimating

building prices). However, JSEM lacks the appropriate support for its assigned

weightings and selection of variables. Disregarding this deficiency, JSEM has been

judged, if rather roughly, to be a more accurate method than the floor area and the

cube models. The floor area model is still a very popular model that is widely

employed in practice, whereas JSEM is still only found in the textbooks of building

price studies. Because JSEM is as simple in application as other conventional

http://www.worldofquotes.com/author/Confucius/1/index.html


models, and has been proved to be relatively more accurate, it has been chosen for

further development in this research. The method for the development of JSEM

that is undertaken comprises the simplification of JSEM for multi-storey buildings by

making reasonable assumptions, the use of regression techniques for modelling

empirical data of building projects in Hong Kong and the assessment of forecasting

performance by statistical inference.

5.2 Research Framework

The further development of JSEM involves a purpose-designed modelling

approach that uses different regression techniques. Figure 5-1 shows the

framework for the identification, selection and validation of the price models in this

research. The framework comprises seven major steps: (1) the simplification of

JSEM, (2) data collection, (3) model building, (4) reliability analysis, (5) model

selection, (6) model adjustment and (7) performance assessment. The final price

models are developed through the identification of candidates in JSEM (Step 1) and

through the selection of predictors by regression techniques (Steps 2 to 6). The use

of regression techniques overcomes the major criticism of the irrationality of the

assigned weightings in JSEM. The forecasting performance of the final price models

(i.e. the best subset-regressed models) is then assessed by using the known measures

of the bias and consistency (Step 7). Finally, the best subset-regressed models are

compared with other conventional models to classify the models according to their

forecasting performance.
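The reliability analysis and model selection steps can be sketched as follows. This is a simplified illustration rather than the procedure used in the thesis: it searches all subsets exhaustively instead of using stepwise procedures, it fits each sub-sample through the normal equations, and the variable names (`v1`, `v2`, `v3`) and the data are synthetic.

```python
from itertools import combinations


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, ys):
    """Least-squares coefficients (intercept first) via the normal equations."""
    xs = [[1.0] + list(r) for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def loo_msq(rows, ys):
    """Average squared forecasting error over leave-one-out sub-samples."""
    total = 0.0
    for i in range(len(ys)):
        # Each sub-sample omits exactly one case, which is then forecast.
        beta = fit(rows[:i] + rows[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - predict(beta, rows[i])) ** 2
    return total / len(ys)


def best_subset(data, ys, names):
    """Return (average MSQ, predictor names) of the subset with the smallest MSQ."""
    best = None
    for size in range(1, len(names) + 1):
        for subset in combinations(range(len(names)), size):
            rows = [[case[j] for j in subset] for case in data]
            score = loo_msq(rows, ys)
            if best is None or score < best[0]:
                best = (score, [names[j] for j in subset])
    return best


# Synthetic data: the price depends on v1 and v2 only.
names = ["v1", "v2", "v3"]
data = [[1, 2, 5], [2, 1, 3], [3, 4, 8], [4, 3, 1],
        [5, 6, 9], [6, 5, 2], [7, 8, 7], [8, 7, 4]]
prices = [3 + 2 * v1 + 0.5 * v2 for v1, v2, _ in data]
score, chosen = best_subset(data, prices, names)
print(chosen, score)
```

Because every case is forecast by a model that was fitted without it, the average MSQ measures out-of-sample error, which is why it can serve as a reliability measure for small samples.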

Figure 5-1: Research Framework for Identification, Selection and Validation of
Price Models

[Flowchart not reproduced. Major steps: Simplification of JSEM; Data Collection;
Model Building; Reliability Analysis; Model Selection; Model Adjustment;
Performance Assessment. Box labels recovered from the flowchart: Identification of
Candidate Variables; Classification and Entry of Historical Data; Formation of Base
Model containing all Candidates; Generation of Subset Models by Least Square Error
Method; Selection by Forward Stepwise and Backward Stepwise Procedures; Calculation
of Average MSQ for Each Subset Model by Cross Validation; Leave-One-Out Method;
Construct Sub-samples (each omitting one unique case from the sample); Fit Each
Sub-sample Using Least Square Method; Calculate Forecasting Error for Each
Sub-sample; Determine Average MSQ for Sub-samples of a Subset of Predictors; Best
Subset Model (Model with Smallest Average MSQ); Exclusion of Offending Variables;
Accuracy Testing against JSEM, Floor Area and Cube Models.]


5.3 Types of Quantity Measured in Single-Rate

Forecasting Models

The traditional models that are used for comparison in this study, including

JSEM, the floor area model and the cube model, are all single-rate forecasting

models. JSEM is the most complicated of the models because it demands more

measured variables, including the area of each floor, the perimeter of each floor, the

storey height of each floor and the roof area. Since the introduction of these
traditional models, further models have been proposed, such as those reviewed in
Chapter 3. However, most of them demand far more information than can be
extracted from sketch drawings. In other words, these models are intended for use
at a later design stage.

As described in Chapter 3, JSEM can be represented by Equation (5.1):

$$P = \left( \sum_{i=0}^{n-1} (2 + 0.15i)\, f_i + \sum_{i=0}^{n-1} p_i s_i + 2 \sum_{j=0}^{m-1} f'_j + 2.5 \sum_{j=0}^{m-1} p'_j s'_j + r \right) \cdot R, \tag{5.1}$$

where $P$ is the forecasted price, $f_i$ is the floor area at $i$ storeys above
ground, $p_i$ is the perimeter of the external wall at $i$ storeys above ground,
$s_i$ is the storey height at $i$ storeys above ground, $n$ is the total number of
storeys above ground level, $m$ is the total number of storeys below ground, $f'_j$
is the floor area at $j$ storeys below ground level, $p'_j$ is the perimeter of the
external wall at $j$ storeys below ground level, $s'_j$ is the storey height at $j$
storeys below ground level, $r$ is the roof area and $R$ is the unit rate
(determined by historical data).
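Equation (5.1) can be written as a short function. This is a minimal sketch, not code from the study: the function and parameter names are introduced here, and it assumes that the ground storey is indexed $i = 0$ so that a building with $n$ storeys above ground uses $i = 0, \dots, n-1$. The dimensions in the example are hypothetical.

```python
def jsem_price(floor_areas, perimeters, storey_heights,
               basement_floor_areas, basement_perimeters, basement_heights,
               roof_area, unit_rate):
    """Forecast price by the Storey Enclosure Method, Equation (5.1).

    The lists for storeys above ground start at the ground storey (index 0);
    the basement lists hold one entry per storey below ground.
    """
    # Floor areas above ground, weighted 2, 2.15, 2.30, ... by height.
    above = sum((2 + 0.15 * i) * f for i, f in enumerate(floor_areas))
    # External wall area above ground (perimeter times storey height).
    walls = sum(p * s for p, s in zip(perimeters, storey_heights))
    # Basement floor areas, weighted 2.
    below = 2 * sum(basement_floor_areas)
    # Basement wall areas, weighted 2.5.
    basement_walls = 2.5 * sum(
        p * s for p, s in zip(basement_perimeters, basement_heights))
    return (above + walls + below + basement_walls + roof_area) * unit_rate


# Hypothetical two-storey building without and with a one-storey basement.
no_basement = jsem_price([100, 100], [40, 40], [3, 3], [], [], [], 100, 1.0)
with_basement = jsem_price([100, 100], [40, 40], [3, 3],
                           [100], [40], [3], 100, 1.0)
print(no_basement, with_basement)
```

With a unit rate of 1.0 the result is the total storey enclosure quantity itself; in practice $R$ would be derived from historical projects.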


The floor area model and cube model can also be represented mathematically

by Equations (5.2) and (5.3), respectively:

$$P = \left( \sum_{i=0}^{m+n-1} f_i \right) \cdot R \tag{5.2}$$

$$P = \left( \sum_{i=0}^{m+n-1} f_i \cdot s_i \right) \cdot R \tag{5.3}$$

According to Equations (5.1) to (5.3), there are common variables amongst

the three models (e.g., fi, m and n).
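For comparison, the floor area and cube models of Equations (5.2) and (5.3) reduce to one line each. The sketch below uses illustrative function names and hypothetical dimensions and unit rates; the lists cover every storey, above and below ground.

```python
def floor_area_price(floor_areas, unit_rate):
    """Floor area model, Equation (5.2): total floor area times a unit rate."""
    return sum(floor_areas) * unit_rate


def cube_price(floor_areas, storey_heights, unit_rate):
    """Cube model, Equation (5.3): total enclosed volume times a unit rate."""
    volume = sum(f * s for f, s in zip(floor_areas, storey_heights))
    return volume * unit_rate


# Hypothetical three-storey building; each model has its own unit rate.
areas = [100.0, 100.0, 80.0]
heights = [3.0, 3.0, 4.0]
print(floor_area_price(areas, 10.0), cube_price(areas, heights, 3.0))
```

Neither model distinguishes between storeys at different levels, which is the refinement that JSEM adds through its height-dependent weightings.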

5.4 Simplification of JSEM

To make a price model useful, it must be general enough to accommodate

variations without violating the original assumptions of the model, and specific

enough to reflect cost-significant factors. It must also be simple enough to be

understood by practicing forecasters, and intricate enough to explain real situations.

Although the data that are used in James’ study are mainly from low-rise buildings

(less than three storeys) such as houses, and medium rise buildings (3 to 10 storeys)

such as schools and industrial buildings, JSEM can also be applied to high-rise

buildings (higher than 10 storeys). Moreover, it can be applied to building projects
that contain more than one building (by adding another set of variables,
$\sum_{l=0}^{t-1} (2 + 0.15l)\, f''_l + r'' + \sum_{l=0}^{t-1} p''_l s''_l$).
However, the higher the building, the more variables have to be created. If
Equation (5.1) is used to estimate the price of a 40-storey building without a
basement (a typical number of storeys for high-rise buildings in Hong Kong), then
one has to measure the floor area, the perimeter and the storey height forty times
(once for each level), the number of levels, and the roof area. With JSEM, 81
variables (e.g., $p_j s_j$ and $p'_j s'_j$) have to be created, which are
calculated from 161 items of measurement (e.g., $p_j$, $s_j$, $p'_j$ and $s'_j$).
The rationale behind JSEM is that the areas of different parts of a building affect
the building price differently. The huge number of variables generated for
modelling the price of high-rise buildings would place a heavy burden on the size
of the data set required. However, the rationale can be sustained and the number of
variables can be significantly reduced if the assumption is made that the floor
areas at different levels are approximately the same. This assumption is supported
by the fact that high-rise buildings generally comprise repeating floors. Very
often, only typical layout plans, instead of the layout plans for every floor, are
provided for forecasting in the early design stage. Although layout plans for every
floor are more readily available at the later stages, other development
restrictions such as those laid down in land leases, e.g. the site coverage and the
plot ratio, leave little room for designers or decision makers to change the
distribution of areas and the number of storeys drastically. With this assumption,
the number of variables is reduced to four: the total number of levels, the total
elevation area (which can be easily measured by multiplying the average perimeter
by the overall building height), the average floor area and the roof area.

Equation (5.7) represents the simplified JSEM for use with high-rise

buildings. Care has to be taken to avoid applying the simplified equation to

buildings with significantly different floor sizes at different levels, or the assumption

will be violated. It is possible that the presence of a podium in a typical large

development may also violate the assumption, as floors that are located at podium

level will generally contribute a much larger average floor area than those in the

tower or towers above the podium. To avoid a probable violation, the variables in


JSEM that represent the floor area above ground level have to be divided into two

parts – one for the podium and the other for towers. This leads to Equation (5.6),

which represents JSEM for buildings with a podium design. Here are the steps for

deriving Equations (5.6) and (5.7).

Let $\sum_{i=0}^{n-1} p_i s_i = n\, p_{pt} s_{pt}$, where $p_{pt}$ is the average
perimeter of the superstructure and $s_{pt}$ is the average storey height of the
superstructure. Let $\sum_{j=0}^{m-1} p'_j s'_j = m\, p_b s_b$, where $p_b$ is the
average perimeter of the basement and $s_b$ is the average storey height of the
basement. Let $\sum_{j=0}^{m-1} f'_j = m f_b$, where $f_b$ is the average floor
area per storey for floors at basement level and
$f'_0 \approx f'_1 \approx \dots \approx f'_{m-1} \approx f_b$ (the floor area for
each level of the basement is more or less the same, and is approximately equal to
$f_b$). Equation (5.1) for JSEM becomes:

$$P = \left( \sum_{i=0}^{n-1} (2 + 0.15i)\, f_i + r + n\, p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right) \cdot R. \tag{5.4}$$

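Consider that a building comprises a podium section and a tower section. Let
$n = a + b$, where $a$ is the number of storeys of the podium and $b$ is the number
of storeys of the tower. Assume that
$f_0 \approx f_1 \approx \dots \approx f_{a-1} \approx f_p$ (the floor area for
each level of the podium is more or less the same, and is approximately equal to
$f_p$), where $f_p$ is the average storey area for floors at the podium level, and
that $f_a \approx f_{a+1} \approx \dots \approx f_{a+b-1} \approx f_t$ (the floor
area for each level of the tower is more or less the same, and is approximately
equal to $f_t$), where $f_t$ is the average storey area for floors at tower level.
Then,

$$\begin{aligned}
\sum_{i=0}^{n-1} (2 + 0.15i)\, f_i &= \sum_{i=0}^{a+b-1} (2 + 0.15i)\, f_i = \sum_{i=0}^{a-1} (2 + 0.15i)\, f_p + \sum_{i=a}^{a+b-1} (2 + 0.15i)\, f_t \\
&= 2 a f_p + 0.15\,[0 + 1 + \dots + (a-1)]\, f_p + 2 b f_t + 0.15\,[a + (a+1) + \dots + (a+b-1)]\, f_t \\
&= 2 a f_p + 0.15\,[0 + 1 + \dots + (a-1)]\, f_p + 2 b f_t + 0.15\, a b f_t + 0.15\,[0 + 1 + \dots + (b-1)]\, f_t \\
&= 2 a f_p + 0.15 \cdot \frac{a(a-1)}{2}\, f_p + 2 b f_t + 0.15\, a b f_t + 0.15 \cdot \frac{b(b-1)}{2}\, f_t \\
&= \left(2 - \frac{0.15}{2}\right) a f_p + \frac{0.15}{2}\, a^2 f_p + \left(2 - \frac{0.15}{2}\right) b f_t + \frac{0.15}{2}\, b^2 f_t + 0.15\, a b f_t
\end{aligned} \tag{5.5}$$

The simplified equation for JSEM becomes:

$$\begin{aligned}
P &= \left[ \left(2 - \frac{0.15}{2}\right) a f_p + \frac{0.15}{2}\, a^2 f_p + \left(2 - \frac{0.15}{2}\right) b f_t + \frac{0.15}{2}\, b^2 f_t + 0.15\, a b f_t + r + n\, p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left(2 - \frac{0.15}{2}\right) a f_p R + \frac{0.15}{2}\, a^2 f_p R + \left(2 - \frac{0.15}{2}\right) b f_t R + \frac{0.15}{2}\, b^2 f_t R + 0.15\, a b f_t R + r R + n\, p_{pt} s_{pt} R + 2 m f_b R + 2.5\, m p_b s_b R
\end{aligned} \tag{5.6}$$

Consider a building that has no podium, or that the average storey area for the
podium is approximately equal to that of the tower, i.e., $f_p \approx f_t \approx f_{pt}$,
where $f_{pt}$ is the average storey area for floors above ground level, and
$a + b = n$. The simplified equation becomes: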

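$$\begin{aligned}
P &= \left[ \left(2 - \frac{0.15}{2}\right) a f_{pt} + \frac{0.15}{2}\, a^2 f_{pt} + \left(2 - \frac{0.15}{2}\right) b f_{pt} + \frac{0.15}{2}\, b^2 f_{pt} + 0.15\, a b f_{pt} + r + n\, p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left[ \left(2 - \frac{0.15}{2}\right) (a + b)\, f_{pt} + \frac{0.15}{2}\, (a + b)^2 f_{pt} + r + n\, p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left[ \left(2 - \frac{0.15}{2}\right) n f_{pt} + \frac{0.15}{2}\, n^2 f_{pt} + r + n\, p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left(2 - \frac{0.15}{2}\right) n f_{pt} R + \frac{0.15}{2}\, n^2 f_{pt} R + r R + n\, p_{pt} s_{pt} R + 2 m f_b R + 2.5\, m p_b s_b R
\end{aligned} \tag{5.7}$$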
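The closed-form floor terms in Equations (5.6) and (5.7) can be checked numerically against the storey-by-storey sum. The sketch below assumes that storeys above ground are indexed from 0 (ground storey) to $n - 1$, which is the indexing under which the coefficient $2 - 0.15/2$ arises; the function names and the building dimensions are hypothetical.

```python
import math


def floor_term_explicit(a, b, f_p, f_t):
    """Sum (2 + 0.15 i) f_i storey by storey: a podium storeys, then b tower storeys."""
    podium = sum((2 + 0.15 * i) * f_p for i in range(a))
    tower = sum((2 + 0.15 * i) * f_t for i in range(a, a + b))
    return podium + tower


def floor_term_closed(a, b, f_p, f_t):
    """Closed form of the same sum, as in the floor terms of Equation (5.6)."""
    c1 = 2 - 0.15 / 2
    c2 = 0.15 / 2
    return (c1 * a * f_p + c2 * a ** 2 * f_p
            + c1 * b * f_t + c2 * b ** 2 * f_t
            + 0.15 * a * b * f_t)


# A 4-storey podium under a 36-storey tower, and a uniform 40-storey block.
# Setting a = 0 and b = n reduces the closed form to the floor terms of (5.7).
podium_case = (4, 36, 1500.0, 600.0)
uniform_case = (0, 40, 600.0, 600.0)
for case in (podium_case, uniform_case):
    explicit = floor_term_explicit(*case)
    closed = floor_term_closed(*case)
    print(case, math.isclose(explicit, closed, rel_tol=1e-9))
```

The remaining terms (roof, walls and basement) are unchanged by the simplification, so agreement of the floor terms is sufficient to confirm the closed forms.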

5.5 Identification of a Problem

In JSEM, building prices are assumed to be proportional to the floor area,

roof area and elevation area. However, their exact relationships have not been

properly studied. As JSEM has been determined by rule of thumb, or by a very

coarse method, it is possible that JSEM may include some irrelevant predicting

variables, or have excluded some significant predicting variables, and that the

relationships between building prices and the predicting variables are not the same as

has been proposed. As suggested, JSEM can be represented by Equations (5.6)
and (5.7). Each of these equations has the form of a linear model that contains one
dependent variable (response), $P$, and several independent variables (predictors),
such as $n f_{pt} R$ and $rR$, whose coefficients can be estimated statistically by
regression techniques. The question at hand can therefore be considered as a
typical multiple linear regression problem. Regression techniques can be used to
determine the subset of variables and the corresponding coefficients that give the
best forecast of the building prices. The developed regressed models and the
employed modelling approach in this research are both advancements of JSEM.

Let all of the possible predictors be $V_i$, where $i = 1, 2, \dots, k$. The
building price model can then be represented as

$$P = \beta_0 + \beta_1 V_1 + \beta_2 V_2 + \dots + \beta_k V_k = \beta_0 + \sum_{i=1}^{k} \beta_i V_i, \tag{5.8}$$

where $\beta_0$ and the $\beta_i$ are constant coefficients and the $V_i$ are
independent variables.

Table 5-1 shows the coefficients and the variables that are designated in JSEM with

reference to Equations (5.6) and (5.7).
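The relationship between Equation (5.8) and the coefficients designated in JSEM can be illustrated by fitting the linear model to synthetic data. In the sketch below, prices are generated from three of the Equation (5.7) variables using the JSEM coefficients, and ordinary least squares then recovers those coefficients. The data are invented, the unit rate is taken as 1, areas are in arbitrary units, and the solver is a simple normal-equations implementation chosen for self-containment rather than the software used in the study.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, ys):
    """Least-squares estimates of beta_0, beta_1, ..., beta_k in Equation (5.8)."""
    xs = [[1.0] + list(r) for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic cases: (number of storeys n, average floor area f_pt, roof area r).
cases = [(2, 1.0, 0.9), (3, 0.8, 0.8), (5, 1.2, 1.0), (4, 0.5, 0.6),
         (8, 1.5, 1.2), (10, 0.7, 0.5), (6, 1.1, 1.1), (7, 0.9, 0.7)]
# Predictors n*f_pt, n^2*f_pt and r, with R = 1.
rows = [[n * f, n ** 2 * f, r] for n, f, r in cases]
# Coefficients designated in JSEM: beta_0 = 0, 2 - 0.15/2, 0.15/2 and 1.
designated = [0.0, 2 - 0.15 / 2, 0.15 / 2, 1.0]
prices = [designated[0] + sum(b * v for b, v in zip(designated[1:], row))
          for row in rows]

beta = fit(rows, prices)
print([round(b, 6) for b in beta])
```

With real tender prices the fitted coefficients would generally differ from the designated ones, which is precisely the question that the regression approach is intended to examine.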

There are techniques other than multiple regression analysis that could be used to
model the variables. Perhaps the closest alternative approach that serves

the same purpose is structural equation modelling. This takes into account the

modelling of interactions, nonlinearities, correlated independents, measurement

errors, correlated error terms, multiple latent independents, and one or more latent

dependents (independents and dependents are each measured by multiple indicators).

Compared with multiple regression, structural equation modelling includes more

flexible assumptions (particularly in allowing interpretation even in the face of

multicollinearity), uses confirmatory factor analysis to reduce measurement error by

having multiple indicators for each latent variable, provides a graphical modelling

interface and has the ability to test models with multiple dependents to model

mediating variables and error terms, and tests coefficients across multiple

between-subject groups (Garson 2004). Although structural equation modelling has
many advantages over the multiple regression method and is considerably more
powerful, multiple regression is more suitable for this research because the
assumptions of structural equation modelling, such as the multivariate normality of
the indicators, may be violated (Jaccard and Wan 1996 p. 80). More importantly, the
cross validation approach to the multiple regression method that is used in this
research provides a more direct means of measuring reliability for small samples.

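Table 5-1: Coefficients and Variables Designated in JSEM

Equation (5.6)

Coefficients ($\beta_i$)            Variables ($V_i$)
$\beta_0 = 0$
$\beta_1 = 2 - \frac{0.15}{2}$      $V_1 = a f_p R$
$\beta_2 = \frac{0.15}{2}$          $V_2 = a^2 f_p R$
$\beta_3 = 2 - \frac{0.15}{2}$      $V_3 = b f_t R$
$\beta_4 = \frac{0.15}{2}$          $V_4 = b^2 f_t R$
$\beta_5 = 0.15$                    $V_5 = a b f_t R$
$\beta_6 = 1$                       $V_6 = r R$
$\beta_7 = 1$                       $V_7 = n p_{pt} s_{pt} R$
$\beta_8 = 2$                       $V_8 = m f_b R$
$\beta_9 = 2.5$                     $V_9 = m p_b s_b R$

Equation (5.7)

Coefficients ($\beta_i$)            Variables ($V_i$)
$\beta_0 = 0$
$\beta_1 = 2 - \frac{0.15}{2}$      $V_1 = n f_{pt} R$
$\beta_2 = \frac{0.15}{2}$          $V_2 = n^2 f_{pt} R$
$\beta_3 = 1$                       $V_6 = r R$
$\beta_4 = 1$                       $V_7 = n p_{pt} s_{pt} R$
$\beta_5 = 2$                       $V_8 = m f_b R$
$\beta_6 = 2.5$                     $V_9 = m p_b s_b R$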

5.6 Data Preparation and Entry

Cost analyses that were prepared by forecasters are chosen as the data source, as
they contain all the information that is required for this study, such as the
tender price, floor area, roof area, building height and external wall area. The
cost analyses that are used in this research were provided by one of the two
dominant quantity surveying practices in Hong Kong (see Appendix A). Since quantity
surveying consultants in Hong Kong rarely focus their business on providing
services to projects of particular types or with particular characteristics,
project data obtainable from the dominant practices are considered to be
sufficiently representative of price behaviour.

5.6.1 Data sample

The data sample consists of the values of identified candidate variables and

tender prices from 148 completed projects in Hong Kong. The tenders for these

projects were received in the ten-year period between the third quarter of 1988 and

the second quarter of 1997.

Hong Kong is a former British colony. Both the structure of the

construction industry and the professional practices within the industry are very

similar to those in the UK. In 1929, the Royal Institution of Chartered Surveyors

(RICS) established a branch office in Hong Kong. Before the local surveying
institution, The Hong Kong Institute of Surveyors (HKIS), was founded in 1984, the
Hong Kong branch of the RICS was the only institution that recognised and provided
local support to surveyors. However, neither the HKIS nor the RICS has
attempted to formalise forecasting practice in Hong Kong.

Unlike forecasts that are produced in the UK, which are generally presented

in the format of the Building Cost Information Service (BCIS) (BCIS 1969), there is


no standardised definition or classification of building elements in Hong Kong.

Using data from a single source avoids the unnecessary complications that arise from

differences in the classification of building elements or the format and the

breakdown of building costs, which may differ across firms. Moreover, there is no

Building Cost Information Service (BCIS) type of organisation that provides online

cost advice services in Hong Kong, and a practice will not provide its own historical

project data to a competitor. Thus, it is almost impossible for the forecaster of one

company to get access to cost data from a third party such as the BCIS or another

practice. That the data are collected from a single source also ensures that the

models that are generated in this research are applicable to practical forecasting,

because the cross validation approach, as described in section 5.7.4, is very similar to

the manner in which forecasts are prepared in practice.

5.6.2 Definition and classification of building types

James’ study is based on a sample of 86 tenders in the categories of flats,

schools, industrial buildings and let houses in the 1950s in the UK. In accordance

with James’ study, all of the data from the 148 projects in this research are grouped

according to their building types.

The data are grouped for analysis into different building types according to

the Construction Index / Samarbetskommittén för Byggnadsfrågor (CI/SfB), which is

published by the Royal Institute of British Architects (RIBA) (Ray-Jones and

Clegg 1976). Five types of building were identified: (1) code no. 32 – office

facilities, offices; (2) code no. 442 – nursing homes, convalescent homes, sanatoria;

(3) code no. 712 – primary schools; (4) code no. 713 – secondary schools; (5) code

no. 816 – flats (apartments).

Because the number of available projects was small, and the provisions for

primary and secondary schools were very similar, these two sub-types were grouped

together. For ease of reference, the four groups are known as offices, nursing

homes, schools and private housing. Table 5-2 shows the distributions of the

building projects that were used for the development of price models, according to

their building types.

It should be noted that a few projects contain a mixture of more than one type

of building. For example, a 50-storey office tower project may have a few shops at

ground floor level. As all of the projects selected are dominated by one particular

type of building, the effect of the presence of another type or types of building is

considered to be insignificant.

Table 5-2: Classification of building projects according to building types

| CI/SfB code | Building type | Inclusions | Exclusions | No. of samples collected | No. of discarded cases | No. of samples used |
|---|---|---|---|---|---|---|
| 32 | Office | Offices, such as design offices, professional offices and executive offices, that are not associated with a particular facility | Official administrative facilities, law courts, commercial facilities, trading facilities, shops, protective service facilities, banks, shopping arcades, industrial and office buildings | 45 | 3 | 42 |
| 442 | Nursing home | Nursing homes, convalescent homes and sanatoria | Hospital facilities, hospitals, medical facilities and animal welfare facilities | 23 | - | 23 |
| 712 & 713 | School | Primary and secondary schools, including infants schools, secondary modern, secondary technical and community schools | Universities, colleges, nursery schools, kindergartens, scientific facilities, private schools, exhibition and display facilities, information facilities, libraries and other education facilities | 23 | - | 23 |
| 816 | Private housing | Multi-storey flats (apartments) | Low-rise housing, one-off housing units, houses, public housing, special housing facilities, hotels, hostels, historical residential facilities, quasi-private housing and service apartments | 57 | 7 | 50 |
| Total | | | | 148 | 10 | 138 |

5.6.3 Treating of outliers

Outliers (extreme cases) are especially troublesome when the goal is to select

from a set of forecasting models, but are less of a problem for model calibration

(Armstrong and Collopy 1992). The presence of outliers can seriously affect the

least-squares fitting of a regressed model. These outliers may possess different

characteristics from the rest of the data. Some regression diagnostics, such as the

jack-knife residual and leverage, assist in the identification of outliers. However,

pure reliance on the results of these statistical techniques (e.g., when they lie three or

more standard deviations from the mean of the residuals) for excluding extreme

cases without studying the plausibility of the exclusion may lead to a favourable

model being produced from biased data. Therefore, unless there is strong evidence

to indicate that a case is not a member of an intended sample, it should not be

discarded.

All of the inputs and outputs of the regression are evaluated according to

three criteria: reasonableness and given knowledge of the variable, response

extremeness, and predictor extremeness (Kleinbaum et al. 1998 p. 228). The

residuals from the regressed models were analysed, and three office and seven

private housing cases were discarded. All of the discarded office cases had

comparatively lower response values. Further investigation revealed that the three

office cases were for industrial and office (I-O) purposes (see footnote 1). Moreover, five of the

seven discarded private housing cases had comparatively lower response values, and

1 “An I-O Building is defined as a dual-purpose building in which every unit of the building, other than that in the purpose-designed non-industrial portion, can be used flexibly for both industrial and office purposes. In terms of building construction, the building must comply with all relevant building and fire regulations applicable to both industrial and office buildings, including floor loading, compartmentation, lighting, ventilation, provision of means of escape and sanitary fitments.” (Town Planning Board 2003)

the other two had higher response values. The five lower response cases were

discarded, as they were quasi-private housing developments (housing completed under

the Private Sector Participation Scheme (PSPS); see footnote 2), which were not solely developed

by private developers and thus were not part of the intended sample data. The two

higher response cases were found to be service apartment buildings, which are

generally better furnished than ordinary private housing, and were therefore also

discarded. To sum up, the differences in response values for the discarded cases

may be caused by the differences in the building provisions (industrial and office

buildings, service apartments and quasi-private housing), contractual arrangements

(quasi-private housing) and technology of fabrication (quasi-private housing).

Finally, 138 building projects in four categories were used for the modelling (see

Table 5-2).

5.7 Model Building

5.7.1 Dependent Variables

As reviewed in section 4.4 of Chapter 4, the lowest tender price is taken as

the forecast target. In accordance with James’ paper, the lowest tender prices

that are used in the modelling exclude the price of the foundations, building services,

external works, preliminaries and contingencies.

2 Under the Private Sector Participation Scheme (PSPS), private sector developers bid for the right to build according to a given design. The finished flats will be purchased by the Housing Authority of the Hong Kong Government at a pre-agreed price for onward sale to buyers who are selected by the Housing Authority.

When tender prices are used as the response for modelling, there is a risk of

producing poorly performing models in terms of their percentage errors, i.e. the ratio

of error (which is forecasted tender price minus the actual or lowest tender price) to

the actual tender price. It is also found that the magnitude of error that is produced

from forecasts of a wide range of tender prices (e.g., for offices, the tender prices

range from HK$24 million to $1,477 million) varies significantly. As the

performance of the forecasts is measured according to their percentage errors, the

minimisation of total squared errors in the least-squares method is not necessarily an

effective means of obtaining a good model unless tender prices in all of the cases in a

model are fairly close to each other. To reduce the influence of a wide tender price

range, the tender price per total floor area is adopted as the response. The tender

price per total floor area is a sensible alternative because forecasters usually present

building prices in unit prices, especially at the early budget stage, and the calculation

of forecasted prices from the unit price model is straightforward. The unit price

model can be directly compared with other conventional models despite their

responses being different, because performance is measured on the basis of

percentage errors.
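The relationship between the unit price response and the percentage error can be illustrated with a minimal sketch. The function names and all figures below are hypothetical and are not taken from the thesis data; the point is only that the percentage error is the same whether it is computed on the total price or on the price per total floor area, which is why the two kinds of model can be compared directly.

```python
def percentage_error(forecast: float, actual: float) -> float:
    """Forecast error as a percentage of the actual (lowest) tender price."""
    return 100.0 * (forecast - actual) / actual


def total_price(unit_price: float, total_floor_area: float) -> float:
    """Convert a forecast unit price back into a total building price."""
    return unit_price * total_floor_area


# Hypothetical project, not taken from the thesis data.
floor_area = 20_000.0          # total floor area
actual_total = 200_000_000.0   # lowest tender price, HK$
forecast_unit = 10_500.0       # forecast price per unit of floor area, HK$

forecast_total = total_price(forecast_unit, floor_area)
error_on_total = percentage_error(forecast_total, actual_total)
error_on_unit = percentage_error(forecast_unit, actual_total / floor_area)

print(f"forecast total price: HK${forecast_total:,.0f}")
print(f"percentage error on total price: {error_on_total:.2f}%")
print(f"percentage error on unit price:  {error_on_unit:.2f}%")
```

Because the floor area cancels out of the ratio, both percentage errors coincide for any single project.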

5.7.1.1 Price Index Adjustment

The tender prices were rebased to the prices of the second quarter of 1997 by

means of the tender price index that is published by the quantity surveying practice

that provided the data for this study. A copy of this tender price index is attached in

Appendix B.
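Rebasing with a tender price index is usually a simple ratio adjustment. The sketch below assumes that form; the function name and the index values are hypothetical placeholders and are not the values in Appendix B.

```python
def rebase_price(price: float, index_at_tender: float, index_at_base: float) -> float:
    """Rebase a tender price to the base period by the ratio of index values."""
    return price * index_at_base / index_at_tender


# Hypothetical values for illustration only.
tender_price = 80_000_000.0   # HK$, at the quarter in which the tender was received
index_at_tender = 800.0       # assumed index value in the tender quarter
index_1997_q2 = 1000.0        # assumed index value in the base quarter (1997 Q2)

rebased = rebase_price(tender_price, index_at_tender, index_1997_q2)
print(f"price rebased to 1997 Q2: HK${rebased:,.0f}")
```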

5.7.1.2 Other Adjustments

Apart from incorporating inflationary effects using the tender price index,

there may be some other characteristics that need to be adjusted using indices

(Kouskoulas and Koehn 1974; Pegg 1984). However, there is a lack of indices

other than the tender price index in Hong Kong. For instance, the location index,

while popular in many countries, is not in use at all. As the overall area of Hong

Kong is only slightly more than 400 square miles, projects that are undertaken

anywhere in Hong Kong are interpreted as being in the same geographical region

(Drew 1995). Other than projects that are located in remote areas such as outlying

islands and hillsides, etc., the location effect is not significant. No buildings located

in remote areas are included in the data pool.

The other possible adjustments by indices such as the quality and technology

of buildings are considered to be either irrelevant or inapplicable. First, there are no

quality and technology indices in use. Second, detailed specifications and method

statements for buildings are yet to be defined at the early design stage. Instead of

using indices for adjustment, only project data with similar characteristics, such as

project type, are grouped together for modelling.

5.7.2 Candidate variables

To identify the predictors for best subset models, the modelling process

started off with the variables that are used in JSEM. The actual measurements of

quantities (e.g. perimeter and storey height) for the variables in JSEM (e.g. elevation

area) were extracted to form the primary candidate variables for the regression

analysis. With reference to the variables in JSEM, a few candidate variables, such

as the number of storeys, the square of the number of storeys and their interaction

with storey height, were also added to form another set of candidate variables for

regression analysis. The unit rate ‘R’ was excluded, because the tender price is not

measured on a unit area basis in regressed models. Table 5-3 shows a full list of the

candidate variables for the regressed models for buildings with and without

basements.

Table 5-3: List of Candidate Variables

(a) All identified variables (without higher degree and interaction effects), separating podium and tower

Primary model: no. of storeys for podium (a); no. of storeys for tower (b); no. of storeys for basement (m); square of no. of storeys for podium (a²); square of no. of storeys for tower (b²); average floor area for podium (f_p); average floor area for tower (f_t); average floor area for basement (f_b); average storey height for podium (s_p); average storey height for tower (s_t); average storey height for basement (s_b); average perimeter for tower and podium (p_pt); average perimeter for basement (p_b); roof area (r)

JSEM model: a·f_p, a²·f_p, b·f_t, b²·f_t, a·b·f_t, m·f_b, (a·s_p + b·s_t)·p_pt, m·s_b·p_b, r

(b) Reduced version of all identified variables (without higher degree and interaction effects), combining podium and tower

Primary model: no. of storeys for superstructure (n); no. of storeys for basement (m); square of no. of storeys for superstructure (n²); average floor area for superstructure (f_pt); average floor area for basement (f_b); average storey height for superstructure (s_pt); average storey height for basement (s_b); average perimeter for tower and podium (p_pt); average perimeter for basement (p_b); roof area (r)

JSEM model: n·f_pt, n²·f_pt, m·f_b, n·s_pt·p_pt, m·s_b·p_b, r

All subsets model (with basement): n, m, n², f_pt, f_b, s_pt, s_b, p_pt, p_b, n·f_pt, n²·f_pt, m·f_b, n·s_pt, m·s_b, n²·s_pt, n·s_pt·p_pt, m·s_b·p_b, n²·s_pt·p_pt, r

All subsets model (without basement): n, n², f_pt, s_pt, p_pt, n·f_pt, n²·f_pt, n·s_pt, n²·s_pt, n·s_pt·p_pt, n²·s_pt·p_pt, r

5.7.3 Fitting Criterion

There are two approaches for the selection of predictors based on errors of

forecasts – parametric and non-parametric. For a linear model, the former approach

demands the satisfaction of some statistical assumptions, including the following

(Kleinbaum et al. 1998, pp. 43-46). (1) For any fixed value of the variable V, P is a

random variable with a certain probability distribution, e.g., a normal

distribution $(\mu_{P|V}, \sigma^{2}_{P|V})$, that has a finite mean and variance; (2) the values of P are

statistically independent of one another; (3) the mean value of P, $\mu_{P|V}$, is a straight

line function of V; (4) the variance of P is the same for any V, i.e. $\sigma^{2}_{P|V_a} = \sigma^{2}_{P|V_b}$ for any two values $V_a$ and $V_b$; and (5)

for any fixed value of V, P has a normal distribution. If assumptions (1) to (4) are

satisfied and assumption (5) is not badly violated, then the conclusions that are

reached by a regression analysis remain reliable and accurate. This approach allows

the use of multiple partial F statistics and p-values to select variables for the best

models. These parametric procedures are suitable for routine problems, but not for

the problems that are identified in this research. First, the sample sizes for the

various types of building are small, around 25 to 50. This would easily cause bias

in the estimation of the coefficient. Second, the use of parametric techniques such

as the least-squares method is known to be robust, even if the normality assumption

(the fifth assumption) is not fully satisfied. However, the parametric estimates of

the error rates may not be correspondingly robust (McLachlan 1987). Although

transformation can be applied to variables to fulfil the requirement of normality, it

may cause the violation of other assumptions.

Instead of relying on the multiple partial F statistics and p-values for the

selection of variables, a non-parametric approach that is based on the mean square

error (MSQ) is adopted. There are two main advantages of using MSQ rather than

the actual errors or absolute errors. The first is that positive differences do not

cancel negative differences, and the second is that the use of differentiation is not

difficult (Fausett 2002).
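The first advantage can be seen in a small numerical sketch. The error values below are hypothetical: signed errors of opposite sign cancel in a plain average, whereas squared errors do not.

```python
from typing import Sequence


def mean_error(errors: Sequence[float]) -> float:
    """Plain average of signed errors; positive and negative errors cancel."""
    return sum(errors) / len(errors)


def mean_absolute_error(errors: Sequence[float]) -> float:
    """Average of absolute errors; no cancellation, but not differentiable at zero."""
    return sum(abs(e) for e in errors) / len(errors)


def mean_square_error(errors: Sequence[float]) -> float:
    """Average of squared errors (MSQ); no cancellation and smooth everywhere."""
    return sum(e * e for e in errors) / len(errors)


# Hypothetical forecasting errors (forecast minus actual).
errors = [10.0, -10.0, 5.0, -5.0]

print("mean error:         ", mean_error(errors))
print("mean absolute error:", mean_absolute_error(errors))
print("mean square error:  ", mean_square_error(errors))
```

In this example the mean error is zero although every individual forecast is wrong, while the MSQ remains clearly positive.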

Previous regressed price models that have been developed by researchers use

either the least-squares approach or the minimum variance approach for the model

fitting. In a linear fitting, both approaches produce the same solution (Kleinbaum et

al. 1998, p. 118). According to the non-parametric approach that is adopted in this

study, the termination criterion is to minimise the MSQ, and therefore the

least-squares approach is preferred.

5.7.3.1 Matrix Notation for Calculation of MSQ

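Recall that Equation (5.8) can be presented in matrix notation. Let $\mathbf{P}$ be a

column vector containing n rows of observed values for the response, $\{P_1, P_2, \ldots, P_n\}^T$, and let $\mathbf{V}$ be an n × (k + 1) matrix of the observed values for a subset of

variables such that:

$$
\mathbf{V} =
\begin{bmatrix}
\mathbf{V}_1 \\ \mathbf{V}_2 \\ \vdots \\ \mathbf{V}_n
\end{bmatrix}
=
\begin{bmatrix}
1 & V_{1,1} & V_{1,2} & \cdots & V_{1,k} \\
1 & V_{2,1} & V_{2,2} & \cdots & V_{2,k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & V_{n,1} & V_{n,2} & \cdots & V_{n,k}
\end{bmatrix}.
\tag{5.9}
$$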

Corresponding to $P_i$ is $\mathbf{V}_i$, a row vector that contains the observed values for

the variables (a constant term and k predictors), i.e. $\{1, V_{i,1}, V_{i,2}, \ldots, V_{i,k}\}$, where i = 1, 2, … , n. In a regressed model, the price is

represented by:

$$
\mathbf{P} = \mathbf{V}\boldsymbol{\beta} + \mathbf{e},
\tag{5.10}
$$

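where $\boldsymbol{\beta}$ is a column vector of the coefficients $\{\beta_0, \beta_1, \beta_2, \ldots, \beta_k\}^T$ and $\mathbf{e}$

is a column vector of the forecasting errors $\{e_1, e_2, \ldots, e_n\}^T$. The mean square error

then becomes:

$$
\begin{aligned}
\mathrm{MSQ} &= \frac{1}{n}\sum_{i=1}^{n} e_i^{2} = \frac{1}{n}\,\mathbf{e}^T\mathbf{e} \\
&= \frac{1}{n}\,(\mathbf{P} - \mathbf{V}\boldsymbol{\beta})^T(\mathbf{P} - \mathbf{V}\boldsymbol{\beta}) \\
&= \frac{1}{n}\left(\mathbf{P}^T\mathbf{P} - \boldsymbol{\beta}^T\mathbf{V}^T\mathbf{P} - \mathbf{P}^T\mathbf{V}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{V}^T\mathbf{V}\boldsymbol{\beta}\right)
\end{aligned}
\tag{5.11}
$$

$\hat{\boldsymbol{\beta}}$ is the $\boldsymbol{\beta}$ that produces the minimum MSQ. To determine $\hat{\boldsymbol{\beta}}$, the

MSQ is differentiated with respect to $\boldsymbol{\beta}$, and the result is equated to zero, i.e.,

$$
\left.\frac{\partial\,\mathrm{MSQ}}{\partial \boldsymbol{\beta}}\right|_{\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}}
= \frac{1}{n}\left(-2\,\mathbf{V}^T\mathbf{P} + 2\,\mathbf{V}^T\mathbf{V}\hat{\boldsymbol{\beta}}\right) = \mathbf{0}.
\tag{5.12}
$$

This yields:

$$
\mathbf{V}^T\mathbf{V}\hat{\boldsymbol{\beta}} = \mathbf{V}^T\mathbf{P},
\qquad
\hat{\boldsymbol{\beta}} = \left(\mathbf{V}^T\mathbf{V}\right)^{-1}\mathbf{V}^T\mathbf{P}.
\tag{5.13}
$$

Therefore, the minimum MSQ is:

$$
\mathrm{MSQ}_{\min} = \frac{1}{n}\left(\mathbf{P}^T\mathbf{P} - \hat{\boldsymbol{\beta}}^T\mathbf{V}^T\mathbf{P} - \mathbf{P}^T\mathbf{V}\hat{\boldsymbol{\beta}} + \hat{\boldsymbol{\beta}}^T\mathbf{V}^T\mathbf{V}\hat{\boldsymbol{\beta}}\right).
\tag{5.14}
$$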
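The normal equations in Equation (5.13) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the software used in the thesis: the function names are hypothetical, the linear system is solved by plain Gaussian elimination, and the toy data are constructed to lie exactly on the line P = 2 + 3V.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def solve_linear_system(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("V^T V is singular (perfectly collinear predictors)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_least_squares(predictors: Matrix, prices: Vector) -> Vector:
    """Return beta_hat solving (V^T V) beta = V^T P, with a constant column in V."""
    v = [[1.0] + row for row in predictors]
    cols = len(v[0])
    vtv = [[sum(r[i] * r[j] for r in v) for j in range(cols)] for i in range(cols)]
    vtp = [sum(r[i] * p for r, p in zip(v, prices)) for i in range(cols)]
    return solve_linear_system(vtv, vtp)


def mean_square_error(predictors: Matrix, prices: Vector, beta: Vector) -> float:
    """MSQ = (1/n) * sum of squared residuals for the given coefficients."""
    residuals = [
        p - (beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
        for row, p in zip(predictors, prices)
    ]
    return sum(e * e for e in residuals) / len(prices)


# Toy data lying exactly on P = 2 + 3 V (hypothetical, for checking only).
predictors = [[1.0], [2.0], [3.0], [4.0]]
prices = [5.0, 8.0, 11.0, 14.0]

beta_hat = fit_least_squares(predictors, prices)
msq_min = mean_square_error(predictors, prices, beta_hat)
print("beta_hat:", beta_hat)
print("minimum MSQ:", msq_min)
```

Solving the linear system directly, rather than forming the inverse of V^T V explicitly, is a common numerical choice; it gives the same coefficients as Equation (5.13) whenever V^T V is invertible.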

5.7.4 Reliability analysis

The goodness of fit of a model that is built from historical data is not a reliable indicator

of its forecasting ability (Armstrong 1985). In classical statistical inference, a

model is validated using ex ante (out of sample) forecasts. However, the lack of

available data is always a limitation in the construction of price forecasting models.

Unquestionably, it is problematic to use the same data both to build up and to

validate a statistical model, i.e., to use ex post simulation prediction (within sample),

but the alternative of analysing data blindly simply to preserve the purity of classical

statistical inference, presents even worse problems.

In this research, a resampling method is adopted to select variables and

evaluate models. Three possible resampling methods were considered (Efron 1982):

cross validation, in which one case is omitted in turn from the model derivation and

the resulting coefficients are applied to that case; the jack-knife method, in which

one case is omitted in turn from the model derivation and the resulting coefficients

are applied to the other cases; and the bootstrap method, in which the coefficients are

used to generate simulated data from which a second set of coefficients is obtained.

For predictive applications, the cross validation method has the most intuitive appeal

because, with non-time-series data of this nature, each error value can be thought of as a real

error that may arise in the practice of forecasting (Skitmore 1992). In cross

validation, the accuracy of statistical inference is preserved by dividing at random a

sample of data into two sub-samples, an exploratory sub-sample, which is used to

select a statistical model for the data, and a validatory sub-sample, which is used for

formal statistical inference (Fox 1997). This is a compromise method that keeps the

integrity of the inference when the same data are used for the selection and validation

of statistical models, and is an approach to ex post forecasting, because test data are

within sample but are not used in model fitting. It is different from split sample

validation in that the split sample validation uses only a single sub-sample (the

validation set) to estimate the error. This distinction is particularly important,

because cross validation has been shown to be markedly superior for small data sets (Goutte

1997).

To simulate a practical situation, the ‘leave-one-out’ cross validation method

is the most suitable approach, and is adopted in this study. The steps of the

‘leave-one-out’ cross validation approach for the assessment of the reliability of a

model are shown in Figure 5-1. The accuracy of statistical inference in the

leave-one-out method is preserved by dividing a sample that contains n cases of data

into n exploratory sub-samples (each containing n - 1 cases that are obtained from

the original n-case sample by the omission of one case without repetition), each of

which is used to select a statistical model using the least-squares approach, and n

omitted cases, each of which is used to validate the selected model from an

exploratory sub-sample that does not contain the omitted case. An average MSQ is

deduced from n models for each subset of candidates. The average MSQs from

models of different subsets of candidates are compared, and the model with the

smallest average MSQ is taken to be the best subset model.
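The leave-one-out loop described above can be sketched as follows. This is an illustration under stated assumptions rather than the thesis implementation: for brevity it fits a single-predictor line in closed form, the function names and data are hypothetical, and each omitted case is scored by its squared forecast error, following the description in which the fitted coefficients are applied to the omitted case. The same loop applies unchanged to any subset of predictors fitted by least squares.

```python
from typing import List, Sequence, Tuple

Case = Tuple[float, float]  # (predictor value, unit price)


def fit_line(cases: Sequence[Case]) -> Tuple[float, float]:
    """Least-squares intercept and slope for a single predictor."""
    n = len(cases)
    mean_x = sum(x for x, _ in cases) / n
    mean_p = sum(p for _, p in cases) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in cases)
    sxp = sum((x - mean_x) * (p - mean_p) for x, p in cases)
    slope = sxp / sxx
    return mean_p - slope * mean_x, slope


def leave_one_out_msq(cases: Sequence[Case]) -> float:
    """Average squared forecast error over the n omitted cases."""
    squared_errors: List[float] = []
    for j in range(len(cases)):
        # Exploratory sub-sample: all cases except the j-th.
        training = [c for i, c in enumerate(cases) if i != j]
        intercept, slope = fit_line(training)
        # Validate on the omitted case only.
        x_j, p_j = cases[j]
        squared_errors.append((p_j - (intercept + slope * x_j)) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical three-case sample, small enough to check by hand.
cases = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
average_msq = leave_one_out_msq(cases)
print("leave-one-out average MSQ:", average_msq)
```

For this sample the three omitted-case errors are 2, 1 and 2, so the average of their squares is 3. Candidate subsets would then be ranked by this average, with the smallest value preferred.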

Cross validation appears to make no assumptions at all. For the purpose of

comparing models, each exploratory sub-sample produces a slightly different

best-fitting curve in the family, and there is a penalty for large, complex families of

curves because large families tend to produce greater variation in the curves that best

fit an exploratory sub-sample (Turney 1990a). This leads to an average fit that is

poorer than the fit of the curve that best fits the total data sample (Forster 2001 pp.

96-97). In cross validation, the selection criterion is designed implicitly, rather than

explicitly, as it gives the forecasting accuracy in terms of MSQ.

5.7.4.1 Matrix Notation for Calculation of MSQ by Leave-one-out Method

Referring to the least-squares method that is described in matrix notation

in section 5.7.3.1, let $\mathbf{P}^{(-j)}$ be a column vector that contains the n − 1 observed values

for the response, $\{P_1, P_2, \ldots, P_{j-1}, P_{j+1}, \ldots, P_n\}^T$, and let $\mathbf{V}^{(-j)}$ be an (n − 1) × (k + 1) matrix of the observed values for the subset of variables (with the omission of

one row of the observed values, representing the jth case, from the matrix of variables

$\mathbf{V}$, where j is any number from 1 to n):

$$
\mathbf{V}^{(-j)} =
\begin{bmatrix}
\mathbf{V}_1 \\ \mathbf{V}_2 \\ \vdots \\ \mathbf{V}_{j-1} \\ \mathbf{V}_{j+1} \\ \vdots \\ \mathbf{V}_n
\end{bmatrix}
=
\begin{bmatrix}
1 & V_{1,1} & V_{1,2} & \cdots & V_{1,k} \\
1 & V_{2,1} & V_{2,2} & \cdots & V_{2,k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & V_{(j-1),1} & V_{(j-1),2} & \cdots & V_{(j-1),k} \\
1 & V_{(j+1),1} & V_{(j+1),2} & \cdots & V_{(j+1),k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & V_{n,1} & V_{n,2} & \cdots & V_{n,k}
\end{bmatrix}.
\tag{5.15}
$$

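$\boldsymbol{\beta}^{(-j)}$ is a column vector of the coefficients $\{\beta_0, \beta_1, \beta_2, \ldots, \beta_k\}^T$ fitted without the jth case, and

$\mathbf{e}^{(-j)}$ is a column vector of the forecasting errors $\{e_1, e_2, \ldots, e_{j-1}, e_{j+1}, \ldots, e_n\}^T$

of the regressed model $\mathbf{P}^{(-j)} = \mathbf{V}^{(-j)}\boldsymbol{\beta}^{(-j)} + \mathbf{e}^{(-j)}$. Similar to the derivation that is

shown in Equations (5.11) to (5.14), the minimum MSQ of the regressed model that

does not contain the jth case becomes:

$$
\mathrm{MSQ}^{(-j)}_{\min} = \frac{1}{n}\left(
\mathbf{P}^{(-j)T}\mathbf{P}^{(-j)}
- \hat{\boldsymbol{\beta}}^{(-j)T}\mathbf{V}^{(-j)T}\mathbf{P}^{(-j)}
- \mathbf{P}^{(-j)T}\mathbf{V}^{(-j)}\hat{\boldsymbol{\beta}}^{(-j)}
+ \hat{\boldsymbol{\beta}}^{(-j)T}\mathbf{V}^{(-j)T}\mathbf{V}^{(-j)}\hat{\boldsymbol{\beta}}^{(-j)}
\right)
\tag{5.16}
$$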

The average of $\mathrm{MSQ}^{(-j)}_{\min}$, denoted $\overline{\mathrm{MSQ}}^{(-j)}_{\min}$, is deduced from n regressed models

(for j = 1, … , n) of the subset of variables in accordance with Equation (5.17),

$$
\overline{\mathrm{MSQ}}^{(-j)}_{\min} = \frac{1}{n}\sum_{j=1}^{n} \mathrm{MSQ}^{(-j)}_{\min}.
\tag{5.17}
$$

The values of $\overline{\mathrm{MSQ}}^{(-j)}_{\min}$ from different subsets of variables that are chosen by

the selection strategy that is described in the next section are compared. The subset

of variables that gives the smallest $\overline{\mathrm{MSQ}}^{(-j)}_{\min}$ is the best subset model.

5.7.5 Selection Strategies

The all-possible regressions procedure that fits all combinations of variables

is preferred over other variable selection procedures whenever practicable, because it is

the only procedure that guarantees the identification of the best subset model.

However, to find the best subset out of all of the subsets for the models with

basements that are listed in Table 5-3 using this procedure involves the fitting of every combination of 1 to

19 variables, i.e.,

$$
\sum_{i=1}^{19} \frac{19!}{i!\,(19-i)!} = 5.243 \times 10^{5}.
$$

If each fitting consumes four seconds of computing time, then a full analysis

of all of the subsets for one type of building using one fitting criterion would take

over 24 days of computing time. As four types of building are included in this

research and two sets of variables are suggested (refer to Table 5-3), the overall

computing time would be much longer than 24 days!
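The count of subsets and the computing time quoted above can be checked with a short calculation. The four-second figure is the one assumed in the text; the variable names are illustrative.

```python
from math import comb

n_candidates = 19        # candidate variables for models with basements (Table 5-3)
seconds_per_fit = 4      # assumed computing time per fitting, as stated in the text

# Number of non-empty subsets: sum of C(19, i) for i = 1..19, which equals 2**19 - 1.
n_subsets = sum(comb(n_candidates, i) for i in range(1, n_candidates + 1))

total_seconds = n_subsets * seconds_per_fit
total_days = total_seconds / (24 * 60 * 60)

print(f"number of subsets: {n_subsets:,}")
print(f"computing time: {total_days:.1f} days")
```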

There are a few common selection procedures for parametric problems, such

as forward selection, backward elimination and stepwise selection. Forward

selection begins with no variable in the regression equation. The variable that has

the highest correlation with the dependent (criterion) variable is entered into the

equation first. The remaining variables are then entered into the equation

depending on the contribution of each variable. Backward elimination begins with

all of the predictor variables in the regression equation, and sequentially removes

them. Stepwise selection is a combination of the forward selection and backward elimination

procedures.

These procedures can also be applied to non-parametric regression, and the

difference rests on the use of different termination criteria. To ensure the selection

of the best subset model, a dual stepwise procedure that consists of a combination of

the forward stepwise and backward stepwise procedures is adopted (Figure 5-2).

According to the algorithm for the forward stepwise procedure (on the left-hand side

of the figure), forward regression is first applied by entering one candidate variable

at a time. When no candidate that enters into the model can further reduce the

average MSQ, the forward regression ends. A subset of variables that produces the

minimal average MSQ is selected. Backward regression is then applied, and if the

number of variables that was selected in the forward regression is less than two, then

the stepwise procedure will be terminated, as all single predictor models have been

considered in the forward regression. Candidates in the subset that are selected by

the forward regression are eliminated one at a time until the average MSQ cannot be

further reduced by the elimination of a candidate. Forward regression starts again

and backward regression follows until the average MSQ cannot be further reduced,

and a minimum average MSQ is determined at the end of the forward stepwise

procedure. The backward stepwise procedure (on the right-hand side of the figure)

is the same as the forward stepwise procedure, except that it commences with all of

the candidates being contained in the model and starts off with a backward regression.

The best subset model that is deduced by the forward stepwise procedure is

compared with that deduced from the backward stepwise procedure. If they are the

same, then the selected subset model will either be very close to, or the same as, the

best model using the all-possible regression procedure.
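The dual stepwise idea described above can be sketched as follows. This is a simplified reading of the procedure, not a transcription of Figure 5-2: the function names are hypothetical, the score function stands in for the cross-validated average MSQ of a subset, and the toy score is invented so that the result can be checked by hand.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Score = Callable[[FrozenSet[str]], float]  # stands in for the average MSQ of a subset


def forward_pass(current: FrozenSet[str], candidates: FrozenSet[str], score: Score) -> FrozenSet[str]:
    """Enter one candidate at a time while the score keeps falling."""
    best = score(current) if current else float("inf")
    while True:
        trials = [current | {c} for c in candidates if c not in current]
        if not trials:
            return current
        challenger = min(trials, key=score)
        if score(challenger) < best:
            current, best = challenger, score(challenger)
        else:
            return current


def backward_pass(current: FrozenSet[str], score: Score) -> FrozenSet[str]:
    """Eliminate one variable at a time while the score keeps falling."""
    if not current:
        return current
    best = score(current)
    while len(current) > 1:
        challenger = min((current - {c} for c in current), key=score)
        if score(challenger) < best:
            current, best = challenger, score(challenger)
        else:
            return current
    return current


def stepwise(start: FrozenSet[str], candidates: FrozenSet[str], score: Score, forward_first: bool) -> FrozenSet[str]:
    """Alternate forward and backward passes until neither changes the subset."""
    current, go_forward, unchanged = start, forward_first, 0
    while unchanged < 2:
        nxt = forward_pass(current, candidates, score) if go_forward else backward_pass(current, score)
        unchanged = unchanged + 1 if nxt == current else 0
        current, go_forward = nxt, not go_forward
    return current


def dual_stepwise(candidates: Iterable[str], score: Score) -> Tuple[FrozenSet[str], FrozenSet[str], bool]:
    """Run both procedures and report whether they agree on the same subset."""
    pool = frozenset(candidates)
    fwd = stepwise(frozenset(), pool, score, forward_first=True)
    bwd = stepwise(pool, pool, score, forward_first=False)
    return fwd, bwd, fwd == bwd


def toy_score(subset: FrozenSet[str]) -> float:
    """Invented score: 'n' and 'fpt' help, every extra variable costs 0.5."""
    return 10.0 - 4.0 * ("n" in subset) - 3.0 * ("fpt" in subset) + 0.5 * len(subset)


forward_best, backward_best, same_model = dual_stepwise(["n", "fpt", "spt", "r"], toy_score)
print(sorted(forward_best), sorted(backward_best), same_model)
```

With this invented score both procedures settle on the subset containing n and fpt. When they disagree on real data, the text's remedy is the exclusion of offending variables rather than a change to the passes themselves.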

[Figure 5-2 is a flowchart that is not reproduced in this text version. It has two branches. The left branch, FORWARD STEPWISE REGRESSION, starts by generating all 1-variable models and alternates forward regression and backward regression, comparing the average MSQ at each step, until it yields the best subset model by the forward stepwise procedure. The right branch, BACKWARD STEPWISE REGRESSION, starts from the identification of a base model containing n variables and alternates backward regression and forward regression until it yields the best subset model by the backward stepwise procedure. The two results meet at the decision "Are they the same model?", which leads either to STOP or to the exclusion of an offending variable.]

Figure 5-2: Algorithm for Dual Stepwise Selection

5.8 Model Adjustment

5.8.1 Exclusion of candidates

The best subset models that are selected by the forward stepwise and

backward stepwise procedures are not necessarily the same. Divergence is easily

caused by multicollinearity, i.e., strong correlations amongst the predictors. One

typical strategy to avoid the presence of multicollinearity is to combine or remove

predictors that are strongly correlated to each other. This can be easily

implemented by the use of correlation tables. However, this strategy is not

appropriate for the modelling exercise in this research, because a lot of the selected

predictors are actually interaction terms, and are likely to be strongly correlated with

the primary variables (in Table 5-3). Moreover, as the future use of the best model

is for forecasting rather than understanding how predictors in the model have an

impact on the response, good models that suffer from multicollinearity still produce

accurate forecasts. Therefore, except for variables that are very highly correlated (>

0.95), predictors that have similar values to each other have not been deleted simply

because their correlation is high (say, > 0.7). If the cross-validated average MSQs

of the best models that are generated from the two procedures are different, then one

of them will always be better – the one with the smaller average MSQ. To prevent

a less significant candidate acting as an offending variable and entering into the

model before a more significant candidate (or a more significant candidate being

eliminated from the model before a less significant candidate), an algorithm to

exclude offending variables has been set up to deal with the possible divergence.

This involves four steps: (1) the exclusion of a candidate in turn before modelling by

regression, (2) the generation of models with forward stepwise and backward

stepwise procedures, (3) the selection of the model with the smaller average MSQ if

two different subsets of variables are chosen, and (4) the comparison of the smaller

average MSQ with that of a subset of variables that is selected from an all-subset

model that contains the excluded candidate. Step 1 is repeated (i.e., excluding the

second, third or more candidates before modelling) if the forward stepwise and

backward stepwise procedures for modelling do not agree on a model, or

the average MSQ of the best subset model is higher than that of the subset of

variables that is selected from an all-subset model that contains the excluded

candidate(s). The procedure for excluding candidates as described stops if the

forward and backward stepwise procedures produce the same model (subset of

predictors) with the smallest average MSQ.
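The correlation screen mentioned in this section, in which only very highly correlated predictors (above 0.95) are candidates for removal, can be sketched with a hand-written Pearson coefficient. The function names and data are hypothetical, and the use of the absolute value of the coefficient is an assumption of this sketch.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def very_highly_correlated(
    columns: Dict[str, Sequence[float]], threshold: float = 0.95
) -> List[Tuple[str, str, float]]:
    """List predictor pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical candidate values for four projects.
candidates = {
    "n": [10.0, 20.0, 30.0, 40.0],                 # no. of storeys
    "n_squared": [100.0, 400.0, 900.0, 1600.0],    # square of no. of storeys
    "r": [500.0, 300.0, 450.0, 350.0],             # roof area
}

flagged = very_highly_correlated(candidates)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b}: correlation {r:.3f}")
```

In this toy data only the pair n and n² is flagged, which illustrates why higher-degree and interaction terms tend to correlate strongly with the primary variables they are built from.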

The use of cross validation is a non-parametric approach to the determination

of the best subset of predictors, and therefore does not have to fulfil the assumptions

of homoscedasticity and normality of predictors that are required in parametric

regression. Because of this, the use of transformation strategies for variables in this

research is limited to the circumstances in which the original data suggested a model

that is non-linear in either the regression coefficients or the original variables, or the

linearisation of the regression coefficients.

A few studies have attempted to find the relationships between various

predictors and the price of building or the prices of the components of a building

(Wilderness Group 1964; Flanagan and Norman 1978; Russell and Choudhary 1980;

Tan 1999). However, a generalised relationship between any particular predictor


and the price of building or the prices of its components is absent, and on the

contrary, many studies have shown quite different relationships for the same subjects.

For example, the relationship between building price (represented by total price or price per total floor area) and building height (represented by the number of storeys or overall building height) has been expressed as a linear function (Tregenza 1972; Braby 1975), a parabolic function with a minimum (Flanagan and Norman 1978) and a power function (Karshenas 1984). Perhaps all that can be concluded is that each relationship holds true only for the data from which it was generated.

5.8.2 Transformation of variables

For a given set of predictors and a given response, there can be unlimited

combinations of transformed predictors and transformed responses. Certainly,

models with transformed variables are more complicated, more inexplicable, and

bear a higher risk of being too specific for the given data than their untransformed

counterparts. More importantly, complicated models often do a bad job of

forecasting new data, although they can be made to fit old data quite well. This is

experienced also by modellers in other disciplines (Sober 2001 p.30). In terms of

practicability, simplicity also aids understanding and implementation by decision

makers, reduces the likelihood of mistakes, and is less expensive (Armstrong 2001

pp. 374-375). In the light of the principle of parsimony³, as reviewed in Chapter 3,

this research avoids the development of models with complex mathematical

functions. Instead, each best subset model has been transformed to a power

3 “The concern for parsimony can lead to normative rules for discovery systems: that such systems should be designed, as far as possible, to generate simple rules before generating complex ones.” (Simon 2001 p.42-43)


function, because this has been demonstrated by Karshenas (1984) and Skitmore and

Patchell (1990) to improve accuracy. The power function model can be expressed

as follows:

$$P' = \beta'_0 \cdot \prod_{i=1}^{k} {V'_i}^{\beta'_i}, \qquad (5.18)$$

where $P'$ is the forecasted price, $\beta'_0$ and $\beta'_i$ are constant coefficients and $V'_i$ are the variables of the best subset model. Taking the natural logarithm (ln) of both sides (the ln transformation for the model), Equation (5.18) may be equivalently expressed as:

$$\ln P' = \ln \beta'_0 + \sum_{i=1}^{k} \beta'_i \cdot \ln V'_i. \qquad (5.19)$$

Equation (5.19) shows the transformation of the original variables to a linear

function of ln variables. The forecasting performance of the linear best subset model has to be compared with that of the model that is represented by Equation (5.18).

Referring to the principle of parsimony, the linear model prevails over the power

function counterpart unless the latter is shown to make significantly better forecasts.
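As a minimal sketch of the ln transformation, the single-predictor case (k = 1) of Equation (5.19) can be fitted by ordinary least squares on the logged data and then converted back to the form of Equation (5.18). The function name and the data in the usage note are hypothetical, and the restriction to one predictor is a simplification made only for illustration.

```python
import math
from typing import Sequence, Tuple


def fit_power_model(v: Sequence[float], p: Sequence[float]) -> Tuple[float, float]:
    """Fit P = b0 * V**b1 by least squares on ln P = ln b0 + b1 * ln V.

    Returns (b0, b1). All values of v and p must be positive.
    """
    x = [math.log(vi) for vi in v]
    y = [math.log(pi) for pi in p]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                         # slope in the ln-ln space
    b0 = math.exp(mean_y - b1 * mean_x)    # back-transform the intercept
    return b0, b1
```

For data generated exactly from P = 2·V^0.5, for example `fit_power_model([1, 4, 9, 16], [2, 4, 6, 8])`, the fit recovers b0 ≈ 2 and b1 ≈ 0.5.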

5.9 Comparison of Best Model with Other Models

To assess the forecasting accuracy of the best subset models for the four types

of building, their forecast results have been compared with those obtained from the

other three conventional models. The same set of data that was collected for


building the regressed price models is used to analyse the performance of all of the

models to facilitate a fair comparison.

With regard to the regressed models, the forecasted price per total floor area

for each case is multiplied by the total floor area to obtain the forecasted price to

calculate the forecasting error. Similar to the leave-one-out method, the reliability

of the three conventional models is also analysed using cross validation. The data

for each building type is split into two parts in turns without repetition. One part is

the exploratory sub-sample that contains all of the cases minus the one that is used to

calculate the average unit rate, and the other part contains the omitted case for the

assessment of the forecasting ability. The forecast for each turn is then calculated

by multiplying the average unit rate by the value of the predictor in the omitted case.

To measure the closeness of a forecast relative to the actual tender price, the

percentage error of the forecast is used, i.e.,

$$\frac{\text{Forecasted Tender Price} - \text{Actual Tender Price}}{\text{Actual Tender Price}} \times 100\% \qquad (5.20)$$

The mean and standard deviation of the percentage errors are used to represent the two widely established accuracy measures of bias and consistency. The further the mean is from zero, the more biased the model is, and the higher the standard deviation, the less consistent the model is. However, the magnitude of these two measures cannot

distinguish whether a model is better or worse than the others without significance

testing. The confidence level for all of the significance tests that are employed in

this research is 95%.
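The leave-one-out treatment of a conventional unit-rate model and the two accuracy measures can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and the average unit rate is assumed here to be the arithmetic mean of the individual price-to-quantity ratios of the retained cases, which may differ from the definition that is used in the worksheets.

```python
import statistics
from typing import List, Sequence, Tuple


def loo_unit_rate_forecasts(prices: Sequence[float],
                            quantities: Sequence[float]) -> List[float]:
    """Forecast each case from the average unit rate of all the other cases."""
    forecasts = []
    for i in range(len(prices)):
        rates = [p / q
                 for j, (p, q) in enumerate(zip(prices, quantities))
                 if j != i]                      # omit case i
        forecasts.append(statistics.mean(rates) * quantities[i])
    return forecasts


def percentage_errors(forecasts: Sequence[float],
                      actuals: Sequence[float]) -> List[float]:
    """Equation (5.20): (forecast - actual) / actual * 100."""
    return [100.0 * (f - a) / a for f, a in zip(forecasts, actuals)]


def bias_and_consistency(errors: Sequence[float]) -> Tuple[float, float]:
    """Mean (bias) and sample standard deviation (consistency) of the errors."""
    return statistics.mean(errors), statistics.stdev(errors)
```

With three made-up cases priced at 100, 220 and 330 for quantities of 10, 20 and 30, the first leave-one-out forecast is 11 × 10 = 110, which gives a percentage error of +10%.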


5.9.1 Choice of parametric and non-parametric inference

There are two approaches to statistical inference – parametric and

non-parametric. The former approach refers to modern statistical inference that is

based on the postulation of a parametric statistical model (Fisher 1922). The

parametric models are arguably simpler than the non-parametric models because they

are more informative, more amenable to statistical adequacy assessment, are often

more parsimonious and are more likely to give rise to reliable and precise empirical

evidence (Spanos 2001 p.186). Therefore, statistical adequacy can best be analysed

in a parametric setting. However, the common assumption of normality that lies

behind a parametric model may not always be fulfilled.

There are statistical tests that are available to check normality, such as the

Anderson-Darling (A-D) and Kolmogorov-Smirnov (K-S) tests. The K-S test

essentially looks at the most extreme absolute deviation, and determines the

probability that this deviation can be explained by a normally distributed data set,

whereas the A-D test is a modification of the K-S test that gives more weight to the

tails than the K-S test. The A-D test also differs from the K-S test in that it makes

use of specific distributions, such as a normal distribution, in the calculation of

critical values, and thus has the advantage of being more sensitive. The A-D test is

adopted for testing the assumption of normality in this research. The null hypothesis

for the test is that the forecasted percentage errors for a particular model follow a

normal distribution. The A-D test statistic is defined as:

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n} (2i-1)\left[\ln D(y_i) + \ln\bigl(1 - D(y_{n+1-i})\bigr)\right], \qquad (5.21)$$


where $D$ is the cumulative distribution function of the normal distribution, $n$ is the sample size and $y_i$ are the ordered data.
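Equation (5.21) can be evaluated directly, as in the following sketch. It assumes that the mean and standard deviation of the normal distribution are estimated from the sample itself; the function names are hypothetical, and the comparison of A² with critical values (which depends on that estimation) is left out.

```python
import math
import statistics
from typing import Sequence


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function D of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def anderson_darling(sample: Sequence[float]) -> float:
    """A-squared statistic of Equation (5.21) for a test of normality."""
    y = sorted(sample)                          # ordered data y_1 <= ... <= y_n
    n = len(y)
    mu, sigma = statistics.mean(y), statistics.stdev(y)
    d = [normal_cdf(v, mu, sigma) for v in y]
    total = sum(
        (2 * i - 1) * (math.log(d[i - 1]) + math.log(1.0 - d[n - i]))
        for i in range(1, n + 1)                # d[n - i] is D(y_{n+1-i})
    )
    return -n - total / n
```

A larger value of A² indicates a greater departure from normality, so a strongly skewed set of percentage errors yields a larger statistic than a roughly symmetric one.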

In a case in which the assumption of normality is proved to be invalid,

transformation using such techniques as the Box-Cox normality plot may help to

normalise a distribution. The Box-Cox transformation identifies a value of lambda

(λ) such that the suggested transformation of the original data is $Y_i^{\lambda}$ when λ ≠ 0 and $\ln(Y_i)$ when λ = 0.

To find the optimal lambda values, the Box-Cox transformation modifies the

original data using Equations (5.22) and (5.23) for Wi (a standardised transformed

variable). It then calculates the standard deviation of the variable Wi. The goal is

to find the value of lambda that minimises the standard deviation of Wi.

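$$W_i = \frac{Y_i^{\lambda} - 1}{\lambda\, G^{\lambda - 1}} \quad \text{when } \lambda \neq 0 \qquad (5.22)$$

$$W_i = G \ln(Y_i) \quad \text{when } \lambda = 0, \qquad (5.23)$$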

where Yi is the original data, G is the geometric mean of all the data and λ is

the lambda value.
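The search for lambda that is described above can be sketched as a simple grid search. The grid from -2 to 2 in steps of 0.1 and the function names are assumptions made for illustration, and the data must be strictly positive for the transformation to be defined.

```python
import math
import statistics
from typing import List, Optional, Sequence


def boxcox_standardised(y: Sequence[float], lam: float) -> List[float]:
    """Standardised transformed variable W_i of Equations (5.22) and (5.23)."""
    g = math.exp(statistics.mean([math.log(v) for v in y]))  # geometric mean G
    if abs(lam) < 1e-12:
        return [g * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * g ** (lam - 1.0)) for v in y]


def best_lambda(y: Sequence[float],
                grid: Optional[Sequence[float]] = None) -> float:
    """Return the grid value of lambda that minimises the standard deviation of W."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]   # assumed search range
    return min(grid, key=lambda lam: statistics.stdev(boxcox_standardised(y, lam)))
```

Because Equation (5.23) is the limit of Equation (5.22) as lambda tends to zero, the standard deviations are comparable across the whole grid, including the point λ = 0.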

If the transformation of the data fails to fulfil the normality assumption, then

the parametric way to proceed is to postulate another appropriate distribution.

Unfortunately, far fewer statistical tests are available for distributions other than the normal. Alternatively, the non-parametric model, which makes use


of less specific probabilistic assumptions, may be used for inference. The non-parametric model is distribution free in the sense that it relies only on indirect assumptions, such as whether the random variable is discrete or continuous, the nature of the support set of the distribution, the existence of certain moments and the smoothness of the distribution. Inference using a non-parametric model is based on rank, and is less

susceptible to the problem of statistical inadequacy. The benefits of non-parametric

inference include its significant gains in power and efficiency when the error

distribution has tails that are heavier than those of a normal distribution, and superior

robustness in general (Hettmansperger and McKean 1998 p. xiii).

5.9.2 Statistical inference for bias

To ascertain the significance of bias, the models are tested against a zero mean using the t statistic. The t-test is well known for its robustness, even if the

distribution of data departs from normality (Lehmann 1959). The null hypothesis

for the t-test is that the mean percentage error for a model is equal to zero, which

represents an unbiased model. Let $\mu_d$ be the mean percentage error, $\sigma_d$ be the standard deviation of the percentage error, and $n_d$ the total number of cases for one of the models that is represented by the notation d. The p-value that is calculated from the t statistic in Equation (5.24) shows whether the mean percentage error of a model differs significantly from zero.

$$t = \frac{\mu_d}{\sigma_d / \sqrt{n_d}}. \qquad (5.24)$$
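The t statistic for bias can be computed as in the following sketch. It assumes that the standard deviation is the sample standard deviation (n - 1 in the denominator); the function name is hypothetical, and the conversion of t into a p-value, which needs the t distribution with n - 1 degrees of freedom, is left to a statistical package.

```python
import math
import statistics
from typing import Sequence


def bias_t_statistic(errors: Sequence[float]) -> float:
    """t statistic of Equation (5.24) for H0: mean percentage error = 0."""
    n = len(errors)
    mean = statistics.mean(errors)
    sd = statistics.stdev(errors)        # sample standard deviation
    return mean / (sd / math.sqrt(n))
```

For the made-up errors 1, 2, 3, 4 and 5, the mean is 3 and the sample variance is 2.5, so t = 3 / √(2.5/5) ≈ 4.24.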


As all forecasts have been produced by cross-validated models that are

represented by the same set of selected predictors and their different coefficients in

the regressed models (or the different average unit rates in the conventional models

for different turns), the mean percentage errors for the models are likely to be close

to zero.

5.9.3 Statistical inference for consistency

As models are expected to be more or less unbiased, the consistency of the

models becomes an important indicator to distinguish the model or models that

perform better than others. Although the t-test for bias is robust even for departures

from normality, the parametric inference tests for consistency (the standard deviation

of percentage errors) are not.

Figure 5-3 shows an algorithm for the selection of parametric and

non-parametric tests. To avoid using the parametric tests naively, the assumption of

the normality of the data (forecasted prices) has been tested. As the parametric

inference is more amenable in terms of statistical adequacy, it is preferable that the assumption of normality be fulfilled, by means of transformation if necessary. The

details concerning the checking of the normality assumption and the use of the

Box-Cox transformation are described in section 5.9.1. Alternatively,

non-parametric inference is employed if the assumption is not satisfied.

After deciding on the type of inference, the forecasting models are first tested in groups for homogeneity of variances. This involves the use of Bartlett’s test for parametric inference and the Kruskal-Wallis test for non-parametric inference.


Figure 5-3: Algorithm for Comparisons of Variances of Percentage Errors

[Flowchart. Determine forecasted percentage errors for models under comparison; conduct Anderson-Darling test for normality of distributions. Decision: is distribution of percentage errors for each model normal? If no, conduct Box-Cox transformations of percentage errors and Anderson-Darling test. Decision: is distribution of transformed errors for each model normal? Parametric tests (normality fulfilled): conduct Bartlett’s test for equality of variances; decision: are models of same variance? If yes, all models are comparable in consistency; if no, conduct multiple F-tests using LSD approach. Non-parametric tests (normality not fulfilled): conduct Kruskal-Wallis test for equality of rank deviations; decision: are models of same variance? If yes, all models are comparable in consistency; if no, conduct multiple Mann-Whitney U tests using LSD approach. Final step: models of about same potency in consistency are grouped together.]


The Bartlett’s test is used to study the significance of the differences between

the variance of percentage errors for the models under comparison. The null

hypothesis for the test is that the variance of percentage error for the models in

comparison is equal. Let M be the number of models for comparison, and the

Bartlett’s test statistic (B) be represented by Equation (5.25) as follows:

$$B = \frac{\left(\sum_{d=1}^{M}(n_d - 1)\right)\ln\left(\dfrac{\sum_{d=1}^{M}(n_d - 1)\,\sigma_d^2}{\sum_{d=1}^{M}(n_d - 1)}\right) - \sum_{d=1}^{M}(n_d - 1)\ln \sigma_d^2}{1 + \dfrac{1}{3(M - 1)}\left(\sum_{d=1}^{M}\dfrac{1}{n_d - 1} - \dfrac{1}{\sum_{d=1}^{M}(n_d - 1)}\right)}. \qquad (5.25)$$

With reference to a chi-square (χ²) distribution, the B value corresponds to a p-value, which suggests whether the models in comparison are of equal variance.
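Bartlett’s statistic can be computed from the percentage errors of each model as in the following sketch. The function name is hypothetical, the variances are assumed to be sample variances, and the lookup of the p-value from the chi-square distribution with M - 1 degrees of freedom is left to a statistical package.

```python
import math
import statistics
from typing import Sequence


def bartlett_statistic(samples: Sequence[Sequence[float]]) -> float:
    """Bartlett's B for M samples of percentage errors (Equation 5.25)."""
    m = len(samples)
    dof = [len(s) - 1 for s in samples]                  # n_d - 1
    var = [statistics.variance(s) for s in samples]      # sigma_d squared
    total_dof = sum(dof)
    pooled = sum(d * v for d, v in zip(dof, var)) / total_dof
    numerator = total_dof * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dof, var)
    )
    correction = 1.0 + (sum(1.0 / d for d in dof) - 1.0 / total_dof) / (3.0 * (m - 1))
    return numerator / correction
```

When all sample variances are equal the numerator vanishes and B = 0; for two made-up samples of three cases with variances of 1 and 100, B ≈ 5.18.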

The Kruskal-Wallis test (H-test) is a non-parametric equivalent to a one-way ANOVA that tests whether several independent samples have a common mean. The central tendencies or medians are the main concern in the H-test. Based on the assumption

that the values for each sample under consideration have underlying continuous

distributions, the null hypothesis is that k samples from possibly different

populations actually originate from similar populations. By replacing percentage

errors with absolute deviations from the sample mean as the sample values for

ranking, the H-test assesses for the homogeneity of population variance (Sprent 1993

pp. 155-157). Let Rj be the sum of ranks of the jth sample, nj be the size of the jth

sample, and N be the size of the combined sample. The H-test statistic is:


$$H = \frac{12}{N(N+1)}\left[\sum_{j=1}^{k}\frac{R_j^2}{n_j}\right] - 3(N+1). \qquad (5.26)$$

With reference to a chi-square (χ²) distribution, the H value corresponds to a p-value, which suggests whether the models in comparison are of equal variances.
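The H statistic and its use on absolute deviations can be sketched as follows. The function names are hypothetical, ties receive the average rank, and no tie correction is applied because Equation (5.26) does not include one.

```python
import statistics
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis_h(samples: Sequence[Sequence[float]]) -> float:
    """H statistic of Equation (5.26)."""
    pooled = [v for s in samples for v in s]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for s in samples:
        r_j = sum(ranks[start:start + len(s)])   # rank sum of sample j
        total += r_j ** 2 / len(s)
        start += len(s)
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)


def h_for_variance(error_samples: Sequence[Sequence[float]]) -> float:
    """Apply the H-test to absolute deviations from each sample mean."""
    deviations = [[abs(e - statistics.mean(s)) for e in s] for s in error_samples]
    return kruskal_wallis_h(deviations)
```

For two made-up samples whose ranks are 1, 2, 3 and 4, 5, 6, the rank sums are 6 and 15, so H = (12/42)(36/3 + 225/3) - 21 ≈ 3.86.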

If the p-value from the Bartlett’s test or Kruskal-Wallis test statistics is

smaller than 0.05 and the null hypothesis is not supported, then the consistencies of

the models in comparison are not equal. The next step is to determine which of the

models differ specifically from each other. To do this, the variances of the percentage errors of the models are compared pairwise using F-tests or Mann-Whitney U rank sum tests.

Following the Bartlett’s test that shows the significant difference of variances

amongst the models, the F-test is used to test the null hypothesis of whether the

variances or standard deviations of the forecasted percentage errors for two models

are equal. The F-test statistic is:

$$F = \frac{s_1^2}{s_2^2}, \qquad (5.27)$$

where $s_1^2$ and $s_2^2$ are the sample variances. The more this ratio deviates

from 1, the stronger the evidence for unequal population variances. With reference

to the F distribution, a corresponding p-value can be found that suggests whether the

two models in comparison are of equal variance.


If the H-test shows a significant difference of variances amongst the models, then it is followed by the Mann-Whitney U test (U-test), which uses the rank sums of the two samples to examine the null hypothesis that the absolute deviations from the sample means of the two samples are equal. The observations from both samples

are combined and ranked, with the average rank assigned in the case of a tie. If the

percentage error deviations for the two samples in comparison are identical, then the

ranks should be randomly mixed between the two samples. Two rank sums, Ta and

Tb, are calculated. For sample sizes that are larger than 20, the U statistic is referred to a standard normal (Z) distribution, as is shown in Equation (5.28):

$$Z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}, \qquad (5.28)$$

where U is the smaller of Ua and Ub in Equations (5.29) and (5.30) as

follows:

$$U_a = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - T_a \qquad (5.29)$$

$$U_b = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - T_b. \qquad (5.30)$$

With reference to the Z distribution, a corresponding p-value can be found

that suggests whether the two models in comparison are of equal variance.
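Equations (5.28) to (5.30) can be traced with the following sketch. The function names are hypothetical; in the procedure that is described above, the two inputs would be the absolute deviations of the percentage errors from their sample means, and the normal approximation is intended for samples that are larger than 20 (the small usage example only checks the arithmetic).

```python
import math
from typing import Sequence, Tuple


def rank_sums(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Rank sums T_a and T_b of the pooled data, with average ranks for ties."""
    pooled = sorted(list(a) + list(b))

    def rank(v: float) -> float:
        first = pooled.index(v) + 1          # rank of the first tied value
        count = pooled.count(v)              # number of tied values
        return first + (count - 1) / 2

    return sum(rank(v) for v in a), sum(rank(v) for v in b)


def mann_whitney_z(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (U, Z) following Equations (5.28) to (5.30)."""
    n1, n2 = len(a), len(b)
    t_a, t_b = rank_sums(a, b)
    u_a = n1 * n2 + n1 * (n1 + 1) / 2 - t_a
    u_b = n1 * n2 + n2 * (n2 + 1) / 2 - t_b
    u = min(u_a, u_b)
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, z
```

For the made-up samples (1, 2, 3) and (4, 5, 6), the rank sums are 6 and 15, which give U_a = 9, U_b = 0, U = 0 and Z = -4.5/√5.25 ≈ -1.96.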

Unfortunately, performing several F-tests or Mann-Whitney U rank sum tests

has a serious drawback. The more null hypotheses there are to be tested, the more

likely it is that one of them will be rejected even if all of the null hypotheses are


actually true (Kleinbaum et al. 1998 pp. 443-447). In other words, if each test has a

5% probability of erroneously rejecting the null hypothesis (H0), then the probability

of incorrectly rejecting at least one H0 is much larger than 5%, and continues to

increase with each additional test that is carried out.

Fisher’s least significant difference (LSD) approach is used to correct the exaggerated significance levels. If k two-sample tests are carried out, each at a significance level of 0.05, then the maximum possible value for the overall significance level is 0.05k. The remedy that is adopted with the LSD approach is to decrease the significance level for each test to 0.05/k. In this research, six ($^4C_2$) two-sample tests are produced for each type of building (i.e., k = 6), and therefore the corrected significance level for each pairwise test is 0.05/6 ≈ 0.0083.
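The arithmetic behind the corrected level can be set out as a short worked example. The figures that assume independence between the six tests are given for illustration only, as the pairwise tests share samples and are not strictly independent; dividing the significance level by the number of tests is the adjustment that is commonly referred to as a Bonferroni correction.

$$k = {}^{4}C_{2} = \frac{4!}{2!\,2!} = 6, \qquad \alpha_{\text{each}} = \frac{0.05}{6} \approx 0.0083$$

$$\text{Uncorrected: upper bound } k\alpha = 6 \times 0.05 = 0.30; \quad \text{if independent, } 1 - (1 - 0.05)^{6} \approx 0.265$$

$$\text{Corrected: upper bound } 6 \times \frac{0.05}{6} = 0.05; \quad \text{if independent, } 1 - \left(1 - \frac{0.05}{6}\right)^{6} \approx 0.049$$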

5.10 Tools for Computation

Both spreadsheets (e.g. Excel) and statistical software packages (e.g. SPSS)

provide built-in regression functions. Users can simply use these functions by

inputting the observed values for dependent and independent variables, and a

regression model by least-squares method (or other methods), together with other

relevant information to describe the model, will automatically be generated in report

format. However, these built-in functions do not feature a resampling procedure,

which means that they are unable to satisfy the needs of this study. To accomplish

this requirement and follow the various algorithms that are described in sections 5.7

to 5.9 of this chapter requires a purpose-made programme. Therefore, this research uses the programming language of the mathematical software Mathcad to write a programme for handling the selection procedures and reliability analysis. Mathcad


is also used as a calculation tool in this study. It possesses advantages over other programming languages in its use of direct equation input and its approach to the solution of mathematical problems symbolically or numerically, which means that programmes that are written in Mathcad are readable even for someone who has no background in programming. To illustrate the use of the worksheets that were written in Mathcad, an example for the RASEM for offices is attached in Appendix D.

In addition, the functions of significance tests, such as the t-test, K-W test and

U-test, are available in the spreadsheets and statistical software packages that are

used.

5.11 Summary

This chapter describes an approach to further develop JSEM. JSEM is first

simplified to avoid an escalation in the number of variables that are induced by

increasing the number of storeys of a building. The simplification procedure

successfully reduces the number of variables for JSEM from a function of the

number of storeys to 9 for buildings with a podium and 6 for buildings without a

podium.

The cost analyses of 148 completed projects in Hong Kong for four types of

building – offices, private housing, nursing homes and primary and secondary

schools – were collected. Ten out of the 148 samples were considered as outliers

due to their differences in building provision, contractual arrangement and

technology of fabrication, and were discarded from further analysis. The building


prices per total floor area, which were extracted from the analyses and rebased in

accordance with the tender price index, are set as the observed values of the response

for modelling. With reference to the actual measurements of quantities (e.g.

perimeter and storey height of buildings) for the variables in JSEM (e.g. elevation

area), another two sets of variables are identified, one containing 19 variables for buildings with basements and the other containing 12 variables for buildings without basements.

A non-parametric approach using the least average MSQ as the termination

criterion is proposed to prevent the violation of parametric assumptions that are

likely to be caused by small sample sizes. The leave-one-out cross validation

method, which on the one hand determines the models by an explanatory sub-sample,

and on the other hand checks the forecasting ability of the models by an omitted case,

is considered to be the most intuitive method that simulates the practice of

forecasting. To improve the probability of identifying the best subset models, a

dual stepwise procedure, together with an algorithm to eliminate the possible

offending variables, is suggested. The transformation of variables may further

improve forecasting performance, and in this research the natural logarithmic

transformation method is selected for the variables that are chosen in the best

regressed models. The principle of parsimony is particularly addressed in the

selection of models, and a more complicated model has to demonstrate its benefits in

terms of forecasting accuracy to be chosen over a simpler model.

The performance of a forecast is measured by the percentage error of

departure from the actual price. To assess the performance of a model in terms of

forecasting accuracy, bias and consistency are adopted.


Statistical inference can be classified as parametric or non-parametric. The

former approach is more powerful and is used if its assumptions can be satisfied. If

they cannot, then the percentage errors are transformed to see if the transformed

distribution can fulfil the assumptions. If the assumptions are still not fulfilled, then

the non-parametric approach is used.

As the regressed models and the unit rates in the conventional models are

developed by cross validation, it is expected that the forecasts from these models will

have a close to zero bias. Because of this, each model is first tested against a zero

bias using the t-test. The t-test is parametric, but is known to be robust for

departures from normality. However, parametric tests for consistency are not

robust, and an algorithm is developed to assist in the selection of an appropriate

approach and significance tests within that approach.

Two stages are involved to distinguish the models using measures of

consistency. First, the homogeneity of variance of all of the models is tested using

k-sample tests, such as the Bartlett’s test under the parametric approach and the

Kruskal-Wallis test under the non-parametric approach. If models are found to be

significantly different, then these tests are followed by multiple two-sample tests,

such as F-tests under the parametric approach and Mann-Whitney U-tests under the

non-parametric approach. Because of the exaggerated significance levels due to the multiple comparisons, Fisher’s least significant difference (LSD) approach is used for rectification. With the assistance of the LSD, models of the same potency

in consistency are grouped together.

The benefits of advances in computer software are harnessed to assist in this

research, and a combination of different software is used. The mathematical


software Mathcad is used to execute the purpose-made algorithm of regression

analysis using cross validation, and commonly used spreadsheets and statistical

packages that offer a variety of built-in functions for significance tests are also

adopted to produce statistical inferences.


Chapter 6 Analysis

Think as you work, for in the final analysis, your worth to your company comes not only in solving problems, but also in anticipating them.

Harold Wallace Ross

6.1 Introduction

This chapter is divided into three sections. The first section concerns the

development of the regressed models based on the data that was collected from Hong

Kong projects. The details of the eight regressed models that are generated from

two sets of variables for the four types of buildings and the corresponding

logarithmic transformed models are explained. The variables that are selected in

each regressed model are different.

The bias and consistency of percentage errors of the forecasts from the

regressed models that were developed in the first section, and those of the

conventional methods, are measured in the second section. Each regressed model is

compared individually with the conventional models. On average, the regressed

models, especially the Regressed Model for Advanced Storey Enclosure Method

(RASEM), produce more accurate forecasts and all fall into the best clusters of


models in the eight groups of models under comparison. However, there is

insufficient evidence to conclude their superiority over their conventional

counterparts.

A practical approach to combining forecasts is proposed in the third section

to improve prediction accuracy. The combined forecast is always more accurate

than the average forecast, and is sometimes better than the best forecast.

6.2 Model Development

6.2.1 Data Collected

The data collected include the number of podium storeys (a), the number of

tower storeys (b), the number of basement storeys (m), the average area per podium

storey in m² (fp), the average area per tower storey in m² (ft), the average area per

basement storey in m² (fb), the average podium storey height in m (sp), the average

tower storey height in m (st), the average basement storey height in m (sb), the

average perimeter on plan for the superstructure in m (ppt), the average perimeter on

plan for the basement in m (pb), the roof area in m² (r), the original tender price in

Hong Kong dollars (tp), the date the tender was returned and the tender price index

(TPI). Appendix C (enclosing Table C-1 to Table C-4) is attached to display these

data according to the building type in a tabular format. The original tender prices

were rebased to the base period of the second quarter of 1997 in accordance with the

tender price index in Appendix B. The rebased prices are also shown in Appendix

C.


6.2.2 Candidates for Regression Models

The regression methodology that is described in Chapter 5 is used to advance

the original JSEM. A new model – the Regressed Model for James’ Storey

Enclosure Method (RJSEM) – is developed by using the variables that were

identified in JSEM for each type of building. The same methodology is also applied to a second new model, the Regressed Model for Advanced Storey Enclosure Method (RASEM), which uses another set of variables. The RASEM contains four types of candidates: the primary variables (n, m, fpt, fb, spt, sb, ppt, pb, r), the second degree variable (n2), the interaction terms that are formed amongst the primary variables (nfpt, mfb, nspt, msb, nsptppt, msbpb) and the interaction terms that are formed between the primary variables and the second degree variable (n2fpt, n2spt, n2sptppt). Table 6-1 shows the candidate variables, the response and the

corresponding equations for the RJSEM and the RASEM.

6.2.3 Response for Regression Models

A regressed model that produces a small average MSQ may not produce a

corresponding small mean or standard deviation of percentage errors (for which the

mean represents bias and the standard deviation represents the consistency of the

model), because larger response values have more influential effects in the

least-squares method, whereas the use of percentage errors for performance

assessment is unit free. These large-value influential effects can be reduced

tremendously by changing the response from the tender price to the tender price per

total floor area, as described in section 5.7.1 of Chapter 5. By adopting this change,


the ranges of actual response values that are represented by the ratios of the

maximum actual response value to the minimum are reduced from 60.74 to 2.33 for

offices, from 113.16 to 2.17 for private housing, from 6.87 to 2.09 for nursing homes

and from 7.93 to 2.23 for schools.


Table 6-1: Candidates, Responses and their Equations for the RJSEM and the RASEM

RJSEM

| Variable | Equation | Notation |
|---|---|---|
| Candidates | | |
| Total floor area for podium | a · fp | afp |
| Storey number for podium · Total floor area for podium | a² · fp | a2fp |
| Total floor area for tower | b · ft | bft |
| Storey number for tower · Total floor area for tower | b² · ft | b2ft |
| Storey number for podium · Total floor area for tower | a · b · ft | abft |
| Total floor area for basement | m · fb | mfb |
| Elevation area | (a · sp + b · st) · ppt | nsptppt |
| Basement wall area | m · sb · pb | msbpb |
| Roof area | r | r |
| Response | | |
| Adjusted tender price per total floor area | P ÷ (a · fp + b · ft + m · fb) | Y |

RASEM

| Variable | Equation | Notation |
|---|---|---|
| Candidates | | |
| Storey number for superstructure | a + b | n |
| Storey number for basement | m | m |
| Square of storey number for superstructure | (a + b)² | n2 |
| Average area per storey for superstructure | (a · fp + b · ft) ÷ (a + b) | fpt |
| Average area per storey for basement | fb | fb |
| Average storey height of superstructure | (a · sp + b · st) ÷ (a + b) | spt |
| Average storey height of basement | sb | sb |
| Average perimeter on plan for superstructure | ppt | ppt |
| Average perimeter on plan for basement | pb | pb |
| Total floor area for superstructure | (a · fp + b · ft) | nfpt |
| Storey number for superstructure · Total floor area for superstructure | (a + b) · (a · fp + b · ft) | n2fpt |
| Total floor area for basement | m · fb | mfb |
| Height of building above ground | (a · sp + b · st) | nspt |
| Depth of basement | m · sb | msb |
| Storey number for superstructure · Height of building above ground | (a + b) · (a · sp + b · st) | n2spt |
| Elevation area | (a · sp + b · st) · ppt | nsptppt |
| Basement wall area | m · sb · pb | msbpb |
| Storey number for superstructure · Elevation area | (a + b) · (a · sp + b · st) · ppt | n2sptppt |
| Roof area | r | r |
| Response | | |
| Adjusted tender price per total floor area | P ÷ (a · fp + b · ft + m · fb) | Y |
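The RASEM equations in Table 6-1 translate directly into code, as in the following sketch. The `Building` class, the function names and the figures in the usage note are hypothetical and serve only to show how the 19 candidates and the response are derived from the collected measurements.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Building:
    a: int          # number of podium storeys
    b: int          # number of tower storeys
    m: int          # number of basement storeys
    fp: float       # average area per podium storey (m2)
    ft: float       # average area per tower storey (m2)
    fb: float       # average area per basement storey (m2)
    sp: float       # average podium storey height (m)
    st: float       # average tower storey height (m)
    sb: float       # average basement storey height (m)
    ppt: float      # average perimeter on plan, superstructure (m)
    pb: float       # average perimeter on plan, basement (m)
    r: float        # roof area (m2)
    price: float    # adjusted (rebased) tender price P


def rasem_candidates(x: Building) -> Dict[str, float]:
    """The 19 RASEM candidate variables of Table 6-1."""
    n = x.a + x.b
    floor_super = x.a * x.fp + x.b * x.ft     # total superstructure floor area
    height = x.a * x.sp + x.b * x.st          # height of building above ground
    return {
        "n": n, "m": x.m, "n2": n ** 2,
        "fpt": floor_super / n, "fb": x.fb,
        "spt": height / n, "sb": x.sb, "ppt": x.ppt, "pb": x.pb,
        "nfpt": floor_super, "n2fpt": n * floor_super,
        "mfb": x.m * x.fb, "nspt": height, "msb": x.m * x.sb,
        "n2spt": n * height, "nsptppt": height * x.ppt,
        "msbpb": x.m * x.sb * x.pb, "n2sptppt": n * height * x.ppt,
        "r": x.r,
    }


def response(x: Building) -> float:
    """Adjusted tender price per total floor area, Y."""
    return x.price / (x.a * x.fp + x.b * x.ft + x.m * x.fb)
```

For a made-up building with a = 2, b = 10, m = 1, fp = 1000, ft = 500, fb = 1200, sp = 5, st = 3 and ppt = 100, the sketch gives n = 12, nfpt = 7000, nspt = 40 and nsptppt = 4000.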


6.2.4 Selection of Predictors

The selection of best models (the best subsets of the predictors) concerns the

minimisation of the average MSQ by leave-one-out cross validation. The dual

stepwise procedure that is described in section 5.7.5 of Chapter 5 is applied to the

two sets of candidates and responses (one for the RJSEM and the other for the

RASEM), as is shown in Table 6-1. Except for the RJSEMs for nursing homes and

schools, for which agreeable subsets of predictors were produced, two different

subsets of predictors were selected from the values of these candidates and responses

using the forward stepwise and backward stepwise procedures separately. As is

explained in section 5.8.1 of Chapter 5, this discrepancy may possibly be due to a

less significant predictor that acts as an offending variable and enters the model before a more significant predictor, or to a more significant predictor that is eliminated from the model before a less significant predictor.

To avoid this circumstance, candidates in the RJSEMs or RASEMs are excluded

repetitively using the algorithm that is shown in Figure 5.3 of Chapter 5. According

to this algorithm, the selection process ceases when both forward stepwise and

backward stepwise procedures produce the same best subset of variables. Several

candidates in the RJSEMs and RASEMs for the four types of building were excluded.

Table 6-2 shows the included candidates, excluded candidates and selected predictors

in these models. Amongst the candidates in the RJSEMs, msbpb (basement wall area)

was the only candidate that was excluded in the RJSEMs for offices and private

housing. However, there were more excluded candidates in the RASEMs. First of

all, the observed values for r (roof area) were found to be very close to, or the same

as, those for fpt (average floor area for the superstructure), because most multi-storey

buildings in Hong Kong, including those in this research, have a flat roof design for


the podium and tower. As fpt is considered to be a more representative candidate,

because the average floor area corresponds to more elements of a building than the

roof area, r was excluded from the RASEMs. The other primary variables, such as

n, m, fpt, fb, spt, sb, ppt, and pb, and the second degree variable, n2, were kept

because the use of untransformed variables excluding any interaction term is the best

starting point for a general regression model (Skitmore and Patchell 1990). All of

the interaction terms were subject to the exclusion procedures. nfpt (being a

candidate in RJSEM as well), n2fpt, n2spt, msbpb (being a candidate in RJSEM as

well) and n2sptppt were excluded from the RASEMs for the four types of building.

Furthermore, the interaction terms mfb (total basement floor area) and msb (depth of

basement) were also excluded from the private housing and nursing home models.

The matching best models from the forward stepwise and backward stepwise
procedures were generated from Mathcad worksheets that were purpose-written
to carry out the selection algorithm and the reliability analysis using cross
validation.
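The selection criterion can be illustrated with a minimal sketch. It assumes ordinary least squares with a constant term and shows only a forward pass driven by the leave-one-out average MSQ. It is not the Mathcad worksheet used in this research, it omits the backward pass and the candidate exclusion loop, and the function names and toy data are hypothetical.

```python
def ols_fit(rows, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    A = [[1.0] + [float(v) for v in r] for r in rows]   # prepend the constant term
    k = len(A[0])
    # Augmented matrix [A'A | A'y]
    M = [[sum(a[p] * a[q] for a in A) for q in range(k)]
         + [sum(a[p] * t for a, t in zip(A, y))] for p in range(k)]
    for c in range(k):                                   # Gauss-Jordan, partial pivoting
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[p][k] / M[p][p] for p in range(k)]


def loo_average_msq(rows, y):
    """Mean squared forecast error when each case is left out in turn."""
    total = 0.0
    for i in range(len(y)):
        b = ols_fit(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        forecast = b[0] + sum(bj * xj for bj, xj in zip(b[1:], rows[i]))
        total += (y[i] - forecast) ** 2
    return total / len(y)


def forward_select(columns, y):
    """columns maps a candidate name to its list of values.

    Returns (selected names, average MSQ of the selected model).
    """
    selected, best = [], float("inf")
    while len(selected) < len(columns):
        scores = {}
        for name in columns:
            if name not in selected:
                trial = selected + [name]
                rows = [[columns[c][i] for c in trial] for i in range(len(y))]
                scores[name] = loo_average_msq(rows, y)
        name = min(scores, key=scores.get)
        if scores[name] >= best:      # no candidate lowers the average MSQ: stop
            break
        selected.append(name)
        best = scores[name]
    return selected, best


# Toy data (hypothetical): y depends on x1 only, plus a small alternating error.
cols = {"x1": [1, 2, 3, 4, 5, 6, 7, 8], "x2": [3, 1, 4, 1, 5, 9, 2, 6]}
y = [2 + 3 * x + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(cols["x1"])]
selected, msq = forward_select(cols, y)
```

A backward pass would start from all candidates and delete the variable whose removal lowers the average MSQ most; the two passes are then compared as described above.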


Table 6-2: Included Candidates, Excluded Candidates and Selected Predictors
for RJSEMs and RASEMs

RJSEM
Candidate        Office  Private Housing  Nursing Home  School
afp / nfpt*      o       o                o             o
a2fp / n2fpt*    o       o                o             o
bft              o       o                NA            NA
b2ft             o       o                NA            NA
abft             o       o                NA            NA
mfb              o       o                o             NA
nsptppt          o       o                o             o
msbpb            x       x                o             NA
r                o       o                o             o

RASEM
Candidate        Office  Private Housing  Nursing Home  School
n                o       o                o             o
m                o       o                o             NA
n2               o       o                o             o
fpt              o       o                o             o
fb               o       o                o             NA
spt              o       o                o             o
sb               o       o                o             NA
ppt              o       o                o             o
pb               o       o                o             NA
nfpt             x       x                x             x
n2fpt            x       x                x             x
mfb              o       o                x             NA
nspt             o       o                o             o
msb              o       x                x             NA
n2spt            x       x                x             x
nsptppt          o       o                o             o
msbpb            x       x                x             NA
n2sptppt         x       x                x             x
r                x       x                x             x

Legend: o - Candidate (the selected predictors are those in the final models of
Tables 6-3 to 6-10); x - Excluded Candidate; NA - Not applicable
Remarks: * - afp and a2fp for office and private housing, nfpt and n2fpt for
nursing home and school

6.2.4.1 Selected Predictors for RJSEMs and RASEMs

Tables 6-3 to 6-10 show the step by step results of the predictor selection by

forward stepwise and backward stepwise procedures based on the criterion of

average MSQ. Tables 6-11 to 6-18 show the regression coefficients for each

predictor, forecast and MSQ as determined by the cross-validated models.

Table 6-19 divides the constants and selected predictors of all of the

regressed models according to the signs of their corresponding coefficients. Special


attention is drawn to the fact that the sign of a coefficient does not represent the

actual relationship between a predictor and the response of tender price per total floor

area, but the relationship between them in the best model under the proposed

regression methodology. Thus, the use of another methodology (e.g. the use of

another termination criterion rather than the least average MSQ) may produce

another best model (such as another group of transformed variables or another subset

of predictors) that would suggest a different set of relationships between the selected

predictors and the response in terms of the signs and values of the coefficients.

All of the constant terms (β0) are positive except the term for the RASEM for

private housing. The selected predictors can be classified into two groups: floor

area related predictors and non-floor area related predictors. As Table
6-19 shows, all of the models have at least one floor area related predictor. The floor area

predictors include afp, a2fp, bft, b2ft, fb, fpt, n2fpt and r. Most of these predictors

exhibit a negative effect on the tender price per total floor area in the RJSEMs and

the RASEMs. The average floor area of the superstructure (fpt) does not exist as a

candidate in the RJSEMs. Instead, the effect of floor area on the response is

represented by the total floor area and total floor area multiplied by the number of

storeys (afp, a2fp, bft and b2ft or nfpt and n2fpt). If r in the RJSEMs is considered

to be an alternative candidate to fpt in the RASEMs due to their proximity in value,

then all of the regressed models except for the RJSEMs for offices and private

housing would have a negative component that is represented by the average area of

the superstructure (similar to the typical floor area for multi-storey buildings of

rectangular shape). In addition, the RASEM for nursing homes is considered to be

very similar to the corresponding RJSEM in terms of the selected predictors (nsptppt

and r are the predictors in the RJSEM, whereas nsptppt and fpt are the predictors in


the RASEM) and the values of the corresponding coefficients due to the proximity of

value of fpt and r.

All of the RASEMs contain the predictor fpt with a corresponding negative

coefficient. If these models were used for prediction, then they would suggest that

the higher the value of the average floor area of a superstructure, the smaller the

forecasted tender price per total floor area would be. In the RJSEMs for offices and

private housing, the predictors of total floor area such as a2fp (for offices), bft (for

offices and private housing) and b2ft (for offices), instead of the predictors of

average area per storey, are present as the negative components. In contrast, some

other floor area related predictors such as afp (in the RJSEM for offices), r (in the

RJSEM for private housing), n2fpt (in the RJSEM for schools) and fb (in the

RASEM for private housing) are present in the different models with positive

coefficients. To find out the overall effect of the floor area related predictors in
each of the regressed models, their aggregate contributions to the response were
calculated. Table 6-20 shows the contributions of the floor area related predictors
to the response. From the table, it can be found that the aggregate contribution of
these floor area related predictors is generally negative (except for a few cases in
the RJSEMs for offices and private housing and the RASEM for private housing), which
suggests that the forecasted tender price per total floor area generally decreases as
the floor area related predictors in the models increase. However, the non-floor area
related predictors, n2, pb, ppt, sb, spt, nspt and nsptppt, exhibit solely positive
aggregate contributions to all responses.

Their contributions are shown in Table 6-21.
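As an arithmetic illustration of how the two groups of contributions combine, consider case 1 of the RJSEM for offices. The constant is taken from Table 6-11 and the two aggregate contributions for that case from Tables 6-20 and 6-21; the decomposition below only restates those reported figures.

```latex
\begin{align*}
\hat{Y}_1 &= \beta_0
  + (\beta_1 \cdot a2fp + \beta_3 \cdot bft + \beta_4 \cdot afp + \beta_5 \cdot b2ft)
  + (\beta_2 \cdot nsptppt) \\
          &= 4696 + (-750) + 1856 = 5802
\end{align*}
```

The sum agrees with the forecast of 5,802 reported for that case in Table 6-11. The same check for case 1 of the RASEM for private housing gives -6090 + (-422) + 10678 = 4166, which matches Table 6-16.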

Unlike the original JSEM that assumes the price of components (e.g. external

wall, window and external finishes) to be proportional to the measured areas (e.g.

external wall area), the regressed models select variables without assuming such a


relationship. The aggregate contributions according to the classification of floor or

non-floor related predictors provide further information on the composition of the

regressed models.

Table 6-3: Step-by-Step Selection Results of Predictors for the RJSEM for
Offices

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     a2fp                                  3.00E+06
2     nsptppt                               2.86E+06
3     bft                                   2.48E+06
4     afp                                   2.10E+06
5     b2ft                                  2.07E+06
6     (No deletion or entry, end regression)
Final model: a2fp, nsptppt, bft, afp, b2ft  2.07E+06

Backward Stepwise
Step  Variables entered                            Variables deleted  Average MSQ
1     afp, a2fp, bft, b2ft, abft, mfb, nsptppt, r                     2.87E+06
2                                                  abft               2.46E+06
3                                                  r                  2.11E+06
4                                                  mfb                2.07E+06
5     (Stop backward, start forward)
6     (No entry or deletion, end regression)
Final model: a2fp, nsptppt, bft, afp, b2ft                            2.07E+06



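Table 6-4: Step-by-Step Selection Results of Predictors for the RJSEM for
Private Housing

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     bft                                   9.72E+05
2     r                                     9.59E+05
3
Final model: bft, r                         9.59E+05

Backward Stepwise
Step  Variables entered                            Variables deleted  Average MSQ
1     afp, a2fp, bft, b2ft, abft, mfb, nsptppt, r                     1.69E+06
2                                                  mfb                1.33E+06
3                                                  b2ft               1.15E+06
4                                                  nsptppt            1.08E+06
5                                                  a2fp               1.06E+06
6                                                  abft               9.95E+05
7                                                  afp                9.59E+05
8
Final model: bft, r                                                   9.59E+05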


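Table 6-5: Step-by-Step Selection Results of Predictors for the RJSEM for
Nursing Homes

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     r                                     6.73E+05
2     nsptppt                               6.57E+05
3
Final model: r, nsptppt                     6.57E+05

Backward Stepwise
Step  Variables entered                    Variables deleted  Average MSQ
1     nfpt, n2fpt, mfb, nsptppt, msbpb, r                     3.26E+06
2                                          n2fpt              1.01E+06
3                                          mfb                7.74E+05
4                                          nfpt               7.00E+05
5                                          msbpb              6.57E+05
6
Final model: r, nsptppt                                       6.57E+05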



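Table 6-6: Step-by-Step Selection Results of Predictors for the RJSEM for
Schools

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     r                                     2.17E+05
2     n2fpt                                 2.07E+05
3
Final model: r, n2fpt                       2.07E+05

Backward Stepwise
Step  Variables entered        Variables deleted  Average MSQ
1     nfpt, n2fpt, nsptppt, r                     3.16E+05
2                              nsptppt            2.35E+05
3                              nfpt               2.07E+05
4
Final model: r, n2fpt                             2.07E+05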


Table 6-7: Step-by-Step Selection Results of Predictors for the RASEM for
Offices

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     nspt                                  2.79E+06
2     n2                                    2.04E+06
3     fpt                                   1.79E+06
4     ppt                                   1.63E+06
5
Final model: nspt, n2, fpt, ppt             1.63E+06

Backward Stepwise
Step  Variables entered                                             Variables deleted  Average MSQ
1     n, m, n2, fpt, fb, spt, sb, ppt, pb, mfb, nspt, msb, nsptppt                     1.85E+07
2                                                                   pb                 7.14E+06
3                                                                   nspt               3.78E+06
4                                                                   spt                2.51E+06
5                                                                   fb                 2.04E+06
6                                                                   nsptppt            1.93E+06
7                                                                   msb                1.92E+06
8                                                                   mfb                1.90E+06
9                                                                   m                  1.89E+06
10    (Stop backward, start forward)
11    nspt                                                                             1.80E+06
12    (Stop forward, start backward)
13                                                                  sb                 1.69E+06
14                                                                  n                  1.63E+06
15
Final model: nspt, n2, fpt, ppt                                                        1.63E+06



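Table 6-8: Step-by-Step Selection Results of Predictors for the RASEM for
Private Housing

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     spt                                   5.96E+05
2     fb                                    5.62E+05
3     pb                                    5.19E+05
4     fpt                                   4.96E+05
5     sb                                    4.92E+05
6
Final model: spt, fb, pb, fpt, sb           4.92E+05

Backward Stepwise
Step  Variables entered                                        Variables deleted  Average MSQ
1     n, m, n2, fpt, fb, spt, sb, ppt, pb, mfb, nspt, nsptppt                     5.26E+06
2                                                              m                  6.74E+05
3                                                              nspt               6.11E+05
4                                                              ppt                5.67E+05
5                                                              n                  5.41E+05
6                                                              mfb                5.20E+05
7                                                              nsptppt            5.02E+05
8                                                              n2                 4.92E+05
9
Final model: spt, fb, pb, fpt, sb                                                 4.92E+05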





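Table 6-9: Step-by-Step Selection Results of Predictors for the RASEM for
Nursing Homes

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     fpt                                   6.70E+05
2     nsptppt                               6.47E+05
3
Final model: fpt, nsptppt                   6.47E+05

Backward Stepwise
Step  Variables entered                                   Variables deleted  Average MSQ
1     n, m, n2, fpt, fb, spt, sb, ppt, pb, nspt, nsptppt                     1.23E+08
2                                                         sb                 2.34E+07
3                                                         n2                 3.79E+06
4                                                         pb                 1.33E+06
5                                                         fb                 9.68E+05
6                                                         n                  8.63E+05
7                                                         m                  7.89E+05
8                                                         nspt               7.36E+05
9                                                         ppt                6.47E+05
10                                                        spt                6.47E+05
11
Final model: fpt, nsptppt                                                    6.47E+05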



Table 6-10: Step-by-Step Selection Results of Predictors for the RASEM for
Schools

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     nspt                                  1.80E+05
2     fpt                                   1.75E+05
3
Final model: fpt, nspt                      1.75E+05

Backward Stepwise
Step  Variables entered                    Variables deleted  Average MSQ
1     n, n2, fpt, spt, ppt, nspt, nsptppt                     2.68E+05
2                                          n2                 2.26E+05
3                                          nsptppt            2.07E+05
4                                          ppt                2.04E+05
5                                          n                  2.00E+05
6                                          spt                1.75E+05
7
Final model: fpt, nspt                                        1.75E+05




Table 6-11: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Offices

RJSEM (β0 + β1 ⋅ a2fp + β2 ⋅ nsptppt + β3 ⋅ bft + β4 ⋅ afp + β5 ⋅ b2ft)

Case  β0    β1      β2     β3      β4     β5       Forecasted Y  MSQ
1     4696  -0.066  0.223  -0.079  0.295  -0.0002  5,802         1.26E+06
2     4708  -0.069  0.222  -0.079  0.306  -0.0002  5,657         8.77E+05
3     4690  -0.063  0.221  -0.078  0.286  -0.0002  5,118         8.07E+05
4     4710  -0.070  0.221  -0.078  0.308  -0.0002  5,645         8.19E+05
5     4706  -0.069  0.224  -0.080  0.309  -0.0002  6,186         2.33E+06
6     4541  -0.083  0.242  -0.097  0.402  -0.0001  2,325         3.08E+06
7     4686  -0.068  0.226  -0.080  0.305  -0.0003  6,247         2.49E+06
8     4614  -0.064  0.217  -0.074  0.277  -0.0003  6,540         6.78E+06
9     4568  -0.059  0.239  -0.081  0.251  -0.0002  9,757         2.88E+06
10    4697  -0.065  0.214  -0.078  0.293  -0.0002  7,508         1.44E+06
11    4663  -0.068  0.223  -0.078  0.300  -0.0003  6,618         1.73E+04
12    4620  -0.066  0.224  -0.078  0.293  -0.0003  6,034         1.34E+06
13    4676  -0.067  0.225  -0.079  0.300  -0.0003  6,375         6.34E+05
14    4694  -0.058  0.225  -0.079  0.246  -0.0002  3,925         1.16E+06
15    4703  -0.064  0.212  -0.075  0.280  -0.0002  7,969         1.40E+06
16    4524  -0.064  0.228  -0.076  0.278  -0.0003  5,639         8.25E+06
17    4787  -0.121  0.221  -0.081  0.466  -0.0002  2,516         6.12E+06
18    4713  -0.069  0.221  -0.079  0.307  -0.0002  5,692         1.09E+06
19    4510  -0.066  0.232  -0.079  0.296  -0.0003  5,279         6.51E+06
20    4655  -0.068  0.225  -0.079  0.301  -0.0003  8,907         1.36E+04
21    4711  -0.069  0.221  -0.077  0.300  -0.0002  5,359         6.54E+05
22    4599  -0.069  0.227  -0.078  0.303  -0.0003  5,226         1.27E+06
23    4665  -0.068  0.223  -0.078  0.299  -0.0003  8,958         1.74E+03
24    4662  -0.068  0.223  -0.078  0.300  -0.0003  5,424         3.51E+01
25    4667  -0.068  0.223  -0.077  0.298  -0.0003  4,518         3.17E+04
26    4593  -0.064  0.218  -0.074  0.280  -0.0003  6,333         1.01E+07
27    4559  -0.069  0.229  -0.078  0.303  -0.0003  5,323         4.03E+06
28    4649  -0.066  0.223  -0.078  0.294  -0.0003  6,432         2.00E+05
29    4665  -0.068  0.223  -0.078  0.301  -0.0003  5,916         3.09E+03
30    4727  -0.087  0.227  -0.088  0.403  -0.0002  6,743         5.49E+06
31    4715  -0.068  0.219  -0.077  0.295  -0.0003  4,828         5.44E+05
32    4669  -0.068  0.224  -0.079  0.301  -0.0003  6,505         1.56E+05
33    4653  -0.068  0.224  -0.078  0.299  -0.0003  5,262         2.19E+04
34    4642  -0.068  0.225  -0.078  0.297  -0.0003  5,506         1.47E+05
35    4700  -0.068  0.221  -0.078  0.300  -0.0002  5,268         4.07E+05
36    4702  -0.071  0.220  -0.082  0.325  -0.0002  6,080         1.02E+06
37    4639  -0.068  0.226  -0.077  0.297  -0.0003  5,149         9.69E+04
38    4718  -0.067  0.220  -0.075  0.285  -0.0003  5,431         1.54E+06
39    4745  -0.078  0.221  -0.082  0.349  -0.0002  6,025         1.65E+06
40    4722  -0.067  0.220  -0.075  0.284  -0.0003  5,401         1.64E+06
41    4604  -0.056  0.235  -0.064  0.207  -0.0005  7,046         7.38E+06
42    4698  -0.067  0.223  -0.076  0.290  -0.0003  5,861         1.15E+06

Average:                                                         2.07E+06



Table 6-12: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Private Housing

RJSEM (β0 + β1 ⋅ bft + β2 ⋅ r)

Case  β0    β1      β2     Forecasted Y  MSQ
1     4530  -0.008  0.057  3,935         2.41E+06
2     4567  -0.008  0.052  3,784         4.77E+06
3     4484  -0.007  0.055  4,504         1.53E+06
4     4466  -0.007  0.058  4,430         2.09E+06
5     4533  -0.007  0.057  3,353         8.83E+02
6     4496  -0.008  0.087  5,744         1.27E+06
7     4475  -0.007  0.058  4,480         1.27E+06
8     4509  -0.007  0.057  4,520         2.25E+05
9     4536  -0.007  0.056  4,466         8.91E+03
10    4531  -0.007  0.045  4,771         5.08E+05
11    4625  -0.010  0.069  2,997         9.64E+06
12    4533  -0.007  0.057  3,845         5.93E+04
13    4474  -0.007  0.056  4,471         1.99E+06
14    4470  -0.007  0.058  4,464         1.52E+06
15    4522  -0.007  0.057  4,514         4.95E+04
16    4509  -0.007  0.061  4,024         7.96E+05
17    4558  -0.007  0.056  4,566         2.35E+05
18    4532  -0.007  0.056  3,926         3.08E+04
19    4583  -0.008  0.056  4,595         9.98E+05
20    4490  -0.007  0.058  4,368         1.12E+06
21    4511  -0.007  0.058  4,213         6.41E+05
22    4516  -0.007  0.058  4,261         2.60E+05
23    4568  -0.007  0.055  4,321         1.35E+06
24    4535  -0.007  0.056  4,048         6.96E+03
25    4530  -0.007  0.056  4,531         3.55E+03
26    4541  -0.007  0.060  4,416         4.88E+05
27    4552  -0.007  0.056  4,472         1.99E+05
28    4551  -0.007  0.056  4,219         9.78E+05
29    4536  -0.007  0.056  4,037         1.91E+05
30    4495  -0.007  0.057  4,418         9.56E+05
31    4534  -0.007  0.055  3,890         2.68E+05
32    4586  -0.008  0.054  4,399         1.85E+06
33    4526  -0.007  0.054  4,648         1.10E+05
34    4551  -0.007  0.059  4,403         8.02E+05
35    4537  -0.007  0.057  4,259         9.87E+04
36    4552  -0.007  0.056  4,499         1.61E+05
37    4534  -0.007  0.054  3,856         6.24E+05
38    4523  -0.007  0.054  3,600         3.37E+05
39    4550  -0.007  0.054  4,067         1.01E+06
40    4523  -0.007  0.054  3,614         3.22E+05
41    4533  -0.007  0.055  3,820         2.76E+05
42    4530  -0.007  0.057  4,028         3.60E+04
43    4532  -0.007  0.055  3,802         9.83E+04
44    4572  -0.007  0.055  4,447         7.69E+05
45    4558  -0.007  0.054  4,153         1.11E+06
46    4601  -0.008  0.055  4,486         2.78E+06
47    4499  -0.007  0.058  4,377         6.62E+05
48    4542  -0.007  0.056  4,479         3.55E+04
49    4560  -0.007  0.056  4,490         4.01E+05
50    4532  -0.007  0.054  3,822         6.27E+05

Average:                                 9.59E+05



Table 6-13: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Nursing Homes

RJSEM (β0 + β1 ⋅ r + β2 ⋅ nsptppt)

Case  β0    β1      β2     Forecasted Y  MSQ
1     4541  -0.799  0.121  4,389         2.01E+04
2     4424  -0.814  0.161  4,730         1.73E+06
3     4619  -0.859  0.111  3,928         1.20E+06
4     4604  -0.875  0.121  3,512         4.97E+05
5     4210  -0.722  0.163  4,181         1.02E+06
6     4566  -0.815  0.124  4,608         1.02E+05
7     4540  -0.719  0.096  4,822         1.14E+06
8     4278  -0.730  0.149  4,310         1.46E+06
9     4527  -0.801  0.125  3,825         4.84E+03
10    4569  -0.810  0.123  4,434         2.06E+05
11    4480  -0.830  0.135  3,838         6.47E+05
12    4577  -0.842  0.122  3,133         6.30E+04
13    4548  -0.807  0.126  4,476         1.64E+05
14    4575  -0.740  0.112  3,901         1.16E+06
15    4719  -0.813  0.099  4,237         1.10E+06
16    4420  -0.788  0.139  4,082         3.05E+05
17    4509  -0.803  0.130  5,434         5.86E+03
18    4501  -0.752  0.122  3,184         7.23E+04
19    4585  -0.830  0.125  4,734         2.32E+05
20    4397  -0.760  0.137  4,391         4.63E+05
21    4776  -0.830  0.093  4,327         1.48E+06
22    4621  -0.835  0.125  4,520         1.62E+06
23    4496  -0.752  0.115  4,780         4.09E+05

Average:                                 6.57E+05



Table 6-14: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Schools

RJSEM (β0 + β1 ⋅ r + β2 ⋅ n2fpt)

Average: 2.07E+05



Table 6-15: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Offices

RASEM (β0 + β1 ⋅ nspt + β2 ⋅ n2 + β3 ⋅ fpt + β4 ⋅ ppt)

Case  β0    β1     β2      β3      β4     Forecasted Y  MSQ
1     2370  43.45  -1.892  -1.571  18.76  6,005         1.75E+06
2     2359  45.68  -1.981  -1.450  16.88  5,576         7.34E+05
3     2363  45.18  -1.961  -1.454  17.08  4,937         5.14E+05
4     2356  45.81  -1.983  -1.441  16.76  5,520         6.09E+05
5     2300  48.45  -2.107  -1.398  15.94  6,487         3.34E+06
6     2025  47.18  -2.038  -1.688  19.90  2,394         2.84E+06
7     2288  47.24  -2.057  -1.453  16.94  6,433         3.11E+06
8     2395  42.12  -1.799  -1.468  17.60  7,068         4.31E+06
9     2235  47.75  -2.025  -1.407  16.24  8,550         2.40E+05
10    2405  48.12  -2.059  -1.218  13.33  7,416         1.68E+06
11    2281  46.65  -2.024  -1.440  16.83  6,899         2.23E+04
12    2286  45.11  -1.955  -1.480  17.57  6,409         6.10E+05
13    2298  47.77  -2.061  -1.402  16.05  6,610         1.06E+06
14    2387  46.10  -1.982  -1.508  16.59  3,836         1.36E+06
15    2433  46.52  -1.989  -1.260  14.10  7,866         1.65E+06
16    2106  48.58  -2.091  -1.368  16.09  5,694         7.93E+06
17    2290  46.27  -2.004  -1.449  16.94  4,925         4.25E+03
18    2361  46.88  -2.029  -1.396  15.99  5,738         1.18E+06
19    2096  45.67  -1.983  -1.546  18.86  5,414         5.84E+06
20    2239  47.86  -2.076  -1.422  16.50  9,053         6.92E+04
21    2370  46.51  -2.016  -1.405  16.15  5,519         9.40E+05
22    2218  45.53  -1.976  -1.510  18.07  5,430         8.50E+05
23    2352  46.75  -2.015  -1.353  15.49  8,587         1.70E+05
24    2300  46.61  -2.019  -1.427  16.61  5,587         2.48E+04
25    2291  45.53  -1.963  -1.448  17.38  5,605         1.60E+06
26    2269  46.35  -2.005  -1.340  15.72  6,509         9.02E+06
27    2108  47.33  -2.061  -1.478  17.69  5,339         3.96E+06
28    2292  46.35  -2.008  -1.438  16.80  6,770         1.22E+04
29    2275  46.94  -2.034  -1.420  16.53  5,661         3.95E+04
30    2335  46.00  -2.001  -1.440  16.93  5,606         1.46E+06
31    2293  46.50  -2.015  -1.434  16.73  4,113         5.50E+02
32    2296  45.41  -1.973  -1.516  17.92  6,801         4.78E+05
33    2266  46.47  -2.013  -1.448  16.99  5,171         5.72E+04
34    2244  47.00  -2.035  -1.430  16.74  5,392         2.48E+05
35    2347  45.84  -1.987  -1.440  16.76  5,121         2.41E+05
36    2270  46.98  -2.035  -1.425  16.63  7,224         1.80E+04
37    2101  52.95  -2.439  -1.337  15.23  4,320         1.30E+06
38    2367  45.35  -1.982  -1.469  17.29  5,611         2.02E+06
39    2284  46.62  -2.020  -1.434  16.73  4,696         1.96E+03
40    2378  44.97  -1.963  -1.488  17.53  5,720         2.56E+06
41    2290  46.48  -2.015  -1.435  16.77  4,363         1.06E+03
42    2202  53.66  -2.347  -1.232  13.35  6,966         4.73E+06

Average:                                                1.63E+06



Table 6-16: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Private Housing

RASEM (β0 + β1 ⋅ spt + β2 ⋅ fb + β3 ⋅ pb + β4 ⋅ fpt + β5 ⋅ sb)

Case  β0     β1    β2     β3      β4      β5      Forecasted Y  MSQ
1     -6090  3757  0.617  -3.045  -0.142  -114.9  4,166         1.74E+06
2     -5808  3666  0.632  -3.119  -0.165  -105.6  3,969         4.00E+06
3     -6091  3748  0.603  -2.975  -0.125  -118.9  4,868         7.60E+05
4     -8495  4625  0.633  -3.110  -0.116  -167.1  7,330         2.12E+06
5     -6419  3873  0.606  -2.985  -0.121  -131.6  3,564         3.29E+04
6     -6386  3863  0.514  -2.298  -0.126  -150.0  5,053         1.91E+05
7     -6281  3813  0.600  -2.955  -0.117  -123.0  4,946         4.37E+05
8     -6403  3861  0.605  -2.978  -0.120  -126.6  4,590         1.64E+05
9     -6370  3861  0.612  -3.015  -0.128  -130.7  4,538         2.76E+04
10    -6369  3857  0.581  -2.836  -0.126  -131.7  5,292         3.67E+04
11    -6309  3836  0.454  -2.157  -0.127  -121.2  5,212         7.90E+05
12    -6208  3804  0.664  -3.268  -0.133  -122.0  4,995         8.22E+05
13    -6162  3782  0.608  -2.995  -0.128  -124.7  5,620         6.93E+04
14    -6258  3803  0.599  -2.949  -0.116  -122.3  4,974         5.24E+05
15    -7189  4168  0.629  -3.097  -0.134  -148.9  5,918         1.40E+06
16    -6508  3901  0.598  -2.948  -0.110  -136.2  3,724         3.51E+05
17    -6301  3846  0.618  -3.048  -0.137  -132.8  4,622         2.92E+05
18    -6373  3859  0.610  -3.005  -0.126  -129.8  3,790         1.51E+03
19    -6145  3791  0.617  -3.042  -0.138  -130.8  4,118         2.73E+05
20    -6251  3815  0.620  -3.069  -0.126  -134.9  5,148         7.65E+04
21    -6459  3890  0.611  -3.025  -0.126  -124.1  5,164         2.27E+04
22    -6544  3908  0.606  -2.984  -0.120  -125.6  4,042         5.32E+05
23    -6197  3799  0.543  -2.588  -0.130  -119.2  3,631         2.22E+05
24    -6249  3822  0.623  -3.071  -0.145  -122.6  3,518         3.76E+05
25    -6855  4009  0.599  -2.947  -0.106  -129.1  3,806         6.15E+05
26    -6368  3863  0.628  -3.110  -0.133  -135.9  3,451         7.07E+04
27    -6496  3898  0.608  -2.998  -0.123  -128.7  3,753         7.48E+04
28    -6291  3837  0.610  -3.006  -0.127  -134.9  4,014         6.16E+05
29    -6369  3860  0.610  -2.991  -0.129  -134.5  3,488         1.24E+04
30    -6114  3767  0.662  -3.302  -0.127  -151.2  4,623         5.97E+05
31    -6217  3801  0.676  -3.492  -0.124  -91.8   3,994         3.85E+05
32    -5997  3744  0.617  -3.046  -0.140  -132.3  3,964         8.56E+05
33    -6525  3902  0.608  -2.997  -0.124  -123.1  3,983         9.94E+05
34    -6329  3847  0.609  -3.003  -0.126  -132.6  3,955         2.01E+05
35    -6371  3859  0.610  -3.007  -0.127  -129.5  3,958         1.81E+02
36    -6601  3934  0.619  -3.043  -0.118  -146.3  3,707         1.53E+05
37    -6433  3880  0.611  -3.012  -0.127  -129.2  2,957         1.18E+04
38    -6380  3859  0.601  -2.964  -0.116  -133.9  3,536         2.67E+05
39    -6199  3804  0.609  -3.003  -0.127  -133.2  3,777         5.09E+05
40    -6392  3862  0.602  -2.965  -0.116  -133.6  3,509         2.14E+05
41    -6341  3848  0.607  -2.991  -0.123  -131.8  3,622         1.07E+05
42    -6486  3895  0.611  -3.012  -0.127  -126.7  3,722         2.46E+05
43    -6361  3855  0.609  -3.001  -0.125  -130.3  3,608         1.43E+04
44    -6237  3817  0.612  -3.020  -0.132  -129.6  3,797         5.16E+04
45    -6142  3785  0.611  -3.012  -0.131  -131.0  3,638         2.90E+05
46    -5902  3705  0.614  -3.031  -0.139  -128.7  3,441         3.88E+05
47    -6694  3952  0.600  -2.954  -0.111  -124.3  4,067         1.26E+06
48    -6355  3855  0.616  -3.069  -0.130  -117.2  4,541         6.25E+04
49    -6510  3918  0.615  -3.031  -0.129  -137.7  4,664         6.52E+05
50    -6383  3864  0.603  -2.972  -0.117  -136.5  3,856         6.81E+05

Average:                                                        4.92E+05



Table 6-17: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Nursing Homes

RASEM (β0 + β1 ⋅ fpt + β2 ⋅ nsptppt)

Average: 6.47E+05



Table 6-18: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Schools

RASEM (β0 + β1 ⋅ nspt + β2 ⋅ fpt)

Case  β0    β1      β2      Forecasted Y  MSQ
1     1484  49.079  -0.208  2,282         1.73E+02
2     1443  50.343  -0.185  2,023         1.16E+04
3     1485  48.432  -0.205  2,272         3.30E+04
4     1461  49.437  -0.201  2,153         5.47E+04
5     1473  49.570  -0.206  1,840         8.99E+02
6     1489  50.339  -0.223  2,466         4.79E+04
7     1503  51.626  -0.244  2,452         3.08E+05
8     1445  50.826  -0.192  2,225         3.14E+04
9     1673  35.520  -0.195  2,417         8.84E+05
10    1497  49.233  -0.214  2,223         3.74E+04
11    1478  49.856  -0.211  2,596         3.76E+03
12    1488  49.902  -0.212  2,247         9.42E+04
13    1559  41.767  -0.195  2,298         9.48E+05
14    1395  52.655  -0.201  1,914         1.91E+05
15    1476  49.299  -0.199  2,008         1.37E+04
16    1391  55.184  -0.202  2,491         2.61E+05
17    1389  55.851  -0.245  1,271         1.31E+05
18    1480  49.330  -0.208  1,846         2.51E+02
19    1635  46.517  -0.272  2,293         6.15E+05
20    1415  51.450  -0.198  2,003         1.51E+05
21    1465  49.734  -0.202  2,061         3.42E+03
22    1576  45.224  -0.219  1,817         5.45E+04
23    1402  50.264  -0.135  1,918         1.54E+05

Average:                                  1.75E+05

Table 6-19: Signs of Coefficients for Selected Predictors

                  Positive Coefficients      Negative Coefficients
RJSEM
Office            Constant, nsptppt and afp  a2fp, bft and b2ft
Private Housing   Constant, r                bft
Nursing Home      Constant, nsptppt          r
School            Constant, n2fpt            r
RASEM
Office            Constant, nspt and ppt     n2 and fpt
Private Housing   spt and fb                 Constant, pb, fpt, and sb
Nursing Home      Constant, nsptppt          fpt
School            Constant, nspt             fpt

Remark: Bold – Floor area related predictor


Table 6-20: Contributions of Floor Area Related Predictors to Response

Contribution expressions:
Regressed JSEM: Office (β1 ⋅ a2fp + β3 ⋅ bft + β4 ⋅ afp + β5 ⋅ b2ft); Private Housing (β1 ⋅ bft + β2 ⋅ r);
Nursing Home (β1 ⋅ r); School (β1 ⋅ r + β2 ⋅ n2fpt)
Regressed ASEM: Office (β3 ⋅ fpt); Private Housing (β2 ⋅ fb + β4 ⋅ fpt);
Nursing Home (β1 ⋅ fpt); School (β2 ⋅ fpt)

      Regressed JSEM                      Regressed ASEM
Case  Office   Private  Nursing  School   Office   Private  Nursing  School
               Housing  Home                       Housing  Home
1     -750     -596     -999     -175     -945     -422     -1,058   -187
2     -431     -786     -847     -364     -598     -610     -966     -461
3     -1,322   21       -1,366   -193     -1,853   -161     -1,474   -184
4     -525     -36      -1,768   -140     -630     -62      -2,133   -178
5     -707     -1,180   -332     -371     -679     -647     -333     -186
6     -4,540   1,255    -534     -34      -6,077   6,718    -614     -110
7     -1,395   6        -396     -141     -1,363   -20      -414     -135
8     -330     11       -460     -42      -399     -45      -533     -311
9     -9,242   -72      -1,274   -18      -2,293   -117     -1,485   -155
10    -2,137   236      -711     -289     -1,747   5,379    -800     -186
11    -2,173   -1,601   -1,187   -37      -2,113   5,953    -1,155   -108
12    -570     -688     -2,172   -198     -489     6,348    -2,335   -204
13    -688     -2       -738     -165     -425     -146     -848     -174
14    -15,890  -6       -1,132   -471     -11,830  -18      -1,223   -218
15    -1,165   -7       -779     -313     -1,171   -40      -891     -331
16    -238     -482     -737     237      -378     -529     -852     -207
17    -7,146   8        -477     -1,229   -4,839   -26      -553     -515
18    -435     -608     -1,865   -430     -474     -434     -1,766   -219
19    -291     13       -461     -211     -247     -52      -531     -146
20    -1,208   -122     -486     -313     -758     1,664    -552     -194
21    -423     -298     -712     -168     -367     310      -813     -100
22    -337     -254     -728     -396     -241     -162     -832     -213
23    -1,323   -248     -416     -587     -1,298   1,070    -480     -327
24    -583     -485                       -517     -715
25    -7,349   1                          -4,807   -17
26    -556     -122                       -481     2,156
27    -510     -80                        -428     -159
28    -524     -333                       -629     -281
29    -639     -499                       -968     -100
30    -1,074   -76                        -2,305   1,205
31    -658     -644                       -1,362   1,176
32    -540     -187                       -487     -147
33    -217     123                        -391     -238
34    -269     -147                       -477     -303
35    -349     -278                       -446     -319
36    -10,290  -53                        -4,958   855
37    -10,140  -677                       -2,801   -425
38    -1,113   -923                       -1,557   -503
39    274      -483                       -1,186   -295
40    -1,135   -910                       -1,324   -529
41    -2,083   -711                       -6,673   -427
42    -1,170   -501                       -759     -309
43             -732                                -439
44             -125                                -80
45             -404                                -250
46             -115                                -156
47             -121                                -106
48             -63                                 278
49             -70                                 -187
50             -709                                -386

Remark: Bold numbers represent positive contributions to the responses


Table 6-21: Contribution of Non-Floor Area Related Predictors to Responses

Contribution expressions:
Regressed JSEM: Office (β2 ⋅ nsptppt); Nursing Home (β2 ⋅ nsptppt)
Regressed ASEM: Office (β1 ⋅ nspt + β2 ⋅ n2 + β4 ⋅ ppt); Private Housing (β1 ⋅ spt + β3 ⋅ pb + β5 ⋅ sb);
Nursing Home (β2 ⋅ nsptppt); School (β1 ⋅ nspt)

      Regressed JSEM                      Regressed ASEM
Case  Office   Private  Nursing  School   Office   Private  Nursing  School
               Housing  Home                       Housing  Home
1     1,856             847               4,580    10,678   873      985
2     1,380             1,153             3,815    10,387   1,164    1,041
3     1,750             675               4,427    11,120   693      971
4     1,460             676               3,794    15,887   689      870
5     2,187             303               4,866    10,630   302      553
6     2,324             576               6,446    4,721    588      1,087
7     2,956             678               5,508    11,247   709      1,084
8     2,256             492               5,072    11,038   500      1,091
9     14,431            572               8,608    11,025   588      899
10    4,948             576               6,758    6,282    590      912
11    4,128             545               6,731    5,568    552      1,226
12    1,984             728               4,612    4,855    742      963
13    2,387             666               4,737    11,928   681      913
14    15,121            458               13,279   11,250   471      737
15    4,431             297               6,604    13,147   310      863
16    1,353             399               3,966    10,761   409      1,307
17    4,875             1,402             7,474    10,949   1,480    397
18    1,414             548               3,851    10,597   553      585
19    1,060             610               3,565    10,315   624      804
20    5,460             480               7,572    9,735    489      782
21    1,071             263               3,516    11,313   273      696
22    964               627               3,453    10,748   641      454
23    5,616             700               7,533    8,758    719      843
24    1,345                               3,804    10,482
25    7,200                               8,121    10,678
26    2,296                               4,721    7,663
27    1,274                               3,659    10,408
28    2,307                               5,107    10,586
29    1,890                               4,354    9,957
30    3,090                               5,576    9,532
31    771                                 3,182    9,035
32    2,376                               4,992    10,108
33    826                                 3,296    10,746
34    1,133                               3,625    10,587
35    917                                 3,220    10,648
36    11,668                              9,912    9,453
37    10,650                              5,020    9,815
38    1,826                               4,801    10,419
39    1,006                               3,598    10,271
40    1,814                               4,666    10,430
41    4,525                               8,746    10,390
42    2,333                               5,523    10,517
43                                                 10,408
44                                                 10,114
45                                                 10,030
46                                                 9,499
47                                                 10,867
48                                                 10,618
49                                                 11,361
50                                                 10,625

Remark: Bold numbers represent positive contributions to the responses


6.2.5 Model Transformation

The regressed models with the logarithmic transformed variables can be expressed in

the form of Equation (5.19) in Chapter 5. The response and all of the predictors in

the regressed models were logarithmically transformed (in base e). The LRJSEM

and the LRASEM represent the transformed models for the RJSEM and the RASEM,

respectively. A key condition of the logarithmic transformation is
that all of the values of the transformed variables must be larger than zero.

Unfortunately, the predictors of two of the regressed models do not satisfy this

condition. As some of the office projects do not have podiums and private housing

projects do not have basements, certain predictors, including afp and a2fp in the

RJSEM for offices, and fb, sb and pb in the RASEM for private housing, cannot be

transformed. To fulfil the condition, these predictors were excluded from the
LRJSEM for offices and the LRASEM for private housing.
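The positivity condition can be checked mechanically before any transformation is attempted. The sketch below is illustrative only: the function name `log_transformable` and the numbers are hypothetical, and it simply separates predictors whose observed values are all positive from those that contain a zero, as happens when a project has no podium or no basement.

```python
import math


def log_transformable(columns):
    """Split predictors into log-transformed columns and names that must be dropped.

    columns maps a predictor name to its observed values. A predictor is kept
    only if every observed value is strictly greater than zero.
    """
    usable = {name: [math.log(v) for v in values]
              for name, values in columns.items()
              if all(v > 0 for v in values)}
    dropped = [name for name in columns if name not in usable]
    return usable, dropped


# Hypothetical values: the second project has no podium, so afp is zero there.
example = {"bft": [1200.0, 900.0], "afp": [450.0, 0.0]}
usable, dropped = log_transformable(example)
```

With these hypothetical inputs, `bft` is transformed and `afp` is reported as dropped, which mirrors the exclusion of afp and a2fp from the LRJSEM for offices.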

6.3 Performance Validation

6.3.1 Forecasting Results

To study whether the regressed models improve the performance of forecasts,

their performance was compared with that of the conventional models. The same

data for generating the regressed models were used to assess the performance of the

conventional models. Forecasted tender prices for the JSEM, the floor area model


and the cube model were calculated using Equations (6.1) to (6.3), respectively, as

follows:

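\[
\hat{P} = \left\{
\begin{aligned}
&\left(2 - \frac{0.15}{2}\right) \cdot a \cdot fp + \frac{0.15}{2} \cdot a^{2} \cdot fp
  + \left(2 - \frac{0.15}{2}\right) \cdot b \cdot ft \\
&+ \frac{0.15}{2} \cdot b^{2} \cdot ft + 0.15 \cdot a \cdot b \cdot ft + r
  + (a + b) \cdot spt \cdot ppt \\
&+ 2 \cdot m \cdot fb + 2.5 \cdot m \cdot sb \cdot pb
\end{aligned}
\right\} \cdot R \tag{6.1}
\]

\[
\hat{P}' = (a \cdot fp + b \cdot ft + m \cdot fb) \cdot R' \tag{6.2}
\]

\[
\hat{P}'' = (a \cdot fp \cdot sp + b \cdot ft \cdot st + m \cdot fb \cdot sb) \cdot R'', \tag{6.3}
\]

where $\hat{P}$, $\hat{P}'$ and $\hat{P}''$ are the forecasted prices for the JSEM, the floor
area and cube models, respectively, and $R$, $R'$ and $R''$ are their corresponding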

unit rates that are deduced by cross validation as described in section 5.9 of Chapter

5. The quantities measured, the cross-validated unit rates and the forecasted tender

prices for the three conventional models for offices, private housing, nursing homes

and schools are shown in Tables E-1 to E-4 in Appendix E. The forecasted prices

as shown in the tables were used to calculate the corresponding percentage errors for

the purpose of making comparisons with the regressed models.
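The floor area and cube models reduce to a measured quantity multiplied by a unit rate, which a short sketch makes concrete. It assumes that a, b and m are the numbers of podium, tower and basement storeys, that fp, ft and fb are the corresponding floor areas per storey, and that sp, st and sb are the corresponding storey heights; this reading is inferred from the variable names. The function names and the numbers are hypothetical, and the cross-validated derivation of the unit rates described in section 5.9 of Chapter 5 is not reproduced.

```python
def floor_area_quantity(a, fp, b, ft, m, fb):
    """Total floor area: podium + tower + basement (floor area model)."""
    return a * fp + b * ft + m * fb


def cube_quantity(a, fp, sp, b, ft, st, m, fb, sb):
    """Total enclosed volume: floor area times storey height per part (cube model)."""
    return a * fp * sp + b * ft * st + m * fb * sb


def forecast_price(quantity, unit_rate):
    """Forecast = measured quantity multiplied by the model's unit rate."""
    return quantity * unit_rate


# Hypothetical building: 2 podium storeys, 10 tower storeys, 1 basement storey.
area = floor_area_quantity(a=2, fp=1000.0, b=10, ft=500.0, m=1, fb=800.0)
volume = cube_quantity(a=2, fp=1000.0, sp=4.0, b=10, ft=500.0, st=3.0,
                       m=1, fb=800.0, sb=3.5)
price_by_area = forecast_price(area, unit_rate=5.0)   # unit rate is made up
```

Because each model has a single unit rate, any difference between the two forecasts comes entirely from how the quantity is measured.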

To assess the performance of the best subset of regressed models, their

forecasting results were compared with those that were obtained from the

conventional models. First of all, the forecasting errors and percentage errors of all

of the models were calculated. The forecasting errors for various conventional

models and regressed models are shown in Tables F-1 to F-4 and the percentage

errors are shown in Tables F-5 to F-8 in Appendix F. Table 6-22 shows a summary


of the means and standard deviations of the percentage errors that represent the bias

and consistency of all the models as extracted from the appendix, and the results of

the significance testing (p-values of the t-tests) for zero bias for all of the models.

As expected, the forecasted prices from the models that were generated by the
method of cross validation generally have very little bias: the mean percentage
errors of all but one model do not differ significantly from zero. The exception is
the JSEM for offices, which is significantly biased and has the largest mean
percentage error in magnitude (-6.88%) amongst all of the models. As bias alone is
not informative enough to distinguish the performance of the models, consistency
becomes an important measure in this study. Unlike the t-tests that are used for
the comparison of means, parametric tests for the homogeneity of variance are not
robust to departures from normality, as is explained in section 5.9.1 of Chapter 5.
As parametric tests are preferable to non-parametric tests, the distributions of
errors (in terms of the ratio of forecast to actual tender price) for all of the
models were examined in order to choose the appropriate tests.
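The bias and consistency measures, and the statistic behind the zero-bias test, can be sketched as follows. The sketch assumes that the percentage error is (forecast / actual - 1) x 100 and that the test is a one-sample t-test on the mean percentage error; the function names are hypothetical, and the p-value itself is not computed because that would need the t distribution from a statistics library.

```python
import math
import statistics


def percentage_errors(forecasts, actuals):
    """Percentage error of each forecast relative to the actual tender price."""
    return [100.0 * (f / a - 1.0) for f, a in zip(forecasts, actuals)]


def bias_and_consistency(pct_errors):
    """Return (mean, sample standard deviation, t statistic for H0: mean = 0)."""
    n = len(pct_errors)
    mean = statistics.mean(pct_errors)
    sd = statistics.stdev(pct_errors)
    return mean, sd, mean / (sd / math.sqrt(n))


def t_from_summary(mean, sd, n):
    """t statistic for H0: mean = 0 from summary figures alone."""
    return mean / (sd / math.sqrt(n))


# Summary figures reported for the JSEM for offices (42 cases).
t_jsem_office = t_from_summary(-6.88, 21.43, 42)
```

For the JSEM for offices this gives a t statistic of about -2.08 on 41 degrees of freedom, whose magnitude exceeds the two-sided 5% critical value of roughly 2.02, in line with the reported p-value of 0.04.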


Table 6-22: Summary of Means and Standard Deviations of Percentage Errors

                                Office    Private Housing  Nursing Home  School
JSEM
  Mean % error (m)              -6.88%    -2.73%           2.09%         4.08%
  SD of % error                 21.43%    29.04%           20.03%        21.25%
  p-value for t-test (H0: m=0)  0.04*     0.51             0.62          0.37

FLOOR AREA
  Mean % error (m)              5.62%     1.31%            4.20%         3.35%
  SD of % error                 27.32%    23.53%           24.45%        21.45%
  p-value for t-test (H0: m=0)  0.19      0.69             0.42          0.46

CUBE
  Mean % error (m)              0.16%     1.47%            5.75%         3.56%
  SD of % error                 26.99%    19.59%           25.21%        24.56%
  p-value for t-test (H0: m=0)  0.97      0.60             0.29          0.49

RJSEM
  Mean % error (m)              3.06%     4.84%            3.21%         3.41%
  SD of % error                 25.38%    22.64%           21.45%        20.84%
  p-value for t-test (H0: m=0)  0.44      0.14             0.48          0.44
  Predictors: Office: afp, a2fp, bft, b2ft, nsptppt; Private Housing: bft, r;
  Nursing Home: n2fpt, r; School: n2fpt, r

RASEM
  Mean % error (m)              2.96%     2.66%            3.09%         2.94%
  SD of % error                 22.15%    15.95%           21.36%        19.56%
  p-value for t-test (H0: m=0)  0.39      0.24             0.49          0.48
  Predictors: Office: n2, fpt, ppt, nspt; Private Housing: fpt, fb, spt, sb, pb;
  Nursing Home: fpt, nspt; School: fpt, nspt

LRJSEM
  Mean % error (m)              1.87%     2.27%            1.44%         2.14%
  SD of % error                 19.47%    21.14%           20.28%        19.64%
  p-value for t-test (H0: m=0)  0.54      0.45             0.74          0.61
  Predictors: Office: ln(bft), ln(b2ft), ln(nsptppt); Private Housing: ln(bft), ln(r);
  Nursing Home: ln(n2fpt), ln(r); School: ln(n2fpt), ln(r)

LRASEM
  Mean % error (m)              2.71%     1.68%            1.36%         2.07%
  SD of % error                 21.86%    17.60%           19.69%        20.06%
  p-value for t-test (H0: m=0)  0.43      0.50             0.74          0.63
  Predictors: Office: ln(n2), ln(fpt), ln(ppt), ln(nspt); Private Housing: ln(fpt), ln(spt);
  Nursing Home: ln(fpt), ln(nspt); School: ln(fpt), ln(nspt)

Remark: * - p-value < 0.05, H0 is rejected (i.e., Mean % error is significantly different from zero)


6.3.2 Normality Testing

To use the parametric tests appropriately, the forecast to
actual tender price ratios should be normally distributed. If the models have to be

transformed to fulfil the normality requirement, then the ratios for the models under

examination should be transformed on the same basis. Therefore, all of the

distributions of the ratios for the three conventional models, together with the

distribution of the ratios for one of the regressed models (either with the

untransformed variables or the transformed variables for comparison), would have to

pass the normality tests before the parametric tests could be used to ascertain

homogeneity of variance. The same requirement would also have to be applied to

the comparison between two regressed models with untransformed variables and

transformed variables.

Table 6-23 shows the p-values of the Anderson-Darling (A-D) tests for

normality. The ratios of forecast to actual tender price were used to produce the

plot and to deduce the lambda value, rather than the percentage errors, to avoid the

presence of negative values, which cannot be transformed by a logarithm or

square root. Seven distributions of the ratios of forecast to actual tender price were

found to depart significantly from normality at the 5% significance level. They

were from the floor area model and the LRASEM for offices, the JSEM and the floor

area and cube models for private housing, and the RJSEM and the RASEM for

nursing homes. To normalise these distributions, a transformation was carried out

using the Box-Cox normality plots, as is shown in Figures 6-1 to 6-7.


The best lambda (λ) values were determined from the normality plots and are

summarised in Table 6-24. If the best λ equals 1, then no transformation can further

normalise the distribution. If it equals 0.5, then a square root transformation is

suggested; if 0, then a logarithmic transformation is suggested; and if -1, then a

reciprocal transformation is suggested. As none of the lambda values for the

models in any particular type of building matches with any of the others, the ratios

for each model under the same building type were transformed according to the same

determined lambda value, with the exception of schools because all school models

support the normality assumption. The transformed ratios for each distribution

were then subjected to the A-D tests again to assess the normality of all of the

distributions of the transformed ratios. Unfortunately, the various attempts to

transform the ratios for the groups of models under comparison in sections 6.3.3 and

6.3.4 failed to normalise their distributions. Therefore, non-parametric tests were

employed for the comparisons involving the seven models that failed to fulfil the

normality requirement.
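The mapping from an estimated lambda to a "best" lambda described above can be sketched as follows. This is a minimal illustration only: the rule of snapping to the nearest of the four standard values is an assumption (the thesis reads the values from the Box-Cox plots), and the function names are hypothetical. The estimated values are those reported in Table 6-24.

```python
import math

# Standard lambda values named in the text and the transformation each implies.
STANDARD_LAMBDAS = {
    -1.0: "reciprocal",
    0.0: "logarithmic",
    0.5: "square root",
    1.0: "none",
}

def nearest_standard_lambda(estimated):
    """Snap an estimated lambda to the closest standard value (assumed rule)."""
    return min(STANDARD_LAMBDAS, key=lambda lam: abs(lam - estimated))

def box_cox(ratio, lam):
    """Box-Cox transform of a positive forecast-to-actual price ratio."""
    if ratio <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(ratio)
    return (ratio ** lam - 1.0) / lam

# Estimated lambda values as reported in Table 6-24.
estimated = {
    "JSEM, private housing": -0.03,
    "Floor Area, office": 1.33,
    "Floor Area, private housing": 0.48,
    "Cube, private housing": 0.86,
    "RJSEM, nursing home": 1.32,
    "RASEM, nursing home": -1.16,
    "LRASEM, office": -0.83,
}
best = {name: nearest_standard_lambda(value) for name, value in estimated.items()}
```

Under this assumed rule the snapped values coincide with the "Best λ-value" column of Table 6-24. Working with ratios rather than percentage errors keeps every input positive, which the transform requires.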

Table 6-23: Results of Normality Tests for Percentage Errors According to

Building and Model Types

Anderson-Darling Tests (p-value)

| Model | Office | Private Housing | Nursing Home | School |
|---|---|---|---|---|
| JSEM | 0.227 | <0.005* | 0.261 | 0.455 |
| Floor Area | <0.005* | <0.005* | 0.102 | 0.483 |
| Cube | 0.431 | 0.045* | 0.550 | 0.243 |
| RJSEM | 0.580 | 0.765 | 0.013* | 0.788 |
| RASEM | 0.728 | 0.133 | 0.022* | 0.853 |
| LRJSEM | 0.312 | 0.473 | 0.092 | 0.930 |
| LRASEM | 0.015* | 0.224 | 0.099 | 0.602 |

Remark: * denotes p-value < 0.05; H0 is rejected.


Figure 6-1: Box-Cox Plot of Percentage Errors for the Floor Area Model for

Offices

Figure 6-2: Box-Cox Plot of Percentage Errors for the LRASEM for Offices


Figure 6-3: Box-Cox Plot of Percentage Errors for the JSEM for Private

Housing

Figure 6-4: Box-Cox Plot of Percentage Errors for the Floor Area Model for

Private Housing


Figure 6-5: Box-Cox Plot of Percentage Errors for the Cube Model for Private

Housing


Figure 6-6: Box-Cox Plot of Percentage Errors for the RJSEM for Nursing

Homes

Figure 6-7: Box-Cox Plot of Percentage Errors for the RASEM for Nursing

Homes

Table 6-24: Estimated Lambda Values According to Building and Model Types

(for Models not Satisfying Normality Assumption Only)

| Model | Office: Estimated λ | Office: Best λ | Private Housing: Estimated λ | Private Housing: Best λ | Nursing Home: Estimated λ | Nursing Home: Best λ | School: Estimated λ | School: Best λ |
|---|---|---|---|---|---|---|---|---|
| JSEM | N.A. | N.A. | -0.03 | 0 | N.A. | N.A. | N.A. | N.A. |
| Floor Area | 1.33 | 1 | 0.48 | 0.5 | N.A. | N.A. | N.A. | N.A. |
| Cube | N.A. | N.A. | 0.86 | 1 | N.A. | N.A. | N.A. | N.A. |
| RJSEM | N.A. | N.A. | N.A. | N.A. | 1.32 | 1 | N.A. | N.A. |
| RASEM | N.A. | N.A. | N.A. | N.A. | -1.16 | -1 | N.A. | N.A. |
| LRJSEM | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. |
| LRASEM | -0.83 | -1 | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. |

Remark: N.A. stands for not applicable as the assumption of normality is supported.


6.3.3 Significance of Variable Transformation

As described in section 5.8.2 of Chapter 5, the variables were transformed

only in circumstances in which a transformed model would significantly improve the

performance of a forecast. According to the bias and consistency of the regressed

models in terms of the means and standard deviations of the percentage errors that

are shown in Table 6-22, the transformed model LRJSEM generally performed better

than its untransformed counterpart, the RJSEM. Although the transformed model

LRASEM also produced less biased forecasts than its untransformed counterpart, the

RASEM, the same did not apply to the measures of consistency. As the t-tests

support the hypothesis that each regressed model is zero biased, the significance

testing for the consistency between each pair of models (transformed and

untransformed) becomes crucial in judging whether models in a pair are significantly

different.

Two-sample F-tests for the homogeneity of variance were applied to the

percentage errors of every pair of the transformed and untransformed models except

the LRASEM and RASEM for offices, due to their failure to comply with the

normality assumption described in section 6.3.2. For the exception, a

Mann-Whitney U-test for homogeneity of absolute deviation was used. The results

of these tests are summarised in Table 6-25. None of the transformed models were

found to be significantly different from their untransformed counterparts.

According to the principle of parsimony, the regressed models RJSEM and RASEM

were selected for comparison with the conventional models.
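The statistic behind the two-sample F-test can be illustrated with the standard deviations reported in Table 6-22. This is a sketch only: the helper name is hypothetical, and the p-value is not computed because it depends on the F distribution and on the sample sizes, which are not restated in this section.

```python
def variance_ratio(sd_a, sd_b):
    """F statistic for two sample SDs, with the larger variance in the numerator."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)

# SDs of percentage errors (in %) taken from Table 6-22.
f_office_rjsem_vs_lrjsem = variance_ratio(25.38, 19.47)
f_housing_rasem_vs_lrasem = variance_ratio(15.95, 17.60)
```

A ratio close to 1 indicates similar consistency. The office RJSEM and LRJSEM pair gives the largest ratio of the examples above, which is in line with it having the smallest p-value (0.09) in Table 6-25.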


Table 6-25: Two-sample F-tests and Mann-Whitney U test between Regressed

Models with Untransformed Variables and with Logarithmic Transformed

Variables

p-value of significance tests

| Pair | Item | Office | Private Housing | Nursing Home | School |
|---|---|---|---|---|---|
| RJSEM & LRJSEM | p-value | 0.09 | 0.63 | 0.79 | 0.78 |
| RJSEM & LRJSEM | Statistics | F-test | F-test | F-test | F-test |
| RJSEM & LRJSEM | H0: No variance difference (reject if p < 0.05) | Accept H0 | Accept H0 | Accept H0 | Accept H0 |
| RASEM & LRASEM | p-value | 0.68 | 0.80 | 0.71 | 0.91 |
| RASEM & LRASEM | Statistics | U-test | F-test | F-test | F-test |
| RASEM & LRASEM | H0: No variance difference (reject if p < 0.05) | Accept H0 (H0: No absolute deviation difference) | Accept H0 | Accept H0 | Accept H0 |

6.3.4 Comparisons of Models

Eight groups of models were compared separately: four comprising the

RJSEM and the conventional models, and four comprising the RASEM and the

conventional models. The forecasting performance of the models under comparison

in this section is shown in Table 6-22. To distinguish the better performing model

or models in terms of their consistency, the Kruskal Wallis (K-W) tests

(non-parametric) were first applied to the six groups of models for offices, private

housing and nursing homes, and Bartlett’s test (parametric) was applied to the two

groups of models for schools. For the models that were found to be significantly

different in consistency, multiple two-sample tests were then applied. According to


Fisher’s Least Significant Difference (LSD) approach, the corrected confidence

level for each pairwise test was 99.17% (i.e., a significance level of 0.0083).

Figure 6-8 shows a graphical presentation of the results of these tests. The

four groups of models for offices and private housing were found to be significantly

different, whereas the four groups for nursing homes and schools were not.

Therefore, the former groups were examined pairwise using Mann-Whitney

U-tests. The results of the U-tests for the four groups for offices and private

housing are shown in Table 6-26.

Table 6-26: Two-sample Mann-Whitney U-tests between Models for Office and

Private Housing

Mann-Whitney U-test (at 99.17%* confidence level). H0: No difference in absolute deviation (reject if p < 0.0083).

| Comparison | Pair | Office: Z | Office: p-value | Office: Decision | Private Housing: Z | Private Housing: p-value | Private Housing: Decision |
|---|---|---|---|---|---|---|---|
| Common to both groups | JSEM and Floor Area | -2.8896 | 0.0039 | Reject H0 | -1.8544 | 0.0637 | Accept H0 |
| Common to both groups | Floor Area and Cube | -1.3240 | 0.1855 | Accept H0 | -1.6821 | 0.0926 | Accept H0 |
| Common to both groups | Cube and JSEM | -1.4493 | 0.1473 | Accept H0 | -3.0609 | 0.0022 | Reject H0 |
| With RJSEM | JSEM and RJSEM | -1.1988 | 0.2306 | Accept H0 | -2.4818 | 0.0131 | Accept H0 |
| With RJSEM | Floor Area and RJSEM | -1.6103 | 0.1073 | Accept H0 | -1.1651 | 0.2440 | Accept H0 |
| With RJSEM | Cube and RJSEM | -0.1252 | 0.9003 | Accept H0 | -0.4481 | 0.6541 | Accept H0 |
| With RASEM | JSEM and RASEM | -0.2952 | 0.7678 | Accept H0 | -4.3707 | 0.0000 | Reject H0 |
| With RASEM | Floor Area and RASEM | -2.2007 | 0.0278 | Accept H0 | -3.0126 | 0.0026 | Reject H0 |
| With RASEM | Cube and RASEM | -0.8946 | 0.3710 | Accept H0 | -1.7441 | 0.0811 | Accept H0 |

Remark: * 99.17% = (1 – 0.05/6) x 100%
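The corrected level and the decisions in Table 6-26 can be reproduced with a short sketch. It assumes that the tabulated p-values are two-sided p-values from the normal approximation of the U-test, which is consistent with the reported Z and p pairs but is not stated explicitly here. The function and variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

FAMILY_ALPHA = 0.05   # overall significance level
N_MODELS = 4          # models compared within each group

n_pairs = comb(N_MODELS, 2)                # number of pairwise comparisons
alpha_pair = FAMILY_ALPHA / n_pairs        # per-comparison significance level
confidence_pair = (1 - alpha_pair) * 100   # per-comparison confidence level (%)

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic (assumed form)."""
    return 2 * NormalDist().cdf(-abs(z))

def decision(z):
    """Decision at the corrected per-comparison level."""
    return "Reject H0" if two_sided_p(z) < alpha_pair else "Accept H0"

# Z values for private housing, copied from Table 6-26.
private_housing_z = {
    "JSEM and Floor Area": -1.8544,
    "Floor Area and Cube": -1.6821,
    "Cube and JSEM": -3.0609,
    "JSEM and RJSEM": -2.4818,
    "JSEM and RASEM": -4.3707,
    "Floor Area and RASEM": -3.0126,
}
decisions = {pair: decision(z) for pair, z in private_housing_z.items()}
```

Dividing the overall level by the number of pairs is what produces the 0.0083 threshold. Note that the JSEM and RJSEM pair for private housing (p = 0.0131) would be significant at 0.05 but is not at the corrected level.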


Figure 6-8: Tests of Homogeneity of Variances Using Bartlett’s Tests, Kruskal

Wallis Tests and Mann-Whitney U Tests

[Figure 6-8 diagram. Panels by building type: Office (Groups 1 and 2), Private Housing (Groups 3 and 4), Nursing Home (Groups 5 and 6) and School (Groups 7 and 8). Each group comprises the JSEM, the floor area model, the cube model and either the RASEM or the RJSEM. The Office and Private Housing panels report a Kruskal Wallis test (p=0.000) with a significant difference, followed by an LSD comparison of sample variances (by U-tests) plotted on a scale running from about 0.15 to 0.29. The Nursing Home panels report Kruskal Wallis tests (p=0.653 and p=0.642) and the School panels report Bartlett’s tests (p=0.757 and p=0.857), all with no significant difference.]


6.3.4.1 Models for Offices

Compared with the range of bias for the other types of buildings, the range of

bias for office models is the largest (-6.85 to 5.62%). Except for the t-tests for the

JSEM, all of the other t-tests for the office models supported the null hypothesis,

which suggests that the JSEM is the most biased model and the others are all

unbiased models, and are therefore comparable with each other.
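The bias test referred to above (H0: mean percentage error = 0) rests on a one-sample t statistic, which can be sketched as below. The sample of percentage errors is hypothetical and serves only to show the calculation. Converting the statistic to a p-value requires the t distribution with n - 1 degrees of freedom, which is left out of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(percentage_errors):
    """One-sample t statistic for H0: mean percentage error equals zero."""
    n = len(percentage_errors)
    return mean(percentage_errors) / (stdev(percentage_errors) / sqrt(n))

# Hypothetical cross-validated percentage errors, expressed as fractions.
hypothetical_errors = [-0.12, 0.05, 0.20, -0.03, 0.10, -0.08, 0.15, 0.01]
t_value = t_statistic(hypothetical_errors)
```

A model is regarded as biased when the absolute value of the statistic is large relative to the critical value for the sample size.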

The ascending order of the sample variances is the JSEM, the RASEM, the

cube model and the floor area model in Group 1 and the JSEM, the RJSEM, the cube

model and the floor area model in Group 2. The Kruskal Wallis tests for both

groups of models, as shown in Figure 6-8, rejected the notion that the models under

comparison are equal in consistency. The LSD approach of multiple pairwise

comparisons using U-tests is illustrated diagrammatically in Figure 6-8. In Group 1,

the JSEM, the RASEM and the cube model have the same potency, the RASEM and

the cube and floor area models also have the same potency and the JSEM differs

from the floor area model. Therefore, the more consistent set of models for Group

1 comprises the three comparable models: the JSEM, the RASEM and the cube

model. Similarly, the more consistent set of models for Group 2 comprises the

JSEM, the RJSEM and the cube model.

As the mean percentage error of the JSEM is significantly different from zero, the

best performing sets of models, taking into account both the bias and consistency, are

the RASEM and the cube model in Group 1, and the RJSEM and the cube model in

Group 2.


6.3.4.2 Models for Private Housing

All of the t-tests for the private housing models supported the null hypotheses

that the percentage errors of the models are not significantly different from a zero

mean.

The ascending orders of the sample variances are the RASEM, the cube

model, the floor area model and the JSEM model in Group 3, and the cube model,

the RJSEM, the floor area model and the JSEM in Group 4. As for Groups 1 and 2,

both of the Kruskal Wallis tests for the models in Groups 3 and 4 rejected the notion that

the models under comparison are equal in consistency, as is shown in Figure 6-8.

In particular, the RASEM in Group 3 attained a remarkably low standard deviation

(15.95%), that is, high consistency. In this group, the RASEM and the cube model have the same potency,

the cube and floor area models have the same potency, the floor area model and the

JSEM have the same potency, both the RASEM and the cube model differ from the

JSEM, and the RASEM differs from the floor area model. Therefore, the more

consistent set of models for Group 3 comprises the two comparable models: the

RASEM and the cube model.

In Group 4, the cube model, the RJSEM and the floor area model have the

same potency, the RJSEM, the floor area model and the JSEM have the same

potency, and the cube model differs from the JSEM. Therefore, the more consistent

set of models for Group 4 comprises the three comparable models: the cube model,

RJSEM and floor area model.
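The way in which the "more consistent set" follows from the pairwise results for Groups 1 to 4 can be sketched as follows. The selection rule, namely keeping every model that is not significantly different from the model with the lowest sample variance, is an assumed reading of the procedure, and the function name is hypothetical. The orderings and significant pairs are those given in the text and in Table 6-26.

```python
def most_consistent_set(ascending_models, significant_pairs):
    """Models not significantly different from the lowest-variance model."""
    best = ascending_models[0]
    different = {frozenset(pair) for pair in significant_pairs}
    return [
        model for model in ascending_models
        if model == best or frozenset((best, model)) not in different
    ]

# (models in ascending order of sample variance, pairs where H0 was rejected)
groups = {
    1: (["JSEM", "RASEM", "Cube", "Floor Area"], [("JSEM", "Floor Area")]),
    2: (["JSEM", "RJSEM", "Cube", "Floor Area"], [("JSEM", "Floor Area")]),
    3: (["RASEM", "Cube", "Floor Area", "JSEM"],
        [("JSEM", "RASEM"), ("Floor Area", "RASEM"), ("Cube", "JSEM")]),
    4: (["Cube", "RJSEM", "Floor Area", "JSEM"], [("Cube", "JSEM")]),
}
best_sets = {group: most_consistent_set(*spec) for group, spec in groups.items()}
```

Under this rule the resulting sets match those stated for Groups 1 to 4. The rule considers consistency only; the bias of the JSEM for offices is handled separately in the text.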


6.3.4.3 Models for Nursing Homes

As with the private housing models, all of the t-tests for the nursing home

models supported the null hypotheses that the percentage errors of the models are not

significantly different from a zero mean.

The ascending orders of the sample variances are the JSEM, the RASEM, and the

floor area and cube models in Group 5, and the JSEM, the RJSEM and the floor area and

cube models in Group 6. Moreover, the Kruskal Wallis tests for the models in

Groups 5 and 6 supported the notion that the models under comparison are equal in

consistency. Therefore, all of the models are comparable with each other in terms

of both bias and consistency for both groups.

6.3.4.4 Models for Schools

As with the private housing and nursing home models, all of the t-tests for the

school models supported the null hypotheses that the percentage errors of the models

are not significantly different from a zero mean.

The ascending orders of the sample variances are the RASEM, the JSEM, and

the floor area and cube models in Group 7, and the RJSEM, the JSEM and the floor

area and cube models in Group 8. Moreover, Bartlett’s tests for the models in

Groups 7 and 8 supported the notion that the models under comparison are equal in

consistency. Therefore, all of the models are comparable to each other in terms of

both bias and consistency for both groups.


6.3.4.5 Discussions on Model Comparisons

Amongst the eight groups, the mean percentage errors, which represent the

bias of a model, are generally quite close to zero (-0.068 to 0.058), due to the use of

cross validation and the least-squares method for deducing the model unit rates and

coefficients. The range of standard deviations of percentage errors is quite narrow

(from 0.159 to 0.328), possibly because of the exclusion of the building component

costs, such as the foundation, building services, preliminaries and contingency, from

the original tender price in the forecasting target. These were excluded because of

the similar nature of the data collected, as multi-storey reinforced concrete buildings

in Hong Kong are very similar in terms of construction methods and specifications.

The coefficient of variation (cv) that represents the general accuracy was 20%

to 30% for the JSEM, 21% to 26% for the floor area model and 19% to 33% for the

cube model. These accuracy ranges generally fall within the ranges that were

reviewed by Skitmore and Patchell (1990), i.e. 15% to 30% for the JSEM, 20% to

30% for the floor area model and 20% to 45% for the cube model.
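The coefficient of variation quoted above can be illustrated with the office figures from Table 6-27. The formula used here, the standard deviation of the percentage errors divided by the mean forecast-to-actual ratio, is an assumption inferred from the tabulated values rather than a definition restated in this section, and the function name is hypothetical.

```python
def coefficient_of_variation(mean_error_pct, sd_error_pct):
    """cv (%) as SD of % error over the mean forecast/actual ratio (assumed form)."""
    mean_ratio = 1.0 + mean_error_pct / 100.0
    return sd_error_pct / mean_ratio

# Mean and SD of percentage errors for the office models (Table 6-27).
office_cv = {
    "JSEM": coefficient_of_variation(-6.88, 21.43),
    "Floor Area": coefficient_of_variation(5.62, 27.32),
    "Cube": coefficient_of_variation(0.16, 26.99),
    "RASEM": coefficient_of_variation(2.96, 22.15),
}
```

With this form the results agree with the cv row of Table 6-27 to within rounding, for example about 23.01% for the JSEM.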

James used rather crude measures (i.e., multiplying the lowest rate by the

equal highest rate and the number of rates within a percentage group) and a small

sample size (i.e., 16 flats, 14 schools, 39 industrial buildings and 17 houses) to

conclude that the JSEM was a better model than the floor area and cube models.

When better-accepted and defined measures were used for comparison in this study,

such as bias and consistency, it was found that the conventional models are more

likely to be comparable.

The three conventional models were also compared with the RJSEM and the

RASEM separately for consistency. In a four-sample comparison, Groups 1 to 4


(the office and private housing models) were found to be significantly different in

their absolute deviation of percentage errors, whereas Groups 5 to 8 (the nursing

home and school models) were not found to be significantly different. One of the

possible causes for the lack of significant improvement in the regressed models for

nursing homes and schools is the insufficient number of candidate variables. In this

study, the number of candidates was largely reduced in these two regressed models

because of the absence of podiums (for nursing homes and schools) and basements

(for schools), and because of the procedure of excluding candidates to tackle the

multicollinearity problem. Thus, the forecast performance could probably be

further improved by identifying and including more uncorrelated candidates in the

regressed models if more information is extracted as design develops from the early

design stage to later stages.

Both the regressed and cube models were included in all of the best sets of

comparable models. However, the comparison results created ambiguities in

interpreting the models as some models, such as the RASEM and the cube model in

Group 1, the RJSEM and the cube model in Group 2, the cube model in Group 3 and

the RJSEM and floor area model in Group 4, show potency in two different sets of

comparable models. Nevertheless, it can be concluded from the LSD comparisons

that the use of the RASEM may improve the forecasts, and at least will not worsen

them.

The major concern of this research is forecasting accuracy. The evidence

shows that the forecasting models are more likely to be comparable in terms of

accuracy measures than uniquely outstanding. Hence, the hypothesis that the new

regressed models outperform the conventional forecasting models is rejected. The

type of information that is available in the early design stage is coarse and very


limited, which constricts the forecasting ability of any model, because a model can

only capture as much as the available information allows. It appears that even though more

information, such as the elevation area and the roof area, has been extracted and

used in the regressed models, the improvement is not significant enough to

distinguish them from the conventional models. Unless more information can be

brought into the modelling process or a less rigorous statistical inference is used to

distinguish the models, it is very difficult, if not impossible, to produce a significantly

outstanding model in the early design stage. As no single model performed

significantly better than all of the others together, an alternative strategy of

combining forecasts is explored in the next section.

6.4 Combining Forecasts

There is a line of research concerning the combination of multiple individual

forecasts that are produced by different forecasting models. The literature of

forecasting suggests that when different models are similar in their forecasting

accuracy, an approach that combines the different forecasts may improve accuracy.

The concept of combining forecasts is based on the implicit assumption that different

forecasting models are able to capture different aspects of the information that is

available for forecasting without knowing the underlying process (Clemen 1989).

Armstrong (2001 p.428) summarised 30 studies on the combination of

forecasts, all of which show a certain amount of gain in accuracy, and on average

there was a 12.5% reduction in forecasting error. Although the regressed models in

this research did improve the forecasting accuracy on average, and fall within the


best cluster of models in the LSD approach, they are not distinguishably better than

the conventional models in terms of forecasting accuracy amongst the eight groups

of models that were examined. The approach of combining forecasts ensures a gain

in accuracy over the average performance of the models without risking the

performance of a single model.
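One way to see why an equally weighted combination cannot be less consistent than the average of its component models is the following standard argument, offered here as a supporting sketch rather than as part of the original analysis. The symbols are introduced for illustration: e_{ij} is the percentage error of model i on project j, k is the number of models combined, and m_i and s_i are the mean and standard deviation of the errors of model i.

```latex
\bar{e}_j = \frac{1}{k}\sum_{i=1}^{k} e_{ij}, \qquad
\operatorname{mean}(\bar{e}) = \frac{1}{k}\sum_{i=1}^{k} m_i, \qquad
\operatorname{sd}(\bar{e}) \le \frac{1}{k}\sum_{i=1}^{k} s_i .
```

The equality for the mean follows from linearity, and the inequality follows from the triangle inequality for the standard deviation. The bound applies to the standard deviation of the errors; it carries over to the coefficient of variation only approximately, when the mean errors are small.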

Empirical studies on time-series forecasting methods suggest that correlations

between forecasts should be ignored in calculating the combination weights

(Newbold and Granger 1974; Makridakis and Hibon 1979; Makridakis et al. 1982,

1983). Clemen (1989) conducted a comprehensive review of the evidence, and

found equal weighting to be accurate for many types of forecasting. This evidence

leads to the conclusion made by Armstrong (2001 pp.419-424) that an equally

weighted combination of forecasts should be used when it is not certain which model

is best. Armstrong also suggests that an equal-weights rule is a reasonable starting

point, and that a trimmed mean is desirable if the combination contains five or more

models. Different weights should only be used if there is domain

knowledge or information about which method is the most accurate.
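The two combination rules mentioned above can be sketched as follows. The forecast figures are hypothetical, the function names are illustrative, and trimming exactly one forecast from each end is an assumption made for this sketch, since the amount of trimming is not specified in the text.

```python
from statistics import mean

def equal_weight_combination(forecasts):
    """Combine forecasts for one project using equal weights."""
    return mean(forecasts)

def trimmed_mean_combination(forecasts):
    """Drop the single highest and lowest forecast, then average the rest."""
    if len(forecasts) < 5:
        raise ValueError("a trimmed mean is suggested only for five or more forecasts")
    ordered = sorted(forecasts)
    return mean(ordered[1:-1])

# Hypothetical price forecasts (per unit floor area) for one project.
four_model_forecasts = [9000, 10000, 11000, 10400]
five_model_forecasts = [8000, 9800, 10000, 10200, 15000]

combined = equal_weight_combination(four_model_forecasts)
trimmed = trimmed_mean_combination(five_model_forecasts)
```

As each group in this research contains at most four models, only the equal-weight rule is relevant to the combinations reported here.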

Two types of combinations were produced for the eight groups of models in

this research, as is shown in Figure 6-8. One combines the forecasts of the best sets

of models (C1), and the other of all of the models (C or C2) under comparison. The

equal-weight rule is applied to both types. The combined forecasts for the eight

groups of models are shown in Tables G-1 to G-8 in Appendix G, and the results

from these tables are summarised in Tables 6-27 to 6-34. Tables 6-27 to 6-34 also

show the average minimum and maximum percentage error forecasts. The former

takes the average of the percentage errors of the best forecasts for all cases, as shown

in the tables in Appendix H, and the latter takes the average of the worst. All


forecasts fall within the range between the average minimum and the average

maximum.

As expected, the accuracy of the combined forecasts shows improvement

over that of the average models in every group. The accuracy (cv) gains from the

combination of forecasts over the average of the best sets of the models were in the

range of 2.41% to 17.61% and over the average of all of the models were in the range

of 4.49% to 15.87%, with averages of 9.42% and 9.33%, respectively. This

improvement is slightly less than the average reduction of 12.5% in Armstrong’s

study. Except for the RASEM for private housing in Group 3 and the cube model for

private housing in Group 4, the C1 combined forecasts did show a gain in accuracy

(i.e., the negative values in the rows “C1 effect” in Tables 6-27 to 6-34). There are

a few more exceptional cases for the C or C2 combined forecasts (according to the rows “C

effect” or “C2 effect” in Tables 6-27 to 6-34). The C1 combined forecasts

produced the best forecasts in Groups 1 and 2, and the C combined forecast produced

the best forecast in Group 8. To conclude, the combined forecasts are more

accurate than the average forecasts, and are sometimes better than the best forecasts.
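The "effect" rows and the average gains quoted above can be checked with a short calculation. The relative-change formula is inferred from the tabulated figures and is therefore an assumption. It is also assumed that, for Groups 5 to 8, where only one combination (C) exists, the same effect enters both averages.

```python
from statistics import mean

def effect(cv_combined, cv_reference):
    """Relative change in cv (%); negative means the combination is more accurate."""
    return (cv_combined - cv_reference) / cv_reference * 100.0

# Group 1: cv of the C1 combination against the cv of the JSEM (Table 6-27).
c1_vs_jsem = effect(19.62, 23.01)

# Gains over the model averages for Groups 1 to 8, as tabulated (absolute %).
best_set_gains = [17.61, 17.56, 6.00, 2.41, 8.95, 8.83, 6.78, 7.20]
all_model_gains = [15.61, 15.87, 6.94, 4.49, 8.95, 8.83, 6.78, 7.20]

average_best_set_gain = mean(best_set_gains)
average_all_model_gain = mean(all_model_gains)
```

The two averages round to 9.42% and 9.33%, in agreement with the figures quoted in the text.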

Table 6-27: Accuracy for Combined, Model Average, Minimum and Maximum

Forecasts for Group 1 Models

| Group 1 | JSEM (w) | Floor Area (x) | Cube (y) | RASEM (z) | Combined Forecasts: JSEM, Cube & RASEM (C1) | Combined Forecasts: All Four Models (C2) | Average (w, y & z) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean % Err | -6.88% | 5.62% | 0.16% | 2.96% | -1.25% | 0.47% | -1.25% | 0.47% | -1.71% | 5.06% |
| SD % Err | 21.43% | 27.32% | 26.99% | 22.15% | 19.38% | 20.65% | 23.52% | 24.47% | 11.57% | 35.80% |
| CV | 23.01% | 25.87% | 26.94% | 21.51% | 19.62% | 20.56% | 23.82% | 24.36% | 11.77% | 34.07% |
| C1 Effect | -14.72% | -24.14% | -27.17% | -8.77% | - | - | -17.61% | - | - | - |
| C2 Effect | -10.67% | -20.54% | -23.70% | -4.43% | - | - | - | -15.61% | - | - |





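Table 6-28: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 2 Models

| Group 2 | JSEM (w) | Floor Area (x) | Cube (y) | RJSEM (z) | Combined Forecasts: JSEM, Cube & RJSEM (C1) | Combined Forecasts: All Four Models (C2) | Average (w, y & z) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean % Err | -6.88% | 5.62% | 0.16% | 3.06% | -1.22% | 0.49% | -1.22% | 0.49% | -4.29% | 2.71% |
| SD % Err | 21.43% | 27.32% | 26.99% | 25.38% | 20.28% | 21.27% | 24.60% | 25.28% | 12.16% | 36.23% |
| CV | 23.01% | 25.87% | 26.94% | 24.62% | 20.53% | 21.16% | 24.90% | 25.16% | 12.71% | 35.27% |
| C1 Effect | -10.79% | -20.64% | -23.81% | -16.62% | - | - | -17.56% | - | - | - |
| C2 Effect | -8.03% | -18.19% | -21.45% | -14.04% | - | - | - | -15.87% | - | - |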




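Table 6-29: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 3 Models

| Group 3 | JSEM (w) | Floor Area (x) | Cube (y) | RASEM (z) | Combined Forecasts: Cube & RASEM (C1) | Combined Forecasts: All Four Models (C2) | Average (y & z) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean % Err | -2.73% | 1.31% | 1.47% | 2.66% | 2.07% | 0.68% | 2.07% | 0.68% | 0.63% | -1.28% |
| SD % Err | 29.04% | 23.53% | 19.59% | 15.95% | 16.70% | 20.50% | 17.77% | 22.03% | 13.26% | 30.70% |
| CV | 29.86% | 23.22% | 19.30% | 15.53% | 16.36% | 20.36% | 17.41% | 21.88% | 13.18% | 31.10% |
| C1 Effect | -45.20% | -29.54% | -15.23% | +5.34% | - | - | -6.00% | - | - | - |
| C2 Effect | -31.82% | -12.33% | +5.48% | +31.07% | - | - | - | -6.94% | - | - |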




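Table 6-30: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 4 Models

| Group 4 | JSEM (w) | Floor Area (x) | Cube (y) | RJSEM (z) | Combined Forecasts: Floor Area, Cube & RJSEM (C1) | Combined Forecasts: All Four Models (C2) | Average (x, y & z) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean % Err | -2.73% | 1.31% | 1.47% | 4.84% | 2.54% | 1.22% | 2.54% | 1.22% | 0.98% | 0.28% |
| SD % Err | 29.04% | 23.53% | 19.59% | 22.64% | 21.39% | 22.64% | 21.92% | 23.70% | 16.57% | 32.39% |
| CV | 29.86% | 23.22% | 19.30% | 21.60% | 20.86% | 22.36% | 21.38% | 23.41% | 16.41% | 32.30% |
| C1 Effect | -30.15% | -10.18% | +8.07% | -3.41% | - | - | -2.41% | - | - | - |
| C2 Effect | -25.11% | -3.71% | +15.86% | +3.55% | - | - | - | -4.49% | - | - |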



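Table 6-31: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 5 Models

| Group 5 | JSEM (w) | Floor Area (x) | Cube (y) | RASEM (z) | Combined Forecasts: All Four Models (C) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|
| Mean % Err | 2.09% | 4.20% | 5.75% | 3.09% | 3.78% | 3.78% | 3.57% | 7.55% |
| SD % Err | 20.03% | 24.45% | 25.21% | 21.36% | 20.73% | 22.76% | 12.40% | 30.00% |
| CV | 19.62% | 23.47% | 23.84% | 20.72% | 19.97% | 21.93% | 11.97% | 27.90% |
| C Effect | +1.80% | -14.90% | -16.22% | -3.63% | - | -8.95% | - | - |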




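Table 6-32: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 6 Models

| Group 6 | JSEM (w) | Floor Area (x) | Cube (y) | RJSEM (z) | Combined Forecasts: All Four Models (C) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|
| Mean % Err | 2.09% | 4.20% | 5.75% | 3.21% | 3.81% | 3.81% | 2.50% | 7.61% |
| SD % Err | 20.03% | 24.45% | 25.21% | 21.45% | 20.77% | 22.79% | 12.38% | 30.09% |
| CV | 19.62% | 23.47% | 23.84% | 20.78% | 20.01% | 21.95% | 12.08% | 27.96% |
| C Effect | +1.99% | -14.74% | -16.06% | -3.73% | - | -8.83% | - | - |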





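Table 6-33: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 7 Models

| Group 7 | JSEM (w) | Floor Area (x) | Cube (y) | RASEM (z) | Combined Forecasts: All Four Models (C) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|
| Mean % Err | 4.08% | 3.35% | 3.56% | 2.94% | 3.48% | 3.48% | 1.92% | 6.85% |
| SD % Err | 21.25% | 21.45% | 24.56% | 19.56% | 20.23% | 21.70% | 15.04% | 27.39% |
| CV | 20.41% | 20.75% | 23.72% | 19.01% | 19.55% | 20.97% | 14.75% | 25.64% |
| C Effect | -4.22% | -5.78% | -17.56% | +2.88% | - | -6.78% | - | - |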




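Table 6-34: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 8 Models

| Group 8 | JSEM (w) | Floor Area (x) | Cube (y) | RJSEM (z) | Combined Forecasts: All Four Models (C) | Average (w, x, y & z) | Min % Error (Out of Four Models) | Max % Error (Out of Four Models) |
|---|---|---|---|---|---|---|---|---|
| Mean % Err | 4.08% | 3.35% | 3.56% | 3.41% | 3.60% | 3.60% | 2.18% | 7.18% |
| SD % Err | 21.25% | 21.45% | 24.56% | 20.84% | 20.44% | 22.02% | 15.02% | 27.40% |
| CV | 20.41% | 20.75% | 23.72% | 20.16% | 19.73% | 21.26% | 14.70% | 25.56% |
| C Effect | -3.35% | -4.93% | -16.82% | -2.12% | - | -7.20% | - | - |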

6.5 Summary

Eight regressed models were built from two sets of variables, one for the

RJSEM and the other for the RASEM, for the four types of buildings – offices,

private housing, nursing homes and schools. By setting the tender price per total

floor area as the response, rather than the tender price, the average ratios of the


maximum actual response value to the minimum actual response value for each

building type were all reduced to around 2. This avoids the significantly large value

effect in modelling by the least-squares method, and improves the accuracy of the

regressed models.

Eight regressed models with different best subset variables were generated.

The selected predictors are height of the building above ground, square of the

number of storeys of the superstructure, average area per storey of the superstructure

and average perimeter on plan of the superstructure for the office RASEM (in Group

1); the number of podium storeys multiplied by the total podium floor area, elevation

area, total tower floor area and number of tower storeys multiplied by total tower

floor area for the office RJSEM (in Group 2); the average storey height of the

superstructure, average area per basement storey , average basement perimeter on

plan, average area per storey of the superstructure, and average basement storey

height for the private housing RASEM (in Group 3); the total tower floor area and

roof area for the private housing RJSEM (in Group 4); the average area per storey of

the superstructure and elevation area for the nursing home RASEM (in Group 5); the

roof area and elevation area for the nursing home RJSEM (in Group 6); the average

area per storey of the superstructure and height of building above ground for the

school RASEM (in Group 7); and the roof area and number of storeys of the

superstructure multiplied by the total floor area of the superstructure for the school

RJSEM (in Group 8).

In particular, the RJSEM and the RASEM for nursing homes are considered

to be very similar to each other, as they both contain the elevation area as one of

the predictors, and the other predictors – roof area in the RJSEM and average floor

area in the RASEM – are very similar in their observed values.


The predictors of all of the regressed models are divided into two types: floor

related (e.g., average floor area, total floor area and roof area) and non-floor related

(e.g., elevation area, storey height and number of storeys). The aggregate

contribution of the former type of predictors is generally negative (except for a few

cases in the office RJSEM and the private housing RASEM), whereas the latter type

exhibits solely positive aggregate contributions to all responses, which suggests that

the tender price per total floor area is negatively correlated to the floor area related

predictors, and is positively correlated to the non-floor area related predictors in

these models.

The variables in the regressed models, other than those that relate to podiums or

basements and therefore contain zero observed values, were logarithmically

transformed in base e. The transformed models, the LRJSEM and the LRASEM,

were tested against the original regressed models, the RJSEM and the RASEM,

respectively. The LRJSEM is generally less biased and more consistent than the

RJSEM. However, the statistical tests suggest that the transformed models are not

significantly better than their original counterparts. Therefore, the original

regressed models, being the simpler models, were chosen for further comparison

with the conventional models.

The coefficients of variation (cv) that represent general accuracy were 20% to

30% for the JSEM, 21% to 26% for the floor area model and 19% to 33% for the

cube model. These accuracy ranges generally fall within the ranges that were

reviewed by Skitmore and Patchell (1990), i.e. 15% to 30% for the JSEM, 20% to

30% for the floor area model and 20% to 45% for the cube model.


As the regressed models and the unit rates for conventional models were

generated by cross validation, the percentage errors of all the models except the

JSEM for offices did not significantly depart from a zero mean, which suggests that

most of the models are generally not biased at all, and the consistency of the models

would be a more influential indicator for distinguishing their performance.

Eight groups of models, each comprising one regressed model and the three

conventional models for the same type of buildings were formed. The models in

each group were tested for their homogeneity of variance, which is a measure of

whether the models are equally consistent. To accomplish this task, a combination

of parametric tests, such as Bartlett’s test, and non-parametric tests, such as the

Kruskal Wallis test and Mann-Whitney U test, was used. First, the models were compared as a

group. Only the models for offices and private housing (Groups 1 to 4) were found

to be significantly different in consistency. Within their own group, these models

were then compared pairwise using Fisher’s Least Significant Difference (LSD)

approach. No single model was found to be significantly more consistent than

the others. The cluster of most consistent models in Group 1 comprises the JSEM,

the RASEM and the cube model; in Group 2 comprises the JSEM, the RJSEM and

the cube model; in Group 3 comprises the RASEM and the cube model; and in

Group 4 the cube model, the RJSEM and the floor area model. Both the regressed

and cube models were included in all of the best sets of comparable models that had

the same potency. Hence, the hypothesis that the new regressed models outperform

the conventional forecasting models is rejected.

A strategy to improve forecasting accuracy by combining forecasts is

proposed. This is considered to be particularly suitable for early stage forecasting,

because the available information is usually very limited at this stage. By combining

forecasts, the different aspects of information can be captured. The combined

forecasts are always more accurate than the average, and are sometimes better than

the best forecasts. In this study, the average accuracy gain from the combination of

forecasts from the best clusters of models was 9.42%, and from all of the models

over their averages was 9.33%.

Chapter 7 Conclusions

Mistake, error, is the discipline through which we advance.

William Ellery Channing

7.1 Introduction

Forecasting methods that are adopted in practice have been criticised for

lacking theoretical support and a proper evaluation of performance. By considering

the significance of early stage forecasts, this research focuses on the development of

cost models that improve forecasting performance. The primary aim of this work is

to develop forecasting models by a systematic and logical approach. The JSEM was

chosen for further development because, as reviewed, it is the most sophisticated

conventional model applicable in the early design stage. The regressed models

developed in this study from the variables identified in the JSEM are expected to capture

more of the information depicted in sketch drawings (the only key information available

during the early design stage).

Evidence reveals that conventional forecasting methods, such as the floor

area and approximate quantities methods, are still the most widely used methods,

despite a number of alternative methods having been developed by researchers. To

promote the use of new cost models and forecasting approaches in practice, it is

crucial that practitioners appreciate the improved performance of these models and

approaches.

A secondary aim of this research is to test the hypothesis that the new

regressed models outperform the conventional models by testing the forecasting

accuracy of the former models against the latter models. A rigorous and objective

approach for comparing and examining the forecasting accuracy of the models

empirically, which has usually been overlooked in previous studies of model

development, is adopted.

7.2 Model Development

Eight regressed models were built from two sets of variables. The models

are the RJSEM and RASEM for offices, private housing, nursing homes and schools.

The performance of the forecasts was noticeably better when the tender price per

total floor area was used as the response, rather than the tender price, because the

influential effect of large tender figures was significantly reduced.

In the regression analysis, the forecasting errors for each model were

minimised by reducing the number of identified independent variables using the

cross validation approach. All of the models are found to comprise different

predictors. The only common predictor amongst the RASEMs is the average floor

area, but its effect on the response is not conclusive as its coefficients in various

RASEMs are different in their sign and magnitude.

The predictors of the eight regressed models can be divided into two types:

floor area related (e.g., average floor area, total floor area and roof area) and non-floor

area related (e.g., elevation area, storey height and number of storeys). The tender price

per total floor area is generally negatively correlated with the gross contributions of the

floor area related predictors, and positively correlated with the gross contributions of

the non-floor area related predictors in the models.

Following previous modelling studies on the improvement of forecasting

performance by transformation strategy, the variables in the best subset models were

logarithmically transformed. The statistical testing suggested that none of the

transformed models were significantly better than their original untransformed

counterparts, although there was some improvement in some cases.

To conclude, the cross validation algorithm developed in this study for

modelling the JSEM’s variables is a significant contribution, as it advances

the model building process. Although the data used in this study (the observed values for the

candidates and the response) come from only four types of

building projects, the developed methodology for modelling is also applicable to data

from other types of buildings as well as to other types of data. With the cross

validation approach, both the regressed and conventional models are examined

simultaneously on the same criterion. The approach is found to be particularly suitable for the

problem of building price forecasting, because in practice, forecasters extract the

relevant information from a pool of historical projects to make a prediction, and the

sample base for modelling in the cross validation approach corresponds to that

relevant information. The difference, however, is that practicing forecasters rely

heavily on their judgment in choosing the data, the methods for forecasting and

deciding the relationship with the tender price. The cross validation approach has

considerable intuitive appeal because it produces forecasts in a similar way to

forecasters, but it also preserves objectivity. Using the cross validation approach

under the dual stepwise selection procedure provides an automatic means of

achieving variable parsimony.
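As a minimal sketch of the cross validation idea summarised above, the following Python fragment forecasts each project from a unit rate derived from the remaining projects only. It assumes a leave-one-out scheme and a pooled price-per-floor-area rate for the floor area model. The function name and all figures are hypothetical and are not taken from the data or the algorithm used in this study.

```python
def leave_one_out_forecasts(prices, floor_areas):
    """Forecast each project's price from a unit rate built on the other projects."""
    forecasts = []
    for i in range(len(prices)):
        # Hold out project i and pool the remaining projects (assumed scheme).
        other_price = sum(p for j, p in enumerate(prices) if j != i)
        other_area = sum(a for j, a in enumerate(floor_areas) if j != i)
        unit_rate = other_price / other_area  # price per unit of floor area
        forecasts.append(unit_rate * floor_areas[i])
    return forecasts


# Hypothetical projects (illustrative values only).
prices = [1000.0, 2200.0, 2700.0]
floor_areas = [100.0, 200.0, 300.0]

forecasts = leave_one_out_forecasts(prices, floor_areas)

# Signed percentage error of each out-of-sample forecast (assumed convention).
errors = [100.0 * (f - p) / p for f, p in zip(forecasts, prices)]
print(forecasts)
print(errors)
```

Because each project is excluded from the sample that produces its own forecast, the resulting errors are out-of-sample errors, which is the property that distinguishes this approach from within-sample validation.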

7.3 Performance Validation

The coefficients of variation (cv) that represent general accuracy under the

cross validation approach were 20% to 30% for the JSEM, 21% to 26% for the floor

area model and 19% to 33% for the cube model. These accuracy ranges generally

fall within the ranges that were reviewed by Skitmore and Patchell (1990), i.e., 15%

to 30% for the JSEM, 20% to 30% for the floor area model and 20% to 45% for the

cube model.

In James’ study, the JSEM was shown empirically to outperform the floor area

and cube models. However, when bias and consistency are used as the accuracy

measures, James’ result is not supported. Models for the same type of building are

more likely to be comparable with each other, rather than superior or inferior.

As was anticipated from forecasts using the cross validation approach, the

models were generally unbiased, the only exception being the JSEM for offices.

There is no significant difference between the performance of the regressed models

(the RASEM or the RJSEM) and the three conventional models for the nursing home

and school samples. When the least significant difference (LSD) approach was

applied to make multiple comparisons amongst the models, both the regressed and

cube models appeared to fall into the best clusters of comparable models in the office

and private housing samples. Although all of the regressed models belong to the

best clusters of comparable models, the hypothesis that the new regressed models

outperform the conventional forecasting models is rejected.

The principle of parsimony is particularly important in the context of model

selection, as the number of possible models is unlimited for a given set of data. It is

logical to follow the principle that a more complicated model has to demonstrate its

benefits over a simpler model. The average and standard deviation of percentage

errors that are used in this research can give an indication, but need to be supported

by statistical inference to draw conclusions about the comparisons. The proposed

framework for comparisons in this research allows a fair judgment to be made.

The three major weaknesses in the development of cost models in previous

studies, that is, the lack of theoretical support, the deterministic emphasis of

approach, and the crude evaluation of new models, have been overcome in this

research. First of all, James’ storey enclosure model was simplified with reasonable

assumptions to fit a typical problem that can be solved by multiple linear regression.

The developed models share the common objective of minimising forecasting errors,

which is achieved mathematically by the use of the least-squares error approach for

selection of the variables and the determination of the coefficients. Secondly,

conventional approaches to cost models generally rely upon the use of historical

price data to produce single-figure (i.e., deterministic) building price forecasts, which

do not explicitly describe inherent variability and uncertainty. In the cross

validation approach that is used in this research, costs are modelled repetitively, and

the reliability of the models is measured according to the mean and standard

deviation of percentage errors (the stochastic components of forecasts). Thirdly, the

evaluation of the models was conducted with reference to a framework for the

selection of the appropriate parametric and non-parametric tests that were used to

examine the performance of the models. This framework is an exemplar that

ensures objectivity and rigour in the evaluation of models.
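The two accuracy measures referred to above can be illustrated with a short Python sketch. It assumes that the percentage error is defined as 100 × (forecast − actual) / actual, that bias is the mean of these errors and that consistency is their sample standard deviation. The sign convention, the function names and the figures are assumptions made for illustration only.

```python
from statistics import mean, stdev


def percentage_errors(forecasts, actuals):
    """Signed percentage errors, 100 * (forecast - actual) / actual (assumed convention)."""
    return [100.0 * (f - a) / a for f, a in zip(forecasts, actuals)]


def bias_and_consistency(errors):
    """Bias is the mean error; consistency is the sample standard deviation."""
    return mean(errors), stdev(errors)


# Hypothetical tender prices and forecasts (illustrative values only).
actual_prices = [100.0, 200.0, 400.0, 500.0]
forecast_prices = [110.0, 180.0, 420.0, 500.0]

pct_errors = percentage_errors(forecast_prices, actual_prices)
bias, consistency = bias_and_consistency(pct_errors)
print(f"bias = {bias:.2f}%, consistency = {consistency:.2f}%")
```

A mean close to zero indicates an unbiased model, while a smaller standard deviation indicates a more consistent one. Whether an observed difference between two models is meaningful still has to be settled by the significance tests described in the text.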

Compared with other forecasting regression models, the RASEM and RJSEM

gain an advantage over previously developed models in terms of the use of cross

validation for reliability analysis, which avoids the major problem of within-sample

validation and makes the best use of sample data; applicability, as the candidates and

predictors identified are extractable from existing cost analyses, which avoids the

subjective elements in defining and measuring qualitative variables; and the use of

statistical inference for comparing models, which provides a fair basis for the

assessment of model performance.

7.4 Combining Forecasts

A strategy to improve forecasting accuracy by combining forecasts is

proposed. This is considered to be particularly suitable for early stage forecasting,

as the information that is available at this stage is usually very limited. By combining

forecasts, the different aspects of information can be captured. The combined

forecasts are always more accurate than the average forecasts, and are sometimes

better than the best forecasts. In this research, the average accuracy gain from the

combination of forecasts over the averages was 9.42% for the best clusters of models

and 9.33% for all of the models together.
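The effect of an equally weighted combination can be sketched in Python as follows. The model names follow the text, but the forecasts and the tender price are hypothetical, and the absolute percentage error used here is an assumed measure rather than the accuracy gain reported above.

```python
def combine_equally(forecasts):
    """Equally weighted combination: the simple average of the forecasts."""
    return sum(forecasts) / len(forecasts)


def abs_pct_error(forecast, actual):
    """Absolute percentage error of a single forecast."""
    return abs(100.0 * (forecast - actual) / actual)


# Hypothetical forecasts of one project from three models (illustrative only).
actual = 1000.0
model_forecasts = {"storey enclosure": 1100.0, "floor area": 950.0, "cube": 1200.0}

combined = combine_equally(list(model_forecasts.values()))
combined_error = abs_pct_error(combined, actual)

individual_errors = [abs_pct_error(f, actual) for f in model_forecasts.values()]
average_error = sum(individual_errors) / len(individual_errors)
best_error = min(individual_errors)

# By the triangle inequality, the error of the averaged forecast cannot
# exceed the average of the individual absolute errors.
print(combined_error, average_error, best_error)
```

In this illustration the combined forecast has a smaller error than the average of the three individual errors, but not a smaller error than the best single model, which matches the behaviour described in the paragraph above.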

7.5 Implications for Practice

Although the regressed models are not distinguishably better, they are

replicable, because they are backed by the cross validation approach; are easy to use,

because they involve only a few predictors; and are fairly accurate and reliable,

because they are comparable with other models within the best clusters. If a

regressed model is chosen for prediction, then it can, on average, produce forecasts

that are at least as good as the forecasts of any of the conventional models. For

certain applications, such as when the RASEM is used for private housing, the

chance of getting better forecasts is high.

In the framework for model comparison, the use of both parametric and

non-parametric tests with reference to a selection algorithm is adopted. The

proposed selection algorithm can be applied for the comparison of forecasting

models of any kind and amongst any number of models. The use of bias and

consistency together with significance testing ensures objectivity in judging the

models.

The alternative strategy of combining forecasts has been shown to be

practical and useful. This is particularly suitable for early stage forecasting, as

models that are used at this stage are all very simple in terms of the number of

measurements and the calculation that is involved. This approach is also

economical to use, as a simple equally weighted combination of forecasts can

improve forecasting accuracy. This applies whether or not the forecaster already

knows how the individual models perform. If the forecaster can identify the better

models, then a reasonable application would be to combine the forecasts from these

models. If the better models cannot be identified, then all of the possible models

should be combined.

In practice, estimates are prepared by a number of forecasters within a

company. The combination strategy is mechanical, and avoids additional

uncertainties that are caused by the subjectivity of forecasting experts. However,

expert judgment can also be applied to combine forecasts that are produced from

different forecasters if such judgments are essential to that type of forecast.

With powerful computers and software, any modeller should be able to

follow the methodology that is detailed here to create and examine cost models, and

both experienced and inexperienced forecasters should be able to use the regressed

models that have been developed.

7.6 Model Limitations

No cost model can ever be a perfect representation of building prices, nor can

it produce forecasts with no errors. Cost models are limited by their underlying

assumptions, by their reliance on historical data for predicting future events, by the

insufficiency of information and preparation time and by their reliance on expert

judgment. Compared with the parametric approach to the development of regressed

models, there are fewer assumptions to be met in the cross validation approach.

Although the development of forecasting models is heavily influenced by the

sufficiency of data, the cross validation approach is relatively undemanding. In

terms of model building, the regressed models certainly take more time to construct

and to maintain compared with the conventional models. In terms of producing

forecasts, the regressed models are as easy to use as the conventional models, as only

a few predictors have to be measured. Subjective judgment is unavoidable, but the

proposed method for the development and testing of models is systematic and logical,

and is an effective way to avoid making unreasonable judgments. For instance, the

regressed models that are described in this research can be easily replicated by

modellers for forecasts with different sets of variables.

The models for nursing homes and schools do not differ significantly from one another.

The small number of samples that are included in these models (23 samples only)

and the relatively fewer variables that were identified because of the absence of

podiums and basements could be the causes of the potential bias in the results.

Given that the data were collected in a ten-year period and from a single practice, it

is apparent that further studies into the creation of models for nursing homes and

schools must seek to identify other potential variables. In this regard, further

information extractable as design develops from the early design stage to later stages

may help to address those variables.

This research shows only the significance of the developed models in terms

of statistics. To improve both the methodology and the combined forecast approach,

the practical significance of both should also be studied.

Previous studies have suggested that practitioners prefer to exercise their own

judgement in giving cost advice in order to demonstrate their expertise. Cost

models and forecasting approaches that complement the professional judgement of

practitioners are therefore likely to receive better recognition. Since this study

focuses on developing models based on the quantitative data extractable from sketch

drawings, the qualitative variables have not been incorporated. However, it would

be particularly worthwhile to develop the new approach of this research further by

incorporating qualitative variables that require the exercising of judgement. For

instance, in the cost models that are developed in this research, the quality of

buildings is implicitly assumed to be equal for each building type, and thus buildings

of significantly different quality were discarded from the data sample. Data on

options and intentions that represent the quality variable on a scale can be judged by

practitioners and could also be used for modelling. Furthermore, there are other

potential variables that demand expert judgement, and which could be added to the

model, including market condition, level of competition and the credit worthiness of

clients.

The combination of forecasts is limited by the three major criticisms that are

levelled at it – that it ruins traditional statistical procedures, that there is one

appropriate way to forecast, and that instead of combining forecasts, one should look

for a comprehensive model that incorporates all the relevant information.

7.7 Opportunities for Further Research

Further research could be undertaken by refining the models, or by

developing similar models for early stage forecasting. The models that are

presented here could be refined in the following ways, which are listed

for the sake of brevity.

1. Adding more relevant variables (e.g., finishing standard, building

provisions, etc.) to the models based on the information other than

sketch drawings.

2. Producing another set of interaction terms.

3. Using other variable transformations (e.g., reciprocal, square root).

4. Using other model functions (e.g., polynomial).

5. Using different types of buildings (e.g., shopping malls, public housing,

etc.).

6. Using all of the combinations of selection procedures.

7. Exploring the weighted least-squares approach.

8. Exploring other modelling techniques (e.g., artificial neural networks

(ANN), fuzzy set theory and genetic algorithms).

9. Exploring different approaches to determine the best weightings for the

combination of forecasts (e.g., trimmed means).

10. Comparing similar models that are generated from data from other

countries.

11. Comparing the use of different error measures on the performance of

forecasts.

As practical significance is as important as statistical significance, it would be

useful to assess other criteria, such as the comprehensibility and acceptability of the

regressed model. Moreover, as the quality of a designer’s forecast is related to the

way in which it is perceived by decision makers, it would be worthwhile to

focus not only on forecasting accuracy, but also on the satisfaction side of the forecasting task.

A sophisticated survey on how decision makers perceive the quality of forecasts

could also be a way of improving the forecasting function. The results of such a

study would draw the attention of both practicing forecasters and researchers to the

needs of decision makers.

Previous empirical studies on forecasting accuracy suggest that there is little

improvement, or even a decrease, in accuracy as a building project proceeds from the

early design stage to the detailed design stage. However, no study on forecasting

accuracy has ever been conducted for the same projects at different stages.

Although it would be very difficult to collect the required data, and there would be

complexities in classifying the stages of the different projects, this sort of study

would provide a clearer picture of the paradoxical results that have been thrown up

by other accuracy studies, which have found that using more information can

produce poorer forecasts.

Bibliography

Adeli, H. and Wu, M. (1998). "Regularization neural network for construction cost estimation." Journal of Construction Engineering and Management 124(1): 18-24.

Akintoye, A. S., Ajewole, O. and Olomolaiye, P. O. (1992). "Construction cost information management in Nigeria." Construction Management and Economics 10: 107-116.

Allman, I. (1988). Significant Items Estimating: Review a PSA Estimating System. Chartered Quantity Surveyor: 24-5.

Armstrong, J. S. (1985). LONG-RANGE FORECASTING From Crystal Ball to Computer, 2nd ed. A Wiley-Interscience Publication.

Armstrong, J. S. (2001). Principles of forecasting : a handbook for researchers and practitioners, Boston, MA : Kluwer Academic.

Armstrong, J. S. and Collopy, F. (1992). "Error Measures for Generalizing about Forecasting Methods: Empirical Comparisons." International Journal of Forecasting 8: 69-80.

Ashworth, A. (1999). Cost studies of buildings. 3rd ed. Harlow, England : Longman.

Ashworth, A. and Skitmore, R. M. (1983). Accuracy in estimating, The Chartered Institute of Building.

Association Industrial Consultants Limited and Business Operations Research Limited (1967). Report of the Joint Consulting Team on Serial Contracting for Road Construction, Ministry of Transport.

Atkin, B. (1987). A time/cost planning technique for early design evaluation. Building Cost Modelling and Computers. P. S. Brandon. London, E & F N Spon: 145-54.

Baker, M. J. (1974). Cost of Houses for the Aged. Department of Civil Engineering, Loughborough University of Technology.

Barnes, N. M. L. (1971). The design and use of experimental bills of quantities for civil engineering contracts, University of Manchester Institute of Science and Technology.

Barrett, A. C. (1970). Preparing a cost plan on the basis of outline proposals. Chartered Surveyor: 507-20.

Barrie, D. S. and Paulson, B. C. (1978). Professional Construction Management, McGraw-Hill Book Co., New York.

Bathurst, E. P. and Butler, D. A. (1977). Building Cost Control Techniques and Economics, Heinemann.

BCIS (1969). Standard form of cost analysis, The Royal Institution of Chartered Surveyors.

Beeston, D. T. (1974). One statistician's view of estimating, London: Building Cost Information Service.

Beeston, D. T. (1983). Statistical methods for building price data. London ; New York, E & F N Spon.

Beeston, D. T. (1987). A future for cost modelling. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 15-24.

Bennett, J. and Ferry, D. (1987). Towards a Simulated Model of the Total Construction Process. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 377-86.

Bennett, J. and Omerod, R. N. (1984). "Simulation Applied to Construction Projects." Construction Management and Economics 2: 225-63.

Bennett, J., Morrison, N. and Stevens, S. (1979). Construction cost data base - a report prepared on behalf of PSA by University of Reading, Department of Environment (Directorate of Quantity Surveying), Property Services Agency Library.

Bennett, J., Morrison, N. and Stevens, S. (1980). Construction cost data base - second annual report, Department of Environment (Directorate of Quantity Surveying), Property Services Agency Library.

Berny, J. and Howes, R. (1987). Project Management Control Using Growth Curve Models Applied to Budgeting, Monitoring and Forecasting Within the Construction Industry. Management Construction Worldwide. P. R. Landsley and P. A. Harlow. London, E & FN Spon. 1: Systems for Managing Construction: 304-13.

Birnie, J. (1995). The Possible Effects of Human Bias in Construction Cost Prediction. Proceedings of 11th Annual ARCOM Conference, University of York, September.

Blackhall, J. D. (1974). The Application of Regression Modelling to the Production of a Price Index for Electrical Services. Department of Civil Engineering, Loughborough University of Technology.

Bode, J. (1998). "Neural networks for cost estimation." Cost Engineering 40(1): 25-30.

Bon, R. (1989). Building as an Economic Process: An Introduction to Building Economics.

Bon, R. (2001). “The future of building economics: a note.” Construction Management and Economics 19: 255-258.

Boussabaine, A. and Elhag, T. (1999). Knowledge Discovery in Residential Construction Project Cost Data. ANNUAL CONFERENCE- ARCOM 1999 15TH.

Bowen, P. A. (1984). Applied econometric cost modelling. Proceedings, 3rd International Symp on Building Economics, CIB W-55, Ottawa.

Bowen, P. A. and Edward, P. J. (1998). "Building Cost Planning and Cost Information Management in South Africa." International Journal of Procurement(June): 16-25.

Bowen, P. A. and Edwards, P. J. (1985a). "Cost Modelling and Price Forecasting: Practice and Theory in Perspective." Construction Management and Economics(3): 199-215.

Bowen, P. A. and Edwards, P. J. (1985b). A Conceptual Understanding of the Paradigm Shift in Cost Modelling Techniques Used in the Economics of Building. Durban, Department of Quantity Surveying and Building Economics, University of Natal.

Bowen, P. A., Wolvaardt, J. S. and Taylor, R. G. (1987). Cost Modelling: a Process-Modelling Approach. Building Cost Modelling and Computer. P. S. Brandon, E & F N Spon: 387-395.

Braby, R. H. (1975). Costs of high-rise buildings. Building Economist. 14: 84-6.

Brandon, P. S. (1978). A Framework for Cost Exploration and Strategic Cost Planning in Design. Building and Quantity Surveying Quarterly. 5: 60-3.

Brandon, P. S., Basden, A., Hamilton, I. W. and Stockley, J. E. (1988). Application of Expert System to Quantity Surveying, N B S Services Ltd.

Brandon, P. S., Ed. (1982). Building cost research: need for a paradigm shift? Building cost techniques: New Directions, E & FN Spon.

Brown, H. W. (1987). Predicting the Elemental Allocation of Building Costs by Simulation with Special Reference to the Cost of Building Service Elements. Building Cost Modelling and Computer. P. S. Brandon, E & F N Spon: 397-406.

Buchanan, J. S. (1969). Development of a Cost Model for the Reinforced Concrete Frame of a Building. Department of Civil Engineering, Loughborough University of Technology.

Buchanan, J. S. (1972). Cost Models for Estimating: Outline of the Development of a Cost Model for the Reinforced Concrete Frame of a Building. London, RICS.

Cartlidge, D. P. and Mehrtens, I. N. (1982). Practical cost planning : a guide for surveyors and architects. London, Hutchinson.

Chau, K. W. (1995). "Monte Carlo simulation of construction costs using subjective data." Construction Management and Economics 13: 369-83.

Chau, K. W. (1995). "The validity of the triangular distribution assumption in Monte Carlo simulation of construction costs: empirical evidence from Hong Kong." Construction Management and Economics 13(1): 15-21.

Cheong, P. F. (1991). Accuracy in design stage cost estimating, National University of Singapore.

Clark, W. and Kingston, J. (1930). The Skyscraper: A Study in the Economic Height of Modern Office Buildings, American Institute of Steel Construction, New York.

Clemen, R. T. (1989). "Combining forecasts: A review and annotated bibliography." International Journal of Forecasting 3: 379-391.

Coates, D. (1974). Estimating for French Drains - A Computer Based Model. Department of Civil Engineering, Loughborough University of Technology.

Connaughton, J. and Meikle, J. (1991). The Future Role of the Chartered Quantity Surveyor. London, The Royal Institution of Chartered Surveyors, Quantity Surveyors Division.

Cusack, M. M. (1985). Optimization of Time and Cost. Project Management. 3: 50-4.

Cusack, M. M. (1987). An Integrated Model for the Control of Costs, Duration and Resources on Complex Projects. Proceedings of the Fourth International Symposium on Building Economics, Section C: Resource Utilisation, Copenhagen, SBI.

de Neufville, R., Hani, E. N. and Lesage, Y. (1977). "Bidding model: effects of bidders' risk aversion." Journal of the Construction Division ASCE 103(CO1): 57-70.

Department of Environment (DOE) (1971). Local Authority Offices: Areas and Costs, DOE, London.

Diekmann, J. E. (1983). "Probabilistic estimating: mathematics and applications." Journal of Construction Engineering and Management, ASCE 109: 297-308.

Dreger, G. T. (1988). Cost Management Models for Design Application. Transactions of the American Association of Cost Engineers, Morgantown: AACE.

Drew, D. (1995). The Effect of Contract Type and Size on Competitiveness in Construction Contract Bidding. PhD Thesis. Department of Civil Engineering, University of Salford.

Dunican, P. (1960). Structural Steelwork and Reinforced Concrete for Framed Buildings. The Chartered Surveyor. August 1960: 74-77.

Edwards, A. W. F. (2001). 7 Occam's bonus. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 128-132.

Efron, B. (1982). The jackknife, the bootstrap, and other resampling plans. Philadelphia, Pa., Society for Industrial & Applied Math.

Ellis, C. and Turner, A. (1986). Procurement Problems. Chartered Quantity Surveyor: 11.

Emsley, M. W., Lowe, D. J., Duff, A. R., Harding, A. and Hickson, A. (2002). "Data modelling and the application of a neural network approach to the prediction of total construction costs." Construction Management and Economics 20(6): 465-472.

Fausett, L. V. (2002). Numerical methods using MathCAD. Upper Saddle River, N.J., Prentice Hall.

Ferry, D. J., Brandon, P. S. and Ferry, J. D. (1999). Cost planning of buildings, 7th ed. Blackwell Science.

Fine, B. (1980). Construction Management Laboratory, Fine, Curtis, Gross.

Fine, B. and Hackermar, G. (1970). Estimating and bidding strategy. Building Technology and Management: 8-9.

Fisher, R. A. (1922). "On the mathematical foundations of theoretical statistics." Philosophical Transactions of the Royal Society A 222: 309-368.

Flanagan, R. and Norman, G. (1978). The relationship between construction price and height. Chartered Surveyor B and QS Quarterly: 69-71.

Flanagan, R. and Norman, G. (1983). "The accuracy and monitoring of quantity surveyors' price forecasting for building work." Construction Management and Economics: 157-180.

Flanagan, R. and Tate, B. (1997). Cost control in building design : an interactive learning text. Oxford, Blackwell Science.

Forster, M. R. (2001). 5 The new science of simplicity. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 83-119.

Fortune, C. (1999). "Quality issues in building project price forecasting - factors affecting model selection." Journal of Construction Procurement 5(2): 129-140.

Fortune, C. and Hinks, J. (1997). Model Selection Criteria in Building Project Price Forecast. Annual conference; 13th Conference Volume Title Association of Researchers in Construction Management, Cambridge, Association of Researchers in Construction Management.

Fortune, C. and Hinks, J. (1998). "Strategic building project price forecasting models in use - paradigm shift postponed." Journal of Financial Management of Property and Construction 3(1): 3-26.

Fortune, C. and Lees, M. (1989). An investigation into Methods of Early Cost Advice for Clients, Salford College of Technology.

Fortune, C. and Lees, M. (1994). Early cost advice for clients: the practitioners' view. ANNUAL CONFERENCE- ARCOM 1994 10th.

Fortune, C. and Lees, M. (1996). The relative performance of new and traditional cost models in strategic advice and clients, The Royal Institution of Chartered Surveyors.

Fox, J. (1997). Applied regression analysis, linear models and related models. Thousand Oaks, Calif.: Sage Publications.

Garson, G. D. (2004). Structural Equation Modelling. [Internet] North Carolina State University. Available from: <http://www2.chass.ncsu.edu/garson/pa765/structur.htm> [Accessed 15 July 2004]

Gehring, H. and Narula, S. (1986). Project Cost Planning with Qualitative Information. Project Management. 4: 61-5.

Gould, P. R. (1970). The Development of a Cost Model for H&V and A. C. Installations in Buildings. Department of Civil Engineering, Loughborough University of Technology.

Goutte, C. (1997). “Note on free lunches and cross-validation.” Neural Computation 9(6): 1246-9.

Gray, C., Ed. (1982). Analysis of the Preliminary Element of Building Production Cost. Building cost techniques: New Directions, E & FN Spon.

Grinyer, R. H. and Whittaker, J. D. (1973). "Managerial judgement in a competitive bidding model." Operational Research Quarterly 24(2): 181-191.

Gunner, J. C. (1997). A Model of Building Price Forecasting Accuracy. Department of Surveying. Salford, University of Salford.

Gunner, J. C. and Skitmore, R. M. (1999). "Comparative analysis of pre-bid forecasting of building prices based on Singapore data." Construction Management and Economics 17(5): 635-646.

Hanscomb Associates (1984). Area Cost Factors, Report of the US Army Corps of Engineers, Hanscomb Associates Inc, 600 West Peachtree Street, NW, Suite 1400, Atlanta, Georgia 30308, USA.

Hardcastle, C. (1984). The relationship between cost communications and price prediction in the evaluation of building design. Proceedings 3rd Int Symp on Build Econ, CIB W-55, Ottawa.

Harvey, J. (1979). Competitive Bidding on Canadian Public Construction Contracts: Stochastic Analysis for Optimization, University of Ontario: 102.

Hettmansperger, T. P. and McKean, J. W. (1998). Robust Nonparametric Statistical Methods, London: Arnold.

Hillebrandt, P. M. (1985). Economic Theory and the Construction Industry, MacMillan, 2nd ed. London.

Holes, L. G. (1987). Holistic Resource and Cost Modelling. Building Cost Modelling and Computers. P. S. Brandon. London, E & F N Spon: 221-27.

Holes, L. G. and Thomas, R. (1982). General purpose cost modelling. Building Cost Techniques - New Directions. P. S. Brandon, E & F N Spon: 220-7.

Jaccard, J. and Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage Publications.

Jaggar, D., Ross, A., Smith, J. and Love, P. (2002). Building Design Cost Management, Blackwell Publishing.

James, W. (1954). "A New Approach to Single Price Rate Approximate Estimating." RICS Journal XXXIII (XI)(May): 810-24.

Jupp Mansfield & Partners (1981). "Reliability of detailed cost data for estimating during detailed design (unpublished)."

Karshenas, S. (1984). "Predesign Cost Estimating Method for Multistory Buildings." Journal of Construction Engineering and Management, ASCE 110(1): 79-86.

Kenley, R. and Wilson, O. D. (1986). "A construction project cash flow model – an idiographic approach." Construction Management and Economics 4: 213-232.

Kim, G. H., Yoon, J. E., An, S. H., Cho, H. H. and Kang, K. I. (2004). "Neural network model incorporating a genetic algorithm in estimating construction costs" Building and Environment 39(11): 1330-1340.

Khosrowshahi, F. (1988). Construction Project Budgeting and Forecasting. Transactions of the American Association of Cost Engineers, Morgantown: AACE.

Khosrowshahi, F. and Kaka, A. P. (1996). "Estimation of Project Total Cost and Duration for Housing Projects in the U.K." Building and Environment 31(4): 375-383.

Kiiras, J. (1987). NCCS, Normal Cost Control System for Finnish Building Projects. Proceedings of the Fourth International Symposium on Building Economics, Session B: Design Optimisation, Copenhagen, SBI.

Kleinbaum, D. G., Kupper, L. L., et al. (1998). Applied regression analysis and other multivariable methods. 3rd ed. Boston, Mass., PWS-Kent.

Kouskoulas, V. and Koehn, E. (1974). "Predesign cost estimating function for buildings." Journal of Construction Division, ASCE: 589-604.

Langston, C. A. (1983). Computerised Cost Planning Techniques. The Building Economist. 21: 171-73.

Lehmann, E. L. (1959). Testing Statistical Hypotheses, New York: John Wiley.

Li, H. (1995). "Neural networks for construction cost estimation." Building Research and Information 23(5): 279-284.

Lu, Q. (1988). Cost estimation based on theory of fuzzy sets and prediction techniques - an expert system approach. Construction Contracting in China, Department of Civil and Structural Engineering, Hong Kong Polytechnic: 113-25.

MacCaffery (1978). Tender-price prediction for UK buildings - a feasibility study. Department of Civil Engineering, Loughborough University of Technology.

Makridakis, S. and Hibon, M. (1979). "Accuracy of Forecasting: An Empirical Investigation." Journal of the Royal Statistical Society Series A, Vol. 142(Part 2): 79-145.

Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E. and Winkler, R. (1982). "The accuracy of extrapolation (time series) methods: Results of a forecasting competition." Journal of Forecasting 1: 111-153.

Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E. and Winkler, R. (1983). The Forecasting Accuracy of Major Time Series Methods, London: Wiley.

Male, S. P. (1990). "Professional Authority, Power and Emerging Forms of 'Profession' in Quantity Surveying." Construction Management and Economics 8: 191-204.

Marr, K. F. (1974). Standards for Construction Cost Estimating. American Association of Cost Engineers, Transactions 1974.

Mathur, K., Ed. (1982). A Probabilistic Planning Model. Building cost techniques: New Directions, E & FN Spon.

Maver, T. (1970). "A Theory of Architectural Design in which the Role of the Computer is Identified." Building Science 4: 199-207.

Maver, T. (1979). Cost Performance Modelling. Chartered Quantity Surveyor. 2: 111-15.

Mayer, J. F. and Robinson, C. (1988). Accuracy of estimating - Summary of comparisons made between four estimating stages, Building Research Establishment, Department of Environment.

McCaffer, R. (1975). Some examples of the use of regression analysis as an estimating tool. Quantity Surveyor: 81-86.

McCaffer, R. (1976). Contractor's bidding behaviour and tender price prediction. Department of Civil Engineering, Loughborough University of Technology.

McCaffer, R., McCaffrey, M. J. and Thorpe, A. (1984). "Predicting the tender price of buildings during early stage design: method and validation." J. Opl Res. Soc. 35(5): 415-424.

McLachlan, G. J. (1987). Advances in multivariate statistical analysis. A. K. Gupta. Dordrecht, Holland, D. Reidel.

Meijer, R. F. (1987). Cost Modelling of Archetypes. Building Cost Modelling and Computers. P. S. Brandon. London, E & FN Spon: 223-31.

Meyrat, R. F. (1969). Algebraic Calculation of Cost Price. BUILD International: 27-36.

Moore, G. and Brandon, P. S. (1979). A Cost Model for Reinforced Concrete Frame Design. Chartered Quantity Surveyor. October 1979: 40-44.

Morrison, N. A. D. (1983). The Cost Planning and Estimating Techniques Employed by the Quantity Surveying Profession. Department of Construction Management, University of Reading.

Morrison, N. A. D. (1984). "The accuracy of quantity surveyors' cost estimating." Construction Management and Economics(2): 57-75.

Morton, R. and Jaggar, D. (1995). Design and the economics of building. London, E & FN Spon.

Moyles, B. F. (1973). An Analysis of the Contractors' Estimating Process. Department of Civil Engineering, Loughborough University of Technology.

Munns, A. K. and Al-Haimus, K. M. (2000). "Estimating using cost significant global cost models." Construction Management and Economics 18(5): 575-585.

Nadel, E. (1967). "Parameter cost estimates." Engineering News Record(16 Mar): 112-23.

Neale, R. H. (1973). The Use of Regression Analysis as a Contractor's Estimating Tool. Department of Civil Engineering, Loughborough University of Technology.

Newbold, P. and Granger, C. W. J. (1974). "Experience with forecasting univariate time series and the combination of forecasts." Journal of Royal Statistical Society Series A(137): 131-149.

Newton, S. (1983). Analysis of Construction Economics. School of Architecture and Building Science, University of Strathclyde.

Newton, S. (1988). Cost modelling techniques in perspective. Transactions 32nd Ann meeting of the AACE and the 10th Int cost engineering congress, New York, AACE.

Newton, S. (1990). "An agenda for cost modelling research." Construction Management and Economics.

Ogunlana, O. and T. Thorpe (1987). "Design phase cost estimating. The state of art." International Journal of Construction Management and Technology 2(4): 34-47.

Ogunlana, S. and Thorpe, A. (1991). "Factor affecting the accuracy of cost estimates: developing correct associations." Building and Environment 26(2): 77-86.

Park, R. E. (1988). Parametric Software Cost Estimation with an Adaptable Model. Transactions of the American Association of Cost Engineers, Morgantown: AACE.

Patchell, B. R. T. (1987). The implementation of cost modelling theory. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 233-42.

Pegg, I. (1984). Cost Study F38a: The effect of location and other measurable parameters on tender levels, Building Cost Information Service, Royal Institution of Chartered Surveyors: 13-27.

Pegg, I. (1987). Computerised Approximate Estimating from the BCIS On-line Data Base. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 243-9.

Pena, W. and Parshall, S. A. (2001). Problem seeking: an architectural programming primer. 4th ed. New York, Wiley.

Pitt, T. (1982). The Identification and Use of Spend Units in the Financial Monitoring and Control of Construction Projects. Building Cost Techniques - New Directions. P. S. Brandon, London: E & F N Spon: 255-62.

Popper, K. (1959). The Logic of Scientific Discovery, New York: Science Editions.

Powell, J. and Chisnall, J. (1981). Getting early estimates right. Chartered Quantity Surveyor: 279-281.

Proctor, C. J., Bowen, P. A., Le Roux, G. K. and Fielding, M. J. (1993). Client and Architect Satisfaction with Building Price Advice: An Empirical Study. CIB W55/W95 International Symposium on Economic Evaluation and the Built Environment, Lisbon.

Property Services Agency (PSA) (1977). Early cost advice (B & CE elements) - offices, sleeping quarters, HMSO.

Property Services Agency (PSA) (1987). Significant items estimating. Croydon, Property Services Agency.

Quah, L. K. (1992). "Comparative variability in tender bids for refurbishment and new build work." Construction Management and Economics 10: 263-69.

Raftery, J. (1984a). Some problems of data collection and model validation. Paper for Research Seminar, Liverpool Polytechnic.

Raftery, J. (1984b). Models in building economics: a conceptual framework for the assessment of performance. Proceedings 3rd International Symposium on Building Economics, Ottawa, CIB W-55.

Raftery, J. (1987). The state of cost/price modelling in the construction industry: a multicriteria approach. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 49-71.

Raftery, J. (1991). Principles of Building Economics, BSP Professional Books.

Raftery, J. (1995). Property and Construction Economics as the Study of Human Behaviour in Exchange. Keynote Address to the International Conference on Financial Management of Property and Construction, Newcastle, Co. Down, Northern Ireland, 165-75.

Ray-Jones, A. and Clegg, D. (1976). CI/SfB construction indexing manual. 3rd ed. London, RIBA Publications.

Regdon, G. (1972). "Pre-determination for Housing Cost." BUILD International (March/April 1972): 94-99.

Ross, E. (1983). A database and computer system for tender price prediction by approximate quantities, Loughborough University of Technology.

Royal Institute of British Architects (RIBA) (1991). Architect's Handbook of Practice Management, RIBA Publications.

Royal Institution of Chartered Surveyors (RICS) (1992). The core skills and knowledge base of the quantity surveyor, The Royal Institution of Chartered Surveyors.

Royal Institution of Chartered Surveyors (RICS) Junior Organisation (1964). The effect of shape and height on building cost. The Chartered Surveyor.


Runeson, G. and Bennett, J. (1983). "Tendering and the price level in the New Zealand building industry." Construction Papers 2(2): 29-35.

Russell, A. D. and Choudhary, K. T. (1980). "Cost optimisation of buildings." Journal of Structural Division, American Society of Civil Engineers(January): 283-300.

Schofield, D., Raftery, J. and Wilson, A. (1982). An Economic Model of Means of Escape Provision in Commercial Buildings. Building Cost Techniques: New Directions. P. S. Brandon. London, E & FN Spon: 210-220.

Seeley, I. H. (1996). Building economics: appraisal and control of building design cost and efficiency. 4th ed. Houndmills, Basingstoke, Hampshire, Macmillan.

Selinger, S. (1988). Computerized Parametric Estimating. British-Israeli Seminar on Building Economics, Haifa: The Building Research Station: 160-67.

Sidwell, A. C. and Woottoon, A. H. (1984). Operation Estimating. Organizing and Managing Construction. V. K. Handa. Ontario, University of Waterloo. 3: Developing Countries Research: 1015-20.

Sierra, J. E. E. (1982). "A Statistical Analysis of Low Rise Office Accommodation Investment Packages." The Building Economist(March): 175-78.

Simon, H. A. (2001). 3 Science seeks parsimony, not simplicity: searching for pattern in phenomena. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 32-72.

Singh, S. (1989). Computer Model for Cost Estimation of Structures in High Rise Commercial Buildings. Proceedings of the 25th Annual Conference of the Associated Schools of Construction, 1989.

Skitmore, M. (1981). Bidding dispersion - an investigation into a method of measuring the accuracy of building cost estimates. Department of Civil Engineering, University of Salford.

Skitmore, M. (1982). A Bidding Model. Building Cost Techniques - New Directions. P. S. Brandon, London: E & F N Spon: 175-8.

Skitmore, M. (1992). "Parameter prediction for cash flow forecasting models." Construction Management and Economics 10: 397-413.

Skitmore, M. (2002). "Raftery curve construction for tender price forecast." Construction Management and Economics 20: 83-89.

Skitmore, M. and Drew, D. (2003). "The analysis of pre-tender building price forecasting performance: a case study." Engineering, Construction and Architectural Management 10(1): 36-42.

Skitmore, M., Stradling, S., Tuohy, A. and Makwezalamba, H. (1990). The Accuracy of Construction Price Forecasts: A Study of Quantity Surveyors' Performance in Early Stage Estimating, The University of Salford.


Skitmore, R. M. (1985). The influence of professional expertise in construction price forecasts, Department of Civil Engineering, University of Salford.

Skitmore, R. M. (1988). Fundamental research in building and estimating. Transactions CIB British-Israeli seminar on building economics, Haifa, Israel Building Research Station.

Skitmore, R. M. (1991). Early Stage Construction Price Forecasting - A Review of Performance, The Royal Institution of Chartered Surveyors.

Skitmore, R. M. and Marston, V. K. (1999). Cost Modelling. London, E & FN Spon.

Skitmore, R. M. and Patchell, B. R. T. (1990). Development in contract price forecasting and bidding techniques. Quantity Surveying Techniques: New Directions. P. S. Brandon, Blackwell Scientific: 75-120.

Skitmore, R. M. and Tan, S. H. (1988). Factors affecting the accuracy of engineers' estimates. Transactions 10th International Cost Engineering Congress, The American Association of Cost Engineers, paper B-3, American Association of Cost Engineers, Morgantown.

Sober, E. (2001). 2 What is the problem of simplicity? Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 13-31.

Southwell, J. (1971). Building Cost Forecasting, Royal Institution of Chartered Surveyors.

Spanos, A. (2001). 11 Parametric versus non-parametric inference: statistical models and simplicity. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 181-206.

Spooner, J. E. (1974). "Probabilistic estimating." Journal of Construction Division, ASCE: March.

Sprent, P. (1993). Applied Non-Parametric Statistical Methods, Chapman & Hall.

Stangl, W. (1997). Ockham's razor also Occam's razor. "Pluralitas non est ponenda sine necessitate". [Internet] Johannes Kepler University (JKU), Linz, Austria. Available from: <http://paedpsych.jk.uni-linz.ac.at/INTERNET/ARBEITSBLAETTERORD/PHILOSOPHIEORD/Occam.html> [Accessed 15 July 2004]

Steyert, R. S. (1972). The Economics of High Rise Apartment Buildings of Alternate Design Construction, Construction Research Council, American Society of Civil Engineers.

Stone, P. (1963). Housing, Town Development Land and Costs. London.

Taylor, R. G. (1984). A critical examination of quantity surveying techniques in cost appraisal and tendering within the building industry. Department of Quantity Surveying and Building Economics, University of Natal.


Tan, S. H. (1988). An investigation into the accuracy of cost estimates during the design stages of construction projects. Department of Civil Engineering, The University of Salford.

Tan, W. (1999). "Construction cost and building height." Construction Management and Economics 17(2): 129-32.

Thng, S. H. (1989). Estimating accuracy at public sector - housing development board, National University of Singapore.

Thompson, P. A. and Willmer, G. (1985). CASPAR - A Program for Engineering Project Appraisal and Management. Proceedings, 2nd International Conference on Civil and Structural Engineering Computing, London.

Thomsen, C. (1965). "How high to rise." AIA Journal(April 1965): 66-68.

Thomsen, C. (1966). "How high to rise." Appraisal Journal 34(4): 585-91.

Townsend, P. R. F. (1978). The effect of design decisions on the cost of office development. Chartered Surveyor - Building and Quantity Surveying Quarterly: 53-6.

Tregenza, T. (1972). "Association between building height and cost." Architects' Journal 156(44): 1031-2.

Turney, P. D. (1990a). "The curve fitting problem - a solution." British Journal for the Philosophy of Science 41: 509-530.

Tversky, A. and Kahneman, D. (1974). "Judgement Under Uncertainty: Heuristics and Biases." Science 185: 1124-1131.

Walker, D. H. T. (1988). Using Spreadsheets for Simulation Studies. The Building Economist. 27: 14-5.

Wall, D. M. (1997). "Distribution and correlations in Monte Carlo simulation." Construction Management and Economics 15: 241-58.

Warszawski, A. (2003). “Parametric analysis of the financing cost in a building project.” Construction Management and Economics 21(5): 447-459.

Weight, D. (1987). Patterns Cost Modelling. Building Cost Modelling and Computers. P. S. Brandon. London, E & FN Spon: 257-66.

Wilderness Group (1964). An investigation into building cost relationships of the following design variables: storey height, floor loading, column spacing, number of storeys, The Royal Institution of Chartered Surveyors: 253-71.

Wilson, A. J. and Templeman, A. B. (1976). "An Approach to the Optimal Thermal Design of Office Buildings." Building and Environment 11(1): 39-50.

Wilson, A. J., Ed. (1982). Experiments in probabilistic cost modelling. Building cost techniques: New Directions, E & FN Spon.

Wilson, O. D., Sharpe, K. and Kenley, R. (1987). "Estimates Given and Tenders Received." Construction Management and Economics 5(3).


Woodhead, W. D., Rahilly, M., Salomonsson, G. D. and Tait, R. (1987). An Integrated Cost Estimating System for House Builders. Proceedings of the Fourth International Symposium on Building Economics, Session B: Design Optimisation, Copenhagen, SBI.

Yokoyama, K. and Tomiya, T. (1988). The Integrated Cost Estimating Systems Technique for Building Costs. Transactions of the American Association of Cost Engineers, Morgantown: AACE.

Zahry, M. (1982). Capital cost prediction by multi-variate analysis. School of Architecture, University of Strathclyde.

Zellner, A., Keuzenkamp, H. A. and McAleer, M. (2001). Simplicity, inference and modelling: keeping it sophisticatedly simple. Cambridge, Cambridge University Press.


Appendix D: Forecasts by Cross Validation Using Conventional Models


Table D-1: Forecasts by Cross Validation Using the Conventional Models for Offices

      Model Quantity                 Model Rate              Forecasted Tender Sum
Case      James    Floor       Cube   James   Floor    Cube          James          Floor           Cube
           (m²)     (m²)       (m³)  ($/m²)  ($/m²)  ($/m³)          (HK$)          (HK$)          (HK$)
   1     50,550   12,213     43,744   1,175   5,861   1,455     59,372,873     71,583,498     63,662,214
   2     38,811    9,176     31,035   1,175   5,859   1,455     45,585,401     53,760,591     45,143,231
   3     83,694   23,330     78,378   1,174   5,877   1,457     98,277,121    137,105,039    114,181,365
   4     46,142   10,491     32,639   1,175   5,860   1,454     54,209,728     61,471,200     47,455,012
   5     58,593   12,620     47,163   1,176   5,862   1,456     68,878,692     73,975,591     68,663,684
   6    160,425   50,940    163,404   1,172   5,912   1,459    187,972,498    301,140,203    238,433,705
   7    110,466   25,052     92,362   1,176   5,871   1,457    129,899,266    147,087,959    134,580,309
   8     34,558    6,520     27,926   1,172   5,838   1,451     40,494,985     38,065,902     40,527,931
   9    941,061  130,060    637,140   1,183   5,652   1,433  1,112,835,911    735,117,042    912,914,988
  10    185,100   42,250    192,065   1,154   5,773   1,440    213,696,762    243,903,254    276,534,754
  11    193,526   47,820    224,431   1,162   5,824   1,455    224,821,862    278,494,684    326,526,162
  12     51,123   10,011     36,773   1,173   5,843   1,451     59,953,119     58,499,117     53,371,035
  13     54,357   10,298     36,571   1,175   5,854   1,454     63,873,689     60,283,413     53,160,909
  14  1,363,377  295,360  1,126,816   1,194   6,050   1,486  1,627,239,786  1,786,965,675  1,673,999,827
  15    115,367   24,600    109,215   1,163   5,799   1,444    134,126,404    142,666,828    157,664,365
  16     25,137    5,585     20,441   1,172   5,843   1,451     29,458,901     32,629,344     29,669,766
  17    306,948   81,700    367,355   1,168   5,900   1,476    358,504,079    481,992,100    542,082,691
  18     38,140    8,492     28,486   1,175   5,859   1,455     44,812,745     49,753,951     41,436,441
  19     18,267    3,680     12,070   1,173   5,847   1,453     21,432,969     21,518,551     17,532,517
  20    118,746   20,254     94,403   1,169   5,814   1,448    138,842,813    117,748,657    136,671,495
  21     24,461    5,490     18,283   1,175   5,857   1,455     28,735,303     32,155,761     26,594,149
  22     18,527    3,828     12,631   1,174   5,851   1,453     21,749,466     22,397,117     18,357,240
  23    128,261   26,454    122,827   1,163   5,798   1,445    149,150,758    153,380,485    177,446,419
  24     38,095    8,697     28,786   1,174   5,854   1,453     44,721,857     50,917,009     41,838,760
  25    638,680  130,070    451,993   1,200   5,989   1,470    766,506,884    778,994,468    664,510,311
  26     56,424   11,560     43,695   1,169   5,825   1,447     65,938,132     67,335,601     63,221,858
  27     35,511    7,642     23,758   1,172   5,845   1,451     41,633,854     44,666,333     34,470,126
  28     54,626   11,377     44,522   1,172   5,845   1,452     64,045,606     66,493,664     64,652,059
  29     60,329   14,996     48,130   1,172   5,852   1,451     70,709,169     87,756,220     69,858,260
  30    144,936   36,820    126,841   1,175   5,887   1,458    170,347,624    216,761,177    184,927,046
  31     26,578    8,550     32,405   1,174   5,862   1,456     31,196,743     50,118,072     47,189,075
  32     39,939    7,718     27,511   1,174   5,851   1,453     46,897,923     45,153,654     39,977,555
  33     19,899    4,921     17,703   1,174   5,853   1,454     23,358,683     28,807,337     25,743,836
  34     26,577    6,453     23,366   1,173   5,852   1,454     31,184,898     37,760,493     33,967,145
  35     21,525    5,267     17,005   1,174   5,856   1,454     25,277,990     30,843,509     24,731,553
  36    941,236  183,462    736,348   1,146   5,688   1,413  1,078,454,181  1,043,505,263  1,040,471,312
  37  1,208,137  171,960    644,176   1,247   5,900   1,454  1,506,529,900  1,014,641,302    936,597,527
  38     53,764   14,840     64,406   1,174   5,868   1,459     63,139,681     87,080,970     93,990,913
  39     27,690    8,268     35,476   1,173   5,858   1,456     32,491,687     48,433,883     51,663,376
  40     57,878   15,130     57,797   1,175   5,869   1,458     68,006,401     88,797,765     84,255,294
  41    112,359   35,350    252,357   1,172   5,887   1,490    131,630,183    208,113,877    375,929,649
  42     75,550   16,897     69,902   1,175   5,864   1,458     88,791,931     99,078,500    101,892,166
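Each forecast in Table D-1 is, to within rounding, the product of the model quantity and the model rate for that case, and the rate differs slightly from case to case, which is consistent with a leave-one-out cross-validation scheme. The arithmetic can be sketched in Python as follows. This is only an illustrative sketch: it assumes that the rate for a case is the simple mean of the unit prices (tender sum divided by quantity) of all the other cases, and the function names and sample figures are hypothetical rather than taken from the thesis.

```python
def leave_one_out_rates(quantities, prices):
    """Rate for case i = mean unit price (price / quantity) of all other cases.

    Assumption: a simple unweighted mean; the original study may have
    estimated its rates differently.
    """
    if len(quantities) != len(prices) or len(quantities) < 2:
        raise ValueError("need two or more cases with matching lengths")
    unit_prices = [p / q for p, q in zip(prices, quantities)]
    total = sum(unit_prices)
    n = len(unit_prices)
    # Removing case i from the total leaves the sum over the other n - 1 cases.
    return [(total - u) / (n - 1) for u in unit_prices]


def leave_one_out_forecasts(quantities, prices):
    """Forecast for case i = its own quantity times the rate from the other cases."""
    rates = leave_one_out_rates(quantities, prices)
    return [q * r for q, r in zip(quantities, rates)]


# Hypothetical floor areas (m2) and tender sums (HK$), for illustration only.
areas = [100.0, 200.0, 400.0]
sums = [1000.0, 2200.0, 3600.0]
forecasts = leave_one_out_forecasts(areas, sums)
```

As a check against the table, the James quantity and rate for Office case 1 give 50,550 × 1,175 = 59,396,250, which is within 0.1% of the tabulated forecast of 59,372,873; the small gap is presumably because the rates are printed to the nearest dollar.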



Private Housing

      Model Quantity               Model Rate              Forecasted Tender Sum
Case      James    Floor     Cube   James   Floor    Cube           James           Floor            Cube
           (m²)     (m²)     (m³)  ($/m²)  ($/m²)  ($/m³)           (HK$)           (HK$)           (HK$)
   1    710,487  118,840  341,184     760   4,020   1,426     539,822,162     477,790,905     486,573,380
   2    944,939  155,150  445,564     755   3,988   1,415     713,552,191     618,674,258     630,354,139
   3    131,043   30,840   94,746     761   4,051   1,437      99,778,400     124,920,812     136,153,077
   4    112,167   20,937   70,734     763   4,054   1,439      85,613,626      84,879,293     101,768,657
   5    956,202  181,820  494,364     771   4,095   1,450     736,919,515     744,639,741     716,582,916
   6  1,157,607  240,440  754,110     754   4,029   1,438     872,813,723     968,627,663   1,084,710,974
   7     12,366    2,800    8,680     765   4,062   1,440       9,456,723      11,374,722      12,502,087
   8     24,127    6,066   17,893     764   4,062   1,440      18,443,658      24,640,525      25,769,121
   9    250,732   43,003  128,526     765   4,060   1,440     191,856,830     174,599,877     185,123,898
  10    566,453  134,220  404,167     751   4,015   1,427     425,155,543     538,865,772     576,608,977
  11  1,120,997  211,440  590,512     744   3,951   1,400     833,797,413     835,449,571     826,753,484
  12    663,624  132,070  372,257     763   4,063   1,440     506,647,430     536,549,802     536,151,998
  13    151,135   34,335  108,442     761   4,048   1,437     115,008,129     138,985,475     155,784,824
  14     27,793    5,195   15,730     765   4,061   1,440      21,251,161      21,098,368      22,650,548
  15     35,515    7,811   24,663     765   4,062   1,440      27,152,627      31,729,589      35,527,123
  16  1,133,306  197,320  558,684     777   4,111   1,458     880,832,193     811,217,111     814,454,721
  17     10,811    2,793    8,285     765   4,063   1,441       8,269,007      11,349,233      11,935,431
  18    787,734  141,300  391,527     769   4,075   1,444     605,381,821     575,764,763     565,246,703
  19     28,116    6,841   19,484     765   4,064   1,441      21,504,864      27,803,438      28,075,002
  20    142,064   31,520   97,830     762   4,053   1,438     108,265,743     127,743,892     140,676,020
  21    382,217   66,825  210,871     763   4,048   1,438     291,627,893     270,476,690     303,195,772
  22    252,034   45,900  126,225     764   4,055   1,437     192,496,780     186,140,819     181,426,161
  23    313,524   60,140  169,387     767   4,077   1,445     240,584,951     245,196,620     244,834,083
  24  1,002,584  187,360  525,872     765   4,060   1,439     766,639,967     760,717,016     756,785,376
  25     17,041    3,769    9,479     765   4,063   1,440      13,033,398      15,313,365      13,652,540
  26    384,544   95,770  270,824     762   4,072   1,444     293,053,202     389,961,458     390,981,903
  27     82,080   20,640   55,109     764   4,064   1,440      62,714,248      83,873,938      79,372,448
  28    522,038   90,720  254,088     770   4,083   1,447     402,007,775     370,370,941     367,703,903
  29    499,149  108,627  304,590     765   4,076   1,445     381,634,950     442,791,500     440,105,402
  30    148,177   34,470  106,967     762   4,052   1,438     112,852,580     139,673,486     153,796,614
  31    627,877  122,750  340,357     768   4,085   1,447     482,324,197     501,440,924     492,666,293
  32    170,618   32,550   87,885     766   4,072   1,443     130,777,011     132,536,234     126,823,048
  33    384,252   74,700  219,960     761   4,046   1,436     292,536,200     302,254,734     315,794,951
  34    407,097   79,260  223,270     767   4,075   1,445     312,076,760     322,950,196     322,515,873
  35    529,896   97,860  276,927     766   4,066   1,442     405,862,926     397,939,532     399,266,095
  36    103,759   21,938   64,166     765   4,063   1,441      79,325,442      89,140,383      92,453,090
  37    652,203  123,950  313,594     771   4,095   1,447     502,657,976     507,571,144     453,792,371
  38    958,982  169,260  457,002     776   4,109   1,454     744,014,266     695,492,936     664,503,547
  39    499,756   85,840  231,768     771   4,085   1,447     385,154,078     350,666,191     335,355,838
  40    966,375  168,720  455,544     776   4,108   1,454     749,878,706     693,047,206     662,170,897
  41    733,776  128,390  346,653     772   4,089   1,448     566,228,519     524,940,289     501,783,295
  42    494,856   87,480  236,196     765   4,060   1,438     378,793,092     355,173,599     339,660,311
  43    767,745  129,870  350,649     771   4,083   1,445  592,301,009.74  530,198,096.05  506,800,720.56
  44    114,848   21,121   55,971     766   4,066   1,441   87,926,473.50   85,879,578.90   80,658,164.91
  45    393,972   70,670  187,276     769   4,081   1,445  302,928,374.19  288,375,864.90  270,637,948.94
  46    130,842   28,000   70,488     766   4,072   1,443  100,224,854.20  114,019,920.97  101,686,494.80
  47    118,933   24,850   68,338     763   4,057   1,438   90,773,455.62  100,804,279.38   98,264,079.00
  48     65,162   14,940   44,329     764   4,063   1,441   49,806,322.57   60,695,623.54   63,860,229.19
  49     83,376   20,300   58,870     764   4,065   1,441   63,726,309.10   82,509,513.31   84,842,576.37
  50    737,963  128,700  353,925     773   4,097   1,451  570,746,064.53  527,332,419.33  513,701,177.26




Nursing Homes

      Model Quantity          Model Rate              Forecasted Tender Sum
Case   James   Floor    Cube   James   Floor    Cube       James       Floor        Cube
        (m²)    (m²)    (m³)  ($/m²)  ($/m²)  ($/m³)       (HK$)       (HK$)       (HK$)
   1  32,195   9,357  37,295   1,211   4,215   1,183  38,992,273  39,436,561  44,130,687
   2  37,823  10,640  39,472   1,236   4,292   1,200  46,752,715  45,665,113  47,348,177
   3  28,924   8,940  35,760   1,199   4,186   1,175  34,672,981  37,423,995  42,024,543
   4  30,853  10,100  45,450   1,208   4,234   1,201  37,258,239  42,766,748  54,599,219
   5   8,020   2,400   9,600   1,212   4,219   1,178   9,723,386  10,124,491  11,311,030
   6  18,536   5,240  18,602   1,217   4,231   1,179  22,566,468  22,172,318  21,937,612
   7  16,224   3,502  12,258   1,216   4,196   1,169  19,722,838  14,695,509  14,332,358
   8  12,913   3,783  15,132   1,208   4,202   1,175  15,594,843  15,895,715  17,777,176
   9  34,503  10,900  33,108   1,216   4,258   1,174  41,969,369  46,414,397  38,868,066
  10  22,893   6,865  24,028   1,218   4,245   1,182  27,889,849  29,139,691  28,404,085
  11  19,628   6,150  18,143   1,209   4,217   1,167  23,720,771  25,933,114  21,178,107
  12  35,890  11,900  41,650   1,224   4,302   1,197  43,925,314  51,193,195  49,857,789
  13  16,713   4,575  19,444   1,220   4,238   1,188  20,397,313  19,389,012  23,101,057
  14  18,357   5,720  17,160   1,229   4,286   1,188  22,558,469  24,515,279  20,380,578
  15  17,620   5,752  23,008   1,223   4,273   1,196  21,551,242  24,575,631  27,526,333
  16  12,112   3,740  17,204   1,212   4,224   1,186  14,685,236  15,796,253  20,397,589
  17  72,543  15,978  47,197   1,223   4,108   1,123  88,713,548  65,633,295  53,006,869
  18  20,374   6,240  21,216   1,230   4,287   1,193  25,053,592  26,751,383  25,304,864
  19  22,236   6,105  20,757   1,219   4,232   1,178  27,114,010  25,839,115  24,443,092
  20  14,943   4,405  19,383   1,210   4,209   1,181  18,074,729  18,543,558  22,896,143
  21  13,564   4,290  15,659   1,223   4,264   1,189  16,590,664  18,293,875  18,625,695
  22  18,287   5,230  17,259   1,227   4,267   1,186  22,440,674  22,315,583  20,475,966
  23  27,805   7,190  21,517   1,208   4,177   1,156  33,575,265  30,033,124  24,863,430




Schools

      Model Quantity          Model Rate              Forecasted Tender Sum
Case   James   Floor    Cube   James   Floor    Cube       James       Floor        Cube
        (m²)    (m²)    (m³)  ($/m²)  ($/m²)  ($/m³)       (HK$)       (HK$)       (HK$)
   1  14,916   4,498  18,036     632   2,104     549   9,428,724   9,464,630   9,905,382
   2  38,219  12,466  51,609     636   2,136     562  24,291,506  26,628,941  29,000,856
   3  14,916   4,498  18,036     630   2,098     547   9,399,254   9,435,048   9,874,423
   4  14,861   4,436  15,613     631   2,101     545   9,381,703   9,317,595   8,515,324
   5   8,541   2,710  10,081     635   2,118     551   5,427,421   5,739,048   5,557,443
   6   9,882   2,970  10,692     633   2,108     548   6,258,694   6,261,287   5,862,205
   7   9,649   2,771  11,638     637   2,117     553   6,144,545   5,866,423   6,433,834
   8  32,645   9,715  34,780     637   2,118     548  20,789,791  20,574,694  19,067,844
   9  19,003   5,570  20,165     616   2,047     532  11,703,934  11,404,989  10,723,764
  10  12,441   3,480  16,112     637   2,114     554   7,921,848   7,358,185   8,933,449
  11  10,858   3,068  12,579     632   2,100     548   6,862,630   6,442,888   6,894,603
  12  16,273   4,814  18,582     637   2,119     552  10,369,211  10,202,933  10,262,204
  13  14,996   4,459  19,486     620   2,064     541   9,304,505   9,205,053  10,538,984
  14  14,985   4,340  15,190     633   2,102     546   9,478,966   9,123,962   8,289,897
  15  27,004   8,325  29,138     638   2,129     551  17,240,545  17,726,537  16,050,372
  16  23,185   7,171  24,238     636   2,121     548  14,743,916  15,207,998  13,283,659
  17  14,201   4,200  14,910     640   2,130     553   9,094,442   8,947,076   8,250,403
  18  10,848   3,152  12,450     637   2,119     552   6,912,741   6,679,007   6,878,452
  19   7,197   2,140   9,245     638   2,123     555   4,592,511   4,544,131   5,126,316
  20  13,913   3,920  14,896     633   2,102     547   8,806,584   8,238,969   8,153,064
  21   6,407   1,976   6,916     634   2,112     549   4,062,983   4,172,677   3,798,607
  22   9,739   2,921   9,785     639   2,126     552   6,220,005   6,209,432   5,400,789
  23  32,434   9,700  40,740     651   2,167     569  21,121,705  21,016,634  23,199,480



Appendix E: Errors and Percentage Errors of Forecasts


Table E-1: Errors and Percentage Errors of Forecasts for the Conventional Models for Offices

      Original JSEM      Floor Area         Cube
Case     Error  % Error     Error  % Error     Error  % Error
   1   2.3E+06    0.040   1.4E+07    0.252   7.2E+06    0.126
   2   2.3E+06    0.053   1.0E+07    0.241   2.3E+06    0.053
   3  -1.1E+05   -0.001   3.9E+07    0.393   1.7E+07    0.172
   4   4.5E+06    0.091   1.2E+07    0.236  -1.8E+06   -0.036
   5   1.0E+07    0.172   1.5E+07    0.258   1.1E+07    0.180
   6  -2.0E+07   -0.095   9.3E+07    0.449   3.3E+07    0.160
   7   1.3E+07    0.111   3.0E+07    0.257   1.9E+07    0.163
   8  -1.9E+07   -0.320  -2.2E+07   -0.362  -1.9E+07   -0.313
   9   6.5E+07    0.062  -3.1E+08   -0.299  -1.2E+08   -0.119
  10  -1.5E+08   -0.419  -1.2E+08   -0.337  -8.8E+07   -0.240
  11  -9.8E+07   -0.303  -4.4E+07   -0.137   7.3E+06    0.023
  12  -1.2E+07   -0.166  -1.3E+07   -0.187  -1.8E+07   -0.251
  13   6.5E+06    0.112   2.8E+06    0.049  -3.7E+06   -0.065
  14   1.5E+08    0.103   3.1E+08    0.210   2.2E+08    0.148
  15  -9.1E+07   -0.404  -8.2E+07   -0.366  -6.6E+07   -0.292
  16  -1.8E+07   -0.380  -1.5E+07   -0.313  -1.8E+07   -0.369
  17  -4.9E+07   -0.120   7.4E+07    0.182   1.4E+08    0.344
  18   5.4E+06    0.136   1.0E+07    0.260   2.4E+06    0.060
  19  -7.4E+06   -0.256  -7.3E+06   -0.253  -1.1E+07   -0.385
  20  -3.9E+07   -0.220  -6.0E+07   -0.339  -4.0E+07   -0.224
  21   3.8E+06    0.151   7.2E+06    0.287   1.9E+06    0.076
  22  -2.5E+06   -0.105  -1.9E+06   -0.079  -5.8E+06   -0.237
  23  -8.9E+07   -0.373  -8.5E+07   -0.356  -5.9E+07   -0.247
  24  -2.5E+06   -0.052   3.7E+06    0.078  -4.9E+06   -0.105
  25   2.0E+08    0.359   2.1E+08    0.380   1.1E+08    0.190
  26  -4.4E+07   -0.400  -4.3E+07   -0.388  -4.6E+07   -0.419
  27  -1.4E+07   -0.256  -1.1E+07   -0.203  -2.1E+07   -0.378
  28  -1.4E+07   -0.181  -1.2E+07   -0.150  -1.3E+07   -0.165
  29  -1.7E+07   -0.195  -1.2E+05   -0.001  -1.7E+07   -0.197
  30   8.5E+06    0.052   5.5E+07    0.338   2.5E+07    0.154
  31  -3.8E+06   -0.107   1.5E+07    0.433   1.3E+07    0.364
  32  -2.2E+05   -0.005  -2.0E+06   -0.042  -6.8E+06   -0.143
  33  -3.2E+06   -0.122   2.2E+06    0.082  -6.1E+05   -0.023
  34  -6.8E+06   -0.179  -2.5E+05   -0.006  -3.7E+06   -0.097
  35   9.1E+05    0.037   6.5E+06    0.265   6.1E+05    0.025
  36  -2.2E+08   -0.170  -2.6E+08   -0.198  -2.5E+08   -0.191
  37   5.7E+08    0.606   7.6E+07    0.081   8.6E+06    0.009
  38   1.0E+06    0.016   2.5E+07    0.400   3.3E+07    0.528
  39  -6.7E+06   -0.170   9.2E+06    0.236   1.3E+07    0.332
  40   5.7E+06    0.092   2.6E+07    0.425   2.3E+07    0.366
  41  -2.8E+07   -0.181   5.5E+07    0.360   1.3E+08    0.822
  42   7.9E+06    0.098   1.8E+07    0.224   2.2E+07    0.272

MSQ    1.2E+16            8.9E+15            4.7E+15
Max               0.606              0.449              0.822
Min              -0.419             -0.388             -0.419
Mean             -0.069              0.056              0.002
SD                0.214              0.273              0.270



Models for Private Housing

Case      Error   % Error     Error   % Error     Error   % Error
    1  -1.1E+08    -0.172  -1.7E+08    -0.267  -1.7E+08    -0.254
    2  -2.1E+08    -0.229  -3.1E+08    -0.332  -3.0E+08    -0.319
    3  -7.7E+07    -0.436  -5.2E+07    -0.294  -4.1E+07    -0.231
    4  -3.7E+07    -0.304  -3.8E+07    -0.310  -2.1E+07    -0.173
    5   1.2E+08     0.198   1.3E+08     0.211   1.0E+08     0.165
    6  -2.4E+08    -0.214  -1.4E+08    -0.127  -2.5E+07    -0.023
    7  -6.2E+06    -0.398  -4.3E+06    -0.275  -3.2E+06    -0.204
    8  -1.2E+07    -0.391  -5.7E+06    -0.187  -4.5E+06    -0.150
    9   3.9E+06     0.021  -1.3E+07    -0.071  -2.9E+06    -0.015
   10  -3.1E+08    -0.422  -2.0E+08    -0.268  -1.6E+08    -0.217
   11  -4.6E+08    -0.354  -4.5E+08    -0.352  -4.6E+08    -0.359
   12  -3.3E+07    -0.062  -3.5E+06    -0.006  -3.8E+06    -0.007
   13  -8.7E+07    -0.431  -6.3E+07    -0.312  -4.6E+07    -0.229
   14  -8.3E+06    -0.282  -8.5E+06    -0.287  -6.9E+06    -0.235
   15  -9.8E+06    -0.266  -5.3E+06    -0.142  -1.5E+06    -0.040
   16   2.6E+08     0.425   1.9E+08     0.313   2.0E+08     0.318
   17  -3.1E+06    -0.275  -5.1E+04    -0.004   5.4E+05     0.047
   18   7.5E+07     0.142   4.6E+07     0.086   3.5E+07     0.067
   19  -3.1E+06    -0.126   3.2E+06     0.130   3.5E+06     0.141
   20  -6.3E+07    -0.367  -4.3E+07    -0.253  -3.0E+07    -0.177
   21  -4.3E+07    -0.129  -6.5E+07    -0.193  -3.2E+07    -0.095
   22  -2.7E+07    -0.121  -3.3E+07    -0.150  -3.8E+07    -0.172
   23   5.1E+07     0.266   5.5E+07     0.291   5.5E+07     0.289
   24  -7.4E+06    -0.010  -1.3E+07    -0.017  -1.7E+07    -0.022
   25  -4.3E+06    -0.247  -2.0E+06    -0.115  -3.6E+06    -0.211
   26  -6.3E+07    -0.177   3.4E+07     0.095   3.5E+07     0.098
   27  -2.0E+07    -0.245   7.7E+05     0.009  -3.7E+06    -0.045
   28   1.1E+08     0.372   7.7E+07     0.264   7.5E+07     0.255
   29  -9.4E+06    -0.024   5.2E+07     0.132   4.9E+07     0.126
   30  -7.3E+07    -0.393  -4.6E+07    -0.249  -3.2E+07    -0.173
   31   6.8E+07     0.165   8.7E+07     0.211   7.9E+07     0.190
   32   3.2E+07     0.322   3.4E+07     0.340   2.8E+07     0.282
   33  -7.9E+07    -0.214  -7.0E+07    -0.187  -5.6E+07    -0.151
   34   3.4E+07     0.123   4.5E+07     0.162   4.5E+07     0.160
   35   2.0E+07     0.051   1.2E+07     0.031   1.3E+07     0.034
   36  -1.1E+07    -0.118  -7.6E+05    -0.008   2.6E+06     0.028
   37   1.2E+08     0.323   1.3E+08     0.336   7.4E+07     0.194
   38   2.3E+08     0.456   1.8E+08     0.361   1.5E+08     0.300
   39   1.2E+08     0.464   8.8E+07     0.333   7.2E+07     0.275
   40   2.4E+08     0.459   1.8E+08     0.348   1.5E+08     0.288
   41   1.4E+08     0.339   1.0E+08     0.241   7.9E+07     0.186
   42   9.8E+06     0.027  -1.4E+07    -0.037  -2.9E+07    -0.080
   43   1.4E+08     0.308   7.7E+07     0.170   5.4E+07     0.119
   44   1.3E+07     0.166   1.0E+07     0.139   5.3E+06     0.070
   45   8.4E+07     0.383   6.9E+07     0.317   5.2E+07     0.236
   46   2.1E+07     0.270   3.5E+07     0.445   2.3E+07     0.289
   47  -3.8E+07    -0.296  -2.8E+07    -0.219  -3.1E+07    -0.238
   48  -1.4E+07    -0.223  -3.4E+06    -0.053  -2.4E+05    -0.004
   49  -1.5E+07    -0.186   4.2E+06     0.054   6.5E+06     0.084
   50   1.8E+08     0.463   1.4E+08     0.352   1.2E+08     0.317

MSQ:    1.6E+16             1.3E+16             1.0E+16
Max:                0.464               0.445               0.318
Min:               -0.436              -0.352              -0.359
Mean:              -0.027               0.013               0.015
SD:                 0.290               0.235               0.196

ORIGINAL JSEM FLOOR AREA CUBE



Models for Nursing Homes

Case      Error   % Error     Error   % Error     Error   % Error
    1  -3.4E+06    -0.080  -3.0E+06    -0.070   1.7E+06     0.041
    2   1.0E+07     0.286   9.3E+06     0.256   1.1E+07     0.303
    3  -1.0E+07    -0.228  -7.5E+06    -0.167  -2.9E+06    -0.064
    4  -5.3E+06    -0.125   1.7E+05     0.004   1.2E+07     0.282
    5  -2.7E+06    -0.220  -2.3E+06    -0.188  -1.2E+06    -0.092
    6   9.5E+04     0.004  -3.0E+05    -0.013  -5.3E+05    -0.024
    7  -9.0E+05    -0.044  -5.9E+06    -0.287  -6.3E+06    -0.305
    8  -5.3E+06    -0.253  -5.0E+06    -0.239  -3.1E+06    -0.149
    9  -4.8E+05    -0.011   4.0E+06     0.094  -3.6E+06    -0.084
   10   5.6E+05     0.021   1.8E+06     0.066   1.1E+06     0.039
   11  -4.8E+06    -0.169  -2.6E+06    -0.092  -7.4E+06    -0.258
   12   3.7E+06     0.091   1.1E+07     0.271   9.6E+06     0.238
   13   1.8E+06     0.095   7.6E+05     0.041   4.5E+06     0.240
   14   6.4E+06     0.397   8.4E+06     0.519   4.2E+06     0.262
   15   3.2E+06     0.176   6.2E+06     0.341   9.2E+06     0.502
   16  -2.6E+06    -0.153  -1.5E+06    -0.088   3.1E+06     0.177
   17   3.1E+06     0.036  -2.0E+07    -0.233  -3.3E+07    -0.381
   18   6.9E+06     0.377   8.6E+06     0.471   7.1E+06     0.391
   19   1.1E+06     0.044  -1.3E+05    -0.005  -1.5E+06    -0.059
   20  -4.3E+06    -0.191  -3.8E+06    -0.170   5.6E+05     0.025
   21   3.2E+06     0.242   4.9E+06     0.370   5.3E+06     0.395
   22   5.5E+06     0.322   5.3E+06     0.315   3.5E+06     0.206
   23  -5.4E+06    -0.138  -8.9E+06    -0.229  -1.4E+07    -0.362

MSQ:    2.3E+13             4.8E+13             8.6E+13
Max:                0.397               0.519               0.502
Min:               -0.253              -0.287              -0.381
Mean:               0.021               0.042               0.058
SD:                 0.200               0.245               0.252
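
The summary rows (MSQ, Max, Min, Mean, SD) can be recomputed from the case rows. The following is a minimal sketch, assuming that MSQ is the mean of the squared errors and that SD is the sample standard deviation (n - 1 divisor); neither assumption is stated in the table, but both reproduce the first model column of the Nursing Homes table to the printed precision. The function name `summarise` is illustrative only.

```python
from statistics import mean, stdev

# First model column of the Nursing Homes table (cases 1-23), as printed.
errors = [-3.4e6, 1.0e7, -1.0e7, -5.3e6, -2.7e6, 9.5e4, -9.0e5, -5.3e6,
          -4.8e5, 5.6e5, -4.8e6, 3.7e6, 1.8e6, 6.4e6, 3.2e6, -2.6e6,
          3.1e6, 6.9e6, 1.1e6, -4.3e6, 3.2e6, 5.5e6, -5.4e6]
pct_errors = [-0.080, 0.286, -0.228, -0.125, -0.220, 0.004, -0.044, -0.253,
              -0.011, 0.021, -0.169, 0.091, 0.095, 0.397, 0.176, -0.153,
              0.036, 0.377, 0.044, -0.191, 0.242, 0.322, -0.138]


def summarise(errors, pct_errors):
    """Return the five summary statistics reported under each model."""
    return {
        # Assumption: MSQ is the mean of the squared (absolute) errors.
        "MSQ": mean([e * e for e in errors]),
        "Max": max(pct_errors),
        "Min": min(pct_errors),
        "Mean": mean(pct_errors),
        # Assumption: SD is the sample standard deviation (n - 1 divisor).
        "SD": stdev(pct_errors),
    }


summary = summarise(errors, pct_errors)
for name, value in summary.items():
    # Scientific notation for MSQ, three decimals for the percentage figures.
    print(f"{name}: {value:.1e}" if name == "MSQ" else f"{name}: {value:.3f}")
```

Because the inputs are the rounded figures printed in the table, small differences in the last digit are possible for other columns.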




Models for Schools

Case      Error   % Error     Error   % Error     Error   % Error
    1  -8.9E+05    -0.086  -8.6E+05    -0.083  -4.2E+05    -0.040
    2   4.2E+05     0.018   2.8E+06     0.116   5.1E+06     0.215
    3  -1.6E+06    -0.148  -1.6E+06    -0.145  -1.2E+06    -0.105
    4  -1.2E+06    -0.114  -1.3E+06    -0.120  -2.1E+06    -0.196
    5   3.6E+05     0.071   6.7E+05     0.132   4.9E+05     0.097
    6  -4.1E+05    -0.062  -4.1E+05    -0.062  -8.1E+05    -0.122
    7   8.9E+05     0.169   6.1E+05     0.116   1.2E+06     0.224
    8   9.0E+05     0.045   6.8E+05     0.034  -8.3E+05    -0.042
    9  -7.0E+06    -0.374  -7.3E+06    -0.390  -8.0E+06    -0.427
   10   8.6E+05     0.122   3.0E+05     0.042   1.9E+06     0.265
   11  -9.1E+05    -0.118  -1.3E+06    -0.172  -8.8E+05    -0.113
   12   1.0E+06     0.110   8.6E+05     0.092   9.2E+05     0.099
   13  -5.3E+06    -0.362  -5.4E+06    -0.369  -4.0E+06    -0.278
   14  -7.2E+05    -0.071  -1.1E+06    -0.106  -1.9E+06    -0.187
   15   1.5E+06     0.095   2.0E+06     0.126   3.1E+05     0.019
   16   5.5E+05     0.039   1.0E+06     0.071  -9.1E+05    -0.064
   17   2.2E+06     0.326   2.1E+06     0.305   1.4E+06     0.203
   18   1.0E+06     0.178   8.1E+05     0.138   1.0E+06     0.172
   19   1.4E+06     0.422   1.3E+06     0.407   1.9E+06     0.588
   20  -5.7E+05    -0.061  -1.1E+06    -0.121  -1.2E+06    -0.130
   21  -1.3E+05    -0.030  -1.5E+04    -0.004  -3.9E+05    -0.093
   22   1.6E+06     0.344   1.6E+06     0.342   7.7E+05     0.167
   23   6.3E+06     0.427   6.2E+06     0.420   8.4E+06     0.567

MSQ:    6.1E+12             6.7E+12             8.9E+12
Max:                0.427               0.420               0.588
Min:               -0.374              -0.390              -0.427
Mean:               0.041               0.033               0.036
SD:                 0.212               0.214               0.246



Table E-5: Errors and Percentage Errors of Forecasts for the Regressed Models for Offices

Case    Error *   % Error   Error *   % Error   Error *   % Error   Error *   % Error
    1   1.4E+07     0.240   1.6E+07     0.283   1.6E+07     0.272   1.5E+07     0.256
    2   8.6E+06     0.199   7.9E+06     0.181   8.0E+06     0.186   7.1E+06     0.164
    3   2.1E+07     0.213   1.7E+07     0.170   6.3E+06     0.064  -2.1E+06    -0.021
    4   9.5E+06     0.191   8.2E+06     0.165   4.6E+06     0.093   5.9E+06     0.118
    5   1.9E+07     0.327   2.3E+07     0.392   1.9E+07     0.331   1.8E+07     0.311
    6  -8.9E+07    -0.430  -8.6E+07    -0.413  -6.3E+07    -0.304  -5.0E+07    -0.241
    7   4.0E+07     0.338   4.4E+07     0.378   2.3E+07     0.199   2.0E+07     0.169
    8  -1.7E+07    -0.285  -1.4E+07    -0.227  -5.8E+06    -0.097  -5.1E+06    -0.086
    9   2.2E+08     0.211   6.4E+07     0.061  -8.4E+07    -0.080  -1.3E+08    -0.120
   10  -5.1E+07    -0.138  -5.5E+07    -0.149  -1.1E+08    -0.298  -9.7E+07    -0.264
   11  -6.3E+06    -0.020   7.1E+06     0.022  -5.9E+07    -0.183  -5.3E+07    -0.164
   12  -1.2E+07    -0.161  -7.8E+06    -0.109  -9.5E+06    -0.132  -7.4E+06    -0.103
   13   8.2E+06     0.143   1.1E+07     0.185   2.1E+07     0.365   1.8E+07     0.308
   14  -3.2E+08    -0.215  -3.4E+08    -0.233   2.1E+08     0.143   2.7E+08     0.183
   15  -2.9E+07    -0.129  -3.2E+07    -0.140  -5.2E+07    -0.230  -4.7E+07    -0.207
   16  -1.6E+07    -0.337  -1.6E+07    -0.331  -1.1E+07    -0.227  -1.1E+07    -0.240
   17  -2.0E+08    -0.496  -5.3E+06    -0.013  -9.4E+06    -0.023  -5.3E+07    -0.131
   18   8.8E+06     0.224   9.2E+06     0.234   1.0E+07     0.252   8.4E+06     0.212
   19  -9.4E+06    -0.326  -8.9E+06    -0.309  -7.5E+06    -0.261  -4.6E+06    -0.160
   20   2.4E+06     0.013   5.3E+06     0.030  -7.3E+05    -0.004  -2.0E+06    -0.011
   21   4.4E+06     0.178   5.3E+06     0.213   4.1E+06     0.165   6.9E+06     0.275
   22  -4.3E+06    -0.177  -3.5E+06    -0.145   4.7E+04     0.002  -9.5E+05    -0.039
   23  -1.1E+06    -0.005  -1.1E+07    -0.046  -3.6E+07    -0.150  -2.6E+07    -0.111
   24  -5.2E+04    -0.001   1.4E+06     0.029   1.3E+06     0.028  -1.4E+06    -0.029
   25   2.3E+07     0.041   1.6E+08     0.291   1.1E+08     0.192   1.2E+08     0.204
   26  -3.7E+07    -0.334  -3.5E+07    -0.316  -3.0E+07    -0.274  -2.8E+07    -0.256
   27  -1.5E+07    -0.274  -1.5E+07    -0.272  -1.4E+07    -0.258  -1.4E+07    -0.255
   28  -5.1E+06    -0.065  -1.3E+06    -0.016  -5.3E+06    -0.067  -3.9E+06    -0.050
   29   8.4E+05     0.010  -3.0E+06    -0.034  -4.7E+06    -0.053  -9.9E+06    -0.113
   30   8.6E+07     0.533   4.4E+07     0.274   1.9E+07     0.118   8.3E+06     0.051
   31   6.3E+06     0.180   2.0E+05     0.006  -7.6E+06    -0.217  -5.2E+06    -0.148
   32   3.0E+06     0.065   5.3E+06     0.113   1.5E+07     0.310   2.2E+07     0.473
   33  -7.3E+05    -0.027  -1.2E+06    -0.044  -7.8E+05    -0.029  -2.3E+06    -0.085
   34  -2.5E+06    -0.065  -3.2E+06    -0.085  -7.4E+05    -0.019  -2.3E+06    -0.060
   35   3.4E+06     0.138   2.6E+06     0.106   4.0E+06     0.164   3.4E+06     0.140
   36  -1.9E+08    -0.142   2.5E+07     0.019  -2.1E+08    -0.164  -2.3E+08    -0.179
   37  -5.3E+07    -0.057  -2.0E+08    -0.209   2.2E+08     0.236   1.7E+08     0.183
   38   1.8E+07     0.296   2.1E+07     0.339   6.4E+06     0.102   1.1E+07     0.179
   39   1.1E+07     0.271  -3.6E+05    -0.009   1.1E+06     0.028  -5.5E+06    -0.140
   40   1.9E+07     0.311   2.4E+07     0.388   8.3E+06     0.133   1.3E+07     0.214
   41   9.6E+07     0.627   1.2E+06     0.008   4.9E+07     0.321   1.1E+08     0.700
   42   1.8E+07     0.224   3.7E+07     0.454   1.2E+07     0.153   1.7E+07     0.208

MSQ:    6.3E+15             5.0E+15             4.5E+15             5.3E+15
Max:                0.627               0.454               0.365               0.700
Min:               -0.496              -0.413              -0.304              -0.264
Mean:               0.031               0.030               0.019               0.027
SD:                 0.254               0.221               0.195               0.219

Remark: "*" denotes errors of predicted price, calculated as actual price - (actual floor area x predicted price per area).
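
The remark above gives the error calculation for models that forecast a price per unit of floor area. A minimal sketch of that arithmetic follows. The function and parameter names are hypothetical, the input figures are invented purely for illustration, and dividing the error by the actual price to obtain the percentage error is an assumption that the remark itself does not state.

```python
def forecast_error(actual_price, actual_floor_area, predicted_price_per_area):
    """Error = actual price - (actual floor area x predicted price per area)."""
    predicted_price = actual_floor_area * predicted_price_per_area
    return actual_price - predicted_price


def percentage_error(actual_price, error):
    """Assumption: the percentage error is the error as a fraction of the actual price."""
    return error / actual_price


# Invented example figures (not taken from the thesis data).
actual_price = 50_000_000.0       # actual contract price
actual_floor_area = 10_000.0      # actual floor area
predicted_rate = 4_000.0          # forecast price per unit floor area

error = forecast_error(actual_price, actual_floor_area, predicted_rate)
pct = percentage_error(actual_price, error)
print(error, pct)
```

Under this sign convention a positive error means the model under-forecast the actual price.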

RJSEM LRJSEM LRASEM RASEM



Table E-6: Errors and Percentage Errors of Forecasts for the Regressed Models for Private Housing


Case    Error *   % Error   Error *   % Error   Error *   % Error   Error *   % Error
    1  -1.8E+08    -0.283  -1.6E+08    -0.241  -1.9E+08    -0.299  -1.6E+08    -0.244
    2  -3.4E+08    -0.366  -3.1E+08    -0.335  -3.2E+08    -0.346  -3.0E+08    -0.319
    3  -3.8E+07    -0.215  -2.7E+07    -0.152  -3.4E+07    -0.193  -3.1E+07    -0.176
    4  -3.0E+07    -0.246   3.0E+07     0.248  -3.3E+07    -0.270   3.6E+07     0.294
    5  -5.4E+06    -0.009   3.3E+07     0.054  -1.8E+07    -0.030   7.4E+07     0.120
    6   2.7E+08     0.244   1.0E+08     0.095   5.7E+07     0.051  -1.5E+08    -0.135
    7  -3.2E+06    -0.201  -1.9E+06    -0.118  -1.9E+06    -0.120  -2.0E+06    -0.130
    8  -2.9E+06    -0.095  -2.5E+06    -0.081  -9.7E+04    -0.003  -3.3E+06    -0.108
    9   4.1E+06     0.022   7.1E+06     0.038   1.0E+06     0.006  -1.2E+05    -0.001
   10  -9.6E+07    -0.130  -2.6E+07    -0.035  -9.5E+07    -0.129  -2.0E+08    -0.273
   11  -6.6E+08    -0.509  -1.9E+08    -0.146  -5.5E+08    -0.424  -5.1E+08    -0.394
   12  -3.2E+07    -0.060   1.2E+08     0.222  -5.2E+07    -0.097  -3.6E+07    -0.067
   13  -4.8E+07    -0.240  -9.0E+06    -0.045  -4.6E+07    -0.228  -1.1E+07    -0.056
   14  -6.4E+06    -0.217  -3.8E+06    -0.127  -6.4E+06    -0.215  -4.1E+06    -0.137
   15  -1.7E+06    -0.047   9.2E+06     0.249  -1.0E+06    -0.028   9.6E+06     0.260
   16   1.8E+08     0.285   1.2E+08     0.189   2.0E+08     0.325   1.4E+08     0.234
   17   1.4E+06     0.119   1.5E+06     0.132   4.1E+06     0.357   1.4E+06     0.124
   18   2.5E+07     0.047   5.5E+06     0.010   1.7E+07     0.033   9.8E+06     0.018
   19   6.8E+06     0.278   3.6E+06     0.145   1.1E+07     0.427   2.7E+06     0.111
   20  -3.3E+07    -0.195  -8.7E+06    -0.051  -4.5E+07    -0.260  -3.5E+06    -0.020
   21  -5.3E+07    -0.160   1.0E+07     0.030  -7.8E+07    -0.232   3.5E+07     0.104
   22  -2.3E+07    -0.107  -3.3E+07    -0.153  -4.8E+07    -0.219  -3.9E+07    -0.180
   23   7.0E+07     0.368   2.8E+07     0.149   5.0E+07     0.265   5.3E+07     0.278
   24  -1.6E+07    -0.020  -1.1E+08    -0.148   6.0E+06     0.008  -7.1E+07    -0.092
   25  -2.2E+05    -0.013  -3.0E+06    -0.171   1.1E+06     0.065  -3.4E+06    -0.198
   26   6.7E+07     0.188  -2.5E+07    -0.072   6.4E+07     0.180   9.1E+06     0.025
   27   9.2E+06     0.111  -5.6E+06    -0.068   4.1E+06     0.049  -8.1E+06    -0.097
   28   9.0E+07     0.306   7.1E+07     0.243   7.2E+07     0.245   6.3E+07     0.216
   29   4.8E+07     0.122  -1.2E+07    -0.031   3.2E+07     0.081   4.4E+07     0.114
   30  -3.4E+07    -0.181  -2.7E+07    -0.143  -3.9E+07    -0.207  -1.0E+07    -0.054
   31   6.3E+07     0.153   7.6E+07     0.184   4.5E+07     0.109   5.7E+07     0.139
   32   4.4E+07     0.448   3.0E+07     0.305   2.8E+07     0.279   2.5E+07     0.254
   33  -2.5E+07    -0.067  -7.4E+07    -0.200  -1.9E+07    -0.051  -8.2E+07    -0.220
   34   7.1E+07     0.255   3.5E+07     0.128   6.6E+07     0.237   3.0E+07     0.108
   35   3.1E+07     0.080   1.3E+06     0.003   2.1E+07     0.053  -4.6E+06    -0.012
   36   8.8E+06     0.098  -8.6E+06    -0.095   4.1E+06     0.046  -8.1E+05    -0.009
   37   9.8E+07     0.258  -1.3E+07    -0.035   7.7E+07     0.201   5.2E+06     0.014
   38   9.8E+07     0.192   8.8E+07     0.171   1.1E+08     0.206   1.1E+08     0.214
   39   8.6E+07     0.327   6.1E+07     0.233   5.6E+07     0.211   5.6E+07     0.214
   40   9.6E+07     0.186   7.8E+07     0.152   1.0E+08     0.203   1.0E+08     0.201
   41   6.7E+07     0.159   4.2E+07     0.099   4.8E+07     0.113   4.8E+07     0.114
   42  -1.7E+07    -0.045  -4.3E+07    -0.118  -4.8E+07    -0.130  -4.7E+07    -0.128
   43   4.1E+07     0.090   1.6E+07     0.034   2.2E+07     0.049   2.3E+07     0.050
   44   1.9E+07     0.246   4.8E+06     0.064   5.6E+06     0.074   2.0E+06     0.026
   45   7.4E+07     0.340   3.8E+07     0.174   4.4E+07     0.202   3.3E+07     0.150
   46   4.7E+07     0.592   1.7E+07     0.221   3.9E+07     0.498   1.5E+07     0.191
   47  -2.0E+07    -0.157  -2.8E+07    -0.217  -3.2E+07    -0.249  -3.1E+07    -0.243
   48   2.8E+06     0.044   3.7E+06     0.058  -2.1E+06    -0.033   7.5E+06     0.117
   49   1.3E+07     0.164   1.6E+07     0.209   9.1E+06     0.116   1.3E+07     0.168
   50   1.0E+08     0.261   1.1E+08     0.272   8.1E+07     0.207   1.1E+08     0.275





Table E-7: Errors and Percentage Errors of Forecasts for the Regressed Models for Nursing Homes


Case    Error *   % Error   Error *   % Error   Error *   % Error   Error *   % Error
    1  -1.3E+06    -0.031  -1.0E+06    -0.024  -4.2E+06    -0.098  -3.7E+06    -0.086
    2   1.4E+07     0.385   1.4E+07     0.373   1.1E+07     0.308   1.1E+07     0.296
    3  -9.8E+06    -0.218  -9.9E+06    -0.221  -1.2E+07    -0.266  -1.2E+07    -0.265
    4  -7.1E+06    -0.167  -9.4E+06    -0.222  -8.0E+06    -0.188  -9.1E+06    -0.213
    5  -2.4E+06    -0.195  -2.2E+06    -0.179  -2.5E+06    -0.203  -2.0E+06    -0.164
    6   1.7E+06     0.075   1.7E+06     0.076   2.0E+06     0.087   1.9E+06     0.083
    7  -3.7E+06    -0.181  -3.5E+06    -0.169  -3.1E+06    -0.153  -2.5E+06    -0.123
    8  -4.6E+06    -0.219  -4.6E+06    -0.218  -4.4E+06    -0.210  -4.5E+06    -0.215
    9  -7.5E+05    -0.018  -2.1E+06    -0.049  -2.8E+06    -0.067  -3.8E+06    -0.089
   10   3.1E+06     0.114   3.1E+06     0.113   2.2E+06     0.082   2.2E+06     0.079
   11  -4.9E+06    -0.173  -4.3E+06    -0.150  -6.3E+06    -0.219  -5.6E+06    -0.197
   12  -3.0E+06    -0.074  -3.7E+06    -0.092  -7.8E+05    -0.019  -1.0E+06    -0.025
   13   1.9E+06     0.099   1.7E+06     0.093   1.2E+06     0.063   1.0E+06     0.053
   14   6.2E+06     0.382   6.1E+06     0.378   5.2E+06     0.322   5.1E+06     0.318
   15   6.0E+06     0.330   5.8E+06     0.316   4.9E+06     0.270   4.6E+06     0.250
   16  -2.1E+06    -0.119  -2.2E+06    -0.127  -2.9E+06    -0.166  -3.1E+06    -0.180
   17   1.2E+06     0.014   2.2E+06     0.026  -6.7E+05    -0.008   1.7E+05     0.002
   18   1.7E+06     0.092   2.7E+06     0.150   2.6E+06     0.142   3.0E+06     0.164
   19   2.9E+06     0.113   3.1E+06     0.118   4.3E+06     0.165   4.3E+06     0.165
   20  -3.0E+06    -0.134  -2.9E+06    -0.131  -2.8E+06    -0.126  -2.8E+06    -0.126
   21   5.2E+06     0.390   5.1E+06     0.380   4.6E+06     0.342   4.3E+06     0.323
   22   6.7E+06     0.393   6.5E+06     0.385   6.0E+06     0.356   5.8E+06     0.344
   23  -4.6E+06    -0.118  -4.4E+06    -0.114  -3.3E+06    -0.084  -3.1E+06    -0.081



lnRASEM RASEM RJSEM LRJSEM



Table E-8: Errors and Percentage Errors of Forecasts for the Regressed Models for Schools

Case    Error *   % Error   Error *   % Error   Error *   % Error   Error *   % Error
    1  -4.0E+05    -0.039  -5.7E+04    -0.006  -5.4E+05    -0.053  -3.9E+05    -0.038
    2   1.3E+06     0.053   1.3E+06     0.056   2.2E+06     0.091   2.4E+06     0.099
    3  -1.2E+06    -0.105  -8.2E+05    -0.074  -1.3E+06    -0.117  -1.2E+06    -0.104
    4  -5.8E+05    -0.054  -1.0E+06    -0.098  -6.5E+05    -0.061  -1.2E+06    -0.115
    5   4.7E+05     0.092  -8.2E+04    -0.016  -1.0E+05    -0.020  -1.1E+05    -0.022
    6   3.8E+05     0.057   6.5E+05     0.097   6.7E+05     0.100   4.7E+05     0.070
    7   1.2E+06     0.222   1.5E+06     0.292   1.2E+06     0.223   1.4E+06     0.265
    8   3.0E+06     0.152   1.7E+06     0.087   1.9E+06     0.094   1.2E+06     0.059
    9  -5.8E+06    -0.311  -5.2E+06    -0.280  -5.7E+06    -0.303  -5.9E+06    -0.313
   10   3.3E+05     0.046   6.7E+05     0.095  -2.0E+03     0.000   4.9E+05     0.069
   11  -5.9E+05    -0.076   1.9E+05     0.024  -3.8E+05    -0.049  -2.1E+05    -0.028
   12   1.3E+06     0.139   1.5E+06     0.158   1.2E+06     0.124   1.2E+06     0.126
   13  -5.0E+06    -0.341  -4.3E+06    -0.298  -5.1E+06    -0.347  -4.7E+06    -0.324
   14  -1.9E+06    -0.188  -1.9E+06    -0.186  -2.1E+06    -0.201  -1.9E+06    -0.185
   15   1.5E+06     0.094   9.7E+05     0.062   1.7E+06     0.109   1.1E+06     0.071
   16   4.7E+06     0.334   3.7E+06     0.258   3.7E+06     0.264   2.6E+06     0.184
   17  -1.8E+06    -0.263  -1.5E+06    -0.222  -9.5E+05    -0.139  -1.1E+06    -0.160
   18   3.6E+05     0.062  -5.0E+04    -0.009  -1.7E+05    -0.029  -5.6E+04    -0.010
   19   1.7E+06     0.527   1.7E+06     0.520   1.6E+06     0.481   1.7E+06     0.539
   20  -1.3E+06    -0.137  -1.5E+06    -0.163  -1.6E+06    -0.168  -1.6E+06    -0.167
   21   2.6E+05     0.063  -1.2E+05    -0.028   1.7E+04     0.004  -1.1E+05    -0.026
   22   1.3E+06     0.287   6.8E+05     0.147   8.2E+05     0.177   6.5E+05     0.140
   23   2.5E+06     0.170   3.8E+06     0.257   4.6E+06     0.314   5.1E+06     0.345





Appendix F: Results of Combining Forecasts


Table F-1: Combined Forecasts for Group 1 Models

Case        JSEM FLOOR AREA       CUBE      RASEM                             Min        Max
             (a)        (b)        (c)        (d)       avg.       avg.       min.       max.
                                                  (a, c & d)   (a to d)   (a to d)   (a to d)
         % Error    % Error    % Error    % Error    % Error    % Error    % Error    % Error
    1      0.040      0.252      0.126      0.283      0.149      0.175      0.040      0.283
    2      0.053      0.241      0.053      0.181      0.096      0.132      0.053      0.241
    3     -0.001      0.393      0.172      0.170      0.114      0.183     -0.001      0.393
    4      0.091      0.236     -0.036      0.165      0.073      0.114     -0.036      0.236
    5      0.172      0.258      0.180      0.392      0.248      0.250      0.172      0.392
    6     -0.095      0.449      0.160     -0.413     -0.116      0.025     -0.095      0.449
    7      0.111      0.257      0.163      0.378      0.217      0.227      0.111      0.378
    8     -0.320     -0.362     -0.313     -0.227     -0.287     -0.305     -0.227     -0.362
    9      0.062     -0.299     -0.119      0.061      0.001     -0.074      0.061     -0.299
   10     -0.419     -0.337     -0.240     -0.149     -0.269     -0.286     -0.149     -0.419
   11     -0.303     -0.137      0.023      0.022     -0.086     -0.099      0.022     -0.303
   12     -0.166     -0.187     -0.251     -0.109     -0.175     -0.178     -0.109     -0.251
   13      0.112      0.049     -0.065      0.185      0.077      0.070      0.049      0.185
   14      0.103      0.210      0.148     -0.233      0.006      0.057      0.103     -0.233
   15     -0.404     -0.366     -0.292     -0.140     -0.279     -0.301     -0.140     -0.404
   16     -0.380     -0.313     -0.369     -0.331     -0.360     -0.348     -0.313     -0.380
   17     -0.120      0.182      0.344     -0.013      0.070      0.098     -0.013      0.344
   18      0.136      0.260      0.060      0.234      0.143      0.172      0.060      0.260
   19     -0.256     -0.253     -0.385     -0.309     -0.316     -0.301     -0.253     -0.385
   20     -0.220     -0.339     -0.224      0.030     -0.138     -0.188      0.030     -0.339
   21      0.151      0.287      0.076      0.213      0.147      0.182      0.076      0.287
   22     -0.105     -0.079     -0.237     -0.145     -0.162     -0.141     -0.079     -0.237
   23     -0.373     -0.356     -0.247     -0.046     -0.222     -0.255     -0.046     -0.373
   24     -0.052      0.078     -0.105      0.029     -0.043     -0.013      0.029     -0.105
   25      0.359      0.380      0.190      0.291      0.280      0.305      0.190      0.380
   26     -0.400     -0.388     -0.419     -0.316     -0.378     -0.381     -0.316     -0.419
   27     -0.256     -0.203     -0.378     -0.272     -0.302     -0.277     -0.203     -0.378
   28     -0.181     -0.150     -0.165     -0.016     -0.121     -0.128     -0.016     -0.181
   29     -0.195     -0.001     -0.197     -0.034     -0.142     -0.107     -0.001     -0.197
   30      0.052      0.338      0.154      0.274      0.160      0.204      0.052      0.338
   31     -0.107      0.433      0.364      0.006      0.087      0.174      0.006      0.433
   32     -0.005     -0.042     -0.143      0.113     -0.012     -0.019     -0.005     -0.143
   33     -0.122      0.082     -0.023     -0.044     -0.063     -0.027     -0.023     -0.122
   34     -0.179     -0.006     -0.097     -0.085     -0.120     -0.092     -0.006     -0.179
   35      0.037      0.265      0.025      0.106      0.056      0.108      0.025      0.265
   36     -0.170     -0.198     -0.191      0.019     -0.114     -0.135      0.019     -0.198
   37      0.606      0.081      0.009     -0.209      0.135      0.122      0.009      0.606
   38      0.016      0.400      0.528      0.339      0.294      0.321      0.016      0.528
   39     -0.170      0.236      0.332     -0.009      0.051      0.097     -0.009      0.332
   40      0.092      0.425      0.366      0.388      0.282      0.318      0.092      0.425
   41     -0.181      0.360      0.822      0.008      0.216      0.252      0.008      0.822
   42      0.098      0.224      0.272      0.454      0.275      0.262      0.098      0.454

Mean:    (0.069)      0.056      0.002      0.030    (0.013)      0.005    (0.017)      0.051
SD:        0.214      0.273      0.270      0.221      0.194      0.207      0.116      0.358

(w) (x) (y) (z)

Average mean (w, x, y & z): 0.005
Average SD (w, x, y & z): 0.245
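
The combined columns of the table above can be recomputed from the four individual % errors. The sketch below uses cases 2 and 8 as printed. It assumes, from inspection of the rows, that "min." and "max." pick the individual % error with the smallest and largest absolute value (sign retained), not the algebraic minimum and maximum. The last lines check that the reported "Average mean" and "Average SD" agree with the averages of the Mean and SD entries of columns (a) to (d); reading (w) to (z) as those four columns is an inference from the numbers, not something the table states. All function and variable names are illustrative.

```python
from statistics import mean


def combine(a, b, c, d):
    """Combine four individual % errors the way the table columns suggest."""
    individual = [a, b, c, d]
    return {
        "avg (a, c & d)": mean([a, c, d]),
        "avg (a to d)": mean(individual),
        # Assumption: smallest / largest in absolute value, sign retained.
        "min (a to d)": min(individual, key=abs),
        "max (a to d)": max(individual, key=abs),
    }


# Cases 2 and 8 of the table: columns (a) JSEM, (b) FLOOR AREA, (c) CUBE, (d) RASEM.
case_2 = combine(0.053, 0.241, 0.053, 0.181)
case_8 = combine(-0.320, -0.362, -0.313, -0.227)

for label, result in (("case 2", case_2), ("case 8", case_8)):
    print(label, {name: round(value, 3) for name, value in result.items()})

# Mean and SD rows of columns (a) to (d); "(0.069)" in the table denotes -0.069.
column_means = [-0.069, 0.056, 0.002, 0.030]
column_sds = [0.214, 0.273, 0.270, 0.221]
print("average mean:", mean(column_means))
print("average SD:", mean(column_sds))
```

Since the inputs are already rounded to three decimals, recomputed averages can differ from the printed ones by one unit in the last place.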

Combined Forecasts



Case        JSEM FLOOR AREA       CUBE      RJSEM                             Min        Max
             (a)        (b)        (c)        (d)       avg.       avg.       min.       max.
                                                  (a, c & d)   (a to d)   (a to d)   (a to d)
         % Error    % Error    % Error    % Error    % Error    % Error    % Error    % Error
    1      0.040      0.252      0.126      0.240      0.135      0.164      0.040      0.252
    2      0.053      0.241      0.053      0.199      0.102      0.137      0.053      0.241
    3     -0.001      0.393      0.172      0.213      0.128      0.194     -0.001      0.393
    4      0.091      0.236     -0.036      0.191      0.082      0.121     -0.036      0.236
    5      0.172      0.258      0.180      0.327      0.226      0.234      0.172      0.327
    6     -0.095      0.449      0.160     -0.430     -0.122      0.021     -0.095      0.449
    7      0.111      0.257      0.163      0.338      0.204      0.217      0.111      0.338
    8     -0.320     -0.362     -0.313     -0.285     -0.306     -0.320     -0.285     -0.362
    9      0.062     -0.299     -0.119      0.211      0.051     -0.036      0.062     -0.299
   10     -0.419     -0.337     -0.240     -0.138     -0.266     -0.284     -0.138     -0.419
   11     -0.303     -0.137      0.023     -0.020     -0.100     -0.109     -0.020     -0.303
   12     -0.166     -0.187     -0.251     -0.161     -0.193     -0.191     -0.161     -0.251
   13      0.112      0.049     -0.065      0.143      0.063      0.060      0.049      0.143
   14      0.103      0.210      0.148     -0.215      0.012      0.061      0.103     -0.215
   15     -0.404     -0.366     -0.292     -0.129     -0.275     -0.298     -0.129     -0.404
   16     -0.380     -0.313     -0.369     -0.337     -0.362     -0.350     -0.313     -0.380
   17     -0.120      0.182      0.344     -0.496     -0.090     -0.022     -0.120     -0.496
   18      0.136      0.260      0.060      0.224      0.140      0.170      0.060      0.260
   19     -0.256     -0.253     -0.385     -0.326     -0.322     -0.305     -0.253     -0.385
   20     -0.220     -0.339     -0.224      0.013     -0.143     -0.192      0.013     -0.339
   21      0.151      0.287      0.076      0.178      0.135      0.173      0.076      0.287
   22     -0.105     -0.079     -0.237     -0.177     -0.173     -0.150     -0.079     -0.237
   23     -0.373     -0.356     -0.247     -0.005     -0.208     -0.245     -0.005     -0.373
   24     -0.052      0.078     -0.105     -0.001     -0.053     -0.020     -0.001     -0.105
   25      0.359      0.380      0.190      0.041      0.197      0.243      0.041      0.380
   26     -0.400     -0.388     -0.419     -0.334     -0.384     -0.385     -0.334     -0.419
   27     -0.256     -0.203     -0.378     -0.274     -0.303     -0.278     -0.203     -0.378
   28     -0.181     -0.150     -0.165     -0.065     -0.137     -0.141     -0.065     -0.181
   29     -0.195     -0.001     -0.197      0.010     -0.127     -0.096     -0.001     -0.197
   30      0.052      0.338      0.154      0.533      0.246      0.269      0.052      0.533
   31     -0.107      0.433      0.364      0.180      0.146      0.217     -0.107      0.433
   32     -0.005     -0.042     -0.143      0.065     -0.028     -0.031     -0.005     -0.143
   33     -0.122      0.082     -0.023     -0.027     -0.057     -0.023     -0.023     -0.122
   34     -0.179     -0.006     -0.097     -0.065     -0.114     -0.087     -0.006     -0.179
   35      0.037      0.265      0.025      0.138      0.067      0.116      0.025      0.265
   36     -0.170     -0.198     -0.191     -0.142     -0.168     -0.175     -0.142     -0.198
   37      0.606      0.081      0.009     -0.057      0.186      0.160      0.009      0.606
   38      0.016      0.400      0.528      0.296      0.280      0.310      0.016      0.528
   39     -0.170      0.236      0.332      0.271      0.144      0.167     -0.170      0.332
   40      0.092      0.425      0.366      0.311      0.256      0.298      0.092      0.425
   41     -0.181      0.360      0.822      0.627      0.423      0.407     -0.181      0.822
   42      0.098      0.224      0.272      0.224      0.198      0.204      0.098      0.272

Mean:    (0.069)      0.056      0.002      0.031    (0.012)      0.005    (0.043)      0.027
SD:        0.214      0.273      0.270      0.254      0.203      0.213      0.122      0.362

(w) (x) (y) (z)


Combined Forecasts



Case        JSEM FLOOR AREA       CUBE      RASEM                             Min        Max
             (a)        (b)        (c)        (d)       avg.       avg.       min.       max.
                                                     (c & d)   (a to d)   (a to d)   (a to d)
         % Error    % Error    % Error    % Error    % Error    % Error    % Error    % Error
    1     -0.172     -0.267     -0.254     -0.241     -0.247     -0.233     -0.172     -0.267
    2     -0.229     -0.332     -0.319     -0.335     -0.327     -0.304     -0.229     -0.335
    3     -0.436     -0.294     -0.231     -0.152     -0.191     -0.278     -0.152     -0.436
    4     -0.304     -0.310     -0.173      0.248      0.038     -0.135     -0.173     -0.310
    5      0.198      0.211      0.165      0.054      0.109      0.157      0.054      0.211
    6     -0.214     -0.127     -0.023      0.095      0.036     -0.067     -0.023     -0.214
    7     -0.398     -0.275     -0.204     -0.118     -0.161     -0.249     -0.118     -0.398
    8     -0.391     -0.187     -0.150     -0.081     -0.115     -0.202     -0.081     -0.391
    9      0.021     -0.071     -0.015      0.038      0.011     -0.007     -0.015     -0.071
   10     -0.422     -0.268     -0.217     -0.035     -0.126     -0.235     -0.035     -0.422
   11     -0.354     -0.352     -0.359     -0.146     -0.252     -0.303     -0.146     -0.359
   12     -0.062     -0.006     -0.007      0.222      0.107      0.037     -0.006      0.222
   13     -0.431     -0.312     -0.229     -0.045     -0.137     -0.254     -0.045     -0.431
   14     -0.282     -0.287     -0.235     -0.127     -0.181     -0.233     -0.127     -0.287
   15     -0.266     -0.142     -0.040      0.249      0.105     -0.050     -0.040     -0.266
   16      0.425      0.313      0.318      0.189      0.253      0.311      0.189      0.425
   17     -0.275     -0.004      0.047      0.132      0.090     -0.025     -0.004     -0.275
   18      0.142      0.086      0.067      0.010      0.038      0.076      0.010      0.142
   19     -0.126      0.130      0.141      0.145      0.143      0.073     -0.126      0.145
   20     -0.367     -0.253     -0.177     -0.051     -0.114     -0.212     -0.051     -0.367
   21     -0.129     -0.193     -0.095      0.030     -0.032     -0.097      0.030     -0.193
   22     -0.121     -0.150     -0.172     -0.153     -0.162     -0.149     -0.121     -0.172
   23      0.266      0.291      0.289      0.149      0.219      0.249      0.149      0.291
   24     -0.010     -0.017     -0.022     -0.148     -0.085     -0.049     -0.010     -0.148
   25     -0.247     -0.115     -0.211     -0.171     -0.191     -0.186     -0.115     -0.247
   26     -0.177      0.095      0.098     -0.072      0.013     -0.014     -0.072     -0.177
   27     -0.245      0.009     -0.045     -0.068     -0.056     -0.087      0.009     -0.245
   28      0.372      0.264      0.255      0.243      0.249      0.283      0.243      0.372
   29     -0.024      0.132      0.126     -0.031      0.047      0.051     -0.024      0.132
   30     -0.393     -0.249     -0.173     -0.143     -0.158     -0.240     -0.143     -0.393
   31      0.165      0.211      0.190      0.184      0.187      0.188      0.165      0.211
   32      0.322      0.340      0.282      0.305      0.293      0.312      0.282      0.340
   33     -0.214     -0.187     -0.151     -0.200     -0.176     -0.188     -0.151     -0.214
   34      0.123      0.162      0.160      0.128      0.144      0.143      0.123      0.162
   35      0.051      0.031      0.034      0.003      0.019      0.030      0.003      0.051
   36     -0.118     -0.008      0.028     -0.095     -0.033     -0.048     -0.008     -0.118
   37      0.323      0.336      0.194     -0.035      0.079      0.204     -0.035      0.336
   38      0.456      0.361      0.300      0.171      0.236      0.322      0.171      0.456
   39      0.464      0.333      0.275      0.233      0.254      0.326      0.233      0.464
   40      0.459      0.348      0.288      0.152      0.220      0.312      0.152      0.459
   41      0.339      0.241      0.186      0.099      0.143      0.216      0.099      0.339
   42      0.027     -0.037     -0.080     -0.118     -0.099     -0.052      0.027     -0.118
   43      0.308      0.170      0.119      0.034      0.077      0.158      0.034      0.308
   44      0.166      0.139      0.070      0.064      0.067      0.110      0.064      0.166
   45      0.383      0.317      0.236      0.174      0.205      0.277      0.174      0.383
   46      0.270      0.445      0.289      0.221      0.255      0.306      0.221      0.445
   47     -0.296     -0.219     -0.238     -0.217     -0.227     -0.242     -0.217     -0.296
   48     -0.223     -0.053     -0.004      0.058      0.027     -0.055     -0.004     -0.223
   49     -0.186      0.054      0.084      0.209      0.146      0.040      0.054      0.209
   50      0.463      0.352      0.317      0.272      0.295      0.351      0.272      0.463

Mean:    (0.027)      0.013      0.015      0.027      0.021      0.007      0.006    (0.013)
SD:        0.290      0.235      0.196      0.159      0.167      0.205      0.133      0.307

(w) (x) (y) (z)

Average mean (w, x, y & z): 0.007
Average SD (w, x, y & z): 0.220

Combined Forecasts



Case        JSEM FLOOR AREA       CUBE      RJSEM                             Min        Max
             (a)        (b)        (c)        (d)       avg.       avg.       min.       max.
                                                  (b, c & d)   (a to d)   (a to d)   (a to d)
         % Error    % Error    % Error    % Error    % Error    % Error    % Error    % Error
    1     -0.172     -0.267     -0.254     -0.283     -0.268     -0.244     -0.172     -0.283
    2     -0.229     -0.332     -0.319     -0.366     -0.339     -0.312     -0.229     -0.366
    3     -0.436     -0.294     -0.231     -0.215     -0.247     -0.294     -0.215     -0.436
    4     -0.304     -0.310     -0.173     -0.246     -0.243     -0.258     -0.173     -0.310
    5      0.198      0.211      0.165     -0.009      0.122      0.141     -0.009      0.211
    6     -0.214     -0.127     -0.023      0.244      0.031     -0.030     -0.023      0.244
    7     -0.398     -0.275     -0.204     -0.201     -0.227     -0.269     -0.201     -0.398
    8     -0.391     -0.187     -0.150     -0.095     -0.144     -0.206     -0.095     -0.391
    9      0.021     -0.071     -0.015      0.022     -0.022     -0.011     -0.015     -0.071
   10     -0.422     -0.268     -0.217     -0.130     -0.205     -0.259     -0.130     -0.422
   11     -0.354     -0.352     -0.359     -0.509     -0.407     -0.393     -0.352     -0.509
   12     -0.062     -0.006     -0.007     -0.060     -0.024     -0.034     -0.006     -0.062
   13     -0.431     -0.312     -0.229     -0.240     -0.260     -0.303     -0.229     -0.431
   14     -0.282     -0.287     -0.235     -0.217     -0.246     -0.255     -0.217     -0.287
   15     -0.266     -0.142     -0.040     -0.047     -0.076     -0.124     -0.040     -0.266
   16      0.425      0.313      0.318      0.285      0.305      0.335      0.285      0.425
   17     -0.275     -0.004      0.047      0.119      0.054     -0.028     -0.004     -0.275
   18      0.142      0.086      0.067      0.047      0.067      0.085      0.047      0.142
   19     -0.126      0.130      0.141      0.278      0.183      0.106     -0.126      0.278
   20     -0.367     -0.253     -0.177     -0.195     -0.208     -0.248     -0.177     -0.367
   21     -0.129     -0.193     -0.095     -0.160     -0.149     -0.144     -0.095     -0.193
   22     -0.121     -0.150     -0.172     -0.107     -0.143     -0.137     -0.107     -0.172
   23      0.266      0.291      0.289      0.368      0.316      0.303      0.266      0.368
   24     -0.010     -0.017     -0.022     -0.020     -0.020     -0.017     -0.010     -0.022
   25     -0.247     -0.115     -0.211     -0.013     -0.113     -0.146     -0.013     -0.247
   26     -0.177      0.095      0.098      0.188      0.127      0.051      0.095      0.188
   27     -0.245      0.009     -0.045      0.111      0.025     -0.043      0.009     -0.245
   28      0.372      0.264      0.255      0.306      0.275      0.299      0.255      0.372
   29     -0.024      0.132      0.126      0.122      0.127      0.089     -0.024      0.132
   30     -0.393     -0.249     -0.173     -0.181     -0.201     -0.249     -0.173     -0.393
   31      0.165      0.211      0.190      0.153      0.185      0.180      0.153      0.211
   32      0.322      0.340      0.282      0.448      0.357      0.348      0.282      0.448
   33     -0.214     -0.187     -0.151     -0.067     -0.135     -0.155     -0.067     -0.214
   34      0.123      0.162      0.160      0.255      0.192      0.175      0.123      0.255
   35      0.051      0.031      0.034      0.080      0.048      0.049      0.031      0.080
   36     -0.118     -0.008      0.028      0.098      0.039      0.000     -0.008     -0.118
   37      0.323      0.336      0.194      0.258      0.263      0.278      0.194      0.336
   38      0.456      0.361      0.300      0.192      0.285      0.327      0.192      0.456
   39      0.464      0.333      0.275      0.327      0.312      0.350      0.275      0.464
   40      0.459      0.348      0.288      0.186      0.274      0.320      0.186      0.459
   41      0.339      0.241      0.186      0.159      0.196      0.231      0.159      0.339
   42      0.027     -0.037     -0.080     -0.045     -0.054     -0.034      0.027     -0.080
   43      0.308      0.170      0.119      0.090      0.126      0.172      0.090      0.308
   44      0.166      0.139      0.070      0.246      0.151      0.155      0.070      0.246
   45      0.383      0.317      0.236      0.340      0.298      0.319      0.236      0.383
   46      0.270      0.445      0.289      0.592      0.442      0.399      0.270      0.592
   47     -0.296     -0.219     -0.238     -0.157     -0.205     -0.227     -0.157     -0.296
   48     -0.223     -0.053     -0.004      0.044     -0.004     -0.059     -0.004     -0.223
   49     -0.186      0.054      0.084      0.164      0.100      0.029      0.054     -0.186
   50      0.463      0.352      0.317      0.261      0.310      0.349      0.261      0.463

Mean:    (0.027)      0.013      0.015      0.048      0.025      0.012      0.010      0.003
SD:        0.290      0.235      0.196      0.226      0.214      0.226      0.166      0.324

(w) (x) (y) (z)


Combined Forecasts



Case        JSEM FLOOR AREA       CUBE      RASEM   Combined        Min        Max
                                                   Forecasts
             (a)        (b)        (c)        (d)       avg.       min.       max.
                                                    (a to d)   (a to d)   (a to d)
         % Error    % Error    % Error    % Error    % Error    % Error    % Error
    1     -0.080     -0.070      0.041     -0.024     -0.033     -0.024     -0.080
    2      0.286      0.256      0.303      0.373      0.305      0.256      0.373
    3     -0.228     -0.167     -0.064     -0.221     -0.170     -0.064     -0.228
    4     -0.125      0.004      0.282     -0.222     -0.015      0.004      0.282
    5     -0.220     -0.188     -0.092     -0.179     -0.170     -0.092     -0.220
    6      0.004     -0.013     -0.024      0.076      0.011      0.004      0.076
    7     -0.044     -0.287     -0.305     -0.169     -0.201     -0.044     -0.305
    8     -0.253     -0.239     -0.149     -0.218     -0.215     -0.149     -0.253
    9     -0.011      0.094     -0.084     -0.049     -0.013     -0.011      0.094
   10      0.021      0.066      0.039      0.113      0.060      0.021      0.113
   11     -0.169     -0.092     -0.258     -0.150     -0.167     -0.092     -0.258
   12      0.091      0.271      0.238     -0.092      0.127      0.091      0.271
   13      0.095      0.041      0.240      0.093      0.117      0.041      0.240
   14      0.397      0.519      0.262      0.378      0.389      0.262      0.519
   15      0.176      0.341      0.502      0.316      0.334      0.176      0.502
   16     -0.153     -0.088      0.177     -0.127     -0.048     -0.088      0.177
   17      0.036     -0.233     -0.381      0.026     -0.138      0.026     -0.381
   18      0.377      0.471      0.391      0.150      0.347      0.150      0.471
   19      0.044     -0.005     -0.059      0.118      0.025     -0.005      0.118
   20     -0.191     -0.170      0.025     -0.131     -0.117      0.025     -0.191
   21      0.242      0.370      0.395      0.380      0.347      0.242      0.395
   22      0.322      0.315      0.206      0.385      0.307      0.206      0.385
   23     -0.138     -0.229     -0.362     -0.114     -0.211     -0.114     -0.362

Mean:      0.021      0.042      0.058      0.031      0.038      0.036      0.075
SD:        0.200      0.245      0.252      0.214      0.207      0.124      0.300

(w) (x) (y) (z)




Case        JSEM FLOOR AREA       CUBE      RJSEM   Combined        Min        Max
                                                   Forecasts
    1     -0.080     -0.070      0.041     -0.031     -0.035     -0.031     -0.080
    2      0.286      0.256      0.303      0.385      0.307      0.256      0.385
    3     -0.228     -0.167     -0.064     -0.218     -0.169     -0.064     -0.228
    4     -0.125      0.004      0.282     -0.167     -0.002      0.004      0.282
    5     -0.220     -0.188     -0.092     -0.195     -0.174     -0.092     -0.220
    6      0.004     -0.013     -0.024      0.075      0.010      0.004      0.075
    7     -0.044     -0.287     -0.305     -0.181     -0.204     -0.044     -0.305
    8     -0.253     -0.239     -0.149     -0.219     -0.215     -0.149     -0.253
    9     -0.011      0.094     -0.084     -0.018     -0.005     -0.011      0.094
   10      0.021      0.066      0.039      0.114      0.060      0.021      0.114
   11     -0.169     -0.092     -0.258     -0.173     -0.173     -0.092     -0.258
   12      0.091      0.271      0.238     -0.074      0.132     -0.074      0.271
   13      0.095      0.041      0.240      0.099      0.119      0.041      0.240
   14      0.397      0.519      0.262      0.382      0.390      0.262      0.519
   15      0.176      0.341      0.502      0.330      0.337      0.176      0.502
   16     -0.153     -0.088      0.177     -0.119     -0.046     -0.088      0.177
   17      0.036     -0.233     -0.381      0.014     -0.141      0.014     -0.381
   18      0.377      0.471      0.391      0.092      0.333      0.092      0.471
   19      0.044     -0.005     -0.059      0.113      0.023     -0.005      0.113
   20     -0.191     -0.170      0.025     -0.134     -0.118      0.025     -0.191
   21      0.242      0.370      0.395      0.390      0.349      0.242      0.395
   22      0.322      0.315      0.206      0.393      0.309      0.206      0.393
   23     -0.138     -0.229     -0.362     -0.118     -0.212     -0.118     -0.362

Mean:      0.021      0.042      0.058      0.032      0.038      0.025      0.076
SD:        0.200      0.245      0.252      0.215      0.208      0.124      0.301

(w) (x) (y) (z)




Case        JSEM FLOOR AREA       CUBE      RASEM   Combined        Min        Max
                                                   Forecasts
    1     -0.086     -0.083     -0.040     -0.006     -0.054     -0.006     -0.086
    2      0.018      0.116      0.215      0.056      0.101      0.018      0.215
    3     -0.148     -0.145     -0.105     -0.074     -0.118     -0.074     -0.148
    4     -0.114     -0.120     -0.196     -0.098     -0.132     -0.098     -0.196
    5      0.071      0.132      0.097     -0.016      0.071     -0.016      0.132
    6     -0.062     -0.062     -0.122      0.097     -0.037     -0.062     -0.122
    7      0.169      0.116      0.224      0.292      0.200      0.116      0.292
    8      0.045      0.034     -0.042      0.087      0.031      0.034      0.087
    9     -0.374     -0.390     -0.427     -0.280     -0.368     -0.280     -0.427
   10      0.122      0.042      0.265      0.095      0.131      0.042      0.265
   11     -0.118     -0.172     -0.113      0.024     -0.095      0.024     -0.172
   12      0.110      0.092      0.099      0.158      0.115      0.092      0.158
   13     -0.362     -0.369     -0.278     -0.298     -0.327     -0.278     -0.369
   14     -0.071     -0.106     -0.187     -0.186     -0.137     -0.071     -0.187
   15      0.095      0.126      0.019      0.062      0.076      0.019      0.126
   16      0.039      0.071     -0.064      0.258      0.076      0.039      0.258
   17      0.326      0.305      0.203     -0.222      0.153      0.203      0.326
   18      0.178      0.138      0.172     -0.009      0.120     -0.009      0.178
   19      0.422      0.407      0.588      0.520      0.484      0.407      0.588
   20     -0.061     -0.121     -0.130     -0.163     -0.119     -0.061     -0.163
   21     -0.030     -0.004     -0.093     -0.028     -0.039     -0.004     -0.093
   22      0.344      0.342      0.167      0.147      0.250      0.147      0.344
   23      0.427      0.420      0.567      0.257      0.418      0.257      0.567

Mean:      0.041      0.033      0.036      0.029      0.035      0.019      0.068
SD:        0.212      0.214      0.246      0.196      0.202      0.150      0.274

(w) (x) (y) (z)




Case        JSEM FLOOR AREA       CUBE      RJSEM   Combined        Min        Max
                                                   Forecasts
    1     -0.086     -0.083     -0.040     -0.039     -0.062     -0.039     -0.086
    2      0.018      0.116      0.215      0.053      0.100      0.018      0.215
    3     -0.148     -0.145     -0.105     -0.105     -0.126     -0.105     -0.148
    4     -0.114     -0.120     -0.196     -0.054     -0.121     -0.054     -0.196
    5      0.071      0.132      0.097      0.092      0.098      0.071      0.132
    6     -0.062     -0.062     -0.122      0.057     -0.047      0.057     -0.122
    7      0.169      0.116      0.224      0.222      0.183      0.116      0.224
    8      0.045      0.034     -0.042      0.152      0.047      0.034      0.152
    9     -0.374     -0.390     -0.427     -0.311     -0.375     -0.311     -0.427
   10      0.122      0.042      0.265      0.046      0.119      0.042      0.265
   11     -0.118     -0.172     -0.113     -0.076     -0.120     -0.076     -0.172
   12      0.110      0.092      0.099      0.139      0.110      0.092      0.139
   13     -0.362     -0.369     -0.278     -0.341     -0.337     -0.278     -0.369
   14     -0.071     -0.106     -0.187     -0.188     -0.138     -0.071     -0.188
   15      0.095      0.126      0.019      0.094      0.084      0.019      0.126
   16      0.039      0.071     -0.064      0.334      0.095      0.039      0.334
   17      0.326      0.305      0.203     -0.263      0.143      0.203      0.326
   18      0.178      0.138      0.172      0.062      0.137      0.062      0.178
   19      0.422      0.407      0.588      0.527      0.486      0.407      0.588
   20     -0.061     -0.121     -0.130     -0.137     -0.112     -0.061     -0.137
   21     -0.030     -0.004     -0.093      0.063     -0.016     -0.004     -0.093
   22      0.344      0.342      0.167      0.287      0.285      0.167      0.344
   23      0.427      0.420      0.567      0.170      0.396      0.170      0.567

Mean:      0.041      0.033      0.036      0.034      0.036      0.022      0.072
SD:        0.212      0.214      0.246      0.208      0.204      0.150      0.274

(w) (x) (y) (z)

Average mean (w, x, y & z): 0.036
