Development and Testing of a Method for
Forecasting Prices of Multi-Storey Buildings
during the Early Design Stage: the Storey
Enclosure Method Revisited
Franco Kai Tak Cheung
Doctor of Philosophy
School of Construction Management and Property
Queensland University of Technology
2005
To Ritz and my parents
Statement of Original Authorship
The work contained in this thesis has not been previously submitted for a
degree or diploma at any other higher education institution. To the best of my
knowledge and belief, the thesis contains no material previously published or written
by another person except where due reference is made.
Signature:
Date:
Abstract
Although design decisions that are made in the preliminary design stages of a
building are more cost sensitive than those that are made at later stages, previous
research suggests that the accuracy of building price forecasts improves only
slightly as the design develops. However, established conventional forecasting
methods lack measures of their own performance, which has inhibited the
development of simpler early-stage techniques.
One early-stage price forecasting model, the Storey Enclosure Method, which
was developed by James in 1954, uses the basic physical measurements of buildings
to estimate building prices. Although James’ Storey Enclosure Model (JSEM) is not
widely used in practice, it has been proved empirically, if rather crudely, to be a
better model than other commonly used models. This research aims firstly to advance
JSEM by using regression techniques and secondly to develop an objective approach
for the assessment of model performance.
To accomplish the first research aim, this research uses data from 148
completed Hong Kong projects for four types of building: offices, private housing,
nursing homes, and primary and secondary schools. Sophisticated features of the
modelling exercise include the use of leave-one-out cross validation to simulate the
way in which forecasts are produced in practice and a dual stepwise selection
strategy that enhances the chance of identifying the best model. Two types of
regressed models from different candidate sets, the Regressed Model for James’
Storey Enclosure Method (RJSEM) and Regressed Model for Advanced Storey
Enclosure Method (RASEM), are developed accordingly.
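The leave-one-out idea mentioned above can be illustrated with a minimal sketch. It assumes a single predictor and an ordinary least-squares line with an intercept; the function names `fit_line` and `loo_forecasts` and the sample figures are hypothetical and are not taken from the thesis, whose regressed models draw on larger candidate sets. Each project is forecast from a model fitted to all of the other projects, which mimics forecasting a new building from past data only.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_forecasts(xs, ys):
    """Forecast each observation from a line fitted to all the others."""
    forecasts = []
    for i in range(len(xs)):
        # Hold out project i and fit on the remaining projects only.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        forecasts.append(intercept + slope * xs[i])
    return forecasts


if __name__ == "__main__":
    # Invented figures for illustration only (not thesis data):
    # a size measure for each project and its price.
    areas = [1200.0, 1850.0, 2400.0, 3100.0, 4000.0]
    prices = [9.5, 14.1, 19.0, 23.8, 31.2]
    for actual, forecast in zip(prices, loo_forecasts(areas, prices)):
        print(f"actual={actual:.1f} forecast={forecast:.2f}")
```

Because the held-out project never influences its own forecast, the resulting errors are a fairer guide to out-of-sample performance than ordinary fitted residuals.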
In considering the RJSEM, RASEM, and the most commonly used alternative
early stage floor area and cube models, all of the models except JSEM are found to
be unbiased. The RJSEM and RASEM models are also examined for their
consistency using a structured approach that involves the use of both parametric and
non-parametric inference tests. This shows that although the RASEMs for different
building types are generally more consistent, they are not significantly better than the
other models. Finally, the combination of the forecasts that are generated from
different models to capture the different aspects of information from the models is
suggested as an alternative strategy for improving forecasting performance.
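As a rough illustration of the performance measures discussed above, the sketch below treats bias as the mean of the percentage errors and consistency as their standard deviation, and combines forecasts by a simple equal-weight average. These definitions, the sign convention for percentage error, the equal weighting, and all function names are assumptions made for illustration; the thesis defines its own measures and combination scheme in later chapters.

```python
from statistics import mean, stdev


def percentage_errors(forecasts, actuals):
    """Percentage error of each forecast, assumed as 100 * (forecast - actual) / actual."""
    return [100.0 * (f - a) / a for f, a in zip(forecasts, actuals)]


def bias_and_consistency(forecasts, actuals):
    """Return (mean, sample standard deviation) of the percentage errors."""
    errors = percentage_errors(forecasts, actuals)
    return mean(errors), stdev(errors)


def combine_equal_weights(*forecast_sets):
    """Average the forecasts of several models, project by project."""
    return [mean(values) for values in zip(*forecast_sets)]


if __name__ == "__main__":
    # Invented figures for illustration only (not thesis data).
    actuals = [100.0, 200.0, 150.0]
    model_a = [110.0, 190.0, 160.0]
    model_b = [95.0, 215.0, 140.0]
    combined = combine_equal_weights(model_a, model_b)
    for name, forecasts in [("A", model_a), ("B", model_b), ("combined", combined)]:
        bias, spread = bias_and_consistency(forecasts, actuals)
        print(f"{name}: bias={bias:.2f}% consistency={spread:.2f}%")
```

A mean near zero indicates an unbiased model, and a smaller standard deviation indicates more consistent forecasts.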
Acknowledgements
I am indebted to the following people for their time, help and contribution to
the production of this thesis.
Great appreciation is due to my former team mates at Levett & Bailey Ltd.,
including Hon Kong Yu, who allowed me to access the data; See Ping Wong, who
provided me with a valuable insight into estimating practices; and Anselm Chow,
who gave detailed explanations of the company’s recording system, and answered all
of my queries.
Gratitude is expressed to Dr. Derek Drew for suggesting the research topic
and for supervising this research project. Without his crucial suggestions, this
research would never have been started.
Thanks are due to all my colleagues at the City University of Hong Kong for
their support, and in particular to Professor Andrew Leung, Dr. S. M. Lo and Dr. S O
Cheung for their encouragement and advice, to Dr. Raymond Lee for introducing me
to the mathematical software Mathcad and to Dr. Eric Lee for giving me private
lessons on resampling methods.
Special acknowledgment is given to Dr. H P Lo, my local supervisor in Hong
Kong, who pointed me in the right direction for many of the statistical problems
encountered. His advice on the choice of techniques and proper mathematical
interpretation was particularly helpful. His patience in correcting my thinking is
greatly appreciated.
I am indebted to Professor Martin Skitmore for many things, such as his
extensive assistance, superb guidance, sharp advice, incredible patience and prompt
responses to my queries throughout this study. The time and effort that he spent
discussing my research during the occasion of his visit to Hong Kong and my stays at
QUT are highly appreciated. Without his guidance and advice, I would not have been
able to proceed and bring the research to completion.
Table of Contents
STATEMENT OF ORIGINAL AUTHORSHIP
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
CHAPTER 2 COST FORECASTING IN PRACTICE: A REVIEW
2.1 INTRODUCTION
2.2 BUILDING ECONOMICS
2.3 COST PLANNING AND CONTROL
2.4 COST FORECASTING IN THE COST PLANNING AND CONTROL PROCESS
2.5 DESIGN PROCESS AND DESIGNERS’ FORECASTS
2.6 EARLY STAGE FORECASTING IN PRACTICE
2.7 PROBLEMS OF EXISTING FORECASTING PRACTICE
2.7.1 Misconception of the relationship between level of detail and forecasting accuracy
2.7.2 Lack of theoretical background
2.7.3 Lack of performance evaluation
2.7.4 Inexplicability, unrelatedness and determinism
2.8 SUMMARY
CHAPTER 3 DEVELOPMENT OF FORECASTING MODELS
3.1 INTRODUCTION
3.2 DEFINITION OF COST MODEL
3.3 BRANDON’S “PARADIGM SHIFT”
3.3.1 Black box versus realistic models
3.3.2 Deterministic versus stochastic models
3.3.3 Deductive versus inductive models
3.4 MAJOR DIRECTIONS OF MODEL DEVELOPMENT
3.5 LIMITATIONS OF COST MODELS
3.5.1 Model assumptions
3.5.2 Reliance on historical data for prediction
3.5.3 Insufficiency of information and preparation time
3.5.4 Reliance on expert judgment
3.6 REVIEW OF COST MODELS IN USE
3.7 SIGNIFICANT ITEMS ESTIMATION
3.8 DISCUSSIONS ON RESEARCH OPPORTUNITIES
3.9 STOREY ENCLOSURE METHOD
3.10 REGRESSION ANALYSIS
3.11 REVIEW OF MODEL PREDICTORS
3.12 OCCAM’S RAZOR: PARSIMONY OF VARIABLES
3.13 SUMMARY
CHAPTER 4 PERFORMANCE OF FORECASTING MODELS
4.1 INTRODUCTION
4.2 MEASURES OF FORECASTING ACCURACY
4.3 BASE TARGET FOR FORECASTING ACCURACY
4.4 OVERVIEW OF MODEL PERFORMANCE AT VARIOUS DESIGN STAGES
4.5 SUMMARY
CHAPTER 5 METHODOLOGY
5.1 INTRODUCTION
5.2 RESEARCH FRAMEWORK
5.3 TYPES OF QUANTITY MEASURED IN SINGLE-RATE FORECASTING MODELS
5.4 SIMPLIFICATION OF JSEM
5.5 IDENTIFICATION OF A PROBLEM
5.6 DATA PREPARATION AND ENTRY
5.6.1 Data sample
5.6.2 Definition and classification of building types
5.6.3 Treating of outliers
5.7 MODEL BUILDING
5.7.1 Dependent Variables
5.7.1.1 Price Index Adjustment
5.7.1.2 Other Adjustments
5.7.2 Candidate variables
5.7.3 Fitting Criterion
5.7.3.1 Matrix Notation for Calculation of MSQ
5.7.4 Reliability analysis
5.7.4.1 Matrix Notation for Calculation of MSQ by Leave-one-out Method
5.7.5 Selection Strategies
5.8 MODEL ADJUSTMENT
5.8.1 Exclusion of candidates
5.8.2 Transformation of variables
5.9 COMPARISON OF BEST MODEL WITH OTHER MODELS
5.9.1 Choice of parametric and non-parametric inference
5.9.2 Statistical inference for bias
5.9.3 Statistical inference for consistency
5.10 TOOLS FOR COMPUTATION
5.11 SUMMARY
CHAPTER 6 ANALYSIS
6.1 INTRODUCTION
6.2 MODEL DEVELOPMENT
6.2.1 Data Collected
6.2.2 Candidates for Regression Models
6.2.3 Response for Regression Models
6.2.4 Selection of Predictors
6.2.4.1 Selected Predictors for RJSEMs and RASEMs
6.2.5 Model Transformation
6.3 PERFORMANCE VALIDATION
6.3.1 Forecasting Results
6.3.2 Normality Testing
6.3.3 Significance of Variable Transformation
6.3.4 Comparisons of Models
6.3.4.1 Models for Offices
6.3.4.2 Models for Private Housing
6.3.4.3 Models for Nursing Homes
6.3.4.4 Models for Schools
6.3.4.5 Discussions on model comparisons
6.4 COMBINING FORECASTS
6.5 SUMMARY
CHAPTER 7 CONCLUSIONS
7.1 INTRODUCTION
7.2 MODEL DEVELOPMENT
7.3 PERFORMANCE VALIDATION
7.4 COMBINING FORECASTS
7.5 IMPLICATIONS FOR PRACTICE
7.6 MODEL LIMITATIONS
7.7 OPPORTUNITIES FOR FURTHER RESEARCH
BIBLIOGRAPHY
APPENDIX A: APPROVAL LETTER FOR ACCESS OF COST ANALYSES
APPENDIX B: TENDER PRICE INDICES AND COST TRENDS IN HONG KONG, MARCH 2004 (PUBLISHED BY LEVETT AND BAILEY CHARTERED QUANTITY SURVEYORS LTD.)
APPENDIX C: ORIGINAL DATA
APPENDIX D: FORECASTS BY CROSS VALIDATION USING CONVENTIONAL MODELS
APPENDIX E: ERRORS AND PERCENTAGE ERRORS OF FORECASTS
APPENDIX F: RESULTS OF COMBINING FORECASTS
List of Figures
FIGURE 2-1: MODEL OF DESIGN PROCESS (SOURCE: MAVER 1970 P. 200)
FIGURE 2-2: LEVEL OF INFLUENCE ON PROJECT COST (IN PER CENT) (SOURCE: BARRIE AND PAULSON 1978 P. 154)
FIGURE 2-3: DESIGNERS’ COMMITMENT TO EXPENDITURE (SOURCE: FERRY ET AL. 1999 P. 96)
FIGURE 5-1: RESEARCH FRAMEWORK FOR IDENTIFICATION, SELECTION AND VALIDATION OF PRICE MODELS
FIGURE 5-2: ALGORITHM FOR DUAL STEPWISE SELECTION
FIGURE 5-3: ALGORITHM FOR COMPARISONS OF VARIANCES OF PERCENTAGE ERRORS
FIGURE 6-1: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE FLOOR AREA MODEL FOR OFFICES
FIGURE 6-2: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE LRASEM FOR OFFICES
FIGURE 6-3: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE JSEM FOR PRIVATE HOUSING
FIGURE 6-4: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE FLOOR AREA MODEL FOR PRIVATE HOUSING
FIGURE 6-5: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE CUBE MODEL FOR PRIVATE HOUSING
FIGURE 6-6: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE RJSEM FOR NURSING HOMES
FIGURE 6-7: BOX-COX PLOT OF PERCENTAGE ERRORS FOR THE RASEM FOR NURSING HOMES
FIGURE 6-8: TESTS OF HOMOGENEITY OF VARIANCES USING BARTLETT’S TESTS, KRUSKAL WALLIS TESTS AND MANN-WHITNEY U TESTS
List of Tables
TABLE 2-1: MODEL SELECTION CRITERIA (EXTRACTED AND MODIFIED FROM FORTUNE AND HINKS 1998)
TABLE 3-1: CLASSIFICATION OF THIS RESEARCH ACCORDING TO NEWTON’S DESCRIPTIVE PRIMITIVES
TABLE 3-2: PREVIOUS STUDIES ON MODELLING TECHNIQUES AND APPLICATIONS ACCORDING TO NEWTON’S CLASSIFICATION
TABLE 3-3: SUMMARY OF ESTIMATING TECHNIQUES (EXTRACTED FROM SKITMORE & PATCHELL 1990)
TABLE 3-4: ADJUSTMENT FOR THE FACTORS AFFECTING THE ESTIMATES IN THE STOREY ENCLOSURE METHOD
TABLE 3-5: WEIGHTINGS AND INCLUSIONS FOR INDIVIDUAL COMPONENTS IN THE STOREY ENCLOSURE METHOD
TABLE 3-6: THE RESULTS OF TESTS FOR THE CUBE, FLOOR AREA AND STOREY ENCLOSURE METHODS IN JAMES’ STUDY (SOURCE: JAMES (1954))
TABLE 3-7: SUMMARY OF THE MODELS DEVELOPED BY THE POST-GRADUATE STUDENTS OF THE DEPARTMENT OF CIVIL ENGINEERING AT LOUGHBOROUGH UNIVERSITY OF TECHNOLOGY (EXTRACTED FROM MCCAFFER 1975)
TABLE 3-8: SUMMARY OF FORECASTING TARGETS AND INFLUENCING VARIABLES IN PREVIOUS EMPIRICAL STUDIES
TABLE 4-1: MEASURES OF PERFORMANCE OF FORECASTS (SOURCE: SKITMORE ET AL. 1990 P. 22)
TABLE 4-2: FACTORS AFFECTING QUALITY OF FORECASTS – SUMMARY OF EMPIRICAL EVIDENCE (EXTENDED FROM THE SIMILAR TABLE IN SKITMORE ET AL. (1990, P. 20-21))
TABLE 4-3: PERFORMANCE OF DESIGNERS’ FORECASTS REVIEWED BY ASHWORTH AND SKITMORE (1983)
TABLE 5-1: COEFFICIENTS AND VARIABLES DESIGNATED IN JSEM
TABLE 5-2: CLASSIFICATION OF BUILDING PROJECTS ACCORDING TO BUILDING TYPES
TABLE 5-3: LIST OF CANDIDATE VARIABLES
TABLE 6-2: INCLUDED CANDIDATES, EXCLUDED CANDIDATES AND SELECTED PREDICTORS FOR RJSEMS AND RASEMS
TABLE 6-3: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR OFFICES
TABLE 6-4: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR PRIVATE HOUSING
TABLE 6-5: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR NURSING HOMES
TABLE 6-6: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RJSEM FOR SCHOOLS
TABLE 6-7: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR OFFICES
TABLE 6-8: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR PRIVATE HOUSING
TABLE 6-9: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR NURSING HOMES
TABLE 6-10: STEP-BY-STEP SELECTION RESULTS OF PREDICTORS FOR THE RASEM FOR SCHOOLS
TABLE 6-11: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR OFFICE
TABLE 6-12: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR PRIVATE HOUSING
TABLE 6-13: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR NURSING HOMES
TABLE 6-14: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RJSEM FOR SCHOOLS
TABLE 6-15: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR OFFICES
TABLE 6-16: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR PRIVATE HOUSING
TABLE 6-17: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR NURSING HOMES
TABLE 6-18: COEFFICIENTS, FORECASTS AND MSQS DETERMINED BY LEAVE-ONE-OUT METHOD FOR THE RASEM FOR SCHOOLS
TABLE 6-19: SIGNS OF COEFFICIENTS FOR SELECTED PREDICTORS
TABLE 6-20: CONTRIBUTIONS OF FLOOR AREA RELATED PREDICTOR TO RESPONSE
TABLE 6-21: CONTRIBUTION OF NON-FLOOR AREA RELATED PREDICTORS TO RESPONSES
TABLE 6-22: SUMMARY OF MEANS AND STANDARD DEVIATIONS OF PERCENTAGE ERRORS
TABLE 6-23: RESULTS OF NORMALITY TESTS FOR PERCENTAGE ERRORS ACCORDING TO BUILDING AND MODEL TYPES
TABLE 6-24: ESTIMATED LAMBDA VALUES ACCORDING TO BUILDING AND MODEL TYPES (FOR MODELS NOT SATISFYING NORMALITY ASSUMPTION ONLY)
TABLE 6-25: TWO-SAMPLE F-TESTS AND MANN-WHITNEY U TEST BETWEEN REGRESSED MODELS WITH UNTRANSFORMED VARIABLES AND WITH LOGARITHMIC TRANSFORMED VARIABLES
TABLE 6-26: TWO-SAMPLE MANN-WHITNEY U-TESTS BETWEEN MODELS FOR OFFICE AND PRIVATE HOUSING
TABLE 6-27: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 1 MODELS
TABLE 6-28: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 2 MODELS
TABLE 6-29: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 3 MODELS
TABLE 6-30: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 4 MODELS
TABLE 6-31: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 5 MODELS
TABLE 6-32: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 6 MODELS
TABLE 6-33: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 7 MODELS
TABLE 6-34: ACCURACY FOR COMBINED, MODEL AVERAGE, MINIMUM AND MAXIMUM FORECASTS FOR GROUP 8 MODELS
TABLE D-1: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR OFFICES
TABLE D-2: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR PRIVATE HOUSING
TABLE D-3: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR NURSING HOMES
TABLE D-4: FORECASTS BY CROSS VALIDATION USING THE CONVENTIONAL MODELS FOR SCHOOLS
TABLE E-1: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR OFFICES
TABLE E-2: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR PRIVATE HOUSING
TABLE E-3: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR NURSING HOMES
TABLE E-4: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE CONVENTIONAL MODELS FOR SCHOOLS
TABLE E-5: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR OFFICES
TABLE E-6: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR PRIVATE HOUSING
TABLE E-7: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR NURSING HOMES
TABLE E-8: ERRORS AND PERCENTAGE ERRORS OF FORECASTS FOR THE REGRESSED MODELS FOR SCHOOLS
TABLE F-1: COMBINED FORECASTS FOR GROUP 1 MODELS
TABLE F-2: COMBINED FORECASTS FOR GROUP 2 MODELS
TABLE F-3: COMBINED FORECASTS FOR GROUP 3 MODELS
TABLE F-4: COMBINED FORECASTS FOR GROUP 4 MODELS
TABLE F-5: COMBINED FORECASTS FOR GROUP 5 MODELS
TABLE F-6: COMBINED FORECASTS FOR GROUP 6 MODELS
TABLE F-7: COMBINED FORECASTS FOR GROUP 7 MODELS
TABLE F-8: COMBINED FORECASTS FOR GROUP 8 MODELS
Chapter 1 Introduction
Philosophy is a game with objectives and no rules. Mathematics is a game with rules and no objectives.
Anonymous
The forecasting approach for the prediction of building prices that is used in
practice has been criticized for misconstruing the relationship between level of detail
and forecasting accuracy (Bennett et al. 1979), for lacking solid theoretical support
(Brandon 1982; Skitmore 1988; Bon 2001), for lacking performance evaluation
(Morrison 1983; Raftery 1984a; Fortune and Lees 1996), and for being inexplicable,
unrelated, and deterministic (Bowen et al. 1987). Although many alternative
approaches and new models have been developed, solid evidence from surveys that
have been conducted in different countries suggests that they are rarely put into
practice (Akintoye et al. 1992; Fortune and Lees 1996; Bowen and Edwards 1998).
The majority of studies of model development have chosen to focus on the uniqueness
of a new model and the way in which it is different from other models (Raftery 1984a;
Newton 1990).
In the early design stage of a building project, the freedom to modify the
scope, requirements, standards and design is very high. This alone creates
high uncertainty in the building price, although later decisions on tendering
arrangements, procurement methods, the number of tenderers to be invited and so
forth, together with possible changes in market conditions as the design develops,
also have serious price implications. Although the design information available is
very coarse and
limited in the early design stage, construction clients are generally eager to know the
likely building price. Very often, this price refers to the lowest tender price.
Conventionally, practising forecasters measure the total floor area from a few sketch
drawings and make a forecast using the floor area method (or the cube method, before
the floor area method gained popularity). To make full use of the information
extracted from sketches, James proposed a rule-of-thumb method, called the
Storey Enclosure Method, which he claimed takes into account the effects on building
prices of physical shape, total floor area, the vertical positioning of the floor
area, storey heights and the sinking of usable floor area below ground level (e.g. a
basement). Like the floor area and cube methods, James’ method is a single rate
method, which uses the storey enclosure area as the quantity for measurement. To
determine this area, the area for each floor, the external wall area, the basement wall
area and the roof area are first measured. Then, these measured areas are multiplied
by their associated weightings. Finally, the products of these areas and weightings
are summed and the total is the storey enclosure area.
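The calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: the part names, areas, weightings and rate are hypothetical placeholders chosen for the example, and are not the weightings proposed by James (1954) or data from this study.

```python
from dataclasses import dataclass


@dataclass
class MeasuredArea:
    name: str
    area_m2: float    # measured area of this part of the building
    weighting: float  # hypothetical weighting, NOT taken from James (1954)


def storey_enclosure_area(parts):
    """Sum of each measured area multiplied by its associated weighting."""
    return sum(p.area_m2 * p.weighting for p in parts)


def single_rate_forecast(parts, rate_per_unit):
    """Single rate forecast: storey enclosure area times one unit rate."""
    return storey_enclosure_area(parts) * rate_per_unit


# Hypothetical two-storey building with a small basement.
parts = [
    MeasuredArea("ground floor", 500.0, 2.0),
    MeasuredArea("first floor", 500.0, 2.5),
    MeasuredArea("external walls", 600.0, 1.0),
    MeasuredArea("basement walls", 100.0, 2.0),
    MeasuredArea("roof", 500.0, 1.0),
]

total_area = storey_enclosure_area(parts)
forecast = single_rate_forecast(parts, rate_per_unit=100.0)  # hypothetical rate
print(total_area, forecast)
```

Because the weightings enter the sum as fixed multipliers, any change to them changes the forecast directly, which is why their origin in experience alone attracts criticism.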
Although James’ Storey Enclosure Model (JSEM) has not been developed
empirically, its forecasting performance, together with that of two other conventional
models, the floor area and cube models, has been calculated with empirical data for
comparison. James’ study of 1954 is a pioneering study in model exploration that
attempts to show model advancement empirically. It is able to show that forecasts
that are produced by his model are nearer to actual tender prices than those that are
produced by the other two models, and that the range of price variation is reduced
accordingly. Despite the better performance demonstrated by James, JSEM serves
primarily as a textbook method for forecasting, rather than a method that is used in
practice. Clearly, JSEM is more complicated than the floor area and cube
models in terms of measurement and ease of understanding. Moreover, a
major criticism is that the weightings are based purely on experience
(Wilderness Group 1964; Seeley 1996 pp.161-162; Ashworth 1999 p.251).
JSEM, which is considered to be the most sophisticated model of all of the
single rate models, has been chosen for further development in this research. The
idea of using areas of different parts of a building as variables allows for model
exploration using regression analysis. The major problem of JSEM, that of a lack
of rigor, can be solved by using advanced modelling techniques for model
development, and statistical inferences for performance validation. By following a
rigorous approach of cross validation to the further development of JSEM and the
subsequent examination of the developed model by statistical testing, it is expected
that the model will achieve a balance between the requirements of theory (science)
and practicability (technology) for forecasting building prices.
With reference to the variables identified in the JSEM, the primary aim of
this research is to develop regressed models for forecasting the lowest tender prices
of multi-storey buildings in the early design stage using a systematic and logical
approach. To achieve this aim, this research adopts the cross validation approach
for modelling using regression analysis, as it has been shown to be markedly superior
for small data sets (Goutte 1997). The accuracy of statistical inference in cross
validation is preserved by dividing at random a sample of data into two sub-samples,
an exploratory sub-sample, which is used to select a statistical model for the data,
and a validatory sub-sample, which is used for formal statistical inference (Fox 1997).
The cross validation algorithm developed in this study for modelling JSEM’s
variables is a significant contribution because it advances the model
building process. Although the data, i.e. the observed values for the candidates and
the response, used in this study are only for four different types of building projects,
the developed methodology for modelling is also applicable to data from other types
of buildings and other types of data. In revisiting James’ study, the specific
objectives of this research are: (1) to collect project data of multi-storey buildings; (2)
to classify the data according to the type of projects; (3) to develop a cross-validated
regression algorithm for model selection; (4) to generate regressed models of
different project types by the cross validation method using the variables in JSEM as
candidates; (5) to repeat (4) based on another set of variables that are modified from
the variables in JSEM.
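As an illustration of the cross validation idea, the sketch below holds out each project in turn as the validatory observation, fits a regression to the remaining projects, and forecasts the held-out price. It is a simplified sketch under stated assumptions: a single predictor and an ordinary least squares line stand in for the candidate variables and the selection algorithm developed in this study, and the areas and prices are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def leave_one_out_forecasts(xs, ys):
    """Forecast each project from a model fitted to all the other projects."""
    forecasts = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # exploratory data: every project but i
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        forecasts.append(intercept + slope * xs[i])  # validatory forecast
    return forecasts


# Hypothetical data: an area measure and the lowest tender price of 5 projects.
areas = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
prices = [2.1, 3.9, 6.2, 7.8, 10.1]

forecasts = leave_one_out_forecasts(areas, prices)
errors = [f - p for f, p in zip(forecasts, prices)]
print(errors)
```

Each forecast is produced without sight of the project being forecast, which mimics the position of a forecaster in practice.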
It is hypothesised that the new regressed models will outperform the
conventional forecasting models, i.e. the JSEM, the floor area and cube models.
The secondary aim is to prove the hypothesis. To accomplish this, the forecasting
accuracy of the developed models has to be tested against that of the conventional
models. An algorithm for selecting the appropriate tests for the comparisons is
designed. The specific objectives concerning the statistical inference are: (1) to
measure the forecasting accuracy in terms of bias and consistency; (2) to compare
the forecasting accuracy of these models by the use of different parametric and
non-parametric tests; and (3) to group the models that show the same potency
together if the developed models do not perform significantly better than the
conventional models.
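The measurement of forecasting accuracy in terms of bias and consistency can be illustrated as follows. The sketch assumes, for illustration only, that bias is taken as the mean of the percentage errors and consistency as their sample standard deviation; the measures actually adopted in this research are described in Chapter 5, and the figures below are hypothetical.

```python
from statistics import mean, stdev


def percentage_errors(forecasts, actuals):
    """Percentage error of each forecast relative to the actual lowest tender."""
    return [100.0 * (f - a) / a for f, a in zip(forecasts, actuals)]


def bias(errors_pct):
    """Assumed measure of bias: the mean percentage error."""
    return mean(errors_pct)


def consistency(errors_pct):
    """Assumed measure of consistency: sample standard deviation of the errors."""
    return stdev(errors_pct)


# Hypothetical forecasts and actual lowest tender prices for four projects.
forecast_prices = [110.0, 90.0, 105.0, 95.0]
tender_prices = [100.0, 100.0, 100.0, 100.0]

pe = percentage_errors(forecast_prices, tender_prices)
print(bias(pe), consistency(pe))
```

In this example the over- and under-forecasts cancel, so the bias is zero even though individual forecasts miss by up to 10 per cent; the two measures therefore need to be read together.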
The thesis is divided into three parts. The background to this research is
presented in Chapters 2 to 4, the empirical work is contained in Chapters 5 to 6, and
the conclusions are presented in Chapter 7.
Due to the difference in cost sensitivity, early design decisions that are
strongly influenced by forecasting accuracy have a stronger impact on the final value
of a building. In Chapter 2, the significance of early stage forecasting and
forecasting in practice are reviewed. Despite this probable strong impact, it is
found that accuracy is rarely monitored in real life forecasting. There is also a lack
of theoretical support for the widely adopted forecasting methods; by contrast,
a variety of forecasting cost models have been developed, mainly by
academia, arguably for the sake of publication only.
The development, use, classification and limitation of cost models are
summarised in Chapter 3. In particular, JSEM, as the model for further
development in this study, is extensively explained. This research applies
regression techniques to the variables in JSEM, and previous studies on the
application of similar techniques and the variables selected are also discussed. In
the process of model development, modellers always face a dilemma between
choosing a slightly complex model that is general but may be unrealistic, and
choosing a very complex model that is specific but may be unreliable. To resolve
this dilemma, the principle of parsimony in scientific theory and model development
is addressed.
Due to the limited information that is available for early-stage forecasting,
models are mainly operated in ‘black box’ mode. In the case of models that are
developed by regression, as in this research, performance validation is essential.
The different ways of measuring forecasting performance, and previous empirical
work on forecasting accuracy, are reviewed in Chapter 4.
The methodology is described in Chapter 5. The cost analyses for four
types of building were collected from a large quantity surveying practice in Hong
Kong. To employ the regression methodology for multi-storey buildings, the
number of variables in the original JSEM is trimmed down to a manageable level by
making an assumption about the variations in floor size. Some advanced features of
this methodology include the use of cross validation for reliability analysis, which
simulates the practical production of forecasts, and a dual stepwise selection strategy
that enhances the chance of identifying the best model. In the section concerning
the comparison of the models, two commonly used measures – bias and consistency
– are described, and statistical inference using parametric and non-parametric tests is
compared. To assist in the making of a proper statistical inference, a framework for
choosing the appropriate tests is also proposed.
The analysis in Chapter 6 contains three sections: model development,
performance validation, and combining forecasts. Eight regressed models were
developed according to two sets of candidates (one set from JSEM and one set that
was modified from JSEM) for the four types of building. The selected variables
were also transformed to seek a further improvement in forecasting accuracy. Each
regressed model was compared separately with the conventional models, and the
models possessing the same potency were grouped together. Finally, an approach
to combining forecasts to improve forecasting performance is demonstrated with
empirical data.
An overall summary, conclusions, and suggested further research are
presented in Chapter 7.
Chapter 2 Cost Forecasting in Practice: A Review
In science, it doesn't matter if you're wrong, as long as you're not stupid. In business, it doesn't matter if you're stupid, so long as you're not wrong.
Anonymous
2.1 Introduction
The total cost of a development includes the cost of land, building costs,
finance charges, legal charges, consultants’ fees, and so forth. In a broader sense, it
also includes the running, marketing, maintenance and repair costs. To manage the
economic aspect of a building development effectively, clients often employ
professionals of various disciplines, such as accountants, general practice surveyors
and quantity surveyors. Quantity surveyors, whose profession originates in the
measurement of the quantities of buildings, are responsible for giving advice on
building costs.
The economic aspects of building procurement play a very significant role,
because building cost is one of the major components of the total development cost,
next to the cost of land. Unlike the cost of land, which reflects the cost of
ownership and usage, building cost is determined by the building market through the
cost approach. In a market economy, it is the traded value to a contractor for the
procurement of a building.
At the design development stage, building cost planning and control is an
iterative process that is used to forecast the unknown building price based on
available drawings and specifications (i.e., costing a design) and the revision of
drawings and specifications to ensure that the building price falls within a
predetermined sum (i.e., designing to a cost) (Jaggar et al. 2002 pp.10-11). Design
decisions that are made during this process are crucial to the success of a project.
As the decisions that are made in the early design stages, especially before a detailed
design has been worked out, are more cost sensitive than those that are made in the
later stages, changes to design decisions in the later design stages or execution stage
may lead to serious redundancy. Thus, it is essential to produce an accurate cost
forecast, especially in the early design stage. The use of the right cost model is
therefore a fundamental concern.
The task of forecasting the cost of buildings is especially difficult because of
the heterogeneity of the design, procurement, and contractual arrangements; the
complexity of the resources and production methods that are involved; and the
lengthy cycle of building projects. The task of forecasting is, however, very
important in the design process, as design decisions are always made with reference
to the forecasts (outputs of the task) of building costs. An incorrect forecast will
inevitably lead to the ineffective use of resources.
In a typical building project, the quantity surveyor is held responsible for
giving strategic cost advice. The science and art of this function is what
distinguishes quantity surveying as a professional discipline (James 1954; Male 1990;
Connaughton and Meikle 1991; RICS 1992), and forecasting forms a core part of this
function. This chapter reviews the significance of early stage forecasting,
conventional forecasting practice, and the underlying problems of the traditional
forecasting approach.
2.2 Building Economics
To give professional cost advice, quantity surveyors should be well equipped
with knowledge of building economics. Building economics is the study of
economising the use of scarce resources throughout the development life cycle, from
conception to demolition (Bon 1989 p.5). It involves a combination of technical
skills, informal optimisations, cost accounting, cost control, price forecasting and
resource allocation (Raftery 1991 pp.4-5). In a broader sense, it can be considered
as a branch of general economics that involves the application of the techniques and
expertise of economics to the study of the construction firm, the construction process
and the construction industry (Hillebrandt 1985 p.3). The objective of seeking an
optimal allocation of resources for building clients distinguishes building economics
from cost and management accounting.
2.3 Cost Planning and Control
Practitioners refer to the process of applying economics principles to building
projects as cost planning and control. The purposes of cost planning and control are
to provide clients with a good value for money project, to achieve a balance of
design expenditure on various building components, and to keep expenditure within
the amount that is allowed by the client (Maver 1979; Karashenas 1984; Seeley 1996
p.6; Flanagan and Tate 1997 p.13; and Ashworth 1999 pp. 9-10). In practice, it
may involve the study of the client’s requirements, the possible effects on the
surrounding areas if the development is carried out, the relationship between space
and shape, the assessment of the initial cost, the reasons for, and methods of,
controlling costs and the estimation of the life of the building and materials
(Ashworth 1999 pp.9-10; Ferry et al. 1999, pp.26-28).
2.4 Cost Forecasting in the Cost Planning and
Control Process
To avoid ambiguities in the understanding of commonly used terms such as
‘cost planning’ and ‘cost forecasting’ for describing the activities involved, and
‘estimate’, ‘forecast’ and ‘prediction’ for describing the output produced, their
corresponding definitions are reviewed. The terms ‘cost planning’ and ‘cost
control’ are defined by Seeley (1996).
Cost planning – a systematic application of cost criteria to the design
process, so as to maintain in the first place a sensible and economic
relation between cost, quality and utility and appearance, and in the
second place, such overall control of proposed expenditure as
circumstances may dictate. (p. 22)
Cost control – all methods of controlling the cost of building projects
within the limits of a predetermined sum, throughout the design and
construction stage. (p. 23)
With reference to Seeley’s definitions, there are four key elements in the process of
cost planning and control. First, it is necessary to produce a base figure as a cost
target or a cost limit. Second, the analysis of cost and the production of a probable
building cost is an iterative process. Third, the cost study requires the application
of knowledge of how to relate building design to building economics. Fourth, the
cost target, or cost limit, is used to monitor the probable building cost. In short, the
process is an iterative process that forecasts the building cost based on available
information such as drawings and specifications (costing a design) and the revision
of drawings and specifications to ensure that the building cost falls within the limit of
a predetermined sum (designing to a cost) (Jaggar et al. 2002 p.11).
The terms ‘cost’ and ‘cost forecasting’ are defined in the first chapter of the
book, Cost Modelling, edited by Skitmore and Marston (1999):
Cost – the cost of the contract to the client. This is the value of the
lowest bid received for the contract, or the contract sum. (p. 18)
Cost forecasting – the process of forecasting the client’s cost. Cost
forecasting is a part of the cost evaluation (planning and control)
process. (p. 18)
Obviously, the cost of a building under a building contract formed between a
building owner and a contractor is different from the cost of production of that
building. Their relationship varies according to many unquantifiable factors such as
market conditions and project risks. As one person’s price is another person’s
cost, the terms ‘price’ and ‘cost’ of a building refer to the amount received by a
contractor and the amount paid by an owner respectively (Raftery 1991, pp. 30-32).
To avoid ambiguity, the terms ‘price’ and ‘cost’ are used synonymously throughout
this thesis in the sense of the cost to building owners.
According to the definitions of cost planning and cost forecasting, the latter
determines what the future cost will be, whereas the former determines what it
should be. Cost forecasting is an input to the cost planning process, or a
sub-process of cost planning. The importance of cost forecasting is often
understated compared with cost planning. Armstrong (1985 p.6) argues that
alternative plans can be compared only if reasonable forecasts can be made, and that
forecasting should be considered as being as important as planning.
The term ‘forecast’ is distinguished from ‘estimate’ and ‘prediction’. An
estimate is made of quantities that may exist before, during or after the event under
consideration. Forecasting requires a prior estimate, and is a subset of the
estimating task (Skitmore et al. 1990 p.3). An estimate of a future event must by
definition be a forecast, whereas an estimate of an event that is based on information
that contains the event itself is a prediction (Skitmore and Marston 1999 p. 19). In
statistical parlance, a prediction is an estimate of formulae. A forecast, however, is
an estimate of a similar value that is outside of the database (Skitmore et al. 1990
p.4). As this research concerns the development of cost models for the estimation
of future events, the term ‘forecast’ is used throughout the thesis.
2.5 Design Process and Designers’ Forecasts
Making appropriate design decisions is crucial to the success of a project,
because design changes that involve vast expenditure, future changes or variations in
decisions, especially after the commencement of construction, often lead to
redundancy and waste in terms of work completed and resources deployed. Some
decisions on design may have long-term consequences or may be unrecoverable.
Design decisions are solutions to problems of function, form, time and economy for
buildings (Peña and Parshell 2001). Referring to Figure 2-1, which exhibits the
process of the search for design solutions as an iterative process of analysis,
synthesis, and appraisal (Maver 1970), it can be seen that building cost forecasts are
used at the appraisal stage to assist in the making of decisions to achieve an
economic objective. These forecasts are also referred to as ‘designers’
forecasts’ (Ashworth and Skitmore 1983), as it is the building design that gives the
information for forecasting and determines whether value can be achieved at an
acceptable cost (Morton and Jaggar 1995 p. 9). As clients need reliable cost advice
to enable the assessment of the viability of a project as soon as is possible (Fortune
and Lees 1994), designers’ forecasts help to make them aware of their probable
financial commitments before any extensive design work is undertaken (Seeley 1996
p. 54).
The outline plan of work of the Royal Institute of British Architects (RIBA)
(RIBA 1991) divides the construction process into 12 stages. It gives a
comprehensive picture of the information that is required and the tasks that need to
be completed at each stage of work. There are four stages in between the
appointment of various professionals and the production of tender information.
They are the feasibility, outline proposal, sketch design and detailed design stages.
During these stages, quantity surveyors are responsible for producing designers’
forecasts according to the information that is provided, and the most important goal
of these forecasts is to give a forecasted value of the work that is as close to the
unknown value of the lowest tender as possible. Although designers’ forecasts at
different stages share the same goal, the levels of influence differ according to the
qualities of these forecasts. Figure 2-2 illustrates the level of influence of the
different project stages on project cost. It shows that the level of influence drops
drastically from the planning and design stage to the procurement and construction
stage, even though the percentage of actual cost spent in relation to the overall
building cost is small in the former stage. This is also reinforced by current studies,
which suggest that the commitment of a construction cost before a sketch design is
formalised may amount to over 80% of the final potential cost (Skitmore 1985; Ferry
et al. 1999 pp.95-96). Figure 2-3 shows the suggested accumulated commitment
expenditure against the design time. As demonstrated, early decisions are more
cost sensitive, and thus the quality of early stage forecasts plays a more influential
role in the final value of buildings than the quality of later forecasts.
Skitmore et al. (1990 p.5) suggested five primary determinants that affect the
quality of forecasting: the nature of the target, the information used, the forecasting
technique used, the feedback mechanism used and the person providing the forecast.
The forecasting technique for early stage forecasting is identified as the study area
for this research.
Figure 2-1: Model of design process (Source: Maver 1970 p.200)
Figure 2-2: Level of influence on project cost (in per cent) (Source: Barrie and
Paulson 1978 p. 154)
Figure 2-3: Designers’ commitment to expenditure (Source: Ferry et al. 1999 p.
96)
2.6 Early Stage Forecasting in Practice
Bennett et al. (1979) classified conventional designers’ forecasting techniques
into eight categories: cost limit calculation, floor area method, functional unit method,
elemental cost estimation, lump sum estimation, cost per meter squared for
functional use, approximate quantities, and pricing the bill of quantities. None of
these techniques takes into account how building cost is actually incurred by
contractors. In traditional procurement, the responsibilities of design and
construction are carried out separately by two groups of professionals. Designers’
forecasts are usually prepared by professional quantity surveyors, who do not
normally have access to the cost data in contractors’ accounts. These data show how
the actual cost of construction is incurred by a contractor. Due to the lack of these data,
forecasters can only refer to the prices and unit rates from returned tenders. Out of
the eight types of techniques involved, the most frequently used before the
preparation of a tender are the floor area method, elemental cost estimation and
approximate quantities. The first method assumes that the building price is
proportional to its floor area. The second method divides a building into a set of
elements and assumes that the cost of an element is proportional to the unit of
measurement that is defined for that element. The third method requires the
calculation of quantities of the major items of a building and pricing them by means
of composite unit rates. The popularity of these methods is unrelated to their
efficacy. Moreover, there is a major criticism about the lack of a theoretical
relationship for the application of these methods at different design stages.
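The proportionality assumptions behind the floor area method and elemental cost estimation can be written out as a short sketch. The function names, element names, quantities and unit rates below are hypothetical and serve only to show the form of each calculation.

```python
def floor_area_forecast(gross_floor_area_m2, rate_per_m2):
    """Floor area method: price assumed proportional to total floor area."""
    return gross_floor_area_m2 * rate_per_m2


def elemental_forecast(elements):
    """Elemental estimation: each element's cost is assumed proportional to
    the quantity measured in the unit defined for that element."""
    return sum(quantity * unit_rate for quantity, unit_rate in elements.values())


# Hypothetical elements: name -> (quantity, unit rate per unit of quantity).
elements = {
    "substructure": (500.0, 800.0),
    "external walls": (1200.0, 600.0),
    "roof": (500.0, 400.0),
}

print(floor_area_forecast(2000.0, 1500.0))
print(elemental_forecast(elements))
```

The approximate quantities method has the same summed form, with measured quantities of the major items priced at composite unit rates.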
Conventional forecasting techniques are applied when there is a trade-off
between estimation accuracy and the time that is available for forecasting, or
between forecasting accuracy and the adequacy of available forecasting information
(Taylor 1984). According to the results of surveys on the forecasting techniques that
are employed by practitioners in Nigeria (Akintoye et al. 1992), South Africa (Bowen
and Edwards 1998) and the United Kingdom (Fortune and Lees 1996; Fortune and
Hinks 1998), conventional forecasting methods are still in widespread use, and their
applications outnumber those of the newer models. The UK survey, which was
conducted by Fortune and Hinks (1998), also prioritised the model selection criteria
for practising forecasters. Table 2-1 shows the identified model selection criteria in
descending order of importance. The two highest-ranked criteria are the availability
of data and the data that is needed for a model.
Table 2-1: Model Selection Criteria (extracted and modified from Fortune and
Hinks 1998)
Model selection criteria identified by Fortune and Hinks (1998) (UK), in descending order of importance:
Amount of project data availability
Data need for model
Forecasters understanding of the model
Time available for forecast preparation
Project type
Accuracy of model output
Forecasters experience of model in-use
Amount of risk in project decisions
Ease of model application
Feedback from previous forecasts
Complexity of the project
Speed of the model in use
Human resources required to operate model
Site characteristics of the project
Level of awareness of new models
Flexibility of model in use
Project size
Nature of the client
Market conditions
Cost of using model
Design consultants for the project
Quality levels required in the project
Availability of computers for use with model
Relationships between the forecaster and manager
Anticipated height of project
Geographic location of the project
Model selection criteria identified by Akintoye et al. (1992) (Nigeria): other criterion found: (1) Expected frequency of model use
Model selection criteria identified by Bowen and Edwards (1998) (South Africa): other criteria found: (1) Cost of Project; (2) Client Sophistication
Although the approximate quantities method is generally thought to be more
accurate (Fortune and Lees 1996), as it utilises more data, a recent study surprisingly
found the opposite result, that the floor area method is significantly more accurate than
the approximate quantities method (Skitmore and Drew 2003). Moreover, the
approximate quantities technique requires more detailed design information and more
time to prepare. By contrast, the floor area method is very rough, and requires much
less information and effort to produce. As design decisions that are made before the
completion of the sketch design stage are far more important than those that are made
afterwards (refer to Figures 2-2 and 2-3), it would be more worthwhile to spend time on
improving the early stage forecast, from the perspective of cost and benefit.
2.7 Problems of Existing Forecasting Practice
2.7.1 Misconception of the relationship between level of detail and
forecasting accuracy
Bennett et al. (1979) found that some forecasters applied quite detailed
estimating techniques at a very early stage of the design planning process without
taking into account the corresponding accuracy. Practising forecasters generally
believe that forecasts that are produced from more detailed quantities are more
accurate. Thus, forecasters usually attempt to measure quantities in as much detail
as possible within the limitations on the available data and allowable time. This
explains why forecasters ranked very highly the three model selection criteria of
“amount of project data available” (most important), “data needed for model”
(second most important) and “time available for forecast preparation” (fourth most
important) in the model selection criteria survey.
The conviction of practising forecasters that the more detailed a forecast is,
the higher its accuracy, remains a proposition only. Skitmore (1991) highlights the
need for the assessment of the performance of individual techniques:
“the standard construction price forecasting texts all assert that more
detailed forecasting techniques such as those using approximate
quantities are ipso facto necessarily of better quality than coarser
techniques such as the floor area method . . . very little research
seems to have been attempted in establishing the validity of this
assertion or of the relative quality of individual techniques.” (p. 12)
Ironically, empirical evidence of forecasting accuracy reveals that very little
improvement can be made to overall accuracy simply by increasing the level of detail
and the complexity of quantity-based methods (Ashworth and Skitmore 1983; Ross
1983; Morrison 1984; Beeston 1987). This could be due to the fact that factors
such as the type, size and shape of buildings that are not counted in quantity-based
methods have a greater significance, and that prices are closely related to market
forces and are therefore, to an extent, divorced from actual costs (Skitmore 1995).
More empirical studies on forecasting accuracy are reviewed in Chapter 4.
2.7.2 Lack of theoretical background
In the preface of the book, Building as an Economic Process: An Introduction
to Building Economics, Bon (1989) raises the question “why then is building
economics developing at such a sluggish pace, and what are the reasons for its lack
of professional recognition?” He opines that it is because the field lacks a
theoretical foundation. Although effort has been expended in the development of
advanced forecasting systems, theoretical development has not been forthcoming
(Skitmore 1988). With the assistance of information technology today, forecasting
researchers are now faced with an unmanageable amount of data but no theoretical
basis for analysis (Skitmore and Patchell 1990).
2.7.3 Lack of performance evaluation
Forecasters generally assume that a forecast is correct, and that the error is in
the difference between the forecast and tender price (Morrison 1983; Fortune and
Lees 1996). In his study of cost planning and the forecasting techniques that are
used in practice, Morrison (1983) finds that no forecaster in practice monitors their
own forecasting performance against received tenders. Forecasters are too
optimistic about their own forecasting performance, and pay very little explicit
attention to the confidence limits that are attached to the forecasted range of prices
within which the eventual outcome is expected to fall (Bowen and Edwards 1985a).
Practitioners often neglect the importance of producing accurate forecasts.
An opinion survey was conducted amongst architects and quantity surveyors, and
found that a significant number of respondents expected a great degree of accuracy
from price forecasts that are produced by quantity surveyors (Bowen and Edwards
1985a). Empirical studies also show that clients are generally dissatisfied with the
quality of strategic cost advice that is provided by their professional advisors (Ellis
and Turner 1986; Proctor et al. 1993). These studies reveal that there is room for
forecasters to improve, that forecasters have traditionally had no awareness of their
own performance and that forecasters should monitor and find ways to improve the
quality of cost advice to satisfy the needs of their clients.
2.7.4 Inexplicability, unrelatedness and determinism
The use of forecasting methods in practice is subjective, although research
studies on the formalisation of the model selection process have been carried out
(Fortune and Hinks 1997, 1998). Forecasters sometimes use a mixture of different
techniques to manage the forecasting task without a clear rationale. For example, a
forecaster may use the floor area method to forecast a part of the work for which they
have little data to refer to, and use the approximate quantities method for the rest of
the work for which more detailed data exists. These conventional methods were
mainly developed by rule of thumb without any attention being paid to the theory
behind them, and their use in combination is theoretically baseless.
The reliability of the forecasts that are produced by conventional methods
depends on the reliability of each quantity value, the reliability of each unit price rate
value, the number of items and the collinearity of the quantity and rate values
(Skitmore and Patchell 1990). It is doubtful, however, that unit price rates that are
derived deterministically from a number of historical projects can produce accurate
forecasts. Moreover, to use process-biased data (e.g. historical price rates, which
tend to reflect the utilisation of available resources) for design-biased forecasting
models would be to imply either that production methods do not differ, or that
differing production methods do not significantly affect cost, both of which are
patently untrue (Bowen and Edwards 1985a). Furthermore, the supposition that
forecasts will be accurate only if the quantities and unit price rates can be determined
ignores the variability of unit price rates (Flanagan and Norman 1983). There is no
explicit qualification with regard to the inherent variability and uncertainty of the
conventional models.
To conclude, conventional forecasting methods and approaches suffer from
their inexplicability, unrelatedness and determinism (Brandon 1982; Wilson 1982;
Taylor 1984; Bowen and Edwards 1985a, 1985b; Bowen et al. 1987). In short,
these approaches fail to explain the systems they purport to represent, fail to show
the relationship and interdependency between the variables and fail to consider the
variability and uncertainty of forecasting.
2.8 Summary
Building price forecasting is a sub-process of cost planning. It helps
decision makers to be aware of the probable financial commitments before extensive
design work is undertaken. After all, a decision to build can be put forth, or
alternative plans compared, only if a reasonable forecast can be made.
Although the forecasts that are made at different stages have similar functions,
their levels of influence are different, because a design decision that is made in the
early stages is more cost sensitive than the same decision made later. Thus, early
stage forecasts play an influential role in the final value of buildings.
At the early design stage, the type of information that is available for the use
of forecasting is usually very rough, and practising forecasters use simple single unit
methods, such as the floor area method, to accomplish the forecasting task.
Practitioners generally believe that accuracy is proportional to the level of detail of
the forecast. This perception is reflected in their choice of forecasting model, in
that they consider the amount of data available to be the most significant criterion.
Paradoxically, the simple floor area method is found to be more accurate than the
detailed approximate quantities method.
A few empirical surveys on forecasting practices have been undertaken in
different countries. They all show that conventional forecasting methods, such as
the floor area and approximate quantities methods, still dominate, despite the fact
that plenty of new alternative models have been developed. Several problems of
existing forecasting practices are identified. They include the misconception of the
relationship between level of detail and forecasting accuracy, the lack of theoretical
background, the lack of performance evaluation, and the inexplicability,
unrelatedness and determinism that are rooted in the forecasting approach.
Therefore, the direction of development for new models should focus on the features
of logical transparency (i.e., be theoretically supported), interdependence (i.e., show
the relationship between variables) and stochastic variability (i.e., allow the output to
be expressed in probability terms). The performance of new models should also be
measured empirically to demonstrate their forecasting ability.
Chapter 3 Development of Forecasting Models
If the moon's face is red, of water she speaks.
Saying of the Zuni Indians of the Southwest
3.1 Introduction
The first recorded forecasting method was the cube method, which was
invented about 200 years ago (Skitmore et al. 1990 p. xix). The more widely used
floor area method was developed around 1920 (Skitmore et al. 1990 p. xix).
Starting from the mid-1950s, more and more research has focused on the
development of alternative forecasting cost models. One of the pioneers, James,
developed the storey enclosure method in 1954 as an alternative method to the floor area
and cube methods for early stage forecasting. As a method developed 50 years ago,
it possesses the inherent problems that are explained in Chapter 2. However, James
identifies some possible variables other than total floor area and building volume that
might influence building cost. These variables attempt to explain the variability of
building shapes, the vertical positioning of the floor areas, storey heights and the
presence of basements in the design of a building. The author also demonstrates
(although only through a very crude comparison) that the accuracy level of his
proposed storey enclosure method is greater than the floor area and cube methods.
The storey enclosure model is considered to be the most sophisticated model of all of
the single price-rate models (as elaborated in Section 3.8) that are used for
forecasting in the early design stage, but despite the empirical evidence for the
performance of the storey enclosure model, it has not been widely used by practising
forecasters.
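The single price-rate idea that underlies the floor area and cube methods can be sketched in a few lines. This is an illustration only: the function name, the rates and the building dimensions are hypothetical, and the storey enclosure method itself applies weightings to floor, wall and roof areas that are not reproduced here.

```python
def single_rate_forecast(quantity: float, rate: float) -> float:
    """Single price-rate model: forecast price P = q * r."""
    if quantity < 0 or rate < 0:
        raise ValueError("quantity and rate must be non-negative")
    return quantity * rate


# Hypothetical building: 10 storeys of 500 m2 each, 3.2 m storey height.
gross_floor_area = 10 * 500.0        # m2
volume = gross_floor_area * 3.2      # m3

# Hypothetical unit rates, assumed to come from comparable past projects.
floor_area_forecast = single_rate_forecast(gross_floor_area, 12_000.0)  # rate per m2
cube_forecast = single_rate_forecast(volume, 3_800.0)                   # rate per m3
```

The two methods differ only in the quantity that is measured; all of the remaining variability between buildings has to be absorbed by the single rate.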
The conventional methods, such as the approximate quantities and elemental
cost methods, are the cost models that express building costs as a function of
quantities and unit rates. Extensive studies on the subject of cost modelling were
conducted in the mid-1970s. Researchers started to apply statistical techniques to
modelling. A wider variety of cost modelling techniques in the categories of
simulation, generation and optimisation have been developed in the past 30 years.
However, a lot of research on model development focuses on the way in which new
alternative models are different from other models, and stresses their uniqueness.
There is a lack of clear demonstration of the applicability of these models, which is
considered to be the biggest obstacle to their practical application.
3.2 Definition of Cost Model
The English word, ‘model’, comes from the Latin word, ‘exemplum’, which
means the manner, fashion, or example to be followed, a precedent and an example
of what may happen. A model is a representation of a structure, or an “organised
body of mutually connected and dependent parts” (Holes 1987). The etymology
suggests that a model only represents the general picture of what may occur. It is
clear enough from its definition that uncertainties do exist within it. A model that is
developed from historical information or experience can represent reality, but it does
not thereby become reality (Beeston 1974; Bowen 1984).
Seeley (1996 p. 202-203) defines the word ‘model’ as “a procedure
developed to reflect, by means of derived processes, adequately acceptable output for
an established series of input data”. Therefore, a building cost forecasting model is
a system that produces forecasted prices (output) from historical data (input).
Beeston (1987 p. 46) considers that all forecasting methods can be described as cost
models, which are classified as in-place quantity-based, descriptive or realistic, and
their task may be to forecast the cost of a whole design or of an element of it, or to
calculate the cost effect of a design change.
Cost models are technical models that are used to assist in the evaluation of
the financial implication of building design decisions (Maver 1979). Skitmore and
Marston (1999 pp.2-4) differentiate technical models from isomorphic models. The
former type features an important step in the abstraction of the most significant
influencing elements at the beginning of the model development process, whereas
the latter type involves the mapping of every influencing element within the results,
which is expensive and is not cost effective. As buildings are composed of
thousands of items, involve hundreds of companies in their production and take years
to complete, the number of elements that influence building costs is huge. Building
a cost model requires the selection of a sub-set of major influencing elements, which
is an exercise in cost-benefit trade-off. Even if the resources were available, it is
impossible to construct an isomorphic model for building costs due to individual
variation between projects (Kenley and Wilson 1986).
The purposes of cost models are to forecast the total cost that the client will
have to pay for the building at any stage in the design evolution, to compare a range of
actual design alternatives at any stage in the design evolution, to compare a range of
possible design alternatives at any stage in the design evolution, and to forecast the
economic effects upon society of changes in design codes and regulations (Skitmore
and Marston 1999 p. 9).
3.3 Brandon’s “Paradigm Shift”
Although many experimental cost models were generated in the 1970s (for
example: Buchanan (1972), Regdon (1972), Kouskoulas and Koehn (1974), Braby
(1975), McCaffer (1975), Wilson and Templeman (1976), Flanagan and Norman
(1978)) few are able to challenge the existing forecasting approaches. Nobody had
probed the possibility that the existing forecasting models might actually be wrong
until Brandon (1982) addressed the need for a paradigm shift in building cost
research. He doubts the reliability of existing forecasting models, and urges the
development of a cost model that is founded on solid theory. With the assistance of
computer technology, which makes complicated calculation much easier than before,
simulation is suggested as the direction for further research investigation, because it
gives a better understanding of why certain costs arise.
This new approach sets out more explicit and sound criteria for model
development. Brandon’s view is inspiring and visionary. In response to Brandon’s
suggestion, Bowen and Edwards (1985a) review the existing paradigm, and address
who it is that needs a new paradigm and why it must be a new paradigm. The
authors believed that the new approach to cost modelling and price forecasting after
the shift would entail the recognition both of the continuing need for historically
derived data in the exploration of cost trends and relationships, and the recognition of
the importance of the building process by the incorporation of significant aspects of
resource utilisation into the estimation methods. They also believed that the new
approach would insist on inferential statements backed by statistically reliable data,
that the approach would be stochastic in creatively dealing with future uncertainty
through the use of probabilistic techniques, and that it would simulate reality and be
capable of demonstrating the strength and associative characteristics of the
relationships that exist between the factors involved. Forecasters would then profit
from the knowledge base that would be gained through their expert understanding of
the field, and be capable of using this systematically to provide logically coherent
solutions to cost modelling and price forecasting problems.
Beeston (1987) does not rule out the use of descriptive methods (that is,
those that contain variables that describe the design and its environment by
measurements of such factors as size, shape, type of construction and location),
despite their inherent deficiencies. He considers that they would be suitable both
for forecasting at the early planning stage and for forecasting the maintenance costs
of estates.
Both Beeston (1987 p. 18) and Bowen et al. (1987) suggest that the
development of modelling systems for the purpose of design economics should
attempt to represent as closely as possible the way in which costs are actually
incurred. As is highlighted in Chapter 2, the conventional approach is ill equipped
because of its inexplicability, unrelatedness and determinism, and thus the
development of new cost models should shift towards logical transparency,
interdependence and stochastic variability (Bowen et al. 1987).
3.3.1 Black box versus realistic models
There are two distinct ways of representing costs – the realistic approach and
the “black box” approach (Beeston 1983). The realistic approach attempts to
represent the ways in which costs arise, whereas the black box approach does not.
The former approach identifies all of the direct causes of cost, and measures them
directly. This involves the detailed comparison of methods and prototype structures,
and thus this approach has the best potential accuracy. However, the data that is
required for the realistic approach is extremely difficult, if not impossible, for
forecasters that represent clients to acquire (Hardcastle 1984). Although it is
possible, even at the early design stage when information is scant, to use the realistic
approach through the simulation of production operations (such as CPS (Bennett and
Ormerod 1984) and CASPAR (Thompson and Willmer 1985)), forecasters still
prefer to use black box models. This is partly because the way that cost is incurred
is not a perfect function of the building design, and thus forecasters have to make
additional assumptions to convert design information into production information if
the realistic approach is used. These additional assumptions will inevitably create
extra complications.
Thus, models for very early stage forecasting are unavoidably inexplicable,
but their performance can still be judged, and indeed, the justification for the black
box approach rests on its actual performance. It is measured by comparing the
output of the model that is based on the black box approach in response to certain
stimuli with the output of the prototype under the same stimuli.
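A minimal sketch of how the performance of a black box model might be judged against actual outcomes is given below. The function name and the sample figures are hypothetical, and the two measures chosen here (mean percentage error as a measure of bias, and the coefficient of variation of the forecast/tender ratios as a measure of consistency) are assumptions made for illustration rather than the measures adopted in this research.

```python
from statistics import mean, stdev


def forecast_performance(forecasts, tenders):
    """Return (mean percentage error, c.v. of forecast/tender ratios in %)."""
    if len(forecasts) != len(tenders) or len(forecasts) < 2:
        raise ValueError("need two or more paired forecasts and tenders")
    # Signed percentage error of each forecast relative to its tender price.
    errors = [100.0 * (f - t) / t for f, t in zip(forecasts, tenders)]
    bias = mean(errors)
    # Spread of the forecast/tender ratios, relative to their mean.
    ratios = [f / t for f, t in zip(forecasts, tenders)]
    cv = 100.0 * stdev(ratios) / mean(ratios)
    return bias, cv


# Hypothetical forecasts and the tender prices later received (same units).
bias, cv = forecast_performance(
    forecasts=[110.0, 90.0, 100.0, 105.0],
    tenders=[100.0, 100.0, 100.0, 100.0],
)
```

A forecaster who recorded these two figures for every project would have the kind of performance record that Section 2.7.3 reports to be missing in practice.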
Both the black box approach and the realistic approach have their raison
d'être. Choosing which of them to use depends on the purpose of the model
(Skitmore and Patchell 1990). The realistic approach needs structural validation to
test its soundness, but it has the benefit of being explanatory. However, the black
box approach uses model performance in model testing.
3.3.2 Deterministic versus stochastic models
A model without a formal measure of uncertainty is, by definition, a
deterministic model. Conventional models generally only give a single-figure
estimate as their output without recognising the reality of the inherent variability and
uncertainty, and are thus deterministic models. The variability and uncertainty are
not formally assessed, but are more often dealt with intuitively by forecasters. By
contrast, if the duration and cost of activities or groups of activities are recognised as
being uncertain, then they will be modelled as stochastic variables using a
probabilistic approach (Bowen and Edwards 1985a). Formal measures of
uncertainty may be articulated as the associated coefficient of variation (as in
regression) or the cumulative frequency distribution (as in the Monte Carlo
Simulation) (Newton 1990). The application of probabilistic approaches to the
problems of building economics has been demonstrated through various studies such
as those of Spooner (1974), Mathur (1982), Wilson (1982) and Diekmann (1983).
Despite the different considerations of uncertainty that are discussed, the earlier
studies do not challenge the validity of the hidden assumptions, for example, that the
events that are simulated are independent events, and that the use of normal and
rectangular frequency distributions is appropriate in the application of the Monte
Carlo Simulation (Raftery 1984b). More recent works by Chau (1995a; 1995b) and
Wall (1997) validate these assumptions in their application of the Monte Carlo
Simulation. The test of underlying assumptions in the modelling process is an
indication of the sophistication of the simulation techniques.
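The contrast between a single-figure estimate and a stochastic output can be sketched with a small Monte Carlo simulation. The item quantities, rate means and standard deviations below are hypothetical. The sketch also makes exactly the hidden assumptions that are questioned above: the items are treated as independent and the unit rates as normally distributed. It should therefore be read as an illustration of the mechanics, not as a validated model.

```python
import random
from statistics import mean, stdev


def simulate_price(items, n_trials=10_000, seed=1):
    """Simulate total price for items given as (quantity, mean_rate, rate_sd).

    Assumes independent items and normally distributed unit rates
    (truncated at zero); both assumptions are illustrative only.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = sum(q * max(0.0, rng.gauss(mu, sd)) for q, mu, sd in items)
        totals.append(total)
    return totals


# Hypothetical items: (quantity, mean unit rate, standard deviation of rate).
items = [(1000.0, 50.0, 5.0), (200.0, 300.0, 45.0), (50.0, 1200.0, 120.0)]

# Deterministic model: one figure, no measure of uncertainty.
deterministic = sum(q * mu for q, mu, _ in items)

# Stochastic model: a distribution of outcomes.
totals = sorted(simulate_price(items))
cv = 100.0 * stdev(totals) / mean(totals)      # coefficient of variation, %
p80 = totals[int(0.8 * len(totals))]           # price not exceeded in about 80% of trials
```

Reading values such as `p80` off the sorted totals is one simple way to obtain the cumulative frequency distribution referred to above.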
3.3.3 Deductive versus inductive models
Approaches to modelling cost in construction can also be classified as
deductive or inductive (Wilson 1982; Raftery 1984b). Models that are developed
from the former approach involve the analysis of cost data over design variables
(whichever are being considered) with the objective of deriving formal mathematical
expressions that succinctly relate a wide range of design-variable values to price.
This approach draws heavily upon the techniques of statistics, and of correlation and
least-squares regression in particular. Deductive models arise largely from the
following equation:
$$P = f_1(V_1, V_2, V_3, \ldots, V_n), \tag{3.1}$$
where $P$ is the forecasted price, which is a function, $f_1$, of the design variables,
$V_1, V_2, V_3, \ldots, V_n$. The crucial constraints to the deductive approach include the not
inconsiderable limitations of the statistical techniques that are available for
modelling, and the total dependence upon the suitability of the cost data used.
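The deductive approach can be illustrated with the simplest possible case: a least-squares line that relates price to a single design variable. The data, the variable chosen (gross floor area) and the function name are hypothetical; a practical model would normally involve several design variables and a statistical package.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for P = a + b * V with one design variable V."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two or more paired observations")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("design variable has no variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope: price change per unit of V
    a = y_bar - b * x_bar      # intercept
    return a, b


# Hypothetical historical projects: gross floor area (m2) and price (millions).
areas = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
prices = [14.0, 25.0, 38.0, 49.0, 62.0]

a, b = fit_line(areas, prices)
forecast = a + b * 3500.0      # forecast for a new 3,500 m2 building
```

The fitted coefficients depend entirely on the projects in the sample, which is the dependence upon the suitability of the cost data that is noted above.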
Inductive models do not involve the analysis of a set of given cost data, but
rather the synthesis of the costs of individual discrete design solutions from the
constituent components of the design. Inductive methods require the summation of
cost over some suitably defined set of subsystems that are appropriate to the building
design. The most detailed level of subsystem definition would be the individual
resources themselves, but several other levels of aggregation are in common use, for
example, operational activities and constructional elements. Inductive models arise
largely from the equation:
$$P' = \sum_{j=1}^{n} f_j(C_j), \tag{3.2}$$
where $P'$ is the forecasted price, which is the summation of each cost
function $f_j$ of the resources committed, $C_j$, for $j$ equal to 1 to $n$, where $n$ is the total
number of subsystems that represent the prices.
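The inductive form can be sketched as a sum of cost functions over subsystems. The three subsystems, their cost functions and the resource quantities below are hypothetical and serve only to show the structure of the summation.

```python
from typing import Callable, Sequence, Tuple

CostFunction = Callable[[float], float]


def inductive_forecast(subsystems: Sequence[Tuple[CostFunction, float]]) -> float:
    """Sum the cost function f_j applied to the resources C_j of each subsystem."""
    return sum(f(c) for f, c in subsystems)


# Hypothetical subsystems: (cost function f_j, resources committed C_j).
subsystems = [
    (lambda hours: 250.0 * hours, 1200.0),             # labour, rate per hour
    (lambda tonnes: 8000.0 * tonnes + 5000.0, 40.0),   # steel, unit rate plus fixed charge
    (lambda days: 3000.0 * days, 60.0),                # plant hire, rate per day
]

price = inductive_forecast(subsystems)
```

Each cost function needs resource quantities that are known only once the design is well developed, which is consistent with the view that inductive models suit the later design stages.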
In deductive models, the techniques of statistical inference are used to deduce
the relationships between building features or design models, whereas in inductive
models the resource implications of design decisions are calculated and aggregated to
measure economic performance. Thus, the former models are more relevant to
the early design stages and the latter models to the later design stages.
3.4 Major Directions of Model Development
Newton (1990) classifies nine descriptive primitives for cost modelling
studies: data, units, usage, approach, application, model, technique, assumptions and
uncertainty. Table 3-1 briefly explains the meaning of each primitive and its
corresponding classification criteria. The descriptive primitives for this research are
also exhibited in the table. Table 3-2 shows a summary of the reviewed research
studies on modelling techniques and applications according to Newton’s
classification. The number of studies on modelling techniques for early design
stages (feasibility and sketch design) exceeds the number for later design stages.
This circumstance seems reasonable because, as is discussed in Chapter 2, design
decisions are more cost sensitive at the early stage than they are at the later stage, and
the potential benefit of developing a good model for the early design stage is
therefore greater. Thus, the development of designers’ forecast models focuses on
their application in the early stages of design.
Skitmore and Patchell (1990) review all of the modelling techniques that have
been developed in the building and process plant industries. The authors
differentiate the various techniques one by one according to their characteristics or
primitives, which include the mathematical model, relevant contract type, general
accuracy, whether the technique itself is deterministic or probabilistic, the number of
variables, type of variables, the characteristic of quantities (derivation, deterministic
or probabilistic (quantity model), derivation database) and the characteristic of rates
(weighting, current, quantity trend and deterministic or probabilistic (rate model)).
A summary of the characteristics for the various techniques identified is shown in
Table 3-3.
To summarise the development of cost models, Skitmore and Patchell
conclude that research has developed with differing emphasis on all of the four
factors that influence estimation reliability, although much system development has
been centred at the item level involving the search for the best set of predictors of
tender price (regression analysis), the homogenisation of database contracts by
weighting or proximity measures (BCIS and Lu Qian system), the generation of
items and quantities from contract characteristics (Holes, Calculix and expert
systems such as ELSIE) and the quantification of overall estimate reliability from
assumed item reliability (probabilistic model (PERT-COST) and simulation).
Since the 1990s, a new class of tools, neural networks, has offered
an alternative approach to cost forecasting (Li 1995; Adeli and Wu 1998; Bode 1998;
Emsley et al. 2002; Kim et al. 2004). Neural network models are black-box in
nature and usually involve complicated algorithms. The superiority of neural
network models over other mathematical models lies in their ability to learn and
adapt their own representation during the model training process. Although many
studies have demonstrated their outstanding performance, especially in terms of error
reduction, it is doubtful that practising forecasters understand these models, or
have even heard of them. As suggested in Fortune and Lees’ report on the relative
performance of new and traditional cost models in strategic advice for clients (as
addressed in Section 2.6), the possibility that many practising forecasters are not
sufficiently well equipped to understand and use these models could be a major hurdle to
their real-life application.
Table 3-1: Classification of this research according to Newton’s descriptive primitives

| Descriptive primitive | Explanation | Suggested classification | Primitives of this research |
|---|---|---|---|
| Data | Whether the data specifically relates to a type of design proposal or not | Specific or non-specific | Specific |
| Units | Whether it is a unit in abstract form, a unit of finished works or a unit of as-built works | Abstracted, finished or as-built | Finished works (floor area, external wall (building perimeter and storey height) and roof area of final product) |
| Usage | Whether the purpose is for designers’ price estimation or builders’ bidding | Cost or price | Price |
| Approach | Whether it is implemented for estimation of the whole building cost or a particular component or part | Macro or micro | Macro approach |
| Application | When the model is applied in the design process | Feasibility, sketch, detailed, tender, throughout, non-construction | Feasibility (or very early sketch) |
| Model | Common classification of techniques | Simulation, generation or optimisation | Simulation |
| Technique (see also Table 3-2) | Type of technique used | Dynamic programming, expert system, functional dependency, linear programming, manual, Monte Carlo simulation, networks, parametric modelling, probability analysis, regression analysis | Regression analysis |
| Assumptions | Whether assumptions can be accessed or not | Explicit or implicit | Explicit |
| Uncertainty | Whether there is a formal measure of uncertainty or not | Stochastic or deterministic | Stochastic |
Table 3-2: Previous studies on modelling techniques and applications according to Newton’s classification

| Technique | Feasibility | Sketch | Detailed | Tender | Throughout | Non-construction |
|---|---|---|---|---|---|---|
| Dynamic programming | - | Atkin (1987) | - | - | - | - |
| Expert system | - | Brandon (1988), Lu (1988) | - | - | - | - |
| Functional dependency | Wilderness (1964), Thomsen (1965), Bathurst and Butler (1977), Flanagan and Norman (1978), Pegg (1984), Meijer (1987), Tan (1999) | DOE (1971), Townsend (1978), Moore and Brandon (1979), Powell and Chisnall (1981), Scholfield et al. (1982), Langston (1983), Newton (1983), Weight (1987), Boussabaine and Elhag (1999) | - | - | Holes and Thomas (1982), Sidwell and Wottoon (1984), Berny and Howes (1987), Holes (1987), Woodhead et al. (1987) | - |
| Linear programming | - | Russell and Choudhary (1980), Cusack (1985) | - | - | - | - |
| Manual | James (1954) | Dunican (1960), RICS (1964), Barrett (1970) | Gray (1982), PSA (1987), Munns and Al-Haimus (2000) | - | Kiiras (1987), Dreger (1988) | - |
Table 3-2: Previous studies on modelling techniques and applications according to Newton’s classification (Cont’d)

| Technique | Feasibility | Sketch | Detailed | Tender | Throughout | Non-construction |
|---|---|---|---|---|---|---|
| Monte Carlo simulation | - | Mathur (1982), Pitt (1982), Wilson (1982), Bennett and Ormerod (1984) | - | Walker (1988) | - | Gehring and Narula (1986) |
| Networks | - | - | - | - | Bowen et al. (1987), Brown (1987) | - |
| Parametric modelling | Tregenza (1972), Selinger (1988) | Nadel (1967), Meyrat (1969), Southwell (1971), Tregenza (1972), Brandon (1978), Selinger (1988), Warszawski (2003) | - | - | - | Park (1988) |
| Probability analysis | - | Zahry (1982), Cusack (1987), Pegg (1987) | - | Fine (1980) | Skitmore (1982) | - |
| Regression analysis | Buchanan (1972), Regdon (1972), Kouskoulas and Koehn (1974), Braby (1975), McCaffer (1975), Wilson and Templeman (1976), Bathurst and Butler (1977), McCaffer et al. (1984), Karshenas (1984) | Gould (1970), Buchanan (1972), Sierra (1982), Yokoyama and Tomiya (1988), Skitmore and Patchell (1990) | - | - | Khosrowshahi (1988) | - |
Table 3-3: Summary of estimating techniques (Extracted from Skitmore & Patchell 1990)

| Estimate technique | Model | Relevant contract type | General accuracy (c.v.) | Deterministic / probabilistic | Number of variables | Type of variables |
|---|---|---|---|---|---|---|
| Unit | $P = qr$ | All | 25-30% | Deterministic | Single | Any comparable unit, e.g. tonne steelwork, metre pipeline |
| Graphical | $P = f_r(q)$ | Process Plant | 15-30% | Deterministic | Few | Ditto |
| Functional Unit | $P = qr$ | Buildings | 25-30% | Deterministic | Single | Ditto, e.g. number of beds, number of pupils |
| Parametric | $P = f_r(q_1, q_2, q_3, \ldots)$ | Process Plant | 15-30% | Deterministic | Few | Process parameters, e.g. capacity, pressure, temperature, material, cost index |
| Exponent | $P_2 = P_1 (q_2 / q_1)^r$ | Process Plant | 15-30% | Deterministic | Single | Size of plant or equipment, e.g. capacity |
| Factor | $P = \sum_{i=1}^{m} \mathrm{fact}_i \sum_{i=1}^{N} q_i r_i$; a) $m = 1$ (Lang method); b) $m > 1$, $\mathrm{fact}_1 \neq \mathrm{fact}_2$, etc. (Hand method); c) $\mathrm{fact}_i = U(\alpha_i, \beta_i)$ (Chiltern method) | Process Plant | 10-15% | Deterministic | Few | Any |
| Comparative | $P_2 = P_1 + \sum_{i=1}^{N} (p_{2i} - p_{1i})$ | All | 25-30% | Deterministic | Few | Depends on differences |
| Interpolation | $P = qr$ | Buildings | 25-30% | Deterministic | Single | Gross floor area |
| Conference | $P = f(P_1, P_2, \ldots)$ | Process Plant | ? | Deterministic | Any | Any |
| Floor Area | $P = qr$ | Buildings | 20-30% | Deterministic | Single | Gross floor area |
| Cube | $P = qr$ | Buildings | 20-45% (based on 86 cases) | Deterministic | Single | Volume |
| Storey Enclosure | $P = qr$ | Buildings | 15-30% (based on 86 cases) | Deterministic | Single | Floor area, external wall area, basement wall area and roof area |
| BQ Pricing: (i) Conventional | $P = \sum_{i=1}^{N} q_i r_i$ | Construction | 10-20% (5-8% for builders) | Deterministic | Many (number of variables varies) | Quantities required under SMM |
| BQ Pricing: (ii) B Fine | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 15-20% | Deterministic | Many (number of variables varies) | Ditto |
| Significant Item Estimating | $P = \sum_{i=1}^{N} q_i r_i$ | PSA Buildings | 10-20% | Deterministic | Medium | Quantities required under SMM |
| Approximate Quantities: (i) Conventional | $P = \sum_{i=1}^{N} q_i r_i$ | Construction | 15-25% | Deterministic | Medium to many | Combining quantities and items required under SMM |
Table 3-3 (Cont’d): Summary of estimating techniques (Extracted from Skitmore & Patchell 1990)

| Estimate Technique | Model | Relevant Contract Type | General Accuracy (c.v.) | Deterministic / Probabilistic | Number of variables | Type of variables |
|---|---|---|---|---|---|---|
| Approximate Quantities: (ii) Gleeda | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 15-25% | Deterministic | Few to medium | Ditto |
| Approximate Quantities: (iii) Gilmore | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 15-25% | Deterministic | Few to medium | Ditto |
| Approximate Quantities: (iv) Ross 1 | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 25% (based on 17 cases) | Deterministic / Probabilistic | Few to medium | Ditto |
| Approximate Quantities: (v) Ross 2 | [formula illegible in source] | Buildings | 50% (based on 17 cases) | Deterministic / Probabilistic | Few to medium | Ditto |
| Approximate Quantities: (vi) Ross 3 | [formula illegible in source], with $P_i = a + b q_i + e$, $e \sim N(0, \sigma^2)$ | Buildings | 30% (based on 17 cases) | Deterministic / Probabilistic | Few to medium | Ditto |
| Elemental | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 20-25% | Deterministic | Medium | BCIS/CI/SfB entities (UK), individual company manual (HK) |
| CPU | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 20-25% | Deterministic | Medium | Similar |
| Elsie | $P = \sum_{i=1}^{N} q_i r_i$ | Offices | | Deterministic | Medium | DBE |
| Norms (schedule) | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | 10-20% | Deterministic | Many (number of variables varies) | SMM type, e.g. PSA schedule |
| Regression | $P = a + \sum_{i=1}^{N} b_i q_i + e$, $e \sim N(0, \sigma^2)$ | All | 15-25% | Deterministic / Probabilistic | Few | Usually contract characteristics, e.g. floor area, no. of storeys |
| Lu Qian | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | ? | Deterministic | Few | Usually contract characteristics, e.g. floor area, no. of storeys |
| Resource (Scheduling Activity, Operational) | $P = \sum_{i=1}^{N} q_i r_i$ | All | 5-8% | Deterministic | Many (number of variables varies) | Resources, e.g. man hours, materials, plant |
| PERT-COST | $P = \sum_{i=1}^{N} p_i$, where $p_i \sim N(q_i r_i, \sigma_i^2)$ | All | N/A | Probabilistic | Number of variables varies | Usually time resources, e.g. man hours |
| CPS | $P = \sum_{i=1}^{N} t_i r_i + \sum_{i=1}^{N} n_i r_i$, where $t_i \sim F(\mu_i, \sigma_i^2)$ | Buildings | 6.50% | Probabilistic | Usually few | Resources, e.g. man hours, materials, plant |
| Risk Estimating | $P = \sum_{i=1}^{N} q_i r_i$ | Construction | N/A | Probabilistic | Usually few | Any |
| Homogenised Estimating (BCIS on line) (BICPE etc.) | $P = \sum_{i=1}^{N} q_i r_i$ | Buildings | N/A | Deterministic | Any | Any |
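The accuracy column of Table 3-3 is expressed as a coefficient of variation (c.v.). As a rough illustration of how such a figure might be obtained, the sketch below computes the c.v. of forecast-to-actual price ratios. The function name, the choice of the ratio as the error measure and the sample figures are assumptions made for illustration only; the source table may rest on a different definition.

```python
from statistics import mean, stdev


def coefficient_of_variation(forecasts, actuals):
    """Return the c.v. (in per cent) of the forecast/actual price ratios.

    Assumption: accuracy is measured on the ratio forecast / actual,
    using the sample standard deviation.
    """
    ratios = [f / a for f, a in zip(forecasts, actuals)]
    return 100.0 * stdev(ratios) / mean(ratios)


# Hypothetical forecasts and tender prices (illustrative values only).
forecasts = [105.0, 96.0, 120.0, 88.0, 101.0]
actuals = [100.0, 100.0, 100.0, 100.0, 100.0]

cv = coefficient_of_variation(forecasts, actuals)
print(f"c.v. = {cv:.1f}%")
```

A lower c.v. indicates that the forecasts scatter less widely around the tender prices, which is how the ranges in the table can be compared across techniques.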
3.5 Limitations of Cost Models
3.5.1 Model assumptions
Models are only ever a representation of reality, and forecasting models are
always non-isomorphic models that are simplifications of an actual system. Every
model has a set of inherent assumptions about problem boundaries, about what is or
is not significant and about how the user might best conceptualise a problem
(Newton 1990). Regardless of whether the assumptions of a model are explicit or
implicit, it is always possible to devise tests that show models to be deficient in some
way or other. This implies that models should be used with care, and should not be
pushed beyond the limits of their validity (Skitmore and Marston 1999).
Designers’ forecast models are structured to represent completed buildings or
their components. However, the price of a building or a component originates in
the construction process and the resources that are employed. To adapt this kind
of price data to suit a designer’s forecast model, an implicit assumption must be
made that the actual buildings in the data pool are so similar that their production
methods do not differ, or that differing production methods do not significantly
affect cost. Obviously, these assumptions are untrue (Bowen and Edwards 1985a).
3.5.2 Reliance on historical data for prediction
All forecasting models demand historical data as inputs for prediction. The
Wilderness Group (1964, pp. 254-255) point out two limitations to using historical
data:
“it is almost impossible to find the actual buildings which are
sufficiently similar for their differences in cost to be related to
particular factors and . . . It was found impracticable merely with
historical data to isolate with any certainty and the effect upon
buildings cost of certain design factors individual to those buildings of
which the costs were examined”.
Moreover, Bowen and Edwards (1985a) criticise the mathematical modelling of
historical data, because it fails to reflect changes in technology over time. It is
also debatable whether backward-looking concepts that are based on historical
price data should be used for forecasting. Bon (1989, pp. 61-62) explains the
problem:
“Ex ante or forward-looking concepts predominate in economics,
while ex post or backward-looking concepts are more prevalent in
accounting . . . Cost is often treated as the pre-eminent ex post
concept. However, no matter how accurate and exhaustive our
historical records, backward-looking concepts of cost are inadequate
for two reasons. First, at the moment of decision one is perforce
considering future costs. Second, the valuation of costs is impossible
without explicit account of opportunity cost - the satisfaction forgone . . .
The difficulties with cost forecasting based on historical data are
exacerbated in the case of long-lived capital goods, such as buildings.”
3.5.3 Insufficiency of information and preparation time
The preparation of a forecast relies heavily on information input from
external organisations such as the client’s brief and the designers’ layout plans, and
the information that is available within the organisation, such as historical price data.
It is quite common that the design information that is given at the early design stage
is ambiguous and contradictory. The very limited information and allowable time
for producing forecasts may force forecasters to make assumptions according to their
own subjective judgements. For instance, forecasters will usually rely on price data
that are derived from a sample of buildings that do not perfectly match the
characteristics of the proposed building or works if appropriate historical price data
are unavailable (Flanagan and Norman 1983).
3.5.4 Reliance on expert judgment
Forecasting is partly an art and partly a science. The science part involves
the use of modelling techniques and mathematics. The art part comes with the
exercising of professional judgement. Tversky and Kahneman (1974) suggest that
in making judgements in uncertain conditions, people in general do not follow the
calculus of chance or the statistical theory of prediction, but instead rely on a number
of simplifying strategies or heuristics that direct their judgements. Such heuristics
(rules of thumb) can sometimes lead to reasonable judgements and sometimes to
severe and systematic errors. The exercise of judgement therefore rests with the cost
forecaster, rather than with the forecasting model itself (Skitmore et al. 1990). Raftery
(1995) and Birnie (1995) point out that humans make mistakes when making
judgements, and state that more work is needed to understand the behavioural
processes that are involved. Empirical evidence shows that judgement has a
significant role within the formulation and transmission of early cost advice to clients
(Fortune and Lees 1996). As the exercise of judgement is a human cognitive
process, it is subject to error and bias, including the bias that heuristics introduce.
3.6 Review of Cost Models in Use
In their research on the relative performance of new and traditional cost models,
Fortune and Lees (1996) study how often particular techniques are used, and the
extent to which a lack of understanding influences their use. The studied
techniques are classified into seven
categories: traditional (conventional) techniques, statistical techniques,
knowledge-based techniques, life cycle costing, resource- and process-based
techniques, risk analysis and value-related techniques. The authors reveal that the
use of conventional techniques outweighs the use of all of the other techniques, and
that these other techniques were not well understood by respondents.
A more recent study of the models that are used by UK quantity surveying
practices reinforces the notion that practitioners have not yet answered the call of
academia to adopt the new computer-based stochastic models that are available for
the assessment of project risk and uncertainty (Fortune and Hinks 1998). The study
also indicates that, in the period 1993 to 1997, conventional models
that provide single-figure deterministic price forecasts had only a slightly reduced
incidence of use, whereas newer computer-based models had only a slightly increased
incidence of use, which suggests that the paradigm shift in the formulation of reliable
early cost advice has not yet been achieved in practice (Fortune and Hinks 1998). A
similar survey that studies the forecasting models that are used in South Africa also
indicates that conventional models remain firmly in the mainstream in application
(Bowen and Edwards 1998).
The demand for a move to a more scientific basis for forecasting appears to
come mainly from academia, rather than from practice (Bowen and Edwards 1985,
Raftery 1987 p. 53). For the sake of producing publications, academics (modellers)
have focused on demonstrating how a newly developed model differs from other
models. The conservative attitude of practitioners (forecasters) towards change, and
their ignorance of new knowledge, create a further hurdle (Brandon 1982). To
initiate a paradigm shift, academia will have to convince practitioners by
establishing and advertising the benefits that forecasters will enjoy from these
alternative approaches. This could include educating forecasters and managers
about how these new approaches can be applied and how much better they are than
the conventional approaches, and heightening their awareness of the inadequacy of
the conventional approaches (Fortune and Lees 1996). A model that is new and
mathematically sound for forecasting may not
necessarily be appropriate for implementation. Thorough studies on the benefits of
and strategies for putting a new model into practice are crucial to the acceptance of
new models or forecasting approaches.
3.7 Significant Items Estimation
Surveys that have been conducted in the UK and South Africa have reinforced
the fact that newer models are not popular in practice. More than 20 years after the
proposal of a paradigm shift, the idea remains a pipe dream, and the popularity of
conventional models remains unchanged. Perhaps the only new model that has
been put into practice (although it is still not well recognised) is the significant
items estimating model that was developed by the Property Services Agency (PSA)
in the UK.
Barnes (1971) investigates the implication of the proposition that different
values of rates have different degrees of reliability, and, specifically, that the
reliability of a product of quantity and rate is an increasing function of its value.
By assuming a constant coefficient of variation for each item, he shows that a
selective reduction in the number of low-valued items has a trivial effect on the
overall estimate reliability. The empirical evidence in favour of Barnes’
assumption is quite strong, and therefore its essence has been used to develop the
significant items method.
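The effect that Barnes describes can be illustrated numerically. The sketch below assumes, in addition to the constant coefficient of variation stated above, that the errors of the individual items are statistically independent; that independence assumption, the function names and the item values are all hypothetical and are not taken from Barnes (1971). Under these assumptions the variance of the total is the sum of the item variances, so low-valued items contribute very little to it.

```python
from math import sqrt


def total_cv(values, item_cv):
    """c.v. of a total made up of independent items sharing one c.v.

    Each item has standard deviation item_cv * value, so the variance of
    the total is item_cv**2 * sum(value**2).
    """
    return item_cv * sqrt(sum(v * v for v in values)) / sum(values)


def variance_share(values, subset):
    """Fraction of the total variance that is contributed by `subset`."""
    return sum(v * v for v in subset) / sum(v * v for v in values)


# Hypothetical bill of ten priced items (quantity x rate), illustrative only.
values = [400, 250, 150, 80, 40, 30, 20, 15, 10, 5]
low_items = [v for v in values if v <= 40]

print(f"c.v. of total: {100 * total_cv(values, 0.20):.1f}%")
print(f"value in low items: {100 * sum(low_items) / sum(values):.1f}%")
print(f"variance from low items: {100 * variance_share(values, low_items):.1f}%")
```

In this made-up bill the six smallest items carry a noticeable share of the value but only a very small share of the variance, which is the intuition behind concentrating measurement effort on the high-valued items.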
According to the outline that was published by the Department of
Environment of the UK government in 1987, the statement “some 80% of the value
of measured work on building projects is contained within 20% of the items in the
bills of quantities” was tested by analysing the prices in 40 bills of quantities. It
was found that 78% of the value was contained within the top 20% of items, which
broadly confirmed the 80/20 relationship. By restricting measurement and pricing to
the most significant items (the top 20% of items), and by using data with a reasonable
sample size, it should be possible to minimise the unreliability of the rates from the
bills of quantities (PSA 1987). The major benefits of the significant items
estimating model include the shorter forecasting time that is required due to its
concentration on fewer items, the improvement in accuracy, the improvement in
reliability (because its outputs are derived from data from a large sample) and the
flexibility it affords in allowing a move away from average rates towards varying
percentage additions for each trade (Allman 1988).
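The 80/20 check described above amounts to ranking the priced items of a bill by value and measuring the share of the total that falls within the top fifth. A minimal sketch follows; the function name, the rounding rule for the number of top items and the item values are assumptions for illustration and do not reproduce the PSA analysis of 40 bills.

```python
def top_share(item_values, top_fraction=0.2):
    """Share of total value held by the highest-valued fraction of items."""
    ordered = sorted(item_values, reverse=True)
    # Assumed rule: round to the nearest whole item, but keep at least one.
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical priced items from one bill of quantities (illustrative only).
bill_items = [400, 250, 90, 70, 60, 50, 30, 25, 15, 10]

share = top_share(bill_items)
print(f"Top 20% of items hold {100 * share:.0f}% of the value")
```

Repeating the calculation over many bills and averaging the shares would give a figure comparable in kind to the 78% that is reported above.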
Munns and Al-Haimus (2000), in their work to refine the significant items
model, reveal that there is a lack of formal rules for the selection of work packages to
be used within the original significant items model, and therefore a potential to
overestimate the cost of projects. By using their new methodology for selecting
work packages and the refined technique that is known as the cost significant global
cost model, they demonstrate that there is a significant improvement in performance
over the original significant items model.
Although the significant items model has shown significant improvements in
performance, its actual contribution to the overall value of a building is limited,
because it is a forecasting model for the later design stage. At
this stage, the design information is quite sophisticated, and there is little room for
cost saving or value enhancement. However, studies on the significant items model
give an empirically supported demonstration of how conventional models, such as
the approximate quantities method, can be further advanced by the use of statistical
techniques.
3.8 Discussions on Research Opportunities
Having reviewed the development of, and limitations to, forecasting models,
it seems reasonable to conclude that there is no universally agreed approach to
modelling building costs. There is no general agreement on the most useful set of
elements and functions for each of the model types, nor on how the models
themselves and their values should be derived, nor is there any agreement on the
nature of the functions that connect the cost with the various elements (Skitmore and
Marston 1999, p.19). In contrast, it appears that practising forecasters need
commonly agreed models. Since the existing models in use have become
conventions, forecasters can follow them strictly to prepare estimates (even though
many of the models were not rigorously developed) without the need to worry that
their choice of model will be challenged by other practitioners. However, the fact
that every forecaster is using a model does not automatically validate that model.
The conventional models deserve rigorous tests to justify their existence.
This research focuses on the study of conventional models used in the early
design stage. As explained in Section 2.5, earlier forecasts contribute more, because
the early decisions that they influence are more cost sensitive. The successful
experience of applying the significant items model in practice provides insights into
the potential of developing a new model that is applicable to the early design stage.
According to the RIBA outline plan of works (RIBA 1991), this early design stage
corresponds to the period between the beginning of the feasibility stage and the
midpoint of the sketch design stage. Before the beginning of this period, referred to
by the RIBA outline plan of works as the inception stage, there are no drawings
available. Forecasters have to make their best guess at their own discretion; such
forecasts are sometimes known in forecasters’ slang as “guesstimates”. After the end
of this period, when more information is available, such as formal sketch layout
plans (as compared with those sketches produced during the early design stage), a
few sketch elevation plans, draft specifications, and perhaps the schedules of
finishes, doors and windows, forecasters can use more detailed methods (for
example, the elemental cost estimating method) for forecasting.
During the early design stage, information is often brief and can only be
extracted from a few sketches. Because of this, all the conventional methods used
by forecasters follow a single price-rate system. In contrast to the elemental
method, which is applied at a later stage, and the significant items method or the
approximate quantities method, which are applied at an even later stage, single
price-rate methods inevitably produce forecasts with higher uncertainty because of
the lack of information. However, the two early conventional methods, the cube and
floor area methods, appear unable to extract all the information available from the
sketches. The subsequent storey enclosure method, as described in Section 3.9,
shows sophistication in attempting to extract some further, arguably all the major,
information from the sketches, i.e. the area of each floor and the envelope area of a
building. Although it also follows a single price-rate system, the storey enclosure
method takes into account additional aspects of building design economics. To
avoid falling into the trap of developing something new without any theoretical base,
as criticised by Raftery (1984b) and Newton (1990), the storey enclosure model that
was developed by James in 1954 is chosen for further development. Although the
model shares many of the flaws of the single-unit deterministic models, it is
considered to be the most sophisticated model, and is worth refining. Skitmore and
Marston (1999, p.164) suggest that the model has considerable potential for further
development by statistical means. Ashworth (1999, p.251) also suggests that in the
past, credibility was a factor to be taken into account, but that it might be more
acceptable today to apply the storey enclosure model. After all, what is needed is
an approach that will harness the strengths and minimise the weaknesses of all that
has been developed to date (Bowen and Edwards 1985a).
Patchell (1987) suggests four criteria to be observed for cost advice at the
schematic or feasibility stage: cost accuracy from very preliminary information, a
flexible and quick response to various options, economy of production in man and
machine hours, and estimation and analysis on the same basis. These criteria are
very practical, and are used as the requirements for the forecasting models that are
developed in this research.
The models that are developed in this research share similar justifications
with the significant items model, but they are designed to be applied in the early
design stage. They should be understandable, easy to use, fairly accurate and
relatively reliable compared with the conventional forecasting models.
The accomplishment of this research relies on the use of empirical data for
both the model development and the assessment of model performance. The
emphasis here is on the purely empirical nature of the model, which is thought to be
the best way to avoid subjectivity, as the essence of good empirical research is to
minimise the role of the researcher in interpreting the results of the study (Skitmore
1988).
Like much of the research on empirical models, the modelling of prices in
this research follows mainstream research by using statistical techniques such as
regression analysis. Hypothetical models that contain various groups of variables
are derived by multiple regression. The machinery of the approach is described in
Chapter 5.
3.9 Storey Enclosure Method
Generally, all of the conventional methods for forecasting building prices at
the early design stage are single-rate methods. Amongst them, the most commonly
used method is the floor area method. The floor area method is simple in
application, easily understood and produces a forecast quickly. However, it is also
considered to be too simple to take into account the different characteristics of
buildings. James (1954) criticised the floor area method as well as the cube method.
First, neither method is satisfactory for universal application. Second, neither
method reflects the cost implications of building shape, building height and the
number of storeys. Third, both single price-rate methods have to account separately
for basements. Fourth, the cube method is sensitive to changing unit rates. He
proposed an alternative single-rate method, the storey enclosure method, to
overcome these limitations. This method takes into account various important
aspects of design in building price forecasting, whilst leaving the type of structure
and standard of finishes to be assessed in the price rate. The factors to be considered
and the adjustments in the method to reflect those factors are shown in Table 3-4.
The method involves multiplying the measured value of each component by an
assigned weighting. The weightings suggested by James and the inclusions for each
component are shown in Table 3-5.
The summation of the products of each measured value of the adjustment and
its corresponding assigned weighting will form the storey enclosure area, which is
the unit quantity of the storey enclosure method. The product of an appropriate
single unit rate and the storey enclosure area will produce a forecasted price.
Equations (3.3) and (3.4), which represent the forecasting method, are shown.
$$P = S \cdot R \qquad (3.3)$$

$$P = \left[ \sum_{i=0}^{n} (2 + 0.15i) f_i + \sum_{i=0}^{n} p_i s_i + 2 \sum_{j=0}^{m} f'_j + 2.5 \sum_{j=0}^{m} p'_j s'_j + r \right] \cdot R, \qquad (3.4)$$
where $P$ is the forecasted price, $S$ is the storey enclosure area, $R$ is the unit
rate, $f_i$ is the floor area at $i$ storeys above ground, $p_i$ is the perimeter of the
external wall at $i$ storeys above ground, $s_i$ is the storey height at $i$ storeys above
ground, $n$ is the total number of storeys above ground level, $m$ is the total number
of floors below ground level, $f'_j$ is the floor area at $j$ floors below ground level,
$p'_j$ is the perimeter of the external wall at $j$ storeys below ground level, $s'_j$ is
the storey height at $j$ storeys below ground level and $r$ is the roof area.
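As an illustration of Equations (3.3) and (3.4), the following sketch computes the storey enclosure area and the forecasted price for a small building. The function names, the tuple layout for each storey, the convention that index 0 is the ground floor, and all of the dimensions and the unit rate are assumptions made for this example; the weightings are those attributed to James in Table 3-5.

```python
def storey_enclosure_area(storeys_above, storeys_below, roof_area):
    """Weighted storey enclosure area S.

    storeys_above / storeys_below: lists of
    (floor_area, wall_perimeter, storey_height) tuples.
    Assumption: storeys_above[0] is the ground floor (i = 0).
    """
    area = roof_area  # roof weighting of 1
    for i, (floor, perimeter, height) in enumerate(storeys_above):
        area += (2 + 0.15 * i) * floor  # floor weighting rises with height
        area += perimeter * height      # external wall weighting of 1
    for floor, perimeter, height in storeys_below:
        area += 2 * floor                # floors below ground
        area += 2.5 * perimeter * height  # external walls below ground
    return area


def forecast_price(enclosure_area, unit_rate):
    """Equation (3.3): price = storey enclosure area x single unit rate."""
    return enclosure_area * unit_rate


# Hypothetical building: three storeys above ground and one basement,
# each 500 m2 with a 90 m perimeter and a 3 m storey height.
above = [(500.0, 90.0, 3.0)] * 3
below = [(500.0, 90.0, 3.0)]
S = storey_enclosure_area(above, below, roof_area=500.0)
P = forecast_price(S, unit_rate=1000.0)  # hypothetical rate per unit of S
print(f"S = {S:.0f}, P = {P:,.0f}")
```

The basement in this example adds more to the enclosure area than an equivalent storey above ground, which reflects the increased multipliers for work below ground level.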
Table 3-4: Adjustment for the factors affecting the estimates in the storey enclosure method

| Factors affecting the estimates | Adjustment |
|---|---|
| Shape of building | By measuring the external wall area |
| Total floor area | By measuring the area of each floor |
| Vertical positioning of the floor area in a building | By using a greater multiplier to the floor area of a suspended floor positioned higher in a building |
| Storey heights of building | Proportion of floor and roof areas to the external wall |
| Overall building heights | Ratio of roof area to external wall area |
| Extra cost of sinking usable floor area below ground level | By using increased multiplier for work below ground level |
Table 3-5: Weightings and inclusions for individual components in the storey enclosure method

| Components | Weighting Factors | Inclusions |
|---|---|---|
| Above Ground Components | | |
| Ground Floor | 2 | Internal partitioning, finishings, fitments, doors, etc., on the floor; a non-suspended floor; finishings on one side of it; and normal foundations to all vertical structural members in a single storey building including those of its external walls |
| Upper Floors | 2 + (0.15 x No. of Floors above Ground) | Internal partitioning, finishings, fitments, doors, etc., on the floor; a suspended load-carrying floor; finishings on both sides of it; vertical structural supports to it; and the further cost which arises, in the case of vertical structural floor supports to the lower floors of multi-storey buildings, from the need to support the additional transmitted load of all superimposed floors and the roof above them |
| Roof | 1 | A suspended roof and its (lighter-than-floor) load; finishings on both sides of it (one weatherproof); horizontal structural supports to it (such as beams and trusses); and vertical structural supports to it (such as walls and columns) |
| External Walls | 1 | A wall with weatherproof qualities; finishings on both sides of it; windows and external doors, etc.; and normal architectural features |
| Below Ground Components | | |
| Floors below Ground; External Walls below Ground | 2; 2.5 (respectively) | Displacement and disposal of earth; waterproof tanking and the loading skins to keep it in position; members of heavier construction than those required in equivalent positions above ground; finishings on one side of these members; internal partitioning finishes, fitments, door, etc.; and normal (in the basement sense) foundations to all vertical structural members in a single basement-storey building |
In James’ study, the proposed storey enclosure method is applied to 86
tenders for different building types. The storey enclosure method is compared with
two other early stage methods: the superficial floor area method and the cube
method. James’ results of the tests for the cube, floor area and storey enclosure
methods are shown in Table 3-6. The estimates that are produced by the storey
enclosure method are nearer to the tender figures, and the range of price
variation is reduced accordingly. These results turn out to be statistically significant
(chi-square 5.99, 2df), with the storey enclosure and floor area methods being better
than the cube method (Skitmore 1991). There are some examples that show the use
of the storey enclosure method in the textbooks of cost planning (Cartlidge and
Mehrtens 1982; Seeley 1996 pp.160-162; Ashworth 1999 pp.250-251; Ferry et al.
1999). However, despite the many benefits that are demonstrated by James, the use
of the storey enclosure method remains very limited in practice. Survey results on
the use of conventional cost forecasting models in the UK reveal that less than 2% of
respondents made use of the storey enclosure method to provide strategic cost advice
to clients (Fortune and Lees 1989). However, another survey that was conducted
more recently in South Africa indicates that 27% of the respondents had used the
storey enclosure method in practice (Bowen and Edwards 1998).
Like any of the single-unit deterministic models, the storey enclosure method
suffers the deficiency of being inexplicable, unrelated and deterministic. Its
unpopularity is probably due to the fact that the weightings are not derived
empirically from proven data, but are based on experience (Wilderness Group 1964;
Ashworth 1999 p.251), that there is insufficient historical data support (Wilderness
Group 1964), that there are difficulties in obtaining an appropriate rate (Seeley 1996
pp.161-162), that the calculations that are involved are relatively complex (Seeley
1996 pp.161-162), and that the method provides no link with other forecasting
methods, such as the elemental or approximate quantity method that would be used
subsequently as the design develops.
Table 3-6: The results of tests for the cube, floor area and storey enclosure
methods in James’ study (Source: James (1954))
3.10 Regression Analysis
As there is no universal set of elements or variables for forecasting models,
the purpose of reviewing previous empirical research on the influencing variables
and forecasting targets is to consolidate a list of them for later use in model selection.
A review of the literature shows that the technique of regression analysis
has been widely used in the modelling of building prices. Regression analysis
can demonstrate statistically the strength of the relationship
between two or more variables, for example, height and unit price. A variety of
applications of regression analysis in the forecasting of building cost have been
developed since the mid 1970s. Regression analysis has been used for modelling
the prices of buildings at three levels: the overall price, the price of building elements
and the price of components. Regression analysis was first used to model building
prices for offices (Department of Environment 1971; Tregenza 1972; Flanagan and
Norman 1978; Karshenas 1984; Skitmore and Patchell 1990), schools (Moyles 1973),
houses (Neale 1973; Braby 1975; Khosrowshahi and Kaka 1996), homes for old
people (Baker 1974), lifts (Blackhall 1974), electrical services (Blackhall 1974),
motorway drainage (Coates 1974) and a few other types of building (Kouskoulas and
Koehn 1974). It was then used to model the prices of reinforced concrete frames
(Buchanan 1972; Singh 1990) and building services (Gould 1970). It has also been
used to model the prices of components such as the beams of suspended-roof steel
structures (Southwell 1971). This research concerns the modelling of overall
building prices.
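To make the idea of regressing unit price on height concrete, the sketch below fits a straight line by ordinary least squares and reports the coefficient of determination as a measure of the strength of the relationship. The function name and the data are hypothetical and are not drawn from any of the studies cited above; a single predictor is used purely for brevity.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x.

    Returns the intercept a, the slope b and the coefficient of
    determination r2.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    r2 = sxy ** 2 / (sxx * syy)
    return a, b, r2


# Hypothetical sample: number of storeys and price per unit floor area.
storeys = [2, 5, 10, 15, 20]
unit_prices = [1000, 1060, 1150, 1260, 1340]

a, b, r2 = fit_line(storeys, unit_prices)
print(f"unit price = {a:.1f} + {b:.2f} x storeys (r2 = {r2:.3f})")
```

With several predictors the same principle applies, but the coefficients are obtained by solving the multiple regression normal equations rather than from the two-variable formulae used here.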
3.11 Review of Model Predictors
Of the conventional methods of forecasting for the early design stage, the
floor area method is the most widely used (Akintoye et al. 1992; Bowen and Edwards
1998; Fortune and Lees 1996). In this method, the floor area is presumed to be the
only variable that is directly proportional to the building price. Another frequently
addressed variable is the height of a building, and previous studies have expressed
this in different measures, such as the overall building height (Kouskoulas and
Koehn 1974; Karshenas 1984; Pegg 1987), number of storeys (Clark and Kingston
1930; Wilderness Group 1964; Buchanan 1969; Department of Environment 1971;
Buchanan 1972; Tregenza 1972; Steyert 1972; Neale 1973; Braby 1975; Flanagan
and Norman 1978; Singh 1990) and storey height (Wilderness Group 1964;
Buchanan 1969; Buchanan 1972; Moyles 1973). High-rise buildings are generally
more expensive to build than low-rise buildings, because the former require extra cost
for the special arrangements for servicing the building, particularly the upper floors,
and because the lower part of high rises is designed to carry the weight of the upper
floors and the extra wind load. The additional cost of working at a great height from
the ground when erecting the building, and the increasing area that is occupied by the
service core and circulation are also factors that increase the cost of high rises (Ferry et
al. 1999 p. 293).
The earliest work on the identification of the variable of building height was
undertaken in the United States. Clark and Kingston (1930) analyse the relative
costs of the major components of eight office buildings that range from 8 to 75
storeys on a hypothetical site. In general, they find that the unit building cost tends
to rise moderately with the building height.
In the UK, Stone (1963) reports a moderate rise in the unit building cost with
the building height for blocks of flats and maisonettes in London and other parts of
the UK. The Wilderness Group (1964) produced a series of schedules that detail
the costs of a steel frame for a structure. The spans, storey heights and number of
storeys vary and are manually priced. Their study is the first serious attempt within
the UK building industry to isolate the cost effects of fundamental design variables,
such as the number of storeys, storey height, the superimposed loading of suspended
floors, column spacing in the direction of the slab span and column spacing across
the slab span, taking into account the interacting cost effects of each variable upon
the others.
Tan (1999) cites a report that was prepared by Thomsen (1966) from the
United States that states that, except for the lower floors, the unit office building cost
is almost constant when the building height is varied. However, as details of the
simple simulation study are not given in Thomsen’s report, Tan warns that the results
should be interpreted with care.
A study that was conducted by the Department of Environment (1971) of the
UK government reports that the cost of local authority office blocks rises fairly
constantly by two per cent per floor as the height increases above four storeys.
Tregenza (1972) analyses the price per square metre of ten office buildings
that range from one to eighteen storeys high. The prices were rebased to January
1971 prices. A linear regression line was fitted and the result agrees with the
findings of earlier works that tall buildings tend to be more expensive than low
buildings with the same internal floor area. However, the sample was too small,
and the fitting was done by pure observation. Thus, it is doubtful whether it is
appropriate to interpret the relationship as being linear.
Buchanan (1972) uses the multiple regression technique for the development
of a model that represents the total cost of a reinforced concrete structure. The
model was developed from 38 reinforced concrete frame buildings that were
constructed by the Ministry of Public Building and Works of the UK between 1960
and 1968. The independent variables that are identified are the gross floor area,
storey height, number of storeys, average superimposed loading, shortest span,
longest span, slab concrete thickness and number of lifts.
Kouskoulas and Koehn (1974) represent the pre-design estimation of building
prices per square foot (price per area) as a function of six variables: building locality,
price index, building type, building height, building quality and building technology.
Karshenas (1984) regards the resulting pre-design estimation technique that is
devised by Kouskoulas and Koehn as being simple, fast and applicable to forecasts
60
for a wide variety of building construction projects, and opines that the methodology
might be generalised in a global sense. Kouskoulas and Koehn’s use of raw
dependent variables with the inflation index as the independent variable is also
particularly interesting. They use a multiple regression methodology to derive the
single cost-estimation function from 40 sets of data on building contracts in the US.
Disregarding the possibility of obtaining a better-performing model by the
elimination of some of the variables, they insist on keeping all of the variables, as
they believe that the better result that is obtained by omitting some of the variables is
due to a bias in the data sample. This supposition is rather subjective. The final
model is tested on only two ex ante projects, and shows little forecasting bias. This
test sample is also considered to be too small to draw a reasonable conclusion from.
Unfortunately, no results on the performance of the reduced model are shown as a
comparison in their paper.
In Australia, Braby (1975) studied the relationship between the height of
buildings, as represented by the number of storeys, and the building price per floor
area in eighty buildings in Melbourne. Instead of classifying the data according to
the building type, as is typical in other studies, Braby divides the data according to its
location relative to the central business district (i.e., whether it is inside or outside of
the central business district). The results of the linear regression indicate that
building prices generally increase with the number of storeys. However, the results
are not conclusive because the correlations are weak.
McCaffer (1975) summarises research work that was produced by
post-graduate students (Buchanan (1969); Gould (1970); Moyles (1973); Neale
(1973); Baker (1974); Blackhall (1974) and Coates (1974)) of the Department of Civil
Engineering at Loughborough University of Technology on the use of regression
analysis for forecasting. A summary of the models that were developed by the
post-graduate students is shown in Table 3-7. The paper raises an important
statistical concern about the deterioration of the performance of regression models in
actual forecasting using ex ante data from their validation performance using ex post
data. McCaffer’s experience indicates that the coefficient of variation (as a
measure of model performance) increases by 25% to 50% when the derived model is
applied to data outside of its own database. On these figures, a model with a coefficient of
variation of 10% in its validation would deteriorate to between 12.5% and 15% when used for other
cases of a similar type. This study, although it does not show detailed calculations as
evidence, is particularly important to studies in the area of cost modelling, as it is the
first in the field to address the difference between ex post performance and ex ante
performance. In fact, except when using a more advanced approach to resampling
validation, such as the cross validation (as is applied in this research), or
bootstrapping, it is crucial to measure both the ex post and ex ante performance in the
validation of a model to give a full picture of its performance.
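The distinction between ex post and ex ante performance can be illustrated with a minimal sketch. It assumes a single-predictor least-squares model, hypothetical project data, and a coefficient of variation defined as the root mean square residual divided by the mean actual price; none of these is taken from McCaffer's paper. The ex ante figure is obtained by leave-one-out cross validation.

```python
import math
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def coefficient_of_variation(actuals, forecasts):
    """Root mean square residual as a percentage of the mean actual price
    (an assumed definition, used here only for illustration)."""
    squared = [(f - a) ** 2 for a, f in zip(actuals, forecasts)]
    return 100.0 * math.sqrt(statistics.fmean(squared)) / statistics.fmean(actuals)


def ex_post_cv(xs, ys):
    """Fit on all cases, then 'forecast' the same cases."""
    a, b = fit_line(xs, ys)
    return coefficient_of_variation(ys, [a + b * x for x in xs])


def leave_one_out_cv(xs, ys):
    """Forecast each case from a model fitted to all of the other cases."""
    forecasts = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        forecasts.append(a + b * xs[i])
    return coefficient_of_variation(ys, forecasts)


# Hypothetical data: floor area (m2) and price (thousands) for eight projects.
areas = [1200.0, 1500.0, 2100.0, 2600.0, 3000.0, 3800.0, 4500.0, 5200.0]
prices = [1010.0, 1190.0, 1720.0, 1980.0, 2450.0, 2900.0, 3720.0, 3990.0]

cv_ex_post = ex_post_cv(areas, prices)
cv_ex_ante = leave_one_out_cv(areas, prices)
print(f"ex post c.v.: {cv_ex_post:.2f}%  leave-one-out c.v.: {cv_ex_ante:.2f}%")
```

For an ordinary least-squares fit, each leave-one-out residual is at least as large in magnitude as the corresponding in-sample residual, so the leave-one-out figure is never smaller than the ex post figure.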
Based on the theoretical study that was undertaken by Steyert (1972), who
suggests that the costs of the various elements of a building respond differently to
changes in the number of storeys, Flanagan and Norman (1978) further elaborate his
idea by suggesting that the cost components of a building can be split into four
categories: those that fall as the number of storeys increases, those that rise as the
number of storeys increases, those that are unaffected by height and those that fall
initially and then rise as the number of storeys increases. They use the learning
curve that was produced by the Committee of Housing in New York to illustrate that
every time the number of repetitions doubles, the output time declines by a fixed
percentage. Fifteen office buildings of more than two storeys that were built
between 1964 and 1975, including the ten that were used by Tregenza (1972), are
selected for curve fitting. They apply the regression analysis technique to model
the relationship between the building height and building price. By making the
assumption that other influencing variables, such as the quality of building,
geographical location, size of project, site characteristics and so forth, are constant,
the results show that the relationship between the price per square meter and the
number of storeys in an office is projected to be U-shaped.
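The learning curve relationship mentioned above is commonly written as a power function of the number of repetitions. The sketch below assumes that standard form, which is not stated in the source, and uses hypothetical figures (100 hours for the first repetition and a 90% learning rate, i.e. a 10% decline with each doubling).

```python
import math


def unit_time(first_unit_time, repetition, learning_rate):
    """Time for the n-th repetition when every doubling of n multiplies
    the time by `learning_rate` (0.9 means a 10% decline per doubling)."""
    exponent = math.log2(learning_rate)  # negative when learning_rate < 1
    return first_unit_time * repetition ** exponent


# Hypothetical figures: 100 hours for the first repetition, 90% learning rate.
times = [unit_time(100.0, n, 0.9) for n in (1, 2, 4, 8)]
print([round(t, 1) for t in times])
```

Each doubling of the repetition count (1, 2, 4, 8) multiplies the time by the same factor, which is the fixed percentage decline described in the text.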
Karshenas (1984) uses data from 24 historical multi-storey office buildings in
the US to derive the mathematical relationship between price, overall building height
and average floor area (termed “typical floor area” in Karshenas’ paper). By
merely observing the points that are distributed on the chart of the average floor area
against the height, a set of contours that represents the constant per area price for
different heights and average floor areas is constructed on the chart. Based on the
shape of the contours, the author hypothesises that building price is a function of the
average floor area and overall building height:
$C = \alpha \cdot A^{\beta} \cdot H^{\gamma}$, (3.5)
where C is the building price, A is the average floor area, H is the overall
building height and α, β and γ are the constants.
By transforming both sides with a natural logarithm, the equation becomes:
$\ln C = \ln \alpha + \beta \cdot \ln A + \gamma \cdot \ln H$. (3.6)
This transformed hypothetical equation suits the methodology of the multiple
linear regression. To make building prices comparable, Karshenas updates all
prices to the base of March 1982, according to the price index. His derived model
of building price with the average area and overall height as variables is compared
with the floor area model using floor area unit rates from the published price book.
Unfortunately, the comparison does not pay attention to the deterioration problems
that are addressed by McCaffer. Thus, the conclusion that a better method has been
developed is not persuasive enough.
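The transformation in equations (3.5) and (3.6) can be sketched as follows. The data are hypothetical and are generated without noise from assumed constants (α = 2.0, β = 0.9, γ = 1.2), so the fit simply recovers them. The function names are illustrative and are not taken from Karshenas' paper.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_log_linear(prices, areas, heights):
    """Least-squares fit of ln C = ln(alpha) + beta ln A + gamma ln H.
    Returns (alpha, beta, gamma)."""
    design = [[1.0, math.log(a), math.log(h)] for a, h in zip(areas, heights)]
    target = [math.log(c) for c in prices]
    k = 3
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, target)) for i in range(k)]
    ln_alpha, beta, gamma = solve_linear_system(xtx, xty)
    return math.exp(ln_alpha), beta, gamma


# Hypothetical noise-free data generated from alpha=2.0, beta=0.9, gamma=1.2.
areas = [500.0, 800.0, 1200.0, 600.0, 1500.0, 900.0]
heights = [12.0, 30.0, 45.0, 60.0, 24.0, 90.0]
prices = [2.0 * a ** 0.9 * h ** 1.2 for a, h in zip(areas, heights)]

alpha, beta, gamma = fit_log_linear(prices, areas, heights)
print(f"alpha={alpha:.3f}, beta={beta:.3f}, gamma={gamma:.3f}")
```

Taking logarithms turns the multiplicative model into one that is linear in ln α, β and γ, which is why ordinary multiple linear regression can be applied to it.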
Based on a large sample of 1188 projects, Pegg (1984) identifies ten
variables that have a statistically significant effect on the building price level: building price
date, location, selection of contractor, contract sum (≤£20,000 or >£20,000), building
function, measurement of structural steelwork, building height, form of contract, site
conditions and type of work. Among these significant variables, the only
quantitative variable is the building height. Skitmore and Marston (1999, p.252)
criticise the study for not giving a clear description of the method of analysis or of
levels of accuracy.
Apart from summarising the model development in the résumé that was cited
earlier in this review, Skitmore and Patchell (1990) also demonstrate the use of the
multiple regression analysis technique in the development of a forecasting model of
building price per gross floor area (GFA) based on six raw independent variables,
including the number of storeys. Data was extracted from 28 office buildings in the
UK for the period 1982 to 1988. The final model is a natural logarithmic
transformed model that is derived by forward stepwise regression. It contains three
chosen variables: the number of bidders, GFA and the contract period. Very
detailed empirical work is incorporated in this study, but no ex ante performance
validation is included.
Khosrowshahi and Kaka (1996) use multivariate regression analysis with an
improvised iterative method to develop forecasting models for the cost and duration
of housing projects. The objective of the paper is to develop building price and
duration forecasting models for both the contractor and the client. Fifteen variables
are taken into account, including the number of storeys, which is divided into three
groups (low, medium and high). Data from 54 housing projects in the UK in the
period 1981 to 1991 are used. Six of the fifteen candidate variables are selected by
multivariate regression analysis. These include one scale variable, ‘unit’, and five
categorical variables, ‘project operation’ (which comprises refurbishment, extension,
alteration and new); ‘project sub-type’ (whether sheltered, public or bungalow);
‘abnormality’ (which comprises access to site, poor communication, repeating
stoppages, sudden speed ups, transportation problems, time and cost yardsticks,
keeping occupation and unknown factor, contractor’s mistakes, various delays,
resource shortages, repeating variations, lack of presence and others); ‘starting
month’ (January, February, March, April, May, June, July, August, September,
October, November or December); and ‘horizontal access’ (whether good, fair or
poor). The final model may be problematic in application. In real life,
abnormalities cannot be assumed to be mutually exclusive and independent of each
other. The presence of more than one abnormality at the same time and the
interdependence of those abnormalities could easily undermine the model. Also,
no actual performance validation is shown in Khosrowshahi and Kaka’s paper.
An interesting commonality between Skitmore and Patchell’s final office
price model and Khosrowshahi and Kaka’s house model is that the variable that
represents the height of building was eliminated during the process of selecting the
variables. A summary of the forecasting targets (dependent variables) and the
influencing variables (independent variables) for the building price forecasting models
as reviewed in this section is shown in Table 3-8. The influencing variables in the
table are classified according to whether they are quantitative (measurable) or
qualitative (intangible, normally divided into various levels) in nature. Most of the
studies include both quantitative and qualitative measures in their models. This approach is
acceptable, because these models all belong to the category of ‘black-box’
forecasting tools, which are validated solely on the performance accuracy of the
models. However, the ways in which the qualitative variables are chosen, defined
and divided into various levels, are mainly based on the experience of the modellers.
By defining the variables or the levels or scales differently, a rather different final
model may be produced. Thus, these models must be used with extra care.
Alternatively, this possible flaw can be avoided by employing models that use only
quantitative variables. Instead of putting the qualitative variables into the model, an
alternative approach is to group the data with similar qualitative characteristics
together to derive a model that explains only a particular set of qualitative
characteristics. This approach, however, produces more models than a generalised
approach does, and the grouping criteria are subjective.
Armstrong (2001 pp.342-345) reviews the general principles for using
forecasting methods in published research. He concludes that a quantitative method
should be used if there is enough data. In consideration of the limited amount of
data that is available in the early design stage (because many of the qualitative
characteristics of a project are yet to be determined), the quantitative variables of floor
area, roof area, basement wall area and external wall area, as identified in JSEM, are
used in the model development in this research.
3.12 Occam’s Razor: Parsimony of Variables
For a given set of data, there are always an unlimited number of possible
explanatory models. If a model is too simple, then the model and its predictions
will be unrealistic, whereas if it is too complex, then the model will be specific but
its predictions unreliable (Edwards 2001 p.129). It has long been advocated that
“economists should follow the advice of natural scientists and others to keep their
models sophisticatedly simple, especially as simple models seem to work well in
practice” (Zellner 2001 p.4).
In the world of scientific modelling and theory development, scientists should
adopt the underlying principle of parsimony to distinguish a better model from others.
The principle of parsimony, also known as Occam’s razor, is attributed to the
mediaeval philosopher William of Occam, who suggested that “pluralitas non est
ponenda sine necessitate” (another version is “entia non sunt multiplicanda praeter
necessitatem”), which means entities should not be multiplied unnecessarily. Thus,
if there are two competing theories (or models in the context of this study) that both
describe the same characteristics of observed fact (data set), then the simpler of the
two should be adopted until more evidence comes along (Stangl 1997). Occam’s
razor is particularly important in the development of universal models, as the subject
domain of these models is of an unlimited complexity. Because of this complexity,
the chance of obtaining a manageable model is very slight if the modelling process
starts with a very complicated theoretical foundation. The same principle also
applies to the development and selection of early stage forecasting models, because
data that are obtained at the early stage are highly abstract and uncertain, and the use
of a complicated model will inevitably add unnecessary assumptions. The
discourse on scientific theory suggests that no theory can be totally validated, but
that any theory can be falsified by facts (Popper 1959 pp.78-92). Thus, science
operates according to the principle of parsimony.
To apply the principle of parsimony to model selection, Simon (2001 p.35)
suggests expressing parsimony as a measure in the ratio of the complexity of the data
set to the complexity of the formula set. In the context of the competition between
two models, the parsimony of the relationship of the data set to the simpler model
(e.g., a linear model that contains one explanatory variable) is greater than the
parsimony of its relationship with the more complex model (e.g., a linear model
containing two explanatory variables) if they both describe a data set equally. By
the same token, the parsimony of the relationship of a model with a larger data set is
greater than the parsimony of the relation of the model with a smaller data set, if the
same model equally describes the two data sets.
To implement Occam’s razor, regression techniques can be applied to achieve
the parsimony of variables. This involves the development of a model through the
least-squares error method for a given domain (data for a particular type of building).
The goal of the final model is to produce accurate forecasts, and the criterion for the
selection of that model is the forecasting accuracy.
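A minimal sketch of why forecasting accuracy, rather than fit to the modelling data, serves as the selection criterion. With least squares, adding a predictor can never increase the in-sample sum of squared errors, so in-sample fit always favours the more complex model, whereas the leave-one-out figure carries no such guarantee. The data, the choice of a second predictor (number of storeys) and the function names are hypothetical.

```python
def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aug[r][n] - known) / aug[r][r]
    return coeffs


def fit(rows, ys):
    """Least squares with an intercept; each row lists the predictor values."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, row):
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))


def in_sample_sse(rows, ys):
    """Sum of squared errors when the model forecasts its own fitting data."""
    coeffs = fit(rows, ys)
    return sum((y - predict(coeffs, r)) ** 2 for r, y in zip(rows, ys))


def leave_one_out_sse(rows, ys):
    """Sum of squared errors when each case is forecast from the other cases."""
    total = 0.0
    for i in range(len(ys)):
        coeffs = fit(rows[:i] + rows[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - predict(coeffs, rows[i])) ** 2
    return total


# Hypothetical data: floor area (thousand m2), number of storeys, price (millions).
areas = [1.2, 1.5, 2.1, 2.6, 3.0, 3.8, 4.5, 5.2]
storeys = [4, 9, 5, 12, 7, 15, 10, 20]
prices = [1.01, 1.19, 1.72, 1.98, 2.45, 2.90, 3.72, 3.99]

one_var = [[a] for a in areas]
two_var = [[a, s] for a, s in zip(areas, storeys)]

for name, rows in (("area only", one_var), ("area + storeys", two_var)):
    print(name, round(in_sample_sse(rows, prices), 4),
          round(leave_one_out_sse(rows, prices), 4))
```

Under the principle of parsimony, the extra predictor would be retained only if it lowers the leave-one-out error, not merely the in-sample error.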
Forecasting accuracy is an objective measure of the success of a model, and
is also the expected fit to unseen data in a domain. It plays a very important role in
the judgment of models, as models themselves can never give error-free forecasts.
More reviews on forecasting accuracy are contained in Chapter 4.
Table 3-7: Summary of the models developed by the post-graduate students of
the Department of Civil Engineering at Loughborough University of
Technology (extracted from McCaffer 1975)
Author(s) | Subject of regression model | Variables | Model performance
Buchanan, J. S. (1969) | Reinforced concrete frame in buildings | Gross floor area, average load, shortest span, longest span, no. of floors, height between floors, slab concrete thickness and no. of lifts | More accurate for medium and high cost schemes than for low cost schemes
Gould, P. R. (1970) | Heating, ventilating and air conditioning services in buildings | Functions which described the heat and air flow through the building, the heat source and distance which it has to be ducted and shape | High accuracies at higher cost
Moyles, B. F. (1973) | System built school buildings | Floor area, area of external and internal walls, no. of rooms and functional units, area of corridors, storey height and no. of sanitary fittings | Generally, high accuracy
Neale, R. H. (1973) | Houses for private sale | Floor area, area of roof, area of garage, number of storeys, slope of site, unit cost of external finishes and cost of sanitary fittings, area and volume of kitchen units, site densities, regional factors, number of doors, area of walls, number of angles on plan, construction date and duration of development, and type of central heating | Only two cases fell outside the ±10%
Baker, J. (1974) | Residential apartment scheme for old people | Area of single units, double units, triple units, common rooms, Warden’s Flat, laundry, access corridors, number of lifts and garages and duration of contract | Coefficient of variation (c.v.) of 9.16%
Blackhall, J. D. (1974) | Passenger lifts in office building | Contract date, dimensions of the car, no. of landings, length of travel, operating speed, type of control system and location of installation | Coefficient of variation (c.v.) of 20.9%
Blackhall, J. D. (1974) | Electrical services in buildings | No. of distribution boards, fused load, number of active ways, no. of socket and other outlets, voltage, contract date and a differentiation of whether the building was commercial or domestic use | Coefficient of variation (c.v.) of 20.0%
Coates, D. (1974) | Motorway drainage (including three models: (1) using porous pipes; (2) using helpline pipes and (3) using asbestos pipes) | Internal diameter, average depths and cost of pipes | (1) For porous pipe, c.v. of 12.8%; (2) For helpline pipe, c.v. of 9.2%; and (3) For asbestos pipe, c.v. of 6.9%
Table 3-8: Summary of Forecasting Targets and Influencing Variables in
Previous Empirical Studies
Variables | Empirical studies
Forecasting Targets (Dependent Variables Used)
Overall building price / Cost of reinforced concrete structure | James (1954); Buchanan (1972); Moyles (1973); Neale (1973); Baker (1974); Karshenas (1984); Singh (1990); Khosrowshahi and Kaka (1996)
Overall building price per square meter of floor area | Department of Environment (1971); Tregenza (1972); Kouskoulas and Koehn (1974); Braby (1975); Flanagan and Norman (1978); Skitmore and Patchell (1990)
Influencing Factors (Independent Variables Used)
Building type | Kouskoulas and Koehn (1974); Khosrowshahi and Kaka (1996)
Gross floor area | Buchanan (1972); Moyles (1973); Neale (1973); Skitmore and Patchell (1990)
Typical floor area | Karshenas (1984)
Number of storeys | Department of Environment (1971); Buchanan (1972); Tregenza (1972); Neale (1973); Braby (1975); Flanagan and Norman (1978); Singh (1990)
Overall height | Kouskoulas and Koehn (1974); Karshenas (1984)
Storey height | Buchanan (1972); Moyles (1973)
External wall area | James (1954); Moyles (1973)
Location index (No location index in Hong Kong) | Kouskoulas and Koehn (1974); Neale (1973)
Roof area | James (1954); Neale (1973)
Starting date | Neale (1973); Khosrowshahi and Kaka (1996)
Contract duration | Neale (1973); Baker (1974); Skitmore and Patchell (1990)
Area of garage | Neale (1973); Baker (1974)
Area of corridors | Buchanan (1972); Baker (1974)
Number of lifts | Buchanan (1972); Baker (1974)
Basement wall area | James (1954)
Average superimposed loading, shortest span, longest span, slab concrete thickness | Buchanan (1972)
Internal wall area, number of rooms and functional units, number of sanitary fittings | Moyles (1973)
Slope of site, unit cost of external finishes, cost of sanitary fittings, area and volume of kitchen units, site densities, number of doors, area of walls, number of angles on plan, duration of development, type of central heating | Neale (1973)
Area of single units, double units, triple units, common rooms, Warden’s Flat, laundry | Baker (1974)
Price index, quality, building technology | Kouskoulas and Koehn (1974)
Quantities of constituents of concrete construction, structural scheme, section of beams, grade of concrete, grid location, grid size | Singh (1990)
Number of bidders | Skitmore and Patchell (1990)
Project operation, abnormality, and horizontal access | Khosrowshahi and Kaka (1996)
Note: Bold typed variables are measurable.
3.13 Summary
A building price forecasting model is a system that produces forecasted prices
from historical data. It is a type of technical model that attempts to identify the
variables that have the most influence on building prices. Forecasting models can be
distinguished according to whether they are black box or realistic, deterministic or
stochastic, and deductive or inductive. A more detailed classification was prepared
by Newton using descriptive primitives. According to his classification, the final
models in this research are specific for individual types of building (Data); applicable
to finished works, i.e. equating price to a function of identified variables that
comprises floor and external wall areas and so forth (Units); represent the designer’s
price forecast (Usage); follow the macro approach, i.e., producing forecasts for the
whole building (Approach); are applied at the feasibility and sketch design stage
(Application); are simulation models in terms of the problem boundary, the variables
considered and the inter-relationships between the variables (Model); are generated
by regression analysis (Techniques); are based on explicit assumptions about defined
problem boundaries (Assumptions) and are stochastic in terms of their performance
assessment (Uncertainty).
The characteristics of different types of forecasting cost models are
summarised in Skitmore and Patchell’s study. The application of cost models is
highly restricted by the assumptions that lie beneath the models, their reliance on
historical data for predicting future events, the insufficiency of information and
preparation time, and their reliance on expert judgment. Many studies on the
development of new models are criticised for their overemphasis on the uniqueness
and innovativeness of the model, their neglect of the practicability of the model
and the lack of a clear demonstration of the benefit of the model, especially in terms
of their forecasting performance relative to the conventional models. For more
advanced models to be adopted in practice, both their statistical significance and
their practical significance must be addressed.
James’ storey enclosure model (JSEM), proposed in 1954, has been chosen
for further development. The original model uses some physical measurements,
such as floor area, roof area and elevation area of buildings to estimate building
prices. Although JSEM is not a widely used model in practice, and suffers from the
same inherent shortcomings as other early stage conventional forecasting models,
JSEM has been proved empirically to outperform other models. The simplified
equation for JSEM for multi-storey buildings (as elaborated in Chapter 5) shows that
improving the model can be treated as a problem of determining the best set of
predictors, so regression analysis is employed to improve the JSEM further. The regressed models are
developed empirically, and are expected to be understandable, easy to use, fairly
accurate and reliable.
Predictors for regression models that have been used in previous studies are
reviewed. The two most commonly studied variables are the floor area, as
represented by the gross floor area or typical floor area, and the building height, as
represented by the number of storeys, overall height and storey height. The former
variable represents the costs of the horizontal elements of a building, whereas the
latter variable represents the costs of the vertical elements. In JSEM, the identified
variables include the areas of the floor, roof, basement walls and external walls.
The price of buildings can be expressed in an unlimited number of ways with
different mathematical functions and variables. Occam’s razor is addressed at the
end of this chapter because it is considered to be the most important principle for
model development. Taking this into account, the regression technique that is used
in this research is considered to be the means to achieve the necessary parsimony.
Chapter 4 Performance of Forecasting Models
The more unpredictable the world is the more we rely on predictions.
Steve Rivkin
4.1 Introduction
It is essential for modellers to demonstrate the benefits of a new forecasting
model or approach to practising forecasters before its launch. The fundamental
benefit that a new model should show is an improvement in forecasting performance.
For instance, this study hypothesises that the new regressed models outperform the
conventional models in terms of forecasting accuracy. Much research has been
conducted in the past on the subject of forecasting performance, and some of this
research has studied the determinants of forecasting performance.
The measures for forecasting performance include bias, consistency and
accuracy. The bias in forecasting that is produced by a model is generally
represented either by the average percentage difference between the designers’
forecast and the lowest tender sum, or the average ratio between them. Bias is the
most popular measure of performance. Consistency refers to the degree of variation
around the average that is represented by standard deviations, and accuracy is the
combination of bias and consistency into a single quantity (Skitmore 1991 p.2).
4.2 Measures of Forecasting Accuracy
A naive definition of accuracy would be the absence of error, or the assertion
that the smaller the error, the higher the accuracy and vice versa (Flanagan and
Norman 1983). Accuracy measures are usually defined in terms of the ratio of the
lowest bid to a forecast, the ratio of a forecast to the lowest bid (the reciprocal of the
ratio of the lowest bid to a forecast), the percentage by which the lowest bid exceeds
a forecast, the percentage by which a forecast exceeds the lowest bid (the converse
of the percentage by which the lowest bid exceeds a forecast), the difference between
the lowest bid and a forecast, and the total number of “serious” errors. As the
percentage by which a forecast exceeds the lowest bid is a widely accepted
expression of error in practice and is a unit-free measure, it is used to measure
accuracy in this study.
Accuracy has two major components that need to be considered when interpreting accuracy measures:
bias and consistency. Bias can be measured by the arithmetic mean, median,
Pearson r, Spearman’s rho and the coefficient of regression of the errors, the
percentage errors or the ratios described above. The first measure of bias uses
forecasts as the base of reference, which is suitable for the evaluation of the
forecasting performance of an individual forecaster or an individual company. The
second and third measures are statistically the same. The fourth measure does not
take into account the scale of the data, and data with large numbers might easily
dominate the comparisons.
Consistency can be measured by the standard deviation and coefficient of
variation of the errors, the percentage errors or the ratios described above. While
they both represent the degree of variation around the mean, the latter measure
adjusts for differences in the magnitudes of the means of the data sets.
Instead of measuring bias and consistency separately, accuracy can be
measured by a single quantity. The common combined measures, found mainly
in research on modelling, are the mean square error, the root mean square error
and the mean modulus (absolute) percentage error.
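The measures above can be sketched for a small set of hypothetical forecasts and lowest bids. The error is taken as the percentage by which a forecast exceeds the lowest bid; the function names and figures are illustrative assumptions.

```python
import math
import statistics


def percentage_errors(forecasts, lowest_bids):
    """Percentage by which each forecast exceeds the lowest bid."""
    return [100.0 * (f - b) / b for f, b in zip(forecasts, lowest_bids)]


def accuracy_summary(forecasts, lowest_bids):
    errors = percentage_errors(forecasts, lowest_bids)
    mse = statistics.fmean([e * e for e in errors])
    return {
        # Bias: mean percentage error.
        "bias": statistics.fmean(errors),
        # Consistency: sample standard deviation of the percentage errors.
        "consistency": statistics.stdev(errors),
        # Combined accuracy measures.
        "mean_square_error": mse,
        "root_mean_square_error": math.sqrt(mse),
        "mean_absolute_percentage_error": statistics.fmean(
            [abs(e) for e in errors]
        ),
    }


# Hypothetical forecasts and lowest bids for four projects.
forecasts = [105.0, 98.0, 120.0, 90.0]
lowest_bids = [100.0, 100.0, 100.0, 100.0]

errors = percentage_errors(forecasts, lowest_bids)
summary = accuracy_summary(forecasts, lowest_bids)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For these figures the mean square error (132.25) equals the squared bias (3.25² = 10.5625) plus the population variance of the errors (121.6875), which is the sense in which a combined measure reflects both bias and consistency.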
Skitmore et al. (1990 pp. 5-23) extensively review the measures of
performance of forecasts in the literature. The different representations of bias,
consistency and accuracy (combined accuracy measures) in previous research are
summarised in Table 4-1. The authors found that the consistency measures in terms
of the coefficient of variation of forecasts and the overall accuracy measures are
far less frequently used than bias measures.
Since all of the models that are compared in this study are generated and tested with
the same set of data, and the use of cross validation for modelling is likely to
produce mean percentage errors that are close to zero, the effect of the magnitude
differences mentioned earlier is likely to be small, which lessens the benefit of using
the coefficient of variation. To compare models deterministically, a forecasting
model that is less biased (e.g. smaller mean error) and more consistent (e.g. smaller
standard deviation of error), or more accurate (e.g. smaller mean square error) than
other models is preferable. However, the more sophisticated probabilistic
approach suggests that statistical inference should be used to conclude whether one
model is significantly better than the others. There are far more statistical inference
methods available for the measures using mean and standard deviation (or variance,
i.e. the square of standard deviation). Therefore, this study adopts the mean and
standard deviation of percentage error as the measures of forecasting performance.
Skitmore et al. (1990 pp. 5-23) also suggest that there are five primary
determinants that affect forecasting performance: the nature of the target, the
information used, the forecasting technique used, the feedback mechanism used and
the person who is providing the forecast. Except for the feedback mechanism, the
other factors have been well explored by many researchers. A summary of the
empirical evidence on the factors that affect forecasting quality is shown in Table 4-2.
The table is an extended version of a similar table that was prepared by Skitmore et
al. (1990 pp. 20-21), with more recent empirical studies incorporated.
One of the major inadequacies that is found from a review of the literature on
forecasting accuracy is that some of the evidence is not strong enough because of a
lack of tests for the significance of forecasting errors (Skitmore and Drew 2003).
According to Table 4-2, there are a few contradictory results, but these contradictions
might have occurred by chance and may not represent the true population (Gunner
1997 p.30-31).
Table 4-1: Measures of Performance of Forecasts (Source: Skitmore et al. 1990
p. 22)
Table 4-2: Factors affecting quality of forecasts – summary of empirical
evidence (extended from the similar table in Skitmore et al. (1990, p. 20-21))
Factor | Researcher | Evidence
(1) Nature of target
Contract works type | McCaffer (1975) | Buildings more biased and more consistent than roads.
Contract works type | Harvey (1979) | Different biases for buildings, non buildings, special trades, and others.
Contract works type | Morrison & Stevens (1980) | Different bias and consistency for schools, new housing, housing modifications, and others.
Contract works type | Flanagan & Norman (1983) | No bias differences between schools, new housing, housing modifications, and others.
Contract works type | Skitmore (1985) | Different bias and consistency for school, housing, factory, health centre and offices.
Contract works type | Skitmore & Tan (1988) | No bias or consistency differences for libraries, schools, council houses, offices and other buildings.
Contract works type | Skitmore et al (1990, pp. 79-87) | No bias or consistency differences for primary school, sheltered housing, offices, unit factories, health centres and other buildings.
Contract works type | Quah, L. K. (1992) | New works more consistent than refurbishment.
Contract works type | Gunner and Skitmore (1999) | No bias or consistency differences for commercial, non-commercial and residential buildings. Renovation works more biased and more consistent than new works.
Contract works type | Skitmore and Drew (2003) | No bias or consistency differences for commercial, health, apartment, education and other. No bias or consistency differences for new and alterations works.
Contract size | McCaffer (1975) | No bias trend.
Contract size | Harvey (1979) | Bias reduces with size.
Contract size | Morrison & Stevens (1980) | Modulus error reduces with size. Consistency improves with size.
Contract size | Flanagan & Norman (1983) | Bias trend reversed between samples.
Contract size | Wilson et al (1987) | No linear bias trend.
Contract size | Skitmore & Tan (1988) | Bias reduces and consistency improves with size.
Contract size | Skitmore (1988) | No consistency trend.
Contract size | Ogunlana and Thorpe (1991) | Consistency reduces with larger contract size.
Contract size | Cheong (1991, p.106) | No consistency trend.
Contract size | Thng (1989) | Ditto.
Contract size | Gunner and Skitmore (1999) | Bias reduces and consistency improves with size.
Skitmore (2002) No bias or consistency trend.
Skitmore and Drew (2003) No bias or consistency trend.
Project size (area)
Skitmore and Drew (2003) No bias or consistency trend.
Contract conditions type
Wilson et al (1987) More bias for bill of quantities contracts.
Gunner and Skitmore (1999) (1) Bias difference between conditions of contract issued by Singapore Institute of Architects (SIA) and standard form (RHLB form). (2) Consistency difference between contract with a fluctuation provision and contract without.
Geographical location
Harvey (1979) Bias differences between Canadian regions.
Wilson et al (1987) No bias trend between Australian regions.
Ogunlana and Thorpe (1991) No conclusion although bias and consistency difference between regions of United Kingdom.
Nature of competition
Harvey (1979) Bias differences for individual bidders.
McCaffer (1975) Estimates higher with more bidders.
de Neufville et al (1977) Ditto.
Harvey (1979) Ditto. Inverse number of bidders gives best model.
Flanagan & Norman (1983) Estimates higher with more bidders.
Runeson & Bennett (1983) Ditto.
Hanscomb Association (1984) Estimates higher with more bidders. Non linear relationship.
Wilson et al (1987) Ditto.
Tan (1988) Ditto but not with UK data.
Ogunlana and Thorpe (1991) No bias and consistency trend.
Skitmore (2002) No consistency trend.
Prevailing economic climate
de Neufville et al (1977) Estimates higher in ‘bad’ years with lagged response rate.
Harvey (1979) Ditto.
Flanagan & Norman (1983) Ditto.
Morrison & Stevens (1980) Estimates lower in uncertain economic climate.
Ogunlana and Thorpe (1991) No significant relationship.
Gunner and Skitmore (1999) Estimates higher in ‘bad’ years with lagged response rate.
Price intensity Skitmore et al (1990, p.191) High value contracts were underestimated and low value contracts over estimated.
Gunner and Skitmore (1999) Ditto.
Contract period Skitmore (1988) No difference between groups of contract period
Gunner and Skitmore (1999) No conclusion due to different results obtained from using contract sum as the base for measurement of bias against contract sum minus provisional sums as the same.
Other project characteristics
Skitmore & Tan (1988) Bias reduces and consistency trend with contract period and basic plan shape.
Ogunlana and Thorpe (1991) Bias and consistency differences between design offices.
Gunner and Skitmore (1999) (1) Bias and consistency better for foreign than local (Singapore) contractors. (2) No conclusion although bias difference between foreign and local architects. (3) Consistency improves with increasing area. (4) Consistency better for private sector than public sector
Skitmore and Drew (2003) No bias or consistency trend with client type.
(2) Level of information
Number of priced items
Jupp & McMillan (1981) Slight bias reduction with price data.
Bennett (1987) Consistency differences between price data sources.
Skitmore (1985) No bias or consistency trend with price data. Increased bias and consistency with project information.
Gunner and Skitmore (1999) Consistency reduces as the number of items reduces.
Preliminaries percentage
Gunner and Skitmore (1999) No conclusion due to different results obtained from using contract sum as the base for measurement of bias against contract sum minus provisional sums as the same.
(3) Forecasting technique
James (1954) Consistency differences between cube, floor area and storey enclosure methods.
McCaffer (1975) Consistency better for regression methods than conventional.
Morrison & Stevens (1981) Simulation model has less bias and more consistency than conventional
Ross (1983) Consistency better with simpler techniques.
McCaffer et al (1984) Consistency of regression and method comparable with conventional.
Brandon et al (1988) Expert system has less bias and more consistency than conventional.
Munns and Al-Haimus (2000) Cost significant global model has less bias and more consistency than conventional.
Skitmore and Drew (2003) Bias and consistency differences between approximate quantities and floor area methods. Consistency better for floor area method.
(4) Use of feedback
No evidence available
(5) Ability of forecasters
Forecasters Jupp & McMillan (1981) Bias and consistency differences between subjects.
Morrison & Stevens (1980) Bias and consistency differences between offices.
Skitmore (1985) Bias and consistency differences between subjects.
Skitmore et al (1990) Bias and consistency differences between subjects.
Gunner and Skitmore (1999) No bias but consistency differences between subjects
Number of price forecasts
Gunner and Skitmore (1999) Bias reduces in proportion to number of price forecasts
4.3 Base Target for Forecasting Accuracy
Generally speaking, contractors derive a tender price by summing the
estimated total costs of production (including head office overheads and the cost of
finance) and their mark up. For the traditional procurement method, where the task
of design is separated from that of construction, design team members gain no access
to the details of the estimated costs of production or the allowed mark up in tenders.
The target of forecasts at the early design stage is the returned tender price,
rather than the final contract price, as the latter presents far too many unforeseeable
reasons and uncertainties, such as the possibility of contractual claims, that would
frustrate the forecasts of the final contract sum in the early design stage. There is,
however, a controversy between practice and academia about the use of returned
tenders as the forecasting target. Some suggest that the target should be the lowest
returned tender price (Morrison 1994; Ogunlana and Thorpe 1987) whilst others
suggest employing the mean (McCaffer 1976) or median of the returned tender
prices. The proposal for using the mean rests on the argument that it is less
variable, and is therefore likely to be forecast more accurately. The proposed use of the
median simply derives from a conservative notion, though one that is widely
accepted by practicing forecasters, that the possibility of underestimation should be
avoided. As price models are used to forecast the market price (i.e. the unknown
value of the contract to contractors buying on the contract market) (Skitmore and
Marston 1999, p.20), the lowest returned tender price is chosen to be the forecasting
target in this study. After all, the major interest of the forecasting exercise is to
predict the probable market price, and the use of the mean or median is ill defined.
Moreover, the effect of using the lowest or the mean tender on the assessment of
accuracy is found to be small (Beeston 1983).
4.4 Overview of Model Performance at Various
Design Stages
The field of forecasting techniques has been studied for more than four
decades. The popularity of research on this subject is due to the inherent
shortcomings of the conventional models of forecasting, as is described in Chapter 2.
As is detailed in section 3.6 of Chapter 3, modellers should demonstrate the benefits
of a model before attempting to implement it in practice, and thus the assessment of
the performance of a newly developed model is essential. However, only a few
empirical studies demonstrate the performance of new models, in a relative sense,
compared with that of conventional models, and even fewer deal with performance
measurement seriously by the use of statistical inference.
Barnes (1971) suggests that the performance of designers’ forecasts at the
commencement of feasibility studies is between 20% and 40% in terms of the
coefficient of variation (cv), which improves to between 10% and 20% cv at the
commencement of the detailed design stage.
Beeston (1974) uses a hypothetical example to show that the variability of
designers’ forecasts can, at best, be reduced to a level close to that of the
contractor’s estimate. On the assumption that the variability of the designers’
forecast can be reduced to 5%, the coefficient of variation of the differences
between the forecast and the lowest tender would be 6%, and there would be no
further reduction. If a coefficient of variation of 6% could be achieved, then 60% of
the designers’ forecasts would fall within 5% of the lowest tender, 90% would fall
within 10% and all of the forecasts would fall within 20%.
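The percentages attached to a 6% coefficient of variation can be checked with a short calculation. The sketch below is an illustration only: it assumes that the percentage differences between forecast and lowest tender are normally distributed with zero bias and a standard deviation of 6%, which is an assumption made here and not a statement of how Beeston derived the figures. The function name `share_within` is hypothetical.

```python
import math


def share_within(limit_pct: float, cv_pct: float) -> float:
    """Share of forecasts whose error lies within +/- limit_pct.

    Assumes zero-mean, normally distributed percentage errors whose
    standard deviation equals cv_pct (an illustrative assumption).
    """
    z = limit_pct / cv_pct
    # For a standard normal variable Z, P(|Z| <= z) = erf(z / sqrt(2)).
    return math.erf(z / math.sqrt(2.0))


for limit in (5.0, 10.0, 20.0):
    print(f"within {limit:.0f}% of the lowest tender: {share_within(limit, 6.0):.1%}")
```

Under these assumptions the three shares come out at roughly 60%, 90% and 99.9%, which agrees with the figures quoted above.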
Marr (1974) divides designers’ price forecasting into four stages: planning,
budget, schematics and preliminaries. Their corresponding adequate degrees of
accuracy are stated as 20-40% for planning, 15-30% for budget, 10-20% for
schematics and 8-15% for preliminaries, reducing to 5-10%. McCaffery (1978), in
his assessment of the forecasting accuracy for 15 schools, also makes a similar
division of stages, i.e., forecast, brief, sketch plan and detailed design. Their
corresponding cv are 17%, 10%, 9% and 6%.
McCaffer (1975) compares the quality of eight multiple regression statistical
models with that of other unspecified (conventional) models that are used by
practicing forecasters. Table 3-7 shows the performance of the eight methods.
Based on the assumption that the coefficient of variation of the forecast is likely to be
25% to 50% greater than the coefficient of variation of the prediction, as suggested
by the author, the multiple regression approach is proved to produce better quality
forecasts than the other (unspecified) methods that are adopted in practice.
Ross (1983) reveals some surprising results on the relationship of the
sophistication of models and their performance. Three models are compared in his
study. The first uses the simple average of the value of sections of construction
work from a set of bills of quantities for previous contracts. The second model uses
a regression procedure to predict the total value from the sectional values, and the
third model uses a regression on the unit value of items. The models are therefore
arranged in order of increasing use of information. However, the results reveal the
first model to be the most accurate, with a coefficient of variation of 24.5%, followed
by the second model with a cv of 30.49%, and the third model with a cv of 52.66%,
which suggests, controversially, that the more sophisticated methods that utilise more
of the available data produce less accurate results.
Ashworth and Skitmore (1983) review the literature that is concerned with the
forecasting accuracy of conventional models. Their cited references are shown in
Table 4-3. They draw the important conclusions that certain types of project are
associated with higher degrees of accuracy, and that the estimation accuracy is found
to be 15% to 20% cv in the early design stages, which only improves to 13% to 18% at
the tender stage.
McCaffer et al. (1984) suggest a more sophisticated approach to forecasting
based on the element unit rate method. This approach involves the use of 32
different models together with a criterion for selecting the most appropriate model to
match the characteristics of the target. The reported consistency of this method is
between 10% and 19% cv, which is at least comparable to that of conventional
methods.
The research of Brandon et al. (1988) suggests the use of a developed expert
system for forecasting. The performance of the expert system for early stage
forecasting for office projects is reported to be within 5% of that predicted by the
expert forecaster, and the system provides a forecast within 10% of the lowest bid,
which is much better than that achieved by the average forecaster.
Skitmore and Drew (2003) reveal significant differences in both bias
(ANOVA test, p=0.021) and consistency (Bartlett’s test, p=0.030) between the
approximate quantities and the floor area method. Similar to the surprising result
found by Ross, the approximate quantities method (with 14.27% cv) that utilizes
more data is found to be less accurate than the floor area method (with 10.87% cv).
Table 4-3: Performance of designers’ forecasts reviewed by Ashworth and
Skitmore (1983)
4.5 Summary
The bias of the forecasts that are produced by a model is calculated as the
arithmetic mean of the percentage differences between the designers’ forecasts and
the lowest tender sums. Expressing the mean as a percentage gives a unit-free
measure of forecasting error. Consistency is the degree of variation around the
average, which is represented by the standard deviation of the percentage errors.
Both bias and consistency are chosen to be the accuracy measures in this study.
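The two measures can be sketched in a few lines. The example assumes that each percentage error is taken relative to the lowest tender and that consistency uses the sample standard deviation (n − 1 denominator); both choices, the function names and the figures are illustrative assumptions and are not taken from the study's data.

```python
from statistics import mean, stdev


def percentage_errors(forecasts, lowest_tenders):
    """Percentage difference of each forecast from its lowest tender."""
    return [100.0 * (f - t) / t for f, t in zip(forecasts, lowest_tenders)]


def bias(errors):
    """Bias: arithmetic mean of the percentage errors."""
    return mean(errors)


def consistency(errors):
    """Consistency: standard deviation of the percentage errors.

    The sample (n - 1) form is used here as an assumption.
    """
    return stdev(errors)


# Hypothetical forecasts and lowest tenders, in the same currency unit.
forecasts = [105.0, 98.0, 120.0, 90.0]
lowest_tenders = [100.0, 100.0, 100.0, 100.0]

errors = percentage_errors(forecasts, lowest_tenders)
print(f"bias = {bias(errors):.2f}%, consistency = {consistency(errors):.2f}%")
```

A positive bias indicates forecasts that are, on average, above the lowest tender; a smaller consistency value indicates less scatter around that average.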
There are different options as to the choice of forecasting target. The lowest
returned tender price is considered to be a more appropriate forecasting target than
the mean or median of returned tender prices, as the major concern of the forecasting
exercise is to predict the probable market price to be paid by clients, which is often
the lowest bid in a tendering exercise.
Results from studies on bias and consistency tend to be contradictory, rather
than conclusive. Although significance tests can provide strong evidence to show
whether one model prevails over the others, and as a result can demonstrate the
major benefit of using the model in terms of its accurate performance, a review of
forecasting studies finds that sufficient significance testing is lacking.
Empirical studies on forecasting accuracy in different stages suggest that
there is little improvement in accuracy as a building project proceeds from the early
design stage to the detailed design stage. Paradoxically, there are two studies, one
by Ross and the other by Skitmore and Drew, which provide evidence that rougher
models are more accurate than more sophisticated models.
Chapter 5 Methodology
By three methods we may learn wisdom: first, by reflection which is noblest; second, by imitation, which is the easiest; and third, by experience, which is the bitterest. Confucius
5.1 Introduction
James’ Storey Enclosure Model (JSEM) has been chosen for further
development because it is considered to be more sophisticated than other
conventional models in terms of the number of variables contained therein (such as
the floor area for each floor, basement area, external wall area and roof area of a
building) and the rationale behind the use of these variables (i.e. the consideration of
certain design factors, such as the shape of buildings, the vertical positioning of floor
areas, the storey heights and the cost of sinking storeys below ground in estimating
building prices). However, JSEM lacks the appropriate support for its assigned
weightings and selection of variables. Disregarding this deficiency, JSEM has been
judged, if rather roughly, to be a more accurate method than the floor area and the
cube models. The floor area model is still a very popular model that is widely
employed in practice, whereas JSEM is still only found in the textbooks of building
price studies. Because JSEM is as simple in application as other conventional
models, and has been proved to be relatively more accurate, it has been chosen for
further development in this research. The method for the development of JSEM
that is undertaken comprises the simplification of JSEM for multi-storey buildings by
making reasonable assumptions, the use of regression techniques for modelling
empirical data of building projects in Hong Kong and the assessment of forecasting
performance by statistical inference.
5.2 Research Framework
The further development of JSEM involves a purpose-designed modelling
approach that uses different regression techniques. Figure 5-1 shows the
framework for the identification, selection and validation of the price models in this
research. The framework comprises seven major steps: (1) the simplification of
JSEM, (2) data collection, (3) model building, (4) reliability analysis, (5) model
selection, (6) model adjustment and (7) performance assessment. The final price
models are developed through the identification of candidates in JSEM (Step 1) and
through the selection of predictors by regression techniques (Steps 2 to 6). The use
of regression techniques overcomes the major criticism of the irrationality of the
assigned weightings in JSEM. The forecasting performance of the final price models
(i.e. the best subset-regressed models) is then assessed by using the known measures
of the bias and consistency (Step 7). Finally, the best subset-regressed models are
compared with other conventional models to classify the models according to their
forecasting performance.
Figure 5-1: Research Framework for Identification, Selection and Validation of
Price Models
[Flowchart. Simplification of JSEM (identification of candidate variables); Data
Collection (classification and entry of historical data); Model Building (formation
of base model containing all candidates; selection by forward stepwise and backward
stepwise procedures; generation of subset models by least square error method);
Reliability Analysis (calculation of average MSQ for each subset model by cross
validation using the leave-one-out method: construct sub-samples, each omitting one
unique case from the sample; fit each sub-sample using least square method;
calculate forecasting error for each sub-sample; determine average MSQ for
sub-samples of a subset of predictors); Model Selection (best subset model, i.e. the
model with the smallest average MSQ); Model Adjustment (exclusion of offending
variables); Performance Assessment (accuracy testing against JSEM, floor area and
cube models).]
5.3 Types of Quantity Measured in Single-Rate
Forecasting Models
The traditional models that are used for comparison in this study, including
JSEM, the floor area model and the cube model, are all single-rate forecasting
models. JSEM is the most complicated of the models because it demands more
measured variables, including the area of each floor, the perimeter of each floor, the
storey height of each floor and the roof area. Since the introduction of these
traditional models, further models have been proposed, such as those reviewed in
Chapter 3. However, most of them demand far more information than can be
extracted from sketch drawings. In other words, these models can only be used at a
later design stage.
As described in Chapter 3, JSEM can be represented by Equation (5.1):
$$P = \left( \sum_{i=0}^{n} (2 + 0.15i)\, f_i + \sum_{i=0}^{n} p_i s_i + 2 \sum_{j=0}^{m} f'_j + 2.5 \sum_{j=0}^{m} p'_j s'_j + r \right) \cdot R, \tag{5.1}$$
where $P$ is the forecasted price, $f_i$ is the floor area at $i$ storeys above ground, $p_i$
is the perimeter of the external wall at $i$ storeys above ground, $s_i$ is the storey height
at $i$ storeys above ground, $n$ is the total number of storeys above ground level, $m$ is
the total number of storeys below ground, $f'_j$ is the floor area at $j$ storeys below
ground level, $p'_j$ is the perimeter of the external wall at $j$ storeys below ground level,
$s'_j$ is the storey height at $j$ storeys below ground level, $r$ is the roof area and $R$ is the
unit rate (determined by historical data).
The floor area model and cube model can also be represented mathematically
by Equations (5.2) and (5.3), respectively:
$$P = \left( \sum_{i=0}^{m+n} f_i \right) \cdot R \tag{5.2}$$
$$P = \left( \sum_{i=0}^{m+n} f_i \cdot s_i \right) \cdot R \tag{5.3}$$
According to Equations (5.1) to (5.3), there are common variables amongst
the three models (e.g., $f_i$, $m$ and $n$).
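The three single-rate models can be compared side by side in a short sketch. The functions below follow Equations (5.1) to (5.3) term by term; the function names, the representation of each level as an (area, perimeter, storey height) tuple and the example building are hypothetical. In practice each model would carry its own unit rate derived from historical data, so a common rate of 1.0 is used here only to expose the measured quantities.

```python
def jsem_price(floors, basements, roof_area, unit_rate):
    """Equation (5.1): storey enclosure quantity times the unit rate.

    floors and basements are lists of (area, perimeter, storey_height)
    tuples; floors[0] is the level with index i = 0.
    """
    above = sum((2 + 0.15 * i) * f + p * s for i, (f, p, s) in enumerate(floors))
    below = sum(2 * f + 2.5 * p * s for f, p, s in basements)
    return (above + below + roof_area) * unit_rate


def floor_area_price(levels, unit_rate):
    """Equation (5.2): total floor area of all levels times the unit rate."""
    return sum(f for f, _, _ in levels) * unit_rate


def cube_price(levels, unit_rate):
    """Equation (5.3): total volume (area x storey height) times the unit rate."""
    return sum(f * s for f, _, s in levels) * unit_rate


# Hypothetical three-storey building: 100 m2 per floor, 40 m perimeter,
# 3 m storey height, no basement, 100 m2 roof.
floors = [(100.0, 40.0, 3.0)] * 3
print(jsem_price(floors, [], 100.0, 1.0))   # storey enclosure quantity
print(floor_area_price(floors, 1.0))        # total floor area
print(cube_price(floors, 1.0))              # total volume
```

The sketch makes visible that JSEM needs the perimeter and storey height of every level in addition to the floor areas that the other two models already use.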
5.4 Simplification of JSEM
To make a price model useful, it must be general enough to accommodate
variations without violating the original assumptions of the model, and specific
enough to reflect cost-significant factors. It must also be simple enough to be
understood by practicing forecasters, and intricate enough to explain real situations.
Although the data that are used in James’ study are mainly from low-rise buildings
(less than three storeys) such as houses, and medium rise buildings (3 to 10 storeys)
such as schools and industrial buildings, JSEM can also be applied to high-rise
buildings (higher than 10 storeys). Moreover, it can be applied to building projects
that contain more than one building (by adding another set of variables,
$\sum_{l=0}^{t} (2 + 0.15l)\, f''_l + r'' + \sum_{l=0}^{t} p''_l s''_l$). However, the higher the
building, the more variables have to be created. If Equation (5.1) is used to
estimate the price of a 40-storey building without a basement (a typical number of
storeys for high-rise buildings in Hong Kong), then one has to measure the floor
area, the perimeter and the storey height 40 times (once for each level), the number
of levels, and the roof area. With JSEM, 81 variables (e.g., $p_j s_j$ and $p'_j s'_j$) have to
be created, which are calculated from 161 items of measurement (e.g., $p_j$, $s_j$, $p'_j$
and $s'_j$). The rationale behind JSEM is that the areas of different parts of a
building affect the building price differently. The huge number of variables
generated for modelling the price of high-rise buildings would place a heavy burden
on the size of the data set required. However, the rationale can be sustained and the
number of variables
can be significantly reduced if the assumption is made that the floor areas at different
levels are approximately the same. This assumption is supported by the fact that
high-rise buildings generally comprise repeating floors. Very often, only typical
layout plans instead of the layout plans for every floor are provided for forecasting in
the early design stage. Although layout plans for every floor are more available at
the later stages, other development restrictions such as those laid down on land leases,
e.g. the site coverage and the plot ratio, leave little room for designers or the decision
makers to change the distribution of areas and the number of storeys drastically.
With this assumption, the number of variables is reduced to four: the total number
of levels, the total elevation area (which can be easily measured by multiplying the
average perimeter by the overall building height), the average floor area and the
roof area.
Equation (5.7) represents the simplified JSEM for use with high-rise
buildings. Care has to be taken to avoid applying the simplified equation to
buildings with significantly different floor sizes at different levels, or the assumption
will be violated. It is possible that the presence of a podium in a typical large
development may also violate the assumption, as floors that are located at podium
level will generally contribute a much larger average floor area than those in the
tower or towers above the podium. To avoid a probable violation, the variables in
JSEM that represent the floor area above ground level have to be divided into two
parts – one for the podium and the other for towers. This leads to Equation (5.6),
which represents JSEM for buildings with a podium design. Here are the steps for
deriving Equations (5.6) and (5.7).
Let $\sum_{i=0}^{n} p_i s_i = n p_{pt} s_{pt}$, where $p_{pt}$ is the average perimeter of the
superstructure and $s_{pt}$ is the average storey height of the superstructure. Let
$\sum_{j=0}^{m} p'_j s'_j = m p_b s_b$, where $p_b$ is the average perimeter of the basement and
$s_b$ is the average storey height of the basement. Let $\sum_{j=0}^{m} f'_j = m f_b$, where $f_b$
is the average floor area per storey for floors at basement level and
$f'_0 \approx f'_1 \approx \dots \approx f'_m \approx f_b$ (the floor area for each level of the
basement is more or less the same, and is approximately equal to $f_b$). Equation (5.1)
for JSEM becomes:
$$P = \left( \sum_{i=0}^{n} (2 + 0.15i)\, f_i + r + n p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right) \cdot R. \tag{5.4}$$
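Consider that a building comprises a podium section and a tower section. Let
$n = a + b$, where $a$ is the number of storeys of the podium and $b$ is the number of
storeys of the tower. Let $f_0 \approx f_1 \approx \dots \approx f_a \approx f_p$ (the floor area for each
level of the podium is more or less the same, and is approximately equal to $f_p$),
where $f_p$ is the average storey area for floors at the podium level. Similarly, let
$f_{a+1} \approx f_{a+2} \approx \dots \approx f_{a+b} \approx f_t$ (the floor area for each level of the tower is
more or less the same, and is approximately equal to $f_t$), where $f_t$ is the average
storey area for floors at tower level. Then,
$$
\begin{aligned}
\sum_{i=0}^{n} (2 + 0.15i)\, f_i &= \sum_{i=0}^{a+b} (2 + 0.15i)\, f_i = \sum_{i=0}^{a} (2 + 0.15i)\, f_p + \sum_{i=a+1}^{a+b} (2 + 0.15i)\, f_t \\
&= 2 a f_p + 0.15\,(0 + 1 + \dots + a)\, f_p + 2 b f_t + 0.15 \left[ (a+1) + (a+2) + \dots + (a+b) \right] f_t \\
&= 2 a f_p + 0.15\,(0 + 1 + \dots + a)\, f_p + 2 b f_t + 0.15\, a b f_t + 0.15\,(1 + 2 + \dots + b)\, f_t \\
&= 2 a f_p + 0.15 \cdot \frac{a(a-1)}{2}\, f_p + 2 b f_t + 0.15\, a b f_t + 0.15 \cdot \frac{b(b-1)}{2}\, f_t \\
&= \left( 2 - \frac{0.15}{2} \right) a f_p + \frac{0.15}{2}\, a^2 f_p + \left( 2 - \frac{0.15}{2} \right) b f_t + \frac{0.15}{2}\, b^2 f_t + 0.15\, a b f_t
\end{aligned}
\tag{5.5}
$$
The simplified equation for JSEM becomes:
$$
\begin{aligned}
P &= \left[ \left( 2 - \frac{0.15}{2} \right) a f_p + \frac{0.15}{2}\, a^2 f_p + \left( 2 - \frac{0.15}{2} \right) b f_t + \frac{0.15}{2}\, b^2 f_t + 0.15\, a b f_t + r + n p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left( 2 - \frac{0.15}{2} \right) a f_p R + \frac{0.15}{2}\, a^2 f_p R + \left( 2 - \frac{0.15}{2} \right) b f_t R + \frac{0.15}{2}\, b^2 f_t R + 0.15\, a b f_t R + rR + n p_{pt} s_{pt} R + 2 m f_b R + 2.5\, m p_b s_b R
\end{aligned}
\tag{5.6}
$$
Consider a building that has no podium, or that the average storey area for the
podium is approximately equal to that of the tower, i.e., $f_p \approx f_t \approx f_{pt}$, where $f_{pt}$ is the
average storey area for floors above ground level, and $a + b = n$. The simplified
equation becomes:
$$
\begin{aligned}
P &= \left[ \left( 2 - \frac{0.15}{2} \right) a f_{pt} + \frac{0.15}{2}\, a^2 f_{pt} + \left( 2 - \frac{0.15}{2} \right) b f_{pt} + \frac{0.15}{2}\, b^2 f_{pt} + 0.15\, a b f_{pt} + r + n p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left[ \left( 2 - \frac{0.15}{2} \right) (a + b)\, f_{pt} + \frac{0.15}{2}\, (a + b)^2 f_{pt} + r + n p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left[ \left( 2 - \frac{0.15}{2} \right) n f_{pt} + \frac{0.15}{2}\, n^2 f_{pt} + r + n p_{pt} s_{pt} + 2 m f_b + 2.5\, m p_b s_b \right] \cdot R \\
&= \left( 2 - \frac{0.15}{2} \right) n f_{pt} R + \frac{0.15}{2}\, n^2 f_{pt} R + rR + n p_{pt} s_{pt} R + 2 m f_b R + 2.5\, m p_b s_b R
\end{aligned}
\tag{5.7}
$$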
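The effect of the simplification can be illustrated numerically. The sketch below evaluates Equation (5.7) for a hypothetical 40-storey building without a basement and compares it with a storey-by-storey evaluation of the floor and wall terms of Equation (5.1). It assumes that every storey has the same floor area, perimeter and storey height, and that the storeys above ground are numbered from 0 to n − 1, which is the numbering under which the closed-form coefficients reproduce the storey-by-storey sum exactly. The function names and building dimensions are hypothetical.

```python
def jsem_simplified(n, f_pt, p_pt, s_pt, r,
                    m=0, f_b=0.0, p_b=0.0, s_b=0.0, unit_rate=1.0):
    """Equation (5.7): simplified JSEM for a building with uniform floors."""
    floor_terms = (2 - 0.15 / 2) * n * f_pt + (0.15 / 2) * n ** 2 * f_pt
    wall_terms = n * p_pt * s_pt
    basement_terms = 2 * m * f_b + 2.5 * m * p_b * s_b
    return (floor_terms + r + wall_terms + basement_terms) * unit_rate


def jsem_storey_by_storey(n, f, p, s, r, unit_rate=1.0):
    """Floor, wall and roof terms of Equation (5.1), no basement.

    Storeys are numbered 0 .. n - 1 (an assumption of this sketch).
    """
    total = sum((2 + 0.15 * i) * f + p * s for i in range(n))
    return (total + r) * unit_rate


# Hypothetical tower: 40 storeys of 500 m2, 90 m perimeter, 3 m storey height,
# 500 m2 roof, unit rate of 1.0 so that the result is the weighted quantity.
simplified = jsem_simplified(40, 500.0, 90.0, 3.0, 500.0)
detailed = jsem_storey_by_storey(40, 500.0, 90.0, 3.0, 500.0)
print(simplified, detailed)
```

Four measured quantities (number of levels, average floor area, elevation area and roof area) are enough to evaluate the simplified form, whereas the storey-by-storey form needs a separate measurement for each level.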
5.5 Identification of a Problem
In JSEM, building prices are assumed to be proportional to the floor area,
roof area and elevation area. However, their exact relationships have not been
properly studied. As JSEM has been determined by rule of thumb, or by a very
coarse method, it is possible that JSEM may include some irrelevant predicting
variables, or have excluded some significant predicting variables, and that the
relationships between building prices and the predicting variables are not the same as
has been proposed. As suggested, JSEM can be represented by Equations (5.6)
and (5.7). These equations actually fit the hypothetical models such that each
equation contains one dependent variable (response), P, and some independent
variables (predictors), including $n f_{pt} R$ and $rR$, which can be statistically developed by
regression techniques. The question in hand can be considered as a typical multiple
linear regression problem. Regression techniques can be used to determine the
subset of variables and the corresponding coefficients that give the best forecast of
the building prices. The developed regressed models and the employed modelling
approach in this research are both advancements of JSEM.
Let all of the possible predictors be $V_i$, where $i = 1, 2, \dots, k$. The building price
model can then be represented as
$$P = \beta_0 + \beta_1 V_1 + \beta_2 V_2 + \dots + \beta_k V_k = \beta_0 + \sum_{i=1}^{k} \beta_i V_i, \tag{5.8}$$
where $\beta_0$ and the $\beta_i$ are constant coefficients and the $V_i$ are independent variables.
Table 5-1 shows the coefficients and the variables that are designated in JSEM with
reference to Equations (5.6) and (5.7).
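A minimal sketch of how the coefficients of Equation (5.8) could be estimated by ordinary least squares is given below. It solves the normal equations with plain Gaussian elimination so that it runs without third-party packages; this is an illustrative choice and not the software or procedure used in this research. The function names and the data are hypothetical, and the prices are generated from known coefficients so that the fit can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictor_rows, prices):
    """Return [beta_0, beta_1, ..., beta_k] minimising the squared error."""
    x = [[1.0] + list(row) for row in predictor_rows]  # prepend intercept
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * p for row, p in zip(x, prices)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical predictors (V1, V2) for five projects, with prices generated
# from P = 10 + 2*V1 + 3*V2 so that the expected coefficients are known.
predictor_rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
prices = [10.0 + 2.0 * v1 + 3.0 * v2 for v1, v2 in predictor_rows]

coefficients = fit_ols(predictor_rows, prices)
print([round(c, 6) for c in coefficients])
```

With real project data the estimated coefficients would replace the fixed weightings of JSEM, such as those listed for Equations (5.6) and (5.7).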
There are techniques for modelling the variables other than
multiple regression analysis. Perhaps the closest alternative approach that serves
the same purpose is structural equation modelling. This takes into account the
modelling of interactions, nonlinearities, correlated independents, measurement
errors, correlated error terms, multiple latent independents, and one or more latent
dependents (independents and dependents are each measured by multiple indicators).
Compared with multiple regression, structural equation modelling includes more
flexible assumptions (particularly in allowing interpretation even in the face of
multicollinearity), uses confirmatory factor analysis to reduce measurement error by
having multiple indicators for each latent variable, provides a graphical modelling
interface and has the ability to test models with multiple dependents to model
mediating variables and error terms, and tests coefficients across multiple
between-subject groups (Garson 2004). Although structural equation modelling has
many advantages over the multiple regression method and is considerably more
powerful, multiple regression is more suitable for this research because structural
equation modelling assumes multivariate normality of the indicators, an assumption
that may be violated here (Jaccard and Wan 1996 p. 80). More importantly, the cross
validation approach to the multiple regression method that is used in this research
provides a more direct means for the measurement of reliability for small size
samples.
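The leave-one-out form of cross validation can be sketched as follows. Each project is omitted in turn, the model is fitted to the remaining projects, and the squared error of forecasting the omitted project is recorded; the candidate with the smallest average squared error is preferred. To keep the example short, each candidate is a single-predictor least-squares model through the origin, which is a simplification made for this sketch and not the multiple regression used in this research. The function names, candidate names and data are hypothetical.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope of y = slope * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def leave_one_out_mse(xs, ys):
    """Average squared forecasting error over all leave-one-out sub-samples."""
    squared_errors = []
    for i in range(len(xs)):
        # Sub-sample that omits case i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope = fit_through_origin(train_x, train_y)
        # Forecast the omitted case and record the squared error.
        squared_errors.append((ys[i] - slope * xs[i]) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical measured quantities for five projects under two candidate models.
candidates = {
    "floor_area": [5.0, 12.0, 14.0, 22.0, 24.0],
    "storey_enclosure": [10.0, 20.0, 30.0, 40.0, 50.0],
}
prices = [30.0, 60.0, 90.0, 120.0, 150.0]

scores = {name: leave_one_out_mse(xs, prices) for name, xs in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Because every forecast is made for a project that was excluded from the fit, the procedure resembles the way a forecaster prices a new project from past projects, which is why it suits small samples.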
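Table 5-1: Coefficients and Variables Designated in JSEM

| Equation (5.6): Coefficients ($\beta_i$) | Equation (5.6): Variables ($V_i$) | Equation (5.7): Coefficients ($\beta_i$) | Equation (5.7): Variables ($V_i$) |
|---|---|---|---|
| $\beta_0 = 0$ | | $\beta_0 = 0$ | |
| $\beta_1 = \left(2 - \frac{0.15}{2}\right)$ | $V_1 = a f_p R$ | $\beta_1 = \left(2 - \frac{0.15}{2}\right)$ | $V_1 = n f_{pt} R$ |
| $\beta_2 = \frac{0.15}{2}$ | $V_2 = a^2 f_p R$ | $\beta_2 = \frac{0.15}{2}$ | $V_2 = n^2 f_{pt} R$ |
| $\beta_3 = \left(2 - \frac{0.15}{2}\right)$ | $V_3 = b f_t R$ | $\beta_3 = 1$ | $V_6 = rR$ |
| $\beta_4 = \frac{0.15}{2}$ | $V_4 = b^2 f_t R$ | $\beta_4 = 1$ | $V_7 = n p_{pt} s_{pt} R$ |
| $\beta_5 = 0.15$ | $V_5 = a b f_t R$ | $\beta_5 = 2$ | $V_8 = m f_b R$ |
| $\beta_6 = 1$ | $V_6 = rR$ | $\beta_6 = 2.5$ | $V_9 = m p_b s_b R$ |
| $\beta_7 = 1$ | $V_7 = n p_{pt} s_{pt} R$ | | |
| $\beta_8 = 2$ | $V_8 = m f_b R$ | | |
| $\beta_9 = 2.5$ | $V_9 = m p_b s_b R$ | | |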
5.6 Data Preparation and Entry
Cost analyses that were prepared by forecasters are chosen to be the data
source, as they contain all the information that is required for this study, such as the
tender price, floor area, roof area, building height and external wall area. The cost
analyses that are used in this research were provided by one of the two dominant
quantity surveying practices in Hong Kong (see Appendix A). Since the quantity
surveying consultants in Hong Kong rarely focus their business on providing services
to projects of particular types or with particular characteristics, project data
obtainable from the dominant practices are considered to be sufficiently
representative of price behaviour.
5.6.1 Data sample
The data sample consists of the values of identified candidate variables and
tender prices from 148 completed projects in Hong Kong. The tenders for these
projects were received in the ten-year period between the third quarter of 1988 and
the second quarter of 1997.
Hong Kong is a former British colony. Both the structure of the
construction industry and the professional practices within the industry are very
similar to those in the UK. In 1929, the Royal Institution of Chartered Surveyors
(RICS) established a branch office in Hong Kong. Before the local surveying
institution, The Hong Kong Institute of Surveyors (HKIS), was founded in 1984, the
Hong Kong branch of the RICS was the only institution that recognized and provided
local support to surveyors. However, neither the HKIS nor the RICS has
attempted to formalise forecasting practice in Hong Kong.
Unlike forecasts that are produced in the UK, which are generally presented
in the format of the Building Cost Information Service (BCIS) (BCIS 1969), there is
no standardised definition or classification of building elements in Hong Kong.
Using data from a single source avoids the unnecessary complications that arise from
differences in the classification of building elements or the format and the
breakdown of building costs, which may differ across firms. Moreover, there is no
Building Cost Information Service (BCIS) type of organisation that provides online
cost advice services in Hong Kong, and a practice will not provide its own historical
project data to a competitor. Thus, it is almost impossible for the forecaster of one
company to get access to cost data from a third party such as the BCIS or another
practice. That the data is collected from a single source also ensures that the
models that are generated in this research are applicable to practical forecasting,
because the cross validation approach, as described in section 5.7.4, is very similar to
the manner in which forecasts are prepared in practice.
5.6.2 Definition and classification of building types
James’ study is based on a sample of 86 tenders in the categories of flats,
schools, industrial buildings and let houses in the 1950s in the UK. In accordance
with James’ study, all of the data from the 148 projects in this research are grouped
according to their building types.
The data are grouped for analysis into different building types according to
the Construction Index/Samarbetskommittén för Byggnadsfrågor (CI/SfB) classification, which is
published by the Royal Institute of British Architects (RIBA) (Ray-Jones and
Clegg 1976). Five types of building were identified: (1) code no. 32 – office
facilities, offices; (2) code no. 442 – nursing homes, convalescent homes, sanatoria;
(3) code no. 712 – primary schools; (4) code no. 713 – secondary schools; (5) code
no. 816 – flats (apartments).
Because the number of available projects was small, and the provisions for
primary and secondary schools were very similar, these two sub-types were grouped
together. For ease of reference, the four groups are known as offices, nursing
homes, schools and private housing. Table 5-2 shows the distributions of the
building projects that were used for the development of price models, according to
their building types.
It should be noted that a few projects contain a mixture of more than one type
of building. For example, a 50-storey office tower project may have a few shops at
ground floor level. As all of the projects selected are dominated by one particular
type of building, the effect of the presence of another type or types of building is
considered to be insignificant.
Table 5-2: Classification of building projects according to building types

| CI/SfB code | Building type | Inclusions | Exclusions | No. of samples collected | No. of discarded cases | No. of samples used |
|---|---|---|---|---|---|---|
| 32 | Office | Offices, such as design offices, professional offices and executive offices, that are not associated with a particular facility | Official administrative facilities, law court, commercial facilities, trading facilities, shops, protective service facilities, bank, shopping arcade, industrial and office | 45 | 3 | 42 |
| 442 | Nursing home | Nursing homes, convalescent homes and sanatoria | Hospital facilities, hospitals, medical facilities and animal welfare facilities | 23 | - | 23 |
| 712 & 713 | School | Primary and secondary schools including infants schools, secondary modern, secondary technical and community schools | Universities, colleges, nursery schools, kindergarten, scientific facilities, private schools, exhibition, display facilities, information facilities, libraries and other education facilities | 23 | - | 23 |
| 816 | Private housing | Multi-storey flats (apartments) | Low-rise housing, one-off housing units, houses, public housing, special housing facilities, hotels, hostel, historical residential facilities, quasi-private housing and service apartment | 57 | 7 | 50 |
| Total | | | | 148 | 10 | 138 |
5.6.3 Treating of outliers
Outliers (extreme cases) are especially troublesome when the goal is to select
from a set of forecasting models, but are less of a problem for model calibration
(Armstrong and Collopy 1992). The presence of outliers can seriously affect the
least-squares fitting of a regressed model. These outliers may possess different
characteristics from the rest of the data. Some regression diagnostics, such as the
jack-knife residual and leverage, assist in the identification of outliers. However,
relying purely on these diagnostics to exclude extreme cases (e.g., cases whose
residuals lie three or more standard deviations from the mean of the residuals),
without studying the plausibility of the exclusion, may lead to a favourable
model being produced from biased data. Therefore, unless there is strong evidence
to indicate that a case is not a member of an intended sample, it should not be
discarded.
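The screening step described above can be sketched as follows. This is a minimal illustration only: the function name `flag_extreme_residuals` and the residual values are hypothetical, and the three-standard-deviation rule is used purely to flag cases for further investigation, not to discard them automatically.

```python
from statistics import mean, stdev


def flag_extreme_residuals(residuals, n_sd=3.0):
    """Return indices of residuals lying n_sd or more sample standard
    deviations from the mean residual (candidates for investigation only)."""
    centre = mean(residuals)
    spread = stdev(residuals)
    return [i for i, r in enumerate(residuals)
            if abs(r - centre) >= n_sd * spread]


# Hypothetical residuals: nineteen small values and one extreme value.
residuals = [1.0, -1.0] * 9 + [1.0, 30.0]
flagged = flag_extreme_residuals(residuals)
```

A flagged case would then be examined against project knowledge (building provisions, contractual arrangements and so on) before any decision to exclude it is made.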
All of the inputs and outputs of the regression are evaluated according to
three criteria: reasonableness given knowledge of the variable, response
extremeness, and predictor extremeness (Kleinbaum et al. 1998 p. 228). The
residuals from the regressed models were analysed, and three office and seven
private housing cases were discarded. All of the discarded office cases had
comparatively low response values. Further investigation revealed that the three
office cases were for industrial and office (I-O) purposes1. Moreover, five of the
seven discarded private housing cases had comparatively low response values, and
the other two had higher response values. The five lower response cases were
discarded, as they are quasi-private housing developments (housing completed under
the Private Sector Participation Scheme (PSPS)2), which were not solely developed
by private developers and thus were not part of the intended sample. The two
higher response cases were found to be service apartment buildings, which are
generally better furnished than ordinary private housing, and were therefore also
discarded. To sum up, the differences in response values for the discarded cases
may be caused by the differences in the building provisions (industrial and office
buildings, service apartments and quasi-private housing), contractual arrangements
(quasi-private housing) and technology of fabrication (quasi-private housing).
Finally, 138 building projects in four categories were used for the modelling (see
Table 5-2).
1 “An I-O Building is defined as a dual-purpose building in which every unit of the building, other than that in the purpose-designed non-industrial portion, can be used flexibly for both industrial and office purposes. In terms of building construction, the building must comply with all relevant building and fire regulations applicable to both industrial and office buildings, including floor loading, compartmentation, lighting, ventilation, provision of means of escape and sanitary fitments.” (Town Planning Board 2003)
5.7 Model Building
5.7.1 Dependent Variables
As reviewed in section 4.4 of Chapter 4, the lowest tender price is taken as
the forecasting target. In accordance with James’ paper, the lowest tender prices
that are used in the modelling exclude the prices of the foundations, building services,
external works, preliminaries and contingencies.
2 Under the Private Sector Participation Scheme (PSPS), private sector developers bid for the right to build according to a given design. The finished flats will be purchased by the Housing Authority of the Hong Kong Government at a pre-agreed price for onward sale to buyers who are selected by the Housing Authority.
When tender prices are used as the response for modelling, there is a risk of
producing models that perform poorly in terms of their percentage errors, i.e. the ratio
of the error (the forecasted tender price minus the actual, or lowest, tender price) to
the actual tender price. The magnitude of the errors that are produced
from forecasts over a wide range of tender prices (e.g., for offices, the tender prices
range from HK$24 million to HK$1,477 million) also varies significantly. As the
performance of the forecasts is measured according to their percentage errors, the
minimisation of total squared errors in the least-squares method is not necessarily an
effective means of obtaining a good model unless the tender prices of all of the cases in a
model are fairly close to each other. To reduce the influence of a wide tender price
range, the tender price per total floor area is adopted as the response. The tender
price per total floor area is a sensible alternative because forecasters usually present
building prices as unit prices, especially at the early budget stage, and the calculation
of forecasted prices from the unit price model is straightforward. The unit price
model can be directly compared with other conventional models despite their
responses being different, because performance is measured on the basis of
percentage errors.
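The percentage-error measure and the conversion from a unit-price forecast back to a total price can be illustrated with a short sketch. The function names, the floor-area unit (m2) and all figures below are assumptions made for illustration; none are taken from the thesis data.

```python
def percentage_error(forecast, actual):
    """Percentage error: (forecast - actual) / actual, in percent."""
    return 100.0 * (forecast - actual) / actual


def total_price_from_unit_price(unit_price, total_floor_area):
    """Convert a forecast unit price (HK$ per m2) into a total price (HK$)."""
    return unit_price * total_floor_area


# Hypothetical figures, for illustration only.
actual_price = 120_000_000.0     # lowest tender price, HK$
floor_area = 20_000.0            # total floor area, m2 (assumed unit)
forecast_unit_price = 6_300.0    # forecast from a unit-price model, HK$/m2

forecast_price = total_price_from_unit_price(forecast_unit_price, floor_area)
error_on_total = percentage_error(forecast_price, actual_price)
error_on_unit = percentage_error(forecast_unit_price, actual_price / floor_area)
```

Because the floor area cancels in the ratio, the percentage error is the same whether it is computed on the unit price or on the total price, which is why models with different responses can be compared on this basis.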
5.7.1.1 Price Index Adjustment
The tender prices were rebased to the prices of the second quarter of 1997 by
means of the tender price index that is published by the quantity surveying practice
that provided the data for this study. A copy of this tender price index is attached in
Appendix B.
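A minimal sketch of the rebasing arithmetic is given below. The thesis does not spell out the calculation, so the conventional ratio adjustment (price multiplied by the base-quarter index and divided by the tender-quarter index) is assumed; the function name and the index values are hypothetical and are not those of Appendix B.

```python
def rebase_price(tender_price, index_at_tender, index_at_base):
    """Scale a tender price from its own quarter to the base quarter
    (assumed ratio method)."""
    return tender_price * index_at_base / index_at_tender


# Hypothetical index values, not those published in Appendix B.
tender_price_index = {"1988Q3": 500.0, "1993Q1": 800.0, "1997Q2": 1000.0}

rebased = rebase_price(40_000_000.0,
                       tender_price_index["1993Q1"],
                       tender_price_index["1997Q2"])
```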
5.7.1.2 Other Adjustments
Apart from incorporating inflationary effects using the tender price index,
there may be some other characteristics that need to be adjusted using indices
(Kouskoulas and Koehn 1974; Pegg 1984). However, there is a lack of indices
other than the tender price index in Hong Kong. For instance, the location index,
while popular in many countries, is not in use at all. As the overall area of Hong
Kong is only slightly more than 400 square miles, projects that are undertaken
anywhere in Hong Kong are interpreted as being in the same geographical region
(Drew 1995). Other than projects that are located in remote areas such as outlying
islands and hillsides, etc., the location effect is not significant. No buildings located
in remote areas are included in the data pool.
The other possible adjustments by indices, such as for the quality and technology
of buildings, are considered to be either irrelevant or inapplicable. First, there are no
quality and technology indices in use. Second, detailed specifications and method
statements for buildings are yet to be defined at the early design stage. Instead of
using indices for adjustment, only project data with similar characteristics, such as
project type, are grouped together for modelling.
5.7.2 Candidate variables
To identify the predictors for best subset models, the modelling process
started off with the variables that are used in JSEM. The actual measurements of
quantities (e.g. perimeter and storey height) for the variables in JSEM (e.g. elevation
area) were extracted to form the primary candidate variables for the regression
analysis. With reference to the variables in JSEM, a few candidate variables, such
as the number of storeys, the square of the number of storeys and their interaction
with storey height, were also added to form another set of candidate variables for
regression analysis. The unit rate ‘R’ was excluded, because the tender price is not
measured on a unit area basis in regressed models. Table 5-3 shows a full list of the
candidate variables for the regressed models for buildings with and without
basements.
Table 5-3: List of Candidate Variables

| Variable set | Primary model | JSEM model | All subsets model (with basement) | All subsets model (without basement) |
|---|---|---|---|---|
| All identified variables (without higher degree and interaction effects) | No. of storeys for podium (a), No. of storeys for tower (b), No. of storeys for basement (m), Square of no. of storeys for podium (a²), Square of no. of storeys for tower (b²), Average floor area for podium (fp), Average floor area for tower (ft), Average floor area for basement (fb), Average storey height for podium (sp), Average storey height for tower (st), Average storey height for basement (sb), Average perimeter for tower and podium (ppt), Average perimeter for basement (pb), Roof area (r) | afp, a²fp, bft, b²ft, abft, mfb, (asp + bst)ppt, msbpb, r (separating podium and tower) | | |
| Reduced version of all identified variables (without higher degree and interaction effects) | No. of storeys for superstructure (n), No. of storeys for basement (m), Square of no. of storeys for superstructure (n²), Average floor area for superstructure (fpt), Average floor area for basement (fb), Average storey height for superstructure (spt), Average storey height for basement (sb), Average perimeter for tower and podium (ppt), Average perimeter for basement (pb), Roof area (r) | nfpt, n²fpt, mfb, nsptppt, msbpb, r (combining podium and tower) | n, m, n², fpt, fb, spt, sb, ppt, pb, nfpt, n²fpt, mfb, nspt, msb, n²spt, nsptppt, msbpb, n²sptppt, r | n, n², fpt, spt, ppt, nfpt, n²fpt, nspt, n²spt, nsptppt, n²sptppt, r |
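As an illustration of how the candidate variables of the all subsets model (with basement) might be assembled from the primary measurements of one project, consider the sketch below. The function name, the dictionary keys and the sample measurements are hypothetical; the 19 terms follow a reading of Table 5-3 in which each interaction term is the plain product of its primary variables.

```python
def candidate_variables(n, m, fpt, fb, spt, sb, ppt, pb, r):
    """Build the 19 candidates of the with-basement all subsets model
    for a single project (assumed to be simple products of the primaries)."""
    return {
        # primary variables and the squared storey count
        "n": n, "m": m, "n2": n ** 2,
        "fpt": fpt, "fb": fb, "spt": spt, "sb": sb, "ppt": ppt, "pb": pb,
        # interaction terms
        "n_fpt": n * fpt, "n2_fpt": n ** 2 * fpt, "m_fb": m * fb,
        "n_spt": n * spt, "m_sb": m * sb, "n2_spt": n ** 2 * spt,
        "n_spt_ppt": n * spt * ppt, "m_sb_pb": m * sb * pb,
        "n2_spt_ppt": n ** 2 * spt * ppt,
        "r": r,
    }


# Hypothetical project: 30 storeys above ground and 2 basement levels.
project = candidate_variables(n=30, m=2, fpt=1000.0, fb=1200.0,
                              spt=3.0, sb=4.0, ppt=130.0, pb=140.0, r=1000.0)
```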
5.7.3 Fitting Criterion
There are two approaches for the selection of predictors based on errors of
forecasts: parametric and non-parametric. For a linear model, the former approach
demands the satisfaction of some statistical assumptions, including the following
(Kleinbaum et al. 1998, pp. 43-46). (1) For any fixed value of the variable V, P is a
random variable with a certain probability distribution, e.g., a normal
distribution $N(\mu_{P|V}, \sigma^2_{P|V})$, that has a finite mean and variance; (2) the values of P are
statistically independent of one another; (3) the mean value of P, $\mu_{P|V}$, is a straight-line
function of V; (4) the variance of P is the same for any V ($\sigma^2_{P|V_a} = \sigma^2_{P|V_b}$); and (5)
for any fixed value of V, P has a normal distribution. If assumptions (1) to (4) are
satisfied and assumption (5) is not badly violated, then the conclusions that are
reached by a regression analysis remain reliable and accurate. This approach allows
the use of multiple partial F statistics and p-values to select variables for the best
models. These parametric procedures are suitable for routine problems, but not for
the problems that are identified in this research. First, the sample sizes for the
various types of building are small, around 25 to 50. Such small samples can easily bias
the estimation of the coefficients. Second, although parametric techniques such
as the least-squares method are known to be robust even if the normality assumption
(the fifth assumption) is not fully satisfied, the parametric estimates of
the error rates may not be correspondingly robust (McLachlan 1987). Although
transformation can be applied to variables to fulfil the requirement of normality, it
may cause the violation of other assumptions.
Instead of relying on the multiple partial F statistics and p-values for the
selection of variables, a non-parametric approach that is based on the mean square
error (MSQ) is adopted. There are two main advantages of using MSQ rather than
the actual errors or absolute errors. The first is that positive differences do not
cancel negative differences, and the second is that the squared error is easy to
differentiate (Fausett 2002).
Previous regressed price models that have been developed by researchers use
either the least-squares approach or the minimum variance approach for the model
fitting. In a linear fitting, both approaches produce the same solution (Kleinbaum et
al. 1998, p. 118). According to the non-parametric approach that is adopted in this
study, the termination criterion is to minimise the MSQ, and therefore the
least-squares approach is preferred.
5.7.3.1 Matrix Notation for Calculation of MSQ
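Recall that Equation (5.8) can be presented in matrix notation. Let $\mathbf{P}$ be a
column vector containing n rows of observed values for the response, $\{P_1, P_2, \ldots, P_n\}^T$,
and $\mathbf{V}$ be an $n \times (k + 1)$ matrix that contains the observed values for a subset of
variables such that:
\[
\mathbf{V} =
\begin{bmatrix} \mathbf{V}_1 \\ \mathbf{V}_2 \\ \vdots \\ \mathbf{V}_n \end{bmatrix}
=
\begin{bmatrix}
1 & V_{1,1} & V_{1,2} & \cdots & V_{1,k} \\
1 & V_{2,1} & V_{2,2} & \cdots & V_{2,k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & V_{n,1} & V_{n,2} & \cdots & V_{n,k}
\end{bmatrix}.
\tag{5.9}
\]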
Corresponding to $P_i$ is $\mathbf{V}_i$, a row vector that contains the observed values for
the variables (a constant term and k predictors), i.e., $\{1, V_{i,1}, V_{i,2}, \ldots, V_{i,k}\}$,
where $i = 1, 2, \ldots, n$. In a regressed model, the price is
represented by:
\[
\mathbf{P} = \mathbf{V}\boldsymbol{\beta} + \mathbf{e},
\tag{5.10}
\]
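where $\boldsymbol{\beta}$ is a column vector of the coefficients $\{\beta_0, \beta_1, \beta_2, \ldots, \beta_k\}^T$ and $\mathbf{e}$
is a column vector of the forecasting errors $\{e_1, e_2, \ldots, e_n\}^T$. The mean square error
then becomes:
\[
\begin{aligned}
\mathrm{MSQ} &= \frac{1}{n}\sum_{i=1}^{n} e_i^2 = \frac{1}{n}\,\mathbf{e}^T\mathbf{e} \\
&= \frac{1}{n}\,(\mathbf{P} - \mathbf{V}\boldsymbol{\beta})^T(\mathbf{P} - \mathbf{V}\boldsymbol{\beta}) \\
&= \frac{1}{n}\left(\mathbf{P}^T\mathbf{P} - \boldsymbol{\beta}^T\mathbf{V}^T\mathbf{P} - \mathbf{P}^T\mathbf{V}\boldsymbol{\beta} + \boldsymbol{\beta}^T\mathbf{V}^T\mathbf{V}\boldsymbol{\beta}\right).
\end{aligned}
\tag{5.11}
\]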
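$\hat{\boldsymbol{\beta}}$ is the $\boldsymbol{\beta}$ that produces the minimum MSQ. To determine $\hat{\boldsymbol{\beta}}$, the
MSQ is differentiated with respect to $\boldsymbol{\beta}$, and the result is equated to zero, i.e.,
\[
\left.\frac{\partial\,\mathrm{MSQ}}{\partial \boldsymbol{\beta}}\right|_{\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}}
= \frac{1}{n}\left(-2\,\mathbf{V}^T\mathbf{P} + 2\,\mathbf{V}^T\mathbf{V}\hat{\boldsymbol{\beta}}\right) = \mathbf{0}.
\tag{5.12}
\]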
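This yields:
\[
\mathbf{V}^T\mathbf{V}\hat{\boldsymbol{\beta}} = \mathbf{V}^T\mathbf{P},
\qquad
\hat{\boldsymbol{\beta}} = \left(\mathbf{V}^T\mathbf{V}\right)^{-1}\mathbf{V}^T\mathbf{P}.
\tag{5.13}
\]
Therefore, the minimum MSQ is:
\[
\mathrm{MSQ}_{\min} = \frac{1}{n}\left(\mathbf{P}^T\mathbf{P} - \hat{\boldsymbol{\beta}}^T\mathbf{V}^T\mathbf{P} - \mathbf{P}^T\mathbf{V}\hat{\boldsymbol{\beta}} + \hat{\boldsymbol{\beta}}^T\mathbf{V}^T\mathbf{V}\hat{\boldsymbol{\beta}}\right).
\tag{5.14}
\]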
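The least-squares solution and the minimum MSQ derived above can be checked numerically with a short sketch. NumPy is assumed to be available; the function name `fit_least_squares` and the data are hypothetical. The sketch uses `numpy.linalg.lstsq` rather than forming the inverse explicitly, and then confirms that the result agrees with the normal equations and with the expanded form of the minimum MSQ.

```python
import numpy as np


def fit_least_squares(V, P):
    """Return (beta_hat, msq_min) for P = V beta + e.

    V must already contain the leading column of ones (constant term)."""
    beta_hat, *_ = np.linalg.lstsq(V, P, rcond=None)
    residuals = P - V @ beta_hat
    msq_min = float(residuals @ residuals) / len(P)
    return beta_hat, msq_min


# Hypothetical data: five cases, one predictor.
predictor = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
V = np.column_stack([np.ones_like(predictor), predictor])
P = np.array([5.1, 7.9, 11.2, 13.8, 17.0])

beta_hat, msq_min = fit_least_squares(V, P)

# Same coefficients from the normal equations V'V beta = V'P.
beta_normal = np.linalg.solve(V.T @ V, V.T @ P)

# Expanded quadratic form of the minimum MSQ.
msq_expanded = float(P @ P - beta_hat @ V.T @ P - P @ V @ beta_hat
                     + beta_hat @ V.T @ V @ beta_hat) / len(P)
```

Solving the normal equations directly is adequate for small, well-conditioned problems such as this one; a least-squares solver is generally the safer choice when predictors are strongly correlated.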
5.7.4 Reliability analysis
The fit of a model that is built from historical data is not a reliable indicator
of its forecasting ability (Armstrong 1985). In classical statistical inference, a
model is validated using ex ante (out-of-sample) forecasts. However, the lack of
available data is always a limitation in the construction of price forecasting models.
Unquestionably, it is problematic to use the same data both to build and to
validate a statistical model, i.e., to use ex post simulation prediction (within sample),
but the alternative of analysing data blindly simply to preserve the purity of classical
statistical inference presents even worse problems.
In this research, a resampling method is adopted to select variables and
evaluate models. Three possible resampling methods were considered (Efron 1982):
cross validation, in which one case is omitted in turn from the model derivation and
the resulting coefficients are applied to that case; the jack-knife method, in which
one case is omitted in turn from the model derivation and the resulting coefficients
are applied to the other cases; and the bootstrap method, in which the coefficients are
used to generate simulated data from which a second set of coefficients is obtained.
For predictive applications, the cross validation method has the most intuitive appeal
as with non-time-series data of this nature each error value can be thought of as a real
error that may arise in the practice of forecasting (Skitmore 1992). In cross
validation, the accuracy of statistical inference is preserved by dividing at random a
sample of data into two sub-samples, an exploratory sub-sample, which is used to
select a statistical model for the data, and a validatory sub-sample, which is used for
formal statistical inference (Fox 1997). This is a compromise method that keeps the
integrity of the inference when the same data are used for the selection and validation
of statistical models, and is an approach to ex post forecasting, because test data are
within sample but are not used in model fitting. It is different from split sample
validation in that split sample validation uses only a single sub-sample (the
validation set) to estimate the error. This distinction is particularly important,
because cross validation has been shown to be markedly superior for small data sets (Goutte
1997).
To simulate a practical situation, the ‘leave-one-out’ cross validation method
is the most suitable approach, and is adopted in this study. The steps of the
‘leave-one-out’ cross validation approach for the assessment of the reliability of a
model are shown in Figure 5-1. The accuracy of statistical inference in the
leave-one-out method is preserved by dividing a sample that contains n cases of data
into n exploratory sub-samples (each containing n - 1 cases that are obtained from
the original n-case sample by the omission of one case without repetition), each of
which is used to select a statistical model using the least-squares approach, and n
omitted cases, each of which is used to validate the selected model from an
exploratory sub-sample that does not contain the omitted case. An average MSQ is
deduced from n models for each subset of candidates. The average MSQs from
models of different subsets of candidates are compared, and the model with the
smallest average MSQ is taken to be the best subset model.
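The leave-one-out procedure described above can be sketched as follows. NumPy is assumed to be available, and the function name `loo_average_msq` and the data are hypothetical. One assumption should be noted: in line with the description of each omitted case being used to validate the model fitted without it, each fold is scored by the squared forecasting error on the omitted case, and these errors are averaged over the n folds.

```python
import numpy as np


def loo_average_msq(V, P):
    """Average squared leave-one-out forecasting error for one subset of
    candidates; V must include the leading column of ones."""
    n = len(P)
    squared_errors = []
    for j in range(n):
        keep = np.arange(n) != j                 # exploratory sub-sample
        beta_j, *_ = np.linalg.lstsq(V[keep], P[keep], rcond=None)
        forecast = V[j] @ beta_j                 # forecast for omitted case j
        squared_errors.append((P[j] - forecast) ** 2)
    return float(np.mean(squared_errors))


# Hypothetical data in which the response is exactly linear in x.
x = np.arange(1.0, 7.0)
P = 2.0 + 3.0 * x
V_constant_only = np.ones((len(x), 1))
V_with_x = np.column_stack([np.ones_like(x), x])

score_constant_only = loo_average_msq(V_constant_only, P)
score_with_x = loo_average_msq(V_with_x, P)
```

Comparing `score_constant_only` with `score_with_x` mirrors the comparison of average MSQs between subsets of candidates: the subset with the smaller value would be preferred.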
Cross validation appears to make no assumptions at all. For the purpose of
comparing models, each exploratory sub-sample produces a slightly different
best-fitting curve in the family, and there is a penalty for large, complex families of
curves because large families tend to produce greater variation in the curves that best
fit an exploratory sub-sample (Turney 1990a). This leads to an average fit that is
poorer than the fit of the curve that best fits the total data sample (Forster 2001 pp.
96-97). In cross validation, the selection criterion is designed implicitly, rather than
explicitly, as it gives the forecasting accuracy in terms of MSQ.
5.7.4.1 Matrix Notation for Calculation of MSQ by Leave-one-out Method
Referring to the least-squares method that is described in matrix notation
in section 5.7.3.1, let $\mathbf{P}^{(-j)}$ be a column vector that contains the n − 1 observed values
for the response, $\{P_1, P_2, \ldots, P_{j-1}, P_{j+1}, \ldots, P_n\}^T$, and let $\mathbf{V}^{(-j)}$ be an
$(n - 1) \times (k + 1)$ matrix containing the observed values for the subset of variables (with the omission of
one row of the observed values, representing the jth case, from the matrix of variables
$\mathbf{V}$, where j is any number from 1 to n):
\[
\mathbf{V}^{(-j)} =
\begin{bmatrix} \mathbf{V}_1 \\ \mathbf{V}_2 \\ \vdots \\ \mathbf{V}_{j-1} \\ \mathbf{V}_{j+1} \\ \vdots \\ \mathbf{V}_n \end{bmatrix}
=
\begin{bmatrix}
1 & V_{1,1} & V_{1,2} & \cdots & V_{1,k} \\
1 & V_{2,1} & V_{2,2} & \cdots & V_{2,k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & V_{(j-1),1} & V_{(j-1),2} & \cdots & V_{(j-1),k} \\
1 & V_{(j+1),1} & V_{(j+1),2} & \cdots & V_{(j+1),k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & V_{n,1} & V_{n,2} & \cdots & V_{n,k}
\end{bmatrix}.
\tag{5.15}
\]
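$\boldsymbol{\beta}^{(-j)}$ is a column vector of the coefficients $\{\beta_0, \beta_1, \beta_2, \ldots, \beta_k\}^T$ that are estimated
without the jth case, and $\mathbf{e}^{(-j)}$ is a column vector of the forecasting errors
$\{e_1, e_2, \ldots, e_{j-1}, e_{j+1}, \ldots, e_n\}^T$ of the regressed model
$\mathbf{P}^{(-j)} = \mathbf{V}^{(-j)}\boldsymbol{\beta}^{(-j)} + \mathbf{e}^{(-j)}$. Similar to the derivation that is
shown in Equations (5.11) to (5.14), the minimum MSQ of the regressed model that
does not contain the jth case becomes:
\[
\mathrm{MSQ}_{\min}^{(-j)} = \frac{1}{n}\left(
\mathbf{P}^{(-j)T}\mathbf{P}^{(-j)}
- \hat{\boldsymbol{\beta}}^{(-j)T}\mathbf{V}^{(-j)T}\mathbf{P}^{(-j)}
- \mathbf{P}^{(-j)T}\mathbf{V}^{(-j)}\hat{\boldsymbol{\beta}}^{(-j)}
+ \hat{\boldsymbol{\beta}}^{(-j)T}\mathbf{V}^{(-j)T}\mathbf{V}^{(-j)}\hat{\boldsymbol{\beta}}^{(-j)}
\right).
\tag{5.16}
\]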
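The average of $\mathrm{MSQ}_{\min}^{(-j)}$, denoted $\overline{\mathrm{MSQ}}_{\min}^{(-j)}$, is deduced from n regressed models
(for j = 1, … , n) of the subset of variables in accordance with Equation (5.17),
\[
\overline{\mathrm{MSQ}}_{\min}^{(-j)} = \frac{1}{n}\sum_{j=1}^{n} \mathrm{MSQ}_{\min}^{(-j)}.
\tag{5.17}
\]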
The values of $\overline{\mathrm{MSQ}}_{\min}^{(-j)}$ from the different subsets of variables that are chosen by
the selection strategy that is described in the next section are compared. The subset
of variables that gives the smallest $\overline{\mathrm{MSQ}}_{\min}^{(-j)}$ is the best subset model.
5.7.5 Selection Strategies
The all-possible regressions procedure, which fits all combinations of variables,
is used in preference to other variable selection procedures whenever practicable, because it is
the only procedure that guarantees the identification of the best subset model.
However, to find the best subset out of all of the subsets for the models with
basements that are listed in Table 5-3 using this procedure involves the fitting of
all combinations of 1 to 19 variables, i.e.,
\[
\sum_{i=1}^{19} \frac{19!}{i!\,(19 - i)!} = 5.243 \times 10^{5}.
\]
If each fitting consumes four seconds of computing time, then a full analysis
of all of the subsets for one type of building using one fitting criterion would take
over 24 days of computing time. As four types of building are included in this
research and two sets of variables are suggested (refer to Table 5-3), the overall
computing time would be much longer than 24 days!
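The arithmetic behind these figures can be reproduced with a few lines. The four-second fitting time is the figure quoted above; the variable names are illustrative only.

```python
from math import comb

n_candidates = 19
seconds_per_fit = 4  # fitting time quoted in the text

# Number of non-empty subsets of the 19 candidates.
n_subsets = sum(comb(n_candidates, i) for i in range(1, n_candidates + 1))

total_seconds = seconds_per_fit * n_subsets
total_days = total_seconds / 86_400  # seconds in a day
```

The count equals 2^19 − 1, which is why the number of fittings roughly doubles with every additional candidate variable.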
There are a few common selection procedures for parametric problems, such
as forward selection, backward elimination and stepwise selection. Forward
selection begins with no variable in the regression equation. The variable that has
the highest correlation with the dependent (criterion) variable is entered into the
equation first. The remaining variables are then entered into the equation
depending on the contribution of each variable. Backward elimination begins with
all of the predictor variables in the regression equation, and sequentially removes
them. Stepwise selection is a combination of the forward selection and backward elimination
procedures.
These procedures can also be applied to non-parametric regression, and the
difference rests on the use of different termination criteria. To ensure the selection
of the best subset model, a dual stepwise procedure that consists of a combination of
the forward stepwise and backward stepwise procedures is adopted (Figure 5-2).
According to the algorithm for the forward stepwise procedure (on the left-hand side
of the figure), forward regression is first applied by entering one candidate variable
at a time. When no candidate that enters into the model can further reduce the
average MSQ, the forward regression ends. A subset of variables that produces the
minimal average MSQ is selected. Backward regression is then applied, and if the
number of variables that was selected in the forward regression is less than two, then
the stepwise procedure will be terminated, as all single predictor models have been
considered in the forward regression. Candidates in the subset that are selected by
the forward regression are eliminated one at a time until the average MSQ cannot be
further reduced by the elimination of a candidate. Forward regression starts again
and backward regression follows until the average MSQ cannot be further reduced,
and a minimum average MSQ is determined at the end of the forward stepwise
procedure. The backward stepwise procedure (on the right-hand side of the figure)
is the same as the forward stepwise procedure, except that it commences with all of
the candidates being contained in the model and starts off with a backward regression.
The best subset model that is deduced by the forward stepwise procedure is
compared with that deduced from the backward stepwise procedure. If they are the
same, then the selected subset model will either be very close to, or the same as, the
best model using the all-possible regression procedure.
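The dual stepwise idea can be sketched in simplified form as follows. This is not a transcription of the flowchart in Figure 5-2: it is a minimal illustration that alternates forward and backward passes until the selected subset stops changing, once starting from an empty model and once from the full model. The function names are hypothetical, and `score` stands for any callable that returns the cross-validated average MSQ of a subset; a toy scoring function is used here so that the example is self-contained.

```python
def forward_pass(selected, candidates, score):
    """Enter one candidate at a time while the score keeps falling."""
    selected = set(selected)
    best = score(frozenset(selected)) if selected else float("inf")
    while True:
        trials = [(score(frozenset(selected | {c})), c)
                  for c in sorted(candidates - selected)]
        if not trials or min(trials)[0] >= best:
            return selected, best
        best, entered = min(trials)
        selected.add(entered)


def backward_pass(selected, score):
    """Eliminate one candidate at a time while the score keeps falling."""
    selected = set(selected)
    best = score(frozenset(selected))
    while len(selected) > 1:
        trials = [(score(frozenset(selected - {c})), c)
                  for c in sorted(selected)]
        if min(trials)[0] >= best:
            break
        best, removed = min(trials)
        selected.remove(removed)
    return selected, best


def stepwise(candidates, score, start_full=False):
    """Alternate forward and backward passes until nothing changes."""
    candidates = set(candidates)
    selected = set(candidates) if start_full else set()
    order = ("backward", "forward") if start_full else ("forward", "backward")
    while True:
        before = set(selected)
        for step in order:
            if step == "forward":
                selected, best = forward_pass(selected, candidates, score)
            elif selected:
                selected, best = backward_pass(selected, score)
        if selected == before:
            return selected, best


def toy_score(subset):
    """Hypothetical stand-in for the cross-validated average MSQ."""
    useful = {"n": 4.0, "fpt": 3.0}
    gain = sum(useful.get(name, 0.0) for name in subset)
    penalty = 0.5 * sum(1 for name in subset if name not in useful)
    return 10.0 - gain + penalty


candidates = {"n", "fpt", "spt", "r"}
forward_model, forward_score = stepwise(candidates, toy_score)
backward_model, backward_score = stepwise(candidates, toy_score, start_full=True)
models_agree = forward_model == backward_model
```

With a real scoring function the two runs need not agree, which is the situation that the exclusion of offending variables is intended to handle.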
[Flowchart not reproduced: two branches, FORWARD STEPWISE REGRESSION and BACKWARD STEPWISE REGRESSION, each alternating forward regression and backward regression steps that compare the average MSQ of successive best models, followed by the check "Are they the same model?" with the boxes "STOP" and "Exclusion of an offending variable".]
Figure 5-2: Algorithm for Dual Stepwise Selection
5.8 Model Adjustment
5.8.1 Exclusion of candidates
The best subset models that are selected by the forward stepwise and
backward stepwise procedures are not necessarily the same. Divergence is easily
caused by multicollinearity, i.e., strong correlations amongst the predictors. One
typical strategy to avoid the presence of multicollinearity is to combine or remove
predictors that are strongly correlated to each other. This can be easily
implemented by the use of correlation tables. However, this strategy is not
appropriate for the modelling exercise in this research, because a lot of the selected
predictors are actually interaction terms, and are likely to be strongly correlated with
the primary variables (in Table 5-3). Moreover, as the future use of the best model
is for forecasting rather than understanding how predictors in the model have an
impact on the response, good models that suffer from multicollinearity still produce
accurate forecasts. Therefore, except for variables that are very highly correlated (>
0.95), predictors that have similar values to each other have not been deleted simply
because their correlation is high (say, > 0.7). If the cross-validated average MSQs
of the best models that are generated from the two procedures are different, then one
of them will always be better – the one with the smaller average MSQ. To prevent
a less significant candidate acting as an offending variable and entering into the
model before a more significant candidate (or a more significant candidate being
eliminated from the model before a less significant candidate), an algorithm to
exclude offending variables has been set up to deal with the possible divergence.
This involves four steps: (1) the exclusion of a candidate in turn before modelling by
regression, (2) the generation of models with forward stepwise and backward
stepwise procedures, (3) the selection of the model with the smaller average MSQ if
two different subsets of variables are chosen, and (4) the comparison of the smaller
average MSQ with that of a subset of variables that is selected from an all-subset
model that contains the excluded candidate. Step 1 is repeated (i.e., excluding the
second, third or more candidates before modelling) if the forward stepwise and
backward stepwise procedures do not agree on a model, or
the average MSQ of the best subset model is higher than that of the subset of
variables that is selected from an all-subset model that contains the excluded
candidate(s). The procedure for excluding candidates stops when the
forward and backward stepwise procedures produce the same model (subset of
predictors) with the smallest average MSQ.
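The correlation screen mentioned above (only pairs of variables with a correlation above 0.95 being treated as too highly correlated) can be illustrated with a small sketch. The function names and the data are hypothetical, and the sketch only lists the offending pairs; which member of a pair to drop or combine remains a modelling decision.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def highly_correlated_pairs(columns, threshold=0.95):
    """List (name_a, name_b, r) for pairs whose |r| exceeds the threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical candidate values for five projects.
columns = {
    "n": [10.0, 20.0, 30.0, 40.0, 50.0],
    "n2": [100.0, 400.0, 900.0, 1600.0, 2500.0],
    "fpt": [900.0, 500.0, 1200.0, 700.0, 1000.0],
}
flagged_pairs = highly_correlated_pairs(columns)
```

In this made-up example the storey count and its square exceed the threshold, which reflects the point made above that higher-degree and interaction terms are naturally correlated with their primary variables.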
The use of cross validation is a non-parametric approach to the determination
of the best subset of predictors, and therefore does not have to fulfil the assumptions
of homoscedasticity and normality that are required in parametric
regression. Because of this, the use of transformation strategies for variables in this
research is limited to circumstances in which the original data suggest a model
that is non-linear in either the regression coefficients or the original variables, so
that the transformation serves to linearise the model.
A few studies have attempted to find the relationships between various
predictors and the price of building or the prices of the components of a building
(Wilderness Group 1964; Flanagan and Norman 1978; Russell and Choudhary 1980;
Tan 1999). However, a generalised relationship between any particular predictor
and the price of building or the prices of its components is absent, and on the
contrary, many studies have shown quite different relationships for the same subjects.
For example, the relationship between building price (represented by total price or
price per total floor area) and building height (represented by the number of storeys
or overall building height) has been expressed as a linear (Tregenza 1972; Braby
1975), a parabolic with a minimum (Flanagan and Norman 1978) and a power
(Karshenas 1984) function. It can perhaps only be concluded that each relationship
holds true only for the data from which it was generated.
5.8.2 Transformation of variables
For a given set of predictors and a given response, there can be unlimited
combinations of transformed predictors and transformed responses. Certainly,
models with transformed variables are more complicated, more inexplicable, and
bear a higher risk of being too specific for the given data than their untransformed
counterparts. More importantly, complicated models often do a bad job of
forecasting new data, although they can be made to fit old data quite well. This is
experienced also by modellers in other disciplines (Sober 2001 p.30). In terms of
practicability, simplicity also aids understanding and implementation by decision
makers, reduces the likelihood of mistakes, and is less expensive (Armstrong 2001
pp. 374-375). In the light of the principle of parsimony3, as reviewed in Chapter 3,
this research avoids the development of models with complex mathematical
functions.
3 “The concern for parsimony can lead to normative rules for discovery systems: that such systems should be designed, as far as possible, to generate simple rules before generating complex ones.” (Simon 2001 pp. 42-43)
Instead, each best subset model has been transformed to a power
function, because this has been demonstrated by Karshenas (1984) and Skitmore and
Patchell (1990) to improve accuracy. The power function model can be expressed
as follows:
$$P' = \beta'_0 \cdot \prod_{i=1}^{k} {V'_i}^{\beta'_i}, \qquad (5.18)$$
where P’ is the forecasted price, β’0 and β’i (i = 1, ..., k) are constant
coefficients and V’i are the variables of the best subset model. Taking the natural
logarithm (ln) of both sides (the ln transformation for the model), Equation (5.18)
may be equivalently expressed as:
$$\ln P' = \ln \beta'_0 + \sum_{i=1}^{k} \beta'_i \cdot \ln V'_i. \qquad (5.19)$$
Equation (5.19) shows the transformation of the original variables to a linear
function of ln variables. The forecasting performance of the linear best subset
model has to be compared with that of the model that is represented by Equation (5.18).
Referring to the principle of parsimony, the linear model prevails over the power
function counterpart unless the latter is shown to make significantly better forecasts.
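The ln transformation of Equations (5.18) and (5.19) can be illustrated with a minimal sketch for the single-predictor case (k = 1). The function name `fit_power_model` and the data are hypothetical and are not taken from the Mathcad worksheets; ordinary least squares on the ln variables is assumed.

```python
import math

def fit_power_model(v, p):
    """Fit P' = b0 * V'**b1 by least squares on ln P' = ln b0 + b1 * ln V'."""
    x = [math.log(vi) for vi in v]   # ln V'
    y = [math.log(pi) for pi in p]   # ln P'
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                       # exponent beta'_1
    b0 = math.exp(y_bar - b1 * x_bar)    # constant beta'_0
    return b0, b1

# Hypothetical data generated exactly from P = 2 * V**0.5
v = [1.0, 4.0, 9.0, 16.0]
p = [2.0 * vi ** 0.5 for vi in v]
b0, b1 = fit_power_model(v, p)
```

Because the fit is made on the ln scale, its errors are minimised in ln P’ rather than in P’, which is one reason why the forecasts of the two forms have to be compared on the same percentage error basis.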
5.9 Comparison of Best Model with Other Models
To assess the forecasting accuracy of the best subset models for the four types
of building, their forecast results have been compared with those obtained from the
other three conventional models. The same set of data that was collected for
building the regressed price models is used to analyse the performance of all of the
models to facilitate a fair comparison.
With regard to the regressed models, the forecasted price per total floor area
for each case is multiplied by the total floor area to obtain the forecasted price to
calculate the forecasting error. Similar to the leave-one-out method, the reliability
of the three conventional models is also analysed using cross validation. The data
for each building type is split into two parts in turns without repetition. One part is
the exploratory sub-sample that contains all of the cases minus the one that is used to
calculate the average unit rate, and the other part contains the omitted case for the
assessment of the forecasting ability. The forecast for each turn is then calculated
by multiplying the average unit rate by the value of the predictor in the omitted case.
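The leave-one-out procedure for a conventional unit-rate model can be sketched as follows. This is an illustration only: the function name and the data are hypothetical, and the average unit rate is assumed here to be the arithmetic mean of the individual unit rates in the exploratory sub-sample.

```python
def loo_unit_rate_forecasts(prices, predictor):
    """Leave-one-out forecasts from an average unit rate model.

    prices    -- actual tender prices of the cases
    predictor -- value of the single predictor (e.g. total floor area) per case
    """
    n = len(prices)
    forecasts = []
    for omit in range(n):
        # Unit rates of all cases except the omitted one (exploratory sub-sample)
        rates = [prices[i] / predictor[i] for i in range(n) if i != omit]
        avg_rate = sum(rates) / len(rates)
        # Forecast for the omitted case
        forecasts.append(avg_rate * predictor[omit])
    return forecasts

# Hypothetical data: three cases with unit rates of 10, 11 and 12
forecasts = loo_unit_rate_forecasts([100.0, 220.0, 360.0], [10.0, 20.0, 30.0])
```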
To measure the closeness of a forecast relative to the actual tender price, the
percentage error of the forecast is used, i.e.,
$$\frac{\text{Forecasted Tender Price} - \text{Actual Tender Price}}{\text{Actual Tender Price}} \times 100\% \qquad (5.20)$$
The mean and standard deviation of the percentage errors, which are the two
widely established accuracy measures of bias and consistency, are used. The further
the mean is from zero, the more biased the model is, and the higher the standard
deviation, the less consistent the model is. However, the magnitude of these two
measures cannot distinguish whether a model is better or worse than the others
without significance testing. The confidence level for all of the significance tests
that are employed in this research is 95%.
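As a small worked illustration of Equation (5.20) and the two accuracy measures, the following sketch computes the percentage errors, their mean (bias) and their standard deviation (consistency). The figures are hypothetical, and the sample (n − 1) standard deviation is an assumption.

```python
from statistics import mean, stdev

def percentage_errors(forecasts, actuals):
    """Percentage error of each forecast relative to the actual tender price."""
    return [(f - a) / a * 100.0 for f, a in zip(forecasts, actuals)]

# Hypothetical forecasts and actual tender prices
errors = percentage_errors([115.0, 220.0, 315.0], [100.0, 220.0, 360.0])
bias = mean(errors)          # mean percentage error
consistency = stdev(errors)  # sample standard deviation of percentage errors
```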
5.9.1 Choice of parametric and non-parametric inference
There are two approaches to statistical inference – parametric and
non-parametric. The former approach refers to modern statistical inference that is
based on the postulation of a parametric statistical model (Fisher 1922). The
parametric models are arguably simpler than the non-parametric models because they
are more informative, more amenable to statistical adequacy assessment, are often
more parsimonious and are more likely to give rise to reliable and precise empirical
evidence (Spanos 2001 p.186). Therefore, statistical adequacy can best be analysed
in a parametric setting. However, the common assumption of normality that lies
behind a parametric model may not always be fulfilled.
There are statistical tests that are available to check normality, such as the
Anderson-Darling (A-D) and Kolmogorov-Smirnov (K-S) tests. The K-S test
essentially looks at the most extreme absolute deviation, and determines the
probability that this deviation can be explained by a normally distributed data set,
whereas the A-D test is a modification of the K-S test that gives more weight to the
tails than the K-S test. The A-D test also differs from the K-S test in that it makes
use of specific distributions, such as a normal distribution, in the calculation of
critical values, and thus has the advantage of being more sensitive. The A-D test is
adopted for testing the assumption of normality in this research. The null hypothesis
for the test is that the forecasted percentage errors for a particular model follow a
normal distribution. The A-D test statistic is defined as:
$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \ln D(y_i) + \ln\left(1 - D(y_{n+1-i})\right) \right], \qquad (5.21)$$
where D is the cumulative distribution function of the normal distribution, n
is the sample size and yi are the ordered data.
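Equation (5.21) can be sketched in a few lines. The function name is hypothetical, and the normal distribution is assumed to be fitted with the sample mean and sample standard deviation; the critical values needed to turn the statistic into a decision are not reproduced here.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(data):
    """A-D statistic of Equation (5.21) against a fitted normal distribution."""
    y = sorted(data)                         # ordered data y_1 <= ... <= y_n
    n = len(y)
    dist = NormalDist(mean(y), stdev(y))     # D: fitted normal CDF (assumption)
    total = 0.0
    for i in range(1, n + 1):
        # y[i - 1] is y_i and y[n - i] is y_(n+1-i) in 1-based notation
        total += (2 * i - 1) * (
            math.log(dist.cdf(y[i - 1])) + math.log(1.0 - dist.cdf(y[n - i]))
        )
    return -n - total / n

# Hypothetical percentage errors
data = [-12.5, -4.0, -1.0, 0.0, 2.5, 6.0, 15.0]
a2 = anderson_darling_normal(data)
```

The statistic depends only on the standardised values, so shifting or rescaling the data leaves it unchanged.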
In a case in which the assumption of normality is rejected,
transformation using such techniques as the Box-Cox normality plot may help to
normalise a distribution. The Box-Cox transformation identifies a value of lambda
(λ) such that the suggested transformation of the original data is Yi^λ when λ ≠ 0 and
ln(Yi) when λ = 0.
To find the optimal lambda values, the Box-Cox transformation modifies the
original data using Equations (5.22) and (5.23) for Wi (a standardised transformed
variable). It then calculates the standard deviation of the variable Wi. The goal is
to find the value of lambda that minimises the standard deviation of Wi.
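$$W_i = \frac{Y_i^{\lambda} - 1}{\lambda \, G^{\lambda - 1}} \quad \text{when } \lambda \neq 0 \qquad (5.22)$$
$$W_i = G \ln(Y_i) \quad \text{when } \lambda = 0, \qquad (5.23)$$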
where Yi is the original data, G is the geometric mean of all the data and λ is
the lambda value.
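The search for lambda described above can be sketched as a simple grid search. The function names, the grid and the data are hypothetical. The sketch assumes strictly positive data, which the logarithm and the geometric mean require; how negative percentage errors would be shifted before transformation is not specified here and is left out.

```python
import math
from statistics import stdev

def standardised_box_cox(y, lam):
    """Standardised Box-Cox variable W_i of Equations (5.22) and (5.23)."""
    g = math.exp(sum(math.log(v) for v in y) / len(y))   # geometric mean G
    if abs(lam) < 1e-12:
        return [g * math.log(v) for v in y]
    return [(v ** lam - 1.0) / (lam * g ** (lam - 1.0)) for v in y]

def best_lambda(y, candidates):
    """Candidate lambda that minimises the standard deviation of W_i."""
    return min(candidates, key=lambda lam: stdev(standardised_box_cox(y, lam)))

# Hypothetical positive data, evenly spaced on the ln scale
y = [1.0, 2.0, 4.0, 8.0, 16.0]
grid = [k / 10 for k in range(-20, 21)]   # lambda from -2.0 to 2.0 in steps of 0.1
lam_hat = best_lambda(y, grid)
```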
If the transformation of the data fails to fulfil the normality assumption, then
the parametric way to proceed is to postulate another appropriate distribution.
Unfortunately, far fewer statistical tests are available for distributions
other than the normal. Alternatively, the non-parametric model, which makes use
of less specific probabilistic assumptions, may be used for inference. The
non-parametric model is distribution free in the sense that it relies only on implicit
assumptions, such as whether the random variable is discrete or continuous, the
nature of the support set of the distribution, the existence of certain moments and
the smoothness of the distribution. Inference using a non-parametric model is based
on rank, and is less
susceptible to the problem of statistical inadequacy. The benefits of non-parametric
inference include its significant gains in power and efficiency when the error
distribution has tails that are heavier than those of a normal distribution, and superior
robustness in general (Hettmansperger and McKean 1998 p. xiii).
5.9.2 Statistical inference for bias
To ascertain the significance of bias, the models are tested against a zero
mean using the t statistic. The t-test is well known for its robustness, even if the
distribution of data departs from normality (Lehmann 1959). The null hypothesis
for the t-test is that the mean percentage error for a model is equal to zero, which
represents an unbiased model. Let μd be the mean percentage error, σd be the
standard deviation of the percentage error, and nd the total number of cases for one
of the models that is represented by the notation d. The p-value that is calculated
from the t statistic in Equation (5.24) shows whether a model is significantly biased
from the zero mean percentage error.
$$t = \frac{\mu_d}{\sigma_d / \sqrt{n_d}}. \qquad (5.24)$$
As all forecasts have been produced by cross-validated models that are
represented by the same set of selected predictors and their different coefficients in
the regressed models (or the different average unit rates in the conventional models
for different turns), the mean percentage errors for the models are likely to be close
to zero.
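The t statistic of Equation (5.24) can be computed as in the following sketch. The function name and the data are hypothetical; the p-value would be read from a t distribution with nd − 1 degrees of freedom, which is left to a statistical package.

```python
import math
from statistics import mean, stdev

def bias_t_statistic(errors):
    """One-sample t statistic for testing a zero mean percentage error."""
    n = len(errors)
    return mean(errors) / (stdev(errors) / math.sqrt(n))

# Hypothetical percentage errors: mean 2, sample standard deviation 1
t = bias_t_statistic([1.0, 2.0, 3.0])
```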
5.9.3 Statistical inference for consistency
As models are expected to be more or less unbiased, the consistency of the
models becomes an important indicator to distinguish the model or models that
perform better than others. Although the t-test for bias is robust even for departures
from normality, the parametric inference tests for consistency (the standard deviation
of percentage errors) are not.
Figure 5-3 shows an algorithm for the selection of parametric and
non-parametric tests. To avoid using the parametric tests naively, the assumption of
the normality of the data (forecasted percentage errors) has been tested. As
parametric inference is more amenable in terms of statistical adequacy, it is preferable
that the assumption of normality be fulfilled, by means of transformation if necessary.
The details concerning the checking of the normality assumption and the use of the
Box-Cox transformation are described in section 5.9.1. Alternatively,
non-parametric inference is employed if the assumption is not satisfied.
After deciding on the type of inference, the forecasting models are first tested
in groups for homogeneity of variances. This involves the use of Bartlett’s
test for parametric inference and the Kruskal-Wallis test for non-parametric inference.
Figure 5-3: Algorithm for Comparisons of Variances of Percentage Errors
[Flowchart. Determine the forecasted percentage errors for the models under comparison and conduct the Anderson-Darling test for normality of distributions. If the distribution of percentage errors for each model is normal, follow the parametric tests; if not, conduct Box-Cox transformations of the percentage errors and the Anderson-Darling test again, and follow the parametric tests if the distribution of transformed errors for each model is normal, or the non-parametric tests otherwise. Parametric tests: conduct Bartlett’s test for equality of variances; if the models are not of the same variance, conduct multiple F-tests using the LSD approach. Non-parametric tests: conduct the Kruskal-Wallis test for equality of rank deviations; if the models are not of the same variance, conduct multiple Mann-Whitney U tests using the LSD approach. In either branch, if the models are of the same variance, all models are comparable in consistency; otherwise, models of about the same potency in consistency are grouped together.]
Bartlett’s test is used to study the significance of the differences between
the variances of percentage errors for the models under comparison. The null
hypothesis for the test is that the variances of percentage errors for the models in
comparison are equal. Let M be the number of models for comparison, and let
Bartlett’s test statistic (B) be represented by Equation (5.25) as follows:
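$$B = \frac{\left( \sum_{d=1}^{M} (n_d - 1) \right) \ln \left( \dfrac{\sum_{d=1}^{M} (n_d - 1)\, \sigma_d^2}{\sum_{d=1}^{M} (n_d - 1)} \right) - \sum_{d=1}^{M} (n_d - 1) \ln \sigma_d^2}{1 + \dfrac{1}{3(M - 1)} \left( \sum_{d=1}^{M} \dfrac{1}{n_d - 1} - \dfrac{1}{\sum_{d=1}^{M} (n_d - 1)} \right)}. \qquad (5.25)$$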
With reference to a chi-square (χ²) distribution with M − 1 degrees of freedom,
the B value corresponds to a p-value, which suggests whether the models in
comparison are of equal variance.
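A minimal sketch of Bartlett’s statistic, written term by term from Equation (5.25), is given below. The function name and the data are hypothetical, and the sample (n − 1) variance is assumed for each model.

```python
import math
from statistics import variance

def bartlett_statistic(samples):
    """Bartlett's B for M samples of percentage errors."""
    m = len(samples)
    dof = [len(s) - 1 for s in samples]      # n_d - 1 for each model
    var = [variance(s) for s in samples]     # sigma_d squared for each model
    total_dof = sum(dof)
    pooled = sum(d * v for d, v in zip(dof, var)) / total_dof
    numerator = total_dof * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dof, var)
    )
    correction = 1.0 + (sum(1.0 / d for d in dof) - 1.0 / total_dof) / (3.0 * (m - 1))
    return numerator / correction

# Hypothetical samples: equal variances, then clearly unequal variances
b_equal = bartlett_statistic([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b_unequal = bartlett_statistic([[1.0, 2.0, 3.0], [0.0, 10.0, 20.0]])
```

When all sample variances are identical the numerator vanishes, so B is zero; larger values of B indicate stronger evidence against equal variances.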
The Kruskal-Wallis test (H-test) is a non-parametric equivalent to a one-way
ANOVA that tests whether several independent samples have the same central
tendency. The central tendencies or medians are the main concern in the H-test.
Based on the assumption that the values for each sample under consideration have
underlying continuous distributions, the null hypothesis is that k samples from
possibly different populations actually originate from similar populations. By
replacing percentage errors with absolute deviations from the sample mean as the
sample values for ranking, the H-test assesses the homogeneity of population
variance (Sprent 1993 pp. 155-157). Let Rj be the sum of ranks of the jth sample,
nj be the size of the jth sample, and N be the size of the combined sample. The
H-test statistic is:
$$H = \left[ \frac{12}{N(N + 1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} \right] - 3(N + 1). \qquad (5.26)$$
With reference to a chi-square (χ²) distribution with k − 1 degrees of freedom,
the H value corresponds to a p-value, which suggests whether the models in
comparison are of equal variances.
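The H statistic of Equation (5.26), applied to absolute deviations from the sample mean, can be sketched as follows. The helper names are hypothetical. Ties receive the average rank, and no tie correction is applied because Equation (5.26) does not include one.

```python
def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(samples):
    """H statistic of Equation (5.26) for a list of samples."""
    pooled = [v for s in samples for v in s]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for s in samples:
        r_j = sum(ranks[start:start + len(s)])   # rank sum R_j of sample j
        total += r_j ** 2 / len(s)
        start += len(s)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

def absolute_deviations(sample):
    """Absolute deviations from the sample mean, used in place of raw errors."""
    centre = sum(sample) / len(sample)
    return [abs(v - centre) for v in sample]

# Hypothetical check of the arithmetic on two small samples
h = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

To test consistency rather than central tendency, each model's percentage errors would first be passed through `absolute_deviations` before calling `kruskal_wallis_h`.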
If the p-value from Bartlett’s test or the Kruskal-Wallis test is smaller than
0.05, so that the null hypothesis is rejected, then the consistencies of the models
in comparison are not equal. The next step is to determine which of the models
differ specifically from each other. To do this, the variances of percentage errors
of the models are compared pairwise using F-tests or Mann-Whitney U rank sum
tests.
Following a Bartlett’s test that shows a significant difference of variances
amongst the models, the F-test is used to test the null hypothesis that the
variances or standard deviations of the forecasted percentage errors for two models
are equal. The F-test statistic is:
$$F = \frac{s_1^2}{s_2^2}, \qquad (5.27)$$
where s1² and s2² are the sample variances. The more this ratio deviates
from 1, the stronger the evidence for unequal population variances. With reference
to the F distribution, a corresponding p-value can be found that suggests whether the
two models in comparison are of equal variance.
If the H-test shows a significant difference of variances amongst the models,
then it is followed by the Mann-Whitney U test (U-test), which uses the rank sums
of the two samples to examine the null hypothesis that the absolute deviations from
the sample means of the two samples are equal. The observations from both samples
are combined and ranked, with the average rank assigned in the case of a tie. If the
percentage error deviations for the two samples in comparison are identical, then the
ranks should be randomly mixed between the two samples. Two rank sums, Ta and
Tb, are calculated. For sample sizes that are larger than 20, the U statistic is
referred to a standard normal (Z) distribution, as is shown in Equation (5.28):
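$$Z = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}, \qquad (5.28)$$
where n1 and n2 are the sizes of the two samples and U is the smaller of Ua and Ub
in Equations (5.29) and (5.30) as follows:
$$U_a = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - T_a \qquad (5.29)$$
$$U_b = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - T_b. \qquad (5.30)$$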
With reference to the Z distribution, a corresponding p-value can be found
that suggests whether the two models in comparison are of equal variance.
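Equations (5.28) to (5.30) can be sketched as follows. The function names and the data are hypothetical. The tiny samples are used only to check the arithmetic; the normal approximation itself is intended for sample sizes larger than 20.

```python
import math

def rank_sums(a, b):
    """Rank sums T_a and T_b of two samples ranked together (average ranks for ties)."""
    pooled = sorted(a + b)
    rank_of = {}
    for value in set(pooled):
        first = pooled.index(value) + 1      # 1-based rank of first occurrence
        count = pooled.count(value)
        rank_of[value] = first + (count - 1) / 2
    return sum(rank_of[v] for v in a), sum(rank_of[v] for v in b)

def mann_whitney_z(a, b):
    """U statistic and its normal approximation Z."""
    n1, n2 = len(a), len(b)
    t_a, t_b = rank_sums(a, b)
    u_a = n1 * n2 + n1 * (n1 + 1) / 2 - t_a
    u_b = n1 * n2 + n2 * (n2 + 1) / 2 - t_b
    u = min(u_a, u_b)
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, z

# Hypothetical, fully separated samples
u, z = mann_whitney_z([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```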
Unfortunately, performing several F-tests or Mann-Whitney U rank sum tests
has a serious drawback. The more null hypotheses there are to be tested, the more
likely it is that one of them will be rejected even if all of the null hypotheses are
actually true (Kleinbaum et al. 1998 pp. 443-447). In other words, if each test has a
5% probability of erroneously rejecting the null hypothesis (H0), then the probability
of incorrectly rejecting at least one H0 is much larger than 5%, and continues to
increase with each additional test that is carried out.
Fisher’s least significant difference (LSD) approach is used to correct
exaggerated significance levels. For example, if k two-sample tests are produced,
each at a significance level of 0.05, then the maximum possible value for the overall
significance level is 0.05k. The remedy under the LSD approach is to decrease the
significance level of each test to 0.05/k. In this research, six (4C2) two-sample
tests are produced for each type of building (i.e., k = 6), and therefore the
corrected significance level for each pairwise test is 0.0083.
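The arithmetic of the correction can be checked with a short sketch. The function name is hypothetical. Dividing the significance level by the number of tests is also commonly referred to as a Bonferroni-type adjustment.

```python
import math

def corrected_significance(models, alpha=0.05):
    """Number of pairwise tests, corrected per-test level and uncorrected upper bound."""
    k = math.comb(models, 2)       # number of two-sample tests, e.g. 4C2
    per_test = alpha / k           # corrected significance level for each test
    upper_bound = alpha * k        # maximum overall level without correction
    return k, per_test, upper_bound

k, per_test, upper_bound = corrected_significance(4)
```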
5.10 Tools for Computation
Both spreadsheets (e.g. Excel) and statistical software packages (e.g. SPSS)
provide built-in regression functions. Users can simply use these functions by
inputting the observed values for dependent and independent variables, and a
regression model by least-squares method (or other methods), together with other
relevant information to describe the model, will automatically be generated in report
format. However, these built-in functions do not feature a resampling procedure,
which means that they are unable to satisfy the needs of this study. Meeting
this requirement and following the various algorithms that are described in sections
5.7 to 5.9 of this chapter requires a purpose-made programme. Therefore, this
research uses the programming language of the mathematical software Mathcad to
write a programme for handling the selection procedures and reliability analysis.
Mathcad is also used as a calculation tool in this study. It possesses advantages
over other programming languages in its use of direct equation input and its approach
to the solution of mathematical problems symbolically or numerically, which means
that programmes that are written in Mathcad are readable even for someone who has
no background in programming. To illustrate the use of the worksheets that
were written in Mathcad, an example for the RASEM for offices is attached in
Appendix D.
In addition, the functions of significance tests, such as the t-test, K-W test and
U-test, are available in the spreadsheets and statistical software packages that are
used.
5.11 Summary
This chapter describes an approach to further develop JSEM. JSEM is first
simplified to avoid an escalation in the number of variables that are induced by
increasing the number of storeys of a building. The simplification procedure
successfully reduces the number of variables for JSEM from a function of the
number of storeys to 9 for buildings with a podium and 6 for buildings without a
podium.
The cost analyses of 148 completed projects in Hong Kong for four types of
building – offices, private housing, nursing homes and primary and secondary
schools – were collected. Ten out of the 148 samples were considered as outliers
due to their differences in building provision, contractual arrangement and
technology of fabrication, and were discarded from further analysis. The building
prices per total floor area, which were extracted from the analyses and rebased in
accordance with the tender price index, are set as the observed values of the response
for modelling. With reference to the actual measurements of quantities (e.g.
perimeter and storey height of buildings) for the variables in JSEM (e.g. elevation
area), another two sets of variables are identified, one containing 19 variables for
buildings with basements and the other containing 12 variables for buildings without
basements.
A non-parametric approach using the least average MSQ as the termination
criterion is proposed to prevent the violation of parametric assumptions that are
likely to be caused by small sample sizes. The leave-one-out cross validation
method, which on the one hand determines the models by an explanatory sub-sample,
and on the other hand checks the forecasting ability of the models by an omitted case,
is considered to be the most intuitive method that simulates the practice of
forecasting. To improve the probability of identifying the best subset models, a
dual stepwise procedure, together with an algorithm to eliminate the possible
offending variables, is suggested. The transformation of variables may further
improve forecasting performance, and in this research the natural logarithmic
transformation method is selected for the variables that are chosen in the best
regressed models. The principle of parsimony is particularly addressed in the
selection of models, and a more complicated model has to demonstrate its benefits in
terms of forecasting accuracy to be chosen over a simpler model.
The performance of a forecast is measured by the percentage error of
departure from the actual price. To assess the performance of a model in terms of
forecasting accuracy, bias and consistency are adopted.
Statistical inference can be classified as parametric or non-parametric. The
former approach is more powerful and is used if its assumptions can be satisfied. If
they cannot, then the percentage errors are transformed to see if the transformed
distribution can fulfil the assumptions. If the assumptions are still not fulfilled, then
the non-parametric approach is used.
As the regressed models and the unit rates in the conventional models are
developed by cross validation, it is expected that the forecasts from these models will
have a close to zero bias. Because of this, each model is first tested against a zero
bias using the t-test. The t-test is parametric, but is known to be robust for
departures from normality. However, parametric tests for consistency are not
robust, and an algorithm is developed to assist in the selection of an appropriate
approach and significance tests within that approach.
Two stages are involved in distinguishing the models using measures of
consistency. First, the homogeneity of variance of all of the models is tested using
k-sample tests, such as Bartlett’s test under the parametric approach and the
Kruskal-Wallis test under the non-parametric approach. If models are found to be
significantly different, then these tests are followed by multiple two-sample tests,
such as F-tests under the parametric approach and Mann-Whitney U-tests under the
non-parametric approach. Because of the exaggerated significance levels due to the
multiple comparisons, Fisher’s least significant difference (LSD) approach is
used for rectification. With the assistance of the LSD, models of the same potency
in consistency are grouped together.
The benefits of advances in computer software are harnessed to assist in this
research, and a combination of different software is used. The mathematical
software Mathcad is used to execute the purpose-made algorithm of regression
analysis using cross validation, and commonly used spreadsheets and statistical
packages that offer a variety of built-in functions for significance tests are also
adopted to produce statistical inferences.
Chapter 6 Analysis
Think as you work, for in the final analysis, your worth to your company comes not only in solving problems, but also in anticipating them. Harold Wallace Ross
6.1 Introduction
This chapter is divided into three sections. The first section concerns the
development of the regressed models based on the data that was collected from Hong
Kong projects. The details of the eight regressed models that are generated from
two sets of variables for the four types of buildings and the corresponding
logarithmic transformed models are explained. The variables that are selected in
each regressed model are different.
The bias and consistency of percentage errors of the forecasts from the
regressed models that were developed in the first section, and those of the
conventional methods, are measured in the second section. Each regressed model is
compared individually with the conventional models. On average, the regressed
models, especially the Regressed Model for Advanced Storey Enclosure Method
(RASEM), produce more accurate forecasts and all fall into the best clusters of
models in the eight groups of models under comparison. However, there is
insufficient evidence to conclude their superiority over their conventional
counterparts.
A practical approach to combining forecasts is proposed in the third section
to improve prediction accuracy. The combined forecast is always more accurate
than the average forecast, and is sometimes better than the best forecast.
6.2 Model Development
6.2.1 Data Collected
The data collected include the number of podium storeys (a), the number of
tower storeys (b), the number of basement storeys (m), the average area per podium
storey in m² (fp), the average area per tower storey in m² (ft), the average area per
basement storey in m² (fb), the average podium storey height in m (sp), the average
tower storey height in m (st), the average basement storey height in m (sb), the
average perimeter on plan for the superstructure in m (ppt), the average perimeter on
plan for the basement in m (pb), the roof area in m² (r), the original tender price in
Hong Kong dollars (tp), the date the tender was returned and the tender price index
(TPI). Appendix C (enclosing Table C-1 to Table C-4) is attached to display these
data according to the building type in a tabular format. The original tender prices
were rebased to the base period of the second quarter of 1997 in accordance with the
tender price index in Appendix B. The rebased prices are also shown in Appendix
C.
6.2.2 Candidates for Regression Models
The regression methodology that is described in Chapter 5 is used to advance
the original JSEM. A new model, the Regressed Model for James’ Storey
Enclosure Method (RJSEM), is developed by using the variables that were
identified in JSEM for each type of building. The same methodology is also applied
to a second new model, the Regressed Model for Advanced Storey Enclosure Method
(RASEM), which uses another set of variables. The RASEM contains
four types of candidates: the primary variables (n, m, fpt, fb, spt, sb, ppt, pb, r), the
second degree variable (n2), the interaction terms that are formed amongst the primary
variables (nfpt, mfb, nspt, msb, nsptppt, msbpb) and the interaction terms that are
formed between the primary variables and the second degree variable (n2fpt, n2spt,
n2sptppt). Table 6-1 shows the candidate variables, the response and the
corresponding equations for the RJSEM and the RASEM.
6.2.3 Response for Regression Models
A regressed model that produces a small average MSQ may not produce a
corresponding small mean or standard deviation of percentage errors (for which the
mean represents bias and the standard deviation represents the consistency of the
model), because larger response values have more influential effects in the
least-squares method, whereas the use of percentage errors for performance
assessment is unit free. These large-value influential effects can be reduced
tremendously by changing the response from the tender price to the tender price per
total floor area, as described in section 5.7.1 of Chapter 5. By adopting this change,
the ranges of actual response values that are represented by the ratios of the
maximum actual response value to the minimum are reduced from 60.74 to 2.33 for
offices, from 113.16 to 2.17 for private housing, from 6.87 to 2.09 for nursing homes
and from 7.93 to 2.23 for schools.
Table 6-1: Candidates, Responses and their Equations for the RJSEM and the RASEM

Variable | Equation | Notation

RJSEM
Candidates
Total floor area for podium | a · fp | afp
Storey number for podium · Total floor area for podium | a² · fp | a2fp
Total floor area for tower | b · ft | bft
Storey number for tower · Total floor area for tower | b² · ft | b2ft
Storey number for podium · Total floor area for tower | a · b · ft | abft
Total floor area for basement | m · fb | mfb
Elevation area | (a · sp + b · st) · ppt | nsptppt
Basement wall area | m · sb · pb | msbpb
Roof area | r | r
Response
Adjusted tender price per total floor area | P ÷ (a · fp + b · ft + m · fb) | Y

RASEM
Candidates
Storey number for superstructure | a + b | n
Storey number for basement | m | m
Square of storey number for superstructure | (a + b)² | n2
Average area per storey for superstructure | (a · fp + b · ft) ÷ (a + b) | fpt
Average area per storey for basement | fb | fb
Average storey height of superstructure | (a · sp + b · st) ÷ (a + b) | spt
Average storey height of basement | sb | sb
Average perimeter on plan for superstructure | ppt | ppt
Average perimeter on plan for basement | pb | pb
Total floor area for superstructure | (a · fp + b · ft) | nfpt
Storey number for superstructure · Total floor area for superstructure | (a + b) · (a · fp + b · ft) | n2fpt
Total floor area for basement | m · fb | mfb
Height of building above ground | (a · sp + b · st) | nspt
Depth of basement | m · sb | msb
Storey number for superstructure · Height of building above ground | (a + b) · (a · sp + b · st) | n2spt
Elevation area | (a · sp + b · st) · ppt | nsptppt
Basement wall area | m · sb · pb | msbpb
Storey number for superstructure · Elevation area | (a + b) · (a · sp + b · st) · ppt | n2sptppt
Roof area | r | r
Response
Adjusted tender price per total floor area | P ÷ (a · fp + b · ft + m · fb) | Y
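The RASEM candidates and the response in Table 6-1 can be derived from the collected measurements as in the following sketch. The function names and the building figures are hypothetical; the formulas follow the Equation column of the table.

```python
def rasem_candidates(a, b, m, fp, ft, fb, sp, st, sb, ppt, pb, r):
    """Candidate variables of the RASEM, keyed by the notation of Table 6-1."""
    n = a + b                         # storey number for superstructure
    floor_area = a * fp + b * ft      # total floor area for superstructure (nfpt)
    height = a * sp + b * st          # height of building above ground (nspt)
    return {
        "n": n, "m": m, "n2": n ** 2,
        "fpt": floor_area / n, "fb": fb,
        "spt": height / n, "sb": sb,
        "ppt": ppt, "pb": pb,
        "nfpt": floor_area, "n2fpt": n * floor_area,
        "mfb": m * fb, "nspt": height, "msb": m * sb,
        "n2spt": n * height,
        "nsptppt": height * ppt, "msbpb": m * sb * pb,
        "n2sptppt": n * height * ppt, "r": r,
    }

def response(price, a, b, m, fp, ft, fb):
    """Adjusted tender price per total floor area (Y)."""
    return price / (a * fp + b * ft + m * fb)

# Hypothetical building: 2 podium storeys, 10 tower storeys, 1 basement storey
c = rasem_candidates(a=2, b=10, m=1, fp=1000, ft=500, fb=1200,
                     sp=5, st=3, sb=4, ppt=100, pb=140, r=1000)
y = response(82_000_000, a=2, b=10, m=1, fp=1000, ft=500, fb=1200)
```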
6.2.4 Selection of Predictors
The selection of best models (the best subsets of the predictors) concerns the
minimisation of the average MSQ by leave-one-out cross validation. The dual
stepwise procedure that is described in section 5.7.5 of Chapter 5 is applied to the
two sets of candidates and responses (one for the RJSEM and the other for the
RASEM), as is shown in Table 6-1. Except for the RJSEMs for nursing homes and
schools, for which agreeable subsets of predictors were produced, two different
subsets of predictors were selected from the values of these candidates and responses
using the forward stepwise and backward stepwise procedures separately. As is
explained in section 5.8.1 of Chapter 5, this discrepancy may possibly be due to a
less significant predictor that acts as an offending variable and enters the model
before a more significant predictor, or a more significant predictor that acts as an
offending variable is eliminated from the model before a less significant predictor.
To avoid this circumstance, candidates in the RJSEMs or RASEMs are excluded
repetitively using the algorithm that is shown in Figure 5.3 of Chapter 5. According
to this algorithm, the selection process ceases when both forward stepwise and
backward stepwise procedures produce the same best subset of variables. Several
candidates in the RJSEMs and RASEMs for the four types of building were excluded.
Table 6-2 shows the included candidates, excluded candidates and selected predictors
in these models. Amongst the candidates in the RJSEMs, msbpb (basement wall area)
was the only candidate that was excluded in the RJSEMs for offices and private
housing. However, there were more excluded candidates in the RASEMs. First of
all, the observed values for r (roof area) were found to be very close to, or the same
as, those for fpt (average floor area for the superstructure), because most multi-storey
buildings in Hong Kong, including those in this research, have a flat roof design for
the podium and tower. As fpt is considered to be a more representative candidate,
because the average floor area corresponds to more elements of a building than the
roof area, r was excluded from the RASEMs. The other primary variables (n, m,
fpt, fb, spt, sb, ppt and pb) and the second degree variable, n2, were kept
because the use of untransformed variables excluding any interaction term is the best
starting point for a general regression model (Skitmore and Patchell 1990). All of
the interaction terms were subject to the exclusion procedures. nfpt (being a
candidate in RJSEM as well), n2fpt, n2spt, msbpb (being a candidate in RJSEM as
well) and n2sptppt were excluded from the RASEMs for the four types of building.
Furthermore, the interaction terms mfb (total basement floor area) and msb (depth of
basement) were also excluded from the private housing and nursing home models.
The agreeable best models from both the forward stepwise and backward stepwise
procedures were generated from the Mathcad worksheets that were purposefully
written to carry out the selection algorithm and the reliability analysis using cross
validation.
Table 6-2: Included Candidates, Excluded Candidates and Selected Predictors for RJSEMs and RASEMs

RJSEM
Candidate | Office | Private Housing | Nursing Home | School
afp / nfpt* | o | o | o | o
a2fp / n2fpt* | o | o | o | o
bft | o | o | NA | NA
b2ft | o | o | NA | NA
abft | o | o | NA | NA
mfb | o | o | o | NA
nsptppt | o | o | o | o
msbpb | x | x | o | NA
r | o | o | o | o

RASEM
Candidate | Office | Private Housing | Nursing Home | School
n | o | o | o | o
m | o | o | o | NA
n2 | o | o | o | o
fpt | o | o | o | o
fb | o | o | o | NA
spt | o | o | o | o
sb | o | o | o | NA
ppt | o | o | o | o
pb | o | o | o | NA
nfpt | x | x | x | x
n2fpt | x | x | x | x
mfb | o | o | x | NA
nspt | o | o | o | o
msb | o | x | x | NA
n2spt | x | x | x | x
nsptppt | o | o | o | o
msbpb | x | x | x | NA
n2sptppt | x | x | x | x
r | x | x | x | x

Legend: o - Candidate / Selected Predictor; x - Excluded Candidate; NA - Not applicable
Remarks: * - afp and a2fp for office and private housing, nfpt and n2fpt for nursing home and school
6.2.4.1 Selected Predictors for RJSEMs and RASEMs
Tables 6-3 to 6-10 show the step by step results of the predictor selection by
forward stepwise and backward stepwise procedures based on the criterion of
average MSQ. Tables 6-11 to 6-18 show the regression coefficients for each
predictor, forecast and MSQ as determined by the cross-validated models.
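
The selection criterion described above can be illustrated with a minimal sketch. It is not the Mathcad worksheet used in this research: it implements forward entry only (no deletion checks), it assumes that the MSQ of a case is the squared error of the leave-one-out forecast for that case, and the data, the function names and the tolerance `tol` (a numerical guard against rounding noise) are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def loo_average_msq(data, y, names):
    """Average squared error of the leave-one-out forecasts (assumed MSQ)."""
    total = 0.0
    for i in range(len(y)):
        train = [j for j in range(len(y)) if j != i]
        beta = fit_ols([[data[n][j] for n in names] for j in train],
                       [y[j] for j in train])
        forecast = beta[0] + sum(b * data[n][i] for b, n in zip(beta[1:], names))
        total += (y[i] - forecast) ** 2
    return total / len(y)


def forward_select(data, y, tol=1e-12):
    """Enter the candidate that most reduces the average MSQ until none does."""
    selected, best = [], float("inf")
    while True:
        trials = {n: loo_average_msq(data, y, selected + [n])
                  for n in data if n not in selected}
        if not trials:
            break
        name = min(trials, key=trials.get)
        if trials[name] >= best - tol:  # no real improvement: end regression
            break
        selected.append(name)
        best = trials[name]
    return selected, best


# Hypothetical candidates; the response depends on x1 and x2 only.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "x3": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
y = [100.0 + 10.0 * a - 2.0 * b for a, b in zip(data["x1"], data["x2"])]
selected, best = forward_select(data, y)
print(selected, best)
```

A backward procedure would start from all candidates and delete the variable whose removal most reduces the average MSQ; the tables that follow report both directions.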
Table 6-19 divides the constants and selected predictors of all of the
regressed models according to the signs of their corresponding coefficients. Special
attention is drawn to the fact that the sign of a coefficient does not represent the
actual relationship between a predictor and the response (tender price per total floor
area), but only the relationship between them in the best model under the proposed
regression methodology. Thus, the use of another methodology (e.g. a termination
criterion other than the least average MSQ) may produce another best model (such as
another group of transformed variables or another subset of predictors) that would
suggest a different set of relationships between the selected predictors and the
response in terms of the signs and values of the coefficients.
All of the constant terms (β0) are positive except the term for the RASEM for
private housing. The selected predictors can be classified into two groups: floor
area related predictors and non-floor area related predictors. Referring to Table
6-19, all of the models have at least one floor area related predictor. The floor area
related predictors include afp, a2fp, bft, b2ft, fb, fpt, n2fpt and r. Most of these
predictors exhibit a negative effect on the tender price per total floor area in the
RJSEMs and the RASEMs. The average floor area of the superstructure (fpt) is not a
candidate in the RJSEMs. Instead, the effect of floor area on the response is
represented by the total floor area and the total floor area multiplied by the number
of storeys (afp, a2fp, bft and b2ft, or nfpt and n2fpt). If r in the RJSEMs is
considered to be an alternative candidate to fpt in the RASEMs because the two are
close in value, then all of the regressed models except for the RJSEMs for offices and
private housing would have a negative component that is represented by the average
area of the superstructure (similar to the typical floor area for multi-storey
buildings of rectangular shape). In addition, the RASEM for nursing homes is very
similar to the corresponding RJSEM in terms of both the selected predictors (nsptppt
and r are the predictors in the RJSEM, whereas nsptppt and fpt are the predictors in
the RASEM) and the values of the corresponding coefficients, again because fpt and r
are close in value.
All of the RASEMs contain the predictor fpt with a corresponding negative
coefficient. If these models were used for prediction, then they would suggest that
the higher the value of the average floor area of a superstructure, the smaller the
forecasted tender price per total floor area would be. In the RJSEMs for offices and
private housing, the predictors of total floor area such as a2fp (for offices), bft (for
offices and private housing) and b2ft (for offices), instead of the predictors of
average area per storey, are present as the negative components. In contrast, some
other floor area related predictors such as afp (in the RJSEM for offices), r (in the
RJSEM for private housing), n2fpt (in the RJSEM for schools) and fb (in the
RASEM for private housing) are present in the different models with positive
coefficients. To find out the overall effect of the floor area related predictors in all
of the regressed models, their aggregate contributions to the response were calculated.
Table 6-20 shows the contributions of the floor area related predictors to the response.
The table shows that the aggregate contribution of these floor area related predictors
is generally negative (except for a few cases in the RJSEMs for offices and private
housing and the RASEM for private housing), which suggests that the tender price per
total floor area is negatively related to the floor area related predictors in the
models. However, the non-floor area related predictors, n2, pb, ppt, sb, spt, nspt and
nsptppt, exhibit solely positive aggregate contributions to all responses. Their
contributions are shown in Table 6-21.
Unlike the original JSEM, which assumes the price of components (e.g. external
wall, window and external finishes) to be proportional to the measured areas (e.g.
external wall area), the regressed models select variables without assuming such a
relationship. The aggregate contributions according to the classification of floor or
non-floor area related predictors provide further information on the composition of
the regressed models.
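
As a check on how these aggregate contributions relate to the forecasts, the short sketch below adds the constant, the floor area related contribution and the non-floor area related contribution for Case 1 and compares the sum with the forecasted Y. The figures are the rounded values printed for Case 1 in Tables 6-11 to 6-21; only the six models that have both kinds of contribution tabulated are included, and the dictionary layout is purely illustrative.

```python
# Case 1: (constant, floor area related contribution,
#          non-floor area related contribution, forecasted Y)
case1 = {
    "RJSEM office":          (4696, -750, 1856, 5802),
    "RJSEM nursing home":    (4541, -999, 847, 4389),
    "RASEM office":          (2370, -945, 4580, 6005),
    "RASEM private housing": (-6090, -422, 10678, 4166),
    "RASEM nursing home":    (4608, -1058, 873, 4423),
    "RASEM school":          (1484, -187, 985, 2282),
}

differences = {}
for model, (constant, floor, non_floor, forecast) in case1.items():
    rebuilt = constant + floor + non_floor
    differences[model] = rebuilt - forecast
    print(f"{model:22s} rebuilt {rebuilt:6d}  tabulated {forecast:6d}")
```

For these six models the three components reproduce the tabulated forecast, so the two contribution tables can be read as a decomposition of each leave-one-out forecast.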
Table 6-3: Step-by-Step Selection Results of Predictors for the RJSEM for
Offices

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     a2fp               -                  3.00E+06
2     nsptppt            -                  2.86E+06
3     bft                -                  2.48E+06
4     afp                -                  2.10E+06
5     b2ft               -                  2.07E+06
6     (No deletion or entry, end regression)
Final model: a2fp, nsptppt, bft, afp, b2ft  2.07E+06

Backward Stepwise
Step  Variables entered                            Variables deleted  Average MSQ
1     afp, a2fp, bft, b2ft, abft, mfb, nsptppt, r  -                  2.87E+06
2     -                                            abft               2.46E+06
3     -                                            r                  2.11E+06
4     -                                            mfb                2.07E+06
5     (Stop backward, start forward)
6     (No entry or deletion, end regression)
Final model: a2fp, nsptppt, bft, afp, b2ft  2.07E+06
Table 6-4: Step-by-Step Selection Results of Predictors for the RJSEM for
Private Housing

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     bft                -                  9.72E+05
2     r                  -                  9.59E+05
3     (No deletion or entry, end regression)
Final model: bft, r  9.59E+05

Backward Stepwise
Step  Variables entered                            Variables deleted  Average MSQ
1     afp, a2fp, bft, b2ft, abft, mfb, nsptppt, r  -                  1.69E+06
2     -                                            mfb                1.33E+06
3     -                                            b2ft               1.15E+06
4     -                                            nsptppt            1.08E+06
5     -                                            a2fp               1.06E+06
6     -                                            abft               9.95E+05
7     -                                            afp                9.59E+05
8     (No entry or deletion, end regression)
Final model: bft, r  9.59E+05
Table 6-5: Step-by-Step Selection Results of Predictors for the RJSEM for
Nursing Homes

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     r                  -                  6.73E+05
2     nsptppt            -                  6.57E+05
3     (No deletion or entry, end regression)
Final model: r, nsptppt  6.57E+05

Backward Stepwise
Step  Variables entered                    Variables deleted  Average MSQ
1     nfpt, n2fpt, mfb, nsptppt, msbpb, r  -                  3.26E+06
2     -                                    n2fpt              1.01E+06
3     -                                    mfb                7.74E+05
4     -                                    nfpt               7.00E+05
5     -                                    msbpb              6.57E+05
6     (No entry or deletion, end regression)
Final model: r, nsptppt  6.57E+05
Table 6-6: Step-by-Step Selection Results of Predictors for the RJSEM for
Schools

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     r                  -                  2.17E+05
2     n2fpt              -                  2.07E+05
3     (No deletion or entry, end regression)
Final model: r, n2fpt  2.07E+05

Backward Stepwise
Step  Variables entered        Variables deleted  Average MSQ
1     nfpt, n2fpt, nsptppt, r  -                  3.16E+05
2     -                        nsptppt            2.35E+05
3     -                        afp                2.07E+05
4     (No entry or deletion, end regression)
Final model: r, n2fpt  2.07E+05
Table 6-7: Step-by-Step Selection Results of Predictors for the RASEM for
Offices

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     nspt               -                  2.79E+06
2     n2                 -                  2.04E+06
3     fpt                -                  1.79E+06
4     ppt                -                  1.63E+06
5     (No deletion or entry, end regression)
Final model: nspt, n2, fpt, ppt  1.63E+06

Backward Stepwise
Step  Variables entered                                               Variables deleted  Average MSQ
1     n, m, n2, fpt, fb, spt, sb, ppt, pb, mfb, nspt, msb, nsptppt    -                  1.85E+07
2     -                                                               pb                 7.14E+06
3     -                                                               nspt               3.78E+06
4     -                                                               spt                2.51E+06
5     -                                                               fb                 2.04E+06
6     -                                                               nsptppt            1.93E+06
7     -                                                               msb                1.92E+06
8     -                                                               mfb                1.90E+06
9     -                                                               m                  1.89E+06
10    (Stop backward, start forward)
11    nspt                                                            -                  1.80E+06
12    (Stop forward, start backward)
13    -                                                               sb                 1.69E+06
14    -                                                               n                  1.63E+06
15    (No entry or deletion, end regression)
Final model: nspt, n2, fpt, ppt  1.63E+06
Table 6-8: Step-by-Step Selection Results of Predictors for the RASEM for
Private Housing

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     spt                -                  5.96E+05
2     fb                 -                  5.62E+05
3     pb                 -                  5.19E+05
4     fpt                -                  4.96E+05
5     sb                 -                  4.92E+05
6     (No deletion or entry, end regression)
Final model: spt, fb, pb, fpt, sb  4.92E+05

Backward Stepwise
Step  Variables entered                                         Variables deleted  Average MSQ
1     n, m, n2, fpt, fb, spt, sb, ppt, pb, mfb, nspt, nsptppt   -                  5.26E+06
2     -                                                         m                  6.74E+05
3     -                                                         nspt               6.11E+05
4     -                                                         ppt                5.67E+05
5     -                                                         n                  5.41E+05
6     -                                                         mfb                5.20E+05
7     -                                                         nsptppt            5.02E+05
8     -                                                         n2                 4.92E+05
9     (No entry or deletion, end regression)
Final model: spt, fb, pb, fpt, sb  4.92E+05
Table 6-9: Step-by-Step Selection Results of Predictors for the RASEM for
Nursing Homes

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     fpt                -                  6.70E+05
2     nsptppt            -                  6.47E+05
3     (No deletion or entry, end regression)
Final model: fpt, nsptppt  6.47E+05

Backward Stepwise
Step  Variables entered                                    Variables deleted  Average MSQ
1     n, m, n2, fpt, fb, spt, sb, ppt, pb, nspt, nsptppt   -                  1.23E+08
2     -                                                    sb                 2.34E+07
3     -                                                    n2                 3.79E+06
4     -                                                    pb                 1.33E+06
5     -                                                    fb                 9.68E+05
6     -                                                    n                  8.63E+05
7     -                                                    m                  7.89E+05
8     -                                                    nspt               7.36E+05
9     -                                                    ppt                6.47E+05
10    -                                                    spt                6.47E+05
11    (No entry or deletion, end regression)
Final model: fpt, nsptppt  6.47E+05
Table 6-10: Step-by-Step Selection Results of Predictors for the RASEM for
Schools

Forward Stepwise
Step  Variables entered  Variables deleted  Average MSQ
1     nspt               -                  1.80E+05
2     fpt                -                  1.75E+05
3     (No deletion or entry, end regression)
Final model: fpt, nspt  1.75E+05

Backward Stepwise
Step  Variables entered                     Variables deleted  Average MSQ
1     n, n2, fpt, spt, ppt, nspt, nsptppt   -                  2.68E+05
2     -                                     n2                 2.26E+05
3     -                                     nsptppt            2.07E+05
4     -                                     ppt                2.04E+05
5     -                                     n                  2.00E+05
6     -                                     spt                1.75E+05
7     (No entry or deletion, end regression)
Final model: fpt, nspt  1.75E+05
Table 6-11: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Offices

RJSEM ( β0 + β1 ⋅ a2fp + β2 ⋅ nsptppt + β3 ⋅ bft + β4 ⋅ afp + β5 ⋅ b2ft )

Case  β0    β1      β2     β3      β4     β5       Forecasted Y  MSQ
 1    4696  -0.066  0.223  -0.079  0.295  -0.0002  5,802         1.26E+06
 2    4708  -0.069  0.222  -0.079  0.306  -0.0002  5,657         8.77E+05
 3    4690  -0.063  0.221  -0.078  0.286  -0.0002  5,118         8.07E+05
 4    4710  -0.070  0.221  -0.078  0.308  -0.0002  5,645         8.19E+05
 5    4706  -0.069  0.224  -0.080  0.309  -0.0002  6,186         2.33E+06
 6    4541  -0.083  0.242  -0.097  0.402  -0.0001  2,325         3.08E+06
 7    4686  -0.068  0.226  -0.080  0.305  -0.0003  6,247         2.49E+06
 8    4614  -0.064  0.217  -0.074  0.277  -0.0003  6,540         6.78E+06
 9    4568  -0.059  0.239  -0.081  0.251  -0.0002  9,757         2.88E+06
10    4697  -0.065  0.214  -0.078  0.293  -0.0002  7,508         1.44E+06
11    4663  -0.068  0.223  -0.078  0.300  -0.0003  6,618         1.73E+04
12    4620  -0.066  0.224  -0.078  0.293  -0.0003  6,034         1.34E+06
13    4676  -0.067  0.225  -0.079  0.300  -0.0003  6,375         6.34E+05
14    4694  -0.058  0.225  -0.079  0.246  -0.0002  3,925         1.16E+06
15    4703  -0.064  0.212  -0.075  0.280  -0.0002  7,969         1.40E+06
16    4524  -0.064  0.228  -0.076  0.278  -0.0003  5,639         8.25E+06
17    4787  -0.121  0.221  -0.081  0.466  -0.0002  2,516         6.12E+06
18    4713  -0.069  0.221  -0.079  0.307  -0.0002  5,692         1.09E+06
19    4510  -0.066  0.232  -0.079  0.296  -0.0003  5,279         6.51E+06
20    4655  -0.068  0.225  -0.079  0.301  -0.0003  8,907         1.36E+04
21    4711  -0.069  0.221  -0.077  0.300  -0.0002  5,359         6.54E+05
22    4599  -0.069  0.227  -0.078  0.303  -0.0003  5,226         1.27E+06
23    4665  -0.068  0.223  -0.078  0.299  -0.0003  8,958         1.74E+03
24    4662  -0.068  0.223  -0.078  0.300  -0.0003  5,424         3.51E+01
25    4667  -0.068  0.223  -0.077  0.298  -0.0003  4,518         3.17E+04
26    4593  -0.064  0.218  -0.074  0.280  -0.0003  6,333         1.01E+07
27    4559  -0.069  0.229  -0.078  0.303  -0.0003  5,323         4.03E+06
28    4649  -0.066  0.223  -0.078  0.294  -0.0003  6,432         2.00E+05
29    4665  -0.068  0.223  -0.078  0.301  -0.0003  5,916         3.09E+03
30    4727  -0.087  0.227  -0.088  0.403  -0.0002  6,743         5.49E+06
31    4715  -0.068  0.219  -0.077  0.295  -0.0003  4,828         5.44E+05
32    4669  -0.068  0.224  -0.079  0.301  -0.0003  6,505         1.56E+05
33    4653  -0.068  0.224  -0.078  0.299  -0.0003  5,262         2.19E+04
34    4642  -0.068  0.225  -0.078  0.297  -0.0003  5,506         1.47E+05
35    4700  -0.068  0.221  -0.078  0.300  -0.0002  5,268         4.07E+05
36    4702  -0.071  0.220  -0.082  0.325  -0.0002  6,080         1.02E+06
37    4639  -0.068  0.226  -0.077  0.297  -0.0003  5,149         9.69E+04
38    4718  -0.067  0.220  -0.075  0.285  -0.0003  5,431         1.54E+06
39    4745  -0.078  0.221  -0.082  0.349  -0.0002  6,025         1.65E+06
40    4722  -0.067  0.220  -0.075  0.284  -0.0003  5,401         1.64E+06
41    4604  -0.056  0.235  -0.064  0.207  -0.0005  7,046         7.38E+06
42    4698  -0.067  0.223  -0.076  0.290  -0.0003  5,861         1.15E+06
Average:                                                         2.07E+06
Table 6-12: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Private Housing

RJSEM ( β0 + β1 ⋅ bft + β2 ⋅ r )

Case  β0    β1      β2     Forecasted Y  MSQ
 1    4530  -0.008  0.057  3,935         2.41E+06
 2    4567  -0.008  0.052  3,784         4.77E+06
 3    4484  -0.007  0.055  4,504         1.53E+06
 4    4466  -0.007  0.058  4,430         2.09E+06
 5    4533  -0.007  0.057  3,353         8.83E+02
 6    4496  -0.008  0.087  5,744         1.27E+06
 7    4475  -0.007  0.058  4,480         1.27E+06
 8    4509  -0.007  0.057  4,520         2.25E+05
 9    4536  -0.007  0.056  4,466         8.91E+03
10    4531  -0.007  0.045  4,771         5.08E+05
11    4625  -0.010  0.069  2,997         9.64E+06
12    4533  -0.007  0.057  3,845         5.93E+04
13    4474  -0.007  0.056  4,471         1.99E+06
14    4470  -0.007  0.058  4,464         1.52E+06
15    4522  -0.007  0.057  4,514         4.95E+04
16    4509  -0.007  0.061  4,024         7.96E+05
17    4558  -0.007  0.056  4,566         2.35E+05
18    4532  -0.007  0.056  3,926         3.08E+04
19    4583  -0.008  0.056  4,595         9.98E+05
20    4490  -0.007  0.058  4,368         1.12E+06
21    4511  -0.007  0.058  4,213         6.41E+05
22    4516  -0.007  0.058  4,261         2.60E+05
23    4568  -0.007  0.055  4,321         1.35E+06
24    4535  -0.007  0.056  4,048         6.96E+03
25    4530  -0.007  0.056  4,531         3.55E+03
26    4541  -0.007  0.060  4,416         4.88E+05
27    4552  -0.007  0.056  4,472         1.99E+05
28    4551  -0.007  0.056  4,219         9.78E+05
29    4536  -0.007  0.056  4,037         1.91E+05
30    4495  -0.007  0.057  4,418         9.56E+05
31    4534  -0.007  0.055  3,890         2.68E+05
32    4586  -0.008  0.054  4,399         1.85E+06
33    4526  -0.007  0.054  4,648         1.10E+05
34    4551  -0.007  0.059  4,403         8.02E+05
35    4537  -0.007  0.057  4,259         9.87E+04
36    4552  -0.007  0.056  4,499         1.61E+05
37    4534  -0.007  0.054  3,856         6.24E+05
38    4523  -0.007  0.054  3,600         3.37E+05
39    4550  -0.007  0.054  4,067         1.01E+06
40    4523  -0.007  0.054  3,614         3.22E+05
41    4533  -0.007  0.055  3,820         2.76E+05
42    4530  -0.007  0.057  4,028         3.60E+04
43    4532  -0.007  0.055  3,802         9.83E+04
44    4572  -0.007  0.055  4,447         7.69E+05
45    4558  -0.007  0.054  4,153         1.11E+06
46    4601  -0.008  0.055  4,486         2.78E+06
47    4499  -0.007  0.058  4,377         6.62E+05
48    4542  -0.007  0.056  4,479         3.55E+04
49    4560  -0.007  0.056  4,490         4.01E+05
50    4532  -0.007  0.054  3,822         6.27E+05
Average:                                 9.59E+05
Table 6-13: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Nursing Homes

RJSEM ( β0 + β1 ⋅ r + β2 ⋅ nsptppt )

Case  β0    β1      β2     Forecasted Y  MSQ
 1    4541  -0.799  0.121  4,389         2.01E+04
 2    4424  -0.814  0.161  4,730         1.73E+06
 3    4619  -0.859  0.111  3,928         1.20E+06
 4    4604  -0.875  0.121  3,512         4.97E+05
 5    4210  -0.722  0.163  4,181         1.02E+06
 6    4566  -0.815  0.124  4,608         1.02E+05
 7    4540  -0.719  0.096  4,822         1.14E+06
 8    4278  -0.730  0.149  4,310         1.46E+06
 9    4527  -0.801  0.125  3,825         4.84E+03
10    4569  -0.810  0.123  4,434         2.06E+05
11    4480  -0.830  0.135  3,838         6.47E+05
12    4577  -0.842  0.122  3,133         6.30E+04
13    4548  -0.807  0.126  4,476         1.64E+05
14    4575  -0.740  0.112  3,901         1.16E+06
15    4719  -0.813  0.099  4,237         1.10E+06
16    4420  -0.788  0.139  4,082         3.05E+05
17    4509  -0.803  0.130  5,434         5.86E+03
18    4501  -0.752  0.122  3,184         7.23E+04
19    4585  -0.830  0.125  4,734         2.32E+05
20    4397  -0.760  0.137  4,391         4.63E+05
21    4776  -0.830  0.093  4,327         1.48E+06
22    4621  -0.835  0.125  4,520         1.62E+06
23    4496  -0.752  0.115  4,780         4.09E+05
Average:                                 6.57E+05
Table 6-14: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RJSEM for Schools

RJSEM ( β0 + β1 ⋅ r + β2 ⋅ n2fpt )

Case  β0    β1      β2     Forecasted Y  MSQ
 1    2391  -0.520  0.013  2,205         8.05E+03
 2    2379  -0.512  0.013  2,016         1.03E+04
 3    2379  -0.515  0.012  2,197         6.61E+04
 4    2386  -0.516  0.012  2,257         1.69E+04
 5    2413  -0.519  0.012  2,042         2.96E+04
 6    2415  -0.537  0.013  2,375         1.64E+04
 7    2453  -0.555  0.012  2,318         1.77E+05
 8    2372  -0.530  0.014  2,359         9.66E+04
 9    2332  -0.393  0.008  2,314         1.09E+06
10    2406  -0.524  0.012  2,123         8.67E+03
11    2372  -0.504  0.012  2,343         3.70E+04
12    2414  -0.531  0.013  2,209         7.20E+04
13    2316  -0.485  0.012  2,156         1.25E+06
14    2387  -0.556  0.014  1,910         1.94E+05
15    2386  -0.513  0.013  2,069         3.17E+04
16    2406  -0.602  0.017  2,640         4.36E+05
17    2432  -0.649  0.016  1,204         1.84E+05
18    2405  -0.517  0.012  1,977         1.32E+04
19    2511  -0.570  0.011  2,304         6.32E+05
20    2375  -0.527  0.013  2,063         1.08E+05
21    2417  -0.532  0.012  2,252         1.76E+04
22    2434  -0.506  0.011  2,039         2.07E+05
23    2361  -0.473  0.012  1,786         6.76E+04
Average:                                 2.07E+05
Table 6-15: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Offices

RASEM ( β0 + β1 ⋅ nspt + β2 ⋅ n2 + β3 ⋅ fpt + β4 ⋅ ppt )

Case  β0    β1     β2      β3      β4     Forecasted Y  MSQ
 1    2370  43.45  -1.892  -1.571  18.76  6,005         1.75E+06
 2    2359  45.68  -1.981  -1.450  16.88  5,576         7.34E+05
 3    2363  45.18  -1.961  -1.454  17.08  4,937         5.14E+05
 4    2356  45.81  -1.983  -1.441  16.76  5,520         6.09E+05
 5    2300  48.45  -2.107  -1.398  15.94  6,487         3.34E+06
 6    2025  47.18  -2.038  -1.688  19.90  2,394         2.84E+06
 7    2288  47.24  -2.057  -1.453  16.94  6,433         3.11E+06
 8    2395  42.12  -1.799  -1.468  17.60  7,068         4.31E+06
 9    2235  47.75  -2.025  -1.407  16.24  8,550         2.40E+05
10    2405  48.12  -2.059  -1.218  13.33  7,416         1.68E+06
11    2281  46.65  -2.024  -1.440  16.83  6,899         2.23E+04
12    2286  45.11  -1.955  -1.480  17.57  6,409         6.10E+05
13    2298  47.77  -2.061  -1.402  16.05  6,610         1.06E+06
14    2387  46.10  -1.982  -1.508  16.59  3,836         1.36E+06
15    2433  46.52  -1.989  -1.260  14.10  7,866         1.65E+06
16    2106  48.58  -2.091  -1.368  16.09  5,694         7.93E+06
17    2290  46.27  -2.004  -1.449  16.94  4,925         4.25E+03
18    2361  46.88  -2.029  -1.396  15.99  5,738         1.18E+06
19    2096  45.67  -1.983  -1.546  18.86  5,414         5.84E+06
20    2239  47.86  -2.076  -1.422  16.50  9,053         6.92E+04
21    2370  46.51  -2.016  -1.405  16.15  5,519         9.40E+05
22    2218  45.53  -1.976  -1.510  18.07  5,430         8.50E+05
23    2352  46.75  -2.015  -1.353  15.49  8,587         1.70E+05
24    2300  46.61  -2.019  -1.427  16.61  5,587         2.48E+04
25    2291  45.53  -1.963  -1.448  17.38  5,605         1.60E+06
26    2269  46.35  -2.005  -1.340  15.72  6,509         9.02E+06
27    2108  47.33  -2.061  -1.478  17.69  5,339         3.96E+06
28    2292  46.35  -2.008  -1.438  16.80  6,770         1.22E+04
29    2275  46.94  -2.034  -1.420  16.53  5,661         3.95E+04
30    2335  46.00  -2.001  -1.440  16.93  5,606         1.46E+06
31    2293  46.50  -2.015  -1.434  16.73  4,113         5.50E+02
32    2296  45.41  -1.973  -1.516  17.92  6,801         4.78E+05
33    2266  46.47  -2.013  -1.448  16.99  5,171         5.72E+04
34    2244  47.00  -2.035  -1.430  16.74  5,392         2.48E+05
35    2347  45.84  -1.987  -1.440  16.76  5,121         2.41E+05
36    2270  46.98  -2.035  -1.425  16.63  7,224         1.80E+04
37    2101  52.95  -2.439  -1.337  15.23  4,320         1.30E+06
38    2367  45.35  -1.982  -1.469  17.29  5,611         2.02E+06
39    2284  46.62  -2.020  -1.434  16.73  4,696         1.96E+03
40    2378  44.97  -1.963  -1.488  17.53  5,720         2.56E+06
41    2290  46.48  -2.015  -1.435  16.77  4,363         1.06E+03
42    2202  53.66  -2.347  -1.232  13.35  6,966         4.73E+06
Average:                                                1.63E+06
Table 6-16: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Private Housing

RASEM ( β0 + β1 ⋅ spt + β2 ⋅ fb + β3 ⋅ pb + β4 ⋅ fpt + β5 ⋅ sb )

Case  β0     β1    β2     β3      β4      β5      Forecasted Y  MSQ
 1    -6090  3757  0.617  -3.045  -0.142  -114.9  4,166         1.74E+06
 2    -5808  3666  0.632  -3.119  -0.165  -105.6  3,969         4.00E+06
 3    -6091  3748  0.603  -2.975  -0.125  -118.9  4,868         7.60E+05
 4    -8495  4625  0.633  -3.110  -0.116  -167.1  7,330         2.12E+06
 5    -6419  3873  0.606  -2.985  -0.121  -131.6  3,564         3.29E+04
 6    -6386  3863  0.514  -2.298  -0.126  -150.0  5,053         1.91E+05
 7    -6281  3813  0.600  -2.955  -0.117  -123.0  4,946         4.37E+05
 8    -6403  3861  0.605  -2.978  -0.120  -126.6  4,590         1.64E+05
 9    -6370  3861  0.612  -3.015  -0.128  -130.7  4,538         2.76E+04
10    -6369  3857  0.581  -2.836  -0.126  -131.7  5,292         3.67E+04
11    -6309  3836  0.454  -2.157  -0.127  -121.2  5,212         7.90E+05
12    -6208  3804  0.664  -3.268  -0.133  -122.0  4,995         8.22E+05
13    -6162  3782  0.608  -2.995  -0.128  -124.7  5,620         6.93E+04
14    -6258  3803  0.599  -2.949  -0.116  -122.3  4,974         5.24E+05
15    -7189  4168  0.629  -3.097  -0.134  -148.9  5,918         1.40E+06
16    -6508  3901  0.598  -2.948  -0.110  -136.2  3,724         3.51E+05
17    -6301  3846  0.618  -3.048  -0.137  -132.8  4,622         2.92E+05
18    -6373  3859  0.610  -3.005  -0.126  -129.8  3,790         1.51E+03
19    -6145  3791  0.617  -3.042  -0.138  -130.8  4,118         2.73E+05
20    -6251  3815  0.620  -3.069  -0.126  -134.9  5,148         7.65E+04
21    -6459  3890  0.611  -3.025  -0.126  -124.1  5,164         2.27E+04
22    -6544  3908  0.606  -2.984  -0.120  -125.6  4,042         5.32E+05
23    -6197  3799  0.543  -2.588  -0.130  -119.2  3,631         2.22E+05
24    -6249  3822  0.623  -3.071  -0.145  -122.6  3,518         3.76E+05
25    -6855  4009  0.599  -2.947  -0.106  -129.1  3,806         6.15E+05
26    -6368  3863  0.628  -3.110  -0.133  -135.9  3,451         7.07E+04
27    -6496  3898  0.608  -2.998  -0.123  -128.7  3,753         7.48E+04
28    -6291  3837  0.610  -3.006  -0.127  -134.9  4,014         6.16E+05
29    -6369  3860  0.610  -2.991  -0.129  -134.5  3,488         1.24E+04
30    -6114  3767  0.662  -3.302  -0.127  -151.2  4,623         5.97E+05
31    -6217  3801  0.676  -3.492  -0.124   -91.8  3,994         3.85E+05
32    -5997  3744  0.617  -3.046  -0.140  -132.3  3,964         8.56E+05
33    -6525  3902  0.608  -2.997  -0.124  -123.1  3,983         9.94E+05
34    -6329  3847  0.609  -3.003  -0.126  -132.6  3,955         2.01E+05
35    -6371  3859  0.610  -3.007  -0.127  -129.5  3,958         1.81E+02
36    -6601  3934  0.619  -3.043  -0.118  -146.3  3,707         1.53E+05
37    -6433  3880  0.611  -3.012  -0.127  -129.2  2,957         1.18E+04
38    -6380  3859  0.601  -2.964  -0.116  -133.9  3,536         2.67E+05
39    -6199  3804  0.609  -3.003  -0.127  -133.2  3,777         5.09E+05
40    -6392  3862  0.602  -2.965  -0.116  -133.6  3,509         2.14E+05
41    -6341  3848  0.607  -2.991  -0.123  -131.8  3,622         1.07E+05
42    -6486  3895  0.611  -3.012  -0.127  -126.7  3,722         2.46E+05
43    -6361  3855  0.609  -3.001  -0.125  -130.3  3,608         1.43E+04
44    -6237  3817  0.612  -3.020  -0.132  -129.6  3,797         5.16E+04
45    -6142  3785  0.611  -3.012  -0.131  -131.0  3,638         2.90E+05
46    -5902  3705  0.614  -3.031  -0.139  -128.7  3,441         3.88E+05
47    -6694  3952  0.600  -2.954  -0.111  -124.3  4,067         1.26E+06
48    -6355  3855  0.616  -3.069  -0.130  -117.2  4,541         6.25E+04
49    -6510  3918  0.615  -3.031  -0.129  -137.7  4,664         6.52E+05
50    -6383  3864  0.603  -2.972  -0.117  -136.5  3,856         6.81E+05
Average:                                                        4.92E+05
Table 6-17: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Nursing Homes

RASEM ( β0 + β1 ⋅ fpt + β2 ⋅ nsptppt )

Case  β0    β1      β2     Forecasted Y  MSQ
 1    4608  -0.920  0.125  4,423         1.15E+04
 2    4492  -0.929  0.162  4,690         1.62E+06
 3    4695  -0.989  0.113  3,914         1.23E+06
 4    4727  -1.056  0.123  3,283         8.74E+05
 5    4293  -0.832  0.162  4,262         8.65E+05
 6    4639  -0.938  0.127  4,613         1.05E+05
 7    4597  -0.828  0.100  4,892         9.94E+05
 8    4348  -0.845  0.152  4,315         1.45E+06
 9    4601  -0.934  0.129  3,704         3.62E+04
10    4640  -0.932  0.126  4,430         2.02E+05
11    4547  -0.939  0.136  3,944         4.88E+05
12    4664  -0.981  0.125  3,071         9.79E+04
13    4616  -0.927  0.129  4,449         1.43E+05
14    4642  -0.855  0.115  3,890         1.14E+06
15    4776  -0.929  0.103  4,195         1.02E+06
16    4487  -0.911  0.143  4,044         3.47E+05
17    4567  -0.930  0.138  5,494         1.88E+04
18    4564  -0.849  0.124  3,351         1.90E+05
19    4661  -0.957  0.128  4,754         2.51E+05
20    4468  -0.877  0.139  4,405         4.44E+05
21    4837  -0.948  0.096  4,297         1.40E+06
22    4687  -0.954  0.128  4,496         1.56E+06
23    4564  -0.869  0.119  4,803         3.80E+05
Average:                                 6.47E+05
Table 6-18: Coefficients, Forecasts and MSQs Determined by Leave-One-Out
Method for the RASEM for Schools

RASEM ( β0 + β1 ⋅ nspt + β2 ⋅ fpt )

Case  β0    β1      β2      Forecasted Y  MSQ
 1    1484  49.079  -0.208  2,282         1.73E+02
 2    1443  50.343  -0.185  2,023         1.16E+04
 3    1485  48.432  -0.205  2,272         3.30E+04
 4    1461  49.437  -0.201  2,153         5.47E+04
 5    1473  49.570  -0.206  1,840         8.99E+02
 6    1489  50.339  -0.223  2,466         4.79E+04
 7    1503  51.626  -0.244  2,452         3.08E+05
 8    1445  50.826  -0.192  2,225         3.14E+04
 9    1673  35.520  -0.195  2,417         8.84E+05
10    1497  49.233  -0.214  2,223         3.74E+04
11    1478  49.856  -0.211  2,596         3.76E+03
12    1488  49.902  -0.212  2,247         9.42E+04
13    1559  41.767  -0.195  2,298         9.48E+05
14    1395  52.655  -0.201  1,914         1.91E+05
15    1476  49.299  -0.199  2,008         1.37E+04
16    1391  55.184  -0.202  2,491         2.61E+05
17    1389  55.851  -0.245  1,271         1.31E+05
18    1480  49.330  -0.208  1,846         2.51E+02
19    1635  46.517  -0.272  2,293         6.15E+05
20    1415  51.450  -0.198  2,003         1.51E+05
21    1465  49.734  -0.202  2,061         3.42E+03
22    1576  45.224  -0.219  1,817         5.45E+04
23    1402  50.264  -0.135  1,918         1.54E+05
Average:                                  1.75E+05
Table 6-19: Signs of Coefficients for Selected Predictors

Model  Building type    Positive Coefficients        Negative Coefficients
RJSEM  Office           Constant, nsptppt and afp*   a2fp*, bft* and b2ft*
       Private Housing  Constant, r*                 bft*
       Nursing Home     Constant, nsptppt            r*
       School           Constant, n2fpt*             r*
RASEM  Office           Constant, nspt and ppt       n2 and fpt*
       Private Housing  spt and fb*                  Constant, pb, fpt* and sb
       Nursing Home     Constant, nsptppt            fpt*
       School           Constant, nspt               fpt*

Remark: * – Floor area related predictor
Table 6-20: Contributions of Floor Area Related Predictors to Responses

Columns (O = Office, PH = Private Housing, NH = Nursing Home, S = School):
Regressed JSEM  O:  (β1 ⋅ a2fp + β3 ⋅ bft + β4 ⋅ afp + β5 ⋅ b2ft)
                PH: (β1 ⋅ bft + β2 ⋅ r)
                NH: (β1 ⋅ r)
                S:  (β1 ⋅ r + β2 ⋅ n2fpt)
Regressed ASEM  O:  (β3 ⋅ fpt)
                PH: (β2 ⋅ fb + β4 ⋅ fpt)
                NH: (β1 ⋅ fpt)
                S:  (β2 ⋅ fpt)

      Regressed JSEM                     Regressed ASEM
Case  O        PH      NH      S        O        PH     NH      S
 1    -750     -596    -999    -175     -945     -422   -1,058  -187
 2    -431     -786    -847    -364     -598     -610   -966    -461
 3    -1,322   21      -1,366  -193     -1,853   -161   -1,474  -184
 4    -525     -36     -1,768  -140     -630     -62    -2,133  -178
 5    -707     -1,180  -332    -371     -679     -647   -333    -186
 6    -4,540   1,255   -534    -34      -6,077   6,718  -614    -110
 7    -1,395   6       -396    -141     -1,363   -20    -414    -135
 8    -330     11      -460    -42      -399     -45    -533    -311
 9    -9,242   -72     -1,274  -18      -2,293   -117   -1,485  -155
10    -2,137   236     -711    -289     -1,747   5,379  -800    -186
11    -2,173   -1,601  -1,187  -37      -2,113   5,953  -1,155  -108
12    -570     -688    -2,172  -198     -489     6,348  -2,335  -204
13    -688     -2      -738    -165     -425     -146   -848    -174
14    -15,890  -6      -1,132  -471     -11,830  -18    -1,223  -218
15    -1,165   -7      -779    -313     -1,171   -40    -891    -331
16    -238     -482    -737    237      -378     -529   -852    -207
17    -7,146   8       -477    -1,229   -4,839   -26    -553    -515
18    -435     -608    -1,865  -430     -474     -434   -1,766  -219
19    -291     13      -461    -211     -247     -52    -531    -146
20    -1,208   -122    -486    -313     -758     1,664  -552    -194
21    -423     -298    -712    -168     -367     310    -813    -100
22    -337     -254    -728    -396     -241     -162   -832    -213
23    -1,323   -248    -416    -587     -1,298   1,070  -480    -327
24    -583     -485    NA      NA       -517     -715   NA      NA
25    -7,349   1       NA      NA       -4,807   -17    NA      NA
26    -556     -122    NA      NA       -481     2,156  NA      NA
27    -510     -80     NA      NA       -428     -159   NA      NA
28    -524     -333    NA      NA       -629     -281   NA      NA
29    -639     -499    NA      NA       -968     -100   NA      NA
30    -1,074   -76     NA      NA       -2,305   1,205  NA      NA
31    -658     -644    NA      NA       -1,362   1,176  NA      NA
32    -540     -187    NA      NA       -487     -147   NA      NA
33    -217     123     NA      NA       -391     -238   NA      NA
34    -269     -147    NA      NA       -477     -303   NA      NA
35    -349     -278    NA      NA       -446     -319   NA      NA
36    -10,290  -53     NA      NA       -4,958   855    NA      NA
37    -10,140  -677    NA      NA       -2,801   -425   NA      NA
38    -1,113   -923    NA      NA       -1,557   -503   NA      NA
39    274      -483    NA      NA       -1,186   -295   NA      NA
40    -1,135   -910    NA      NA       -1,324   -529   NA      NA
41    -2,083   -711    NA      NA       -6,673   -427   NA      NA
42    -1,170   -501    NA      NA       -759     -309   NA      NA
43    NA       -732    NA      NA       NA       -439   NA      NA
44    NA       -125    NA      NA       NA       -80    NA      NA
45    NA       -404    NA      NA       NA       -250   NA      NA
46    NA       -115    NA      NA       NA       -156   NA      NA
47    NA       -121    NA      NA       NA       -106   NA      NA
48    NA       -63     NA      NA       NA       278    NA      NA
49    NA       -70     NA      NA       NA       -187   NA      NA
50    NA       -709    NA      NA       NA       -386   NA      NA

Remark: Numbers without a minus sign represent positive contributions to the
responses. NA - no case with this number for the building type.
Table 6-21: Contributions of Non-Floor Area Related Predictors to Responses

Columns (O = Office, PH = Private Housing, NH = Nursing Home, S = School):
Regressed JSEM  O:  (β2 ⋅ nsptppt)
                NH: (β2 ⋅ nsptppt)
Regressed ASEM  O:  (β1 ⋅ nspt + β2 ⋅ n2 + β4 ⋅ ppt)
                PH: (β1 ⋅ spt + β3 ⋅ pb + β5 ⋅ sb)
                NH: (β2 ⋅ nsptppt)
                S:  (β1 ⋅ nspt)
The RJSEMs for private housing and schools contain no non-floor area related
predictor.

      Regressed JSEM   Regressed ASEM
Case  O       NH       O       PH      NH     S
 1    1,856   847      4,580   10,678  873    985
 2    1,380   1,153    3,815   10,387  1,164  1,041
 3    1,750   675      4,427   11,120  693    971
 4    1,460   676      3,794   15,887  689    870
 5    2,187   303      4,866   10,630  302    553
 6    2,324   576      6,446   4,721   588    1,087
 7    2,956   678      5,508   11,247  709    1,084
 8    2,256   492      5,072   11,038  500    1,091
 9    14,431  572      8,608   11,025  588    899
10    4,948   576      6,758   6,282   590    912
11    4,128   545      6,731   5,568   552    1,226
12    1,984   728      4,612   4,855   742    963
13    2,387   666      4,737   11,928  681    913
14    15,121  458      13,279  11,250  471    737
15    4,431   297      6,604   13,147  310    863
16    1,353   399      3,966   10,761  409    1,307
17    4,875   1,402    7,474   10,949  1,480  397
18    1,414   548      3,851   10,597  553    585
19    1,060   610      3,565   10,315  624    804
20    5,460   480      7,572   9,735   489    782
21    1,071   263      3,516   11,313  273    696
22    964     627      3,453   10,748  641    454
23    5,616   700      7,533   8,758   719    843
24    1,345   NA       3,804   10,482  NA     NA
25    7,200   NA       8,121   10,678  NA     NA
26    2,296   NA       4,721   7,663   NA     NA
27    1,274   NA       3,659   10,408  NA     NA
28    2,307   NA       5,107   10,586  NA     NA
29    1,890   NA       4,354   9,957   NA     NA
30    3,090   NA       5,576   9,532   NA     NA
31    771     NA       3,182   9,035   NA     NA
32    2,376   NA       4,992   10,108  NA     NA
33    826     NA       3,296   10,746  NA     NA
34    1,133   NA       3,625   10,587  NA     NA
35    917     NA       3,220   10,648  NA     NA
36    11,668  NA       9,912   9,453   NA     NA
37    10,650  NA       5,020   9,815   NA     NA
38    1,826   NA       4,801   10,419  NA     NA
39    1,006   NA       3,598   10,271  NA     NA
40    1,814   NA       4,666   10,430  NA     NA
41    4,525   NA       8,746   10,390  NA     NA
42    2,333   NA       5,523   10,517  NA     NA
43    NA      NA       NA      10,408  NA     NA
44    NA      NA       NA      10,114  NA     NA
45    NA      NA       NA      10,030  NA     NA
46    NA      NA       NA      9,499   NA     NA
47    NA      NA       NA      10,867  NA     NA
48    NA      NA       NA      10,618  NA     NA
49    NA      NA       NA      11,361  NA     NA
50    NA      NA       NA      10,625  NA     NA

Remark: All numbers represent positive contributions to the responses.
NA - no case with this number for the building type.
6.2.5 Model Transformation
The regressed models with the logarithmic transformed variables can be expressed in
the form of Equation (5.19) in Chapter 5. The response and all of the predictors in
the regressed models were logarithmically transformed (in base e). The LRJSEM
and the LRASEM represent the transformed models for the RJSEM and the RASEM,
respectively. A key condition governs the logarithmic transformation: all of the
values of the transformed variables must be larger than zero.
Unfortunately, the predictors of two of the regressed models do not satisfy this
condition. As some of the office projects do not have podiums and private housing
projects do not have basements, certain predictors, including afp and a2fp in the
RJSEM for offices, and fb, sb and pb in the RASEM for private housing, cannot be
transformed. To fulfil the condition, these predictors were excluded from the
LRJSEM for offices and the LRASEM for private housing.
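
The positivity condition can be sketched as a simple screen applied before the transformation. The function names and the sample values below are hypothetical and are not taken from the project data; the sketch only shows how a predictor with a zero value (for example, a basement floor area of zero) would be set aside while the remaining predictors are transformed in base e.

```python
import math


def split_by_positivity(columns):
    """Return (transformable, excluded) predictor names.

    A predictor is transformable only if every one of its values is > 0.
    """
    keep = [name for name, values in columns.items() if all(v > 0 for v in values)]
    drop = [name for name in columns if name not in keep]
    return keep, drop


def log_transform(columns, names):
    """Natural-log transform of the listed predictors."""
    return {f"ln({name})": [math.log(v) for v in columns[name]] for name in names}


# Hypothetical values: two projects have no basement, so fb contains zeros.
predictors = {
    "fpt": [850.0, 1200.0, 640.0],
    "spt": [2.9, 3.1, 2.8],
    "fb": [0.0, 300.0, 0.0],
}
keep, drop = split_by_positivity(predictors)
transformed = log_transform(predictors, keep)
print(keep, drop)
```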
6.3 Performance Validation
6.3.1 Forecasting Results
To study whether the regressed models improve the performance of forecasts,
their performance was compared with that of the conventional models. The same
data for generating the regressed models were used to assess the performance of the
conventional models. Forecasted tender prices for the JSEM, the floor area model
and the cube model were calculated using Equations (6.1) to (6.3), respectively, as
follows:
\[
\hat{P} = \left\{
\left(2 - \frac{0.15}{2}\right) a \cdot fp + \frac{0.15}{2}\, a^{2} \cdot fp
+ \left(2 - \frac{0.15}{2}\right) b \cdot ft + \frac{0.15}{2}\, b^{2} \cdot ft
+ 0.15\, a \cdot b \cdot ft + r + (a + b) \cdot spt \cdot ppt
+ 2\, m \cdot fb + 2.5\, m \cdot pb \cdot sb
\right\} \cdot R
\tag{6.1}
\]
\[
\hat{P}' = \left( a \cdot fp + b \cdot ft + m \cdot fb \right) \cdot R'
\tag{6.2}
\]
\[
\hat{P}'' = \left( a \cdot fp \cdot sp + b \cdot ft \cdot st + m \cdot fb \cdot sb \right) \cdot R'' ,
\tag{6.3}
\]
where $\hat{P}$, $\hat{P}'$ and $\hat{P}''$ are the forecasted prices for the JSEM, the floor
area and cube models, respectively, and $R$, $R'$ and $R''$ are their corresponding
unit rates that are deduced by cross validation as described in section 5.9 of Chapter
5. The quantities measured, the cross-validated unit rates and the forecasted tender
prices for the three conventional models for offices, private housing, nursing homes
and schools are shown in Tables E-1 to E-4 in Appendix E. The forecasted prices
as shown in the tables were used to calculate the corresponding percentage errors for
the purpose of making comparisons with the regressed models.
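
The two simpler conventional models, Equations (6.2) and (6.3), can be sketched as follows. The sketch reads a, b and m as the numbers of podium, tower and basement storeys, fp, ft and fb as the corresponding floor areas per storey, and sp, st and sb as the corresponding storey heights; this reading, the sample building and the unit rates are assumptions made for illustration only, and the derivation of the cross-validated unit rates (section 5.9 of Chapter 5) is not reproduced.

```python
def floor_area_quantity(a, fp, b, ft, m, fb):
    """Total floor area measured by the floor area model, Equation (6.2)."""
    return a * fp + b * ft + m * fb


def cube_quantity(a, fp, sp, b, ft, st, m, fb, sb):
    """Total building volume measured by the cube model, Equation (6.3)."""
    return a * fp * sp + b * ft * st + m * fb * sb


def forecast_price(quantity, unit_rate):
    """Forecasted price = measured quantity x unit rate."""
    return quantity * unit_rate


# Hypothetical building: 3 podium storeys, 20 tower storeys, 1 basement storey.
area = floor_area_quantity(a=3, fp=2000.0, b=20, ft=1000.0, m=1, fb=2500.0)
volume = cube_quantity(a=3, fp=2000.0, sp=4.5, b=20, ft=1000.0, st=3.2,
                       m=1, fb=2500.0, sb=4.0)
price_area = forecast_price(area, unit_rate=5000.0)    # hypothetical rate per unit area
price_cube = forecast_price(volume, unit_rate=1400.0)  # hypothetical rate per unit volume
print(area, volume, price_area, price_cube)
```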
To assess the performance of the best subset of regressed models, their
forecasting results were compared with those that were obtained from the
conventional models. First of all, the forecasting errors and percentage errors of all
of the models were calculated. The forecasting errors for the various conventional
models and regressed models are shown in Tables F-1 to F-4 and the percentage
errors are shown in Tables F-5 to F-8 in Appendix F. Table 6-22 shows a summary
of the means and standard deviations of the percentage errors, which represent the
bias and consistency of all the models as extracted from the appendix, and the results
of the significance testing (p-values of the t-tests) for zero bias for all of the models.
As expected, the forecasted prices from the models that were generated by the
method of cross validation generally have very little bias, and most do not deviate
significantly from zero. The only exception is the JSEM for offices. This model
is significantly biased, and has the largest mean percentage error in absolute terms
(-6.88%) amongst all of the models. As bias alone is not informative enough to
distinguish the performance of the models, consistency becomes an important measure
in this study. Unlike the t-tests that are used for the comparison of means, parametric
tests for the homogeneity of variance are not robust to departures from normality,
as is explained in section 5.9.1 of Chapter 5. As parametric tests are preferable
to non-parametric tests, the distributions of errors (in terms of the ratio of
forecast to actual tender price) for all of the models were examined in order to
choose the appropriate tests.
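
The zero-bias test can be retraced from the summary figures alone. The sketch below assumes a one-sample, two-sided t-test on the percentage errors, takes the sample sizes (42 offices and 50 private housing projects) from the leave-one-out tables, and evaluates the Student t tail probability by numerical integration so that only the standard library is needed; the function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """P(|T| > |t|), integrating the density over [0, |t|] by Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * total * h / 3.0


def zero_bias_test(mean_pct, sd_pct, n):
    """t statistic and two-sided p-value for H0: mean percentage error = 0."""
    t = mean_pct / (sd_pct / math.sqrt(n))
    return t, two_sided_p(t, n - 1)


# Mean and SD of percentage errors for the JSEM, as reported in Table 6-22.
t_office, p_office = zero_bias_test(-6.88, 21.43, 42)
t_housing, p_housing = zero_bias_test(-2.73, 29.04, 50)
print(round(t_office, 2), round(p_office, 2))
print(round(t_housing, 2), round(p_housing, 2))
```

Under these assumptions the JSEM for offices gives a p-value below 0.05 and the JSEM for private housing a p-value of about 0.5, in line with the tabulated 0.04 and 0.51.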
Table 6-22: Summary of Means and Standard Deviations of Percentage Errors

Model / Measure                   Office   Private Housing  Nursing Home  School
JSEM
  Mean % error (m)                -6.88%   -2.73%           2.09%         4.08%
  SD of % error                   21.43%   29.04%           20.03%        21.25%
  p-value for t-test (H0: m=0)    0.04*    0.51             0.62          0.37
FLOOR AREA
  Mean % error (m)                5.62%    1.31%            4.20%         3.35%
  SD of % error                   27.32%   23.53%           24.45%        21.45%
  p-value for t-test (H0: m=0)    0.19     0.69             0.42          0.46
CUBE
  Mean % error (m)                0.16%    1.47%            5.75%         3.56%
  SD of % error                   26.99%   19.59%           25.21%        24.56%
  p-value for t-test (H0: m=0)    0.97     0.60             0.29          0.49
RJSEM
  Mean % error (m)                3.06%    4.84%            3.21%         3.41%
  SD of % error                   25.38%   22.64%           21.45%        20.84%
  p-value for t-test (H0: m=0)    0.44     0.14             0.48          0.44
  Predictors: Office: afp, a2fp, bft, b2ft, nsptppt; Private Housing: bft, r;
              Nursing Home: n2fpt, r; School: n2fpt, r
RASEM
  Mean % error (m)                2.96%    2.66%            3.09%         2.94%
  SD of % error                   22.15%   15.95%           21.36%        19.56%
  p-value for t-test (H0: m=0)    0.39     0.24             0.49          0.48
  Predictors: Office: n2, fpt, ppt, nspt; Private Housing: fpt, fb, spt, sb, pb;
              Nursing Home: fpt, nspt; School: fpt, nspt
LRJSEM
  Mean % error (m)                1.87%    2.27%            1.44%         2.14%
  SD of % error                   19.47%   21.14%           20.28%        19.64%
  p-value for t-test (H0: m=0)    0.54     0.45             0.74          0.61
  Predictors: Office: ln(bft), ln(b2ft), ln(nsptppt); Private Housing: ln(bft), ln(r);
              Nursing Home: ln(n2fpt), ln(r); School: ln(n2fpt), ln(r)
LRASEM
  Mean % error (m)                2.71%    1.68%            1.36%         2.07%
  SD of % error                   21.86%   17.60%           19.69%        20.06%
  p-value for t-test (H0: m=0)    0.43     0.50             0.74          0.63
  Predictors: Office: ln(n2), ln(fpt), ln(ppt), ln(nspt); Private Housing: ln(fpt), ln(spt);
              Nursing Home: ln(fpt), ln(nspt); School: ln(fpt), ln(nspt)

Remark: * - p-value < 0.05, H0 is rejected (i.e., Mean % error is significantly
different from zero)
6.3.2 Normality Testing
To use the parametric tests appropriately, the forecast to actual tender price
ratios should be normally distributed. If the ratios have to be transformed to fulfil
the normality requirement, then the ratios for all of the models under examination
should be transformed on the same basis. Therefore, all of the distributions of the
ratios for the three conventional models, together with the distribution of the ratios
for one of the regressed models (either with the untransformed variables or the
transformed variables for comparison), would have to pass the normality tests before
the parametric tests could be used to ascertain homogeneity of variance. The same
requirement would also have to be applied to the comparison between two regressed
models with untransformed variables and transformed variables.
Table 6-23 shows the p-values of the Anderson-Darling (A-D) tests for
normality. The ratios of forecast to actual tender price, rather than the percentage
errors, were used to produce the plots and to deduce the lambda values, to avoid
negative values, for which the logarithmic and square root transformations are
undefined. Seven distributions of the ratios of forecast to actual tender price were
found to depart significantly from normality at the 5% significance level. They
were from the floor area model and the LRASEM for offices, the JSEM and the floor
area and cube models for private housing, and the RJSEM and the RASEM for
nursing homes. To normalise these distributions, a transformation was carried out
using the Box-Cox normality plots, as is shown in Figures 6-1 to 6-7.
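The A-D statistic that underlies the p-values in Table 6-23 can be sketched as
follows. This is a minimal illustration using only the Python standard library, not
the procedure of the statistical package used in this research. It assumes that the
mean and standard deviation are estimated from the sample, and applies the
small-sample adjustment commonly attributed to Stephens, for which the 5% critical
value is about 0.752. The function name and the two samples are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Adjusted A-D statistic for normality with estimated mean and SD."""
    x = sorted(sample)
    n = len(x)
    m, s = mean(x), stdev(x)
    # Fitted normal CDF evaluated at each ordered observation.
    cdf = [NormalDist().cdf((v - m) / s) for v in x]
    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    # Small-sample adjustment for the case of estimated parameters.
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)


# Hypothetical samples: evenly spaced normal quantiles, and their exponentials
# (a right-skewed, lognormal-like sample of positive ratios).
normal_like = [NormalDist().inv_cdf((i - 0.5) / 30) for i in range(1, 31)]
skewed = [math.exp(z) for z in normal_like]
```

A larger statistic indicates a stronger departure from normality, so the skewed
sample yields a larger value than the normal-like sample.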
The best lambda (λ) values were determined from the normality plots and are
summarised in Table 6-24. If the best λ equals 1, then no transformation can further
normalise the distribution. If it equals 0.5, then a square root transformation is
suggested; if 0, then a logarithmic transformation is suggested; and if -1, then a
reciprocal transformation is suggested. As none of the lambda values for the
models of any particular building type matches any of the others, the ratios for all
of the models of the same building type were transformed according to the same
determined lambda value, with the exception of schools, because all of the school
models support the normality assumption. The transformed ratios for each distribution
were then subjected to the A-D tests again to assess the normality of all of the
distributions of the transformed ratios. Unfortunately, the various attempts to
transform the ratios for the groups of models under comparison in sections 6.3.3 and
6.3.4 failed to normalise their distributions. Therefore, non-parametric tests were
employed for the comparisons involving the seven models that failed to fulfil the
normality requirement.
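The choice of transformation from an estimated λ can be sketched as follows. This
is a minimal illustration, assuming that the best λ is the nearest of the four
conventional values named above (-1, 0, 0.5 and 1); the rounding rule is an
assumption, as it is not stated in this section, although it reproduces the best
λ-values reported in Table 6-24. The function names are hypothetical.

```python
import math

# Conventional lambda values named in the text: reciprocal, logarithm,
# square root and no transformation.
CANDIDATE_LAMBDAS = (-1.0, 0.0, 0.5, 1.0)


def best_lambda(estimated):
    """Nearest conventional lambda to the estimated value (assumed rule)."""
    return min(CANDIDATE_LAMBDAS, key=lambda c: abs(c - estimated))


def box_cox(ratio, lam):
    """Box-Cox transform of a positive forecast to actual price ratio."""
    if ratio <= 0:
        raise ValueError("the Box-Cox transform requires a positive value")
    if lam == 0:
        return math.log(ratio)
    return (ratio ** lam - 1.0) / lam


# Estimated lambda values as reported in Table 6-24.
estimated = [-0.03, 1.33, 0.48, 0.86, 1.32, -1.16, -0.83]
best = [best_lambda(value) for value in estimated]
```

Because the ratios of forecast to actual tender price are always positive, the
transform is defined for every observation, which is the reason given above for
working with ratios rather than percentage errors.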
Table 6-23: Results of Normality Tests for Percentage Errors According to Building and Model Types

Anderson-Darling Tests (p-value)
Model         Office     Private Housing   Nursing Home   School
JSEM          0.227      <0.005*           0.261          0.455
Floor Area    <0.005*    <0.005*           0.102          0.483
Cube          0.431      0.045*            0.550          0.243
RJSEM         0.580      0.765             0.013*         0.788
RASEM         0.728      0.133             0.022*         0.853
LRJSEM        0.312      0.473             0.092          0.930
LRASEM        0.015*     0.224             0.099          0.602
Remark: Figures marked * represent p-value < 0.05, H0 is rejected.
Figure 6-1: Box-Cox Plot of Percentage Errors for the Floor Area Model for Offices
Figure 6-2: Box-Cox Plot of Percentage Errors for the LRASEM for Offices
Figure 6-3: Box-Cox Plot of Percentage Errors for the JSEM for Private Housing
Figure 6-4: Box-Cox Plot of Percentage Errors for the Floor Area Model for Private Housing
Figure 6-5: Box-Cox Plot of Percentage Errors for the Cube Model for Private Housing
Figure 6-6: Box-Cox Plot of Percentage Errors for the RJSEM for Nursing Homes
Figure 6-7: Box-Cox Plot of Percentage Errors for the RASEM for Nursing Homes
Table 6-24: Estimated Lambda Values According to Building and Model Types (for Models not Satisfying Normality Assumption Only)

              Office              Private Housing     Nursing Home        School
Model         Estimated  Best     Estimated  Best     Estimated  Best     Estimated  Best
              λ-value    λ-value  λ-value    λ-value  λ-value    λ-value  λ-value    λ-value
JSEM          N.A.       N.A.     -0.03      0        N.A.       N.A.     N.A.       N.A.
Floor Area    1.33       1        0.48       0.5      N.A.       N.A.     N.A.       N.A.
Cube          N.A.       N.A.     0.86       1        N.A.       N.A.     N.A.       N.A.
RJSEM         N.A.       N.A.     N.A.       N.A.     1.32       1        N.A.       N.A.
RASEM         N.A.       N.A.     N.A.       N.A.     -1.16      -1       N.A.       N.A.
LRJSEM        N.A.       N.A.     N.A.       N.A.     N.A.       N.A.     N.A.       N.A.
LRASEM        -0.83      -1       N.A.       N.A.     N.A.       N.A.     N.A.       N.A.
Remark: N.A. stands for not applicable as the assumption of normality is supported.
6.3.3 Significance of Variable Transformation
As described in section 5.8.2 of Chapter 5, the variables were transformed
only in circumstances in which a transformed model would significantly improve the
performance of a forecast. According to the bias and consistency of the regressed
models in terms of the means and standard deviations of the percentage errors that
are shown in Table 6-22, the transformed model LRJSEM generally performed better
than its untransformed counterpart, the RJSEM. Although the transformed model
LRASEM also produced less biased forecasts than its untransformed counterpart, the
RASEM, the same did not apply to the measures of consistency. As the t-tests
support the hypothesis that each regressed model is unbiased, significance
testing of the consistency of each pair of models (transformed and
untransformed) becomes crucial in judging whether the models in a pair are
significantly different.
Two-sample F-tests for the homogeneity of variance were applied to the
percentage errors of every pair of the transformed and untransformed models except
the LRASEM and RASEM for offices, due to their failure to comply with the
normality assumption described in section 6.3.2. For the exception, a
Mann-Whitney U-test for homogeneity of absolute deviation was used. The results
of these tests are summarised in Table 6-25. None of the transformed models were
found to be significantly different from their untransformed counterparts.
According to the principle of parsimony, the regressed models RJSEM and RASEM
were selected for comparison with the conventional models.
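The statistic of the two-sample F-test can be sketched as follows. This is a
minimal illustration that computes only the variance ratio from the standard
deviations reported in Table 6-22; the p-values in Table 6-25 additionally depend on
the sample sizes and on the F distribution, which are not reproduced here. The
function name is hypothetical, and placing the larger variance in the numerator is
an assumed convention.

```python
def variance_ratio(sd_a, sd_b):
    """F statistic for two samples: larger variance over smaller variance."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


# SDs of percentage errors for the office RJSEM and LRJSEM (Table 6-22).
f_office = variance_ratio(0.2538, 0.1947)
```

A ratio close to 1 indicates similar consistency. Whether a given ratio is
significant depends on the degrees of freedom of the two samples.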
Table 6-25: Two-sample F-tests and Mann-Whitney U test between Regressed Models with Untransformed Variables and with Logarithmic Transformed Variables

p-value of significance tests
                                                   Office       Private Housing   Nursing Home   School
RJSEM & LRJSEM                                     0.09         0.63              0.79           0.78
  Statistics                                       F-test       F-test            F-test         F-test
  H0: No variance difference (reject if p < 0.05)  Accept H0    Accept H0         Accept H0      Accept H0
RASEM & LRASEM                                     0.68         0.80              0.71           0.91
  Statistics                                       U-test       F-test            F-test         F-test
  H0: No variance difference (reject if p < 0.05)  Accept H0*   Accept H0         Accept H0      Accept H0
Remark: * For the U-test, H0: No absolute deviation difference
6.3.4 Comparisons of Models
Eight groups of models were compared separately: four comprising the
RJSEM and the conventional models, and four comprising the RASEM and the
conventional models. The forecasting performance of the models under comparison
in this section is shown in Table 6-22. To distinguish the better performing model
or models in terms of their consistency, Kruskal Wallis (K-W) tests
(non-parametric) were first applied to the six groups of models for offices, private
housing and nursing homes, and Bartlett’s test (parametric) was applied to the two
groups of models for schools. For the groups in which the models were found to be
significantly different in consistency, multiple two-sample tests were then applied.
According to Fisher’s Least Significant Difference (LSD) approach, the corrected
confidence level for each pairwise test was 99.17% (i.e., a significance level of
0.0083).
Figure 6-8 shows a graphical presentation of the results of these tests. The
four groups of models for offices and private housing were found to be significantly
different, whereas the four groups for nursing homes and schools were not.
Therefore, the former groups were examined pairwise using Mann-Whitney
U-tests. The results of the U-tests for the four groups for offices and private
housing are shown in Table 6-26.
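The corrected significance level and the resulting pairwise decisions can be
sketched as follows. This is a minimal illustration for Group 1 (the office models
with the RASEM), using the p-values reported in Table 6-26; the variable names are
hypothetical. With four models there are six possible pairs, which is the source of
the divisor in 0.05/6.

```python
from itertools import combinations

models = ["JSEM", "Floor Area", "Cube", "RASEM"]
pairs = list(combinations(models, 2))  # every unordered pair of the four models
alpha = 0.05 / len(pairs)  # corrected significance level for each pairwise test
confidence = (1.0 - alpha) * 100.0  # corresponding confidence level in percent

# p-values of the Mann-Whitney U-tests for the office models (Table 6-26).
office_p_values = {
    ("JSEM", "Floor Area"): 0.0039,
    ("JSEM", "Cube"): 0.1473,
    ("JSEM", "RASEM"): 0.7678,
    ("Floor Area", "Cube"): 0.1855,
    ("Floor Area", "RASEM"): 0.0278,
    ("Cube", "RASEM"): 0.3710,
}
rejected = [pair for pair in pairs if office_p_values[pair] < alpha]
```

Only the JSEM and floor area pair falls below the corrected level. The floor area
and RASEM pair (p = 0.0278) would be rejected at the uncorrected 5% level but not
at the corrected one.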
Table 6-26: Two-sample Mann-Whitney U-tests between Models for Office and Private Housing

Mann-Whitney U-test (at 99.17%* significance level)
H0: No difference in absolute deviation (reject if p < 0.0083)

                                    Office                           Private Housing
Pair                                Z         p-value   Result       Z         p-value   Result
Common Comparisons for Both Groups
JSEM and Floor Area                 -2.8896   0.0039    Reject H0    -1.8544   0.0637    Accept H0
Floor Area and Cube                 -1.3240   0.1855    Accept H0    -1.6821   0.0926    Accept H0
Cube and JSEM                       -1.4493   0.1473    Accept H0    -3.0609   0.0022    Reject H0
Comparisons with RJSEM
JSEM and RJSEM                      -1.1988   0.2306    Accept H0    -2.4818   0.0131    Accept H0
Floor Area and RJSEM                -1.6103   0.1073    Accept H0    -1.1651   0.2440    Accept H0
Cube and RJSEM                      -0.1252   0.9003    Accept H0    -0.4481   0.6541    Accept H0
Comparisons with RASEM
JSEM and RASEM                      -0.2952   0.7678    Accept H0    -4.3707   0.0000    Reject H0
Floor Area and RASEM                -2.2007   0.0278    Accept H0    -3.0126   0.0026    Reject H0
Cube and RASEM                      -0.8946   0.3710    Accept H0    -1.7441   0.0811    Accept H0
Remark: * – 99.17% = (1 – 0.05/6) x 100%
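The Z and p-values in Table 6-26 can be related to each other with the following
sketch of a Mann-Whitney U-test on absolute deviations. It is a minimal
illustration rather than the procedure used in this research: it assumes that the
absolute deviations are taken from the sample mean of the percentage errors, uses
the normal approximation without a tie correction, and all function names are
hypothetical.

```python
import math
from statistics import NormalDist, mean


def absolute_deviations(errors):
    """Absolute deviation of each percentage error from the sample mean (assumed)."""
    centre = mean(errors)
    return [abs(e - centre) for e in errors]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def mann_whitney_z(a, b):
    """Normal-approximation Z for the U statistic of sample a (no tie correction)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    return (u1 - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)


def two_sided_p(z):
    """Two-sided p-value of a standard normal Z."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))
```

For example, `two_sided_p(-2.8896)` gives approximately 0.0039, which agrees with
the p-value reported for the JSEM and floor area pair for offices.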
Figure 6-8: Tests of Homogeneity of Variances Using Bartlett’s Tests, Kruskal Wallis Tests and Mann-Whitney U Tests
[Figure panels, by building type. Office (Groups 1 and 2) and Private Housing
(Groups 3 and 4): Kruskal Wallis tests, significant difference in each of the four
groups (p = 0.000, 0.009, 0.029 and 0.043 across the four panels), each followed by an
LSD Comparison of Sample Variances (by U-tests). Nursing Home: Kruskal Wallis tests,
no significant difference (Group 5, with RASEM: p = 0.653; Group 6, with RJSEM:
p = 0.642). School: Bartlett’s tests, no significant difference (Group 7, with RASEM:
p = 0.757; Group 8, with RJSEM: p = 0.857).]
6.3.4.1 Models for Offices
Compared with the range of bias for the other types of building, the range of
bias for the office models is the largest (-6.88% to 5.62%). Except for the t-test for
the JSEM, all of the t-tests for the office models supported the null hypothesis,
which suggests that the JSEM is the most biased model and that the others are all
unbiased models, and are therefore comparable with each other.
The ascending order of the sample variances is the JSEM, the RASEM, the
cube model and the floor area model in Group 1, and the JSEM, the RJSEM, the cube
model and the floor area model in Group 2. The Kruskal Wallis tests for both
groups of models, as shown in Figure 6-8, rejected the notion that the models under
comparison are equal in consistency. The LSD approach of multiple pairwise
comparisons using U-tests is illustrated diagrammatically in Figure 6-8. In Group 1,
the JSEM, the RASEM and the cube model have the same potency (i.e., they are not
significantly different in consistency), the RASEM and the cube and floor area
models also have the same potency, and the JSEM differs from the floor area model.
Therefore, the more consistent set of models for Group 1 comprises the three
comparable models: the JSEM, the RASEM and the cube model. Similarly, the more
consistent set of models for Group 2 comprises the JSEM, the RJSEM and the cube
model.
As the mean percentage error of the JSEM is significantly different from zero,
the best performing sets of models, taking into account both the bias and
consistency, are the RASEM and the cube model in Group 1, and the RJSEM and the
cube model in Group 2.
6.3.4.2 Models for Private Housing
All of the t-tests for the private housing models supported the null hypotheses
that the percentage errors of the models are not significantly different from a zero
mean.
The ascending orders of the sample variances are the RASEM, the cube
model, the floor area model and the JSEM in Group 3, and the cube model,
the RJSEM, the floor area model and the JSEM in Group 4. As with Groups 1 and 2,
the Kruskal Wallis tests for the models in Groups 3 and 4 both rejected the notion
that the models under comparison are equal in consistency, as is shown in Figure 6-8.
In particular, the RASEM in Group 3 attained a remarkably low standard deviation of
percentage errors (15.95%), which makes it the most consistent model in the group.
In this group, the RASEM and the cube model have the same potency,
the cube and floor area models have the same potency, the floor area model and the
JSEM have the same potency, both the RASEM and the cube model differ from the
JSEM, and the RASEM differs from the floor area model. Therefore, the more
consistent set of models for Group 3 comprises the two comparable models: the
RASEM and the cube model.
In Group 4, the cube model, the RJSEM and the floor area model have the
same potency, the RJSEM, the floor area model and the JSEM have the same
potency, and the cube model differs from the JSEM. Therefore, the more consistent
set of models for Group 4 comprises the three comparable models: the cube model,
RJSEM and floor area model.
6.3.4.3 Models for Nursing Homes
As with the private housing models, all of the t-tests for the nursing home
models supported the null hypotheses that the percentage errors of the models are not
significantly different from a zero mean.
The ascending orders of the sample variances are the JSEM, the RASEM, and the
floor area and cube models in Group 5, and the JSEM, the RJSEM and the floor area and
cube models in Group 6. Moreover, the Kruskal Wallis tests for the models in
Groups 5 and 6 supported the notion that the models under comparison are equal in
consistency. Therefore, all of the models are comparable with each other in terms
of both bias and consistency for both groups.
6.3.4.4 Models for Schools
As with the private housing and nursing home models, all of the t-tests for the
school models supported the null hypotheses that the percentage errors of the models
are not significantly different from a zero mean.
The ascending orders of the sample variances are the RASEM, the JSEM, and
the floor area and cube models in Group 7, and the RJSEM, the JSEM and the floor
area and cube models in Group 8. Moreover, Bartlett’s tests for the models in
Groups 7 and 8 supported the notion that the models under comparison are equal in
consistency. Therefore, all of the models are comparable to each other in terms of
both bias and consistency for both groups.
6.3.4.5 Discussions on model comparisons
Amongst the eight groups, the mean percentage errors, which represent the
bias of a model, are generally quite close to zero (-0.068 to 0.058), due to the use of
cross validation and the least-squares method for deducing the model unit rates and
coefficients. The range of standard deviations of percentage errors is quite narrow
(from 0.159 to 0.328), possibly because of the exclusion of the building component
costs, such as the foundation, building services, preliminaries and contingency, from
the original tender price in the forecasting target. These were excluded because of
the similar nature of the data collected, as multi-storey reinforced concrete buildings
in Hong Kong are very similar in terms of construction methods and specifications.
The coefficient of variation (cv) that represents the general accuracy was 20%
to 30% for the JSEM, 21% to 26% for the floor area model and 19% to 33% for the
cube model. These accuracy ranges generally fall within the ranges that were
reviewed by Skitmore and Patchell (1990), i.e. 15% to 30% for the JSEM, 20% to
30% for the floor area model and 20% to 45% for the cube model.
James used rather crude measures (i.e., multiplying the lowest rate by the
equal highest rate and the number of rates within a percentage group) and a small
sample size (i.e., 16 flats, 14 schools, 39 industrial buildings and 17 houses) to
conclude that the JSEM was a better model than the floor area and cube models.
When better-accepted and defined measures were used for comparison in this study,
such as bias and consistency, it was found that the conventional models are more
likely to be comparable.
The three conventional models were also compared with the RJSEM and the
RASEM separately for consistency. In a four-sample comparison, Groups 1 to 4
(the office and private housing models) were found to be significantly different in
their absolute deviation of percentage errors, whereas Groups 5 to 8 (the nursing
home and school models) were not found to be significantly different. One of the
possible causes for the lack of significant improvement in the regressed models for
nursing homes and schools is the insufficient number of candidate variables. In this
study, the number of candidates was greatly reduced in the regressed models for these
two building types because of the absence of podiums (for nursing homes and schools)
and basements (for schools), and because of the procedure of excluding candidates to
tackle the multicollinearity problem. Thus, the forecast performance could probably
be further improved by identifying and including more uncorrelated candidates in the
regressed models as more information becomes available when the design develops from
the early design stage to later stages.
Both the regressed and cube models were included in all of the best sets of
comparable models. However, the comparison results created ambiguities in
interpreting the models as some models, such as the RASEM and the cube model in
Group 1, the RJSEM and the cube model in Group 2, the cube model in Group 3 and
the RJSEM and floor area model in Group 4, show potency in two different sets of
comparable models. Nevertheless, it can be concluded from the LSD comparisons
that the use of the RASEM may improve the forecasts, and at least will not worsen
them.
The major concern of this research is forecasting accuracy. The evidence
shows that the forecasting models are more likely to be comparable in terms of
accuracy measures than uniquely outstanding. Hence, the hypothesis that the new
regressed models outperform the conventional forecasting models is rejected. The
type of information that is available in the early design stage is coarse and very
limited, which constricts the forecasting ability of any model, because a model can
only capture as much as the available information allows. It appears that even though
more information, such as the elevation area and the roof area, has been extracted and
used in the regressed models, the improvement is not significant enough to
distinguish them from the conventional models. Unless more information can be
brought into the modelling process or a less rigorous statistical inference is used to
distinguish the models, it is very difficult, if not impossible, to produce a
significantly outstanding model in the early design stage. As no single model
performed significantly better than all of the others, an alternative strategy of
combining forecasts is explored in the next section.
6.4 Combining Forecasts
There is a line of research concerning the combination of multiple individual
forecasts that are produced by different forecasting models. The literature of
forecasting suggests that when different models are similar in their forecasting
accuracy, an approach that combines the different forecasts may improve accuracy.
The concept of combining forecasts is based on the implicit assumption that different
forecasting models are able to capture different aspects of the information that is
available for forecasting without knowing the underlying process (Clemen 1989).
Armstrong (2001 p.428) summarised 30 studies on the combination of
forecasts, all of which show a certain amount of gain in accuracy, and on average
there was a 12.5% reduction in forecasting error. Although the regressed models in
this research did improve the forecasting accuracy on average, and fall within the
best cluster of models in the LSD approach, they are not distinguishably better than
the conventional models in terms of forecasting accuracy amongst the eight groups
of models that were examined. The approach of combining forecasts ensures a gain
in accuracy over the average performance of the models without risking the
performance of a single model.
Empirical studies on time-series forecasting methods suggest that correlations
between forecasts should be ignored in calculating the combination weights
(Newbold and Granger 1974; Makridakis and Hibon 1979; Makridakis et al. 1982,
1983). Clemen (1989) conducted a comprehensive review of the evidence, and
found equal weighting to be accurate for many types of forecasting. This evidence
leads to the conclusion made by Armstrong (2001 pp.419-424) that an equally
weighted combination of forecasts should be used when it is not certain which model
is best. Armstrong also suggests that an equal-weights rule is a reasonable starting
point, and that a trimmed mean is desirable if the combination contains five or more
models. Armstrong states that different weights should only be used if there is good
domain knowledge or information on which method is the most accurate.
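The equal-weights rule and the trimmed mean can be sketched as follows. This is a
minimal illustration with hypothetical function names and forecast figures. The
trimming rule shown, which drops the single highest and the single lowest forecast
when five or more are available, is an assumption, as the amount of trimming is not
specified above.

```python
from statistics import mean


def combine_equal_weights(forecasts):
    """Equally weighted combination: the simple average of the forecasts."""
    return mean(forecasts)


def combine_trimmed(forecasts):
    """Trimmed mean for five or more forecasts; otherwise the simple average."""
    if len(forecasts) < 5:
        return mean(forecasts)
    ordered = sorted(forecasts)
    return mean(ordered[1:-1])  # drop the lowest and the highest forecast


# Hypothetical price forecasts for one project from four and from five models.
four_model_forecast = combine_equal_weights([100.0, 110.0, 120.0, 130.0])
five_model_forecast = combine_trimmed([100.0, 110.0, 120.0, 130.0, 500.0])
```

With only four models in each group, as in this research, the trimmed mean does
not apply and the simple average is used.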
Two types of combinations were produced for the eight groups of models in
this research, as is shown in Figure 6-8. One combines the forecasts of the best sets
of models (C1), and the other of all of the models (C or C2) under comparison. The
equal-weight rule is applied to both types. The combined forecasts for the eight
groups of models are shown in Tables G-1 to G-8 in Appendix G, and the results
from these tables are summarised in Tables 6-27 to 6-34. Tables 6-27 to 6-34 also
show the average minimum and maximum percentage error forecasts. The former
takes the average of the percentage errors of the best forecasts for all cases, as shown
in the tables in Appendix H, and the latter takes the average of the worst. All
forecasts fall within the range between the average minimum and the average
maximum.
As expected, the accuracy of the combined forecasts shows an improvement
over that of the average models in every group. The accuracy (cv) gains from the
combination of forecasts over the average of the best sets of models were in the
range of 2.41% to 17.61%, and those over the average of all of the models were in the
range of 4.49% to 15.87%, with averages of 9.42% and 9.33%, respectively. This
improvement is slightly less than the average reduction of 12.5% in Armstrong’s
study. Except for the RASEM for private housing in Group 3 and the cube model for
private housing in Group 4, the C1 combined forecasts did show a gain in accuracy
(i.e., the negative values in the rows “C1 effect” in Tables 6-27 to 6-34). There are
a few more exceptional cases for the C or C2 combined forecasts (according to the rows
“C effect” or “C2 effect” in Tables 6-27 to 6-34). The C1 combined forecasts
produced the best forecasts in Groups 1 and 2, and the C combined forecast produced
the best forecast in Group 8. To conclude, the combined forecasts are more
accurate than the average forecasts, and are sometimes better than the best forecasts.
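The accuracy measures in Tables 6-27 to 6-34 can be related to each other with the
following sketch. The two formulas are inferred from the tabulated figures rather
than quoted from a definition in this section, so they should be read as
assumptions: the cv is taken as the standard deviation of the percentage errors
divided by one plus the mean percentage error, and an effect is taken as the
relative change in cv. The function and variable names are hypothetical.

```python
from statistics import mean


def coefficient_of_variation(mean_error, sd_error):
    """cv, assumed to be SD of % error divided by the mean forecast/actual ratio."""
    return sd_error / (1.0 + mean_error)


def combination_effect(cv_combined, cv_reference):
    """Relative change in cv; negative values indicate a gain in accuracy."""
    return (cv_combined - cv_reference) / cv_reference


# Group 1 figures for the JSEM and the C1 combined forecast (Table 6-27).
cv_jsem = coefficient_of_variation(-0.0688, 0.2143)
c1_effect_on_jsem = combination_effect(0.1962, 0.2301)

# cv gains (%) of the combined forecasts over the model averages, Groups 1 to 8.
gains_best_sets = [17.61, 17.56, 6.00, 2.41, 8.95, 8.83, 6.78, 7.20]
gains_all_models = [15.61, 15.87, 6.94, 4.49, 8.95, 8.83, 6.78, 7.20]
average_gains = (mean(gains_best_sets), mean(gains_all_models))
```

Under these assumptions the sketch gives a cv of about 23.01% for the office JSEM
and a C1 effect of about -14.7%, in line with Table 6-27, and the two average gains
come to about 9.42% and 9.33%.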
Table 6-27: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 1 Models

Group 1       JSEM     Floor Area  Cube     RASEM    Combined        Combined        Average     Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      Forecasts (C1)  Forecasts (C2)  (w, y & z)  (w, x, y & z)  (Out of Four  (Out of Four
                                                     (JSEM, Cube &   (All Four                                  Models)       Models)
                                                     RASEM)          Models)
Mean % Err:   -6.88%   5.62%       0.16%    2.96%    -1.25%          0.47%           -1.25%      0.47%          -1.71%        5.06%
SD % Err:     21.43%   27.32%      26.99%   22.15%   19.38%          20.65%          23.52%      24.47%         11.57%        35.80%
CV:           23.01%   25.87%      26.94%   21.51%   19.62%          20.56%          23.82%      24.36%         11.77%        34.07%
C1 Effect:    -14.72%  -24.14%     -27.17%  -8.77%   -               -               -17.61%     -              -             -
C2 Effect:    -10.67%  -20.54%     -23.70%  -4.43%   -               -               -           -15.61%        -             -
Table 6-28: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 2 Models

Group 2       JSEM     Floor Area  Cube     RJSEM    Combined        Combined        Average     Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      Forecasts (C1)  Forecasts (C2)  (w, y & z)  (w, x, y & z)  (Out of Four  (Out of Four
                                                     (JSEM, Cube &   (All Four                                  Models)       Models)
                                                     RJSEM)          Models)
Mean % Err:   -6.88%   5.62%       0.16%    3.06%    -1.22%          0.49%           -1.22%      0.49%          -4.29%        2.71%
SD % Err:     21.43%   27.32%      26.99%   25.38%   20.28%          21.27%          24.60%      25.28%         12.16%        36.23%
CV:           23.01%   25.87%      26.94%   24.62%   20.53%          21.16%          24.90%      25.16%         12.71%        35.27%
C1 Effect:    -10.79%  -20.64%     -23.81%  -16.62%  -               -               -17.56%     -              -             -
C2 Effect:    -8.03%   -18.19%     -21.45%  -14.04%  -               -               -           -15.87%        -             -
Table 6-29: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 3 Models

Group 3       JSEM     Floor Area  Cube     RASEM    Combined        Combined        Average     Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      Forecasts (C1)  Forecasts (C2)  (y & z)     (w, x, y & z)  (Out of Four  (Out of Four
                                                     (Cube & RASEM)  (All Four                                  Models)       Models)
                                                                     Models)
Mean % Err:   -2.73%   1.31%       1.47%    2.66%    2.07%           0.68%           2.07%       0.68%          0.63%         -1.28%
SD % Err:     29.04%   23.53%      19.59%   15.95%   16.70%          20.50%          17.77%      22.03%         13.26%        30.70%
CV:           29.86%   23.22%      19.30%   15.53%   16.36%          20.36%          17.41%      21.88%         13.18%        31.10%
C1 Effect:    -45.20%  -29.54%     -15.23%  +5.34%   -               -               -6.00%      -              -             -
C2 Effect:    -31.82%  -12.33%     +5.48%   +31.07%  -               -               -           -6.94%         -             -
Table 6-30: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 4 Models

Group 4       JSEM     Floor Area  Cube     RJSEM    Combined        Combined        Average     Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      Forecasts (C1)  Forecasts (C2)  (x, y & z)  (w, x, y & z)  (Out of Four  (Out of Four
                                                     (Floor Area,    (All Four                                  Models)       Models)
                                                     Cube & RJSEM)   Models)
Mean % Err:   -2.73%   1.31%       1.47%    4.84%    2.54%           1.22%           2.54%       1.22%          0.98%         0.28%
SD % Err:     29.04%   23.53%      19.59%   22.64%   21.39%          22.64%          21.92%      23.70%         16.57%        32.39%
CV:           29.86%   23.22%      19.30%   21.60%   20.86%          22.36%          21.38%      23.41%         16.41%        32.30%
C1 Effect:    -30.15%  -10.18%     +8.07%   -3.41%   -               -               -2.41%      -              -             -
C2 Effect:    -25.11%  -3.71%      +15.86%  +3.55%   -               -               -           -4.49%         -             -
Table 6-31: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 5 Models

Group 5       JSEM     Floor Area  Cube     RASEM    Combined Forecasts  Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      (All Four Models)   (w, x, y & z)  (Out of Four  (Out of Four
                                                     (C)                                Models)       Models)
Mean % Err:   2.09%    4.20%       5.75%    3.09%    3.78%               3.78%          3.57%         7.55%
SD % Err:     20.03%   24.45%      25.21%   21.36%   20.73%              22.76%         12.40%        30.00%
CV:           19.62%   23.47%      23.84%   20.72%   19.97%              21.93%         11.97%        27.90%
C Effect:     +1.80%   -14.90%     -16.22%  -3.63%   -                   -8.95%         -             -
Table 6-32: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 6 Models

Group 6       JSEM     Floor Area  Cube     RJSEM    Combined Forecasts  Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      (All Four Models)   (w, x, y & z)  (Out of Four  (Out of Four
                                                     (C)                                Models)       Models)
Mean % Err:   2.09%    4.20%       5.75%    3.21%    3.81%               3.81%          2.50%         7.61%
SD % Err:     20.03%   24.45%      25.21%   21.45%   20.77%              22.79%         12.38%        30.09%
CV:           19.62%   23.47%      23.84%   20.78%   20.01%              21.95%         12.08%        27.96%
C Effect:     +1.99%   -14.74%     -16.06%  -3.73%   -                   -8.83%         -             -
Table 6-33: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 7 Models

Group 7       JSEM     Floor Area  Cube     RASEM    Combined Forecasts  Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      (All Four Models)   (w, x, y & z)  (Out of Four  (Out of Four
                                                     (C)                                Models)       Models)
Mean % Err:   4.08%    3.35%       3.56%    2.94%    3.48%               3.48%          1.92%         6.85%
SD % Err:     21.25%   21.45%      24.56%   19.56%   20.23%              21.70%         15.04%        27.39%
CV:           20.41%   20.75%      23.72%   19.01%   19.55%              20.97%         14.75%        25.64%
C Effect:     -4.22%   -5.78%      -17.56%  +2.88%   -                   -6.78%         -             -
Table 6-34: Accuracy for Combined, Model Average, Minimum and Maximum Forecasts for Group 8 Models

Group 8       JSEM     Floor Area  Cube     RJSEM    Combined Forecasts  Average        Min % Error   Max % Error
              (w)      (x)         (y)      (z)      (All Four Models)   (w, x, y & z)  (Out of Four  (Out of Four
                                                     (C)                                Models)       Models)
Mean % Err:   4.08%    3.35%       3.56%    3.41%    3.60%               3.60%          2.18%         7.18%
SD % Err:     21.25%   21.45%      24.56%   20.84%   20.44%              22.02%         15.02%        27.40%
CV:           20.41%   20.75%      23.72%   20.16%   19.73%              21.26%         14.70%        25.56%
C Effect:     -3.35%   -4.93%      -16.82%  -2.12%   -                   -7.20%         -             -
6.5 Summary
Eight regressed models were built from two sets of variables, one for the
RJSEM and the other for the RASEM, for the four types of buildings – offices,
private housing, nursing homes and schools. By setting the tender price per total
floor area as the response, rather than the tender price, the average ratios of the
maximum actual response value to the minimum actual response value for each
building type were all reduced to around 2. This avoids the undue influence of very
large response values when modelling by the least-squares method, and improves the
accuracy of the regressed models.
Eight regressed models with different best subset variables were generated.
The selected predictors are height of the building above ground, square of the
number of storeys of the superstructure, average area per storey of the superstructure
and average perimeter on plan of the superstructure for the office RASEM (in Group
1); the number of podium storeys multiplied by the total podium floor area, elevation
area, total tower floor area and number of tower storeys multiplied by total tower
floor area for the office RJSEM (in Group 2); the average storey height of the
superstructure, average area per basement storey, average basement perimeter on
plan, average area per storey of the superstructure, and average basement storey
height for the private housing RASEM (in Group 3); the total tower floor area and
roof area for the private housing RJSEM (in Group 4); the average area per storey of
the superstructure and elevation area for the nursing home RASEM (in Group 5); the
roof area and elevation area for the nursing home RJSEM (in Group 6); the average
area per storey of the superstructure and height of building above ground for the
school RASEM (in Group 7); and the roof area and number of storeys of the
superstructure multiplied by the total floor area of the superstructure for the school
RJSEM (in Group 8).
In particular, the RJSEM and the RASEM for nursing homes are considered
to be very similar to each other, as they both contain the elevation area as one of
the predictors, and the other predictors – roof area in the RJSEM and average floor
area in the RASEM – are very similar in their observed values.
The predictors of all of the regressed models are divided into two types: floor
related (e.g., average floor area, total floor area and roof area) and non-floor related
(e.g., elevation area, storey height and number of storeys). The aggregate
contribution of the former type of predictors is generally negative (except for a few
cases in the office RJSEM and the private housing RASEM), whereas the latter type
exhibits solely positive aggregate contributions to all responses, which suggests that
the tender price per total floor area is negatively correlated to the floor area related
predictors, and is positively correlated to the non-floor area related predictors in
these models.
The variables in the regressed models, other than those that contain zero
observed values (i.e., those that relate to podiums or basements), were
logarithmically transformed to base e. The transformed models, the LRJSEM and the LRASEM,
were tested against the original regressed models, the RJSEM and the RASEM,
respectively. The LRJSEM is generally less biased and more consistent than the
RJSEM. However, the statistical tests suggest that the transformed models are not
significantly better than their original counterparts. Therefore, the original
regressed models, being the simpler models, were chosen for further comparison
with the conventional models.
The coefficients of variation (cv) that represent general accuracy were 20% to
30% for the JSEM, 21% to 26% for the floor area model and 19% to 33% for the
cube model. These accuracy ranges generally fall within the ranges that were
reviewed by Skitmore and Patchell (1990), i.e. 15% to 30% for the JSEM, 20% to
30% for the floor area model and 20% to 45% for the cube model.
As the regressed models and the unit rates for conventional models were
generated by cross validation, the percentage errors of all the models except the
JSEM for offices did not significantly depart from a zero mean, which suggests that
most of the models are generally unbiased, and the consistency of the models
would be a more influential indicator for distinguishing their performance.
Eight groups of models, each comprising one regressed model and the three
conventional models for the same type of building, were formed. The models in
each group were tested for homogeneity of variance, which is a measure of
whether the models are equally consistent. To accomplish this task, a combination
of parametric tests, such as Bartlett’s test, and non-parametric tests, such as the
Kruskal-Wallis test and the Mann-Whitney U test, was used. First, the models were
compared as a group. Only the models for offices and private housing (Groups 1 to 4)
were found to be significantly different in consistency. Within their own groups,
these models were then compared pairwise using Fisher’s Least Significant
Difference (LSD) approach. No single model was found to be significantly more consistent than
the others. The cluster of most consistent models in Group 1 comprises the JSEM,
the RASEM and the cube model; in Group 2 comprises the JSEM, the RJSEM and
the cube model; in Group 3 comprises the RASEM and the cube model; and in
Group 4 comprises the cube model, the RJSEM and the floor area model. Both the
regressed and cube models were included in every best cluster of models whose
consistency could not be distinguished statistically. Hence, the hypothesis that the new regressed models outperform
the conventional forecasting models is rejected.
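As an illustration of the homogeneity-of-variance check described above, the following minimal sketch computes Bartlett's test statistic for the percentage errors of the four models in one group. It uses only the Python standard library. The error values, the function name bartlett_statistic and the choice of a 5% significance level are assumptions made for illustration and are not taken from this study. The sketch covers only the parametric part of the procedure, not the non-parametric tests or the LSD comparisons.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's test statistic for equal variances across k samples."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variances (n - 1 divisor)
    n_total = sum(sizes)
    # Pooled variance, weighted by each sample's degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (
        sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    return numerator / correction


# Hypothetical percentage errors for the four models of one group.
errors_by_model = {
    "JSEM": [-22.0, 15.0, 30.0, -18.0, 5.0, -27.0, 24.0, -9.0],
    "floor area": [-25.0, 19.0, 28.0, -21.0, 8.0, -30.0, 26.0, -6.0],
    "cube": [-12.0, 9.0, 14.0, -10.0, 3.0, -15.0, 11.0, -2.0],
    "regressed": [-10.0, 8.0, 12.0, -9.0, 4.0, -13.0, 10.0, -3.0],
}

statistic = bartlett_statistic(list(errors_by_model.values()))

# Under the null hypothesis the statistic is approximately chi-square with
# k - 1 degrees of freedom; 7.815 is the 5% critical value for 3 degrees.
CHI2_CRITICAL_DF3_5PCT = 7.815
print(f"Bartlett statistic = {statistic:.3f}")
print("variances differ" if statistic > CHI2_CRITICAL_DF3_5PCT else "no evidence of a difference")
```

Bartlett's test is sensitive to departures from normality, which is one reason for pairing it with non-parametric tests as the text describes.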
A strategy to improve forecasting accuracy by combining forecasts is
proposed. This is considered to be particularly suitable for early stage forecasting,
because the available information is usually very limited at this stage. By combining
forecasts, the different aspects of information can be captured. The combined
forecasts are always more accurate than the average, and are sometimes better than
the best forecasts. In this study, the average accuracy gain from the combination of
forecasts from the best clusters of models was 9.42%, and from all of the models
over their averages was 9.33%.
Chapter 7 Conclusions
Mistake, error, is the discipline through which we advance.
William Ellery Channing
7.1 Introduction
Forecasting methods that are adopted in practice have been criticised for
lacking theoretical support and a proper evaluation of performance. By considering
the significance of early stage forecasts, this research focuses on the development of
cost models that improve forecasting performance. The primary aim of this work is
to develop forecasting models by a systematic and logical approach. The JSEM was
chosen for further development because the review found it to be the most
sophisticated conventional model applicable in the early design stage. The regressed
models developed in this study, which use the variables identified in the JSEM, were
expected to capture more of the information contained in sketch drawings (the only
key information available during the early design stage).
Evidence reveals that conventional forecasting methods, such as the floor
area and approximate quantities methods, are still the most widely used methods,
despite a number of alternative methods having been developed by researchers. To
promote the use of new cost models and forecasting approaches in practice, it is
crucial that practitioners appreciate the improved performance of these models and
approaches.
A secondary aim of this research is to test the hypothesis that the new
regressed models outperform the conventional models by testing the forecasting
accuracy of the former models against the latter models. A rigorous and objective
approach for comparing and examining the forecasting accuracy of the models
empirically, which has been usually overlooked in previous studies of model
development, is adopted.
7.2 Model Development
Eight regressed models were built from two sets of variables. The models
are the RJSEM and RASEM for offices, private housing, nursing homes and schools.
The performance of the forecasts was noticeably better when the tender price per
total floor area was used as the response, rather than the tender price, because the
influential effect of large tender figures was significantly reduced.
In the regression analysis, the forecasting errors for each model were
minimised by reducing the number of identified independent variables using the
cross validation approach. All of the models are found to comprise different
predictors. The only common predictor amongst the RASEMs is the average floor
area, but its effect on the response is not conclusive as its coefficients in various
RASEMs are different in their sign and magnitude.
The predictors of the eight regressed models can be divided into two types:
floor related (e.g., average floor area, total floor area and roof area) and non-floor
related (e.g., elevation area, storey height and number of storeys). The tender price
per total floor area is generally negatively correlated to the gross contributions of the
floor area related predictors, and positively correlated to the gross contributions of
the non-floor area related predictors in the models.
Following previous modelling studies on the improvement of forecasting
performance by transformation strategy, the variables in the best subset models were
logarithmically transformed. The statistical testing suggested that none of the
transformed models were significantly better than their original untransformed
counterparts, although there was some improvement in some cases.
To conclude, the cross validation algorithm developed in this study for
modelling JSEM’s variables is a significant contribution, as it advances
the model building process. Although the data, the observed values for the
candidates and the response, used in this study are only from four different types of
building projects, the developed methodology for modelling is also applicable to data
from other types of buildings as well as to other types of data. In using the cross
validation approach, both the regressed and conventional models are examined
simultaneously based on this criterion. It is found to be particularly suitable for the
problem of building price forecasting, because in practice, forecasters extract the
relevant information from a pool of historical projects to make a prediction, and the
sample base for modelling in the cross validation approach corresponds to that
relevant information. The difference, however, is that practicing forecasters rely
heavily on their judgment in choosing the data, the methods for forecasting and
deciding the relationship with the tender price. The cross validation approach has
considerable intuitive appeal because it produces forecasts in a similar way to
forecasters, but it also preserves objectivity. Using the cross validation approach
under the dual stepwise selection procedure provides an automatic means of
achieving variable parsimony.
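The leave-one-out form of cross validation referred to above can be sketched as follows. This is a minimal illustration in standard-library Python, not the algorithm developed in this study: it fits a single-predictor least-squares line rather than a stepwise-selected multiple regression, and the data values and function names (fit_line, loocv_percentage_errors) are hypothetical. Each project is withheld in turn, the model is refitted on the remaining projects, and the withheld project is forecast, so that every error is an out-of-sample error.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def loocv_percentage_errors(xs, ys):
    """Withhold each project in turn, refit, and forecast the withheld one."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        forecast = a + b * xs[i]
        errors.append(100.0 * (forecast - ys[i]) / ys[i])
    return errors


# Hypothetical sample: elevation area (predictor) and tender price per
# total floor area (response) for six projects.
elevation_area = [1200.0, 1500.0, 1800.0, 2100.0, 2600.0, 3000.0]
price_per_floor_area = [8200.0, 8900.0, 9300.0, 10100.0, 11000.0, 11900.0]

errors = loocv_percentage_errors(elevation_area, price_per_floor_area)
print(f"bias (mean error) = {mean(errors):.2f}%")
print(f"consistency (standard deviation) = {stdev(errors):.2f}%")
```

The same loop can be applied to a conventional single-rate model by replacing fit_line with the calculation of an average unit rate from the training projects, which is what allows regressed and conventional models to be compared on the same footing.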
7.3 Performance Validation
The coefficients of variation (cv) that represent the general accuracy using the
cross validated approach were 20% to 30% for the JSEM, 21% to 26% for the floor
area model and 19% to 33% for the cube model. These accuracy ranges generally
fall within the ranges that were reviewed by Skitmore and Patchell (1990), i.e., 15%
to 30% for the JSEM, 20% to 30% for the floor area model and 20% to 45% for the
cube model.
In James’ study, the JSEM was proved empirically to outperform the floor
and cube models. However, when bias and consistency are used as the accuracy
measures, James’ result is not supported. Models for the same type of building are
more likely to be comparable with each other, rather than superior or inferior.
As was anticipated from forecasts using the cross validation approach, the
models were generally unbiased, the only exception being the JSEM for offices.
There is no significant difference between the performance of the regressed models
(the RASEM or the RJSEM) and the three conventional models for the nursing home
and school samples. When the least significant difference (LSD) approach was
applied to make multiple comparisons amongst the models, both the regressed and
cube models appeared to fall into the best clusters of comparable models in the office
and private housing samples. Although all of the regressed models belong to the
best clusters of comparable models, the hypothesis that the new regressed models
outperform the conventional forecasting models is rejected.
The principle of parsimony is particularly important in the context of model
selection, as the number of possible models is unlimited for a given set of data. It is
logical to follow the principle that a more complicated model has to demonstrate its
benefits over a simpler model. The average and standard deviation of percentage
errors that are used in this research can give an indication, but need to be supported
by statistical inference to draw conclusions about the comparisons. The proposed
framework for comparisons in this research allows a fair judgment to be made.
The three major weaknesses in the development of cost models in previous
studies, that is, the lack of theoretical support, the deterministic emphasis of
approach, and the crude evaluation of new models, have been overcome in this
research. First of all, James’ storey enclosure model was simplified with reasonable
assumptions to fit a typical problem that can be solved by multi-linear regression.
The developed models share the common objective of minimising forecasting errors,
which is achieved mathematically by the use of the least-squares error approach for
selection of the variables and the determination of the coefficients. Secondly,
conventional approaches to cost models generally rely upon the use of historical
price data to produce single-figure (i.e., deterministic) building price forecasts, which
do not explicitly describe inherent variability and uncertainty. In the cross
validation approach that is used in this research, costs are modelled repetitively, and
the reliability of the models is measured according to the mean and standard
deviation of percentage errors (the stochastic components of forecasts). The
evaluation of the models was conducted with reference to a framework for the
selection of the appropriate parametric and non-parametric tests that were used to
examine the performance of the models. This framework is an exemplar that
ensures objectivity and rigour in the evaluation of models.
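For reference, the two accuracy measures named above can be written out as follows. The notation is introduced here for illustration only, and the sign convention (forecast minus actual) and the n - 1 divisor are assumptions rather than definitions quoted from this study. Here F_i is the cross validated forecast and A_i the actual tender price of project i, out of n projects.

```latex
e_i = 100 \times \frac{F_i - A_i}{A_i}, \qquad
\bar{e} = \frac{1}{n} \sum_{i=1}^{n} e_i, \qquad
s_e = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( e_i - \bar{e} \right)^2}
```

Under this notation, bias is the mean percentage error \bar{e} and consistency is the standard deviation s_e; a model is unbiased when \bar{e} does not differ significantly from zero, and more consistent when s_e is smaller.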
Compared with other forecasting regression models, the RASEM and RJSEM
gain an advantage over previously developed models in terms of the use of cross
validation for reliability analysis, which avoids the major problem of within-sample
validation and makes the best use of sample data; applicability, as the candidates and
predictors identified are extractable from existing cost analyses, which avoids the
subjective elements in defining and measuring qualitative variables; and the use of
statistical inference for comparing models, which provides a fair basis for the
assessment of model performance.
7.4 Combining Forecasts
A strategy to improve forecasting accuracy by combining forecasts is
proposed. This is considered to be particularly suitable for early stage forecasting,
as the information that is available at this stage is usually very limited. By combining
forecasts, the different aspects of information can be captured. The combined
forecasts are always more accurate than the average forecasts, and are sometimes
better than the best forecasts. In this research, the average accuracy gain from the
combination of forecasts over the averages was 9.42% for the best clusters of models
and 9.33% for all of the models together.
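A minimal sketch of the equally weighted combination is given below, in standard-library Python. The forecast figures and the names used are hypothetical and are not drawn from the study's data. The sketch also illustrates why a combined forecast cannot be less accurate than the average of the individual forecasts: the absolute value of a mean error never exceeds the mean of the absolute errors.

```python
from statistics import mean


def combine_equal_weights(forecasts):
    """Equally weighted combination: the simple average of the forecasts."""
    return mean(forecasts)


def abs_percentage_error(forecast, actual):
    """Absolute percentage error of a forecast against the actual price."""
    return abs(100.0 * (forecast - actual) / actual)


# Hypothetical forecasts of the price per floor area of one project.
actual = 10000.0
model_forecasts = {
    "JSEM": 11200.0,
    "floor area": 9100.0,
    "cube": 10900.0,
    "regressed": 9600.0,
}

combined = combine_equal_weights(list(model_forecasts.values()))
individual_errors = [
    abs_percentage_error(f, actual) for f in model_forecasts.values()
]
combined_error = abs_percentage_error(combined, actual)
average_error = mean(individual_errors)

print(f"combined forecast = {combined:.0f}")
print(f"error of combined forecast = {combined_error:.2f}%")
print(f"average error of individual forecasts = {average_error:.2f}%")
```

In this made-up case the over- and under-estimates partly cancel, so the combined forecast is closer to the actual price than the typical individual forecast. Whether it also beats the best single model depends on the data.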
7.5 Implications for Practice
Although the regressed models are not distinguishably better, they are
replicable, because they are backed by the cross validation approach; are easy to use,
because they involve only a few predictors; and are fairly accurate and reliable,
because they are comparable with other models within the best clusters. If a
regressed model is chosen for prediction, then it can, on average, produce forecasts
that are at least as good as the forecasts of any of the conventional models. For
certain applications, such as when the RASEM is used for private housing, the
chance of getting better forecasts is high.
In the framework for model comparison, the use of both parametric and
non-parametric tests with reference to a selection algorithm is adopted. The
proposed selection algorithm can be applied for the comparison of forecasting
models of any kind and amongst any number of models. The use of bias and
consistency together with significance testing ensures objectivity in judging the
models.
The alternative strategy of combining forecasts has been shown to be
practical and useful. This is particularly suitable for early stage forecasting, as
models that are used at this stage are all very simple in terms of the number of
measurements and the calculation that is involved. This approach is also
economical to use, as a simple equally weighted combination of forecasts can
improve forecasting accuracy. This applies both when the forecaster already
knows how the models have performed and when the forecaster does not. If the
forecaster can identify the better
models, then a reasonable application would be to combine the forecasts from these
models. If the better models cannot be identified, then all of the possible models
should be combined.
In practice, estimates are prepared by a number of forecasters within a
company. The combination strategy is mechanical, and avoids additional
uncertainties that are caused by the subjectivity of forecasting experts. However,
expert judgment can also be applied to combine forecasts that are produced from
different forecasters if such judgments are essential to that type of forecast.
With powerful computers and software, any modeller should be able to
follow the methodology that is detailed here to create and examine cost models, and
both experienced and inexperienced forecasters should be able to use the regressed
models that have been developed.
7.6 Model Limitations
No cost model can ever be a perfect representation of building prices, nor can
it produce forecasts with no errors. Cost models are limited by their underlying
assumptions, by their reliance on historical data for predicting future events, by the
insufficiency of information and preparation time and by their reliance on expert
judgment. Compared with the parametric approach to the development of regressed
models, there are fewer assumptions to be met in the cross validation approach.
Although the development of forecasting models is heavily influenced by the
sufficiency of data, the cross validation approach is relatively undemanding. In
terms of model building, the regressed models certainly take more time to construct
and to maintain compared with the conventional models. In terms of producing
forecasts, the regressed models are as easy to use as the conventional models, as only
a few predictors have to be measured. Subjective judgment is unavoidable, but the
proposed method for the development and testing of models is systematic and logical,
and is an effective way to avoid making unreasonable judgments. For instance, the
regressed models that are described in this research can be easily replicated by
modellers for forecasts with different sets of variables.
The models for nursing homes and schools do not differ significantly from one another.
The small number of samples that are included in these models (23 samples only)
and the relatively few variables that were identified because of the absence of
podiums and basements could be the causes of the potential bias in the results.
Given that the data were collected in a ten-year period and from a single practice, it
is apparent that further studies into the creation of models for nursing homes and
schools must seek to identify other potential variables. In this regard, further
information extractable as design develops from the early design stage to later stages
may help to address those variables.
This research shows only the significance of the developed models in terms
of statistics. To improve both the methodology and the combined forecast approach,
the practical significance of both should also be studied.
Previous studies have suggested that practitioners prefer to exercise their own
judgement in giving cost advice in order to demonstrate their expertise. Cost
models and forecasting approaches that complement the professional judgement of
practitioners are therefore likely to receive better recognition. Since this study
focuses on developing models based on the quantitative data extractable from sketch
drawings, the qualitative variables have not been incorporated. However, it would
be particularly worthwhile to develop the new approach of this research further by
incorporating qualitative variables that require the exercising of judgement. For
instance, in the cost models that are developed in this research, the quality of
buildings is implicitly assumed to be equal for each building type, and thus buildings
of significantly different quality were discarded from the data sample. Data on
options and intentions that represent the quality variable on a scale can be judged by
practitioners and could also be used for modelling. Furthermore, there are other
potential variables that demand expert judgement, and which could be added to the
model, including market condition, level of competition and the credit worthiness of
clients.
The combination of forecasts is limited by the three major criticisms that are
levelled at it – that it ruins traditional statistical procedures, that there is one
appropriate way to forecast, and that instead of combining forecasts, one should look
for a comprehensive model that incorporates all the relevant information.
7.7 Opportunities for Further Research
Further research could be undertaken by refining the models, or by
developing similar models for early stage forecasting. The models that are
presented here could be refined in the following ways, which are listed for the
sake of brevity.
1. Adding more relevant variables (e.g., finishing standard, building
provisions, etc.) to the models based on the information other than
sketch drawings.
2. Producing another set of interaction terms.
3. Using other variable transformations (e.g., reciprocal, square root).
4. Using other model functions (e.g., polynomial).
5. Using different types of buildings (e.g., shopping malls, public housing,
etc.).
6. Using all of the combinations of selection procedures.
7. Exploring the weighted least-squares approach.
8. Exploring other modelling techniques (e.g., artificial neural network
(ANN), fuzzy set theory, and genetic algorithm, etc.).
9. Exploring different approaches to determine the best weightings for the
combination of forecasts (e.g., trimmed means).
10. Comparing similar models that are generated from data from other
countries.
11. Comparing the use of different error measures on the performance of
forecasts.
As practical significance is as important as statistical significance, it would be
useful to assess other criteria, such as the comprehensibility and acceptability of the
regressed model. Moreover, as the quality of a designer’s forecast is related to the
way in which it is perceived by decision makers, it would be worthwhile not only to
focus on forecasting accuracy, but also on the satisfaction side of the forecasting task.
A sophisticated survey on how decision makers perceive the quality of forecasts
could also be a way of improving the forecasting function. The results of such a
study would draw the attention of both practicing forecasters and researchers to the
needs of decision makers.
Previous empirical studies on forecasting accuracy suggest that there is little
improvement, or even a decrease, in accuracy as a building project proceeds from the
early design stage to the detailed design stage. However, no study on forecasting
accuracy has ever been conducted for the same projects at different stages.
Although it would be very difficult to collect the required data, and there would be
complexities in classifying the stages of the different projects, this sort of study
would provide a clearer picture of the paradoxical results that have been thrown up
by other accuracy studies, which have found that using more information can
produce poorer forecasts.
Bibliography
Adeli, H. and Wu, M. (1998). "Regularization neural network for construction cost estimation." Journal of Construction Engineering and Management 24(1): 18-24.
Akintoye, A. S., Ajewole, O. and Olomolaiye, P. O. (1992). "Construction cost information management in Nigeria." Construction Management and Economics 10: 107-116.
Allman, I. (1988). Significant Items Estimating: Review a PSA Estimating System. Chartered Quantity Surveyor: 24-5.
Armstrong, J. S. (1985). LONG-RANGE FORECASTING From Crystal Ball to Computer, 2nd ed. A Wiley-Interscience Publication.
Armstrong, J. S. (2001). Principles of forecasting : a handbook for researchers and practitioners, Boston, MA : Kluwer Academic.
Armstrong, J. S. and Collopy, F. (1992). "Error Measures for Generalizing about Forecasting Methods: Empirical Comparisons." International Journal of Forecasting 8: 69-80.
Ashworth, A. (1999). Cost studies of buildings. 3rd ed. Harlow, England : Longman.
Ashworth, A. and Skitmore, R. M. (1983). Accuracy in estimating, The Chartered Institute of Building.
Association Industrial Consultants Limited and Business Operations Research Limited (1967). Report of the Joint Consulting Team on Serial Contracting for Road Construction, Ministry of Transport.
Atkin, B. (1987). A time/cost planning technique for early design evaluation. Building Cost Modelling and Computers. P. S. Brandon. London, E & F N Spon: 145-54.
Baker, M. J. (1974). Cost of Houses for the Aged. Department of Civil Engineering, Loughborough University of Technology.
Barnes, N. M. L. (1971). The design and use of experimental bills of quantities for civil engineering contracts, University of Manchester Institute of Science and Technology.
Barrett, A. C. (1970). Preparing a cost plan on the basis of outline proposals. Chartered Surveyor: 507-20.
Barrie, D. S. and Paulson, B. C. (1978). Professional Construction Management, McGraw-Hill Book Co., New York.
Bathurst, E. P. and Butler, D. A. (1977). Building Cost Control Techniques and Economics, Heinemann.
BCIS (1969). Standard form of cost analysis, The Royal Institution of Chartered Surveyors.
Beeston, D. T. (1974). One statistician's view of estimating, London: Building Cost Information Service.
Beeston, D. T. (1983). Statistical methods for building price data. London ; New York, E & F N Spon.
Beeston, D. T. (1987). A future for cost modelling. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 15-24.
Bennett, J. and Ferry, D. (1987). Towards a Simulated Model of the Total Construction Process. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 377-86.
Bennett, J. and Omerod, R. N. (1984). "Simulation Applied to Construction Projects." Construction Management and Economics 2: 225-63.
Bennett, J., Morrison, N. and Stevens, S. (1979). Construction cost data base - a report prepared on behalf of PSA by University of Reading, Department of Environment (Directorate of Quantity Surveying), Property Services Agency Library.
Bennett, J., Morrison, N. and Stevens, S. (1980). Construction cost data base - second annual report, Department of Environment (Directorate of Quantity Surveying), Property Services Agency Library.
Berny, J. and Howes, R. (1987). Project Management Control Using Growth Curve Models Applied to Budgeting, Monitoring and Forecasting Within the Construction Industry. Management Construction Worldwide. P. R. Landsley and P. A. Harlow. London, E & FN Spon. 1: Systems for Managing Construction: 304-13.
Birnie, J. (1995). The Possible Effects of Human Bias in Construction Cost Prediction. Proceedings of 11th Annual ARCOM Conference, University of York, September.
Blackhall, J. D. (1974). The Application of Regression Modelling to the Production of a Price Index for Electrical Services. Department of Civil Engineering, Loughborough University of Technology.
Bode, J. (1998). "Neural networks for cost estimation." Cost Engineering 40(1): 25-30.
Bon, R. (1989). Building as an Economic Process: An Introduction to Building Economics.
Bon, R. (2001). “The future of building economics: a note.” Construction Management and Economics 19: 255-258.
Boussabaine, A. and Elhag, T. (1999). Knowledge Discovery in Residential Construction Project Cost Data. ANNUAL CONFERENCE- ARCOM 1999 15TH.
Bowen, P. A. (1984). Applied econometric cost modelling. Proceedings, 3rd International Symp on Building Economics, CIB W-55, Ottawa.
Bowen, P. A. and Edwards, P. J. (1998). "Building Cost Planning and Cost Information Management in South Africa." International Journal of Procurement(June): 16-25.
Bowen, P. A. and Edwards, P. J. (1985a). "Cost Modelling and Price Forecasting: Practice and Theory in Perspective." Construction Management and Economics(3): 199-215.
Bowen, P. A. and Edwards, P. J. (1985b). A Conceptual Understanding of the Paradigm Shift in Cost Modelling Techniques Used in the Economics of Building. Durban, Department of Quantity Surveying and Building Economics, University of Natal.
Bowen, P. A., Wolvaardt, J. S. and Taylor, R. G. (1987). Cost Modelling: a Process-Modelling Approach. Building Cost Modelling and Computer. P. S. Brandon, E & F N Spon: 387-395.
Braby, R. H. (1975). Costs of high-rise buildings. Building Economist. 14: 84-6.
Brandon, P. S. (1978). A Framework for Cost Exploration and Strategic Cost Planning in Design. Building and Quantity Surveying Quarterly. 5: 60-3.
Brandon, P. S., Basden, A., Hamilton, I. W. and Stockley, J. E. (1988). Application of Expert System to Quantity Surveying, N B S Services Ltd.
Brandon, P. S., Ed. (1982). Building cost research: need for a paradigm shift? Building cost techniques: New Directions, E & FN Spon.
Brown, H. W. (1987). Predicting the Elemental Allocation of Building Costs by Simulation with Special Reference to the Cost of Building Service Elements. Building Cost Modelling and Computer. P. S. Brandon, E & F N Spon: 397-406.
Buchanan, J. S. (1969). Development of a Cost Model for the Reinforced Concrete Frame of a Building. Department of Civil Engineering, Loughborough University of Technology.
Buchanan, J. S. (1972). Cost Models for Estimating: Outline of the Development of a Cost Model for the Reinforced Concrete Frame of a Building. London, RICS.
Cartlidge, D. P. and Mehrtens, I. N. (1982). Practical cost planning: a guide for surveyors and architects. London, Hutchinson.
Chau, K. W. (1995). "Monte Carlo simulation of construction costs using subjective data." Construction Management and Economics 13: 369-83.
Chau, K. W. (1995). "The validity of the triangular distribution assumption in Monte Carlo simulation of construction costs: empirical evidence from Hong Kong." Construction Management and Economics 13(1): 15-21.
Cheong, P. F. (1991). Accuracy in design stage cost estimating, National University of Singapore.
Clark, W. and Kingston, J. (1930). The Skyscraper: A Study in the Economic Height of Modern Office Buildings, American Institute of Steel Construction, New York.
Clemen, R. T. (1989). "Combining forecasts: A review and annotated bibliography." International Journal of Forecasting 3: 379-391.
Coates, D. (1974). Estimating for French Drains - A Computer Based Model. Department of Civil Engineering, Loughborough University of Technology.
Connaughton, J. and Meikle, J. (1991). The Future Role of the Chartered Quantity Surveyor. London, The Royal Institution of Chartered Surveyors, Quantity Surveyors Division.
Cusack, M. M. (1985). Optimization of Time and Cost. Project Management. 3: 50-4.
Cusack, M. M. (1987). An Integrated Model for the Control of Costs, Duration and Resources on Complex Projects. Proceedings of the Fourth International Symposium on Building Economics, Section C: Resource Utilisation, Copenhagen, SBI.
de Neufville, R., Hani, E. N. and Lesage, Y. (1977). "Bidding model: effects of bidders' risk aversion." Journal of the Construction Division ASCE 103(CO1): 57-70.
Department of Environment (DOE) (1971). Local Authority Offices: Areas and Costs, DOE, London.
Diekmann, J. E. (1983). "Probabilistic estimating: mathematics and applications." Journal of Construction Engineering and Management, ASCE 109: 297-308.
Dreger, G. T. (1988). Cost Management Models for Design Application. Transactions of the American Association of Cost Engineers, Morgantown: AACE.
Drew, D. (1995). The Effect of Contract Type and Size on Competitiveness in Construction Contract Bidding. PhD Thesis. Department of Civil Engineering, University of Salford.
Dunican, P. (1960). Structural Steelwork and Reinforced Concrete for Framed Buildings. The Chartered Surveyor. August 1960: 74-77.
Edwards, A. W. F. (2001). 7 Occam's bonus. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 128-132.
Efron, B. (1982). The jackknife, the bootstrap, and other resampling plans. Philadelphia, Pa., Society for Industrial & Applied Math.
Ellis, C. and Turner, A. (1986). Procurement Problems. Chartered Quantity Surveyor: 11.
Emsley, M. W., Lowe, D. J., Duff, A. R. Harding A., and Hickson, A. (2002). “Data modelling and the application of a neural network approach to the prediction of total construction costs.” Construction Management and Economics 20(6): 465-472.
Fausett, L. V. (2002). Numerical methods using MathCAD. Upper Saddle River, N.J., Prentice Hall.
Ferry, D. J., Brandon, P. S. and Ferry, J. D. (1999). Cost planning of buildings, 7th ed. Blackwell Science.
Fine, B. (1980). Construction Management Laboratory, Fine, Curtis, Gross.
Fine, B. and Hackermar, G. (1970). Estimating and bidding strategy. Building Technology and Management: 8-9.
Fisher, R. A. (1922). "On the mathematical foundations of theoretical statistics." Philosophical Transactions of the Royal Society A 222: 309-368.
Flanagan, R. and Norman, G. (1978). The relationship between construction price and height. Chartered Surveyor B and QS Quarterly: 69-71.
Flanagan, R. and Norman, G. (1983). "The accuracy and monitoring of quantity surveyors' price forecasting for building work." Construction Management and Economics: 157-180.
Flanagan, R. and Tate, B. (1997). Cost control in building design: an interactive learning text. Oxford, Blackwell Science.
Forster, M. R. (2001). 5 The new science of simplicity. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 83-119.
Fortune, C. (1999). "Quality issues in building project price forecasting - factors affecting model selection." Journal of Construction Procurement 5(2): 129-140.
Fortune, C. and Hinks, J. (1997). Model Selection Criteria in Building Project Price Forecast. 13th Annual Conference of the Association of Researchers in Construction Management, Cambridge, Association of Researchers in Construction Management.
Fortune, C. and Hinks, J. (1998). "Strategic building project price forecasting models in use - paradigm shift postponed." Journal of Financial Management of Property and Construction 3(1): 3-26.
Fortune, C. and Lees, M. (1989). An investigation into Methods of Early Cost Advice for Clients, Salford College of Technology.
Fortune, C. and Lees, M. (1994). Early cost advice for clients: the practitioners' view. 10th Annual Conference, ARCOM, 1994.
Fortune, C. and Lees, M. (1996). The relative performance of new and traditional cost models in strategic advice and clients, The Royal Institution of Chartered Surveyors.
Fox, J. (1997). Applied regression analysis, linear models and related models. Thousand Oaks, Calif.: Sage Publications.
Garson, G. D. (2004). Structural Equation Modelling. [Internet] North Carolina State University. Available from: <http://www2.chass.ncsu.edu/garson/pa765/structur.htm> [Accessed 15 July 2004]
Gehring, H. and Narula, S. (1986). Project Cost Planning with Qualitative Information. Project Management. 4: 61-5.
Gould, P. R. (1970). The Development of a Cost Model for H&V and A. C. Installations in Buildings. Department of Civil Engineering, Loughborough University of Technology.
Goutte, C. (1997). “Note on free lunches and cross-validation.” Neural Computation 9(6): 1246-9.
Gray, C., Ed. (1982). Analysis of the Preliminary Element of Building Production Cost. Building cost techniques: New Directions, E & FN Spon.
Grinyer, R. H. and Whittaker, J. D. (1973). "Managerial judgement in a competitive bidding model." Operational Research Quarterly 24(2): 181-191.
Gunner, J. C. (1997). A Model of Building Price Forecasting Accuracy. Department of Surveying. Salford, University of Salford.
Gunner, J. C. and Skitmore, R. M. (1999). "Comparative analysis of pre-bid forecasting of building prices based on Singapore data." Construction Management and Economics 17(5): 635-646.
Hanscomb Associates (1984). Area Cost Factors, Report of the US Army Corps of Engineers, Hanscomb Associates Inc, 600 West Peachtree Street, NW, Suite 1400, Atlanta, Georgia 30308, USA.
Hardcastle, C. (1984). The relationship between cost communications and price prediction in the evaluation of building design. Proceedings 3rd Int Symp on Build Econ, CIB W-55, Ottawa.
Harvey, J. (1979). Competitive Bidding on Canadian Public Construction Contracts: Stochastic Analysis for Optimization, University of Ontario: 102.
Hettmansperger, T. P. and McKean, J. W. (1998). Robust Nonparametric Statistical Methods, London: Arnold.
Hillebrandt, P. M. (1985). Economic Theory and the Construction Industry, MacMillan, 2nd ed. London.
Holes, L. G. (1987). Holistic Resource and Cost Modelling. Building Cost Modelling and Computers. P. S. Brandon. London, E & F N Spon: 221-27.
Holes, L. G. and Thomas, R. (1982). General purpose cost modelling. Building Cost Techniques - New Directions. P. S. Brandon, E & F N Spon: 220-7.
Jaccard, J. and Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage Publications.
Jaggar, D., Ross, A., Smith, J. and Love, P. (2002). Building Design Cost Management, Blackwell Publishing.
James, W. (1954). "A New Approach to Single Price Rate Approximate Estimating." RICS Journal XXXIII (XI)(May): 810-24.
Jupp Mansfield & Partners (1981). "Reliability of detailed cost data for estimating during detailed design (unpublished)."
Karshenas, S. (1984). "Predesign Cost Estimating Method for Multistory Buildings." Journal of Construction Engineering and Management, ASCE 110(1): 79-86.
Kenley, R. and Wilson, O. D. (1986). "A construction project cash flow model – an idiographic approach." Construction Management and Economics 4: 213-232.
Kim, G. H., Yoon, J. E., An, S. H., Cho, H. H. and Kang, K. I. (2004). "Neural network model incorporating a genetic algorithm in estimating construction costs" Building and Environment 39(11): 1330-1340.
Khosrowshahi, F. (1988). Construction Project Budgeting and Forecasting. Transactions of the American Association of Cost Engineers, Morgantown: AACE.
Khosrowshahi, F. and Kaka, A. P. (1996). "Estimation of Project Total Cost and Duration for Housing Projects in the U.K." Building and Environment 31(4): 375-383.
Kiiras, J. (1987). NCCS, Normal Cost Control System for Finnish Building Projects. Proceedings of the Fourth International Symposium on Building Economics, Session B: Design Optimisation, Copenhagen, SBI.
Kleinbaum, D. G., L. L. Kupper, et al. (1998). Applied regression analysis and other multivariable methods. 3rd ed. Boston, Mass.: PWS-Kent.
Kouskoulas, V. and Koehn, E. (1974). "Predesign cost estimating function for buildings." Journal of Construction Division, ASCE: 589-604.
Langston, C. A. (1983). Computerised Cost Planning Techniques. The Building Economist. 21: 171-73.
Lehmann, E. L. (1959). Testing Statistical Hypotheses, New York: John Wiley.
Li, H. (1995). "Neural networks for construction cost estimation." Building Research and Information 23(5): 279-284.
Lu, Q. (1988). Cost estimation based on theory of fuzzy sets and prediction techniques - an expert system approach. Construction Contracting in China, Department of Civil and Structural Engineering, Hong Kong Polytechnic: 113-25.
MacCaffery (1978). Tender-price prediction for UK buildings - a feasibility study. Department of Civil Engineering, Loughborough University of Technology.
Makridakis, S. and Hibon, M. (1979). "Accuracy of Forecasting: An Empirical Investigation." Journal of the Royal Statistical Society Series A, Vol. 142(Part 2): 79-145.
Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E. and Winkler, R. (1982). "The accuracy of extrapolation (time series) methods: Results of a forecasting competition." Journal of Forecasting 1: 111-153.
Makridakis, S., Andersen, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E. and Winkler, R. (1983). The Forecasting Accuracy of Major Time Series Methods, London: Wiley.
Male, S. P. (1990). "Professional Authority, Power and Emerging Forms of "Profession" in Quantity Surveyor." Construction Management and Economics 8: 191-204.
Marr, K. F. (1974). Standards for Construction Cost Estimating. American Association of Cost Engineers, Transactions 1974.
Mathur, K., Ed. (1982). A Probabilistic Planning Model. Building cost techniques: New Directions, E & FN Spon.
Maver, T. (1970). "A Theory of Architectural Design in which the Role of the Computer is Identified." Building Science 4: 199-207.
Maver, T. (1979). Cost Performance Modelling. Chartered Quantity Surveyor. 2: 111-15.
Mayer, J. F. and Robinson, C. (1988). Accuracy of estimating - Summary of comparisons made between four estimating stages, Building Research Establishment, Department of Environment.
McCaffer, R. (1975). Some examples of the use of regression analysis as an estimating tool. Quantity Surveyor: 81-86.
McCaffer, R. (1976). Contractor's bidding behaviour and tender price prediction. Department of Civil Engineering, Loughborough University of Technology.
McCaffer, R., McCaffrey, M. J. and Thorpe, A. (1984). "Predicting the tender price of buildings during early stage design: method and validation." J. Opl Res. Soc. 35(5): 415-424.
McLachlan, G. J. (1987). Advances in multivariate statistical analysis. A. K. Gupta. Dordrecht, Holland, D. Reidel.
Meijer, R. F. (1987). Cost Modelling of Archetypes. Building Cost Modelling and Computers. P. S. Brandon. London, E & FN Spon: 223-31.
Meyrat, R. F. (1969). Algebraic Calculation of Cost Price. BUILD International: 27-36.
Moore, G. and Brandon, P. S. (1979). A Cost Model for Reinforced Concrete Frame Design. Chartered Quantity Surveyor. October 1979: 40-44.
Morrison, N. A. D. (1983). The Cost Planning and Estimating Techniques Employed by the Quantity Surveying Profession. Department of Construction Management, University of Reading.
Morrison, N. A. D. (1984). "The accuracy of quantity surveyors' cost estimating." Construction Management and Economics 2: 57-75.
Morton, R. and Jaggar, D. (1995). Design and the economics of building. London: E & FN Spon.
Moyles, B. F. (1973). An Analysis of the Contractors' Estimating Process. Department of Civil Engineering, Loughborough University of Technology.
Munns and Al, H. (2000). "Estimating using cost significant global cost models." Construction Management and Economics 18(5): 575-585.
Nadel, E. (1967). "Parameter cost estimates." Engineering News Record(16 Mar): 112-23.
Neale, R. H. (1973). The Use of Regression Analysis as a Contractor's Estimating Tool. Department of Civil Engineering, Loughborough University of Technology.
Newbold, P. and Granger, C. W. J. (1974). "Experience with forecasting univariate time series and the combination of forecasts." Journal of Royal Statistical Society Series A(137): 131-149.
Newton, S. (1983). Analysis of Construction Economics. School of Architecture and Building Science, University of Strathclyde.
Newton, S. (1988). Cost modelling techniques in perspective. Transactions 32nd Ann meeting of the AACE and the 10th Int cost engineering congress, New York, AACE.
Newton, S. (1990). "An agenda for cost modelling research." Construction Management and Economics.
Ogunlana, O. and T. Thorpe (1987). "Design phase cost estimating. The state of art." International Journal of Construction Management and Technology 2(4): 34-47.
Ogunlana, S. and Thorpe, A. (1991). "Factor affecting the accuracy of cost estimates: developing correct associations." Building and Environment 26(2): 77-86.
Park, R. E. (1988). Parametric Software Cost Estimation with an Adaptable Model. Transactions of the American Association of Cost Engineers, Morgantown: AACE.
Patchell, B. R. T. (1987). The implementation of cost modelling theory. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 233-42.
Pegg, I. (1984). Cost Study F38a: The effect of location and other measurable parameters on tender levels, Building Cost Information Service, Royal Institution of Chartered Surveyors: 13-27.
Pegg, I. (1987). Computerised Approximate Estimating from the BCIS On-line Data Base. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 243-9.
Pena, W. and Parshall, S. A. (2001). Problem seeking : an architectural programming primer. 4th ed. New York: Wiley.
Pitt, T. (1982). The Identification and Use of Spend Units in the Financial Monitoring and Control of Construction Projects. Building Cost Techniques - New Directions. P. S. Brandon, London: E & F N Spon: 255-62.
Popper, K. (1959). The Logic of Scientific Discovery, New York: Science Editions.
Powell, J. and Chisnall, J. (1981). Getting early estimates right. Chartered Quantity Surveyor: 279-281.
Proctor, C. J., Bowen, P. A., Le Roux, G. K. and Fielding, M. J. (1993). Client and Architect Satisfaction with Building Price Advice: An Empirical Study. CIB W55/W95 International Symposium on Economic Evaluation and the Built Environment, Lisbon.
Property Services Agency (PSA) (1977). Early cost advice (B & CE elements) - offices, sleeping quarters, HMSO.
Property Services Agency (PSA) (1987). Significant items estimating. Croydon: Property Services Agency.
Quah, L. K. (1992). "Comparative variability in tender bids for refurbishment and new build work." Construction Management and Economics 10: 263-69.
Raftery, J. (1984a). Some problems of data collection and model validation. Paper for Research Seminar, Liverpool Polytechnic.
Raftery, J. (1984b). Models in building economics: a conceptual framework for the assessment of performance. Proceedings 3rd International Symposium on Building Economics, Ottawa, CIB W-55.
Raftery, J. (1987). The state of cost/price modelling in the construction industry: a multicriteria approach. Building Cost Modelling and Computers. P. S. Brandon, E & F N Spon: 49-71.
Raftery, J. (1991). Principles of Building Economics, BSP Professional Books.
Raftery, J. (1995). Property and Construction Economics as the Study of Human Behaviour in Exchange. Keynote Address to the International Conference on Financial Management of Property and Construction, Newcastle, Co. Down, Northern Ireland, 165-75.
Ray-Jones, A. and D. Clegg (1976). CI/SfB construction indexing manual. 3rd ed. London: RIBA Publications.
Regdon, G. (1972). "Pre-determination for Housing Cost." BUILD International (March/April 1972): 94-99.
Ross, E. (1983). A database and computer system for tender price prediction by approximate quantities, Loughborough University of Technology.
Royal Institute of British Architects (RIBA) (1991). Architect's Handbook of Practice Management, RIBA Publications.
Royal Institution of Chartered Surveyors (RICS) (1992). The core skills and knowledge base of the quantity surveyor, The Royal Institution of Chartered Surveyors.
Royal Institution of Chartered Surveyors (RICS) Junior Organisation (1964). The effect of shape and height on building cost. The Chartered Surveyor.
Runeson, G. and Bennett, J. (1983). "Tendering and the price level in the New Zealand building industry." Construction Papers 2(2): 29-35.
Russell, A. D. and Choudhary, K. T. (1980). "Cost optimisation of buildings." Journal of Structural Division, American Society of Civil Engineers(January): 283-300.
Schofield, D., Raftery, J. and Wilson, A. (1982). An Economic Model of Means of Escape Provision in Commercial Buildings. Building Cost Techniques: New Directions. P. S. Brandon. London, E & FN Spon: 210-220.
Seeley, I. H. (1996). Building economics : appraisal and control of building design cost and efficiency. 4th ed. Houndmills, Basingstoke, Hampshire: Macmillan.
Selinger, S. (1988). Computerized Parametric Estimating. British-Israeli Seminar on Building Economics, Haifa: The Building Research Station: 160-67.
Sidwell, A. C. and Woottoon, A. H. (1984). Operation Estimating. Organizing and Managing Construction. V. K. Handa. Ontario, University of Waterloo. 3: Developing Countries Research: 1015-20.
Sierra, J. E. E. (1982). "A Statistical Analysis of Low Rise Office Accommodation Investment Packages." The Building Economist(March): 175-78.
Simon, H. A. (2001). 3 Science seeks parsimony, not simplicity: searching for pattern in phenomena. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 32-72.
Singh, S. (1989). Computer Model for Cost Estimation of Structures in High Rise Commercial Buildings. Proceedings of the 25th Annual Conference, Associated Schools of Construction, 1989.
Skitmore, M. (1981). Bidding dispersion - an investigation into a method of measuring the accuracy of building cost estimates. Department of Civil Engineering, University of Salford.
Skitmore, M. (1982). A Bidding Model. Building Cost Techniques - New Directions. P. S. Brandon, London: E & F N Spon: 175-8.
Skitmore, M. (1992). "Parameter prediction for cash flow forecasting models." Construction Management and Economics 10: 397-413.
Skitmore, M. (2002). "Raftery curve construction for tender price forecast." Construction Management and Economics 20: 83-89.
Skitmore, M. and Drew, D. (2003). "The analysis of pre-tender building price forecasting performance: a case study." Engineering, Construction and Architectural Management 10(1): 36-42.
Skitmore, M., Stradling, S., Tuohy, A. and Makwezalamba, H. (1990). The Accuracy of Construction Price Forecasts: A Study of Quantity Surveyors' Performance in Early Stage Estimating, The University of Salford.
Skitmore, R. M. (1985). The influence of professional expertise in construction price forecasts, Department of Civil Engineering, University of Salford.
Skitmore, R. M. (1988). Fundamental research in building and estimating. Transactions CIB British-Israeli seminar on building economics, Haifa, Israel Building Research Station.
Skitmore, R. M. (1991). Early Stage Construction Price Forecasting - A Review of Performance, The Royal Institution of Chartered Surveyors.
Skitmore, R. M. and Marston, V. K. (1999). Cost Modelling. London, E & FN Spon.
Skitmore, R. M. and Patchell, B. R. T. (1990). Development in contract price forecasting and bidding techniques. Quantity Surveying Techniques: New Directions. P. S. Brandon, Blackwell Scientific: 75-120.
Skitmore, R. M. and Tan, S. H. (1988). Factors affecting the accuracy of engineers' estimates. Transactions 10th International Cost Engineering Congress, The American Association of Cost Engineers, paper B-3, American Association of Cost Engineers, Morgantown.
Sober, E. (2001). 2 What is the problem of simplicity? Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 13-31.
Southwell, J. (1971). Building Cost Forecasting, Royal Institution of Chartered Surveyors.
Spanos, A. (2001). 11 Parametric versus non-parametric inference: statistical models and simplicity. Simplicity, inference and modelling : keeping it sophisticatedly simple. A. Zellner, H. A. Keuzenkamp and M. McAleer. Cambridge, Cambridge University Press: 181-206.
Spooner, J. E. (1974). "Probabilistic estimating." Journal of Construction Division, ASCE: March.
Sprent, P. (1993). Applied Non-Parametric Statistical Methods, Chapman & Hall.
Stangl, W. (1997). Ockham's razor also Occam's razor. "Pluralitas non est ponenda sine necessitate". [Internet] Johannes Kepler University (JKU), Linz, Austria. Available from: <http://paedpsych.jk.uni-linz.ac.at/INTERNET/ARBEITSBLAETTERORD/PHILOSOPHIEORD/Occam.html> [Accessed 15 July 2004]
Steyert, R. S. (1972). The Economics of High Rise Apartment Buildings of Alternate Design Construction, Construction Research Council, American Society of Civil Engineers.
Stone, P. (1963). Housing, Town Development Land and Costs. London.
Taylor, R. G. (1984). A critical examination of quantity surveying techniques in cost appraisal and tendering within the building industry. Department of Quantity Surveying and Building Economics, University of Natal.
Tan, S. H. (1988). An investigation into the accuracy of cost estimates during the design stages of construction projects. Department of Civil Engineering, The University of Salford.
Tan, W. (1999). "Construction cost and building height." Construction Management and Economics 17(2): 129-32.
Thng, S. H. (1989). Estimating accuracy at public sector - housing development board, National University of Singapore.
Thompson, P. A. and Willmer, G. (1985). CASPAR - A Program for Engineering Project Appraisal and Management. Proceedings, 2nd International Conference on Civil and Structural Engineering Computing, London.
Thomsen, C. (1965). "How high to rise." AIA Journal(April 1965): 66-68.
Thomsen, C. (1966). "How high to rise." Appraisal Journal 34(4): 585-91.
Townsend, P. R. F. (1978). The effect of design decisions on the cost of office development. Chartered Surveyor - Building and Quantity Surveying Quarterly: 53-6.
Tregenza, T. (1972). "Association between building height and cost." Architects' Journal 156(44): 1031-2.
Turney, P. D. (1990a). "The curve fitting problem - a solution." British Journal for the Philosophy of Science 41: 509-530.
Tversky, A. and Kahneman, D. (1974). "Judgement Under Uncertainty: Heuristics and Biases." Science 185: 1124-1131.
Walker, D. H. T. (1988). Using Spreadsheets for Simulation Studies. The Building Economist. 27: 14-5.
Wall, D. M. (1997). "Distribution and correlations in Monte Carlo simulation." Construction Management and Economics 15: 241-58.
Warszawski, A. (2003). “Parametric analysis of the financing cost in a building project.” Construction Management and Economics 21(5): 447-459.
Weight, D. (1987). Patterns Cost Modelling. Building Cost Modelling and Computers. P. S. Brandon. London, E & FN Spon: 257-66.
Wilderness Group (1964). An investigation into building cost relationships of the following design variables: storey height, floor loading, column spacing, number of storeys, The Royal Institution of Chartered Surveyors: 253-71.
Wilson, A. J. and Templeman, A. B. (1976). "An Approach to the Optimal Thermal Design of Office Buildings." Building and Environment 11(1): 39-50.
Wilson, A. J., Ed. (1982). Experiments in probabilistic cost modelling. Building cost techniques: New Directions, E & FN Spon.
Wilson, O. D., Sharpe, K. and Kenley, R. (1987). "Estimates Given and Tenders Received." Construction Management and Economics 5(3).
Woodhead, W. D., Rahilly, M., Salomonsson, G. D. and Tait, R. (1987). An Integrated Cost Estimating System for House Builders. Proceedings of the Fourth International Symposium on Building Economics, Session B: Design Optimisation, Copenhagen, SBI.
Yokoyama, K. and Tomiya, T. (1988). The Integrated Cost Estimating Systems Technique for Building Costs. Transactions of the American Association of Cost Engineers, Morgantown: AACE.
Zahry, M. (1982). Capital cost prediction by multi-variate analysis. School of Architecture, University of Strathclyde.
Zellner, A., Keuzenkamp, H. A. and McAleer, M. (2001). Simplicity, inference and modelling : keeping it sophisticatedly simple. Cambridge, Cambridge University Press.
Appendix D: Forecasts by Cross Validation Using Conventional Models
Table D-1: Forecasts by Cross Validation Using the Conventional Models for Offices
Case  Quantity James (m²)  Quantity Floor (m²)  Quantity Cube (m³)  Rate James ($/m²)  Rate Floor ($/m²)  Rate Cube ($/m³)  Forecasted Tender Sum James (HK$)  Forecasted Tender Sum Floor (HK$)  Forecasted Tender Sum Cube (HK$)
1  50,550  12,213  43,744  1,175  5,861  1,455  59,372,873  71,583,498  63,662,214
2  38,811  9,176  31,035  1,175  5,859  1,455  45,585,401  53,760,591  45,143,231
3  83,694  23,330  78,378  1,174  5,877  1,457  98,277,121  137,105,039  114,181,365
4  46,142  10,491  32,639  1,175  5,860  1,454  54,209,728  61,471,200  47,455,012
5  58,593  12,620  47,163  1,176  5,862  1,456  68,878,692  73,975,591  68,663,684
6  160,425  50,940  163,404  1,172  5,912  1,459  187,972,498  301,140,203  238,433,705
7  110,466  25,052  92,362  1,176  5,871  1,457  129,899,266  147,087,959  134,580,309
8  34,558  6,520  27,926  1,172  5,838  1,451  40,494,985  38,065,902  40,527,931
9  941,061  130,060  637,140  1,183  5,652  1,433  1,112,835,911  735,117,042  912,914,988
10  185,100  42,250  192,065  1,154  5,773  1,440  213,696,762  243,903,254  276,534,754
11  193,526  47,820  224,431  1,162  5,824  1,455  224,821,862  278,494,684  326,526,162
12  51,123  10,011  36,773  1,173  5,843  1,451  59,953,119  58,499,117  53,371,035
13  54,357  10,298  36,571  1,175  5,854  1,454  63,873,689  60,283,413  53,160,909
14  1,363,377  295,360  1,126,816  1,194  6,050  1,486  1,627,239,786  1,786,965,675  1,673,999,827
15  115,367  24,600  109,215  1,163  5,799  1,444  134,126,404  142,666,828  157,664,365
16  25,137  5,585  20,441  1,172  5,843  1,451  29,458,901  32,629,344  29,669,766
17  306,948  81,700  367,355  1,168  5,900  1,476  358,504,079  481,992,100  542,082,691
18  38,140  8,492  28,486  1,175  5,859  1,455  44,812,745  49,753,951  41,436,441
19  18,267  3,680  12,070  1,173  5,847  1,453  21,432,969  21,518,551  17,532,517
20  118,746  20,254  94,403  1,169  5,814  1,448  138,842,813  117,748,657  136,671,495
21  24,461  5,490  18,283  1,175  5,857  1,455  28,735,303  32,155,761  26,594,149
22  18,527  3,828  12,631  1,174  5,851  1,453  21,749,466  22,397,117  18,357,240
23  128,261  26,454  122,827  1,163  5,798  1,445  149,150,758  153,380,485  177,446,419
24  38,095  8,697  28,786  1,174  5,854  1,453  44,721,857  50,917,009  41,838,760
25  638,680  130,070  451,993  1,200  5,989  1,470  766,506,884  778,994,468  664,510,311
26  56,424  11,560  43,695  1,169  5,825  1,447  65,938,132  67,335,601  63,221,858
27  35,511  7,642  23,758  1,172  5,845  1,451  41,633,854  44,666,333  34,470,126
28  54,626  11,377  44,522  1,172  5,845  1,452  64,045,606  66,493,664  64,652,059
29  60,329  14,996  48,130  1,172  5,852  1,451  70,709,169  87,756,220  69,858,260
30  144,936  36,820  126,841  1,175  5,887  1,458  170,347,624  216,761,177  184,927,046
31  26,578  8,550  32,405  1,174  5,862  1,456  31,196,743  50,118,072  47,189,075
32  39,939  7,718  27,511  1,174  5,851  1,453  46,897,923  45,153,654  39,977,555
33  19,899  4,921  17,703  1,174  5,853  1,454  23,358,683  28,807,337  25,743,836
34  26,577  6,453  23,366  1,173  5,852  1,454  31,184,898  37,760,493  33,967,145
35  21,525  5,267  17,005  1,174  5,856  1,454  25,277,990  30,843,509  24,731,553
36  941,236  183,462  736,348  1,146  5,688  1,413  1,078,454,181  1,043,505,263  1,040,471,312
37  1,208,137  171,960  644,176  1,247  5,900  1,454  1,506,529,900  1,014,641,302  936,597,527
38  53,764  14,840  64,406  1,174  5,868  1,459  63,139,681  87,080,970  93,990,913
39  27,690  8,268  35,476  1,173  5,858  1,456  32,491,687  48,433,883  51,663,376
40  57,878  15,130  57,797  1,175  5,869  1,458  68,006,401  88,797,765  84,255,294
41  112,359  35,350  252,357  1,172  5,887  1,490  131,630,183  208,113,877  375,929,649
42  75,550  16,897  69,902  1,175  5,864  1,458  88,791,931  99,078,500  101,892,166
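Each forecast in Table D-1 appears to be the model quantity multiplied by the model rate. The following is a minimal sketch under that assumption, using the case 1 figures as printed; the function name `forecast_tender_sum` is illustrative and does not come from the thesis. Because the printed rates are rounded to whole dollars, the product only approximates the tabulated sum.

```python
def forecast_tender_sum(quantity: float, rate: float) -> float:
    """Single-rate forecast: model quantity times model rate (assumed form)."""
    return quantity * rate

# Case 1 of Table D-1 (offices), values as printed (rates are rounded)
case1 = {
    "James": (50_550, 1_175, 59_372_873),  # quantity m2, rate $/m2, tabulated HK$
    "Floor": (12_213, 5_861, 71_583_498),  # quantity m2, rate $/m2, tabulated HK$
    "Cube": (43_744, 1_455, 63_662_214),   # quantity m3, rate $/m3, tabulated HK$
}

for model, (qty, rate, tabulated) in case1.items():
    approx = forecast_tender_sum(qty, rate)
    rel_gap = abs(approx - tabulated) / tabulated  # gap caused by rate rounding
    print(f"{model}: {approx:,.0f} vs tabulated {tabulated:,} (gap {rel_gap:.3%})")
```

The small relative gaps suggest the tabulated sums were computed with unrounded rates.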
Table D-2: Forecasts by Cross Validation Using the Conventional Models for Private Housing
Case  Quantity James (m²)  Quantity Floor (m²)  Quantity Cube (m³)  Rate James ($/m²)  Rate Floor ($/m²)  Rate Cube ($/m³)  Forecasted Tender Sum James (HK$)  Forecasted Tender Sum Floor (HK$)  Forecasted Tender Sum Cube (HK$)
1  710,487  118,840  341,184  760  4,020  1,426  539,822,162  477,790,905  486,573,380
2  944,939  155,150  445,564  755  3,988  1,415  713,552,191  618,674,258  630,354,139
3  131,043  30,840  94,746  761  4,051  1,437  99,778,400  124,920,812  136,153,077
4  112,167  20,937  70,734  763  4,054  1,439  85,613,626  84,879,293  101,768,657
5  956,202  181,820  494,364  771  4,095  1,450  736,919,515  744,639,741  716,582,916
6  1,157,607  240,440  754,110  754  4,029  1,438  872,813,723  968,627,663  1,084,710,974
7  12,366  2,800  8,680  765  4,062  1,440  9,456,723  11,374,722  12,502,087
8  24,127  6,066  17,893  764  4,062  1,440  18,443,658  24,640,525  25,769,121
9  250,732  43,003  128,526  765  4,060  1,440  191,856,830  174,599,877  185,123,898
10  566,453  134,220  404,167  751  4,015  1,427  425,155,543  538,865,772  576,608,977
11  1,120,997  211,440  590,512  744  3,951  1,400  833,797,413  835,449,571  826,753,484
12  663,624  132,070  372,257  763  4,063  1,440  506,647,430  536,549,802  536,151,998
13  151,135  34,335  108,442  761  4,048  1,437  115,008,129  138,985,475  155,784,824
14  27,793  5,195  15,730  765  4,061  1,440  21,251,161  21,098,368  22,650,548
15  35,515  7,811  24,663  765  4,062  1,440  27,152,627  31,729,589  35,527,123
16  1,133,306  197,320  558,684  777  4,111  1,458  880,832,193  811,217,111  814,454,721
17  10,811  2,793  8,285  765  4,063  1,441  8,269,007  11,349,233  11,935,431
18  787,734  141,300  391,527  769  4,075  1,444  605,381,821  575,764,763  565,246,703
19  28,116  6,841  19,484  765  4,064  1,441  21,504,864  27,803,438  28,075,002
20  142,064  31,520  97,830  762  4,053  1,438  108,265,743  127,743,892  140,676,020
21  382,217  66,825  210,871  763  4,048  1,438  291,627,893  270,476,690  303,195,772
22  252,034  45,900  126,225  764  4,055  1,437  192,496,780  186,140,819  181,426,161
23  313,524  60,140  169,387  767  4,077  1,445  240,584,951  245,196,620  244,834,083
24  1,002,584  187,360  525,872  765  4,060  1,439  766,639,967  760,717,016  756,785,376
25  17,041  3,769  9,479  765  4,063  1,440  13,033,398  15,313,365  13,652,540
26  384,544  95,770  270,824  762  4,072  1,444  293,053,202  389,961,458  390,981,903
27  82,080  20,640  55,109  764  4,064  1,440  62,714,248  83,873,938  79,372,448
28  522,038  90,720  254,088  770  4,083  1,447  402,007,775  370,370,941  367,703,903
29  499,149  108,627  304,590  765  4,076  1,445  381,634,950  442,791,500  440,105,402
30  148,177  34,470  106,967  762  4,052  1,438  112,852,580  139,673,486  153,796,614
31  627,877  122,750  340,357  768  4,085  1,447  482,324,197  501,440,924  492,666,293
32  170,618  32,550  87,885  766  4,072  1,443  130,777,011  132,536,234  126,823,048
33  384,252  74,700  219,960  761  4,046  1,436  292,536,200  302,254,734  315,794,951
34  407,097  79,260  223,270  767  4,075  1,445  312,076,760  322,950,196  322,515,873
35  529,896  97,860  276,927  766  4,066  1,442  405,862,926  397,939,532  399,266,095
36  103,759  21,938  64,166  765  4,063  1,441  79,325,442  89,140,383  92,453,090
37  652,203  123,950  313,594  771  4,095  1,447  502,657,976  507,571,144  453,792,371
38  958,982  169,260  457,002  776  4,109  1,454  744,014,266  695,492,936  664,503,547
39  499,756  85,840  231,768  771  4,085  1,447  385,154,078  350,666,191  335,355,838
40  966,375  168,720  455,544  776  4,108  1,454  749,878,706  693,047,206  662,170,897
41  733,776  128,390  346,653  772  4,089  1,448  566,228,519  524,940,289  501,783,295
42  494,856  87,480  236,196  765  4,060  1,438  378,793,092  355,173,599  339,660,311
43  767,745  129,870  350,649  771  4,083  1,445  592,301,009.74  530,198,096.05  506,800,720.56
44  114,848  21,121  55,971  766  4,066  1,441  87,926,473.50  85,879,578.90  80,658,164.91
45  393,972  70,670  187,276  769  4,081  1,445  302,928,374.19  288,375,864.90  270,637,948.94
46  130,842  28,000  70,488  766  4,072  1,443  100,224,854.20  114,019,920.97  101,686,494.80
47  118,933  24,850  68,338  763  4,057  1,438  90,773,455.62  100,804,279.38  98,264,079.00
48  65,162  14,940  44,329  764  4,063  1,441  49,806,322.57  60,695,623.54  63,860,229.19
49  83,376  20,300  58,870  764  4,065  1,441  63,726,309.10  82,509,513.31  84,842,576.37
50  737,963  128,700  353,925  773  4,097  1,451  570,746,064.53  527,332,419.33  513,701,177.26
Table D-3: Forecasts by Cross Validation Using the Conventional Models for Nursing Homes
Case  Quantity James (m²)  Quantity Floor (m²)  Quantity Cube (m³)  Rate James ($/m²)  Rate Floor ($/m²)  Rate Cube ($/m³)  Forecasted Tender Sum James (HK$)  Forecasted Tender Sum Floor (HK$)  Forecasted Tender Sum Cube (HK$)
1  32,195  9,357  37,295  1,211  4,215  1,183  38,992,273  39,436,561  44,130,687
2  37,823  10,640  39,472  1,236  4,292  1,200  46,752,715  45,665,113  47,348,177
3  28,924  8,940  35,760  1,199  4,186  1,175  34,672,981  37,423,995  42,024,543
4  30,853  10,100  45,450  1,208  4,234  1,201  37,258,239  42,766,748  54,599,219
5  8,020  2,400  9,600  1,212  4,219  1,178  9,723,386  10,124,491  11,311,030
6  18,536  5,240  18,602  1,217  4,231  1,179  22,566,468  22,172,318  21,937,612
7  16,224  3,502  12,258  1,216  4,196  1,169  19,722,838  14,695,509  14,332,358
8  12,913  3,783  15,132  1,208  4,202  1,175  15,594,843  15,895,715  17,777,176
9  34,503  10,900  33,108  1,216  4,258  1,174  41,969,369  46,414,397  38,868,066
10  22,893  6,865  24,028  1,218  4,245  1,182  27,889,849  29,139,691  28,404,085
11  19,628  6,150  18,143  1,209  4,217  1,167  23,720,771  25,933,114  21,178,107
12  35,890  11,900  41,650  1,224  4,302  1,197  43,925,314  51,193,195  49,857,789
13  16,713  4,575  19,444  1,220  4,238  1,188  20,397,313  19,389,012  23,101,057
14  18,357  5,720  17,160  1,229  4,286  1,188  22,558,469  24,515,279  20,380,578
15  17,620  5,752  23,008  1,223  4,273  1,196  21,551,242  24,575,631  27,526,333
16  12,112  3,740  17,204  1,212  4,224  1,186  14,685,236  15,796,253  20,397,589
17  72,543  15,978  47,197  1,223  4,108  1,123  88,713,548  65,633,295  53,006,869
18  20,374  6,240  21,216  1,230  4,287  1,193  25,053,592  26,751,383  25,304,864
19  22,236  6,105  20,757  1,219  4,232  1,178  27,114,010  25,839,115  24,443,092
20  14,943  4,405  19,383  1,210  4,209  1,181  18,074,729  18,543,558  22,896,143
21  13,564  4,290  15,659  1,223  4,264  1,189  16,590,664  18,293,875  18,625,695
22  18,287  5,230  17,259  1,227  4,267  1,186  22,440,674  22,315,583  20,475,966
23  27,805  7,190  21,517  1,208  4,177  1,156  33,575,265  30,033,124  24,863,430
Table D-4: Forecasts by Cross Validation Using the Conventional Models for Schools
Columns: Case; Model Quantity: James (m²), Floor (m²), Cube (m³); Model Rate: James ($/m²), Floor ($/m²), Cube ($/m³); Forecasted Tender Sum: James (HK$), Floor (HK$), Cube (HK$)
1 14,916 4,498 18,036 632 2,104 549 9,428,724 9,464,630 9,905,382
2 38,219 12,466 51,609 636 2,136 562 24,291,506 26,628,941 29,000,856
3 14,916 4,498 18,036 630 2,098 547 9,399,254 9,435,048 9,874,423
4 14,861 4,436 15,613 631 2,101 545 9,381,703 9,317,595 8,515,324
5 8,541 2,710 10,081 635 2,118 551 5,427,421 5,739,048 5,557,443
6 9,882 2,970 10,692 633 2,108 548 6,258,694 6,261,287 5,862,205
7 9,649 2,771 11,638 637 2,117 553 6,144,545 5,866,423 6,433,834
8 32,645 9,715 34,780 637 2,118 548 20,789,791 20,574,694 19,067,844
9 19,003 5,570 20,165 616 2,047 532 11,703,934 11,404,989 10,723,764
10 12,441 3,480 16,112 637 2,114 554 7,921,848 7,358,185 8,933,449
11 10,858 3,068 12,579 632 2,100 548 6,862,630 6,442,888 6,894,603
12 16,273 4,814 18,582 637 2,119 552 10,369,211 10,202,933 10,262,204
13 14,996 4,459 19,486 620 2,064 541 9,304,505 9,205,053 10,538,984
14 14,985 4,340 15,190 633 2,102 546 9,478,966 9,123,962 8,289,897
15 27,004 8,325 29,138 638 2,129 551 17,240,545 17,726,537 16,050,372
16 23,185 7,171 24,238 636 2,121 548 14,743,916 15,207,998 13,283,659
17 14,201 4,200 14,910 640 2,130 553 9,094,442 8,947,076 8,250,403
18 10,848 3,152 12,450 637 2,119 552 6,912,741 6,679,007 6,878,452
19 7,197 2,140 9,245 638 2,123 555 4,592,511 4,544,131 5,126,316
20 13,913 3,920 14,896 633 2,102 547 8,806,584 8,238,969 8,153,064
21 6,407 1,976 6,916 634 2,112 549 4,062,983 4,172,677 3,798,607
22 9,739 2,921 9,785 639 2,126 552 6,220,005 6,209,432 5,400,789
23 32,434 9,700 40,740 651 2,167 569 21,121,705 21,016,634 23,199,480
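The forecasted tender sums in Tables D-3 and D-4 appear to be the product of each model quantity and its model rate. The sketch below checks that reading against case 1 of Table D-4. The function name is illustrative only, and because the printed rates are rounded to whole dollars, the products differ slightly from the tabulated sums.

```python
def forecast_tender_sum(quantity: float, rate: float) -> float:
    """Forecast = model quantity x model rate (assumed relationship)."""
    return quantity * rate


# Case 1 of Table D-4 (schools): (quantity, rate, tabulated forecast in HK$)
case_1 = {
    "James": (14_916, 632, 9_428_724),
    "Floor": (4_498, 2_104, 9_464_630),
    "Cube": (18_036, 549, 9_905_382),
}

for model, (quantity, rate, tabulated) in case_1.items():
    forecast = forecast_tender_sum(quantity, rate)
    # Relative gap between the recomputed and the tabulated forecast
    rel_gap = abs(forecast - tabulated) / tabulated
    print(f"{model}: {forecast:,.0f} vs {tabulated:,} (gap {rel_gap:.3%})")
```

The small gaps are consistent with the tabulated rates having been rounded after the forecasts were computed.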
Appendix E: Errors and Percentage Errors of Forecasts
Table E-1: Errors and Percentage Errors of Forecasts for the Conventional Models for Offices
Columns: Case; ORIGINAL JSEM (Error, % Error); CUBE (Error, % Error); FLOOR AREA (Error, % Error)
1 2.3E+06 0.040 1.4E+07 0.252 7.2E+06 0.126
2 2.3E+06 0.053 1.0E+07 0.241 2.3E+06 0.053
3 -1.1E+05 -0.001 3.9E+07 0.393 1.7E+07 0.172
4 4.5E+06 0.091 1.2E+07 0.236 -1.8E+06 -0.036
5 1.0E+07 0.172 1.5E+07 0.258 1.1E+07 0.180
6 -2.0E+07 -0.095 9.3E+07 0.449 3.3E+07 0.160
7 1.3E+07 0.111 3.0E+07 0.257 1.9E+07 0.163
8 -1.9E+07 -0.320 -2.2E+07 -0.362 -1.9E+07 -0.313
9 6.5E+07 0.062 -3.1E+08 -0.299 -1.2E+08 -0.119
10 -1.5E+08 -0.419 -1.2E+08 -0.337 -8.8E+07 -0.240
11 -9.8E+07 -0.303 -4.4E+07 -0.137 7.3E+06 0.023
12 -1.2E+07 -0.166 -1.3E+07 -0.187 -1.8E+07 -0.251
13 6.5E+06 0.112 2.8E+06 0.049 -3.7E+06 -0.065
14 1.5E+08 0.103 3.1E+08 0.210 2.2E+08 0.148
15 -9.1E+07 -0.404 -8.2E+07 -0.366 -6.6E+07 -0.292
16 -1.8E+07 -0.380 -1.5E+07 -0.313 -1.8E+07 -0.369
17 -4.9E+07 -0.120 7.4E+07 0.182 1.4E+08 0.344
18 5.4E+06 0.136 1.0E+07 0.260 2.4E+06 0.060
19 -7.4E+06 -0.256 -7.3E+06 -0.253 -1.1E+07 -0.385
20 -3.9E+07 -0.220 -6.0E+07 -0.339 -4.0E+07 -0.224
21 3.8E+06 0.151 7.2E+06 0.287 1.9E+06 0.076
22 -2.5E+06 -0.105 -1.9E+06 -0.079 -5.8E+06 -0.237
23 -8.9E+07 -0.373 -8.5E+07 -0.356 -5.9E+07 -0.247
24 -2.5E+06 -0.052 3.7E+06 0.078 -4.9E+06 -0.105
25 2.0E+08 0.359 2.1E+08 0.380 1.1E+08 0.190
26 -4.4E+07 -0.400 -4.3E+07 -0.388 -4.6E+07 -0.419
27 -1.4E+07 -0.256 -1.1E+07 -0.203 -2.1E+07 -0.378
28 -1.4E+07 -0.181 -1.2E+07 -0.150 -1.3E+07 -0.165
29 -1.7E+07 -0.195 -1.2E+05 -0.001 -1.7E+07 -0.197
30 8.5E+06 0.052 5.5E+07 0.338 2.5E+07 0.154
31 -3.8E+06 -0.107 1.5E+07 0.433 1.3E+07 0.364
32 -2.2E+05 -0.005 -2.0E+06 -0.042 -6.8E+06 -0.143
33 -3.2E+06 -0.122 2.2E+06 0.082 -6.1E+05 -0.023
34 -6.8E+06 -0.179 -2.5E+05 -0.006 -3.7E+06 -0.097
35 9.1E+05 0.037 6.5E+06 0.265 6.1E+05 0.025
36 -2.2E+08 -0.170 -2.6E+08 -0.198 -2.5E+08 -0.191
37 5.7E+08 0.606 7.6E+07 0.081 8.6E+06 0.009
38 1.0E+06 0.016 2.5E+07 0.400 3.3E+07 0.528
39 -6.7E+06 -0.170 9.2E+06 0.236 1.3E+07 0.332
40 5.7E+06 0.092 2.6E+07 0.425 2.3E+07 0.366
41 -2.8E+07 -0.181 5.5E+07 0.360 1.3E+08 0.822
42 7.9E+06 0.098 1.8E+07 0.224 2.2E+07 0.272
MSQ: 1.2E+16 8.9E+15 4.7E+15
Max: 0.606 0.449 0.822
Min: -0.419 -0.388 -0.419
Mean: -0.069 0.056 0.002
SD: 0.214 0.273 0.270
Table E-2: Errors and Percentage Errors of Forecasts for the Conventional Models for Private Housing
Columns: Case; ORIGINAL JSEM (Error, % Error); FLOOR AREA (Error, % Error); CUBE (Error, % Error)
1 -1.1E+08 -0.172 -1.7E+08 -0.267 -1.7E+08 -0.254
2 -2.1E+08 -0.229 -3.1E+08 -0.332 -3.0E+08 -0.319
3 -7.7E+07 -0.436 -5.2E+07 -0.294 -4.1E+07 -0.231
4 -3.7E+07 -0.304 -3.8E+07 -0.310 -2.1E+07 -0.173
5 1.2E+08 0.198 1.3E+08 0.211 1.0E+08 0.165
6 -2.4E+08 -0.214 -1.4E+08 -0.127 -2.5E+07 -0.023
7 -6.2E+06 -0.398 -4.3E+06 -0.275 -3.2E+06 -0.204
8 -1.2E+07 -0.391 -5.7E+06 -0.187 -4.5E+06 -0.150
9 3.9E+06 0.021 -1.3E+07 -0.071 -2.9E+06 -0.015
10 -3.1E+08 -0.422 -2.0E+08 -0.268 -1.6E+08 -0.217
11 -4.6E+08 -0.354 -4.5E+08 -0.352 -4.6E+08 -0.359
12 -3.3E+07 -0.062 -3.5E+06 -0.006 -3.8E+06 -0.007
13 -8.7E+07 -0.431 -6.3E+07 -0.312 -4.6E+07 -0.229
14 -8.3E+06 -0.282 -8.5E+06 -0.287 -6.9E+06 -0.235
15 -9.8E+06 -0.266 -5.3E+06 -0.142 -1.5E+06 -0.040
16 2.6E+08 0.425 1.9E+08 0.313 2.0E+08 0.318
17 -3.1E+06 -0.275 -5.1E+04 -0.004 5.4E+05 0.047
18 7.5E+07 0.142 4.6E+07 0.086 3.5E+07 0.067
19 -3.1E+06 -0.126 3.2E+06 0.130 3.5E+06 0.141
20 -6.3E+07 -0.367 -4.3E+07 -0.253 -3.0E+07 -0.177
21 -4.3E+07 -0.129 -6.5E+07 -0.193 -3.2E+07 -0.095
22 -2.7E+07 -0.121 -3.3E+07 -0.150 -3.8E+07 -0.172
23 5.1E+07 0.266 5.5E+07 0.291 5.5E+07 0.289
24 -7.4E+06 -0.010 -1.3E+07 -0.017 -1.7E+07 -0.022
25 -4.3E+06 -0.247 -2.0E+06 -0.115 -3.6E+06 -0.211
26 -6.3E+07 -0.177 3.4E+07 0.095 3.5E+07 0.098
27 -2.0E+07 -0.245 7.7E+05 0.009 -3.7E+06 -0.045
28 1.1E+08 0.372 7.7E+07 0.264 7.5E+07 0.255
29 -9.4E+06 -0.024 5.2E+07 0.132 4.9E+07 0.126
30 -7.3E+07 -0.393 -4.6E+07 -0.249 -3.2E+07 -0.173
31 6.8E+07 0.165 8.7E+07 0.211 7.9E+07 0.190
32 3.2E+07 0.322 3.4E+07 0.340 2.8E+07 0.282
33 -7.9E+07 -0.214 -7.0E+07 -0.187 -5.6E+07 -0.151
34 3.4E+07 0.123 4.5E+07 0.162 4.5E+07 0.160
35 2.0E+07 0.051 1.2E+07 0.031 1.3E+07 0.034
36 -1.1E+07 -0.118 -7.6E+05 -0.008 2.6E+06 0.028
37 1.2E+08 0.323 1.3E+08 0.336 7.4E+07 0.194
38 2.3E+08 0.456 1.8E+08 0.361 1.5E+08 0.300
39 1.2E+08 0.464 8.8E+07 0.333 7.2E+07 0.275
40 2.4E+08 0.459 1.8E+08 0.348 1.5E+08 0.288
41 1.4E+08 0.339 1.0E+08 0.241 7.9E+07 0.186
42 9.8E+06 0.027 -1.4E+07 -0.037 -2.9E+07 -0.080
43 1.4E+08 0.308 7.7E+07 0.170 5.4E+07 0.119
44 1.3E+07 0.166 1.0E+07 0.139 5.3E+06 0.070
45 8.4E+07 0.383 6.9E+07 0.317 5.2E+07 0.236
46 2.1E+07 0.270 3.5E+07 0.445 2.3E+07 0.289
47 -3.8E+07 -0.296 -2.8E+07 -0.219 -3.1E+07 -0.238
48 -1.4E+07 -0.223 -3.4E+06 -0.053 -2.4E+05 -0.004
49 -1.5E+07 -0.186 4.2E+06 0.054 6.5E+06 0.084
50 1.8E+08 0.463 1.4E+08 0.352 1.2E+08 0.317
MSQ: 1.6E+16 1.3E+16 1.0E+16
Max: 0.464 0.445 0.318
Min: -0.436 -0.352 -0.359
Mean: -0.027 0.013 0.015
SD: 0.290 0.235 0.196
Table E-3: Errors and Percentage Errors of Forecasts for the Conventional Models for Nursing Homes
Columns: Case; ORIGINAL JSEM (Error, % Error); FLOOR AREA (Error, % Error); CUBE (Error, % Error)
1 -3.4E+06 -0.080 -3.0E+06 -0.070 1.7E+06 0.041
2 1.0E+07 0.286 9.3E+06 0.256 1.1E+07 0.303
3 -1.0E+07 -0.228 -7.5E+06 -0.167 -2.9E+06 -0.064
4 -5.3E+06 -0.125 1.7E+05 0.004 1.2E+07 0.282
5 -2.7E+06 -0.220 -2.3E+06 -0.188 -1.2E+06 -0.092
6 9.5E+04 0.004 -3.0E+05 -0.013 -5.3E+05 -0.024
7 -9.0E+05 -0.044 -5.9E+06 -0.287 -6.3E+06 -0.305
8 -5.3E+06 -0.253 -5.0E+06 -0.239 -3.1E+06 -0.149
9 -4.8E+05 -0.011 4.0E+06 0.094 -3.6E+06 -0.084
10 5.6E+05 0.021 1.8E+06 0.066 1.1E+06 0.039
11 -4.8E+06 -0.169 -2.6E+06 -0.092 -7.4E+06 -0.258
12 3.7E+06 0.091 1.1E+07 0.271 9.6E+06 0.238
13 1.8E+06 0.095 7.6E+05 0.041 4.5E+06 0.240
14 6.4E+06 0.397 8.4E+06 0.519 4.2E+06 0.262
15 3.2E+06 0.176 6.2E+06 0.341 9.2E+06 0.502
16 -2.6E+06 -0.153 -1.5E+06 -0.088 3.1E+06 0.177
17 3.1E+06 0.036 -2.0E+07 -0.233 -3.3E+07 -0.381
18 6.9E+06 0.377 8.6E+06 0.471 7.1E+06 0.391
19 1.1E+06 0.044 -1.3E+05 -0.005 -1.5E+06 -0.059
20 -4.3E+06 -0.191 -3.8E+06 -0.170 5.6E+05 0.025
21 3.2E+06 0.242 4.9E+06 0.370 5.3E+06 0.395
22 5.5E+06 0.322 5.3E+06 0.315 3.5E+06 0.206
23 -5.4E+06 -0.138 -8.9E+06 -0.229 -1.4E+07 -0.362
MSQ: 2.3E+13 4.8E+13 8.6E+13
Max: 0.397 0.519 0.502
Min: -0.253 -0.287 -0.381
Mean: 0.021 0.042 0.058
SD: 0.200 0.245 0.252
Table E-4: Errors and Percentage Errors of Forecasts for the Conventional Models for Schools
Columns: Case; ORIGINAL JSEM (Error, % Error); FLOOR AREA (Error, % Error); CUBE (Error, % Error)
1 -8.9E+05 -0.086 -8.6E+05 -0.083 -4.2E+05 -0.040
2 4.2E+05 0.018 2.8E+06 0.116 5.1E+06 0.215
3 -1.6E+06 -0.148 -1.6E+06 -0.145 -1.2E+06 -0.105
4 -1.2E+06 -0.114 -1.3E+06 -0.120 -2.1E+06 -0.196
5 3.6E+05 0.071 6.7E+05 0.132 4.9E+05 0.097
6 -4.1E+05 -0.062 -4.1E+05 -0.062 -8.1E+05 -0.122
7 8.9E+05 0.169 6.1E+05 0.116 1.2E+06 0.224
8 9.0E+05 0.045 6.8E+05 0.034 -8.3E+05 -0.042
9 -7.0E+06 -0.374 -7.3E+06 -0.390 -8.0E+06 -0.427
10 8.6E+05 0.122 3.0E+05 0.042 1.9E+06 0.265
11 -9.1E+05 -0.118 -1.3E+06 -0.172 -8.8E+05 -0.113
12 1.0E+06 0.110 8.6E+05 0.092 9.2E+05 0.099
13 -5.3E+06 -0.362 -5.4E+06 -0.369 -4.0E+06 -0.278
14 -7.2E+05 -0.071 -1.1E+06 -0.106 -1.9E+06 -0.187
15 1.5E+06 0.095 2.0E+06 0.126 3.1E+05 0.019
16 5.5E+05 0.039 1.0E+06 0.071 -9.1E+05 -0.064
17 2.2E+06 0.326 2.1E+06 0.305 1.4E+06 0.203
18 1.0E+06 0.178 8.1E+05 0.138 1.0E+06 0.172
19 1.4E+06 0.422 1.3E+06 0.407 1.9E+06 0.588
20 -5.7E+05 -0.061 -1.1E+06 -0.121 -1.2E+06 -0.130
21 -1.3E+05 -0.030 -1.5E+04 -0.004 -3.9E+05 -0.093
22 1.6E+06 0.344 1.6E+06 0.342 7.7E+05 0.167
23 6.3E+06 0.427 6.2E+06 0.420 8.4E+06 0.567
MSQ: 6.1E+12 6.7E+12 8.9E+12
Max: 0.427 0.420 0.588
Min: -0.374 -0.390 -0.427
Mean: 0.041 0.033 0.036
SD: 0.212 0.214 0.246
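The summary rows under each table in this appendix can be reproduced from the case rows. The sketch below does so for the ORIGINAL JSEM column of Table E-4. It assumes that MSQ is the mean of the squared errors and that SD is the sample standard deviation (divisor n - 1) of the percentage errors; neither definition is stated in this appendix, but both readings match the printed figures to their rounding. The function name `summarise` is illustrative only.

```python
import statistics

# ORIGINAL JSEM column of Table E-4 (schools), cases 1 to 23.
errors = [-8.9e5, 4.2e5, -1.6e6, -1.2e6, 3.6e5, -4.1e5, 8.9e5, 9.0e5,
          -7.0e6, 8.6e5, -9.1e5, 1.0e6, -5.3e6, -7.2e5, 1.5e6, 5.5e5,
          2.2e6, 1.0e6, 1.4e6, -5.7e5, -1.3e5, 1.6e6, 6.3e6]
pct_errors = [-0.086, 0.018, -0.148, -0.114, 0.071, -0.062, 0.169, 0.045,
              -0.374, 0.122, -0.118, 0.110, -0.362, -0.071, 0.095, 0.039,
              0.326, 0.178, 0.422, -0.061, -0.030, 0.344, 0.427]


def summarise(errors, pct_errors):
    """Return the five summary rows for one model column."""
    return {
        # Assumed: mean of squared errors (divisor n)
        "MSQ": sum(e * e for e in errors) / len(errors),
        "Max": max(pct_errors),
        "Min": min(pct_errors),
        "Mean": statistics.mean(pct_errors),
        # Assumed: sample standard deviation (divisor n - 1)
        "SD": statistics.stdev(pct_errors),
    }


summary = summarise(errors, pct_errors)
for name, value in summary.items():
    print(f"{name}: {value:.3g}")
```

Because the inputs are the rounded values printed in the table, the recomputed figures agree with the summary rows only to about the last printed digit.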
Table E-5: Errors and Percentage Errors of Forecasts for the Regressed Models for Offices
Columns: Case; RJSEM (Error *, % Error); LRJSEM (Error *, % Error); LRASEM (Error *, % Error); RASEM (Error *, % Error)
1 1.4E+07 0.240 1.6E+07 0.283 1.6E+07 0.272 1.5E+07 0.256
2 8.6E+06 0.199 7.9E+06 0.181 8.0E+06 0.186 7.1E+06 0.164
3 2.1E+07 0.213 1.7E+07 0.170 6.3E+06 0.064 -2.1E+06 -0.021
4 9.5E+06 0.191 8.2E+06 0.165 4.6E+06 0.093 5.9E+06 0.118
5 1.9E+07 0.327 2.3E+07 0.392 1.9E+07 0.331 1.8E+07 0.311
6 -8.9E+07 -0.430 -8.6E+07 -0.413 -6.3E+07 -0.304 -5.0E+07 -0.241
7 4.0E+07 0.338 4.4E+07 0.378 2.3E+07 0.199 2.0E+07 0.169
8 -1.7E+07 -0.285 -1.4E+07 -0.227 -5.8E+06 -0.097 -5.1E+06 -0.086
9 2.2E+08 0.211 6.4E+07 0.061 -8.4E+07 -0.080 -1.3E+08 -0.120
10 -5.1E+07 -0.138 -5.5E+07 -0.149 -1.1E+08 -0.298 -9.7E+07 -0.264
11 -6.3E+06 -0.020 7.1E+06 0.022 -5.9E+07 -0.183 -5.3E+07 -0.164
12 -1.2E+07 -0.161 -7.8E+06 -0.109 -9.5E+06 -0.132 -7.4E+06 -0.103
13 8.2E+06 0.143 1.1E+07 0.185 2.1E+07 0.365 1.8E+07 0.308
14 -3.2E+08 -0.215 -3.4E+08 -0.233 2.1E+08 0.143 2.7E+08 0.183
15 -2.9E+07 -0.129 -3.2E+07 -0.140 -5.2E+07 -0.230 -4.7E+07 -0.207
16 -1.6E+07 -0.337 -1.6E+07 -0.331 -1.1E+07 -0.227 -1.1E+07 -0.240
17 -2.0E+08 -0.496 -5.3E+06 -0.013 -9.4E+06 -0.023 -5.3E+07 -0.131
18 8.8E+06 0.224 9.2E+06 0.234 1.0E+07 0.252 8.4E+06 0.212
19 -9.4E+06 -0.326 -8.9E+06 -0.309 -7.5E+06 -0.261 -4.6E+06 -0.160
20 2.4E+06 0.013 5.3E+06 0.030 -7.3E+05 -0.004 -2.0E+06 -0.011
21 4.4E+06 0.178 5.3E+06 0.213 4.1E+06 0.165 6.9E+06 0.275
22 -4.3E+06 -0.177 -3.5E+06 -0.145 4.7E+04 0.002 -9.5E+05 -0.039
23 -1.1E+06 -0.005 -1.1E+07 -0.046 -3.6E+07 -0.150 -2.6E+07 -0.111
24 -5.2E+04 -0.001 1.4E+06 0.029 1.3E+06 0.028 -1.4E+06 -0.029
25 2.3E+07 0.041 1.6E+08 0.291 1.1E+08 0.192 1.2E+08 0.204
26 -3.7E+07 -0.334 -3.5E+07 -0.316 -3.0E+07 -0.274 -2.8E+07 -0.256
27 -1.5E+07 -0.274 -1.5E+07 -0.272 -1.4E+07 -0.258 -1.4E+07 -0.255
28 -5.1E+06 -0.065 -1.3E+06 -0.016 -5.3E+06 -0.067 -3.9E+06 -0.050
29 8.4E+05 0.010 -3.0E+06 -0.034 -4.7E+06 -0.053 -9.9E+06 -0.113
30 8.6E+07 0.533 4.4E+07 0.274 1.9E+07 0.118 8.3E+06 0.051
31 6.3E+06 0.180 2.0E+05 0.006 -7.6E+06 -0.217 -5.2E+06 -0.148
32 3.0E+06 0.065 5.3E+06 0.113 1.5E+07 0.310 2.2E+07 0.473
33 -7.3E+05 -0.027 -1.2E+06 -0.044 -7.8E+05 -0.029 -2.3E+06 -0.085
34 -2.5E+06 -0.065 -3.2E+06 -0.085 -7.4E+05 -0.019 -2.3E+06 -0.060
35 3.4E+06 0.138 2.6E+06 0.106 4.0E+06 0.164 3.4E+06 0.140
36 -1.9E+08 -0.142 2.5E+07 0.019 -2.1E+08 -0.164 -2.3E+08 -0.179
37 -5.3E+07 -0.057 -2.0E+08 -0.209 2.2E+08 0.236 1.7E+08 0.183
38 1.8E+07 0.296 2.1E+07 0.339 6.4E+06 0.102 1.1E+07 0.179
39 1.1E+07 0.271 -3.6E+05 -0.009 1.1E+06 0.028 -5.5E+06 -0.140
40 1.9E+07 0.311 2.4E+07 0.388 8.3E+06 0.133 1.3E+07 0.214
41 9.6E+07 0.627 1.2E+06 0.008 4.9E+07 0.321 1.1E+08 0.700
42 1.8E+07 0.224 3.7E+07 0.454 1.2E+07 0.153 1.7E+07 0.208
MSQ: 6.3E+15 5.0E+15 4.5E+15 5.3E+15
Max: 0.627 0.454 0.365 0.700
Min: -0.496 -0.413 -0.304 -0.264
Mean: 0.031 0.030 0.019 0.027
SD: 0.254 0.221 0.195 0.219
Remark: "*" - errors of predicted price calculated by actual price - (actual floor area x predicted price per area)
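The remark on the starred error column can be written out as a short calculation. The sketch below follows the remark literally (actual price minus actual floor area times predicted price per area). Dividing the error by the actual price to obtain the percentage error is an assumption, and the function name and input figures are hypothetical rather than taken from the thesis data.

```python
def regressed_model_error(actual_price, actual_floor_area, predicted_price_per_area):
    """Error as worded in the remark; % error assumes division by the actual price."""
    # Convert the predicted unit price back to a total predicted price
    predicted_price = actual_floor_area * predicted_price_per_area
    error = actual_price - predicted_price
    pct_error = error / actual_price  # assumed denominator
    return error, pct_error


# Hypothetical project: HK$10,000,000 actual price, 5,000 m2 floor area,
# predicted price of HK$1,900 per m2.
error, pct_error = regressed_model_error(10_000_000, 5_000, 1_900)
print(f"Error: {error:,.0f}  % Error: {pct_error:.3f}")
```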
Table E-6: Errors and Percentage Errors of Forecasts for the Regressed Models for Private Housing
Columns: Case; RJSEM (Error *, % Error); LRJSEM (Error *, % Error); LRASEM (Error *, % Error); RASEM (Error *, % Error)
1 -1.8E+08 -0.283 -1.6E+08 -0.241 -1.9E+08 -0.299 -1.6E+08 -0.244
2 -3.4E+08 -0.366 -3.1E+08 -0.335 -3.2E+08 -0.346 -3.0E+08 -0.319
3 -3.8E+07 -0.215 -2.7E+07 -0.152 -3.4E+07 -0.193 -3.1E+07 -0.176
4 -3.0E+07 -0.246 3.0E+07 0.248 -3.3E+07 -0.270 3.6E+07 0.294
5 -5.4E+06 -0.009 3.3E+07 0.054 -1.8E+07 -0.030 7.4E+07 0.120
6 2.7E+08 0.244 1.0E+08 0.095 5.7E+07 0.051 -1.5E+08 -0.135
7 -3.2E+06 -0.201 -1.9E+06 -0.118 -1.9E+06 -0.120 -2.0E+06 -0.130
8 -2.9E+06 -0.095 -2.5E+06 -0.081 -9.7E+04 -0.003 -3.3E+06 -0.108
9 4.1E+06 0.022 7.1E+06 0.038 1.0E+06 0.006 -1.2E+05 -0.001
10 -9.6E+07 -0.130 -2.6E+07 -0.035 -9.5E+07 -0.129 -2.0E+08 -0.273
11 -6.6E+08 -0.509 -1.9E+08 -0.146 -5.5E+08 -0.424 -5.1E+08 -0.394
12 -3.2E+07 -0.060 1.2E+08 0.222 -5.2E+07 -0.097 -3.6E+07 -0.067
13 -4.8E+07 -0.240 -9.0E+06 -0.045 -4.6E+07 -0.228 -1.1E+07 -0.056
14 -6.4E+06 -0.217 -3.8E+06 -0.127 -6.4E+06 -0.215 -4.1E+06 -0.137
15 -1.7E+06 -0.047 9.2E+06 0.249 -1.0E+06 -0.028 9.6E+06 0.260
16 1.8E+08 0.285 1.2E+08 0.189 2.0E+08 0.325 1.4E+08 0.234
17 1.4E+06 0.119 1.5E+06 0.132 4.1E+06 0.357 1.4E+06 0.124
18 2.5E+07 0.047 5.5E+06 0.010 1.7E+07 0.033 9.8E+06 0.018
19 6.8E+06 0.278 3.6E+06 0.145 1.1E+07 0.427 2.7E+06 0.111
20 -3.3E+07 -0.195 -8.7E+06 -0.051 -4.5E+07 -0.260 -3.5E+06 -0.020
21 -5.3E+07 -0.160 1.0E+07 0.030 -7.8E+07 -0.232 3.5E+07 0.104
22 -2.3E+07 -0.107 -3.3E+07 -0.153 -4.8E+07 -0.219 -3.9E+07 -0.180
23 7.0E+07 0.368 2.8E+07 0.149 5.0E+07 0.265 5.3E+07 0.278
24 -1.6E+07 -0.020 -1.1E+08 -0.148 6.0E+06 0.008 -7.1E+07 -0.092
25 -2.2E+05 -0.013 -3.0E+06 -0.171 1.1E+06 0.065 -3.4E+06 -0.198
26 6.7E+07 0.188 -2.5E+07 -0.072 6.4E+07 0.180 9.1E+06 0.025
27 9.2E+06 0.111 -5.6E+06 -0.068 4.1E+06 0.049 -8.1E+06 -0.097
28 9.0E+07 0.306 7.1E+07 0.243 7.2E+07 0.245 6.3E+07 0.216
29 4.8E+07 0.122 -1.2E+07 -0.031 3.2E+07 0.081 4.4E+07 0.114
30 -3.4E+07 -0.181 -2.7E+07 -0.143 -3.9E+07 -0.207 -1.0E+07 -0.054
31 6.3E+07 0.153 7.6E+07 0.184 4.5E+07 0.109 5.7E+07 0.139
32 4.4E+07 0.448 3.0E+07 0.305 2.8E+07 0.279 2.5E+07 0.254
33 -2.5E+07 -0.067 -7.4E+07 -0.200 -1.9E+07 -0.051 -8.2E+07 -0.220
34 7.1E+07 0.255 3.5E+07 0.128 6.6E+07 0.237 3.0E+07 0.108
35 3.1E+07 0.080 1.3E+06 0.003 2.1E+07 0.053 -4.6E+06 -0.012
36 8.8E+06 0.098 -8.6E+06 -0.095 4.1E+06 0.046 -8.1E+05 -0.009
37 9.8E+07 0.258 -1.3E+07 -0.035 7.7E+07 0.201 5.2E+06 0.014
38 9.8E+07 0.192 8.8E+07 0.171 1.1E+08 0.206 1.1E+08 0.214
39 8.6E+07 0.327 6.1E+07 0.233 5.6E+07 0.211 5.6E+07 0.214
40 9.6E+07 0.186 7.8E+07 0.152 1.0E+08 0.203 1.0E+08 0.201
41 6.7E+07 0.159 4.2E+07 0.099 4.8E+07 0.113 4.8E+07 0.114
42 -1.7E+07 -0.045 -4.3E+07 -0.118 -4.8E+07 -0.130 -4.7E+07 -0.128
43 4.1E+07 0.090 1.6E+07 0.034 2.2E+07 0.049 2.3E+07 0.050
44 1.9E+07 0.246 4.8E+06 0.064 5.6E+06 0.074 2.0E+06 0.026
45 7.4E+07 0.340 3.8E+07 0.174 4.4E+07 0.202 3.3E+07 0.150
46 4.7E+07 0.592 1.7E+07 0.221 3.9E+07 0.498 1.5E+07 0.191
47 -2.0E+07 -0.157 -2.8E+07 -0.217 -3.2E+07 -0.249 -3.1E+07 -0.243
48 2.8E+06 0.044 3.7E+06 0.058 -2.1E+06 -0.033 7.5E+06 0.117
49 1.3E+07 0.164 1.6E+07 0.209 9.1E+06 0.116 1.3E+07 0.168
50 1.0E+08 0.261 1.1E+08 0.272 8.1E+07 0.207 1.1E+08 0.275
MSQ: 1.6E+16 5.4E+15 1.2E+16 1.1E+16
Max: 0.592 0.305 0.498 0.294
Min: -0.509 -0.335 -0.424 -0.394
Mean: 0.048 0.027 0.023 0.017
SD: 0.226 0.159 0.211 0.176
Remark: "*" - errors of predicted price calculated by actual price - (actual floor area x predicted price per area)
Table E-7: Errors and Percentage Errors of Forecasts for the Regressed Models for Nursing Homes
Columns: Case; lnRASEM (Error *, % Error); RASEM (Error *, % Error); RJSEM (Error *, % Error); LRJSEM (Error *, % Error)
1 -1.3E+06 -0.031 -1.0E+06 -0.024 -4.2E+06 -0.098 -3.7E+06 -0.086
2 1.4E+07 0.385 1.4E+07 0.373 1.1E+07 0.308 1.1E+07 0.296
3 -9.8E+06 -0.218 -9.9E+06 -0.221 -1.2E+07 -0.266 -1.2E+07 -0.265
4 -7.1E+06 -0.167 -9.4E+06 -0.222 -8.0E+06 -0.188 -9.1E+06 -0.213
5 -2.4E+06 -0.195 -2.2E+06 -0.179 -2.5E+06 -0.203 -2.0E+06 -0.164
6 1.7E+06 0.075 1.7E+06 0.076 2.0E+06 0.087 1.9E+06 0.083
7 -3.7E+06 -0.181 -3.5E+06 -0.169 -3.1E+06 -0.153 -2.5E+06 -0.123
8 -4.6E+06 -0.219 -4.6E+06 -0.218 -4.4E+06 -0.210 -4.5E+06 -0.215
9 -7.5E+05 -0.018 -2.1E+06 -0.049 -2.8E+06 -0.067 -3.8E+06 -0.089
10 3.1E+06 0.114 3.1E+06 0.113 2.2E+06 0.082 2.2E+06 0.079
11 -4.9E+06 -0.173 -4.3E+06 -0.150 -6.3E+06 -0.219 -5.6E+06 -0.197
12 -3.0E+06 -0.074 -3.7E+06 -0.092 -7.8E+05 -0.019 -1.0E+06 -0.025
13 1.9E+06 0.099 1.7E+06 0.093 1.2E+06 0.063 1.0E+06 0.053
14 6.2E+06 0.382 6.1E+06 0.378 5.2E+06 0.322 5.1E+06 0.318
15 6.0E+06 0.330 5.8E+06 0.316 4.9E+06 0.270 4.6E+06 0.250
16 -2.1E+06 -0.119 -2.2E+06 -0.127 -2.9E+06 -0.166 -3.1E+06 -0.180
17 1.2E+06 0.014 2.2E+06 0.026 -6.7E+05 -0.008 1.7E+05 0.002
18 1.7E+06 0.092 2.7E+06 0.150 2.6E+06 0.142 3.0E+06 0.164
19 2.9E+06 0.113 3.1E+06 0.118 4.3E+06 0.165 4.3E+06 0.165
20 -3.0E+06 -0.134 -2.9E+06 -0.131 -2.8E+06 -0.126 -2.8E+06 -0.126
21 5.2E+06 0.390 5.1E+06 0.380 4.6E+06 0.342 4.3E+06 0.323
22 6.7E+06 0.393 6.5E+06 0.385 6.0E+06 0.356 5.8E+06 0.344
23 -4.6E+06 -0.118 -4.4E+06 -0.114 -3.3E+06 -0.084 -3.1E+06 -0.081
MSQ: 2.7E+13 2.9E+13 2.6E+13 2.6E+13
Max: 0.393 0.385 0.356 0.344
Min: -0.219 -0.222 -0.266 -0.265
Mean: 0.032 0.031 0.014 0.014
SD: 0.215 0.214 0.203 0.197
Remark: "*" - errors of predicted price calculated by actual price - (actual floor area x predicted price per area)
Table E-8: Errors and Percentage Errors of Forecasts for the Regressed Models for Schools
Columns: Case; RJSEM (Error *, % Error); LRJSEM (Error *, % Error); LRASEM (Error *, % Error); RASEM (Error *, % Error)
1 -4.0E+05 -0.039 -5.7E+04 -0.006 -5.4E+05 -0.053 -3.9E+05 -0.038
2 1.3E+06 0.053 1.3E+06 0.056 2.2E+06 0.091 2.4E+06 0.099
3 -1.2E+06 -0.105 -8.2E+05 -0.074 -1.3E+06 -0.117 -1.2E+06 -0.104
4 -5.8E+05 -0.054 -1.0E+06 -0.098 -6.5E+05 -0.061 -1.2E+06 -0.115
5 4.7E+05 0.092 -8.2E+04 -0.016 -1.0E+05 -0.020 -1.1E+05 -0.022
6 3.8E+05 0.057 6.5E+05 0.097 6.7E+05 0.100 4.7E+05 0.070
7 1.2E+06 0.222 1.5E+06 0.292 1.2E+06 0.223 1.4E+06 0.265
8 3.0E+06 0.152 1.7E+06 0.087 1.9E+06 0.094 1.2E+06 0.059
9 -5.8E+06 -0.311 -5.2E+06 -0.280 -5.7E+06 -0.303 -5.9E+06 -0.313
10 3.3E+05 0.046 6.7E+05 0.095 -2.0E+03 0.000 4.9E+05 0.069
11 -5.9E+05 -0.076 1.9E+05 0.024 -3.8E+05 -0.049 -2.1E+05 -0.028
12 1.3E+06 0.139 1.5E+06 0.158 1.2E+06 0.124 1.2E+06 0.126
13 -5.0E+06 -0.341 -4.3E+06 -0.298 -5.1E+06 -0.347 -4.7E+06 -0.324
14 -1.9E+06 -0.188 -1.9E+06 -0.186 -2.1E+06 -0.201 -1.9E+06 -0.185
15 1.5E+06 0.094 9.7E+05 0.062 1.7E+06 0.109 1.1E+06 0.071
16 4.7E+06 0.334 3.7E+06 0.258 3.7E+06 0.264 2.6E+06 0.184
17 -1.8E+06 -0.263 -1.5E+06 -0.222 -9.5E+05 -0.139 -1.1E+06 -0.160
18 3.6E+05 0.062 -5.0E+04 -0.009 -1.7E+05 -0.029 -5.6E+04 -0.010
19 1.7E+06 0.527 1.7E+06 0.520 1.6E+06 0.481 1.7E+06 0.539
20 -1.3E+06 -0.137 -1.5E+06 -0.163 -1.6E+06 -0.168 -1.6E+06 -0.167
21 2.6E+05 0.063 -1.2E+05 -0.028 1.7E+04 0.004 -1.1E+05 -0.026
22 1.3E+06 0.287 6.8E+05 0.147 8.2E+05 0.177 6.5E+05 0.140
23 2.5E+06 0.170 3.8E+06 0.257 4.6E+06 0.314 5.1E+06 0.345
MSQ: 5.2E+12 4.3E+12 5.3E+12 5.0E+12
Max: 0.527 0.520 0.481 0.539
Min: -0.341 -0.298 -0.347 -0.324
Mean: 0.034 0.029 0.021 0.021
SD: 0.208 0.196 0.196 0.201
Remark: "*" - errors of predicted price calculated by actual price - (actual floor area x predicted price per area)
Appendix F: Results of Combining Forecasts
Table F-1: Combined Forecasts for Group 1 Models
Columns (all % Error): Case; JSEM (a); FLOOR AREA (b); CUBE (c); RASEM (d); Combined Forecasts: avg. (a, c & d), avg. (a to d); Min: min. (a to d); Max: max. (a to d)
1 0.040 0.252 0.126 0.283 0.149 0.175 0.040 0.283
2 0.053 0.241 0.053 0.181 0.096 0.132 0.053 0.241
3 -0.001 0.393 0.172 0.170 0.114 0.183 -0.001 0.393
4 0.091 0.236 -0.036 0.165 0.073 0.114 -0.036 0.236
5 0.172 0.258 0.180 0.392 0.248 0.250 0.172 0.392
6 -0.095 0.449 0.160 -0.413 -0.116 0.025 -0.095 0.449
7 0.111 0.257 0.163 0.378 0.217 0.227 0.111 0.378
8 -0.320 -0.362 -0.313 -0.227 -0.287 -0.305 -0.227 -0.362
9 0.062 -0.299 -0.119 0.061 0.001 -0.074 0.061 -0.299
10 -0.419 -0.337 -0.240 -0.149 -0.269 -0.286 -0.149 -0.419
11 -0.303 -0.137 0.023 0.022 -0.086 -0.099 0.022 -0.303
12 -0.166 -0.187 -0.251 -0.109 -0.175 -0.178 -0.109 -0.251
13 0.112 0.049 -0.065 0.185 0.077 0.070 0.049 0.185
14 0.103 0.210 0.148 -0.233 0.006 0.057 0.103 -0.233
15 -0.404 -0.366 -0.292 -0.140 -0.279 -0.301 -0.140 -0.404
16 -0.380 -0.313 -0.369 -0.331 -0.360 -0.348 -0.313 -0.380
17 -0.120 0.182 0.344 -0.013 0.070 0.098 -0.013 0.344
18 0.136 0.260 0.060 0.234 0.143 0.172 0.060 0.260
19 -0.256 -0.253 -0.385 -0.309 -0.316 -0.301 -0.253 -0.385
20 -0.220 -0.339 -0.224 0.030 -0.138 -0.188 0.030 -0.339
21 0.151 0.287 0.076 0.213 0.147 0.182 0.076 0.287
22 -0.105 -0.079 -0.237 -0.145 -0.162 -0.141 -0.079 -0.237
23 -0.373 -0.356 -0.247 -0.046 -0.222 -0.255 -0.046 -0.373
24 -0.052 0.078 -0.105 0.029 -0.043 -0.013 0.029 -0.105
25 0.359 0.380 0.190 0.291 0.280 0.305 0.190 0.380
26 -0.400 -0.388 -0.419 -0.316 -0.378 -0.381 -0.316 -0.419
27 -0.256 -0.203 -0.378 -0.272 -0.302 -0.277 -0.203 -0.378
28 -0.181 -0.150 -0.165 -0.016 -0.121 -0.128 -0.016 -0.181
29 -0.195 -0.001 -0.197 -0.034 -0.142 -0.107 -0.001 -0.197
30 0.052 0.338 0.154 0.274 0.160 0.204 0.052 0.338
31 -0.107 0.433 0.364 0.006 0.087 0.174 0.006 0.433
32 -0.005 -0.042 -0.143 0.113 -0.012 -0.019 -0.005 -0.143
33 -0.122 0.082 -0.023 -0.044 -0.063 -0.027 -0.023 -0.122
34 -0.179 -0.006 -0.097 -0.085 -0.120 -0.092 -0.006 -0.179
35 0.037 0.265 0.025 0.106 0.056 0.108 0.025 0.265
36 -0.170 -0.198 -0.191 0.019 -0.114 -0.135 0.019 -0.198
37 0.606 0.081 0.009 -0.209 0.135 0.122 0.009 0.606
38 0.016 0.400 0.528 0.339 0.294 0.321 0.016 0.528
39 -0.170 0.236 0.332 -0.009 0.051 0.097 -0.009 0.332
40 0.092 0.425 0.366 0.388 0.282 0.318 0.092 0.425
41 -0.181 0.360 0.822 0.008 0.216 0.252 0.008 0.822
42 0.098 0.224 0.272 0.454 0.275 0.262 0.098 0.454
Mean: (0.069) 0.056 0.002 0.030 (0.013) 0.005 (0.017) 0.051
SD: 0.214 0.273 0.270 0.221 0.194 0.207 0.116 0.358
The Mean and SD entries under columns (a) to (d) are labelled (w), (x), (y) and (z).
Average mean (w, x, y & z): 0.005
Average SD (w, x, y & z): 0.245
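The combined columns in the tables of this appendix can be reproduced row by row. The sketch below reads "avg." as the simple mean of the listed percentage errors and, judging from the tabulated rows, reads "min." and "max." as the individual percentage errors with the smallest and largest absolute value, sign retained. That reading is an inference from the data rather than a definition given here, and the function name is illustrative only.

```python
def combine(pct_errors):
    """Return (average, smallest-magnitude, largest-magnitude) % error."""
    average = sum(pct_errors) / len(pct_errors)
    # min/max are taken on absolute value but the signed value is kept
    smallest = min(pct_errors, key=abs)
    largest = max(pct_errors, key=abs)
    return average, smallest, largest


# Cases 8 and 9 of Table F-1: columns (a) JSEM, (b) FLOOR AREA, (c) CUBE, (d) RASEM
case_8 = [-0.320, -0.362, -0.313, -0.227]
case_9 = [0.062, -0.299, -0.119, 0.061]

for label, row in (("8", case_8), ("9", case_9)):
    avg_all, smallest, largest = combine(row)
    # avg. (a, c & d) leaves out the FLOOR AREA column (b)
    avg_acd, _, _ = combine([row[0], row[2], row[3]])
    print(f"Case {label}: avg(a,c,d)={avg_acd:.3f} avg(a-d)={avg_all:.3f} "
          f"min={smallest:.3f} max={largest:.3f}")
```

Averaging the percentage errors is equivalent to averaging the forecasts themselves and then computing one percentage error, since all forecasts for a case share the same actual price.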
Table F-2: Combined Forecasts for Group 2 Models
Columns (all % Error): Case; JSEM (a); FLOOR AREA (b); CUBE (c); RJSEM (d); Combined Forecasts: avg. (a, c & d), avg. (a to d); Min: min. (a to d); Max: max. (a to d)
1 0.040 0.252 0.126 0.240 0.135 0.164 0.040 0.252
2 0.053 0.241 0.053 0.199 0.102 0.137 0.053 0.241
3 -0.001 0.393 0.172 0.213 0.128 0.194 -0.001 0.393
4 0.091 0.236 -0.036 0.191 0.082 0.121 -0.036 0.236
5 0.172 0.258 0.180 0.327 0.226 0.234 0.172 0.327
6 -0.095 0.449 0.160 -0.430 -0.122 0.021 -0.095 0.449
7 0.111 0.257 0.163 0.338 0.204 0.217 0.111 0.338
8 -0.320 -0.362 -0.313 -0.285 -0.306 -0.320 -0.285 -0.362
9 0.062 -0.299 -0.119 0.211 0.051 -0.036 0.062 -0.299
10 -0.419 -0.337 -0.240 -0.138 -0.266 -0.284 -0.138 -0.419
11 -0.303 -0.137 0.023 -0.020 -0.100 -0.109 -0.020 -0.303
12 -0.166 -0.187 -0.251 -0.161 -0.193 -0.191 -0.161 -0.251
13 0.112 0.049 -0.065 0.143 0.063 0.060 0.049 0.143
14 0.103 0.210 0.148 -0.215 0.012 0.061 0.103 -0.215
15 -0.404 -0.366 -0.292 -0.129 -0.275 -0.298 -0.129 -0.404
16 -0.380 -0.313 -0.369 -0.337 -0.362 -0.350 -0.313 -0.380
17 -0.120 0.182 0.344 -0.496 -0.090 -0.022 -0.120 -0.496
18 0.136 0.260 0.060 0.224 0.140 0.170 0.060 0.260
19 -0.256 -0.253 -0.385 -0.326 -0.322 -0.305 -0.253 -0.385
20 -0.220 -0.339 -0.224 0.013 -0.143 -0.192 0.013 -0.339
21 0.151 0.287 0.076 0.178 0.135 0.173 0.076 0.287
22 -0.105 -0.079 -0.237 -0.177 -0.173 -0.150 -0.079 -0.237
23 -0.373 -0.356 -0.247 -0.005 -0.208 -0.245 -0.005 -0.373
24 -0.052 0.078 -0.105 -0.001 -0.053 -0.020 -0.001 -0.105
25 0.359 0.380 0.190 0.041 0.197 0.243 0.041 0.380
26 -0.400 -0.388 -0.419 -0.334 -0.384 -0.385 -0.334 -0.419
27 -0.256 -0.203 -0.378 -0.274 -0.303 -0.278 -0.203 -0.378
28 -0.181 -0.150 -0.165 -0.065 -0.137 -0.141 -0.065 -0.181
29 -0.195 -0.001 -0.197 0.010 -0.127 -0.096 -0.001 -0.197
30 0.052 0.338 0.154 0.533 0.246 0.269 0.052 0.533
31 -0.107 0.433 0.364 0.180 0.146 0.217 -0.107 0.433
32 -0.005 -0.042 -0.143 0.065 -0.028 -0.031 -0.005 -0.143
33 -0.122 0.082 -0.023 -0.027 -0.057 -0.023 -0.023 -0.122
34 -0.179 -0.006 -0.097 -0.065 -0.114 -0.087 -0.006 -0.179
35 0.037 0.265 0.025 0.138 0.067 0.116 0.025 0.265
36 -0.170 -0.198 -0.191 -0.142 -0.168 -0.175 -0.142 -0.198
37 0.606 0.081 0.009 -0.057 0.186 0.160 0.009 0.606
38 0.016 0.400 0.528 0.296 0.280 0.310 0.016 0.528
39 -0.170 0.236 0.332 0.271 0.144 0.167 -0.170 0.332
40 0.092 0.425 0.366 0.311 0.256 0.298 0.092 0.425
41 -0.181 0.360 0.822 0.627 0.423 0.407 -0.181 0.822
42 0.098 0.224 0.272 0.224 0.198 0.204 0.098 0.272
Mean: (0.069) 0.056 0.002 0.031 (0.012) 0.005 (0.043) 0.027
SD: 0.214 0.273 0.270 0.254 0.203 0.213 0.122 0.362
The Mean and SD entries under columns (a) to (d) are labelled (w), (x), (y) and (z).
Average mean (w, x, y & z): 0.005
Average SD (w, x, y & z): 0.253
Table F-3: Combined Forecasts for Group 3 Models
Columns (all % Error): Case; JSEM (a); FLOOR AREA (b); CUBE (c); RASEM (d); Combined Forecasts: avg. (c & d), avg. (a to d); Min: min. (a to d); Max: max. (a to d)
1 -0.172 -0.267 -0.254 -0.241 -0.247 -0.233 -0.172 -0.267
2 -0.229 -0.332 -0.319 -0.335 -0.327 -0.304 -0.229 -0.335
3 -0.436 -0.294 -0.231 -0.152 -0.191 -0.278 -0.152 -0.436
4 -0.304 -0.310 -0.173 0.248 0.038 -0.135 -0.173 -0.310
5 0.198 0.211 0.165 0.054 0.109 0.157 0.054 0.211
6 -0.214 -0.127 -0.023 0.095 0.036 -0.067 -0.023 -0.214
7 -0.398 -0.275 -0.204 -0.118 -0.161 -0.249 -0.118 -0.398
8 -0.391 -0.187 -0.150 -0.081 -0.115 -0.202 -0.081 -0.391
9 0.021 -0.071 -0.015 0.038 0.011 -0.007 -0.015 -0.071
10 -0.422 -0.268 -0.217 -0.035 -0.126 -0.235 -0.035 -0.422
11 -0.354 -0.352 -0.359 -0.146 -0.252 -0.303 -0.146 -0.359
12 -0.062 -0.006 -0.007 0.222 0.107 0.037 -0.006 0.222
13 -0.431 -0.312 -0.229 -0.045 -0.137 -0.254 -0.045 -0.431
14 -0.282 -0.287 -0.235 -0.127 -0.181 -0.233 -0.127 -0.287
15 -0.266 -0.142 -0.040 0.249 0.105 -0.050 -0.040 -0.266
16 0.425 0.313 0.318 0.189 0.253 0.311 0.189 0.425
17 -0.275 -0.004 0.047 0.132 0.090 -0.025 -0.004 -0.275
18 0.142 0.086 0.067 0.010 0.038 0.076 0.010 0.142
19 -0.126 0.130 0.141 0.145 0.143 0.073 -0.126 0.145
20 -0.367 -0.253 -0.177 -0.051 -0.114 -0.212 -0.051 -0.367
21 -0.129 -0.193 -0.095 0.030 -0.032 -0.097 0.030 -0.193
22 -0.121 -0.150 -0.172 -0.153 -0.162 -0.149 -0.121 -0.172
23 0.266 0.291 0.289 0.149 0.219 0.249 0.149 0.291
24 -0.010 -0.017 -0.022 -0.148 -0.085 -0.049 -0.010 -0.148
25 -0.247 -0.115 -0.211 -0.171 -0.191 -0.186 -0.115 -0.247
26 -0.177 0.095 0.098 -0.072 0.013 -0.014 -0.072 -0.177
27 -0.245 0.009 -0.045 -0.068 -0.056 -0.087 0.009 -0.245
28 0.372 0.264 0.255 0.243 0.249 0.283 0.243 0.372
29 -0.024 0.132 0.126 -0.031 0.047 0.051 -0.024 0.132
30 -0.393 -0.249 -0.173 -0.143 -0.158 -0.240 -0.143 -0.393
31 0.165 0.211 0.190 0.184 0.187 0.188 0.165 0.211
32 0.322 0.340 0.282 0.305 0.293 0.312 0.282 0.340
33 -0.214 -0.187 -0.151 -0.200 -0.176 -0.188 -0.151 -0.214
34 0.123 0.162 0.160 0.128 0.144 0.143 0.123 0.162
35 0.051 0.031 0.034 0.003 0.019 0.030 0.003 0.051
36 -0.118 -0.008 0.028 -0.095 -0.033 -0.048 -0.008 -0.118
37 0.323 0.336 0.194 -0.035 0.079 0.204 -0.035 0.336
38 0.456 0.361 0.300 0.171 0.236 0.322 0.171 0.456
39 0.464 0.333 0.275 0.233 0.254 0.326 0.233 0.464
40 0.459 0.348 0.288 0.152 0.220 0.312 0.152 0.459
41 0.339 0.241 0.186 0.099 0.143 0.216 0.099 0.339
42 0.027 -0.037 -0.080 -0.118 -0.099 -0.052 0.027 -0.118
43 0.308 0.170 0.119 0.034 0.077 0.158 0.034 0.308
44 0.166 0.139 0.070 0.064 0.067 0.110 0.064 0.166
45 0.383 0.317 0.236 0.174 0.205 0.277 0.174 0.383
46 0.270 0.445 0.289 0.221 0.255 0.306 0.221 0.445
47 -0.296 -0.219 -0.238 -0.217 -0.227 -0.242 -0.217 -0.296
48 -0.223 -0.053 -0.004 0.058 0.027 -0.055 -0.004 -0.223
49 -0.186 0.054 0.084 0.209 0.146 0.040 0.054 0.209
50 0.463 0.352 0.317 0.272 0.295 0.351 0.272 0.463
Mean: (0.027) 0.013 0.015 0.027 0.021 0.007 0.006 (0.013)
SD: 0.290 0.235 0.196 0.159 0.167 0.205 0.133 0.307
The Mean and SD entries under columns (a) to (d) are labelled (w), (x), (y) and (z).
Average mean (w, x, y & z): 0.007
Average SD (w, x, y & z): 0.220
Table F-4: Combined Forecasts for Group 4 Models
Columns (all % Error): Case; JSEM (a); FLOOR AREA (b); CUBE (c); RJSEM (d); Combined Forecasts: avg. (b, c & d), avg. (a to d); Min: min. (a to d); Max: max. (a to d)
1 -0.172 -0.267 -0.254 -0.283 -0.268 -0.244 -0.172 -0.283
2 -0.229 -0.332 -0.319 -0.366 -0.339 -0.312 -0.229 -0.366
3 -0.436 -0.294 -0.231 -0.215 -0.247 -0.294 -0.215 -0.436
4 -0.304 -0.310 -0.173 -0.246 -0.243 -0.258 -0.173 -0.310
5 0.198 0.211 0.165 -0.009 0.122 0.141 -0.009 0.211
6 -0.214 -0.127 -0.023 0.244 0.031 -0.030 -0.023 0.244
7 -0.398 -0.275 -0.204 -0.201 -0.227 -0.269 -0.201 -0.398
8 -0.391 -0.187 -0.150 -0.095 -0.144 -0.206 -0.095 -0.391
9 0.021 -0.071 -0.015 0.022 -0.022 -0.011 -0.015 -0.071
10 -0.422 -0.268 -0.217 -0.130 -0.205 -0.259 -0.130 -0.422
11 -0.354 -0.352 -0.359 -0.509 -0.407 -0.393 -0.352 -0.509
12 -0.062 -0.006 -0.007 -0.060 -0.024 -0.034 -0.006 -0.062
13 -0.431 -0.312 -0.229 -0.240 -0.260 -0.303 -0.229 -0.431
14 -0.282 -0.287 -0.235 -0.217 -0.246 -0.255 -0.217 -0.287
15 -0.266 -0.142 -0.040 -0.047 -0.076 -0.124 -0.040 -0.266
16 0.425 0.313 0.318 0.285 0.305 0.335 0.285 0.425
17 -0.275 -0.004 0.047 0.119 0.054 -0.028 -0.004 -0.275
18 0.142 0.086 0.067 0.047 0.067 0.085 0.047 0.142
19 -0.126 0.130 0.141 0.278 0.183 0.106 -0.126 0.278
20 -0.367 -0.253 -0.177 -0.195 -0.208 -0.248 -0.177 -0.367
21 -0.129 -0.193 -0.095 -0.160 -0.149 -0.144 -0.095 -0.193
22 -0.121 -0.150 -0.172 -0.107 -0.143 -0.137 -0.107 -0.172
23 0.266 0.291 0.289 0.368 0.316 0.303 0.266 0.368
24 -0.010 -0.017 -0.022 -0.020 -0.020 -0.017 -0.010 -0.022
25 -0.247 -0.115 -0.211 -0.013 -0.113 -0.146 -0.013 -0.247
26 -0.177 0.095 0.098 0.188 0.127 0.051 0.095 0.188
27 -0.245 0.009 -0.045 0.111 0.025 -0.043 0.009 -0.245
28 0.372 0.264 0.255 0.306 0.275 0.299 0.255 0.372
29 -0.024 0.132 0.126 0.122 0.127 0.089 -0.024 0.132
30 -0.393 -0.249 -0.173 -0.181 -0.201 -0.249 -0.173 -0.393
31 0.165 0.211 0.190 0.153 0.185 0.180 0.153 0.211
32 0.322 0.340 0.282 0.448 0.357 0.348 0.282 0.448
33 -0.214 -0.187 -0.151 -0.067 -0.135 -0.155 -0.067 -0.214
34 0.123 0.162 0.160 0.255 0.192 0.175 0.123 0.255
35 0.051 0.031 0.034 0.080 0.048 0.049 0.031 0.080
36 -0.118 -0.008 0.028 0.098 0.039 0.000 -0.008 -0.118
37 0.323 0.336 0.194 0.258 0.263 0.278 0.194 0.336
38 0.456 0.361 0.300 0.192 0.285 0.327 0.192 0.456
39 0.464 0.333 0.275 0.327 0.312 0.350 0.275 0.464
40 0.459 0.348 0.288 0.186 0.274 0.320 0.186 0.459
41 0.339 0.241 0.186 0.159 0.196 0.231 0.159 0.339
42 0.027 -0.037 -0.080 -0.045 -0.054 -0.034 0.027 -0.080
43 0.308 0.170 0.119 0.090 0.126 0.172 0.090 0.308
44 0.166 0.139 0.070 0.246 0.151 0.155 0.070 0.246
45 0.383 0.317 0.236 0.340 0.298 0.319 0.236 0.383
46 0.270 0.445 0.289 0.592 0.442 0.399 0.270 0.592
47 -0.296 -0.219 -0.238 -0.157 -0.205 -0.227 -0.157 -0.296
48 -0.223 -0.053 -0.004 0.044 -0.004 -0.059 -0.004 -0.223
49 -0.186 0.054 0.084 0.164 0.100 0.029 0.054 -0.186
50 0.463 0.352 0.317 0.261 0.310 0.349 0.261 0.463
Mean: (0.027) 0.013 0.015 0.048 0.025 0.012 0.010 0.003
SD: 0.290 0.235 0.196 0.226 0.214 0.226 0.166 0.324
The Mean and SD entries under columns (a) to (d) are labelled (w), (x), (y) and (z).
Average mean (w, x, y & z): 0.012
Average SD (w, x, y & z): 0.237
Table F-5: Combined Forecasts for Group 5 Models
Columns (all % Error): Case; JSEM (a); FLOOR AREA (b); CUBE (c); RASEM (d); Combined Forecasts: avg. (a to d); Min: min. (a to d); Max: max. (a to d)
1 -0.080 -0.070 0.041 -0.024 -0.033 -0.024 -0.080
2 0.286 0.256 0.303 0.373 0.305 0.256 0.373
3 -0.228 -0.167 -0.064 -0.221 -0.170 -0.064 -0.228
4 -0.125 0.004 0.282 -0.222 -0.015 0.004 0.282
5 -0.220 -0.188 -0.092 -0.179 -0.170 -0.092 -0.220
6 0.004 -0.013 -0.024 0.076 0.011 0.004 0.076
7 -0.044 -0.287 -0.305 -0.169 -0.201 -0.044 -0.305
8 -0.253 -0.239 -0.149 -0.218 -0.215 -0.149 -0.253
9 -0.011 0.094 -0.084 -0.049 -0.013 -0.011 0.094
10 0.021 0.066 0.039 0.113 0.060 0.021 0.113
11 -0.169 -0.092 -0.258 -0.150 -0.167 -0.092 -0.258
12 0.091 0.271 0.238 -0.092 0.127 0.091 0.271
13 0.095 0.041 0.240 0.093 0.117 0.041 0.240
14 0.397 0.519 0.262 0.378 0.389 0.262 0.519
15 0.176 0.341 0.502 0.316 0.334 0.176 0.502
16 -0.153 -0.088 0.177 -0.127 -0.048 -0.088 0.177
17 0.036 -0.233 -0.381 0.026 -0.138 0.026 -0.381
18 0.377 0.471 0.391 0.150 0.347 0.150 0.471
19 0.044 -0.005 -0.059 0.118 0.025 -0.005 0.118
20 -0.191 -0.170 0.025 -0.131 -0.117 0.025 -0.191
21 0.242 0.370 0.395 0.380 0.347 0.242 0.395
22 0.322 0.315 0.206 0.385 0.307 0.206 0.385
23 -0.138 -0.229 -0.362 -0.114 -0.211 -0.114 -0.362
Mean: 0.021 0.042 0.058 0.031 0.038 0.036 0.075
SD: 0.200 0.245 0.252 0.214 0.207 0.124 0.300
The Mean and SD entries under columns (a) to (d) are labelled (w), (x), (y) and (z).
Average mean (w, x, y & z): 0.038
Average SD (w, x, y & z): 0.228
Table F-6: Combined Forecasts for Group 6 Models
Case  JSEM      FLOOR     CUBE      RJSEM     Combined       Min            Max
                AREA                          Forecasts
      (a)       (b)       (c)       (d)       avg. (a to d)  min. (a to d)  max. (a to d)
      % Error   % Error   % Error   % Error   % Error        % Error        % Error
1     -0.080    -0.070     0.041    -0.031    -0.035         -0.031         -0.080
2      0.286     0.256     0.303     0.385     0.307          0.256          0.385
3     -0.228    -0.167    -0.064    -0.218    -0.169         -0.064         -0.228
4     -0.125     0.004     0.282    -0.167    -0.002          0.004          0.282
5     -0.220    -0.188    -0.092    -0.195    -0.174         -0.092         -0.220
6      0.004    -0.013    -0.024     0.075     0.010          0.004          0.075
7     -0.044    -0.287    -0.305    -0.181    -0.204         -0.044         -0.305
8     -0.253    -0.239    -0.149    -0.219    -0.215         -0.149         -0.253
9     -0.011     0.094    -0.084    -0.018    -0.005         -0.011          0.094
10     0.021     0.066     0.039     0.114     0.060          0.021          0.114
11    -0.169    -0.092    -0.258    -0.173    -0.173         -0.092         -0.258
12     0.091     0.271     0.238    -0.074     0.132         -0.074          0.271
13     0.095     0.041     0.240     0.099     0.119          0.041          0.240
14     0.397     0.519     0.262     0.382     0.390          0.262          0.519
15     0.176     0.341     0.502     0.330     0.337          0.176          0.502
16    -0.153    -0.088     0.177    -0.119    -0.046         -0.088          0.177
17     0.036    -0.233    -0.381     0.014    -0.141          0.014         -0.381
18     0.377     0.471     0.391     0.092     0.333          0.092          0.471
19     0.044    -0.005    -0.059     0.113     0.023         -0.005          0.113
20    -0.191    -0.170     0.025    -0.134    -0.118          0.025         -0.191
21     0.242     0.370     0.395     0.390     0.349          0.242          0.395
22     0.322     0.315     0.206     0.393     0.309          0.206          0.393
23    -0.138    -0.229    -0.362    -0.118    -0.212         -0.118         -0.362
Mean:  0.021     0.042     0.058     0.032     0.038          0.025          0.076
SD:    0.200     0.245     0.252     0.215     0.208          0.124          0.301
      (w)       (x)       (y)       (z)
Average mean (w, x, y & z): 0.038
Average SD (w, x, y & z): 0.228
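
As an illustration of how the three right-hand columns relate to columns (a) to (d), the following minimal Python sketch recomputes them for Case 1 of Table F-6. It assumes, from inspection of the tabulated rows rather than from any stated definition, that "Min" and "Max" are the signed errors with the smallest and largest absolute magnitude. The function name `combine_errors` is hypothetical.

```python
from statistics import mean


def combine_errors(errors):
    """Summarise the percentage errors of several models for one case.

    Returns (average, smallest-magnitude error, largest-magnitude error).
    Assumption: min/max are chosen by absolute value, keeping the sign,
    which is what the tabulated rows appear to show.
    """
    combined = round(mean(errors), 3)   # avg. (a to d), to 3 decimals
    smallest = min(errors, key=abs)     # min. (a to d)
    largest = max(errors, key=abs)      # max. (a to d)
    return combined, smallest, largest


# Case 1 of Table F-6: errors of JSEM, FLOOR AREA, CUBE and RJSEM
case_1 = [-0.080, -0.070, 0.041, -0.031]
print(combine_errors(case_1))
```

For this case the sketch reproduces the tabulated values of -0.035, -0.031 and -0.080.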
Table F-7: Combined Forecasts for Group 7 Models
Case  JSEM      FLOOR     CUBE      RASEM     Combined       Min            Max
                AREA                          Forecasts
      (a)       (b)       (c)       (d)       avg. (a to d)  min. (a to d)  max. (a to d)
      % Error   % Error   % Error   % Error   % Error        % Error        % Error
1     -0.086    -0.083    -0.040    -0.006    -0.054         -0.006         -0.086
2      0.018     0.116     0.215     0.056     0.101          0.018          0.215
3     -0.148    -0.145    -0.105    -0.074    -0.118         -0.074         -0.148
4     -0.114    -0.120    -0.196    -0.098    -0.132         -0.098         -0.196
5      0.071     0.132     0.097    -0.016     0.071         -0.016          0.132
6     -0.062    -0.062    -0.122     0.097    -0.037         -0.062         -0.122
7      0.169     0.116     0.224     0.292     0.200          0.116          0.292
8      0.045     0.034    -0.042     0.087     0.031          0.034          0.087
9     -0.374    -0.390    -0.427    -0.280    -0.368         -0.280         -0.427
10     0.122     0.042     0.265     0.095     0.131          0.042          0.265
11    -0.118    -0.172    -0.113     0.024    -0.095          0.024         -0.172
12     0.110     0.092     0.099     0.158     0.115          0.092          0.158
13    -0.362    -0.369    -0.278    -0.298    -0.327         -0.278         -0.369
14    -0.071    -0.106    -0.187    -0.186    -0.137         -0.071         -0.187
15     0.095     0.126     0.019     0.062     0.076          0.019          0.126
16     0.039     0.071    -0.064     0.258     0.076          0.039          0.258
17     0.326     0.305     0.203    -0.222     0.153          0.203          0.326
18     0.178     0.138     0.172    -0.009     0.120         -0.009          0.178
19     0.422     0.407     0.588     0.520     0.484          0.407          0.588
20    -0.061    -0.121    -0.130    -0.163    -0.119         -0.061         -0.163
21    -0.030    -0.004    -0.093    -0.028    -0.039         -0.004         -0.093
22     0.344     0.342     0.167     0.147     0.250          0.147          0.344
23     0.427     0.420     0.567     0.257     0.418          0.257          0.567
Mean:  0.041     0.033     0.036     0.029     0.035          0.019          0.068
SD:    0.212     0.214     0.246     0.196     0.202          0.150          0.274
      (w)       (x)       (y)       (z)
Average mean (w, x, y & z): 0.035
Average SD (w, x, y & z): 0.217
Table F-8: Combined Forecasts for Group 8 Models
Case  JSEM      FLOOR     CUBE      RJSEM     Combined       Min            Max
                AREA                          Forecasts
      (a)       (b)       (c)       (d)       avg. (a to d)  min. (a to d)  max. (a to d)
      % Error   % Error   % Error   % Error   % Error        % Error        % Error
1     -0.086    -0.083    -0.040    -0.039    -0.062         -0.039         -0.086
2      0.018     0.116     0.215     0.053     0.100          0.018          0.215
3     -0.148    -0.145    -0.105    -0.105    -0.126         -0.105         -0.148
4     -0.114    -0.120    -0.196    -0.054    -0.121         -0.054         -0.196
5      0.071     0.132     0.097     0.092     0.098          0.071          0.132
6     -0.062    -0.062    -0.122     0.057    -0.047          0.057         -0.122
7      0.169     0.116     0.224     0.222     0.183          0.116          0.224
8      0.045     0.034    -0.042     0.152     0.047          0.034          0.152
9     -0.374    -0.390    -0.427    -0.311    -0.375         -0.311         -0.427
10     0.122     0.042     0.265     0.046     0.119          0.042          0.265
11    -0.118    -0.172    -0.113    -0.076    -0.120         -0.076         -0.172
12     0.110     0.092     0.099     0.139     0.110          0.092          0.139
13    -0.362    -0.369    -0.278    -0.341    -0.337         -0.278         -0.369
14    -0.071    -0.106    -0.187    -0.188    -0.138         -0.071         -0.188
15     0.095     0.126     0.019     0.094     0.084          0.019          0.126
16     0.039     0.071    -0.064     0.334     0.095          0.039          0.334
17     0.326     0.305     0.203    -0.263     0.143          0.203          0.326
18     0.178     0.138     0.172     0.062     0.137          0.062          0.178
19     0.422     0.407     0.588     0.527     0.486          0.407          0.588
20    -0.061    -0.121    -0.130    -0.137    -0.112         -0.061         -0.137
21    -0.030    -0.004    -0.093     0.063    -0.016         -0.004         -0.093
22     0.344     0.342     0.167     0.287     0.285          0.167          0.344
23     0.427     0.420     0.567     0.170     0.396          0.170          0.567
Mean:  0.041     0.033     0.036     0.034     0.036          0.022          0.072
SD:    0.212     0.214     0.246     0.208     0.204          0.150          0.274
      (w)       (x)       (y)       (z)
Average mean (w, x, y & z): 0.036