
Journal of Property Tax Assessment & Administration • Volume 7, Issue 3

This article evaluates the new automated valuation model (AVM), geographically weighted regression (GWR), which incorporates a geospatial factor, the parcel’s x-y coordinates obtained from the geographic information system (GIS), in the model specification. It also introduces and evaluates a variant of GWR called geographic-attribute weighted regression (GAWR), which incorporates the parcel’s similarity to surrounding parcels as well as its location. The study used an existing research dataset (Moore 2006) to compare value estimates obtained through incorporation of parcel x-y coordinates in the AVM model specification with estimates produced by commonly used AVMs that do not include a geospatial factor. The findings were extended and validated by applying the new methodology to datasets from the City of Norfolk and Fairfax County, Virginia. The study was conducted using a rigorous experimental design with statistical hypothesis testing, frequently missing in papers reporting comparison of results produced from various AVMs.

Using Geographic-attribute Weighted Regression for CAMA Modeling

BY J. WAYNE MOORE, PH.D., AND JOSHUA MYERS

J. Wayne Moore, Ph.D., is an independent researcher. For more than three decades, he developed property appraisal software solutions and participated either directly or indirectly in the implementation of CAMA systems in more than 300 assessment jurisdictions.

Joshua Myers is a Real Estate CAMA Modeler Analyst with the City of Norfolk, Virginia. He holds a master of science in statistics from the University of Virginia.

The paper on which this article is based was presented September 1, 2010, at the International Association of Assessing Officers’ 76th Annual International Conference on Assessment Administration in Orlando, Florida. It expands upon the paper presented in Little Rock, Arkansas, on March 9, 2010, at the 2010 GIS/CAMA Technologies Conference sponsored by IAAO and the Urban and Regional Information Systems Association (URISA).

In the past two decades, much attention has been focused upon integrating geographic information systems and computer-assisted mass appraisal (CAMA). An important question deserving research attention is the relative importance of each parcel’s x-y coordinates in the specification of the AVM. When x-y coordinates are not available, a neighborhood adjustment factor variable must apply a value uniformly across each specifically delineated neighborhood, potentially resulting in sharp changes in value estimates at neighborhood boundaries and less uniformity. Theoretically, a model that incorporates the exact location of a given parcel should produce more accurate value estimates than a model that does not, but would a statistically significant difference exist in the real world? This research offers empirical evidence in answering the question.

Research seeks to answer questions, but in the process, new questions are discovered that also need answers. Recently published and potentially applicable literature in other disciplines has been studied and evaluated from the perspective of potential application in property appraisal. Much work has been done outside of the appraisal field by scholars interested in predicting, estimating, and forecasting data points, including home transaction prices; their research may have direct application to the appraisal process. One such methodology, geographically weighted regression, is the primary subject of this research.

Research Questions and Definitions

Assessors have a variety of tools available for performing their duty of accurately and uniformly estimating the market value of properties within their jurisdictions. The accuracy with which they estimate market value directly affects tax equity among property owners. Because assessors have a variety of market value estimating methodologies from which to choose, as well as limited resources available for performing their duties, they need access to objective, independent information and comparative analyses of the relative performance of the available automated valuation models to ascertain best practice. The research questions developed to address this need were:

1. To what extent do measures of equity/accuracy differ among available methodologies for estimating the market value of single-family homes?

2. Does the use of parcel coordinates from GIS in the market value estimating model specification improve the measure of equity/accuracy by a statistically significant amount?

3. Does the addition of attribute weighting improve the measure of equity/accuracy of the market value estimating model by a statistically significant amount over standard GWR?

Key Metrics

Following are descriptions of the key measurements evaluated in this research.

Measure of equity is an objective statistic designed to provide an indication of property tax equity and assessment accuracy. The measures for this research were coefficient of dispersion (COD) for horizontal equity and quintile mean ratio (QMR) for vertical equity.

Horizontal inequities are differences in effective tax rates among properties having similar market values in the same or similar neighborhoods, based upon the relationship between the actual tax dollar amount levied upon each property and its market value. Horizontal inequities are primarily measured by the coefficient of dispersion.

Vertical inequities are differences in effective tax rates between groups of properties based upon their relative value ranges. For example, if higher-priced properties as a group have a different effective tax rate than lower-priced properties as a group, the condition of vertical inequity exists. Vertical inequity will influence the value of the COD, but additional measures are required to obtain a more precise evaluation.

Coefficient of dispersion (COD) is the average absolute deviation of calculated sale ratios from their median expressed as a percentage of the median (IAAO 1997, 26). Larger COD values indicate diminished uniformity.
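The COD definition above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the study; the function name `coefficient_of_dispersion` and the five appraised values and sale prices are hypothetical.

```python
from statistics import median


def coefficient_of_dispersion(appraised, sale_prices):
    """Average absolute deviation of the A/S ratios from their median,
    expressed as a percentage of the median."""
    ratios = [a / s for a, s in zip(appraised, sale_prices)]
    med = median(ratios)
    avg_abs_dev = sum(abs(r - med) for r in ratios) / len(ratios)
    return 100.0 * avg_abs_dev / med


# Hypothetical appraised values and sale prices (not data from the study).
appraised = [95_000, 110_000, 100_000, 120_000, 90_000]
sale_prices = [100_000, 100_000, 100_000, 100_000, 100_000]

cod = coefficient_of_dispersion(appraised, sale_prices)
print(round(cod, 2))
```

With these illustrative numbers the ratios are 0.95, 1.10, 1.00, 1.20, and 0.90, the median ratio is 1.00, and the average absolute deviation is 0.09, so the COD is 9.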

Quintile mean ratio (QMR) is the average of the appraised value to sale price ratios in each one-fifth grouping of the ratios being investigated after the ratios have been sorted from lowest sale price to highest sale price and divided into five equal sale price groups.

Vertical equity index (VEI) is the absolute value of the difference between the highest and lowest of the five quintile mean ratios within a study group divided by the mean of the five QMRs, and then multiplied by 100 (Moore 2008). Lower VEI values indicate better vertical equity.
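The QMR and VEI definitions can be illustrated with a short Python sketch. The function names and the ten sales below are hypothetical, and the sketch assumes a sample size divisible by five so that the quintiles have equal counts.

```python
def quintile_mean_ratios(appraised, sale_prices):
    """Sort sales by sale price, split them into five equal-count groups,
    and return the mean A/S ratio of each group (lowest prices first)."""
    pairs = sorted(zip(sale_prices, appraised))
    n = len(pairs)
    if n == 0 or n % 5 != 0:
        raise ValueError("this sketch assumes a sample size divisible by five")
    size = n // 5
    qmrs = []
    for q in range(5):
        group = pairs[q * size:(q + 1) * size]
        qmrs.append(sum(a / s for s, a in group) / size)
    return qmrs


def vertical_equity_index(qmrs):
    """|highest QMR - lowest QMR| divided by the mean QMR, times 100."""
    return 100.0 * abs(max(qmrs) - min(qmrs)) / (sum(qmrs) / len(qmrs))


# Hypothetical sample in which lower-priced homes are over-appraised.
sale_prices = [100_000, 200_000, 300_000, 400_000, 500_000,
               600_000, 700_000, 800_000, 900_000, 1_000_000]
appraised = [110_000, 220_000, 315_000, 420_000, 500_000,
             600_000, 665_000, 760_000, 810_000, 900_000]

qmrs = quintile_mean_ratios(appraised, sale_prices)
vei = vertical_equity_index(qmrs)
print([round(q, 2) for q in qmrs], round(vei, 1))
```

In this made-up sample the five QMRs fall steadily from 1.10 to 0.90, their mean is 1.00, and the VEI is 20, which would signal a regressive pattern.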

Hedonic Models, Econometric Models, AVMs, and GWR

The term model as it relates to appraisal is defined in the Glossary of Property Appraisal and Assessment (IAAO 1997) as “a representation (in words or an equation) that explains the relationship between value or estimated sale price and variables representing factors of supply and demand.” (p. 88) In a recent article in the Journal of Property Tax Assessment & Administration, the relationships that exist between the economic supply and demand functions and the general assessment model were explained in detail (Moore 2009). Court (1939) has been credited with originating the idea of the hedonic price estimating model, but the concept actually originated earlier in the form of scientific appraising in the work of Zangerle (1924); Pollock and Scholz (1926); and Prouty, Collins, and Prouty (1930). Without using the term hedonic, Jensen (1931) defined a constructive market value as “one constructed synthetically by taking all the factors affecting value into account so that it shall approximate as closely as possible what the market value would be could one be ascertained.” (p. 450) The scientific appraisal methodology and constructive market value that Jensen described amounted to what is now known as the cost method.

Hedonic models decompose the price of an item into separate components that determine the total price when summed. A familiar application is the cost method. This use is among the more complex, but effectiveness is not synonymous with complexity. All models used in the appraisal of real estate are hedonic, the scholarly term for this broad category of models.

Another term found in the literature is econometric model. According to Kennedy (1998), a generally accepted definition does not exist. He offered, however, that econometricians “…are theoretical statisticians, applying their skills to the development of statistical techniques appropriate to the empirical problems characterizing the science of economics.” (p. 1) Many models that are based on the generalized linear model in statistics, such as various regression models used in appraisal, are types of econometric models, which also are part of the broader classification of hedonic models. Automated valuation models (AVMs) comprise a narrower set of econometric models used specifically to estimate property values and selling prices.

Computer-assisted mass appraisal (CAMA) is the term applied to computer software that incorporates AVMs and is used by assessors to assist in managing and performing their property valuation duties. The more common AVMs used in CAMA systems are the traditional cost method, comparable sales method, multiple regression analysis, adaptive estimating procedure (also referred to as feedback), and the transportable cost-specified market (also called market-calibrated cost).

An earlier study (Moore 2006) evaluated the performance of four of these AVMs without considering spatially dependent variables beyond the neighborhood variable. That study recommended future research to evaluate the impact of adding a precise geospatial factor to the same characteristics dataset to determine whether a statistically significant improvement in equity would result. The current study evaluates a new AVM, geographically weighted regression, which incorporates a parcel’s x-y coordinates obtained from GIS. It also introduces and tests a potential variant of GWR, geographic-attribute weighted regression, which incorporates both a parcel’s x-y coordinates and its similarity to surrounding parcels.


Literature Review and Description of GWR

Comparing Performance of AVMs

This research is an extension of the research reported by Moore (2006) and uses the same data as that study. The purpose of the 2006 study was to compare the performance of the primary automated valuation models used in computer-assisted mass appraisal. The models were tested in a controlled experiment in which nine experienced modelers with access to the same sales dataset constructed models to predict the next year’s sales prices. From a population of 22,785 parcels, a total of 5,546 jurisdiction-validated sales from the period 1999–2003, including characteristics as they were at the time of the sale, were made available to participants for use in model development. Each modeler was free to choose as many or as few of the historical sales as desired and to use their favorite software. Once constructed, their models were used to blindly estimate the selling prices of the 1,299 jurisdiction-validated 2004 sales as an out-of-sample test. All 1,299 sales were included in testing the resultant value predictions; that is, no outliers were eliminated. None of the participants had information on current or prior assessed values for any of the parcels including the 5,546 available for model building. They did not know the jurisdiction from which the data were extracted, and they did not know the identity of the other participants.

The process of estimating the 2004 selling prices as the test group, instead of using a portion of the sales held out from the 1999–2003 time period, simulated the annual revaluation process that assessors must follow to establish assessed values for use in property taxation as of the statutory tax lien date each year. Thus, the test would allow a realistic evaluation of the predictive power of the AVMs.

The 2006 study tested four automated valuation model types most commonly used in mass appraisal: adaptive estimation procedure (AEP), multiple regression analysis including non-linear regression (MRA), the traditional cost method (COST), and a hybrid transportable cost-specified market method (TCM). The dependent variable was the COD that resulted from applying each AVM to predict the selling prices of the same set of 1,299 out-of-sample properties in the 2004 test group. A one-way analysis of variance (ANOVA) was conducted to evaluate the null hypothesis that no differences in market value estimating accuracy existed among these major AVM methods and to analyze the relationship between AVM type chosen and the resulting COD.

The study results provided clear statistical evidence to support what most CAMA practitioners already believed to be true: a market-calibrated AVM will predict selling prices more accurately than a purely cost-based AVM. The three market-based AVMs (AEP, MRA, and TCM) produced statistically equivalent mean CODs near 10, whereas the purely cost-based AVM produced a mean COD close to 15. The research also provided a baseline for pursuit of additional research questions, including the question considered by this research: Does the addition of a parcel’s x-y coordinates improve the value estimates of an automated valuation model by a statistically significant amount? This question was tested in the current study using a new, enhanced form of geographically weighted regression (geographic-attribute weighted regression) as well as a standard GWR model.

Multiple Regression Analysis

In the context of automated valuation models, the common regression techniques are linear and non-linear multiple regression analysis with neighborhood “adjustments” done by use of dummy variables. Multiple regression analysis was first used for property valuation purposes by C.G. Haas in 1922 (Haas 1922). The computational demands of this method were high, but support for the method increased after World War II when the power of the computer started to be realized (Gipe 1975). These common regression techniques may be fundamentally flawed, however, because they fail to adequately take into account spatial autocorrelation, the coincidence of selling price with location, and spatial heterogeneity, the changing value of property attributes across a study region (Yu, Wei, and Wu 2007). Spatial autocorrelation violates the fundamental regression assumption of independence of observations, and spatial heterogeneity violates the assumption that the housing market acts in relative equilibrium.

Neighborhood dummy variables do account for some of the effects of location, but their use can create a boundary value problem. The result is that two properties on opposite sides of the boundary between two neighborhoods may be given very different adjustments. In actuality, in most jurisdictions, values of property characteristics vary smoothly across space and are not constant even throughout an entire neighborhood. In MRA, AEP, and TCM, the process of making adjustments by neighborhood is a somewhat crude method of accounting for spatial effects, but it is often all that local assessors have to use. What is really needed is a smoother way to account for the effects of location throughout a jurisdiction and to follow the first law of geography which states that things that are closer together tend to be more alike than things that are farther apart (Tobler 1970). With the advent of GIS, it seems appropriate to directly incorporate location when using regression to estimate property value.

Geographically Weighted Regression

Geographically weighted regression is a nonparametric regression modeling technique that incorporates the use of GIS parcel centroid coordinates (Brunsdon, Fotheringham, and Charlton 1996; McMillen 1996). At its heart, geographically weighted regression is a special case of the locally weighted regression (LWR) model (McMillen and Redfearn 2010). Locally weighted regression was introduced in a paper by Cleveland (1979), and the method was further developed by Cleveland and Devlin (1988). GWR essentially takes a method that previously was only applied to a variable space and applies it to a geographic space.

Use of GWR has grown over the past 10 years, and it has been implemented in nearly every major field of academic research (Matthews 2007). There are two different ways to use GWR. One is as an exploratory tool to understand the varying tastes and preferences for different property attributes across a jurisdiction. This type of analysis is more useful to researchers than to property assessors. Another way is as a statistical technique to better estimate the value of a subject property with a given set of attributes by better taking into account the effects of location. This is the application of GWR in the assessment context and is the impetus for this research.

Use of GWR for the purpose of property value estimation has already been studied extensively using housing data from throughout the United States and Canada. For example, Yu, Wei, and Wu (2007) worked with data from Milwaukee, Wisconsin; Paez, Long, and Farber (2008) took data from Toronto, Ontario; Des Rosiers and Theriault (2008) used data from Montreal, Quebec; and Borst and McCluskey (2008) applied data from Sarasota County, Florida, Fairfax County, Virginia, and Catawba County, North Carolina. Yet, to our knowledge, there has not been an implementation of the model inside the walls of an actual appraisal jurisdiction. Also, there has not yet been a controlled experiment comparing the GWR model to other models commonly used in the assessment community. This article reports the first such comparison of the GWR model to the other AVMs.

MRA is the foundation of GWR, but unlike MRA, which has one model-wide set of regression coefficients, GWR produces a different set of regression coefficients for every property by running a series of weighted least squares regressions. The weighting is determined by the distance of the subject property from its nearest neighbors. Therefore, GWR is essentially the combination of many small weighted MRAs that are performed around each subject property. This process makes each set of regression coefficients in GWR a function of location. The following model formula for GWR looks similar to the one for MRA:

$$Y_i = \beta_0(x_i, y_i) + \sum_{h=1}^{p} \beta_h(x_i, y_i)\, X_{ih} + \epsilon_i,$$

where $p$ is the number of independent variables, $(x_i, y_i)$ denotes the coordinates of the $i$-th subject property, $\beta_h(x_i, y_i)$ is the regression coefficient for the $h$-th independent variable of the $i$-th subject property, $Y_i$ is the sale price for the $i$-th subject property, $X_{ih}$ is the $h$-th independent variable for the $i$-th subject property, and $\epsilon_i$ is the random error term for the $i$-th subject property with distribution $N(0, \sigma^2 I)$ (Brunsdon, Fotheringham, and Charlton 2002).

Here, the local regression coefficients for the $i$-th subject property are estimated as:

$$\hat{\beta}_i = (X^T W_i X)^{-1} X^T W_i Y,$$

where $X$ is the matrix of independent variables, $W_i$ is the diagonal weights matrix for the $i$-th subject property, and $Y$ is the vector of sale prices.
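The local estimator can be sketched for a single subject property as follows. This is an illustration only, not the software used in the study: it forms the weighted normal equations from a list of weights (the diagonal of the weights matrix) and solves them with a small Gaussian elimination routine so that it needs nothing beyond the Python standard library. The function names and the four sales are hypothetical; a production implementation would normally rely on a linear algebra library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few weighted sales?")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def local_wls(X, y, weights):
    """Local coefficients (X^T W X)^-1 X^T W y for one subject property.

    X is a list of rows, each starting with 1.0 for the intercept;
    weights holds the diagonal of W for that subject property.
    """
    k = len(X[0])
    xtwx = [[sum(w * row[a] * row[b] for row, w in zip(X, weights))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(w * row[a] * yi for row, w, yi in zip(X, weights, y))
            for a in range(k)]
    return solve_linear_system(xtwx, xtwy)


# Hypothetical sales: price = 50,000 + 100 * living area, with no noise.
X = [[1.0, 1000.0], [1.0, 1500.0], [1.0, 2000.0], [1.0, 2500.0]]
y = [150_000.0, 200_000.0, 250_000.0, 300_000.0]
weights = [1.0, 0.8, 0.5, 0.1]  # nearer sales receive larger weights

beta = local_wls(X, y, weights)
print([round(b, 4) for b in beta])
```

Because the made-up prices lie exactly on a line, any positive weights recover the same intercept and slope; with real sales the weights change the local coefficients, which is the point of GWR. Repeating the call with a different weight vector for each subject property yields the full set of location-specific coefficients.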

The diagonal weights matrix in each weighted least squares regression is determined by a choice of weight function in which nearby properties are afforded more weight than properties that are farther away. This choice of weighting is supported by the first law of geography (Tobler 1970). Here, the weight function is applied to all sale properties in a certain sliding neighborhood around each subject property. The size of the sliding neighborhood is called the bandwidth. The literature contains two types of bandwidths: fixed and adaptive (Brunsdon, Fotheringham, and Charlton 2002). A fixed bandwidth model, effectively, includes in each local regression all of the sale properties within a fixed distance from each subject property. An adaptive bandwidth model includes in each local regression a specific number of sale properties around each subject property by allowing the size of the sliding neighborhood to vary. A fixed bandwidth model potentially could have more difficulty than an adaptive bandwidth model in dealing with irregularly spaced data, like home sales, because the fixed bandwidth model could have some local regressions that are primarily based on only a small number of sale properties (Brunsdon, Fotheringham, and Charlton 2002). To address this concern, only adaptive bandwidth models are used in this research. One common choice of adaptive weighting scheme, and the one used in this research, is the bi-square function given as:

$$w_{ij} = \begin{cases} \left(1 - (d_{ij}/b)^2\right)^2, & d_{ij} \le b \\ 0, & \text{otherwise} \end{cases}$$

where $d_{ij}$ is the geographic distance between the $i$-th subject property and its $j$-th neighboring sale property, and $b$ is equal to the bandwidth (Brunsdon, Fotheringham, and Charlton 2002).
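A short Python sketch of the bi-square weighting with an adaptive bandwidth follows. The article does not state how the bandwidth is derived from the number of neighbors, so the sketch assumes one common convention: the bandwidth for a subject property is the distance to its k-th nearest sale. The function names, the coordinates, and the choice of k are hypothetical.

```python
import math


def bisquare_weights(distances, bandwidth):
    """w_ij = (1 - (d_ij / b)^2)^2 when d_ij <= b, otherwise 0."""
    return [(1.0 - (d / bandwidth) ** 2) ** 2 if d <= bandwidth else 0.0
            for d in distances]


def adaptive_bandwidth(distances, k):
    """Assumed convention: b is the distance to the k-th nearest sale."""
    return sorted(distances)[k - 1]


# Hypothetical parcel centroids in arbitrary planar units.
subject = (0.0, 0.0)
sales_xy = [(0.0, 30.0), (40.0, 0.0), (30.0, 40.0), (60.0, 80.0), (300.0, 400.0)]

distances = [math.dist(subject, xy) for xy in sales_xy]
b = adaptive_bandwidth(distances, k=4)
weights = bisquare_weights(distances, b)
print(b, [round(w, 4) for w in weights])
```

Under this convention the k-th nearest sale sits exactly on the bandwidth and receives zero weight, so only the k - 1 closer sales influence the local regression. In a dense area the bandwidth shrinks and in a sparse area it grows, which is how the adaptive scheme keeps the number of contributing sales roughly constant.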

Spatial-attribute Weighting Function

The standard set of possible weighting functions for GWR takes into account only the geographical distance between properties, not the similarity of their attributes. It makes sense though that the similarity of the sale property to the subject property would also be important. A modification of the previously given weight function to account for differences in attributes could be stated as follows:

$$w_{ij} = \begin{cases} \left[\left(1 - (d_{ij}/b)^2\right) \times f(\tau)\right]^2, & d_{ij} \le b \\ 0, & \text{otherwise} \end{cases}$$

$$f(\tau) = e^{-\left|1 - (A_j/A_i)\right|}$$

where $f(\tau)$ is an exponential function that changes the weight according to the difference ($\tau$) between attributes of the $i$-th subject property, $A_i$, and its $j$-th neighboring sale property, $A_j$ (Shi, Zhang, and Liu 2006). This weighting function is now transformed into a spatial-attribute weighting function for use in the series of locally weighted least squares regressions that comprise the nonparametric GWR model. We named this method geographic-attribute weighted regression (GAWR) to differentiate it from GWR models containing only a distance-based weighting function.

The principal difference between GWR and GAWR can best be illustrated through a practical example using three properties: a subject property, a smaller sale property two doors to the left of the subject property, and an exact replica of the subject property that sold two doors to the right. Under GWR, the smaller sale property would be given about the same weight as the exact replica sale property; however, under GAWR, the replica would be given decidedly more weight than the smaller property. One of the research questions investigated was whether GAWR’s performance is statistically significantly better than GWR’s in the estimation of property value.
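The three-property example can be worked through numerically. The sketch below applies the distance-only bi-square weight and the spatial-attribute weight to two sales that are equally far from the subject. The article does not say which attribute is used in the attribute term, so living area is assumed here purely for illustration, and all distances, areas, and the bandwidth are hypothetical.

```python
import math


def gwr_weight(distance, bandwidth):
    """Distance-only bi-square weight."""
    if distance > bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2


def gawr_weight(distance, bandwidth, attr_subject, attr_sale):
    """Spatial-attribute weight: the geographic term is multiplied by
    f(tau) = exp(-|1 - A_j / A_i|) before squaring."""
    if distance > bandwidth:
        return 0.0
    f_tau = math.exp(-abs(1.0 - attr_sale / attr_subject))
    return ((1.0 - (distance / bandwidth) ** 2) * f_tau) ** 2


bandwidth = 200.0      # hypothetical bandwidth
two_doors = 40.0       # both sales are two doors away from the subject
subject_area = 2000.0  # assumed attribute: living area in square feet
smaller_area = 1200.0  # the smaller sale property
replica_area = 2000.0  # the exact replica

gwr_smaller = gwr_weight(two_doors, bandwidth)
gwr_replica = gwr_weight(two_doors, bandwidth)
gawr_smaller = gawr_weight(two_doors, bandwidth, subject_area, smaller_area)
gawr_replica = gawr_weight(two_doors, bandwidth, subject_area, replica_area)
print(round(gwr_smaller, 4), round(gwr_replica, 4))
print(round(gawr_smaller, 4), round(gawr_replica, 4))
```

With these numbers GWR gives both sales a weight of 0.9216. GAWR leaves the replica at 0.9216, because the attribute term equals 1 when the attributes match, and cuts the smaller property to roughly 0.41.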

The sales comparison method is the preferred method of assessment for single-family residential properties (IAAO 2008). GAWR is very similar to the classic sales comparison method. In sales comparison, an appraiser looks for similar sale properties in the vicinity of the subject property, not just any property nearby. The same is true of GAWR. This capability adds a level of appeal to GAWR that is absent in many other AVM methodologies.

Methodology

This methodology section contains a restatement of the three research questions, the statement of hypotheses developed for separately testing horizontal and vertical equity, and a brief description of the research design. This section also explains the methodology used for random selection and assignment of sale parcels to the AVM test groups and contains a description of how the research was conducted.

Research Questions

Determining best practice when comparing estimating performance of available automated valuation models can be difficult for assessors without the availability of objective, independent information and comparative analyses.

To address this problem, three questions were developed for this research:

1. To what extent do measures of equity/accuracy differ among available methodologies for estimating the market value of single-family homes?

2. Does inclusion of parcel coordinates from GIS in the market value estimating model specification improve the measure of equity/accuracy by a statistically significant amount?

3. Does the addition of attribute weighting to a geographically weighted regression improve the measure of equity/accuracy of the market value estimating model by a statistically significant amount over standard GWR?

To seek answers to these research questions, seven test groups were considered, each representing the result set of a different market value estimating methodology. The methodologies tested were (1) adaptive estimating procedure (AEP), (2) traditional cost method (COST), (3) geographic-attribute weighted regression (GAWR), (4) geographically weighted regression (GWR), (5) multiple regression analysis (MRA), (6) GAWR without using replacement cost new (NoRCN), and (7) transportable cost-specified market (TCM).

Statement of Hypotheses

Testing for horizontal equity and vertical equity requires two different models using the same data sample. The necessary operational variables for measuring both forms of equity require computation of the appraised value (AV) to sale price (SP) ratio (A/S). The COD is the average absolute deviation of all the ratios (A/S) in a sample from the median ratio. Multiple COD samples were selected and assigned for each test group and the mean COD (MCOD) of each group was used to test horizontal equity. The same randomly selected parcels were used for each test group, but the calculated selling price estimates for each group depended upon the AVM applied. The relative level of horizontal equity in each test group would be indicated by the magnitude of the test group’s mean COD, with lower CODs indicating better equity. However, the mere fact that one test group exhibits a lower mean COD than another is not sufficient evidence that its equity is better. Only the existence of a statistically significant difference in the mean CODs among test groups would provide such evidence because the magnitude of the variance of the sample and group ratios also must be considered when attempting to draw a conclusion from the test group COD means. For that reason, an ANOVA model was selected for hypothesis testing. No statistically significant difference in the mean CODs among the test groups would imply equal value estimating performance and accuracy, even if the test groups had somewhat different mean CODs, as might be expected. Without an analysis of the variances involved, a somewhat lower or higher mean COD alone would not provide sufficient evidence for comparisons between group performances.

Vertical equity is the construct used to reflect the uniformity of mean ratios (MA/S) among different levels of parcel sale prices. Vertical equity was measured and analyzed by using five equal-count sale price range levels called quintiles. The ranges were determined by ranking from lowest to highest the sale prices from within the same sample that was used for testing horizontal equity and then dividing the sales into five groups with an equal number of sales in each. The quintile mean ratio (QMR), the average of the appraised value to sale price ratios (MA/S) for each quintile, was calculated to test vertical equity. To imply vertical equity, all five groups should have approximately equal QMRs. Thus, the test for vertical equity was a comparison of the QMRs of the five quintile ranges within each AVM test group. For any test group, vertical equity would be achieved by the absence of statistically significant differences in the QMRs among its sample quintile ranges, indicating uniformity among vertical value strata.

Hypothesis for Horizontal Equity

Horizontal equity addresses the question of whether all homes are burdened with similar effective tax liabilities (actual taxes paid as a percentage of market value). Hence, an AVM’s performance accuracy in estimating market value has a direct bearing on horizontal equity. The null hypothesis for testing horizontal equity performance among the value-estimating methodologies (AVMs) evaluated in this study was:

H1₀: There are no differences among MCOD based upon the AVM chosen by the assessor, where MCOD is the mean COD of each AVM test group.

The alternative hypothesis was:

H1ₐ: Differences exist in at least two MCOD based upon the AVM chosen by the assessor.


Hypothesis for Vertical Equity

Vertical equity seeks uniformity in effective tax rates (actual taxes paid as a percentage of market value) among groups of properties based upon their relative value. If higher-priced properties as a group have different effective tax rates than lower-priced properties as a group, then the condition of vertical inequity exists; property tax equity demands that all homes have the same effective tax rate. The performance accuracy in estimating market value for each AVM also has a direct bearing on vertical equity. The null hypothesis for testing vertical equity performance among the value-estimating methodologies (AVMs) evaluated in this study was:

H2₀: There are no differences in the quintile mean ratios (QMRs) within each AVM type available for use by the assessor.

The alternative hypothesis was:

H2ₐ: There are differences in the QMRs within at least one AVM type available for use by the assessor.

Table 1 contains a summary of the characteristics of the hypotheses and models used for testing AVM performance.

Description of Research Design

A completely randomized design was utilized in the research for examining COD means as a test of horizontal equity and quintile mean ratios as a test of vertical equity. To test the horizontal equity hypothesis, an ANOVA model was constructed. If testing should indicate that, at the pre-specified significance level of α = .01, the COD means were not different, then no evidence would exist to reject the null hypothesis, indicating equal AVM performance. If testing should indicate significant main effects among the COD means for the selected α = .01 significance level, then evidence would exist that the null hypothesis should be rejected and further analysis would be required for pair-wise comparisons among the AVMs.
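The F statistic behind a one-way ANOVA of this kind can be sketched as below. The function name and the three small groups of CODs are hypothetical and are not results from the study. The sketch stops at the F statistic and its degrees of freedom; turning F into a p-value for comparison with α = .01 requires the F distribution from a statistics library, which is left out to keep the example within the standard library.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical CODs for three AVM test groups (not the study's results).
cods_by_avm = [
    [9.0, 10.0, 11.0],
    [10.0, 11.0, 12.0],
    [14.0, 15.0, 16.0],
]
f_stat, df_b, df_w = one_way_anova_f(cods_by_avm)
print(f_stat, df_b, df_w)
```

In this toy case the group means are 10, 11, and 15, giving F = 21 on 2 and 6 degrees of freedom. With seven AVM test groups and 31 CODs per group, as in the design described here, the same calculation would have 6 and 210 degrees of freedom.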

Vertical equity was tested by further stratification of the appraised value to sale price ratios (A/S) of each test group into quintiles based on sale price. An analysis of variance was conducted using the quintile mean ratios of the five sale price range levels for each AVM type. This approach was a departure from some of the vertical equity testing literature (Allen and Dare 2002; Cornia and Slade 2005; Sirmans, Diskin, and Friday 1995) that relied upon various regression techniques for examining vertical equity. Since there was no consensus among the cited references as to which regression model performed best for testing vertical equity, the ANOVA test that had been developed for use in recent research by Moore (2008) was used in this study as an alternative method for evaluating vertical equity. To provide a basis for comparison, price-related differential, the widely accepted standard measure of vertical equity (IAAO 2003), was computed along with the vertical equity index (VEI), which was developed for use in a recent study by Moore (2008). This step was prompted by the work of Jensen (2009), which reported that heterogeneous variance in sale prices can reduce the reliability of the PRD.

Table 1. Summary of model descriptions and hypotheses for testing AVM performance

| Item | Horizontal equity | Vertical equity |
| --- | --- | --- |
| Model type | One-way ANOVA | One-way ANOVA |
| Independent variables | AVM type | Quintile within AVM type |
| Replications per cell | 31 | 155 |
| Dependent variable | MCOD | QMR |
| General null hypothesis | MCODs are not different | QMRs are not different |
| General alternate hypothesis | Not all MCODs are the same | Not all QMRs are the same |
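The two ratio statistics used throughout the equity tests, the COD and the PRD, can be illustrated with a short sketch. It uses the standard ratio-study definitions (average absolute deviation from the median ratio as a percentage of the median, and mean ratio divided by the sale-weighted mean ratio). The function names and the four toy parcels are hypothetical and are not taken from the study data.

```python
from statistics import mean, median


def cod(ratios):
    """Coefficient of dispersion: average absolute deviation from the
    median ratio, expressed as a percentage of the median ratio."""
    med = median(ratios)
    return 100.0 * mean([abs(r - med) for r in ratios]) / med


def prd(appraised, sale_prices):
    """Price-related differential: mean A/S ratio divided by the
    sale-weighted mean ratio (total appraised / total sale price)."""
    ratios = [a / s for a, s in zip(appraised, sale_prices)]
    weighted_mean = sum(appraised) / sum(sale_prices)
    return mean(ratios) / weighted_mean


# Hypothetical toy parcels (illustration only, not study data)
appraised = [95_000, 150_000, 210_000, 330_000]
sales = [100_000, 150_000, 200_000, 300_000]
ratios = [a / s for a, s in zip(appraised, sales)]

print(round(cod(ratios), 3), round(prd(appraised, sales), 4))
```

In this toy case the higher-priced parcels carry the higher ratios, so the PRD falls below 1.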

Power Analysis for Horizontal Equity Testing

The total sample size required to provide adequate power at α = .01 was estimated through an accurate a priori power analysis for horizontal equity. The power analyses indicated that for a one-way ANOVA study, a cell sample size of 13 was necessary to achieve 99% power to detect differences among the COD means using an F test with α = .01 as the significance level. Hence, a sample size of at least 13 mean CODs for each AVM was estimated as required for the study of horizontal equity.

Power Analysis for Vertical Equity Testing

The total sample size required to provide adequate power at α = .01 was also estimated through a priori power analysis for vertical equity. Meaningful evaluation of vertical equity can only occur among quintiles of a single AVM. Hence, a single-factorial power analysis was of particular interest because each AVM would need to be evaluated independently. Using a calculated effect size for analysis in G*Power (Buchner, Erdfelder, and Faul 1997), with equal size samples in five quintile groups, resulted in a total required sample size estimate of 655 at α = .01 to achieve 99% power. Thus, G*Power confirmed that the planned quintile cell size of 131 (655/5) or greater as estimated by the power analysis fulfilled the requirement for 99% power at α = .01 for vertical equity hypothesis testing.

Total Sample Size

The COD means (that is, the average of calculated CODs for the multiple samples in each AVM type) that were used for testing horizontal equity were second order statistics and each COD itself was calculated from a first order sample of parcels. Based on the a priori power analysis and other practical considerations, a decision was made to create 31 groups each containing 25 parcels randomly selected and randomly assigned to COD groups from the result set of each AVM type being tested. The mean COD of these 31 CODs was calculated for each AVM test group to test for horizontal equity. Each COD was computed from 25 A/S ratios, which required that 775 sale parcels be randomly selected and assigned from the value estimating result population of each AVM test group. The practical considerations driving the decision on sample size were: (1) a sample size of 30+ is widely used because simulation studies involving the central limit theorem have shown that the mean of samples of 30 or more from almost any type of distribution is approximately normal, (2) the sample size had to be divisible into 30 or more COD groups for horizontal equity testing and exactly five (quintile) groups for vertical equity testing while maintaining a balanced cell size for both tests, and (3) based on personal empirical experience with the calculation of medians and CODs for residential properties, as well as the precepts of the central limit theorem, 25 randomly selected parcel A/S ratios were deemed sufficient to derive one COD about each median appraised value to sale price ratio. Thus, a sample size of 31 CODs for each AVM would require 775 parcels (31 × 25) for the test of horizontal equity, which was well within the number suggested by power analysis for 99% power. At the same time, 775 parcel sale prices would provide a balanced cell size of 155 observations for vertical equity testing of the same AVM, which exceeds the minimum sample size of 131 estimated using power analysis.
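The sampling arithmetic and the random grouping described above can be sketched as follows. This is a minimal illustration that assumes a simple random draw without replacement; the synthetic ratios, the seeds, and the function names are hypothetical and do not reproduce the study's actual procedure or data.

```python
import random
from statistics import mean, median

N_GROUPS, GROUP_SIZE, N_QUINTILES = 31, 25, 5
SAMPLE_SIZE = N_GROUPS * GROUP_SIZE          # 31 x 25 = 775 parcels
QUINTILE_SIZE = SAMPLE_SIZE // N_QUINTILES   # 775 / 5 = 155 per quintile


def cod(ratios):
    """Average absolute deviation from the median ratio, as a percent."""
    med = median(ratios)
    return 100.0 * mean([abs(r - med) for r in ratios]) / med


def group_cods(ratios, seed=0):
    """Draw 775 ratios at random, split them into 31 groups of 25,
    and return the 31 group CODs together with their mean (MCOD)."""
    rng = random.Random(seed)
    sample = rng.sample(ratios, SAMPLE_SIZE)  # drawn in random order
    groups = [sample[i:i + GROUP_SIZE]
              for i in range(0, SAMPLE_SIZE, GROUP_SIZE)]
    cods = [cod(g) for g in groups]
    return cods, mean(cods)


# Synthetic A/S ratios standing in for one AVM's 1,299 estimates
rng = random.Random(42)
population = [rng.gauss(1.0, 0.1) for _ in range(1299)]
cods, mcod = group_cods(population)
print(len(cods), QUINTILE_SIZE)
```

Because `random.Random.sample` returns the drawn items in selection order, slicing the sample into consecutive blocks of 25 amounts to a random assignment to groups.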

Using the ANOVA F test requires assumptions of normally distributed populations and equal variance among populations, which are rarely satisfied completely. Those assumptions were fully discussed and analyzed by Moore (2008) with respect to the use of ANOVA for horizontal and vertical equity testing with the same sample sizes as used in this study.


Selection of Subject Parcels

This research is an extension of the research reported by Moore (2006) and uses the same data as was used in that study. However, the methodology of the current study differs from the earlier study in that parcel centroid coordinates have been added, two additional market value estimating methods that incorporate parcel centroid coordinates in their model specification have been introduced, and replacement cost new (RCN) has been incorporated as an additional variable. This section reviews how sample data were selected and manipulated.

The theoretical population for the study is all single-family homes in North America. The study population comprised single-family homes in an undisclosed Midwestern assessing jurisdiction. The sampling frame consisted of those homes that transferred ownership in valid arm’s-length transactions during the years 2001, 2002, 2003, and 2004. This sampling frame was identified and isolated through computer processing by Moore with permission of the jurisdiction.

To measure the predictive power of the seven AVM methods, all tests were conducted using the same population and the same random sample of sale parcels drawn from that population. The population included 22,785 existing single-family residential properties with their descriptive characteristics, representing 52 distinct neighborhoods that were a subset of randomly drawn neighborhoods from the entire jurisdiction. In the earlier study, observations in the five years of 1999–2003 containing 5,546 jurisdiction-validated sales, with characteristics as they were at the time of the sale, were available for use by the participants for model specification and calibration. For this study, only 2001–2003 sales were used for model specification because parcel x-y coordinates were not available for the earlier years.

To test the predictive power of models, a different sample than the one used for model specification and calibration is recommended (Clapp and O’Connor 2008). Therefore, characteristics of 1,299 out-of-sample validated sales from the population that had sold in 2004 were provided to participants without the actual selling prices, which were known only to Moore. These sales had been screened by the assessing office staff to verify that they were arm’s-length market transactions. Figure 1 contains a list of the parcel characteristic variables that were available for use in development and testing of the predictive models. For this study, variables 41–45 (parcel x-y coordinates and RCN) were appended to the dataset. For the years 2001–2003, there were 3,872 validated sales containing parcel coordinates for use in model specification and calibration. The jurisdiction’s established land values as of December 31, 2003, were supplied as part of the characteristic data, and participants were instructed to use them as a “given.” No data were provided for computing new land values. Once the participants’ models were specified, they were used to blindly estimate the selling prices of the 1,299 out-of-sample validated sale transactions in 2004. In the current study, Joshua Myers performed the model specification and calibration for the two spatially enhanced AVMs (GWR and GAWR). He was given the same information about the dataset and out-of-sample sale prices as participants in the prior study.

Figure 1. Parcel characteristic variables

| Field | Name | Description |
| --- | --- | --- |
| 1 | ParcelNo | Parcel identifier, numeric |
| 2 | Class | Property class: all are residential, single family class 510 |
| 3 | Neigh | Neighborhood number, 3-digit numeric, range 108 to 579 (52 total) |
| 4 | District | Tax district number, 6-digit numeric |
| 5 | SaleDate | Sale date in a single date field with the format ‘mm/dd/yyyy’ (total=5,546) |
| 6 | SaleAmt | Sale amount; range 17,400–1,823,000; median 139,900; mean 168,274 |
| 7 | s1 | Sale validity code for state reporting |
| 8 | s2 | Sale validity code for arm’s-length market transaction, ‘V’ = valid |
| 9 | Acres | Parcel acreage where available |
| 10 | TLA_SF | Total finished living area square feet |
| 11 | FinSFB | Finished living area square feet, basement |
| 12 | FinSF1 | Finished living area square feet, 1st floor |
| 13 | FinSF2 | Finished living area square feet, full 2nd floor |
| 14 | FinSFUp | Finished living area square feet, partial upper floor such as half story |
| 15 | FinSFLL | Finished living area square feet, lower level of split- or bi-level (split foyer) |
| 16 | Stories | Story height as a single numeric field; 100 = 1 story, 150 = 1½ story, and so on |
| 17 | H_Type | House type code |
| 18 | B_SF | Basement square feet (no basement = 0) |
| 19 | F_Baths | Number of full baths |
| 20 | H_Baths | Number of half baths |
| 21 | Tot_Fix | Number of total plumbing fixtures |
| 22 | AttGar_SF | Attached garage size in square feet (no attached garage = 0) |
| 23 | Gar_Cap | Attached garage car capacity (not always available) |
| 24 | DetG_SF | Detached garage size in square feet (no detached garage = 0) |
| 25 | C_Air | Central air-conditioning (Y or N) |
| 26 | FP | Number of fireplaces |
| 27 | Year | Year constructed |
| 28 | EffYear | Effective year built, proxy for effective age |
| 29 | Cond | Condition: 94% = AV, 1% = EX, 1.5% = F, 2% = G, 1% = VG, 0.1% = P |
| 30 | Grade | Quality grade, numeric, ranging from 25 to 95 with 45 = avg, 25 = poor |
| 31 | Extra | Extra features flag, where 1 = yes |
| 32 | ExtraDesc | Free form description of extra features |
| 33 | ExtraAmt | Amount of value assigned to the extra features by the appraisal office |
| 34 | PorchSF | Total square feet of porch area |
| 35 | WdDkSF | Total square feet of wood deck area |
| 36 | Land | Estimated market land value placed on the lot by the appraisal office prior to time of sale |
| 37 | RoofMat | Roof cover material code |
| 38 | AtticSF | Total square feet of attic area |
| 39 | AtticFinSF | Finished living area square feet in attic |
| 40 | Ext_Cov | Exterior cover material code |
| 41 | Latitude | y-coordinate (adjusted by a constant to keep the real data location confidential) |
| 42 | Longitude | x-coordinate (adjusted by a constant to keep the real data location confidential) |
| 43 | Era | Transform of year built into five groups |
| 44 | Style | Transform of house type code into five groups |
| 45 | RCN | Replacement Cost New |

Model Specification

To summarize, the nine participants in the original study (Moore 2006) had to build (specify) predictive models using their respective analytical tools and then calibrate (fit) them to the time-trended sales sample from 2003 and earlier, using their own time trending technique and judgment as to what sales should be used for model specification and calibration. They then applied their respective models to the 1,299 out-of-sample properties in the 2004 test group to estimate selling prices. Cost calculation results for the 1,299 properties in the test group were furnished by Moore using Marshall & Swift’s residential cost book data (Marshall & Swift 2003). The single best predictive result set for each AVM type from the original study (Moore 2006) was used to randomly select and assign 775 observations of 2004 parcel selling price estimates for the current study.

Using the following year’s out-of-sample valid sales for testing differs from the usual model-testing methodology that sets aside a portion of the model-building sales sample for testing. The justification for using the following year’s valid market sales as the test group was that it more closely resembles the reality faced by assessors each year. Also, this approach could possibly uncover instability in the models. This decision was influenced in part by the desire to consider a “worst case” scenario for testing the predictive power of models.

Modeling Procedures for Geographical-attribute Weighted Regression

To specify and calibrate the two new GWR and GAWR models utilizing parcel coordinates from GIS, it was necessary to determine three elements during the model-building phase: (1) the set of variables to include from the list of 45 (figure 1), (2) the number of years of sales history to use, and (3) the type of spatial-attribute weight function. A forward step-wise procedure was employed to determine the best set of variables to include in the model. At each step in this procedure, leave-one-out cross-validation was used on the sales-history dataset to determine the COD and optimum bandwidth for each of the variable options. The variable that most lowered the COD at each step was added to the model. At the end of the process, taking into account the principle of parsimony and the quality of the coefficients, the model with the best COD was chosen and its corresponding optimum bandwidth was recorded. An adaptive bandwidth was used in this study.
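The forward step-wise search described above can be sketched in outline. This is an illustrative simplification, not the authors' R code: the `score` callback stands in for the leave-one-out cross-validated COD (including the bandwidth search, which is omitted here), the loop simply stops when no candidate lowers the score, and the toy scoring function and variable gains are hypothetical.

```python
def forward_select(candidates, score):
    """Greedy forward selection.

    `score(variables)` is assumed to return the cross-validated COD of a
    model built on `variables` (lower is better). At each step the
    candidate that lowers the score the most is added; the search stops
    when no remaining candidate improves on the current best.
    """
    selected, best = [], float("inf")
    remaining = list(candidates)
    while remaining:
        trial = {v: score(selected + [v]) for v in remaining}
        winner = min(trial, key=trial.get)
        if trial[winner] >= best:
            break
        selected.append(winner)
        remaining.remove(winner)
        best = trial[winner]
    return selected, best


# Hypothetical stand-in for a cross-validated COD: each useful variable
# lowers the COD, and every added variable carries a small penalty.
GAINS = {"RCN": 5.0, "Land": 2.0, "Time": 0.5}


def toy_score(variables):
    return 15.0 - sum(GAINS.get(v, 0.0) for v in variables) + 0.2 * len(variables)


chosen, best_cod = forward_select(["FP", "Time", "Land", "RCN"], toy_score)
print(chosen, round(best_cod, 2))
```

In practice the final choice would also weigh parsimony and coefficient quality, as the text notes, rather than rely on the stopping rule alone.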

Because the sales-history dataset stretched several years into the past, it had to be determined what set of years yielded the best results. Therefore, results were computed using various sets of years and the optimum historical time period selected. Also, the different choices for the spatial-attribute weight function had to be compared and the function that produced the lowest cross-validation COD identified. The full GAWR weight function used in this research was:

$$
w_{ij} =
\begin{cases}
\left[ \left(1 - (d_{ij}/b)^2\right) \times e^{-\left|1 - \mathrm{TLA}_j/\mathrm{TLA}_i\right|} \times e^{-\left|1 - \mathrm{Grade}_j/\mathrm{Grade}_i\right|} \times e^{-\left|1 - \mathrm{Land}_j/\mathrm{Land}_i\right|} \right]^2, & d_{ij} \le b \\
0, & \text{otherwise}
\end{cases}
$$

where $\mathrm{Grade}_i$, $\mathrm{Land}_i$, and $\mathrm{TLA}_i$ are the grade, land value, and total living area respectively for the i-th subject property, and $\mathrm{Grade}_j$, $\mathrm{Land}_j$, and $\mathrm{TLA}_j$ are the grade, land value, and total living area respectively for the j-th sale property.
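To make the weight function concrete, the following sketch evaluates it for a single subject/sale pair. It assumes, as is usual in GWR, that d_ij is the distance between subject property i and sale property j and that b is the bandwidth; the text does not define these two symbols explicitly, and the function and argument names are hypothetical.

```python
import math


def gawr_weight(d_ij, b, tla_i, tla_j, grade_i, grade_j, land_i, land_j):
    """GAWR weight of sale j for subject i.

    Assumes d_ij is the subject-to-sale distance and b the bandwidth
    (both in the same units). Sales beyond the bandwidth get zero weight.
    """
    if d_ij > b:
        return 0.0
    spatial = 1.0 - (d_ij / b) ** 2
    similarity = (
        math.exp(-abs(1.0 - tla_j / tla_i))
        * math.exp(-abs(1.0 - grade_j / grade_i))
        * math.exp(-abs(1.0 - land_j / land_i))
    )
    return (spatial * similarity) ** 2


# A sale identical to the subject, halfway to the bandwidth edge:
w_same = gawr_weight(50.0, 100.0, 1800, 1800, 45, 45, 40_000, 40_000)
# The same location, but a sale with 20% more living area:
w_bigger = gawr_weight(50.0, 100.0, 1800, 2160, 45, 45, 40_000, 40_000)
print(w_same, w_bigger)
```

With identical attributes the attribute terms all equal 1 and the weight reduces to the squared spatial term; any attribute difference multiplies the weight by a factor below 1, so dissimilar sales count for less even when they are nearby.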

It was observed during this research that the set of variables selected to build a jurisdiction-wide multiple regression model cannot be applied as is in the context of GWR and GAWR and produce the same optimal results. Oftentimes, the best multiple regression model variable set performed decidedly worse when applied to GWR and GAWR. A step-by-step process of model building needs to be undertaken for these models as well. Additionally, it was observed that transformations of the sale price dependent variable that were efficacious in the multiple regression context can very well be punitive in the context of GWR/GAWR. The localization employed in GWR and GAWR seems to produce effects that are more linear and thus are not in need of transformation. Once all these processes and steps had been completed, the GWR and GAWR models were used to estimate the value of the subject out-of-sample properties.

Inclusion of RCN and the Variable Set

Replacement cost new is the basis of the cost method of property valuation. It represents the current cost to construct a functionally equivalent replacement of a structure. Many property attributes are used in the determination of the RCN, especially the total living area and the construction quality grade. The cost method has one major advantage over many other valuation methods: it is easy to understand. Regression, on the other hand, is a mystery to many taxpayers and oftentimes is difficult for appraisers to explain. This aspect may be hindering the widespread adoption of regression in appraisal jurisdictions. In an effort to increase the explanatory power of GWR and GAWR and break down barriers to their adoption, RCN was included as one of the variables that were tested during the model-building process. Surprisingly, the addition of RCN proved advantageous in reducing the COD and so it was included in the final model for both the research dataset and the Norfolk dataset. This outcome is significant. It permits the results of GWR and GAWR models to be interpreted as a market adjustment to the construction cost, a concept that is easier for the average person to grasp, and should allow for easier adoption.

However, inclusion of the reproduction cost may seem counterintuitive to some. After all, cost assigns one number to an improvement and does not take into account the relative differences among its various attributes. A large regression model may seem more appropriate because many factors are taken into account directly. This research demonstrated that such thinking was incorrect, at least in the context of GWR/GAWR. RCN works as a variable because it constitutes a certain index of the property, a broad measure of what features a property contains, and, more importantly, a good measure of how a certain property compares to nearby properties. RCN may not be a very accurate means of estimating property value by itself, but it has great worth in the GAWR context. As a variable, it improves results, makes it possible to achieve a more parsimonious model specification, reduces the likelihood of the over-fitting hazard mentioned by Clapp and O’Connor (2008), and provides a way to better explain a seemingly mysterious regression process.

Table 2 lists the sets of variables chosen for the research data and for the Norfolk data. All of the models included an intercept. Even though the intercept can be difficult to interpret, for reasons beyond the scope of this article, it has an important role in improving the accuracy of the estimates. The land value estimate for the property was determined by an external process and provided as a data input for modeling. The Norfolk model included the land estimate as an offset. An offset is simply a value that is taken as a given and not set as a variable in the model. Also included as an offset in the Norfolk model was the total RCNLD (Reproduction Cost New Less Depreciation) for other improvements, such as pools, carports, piers, and utility sheds. The Norfolk dataset contained RCNLD instead of RCN because of the way individual appraisers finalize values there. For the GAWR model to work in Norfolk, depreciation had to be accounted for in the RCNLD instead of separately as the effective age. In both models, time was accounted for by using a reverse month of sale (RMOS) variable (Borst 2008). This approach treats the time that a property sold as an attribute of the property, not as an adjustment for before or after the fact. Slightly different sets of variables were chosen for the research data and the Norfolk data. This was done simply because the model specification procedure yielded different results for the two datasets. Every jurisdiction is different, so it seems natural that one set of variables will not produce the best results everywhere.

Table 2. Variables selected for inclusion in GWR and GAWR models

| Research data | Norfolk data |
| --- | --- |
| Intercept | Intercept |
| RCN for the dwelling and garage | RCNLD for the dwelling and garage |
| Pre-determined land market value | Squared reverse month of sale (Time) |
| Total garage area (sum of attached and detached garage area) | Land value as an offset |
| Reverse month of sale (Time) | RCNLD for other outlying improvements as an offset |
| Total other area (sum of attic and basement area) | |
| Total living area | Neighborhood indicator variable (coded 1 if the sale property is in the same neighborhood as the subject property, and 0 otherwise) |
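As an illustration of the reverse month of sale variable, the sketch below counts whole calendar months back from a valuation date to each sale date. The counting convention (zero for a sale in the valuation month) is an assumption made for this example and may differ from the one in Borst (2008); the function name and dates are hypothetical.

```python
from datetime import date


def reverse_month_of_sale(sale_date, valuation_date):
    """Whole calendar months from the sale back to the valuation date.

    Assumed convention: a sale in the valuation month scores 0, a sale
    one calendar month earlier scores 1, and so on.
    """
    return ((valuation_date.year - sale_date.year) * 12
            + (valuation_date.month - sale_date.month))


valuation = date(2003, 12, 31)
rmos_recent = reverse_month_of_sale(date(2003, 12, 15), valuation)
rmos_old = reverse_month_of_sale(date(2001, 1, 10), valuation)
rmos_old_squared = rmos_old ** 2  # squared term, as listed for Norfolk in table 2
print(rmos_recent, rmos_old, rmos_old_squared)
```

Treated this way, sale timing enters the regression like any other property attribute, with its own locally estimated coefficient.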

Software Used for GWR/GAWR Implementation

R software was used to implement GWR/GAWR for this research. R is a free, open-source application for statistical data analysis that is present on every major computing platform and is in use by researchers all over the world (R 2010). R utilizes its own computer language and is highly customizable. Users are permitted to write packages for R which they can make publicly available for download. These packages are baskets of functions written in R-code that can be used to implement various forms of statistical analysis within the program. For this research, the R-code was self-written and called several outside functions. A Microsoft Excel macro also was developed that implements the analysis by calling R behind the scenes.

Primary Findings

Descriptive statistics and statistical analyses are presented in the Results section without discussion. Analysis of variance findings also are provided. After the presentation of the results, an analysis and an evaluation of the findings are offered. Additional findings are discussed in the subsequent section.

Results

Descriptive Statistics

The sampling frame for the research dataset consisted of the 1,299 sale parcels in the year 2004 for which market values were estimated using the seven AVMs being evaluated: adaptive estimation procedure (AEP), cost method (COST), geographic-attribute weighted regression with RCN (GAWR), standard geographically weighted regression with RCN (GWR), multiple regression analysis (MRA), GAWR without using RCN (NoRCN), and transportable cost-specified market (TCM). Descriptive statistics for the estimated market values for each AVM and the actual sale prices are presented in table 3.

From the sampling frame, 775 parcel sales were randomly selected for the purpose of determining the COD for each AVM type. Table 4 contains the descriptive statistics for the 775 value estimates for each AVM and the 775 actual sale prices.

Table 3. Descriptive statistics for the sampling frame of the research dataset for each AVM and the set of actual sales

| Statistic | AEP | COST | GAWR | GWR | MRA | NoRCN | TCM | Sales |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Count | 1299 | 1299 | 1299 | 1299 | 1299 | 1299 | 1299 | 1299 |
| Median | 171308 | 169319 | 171066 | 173044 | 182512 | 169632 | 168480 | 179000 |
| Mean | 202085 | 208025 | 212557 | 211937 | 208420 | 212367 | 204390 | 216972 |
| Std Dev | 130702 | 148510 | 142463 | 138932 | 121240 | 140968 | 141212 | 144540 |
| Min | 37897 | 35557 | 52990 | 54118 | 49227 | 49018 | 49060 | 47000 |
| Max | 1112331 | 1198514 | 1162931 | 1108231 | 893872 | 1137511 | 1405620 | 1250000 |
| Med A/S | 0.937 | 0.940 | 0.982 | 0.978 | 0.990 | 0.979 | 0.943 | n/a |
| COD | 10.229 | 14.955 | 7.393 | 7.875 | 10.065 | 7.713 | 10.120 | n/a |
| PRD | 1.007 | 0.985 | 1.002 | 1.004 | 1.032 | 1.006 | 1.004 | n/a |


The 775 parcel sales were randomly assigned to 31 groups of 25 sales each. CODs were calculated for each of the 31 groups of 25 parcel sales within each AVM type. Table 5 presents the descriptive statistics for the 31 CODs within each AVM type being tested.

Summary of Equity Findings for the Seven AVM Types

Standard equity measurement statistics for each of the seven AVM types evaluated in this study are summarized in table 6. The AVM types GAWR, GWR, and NoRCN included the parcel centroid x-y coordinates, whereas the other four AVM types did not.

Hypothesis Testing for Horizontal Equity

Hypothesis testing for horizontal equity was performed for each AVM test group using the 31 COD samples from the research dataset. Mean COD differences between AVM test groups were significant, with F(6,210) = 68.50 and p < .01. The ANOVA test results provided evidence of differences between the mean CODs of the seven AVMs tested, thus calling for rejection of the null hypothesis that all means are equal. The ANOVA test results are presented in table 7.

Table 4. Descriptive statistics for the sample of 775 randomly selected parcels

| Statistic | AEP | COST | GAWR | GWR | MRA | NoRCN | TCM | Sales |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Count | 775 | 775 | 775 | 775 | 775 | 775 | 775 | 775 |
| Median | 171154 | 167121 | 171674 | 174085 | 182718 | 170221 | 167820 | 179220 |
| Mean | 197347 | 202814 | 207715 | 207043 | 203814 | 207007 | 199810 | 213076 |
| Std Dev | 122843 | 139049 | 135334 | 130742 | 112626 | 132198 | 135184 | 139537 |
| Min | 42752 | 35557 | 52992 | 54118 | 49493 | 48318 | 49060 | 47000 |
| Max | 1112331 | 1198514 | 1162931 | 1108231 | 893872 | 1137025 | 1405620 | 1250000 |
| Med A/S | 0.932 | 0.930 | 0.980 | 0.975 | 0.987 | 0.982 | 0.942 | n/a |
| COD | 9.949 | 14.912 | 7.668 | 8.387 | 9.911 | 7.934 | 10.061 | n/a |
| PRD | 1.009 | 0.990 | 1.008 | 1.013 | 1.034 | 1.011 | 1.006 | n/a |

Table 5. Descriptive statistics of COD samples from the research dataset for each AVM

| Statistic | AEP | COST | GAWR | GWR | MRA | NoRCN | TCM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Count | 31 | 31 | 31 | 31 | 31 | 31 | 31 |
| Median | 9.457 | 15.253 | 7.629 | 8.089 | 9.643 | 7.889 | 9.639 |
| Mean | 9.641 | 14.447 | 7.559 | 8.239 | 9.549 | 7.780 | 9.731 |
| Std Dev | 1.111 | 2.138 | 1.495 | 1.392 | 1.633 | 1.599 | 1.461 |
| Min | 7.738 | 9.649 | 5.004 | 6.163 | 6.702 | 4.725 | 6.996 |
| Max | 12.908 | 17.349 | 10.429 | 10.863 | 11.953 | 10.151 | 13.998 |
| Range | 5.170 | 7.700 | 5.425 | 4.700 | 5.251 | 5.426 | 7.002 |

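Table 6. Summary of equity findings for AVM types with the research dataset

Columns Med through VEI are overall findings; COD Q1 to Q5 are COD findings by quintile; QMR Q1 to Q5 are QMR findings by quintile.

| AVM | Med | COD | PRD | VEI | COD Q1 | COD Q2 | COD Q3 | COD Q4 | COD Q5 | QMR Q1 | QMR Q2 | QMR Q3 | QMR Q4 | QMR Q5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AEP | 0.93 | 9.95 | 1.01 | 6.3 | 10.05 | 9.34 | 9.33 | 9.19 | 11.11 | 0.91 | 0.94 | 0.97 | 0.96 | 0.91 |
| COST | 0.93 | 14.91 | 0.99 | 10.8 | 12.10 | 14.62 | 15.00 | 14.40 | 15.56 | 0.88 | 0.92 | 0.98 | 0.98 | 0.96 |
| GAWR | 0.98 | 7.67 | 1.01 | 4.9 | 8.48 | 6.51 | 6.55 | 6.13 | 10.14 | 1.01 | 0.97 | 0.97 | 0.98 | 0.97 |
| GWR | 0.99 | 8.39 | 1.01 | 6.6 | 8.71 | 7.93 | 7.44 | 7.17 | 9.72 | 1.03 | 0.98 | 0.98 | 0.98 | 0.96 |
| MRA | 0.99 | 9.91 | 1.03 | 13.7 | 8.93 | 8.94 | 8.92 | 9.11 | 11.71 | 1.00 | 1.01 | 1.04 | 0.99 | 0.90 |
| NoRCN | 0.98 | 7.93 | 1.01 | 5.5 | 8.39 | 6.98 | 6.70 | 6.45 | 10.39 | 1.02 | 0.98 | 0.97 | 0.98 | 0.96 |
| TCM | 0.94 | 10.06 | 1.01 | 4.5 | 10.44 | 8.95 | 8.78 | 9.50 | 12.29 | 0.95 | 0.95 | 0.97 | 0.93 | 0.92 |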


Also, results from the Kruskal-Wallis test were consistent with the ANOVA test findings, with p < 0.001 and chi-square of 111.83. The Kruskal-Wallis test, which conducts a one-way analysis of variance by ranks, is a nonparametric method for testing equality of population medians that does not assume a normal distribution, an assumption required for the ANOVA test. Test results indicated that AVM mean CODs should be compared. Table 8 contains the results of Tukey-Kramer multiple comparison tests for all pair-wise differences between the COD means of the 31 samples in each of the AVM test groups. The results in table 8 are arranged from the lowest to the highest mean COD (MCOD) for the seven AVMs tested. For a visual comparison, figure 2 offers a box plot of the horizontal equity test findings.
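For readers unfamiliar with the rank-based test, the Kruskal-Wallis H statistic can be computed by hand as sketched below. This is the textbook formula without the tie correction, written for illustration only; it is not the software used in the study, and the three small groups are hypothetical. With k groups, H is referred to a chi-square distribution with k − 1 degrees of freedom (6 for the seven AVMs).

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction).

    Pools all observations, assigns average ranks to ties, and compares
    the rank sums of the groups.
    """
    pooled = sorted((value, g) for g, values in enumerate(groups)
                    for value in values)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    total = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


# Hypothetical COD samples for three AVMs (completely separated groups)
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2))
```

Because the statistic depends only on ranks, it is unaffected by departures from normality in the COD samples, which is why it serves as a check on the ANOVA result.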

Hypothesis testing for vertical equity

Hypothesis testing for vertical equity was performed for each AVM type using five groups of 155 ratios. A separate analysis was conducted for each AVM test group.

Table 7. Analysis of variance horizontal equity test results

| Source Term | DF | Sum of Squares | Mean Square | F-ratio | Prob. Level | Power (Alpha=0.01) |
| --- | --- | --- | --- | --- | --- | --- |
| A: AVM | 6 | 1018.023 | 169.6705 | 68.50 | 0.000** | 1.000 |
| S(A) | 210 | 520.132 | 2.4768 | | | |
| Total (Adjusted) | 216 | 1538.156 | | | | |
| Total | 217 | | | | | |

** Term significant at alpha = 0.01
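The entries in table 7 can be checked with a few lines of arithmetic. The sketch below recomputes the degrees of freedom, the mean squares, and the F-ratio from the reported sums of squares; the variable names are illustrative.

```python
# Values reported in table 7
n_avms, n_cods_per_avm = 7, 31
ss_between = 1018.023   # A: AVM
ss_within = 520.132     # S(A)

df_between = n_avms - 1                       # 6
df_within = n_avms * n_cods_per_avm - n_avms  # 217 - 7 = 210

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(df_between, df_within,
      round(ms_between, 4), round(ms_within, 4), round(f_ratio, 2))
```

The recomputed values agree with the reported mean squares (169.6705 and 2.4768) and with F(6,210) = 68.50.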

Table 8. Comparison of horizontal equity performance between AVMs using the research dataset

| AVM test group | Test group sample size | MCOD | Tukey-Kramer multiple-comparison test results: COD performance is different from |
| --- | --- | --- | --- |
| GAWR | 31 | 7.559 | MRA, AEP, TCM, COST |
| NoRCN | 31 | 7.780 | MRA, AEP, TCM, COST |
| GWR | 31 | 8.239 | AEP, TCM, COST |
| MRA | 31 | 9.549 | GAWR, NoRCN, COST |
| AEP | 31 | 9.641 | GAWR, NoRCN, GWR, COST |
| TCM | 31 | 9.731 | GAWR, NoRCN, GWR, COST |
| COST | 31 | 14.447 | GAWR, NoRCN, GWR, MRA, AEP, TCM |

Figure 2. Box plot of the horizontal equity test findings for seven AVM types


Table 9 contains a summary of the test results.

As the test results show, all AVMs, except TCM, failed both the ANOVA test and the Kruskal-Wallis test for vertical equity. Therefore, the null hypothesis that all quintile mean ratios (and median ratios) are the same within each AVM was rejected for six of the seven AVMs. The vertical equity findings presented in table 9 tend to validate the equity findings implied in table 6: that the market values of low-value homes (quintile 1) and high-value homes (quintile 5) are the most difficult to estimate. Figure 3 provides box plots of the TCM vertical equity results, the only acceptable result of the seven AVMs tested, compared with the results for MRA, the AVM with the greatest quintile ratio |Max-Min| spread of the seven AVMs. The other five AVM results are inside the range of these two.
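The quintile mean ratios (QMRs) and the |Max-Min| spread used in the vertical equity tests can be sketched as follows. The example assumes the sample size divides evenly into five groups, as 775 does (5 × 155); the function name and the ten toy sales are hypothetical.

```python
from statistics import mean


def quintile_mean_ratios(appraised, sale_prices, n_groups=5):
    """Sort parcels by sale price, cut them into equal-sized groups,
    and return the mean A/S ratio of each group (lowest prices first).
    Assumes len(sale_prices) is divisible by n_groups."""
    pairs = sorted(zip(sale_prices, appraised))
    size = len(pairs) // n_groups
    qmrs = []
    for q in range(n_groups):
        chunk = pairs[q * size:(q + 1) * size]
        qmrs.append(mean([a / s for s, a in chunk]))
    return qmrs


# Hypothetical data: the two cheapest homes are over-appraised by 10%
sales = [100_000 * k for k in range(1, 11)]
appraised = [s * 1.1 if s <= 200_000 else s * 1.0 for s in sales]

qmrs = quintile_mean_ratios(appraised, sales)
spread = max(qmrs) - min(qmrs)   # the |Max-Min| column of table 9
print([round(q, 2) for q in qmrs], round(spread, 2))
```

A one-way ANOVA across the five groups of ratios then tests whether the QMRs differ by more than chance would allow.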

Analysis and Evaluation of Findings

In this section, each research question is presented in turn accompanied by an analysis of the relevant findings.

Table 9. ANOVA test results of AVM quintile mean ratios for vertical equity

| Group | Mean ratio | \|Max-Min\| | MSE | F-ratio (a) | p | ANOVA H0 decision (b) | Kruskal-Wallis H0 decision (c) | Ratios different? (d) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AEP | 0.935 | 0.059 | 0.0141 | 8.32 | <.001** | Reject | Reject | Yes |
| COST | 0.942 | 0.102 | 0.0293 | 10.02 | <.001** | Reject | Reject | Yes |
| GAWR | 0.983 | 0.048 | 0.0112 | 5.12 | <.001** | Reject | Reject | Yes |
| GWR | 0.985 | 0.065 | 0.0126 | 7.46 | <.001** | Reject | Reject | Yes |
| MRA | 0.989 | 0.136 | 0.0154 | 26.11 | <.001** | Reject | Reject | Yes |
| NoRCN | 0.982 | 0.053 | 0.0117 | 5.65 | <.001** | Reject | Reject | Yes |
| TCM | 0.943 | 0.043 | 0.0151 | 2.73 | 0.02813 | Accept | Accept | No |

Notes
(a) The F-ratio was for F(4,770)
(b) H0: No difference in mean QMRs
(c) H0: No difference in median QMRs
(d) Yes indicates vertical inequity because the QMRs are different by a statistically significant amount
** Term significant at p < 0.01

Figure 3. Box plots of vertical equity test findings for TCM and MRA

The findings of this research provide strong evidence that the AVM selected for use by the assessor does highly correlate with the equity and accuracy of assessments. The traditional cost method achieves the least equitable results of all the AVMs, as evidenced in table 8 and illustrated graphically in figure 2. While the findings provide significant evidence that a market-calibrated AVM will predict selling prices more accurately than a purely cost-based AVM, no statistically significant difference was found between the performance of AEP, MRA, and TCM AVMs when applied without use of the parcel centroid x-y coordinates. Finally, the findings provide strong evidence that an AVM based upon GAWR produces better horizontal equity and accuracy of assessments than other AVMs by a statistically significant amount. Thus, differences do exist among available methodologies for estimating market value of homes.

2. Does the use of parcel coordinates from GIS in the market value estimating model specification improve the measure of equity/accuracy by a statistically significant amount?

The findings provide strong evidence that when parcel centroid x-y coordinates available from GIS are applied in the GAWR model specification and calibration, the best horizontal equity and accuracy of assessments are achieved relative to the other AVMs evaluated in this research. (See table 8 and figure 2.)

3. Does the addition of attribute weighting improve the measure of equity/accuracy of the market value estimating model by a statistically significant amount over standard GWR?

The findings do not support that the addition of attribute weighting improves the measure of equity/accuracy of the GAWR market value estimating model by a statistically significant amount over standard GWR. GAWR does produce an improvement that is visible in the statistical results; however, it is not sizeable enough to reject the null hypothesis that no statistically significant difference exists.

The prior study that used the same research dataset (Moore 2006) did not evaluate horizontal and vertical equity separately as was done in the current research. An important finding of this study was that vertical equity remains a problem for most AVMs, as shown by the test results contained in table 9. TCM was the only AVM for which the vertical equity test result was acceptable.

Additional Findings

Findings using the Norfolk Dataset in GWR and GAWR

In addition to using the research dataset described in the methodology section for comparing performance of the seven AVMs, sales from the City of Norfolk, Virginia, from July 2005 to June 2007 were used to calibrate GWR and GAWR models to estimate market values of 1,575 residential properties that actually sold throughout the city from July 2007 to June 2008. Descriptive statistics for the Norfolk sales and GWR and GAWR market value estimates are contained in table 10.

Equity findings for the City of Norfolk dataset are presented in table 11.

Table 10. Descriptive statistics for the City of Norfolk dataset for GWR, GAWR, and sales

| Group | Count | Median | Mean | Std Dev | Min | Max | Med A/S | COD | PRD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sales | 1575 | 206500 | 248692 | 146209 | 60300 | 2000000 | n/a | n/a | n/a |
| GWR | 1575 | 212814 | 255409 | 144114 | 68370 | 1460350 | 1.029 | 7.956 | 1.010 |
| GAWR | 1575 | 213235 | 255452 | 144639 | 69312 | 1454034 | 1.029 | 7.489 | 1.012 |

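Table 11. Equity findings with the City of Norfolk dataset

Columns Med through VEI are overall findings; COD Q1 to Q5 are COD findings by quintile; QMR Q1 to Q5 are QMR findings by quintile.

| Model | Med | COD | PRD | VEI | COD Q1 | COD Q2 | COD Q3 | COD Q4 | COD Q5 | QMR Q1 | QMR Q2 | QMR Q3 | QMR Q4 | QMR Q5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GAWR | 1.03 | 7.49 | 1.01 | 8.0 | 8.07 | 6.54 | 6.53 | 7.13 | 7.28 | 1.09 | 1.05 | 1.03 | 1.02 | 1.01 |
| GWR | 1.03 | 7.96 | 1.01 | 6.8 | 8.67 | 7.10 | 7.22 | 7.38 | 7.98 | 1.08 | 1.05 | 1.03 | 1.01 | 1.01 |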


As with the results from the research dataset, the GAWR model using the Norfolk data yielded a lower COD than the GWR model. The Norfolk findings confirm that results comparable to those demonstrated with the research dataset are attainable in a working assessment jurisdiction. Because the City of Norfolk real estate market was declining during the 2007–2008 period and rising during most of the 2005–2007 period from which the sales were drawn, the estimated sale prices for the later period used for the testing resulted in a computed median ratio of 1.03.

The GWR/GAWR methodology developed for this research offers the capability of specifying a target median ratio for the resulting market value estimates. This feature would accommodate those jurisdictions that prefer to target an assessment median ratio in the range of 0.95 to 1.00. Testing GWR/GAWR model specification, calibration, and application on actual recent sales data from the City of Norfolk permits a real-world comparison of the research data results with those of an active production environment.

Findings using the Fairfax County Dataset in GAWR

GAWR also was applied to the Fairfax County, Virginia, data used in the 2006 model-building initiative sponsored by IAAO. The sales history dataset contained 51,190 valid residential sales from January 1967 through December 1991, and the prediction hold-out sample included 5,000 valid residential sales from January 1972 through June 1991. Table 12 compares the predictive power of GAWR against the two best practice results and the default OLS (MRA) result as reported by Clapp and O’Connor (2008). The GAWR model produced better results than any of the alternative models.

In the Clapp and O’Connor study, results were reported in terms of the absolute value percentage error, an equity measure likely unknown to property assessors. The mean absolute value percentage error is best explained as being equal to the COD when the median ratio is 1.
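The equivalence can be written out in a few lines. This sketch assumes the percentage error is measured relative to the sale price; the symbols $\hat{V}_i$ (estimated value), $S_i$ (sale price), $r_i$ (ratio), and $\tilde{r}$ (median ratio) are notation introduced here for illustration.

```latex
r_i = \frac{\hat{V}_i}{S_i}, \qquad
\text{APE}_i = 100\,\frac{\lvert \hat{V}_i - S_i \rvert}{S_i} = 100\,\lvert r_i - 1 \rvert

\text{COD} = \frac{100}{n\,\tilde{r}} \sum_{i=1}^{n} \lvert r_i - \tilde{r} \rvert
\quad\Longrightarrow\quad
\text{COD}\big|_{\tilde{r}=1} = \frac{100}{n} \sum_{i=1}^{n} \lvert r_i - 1 \rvert
= \frac{1}{n} \sum_{i=1}^{n} \text{APE}_i
```

When the median ratio differs from 1, the two measures diverge, because the COD measures dispersion around the median ratio while the mean absolute percentage error also absorbs any overall bias in the level of the estimates.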

The Fairfax County data did not include some fields, such as the total living area, that are normally included in appraisal data. Instead, the data listed the total number of rooms as a proxy for house size. These differences could lead some assessors to criticize the data and unfairly doubt the results. While the Fairfax data may not be perfect, the results are still valid because each participant worked with the same data. Valid comparisons between models can still be made to determine best practice, the aim of the original IAAO initiative.

Table 12. Equity findings with the Fairfax County data (January 1972 to June 1991)

Absolute Value Percentage Error
Model                     Mean  25th Percentile  Median  75th Percentile
Myers GAWR                10.6  3.0              6.4     11.8
Gloudemans/Montgomery(a)  11.8  3.7              7.8     14.1
Case(a)                   11.8  3.7              8.0     14.1
OLS (MRA)                 12.6  4.0              8.4     15.8

(a) The Case and Gloudemans/Montgomery models were considered the two best practice results of all the models surveyed in the Clapp/O’Connor study (2008).

Conclusions and Recommendations

Assessors have a variety of market value estimating methodologies from which to choose, and oftentimes limited resources available for performing their jobs. Evaluating the relative performance of the available automated valuation models can be difficult without the type of objective, independent information and comparative analyses provided by this research. This study has used the scientific method of statistical hypothesis testing to compare performance measurements of automated valuation models as a means of determining if some perform better than others in terms of predictive power and equity. Both horizontal and vertical equity results were evaluated for seven types of models: adaptive estimation procedure called feedback (AEP), cost method (COST), geographic-attribute weighted regression with RCN (GAWR), geographically weighted regression with RCN (GWR), multiple regression analysis (MRA), geographic-attribute weighted regression without RCN (NoRCN), and transportable cost-specified market method, also called market-calibrated cost (TCM).

The comparison of horizontal equity performance among AVMs using the research dataset (table 8) showed that distinct statistically significant differences in mean COD did exist between the AVM groups that were evaluated. COST performance was significantly worse than all other AVMs. No statistically significant difference in performance was found among MRA, AEP, and TCM, all market-based methods that did not include parcel centroid x-y coordinates. When x-y coordinates and attribute weighting were used in GAWR-based models, a statistically significant improvement in performance was found over COST and the three market-based methods that did not include coordinates (MRA, AEP, and TCM). Tests did reveal that GAWR produced a visibly lower COD than GWR in both the research dataset and the City of Norfolk dataset, but the difference was insufficient to be considered statistically significant. The similarity between the results for the research dataset and those for the City of Norfolk dataset indicates that GAWR has potential for use in a production assessment environment.

Inclusion of x-y parcel coordinates alone, however, may not guarantee best practice results, as shown by the test results of Clapp and O’Connor (2008). Methodology and model specification are important. Table 12 compares the predictive power of GAWR against the two best practice results from the 11 separate predictive models tested in the IAAO initiative as well as the default OLS (MRA) result reported by Clapp and O’Connor (p. 64, table 5). The 11 results reported in 2008 had mean absolute value percentage errors ranging from 11.8 to 27.0, whereas GAWR had a mean absolute percentage error of 10.6 against the same dataset as well as the lowest absolute percentage error at the 25th, 50th, and 75th percentiles.

A notable finding of this study was the benefit of using replacement cost new (RCN) as an independent variable in the GWR/GAWR modeling methodology, which resulted in a visible improvement in the COD results, but not a statistically significant one. The value of this finding lies in the fact that inclusion of RCN allows the methodology to be explained in terms that are understandable to nearly everyone. RCN provides the foundation of the market value estimate for the subject property based on its structural characteristics while GWR/GAWR analyzes nearby comparable sales and adjusts the RCN according to market and location influences, producing an accurate and equitable estimate of the subject’s market value. In essence, it is a comparable sales method used to calibrate the RCN and land value estimate.
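The idea of calibrating RCN against nearby, similar sales can be sketched in a few lines. This is a deliberately simplified illustration and not the authors' model specification: it assumes a Gaussian kernel in geographic distance multiplied by a Gaussian kernel in attribute distance, uses the log ratio of RCNs as a stand-in attribute distance, ignores land value, and fits a single local coefficient by weighted least squares through the origin. All names, bandwidths, and sample values are hypothetical.

```python
import math

def gawr_weight(geo_dist, attr_dist, geo_bw, attr_bw):
    """Assumed weight: Gaussian kernel in space times Gaussian kernel in attribute space."""
    return math.exp(-(geo_dist / geo_bw) ** 2) * math.exp(-(attr_dist / attr_bw) ** 2)

def local_rcn_factor(subject, sales, geo_bw=500.0, attr_bw=0.25):
    """Weighted least squares of price on RCN through the origin, centred on the subject."""
    num = den = 0.0
    for s in sales:
        d_geo = math.hypot(s["x"] - subject["x"], s["y"] - subject["y"])
        # Hypothetical attribute distance: log ratio of the two RCNs.
        d_attr = abs(math.log(s["rcn"] / subject["rcn"]))
        w = gawr_weight(d_geo, d_attr, geo_bw, attr_bw)
        num += w * s["rcn"] * s["price"]
        den += w * s["rcn"] ** 2
    return num / den

# Hypothetical sales in which every property sold for 1.2 times its RCN.
sales = [
    {"x": 0.0, "y": 100.0, "rcn": 150_000.0, "price": 180_000.0},
    {"x": 300.0, "y": 0.0, "rcn": 200_000.0, "price": 240_000.0},
    {"x": 900.0, "y": 900.0, "rcn": 400_000.0, "price": 480_000.0},
]
subject = {"x": 0.0, "y": 0.0, "rcn": 160_000.0}
factor = local_rcn_factor(subject, sales)
estimate = factor * subject["rcn"]
print(round(factor, 3), round(estimate))
```

In this sketch, sales that are both close to the subject and similar to it dominate the local coefficient, while distant or dissimilar sales contribute almost nothing.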

Unfortunately, evaluation of vertical equity indicated that AVM results are not as they should be and that more work is required to produce an AVM method that will improve vertical equity as well as horizontal equity. As shown by the test results in table 9, all AVMs except TCM failed both the ANOVA test and the Kruskal-Wallis test for vertical equity. Thus, the null hypothesis that all quintile mean ratios (and median ratios) were the same within each AVM (the test of vertical equity) was rejected for six of the seven AVMs.
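The quintile comparison can be illustrated with a small Kruskal-Wallis computation. The sketch assumes quintiles are formed by sorting on sale price, omits the tie correction, and obtains the p-value from the chi-square approximation using a closed form that holds only for an even number of degrees of freedom (four, for five quintiles). The function names and the ten sample ratios are hypothetical, and the approximation is rough for a sample this small.

```python
import math

def quintile_groups(ratios, sale_prices):
    """Split ratios into five groups ordered by sale price, lowest first."""
    ordered = [r for _, r in sorted(zip(sale_prices, ratios))]
    n = len(ordered)
    return [ordered[i * n // 5:(i + 1) * n // 5] for i in range(5)]

def kruskal_wallis(groups):
    """Return (H, p) without tie correction; p assumes an even df."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2  # average of 1-based ranks i+1..j
        i = j
    rank_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    df = len(groups) - 1
    assert df % 2 == 0, "closed-form p-value below needs an even df"
    half = h / 2
    # Chi-square survival function for even df.
    p = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))
    return h, p

# Hypothetical ratios that fall as sale price rises (a regressive pattern).
sales = [100, 110, 200, 210, 300, 310, 400, 410, 500, 510]
ratios = [1.10, 1.09, 1.06, 1.05, 1.03, 1.02, 1.01, 1.00, 0.98, 0.97]
groups = quintile_groups(ratios, sales)
h, p = kruskal_wallis(groups)
print(f"H={h:.3f}  p={p:.3f}")
```

A small p-value would lead to rejecting the null hypothesis that the quintile ratios share a common distribution, which is the form of evidence of vertical inequity described above.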

This study also highlights the weakness of PRD as a measure of vertical equity, as discussed by Moore (2008) and Jensen (2009). Comparison of the information contained in table 6 and table 9 further illustrates the weakness of PRD as an accurate measure of vertical equity. Five of the seven AVMs produced identical PRDs of 1.01, including TCM, which was the only AVM to produce acceptable vertical equity according to the more sensitive tests applied in this study. The vertical equity index (VEI), presented in table 6, is similar in concept to the COD and does correctly discriminate between AVMs relative to their respective vertical equities. The VEI was developed by Moore (2008) as a means of evaluating vertical equity in his dissertation research.

Based on the findings of this research, the first recommendation is that assessing jurisdictions utilize the investments made in their geographic information systems to improve assessment accuracy and equity by using an AVM that incorporates parcel centroid x-y coordinates, such as GWR or GAWR. Strong evidence was found in this study that a potential improvement in CODs of between 20 and 25 percent could be obtained with the same dataset over market-based AVMs that do not use the parcel centroid x-y coordinates.

The second recommendation is that more research be initiated to discover methods of improving the vertical equity results of AVMs. The findings of this study clearly demonstrate that a vertical equity problem exists. In conjunction with the research to improve vertical equity, research also should be conducted to develop a better measure of vertical equity than is attainable by the widely used price-related differential.

References

Allen, M.T., and W.H. Dare. 2002. Identifying determinants of horizontal property tax inequity: Evidence from Florida. Journal of Real Estate Research 24 (2): 153–164.

Borst, R.A. 2008. Evaluation of the Fourier transformation for modeling time trends in a hedonic model. Journal of Property Tax Assessment & Administration 5 (4): 33–40.

Borst, R.A., and W.J. McCluskey. 2008. The modified comparable sales method as the basis for a property tax valuations system and its relationship and comparison to spatially autoregressive valuation models. In Mass appraisal methods: An international perspective for property valuers, ed. T. Kauko and M. d’Amato, 49–69. Oxford, United Kingdom: Wiley-Blackwell.

Brunsdon, C., A.S. Fotheringham, and M.E. Charlton. 1996. Geographically weighted regression: A method for exploring spatial nonstationarity. Geographical Analysis 28 (4): 281–298.

Buchner, A., E. Erdfelder, and F. Faul. 1997. How to use G*Power. http://www.psycho.uni-duesseldorf.de/aap/projects/gpower/how_to_use_gpower.html (accessed May 27, 2006).

Clapp, J.M., and P.M. O’Connor. 2008. Automated valuation models of time and space: Best practice. Journal of Property Tax Assessment & Administration 5 (2): 57–67.

Cleveland, W.S. 1979. Robust locally weighted regression and smoothing scatter plots. Journal of the American Statistical Association 74:823–836.

Cleveland, W.S., and S.J. Devlin. 1988. Locally weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association 83:596–610.

Cornia, G.C., and B.A. Slade. 2005. Assessed valuation and property taxation of multifamily housing: An empirical analysis of vertical and horizontal equity and assessment methods. Journal of Real Estate Research 27 (1): 17–46.

Court, A.T. 1939. Hedonic price indexes with automotive examples. In The Dynamics of Automobile Demand, 99–117. New York: General Motors Corporation.

Des Rosiers, F., and M. Theriault. 2008. Mass appraisal, hedonic price modeling and urban externalities: Understanding property value shaping processes. In Mass appraisal methods: An international perspective for property valuers, ed. T. Kauko and M. d’Amato, 111–147. Oxford, United Kingdom: Wiley-Blackwell.

Fotheringham, A.S., C. Brunsdon, and M.E. Charlton. 2002. Geographically weighted regression: The analysis of spatially varying relationships. Chichester, West Sussex, England: John Wiley & Sons.

Gipe, G.W. 1975. Understanding multiple regression analysis. Assessors Journal 10 (4): 1–13.

Haas, C.G. 1922. Sales prices as a basis for farm land appraisal. Technical Bulletin no. 9. St. Paul: University of Minnesota Agricultural Experiment Station.

IAAO. 1997. Glossary for property appraisal and assessment. Chicago: International Association of Assessing Officers.

IAAO. 2003. Standard on automated valuation models (AVMs). http://www.iaao.org/uploads/AVM_STANDARD.pdf (accessed August 3, 2009).

IAAO. 2008. Standard on mass appraisal of real property. http://www.iaao.org/uploads/StandardOnMassAppraisal.pdf (accessed February 7, 2010).

Jensen, D.L. 2009. The effects of heterogeneous variance on the detection of regressivity and progressivity. Journal of Property Tax Assessment & Administration 6 (3): 5–22.

Jensen, J.P. 1931. Property taxes in the United States. Chicago: University of Chicago Press.

Kennedy, P. 1998. A guide to econometrics, 4th ed. Cambridge, MA: MIT Press.

Marshall & Swift. 2003. Residential cost handbook (September). Los Angeles: Marshall & Swift/Boeckh.

Matthews, S. 2007. Geographical weighted regression publication listing. R25 Advanced Spatial Analysis Training Program, Pennsylvania State University Population Research Center. http://www.pop.psu.edu/gia-core/litsearches/SAM_GWR_list.pdf (accessed September 17, 2009).

McMillen, D.P. 1996. One hundred fifty years of land values in Chicago: A nonparametric approach. Journal of Urban Economics 40 (1): 100–124.

McMillen, D.P., and C. Redfearn. 2010. Estimation and hypothesis testing for nonparametric hedonic house price functions. Journal of Regional Science 50 (3): 712–733.

Moore, J.W. 2006. Performance comparison of automated valuation models. Journal of Property Tax Assessment & Administration 3 (1): 43–59.

Moore, J.W. 2008. Evaluating property tax equity implications of capping assessment increases: Evidence from Florida. PhD diss., Northcentral University. AAT 3305724. http://www.jwaynemoore.net (accessed March 12, 2009).

Moore, J.W. 2009. A history of appraisal theory and practice: Looking back from IAAO’s 75th year. Journal of Property Tax Assessment & Administration 6 (3): 23–49.

Paez, A., F. Long, and S. Farber. 2008. Moving window approaches for hedonic price estimation: An empirical comparison of modeling techniques. Urban Studies 45 (8): 1565–1581.

Pollock, W.W., and K.W.H. Scholz. 1926. The science and practice of urban land valuation. Philadelphia, PA: Manufacturers’ Appraisal Company.

Prouty, W.L., C.W. Collins, and F.H. Prouty. 1930. Appraisers and assessors manual. New York: McGraw-Hill.

R. 2010. The R project for statistical computing. http://www.r-project.org (accessed February 12, 2010).


Shi, H., L. Zhang, and J. Liu. 2006. A new spatial-attribute weighting function for geographically weighted regression. Canadian Journal of Forest Research 36 (4): 996–1005.

Sirmans, G.S., B.A. Diskin, and H.S. Friday. 1995. Vertical inequity in the taxation of real property. National Tax Journal 48 (1): 71–84.

Tobler, W. 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46 (2): 234–240.

Yu, D., Y.D. Wei, and C. Wu. 2007. Modeling spatial dimensions of housing prices in Milwaukee, WI. Environment and Planning B: Planning and Design 34 (6): 1085–1102.

Zangerle, J.A. 1924. Principles of real estate appraising. Cleveland, OH: McMichael Publishing.
