20
This article was downloaded by: [SUNY College of Environmental Science and Forestry ] On: 05 September 2014, At: 08:58 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK International Journal of Remote Sensing Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tres20 Integration of urban growth modelling products with image-based urban change analysis Huiran Jin a & Giorgos Mountrakis a a Department of Environmental Resources Engineering , State University of New York College of Environmental Science and Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013) Integration of urban growth modelling products with image-based urban change analysis, International Journal of Remote Sensing, 34:15, 5468-5486, DOI: 10.1080/01431161.2013.791760 To link to this article: http://dx.doi.org/10.1080/01431161.2013.791760 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms- and-conditions

Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

This article was downloaded by: [SUNY College of Environmental Science and Forestry ]On: 05 September 2014, At: 08:58Publisher: Taylor & FrancisInforma Ltd Registered in England and Wales Registered Number: 1072954 Registeredoffice: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

International Journal of RemoteSensingPublication details, including instructions for authors andsubscription information:http://www.tandfonline.com/loi/tres20

Integration of urban growth modellingproducts with image-based urbanchange analysisHuiran Jin a & Giorgos Mountrakis aa Department of Environmental Resources Engineering , StateUniversity of New York College of Environmental Science andForestry , Syracuse , NY , 13210 , USAPublished online: 24 Apr 2013.

To cite this article: Huiran Jin & Giorgos Mountrakis (2013) Integration of urban growth modellingproducts with image-based urban change analysis, International Journal of Remote Sensing, 34:15,5468-5486, DOI: 10.1080/01431161.2013.791760

To link to this article: http://dx.doi.org/10.1080/01431161.2013.791760

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the“Content”) contained in the publications on our platform. However, Taylor & Francis,our agents, and our licensors make no representations or warranties whatsoever as tothe accuracy, completeness, or suitability for any purpose of the Content. Any opinionsand views expressed in this publication are the opinions and views of the authors,and are not the views of or endorsed by Taylor & Francis. The accuracy of the Contentshould not be relied upon and should be independently verified with primary sourcesof information. Taylor and Francis shall not be liable for any losses, actions, claims,proceedings, demands, costs, expenses, damages, and other liabilities whatsoever orhowsoever caused arising directly or indirectly in connection with, in relation to or arisingout of the use of the Content.

This article may be used for research, teaching, and private study purposes. Anysubstantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,systematic supply, or distribution in any form to anyone is expressly forbidden. Terms &Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

Page 2: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing, 2013Vol. 34, No. 15, 5468–5486, http://dx.doi.org/10.1080/01431161.2013.791760

Integration of urban growth modelling products with image-basedurban change analysis

Huiran Jin and Giorgos Mountrakis*

Department of Environmental Resources Engineering, State University of New York College ofEnvironmental Science and Forestry, Syracuse, NY 13210, USA

(Received 10 September 2012; accepted 28 November 2012)

Urban change detection using remotely sensed data has been extensively studied. Onecurrent application of detection products is the formulation of calibration data forurban change prediction models. As multi-temporal scenes become available and urbangrowth prediction models increase in popularity and accuracy, it is natural to envisage abi-directional relationship where, in addition to detection products assisting predictionmodels, the prediction information acts as ancillary input to enhance spectral-basedchange detection products. This closed feedback loop has the potential to significantlyincrease the accuracy of both detection and prediction efforts. Consequently, our objec-tive is to evaluate the integration of prediction information with spectral data for urbanchange monitoring. A case study was carried out in the Denver, Colorado metropolitanarea. Probabilities of urban change generated from two existing urban prediction models(based on decision trees and logistic regression) are combined as additional informa-tion content with a Landsat Thematic Mapper (TM) scene. Detailed assessments at thepixel and block scales are implemented to evaluate urban change classification accuracyusing different input data and training sample sizes. Results show that in pixel-basedassessments, the fusion of decision tree change probabilities and TM spectral bandswith sufficient training samples leads to improvements. In terms of overall accuracies,the improvement is 2.0–2.4%, from 87.3% (spectral-only model) and 87.7% (predictionprobability model) to 89.7% for the fused model. Similarly, the corresponding kappacoefficients show increases of 0.07–0.08, from 0.60 for the spectral model and 0.61 forthe urban prediction model to 0.68 for the fused model. Accuracies aggregated at theblock scale present an approximate 2.1–4.3% increase when the fusion-based model isemployed compared with the exclusive use of either spectral or prediction probabilitydata, namely 87.6% (fused) vs 83.4% (spectral) and 85.7% (prediction). It is also impor-tant to state that the standard deviation of accuracies between blocks is significantlyreduced by more than 3% (11.5% vs 14.9% and 14.7%), suggesting higher consistencyin classification performance. This is a desirable attribute for subsequent use of theseproducts, for example by the urban planning community. Statistical tests at the blockscale also demonstrate that such improvements are significant. It is also observed thatto receive the integration benefits, the remote sensing classifier needs a large but rea-sonable training data set size, while the prediction model should be based on advancedmodelling methods. Further assessments on block accuracy with respect to urbaniza-tion conditions (i.e. urban presence and change sizes) indicate the ability of the fusionto address spectral limitations, especially in blocks with high relative change. Theseinitial results encourage the expansion of spectral/prediction data fusion to other sites,modelling techniques, and input data.

*Corresponding author. Email: [email protected]

© 2013 Taylor & Francis

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 3: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5469

1. Introduction

Remote sensing is a powerful tool for land-cover and land-use mapping. One importantaspect in the application of remotely sensed imagery is the categorization and classifi-cation of spectral measurements taken from satellite sensors into various features on landsurface. However, multispectral information is not always sufficient to provide reliable land-cover/land-use discrimination; different classes may not be distinguishable due to similarspectral responses (Franklin 1995).

A possible solution to limited spectral separability is the incorporation of ancillary data.The additional information provided is usually uncorrelated to the spectral information, andcan potentially offer classification accuracy improvements (Stathakis and Kanellopoulos2008). Ancillary data are typically incorporated in image-based classification as addi-tional pseudo-bands, thereby facilitating straightforward integration (Strahler et al. 1981;Ricchetti 2000). Accordingly, many studies have been conducted on the integration of ancil-lary data achieving higher accuracies than those produced through the exclusive use ofspectral information.

In particular, topographic data and digital elevation models were integrated with mul-tispectral data to improve interpretation potential (e.g. Duguay et al. 1989; Franklin andWilson 1991; Ricchetti 2000; Trietz and Howarth 2000; Rogan et al. 2003; Stathakis andKanellopoulos 2008; Sesnie et al. 2008). Moreover, ancillary data from other sources suchas census and geospatial vectors (e.g. land use and city boundaries) have also been fusedwith spectral information to improve classification, especially for urban and suburban areas.Harris and Ventura (1995) investigated the incorporation of zoning and housing density tomodify the initial land-use classification in the city of Beaver Dam, Wisconsin. Greenbergand Bradley (1997) selected population and road density as gradient variables to assist clas-sifications in Seattle, Washington. Stefanov, Ramsey, and Christensen (2001) constructedan expert system in the Phoenix metropolitan region to perform post-classification in orderto reduce errors associated with the initial classification efforts using additional spatial datasets including land-use polygons, water rights, and city and reservation boundaries. For anarea covering almost one-third of Germany, Esch et al. (2009) derived useful informationon the residential, industrial, and transportation-related impervious surface on the basis ofa combined analysis of single-date satellite images and road network and railway vectordata. The aforementioned studies took advantage of Landsat imagery to provide spectralinformation. Gamba, Dell’Acqua, and Trianni (2007) used multitemporal synthetic aper-ture radar data with the integration of ancillary data – namely GIS layers of the urban extentand city parcel boundaries – to detect earthquake damage in urban areas.

As urbanization has increased in recent years, knowledge of the present distributionand area of urban lands is pressingly required by legislators, planners, and governmentalofficials to develop environmentally friendly land-use policies, to project transportation andenergy demands, and to improve quality of life. In contrast to conventional surveying andmapping techniques, which are expensive and time consuming, remote-sensing data aretimely and cost effective, providing temporally and spatially detailed information on urbandistribution (Weber and Puissant 2003), and therefore, have been increasingly used forurban monitoring. Several remote-sensing approaches have been proposed to extract urbanlands using methods such as multivariate regression (e.g. Yang 2006; Bauer, Loeffelholz,and Wilson 2007), spectral mixture analysis (e.g. Wu and Murray 2003; Wu 2004; Luand Weng 2006; Powell et al. 2007; Franke et al. 2009; Weng, Hu, and Liu 2009), andmachine learning models (e.g. Xian, Crane, and McMahon 2008; Esch et al. 2009; Hu andWeng 2009; Luo and Mountrakis 2010; Mountrakis and Luo 2011). A detailed review onremote-sensing classifiers for urban monitoring is provided by Weng (2012).

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 4: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5470 H. Jin and G. Mountrakis

Urban growth prediction models (UGPMs) are tasked to capture the underlying spatialand temporal complexities of urban sprawl. Input variables in UGPMs are mainly derivedfrom remote sensing-based data and field/sampling-based data. A possible set of physical,demographic, socioeconomic, and political factors are incorporated into the modelling ofurbanization. Several algorithmic methods have been employed including cellular automata(e.g. White, Engelen, and Uljee 1997; Clarke and Gaydos 1998; Syphard et al. 2005; Liet al. 2010), artificial neural networks (e.g. Pijanowski et al. 2005; Liu and Seto 2008),decision trees (e.g. Chan, Chan, and Yeh 2001; Sesnie et al. 2008), and linear/logisticregression (e.g. Seto and Kaufmann 2003; Pontius and Malanson 2005; Hu and Lo 2007;Huang, Zhang, and Wu 2009).

Until now, the relationship between remote sensing (detection) and urban growth mod-els (prediction) has been one-directional, namely detection techniques are implemented tocreate the necessary data sets to facilitate training of prediction models. As prediction mod-els increase in reliability, there is the opportunity to incorporate them in future monitoringtasks. For example, one may build a prediction model using, among other variables, remote-sensing images from 1970, 1980, 1990, and 2000. Assuming a new remotely sensed imagearrives from 2010, change detection analysis could fuse the expected change as presentedby the prediction model with the 2010 spectral information to produce a more accuratechange map.

However, the potential integration of these prediction models to image classificationhas not yet been evaluated. This article aims to investigate the effectiveness of integrat-ing urban growth-modelling products to image-based classification tasks targeting urbansprawl detection. Several integration scenarios are evaluated at the pixel and block scalesin a Denver site, taking into account practical considerations such as training sample size.For this study, urban areas are defined as residential areas, commercial/light industries,institutions, communication and utilities, heavy industries, entertainments/recreations, androads and other transportations.

2. Study area and data sets

The Denver metropolitan area was selected as the study site. It is located in the centre of theFront Range Urban Corridor, with the Southern Rocky Mountains to the west and the HighPlains to the east. As the largest city of Colorado, Denver experienced rapid population andurban growth in the 1990s, causing many of the surrounding communities to expand expo-nentially in size. According to the land-use maps provided by the US Geological SurveyRocky Mountain Mapping Center (USGS 2003), the percentage of urban cover over theentire study area increased from 48% in 1977 to 58% in 1997 – nearly one-fifth of thenon-urban land in 1977 had been converted to urban areas by 1997, indicating a strongurbanization trend during the two decades. This significant trend was also the motivationbehind this site selection for urban growth model development in other studies (e.g. Wangand Mountrakis 2010; Triantakonstantis, Mountrakis, and Wang 2011).

The area selected for this study is specified by an x-coordinate ranging from 481,500 mto 522,030 m and a y-coordinate ranging from 4,389,810 m to 4,421,340 m (UTM Zone13 North, WGS-84). It has a ground extent of 40.53 km × 31.53 km, i.e. approximately1278 km2, covering the major part of the Denver metropolitan area. A map of urban devel-opment in this area over the 20 year period, as Figure 1 shows, was generated by USGS fromthe land-use data sets for 1977 (source: 1975, 1977−1978 USGS and USDA aerial photog-raphy) and 1997 (source: 1996/1997 imagery) by manually digitizing aerial photographsover the entire site for the two time periods. To focus on relatively large-scale patterns, all

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 5: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5471

Urban areas in both 1977 and 1997

Non urban areas in both 1977 and 1997

Areas transitioned from non urban in 1977 to urban in 1997

Nkm

8420

Figure 1. Urban growth in the Denver metropolitan area from 1977 to 1997 (white and grey pixelscomprise the reference data set).

data are rasterized to a spatial resolution of 30 m × 30 m following a minimum presencerule, in which a pixel is assigned to the urban class if it overlaps with any digitized data thatrelate to urban development.

Seven land-use types are defined as urban development: residential areas,commercial/light industries, institutions, communication and utilities, heavy industries,entertainments/recreations, and roads and other transportations. Figure 1 presents the dis-tribution of existing urban areas in 1977 (black), non-urban areas in both 1977 and 1997(grey), and areas that experienced urban development from 1977 to 1997 (white). As thereverse transition from developed to non-developed areas is rarely observed by comparingthe bi-temporal land-use data sets, urban areas in 1977 are excluded from all experimentalprocedures in this study. Therefore, the entire set of white (defined as change) and grey(defined as no change) pixels is used as the population of ground reference, from whichtraining and testing samples are collected.

A Landsat 5 Thematic Mapper (TM) image, acquired on 26 June 1997, was downloadedfrom the USGS Earth Resources Observation and Science Center (EROS) website (source:http://glovis.usgs.gov/). The image is convenient to use since it is in the same projectionand coordinate system (UTM Zone 15 North, WGS-84) as the urban growth map shownin Figure 1. Here, a subscene with ground coverage spatially coincident with the urbangrowth map is used (Figure 2). It consists of 1351 × 1051 pixels with a pixel size of 30 m× 30 m. Also, the six reflective bands (bands 1−5 and 7) are extracted to provide spectralinformation for different land uses in 1997.

To evaluate the integration of urban growth models to image classification, two prob-ability maps showing different performance in urban change prediction estimated throughdecision tree (DT) and logistic regression (LR), respectively, are combined with the afore-mentioned TM image. Both urban growth models were developed using a variety ofexplanatory variables, shown in Table 1, including: (a) six proximity variables using

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 6: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5472 H. Jin and G. Mountrakis

0 2 4 8km

Figure 2. Landsat 5 TM image (pseudo-BGR bands 3, 4, and 7) acquired in June 1997.

Euclidean distances to the nearest neighbour to measure the linear accessibility of targetfeatures associated with urban expansion; (b) 15 density variables using kernel functionsto quantify from a centre pixel the weighted spatial agglomeration of target features andtheir accessibility at various scales; (c) two topographic variables of slope and elevation;and (d) one dichotomous variable to capture local heterogeneity – this variable has a valueof 1 if a pixel lies in an interior subregion encompassed by dense urbanized areas, and–1 if it belongs to an exterior subregion outside of urban structures. All these 24 predictorvariables are derived from the land-use data in 1977. It is important to note that no informa-tion from 1997 is incorporated as that is the prediction year. For further details regardingthe descriptions of predictor variables and the implementation of decision tree and logis-tic regression techniques in urban growth modelling, the reader is referred to Wang andMountrakis (2010) and Triantakonstantis, Mountrakis, and Wang (2011).

The two urban growth models were trained, leading to the creation of urban changeprobability maps (Figure 3). These maps have values continuously varying from 0 to 1, andthey describe the likelihood of every non-urban pixel in 1977 to transition to urban in 1997.The decision tree-based prediction is displayed on the left and the logistic regression-basedprediction is on the right of the figure. The closer the probability is to 1, the more likely thepixel is to change. Higher probabilities, shown in dark red, are coarsely located around themajor existing urban areas, suggesting these adjacent undeveloped areas are more likely toexperience urban expansion. Compared with the reference map (Figure 1), DT probabili-ties explicitly capture urban growth patterns in the north, southwest, and southeast areas,whereas LR probabilities are gradually branching out into the margins and are less indica-tive of the numerous and disintegrated urban changes at the finer level. Note that existingurban pixels are excluded from the change probability estimation, as no conversion fromdeveloped back to non-developed areas is taken into consideration.

3. Methods

The motivation behind this work is to assess whether integration of UGPM outputs into animage-based classification can improve classification accuracy. We do not examine in detail

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 7: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5473

Table 1. Summary of explanatory variables used for the UGPMs.

Name Description*

Proximity images created from Euclidean distancesX 1: Entertainments Distance to

entertainments/recreationsX 2: Heavy industries Distance to heavy industriesX 3: Rivers/canals Distance to rivers/canalsX 4: Primary roads Distance to primary roadsX 5: Secondary roads Distance to secondary roadsX 6: Minor roads Distance to minor roadsKernel density for binary images Searching radius (pixels)X 7: Agricultural business 120 Kernel density of agricultural

businessX 8: Residential areas 120 Kernel density of residential

areasX 9: Urban developments (excluding

roads)120 Kernel density of urban

developmentsX 10: Commercial areas 120 Kernel density of commercial

areasX 11: Institutes 120 Kernel density of

institutes/schoolsX 12: Communications/utilities 120 Kernel density of

communications/utilitiesX 13: Lakes/ponds 120 Kernel density of lakes/pondsX 14: Cultivated lands 120 Kernel density of cultivated

landsX 15: Natural vegetations 120 Kernel density of natural

vegetationsKernel density for proximity imagesX 16–X 21: Urban developments

(excluding roads)10, 30, 50, 80, 100, 150 Kernel density of distance to

urban developmentsTopographyX 22: Elevation (m) Elevation based on the mean sea

levelX 23: Slope (◦) SlopeLocal heterogeneity ValueX 24: Subregion indicator −1 An exterior urban subregion

1 An interior urban subregion

Note: *See Wang and Mountrakis (2010) and Triantakonstantis, Mountrakis, and Wang (2011) for more details.

either the UGPM inputs or training mechanism; we simply assume that their predictionresults (i.e. urban change probabilities) are available.

The procedure adopted in this study is described in Figure 4. The reference data arefirst divided into two subsets, one for calibration and the other for validation. Probabilitymaps of change generated from two different UGPMs are combined with the TM spectraldata for image classification. Results are quantitatively evaluated and compared with theclassifications obtained using spectral data or probability maps alone in terms of pixel- andblock-based accuracy measures. Existing urban pixels in 1977 (black colour in Figure 1)are excluded from the reference data, and hence not involved in either classification trainingor accuracy assessment.

The construction of training and testing data sets is discussed below, followed by twosections describing image classifications using different inputs and methods, and anothersection on accuracy assessment.

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 8: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5474 H. Jin and G. Mountrakis

Decision tree model

Predicted probability of change

0.0 0.2 0.4 0.6 0.8 1.0

Existing urban areas in 1977

Logistic regression model

km0 3 6 12

Figure 3. Prediction outputs from two different urban growth models.

Referencedata set

Split

Calibrationdata set

Validationdata set

SamplingTrainingsamples

TM bandsInput data/classificationscenarios

Classificationmethods

Urban changemaps

TM + UGLR

fusion-based mapTM + UGDT

fusion-based mapTM-based

mapUGLR-based

mapUGDT-based

map

ValidationAccuracy assessment

DT: Decision treeLR: Logistic regression

Prob: ProbabilityUG: Urban growth (prediction model)

Externally provided by an urbangrowth prediction model

*

UGDT

Prob*

Classification

Sampling

UGDT

Prob*UGLR

Prob*

UGLR

Prob*TM bands

TM bands

DTDTDTThresholdingThresholding

Figure 4. Flow chart of the procedure using one training sample set.

3.1. Sampling and validation

This study opted for the stratified random sampling technique with proportional alloca-tion to split the reference data set into two subsets: calibration and validation. As a result,proportions of no-change and change pixels in each subset consist with those in the totalpopulation. It is necessary to restate that our reference data contain only non-urban pixels

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 9: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5475

Table 2. Summary of sample sets used for calibration and validation.

Class

Sample UsageNumber of pixels (and

percentage of total) No change Change

I Calibration 20,000 (∼3%) 15,662 4338II Validation 713,059 (∼97%) 558,388 154,671Total 733,059 (100%) 574,050 159,009

Note: Urban pixels in 1977 are excluded from the reference data set and any subsequent analysis.

in 1977 that had the potential to change; existing urban pixels are excluded from allsample sets.

A total of 733,059 non-urban pixels with a constant size of 30 m × 30 m in 1977 com-prise the entire reference data set. Of these, 574,050 pixels remained non-urban (greycolour in Figure 1) and 159,009 pixels transitioned from non-urban in 1977 to urban in1997 (white colour in Figure 1). According to the sampling scheme adopted, 20,000 pixels(∼3%) are randomly extracted throughout the reference set for the calibration process, andall of the remaining 713,059 pixels (∼97%) are kept for validation. Information on the datasets used in this study is summarized in Table 2. Sample I consists of 15,662 no-changeand 4338 change pixels, and it is used as the calibration data set where training samplesare collected. All results are validated using the entire set of Sample II, which contains558,388 no-change and 154,671 change pixels. The proportion of each class in each sam-ple set is consistent with the corresponding class proportion in the entire reference data set(78.31% no-change vs 21.69% change).

Training samples are randomly collected exclusively from the calibration data set. Largesampling sizes are avoided as they would defeat the purpose of an automated and realisticimage-based classification method. Therefore, a series of sizes from 100 to 5000 pixels witha 100 pixel increment are tested to evaluate the classification performance with respectto different training sizes, and the proportion of the two classes remains the same in alltraining data sets. The collection of training data is repeated 1000 times per sample sizeto reduce the variability among multiple sampling attempts and gain a statistically betterrepresentation. Also, using various training data sets offers the possibility to yield multipleclassification results on which further assessment on modelling stability can be carried out.

3.2. Classification scenarios

Five classification scenarios are defined in order to evaluate the hypothesis that com-bining data from multiple sources and gathering spectral and probability information toachieve urban change inferences outperforms the exclusive use of a single data source.More specifically, classification models are developed using five different input data sets:(a) probabilities provided by a decision tree-based urban growth prediction model (denotedas UGDT Prob in Figure 4); (b) probabilities provided by a logistic regression-based urbangrowth prediction model (UGLR Prob); (c) six reflective bands from TM imagery; andthe joint use of TM data with (d) DT probabilities and (e) LR probabilities of change asan additional layer, respectively. Of further importance is contrasting the performance offusion-based models with classifications solely derived from spectral data on a varyingtraining size at both pixel and block scales, as it directly speaks to the applicability andpotential benefits of incorporating external urban growth products into the classification.

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 10: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5476 H. Jin and G. Mountrakis

3.3. Classification methods

3.3.1. Thresholding

A thresholding procedure is applied to every continuous probability map to categorize thevalues of the predicted probability into two classes: no change and change. Pixels withprobability values greater than the threshold are considered urban, while a pixel exhibits notransition from non-urban to urban if the value is smaller than the threshold. The motivationbehind this procedure is to test the performance of exclusive use of urban growth predictionmodelling products, so as to completely evaluate the benefits introduced by the combineduse of spectral characteristics and change probability estimations. In other words, if betterclassification can be achieved directly from either DT or LR probabilities, then there isno need to introduce spectral information and undertake more complicated classificationsbased on the integrated information contents.

For every training sample set, the threshold of each probability map is determinedthrough an exhaustive searching method which involves varying the value from 0 to 1 withan increment of 0.01 and enumerating all possible solutions. The value that results in thehighest training accuracy is finally selected and applied to the validation data set, convert-ing all continuous probabilities to binary indicators. A total of 1000 simulations identifyinglocations of new urban development by 1997 are produced from either DT or LR probabil-ities given a specific training sample size, which varies from 100 to 5000 with a 100 pixelincrement in this study.

3.3.2. Decision trees

The probability map created from the two different urban growth prediction models –decision tree and logistic regression – is applied to the subsequent image-based classifica-tion tasks as an additional input. The combined multispectral and probability informationrequires an appropriate classifier, and a non-parametric learning algorithm – decision treeclassification – is employed in this study considering its capability to be trained and exe-cuted efficiently (Pal and Mather 2003), and to accommodate a wide variety of inputdata measured on different scales (Hampson and Volper 1986; Fayyad and Irani 1992).Compared with traditional parametric approaches, such as maximum likelihood classifica-tion, decision trees can handle non-linear relationships between features and classes with noimplicit assumption required on data distributions (Breiman et al. 1984; Hansen, Dubayah,and DeFries 1996).

Since Friedl and Brodley (1997) first systematically assessed the utility of decisiontree classification algorithms for the task of land-cover mapping from remotely senseddata, decision trees (DTs) have been widely applied by the remote-sensing community as aviable alternative to traditional classifiers in many successful cases (e.g. DeFries et al. 1998;Hansen et al. 2000; Friedl et al. 2002; McIver and Friedl 2002; Brown de Colstoun et al.2003; Im and Jensen 2005; Sesnie et al. 2008; Tooke et al. 2009; Davranche, Lefebvre, andPoulin 2010). Meanwhile, some studies have also discussed the limitations of DTs, includ-ing the instability of trees to outliers or small changes in the training data set, as well as thegeneral requirement of a large number of training samples for reliable tree construction (Liand Belford 2002; Miller and Franklin 2002; Joy, Reich, and Reynolds 2003).

Here, decision tree classifications based on dichotomous partitioning (Breiman et al.1984) are performed with the Statistics Toolbox in Mathwork’s MATLAB software.Classification trees are constructed using the same training samples for each of the threeinput data sets: (a) TM spectral bands (6 inputs), (b) TM bands and DT probabilities

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 11: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5477

(7 inputs), and (c) TM bands and LR probabilities (7 inputs). Moreover, each tree is adjustedusing a 10-fold cross-validation on the training data and a pruning process. The developedtrees are then applied to classify the entire validation data set. Again, 1000 classificationsare created per input data given a specific training size with different training samples.Note that every time we use one training sample set to search for thresholds and to set upclassification rules. As a consequence, all classified maps in the flow chart (Figure 4) fromdifferent scenarios are comparable and indicative of the model performance.

3.4. Accuracy assessment

Accuracy assessment is critical to remotely sensed image classification. Defined as thedegree to which a classification corresponds to reference data, accuracy is a key criterion toquantitatively evaluate the quality of the resultant classified maps. In this study, pixel-basedand block-based assessments are implemented using the validation data set.

The most commonly reported pixel-based accuracy assessment statistic is the overallaccuracy, which summarizes the error matrix information using the sum of the diagonalentries. As a general measure of the classification performance, overall accuracy indicatesthe percentage of pixels that are correctly classified. Here, all classified maps generatedfrom different inputs and training samples are compared with the reference data on apixel-by-pixel basis, and overall accuracy is computed for the validation data exclusively.To adjust the chance agreement, the kappa coefficient is also calculated from the errormatrix to statistically determine whether a classifier is significantly better than a randomassignment of class labels (Congalton, Oderwald, and Mead 1983).

Block-scale accuracies are important in practical applications, such as urban planning,since policy-makers are more interested in urban sprawl at the block level (e.g. a 1 km2

block) rather than whether an explicit pixel area (e.g. 30 m × 30 m) has changed. In con-trast to a pixel-per-pixel comparison, such block-based assessment aggregates algorithmicperformance in coarser resolutions moving away from individual pixels.

A block-based assessment is conducted at the 1 km scale, a representative extent for theurban planning community. For this study, blocks are defined as non-overlapping windowsconsisting of 33 × 33 pixels with a pixel size of 30 m × 30 m, and each block is coveringan area of approximately 1 km2. For every assessed classified map, the value assigned toeach window is computed as the ratio of correct class predictions to total valid pixels thathad potential to change inside of the examined window. Moreover, spatial variability of theclassification performance is evaluated from both visual and statistical perspectives by cre-ating a map of accuracies at the 1 km scale for each scenario and calculating the average andstandard deviation over the study area. To establish the statistical significance of the differ-ence in performance between the classification models in terms of block-level accuracies,a two-sample t-test is undertaken and the p-value is calculated. The block-level accura-cies are further investigated with respect to urban presence in 1977 and urban developmentbetween 1977 and 1997.

4. Results

4.1. Pixel-level assessment

To assess classification performance, a series of training sample sizes were examined. Thesample size varied from 100 to 5000 with increments of 100. For a given size, trainingsamples were repetitively collected from the calibration data set on the basis of stratified

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 12: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5478 H. Jin and G. Mountrakis

random sampling, and ‘no change’ and ‘change’ samples were proportionally allocated.Every training set was used to classify each of the following input data sets: (a) changeprobabilities produced from DT modelling, (b) change probabilities produced from LRmodelling, (c) six TM reflective bands, (d) six TM reflective bands combined with anadditional layer of DT probabilities, and (e) six TM reflective bands combined with anadditional layer of LR probabilities (i.e. the five models in Figure 4). This process wasrepeated 1000 times using different training data sets, and therefore, 1000 classificationswere created for each scenario given a specific size of training samples. All results werevalidated using the entire set of Sample II, and overall accuracy and kappa coefficients aregraphically displayed by the box plots in Figures 5 and 6. Each box contains the medianvalue (central mark) and the 25th and 75th percentiles (edges of the box) calculated from1000 random training sets. Moreover, a line connecting the median accuracies is drawn foreach classification scenario.

500 1000 1500 2000Training sample size

2500 3000 3500

UGLR-based map

UGDT-based map

TM + UGLR fusion-based map

TM + UGDT fusion-based map

TM-based map

4000 4500 5000077

79

81Val

idat

ion

accu

racy

(%

)

83

85

87

89

91

Figure 5. Box plots of overall accuracy using different input data sets.

TM + UGDT fusion-based map

UGLR-based map

UGDT-based map

TM + UGLR fusion-based map

TM-based map

0.7

0.6

0.5

0.4

0.3

0.2

0.1

0.0

0 500 1000 1500 2000

Training sample size

2500 3000 3500 4000 4500 5000

Val

idat

ion

kapp

a

Figure 6. Box plots of the kappa coefficient using different input data sets.

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 13: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5479

Looking at the training sample size, UGPM classification accuracy was not signifi-cantly affected by the sample size; this is in sharp contrast to TM-based maps, for whichthe accuracy improved for larger training sample sizes. According to Figure 5, the overallaccuracies are consistently around 87.7% using UGDT probabilities and 79.6% using UGLR

probabilities, with no significant improvement when more training data are collected. It isalso evident that the UGPM based on decision trees significantly outperforms the UGPMbased on logistic regression. In contrast to the two UGPM-based scenarios, the classifi-cation accuracies of models exclusively using TM bands are sensitive to the number oftraining samples, with the median overall accuracy increasing from 82.1% to 87.3%.

Integrating UGDT probabilities with spectral-based classification brings in consid-erable improvement with overall accuracies of 87.1–89.7%, showing an increase ofnearly 2.5–5.0% over the TM-based classification. The corresponding kappa coefficient inFigure 6 presents an approximate improvement of 0.20 (from 0.41 to 0.61) for the smallesttraining size (100 samples), and of 0.08 (from 0.60 to 0.68) for the largest (5000 sam-ples). In contrast, there is only a slight increase of 0.2% in the median overall accuracy and0.01 in the median kappa coefficient when UGLR probabilities are incorporated as addi-tional information content for classification using 5000 training samples, suggesting that amedium-quality urban growth model, such as logistic regression, does not contribute muchto accuracy improvement. Therefore, the performance of integrating change probabilitieswith image-based classification tasks largely depends on the performance of urban growthmodels, and products from a high-quality model complement the TM data in urban changedetection well.

4.2. Block-level assessment

Moving away from individual pixels, accuracies of the resultant classified maps from dif-ferent input data were also evaluated within the 40 × 31 non-overlapping 1 km2 blocks(approximated by a 33 × 33 square of 30 m pixels), a commonly used unit for urban plan-ners and policy-makers. To simplify the block-based assessments and result presentations,only the 5000 training sample size was selected for the subsequent analysis, reflecting theclassification performance at the block level with sufficient time/effort being devoted fortraining data acquisition.

Spatial variability of block accuracy aggregated from all 1000 simulations using5000 training samples is displayed in Figure 7. Every block represents an approximatearea of 1 km × 1 km. In Figure 7, the non-urban 1977 percentage map is placed in a boxto be easily distinguishable from the maps of block-level accuracies. Block-level accuracyrefers to the ratio of correct pixel-based class identifications and is calculated exclusivelyusing the non-urban 1977 pixels in a specific block, in essence, the pixels that had thepotential to change into ‘urban’. Blocks in black colour were completely occupied byurban development in 1977, and hence excluded from the following assessments. Fromthe 1000 performed simulations, the median accuracy value for each block is assessed.As the colour turns from blue to red, corresponding blocks present increasing accuracies,suggesting that more and more pixels are correctly classified within the examined blocks.

Assessing the five variability maps in Figures 7(b)–(f ), the northwest and northeastportions with higher proportions of non-urban areas in 1977 (Figure 7(a)) are well classi-fied in all scenarios, whereas the north, southeast, and southwest portions closely locatedaround existing major urban development exhibit variable accuracy depending on the clas-sifier’s input data. Consistent with the box plots in Figure 5, accuracies achieved by theintegration of multispectral information and UGDT probabilities based on a large training

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 14: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5480 H. Jin and G. Mountrakis

(a) (b) (c)

(d) (e) (f)

Blocks with complete urban presence in 1977

Percentage of non-urban pixels in 1977 Accuracies: UGDT-based map Accuracies: UGLR-based map

Average: 73.6%Standard deviation: 27.4%

Average: 85.7%Standard deviation: 14.7%

Average: 83.4%Standard deviation: 14.9%

Average: 87.6%Standard deviation: 11.5%

0 20% 40% 60% 80% 100%

Average: 83.5%Standard deviation: 14.9%

Accuracies: TM-based map Accuracies: TM + UGDT fusion-based map Accuracies: TM + UGLR fusion-based map

Figure 7. Spatial variability maps of (a) non-urban block percentages in 1977, and (b)–(f ) medianblock accuracies derived from 1000 simulations using 5000 training samples.

set are considerably better compared with the separate use of either multispectral or urbangrowth probability data (Figure 7(e) vs Figures 7(b) and (d)). Particularly, Figure 7(e) dis-plays in a stronger red colour with less ‘salt and pepper’ noise (i.e. extreme and isolatedpixels in light blue showing relatively low accuracies). This indicates that the correspond-ing classification is spatially distributed in a more uniform manner within the study area.From the statistical perspective, averaged block accuracy increases from 85.7% (UGDT

probabilities) and 83.4% (TM bands) to 87.6% when multi-source data sets are fused forclassification. Equally important, standard deviation drops by more than 3%, suggestingthat accuracies at the 1 km scale are more evenly distributed with less spatial variability, asignificant advantage from the decision-making perspective. Finally, Figures 7(d) and (f )have almost the same visual appearance and statistical performance: little improvement isattained when the modestly accurate LR-based UGPM is incorporated with spectral data inclassification.

To further assess the contribution of the joint use of spectral and prediction data, a t-testwas conducted on block accuracies from the 1000 simulations between TM and UGDT

fusion-based and TM- and UGDT-based models, respectively. The null hypothesis is thattwo models using combined and individual (spectral or prediction) inputs can produceidentical results. A p-value was calculated at every block indicating the level of statisti-cal significance. Distributions of the p-value in the two pairwise comparisons are shownin Figure 8. It is clear that in both cases, only a limited number of blocks (red) presentno significant difference at the 95% confidence level (p > 0.05), while other blocks arethose where it is sufficient to reject the null hypothesis, which means that significantlyhigher accuracies can be achieved by the proposed method relative to the exclusive use

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 15: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5481

p > 0.05 0.01 < p ≤ 0.05 0.001 < p ≤ 0.01 p ≤ 0.001

Blocks with complete urban presence in 1977

(a) (b)

Figure 8. Comparison of t-test results (p-value) on block accuracies from 1000 simulations betweenTM and UGDT fusion-based and (a) TM- and (b) UGDT-based models.

Table 3. Summary of the p-value between using individual and fused input data sets.

Number of blocks (and approximate percentage)

p-Value Fused vs TM model Fused vs UGDT model

p > 0.05 59 (5%) 89 (8%)0.01 < p ≤ 0.05 13 (1%) 13 (1%)0.001 < p ≤ 0.01 12 (1%) 12 (1%)p ≤ 0.001 1065 (93%) 1035 (90%)

Note: 91 blocks (displayed in black in Figure 8) presenting complete urbanization in 1977 are excluded from thet-test of significance.

of either spectral (Figure 8(a)) or prediction (Figure 8(b)) data. The majority of the studyarea exhibits very high significance with p ≤ 0.001, and this is again demonstrated by thestatistics summarized in Table 3 (∼93% and ∼90%).

5. Discussion

The applicability of the resultant classified maps using spectral information and UGDT

probabilities can be seen in Figure 7. The 1 km scale, a representative scale for urban plan-ning communities, shows more uniformly distributed and higher block-level accuracies forthe integrated TM and UGDT model. From the practical perspective, this is a rather appeal-ing result because of its potential usefulness for urban planners to extract urban growthpatterns at the block scale, and accordingly, to make decisions on urban design. It shouldbe noted that integration of spectral and urban prediction data is most beneficial with suf-ficient training data (required for the spectral model) and advanced modelling algorithms(required for the urban prediction model).

To further investigate the urban conditions where the spectral/prediction fusion is bene-ficial, Figures 9 and 10 are presented, where accuracy differences are accessed at the 1 km2

block level based on urbanization presence (X ) and urbanization change (Y ) sizes. Blockswere extracted from Figure 7 under the assumption of 5000 training points and with theexclusion of blocks that were 100% urban in 1977. The distribution of these blocks is shown

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 16: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5482 H. Jin and G. Mountrakis

1000

Block accuracy difference (%)

400Mean = 4.1Median = 1.1Standard deviation = 11.0

300

200

Blo

ck c

ount

100

0–100 –50 0 50

Block accuracy difference (%)

100

900

800

700

600

500

400

300

200

100

00 200

Number of 30 m urban pixels in each 1 km2 block in 1977

Num

ber

of 3

0 m

cha

nged

to u

rban

pix

els

in e

ach

1 km

2 blo

ck in

199

7

400 600 800 1000–2

0

2

4

6

8

10

12

14

Figure 9. Fused vs spectral model. Smoothed accuracy differences for 5000 training point modelswith respect to urbanization presence and change (1 km2 blocks from Figure 7).

1000Mean = 1.9Median = 0.0Standard deviation = 8.0

Blo

ck c

ount

500

400

300

200

100

0

Block accurancy difference (%)

–50 0 50

900

800

700

600

500

400

300

200

100

00 200 400

Number of 30 m urban pixels in each 1 km2 block in 1977

Num

ber

of 3

0 m

cha

nged

to u

rban

pix

els

in e

ach

1 km

2 blo

ck in

199

7

600 800 1000–20

–15

–10

–5

0

5

10

15

20Block accuracy difference (%)

Figure 10. Fused vs prediction model. Smoothed accuracy differences for 5000 training pointmodels with respect to urbanization presence and change (1 km2 blocks from Figure 7).

as black points. The contour plots represent accuracy differences and are smoothened toallow identification of larger patterns.

In Figure 9, where the fused spectral/prediction model is contrasted with the spectralmodel, it is easy to identify a wide range of conditions of beneficial fusion. Especially

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 17: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5483

along the hypotenuse of the triangle, where change is substantial compared with theavailable non-urban pixels in a given block, the fusion significantly increases observedaccuracy. This shows that it is not only the size of change, where, for example, newmaterials could pose spectral constraints, but also the existing urbanization condition thatmatters.

Although this work assesses the added value of a prediction model to the spectral infor-mation, it is of equal interest (especially from the GIS/planning community) to also assessthe inverse, namely the benefits of adding spectral information to a prediction model, inessence examining the value of remotely sensed inputs. Figure 10 follows the structure ofFigure 9, but now the differences of the fused model with respect to the urban predictionmodel are visualized. It appears that integration of spectral information is helpful, with apossible exception of low change values (difficult to detect remotely) and very high relativechange sizes. It should be noted that the histograms of Figures 9 and 10 aggregate statisticsfrom all blocks, placing higher weight on more densely sampled portions of the graph. It isof greater importance to examine the areal coverage of the triangle as it relates directly todifferent urbanization conditions.

It is necessary to point out that this study was conducted under several specific con-ditions examining 24 pre-determined explanatory variables and two algorithms to modelurban growth patterns, one classification algorithm for the integration of multispectral andprobability information, and more importantly, a single site. These limited but promisingresults should encourage integration efforts of remote sensing and urban growth models,especially as the availability of multi-temporal data is increasing. Monitoring efforts couldimprove, as this manuscript has demonstrated, and improved monitoring results could inreturn assist in more accurate prediction models creating an interesting modelling feedbackloop.

6. Conclusions

Probability maps of change produced from urban growth models can act as ancillary datato image-based classification tasks. Our hypotheses investigate whether the integration ofthe output from urban growth prediction models has a potential to improve classificationaccuracy, and whether that improvement is more pronounced under different urbanizationconditions. The results presented in this study show that with sufficient training sam-ples, the fusion of spectral information and urban change probabilities from a high-qualitymodel achieved improvements of 2.0–2.4% in overall accuracy and 0.07–0.08 in kappacoefficients in pixel-based assessments compared with the exclusive use of either spec-tral or prediction probability data. Meanwhile, accuracies aggregated at the block scale areincreased by nearly 2–5% when the fusion-based model was employed, and the standarddeviation drops by more than 3%, suggesting higher consistency in classification perfor-mance. These improvements were further proved to be significant through statistical tests.Moreover, the proposed spectral/prediction fusion is beneficial to address spectral limita-tions, especially in blocks with high relative non-urban to urban transition. In the future,additional sites should be tested to evaluate the usefulness of urban growth modellingproducts in image-based urban change analysis.

AcknowledgementsThis research was supported by the National Aeronautics and Space Administration (grant no.NNX08AR11G) through a New Investigator award.

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 18: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5484 H. Jin and G. Mountrakis

ReferencesBauer, M., B. Loeffelholz, and B. Wilson. 2007. “Estimating and Mapping Impervious Surface Area

by Regression Analysis of Landsat Imagery.” In Remote Sensing of Impervious Surfaces, editedby Q. Weng, 3–19. Boca Raton, FL: CRC Press.

Breiman, L., J. H. Friedman, R. A. Olshen, and C. Stone. 1984. Classification and Regression Trees.Belmont, CA: Wadsworth.

Brown de Colstoun, E. C., M. H. Story, C. Thompson, K. Commisso, T. G. Smith, and J. R. Irons.2003. “National Park Vegetation Mapping Using Multitemporal Landsat 7 Data and a DecisionTree Classifier.” Remote Sensing of Environment 85: 316–327.

Chan, J. C., K. Chan, and A. G. Yeh. 2001. “Detecting the Nature of Change in an Urban Environment:A Comparison of Machine Learning Algorithms.” Photogrammetric Engineering and RemoteSensing 67: 213–225.

Clarke, K. C., and L. J. Gaydos. 1998. “Loose-Coupling a CA Model and GIS: Long-Term UrbanGrowth Prediction for San Francisco and Washington/Baltimore.” International Journal ofGeographical Information Science 12: 699–714.

Congalton, R. G., R. G. Oderwald, and R. A. Mead. 1983. “Assessing Landsat ClassificationAccuracy Using Discrete Multivariate Analysis Statistical Techniques.” PhotogrammetricEngineering & Remote Sensing 49: 1671–1678.

Davranche, A., G. Lefebvre, and B. Poulin. 2010. “Wetland Monitoring Using Classification Treesand SPOT-5 Seasonal Time Series.” Remote Sensing of Environment 114: 552–562.

DeFries, R. S., M. Hansen, J. R. G. Townshend, and R. Sohlberg. 1998. “Global Land CoverClassifications at 8 Km Spatial Resolution: The Use of Training Data Derived from LandsatImagery in Decision Tree Classifiers.” International Journal of Remote Sensing 19: 3141–3168.

Duguay, C., G. Holder, E. LeDrew, P. Howarth, and D. Dudycha. 1989. “A Software Packagefor Integrating Digital Elevation Models into the Digital Analysis of Remote-Sensing Data.”Computers and Geosciences 15: 669–678.

Esch, T., V. Himmler, G. Schorcht, M. Thiel, T. Wehrmann, F. Bachofer, C. Conrad, M. Schmidt, andS. Dech. 2009. “Large-Area Assessment of Impervious Surface Based on Integrated Analysisof Single-Date Landsat-7 Images and Geospatial Vector Data.” Remote Sensing of Environment113: 1678–1690.

Fayyad, U. M., and K. B. Irani. 1992. “On the Handling of Continuous-Valued Attributes in DecisionTree Generation.” Machine Learning 8: 87–102.

Franke, J., D. A. Roberts, K. Halligan, and G. Menz. 2009. “Hierarchical Multiple EndmemberSpectral Mixture Analysis (MESMA) of Hyperspectral Imagery for Urban Environments.”Remote Sensing of Environment 113: 1712–1723.

Franklin, J. 1995. “Predictive Vegetation Mapping: Geographic Modeling of Biospatial Patterns inRelation to Environmental Gradients.” Progress in Physical Geography 19: 494–519.

Franklin, S. E., and B. A. Wilson. 1991. “Spatial and Spectral Classification of Remote-SensingImagery.” Computers and Geosciences 17: 1151–1172.

Friedl, M. A., and C. Brodley. 1997. “Decision Tree Classification of Land Cover from RemotelySensed Data.” Remote Sensing of Environment 61: 399–409.

Friedl, M. A., D. K. McIver, J. C. F. Hodges, X. Y. Zhang, D. Muchoney, A. H. Strahles, C. E.Woodcock, S. Gopal, A. Schneider, A. Cooper, A. Baccini, F. Gao, and C. Schaaf. 2002.“Global Land Cover Mapping from MODIS: Algorithms and Early Results.” Remote Sensingof Environment 83: 287–302.

Gamba, P., F. Dell’Acqua, and G. Trianni. 2007. “Rapid Damage Detection in the Bam Area UsingMultitemporal SAR and Exploiting Ancillary Data.” IEEE Transactions on Geoscience andRemote Sensing 45: 1582–1589.

Greenberg, J. D., and G. A. Bradley. 1997. “Analyzing the Urban-Wildland Interface with GIS.”Journal of Forestry 95: 18–22.

Hampson, S. E., and D. J. Volper. 1986. “Linear Function Neurons: Structure and Training.”Biological Cybernetics 53: 203–217.

Hansen, M. C., R. S. DeFries, J. R. G. Townshend, and R. Sohlberg. 2000. “Global Land CoverClassification at 1 Km Spatial Resolution Using a Classification Tree Approach.” InternationalJournal of Remote Sensing 21: 1331–1364.

Hansen, M., R. Dubayah, and R. DeFries. 1996. “Classification Trees: An Alternative to TraditionalLand Cover Classifiers.” International Journal of Remote Sensing 17: 1075–1081.

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 19: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

International Journal of Remote Sensing 5485

Harris, P. M., and S. J. Ventura. 1995. “The Integration of Geographic Data with Remotely SensedImagery to Improve Classification in an Urban Area.” Photogrammetric Engineering and RemoteSensing 61: 993–998.

Hu, X., and Q. Weng. 2009. “Estimating Impervious Surfaces from Medium Spatial ResolutionImagery Using the Self-Organizing Map and Multi-Layer Perceptron Neural Networks.” RemoteSensing of Environment 113: 2089–2102.

Hu, Z., and C. P. Lo. 2007. “Modeling Urban Growth in Atlanta Using Logistic Regression.”Computers, Environment and Urban Systems 31: 667–688.

Huang, B., L. Zhang, and B. Wu. 2009. “Spatiotemporal Analysis of Rural-Urban Land Conversion.”International Journal of Geographical Information Science 23: 379–398.

Im, J., and J. R. Jensen. 2005. “A Change Detection Model Based on Neighborhood CorrelationImage Analysis and Decision Tree Classification.” Remote Sensing of Environment 99: 326–340.

Joy, S. M., R. M. Reich, and R. T. Reynolds. 2003. “A Non-Parametric Supervised Classification ofVegetation Types on the Kaibab National Forest Using Decision Trees.” International Journal ofRemote Sensing 24: 1835–1852.

Li, R. H., and G. G. Belford. 2002. “Instability of Decision Tree Classification Algorithms.” InProceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery andData Mining, Edmonton, AB, July 23–26, CD-ROM, 570–575. New York: Association forComputing Machinery.

Li, X., X. Zhang, A. Yeh, and X. Liu. 2010. “Parallel Cellular Automata for Large-ScaleUrban Simulation Using Load-Balancing Techniques.” International Journal of GeographicalInformation Science 24: 803–820.

Liu, W., and K. C. Seto. 2008. “Using the ART-MMAP Neural Network to Model and Predict UrbanGrowth: A Spatiotemporal Data Mining Approach.” Environment and Planning B: Planning andDesign 35: 296–317.

Lu, D., and Q. Weng. 2006. “Use of Impervious Surface in Urban Land Use Classification.” RemoteSensing of Environment 102: 146–160.

Luo, L., and G. Mountrakis. 2010. “Integrating Intermediate Inputs from Partially Classified ImagesWithin a Hybrid Classification Framework: An Impervious Surface Estimation Example.”Remote Sensing of Environment 114: 1220–1229.

McIver, D. K., and M. A. Friedl. 2002. “Using Prior Probabilities in Decision-Tree Classification ofRemotely Sensed Data.” Remote Sensing of Environment 81: 253–261.

Miller, J., and J. Franklin. 2002. “Modeling the Distribution of Four Vegetation Alliances UsingGeneralized Linear Models and Classification Trees with Spatial Dependence.” EcologicalModelling 157: 227–247.

Mountrakis, G., and L. Luo. 2011. “Enhancing and Replacing Spectral Information with IntermediateStructural Inputs: A Case Study on Impervious Surface Detection.” Remote Sensing ofEnvironment 115: 1162–1170.

Pal, M., and P. M. Mather. 2003. “An Assessment of the Effectiveness of Decision Tree Methods ofLand Cover Classification.” Remote Sensing of Environment 86: 554–565.

Pijanowski, B., S. Pithadia, B. Shellito, and A. Alexandridis. 2005. “Calibrating a Neural Network-Based Urban Change Model for Two Metropolitan Areas of the Upper Midwest of the UnitedStates.” International Journal of Geographical Information Science 19: 197–215.

Pontius, R. G., and J. Malanson. 2005. “Comparison of the Structure and Accuracy of Two LandChange Models.” International Journal of Geographical Information Science 19: 745–748.

Powell, R. L., D. A. Roberts, P. E. Dennison, and L. L. Hess. 2007. “Sub-Pixel Mapping of UrbanLand Cover Using Multiple Endmember Spectral Mixture Analysis: Manaus, Brazil.” RemoteSensing of Environment 106: 253–267.

Ricchetti, E. 2000. “Multispectral Satellite Image and Ancillary Data for Geological Classification.”Photogrammetric Engineering and Remote Sensing 66: 429–435.

Rogan, J., J. Miller, D. Stow, J. Franklin, L. Levien, and C. Fischer. 2003. “Land-Cover ChangeMonitoring with Classification Trees Using Landsat TM and Ancillary Data.” PhotogrammetricEngineering and Remote Sensing 69: 793–804.

Sesnie, S. E., P. E. Gessler, B. Finegan, and S. Thessler. 2008. “Integrating Landsat TM and SRTM-DEM Derived Variables with Decision Trees for Habitat Classification and Change Detection inComplex Neotropical Environments.” Remote Sensing of Environment 112: 2145–2159.

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14

Page 20: Sensing International Journal of Remote€¦ · Forestry , Syracuse , NY , 13210 , USA Published online: 24 Apr 2013. To cite this article: Huiran Jin & Giorgos Mountrakis (2013)

5486 H. Jin and G. Mountrakis

Seto, K. C., and R. K. Kaufmann. 2003. “Modeling the Drivers of Urban Land Use Change inthe Pearl River Delta, China: Integrating Remote Sensing with Socioeconomic Data.” LandEconomics 79: 106–121.

Stathakis, D., and L. Kanellopoulos. 2008. “Global Elevation Ancillary Data for Land-UseClassification Using Granular Neural Networks.” Photogrammetric Engineering and RemoteSensing 74: 55–63.

Stefanov, W. L., M. S. Ramsey, and P. R. Christensen. 2001. “Monitoring Urban Land Cover Change:An Expert System Approach to Land Cover Classification of Semiarid to Arid Urban Centers.”Remote Sensing of Environment 77: 173–185.

Strahler, A. H., J. Franklin, C. E. Woodcock, and T. L. Logan. 1981. “FOCIS: A Forest Classificationand Inventory System Using Landsat and Digital Terrain Data.” Report NFAP/255, NationwideForestry Application Program, U.S. Department of Agriculture, Houston, TX, 60 p.

Syphard, A. D., K. C. Clarke, K. C. Woodcock, and J. Franklin. 2005. “Using a Cellular AutomatonModel to Forecast the Effects of Urban Growth on Habitat Pattern in Southern California.”Ecological Complexity 2: 185–203.

Tooke, T. R., N. C. Coops, N. R. Goodwin, and J. A. Voogt. 2009. “Extracting Urban VegetationCharacteristics Using Spectral Mixture Analysis and Decision Tree Classifications.” RemoteSensing of Environment 113: 398–407.

Triantakonstantis, D., G. Mountrakis, and J. Wang. 2011. “A Spatially Heterogeneous ExpertBased (SHEB) Urban Growth Model Using Model Regionalization.” Journal of GeographicInformation System 3: 195–210.

Trietz, P., and P. Howarth. 2000. “Integrating Spectral, Spatial and Terrain Variables for ForestEcosystem Classification.” Photogrammetric Engineering and Remote Sensing 66: 305–317.

USGS (US Geological Survey Rocky Mountain Mapping Center) 2003. Front Range InfrastructureResources (Frir) Project. Accessed April 19, 2013. http://energy.cr.usgs.gov/regional_studies/frip/.

Wang, J., and G. Mountrakis. 2010. “Developing a Multi-Network Urbanization Model: A CaseStudy of Urban Growth in Denver, Colorado.” International Journal of Geographical InformationScience 25: 229–253.

Weber, C., and A. Puissant. 2003. “Urbanization Pressure and Modeling of Urban Growth: Exampleof the Tunis Metropolitan Area.” Remote Sensing of Environment 86: 341–352.

Weng, Q. 2012. “Remote Sensing of Impervious Surfaces in the Urban Areas: Requirements,Methods, and Trends.” Remote Sensing of Environment 117: 34–49.

Weng, Q., X. Hu, and H. Liu. 2009. “Estimating Impervious Surfaces Using Linear Spectral MixtureAnalysis with Multitemporal ASTER Images.” International Journal of Remote Sensing 30:4807–4830.

White, R., G. Engelen, and I. Uljee. 1997. “The Use of Constrained Cellular Automata for High-Resolution Modelling of Urban Land-Use Dynamics.” Environment and Planning B: Planningand Design 24: 323–343.

Wu, C. 2004. “Normalized Spectral Mixture Analysis for Monitoring Urban Composition UsingETM+ Imagery.” Remote Sensing of Environment 93: 480–492.

Wu, C., and A. T. Murray. 2003. “Estimating Impervious Surface Distribution by Spectral MixtureAnalysis.” Remote Sensing of Environment 84: 493–505.

Xian, G., M. Crane, and C. McMahon. 2008. “Quantifying Multi-Temporal Urban DevelopmentCharacteristics in Las Vegas from Landsat and ASTER Data.” Photogrammetric Engineeringand Remote Sensing 74: 473–481.

Yang, X. 2006. “Estimating Landscape Imperviousness Index from Satellite Imagery.” IEEEGeoscience and Remote Sensing Letters 3: 6–9.

Dow

nloa

ded

by [

SUN

Y C

olle

ge o

f E

nvir

onm

enta

l Sci

ence

and

For

estr

y ]

at 0

8:58

05

Sept

embe

r 20

14