
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 51, NO. 1, JANUARY 2013 313

Automatic Rooftop Extraction in Nadir Aerial Imagery of Suburban Regions Using Corners and Variational Level Set Evolution

Melissa Cote, Member, IEEE, and Parvaneh Saeedi, Member, IEEE

Abstract—Building profile extraction from aerial imagery constitutes a key element in numerous geospatial applications. Rooftop detection has been addressed through a variety of approaches that are, however, rarely capable of coping with conditions such as arbitrary illumination, variant reflections, and complex building profiles. This paper proposes a new method for extracting 2-D rooftop footprints from nadir aerial imagery through a fully automatic approach that handles arbitrary illumination, variant reflections, and complex building profiles without shape priors. The proposed method combines the strength of energy-based approaches with the distinctiveness of corners. Corners are assessed using multiple color and color-invariance spaces. A rooftop outline is generated from selected corner candidates and further refined to fit the best possible boundaries through level-set curve evolution that is enhanced via a mean squared error map. Experimental results confirm the ability of the presented system to effectively extract rooftop profiles with an overall average shape accuracy of 84%, correctness of 94%, completeness of 92%, and quality of 88%.

Index Terms—Aerial image processing, boundary detection, building extraction, level set evolution.

I. INTRODUCTION

THREE-DIMENSIONAL building model reconstruction from aerial imagery constitutes a key element in numerous geospatial applications including 3-D map reconstruction, urban environmental planning, and military simulations. The abundance of high-resolution images with fast update rates and the importance of keeping geographic information systems up to date have both contributed to making 3-D building reconstruction a tremendously active research area in the last decades. Rooftop detection, as the main foundation of 3-D reconstruction, has been addressed through a variety of approaches. However, most approaches are incapable of coping with complexities arising from sources such as arbitrary illumination, variant reflections, and sloped and gabled rooftops, and are limited to processing relatively simple profiles.

Manuscript received November 17, 2011; revised April 5, 2012; accepted May 15, 2012. Date of publication June 28, 2012; date of current version December 19, 2012.

The authors are with the Laboratory for Robotic Vision, School of Engineering Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TGRS.2012.2200689

This paper presents a method for extracting 2-D rooftop footprints from nadir aerial imagery that copes with arbitrary illumination, variant reflections, sloped surfaces, and complex profiles. The objective of this work is to present a fast system for finding rooftops automatically and accurately. We assume that rooftops have distinctive edges on their boundaries and that similar material (in both color and reflection properties) is used in the construction of a rooftop. If two or more materials are used in one rooftop, the rooftop will be identified as multiple pieces.

The paper is structured in the following way. Related work regarding 2-D building extraction is reviewed next, followed by the contributions of the proposed approach. Section II details the methodology, and Section III describes the developed graphical user interface (GUI). Experimental results and their qualitative and quantitative evaluations are provided in Section IV. Finally, Section V presents conclusions from this work and relevant perspectives.

A. Related Work

Two-dimensional rooftop extraction as part of 3-D building model reconstruction has been tackled through many approaches, varying according to sensor modalities and their qualitative and quantitative properties, additional input resources such as digital elevation models (DEM) or digital surface models (DSM), a priori knowledge, level of human supervision and interaction, building complexity, and shape assumptions [1]–[4].

Curve evolution/energy-based and model-based methods have been very popular among the research community for building extraction. Deformable boundaries, active contours (snakes, introduced by Kass et al. [5]), and level set formulations (introduced by Caselles et al. [6] and Malladi et al. [7]) are included in the curve evolution/energy-based category. Ruther et al. [8] developed a semi-automatic approach using DSM to generate initial raised structure hypotheses from elevation blobs that were further refined via active contours. Mayunga et al. [9] reported a semi-automatic approach for Quickbird imagery using a radial casting algorithm to initialize active contours and solve their initialization problem. In order to cope with content heterogeneity of remotely sensed data, Besbes et al. [10] proposed a semi-automatic adaptive variational segmentation method for satellite images using level set formulations and evaluating the relevance of spectral and texture features to each image region. Based on the radiometric and geometric behaviors of buildings, Peng and Liu [11] proposed an automatic method to segment monocular urban aerial images into sunshine parts of high objects, shadow regions, and sunshine ground, with building extraction further refined using a modified active contour model. Kabolizade et al. [12] developed an improved snake model (from the gradient vector flow snake, initially introduced by Xu and Prince [13]) in an automatic building extraction scheme also based on the radiometric and geometric behaviors of buildings. Building detection and initial seed selection were performed by looking for local maxima in a DSM created from light detection and ranging (LiDAR) data. Vestri and Devernay [14] presented a fully automatic system for modeling buildings using correlation and DEM. They constructed building models in two stages: segmenting the DEM into planar surface patches, and fitting a polygonal model to each segment using weak geometric constraints. Yun and Ying [15] created a variational level set model for automatic building detection that uses a neighborhood-based image analysis framework and a novel energy term related to the height and roughness of non-terrain objects derived from LiDAR data. Zhou and Neumann [16] proposed a method for creating 2.5-D building models from aerial LiDAR point clouds. They optimized the building models over the 3-D surfaces and the 2-D boundaries of rooftops. Sampath and Shan [17] presented a method for creating a polyhedral model of a building roof from LiDAR point clouds by clustering the data into a set of planar and breakline sections. They used the normal vectors of the planar sections to determine the principal directions of the roof planes.

An important and attractive feature of level set-based methods and some improved active contour-based methods is their ability to handle topological variations. Curve evolution-based methods also incorporate local features such as surface and boundary information, leading toward good localization. However, these approaches are subject to inaccuracies due to the initial seed locations, arbitrary illumination, and variations in reflection.

Model-based methods have also been reported for rooftop detection. Liu et al. [18] developed a semi-automatic rectilinear shape rooftop detection algorithm using multi-scale object-oriented classification and the probabilistic Hough transform. In another semi-automatic approach, Zhengjun et al. [19] utilized region growing and localized multi-scale object-oriented segmentation to detect small rectilinear rooftops. These rooftops were later refined and fitted to more complex profiles using a node graph search. Lafarge et al. [20] proposed an automatic model-based but costly building extraction method using DEMs. Rough approximations of buildings were first identified by rectangle layouts and refined using height discontinuities.

Model-based methods seem to be capable of dealing with partial occlusion. However, they either assume simple profiles or require a large number of prior models, impacting the quality and efficiency of the recognition process.

Combinations of model-based and curve evolution-based approaches have also been reported for building detection. Bailloeul et al. [21] introduced a system using specific geometrical information derived from prior shapes of building models integrated into a level set framework. Karantzalos and Paragios [22] proposed an automatic recognition-driven building detection method using templates (shape priors) integrated into a level set formulation. Another example is given by Li et al. [23], who developed an automatic method that combines region growing, rectangular templates, and snakes.

Fig. 1. Flowchart of the proposed rooftop detection method.

While combined methods may take advantage of the unique characteristics of both curve evolution and model-based matching, they are not necessarily successful in addressing both problems of arbitrary illumination/variant reflections and complex building profiles. They are also computationally expensive and require a large number of models to address various building profiles [22].

In our previous work [24], we developed an automatic system for detecting rooftops using straight line segments and corners. Each rooftop was defined as a polygonal shape with a variable number of vertices (a maximum of 8 for faster performance). The algorithm assumed that buildings had flat or flat-looking rooftops with only straight line segments defining rooftops. A major drawback of that work was the time complexity, which increased dramatically with the number of vertices. Two other drawbacks were the issue of gabled rooftops, where the rooftop profile was not flat (or flat looking) and included multiple pieces, and the assumption of only straight edge lines in a rooftop.

In this paper, we present a novel approach for automatic rooftop detection in nadir color aerial imagery by combining the strength of energy-based approaches with that of corners to detect complex profiles. Image corners are first assessed and selected based on local information using multiple color and color-invariance spaces to cope with arbitrary illumination and variant reflections. A polygonal rooftop outline is then generated (as the initial profile of the building) from selected corner candidates and is further refined through level set curve evolution on a mean squared error image. The performance of the system is assessed for 233 rooftops from 21 aerial test images. The system was also adapted to satellite images and tested on a sample satellite image containing 21 buildings.


COTE AND SAEEDI: AUTOMATIC ROOFTOP EXTRACTION 315

Fig. 2. Results for different stages of reference points identification. In (b), parameter k is equal to 5. This parameter is adaptive and set dynamically to be equal to the number of peaks found on the H&S histogram, which is computed for each image. (a) Filtered version of the input image; (b) histogram of the H&S components of the image; (c) k-means segmentation results using H&S components; (d) first cluster; (e) first cluster after morphological cleaning (step 1); (f) first cluster after cluster splitting (step 2); (g) first cluster: blobs after individual processing (step 3); (h) final blobs from all k-means clusters.

B. Contributions

The main contributions of this paper are as follows:

1) We propose a very intuitive, fast, and practical method for automatic extraction of multiple rooftop outlines from monocular nadir aerial imagery. The method makes no direct assumption about the color, size, and number of sides or their angular constraints. The only assumptions made are the existence of some degree of local texture smoothness and the absence of large local discontinuities in the rooftop boundaries, in the RGB and Gaussian color invariance spaces.

2) Curve evolution formulations evolving in different spaces such as color, texture, intensity, and edge have been used in the past for many applications such as medical imaging and object recognition [25]. Here, we present the use of the Gaussian color invariance space and mean squared error maps with the curve evolution algorithm for detecting building rooftops. These two spaces prevent general curve evolution-based approaches from getting trapped in spurious local image minima.

II. METHODOLOGY

The flowchart of the proposed methodology is presented in Fig. 1. It can be summarized by three main procedures:

1) Reference points identification.
2) Rooftop detection.
3) Rooftop candidates assessment.

Details of each procedure are explained next.


A. Reference Points Identification

One of the central ideas behind the proposed automatic rooftop detection algorithm is to obtain an initial point located inside each rooftop (referred to as a reference point). Once the reference points have been found, they are used as starting points inside each rooftop for the rooftop detection algorithm. The reference points are obtained from a blob detection algorithm through k-means clustering of the hue and saturation (H&S) components of the images.

First, a down-size/up-size filter (with a ratio of 8) with bicubic interpolation is applied to smooth the original RGB image [Fig. 2(a)]. The smoothed image is then transformed into the hue-saturation-value (HSV) color space to remove the dependency between the color and brightness information. A k-means clustering algorithm is applied to the H&S components [Fig. 2(c)]. k-means clustering partitions the 2-D pixels (H&S components) into k clusters through an iterative scheme. It minimizes the sum of the intra-cluster sums of the distances (of H&S data) between pixels and cluster centroids ($kc_i$). Two parameters have to be specified: the number of desired clusters (k), which is data dependent, and the initial cluster centroids. The latter can be chosen in a pseudorandom manner, but specifying them ensures repeatability of the results. In our implementation, these two parameters are determined automatically for each image through 2-D histogram analysis of the H&S values

$$k = \left|\left\{(H_{H\&S})_x = \text{regional maximum}\right\}\right| \tag{1}$$

$$kc_i = H\&S\left(\{\text{regional maxima}\}\right). \tag{2}$$

Here, {regional maxima} represents the set of peaks in the 2-D histogram $H_{H\&S}$ [Fig. 2(b)], $x$ denotes a bin, and the operator $|\ldots|$ is the cardinality. In other words, the number of clusters k is equal to the number of peaks, and the cluster centroids $kc_i$ are given by the H&S components of the peaks. Each of the k resulting clusters is further processed to obtain meaningful blobs that represent rooftops through the following steps:

1) Morphological cleaning (opening with a square structuring element of 5 × 5 pixels) is carried out on each cluster to remove very small or narrow objects and isolated pixels [Fig. 2(e)].

2) Each cluster is split into blobs where strong Canny edges are present [Fig. 2(f)]. This is done by dilating the edge map with a circular structuring element (radius of 2 pixels) and then setting all the dilated edge points to 0 in the cluster masks.

3) Every blob is then subjected to individual processing:

   a) Image border: all blobs connected to the image border are discarded since we look only for rooftops that are fully contained within the image.

   b) Eccentricity: all eccentric blobs are removed, that is, blobs for which

   $$\frac{L_{\text{minor}}}{L_{\text{major}}} < 0.175 \tag{3}$$

   with $L_{\text{minor}}$ and $L_{\text{major}}$ the lengths of the minor and major axes, respectively, of the ellipse that has the same normalized second central moments as the blob. The threshold value was selected to prevent any rooftop from being discarded (Fig. 3).

   c) Size: very small blobs are dismissed (a minimal area of 400 pixels, or about 9 m² for our data set, is utilized).

   d) Shape: all blobs with extreme shapes, defined by a center of gravity that falls outside the blob, are discarded.

   e) Vegetation: blobs corresponding to vegetation are removed. An analysis of several hundred objects (rooftops and non-rooftops) under different illuminations has shown that green vegetation (be it grass or trees) can be distinguished from rooftops based on the saturation component of the HSV color space. A vegetation mask ($M_v$) is created from pixels that have a saturation value over 0.4. Because rooftops and vegetation still overlap somewhat in saturation, a second constraint related to hue is added: the hue value has to be centered on the green value (0.333). A tolerance spanning from yellow (H = 0.1) to cyan (H = 0.5) is allowed:

   $$M_v(x, y) = \begin{cases} 1, & S(x, y) > 0.4 \ \&\ 0.1 < H(x, y) < 0.5 \\ 0, & \text{otherwise.} \end{cases} \tag{4}$$

   Here, x, y denote pixel coordinates, and S and H the saturation and hue bands, respectively.

Fig. 3. Normalized histogram of rooftop eccentricity values for over 200 rooftops. A threshold of 0.175 ensures that no rooftop is dismissed.
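The vegetation mask in (4) is a per-pixel threshold test and can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation; the function name `vegetation_mask` and its parameter names are hypothetical, and hue and saturation are assumed to be scaled to [0, 1].

```python
import numpy as np

def vegetation_mask(hue, sat, sat_min=0.4, hue_lo=0.1, hue_hi=0.5):
    """Return M_v: 1 where a pixel looks like green vegetation, else 0.

    hue and sat are 2-D arrays scaled to [0, 1] (assumed convention),
    so that green is centered on H = 0.333 as in the text.
    """
    hue = np.asarray(hue, dtype=float)
    sat = np.asarray(sat, dtype=float)
    mask = (sat > sat_min) & (hue > hue_lo) & (hue < hue_hi)
    return mask.astype(np.uint8)

# Toy 2x2 example: only the top-left pixel is both saturated and green.
H = np.array([[0.33, 0.33], [0.70, 0.05]])
S = np.array([[0.60, 0.20], [0.80, 0.90]])
Mv = vegetation_mask(H, S)
```

A blob would then be treated as vegetation according to how much of it is covered by `Mv`; the exact coverage rule is not stated in this section, so it is left out of the sketch.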

After these processing steps [Fig. 2(g)], the remaining blobs are labeled, and the reference points are determined as the blobs’ centroids. The procedure is repeated for every cluster of the k-means segmentation results, and all the blobs found from each cluster are put together [Fig. 2(h)]. Along with the x-y coordinates, a reference radius is computed for each reference point to act as a reference region. This radius is proportional to the blob size and restricted to the range of 1 to 2 m (8 to 15 pixels, automatically adjusted according to the image resolution).
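The automatic choice of k and of the initial centroids in (1) and (2) can be sketched as a peak search on the 2-D H&S histogram. The code below is a simplified illustration under stated assumptions: the bin count (16), the function name `hs_histogram_peaks`, and the peak rule (a bin strictly greater than its 8 neighbors, so flat plateaus are ignored) are choices made here, not values taken from the paper.

```python
import numpy as np

def hs_histogram_peaks(hue, sat, bins=16, min_count=1):
    """Return (k, centroids) from local maxima of the 2-D H&S histogram."""
    hist, h_edges, s_edges = np.histogram2d(
        np.ravel(hue), np.ravel(sat), bins=bins, range=[[0, 1], [0, 1]])
    # Pad with -1 so that border bins can be compared with 8 neighbors.
    padded = np.pad(hist, 1, mode="constant", constant_values=-1)
    is_peak = hist >= min_count
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbor = padded[1 + dr:1 + dr + bins, 1 + dc:1 + dc + bins]
            is_peak &= hist > neighbor
    rows, cols = np.nonzero(is_peak)
    h_centers = 0.5 * (h_edges[:-1] + h_edges[1:])
    s_centers = 0.5 * (s_edges[:-1] + s_edges[1:])
    # One centroid per peak: the (H, S) value at the center of the peak bin.
    centroids = np.column_stack([h_centers[rows], s_centers[cols]])
    return len(centroids), centroids

# Toy data: two well-separated color groups give two peaks.
hue = np.array([0.10] * 50 + [0.60] * 30)
sat = np.array([0.20] * 50 + [0.80] * 30)
k, centroids = hs_histogram_peaks(hue, sat)
```

The resulting `k` and `centroids` could then seed any k-means implementation that accepts explicit initial centers, which is what makes the segmentation repeatable from run to run.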

B. Rooftop Detection Algorithm

The rooftop detection algorithm includes the following processes:

• detecting corners in an input image;
• selecting only the best corners by looking at a vector of characteristics determined using various color and color-invariance spaces;
• computing an initial estimate of the rooftop outlines from the best corner sets (found in the previous step);
• refining the initial estimate using smart shrinking and expanding (through variational level set curve evolution).

Using the output of Section II-A, a circular marker (CM) is created at the location of each reference point. We refer to this region as the reference region (RR). The image properties of RR are used in characterizing rooftop region properties. These processes are explained in detail in the following sections.

1) Corner Detection: Corners, as points with low self-similarity, constitute the key image features used in the proposed method. In this paper, the Harris corner detector [26] is utilized to detect all potential corners. The Harris response threshold (initially set to 50) is tuned to detect an average of at least one corner per square meter. If fewer corners are detected, the response threshold is automatically reduced by 5% and the corners are detected again. This step is iterated until at least the target average is achieved. Non-maximal suppression is carried out at the pixel level. We denote the corners detected at this stage by the set $C_{\text{init}}$. Using such high sensitivity settings for the Harris corner detector ensures that no important corner is missed, but it also causes corners to be over-detected. To reduce the number of detected corners, an edge constraint is applied to remove corners that are not located on a dilated edge map of the image.
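The iterative threshold reduction described above can be sketched as a small loop. This is a hedged illustration only: `detect_corners` is a hypothetical callable standing in for a real Harris detector, and `area_m2`, `target_density`, and `max_iter` are names introduced here for the example.

```python
def detect_with_adaptive_threshold(detect_corners, area_m2, threshold=50.0,
                                   target_density=1.0, step=0.05,
                                   max_iter=200):
    """Lower the response threshold by 5% until enough corners are found.

    detect_corners(threshold) is a hypothetical callable returning a list of
    (x, y) corners whose response exceeds `threshold`.
    """
    needed = target_density * area_m2  # one corner per square meter
    corners = detect_corners(threshold)
    for _ in range(max_iter):
        if len(corners) >= needed:
            break
        threshold *= 1.0 - step
        corners = detect_corners(threshold)
    return corners, threshold

# Stand-in detector: four candidate corners with fixed response values.
responses = [60.0, 48.0, 46.0, 30.0]

def fake_detector(t):
    return [(i, i) for i, r in enumerate(responses) if r > t]

corners, final_t = detect_with_adaptive_threshold(fake_detector, area_m2=3.0)
```

In this toy run the threshold is lowered twice (50, 47.5, 45.125) before three corners are available for the 3 m² area. The `max_iter` cap is a safeguard added for the sketch so that the loop terminates on images with too few corners.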

The dilated edge map is defined by

$$M_{DE} = M_E \oplus S_C = \left\{(x, y) \mid (S_C)_{x,y} \cap M_E \neq \emptyset\right\} \tag{5}$$

with x, y being pixel coordinates from the grayscale version of the original image, $M_E$ representing the edge map obtained from Canny edge detection, and $S_C$ denoting a square structuring element (3 × 3 pixels). The corner set C is then found from image pixels satisfying

$$C = \left\{(x, y) \mid (M_{DE})_{x,y} = 1,\ (x, y) \in C_{\text{init}}\right\}. \tag{6}$$

The set C is input into the corner selection stage (Section II-B3). The edge constraint removes corners with less significance while keeping those located on the actual rooftop edges and corners.
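The edge constraint in (5) and (6) amounts to a 3 × 3 binary dilation followed by a lookup. Below is a minimal NumPy sketch under stated assumptions: the edge map is given as a boolean array (for instance from a Canny detector), corners are (x, y) tuples with x the column and y the row, and the function names are hypothetical.

```python
import numpy as np

def dilate_3x3(edge_map):
    """Binary dilation of M_E by a 3x3 square structuring element."""
    edges = np.asarray(edge_map, dtype=bool)
    padded = np.pad(edges, 1, mode="constant", constant_values=False)
    rows, cols = edges.shape
    dilated = np.zeros_like(edges)
    # OR together the nine shifted copies of the edge map.
    for dr in range(3):
        for dc in range(3):
            dilated |= padded[dr:dr + rows, dc:dc + cols]
    return dilated

def filter_corners(corners_init, edge_map):
    """Keep the corners (x, y) of C_init that lie on the dilated edge map."""
    m_de = dilate_3x3(edge_map)
    return [(x, y) for (x, y) in corners_init if m_de[y, x]]

# Toy example: a single vertical edge in column 2 of a 5x5 image.
M_E = np.zeros((5, 5), dtype=bool)
M_E[:, 2] = True
C_init = [(1, 0), (2, 3), (4, 4), (0, 2)]
C = filter_corners(C_init, M_E)
```

Corners within one pixel of the edge survive, which tolerates the small localization offset between a corner detector and an edge detector.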

2) Color and Color-Invariance Spaces Computation: The following color and color-invariance spaces are computed from the original RGB images to supply complementary information to the corner selection process: Gaussian color invariance, H&S components, and mean squared error image.

a) Gaussian color invariance image: Detecting sloped surfaces poses a major challenge because of the inherent variant lighting reflections, which cause high intensity/color discontinuities in the source images over the same structure. This is particularly true if the illumination source (the sun in this case) is not perpendicular to the earth surface, or if the different rooftop parts have steep slopes. To reduce such undesirable effects, a color invariance model is added to the system to facilitate better corner selection.

Geusebroek et al. [27] presented the measurement of colored object reflectance under various assumptions including object properties and illumination conditions. From a reflectance model, they derived illumination and geometrical invariant properties and proposed several models for different imaging conditions. In particular, one of the models tackles matte/dull objects illuminated unevenly by an equal-energy source (arbitrary illumination). This model is adopted in this paper for the following reasons:

1) most rooftops are made of matte/dull materials;
2) the visual effect of variant reflections due to sloped surfaces is similar to the one for even surfaces illuminated arbitrarily.

Fig. 4. Examples of color spaces. Original RGB image (a), corresponding Gaussian color invariant version (b), and hue and saturation components (c).

Fig. 5. Mean squared error. Section of an RGB image with the circular marker (CM) overlaid in yellow (a), and corresponding MSE image (b).

For this model, the reflected spectrum in the viewing direction is given by

$$E(\lambda, x) = i(x)\, R_\infty(\lambda, x). \tag{7}$$

Here, $x$ denotes the position at the imaging plane, $\lambda$ the wavelength, $i(x)$ the intensity variation, and $R_\infty(\lambda, x)$ the material reflectivity. From the reflected spectrum, the object’s color regardless of its intensity ($C_\lambda$) and the object reflectance property under equal energy illumination ($C_{\lambda\lambda}$) are found and form the Gaussian color invariant image $\Gamma$

$$\Gamma = (C_\lambda, C_{\lambda\lambda}), \qquad C_\lambda = \frac{E_\lambda}{E}, \qquad C_{\lambda\lambda} = \frac{E_{\lambda\lambda}}{E}. \tag{8}$$

Here, $E_\lambda$ and $E_{\lambda\lambda}$ denote the first and the second derivatives of the reflected spectrum with respect to wavelength. The reflected spectrum and its derivatives are approximated from the RGB color space using the following transformation:

$$\begin{bmatrix} E \\ E_\lambda \\ E_{\lambda\lambda} \end{bmatrix} = \begin{pmatrix} 0.3 & 0.58 & 0.11 \\ 0.25 & 0.25 & -0.05 \\ 0.5 & -0.05 & 0 \end{pmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \tag{9}$$

where R, G, and B represent the RGB color channels.

Fig. 4 shows the RGB components of a sample aerial image (a) along with its corresponding Gaussian color invariant version (b). For display purposes, $C_\lambda$ and $C_{\lambda\lambda}$ are placed into the G and R channels, respectively, while channel B is set to zero. It can be seen from Fig. 4 that the chosen model neutralizes the shaded regions of the presented gabled rooftops.

Fig. 6. Corner selection process on a cropped sample image: initial corners (a), remaining corners after conditions a (b) and b (c).
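The computation of the Gaussian color invariant image in (8) and (9) can be sketched as a matrix product followed by two divisions. The snippet below is an illustrative sketch, not the authors' code: the function name is hypothetical, the transform is passed in as an argument, the coefficients in `M9` are simply copied as they appear in (9), and the `eps` guard against division by zero is an added safeguard.

```python
import numpy as np

def gaussian_color_invariant(rgb, matrix, eps=1e-6):
    """Return (C_lambda, C_lambdalambda) for an H x W x 3 RGB image.

    matrix is the 3x3 RGB-to-(E, E_lambda, E_lambdalambda) transform of (9).
    eps avoids division by zero on black pixels (assumption of this sketch).
    """
    rgb = np.asarray(rgb, dtype=float)
    spectrum = rgb @ np.asarray(matrix, dtype=float).T
    e, e_l, e_ll = np.moveaxis(spectrum, -1, 0)
    e = np.maximum(e, eps)
    return e_l / e, e_ll / e

# Coefficients copied from (9) as printed.
M9 = [[0.30, 0.58, 0.11],
      [0.25, 0.25, -0.05],
      [0.50, -0.05, 0.00]]

pixel = np.array([[[0.4, 0.5, 0.2]]])
c_full = gaussian_color_invariant(pixel, M9)
c_half = gaussian_color_invariant(0.5 * pixel, M9)  # same surface, half the light
```

Because all three rows of the transform are linear in RGB, scaling a pixel's intensity by $i(x)$ cancels in the ratios, so `c_full` and `c_half` agree. This is the property that lets differently lit faces of a gabled roof appear alike.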

b) Hue-saturation image components: The HSV color space, which can be derived from the RGB color space, can also convey useful information regarding the presence of variant reflections. This color space separates color information (H&S) from brightness information. Hue represents basic colors while saturation refers to the purity of the color [28].

Fig. 4(c) shows the H&S components of Fig. 4(a). For display purposes, H and S are placed into the R and G channels, respectively, while channel B is set to zero. As with the Gaussian color invariance, the H&S components present some limitations, for they can have the adverse effect of removing otherwise visible differences between background and rooftop pixels. The H&S color space nonetheless constitutes pertinent data for the corner selection process.

Fig. 7. Corner selection conditions. Percentage of overlapping regions of similar pixel values on the Gaussian color invariant image with windows $W_{C_i}$ and $W_{RR}$ centered on corner $C_i$ and on CM (a), and mean squared error profile along segment $S_{C_i}$ (b).

Fig. 8. MSE profile threshold selection. Rooftop with important color variations and its surroundings (a), corresponding MSE image (b), and normalized histograms of MSE values for the rooftop and non-rooftop pixels. A threshold of 0.05 covers at least 95% of rooftop pixels.

c) Mean squared error image: In order to facilitate thecorner selection process by incorporating rooftop-specific in-formation given by the reference region RR directly into thepixel data, a mean squared error (MSE) image is computedfor each rooftop candidate as follows:

MSE(x, y) =1

AW

∑W

(Y IQ(x, y)− Y IQRR)2 . (10)

Here, $x, y$ denote the pixel coordinates, $YIQ$ the image transformed into the NTSC luma and chrominance color space, $YIQ_{RR}$ the reference values from RR in this color space, $W$ a small window (3 × 3) centered on each pixel, and $A_W$ the area of $W$. The $YIQ$ channels (derived from the original color space) provide for the (partial) elimination of the correlation between the red, green, and blue components in the input image [28], allowing for separate treatments of the brightness and the chrominance data.

Fig. 9. Initial contour for curve evolution found from best selected corners. Best selected corners (a), corresponding polygonal representations (b), and initial contours after shrinking step (c), overlaid on cropped sample RGB image.
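The computation in (10) can be sketched as follows. This is an illustration only, not the Matlab implementation: the function names are hypothetical, the RGB-to-YIQ coefficients are the commonly used rounded NTSC values, and the window is clipped at the image border, which is an assumption since border handling is not specified.

```python
# Commonly used (rounded) NTSC RGB -> YIQ coefficients.
RGB_TO_YIQ = (
    (0.299, 0.587, 0.114),
    (0.596, -0.274, -0.322),
    (0.211, -0.523, 0.312),
)


def rgb_to_yiq(pixel):
    """Convert one (r, g, b) tuple with components in [0, 1] to (y, i, q)."""
    return tuple(sum(c * v for c, v in zip(row, pixel)) for row in RGB_TO_YIQ)


def mse_image(rgb, yiq_ref, half=1):
    """Per-channel MSE map in the spirit of (10).

    rgb     : 2-D list (rows) of (r, g, b) tuples in [0, 1]
    yiq_ref : reference (y, i, q) values taken from the reference region RR
    half    : half-width of the window W (1 gives a 3 x 3 window)
    """
    h, w = len(rgb), len(rgb[0])
    yiq = [[rgb_to_yiq(p) for p in row] for row in rgb]
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = [0.0, 0.0, 0.0]
            count = 0  # A_W: number of pixels actually inside the window
            for yy in range(max(0, y - half), min(h, y + half + 1)):
                for xx in range(max(0, x - half), min(w, x + half + 1)):
                    for c in range(3):
                        acc[c] += (yiq[yy][xx][c] - yiq_ref[c]) ** 2
                    count += 1
            out[y][x] = tuple(a / count for a in acc)
    return out
```

Pixels whose neighborhood matches the reference values receive an error close to zero on all three channels, so the rooftop appears dark and uniform in the error map while its surroundings stand out.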

The choice of reference $YIQ_{RR}$ values in computing the MSE impacts the uniformity/smoothness of the error map over the rooftop. A suitable choice prevents sensitivity to the location of the circular marker and allows coping with buildings with variant reflections. The proposed method selects $YIQ_{RR}$ values based on histogram analysis and a multilevel assessment of the variability of RR according to the following:

1) Computing the standard deviation of the reference region ($\sigma_{SRR}$).

2) Estimating the normalized histogram ($NH$) over the reference region for each $YIQ$ channel and finding locations of local peaks ($LP$), defined as:

$$LP = \left\{ x \;\middle|\; \begin{array}{l} (NH)_x > (NH)_{x-1} \\ (NH)_x \ge (NH)_{x+1} \\ (NH)_x \ge Th_{LP} \end{array} \right\}. \tag{11}$$

Here, $x$ denotes a specific bin, $x-1$ the previous bin, $x+1$ the next bin, $(NH)_x$ the frequency for bin $x$, and $Th_{LP}$ the minimal acceptable frequency for a local peak. $Th_{LP}$ has been set to 15% for this work in order to dismiss peaks that are due to small rooftop objects such as chimneys and skylights, which generally represent up to 15% of the area covered by RR.

3) Determining a valid range ($R$) for $YIQ_{RR}$ values (for each $YIQ$ component) based on (11) and $\sigma_{SRR}$ from step 1):

$$R = \left[ x_1 - \frac{\sigma_{SRR}}{2},\; x_2 + \frac{\sigma_{SRR}}{2} \right]. \tag{12}$$

In each rooftop, usually one or two peaks are present. We consider three cases of local peak distributions:

• If $|LP| \ge 2$, $x_1$ and $x_2$ denote the bins of the two largest peaks ($x_1 < x_2$).
• Else if $|LP| = 1$,
  • if the peak ($x$) is significant ($(NH)_x \ge 2\,Th_{LP}$), $x_1 = x_2 = x$;
  • else if the peak ($x$) combined with one of its immediate neighboring bins is significant, $x_1 = x_2 = x'$.
• Otherwise, $R$ is set to a range between the minimum and maximum possible values in the histogram.

In computing (12), values for $x_1$ and $x_2$ are taken at the edges of each bin rather than the center.

4) In a multilevel assessment of the variability of RR, sliding a small window $SW$ (size of $RR/4 \times RR/4$) over the reference region. For all positions of $SW$, compute the mean and standard deviation values of the $YIQ$ components and add them to the sets $\mu_{SW}$ and $\sigma_{SW}$, respectively. Find values (one for each channel) from the sets $\mu_{SW}$ and $\sigma_{SW}$ and the valid range $R$:

$$YIQ_{RR} = \left\{ (\mu_{SW})_i \;\middle|\; (\sigma_{SW})_i = \max(\sigma_{SW}),\ (\mu_{SW})_i \in R \right\}. \tag{13}$$

This algorithm has proven to be effective for addressing cases of high variability due to variant reflections as well as coping with the presence of small rooftop objects, while still being efficient for the simpler cases of flat rooftops.
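Steps 2) to 4) can be sketched for a single YIQ channel as below. The function names are hypothetical, and several points are assumptions of this sketch: boundary bins are compared against a frequency of zero, the lower edge of bin $x_1$ and the upper edge of bin $x_2$ are used in (12), and (13) is read as selecting the window mean with the largest standard deviation among the windows whose mean lies in $R$. The sub-case involving $x'$ is not defined precisely enough in the text to reproduce, so it falls back to the full histogram range here.

```python
def local_peaks(nh, th_lp=0.15):
    """Bins of the normalized histogram nh that satisfy the conditions of (11)."""
    peaks = []
    for x, freq in enumerate(nh):
        prev_freq = nh[x - 1] if x > 0 else 0.0            # assumption at border
        next_freq = nh[x + 1] if x + 1 < len(nh) else 0.0  # assumption at border
        if freq > prev_freq and freq >= next_freq and freq >= th_lp:
            peaks.append(x)
    return peaks


def valid_range(nh, bin_edges, sigma_rr, th_lp=0.15):
    """Valid range R of (12); bin_edges has len(nh) + 1 entries."""
    peaks = local_peaks(nh, th_lp)
    if len(peaks) >= 2:
        top_two = sorted(peaks, key=lambda x: nh[x], reverse=True)[:2]
        x1, x2 = min(top_two), max(top_two)
    elif len(peaks) == 1 and nh[peaks[0]] >= 2 * th_lp:
        x1 = x2 = peaks[0]
    else:
        # Simplification: covers both the x' sub-case and the "otherwise" case.
        return (bin_edges[0], bin_edges[-1])
    return (bin_edges[x1] - sigma_rr / 2, bin_edges[x2 + 1] + sigma_rr / 2)


def pick_reference(means, stds, valid):
    """One reading of (13): mean of the most variable window whose mean is in R."""
    lo, hi = valid
    candidates = [(s, m) for m, s in zip(means, stds) if lo <= m <= hi]
    return max(candidates)[1] if candidates else None
```

For a histogram such as `[0.05, 0.30, 0.10, 0.05, 0.40, 0.10]`, the peaks are bins 1 and 4, and the valid range spans from the lower edge of bin 1 to the upper edge of bin 4, widened by half the standard deviation on each side.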

Fig. 5 shows a sample image with a specific building under extraction (a) and the corresponding MSE image (b). For displaying purposes, the error values related to the Y, I, and Q channels are placed into the R, G, and B channels, respectively. It can be seen that the MSE image, which is rooftop-specific, increases the contrast between the true rooftop and its background or neighboring buildings.

3) Best Corner Selection: Corners output from the corner detection step (Section II-B-1) undergo a selection scrutiny. The texture-based assessment uses a vector of characteristics computed for each corner $C_i$ in set $C$ [(6)] from the color and color-invariance spaces presented in Section II-B-2. The goal is to find a subset of the best corner candidates $C_B \subseteq C$ that will enable the computation of an efficient initial approximation of the rooftop’s profile. Fig. 6 shows the corner selection process for a sample building. $C_B$ is given by

$$C_B = \{ C_i \in C \mid \forall K \}, \quad K = \{ K_1 \text{ and } K_2 \}. \tag{14}$$

Here, $K$ denotes the set of conditions that must be satisfied by a corner to be selected. The two conditions in $K$ for texture-based corner selection are (Fig. 7):

a) Similarity of the neighborhood around each corner ($W_{C_i}$) with that of the CM ($W_{RR}$) on the $\Gamma$ image:

$$0.15 \le (W_{C_i} \cap W_{RR}) \le 0.9. \tag{15}$$

These thresholds were set to allow for concave and convex rooftop sections (for instance, a corner on a concave 90-degree section would have a similarity of neighborhood of 0.75).

Fig. 10. Refinement with level set. The evolution utilizes the MSE image but is displayed here on the RGB image. Initial zero level set (a), zero level set evolution after 50 iterations (b) and at convergence after 150 iterations (c).

Fig. 11. Gaussian color invariance standard deviation values of over 200 rooftops [sorted $\Gamma(1)$ versus $\Gamma(2)$]. A combination of thresholds of 0.10 and 0.15 prevents any rooftop from being discarded.

Fig. 12. Normalized histogram of rooftop solidity values for over 200 rooftops. A threshold of 0.75 covers all the rooftops and includes a safety margin in order not to dismiss real rooftops that would be over called (with extra small pieces for instance).

b) Smoothness of the MSE profile along the segment $S_{C_i}$ between $C_i$ and the center of CM, such that:

$$\sum_{S_{C_i}} \mathrm{MSE}(x, y) < 0.05. \tag{16}$$

The error threshold was selected so that it covers around 95% of rooftop pixels for a rooftop presenting a lot of color variation. Fig. 8 shows an example of MSE values for a rooftop and its immediate surroundings.

The selected corners ($C_B$) are then connected in the order of their polar angles with respect to the center of CM to create a polygonal representation of the rooftop ($P$) [see Fig. 9(b)].
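The two selection conditions and the polar ordering can be sketched as follows. The sketch assumes that the neighborhood overlap of (15) and the MSE sum of (16) have already been computed for every corner; the function names and the list-based interface are hypothetical.

```python
import math


def select_corners(corners, overlap, mse_sum, lo=0.15, hi=0.9, th_mse=0.05):
    """Keep corners that satisfy both K1 (15) and K2 (16).

    overlap[i] : similarity of the neighborhood of corner i with that of the CM
    mse_sum[i] : MSE accumulated along the segment from corner i to the CM center
    """
    return [
        c for c, o, m in zip(corners, overlap, mse_sum)
        if lo <= o <= hi and m < th_mse
    ]


def order_by_polar_angle(corners, center):
    """Sort (x, y) corners by their polar angle around the CM center."""
    cx, cy = center
    return sorted(corners, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
```

Connecting the sorted corners in sequence, and closing the loop from the last back to the first, gives the polygon $P$. For a star-shaped rooftop seen from the CM center this ordering follows the outline; for strongly concave profiles it may not, which is one motivation for the refinement stage.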

4) Refinement With Level Set Evolution: Some inaccuracies may be anticipated in the polygonal rooftop estimates created from the selected corners since only one set of parameters is used when processing all 233 rooftops in the data set. Also, each rooftop might have curved sides that cannot be represented accurately via a straight line connecting the two end points (corners) of the curved edge. Moreover, the polar ordering of the corners might not accurately represent the rooftop’s profile when the rooftop has a concave profile, or its profile includes protrusions such as a porch. The refinement through curve evolution allows the algorithm to recover those lost details and obtain the correct rooftop profile.

Fig. 13. Rooftop candidate assessment. Initial rooftop candidates (a) and final results (b).

Fig. 14. System’s Graphical User Interface. After buildings are extracted, outlines are converted from curves to polygons.

The objective of this stage is to refine the rooftop estimates from Section II-B3 and produce final rooftop outlines that will best fit rooftop boundaries. To accomplish this task, a curve evolution with level set is chosen as a refinement step since it can conform to local variations of the rooftop boundaries and create smooth boundaries while retaining a desired level of detail. Here, MSE images, which include smooth profiles over rooftop surfaces, are utilized to drive the evolution.

a) Initial contour: The initial contour is obtained by shrinking the polygonal representation of the rooftop ($P$) point by point (pixel by pixel) until each point’s neighborhood features are similar to those of the reference region. The implemented shrinking algorithm is as follows:

1) For each point $p_i \in P$:
   a) Compute the direction of shrinkage (unit vector $u_i$ from $p_i$ to the center of CM).
   b) Move $p_i$ along $u_i$ such that:

$$p'_i(x, y) = \left\{ \beta\, p_i(x, y)\, u_i \;\middle|\; (\mathrm{MSE})_{N_{p_i}} < Th_{IC} \right\}. \tag{17}$$

Here, $x, y$ denote pixel coordinates, $\beta$ an incremental step factor of 1 m (converted to pixels according to the image resolution), $N_{p_i}$ the neighborhood around $p_i$, and $Th_{IC}$ the maximal error allowed, set dynamically to the mean MSE value inside each polygon $P$.

2) Sort all $p_i$ according to their polar coordinates with respect to the center of CM and connect them to create a closed initial contour.

Fig. 9(c) shows an example with the initial contours.

Fig. 15. (a) Examples of output results with extracted rooftop boundaries overlaid on input images and added rooftop numbering for reference. (b) The ground truth. The width and length of the images are marked in each individual image.
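Step 1) of the shrinking algorithm can be sketched for a single polygon point as below. The callable `neighborhood_mse`, which returns the MSE over the neighborhood of a position, is a hypothetical stand-in for a lookup in the MSE image, and stopping at the CM center if the threshold is never met is an assumption of the sketch.

```python
import math


def shrink_point(p, center, neighborhood_mse, th_ic, step):
    """Move p toward center in increments of `step` pixels until the MSE of
    its neighborhood drops below th_ic (or the center is reached)."""
    x, y = p
    cx, cy = center
    while neighborhood_mse(x, y) >= th_ic:
        dx, dy = cx - x, cy - y
        dist = math.hypot(dx, dy)
        if dist <= step:
            return (cx, cy)
        # Unit vector u_i from the point to the CM center, scaled by the step.
        x += step * dx / dist
        y += step * dy / dist
    return (x, y)
```

With the 0.15 m/pixel resolution of the test images, the 1 m increment $\beta$ corresponds to `step = 1.0 / 0.15`, i.e., roughly 6.7 pixels. A point that already lies on a rooftop-like neighborhood is returned unchanged.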

b) Iterative refinement: In this paper, the level set formulation by Li et al. [29] is utilized. This is a variational method (the evolution partial differential equation is directly derived from the problem of minimizing a certain energy functional defined on the level set function) that does not require a costly periodic re-initialization procedure, because it forces the level set function to stay close to a signed distance function.

The energy functional consists of an internal energy term that penalizes the deviation of the level set function from a signed distance function. It also includes an external energy term that drives the motion of the zero level set to the desired image features. In this paper, rooftop boundaries on the mean-squared-error image are used as features. The resulting evolution of the level set function is the gradient flow that minimizes the overall energy functional $E(\phi)$, defined as

$$E(\phi) = \mu P(\phi) + E_{g,\lambda,v}(\phi). \tag{18}$$

Here, $P(\phi)$ denotes the internal energy of the function $\phi$ that penalizes the deviation of $\phi$ from a signed distance function during its evolution, $E_{g,\lambda,v}(\phi)$ is the external energy that drives the zero level set toward rooftop boundaries, $\mu$ is the coefficient of the internal energy term (0.04), $g$ is the edge indicator function, and $\lambda$ and $v$ are the external energy coefficients of the weighted length and weighted area terms (5 and −1.5). The gradient flow that minimizes the functional $E$ is

$$\frac{\partial \phi}{\partial t} = -\frac{\partial E}{\partial \phi}. \tag{19}$$

The above parameters were selected based on the recommendations in [29]. They are set to drive the evolution outward (negative $v$). The edge indicator function $g$ is computed from the grayscale version of the MSE image, which tends to be more uniform across the rooftop.
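For reference, the terms of (18) and the resulting flow (19) can be written out as below. This is a sketch of the commonly cited form of the formulation in [29], given here under that assumption rather than reproduced from the present work. The symbols $\delta$ (Dirac delta), $H$ (Heaviside function), $G_\sigma$ (Gaussian smoothing kernel), $\Omega$ (image domain), and $I$ (here, the grayscale MSE image) are not defined elsewhere in this paper.

```latex
\begin{aligned}
P(\phi) &= \int_\Omega \tfrac{1}{2}\left(|\nabla\phi| - 1\right)^2 \,dx\,dy, \\
E_{g,\lambda,v}(\phi) &= \lambda \int_\Omega g\,\delta(\phi)\,|\nabla\phi| \,dx\,dy
                        + v \int_\Omega g\,H(-\phi) \,dx\,dy, \\
g &= \frac{1}{1 + |\nabla G_\sigma * I|^2}, \\
\frac{\partial\phi}{\partial t} &= \mu\left[\Delta\phi
      - \operatorname{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right)\right]
      + \lambda\,\delta(\phi)\,\operatorname{div}\!\left(g\,\frac{\nabla\phi}{|\nabla\phi|}\right)
      + v\,g\,\delta(\phi).
\end{aligned}
```

Under this form, the area term measures the region where $\phi < 0$ (the inside of the contour), weighted by $g$. A negative $v$ therefore lowers the energy as the enclosed region grows, which is consistent with the outward evolution described above, and the factor $g$ slows the motion where the MSE image has strong gradients.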

A stopping criterion has been added to the formulation to relate the difference of overlapping areas inside the zero level curves in consecutive iterations:

$$\frac{A_{Curr} \cap A_{Prev}}{A_{Curr} \cup A_{Prev}} \le Th_C. \tag{20}$$

The numerator of (20) is the area overlap between the region inside the current zero level curve ($A_{Curr}$) and the previous one ($A_{Prev}$). $Th_C$ is the minimum allowed difference for nonconvergence, which is dynamically set to be proportional to the area of the initial contour. Convergence is usually achieved between 50 and 200 iterations.
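The left-hand side of (20) is the ratio of the intersection to the union of the regions enclosed in two consecutive iterations. A minimal sketch of that ratio on binary masks follows; the function name is hypothetical, and how the ratio is compared against $Th_C$ inside the evolution loop is left to the caller, since only the quantities of (20) are specified here.

```python
def overlap_ratio(curr, prev):
    """|A_curr ∩ A_prev| / |A_curr ∪ A_prev| for two binary masks given as
    2-D lists of equal size (truthy values mark pixels inside the curve)."""
    inter = 0
    union = 0
    for row_c, row_p in zip(curr, prev):
        for c, p in zip(row_c, row_p):
            if c and p:
                inter += 1
            if c or p:
                union += 1
    # Two empty regions are treated as identical (an assumption).
    return inter / union if union else 1.0
```

The ratio approaches 1 as the zero level curve stops moving between iterations, and it drops when the enclosed region still grows or shrinks noticeably.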

Fig. 10 shows the level set evolution for a sample image.

C. Rooftop Candidates Assessment

Sometimes, the estimated reference points (found in Section II-A) are associated with regions that exhibit rooftop-like properties which are not real rooftops. This necessitates the assessment of all found rooftops. Rooftop candidates that do not meet one of the following criteria are discarded/further processed:

1) Variation: rooftop candidates with too much intensity variation on their surfaces are dismissed. A test based on the standard deviation of the rooftop pixel values on the Gaussian color invariance space (to limit the influence of variant reflection due to gabled rooftops) is utilized:

$$\left( (\sigma_{\Gamma(1)})_{roof} > 0.15 \text{ and } (\sigma_{\Gamma(2)})_{roof} > 0.10 \right) \text{ or } \left( (\sigma_{\Gamma(1)})_{roof} > 0.10 \text{ and } (\sigma_{\Gamma(2)})_{roof} > 0.15 \right). \tag{21}$$

Here, $(\sigma_{\Gamma(1)})_{roof}$ and $(\sigma_{\Gamma(2)})_{roof}$ are the standard deviations of pixels inside a rooftop candidate on the two channels of the Gaussian color invariance image. The threshold values were selected in order not to dismiss any rooftop (Fig. 11).

Fig. 16. Minimal sensitivity to the circular marker location. Highly variant reflections (a), uniformly sunlit (b) and shadowy (c) regions all give similar acceptable results.

TABLE I. QUANTITATIVE RESULTS FOR AERIAL IMAGES

2) Solidity: rooftop candidates with low solidity, which is defined as the ratio of their area over the area of their corresponding convex hull, are rejected:

$$\mathrm{Solidity} := \frac{\mathrm{Area}_{\mathrm{roof}}}{\mathrm{Area}_{\mathrm{convex\ hull}}} < 0.75. \tag{22}$$

The threshold was set to cover all rooftops (Fig. 12).

3) Edge coverage: rooftops with boundaries that do not comply with image edges are discarded:

$$\frac{\sum M_E \cap M_{\mathrm{roof\ boundaries}}}{\mathrm{Perimeter}_{\mathrm{roof}}} < 0.90. \tag{23}$$

Here, $M_{\mathrm{roof\ boundaries}}$ denotes the rooftop candidate’s boundaries mask, and $M_E$ is the image edge map. The threshold was set so as to allow for some not well-defined edges on the rooftop boundaries (for instance due to the presence of a tree covering part of the roof).

4) Redundancy: When different reference points yield identical rooftop candidates, only one is kept.

5) Attached rooftops: the centroid of the union of rooftop candidates sharing a side and similar in color is added to the list of reference points for a second rooftop detection pass/assessment. This prevents single rooftop separation into pieces in the presence of strong internal edges.
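The first three criteria can be sketched as a single acceptance test. The sketch works on a polygonal outline rather than on pixel masks, which is a simplification, and it assumes that the two standard deviations of (21) and the edge coverage ratio of (23) are computed elsewhere. All function names are hypothetical.

```python
def polygon_area(pts):
    """Shoelace area of a simple polygon given as a list of (x, y) vertices."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def solidity(pts):
    """Area of the outline divided by the area of its convex hull, as in (22)."""
    hull_area = polygon_area(convex_hull(pts))
    return polygon_area(pts) / hull_area if hull_area else 0.0


def too_variable(sigma_g1, sigma_g2):
    """Variation test of (21) on the two Gaussian color invariance channels."""
    return (sigma_g1 > 0.15 and sigma_g2 > 0.10) or (
        sigma_g1 > 0.10 and sigma_g2 > 0.15
    )


def keep_candidate(pts, sigma_g1, sigma_g2, edge_coverage):
    """True if the candidate passes criteria 1) to 3)."""
    return (
        not too_variable(sigma_g1, sigma_g2)
        and solidity(pts) >= 0.75
        and edge_coverage >= 0.90
    )
```

As an example, an L-shaped outline with vertices (0,0), (4,0), (4,2), (2,2), (2,4), (0,4) has an area of 12 and a convex hull area of 14, giving a solidity of about 0.86, so it passes the solidity test. A deep U-shaped outline does not.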

Fig. 13 shows the rooftop candidates and the final results after their assessment for a sample image.

III. GRAPHICAL USER INTERFACE

Fig. 17. Quantitative results for aerial images: shape accuracy, correctness, completeness, and overall quality of rooftop detection against building areas. Distribution of building areas (frequency) also shown. Mean values appear with a straight horizontal line.

A GUI for this work has been developed (Fig. 14). Using this GUI, a user loads an image from the image directory. The user then has a choice to run the system automatically by pressing the “Detect Rooftop(s)” button or mark the reference points manually and then push the “Detect Rooftop(s)” button. In the first case, the system finds the reference points automatically (as described in Section II-A). In the second case, the reference point detection is bypassed, and the rest of the rooftop extraction and evaluation algorithm is executed. Once the detection is complete, the rooftop outlines are converted from the final zero level curves to polygons with a few control points or vertices for a more compact representation.

The conversion process from contour to polygon first creates one control point for each contour pixel. The number of control points is then reduced iteratively in segments of low curvature [30], yielding a density of control points proportional to the local contour curvature. Concatenation of the line segments between neighboring control points produces the final polygonal representation as seen in Fig. 14.
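The idea of thinning control points where the contour is nearly straight can be illustrated with the sketch below. It is a simple stand-in and not the algorithm of [30]: it repeatedly removes the control point with the smallest turning angle until every remaining point turns by at least `min_angle`. The function names and the 10-degree default are assumptions.

```python
import math


def turning_angle(a, b, c):
    """Absolute change of direction (radians) at b when walking a -> b -> c."""
    ang = math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(b[1] - a[1], b[0] - a[0])
    return abs((ang + math.pi) % (2 * math.pi) - math.pi)


def simplify_closed_contour(points, min_angle=math.radians(10)):
    """Iteratively drop the flattest control point of a closed contour."""
    pts = list(points)
    while len(pts) > 3:
        n = len(pts)
        angles = [
            turning_angle(pts[i - 1], pts[i], pts[(i + 1) % n]) for i in range(n)
        ]
        i = min(range(n), key=angles.__getitem__)
        if angles[i] >= min_angle:
            break
        del pts[i]
    return pts
```

Applied to the eight boundary pixels of a small square, the sketch keeps only the four corners, since the mid-side points have a turning angle of zero. Curved sections keep more points because each of them contributes a larger turn.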

The user may correct potential shape inaccuracies (if present) by manipulating these vertices. The system automatically computes rooftop areas and saves the results in an XML file for future use.

IV. EXPERIMENTAL RESULTS

The proposed method is tested using 21 aerial images (Pictometry Int. Corporation’s) with a resolution of 0.15 m/pixel. The test images are from suburban regions of Richmond, BC, Canada and contain a total of 233 buildings. The scenes include flat and gabled rooftops, simple and complex shapes with arbitrary illumination. A single set of parameters is used for all images. Qualitative and quantitative evaluations, comparisons with other methods, as well as performance aspects in terms of implementation and portability, are presented next.

A. Qualitative Evaluation

TABLE II. COMPARISON WITH OTHER METHODS

Fig. 15(a) represents typical results for several aerial images with the extracted rooftop boundaries overlaid on the input images. Fig. 15(b) represents the ground truth associated with each image. The dimensions of each image in pixels (width × length) are shown on the bottom of each image. The system successfully extracts rooftop outlines using selected corners and a level set evolution refinement. It performs well with both flat and gabled rooftops (e.g., Fig. 15 building #50 versus #65–69). It is able to recover more complex profiles such as rooftops with curved edges or concave shapes (buildings #2 and #48). Some scenes are acquired with more pronounced sun angles than others, creating a larger contrast between rooftops’ sunlit and shadowy regions (buildings #58–64). The algorithm also extracts rooftops with some degree of deterioration effectively (building #33). Additionally, the system performs well for buildings of similar color in close proximity (buildings #63–64). Fig. 16 shows that performance sensitivity to the location of the circular marker is minimal, due to the efficacy of the mean squared error computation.

The presence of weak edges combined with a similarly colored background may lead to overdetection (e.g., Fig. 15 building #38) when edges cannot be recovered from the RGB or HSV color spaces. The problem of occluded regions is not addressed by the current methodology. Trees reaching over the roofs will cause the algorithm to detect only the uncovered part of the rooftop (e.g., building #53).

The GUI allows users to easily correct missing rooftop parts. Fully automatic recovery of occluded regions would necessitate the incorporation of additional/complementary data in future work.

Fig. 18. Examples of the detection quality for [28] and [29] and the proposed method. For all cases manual seeds are considered.

B. Quantitative Evaluation

In order to quantitatively evaluate the system, extraction results are compared with manually extracted ground truth data. Four metrics which have become standard for the evaluation of man-made structure extraction [31] are used: shape accuracy, correctness, completeness, and overall quality.

The shape accuracy [32], based on the overlap between the extracted rooftop and its ground truth, is estimated by

$$\text{Shape accuracy} = 1 - \frac{|A_{GT} - A_E|}{A_{GT}}. \tag{24}$$

Here, $A_{GT}$ and $A_E$ denote a building’s area from the ground truth and the extraction process, respectively.

As errors of pixel labeling are not taken into account in the shape accuracy, additional metrics are required to obtain a complete assessment. Therefore, three metrics of correctness, completeness, and overall quality are computed. Correctness measures the degree to which detected building pixels are indeed real building pixels, whereas completeness measures the degree to which real building pixels are detected by the system. Overall quality is a normalization between the previous two:

$$\text{Correctness} = \frac{TP}{TP + FP} \tag{25}$$

$$\text{Completeness} = \frac{TP}{TP + FN} \tag{26}$$

$$\text{Overall Quality} = \frac{TP}{TP + FP + FN}. \tag{27}$$

Here, $TP$ represents true positives (correctly extracted building pixels), $FP$ false positives (incorrectly extracted building pixels), and $FN$ false negatives (missed building pixels). Optimal values for the four metrics are 1, and the overall quality cannot be higher than either correctness or completeness. Quantitative evaluation results are provided in Table I.
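The four metrics of (24) to (27) translate directly into code. The sketch below uses hypothetical function names. It also includes the identity $1/Q = 1/\text{Correctness} + 1/\text{Completeness} - 1$, which follows from (25) to (27) when all three metrics are computed from the same pixel counts.

```python
def shape_accuracy(area_gt, area_ext):
    """Shape accuracy of (24) from ground-truth and extracted areas."""
    return 1 - abs(area_gt - area_ext) / area_gt


def pixel_metrics(tp, fp, fn):
    """Correctness (25), completeness (26), and overall quality (27)."""
    return {
        "correctness": tp / (tp + fp),
        "completeness": tp / (tp + fn),
        "quality": tp / (tp + fp + fn),
    }


def quality_from(correctness, completeness):
    """Overall quality implied by correctness and completeness
    when both are computed from the same TP, FP, and FN counts."""
    return 1 / (1 / correctness + 1 / completeness - 1)
```

For instance, with 90 true positives, 10 false positives, and 30 false negatives, correctness is 0.90, completeness is 0.75, and overall quality is 90/130, about 0.69, which is lower than both. Applying the identity to a correctness of 0.944 and a completeness of 0.925 gives approximately 0.877.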

TABLE III. FREE PARAMETERS OF THE SYSTEM

In Fig. 17, the above four metrics are plotted against the rooftop areas. The mean shape accuracy is computed rooftop-wise whereas the mean values of the other three metrics are computed pixel-wise across all images. Out of 233 rooftops, 17 were missed and 19 were detected falsely. These cases correspond to zero values on the graphs. The mean shape accuracy over 233 buildings extracted by the proposed system is 84.5%, with a mean correctness of 94.4%, mean completeness of 92.5%, and mean overall quality of 88%. Compared to the shape accuracy reported in other works (81% in [8], 83.6% in [11]), our approach shows higher effectiveness. Moreover, as previously mentioned, it is capable of extracting buildings of similar color in close proximity, which was one of the main problems encountered in [8]. From Fig. 17, we can see that the few problematic cases in terms of metric values arise for small buildings (area less than 70 m²), for which pixel misclassification has a greater impact on the mean shape accuracy. Building #29 (Fig. 15) constitutes one such case, where the extracted boundaries of a detached garage have leaked to the outside. All metrics present a similar trend, with performance increasing with the building area.

C. Comparison With Other Methods

In this section, we present the comparison of the results for the proposed method and two well-known methods by Chan and Vese [33] and Caselles et al. [34].

The level-set method by Chan and Vese [33] is a region-based method that clusters the image into two homogeneous regions (according to their mean values). In this method, a function φ, defined by the signed distance function (derived from the image and updated at every iteration), is used to maximize the difference between the foreground and background regions of a contour. The second method by Caselles et al. [34] is an active contour-based method that deforms contours according to intrinsic geometric measures of the image. It combines the energy minimization approach with the snakes. This method is based on the image gradient descent to drive the contours to the areas with high image gradient. It utilizes the signed distance function, which is updated on each iteration.

In implementing Chan and Vese’s algorithm [33], the curvature term’s coefficient was set to 0.2 (for all the acquired results) and the number of iterations was set to 3000 to provide the algorithm with a fair chance to fully converge toward the rooftop boundaries. In the implementation of the method by Caselles et al. [34], the coefficients of the internal energy terms were set to 1, and the coefficient of the external energy term was set to 0.2. As with Chan and Vese’s method, the maximum number of iterations was set to 3000. Table II presents the comparison between the three methods.

In acquiring these results, the initial contours were manually initiated. We found that both methods were very sensitive to the location and local characteristics of regions inside the initial contours. Therefore, to provide these methods with a fair opportunity to perform well, the initial contours were placed in the middle of each rooftop where the region inside included variations in intensity (due to slopes). Fig. 18 shows visual results of the three methods on two sample images.

D. Performance

TABLE IV. CHANGE TRENDS OF THE FREE PARAMETERS

The proposed system is implemented in Matlab 7 and tested on a PC (CPU Intel Core2 Quad 2.4 GHz with 2 GB RAM). Since no rooftop dimension is assumed beforehand, images are initially processed in their entirety for every building. The processing time is therefore highly dependent on the dimensions of the input images as well as the amount of detail (i.e., number of corners). The entire extraction process takes on average around 20 s per rooftop to complete, considering an average image size of 400 × 400 pixels with an average of 3425 corners/image. The current processing times are substantially shorter than most of those reported in the literature (for instance [9], [23], [35], and [36]). A C/C++ implementation of the code would yield substantial improvements in terms of processing time. Using a narrowband level set implementation instead of a full-domain one could also reduce computational costs [37].

E. Sensitivity Analysis and Portability Issues

The proposed system is designed for RGB nadir aerial images. Like any complex system, there are a few parameters that are set according to the characteristics of the input imagery. Many of the parameters however are estimated automatically according to the image characteristics. Table III summarizes the free parameters that are not estimated automatically. These parameters have been presented in different sections of the proposed work. They have either been set according to histogram analyses, recommendations derived from other works, or conceptual values. Table IV presents their change trends: the effect of decreasing and increasing the values.

In order to customize the system for other types of input images (captured by different electro-optic sensors), some of these parameters must be adjusted. We consider those parameters that have to be adjusted to be the most important ones in the success of our algorithm for other images.

To demonstrate the portability of our system and to identify the most sensitive parameters, we adjusted such parameters for a satellite image of the Toulouse, France area (courtesy of Dr. Konstantinos Karantzalos from [22]). One general issue with this image was that it looked quite dark initially, and we applied histogram equalization on all bands to make the pixel distributions more uniform.

The following parameters had to be adjusted in our system so it could successfully detect buildings:

• MSE threshold was changed to 0.15 instead of 0.05,
• $v$ was changed to −2.25 instead of −1.5.

The MSE threshold is the first key parameter because it depends on the local intensity smoothness of a rooftop and its surroundings, and directly affects most corners in the best corner selection process, hence the initial polygonal rooftop estimates. Different suburban settings may comprise more or less smooth rooftops as well as more or less distinctiveness between rooftops and their surroundings. The external energy coefficient of the weighted area term ($v$) is the second key parameter because it directly affects the final refinement of the rooftop profiles by controlling the tendency of the zero level set to evolve outward or inward. It is also related to the first key parameter as the curve evolution utilizes the MSE image; therefore, it depends on similar factors.

Fig. 19. Example of the detection for a satellite image: Results for our method on an image from [22] (a), results by [22] (b) and ground truth (c).

TABLE V. QUANTITATIVE RESULTS FOR TOULOUSE IMAGES

TABLE VI. QUANTITATIVE RESULTS FOR VARIED PARAMETERS

Fig. 19 shows the results by our method (top), [22] (middle), and the ground truth (bottom). Please note that in the case when an object such as a tree covers parts of a rooftop, the ground truth boundaries do not include the hidden area of the rooftop. Table V presents the four overall scores for this image. Karantzalos et al. [22] reported correctness, completeness, and overall quality of 92.7%, 87.7%, and 82%, respectively.

These numbers cannot be compared directly, as the method in [22] uses a model-based approach with a single-channel panchromatic input image.

We conducted a sensitivity analysis of performance when the most important parameters are varied. Table VI presents the comparison for a plausible range of values for the MSE threshold and $v$, where the shape accuracy is computed rooftop-wise and the correctness, completeness and overall quality are computed pixel-wise.

From Table VI, the shape accuracy is the metric that is the most affected by varying the two key parameters. The shape accuracy of a given rooftop is sometimes lower than with the proposed method. However, the large discrepancy can be better explained by the increased number of missed real rooftops.

As it is computed rooftop-wise, missed rooftops contribute a zero shape accuracy value. As detailed in Table IV, varying the MSE threshold impacts the initial polygonal rooftop estimates by selecting more or fewer corners. Lowering the MSE threshold too much may in certain cases cause good corners to be dismissed from the list, therefore creating inadequately small rooftop estimates that will inevitably be filtered out in the rooftop candidate assessment step, decreasing the shape accuracy. On the other hand, increasing the MSE threshold too much may in certain cases include corners from neighboring rooftops or objects, yielding oversized rooftop candidates that will also be filtered out during the rooftop candidate assessment step, decreasing the mean shape accuracy. Smaller/higher negative values for the external energy coefficient of the weighted area term of the level set formulation have similar consequences: curves that do not evolve enough because of a coefficient too close to zero will yield inadequately small rooftop candidates, whereas curves that evolve too much because of a coefficient that is too negative, leaking out of the rooftop boundaries, will yield oversized rooftop candidates. Correctness values are lower for all four cases in comparison to the proposed method, whereas completeness values are sometimes higher, mainly due to increased overdetection with more real building pixels detected. The overall quality remains good for all varied parameters except for the minimal MSE threshold, in which case too many rooftop pixels are missed due to an overly strict acceptable mean squared error. The proposed nominal values show the best results.

V. CONCLUSION

This paper presented a novel automatic approach for fast 2-D rooftop extraction from nadir aerial imagery. Through an automatic blob detection stage, reference points on rooftop candidates are first determined from uniform regions. The rooftop detection algorithm is based on modified Harris corners assessed through a multistage selection process, and an energy minimization with a variational level set formulation to refine the extracted rooftop outlines. Gaussian color invariants, along with several other color spaces, are incorporated to allow proper extraction under arbitrary illumination and variant reflection. Experimental results for various aerial images confirm the method’s ability to accurately extract rooftops. Extension of the approach to satellite imagery is expected to be straightforward, requiring minor adjustments to the detection/selection parameters. Real-time performance is also expected from the migration from the Matlab environment to C/C++ combined with a narrowband level set implementation.

ACKNOWLEDGMENT

The authors would like to acknowledge with gratitude the NSERC Canada and MacDonald Dettwiler and Associates Ltd. for their support through the NSERC Strategic Grant Program. The authors are also grateful to Dr. K. Karantzalos for providing the Toulouse test image.

REFERENCES

[1] H. Mayer, “Automatic object extraction from aerial imagery—A surveyfocusing on buildings,” Comput. Vis. Image Understanding, vol. 74, no. 2,pp. 138–149, May 1999.

[2] G. Ferraioli, “Multichannel InSAR building edge detection,” IEEE Trans.Geosci. Remote Sens., vol. 48, no. 3, pp. 1224–1231, Mar. 2010.

[3] B. Sırmaçek and C. Ünsalan, “A probabilistic framework to detect build-ings in aerial and satellite images,” IEEE Trans. Geosci. Remote Sens.,vol. 49, no. 1, pp. 211–221, Jan. 2011.

[4] N. Haala and M. Kada, “An update on automatic 3D building reconstruc-tion,” ISPRS J. Photogramm. Remote Sens., vol. 65, no. 6, pp. 570–580,Nov. 2010.

[5] M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active contour mod-els,” Int. J. Comput. Vis., vol. 1, no. 4, pp. 321–331, 1987.

[6] V. Caselles, F. Catte, T. Coll, and F. Dibos, “A geometric model for activecontours in image processing,” Numer. Math., vol. 66, no. 1, pp. 1–31,Dec. 1993.

[7] R. Malladi, J. A. Sethian, and B. C. Vemuri, “Shape modeling with front propagation: A level set approach,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 2, pp. 158–175, Feb. 1995.

[8] H. Rüther, H. M. Martine, and E. G. Mtalo, “Application of snakes and dynamic programming optimisation technique in modeling of buildings in informal settlement areas,” ISPRS J. Photogramm. Remote Sens., vol. 56, no. 4, pp. 269–282, Jul. 2002.

[9] S. D. Mayunga, Y. Zhang, and D. J. Coleman, “Semi-automatic building extraction utilizing Quickbird imagery,” in Proc. ISPRS Workshop Object Extraction 3D City Models, Road Databases Traffic Monitor.—Concepts, Algorithms and Evaluation, 2005, pp. 131–136.

[10] O. Besbes, Z. Belhadj, and N. Boujemaa, “A variational framework for adaptive satellite images segmentation,” in Proc. 1st Int. Conf. Scale Space Variational Methods Comput. Vis., 2007, pp. 675–686.

[11] J. Peng and Y. C. Liu, “Model and context-driven building extraction in dense urban aerial images,” Int. J. Remote Sens., vol. 26, no. 7, pp. 1289–1307, 2005.

[12] M. Kabolizade, H. Ebadi, and S. Ahmadi, “An improved snake model for automatic extraction of buildings from urban aerial images and LiDAR data,” Comput., Environ. Urban Syst., vol. 34, no. 5, pp. 435–441, Aug. 2010.

[13] C. Xu and J. L. Prince, “Snakes, shapes, and gradient vector flow,” IEEE Trans. Image Process., vol. 7, no. 3, pp. 359–369, Mar. 1998.

[14] C. Vestri and F. Devernay, “Using robust methods for automatic extraction of buildings,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Kauai, HI, 2001, vol. 1, pp. 133–138.

[15] Y. Yun and L. Ying, “Object-based level set model for building detection in urban area,” in Proc. Joint Urban Remote Sens. Event, Shanghai, China, 2009, pp. 1–6.

[16] Q. Zhou and U. Neumann, “2.5D dual contouring: A robust approach to creating building models from aerial lidar point clouds,” in Proc. 11th Eur. Conf. Comput. Vis., 2010, pp. 115–128.

[17] A. Sampath and J. Shan, “Segmentation and reconstruction of polyhedral building roofs from aerial lidar point clouds,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 3, pp. 1554–1567, Mar. 2010.

[18] Z. J. Liu, J. Wang, and W. P. Liu, “Building extraction from high resolution imagery based on multi-scale object oriented classification and probabilistic Hough transform,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2005, pp. 2250–2253.

[19] Z. Liu, S. Cui, and Q. Yan, “Building extraction from high resolution satellite imagery based on multi-scale image segmentation and model matching,” in Proc. Int. Workshop EORSA, Shanghai, China, 2008, pp. 1–7.

[20] F. Lafarge, X. Descombes, J. Zerubia, and M. Pierrot-Deseilligny, “Automatic building extraction from DEMs using an object approach and application to the 3D-city modeling,” ISPRS J. Photogramm. Remote Sens., vol. 63, no. 3, pp. 365–381, 2008.

[21] T. Bailloeul, V. Prinet, B. Serra, and P. Marthon, “Spatio-temporal prior shape constraint for level set segmentation,” in Proc. 5th Int. Workshop Energy Minimization Methods Comput. Vis. Pattern Recognit., 2005, pp. 503–519.

[22] K. Karantzalos and N. Paragios, “Recognition-driven two-dimensional competing priors toward automatic and accurate building detection,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 1, pp. 133–144, Jan. 2009.

[23] G. Li, Y. Wan, and C. Chen, “Automatic building extraction based on region growing, mutual information match and snake model,” in Proc. Int. Conf. Inf. Comput. Appl., 2010, pp. 476–483.

[24] M. S. Nosrati and P. Saeedi, “A novel approach for polygonal rooftop detection in satellite/aerial imageries,” in Proc. IEEE Int. Conf. Image Process., Cairo, Egypt, 2009, pp. 1709–1712.

[25] D. Cremers, M. Rousson, and R. Deriche, “A review of statistical approaches to level set segmentation: Integrating color, texture, motion and shape,” Int. J. Comput. Vis., vol. 72, no. 2, pp. 195–215, Apr. 2007.

[26] C. Harris and M. Stephens, “A combined corner and edge detector,” in Proc. 4th Alvey Vision Conf., Manchester, U.K., 1988, pp. 147–151.

[27] J. M. Geusebroek, R. van den Boomgaard, A. W. M. Smeulders, and H. Geerts, “Color invariance,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 12, pp. 1338–1350, Dec. 2001.

[28] H. D. Cheng, X. H. Jiang, Y. Sun, and J. Wang, “Color image segmentation: Advances and prospects,” Pattern Recognit., vol. 34, no. 12, pp. 2259–2281, Dec. 2001.

[29] C. Li, C. Xu, C. Gui, and M. D. Fox, “Level set evolution without re-initialization: A new variational formulation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2005, pp. 430–436.

[30] M. Izadi and R. Safabakhsh, “An improved time-adaptive self-organizing map for high-speed shape modeling,” Pattern Recognit., vol. 42, no. 7, pp. 1361–1370, Jul. 2009.

[31] P. Agouris, P. Doucette, and A. Stefanidis, “Automation and digital photogrammetric workstations,” in Manual of Photogrammetry, 15th ed. Bethesda, MD: Amer. Soc. Photogramm. Remote Sens., 2004, pp. 949–981.

[32] D. M. McKeown, T. Bulwinkle, S. Cochran, W. Harvey, C. McGlone, and J. A. Shufelt, “Performance evaluation for automatic feature extraction,” Int. Arch. Photogramm. Remote Sens., vol. 33, pt. B2, pp. 379–394, 2000.

[33] T. Chan and L. Vese, “Active contours without edges,” IEEE Trans. Image Process., vol. 10, no. 2, pp. 266–277, Feb. 2001.

[34] V. Caselles, R. Kimmel, and G. Sapiro, “Geodesic active contours,” Int. J. Comput. Vis., vol. 22, no. 1, pp. 61–79, Feb./Mar. 1997.

[35] P. Saeedi and H. Zwick, “Automatic building detection in aerial and satellite images,” in Proc. 10th Int. Conf. Control, Autom., Robot. Vis., 2008, pp. 623–629.

[36] M. Izadi and P. Saeedi, “Automatic building detection in aerial images using a hierarchical feature based image segmentation,” in Proc. 20th Int. Conf. Pattern Recognit., 2010, pp. 472–475.

[37] C. Li, C. Xu, C. Gui, and M. D. Fox, “Distance regularized level set evolution and its application to image segmentation,” IEEE Trans. Image Process., vol. 19, no. 12, pp. 3243–3254, Dec. 2010.

Melissa Cote (S’06–M’10) received the B.Eng., M.A.Sc., and Ph.D. degrees in computer engineering from the Ecole Polytechnique, Montreal, QC, Canada, in 2003, 2006, and 2010, respectively. The theme of her doctoral research was collaborative virtual environments and haptics for biomedical simulations.

Currently, she is working as a Postdoctoral Fellow at the Laboratory for Robotic Vision, School of Engineering Science, Simon Fraser University, Burnaby, Canada. Her research interests include computer vision, image processing, 3-D modeling and visualization, and virtual reality applications.

Parvaneh Saeedi (M’04) received the B.A.Sc. degree in electrical engineering from the Iran University of Science and Technology, Tehran, Iran, and the M.Sc. and Ph.D. degrees in electrical and computer engineering from the University of British Columbia, Vancouver, BC, Canada, in 1998 and 2004, respectively.

From 2004 to 2006, she was a Research Associate with MacDonald Dettwiler and Associates Ltd. Since 2007, she has been an Assistant Professor with the School of Engineering Science, Simon Fraser University, BC, Canada. Her research interests include pattern recognition, machine vision, image understanding, and artificial intelligence.