

UCTEA Chamber of Surveying and Cadastre Engineers

Journal of Geodesy and Geoinformation

TMMOB Harita ve Kadastro Mühendisleri Odası

Jeodezi ve Jeoinformasyon Dergisi

© 2013 HKMO

Vol.2 Iss.1 pp.9-17 May 2013 Journal No.107 Doi: 10.9733/jgg.061213.2 www.hkmodergi.org

An automatic region growing based approach to extract facade textures from single ground-level building images

Emre Sümer 1,*, Mustafa Türker 2

1 Department of Computer Engineering, Baskent University, 06810, Ankara, Turkey
2 Department of Geomatics Engineering, Hacettepe University, 06800, Ankara, Turkey

Volume: 2 Issue: 1 Page: 9 - 17 May 2013

Abstract

An approach is presented for the automatic retrieval of building facade textures from single ground-level building images. The texture information is extracted using Watershed segmentation, which is carried out repetitively until the most successful segment is obtained. To initiate segmentation, the marker pixels are seeded automatically for both the foreground (facade) and background (sky, pavement and neighboring buildings) regions. The proposed concept was tested on two different datasets. The first dataset contains fifteen rectilinear buildings selected from the residential area of the Batikent district of Ankara, Turkey. The second dataset includes five buildings selected from the eTRIMS database, which contains over one hundred buildings captured in major European cities. The assessment of the segmented facade images was carried out using a quantitative evaluation metric. For both datasets, an average quantitative accuracy of above 80% was achieved for facade texture extraction. The experimental results indicate that the proposed approach for the automatic retrieval of facade textures is quite promising and that considerable progress has been made towards the automated construction of virtual cities.

Keywords

Image processing, Facade texture extraction, Watershed segmentation, Accuracy assessment

Volume: 2 Issue: 1 Page: 9 - 17 May 2013

Abstract (translated from the Turkish)

Automatic extraction of facade textures from single ground-level building images using a region growing based approach

This study presents an automatic approach for obtaining building facade textures from single ground-level building images. The texture information is extracted using Watershed segmentation, which is repeated until the most successful segment is obtained. To initiate the segmentation, marker pixels are placed automatically in both the foreground (building facade) and the background (sky, pavement, neighboring buildings) of the image. The developed concept was tested on two different datasets. The first dataset contains 15 rectilinear buildings selected from a residential area in the Batıkent district of Ankara. The second dataset consists of 5 buildings selected from the eTRIMS image database, which contains over one hundred buildings photographed in major European cities. The segmented facade textures were assessed using a quantitative evaluation metric. For both datasets, facade texture extraction achieved an average quantitative accuracy of above 80%. The experimental results show that the proposed approach for detecting building facade textures is promising and that considerable progress is being made towards the automatic production of virtual cities.

Keywords (translated from the Turkish)

Image processing, Facade texture extraction, Watershed segmentation, Performance assessment

* Corresponding Author: Tel: +90 (312) 2466666 Fax: +90 (312) 2466660 E-mail: [email protected] (Sumer E.), [email protected] (Turker M.)


1. Introduction

3D city modeling is vital to a variety of location-related services, such as vehicle navigation, urban planning, virtual tourism, and intelligent transportation systems. Building facade texturing makes the visualization of solid 3D city models far more informative than a simple wireframe representation. Photorealistic textures provide rich information and add visual realism by showing windows, doors and balconies. However, the extraction of facade textures is a challenging problem because of the large variety in their appearance and structure. Building textures are commonly captured from ground-based systems or occasionally from airborne remote sensing systems. Some applications instead texture building facades by assigning hand-made typical textures from a catalogue or by selective manual modeling. Each technique has drawbacks, such as the need for additional image capturing, a reduced degree of automation, a lack of realism, or costly model creation. To provide more realistic views, the extraction of realistic texture information is essential.

In previous studies, considerable effort has been devoted to extracting facade textures from ground-based and airborne mapping systems. The large majority of these studies concern ground-based systems that employ laser scanning data, terrestrial image sequences, or single images. Studies that rely on airborne oblique imagery are encountered only occasionally in the literature because of its comparatively low resolution and poor depiction of facades. Lorenz and Döllner (2006) carried out facade texture mapping from overlapping high resolution monochrome aerial images. Their method was based on rectified images and was reported to be efficient in texturing the facades of large-scale city models. In a different study, Wang et al. (2008) performed automated texture acquisition from oblique aerial images using image line analysis. In a more recent study, facade textures were generated from airborne oblique imagery and, unlike in other studies, the problem of occlusion was also handled (Rau and Chu 2011).

In a substantial number of studies, the tendency was to detect facade information from terrestrial laser scanning data. Even though ground-level laser scanning systems have been widely used, their excessive cost is still the chief disadvantage. Böhm (2008) demonstrated a new approach for facade detail extraction from range data in the case of incomplete texture information due to occlusions. Similarly, Carlberg et al. (2008) presented a general framework for facade surface reconstruction and segmentation using partially ordered 3D point clouds. In a different study, knowledge-based reconstruction of building facade models from terrestrial laser scanning data was presented (Pu and Vosselman 2009). In a more recent study, Martinez et al. (2012) introduced a new approach for the automatic segmentation of terrestrial laser scanning point clouds. The main contribution was to extract a set of features of building facades in different layers based on planar features.

On the other hand, a considerable number of studies have employed image sequences to retrieve facade textures. The main benefit of using image sequences is that data acquisition is rather economical and flexible. However, the variations due to different perspectives, scales, contrasts, and color shadings need to be adjusted. In an exceptional study conducted by Hoegner and Stilla (2009), 3D building models were textured from images recorded by infrared (IR) cameras. Further, Poullis and You (2009) presented a study that reconstructs a photorealistic large-scale urban city model. Jang and Jung (2009) proposed a method for large-scale building modeling with multiple image sequences, in which suitable line segments were extracted for estimating the vanishing points and corresponding corners. Tian et al. (2010) demonstrated an automatic knowledge-based method for the reconstruction of building facades from terrestrial video sequences. In a different study conducted by Kang et al. (2010), the automatic mosaicing of building facade textures was achieved from a sequence of monocular close-range images. In a more recent study, a method for building facade labeling in stereo images was proposed (Delmerico et al. 2011). Its contribution was the application of plane fitting techniques on stereo imagery to the problem of building facade segmentation in the context of mobile platform localization.

Studies conducted on single ground-level images are also found in the literature. The advantage of using a single image is that it is cheaper to capture and requires less memory storage. In addition, it provides higher resolution data, which is important in the visualization of textured building models. Moreover, a single photo reduces the number of scenes to be processed for texture mapping. However, the occlusion problem is unavoidable, especially in high-density urban areas. Laycock et al. (2007) demonstrated a novel approach for the automatic generation and texturing of urban environments. The method was based on procedural texture generation, which permits a large variation in the appearance of building facades. Further, Mohan and Murali (2007) presented a novel method for the automatic modeling of planar surfaces with texture from single view perspective images, based on the edges and intensity representation. In a different study, the automatic derivation of 3D models of high visual quality was carried out from rectified single facade images of arbitrary resolutions (Müller et al. 2007). Further, David (2008) proposed a method to detect building facades from single perspective view images. The main contribution of this study was improved performance over existing approaches while placing no constraints on the facades. Ripperda (2008) presented a grammar-based extraction approach for the detection of building facade elements from ground-based images and range data. In a more recent study, Wan and Li (2011) proposed an automatic algorithm for facade segmentation based on detected lines and vanishing points.

In this study, we present an automatic approach for the extraction of facade textures from single ground-level building images. The automation level is increased by employing a repetitive Watershed image segmentation to extract building facades. The markers that initiate the watershed segmentation are located automatically. Further, a well-known assessment technique is adapted to evaluate the texture extraction accuracy. The proposed approach was fully implemented in the MATLAB programming environment.


2. The methodology

The proposed approach is summarized in Figure 1, which shows a facade texture extracted from a ground-level photo of a building. The automatic retrieval of the textures is performed using a procedure based on the watershed segmentation technique, which discriminates the foreground region (the building facade) from the background, such as sky, pavement and cars. This technique grows out of mathematical morphology and takes its inspiration from hydrology and the study of watersheds. The fundamentals and working principles of watershed segmentation can be found in the studies conducted by Beucher and Meyer (1992) and Vincent and Soille (1991).

Figure 1: The summary of texture extraction procedure

The initial markers, which are necessary to start segmentation, are located automatically for both foreground and background regions. For the foreground, the initial markers fall in the center of the image frame, while the background markers are located at the edges. Figure 2(a) illustrates the initial locations of the markers, where the red points correspond to foreground seeds and the green points to background seeds. Starting with the initial markers, the nearest connected-component regions are merged iteratively on the gradient magnitude of the image for both foreground and background. The gradient response of the grayscale image contains high intensities (watersheds) and low intensities (catchment basins); the latter tend to be homogeneous regions of the object to be segmented. In the present case, the connectivity of the flood fill operation is defined as fully connected (8-neighborhood). The segmentation is carried out repetitively such that, for each run, new foreground markers are arbitrarily positioned inside the segmented regions. The marker count (except for the initial step) is computed as one percent of the total number of pixels in the image. That is, if an image has 480 x 480 pixels, then the marker count will be 2304 (480 x 480 x 0.01) for the subsequent runs.
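The marker seeding described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB implementation: the function names, the size of the central seed block (`half`) and the spacing of the border seeds (`step`) are assumptions, since the text does not state how many initial markers are used or how they are spaced.

```python
import random


def marker_count(height, width):
    """Foreground markers per run after the first: one percent of all pixels."""
    return (height * width) // 100


def initial_markers(height, width, half=1, step=10):
    """Assumed layout: a small block of foreground seeds around the image
    centre and background seeds every `step` pixels along the image border."""
    cy, cx = height // 2, width // 2
    foreground = [(y, x)
                  for y in range(cy - half, cy + half + 1)
                  for x in range(cx - half, cx + half + 1)]
    background = set()
    for x in range(0, width, step):
        background.add((0, x))            # top edge
        background.add((height - 1, x))   # bottom edge
    for y in range(0, height, step):
        background.add((y, 0))            # left edge
        background.add((y, width - 1))    # right edge
    return foreground, sorted(background)


def reseed_foreground(mask, rng=random):
    """Pick new foreground markers at arbitrary positions inside the current
    segment; mask[y][x] is truthy for pixels labeled as facade."""
    height, width = len(mask), len(mask[0])
    inside = [(y, x) for y in range(height) for x in range(width) if mask[y][x]]
    k = min(marker_count(height, width), len(inside))
    return rng.sample(inside, k)
```

For a 480 x 480 image, `marker_count(480, 480)` gives the 2304 markers quoted in the text. Integer division is used so that the count does not depend on floating-point rounding.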

To terminate segmentation, a two-stage stopping criterion is used. The first stage is based on the steadiness of the growth ratio (gr), while the second is based on the maximum repetition count (mrc). To check the first condition, gr (labeled pixels / total image pixels) is calculated for each segment. Then, the difference between the last two ratios is computed. If the difference stays below a predefined threshold value (growth ratio threshold, grt), then the algorithm stops. This means that the growth of the foreground region has greatly diminished or ended completely. To reduce under-segmentation, we set grt to a relatively small constant of 0.001. Although the grt value is chosen optimally, there are some cases in which the selected value may be problematic:

Case-1: Let us examine a special case in which the grt value leads to an over-segmentation problem. For segment #1, let us compute a ratio of 91 / 10000 = 0.0091, where 91 is the number of pixels in the extracted segment and 10000 is the total number of pixels. For segment #2, assume that the region grows and the ratio becomes 125 / 10000 = 0.0125. The ratio difference is now 0.0125 - 0.0091 = 0.0034, and the execution proceeds since 0.0034 > 0.0010 (grt). For segment #3, assume that the region grows again and the ratio becomes 490 / 10000 = 0.049. The ratio difference is now 0.049 - 0.0125 = 0.0365, and the execution proceeds since 0.0365 > 0.0010 (grt). Finally, for segment #N (where N < mrc), assume that the region still grows in large steps and the ratio becomes 10000 / 10000 = 1. Since all the ratio differences are greater than grt, the whole image is labelled as foreground, and an over-segmentation problem occurs. In this case, a larger grt value should be selected to stop the execution in time.

Case-2: Let us examine another case, in which the grt value leads to an under-segmentation problem. As opposed to Case-1, the segment ratios can be very low. For segment #1, let us compute a ratio of 91 / 10000 = 0.0091, where 91 is the number of pixels in the extracted segment and 10000 is the total number of pixels. For segment #2, assume that the region grows slightly and the ratio becomes 99 / 10000 = 0.0099. The ratio difference is now 0.0099 - 0.0091 = 0.0008, which is less than 0.0010 (grt), and therefore the execution stops. This case may occur when the initial foreground markers hit a small local region. Here, a smaller grt value should be selected so that the execution continues.
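The two-stage stopping test and the two cases can be reproduced with a minimal Python sketch. The function name `should_stop` and its argument layout are assumptions made for illustration; only the thresholds grt = 0.001 and mrc = 50 and the worked ratios come from the text.

```python
def should_stop(growth_ratios, grt=0.001, mrc=50):
    """Two-stage stopping test.

    growth_ratios holds gr = labeled pixels / total pixels for every
    segmentation run completed so far, in order.
    """
    # Stage 2: never exceed the maximum repetition count.
    if len(growth_ratios) >= mrc:
        return True
    # A difference needs at least two runs.
    if len(growth_ratios) < 2:
        return False
    # Stage 1: stop when growth between the last two runs falls below grt.
    return abs(growth_ratios[-1] - growth_ratios[-2]) < grt


total = 10000
case_1 = [91 / total, 125 / total, 490 / total]   # keeps growing in large steps
case_2 = [91 / total, 99 / total]                 # grows by only 8 pixels

print(should_stop(case_1[:2]), should_stop(case_1))  # False False
print(should_stop(case_2))                           # True
```

In Case-1 the test never fires before the region covers the image, and in Case-2 it fires after the second run, which matches the over- and under-segmentation behaviour described above.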

For the second stopping criterion (mrc), we use the value of 50. This value was determined experimentally: an upper limit of 50 best reduced the premature convergence problem. Therefore, the segmentation terminates at the 50th repetition at most. Although this case generally ends up with over-segmentation, under-segmentation appears to be a more challenging problem. The proposed repetitive segmentation procedure is illustrated in Figure 2. In Figure 2(a), the initial positions of the foreground and background markers are illustrated in red and green, respectively. The segmented region obtained after the first iteration is shown in Figure 2(b), in which the white areas represent the building facade and the black areas represent the background. For the second, third and fourth iterations, the foreground and background markers are illustrated in red and green in Figures 2(c), 2(e) and 2(g), while the resulting segmented images are shown in Figures 2(d), 2(f) and 2(h). The pseudo-code for the implementation of the approach (extractFacadeImage) is presented in Appendix 1.


Figure 2: The proposed repetitive Watershed segmentation: (a) The initial markers, (b) The segmented region after the first iteration, (c), (e), and (g) The markers in the second, third and fourth iterations, respectively, (d), (f), and (h) The segmented regions after the second, third, and fourth iterations, respectively.

Figure 3: (a) The segments produced through the repetitive segmentation approach and (b) the final segment after performing the intersection operation.

As mentioned earlier, the proposed repetitive watershed segmentation procedure has a strong potential for over-segmentation. To tackle this problem, the segmentation is rerun and the candidate facade segments are stored. In the present case, the number of generated candidate segments is limited to 10. Next, the detected candidate segments (Figure 3(a)) are overlaid and their common area is found. The intersection is carried out by means of a logical AND operation, for which the facade regions (white) correspond to logical 1 and the background regions (black) to 0. The intersection area is then accepted as the final facade segment, in which over-segmentation is reduced to a reasonable level (Figure 3(b)). Furthermore, the borders of the 10 extracted segments overlaid on the original image are illustrated in Figure 4.
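The intersection step amounts to a pixel-wise logical AND over the stored candidate masks. The following Python sketch illustrates it on nested lists; the function name `intersect_segments` and the list-of-lists mask representation are assumptions chosen to keep the example free of dependencies.

```python
def intersect_segments(segments):
    """Pixel-wise logical AND over candidate masks.

    Each mask is a list of rows with 1 for facade and 0 for background.
    A pixel stays 1 only if every candidate labels it as facade.
    """
    if not segments:
        raise ValueError("at least one candidate segment is required")
    height, width = len(segments[0]), len(segments[0][0])
    return [[int(all(seg[y][x] for seg in segments)) for x in range(width)]
            for y in range(height)]


candidates = [
    [[0, 1, 1],
     [0, 1, 1]],
    [[1, 1, 1],
     [0, 1, 0]],
    [[0, 1, 1],
     [1, 1, 1]],
]
print(intersect_segments(candidates))  # [[0, 1, 1], [0, 1, 0]]
```

Because a single candidate that leaks into the background cannot add pixels to the result, the intersection can only shrink the facade region, which is why it counteracts over-segmentation.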


Figure 4: The borders of the (a) extracted 10 segments and (b) the intersected segment overlaid on the original image

In this study, a limit of ten candidate segments was accepted as the optimum for all experiments. The experiments showed that the segments generated after the tenth are quite similar to the first ten. Therefore, increasing this limit does not affect the results significantly but adds computational cost. If the limit is decreased, the problem of over-segmentation may arise. For example, if the first three segments were used instead of ten, then the final segment would have a lower accuracy (Figure 5).

Figure 5: The final segments after the intersection of (a) first 3 segments and (b) all (10) segments

3. Accuracy assessment

The results obtained from the facade texture extraction approach are assessed by a quantitative evaluation metric, which is based on labeling the pixels by comparing the output of the developed method with the reference data (Shufelt 1999; Shan and Lee 2005). In the present study, the reference dataset was prepared by manually cropping the facade images from the original ground-level photos. For each pixel, three possible cases are considered: True Positive (TP), False Positive (FP), and False Negative (FN). In the case of TP, both the analysis results and the reference data label the pixel as belonging to the foreground (building facade). In the case of FP, the analysis results label the pixel as belonging to the foreground, while the reference data labels it as background. The FN case is the exact opposite of the FP case. These cases are illustrated in Figure 6, where AIT is the area common to the extracted building facade and its ground truth (TP), AT is the missed area of the building facade (FN) and AI is the area wrongly detected as building facade (FP).
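As a minimal sketch of this per-pixel labeling, the three counts can be obtained by walking over an extracted mask and a reference mask of the same size. The function name `count_cases` and the nested-list mask format are assumptions for illustration.

```python
def count_cases(extracted, reference):
    """Count TP, FP and FN pixels.

    Both arguments are masks of equal size with 1 for facade (foreground)
    and 0 for background. Pixels that are background in both masks are
    not counted, since the metrics in this section do not use them.
    """
    tp = fp = fn = 0
    for ext_row, ref_row in zip(extracted, reference):
        for ext, ref in zip(ext_row, ref_row):
            if ext and ref:
                tp += 1      # facade in both result and reference
            elif ext:
                fp += 1      # labeled facade, reference says background
            elif ref:
                fn += 1      # missed facade pixel
    return tp, fp, fn


extracted = [[1, 1, 0],
             [1, 0, 0]]
reference = [[1, 0, 0],
             [1, 1, 0]]
print(count_cases(extracted, reference))  # (2, 1, 1)
```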

To evaluate the accuracies of the extracted facade images, the counts TP, FP, and FN were calculated for each facade image, and the metrics facade detection percentage (FDP), branching factor (BF) and quality percentage (QP) were computed using equations (1) – (3) (Shufelt 1999).

Figure 6: The three possible cases in the accuracy assessment of the extracted facade images


$$\mathrm{FDP} = \frac{100 \times TP}{TP + FN} \tag{1}$$

$$\mathrm{BF} = \frac{FP}{TP} \tag{2}$$

$$\mathrm{QP} = \frac{100 \times TP}{TP + FP + FN} \tag{3}$$

The metric FDP can be treated as a measure of object detection performance: it gives the percentage of reference facade pixels that are labeled as building facade pixels. The metric BF is the measure of over-segmentation. If there is no over-segmentation, the branching factor is zero. On the other hand, a branching factor of one means that as many background pixels were incorrectly labeled as facade as facade pixels were correctly labeled. Further, the metric QP measures the absolute quality of the extracted facade images. A QP of 100% indicates a perfect segmentation of the foreground with respect to the reference data; in other words, the numbers of false negatives and false positives are both zero.
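The three metrics follow directly from the pixel counts. The sketch below is a plain Python rendering of the definitions FDP = 100 TP / (TP + FN), BF = FP / TP and QP = 100 TP / (TP + FP + FN); the function name `facade_metrics` is an assumption, and the sample counts are those reported for building 2 of the eTrims dataset.

```python
def facade_metrics(tp, fp, fn):
    """Return (FDP, BF, QP) from the TP, FP and FN pixel counts."""
    if tp <= 0:
        raise ValueError("TP must be positive for BF to be defined")
    fdp = 100.0 * tp / (tp + fn)        # facade detection percentage
    bf = fp / tp                        # branching factor
    qp = 100.0 * tp / (tp + fp + fn)    # quality percentage
    return fdp, bf, qp


# Counts reported for building 2 of the eTrims dataset.
fdp, bf, qp = facade_metrics(tp=28669, fp=320, fn=8595)
print(round(fdp, 1), round(bf, 3), round(qp, 1))  # 76.9 0.011 76.3
```

The printed values agree with the corresponding row of the eTrims results table, which is a quick consistency check on the reconstructed definitions.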

4. The experimental data

The experiments were carried out on two different datasets. The first dataset contains five images selected randomly from the eTrims database, which was created by Korč and Förstner (2009) along with the members of the eTrims consortium (Figure 7). The database contains annotated RGB images of the facades of over a hundred buildings captured in several major European cities. The second dataset includes 15 building facade images that were collected in the Batikent district of Ankara, Turkey, using a Samsung WB500 digital camera (Figure 8). The photos were taken from positions with suitable shooting conditions, such as a sufficient angle of view and a reasonable amount of occlusion.

Figure 7: The facade images selected from the eTrims image database

Figure 8: The facade images collected in the Batikent district of Ankara, Turkey.


5. The results and discussions

For both datasets, the quantitative evaluation results of the facade image extraction are given in Tables 1 and 2. For the eTrims dataset, the computed FDP values ranged from 66.8% to 96.9%. Similarly, the BF and QP values were in the ranges 0.002–0.011 and 66.7%–96.6%, respectively. The average FDP, BF and QP values were found to be 81.7%, 0.005, and 81.4%, respectively (Table 1). The QP values were close to the FDP values since the branching factors were relatively low for this dataset. This means that the over-segmentation problem is almost never encountered in the extraction of building facades for this dataset.

For the Batikent dataset, the FDP values ranged from 53.6% to 100%, whereas the BF and QP values were in the ranges 0.004–0.245 and 53.5%–97.8%, respectively. The average facade detection percentage was found to be 89.5%, and the average BF and QP values were computed to be 0.088 and 82.3%, respectively (Table 2). Compared with the eTrims dataset, a significant gap is evident between the average FDP and QP values in the Batikent dataset. We believe that this is due to the higher BF values compared with the relatively low values of the eTrims dataset. This indicates an over-segmentation problem in the Batikent dataset, where background pixels were labeled as foreground pixels. It is also observed that the computed facade detection percentages of 53.6% (building #8) and 67.7% (building #12) were remarkably low. These low accuracy values are attributed to the high number of FN pixels. If these two buildings are excluded from the dataset, the average facade detection percentage reaches 93.9%.

Table 1: The results of facade image extraction for the eTrims dataset

| Building No | TP | FP | FN | FDP (%) | BF | QP (%) |
|---|---|---|---|---|---|---|
| 1 | 28237 | 127 | 9847 | 74.1 | 0.005 | 73.9 |
| 2 | 28669 | 320 | 8595 | 76.9 | 0.011 | 76.3 |
| 3 | 29989 | 54 | 14920 | 66.8 | 0.002 | 66.7 |
| 4 | 41336 | 153 | 2808 | 93.6 | 0.004 | 93.3 |
| 5 | 40716 | 117 | 1301 | 96.9 | 0.003 | 96.6 |

Table 2: The results of facade image extraction for the Batikent dataset

| Building No | TP | FP | FN | FDP (%) | BF | QP (%) |
|---|---|---|---|---|---|---|
| 1 | 65769 | 9067 | 2290 | 96.6 | 0.13 | 85.3 |
| 2 | 68801 | 13886 | 687 | 99.0 | 0.20 | 82.5 |
| 3 | 68749 | 1296 | 240 | 99.7 | 0.01 | 97.8 |
| 4 | 70822 | 4751 | 3 | 100 | 0.06 | 93.7 |
| 5 | 80136 | 19617 | 465 | 99.4 | 0.24 | 80.0 |
| 6 | 104676 | 2898 | 26291 | 79.9 | 0.02 | 78.2 |
| 7 | 67038 | 10874 | 14 | 100 | 0.16 | 86.0 |
| 8 | 78006 | 504 | 67404 | 53.6 | 0.00 | 53.5 |
| 9 | 68683 | 3647 | 6401 | 91.5 | 0.05 | 87.2 |
| 10 | 45743 | 710 | 6465 | 87.6 | 0.01 | 86.4 |
| 11 | 81688 | 3219 | 24358 | 77.0 | 0.03 | 74.8 |
| 12 | 110865 | 476 | 52855 | 67.7 | 0.00 | 67.5 |
| 13 | 65731 | 3683 | 6383 | 91.1 | 0.05 | 86.7 |
| 14 | 86805 | 19801 | 180 | 99.8 | 0.22 | 81.3 |
| 15 | 84725 | 5428 | 207 | 99.8 | 0.06 | 93.8 |
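The averages quoted for the Batikent dataset can be checked against the FDP and QP columns of Table 2. The short sketch below uses only the rounded, tabulated percentages, so the mean obtained after leaving out buildings #8 and #12 agrees with the quoted 93.9% only to within rounding. The variable names are illustrative.

```python
from statistics import mean

# FDP (%) and QP (%) columns of Table 2, buildings 1 to 15.
fdp = [96.6, 99.0, 99.7, 100.0, 99.4, 79.9, 100.0, 53.6,
       91.5, 87.6, 77.0, 67.7, 91.1, 99.8, 99.8]
qp = [85.3, 82.5, 97.8, 93.7, 80.0, 78.2, 86.0, 53.5,
      87.2, 86.4, 74.8, 67.5, 86.7, 81.3, 93.8]

excluded = {8, 12}  # the two buildings with remarkably low detection percentages
fdp_kept = [v for i, v in enumerate(fdp, start=1) if i not in excluded]

print(f"mean FDP, all buildings:     {mean(fdp):.1f}%")
print(f"mean QP, all buildings:      {mean(qp):.1f}%")
print(f"mean FDP without #8 and #12: {mean(fdp_kept):.2f}%")
```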

The results indicate that the proposed approach is quite successful in extracting building facade textures. However, in several cases the developed facade image extraction procedure does not work properly. One problem is under-segmentation during facade texture extraction. In this case, the segmentation algorithm fails to partition the facade image successfully and the accuracy of the extracted facade image decreases, as was the case for buildings #12 and #8. Since the structure of building #12 is rather thin, some of the initial foreground markers may not fall within the building facade. This leads to under-segmentation, because the foreground region does not grow as expected. As a result, the number of false negative pixels increases and the facade detection percentage decreases. Building #8 suffers from the same problem. Here, the chimney that extends along the middle of the building facade could be segmented alone, because a considerable number of the initial foreground markers fall on the chimney. As with building #12, the foreground region does not grow as expected and the number of false negative pixels increases.

The other shortcoming of the developed technique is the partial loss of texture information when the base of a building lies below the level of the pavement. In this case, the lower part of the facade cannot be recorded by the camera. The texture loss may amount to as much as a whole story of a building. This case is illustrated in Figure 9, in which a significant part of the building, below the red line, cannot be extracted from the facade image due to the level difference between the building baseline and the pavement.

Figure 9: The level difference between the building baseline and the pavement

6. Conclusions

In this study, an approach is presented for the automatic extraction of facade textures at a low acquisition cost by employing single ground-level building facade images. The experimental tests of the approach provided quite satisfactory results. The average facade detection and quality percentage values were computed to be 89.5% and 82.3% for the Batikent dataset, and 81.7% and 81.4% for the eTrims dataset. The proposed repetitive watershed segmentation was found to be quite effective for the automatic extraction of a large amount of texture information, as long as over-segmentation is kept to a minimum. Whereas conventional watershed segmentation relies on manual initialization of the seeds, we proposed the automatic initialization of the markers based on dynamic templates. The level of automation of the watershed segmentation was thereby increased.

Although the proposed approach was successful in most cases in retrieving photorealistic textures, shortcomings were evident in certain cases. For instance, if the photographed building facade is blocked by a large occlusion or cast shadow, under-segmentation occurs. Buildings with protrusions are also quite challenging, since the facade extraction algorithm segments those regions as foreground texture. In another case, the facade textures of closely located identical buildings are extracted as a single facade due to their spectral similarity.

Finally, based on the satisfactory results achieved in this study, we believe that the developed approach for the automatic retrieval of photorealistic textures from single ground-level building images will make an important contribution to the automatic construction of virtual cities.

Appendix 1

0 Begin Procedure: extractFacadeImage
1   INITIATE markers for the first run
2   SET number of candidate facade segment as 0
3   IF number of candidate facade segment == 10 THEN
4     GO TO Step-14
5   ELSE
6     EXECUTE watershed segmentation
7     IF (difference between any consecutive growth ratio < growth ratio threshold) OR (repetition count >= maximum repetition count) THEN
8       STORE candidate facade segment
9       INCREMENT number of candidate facade segment by one
10      GENERATE new markers for the next run
11    ENDIF
12    GO TO Step-3
13  ENDIF
14  INTERSECT candidate facade segments
15 End Procedure: extractFacadeImage
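The control flow of the procedure can be sketched in Python as shown below. This is an illustrative sketch under stated assumptions and not the authors' implementation: the watershed step is replaced by a caller-supplied `grow` function, the marker generation by a caller-supplied `new_markers` function, and the growth ratio is assumed to be the segment size after a pass divided by its size before the pass. All names and default parameter values are hypothetical.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

Pixel = Tuple[int, int]
Segment = FrozenSet[Pixel]


def extract_facade_image(
    first_markers: Set[Pixel],
    grow: Callable[[Segment], Segment],
    new_markers: Callable[[int], Set[Pixel]],
    n_candidates: int = 10,
    ratio_threshold: float = 0.01,
    max_repetitions: int = 20,
) -> Segment:
    """Collect candidate segments and intersect them (Steps 1 to 14)."""
    if n_candidates < 1:
        raise ValueError("at least one candidate segment is required")
    candidates: List[Segment] = []
    markers = first_markers                        # Step 1
    while len(candidates) < n_candidates:          # Steps 2-3
        segment: Segment = frozenset(markers)
        previous_ratio: Optional[float] = None
        for _ in range(max_repetitions):           # repetition cap of Step 7
            grown = grow(segment)                  # Step 6 (stand-in for watershed)
            ratio = len(grown) / max(len(segment), 1)
            segment = grown
            if (previous_ratio is not None
                    and abs(ratio - previous_ratio) < ratio_threshold):
                break                              # growth ratio has levelled off
            previous_ratio = ratio
        candidates.append(segment)                 # Steps 8-9
        markers = new_markers(len(candidates))     # Step 10
    return frozenset.intersection(*candidates)     # Step 14


# Toy scene: the "facade" is a 4 x 4 block of pixels and each growth pass
# adds the 4-neighbours of the current segment that lie inside that block.
FACADE = frozenset((x, y) for x in range(2, 6) for y in range(2, 6))


def toy_grow(segment: Segment) -> Segment:
    neighbours = {(x + dx, y + dy) for x, y in segment
                  for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))}
    return frozenset(segment | (neighbours & FACADE))


def toy_markers(run: int) -> Set[Pixel]:
    return {(2 + run % 4, 2 + run % 4)}            # one seed on the block diagonal


facade = extract_facade_image({(3, 3)}, toy_grow, toy_markers)
print(len(facade))
```

Intersecting the candidates keeps only the pixels that every run agrees on, so a run that leaks into the background cannot enlarge the final facade segment on its own.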

References

Beucher S., Meyer F., (1992), The Morphological Approach of Segmentation: The Watershed Transformation, In: Mathematical Morphology in Image Processing, (Dougherty E., Ed.), Marcel Dekker Inc., pp.433-481.

Böhm J., (2008), Facade Detail from Incomplete Range Data, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXXVII, Part B5, Beijing, China.

Carlberg M., Andrews J., Gao P., Zakhor A., (2008), Fast Surface Reconstruction and Segmentation with Ground-Based and Airborne LIDAR Range Data, Proceedings of 4th International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT’08), Atlanta, Georgia, USA.

David P., (2008), Detection of Building Facades in Urban Environments, In: Proceedings of SPIE, 6978, pp.139–148, doi: 10.1117/12.779280.

Delmerico J. A., David P., Corso J. J., (2011), Building Facade Detection, Segmentation, and Parameter Estimation for Mobile Robot Localization and Guidance, In: IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, USA, pp.1632–1639.

Hoegner L., Stilla U., (2009), Thermal Leakage Detection on Building Facades Using Infrared Textures Generated by Mobile Mapping, Proceedings of 2009 Urban Remote Sensing Joint Event, Shanghai, China.

Jang K.H., Jung S.K., (2009), Practical Modeling Technique for Large-scale 3-D Building Models from Ground Images, Pattern Recognition Letters, 30, 861–869, doi: 10.1016/j.patrec.2009.04.004.

Kang Z., Zhang L., Zlatanova S., Li J., (2010), An Automatic Mosaicking Method for Building Facade Texture Mapping Using a Monocular Close-range Image Sequences, ISPRS Journal of Photogrammetry and Remote Sensing, 65(3), 282–293, doi: 10.1016/j.isprsjprs.2009.11.003.

Korč F., Förstner W., (2009), eTRIMS Image Database for Interpreting Images of Man-Made Scenes, Technical report TR-IGG-P-2009-01, University of Bonn, Department of Photogrammetry, Germany.

Laycock R.G., Ryder G.D.G., Day A.M., (2007), Automatic Generation, Texturing and Population of a Reflective Real-Time Urban Environment, Computers and Graphics, 31, 625–635, doi: 10.1016/j.cag.2007.04.001.

Lorenz H., Döllner J., (2006), Towards Automating the Generation of Facade Textures of Virtual City Models, ISPRS Commission II, WG II/5 Workshop, Vienna, Austria.

Martinez J., Soria-Medina A., Arias P., Buffara-Antunes A. F., (2012), Automatic Processing of Terrestrial Laser Scanning Data of Building Facades, Automation in Construction, 22, 298–305, doi: 10.1016/j.autcon.2011.09.005.

Mohan S., Murali S., (2007), Automated 3D Modeling and Rendering from Single View Images, International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2007), Tamil Nadu, India, doi: 10.1109/ICCIMA.2007.56.

Müller P., Zeng G., Wonka P., Van Gool L., (2007), Image-based Procedural Modeling of Facades, Proceedings of SIGGRAPH 2007, San Diego, California, USA.

Poullis C., You S., (2009), Photorealistic Large-Scale Urban City Model Reconstruction, IEEE Transactions on Visualization and Computer Graphics, 15(4), 654–669, doi: 10.1109/TVCG.2008.189.

Pu S., Vosselman G., (2009), Knowledge-based Reconstruction of Building Models from Terrestrial Laser Scanning Data, ISPRS Journal of Photogrammetry and Remote Sensing, 64(6), 575–584, doi: 10.1016/j.isprsjprs.2009.04.001.

Rau J. Y., Chu C. Y., (2011), Vector-based Occlusion Detection for Automatic Facade Texture Mapping, In: IEEE International Workshop on Multi-Platform/Multi-Sensor Remote Sensing and Mapping, Xiamen, China, pp.1-6, doi: 10.1109/M2RSM.2011.5697380.

Ripperda N., (2008), Determination of Facade Attributes for Facade Reconstruction, Proceedings of International Society for Photogrammetry and Remote Sensing (ISPRS’08) Congress, Beijing, China.

Shan J., Lee S.D., (2005), Quality of Building Extraction from IKONOS Imagery, Journal of Surveying Engineering, 131(1), 27-32, doi: 10.1061/(ASCE)0733-9453(2005)131:1(27).


Shufelt J.A., (1999), Performance Evaluation and Analysis of Monocular Building Extraction from Aerial Imagery, IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(4), 311-326, doi: 10.1109/34.761262.

Tian Y., Gerke M., Vosselman G., Zhu Q., (2010), Knowledge-based Building Reconstruction from Terrestrial Video Sequences, ISPRS Journal of Photogrammetry and Remote Sensing, 65(4), 395–408, doi: 10.1016/j.isprsjprs.2010.05.001.

Vincent L., Soille P., (1991), Watersheds in Digital Spaces: An Efficient Algorithm Based on Immersion Simulations, IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6), 583–598, doi: 10.1109/34.87344.

Wan G., Li S., (2011), Automatic Facades Segmentation using Detected Lines and Vanishing Points, 4th International Congress on Image and Signal Processing (CISP). Shanghai, China, pp. 1214-1217, doi: 10.1109/CISP.2011.6100448.

Wang M., Bai H., Hu F., (2008), Automatic Texture Acquisition for 3D Model Using Oblique Aerial Images. First International Conference on Intelligent Networks and Intelligent Systems (ICINIS 2008). Wuhan, China, doi: 10.1109/ICINIS.2008.122.