An Integrated Multistage Framework for Automatic Road Extraction From High Resolution Satellite Imagery


    RESEARCH ARTICLE

    An Integrated Multistage Framework for Automatic Road

Extraction from High Resolution Satellite Imagery

T. T. Mirnalinee & Sukhendu Das & Koshy Varghese

Received: 6 October 2009 / Accepted: 6 April 2010 / Published online: 12 March 2011
© Indian Society of Remote Sensing 2011
J Indian Soc Remote Sens (March 2011) 39(1):1-25, DOI 10.1007/s12524-011-0063-9

Abstract Automated procedures to rapidly identify road networks from high-resolution satellite imagery are necessary for modern applications in GIS. In this paper, we propose an approach for automatic road extraction by integrating a set of appropriate modules in a unified framework to solve this complex problem. The two main properties of roads used are: (1) spectral contrast with respect to the background and (2) a locally linear path. A Support Vector Machine is used to discriminate between road and non-road segments. We propose a Dominant Singular Measure (DSM) for the task of detecting locally linear road boundaries. This pair of information about road segments, obtained using a Probabilistic SVM (PSVM) and DSM, is integrated using a modified Constraint Satisfaction Neural Network. Results of this integration alone are not satisfactory due to occlusion of roads, variation of road material, and curvilinear patterns. Suitable post-processing modules (segment linking and region part segmentation) have been designed to address these issues. The proposed non-model-based approach is verified with extensive experimentation, and its performance is compared with two state-of-the-art techniques and a GIS-based tool, using multi-spectral satellite images. The proposed methodology is robust and shows superior performance (completeness and correctness are used as measures) in automating the process of road network extraction.

Keywords Dominant singular measure · PSVM · CSNN-CII · Road edges · Road segments · Fusion · Segment linking · Region part segmentation

    Introduction

    Road networks are essential modes of transportation,

    and provide a backbone for human civilization.

    Cartographic object extraction from digital imagery

is a fundamental operation for GIS update. However, the complete automation of the extraction process is still an unsolved problem. Road feature extraction from a raster image is a non-trivial and image-specific

    process. Hence, it is difficult to have a general method

    to extract roads from any given raster image. Road

    layers on raster maps typically have two distinguish-

    able geometric properties from other layers: (1) Road

    lines are straight within a small distance (i.e., several

    meters in a street block); (2) Unlike building layers,

    which could have many small distinct connected


T. T. Mirnalinee · S. Das (*)
Visualization and Perception Lab, Dept. of CSE, Indian Institute of Technology, Madras,

    Chennai 600 036, India

    e-mail: [email protected]

    T. T. Mirnalinee

    e-mail: [email protected]

    K. Varghese

    Dept. of Civil Engg, Indian Institute of Technology,

    Madras,

    Chennai 600 036, India

    e-mail: [email protected]


    components, roads are connected to each other to

    form a road network. Road layers usually have few

    connected objects or even only one huge connected

    object forming a whole road layer. Many works on

    this topic have been presented (Laptev et al. 2000; Shi

    and Zhu 2002; Hinz and Baumgartner 2003; Hu and

Tao 2007; Mokhtarzade and Zoej 2007; Mena 2003; Tupin et al. 2002). However, the manual intervention

    of the operator in extracting, defining and validating

    cartographic objects for GIS update is still needed.

    Applications of road extraction process are found in

    updating GIS records, urban planning, traffic control,

    car navigation, map generation etc.

    Most of the works published in literature on road

    detection from satellite images are classified in two

    categories: (1) Semi Automatic (Gruen and Li 1995;

    Udomhunsakul 2004; Bucha et al. 2006; Zhang et al.

2008; Hu et al. 2004; Xiao et al. 2005) processes that require help from a human operator. In contrast to the

    automatic methods they demand a number of seed

    points which are usually chosen by the operator in an

    interactive fashion. Given such seed points the semi-

    automatic algorithm connects them by a path which is

    most likely a road. On the other hand, (2) Automatic

    (Laptev et al. 2000; Shi and Zhu 2002; Hinz and

    Baumgartner 2003; Mokhtarzade and Zoej 2007;

    Baumgartner et al. 2002; Zhu et al. 2005) road

    extraction methods require no initial (prior) informa-

tion about the presence and location of roads. In the following, we will discuss the automatic road extraction

    process.

Automated extraction of roads from high-

    resolution imagery is a difficult task because of

    the complexity in spatial and spectral variability of

    the road network. Roads exhibit a variety of

    spectral responses due to differences in age and/or

    material and vary widely in physical dimensions. In

addition, road networks in dense urban areas typically have different geometric characteristics than those in suburban and rural areas. Techniques to extract road networks using binarization and line

    segment matching of high-resolution IKONOS

    urban imagery were presented in (Shi and Zhu

    2002; Zhu et al. 2005). A line segment match

    method was used to detect long linear groups of

    pixels for classification as roads. These road pixels

    are then simplified into the road centerlines with the

    use of morphological operators. Mayer et al. (1997)

presented a complex road network extraction ap-

    proach that attempts to accurately map both the road

    network and the road edges through the use of

    snakes (Kass et al. 1987). In another approach, Hinz

    and Baumgartner (2003) utilized multiple very high-

    resolution aerial images and detailed scene models,

    to perform road extraction.

One can find a survey of road extraction methods from satellite images by Mena (2003). Tupin et al.

    (2002) presented the road extraction algorithm using

    feature extraction (line detector) and network recon-

    struction (graph labeling), which uses multiple views

    of the same scene. According to McKeown (1996),

    roads extracted from one raster image need not be

    extracted in the same way from another raster image,

    as there can be a drastic change in the value of

    important parameters based on nature, instrument

    variation, and photographic orientation. Yang and

Wang (2007) proposed a road extraction algorithm which deals with detecting two types of road

    primitives, namely blob-like primitive and line-like

    primitive. These primitives are defined, measured,

    extracted and linked using different methods for

    dissimilar road scenes.

Tuncer (2007) proposed a method which comprises preprocessing the image via a series of wavelet

    based filter banks and reducing the data into a single

    image which is of the same size as the original

    satellite image. Then a fuzzy inference algorithm is

utilized to perform road detection. Each wavelet function resolves features at a different resolution

    level associated with the frequency response of the

    corresponding FIR filter. Resulting two images are

fused together using the Karhunen-Loève transform

    (KLT) which is based on principal component

    analysis (PCA). This process underlines the promi-

    nent features of the original image as well as

    denoising it, since the prominent features appear in

    both of the wavelet transformed images while noise

    does not strongly correlate between scales. Next a

fuzzy logic inference algorithm which is based on statistical information and geometry is used to extract

    the road pixels. The approach is only suitable for the

    Ikonos data on rural areas where roads are mostly

    homogeneous and are not disturbed by shadows or

    occlusions. The central idea is to take into account the

    spectral information by means of a (fuzzy) classifica-

    tion approach.

    A back-propagation neural network (BPNN) with

    one hidden layer has been proposed for road


    extraction in Mokhtarzade and Zoej (2007). The

output layer consists of one neurode that expresses the network's response as a number between 0 and 1, corresponding to background and road pixels respectively. Back-propagation neural networks with different sizes of hidden layers were trained with different numbers of iterations before converging. The training and recalling stages were time consuming in this approach.

    Doucette et al. (2001) introduced a self-organizing

    road map algorithm to extract roads from high-

    resolution Multi-Spectral imagery. The self organizing

    road map, a specialized version of the self organizing

    neural network model, performs spatial clustering to

    identify and group together elongated regions.

    Most of the methods discussed so far use a limited

    set of image samples of a particular area to obtain

decent results. Some of them do not exhibit performance analysis and a comparative study with existing state-of-the-art techniques. The techniques adopted are often ad hoc and tuned for a particular

    set of (few) samples acquired to show results. Our

    study of road extraction is solely based on the road

    characteristics (geometrical and spectral) stored in an

    implicit manner in a raster image.

It is often difficult to obtain satisfactory results by

    using only one of these methods to detect road

    structures in complex pictures. However, it is possible

    to improve the results by using the complementary

nature of edge-based and region-based information. A large amount of work on the fusion of edge and region information has been reported in the literature

    (Haddon and Boyce 1990; Chu and Aggarwal 1993;

    Moigne and Tilton 1995; Pavlidis and Liow 1990) for

    image segmentation. Pavlidis and Liow (1990) de-

    scribed a method to combine segments obtained using

    a region growing (over-segmented) approach, where

    the edges between regions are eliminated or modified

    based on contrast, gradient and smoothness of

    the boundary. Haddon and Boyce (1990) generate

regions by partitioning the image co-occurrence matrix and then refining them by relaxation using

    the edge information. Chu and Aggarwal (1993)

    present an optimization method to integrate segmen-

    tation and edge maps obtained from several channels,

    including visible, infrared, etc., where user specified

    weights and arbitrary mixing of region and edge

    maps are allowed. Most of the methods proposed for

    combining region and edge information are highly

sensitive to the correctness of the edge map.

Lin et al. (1992) proposed a constraint satisfaction

    neural network for image segmentation. They posed

    the image segmentation problem as a constraint

    satisfaction problem (CSP) by interpreting the process

    as one of assigning labels to pixels subject to certain

spatial constraints. Kurugollu and Sankur (1999)

proposed a segmentation algorithm for color images, which implements the MAP estimation of the label

    field using a CSNN. In their work, the initial class

    probabilities are obtained via a fuzzy C-means

algorithm, in contrast to the Lin et al. (1992) method, where an ad hoc fuzzification of an initial map takes

    place. They have tried to combine advantages of

    GMRF formulation (Raghu and Yegnanarayana 1996)

    with those of the CSNN based (Lin et al. 1992)

    relaxation. The results are shown on synthetic images.

    In a recent work proposed by Lalit et al. (2008), a

CSNN-CII (Constraint Satisfaction Neural Network with Complementary Information Integration) has been

    used for texture segmentation. Results are shown on

    simulated and real world images.

    The focus of this paper is on the design and

    development of a technique, which enables the user to

    extract road segments from an input image without

    much of user interaction. The motivation of our work

comes from the fact that the complementary informa-

    tion of regions (road pixels in our case) and edges

(road boundaries) has not been exploited together to

obtain a decent road map from satellite images. Either of these techniques, when applied alone, produces errors which do not occur together (simultaneously),

    in general. This is due to the fact that the criteria for

    classification of pixels as road regions look for

    continuity and local smoothness, whereas methods

    to detect road boundaries look for discontinuities in

    raster images. Road regions are separated from non-

    road regions in our proposed framework using a

    PSVM (Probabilistic Support Vector Machine) classi-

    fier. In our previous work on DSM (Dominant

Singular Measure) (Mirnalinee et al. 2009) based road extractor, the performance was low as the local

    contrast between the regions was only considered.

    Therefore, we decided to merge the information from

    both DSM and PSVM using a CSNN-CII (Constraint

Satisfaction Neural Network with Complementary

    Information Integration) (Lalit et al. 2008) to produce

    better results. A modified constraint satisfaction

    neural network (CSNN) has been designed for this

    task, which uses a novel dynamic window to merge


the complementary information of edges and regions.

    The output of CSNN-CII needs to be processed

    further to remove some undesired artifacts and errors.

A segment linking algorithm is used to bridge the discontinuities detected between road segments. A region part segmentation algorithm separates the roads from protruding or attached non-road regions, thereby improving the accuracy. Results are shown using four

    categories of database of high-resolution satellite

    images from the following areas: (1) Developed

    suburban, (2) Developed Urban, (3) Emerging subur-

    ban and (4) Emerging Urban. Performance analysis is

    presented using completeness and correctness meas-

    ures (Heipke et al. 1997).

    This paper is organized as follows: Section

    Research Issues and Design Strategy deals with

    the research issues and design strategies. Section

    Proposed Method

    deals with the overall proposed

methodology. A description of the various stages in the proposed framework is presented in Section

    Description of the Different Stages in Our Proposed

    Framework. We present experimental results in Section

    Experimental Results and Comparative Study and

    conclude the paper in Section Conclusions.

    Research Issues and Design Strategy

The difficulties in the design of an automated road network extraction system using remotely-sensed

    imagery lie in the fact that the image characteristics

    of road feature vary according to sensor type, spectral

    and spatial resolution, ground characteristics, etc.

    Even for an image taken over a particular urban area,

    different parts of the road network reveal different

    characteristics. In real world, a road network is too

    complex to be modeled using a mathematical formu-

    lation or an abstract model. The existence of other

    objects (e.g., buildings and trees) cast shadows to

occlude road features, thus complicating the extraction process.

    Human perceptual ways of recognizing a road

involve (Jin and Davis 2005) extracting geometric,

    radiometric and topological characteristics of an

    image. Humans usually recognize a road using first

    its geometric characteristics considering a road to be a

    long, elongated feature with uniform width and

    similar radiometric variance along its path. Even

    though spectral characteristics of road vary within an

    image, its physical appearance tends to exist as long

    continuous features. Humans fuse these vital clues to

    identify a foreground road object from the back-

    ground layer. This motivated us to develop a generic

    framework that integrates suitable processing modules

    necessary for extracting the different types of features

present in road objects available in satellite scenes. We present the characteristics of roads next, followed

    by suitable modules designed specifically to address

    these issues. We also validate the efficiency of the

    extraction system using experimental results.

The most significant characteristics of roads, as they appear in high-resolution satellite imagery, are:

    1. Roads have a distinctively contrasting spectral

    signature (both locally and globally) with respect

    to the background layer (e.g. vegetation, soil,

    waterways, manmade structures etc.).

    2. Roads are mostly elongated structures, with

    locally linear properties.

    3. The road surface is usually homogeneous, with

    occasional variations.

    4. Discontinuities appear in a road structure mainly

    due to occluding objects, such as trees, buildings,

    large vehicles etc. or even shadows.

    5. Roads do not appear as a small segment or patch;

    either in isolation or attached to a large linear

    segment.

    6. Roads rarely terminate (no abrupt ending) within

    short distances. In fact, they intersect, occlude

    one another (bridges and highways) and bifurcate

    to build a network (global appearance).

    7. Roads have near-parallel boundaries, with both

    linear and curvilinear patterns.

    8. Road structures are rarely non-smooth and occur

generally without many sharp bends.

    Among the different properties stated above, the

    two major characteristics of roads are their geometri-

    cal shape and spectral contrast (as stated in (1) and (2)

above). Roads in high spatial-resolution images of urban areas appear as piecewise linear segments with

    spectrally homogeneous characteristics. These are

    vital clues, which form the basis of the design of

    our framework for automatically detecting roads in

    satellite imagery.

    In the design of a framework for road detection, we

    first need to exploit these two vital characteristics of

    roads. In such a case, one may be tempted to use a

    foreground extracting algorithm trained with spectral


    patterns for roads and then use linear features on top

    of it. However, a classifier based on only spectral

    features will produce false alarms (identify non-road

    objects as roads and filter parts of roads as back-

    ground, due to reasons mentioned in points (3) and

    (4) above). On the other hand, a pattern classifier (for

classifying roads) trained with geometrical features is useless, unless the target (road, in this case) is

    available. It is also not possible to simultaneously

extract and fuse this pair of distinct/disconnected features together, since the linear features cannot be estimated unless the road-like structures are first filtered from the background. It is impossible to design an

    operator or mask for this purpose, as that would need

    to simultaneously extract spectral and RST-invariant

    shape (geometrical) features from the image data. It is

    also not possible to formulate a mathematical (para-

metric) model for a road network, which will work for all complex variations in the geometric design

    patterns (linear and curvilinear) formed by roads in

    urban scenarios.

    Due to the existence of these complex phenomena

    for roads, it is almost impossible to consider and

    model all these situations and incorporate them in a

    single module or processing stage for road network

    extraction. This drove us to formulate and design a

    hierarchical pipelined framework, consisting of the

    classification (supervised), information integration,

filtering and local neighborhood analysis to obtain decent results with acceptable quality. Results will be

    compared with two state-of-the-art methods (Tuncer

    2007; Mokhtarzade and Zoej 2007) published in

    literature and one GIS-based software (Geospace

    2008) used for raster image analysis.

    Because of the issues mentioned earlier, in most

cases with hyper-spectral datasets, the spectral infor-

    mation alone is not sufficient to define roads. We

    need an integrated multistage framework to achieve

    our goal. Each stage of the framework deals with a

particular characteristic of roads; these are given in the

    left column of Table 1. The center column gives the

    corresponding strategy (processing module) used by

    us to solve the problem, while the right-hand side

column specifies the difficulties/drawbacks that one may face in the execution of that stage. In the next

    section we describe our proposed multistage method

    based on the issues discussed in this section, followed

    by design details of the road extraction modules listed

    in Table 1.

    Proposed Method

    A multistage pipelined framework for road extrac-

tion has been proposed in this paper. Figure 1 shows the flowchart of our proposed method of road

    extraction, which is a hierarchical pipelined multi-

    stage framework based on details specified in

    Table 1. The first stage consists of an iterative

    merging of region and edge based information using

    a set of constraints. Road edges (boundaries) are

    extracted from edge features using DSM. We assume

    roads appearing in satellite images to be locally

    linear. Soft class labels (probabilities) for each pixel

    belonging to either road or non-road regions are

produced by the PSVM. Then a modified CSNN, termed CSNN-CII (Lalit et al. 2008), is used for

integrating the complementary information from the

    edge and region outputs. A fruitful cooperation

    could be established between region-based and

    edge-based methods to extract elongated thick

    objects like roads in high-resolution satellite imag-

ery. An elongatedness measure (shape feature) is used to

    remove the isolated non-road structures. Then a

Table 1 Road characteristics & corresponding processing modules

Sl. No | Characteristics | Strategy/module | Remarks
1 | Contrast w.r.t. background; mostly homogeneous | SVM classifier using mean and variance of spectral response | Misclassification of non-road objects with identical spectral response
2 | Elongated structure | DSM on edge map; shape features | Discontinuity due to occlusion
3 | Discontinuities and distortions in linear pattern | CSNN-CII and segment linking | Chance of linking roads with other structures
4 | Not appearing in isolation, rarely terminates | Region part segmentation | Removal of small road fragments


    segment linking algorithm is used to link the

    discontinuous road segments which result due to

occlusion. The region part segmentation module removes

    the non-road structures which appear due to adjacent

    manmade structures. The steps of the algorithm,

depicting the process illustrated in Fig. 1, are given in

    Algorithm 1. In the next section, we present the

    description of the different stages of our framework

    along with intermediate results of processing using

    two satellite image samples.

    Algorithm 1 Proposed framework for road detection.

    Input: Image.

    Output: Segmented Image.

    Steps:

    1. Compute edge maps of the image using DSM.

    2. Compute the probability of class-label for each pixel using PSVM.

3. Integrate the region information and edge information (outputs of steps (2) and (1)) using CSNN-CII (Lalit et al. 2008):
   Initialize the neurons in CSNN-CII using the probabilities obtained from PSVM.
   Iterate and update the probabilities and the edge map to get the final segmented map.
4. Post-process the CSNN-CII output to remove stray patches and unnecessary artifacts.
5. Perform segment linking to reduce the false negatives.
6. Perform the region part segmentation algorithm to reduce the false positives.
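The control flow of Algorithm 1 can be expressed as a thin driver that chains the six stages. The sketch below (Python) only illustrates this flow; the stage functions are supplied by the caller, and the names used here are our own, not identifiers from the paper.

```python
from typing import Callable
import numpy as np

Stage = Callable[[np.ndarray], np.ndarray]

def extract_roads(image: np.ndarray,
                  dsm_edges: Stage,                      # Step 1: DSM road-edge map
                  psvm_prob: Stage,                      # Step 2: per-pixel P(road)
                  csnn_cii: Callable[[np.ndarray, np.ndarray], np.ndarray],
                  post_process: Stage,                   # Step 4: remove stray patches
                  link_segments: Stage,                  # Step 5: bridge occlusion gaps
                  region_part_segment: Stage) -> np.ndarray:
    """Chain the stages of Algorithm 1; each stage is a caller-supplied callable."""
    edges = dsm_edges(image)
    prob = psvm_prob(image)
    fused = csnn_cii(prob, edges)        # Step 3: fuse region and edge information
    cleaned = post_process(fused)
    linked = link_segments(cleaned)
    return region_part_segment(linked)   # Step 6: detach fused non-road structures
```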

    Description of the Different Stages

    in Our Proposed Framework

    DSM Based Edge Detection

    Roads are expected to be locally linear. Hence, we

    extract the local orientation from the image of the

    road network. Extracting linear features from satellite

    images have been of interest to pattern recognition

    community for some time (Cooper and Cowan 2007;

    Granlund and Knutsson 1995; Lyvers and Mitchell

    1988; Wei and Xin 2008; B.Majidi and BabHadiashar

    2009). In the work by Cooper and Cowan (2007),

    amplitude balanced horizontal derivatives were used

for enhancing linear features in images. However, if the dataset possesses features with large variations in

    amplitude then the horizontal derivative will also have

    the same property, and the smaller amplitude features

    (which may be of considerable importance) may be

    hard to discern. Granlund and Knutsson (1995)

    devised an elegant method for combining the outputs

    of quadrature pairs to extract a measure of orientation.

    Perona (1998) extended the idea of anisotropic

    diffusion to orientation maps. Bigun et al. (1991)

    posed the problem as the least squares fitting of a

plane in the Fourier transform domain. Another technique (Haglund and Fleet 1994) based on

    steerable filters (Jacob and Unser 2004), is limited

    in precision and generalization. In (Lyvers and

    Mitchell 1988), Lyvers et al. examined the accuracy

    of various local differential operators for noiseless

    situations, as well as in the presence of additive

    Gaussian noise. In (Jiang 2007), Jiang proposed an

    image integration operator which leads to unbiased

orientation estimation.

Fig. 1 Framework of the proposed method for road detection


    Our method of obtaining the dominant direction

    using PCA and a gradient matrix (obtained using 1-D

    Canny (Kumar et al. 2000)) for orientation estimation

    to extract road segments is novel, more efficient and

    produces more robust results. Most established local

    orientation estimation techniques are based on the

analysis of the local gradient field of the image. But the local gradients are very sensitive to noise, thus

    making the estimate of local orientation from these

    images unreliable. We use the method of Principal

    Component Analysis (PCA) for image orientation

    estimation. For each pixel in the image, we first

    calculate the local image gradients (using 1-D Canny

    (Kumar et al. 2000)) and then perform SVD of the

    gradient matrix. Gradient of image f(x,y) at point (xk,

    yk) is denoted by:

$$\nabla f_k \equiv \nabla f(x_k, y_k) = \left[ \frac{d f(x_k, y_k)}{dx},\; \frac{d f(x_k, y_k)}{dy} \right]^{T} \qquad (1)$$

    which involves 1-D processing along orthogonal

    directions (for details see (Kumar et al. 2000)). For

    example, the smoothing operator used along one

    direction (say, x) is the Gaussian filter:

$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\!\left(-\frac{x^{2}}{2\sigma_1^{2}}\right) \qquad (2)$$

    and the 1-D Canny operator for computing the

    derivative along y is:

$$dG(y) = \frac{-y}{\sqrt{2\pi}\,\sigma_2^{3}} \exp\!\left(-\frac{y^{2}}{2\sigma_2^{2}}\right) \qquad (3)$$
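A minimal sketch of these separable filters, assuming NumPy/SciPy: gaussian_filter1d with order=1 convolves with a derivative of Gaussian, which corresponds to Eqs. 2 and 3. The sigma values are illustrative, not values prescribed by the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def gradient_field(image: np.ndarray, sigma1: float = 1.0,
                   sigma2: float = 1.0) -> tuple:
    """Return (df/dx, df/dy) via 1-D Gaussian smoothing and 1-D Canny derivatives."""
    img = image.astype(float)
    # df/dx: smooth along y (Eq. 2), differentiate along x (Eq. 3)
    fx = gaussian_filter1d(img, sigma1, axis=0)
    fx = gaussian_filter1d(fx, sigma2, axis=1, order=1)
    # df/dy: the two operators interchange their directions of processing
    fy = gaussian_filter1d(img, sigma1, axis=1)
    fy = gaussian_filter1d(fy, sigma2, axis=0, order=1)
    return fx, fy
```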

Similar processing is applied along the y and x

    directions, where the two operators interchange their

    directions of processing. This method is efficient and

    produces better gradient vectors which are orthogonal

    to the dominant orientation of the image pattern. Let

    us assume that in the image of interest f(x,y), the

orientation field is piece-wise constant. Under this assumption, the gradient vectors in an image block

    should on average be orthogonal to the dominant

    orientation of the image pattern. So orientation

    estimation can be formulated as the task of finding a

unit vector a, which maximizes the average of the angles between a and the gradient vectors (Feng and Milanfar 2002). The computational basis of PCA is

    the calculation of the Singular Value Decomposition

    (SVD) of the data covariance matrix. The majority of

    the eigenvectors form a cluster along a dominant

    direction indicating the presence of a linear structure.

    The eigenvalue will reflect the strength (peakiness in

    domain) of the distribution of the gradients towards

    a particular direction. Generally, the first eigenvalue is

    larger than the second one, and hence in case of an

ideal straight line the second eigenvalue is zero (indicating no spread along the orthogonal direction).

    However, a digital line is represented stepwise

    (aliased), and hence the second eigenvalue for the

    case of a line in a digital image is a non-zero value. In

    order to get the local orientation estimate, we

rearrange the gradient vectors into a 2 × N² matrix, where a window of size N × N is used for processing around each pixel, as shown below:

$$G = \left[ \nabla f_1 \;\; \nabla f_2 \;\; \nabla f_3 \;\; \cdots \;\; \nabla f_{N^2} \right] \qquad (4)$$

where $\nabla f_i = \nabla f(x_i, y_i),\; i = 1, 2, \ldots, N^2$ (see Eq. 1). We then compute the SVD (Singular Value Decomposition) (Strang 2005) of the gradient matrix for each pixel, computed over a window of size N × N. The SVD of the gradient matrix is computed as

$$G = U S V^{T} \qquad (5)$$

where U is an orthogonal 2 × 2 matrix, in which the first column represents the dominant orientation of the gradient field, S is a 2 × N² matrix representing the energy along the dominant directions, and V is an orthogonal matrix of size N² × N² representing each vector's contribution to the singular values.

    Dominant Singular Measure

The Dominant Singular Measure (DSM) is computed as the ratio between the singular value of the major axis and the sum of the singular values. This measure approaches 1 for an elongated shape. DSM is defined as:

$$\mathrm{DSM} = \frac{s_1}{s_1 + s_2}, \qquad s_1 \ge s_2 \qquad (6)$$

    When all the gradient components have the same

    direction, only one singular value (s1) is non-zero,

    which in turn makes the DSM value equal to 1. If both

    the singular values are equal and non-zero, the DSM

value is 0.5. The DSM value thus lies in the range [0.5, 1]. We use the DSM measure to distinguish

    between scattered or disoriented image patterns and an


    image region with an orientation pattern. If the DSM is

less than a threshold (0.5 < threshold < 1), the local pattern is treated as scattered rather than as a locally linear road boundary.
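The per-pixel DSM computation of Eqs. 4-6 can be sketched as follows, assuming NumPy and the gradient_field helper sketched earlier; the window size N and the plain double loop are illustrative simplifications.

```python
import numpy as np

def dsm_map(fx: np.ndarray, fy: np.ndarray, N: int = 5) -> np.ndarray:
    """DSM = s1 / (s1 + s2) from the local 2 x N^2 gradient matrix (Eqs. 4-6)."""
    h, w = fx.shape
    r = N // 2
    out = np.zeros((h, w))
    for i in range(r, h - r):
        for j in range(r, w - r):
            gx = fx[i - r:i + r + 1, j - r:j + r + 1].ravel()
            gy = fy[i - r:i + r + 1, j - r:j + r + 1].ravel()
            G = np.stack([gx, gy])                   # 2 x N^2 gradient matrix (Eq. 4)
            s = np.linalg.svd(G, compute_uv=False)   # singular values, s1 >= s2 (Eq. 5)
            if s.sum() > 0:
                out[i, j] = s[0] / s.sum()           # DSM in [0.5, 1] (Eq. 6)
    return out
```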


    allows the identification of samples drawn from

    unknown classes through the application of a suitable

    Bayesian decision rule (Duda et al. 2000). This

approach is based on support vector machines (SVMs) for the estimation of probability density

    functions, which uses a recursive procedure to

    generate prior probability estimates for known and

    unknown classes. SVMs are exploited by Yager and

    Sowmya (2003) as a classifier for road extraction,

    which involves two stages of processing. Here, SVM

is trained using edge-based features such as edge

    length, gradient and intensity within the edge pair. In

    level 1, SVM is used to classify edges as road edges

    or non-road edges. Edges classified as road edges are

given as input to the SVM in level 2, where opposite edges are paired as road segments. However, they

    have reported very low correctness measure. A new

method (Miliaresis and Kokkas 2007) is presented

    for the extraction of buildings from light detection

    and ranging (LIDAR) digital elevation models

    (DEMs) on the basis of segmentation principles. The

    accuracy of supervised classification largely depends

    on the quality of the training data. The locations and

sample size of the training data are difficult to optimize, depending on the image data types and

classifiers to be used. Support vector machines (SVMs) represent a prom-

    ising development in machine learning research that is

    not widely used within the remote sensing community

(Pal and Mather 2005). The architecture of an SVM (Theodoridis and Koutroumbas 2006) is given in Fig. 5. The number of nodes is determined by

    the number of support vectors Ns.

    The main idea of SVM is to separate the classes

    with a hyperplane surface so as to maximize the

    margin among them. In this paper, support vector

    machines are used to classify roads from satellite

    imagery. In SVM the input vectors are mapped

nonlinearly to a very high-dimensional feature space (Cortes and Vapnik 1995). Considering a two-class pattern classification problem, let the training set of size N be $\{(X_i, d_i)\}_{i=1}^{N}$, where $X_i \in \mathbb{R}^{n}$ is the input pattern for the i-th example and $d_i \in \{-1, +1\}$ is the corresponding desired response. The classifier is represented by the function $f(x; \alpha) = y$, with $\alpha$ as the parameters of the classifier. The SVM method

    involves finding the optimum separating hyperplane

    so that:

1. Samples with labels y = ±1 are located on each

    side of the hyperplane.

    2. The distances of the closest vectors to the

    hyperplane on each side are maximum. These

    are called support vectors and the distance is the

    optimal margin.

    The membership decision rule is based on the

    function f(x) where, f(x) represents the discriminant

Fig. 4 The results of DSM on a satellite image of a suburban scene: a input image, b edge map extracted using multi-scale Canny (Kumar et al. 2000; Qian and Huang 1996), c corresponding DSM output

Fig. 5 Architecture of SVM


    function associated with the hyperplane in the trans-

    formed space and is defined as:

$$f(x) = w^{*} \cdot \phi(x) + w_0 \qquad (7)$$

where w* is the weight vector, w₀ is the bias, and φ(x) ∈ ℝ^{d₀} (d₀ > d) is the mapping of x into the transformed feature space. SVM is used to classify every pixel into either road or non-road groups based on the sign of the discriminant function (y = sgn(f(x))). Pixels

    belonging to roads are assigned as group 1 and others

    to group 2 from training sample images. Since SVM

    has good generalization ability, this decision function

    can be applied to extract road structures from satellite

    images. Through training, we obtain the decision

    function. The feature vectors are fed into the SVM

    classifier initially for training (to learn the pattern)

    from known examples, and then for predicting the

    labels of unknown samples once the training is

complete. Having a classifier produce a posterior probability is very useful in practical

    recognition problems. Posterior probabilities are also

    required when a classifier is making a small part of an

    overall decision, and the classification output is

    combined for overall decision. As described above,

SVM is principally a binary classifier. A polynomial

    kernel of degree two was used due to its superiority

    over other kernels for most of the applications.

    However, SVM (Cortes and Vapnik 1995) produces

    an uncalibrated value that is not a probability. In the

next section, we describe a mechanism to obtain probabilistic classification of pixels as roads or non-

    roads, using soft-class labels from SVM.

    Soft Class Labels Using PSVM

SVMs do not provide any estimation of their

    classification confidence. Thus, SVM does not allow

    us to incorporate any a-priori information. Hence we

    use PSVM to produce posterior probability P(Class/

    Input). The posterior probability outputs of SVMs are

    based on the distance of testing vectors and support

    vectors. Following a method presented in Platt

    (1999), a sigmoid model is used to map binary

    SVM scores into probabilities as shown below:

$$P(y = 1 \mid f) = \frac{1}{1 + \exp(A f + B)} \qquad (8)$$

    where y is the binary class label and f is an output

of the SVM decision function (Eq. 7). The two parame-

ters A and B are obtained by maximum likelihood estimation on the training set (fi, yi), i.e., by minimizing the negative log-likelihood

    of the training data. An image block is said to be road

    if its probability output by PSVM is larger than a

    predetermined threshold. As a result, the model has a

probabilistic output for further processing. The probabilistic output of a classifier makes it possible to use

    existing results for fusion theories, especially in cases

    when a classifier is making a small part of an overall

    decision, and the classification outputs must be

    combined for the overall decision.
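A minimal PSVM sketch assuming scikit-learn: SVC(probability=True) fits a Platt-style sigmoid (Eq. 8) on the SVM decision values. The degree-2 polynomial kernel follows the text, the per-patch mean/variance features follow Table 1, and the array names below are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

def patch_features(patches: np.ndarray) -> np.ndarray:
    """Mean and variance of each spectral band for (n, 21, 21, bands) patches."""
    return np.concatenate([patches.mean(axis=(1, 2)), patches.var(axis=(1, 2))], axis=1)

def train_psvm(road_patches: np.ndarray, nonroad_patches: np.ndarray) -> SVC:
    X = np.vstack([patch_features(road_patches), patch_features(nonroad_patches)])
    y = np.concatenate([np.ones(len(road_patches)), np.zeros(len(nonroad_patches))])
    # probability=True enables Platt-style scaling of the decision values (Eq. 8)
    return SVC(kernel="poly", degree=2, probability=True).fit(X, y)

# Usage: prob_road = train_psvm(road_patches, nonroad_patches)
#                    .predict_proba(patch_features(test_patches))[:, 1]
```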

    Training samples are gathered from regions sur-

    rounding the road pixels. The sample sub images shown

    in Fig. 6, illustrate the discriminative feature between

    road and non-road samples. Spectral characteristics

    vary for both the classes, which is analyzed by PSVM.

As seen from Fig. 6a, the local homogeneous orientation for the road class will be captured by DSM, whereas

    non-road structures as shown in Fig. 6b will produce

    distributed orientations. In order to demonstrate the

    performance of the proposed method, we used the

    generated dataset described in Section Dataset

    Description and Performance Measures. Our system

    is trained with 5,000 samples of road and 7,200

    samples of non-road classes. Once the classifier is

    trained, it is asked to predict the labels for the test

Fig. 6 a Road samples and b non-road samples, of size 21 × 21


    image pixels. Figure 7 shows the results of P-SVM for

the images given in Figs. 3a & 4a. Experimental results

    for different scenarios, namely, urban and suburban

    areas of developed and emerging countries and their

    discussions are presented in Section Results and

    Discussion. In the next section, we discuss the method

of fusing the two complementary pieces of information (the segment class map from PSVM and the linear edge map obtained

    using DSM), using a CSNN (Constraint Satisfaction

    Neural Network) based integrator.

    CSNN for Integration

    Edge extraction from satellite images often delivers

    partly fragmented and erroneous results. Attributes

    describing geometrical and radiometric properties of

    the line segments can be helpful in sorting out the

most probable false alarms. However, these attributes may be ambiguous and are not considered to be

    reliable enough when used alone. Region based

    segmentation produces over-segmentation whereas

edge-based segmentation may lead to under-

    segmentation. We used a fusion strategy proposed

    by Lalit et al. (2008), which uses a constraint to

    iteratively correct both these erroneous outputs to

    produce a better result. The method is described

briefly in the following for the sake of completeness of

    this paper.

Each neuron in CSNN-CII contains two fields: probability and rank. The rank field stores the rank of the

    probability in a decreasing order for that neuron. We

    exploit the soft class labels produced by PSVM to

    compute ranks, which in turn is used to initialize the

    interconnection weights of the CSNN. In addition to

    region-based constraints CSNN-CII also incorporates

    edge constraints. The number of neighbors considered

    for computation is determined using edge informa-

    tion. The initial class probabilities can be obtained

    using PSVM (Platt 1999). The initial edge maps can

    be obtained using DSM based techniques for road

    edge extraction.

    Dynamic Window

    The interconnection weights of the CSNN are

    computed only for those neurons which are within

    the effective size of the dynamic window. This

    effective width is based on the presence of edge

    information around the seed pixel. The stopping

    criterion is based on the presence of the edge pixels.

    Hence this process helps to mutually exploit both the

    complementary information of regions and edges

    inside the window. The window is considered to be

dynamic (or adaptive), as its effective size depends on both pieces of information: one (region) for initial estima-

    tion and the other (edge) for convergence. The

    obvious advantage of using dynamic window at

    region boundaries is that only the neurons which

    correspond to a single class will be processed and the

    neurons which may confuse the network would not be

    used for computation. The optimal size of dynamic

window (m × n) was obtained empirically as 31 × 21.

    Lalit et al. (2008) used a square window, whereas we

    use a rectangular oriented window in our work. The

orientation of the rectangular window is obtained from the DSM output. It was observed from experi-

    mentation, that when a larger window size was used

    small regions (or small sections of a region) were

    merged with larger adjacent regions. The use of a

    smaller window size makes the CSNN take a longer

time to converge to the final solution. Figure 8 shows the results of CSNN-CII using inputs from the intermediate results of processing shown in Figs. 3, 4 and 7, for the images in Figs. 3a and 4a.

Fig. 7 a The results of P-SVM for the image shown in Fig. 3a; b the results of P-SVM for the image shown in Fig. 4a

Fig. 8 The results of CSNN-CII obtained by: a combining those in Fig. 3c & Fig. 7a; b combining those in Fig. 4c & Fig. 7b, for the images in Figs. 3a & 4a respectively
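The effect of the dynamic window can be illustrated with a deliberately simplified, axis-aligned sketch: starting from a seed pixel, the extent used for the CSNN-CII weight computation grows until it meets an edge pixel from the DSM edge map or reaches the maximum size. The paper's window is rectangular (31 × 21) and oriented along the DSM output; the function and parameter names below are our own illustrative choices.

```python
import numpy as np

def effective_extent(edge_map: np.ndarray, seed: tuple, max_half: int = 15) -> tuple:
    """Half-extents (up, down, left, right) of the dynamic window around a seed pixel."""
    r, c = seed
    h, w = edge_map.shape
    extents = []
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        step = 0
        while step < max_half:
            rr, cc = r + (step + 1) * dr, c + (step + 1) * dc
            if not (0 <= rr < h and 0 <= cc < w) or edge_map[rr, cc]:
                break  # stop at the image border or at an edge pixel
            step += 1
        extents.append(step)
    return tuple(extents)
```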

    Post-Processing and Segment Linking

The objective of the refinement process presented here is to eliminate the false segments which do not

    belong to roads. The result of CSNN integration

    produces a few undesired patches, which do not

    correspond to road segments. In the case of satellite

    images, a few undesired or noisy structures will be

    erroneously classified as road segments. To eliminate

    these false alarms (segments), we use connected

    component labeling (Haralick and Shapiro 1992) to

    extract the disjoint segments from the output of our

    algorithm. Segments with area less than a prefixed

    threshold TA are deleted. Major axis and minor axis

    lengths of each component are computed using

    normalized second central moments for each segment

    as shown below:

$$\mu_{20} = M_{20} - \bar{x}\,M_{10}, \qquad \mu_{02} = M_{02} - \bar{y}\,M_{01},$$
$$\bar{x} = \frac{M_{10}}{M_{00}}, \qquad \bar{y} = \frac{M_{01}}{M_{00}}, \qquad M_{pq} = \sum_{x}\sum_{y} x^{p} y^{q}\, I(x, y)$$

We computed the ratio of the major axis length to the minor axis length of each component as $E = \mu_{20} / \mu_{02}$.

Components having a value of E less than a threshold TE are usually non-road structures and are hence deleted.

    The steps of the algorithm, depicting the post-

    processing stage is given below in Algorithm 2.

Algorithm 2 Steps of post-processing for refining the result.

Compute the connected components.
1. Compute the area (A) of each connected component.
2. Compute the eccentricity (E) of each connected component.
3. For each component:
   if (E < TE) then
      delete that component
   else
      if (A < TA) then
         delete that component
      end if
   end if
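A minimal sketch of Algorithm 2 assuming scikit-image; the thresholds T_A and T_E are illustrative values, not ones fixed by the paper.

```python
import numpy as np
from skimage.measure import label, regionprops

def post_process(road_mask: np.ndarray, t_area: int = 200, t_elong: float = 3.0) -> np.ndarray:
    """Keep only connected components that are both large and elongated enough."""
    labels = label(road_mask > 0)
    keep = np.zeros(road_mask.shape, dtype=bool)
    for region in regionprops(labels):
        if region.minor_axis_length == 0:
            continue  # degenerate component, treat as non-road
        elong = region.major_axis_length / region.minor_axis_length
        if elong >= t_elong and region.area >= t_area:  # E >= T_E and A >= T_A
            keep[labels == region.label] = True
    return keep
```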

    We used a region linking algorithm (Rizvandi et al.

    2008) to eliminate the discontinuities detected between

    road segments. Initially a dilation operation is performed

    on the input image. Since dilation is an operation that

    thickens or grows objects in the original image, the

result of this operation is that edge segments which are very close to each other are automatically linked. In our

    algorithm the structural element used for the dilation

    operation is a disk of radius 10. The image is then

    thinned and the edges are broken down into smaller

    straight line edge segments. Heuristics based upon

    proximity properties and alignment of road features are

    used to cluster and integrate fragmented segments. For

    each segment, the best neighbor is determined based on

    the difference in direction and the minimum distance

    between the end points. Results of post-processing and

    segment linking are shown in Figs. 9 and 10.
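The morphological part of this linking step can be sketched as below, assuming scikit-image: dilate with a disk of radius 10 so that nearby segments merge, then thin back to one-pixel-wide segments. The heuristic clustering of the remaining fragments by direction and end-point distance is omitted here.

```python
import numpy as np
from skimage.morphology import binary_dilation, disk, thin

def link_close_segments(road_mask: np.ndarray, radius: int = 10) -> np.ndarray:
    dilated = binary_dilation(road_mask > 0, disk(radius))  # bridges small gaps
    return thin(dilated)                                    # back to thin segments
```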

Fig. 9 The results of a post-processing using the output shown in Fig. 8a; b segment linking using the output shown in (a)


    Region Part-Segmentation

    Region part segmentation is necessary to eliminate

    some large patches of non-road structures which

appear to be fused to roads. These patches are man-made structures, such as roof-tops and parking lots, with spectral characteristics similar to those of roads. The proposed

    algorithm for Region part-segmentation is based on

    part-segmentation (Bennamoun and Mamic 2002),

    consisting of the following steps:

    1. Compute the smoothed inner and outer contours

    (closed) of the image

    2. Compute the smoothed curvature of the con-

    tours.

3. Determine the local extrema, where the derivative of the smoothed curvature equals zero, with

    curvature value greater than a threshold.

    4. Compute Convex/Concave Dominant Points at

    which the interior angle is greater/less than

180°, by tracing the outer/inner contour of the

    region as shown in Fig. 11.

    5. Compute effective Convex (CDPcx) and Con-

    cave (CDPce) dominant points, on outer and

    inner contours respectively by logical AND

    operation of the output in steps 3 and 4.

    6. The CDPs (both CDPcx & CDPce) are moved

    along the normal for a fixed number of iterations

    (all the CDPs must move simultaneously) on the

    respective contours.

7. A moving CDP will stop (freeze) only if it touches another moving CDP or a point on the

    same contour within a specified path distance

    from it. For the outer contour, if the contour of

    the segment touches the boundary of the image,

then the respective CDPs are not frozen.
8. Trace back all the frozen CDPs and join the

    pair of corresponding CDPs or the CDP and the

    contour point using a line segment.

    9. For each line segment obtained in step 8: form

    two adjacent regions within a closed contour,

using the line as the new boundary.
10. Merge the new pair of adjacent regions, if they

    have similar structural properties (orientation of

    line segments near the CDPs).

    11. Set a threshold and eliminate all the connected

    components with area below the threshold.

    Curvature Computation

    A curve is represented in parametric form, where t is

    the path length, x and y are the coordinates of the

    contour.

$$r(t) = \big(x(t),\, y(t)\big) \qquad (9)$$

If there is more than one object, then the outer contour

    is traced for each object. If there is a child object

    inside an object, we have to then trace the outer

    contour for the child object as well.

    Inner boundary pixels are extracted by tracing the

    pixels at the inner contour in an object. A smoothing

    of the contour with a Gaussian kernel is then needed

prior to the computation of the curvature, to overcome the problem of discontinuities in derivatives needed

    for curvature calculation (Pei and Lin 1992). The

smoothed contour is represented as

$$x_s(t) = x(t) * G, \qquad y_s(t) = y(t) * G \qquad (10)$$

Figure 11a shows an image having one object with

    two holes. The outermost pixels of the object are

    traced to extract the outer contour and the boundary

of the holes gives the inner contours, as shown in Fig. 11b.

Fig. 11 a Input image; b inner and outer contours

Fig. 10 The results of a post-processing using the output shown in Fig. 8b; b segment linking using the output shown in (a)


Curvature is defined as the rate of change of

    slope as a function of arc length t:

$$K(t) = \frac{d\theta(t)}{dt} \qquad (11)$$

where θ(t) is the tangent angle of the curve at t. The curvature is computed as (Bennamoun and Mamic 2002)

$$K_s(t) = \frac{\dot{x}_s \ddot{y}_s - \dot{y}_s \ddot{x}_s}{\left(\dot{x}_s^{2} + \dot{y}_s^{2}\right)^{3/2}} \qquad (12)$$

    The curvature obtained from Eq. 12 is smoothed

    with a Gaussian kernel (Eq. 2) to obtain a smoothed

    curvature, as given by the following equation:

$$K_s(t) = K(t) * G \qquad (13)$$

Figure 12c shows the curvature plot of the image shown in Fig. 12a. The smoothed curvature obtained using Eq. 13 is shown in Fig. 12d.

    Extraction of Dominant Points

It has been suggested from the viewpoint of the human

    visual system (Bennamoun 1994) that the dominant

    points have high curvature or the rate of change of

    slope along the path length is high. In this paper, we

    detect these points and use them to decompose theobject to remove the non-road structures. Dominant

points are points having a curvature value greater than a

threshold. Local extrema are defined by the points at

    which the derivative of the curvature equals zero (Pei

    and Lin 1992), as

$$\dot{K}_s(t) = \frac{d K_s(t)}{dt} = 0 \qquad (14)$$

    which is equivalent to convolving the curvature with

    the derivative of Gaussian and taking the zero cross-

ings of this operation. Figure 12e shows the local extrema for the input image in Fig. 12a.
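The smoothed curvature and its extrema (Eqs. 10-14) can be sketched with Gaussian-derivative filtering of the contour coordinates, assuming SciPy; sigma and the curvature threshold are illustrative values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def dominant_point_candidates(x: np.ndarray, y: np.ndarray,
                              sigma: float = 3.0, k_thresh: float = 0.2) -> np.ndarray:
    """Indices of high-curvature local extrema on a closed contour (x(t), y(t))."""
    dx = gaussian_filter1d(x, sigma, order=1, mode="wrap")    # smoothed x'(t)
    dy = gaussian_filter1d(y, sigma, order=1, mode="wrap")    # smoothed y'(t)
    ddx = gaussian_filter1d(x, sigma, order=2, mode="wrap")   # smoothed x''(t)
    ddy = gaussian_filter1d(y, sigma, order=2, mode="wrap")   # smoothed y''(t)
    k = (dx * ddy - dy * ddx) / (np.power(dx**2 + dy**2, 1.5) + 1e-12)  # Eq. 12
    ks = gaussian_filter1d(k, sigma, mode="wrap")                        # Eq. 13
    dks = gaussian_filter1d(ks, sigma, order=1, mode="wrap")             # dKs/dt
    zeros = np.where(np.diff(np.sign(dks)) != 0)[0]                      # Eq. 14
    return zeros[np.abs(ks[zeros]) > k_thresh]   # keep only high-curvature extrema
```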

Fig. 12 a Input synthetic image; b smoothed contour; c curvature plot; d smoothed curvature; e local extrema; f effective CDPs marked on the smoothed curvature in (d); g CDPs marked on the smoothed contour; h contour normals at the CDPs; i segmented map of (a)


The convex dominant points on the outer contour are combined with the local extrema using an AND operation to give the effective CDPcx. Similarly, the concave dominant points on the inner contour are combined with the local extrema to get the effective CDPce.

    These points are then used to segment the non-road

parts from the given image. The CDPcx are moved inwards along the direction of their normals, whereas the CDPce are moved outwards along the direction

    of the normal. For a particular contour all the

    CDPs (both CDPcx & CDPce) are allowed to move

    simultaneously, and a CDP freezes only when it

    touches another moving CDP in the same contour or

    a point in the contour itself, which is within a

    specified path length. The specified path length of

    the moving CDP dictates the maximum perimeter of

    the non-road region for the purpose of elimination.

All the frozen CDPs are traced back to their origins

    and the corresponding CDPs or the CDP and the

    contour point are joined using a line segment. The

effective CDPs on the smoothed curvature in Fig. 12d

    are shown in Fig. 12f. The same are marked on the

smoothed contour of Fig. 12b, in Fig. 12g.

Figure 12i shows the results of region part segmen-

    tation for the image in Fig. 12a.

Unlike the Bennamoun algorithm (Bennamoun and

    Mamic 2002) there is no necessity to freeze all the

    CDPs and we only move the CDPs for a particular

number of iterations. Unfrozen CDPs are not taken into account for segmentation. Now the regions fitted

    with the new line segments are isolated as separate

components. By setting an area threshold, small noisy

    non-road structures are eliminated. Figure 13 shows

    the results of region part segmentation algorithm for

    the images shown in Figs. 9b and 10b. It is observed

    that the non-road regions have been eliminated

    thereby improving the accuracy of road extraction

    results (Fig. 13).

    Experimental Results and Comparative Study

We now describe the results of experimentation using our proposed framework. The performance of the

    proposed method is verified on satellite images of size

512 × 512 each. The performance of the proposed

    technique is compared with two state of the art

    techniques: Tuncer (2007) and Mokhtarzade et al.

    (2007), as well as a free commercial tool for feature

    extraction (Geospace 2008), termed as FeatureObjeX.

    FeatureObjeX (Geospace 2008) is a semi-

automatic system, which allows the user to select

the training samples. Once the seed is created, intensity distributions are computed for a set of pixels around the seed, which are then used to fit a

    multivariate normal distribution. Each seed region is

    modeled by a Naive Bayes classifier (Duda et al.

    2000). Then the likelihood of a given pixel is

    computed with respect to each of the seed distribu-

tion. If the likelihood of a particular pixel is the same as or

    greater than the likelihood of the seed, then that pixel

    is classified as a target class. FeatureObjeX was used

    to segment the image into road and non-road classes

    using color features. Several configuration changes

were made in FeatureObjeX before the tests, to make it more efficient and closer to our requirement for

    working in road scenes over urban and suburban

    environments.
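The seed-based classification described above can be illustrated roughly as follows (assuming SciPy); this is a paraphrase of the description of the tool, not its actual implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def seed_likelihood_mask(image: np.ndarray, seed_pixels: np.ndarray,
                         seed_value: np.ndarray) -> np.ndarray:
    """image: (H, W, bands); seed_pixels: (n, bands) around the seed; seed_value: (bands,)."""
    dist = multivariate_normal(mean=seed_pixels.mean(axis=0),
                               cov=np.cov(seed_pixels, rowvar=False),
                               allow_singular=True)
    ref = dist.pdf(seed_value)                          # likelihood of the seed itself
    lik = dist.pdf(image.reshape(-1, image.shape[-1]))  # likelihood of every pixel
    return (lik >= ref).reshape(image.shape[:2])        # keep pixels at least as likely
```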

    Dataset Description and Performance Measures

    We created a database for satellite images with 1-m/

    pixel resolution from Wikimapia (Koriakine and

Saveliev 2006). Commercially, this type of imagery is very expensive. We screen-captured 100

images of Developed countries and 100 images of Emerging countries that we considered useful for our

    work. In our case the place and date were not very

    critical, and the only characteristic that we were

    looking for was the content of the images which had

    views of highways and roads. For creating the dataset,

we consider selected sections (512 × 512 pixels) of

    scenes from satellite images of 1 m/pixel resolution

    acquired from Wikimapia (Koriakine and Saveliev

    2006), which includes: (1) sub-urban and (2) urban

    (a) (b)

    Fig. 13 The results of region part segmentation for: a output

    shown in Fig. 9b; and b output shown in Fig. 10b

    J Indian Soc Remote Sens (March 2011) 39(1):125 15

  • 8/2/2019 An Integrate Multistage Framework for Automatic Road Extraction From High Resolution Satellite Imagery

    16/25

    areas from Developed and Emerging countries.

Figures 15a and 17a show three examples each of images from suburban areas in Developed and Emerging countries respectively, whereas Figs. 16a and 18a show three examples each of images from urban areas in Developed and Emerging countries. For each image in the dataset, a ground-truth (road) map was also obtained using a human operator. A portion of the dataset can be downloaded from (Visualisation and Perception Lab 2006). The categorization of the data into four groups was done with the advice (based on visual observation and geo-location) of a GIS expert. As the data was distributed into four groups of 50 images each, we trained four different P-SVMs with data (25 images) from each respective group; the remaining 25 images of each group were used for testing and performance analysis of the output of our proposed multistage framework. A sketch of this per-category protocol is given below.
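A minimal sketch of the split-and-train protocol follows; `train_psvm` is a hypothetical stand-in for the probabilistic SVM training stage described earlier, and the category names are illustrative labels for the four groups.

    # Sketch of the per-category training protocol (assumptions: 50 images per
    # category; `train_psvm` is a hypothetical stand-in for the P-SVM stage).
    import random

    CATEGORIES = ["developed_suburban", "developed_urban",
                  "emerging_suburban", "emerging_urban"]

    def split_and_train(images_by_category, train_psvm, seed=0):
        """Train one P-SVM per category on 25 images; keep 25 for testing."""
        rng = random.Random(seed)
        models, test_sets = {}, {}
        for cat in CATEGORIES:
            imgs = list(images_by_category[cat])      # expected: 50 images
            rng.shuffle(imgs)
            models[cat] = train_psvm(imgs[:25])       # 25 training images
            test_sets[cat] = imgs[25:]                # 25 held-out test images
        return models, test_sets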

To assess the performance of the road extraction system, the length of the extracted road network (obtained after morphological thinning) that falls within a prespecified range of the reference road network is used to calculate the accuracy measures. The road segments in the test sites are manually digitized to form the reference road network. This subjectively obtained reference network, used to evaluate the proposed road extraction system, covers all roads present in the image and is therefore used as the ground truth for estimating the accuracy measures. Two measures are used to evaluate the accuracy of the extracted road network (Heipke et al. 1997), defined as follows. Completeness is the percentage of the reference data that was detected during road extraction:

\text{completeness} = \frac{\text{length of matched reference}}{\text{length of reference}} \quad (15)

Correctness represents the percentage of the extracted road data that is correct:

\text{correctness} = \frac{\text{length of matched extraction}}{\text{length of extraction}} \quad (16)
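These two measures can be computed directly from the thinned binary networks. The sketch below assumes both the extracted and reference networks are one-pixel-wide binary skeletons, approximates length by pixel counts, and uses a buffer of a few pixels as the prespecified matching range; the buffer value shown is illustrative, not the one used in the evaluation.

    # Minimal sketch of the completeness / correctness computation (Eqs. 15-16),
    # assuming `extracted` and `reference` are binary one-pixel-wide skeletons
    # and length is approximated by pixel counts. `buffer_px` is the prespecified
    # matching range in pixels (illustrative value).
    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def completeness_correctness(extracted, reference, buffer_px=3.0):
        extracted = extracted.astype(bool)
        reference = reference.astype(bool)
        # Distance from every pixel to the nearest extracted / reference pixel.
        dist_to_extracted = distance_transform_edt(~extracted)
        dist_to_reference = distance_transform_edt(~reference)

        matched_reference = np.count_nonzero(reference & (dist_to_extracted <= buffer_px))
        matched_extraction = np.count_nonzero(extracted & (dist_to_reference <= buffer_px))

        completeness = matched_reference / max(reference.sum(), 1)    # Eq. (15)
        correctness = matched_extraction / max(extracted.sum(), 1)    # Eq. (16)
        return completeness, correctness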

    Results and Discussion

Separate training of the P-SVM was necessary for the four categories of image samples, as the spectral characteristics exhibited by roads differ across the four cases of our study; road intensity and contrast also vary between the four types of image samples. The proposed CSNN-based algorithm iteratively shuttles between adding new and removing redundant edge pixels, and hence inherently provides a correction mechanism for the fusion process. Edge maps are obtained using the method discussed in Section DSM Based Edge Detection. The CSNN-CII algorithm requires the probability values of all pixels for each class in an image; the initial probability values and segmented maps are obtained using the method discussed in Section Segmentation Using Probabilistic SVM.
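Purely for illustration, the shapes of these fusion inputs and a straightforward initial labelling are sketched below. This assumes the P-SVM stage yields an (H, W, 2) array of per-pixel class probabilities and the DSM stage a binary edge map; it does not reproduce the CSNN-CII update rules described in the earlier sections.

    # Illustrative sketch of the inputs consumed by the fusion stage (assumed
    # shapes only; the CSNN-CII update rules themselves are not reproduced here).
    import numpy as np

    def fusion_inputs(psvm_probs, dsm_edges):
        """psvm_probs: (H, W, 2) per-pixel class probabilities (non-road, road);
        dsm_edges: (H, W) binary edge map from the DSM stage.
        Bundles the inputs and seeds an initial label map from the most
        probable class at each pixel."""
        assert psvm_probs.ndim == 3 and psvm_probs.shape[2] == 2
        assert dsm_edges.shape == psvm_probs.shape[:2]
        initial_labels = np.argmax(psvm_probs, axis=2).astype(np.uint8)
        return {"probabilities": psvm_probs,
                "edges": dsm_edges.astype(bool),
                "initial_labels": initial_labels}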

In order to compare our approach directly with the recently published results in (Tuncer 2007; Mokhtarzade and Zoej 2007), we use a pair of images published by them for evaluation. We will show the results first on the images used in (Tuncer 2007) and (Mokhtarzade and Zoej 2007), and then on a few examples from the testing dataset acquired from Wikimapia (Koriakine and Saveliev 2006) in Figs. 15, 16, 17 and 18. Figure 14a shows the two sample images used in (Tuncer 2007) and (Mokhtarzade and Zoej 2007). The output of a human operator detecting roads in the two images is presented in Fig. 14b. Figure 14c-I presents the result published in (Tuncer 2007), while that in Fig. 14c-II is taken from (Mokhtarzade and Zoej 2007). The result in Fig. 14c-I shows that only roads with rather large pixel widths, such as the main highways, are recovered as thinned structures. Prominent roads are recovered with good accuracy, but narrow inner-city roads and road intersections have not been recovered. Similarly, for the method presented in (Mokhtarzade and Zoej 2007), more false positives (non-road structures) occur, which reduces the correctness measure for this method (see Fig. 14c-II); some pixels belonging to rooftops of buildings were falsely identified as roads. The completeness and correctness measures for the given test images, calculated for Tuncer (2007) and Mokhtarzade and Zoej (2007) as well as for our proposed method, are shown in Table 2. The completeness measure in (Mokhtarzade and Zoej 2007) is higher than that in (Tuncer 2007), as the true positives (actual road parts) are detected more accurately. The results of our proposed method are much better in both cases, as shown in Fig. 14d. It can be observed that our method outperforms both prior published methods.

Figures 15 and 16 show the results obtained using the proposed methodology on satellite images of Developed countries, whereas Figs. 17 and 18 show the results for Emerging countries. Figures 15b, 16b, 17b and 18b show the results of feature extraction using the FeatureObjeX tool for the images in Figs. 15a, 16a, 17a and 18a respectively. Figures 15c, 16c, 17c and 18c show the results of the algorithm proposed in (Tuncer 2007). Figures 15d, 16d, 17d and 18d show the road segments extracted from the input satellite images using the technique presented in (Mokhtarzade and Zoej 2007). Figures 15e, 16e, 17e and 18e show manually plotted reference road layouts for the respective input images. It can be observed that the results of our proposed method, given in Figs. 15f, 16f, 17f and 18f, are significantly better than those of the other approaches and quite close to the ground truth given in Figs. 15e, 16e, 17e and 18e. Our system outperforms FeatureObjeX (Geospace 2008) and the other state-of-the-art methods in all the cases. The optimal values of the parameters used in our proposed approach are given in Table 3; they were obtained empirically through a large set of experiments.

Table 4 compares the accuracy of the results presented in Figs. 15, 16, 17 and 18 using the completeness and correctness measures. From Table 4 it can be seen that our proposed method outperforms the other techniques in almost all the cases in terms of both completeness and correctness; in very few cases, the completeness measure of the FeatureObjeX tool is marginally better than that of our method. Tables 5, 6, 7 and 8 show the average classification accuracy obtained by analyzing images using the proposed method, FeatureObjeX (Geospace 2008) and the two state-of-the-art techniques (Tuncer 2007) and (Mokhtarzade and Zoej 2007), over 25 images in each of the four categories respectively.

Table 2 Performance of the proposed approach and the algorithms presented in (Tuncer 2007; Mokhtarzade and Zoej 2007)

Methods                                        Completeness   Correctness
(Tuncer 2007) (Fig. 14c-I)                     82%            96%
Proposed (Fig. 14d-I)                          100%           100%
(Mokhtarzade and Zoej 2007) (Fig. 14c-II)      92%            82%
Proposed (Fig. 14d-II)                         96%            85%

Fig. 14 a Images presented in (Tuncer 2007) and (Mokhtarzade and Zoej 2007); b output of manual (hand-drawn) extraction; c results reproduced from (I) Tuncer (2007) and (II) Mokhtarzade and Zoej (2007); d results of our proposed approach


Fig. 15 a Three satellite images of size 512 × 512, from a suburban area of a developed region; b results from FeatureObjeX (Geospace 2008); c results of the method proposed in (Tuncer 2007); d results of the method proposed in (Mokhtarzade and Zoej 2007); e hand-drawn (manual) road map; f results of our proposed method


Fig. 16 a Three satellite images of size 512 × 512, from an urban area of a developed region; b results from FeatureObjeX (Geospace 2008); c results of the method proposed in (Tuncer 2007); d results of the method proposed in (Mokhtarzade and Zoej 2007); e hand-drawn (manual) road map; f results of our proposed method


Fig. 17 a Three satellite images of size 512 × 512, from a suburban area of an emerging region; b results from FeatureObjeX (Geospace 2008); c results of the method proposed in (Tuncer 2007); d results of the method proposed in (Mokhtarzade and Zoej 2007); e hand-drawn (manual) road map; f results of our proposed method


Fig. 18 a Three satellite images of size 512 × 512, from an urban area of an emerging region; b results from FeatureObjeX (Geospace 2008); c results of the method proposed in (Tuncer 2007); d results of the method proposed in (Mokhtarzade and Zoej 2007); e hand-drawn (manual) road map; f results of our proposed method


It is observed from the results shown in Figs. 15, 16, 17 and 18 and Tables 4, 5, 6, 7 and 8 that the performance of our proposed algorithm is superior to that of the other methods. The results obtained using the proposed methodology are much better than those of the methods presented in (Tuncer 2007; Mokhtarzade and Zoej 2007) and close to the manually drawn reference road network. Compared to our preliminary investigation in (Mirnalinee et al. 2009), the performance in terms of the completeness and correctness measures has been enhanced significantly: the segment linking algorithm improves the completeness measure, whereas region part segmentation improves the correctness measure.

The correctness and completeness measures obtained for scenes from emerging countries are in most cases lower than those for scenes from developed countries. This decrease in accuracy is expected, since there are many more opportunities for errors in these areas due to the large numbers of linear non-road features, four-way crossings, non-linear road structures and unplanned layouts. Comparing the results of developed urban and suburban scenes, the performance on urban scenes is lower because of distortions: images of urban areas exhibit a more complex structure than scenes of suburban areas, as the number of different objects and their heterogeneity is much higher in urban scenes, and some of the roads comprise several lanes linked by complex road crossings. Generally, as shown in Fig. 15, the extraction results for open landscape areas are nearly complete and correct.

Suburban scenes of emerging countries are covered by vegetation. Moreover, the spectral response of roads in these areas is on certain occasions similar to the spectral response of open fields and rooftops, which increases the false positives and thereby reduces the correctness measure. Overall, our proposed method outperforms FeatureObjeX (Geospace 2008) and the two state-of-the-art methods (Tuncer 2007; Mokhtarzade and Zoej 2007), for observations averaged over 50 images each of developed and emerging areas.

    Conclusions

A novel and efficient method for automatically extracting roads using low-level information, directly from satellite images, based on region and edge integration has been introduced and demonstrated. This new method combines the outputs of PSVM and DSM in such a way that it preserves the strong discriminative ability of the SVM while simultaneously exploiting the linear-like characteristics in the features derived using DSM. For the determination of discontinuities and the elimination of non-road parts, two approaches were presented, based on several criteria concerning properties of the road parts and their relations to each other. The segment linking module solves the problem of discontinuity to some extent, thereby increasing the completeness.

Table 4 Performance of the system for the images shown in Figs. 15, 16, 17 and 18 (A: Completeness, B: Correctness)

Road image type                    FeatureObjeX      Tuncer            Mokhtarzade       Proposed
                                   I    II   III     I    II   III     I    II   III     I    II   III
Developed suburban (Fig. 15)   A   97   84   100     98   82   97      98   68   95      100  94   100
                               B   88   72   91      93   74   92      86   56   85      98   89   90
Developed urban (Fig. 16)      A   85   96   97      75   92   91      65   66   76      92   100  99
                               B   79   82   72      83   83   68      52   54   57      96   100  94
Emerging suburban (Fig. 17)    A   96   83   94      73   62   92      61   51   87      88   96   95
                               B   68   73   63      67   56   74      58   51   62      92   93   89
Emerging urban (Fig. 18)       A   91   87   83      62   52   74      71   64   59      89   92   82
                               B   74   71   75      72   51   61      57   58   58      83   92   85

Table 3 Values of the parameters used in our proposed approach

Road image type    1     2     N          TE     TA
Suburban           2     2.5   9 × 9      0.6    0.7    50
Urban              3     3.5   11 × 11    0.7    0.7    50


Table 8 Performance of the system averaged over 25 images of urban scenes of emerging countries

Methods                                           Completeness   Correctness
FeatureObjeX (Geospace 2008)                      78%            60%
Tuncer (Tuncer 2007)                              58%            52%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)    63%            52%
Proposed Method                                   85%            87%

Table 7 Performance of the system averaged over 25 images of suburban scenes of emerging countries

Methods                                           Completeness   Correctness
FeatureObjeX (Geospace 2008)                      89%            62%
Tuncer (Tuncer 2007)                              78%            64%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)    64%            58%
Proposed Method                                   87%            91%

Table 6 Performance of the system averaged over 25 images of urban scenes of developed countries

Methods                                           Completeness   Correctness
FeatureObjeX (Geospace 2008)                      84%            74%
Tuncer (Tuncer 2007)                              81%            65%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)    62%            89%
Proposed Method                                   93%            91%

Table 5 Performance of the system averaged over 25 images of suburban scenes of developed countries

Methods                                           Completeness   Correctness
FeatureObjeX (Geospace 2008)                      86%            79%
Tuncer (Tuncer 2007)                              81%            72%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)    63%            60%
Proposed Method                                   93%            89%


Region part segmentation and shape analysis based on an elongatedness measure eliminate non-road parts and increase the correctness. The results demonstrate that the proposed system is able to effectively extract major sections of the road network, a few junctions and curved roads from high-resolution satellite images.

It is observed that the road detection process achieves a high degree of accuracy, especially for the scenes of developed countries. In urban areas, however, only major roads with larger pixel widths have been detected; moreover, the presence of buildings and other features similar to roads made the extraction process somewhat more difficult compared to the suburban case. Linking of discontinuous segments, road junction detection and modeling of shadows are issues to be addressed in the future scope of this work. Vectorization of the extracted road segments could also be a useful extension of this work for GIS updates. The next step may include the formation of a road network by searching for junctions connecting road segments. Results may improve with the help of road hypothesis verification using the parallelism of road boundaries and the use of a graph data structure to form a complete road network representation.

    References

Baumgartner, A., Hinz, S., & Wiedemann, C. (2002). Efficient methods and interfaces for road tracking. In: Proceedings of the ISPRS Commission III Symp. Photogrammet. Comput. Vision, pp. 28–31.

Bennamoun, M. (1994). A contour based part segmentation algorithm. In: Proc. of the IEEE ICASSP, pp. 41–44.

Bennamoun, M., & Mamic, G. J. (2002). Object recognition fundamentals and case studies. Springer.

Bigun, J., Granlund, G., & Wiklund, J. (1991). Multidimensional orientation estimation with applications to texture analysis and optical flow. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(8), 775–790.

Bucha, V., Uchida, S., & Ablameyko, S. (2006). Interactive road extraction with pixel force fields. In: IEEE 18th International Conference on Pattern Recognition (ICPR'06), pp. 829–832.

Chu, J., & Aggarwal, J. (1993). The integration of image segmentation maps using region and edge information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15, 1241–1252.

Cooper, G., & Cowan, D. (2007). Enhancing linear features in image data using horizontal orthogonal gradient ratios. Computers and Geosciences, 33, 981–984.

Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning, 20(3), 273–297.

Doucette, P., Agouris, P., Stefanidis, A., & Musavi, M. (2001). Self-organized clustering for road extraction in classified imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 55, 347–358.

Duda, R., Hart, P., & Stork, D. (2000). Pattern classification. Wiley Interscience.

Feng, X., & Milanfar, P. (2002). Multiscale principal components analysis for image local orientation estimation. In: Proceedings of the 36th Asilomar Conference on Signals, Systems and Computers, pp. 478–482.

Geospace (2008). FeatureObjeX. http://www.pcigeomatics.com/.

Granlund, G., & Knutsson, H. (1995). Signal processing for computer vision. Boston: Kluwer Academic.

Gruen, A., & Li, H. (1995). Road extraction from aerial and satellite images by dynamic programming. ISPRS Journal of Photogrammetry and Remote Sensing, 50(4), 11–20.

Haddon, J., & Boyce, J. (1990). Image segmentation by unifying region and boundary information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), 929–948.

Haglund, L., & Fleet, D. (1994). Stable estimation of image orientation. In: Proceedings of the First IEEE International Conference on Image Processing, III, pp. 68–72.

Haralick, R., & Shapiro, L. (1992). Computer and robot vision. Addison Wesley.

Heipke, C., Mayer, H., Wiedemann, C., & Jamet, O. (1997). Evaluation of automatic road extraction. International Archives of Photogrammetry and Remote Sensing, pp. 47–56.

Hinz, S., & Baumgartner, A. (2003). Multiview fusion of road objects supported by self diagnosis. In: Proceedings of the 2nd GRSS/ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas, pp. 137–141.

Hu, X., & Tao, V. (2007). Automatic extraction of main road centerlines from high resolution satellite imagery using hierarchical grouping. Photogrammetric Engineering and Remote Sensing, 73(9), 1049–1056.

Hu, X., Zhang, Z., & Tao, V. (2004). A robust method for semi-automatic extraction of road centerlines using a piece-wise parabolic model and least square template matching. The International Journal of Photogrammetric Engineering and Remote Sensing, 70(12), 1393–1398.

Jacob, M., & Unser, M. (2004). Design of steerable filters for feature detection using Canny like criteria. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 1007–1019.

Jiang, X. (2007). Extracting image orientation feature by using integration operator. Pattern Recognition, 40, 705–717.

Jin, X., & Davis, C. (2005). An integrated system for automatic road mapping from high-resolution multispectral satellite imagery by information fusion. Information Fusion, pp. 257–273.

Kass, M., Witkin, A., & Terzopoulos, D. (1987). Snakes: active contour models. International Journal of Computer Vision, 1, 321–331.

Koriakine, A., & Saveliev, E. (2006). Data. http://www.wikimapia.org/.

Kumar, P., Das, S., & Yegnanarayana, B. (2000). One-dimensional processing of images. In: International Conference on Multimedia Processing and Systems, pp. 451–454.


Kurugollu, F., & Sankur, B. (1999). Map segmentation of color images using constraint satisfaction neural network. In: International Conference on Image Processing, pp. 236–239.

Lalit, G., Mangai, U. G., & Das, S. (2008). Integrating region and edge information for texture segmentation using a modified constraint satisfaction neural network. Image and Vision Computing, pp. 1106–1117.

Laptev, I., Mayer, H., Lindeberg, T., Eckstein, W., Steger, C., & Baumgartner, A. (2000). Automatic extraction of roads from aerial images based on scale space and snakes. Machine Vision and Applications, 12(1), 23–31.

Lin, W., Kuo, E., & Chen, C. (1992). Constraint satisfaction neural networks for image segmentation. Pattern Recognition, 25(7), 679–693.

Lyvers, E., & Mitchell, O. (1988). Precision edge contrast and orientation estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(6), 927–937.

Majidi, B., & Bab-Hadiashar, A. (2009). Aerial tracking of elongated objects in rural environments. Machine Vision and Applications, 20, 23–34.

Mantero, P., Moser, G., & Serpico, S. (2005). Partially supervised classification of remote sensing images through SVM-based probability density estimation. IEEE Transactions on Geoscience and Remote Sensing, 43(3), 559–570.

Mayer, H., Laptev, I., Baumgartner, A., & Steger, C. (1997). Automatic road extraction based on multi-scale modelling, context and snakes. In: International Archives of Photogrammetry and Remote Sensing, pp. 106–113.

McKeown, D. (1996). Top ten lessons learned in automated cartography.

Mena, J. B. (2003). State of the art on automatic road extraction for GIS update: a novel classification. Pattern Recognition Letters, 24(16), 3037–3058.

Miliaresis, G., & Kokkas, N. (2007). Segmentation and object-based classification for the extraction of the building class from LIDAR DEMs. Computers and Geosciences, 33, 1076–1087.

Mirnalinee, T., Das, S., & Varghese, K. (2009). Integration of region and edge based information for efficient road extraction from high resolution satellite imagery. In: IEEE Proceedings of ICAPR, Kolkata, India, pp. 373–376.

Moigne, J., & Tilton, J. (1995). Refining image segmentation by integration of edge and region data. IEEE Transactions on Geoscience and Remote Sensing, 33, 605–615.

Mokhtarzade, M., & Zoej, M. (2007). Road detection from high-resolution satellite images using artificial neural networks. International Journal of Applied Earth Observation and Geoinformation, 9(1), 32–40.

Pal, M., & Mather, P. (2005). Support Vector Machines for classification in remote sensing. International Journal of Remote Sensing, 26(5), 1007–1011.

Pavlidis, T., & Liow, Y. (1990). Integrating region growing and edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 225–233.

Pei, S., & Lin, C. (1992). The detection of dominant points on digital curves by scale space filtering. Pattern Recognition, pp. 1307–1314.

Perona, P. (1998). Orientation diffusions. IEEE Transactions on Image Processing, 7(3), 457–467.

Platt, J. C. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers, MIT Press, pp. 61–74.

Qian, R., & Huang, T. (1996). Optimal edge detection in two-dimensional images. IEEE Transactions on Image Processing, 5, 1215–1220.

Raghu, P., & Yegnanarayana, B. (1996). Segmentation of Gabor-filtered textures using deterministic relaxation. IEEE Transactions on Image Processing, 5(12), 424–429.

Rizvandi, N., Pizurica, A., Philips, W., & Ochoa, D. (2008). Edge linking based method to detect and separate individual C. elegans worms in culture. In: DICTA, pp. 65–70.

Shi, W., & Zhu, C. (2002). The line segment match method for extracting road network from high-resolution satellite images. IEEE Transactions on Geoscience and Remote Sensing, 40(2), 511–514.

Strang, G. (2005). Linear algebra and its applications. Thomson Brooks.

Theodoridis, S., & Koutroumbas, K. (2006). Pattern recognition. Academic.

Tuncer, O. (2007). Fully automatic road network extraction from satellite images. In: Recent Advances in Space Technologies, pp. 708–714.

Tupin, F., Houshmand, B., & Datcu, M. (2002). Road detection in dense urban areas using SAR imagery and the usefulness of multiple views. IEEE Transactions on Geoscience and Remote Sensing, 40, 2405–2414.

Udomhunsakul, S. (2004). Semi-automatic road detection from satellite imagery. In: IEEE International Conference on Image Processing (ICIP), pp. 1723–1726.

Visualisation and Perception Lab (2006). http://www.cse.iitm.ac.in/~sdas/vplab/downloads.html.

Wei, W., & Xin, Y. (2008). Feature extraction for man-made objects segmentation in aerial images. Machine Vision and Applications, 19, 57–64.

Xiao, Y., Tan, T., & Tay, S. (2005). Utilizing edge to extract roads in high-resolution satellite imagery. In: IEEE International Conference on Image Processing (ICIP), pp. 637–640.

Yager, N., & Sowmya, A. (2003). Support vector machines for road extraction from remotely sensed images. LNCS, 2756, 285–292.

Yang, J., & Wang, R. (2007). Classified road detection from satellite images based on perceptual organization. International Journal of Remote Sensing, 28, 4653–4669.

Zhang, H., Xiao, Z., & Zhou, Q. (2008). Research on road extraction semi-automatically from high resolution remote sensing images. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXXVII (Part B), 536–538.

Zhu, C., Shi, W., Pesaresi, M., & Liu, L. (2005). The recognition of road network from high-resolution satellite remotely sensed data using image morphological characteristics. International Journal of Remote Sensing, 26(24), 5493–5508.
