RESEARCH ARTICLE
An Integrated Multistage Framework for Automatic Road
Extraction from High Resolution Satellite Imagery
T. T. Mirnalinee & Sukhendu Das & Koshy Varghese
Received: 6 October 2009 / Accepted: 6 April 2010 / Published online: 12 March 2011
© Indian Society of Remote Sensing 2011
Abstract Automated procedures to rapidly identify
road networks from high-resolution satellite imagery
are necessary for modern applications in GIS. In this
paper, we propose an approach for automatic road
extraction by integrating a set of appropriate modules
in a unified framework, to solve this complex
problem. The two main properties of roads used are:
(1) spectral contrast with respect to background and
(2) locally linear path. Support Vector Machine is
used to discriminate between road and non-road
segments. We propose a Dominant Singular Measure (DSM) for the task of detecting locally linear road
boundaries. This pair of information of road seg-
ments, obtained using Probabilistic SVM (PSVM)
and DSM, is integrated using a modified Constraint
Satisfaction Neural Network. Results of this integra-
tion are not satisfactory due to occlusion of roads,
variation of road material, and curvilinear pattern.
Suitable post-processing modules (segment linking
and region part segmentation) have been designed to
address these issues. The proposed non-model based
approach is verified with extensive experimentations
and performance compared with two state-of-the-art
techniques and a GIS based tool, using multi-spectral
satellite images. The proposed methodology is robust
and shows superior performance (completeness and
correctness are used as measures) in automating the
process of road network extraction.
Keywords Dominant singular measure · PSVM · CSNN-CII · Road edges · Road segments · Fusion ·
Segment linking · Region part segmentation
Introduction
Road networks are essential modes of transportation,
and provide a backbone for human civilization.
Cartographic object extraction from digital imagery
is a fundamental operation for GIS update. However,
the complete automation of the extraction process is
still an unsolved problem. Road feature extraction
from a raster image is a non-trivial and image-specific
process. Hence, it is difficult to have a general method
to extract roads from any given raster image. Road
layers on raster maps typically have two geometric
properties that distinguish them from other layers: (1) Road
lines are straight within a small distance (i.e., several
meters in a street block); (2) Unlike building layers,
which could have many small distinct connected
J Indian Soc Remote Sens (March 2011) 39(1):125
DOI 10.1007/s12524-011-0063-9
T. T. Mirnalinee · S. Das (*)
Visualization and Perception Lab, Dept. of CSE, Indian Institute of Technology, Madras,
Chennai 600 036, India
e-mail: [email protected]
T. T. Mirnalinee
e-mail: [email protected]
K. Varghese
Dept. of Civil Engg, Indian Institute of Technology,
Madras,
Chennai 600 036, India
e-mail: [email protected]
components, roads are connected to each other to
form a road network. Road layers usually have few
connected objects or even only one huge connected
object forming a whole road layer. Many works on
this topic have been presented (Laptev et al. 2000; Shi
and Zhu 2002; Hinz and Baumgartner 2003; Hu and
Tao 2007; Mokhtarzade and Zoej 2007; Mena 2003;
Tupin et al. 2002). However, the manual intervention
of the operator in extracting, defining and validating
cartographic objects for GIS update is still needed.
Applications of road extraction process are found in
updating GIS records, urban planning, traffic control,
car navigation, map generation etc.
Most of the works published in literature on road
detection from satellite images are classified in two
categories: (1) Semi Automatic (Gruen and Li 1995;
Udomhunsakul 2004; Bucha et al. 2006; Zhang et al.
2008; Hu et al. 2004; Xiao et al. 2005) processes that
require help from a human operator. In contrast to the
automatic methods they demand a number of seed
points which are usually chosen by the operator in an
interactive fashion. Given such seed points the semi-
automatic algorithm connects them by a path which is
most likely a road. On the other hand, (2) Automatic
(Laptev et al. 2000; Shi and Zhu 2002; Hinz and
Baumgartner 2003; Mokhtarzade and Zoej 2007;
Baumgartner et al. 2002; Zhu et al. 2005) road
extraction methods require no initial (prior) informa-
tion about the presence and location of roads. In the
following, we will discuss the automatic road extraction
process.
Automated extraction of roads from high-
resolution imagery is a difficult task because of
the complexity in spatial and spectral variability of
the road network. Roads exhibit a variety of
spectral responses due to differences in age and/or
material and vary widely in physical dimensions. In
addition, road networks in dense urban areas
typically have different geometric characteristics
than those in suburban and rural areas. Techniques
to extract road networks using binarization and line
segment matching of high-resolution IKONOS
urban imagery were presented in (Shi and Zhu
2002; Zhu et al. 2005). A line segment match
method was used to detect long linear groups of
pixels for classification as roads. These road pixels
are then simplified into the road centerlines with the
use of morphological operators. Mayer et al. (1997)
presented a complex road network extraction ap-
proach that attempts to accurately map both the road
network and the road edges through the use of
snakes (Kass et al. 1987). In another approach, Hinz
and Baumgartner (2003) utilized multiple very high-
resolution aerial images and detailed scene models,
to perform road extraction.
One can find a survey of road extraction methods
from satellite images by Mena (2003). Tupin et al.
(2002) presented the road extraction algorithm using
feature extraction (line detector) and network recon-
struction (graph labeling), which uses multiple views
of the same scene. According to McKeown (1996),
roads extracted from one raster image need not be
extracted in the same way from another raster image,
as there can be a drastic change in the value of
important parameters based on nature, instrument
variation, and photographic orientation. Yang and
Wang (2007) proposed a road extraction algorithm
which deals with detecting two types of road
primitives, namely blob-like primitive and line-like
primitive. These primitives are defined, measured,
extracted and linked using different methods for
dissimilar road scenes.
Tuncer (2007) proposed a method that comprises
preprocessing the image via a series of wavelet-
based filter banks and reducing the data into a single
image which is of the same size as the original
satellite image. Then a fuzzy inference algorithm is
utilized to perform road detection. Each wavelet
function resolves features at a different resolution
level associated with the frequency response of the
corresponding FIR filter. The two resulting images are
fused together using the Karhunen-Loève transform
(KLT), which is based on principal component
analysis (PCA). This process highlights the promi-
nent features of the original image while also
denoising it, since the prominent features appear in
both of the wavelet transformed images while noise
does not strongly correlate between scales. Next a
fuzzy logic inference algorithm which is based on
statistical information and geometry is used to extract
the road pixels. The approach is only suitable for the
Ikonos data on rural areas where roads are mostly
homogeneous and are not disturbed by shadows or
occlusions. The central idea is to take into account the
spectral information by means of a (fuzzy) classifica-
tion approach.
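The KLT fusion step described above can be illustrated with a minimal sketch (not Tuncer's exact procedure; the function name and two-band setup are assumptions for illustration): the two wavelet-processed images are stacked as a two-band signal and each pixel's band pair is projected onto the leading principal component.

```python
import numpy as np

def klt_fuse(img_a, img_b):
    """Fuse two equal-size images via the Karhunen-Loeve transform
    (PCA): project each pixel's 2-vector of band values onto the
    leading eigenvector of the band covariance matrix."""
    bands = np.vstack([img_a.ravel(), img_b.ravel()]).astype(float)
    bands -= bands.mean(axis=1, keepdims=True)   # center each band
    cov = np.cov(bands)                          # 2 x 2 band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    principal = eigvecs[:, np.argmax(eigvals)]   # leading component
    fused = principal @ bands                    # 1-D projection per pixel
    return fused.reshape(img_a.shape)
```

The leading component captures the variance shared by both bands, which is why prominent features reinforce each other while uncorrelated noise is attenuated.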
A back-propagation neural network (BPNN) with
one hidden layer has been proposed for road
extraction in Mokhtarzade and Zoej (2007). The
output layer consists of one neurode that expresses
the network's response as a number between 0 and 1,
indicating background and road pixels respectively. Back-
propagation neural networks with different hidden-layer
sizes were trained with different numbers of iterations
before converging. Training and recalling stages were time-consuming in this approach.
Doucette et al. (2001) introduced a self-organizing
road map algorithm to extract roads from high-
resolution Multi-Spectral imagery. The self organizing
road map, a specialized version of the self organizing
neural network model, performs spatial clustering to
identify and group together elongated regions.
Most of the methods discussed so far use a limited
set of image samples of a particular area to obtain
decent results. Some of them do not present a
performance analysis or a comparative study with
existing state-of-the-art techniques. The techniques
adopted are often ad hoc and tuned for a particular
set of (few) samples acquired to show results. Our
study of road extraction is solely based on the road
characteristics (geometrical and spectral) stored in an
implicit manner in a raster image.
It is often difficult to obtain satisfactory results by
using only one of these methods to detect road
structures in complex pictures. However, it is possible
to improve the results by using the complementary
nature of edge-based and region-based information. A
large amount of work on the fusion of edge and
region information has been reported in the literature
(Haddon and Boyce 1990; Chu and Aggarwal 1993;
Moigne and Tilton 1995; Pavlidis and Liow 1990) for
image segmentation. Pavlidis and Liow (1990) de-
scribed a method to combine segments obtained using
a region growing (over-segmented) approach, where
the edges between regions are eliminated or modified
based on contrast, gradient and smoothness of
the boundary. Haddon and Boyce (1990) generate
regions by partitioning the image co-occurrence
matrix and then refining them by relaxation using
the edge information. Chu and Aggarwal (1993)
present an optimization method to integrate segmen-
tation and edge maps obtained from several channels,
including visible, infrared, etc., where user specified
weights and arbitrary mixing of region and edge
maps are allowed. Most of the methods proposed for
combining region and edge information are highly
sensitive to the correctness of edge map.
Lin et al. (1992) proposed a constraint satisfaction
neural network for image segmentation. They posed
the image segmentation problem as a constraint
satisfaction problem (CSP) by interpreting the process
as one of assigning labels to pixels subject to certain
spatial constraints. Kurugollu and Sankur (1999)
proposed a segmentation algorithm for color images,
which implements the MAP estimation of the label
field using a CSNN. In their work, the initial class
probabilities are obtained via a fuzzy C-means
algorithm, in contrast to the method of Lin et al. (1992),
where an ad hoc fuzzification of an initial map takes
place. They have tried to combine advantages of
GMRF formulation (Raghu and Yegnanarayana 1996)
with those of the CSNN based (Lin et al. 1992)
relaxation. The results are shown on synthetic images.
In a recent work proposed by Lalit et al. (2008), a
CSNN-CII (Constraint Satisfaction Neural Network with
Complementary Information Integration) has been
used for texture segmentation. Results are shown on
simulated and real world images.
The focus of this paper is on the design and
development of a technique, which enables the user to
extract road segments from an input image without
much user interaction. The motivation for our work
comes from the fact that the complementary informa-
tion of regions (road pixels in our case) and edges
(road boundaries) has not been exploited together to
obtain a decent road map from satellite images. Either
of these techniques, when applied alone, produces
errors which do not occur together (simultaneously),
in general. This is due to the fact that the criteria for
classification of pixels as road regions look for
continuity and local smoothness, whereas methods
to detect road boundaries look for discontinuities in
raster images. Road regions are separated from non-
road regions in our proposed framework using a
PSVM (Probabilistic Support Vector Machine) classi-
fier. In our previous work on a DSM (Dominant
Singular Measure) based road extractor (Mirnalinee et al. 2009),
the performance was low because only the local
contrast between the regions was considered.
Therefore, we decided to merge the information from
both DSM and PSVM using a CSNN-CII (Constraint
Satisfaction Neural Network with Complementary
Information Integration) (Lalit et al. 2008) to produce
better results. A modified constraint satisfaction
neural network (CSNN) has been designed for this
task, which uses a novel dynamic window to merge
the complementary information of edges and regions.
The output of CSNN-CII needs to be processed
further to remove some undesired artifacts and errors.
A segment linking algorithm is used to bridge the
discontinuities detected between road segments. A re-
gion part segmentation algorithm separates the roads
from protruding or attached non-road regions, thereby
improving the accuracy. Results are shown using four
categories of database of high-resolution satellite
images from the following areas: (1) Developed
suburban, (2) Developed Urban, (3) Emerging subur-
ban and (4) Emerging Urban. Performance analysis is
presented using completeness and correctness meas-
ures (Heipke et al. 1997).
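The completeness and correctness measures can be approximated at the pixel level as the recall and precision of the extracted road mask. The sketch below is only an approximation: Heipke et al.'s original definitions use buffer-based matching of road centerlines, which is not reproduced here.

```python
import numpy as np

def completeness_correctness(extracted, reference):
    """Pixel-level approximations of the Heipke et al. (1997) measures.

    completeness = matched reference / total reference (a recall)
    correctness  = matched extraction / total extraction (a precision)
    """
    extracted = extracted.astype(bool)
    reference = reference.astype(bool)
    tp = np.logical_and(extracted, reference).sum()  # true road pixels found
    completeness = tp / reference.sum() if reference.any() else 0.0
    correctness = tp / extracted.sum() if extracted.any() else 0.0
    return completeness, correctness
```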
This paper is organized as follows: Section
Research Issues and Design Strategy deals with
the research issues and design strategies. Section
Proposed Method
deals with the overall proposed
methodology. Description of various stages in
proposed framework is presented in Section
Description of the Different Stages in Our Proposed
Framework. We present experimental results in Section
Experimental Results and Comparative Study and
conclude the paper in Section Conclusions.
Research Issues and Design Strategy
The difficulties in the design of an automated road
network extraction system using remotely-sensed
imagery lie in the fact that the image characteristics
of road features vary according to sensor type, spectral
and spatial resolution, ground characteristics, etc.
Even for an image taken over a particular urban area,
different parts of the road network reveal different
characteristics. In the real world, a road network is too
complex to be modeled using a mathematical formu-
lation or an abstract model. Other objects (e.g.,
buildings and trees) cast shadows that occlude road
features, thus complicating the extraction process.
The human perceptual way of recognizing a road
involves (Jin and Davis 2005) extracting geometric,
radiometric and topological characteristics of an
image. Humans usually recognize a road using first
its geometric characteristics considering a road to be a
long, elongated feature with uniform width and
similar radiometric variance along its path. Even
though spectral characteristics of road vary within an
image, its physical appearance tends to exist as long
continuous features. Humans fuse these vital clues to
identify a foreground road object from the back-
ground layer. This motivated us to develop a generic
framework that integrates suitable processing modules
necessary for extracting the different types of features
present in road objects available in satellite scenes.
We present the characteristics of roads next, followed
by suitable modules designed specifically to address
these issues. We also validate the efficiency of the
extraction system using experimental results.
The most significant characteristics of roads that
appear in high-resolution satellite imagery are:
1. Roads have a distinctively contrasting spectral
signature (both locally and globally) with respect
to the background layer (e.g. vegetation, soil,
waterways, manmade structures etc.).
2. Roads are mostly elongated structures, with
locally linear properties.
3. The road surface is usually homogeneous, with
occasional variations.
4. Discontinuities appear in a road structure mainly
due to occluding objects, such as trees, buildings,
large vehicles etc. or even shadows.
5. Roads do not appear as a small segment or patch,
either in isolation or attached to a large linear
segment.
6. Roads rarely terminate (no abrupt ending) within
short distances. In fact, they intersect, occlude
one another (bridges and highways) and bifurcate
to build a network (global appearance).
7. Roads have near-parallel boundaries, with both
linear and curvilinear patterns.
8. Road structures are rarely non-smooth and
generally occur without sharp bends.
Among the different properties stated above, the
two major characteristics of roads are their geometri-
cal shape and spectral contrast (as stated in (1) and (2)
above). Roads in high spatial-resolution images of
urban areas appear as piecewise linear segments with
spectrally homogeneous characteristics. These are
vital clues, which form the basis of the design of
our framework for automatically detecting roads in
satellite imagery.
In the design of a framework for road detection, we
first need to exploit these two vital characteristics of
roads. In such a case, one may be tempted to use a
foreground extracting algorithm trained with spectral
patterns for roads and then use linear features on top
of it. However, a classifier based on only spectral
features will produce false alarms (identify non-road
objects as roads and filter parts of roads as back-
ground, due to reasons mentioned in points (3) and
(4) above). On the other hand, a pattern classifier (for
classifying roads) trained with geometrical features is
useless unless the target (road, in this case) is
available. It is also not possible to simultaneously
extract and fuse this pair of distinct/disconnected
features, since the linear features cannot be estimated
until the road-like structures are filtered from the
background. It is impossible to design an
operator or mask for this purpose, as that would need
to simultaneously extract spectral and RST-invariant
shape (geometrical) features from the image data. It is
also not possible to formulate a mathematical (para-
metric) model for a road network, which will work for
all complex variations in the geometric design
patterns (linear and curvilinear) formed by roads in
urban scenarios.
Due to the existence of these complex phenomena
for roads, it is almost impossible to consider and
model all these situations and incorporate them in a
single module or processing stage for road network
extraction. This drove us to formulate and design a
hierarchical pipelined framework, consisting of the
classification (supervised), information integration,
filtering and local neighborhood analysis to obtain
decent results with acceptable quality. Results will be
compared with two state-of-the-art methods (Tuncer
2007; Mokhtarzade and Zoej 2007) published in
literature and one GIS-based software (Geospace
2008) used for raster image analysis.
Because of the issues mentioned earlier, in most
cases with hyper-spectral dataset, the spectral infor-
mation alone is not sufficient to define roads. We
need an integrated multistage framework to achieve
our goal. Each stage of the framework deals with a
particular characteristic of roads; these characteristics
are given in the left column of Table 1. The center column gives the
corresponding strategy (processing module) used by
us to solve the problem, while the right-hand side
column specifies the difficulties/drawbacks that one
may face in executing that stage. In the next
section we describe our proposed multistage method
based on the issues discussed in this section, followed
by design details of the road extraction modules listed
in Table 1.
Proposed Method
A multistage pipelined framework for road extrac-
tion has been proposed in this paper. Figure 1
shows the flowchart of our proposed method of road
extraction, which is a hierarchical pipelined multi-
stage framework based on details specified in
Table 1. The first stage consists of an iterative
merging of region and edge based information using
a set of constraints. Road edges (boundaries) are
extracted from edge features using DSM. We assume
roads appearing in satellite images to be locally
linear. Soft class labels (probabilities) for each pixel
belonging to either road or non-road regions are
produced by the PSVM. Then a modified CSNN,
termed CSNN-CII (Lalit et al. 2008), is used for
integrating the complementary information from the
edge and region outputs. A fruitful cooperation
could be established between region-based and
edge-based methods to extract elongated thick
objects like roads in high-resolution satellite imag-
ery. An elongatedness measure (shape feature) is used to
remove the isolated non-road structures. Then a
Table 1 Road characteristics & corresponding processing modules

Sl. No | Characteristics                                   | Strategy/module                                              | Remarks
1.     | Contrast w.r.t. background; mostly homogeneous    | SVM classifier using mean and variance of spectral response  | Misclassification of non-road objects with identical spectral response
2.     | Elongated structure                               | DSM on edge map; shape features                              | Discontinuity due to occlusion
3.     | Discontinuities and distortions in linear pattern | CSNN-CII and segment linking                                 | Chance of linking roads with other structures
4.     | Not appearing in isolation, rarely terminate      | Region part segmentation                                     | Removal of small road fragments
segment linking algorithm is used to link the
discontinuous road segments which result due to
occlusion. The region part segmentation module removes
the non-road structures which appear due to adjacent
manmade structures. The steps of the algorithm,
depicting the process illustrated in Fig. 1, are given in
Algorithm 1. In the next section, we present the
description of the different stages of our framework
along with intermediate results of processing using
two satellite image samples.
Algorithm 1 Proposed framework for road detection.
Input: Image.
Output: Segmented Image.
Steps:
1. Compute edge maps of the image using DSM.
2. Compute the probability of class-label for each pixel using PSVM.
3. Integrate region information and edge information (outputs of steps (2) & (1)) using CSNN-CII
(Lalit et al. 2008):
Initialize the neurons in CSNN-CII using the probabilities obtained from PSVM.
Iterate and update the probabilities and edge map to get the final segmented map.
4. Post-process the CSNN-CII output to remove stray patches and unnecessary artifacts.
5. Perform segment linking to reduce the false negatives.
6. Perform the region part segmentation algorithm to reduce the false positives.
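The data flow of Algorithm 1 can be sketched as follows. Each helper here is a deliberately crude stand-in (a simple threshold) for the corresponding module, intended only to mirror the pipeline structure; the real DSM, PSVM, and CSNN-CII stages are described in the following sections, and steps 4-6 are omitted from this sketch.

```python
import numpy as np

def dsm_edge_detection(img):
    """Step 1 stand-in: gradient-magnitude threshold as an edge map."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy) > 0.5

def psvm_class_probabilities(img):
    """Step 2 stand-in: normalized brightness as a soft P(road)."""
    return img / (img.max() + 1e-9)

def csnn_cii_integration(prob, edges):
    """Step 3 stand-in: keep confident, non-edge pixels."""
    return (prob > 0.5) & ~edges

def extract_roads(image):
    edge_map = dsm_edge_detection(image)
    road_prob = psvm_class_probabilities(image)
    segmented = csnn_cii_integration(road_prob, edge_map)
    # steps 4-6 (artifact removal, segment linking,
    # region part segmentation) are omitted in this sketch
    return segmented
```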
Description of the Different Stages
in Our Proposed Framework
DSM Based Edge Detection
Roads are expected to be locally linear. Hence, we
extract the local orientation from the image of the
road network. Extracting linear features from satellite
images has been of interest to the pattern recognition
community for some time (Cooper and Cowan 2007;
Granlund and Knutsson 1995; Lyvers and Mitchell
1988; Wei and Xin 2008; B.Majidi and BabHadiashar
2009). In the work by Cooper and Cowan (2007),
amplitude balanced horizontal derivatives were used
for enhancing linear features in images. However, if
the dataset possesses features with large variations in
amplitude then the horizontal derivative will also have
the same property, and the smaller amplitude features
(which may be of considerable importance) may be
hard to discern. Granlund and Knutsson (1995)
devised an elegant method for combining the outputs
of quadrature pairs to extract a measure of orientation.
Perona (1998) extended the idea of anisotropic
diffusion to orientation maps. Bigun et al. (1991)
posed the problem as the least squares fitting of a
plane in the Fourier transform domain. Another
technique (Haglund and Fleet 1994), based on
steerable filters (Jacob and Unser 2004), is limited
in precision and generalization. In (Lyvers and
Mitchell 1988), Lyvers et al. examined the accuracy
of various local differential operators for noiseless
situations, as well as in the presence of additive
Gaussian noise. In (Jiang 2007), Jiang proposed an
image integration operator which leads to unbiased
orientation estimation.
Fig. 1 Framework of the proposed method for road detection
Our method of obtaining the dominant direction
using PCA and a gradient matrix (obtained using 1-D
Canny (Kumar et al. 2000)) for orientation estimation
to extract road segments is novel, more efficient and
produces more robust results. Most established local
orientation estimation techniques are based on the
analysis of the local gradient field of the image. But
the local gradients are very sensitive to noise, thus
making the estimate of local orientation from these
images unreliable. We use the method of Principal
Component Analysis (PCA) for image orientation
estimation. For each pixel in the image, we first
calculate the local image gradients (using 1-D Canny
(Kumar et al. 2000)) and then perform SVD of the
gradient matrix. The gradient of image f(x,y) at point (x_k, y_k) is denoted by:

\nabla f_k = \nabla f(x_k, y_k) = \left[ \frac{\partial f(x_k, y_k)}{\partial x}, \; \frac{\partial f(x_k, y_k)}{\partial y} \right]^T \quad (1)
which involves 1-D processing along orthogonal
directions (for details see (Kumar et al. 2000)). For
example, the smoothing operator used along one
direction (say, x) is the Gaussian filter:
G(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left( -\frac{x^2}{2\sigma_1^2} \right) \quad (2)
and the 1-D Canny operator for computing the
derivative along y is:
dG(y) = -\frac{y}{\sqrt{2\pi}\,\sigma_2^3} \exp\left( -\frac{y^2}{2\sigma_2^2} \right) \quad (3)
Similar processing is applied along the y and x
directions, where the two operators interchange their
directions of processing. This method is efficient and
produces better gradient vectors which are orthogonal
to the dominant orientation of the image pattern. Let
us assume that in the image of interest f(x,y), the
orientation field is piecewise constant. Under this
assumption, the gradient vectors in an image block
should on average be orthogonal to the dominant
orientation of the image pattern. So orientation
estimation can be formulated as the task of finding a
unit vector a which maximizes the average of the
angles between a and the gradient vectors (Feng and
Milanfar 2002). The computational basis of PCA is
the calculation of the Singular Value Decomposition
(SVD) of the data covariance matrix. The majority of
the eigenvectors form a cluster along a dominant
direction indicating the presence of a linear structure.
The eigenvalue will reflect the strength (peakiness in
domain) of the distribution of the gradients towards
a particular direction. Generally, the first eigenvalue is
larger than the second one, and hence in case of an
ideal straight line the second eigenvalue is zero
(indicating no spread along the orthogonal direction).
However, a digital line is represented stepwise
(aliased), and hence the second eigenvalue for the
case of a line in a digital image is a non-zero value. In
order to get the local orientation estimate, we
rearrange the gradient vectors into a 2 × N² matrix,
where a window of size N × N is used for processing
around each pixel, as shown below:

G = [ \nabla f_1 \;\; \nabla f_2 \;\; \nabla f_3 \; \cdots \; \nabla f_{N^2} ] \quad (4)

where \nabla f_i = \nabla f(x_i, y_i), i = 1, 2, ..., N^2 (see Eq. 1).
We then compute the SVD (Singular Value Decom-
position) (Strang 2005) of the gradient matrix for each
pixel, computed using a window of size N × N. The SVD
of the gradient matrix is computed as

G = U S V^T \quad (5)

where U is an orthogonal 2 × 2 matrix, in which the
first column represents the dominant orientation of the
gradient field, S is a 2 × N² matrix representing the
energy along the dominant directions, and V is an
orthogonal matrix of size N² × N² representing each
vector's contribution to the singular values.
Dominant Singular Measure
The Dominant Singular Measure (DSM) is computed as
the ratio between the singular value of the major axis
and the sum of the singular values. This measure
approaches 1 for an elongated shape. DSM is
defined as:

DSM = \frac{s_1}{s_1 + s_2}, \quad s_1 \ge s_2 \quad (6)
When all the gradient components have the same
direction, only one singular value (s1) is non-zero,
which in turn makes the DSM value equal to 1. If both
the singular values are equal and non-zero, the DSM
value is 0.5. The value of DSM thus lies in the
range [0.5, 1]. We use the DSM measure to distinguish
between scattered or disoriented image patterns and an
image region with an orientation pattern. If the DSM is
less than a threshold (0.5 <
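A minimal runnable sketch of the DSM computation (Eqs. 1 and 4-6) is given below; it uses central-difference gradients as a stand-in for the 1-D Canny (derivative-of-Gaussian) operators of Eqs. 2-3, so it is an approximation of the described procedure rather than a faithful implementation.

```python
import numpy as np

def dominant_singular_measure(patch):
    """DSM of an N x N image patch (Eq. 6): s1 / (s1 + s2)."""
    # Central-difference gradients stand in for the 1-D Canny operators
    gy, gx = np.gradient(patch.astype(float))
    # Eq. 4: stack the N^2 gradient vectors into a 2 x N^2 matrix
    G = np.vstack([gx.ravel(), gy.ravel()])
    # Eq. 5: singular values of the gradient matrix (descending order)
    s = np.linalg.svd(G, compute_uv=False)
    s1, s2 = s[0], s[1]
    if s1 + s2 == 0:           # flat patch: no dominant orientation
        return 0.5
    return s1 / (s1 + s2)      # in [0.5, 1]; near 1 for a linear pattern
```

A patch containing a single oriented ramp or edge yields a DSM near 1, while disoriented noise yields a value closer to 0.5, matching the thresholding behaviour described above.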
allows the identification of samples drawn from
unknown classes through the application of a suitable
Bayesian decision rule (Duda et al. 2000). This
approach is based on support vector machines
(SVMs) for the estimation of probability density
functions, which uses a recursive procedure to
generate prior probability estimates for known and
unknown classes. SVMs are exploited by Yager and
Sowmya (2003) as a classifier for road extraction,
which involves two stages of processing. Here, SVM
is trained using edge-based features such as edge
length, gradient and intensity within the edge pair. In
level 1, SVM is used to classify edges as road edges
or non-road edges. Edges classified as road edges are
given as input to the SVM in level 2, where opposite
edges are paired as road segments. However, they
have reported very low correctness measure. A new
method (Miliaresis and Kokkas 2007) is presented
for the extraction of buildings from light detection
and ranging (LIDAR) digital elevation models
(DEMs) on the basis of segmentation principles. The
accuracy of supervised classification largely depends
on the quality of the training data. The locations and
sample size of the training data are difficult to
optimize, and depend on the image data types and
classifiers used.
Support vector machines (SVM) represent a prom-
ising development in machine learning research that is
not widely used within the remote sensing community
(Pal and Mather 2005). The architecture of a SVM
machine (Theodoridis and Koutroumbas 2006) is
given in Fig. 5. The number of nodes is determined by
the number of support vectors N_s.
The main idea of SVM is to separate the classes
with a hyperplane surface so as to maximize the
margin between them. In this paper, support vector
machines are used to classify roads from satellite
imagery. In SVM the input vectors are mapped
nonlinearly to a very high-dimensional feature space
(Cortes and Vapnik 1995). Considering a two-class
pattern classification problem, let the training set
of size N be \{(X_i, d_i)\}_{i=1}^{N}, where X_i \in R^n is the input
pattern for the i-th example and d_i \in \{-1, +1\} is the
corresponding desired response. The classifier is
represented by the function f(x; \alpha) = y, with \alpha as
the parameters of the classifier. The SVM method
involves finding the optimum separating hyperplane
so that:
1. Samples with labels y = 1 are located on each
side of the hyperplane.
2. The distances of the closest vectors to the
hyperplane on each side are maximum. These
are called support vectors and the distance is the
optimal margin.
The membership decision rule is based on the
function f(x) where, f(x) represents the discriminant
Fig. 4 The results of DSM on a satellite image of a suburban scene. a Input image, b edge map extracted using multi-scale Canny
(Kumar et al. 2000; Qian and Huang 1996) c corresponding DSM output
Fig. 5 Architecture of SVM
J Indian Soc Remote Sens (March 2011) 39(1):1–25
function associated with the hyperplane in the transformed space and is defined as:

f(x) = w*·φ(x) + w0    (7)

where w* is the weight vector, w0 is the bias, and φ(x) ∈ R^d′ (d′ > d) is the mapped feature vector. SVM is used to classify every pixel into
either road or non-road groups based on the sign of the discriminant function (y = sgn(f(x))). Pixels
belonging to roads are assigned as group 1 and others
to group 2 from training sample images. Since SVM
has good generalization ability, this decision function
can be applied to extract road structures from satellite
images. Through training, we obtain the decision
function. The feature vectors are fed into the SVM
classifier initially for training (to learn the pattern)
from known examples, and then for predicting the
labels of unknown samples once the training is
complete. A classifier that produces a posterior probability is very useful in practical
recognition problems. Posterior probabilities are also
required when a classifier is making a small part of an
overall decision, and the classification output is
combined for overall decision. As described above,
SVM is principally a binary classifier. A polynomial
kernel of degree two was used due to its superiority
over other kernels for most applications.
However, SVM (Cortes and Vapnik 1995) produces
an uncalibrated value that is not a probability. In the
next section, we describe a mechanism to obtain probabilistic classification of pixels as roads or non-roads, using soft-class labels from SVM.
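For illustration, the classification step described above can be sketched as follows, assuming scikit-learn's SVC with the polynomial kernel of degree two used in the paper; the three-component feature vectors and their distributions are purely hypothetical stand-ins for the spectral features extracted around each pixel:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical features: each sample stands in for a feature vector
# extracted around a pixel (e.g. band statistics of a 21 x 21 block).
rng = np.random.default_rng(0)
road     = rng.normal(loc=0.2, scale=0.05, size=(200, 3))  # homogeneous, darker
non_road = rng.normal(loc=0.6, scale=0.15, size=(200, 3))  # varied, brighter
X = np.vstack([road, non_road])
y = np.hstack([np.ones(200), -np.ones(200)])               # +1 road, -1 non-road

# Polynomial kernel of degree two, as in the paper.
clf = SVC(kernel="poly", degree=2, coef0=1.0)
clf.fit(X, y)

# y = sgn(f(x)): assign road / non-road by the sign of the discriminant.
pred = clf.predict([[0.2, 0.2, 0.2], [0.7, 0.6, 0.65]])
```

Once trained, the same decision function is applied pixel-wise over a test image to produce the binary road map.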
Soft Class Labels Using PSVM
SVM does not provide any estimate of its
classification confidence, and thus does not allow
us to incorporate any a-priori information. Hence we
use PSVM to produce the posterior probability
P(Class | Input). The posterior probability outputs of SVMs are
based on the distance of testing vectors and support
vectors. Following a method presented in Platt
(1999), a sigmoid model is used to map binary
SVM scores into probabilities as shown below:
P(y = 1 | f) = 1 / (1 + exp(A·f + B))    (8)
where y is the binary class label and f is an output
of SVM decision function (Eq. 7). The two parame-
ters A and B are found by maximum likelihood on
the training set (fi, yi), i.e. by minimizing the negative
log-likelihood of the training data. An image block is said to be road
if its probability output by PSVM is larger than a
predetermined threshold. As a result, the model has a
probabilistic output for further processing. A probabilistic output makes it possible to use
existing fusion theories, especially in cases
where a classifier makes a small part of an overall
decision and the classification outputs must be
combined for the overall decision.
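A minimal sketch of this calibration step is given below; it fits A and B by plain gradient descent on the negative log-likelihood (Platt 1999 uses a more elaborate Newton-style optimiser), and the synthetic scores, learning rate and iteration count are illustrative assumptions:

```python
import numpy as np

def fit_platt(scores, labels, lr=0.5, iters=2000):
    """Fit P(y=1|f) = 1 / (1 + exp(A*f + B)) (Eq. 8) by minimising the
    negative log-likelihood with plain gradient descent."""
    t = (labels + 1) / 2.0          # map {-1,+1} labels to {0,1} targets
    A, B = 0.0, 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(A * scores + B))
        A -= lr * np.mean((t - p) * scores)   # dNLL/dA = (t - p) * f
        B -= lr * np.mean(t - p)              # dNLL/dB = (t - p)
    return A, B

def platt_prob(f, A, B):
    return 1.0 / (1.0 + np.exp(A * f + B))

# Illustrative SVM scores: positives around +2, negatives around -2.
rng = np.random.default_rng(0)
f = np.concatenate([rng.normal(2.0, 0.5, 100), rng.normal(-2.0, 0.5, 100)])
y = np.concatenate([np.ones(100), -np.ones(100)])
A, B = fit_platt(f, y)
```

Note that A converges to a negative value, so large positive SVM scores map to probabilities near 1, as Eq. 8 requires.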
Training samples are gathered from regions sur-
rounding the road pixels. The sample sub images shown
in Fig. 6, illustrate the discriminative feature between
road and non-road samples. Spectral characteristics
differ between the two classes, and this difference is analyzed by PSVM.
As seen from Fig. 6a, the locally homogeneous orientation of the road class will be captured by DSM, whereas
non-road structures as shown in Fig. 6b will produce
distributed orientations. In order to demonstrate the
performance of the proposed method, we used the
generated dataset described in Section Dataset
Description and Performance Measures. Our system
is trained with 5,000 samples of road and 7,200
samples of non-road classes. Once the classifier is
trained, it is asked to predict the labels for the test
Fig. 6 a Road samples and b non-road samples, of size 21 × 21
image pixels. Figure 7 shows the results of P-SVM for
the images given in Figs. 3a & 4a. Experimental results
for different scenarios, namely, urban and suburban
areas of developed and emerging countries and their
discussions are presented in Section Results and
Discussion. In the next section, we discuss the method
of fusing the two complementary pieces of information (segment
class from PSVM and the linear edge map obtained
using DSM), using a CSNN (Constraint Satisfaction
Neural Network) based integrator.
CSNN for Integration
Edge extraction from satellite images often delivers
partly fragmented and erroneous results. Attributes
describing geometrical and radiometric properties of
the line segments can be helpful in sorting out the
most probable false alarms. However, these attributes may be ambiguous and are not considered to be
reliable enough when used alone. Region based
segmentation produces over-segmentation whereas
edge based segmentation may lead to under-segmentation. We used a fusion strategy proposed
by Lalit et al. (2008), which uses a constraint to
iteratively correct both these erroneous outputs to
produce a better result. The method is described
briefly in the following for the sake of completeness.
Each neuron in CSNN-CII contains two fields: probability and rank. The rank field stores the rank of the
probability, in decreasing order, for that neuron. We
exploit the soft class labels produced by PSVM to
compute ranks, which in turn is used to initialize the
interconnection weights of the CSNN. In addition to
region-based constraints CSNN-CII also incorporates
edge constraints. The number of neighbors considered
for computation is determined using edge informa-
tion. The initial class probabilities can be obtained
using PSVM (Platt 1999). The initial edge maps can
be obtained using DSM based techniques for road
edge extraction.
Dynamic Window
The interconnection weights of the CSNN are
computed only for those neurons which are within
the effective size of the dynamic window. This
effective width is based on the presence of edge
information around the seed pixel. The stopping
criterion is based on the presence of the edge pixels.
Hence this process helps to mutually exploit both the
complementary information of regions and edges
inside the window. The window is considered to be
dynamic (or adaptive), as its effective size depends on both pieces of information: one (region) for initial estimation and the other (edge) for convergence. The
obvious advantage of using dynamic window at
region boundaries is that only the neurons which
correspond to a single class will be processed and the
neurons which may confuse the network would not be
used for computation. The optimal size of dynamic
window (m × n) was obtained empirically as 31 × 21.
Lalit et al. (2008) used a square window, whereas we
use a rectangular oriented window in our work. The
orientation of the rectangular window is obtained from the DSM output. It was observed from experi-
mentation, that when a larger window size was used
small regions (or small sections of a region) were
merged with larger adjacent regions. The use of a
smaller window size makes the CSNN take a longer
time to converge to the final solution. Figure 8 shows
Fig. 8 The results of CSNN-CII obtained by: a combining
those in Fig. 3c & Fig. 7a; b combining those in Fig. 4c &
Fig. 7b, for the images in Figs. 3a & 4a respectively
Fig. 7 a The results of P-SVM for the image shown in Fig. 3a;
b the results of P-SVM for the image shown in Fig. 4a
the results of CSNN-CII using inputs from the
intermediate results of processing shown in Figs. 3,
4 and 7, for the images in Figs. 3 and 4a.
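The edge-bounded growth of the dynamic window described above can be sketched as follows. This simplified version grows an axis-aligned window around the seed pixel until an edge pixel (or the maximum extent) is reached in each direction; the paper additionally orients the 31 × 21 window along the DSM direction, which is omitted here:

```python
import numpy as np

def dynamic_window(edge_map, seed, max_half=(15, 10)):
    """Grow a window around `seed` until an edge pixel or the image
    border stops each of the four directions, or the maximum
    half-extent is reached. Returns inclusive (top, bottom, left,
    right) bounds."""
    r, c = seed
    H, W = edge_map.shape
    ext = []
    for dr, dc, limit in ((-1, 0, max_half[0]), (1, 0, max_half[0]),
                          (0, -1, max_half[1]), (0, 1, max_half[1])):
        step = 0
        rr, cc = r, c
        while step < limit:
            rr, cc = rr + dr, cc + dc
            if not (0 <= rr < H and 0 <= cc < W) or edge_map[rr, cc]:
                break          # stop at the image border or an edge pixel
            step += 1
        ext.append(step)
    up, down, left, right = ext
    return (r - up, r + down, c - left, c + right)

# Demo: a vertical edge to the right of the seed truncates the window.
em = np.zeros((20, 20), dtype=bool)
em[:, 12] = True
win = dynamic_window(em, (10, 10), max_half=(5, 5))
```

Only neurons inside the returned bounds would then contribute to the CSNN weight computation, so pixels beyond the edge (a different class) never confuse the network.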
Post-Processing and Segment Linking
The objective of the refinement process presented here is to eliminate the false segments which do not
belong to roads. The result of CSNN integration
produces a few undesired patches, which do not
correspond to road segments. In the case of satellite
images, a few undesired or noisy structures will be
erroneously classified as road segments. To eliminate
these false alarms (segments), we use connected
component labeling (Haralick and Shapiro 1992) to
extract the disjoint segments from the output of our
algorithm. Segments with area less than a prefixed
threshold TA are deleted. Major axis and minor axis
lengths of each component are computed using
normalized second central moments for each segment
as shown below:
μ20 = M20 − x̄·M10,  μ02 = M02 − ȳ·M01,
x̄ = M10/M00,  ȳ = M01/M00,
Mpq = Σx Σy x^p y^q I(x, y)

We computed the ratio of the major axis length to the
minor axis length of each component as E = μ20/μ02.
Components having a value of E less than a threshold
TE are usually non-road structures and hence deleted.
The steps of the algorithm, depicting the post-
processing stage is given below in Algorithm 2.
Algorithm 2 Steps of post-processing for refining the result.

1. Compute the connected components.
2. Compute the Area (A) of each connected component.
3. Compute the Eccentricity (E) of each connected component.
4. For each component:
     if (E < TE) then
       delete that component
     else if (A < TA) then
       delete that component
     end if
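The steps of Algorithm 2 can be sketched as below. The 4-connected BFS labelling is an implementation choice of this sketch, and the elongation ratio is taken as max(μ20, μ02)/min(μ20, μ02) so that it is independent of road orientation (the paper writes μ20/μ02, with TE ≈ 0.7 and TA = 50 in Table 3; the keep-thresholds here are therefore illustrative):

```python
import numpy as np
from collections import deque

def central_moments(ys, xs):
    """Second central moments of a pixel set, normalized by area."""
    m00 = len(xs)
    mu20 = np.sum((xs - xs.mean()) ** 2) / m00
    mu02 = np.sum((ys - ys.mean()) ** 2) / m00
    return mu20, mu02

def refine(mask, TA=50, TE=3.0):
    """Keep only components that are large enough (area >= TA) and
    elongated enough (E >= TE), deleting the rest."""
    H, W = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    out = np.zeros_like(mask, dtype=bool)
    for sr in range(H):
        for sc in range(W):
            if mask[sr, sc] and not seen[sr, sc]:
                q = deque([(sr, sc)])           # BFS over one component
                seen[sr, sc] = True
                comp = []
                while q:
                    r, c = q.popleft()
                    comp.append((r, c))
                    for nr, nc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1)):
                        if 0 <= nr < H and 0 <= nc < W \
                                and mask[nr, nc] and not seen[nr, nc]:
                            seen[nr, nc] = True
                            q.append((nr, nc))
                ys = np.array([p[0] for p in comp])
                xs = np.array([p[1] for p in comp])
                mu20, mu02 = central_moments(ys, xs)
                a, b = max(mu20, mu02), min(mu20, mu02)
                E = a / max(b, 1e-9)            # orientation-free elongation
                if len(comp) >= TA and E >= TE:
                    out[ys, xs] = True
    return out

# Demo: an elongated road-like segment survives, a compact blob is deleted.
m = np.zeros((40, 40), dtype=bool)
m[5, 5:35] = True        # elongated segment, area 30
m[20:25, 20:25] = True   # compact 5 x 5 blob, area 25
kept = refine(m, TA=20, TE=3.0)
```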
We used a region linking algorithm (Rizvandi et al.
2008) to eliminate the discontinuities detected between
road segments. Initially a dilation operation is performed
on the input image. Since dilation is an operation that
thickens or grows objects in the original image, the
result of this operation is that edge segments which are very close to each other are automatically linked. In our
algorithm the structural element used for the dilation
operation is a disk of radius 10. The image is then
thinned and the edges are broken down into smaller
straight line edge segments. Heuristics based upon
proximity properties and alignment of road features are
used to cluster and integrate fragmented segments. For
each segment, the best neighbor is determined based on
the difference in direction and the minimum distance
between the end points. Results of post-processing and
segment linking are shown in Figs. 9 and 10.
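The dilation step of the linking procedure can be sketched with a disk structuring element built from shifted ORs; the text uses a disk of radius 10, a smaller radius is used in the demo below, and the subsequent thinning and heuristic grouping stages are omitted:

```python
import numpy as np

def dilate_disk(mask, radius):
    """Binary dilation with a disk structuring element. The mask is
    padded so shifts near the border do not wrap around."""
    H, W = mask.shape
    padded = np.pad(mask, radius)
    out = np.zeros_like(mask, dtype=bool)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy * dy + dx * dx <= radius * radius:   # inside the disk
                out |= padded[radius + dy: radius + dy + H,
                              radius + dx: radius + dx + W]
    return out

# Demo: two collinear segments separated by a 4-pixel gap are bridged.
m = np.zeros((11, 30), dtype=bool)
m[5, 2:12] = True
m[5, 16:26] = True
linked = dilate_disk(m, 3)
```

After dilation the bridged result would be thinned back to one-pixel width before the direction/distance heuristics pair up the remaining fragments.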
Fig. 9 The results of (a) post-processing using the output shown
in Fig. 8a; b segment linking using the output shown in (a)
Region Part-Segmentation
Region part segmentation is necessary to eliminate
some large patches of non-road structures which
appear to be fused to roads. These patches are man-made structures such as roof-tops, parking lots, with
similar spectral characteristics as roads. The proposed
algorithm for Region part-segmentation is based on
part-segmentation (Bennamoun and Mamic 2002),
consisting of the following steps:
1. Compute the smoothed inner and outer contours
(closed) of the image
2. Compute the smoothed curvature of the con-
tours.
3. Determine the local extrema, where the derivative of the smoothed curvature equals zero, with
curvature value greater than a threshold.
4. Compute Convex/Concave Dominant Points at
which the interior angle is greater/less than
180°, by tracing the outer/inner contour of the
region as shown in Fig. 11.
5. Compute effective Convex (CDPcx) and Con-
cave (CDPce) dominant points, on outer and
inner contours respectively by logical AND
operation of the output in steps 3 and 4.
6. The CDPs (both CDPcx & CDPce) are moved
along the normal for a fixed number of iterations
(all the CDPs must move simultaneously) on the
respective contours.
7. A moving CDP will stop (freezes) only if it touches another moving CDP or a point on the
same contour within a specified path distance
from it. For the outer contour, if the contour of
the segment touches the boundary of the image,
then the respective CDPs are not frozen.
8. Trace back all the frozen CDPs and join the
pair of corresponding CDPs or the CDP and the
contour point using a line segment.
9. For each line segment obtained in step 8: form
two adjacent regions within a closed contour,
using the line as the new boundary.
10. Merge the new pair of adjacent regions, if they
have similar structural properties (orientation of
line segments near the CDPs).
11. Set a threshold and eliminate all the connected
components with area below the threshold.
Curvature Computation
A curve is represented in parametric form, where t is
the path length, x and y are the coordinates of the
contour.
r(t) = (x(t), y(t))    (9)

If there is more than one object, then the outer contour
is traced for each object. If there is a child object
inside an object, we have to then trace the outer
contour for the child object as well.
Inner boundary pixels are extracted by tracing the
pixels at the inner contour in an object. A smoothing
of the contour with a Gaussian kernel is then needed
prior to the computation of the curvature, to overcome the problem of discontinuities in derivatives needed
for curvature calculation (Pei and Lin 1992). The
smoothed contour is represented
xst xtG; yst ytG 10Figure 11ashows an image having one object with
two holes. The outermost pixels of the object are
traced to extract the outer contour and the boundary
of the holes gives the inner contours, as shown in

Fig. 11 a Input image; b inner and outer contours
Fig. 10 The results of (a) post-processing using the output
shown in Fig. 8b; b segment linking using the output shown in
(a)
Fig. 11b. Curvature is defined as the rate of change of
slope as a function of arc length t:
K(t) = dθ(t)/dt    (11)
where θ(t) is the tangent angle to the curve at t. The
curvature is computed as (Bennamoun and Mamic 2002)
Ks(t) = (ẋs ÿs − ẏs ẍs) / (ẋs² + ẏs²)^(3/2)    (12)
The curvature obtained from Eq. 12 is smoothed
with a Gaussian kernel (Eq. 2) to obtain a smoothed
curvature, as given by the following equation:
Ks(t) = K(t) ∗ G    (13)

Figure 12c shows the curvature plot of the image
shown in Fig. 12a. The smoothed curvature obtained using Eq. 13 is shown in Fig. 12d.
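Equations 10, 12 and 13 can be sketched as follows; the circular handling of the closed contour and the use of finite differences for the derivatives are implementation choices of this sketch:

```python
import numpy as np

def gaussian_kernel(sigma):
    half = int(4 * sigma)
    t = np.arange(-half, half + 1)
    g = np.exp(-t * t / (2.0 * sigma * sigma))
    return g / g.sum()

def smoothed_curvature(x, y, sigma=2.0):
    """K(t) = (x' y'' - y' x'') / (x'^2 + y'^2)^(3/2)  (Eq. 12),
    after Gaussian smoothing of the closed contour (Eq. 10)."""
    g = gaussian_kernel(sigma)

    def csmooth(v):
        # circular convolution: pad with wrapped samples so the closed
        # contour has no artificial endpoints
        n, L = len(v), len(g)
        ext = np.concatenate([v[-L:], v, v[:L]])
        return np.convolve(ext, g, mode="same")[L:L + n]

    xs, ys = csmooth(x), csmooth(y)
    dx, dy = np.gradient(xs), np.gradient(ys)      # first derivatives
    ddx, ddy = np.gradient(dx), np.gradient(dy)    # second derivatives
    return (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5

# Demo: a circle of radius 10 has curvature 1/10 everywhere.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
k = smoothed_curvature(10.0 * np.cos(theta), 10.0 * np.sin(theta), sigma=2.0)
```

Dominant points would then be taken where this curvature exceeds a threshold at the zero crossings of its derivative (Eq. 14).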
Extraction of Dominant Points
It has been suggested from the view point of the human
visual system (Bennamoun 1994) that the dominant
points have high curvature or the rate of change of
slope along the path length is high. In this paper, we
detect these points and use them to decompose the object to remove the non-road structures. Dominant
points are points having a curvature value greater than a
threshold. Local extrema are defined as the points at
which the derivative of the curvature equals zero (Pei
and Lin 1992), as
dKs(t)/dt = 0    (14)
which is equivalent to convolving the curvature with
the derivative of a Gaussian and taking the zero crossings of this operation. Figure 12e shows the local extrema for the input image in Fig. 12a.
Fig. 12 a Input synthetic image; b smoothed contour; c curvature plot; d smoothed curvature; e local extremas; f effective CDPs
marked on the smoothed curvature in (d); g CDP marked on the smoothed contour; h contour normals at CDP; i segmented map of (a)
Convex Dominant points on the outer contour are
combined with the local extrema using an AND opera-
tion to give the effective CDPcx. Similarly, Concave
Dominant points on the inner contour are combined
with the local extrema to get the effective CDPce.
These points are then used to segment the non-road
parts from the given image. The CDPcx are moved inwards along the direction of their normals, whereas
CDPce are moved outwards along the direction
of the normal. For a particular contour all the
CDPs (both CDPcx & CDPce) are allowed to move
simultaneously, and a CDP freezes only when it
touches another moving CDP in the same contour or
a point in the contour itself, which is within a
specified path length. The specified path length of
the moving CDP dictates the maximum perimeter of
the non-road region for the purpose of elimination.
All the frozen CDPs are traced back to their origins
and the corresponding CDPs or the CDP and the
contour point are joined using a line segment. The
effective CDPs on the smoothed curvature in Fig. 12d
are shown in Fig. 12f. The same are marked on the
smoothed contour of Fig. 12b, in Fig. 12g.
Figure 12i shows the results of region part segmen-
tation for the image in Fig. 12a.
Unlike the Bennamoun algorithm (Bennamoun and
Mamic 2002), there is no necessity to freeze all the
CDPs; we only move the CDPs for a fixed
number of iterations. Unfrozen CDPs are not taken into account for segmentation. Now the regions fitted
with the new line segments are isolated as separate
components. Setting an area threshold, small noisy
non-road structures are eliminated. Figure 13 shows
the results of region part segmentation algorithm for
the images shown in Figs. 9b and 10b. It is observed
that the non-road regions have been eliminated,
thereby improving the accuracy of the road extraction
results (Fig. 13).
Experimental Results and Comparative Study
We now describe the results of experimentation using our proposed framework. The performance of the
proposed method is verified on satellite images of size
512 × 512 each. The performance of the proposed
technique is compared with two state of the art
techniques: Tuncer (2007) and Mokhtarzade et al.
(2007), as well as a free commercial tool for feature
extraction (Geospace 2008), termed as FeatureObjeX.
FeatureObjeX (Geospace 2008) is a semi-
automatic system that allows the user to select
the training samples. Once the seed is created,
intensity distributions are computed for a set of pixels around the seed, which are then used to fit a
multivariate normal distribution. Each seed region is
modeled by a Naive Bayes classifier (Duda et al.
2000). Then the likelihood of a given pixel is
computed with respect to each of the seed distribu-
tion. If the likelihood of a particular pixel is the same as or
greater than the likelihood of the seed, then that pixel
is classified as a target class. FeatureObjeX was used
to segment the image into road and non-road classes
using color features. Several configuration changes
were made in FeatureObjeX before the tests, to make it more efficient and closer to our requirements for
working in road scenes over urban and suburban
environments.
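Our reading of this seed-based classification can be sketched as follows; since the tool's exact thresholding rule is not documented here, the quantile-based reference likelihood is an assumption of this sketch:

```python
import numpy as np

def fit_seed(pixels):
    """Fit a multivariate normal to the pixels around a seed.
    A small ridge keeps the covariance invertible."""
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False) + 1e-6 * np.eye(pixels.shape[1])
    return mu, cov

def log_likelihood(x, mu, cov):
    d = x - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ np.linalg.inv(cov) @ d
                   + logdet + len(mu) * np.log(2.0 * np.pi))

def classify(x, mu, cov, ref_ll):
    """Assign the pixel to the target (road) class when its likelihood
    is at least the reference (seed) likelihood."""
    return log_likelihood(x, mu, cov) >= ref_ll

# Demo: fit a seed model and take the 5th-percentile training
# likelihood as the reference (an illustrative choice).
rng = np.random.default_rng(1)
seed_px = rng.normal([0.3, 0.3, 0.3], 0.02, size=(500, 3))
mu, cov = fit_seed(seed_px)
ref_ll = np.quantile([log_likelihood(p, mu, cov) for p in seed_px], 0.05)
```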
Dataset Description and Performance Measures
We created a database for satellite images with 1-m/
pixel resolution from Wikimapia (Koriakine and
Saveliev 2006). Commercial images of this type are
very expensive. We screen-captured 100
images of Developed countries and 100 images of Emerging countries that we considered useful for our
work. In our case the place and date were not very
critical, and the only characteristic that we were
looking for was the content of the images which had
views of highways and roads. For creating the dataset,
we consider selected sections (512 × 512 pixels) of
scenes from satellite images of 1 m/pixel resolution
acquired from Wikimapia (Koriakine and Saveliev
2006), which includes: (1) sub-urban and (2) urban
Fig. 13 The results of region part segmentation for: a output
shown in Fig. 9b; and b output shown in Fig. 10b
areas from Developed and Emerging countries.
Figures 15a and 17a show three examples of images
from suburban areas in Developed and Emerging
countries, whereas Figs. 16a and 18a show three
examples of images from urban areas in Developed
and Emerging countries. For each image in the
dataset, a ground truth (road) map was also obtained using a human operator. A portion of the dataset can
be downloaded from (Visualisation and Perception
Lab 2006). The categorization of the data into four
groups was done with the advice (based on visual
observation and geo-location) of a GIS expert. As the
data was distributed in four groups of 50 images each,
we trained four different P-SVMs with data (25
images) from each respective group. The rest (25 images)
were used for testing and performance analysis of the
output of our proposed multistage framework.
To assess the performance of the road extraction system, the length of the extracted road network
(parameter obtained after morphological thinning)
that falls within a prespecified range with respect to
the reference road network is used for the calculation
of accuracy measures. The road segments in the test
sites are manually digitized to form the reference road
network. This subjectively obtained reference road
network, used to evaluate the proposed road extrac-
tion system, covers all roads present in the image.
Hence this is used as a ground truth to estimate the
accuracy measures for road extraction. Two measuresare used to evaluate the accuracy of the extracted road
network (Heipke et al. 1997), and these measures are
defined as follows: Completeness is defined as the
percentage of the reference data, which was detected
during road extraction:
completeness = (length of matched reference) / (length of reference)    (15)
Correctness represents the percentage of the
extracted road data, which is correct:
correctness = (length of matched extraction) / (length of extraction)    (16)
Results and Discussion
Separate training of the P-SVM was necessary for the
four categories of image samples, as the spectral
characteristics exhibited for roads were different for
the four cases of our study. The road intensity and
contrast also varies between the four different types of
image samples. The proposed CSNN - based algo-
rithm iteratively shuttles between adding new and
removing redundant edge pixels, and hence inherently
produces a correction mechanism to the process of
fusion. Edge maps are obtained using the method discussed in Section DSM Based Edge Detection.
The CSNN-CII algorithm requires the probability
values for all the pixels corresponding to each class
in an image. The initial probabilistic values and
segmented maps are obtained using the method
discussed in Section Segmentation Using Probabilistic
SVM.
In order to directly compare our approach with
the recently published results in (Tuncer 2007;
Mokhtarzade and Zoej 2007), we use a pair of
images published by them for evaluation. We will show the results first on the images used in (Tuncer
2007) and (Mokhtarzade and Zoej 2007), and then
with a few examples from the testing dataset
acquired from wikimapia (Koriakine and Saveliev
2006) in Figs. 15, 16, 17 and 18. Figure 14a shows
two sample images used in (Tuncer 2007) and
(Mokhtarzade and Zoej 2007). Output of human
operator to detect roads from the two images are
presented in Fig. 14b. Figure 14c-I presents the
result published in (Tuncer 2007), while that in
Fig. 14c-II is taken from (Mokhtarzade and Zoej 2007). The result in Fig. 14c-I shows that only roads
with rather large pixel widths such as the main
highways are recovered as thinned structures. Prominent roads are recovered with good accuracy, but
inner city roads which are narrow and road inter-
sections have not been recovered. Similarly for the
method presented in (Mokhtarzade and Zoej 2007),
more false positives (non-road structures) occur,
which reduces the correctness measure for this
method (see Fig. 14c-II). Some pixels belonging to
rooftops of buildings were falsely identified as roads. The completeness and correctness measures for the
given test images calculated for Tuncer (2007) and
Mokhtarzade et al. (2007) as well as our proposed
method are shown in Table 2. The completeness
measure in (Mokhtarzade and Zoej 2007) is higher
than that in (Tuncer 2007), as the true positives
(actual road parts) are detected more accurately.
Results of our proposed method are much better in
both the cases as shown in Fig. 14d. It can be
observed that our method outperforms both of the prior
published works.
Figures 15 and 16 show the results obtained using
the proposed methodology on satellite images of
Developed countries, whereas Figs. 17 and 18 show
the results for Emerging countries. Figures 15b, 16b,
17b and 18b show the results of feature extraction
using the FeatureObjeX tool for the images in Figs. 15a, 16a, 17a and 18a respectively. Figures 15c,
16c, 17c and 18c show the results for the algorithm
proposed in (Tuncer 2007). Figures 15d, 16d, 17d and
18d show the extracted road segments from the input
satellite images using the technique presented in
(Mokhtarzade and Zoej 2007). Figures 15e, 16e, 17e
and 18e show manually plotted reference road layouts
from the respective input images. It can be observed
that the results of our proposed method, given in
Figs. 15f, 16f, 17f and 18f are significantly better than
other approaches and quite close to the ground truth given in Figs. 15e, 16e, 17e and 18e. Our
system outperforms FeatureObjeX (Geospace 2008)
and other state of the art methods in all the cases.
The optimal values for the parameters used for
our proposed approach are given in Table 3, which
have been obtained empirically using a large set of
experiments.
Table 4 describes the comparison of accuracy
measures for the results presented in Figs. 15, 16, 17 and 18 using the completeness
measures. From Table 4 it can be seen, using the
completeness and correctness measures, that our pro-
posed method outperforms the other techniques in
almost all the cases. In very few cases, the complete-
ness measure of the FeatureObjeX tool is marginally
better than our method. Tables 5, 6, 7 and 8 show the
average classification accuracy obtained by analyzing
images using the proposed method, FeatureObjeX
(Geospace 2008) and two state of the art techniques
(Tuncer 2007) and (Mokhtarzade and Zoej 2007), over 25 images in four different categories respectively.
Table 2 Performance of the proposed approach and the algorithms presented in (Tuncer 2007; Mokhtarzade and Zoej 2007)

Methods                                        Completeness   Correctness
(Tuncer 2007) (Fig. 14c-I)                     82%            96%
Proposed (Fig. 14d-I)                          100%           100%
(Mokhtarzade and Zoej 2007) (Fig. 14c-II)      92%            82%
Proposed (Fig. 14d-II)                         96%            85%
Fig. 14 a Images presented in (Tuncer 2007) & (Mokhtarzade and Zoej 2007); b output of manual (hand-drawn) extraction; c results
reproduced from (I) Tuncer (Tuncer 2007) and (II) Mokhtarzade et al. (Mokhtarzade and Zoej 2007); d results of our proposed approach
Fig. 15 a Three satellite images of size (512 × 512), from a
suburban area of a developed region; b results from FeatureObjeX
(Geospace 2008); c results of the method proposed in (Tuncer
2007); d results of the method proposed in (Mokhtarzade and
Zoej 2007); e hand-drawn (manual) road map; f results of our
proposed method
Fig. 16 a Three satellite images of size (512 × 512), from an
urban area of a developed region; b results from FeatureObjeX
(Geospace 2008); c results of the method proposed in (Tuncer
2007); d results of the method proposed in (Mokhtarzade and
Zoej 2007); e hand-drawn (manual) road map; f results of our
proposed method
Fig. 17 a Three satellite images of size (512 × 512), from a
suburban area of an emerging region; b results from FeatureObjeX
(Geospace 2008); c results of the method proposed in (Tuncer
2007); d results of the method proposed in (Mokhtarzade and
Zoej 2007); e hand-drawn (manual) road map; f results of our
proposed method
Fig. 18 a Three satellite images of size (512 × 512), from an
urban area of an emerging region; b results from FeatureObjeX
(Geospace 2008); c results of the method proposed in (Tuncer
2007); d results of the method proposed in (Mokhtarzade and
Zoej 2007); e hand-drawn (manual) road map; f results of our
proposed method
It is observed from the results shown in Figs. 15,
16, 17 and 18 and Tables 4, 5, 6, 7 and 8 that the
performance measure for our proposed algorithm is
superior to those of the other methods. The results obtained
using the proposed methodology are much superior to
the methods presented in (Tuncer 2007; Mokhtarzade
and Zoej 2007) and close to the manually drawn
reference road network. Compared to our preliminary
investigation in (Mirnalinee et al. 2009), the perfor-
mance in terms of the completeness and correctness measures has been enhanced significantly. The
region linking algorithm improves the completeness
measure whereas the region part segmentation
improves the correctness measure.
The correctness and completeness measures obtained
for scenes from emerging countries are in most cases
lower than those for the scenes from the developed
countries. This decrease in accuracy for the scenes of
emerging countries is expected, since there are many
more opportunities for errors in these types of areas due
to the large numbers of linear non-road features, four-way crossings, non-linear road structures and unplanned
layouts. Comparing the results of developed urban and
suburban scenes, the performance on urban scenes is
lower, because of distortions. It is obvious that images
of urban areas exhibit a more complex structure than
scenes of suburban areas, as the number of different
objects and their heterogeneity is much higher in urban
scenes. Some of the roads comprise several lanes that
are linked by complex road crossings. Generally as
shown in Fig. 15, the extraction results for open
landscape areas are nearly complete and correct.
Suburban scenes of emerging countries are covered
by vegetation. Moreover, the spectral response of roads in these areas is on certain occasions similar to the
spectral response of open-fields and roof-tops, which in
turn increases the false positives thereby reducing the
correctness measure. Overall, our proposed method
outperforms FeatureObjeX (Geospace 2008) and
the two state of the art methods (Tuncer 2007;
Mokhtarzade and Zoej 2007), for observations aver-
aged over images of 50 developed and 50 emerging
areas.
Conclusions
A novel and efficient method for automatically extracting roads directly from satellite images, based on region and edge integration using low-level information, has been introduced and demonstrated. This new method combines the outputs of PSVM and DSM in such a way that it preserves the strong discriminative ability of SVM while simultaneously exploiting the linear-like characteristics in the features derived using DSM. For the determination of discontinuities and the elimination of non-road parts, two post-processing approaches were employed, each using several criteria concerning properties of the road parts and their relations to each other. The segment linking module solves the problem of discontinuity to some extent, thereby increasing the completeness. Region part segmentation and shape analysis based on elongated-
Table 4 Performance of the system for the images shown in Figs. 15, 16, 17 and 18 (A: Completeness, B: Correctness)

Road image type                    FeatureObjeX     Tuncer           Mokhtarzade      Proposed
                                   I    II   III    I    II   III    I    II   III    I    II   III
Developed suburban (Fig. 15)   A   97   84   100   98   82   97     98   68   95     100  94   100
                               B   88   72   91    93   74   92     86   56   85     98   89   90
Developed urban (Fig. 16)      A   85   96   97    75   92   91     65   66   76     92   100  99
                               B   79   82   72    83   83   68     52   54   57     96   100  94
Emerging suburban (Fig. 17)    A   96   83   94    73   62   92     61   51   87     88   96   95
                               B   68   73   63    67   56   74     58   51   62     92   93   89
Emerging urban (Fig. 18)       A   91   87   83    62   52   74     71   64   59     89   92   82
                               B   74   71   75    72   51   61     57   58   58     83   92   85
Table 3 Values of the parameters used in our proposed approach

Road image type    1    2     N        TE    TA
Suburban           2    2.5   9 9      0.6   0.7   50
Urban              3    3.5   11 11    0.7   0.7   50
22 J Indian Soc Remote Sens (March 2011) 39(1):1–25
Table 8 Performance of the system averaged over 25 images of urban scenes of emerging countries

Methods                                            Completeness   Correctness
FeatureObjeX (Geospace 2008)                       78%            60%
Tuncer (Tuncer 2007)                               58%            52%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)     63%            52%
Proposed Method                                    85%            87%
Table 7 Performance of the system averaged over 25 images of suburban scenes of emerging countries

Methods                                            Completeness   Correctness
FeatureObjeX (Geospace 2008)                       89%            62%
Tuncer (Tuncer 2007)                               78%            64%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)     64%            58%
Proposed Method                                    87%            91%
Table 6 Performance of the system averaged over 25 images of urban scenes of developed countries

Methods                                            Completeness   Correctness
FeatureObjeX (Geospace 2008)                       84%            74%
Tuncer (Tuncer 2007)                               81%            65%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)     62%            89%
Proposed Method                                    93%            91%
Table 5 Performance of the system averaged over 25 images of suburban scenes of developed countries

Methods                                            Completeness   Correctness
FeatureObjeX (Geospace 2008)                       86%            79%
Tuncer (Tuncer 2007)                               81%            72%
Mokhtarzade et al. (Mokhtarzade and Zoej 2007)     63%            60%
Proposed Method                                    93%            89%
ness measure eliminates non-road parts and increases the correctness. The results show that the proposed system is able to effectively extract major sections of the road network, as well as a few junctions and curved roads, from high-resolution satellite images.
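For illustration, an elongatedness measure of the kind used in the shape analysis above can be computed from the second-order central moments of a pixel region, as the ratio of its principal-axis standard deviations. The paper's exact shape criterion is not reproduced here; this moment-based formulation is a common stand-in and should be read as an assumption.

```python
from math import sqrt

def elongatedness(pixels):
    """Moment-based elongatedness of a pixel region (illustrative).

    `pixels` is a list of (row, col) coordinates. Returns the ratio of
    major- to minor-axis standard deviations derived from the region's
    second-order central moments; thin, road-like regions score high,
    while compact blobs score near 1 and can be rejected by a threshold.
    """
    n = len(pixels)
    cy = sum(r for r, c in pixels) / n
    cx = sum(c for r, c in pixels) / n
    mu20 = sum((c - cx) ** 2 for r, c in pixels) / n   # column variance
    mu02 = sum((r - cy) ** 2 for r, c in pixels) / n   # row variance
    mu11 = sum((r - cy) * (c - cx) for r, c in pixels) / n
    # eigenvalues of the 2x2 covariance matrix give the axis variances
    common = sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    lam1 = (mu20 + mu02 + common) / 2   # major-axis variance
    lam2 = (mu20 + mu02 - common) / 2   # minor-axis variance
    return sqrt(lam1 / lam2) if lam2 > 1e-12 else float('inf')
```

A square region yields a ratio of 1, while a long thin strip yields a large (or infinite) ratio, which is the separation the correctness-improving filter exploits.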
It is observed that the road detection process produces a high degree of accuracy, especially for the scenes of developed countries. In urban areas, however, only major roads with larger pixel widths have been detected. Moreover, the presence of buildings and other features similar to roads made the extraction process somewhat more difficult compared to the suburban case. Linking of discontinuous segments, road junction detection and modeling of shadows are issues to be addressed in the future scope of this work. Vectorization of the extracted road segments would also be a natural extension of this work for GIS updates. The next step may include the formation of a road network by searching for junctions connecting road segments. Results may improve with the help of road hypothesis verification using the parallelism of road boundaries, and the use of a graph data structure to form a complete road network representation.
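The graph-based network formation suggested as future work could, for instance, take the following shape: segment endpoints falling within a small radius of one another are merged into junction nodes, and segments sharing a junction become adjacent. The function name, the greedy clustering rule and the radius below are all illustrative assumptions, not the authors' design.

```python
from math import hypot
from collections import defaultdict

def build_road_graph(segments, junction_radius=5.0):
    """Sketch of junction detection and network formation (illustrative).

    `segments` is a list of ((r1, c1), (r2, c2)) endpoint pairs. Endpoints
    within `junction_radius` of an existing junction are merged into it;
    segments meeting at a junction become adjacent in the output graph.
    """
    endpoints = []                      # (point, segment index)
    for i, (a, b) in enumerate(segments):
        endpoints.append((a, i))
        endpoints.append((b, i))
    nodes = []                          # representative point per junction
    members = defaultdict(set)          # junction index -> incident segments
    for p, i in endpoints:
        for j, q in enumerate(nodes):   # greedy assignment to nearest-first junction
            if hypot(p[0] - q[0], p[1] - q[1]) <= junction_radius:
                members[j].add(i)
                break
        else:
            nodes.append(p)
            members[len(nodes) - 1].add(i)
    # adjacency: two segments are connected if they share a junction
    adj = defaultdict(set)
    for segs in members.values():
        for i in segs:
            for j in segs:
                if i != j:
                    adj[i].add(j)
    return nodes, dict(adj)
```

Two nearly touching collinear segments end up adjacent in the graph, while an isolated segment remains disconnected, giving the connectivity information needed for a complete network representation.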
References
Baumgartner, A., Hinz, S., & Wiedemann, C. (2002). Efficient methods and interfaces for road tracking. In: Proceedings of the ISPRS Commission III Symposium on Photogrammetric Computer Vision, pp. 28–31.
Bennamoun, M. (1994). A contour based part segmentation algorithm. In: Proc. of the IEEE ICASSP, pp. 41–44.
Bennamoun, M., & Mamic, G. J. (2002). Object recognition fundamentals and case studies. Springer.
Bigun, J., Granlund, G., & Wiklund, J. (1991). Multidimensional orientation estimation with applications to texture analysis and optical flow. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(8), 775–790.
Bucha, V., Uchida, S., & Ablameyko, S. (2006). Interactive road extraction with pixel force fields. In: The 18th IEEE International Conference on Pattern Recognition (ICPR'06), pp. 829–832.
Chu, J., & Aggarwal, J. (1993). The integration of image segmentation maps using region and edge information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15, 1241–1252.
Cooper, G., & Cowan, D. (2007). Enhancing linear features in image data using horizontal orthogonal gradient ratios. Computers and Geosciences, 33, 981–984.
Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning, 20(3), 273–297.
Doucette, P., Agouris, P., Stefanidis, A., & Musavi, M. (2001). Self-organized clustering for road extraction in classified imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 55, 347–358.
Duda, R., Hart, P., & Stork, D. (2000). Pattern classification. Wiley Interscience.
Feng, X., & Milanfar, P. (2002). Multiscale principal components analysis for image local orientation estimation. In: Proceedings of The 36th Asilomar Conference on Signals, Systems and Computers, pp. 478–482.
Geospace (2008). FeatureObjeX, http://www.pcigeomatics.com/.
Granlund, G., & Knutsson, H. (1995). Signal processing for computer vision. Boston: Kluwer Academic.
Gruen, A., & Li, H. (1995). Road extraction from aerial and satellite images by dynamic programming. ISPRS Journal of Photogrammetry and Remote Sensing, 50(4), 11–20.
Haddon, J., & Boyce, J. (1990). Image segmentation by unifying region and boundary information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), 929–948.
Haglund, L., & Fleet, D. (1994). Stable estimation of image orientation. In: Proceedings of the First IEEE International Conference on Image Processing III, pp. 68–72.
Haralick, R., & Shapiro, L. (1992). Computer and robot vision. Addison Wesley.
Heipke, C., Mayer, H., Wiedemann, C., & Jamet, O. (1997). Evaluation of automatic road extraction. International Archives of Photogrammetry and Remote Sensing, pp. 47–56.
Hinz, S., & Baumgartner, A. (2003). Multiview fusion of road objects supported by self diagnosis. In: Proceedings of the 2nd GRSS/ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas, pp. 137–141.
Hu, X., & Tao, V. (2007). Automatic extraction of main road centerlines from high resolution satellite imagery using hierarchical grouping. Photogrammetric Engineering and Remote Sensing, 73(9), 1049–1056.
Hu, X., Zhang, Z., & Tao, V. (2004). A robust method for semi-automatic extraction of road centerlines using a piece-wise parabolic model and least square template matching. Photogrammetric Engineering and Remote Sensing, 70(12), 1393–1398.
Jacob, M., & Unser, M. (2004). Design of steerable filters for feature detection using Canny like criteria. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(8), 1007–1019.
Jiang, X. (2007). Extracting image orientation feature by using integration operator. Pattern Recognition, 40, 705–717.
Jin, X., & Davis, C. (2005). An integrated system for automatic road mapping from high-resolution multispectral satellite imagery by information fusion. Information Fusion, pp. 257–273.
Kass, M., Witkin, A., & Terzopoulos, D. (1987). Snakes: active contour models. International Journal of Computer Vision, 1, 321–331.
Koriakine, A., & Saveliev, E. (2006). Data, http://www.wikimapia.org/.
Kumar, P., Das, S., & Yegnanarayana, B. (2000). One-dimensional processing of images. In: International Conference on Multimedia Processing and Systems, pp. 451–454.
Kurugollu, F., & Sankur, B. (1999). Map segmentation of color images using constraint satisfaction neural network. In: International Conference on Image Processing, pp. 236–239.
Lalit, G., Mangai, U. G., & Das, S. (2008). Integrating region and edge information for texture segmentation using a modified constraint satisfaction neural network. Image and Vision Computing, pp. 1106–1117.
Laptev, I., Mayer, H., Lindeberg, T., Eckstein, W., Steger, C., & Baumgartner, A. (2000). Automatic extraction of roads from aerial images based on scale space and snakes. Machine Vision and Applications, 12(1), 23–31.
Lin, W., Kuo, E., & Chen, C. (1992). Constraint satisfaction neural networks for image segmentation. Pattern Recognition, 25(7), 679–693.
Lyvers, E., & Mitchell, O. (1988). Precision edge contrast and orientation estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(6), 927–937.
Majidi, B., & Bab-Hadiashar, A. (2009). Aerial tracking of elongated objects in rural environments. Machine Vision and Applications, 20, 23–34.
Mantero, P., Moser, G., & Serpico, S. (2005). Partially supervised classification of remote sensing images through SVM-based probability density estimation. IEEE Transactions on Geoscience and Remote Sensing, 43(3), 559–570.
Mayer, H., Laptev, I., Baumgartner, A., & Steger, C. (1997). Automatic road extraction based on multi-scale modelling, context and snakes. In: International Archives of Photogrammetry and Remote Sensing, pp. 106–113.
McKeown, D. (1996). Top ten lessons learned in automated cartography.
Mena, J. B. (2003). State of the art on automatic road extraction for GIS update: a novel classification. Pattern Recognition Letters, 24(16), 3037–3058.
Miliaresis, G., & Kokkas, N. (2007). Segmentation and object-based classification for the extraction of the building class from LIDAR DEMs. Computers and Geosciences, 33, 1076–1087.
Mirnalinee, T., Das, S., & Varghese, K. (2009). Integration of region and edge based information for efficient road extraction from high resolution satellite imagery. In: IEEE Proceedings of ICAPR, Kolkata, India, pp. 373–376.
Moigne, J., & Tilton, J. (1995). Refining image segmentation by integration of edge and region data. IEEE Transactions on Geoscience and Remote Sensing, 33, 605–615.
Mokhtarzade, M., & Zoej, M. (2007). Road detection from high-resolution satellite images using artificial neural networks. International Journal of Applied Earth Observation and Geoinformation, 9(1), 32–40.
Pal, M., & Mather, P. (2005). Support vector machines for classification in remote sensing. International Journal of Remote Sensing, 26(5), 1007–1011.
Pavlidis, T., & Liow, Y. (1990). Integrating region growing and edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 225–233.
Pei, S., & Lin, C. (1992). The detection of dominant points on digital curves by scale space filtering. Pattern Recognition, pp. 1307–1314.
Perona, P. (1998). Orientation diffusions. IEEE Transactions on Image Processing, 7(3), 457–467.
Platt, J. C. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers, MIT Press, pp. 61–74.
Qian, R., & Huang, T. (1996). Optimal edge detection in two-dimensional images. IEEE Transactions on Image Processing, 5, 1215–1220.
Raghu, P., & Yegnanarayana, B. (1996). Segmentation of Gabor-filtered textures using deterministic relaxation. IEEE Transactions on Image Processing, 5(12), 424–429.
Rizvandi, N., Pizurica, A., Philips, W., & Ochoa, D. (2008). Edge linking based method to detect and separate individual C. elegans worms in culture. In: DICTA, pp. 65–70.
Shi, W., & Zhu, C. (2002). The line segment match method for extracting road network from high-resolution satellite images. IEEE Transactions on Geoscience and Remote Sensing, 40(2), 511–514.
Strang, G. (2005). Linear algebra and its applications. Thomson Brooks.
Theodoridis, S., & Koutroumbas, K. (2006). Pattern recognition. Academic.
Tuncer, O. (2007). Fully automatic road network extraction from satellite images. In: Recent Advances in Space Technologies, pp. 708–714.
Tupin, F., Houshmand, B., & Datcu, M. (2002). Road detection in dense urban areas using SAR imagery and the usefulness of multiple views. IEEE Transactions on Geoscience and Remote Sensing, 40, 2405–2414.
Udomhunsakul, S. (2004). Semi-automatic road detection from satellite imagery. In: IEEE International Conference on Image Processing (ICIP), pp. 1723–1726.
Visualisation and Perception Lab (2006). http://www.cse.iitm.ac.in/~sdas/vplab/downloads.html.
Wei, W., & Xin, Y. (2008). Feature extraction for man-made objects segmentation in aerial images. Machine Vision and Applications, 19, 57–64.
Xiao, Y., Tan, T., & Tay, S. (2005). Utilizing edge to extract roads in high-resolution satellite imagery. In: IEEE International Conference on Image Processing (ICIP), pp. 637–640.
Yager, N., & Sowmya, A. (2003). Support vector machines for road extraction from remotely sensed images. LNCS, 2756, 285–292.
Yang, J., & Wang, R. (2007). Classified road detection from satellite images based on perceptual organization. International Journal of Remote Sensing, 28, 4653–4669.
Zhang, H., Xiao, Z., & Zhou, Q. (2008). Research on road extraction semi-automatically from high resolution remote sensing images. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXXVII (Part B), 536–538.
Zhu, C., Shi, W., Pesaresi, M., & Liu, L. (2005). The recognition of road network from high-resolution satellite remotely sensed data using image morphological characteristics. International Journal of Remote Sensing, 26(24), 5493–5508.