
“AUTOMATIC MAIN ROAD EXTRACTION FROM HIGH RESOLUTION SATELLITE IMAGERY”

A project report submitted in partial fulfillment of the requirements for the degree of

Bachelor of Technology in Electrical Engineering

BY

Mr. Pratik Panchabhaiyye (Roll No.: 10502010)

Mr. Ravi Shekhar Chaudhary (Roll No.: 10502051)

Guided by

Prof. K.R. Subhashini
Dept. of Electrical Engineering, NIT Rourkela

National Institute of Technology Rourkela
Rourkela, Orissa - 769008

2008-2009


Certificate

This is to certify that Mr. Ravi Shekhar Chaudhary of 8th semester Electrical Engineering has completed the B.Tech project on “Automatic Main Road Extraction From High Resolution Satellite Imagery” in a satisfactory manner. This project is submitted in partial fulfillment of the requirements for the Bachelor’s degree in Electrical Engineering as prescribed by the institute during the year 2008-2009.

PROF. K.R. Subhashini          PROF. B.D. Subudhi
Project Guide                  H.O.D.
N.I.T. Rourkela                Dept. of Electrical Engg.


Certificate

This is to certify that Mr. Pratik Panchabhaiyye of 8th semester Electrical Engineering has completed the B.Tech project on “Automatic Main Road Extraction From High Resolution Satellite Imagery” in a satisfactory manner. This project is submitted in partial fulfillment of the requirements for the Bachelor’s degree in Electrical Engineering as prescribed by the institute during the year 2008-2009.

PROF. K.R. Subhashini          PROF. B.D. Subudhi
Project Guide                  H.O.D.
N.I.T. Rourkela                Dept. of Electrical Engg.


ABSTRACT

Road information is essential for automatic GIS (geographical information system) data acquisition, transportation and urban planning. Automatic road (network) detection from high resolution satellite imagery holds great potential for significantly reducing database development/updating cost and turnaround time. Many algorithms and methodologies, from so-called low-level feature detection to high-level context-supported grouping, have been presented for this purpose, yet there is still no practical system that can fully automatically extract road networks from space imagery for automatic mapping. This report presents a methodology for automatic main road detection from high resolution IKONOS satellite imagery. The strategies include a multiresolution (image pyramid) method, Gaussian blurring, a line finder using a 1-dimensional template correlation filter, line segment grouping, and multi-layer result integration. The multi-layer or multi-resolution method for road extraction is a very effective strategy for saving processing time and improving robustness. To realize the strategy, the original IKONOS image is compressed into several corresponding image resolutions so that an image pyramid is generated; the line finder, a 1-dimensional template correlation filter applied after Gaussian blurring, then detects road centerlines. Extracted centerline segments either belong to roads or do not, and there are two ways to identify which. The first uses segment grouping to form longer line segments and assigns a possibility to each segment depending on its length and other geometric and photometric attributes; for example, a longer segment implies a higher possibility of being a road. A perceptual-grouping based method is used for road segment linking by a possibility model that takes multiple sources of information into account; here the clues existing in the gaps are considered. The second way to identify the segments is feature detection back in a higher-resolution layer of the image pyramid.

CONTENTS

1. ABSTRACT
2. INTRODUCTION
3. LITERATURE REVIEW
4. STRATEGY
5. IMPLEMENTATION - CODING
6. CONCLUSION & FUTURE WORK
7. REFERENCES


    1. Introduction

    Modern digital technology has made it possible to manipulate multi-dimensional signals with

    systems that range from simple digital circuits to advanced parallel computers. The goal of this

    manipulation can be divided into three categories:

    • Image Processing image in → image out

    • Image Analysis image in → measurements out

    • Image Understanding image in → high-level description out

    Here we will focus on the fundamental concepts of image processing.

    Image understanding requires an approach that differs fundamentally from the theme of our

    discussion. Further, we will restrict ourselves to two–dimensional (2D) image processing

    although most of the concepts and techniques that are to be described can be extended easily

    to three or more dimensions.

    We begin with certain basic definitions. An image defined in the “real world” is considered to

    be a function of two real variables, for example, a(x,y) with a as the amplitude (e.g. brightness)

of the image at the real coordinate position (x,y). An image may be considered to contain sub-images, sometimes referred to as regions-of-interest (ROIs) or simply regions. This concept reflects the fact that images

    frequently contain collections of objects each of which can be the basis for a region. In a

    sophisticated image processing system it should be possible to apply specific image processing

    operations to selected regions. Thus one part of an image (region) might be processed to

    suppress motion blur while another part might be processed to improve color rendition. The

    amplitudes of a given image will almost always be either real numbers or integer numbers. The

    latter is usually a result of a quantization process that converts a continuous range (say,

    between 0 and 100%) to a discrete number of levels. In certain image-forming processes,

    however, the signal may involve photon counting which implies that the amplitude would be

    inherently quantized. In other image forming procedures, such as magnetic resonance imaging,

    the direct physical measurement yields a complex number in the form of a real magnitude and

    a real phase.


    2. Digital Image Definitions

    A digital image a[m,n] described in a 2D discrete space is derived from an analog image a(x,y) in

    a 2D continuous space through a sampling process that is frequently referred to as digitization.

    The mathematics of that sampling process will be described in Section 5. For now we will look

    at some basic definitions

associated with the digital image. The effect of digitization is shown in Figure 2.1.

    The 2D continuous image a(x,y) is divided into N rows and M columns. The intersection of a row

    and a column is termed a pixel. The value assigned to the

    integer coordinates [m,n] with {m=0,1,2,…,M–1} and {n=0,1,2,…,N–1} is a[m,n].

    In fact, in most cases a(x,y)—which we might consider to be the physical signal that impinges

    on the face of a 2D sensor—is actually a function of many variables including depth (z), color (l),

    and time (t). Unless otherwise stated, we will consider the case of 2D, monochromatic, static

    images in this chapter.

Figure 2.1: Digitization of a continuous image. The pixel at coordinates [m=10, n=3] has the integer brightness value 110.

    The image shown in Figure 2.1 has been divided into N = 16 rows and M = 16 columns. The

    value assigned to every pixel is the average brightness in the pixel rounded to the nearest

    integer value. The process of representing the amplitude of the 2D signal at a given coordinate

    as an integer value with L different gray levels is usually referred to as amplitude quantization

    or simply quantization.
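As a minimal sketch of these two steps (assuming numpy; the smooth test pattern and the 0-100% amplitude range are illustrative, not part of the original figure), sampling followed by amplitude quantization can be written as:

```python
import numpy as np

# Hypothetical continuous image a(x, y): a smooth brightness pattern over the unit
# square, with amplitudes in a continuous range of roughly 0 to 100 (percent).
def a(x, y):
    return 50.0 + 50.0 * np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)

M, N = 16, 16                      # columns and rows, as in Figure 2.1
L = 256                            # number of discrete gray levels (B = 8 bits)

# Sampling: evaluate a(x, y) at the pixel centers of an N-row by M-column grid.
m, n = np.meshgrid(np.arange(M), np.arange(N))
samples = a((m + 0.5) / M, (n + 0.5) / N)

# Quantization: map the continuous 0-100 range onto L integer levels.
digital = np.rint(samples / 100.0 * (L - 1)).astype(np.uint8)   # a[m, n]
print(digital.shape, digital.min(), digital.max())
```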



    2.1 COMMON VALUES

    There are standard values for the various parameters encountered in digital image processing.

    These values can be caused by video standards, by algorithmic requirements, or by the desire

to keep digital circuitry simple. Table 1 lists some commonly encountered values.

Parameter     Symbol   Typical values
Rows          N        256, 512, 525, 625, 1024, 1035
Columns       M        256, 512, 768, 1024, 1320
Gray levels   L        2, 64, 256, 1024, 4096, 16384

Table 1: Common values of digital image parameters

Quite frequently we see cases of M = N = 2^K where {K = 8, 9, 10}. This can be motivated by digital circuitry or by the use of certain algorithms such as the (fast) Fourier transform. The number of distinct gray levels is usually a power of 2, that is, L = 2^B, where B is the number of bits in the binary representation of the brightness levels. When B > 1 we speak of a gray-level image; when B = 1 we speak of a binary image. In a binary image there are just two gray levels, which can be referred to, for example, as “black” and “white” or “0” and “1”.

    2.2 CHARACTERISTICS OF IMAGE OPERATIONS

    There is a variety of ways to classify and characterize image operations. The reason for doing so

    is to understand what type of results we might expect to achieve with a given type of operation

    or what might be the computational burden associated with a given operation.

    2.2.1 Types of operations

    The types of operations that can be applied to digital images to transform an input image

    a[m,n] into an output image b[m,n] (or another representation) can be classified into three

    categories as shown in Table 2.


• Point – the output value at a specific coordinate is dependent only on the input value at that same coordinate. Generic complexity: constant per pixel.

• Local – the output value at a specific coordinate is dependent on the input values in the neighborhood of that same coordinate. Generic complexity: P² per pixel.

• Global – the output value at a specific coordinate is dependent on all the values in the input image. Generic complexity: N² per pixel.

Table 2: Types of image operations. Image size = N × N; neighborhood size = P × P. Note that the complexity is specified in operations per pixel.
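The difference in cost is easy to see in code. A small sketch (assuming numpy; image and neighborhood sizes are hypothetical) contrasting a point operation with a P × P local operation:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(64, 64)).astype(np.float64)  # input image a[m, n]

# Point operation: b[m, n] depends only on a[m, n]; constant work per pixel.
b_point = 255.0 * np.sqrt(a / 255.0)

# Local operation: b[m, n] depends on a P x P neighborhood; P^2 work per pixel.
P = 3
pad = np.pad(a, P // 2, mode="edge")
b_local = np.empty_like(a)
for m in range(a.shape[0]):
    for n in range(a.shape[1]):
        b_local[m, n] = pad[m:m + P, n:n + P].mean()
```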

    2.2.2 Types of neighborhoods

    Neighborhood operations play a key role in modern digital image processing. It is therefore

    important to understand how images can be sampled and how that relates to the various

    neighborhoods that can be used to process an image.

• Rectangular sampling – In most cases, images are sampled by laying a rectangular grid over an image as illustrated in Figure 2.1. This results in the type of sampling shown in Figures 2.2a and 2.2b.

• Hexagonal sampling – An alternative sampling scheme, in which the image is sampled on a hexagonal grid, is termed hexagonal sampling.


    Both sampling schemes have been studied extensively [1] and both represent a possible

    periodic tiling of the continuous image space. We will restrict our attention, however, to only

    rectangular sampling as it remains, due to hardware and software considerations, the method

    of choice.

Local operations produce an output pixel value b[m=m0, n=n0] based upon the pixel values in the neighborhood of a[m=m0, n=n0]. Some of the most common neighborhoods are the 4-connected neighborhood and the 8-connected neighborhood in the case of rectangular sampling, and the 6-connected neighborhood in the case of hexagonal sampling, illustrated in Figure 2.2.

Figure 2.2a: rectangular sampling, 4-connected. Figure 2.2b: rectangular sampling, 8-connected.
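A minimal sketch (assuming numpy) of collecting the 4-connected and 8-connected neighbors of a pixel under rectangular sampling; the offset lists are the standard ones, everything else is illustrative:

```python
import numpy as np

# Offsets from a center pixel [m, n] under rectangular sampling.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]                 # 4-connected
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]          # 8-connected

def neighbors(image, m, n, offsets):
    """Return the values of the in-bounds neighbors of image[m, n]."""
    rows, cols = image.shape
    return [image[m + dm, n + dn]
            for dm, dn in offsets
            if 0 <= m + dm < rows and 0 <= n + dn < cols]

a = np.arange(25).reshape(5, 5)
print(neighbors(a, 2, 2, N4))  # [7, 17, 11, 13]
print(neighbors(a, 2, 2, N8))  # adds the four diagonal neighbors
```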


    3. Tools

Certain tools are central to the processing of digital images. These include mathematical tools such as convolution, Fourier analysis, and statistical descriptions, and

    manipulative tools such as chain codes and run codes. We will present these tools without any

    specific motivation. The motivation will follow in later sections.

    3.1 CONVOLUTION

    There are several possible notations to indicate the convolution of two (multidimensional)

    signals to produce an output signal. The most common are:

c = a ⊗ b = a * b (1)

We shall use the first form, c = a ⊗ b, with the following formal definitions.

In 2D continuous space:

c(x, y) = a(x, y) ⊗ b(x, y) = ∫∫ a(χ, ζ) b(x − χ, y − ζ) dχ dζ (2)

In 2D discrete space:

c[m, n] = a[m, n] ⊗ b[m, n] = Σ_j Σ_k a[j, k] b[m − j, n − k] (3)
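A minimal sketch of Eq. (3) (assuming numpy; a library routine such as scipy.signal.convolve2d computes the same result), evaluating the discrete 2D convolution directly:

```python
import numpy as np

def conv2d_full(a, b):
    """Direct evaluation of c[m, n] = sum_j sum_k a[j, k] b[m - j, n - k] ('full' output)."""
    Ma, Na = a.shape
    Mb, Nb = b.shape
    c = np.zeros((Ma + Mb - 1, Na + Nb - 1))
    for j in range(Ma):
        for k in range(Na):
            c[j:j + Mb, k:k + Nb] += a[j, k] * b   # each a[j, k] shifts and scales b
    return c

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[1.0, 1.0], [1.0, 1.0]])
print(conv2d_full(a, b))   # 3x3 'full' convolution result
```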

    3.2 PROPERTIES OF CONVOLUTION

    There are a number of important mathematical properties associated with

    convolution.

    • Convolution is commutative.

c = a ⊗ b = b ⊗ a (4)


    • Convolution is associative.

    c = a ⊗ (b ⊗ d) = (a ⊗ b) ⊗ d = a ⊗ b ⊗ d (5)

    • Convolution is distributive.

c = a ⊗ (b + d) = (a ⊗ b) + (a ⊗ d) (6)

where a, b, c, and d are all images, either continuous or discrete.

    3.3 FOURIER TRANSFORMS

    The Fourier transform produces another representation of a signal, specifically a representation

    as a weighted sum of complex exponentials. Because of Euler’s formula:

e^(jθ) = cos(θ) + j sin(θ) (7)

where j² = −1, we can say that the Fourier transform produces a representation of a (2D) signal

    as a weighted sum of sines and cosines. The defining formulas for the forward Fourier and the

    inverse Fourier transforms are as follows. Given an image a and its Fourier transform A, then

    the forward transform goes from the spatial domain (either continuous or discrete) to the

    frequency domain which is always continuous.

Forward – A = F{a} (8)

    The inverse Fourier transform goes from the frequency domain back to the spatial

    domain.

Inverse – a = F^-1{A} (9)

    The Fourier transform is a unique and invertible operation so that:

a = F^-1{F{a}} and A = F{F^-1{A}} (10)

    The specific formulas for transforming back and forth between the spatial domain and the

    frequency domain are given below.

    In 2D continuous space:

Forward – A(u, v) = ∫∫ a(x, y) e^(−j(ux+vy)) dx dy (11)

Inverse – a(x, y) = (1/4π²) ∫∫ A(u, v) e^(+j(ux+vy)) du dv (12)

    In 2D discrete space:


Forward – A(Ω, Ψ) = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} a[m, n] e^(−j(Ωm+Ψn)) (13)

Inverse – a[m, n] = (1/4π²) ∫_{−π}^{π} ∫_{−π}^{π} A(Ω, Ψ) e^(+j(Ωm+Ψn)) dΩ dΨ (14)
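A quick numerical sketch of the transform pair (assuming numpy; np.fft evaluates Eq. (13) on the discrete frequency grid Ω = 2πu/M, Ψ = 2πv/N):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 8))        # hypothetical 2D discrete signal a[m, n]

A = np.fft.fft2(a)                     # forward transform, Eq. (13) on a frequency grid
a_back = np.fft.ifft2(A)               # inverse transform, Eq. (14)

# The transform pair is invertible: a = F^-1{F{a}} up to floating-point error.
print(np.allclose(a, a_back.real))     # True
```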

    3.4 PROPERTIES OF FOURIER TRANSFORMS

    There are a variety of properties associated with the Fourier transform and the

    inverse Fourier transform. The following are some of the most relevant for digital

    image processing.

    • The Fourier transform is, in general, a complex function of the real frequency

    variables. As such the transform can be written in terms of its magnitude and

    phase.

A(u, v) = |A(u, v)| e^(jφ(u,v))     A(Ω, Ψ) = |A(Ω, Ψ)| e^(jφ(Ω,Ψ)) (15)

    • A 2D signal can also be complex and thus written in terms of its magnitude and phase.

a(x, y) = |a(x, y)| e^(jϑ(x,y))     a[m, n] = |a[m, n]| e^(jϑ[m,n]) (16)

    • If a 2D signal is real, then the Fourier transform has certain symmetries.

A(u, v) = A*(−u, −v)     A(Ω, Ψ) = A*(−Ω, −Ψ) (17)

    The symbol (*) indicates complex conjugation. For real signals eq. (17) leads

    directly to:

|A(u, v)| = |A(−u, −v)| (18)

    • If a 2D signal is real and even, then the Fourier transform is real and even.

A(u, v) = A(−u, −v)     A(Ω, Ψ) = A(−Ω, −Ψ) (19)

    • The Fourier and the inverse Fourier transforms are linear operations.

F{w1a + w2b} = F{w1a} + F{w2b} = w1A + w2B


F^-1{w1A + w2B} = F^-1{w1A} + F^-1{w2B} = w1a + w2b (20)

    where a and b are 2D signals (images) and w1 and w2 are arbitrary, complex

    constants.

• The Fourier transform in discrete space, A(Ω, Ψ), is periodic in both Ω and Ψ. Both periods are 2π.

A(Ω + 2πj, Ψ + 2πk) = A(Ω, Ψ)    j, k integers (21)

    • The energy, E, in a signal can be measured either in the spatial domain or the

    frequency domain. For a signal with finite energy:

    Parseval’s theorem (2D continuous space):

E = ∫∫ |a(x, y)|² dx dy = (1/4π²) ∫∫ |A(u, v)|² du dv (22)

    This “signal energy” is not to be confused with the physical energy in the phenomenon that

    produced the signal. If, for example, the value a[m,n] represents a photon count, then the

    physical energy is proportional to the amplitude, a, and not the square of the amplitude. This is

    generally the case in video imaging.
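For the discrete transform computed by np.fft, the counterpart of Eq. (22) is Σ|a[m,n]|² = (1/MN) Σ|A|²; a quick numerical check (assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal((16, 16))
A = np.fft.fft2(a)

E_spatial = np.sum(np.abs(a) ** 2)
E_freq = np.sum(np.abs(A) ** 2) / a.size   # 1/(M*N) normalization for the DFT

print(np.isclose(E_spatial, E_freq))       # True: energy agrees in both domains
```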

    3.4.1 Importance of phase and magnitude

    Equation (15) indicates that the Fourier transform of an image can be complex.

    This is illustrated below in Figures 4a-c. Figure 4a shows the original image a[m,n], Figure 4b the

    magnitude in a scaled form as log(|A(Ω,Ψ)|), and Figure 4c the phase j(Ω,Ψ).

Figure 4a: original. Figure 4b: log(|A(Ω,Ψ)|). Figure 4c: Φ(Ω,Ψ).


    Both the magnitude and the phase functions are necessary for the complete reconstruction of

    an image from its Fourier transform. Figure 5a shows what happens when Figure 4a is restored

    solely on the basis of the magnitude information and Figure 5b shows what happens when

    Figure 4a is restored solely on the basis of the phase information.

Figure 5a: Φ(Ω,Ψ) = 0 (restoration from magnitude only). Figure 5b: |A(Ω,Ψ)| = constant (restoration from phase only).

    Neither the magnitude information nor the phase information is sufficient to restore the image.

    The magnitude–only image (Figure 5a) is unrecognizable and has severe dynamic range

    problems. The phase-only image (Figure 5b) is barely recognizable, that is, severely degraded in

    quality.
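A sketch of this experiment (assuming numpy; a random array stands in for the image of Figure 4a):

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.random((32, 32))               # stand-in for the original image a[m, n]
A = np.fft.fft2(a)

mag, phase = np.abs(A), np.angle(A)

# Magnitude-only restoration (phase set to zero, as in Figure 5a).
a_mag = np.fft.ifft2(mag).real

# Phase-only restoration (magnitude set to a constant, as in Figure 5b).
a_phase = np.fft.ifft2(np.exp(1j * phase)).real

# Neither partial reconstruction matches the original image.
print(np.allclose(a, a_mag), np.allclose(a, a_phase))   # False False
```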

    3.4.2 Circularly symmetric signals

    An arbitrary 2D signal a(x,y) can always be written in a polar coordinate system as

    a(r,θ). When the 2D signal exhibits a circular symmetry this means that:

    a(x, y) = a(r,θ) = a(r ) (23)

where r² = x² + y² and tan θ = y/x. As a number of physical systems such as lenses

    exhibit circular symmetry, it is useful to be able to compute an appropriate Fourier

    representation.

    The Fourier transform A(u, v) can be written in polar coordinates A(ωr,ξ) and then, for a circularly symmetric signal, rewritten as a Hankel transform:

A(u, v) = F{a(x, y)} = 2π ∫₀^∞ a(r) J₀(ω_r r) r dr = A(ω_r) (24)

where ω_r² = u² + v², tan ξ = v/u, and J₀(•) is a Bessel function of the first kind of order zero.

    The inverse Hankel transform is given by:


a(r) = (1/2π) ∫₀^∞ A(ω_r) J₀(ω_r r) ω_r dω_r (25)

    The Fourier transform of a circularly symmetric 2D signal is a function of only the radial

    frequency, ωr. The dependence on the angular frequency, ξ, has vanished. Further, if a(x,y) = a(r) is real, then it is automatically even due to the circular symmetry.

    According to equation (19), A(ωr) will then be real and even.
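A numerical sketch of Eq. (24) (assuming numpy and scipy.special.j0; the Gaussian test signal is chosen because its transform, 2π e^(−ω_r²/2), is known in closed form):

```python
import numpy as np
from scipy.special import j0   # Bessel function of the first kind, order zero

r = np.linspace(0.0, 12.0, 4000)
dr = r[1] - r[0]
a_r = np.exp(-r ** 2 / 2.0)    # circularly symmetric signal a(r)

def hankel(w):
    # A(w_r) = 2*pi * integral_0^inf a(r) J0(w_r r) r dr, Eq. (24), via a Riemann sum.
    return 2.0 * np.pi * np.sum(a_r * j0(w * r) * r) * dr

for w in (0.0, 1.0, 2.0):
    print(round(hankel(w), 4), round(2.0 * np.pi * np.exp(-w ** 2 / 2.0), 4))
```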

    3.4.3 Examples of 2D signals and transforms

    Table 4 shows some basic and useful signals and their 2D Fourier transforms. In

    using the table entries in the remainder of this chapter we will refer to a spatial

    domain term as the point spread function (PSF) or the 2D impulse response and its

    Fourier transforms as the optical transfer function (OTF) or simply transfer

    function. Two standard signals used in this table are u(•), the unit step function, and

    J1(•), the Bessel function of the first kind. Circularly symmetric signals are treated

    as functions of r as in eq. (23).


    4. GEOMETRICAL IMAGE RESAMPLING

    As noted in the preceding sections of this chapter, the reverse address computation process

    usually results in an address result lying between known pixel values of an input image. Thus it

    is necessary to estimate the unknown pixel amplitude from its known neighbors. This process is

    related to the image reconstruction task, as described in Chapter 4, in which a space-

    continuous display is generated from an array of image samples. However, the geometrical

    resampling process is usually not spatially regular. Furthermore, the process is discrete to

    discrete; only one output pixel is produced for each input address.

    In this section, consideration is given to the general geometrical resampling process in which

output pixels are estimated by interpolation of input pixels. The special, but common, case of image magnification by an integer zooming factor is also discussed. In this case, it is possible to perform pixel estimation by convolution.

    4.1. Interpolation Methods

    The simplest form of resampling interpolation is to choose the amplitude of an output image

    pixel to be the amplitude of the input pixel nearest to the reverse address. This process, called

nearest-neighbor interpolation, can result in a spatial offset error of as much as 1/2 pixel unit. The

    resampling interpolation error can be significantly reduced by utilizing all four nearest

    neighbors in the interpolation. A common approach, called bilinear interpolation, is to

interpolate linearly along each row of an image and then interpolate that result linearly in the columnar direction. The estimated pixel is easily found to be

F(p2, q2) = (1 − a)[(1 − b)F(p, q) + bF(p, q + 1)] + a[(1 − b)F(p + 1, q) + bF(p + 1, q + 1)] (4.1-1)

    Although the horizontal and vertical interpolation operations are each linear, in general,

    their sequential application results in a nonlinear surface fit between the four neighboring

    pixels.
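A minimal sketch of Eq. (4.1-1) (assuming numpy; names follow the equation, with a and b the fractional offsets of the reverse address):

```python
import numpy as np

def bilinear(F, p2, q2):
    """Bilinear interpolation of image F at the real-valued reverse address (p2, q2)."""
    p, q = int(np.floor(p2)), int(np.floor(q2))   # nearest lower-index sample
    a, b = p2 - p, q2 - q                         # fractional offsets of Eq. (4.1-1)
    return ((1 - a) * ((1 - b) * F[p, q] + b * F[p, q + 1])
            + a * ((1 - b) * F[p + 1, q] + b * F[p + 1, q + 1]))

F = np.array([[10.0, 20.0], [30.0, 40.0]])
print(bilinear(F, 0.5, 0.5))   # 25.0, the average of the four neighbors
```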


The expression for bilinear interpolation of Eq. 4.1-1 can be generalized for any

    interpolation function that is zero-valued outside the range of sample spacing. With this

    generalization, interpolation can be considered as the summing of four weighted interpolation

    functions as given by

F(p2, q2) = F(p, q)R{−a}R{b} + F(p, q + 1)R{−a}R{−(1 − b)}
+ F(p + 1, q)R{1 − a}R{b} + F(p + 1, q + 1)R{1 − a}R{−(1 − b)} (4.1-2)

In the special case of linear interpolation, R{x} is the linear interpolation function defined in Eq. 4.3-2. Typically, for reasons of computational complexity, resampling interpolation is limited to a small pixel neighborhood.

Figure 4.1-2 defines a generalized bicubic interpolation neighborhood in which the pixel (p, q) is the nearest neighbor to the pixel to be interpolated. The interpolated pixel may be expressed in the compact form

F(p2, q2) = Σ_m Σ_n F(p + m, q + n) R_C{m − a} R_C{−(n − b)} (4.1-3)

where R_C denotes a bicubic interpolation function such as a cubic B-spline or cubic interpolation function, as defined in Section 4.3-2.

    4.2. Convolution Methods

    When an image is to be magnified by an integer zoom factor, pixel estimation can be

    implemented efficiently by convolution (12). As an example, consider image magnification by a

    factor of 2:1. This operation can be accomplished in two stages. First, the input image is

    transferred to an array in which rows and columns of zeros are interleaved with the input

    image data as follows:


FIGURE 4-3. Interpolation kernels for 2:1 magnification.

FIGURE 4-4. Image interpolation on the mandrill_mon image for 2:1 magnification: (a) original; (b) zero-interleaved quadrant; (c) peg; (d) pyramid; (e) bell; (f) cubic B-spline.

    Next, the zero-interleaved neighborhood image is convolved with one of the discrete

    interpolation kernels listed in Figure 4-3. Figure 4-4 presents the magnification results for

    several interpolation kernels. The inevitable visual trade-off between the interpolation error

    (the jaggy line artifacts) and the loss of high spatial frequency detail in the image is apparent

    from the examples.

    This discrete convolution operation can easily be extended to higher-order magnification

    factors. For N:1 magnification, the core kernel is a peg array. For large kernels it may be more

    computationally efficient in many cases, to perform the interpolation indirectly by Fourier

    domain filtering rather than by convolution (6).
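A minimal sketch of the two-stage 2:1 magnification (assuming numpy; the 3 × 3 pyramid kernel, which yields bilinear interpolation, is one of the kernels of Figure 4-3):

```python
import numpy as np

def magnify_2x(F):
    """2:1 magnification: zero-interleave, then convolve with a pyramid kernel."""
    # Stage 1: interleave rows and columns of zeros with the input samples.
    M, N = F.shape
    G = np.zeros((2 * M, 2 * N))
    G[::2, ::2] = F

    # Stage 2: convolve with the 3x3 pyramid (bilinear) interpolation kernel.
    K = np.array([[0.25, 0.5, 0.25],
                  [0.50, 1.0, 0.50],
                  [0.25, 0.5, 0.25]])
    Gp = np.pad(G, 1)
    out = np.zeros_like(G)
    for i in range(G.shape[0]):
        for j in range(G.shape[1]):
            out[i, j] = np.sum(Gp[i:i + 3, j:j + 3] * K)
    return out

F = np.array([[10.0, 20.0], [30.0, 40.0]])
print(magnify_2x(F))   # original samples preserved, neighbors interpolated
```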


    IMAGE IMPROVEMENT

    The use of digital processing techniques for image improvement has received much interest

    with the publicity given to applications in space imagery and medical research. Other

    applications include image improvement for photographic surveys and industrial radiographic

    analysis.

    Image improvement is a term coined to denote three types of image manipulation processes:

    image enhancement, image restoration, and geometrical image modification.

    Image enhancement entails operations that improve the appearance to a human viewer, or

    operations to convert an image to a format better suited to machine processing. Image

    restoration has commonly been defined as the modification of an observed image in order to compensate for defects in the imaging system that produced the observed image. Geometrical image modification includes image magnification, minification, rotation, and nonlinear spatial

    warping.

    Chapter 4 describes several techniques of monochrome and color image enhancement. The

    chapters that follow develop models for image formation and restoration, and present methods

    of point and spatial image restoration. The final chapter of this part considers geometrical

    image modification.


5. IMAGE ENHANCEMENT

    Image enhancement processes consist of a collection of techniques that seek to improve the

    visual appearance of an image or to convert the image to a form better suited for analysis by a

    human or a machine. In an image enhancement system, there is no conscious effort to improve

    the fidelity of a reproduced image with regard to some ideal form of the image, as is done in

    image restoration. Actually, there is some evidence to indicate that often a distorted image, for

    example, an image with amplitude overshoot and undershoot about its object edges, is more

    subjectively pleasing than a perfectly reproduced original.

    For image analysis purposes, the definition of image enhancement stops short of information

    extraction. As an example, an image enhancement system might emphasize the edge outline of

    objects in an image by high-frequency filtering. This edge-enhanced image would then serve as

    an input to a machine that would trace the outline of the edges, and perhaps make

    measurements of the shape and size of the outline. In this application, the image enhancement

    processor would emphasize salient features of the original image and simplify the processing

    task of a data extraction machine.

    There is no general unifying theory of image enhancement at present because there is no

    general standard of image quality that can serve as a design criterion for an image

    enhancement processor. Consideration is given here to a variety of techniques that have

    proved useful for human observation improvement and image analysis.

    5.1. CONTRAST MANIPULATION

    One of the most common defects of photographic or electronic images is poor contrast resulting

    from a reduced, and perhaps nonlinear, image amplitude range. Image contrast can often be

    improved by amplitude rescaling of each pixel (1,2).

    Figure 5.1-1a illustrates a transfer function for contrast enhancement of a typical continuous

    amplitude low-contrast image. For continuous amplitude images, the transfer function operator

    can be implemented by photographic techniques, but it is often difficult to realize an arbitrary

transfer function accurately. For quantized amplitude images, implementation of the transfer function is a relatively simple task.

FIGURE 5.1-1. Continuous and quantized image contrast enhancement.

However, in the design of the transfer function operator, consideration

must be given to the effects of amplitude quantization. With reference to Figure 5.1-1b, suppose

    that an original image is quantized to J levels, but it occupies a smaller range. The output image

    is also assumed to be restricted to J levels, and the mapping is linear. In the mapping strategy

    indicated in Figure 5.1-1b, the output level chosen is that level closest to the exact mapping of

    an input level. It is obvious from the diagram that the output image will have unoccupied levels

    within its range, and some of the gray scale transitions will be larger than in the original image.

    The latter effect may result in noticeable gray scale contouring. If the output image is quantized

    to more levels than the input image, it is possible to approach a linear placement of output

    levels, and hence, decrease the gray scale contouring effect.


FIGURE 5.1-2. Image scaling methods; (c) absolute value scaling.

    5.1.1. Amplitude Scaling

    A digitally processed image may occupy a range different from the range of the original image.

    In fact, the numerical range of the processed image may encompass negative values, which

    cannot be mapped directly into a light intensity range. Figure 5.1-2 illustrates several

    possibilities of scaling an output image back into the domain of values occupied by the original

    image. By the first technique, the processed image is linearly mapped over its entire range,

    while by the second technique, the extreme amplitude values of the processed image are

    clipped to maximum and minimum limits. The second technique is often subjectively

    preferable, especially for images in which a relatively small number of pixels exceed the limits.

    Contrast enhancement algorithms often possess an option to clip a fixed percentage of the

    amplitude values on each end of the amplitude scale. In medical image enhancement

    applications, the contrast modification operation shown in Figure 5.2-2b, for , a ≥ b is called a

    window-level transformation. The window value is the width of the linear slope,b-a ; the level is

    located at the midpoint c of the slope line. The third technique of amplitude scaling, shown in

    Figure 5.1-2c, utilizes an absolute value transformation for visualizing an image with negatively

    valued pixels. This is a useful transformation for systems that utilize the two's complement


FIGURE 5.1-4. Window-level contrast stretching of an earth satellite image; (e) min. clip = 0.24, max. clip = 0.35; (f) enhancement histogram.

    5.1.2. Contrast Modification

Section 5.1.1 dealt with amplitude scaling of images that do not properly utilize the dynamic

    range of a display; they may lie partly outside the dynamic range or occupy only a portion of the

    dynamic range. In this section, attention is directed to point transformations that modify the

    contrast of an image within a display's dynamic range.

Figure 5.1-5a contains an original image of a jet aircraft that has been digitized to 256

    gray levels and numerically scaled over the range of 0.0 (black) to 1.0 (white).

FIGURE 5.1-5. (a) Original; (b) original histogram.


FIGURE 5.1-6. Square and cube contrast modification of the jet_mon image; (c) cube function; (d) cube output.

G(j, k) = [F(j, k)]^p (5.1-1)

where 0.0 ≤ F(j, k) ≤ 1.0 represents the original image and p is the power law variable. It is

    important that the amplitude limits of Eq. 5.1-1 be observed; processing of the integer code

    (e.g., 0 to 255) by Eq. 5.1-1 will give erroneous results. The square function provides the best

    visual result. The rubber band transfer function shown in Figure 5.1-8a provides a simple

    piecewise linear approximation to the power law curves. It is often useful in interactive

    enhancement machines in which the inflection point is interactively placed.

    The Gaussian error function behaves like a square function for low-amplitude pixels and

    like a square root function for high- amplitude pixels. It is defined as

G(j, k) = [erf{(F(j, k) − 0.5)/(σ√2)} + erf{0.5/(σ√2)}] / [2 erf{0.5/(σ√2)}] (5.1-2a)
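A sketch of the power-law transfer function of Eq. (5.1-1) (assuming numpy; note that the image must first be scaled to the unit range, as the text requires):

```python
import numpy as np

def power_law(F, p):
    """Contrast modification G = F^p on a unit-range image, Eq. (5.1-1)."""
    F = np.asarray(F, dtype=np.float64)
    assert F.min() >= 0.0 and F.max() <= 1.0, "scale to [0, 1] first; integer codes give wrong results"
    return F ** p

img8 = np.array([[0, 64], [128, 255]], dtype=np.uint8)   # hypothetical 8-bit image
unit = img8 / 255.0                 # observe the amplitude limits of Eq. 5.1-1
square = power_law(unit, 2.0)       # square-function contrast modification
cube = power_law(unit, 3.0)         # cube function
print(np.rint(square * 255).astype(np.uint8))
```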

    IMAGE ANALYSIS

    Image analysis is concerned with the extraction of measurements, data or information from an

    image by automatic or semiautomatic methods. In the literature, this field has been called

    image data extraction, scene analysis, image description, automatic photo interpretation,

    image understanding, and a variety of other names.

    Image analysis is distinguished from other types of image processing, such as coding,

    restoration, and enhancement, in that the ultimate product of an image analysis system is

    usually numerical output rather than a picture.


6. LINEAR PROCESSING TECHNIQUES

    Most discrete image processing computational algorithms are linear in nature; an output image

    array is produced by a weighted linear combination of elements of an input array. The

    popularity of linear operations stems from the relative simplicity of spatial linear processing as

    opposed to spatial nonlinear processing. However, for image processing operations,

    conventional linear processing is often computationally infeasible without efficient

    computational algorithms because of the large image arrays. This chapter considers indirect

    computational techniques that permit more efficient linear processing than by conventional

    methods.

    6.1. TRANSFORM DOMAIN PROCESSING

    Two-dimensional linear transformations have been defined in Section 5.4 in series form as

P(m1, m2) = Σ_n1 Σ_n2 F(n1, n2) T(n1, n2; m1, m2) (6.1-1)

    and defined in vector form as

    p=Tf (6.1-2)

    It will now be demonstrated that such linear transformations can often be computed more

    efficiently by an indirect computational procedure utilizing two-dimensional unitary transforms

    than by the direct computation indicated by Eq. 6.1-1 or 6.1-2.


    FIGURE 6.1-1. Direct processing and generalized linear filtering; series formulation.

    Figure 6.1-1 is a block diagram of the indirect computation technique called generalized linear

filtering (1). In the process, the input array F(n1, n2) undergoes a two-dimensional unitary transformation, resulting in an array of transform coefficients F(u1, u2). Next, a linear

    combination of these coefficients is taken according to the general relation

F̃(w1, w2) = Σ_u1 Σ_u2 F(u1, u2) T(u1, u2; w1, w2) (6.1-3)

where T(u1, u2; w1, w2) represents the linear filtering transformation function. Finally, an inverse unitary transformation is performed to reconstruct the processed array P(m1, m2). If this computational procedure is to be more efficient than direct computation by Eq. 6.1-1, it is

    necessary that fast computational algorithms exist for the unitary transformation, and also the

    kernel must be reasonably sparse; that is, it must contain many zero elements.

The generalized linear filtering process can also be defined in terms of vector-space computations as shown in Figure 6.1-2. For notational simplicity, let N1 = N2 = N and M1 = M2 = M. Then the generalized linear filtering process can be described by the equations

f̃ = [A_N²] f (6.1-4a)

g̃ = T f̃ (6.1-4b)

p = [A_M²]^-1 g̃ (6.1-4c)


FIGURE 6.1-2. Direct processing and generalized linear filtering; vector formulation.

where [A_N²] is an N² × N² unitary transform matrix, T is an M² × N² linear filtering transform operation, and [A_M²] is an M² × M² unitary transform matrix. From Eq. 6.1-4, the input and output vectors are related by

p = [A_M²]^-1 T [A_N²] f (6.1-5)
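As a concrete, minimal instance of Eq. (6.1-5) (assuming numpy; the orthonormal 2D DFT stands in for the unitary matrices, and a diagonal T, here a crude low-pass mask, is the sparse special case that makes the indirect route efficient):

```python
import numpy as np

rng = np.random.default_rng(4)
f = rng.standard_normal((8, 8))             # input array

F = np.fft.fft2(f, norm="ortho")            # forward unitary transform (role of [A_N2])

# Sparse (diagonal) filtering transformation T: one weight per coefficient.
# Here a crude low-pass mask that keeps only the lowest frequencies per axis.
T = np.zeros((8, 8))
T[:2, :2] = 1.0
T[:2, -2:] = 1.0
T[-2:, :2] = 1.0
T[-2:, -2:] = 1.0

p = np.fft.ifft2(T * F, norm="ortho").real  # inverse unitary transform ([A_M2]^-1)
print(p.round(2))
```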

    6.2. TRANSFORM DOMAIN SUPERPOSITION

    The superposition operations can often be performed more efficiently by transform domain

    processing rather than by direct processing. Figure 6.2-1a and b illustrate block diagrams of the

    computational steps involved in direct finite area or sampled image superposition. In Figure

    6.2-1d and e, an alternative form of processing is illustrated in which a unitary transformation

    operation is performed on the data vector f before multiplication by a finite area filter matrix D

    or sampled image filter matrix B . An inverse transform reconstructs the output vector. From

    Figure 6.2-1, for finite-area superposition, because

q = D f (6.2-1a)

and

q = [A_M²]^-1 𝒟 [A_N²] f (6.2-1b)

then clearly the finite-area filter matrix may be expressed in the transform domain as

𝒟 = [A_M²] D [A_N²]^-1 (6.2-2)


FIGURE 6.2-2. One-dimensional Fourier and Hadamard domain convolution matrices: (a) finite length convolution; (b) sampled data convolution; (c) circulant convolution.

    Figure 6.2-2 shows the Fourier and Hadamard domain filter matrices for the three forms of

    convolution for a one-dimensional input vector and a Gaussian-shaped impulse response (6). As

    expected, the transform domain representations are much more sparse than the data domain

    representations. Also, the Fourier domain circulant convolution filter is seen to be of diagonal

    form. Figure 6.2-3 illustrates the structure of the three convolution matrices for two-

    dimensional convolution (4).


    6.3. FAST FOURIER TRANSFORM CONVOLUTION

    As noted previously, the equivalent output vector for either finite-area or sampled image

    convolution can be obtained by an element selection operation on the extended output vector

    kE for circulant convolution or its matrix counterpart KE.

FIGURE 6.2-3. Two-dimensional Fourier domain convolution matrices (spatial domain and Fourier domain): (a) finite-area convolution; (b) sampled image convolution; (c) circulant convolution.

    This result, combined with Eq. 6.2-13, leads to a particularly efficient means of convolution

    computation indicated by the following steps:

1. Embed the impulse response matrix in the upper left corner of an all-zero J × J matrix, with J ≥ N for finite-area convolution or J ≥ M for sampled infinite-area convolution, and take the two-dimensional Fourier transform of the extended impulse response matrix, giving

ℋ_E = A_J H_E A_J (6.3-1)

2. Embed the input data array in the upper left corner of an all-zero J × J matrix, and take the two-dimensional Fourier transform of the extended input data matrix to obtain

ℱ_E = A_J F_E A_J (6.3-2)

3. Perform the scalar multiplication

𝒦_E(m, n) = J ℋ_E(m, n) ℱ_E(m, n) (6.3-3)

4. Take the inverse Fourier transform

K_E = [A_J]^-1 𝒦_E [A_J]^-1 (6.3-4)

5. Extract the desired output matrix

Q = [S_J^1(M)] K_E [S_J^1(M)]^T (6.3-5)
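A numerical sketch of steps 1 to 5 (assuming numpy; np.fft plays the role of the unitary matrices A_J, and its normalization absorbs the scalar J of Eq. 6.3-3):

```python
import numpy as np

def fft_convolve(F, H):
    """Fourier-domain evaluation of finite-area convolution, following steps 1-5."""
    N, _ = F.shape
    L, _ = H.shape
    J = N + L - 1                      # extended size large enough to avoid wraparound

    # Steps 1-2: zero-embed impulse response and data (s=(J, J) pads with zeros),
    # then take the two-dimensional Fourier transforms.
    HE = np.fft.fft2(H, s=(J, J))
    FE = np.fft.fft2(F, s=(J, J))

    # Step 3: scalar (elementwise) multiplication in the Fourier domain.
    KE = HE * FE

    # Step 4: inverse transform; step 5: the full J x J array is the D-type output.
    return np.fft.ifft2(KE).real

F = np.arange(16.0).reshape(4, 4)          # hypothetical input data array
H = np.array([[1.0, 1.0], [1.0, 1.0]])     # hypothetical impulse response
print(fft_convolve(F, H).round(6))         # matches direct finite-area convolution
```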

    It is important that the size of the extended arrays in steps 1 and 2 be chosen large enough to

satisfy the inequalities indicated. If the computational steps are performed with J = N, the resulting output array, shown in Figure 6.3-1, will contain erroneous terms in a boundary region of width L − 1 elements, on the top and left-hand side of the output field. This is the

    wraparound error associated with incorrect use of the Fourier domain convolution method. In

    addition, for finite area (D-type) convolution, the bottom and right-hand-side strip of output

    elements will be missing. If the computation is performed with J = M, the output array will be

    completely filled with the correct terms for D-type convolution. To force J = M for B-type

    convolution, it is necessary to truncate the bottom and right-hand side of the input array. As a

    consequence, the top and left-hand-side elements of the output array are erroneous.


    Fourier domain processing is more computationally efficient than direct processing for image

    convolution if the impulse response is sufficiently large. However, if the image to be processed

is large, the relative computational advantage of Fourier domain processing diminishes. Also, there are attendant problems of computational accuracy with large Fourier transforms.

FIGURE 6.3-5. Comparison of direct and Fourier domain processing for finite-area convolution.

Both difficulties can be alleviated by a block-mode

    filtering technique in which a large image is separately processed in adjacent overlapped blocks

    (2, 7–9).

Figure 6.3-6a illustrates the extraction of a pixel block from the upper left corner of a large image array. After convolution with an impulse response, the resulting pixel block is placed in the upper left corner of an output



    data array as indicated in Figure 6.3-6a. Next, a second block of pixels is extracted from the

    input array to produce a second block of output pixels that will lie adjacent to the first block. As

    indicated in Figure 6.3-6b, this second input block must be overlapped by (L – 1) pixels in order

    to generate an adjacent output block. The computational process then proceeds until all input

    blocks are filled along the first row. If a partial input block remains along the row, zero-value

    elements can be added to complete the block. Next, an input block, overlapped by (L –1) pixels

    with the first row blocks, is extracted to produce the first block of the second output row. The

    algorithm continues in this fashion until all output points are computed.


7. IMAGE RESTORATION MODELS

    Image restoration may be viewed as an estimation process in which operations are performed

    on an observed or measured image field to estimate the ideal image field that would be

    observed if no image degradation were present in an imaging system. Mathematical models are

    described in this chapter for image degradation in general classes of imaging systems. These

    models are then utilized in subsequent chapters as a basis for the development of image

    restoration techniques.

    7.1. GENERAL IMAGE RESTORATION MODELS

    In order effectively to design a digital image restoration system, it is necessary quantitatively to

    characterize the image degradation effects of the physical imaging system, the image digitizer,

    and the image display. Basically, the procedure is to model the image degradation effects and

    then perform operations to undo the model to obtain a restored image. It should be

    emphasized that accurate image modeling is often the key to effective image restoration. There

    are two basic approaches to the modeling of image degradation effects: a priori modeling and a

    posteriori modeling.

    In the former case, measurements are made on the physical imaging system, digitizer, and

    display to determine their response for an arbitrary image field. In some instances it will be

    possible to model the system response deterministically, while in other situations it will only be

    possible to determine the system response in a stochastic sense. The a posteriori modeling

    approach is to develop the model for the image degradations based on measurements of a

    particular image to be restored.

    Basically, these two approaches differ only in the manner in which information is gathered to

    describe the character of the image degradation.


    FIGURE 7.1-1. Digital image restoration model.

    Figure 7.1-1 shows a general model of a digital imaging system and restoration process. In the

    model, a continuous image light distribution dependent on spatial coordinates (x, y), time (t),

    and spectral wavelength is assumed to exist as the driving force of a physical imaging system

    subject to point and spatial degradation effects and corrupted by deterministic and stochastic

    disturbances. Potential degradations include diffraction in the optical system, sensor

    nonlinearities, optical system aberrations, film nonlinearities, atmospheric turbulence effects,

    image motion blur, and geometric distortion. Noise disturbances may be caused by electronic

imaging sensors or film granularity. In this model, the physical imaging system produces a set of output image fields at time instant t_j described by the general relation

F_O^(i)(x, y, t_j) = O_P{C(x, y, t, λ)} (7.1-1)

where O_P{·} represents a general operator that is dependent on the space coordinates (x, y), the time history (t), the wavelength (λ), and the amplitude of the light distribution (C). For a monochrome imaging system there will only be a single output field, while for a natural color imaging system, F_O^(i) may denote the red, green, and blue tristimulus bands for i = 1, 2, 3, respectively. Multispectral imagery may also involve several output bands of data.

In the general model of Figure 7.1-1, each observed image field is digitized, following the techniques outlined in Part 3, to produce an array of image samples at each time instant t_j. The output samples of the digitizer are related to the input observed field by

F_S^(i)(m1, m2, t_j) = O_G{F_O^(i)(x, y, t_j)} (7.1-2)

where O_G{·} is an operator modeling the image digitization process. A digital image restoration system that follows produces an output array by the transformation


F̂_I^(i)(x, y, t_j) = O_D{F_K^(i)(k1, k2, t_j)} (7.1-3)

where O_D{·} models the display transformation.

    7.2. OPTICAL SYSTEMS MODELS

    One of the major advances in the field of optics during the past 40 years has been the

    application of system concepts to optical imaging. Imaging devices consisting of lenses, mirrors,

    prisms, and so on, can be considered to provide a deterministic transformation of an input

    spatial light distribution to some output spatial light distribution.

    Also, the system concept can be extended to encompass the spatial propagation of light

    through free space or some dielectric medium.

    In the study of geometric optics, it is assumed that light rays always travel in a straight-line path

    in a homogeneous medium. By this assumption, a bundle of rays passing through a clear

    aperture onto a screen produces a geometric light projection of the aperture. However, if the

light distribution at the region between the light and dark areas on the screen is examined in detail, it is found that the boundary is not sharp. This effect is more pronounced as the aperture size is decreased. For a pinhole aperture, the entire screen appears diffusely illuminated.

FIGURE 7.2-1. Generalized optical imaging system.

From a simplistic viewpoint, the aperture causes a

    bending of rays called diffraction. Diffraction of light can be quantitatively characterized by

    considering light as electromagnetic radiation that satisfies Maxwell's equations. The

    formulation of a complete theory of optical imaging from the basic electromagnetic principles

    of diffraction theory is a complex and lengthy task. In the following, only the key points of the

    formulation are presented; details may be found in References 1 to 3.

Figure 7.2-1 is a diagram of a generalized optical imaging system. A point in the object plane at coordinate (x_o, y_o) of intensity I_o(x_o, y_o) radiates energy toward an imaging system characterized by an entrance pupil, exit pupil, and intervening system transformation. Electromagnetic waves emanating from the optical system are focused to a point on the image


plane, producing an intensity I_i(x_i, y_i). The imaging system is said to be diffraction limited if the

    light distribution at the image plane produced by a point-source object consists of a converging

    spherical wave whose extent is limited only by the exit pupil. If the wavefront of the

    electromagnetic radiation emanating from the exit pupil is not spherical, the optical system is

    said to possess aberrations.

    7.3. DISCRETE IMAGE RESTORATION MODELS

    This chapter began with an introduction to a general model of an imaging system and a digital

    restoration process. Next, typical components of the imaging system were described and

    modeled within the context of the general model. Now, the discussion turns to the

    development of several discrete image restoration models. In the development of these

    models, it is assumed that the spectral wavelength response and temporal response

    characteristics of the physical imaging system can be separated from the spatial and point

    characteristics. The following discussion considers only spatial and point characteristics.

    After each element of the digital image restoration system of Figure 7.1-1 is modeled, following

    the techniques described previously, the restoration system may be conceptually distilled to

    three equations:

Observed image:

F_S(m1, m2) = O_M{F_I(n1, n2), N_1(m1, m2), ..., N_N(m1, m2)} (7.3-1a)

Compensated image:

F_K(k1, k2) = O_R{F_S(m1, m2)} (7.3-1b)

Restored image:

F̂_I(n1, n2) = O_D{F_K(k1, k2)} (7.3-1c)

where F_S represents an array of observed image samples, F_I and F̂_I are arrays of ideal image points and estimates, respectively, F_K is an array of compensated image points from the digital restoration system, N_i denotes arrays of noise samples from various system elements, and O_M{·}, O_R{·}, O_D{·} represent general transfer functions of the imaging system, restoration processor, and display system, respectively.

Vector-space equivalents of Eq. 7.3-1 can be formed for purposes of analysis by column scanning of the arrays of Eq. 7.3-1.

    Several estimation approaches to the solution of 7.3-1 or 7.3-2 are described in the following

    chapters. Unfortunately, general solutions have not been found; recourse must be made to

    specific solutions for less general models.


The most common digital restoration model is that of Figure 7.3-1a, in which a continuous image field is subjected to a linear blur, the electrical sensor responds nonlinearly to its input intensity, and the sensor amplifier introduces additive Gaussian noise independent of the image field:

f_P^(i) = O_P{f_B^(i)} (7.3-2)

The equation for the observed physical image samples in terms of points on the ideal image is

f_S = B_P O_P{B_B f_I} + B_P n (7.3-3)

Several special cases of Eq. 7.3-3 will now be defined. First, if the point nonlinearity is absent,

f_S = B f_I + n_B (7.3-4)

Figure 7.3-3. (a) Original; (b) impulse response; (c) observation.


where B = B_P B_B and n_B = B_P n. This is the classical discrete model consisting of a set of linear

    equations with measurement uncertainty. Another case that will be defined for later discussion

    occurs when the spatial blur of the physical image digitizer is negligible.

    Chapter 8 contains results for several image restoration experiments based on the

    restoration model defined by Eq. 7.3-6. An artificial image has been generated for these

    computer simulation experiments (9). The original image used for the analysis of

    underdetermined restoration techniques, shown in Figure 7.3-3a, consists of a pixel square of

    intensity 245 placed against an extended background of intensity 10 referenced to an intensity

    scale of 0 to 255. All images are zoomed for display purposes.

    In the computer simulation restoration experiments, the observed blurred image model

    has been obtained by multiplying the column-scanned original image of Figure 7.3-3a by the

    blur matrix B. Next, additive white Gaussian observation noise has been simulated by adding

    output variables from an appropriate random number generator to the blurred images. For

    display, all image points restored are clipped to the intensity range 0 to 255.
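A sketch of this simulation under the model of Eq. (7.3-4) (assuming numpy; image size, blur kernel, and noise level are illustrative stand-ins for the experiment described above):

```python
import numpy as np

rng = np.random.default_rng(5)

# Ideal test image: a bright square (intensity 245) on a dark background (intensity 10).
f_ideal = np.full((32, 32), 10.0)
f_ideal[12:20, 12:20] = 245.0

# Linear blur (the role of matrix B): 3x3 uniform neighborhood averaging.
pad = np.pad(f_ideal, 1, mode="edge")
blurred = np.zeros_like(f_ideal)
for m in range(32):
    for n in range(32):
        blurred[m, n] = pad[m:m + 3, n:n + 3].mean()

# Additive white Gaussian observation noise, then clipping for display:
# f_S = B f_I + n_B, clipped to the intensity range 0 to 255.
f_observed = np.clip(blurred + rng.normal(0.0, 5.0, f_ideal.shape), 0.0, 255.0)
print(f_observed[14:18, 14:18].round(1))
```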


8. IMAGE DETECTION AND REGISTRATION

    Image detection is concerned with the determination of the presence or absence of objects

    suspected of being in an image.

    TEMPLATE MATCHING

    One of the most fundamental means of object detection within an image field is by template

    matching, in which a replica of an object of interest is compared to all unknown objects in the

    image field (1–4). If the template match between an unknown object and the template is

    sufficiently close, the unknown object is labeled as the template object.

    As a simple example of the template-matching process, consider the set of binary black line

    figures against a white background as shown in Figure 8.1-1a. In this example, the objective is

    to detect the presence and location of right triangles in the image field. Figure 8.1-1b contains a

    simple template for localization of right triangles that possesses unit value in the triangular

    region and zero elsewhere. The width of the legs of the triangle template is chosen as a

    compromise between localization accuracy and size invariance of the template. In operation,

    the template is sequentially scanned over the image field and the common region between the

    template and image field is compared for similarity.

A template match is rarely exact because of image noise, spatial and amplitude

    quantization effects, and a priori uncertainty as to the exact shape and structure of an object to

    be detected. Consequently, a common procedure is to produce a difference measure between

    the template Dmnand the image field at all points of the image field where and denote

    the trial offset. An object is deemed to be matched wherever the difference is smaller than

    some established level . Normally, the threshold level is constant over the image field. The

    usual difference measure is the mean-square difference or error as defined by

D(m, n) = ∑ ∑ [F(j, k) – T(j – m, k – n)]² (8.1)

    where F(j,k) denotes the image field to be searched and T(j,k) is the template. The search, of course, is restricted to the overlap region between the translated template and the image field. A

    template match is then said to exist at coordinate (m,n) if


D(m, n) < LD(m, n) (8.2)

Now, let Eq. 8.1 be expanded to yield

D(m, n) = D1(m, n) − 2D2(m, n) + D3(m, n) (8.3)

D1(m, n) = ∑ ∑ [F(j, k)]² (8.4a)

D2(m, n) = ∑ ∑ F(j, k) T(j – m, k – n) (8.4b)

D3(m, n) = ∑ ∑ [T(j – m, k – n)]² (8.4c)

The term D3(m, n) represents a summation of the template energy. It is constant valued and independent of the coordinate (m, n). The image energy over the window area represented by the first term D1(m, n) generally varies rather slowly over the image field. The second term should be recognized as the cross correlation RFT(m, n) between the image field and the template. At the coordinate location of a template match, the cross correlation should become large to yield a

    small difference.

    However, the magnitude of the cross correlation is not always an adequate measure of the

    template difference because the image energy term D1 (m,n) is position variant. For example, the cross correlation can become large, even under a condition of template mismatch, if the

image amplitude over the template region is high about a particular coordinate (m, n). This difficulty can be avoided by comparison of the normalized cross correlation

R̃FT(m, n) = D2(m, n) / D1(m, n) = [∑ ∑ F(j, k) T(j – m, k – n)] / [∑ ∑ [F(j, k)]²]

to a threshold level LR(m, n). A template match is said to exist if

R̃FT(m, n) > LR(m, n) (8.5)

    The normalized cross correlation has a maximum value of unity that occurs if and only if the image function under the template exactly matches the template.
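The matching rule of Eqs. 8.1 through 8.5 can be sketched directly in Python/numpy. This is an illustrative sketch, not code from the original text; the function name and threshold default are assumptions.

    import numpy as np

    def template_match(F, T, LR=0.9):
        # Slide template T over image field F and evaluate the normalized
        # cross correlation R~FT(m, n) = D2(m, n) / D1(m, n) at each offset.
        Mi, Ni = F.shape
        Mt, Nt = T.shape
        R = np.zeros((Mi - Mt + 1, Ni - Nt + 1))
        for m in range(R.shape[0]):
            for n in range(R.shape[1]):
                window = F[m:m + Mt, n:n + Nt]
                D1 = np.sum(window ** 2)       # image energy, position variant
                D2 = np.sum(window * T)        # cross correlation RFT(m, n)
                if D1 > 0:
                    R[m, n] = D2 / D1
        # Declare a match wherever R~FT(m, n) exceeds the threshold LR.
        return R, np.argwhere(R > LR)

For a binary image field and a unit-valued binary template, R reaches unity exactly where the region under the template matches it, consistent with the statement above.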

    One of the major limitations of template matching is that an enormous number of templates must

    often be test matched against an image field to account for changes in rotation and magnification

    of template objects. For this reason, template matching is usually limited to smaller local

    features, which are more invariant to size and shape variations of an object. Such features, for

    example, include edges joined in a Y or T arrangement.


9. MORPHOLOGICAL IMAGE PROCESSING

    Morphological image processing is a type of processing in which the spatial form or structure of

objects within an image is modified. Dilation, erosion, and skeletonization are three

    fundamental morphological operations. With dilation, an object grows uniformly in spatial

    extent, whereas with erosion an object shrinks uniformly. Skeletonization results in a stick

    figure representation of an object. The basic concepts of morphological image processing trace

    back to the research on spatial set algebra by Minkowski (1) and the studies of Matheron (2) on

    topology. Serra (3–5) developed much of the early foundation of the subject. Steinberg (6,7)

    was a pioneer in applying morphological methods to medical and industrial vision applications.

    This research work led to the development of the cytocomputer for high-speed morphological

    image processing (8,9). In the following sections, morphological techniques are first described

    for binary images. Then these morphological concepts are extended to gray scale images.

    9.1. BINARY IMAGE CONNECTIVITY

    Binary image morphological operations are based on the geometrical relationship or

    connectivity of pixels that are deemed to be of the same class (10,11). In the binary image of

    Figure 9.1-1a, the ring of black pixels, by all reasonable definitions of connectivity, divides the

    image into three segments: the white pixels exterior to the ring, the white pixels interior to the

    ring, and the black pixels of the ring itself. The pixels within each segment are said to be

connected to one another. This concept of connectivity is easily understood for Figure 9.1-1a,

    but ambiguity arises when considering Figure 9.1-1b. Do the black pixels still define a ring, or do

    they instead form four disconnected lines? The answers to these questions depend on the

    definition of connectivity.


FIGURE 9.1-1. Connectivity.

Consider the following neighborhood pixel pattern:

X3 X2 X1
X4 X  X0
X5 X6 X7

in which a binary-valued pixel X, where X = 0 (white) or X = 1 (black), is surrounded by its eight nearest neighbors X0, X1, ..., X7. An alternative nomenclature is to label the neighbors by compass directions: north, northeast, and so on:

NW N NE
W  X  E
SW S  SE

Pixel X is said to be four-connected to a neighbor if it is a logical 1 and if its east, north, west, or south neighbor is a logical 1. Pixel X is said to be eight-connected if it is a logical 1 and if any of its eight nearest neighbors is a logical 1.

The connectivity relationship between a center pixel and its eight neighbors can be quantified by the concept of a pixel bond, the sum of the bond weights between the center pixel and each of its neighbors. Each four-connected neighbor has a bond of two, and each eight-connected (diagonal) neighbor has a bond of one. In the following example, the pixel bond is seven.

1 1 1
0 X 0
1 1 0
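As a quick check of this definition, the pixel bond can be computed by weighting the 3 x 3 neighborhood; the following minimal numpy sketch (array names are illustrative) reproduces the example above.

    import numpy as np

    # Bond weights: four-connected neighbors count two, diagonal neighbors one.
    BOND = np.array([[1, 2, 1],
                     [2, 0, 2],
                     [1, 2, 1]])

    def pixel_bond(nbhd):
        # nbhd is a 3 x 3 binary array centered on the pixel of interest.
        return int(np.sum(nbhd * BOND))

    # The example from the text evaluates to a pixel bond of seven.
    example = np.array([[1, 1, 1],
                        [0, 1, 0],
                        [1, 1, 0]])
    assert pixel_bond(example) == 7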


    FIGURE 9.1-2. Pixel neighborhood connectivity definitions.

    Under the definition of four-connectivity, Figure 9.1-1b has four disconnected black line

    segments, but with the eight-connectivity definition, Figure 9.1-1b has a ring of connected black

    pixels. Note, however, that under eight-connectivity, all white pixels are connected together.

    Thus a paradox exists. If the black pixels are to be eight-connected together in a ring, one would

    expect a division of the white pixels into pixels that are interior and exterior to the ring. To

    eliminate this dilemma, eight-connectivity can be defined for the black pixels of the object, and

    four-connectivity can be established for the white pixels of the background. Under this

    definition, a string of black pixels is said to be minimally connected if elimination of any black

    pixel results in a loss of connectivity of the remaining black pixels. Figure 9.1-2 provides

    definitions of several other neighborhood connectivity relationships between a center black

    pixel and its neighboring black and white pixels.

    The preceding definitions concerning connectivity have been based on a discrete image model

    in which a continuous image field is sampled over a rectangular array of points. Golay (12) has

    utilized a hexagonal grid structure. With such a structure, many of the connectivity problems

    associated with a rectangular grid are eliminated.

    In a hexagonal grid, neighboring pixels are said to be six-connected if they are in the same set

    and share a common edge boundary. Algorithms have been developed for the linking of

    boundary points for many feature extraction tasks (13). However, two major drawbacks have

    hindered wide acceptance of the hexagonal grid. First, most image scanners are inherently

    limited to rectangular scanning. The second problem is that the hexagonal grid is not well

    suited to many spatial processing operations, such as convolution and Fourier transformation.


9.2. BINARY IMAGE HIT-OR-MISS TRANSFORMATIONS

    The two basic morphological operations, dilation and erosion, plus many variants can be

    defined and implemented by hit-or-miss transformations (3). The concept is quite simple.

    Conceptually, a small odd-sized mask, typically 3 x 3, is scanned over a binary image. If the

    binary-valued pattern of the mask matches the state of the pixels under the mask (hit), an

    output pixel in spatial correspondence to the center pixel of the mask is set to some desired

    binary state. For a pattern mismatch (miss), the output pixel is set to the opposite binary state.

For example, to perform simple binary noise cleaning, if the isolated 3 x 3 pixel pattern

0 0 0
0 1 0
0 0 0

is encountered, the output pixel is set to zero; otherwise, the output pixel is set to the state of the input center pixel. In more complicated morphological algorithms, a large number of the 2⁹ = 512 possible mask patterns may cause hits.

    It is often possible to establish simple neighborhood logical

    relationships that define the conditions for a hit. In the isolated pixel removal example, the

defining equation for the output pixel G(j, k) becomes

G(j, k) = X ∩ (X0 ∪ X1 ∪ ... ∪ X7) (9.2-1)

where ∩ denotes the intersection operation (logical AND) and ∪ denotes the union operation (logical OR). For complicated algorithms, the logical equation method of definition can be

    cumbersome. It is often simpler to regard the hit masks as a collection of binary patterns.
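For instance, Eq. 9.2-1 can be evaluated over a whole binary image with array logic. The following numpy sketch is illustrative (the function name and the zero padding at the borders are assumptions).

    import numpy as np

    def remove_isolated_pixels(image):
        # G = X AND (X0 OR X1 OR ... OR X7): a black pixel survives only
        # if at least one of its eight neighbors is black.
        X = np.pad(image.astype(bool), 1)        # zero-pad the borders
        any_neighbor = np.zeros_like(X)
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                if (dj, dk) != (0, 0):
                    any_neighbor |= np.roll(np.roll(X, dj, axis=0), dk, axis=1)
        return (X & any_neighbor)[1:-1, 1:-1].astype(image.dtype)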

    Hit-or-miss morphological algorithms are often implemented in digital image processing

    hardware by a pixel stacker followed by a look-up table (LUT), as shown in Figure 9.2-1 (14).

    Each pixel of the input image is a positive integer, represented by a conventional binary code,

    whose most significant bit is a 1 (black) or a 0 (white). The pixel stacker extracts the bits of the

    center pixel X and its eight neighbors and puts them in a neighborhood pixel stack. Pixel

stacking can be performed by convolution with the 3 x 3 pixel kernel

2⁴ 2³ 2²
2⁵ 2⁰ 2¹
2⁶ 2⁷ 2⁸

    The binary number state of the neighborhood pixel stack becomes the numeric input address of

the LUT whose entry is Y. For isolated pixel removal, integer entry 256, corresponding to the

    neighborhood pixel stack state 100000000, contains Y = 0; all other entries contain Y = X.
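A software analogue of the pixel stacker and look-up table can be sketched as follows. The bit ordering chosen here (center pixel in bit 4 of the address) is an assumption made for illustration and differs from the hardware stack ordering described above, so the isolated-pixel entry is 16 rather than 256.

    import numpy as np

    def hit_or_miss_lut(image, lut):
        # Stack each 3 x 3 neighborhood into a 9-bit LUT address and look
        # up the output pixel; lut has 512 entries, one per mask pattern.
        padded = np.pad(image.astype(np.uint8), 1)
        out = np.zeros_like(image, dtype=np.uint8)
        H, W = image.shape
        for j in range(H):
            for k in range(W):
                bits = padded[j:j + 3, k:k + 3].flatten()
                address = int("".join(map(str, bits)), 2)
                out[j, k] = lut[address]
        return out

    # LUT for isolated pixel removal: every entry returns the center bit
    # (bit 4 of the address) except the pattern with all neighbors white.
    lut = np.array([(a >> 4) & 1 for a in range(512)], dtype=np.uint8)
    lut[0b000010000] = 0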


FIGURE 9.2-1. Look-up table flowchart for binary unconditional operations.

Several other 3 x 3 hit-or-miss operators are described in the following subsections.

    9.2.1. Additive Operators

Additive hit-or-miss morphological operators cause the center pixel of a pixel window to be converted from a logical 0 state to a logical 1 state if the neighboring pixels meet certain

    predetermined conditions. The basic operators are now defined.

    Interior Fill. Create a black pixel if all four-connected neighbor pixels are black.

G(j, k) = X ∪ [X0 ∩ X2 ∩ X4 ∩ X6] (9.2-2)

    Diagonal Fill. Create a black pixel if creation eliminates the eight-connectivity of the

    background.

G(j, k) = X ∪ [P1 ∪ P2 ∪ P3 ∪ P4] (9.2-3a)

where, with ¬ denoting the logical complement,

P1 = ¬X ∩ X0 ∩ ¬X1 ∩ X2 (9.2-3b)
P2 = ¬X ∩ X2 ∩ ¬X3 ∩ X4 (9.2-3c)
P3 = ¬X ∩ X4 ∩ ¬X5 ∩ X6 (9.2-3d)
P4 = ¬X ∩ X6 ∩ ¬X7 ∩ X0 (9.2-3e)
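A vectorized sketch of the interior fill operator of Eq. 9.2-2 follows; the function name is illustrative, and np.roll wraparound at the image borders is ignored for simplicity (pad the image first if that matters).

    import numpy as np

    def interior_fill(image):
        # Create a black pixel wherever all four-connected neighbors are
        # black: G = X OR (X0 AND X2 AND X4 AND X6).
        X = image.astype(bool)
        east  = np.roll(X, -1, axis=1)   # X0
        north = np.roll(X,  1, axis=0)   # X2
        west  = np.roll(X,  1, axis=1)   # X4
        south = np.roll(X, -1, axis=0)   # X6
        return (X | (east & north & west & south)).astype(image.dtype)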


Dilate: Create a black pixel if at least one eight-connected neighbor pixel is black; Figure 9.2-2 provides an example of dilation. Fatten: Create a black pixel if at least one eight-connected neighbor pixel is black, provided that its creation does not form a bridge between previously unconnected black pixels in the 3 x 3 neighborhood. There are 132 such qualifying patterns. This stratagem will not prevent connection of two objects separated by two rows or columns of white pixels. A solution to this problem is considered in Section 9.3. Figure 9.2-3 provides an example of fattening.

FIGURE 9.2-2. Dilation of a binary image: (a) original, (b) one iteration, (c) three iterations.


    9.2.2. Subtractive Operators

    Subtractive hit-or-miss morphological operators cause the center pixel of a 3 x 3 window to be

    converted from black to white if its neighboring pixels meet predetermined conditions. The

    basic subtractive operators are defined below.

    Isolated Pixel Remove: Erase a black pixel with eight white neighbors.

G(j, k) = X ∩ [X0 ∪ X1 ∪ X2 ∪ ... ∪ X7] (9.2-6)

FIGURE 9.2-3. Fattening of a binary image.

Spur Remove: Erase a black pixel with a single eight-connected neighbor. The following is one of four qualifying patterns:

    0 0 0

    0 1 0

    1 0 0

Interior Pixel Remove: Erase a black pixel if all four-connected neighbors are black.

G(j, k) = X ∩ [¬X0 ∪ ¬X2 ∪ ¬X4 ∪ ¬X6] (9.2-7)

    There are 16 qualifying patterns.

    H-Break: Erase a black pixel that is H-connected. There are two qualifying patterns.


1 1 1      1 0 1
0 1 0      1 1 1
1 1 1      1 0 1

    Eight-Neighbor Erode: Erase a black pixel if at least one eight-connected neighbor pixel is white.

G(j, k) = X ∩ X0 ∩ X1 ∩ ... ∩ X7 (9.2-8)

FIGURE 9.2-4. Erosion of a binary image: (a) original image, (b) one iteration, (c) three iterations.

    A generalized erosion operator is defined in Section 9.4. Recursive application of the erosion

    operator will eventually erase all black pixels. Figure 9.2-4 shows results for one and three

    iterations of the erode operator. The eroded pixels are midgray. It should be noted that after

    three iterations, the ring is totally eroded.
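Eight-neighbor erosion (Eq. 9.2-8) can likewise be rendered with array logic. This minimal sketch (function name assumed; borders handled by zero padding) erodes one layer per call, so repeated calls reproduce the iterative behavior of Figure 9.2-4.

    import numpy as np

    def eight_neighbor_erode(image):
        # A black pixel survives only if it and all eight neighbors are black.
        X = np.pad(image.astype(bool), 1)
        out = X.copy()
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                if (dj, dk) != (0, 0):
                    out &= np.roll(np.roll(X, dj, axis=0), dk, axis=1)
        return out[1:-1, 1:-1].astype(image.dtype)

    # One iteration applied to a 5 x 5 black square leaves its 3 x 3 core.
    square = np.zeros((7, 7), dtype=np.uint8)
    square[1:6, 1:6] = 1
    core = eight_neighbor_erode(square)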


    9.2.3. Majority Black Operator

    The following is the definition of the majority black operator:

Majority Black. Create a black pixel if five or more pixels in a 3 x 3 window are black; otherwise,

    set the output pixel to white. The majority black operator is useful for filling small holes in

    objects and closing short gaps in strokes.
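A direct numpy rendering of the majority black operator (function name assumed) counts the black pixels in each 3 x 3 window, center included.

    import numpy as np

    def majority_black(image):
        # Output black (1) where five or more of the nine window pixels
        # are black; borders are zero-padded.
        X = np.pad(image.astype(np.int32), 1)
        count = np.zeros_like(X)
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                count += np.roll(np.roll(X, dj, axis=0), dk, axis=1)
        return (count[1:-1, 1:-1] >= 5).astype(image.dtype)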

    9.3. BINARY IMAGE SHRINKING, THINNING, SKELETONIZING, AND THICKENING

    Shrinking, thinning, skeletonizing, and thickening are forms of conditional erosion in which the

    erosion process is controlled to prevent total erasure and to ensure connectivity.

    9.3.1. Binary Image Shrinking

    The following is a definition of shrinking:

    Shrink. Erase black pixels such that an object without holes erodes to a single pixel at or near its

    center of mass, and an object with holes erodes to a connected ring lying midway between

    each hole and its nearest outer boundary.

    A 3 x 3 pixel object will be shrunk to a single pixel at its center. A 2 x 2 pixel object will be

    arbitrarily shrunk, by definition, to a single pixel at its lower right corner.

    It is not possible to perform shrinking using single-stage 3 x 3 pixel hit-or-miss transforms

    of the type described in the previous section. The 3 x 3 window does not provide enough

    information to prevent total erasure and to ensure connectivity. A 5 x 5 hit-or-miss transform

    could provide sufficient information to perform proper shrinking. But such an approach would

result in excessive computational complexity (i.e., 2²⁵ possible patterns to be examined!).

References 15 and 16 describe two-stage shrinking and thinning algorithms that perform a

    conditional marking of pixels for erasure in a first stage, and then examine neighboring marked

    pixels in a second stage to determine which ones can be unconditionally erased without total

    erasure or loss of connectivity. The following algorithm developed by Pratt and Kabir (17) is a

    pipeline processor version of the conditional marking scheme.

    In the algorithm, two concatenated 3 x 3 hit-or-miss transformations are performed to obtain

indirect information about pixel patterns within a 5 x 5 window. Figure 9.3-1 is a flowchart for

    the look-up table implementation of this algorithm.

    In the first stage, the states of nine neighboring pixels are gathered together by a pixel stacker,

    and a following look-up table generates a conditional mark M for possible erasures. Table 9.3-1

    lists all patterns, as indicated by the letter S in the table column, which will be conditionally


marked.