Digital Image Definitions & Transformations


    Digital Image Definitions

A digital image described in a 2D discrete space is derived from an analog image in a 2D continuous space through a sampling process that is frequently referred to as digitization. The mathematics of that sampling process will be described in subsequent chapters. For now we will look at some basic definitions associated with the digital image. The effect of digitization is shown in Figure (1.1).

The 2D continuous image $a(x, y)$ is divided into N rows and M columns. The intersection of a row and a column is termed a pixel. The value assigned to the integer coordinates $[m, n]$, with $m = 0, 1, \ldots, M-1$ and $n = 0, 1, \ldots, N-1$, is $a[m, n]$. In fact, in most cases $a(x, y)$, which we might consider to be the physical signal that impinges on the face of a 2D sensor, is actually a function of many variables including depth $(z)$, color $(\lambda)$, and time $(t)$. Unless otherwise stated, we will consider the case of 2D, monochromatic, static images in this module.

Figure (1.1): Digitization of a continuous image. The pixel at the marked coordinates $[m, n]$ has the integer brightness value 110.

The image shown in Figure (1.1) has been divided into N rows and M columns. The value assigned to every pixel is the average brightness in the pixel rounded to the nearest integer value. The process of representing the amplitude of the 2D signal at a given coordinate as an integer value with L different gray levels is usually referred to as amplitude quantization, or simply quantization.

    Common values

There are standard values for the various parameters encountered in digital image processing. These values can be caused by video standards, by algorithmic requirements, or by the desire to keep digital circuitry simple. Table 1 gives some commonly encountered values.

Parameter     Symbol    Typical values

Rows          N         256, 512, 525, 625, 1024, 1035

Columns       M         256, 512, 768, 1024, 1320

Gray levels   L         2, 64, 256, 1024, 4096, 16384

Table 1: Common values of digital image parameters

Quite frequently we see cases of $M = N = 2^K$, where $K = 8, 9, 10$. This can be motivated by digital circuitry or by the use of certain algorithms such as the (fast) Fourier transform.

The number of distinct gray levels is usually a power of 2, that is, $L = 2^B$, where B is the number of bits in the binary representation of the brightness levels. When $B > 1$ we speak of a gray-level image; when $B = 1$ we speak of a binary image. In a binary image there are just two gray levels, which can be referred to, for example, as "black" and "white" or "0" and "1".

Suppose that a continuous image $f(x, y)$ is approximated by equally spaced samples arranged in the form of an $N \times N$ array as:

$$f(x, y) \approx \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ f(1,0) & f(1,1) & \cdots & f(1,N-1) \\ \vdots & \vdots & & \vdots \\ f(N-1,0) & f(N-1,1) & \cdots & f(N-1,N-1) \end{bmatrix} \qquad (1)$$

Each element of the array, referred to as a "pixel", is a discrete quantity. The array represents a digital image.

The above digitization requires a decision to be made on a value for N as well as on the number of discrete gray levels allowed for each pixel. It is common practice in digital image processing to let $N = 2^n$ and $G$ = number of gray levels = $2^m$. It is assumed that the discrete levels are equally spaced between 0 and L in the gray scale.

Therefore the number of bits required to store a digitized image of size $N \times N$ is $b = N \times N \times m$. In other words, an $N \times N$ image with 256 gray levels (i.e., 8 bits/pixel) requires a storage of $N \times N$ bytes.
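As a quick numerical illustration of these formulas, the following sketch (a minimal example of ours, assuming NumPy is available; the function and variable names are not from the text) quantizes a continuous-valued image to $G = 2^m$ equally spaced gray levels and evaluates the storage requirement $b = N \times N \times m$:

```python
import numpy as np

def quantize(image, m):
    """Quantize a float image with values in [0, 1] to G = 2**m equally spaced gray levels."""
    G = 2 ** m
    # Scale to [0, G-1], round to the nearest integer level, and clip to the valid range.
    return np.clip(np.round(image * (G - 1)), 0, G - 1).astype(np.uint16)

def storage_bits(N, m):
    """Bits needed to store an N x N digital image with m bits/pixel: b = N * N * m."""
    return N * N * m

# Example: a 512 x 512 image quantized to 256 gray levels (m = 8, i.e., 1 byte/pixel).
N, m = 512, 8
image = np.random.rand(N, N)              # stand-in for a sampled continuous image
digital = quantize(image, m)
print(digital.min(), digital.max())        # 0 ... 255
print(storage_bits(N, m) // 8, "bytes")    # N * N = 262144 bytes
```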

The representation given by eqn (1) is an approximation to a continuous image.

A reasonable question to ask at this point is: how many samples and gray levels are required for a good approximation? This brings up the question of resolution. The resolution (i.e., the degree of discernible detail) of an image is strongly dependent on both N and m. The more these parameters are increased, the closer the digitized array will approximate the original image. Unfortunately, this leads to large storage requirements, and processing requirements consequently increase rapidly as functions of N and m.

    Spatial and Gray level resolution:

Sampling is the principal factor determining the spatial resolution of an image. Basically, spatial resolution is the smallest discernible detail in an image.


As an example, suppose we construct a chart with vertical lines of width W, and with the space between the lines also having width W. A line pair consists of one such line and its adjacent space. Thus the width of a line pair is 2W, and there are 1/(2W) line pairs per unit distance. A widely used definition of resolution is simply the smallest number of discernible line pairs per unit distance; for example, 100 line pairs/mm.

Gray-level resolution: This refers to the smallest discernible change in gray level. The measurement of discernible changes in gray level is a highly subjective process.

We have considerable discretion regarding the number of samples used to generate a digital image, but this is not true for the number of gray levels. Due to hardware constraints, the number of gray levels is usually an integer power of two. The most common value is 8 bits, though it can vary depending on the application. When an actual measure of physical resolution relating pixels to the level of detail they resolve in the original scene is not necessary, it is not uncommon to refer to an L-level digital image of size $M \times N$ as having a spatial resolution of $M \times N$ pixels and a gray-level resolution of L levels.

    Characteristics of Image Operations

    Types of operations

    Types of neighborhoods

There is a variety of ways to classify and characterize image operations. The reason for doing so is to understand what type of results we might expect to achieve with a given type of operation, and what the computational burden associated with a given operation might be.

Types of operations

The types of operations that can be applied to digital images to transform an input image a[m, n] into an output image b[m, n] (or another representation) can be classified into three categories, as shown in Table 2.

* Point - the output value at a specific coordinate is dependent only on the input value at that same coordinate. Generic complexity/pixel: constant.

* Local - the output value at a specific coordinate is dependent on the input values in the neighborhood of that same coordinate. Generic complexity/pixel: P².

* Global - the output value at a specific coordinate is dependent on all the values in the input image. Generic complexity/pixel: N².

Table 2: Types of image operations. Image size = N × N; neighborhood size = P × P. Note that the complexity is specified in operations per pixel.

This is shown graphically in Figure (1.2).


    Figure (1.2): Illustration of various types of image operations

    Types of neighborhoods

Neighborhood operations play a key role in modern digital image processing. It is therefore important to understand how images can be sampled and how that relates to the various neighborhoods that can be used to process an image.

Rectangular sampling - In most cases, images are sampled by laying a rectangular grid over an image, as illustrated in Figure (1.1). This results in the type of sampling shown in Figures (1.3a) and (1.3b).

Hexagonal sampling - An alternative sampling scheme is shown in Figure (1.3c) and is termed hexagonal sampling. Both sampling schemes have been studied extensively and both represent a possible periodic tiling of the continuous image space. However, rectangular sampling remains the method of choice due to hardware and software considerations.

Local operations produce an output pixel value $b[m, n]$ based upon the pixel values in the neighborhood of $a[m, n]$. Some of the most common neighborhoods are the 4-connected neighborhood and the 8-connected neighborhood in the case of rectangular sampling, and the 6-connected neighborhood in the case of hexagonal sampling, illustrated in Figure (1.3).

Fig (1.3a): rectangular sampling, 4-connected; Fig (1.3b): rectangular sampling, 8-connected; Fig (1.3c): hexagonal sampling, 6-connected
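As a concrete illustration of a local operation on a rectangularly sampled image, here is a minimal sketch (ours, not from the text): a P × P mean filter, which for P = 3 combines the center pixel with its 8-connected neighbors at a cost of P² operations per pixel, matching the complexity entry in Table 2.

```python
import numpy as np

def local_mean(a, P=3):
    """A local operation: each output pixel is the mean of the P x P
    neighborhood of the corresponding input pixel (P^2 operations/pixel).
    Edges are handled by replicating the border pixels."""
    pad = P // 2
    padded = np.pad(a.astype(float), pad, mode="edge")
    b = np.zeros(a.shape, dtype=float)
    for dm in range(-pad, pad + 1):
        for dn in range(-pad, pad + 1):
            # Accumulate each shifted copy of the image, then divide by P*P.
            b += padded[pad + dm : pad + dm + a.shape[0],
                        pad + dn : pad + dn + a.shape[1]]
    return b / (P * P)
```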

    Video Parameters

We do not propose to describe the processing of dynamically changing images in this introduction. It is appropriate, given that many static images are derived from video cameras and frame grabbers, to mention the standards that are associated with the three standard video schemes currently in worldwide use: NTSC, PAL, and SECAM. This information is summarized in Table 3.


Property                       NTSC      PAL       SECAM

Images / second                29.97     25        25
ms / image                     33.37     40.0      40.0
Lines / image                  525       625       625
(horiz./vert.) aspect ratio    4:3       4:3       4:3
Interlace                      2:1       2:1       2:1
µs / line                      63.56     64.00     64.00

Table 3: Standard video parameters

In an interlaced image the odd-numbered lines (1, 3, 5, ...) are scanned in half of the allotted time (e.g., 20 ms in PAL) and the even-numbered lines (2, 4, 6, ...) are scanned in the remaining half. The image display must be coordinated with this scanning format. The reason for interlacing the scan lines of a video image is to reduce the perception of flicker in a displayed image. If one is planning to use images that have been scanned from an interlaced video source, it is important to know if the two half-images have been appropriately "shuffled" by the digitization hardware or if that should be implemented in software. Further, the analysis of moving objects requires special care with interlaced video to avoid "zigzag" edges.

    Tools

Certain tools are central to the processing of digital images. These include mathematical tools such as convolution, Fourier analysis, and statistical descriptions, and manipulative tools such as chain codes and run codes. We will present these tools without any specific motivation. The motivation will follow in later sections.

    Convolution

Properties of Convolution

Fourier Transforms

    Properties of Fourier Transforms

    Statistics

Convolution

There are several possible notations to indicate the convolution of two (multi-dimensional) signals to produce an output signal. The most common are:

$c = a \otimes b$   or   $c = a * b$

We shall use the first form, $c = a \otimes b$, with the following formal definitions.

In 2D continuous space:

$$c(x, y) = a(x, y) \otimes b(x, y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(\chi, \zeta)\, b(x - \chi, y - \zeta)\, d\chi\, d\zeta$$

In 2D discrete space:

$$c[m, n] = a[m, n] \otimes b[m, n] = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} a[j, k]\, b[m - j, n - k]$$
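The 2D discrete definition translates directly into code. The sketch below (our own minimal implementation, assuming finite-support arrays) computes the full linear convolution of two 2D arrays:

```python
import numpy as np

def conv2d(a, b):
    """Full 2D linear convolution of finite-support arrays a and b:
    c[m, n] = sum_j sum_k a[j, k] * b[m - j, n - k]."""
    Ma, Na = a.shape
    Mb, Nb = b.shape
    c = np.zeros((Ma + Mb - 1, Na + Nb - 1))
    for j in range(Ma):
        for k in range(Na):
            # Each input sample a[j, k] contributes a shifted, scaled copy of b.
            c[j : j + Mb, k : k + Nb] += a[j, k] * b
    return c
```

For production use, scipy.signal.convolve2d implements this same definition far more efficiently.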

    Properties of Convolution

    There are a number of important mathematical properties associated with convolution.

  • 8/3/2019 Digital Image Definitions&Transformations

    6/18

Convolution is commutative:

$$c = a \otimes b = b \otimes a$$

Convolution is associative:

$$c = a \otimes (b \otimes d) = (a \otimes b) \otimes d = a \otimes b \otimes d$$

Convolution is distributive:

$$c = a \otimes (b + d) = (a \otimes b) + (a \otimes d)$$

where a, b, c, and d are all images, either continuous or discrete.
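These properties are easy to check numerically. A small sketch of ours, using scipy.signal.convolve2d on random test images:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
a, b, d = (rng.random((5, 5)) for _ in range(3))

# Commutative: a (x) b == b (x) a
assert np.allclose(convolve2d(a, b), convolve2d(b, a))
# Associative: a (x) (b (x) d) == (a (x) b) (x) d
assert np.allclose(convolve2d(a, convolve2d(b, d)),
                   convolve2d(convolve2d(a, b), d))
# Distributive: a (x) (b + d) == a (x) b + a (x) d
assert np.allclose(convolve2d(a, b + d),
                   convolve2d(a, b) + convolve2d(a, d))
```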

2-D Discrete Cosine Transform (DCT)

One disadvantage of the DFT for some applications is that the transform is complex valued, even for real data. A related transform, the discrete cosine transform (DCT), does not have this problem. The DCT is a separate transform and not the real part of the DFT. It is widely used in image and video compression applications, e.g., JPEG and MPEG. It is also possible to use the DCT for filtering, using a slightly different form of convolution called symmetric convolution.

Definition (2-D DCT)

Assume that the data array $x(n_1, n_2)$ has finite rectangular support on $[0, N_1 - 1] \times [0, N_2 - 1]$; then the 2-D DCT is given as

$$X_C(k_1, k_2) = 4 \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} x(n_1, n_2) \cos\frac{\pi k_1 (2n_1 + 1)}{2N_1} \cos\frac{\pi k_2 (2n_2 + 1)}{2N_2} \qquad (4.3.1)$$

for $(k_1, k_2) \in [0, N_1 - 1] \times [0, N_2 - 1]$, and zero otherwise.

The DCT basis functions for size 8 × 8 are shown in Figure ( ). The mapping between the mathematical values and the colors (gray levels) is the same as in the DFT case. Each basis function occupies a small square; the squares are then arranged into an 8 × 8 mosaic. Note that unlike the DFT, where the highest frequencies occur near $(N_1/2, N_2/2)$, the highest frequencies of the DCT occur at the highest indices $(k_1, k_2) = (N_1 - 1, N_2 - 1)$.

The inverse DCT exists and is given for $(n_1, n_2) \in [0, N_1 - 1] \times [0, N_2 - 1]$ as

$$x(n_1, n_2) = \frac{1}{N_1 N_2} \sum_{k_1=0}^{N_1-1} \sum_{k_2=0}^{N_2-1} w(k_1)\, w(k_2)\, X_C(k_1, k_2) \cos\frac{\pi k_1 (2n_1 + 1)}{2N_1} \cos\frac{\pi k_2 (2n_2 + 1)}{2N_2} \qquad (4.3.2)$$

where the weighting function $w(k)$ is given, just as in the case of the 1-D DCT, by

$$w(k) = \begin{cases} 1/2, & k = 0 \\ 1, & 1 \le k \le N - 1 \end{cases} \qquad (4.3.3)$$


From eqn (4.3.1), we see that the 2-D DCT is a separable operator. As such it can be applied to the rows and then the columns, or vice versa. Thus the 2-D theory can be developed by repeated application of the 1-D theory. In the following subsections we relate the 1-D DCT to the 1-D DFT of a symmetrically extended sequence. This not only provides an understanding of the DCT but also enables its fast calculation. We also present a fast DCT calculation that can avoid the use of complex arithmetic in the usual case where x is a real-valued signal, e.g., an image. (Note: the next two subsections can be skipped by the reader familiar with the 1-D DCT.)
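A sketch of this separability (ours, using SciPy; scipy.fft.dct with type=2 and the default normalization computes $2\sum_n x(n)\cos\frac{\pi k(2n+1)}{2N}$, so applying it along both axes produces the factor of 4 in eqn (4.3.1)):

```python
import numpy as np
from scipy.fft import dct

def dct2(x):
    """2-D DCT of eqn (4.3.1) via separable 1-D DCTs: rows, then columns."""
    return dct(dct(x, type=2, axis=0), type=2, axis=1)

def dct2_direct(x):
    """Direct evaluation of the double sum in eqn (4.3.1), for checking."""
    N1, N2 = x.shape
    n1, k1 = np.meshgrid(np.arange(N1), np.arange(N1), indexing="ij")
    n2, k2 = np.meshgrid(np.arange(N2), np.arange(N2), indexing="ij")
    C1 = np.cos(np.pi * k1 * (2 * n1 + 1) / (2 * N1))  # C1[n1, k1]
    C2 = np.cos(np.pi * k2 * (2 * n2 + 1) / (2 * N2))  # C2[n2, k2]
    return 4 * C1.T @ x @ C2

x = np.random.rand(8, 8)
assert np.allclose(dct2(x), dct2_direct(x))  # separable == direct double sum
```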

In the 1-D case the DCT is defined as

$$X_C(k) = \begin{cases} 2 \displaystyle\sum_{n=0}^{N-1} x(n) \cos\frac{\pi k (2n+1)}{2N}, & k \in [0, N-1] \\ 0, & \text{else} \end{cases} \qquad (4.3.4)$$

for every N-point signal $x(n)$ having support $[0, N-1]$. The corresponding inverse transform, or IDCT, can be written as

$$x(n) = \begin{cases} \dfrac{1}{N} \displaystyle\sum_{k=0}^{N-1} w(k)\, X_C(k) \cos\frac{\pi k (2n+1)}{2N}, & n \in [0, N-1] \\ 0, & \text{else} \end{cases} \qquad (4.3.5)$$

It turns out that this 1-D DCT can be understood in terms of the DFT of a symmetrically extended sequence,

$$y(n) \triangleq \begin{cases} x(n), & 0 \le n \le N-1 \\ x(2N-1-n), & N \le n \le 2N-1 \end{cases} \qquad (4.3.6)$$

This is not the only way to symmetrically extend x, but this method results in the most widely used DCT, sometimes called DCT-2, with support $[0, 2N-1]$. In fact, on defining the 2N-point DFT $Y(k) \triangleq \sum_{n=0}^{2N-1} y(n)\, W_{2N}^{nk}$, with $W_{2N} \triangleq e^{-j2\pi/2N}$, we will show that the DCT can be alternatively expressed as

$$X_C(k) = \begin{cases} W_{2N}^{k/2}\, Y(k), & 0 \le k \le N-1 \\ 0, & \text{else} \end{cases} \qquad (4.3.7)$$

Thus the DCT is just the DFT analysis of the symmetrically extended signal defined in (4.3.6). Looking at that equation, we see that there is no overlap in its two components, which fit together without a gap: right after $x(N-1)$ at position $n = N-1$ comes $x(N-1)$ again at position $n = N$, which is then followed by the rest of the nonzero part of x in reverse order, up to position $n = 2N-1$, where $x(0)$ sits. We can see a point of symmetry midway between $N-1$ and $N$, i.e., at $n = N - \frac{1}{2}$.


If we consider its periodic extension, we will also see a symmetry about the point $n = -\frac{1}{2}$. We thus expect that the 2N-point DFT $Y(k)$ will be real valued except for the phase factor $W_{2N}^{-k/2}$. So the phase factor in eqn (4.3.7) is just what is needed to cancel out the phase term in Y and make the DCT real, as it must be if the two equations, (4.3.4) and (4.3.7), are to agree for real-valued inputs x.

To reconcile these two definitions, we start out with eqn (4.3.7) and proceed as follows:

$$X_C(k) = W_{2N}^{k/2}\, Y(k) = W_{2N}^{k/2} \left[ \sum_{n=0}^{N-1} x(n)\, W_{2N}^{nk} + \sum_{n=N}^{2N-1} x(2N-1-n)\, W_{2N}^{nk} \right]$$

$$= \sum_{n=0}^{N-1} x(n) \left[ W_{2N}^{(2n+1)k/2} + W_{2N}^{(4N-2n-1)k/2} \right] = \sum_{n=0}^{N-1} x(n) \left[ W_{2N}^{(2n+1)k/2} + W_{2N}^{-(2n+1)k/2} \right]$$

$$= 2 \sum_{n=0}^{N-1} x(n) \cos\frac{\pi k (2n+1)}{2N},$$

the last line following from $W_{2N}^{2Nk} = 1$ and Euler's relation, which agrees with the original definition, eqn (4.3.4).

The formula for the inverse DCT can be established similarly, starting out from the inverse DFT of $Y(k)$.
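A numerical sketch of this relation (ours, not from the text): build the symmetric extension y of eqn (4.3.6), take its 2N-point DFT, apply the phase factor of eqn (4.3.7), and compare against the direct DCT of eqn (4.3.4) (SciPy's unnormalized type-2 DCT):

```python
import numpy as np
from scipy.fft import dct, fft

N = 8
x = np.random.rand(N)

y = np.concatenate([x, x[::-1]])           # symmetric extension, eqn (4.3.6)
Y = fft(y)                                  # 2N-point DFT
k = np.arange(N)
phase = np.exp(-1j * np.pi * k / (2 * N))   # the phase factor W_{2N}^{k/2}
Xc = phase * Y[:N]                          # eqn (4.3.7)

assert np.allclose(Xc.imag, 0, atol=1e-12)  # the DCT comes out real
assert np.allclose(Xc.real, dct(x, type=2)) # agrees with eqn (4.3.4)
```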

Some 1-D DCT Properties

1) Linearity: $\alpha x_1(n) + \beta x_2(n) \leftrightarrow \alpha X_{C1}(k) + \beta X_{C2}(k)$.

2) Energy conservation:

$$\sum_{n=0}^{N-1} |x(n)|^2 = \frac{1}{2N} \sum_{k=0}^{N-1} w(k)\, |X_C(k)|^2 \qquad (4.3.8)$$

3) Symmetry:

(a) General case: $x^*(n) \leftrightarrow X_C^*(k)$, since the cosine kernel is real.


(b) Real-valued case: for real x, $X_C(k)$ is real-valued.

4) Eigenvectors of unitary DCT: Define the column vector

$$\mathbf{x} \triangleq [x(0), x(1), \ldots, x(N-1)]^T$$

and define the matrix $\mathbf{C}$ with elements:

$$C(k, n) \triangleq 2\sqrt{\frac{w(k)}{2N}} \cos\frac{\pi k (2n+1)}{2N}, \quad k, n \in [0, N-1].$$

Then the vector $\mathbf{X}_u = \mathbf{C}\mathbf{x}$ contains the unitary DCT, whose elements are given as

$$X_u(k) = \sqrt{\frac{w(k)}{2N}}\, X_C(k).$$

A unitary matrix is one whose inverse is the same as the transpose of its complex conjugate, $\mathbf{C}^{-1} = \mathbf{C}^{*T}$ (here C is real, so $\mathbf{C}^{-1} = \mathbf{C}^T$). For the unitary DCT, we have

$$\mathbf{x} = \mathbf{C}^T \mathbf{X}_u$$

and the energy balance equation

$$\sum_{n=0}^{N-1} |x(n)|^2 = \sum_{k=0}^{N-1} |X_u(k)|^2,$$

which is a slight modification on the DCT Parseval relation (4.3.8). So the unitary DCT preserves the energy of the signal x.
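A small sketch of ours that builds C from the definition above and checks both the unitarity and the energy balance:

```python
import numpy as np

N = 8
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
w = np.where(np.arange(N) == 0, 0.5, 1.0)    # weighting function w(k)
C = 2 * np.sqrt(w[:, None] / (2 * N)) * np.cos(np.pi * k * (2 * n + 1) / (2 * N))

assert np.allclose(C @ C.T, np.eye(N))        # unitary: C^{-1} = C^T

x = np.random.rand(N)
Xu = C @ x
assert np.allclose(np.sum(x**2), np.sum(Xu**2))  # energy is preserved
```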

It turns out that the eigenvectors of the unitary DCT are the same as those of the symmetric tridiagonal matrix

$$\mathbf{Q} = \begin{bmatrix} 1-\alpha & -\alpha & & & \\ -\alpha & 1 & -\alpha & & \\ & \ddots & \ddots & \ddots & \\ & & -\alpha & 1 & -\alpha \\ & & & -\alpha & 1-\alpha \end{bmatrix},$$

and this holds true for arbitrary values of the parameter $\alpha$. We can relate this matrix Q to the inverse covariance matrix of a 1-D first-order stationary Markov random sequence, with correlation coefficient $\rho$ necessarily satisfying $|\rho| < 1$:

$$\mathbf{R}^{-1} = \frac{1}{\beta^2} \begin{bmatrix} 1-\rho\alpha & -\alpha & & & \\ -\alpha & 1 & -\alpha & & \\ & \ddots & \ddots & \ddots & \\ & & -\alpha & 1 & -\alpha \\ & & & -\alpha & 1-\rho\alpha \end{bmatrix},$$

where $\alpha = \rho/(1+\rho^2)$ and $\beta^2 = (1-\rho^2)/(1+\rho^2)$. The actual covariance matrix of the Markov random sequence is

$$\mathbf{R} = \big[\rho^{|i-j|}\big]_{i,j=0}^{N-1},$$

with corresponding first-order difference equation

$$x(n) = \rho\, x(n-1) + w(n).$$

It can further be shown that when $\rho \to 1$, $\mathbf{Q} \approx \beta^2 \mathbf{R}^{-1}$, so that the eigenvectors approximate each other too. Because the eigenvectors of a matrix and its inverse are the same, we then have the fact that the unitary DCT basis vectors approximate the Karhunen-Loeve expansion, with basis vectors given as the solution to the matrix-vector equation

$$\mathbf{R} \boldsymbol{\phi}_k = \lambda_k \boldsymbol{\phi}_k.$$


The corresponding Karhunen-Loeve transform (KLT) is then given by

$$\theta(k) = \boldsymbol{\phi}_k^T \mathbf{x}, \quad k = 0, 1, \ldots, N-1.$$

Thus the 1-D DCT of a first-order Markov random vector of dimension N should be close to the KLT of x when its correlation coefficient $\rho \approx 1$. This ends the review of the 1-D DCT.

Since the 2-D DCT is just the separable operator resulting from application of the 1-D DCT along first one dimension and then the other, the order being immaterial, we can easily extend the 1-D DCT properties to the 2-D case. In terms of the connection of the 2-D DCT with the 2-D DFT, we thus see that we must symmetrically extend in, say, the horizontal direction and then symmetrically extend that result in the vertical direction. The resulting symmetric function (extension) becomes

$$y(n_1, n_2) = \begin{cases} x(n_1, n_2), & 0 \le n_1 \le N_1-1,\ 0 \le n_2 \le N_2-1 \\ x(2N_1-1-n_1, n_2), & N_1 \le n_1 \le 2N_1-1,\ 0 \le n_2 \le N_2-1 \\ x(n_1, 2N_2-1-n_2), & 0 \le n_1 \le N_1-1,\ N_2 \le n_2 \le 2N_2-1 \\ x(2N_1-1-n_1, 2N_2-1-n_2), & N_1 \le n_1 \le 2N_1-1,\ N_2 \le n_2 \le 2N_2-1 \end{cases}$$

The symmetry is about the lines $n_1 = N_1 - \frac{1}{2}$ and $n_2 = N_2 - \frac{1}{2}$, and then from (4.3.7) it follows that the 2-D DCT is given in terms of the $2N_1 \times 2N_2$-point DFT $Y(k_1, k_2)$ as

$$X_C(k_1, k_2) = \begin{cases} W_{2N_1}^{k_1/2}\, W_{2N_2}^{k_2/2}\, Y(k_1, k_2), & (k_1, k_2) \in [0, N_1-1] \times [0, N_2-1] \\ 0, & \text{else.} \end{cases}$$

    Comments

We see that both the 1-D and 2-D DCTs involve only real arithmetic for real-valued data, and this may be important in some applications.

The symmetric extension property can be expected to result in fewer high-frequency coefficients in the DCT with respect to the DFT. Such would be expected for lowpass data, since there would often be a jump at the four edges of the period of the corresponding periodic sequence (the periodic extension of x), which is not consistent with small high-frequency coefficients in the DFS or DFT. Thus the DCT is attractive for lossy data storage applications, where the exact value of the data is not of paramount importance.

The DCT can be used for a symmetrical type of filtering with a symmetrical filter.

2-D DCT properties are easy generalizations of 1-D DCT properties.


Karhunen-Loeve Transform (KLT)

So far no transformation has been found for images that completely removes statistical dependence between transform coefficients. Suppose a reversible transformation is carried out on the N-dimensional vector $\mathbf{x}$ to produce the N-dimensional vector of transform coefficients, denoted by

$$\mathbf{y} = \mathbf{A}\mathbf{x}, \qquad (4.4.1)$$

where $\mathbf{A}$ is a linear transformation matrix that is orthonormal or unitary, i.e.,

$$\mathbf{A}^{-1} = \mathbf{A}^T \qquad (4.4.2)$$

(where $T$ denotes transpose).

Denote the m-th column of the matrix $\mathbf{A}^T$ by the column vector $\mathbf{a}_m$; then eqn (4.4.2) is equivalent to

$$\mathbf{a}_m^T \mathbf{a}_n = \delta(m - n).$$

The vectors $\{\mathbf{a}_m\}$ are called orthonormal basis vectors for the linear transform $\mathbf{A}$.

From eqn (4.4.1) we can express $\mathbf{x}$ as

$$\mathbf{x} = \mathbf{A}^T \mathbf{y} = \sum_{m=0}^{N-1} y(m)\, \mathbf{a}_m,$$

where the $y(m)$ are the transform coefficients, and $\mathbf{x}$ is represented as a weighted sum of basis vectors.

The KLT is an orthogonal linear transformation that can remove pairwise statistical correlation between the transform coefficients; i.e., the KLT coefficients satisfy

$$\sum_{y(m)} \sum_{y(n)} y(m)\, y(n)\, P\big(y(m), y(n)\big) = 0, \quad m \ne n,$$

where the summation is over all possible values of $y(m)$ and $y(n)$, and $P(\cdot, \cdot)$ represents the joint probability. This is usually written using the statistical averaging operator E as

$$E\big[y(m)\, y(n)\big] = 0, \quad m \ne n,$$

or

$$E\big[\mathbf{y}\mathbf{y}^T\big] = \operatorname{diag}\big(\sigma_0^2, \sigma_1^2, \ldots, \sigma_{N-1}^2\big),$$


where $\sigma_m^2$ is the variance of $y(m)$.

Note that statistical independence implies uncorrelatedness, but the reverse is not generally true (except for jointly Gaussian random variables).

The KLT can be derived by assuming zero-mean data, $E[\mathbf{x}] = 0$. The N × N correlation matrix of $\mathbf{y}$ then becomes

$$\mathbf{R}_y = E\big[\mathbf{y}\mathbf{y}^T\big] = \mathbf{A}\, E\big[\mathbf{x}\mathbf{x}^T\big]\, \mathbf{A}^T = \mathbf{A}\mathbf{R}_x\mathbf{A}^T.$$

From the orthonormality or unitary transform condition, i.e., $\mathbf{A}^T = \mathbf{A}^{-1}$, we get

$$\mathbf{R}_x \mathbf{A}^T = \mathbf{A}^T \operatorname{diag}\big(\sigma_0^2, \ldots, \sigma_{N-1}^2\big),$$

with the variances of the $y(m)$ along the main diagonal of the matrix. Thus, in terms of the column vectors $\mathbf{a}_m$ of $\mathbf{A}^T$, the above equation becomes

$$\mathbf{R}_x \mathbf{a}_m = \sigma_m^2\, \mathbf{a}_m.$$

We see that the basis vectors of the KLT are the eigenvectors of $\mathbf{R}_x$, orthonormalised to satisfy $\mathbf{a}_m^T \mathbf{a}_n = \delta(m - n)$.

[Note that whereas the other image transforms, like the DFT and DCT, are independent of the data, the KLT depends on the 2nd-order statistics of the data.]

The variances of the KLT coefficients are the eigenvalues of $\mathbf{R}_x$, and since $\mathbf{R}_x$ is symmetric and positive definite, the eigenvalues are real and positive. The KLT basis vectors and transform coefficients are also real. Besides decorrelating the transform coefficients, the KLT has another useful property: it maximizes the number of transform coefficients that are small enough to be insignificant. For example, suppose the KLT coefficients are ordered according to decreasing variance, i.e.,

$$\sigma_0^2 \ge \sigma_1^2 \ge \cdots \ge \sigma_{N-1}^2.$$

Also suppose that, for reasons of economy, we transmit only the first pN coefficients, where $p < 1$. The receiver then uses the truncated column vector

$$\tilde{\mathbf{y}} = [y(0), \ldots, y(pN-1), 0, \ldots, 0]^T$$

to form the reconstructed values as $\tilde{\mathbf{x}} = \mathbf{A}^T \tilde{\mathbf{y}}$.

The mean squared error (MSE) between the original and the reconstructed pels is then

$$\text{MSE} = E\big[(\mathbf{x} - \tilde{\mathbf{x}})^T (\mathbf{x} - \tilde{\mathbf{x}})\big] = \sum_{m=pN}^{N-1} \sigma_m^2.$$
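A compact numerical sketch of the KLT and its truncation error (ours, with our own variable names): take the eigenvectors of the covariance of a first-order Markov sequence as the basis, truncate, and check the empirical MSE against the sum of the discarded eigenvalues.

```python
import numpy as np

N, rho, p = 16, 0.95, 0.5
# Covariance of a first-order stationary Markov sequence: R[i, j] = rho^|i-j|.
R = rho ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))

# KLT basis: eigenvectors of R, ordered by decreasing eigenvalue (variance).
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
lam, A_T = eigvals[order], eigvecs[:, order]   # columns of A^T are the a_m

rng = np.random.default_rng(0)
x = rng.multivariate_normal(np.zeros(N), R, size=100_000).T  # N x samples
y = A_T.T @ x                                  # y = A x
keep = int(p * N)
y[keep:] = 0                                   # drop the low-variance coefficients
x_rec = A_T @ y                                # reconstruction x~ = A^T y~

mse = np.mean(np.sum((x - x_rec) ** 2, axis=0))
print(mse, lam[keep:].sum())  # empirical MSE vs. sum of discarded eigenvalues
```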


It can be shown that the KLT minimizes the MSE due to truncation. We now show that, of all possible linear transforms operating on length-N vectors having stationary statistics, the KLT minimizes the MSE due to truncation.

Let $\mathbf{B}$ be some other transform having unit-length basis vectors $\{\mathbf{b}_m\}$. First we show heuristically that the truncation error is minimum if the basis vectors $\{\mathbf{b}_m\}$ are orthogonalised using the well-known Gram-Schmidt procedure.

Let $S_t$ be the vector space spanned by the "transmitted" basis vectors $\mathbf{b}_m$ for $0 \le m \le pN-1$, and let $S_u$ be the space spanned by the remaining vectors. Therefore any vector $\mathbf{x}$ can be represented by a vector $\mathbf{x}_t \in S_t$ plus a vector $\mathbf{x}_u$, i.e., $\mathbf{x} = \mathbf{x}_t + \mathbf{x}_u$, as shown in the figure below.

If $S_t$ is the space spanned by the transmitted basis vectors, then the truncation error is the length of the vector $\mathbf{x}_u$. The error is minimized if $\mathbf{x}_u$ is orthogonal to $S_t$.


If only $\mathbf{x}_t$ is transmitted, then the MSE is the squared length of the error vector $\mathbf{x}_u$, which is clearly minimized if $\mathbf{x}_u$ is orthogonal to $S_t$, i.e., orthogonal to every transmitted basis vector. Further orthogonalisation of the basis vectors within $S_t$, or within $S_u$, does not affect the MSE.

Thus we assume $\mathbf{B}$ to be unitary, i.e., it has orthonormal basis vectors:

$$\mathbf{b}_m^T \mathbf{b}_n = \delta(m - n).$$

Its transform coefficients are then given by

$$z(m) = \mathbf{b}_m^T \mathbf{x}.$$

The mean square truncation error is then, by the same argument as for the KLT above,

$$\text{MSE}_B = E\Big[\sum_{m=pN}^{N-1} z(m)^2\Big] = \sum_{m=pN}^{N-1} \mathbf{b}_m^T \mathbf{R}_x \mathbf{b}_m.$$

Now suppose we represent the vectors $\mathbf{b}_m$ in terms of the KLT basis vectors $\{\mathbf{a}_n\}$, i.e.,

$$\mathbf{b}_m = \sum_{n=0}^{N-1} c_{mn}\, \mathbf{a}_n, \qquad c_{mn} = \mathbf{a}_n^T \mathbf{b}_m.$$

This follows from the completeness of the orthonormal set $\{\mathbf{a}_n\}$. Since $\mathbf{A}$ is unitary,

$$\sum_{n=0}^{N-1} c_{mn}^2 = \mathbf{b}_m^T \mathbf{b}_m = 1$$

(because the basis vectors are of unit length).

Now, from the eigen-equation $\mathbf{R}_x \mathbf{a}_n = \sigma_n^2 \mathbf{a}_n$, each term of the truncation error can be written in terms of the $c_{mn}$ as

$$\mathbf{b}_m^T \mathbf{R}_x \mathbf{b}_m = \sum_{n=0}^{N-1} c_{mn}^2\, \sigma_n^2.$$

Since the KLT is unitary, the coefficients $c_{mn}$ form a unitary matrix $\mathbf{C} = [c_{mn}]$ (a product of two orthogonal transformations), so its columns also have unit length:

$$\sum_{m=0}^{N-1} c_{mn}^2 = 1.$$


We may now write

$$\text{MSE}_B = \sum_{m=pN}^{N-1} \sum_{n=0}^{N-1} c_{mn}^2\, \sigma_n^2 = \sum_{n=0}^{N-1} d_n\, \sigma_n^2, \quad \text{where } d_n \triangleq \sum_{m=pN}^{N-1} c_{mn}^2.$$

From the row and column normalizations of $\mathbf{C}$ we have

$$0 \le d_n \le 1 \quad \text{and} \quad \sum_{n=0}^{N-1} d_n = N - pN,$$

i.e., the weights $d_n$ distribute a total weight of $N - pN$ over the variances. Since the variances are ordered, $\sigma_0^2 \ge \sigma_1^2 \ge \cdots \ge \sigma_{N-1}^2$, the weighted sum $\sum_n d_n \sigma_n^2$ is smallest when all of the weight is placed on the smallest variances, i.e., when $d_n = 1$ for $pN \le n \le N-1$ and $d_n = 0$ otherwise. Therefore

$$\text{MSE}_B \ge \sum_{n=pN}^{N-1} \sigma_n^2,$$

and the right-hand side is exactly the MSE for the KLT. This concludes the proof that the MSE for any other linear transform is at least that of the KLT.


    The Hadamard Transform

The Hadamard transform and the Haar transform, to be considered in the next section, share a significant computational advantage over the previously considered DFT, DCT, and DST transforms. Their unitary matrices consist of ±1 entries (apart from an overall scale factor), and the transforms are computed via additions and subtractions only, with no multiplications being involved. Hence, for processors for which multiplication is a time-consuming operation, a substantial saving is obtained.

The Hadamard unitary matrix of order n is the $N \times N$ matrix $\mathbf{H}_n$, $N = 2^n$, generated by the following iteration rule:

$$\mathbf{H}_n = \mathbf{H}_1 \otimes \mathbf{H}_{n-1} \qquad (4.5.1)$$

where

$$\mathbf{H}_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad (4.5.2)$$

and $\otimes$ denotes the Kronecker product of two matrices:

$$\mathbf{A} \otimes \mathbf{B} \triangleq \begin{bmatrix} A(1,1)\mathbf{B} & A(1,2)\mathbf{B} & \cdots & A(1,N)\mathbf{B} \\ \vdots & & & \vdots \\ A(N,1)\mathbf{B} & A(N,2)\mathbf{B} & \cdots & A(N,N)\mathbf{B} \end{bmatrix}$$

where $A(i,j)$ is the $(i,j)$ element of $\mathbf{A}$, $i, j = 1, 2, \ldots, N$. Thus, according to (4.5.1), (4.5.2), it is

$$\mathbf{H}_2 = \mathbf{H}_1 \otimes \mathbf{H}_1 = \frac{1}{2} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

and, for higher orders, the recursion gives

$$\mathbf{H}_n = \frac{1}{\sqrt{2}} \begin{bmatrix} \mathbf{H}_{n-1} & \mathbf{H}_{n-1} \\ \mathbf{H}_{n-1} & -\mathbf{H}_{n-1} \end{bmatrix}.$$

It is not difficult to show the orthogonality of $\mathbf{H}_n$, that is,

$$\mathbf{H}_n \mathbf{H}_n^T = \mathbf{H}_n^T \mathbf{H}_n = \mathbf{I}.$$

For a vector $\mathbf{x}$ of N samples and $N = 2^n$, the transform pair is

$$\mathbf{y} = \mathbf{H}_n \mathbf{x}, \qquad \mathbf{x} = \mathbf{H}_n \mathbf{y}.$$


The 2-D Hadamard transform is given by

$$\mathbf{Y} = \mathbf{H}_n \mathbf{X} \mathbf{H}_n, \qquad \mathbf{X} = \mathbf{H}_n \mathbf{Y} \mathbf{H}_n.$$

The Hadamard transform has good to very good energy packing properties. Fast algorithms for its computation in $O(N \log_2 N)$ subtractions and/or additions are also available.
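A minimal sketch of ours that builds $\mathbf{H}_n$ by the Kronecker recursion (4.5.1)-(4.5.2) and checks the orthogonality and the transform pair:

```python
import numpy as np

def hadamard(n):
    """Hadamard unitary matrix of order n (size N = 2**n), built by the
    Kronecker-product recursion H_n = H_1 (x) H_{n-1}."""
    H1 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    H = H1
    for _ in range(n - 1):
        H = np.kron(H1, H)
    return H

n = 3
H = hadamard(n)
N = 2 ** n
assert np.allclose(H @ H.T, np.eye(N))  # orthogonality: H H^T = I

x = np.random.rand(N)
y = H @ x                                # forward transform y = H x
assert np.allclose(H @ y, x)             # inverse: x = H y (H is symmetric)
```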

    Remark

Experimental results using the DCT, DST, and Hadamard transforms for texture discrimination have shown that the performance obtained was close to that of the optimal KL transform. At the same time, this near-optimal performance is obtained at substantially reduced complexity, due to the availability of fast computational schemes, as reported before.