7/28/2019 Sree Project Document
Chapter 1
Introduction
Computers are becoming more powerful day by day, and as a result the use of digital images is increasing rapidly. With this growth comes the serious problem of storing and transferring the huge volume of data representing the images, because uncompressed multimedia (graphics, audio and video) data requires considerable storage capacity and transmission bandwidth. Despite rapid progress in mass-storage density, processor speed and the performance of digital communication systems, the demand for data storage capacity and data transmission bandwidth continues to exceed the capabilities of available technologies. In addition, the recent growth of data-intensive, multimedia-based web applications has put pressure on researchers to find ways of using images in web applications more effectively.
Internet teleconferencing, High Definition Television (HDTV), satellite communications
and digital storage of movies are not feasible without a high degree of compression. As it is, such
applications are far from realizing their full potential largely due to the limitations of common
image compression techniques.
An image typically contains redundant data; that is, from a certain point of view it carries the same information more than once. Data compression techniques make it possible to remove some of this redundant information. Image compression minimizes the size in bytes of a graphics file without degrading the quality of the image to an unacceptable level. The reduction in file size allows more images to be stored in a given amount of disk or memory space, and it also reduces the time needed for images to be sent over the Internet or downloaded from web pages.
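The storage arithmetic behind these claims can be made concrete. The image size and compression ratio below are illustrative assumptions, not figures from this document:

```python
# Illustrative arithmetic: storage needed for an uncompressed 24-bit color
# image versus the same image compressed at an assumed 10:1 ratio.
width, height = 512, 512          # pixels (assumed image size)
bits_per_pixel = 24               # true color: 8 bits per R, G, B channel

uncompressed_bytes = width * height * bits_per_pixel // 8
compression_ratio = 10            # assumed typical lossy ratio
compressed_bytes = uncompressed_bytes // compression_ratio

print(uncompressed_bytes)         # 786432 bytes (768 KB)
print(compressed_bytes)           # 78643 bytes (about 77 KB)
```

At this assumed ratio, a disk that holds one uncompressed image holds ten compressed ones, and transmission time shrinks by the same factor.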
Wavelets are functions which allow the analysis of signals or images according to scales or resolutions. The processing of signals by wavelet algorithms in fact works much the same way the human eye does, or the way a digital camera processes visual scales of resolution and intermediate details. The same principle also applies to cell-phone signals and even digitized colour images.
Wavelets are of real use in these areas, for example in approximating data with sharp discontinuities such as choppy signals, or pictures with many edges. While wavelets are perhaps a chapter in function theory, we show that the algorithms that result are key to the processing of numbers, or more precisely of digitized information: signals, time series, still images, movies, colour images, etc.
The Haar Transform is memory efficient, exactly reversible without edge effects, and fast and simple; as such, it is widely used in wavelet analysis today. The Fast Haar Transform (FHT) is an algorithm that reduces the tedious work of calculation, and one of the earliest versions of the FHT is included in the HT. The FHT involves only addition, subtraction and division by 2, and it finds application in atmospheric turbulence analysis, image analysis, and signal and image compression.
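One level of the averaging-and-differencing scheme described above can be sketched as follows. This is a minimal illustration of the Haar idea, not the project's actual implementation:

```python
def haar_level(signal):
    """One level of the 1D Haar transform: pairwise averages (approximation)
    followed by pairwise half-differences (detail). Uses only addition,
    subtraction and division by 2, as the text notes."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details  = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages + details

def haar_level_inverse(coeffs):
    """Exact inverse: a = avg + det, b = avg - det (no edge effects)."""
    half = len(coeffs) // 2
    out = []
    for avg, det in zip(coeffs[:half], coeffs[half:]):
        out.extend([avg + det, avg - det])
    return out

x = [9, 7, 3, 5]
t = haar_level(x)
print(t)                          # [8.0, 4.0, 1.0, -1.0]
print(haar_level_inverse(t))      # recovers [9.0, 7.0, 3.0, 5.0]
```

Applying haar_level repeatedly to the approximation half yields the full multi-level decomposition; exact reversibility is what makes the transform lossless before any quantization step.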
In the Modified Fast Haar Wavelet Transform (MFHWT), the MFHWT is used for the one-dimensional approach and the FHT is used to find the N/2 detail coefficients at each level for a signal of length N. This project uses the same concept of finding averages and differences, but extends that approach to 2D images, with the addition of taking the detail coefficients to be 0 for N/2 elements at each level. The Haar Transform and Fast Haar Transform are explained, and the Modified Fast Haar Wavelet Transform is presented along with the proposed algorithm for 2D images.
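The one-dimensional MFHWT step, as described above (averages over four adjacent samples, with N/2 detail coefficients simply taken as 0 at each level), can be sketched as follows. The project's exact algorithm is not given here, so the formulation below is an assumption based on that description:

```python
def mfhwt_level(signal):
    """One level of a Modified Fast Haar Wavelet Transform, sketched from
    the description in the text (an assumption, not the project's code).
    Approximation coefficients are averages of four adjacent samples, and
    N/2 detail coefficients at each level are set to zero outright, which
    avoids computing them."""
    n = len(signal)
    approx = [(signal[i] + signal[i + 1] + signal[i + 2] + signal[i + 3]) / 4
              for i in range(0, n, 4)]
    details = [0] * (n // 2)   # N/2 detail coefficients taken as 0
    return approx, details

a, d = mfhwt_level([1, 3, 5, 7, 2, 4, 6, 8])
print(a)   # [4.0, 5.0]
print(d)   # [0, 0, 0, 0]
```

Zeroing half the detail coefficients is what buys the speed-up over the plain FHT, at the cost of discarding some fine detail; the 2D extension applies the same step along rows and then columns.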
Chapter 2
Introduction to Digital Image Processing
2.1 Introduction:
A digital image is a collection of pixels laid out in a specific order, with a width (x) and height (y) measured in pixels. Each pixel has a numerical value, which corresponds to a color or gray-scale value. A pixel has no absolute size, and pixels may (sometimes, not always) have a spatial value. (Spatial data is data associated with the pixels that provides information about the size of the objects in the image.)
Fig: 2.1 Representation of digital image in x and y pixel format.
An image may be defined as a two-dimensional function, f(x, y), where x and y are
spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity
or gray level of the image at that point. When x, y, and the amplitude values of f are all finite,
discrete quantities, we call the image a digital image. The field of digital image processing refers
to processing digital images by means of a digital computer. Note that a digital image is
composed of a finite number of elements, each of which has a particular location and value.
These elements are referred to as picture elements, image elements and pixels. Pixel is the term
most widely used to denote the elements of a digital image.
Vision is the most advanced of our senses, so it is not surprising that images play the
single most important role in human perception. However, unlike humans, who are limited to the
visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire
EM spectrum, ranging from gamma rays to radio waves. They can also operate on images generated
by sources that humans are not accustomed to associating with images. These include ultrasound,
electron microscopy, and computer-generated images. Thus, digital image processing
encompasses a wide and varied field of applications. There is no general agreement among authors regarding where image processing stops and where other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by
defining image processing as a discipline in which both the input and output of a process are
images. We believe this to be a limiting and somewhat artificial boundary. For example, under
this definition, even the trivial task of computing the average intensity of an image would not be
considered an image processing operation. On the other hand, there are fields such as computer
vision whose ultimate goal is to use computers to emulate human vision, including learning and
being able to make inferences and take actions based on visual inputs. This area itself is a branch
of artificial intelligence (AI), whose objective is to emulate human intelligence. The field of AI
is in its earliest stages of infancy in terms of development, with progress having been much
slower than originally anticipated. The area of image analysis (also called image understanding)
is in between image processing and computer vision. There are no clear-cut boundaries in the
continuum from image processing at one end to computer vision at the other. However, one
useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, and high-level processes. Low-level processes involve primitive operations such as image
pre-processing to reduce noise, contrast enhancement, and image sharpening. A low-level
process is characterized by the fact that both its inputs and outputs are images. Mid-level
processes on images involve tasks such as segmentation (partitioning an image into regions or
objects), description of those objects to reduce them to a form suitable for computer processing,
and classification (recognition) of individual objects. A mid-level process is characterized by the
fact that its inputs generally are images, but its outputs are attributes extracted from those
images (e.g., edges, contours, and the identity of individual objects). Finally, higher-level
processing involves making sense of an ensemble of recognized objects, as in image analysis,
and, at the far end of the continuum, performing the cognitive functions normally associated with
human vision.
Based on the preceding comments, we see that a logical place of overlap between image processing and image analysis is the area of recognition of individual regions or objects in an
image. Thus, what we call in this book digital image processing encompasses processes whose
inputs and outputs are images and, in addition, encompasses processes that extract attributes
from images, up to and including the recognition of individual objects. As a simple illustration to
clarify these concepts, consider the area of automated analysis of text. The processes of acquiring
an image of the area containing the text, pre-processing that image, extracting (segmenting) the
individual characters, describing the characters in a form suitable for computer processing, and
recognizing those individual characters, are in the scope of what we call digital image processing
in this book. Making sense of the content of the page may be viewed as being in the domain of
image analysis and even computer vision, depending on the level of complexity implied by the
statement "making sense". Digital image processing, as we have defined it, is used successfully
in a broad range of areas of exceptional social and economic value.
2.2 Digital Image Characteristics:
Pixel - An abbreviation of the term 'picture element.' A pixel is the smallest picture element of a digital image. A monochrome pixel can have two values, black or white (0 or 1). Color and gray scale require more bits; true color, displaying approximately 16.7 million colors, requires 24 bits for each pixel. A pixel may have more data than the eye can perceive at one time.
Dot - The smallest unit that a printer can print.
Voxel - An abbreviation of the term 'volume element.' The smallest distinguishable box-shaped part of a three-dimensional space. A particular voxel is identified by the x, y and z coordinates of one of its eight corners, or perhaps its centre. The term is used in three-dimensional modeling. Voxels need not have uniform dimensions in all three coordinate planes.
To the human observer, the internal structures and functions of the human body are not
generally visible. However, by various technologies, images can be created through which the
medical professional can look into the body to diagnose abnormal conditions and guide
therapeutic procedures. The medical image is a window to the body. No image window reveals
everything. Different medical imaging methods reveal different characteristics of the human
body. The medical imaging process has five major components: the patient, the imaging system, the system operator, the image itself, and the observer. The objective is to make an object or condition within the patient's body visible to the observer. The
visibility of specific anatomical features depends on the characteristics of the imaging system
and the manner in which it is operated. Most medical imaging systems have a considerable
number of variables that must be selected by the operator. They can be changeable system
components, such as intensifying screens in radiography, transducers in sonography, or coils in
magnetic resonance imaging (MRI). However, most variables are adjustable physical quantities
associated with the imaging process, such as kilovoltage in radiography, gain in sonography,
and echo time (TE) in MRI. The values selected will determine the quality of the image and the
visibility of specific body features.
2.2.1 Image Quality:
The quality of a medical image is determined by the imaging method, the characteristics
of the equipment, and the imaging variables selected by the operator. Image quality is not a
single factor but is a composite of at least five factors: contrast, blur, noise, artefacts, and distortion. The human body contains many structures and objects that are
simultaneously imaged by most imaging methods. We often consider a single object in relation
to its immediate background. In fact, with most imaging procedures the visibility of an object is
determined by this relationship rather than by the overall characteristics of the total image.
The task of every imaging system is to translate a specific tissue characteristic into image shades
of gray or colour. If contrast is adequate, the object will be visible. The degree of contrast in the
image depends on characteristics of both the object and the imaging system.
2.2.2 Image Contrast:
Contrast means difference. In an image, contrast can be in the form of different shades of
gray, light intensities, or colors. Contrast is the most fundamental characteristic of an image. An
object within the body will be visible in an image only if it has sufficient physical contrast
relative to surrounding tissue. However, image contrast much beyond that required for good
object visibility generally serves no useful purpose and in many cases is undesirable. The
physical contrast of an object must represent a difference in one or more tissue characteristics.
For example, in radiography, objects can be imaged relative to their surrounding tissue if there is
an adequate difference in either density or atomic number and if the object is sufficiently thick.
When a value is assigned to contrast, it refers to the difference between two specific points or
areas in an image. In most cases we are interested in the contrast between a specific structure or
object in the image and the area around it or its background.
2.2.3 Contrast Sensitivity:
The degree of physical object contrast required for an object to be visible in an image
depends on the imaging method and the characteristics of the imaging system. The primary
characteristic of an imaging system that establishes the relationship between image contrast and
object contrast is its contrast sensitivity. Consider the situation shown below. The circular
objects are the same size but are filled with different concentrations of iodine contrast medium.
That is, they have different levels of object contrast. When the imaging system has a relatively
low contrast sensitivity, only objects with a high concentration of iodine (i.e., high object contrast)
will be visible in the image. If the imaging system has a high contrast sensitivity, the lower-
contrast objects will also be visible.
It should be emphasized that contrast sensitivity is a characteristic of the imaging method and the
variables of the particular imaging system. It is the characteristic that relates to the system's
ability to translate physical object contrast into image contrast. The contrast transfer
characteristic of an imaging system can be considered from two perspectives. From the
perspective of adequate image contrast for object visibility, an increase in system contrast
sensitivity causes lower-contrast objects to become visible. However, if we consider an object
with a fixed degree of physical contrast (i.e., a fixed concentration of contrast medium), then
increasing contrast sensitivity will increase image contrast.
It is difficult to compare the contrast sensitivity of various imaging methods because
many are based on different tissue characteristics. However, certain methods do have higher
contrast sensitivity than others. For example, computed tomography (CT) generally has a higher
contrast sensitivity than conventional radiography. This is demonstrated by the ability of CT to
image soft tissue objects (masses) that cannot be imaged with radiography. Consider the image
below. Here is a series of objects with different degrees of physical contrast. They could be
vessels filled with different concentrations of contrast medium. The highest concentration (and
contrast) is at the bottom. Now imagine a curtain coming down from the top and covering some
of the objects so that they are no longer visible. Contrast sensitivity is the characteristic of the
imaging system that raises and lowers the curtain. Increasing sensitivity raises the curtain and
allows us to see more objects in the body. A system with low contrast sensitivity allows us to
visualize only objects with relatively high inherent physical contrast.
2.2.4 Blur and Visibility of Detail:
Structures and objects in the body vary not only in physical contrast but also in size.
Objects range from large organs and bones to small structural features such as trabecula patterns
and small calcifications. It is the small anatomical features that add detail to a medical image.
Each imaging method has a limit as to the smallest object that can be imaged and thus on
visibility of detail. Visibility of detail is limited because all imaging methods introduce blurring
into the process. The primary effect of image blur is to reduce the contrast and visibility of small
objects or detail. Consider the image below, which represents the various objects in the body in
terms of both physical contrast and size. As we said, the boundary between visible and invisible
objects is determined by the contrast sensitivity of the imaging system. We now extend the idea
of our curtain to include the effect of blur. It has little effect on the visibility of large objects but
it reduces the contrast and visibility of small objects. When blur is present, and it always is, our
curtain of invisibility covers small objects and image detail.
2.2.5 Noise:
Another characteristic of all medical images is image noise. Image noise, sometimes
referred to as image mottle, gives an image a textured or grainy appearance. The source and
amount of image noise depend on the imaging method and are discussed in more detail in a later
chapter. We now briefly consider the effect of image noise on visibility. In the image below we
find our familiar array of body objects arranged according to physical contrast and size. We now
add a third factor, noise, which will affect the boundary between visible and invisible objects.
The general effect of increasing image noise is to lower the curtain and reduce object visibility.
In most medical imaging situations the effect of noise is most significant on the low-contrast
objects that are already close to the visibility threshold.
2.2.6 Object Contrast:
The ability to see or detect an object is heavily influenced by the contrast between the
object and its background. For most viewing tasks there is not a specific threshold contrast at
which the object suddenly becomes visible. Instead, the accuracy of seeing or detecting a specific
object increases with contrast. The contrast sensitivity of the human viewer changes with
viewing conditions. When viewer contrast sensitivity is low, an object must have a relatively
high contrast to be visible. The degree of contrast required depends on conditions that alter the
contrast sensitivity of the observer: background brightness, object size, viewing distance, glare,
and background structure.
2.2.7 Background Brightness:
The human eye can function over a large range of light levels or brightness, but vision is
not equally sensitive at all brightness levels. The ability to detect objects generally increases with
increasing background brightness or image illumination. To be detected in areas of low
brightness, an object must be large and have a relatively high level of contrast with respect to its
background. This can be demonstrated with the image above. View this image with
different levels of illumination. You will notice that under low illumination you cannot see all of
the small and low-contrast objects. A higher level of object contrast is required for visibility.
2.3 File Formats:
A file format defines the components of a digital image (the x and y values, the values of the pixels, colour/gray scale, compression, the manner in which the pixels are laid out, etc.). Standard file formats provide for the exchange of digital image information.
Many file formats exist, including:
JPEG - Joint Photographic Experts Group.
TIFF - Tagged Image File Format.
PNG - Portable Network Graphics.
2.4 Digital Image Representation:
An image is defined as a two-dimensional function, i.e. a matrix f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. Color images are formed by combining individual two-dimensional images. For example, in the RGB color system, a color image consists of three individual component images: red, green and blue. Thus many of the techniques
developed for monochrome images can be extended to color images by processing the three
component images individually. When x, y and the amplitude values of f are all finite, discrete
quantities, the image is called a digital image. The field of digital image processing refers to
processing digital images by means of a digital computer. A digital image is composed of a finite
number of elements, each of which has a particular location and value. These elements are
referred to as picture elements, image elements, pels and pixels. Since pixel is the most widely
used term, the elements will be denoted as pixels from now on.
An image may be continuous with respect to the x- and y-coordinates, and also in
amplitude. Digitizing the coordinates as well as the amplitude converts such an image to digital form. Here, the digitization of the coordinate values is called
sampling; digitizing the amplitude values is called quantization. A digital image is composed of
a finite number of elements, each of which has a particular location and value. The field of
digital image processing refers to processing digital images by means of a digital computer.
2.4.1 Coordinate Convention:
Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns; the image is then of size M x N. The values of the coordinates (x, y) are discrete quantities, and integer values are used for them. In many image processing books, the image origin is set at (x, y) = (0, 0), and the next coordinate values along the first row of the image are (x, y) = (0, 1). Note that the notation (0, 1) signifies the second sample along the first row; these are not necessarily the actual physical coordinates at which the image was sampled. Here x ranges from 0 to M-1, and y from 0 to N-1, where x and y are integers. In the Wavelet Toolbox, however, the notation (r, c) is used, where r indicates the row and c the column. The order of coordinates is the same as the order discussed previously; the major difference is that the origin of the coordinate
system is at (r, c) = (1, 1); hence r ranges from 1 to M, and c from 1 to N for r and c integers. The
coordinates are referred to as pixel coordinates.
2.4.2 Images as Matrices:
The coordinate system discussed in the preceding section leads to the following representation for the digitized image function:

    f(x, y) = [ f(0, 0)     f(0, 1)     ...  f(0, N-1)
                f(1, 0)     f(1, 1)     ...  f(1, N-1)
                ...
                f(M-1, 0)   f(M-1, 1)   ...  f(M-1, N-1) ]          (2.1)

The right side of the equation is a representation of a digital image. Each element of this array (matrix) is called a pixel.
Now, in MATLAB, the digital image is represented as the following matrix:

    f = [ f(1, 1)   f(1, 2)   ...  f(1, N)
          f(2, 1)   f(2, 2)   ...  f(2, N)
          ...
          f(M, 1)   f(M, 2)   ...  f(M, N) ]                        (2.2)

where M is the number of rows and N the number of columns. Matrices in MATLAB are stored in variables with names such as A, a, RGB, real array and so on.
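The matrix representation above can be made concrete with a small example. Plain Python lists are used here purely for illustration; MATLAB arrays (or NumPy) would be used in practice:

```python
# A 3x4 digital image as a matrix of gray levels f(x, y):
# x indexes rows (0..M-1), y indexes columns (0..N-1), as in Eq. (2.1).
f = [
    [0,   64, 128, 255],
    [32,  96, 160, 224],
    [16,  80, 144, 208],
]
M, N = len(f), len(f[0])    # image size M x N = 3 x 4

# Intensity (gray level) at spatial coordinates (x, y) = (1, 2):
print(f[1][2])   # 160
# In MATLAB's 1-based (r, c) convention of Eq. (2.2), the same
# pixel would be addressed as f(2, 3).
```

The only difference between the two conventions is the origin: (0, 0) in the book notation versus (1, 1) in MATLAB's pixel coordinates.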
2.4.3 Color Image Representation:
An RGB color image is an M x N x 3 array or matrix of color pixels, where each color
pixel consists of a triplet corresponding to the red, green, and blue components of an RGB image
at a specific spatial location. An RGB image may be viewed as a stack of three gray-scale
images, that when fed into the red, green, and blue inputs of a color monitor, produce a color
image on the screen. So from the stack of three images forming that RGB color image, each
image is referred to as the red, green, and blue component images by convention. Now, the data
class of the component images determine their range of values. If an RGB color image is of
class double, meaning that all the pixel values are of type double, the range of values is [0, 1].
Likewise, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16,
respectively. The number of bits used to represent the pixel values of the component images
determines the bit depth of an RGB color image.
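The per-class value ranges described above follow directly from the bit depth. A small illustration (the rescaling helper is a common normalization, shown here as an assumption rather than a quote of any toolbox function):

```python
# Value ranges for RGB component images by data class, as described above:
# class double -> [0, 1]; uint8 -> [0, 255]; uint16 -> [0, 65535].
uint8_max  = 2 ** 8  - 1    # 255
uint16_max = 2 ** 16 - 1    # 65535

def uint8_to_double(v):
    """Rescale a uint8 pixel value into the [0, 1] range of class double
    (the usual normalization, shown for illustration)."""
    return v / uint8_max

print(uint8_max, uint16_max)            # 255 65535
print(uint8_to_double(255))             # 1.0
print(round(uint8_to_double(128), 3))   # 0.502
```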
The RGB color space is shown graphically as an RGB color cube. The vertices of the cube are the primary (red, green, and blue) and secondary (cyan, magenta, and yellow) colors of light.
2.4.4 Indexed Images:
An indexed image has two components: a data matrix of integers, X, and a colormap matrix, map. Matrix map is an m x 3 array of class double containing floating-point values in the range [0, 1]. The length, m, of the map is equal to the number of colors it defines, and each row of map specifies the red, green, and blue components of a single color. An indexed image uses direct mapping of pixel intensity values to colormap values: the color of each pixel is determined by using the corresponding value of the integer matrix X as a pointer into map. If X is of class double, then all of its components with value 1 point to the first row of map, all components with value 2 point to the second row, and so on. If X is of class uint8 or uint16, then all components with value 0 point to the first row in map, all components with value 1 point to the second row, and so on.
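The index-to-color lookup just described can be sketched with a toy image (made-up values; real code would use the toolbox's own indexed-image routines):

```python
# A toy indexed image: an integer matrix X plus an m-by-3 colormap with
# entries in [0, 1]. For integer classes (uint8/uint16), index value 0
# points to the FIRST row of the map, as the text states.
cmap = [
    [0.0, 0.0, 0.0],   # row 1: black
    [1.0, 0.0, 0.0],   # row 2: red
    [1.0, 1.0, 1.0],   # row 3: white
]

X_uint8 = [[0, 1],
           [2, 1]]

def lookup_uint8(X, cmap):
    """Map integer-class index values to RGB triplets (0 -> first row)."""
    return [[cmap[v] for v in row] for row in X]

rgb = lookup_uint8(X_uint8, cmap)
print(rgb[0][1])   # [1.0, 0.0, 0.0]  (red)
```

For class double the only change would be subtracting 1 from each index, since value 1 points to the first row.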
2.4.5 The Basics of Color Image Processing:
Color image processing techniques deal with how color images are handled for a variety of image-processing tasks. For the purposes of the following discussion we subdivide color image processing into three principal areas: (1) color transformations (also called color mappings); (2) spatial processing of individual color planes; and (3) color vector processing. The first category deals with processing the pixels of each color plane based strictly on their values and not on their spatial coordinates; this category is analogous to intensity transformations. The second category deals with spatial (neighborhood) filtering of individual color planes and is analogous to spatial filtering. The third category deals with techniques based on processing all components of a color image simultaneously. Since full-color images have at least three components, color pixels are indeed vectors. For example, in the RGB color system, a color point can be interpreted as a vector extending from the origin to that point in the RGB coordinate system.
Let c represent an arbitrary vector in RGB color space:
    c = [cR, cG, cB]^T = [R, G, B]^T                                (2.3)

This equation indicates that the components of c are simply the RGB components of a color image at a point. Since the color components are a function of the coordinates (x, y), we can write:

    c(x, y) = [cR(x, y), cG(x, y), cB(x, y)]^T
            = [R(x, y), G(x, y), B(x, y)]^T                         (2.4)
For an image of size M x N, there are MN such vectors, c(x, y), for x = 0, 1, ..., M-1 and y = 0, 1, ..., N-1. In order for independent color-component processing and vector-based processing to be equivalent, two conditions have to be satisfied: (i) the process has to be applicable to both vectors and scalars, and (ii) the operation on each component of a vector must be independent of the other components. Consider neighborhood averaging as an example. For a gray-scale image, the averaging is accomplished by summing the gray levels of all the pixels in the neighborhood and dividing by the number of pixels. For a color image, the averaging can equally be done by summing all the vectors in the neighborhood and dividing each component of the result by the number of pixels: each component of the average vector is then the sum of the pixels in the image corresponding to that component, which is the same result that would be obtained if the averaging were done on the neighborhood of each component image individually and the color vector were formed.
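The equivalence of per-component and vector averaging is easy to verify numerically. A minimal sketch with a made-up three-pixel neighborhood:

```python
# Neighborhood averaging done per color component versus on RGB vectors
# gives identical results, illustrating the equivalence described above.
neighborhood = [(10, 20, 30), (20, 40, 60), (30, 60, 90)]  # RGB pixels
n = len(neighborhood)

# (a) Average each component image independently.
per_component = tuple(sum(p[c] for p in neighborhood) / n for c in range(3))

# (b) Sum the vectors first, then divide each component of the sum by n.
vector_sum = [sum(p[c] for p in neighborhood) for c in range(3)]
vector_average = tuple(s / n for s in vector_sum)

print(per_component)     # (20.0, 40.0, 60.0)
print(vector_average)    # (20.0, 40.0, 60.0) -- identical
```

Both conditions (i) and (ii) hold for averaging, which is why the two routes agree; an operation that mixes components, such as a hue rotation, would not satisfy (ii).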
2.4.6 Reading Images:
In MATLAB, images are read into the MATLAB environment using function called
imread. The syntax is as follows: imread(filename) Here, filename is a string containing the
complete name of the image file including any applicable extension. For example, the command
line >> f = imread (x.jpg); reads the JPEG image into image array or image matrix f. Since there
are three color components in the image, namely red, green and blue components, the image is
broken down into the three distinct color matrices fR, fG and fB.
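A minimal sketch of that channel split, using a tiny hand-made array in place of a file read (in MATLAB the array would come from imread; the pixel values below are made up):

```python
# Splitting an RGB image array into its component matrices fR, fG, fB.
f = [[(255, 0, 0), (0, 255, 0)],
     [(0, 0, 255), (128, 128, 128)]]     # a 2x2 RGB "image"

fR = [[px[0] for px in row] for row in f]   # red component image
fG = [[px[1] for px in row] for row in f]   # green component image
fB = [[px[2] for px in row] for row in f]   # blue component image

print(fR)   # [[255, 0], [0, 128]]
print(fB)   # [[0, 0], [255, 128]]
```

Each component matrix is then a gray-scale image in its own right, which is what lets monochrome techniques be applied channel by channel.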
2.5 Standard method of image compression:
In 1992, JPEG established the first international standard for still image compression
where the encoders and decoders are DCT-based. The JPEG standard specifies three modes
namely sequential, progressive, and hierarchical for lossy encoding, and one mode of lossless
encoding. The performance of the coders for JPEG usually degrades at low bit-rates mainly
because of the underlying block-based Discrete Cosine Transform (DCT). The baseline JPEG coder [5] is the sequential encoding in its simplest form. Figs. 2.2 and 2.3 show the key processing steps in such an encoder and decoder, respectively, for grayscale images. Color image
compression can be approximately regarded as compression of multiple grayscale images, which
are either compressed entirely one at a time, or are compressed by alternately interleaving 8x8
sample blocks from each in turn.
The DCT-based encoder can be thought of as essentially compression of a stream of 8x8
blocks of image samples. Each 8x8 block makes its way through each processing step, and yields
output in compressed form into the data stream. Because adjacent image pixels are highly
correlated, the Forward DCT (FDCT) processing step lays the basis for gaining data compression by concentrating most of the signal in the lower spatial frequencies. For a typical 8x8 sample block from a typical source image, most of the spatial frequencies have zero or near-zero amplitude and need not be encoded.
Fig: 2.2 Encoder Block Diagram (Original Image -> FDCT -> Quantizer -> Entropy Encoder -> Compressed Image Data, using a Quantization Table (QT) and a Huffman Table).
Fig: 2.3 Decoder Block Diagram (Compressed Image Data -> Entropy Decoder -> Dequantizer -> Inverse DCT -> Reconstructed Image, using a Quantization Table (QT) and a Huffman Table).
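As a sketch of the FDCT step in the encoder, the following evaluates the 2D DCT of an 8x8 block directly from its definition (a naive O(N^4) form for illustration, not an optimized codec kernel). A flat block shows the energy-concentration property: everything collapses into the single DC coefficient.

```python
import math

def dct2_8x8(block):
    """Forward 2D DCT (FDCT) of an 8x8 block, as used in baseline JPEG,
    evaluated directly from the textbook definition."""
    N = 8
    def c(k):                       # normalization factor
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

# A flat (constant) block: all energy ends up in the DC coefficient.
flat = [[100] * 8 for _ in range(8)]
coeffs = dct2_8x8(flat)
print(round(coeffs[0][0]))   # 800 -- the DC term; every AC term is ~0
```

Real encoders use fast factorizations of this transform, but the output coefficients are the same.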
After output from the Forward DCT (FDCT), each of the 64 DCT coefficients is
uniformly quantized in conjunction with a carefully designed 64-element Quantization Table
(QT). At the decoder, the quantized values are multiplied by the corresponding QT elements to recover approximations of the original unquantized values. After quantization, all the quantized coefficients are
ordered into zig-zag sequence. This ordering helps to facilitate entropy encoding by placing low
frequency non-zero coefficients before high-frequency coefficients. The DC coefficient, which
contains a significant fraction of the total image energy, is differentially encoded.
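The quantization and zig-zag steps can be sketched as follows. The coefficient values and quantization table below are made-up illustrations on a 4x4 block; JPEG itself operates on 8x8 blocks with standard tables:

```python
# Uniform quantization against a quantization table, then zig-zag ordering
# of a toy 4x4 coefficient block (illustrative values only).
coeffs = [[-415, -30, -61,  27],
          [   4, -22, -61,  10],
          [ -47,   7,  77, -25],
          [ -49,  12,  34, -15]]
qt     = [[  16,  11,  10,  16],
          [  12,  12,  14,  19],
          [  14,  13,  16,  24],
          [  14,  17,  22,  29]]

# Quantize: divide each coefficient by its table entry and round.
quant = [[round(coeffs[i][j] / qt[i][j]) for j in range(4)] for i in range(4)]

def zigzag(block):
    """Read a square block in zig-zag order (low frequencies first):
    anti-diagonals i+j in increasing order, alternating direction."""
    n = len(block)
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [block[i][j] for i, j in order]

stream = zigzag(quant)
print(stream[0])   # -26: the quantized DC coefficient, which JPEG
                   # then encodes differentially across blocks
```

Low-frequency (nonzero) values land at the front of the stream and the high-frequency near-zero values trail behind, which is exactly what makes the subsequent entropy coding effective.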
Entropy Coding (EC) achieves additional compression losslessly through encoding the
quantized DCT coefficients more compactly based on their statistical characteristics. The
JPEG proposal specifies both Huffman coding and arithmetic coding. More recently, the wavelet
transform has emerged as a cutting-edge technology within the field of image analysis. Wavelets
are a mathematical tool for hierarchically decomposing functions. Though rooted in
approximation theory, signal processing, and physics, wavelets have also recently been applied
to many problems in Computer Graphics, including image editing and compression, automatic
level-of-detail control for editing and rendering curves and surfaces, surface reconstruction from
contours, and fast methods for solving simulation problems in 3D modelling, global illumination,
and animation.
Wavelet-based coding provides substantial improvements in picture quality at higher
compression ratios. Over the past few years, a variety of powerful and sophisticated wavelet-
based schemes for image compression have been developed and implemented. Because of the
many advantages of wavelet-based image compression, the top contenders in the
JPEG-2000 standard are all wavelet-based compression algorithms.
2.6 Conclusion:
The digital image characteristics, digital image representation in different analyses, the
basics of colour image processing, and the standard method of image compression have been
discussed in this chapter. Image compression using different techniques is discussed in the next chapter.
Chapter 3
Image compression using different techniques
3.1 Introduction:
Here, some background topics of image compression which include the principles of
image compression, the classification of compression methods and the framework of a general
image coder and wavelets for image compression, different types of transforms and quantization
are going to be discussed.
3.2 Principles of Image Compression:
A common characteristic of most images is that the neighboring pixels are correlated
and therefore hold redundant information. The foremost task then is to find a less correlated
representation of the image. Two elementary components of compression are redundancy and
irrelevancy reduction. Redundancy reduction aims at removing duplication from the signal
source image. Irrelevancy reduction omits parts of the signal that are not noticed by the signal
receiver, namely the Human Visual System (HVS). In general, three types of redundancy can be
identified: (a) Spatial Redundancy or correlation between neighboring pixel values, (b) Spectral
Redundancy or correlation between different color planes or spectral bands and (c) Temporal
Redundancy or correlation between adjacent frames in a sequence of images, especially in video
applications. Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible.
3.3. Framework of General Image Compression Method:
A typical lossy image compression system is shown in Fig. 2.4. It consists of three closely
connected components namely (a) Source Encoder, (b) Quantizer and (c) Entropy Encoder.
Compression is achieved by applying a linear transform in order to decorrelate the image data,
quantizing the resulting transform coefficients and entropy coding the quantized values.
Fig: 2.4 A typical lossy encoder. (Input Image → Source Encoder → Quantizer → Entropy Encoder → Compressed Image.)
Source Encoder: A variety of linear transforms have been developed, which include the Discrete
Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Wavelet
Transform (DWT) and many more, each with its own advantages and disadvantages.
Quantizer: A quantizer is used to reduce the number of bits needed to store the transformed
coefficients by reducing the precision of those values. As it is a many-to-one mapping, it
is a lossy process and is the main source of compression in an encoder. Quantization can
be performed on each individual coefficient, which is called Scalar Quantization (SQ).
Quantization can also be applied on a group of coefficients together known as Vector
Quantization (VQ). Both uniform and non-uniform quantizers can be used depending on
the problems.
Entropy Encoder: An entropy encoder further compresses the quantized values losslessly to
provide a better overall compression. It uses a model to accurately determine the
probabilities for each quantized value and produces an appropriate code based on these
probabilities so that the resultant output code stream is smaller than the input stream. The
most commonly used entropy encoders are the Huffman encoder and the arithmetic
encoder, although for applications requiring fast execution, simple Run Length Encoding
(RLE) is very effective.
3.4. Image Compression:
In the last decade, there has been a lot of technological transformation in the way we
communicate. This transformation includes the ever-present, ever-growing internet, the explosive
development in mobile communication and the ever-increasing importance of video communication.
Data compression is one of the enabling technologies for each aspect of this multimedia revolution.
Cellular phones would not be able to provide communication with increasing clarity without data
compression. Data compression is the art and science of representing information in compact form.
Despite rapid progress in mass-storage density, processor speeds, and digital
communication system performance, demand for data storage capacity and data-transmission
bandwidth continues to outstrip the capabilities of available technologies. In a distributed
environment large image files remain a major bottleneck within systems.
Image Compression is an important component of the solutions available for creating
image file sizes of manageable and transmittable dimensions. Platform portability and
performance are important in the selection of the compression/decompression technique to be
employed.
Four Stage model of Data Compression:
Almost all data compression systems can be viewed as comprising four successive
stages of data processing arranged as a processing pipeline (though some stages will often be
combined with a neighboring stage, performed "off-line," or otherwise made rudimentary).
The four stages are
(A) Preliminary pre-processing steps.
(B) Organization by context.
(C) Probability estimation.
(D) Length-reducing code.
It is this ubiquitous compression pipeline (A-B-C-D) that is of interest here.
With (A) we mean various pre-processing steps that may be appropriate before the final
compression engine. Lossy compression often follows the same pattern as lossless, but with one or
more quantization steps somewhere in (A). Sometimes clever designers may defer the loss until
suggested by statistics detected in (C); an example of this would be modern zero-tree image
coding.
(B) Organization by context often means data reordering, for which a simple but good
example is JPEG's "Zigzag" ordering. The purpose of this step is to improve the estimates found
by the next step.
(C) A probability estimate (or its heuristic equivalent) is formed for each token to be
encoded. Often the estimation formula will depend on context found by (B) with separate 'bins'
of state variables maintained for each conditioned class.
(D) Finally, based on its estimated probability, each compressed file token is represented as
bits in the compressed file. Ideally, a 12.5%-probable token should be encoded with three bits,
but details become complicated.
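The "three bits for a 12.5%-probable token" figure follows directly from the ideal code-length rule, which can be checked in one line (the function name is ours):

```python
import math

def ideal_code_length(p):
    """Ideal binary code length, in bits, for a token of probability p (Shannon)."""
    return -math.log2(p)

print(ideal_code_length(0.125))  # → 3.0
```

A 50%-probable token would similarly cost one bit; the complications arise when the ideal lengths are not whole numbers.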
Principle behind Image Compression:
Images have a considerably higher storage requirement than text, while audio and video data
place even more demanding requirements on data storage. An image stored in an uncompressed file
format, such as the popular BMP format, can be huge. An image with a pixel resolution of 640
by 480 pixels and 24-bit colour resolution will take up 640 * 480 * 24/8 = 921,600 bytes in an
uncompressed format.
The huge amount of storage space is not the only consideration; the data
transmission rates for communication of continuous media are also significantly large. An image
of 1024 pixel x 1024 pixel x 24 bit, without compression, would require 3 MB of storage and 7
minutes for transmission, utilizing a high speed, 64 Kbit/s, ISDN line.
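These figures can be verified with a few lines of arithmetic (the 7 minutes quoted above rounds up from roughly 6.5 minutes at 64 kbit/s):

```python
bmp_bytes = 640 * 480 * 24 // 8       # uncompressed 640x480, 24-bit image
img_bits = 1024 * 1024 * 24           # uncompressed 1024x1024, 24-bit image (3 MB)
minutes = img_bits / 64_000 / 60      # transmission time over a 64 kbit/s ISDN line
print(bmp_bytes)           # → 921600
print(round(minutes, 1))   # → 6.6
```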
Image data compression becomes still more important because of the fact that the transfer
of uncompressed graphical data requires far more bandwidth and data transfer rate. For example,
throughput in a multimedia system can be as high as 140 Mbits/s, which must be transferred
between systems. This kind of data transfer rate is not realizable with todays technology, or in
near the future with reasonably priced hardware.
3.5. Fundamentals of Image Compression Techniques:
A digital image, or "bitmap", consists of a grid of dots, or "pixels", with each pixel
defined by a numeric value that gives its colour. The term data compression refers to the process
of reducing the amount of data required to represent a given quantity of information. Now, a
particular piece of information may contain some portion which is not important and can be
comfortably removed. All such data is referred as Redundant Data. Data redundancy is a central
issue in digital image compression. Image compression research aims at reducing the number of
bits needed to represent an image by removing the spatial and spectral redundancies as much as
possible.
A common characteristic of most images is that the neighboring pixels are correlated and
therefore contain redundant information. The foremost task then is to find less correlated
representation of the image. In general, three types of redundancy can be identified:
1. Coding Redundancy
2. Inter Pixel Redundancy
3. Psychovisual Redundancy
Coding Redundancy:
If the gray levels of an image are coded in a way that uses more code symbols than
absolutely necessary to represent each gray level, the resulting image is said to contain coding
redundancy. It is almost always present when an image's gray levels are represented with a
straight or natural binary code. Let us assume that a random variable r_k lying in the interval
[0, 1] represents the gray levels of an image and that each r_k occurs with probability p_r(r_k):

p_r(r_k) = n_k / n, where k = 0, 1, 2, ..., L-1. (3.1)

L = number of gray levels, n_k = number of times the k-th gray level appears in the image, and
n = total number of pixels in the image.

If the number of bits used to represent each value of r_k is l(r_k), the average number of bits
required to represent each pixel is

L_avg = Σ_{k=0}^{L-1} l(r_k) p_r(r_k). (3.2)

That is, the average length of the code words assigned to the various gray levels is found by
summing the product of the number of bits used to represent each gray level and the probability
that the gray level occurs. Thus the total number of bits required to code an M x N image is
M N L_avg.
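Equation (3.2) is easy to evaluate numerically. The sketch below uses a hypothetical 4-level, 100-pixel image (the counts are made up) to compare the natural 2-bit code with a variable-length code:

```python
def avg_code_length(counts, lengths):
    """L_avg = sum_k l(r_k) * p_r(r_k), with p_r(r_k) = n_k / n (equation 3.2)."""
    n = sum(counts)
    return sum(l * nk / n for nk, l in zip(counts, lengths))

counts = [50, 25, 15, 10]                      # n_k for gray levels r_0..r_3
print(avg_code_length(counts, [2, 2, 2, 2]))   # → 2.0  (natural binary code)
print(avg_code_length(counts, [1, 2, 3, 3]))   # → 1.75 (prefix codes 0, 10, 110, 111)
```

Assigning the shortest code to the most probable gray level removes the coding redundancy: 1.75 instead of 2 bits per pixel.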
Inter Pixel Redundancy:
The information carried by any given pixel can be reasonably predicted from the values of its
neighboring pixels, so the information carried by an individual pixel is relatively small.
In order to reduce the inter pixel redundancies in an image, the 2-D pixel array normally used
for viewing and interpretation must be transformed into a more efficient but usually non-visual
format. For example, the differences between adjacent pixels can be used to represent an image.
These types of transformations are referred to as mappings. They are called reversible if the
original image elements can be reconstructed from the transformed data set.
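A sketch of such a reversible mapping on one scan line (the pixel values are invented for illustration): small differences replace large, correlated pixel values, and the original line is recovered exactly.

```python
def forward_map(row):
    """Reversible mapping: keep the first pixel, then store adjacent differences."""
    return [row[0]] + [b - a for a, b in zip(row, row[1:])]

def inverse_map(diffs):
    """Rebuild the original row by accumulating the differences."""
    out = [diffs[0]]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out

row = [100, 102, 103, 103, 101]
d = forward_map(row)
print(d)  # → [100, 2, 1, 0, -2]
assert inverse_map(d) == row  # the mapping is reversible
```

The differences cluster near zero, so a subsequent entropy coder can represent them with far fewer bits than the raw values.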
Psychovisual Redundancy:
Certain information simply has less relative importance than other information in normal visual
processing. This information is said to be psychovisually redundant; it can be eliminated without
significantly impairing the quality of image perception.
In general, an observer searches for distinguishing features such as edges or textual regions
and mentally combines them in recognizable groupings. The brain then correlates these
groupings with prior knowledge in order to complete the image interpretation process.
The elimination of psychovisually redundant data results in a loss of quantitative information;
this process is commonly referred to as quantization. As it is an irreversible process, i.e. visual
information is lost, it results in Lossy Data Compression. An image reconstructed following Lossy
compression contains degradation relative to the original. Often this is because the compression
scheme completely discards redundant information.
Image Compression Techniques:
There are basically two methods of Image Compression:
3.5.1. Lossless Coding Techniques
3.5.2. Lossy Coding Techniques
3.5.1. Lossless Coding Techniques:
In Lossless Compression schemes, the reconstructed image, after compression, is
numerically identical to the original image. However, Lossless Compression can achieve only a
modest amount of compression. Lossless coding guarantees that the decompressed image is
absolutely identical to the image before compression. Lossless techniques can also be used for
the compression of other data types where loss of information is not acceptable. Lossless
compression algorithms can be used to squeeze down images and then restore them again for
viewing completely unchanged.
Lossless Coding Techniques are as follows:
1. Run Length Encoding.
2. Huffman Encoding.
3. Entropy Encoding.
4. Area Encoding.
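Of these, Run Length Encoding is the simplest to sketch. This toy version stores (symbol, run-length) pairs; real formats pack them into bits, but the round trip below shows why the technique is lossless:

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(sym, len(list(run))) for sym, run in groupby(data)]

def rle_decode(pairs):
    """Expand the (symbol, count) pairs back into the original sequence."""
    return [sym for sym, n in pairs for _ in range(n)]

data = "AAAABBBCCD"
enc = rle_encode(data)
print(enc)  # → [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
assert "".join(rle_decode(enc)) == data  # exact reconstruction: lossless
```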
3.5.2. Lossy Coding Techniques:
Lossy techniques cause image quality degradation in each Compression / De-
compression step. Careful consideration of the Human Visual perception ensures that the
degradation is often unrecognizable, though this depends on the selected compression ratio. An
image reconstructed following Lossy compression contains degradation relative to the original.
This is the price paid for the much higher compression these schemes are capable of
achieving. Under normal viewing conditions, no visible loss is perceived (visually Lossless).
Lossy Image Coding Techniques normally have three Components:
Image Modeling:
It is aimed at the exploitation of statistical characteristics of the image (i.e. high
correlation, redundancy). It defines such things as the transformation to be applied to the Image.
Parameter Quantization:
The aim of Quantization is to reduce the amount of data used to represent the information
within the new domain.
Encoding:
Here a code is generated by associating appropriate code words to the raw output produced by
the Quantizer. Encoding is usually error free. It optimizes the representation of the information
and may introduce some error detection codes.
3.6. Measurement of Image Quality:
The design of an imaging system should begin with an analysis of the physical characteristics of
the originals and the means through which the images may be generated. For example, one might
examine a representative sample of the originals and determine the level of detail that must be
preserved, the depth of field that must be captured, whether they can be placed on a glass platen
or require a custom book-edge scanner, whether they can tolerate exposure to high light
intensity, and whether specular reflections must be captured or minimized. A detailed
examination of some of the originals, perhaps with a magnifier or microscope, may be necessary
to determine the level of detail within the original that might be meaningful for a researcher or
scholar. For example, in drawings or paintings it may be important to preserve stippling or other
characteristic techniques.
3.7. Wavelets for image compression:
Wavelet transform exploits both the spatial and frequency correlation of data by dilations
(or contractions) and translations of the mother wavelet on the input data. It supports multi-
resolution analysis of data, i.e. it can be applied at different scales according to the detail
required, which allows progressive transmission and zooming of the image without the need for
extra storage. Another encouraging feature of the wavelet transform is its symmetric nature: both
the forward and the inverse transform have the same complexity, enabling fast compression
and decompression routines. Its characteristics well suited for image compression include the
ability to take into account the Human Visual System's (HVS) characteristics, very good energy
compaction capabilities, robustness under transmission, high compression ratio etc.
Wavelet transform divides the information of an image into approximation and detail
sub-signals. The approximation sub-signal shows the general trend of pixel values and other
three detail sub-signals show the vertical, horizontal and diagonal details or changes in the
images. If these details are very small (below a threshold) then they can be set to zero without
significantly changing the image. The greater the number of zeros, the greater the compression
ratio. If the energy retained (the amount of information retained by an image after compression
and decompression) is 100% then the compression is lossless, as the image can be reconstructed
exactly. This occurs when the threshold value is set to zero, meaning that the details have not
been changed.

3.8 Image Compression Methodology:
Overview:
The storage requirements for the video of a typical Angiogram procedure are of the order of
several hundred Mbytes.
* Transmission of this data over a low bandwidth network results in very high latency.
* Lossless compression methods can achieve compression ratios of ~2:1.
* We consider lossy techniques operating at much higher compression ratios (~10:1).
* Key issues:
- High quality reconstruction required.
- Angiogram data contains considerable high-frequency spatial texture.
* Proposed method applies a texture-modeling scheme to the high-frequency texture of some
regions of the image.
* This allows more bandwidth allocation to important areas of the image.
3.9 Different types of transforms:
1. FT (Fourier Transform).
2. DCT (Discrete Cosine Transform).
3. DWT (Discrete Wavelet Transform).
3.9.1 Discrete Fourier Transform:
The DTFT representation for a finite duration sequence x(n) of length N is

X(e^{jw}) = Σ_{n=0}^{N-1} x(n) e^{-jwn}, (3.3)

where X(e^{jw}) is periodic with period 2π. It is convenient to sample X(e^{jw}) at frequencies
equal to integer multiples of 2π/N, that is, taking N uniformly spaced samples between 0 and 2π:

w_k = 2πk/N, k = 0, 1, ..., N-1. (3.4)

Let W_N = e^{-j2π/N}. (3.5)

Therefore X(k) = Σ_{n=0}^{N-1} x(n) W_N^{kn}, k = 0, 1, ..., N-1. (3.6)

Since X(e^{jw}) is sampled for one period and there are N samples, x(n) can be expressed as

x(n) = (1/N) Σ_{k=0}^{N-1} X(k) W_N^{-kn}, n = 0, 1, ..., N-1. (3.7)

3.9.2 The Discrete Cosine Transform (DCT):
The discrete cosine transform (DCT) helps separate the image into parts (or spectral
sub-bands) of differing importance (with respect to the image's visual quality). The DCT is
similar to the discrete Fourier transform: it transforms a signal or image from the spatial domain
to the frequency domain.
3.9.3 Discrete Wavelet Transform (DWT):
The discrete wavelet transform (DWT) refers to wavelet transforms for which the
wavelets are discretely sampled. It is a transform which localizes a function in both space and
scale, and has some desirable properties compared to the Fourier transform. The transform is
based on a wavelet matrix, which can be computed more quickly than the analogous Fourier
matrix. Most notably, the discrete wavelet transform is used for signal coding, where the
properties of the transform are exploited to represent a discrete signal in a more redundant form,
often as a preconditioning for data compression. The discrete wavelet transform has a huge
number of applications in Science, Engineering, Mathematics and Computer Science.
Wavelet compression is a form of data compression well suited for image compression
(sometimes also video compression and audio compression). The goal is to store image data in as
little space as possible in a file. A certain loss of quality is accepted (lossy compression).
Using a wavelet transform, the wavelet compression methods are better at representing
transients, such as percussion sounds in audio, or high-frequency components in two-
dimensional images, for example an image of stars on a night sky. This means that the transient
elements of a data signal can be represented by a smaller amount of information than would be
the case if some other transform, such as the more widespread discrete cosine transform, had been used.
First a wavelet transform is applied. This produces as many coefficients as there are pixels in the
image (i.e. there is no compression yet, since it is only a transform). These coefficients can then
be compressed more easily because the information is statistically concentrated in just a few
coefficients.
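As a sketch of this, a one-level Haar transform (the simplest wavelet filter pair, used here as a stand-in for whatever filters a real codec would use; the signal values are invented) splits a signal into approximation and detail coefficients, and for correlated data the details come out small enough to threshold away:

```python
import math

def haar_step(signal):
    """One level of the Haar wavelet transform: scaled pairwise sums give the
    approximation sub-signal, scaled pairwise differences give the detail."""
    s = 1 / math.sqrt(2)
    approx = [(a + b) * s for a, b in zip(signal[::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[::2], signal[1::2])]
    return approx, detail

signal = [10, 12, 11, 9, 100, 98, 12, 10]
approx, detail = haar_step(signal)
# Small detail coefficients can be zeroed (thresholded) with little visible change.
thresholded = [d if abs(d) > 2 else 0.0 for d in detail]
zeros = thresholded.count(0.0)
print(zeros, len(detail))  # → 4 4
```

Even though the signal contains a large jump, all the detail coefficients fall below the threshold here; the approximation sub-signal alone carries the general trend.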
3.10 Quantization:
Quantization is a lossy step involved in image processing. Quantization techniques generally
compress by mapping a range of values to a single quantum value. By reducing the number of
discrete symbols in a given stream, the stream becomes more compressible. One example is
reducing the number of colors required to represent an image. Other widely used examples are
DCT data quantization in JPEG and DWT data quantization in JPEG 2000.
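A minimal scalar-quantization sketch (the step size and coefficient values are made up; JPEG actually uses a per-coefficient 8x8 quantization table rather than a single step):

```python
def quantize(coeffs, step):
    """Uniform scalar quantization: a many-to-one mapping, hence lossy."""
    return [round(c / step) for c in coeffs]

def dequantize(q, step):
    """Reconstruct approximate values by multiplying back by the step size."""
    return [v * step for v in q]

coeffs = [51.2, -3.7, 1.4, 0.3, -0.2]
q = quantize(coeffs, 4)
print(q)                 # → [13, -1, 0, 0, 0]  (small values collapse to zero)
print(dequantize(q, 4))  # → [52, -4, 0, 0, 0]  (close to, but not equal to, the originals)
```

The long runs of zeros this produces are exactly what the zig-zag ordering and entropy coder exploit.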
3.11 Entropy Encoding:
An entropy encoding is a coding scheme that assigns codes to symbols so as to match
code lengths with the probabilities of the symbols. Typically, entropy encoders are used to
compress data by replacing symbols represented by equal-length codes with symbols represented
by codes proportional to the negative logarithm of the probability. Therefore, the most common
symbols use the shortest codes.
According to Shannon's source coding theorem, the optimal code length for a symbol is
-log_b P, where b is the number of symbols used to make output codes and P is the probability of
the input symbol. Three of the most common entropy encoding techniques are Huffman coding,
range encoding, and arithmetic coding. If the approximate entropy characteristics of a data
stream are known in advance (especially for signal compression), a simpler static code such as
unary coding, Elias gamma coding, Fibonacci coding, Golomb coding, or Rice coding may be
useful.
There are three main techniques for achieving entropy coding:
Huffman Coding - one of the simplest variable length coding schemes.
Run-length Coding (RLC) - very useful for binary data containing long runs of ones or
zeros.
Arithmetic Coding - a relatively new variable length coding scheme that can combine
the best features of Huffman and run-length coding, and can also adapt to data with non-stationary
statistics. We shall concentrate on the Huffman and RLC methods for simplicity.
3.12 Conclusion:
Here, some topics of image compression which include the principles of image
compression, the classification of compression methods and the framework of a general image
coder and wavelets for image compression, different types of transforms and quantization are
discussed. The introduction to wavelet transforms is given in the next chapter.
Chapter 4
Introduction to wavelet transform

4.1. Introduction:
The fundamental idea behind wavelets is to analyze according to scale. Indeed, some
researchers in the wavelet field feel that, by using wavelets, one is adopting a whole new
mindset or perspective in processing data.
Wavelets are functions that satisfy certain mathematical requirements and are used in
representing data or other functions. This idea is not new. Approximation using superposition
of functions has existed since the early 1800's, when Joseph Fourier discovered that he could
superpose sines and cosines to represent other functions. However, in wavelet analysis, the
scale that we use to look at data plays a special role. Wavelet algorithms process data at
different scales or resolutions. If we look at a signal with a large "window," we would notice
gross features. Similarly, if we look at a signal with a small "window," we would notice small
features. The result in wavelet analysis is to see both the forest and the trees, so to speak.
This makes wavelets interesting and useful. For many decades, scientists have wanted
more appropriate functions than the sines and cosines which comprise the bases of Fourier
analysis, to approximate choppy signals. By their definition, these functions are non-local
(and stretch out to infinity). They therefore do a very poor job in approximating sharp spikes.
But with wavelet analysis, we can use approximating functions that are contained neatly in
finite domains. Wavelets are well-suited for approximating data with sharp discontinuities.
The wavelet analysis procedure is to adopt a wavelet prototype function, called an
analyzing wavelet or mother wavelet. Temporal analysis is performed with a contracted,
high-frequency version of the prototype wavelet, while frequency analysis is performed with
a dilated, low-frequency version of the same wavelet. Because the original signal or function
can be represented in terms of a wavelet expansion (using coefficients in a linear combination
of the wavelet functions), data operations can be performed using just the corresponding
wavelet coefficients. And if you further choose the best wavelets adapted to your data, or
truncate the coefficients below a threshold, your data is sparsely represented. This sparse
coding makes wavelets an excellent tool in the field of data compression.
Other applied fields that are making use of wavelets include astronomy, acoustics,
nuclear engineering, sub-band coding, signal and image processing, neurophysiology, music,
magnetic resonance imaging, speech discrimination, optics, fractals, turbulence, earthquake-
prediction, radar, human vision, and pure mathematics applications such as solving partial
differential equations.
4.2. Basis Functions:
It is simpler to explain a basis function if we move out of the realm of analog
(functions) and into the realm of digital (vectors). Every two-dimensional vector (x,y) is a
combination of the vectors (1,0) and (0,1). These two vectors are the basis vectors for (x,y).
Why? Notice that x multiplied by (1,0) is the vector (x,0), and y multiplied by (0,1) is the
vector(0,y). The sum is (x,y).
The best basis vectors have the valuable extra property that the vectors are
perpendicular, or orthogonal, to each other. For the basis (1,0) and (0,1), this criterion is
satisfied. Now let's go back to the analog world, and see how to relate these concepts to basis
functions. Instead of the vector (x,y), we have a function f(x). Imagine that f(x) is a musical
tone, say the note A in a particular octave. We can construct A by adding sines and cosines
using combinations of amplitudes and frequencies. The sines and cosines are the basis
functions in this example, and the elements of Fourier synthesis. For the sines and cosines
chosen, we can set the additional requirement that they be orthogonal. How? By choosing the
appropriate combination of sine and cosine function terms whose inner products add up to
zero. The particular set of functions that are orthogonal and that construct f(x) are our
orthogonal basis functions for this problem.
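This orthogonality is easy to check numerically on sampled sines and cosines over one period (the sample count and normalization below are our own choices):

```python
import math

N = 1024  # samples over one period [0, 2*pi)

def inner(f, g):
    """Discrete inner product of two functions sampled over one period."""
    return sum(f(2 * math.pi * t / N) * g(2 * math.pi * t / N) for t in range(N)) / N

# sin and cos are orthogonal; sin against itself is not.
print(abs(inner(math.sin, math.cos)) < 1e-9)        # → True  (inner product ~ 0)
print(abs(inner(math.sin, math.sin) - 0.5) < 1e-9)  # → True  (nonzero: same function)
```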
Scale-Varying Basis Functions:
A basis function varies in scale by chopping up the same function or data space using
different scale sizes. For example, imagine we have a signal over the domain from 0 to 1. We
can divide the signal with two step functions that range from 0 to 1/2 and 1/2 to 1. Then we
can divide the original signal again using four step functions from 0 to 1/4, 1/4 to 1/2, 1/2 to
3/4, and 3/4 to 1. And so on. Each set of representations codes the original signal with a
particular resolution or scale.
4.3. Fourier analysis:
Fourier Transform:
The Fourier transform's utility lies in its ability to analyze a signal in the time domain
for its frequency content. The transform works by first translating a function in the time
domain into a function in the frequency domain. The signal can then be analyzed for its
frequency content because the Fourier coefficients of the transformed function represent the
contribution of each sine and cosine function at each frequency. An inverse Fourier transform
does just what you'd expect: it transforms data from the frequency domain into the time domain.
Discrete Fourier Transform:
The discrete Fourier transform (DFT) estimates the Fourier transform of a function
from a finite number of its sampled points. The sampled points are supposed to be typical of
what the signal looks like at all other times.
The DFT has symmetry properties almost exactly the same as the continuous Fourier
transform. In addition, the formula for the inverse discrete Fourier transform is easily
calculated using the one for the discrete Fourier transform because the two formulas are
almost identical.
Windowed Fourier Transform:
If f(t) is a non-periodic signal, the summation of the periodic functions, sine and
cosine, does not accurately represent the signal. You could artificially extend the signal to
make it periodic but it would require additional continuity at the endpoints. The windowed
Fourier transform (WFT) is one solution to the problem of better representing the non-
periodic signal. The WFT can be used to give information about signals simultaneously in the
time domain and in the frequency domain.
With the WFT, the input signal f(t) is chopped up into sections, and each section is
analyzed for its frequency content separately. If the signal has sharp transitions, the input
data is windowed so that the sections converge to zero at the endpoints. This windowing is
accomplished via a weight function that places less emphasis near the interval's endpoints
than in the middle. The effect of the window is to localize the signal in time.
Fast Fourier Transform:
To approximate a function by samples, and to approximate the Fourier integral by the
discrete Fourier transform, requires applying a matrix whose order is the number of sample
points n. Since multiplying an n x n matrix by a vector costs on the order of n^2 arithmetic
operations, the problem gets quickly worse as the number of sample points increases.
However, if the samples are uniformly spaced, then the Fourier matrix can be factored into a
product of just a few sparse matrices, and the resulting factors can be applied to a vector in a
total of order n log n arithmetic operations. This is the so-called fast Fourier transform or FFT.
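The factorization idea can be sketched with a textbook radix-2 Cooley-Tukey FFT, checked against the O(n^2) matrix-style DFT (the test signal below is made up):

```python
import cmath

def naive_dft(x):
    """O(n^2) DFT: apply the full Fourier matrix row by row."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """O(n log n) radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[::2]), fft(x[1::2])          # split into sparse half-problems
    tw = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]
    return ([e + w * o for e, w, o in zip(even, tw, odd)] +
            [e - w * o for e, w, o in zip(even, tw, odd)])

x = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
a, b = naive_dft(x), fft(x)
print(all(abs(u - v) < 1e-9 for u, v in zip(a, b)))  # → True
```

Both routines produce the same coefficients; only the operation count differs, which is why the FFT made transform-based processing practical.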
4.4. Similarities between Fourier and Wavelet Transform:
The fast Fourier transform (FFT) and the discrete wavelet transform (DWT) are both
linear operations that generate a data structure that contains log2 n segments of various lengths,
usually filling and transforming it into a different data vector of length 2^n.
The mathematical properties of the matrices involved in the transforms are similar as
well. The inverse transform matrix for both the FFT and the DWT is the transpose of the
original. As a result, both transforms can be viewed as a rotation in function space to a
different domain. For the FFT, this new domain contains basis functions that are sines and
cosines. For the wavelet transform, this new domain contains more complicated basis
functions called wavelets, mother wavelets, or analyzing wavelets.
Both transforms have another similarity. The basis functions are localized in
frequency, making mathematical tools such as power spectra (how much power is contained
in a frequency interval) and scalegrams (to be defined later) useful at picking out frequencies
and calculating power distributions.
4.5. Dissimilarities between Fourier and Wavelet Transform:
The most interesting dissimilarity between these two kinds of transforms is that
individual wavelet functions are localized in space. Fourier sine and cosine functions are not.
This localization feature, along with wavelets' localization of frequency, makes many
functions and operators using wavelets "sparse" when transformed into the wavelet domain.
This sparseness, in turn, results in a number of useful applications such as data compression,
detecting features in images, and removing noise from time series.
4.6. Wavelets:
Compactly supported wavelets are functions defined over a finite interval and having
an average value of zero. The basic idea of the wavelet transform is to represent any arbitrary
function f(x) as a superposition of a set of such wavelets or basis functions. These basis
functions are obtained from a single prototype wavelet called the mother wavelet ψ(x), by
dilations (or scaling) and translations. Wavelet bases are very good at efficiently representing
functions that are smooth except for a small set of discontinuities.
For each n, k ∈ Z, define ψ_{n,k}(x) by

ψ_{n,k}(x) = 2^{n/2} ψ(2^n x − k). (4.1)

The goal is to construct a function ψ(x) on R such that {ψ_{n,k}(x)}_{n,k ∈ Z} is an
orthonormal basis on R. As mentioned before, ψ(x) is a wavelet and the collection
{ψ_{n,k}(x)}_{n,k ∈ Z} is a wavelet orthonormal basis on R; this framework for constructing
wavelets involves the concept of a multiresolution analysis, or MRA.

A multiresolution analysis is a device for computation of basis coefficients in L²(R):
f = Σ_{n,k} (f, ψ_{n,k}) ψ_{n,k}. It is defined as follows. Let

V_n = {f(x) | f(x) = g(2^n x), g(x) ∈ V_0}, (4.2)

where

f(x) = Σ_k (f, φ(· − k)) φ(x − k). (4.3)

Then a multiresolution analysis on R is a sequence of subspaces {V_n}_{n ∈ Z} of functions
on R, satisfying the following properties:
(a) For all n ∈ Z, V_n ⊆ V_{n+1}.
(b) If f(x) is in L²(R), then f(x) ∈ span{V_n}_{n ∈ Z}. That is, given ε > 0, there is an n ∈ Z
and a function g(x) ∈ V_n such that ||f − g|| < ε.
ψ(x) = Σ_k g(k) √2 φ(2x − k). (4.5)

Then {ψ_{n,k}(x)} is a wavelet orthonormal basis on R. The orthogonal projection of an
arbitrary function f onto V_n is given by

P_n f = Σ_k (f, φ_{n,k}) φ_{n,k}. (4.6)

As k varies, the basis functions φ_{n,k} are shifted in steps of 2^{−n}, so P_n f cannot
represent any detail on a scale smaller than that. We say that the functions in V_n have the
resolution or scale 2^{−n}. Here, P_n f is called an approximation to f at resolution 2^{−n}.
For a given function f, an MRA provides a sequence of approximations P_n f of increasing
accuracy. The difference between the approximations at resolution 2^{−(n+1)} and 2^{−n} is
called the fine detail at resolution 2^{−n}, which is as follows:

Q_n f(x) = P_{n+1} f(x) − P_n f(x), (4.7)

or

Q_n f = Σ_k (f, ψ_{n,k}) ψ_{n,k}. (4.8)

Q_n is also an orthogonal projection, and its range W_n is orthogonal to V_n, where the
following holds:

V_n = {f | P_n f = f}, (4.9)
W_n = {f | Q_n f = f}, (4.10)
V_{n+1} = V_n ⊕ W_n. (4.11)

There are choices of the numbers h(k) and g(k) such that {ψ_{n,k}(x)} is a wavelet orthonormal
basis on R. We must show orthonormality and completeness. As for completeness, we have

∩_{n ∈ Z} V_n = {0} (4.12)

and

closure(∪_{n ∈ Z} V_n) = L²(R). (4.13)

Then we have span{ψ_{n,k} | k ∈ Z} = W_n and ⊕_{n ∈ Z} W_n = closure(∪_{n ∈ Z} V_n).
Hence {ψ_{n,k}(x)} is complete if and only if ⊕_{n ∈ Z} W_n = L²(R) holds, and this is true.

Now, as for the orthonormality within a single scale,

(ψ_{n,k}, ψ_{n,l}) = (ψ(· − k), ψ(· − l)) = δ(k − l). (4.14)

To prove orthonormality between scales, let n, n′ ∈ Z with n < n′, and let k, k′ ∈ Z be
arbitrary. Since ψ_{n,k} ∈ W_n and W_n ⊂ V_{n+1} ⊆ V_{n′}, we have ψ_{n,k} ∈ V_{n′}.
Since (ψ_{n′,k′}, φ_{n′,l}) = 0 for all k′, l ∈ Z, it follows that (ψ_{n′,k′}, f) = 0 for every
f ∈ V_{n′}. Indeed, given f(x) ∈ V_{n′} we know that f(x) = Σ_l (f, φ_{n′,l}) φ_{n′,l}(x).
Hence for f(x) ∈ V_{n′},

(ψ_{n′,k′}, f) = (ψ_{n′,k′}, Σ_l (f, φ_{n′,l}) φ_{n′,l}) = Σ_l (f, φ_{n′,l}) (ψ_{n′,k′}, φ_{n′,l}) = 0. (4.15)

Since n < n′ and ψ_{n,k} ∈ V_{n′}, it follows that (ψ_{n′,k′}, ψ_{n,k}) = 0.
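As a concrete sanity check (not part of the original derivation), the orthonormality relations above can be verified numerically for the Haar wavelet, ψ = +1 on [0, 1/2) and −1 on [1/2, 1). The grid-based inner product below is exact here because Haar wavelets are piecewise constant on dyadic intervals; function names are my own.

```python
def haar_psi(x):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def psi_nk(n, k, x):
    """Dilated/translated wavelet psi_{n,k}(x) = 2^{n/2} psi(2^n x - k)."""
    return 2.0 ** (n / 2) * haar_psi(2.0 ** n * x - k)

def inner(f, g, a=-4.0, b=4.0, steps=2 ** 12):
    """Riemann-sum inner product on [a, b]; exact for dyadic step functions."""
    h = (b - a) / steps
    return sum(f(a + i * h) * g(a + i * h) for i in range(steps)) * h

# (psi_{n,k}, psi_{n',k'}) should equal 1 iff (n, k) == (n', k'), else 0
norm = inner(lambda x: psi_nk(0, 0, x), lambda x: psi_nk(0, 0, x))
cross_shift = inner(lambda x: psi_nk(0, 0, x), lambda x: psi_nk(0, 1, x))
cross_scale = inner(lambda x: psi_nk(1, 0, x), lambda x: psi_nk(0, 0, x))
```

The three inner products come out as 1, 0 and 0, matching (4.14) and the between-scales argument of (4.15).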
Therefore {ψ_{n,k}(x)} is a wavelet orthonormal basis on R.
i) Symmetry:
Symmetric filters are preferred because they are most valuable for minimizing the edge
effects in the wavelet representation of the discrete wavelet transform (DWT) of a function; large
coefficients resulting from false edges due to periodization can be avoided. Since orthogonal
filters, with the exception of the Haar filter, cannot be symmetric, biorthogonal filters are almost always
selected for image compression applications.
ii) Vanishing Moments:
Vanishing moments are defined as follows. From the definition of a multiresolution
analysis (MRA), any wavelet ψ(x) that comes from an MRA must satisfy

∫ ψ(x) dx = 0. (4.16)

This integral is referred to as the zeroth moment of ψ(x), so that if the above equation
holds, we say that ψ(x) has its zeroth moment vanishing. The integral ∫ x^m ψ(x) dx is referred
to as the m-th moment of ψ(x), and if ∫ x^m ψ(x) dx = 0, we say that ψ(x) has its m-th moment
vanishing.
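For a discretely sampled wavelet, vanishing moments translate into moment conditions on the high-pass filter: Σ_k k^m g(k) = 0 for each vanishing moment m. As an illustration (my own example, not from the text), the well-known Daubechies D4 filter has exactly two vanishing moments, which can be checked directly:

```python
from math import sqrt

# Daubechies D4 low-pass filter coefficients
s3 = sqrt(3.0)
h = [(1 + s3) / (4 * sqrt(2)), (3 + s3) / (4 * sqrt(2)),
     (3 - s3) / (4 * sqrt(2)), (1 - s3) / (4 * sqrt(2))]

# High-pass filter via the usual alternating-flip construction
g = [(-1) ** k * h[len(h) - 1 - k] for k in range(len(h))]

def moment(filt, m):
    """Discrete m-th moment of a filter: sum_k k^m * filt[k]."""
    return sum((k ** m) * c for k, c in enumerate(filt))
```

Here moment(g, 0) and moment(g, 1) vanish, while moment(g, 2) does not, confirming that D4 has two vanishing moments.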
It is possible to have a different number of vanishing moments on the analysis filters
than on the reconstruction filters. Vanishing moments on the analysis filters are desired
because they result in small coefficients in the transform, whereas vanishing moments on the
reconstruction filters are desired because they result in fewer blocking artifacts in the
compressed image. Thus having a sufficient number of vanishing moments, which may differ
between the two filters, is advantageous.
iii) Size of the filters:
Long analysis filters result in greater computation time for the wavelet or wavelet
packet transform. Long reconstruction filters can create unpleasant artifacts in the
compressed image, for the following reason. The reconstructed image is made up of the
superposition of only a few scaled and shifted reconstruction filters, so features of the
reconstruction filters, such as oscillations or lack of smoothness, can be clearly visible in the
reconstructed image. Smoothness can be guaranteed by requiring a large number of vanishing
moments in the reconstruction filter.
4.7. List of Wavelet-Related Transforms:
4.7.1. Continuous Wavelet Transform:
A continuous wavelet transform is used to divide a continuous-time function into
wavelets. Unlike the Fourier transform, the continuous wavelet transform possesses the ability to
construct a time-frequency representation of a signal that offers very good time and frequency
localization.
4.7.2. Multiresolution Analysis:
A multiresolution analysis (MRA), or multiscale approximation (MSA), is the design
method behind most of the practically relevant discrete wavelet transforms (DWT) and the
justification for the algorithm of the fast wavelet transform (FWT).
4.7.3. Discrete Wavelet Transform:
In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is
any wavelet transform for which the wavelets are discretely sampled. As with other wavelet
transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information.
4.7.4. Fast Wavelet Transform:
The Fast Wavelet Transform is a mathematical algorithm designed to turn a waveform
or signal in the time domain into a sequence of coefficients based on an orthogonal basis of
small finite waves, or wavelets. The transform can be easily extended to multidimensional
signals, such as images, where the time domain is replaced with the space domain.
4.8. Applications of Wavelet Transforms:
Wavelets have broad applications in fields such as signal processing and medical
imaging. Due to time and space constraints, only two are discussed in this paper: wavelet
image compression and the progressive transmission of image files over the internet.
4.8.1. Wavelet Compression:
The point of doing the Haar wavelet transform is that areas of the original matrix that
contain little variation will end up as small or zero elements in the Haar transform matrix. A
matrix is considered sparse if it has a high proportion of zero entries. Matrices that are sparse
take much less memory to store. Because we cannot expect the transformed matrices always
to be sparse, we must consider wavelet compression. To perform wavelet compression we first
decide on a non-negative threshold value known as ε. We next let any value in the Haar
wavelet transformed matrix whose magnitude is less than ε be reset to zero. Our hope is that
this will leave us with a relatively sparse matrix. If ε is equal to zero we will not modify any
of the elements and therefore we will not lose any information. This is known as lossless
compression. Lossy compression occurs when ε is greater than zero. Because some of the
elements are reset to zero, some of our original data is lost. In the case of lossless
compression we are able to reverse our operations and get our original image back. With
lossy compression we are only able to build an approximation of our original image.
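The thresholding step above can be sketched in one dimension (a minimal sketch with my own function names; a real codec would work on 2D matrices and add entropy coding):

```python
from math import sqrt

def haar_forward(v):
    """Full 1D orthonormal Haar transform; len(v) must be a power of two."""
    v = list(v)
    n = len(v)
    while n > 1:
        half = n // 2
        avg = [(v[2 * i] + v[2 * i + 1]) / sqrt(2) for i in range(half)]
        det = [(v[2 * i] - v[2 * i + 1]) / sqrt(2) for i in range(half)]
        v[:n] = avg + det  # averages first, details after
        n = half
    return v

def haar_inverse(v):
    """Invert haar_forward."""
    v = list(v)
    n = 1
    while n < len(v):
        avg, det = v[:n], v[n:2 * n]
        merged = []
        for a, d in zip(avg, det):
            merged += [(a + d) / sqrt(2), (a - d) / sqrt(2)]
        v[:2 * n] = merged
        n *= 2
    return v

def threshold(v, eps):
    """Compression step: zero out coefficients with magnitude below eps."""
    return [0.0 if abs(c) < eps else c for c in v]
```

For a signal with little variation, such as [5, 5, 5, 5, 6, 6, 6, 6], most detail coefficients are already zero, and with ε = 0 the inverse transform recovers the original exactly (lossless).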
4.8.2. Progressive Transmission:
Many people frequently download images from the internet, and wavelet transforms
speed up this process considerably. When a person clicks on an image to download it, the
source computer recalls the wavelet-transformed matrix from memory. It first sends the
overall approximation coefficient and larger detail coefficients, and then the progressively
smaller detail coefficients. As the receiving computer gets this information it begins to
reconstruct the image in progressively greater detail until the original image is fully
reconstructed. This process can be interrupted at any time if the user decides that he or she
does not want the image. Otherwise a user would only be able to see an image after the
entire image file had been downloaded. Because a compressed image file is significantly
smaller, it takes far less time to download.
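The largest-coefficients-first ordering can be sketched as follows (my own function names; a sketch of the idea, not a network protocol). Because the wavelet basis is orthonormal, the l2 error on the coefficients equals the l2 error on the reconstructed image, so each additional coefficient received can only improve the approximation:

```python
def progressive_stream(coeffs):
    """Yield (index, value) pairs, largest magnitude first."""
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    for i in order:
        yield i, coeffs[i]

def receive(n, stream, budget):
    """Reconstruct a length-n coefficient vector from the first `budget` pairs."""
    rec = [0.0] * n
    for _, (i, v) in zip(range(budget), stream):
        rec[i] = v
    return rec

def l2_err(a, b):
    """Coefficient-domain error = image-domain error for orthonormal bases."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

Interrupting the download at any budget leaves a usable approximation; letting it run to completion reproduces the coefficients exactly.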
4.9. Conclusion:
The basis function, Fourier analysis, similarities, dissimilarities between Fourier and
wavelet transform, introduction to wavelets, types of wavelet transforms and applications of
wavelet transform were discussed in this chapter. Image compression using the modified fast
Haar wavelet transform will be discussed next.
Chapter 5
1. Image compression using SPIHT algorithm

A. Description of the Algorithm
After wavelet decomposition of the image data, the distribution of the coefficients takes
the form of a tree. Based on this feature, a data structure is defined: the spatial orientation tree.
The spatial orientation tree structure for a 4-level wavelet decomposition is shown in Figure 5.1.
We can see that each coefficient has four children except the red-marked coefficients in the LL
subband and the coefficients in the highest subbands (HL1, LH1, HH1).
The following sets of coordinates of coefficients are used to represent the set partitioning
method in the SPIHT algorithm. The location of a coefficient is denoted by (i, j), where i and j
indicate the row and column indices, respectively.
H: set of roots of all the spatial orientation trees.
O(i, j): set of offspring of the coefficient (i, j), O(i, j) = {(2i, 2j), (2i, 2j + 1), (2i + 1, 2j),
(2i + 1, 2j + 1)}, except when (i, j) is in the LL subband. When (i, j) is in the LL subband,
O(i, j) is defined as O(i, j) = {(i, j + w), (i + h, j), (i + h, j + w)}, where w and h are the width
and height of the LL subband, respectively.
D(i, j): set of all descendants of the coefficient (i, j).
L(i, j): D(i, j) − O(i, j).
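The offspring rule above can be written directly as code (a sketch following the stated definitions, with the LL subband assumed to occupy the top-left corner; the function name is my own):

```python
def offspring(i, j, ll_h, ll_w):
    """Offspring set O(i, j) of the SPIHT spatial orientation tree.

    ll_h, ll_w: height and width of the LL subband (top-left corner).
    Coefficients in the LL subband get the special three-child rule;
    all other coefficients have four children at doubled coordinates.
    """
    if i < ll_h and j < ll_w:
        # LL rule: O(i, j) = {(i, j + w), (i + h, j), (i + h, j + w)}
        return [(i, j + ll_w), (i + ll_h, j), (i + ll_h, j + ll_w)]
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]
```

D(i, j) is then obtained by applying offspring() recursively, and L(i, j) by removing the direct offspring from D(i, j).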
Figure 5.1: Parent-child relationship in SPIHT
A significance function S_n(τ), which decides the significance of a set of coordinates τ
with respect to the threshold 2^n, is defined by:

S_n(τ) = 1, if max_{(i,j) ∈ τ} |c_{i,j}| ≥ 2^n; 0, otherwise,

where c_{i,j} is the wavelet coefficient. In this algorithm, three ordered lists are used to
store the significance information during set partitioning. List of insignificant sets (LIS), list
of insignificant pixels (LIP), and list of significant pixels (LSP) are those three lists. Note that
the term pixel actually indicates a wavelet coefficient if the set partitioning algorithm is
applied to a wavelet transformed image.
Algorithm: SPIHT
1) Initialization:
1. Output n = ⌊log2 max_{(i,j)} |c_{i,j}|⌋.
2. Set LSP = ∅.
3. Set LIP = {(i, j) ∈ H}.
4. Set LIS = {(i, j) ∈ H with D(i, j) ≠ ∅}, and set each entry in LIS as type A.
2) Sorting Pass:
1. For each (i, j) ∈ LIP do:
(a) Output S_n(i, j).
(b) If S_n(i, j) = 1, then move (i, j) to LSP and output sign(c_{i,j}).
2. For each (i, j) ∈ LIS do:
(a) If (i, j) is of type A, then
i. Output S_n(D(i, j)).
ii. If S_n(D(i, j)) = 1, then
A. For each (k, l) ∈ O(i, j): output S_n(k, l). If S_n(k, l) = 1, then append (k, l) to LSP,
output sign(c_{k,l}), and set c_{k,l} = c_{k,l} − 2^n · sign(c_{k,l}); else append (k, l) to LIP.
B. Move (i, j) to the end of LIS as type B.
(b) If (i, j) is of type B, then
i. Output S_n(L(i, j)).
ii. If S_n(L(i, j)) = 1, then
A. Append each (k, l) ∈ O(i, j) to the end of LIS as type A.
B. Remove (i, j) from LIS.
3) Refinement Pass:
1. For each (i, j) in LSP, except those included in the last sorting pass:
(a) Output the n-th MSB of |c_{i,j}|.
4) Quantization Pass:
1. Decrement n by 1.
2. Go to step 2).
B. Analysis of the SPIHT Algorithm
Here we use a concrete example to analyze the output binary stream of SPIHT encoding,
based on 3-level wavelet decomposition coefficients. With n = ⌊log2 max |c(i, j)|⌋ = 5, the
initial threshold value is T = 2^5 = 32, and the output binary stream is
11100011100010000001010110000, 29 bits in all. From the SPIHT encoding results, we can
see that the output bit stream contains long runs of consecutive "0"s, and as the quantization
gradually deepens, this situation becomes much more severe, so there is a great deal of
redundancy if we output the stream directly.
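The redundancy claim can be made concrete by counting the zero runs in the 29-bit example stream (a quick check, not part of the original analysis):

```python
from itertools import groupby

stream = "11100011100010000001010110000"  # SPIHT output from the example above

# Collapse the stream into (bit, run_length) pairs
runs = [(bit, len(list(grp))) for bit, grp in groupby(stream)]
zero_runs = [n for bit, n in runs if bit == "0"]
```

Of the 29 bits, 18 are zeros, concentrated in runs of up to six consecutive zeros, which is exactly the structure a run-length or entropy coder would exploit.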
2. Image compression using WDR algorithm

WDR ALGORITHM
One of the defects of SPIHT is that it only implicitly locates the position of significant
coefficients. This makes it difficult to perform operations, such as region selection on
compressed data, which depend on the exact position of significant transform values.
Region selection, also known as region of interest (ROI), means selecting a portion of a
compressed image that requires increased resolution. Such compressed data operations are
possible with the Wavelet Difference Reduction (WDR) algorithm of Tian and Wells.
The term difference reduction refers to the way in which WDR encodes the locations
of significant wavelet transform values. In WDR, the output from the significance pass
consists of the signs of significant values along with sequences of bits which concisely
describe the precise locations of significant values.
The WDR algorithm is a very simple procedure. A wavelet transform is first applied to the
image, and then the bit-plane based WDR encoding algorithm for the wavelet coefficients is
carried out. WDR mainly consists of five steps as follows:
1. Initialization: During this step an assignment of a scan order should first be made. For an
image with P pixels, a scan order is a one-to-one and onto mapping c_{i,j} = X_k, for
k = 1, 2, ..., P, between the wavelet coefficients c_{i,j} and a linear ordering (X_k). The scan
order is a zigzag through subbands from higher to lower levels. For coefficients within
subbands, row-based scanning is used in the horizontal subbands, column-based scanning is
used in the vertical subbands, and zigzag scanning is used for the diagonal and low-pass
subbands. Once the scanning order is made, an initial threshold T_0 is chosen so that all the
transform values satisfy |X_m| < T_0 and at least one transform value satisfies |X_m| ≥ T_0 / 2.
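One simple way to pick such a T_0 (a sketch of my own; any T_0 in (max|X_m|, 2·max|X_m|] satisfies the two conditions) is the smallest power of two strictly greater than the largest coefficient magnitude:

```python
from math import floor, log2

def initial_threshold(coeffs):
    """Smallest power of two T0 with max|X| < T0 <= 2 * max|X|.

    This satisfies the WDR initialization: all |X_m| < T0 and at
    least one |X_m| >= T0 / 2.
    """
    m = max(abs(c) for c in coeffs)
    return 2.0 ** (floor(log2(m)) + 1)
```

For example, coefficients with maximum magnitude 20 give T_0 = 32: every value is below 32, and 20 ≥ 32/2.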
2. Update threshold: Let T_k = T_{k-1} / 2.
7/28/2019 Sree Project Document
40/48
40
3. Significance pass: In this part, transform values are deemed significant if they are greater
than or equal to the threshold value. Their index values are then encoded using the difference
reduction method of Tian and Wells. The difference reduction method essentially consists of a
binary encoding of the number of steps to go from the index of the last significant value to the
index of the current significant value. The output from the significance pass includes the signs
of significant values along with sequences of bits, generated by difference reduction, which
describe the precise locations of significant values.
4. Refinement pass: The refinement pass generates the refined bits via the standard bit-plane
quantization procedure, like the refinement process in the SPIHT method. Each refined value
is a better approximation of the exact transform value.
5. Repeat steps (2) through (4) until the bit budget is reached.
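The difference-reduction idea from the significance pass can be sketched as follows. This is a simplified illustration of the gap encoding, not the full Tian-Wells coder: each index gap is written in binary with its leading 1 dropped (implicitly present), and the sign symbol that follows each significant value acts as the delimiter; here "|" stands in for that symbol.

```python
def encode_gaps(indices):
    """Simplified difference reduction over a sorted list of significant indices.

    Each gap from the previous significant index is emitted as its binary
    expansion with the leading 1 removed, followed by '|' as a stand-in
    for the sign/delimiter symbol.
    """
    out = []
    last = 0
    for idx in indices:
        gap = idx - last
        out.append(bin(gap)[3:] + "|")  # bin(gap) = '0b1...'; drop '0b1'
        last = idx
    return "".join(out)
```

For significant values at positions 2, 3 and 7 the gaps are 2, 1 and 4, encoded as "0", "" and "00"; the delimiter makes the empty codeword for a gap of 1 unambiguous.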
3. Image compression using EZW algorithm

THEORY OF EZW ALGORITHM
Wavelet transforming image data generates a large amount of unimportant data. After
quantizing and coding according to special rules, some of this unimportant data is discarded,
and the remaining data can represent the original data approximately. This is the principle of
image compression algorithms based on the wavelet transform.
Zero-tree coding is one of the most popular image compression methods using
wavelet transforms, and Embedded Zerotree Wavelet (EZW) coding is the representative
zero-tree coding method. EZW was invented by Shapiro in 1993 [3]. It is an
embedded wavelet image coding algorithm with a high compression rate. It is a
progressive coding method and performs well at image compression from lossy to
lossless.
The main features of EZW include a compact multiresolution representation of images
by the discrete wavelet transform, zero-tree coding of the significant wavelet coefficients
providing compact binary maps, successive approximation quantization of the wavelet
coefficients, adaptive multilevel arithmetic coding, and the capability of meeting an exact
target compression rate.
The basic process flow of the EZW algorithm can be described as follows. Transform the
image with the wavelet transform and quantize the coefficients. Given a series of threshold
values sorted from high to low (each threshold equals 1/2 of the former threshold), for every
threshold, sort all the coefficients, retain the important coefficients and discard the unimportant