
A

PROJECT

ON

SUPER RESOLUTION RECONSTRUCTION OF

COMPRESSED LOW RESOLUTION IMAGES

USING WAVELET LIFTING SCHEMES


ABSTRACT

Due to factors such as limited processing power and channel capacity, images are often downsampled and transmitted at low bit rates, resulting in low resolution compressed images. High resolution images can be reconstructed from several blurred, noisy, and downsampled low resolution images using a computational process known as super resolution reconstruction. The problem of recovering a high resolution image from a sequence of low resolution compressed images is considered. In this paper, we propose lifting schemes for intentionally downsampling the high resolution image sequence before compression, and we then utilize super resolution techniques to generate a high resolution image at the decoder.

The lifting wavelet transform has an advantage over the ordinary wavelet transform: it reduces the memory required for its implementation. This is possible because the lifting transform uses in-place computation; the lifting coefficients replace the image samples present in the respective memory locations. In our proposed approach, forward lifting is applied to the high resolution images, which are then compressed using Set Partitioning in Hierarchical Trees (SPIHT), and the compressed low resolution images are transmitted. At the decoder, super resolution techniques are applied: lifting scheme based fusion is performed, the result is SPIHT decoded, and inverse lifting is applied. To remove noise from the reconstructed image, soft thresholding is performed; the blur is removed using blind deconvolution; and the image is finally interpolated using our novel interpolation technique. We have performed both objective and subjective analysis of the reconstructed image, and the resultant image has a better super resolution factor and higher ISNR and PSNR.


INTRODUCTION

Super resolution is the process of producing a high spatial resolution image from one or more low resolution (LR) observations. It includes alias-free upsampling of the image, thereby increasing the maximum spatial frequency, and the removal of degradations that arise during image capture, viz. blur and noise. It is the ability to use multiple noisy and blurred images obtained by low resolution cameras and together generate a higher resolution image with greater detail than could be obtained from any single image. The main reason a single high resolution image can be reconstructed from multiple low resolution images is that the images are degraded and mis-registered: they are sub-pixel shifted, and sub-sampling results in aliased low resolution images. If the images are sub-pixel shifted, then each low resolution image contains different information. Therefore, the information contained in an under-sampled image sequence can be combined to obtain an alias-free high resolution image. Super resolution image reconstruction from multiple snapshots provides far more detail than any interpolated image from a single snapshot.

WAVELETS

First of all, why do we need a transform, or what is a transform anyway?

Mathematical transformations are applied to signals to obtain further information from the signal that is not readily available in its raw form. In the following tutorial I will assume a time-domain signal as a raw signal, and a signal that has been "transformed" by any of the available mathematical transformations as a processed signal.

There are a number of transformations that can be applied, among which the Fourier transform is probably by far the most popular.


Most signals in practice are TIME-DOMAIN signals in their raw format. That is, whatever the signal is measuring is a function of time. In other words, when we plot the signal, one of the axes is time (the independent variable) and the other (the dependent variable) is usually the amplitude. When we plot time-domain signals, we obtain a time-amplitude representation of the signal. This representation is not always the best representation of the signal for most signal processing applications. In many cases, the most distinguishing information is hidden in the frequency content of the signal. The frequency SPECTRUM of a signal is basically the frequency (spectral) components of that signal. The frequency spectrum of a signal shows what frequencies exist in the signal.

Intuitively, we all know that frequency has something to do with the rate of change of something. If something (a mathematical or physical variable, to be technically correct) changes rapidly, we say that it is of high frequency, whereas if this variable does not change rapidly, i.e., it changes smoothly, we say that it is of low frequency. If this variable does not change at all, then we say it has zero frequency, or no frequency. For example, the publication frequency of a daily newspaper is higher than that of a monthly magazine (it is published more frequently).

Frequency is measured in cycles per second, or with a more common name, in "Hertz". For example, the electric power we use in our daily life in the US is 60 Hz (50 Hz elsewhere in the world). For a 50 Hz supply, if you try to plot the electric current, it will be a sine wave passing through the same point 50 times in one second. Now, look at the following figures. The first one is a sine wave at 3 Hz, the second one at 10 Hz, and the third one at 50 Hz. Compare them.

So how do we measure frequency, or how do we find the frequency content of a signal?

The answer is FOURIER TRANSFORM (FT). If the FT of a signal in time domain is taken, the

frequency-amplitude representation of that signal is obtained. In other words, we now have a plot

with one axis being the frequency and the other being the amplitude. This plot tells us how much

of each frequency exists in our signal.
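This frequency-amplitude view is easy to sketch numerically. The snippet below is a minimal illustration (not part of the original project); the test signal, sampling rate, and normalization are chosen here purely for demonstration:

```python
import numpy as np

# A 1-second signal sampled at 1000 Hz, containing 3 Hz and 50 Hz components.
fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)

# The FFT gives the frequency-amplitude representation; with a 1 s window,
# bin k corresponds to k Hz.  Normalize so a unit-amplitude sinusoid -> 1.0.
spectrum = np.abs(np.fft.rfft(x)) / (fs / 2)
freqs = np.fft.rfftfreq(fs, d=1 / fs)

# The two largest peaks sit at 3 Hz (amplitude ~1.0) and 50 Hz (~0.5).
peaks = freqs[np.argsort(spectrum)[-2:]]
print(sorted(peaks))  # the two dominant frequencies: 3 Hz and 50 Hz
```

Note that the plot tells us how much of each frequency is present, but nothing about when it occurs, which is exactly the limitation discussed below.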


The frequency axis starts from zero, and goes up to infinity. For every frequency, we

have an amplitude value. For example, if we take the FT of the electric current that we use in our

houses, we will have one spike at 50 Hz, and nothing elsewhere, since that signal has only 50 Hz

frequency component. No other signal, however, has a FT which is this simple. For most

practical purposes, signals contain more than one frequency component. The following shows

the FT of the 50 Hz signal:

One word of caution is in order at this point. Note that two plots are given in Figure 1.4.

The bottom one plots only the first half of the top one. Due to reasons that are not crucial to

know at this time, the frequency spectrum of a real valued signal is always symmetric. The top

plot illustrates this point. However, since the symmetric part is exactly a mirror image of the first

part, it provides no additional information, and therefore, this symmetric second part is usually

not shown. In most of the following figures corresponding to FT, I will only show the first half

of this symmetric spectrum.

WHY DO WE NEED THE FREQUENCY INFORMATION?

Often times, the information that cannot be readily seen in the time-domain can be seen

in the frequency domain.

Let's give an example from biological signals. Suppose we are looking at an ECG signal

(ElectroCardioGraphy, graphical recording of heart's electrical activity). The typical shape of a

healthy ECG signal is well known to cardiologists. Any significant deviation from that shape is

usually considered to be a symptom of a pathological condition.

This pathological condition, however, may not always be quite obvious in the original

time-domain signal. Cardiologists usually use the time-domain ECG signals which are recorded

on strip-charts to analyze ECG signals. Recently, the new computerized ECG

recorders/analyzers also utilize the frequency information to decide whether a pathological

condition exists. A pathological condition can sometimes be diagnosed more easily when the

frequency content of the signal is analyzed.


This, of course, is only one simple example why frequency content might be useful.

Today Fourier transforms are used in many different areas including all branches of engineering.

Although FT is probably the most popular transform being used (especially in electrical

engineering), it is not the only one. There are many other transforms that are used quite often by

engineers and mathematicians. Hilbert transform, short-time Fourier transform (more about this

later), Wigner distributions, the Radon transform, and of course our featured transformation, the wavelet transform, constitute only a small portion of a huge list of transforms that are available at engineers' and mathematicians' disposal. Every transformation technique has its own

area of application, with advantages and disadvantages, and the wavelet transform (WT) is no

exception.

For a better understanding of the need for the WT, let's look at the FT more closely. The FT (as well as the WT) is a reversible transform; that is, it allows us to go back and forth between the raw and processed (transformed) signals. However, only one of them is available at any given

time. That is, no frequency information is available in the time-domain signal, and no time

information is available in the Fourier transformed signal. The natural question that comes to mind is whether it is necessary to have both the time and the frequency information at the same time.

As we will see soon, the answer depends on the particular application and the nature of the signal at hand. Recall that the FT gives the frequency information of the signal, which means

that it tells us how much of each frequency exists in the signal, but it does not tell us when in

time these frequency components exist. This information is not required when the signal is so-

called stationary.

FOURIER TRANSFORM DRAWBACKS

I will not go into the details of the FT, for two reasons:

1. It is too wide a subject to discuss in this tutorial.

2. It is not our main concern anyway.

However, I would like to mention a couple of important points again, for two reasons:

1. It is a necessary background to understand how the WT works.

2. It has been by far the most important signal processing tool for many (and I mean many, many) years.


In the 19th century (1822, to be exact, but you do not need to know the exact date; just trust me that it is far earlier than you can remember), the French mathematician J. Fourier showed that any periodic function can be expressed as an infinite sum of periodic complex exponential functions. Many years after he had discovered this remarkable property of (periodic) functions, his ideas were generalized first to non-periodic functions, and then to periodic and non-periodic discrete time signals. It is after this generalization that the FT became a very suitable tool for computer calculations. In 1965, a new algorithm called the fast Fourier transform (FFT) was developed, and the FT became even more popular.

THE SHORT TERM FOURIER TRANSFORM

There is only a minor difference between STFT and FT. In STFT, the signal is divided

into small enough segments, where these segments (portions) of the signal can be assumed to be

stationary. For this purpose, a window function "w" is chosen. The width of this window must

be equal to the segment of the signal where its stationarity is valid.

This window function is first located at the very beginning of the signal; that is, the window function is located at t=0. Let's suppose that the width of the window is T seconds. At this

time instant (t=0), the window function will overlap with the first T/2 seconds (I will assume that

all time units are in seconds). The window function and the signal are then multiplied. By doing

this, only the first T/2 seconds of the signal is being chosen, with the appropriate weighting of

the window (if the window is a rectangle, with amplitude "1", then the product will be equal to

the signal). Then this product is assumed to be just another signal, whose FT is to be taken. In

other words, FT of this product is taken, just as taking the FT of any signal.

The result of this transformation is the FT of the first T/2 seconds of the signal. If this

portion of the signal is stationary, as it is assumed, then there will be no problem and the

obtained result will be a true frequency representation of the first T/2 seconds of the signal.

The next step would be to shift this window (by some t1 seconds) to a new location, multiply it with the signal, and take the FT of the product. This procedure is repeated, shifting the window in intervals of t1 seconds, until the end of the signal is reached.
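The windowing procedure above can be sketched in a few lines. This is a toy illustration with a rectangular window; the test signal and window length are hypothetical choices made for this example:

```python
import numpy as np

def stft(x, fs, win_len, hop):
    """Short-time Fourier transform with a rectangular window.

    The signal is cut into win_len-sample segments every hop samples;
    each segment is assumed stationary and its FT is taken separately.
    """
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = x[start:start + win_len]          # window (amplitude 1) times signal
        frames.append(np.abs(np.fft.rfft(seg)))
    freqs = np.fft.rfftfreq(win_len, d=1 / fs)
    return np.array(frames), freqs

# A non-stationary test signal: 10 Hz for the first half, 40 Hz for the second.
fs = 1000
t = np.arange(fs) / fs
x = np.where(t < 0.5, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 40 * t))

frames, freqs = stft(x, fs, win_len=200, hop=200)
dominant = freqs[frames.argmax(axis=1)]
print(dominant)  # early windows peak near 10 Hz, late windows near 40 Hz
```

Unlike the plain FT, the result now tells us not only which frequencies are present, but roughly when each one occurs.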


MULTIRESOLUTION ANALYSIS

Although the time and frequency resolution problems are results of a physical

phenomenon (the Heisenberg uncertainty principle) and exist regardless of the transform used, it

is possible to analyze any signal by using an alternative approach called the multiresolution

analysis (MRA). MRA, as implied by its name, analyzes the signal at different frequencies with different resolutions. Unlike in the STFT, not every spectral component is resolved equally.

MRA is designed to give good time resolution and poor frequency resolution at high

frequencies and good frequency resolution and poor time resolution at low frequencies. This

approach makes sense especially when the signal at hand has high frequency components for

short durations and low frequency components for long durations. Fortunately, the signals that

are encountered in practical applications are often of this type. For example, the following shows

a signal of this type. It has a relatively low frequency component throughout the entire signal and

relatively high frequency components for a short duration somewhere around the middle.

THE CONTINUOUS WAVELET TRANSFORM

The continuous wavelet transform was developed as an alternative approach to the short

time Fourier transform to overcome the resolution problem. The wavelet analysis is done in a

similar way to the STFT analysis, in the sense that the signal is multiplied with a function, the wavelet, similar to the window function in the STFT, and the transform is computed

separately for different segments of the time-domain signal. However, there are two main

differences between the STFT and the CWT:

1. The Fourier transforms of the windowed signals are not taken, and therefore a single peak will be seen corresponding to a sinusoid, i.e., negative frequencies are not computed.

2. The width of the window is changed as the transform is computed for every single spectral

component, which is probably the most significant characteristic of the wavelet transform.


The continuous wavelet transform is defined as follows:

CWT(tau, s) = 1/sqrt{|s|} * integral over t of x(t) psi*((t - tau)/s) dt

As seen in the above equation, the transformed signal is a function of two variables, tau and s, the translation and scale parameters, respectively. psi(t) is the transforming function, and it is called the mother wavelet. The term mother wavelet gets its name due to two important properties of the wavelet analysis, as explained below:

The term wavelet means a small wave. The smallness refers to the condition that this (window) function is of finite length (compactly supported). The wave refers to the condition that this function is oscillatory. The term mother implies that the functions with different regions of support that are used in the transformation process are derived from one main function, the mother wavelet. In other words, the mother wavelet is a prototype for generating the other window functions.

The term translation is used in the same sense as it was used in the STFT; it is related to

the location of the window, as the window is shifted through the signal. This term, obviously,

corresponds to time information in the transform domain. However, we do not have a frequency parameter, as we had for the STFT. Instead, we have a scale parameter, which is defined as 1/frequency. The term frequency is reserved for the STFT. Scale is described in more detail in the next section.

COMPUTATION OF THE CWT

Interpretation of the above equation will be explained in this section. Let x(t) be the signal to be analyzed. The mother wavelet is chosen to serve as a prototype for all windows in the

process. All the windows that are used are the dilated (or compressed) and shifted versions of the

mother wavelet. There are a number of functions that are used for this purpose. The Morlet

wavelet and the Mexican hat function are two candidates.


Once the mother wavelet is chosen, the computation starts with s=1, and the continuous wavelet transform is computed for all values of s, smaller and larger than 1. However, depending on

the signal, a complete transform is usually not necessary. For all practical purposes, the signals

are bandlimited, and therefore, computation of the transform for a limited interval of scales is

usually adequate. In this study, some finite interval of values for s was used, as will be described

later in this chapter.

For convenience, the procedure will be started from scale s=1 and will continue for the

increasing values of s , i.e., the analysis will start from high frequencies and proceed towards low

frequencies. This first value of s will correspond to the most compressed wavelet. As the value of

s is increased, the wavelet will dilate.

The wavelet is placed at the beginning of the signal at the point which corresponds to

time=0. The wavelet function at scale 1 is multiplied by the signal and then integrated over all times. The result of the integration is then multiplied by the constant number 1/sqrt{s}. This

multiplication is for energy normalization purposes so that the transformed signal will have the

same energy at every scale. The final result is the value of the transformation, i.e., the value of

the continuous wavelet transform at time zero and scale s=1 . In other words, it is the value that

corresponds to the point tau =0 , s=1 in the time-scale plane.

The wavelet at scale s=1 is then shifted towards the right by tau amount to the location

t=tau, and the above equation is computed to get the transform value at t=tau, s=1 in the time-scale plane.

This procedure is repeated until the wavelet reaches the end of the signal. One row of

points on the time-scale plane for the scale s=1 is now completed.

Then, s is increased by a small value. Note that, this is a continuous transform, and

therefore, both tau and s must be incremented continuously . However, if this transform needs to

be computed by a computer, then both parameters are increased by a sufficiently small step

size. This corresponds to sampling the time-scale plane.


The above procedure is repeated for every value of s. Every computation for a given value of s

fills the corresponding single row of the time-scale plane. When the process is completed for all

desired values of s, the CWT of the signal has been calculated.
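The scan over tau and s described above can be written out directly. The following is a slow but transparent sketch; the Mexican hat mother wavelet, the 5 Hz test signal, and the scale grid are choices made only for this illustration:

```python
import numpy as np

def mexican_hat(t):
    # Mexican hat (Ricker) mother wavelet: second derivative of a Gaussian.
    return (1 - t**2) * np.exp(-t**2 / 2)

def cwt(x, dt, scales):
    """Continuous wavelet transform sampled on a grid of (tau, s) points.

    For each scale s, the mother wavelet is dilated, shifted across the
    signal, multiplied with it, integrated, and scaled by 1/sqrt(s).
    """
    t = np.arange(len(x)) * dt
    out = np.empty((len(scales), len(x)))
    for i, s in enumerate(scales):
        for k, tau in enumerate(t):
            psi = mexican_hat((t - tau) / s)
            out[i, k] = (x * psi).sum() * dt / np.sqrt(s)  # energy-normalized
    return out

# A 5 Hz sinusoid should respond most strongly at one band of scales.
dt = 0.01
t = np.arange(200) * dt
x = np.sin(2 * np.pi * 5 * t)
scales = np.array([0.01, 0.03, 0.1, 0.3])
coeffs = cwt(x, dt, scales)

# Inspect the column at tau = 1.05 s, where the sinusoid is at a peak.
best = scales[np.abs(coeffs[:, 105]).argmax()]
print(best)  # the scale best matched to the 5 Hz oscillation
```

Each row of `coeffs` is one row of the time-scale plane, filled exactly as the text describes: shift, multiply, integrate, then move to the next scale.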

Why is the Discrete Wavelet Transform Needed?

Although the discretized continuous wavelet transform enables the computation of the

continuous wavelet transform by computers, it is not a true discrete transform. As a matter of

fact, the wavelet series is simply a sampled version of the CWT, and the information it provides

is highly redundant as far as the reconstruction of the signal is concerned. This redundancy, on

the other hand, requires a significant amount of computation time and resources. The discrete

wavelet transform (DWT), on the other hand, provides sufficient information both for analysis

and synthesis of the original signal, with a significant reduction in the computation time.

The DWT is considerably easier to implement when compared to the CWT. The basic concepts

of the DWT will be introduced in this section along with its properties and the algorithms used to

compute it. As in the previous chapters, examples are provided to aid in the interpretation of the

DWT.

 THE DISCRETE WAVELET TRANSFORM (DWT)

  The foundations of the DWT go back to 1976 when Croiser, Esteban, and Galand devised

a technique to decompose discrete time signals. Crochiere, Weber, and Flanagan did a similar

work on coding of speech signals in the same year. They named their analysis scheme subband coding. In 1983, Burt defined a technique very similar to subband coding and named it

pyramidal coding which is also known as multiresolution analysis. Later in 1989, Vetterli and

Le Gall made some improvements to the subband coding scheme, removing the existing

redundancy in the pyramidal coding scheme. Subband coding is explained below. A detailed

coverage of the discrete wavelet transform and theory of multiresolution analysis can be found in

a number of articles and books that are available on this topic, and it is beyond the scope of this

tutorial.

 


The Subband Coding and The Multiresolution Analysis  

The main idea is the same as it is in the CWT. A time-scale representation of a digital

signal is obtained using digital filtering techniques. Recall that the CWT is a correlation between

a wavelet at different scales and the signal with the scale (or the frequency) being used as a

measure of similarity. The continuous wavelet transform was computed by changing the scale of

the analysis window, shifting the window in time, multiplying by the signal, and integrating over

all times. In the discrete case, filters of different cutoff frequencies are used to analyze the signal

at different scales. The signal is passed through a series of high pass filters to analyze the high

frequencies, and it is passed through a series of low pass filters to analyze the low frequencies.

The resolution of the signal, which is a measure of the amount of detail information in the

signal, is changed by the filtering operations, and the scale is changed by upsampling and

downsampling (subsampling) operations. Subsampling a signal corresponds to reducing the

sampling rate, or removing some of the samples of the signal. For example, subsampling by two

refers to dropping every other sample of the signal. Subsampling by a factor of n reduces the number of samples in the signal by a factor of n.

Upsampling a signal corresponds to increasing the sampling rate of a signal by adding

new samples to the signal. For example, upsampling by two refers to adding a new sample,

usually a zero or an interpolated value, between every two samples of the signal. Upsampling a

signal by a factor of n increases the number of samples in the signal by a factor of n.
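Both operations are one-liners in NumPy; the short sequence below is purely illustrative:

```python
import numpy as np

x = np.arange(8)                 # [0, 1, 2, 3, 4, 5, 6, 7]

# Subsampling by two: drop every other sample; 8 samples become 4.
down = x[::2]                    # [0, 2, 4, 6]

# Upsampling by two: insert a zero between every two samples; 4 become 8.
up = np.zeros(2 * len(down), dtype=x.dtype)
up[::2] = down                   # [0, 0, 2, 0, 4, 0, 6, 0]

print(down, up)
```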

Although it is not the only possible choice, DWT coefficients are usually sampled from the CWT on a dyadic grid, i.e., s0 = 2 and tau0 = 1, yielding s = 2^j and tau = k*2^j, as described in Part 3. Since the signal is a discrete time function, the terms function and sequence will be used

interchangeably in the following discussion. This sequence will be denoted by x[n], where n is

an integer.

The procedure starts with passing this signal (sequence) through a half band digital

lowpass filter with impulse response h[n]. Filtering a signal corresponds to the mathematical operation of convolution of the signal with the impulse response of the filter. The convolution operation in discrete time is defined as follows:

x[n] * h[n] = sum_{k = -inf}^{inf} x[k] h[n - k]
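The convolution sum can be evaluated directly and checked against NumPy's built-in routine. The signal and the 2-tap averaging filter below are toy values chosen only for illustration:

```python
import numpy as np

# Discrete convolution: (x * h)[n] = sum over k of x[k] * h[n - k].
x = np.array([1.0, 2.0, 3.0])
h = np.array([0.5, 0.5])    # a simple 2-tap averaging (lowpass-like) filter

# Direct evaluation of the sum, then comparison with the library routine.
n_out = len(x) + len(h) - 1
y = np.zeros(n_out)
for n in range(n_out):
    for k in range(len(x)):
        if 0 <= n - k < len(h):
            y[n] += x[k] * h[n - k]

print(y)                     # [0.5, 1.5, 2.5, 1.5]
assert np.allclose(y, np.convolve(x, h))
```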


A half band lowpass filter removes all frequencies that are above half of the highest

frequency in the signal. For example, if a signal has a maximum frequency component of 1000 Hz, then half band lowpass filtering removes all the frequencies above 500 Hz.

The unit of frequency is of particular importance at this time. In discrete signals, frequency is expressed in terms of radians. Accordingly, the sampling frequency of the signal is equal to 2π radians in terms of radial frequency. Therefore, the highest frequency component that exists in a signal will be π radians, if the signal is sampled at the Nyquist rate (which is twice the maximum frequency that exists in the signal); that is, the Nyquist rate corresponds to π rad/s in the discrete frequency domain. Therefore, using Hz is not appropriate for discrete signals. However, Hz is used whenever it is needed to clarify a discussion, since it is very common to think of frequency in terms of Hz. It should always be remembered that the unit of frequency for discrete time signals is radians.

After passing the signal through a half band lowpass filter, half of the samples can be eliminated according to the Nyquist rule, since the signal now has a highest frequency of π/2 radians instead of π radians. Simply discarding every other sample will subsample the signal by

two, and the signal will then have half the number of points. The scale of the signal is now

doubled. Note that the lowpass filtering removes the high frequency information, but leaves the

scale unchanged. Only the subsampling process changes the scale. Resolution, on the other hand,

is related to the amount of information in the signal, and therefore, it is affected by the filtering

operations. Half band lowpass filtering removes half of the frequencies, which can be interpreted

as losing half of the information. Therefore, the resolution is halved after the filtering operation.

Note, however, the subsampling operation after filtering does not affect the resolution, since

removing half of the spectral components from the signal makes half the number of samples

redundant anyway. Half the samples can be discarded without any loss of information. In

summary, the lowpass filtering halves the resolution, but leaves the scale unchanged.


The signal is then subsampled by 2 since half of the number of samples are redundant.

This doubles the scale.

This procedure can mathematically be expressed as

y[n] = sum_{k} h[k] x[2n - k]

Having said that, we now look at how the DWT is actually computed: the DWT analyzes the signal at different frequency bands with different resolutions by decomposing the signal into

a coarse approximation and detail information. DWT employs two sets of functions, called

scaling functions and wavelet functions, which are associated with low pass and highpass filters,

respectively. The decomposition of the signal into different frequency bands is simply obtained

by successive highpass and lowpass filtering of the time domain signal. The original signal x[n]

is first passed through a halfband highpass filter g[n] and a lowpass filter h[n]. After the filtering, half of the samples can be eliminated according to the Nyquist rule, since the signal now has a highest frequency of π/2 radians instead of π. The signal can therefore be subsampled by 2, simply by discarding every other sample. This constitutes one level of decomposition and can mathematically be expressed as follows:

yhigh[k] = sum_{n} x[n] g[2k - n]

ylow[k] = sum_{n} x[n] h[2k - n]

where yhigh[k] and ylow[k] are the outputs of the highpass and lowpass filters, respectively,

after subsampling by 2.

This decomposition halves the time resolution since only half the number of samples now

characterizes the entire signal. However, this operation doubles the frequency resolution, since

the frequency band of the signal now spans only half the previous frequency band, effectively

reducing the uncertainty in the frequency by half.
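One level of this decomposition can be sketched with the Haar filter pair, the shortest such pair; the input samples below are made up for illustration:

```python
import numpy as np

# Haar analysis filters: lowpass h and its quadrature-mirror highpass g.
h = np.array([1.0, 1.0]) / np.sqrt(2)
g = np.array([-1.0, 1.0]) / np.sqrt(2)

def analyze(x, f):
    # Filter (convolve with f), then subsample by two.
    return np.convolve(x, f)[1::2]

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
ylow = analyze(x, h)    # coarse approximation: 8 samples -> 4
yhigh = analyze(x, g)   # detail coefficients:  8 samples -> 4

print(ylow)   # pairwise sums divided by sqrt(2)
print(yhigh)  # pairwise differences divided by sqrt(2)

# The halved sequences still carry all the information: energy is preserved.
assert np.isclose((x**2).sum(), (ylow**2).sum() + (yhigh**2).sum())
```

Each output has half the samples of the input, matching the halved time resolution described above, while together they retain the full signal energy.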


The above procedure, which is also known as the subband coding, can be repeated for

further decomposition. At every level, the filtering and subsampling will result in half the

number of samples (and hence half the time resolution) and half the frequency band spanned

(and hence double the frequency resolution). Figure 4.1 illustrates this procedure, where x[n] is

the original signal to be decomposed, and h[n] and g[n] are lowpass and highpass filters,

respectively. The bandwidth of the signal at every level is marked on the figure as "f".


Figure 4.1. The Subband Coding Algorithm

As an example, suppose that the original signal x[n] has 512 sample points, spanning a frequency band of 0 to π rad/s. At the first decomposition level, the signal is passed through the highpass and lowpass filters, followed by subsampling by 2.


The output of the highpass filter has 256 points (hence half the time resolution), but it only spans the frequencies π/2 to π rad/s (hence double the frequency resolution). These 256 samples constitute the first level of DWT coefficients. The output of the lowpass filter also has 256 samples, but it spans the other half of the frequency band, frequencies from 0 to π/2 rad/s. This signal is then passed through the same lowpass and highpass filters for further decomposition. The output of the second lowpass filter followed by subsampling has 128 samples spanning a frequency band of 0 to π/4 rad/s, and the output of the second highpass filter followed by subsampling has 128 samples spanning a frequency band of π/4 to π/2 rad/s. The second highpass filtered signal constitutes the second level of DWT coefficients. This signal has

half the time resolution, but twice the frequency resolution of the first level signal. In other

words, time resolution has decreased by a factor of 4, and frequency resolution has increased by

a factor of 4 compared to the original signal. The lowpass filter output is then filtered once again

for further decomposition. This process continues until two samples are left. For this specific

example there would be 8 levels of decomposition, each having half the number of samples of

the previous level. The DWT of the original signal is then obtained by concatenating all

coefficients starting from the last level of decomposition (remaining two samples, in this case).

The DWT will then have the same number of coefficients as the original signal.
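The 512-sample example can be verified numerically. The sketch below uses Haar filters for concreteness, and the random input is arbitrary:

```python
import numpy as np

# Repeatedly split the lowpass output until two samples remain,
# collecting the highpass (detail) coefficients at each level.
h = np.array([1.0, 1.0]) / np.sqrt(2)    # Haar lowpass
g = np.array([-1.0, 1.0]) / np.sqrt(2)   # Haar highpass

def dwt(x):
    coeffs = []
    levels = 0
    while len(x) > 2:
        detail = np.convolve(x, g)[1::2]   # highpass + subsample: DWT coefficients
        x = np.convolve(x, h)[1::2]        # lowpass + subsample: feed to next level
        coeffs.append(detail)
        levels += 1
    coeffs.append(x)                       # the two remaining approximation samples
    return coeffs, levels

x = np.random.default_rng(0).standard_normal(512)
coeffs, levels = dwt(x)
total = sum(len(c) for c in coeffs)
print(levels, total)   # 8 levels; 256+128+...+2 detail samples plus 2 equals 512
```

The detail lengths 256, 128, ..., 2 plus the final two approximation samples sum to exactly 512, confirming that the DWT has the same number of coefficients as the original signal.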

The frequencies that are most prominent in the original signal will appear as high

amplitudes in that region of the DWT signal that includes those particular frequencies. The

difference of this transform from the Fourier transform is that the time localization of these

frequencies will not be lost. However, the time localization will have a resolution that depends

on which level they appear. If the main information of the signal lies in the high frequencies, as happens most often, the time localization of these frequencies will be more precise, since they are characterized by a greater number of samples. If the main information lies only at very low frequencies, the time localization will not be very precise, since few samples are used to express the signal at these frequencies. This procedure in effect offers good time resolution at high

frequencies, and good frequency resolution at low frequencies. Most practical signals

encountered are of this type.


We will revisit this example, since it provides important insight to how DWT should be

interpreted. Before that, however, we need to conclude our mathematical analysis of the DWT.

One important property of the discrete wavelet transform is the relationship between the

impulse responses of the highpass and lowpass filters. The highpass and lowpass filters are not

independent of each other, and they are related by

g[L − 1 − n] = (−1)^n · h[n]

where g[n] is the highpass, h[n] is the lowpass filter, and L is the filter length (in number

of points). Note that the two filters are odd index alternated reversed versions of each other.

Lowpass to highpass conversion is provided by the (−1)^n term. Filters satisfying this condition are

commonly used in signal processing, and they are known as the Quadrature Mirror Filters

(QMF). The two filtering and subsampling operations can be expressed by

y_high[k] = Σ_n x[n] · g[2k − n]

y_low[k] = Σ_n x[n] · h[2k − n]
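As a numerical check of the QMF relationship, the sketch below builds g[n] from a lowpass filter via the stated relation and verifies that the two filters are orthogonal. The Daubechies D4 lowpass coefficients are used purely as a convenient example pair:

```python
import math

# Daubechies D4 lowpass coefficients (a standard tabulation).
s3 = math.sqrt(3.0)
h = [(1 + s3) / (4 * math.sqrt(2)), (3 + s3) / (4 * math.sqrt(2)),
     (3 - s3) / (4 * math.sqrt(2)), (1 - s3) / (4 * math.sqrt(2))]

# Apply the relation g[L - 1 - n] = (-1)^n * h[n] directly.
L = len(h)
g = [0.0] * L
for n in range(L):
    g[L - 1 - n] = (-1) ** n * h[n]

# The resulting highpass filter is orthogonal to the lowpass filter.
assert abs(sum(hk * gk for hk, gk in zip(h, g))) < 1e-12
```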

The reconstruction in this case is very easy since halfband filters form orthonormal bases.

The above procedure is followed in reverse order for the reconstruction. The signals at every

level are upsampled by two, passed through the synthesis filters g′[n] and h′[n] (highpass

and lowpass, respectively), and then added. The interesting point here is that the analysis and

synthesis filters are identical to each other, except for a time reversal. Therefore, the

reconstruction formula becomes (for each layer)

x[n] = Σ_k ( y_high[k] · g[2k − n] + y_low[k] · h[2k − n] )
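For the two-tap orthonormal (Haar) pair, the analysis/synthesis round trip can be verified directly. A sketch, assuming the normalized Haar filters rather than any particular filter from the text:

```python
import math

r = 1 / math.sqrt(2)  # normalization for the orthonormal Haar pair

def analyze(x):
    """Lowpass/highpass filtering followed by subsampling by 2."""
    low = [r * (x[2 * k] + x[2 * k + 1]) for k in range(len(x) // 2)]
    high = [r * (x[2 * k] - x[2 * k + 1]) for k in range(len(x) // 2)]
    return low, high

def synthesize(low, high):
    """Upsample, filter with the time-reversed filters, and add."""
    x = []
    for l, d in zip(low, high):
        x.append(r * (l + d))  # even-indexed sample
        x.append(r * (l - d))  # odd-indexed sample
    return x

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
low, high = analyze(x)
xr = synthesize(low, high)
assert all(abs(a - b) < 1e-9 for a, b in zip(x, xr))  # perfect reconstruction
```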


However, if the filters are not ideal halfband, then perfect reconstruction cannot be

achieved. Although it is not possible to realize ideal filters, under certain conditions it is possible

to find filters that provide perfect reconstruction. The most famous ones are the ones developed

by Ingrid Daubechies, and they are known as Daubechies' wavelets.

Note that due to successive subsampling by 2, the signal length must be a power of 2, or

at least a multiple of a power of 2, for this scheme to be efficient. The length of the signal

determines the number of levels that the signal can be decomposed to. For example, if the signal

length is 1024, ten levels of decomposition are possible.

Interpreting the DWT coefficients can sometimes be rather difficult because the way

DWT coefficients are presented is rather peculiar. To make a real long story real short, DWT

coefficients of each level are concatenated, starting with the last level. An example is in order to

make this concept clear:

Suppose we have a 256-sample long signal sampled at 10 MHz and we wish to obtain its

DWT coefficients. Since the signal is sampled at 10 MHz, the highest frequency component that

exists in the signal is 5 MHz. At the first level, the signal is passed through the lowpass filter

h[n], and the highpass filter g[n], the outputs of which are subsampled by two. The highpass

filter output is the first level DWT coefficients. There are 128 of them, and they represent the

signal in the [2.5 5] MHz range. These 128 samples are the last 128 samples plotted. The

lowpass filter output, which also has 128 samples, but spanning the frequency band of [0 2.5]

MHz, are further decomposed by passing them through the same h[n] and g[n]. The output of the

second highpass filter is the level 2 DWT coefficients and these 64 samples precede the 128 level

1 coefficients in the plot. The output of the second lowpass filter is further decomposed, once

again by passing it through the filters h[n] and g[n]. The output of the third highpass filter is the

level 3 DWT coefficients. These 32 samples precede the level 2 DWT coefficients in the plot.

The procedure continues until level 8, where only one highpass DWT coefficient (along

with one remaining lowpass coefficient) can be computed. The lowpass coefficient is the first to

be plotted in the DWT plot. This is followed by the single level 8 coefficient, 2 level 7

coefficients, 4 level 6 coefficients, 8 level 5 coefficients, 16 level 4 coefficients, 32 level 3

coefficients, 64 level 2 coefficients and finally 128 level 1 coefficients, 256 in total.


Note that fewer and fewer samples are used at lower frequencies; therefore, the time

resolution decreases as frequency decreases, but since the frequency interval also decreases at

low frequencies, the frequency resolution increases. Obviously, the first few coefficients would

not carry a whole lot of information, simply due to greatly reduced time resolution. To illustrate

this richly bizarre DWT representation let us take a look at a real world signal. Our original

signal is a 256-sample long ultrasonic signal, which was sampled at 25 MHz. This signal was

originally generated by using a 2.25 MHz transducer, therefore the main spectral component of

the signal is at 2.25 MHz. The last 128 samples correspond to [6.25 12.5] MHz range. As seen

from the plot, no information is available here, hence these samples can be discarded without any

loss of information. The preceding 64 samples represent the signal in the [3.12 6.25] MHz range,

which also does not carry any significant information. The little glitches probably correspond to

the high frequency noise in the signal. The preceding 32 samples represent the signal in the [1.5

3.1] MHz range. As you can see, the majority of the signal's energy is focused in these 32

samples, as we expected to see. The previous 16 samples correspond to [0.75 1.5] MHz and the

peaks that are seen at this level probably represent the lower frequency envelope of the signal.

The previous samples probably do not carry any other significant information. It is safe to say

that we can get by with the 3rd and 4th level coefficients, that is we can represent this 256 sample

long signal with 16+32=48 samples, a significant data reduction which would make your

computer quite happy.

One area that has benefited the most from this particular property of the wavelet

transforms is image processing. As you may well know, images, particularly high-resolution

images, claim a lot of disk space. As a matter of fact, if this tutorial is taking a long time to

download, that is mostly because of the images. DWT can be used to reduce the image size

without losing much of the resolution.

For a given image, you can compute the DWT of, say each row, and discard all values in

the DWT that are less than a certain threshold. We then save only those DWT coefficients that

are above the threshold for each row, and when we need to reconstruct the original image, we

simply pad each row with as many zeros as the number of discarded coefficients, and use the

inverse DWT to reconstruct each row of the original image. We can also analyze the image at


different frequency bands, and reconstruct the original image by using only the coefficients that

are of a particular band. I will try to put sample images hopefully soon, to illustrate this point.

Another issue that is receiving more and more attention is carrying out the decomposition

(subband coding) not only on the lowpass side but on both sides. In other words, zooming into

both low and high frequency bands of the signal separately. This can be visualized as having

both sides of the tree structure of Figure 4.1. What results is what is known as wavelet

packets. We will not discuss wavelet packets here, since it is beyond the scope of this

tutorial. Anyone who is interested in wavelet packets, or in more information on the DWT, can find

this information in any of the numerous texts available in the market.

And this concludes our mini series of wavelet tutorial. If I could be of any assistance to

anyone struggling to understand the wavelets, I would consider the time and the effort that went

into this tutorial well spent. I would like to remind you that this tutorial is neither a complete nor a

thorough coverage of the wavelet transforms. It is merely an overview of the concept of wavelets

and it was intended to serve as a first reference for those who find the available texts on wavelets

rather complicated. There might be many structural and/or technical mistakes, and I would

appreciate if you could point those out to me. Your feedback is of utmost importance for the

success of this tutorial.

The Wavelet Lifting Scheme :

Wavelets and Their Applications

Wavelets have been applied in a wide range of areas. My interest in wavelets came from digital

signal processing of non-stationary time series. Data sets without obviously periodic components

cannot be processed well using Fourier techniques. Wavelets allow complex filters to be

constructed for this kind of data which can remove or enhance selected parts of the signal. There

is a growing body of literature on wavelet techniques for noise reduction.

Wavelets have been used for data compression. For example, the United States FBI compresses

its fingerprint database using wavelets. Lifting scheme wavelets also form the basis of the

emerging JPEG 2000 image compression standard.


Wavelet techniques have also been used in a variety of statistical applications, including signal

variance estimation, frequency analysis and kernel regression.

Perfect Reconstruction in a Finite World

One of the features of wavelets that is critical in areas like signal processing and compression is

what is referred to in the wavelet literature as perfect reconstruction. A wavelet algorithm has

perfect reconstruction when the inverse wavelet transform of the result of the wavelet transform

yields exactly the original data set:

IWT( WT( D ) ) = D

Limitations of the Haar Wavelet Transform

The Haar wavelet transform has a number of advantages:

It is conceptually simple.

It is fast.

It is memory efficient, since it can be calculated in place without a temporary array.

It is exactly reversible without the edge effects that are a problem with other wavelet

transforms.

The Haar transform also has limitations, which can be a problem for some applications.

In generating each set of averages for the next level and each set of coefficients, the Haar

transform performs an average and difference on a pair of values. Then the algorithm shifts over

by two values and calculates another average and difference on the next pair.

The high frequency coefficient spectrum should reflect all high frequency changes. The Haar

window is only two elements wide. If a big change takes place from an odd element to the following

even element, across a pair boundary, the change will not be reflected in the high frequency coefficients.
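A small sketch makes the limitation concrete (using the unnormalized average/difference form of the Haar step): a large jump that falls between two pairs produces no first-level difference coefficients at all.

```python
def haar_step(x):
    """One Haar level: pairwise averages (smooth) and differences (detail)."""
    avg = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    diff = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    return avg, diff

# The jump from 1 to 9 happens between pairs (1,1) and (9,9), so every
# first-level difference is zero -- the change is invisible at this level.
avg, diff = haar_step([1, 1, 1, 1, 9, 9, 9, 9])
assert diff == [0.0, 0.0, 0.0, 0.0]
```

(The jump does still show up in the averages, and hence at coarser levels.)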

If the wavelet literature is any guide, when mathematicians think about and write about

wavelet equations, they think about wavelets applied to an infinite sequence of data.


Many wavelet equations that have the property of perfect reconstruction for infinite data

sequences do not have this property for finite data sequences.

Since sound files, financial time series, images and other data sets to which wavelets are

applied are finite, this can be a problem. There are several methods proposed in the wavelet

literature for dealing with the edges of finite data sets. See, for example, the discussion of the

Daubechies D4 wavelet transform below. The simplest wavelet algorithm that shows perfect

reconstruction is the Haar wavelet algorithm. However, Haar wavelets can miss detail and do not

always represent change at all resolution scales.

The wavelet lifting scheme divides the wavelet transform into a set of steps. One of the

elegant qualities of wavelet algorithms expressed via the lifting scheme is the fact that the

inverse transform is a mirror of the forward transform. The simplest way to start thinking about

the lifting scheme is via a one step wavelet that I refer to as a "predict wavelet".

The wavelet Lifting Scheme is a method for decomposing wavelet transforms into a set of stages.

Lifting scheme algorithms have the advantage that they do not require temporary arrays in the

calculation steps, as is necessary for some versions of the Daubechies D4 wavelet algorithm.

The simplest version of a forward wavelet transform expressed in the Lifting Scheme is shown

below in Figure 1. The predict step is the subject of this web page, which will be considered in

isolation. The predict step calculates the wavelet function in the wavelet transform. This is a high

pass filter. The update step calculates the scaling function, which results in a smoother version of

the data. The complete lifting scheme is discussed on the Basic Lifting Scheme web page. This

web page is intended to provide background for this discussion.


Figure 2 shows the ur-Lifting Scheme transform, consisting of two steps:

1. Split step: divide the input data into odd and even elements. In a finite data set the odd

elements are moved to the second half of the array, leaving the even elements in the first

half.

2. Predict step: predict the odd elements from the even elements.

One way to view the predict step is through the lens of data compression. If our objective is to

compress a set of data and the odd elements can be absolutely predicted from the even elements

using the equation

odd = even * 2;


the odd elements can be replaced by zero. If we apply a compression algorithm like run length

encoding the odd elements will be reduced to a count and zero, compressing the original data set

by almost 50%. If the data set consists of points on a line, then it can be reduced to something

close to a single element and the length of the data set. In most cases the data set is more

complex and it cannot be entirely represented by a starting condition, a length and an equation.

However, a more compact representation might be arrived at by approximating the data in a local

region using a function. The predict stage replaces an odd element with the difference between

the odd element and a function calculated from the even elements. The simplest example of such a

predict stage takes a single even element as its argument to calculate the predicted value of the

odd element:

odd_{j+1,i} = odd_{j,i} − P(even_{j,k})

Here the function P() is the predict function. Wavelet algorithms are recursive, so the recursive

step j generates data for the next recursive step j+1. The subscript i indexes the odd part of the

array. The subscript k indexes the even part of the array.

One of the simplest predict functions simply uses the even element as the prediction:

P(even_{j,k}) = even_{j,k}

If the split step had not divided the odd and even elements, the predict step predicts that the odd

value is equal to its even predecessor.

a_{i+1} = a_{i};

The predict step replaces the odd elements with the difference between the actual odd value and

the predicted value:

odd_{j+1,i} = odd_{j,i} − even_{j,k}

If the data shows a trend (in the language of statistics the data shows autocorrelation),

then the odd element can be predicted from the even element, to some degree.


As a result, the difference between the odd element and its predictor (the even element)

will be smaller than the odd element itself. Smaller values can be represented in fewer bits, so

some level of compression can be achieved.

The process of "predicting" the odd elements from the even elements is recursive, as long

as the number of data elements is a power of two. After the first pass, the odd (upper) half of the

array will contain the differences between the prediction and the original odd element values.

The next recursive pass divides the lower half of the array into odd and even halves. The

difference between the prediction and the odd element value is stored in the new odd half. The

recursive passes continue until the last step where a single odd element is predicted from a single

even element. This is shown in Figure 3 below.

Figure 3

Another way to view this is as a chain of split/predict steps, where the even elements

from one step become the input for the next step.
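The chain of split/predict steps can be sketched recursively. This is a sketch of the predict-only wavelet discussed here, assuming the "even predecessor" predict function and a power-of-two length; the update step is deliberately omitted, as on this page.

```python
def predict_wavelet(data):
    """Forward predict-only transform: split into even/odd halves, replace
    each odd element with (odd - even predecessor), recurse on the evens."""
    a = list(data)
    n = len(a)
    while n > 1:
        half = n // 2
        evens = a[0:n:2]
        diffs = [a[2 * i + 1] - a[2 * i] for i in range(half)]  # odd - P(even)
        a[:half] = evens   # even elements stay in the first half
        a[half:n] = diffs  # differences move to the second half
        n = half
    return a

# On a linear ramp the differences are small and constant, which is exactly
# what makes the result easy to compress.
assert predict_wavelet([1, 2, 3, 4, 5, 6, 7, 8]) == [1, 4, 2, 2, 1, 1, 1, 1]
```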


The closer the predict function fits the data the smaller the differences will be. A

prediction function that closely fits the data will allow the result of the predict wavelet to be

represented in fewer bits, so there will be a higher level of compression.

The Daubechies D4 Wavelet Transform

The Daubechies wavelet transform is named after its inventor (or would it be

discoverer?), the mathematician Ingrid Daubechies. The Daubechies D4 transform has four

wavelet and scaling function coefficients. The scaling function coefficients are

h0 = (1 + √3) / (4√2), h1 = (3 + √3) / (4√2), h2 = (3 − √3) / (4√2), h3 = (1 − √3) / (4√2)


Each step of the wavelet transform applies the scaling function to the data input. If the

original data set has N values, the scaling function will be applied in the wavelet transform step

to calculate N/2 smoothed values. In the ordered wavelet transform the smoothed values are

stored in the lower half of the N element input vector.

The wavelet function coefficient values are:

g0 = h3, g1 = −h2, g2 = h1, g3 = −h0

Each step of the wavelet transform applies the wavelet function to the input data. If the

original data set has N values, the wavelet function will be applied to calculate N/2 differences

(reflecting change in the data). In the ordered wavelet transform the wavelet values are stored in

the upper half of the N element input vector.

The scaling and wavelet functions are calculated by taking the inner product of the

coefficients and four data values. The equations are shown below:

a_i = h0·s[2i] + h1·s[2i+1] + h2·s[2i+2] + h3·s[2i+3]

c_i = g0·s[2i] + g1·s[2i+1] + g2·s[2i+2] + g3·s[2i+3]


Each iteration in the wavelet transform step calculates a scaling function value and a

wavelet function value. The index i is incremented by two with each iteration, and new scaling

and wavelet function values are calculated. This pattern is discussed on the web page A Linear

Algebra View of the Wavelet Transform.

In the case of the forward transform, with a finite data set (as opposed to the

mathematician's imaginary infinite data set), i will be incremented until it is equal to N-2. In the

last iteration the inner product will be calculated from s[N-2], s[N-1], s[N] and

s[N+1]. Since s[N] and s[N+1] don't exist (they are beyond the end of the array), this presents a

problem. This is shown in the transform matrix below.

Daubechies D4 forward transform matrix for an 8 element signal


Note that this problem does not exist for the Haar wavelet, since it is calculated on only

two elements, s[i] and s[i+1].

A similar problem exists in the case of the inverse transform. Here the inverse transform

coefficients extend beyond the beginning of the data, where the first two inverse values are

calculated from s[-2], s[-1], s[0] and s[1]. This is shown in the inverse transform matrix below.

Daubechies D4 inverse transform matrix for an 8 element transform result

Three methods for handling the edge problem:

1. Treat the data set as if it is periodic. The beginning of the data sequence repeats following

the end of the sequence (in the case of the forward transform) and the end of the data

wraps around to the beginning (in the case of the inverse transform).

2. Treat the data set as if it is mirrored at the ends. This means that the data is reflected from

each end, as if a mirror were held up to each end of the data sequence.

3. Gram-Schmidt orthogonalization. Gram-Schmidt orthogonalization calculates special

scaling and wavelet functions that are applied at the start and end of the data set.

Zeros can also be used to fill in for the missing elements, but this can introduce significant

error.


The Daubechies D4 algorithm published here treats the data as if it were periodic. The code

for one step of the forward transform is shown below. Note that in the calculation of the last two

values, the start of the data wraps around to the end and elements a[0] and a[1] are used in the

inner product.
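Since the code itself did not survive in this copy, here is a sketch of one forward D4 step in Python, treating the data as periodic so that the last two inner products wrap around to a[0] and a[1], with smoothed values in the lower half and differences in the upper half:

```python
import math

s3 = math.sqrt(3.0)
d = 4 * math.sqrt(2.0)
h = [(1 + s3) / d, (3 + s3) / d, (3 - s3) / d, (1 - s3) / d]  # scaling coeffs
g = [h[3], -h[2], h[1], -h[0]]                                # wavelet coeffs

def d4_forward_step(a):
    """One Daubechies D4 step with periodic (wraparound) edge handling."""
    n = len(a)
    half = n // 2
    out = [0.0] * n
    for i in range(half):
        idx = [(2 * i + k) % n for k in range(4)]  # wraps to a[0], a[1] at the end
        out[i] = sum(h[k] * a[j] for k, j in enumerate(idx))         # smooth
        out[half + i] = sum(g[k] * a[j] for k, j in enumerate(idx))  # detail
    return out

# A constant signal produces zero differences in the upper half.
out = d4_forward_step([1.0] * 8)
assert all(abs(c) < 1e-12 for c in out[4:])
```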

SPIHT

Set Partitioning in Hierarchical Trees (SPIHT) is a powerful wavelet-based image

compression method. This award-winning method has received worldwide acclaim and

attention since its introduction in 1995. Thousands of people, researchers and consumers alike,

have now tested and used SPIHT. It has become the benchmark state-of-the-art algorithm for

image compression.

The SPIHT method is not a simple extension of traditional methods for image

compression, and represents an important advance in the field. The method deserves special

attention because it provides the following:

Highest Image Quality

Progressive image transmission

Fully embedded coded file

Simple quantization algorithm

Fast coding/decoding

Completely adaptive

Lossless compression

Exact bit rate coding

Error protection


Each of these properties is discussed below. Note that different compression methods

were developed specifically to achieve at least one of those objectives. What makes SPIHT

really outstanding is that it yields all those qualities simultaneously. So, if in the future you find

one method that claims to be superior to SPIHT in one evaluation parameter (like PSNR),

remember to see who wins in the remaining criteria.

Image Quality

Extensive research has shown that the images obtained with wavelet-based methods yield

very good visual quality. At first it was shown that even simple coding methods produced good

results when combined with wavelets, and this is the basis for the recent JPEG2000 standard.

However, SPIHT belongs to the next generation of wavelet encoders, employing more

sophisticated coding. In fact, SPIHT exploits the properties of the wavelet-transformed images to

increase its efficiency.

Many researchers now believe that encoders that use wavelets are superior to those that

use DCT or fractals. We will not discuss the matter of taste in the evaluation of low quality

images, but we do want to say that SPIHT wins in the test of finding the minimum rate required

to obtain a reproduction indistinguishable from the original. The SPIHT advantage is even more

pronounced in encoding color images, because the bits are allocated automatically for local

optimality among the color components, unlike other algorithms that encode the color

components separately based on global statistics of the individual components.

If, after what we said, you are still not certain that you should believe us (because in the

past you heard claims like that and then were deeply disappointed), we understand your point of

view.


Progressive Image Transmission

In some systems with progressive image transmission (like WWW browsers) the quality

of the displayed images follows the sequence: (a) weird abstract art; (b) you begin to believe that

it is an image of something; (c) CGA-like quality; (d) lossless recovery. With very fast links the

transition from (a) to (d) can be so fast that you will never notice. With slow links (how "slow"

depends on the image size, colors, etc.) the time from one stage to the next grows exponentially,

and it may take hours to download a large image. Considering that it may be possible to recover

an excellent-quality image using 10-20 times fewer bits, it is easy to see the inefficiency.

Furthermore, the mentioned systems are not efficient even for lossless transmission.

The problem is that such widely used schemes employ a very primitive progressive

image transmission method. On the other extreme, SPIHT is a state-of-the-art method that was

designed for optimal progressive transmission (and still beats most non-progressive methods!). It

does so by producing a fully embedded coded file (see below), in a manner that at any moment

the quality of the displayed image is the best available for the number of bits received up to that

moment.

So, SPIHT can be very useful for applications where the user can quickly inspect the

image and decide if it should really be downloaded, or is good enough to be saved, or needs

refinement.

Optimized Embedded Coding

A strict definition of the embedded coding scheme is: if two files produced by the

encoder have size M and N bits, with M > N, then the file with size N is identical to the first N

bits of the file with size M.

Let's see how this abstract definition is used in practice. Suppose you need to compress

an image for three remote users.


Each one has different needs of image reproduction quality, and you find that those

qualities can be obtained with the image compressed to at least 8 Kb, 30 Kb, and 80 Kb,

respectively. If you use a non-embedded encoder (like JPEG), to save in transmission costs (or

time) you must prepare one file for each user. On the other hand, if you use an embedded

encoder (like SPIHT) then you can compress the image to a single 80 Kb file, and then send the

first 8 Kb of the file to the first user, the first 30 Kb to the second user, and the whole file to the

third user.
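The prefix property makes the three-user scenario trivial to implement. A toy sketch, where the file contents and the 8/30/80 Kb sizes are stand-ins from the example above:

```python
# With an embedded code, any lower-rate file is simply a prefix of the
# full compressed file, so one file serves every target rate.
full_file = bytes(80 * 1024)  # stand-in for an 80 Kb SPIHT file

user_files = [
    full_file[:8 * 1024],    # user 1: lowest quality
    full_file[:30 * 1024],   # user 2: medium quality
    full_file,               # user 3: the whole file
]

assert all(full_file.startswith(f) for f in user_files)
```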

But what is the price to pay for this "amenity"? Surprisingly, with SPIHT all three users

would get (for the same file size) an image quality comparable or superior to the most

sophisticated non-embedded encoders available today. SPIHT achieves this feat by optimizing

the embedded coding process and always coding the most important information first.

An even more important application is for progressive image transmission, where the

user can decide at which point the image quality satisfies his needs, or abort the transmission

after a quick inspection, etc.

Compression Algorithm

Various papers describing SPIHT are available via anonymous ftp. Here we take the

opportunity to comment on how it differs from other approaches. The following is a

comparison of image quality and artifacts at high compression ratios versus JPEG.

SPIHT represents a small "revolution" in image compression because it broke the trend to

more complex (in both the theoretical and the computational senses) compression schemes.

While researchers had been trying to improve previous schemes for image coding using

very sophisticated vector quantization, SPIHT achieved superior results using the simplest

method: uniform scalar quantization. Thus, it is much easier to design fast SPIHT codecs.


Encoding/Decoding Speed

The SPIHT process represents a very effective form of entropy-coding. This is shown by

the demo programs using two forms of coding: binary-uncoded (extremely simple) and context-

based adaptive arithmetic coded (sophisticated). Surprisingly, the difference in compression is

small, showing that it is not necessary to use slow methods (and also pay royalties for them!). A

fast version using Huffman codes was also successfully tested, but it is not publicly available.

A straightforward consequence of the compression simplicity is the greater

coding/decoding speed. The SPIHT algorithm is nearly symmetric, i.e., the time to encode is

nearly equal to the time to decode. (Complex compression algorithms tend to have encoding

times much larger than the decoding times.)

Some of our demo programs use floating-point operations extensively, and can be slower

in some CPUs (floating points are better when people want to test your programs with strange 16

bpp images). However, this problem can be easily solved: try the lossless version to see an

example. Similarly, the use for progressive transmission requires a somewhat more complex and

slower algorithm. Some shortcuts can be used if progressive transmission is not necessary.

When measuring speed please remember that these demo programs were written for

academic studies only, and were not fully optimized as are the commercial versions.

Applications

SPIHT exploits properties that are present in a wide variety of images. It has been

successfully tested in natural (portraits, landscape, weddings, etc.) and medical (X-ray, CT, etc)

images. Furthermore, its embedded coding process proved to be effective in a broad range of

reconstruction qualities. For instance, it can code fair-quality portraits and high-quality medical

images equally well (as compared with other methods in the same conditions).

SPIHT has also been tested for some less usual purposes, like the compression of

elevation maps, scientific data, and others.


Lossless Compression

SPIHT codes the individual bits of the image wavelet transform coefficients following a

bit-plane sequence. Thus, it is capable of recovering the image perfectly (every single bit of it)

by coding all bits of the transform. However, the wavelet transform yields perfect reconstruction

only if its numbers are stored as infinite-precision numbers. In practice it is frequently possible to

recover the image perfectly using rounding after recovery, but this is not the most efficient

approach.
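The bit-plane sequencing can be sketched for integer coefficients: magnitudes are sent most-significant plane first, so truncating the stream simply coarsens the reconstruction, while sending every plane recovers the values exactly. (This sketch omits SPIHT's sorting pass entirely.)

```python
def bitplanes(coeffs, nbits=8):
    """Magnitude bits, one plane per entry, most significant plane first."""
    return [[(abs(c) >> b) & 1 for c in coeffs]
            for b in range(nbits - 1, -1, -1)]

def reconstruct(planes, signs, nbits=8):
    """Rebuild values from however many planes were received."""
    mags = [0] * len(signs)
    for k, plane in enumerate(planes):
        b = nbits - 1 - k
        for i, bit in enumerate(plane):
            mags[i] |= bit << b
    return [s * m for s, m in zip(signs, mags)]

coeffs = [100, -37, 5, 0]
signs = [1 if c >= 0 else -1 for c in coeffs]
planes = bitplanes(coeffs)

assert reconstruct(planes, signs) == coeffs               # all planes: lossless
assert reconstruct(planes[:4], signs) == [96, -32, 0, 0]  # truncated: coarser
```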

For lossless compression we proposed an integer multiresolution transformation, similar

to the wavelet transform, which we called S+P transform. It solves the finite-precision problem

by carefully truncating the transform coefficients during the transformation (instead of after).
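The reversible integer step underlying S+P-style transforms can be sketched as follows (a sketch of the "S" average/difference step only, not the full S+P predictor): the truncation happens inside the transform, yet the pair of outputs still determines the inputs exactly.

```python
def s_forward(a, b):
    """Integer average (truncated inside the transform) and difference."""
    low = (a + b) >> 1   # floor((a + b) / 2) -- the truncation step
    high = a - b
    return low, high

def s_inverse(low, high):
    """Exact inverse despite the truncation: a and b share the lost bit."""
    a = low + ((high + 1) >> 1)
    return a, a - high

# Round-trips exactly for every integer pair, including negatives.
for a in range(-4, 5):
    for b in range(-4, 5):
        assert s_inverse(*s_forward(a, b)) == (a, b)
```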

A codec that uses this transformation to yield efficient progressive transmission up to

lossless recovery is among the SPIHT demo programs. A surprising result obtained with this

codec is that for lossless compression it is as efficient as the most effective lossless encoders

(lossless JPEG is definitely not among them). In other words, the property that SPIHT yields

progressive transmission with practically no penalty in compression efficiency applies to lossless

compression too. Below are examples of lossless and lossy (200:1) images decoded from the

same file.

Rate or Distortion Specification

Almost all image compression methods developed so far do not have precise rate control.

For some methods you specify a target rate, and the program tries to give something that is not

too far from what you wanted. For others you specify a "quality factor" and wait to see if the size

of the file fits your needs. (If not, just keep trying...)

The embedded coding property of SPIHT allows exact bit rate control, without any

penalty in performance (no bits wasted with padding, or whatever).


The same property also allows exact mean squared-error (MSE) distortion control. Even

though the MSE is not the best measure of image quality, it is far superior to other criteria used

for quality specification.

Error Protection

Errors in the compressed file cause havoc for practically all important image compression

methods. This is not exactly related to variable length entropy-coding, but to the necessity of

using context generation for efficient compression. For instance, Huffman codes have the ability

to quickly recover after an error. However, if they are used to code run-lengths, then that property is

useless because all runs after an error would be shifted.

SPIHT is no exception to this rule. One difference, however, is that due to SPIHT's

embedded coding property, it is much easier to design efficient error-resilient schemes.

This happens because with embedded coding the information is sorted according to its

importance, and the requirement for powerful error correction codes decreases from the

beginning to the end of the compressed file. If an error is detected, but not corrected, the decoder

can discard the data after that point and still display the image obtained with the bits received

before the error. Also, with bit-plane coding the error effects are limited to below the previously

coded planes.

Another reason is that SPIHT generates two types of data. The first is sorting

information, which needs error protection as explained above. The second consists of

uncompressed sign and refinement bits, which do not need special protection because they affect

only one pixel.

While SPIHT can yield gains like 3 dB PSNR over methods like JPEG, its use in noisy

channels, combined with error protection as explained above, leads to much larger gains, like 6-

12 dB. (Such high coding gains are frequently viewed with skepticism, but they do make sense

for combined source-channel coding schemes.)


SPIHT Optimized Algorithm

The following is the suite of application-specific SPIHT compression products:

D-SPIHT (Dynamic)

The D-SPIHT software is capable of the most efficient compression of monochrome, 1

and 2 byte per pel, and color images.  It has the features of specifying bit rate or quality at

encoding time.  For bit rate specification, the compressed file is completely and finely rate-

embedded.  Rate-embedded means that any lower rate D-SPIHT file is a subset of the full file

and can be decoded to the given smaller rate.  The decoding has the special feature of producing

smaller resolution image reconstructions (such as thumbnails) directly from the compressed file

without offline processing.

S-SPIHT  (Striped)

The S-SPIHT software shares many features of D-SPIHT, but it works with a small

memory.  The small memory leads to a slight degradation in performance compared to D-

SPIHT.   Therefore, this software is ideal for compression of very large images, such as those

from GIS and remote sensing systems.  It is both efficient and fast.  A good strategy for

compression of images of any size is to use a combination of D-SPIHT and S-SPIHT, with S-SPIHT taking over from D-SPIHT when an image dimension exceeds 1024 or 2048.

T-SPIHT  (Tiled)

T-SPIHT is a specially designed form of SPIHT that compresses large images in tiles. 

Remote sensing, geographical, and meteorological data are often handled in tiles rather than all

at once.  Therefore, systems for processing such data would prefer to use a compression by tiles. 

T-SPIHT can efficiently compress, with constant quality and without tile boundary artifacts,

large image or data sets, whether in tile format or not.  In fact, the compression is significantly

more efficient than JPEG2000 in its tiled compression mode.    

 


P-SPIHT  (Photo-ID)

P-SPIHT is especially designed to efficiently compress small grayscale and color

images.  It is more efficient in this application than JPEG2000 or any other competing software.  

It is ideal for storage of photo IDs on plastic cards with magnetic stripes.

PROGRES  (Progressive Resolution Decompression)

PROGRES is a progressive resolution, extremely fast version of SPIHT that has full

capability of random access decoding.  It chooses a quality factor for lossy compression of any

size image and can decompress a given arbitrary region to any resolution very quickly.  It comes

either in a command line version for WINDOWS/DOS or UNIX (or Linux) or with additional

region-of-interest (ROI) encoding and decoding in a UNIX (or Linux) GUI version.   It is an

excellent choice for remote sensing and GIS applications, where rapid browsing of large images

is necessary.

PROGCODE (Lossless & Progressive)

PROGCODE is our pioneering progressive lossy to purely lossless (reversible) grey scale

compression algorithm.  It sets the standard for lossless compression.   For larger images, the

compression is still efficient, but the larger memory utilization slows down execution. This

algorithm can efficiently compress medical images either lossless or with any desired rate. 

Moreover, the lossless file can act as an archive in a file server, from within which any desired

lossy lower rate file can be extracted and decompressed directly without offline processing.   This

software is also ideal for multi-user storage or transmission systems, where users with different

capabilities and requirements can receive the image to their desired accuracy from a single file or

transmission.

SPM   (Fastest Lossless!)

SPM software is especially designed for lossless and lossy compression of medical

images.  The method is not based on SPIHT, but on our patented quadrature splitting algorithm

called AGP (for Amplitude and Group Partitioning). 


SPM calls for a quality factor, with 0 giving perfectly lossless compression.  Because the

intended application is medical images, the allowed lossy compression adheres to degrees of

high quality.  The outstanding characteristics of SPM, besides efficiency, are very fast

compression and decompression, regardless of image size.    It comes either in a command line

version for WINDOWS/DOS or UNIX (or Linux) or in a WINDOWS GUI version.  All versions

allow reduced resolution decoding from a single compressed file.

IMAGE FUSION

Multi-sensor image fusion is the process of combining relevant information from two or

more images into a single image. The resulting image will be more informative than any of the

input images.

In remote sensing applications, the increasing availability of space borne sensors gives a

motivation for different image fusion algorithms. Several situations in image processing require

high spatial and high spectral resolution in a single image. Most of the available equipment is not

capable of providing such data convincingly. The image fusion techniques allow the integration

of different information sources. The fused image can have complementary spatial and spectral

resolution characteristics. However, the standard image fusion techniques can distort the spectral

information of the multispectral data while merging.

In satellite imaging, two types of images are available. The panchromatic image acquired

by satellites is transmitted with the maximum resolution available and the multispectral data are

transmitted with coarser resolution. This will usually be two or four times lower. At the receiver

station, the panchromatic image is merged with the multispectral data to convey more

information.

Many methods exist to perform image fusion. The very basic one is the high pass filtering

technique. Later techniques are based on DWT, uniform rational filter bank, and laplacian

pyramid.


Why Image Fusion

Multi-sensor data fusion has become a discipline for which more and more general formal solutions to a number of application cases are demanded. Several situations in image processing

simultaneously require high spatial and high spectral information in a single image. This is

important in remote sensing. However, the instruments are not capable of providing such

information either by design or because of observational constraints. One possible solution for this is data fusion.

Standard Image Fusion Methods

Image fusion methods can be broadly classified into two groups: spatial domain fusion and transform domain fusion.

The fusion methods such as averaging, Brovey method, principal component analysis

(PCA) and IHS based methods fall under spatial domain approaches. Another important spatial

domain fusion method is the high pass filtering based technique. Here the high frequency details

are injected into the up sampled version of the multispectral (MS) images. The disadvantage of spatial domain approaches is that they produce spatial distortion in the fused image. Spectral distortion becomes a negative factor in further processing, such as classification. Spatial

distortion can be very well handled by transform domain approaches on image fusion. The multi

resolution analysis has become a very useful tool for analyzing remote sensing images. The

discrete wavelet transform has become a very useful tool for fusion. Some other fusion methods

also exist, such as Laplacian pyramid based, curvelet transform based, etc. These methods

show a better performance in spatial and spectral quality of the fused image compared to other

spatial methods of fusion.

The images used in image fusion should already be registered. Misregistration is a major

source of error in image fusion. Some well-known image fusion methods are:

High pass filtering technique

IHS transform based image fusion

PCA based image fusion


Wavelet transform image fusion

Pair-wise spatial frequency matching

Applications

1. Image Classification

2. Aerial and Satellite imaging

3. Medical imaging

4. Robot vision

5. Concealed weapon detection

6. Multi-focus image fusion

7. Digital camera application

8. Battle field monitoring

Satellite Image Fusion

Several methods exist for merging satellite images. In satellite imagery we can have two types of images:

Panchromatic images - An image collected in the broad visual wavelength range but

rendered in black and white.

Multispectral images - Images optically acquired in more than one spectral or

wavelength interval. Each individual image is usually of the same physical area and scale

but of a different spectral band.

The SPOT PAN satellite provides high resolution (10 m pixel) panchromatic data, while the LANDSAT TM satellite provides low resolution (30 m pixel) multispectral images. Image fusion

attempts to merge these images and produce a single high resolution multispectral image.

The standard merging methods of image fusion are based on Red-Green-Blue (RGB) to

Intensity-Hue-Saturation (IHS) transformation. The usual steps involved in satellite image fusion

are as follows:


1. Resize the low resolution multispectral images to the same size as the panchromatic

image.

2. Transform the R,G and B bands of the multispectral image into IHS components.

3. Modify the panchromatic image with respect to the multispectral image. This is usually performed by histogram matching of the panchromatic image, with the intensity component of the multispectral images as reference.

4. Replace the intensity component by the panchromatic image and perform inverse

transformation to obtain a high resolution multispectral image.
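The intensity replacement in step 4 can be illustrated with a simplified additive variant of IHS pan-sharpening (the so-called "fast IHS" shortcut); a full implementation would use the complete RGB-to-IHS transform and the histogram matching of step 3. This sketch operates on a single pixel triple:

```python
def ihs_pansharpen(r, g, b, pan):
    """Simplified additive IHS pan-sharpening for one pixel:
    treat I = (R + G + B) / 3 as the intensity component and
    replace it with the panchromatic value by adding the
    difference (pan - I) to each band."""
    i = (r + g + b) / 3.0       # intensity component of the MS pixel
    d = pan - i                 # correction that swaps I for pan
    return r + d, g + d, b + d  # sharpened R, G, B
```

After the substitution the mean of the three output bands equals the panchromatic value, which is exactly the "replace the intensity component" step described above.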


Medical Image Fusion

Image fusion has become a common term used within medical diagnostics and treatment.

The term is used when multiple patient images are registered and overlaid or merged to provide

additional information. Fused images may be created from multiple images from the same

imaging modality, or by combining information from multiple modalities, such as magnetic

resonance image (MRI), computed tomography (CT), positron emission tomography (PET), and

single photon emission computed tomography (SPECT). In radiology and radiation oncology,

these images serve different purposes. For example, CT images are used more often to ascertain

differences in tissue density while MRI images are typically used to diagnose brain tumors.

For accurate diagnoses, radiologists must integrate information from multiple image

formats. Fused, anatomically-consistent images are especially beneficial in diagnosing and

treating cancer. Companies such as Nicesoft, Velocity Medical Solutions, Mirada Medical,

Keosys, MIMvista, IKOE, and BrainLAB have recently created image fusion software for both

improved diagnostic reading, and for use in conjunction with radiation treatment planning

systems. With the advent of these new technologies, radiation oncologists can take full

advantage of intensity modulated radiation therapy (IMRT). Being able to overlay diagnostic

images onto radiation planning images results in more accurate IMRT target tumor volumes.


RESTORATION

Image restoration refers to the recovery of an original signal from degraded

observations. Image restoration is different from image enhancement in that the latter is designed

to emphasize features of the image that make the image more pleasing to the observer, but not

necessarily to produce realistic data from a scientific point of view. Image enhancement

techniques (like contrast stretching or de-blurring by a nearest neighbor procedure) provided by

"Imaging packages" use no a priori model of the process that created the image.

With image enhancement, noise can effectively be removed by sacrificing some resolution, but this is not acceptable in many applications. In a fluorescence microscope, resolution in the z-direction is poor as it is. More advanced image processing techniques must be applied to recover

the object.

Deconvolution is an example of an image restoration method. It is capable of:

Increasing resolution especially in the axial direction

Removing noise

Increasing contrast

Since axial imaging performance is the prime reason for researchers to invest in expensive

optical equipment like confocal or two-photon excitation microscopes, the capability of

increasing axial resolution with a 'mere' software technique has considerable value.

In image restoration the information provided by the microscope is only taken as indirect

evidence about the object. By itself the image need not even be viewable.

A microscopic image contains more information than is readily visible in the image.

Often details are hidden in the noise or masked by other features.

Artifacts may confuse the viewer.

Information may be present in implicit form, so it can only be retrieved with the addition

of a priori knowledge.


INTERPOLATION

In the mathematical field of numerical analysis, interpolation is a method of

constructing new data points within the range of a discrete set of known data points.

In engineering and science, one often has a number of data points, obtained by sampling

or experimentation, which represent the values of a function for a limited number of values of

the independent variable. It is often required to interpolate (i.e. estimate) the value of that

function for an intermediate value of the independent variable. This may be achieved by curve

fitting or regression analysis.

A different problem which is closely related to interpolation is the approximation of a

complicated function by a simple function. Suppose we know the formula for the function but it

is too complex to evaluate efficiently. Then we could pick a few known data points from the

complicated function, creating a lookup table, and try to interpolate those data points by

constructing a simpler function. Of course, when using the simple function to estimate new data

points we usually do not receive the same result as we would if we had used the original

function, but depending on the problem domain and the interpolation method used the gain in

simplicity might offset the error.
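The lookup-table idea above can be sketched in one function, using piecewise-linear interpolation between the stored samples:

```python
def lerp_table(xs, ys, x):
    """Piecewise-linear interpolation over a lookup table.
    xs: ascending sample positions; ys: function values at xs;
    x: query point inside the table range."""
    for k in range(len(xs) - 1):
        if xs[k] <= x <= xs[k + 1]:
            t = (x - xs[k]) / (xs[k + 1] - xs[k])  # fraction along segment
            return ys[k] + t * (ys[k + 1] - ys[k])
    raise ValueError("x outside table range")
```

For a table of a complicated function, each query then costs a handful of arithmetic operations instead of a full evaluation, at the price of the interpolation error discussed above.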

There is also another very different kind of interpolation in mathematics, namely the

"interpolation of operators".

A Lifting Scheme Version of the Daubechies (D4) Transform

Lifting scheme wavelet transforms are composed of update and predict steps. In this case a

normalization step has been added as well.


Fig. Two-stage Daubechies D4 forward lifting wavelet transform (LWT)

The split step divides the input data into even elements, which are stored in the first half of an N-element array section (S[0] to S[half-1]), and odd elements, which are stored in the second half of the section (S[half] to S[N-1]). In the forward transform equations below, the expression S[half+n] references an odd element and S[n] references an even element. LL represents the low frequency components and LH, HL, HH are the high frequency components in the horizontal, vertical and diagonal directions.

The forward transform step equations are:

Update 1 (U1):
for n = 0 to half-1
    S[n] = S[n] + √3 · S[half + n]

Predict (P1):
S[half] = S[half] - (√3/4) · S[0] - ((√3 - 2)/4) · S[half - 1]
for n = 1 to half-1
    S[half + n] = S[half + n] - (√3/4) · S[n] - ((√3 - 2)/4) · S[n - 1]

Update 2 (U2):
for n = 0 to half-2
    S[n] = S[n] - S[half + n + 1]
S[half - 1] = S[half - 1] - S[half]

Normalize (N):
for n = 0 to half-1
    S[n] = ((√3 - 1)/√2) · S[n]
    S[half + n] = ((√3 + 1)/√2) · S[half + n]

The inverse transform is a mirror of the forward transform, in which the addition and subtraction operations are interchanged.
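The lifting steps and their mirrored inverse can be sketched in pure Python. This is an illustrative sketch: the loop bounds follow the text above, and the boundary (wrap-around) cases follow the standard D4 lifting formulation; the split step is assumed to have been done already.

```python
import math

SQRT3 = math.sqrt(3.0)
SQRT2 = math.sqrt(2.0)

def d4_forward(s):
    """One level of the D4 forward lifting transform, in place.
    s: even-length list; first half = even samples, second half = odd."""
    half = len(s) // 2
    for n in range(half):                      # Update 1
        s[n] += SQRT3 * s[half + n]
    s[half] -= (SQRT3 / 4.0) * s[0] + ((SQRT3 - 2.0) / 4.0) * s[half - 1]
    for n in range(1, half):                   # Predict
        s[half + n] -= (SQRT3 / 4.0) * s[n] + ((SQRT3 - 2.0) / 4.0) * s[n - 1]
    for n in range(half - 1):                  # Update 2
        s[n] -= s[half + n + 1]
    s[half - 1] -= s[half]
    for n in range(half):                      # Normalize
        s[n] *= (SQRT3 - 1.0) / SQRT2
        s[half + n] *= (SQRT3 + 1.0) / SQRT2
    return s

def d4_inverse(s):
    """Mirror of d4_forward: same steps in reverse order, with the
    addition and subtraction operations interchanged."""
    half = len(s) // 2
    for n in range(half):                      # Denormalize
        s[n] /= (SQRT3 - 1.0) / SQRT2
        s[half + n] /= (SQRT3 + 1.0) / SQRT2
    s[half - 1] += s[half]                     # Undo Update 2
    for n in range(half - 1):
        s[n] += s[half + n + 1]
    for n in range(1, half):                   # Undo Predict
        s[half + n] += (SQRT3 / 4.0) * s[n] + ((SQRT3 - 2.0) / 4.0) * s[n - 1]
    s[half] += (SQRT3 / 4.0) * s[0] + ((SQRT3 - 2.0) / 4.0) * s[half - 1]
    for n in range(half):                      # Undo Update 1
        s[n] -= SQRT3 * s[half + n]
    return s
```

Note the in-place computation: the lifting coefficients replace the samples in their own memory locations, which is exactly the memory advantage of the lifting scheme mentioned earlier.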

MATHEMATICAL FORMULATION FOR THE SUPER RESOLUTION MODEL

We give the mathematical model for super resolution image reconstruction from a set of Low Resolution (LR) images. LR means that the pixel density within an image is low, therefore offering fewer details. The CCD discretizes the images and produces a digitized noisy image. Imaging systems do not sample the scene according to the Nyquist criterion; as a result, the high frequency content of the images is destroyed and they appear as LR. Let the low resolution sensor plane be M1 by M2. The low resolution intensity values are denoted {y(i, j)}, where i = 0...M1-1 and j = 0...M2-1. If the down sampling parameters are q1 and q2 in the horizontal and vertical directions, then the high resolution image will be of size q1M1 × q2M2.

We assume that q1 = q2 = q, and therefore the desired high resolution image Z will have intensity values {z(k, l)}, where k = 0...qM1-1 and l = 0...qM2-1. Given {z(k, l)}, the process of obtaining the down sampled LR aliased image {y(i, j)} is

y(i, j) = (1/q²) Σ (k = qi ... q(i+1)-1) Σ (l = qj ... q(j+1)-1) z(k, l)          (1)

i.e. the low resolution intensity is the average of the high resolution intensities over a neighborhood of q² pixels. We formally state the problem by casting it in a low resolution


restoration framework. There are P observed images {y_m}, m = 1...P, each of size M1 × M2, which are decimated, blurred and noisy versions of a single high resolution image Z of size N1 × N2, where N1 = qM1 and N2 = qM2. After incorporating the blur matrix and noise vector, the image formation model is written as

y_m = D H z + η_m,   m = 1...P          (2)

where D is the decimation matrix of size M1M2 × q²M1M2, H is the PSF (blur) matrix of size q²M1M2 × q²M1M2, η_m is an M1M2 × 1 noise vector, and P is the number of low resolution observations. Stacking the P vector equations from the different low resolution images gives a single matrix-vector equation.
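The block-averaging relation between the high resolution image and a low resolution observation (each LR pixel is the mean of a q × q neighborhood of HR pixels, as stated above) can be sketched as:

```python
def downsample(z, q):
    """Down sample a high-resolution image z (list of rows) by
    averaging each q x q block into one low-resolution pixel."""
    m1, m2 = len(z) // q, len(z[0]) // q
    y = [[0.0] * m2 for _ in range(m1)]
    for i in range(m1):
        for j in range(m2):
            total = 0.0
            for k in range(q * i, q * (i + 1)):      # rows of the block
                for l in range(q * j, q * (j + 1)):  # columns of the block
                    total += z[k][l]
            y[i][j] = total / (q * q)                # mean over q*q pixels
    return y
```

Super resolution reconstruction is the inverse problem: estimating z given several such y images that are additionally blurred, noisy and misregistered.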

SPIHT ALGORITHM

Set Partitioning in Hierarchical Trees (SPIHT) is an image compression algorithm that exploits the inherent similarities across the sub bands in a wavelet decomposition of an image. The algorithm codes the most important wavelet transform coefficients first, and transmits the bits so that an increasingly refined copy of the original image can be obtained progressively. SPIHT consists of two passes, the ordering pass and the refinement pass. In the ordering pass SPIHT attempts to order the coefficients according to their magnitude. In the refinement pass the quantization of coefficients is refined. The ordering and refining are made relative to a threshold. The threshold is appropriately initialized and then continuously made smaller with each round of the algorithm. SPIHT maintains three lists of coordinates of coefficients in the decomposition. These are the List of Insignificant Pixels (LIP), the List of Significant Pixels (LSP) and the List of Insignificant Sets (LIS).


To decide if a coefficient is significant or not, SPIHT uses the following definition: a coefficient is deemed significant at a certain threshold if its magnitude is larger than or equal to the threshold. Using the notion of significance, the LIP, LIS and LSP can be explained. The LIP contains coordinates of coefficients that are insignificant at the current threshold; the LSP contains the coordinates of coefficients that are significant at the same threshold. The LIS contains coordinates of the roots of the spatial parent-children trees.
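The significance test, and the way the threshold starts at the largest power of two not exceeding the maximum magnitude and is halved each round, can be sketched as follows. This is a sorting-only illustration, not the full SPIHT codec with its LIP/LIS/LSP set-partitioning bookkeeping:

```python
def significant(c, threshold):
    """SPIHT significance test: |c| >= threshold."""
    return abs(c) >= threshold

def bitplane_passes(coeffs):
    """Illustrative sketch: record, for each coefficient index, the
    bit-plane n (threshold 2**n) at which it first becomes significant
    as the threshold is halved round by round."""
    cmax = max(abs(c) for c in coeffs)
    n = 0
    while (1 << (n + 1)) <= cmax:   # largest n with 2**n <= cmax
        n += 1
    first_significant = {}
    while n >= 0:
        t = 1 << n                   # current threshold 2**n
        for i, c in enumerate(coeffs):
            if i not in first_significant and significant(c, t):
                first_significant[i] = n
        n -= 1                       # halve the threshold each round
    return first_significant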

PROPOSED SUPER RESOLUTION RECONSTRUCTION

Images are obtained in areas ranging from everyday photography to astronomy, remote sensing, medical imaging and microscopy. In each case there is an underlying object or scene we wish to observe; the original or true image is the ideal representation of the observed scene. Yet the observation process is never perfect: there is uncertainty in the measurement, occurring as blur, noise and other degradations in the recorded images. The low resolution observation model of eq. (2) is considered. In our proposed approach we suggest a general framework for super resolution reconstruction using lifting wavelet transforms. The proposed algorithm is based on breaking the problem into three consecutive steps to work on large dimension images. These steps are registration, lifting scheme based fusion, and restoration.

A. Algorithm for Super Resolution Reconstruction using lifting scheme and SPIHT.

Step 1: Three input low resolution blurred, noisy, under sampled, rotated, shifted and

compressed images are considered.

Step 2: The images are first preprocessed, i.e. registered using an FFT based algorithm, as proposed in [11].

Step 3: The registered low resolution images are decomposed using lifting schemes to a specified number of levels. At each level we will have one approximation sub band, i.e. LL, and 3 detail sub bands, i.e. the LH, HL, HH coefficients.


Step 4: Each low resolution image is encoded using SPIHT.

Step 5: The decomposed images are fused using the fusion rule, i.e. maximum frequency fusion: "fusion by averaging for each band of decomposition and for each channel the wavelet coefficients of the three images are averaged". That is, approximation and detail coefficients are fused separately.

Step 6: Inverse lifting scheme is applied to obtain the fused image.

Step 7: The fused image is decoded using DSPIHT.

Step 8: Restoration is performed in order to remove the blur and noise present in the image.

Step 9: Most of the additive noise will be eliminated during the fusion process; to achieve further noise reduction, soft thresholding is applied, whereas the image is deblurred using the Iterative Blind Deconvolution (IBD) algorithm.

Step 10: Finally, a super resolution image is obtained by wavelet based interpolation.
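Two of the steps above are simple enough to sketch directly: the coefficient-averaging fusion rule of Step 5 and the soft thresholding of Step 9. The surrounding registration, SPIHT coding and blind deconvolution stages are omitted from this sketch:

```python
def fuse_by_averaging(*coeff_sets):
    """Step 5 fusion rule: for each subband position, average the
    corresponding wavelet coefficients of the input images.
    Each argument is one image's coefficient list (same length)."""
    k = len(coeff_sets)
    return [sum(vals) / k for vals in zip(*coeff_sets)]

def soft_threshold(c, t):
    """Step 9 soft thresholding: shrink a coefficient toward zero
    by t, setting coefficients with |c| <= t to exactly zero."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0
```

Averaging already suppresses much of the zero-mean additive noise across the three observations; soft thresholding then removes the small residual coefficients that are dominated by noise.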


SIMULATION RESULTS

A 256 × 256 image is considered to be the original high resolution image, from which three noisy, blurred, under sampled, compressed and misregistered LR images are created. We have tested using both motion blur with angles of 10, 20 and 30 degrees and Gaussian blur with kernels of 3×3, 5×5 and 7×7. Gaussian white noise with SNR of 20, 30 and 40 dB is added to the blurred low resolution images. A high resolution image of size 512 × 512 is constructed from the LR images using LWT and SPIHT. Table 1 shows the comparison of the lifting scheme based super resolution reconstruction with DWT based super resolution reconstruction using Daubechies (D4) wavelets.

SUPER RESOLUTION PERFORMANCE MEASUREMENT

In order to measure the performance of the super resolution algorithm, we have used objective image quality measures such as MSE, PSNR and ISNR (Improvement in Signal to Noise Ratio).

1) Improvement in Signal-to-Noise Ratio (ISNR)

For the purpose of objectively testing the performance of the restored image, improvement in signal to noise ratio (ISNR) is used as the criterion, which is defined by

ISNR = 10 log10 [ Σi Σj (f(i, j) - y(i, j))² / Σi Σj (f(i, j) - g(i, j))² ]

where i and j range over the pixels in the horizontal and vertical dimensions of the image, and f(i, j), y(i, j) and g(i, j) are the original, degraded and restored images respectively.


2) The MSE of the reconstructed image is

MSE = (1/N²) Σi Σj (f(i, j) - F(i, j))²

where f(i, j) is the source image and F(i, j) is the reconstructed image, each containing N × N pixels.
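The three quality measures can be computed directly from the pixel arrays; a sketch for images stored as lists of rows:

```python
import math

def _sse(a, b):
    """Sum of squared differences between two equal-sized images."""
    return sum((a[i][j] - b[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0])))

def mse(f, g):
    """Mean squared error between source f and reconstruction g."""
    return _sse(f, g) / (len(f) * len(f[0]))

def psnr(f, g, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given peak value."""
    return 10.0 * math.log10(peak ** 2 / mse(f, g))

def isnr(f, y, g):
    """Improvement in SNR in dB: f original, y degraded, g restored.
    Positive when the restoration is closer to f than y is."""
    return 10.0 * math.log10(_sse(f, y) / _sse(f, g))
```

A restoration that halves the error energy relative to the degraded input yields an ISNR of about 3 dB, which makes the tabulated gains directly interpretable.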


CONCLUSION

Super resolution techniques have proved to be useful in many different applications in terms of quality and computational complexity. Both hardware and software implementations of SRR are possible. SRR can be incorporated as a feature in video editing software; cellular networks and video sites such as YouTube could utilize super resolution capabilities to improve the quality of video clips taken by cell phones. The advantage of the proposed method is that lifting schemes are used for the reconstruction of a high resolution image. Wavelet lifting schemes are faster and easier to implement, the inverse transform has exactly the same complexity as the forward transform, they require less memory, and they can be used on arbitrary geometries. We have tested on both real time and synthetic images and found that our proposed super resolution reconstruction has better ISNR and PSNR when compared to DWT based super resolution reconstruction. In our proposed approach, we use FFT based image registration to register the rotated and shifted images, the SPIHT algorithm for encoding and the DSPIHT algorithm for decoding, and finally we perform iterative blind deconvolution to deblur the image. Hence our proposed super resolution reconstruction outperforms the super resolution reconstruction as proposed by .


REFERENCES

[1] S. Susan Young, Ronald G. Driggers and Eddie L. Jacobs, "Signal Processing and Performance Analysis for Imaging Systems", Artech House, Inc., 2008.

[2] S. Chaudhuri, Ed., "Super-Resolution Imaging", Norwell, MA: Kluwer, 2001.

[3] S. C. Park, M. K. Park and M. G. Kang, "Super-resolution image reconstruction: a technical review", IEEE Signal Processing Magazine, vol. 20, pp. 21-36, May 2003.

[4] D. Sale, R. R. Schultz and R. J. Szczerba, "Super-resolution enhancement of night vision image sequences", in Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, vol. 3, pp. 1633-1638, Nashville, Tenn., USA, October 2000.

[5] B. K. Gunturk, A. U. Batur, Y. Altunbasak, M. H. Hayes III and R. M. Mersereau, "Eigenface-based super-resolution for face recognition", in Proceedings of the IEEE International Conference on Image Processing (ICIP '02), vol. 2, pp. 845-848, Rochester, NY, USA, September 2002.

[6] Priyam Chatterjee, Sujata Mukherjee, Subhasis Chaudhuri and Guna Seetharaman, "Application of Papoulis-Gerchberg method in image super resolution and inpainting", The Computer Journal, 2007.

[7] G. R. Ayers and J. C. Dainty, "Iterative blind deconvolution", Optics Letters, vol. 13, no. 7, July 1988.

[8] Barbara Zitová and J. Flusser, "Image registration methods: a survey", Image and Vision Computing, vol. 21, 2003, Elsevier.

[9] Hao Yang, Jianpo Gao and Zhenyang Wu, "Blur identification and image super-resolution reconstruction using an approach similar to variable projection", IEEE Signal Processing Letters, vol. 15, 2008.

[10] A. Jensen and A. la Cour-Harbo, "Ripples in Mathematics", Springer.

[11] E. De Castro and C. Morandi, "Registration of translated and rotated images using finite Fourier transforms", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, no. 5, September 1987.

[12] Wei Zhang, Jinzhong Yang, Xiaohong Wang and Qinghu Yang, "Fusion of remote sensing images based on lifting wavelet transform", Computer and Information Science, vol. 2, no. 1, February 2009.

[13] N. K. Bose and Mahesh B. Chappalli, "A second generation wavelet framework for super resolution with noise filtering", Wiley Periodicals, 2004.

[14] Wim Sweldens, "The lifting scheme: a construction of second generation wavelets", SIAM Journal on Mathematical Analysis.

[15] W. Lim, M. Park and M. G. Kang, "Spatially adaptive regularized iterative high resolution image reconstruction algorithm", in Proc. VCIP 2001, Photonics West, San Jose, CA, January 2001, pp. 20-26.

[16] Khalid Sayood, "Introduction to Data Compression", Third Edition.

[17] Ping-Sing Tsai and Tinku Acharya, "Image upsampling using DWT".


Appendix A

MATLAB

A.1 Introduction

MATLAB is a high-performance language for technical computing. It integrates

computation, visualization, and programming in an easy-to-use environment where problems and

solutions are expressed in familiar mathematical notation. MATLAB stands for matrix

laboratory, and was written originally to provide easy access to matrix software developed by

the LINPACK (linear system package) and EISPACK (Eigen system package) projects. MATLAB is therefore built on a foundation of sophisticated matrix software in which the basic element is an array that does not require pre-dimensioning, allowing many technical computing problems, especially those with matrix and vector formulations, to be solved in a fraction of the time.

MATLAB features a family of application-specific solutions called toolboxes. Very

important to most users of MATLAB, toolboxes allow learning and applying specialized

technology. These are comprehensive collections of MATLAB functions (M-files) that extend

the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are

available include signal processing, control system, neural networks, fuzzy logic, wavelets,

simulation and many others.

Typical uses of MATLAB include: Math and computation, Algorithm development, Data

acquisition, Modeling, simulation, prototyping, Data analysis, exploration, visualization,

Scientific and engineering graphics, Application development, including graphical user interface

building.

A.2 Basic Building Blocks of MATLAB

The basic building block of MATLAB is the matrix. The fundamental data type is the array. Vectors, scalars, real matrices and complex matrices are handled as specific classes of this basic data type. The built-in functions are optimized for vector operations. No dimension

statements are required for vectors or arrays.


A.2.1 MATLAB Window

MATLAB works with the following windows: the Command window, Workspace window, Current directory window, Command history window, Editor window, Graphics window and Online help window.

A.2.1.1 Command Window

The command window is where the user types MATLAB commands and expressions at

the prompt (>>) and where the output of those commands is displayed. It is opened when the

application program is launched. All commands including user-written programs are typed in

this window at MATLAB prompt for execution.

A.2.1.2 Work Space Window

MATLAB defines the workspace as the set of variables that the user creates in a work

session. The workspace browser shows these variables and some information about them.

Double clicking on a variable in the workspace browser launches the Array Editor, which can be

used to obtain information.

A.2.1.3 Current Directory Window

The current Directory tab shows the contents of the current directory, whose path is

shown in the current directory window. For example, in the windows operating system the path

might be as follows: C:\MATLAB\Work, indicating that directory "work" is a subdirectory of the main directory "MATLAB", which is installed in drive C. Clicking on the arrow in the

current directory window shows a list of recently used paths. MATLAB uses a search path to

find M-files and other MATLAB related files. Any file run in MATLAB must reside in the

current directory or in a directory that is on search path.

A.2.1.4 Command History Window

The Command History Window contains a record of the commands a user has entered in

the command window, including both current and previous MATLAB sessions. Previously

entered MATLAB commands can be selected and re-executed from the command history

window by right clicking on a command or sequence of commands. This is useful to select

various options in addition to executing the commands and is useful feature when experimenting

with various commands in a work session.


A.2.1.5 Editor Window

The MATLAB editor is both a text editor specialized for creating M-files and a graphical

MATLAB debugger. The editor can appear in a window by itself, or it can be a sub window in

the desktop. In this window one can write, edit, create and save programs in files called M-files.

MATLAB editor window has numerous pull-down menus for tasks such as saving,

viewing, and debugging files. Because it performs some simple checks and also uses color to

differentiate between various elements of code, this text editor is recommended as the tool of

choice for writing and editing M-functions.

A.2.1.6 Graphics or Figure Window

The output of all graphic commands typed in the command window is seen in this

window.

A.2.1.7 Online Help Window

MATLAB provides online help for all its built-in functions and programming language

constructs. The principal way to get help online is to use the MATLAB Help Browser, opened as

a separate window either by clicking on the question mark symbol (?) on the desktop toolbar, or

by typing helpbrowser at the prompt in the command window. The Help Browser is a web

browser integrated into the MATLAB desktop that displays Hypertext Markup Language

(HTML) documents. The Help Browser consists of two panes: the help navigator pane, used to

find information, and the display pane, used to view the information. Self-explanatory tabs other

than the navigator pane are used to perform a search.

A.3 MATLAB Files

MATLAB has two main types of files for storing information: M-files and MAT-files.

A.3.1 M-Files

These are standard ASCII text files with a .m extension to the file name; they contain

MATLAB code. The MATLAB editor or another text editor is used to create a file containing

the same statements that would be typed at the MATLAB command line, and the file is saved

under a name that ends in .m. There are two types of M-files:


1. Script Files

A script file is an M-file with a set of MATLAB commands in it, executed by typing the

name of the file on the command line. Script files work on the variables currently present in the

workspace.

2. Function Files

A function file is also an M-file, except that the variables in a function file are all local.

This type of file begins with a function definition line.
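As an illustration (the file and variable names are made up for this sketch), the same computation written first as a script file and then as a function file:

```matlab
% --- avgscript.m (script file) ---
% Operates on variables already in the workspace;
% x and avg remain visible after the script runs.
x = [2 4 6 8];
avg = sum(x) / length(x)

% --- avgfun.m (function file, stored separately) ---
% Begins with a function definition line; x and y
% are local and disappear when the function returns.
function y = avgfun(x)
y = sum(x) / length(x);
```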

A.3.2 MAT-Files

These are binary data files with a .mat extension, created by MATLAB when data is

saved with the save command. The data is written in a special format that only MATLAB can

read, and is loaded back into MATLAB with the load command.
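For example (the file name mydata.mat is arbitrary):

```matlab
A = magic(4);            % create some data
save('mydata.mat', 'A')  % write A to the binary MAT-file mydata.mat
clear A                  % remove A from the workspace
load('mydata.mat')       % restore A from mydata.mat
```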

A.4 The MATLAB System:

The MATLAB system consists of five main parts:

A.4.1 Development Environment:

This is the set of tools and facilities that help you use MATLAB functions and files. Many

of these tools are graphical user interfaces. It includes the MATLAB desktop and Command

Window, a command history, an editor and debugger, and browsers for viewing help, the

workspace, files, and the search path.

A.4.2 The MATLAB Mathematical Function Library:

This is a vast collection of computational algorithms ranging from elementary functions

like sum, sine, cosine, and complex arithmetic, to more sophisticated functions like matrix

inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.


A.4.3 The MATLAB Language:

This is a high-level matrix/array language with control flow statements, functions, data

structures, input/output, and object-oriented programming features. It allows both "programming

in the small" to rapidly create quick and dirty throw-away programs, and "programming in the

large" to create complete large and complex application programs.

A.4.4 Graphics:

MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as

annotating and printing these graphs. It includes high-level functions for two-dimensional and

three-dimensional data visualization, image processing, animation, and presentation graphics. It

also includes low-level functions that allow you to fully customize the appearance of graphics as

well as to build complete graphical user interfaces on your MATLAB applications.

A.4.5 The MATLAB Application Program Interface (API):

This is a library that allows you to write C and FORTRAN programs that interact with

MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking), calling

MATLAB as a computational engine, and for reading and writing MAT-files.

A.5 SOME BASIC COMMANDS:

pwd prints the working directory

demo demonstrates what is possible in MATLAB

who lists all of the variables in your MATLAB workspace

whos lists the variables and describes their matrix size

clear erases variables and functions from memory

clear x erases the matrix 'x' from your workspace

close by itself, closes the current figure window


figure creates an empty figure window

hold on holds the current plot and all axis properties so that subsequent graphing

commands add to the existing graph

hold off sets the next plot property of the current axes to "replace"

find find indices of nonzero elements e.g.:

d = find(x>100) returns the indices of the vector x that are greater than 100

break terminate execution of m-file or WHILE or FOR loop

for repeat statements a specific number of times, the general form of a FOR

statement is:

FOR variable = expr, statement, ..., statement END

for n=1:cc/c;

magn(n,1) = nanmean(a((n-1)*c+1:n*c,1));

end

diff difference and approximate derivative e.g.:

DIFF(X) for a vector X, is [X(2)-X(1) X(3)-X(2) ... X(n)-X(n-1)].

NaN the arithmetic representation for Not-a-Number, a NaN is obtained as a

result of mathematically undefined operations like 0.0/0.0

Inf the arithmetic representation for positive infinity; an infinity is also produced

by operations like dividing by zero, e.g. 1.0/0.0, or from overflow, e.g. exp(1000).

save saves all the matrices defined in the current session into the file,

matlab.mat, located in the current working directory


load loads contents of matlab.mat into current workspace

save filename x y z saves the matrices x, y and z into the file titled filename.mat

save filename x y z -ascii saves the matrices x, y and z into the file titled filename.dat

load filename loads the contents of filename into current workspace; the file can

be a binary (.mat) file

load filename.dat loads the contents of filename.dat into the variable filename

xlabel(' ') allows you to label the x-axis

ylabel(' ') allows you to label the y-axis

title(' ') allows you to give a title to the plot

subplot() allows you to create multiple plots in the same window

A.6 SOME BASIC PLOT COMMANDS:

Kinds of plots:

plot(x,y) creates a Cartesian plot of the vectors x & y

plot(y) creates a plot of y vs. the numerical values of the elements in the y-vector

semilogx(x,y) plots y vs. x using a logarithmic scale for x

semilogy(x,y) plots y vs. x using a logarithmic scale for y

loglog(x,y) plots y vs. x using logarithmic scales for both axes

polar(theta,r) creates a polar plot of the vectors r & theta where theta is in radians

bar(x) creates a bar graph of the vector x. (Note also the command stairs(x))


bar(x, y) creates a bar-graph of the elements of the vector y, locating the bars

according to the vector elements of 'x'

Plot description:

grid creates a grid on the graphics plot

title('text') places a title at top of graphics plot

xlabel('text') writes 'text' beneath the x-axis of a plot

ylabel('text') writes 'text' beside the y-axis of a plot

text(x,y,'text') writes 'text' at the location (x,y)

text(x,y,'text','sc') writes 'text' at point x,y assuming lower left corner is (0,0)

and upper right corner is (1,1)

axis([xmin xmax ymin ymax]) sets scaling for the x- and y-axes on the current plot
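Putting several of the commands above together, a minimal labelled plot might look as follows (the data is arbitrary):

```matlab
x = 0:0.1:2*pi;            % sample points
y = sin(x);
plot(x, y)                 % Cartesian plot of x & y
grid                       % add a grid to the plot
title('Sine wave')         % title at top of the plot
xlabel('x (radians)')      % text beneath the x-axis
ylabel('sin(x)')           % text beside the y-axis
axis([0 2*pi -1.2 1.2])    % explicit axis scaling
```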

A.7 ALGEBRAIC OPERATIONS IN MATLAB:

Scalar Calculations:

+ Addition

- Subtraction

* Multiplication

/ Right division (a/b means a ÷ b)

\ left division (a\b means b ÷ a)

^ Exponentiation

For example, 3*4 executed in MATLAB gives ans = 12

4/5 gives ans = 0.8


Array products: Recall that addition and subtraction of matrices involve addition or

subtraction of the individual elements of the matrices. Sometimes it is desired to simply multiply

or divide each element of a matrix by the corresponding element of another matrix; these are

called 'array operations'.

Array or element-by-element operations are executed when the operator is preceded by a '.'

(period):

a .* b multiplies each element of a by the respective element of b

a ./ b divides each element of a by the respective element of b

a .\ b divides each element of b by the respective element of a

a .^ b raises each element of a to the power of the respective element of b
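A short sketch contrasting the matrix product with the array (element-by-element) product:

```matlab
a = [1 2; 3 4];
b = [5 6; 7 8];

a * b     % matrix product:             [19 22; 43 50]
a .* b    % element-by-element product: [5 12; 21 32]
a ./ b    % element-by-element division of a by b
a .^ 2    % squares each element:       [1 4; 9 16]
```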

A.8 MATLAB WORKING ENVIRONMENT:

A.8.1 MATLAB DESKTOP

The MATLAB Desktop is the main MATLAB application window. The desktop contains five sub

windows: the command window, the workspace browser, the current directory window, the

command history window, and one or more figure windows, which are shown only when the

user displays a graphic.

The command window is where the user types MATLAB commands and expressions at

the prompt (>>) and where the output of those commands is displayed. MATLAB defines the

workspace as the set of variables that the user creates in a work session.

The workspace browser shows these variables and some information about them. Double

clicking on a variable in the workspace browser launches the Array Editor, which can be used to

obtain information and in some instances edit certain properties of the variable.

The current Directory tab above the workspace tab shows the contents of the current

directory, whose path is shown in the current directory window.


For example, in the Windows operating system the path might be as follows:

C:\MATLAB\Work, indicating that directory "work" is a subdirectory of the main directory

"MATLAB", which is installed in drive C. Clicking on the arrow in the current directory

window shows a list of recently used paths. Clicking on the button to the right of the

window allows the user to change the current directory.

MATLAB uses a search path to find M-files and other MATLAB related files, which are

organized in directories in the computer file system. Any file run in MATLAB must reside in the

current directory or in a directory that is on the search path. By default, the files supplied with

MATLAB and MathWorks toolboxes are included in the search path. The easiest way to see

which directories are on the search path, or to add or modify the search path, is to select Set Path

from the File menu on the desktop, and then use the Set Path dialog box. It is good practice to add

any commonly used directories to the search path to avoid repeatedly having to change the

current directory.

The Command History Window contains a record of the commands a user has entered in

the command window, including both current and previous MATLAB sessions. Previously

entered MATLAB commands can be selected and re-executed from the command history

window by right clicking on a command or sequence of commands.

This action launches a menu from which to select various options in addition to executing

the commands. This is a useful feature when experimenting with various commands in a work session.

A.8.2 Using the MATLAB Editor to create M-Files:

The MATLAB editor is both a text editor specialized for creating M-files and a graphical

MATLAB debugger. The editor can appear in a window by itself, or it can be a sub window in

the desktop. M-files are denoted by the extension .m, as in pixelup.m.


The MATLAB editor window has numerous pull-down menus for tasks such as saving,

viewing, and debugging files. Because it performs some simple checks and also uses color to

differentiate between various elements of code, this text editor is recommended as the tool of

choice for writing and editing M-functions.

To open the editor, type edit at the prompt, or edit filename to open the M-file filename.m in an

editor window, ready for editing. As noted earlier, the file must be in the current directory, or in a

directory in the search path.

A.8.3 Getting Help:

The principal way to get help online is to use the MATLAB Help Browser, opened as a

separate window either by clicking on the question mark symbol (?) on the desktop toolbar, or by

typing helpbrowser at the prompt in the command window. The Help Browser is a web browser

integrated into the MATLAB desktop that displays Hypertext Markup Language (HTML)

documents. The Help Browser consists of two panes: the help navigator pane, used to find

information, and the display pane, used to view the information. Self-explanatory tabs other than

the navigator pane are used to perform a search.


Appendix B

INTRODUCTION TO DIGITAL IMAGE PROCESSING

What is DIP?

An image may be defined as a two-dimensional function f(x, y), where x & y are

spatial coordinates, & the amplitude of f at any pair of coordinates (x, y) is called the intensity

or gray level of the image at that point. When x, y & the amplitude values of f are all finite

discrete quantities, we call the image a digital image. The field of DIP refers to processing digital

images by means of a digital computer. A digital image is composed of a finite number of elements,

each of which has a particular location & value. The elements are called pixels.

Vision is the most advanced of our senses, so it is not surprising that images play the single

most important role in human perception. However, unlike humans, who are limited to the visual

band of the EM spectrum, imaging machines cover almost the entire EM spectrum, ranging from

gamma rays to radio waves. They can also operate on images generated by sources that humans are

not accustomed to associating with images.

There is no general agreement among authors regarding where image processing stops &

other related areas such as image analysis & computer vision start. Sometimes a distinction is

made by defining image processing as a discipline in which both the input & output of a process

are images. This is a limiting & somewhat artificial boundary. The area of image analysis (image

understanding) is in between image processing & computer vision.

There are no clear-cut boundaries in the continuum from image processing at one end to

complete vision at the other. However, one useful paradigm is to consider three types of

computerized processes in this continuum: low-, mid-, & high-level processes. A low-level

process involves primitive operations such as preprocessing to reduce noise, contrast

enhancement & image sharpening. A low-level process is characterized by the fact that both its


inputs & outputs are images. Mid-level processes on images involve tasks such as segmentation,

description of objects to reduce them to a form suitable for computer processing &

classification of individual objects. A mid-level process is characterized by the fact that its inputs

generally are images but its outputs are attributes extracted from those images. Finally, higher-

level processing involves "making sense" of an ensemble of recognized objects, as in image

analysis, & at the far end of the continuum, performing the cognitive functions normally

associated with human vision.

Digital image processing, as already defined, is used successfully in a broad range of

areas of exceptional social & economic value.

What is an image?

An image is represented as a two dimensional function f(x, y) where x and y are spatial co-

ordinates and the amplitude of ‘f’ at any pair of coordinates (x, y) is called the intensity of the

image at that point.

Gray scale image:

A grayscale image is a function I(x, y) of the two spatial coordinates of the image

plane.

I(x, y) is the intensity of the image at the point (x, y) on the image plane.

I(x, y) takes non-negative values; assuming the image is bounded by a rectangle [0, a] × [0, b],

I: [0, a] × [0, b] → [0, ∞)

Color image:

It can be represented by three functions, R(x, y) for red, G(x, y) for green and

B(x, y) for blue.


An image may be continuous with respect to the x and y coordinates and also in

amplitude. Converting such an image to digital form requires that the coordinates as well as the

amplitude be digitized. Digitizing the coordinate values is called sampling. Digitizing the

amplitude values is called quantization.

Coordinate convention:

The result of sampling and quantization is a matrix of real numbers. We use two principal

ways to represent digital images. Assume that an image f(x, y) is sampled so that the resulting

image has M rows and N columns. We say that the image is of size M x N. The values of the

coordinates (x, y) are discrete quantities. For notational clarity and convenience, we use

integer values for these discrete coordinates. In many image processing books, the image origin

is defined to be at (x, y) = (0, 0).

The next coordinate values along the first row of the image are (x, y) = (0, 1). It is

important to keep in mind that the notation (0, 1) is used to signify the second sample along the

first row. It does not mean that these are the actual values of physical coordinates when the

image was sampled. The following figure shows the coordinate convention. Note that x ranges from

0 to M-1 and y from 0 to N-1 in integer increments.

The coordinate convention used in the toolbox to denote arrays is different from the

preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the

notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the

same as the order discussed in the previous paragraph, in the sense that the first element of a

coordinate tuple, (a, b), refers to a row and the second to a column. The other difference is that

the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to

N in integer increments. The IPT documentation refers to these as pixel coordinates. Less

frequently, the toolbox also employs another coordinate convention called spatial coordinates,

which uses x to refer to columns and y to refer to rows. This is the opposite of our use of the

variables x and y.


Image as Matrices

The preceding discussion leads to the following representation for a digitized image

function:

            f(0, 0)      f(0, 1)     ...    f(0, N-1)

            f(1, 0)      f(1, 1)     ...    f(1, N-1)

f(x, y) =      .            .                   .

               .            .                   .

            f(M-1, 0)    f(M-1, 1)   ...    f(M-1, N-1)

The right side of this equation is a digital image by definition. Each element of this

array is called an image element, picture element, pixel or pel. The terms image and pixel are

used throughout the rest of our discussions to denote a digital image and its elements.

A digital image can be represented naturally as a MATLAB matrix:

        f(1, 1)    f(1, 2)    ...    f(1, N)

        f(2, 1)    f(2, 2)    ...    f(2, N)

f  =       .          .                 .

           .          .                 .

        f(M, 1)    f(M, 2)    ...    f(M, N)

where f(1, 1) = f(0, 0) (note the use of a monospace font to denote MATLAB

quantities). Clearly the two representations are identical, except for the shift in origin. The

notation f(p, q) denotes the element located in row p and column q. For example, f(6, 2) is

the element in the sixth row and second column of the matrix f. Typically we use the letters M


and N respectively to denote the number of rows and columns in a matrix. A 1xN matrix is

called a row vector whereas an Mx1 matrix is called a column vector. A 1x1 matrix is a scalar.

Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array and

so on. Variables must begin with a letter and contain only letters, numerals and underscores. As

noted in the previous paragraph, all MATLAB quantities are written using monospace

characters. We use conventional Roman italic notation, such as f(x, y), for mathematical

expressions.

6.5 Reading Images:

Images are read into the MATLAB environment using function imread, whose syntax is

imread(‘filename’)

Format name Description recognized extension

TIFF Tagged Image File Format .tif, .tiff

JPEG Joint Photographic Experts Group .jpg, .jpeg

GIF Graphics Interchange Format .gif

BMP Windows Bitmap .bmp

PNG Portable Network Graphics .png

XWD X Window Dump .xwd

Here filename is a string containing the complete name of the image file (including any

applicable extension). For example, the command line

>> f = imread('chestxray.jpg');

reads the JPEG image chestxray (see the table above) into image array f. Note the use of single quotes

(‘) to delimit the string filename. The semicolon at the end of a command line is used by


MATLAB for suppressing output. If a semicolon is not included, MATLAB displays the results

of the operation(s) specified in that line. The prompt symbol(>>) designates the beginning of a

command line, as it appears in the MATLAB command window.

When, as in the preceding command line, no path is included in filename, imread reads the

file from the current directory and, if that fails, it tries to find the file in the MATLAB search path.

The simplest way to read an image from a specified directory is to include a full or relative path

to that directory in filename.

For example,

>> f = imread ( ‘D:\myimages\chestxray.jpg’);

reads the image from a folder called myimages on the D: drive, whereas

>> f = imread('.\myimages\chestxray.jpg');

reads the image from the myimages subdirectory of the current working

directory. The current directory window on the MATLAB desktop toolbar displays MATLAB’s

current working directory and provides a simple, manual way to change it. The table above lists

some of the most popular image/graphics formats supported by imread and imwrite.

Function size gives the row and column dimensions of an image:

>> size(f)

ans = 1024 1024

This function is particularly useful in programming when used in the following form to

determine automatically the size of an image:

>>[M,N]=size(f);

This syntax returns the number of rows(M) and columns(N) in the image.


The whos function displays additional information about an array. For instance, the

statement

>> whos f

gives

Name      Size         Bytes     Class

f         1024x1024    1048576   uint8 array

Grand total is 1048576 elements using 1048576 bytes

The uint8 entry shown refers to one of several MATLAB data classes. A semicolon at the

end of a whos line has no effect, so normally one is not used.

6.6 Displaying Images:

Images are displayed on the MATLAB desktop using function imshow, which has the basic

syntax:

imshow(f, g)

where f is an image array, and g is the number of intensity levels used to display it.

If g is omitted, it defaults to 256 levels. Using the syntax

imshow(f, [low high])

displays as black all values less than or equal to low and as white all values greater

than or equal to high. The values in between are displayed as intermediate intensity values using

the default number of levels. Finally, the syntax

imshow(f, [ ])

sets variable low to the minimum value of array f and high to its maximum value.

This form of imshow is useful for displaying images that have a low dynamic range or that have

positive and negative values.

Function pixval is used frequently to display the intensity values of individual pixels

interactively. This function displays a cursor overlaid on an image. As the cursor is moved over

the image with the mouse, the coordinates of the cursor position and the corresponding intensity

values are shown on a display that appears below the figure window. When working with color

images, the coordinates as well as the red, green and blue components are displayed. If the left

button on the mouse is clicked and then held pressed, pixval displays the Euclidean distance

between the initial and current cursor locations.

The syntax form of interest here is pixval, which shows the cursor on the last image

displayed. Clicking the X button on the cursor window turns it off.

The following statements read from disk an image called rose_512.tif, extract basic

information about the image, and display it using imshow:

>> f = imread('rose_512.tif');

>> whos f

Name      Size       Bytes     Class

f         512x512    262144    uint8 array

Grand total is 262144 elements using 262144 bytes

>> imshow(f)

A semicolon at the end of an imshow line has no effect, so normally one is not used.

If another image, g, is displayed using imshow, MATLAB replaces the image on the screen with

the new image. To keep the first image and output a second image, we use function figure as

follows:


>> figure, imshow(g)

Using the statement

>> imshow(f), figure, imshow(g) displays both images.

Note that more than one command can be written on a line, as long as different commands

are properly delimited by commas or semicolons. As mentioned earlier, a semicolon is used

whenever it is desired to suppress screen outputs from a command line.

Suppose that we have just read an image h and find that displaying it with imshow produces

a dim, low-contrast image. It is clear that this image has a low dynamic range, which can be

remedied for display purposes by using the statement

>> imshow(h, [ ])

6.7 WRITING IMAGES:

Images are written to disk using function imwrite, which has the following basic syntax:

imwrite(f, 'filename')

With this syntax, the string contained in filename must include a recognized file format

extension. For example, the following command writes f to a TIFF file named patient10_run1:

>> imwrite(f, 'patient10_run1.tif')

Alternatively, the desired format can be specified explicitly with a third input argument:

>> imwrite(f, 'patient10_run1', 'tif')

If filename contains no path information, then imwrite saves the file in the current working

directory.


The imwrite function can have other parameters, depending on the file format selected. Most

of the work in the following deals either with JPEG or TIFF images, so we focus attention here

on these two formats.

A more general imwrite syntax applicable only to JPEG images is

imwrite(f, 'filename.jpg', 'quality', q)

where q is an integer between 0 and 100 (the lower the number, the higher the degradation due to

JPEG compression).

For example, for q = 25 the applicable syntax is

>> imwrite(f, 'bubbles25.jpg', 'quality', 25)

The image for q = 15 has false contouring that is barely visible, but this effect becomes quite

pronounced for q = 5 and q = 0. Thus, an acceptable solution with some margin for error is to

compress the images with q = 25. In order to get an idea of the compression achieved and to obtain

other image file details, we can use function imfinfo, which has the syntax

imfinfo filename

Here filename is the complete file name of the image stored on disk.

For example,

>> imfinfo bubbles25.jpg

outputs the following information (note that some fields contain no information in this case):

Filename: 'bubbles25.jpg'

FileModDate: '04-Jan-2003 12:31:26'


FileSize: 13849

Format: ‘jpg’

Format Version: ‘ ‘

Width: 714

Height: 682

Bit Depth: 8

Color Depth: ‘grayscale’

Format Signature: ‘ ‘

Comment: { }

where FileSize is in bytes. The number of bytes in the original image is computed simply

by multiplying Width by Height by BitDepth and dividing the result by 8. The result is

486948. Dividing this by the file size gives the compression ratio: 486948/13849 = 35.16. This

compression ratio was achieved while maintaining image quality consistent with the

requirements of the application. In addition to the obvious advantages in storage space, this

reduction allows the transmission of approximately 35 times the amount of uncompressed data

per unit time.

The information fields displayed by imfinfo can be captured into a so-called structure

variable that can be used for subsequent computations. Using the preceding image as an example,

and assigning the name K to the structure variable,

we use the syntax

>> K = imfinfo('bubbles25.jpg');

to store into variable K all the information generated by command imfinfo. The

information generated by imfinfo is appended to the structure variable by means of fields,


separated from K by a dot. For example, the image height and width are now stored in the structure

fields K.Height and K.Width.

As an illustration, consider the following use of structure variable K to compute the

compression ratio for bubbles25.jpg:

>> K = imfinfo('bubbles25.jpg');

>> image_bytes = K.Width * K.Height * K.BitDepth / 8;

>> compressed_bytes = K.FileSize;

>> compression_ratio = image_bytes / compressed_bytes

compression_ratio = 35.1612

Note that imfinfo was used in two different ways. The first was to type imfinfo bubbles25.jpg

at the prompt, which resulted in the information being displayed on the screen. The second was

to type K = imfinfo('bubbles25.jpg'), which resulted in the information generated by imfinfo

being stored in K. These two different ways of calling imfinfo are an example of command-

function duality, an important concept that is explained in more detail in the MATLAB online

documentation.

A more general imwrite syntax applicable only to TIFF images has the form

imwrite(g, 'filename.tif', 'compression', 'parameter', 'resolution', [colres rowres])

where 'parameter' can have one of the following principal values: 'none' indicates no

compression; 'packbits' indicates packbits compression (the default for nonbinary images);

and 'ccitt' indicates ccitt compression (the default for binary images). The 1x2 array [colres rowres]

contains two integers that give the column resolution and row resolution in dots per unit

(the default values are in dots per inch). For example, if the image dimensions are in inches, colres is the number

of dots (pixels) per inch (dpi) in the vertical direction, and similarly for rowres in the horizontal

direction. Specifying the resolution by a single scalar, res, is equivalent to writing [res res].

>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', [300 300])

The values of the vector [colres rowres] were determined by multiplying 200 dpi by the ratio

2.25/1.5, which gives 300 dpi. Rather than do the computation manually, we could write

>> res = round(200*2.25/1.5);

>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', res)

The function round rounds its argument to the nearest integer. It is important to note that

the number of pixels was not changed by these commands; only the scale of the image changed.

The original 450x450 image at 200 dpi is of size 2.25x2.25 inches. The new 300-dpi image is

identical, except that its 450x450 pixels are distributed over a 1.5x1.5-inch area. Processes such

as this are useful for controlling the size of an image in a printed document without sacrificing

resolution.

Often it is necessary to export images to disk the way they appear on the MATLAB

desktop. This is especially true with plots. The contents of a figure window can be exported to

disk in two ways. The first is to use the File pull-down menu in the figure window and then

choose Export. With this option the user can select a location, filename, and format. More control

over export parameters is obtained by using the print command:

print -fno -dfileformat -rresno filename

where no refers to the figure number of the figure window of interest, fileformat refers to

one of the file formats in the table above, resno is the resolution in dpi, and filename is the name

we wish to assign to the file.

If we simply type print at the prompt, MATLAB prints (to the default printer) the contents of the

last figure window displayed. It is possible also to specify other options with print, such as a specific

printing device.
