Wavelets for Sound Analysis



Wavelets for Sound Analysis and Re-Synthesis

    Graham Self

2nd May 2001

    Project Supervisor: Dr. Guy Brown

    Second Marker: Dr. Joab Winkler

This report is submitted in partial fulfilment of the requirement for the Bachelor of Science Dual Honours in Computer Science and Mathematics by Graham Self


    Page I

    Declaration

All sentences or passages quoted in this dissertation from other people's work have been specifically acknowledged by clear cross-referencing to author, work and page(s).

    Any illustrations which are not the work of the author of this dissertation have been

    used with the explicit permission of the originator and are specifically acknowledged. I

    understand that failure to do this amounts to plagiarism and will be considered grounds

    for failure in this dissertation and the degree examination as a whole.

    Name:

    Signature:

    Date:


    Page II

    Abstract

This paper first introduces Fourier analysis, the main branch of mathematics that led to the discovery of wavelets and their transforms. It gives a detailed account of the development of the different Fourier transforms and describes the motivation for a different transform arising from the multiresolution problem. Wavelet theory is then introduced as a solution to this problem, together with a detailed account of the Continuous and Discrete Wavelet Transforms.

The paper also documents the development, testing and evaluation of two Computer Assisted Learning tools that students could use to learn about wavelet theory.


    Page III

    Acknowledgments

Thanks go to:

My parents, for being supportive;

Alice, for being there every step of the way;

Guy, for endless proofreading;

And to all those involved in the testing and evaluation.


    Page IV

    Contents

Introduction 1

1 : Fourier Analysis 2
1.1 The Statement that Changed Mathematics 2
1.2 Applications of Fourier Analysis 3
1.3 The Fourier Transform - Finding the frequency content of a signal 3
1.4 The Discrete Fourier Transform 4
1.5 The Fast Fourier Transform 5
1.6 The Time/Frequency Problem 5
1.7 The Short Term Fourier Transform 6
1.8 Technical Definitions for Chapter 1 8

2 : Introduction to Wavelets 11
2.1 Where did they come from? 11
2.2 The Mother Wavelet 12
2.3 Wavelets achieve Multiresolution 12
2.4 How do you create a Wavelet? 13

3 : The Continuous Wavelet Transform 14
3.1 Theory 14
3.2 Computation of the CWT 15
3.3 Visualising the CWT - 3D Plot 16
3.4 Visualising the CWT - Scalograms 16

4 : The Discrete Wavelet Transform 17
4.1 Why not use the CWT? 17
4.2 Discretizing the Continuous Wavelet Transform 17
4.3 Subband Coding 18
4.4 Example of Subband Coding 20

5 : Computer Assisted Learning Tools 22
5.1 Computer Assisted Learning (CAL) 22
5.2 Which programming language? 23

6 : Requirements Analysis 24
6.1 The initial requirement 24
6.2 The Client and Developer scenario 24
6.3 The Matlab Auditory Demos 26

7 : Software Development 27
7.1 Getting familiar with MATLAB 27
7.1.1 MATLAB's GUI 27
7.1.2 Coding in MATLAB 28
7.2 Wavelet Learning Tool - Interface Design 29
7.3 The coding of the Wavelet Learning Tool 30
7.3.1 Coding the CWT 30
7.3.2 Coding the Inverse CWT 32
7.3.3 Input error recovery 32
7.4 Development problems 33
7.4.1 Surf vs Imagesc 33
7.4.2 The disappearing cursor 34


    Page V

    7.5 Development of the Subband Learning Tool 35

    7.6 Screen Shots 35

    7.7 Functionality of the Wavelet Learning Tool 36

8 : Software Testing 38
8.1 Motivation 38
8.2 Functional Testing 39
8.3 Testing the Wavelet Learning Tool 39
8.3.1 Random Testing 40
8.3.2 Testing Structural Synthesis 41
8.3.3 The Category-Partition method 44
8.4 Testing Summary 45

9 : Evaluation 46
9.1 Questionnaire Construction 46
9.2 User Guide 47
9.3 Questionnaire Results 47
9.4 Evaluation Summary 49
9.5 Evaluation of the Subband Learning Tool 49
9.6 Future Work 49

    9.6 Future Work 49

    10: Conclusions 51

    References 52

    Appendix A Questionnaire 53

    Appendix B Wavelet Learning Tool User Guide 55

    Appendix C Software Development Log 57



    Page 1

    Introduction

Wavelets have many historical roots, as they can be applied in many different scientific areas. This dissertation documents the usefulness and importance of wavelets in the frequency analysis of a signal. The main route to the discovery of wavelets was a natural progression from the Short Term Fourier Transform, driven by the need for a better analysis technique. The problem with Fourier transforms is that a frequency analysis can't give both good frequency and good time resolution at the same time. This is known as the multiresolution problem, and it was solved by using wavelets instead of a fixed analysis window. Wavelets have the important property that they don't have to be fixed in size: they are controlled by parameters that compress and dilate the wavelet depending on which frequency in the signal is being analysed. An account of how to create wavelets using the mother wavelet formula, and how they can be applied to signal analysis, is also given in this paper.

The free movement of wavelets was exploited in the Continuous Wavelet Transform to give a detailed representation of the frequencies in a signal. The signal could now be represented with both good time and good frequency resolution, so unlike before, the user could tell not only which frequencies occurred in the signal, but also when they occurred. The transform introduces the notion of a scale, which is equivalent to the inverse of the frequency being analysed. The continuous wavelet transform produces a coefficient matrix, which can be visualised by the use of a scalogram; this is also investigated. The problem with the continuous wavelet transform is that it calculates a coefficient for each sample in the signal at every scale. For long signals this can be time consuming and uneconomical when applying the transform computationally.

This problem was solved by the development of a discrete version of the wavelet transform. The dissertation details the reasons for wanting a discrete version, along with arguments showing how this can be done without loss of data. The Discrete Wavelet Transform is based on the ideas behind Nyquist's Sampling Theorem, and a method for calculating the transform, called Subband Coding, is given.

A large part of this dissertation gives a detailed account of the development, testing and evaluation of two Computer Assisted Learning (CAL) tools. Because wavelet theory is a fairly new topic, lecturers are limited in the ways they can teach it to students who are unfamiliar with the concepts involved. Therefore, CAL tools were developed to aid the teaching of wavelets and to provide an interactive demonstration of what the transforms can do and how they benefit signal analysis. The dissertation provides evidence for the benefits of using CAL tools alongside a series of lectures, and also sets out the requirements for them to be successful.

The main tool developed alongside this dissertation is called the Wavelet Learning Tool. It was developed to demonstrate visually how wavelets are used in the continuous version of the wavelet transform. The user is able to compare the effects of using different wavelets and to see a visual representation of the coefficient matrix produced, called a scalogram. The scalogram effectively shows the signal in the frequency domain. As the title of the dissertation suggests, the user can also re-synthesise a signal from a scalogram that has been altered. The tool allows the user to remove scales from the scalogram and then listen to the effect this has on the overall signal; this is effectively a frequency filtering technique.

The second tool, called the Subband Learning Tool, was developed to show visually the process undergone during Subband Coding, which is the discrete version of the wavelet transform. Subband coding is, in its simplest terms, a series of filtering and sub-sampling operations, and the user can see what effect each stage of the process has on a speech signal.


    Page 2

    1 : Fourier Analysis

    1.1 The Statement that Changed Mathematics

Before 1930, the main branch of mathematics leading to wavelet theory was Fourier analysis, from which wavelets formed as a natural progression. Fourier analysis is a significant discovery and has influenced many different areas of application, including science, mathematics, engineering and, most important of all, signal processing. Joseph Fourier started the Fourier revolution by introducing a simple mathematical statement:

    Any periodic function can be represented as a sum of sines and cosines

This discovery had fundamental importance to the signal processing world, as a complex signal could be visualised as a combination of smaller signals about which we know a great deal. Studying these smaller signals, i.e. sine waves and cosine waves, directly allows you to infer properties of the more complex signal. We will see later how the amplitudes of the sine and cosine waves in a complex signal enable you to calculate the frequencies in the signal.

Fourier's statement can be expressed mathematically as follows.

f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx)  [EQ 1]

Where the Fourier coefficients a_0, a_k, b_k are defined as

a_0 = \frac{1}{\pi} \int_0^{2\pi} f(x)\,dx, \quad a_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos(kx)\,dx, \quad b_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin(kx)\,dx

[M 1993]

This is known as the Fourier Series: any curve that periodically repeats itself can be expressed as the sum of perfectly smooth oscillations, i.e. sines and cosines.
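As a concrete sketch (illustrative Python, not part of the original report, whose software was written in MATLAB), the Fourier coefficients of a periodic function can be approximated numerically; a square wave recovers the well-known 4/pi sine amplitudes. The function names are chosen for illustration only.

```python
import math

def fourier_coeffs(f, k, n=20000):
    """Approximate a_k and b_k of a 2*pi-periodic function f with the
    rectangle rule: a_k = (1/pi) * integral of f(x)cos(kx) over one period."""
    h = 2 * math.pi / n
    a = sum(f(i * h) * math.cos(k * i * h) for i in range(n)) * h / math.pi
    b = sum(f(i * h) * math.sin(k * i * h) for i in range(n)) * h / math.pi
    return a, b

# Square wave: +1 on (0, pi), -1 on (pi, 2*pi).
def square(x):
    return 1.0 if (x % (2 * math.pi)) < math.pi else -1.0

a1, b1 = fourier_coeffs(square, 1)
a3, b3 = fourier_coeffs(square, 3)
# The known series is (4/pi)(sin x + sin 3x / 3 + sin 5x / 5 + ...),
# so b1 should come out close to 4/pi and b3 close to 4/(3*pi),
# while the cosine coefficients stay near zero.
print(round(b1, 3), round(b3, 3), round(a1, 3))
```

Summing the first few of these terms reproduces the kind of build-up toward the target waveform sketched in figure 1.1.1.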

Barbara Burke Hubbard described how the Fourier Series can be represented by a process of multiplying various sinusoidal waves (sines and cosines) by certain amplitude coefficients and then shifting them so they either add or cancel [H 1995]. An example of this procedure is shown in figure 1.1.1.

    Figure 1.1.1

a) sin(x), b) sin(x) + sin(2x), c) sin(x) + sin(2x) + sin(3x)

The sinusoidal waves are called basis signals, as they form a basis for any function. They are also the basis for the Fourier Transform (see Section 1.3), which is derived from the property described above. See page 8 for an explanation of a basis.



    Page 3

    1.2 Applications of Fourier Analysis

Not only did the Fourier Series help mathematicians to differentiate difficult functions; it also opened a new door to frequency analysis. Before Fourier, raw signals were always represented in the time domain. The problem with a signal represented in the time domain is that no information, except amplitude, is given at a specific time. It is often more convenient to see what frequencies occur at different time intervals.

As described before, the Fourier Series consists of a combination of sinusoids. If you extract the amplitude of each sinusoidal component you obtain the Fourier Coefficients. As the sinusoids are equally spaced in frequency, i.e. sin x, cos x, sin 2x, cos 2x, ..., knowing these coefficients gives us information on which frequencies are present in the function. As a sound signal is effectively represented as a function, the Fourier Series enables us to extract the frequencies present in the signal.

    This leads us to the Fourier Transform, which converts a signal from the time-domain into the

    frequency-domain.

1.3 The Fourier Transform - Finding the frequency content of a signal

    The Fourier Transform, as introduced above, converts information about a signal in the time-

    domain into a signal in the frequency-domain. It also allows you to go back without loss of

    information.

    Figure 1.3.1

The only frequencies that contribute to the Fourier Series of a periodic function (see 1.1) are the integer multiples of the function's fundamental frequency. This fundamental frequency, also known as the base frequency, is the inverse of the period of the signal. For example, if the signal has a period of 2 ms (= 0.002 s), the fundamental frequency will be 1/0.002 = 500 Hz. The Fourier Transform (FT) also allows certain non-periodic functions (those that decrease fast enough that the area under their graphs is finite) to be converted, i.e. it is still possible to describe such a function in terms of its frequencies. But to do this, you need to compute coefficients for all possible frequencies.

    The Fourier Transform can be derived from the Fourier series, where the coefficients for a and

    b are defined as follows.

a(\xi) = \int f(x) \cos(2\pi\xi x)\,dx \qquad b(\xi) = \int f(x) \sin(2\pi\xi x)\,dx  [H 1995] [EQ 2]

    Since we are now dealing with a non-periodic function, we must consider the interval between

    minus-infinity and infinity.

Transforming into the complex plane

In Fourier analysis, complex numbers make it possible to have a single coefficient for each frequency. You no longer need separate coefficients for the sines and cosines, just one, which carries the information of both. It is calculated using the fact that a complex number z = x + iy can be represented by the point (x, y) in the complex plane, the x-axis representing the real part of the number z and the y-axis representing the imaginary (i) part (see figure 1.3.2). The phase of the complex number is simply the angle that is created when



    Page 4

Figure 1.3.2: The Complex Plane

a line joins the point (0,0) to the point (x, y). The magnitude is the length of this line, \sqrt{x^2 + y^2}, calculated using Pythagoras' Theorem.

To write a Fourier series or transform using complex numbers, you can use the formula

e^{i\theta} = \cos\theta + i\sin\theta

which is derived from the complex plane. The formula for the Fourier Series of a periodic function (equation 1) can now be written

f(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx}

Where the formulas for the coefficients (equation 2) become

c_k = \frac{1}{2\pi} \int_0^{2\pi} f(x) e^{-ikx}\,dx

Using this knowledge, the formula for the Fourier Transform \hat{f}(\xi) and its inverse f(x) are

\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-i\xi x}\,dx  [EQ 3]

f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\xi) e^{i\xi x}\,d\xi

    1.4 The Discrete Fourier Transform

    Digital computers are finite machines: any desired computation can only use a finite number of

    operations. No digital computer, then, can deal with real-(or complex)-valued functions of real

    numbers. The Discrete Fourier Transform (DFT) samples the function in order for it to be

represented on a computer. Instead of having f(x) at all x, we have only the values of f at a finite number of points f(x_1), f(x_2), ... For convenience, these are usually sample points at regular intervals of \Delta, say. If you start sampling at x_1 = 0, the sequence of sample points becomes 0, \Delta, 2\Delta, ..., (n-1)\Delta and the sample values are f(0), f(\Delta), f(2\Delta), ..., f((n-1)\Delta). Because the DFT is to enable you to use digital computers in Fourier analysis, we need a finite analogue of the Fourier Transform given in equation 3.

The integral is over all x, and is calculated for every real value of \xi. In the finite analogue, the integral becomes the sum

\sum_{r=0}^{n-1} f(r\Delta)\, e^{-i r \Delta \xi}  [C 1990] [EQ 4]

This is still defined for every real \xi, therefore the equation needs restricting further so that we have a finite number of values of \xi. We use the fact that e^{-ir\Delta\xi}, for fixed \Delta, is periodic in \xi with period 2\pi/\Delta. Thus the range can be restricted to values of \xi between 0 and 2\pi/\Delta. So equation 4 now becomes the formula for the Discrete Fourier Transform (DFT):

D_{n,\Delta} f(2\pi k / (n\Delta)) = \frac{1}{n} \sum_{r=0}^{n-1} f(r\Delta)\, e^{-2\pi i k r / n}
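The DFT sum translates almost directly into code. The following is an illustrative sketch in Python rather than the MATLAB used in the project, keeping the 1/n normalisation used in the formula above:

```python
import cmath
import math

def dft(samples):
    """Naive O(n^2) DFT with 1/n normalisation:
    X[k] = (1/n) * sum over r of x[r] * exp(-2*pi*i*k*r/n)."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * k * r / n)
                for r, x in enumerate(samples)) / n
            for k in range(n)]

# A cosine completing exactly 3 cycles over n samples puts all of its
# energy in bin 3 and its mirrored negative-frequency bin n - 3.
n = 32
signal = [math.cos(2 * math.pi * 3 * r / n) for r in range(n)]
spectrum = dft(signal)
peaks = [k for k, X in enumerate(spectrum) if abs(X) > 0.1]
print(peaks)  # [3, 29]
```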



    Page 5

    1.5 The Fast Fourier Transform

The Fast Fourier Transform is of special interest because it has made the DFT faster and therefore more commonly used. When calculating the DFT, a small number of Fourier coefficients is often adequate for an accurate Fourier Transform, meaning that, with the aid of computers, a Discrete Fourier Transform can be processed quickly. An example of this is the fft command in Matlab, which uses the FFT technique to give a fast, effective result that can then be plotted in the frequency domain.

    Figure 1.5.1 shows the result of using this command.

Figure 1.5.1: Matlab plots of (left) a speech signal and (right) its Fast Fourier Transform

    When the Fourier Transform is plotted, only the amplitude data is shown and the phase is

    discarded.

The Fast Fourier Transform (FFT) was introduced by Cooley and Tukey in the mid 1960s. Its effects have been revolutionary, turning Fourier analysis from merely a mathematical tool into a practical one, as more and more people turned to Fourier transforms as an effective means of frequency analysis. Cooley and Tukey developed an algorithm to speed up the DFT for signals whose length is a power of 2, e.g. 2, 4, 8, 16, 32, 64, ... This is because the method recursively halves the data.

The algorithm is a divide and conquer algorithm that systematically divides the data into two halves, performs the FFT on each half, and then splices the two halves back together.

The FFT cuts down the number of computations needed from n^2 to n log n; the larger the value of n, the more impressive the gain in speed of calculating the FT. Since the Cooley-Tukey algorithm was developed, new FFT algorithms have been discovered in which the signal need not have a length that is a power of 2.
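A minimal sketch of that divide-and-conquer idea (illustrative Python, not the project's code; unnormalised, so it omits the 1/n factor of the DFT formula above):

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of 2.
    Computes the unnormalised sum X[k] = sum_r x[r] * exp(-2*pi*i*k*r/n)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])        # transform of the even-indexed samples
    odd = fft(x[1::2])         # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):    # splice the two halves back together
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

# Check agreement with the direct O(n^2) sum on a small test vector.
xs = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, 2.5, 3.5]
direct = [sum(v * cmath.exp(-2j * cmath.pi * k * r / len(xs))
              for r, v in enumerate(xs)) for k in range(len(xs))]
print(all(abs(a - b) < 1e-9 for a, b in zip(fft(xs), direct)))  # True
```

Each level of recursion does O(n) splicing work and there are log n levels, which is where the n log n count comes from.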

    1.6 The Time/Frequency problem

There is a fundamental problem with the Fourier Transform that led to the search for a more suitable transform, such as the Wavelet Transform. A function and its Fourier Transform are two faces of the same information: the function displays the time information and hides the information about frequencies, while the Fourier Transform displays the frequency information but no information about time. The question this raises is: is it necessary to have both the time and the frequency information at the same time?

The answer depends on the particular application and the nature of the signal in hand. The Fourier Transform gives the frequency information of the signal: it tells us how much of each frequency exists in the signal, but it does not tell us when in time these frequency components exist. Olivier Rioul and Martin Vetterli state that a transform such as this one can only be effectively applied to stationary signals [R 1991].


    Page 6

Definition: signals whose frequency content doesn't vary significantly in time are stationary. The only strictly stationary signal is one with constant frequency.

Here is an example. Robi Polikar gave the following example in his wavelet tutorial, which shows the problem with non-stationary signals [P 1996].

Figure 1.6.1 is a plot of a stationary signal and its Fast Fourier Transform, zoomed in. The signal is x(t) = cos(2*pi*10*t) + cos(2*pi*25*t) + cos(2*pi*50*t) + cos(2*pi*100*t) and so contains frequencies of 10, 25, 50 and 100 Hz at every time instant.

Figure 1.6.1: Plot of a stationary signal and its FFT

Figure 1.6.2: Plot of a non-stationary signal and its FFT

Figure 1.6.2 is a plot of a signal with four different frequency components at four different time intervals, i.e. it is non-stationary. This signal contains the same frequencies as the signal used for the stationary example. The Fast Fourier Transform is also given. The ripples in the FFT are due to sudden changes from one frequency to another and are irrelevant in this example.

Apart from the ripples, the two plots of the FFT look identical: they both have large peaks at frequencies 10, 25, 50 and 100 Hz. The problem becomes apparent when you ask yourself when these frequencies actually occurred. The answer is straightforward for the stationary example, but not for the non-stationary signal. Any change of frequency in a signal, at even the slightest of time intervals, affects the overall FFT; if you assumed each frequency was present throughout, you would get completely false information.
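Polikar's observation can be reproduced in miniature. The sketch below (with illustrative sampling choices, not taken from the report) compares the magnitude spectrum of two simultaneous tones with that of the same two tones played one after the other; both spectra peak at the same bins, so the spectrum alone can't distinguish the two cases.

```python
import cmath
import math

def mag_spectrum(x):
    """Magnitudes of the (1/n-normalised) DFT of a real signal."""
    n = len(x)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * r / n)
                    for r, v in enumerate(x)) / n) for k in range(n)]

fs = 200                                   # one second sampled at 200 Hz
t = [r / fs for r in range(fs)]
stationary = [math.cos(2 * math.pi * 10 * u) + math.cos(2 * math.pi * 25 * u)
              for u in t]
switching = [math.cos(2 * math.pi * 10 * u) if u < 0.5
             else math.cos(2 * math.pi * 25 * u) for u in t]  # non-stationary

def peaks(spec, thresh=0.1):
    # Only scan the first half of the bins (positive frequencies).
    return [k for k in range(len(spec) // 2) if spec[k] > thresh]

print(peaks(mag_spectrum(stationary)))  # [10, 25]
p = peaks(mag_spectrum(switching))      # 10 and 25 again, plus leakage ripples
print(10 in p and 25 in p)              # True
```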

    As this project involves non-stationary signals, a further approach is needed.

    1.7 The Short Term Fourier Transform

Fourier analysis doesn't work equally well for all kinds of signals or for all kinds of problems. Hubbard describes the naive approach of some scientists when applying Fourier:

"In some cases, scientists using it are like the man looking for a dropped coin under a lamppost, not because that is where he dropped it, but because that's where the light is." [H 1995]

If the signal is non-periodic, the summation of the periodic functions, sine and cosine, doesn't accurately represent the signal. Research into artificially extending the signal to make it


    Page 7

periodic revealed that you would require what is called continuity at the endpoints of the function.

Dennis Gabor introduced the Short Term Fourier Transform (STFT), also known as the Windowed Fourier Transform, as an attempt to overcome the problem of identifying when a frequency occurred in a non-stationary signal. The STFT introduces the notion of time dependency into Fourier analysis.

Gabor's basic idea: introduce a local frequency parameter (local in time) so that the Fourier Transform looks at the signal through a window over which the signal is approximately stationary.

    Figure 1.7.1

    Studying the frequencies of the signal segment by segment limits the span of time during

    which something is happening.

Formal definition

Given a signal x(t), Gabor recognised that to be accurate in time, a two-dimensional time-frequency representation S(t, f) is needed, where f is the local frequency. Recall that the signal is stationary when it is seen through a window g(t) (see figure 1.7.1). The signal viewed in the window is represented by

x(t)\, g(t - T)

where T is the time location at which the window is centred and g(t) is the window function. (Convolution, a closely related operation for combining two signals, is described on page 10.) The Fourier Transform of these windowed signals is then obtained by applying the Fourier Transform given in Equation 3 (page 4).

As you can see from the formula, the STFT relies heavily on the choice of window. In figure 1.7.1 the window is a basic rectangular window, but for more accurate results differently shaped windows, such as the preferred Hamming window, can be used.

Another factor is the size of the window. Although the window size is fixed for the entire process of calculating the STFT, different STFTs can be calculated using different sized windows. A small window is effectively blind to low frequencies, which are too large for the window, while with a large window information is lost about brief changes. We will see later how wavelets have combated this problem in an attempt to see both the wood and the trees.

Figure 1.7.2 illustrates the windowing of a signal in the STFT. It shows Gabor's two-dimensional principle and gives two alternative views.

STFT(T, f) = \int x(t)\, g(t - T)\, e^{-j 2\pi f t}\,dt  [EQ 5]
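A discrete sketch of the idea behind [EQ 5] (illustrative Python with a rectangular window and non-overlapping hops; practical STFTs usually overlap the windows and use smoother shapes such as the Hamming window):

```python
import cmath
import math

def stft_frames(x, win_len):
    """STFT with a rectangular window hopping by a full window length:
    returns one (1/n-normalised) magnitude spectrum per window position."""
    frames = []
    for start in range(0, len(x) - win_len + 1, win_len):
        seg = x[start:start + win_len]
        frames.append([abs(sum(v * cmath.exp(-2j * math.pi * k * r / win_len)
                               for r, v in enumerate(seg)) / win_len)
                       for k in range(win_len // 2)])
    return frames

fs = 64
t = [r / fs for r in range(2 * fs)]        # two seconds of signal
# 8 Hz in the first second, 16 Hz in the second: non-stationary.
x = [math.cos(2 * math.pi * 8 * u) if u < 1.0
     else math.cos(2 * math.pi * 16 * u) for u in t]

first, second = stft_frames(x, fs)         # one-second windows
print(max(range(len(first)), key=first.__getitem__))    # 8
print(max(range(len(second)), key=second.__getitem__))  # 16
```

Unlike the whole-signal FFT of section 1.6, each spectrum belongs to one window position, so the STFT tells us roughly when each frequency occurred.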


    Page 8

Figure 1.7.2 [R 1991]

Figure 1.7.2 shows vertical stripes in the time-frequency plane, illustrating the windowing-of-the-signal view of the STFT: windowing at time t, it computes all frequencies of the STFT. The alternative view is shown by the horizontal stripes and is based on a filter bank interpretation of the STFT: at a given frequency f, the STFT amounts to filtering the signal at every value of t using a bandpass filter whose window function is modulated to the given frequency.

The time/frequency resolution problem with the STFT

In 1975, Jean Morlet recognised a problem: unlike Fourier analysis, the STFT has the disadvantage of being imprecise about time in the high frequencies, because the size of the window is fixed. If you then make the window very small, you lose all the information about low frequencies. So Morlet took another approach, which led to the discovery of the WAVELET.

    1.8 Technical Definitions for Chapter 1

Basis Functions

A group of functions such as y = sin x, y = sin 2x, etc. forms a basis if:

i) they are all linearly independent from each other;
ii) they can form any other function in the space, i.e. they span a vector space.

[Adapted from PMA211 course notes]

Linear Independence - Formal Definition

Let v_1, ..., v_r be functions over a field F (e.g. the real numbers), and suppose there exist coefficients a_1, ..., a_r in F such that a_1 v_1 + a_2 v_2 + ... + a_r v_r = 0. If the only way this can occur is with a_1 = a_2 = ... = a_r = 0, then v_1, ..., v_r are linearly independent. Basically, this means that you can't make one of the linearly independent functions by combining multiples of the others in the same basis. Part ii) states that any function in the same space can be made up from combinations of the basis functions.



    Page 9

In the case of the Fourier Transform, the basis is made up of sinusoids. According to the definition of a basis, this means that all sinusoids are linearly independent from each other. The following is a proof of this.

A proof of the linear independence of sinusoids

First, we need to explore orthogonality and inner products. The inner product of two functions is a mapping < , > : V x V -> R, where V is a vector space of functions and R is the real numbers. We only need to concern ourselves with the mapping for continuous functions, as both sin nx and cos nx are continuous. Given the space of all continuous functions on the closed interval [a, b], the inner product < , > is defined by

< f, g > = \int_a^b f(t)\, g(t)\,dt

Two functions are orthogonal if their inner product is zero.

Step 1 - proof that cos mx and cos nx are orthogonal, where m ≠ n

< \cos mt, \cos nt > = \int_{-\pi}^{\pi} \cos mt \cos nt\,dt = \frac{1}{2}\int_{-\pi}^{\pi} \cos((m-n)t)\,dt + \frac{1}{2}\int_{-\pi}^{\pi} \cos((m+n)t)\,dt

= \left[ \frac{\sin((m-n)t)}{2(m-n)} + \frac{\sin((m+n)t)}{2(m+n)} \right]_{-\pi}^{\pi} = 0

so cos mx and cos nx are orthogonal

Step 2 - proof that sin mx and sin nx are orthogonal, where m ≠ n

< \sin mt, \sin nt > = \int_{-\pi}^{\pi} \sin mt \sin nt\,dt = \frac{1}{2}\int_{-\pi}^{\pi} \cos((m-n)t)\,dt - \frac{1}{2}\int_{-\pi}^{\pi} \cos((m+n)t)\,dt

= \left[ \frac{\sin((m-n)t)}{2(m-n)} - \frac{\sin((m+n)t)}{2(m+n)} \right]_{-\pi}^{\pi} = 0

    so sin mx and sin nx are orthogonal

Step 3 - proof that cos nx and sin mx are orthogonal

< \cos nt, \sin mt > = \int_{-\pi}^{\pi} \cos nt \sin mt\,dt = \frac{1}{2}\int_{-\pi}^{\pi} \sin((m+n)t)\,dt + \frac{1}{2}\int_{-\pi}^{\pi} \sin((m-n)t)\,dt

if n ≠ m

= \left[ -\frac{\cos((m+n)t)}{2(m+n)} - \frac{\cos((m-n)t)}{2(m-n)} \right]_{-\pi}^{\pi} = 0

if n = m, the second integrand is sin(0) = 0, leaving

= \left[ -\frac{\cos(2nt)}{4n} \right]_{-\pi}^{\pi} = 0


    Page 10

so cos nx and sin mx are orthogonal

All possible cases have been exhausted, and in each case the inner product equals zero. Therefore all sinusoids are orthogonal to each other; hence they are linearly independent and form a basis for the Fourier Transform.
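The integrals in the proof can also be checked numerically. This is an illustrative sanity check (not from the report): the rectangle rule over one full period evaluates the inner products of a few sinusoids.

```python
import math

def inner(f, g, n=4096):
    """Approximate <f, g> = integral of f(t)g(t) over one period [0, 2*pi]."""
    h = 2 * math.pi / n
    return sum(f(i * h) * g(i * h) for i in range(n)) * h

cos3 = lambda t: math.cos(3 * t)
cos5 = lambda t: math.cos(5 * t)
sin3 = lambda t: math.sin(3 * t)
sin5 = lambda t: math.sin(5 * t)

print(abs(inner(cos3, cos5)) < 1e-9)  # True: distinct cosines are orthogonal
print(abs(inner(sin3, sin5)) < 1e-9)  # True: distinct sines are orthogonal
print(abs(inner(cos3, sin3)) < 1e-9)  # True: the n = m case of step 3
print(abs(inner(cos3, cos3) - math.pi) < 1e-9)  # True: <f, f> is pi, not 0
```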

Convolution

An operation of the form x(n)*h(n) is called convolution, written x(n) * h(n), though the * symbol is often omitted. The Matlab command is conv(x, h). Convolution is used to calculate the response of a system to an arbitrary input signal by convolving it with the system's impulse response.

You can think of convolutions geometrically, but it is best to explain them mathematically, as this is what computers do when they calculate a convolution. The convolution of two sequences a and b is given by

(a * b)_k = \sum_j a_j b_{k-j}

where (a * b)_k is the k-th element of the resulting sequence. If a_j and b_j are nonzero only for j >= 0, then

(a * b)_k = \sum_{j=0}^{k} a_j b_{k-j}  [H 1995]
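The finite sum is a couple of loops in code. A small sketch (Python rather than Matlab's conv, but it computes the same sequence for the same inputs):

```python
def convolve(a, b):
    """(a * b)_k = sum over j of a_j * b_(k-j),
    for sequences that are nonzero only at indices j >= 0."""
    out = [0] * (len(a) + len(b) - 1)
    for j, aj in enumerate(a):
        for i, bi in enumerate(b):
            out[j + i] += aj * bi  # this pair contributes to element k = j + i
    return out

print(convolve([1, 2, 3], [1, 1]))  # [1, 3, 5, 3]
```

Matlab's conv([1 2 3], [1 1]) returns the same four values.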

The convolution property is most useful when applied in a transformed domain (such as frequency, the transformed domain of time in Fourier analysis). It is very hard to visualise what is happening to two signals being convolved in the time domain, but in the transformed domain convolution becomes multiplication:

T[x * y] = T[x] \cdot T[y]

That is, corresponding values of points along the x-axis of the two transformed graphs are multiplied together to form each new point.

Figure 1.8.1: Convolution in the time domain corresponds, via the Fourier Transform, to multiplication in the frequency domain.


    Page 11

    2 : Introduction to Wavelets

    2.1 Where did they come from?

Wavelets were discovered as a result of engineering, not of mathematics like most applications in signal processing. Yves Meyer was one of the first people to realise the importance of wavelets, recognising that most researchers had already been using a process resembling the wavelet process without knowing its history or functional background. Wavelet theory developed independently in a large number of areas, and it was Meyer who made the connection. He made the following comment:

"Tracing the history of wavelets is almost a job for an archeologist. I have found at least 15 distinct roots of the theory, some going back to the 1930s." [H 1995]

This dissertation focuses on the discovery of wavelets as an approach to solving the time/frequency resolution problem presented in the previous chapter.

The first use of wavelets came when Morlet was using Short Term Fourier Analysis. He was applying the STFT in a system that processed echo signals, used to aid the localisation of oil for excavation. Big windows were placed at different places on the signal; then, as the price of computing dropped further, windows were placed closer and closer together, even overlapping. Morlet's problem was that, no matter what he did, the process didn't get any better. Morlet wanted a finer local definition.

As mentioned in section 1.7, the STFT system has the disadvantage of being imprecise about time in the high frequencies (unless you make the window very small, which means losing all the information about low frequencies).

So Morlet decided on another technique. Instead of keeping the size of the window fixed and filling it with oscillations of different frequencies, he did the reverse: he kept the number of oscillations in the window constant and varied the width of the window.

    This window is called a WAVELET.

    Figure 2.1.1

    STFT Vs Wavelets

    When the wavelet is stretched, the oscillations inside of it are stretched, decreasing their

    frequency. When the wavelet is compressed, higher frequencies are produced.

    Figure 2.1.1 shows the difference between STFT and Wavelets.

    Top Row: STFT

The size of the window is fixed and the number of oscillations varies. A small window is blind to low frequencies, which are too large for the window.


Page 12

The large window loses information about a brief change, smearing it across the entire interval corresponding to the window.

    Bottom Row: WAVELETS

A mother wavelet (left) is stretched or compressed to change the size of the window. This makes it possible to analyse a signal at different scales.

    2.2 The Mother Wavelet

The mother wavelet is the building block for all other wavelets.

All wavelets are generated from a single wavelet function by a series of simple scaling and translation procedures. This two-dimensional parameterisation is obtained from the function ψ(t) by

$\psi_{j,k}(t) = 2^{j/2}\,\psi(2^{j}t - k), \qquad j, k \in \mathbb{Z}$  [B 1998]

Z is the set of all integers. The factor $2^{j/2}$ maintains a constant norm independent of scale j.

k: parameterises the time or space location.
j: parameterises the frequency or scale.

The function ψ(t) is called the generating wavelet or Mother Wavelet and defines the wavelet basis. The term basis is the same here as it was in the FT case.

    Looking at the formula, it is clear to see that there are infinitely many mother wavelets, which

    form the foundations of the Wavelet Transforms.
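As an illustration (not part of the original text), the sketch below builds members of such a family from one assumed mother wavelet, the Mexican hat, and checks numerically that the factor 2^(j/2) keeps the norm constant across scales:

```python
import numpy as np

def mother(t):
    """Mexican hat (second derivative of a Gaussian): one common choice
    of mother wavelet; any admissible psi(t) could be used instead."""
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

def psi_jk(t, j, k):
    """psi_{j,k}(t) = 2^(j/2) * psi(2^j * t - k)."""
    return 2.0**(j / 2.0) * mother(2.0**j * t - k)

# Sample the wavelets on a fine grid and verify that the L2 norm is
# (approximately) the same at every scale j, as the 2^(j/2) factor intends.
t = np.linspace(-50.0, 50.0, 200001)
dt = t[1] - t[0]
norms = [np.sqrt(np.sum(psi_jk(t, j, 0)**2) * dt) for j in (0, 1, 2, 3)]
assert np.allclose(norms, norms[0], rtol=1e-3)
```

Increasing j compresses the wavelet (higher frequency) while the amplitude factor compensates, so every member of the family carries the same energy.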

    2.3 Wavelets achieve Multiresolution

Figure 2.3.1: coverage of the time/frequency plane by (a) the STFT and (b) a wavelet transform. [G 1995]

Page 13

    Amara Graps describes the benefits of using wavelets in signal processing. Wavelets overcome

    the time/frequency resolution problem because of their ability to be stretched and compressed

    (see section 2.2).

Part a of figure 2.3.1 shows an STFT, where the window is simply a square. Because a single

    window is used for all frequencies in the STFT, the resolution of the analysis is the same at all

    locations in the time/frequency plane.

An advantage of using wavelets in a transform is that the width of the window can vary: you can have short high-frequency windows and long low-frequency windows. Part b shows the coverage of the time/frequency plane with a wavelet function.

    2.4 How do you create a Wavelet?

As mentioned earlier, there are infinitely many wavelets. Unlike the basis functions for the Fourier Transform (i.e. sinusoids), wavelets can contain many sharp corners or discontinuities.

Wavelets are obtained by altering the variables j and k given in the mother wavelet formula in section 2.2. These integer variables scale and dilate the mother function to generate different wavelet families such as the Daubechies family (see below). The scale index j indicates the wavelet's width, and the translation index k gives its position. The term position is used in the same sense as it is for the STFT; it is related to the location of the window as it is shifted through the signal.

    Figure 2.4.1

Figure 2.4.1 shows the Daubechies wavelet family with different scalings and translations. They were created by a Matlab function that used the rules described by Ingrid Daubechies in her book Ten Lectures on Wavelets. [D 1992]

    Within each family of wavelets (such as the Daubechies family) are wavelet subclasses that

    are distinguished by the number of coefficients and by the level of integration. These wavelets

are often classified within a family by the number of vanishing moments. [G 1995]

Page 14

    3 : The Continuous Wavelet Transform

    The Continuous Wavelet Transform (CWT) was developed as an alternative approach to the

    Short Term Fourier Transform (STFT) to overcome the time/frequency resolution problem

(section 1.7).

    3.1 Theory

The CWT is performed in a similar way to the STFT, in the sense that the signal is multiplied with a function, which in this case is the wavelet introduced in the previous chapter. Also, like the STFT, the transform is computed separately for different segments of the time-domain signal. The main difference between the CWT and STFT is that the width of the window is changed as the transform is computed for every single spectral component, which is probably the most significant characteristic of the CWT.

    In the STFT computation, because the window had a constant shape and size throughout the

    analysis, the frequency responses of the window were regularly spaced over the frequency

    axis. Figure 3.1.1 (a) shows what filter bank the STFT produces. A filter bank is a term used to

describe the filtering effects on the frequencies in the signal as the window moves along the signal.

    Figure 3.1.1

    [R 1991]

    In the CWT case, instead of the frequency responses of the analysis filter being regularly

    spaced over the frequency axis, they are regularly spread in a logarithmic scale (figure 3.1.1

    (b) ).

Olivier Rioul and Martin Vetterli describe how this logarithmic spacing of the filter bank models the frequency response of the cochlea, situated in the inner ear, and is therefore adapted to auditory perception. [R 1991]

    We have already introduced Wavelets as the basis function for the CWT, and that these are

    scaled and translated versions of the mother wavelet. The following formula expresses the

CWT in terms of the signal applied to the wavelet:

$CWT_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt$  [EQ 6]

This shows that the transformed signal is a function of two variables, τ and s, the translation and scale parameters respectively, and ψ(t) is the mother wavelet.

Figure 3.1.1 panels (horizontal axis: frequency f): (a) Constant Bandwidth (STFT); (b) Constant Relative Bandwidth (CWT).

Page 15

Notice that we do not have a frequency parameter as we had with the STFT (EQ 5, page 7); instead, we have a scale parameter, which is defined as 1/frequency.

    Robi Polikar made the following analogy:

"The scale parameter in the wavelet analysis is similar to the scale used in maps. As is the case of maps, high scales correspond to a non-detailed global view (of the signal), and low scales correspond to a detailed view." [P 1996]

    3.2 Computation of the CWT

    This section explains the formula given above and shows some applications.

Let x(t) be the signal to be analysed. Firstly, you need to choose a wavelet to act as the analysing window. There are several candidates (Morlet, Sombrero, Daubechies), all derived from a mother wavelet. Once the wavelet is chosen, the computation starts at s = 1 and the CWT is computed for all values of s, smaller and larger than 1. It is conventional for the value of s (the scale) to start at 1, but this doesn't always have to be the case. The procedure then continues for increasing values of s, i.e. the analysis will start from high frequencies and proceed towards low frequencies. As the value of s increases, the wavelet dilates further, so the first value of s corresponds to the most compressed wavelet.

The wavelet is placed at the beginning of the signal, at the point corresponding to t = 0. The wavelet function at scale 1 is multiplied by the signal and then integrated. The result of this integration is then multiplied by the constant 1/sqrt(s). This is for normalisation purposes only, so that the transformed signal will have the same energy at every scale. The final result is the value of the CWT at time zero (t = 0) and scale s = 1 in the time-scale plane.

The value of the transformation is calculated every time the wavelet is shifted towards the right by τ. Therefore values of the CWT are obtained at t = 0, t = τ, t = 2τ, etc. with scale s = 1 as the wavelet is shifted. This procedure repeats until the wavelet reaches the end of the signal; one row of points on the time-scale plane is then completed. Sections 3.3 and 3.4 show how these rows are represented. s is then increased by a small value and the above procedure is repeated for every value of s, where each value of s fills the corresponding row of the time-scale plane.
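The procedure just described maps directly to code. The following Python/NumPy sketch (the learning tools themselves use MATLAB; the Mexican hat wavelet and the two-tone test signal here are stand-ins) computes one row of the time-scale plane per scale, shifting the wavelet along the signal in fixed steps; squaring the modulus of the result gives the scalogram discussed in section 3.4:

```python
import numpy as np

def mexican_hat(t):
    # Stand-in analysing wavelet (real-valued, so no conjugate is needed).
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

def cwt(x, dt, scales, tau_step=1):
    """Direct (slow) CWT: one row per scale s, one column per shift tau.
    W[i, m] = (1/sqrt(s_i)) * sum_n x[n] * psi((t_n - tau_m) / s_i) * dt."""
    n = len(x)
    t = np.arange(n) * dt
    taus = t[::tau_step]
    W = np.zeros((len(scales), len(taus)))
    for i, s in enumerate(scales):
        for m, tau in enumerate(taus):
            W[i, m] = np.sum(x * mexican_hat((t - tau) / s)) * dt / np.sqrt(s)
    return W

# Example: a low tone followed by a higher tone (synthetic stand-in signal).
dt = 0.01
t = np.arange(0.0, 4.0, dt)
x = np.where(t < 2.0, np.sin(2*np.pi*1.0*t), np.sin(2*np.pi*4.0*t))

W = cwt(x, dt, scales=[0.25, 0.5, 1.0, 2.0], tau_step=10)
scalogram = np.abs(W)**2   # section 3.4: squared modulus of the CWT
```

This direct double loop makes the cost of the continuous transform obvious: every extra scale or shift adds a full pass over the signal, which is exactly the practicality problem raised in section 4.1.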

Figure 3.2.1 shows the CWT process with s = 1. The wavelet is the Morlet wavelet and is shown in yellow. Here, t represents the value of time at which the centre of the wavelet is positioned.

    Figure 3.2.2 shows the CWT process with s=5.

Figure 3.2.1: s = 1; a) t = 2, b) t = 40, c) t = 90, d) t = 140
Figure 3.2.2: s = 5; a) t = 2, b) t = 40, c) t = 90, d) t = 140


Page 16

3.3 Visualising the CWT - 3D Plot

Section 3.2 showed how the CWT is calculated by moving the wavelet window along the signal at different wavelet scales. Each time the scale is increased, a new row of a matrix is added. The matrix produced by the CWT process has the following dimensions:

x-axis: translation (depends on the value of tau)
y-axis: the number of different values of s used

Figure 3.3.1 shows a typical plot of a CWT.

    Figure 3.3.1

As described earlier, the scale parameter s in equation 6 is actually the inverse of frequency. In other words, frequency decreases as scale increases. So the portion of the graph in figure 3.3.1 with scales around zero actually corresponds to the highest frequencies in the analysis.

    3.4 Visualising the CWT - Scalograms

The Scalogram is a very common tool in signal analysis, as it provides a distribution of the energy of the signal in the time-scale plane. Olivier Rioul and Martin Vetterli recognised that the CWT is isometric and therefore preserves energy. They proved this with the following formula:

$\int\!\!\int \left|CWT_x(\tau, s)\right|^{2} \,\frac{d\tau\, ds}{s^{2}} = E_x$  [R 1991]

where $E_x = \int |x(t)|^{2}\, dt$ is the energy of the signal x(t).

This discovery led to the definition of the scalogram as the squared modulus of the CWT. Figure 3.4.1 shows an example of a typical scalogram.

    Figure 3.4.1

The Scalogram is the visual representation used in the Wavelet Learning Tool being developed as part of this dissertation.

Page 17

    4 : The Discrete Wavelet Transform

    4.1 Why not use the CWT?

As described in section 3, a signal can be transformed from the time domain to the frequency domain using the Continuous Wavelet Transform (CWT) while reducing the loss of time and

    frequency resolution. The section explained how the CWT was calculated by changing the

    shape of the wavelet, which acts as the analysis window, for each analysis frequency. The

    wavelet shape was governed by the scale parameter s where a larger s would represent a

    more dilated wavelet. The wavelet would move along the signal with each scale and calculate

the CWT coefficient for each step. The size of the steps is governed by the parameter τ.

The s and τ parameters are continuous, i.e. their values can be incremented to any value, and hence the transform is called the Continuous Wavelet Transform. Because these parameters are continuous, the CWT is not well suited to computer implementation [A 1996]. Although the Wavelet Learning Tool, being developed alongside this dissertation, uses the CWT to show how the process works and how the scalograms are produced, it is not the quickest or most practical transform to use. The tool will only allow small values of s and τ; larger values (i.e. a signal with too many samples) increase computation time dramatically.

    4.2 Discretizing the Continuous Wavelet Transform

In the Continuous Wavelet Transform, the wavelet coefficients were calculated using equation 6. As mentioned above, the CWT can't practically be computed because it contains an integral over continuous variables. It is therefore necessary to discretize the transform. The most intuitive way of doing this is simply to sample the time-frequency plane. With most transforms, the natural choice would be to sample the plane at a uniform sampling rate, but in the case of Wavelet Transforms, the scale change can be used to reduce the sampling rate. Nyquist developed the following rule, which explains the reasons for using scale to reduce the sampling rate.

Nyquist's Sampling Theorem: "If the range of frequencies of a signal measured in cycles per second is n, then the signal can be represented with complete accuracy by measuring its amplitude 2n times a second." [H 1995]

    This theorem describes how a curve with a finite number of frequencies can be represented

    exactly by a finite number of samples. Usually you would need an infinite number of samples

    in order to represent the curve exactly.

Nyquist's Sampling Theorem can be interpreted so that if the time-scale plane needs to be sampled at a rate of n1 at scale s1, the same plane can be sampled at a rate of n2 at scale s2, where s1 < s2 (corresponding to frequencies f1 > f2) and n2 < n1.

Page 18

Figure 4.2.1: Dyadic Sampling Grid [V 1995]

The Dyadic Sampling Grid shown in figure 4.2.1 is a pictorial representation of the relationship between sampling frequency and scale. As scale increases down the graph, the frequency being analysed decreases. Nyquist's rule says that the further you go down the graph, the lower the sampling rate that is needed. In the figure, the sampling rate is represented by the dots: the more dots, the higher the sampling rate. Each dot corresponds to a wavelet coefficient calculated using the Continuous Wavelet Transform. The larger the scale parameter, the fewer coefficients are needed, and therefore the quicker the computation time.

You could think of the area covered by the axes as the entire time-scale plane. The CWT assigns a value to the continuum of points on this plane, so there are an infinite number of CWT coefficients. Considering the discretization of the scale axis, among this infinite number of points only a finite number will actually be calculated, using a logarithmic rule. The base of the logarithm depends on the application, but the most common is 2 because of its convenience. An application called Subband Coding (see section 4.3) uses such a base. If base 2 is chosen, only the values 2, 4, 8, 16, 32, etc. are used for the scale parameter. The time axis is then discretized according to the discretization of the scale axis. Since the discrete scale changes by a factor of 2, the sampling rate of the time axis is reduced by a factor of 2 at every scale. You can see at each stage of the Dyadic Grid that the sample rate is halved. As a consequence, the Discrete Wavelet Transform uses only wavelets whose scale is of the form s = 2^k, where k is a whole number (see the formula for the mother wavelet on page 12).
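The bookkeeping implied by dyadic sampling is simple: each doubling of the scale halves the number of coefficients. A small sketch of this arithmetic (the 4000-sample input matches the example that appears later in figure 4.4.1):

```python
# Number of coefficients per level when the time axis is subsampled by 2
# at every doubling of scale (dyadic sampling).
def dyadic_counts(n_samples, n_levels):
    counts = []
    for _ in range(n_levels):
        n_samples //= 2          # sampling rate halves at each scale
        counts.append(n_samples)
    return counts

# A 4000-sample signal needs 2000, 1000, 500 and 250 coefficients
# at the first four dyadic scales.
print(dyadic_counts(4000, 4))   # → [2000, 1000, 500, 250]
```

Summing these counts shows why the dyadic scheme is so economical: all levels together never need more coefficients than the original signal has samples.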

    4.3 Subband Coding

There are two well-documented methods for calculating the Discrete Wavelet Transform based on the ideas expressed in section 4.2: the Multiresolution Pyramid and Subband Coding. This section gives a detailed explanation of the latter of the two, which will also be used as part of a second piece of software showing the visual effects the method has on the signal.

Driven by applications such as speech and image compression, a method called Subband Coding was proposed by Croisier, Esteban and Galand in the late 1970s, using a special class of filters called quadrature mirror filters. [V 1995]

    The Subband coding scheme, first popularised in speech compression, uses a combination of

    high-pass and low-pass filters to reduce the sample rate of the transform. Filters of different

    cut-off frequencies are used to analyse the signal at different scales. The whole Subband

    process consists of a series of these filters known as a filter bank. High-pass filters are used to

    analyse the high frequencies in the signal, and the signal is passed through a series of low-pass

    filters to analyse the low frequencies.


Page 19

The resolution of the signal, which is a measure of the amount of detail information in the signal, is changed by the filtering operations, and the scale is changed by downsampling (subsampling) operations. Subsampling a signal corresponds to reducing the sampling rate, which is equivalent to removing some of the samples of the signal. For example, subsampling by two refers to dropping every other sample of the signal (see figure 4.2.1). Subsampling by a factor n reduces the number of samples in the signal n times.

Figure 4.3.1: The Subband Coding scheme shown as a filter bank tree. [R 1991]
h(n): high-pass filter; g(n): low-pass filter; ↓2: subsampling by 2.

The procedure starts by creating a low-pass filtered version of the signal, passing the signal through a half-band digital low-pass filter. This is done by convolving the signal with an impulse response function h[n] which represents the low-pass filter. A half-band low-pass filter eliminates exactly the upper half of the signal's frequency range. For example, if a signal has a maximum component of 1000 Hz, then half-band low-pass filtering removes all the frequencies above 500 Hz.

There is an important thing to consider when talking about frequency in the discrete case, which is explained as follows.

In discrete signals, frequency is usually expressed in terms of radians. As a result of this, the sampling frequency of the signal is equal to 2π radians in terms of radial frequency. Therefore, the highest frequency component that exists in a signal will be π radians if the signal is sampled at the Nyquist rate (which is twice the maximum frequency that exists in the signal, see page 17); that is, the Nyquist rate corresponds to π rad/s in the discrete frequency domain. Therefore, using Hz is not appropriate for discrete signals. However, Hz is used whenever it is needed to clarify a discussion, since it is very common to think of frequency in terms of Hz. It should always be remembered that the unit of frequency for discrete-time signals is radians.
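The conversion between Hz and discrete radial frequency is ω = 2πf/f_s, so the sampling frequency always maps to 2π radians and the Nyquist frequency f_s/2 to π radians. A quick check with illustrative numbers:

```python
import math

def discrete_omega(f_hz, fs_hz):
    # omega = 2 * pi * f / fs  (radians per sample)
    return 2 * math.pi * f_hz / fs_hz

# The sampling frequency itself corresponds to 2*pi radians,
# and the Nyquist frequency fs/2 to pi radians.
assert math.isclose(discrete_omega(2000, 2000), 2 * math.pi)
assert math.isclose(discrete_omega(1000, 2000), math.pi)
assert math.isclose(discrete_omega(500, 2000), math.pi / 2)
```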

After passing the signal through a half-band low-pass filter, half of the samples can be eliminated. This follows Nyquist's rule, since the signal now has a highest frequency of π/2 radians instead of π radians. Simply discarding every other sample subsamples the signal by two, and the signal will then have half the number of data points. The low-pass filtering removes the high-frequency information but leaves the scale unchanged. Only the subsampling process changes the scale (see figure 4.3.2). Resolution, on the other hand, is related to the amount of information in the signal, and is therefore affected by the filtering operations. Half-band low-pass filtering removes half of the frequencies, which can be


Page 20

interpreted as losing half of the information. Therefore, the resolution is halved after the filtering operation. Half the samples can then be discarded without any loss of information, since they are redundant. Basically, the low-pass filtering halves the resolution but leaves the scale unchanged. The signal is then subsampled by 2, since half of the samples are redundant; this doubles the scale.

This completes one level of the Subband decomposition.

This can be repeated for further decomposition. At every level, the filtering and subsampling result in half the number of samples (and hence half the time resolution) and half the frequency band spanned (and hence double the frequency resolution).
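One level of this filter-and-subsample step can be sketched with the simplest quadrature mirror pair, the normalised Haar filters (chosen here purely for illustration; practical subband coders use longer filters):

```python
import numpy as np

# Haar half-band filter pair (the simplest choice; illustrative only).
g = np.array([1.0, 1.0]) / np.sqrt(2.0)   # g(n): low-pass
h = np.array([1.0, -1.0]) / np.sqrt(2.0)  # h(n): high-pass

def subband_level(x):
    """One level of subband decomposition: filter, then subsample by 2."""
    low = np.convolve(x, g)[1::2]    # approximation (low-pass) coefficients
    high = np.convolve(x, h)[1::2]   # detail (high-pass) coefficients
    return low, high

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
low, high = subband_level(x)

# Each output branch has half the number of samples of the input.
assert len(low) == len(high) == len(x) // 2
```

Iterating `subband_level` on the `low` output reproduces the filter bank tree of figure 4.3.1, halving the number of samples and the frequency band spanned at every level.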

    Figure 4.3.2

    Resolution and scale changes in discrete time

    4.4 Example of Subband Coding

We have shown how to decompose a sequence into two sub-sequences at half rate by using a bank of half-band filters. This process can be iterated on the lower-band sequence to achieve finer frequency resolution at lower frequencies. Repeating the process once on the first low band creates a new low band, which corresponds to the lower quarter of the frequency spectrum. Each further iteration halves the frequency band of the signal. Figure 4.4.1 shows the result of applying a signal to the Subband Coding scheme; each stage shows how the signal is subsampled and how the frequency band is reduced by half.

The signal in blue is a sound wave made up of two tones produced by someone whistling. The first region is a low tone and the second is a distinctively higher tone. Its scalogram image, produced by the Wavelet Learning Tool being developed alongside this dissertation, is also given below. This was computed using the Morlet wavelet, and it clearly shows the two distinct frequencies.

Figure 4.4.1: Outputs from the Subband Coding Scheme

(Figure 4.3.2: half-band low-pass filter: resolution halved, scale unchanged; followed by subsampling by 2: resolution halved, scale doubled.)

Input: 4000 samples.
High-pass filter 1: 2000 samples, f in (π/2, π)
High-pass filter 2: 1000 samples, f in (π/4, π/2)
High-pass filter 3: 500 samples, f in (π/8, π/4)
High-pass filter 4: 250 samples, f in (π/16, π/8)

Page 21

Figure 4.4.1 shows the results from each stage (iteration) of the Subband Coding scheme. The example given is of a signal comprised of a low-frequency tone followed by a high-frequency tone. Throughout the different stages of the Subband Coding scheme, the signal has been high-pass filtered and then downsampled. The outputs show the signal at the different stages of the scheme. It is clearly visible that at the start of the process only the high-frequency components are present but, as the process goes on, each successive output covers a lower frequency band until only the very low frequencies are left. This is demonstrated by the fact that the high-frequency tone has been completely filtered out after 4 filters.

The output signals given in figure 4.4.1 are taken from the Subband Learning Tool developed alongside this dissertation.

Page 22

    5 : Computer Assisted Learning Tools

    5.1 Computer Assisted Learning (CAL)

As this dissertation researches the fairly new field of Wavelet Theory, a software package is being developed to aid understanding of the processes involved. The fact that there hasn't been much software development in this field gives all the more reason to develop a package now, which can be used to complement the teaching of the subject. Generally speaking, the software package being developed belongs to the family of Computer Assisted Learning (CAL) tools. This chapter discusses the advantages and disadvantages of using CAL and whether a CAL tool is appropriate in this situation.

"Computer Assisted Learning means (in a broad sense) using computers in education for all kinds of purposes." [ICASSP vol2 1995]

    When constructing CAL tools, it has been recognised that they should provide flexibility for

    the student involved and also be stimulating enough so that the student can construct private

    concepts rather than reproducing given explanations. [K 1996]

The advantage of having such a tool is that users can control their own access to the information being taught. In this way, the flexibility of the system allows students to adapt the available information streams to their mental needs at any given moment. The disadvantage of this free access to the CAL tool is that lecturers have no control over the flexibility of the system. Lecturers need to anticipate how much the students will take advantage of this free access, which may lead to students not fully understanding the basic concepts that the CAL tool was developed for in the first place. Because of this flexibility problem, CAL tools cannot solely be used as a method for teaching a new subject, but there is plenty of evidence below to suggest that there are great advantages to using a CAL tool alongside a series of lectures.

Lecturers Martin Cooke and Guy Brown have researched possible situations where a CAL tool is useful in teaching speech and hearing, which may therefore aid in the teaching of Wavelets. Their development of the Matlab Auditory Demos (MAD), to which the Wavelet Learning Tool being developed as part of this dissertation would contribute, has given the two authors a deep understanding of when a CAL tool is appropriate.

They initially recognised the following problem:

"The courses in speech and hearing typically introduce large amounts of unfamiliar material to participants with backgrounds almost as varied… the domains of speech and hearing involve intangible signals, ill-suited to traditional styles of presentation… the possibilities for misinterpretation are immense and, in our experience, difficult to predict" [C 1999]

    Recognising this problem, it was clear that CAL tools would be an appropriate solution, due to

    the scope for interaction and experimentation.

Matti Karjalainen and Martti Rahkila also recognised the fruitfulness of using a CAL tool in teaching Signal Processing. They also understood that a CAL tool must in some way be more useful than ordinary teaching methods. [ICASSP vol2 1995]

Kommers, Grabinger and Dunlap recognised that there are three main areas of most CAL tools in which there are significant advantages to learning: Resource, Communication and Exploration. [K 1996]

Resource: Paper-based documents are restricted to text, tables, schematic line drawings and pictures, whereas hypermedia allows sound and video sequences as well. CAL tools can provide multiple dimensions in the meanings of expressed ideas, e.g. hearing a property of a sound wave provides a better and more natural understanding than a picture of the sound wave.

Page 23

However, it is important that the resources provided by the tool match those provided by the lecturer. If the tool and the lectures use the same word for different meanings, the learning experience is weakened.

Communication: This is based on the idea that a system should be programmed so that a dialogue can evolve between the machine and the learner. The actual bandwidth of communication in a CAL tool is very low and will probably not feature greatly in the tools being developed. However, a user guide will be produced to help users use the tools to best effect; see Appendix B.

Exploration: Computer simulation programs are themselves convincing demonstrations of their educational value. Confronting a student with a simulation allows more drastic, flexible and critical manipulations. In a book, only a few examples may be given for a particular property, but with a CAL tool, explorations into many other instances of that property can be carried out. The exploration property allows students to learn by discovery.

The CAL tools for this dissertation are developed with the teaching of the subject in mind, and with a view to how they can complement and reinforce a lecture course by providing hands-on experience. The software has to be fairly easy to use, to avoid early frustration due to a user's inexperience, and must also be visually suitable to make it clear what is happening. Just looking at the previous chapters demonstrates how mathematical the theory of Wavelets is, and to a computer science student, who may or may not have a good mathematical background, it may seem very demanding. The tool will help students to see visually what the maths represents and enhance their willingness to learn more about the subject, instead of being intimidated by the mathematical content. After all, the actual use of Wavelets is to aid Signal Processing, so audio and visual demonstrations are an obvious key to teaching the subject.

    5.2 Which programming language?

There are many different programming languages from which you can develop a learning tool, and they all have their advantages and disadvantages, so it is sometimes difficult to choose which one to use without seeing the benefits.

Matti Karjalainen and Martti Rahkila constructed a CAL tool using the QuickSig object-oriented environment. This provides signal-processing tools for many application domains, using the concept that signals and related concepts are represented as objects and the operations on them are typically implemented by method functions. Another advantage is that a wide range of functions such as filtering and transforms (like the FFT) are built in. However, this programming environment has a problem with portability: QuickSig is Macintosh-specific and also requires the languages Lisp and CLOS. So QuickSig is ill suited to large class sizes, and students would not be able to run the software on their home machines.

    The CAL tools being developed in this dissertation use MATLAB.Matlab is a high level programming language, which provides many facilities for data

    visualisation and numerical computation. It doesnt have the problem with portability, as a

    version of MATLAB is available for most operating systems. MATLAB lends itself to

    prototype programming, as it provides good facilities for quick interface construction. MATLAB

    also has high-level support for sound handling and signal processing. As it is a mathematical

    language, it lends itself to the use of vectors and matrices and makes it very straightforward to

    plot graphs of signals. Martin Cooke and Guy Brown recognised that MATLAB is a sensible choice compared to languages such as Java. They noted that developing an application in Java would

    be time-consuming because the Java application programming interface (API) has no equivalent of the

    signal processing toolbox available in MATLAB. A Java application for signal processing

    would probably be too slow for any adequate user interaction.


    6 : Requirements Analysis

    The previous chapter introduced the notion of a Computer Assisted Learning tool and how

    they can be used to assist in the teaching of Wavelet Theory. This dissertation will now focus

    on the development of such a tool.

    6.1 The initial requirement

    At the very beginning, before any work was carried out, there was a small brief on what the

    project should cover. This brief also contained a short paragraph on the basic requirements of

    the system that was also to be developed:

    A MATLAB application will be designed and implemented that allows a wavelet

    representation of sound to be generated using the DWT and modified by direct on-screen

    manipulation (e.g. by removing components at certain scales); the inverse DWT will then be applied to resynthesize a sound waveform which can be played to the listener.

    This statement details the basic ideas of the system and was considered to be the backbone of

    the system. It is clear that the system needs to include a function to work out the DWT, and

    also the IDWT, and some user interaction to alter the scales on screen before the IDWT is

    calculated. There are no clues to how the GUI should look, or how the user interacts with it.

    Even though the initial requirement states that the DWT should be calculated, there was no

    obvious indication of how the results of these calculations were to be displayed. Also, which

    wavelet is to be used to calculate the DWT?

    All of these questions needed to be answered, so an interview was set up between myself (the developer) and the client, to attempt to make clear what was needed.

    6.2 The Client and Developer scenario

    Before starting the design of any software system, it is important that you have all the

    requirements clear first. The best way to do this is for the developer to ask a multitude of

    questions to the client in order to extract important information about the system that the client

    may not have previously given.

    In any software development program, the client will have a total understanding of the

    problem, as he is the one with the expertise on the subject. It is all too common for clients to

    assume that a software developer is also an expert on their particular field of work. However, this is almost never the case, and the software developer will have a limited understanding of

    the problem and may be confused by the client's initial requirements. It may be the case that the initial requirements are vague or even impossible to implement, so the idea of having an

    interview is to explore and develop the client's narrow goals and to fill in any gaps of

    understanding.

    The interview session took place in the first week of development of this project and some of

    the questions that were raised are detailed below. Most of the questions were a result of studying the initial requirement and of early research into the topics of Wavelet Theory

    before the project began. The answers given are not direct quotes, but they summarise the

    discussion.


    Q. Do you want a visual representation of the effects that different wavelets have?

    This two-part question seeks to clarify how the result of the DWT should be

    presented and to find out whether the user should be able to compare the effects of different

    wavelets.

    A. The result of the process of using the Wavelet Transform should be represented by a scalogram, so the user can see all the coefficients calculated on time-frequency axes. This

    scalogram can then be manipulated ready for the inverse transform.

    The user should be able to choose from a selection of wavelets so that they can compare the different effects they have on the process. A good idea would be to have a side-by-side

    comparison.

    After the discussion, it was realised that the answer given by the client to this question

    contradicted the initial requirement, and also the text that had been studied. In the

    requirements, it was clear that the system should use a Discrete Wavelet Transform to analyse

    the signal. However, the answer to the question suggested that the result of the transform should be presented visually using a scalogram. Section 3.4 explains that a scalogram is calculated by taking a Continuous Wavelet Transform matrix and squaring the magnitudes of the coefficients. This means that the requirements should be changed to use the CWT and not the DWT. After discussing this, it became the new requirement, and it was

    agreed that another piece of software would be developed to show how the DWT could be

    calculated using Subband Coding.
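The relationship between the two representations described in section 3.4 is simple enough to state in one line. The sketch below uses a small made-up matrix to show it; Python and NumPy are used here purely for illustration, as the tool itself is written in MATLAB:

```python
import numpy as np

# A tiny, made-up CWT coefficient matrix: rows are scales, columns are times.
cwt_matrix = np.array([[1.0, -2.0],
                       [0.5,  3.0]])

# Scalogram: the square of the magnitude of each coefficient (section 3.4).
scalogram = np.abs(cwt_matrix) ** 2
# scalogram is [[1.0, 4.0], [0.25, 9.0]]
```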

    Q. Do you want a visual breakdown of the CWT process?

    After recognising that it was in fact the CWT that would be used in the software, this question approached the subject of using animation to picture the CWT process, as well as the scalogram showing the results it gives.

    A. Yes. Showing the wavelet as it compresses or dilates to calculate each scale of the CWT

    would be beneficial.

    Q. How do you want the on-screen manipulation to work?

    The initial requirement suggested that the user should be able to use on-screen manipulations to remove scales from the CWT in order to re-compute the inverse transform. However, it is unclear how this could be done, especially in MATLAB, where graphics are more limited than in

    Java, say.

    A. It was suggested that the user could directly manipulate the scalogram by using some sort of

    cursor. The cursor could move up and down the scalogram and a window would show the

    CWT coefficients at that scale. The cursor could then be used to select certain scales to be

    removed.

    Q. Would you like to be able to play back the sound waves?

    A. Yes. You need to be able to play the original sound wave that is to be analysed, and also

    you need to hear the effects of the re-synthesis of the scalogram using the ICWT.

    Q. Do you want to be able to actively control the scaling and translation coefficients of

    the wavelet used?

    The CWT coefficients are calculated at every scale by using the scaling coefficient in the

    mother wavelet formula (see section 2.2). The number of rows in the CWT matrix depends on

    the number of scales calculated. Also, the number of columns in the CWT matrix depends on

    the size of the steps the wavelet takes as it moves along the signal between each coefficient

    calculation. The steps are controlled by the translation coefficient. Altering the values that

    these coefficients take will change the number of CWT coefficients to be calculated, which

    may be a useful property.

    A. Yes. You could use a slider.
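As a rough numerical illustration of the point above, assume one row per integer scale up to some maximum, and one column per position of the wavelet as it steps along the signal. Both the sizing rule and the function name below are assumptions made for this sketch, not code from the tool:

```python
import math

def cwt_matrix_shape(signal_length, max_scale, translation_step):
    # Hypothetical sizing rule: scales 1..max_scale give the rows; the
    # number of wavelet positions at step size translation_step gives
    # the columns.
    rows = max_scale
    cols = math.floor((signal_length - 1) / translation_step) + 1
    return rows, cols

cwt_matrix_shape(1000, 64, 1)  # (64, 1000)
cwt_matrix_shape(1000, 64, 4)  # (64, 250)
```

Increasing the translation step or reducing the number of scales shrinks the matrix, which is why exposing these parameters (e.g. via a slider) is useful.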


    6.3 The Matlab Auditory Demos

    The software tools being developed alongside this dissertation will form part of the Matlab

    Auditory Demo (MAD) CAL tools. These tools have been developed to aid in the teaching of Computer Speech and Hearing courses at the University of Sheffield. [C 1999]

    Because these MADs are Computer Assisted Learning tools, they need to fulfil the requirements mentioned in chapter 5. The MADs consist of many different tools that enable a

    user to understand many different concepts in the subject of Computer Speech and Hearing.

    They vary from producing spectrogram representations of speech waveforms to the complex

    modelling of the Basilar Membrane in the ear.

    Even though there is a wide variety of software systems comprising the MADs, they all have

    many things in common that enable them to be successful CAL tools. The research into the

    MADs produced the following further requirements.

    Speed

    The software tools that are being developed must be quick enough to allow sufficient user

    interaction and provide meaningful animation. A quick system will maintain the user's

    interest, which will naturally enhance the user's understanding and learning of Wavelets.

    Ease of use

    Research into the MADs showed that all the systems were user-friendly, in such a way that the user could see almost straight away what functionality was available. GUI objects are clearly

    labelled and are only accessible at the appropriate times. Axes are labelled appropriately and

    on-screen instructions appear when appropriate to guide the user. None of the systems are too

    cluttered with too many buttons, sliders, etc., which would only confuse a new user, especially

    if the user is new to the subject being investigated.

    Aesthetics

    It is important that the interface is pleasing to look at. If the system is dull in appearance, the

    user would not be as interested in using the system, so the teaching of the subject would suffer. The system should provide a suitable amount of colour to aid in the user's understanding; e.g. the system could be colour-coded so that different components have their own colour. Also, the

    tools being developed should match the appearance of the MADs so that it is clear that they

    belong to that group. The tools should have similar headers, size, background colour etc.

    Input error recovery

    It is important that a CAL tool recovers well from input error. Ideally, the system shouldn't allow

    a wrong sequence of inputs, i.e. by disabling GUI objects, but it is often the case that a user would input something wrong either by mistake or by their lack of understanding about the

    material or the system. It is here that the system should recognise that a user has entered the

    wrong value, and correct the error appropriately.
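One simple way such a correction could work, sketched here in Python purely for illustration (the tool itself handles this in MATLAB), is to round and clamp a user-entered scale number into the valid range rather than rejecting it:

```python
def corrected_scale(raw_value, n_scales):
    # Round non-integer input and clamp it into the valid range 1..n_scales;
    # fall back to the lowest scale if the input isn't numeric at all.
    try:
        value = round(float(raw_value))
    except (TypeError, ValueError):
        return 1
    return min(max(value, 1), n_scales)

corrected_scale("7.6", 32)  # 8
corrected_scale(-3, 32)     # 1
corrected_scale("abc", 32)  # 1
```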


    7 : Software Development

    After the Requirements Analysis, it was proposed that two pieces of software would be

    developed. The main one, called the Wavelet Learning Tool (WLT), will be developed and

    documented in detail, and will incorporate all the requirements set out in the previous chapter. The second piece of software will be a basic tool, which will show the different stages in the

    Subband Coding scheme and is called the Subband Learning Tool (SLT).

    7.1 Getting familiar with MATLAB

    In simplest terms, MATLAB is a computer environment for performing calculations. [R 1998]

    MATLAB is a contraction of Matrix Laboratory, and is primarily a convenient tool

    for the manipulation of matrices. Since it was first created, it has added more and more

    functionality and remains a leading tool for scientific computation. While simple problems can be solved interactively with MATLAB, its real power shows when you give it calculations

    that are extremely cumbersome or tedious to do by hand. Because the Continuous Wavelet

    Transform involves constructing a very large matrix, and then displaying that matrix,

    MATLAB seemed to be a natural choice.

    MATLAB also allows graphics to be displayed with ease and with little programming as it

    combines an efficient programming structure with a multitude of pre-defined mathematical

    commands. Therefore, before any software development, it was beneficial to become familiar with

    what MATLAB has to offer in terms of commands and interface construction.

    7.1.1 MATLAB's GUI

    MATLAB provides many different GUI objects to make user interaction as easy as possible. Each object has its own advantages depending on the type of operation. Below is a

    comprehensive list of what is available.

    Push Buttons: The software user can press buttons for instant execution of a particular function.

    Pop-up Menus: Contain a list from which one item can be selected, e.g. a wavelet type. After selection, a process can be initiated.

    Edit Boxes: Used to alter a numerical parameter in a function.

    Check Boxes: Often used when there is an option to have a particular property in the system or not.

    Radio Buttons: Similar to check boxes, except they usually come in pairs where you select one or the other.

    Sliders: Also used to alter a parameter in a function but, unlike edit boxes, a value doesn't need to be known, and the function can update in real time as the slider moves.

    List Boxes: Similar to pop-up menus, except that the whole of the list can't be visualised at once, to save space.


    MATLAB also allows the construction of axes, lines, text, figures and other graphical devices, which

    were taken into consideration when sketching the initial interface design.

    Each GUI object has attached to it a series of handles. These handles define many properties of

    the objects, which are set and controlled by the developer. The properties are set when the

    software is first executed, but can be changed at any time during software development.

    Controlling the handles to the GUI objects controls how the system will look and behave, and how

    efficiently it will run. Therefore, a good understanding of the handles available and how to use them appropriately was an important stage in software development.

    7.1.2 Coding in MATLAB

    MATLAB is a function-based programming language rather than an object-oriented one. Functions call

    other functions, which are executed in a sequential fashion. Each function is read downwards unless a for or while loop or an if clause is encountered.

    The Wavelet Learning Tool will consist of many different functions, which will call each other appropriately. The MATLAB system will have to cope with the flow of control between these

    different functions, which are written in m-files, which can be called at any time within a

    program. However, keeping track of all these m-files can be very difficult and looks very untidy. The solution to this is the case statement. Instead of having many different m-files

    containing the different functions of the system, it is better practice to place all the functions

    into one m-file separated by a case statement.

    Figure 7.1.2.1

    Every time the m-file containing all the switch statements is called, it must be called with an

    appropriate argument corresponding to which case is to be read.

    Another advantage to this case switching approach is that you can have an initialisation case

    where all the global parameters can be set when the system is first executed.
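The same single-entry-point idea can be sketched outside MATLAB as well. The Python fragment below mirrors the pattern of one file dispatching on an action argument; the action names and the `state` dictionary (standing in for MATLAB global variables) are illustrative assumptions, not the tool's actual cases:

```python
state = {}  # stands in for MATLAB's global variables

def wlt(action, *args):
    # One entry point, dispatching on the action string, like a MATLAB
    # m-file built around a single switch/case statement.
    if action == "initialise":
        state["signal"] = []             # set global parameters once
    elif action == "load_signal":
        state["signal"] = list(args[0])  # store the loaded waveform
    elif action == "signal_length":
        return len(state["signal"])      # placeholder for real processing
    else:
        raise ValueError("unknown action: " + action)

wlt("initialise")
wlt("load_signal", [0.0, 0.5, 1.0])
wlt("signal_length")  # 3
```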

    Matlab programming revolves around designing separate functions to do separate jobs, and

    then plugging them into the overall system to interact with the interface. This plug-in nature of

    programming allows each function to be tested independently before being encapsulated into

    the system. Most of the functions being developed involve matrix manipulation and can be

    tested simply by running them on the Matlab command window to see if they produce correct

    results. Some functions need not be tested separately from the system if their sole purpose is to

    just change global variables or alter the interface in some way. The functions that are

    constructed from mathematical knowledge followed the sequence of development shown in

    figure 7.1.2.2.

    [Figure 7.1.2.1: separate m-files (CWT, LoadSignal, ZoomIn) replaced by a single m-file containing a switch statement with a case for each function: case CWT, case Load Signal, case Zoom in, end.]


    Figure 7.1.2.2

    One of the most useful GUI object handles is the enable handle. This allows the software

    developer to control when a user can use a particular GUI object. To stop the wrong sequence

    of inputs into the system, a particular object can be disabled at a time when it shouldn't be

    used. These restrictions will make the software more secure and more likely to recover from

    user input error.

    7.2 Wavelet Learning Tool - Interface Design

    The main system to be developed is the one identified in the requirements analysis, the

    Wavelet Learning Tool (WLT). This tool will allow users to choose different wavelets to

    compute a CWT and corresponding scalogram. A side by side comparison of different

    wavelets was recommended, as was a way of manipulating the scalogram to produce a re-

    synthesised signal from an Inverse CWT function.

    Matlab allows you to construct an interface very simply and effectively with very little code. It

    is very beneficial to design the interface in the early stages of development, as it will aid

    in the understanding of what functionality is needed in the final system. Knowing which GUI

    objects were available, an initial sketch of the interface was made and presented for comment.

    Figure 7.2.1

    The sketch shown in figure 7.2.1 is an early representation of the proposed look of the software tool. The original idea was to have the following GUI objects.

    [Figure 7.1.2.2: development sequence for the mathematically derived functions: mathematical formula; code into MATLAB; test separately on the command window; introduce appropriate interface code and handle commands; plug into the system using the case clause; test the system.]


    5 graphs: a plot of the loaded signal; 2 for showing the animated wavelet as it changes shape to calculate the different scales of the scalogram; a plot of the CWT coefficients from the selected scale of a scalogram; and a plot of the reprocessed signal.

    2 scalogram images, each with their own cursor.

    Pop-up menus for selecting a wavelet.

    Zoom controls for zooming in on the plot of CWT coefficients.

    Process buttons for re-processing the CWT.

    Play buttons for playing back the original and altered signals.

    7.3 The coding of The Wavelet Learning Tool

    Section 7.2 shows how an initial interface was designed to accommodate the requirements

    established in the requirements analysis. This initial interface was then coded into MATLAB with

    the help of the Guide tool. The Guide tool is a Matlab interface construction tool, which allows

    a user to code an interface quickly and effectively without the tedious task of setting all the

    handles to each GUI object. Although the Guide tool writes the interface code for you, it is in a

    format which didn't suit the interactive nature of the system; therefore it was only used to set the basic handles such as position and colour, whilst other handles such as call-backs were

    coded by hand.

    Having developed the interface first, it was then logical to develop the main functions, which would be called by the user via the interface. There are two main functions: the Continuous

    Wavelet Transform (CWT) and the Inverse Continuous Wavelet Transform (ICWT).

    This section describes the main coding developments that were undertaken during the coding of the Wavelet Learning Tool. By no means is this a fully comprehensive description of all the

    coding, as it does not include full details e.g. of global parameters or handle properties. To see

    a full development log, see appendix C.


    7.3.1 Coding the CWT

    The requirements state that the user must be able to compare the effects of using different

    wavelets with the CWT. The interface contains pop-up menus, which contain a list of the

    different wavelets available for the computation. The main job of the CWT function is to

    calculate the CWT coefficients at each scale of the wavelet transform and to plot these into a

    CWT matrix. Chapter 3 described how this matrix could then be used to produce a scalogram

    of the CWT by squaring each coefficient independently. The scalogram will have time on the x-axis and scale on the y-axis.

    There are three separate CWT functions, one for each wavelet (Gauss, Morlet and Sombrero).

    Each calculates the CWT matrix and sets this as a global variable. The CWT matrix is

    calculated in the usual way by changing the shape of the wavelet at each scale of the

    transform. The CWT functions first calculate how many scales to compute depending on the length of the signal. Recalling from chapter 3, as the scale value increases, the frequency

    decreases and therefore the wavelet needs to dilate. So that the user can see the final shape of the

    wavelet, the CWT function calculates the largest scale first and then compresses the wavelet

    until the scale=1. The different shapes of the wavelet are stored in another matrix, which is

    also set as a global variable for other functions to use.
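The computation described above can be sketched compactly. The Python/NumPy fragment below is a naive illustration under stated assumptions (a Mexican hat, i.e. "sombrero", wavelet; integer scales; unit translation steps); it is not the tool's MATLAB code:

```python
import numpy as np

def sombrero(t):
    # Mexican hat ("sombrero") wavelet: second derivative of a Gaussian.
    return (1.0 - t**2) * np.exp(-(t**2) / 2.0)

def cwt(signal, scales, wavelet=sombrero):
    # Naive CWT: one row per scale, one column per translation position.
    n = len(signal)
    coeffs = np.zeros((len(scales), n))
    t = np.arange(n)
    for i, s in enumerate(scales):
        for b in range(n):  # slide the wavelet along the signal
            w = wavelet((t - b) / s) / np.sqrt(s)
            coeffs[i, b] = np.dot(signal, w)
    return coeffs

sig = np.sin(2 * np.pi * np.arange(128) / 16.0)  # a simple test tone
C = cwt(sig, range(1, 33))                       # scales 1..32
scalogram = np.abs(C) ** 2                       # squared magnitudes
```

Each row of `scalogram` corresponds to what the tool plots when the cursor selects a particular scale.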


    When the user selects a wavelet from a pop-up menu, either the function waveselecta or waveselectb is called, depending on which pop-up menu is executed. The waveselecta and

    waveselectb functions are the main functions in the WLT program. These functions control all

    the data handling and graphics handling of the system when a new CWT is calculated. Each

    function controls its own part of the system, either A or B, which corresponds to which pop-up

    menu is selected. The interface is split into A and B to allow for the side-by-side comparisons.

    Both functions were developed simultaneously as they both contain similar functionality. The

    functions were carefully developed to provide the following sequence of operations:

    The function reads which wavelet has been selected,

    then calls the appropriate CWT function.

    (The CWT function calculates the CWT matrix and the matrix of wavelet shapes; see above.)

    A graph is animated with the different forms that the wavelet takes at each scale of the CWT.

    The function then presents the CWT matrix as a scalogram.

    The scalogram's cursor is drawn onto scale number 2.

    The PlotCWT function is called.

    The above list gives a summary of the functionality within the two main functions of the

    system, namely waveselecta and waveselectb. These functions were the first to be developed

    after the coding of the CWT function, as there needed to be a way of calling the function with

    different arguments.

    The summary states that after the displaying of the scalogram, the main function then calls a

    function called PlotCWT. This function plots a graph of the CWT coefficients from a selected

    scale of the scalogram. The scale is selected by the cursor and, as the cursor is re-drawn after

    the construction of a new scalogram, the CWTPlot function needs to be called to update the

    plot. The CWTPLot function is also called every time the cursor is moved, to update the plot