On the auditory discrimination of spectral shape

Citation for published version (APA):
Versfeld, N. J. (1992). On the auditory discrimination of spectral shape. Eindhoven: Technische Universiteit Eindhoven. https://doi.org/10.6100/IR377008

DOI: 10.6100/IR377008

Document status and date: Published: 01/01/1992

Document Version: Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:
• A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper including the volume, issue and page numbers.

General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.
• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
• You may not further distribute the material or use it for any profit-making activity or commercial gain.
• You may freely distribute the URL identifying the publication in the public portal.

If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement: www.tue.nl/taverne

Take down policy
If you believe that this document breaches copyright please contact us at: [email protected] providing details and we will investigate your claim.

Download date: 13. Jun. 2020



On the auditory discrimination of spectral shape

DISSERTATION (PROEFSCHRIFT)

submitted to obtain the degree of doctor at the Technische Universiteit Eindhoven,

on the authority of the Rector Magnificus, prof. dr. J.H. van Lint, to be defended in public before a committee appointed by the College van Dekanen (Board of Deans)

on Friday 3 July 1992 at 16.00 hours

by

Nicolaus Jacobus Versfeld

born in Eindhoven

Printed by: wibro dissertatiedrukkerij, Helmond.

This thesis has been approved by the promotors prof. dr. A.J.M. Houtsma and prof. dr. R. Collier

and the copromotor dr. T. Houtgast.

This research was carried out at the Instituut voor Perceptie Onderzoek (IPO) in Eindhoven, and was financially supported by the PSYCHON foundation of the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO).

"Perhaps," he added, "to your ear
That sounds an easy thing?
Try it yourself, my little dear!
It took me something like a year,
With constant practising."

From: Phantasmagoria, Lewis Carroll

To Anja

Preface

The research described in this thesis was carried out at the Instituut voor Perceptie Onderzoek in Eindhoven, in the period from 1 June 1988 up to and including 31 May 1992, during the time that the author was temporarily employed by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek.

I would like to thank everyone who, in his or her own way, has contributed to the realization of this thesis. In particular I thank Aad Houtsma, who managed, first through a lecture course and then through an internship and graduation project, to arouse and sustain my interest in psychoacoustics. I also thank Theo de Jong, without whom it would have been impossible for me to extract even a single data point from my subjects. Finally, I am especially grateful to the members of the doctoral committee, prof. dr. R. Collier, dr. ir. T. Houtgast, dr. B.C.J. Moore and prof. dr. ir. R. Plomp, for the trouble they took to assess and comment on the thesis.

Niek Versfeld

Contents

Introduction

1 Perception of spectral changes in multi-tone complexes
  1.1 Introduction
  1.2 Experiments
    1.2.1 Experiment I
      Stimuli
      Procedure
      Subjects
      Determination of thresholds
      Results & Discussion
    1.2.2 Experiment II
      Stimuli
      Procedure
      Subjects
      Results & Discussion
    1.2.3 Experiment III
      Stimuli
      Procedure
      Subjects
      Results & Discussion
    1.2.4 Learning effects
  1.3 General Discussion

2 Discrimination of changes in the amplitude of two-tone complexes
  2.1 Introduction
  2.2 Characteristics of two-tone complexes
  2.3 The EWAIF model and its variants
  2.4 The multi-channel model
  2.5 Experiments
    2.5.1 General description
      Stimulus conditions
      Procedure
    2.5.2 Experiment I
      Stimuli
      Results & Discussion
    2.5.3 Experiment II
      Stimuli
      Results & Discussion
    2.5.4 Experiment III
      Stimuli
      Results & Discussion
    2.5.5 Experiment IV
      Stimuli
      Results & Discussion
  2.6 General Discussion

3 Discrimination of spectral changes in noise bands
  3.1 Introduction
  3.2 Experiments
    3.2.1 General description
      Stimuli
      Procedure
    3.2.2 Experiment I
      Results & Discussion
    3.2.3 Experiment II
      Results & Discussion
    3.2.4 Experiment III
      Results & Discussion
    3.2.5 The EWAIF model
    3.2.6 The multi-channel model
  3.3 General Discussion

Summary

Appendix A: A novel adaptive staircase procedure
  A.1 Introduction
  A.2 Experimental procedure
  A.3 Example
  A.4 Discussion

Appendix B: Three-interval, three-alternative forced-choice paradigms
  B.1 Introduction
  B.2 Signal Detection Theory
  B.3 The regular 3I3AFC paradigm
    B.3.1 Case 1: Labeling is possible
    B.3.2 Case 2: Labeling is not possible
  B.4 Triangular paradigm
    B.4.1 Case 1: Labeling is possible
    B.4.2 Case 2: Labeling is not possible
  B.5 Discussion

Appendix C: Fitting the psychometric function
  C.1 The psychometric function
  C.2 The Maximum-Likelihood fit
  C.3 Goodness of fit
  C.4 Example
  C.5 Discussion

References

Samenvatting


Introduction

Psychoacoustics is concerned with the question of how a variation in the magnitude of one or more physical parameters of a sound affects its perceptual attributes. Increasing the intensity of a signal, for instance, will result in an increase in loudness. Increasing the fundamental frequency of a periodic sound will probably cause its pitch to increase. The timbre of complex, time-varying signals, such as speech and music, can be changed by alterations in the shape of the spectro-temporal pattern. Manipulation makes it possible to transform the vowel /a/ into the vowel /e/, or to make a clarinet sound like a saxophone. It is important to know which details in the spectro-temporal pattern of a sound cause an /a/ to be perceived. If the perceptually relevant features of a signal can be identified, this not only gives more insight into the perceptual organization of the auditory system, but can also be used to construct low-bit-rate codecs that can still reproduce perceptually acceptable speech or music (for instance vocoders, or the new Digital Compact Cassette).

The perception of different spectral shapes is not easy to investigate systematically, since many signal parameters are involved. Even if one restricts oneself to steady-state signals, and does not consider the effects of transients, which are known to be perceptually very relevant (Berger, 1964; Grey, 1977), the frequency, amplitude or phase of each component is a variable. Von Bismarck (1974b) studied verbal attributes of sounds with different artificial spectral shapes. He found that four factors could account for 90% of the variance. The most important factors corresponded to the verbal scales "dull-sharp" and "compact-scattered". Plomp (1976) asked subjects to give (dis)similarity judgments between sounds with different spectral shapes, derived from musical instruments. With multidimensional scaling techniques he was able to relate the subjective dissimilarity between two sounds to the (objective) dissimilarity in spectral shape, by calculating and combining the differences in intensity for each critical band.

Another approach to exploring timbre space was initiated by Green and coworkers in the early 1980s (Green, 1988). Instead of asking subjects to judge the similarity between two sounds, they presented subjects with two flat multicomponent sounds, which were identical except for a small increment in the amplitude of the middle component, and asked them to indicate the sound with the increment. The unexpected and important finding was that the addition of frequency components remote from the middle (target) component could improve the discriminability. This means that the auditory system does not focus solely on the middle component, ignoring the other non-changing components, but rather compares the amplitude of that component with other components in the spectrum. Stated differently: the spectral shape or profile is monitored. In subsequent experiments these profile-analysing capabilities of the auditory system were studied in more detail. To make sure that the only reliable information subjects could use was a change in the spectral shape, experiments were conducted by roving the overall intensity in each and every sound burst. Profile-analysis experiments were usually performed with multi-tone signals (Green, 1988), though discriminability between differently shaped broadband noises has also been examined to some extent (Farrar, Reed, Ito, Durlach, Delhorne, Zurek & Braida, 1987; Moore, Oldfield & Dooley, 1989).

Almost all profile-analysis experiments reported in the literature have been conducted with multi-component complex sounds. Though these signals may bear a likeness to music and speech sounds and therefore may provide information that can be applied directly in speech and music synthesis or coding, they are also difficult to manipulate systematically. This thesis deals with the ability of the human auditory system to discriminate between very simple sounds, including the simplest sounds that may still comprise a change in spectral shape. Sounds may comprise only two frequency components with amplitudes that are changed in opposite directions, or noise bands with a changing sign of their spectral slope. The advantage of using simple signals is that existing models can be applied and tested relatively easily. Experiments in this thesis are, in contrast with usual spectral-shape discrimination experiments, not restricted to signals with bandwidths that exceed the critical bandwidth. Thus, both within-channel and across-channel comparison is studied.

Chapter 1 starts with a series of experiments with multi-tone complexes. By variation of the number of components and the component spacing, an attempt is made to gain some insight into the important factors influencing spectral-shape discrimination. The results of chapter 1 showed that two-tone complexes of which the amplitudes are changed in opposite directions are very interesting signals in the light of profile analysis. Therefore, in chapter 2 thresholds for changes in the amplitudes are measured systematically as a function of signal parameters such as frequency separation, centre frequency and overall intensity. The results are used to evaluate some models of spectral-shape discrimination. Chapter 3 examines the discriminability of a change in the sign of the spectral slope of a noise band. The results give some insight into the discrimination processes with noise-like signals. They will also be used to relate discrimination of signals with continuous spectra to that of signals with discrete spectra. Finally, the main findings in this thesis are recapitulated in a summary.

Although this thesis primarily deals with psychoacoustics, some issues of experimental methodology are also considered. Some background is given in three appendices. Appendix A deals with the experimental method, i.e. the adaptive procedure that was used to determine thresholds. Appendix B deals with signal detection theory, i.e. how the probability of a correct response is related to the subject's sensitivity in a three-interval, three-alternative forced-choice paradigm. Finally, Appendix C describes a way to fit a parameterized psychometric function to the experimentally obtained data.


Chapter 1

Perception of spectral changes in multi-tone complexes¹

Abstract

Amplitude changes of the spectral components of a complex tone, relative to each other, are usually well perceived, even if the overall intensity is kept fixed. Three experiments are reported: Experiment I dealt with the detectability of amplitude changes in two-tone complexes of fixed frequencies. Experiment II examined detection of slope changes in ramp-shaped spectral envelopes of two- and three-tone complexes as a function of spectral spacing. As a control experiment, a roving intensity level was used for some conditions. Experiment III investigated the detectability of changes in the spectral slope of multi-tone complexes as a function of the number of components. The results of the experiments show that detection of spectral changes in a sound is strongly dependent on the frequency spacing of the components. It is concluded that the auditory system is capable of comparing the relative energy distributions over different critical bands. Within a critical band there exists an optimum frequency separation with respect to the detection of relative amplitude change.

1.1 Introduction

This chapter deals with the ability of our auditory system to discriminate between multi-tone complexes of different spectral shapes. Knowing how and to what extent we can discriminate spectral shape is important for understanding speech communication and music perception. From a subjective viewpoint, one can see this as an investigation of primarily timbre discrimination, but it should be kept in mind that the subjective cues used to discriminate the various sounds may often also be pitch or loudness.

¹ This chapter is a slightly modified version of: Versfeld, N.J. & Houtsma, A.J.M. (1991). "Perception of spectral changes in multi-tone complexes", Quarterly Journal of Experimental Psychology 43A, 459-479.

In most experiments dealing with this subject, an increment in amplitude in one of the components is to be detected. To prevent subjects from listening to pure intensity cues, a roving level is often used (Green, 1988), that is, the overall intensity is randomly varied between observation intervals. In our experiments stimuli have a constant overall intensity level, which means that overall intensity cannot be used as a cue for detection. This does not mean that local intensity cues are not used. Therefore a roving-level paradigm has been applied as a control in one experiment.
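A small numerical sketch may make the distinction between overall-level cues and spectral-shape cues concrete. It assumes components at different frequencies, so that their powers add. The levels (60 and 62 dB), the 20-dB rove range and the helper names are hypothetical values chosen for illustration and are not taken from the experiments reported here.

```python
import math
import random


def overall_level_db(component_levels_db):
    """Overall level of components at different frequencies (powers add)."""
    total_power = sum(10 ** (level / 10) for level in component_levels_db)
    return 10 * math.log10(total_power)


def rove(component_levels_db, rove_range_db, rng):
    """Shift all components by one random offset drawn for this interval."""
    offset_db = rng.uniform(-rove_range_db / 2, rove_range_db / 2)
    return [level + offset_db for level in component_levels_db]


rng = random.Random(1)      # fixed seed, only to make the sketch repeatable
interval_a = [62.0, 60.0]   # hypothetical levels in dB: increment on tone 1
interval_b = [60.0, 62.0]   # same increment moved to tone 2

# Moving the increment from one tone to the other leaves the overall level unchanged.
same_overall = math.isclose(overall_level_db(interval_a),
                            overall_level_db(interval_b))

# Roving changes the overall level but keeps the level difference between
# the components, i.e. the spectral shape.
roved_a = rove(interval_a, rove_range_db=20.0, rng=rng)
shape_kept = math.isclose(roved_a[0] - roved_a[1],
                          interval_a[0] - interval_a[1])
print(same_overall, shape_kept)
```

In this sketch the overall level carries no information about which interval contains which pattern, while the level of an individual component still does, which is the kind of local intensity cue a roving level is meant to remove.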

Because a spectrum can be varied in many other ways (time, frequency, phase), some further specifications and restrictions are in order. Firstly, we will restrict ourselves to quasi-steady-state sounds and not become involved in perceptual effects of transients such as signal onset or offset, which are known to have significant consequences for both the effective spectrum and the percept of a sound (Berger, 1964; Smurzynski & Houtsma, 1989). Secondly, the focus of this chapter will be on sensitivity for amplitude changes of components with fixed frequency. Changes caused by frequency or phase shifts of components are not considered. Phase seems to have little influence on detection of spectral envelope changes in multi-tone signals (Green & Mason, 1985).

Spectral-shape discrimination in the strict sense of the word applies to many discrimination experiments reported in the literature. On the one hand there are the many simple-tone intensity (Riesz, 1928; Rabinowitz, Lim, Braida & Durlach, 1976; Jesteadt, Wier & Green, 1977) and frequency (Shower & Biddulph, 1931; Wier, Jesteadt & Green, 1977) discrimination experiments which, in essence, do provide a measure of sensitivity for simple changes in spectral shape. On the other hand, discrimination measurements have been made for sounds with very complex broadband spectra such as speech-shaped noise (Farrar, Reed, Ito, Durlach, Delhorne, Zurek & Braida, 1987), peaked or notched broadband noise (Moore, Oldfield & Dooley, 1989) and multi-tone complexes (Moore, Glasberg & Shailer, 1984; Green, 1988). The multi-tone experiments of Moore and Green involved the selection of a specific spectral feature, for instance the centre frequency of a spectral peak or notch or the intensity of a single tone component, as the physical variable to be manipulated and discriminated. Finally, other experiments with band-limited tone and noise stimuli have been reported that could be viewed as spectral-shape discrimination experiments. Buus (1983) investigated sensitivity for opposite frequency shifts of the components of a two-tone complex as a function of their frequency difference, whereas Feth & O'Malley (1977) did similar experiments for small intensity changes. Raney, Richards, Onsan & Green (1989) have shown that tone-in-noise detection with signal uncertainty can also be considered as spectral-shape discrimination.

Three experiments are reported in this chapter. Experiment I deals entirely with the detection of incremental amplitude changes in one or two fixed-frequency tones, with and without the background of other non-varying tone components. Experiment II deals with detection of changes in the spectral slope of two- and three-tone complexes, as a function of frequency separation between components. Experiment III explores the detection of changes in the spectral slope of multi-tone complexes comprising between two and twelve components, as a function of spectral range. A computer implementation of Feth's EWAIF model (Feth, 1974; Feth & O'Malley, 1977; Feth, O'Malley & Ramsey Jr., 1982; Feth & Stover, 1987) was used to compare our results qualitatively with the model predictions. In a final section of the chapter an attempt is made to identify some elements that should go into a model which can explain the most prominent features of discrimination behaviour observed in this chapter as well as in other studies from the literature.
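The EWAIF model is only named at this point. As a rough illustration, the sketch below assumes the usual reading of the acronym (envelope-weighted average of instantaneous frequency) and evaluates that quantity numerically for a sum of sinusoids with known frequencies and amplitudes. The function name `ewaif`, the sampling scheme and the example frequencies are illustrative assumptions; this is not the computer implementation used in this chapter.

```python
import cmath
import math


def ewaif(freqs_hz, amps, duration_s=1.0, n_samples=20000):
    """Envelope-weighted average of instantaneous frequency (assumed definition).

    The analytic signal z(t) is built directly from the known components,
    so no Hilbert transform is needed in this sketch.
    """
    weighted_sum = 0.0
    envelope_sum = 0.0
    for k in range(n_samples):
        t = (k + 0.5) * duration_s / n_samples  # midpoint sampling
        z = sum(a * cmath.exp(2j * math.pi * f * t)
                for f, a in zip(freqs_hz, amps))
        dz = sum(a * 2j * math.pi * f * cmath.exp(2j * math.pi * f * t)
                 for f, a in zip(freqs_hz, amps))
        envelope = abs(z)
        if envelope == 0.0:
            continue  # instantaneous frequency is undefined at an envelope zero
        # d(phase)/dt = Im(z' * conj(z)) / |z|^2, converted from rad/s to Hz
        inst_freq_hz = (dz * z.conjugate()).imag / (2 * math.pi * envelope ** 2)
        weighted_sum += envelope * inst_freq_hz
        envelope_sum += envelope
    return weighted_sum / envelope_sum


equal = ewaif([800.0, 1000.0], [1.0, 1.0])         # equal amplitudes
lower_louder = ewaif([800.0, 1000.0], [1.0, 0.5])  # lower tone more intense
upper_louder = ewaif([800.0, 1000.0], [0.5, 1.0])  # upper tone more intense
print(equal, lower_louder, upper_louder)
```

Under these assumptions a two-tone complex with equal amplitudes yields the mean of the two frequencies, and making one component more intense pulls the value toward that component. An opposite amplitude change of the two tones therefore shifts this quantity, which is the cue the model relies on.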

1.2 Experiments

1.2.1 Experiment I

Experiment I dealt with the question of how well the human auditory system can discriminate signals that comprise fixed frequencies but differ in the amplitude of one or more components. The detection threshold for an amplitude increase in one (or more) components was determined with an adaptive two-interval two-alternative forced-choice paradigm. Each sound contained discrete frequency components that remained fixed during the entire experiment. The number of components in the sounds varied from a single one (i.e. sounds were pure tones) to twenty, although most sounds were two-tone complexes. The one- and two-tone experiments were done in three different frequency regions to study the frequency dependence of discrimination. The twenty-tone experiment, in which the amplitudes of only two components were manipulated, was meant as a control experiment to assess the influence of the presence of other non-varying components on complex-sound discrimination (Green, Kidd Jr. & Picardi, 1983; Green, Mason & Kidd Jr., 1984). In the profile-analysis studies by Green and his co-workers the general approach was to measure detection of an amplitude increment of a single component of a tone complex. In the present chapter the focus is on the detection of some fixed increment of a single tone component being moved from one component to the next. The fact that we can track the placement of tones and tone voids in a sequence of complex tones has been shown in a recent pitch-motion study with random chord sequences (Allik, Dzhafarov, Houtsma, Ross & Versfeld, 1989). In order to make sure that no loudness cues were used, Green and co-workers used a roving intensity level. In this experiment, discrimination of pure intensity differences and of spectral shape differences are compared; thus the roving-level paradigm was not used.

Stimuli

Five types of stimuli were used, as illustrated in Figure 1.1, each stimulus containing two sound bursts:

(1) two pure tones of frequency f_1, one with level L+ΔL, the other with level L;

(2) two two-tone complexes with tones at frequencies f_1 and f_2, one with tones at levels of L+ΔL and L, the other with both tones at level L;

(3) two two-tone complexes with tones at frequencies f_1 and f_2, one with levels of L+ΔL and L, the other with levels of L and L+ΔL;

(4) two two-tone complexes with tones at frequencies f_1 and f_2, one with both tones at a level of L+ΔL, the other with both tones at a level of L;

(5) two twenty-tone complexes with frequencies f_n = 200 · 20^((n-1)/19) Hz (1 ≤ n ≤ 20), all at a level of L except the m-th component in one sound burst and the (m+1)-th in the other sound burst, which had a level of L+ΔL.

For stimulus types (2), (3) and (4), three different frequency pairs (f_1, f_2) were generated: a low pair (321.0 Hz, 375.8 Hz), a middle-range pair (826.6 Hz, 967.8 Hz) and a high pair (2128.9 Hz, 2494.5 Hz). They will be referred to as "l", "m" and "h", respectively. The twenty-tone complex of stimulus type (5) consisted of components equally spaced on a log-frequency scale, ranging from 200 to 4000 Hz, with a frequency ratio of 1.17 between successive components (about 2.7 semitones). Green et al. (1984), Green, Onsan & Forrest (1987) and Bernstein & Green (1987a) have shown that for this frequency separation the improvement of the threshold is maximized. The component pairs (4,5), (10,11) and (16,17) of this twenty-tone complex correspond exactly to the three frequency pairs "l", "m" and "h" of stimulus types (2), (3) and (4). Each sound burst had a duration of 250 ms, including 20-ms on- and off-ramps. Bursts were separated by a 10-ms silent gap. Total stimulus duration was 510 ms.
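The log-spaced component frequencies of stimulus type (5) can be reproduced with a few lines of Python. This is only a minimal sketch: the function name and its parameters are illustrative and are not taken from the original study, which generated its stimuli on a P857 minicomputer.

```python
import math

def twenty_tone_frequencies(f_low=200.0, f_high=4000.0, n_components=20):
    """Return component frequencies (Hz) equally spaced on a log-frequency scale."""
    ratio = f_high / f_low
    return [f_low * ratio ** (k / (n_components - 1)) for k in range(n_components)]

freqs = twenty_tone_frequencies()

# Ratio between neighbouring components, also expressed in semitones.
step_ratio = freqs[1] / freqs[0]
step_semitones = 12 * math.log2(step_ratio)
print(f"neighbour ratio: {step_ratio:.3f} ({step_semitones:.2f} ST)")

# Component pairs that the text matches to the "l", "m" and "h" regions.
for first, second in [(4, 5), (10, 11), (16, 17)]:
    print(first, second, round(freqs[first - 1], 1), round(freqs[second - 1], 1))
```

Under this formula component 17 evaluates to about 2492.5 Hz, slightly below the 2494.5 Hz quoted for the high pair; the sketch makes no attempt to explain that small difference.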

Procedure

Stimuli were presented in an adaptive two-interval two-alternative forced-choice paradigm. All sessions started with the determination of the subject's threshold for the stimulus (with ΔL = 0) against a background of continuous white noise with a spectral density of 10 dB SPL/Hz (which, according to Hawkins & Stevens (1950), corresponds to a sensation level of about 40 dB). The subject, who was seated in a double-walled (IAC) sound-insulated chamber, received the repeated stimulus bursts binaurally through headphones and adjusted the sound level with an attenuator so that the stimulus was barely audible. Stimuli were presented 20 dB above this empirically established threshold. The white noise was used to mask most unwanted distortion products of the ear (Plomp, 1965; Goldstein, 1967) and was continuously present throughout the experiment. There was no response time limit after the presentation of a stimulus during experimental runs, and feedback was provided immediately after each response. A response was given by pushing a button on the response box. A two-down one-up (2D1U) adaptive procedure was used to establish the 70.7%-correct point of the psychometric function (Levitt, 1971). Trials started with values of ΔL well above threshold. At the beginning of each run a simple 1D1U procedure was used to reach the threshold quickly; the 2D1U procedure was adopted only after the first four reversals. Each run contained 24 reversals, comprised about 100 to 150 trials and typically lasted 4 minutes. From each subject at least 10 runs were collected for experiments with types (3) and (5) and at least 5 runs with types (1) and (2). Runs were taken into account only after the threshold had proved to be stable. For one subject, 5 runs were collected with type (4). The ΔL values varied in steps of 0.05 dB in the range from 0.00 to 2.00 dB. For larger values larger steps were taken.

Figure 1.1: Representation in the frequency-amplitude domain of the five stimulus types used in Experiment I. Amplitudes are either L or L+ΔL.
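The adaptive rule described in the Procedure paragraph can be illustrated with a small simulation. A 2D1U track settles where two consecutive correct responses are as likely as not, i.e. where p² = 0.5 and p ≈ 0.707. The sketch below is hypothetical throughout: the simulated listener, the slope value, the seed and the constant step size (the original used larger steps above 2 dB) are assumptions, not details of the original software.

```python
import random
from statistics import NormalDist

def percent_correct(delta_l, slope):
    """P(correct) in 2I-2AFC for a simulated listener with d' = slope * delta_l."""
    return NormalDist().cdf(slope * delta_l / 2 ** 0.5)

def run_staircase(slope, start=2.0, step=0.05, n_reversals=24, seed=1):
    """Simulate one run: 1D1U until four reversals, then 2D1U."""
    rng = random.Random(seed)
    delta_l = start
    needed = 1        # correct responses required before stepping down
    streak = 0        # consecutive correct responses at the current level
    last_move = 0     # -1 = last step was down, +1 = up, 0 = no step yet
    reversals = []
    n_trials = 0
    while len(reversals) < n_reversals:
        n_trials += 1
        if rng.random() < percent_correct(delta_l, slope):
            streak += 1
            move = -1 if streak >= needed else 0
        else:
            move = +1
        if move == 0:
            continue
        streak = 0
        if last_move and move != last_move:
            reversals.append(delta_l)      # level at which direction changed
            if len(reversals) == 4:
                needed = 2                 # switch from 1D1U to 2D1U
        last_move = move
        delta_l = max(0.0, delta_l + move * step)
    return reversals, n_trials

reversals, n_trials = run_staircase(slope=1.5)
print(len(reversals), "reversals in", n_trials, "trials")
```

In the study itself thresholds were not read off the reversals; they were obtained from a psychometric-function fit to all trials, as described under "Determination of thresholds".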

All stimuli were calculated on a P857 minicomputer system and stored on disk. Stimuli were converted into acoustic signals with a 12-bit D/A converter at a sampling rate of 10 kHz and low-pass filtered at 4.3 kHz. The entire experiment was computer-controlled.

Subjects

Three subjects, AH, JS and NV, participated. They included both authors. All subjects had received musical training and had experience with psychoacoustic experiments.

Determination of thresholds

Thresholds were determined by fitting a Cumulative Normal Integral to the data with a least-χ² fit, after which the 70.7%-correct point was estimated. Signal detection theory provides an expression for the relation between P_c (the probability of a correct response) and the sensitivity d' in a two-interval two-alternative forced-choice paradigm (see Green & Swets, 1988):

$$ P_c = \Phi\left( \frac{d'}{\sqrt{2}} \right), \tag{1.1} $$

where Φ(z) is the Cumulative Normal Integral. The χ² statistic provides a goodness-of-fit criterion, and the results will show that under all conditions and for all subjects a χ² fit is acceptable (p > 0.01) with a simple linear relationship between d' and ΔL, that is:

$$ d' = a \, \Delta L. \tag{1.2} $$


Threshold is taken as the value of ΔL at which a 70.7%-correct score is obtained. In this way, not only information from responses around threshold is used, but also information from trials that were well above or below threshold. Furthermore, a great experimental advantage over conventional adaptive procedures, such as the 2D1U procedure (Levitt, 1971), is that the stimulus step size can be chosen arbitrarily, as long as enough measurements are made for each stimulus value. A goodness-of-fit criterion is provided and confidence intervals for the threshold can be given. It is also possible to perform ANOVA-like processing, for instance to decide whether averaging over subjects is allowed.
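The threshold estimate can be sketched as a one-parameter fit. The code below assumes the standard two-interval relation P_c = Φ(d'/√2) together with d' = a ΔL, uses a Pearson χ² over correct and incorrect counts, and minimises it with a golden-section search. The search method, the data layout and all function names are assumptions made for illustration; the original fitting routine is not described in that detail.

```python
import math
from statistics import NormalDist

NORMAL = NormalDist()

def percent_correct(delta_l, slope):
    """Predicted P(correct), assuming P_c = Phi(d'/sqrt(2)) with d' = slope * delta_l."""
    return NORMAL.cdf(slope * delta_l / math.sqrt(2))

def chi_square(slope, data):
    """Pearson chi-square; data is a list of (delta_l, n_trials, n_correct)."""
    total = 0.0
    for delta_l, n_trials, n_correct in data:
        p = min(max(percent_correct(delta_l, slope), 1e-12), 1 - 1e-12)
        total += (n_correct - n_trials * p) ** 2 / (n_trials * p)
        total += ((n_trials - n_correct) - n_trials * (1 - p)) ** 2 / (n_trials * (1 - p))
    return total

def fit_slope(data, lo=0.01, hi=10.0, iterations=100):
    """Golden-section search for the slope that minimises chi-square."""
    golden = (math.sqrt(5) - 1) / 2
    for _ in range(iterations):
        c = hi - golden * (hi - lo)
        d = lo + golden * (hi - lo)
        if chi_square(c, data) < chi_square(d, data):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2

def threshold(slope, target=math.sqrt(0.5)):
    """Delta L (dB) at which the fitted function reaches the target score."""
    return math.sqrt(2) * NORMAL.inv_cdf(target) / slope

# Synthetic, noise-free example data generated from a hypothetical slope.
true_slope = 1.5
data = [(dl, 100, 100 * percent_correct(dl, true_slope)) for dl in (0.2, 0.4, 0.6, 0.8, 1.0)]
slope_hat = fit_slope(data)
print(round(slope_hat, 3), round(threshold(slope_hat), 3))
```

Because every trial contributes to the fit, the step size used to collect the data does not have to match the resolution wanted for the threshold, which is the advantage the text points out.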

Results & Discussion

It became apparent from the beginning of the experiment that a variety of subjective cues played a role in discriminating the signals of the different types. For types (1) and (4) only loudness cues were used, as expected, but for stimulus types (2), (3) and (5) subjects quite often reported using pitch jumps or timbre changes as discrimination cues. Pitch jumps, as reported by subjects in terms of musical intervals, appeared to correspond to the physical frequency components in which the change took place. No further attempt was made to investigate the subjective cues that were supposedly or actually used. The feedback that was given after every response simply enabled subjects to learn to use every possible cue in order to obtain a correct answer.

The results for stimulus types (1), (2), (3) and (5) are shown in Figure 1.2 for each individual subject, where the 70.7%-correct thresholds¹ ΔL (in dB) are plotted as a function of stimulus type and frequency region. Filled symbols in Figure 1.2 indicate that inter-subject differences were not significant (p > 0.01), so that averaging over subjects was allowed. Six data points [AH and JS "h" for types (2) and (3); NV "l" and JS "h" for type (5)] differed significantly from the rest of the data. With these six points omitted, the results for stimulus type (1) differed significantly (p < 0.001) from those for the other types for each subject. Types (2) and (3) also differed significantly, but no significant difference was found between types (2) and (5) or (3) and (5). Frequency region proved not to be significant if the six above-mentioned points are excluded, and averaging over both subjects and frequency regions was then allowed for each type. For all subjects and all frequency regions, the four types differed significantly. The thresholds, averaged over subjects and frequency regions, are 1.22 dB for type (1), 0.57 dB for type (2), 0.33 dB for type (3) and 0.44 dB for type (5).

¹ Thresholds are denoted with Small Caps.

Figure 1.2: Results of Experiment I for the individual subjects for stimulus types (1), (2), (3) and (5), plotted as threshold ΔL (dB) against frequency region. Subjects AH (□), JS (△), NV (◇). The filled symbols indicate that averaging over subjects was allowed (p > 0.01). The three frequency regions are denoted with "l" (low), "m" (middle) and "h" (high).

The results show that the pure-tone intensity discrimination thresholds for stimulus type (1) vary from 0.9 to 1.7 dB for all subjects and frequencies and are in good agreement with data from Florentine (1983) or Jesteadt et al. (1977) for sensation levels of 20 dB. As a control experiment, only 5 runs were taken from subject NV for stimulus type (4) in the middle-frequency region. The threshold for this stimulus type was 1.04 dB, which is 0.3 dB lower than NV's threshold for pure tones. This result did not differ significantly from the other data of type (1), whereas it did differ from the data of the other types. Results obtained with stimuli of types (2) and (3) show that for all conditions the thresholds for type (3) are about half those for type (2). This in itself is not a very surprising result, as stimulus type (3), in which both components change in amplitude, contains twice the information.

One would expect that thresholds for stimulus types (1) and (2) would be about the same, as the only difference is the addition of a stationary tone component. Since this added component does not change, it does not provide any information to aid discrimination of the sounds. If thresholds are compared, however, the surprising result is that in 7 out of the 9 conditions threshold decreases to about 0.5 dB. Apparently the added fixed sinusoid improves the detectability of an amplitude increment in most cases. In the same way the results of type (3) might be compared with those of type (4).

To see whether discrimination performance changes when more tone components are added, the results for stimulus types (3) and (5) should be compared. The results show that discrimination performance with the twenty-tone complex is slightly (but significantly) worse than with type (3), but still better than with type (2). Apparently discriminability is not increased by adding more non-changing components, which indicates that subjects do not relate the increase of one component to the entire flat spectrum. This is different from the results of Green et al. (1984), who obtained a lowering of intensity difference thresholds for a single tone component when more components were added to the complex. Although we expected to find similar results, despite the differences between the experiments, this turned out not to be the case.

Another question is why some subjects have such difficulty in discriminating the two different spectral shapes in the high-frequency region around 2 kHz. Experiments by Green et al. (1987) and by Richards, Onsan & Green (1989) showed that subjects perform best near 1 kHz. Spiegel & Green (1982) noticed a break point in frequency beyond which detection of a tone increment became more difficult. These effects, however, are too small to account for our results. We believe that, just as in Kidd Jr., Mason & Green (1986), we are dealing here with a learning effect. Of the six outlying data points, four are close to pure-tone threshold values; the other two are smaller. We think that even after 10 to 20 runs some subjects were still unable to use fully the potentially most effective cue, and that additional training might have helped.

Perhaps the most interesting results of this experiment are the very low threshold values which, for some conditions, were as low as 0.2 dB. When the spectral shape of a signal changes, as happened in stimulus types (2), (3) and (5), thresholds are lower than when only magnitude changes, as in stimulus types (1) and (4). It seems that the change of spectral shape in itself provides a powerful discrimination cue to the auditory system, and probably reflects the same behaviour that Green (1988) has referred to as profile analysis. However, no roving level for intensity has been used in the present experiment; thus one cannot exclude that intensity cues still may have played a role. The very small intensity difference thresholds on the one hand, and the subjects' reports of using different cues (pitch jumps) on the other hand, do suggest that for these types no overall loudness cues have been used, since for types (3) and (5) overall intensity remained constant for the two stimulus bursts.

Feth & Stover (1987) reported computations on complex signals using the envelope weighted average of the instantaneous frequency (EWAIF) as a means of discriminating complex tones (Feth, 1974). The model is based on the assumption that a change in the amplitude of a component in a complex signal introduces a change in pitch, on the basis of which discrimination takes place. It can be shown that an arbitrary time signal A(t) can be written as

$$ A(t) = E(t) \cos[\theta(t)], \tag{1.3} $$

where E(t) is the envelope function and θ(t) the instantaneous phase angle. The instantaneous frequency is now defined as the time derivative of θ(t):

$$ f(t) = \frac{1}{2\pi} \frac{\partial \theta(t)}{\partial t}. \tag{1.4} $$

One can now determine the EWAIF of a sound over a period T by computing

$$ \mathrm{EWAIF} = \frac{\int_0^T E(t)\, f(t)\, dt}{\int_0^T E(t)\, dt}. \tag{1.5} $$
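Equations (1.3) to (1.5) can be evaluated numerically for a tone complex. The sketch below takes the envelope and phase from the analytic signal (the sum of complex exponentials), which is one common way to define E(t) and θ(t) and is an assumption here, as are the function name, the example frequencies of 950 and 1050 Hz and the 1-dB level difference.

```python
import cmath
import math

def ewaif(freqs_hz, levels_db, duration=0.25, rate=10000):
    """Envelope-weighted average instantaneous frequency (Hz) of a tone complex.

    E(t) and theta(t) are taken from the analytic signal
    z(t) = sum_k a_k exp(i 2 pi f_k t); this choice is an assumption.
    """
    amps = [10 ** (level / 20) for level in levels_db]
    numerator = 0.0
    denominator = 0.0
    for i in range(int(round(duration * rate))):
        t = i / rate
        parts = [a * cmath.exp(2j * math.pi * f * t) for a, f in zip(amps, freqs_hz)]
        z = sum(parts)
        envelope = abs(z)
        if envelope < 1e-12:
            continue  # instantaneous frequency is undefined at an envelope zero
        # f(t) = (1 / 2 pi) d(theta)/dt = Re(sum_k f_k z_k * conj(z)) / |z|^2
        weighted = sum(f * p for f, p in zip(freqs_hz, parts))
        inst_freq = (weighted * z.conjugate()).real / envelope ** 2
        numerator += envelope * inst_freq
        denominator += envelope
    return numerator / denominator

# Two bursts of a type-(3)-like stimulus: the 1-dB increment swaps components.
lower_louder = ewaif([950.0, 1050.0], [1.0, 0.0])
upper_louder = ewaif([950.0, 1050.0], [0.0, 1.0])
print(lower_louder, upper_louder, upper_louder - lower_louder)
```

In this sketch the EWAIF is pulled toward the stronger component, so swapping the increment between the two tones shifts the value in the direction of the incremented tone, which is the cue the model assumes listeners use.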


Threshold of discrimination is said to be reached if the EWAIF difference between two signals has a certain value, probably the pure-tone frequency JND at that centre frequency. Applying this model to our stimuli, we see that it correctly predicts a factor-of-two difference between the thresholds found with types (2) and (3). The model yields a larger difference in EWAIF value if the frequency separation between the two tones is increased. With our stimuli the frequency separation increases from the lower- to the higher-frequency region (namely from 55 to 366 Hz), yet a corresponding improvement in threshold is not found. This can be explained by the fact that the pure-tone frequency JND increases as well, causing the difference in EWAIF at threshold to increase with increasing centre frequency. Although the EWAIF model is essentially a narrowband model, because the signal needs to be unresolved to calculate the instantaneous frequency, Feth & Stover (1987) successfully applied it to data of Green et al. (1984) that were obtained with broadband signals. Applied to our stimulus types (3) and (5), the model predicts that the threshold for type (5) should be lower than that for type (3), which is contrary to our data.

By manipulating the frequency ratio of successive partials one might reach an optimum value for which discriminability is best. To find out what the optimum frequency ratio is, and how much the already rather low threshold values for type (3) might improve, Experiment II was performed.

1.2.2 Experiment II

This experiment is designed to study threshold behaviour for amplitude changes in two- and three-tone complexes. The two-tone complexes are identical to type (3) of the previous experiment, but with a varying inter-tone frequency ratio. The three-tone complexes are identical to the two-tone complexes, except for a non-changing component placed at the geometric centre. Thus, if a two-tone complex has an inter-tone spacing of N semitones (ST), then the corresponding three-tone complex has an inter-tone spacing of ½N ST, whereas for both complexes the outermost components are N ST apart.

The general idea of this experiment is to investigate whether and how the discrimination improvements observed under some conditions in Experiment I change when the total range and/or the inter-tone frequency ratio are changed. As a control experiment, it was investigated to what extent a roving intensity level would affect discrimination performance for two-tone complexes.

Stimuli

Two types of stimuli were used, each containing two sound bursts:

(1) two two-tone complexes with tones of frequencies f_1 and f_2, centered geometrically around 1 kHz, one complex with levels of L+ΔL and L respectively, the other with levels of L and L+ΔL. This stimulus is in essence similar to the one used by Feth & O'Malley (1977);

(2) two three-tone complexes, derived from the two-tone complexes described above, but with a fixed frequency component of 1 kHz and a fixed level of L+½ΔL added.

The temporal structure of the stimuli was the same as in Experiment I, i.e. bursts had a duration of 250 ms and were separated by a 10-ms silent period. When a roving level for intensity was used, the intensity was varied randomly from burst to burst uniformly between 10 and 30 dB above the empirically established threshold in white noise.
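The frequency layout and the roving level of Experiment II can be written down compactly. This is a minimal sketch under the stated geometry (tones centered geometrically around 1 kHz, outer components N ST apart); the function names, the use of Python's random module and the seed are hypothetical choices.

```python
import math
import random

def two_tone_pair(n_semitones, centre_hz=1000.0):
    """Frequencies (Hz) of two tones n_semitones apart, centered geometrically."""
    half_step = 2 ** (n_semitones / 24)   # half of the interval on either side
    return centre_hz / half_step, centre_hz * half_step

def three_tone_set(n_semitones, centre_hz=1000.0):
    """Same outer tones plus the fixed component at the geometric centre."""
    low, high = two_tone_pair(n_semitones, centre_hz)
    return low, centre_hz, high

def roved_level(rng, low_db=10.0, high_db=30.0):
    """Level above threshold (dB), drawn uniformly for each burst."""
    return rng.uniform(low_db, high_db)

rng = random.Random(0)   # fixed seed, used here only to make the example repeatable
low, high = two_tone_pair(5)
print(round(low, 1), round(high, 1), round(roved_level(rng), 1))
```

With a 5-ST pair the two tones sit 2.5 ST below and above 1 kHz, so the inter-tone spacing of the matching three-tone complex is half that of the two-tone complex.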

Procedure

The procedure was the same as the one described in Experiment I. From each subject at least 10 runs (containing 24 reversals) were collected for each condition if the threshold proved to be stable. For stimulus type (1) there were 13 different frequency-ratio values, ranging from 0.25 to 11 ST. For stimulus type (2) six different range ratios, that is, ratios of the highest and lowest frequency, were investigated. They ranged from 1 to 10 ST. Thresholds were determined by the same method as was described in the previous section, by fitting a Cumulative Normal Integral to the data, after which the 70.7%-correct point was estimated.


Subjects

With the two-tone complexes, 5 subjects participated for ratios of 0.25, 0.5, 1, 3, 5, 7, 9 and 11 ST. One of those and two other subjects participated for ratios of 1, 2, 4, 6, 8 and 10 ST for both two- and three-tone complexes. For the experiment with two-tone complexes having roving levels for intensity, two subjects (both authors) participated for ratios of 1, 5 and 11 ST. All subjects had some experience with psychoacoustic experiments. Their degree of musical experience varied greatly.

Results & Discussion

Results obtained with the two-tone stimuli are plotted for all individual subjects in Figure 1.3. The figure shows the threshold intensity difference between sound components, ΔL, as a function of the range f_2/f_1 expressed in semitones. Each adaptive run represents on average more than 100 discrimination trials, so that each symbol corresponds to at least 1000 trials. The results of Experiment I for type (3) in the middle-frequency region are also plotted (2.7 ST). The results of both experiments are in good agreement with each other.

Feth & O'Malley (1977) performed a comparable experiment in which they used stimuli similar to ours, but with a fixed intensity difference of 1 dB between tone components, and measured percent-correct discrimination scores. From the three-line fit they made to their data taken around 1 kHz, we calculated the equivalent 70.7%-correct threshold on the assumption that their data contained no large response bias and that sensitivity d' is proportional to the intensity difference (in dB) between the tone components. The result of this calculation, which is represented by the dashed function in Figure 1.3, appears to be quite consistent with our results.
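The conversion from a percent-correct score at a fixed 1-dB difference to an equivalent 70.7%-correct threshold can be sketched as follows. It assumes the two-interval relation P_c = Φ(d'/√2), no response bias and d' proportional to the level difference in dB, as stated in the text; the function name and the example scores are hypothetical and are not values from Feth & O'Malley's data.

```python
import math
from statistics import NormalDist

def equivalent_threshold(p_correct, delta_l_used=1.0, target=math.sqrt(0.5)):
    """Level difference (dB) giving the target score, from a score at a fixed difference.

    Assumes P_c = Phi(d'/sqrt(2)) and d' proportional to the level difference.
    """
    if not 0.5 < p_correct < 1.0:
        raise ValueError("score must lie strictly between chance (0.5) and 1.0")
    normal = NormalDist()
    d_prime_measured = math.sqrt(2) * normal.inv_cdf(p_correct)
    slope = d_prime_measured / delta_l_used          # d' per dB
    d_prime_target = math.sqrt(2) * normal.inv_cdf(target)
    return d_prime_target / slope

# Hypothetical scores: better performance at 1 dB implies a lower equivalent threshold.
for score in (0.60, 0.75, 0.90):
    print(score, round(equivalent_threshold(score), 2))
```

Scores at or near 100% correct cannot be converted this way, because d' is then unbounded; this is the same saturation that can produce an apparent plateau when a fixed 1-dB difference is used.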

To explain their data, Feth invoked the EWAIF model mentioned earlier. This model was shown to account, at least qualitatively, for their data and, by inference, also for our present data. When f_2 = f_1, the two two-tone complexes to be distinguished will be physically identical as long as the phase angle between the two identical frequencies remains constant, and they will therefore be indistinguishable. This corresponds to an infinite ΔL. When f_2/f_1 increases, the envelope-weighted instantaneous frequency will become increasingly different between the two complexes; discrimination will therefore improve, and ΔL will decrease. If, however, f_2/f_1 reaches the point where the two frequency components of the complex tone become aurally resolved, the EWAIF mechanism breaks down, discrimination deteriorates, and ΔL will increase again. Ultimately it is expected to level off at the intensity difference limen for single pure tones of 1-1.5 dB.

Figure 1.3: Results of Experiment II. Threshold values ΔL (dB) of two-tone complexes are plotted as a function of the frequency ratio in semitones (ST). Each symbol denotes a different subject. A fit to Feth & O'Malley's data (1977) for 1 kHz is plotted as a dashed curve.

Although the EWAIF account of Feth & O'Malley's and our present data is qualitatively plausible, there are some quantitative aspects that the model does not explain very well. One is the plateau in the original Feth & O'Malley data or, equivalently, a region of constant low ΔL between 1 and 5 ST in the manner in which we have plotted the data. The EWAIF model does not describe what keeps ΔL limited to about 0.3 dB in that region. Another problem is that a conservative estimate of the bandwidth of the filters that separate the tones, represented by the breakpoint of the curve at about 5 ST, is very wide compared with conventional estimates of the critical bandwidth (about 3 ST).

For our present data one could argue that a plateau is not so evident and that they are best fitted by an asymmetric U-shaped curve with a minimum ΔL at a frequency ratio of 1 or 2 ST. This would largely resolve the dilemma of the plateau and the location of the breakpoint. The difference limen ΔL simply decreases with increasing f_2/f_1 because of the increasing EWAIF difference between the signals, until the trend is reversed by aural resolution of the tone components. The plateau in Feth & O'Malley's data can easily be accounted for by their paradigm of using a constant component intensity difference of 1 dB, which caused performance to saturate at 100%-correct responses.

In order to see whether our experimental task should be viewed as spectral-shape discrimination and not as intensity discrimination, a control experiment with a roving intensity level was performed. The experiment was identical in all respects, except that the overall intensity was varied randomly over a range of 20 dB for each sound burst. The 70.7%-correct thresholds for the two individual subjects are given in Table I. The results show that, despite the roving level, subjects are still able to discriminate correctly, which indicates that for large frequency separations they must use across-channel information. This enables us to think of the discrimination process with fixed intensities as being based on the perception of changes in the spectral shape.

            1 ST             5 ST             11 ST
Subject     fixed   roving   fixed   roving   fixed   roving
AH          0.14    0.28     0.72    1.04     1.40    2.29
NV          0.18    0.28     0.37    0.50     0.43    1.13

Table I: Results of Experiment II for two-tone complexes, with and without a roving intensity level, for frequency separations of 1, 5 and 11 ST (70.7%-correct thresholds ΔL, in dB).

Figure 1.4: Results of Experiment II. Threshold values ΔL (dB) of two-tone (open symbols) and three-tone complexes (filled symbols) are plotted as a function of the frequency ratio of the outer components in semitones (ST). Each different symbol denotes a subject.

Figure 1.4 shows the results of the second part of the experiment for three subjects. Solid data points represent results for the three-tone complexes, open data points those for two-tone complexes. The 70.7%-correct thresholds are plotted as a function of the bandwidth of the signal, expressed in semitones. Each symbol represents about 1000 trials.

One can see that thresholds generally decrease when partials come closer in frequency, just as was observed with two-tone complexes. In general, thresholds for three-tone complexes resemble those of the two-tone ones. For small inter-tone values (1-2 ST), three-tone complexes have a larger threshold. The EWAIF model predicts this correctly. Although no smaller inter-tone distances were tried, ΔL is expected to increase again when the range is reduced. The data show that thresholds for three-tone complexes are not as low as those for the two-tone complexes. For 8 and 10 ST, discrimination thresholds for three-tone complexes are lower than those found for two-tone stimuli. The reason for this is not obvious, because the added third tone did not change amplitude. It is our belief that when the added third tone is also resolved, cross-comparison is easier, because it enables the subject to use one extra point to estimate the spectral shape. Bernstein & Green (1987a) provide an analogous explanation.

1.2.3 Experiment III

Experiment III deals with the perception of changes in the spectral slope of multi-tone complexes. It is designed to investigate how discrimination depends on the total frequency range, while the inter-tone spacing remains constant.

Stimuli

The stimulus contained two sound bursts, as illustrated in Figure 1.5 for a two-, a four- and a twelve-tone complex. Each sound burst contained a multi-tone complex with tones geometrically centered around 1 kHz, all tones being separated by 1 ST, and with a ramp-shaped spectral envelope on a log-frequency scale. One complex had a positive spectral ramp (that is, its lowest-frequency partial at level L and its highest partial at L+ΔL), the other complex a similar negative spectral ramp.

The temporal structure of the stimuli was the same as in Experiment I: bursts had a duration of 250 ms and were separated by a 10-ms silent period.

Procedure

The procedure was the same as the one described in Experiment I. From each subject at least 10 runs (containing 24 reversals) were collected for each condition if the threshold proved to be stable. The number of components in the complex was either 2, 4, 6, 8, 10 or 12, resulting in a spectral bandwidth of 1, 3, 5, 7, 9 or 11 ST. The 70.7%-correct thresholds were determined by the same method as was described in Experiment I.
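The ramp stimuli of Experiment III can be specified as a list of (frequency, level) pairs. The sketch assumes that the level rises linearly in dB with log frequency between L and L+ΔL, which the text does not state explicitly; the function name and parameter names are illustrative.

```python
import math

def ramp_complex(n_components, delta_l, base_level=0.0, centre_hz=1000.0,
                 spacing_st=1.0, rising=True):
    """Return [(frequency_hz, level_db), ...] for a ramp-shaped multi-tone complex.

    Assumption: levels are interpolated linearly in dB between base_level and
    base_level + delta_l across the components.
    """
    if n_components < 2:
        raise ValueError("a ramp needs at least two components")
    span_st = (n_components - 1) * spacing_st
    components = []
    for k in range(n_components):
        offset_st = -span_st / 2 + k * spacing_st     # semitones from the centre
        frequency = centre_hz * 2 ** (offset_st / 12)
        fraction = k / (n_components - 1)
        if not rising:
            fraction = 1.0 - fraction
        components.append((frequency, base_level + delta_l * fraction))
    return components

positive_ramp = ramp_complex(12, delta_l=0.4, rising=True)
negative_ramp = ramp_complex(12, delta_l=0.4, rising=False)
bandwidth_st = 12 * math.log2(positive_ramp[-1][0] / positive_ramp[0][0])
print(round(bandwidth_st, 2), positive_ramp[0], positive_ramp[-1])
```

Twelve components at 1-ST spacing span 11 ST, which matches the largest bandwidth listed in the Procedure.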

Figure 1.5: Representation in the frequency-amplitude domain of the stimulus type used in Experiment III. Stimuli with 2, 4 and 12 components are plotted. Amplitudes varied between L and L+ΔL. Inter-tone spacing always was 1 ST.

Subjects


Four subjects participated, including one of the authors (NV). All had some experience with psychoacoustic experiments. Their degree of musical experience varied greatly.

Results & Discussion

Figure 1.6 shows the results, where threshold is plotted as a function of the bandwidth of the multi-tone complex, expressed in semitones. Again, one symbol represents on average 1000 trials. For comparison, the data of Experiment II for two- and three-tone complexes with a separation of 1 ST are also plotted. The results of both experiments are in excellent agreement. The figure as a whole shows that, starting from the two-tone condition as a reference, addition of tone components makes the discrimination threshold increase until about 4 or 5 components are present (3 to 4 ST). Addition of more tone components does not increase the threshold any further but, on the contrary, seems to cause a slight decrease. The subjective cue, as reported by the subjects, appeared to be a change in "sharpness" of the timbre, which was also reported by von Bismarck (1974a,b).

Figure 1.6: Results of Experiment III. Threshold values ΔL (dB) of multi-tone complexes are plotted versus the frequency ratio of the outer components (ST) for four subjects. For comparison, some results of Experiment II are also plotted. Each different symbol represents a different subject.

The left part of Figure 1.6, showing an upward trend, is not consistent with the prediction of the EWAIF model. The model predicts a downward trend for ΔL when tone components of 1-ST spacing are added. Beyond a total complex-tone range of 3 or 4 ST one might expect a decline in performance (that is, an increase in ΔL) because the auditory filter will begin to separate the tone complex into different spectral parts. The data, on the other hand, show a plateau up to about 7 ST, and beyond that a slight decline.


The study by Green et al. (1983) provides some data that can be compared with the present ones, despite the large experimental differences. If a vertical cross-cut is made through the data displayed in Figure 4 of Green's study, at an abscissa value of 5 units, which is comparable to our stimulus condition of 1-ST separation, a function of four data points is obtained which has the same general shape as our function in Figure 1.6. Absolute magnitude comparisons cannot be made because of the fundamental difference between the roving-level paradigm which Green used and our fixed-level paradigm. Our stimuli also resemble those used by Bernstein & Green (1987b). They compared a 21-component flat spectrum with a spectrum having a ramp on a linear amplitude-log frequency scale. Because they used only one complex-tone condition, results cannot be compared. They showed, however, that the percept could be categorized as spectral-shape discrimination.

Vector summation of d' for each changing component predicts a much larger decrease in threshold when more and more components are added than is actually observed. This implies that not all information in the signal is used optimally. Bernstein & Green (1987b) suggest that the auditory system is only able to compare changes in two critical bands at the same time, which may leave some information unused.
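The size of the improvement predicted by vector summation can be illustrated with a short sketch. It assumes, purely for illustration, that every changing component contributes the same $d'$ and that $d'$ is proportional to the level change (with $d' = 1$ at threshold), so that the predicted threshold falls as $1/\sqrt{n}$. The function names and the 1-dB starting value are hypothetical and are not taken from the experiments.

```python
import math

def combined_d_prime(d_primes):
    """Vector (root-sum-of-squares) summation of per-component d' values."""
    return math.sqrt(sum(d * d for d in d_primes))

def predicted_threshold(single_threshold_db, n_components):
    """Threshold predicted for n equally informative components.

    Assumes d' is proportional to the level change and equals 1 at
    threshold, so n equal components lower the threshold by sqrt(n).
    """
    return single_threshold_db / math.sqrt(n_components)

# Hypothetical single-component threshold of 1 dB.
predictions = {n: predicted_threshold(1.0, n) for n in (1, 2, 4, 9)}
```

Under these assumptions, nine components would reduce the threshold to one third of its single-component value, a much steeper decline than the plateau seen in Figure 1.6.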

1.2.4 Learning effects

Figure 1.7 shows subject RC's threshold $\Delta\textsc{l}$ as a function of run number for three different frequency ratios employed in the first part of Experiment II. Especially for frequency separations larger than three semitones, subjects took time (sometimes even up to 2500 practice trials) to reach a relatively stable threshold. Large learning effects were also reported by Kidd Jr. et al. (1986), who reported similar numbers of practice trials needed before threshold was stable. It is believed that strong learning effects exist because subjects have to learn to compare signals which are not in the same critical band.

Apart from learning effects, robustness of the threshold, that is, variation of the threshold between consecutive runs, varies for different frequency ratios. This robustness appears to depend on threshold value in the sense that the variance increases with increasing threshold.

Figure 1.7: Results from Experiment II for subject RC. Threshold values $\Delta\textsc{l}$ (dB) are plotted for frequency ratios of 1, 5 and 11 ST as a function of the run number.

This may be partly due to the experimental set-up, where, in a 2D1U adaptive procedure, a linear relationship between $d'$ and $\Delta L$ (which determines the shape of the psychometric function) causes an increasing variance with increasing threshold.

1.3 General Discussion

The most striking feature of this chapter is the finding that discrimination thresholds for intensity changes, which under most conditions are typically in the order of 1 dB for signals of 20 or 30 dB above hearing threshold, can become as low as 0.2 dB. The control experiments with a roving intensity level show that discrimination is not based on simple intensity cues and that, for large frequency separations, subjects are able to make comparisons across critical bands. The facts that even for large frequency separations (a) thresholds in Experiment II stay near 0.5 dB, (b) three-tone complexes have lower thresholds than the corresponding two-tone complexes, and (c) discrimination improves with added tone components in Experiment III also strongly indicate that a pattern or profile comparison across different channels must take place.

Originally we interpreted our data as if the auditory system is limited by two different mechanisms when detecting spectral shape changes. One mechanism operates on the conventional measure of relative energy changes in critical-band filters, the other on global or local changes of spectral slope. If this idea is applied to the two-tone data shown in Figure 1.3, one would expect them to fit a straight line that goes through the origin (constant spectral slope change) and levels off horizontally (constant $\Delta\textsc{l}$) when slope detection becomes less effective than energy detection. At very small frequency separations, the two components were expected to interact such that discrimination would worsen. Our data seem to fit such a curve (Versfeld, 1989). The data of Experiment III, however, show that the concept of simple spectral slope discrimination by the auditory system for the present complex changes is untenable. If spectral slope discrimination were the mechanism, for Experiment III one would expect an ever-increasing threshold as a function of the bandwidth.

The EWAIF model of Feth (1974) can, at least qualitatively, account for some of the data presented in this chapter but not for all of them. The model does roughly predict the observed shape of the $\Delta\textsc{l}$ function of Experiment II shown in Figure 1.3, but predicts just about the opposite trends for the data of Experiment I shown in Figure 1.2, type (5), or the results of Experiment III shown in Figure 1.6. At present, the EWAIF model can account for the data only if the signals' bandwidth does not exceed 1 ST.

Another approach to interpreting our data might be to consider the beat interference pattern of signals as a possible basis for discrimination. The lowest $\Delta\textsc{l}$ value found in Experiment II was for a tone separation of 1 ST, as seen in Figure 1.3. This is about the same spacing for which Plomp & Levelt (1965) and Plomp & Steeneken (1968) found a maximum of subjective roughness or dissonance in two-tone complexes. Inside the critical band, the relatively low thresholds may be caused by the detection of a change in regularity of an interference pattern from incompletely resolved partials. Especially if there are only two partials, an amplitude change as happened in stimulus type (3) of Experiment I will cause a shift in the carrier phase with respect to the signal envelope, which may constitute a detection cue. With three partials in the stimulus, the signal envelope will be much less regular because the tones are spaced equally on a log-frequency scale and are usually not harmonically related. A change of carrier phase with respect to an irregular envelope may be much more difficult to detect, explaining the rise of threshold. With four partials present, the advantage of the interference pattern cue has just about disappeared. Presumably, if stimulus partials had been chosen harmonically, as is the case in most musical and voiced-speech sounds, the curves of Figure 1.4 and Figure 1.5 would have shown a quite different shape. It is left to future research to find out whether this is true.


Chapter 2

Discrimination of changes in the amplitude of two-tone complexes¹

Abstract

Discrimination experiments have been performed in which the component amplitudes of two-tone complexes were varied such that a change in the spectral shape was obtained. Thresholds were measured as a function of frequency ratio, centre frequency and overall intensity. In most experiments, the overall intensity was varied randomly between each and every presentation, to avoid discrimination on the basis of changes in loudness. The results show that performance is best for frequency ratios of about one semitone, hardly depending on centre frequency. For bandwidths of one semitone and beyond, thresholds can be explained in terms of a multi-channel profile-analysis model. For bandwidths less than one semitone, the EWAIF-model can account for the data, and EWAIF-values correspond with pure-tone frequency JNDs.

2.1 Introduction

This chapter is concerned with the ability of the auditory system to discriminate between two signals that differ only with respect to their spectral shape. The topic is relevant, since discrimination between different speech or musical timbres requires the ability of the auditory system to compare the activity in different regions across the spectrum. Spectral-shape discrimination has gained considerable interest during the last decade. Experiments showed not only that the auditory system is able to compare the output of remote frequency channels in a relative fashion, but, more importantly, that the mere presence of components in bands, remote from the band in which a change in amplitude had to be detected, could increase or decrease threshold. In a typical profile-analysis experiment, Green, Mason & Kidd Jr. (1984) investigated the detectability of an increment in the amplitude of the centre component of an otherwise flat multi-component spectrum. They found that the threshold decreased if non-changing frequency components were placed remotely (i.e. many critical bands away) from the centre component. Their results were explained qualitatively by assuming that the auditory system, in one way or another, is able to code and compare the two spectral shapes or profiles. Up to now, many experiments have been performed by Green and co-workers where thresholds were measured as a function of many variables, e.g. the number of components or the component spacing. These results show that the classical theory of the critical band has to be extended, in the sense that not only the signal-to-noise ratio within one critical band determines the detectability, but also across-band information can be used.

¹ Part of the results presented in this chapter appeared in: Versfeld, N.J. & Houtsma, A.J.M. (1992). "Spectral-shape discrimination of two-tone complexes", in: Auditory physiology and perception, Y. Cazals, L. Demany, K. Horner, eds. (Pergamon Press, Oxford), 363-371.

A change in the spectral shape of a signal can be achieved in many different ways. Therefore, in this chapter an attempt is made to measure discrimination thresholds for changes in the spectral profile with the simplest possible signals, namely two-tone complexes. They consist of two sinusoids with amplitudes changing relative to each other. Several papers have been published on experiments with amplitude variations in such two-tone complexes. In an attempt to gain insight into the perception of formants, Morton & Carpenter (1963) measured difference limens for two-tone complexes, where either both components were increased or decreased in intensity (so that the overall intensity changed), or where one component was increased and the other decreased (so that the spectral shape changed, but the overall intensity remained unchanged). They found that, with small frequency separations, thresholds for the former condition were smaller than for the latter, whereas with large separations they were about the same. Their results were in line with critical-band models, and were used to support the hypothesis that the two most prominent harmonics were sufficient to discriminate between two different formant positions. Feth and coworkers also reported on discrimination of changes in the component amplitudes of two-tone complexes. Initially, Feth (1974) tried to explain the results in terms of the perceived change in pitch if the relative amplitude of the components were changed. Later, Feth & O'Malley (1977) used these signals to obtain a measure for the width of the critical band. In subsequent papers, Feth, O'Malley & Ramsey Jr. (1982) and Ananthraraman, Krishnamurthy & Feth (1992) extended the experiments to come to a model that could account for the thresholds measured with narrowband signals. Recently, Ito (1990) investigated the ability of the auditory system to make across-channel comparisons, by using two-tone complexes having a large frequency separation (several critical bands). In contrast with most profile-analysis results, she found large effects of a roving overall intensity level. Furthermore, her results showed that thresholds were almost independent of frequency separation. She concluded that the multi-channel model of Durlach, Braida & Ito (1986) could only partly account for her results. In a previous study, Versfeld & Houtsma (1991) measured discrimination thresholds for changes in the spectral slope of multi-tone complexes. They made an attempt to study profile analysis with simple signals, though in most of their experiments, a roving level was absent. With two-tone complexes, they found that threshold depended on frequency separation, and discrimination was best for a frequency separation of one semitone.

Two-tone complexes with component amplitudes changing in opposite direction are elegant signals for studying profile analysis, but are also very suitable for testing models for narrowband signals (Feth, 1974) and broadband signals (Durlach et al., 1986). Though the behaviour of threshold as a function of several parameters such as frequency separation has been studied to some extent, thresholds for narrowband two-tone complexes in the presence of a roving level have received less attention. This chapter starts with a detailed description of the two-tone complexes. Then, the models of Feth (1974) and Durlach et al. (1986) will be discussed and applied to two-tone complexes. Finally, four experiments will be presented. The first experiment deals with thresholds of large-bandwidth two-tone complexes and results are discussed in terms of the multi-channel model of Durlach et al. (1986). In the second experiment, thresholds for amplitude changes in narrowband two-tone complexes are measured. Because it was found that spectral-shape changes in two-tone complexes were perceived as a change in pitch, a third experiment was conducted, where, in the presence of a roving level, pure-tone frequency JNDs are measured. The results of the second and third experiment are used to test the EWAIF-model of Feth (1974), together with some variants. The fourth experiment deals with the question whether and to what extent profile analysis is similar to level-increment detection. To this end, thresholds were measured for the two conditions as a function of overall level, and the "near-miss" to Weber's Law was investigated.

2.2 Characteristics of two-tone complexes

In this chapter, experiments are reported where thresholds for changes in the amplitudes of two-tone complexes have been measured as a function of a number of parameters. This section is concerned with some general properties of the two-component signal. For an extensive treatment of these signals, the reader is referred to Voelcker (1966). Only the most relevant issues are discussed. The waveform of a two-tone complex with frequency components $f_1$, $f_2$ ($0 < f_1 < f_2$), amplitudes $A_1$, $A_2$, and starting phases $\phi_1$, $\phi_2$, respectively, can be written as

$$A(t) = A_1 e^{i[2\pi f_1 t + \phi_1]} + A_2 e^{i[2\pi f_2 t + \phi_2]}. \tag{2.1}$$

$A(t)$ is the vector sum of the two vectors with amplitudes $A_1$ and $A_2$ in the complex plane, as can be seen in Figure 2.1. $A(t)$ can be rewritten as a function of its modulus $E(t)$ and its argument $\theta(t)$

$$A(t) = E(t)\,e^{i\theta(t)}. \tag{2.2}$$

Figure 2.1 reveals that the modulus $E(t)$ can be written as

$$E(t) = \sqrt{A_1^2 + 2A_1A_2\cos[2\pi\Delta f\,t + \Delta\phi] + A_2^2}, \tag{2.3}$$

where $\Delta f$ is defined as

$$\Delta f = f_2 - f_1, \tag{2.4}$$

and $\Delta\phi$ as

$$\Delta\phi = \phi_2 - \phi_1. \tag{2.5}$$
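As a numerical sanity check, the closed-form modulus of Equation (2.3) can be compared with the magnitude of the complex waveform of Equation (2.1). The sketch below is only an illustration: the frequencies, amplitudes and phases are arbitrary example values, and the function names are hypothetical.

```python
import cmath
import math

def two_tone(t, f1, f2, a1, a2, phi1=0.0, phi2=0.0):
    """Complex two-tone waveform A(t), Equation (2.1)."""
    return (a1 * cmath.exp(1j * (2 * math.pi * f1 * t + phi1))
            + a2 * cmath.exp(1j * (2 * math.pi * f2 * t + phi2)))

def envelope(t, f1, f2, a1, a2, phi1=0.0, phi2=0.0):
    """Closed-form modulus E(t), Equation (2.3)."""
    delta_f = f2 - f1          # Equation (2.4)
    delta_phi = phi2 - phi1    # Equation (2.5)
    return math.sqrt(a1 ** 2
                     + 2 * a1 * a2 * math.cos(2 * math.pi * delta_f * t + delta_phi)
                     + a2 ** 2)

# Arbitrary example values (not taken from the experiments).
PARAMS = dict(f1=1000.0, f2=1060.0, a1=1.0, a2=0.8, phi1=0.3, phi2=1.1)
times = [k * 1e-5 for k in range(5000)]   # 50 ms sampled every 10 microseconds
max_err = max(abs(abs(two_tone(t, **PARAMS)) - envelope(t, **PARAMS))
              for t in times)
```

The largest deviation `max_err` should be at the level of floating-point rounding, which confirms that the envelope depends only on $\Delta f$ and $\Delta\phi$, not on the individual frequencies or phases.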

Figure 2.1: Vector representation of $A(t)$ in the complex plane.

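The argument $\theta(t)$ is given by the relation

\begin{align}
\tan[\theta(t)] &= \frac{\Im[A(t)]}{\Re[A(t)]} \tag{2.6} \\
&= \frac{A_1 \sin[2\pi f_1 t + \phi_1] + A_2 \sin[2\pi f_2 t + \phi_2]}{A_1 \cos[2\pi f_1 t + \phi_1] + A_2 \cos[2\pi f_2 t + \phi_2]}, \tag{2.7}
\end{align}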

where $\Im$ and $\Re$ stand for the imaginary and real part of the waveform, respectively. The time-derivative of $\theta(t)$ is related to the instantaneous frequency $f(t)$ as

$$f(t) = \frac{1}{2\pi}\frac{\partial\theta(t)}{\partial t}. \tag{2.8}$$

Insertion of $\theta(t)$, derived from Equation (2.7), into Equation (2.8) yields

\begin{align}
f(t) &= \frac{f_1 A_1^2 + f_2 A_2^2 + 2\bar{f} A_1 A_2 \cos[2\pi\Delta f\,t + \Delta\phi]}{A_1^2 + A_2^2 + 2A_1A_2\cos[2\pi\Delta f\,t + \Delta\phi]} \tag{2.9} \\
&= \bar{f} + \frac{\Delta f}{2}\,\frac{A_2^2 - A_1^2}{A_1^2 + A_2^2 + 2A_1A_2\cos[2\pi\Delta f\,t + \Delta\phi]}, \tag{2.10}
\end{align}

where $\bar{f}$ is the arithmetic mean of $f_1$ and $f_2$

$$\bar{f} = (f_1 + f_2)/2. \tag{2.11}$$

Figure 2.2: Features of a complementary pair of two-tone complexes. (a) Frequency spectrum. (b) Waveform. (c) Envelope function. (d) Instantaneous frequency.
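The closed-form expression of Equation (2.10) can be checked against a direct numerical derivative of the phase of the complex waveform, following Equation (2.8). The sketch below uses arbitrary example parameters and a central finite difference; all names and values are illustrative assumptions.

```python
import cmath
import math

F1, F2 = 1000.0, 1060.0      # component frequencies in Hz (example values)
A1, A2 = 1.0, 0.8            # component amplitudes (example values)
PHI1, PHI2 = 0.3, 1.1        # starting phases in radians (example values)

def waveform(t):
    """Complex two-tone waveform, Equation (2.1)."""
    return (A1 * cmath.exp(1j * (2 * math.pi * F1 * t + PHI1))
            + A2 * cmath.exp(1j * (2 * math.pi * F2 * t + PHI2)))

def inst_freq_closed_form(t):
    """Instantaneous frequency from Equation (2.10)."""
    f_mean = 0.5 * (F1 + F2)
    delta_f = F2 - F1
    delta_phi = PHI2 - PHI1
    denom = (A1 ** 2 + A2 ** 2
             + 2 * A1 * A2 * math.cos(2 * math.pi * delta_f * t + delta_phi))
    return f_mean + 0.5 * delta_f * (A2 ** 2 - A1 ** 2) / denom

def inst_freq_numerical(t, h=1e-7):
    """Central-difference estimate of (1/2pi) d(theta)/dt, Equation (2.8)."""
    # Phase advance between t - h and t + h, free of 2*pi wrapping for small h.
    phase_step = cmath.phase(waveform(t + h) * waveform(t - h).conjugate())
    return phase_step / (2 * h) / (2 * math.pi)

times = [k * 1e-4 for k in range(1, 300)]
max_diff = max(abs(inst_freq_closed_form(t) - inst_freq_numerical(t))
               for t in times)
```

With these settings the two estimates should agree to well within a millihertz, including near the envelope minima where $f(t)$ swings far from $\bar{f}$.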

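Equation (2.10) shows that $f(t)$ is periodic with period $1/\Delta f$. It can be shown (Cherry & Phillips, 1961) that the time-averaged instantaneous frequency $\langle f \rangle$ (averaged over a whole number of periods) can have only three values, namely

$$\langle f \rangle = \begin{cases} \bar{f} - \tfrac{1}{2}\Delta f = f_1 & \text{if } A_1 > A_2 \\ \bar{f} & \text{if } A_1 = A_2 \\ \bar{f} + \tfrac{1}{2}\Delta f = f_2 & \text{if } A_1 < A_2 \end{cases} \tag{2.12}$$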
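The result of Equation (2.12) can be illustrated numerically by averaging Equation (2.10) over one envelope period. The following sketch uses a midpoint rule and arbitrary example values; it covers only the two cases with unequal amplitudes, because for $A_1 = A_2$ the expression is undefined at the envelope zero. Function names are hypothetical.

```python
import math

def instantaneous_frequency(t, f1, f2, a1, a2, delta_phi=0.0):
    """f(t) from Equation (2.10)."""
    f_mean = 0.5 * (f1 + f2)
    delta_f = f2 - f1
    denom = (a1 ** 2 + a2 ** 2
             + 2 * a1 * a2 * math.cos(2 * math.pi * delta_f * t + delta_phi))
    return f_mean + 0.5 * delta_f * (a2 ** 2 - a1 ** 2) / denom

def time_averaged_frequency(f1, f2, a1, a2, n=2000):
    """Average f(t) over one envelope period 1/(f2 - f1), midpoint rule."""
    period = 1.0 / (f2 - f1)
    samples = (instantaneous_frequency((k + 0.5) * period / n, f1, f2, a1, a2)
               for k in range(n))
    return sum(samples) / n

# Example values: the stronger component determines the average.
avg_low_dominant = time_averaged_frequency(1000.0, 1060.0, 1.0, 0.8)   # A1 > A2
avg_high_dominant = time_averaged_frequency(1000.0, 1060.0, 0.8, 1.0)  # A1 < A2
```

In both cases the average lands on the frequency of the stronger component, independent of how much stronger it is.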

Figure 2.2 displays a complementary pair of two-tone complexes, i.e., a two-tone complex with components $(f_1, A + \Delta A)$ and $(f_2, A)$, and another complex with components $(f_1, A)$ and $(f_2, A + \Delta A)$. From top to bottom, the frequency spectrum, the waveform, the envelope function and the instantaneous frequency are plotted for each complex. Equation (2.3) as well as Figure 2.2 show that the envelope $E(t)$ is invariant under exchange of $A$ and $A + \Delta A$. The instantaneous frequency $f(t)$, however, is not. Figure 2.2 shows that $f(t)$ becomes its mirror image with respect to $\bar{f}$, which can also be seen from Equation (2.10). The extreme values for $f(t)$, $f_e$, are

$$f_e = \begin{cases} \bar{f} + \dfrac{\Delta f}{2}\,\dfrac{A_2 + A_1}{A_2 - A_1} & \text{if } \cos[2\pi\Delta f\,t + \Delta\phi] = -1 \\[2ex] \bar{f} + \dfrac{\Delta f}{2}\,\dfrac{A_2 - A_1}{A_2 + A_1} & \text{if } \cos[2\pi\Delta f\,t + \Delta\phi] = 1 \end{cases} \tag{2.13}$$

The extremes can reach arbitrary values, depending on the relative values of $A_1$ and $A_2$.

Figure 2.3: Waveform of a complementary pair of two-tone complexes, together with a sinusoid of frequency $f_1$. (a) MP-waveform. (b) NMP-waveform.

A striking aspect of the waveform is the "minimum phase" (MP) and the "non-minimum phase" (NMP), as they are called by Voelcker (1966). In Figure 2.3 the waveform of both two-tone complexes has been replotted. Also plotted in this figure is the waveform of a sinusoid with frequency $f_1$. As can be seen from the figure, the sinusoid remains approximately in phase with the waveform of the two-tone complex in one case (MP), whereas in the other case it is out of phase at the envelope minimum of the two-tone complex (NMP). With the MP two-tone complex, the number of cycles per envelope period equals that of the sinusoid, whereas in the case of the NMP two-tone complex, the number of periods is increased by one. For the NMP signal, the number of periods in one envelope cycle equals

$$\frac{f_1}{\Delta f} + 1. \tag{2.14}$$

Thus, if the number of periods is increased by one, its frequency becomes

$$\left(\frac{f_1}{\Delta f} + 1\right)\Delta f = f_1 + \Delta f = f_2. \tag{2.15}$$

Therefore, if a sinusoid with frequency $f_2$ is plotted together with an NMP-signal, they will remain in phase. This curious phenomenon is a consequence of the fact that $\langle f \rangle$ can only be either $f_1$, $f_2$, or $\bar{f}$.


2.3 The EWAIF model and its variants

Experiments with two-tone complexes go back to the previous century, when Helmholtz (1954) observed a change in pitch of two slightly detuned, hence slowly beating, tuning forks. This change in pitch was attributed to the change in instantaneous frequency, as given by Equation (2.10). Jeffress (1968) demonstrated the effect of changes in pitch more clearly by gating that portion of the waveform for which the largest variations in the instantaneous frequency occurred. The first modern discrimination experiments with two-tone complexes were performed by Feth (1974). He measured the discriminability between a complementary pair of two-tone complexes (cf. Figure 2.2) as a function of the difference in amplitude $\Delta A$ and of the frequency separation $\Delta f$ between the two components. In a subsequent paper, Feth & O'Malley (1977) used the two-tone complexes to obtain a measure for the bandwidth of the auditory filter. The results showed, first of all, that the signals of this complementary pair can be discriminated, and that the discriminability depends on the frequency separation $\Delta f$ as well as on the difference in amplitude $\Delta A$. The reported cue was a change in pitch. If a change in pitch were the sole cue for discrimination, then one should be able to map the amplitude difference $\Delta A$ with the frequency difference $\Delta f$ into a number that corresponds to the perceived difference in pitch. To this end, a first attempt is to simply calculate the difference in amplitude-weighted frequency, or $\Delta$AWF,

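\begin{align}
\Delta\mathrm{AWF} &= \frac{(A f_1 + [A + \Delta A] f_2) - ([A + \Delta A] f_1 + A f_2)}{A + [A + \Delta A]} \tag{2.16} \\
&= \Delta f\,\frac{a - 1}{a + 1}, \tag{2.17}
\end{align}

where $a = (A + \Delta A)/A$. Instead of weighting the amplitude, one might also consider the difference in intensity-weighted frequency, or $\Delta$IWF

\begin{align}
\Delta\mathrm{IWF} &= \frac{(A^2 f_1 + [A + \Delta A]^2 f_2) - ([A + \Delta A]^2 f_1 + A^2 f_2)}{A^2 + [A + \Delta A]^2} \tag{2.18} \\
&= \Delta f\,\frac{a^2 - 1}{a^2 + 1}. \tag{2.19}
\end{align}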
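The two static measures are simple enough to verify directly. The sketch below evaluates the definitions in Equations (2.16) and (2.18) as differences of weighted mean frequencies and compares them with the compact forms of Equations (2.17) and (2.19). The numerical values are arbitrary examples and the helper names are hypothetical.

```python
def delta_awf(delta_f, a):
    """Difference in amplitude-weighted frequency, Equation (2.17)."""
    return delta_f * (a - 1.0) / (a + 1.0)

def delta_iwf(delta_f, a):
    """Difference in intensity-weighted frequency, Equation (2.19)."""
    return delta_f * (a * a - 1.0) / (a * a + 1.0)

def weighted_frequency(f1, w1, f2, w2):
    """Weighted mean of two component frequencies."""
    return (w1 * f1 + w2 * f2) / (w1 + w2)

# Example values: components at 1000 and 1060 Hz, A = 1, delta_A = 0.2.
f1, f2, amp, d_amp = 1000.0, 1060.0, 1.0, 0.2
a = (amp + d_amp) / amp

# Direct evaluation of the definitions (2.16) and (2.18).
awf_direct = (weighted_frequency(f1, amp, f2, amp + d_amp)
              - weighted_frequency(f1, amp + d_amp, f2, amp))
iwf_direct = (weighted_frequency(f1, amp ** 2, f2, (amp + d_amp) ** 2)
              - weighted_frequency(f1, (amp + d_amp) ** 2, f2, amp ** 2))
```

Both closed forms depend only on $\Delta f$ and the amplitude ratio $a$, which is why the absolute frequencies drop out of the comparison.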


A different approach to map amplitude and frequency differences into a single number is by using the dynamic (temporal) behaviour of the two-tone complexes. If one recalls that the envelope function for both signals of the complementary pair is identical, it is plausible to assume that the instantaneous frequency bears the cue for the discrimination. Unfortunately, Equation (2.12) shows that the time-averaged instantaneous frequency $\langle f \rangle$ does not depend on the size of the difference in amplitude $\Delta A$, and therefore it cannot account for the experimental results. Feth (1974) proposed that the difference in the envelope-weighted average of the instantaneous frequency, or EWAIF, might provide the listener with a cue for discrimination. EWAIF is defined as

$$\mathrm{EWAIF} = \frac{\int_0^T E(t)\,f(t)\,dt}{\int_0^T E(t)\,dt}. \tag{2.20}$$

The averaging is over a time interval $T$. Applying this model to the present pair of complementary two-tone complexes gives, with the aid of Equations (2.3) and (2.10), the difference in EWAIF,

\begin{align}
\Delta\mathrm{EWAIF} &= \mathrm{EWAIF}_{(f_1,A),(f_2,A+\Delta A)} - \mathrm{EWAIF}_{(f_1,A+\Delta A),(f_2,A)} \tag{2.21} \\
&= \Delta f\,[a^2 - 1]\,\frac{\displaystyle\int_0^T \frac{1}{\sqrt{1 + 2a\cos(2\pi\Delta f\,t + \Delta\phi) + a^2}}\,dt}{\displaystyle\int_0^T \sqrt{1 + 2a\cos(2\pi\Delta f\,t + \Delta\phi) + a^2}\,dt}. \tag{2.22}
\end{align}
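Equation (2.22) has no elementary closed form, but it is easy to evaluate numerically. The sketch below integrates over one envelope period ($T = 1/\Delta f$) with a midpoint rule, which is an implementation choice made here for illustration and not a method described in the text. The parameter values are arbitrary examples.

```python
import math

def delta_ewaif(delta_f, a, delta_phi=0.0, n=4000):
    """Numerically evaluate Equation (2.22) with T = 1/delta_f.

    Midpoint-rule integration over the phase x = 2*pi*delta_f*t + delta_phi.
    Requires a != 1, otherwise the envelope reaches zero.
    """
    inv_env_sum = 0.0   # accumulates 1 / sqrt(1 + 2a cos x + a^2)
    env_sum = 0.0       # accumulates     sqrt(1 + 2a cos x + a^2)
    for k in range(n):
        x = 2.0 * math.pi * (k + 0.5) / n + delta_phi
        g = math.sqrt(1.0 + 2.0 * a * math.cos(x) + a * a)
        inv_env_sum += 1.0 / g
        env_sum += g
    # The common step size cancels in the ratio of the two integrals.
    return delta_f * (a * a - 1.0) * inv_env_sum / env_sum

delta_f = 60.0   # example frequency separation in Hz
a = 1.2          # example amplitude ratio (A + delta_A) / A
ewaif = delta_ewaif(delta_f, a)
iwf = delta_f * (a * a - 1.0) / (a * a + 1.0)   # Equation (2.19), for comparison
```

For the same stimulus the envelope-weighted measure comes out larger than the intensity-weighted one, while still staying below $\Delta f$. Over a whole envelope period the result does not depend on $\Delta\phi$.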

Similarly, one might also think of the intensity-weighted average of the instantaneous frequency, or IWAIF (Ananthraraman et al., 1992), defined as

$$\mathrm{IWAIF} = \frac{\int_0^T E^2(t)\,f(t)\,dt}{\int_0^T E^2(t)\,dt}. \tag{2.23}$$

Hence, in contrast with the EWAIF-model, now the envelope function is squared. For a complementary pair of two-tone complexes, the difference in IWAIF, $\Delta$IWAIF, is given by

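$$\Delta\mathrm{IWAIF} = \Delta f\,\frac{a^2 - 1}{a^2 + 1 + \dfrac{a}{\pi\Delta f\,T}\left[\sin(2\pi\Delta f\,T + \Delta\phi) - \sin(\Delta\phi)\right]}. \tag{2.24}$$

If $T \gg 1/\Delta f$, or in the special case that $T$ encompasses a whole number of envelope periods, the expression for $\Delta$IWAIF reduces to

$$\Delta\mathrm{IWAIF} = \Delta f\,\frac{a^2 - 1}{a^2 + 1}. \tag{2.25}$$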
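The way Equation (2.24) collapses to Equation (2.25) can be seen with a few lines of code. The sketch below evaluates both expressions for an integration time equal to a whole number of envelope periods and for a long, arbitrary window. The parameter values are illustrative only.

```python
import math

def delta_iwaif(delta_f, a, duration, delta_phi=0.0):
    """Difference in IWAIF for integration time `duration`, Equation (2.24)."""
    ripple = (a / (math.pi * delta_f * duration)) * (
        math.sin(2.0 * math.pi * delta_f * duration + delta_phi)
        - math.sin(delta_phi))
    return delta_f * (a * a - 1.0) / (a * a + 1.0 + ripple)

def delta_iwaif_limit(delta_f, a):
    """Limiting form for long or whole-period integration, Equation (2.25)."""
    return delta_f * (a * a - 1.0) / (a * a + 1.0)

delta_f, a = 60.0, 1.2                       # example values
whole_periods = delta_iwaif(delta_f, a, duration=3.0 / delta_f, delta_phi=0.4)
long_window = delta_iwaif(delta_f, a, duration=10.0123, delta_phi=0.4)
limit = delta_iwaif_limit(delta_f, a)
```

The correction term in the denominator is bounded by $2a/(\pi\Delta f\,T)$, so its influence fades in proportion to $1/T$.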

Equation (2.25) is identical to Equation (2.19). The four models have several properties in common. Equations (2.17), (2.19), (2.22) and (2.25) show a dependence on the frequency difference $\Delta f$, not on the absolute frequency values of $f_1$ or $f_2$. This means that a linear shift over the frequency axis does not affect the values of $\Delta$AWF, $\Delta$IWF, $\Delta$EWAIF or $\Delta$IWAIF. Similarly, the models are a function of the amplitude ratio. This implies that Weber's Law holds, i.e., the model outcome does not change as long as the ratio $\Delta A/A$ is kept constant. It can be shown that values of $\Delta$AWF, $\Delta$IWF, $\Delta$EWAIF or $\Delta$IWAIF are always smaller than $\Delta f$, regardless of the magnitude of $\Delta A$. The frequency separation between the two components of the two-tone complex can be expressed as the frequency ratio in semitones (ST)

$$\text{Frequency ratio (ST)} = 12\log_2[f_2/f_1]. \tag{2.26}$$

The difference in the amplitude can be expressed as the level difference $\Delta L$ (in dB)

$$\Delta L = 20\log_{10}\frac{A + \Delta A}{A}. \tag{2.27}$$

Figure 2.4 shows curves for $\Delta L$ (dB) as a function of $\Delta f$ (ST) with the quantities $\Delta$AWF (dashed curve), $\Delta$IWF, $\Delta$IWAIF (both dotted curve), and $\Delta$EWAIF (solid curve), all being fixed at 3 Hz. The centre frequency $f_c$ (defined as $\sqrt{f_1 f_2}$) here is 1 kHz, and the integration time $T$ has been taken equal to $1/\Delta f$. Figure 2.4 shows that the models have the same global behaviour. All curves have an asymptote for a frequency separation of 3 Hz (about 0.052 ST). Furthermore, $\Delta L$ decreases as the components' separation becomes larger. In all models $\Delta L$ goes to zero at large separations.
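The dashed and dotted curves of this kind can be reproduced by inverting Equations (2.17) and (2.19) for the amplitude ratio $a$ and converting with Equations (2.26) and (2.27). The sketch below is one possible way to do this; the helper names are hypothetical, and the inversion $a = (\Delta f + c)/(\Delta f - c)$ for a fixed model value $c$ is simple algebra rather than a formula quoted from the text.

```python
import math

def frequency_ratio_st(f1, f2):
    """Frequency ratio in semitones, Equation (2.26)."""
    return 12.0 * math.log2(f2 / f1)

def level_difference_db(a):
    """Level difference in dB for amplitude ratio a = (A + dA)/A, Equation (2.27)."""
    return 20.0 * math.log10(a)

def components_from_centre(fc, delta_f):
    """Solve f2 - f1 = delta_f with sqrt(f1 * f2) = fc for (f1, f2)."""
    half = 0.5 * delta_f
    root = math.sqrt(fc * fc + half * half)
    return root - half, root + half

def level_for_fixed_awf(delta_f, target):
    """dB level difference at which Delta-AWF equals `target` (delta_f > target)."""
    return level_difference_db((delta_f + target) / (delta_f - target))

def level_for_fixed_iwf(delta_f, target):
    """dB level difference at which Delta-IWF (or Delta-IWAIF) equals `target`."""
    return level_difference_db(math.sqrt((delta_f + target) / (delta_f - target)))

# Position of the asymptote: a 3-Hz separation around a 1-kHz centre frequency.
f1, f2 = components_from_centre(1000.0, 3.0)
asymptote_st = frequency_ratio_st(f1, f2)
```

As $\Delta f$ approaches the fixed model value the required amplitude ratio grows without bound, which is the asymptote near 0.052 ST. In this inversion the intensity-weighted curve requires exactly half the level difference (in dB) of the amplitude-weighted curve.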

Figure 2.4: $\Delta L$ (dB) as a function of the frequency separation $\Delta f$ (ST) of the two components for a fixed value of $\Delta$AWF (dashed curve), $\Delta$IWF or $\Delta$IWAIF (dotted curve), and $\Delta$EWAIF (solid curve).

2.4 The multi-channel model

In the late sixties and early seventies, Plomp (1976) and coworkers studied the subjective dissimilarity between periodic sounds (musical instruments, vowels, etc.) that differed only with respect to their spectral shape. The signals had the same fundamental frequency, and were equalized in loudness beforehand. In a triadic comparison experiment, subjects had to indicate which pair out of three presented sounds was most dissimilar and which pair was most similar. Parallel to the experiment, the signals were analysed by resolving the spectra into a set of 15 adjacent 1/3-octave band filters. Each signal then was represented by a vector in 15-dimensional space. The difference between two signals, $S_1$ and $S_2$, was defined as the sum of differences in level $L$ of the individual bands

(2.28)

where the value of $a$ was determined such that an optimal correlation was obtained between the physically and perceptually measured distance matrices. It appeared that $a$ could be varied to some extent without impairing the correlation very much. In general, the subjective dissimilarity and the measured spectral distance correlated very well. This led to the conclusion that differences in timbre can be predicted well from spectral differences.

Plomp's experiments were not designed to measure thresholds for differences between the spectra of sounds. The different sound spectra in his experiments were all very well discriminable. In an attempt to model intensity discrimination, Florentine & Buus (1981) adopted more or less the same strategy as Plomp. They calculated in each critical band $\Delta L_i$, the difference in excitation level between two sounds that were just discriminable. The $\Delta L_i$'s were transformed to the sensitivity in each band, $d'_i$, with a simple linear relation. These, in turn, were combined to produce the total sensitivity. With this model they could explain a number of intensity-discrimination data.

A shortcoming of the above-mentioned model is that it cannot account for discrimination results obtained in the presence of a roving intensity level, since the ability of the auditory system to make interchannel comparisons was not modelled. Profile-analysis experiments demonstrated that components remote from each other could be compared in level (e.g., Green et al., 1984). This was first modelled by Durlach et al. (1986) with the introduction of interchannel correlated noise. This section briefly outlines the general multi-channel model. For a complete overview, the reader is referred to Durlach et al. (1986) and the doctoral dissertation of Ito (1990).

With multi-channel models, the signal is filtered first into a set of $N$ frequency bands (not necessarily critical bands). Due to either internal or external noise (e.g. of neural firing or a noisy signal) the output of each band is corrupted with (Gaussian) noise. The output of the bands can be represented by an $N$-dimensional vector $\mathbf{y}$, which is stochastic in nature. Suppose that a signal $S_1$ has to be discriminated from $S_2$. The signals give rise to vectors $\mathbf{y}_1$ and $\mathbf{y}_2$, respectively, and the difference between $\mathbf{y}_2$ and $\mathbf{y}_1$ is a measure of dissimilarity between the two signals. The expected value of $\mathbf{y}_i$ in the presence of signal $S_i$ is (in the $j$th channel) denoted as $\mu_{i,j}$

$$\mu_{i,j} = \mathrm{E}[y_{i,j}]. \tag{2.29}$$

The difference in the expected values of $\mathbf{y}_1$ and $\mathbf{y}_2$ is given by $\boldsymbol{\delta}$, where

$$\delta_j = \mu_{2,j} - \mu_{1,j}. \tag{2.30}$$

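The interaction between channels is represented in a covariance matrix $\Lambda$

$$\Lambda = \begin{pmatrix} \sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \cdots & \rho_{1N}\sigma_1\sigma_N \\ \rho_{21}\sigma_2\sigma_1 & \sigma_2^2 & \cdots & \rho_{2N}\sigma_2\sigma_N \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{N1}\sigma_N\sigma_1 & \rho_{N2}\sigma_N\sigma_2 & \cdots & \sigma_N^2 \end{pmatrix}. \tag{2.31}$$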

The noise processes in the different bands are partly correlated. The variances of the individual channels are situated in the diagonal elements, and the interaction between the channels is given by the off-diagonal elements. The correlation coefficients $\rho_{jk}$ can have only values between plus and minus unity. An important property of a covariance matrix is that it is positive definite ($\Lambda_{jj} > 0$), and symmetric ($\Lambda_{jk} = \Lambda_{kj}$). Therefore $\rho_{jk} = \rho_{kj}$.

With signal detection theory, it can be shown that, using a maximum-likelihood criterion, the sensitivity $d'$ is given by

$$(d')^2 = \delta^T A^{-1} \delta, \tag{2.32}$$

where $\delta^T$ is the transpose of $\delta$ and $A^{-1}$ is the inverse matrix of $A$ (which always exists, because all diagonal elements are positive). In the special case of the noise processes being independent across channels ($\rho_{jk} = 0$), the off-diagonal elements become zero, and Equation (2.32) reduces to

$$(d')^2 = \sum_{j=1}^{N} \frac{\delta_j^2}{\sigma_j^2}, \tag{2.33}$$

where $A_{jj}^{-1}$ is equal to $1/\sigma_j^2$. Equation (2.33) was used by Florentine & Buus (1981), where $\delta_j$ was the observed level difference in a band, and $\sigma_j$ the just-noticeable level difference for the same band. Equation (2.33) resembles Equation (2.28), with a = 2, which was used by Plomp (1976) to match timbre space with spectral space.
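The sensitivity formulas above can be illustrated with a minimal sketch. It assumes that Equations (2.32) and (2.33) give the squared sensitivity, and it restricts the correlated case to two channels so that the matrix inverse can be written out by hand. The function names are illustrative and do not come from the thesis.

```python
import math


def d_prime_independent(delta, sigma):
    """Independent channels (cf. Equation 2.33): (d')^2 = sum_j delta_j^2 / sigma_j^2."""
    return math.sqrt(sum((d / s) ** 2 for d, s in zip(delta, sigma)))


def d_prime_two_channel(delta, sigma, rho):
    """Two correlated channels: (d')^2 = delta^T A^-1 delta (cf. Equation 2.32).

    A has sigma_j^2 on the diagonal and rho * sigma_1 * sigma_2 off the diagonal.
    """
    d1, d2 = delta
    s1, s2 = sigma
    a11, a22, a12 = s1 * s1, s2 * s2, rho * s1 * s2
    det = a11 * a22 - a12 * a12  # must be positive, i.e. |rho| < 1
    quad = (a22 * d1 * d1 + a11 * d2 * d2 - 2.0 * a12 * d1 * d2) / det
    return math.sqrt(quad)


if __name__ == "__main__":
    # With rho = 0 the two-channel form reduces to the independent-channel sum.
    print(d_prime_two_channel((1.0, 1.0), (1.0, 1.0), 0.0))
    print(d_prime_independent((1.0, 1.0), (1.0, 1.0)))
```

With positive correlation, opposite-signed differences (a change in shape) yield a larger $d'$ than same-signed differences (a change in level), which is the property the remainder of the section exploits.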

With the present two-tone complexes, the above equations are situated in two-dimensional space, and Equation (2.32) becomes

$$(d')^2 = \frac{A_{22}\delta_1^2 + A_{11}\delta_2^2 - 2A_{12}\delta_1\delta_2}{A_{11}A_{22} - A_{12}^2}. \tag{2.34}$$

The difference in amplitude $\delta$ for a complementary pair of two-tone complexes (cf. Figure 2.2), which denotes a change in the spectral shape, becomes

$$\delta_1 = -\delta_2 = \Delta L. \tag{2.35}$$

Equation (2.34) then becomes

$$(d')^2 = \Delta L^2 \, \frac{A_{22} + A_{11} + 2A_{12}}{A_{11}A_{22} - A_{12}^2}. \tag{2.36}$$

At threshold, i.e., $d' = 1$, Equation (2.36) yields an expression for $\Delta L$:¹

$$\Delta L^2 = \frac{A_{11}A_{22} - A_{12}^2}{A_{22} + A_{11} + 2A_{12}}. \tag{2.37}$$

In profile-analysis experiments, a so-called roving level is often used, meaning that the overall level fluctuates randomly from one tone burst to the next. These fluctuations do not add information, so Equation (2.35) remains unchanged. A common noise term $\sigma_R^2$, however, is introduced into all matrix elements of the covariance matrix in Equation (2.31). This changes the covariance matrix for the two-dimensional case into

$$A = \begin{pmatrix} \sigma_1^2 + \sigma_R^2 & \rho\sigma_1\sigma_2 + \sigma_R^2 \\ \rho\sigma_1\sigma_2 + \sigma_R^2 & \sigma_2^2 + \sigma_R^2 \end{pmatrix}, \tag{2.38}$$

where $\rho = \rho_{12} = \rho_{21}$. Insertion of the matrix elements of $A$ from Equation (2.38) into Equation (2.37) gives

$$\left(\frac{\Delta L}{\sigma_2}\right)^2 = \frac{(1 - \rho^2) + (1 - 2\rho\gamma + \gamma^2)\,\sigma_R^2/\sigma_2^2}{1 + 2\rho\gamma + \gamma^2 + 4\gamma^2\sigma_R^2/\sigma_2^2}, \tag{2.39}$$

where $\gamma = \sigma_2/\sigma_1$. If no roving level is present ($\sigma_R = 0$), Equation (2.39) reduces to

$$\left(\frac{\Delta L}{\sigma_2}\right)^2 = \frac{1 - \rho^2}{1 + 2\rho\gamma + \gamma^2}. \tag{2.40}$$

¹ Note that thresholds are denoted with Small Caps.

In the model, $\sigma_j$ is proportional to the single-tone threshold $\Delta L_j$ of the $j$th component as

$$\Delta L_j = \frac{\sigma_j}{w_j}. \tag{2.41}$$

The proportionality constant $w_j$ yields the weight with which a component contributes to the decision process. With optimum processing, $w_j = 1$. Non-optimum processing results in $w_j < 1$, so that the contribution of that component is less, and threshold increases. The assumption of $\sigma_1 = \sigma_2 = \sigma$ simplifies Equation (2.39) to

$$\frac{\Delta L}{\sigma} = \sqrt{\frac{1 - \rho}{2}}, \tag{2.42}$$

which does not depend on $\sigma_R$, so introducing a roving level does not affect threshold. Two extreme cases can now be determined. Firstly, if the noise processes in the two channels are uncorrelated, $\rho = 0$ and Equation (2.42) reduces to

$$\frac{\Delta L}{\sigma} = \frac{1}{\sqrt{2}}. \tag{2.43}$$

This agrees with the idea that threshold reduces by a factor $\sqrt{M}$ if the number of (independent) observations is increased from one to $M$. Secondly, if the noises are fully correlated ($\rho = 1$), $\Delta L$ equals zero. With purely correlated noise, subtraction of the two signals eliminates all noise, hence threshold reduces to zero. In the case that the amplitudes of both components change in the same direction by an amount $\Delta L$ (change in level), $\delta$ becomes

$$\delta_1 = \delta_2 = \Delta L, \tag{2.44}$$

and $\Delta L$ at threshold can be written as

$$\Delta L^2 = \frac{A_{11}A_{22} - A_{12}^2}{A_{22} + A_{11} - 2A_{12}}. \tag{2.45}$$

Figure 2.5: Threshold $\Delta L/\sigma$ as a function of the interchannel correlation $\rho$. The solid line denotes the shape condition; the dashed curves the intensity condition. The values for $\sigma_R^2/\sigma^2$ are put near the respective dashed curves. [Plot not reproduced: ordinate $\Delta L/\sigma$ from 0 to 2.0, abscissa $\rho$ running from 1.0 on the left to 0.0 on the right.]

Insertion of the matrix elements of $A$ from Equation (2.38) into Equation (2.45) yields

$$\left(\frac{\Delta L}{\sigma_2}\right)^2 = \frac{1 - \rho^2}{1 - 2\rho\gamma + \gamma^2} + \frac{\sigma_R^2}{\sigma_2^2}. \tag{2.46}$$

If $\gamma = 1$, this equation reduces to

$$\frac{\Delta L}{\sigma} = \sqrt{\frac{1 + \rho}{2} + \frac{\sigma_R^2}{\sigma^2}}. \tag{2.47}$$

Threshold for a change in level is therefore affected by a roving level, as one would intuitively expect. Figure 2.5 summarizes the present results for $\gamma = 1$. Here, $\Delta L/\sigma$ is plotted as a function of the correlation $\rho$ (Equations 2.42 and 2.47). For the level changes (dashed curves), the values for $\sigma_R^2/\sigma^2$ have been taken as 0.01, 0.1 and 1.0. If the components of the two-tone complex are close together, perhaps within a critical band, the noise is partially correlated, so $\rho > 0$. The internal noise of remote bands is less correlated, so going from left to right in Figure 2.5, thresholds are plotted for two-tone complexes with an increasing frequency separation. From this model, it is expected that for changes in the spectral shape, threshold increases as bandwidth increases. With changes in level, the reverse is true. Experiment I examines this prediction.
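The two predictions for $\gamma = 1$ can be tabulated with a short sketch. It assumes the square-root forms $\Delta L/\sigma = \sqrt{(1-\rho)/2}$ for a change in shape and $\Delta L/\sigma = \sqrt{(1+\rho)/2 + \sigma_R^2/\sigma^2}$ for a change in level; the function names and the parameter `rove_ratio` (standing for $\sigma_R^2/\sigma^2$) are illustrative only.

```python
import math


def shape_threshold_equal_sigma(rho):
    """Threshold / sigma for a change in spectral shape (cf. Equation 2.42)."""
    return math.sqrt((1.0 - rho) / 2.0)


def level_threshold_equal_sigma(rho, rove_ratio):
    """Threshold / sigma for a change in level (cf. Equation 2.47).

    rove_ratio is sigma_R^2 / sigma^2, the roving-level variance relative to
    the channel variance.
    """
    return math.sqrt((1.0 + rho) / 2.0 + rove_ratio)


if __name__ == "__main__":
    # rho decreases as the frequency separation of the components increases.
    for rho in (1.0, 0.75, 0.5, 0.25, 0.0):
        shape = shape_threshold_equal_sigma(rho)
        levels = [level_threshold_equal_sigma(rho, r) for r in (0.01, 0.1, 1.0)]
        print(rho, round(shape, 3), [round(v, 3) for v in levels])
```

As the correlation falls, the shape threshold rises and the level threshold falls, which is the opposite dependence on bandwidth described in the text.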

2.5 Experiments

In this section, four experiments are reported, which were designed to gain more insight into the perception and the discriminability of changes in the spectral shape of a complementary pair of two-tone complexes. Though most experiments reported in this section deal with these signals, some experiments with simple sinusoids or two-tone complexes without a change in spectral shape have also been performed.

2.5.1 General description

Stimulus conditions

Sounds were two-tone complexes with frequencies $f_1$ and $f_2$ and amplitudes $A$ and $A + \Delta A$, as illustrated in Figure 2.2. The frequency separation of the two components is often expressed as the frequency ratio in semitones (ST), and the difference in amplitude as the level difference $\Delta L$, as defined in Equations (2.26) and (2.27), respectively. A stimulus consisted of three sound bursts, each lasting 400 ms including a 20-ms linear onset and a 20-ms linear offset. The silent period between the sound bursts was 100 ms, so the total duration of one stimulus was 1400 ms. Before each new experimental condition, the subject was asked to adjust the overall intensity of the sound so that it was barely audible. During the experiment the overall level of each sound burst was varied randomly between 30 and 50 dB above this empirically established threshold. This so-called roving intensity level is an essential feature of profile-analysis experiments, and is applied to prevent subjects from using loudness cues, and to force them to listen solely to changes in the spectral shape.
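The timing of one stimulus can be laid out in a brief sketch. The burst duration, ramp duration, gap and roving range are taken from the description above; the uniform distribution of the roving level is an assumption (the text only says the level was varied randomly), and all names are illustrative.

```python
import random

BURST_MS = 400                 # burst duration, including both 20-ms ramps
GAP_MS = 100                   # silent period between bursts
RAMP_MS = 20                   # linear onset and offset ramp
ROVE_RANGE_DB = (30.0, 50.0)   # level above the subject's adjusted threshold


def burst_schedule(n_bursts=3, rng=None):
    """Return (onset_ms, offset_ms, level_db) for each burst of one stimulus.

    Assumption: the roving level is drawn uniformly and independently per burst.
    """
    rng = rng or random.Random()
    schedule = []
    for k in range(n_bursts):
        onset = k * (BURST_MS + GAP_MS)
        level = rng.uniform(*ROVE_RANGE_DB)
        schedule.append((onset, onset + BURST_MS, level))
    return schedule


def ramp_gain(t_ms):
    """Linear envelope gain (0 to 1) at time t_ms measured from burst onset."""
    if t_ms < 0 or t_ms > BURST_MS:
        return 0.0
    return min(1.0, t_ms / RAMP_MS, (BURST_MS - t_ms) / RAMP_MS)


if __name__ == "__main__":
    for onset, offset, level in burst_schedule(rng=random.Random(1)):
        print(onset, offset, round(level, 1))
```

The last burst ends at 1400 ms, matching the total stimulus duration stated in the text.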

Procedure

A subject, who was seated in a sound-insulated booth, received the stimulus diotically through TDH-49P headphones. One stimulus comprised three sound bursts, one of which differed from the other two (apart from the overall intensity). In one trial, one of six possible sequences could occur (AAB, ABA, BAA, BBA, BAB, or ABB, where A and B stand for the two different sounds). The subject's task was to indicate which burst contained the odd sound by pushing the appropriate button on a response box. There was no response-time limit, and visual feedback was provided immediately after each response. Each run comprised 150 trials and lasted 7 to 10 minutes.
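The six possible sequences and the position of the odd burst can be enumerated with a few lines. This is only an illustration of the three-interval oddity task described above; the function names are hypothetical.

```python
from itertools import product


def oddity_sequences():
    """All three-burst sequences over sounds A and B that contain one odd burst."""
    return ["".join(seq) for seq in product("AB", repeat=3) if len(set(seq)) == 2]


def odd_position(seq):
    """Index (0, 1 or 2) of the burst that differs from the other two."""
    for index, sound in enumerate(seq):
        if seq.count(sound) == 1:
            return index
    raise ValueError("sequence has no odd burst: " + seq)


if __name__ == "__main__":
    for seq in oddity_sequences():
        print(seq, odd_position(seq))
```

Because the odd burst is equally likely to fall in any of the three positions, a subject who guesses is correct on one third of the trials.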

An adaptive staircase procedure was adopted. Each run started with values of $\Delta L$ lying well above the expected threshold. $\Delta L$ was changed according to the stepping rule described in Appendix A. The stepping rule was chosen to converge on values of $\Delta L$ giving about 71% correct. Results were entered into the record only after the subject's performance no longer improved, i.e., there were no more learning effects. The entire psychometric function was estimated by fitting the pooled data to a parameterized psychometric function, as described in Appendix B, with a maximum-likelihood procedure, described in Appendix C. It was assumed that $d' \propto \Delta L$ (Buus & Florentine, 1991). Thus only one coefficient had to be estimated. The threshold was defined as that $\Delta L$ value for which $d' = 1$ on the established psychometric function. Note that thresholds are denoted with Small Caps.
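The stepping rule itself is given in Appendix A and is not reproduced here. As a hedged illustration only, the sketch below assumes a two-down one-up rule with a fixed step size, a common choice whose equilibrium point lies at about 70.7% correct and is therefore compatible with the "about 71%" quoted above. The class name, the fixed step and the floor at zero are all assumptions.

```python
import math


class TwoDownOneUp:
    """Assumed stepping rule: decrease after two consecutive correct responses,
    increase after every incorrect response."""

    def __init__(self, start, step):
        self.value = start          # current level difference (dB)
        self.step = step            # fixed step size (dB), an assumption
        self._correct_in_row = 0

    def update(self, correct):
        """Register one response and return the value for the next trial."""
        if correct:
            self._correct_in_row += 1
            if self._correct_in_row == 2:
                self.value = max(self.value - self.step, 0.0)
                self._correct_in_row = 0
        else:
            self.value += self.step
            self._correct_in_row = 0
        return self.value


# At equilibrium a downward step (probability p * p) is as likely as an
# upward step, so p * p = 0.5 and p is roughly 0.707.
EQUILIBRIUM_P_CORRECT = math.sqrt(0.5)

if __name__ == "__main__":
    track = TwoDownOneUp(start=4.0, step=0.5)
    for response in (True, True, False, True, False):
        print(track.update(response))
```

In such a scheme the track values only place the trials near threshold; the threshold itself would still be obtained from the fitted psychometric function, as the text describes.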


2.5.2 Experiment I

In the context of profile analysis, the first interesting question was whether, and if so how well, changes in the relative amplitude of the components of large-bandwidth two-tone complexes could be perceived in the presence of a roving intensity level. If discrimination could occur on the basis of changes in the spectral shape, this would be the simplest form of profile analysis, according to the definition. The results may be used to test the multi-channel model as described in section 2.4.

Stimuli

Stimuli were complementary pairs of two-tone complexes, for which discrimination thresholds for changes in the amplitude, $\Delta L$, were measured. The centre frequency $f_c$ was kept fixed at 1 kHz. Stimulus components had a frequency ratio of either 3, 6, 12 or 24 ST. Four subjects participated in the experiment. At least 600 trials per subject and per condition were taken.

In a control condition, thresholds were measured for discriminating two-tone complexes, using either a change in level or a change in shape. With a change in level, one sound had components $(f_1, A)$, $(f_2, A)$; the other one $(f_1, A + \Delta A)$, $(f_2, A + \Delta A)$. With a change in shape, the complementary pair of tones was used. Frequency separation was either 1 ST or 24 ST, with a centre frequency of 1 kHz. The range over which the level was roved was either 0 dB (no roving level) or 20 dB. One subject participated. Per condition, 600 trials were taken.

Results & Discussion

Thresholds $\Delta L$ (dB) for the individual subjects are plotted as open symbols in Figure 2.6 as a function of the frequency ratio (ST). The solid curve represents the global behaviour of the threshold obtained with the (averaged) data of Experiment 2 of Versfeld & Houtsma (1991), in which no roving intensity level was used. Plotted as filled symbols are the results of the control experiment of Versfeld & Houtsma (1991), where a 20-dB roving intensity level was employed. Each different symbol represents a different subject. The two subjects who participated in the experiments of Versfeld & Houtsma (1991) also participated in the present experiment. They are indicated by similar-shaped symbols. For frequency ratios of 5 to 6 ST, Figure 2.6 shows first of all a very close agreement between the present data and the results of the control experiment of Versfeld & Houtsma (1991). Thresholds for a ratio of 11 ST (filled symbols) are higher than those for a ratio of 12 ST. The reason for this is not clear, but might lie in the fact that the experimental paradigms differed (two- vs. three-interval forced-choice). Especially with large frequency separations, labeling is difficult, and therefore a three-interval paradigm can have a major advantage over a two-interval paradigm, because no labeling is needed. Thresholds obtained with a roving level are 0.1 to 1 dB higher than those obtained without a roving level. This in itself is not strange, because the introduction of stimulus uncertainty often causes threshold to rise. Few data are available that consider the effect of a roving level. With multi-tone complexes, Spiegel, Picardi & Green (1981) and Green et al. (1984) found that thresholds were influenced only slightly. On the other hand, in experiments with two-tone complexes similar to ours, Ito (1990) found a very large effect (up to 10 dB) if a roving level of 40 dB was applied.

Figure 2.6: Results of Experiment I. Thresholds $\Delta L$ (dB) for two-tone complexes are plotted with open symbols as a function of the frequency ratio (ST). Different symbols represent the individual subjects. The solid curve represents results of Versfeld & Houtsma (1991). Filled symbols represent the results of Versfeld & Houtsma's control experiment. [Plot not reproduced: ordinate $\Delta L$ (dB) from 0.5 to 3.0, abscissa bandwidth (ST) from 0 to 25.]

To examine what the threshold would be if no spectral-shape cues could be used, the control experiment was performed. The results are given in Table I. The threshold for a change in level in the presence of a 20-dB roving level was 8.81 dB (at 1 ST) and 8.75 dB (at 24 ST). To check whether this agrees with statistical expectation, the threshold for a change in level with a 20-dB roving level was determined by computer simulation and found to be 7.22 dB,¹ so the outcome of the simulation is in good agreement with the data.

¹ This value depends not only on the degree to which the level is roved, but also on the experimental paradigm (cf. Green, 1988).

As stated before, the roving level was adopted to force subjects to listen solely to changes in the spectral shape. Suppose now that the subject's strategy is to focus on amplitude changes in one component only (or, in other words, to listen to a change in loudness), simply because he or she is not able to analyse the spectral profile. The results of the control experiment indicate that, without listening to a change in spectral shape, one is able to obtain a threshold of about 8 dB at best. This level is referred to as the "ceiling" (cf. Green, 1988). If thresholds are obtained that are close to the ceiling, or even higher, one has to suspect that subjects did use loudness cues instead of spectral-shape cues. With the present results, thresholds are always smaller than 2.4 dB, so the first important finding is that with large-bandwidth stimuli, discrimination can take place on the basis of profile analysis, which is in agreement with the findings of Ito (1990). The results show that threshold increases as bandwidth increases (roughly 0.58 dB per octave), indicating that relative changes in the amplitude are less well perceived as the components become more separated. The rise in threshold is in agreement with the predictions of the optimum-processing model of Durlach et al. (1986) and Ito (1990). The model predicts that threshold increases as interchannel correlation becomes less, as depicted in Figure 2.5. Also in agreement with the model is the slightly higher threshold for the 1-ST separation (8.81 and 1.98 dB), in comparison with that for a 24-ST separation (8.75 and 1.83 dB). Thresholds for changes in shape do increase when the level is roved, however, which implies $\sigma_1 \neq \sigma_2$, since Equation (2.42) shows that roving has no effect for equal variances. It is possible to find values for $\rho$, $\sigma_1$, $\sigma_2$ and $\sigma_R$ by insertion of the thresholds of Table I into Equations (2.39) and (2.46), for a frequency separation of either 1 ST or 24 ST. Table II gives the results of these calculations. Surprising, at first sight, is the high interchannel correlation for both separations (0.991 and 0.959). The two numbers, however, differ significantly, since thresholds are not proportional to $\rho$, but rather to $\sqrt{1 - \rho^2}$. $\sigma_R$ is about the same for the two conditions (8.6 dB), which was to be expected, since in both conditions a roving level of 20 dB was present. The main difference between the 1-ST and the 24-ST condition is the value of $\gamma$: for the 1-ST condition, $\sigma_1 \approx \sigma_2$, whereas in the 24-ST condition, $\sigma_2$ is quite large (7.9 dB). In the case of optimum processing, this would imply that the single-tone threshold for one component would be twice as high as that for the other. The actual frequency values are $f_1 = 500$ Hz and $f_2 = 2000$ Hz. Although intensity difference thresholds depend on frequency (Jesteadt, Wier & Green, 1977), this large value of $\gamma$ cannot really be accounted for by this dependence. The results rather have to be accounted for by non-optimum processing, in which one component (the one with the lower $\sigma$) contributes more than the other.

Table I: Thresholds $\Delta L$ (dB) for the control experiment (RL = roving level).

| $\Delta f$ (ST) | Shape, with RL | Shape, without RL | Level, with RL | Level, without RL |
|---|---|---|---|---|
| 1 | 0.29 | 0.20 | 8.81 | 1.98 |
| 24 | 1.91 | 0.72 | 8.75 | 1.83 |

Table II: Values for parameters of the multi-channel model.

| $\Delta f$ (ST) | $\rho$ | $\sigma_1$ (dB) | $\sigma_2$ (dB) | $\gamma$ | $\sigma_R$ (dB) |
|---|---|---|---|---|---|
| 1 | 0.991 | 2.749 | 3.195 | 1.162 | 8.589 |
| 24 | 0.959 | 3.663 | 7.876 | 2.150 | 8.556 |
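The relation between the two tables can be checked numerically. The sketch below assumes the two-channel covariance matrix with a common roving term $\sigma_R^2$ added to every element, and the threshold expressions $\Delta L^2 = \det A / (A_{11} + A_{22} \pm 2A_{12})$, with the plus sign for a change in shape and the minus sign for a change in level. Parameter values are those of Table II; the function and variable names are illustrative.

```python
import math


def _covariance(s1, s2, rho, s_r):
    """Elements of the 2x2 covariance matrix with roving-level term s_r^2."""
    a11 = s1 ** 2 + s_r ** 2
    a22 = s2 ** 2 + s_r ** 2
    a12 = rho * s1 * s2 + s_r ** 2
    return a11, a22, a12


def shape_threshold(s1, s2, rho, s_r=0.0):
    """Threshold (dB) for a change in spectral shape, at d' = 1."""
    a11, a22, a12 = _covariance(s1, s2, rho, s_r)
    return math.sqrt((a11 * a22 - a12 ** 2) / (a11 + a22 + 2.0 * a12))


def level_threshold(s1, s2, rho, s_r=0.0):
    """Threshold (dB) for a change in overall level, at d' = 1."""
    a11, a22, a12 = _covariance(s1, s2, rho, s_r)
    return math.sqrt((a11 * a22 - a12 ** 2) / (a11 + a22 - 2.0 * a12))


# Table II parameters: separation (ST) -> (rho, sigma_1, sigma_2, sigma_R)
TABLE_II = {
    1: (0.991, 2.749, 3.195, 8.589),
    24: (0.959, 3.663, 7.876, 8.556),
}

if __name__ == "__main__":
    for separation, (rho, s1, s2, s_r) in TABLE_II.items():
        print(
            separation,
            round(shape_threshold(s1, s2, rho, s_r), 2),
            round(shape_threshold(s1, s2, rho), 2),
            round(level_threshold(s1, s2, rho, s_r), 2),
            round(level_threshold(s1, s2, rho), 2),
        )
```

Under these assumptions the computed thresholds agree closely with the measured values of Table I; small deviations are to be expected because the parameters of Table II are rounded to three decimals.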

2.5.3 Experiment II

Using two-tone complexes, Versfeld & Houtsma (1991) found a minimum shape threshold of about 0.25 dB (at $d' = 1$) for a frequency ratio of 1 ST and a centre frequency of 1 kHz, which indicates that the auditory system is extremely good at perceiving this type of change. Figure 2.6 shows that the introduction of a roving intensity level raised this threshold only slightly, to 0.36 dB. To investigate to what extent these low thresholds depend on centre frequency, measurements for changes in the relative amplitude of the two tones in two-tone complexes were extended by variation of the centre frequency. Also, since all stimuli in the present experiment are situated within one critical band, the data of this experiment can be used to examine to what extent narrowband models such as the EWAIF-model can be used.

Stimuli

Stimuli were two-tone complexes with, in one condition, a fixed bandwidth of 58 Hz, and in another condition a relative bandwidth of 1 ST. The centre frequency $f_c$ had values of 125, 250, 500, 1000, 2000 and 4000 Hz. Note that a fixed bandwidth of 58 Hz gives rise to a decreasing frequency ratio if the centre frequency is increased. Only at $f_c = 1$ kHz does a frequency ratio of 1 ST correspond to a bandwidth of 58 Hz. The overall level was roved between 30 and 50 dB SL. Four subjects participated in the experiment. At least 600 trials were taken for each subject at each condition.
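The correspondence between a bandwidth in Hz and a frequency ratio in semitones can be made concrete with a small sketch. It assumes that the centre frequency is the geometric mean of the two components, which is consistent with the 500-Hz and 2000-Hz components quoted for the 24-ST complex at 1 kHz; the defining equation itself is not repeated in this section, and the function names are illustrative.

```python
import math


def components_from_ratio(fc, semitones):
    """Component frequencies for a given frequency ratio in semitones.

    Assumes fc is the geometric mean of f1 and f2.
    """
    half = 2.0 ** (semitones / 24.0)
    return fc / half, fc * half


def components_from_bandwidth(fc, bandwidth):
    """Component frequencies for a given bandwidth f2 - f1 in Hz.

    Solves f2 - f1 = bandwidth together with f1 * f2 = fc ** 2.
    """
    half_bw = bandwidth / 2.0
    root = math.sqrt(half_bw ** 2 + fc ** 2)
    return root - half_bw, root + half_bw


def ratio_in_semitones(f1, f2):
    """Frequency ratio expressed in semitones (12 per octave)."""
    return 12.0 * math.log2(f2 / f1)


if __name__ == "__main__":
    for fc in (125, 250, 500, 1000, 2000, 4000):
        f1, f2 = components_from_ratio(fc, 1.0)
        g1, g2 = components_from_bandwidth(fc, 58.0)
        print(fc, round(f2 - f1, 1), round(ratio_in_semitones(g1, g2), 2))
```

For each centre frequency the loop prints the bandwidth of the 1-ST complex and the frequency ratio of the 58-Hz complex; the two conditions coincide only at 1 kHz.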

Results & Discussion

Figure 2.7 displays threshold values of $\Delta L$ (corresponding to $d' = 1$) for two-tone complexes with a bandwidth of 58 Hz as a function of the centre frequency $f_c$. Each different symbol represents an individual subject. For convenience, the corresponding frequency ratio (in ST) is given at the top of the figure. Starting from the low centre frequencies, threshold decreases from about 1.5 dB to a minimum of 0.4 dB at 1 kHz, whereafter it rapidly increases. Thus, despite the roving level, subjects are quite good at discriminating spectral changes, as was expected. A threshold of 0.4 dB at $f_c = 1$ kHz corresponds well to the results in Table 1 of Versfeld & Houtsma (1991). At $f_c = 4$ kHz, individual thresholds diverge, ranging from 1 dB to about 6 dB. A first conclusion from these results is that threshold depends greatly on centre frequency if the bandwidth is kept fixed at 58 Hz.

Figure 2.7: Results of Experiment II. Thresholds $\Delta L$ (dB) for two-tone complexes with a bandwidth of 58 Hz are plotted as a function of the centre frequency. Different symbols represent the individual subjects. [Plot not reproduced: ordinate $\Delta L$ (dB) from 0 to 9, abscissa $f_c$ (Hz) from 100 to 10000 on a logarithmic scale; the top axis gives the bandwidth in ST (8, 4, 2, 1, ½, ¼).]

Figure 2.8 displays threshold values of $\Delta L$ (corresponding to $d' = 1$) for two-tone complexes with a frequency ratio of 1 ST as a function of the centre frequency $f_c$. Again, each different symbol represents an individual subject. The top of the figure shows the corresponding bandwidth in Hz. Thresholds as a function of centre frequency remain fairly invariant at 0.4 dB, except perhaps for the extreme centre frequencies of 125 Hz and 4 kHz, where they are slightly elevated. At $f_c = 1$ kHz, a bandwidth of 58 Hz corresponds to a frequency ratio of 1 ST, and indeed thresholds for both conditions at that centre frequency are virtually identical. In Figure 3 of Versfeld & Houtsma (1991) the threshold for two-tone complexes as a function of the frequency ratio at a centre frequency of 1 kHz displays a minimum at 1 ST. This result, in combination with the results of the previous and the present experiment, suggests a global invariance of a U-shaped curve (such as the one in Figure 3 of Versfeld & Houtsma, 1991) under transformation of centre frequency. This behaviour of threshold was also found by Feth & O'Malley (1977). The invariance suggests that the frequency resolving power of the ear might play an important role in the discrimination process, indicating that the same threshold is obtained if the two components are separated by an equal amount in auditory bandwidth. This idea is supported by the slight rise in threshold with low centre frequencies in Figure 2.8. The rise in threshold for $f_c = 4$ kHz cannot be accounted for by this idea, but, since the difference between the two signals of a complementary pair lies only in the fine structure, the rise might be due to the poorer phase-locking capacity at higher frequencies.

Figure 2.8: Results of Experiment II. Thresholds $\Delta L$ (dB) for two-tone complexes with a frequency ratio of 1 ST are plotted as a function of the centre frequency. Different symbols represent the individual subjects. [Plot not reproduced: ordinate $\Delta L$ (dB) from 0 to 3, abscissa $f_c$ (Hz) from 100 to 10000 on a logarithmic scale; the top axis gives the bandwidth in Hz (7, 14, 29, 58, 116, 231).]

The present data allow a test of the EWAIF-model and its variants, as described in section 2.3. If differences in EWAIF are the perceptual cue, then threshold at a given centre frequency is expected to correspond to a fixed value of ΔEWAIF. Since it is related to the frequency-discriminating power of the auditory system (Feth et al., 1982), ΔEWAIF at threshold may vary as a function of the centre frequency. But, at threshold and at a given centre frequency, the difference ΔEWAIF should have the same value under the 1-ST condition as under the 58-Hz condition. Thus, the model predicts that the ratio ΔEWAIF(58 Hz)/ΔEWAIF(1 ST) should be unity. Thresholds obtained in this experiment were inserted in Equation (2.22) and ratios were plotted in Figure 2.9 as a function of centre frequency. The results show that for centre frequencies below 1 kHz, the ratio is not close to unity. For a centre frequency of 1 kHz, the ratio is very close to unity, which is due to the identical stimuli at this special centre frequency. With higher centre frequencies, the ratio stays close to one. According to conventional estimates (Scharf, 1970), all signals lie within one critical band.¹ The general claim, however, that EWAIF models the data correctly for unresolved signals has to be narrowed to one that states that the EWAIF-model accounts only for discrimination data for signals with a centre frequency above 1 kHz, or with a frequency separation smaller than 1 ST. Insertion of the data into the AWF-, IWF-, or IWAIF-model yields plots that resemble Figure 2.9. Hence, the conclusions for these models are the same as for the EWAIF-model.

¹ For centre frequencies of 250 Hz and below, the signal's bandwidth does exceed the equivalent rectangular bandwidth (Moore, Peters & Glasberg, 1990).


Figure 2.9: The ratio ΔEWAIF(58 Hz)/ΔEWAIF(1 ST) is plotted as a function of the centre frequency. Data points of single subjects are connected by a solid line. [Plot not reproduced: ordinate ΔEWAIF(58 Hz)/ΔEWAIF(1 ST) from 0.2 to 10 on a logarithmic scale, abscissa $f_c$ (Hz) from 100 to 10000 on a logarithmic scale.]

2.5.4 Experiment III

In the previous experiments with two-tone complexes, a rather strange phenomenon was observed when subjects were asked to describe the perceptual cue they used. Though the frequencies of the components did not change, and the only physical difference between the pair of complementary two-tone complexes was a change in the relative amplitude, all subjects reported either an upward or a downward jump in pitch, on the basis of which they made a discrimination. Musically trained listeners were asked to identify the musical interval associated with this pitch jump. Their reported interval corresponded to the physical frequency ratio of the stimulus, even if the components were separated by 24 ST. Though no attempt was made to investigate the perceptual attributes of the stimuli more thoroughly, these reports provide some information about where to start in modelling the data. Obviously, a change in pitch is the perceptual cue, and Feth (1974) already argued that thresholds of differences in amplitude of a complementary pair of two-tone complexes can be mapped with the EWAIF-model into pure-tone frequency JNDs.

The measurement of single-tone frequency JNDs has received considerable attention in the literature. Unfortunately, results seem to depend to some extent on the experimental paradigm (Jesteadt & Sims, 1975), in particular the underlying sensitivity parameter $d'$. The paradigm used in the present experiments is not widely used in psychophysics, and therefore systematic differences may have occurred. Furthermore, it is known that pitch is related to intensity (Henning, 1966; Verschuure & van Meeteren, 1975; Jesteadt & Neff, 1982), so that discrepancies with results in the literature are to be expected if the experiment is done in the presence of a roving intensity level. Roving-level frequency JNDs have been measured by Emmerich, Ellermeier & Butensky (1989) and Moore & Glasberg (1989). They found that frequency-discrimination performance is impaired only slightly in a roving-level condition. Finally, between-subject differences seem to be large (e.g., in comparison with intensity-discrimination experiments). Therefore, it was decided to measure thresholds for changes in the frequency of pure tones as a function of centre frequency.¹ In order to stick as closely as possible to the experimental paradigm in this chapter, a roving intensity level of 20 dB was applied, even though pitch changes slightly as a function of intensity (e.g., Jesteadt & Neff, 1982), which might impair performance. If, however, changes in the spectral shape are transformed into pitch changes which, in turn, are related to the perception of changes in the frequency of a pure tone, then both thresholds should be affected equally by the introduction of a roving level, so the use of a roving level is appropriate. These measured frequency difference thresholds can then be used to test the EWAIF-model and its variants.

¹ Of course, it is a bit odd to talk about the centre frequency of a pure tone, but it is done merely to be consistent with the other sections.

Stimuli

Stimuli were pure tones with a centre frequency of either 125, 250, 500, 1000, 2000, or 4000 Hz. Thresholds for a change in frequency were measured. Four subjects participated. They also participated in Experiment II. At least 600 trials per subject and per condition were taken. Thresholds were obtained as described in section 2.5.1, but now with the assumption that $d' \propto \Delta F$ (Jesteadt & Sims, 1975).

Results & Discussion

Figure 2.10 displays threshold values of $\Delta F$ for frequency changes in pure tones as a function of centre frequency $f_c$. Different symbols represent the individual subjects. Figure 2.10 shows that, on average, thresholds remain fairly constant up to about 500 Hz, whereafter they start to increase. For a centre frequency of 1 kHz and beyond, $\Delta F$ is proportional to the centre frequency, i.e., $\Delta F/f_c$ is constant. This behaviour, which is related to the critical bandwidth, is in very close agreement with other results found in the literature. For comparison, the results of Wier, Jesteadt & Green (1977) at a sensation level of 40 dB (without a roving level), and the results of Moore & Glasberg (1989) at 70 dB SPL (with a 6-dB roving level) are plotted as a dashed and dotted curve, respectively. Apparently, the roving level does not impair thresholds very much, which is in agreement with results of Henning (1966), Emmerich et al. (1989) and Moore, Oldfield & Dooley (1989). They found only a small influence of a roving intensity level on threshold, except perhaps for frequencies of 4 kHz and beyond.

With the aid of the EWAIF-model and its variants it is possible to relate the present results to those obtained in Experiment II. Equations (2.16), (2.19), (2.22), and (2.25) do not contain free variables, and in­sertion of the (individual) threshold values of 81 of Experiment II will yield values for &.WF, !::.J.WF, LlEWAIF and !::.J.WAIF, respectively. Next, these values can be compared with the present single-tone JNDs, hence providing a test for the models. Figures 2.11 to 2.13 display the ratio

Figure 2.10: Results of Experiment III. Thresholds ΔF (Hz) for frequency changes in pure tones are plotted as a function of the centre frequency f_c (Hz). Different symbols represent the individual subjects. The dashed and dotted lines are results of Wier et al. (1977) and Moore et al. (1989), respectively.

Figure 2.11: The ratio Δ_AWF/ΔF is plotted as a function of the centre frequency f_c (Hz). Different symbols denote the individual subjects.

Δ_AWF/ΔF, Δ_IWAIF/ΔF (or Δ_IWF/ΔF) and Δ_EWAIF/ΔF, respectively, as a function of the centre frequency for each individual subject. Data points of a single subject are connected by a solid line. Open symbols represent thresholds obtained with two-tone complexes with a bandwidth of 58 Hz. Filled symbols represent thresholds obtained with two-tone complexes with a frequency ratio of 1 ST. If spectral-shape change were transformed into an equivalent frequency change, according to any of these models, the results should lie on the dotted line. None of the models fulfills this condition, though the filled symbols (1-ST frequency ratio) seem to lie parallel to the dotted line. In general, all models display similar trends, which was to be expected from the computations shown in Figure 2.4. The only difference is a shift along the ordinate. For centre frequencies below 1 kHz, the ratios for the 58-Hz condition (open symbols) are systematically larger than for the 1-ST condition (filled symbols). With centre frequencies of 1 kHz and beyond, data points of the 1-ST condition and the 58-Hz condition overlap. As a result, the ratio Δ_EWAIF(58 Hz)/Δ_EWAIF(1 ST) in Figure 2.9 decreases to unity as the centre frequency increases. For the lowest centre frequencies, Δ_EWAIF seems to fit the data for the 1-ST condition quite nicely. For higher centre frequencies, Δ_IWAIF (Δ_IWF)

Figure 2.12: The ratio Δ_IWAIF/ΔF (or Δ_IWF/ΔF) is plotted as a function of the centre frequency f_c (Hz). Data points of single subjects are connected by a solid line.

fits the data best. This result suggests that with slow frequency fluctuations, as is the case with a frequency ratio of 1 ST at f_c = 125 Hz, a temporal model such as the EWAIF-model can account for the data, but only if the signal is completely unresolved. At high centre frequencies, on the other hand, temporal cues are too weak, leaving only spectral properties of the signal to be used; hence the IWF-model provides a better account of the data.
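The contrast between envelope weighting and intensity weighting can be illustrated numerically. The following is a minimal sketch, not the thesis's Equations (2.16) to (2.25), which are not reproduced here. It assumes the usual definitions in which the instantaneous frequency of a two-tone complex is averaged with the envelope (EWAIF) or the squared envelope (IWAIF) as weight. The function name, the sample count and the example amplitudes are illustrative choices, and the two amplitudes are assumed to be unequal so that the envelope never reaches zero.

```python
import cmath
import math

def weighted_average_frequency(f1, a1, f2, a2, power, n=4096):
    """Average instantaneous frequency of a two-tone complex, weighted by
    envelope**power (power=1: EWAIF-style, power=2: IWAIF-style).
    Assumes a1 != a2, so the envelope is never zero."""
    beat_period = 1.0 / abs(f2 - f1)      # the envelope repeats once per beat
    num = 0.0
    den = 0.0
    for i in range(n):                    # uniform samples over one beat period
        t = beat_period * i / n
        e1 = a1 * cmath.exp(2j * math.pi * f1 * t)
        e2 = a2 * cmath.exp(2j * math.pi * f2 * t)
        z = e1 + e2                       # analytic signal of the complex
        env = abs(z)                      # envelope e(t)
        # instantaneous frequency = (1 / 2 pi) * d(phase)/dt of z
        inst_f = ((f1 * e1 + f2 * e2) * z.conjugate()).real / env ** 2
        num += env ** power * inst_f
        den += env ** power
    return num / den

# Hypothetical example: 58-Hz wide complex around 1 kHz, lower tone stronger.
f1, f2 = 1000.0, 1058.0
ewaif = weighted_average_frequency(f1, 1.0, f2, 0.5, power=1)
iwaif = weighted_average_frequency(f1, 1.0, f2, 0.5, power=2)
print(round(ewaif, 2), round(iwaif, 2))
```

With intensity weighting the result reduces to the intensity-weighted mean of the two component frequencies, which is a purely spectral quantity. The envelope-weighted value depends on the temporal waveform and lies closer to the stronger component. Swapping the two amplitudes gives the other member of a complementary pair, and the difference between the two averages is the kind of equivalent frequency change that is compared with ΔF in the text.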

2.5.5 Experiment IV

An important question is to what extent profile analysis is similar to the detection of a change in the amplitude of a pure tone. Therefore, an experiment was performed where thresholds for changes in the amplitude of two-tone complexes were measured. In the spectral-shape condition, the signal was a complementary pair of two-tone complexes, so one sound had components (f_1, A), (f_2, A + ΔA); the other sound

Figure 2.13: The ratio Δ_EWAIF/ΔF is plotted as a function of the centre frequency f_c (Hz). Data points of single subjects are connected by a solid line.

(f_1, A + ΔA), (f_2, A). In the intensity condition, the amplitude of both components changed in the same direction, so one sound had components (f_1, A), (f_2, A); the other sound (f_1, A + ΔA), (f_2, A + ΔA). Threshold values of ΔL were measured as a function of sensation level. It is known that with pure-tone intensity discrimination, Weber's Law does not hold. Many experiments (e.g., Jesteadt et al., 1977) have shown that, at higher intensities, sensitivity is better than is expected from Weber's Law. This so-called "near-miss" to Weber's Law is usually described by a power function, which has been formulated in several different ways in the literature. Here, the formulation of Green (1988) is used, yielding

\[ \frac{\Delta A}{A} = k \left( \frac{A}{A_0} \right)^{p} \tag{2.48} \]

The coefficient k is a proportionality constant, depending, amongst other things, on the subject's absolute sensitivity to changes in the amplitude and on the experimental paradigm. ΔA is the empirically obtained threshold and A_0 is the amplitude for which the signal is barely audible. The exponent p determines the "near-miss". If p = 0, the "near-miss" is absent and Weber's Law holds. Since sensitivity increases with increasing level, p < 0. Rewriting Equation (2.48) in terms of just-noticeable level differences, ΔL, yields

\[ \Delta L = 20 \log_{10} \left( 1 + k \cdot 10^{pL/20} \right) \tag{2.49} \]

where

\[ L = 20 \log_{10} \left( \frac{A}{A_0} \right) \tag{2.50} \]

is the sensation level (in dB). The values for k and p will be determined by a fit of the data to Equation (2.49). It is expected that with intensity difference thresholds, p will have a value similar to that reported in the literature (p = -0.18 ± 0.05; Green, 1988). The value for k is not important, since it is only a proportionality constant.
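The fit of k and p can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting procedure actually used. It assumes that Equation (2.48) has the power-law form ΔA/A = k (A/A_0)^p implied by the description of its symbols, and that ΔL = 20 log10(1 + ΔA/A). Under those assumptions log10(ΔA/A) is linear in L, so an ordinary straight-line fit recovers both parameters. The function names are hypothetical, and the example thresholds are synthetic values generated from the averaged intensity parameters of Table III rather than measured data.

```python
import math

def delta_L(k, p, L):
    """Just-noticeable level difference (dB) at sensation level L (dB SL),
    assuming dA/A = k * (A/A0)**p and delta_L = 20*log10(1 + dA/A)."""
    return 20.0 * math.log10(1.0 + k * 10.0 ** (p * L / 20.0))

def fit_k_p(levels, thresholds):
    """Straight-line least-squares fit of log10(dA/A) against L.
    Returns (k, p). Note: this minimises the error in log10(dA/A),
    which is not necessarily the criterion used in the thesis."""
    y = [math.log10(10.0 ** (t / 20.0) - 1.0) for t in thresholds]
    n = len(levels)
    mean_x = sum(levels) / n
    mean_y = sum(y) / n
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(levels, y))
    sxx = sum((x - mean_x) ** 2 for x in levels)
    slope = sxy / sxx                     # equals p / 20
    intercept = mean_y - slope * mean_x   # equals log10(k)
    return 10.0 ** intercept, 20.0 * slope

levels = [20.0, 40.0, 60.0, 80.0]         # dB SL, as in Experiment IV
synthetic = [delta_L(0.652, -0.130, L) for L in levels]
k_fit, p_fit = fit_k_p(levels, synthetic)
print(round(k_fit, 3), round(p_fit, 3))
```

With p = 0 the function `delta_L` returns the same value at every level, which is Weber's Law. A negative p makes the predicted threshold fall as the level rises, which is the "near-miss".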

If changes in the spectral shape are transformed by the auditory system in some way to changes in loudness, then the dependence of threshold on sensation level should be similar to that for intensity changes.

Stimuli

Stimuli were two-tone complexes with f_c = 1 kHz and a frequency ratio of 1 ST. In one condition (spectral-shape condition), threshold values of ΔL were measured for opposite changes in the amplitude of the components, causing a change in the spectral shape. In another condition (intensity condition), both amplitudes changed in the same direction, causing a change in overall intensity. The stimulus level L was either 20, 40, 60, or 80 dB SL. Thus, in this experiment, no roving level was applied. Four subjects participated, and at least 600 trials per subject and per condition were taken.

Results & Discussion

In Figure 2.14 threshold values of ΔL for two-tone complexes, comprising a change in spectral shape, are plotted as a function of the overall level L (dB SL) with open symbols. Thresholds obtained with

Figure 2.14: Results of Experiment IV. Threshold values of ΔL (dB) for a change in spectral shape (open symbols) or a change in level (filled symbols) are plotted as a function of the overall level L (dB SL). Different symbols represent the individual subjects. The dashed curves are fits to the power law of Equation (2.49).

two-tone complexes, comprising a change in level, are plotted with filled symbols. Thresholds for the shape condition are smaller than those for the intensity condition. Furthermore, the former hardly seem to depend on the absolute level L. Thresholds for pure-intensity changes are somewhat larger than reported in the literature (up to 4.5 dB at 20 dB SL, cf. Jesteadt et al., 1977), which may be due to the envelope fluctuations in the two-tone complexes, which are not present with pure tones. Nevertheless, the data display the "near-miss" to Weber's Law. A least-squares fit of Equation (2.49) was made to the individual data, to obtain values for k and p. Table III displays the values k and p

            Intensity            Spectral shape
Subject     k        p           k        p
S1          0.870    -0.127      0.156     0.035
S2          0.471    -0.138      0.058    -0.024
S3          0.889    -0.138      0.060     0.091
S4          0.432    -0.119      0.030     0.033
Average     0.652    -0.130      0.074     0.041

Table III: Parameter values of p and k, obtained by a least-squares fit to the data of Experiment IV.

for the individual subjects, as well as for the averaged results. The average value for p of -0.13 obtained in this experiment corresponds well to the data obtained in the literature (Laming, 1986; Green, 1988).¹ There are considerable differences between subjects, as can be seen from Figure 2.14. This is reflected in the parameter k. The "near-miss" parameter p, on the other hand, is virtually identical for all subjects. The result that Weber's Law holds in the spectral-shape condition is in agreement with Mason, Kidd Jr., Hanna & Green (1984) and Green & Mason (1985). A major difference between the present experiment and those of Mason et al. (1984) and Green & Mason (1985) is that their experiments were done with broadband stimuli, for which it is known that Weber's Law holds. The present experiment deals with narrowband stimuli, yet Weber's Law still holds. The behaviour can be accounted for by an excitation-pattern model (Florentine & Buus, 1981). As the level of a frequency component increases, its excitation

¹ Since Equation (2.48) implicitly assumes d' ∝ ΔA, p depends upon the definition of threshold. Performing a fit on the raw data, i.e., with a threshold that is defined as d' = 2.134 (Appendix B), yields an average value for p of -0.15.


pattern broadens. Therefore, at higher levels more bands are available, yielding a better estimate of the intensity, and ultimately resulting in the "near-miss". With broadband stimuli (e.g., white noise), the excitation pattern already is broadband, and increasing the overall intensity does not cause more bands to be involved, hence no "near-miss" is observed. This might also be the case with the stimuli of Mason et al. (1984) and Green & Mason (1985). In our spectral-shape condition, the amplitudes of both components change, but the overall level remains the same. Thus, the global excitation pattern will not change very much, and no "near-miss" is observed.

2.6 General Discussion

Results of Experiment I show that the auditory system can monitor accurately the difference between a pair of complementary two-tone complexes, despite the presence of a roving intensity level. This means that profile analysis must take place, i.e., comparison of amplitudes from different parts of the spectrum is done in a relative fashion. This conclusion is in agreement with the results of Ito (1990), who measured thresholds for similar two-tone complexes.

The shape of the function relating threshold to frequency separation (also measured by Feth & O'Malley, 1977 and Versfeld & Houtsma, 1991) strongly suggests that two discrimination processes are involved. With narrowband (unresolved) two-tone complexes, the signal is processed as a whole, and it is likely that temporal characteristics of the signal are used. Broadband signals are resolved by the auditory periphery, after which across-band comparison takes place.

Results of Experiments II and III showed that the EWAIF-model can account for the thresholds only if components are very close in frequency (less than 1 ST). Then the model output yields values that correspond to the single-tone JNDs at that centre frequency. At present, the application of the EWAIF-model is very limited, in the sense that the signal's bandwidth has to be very small. An additional restriction is that the signal's envelope for both members of a pair of complex sounds has to be the same. This condition is necessary to prevent listeners from using additional cues, such as changes in roughness (cf. Berg, 1992).

The multi-channel model of Durlach et al. (1986) seems to be able to account for the data in Experiment I. In contrast to the results of Ito (1990), the data show an increase in threshold with increasing frequency separation, which can be partly accounted for by a decrease of the interchannel correlation. The major factor causing this increase is assumed to be the growing inability of the auditory system to monitor both components simultaneously. That is, with large separations, the listener seems to focus on one component only, partly ignoring changes in the other component, resulting in a value for γ that deviates from unity. Indeed, if one listens to these sounds, one will notice that the two components are not perceived as one sound, but rather as two different sounds. The non-optimum multi-channel model has been applied only to data of one subject, at two frequency separations. If the two components in the signal are not resolved, only one channel is used, and the multi-channel model reduces to a trivial model that can only detect differences in overall level. Apparently, this is not yet true for separations of 1 ST, but eventually, with narrower bandwidths, other cues take over and are used to detect spectral-shape changes. Considering the results, it is very worthwhile to extend the experiments to further test the multi-channel model. One should do experiments with very narrow-bandwidth signals (in order to see where the model starts to fail), intermediate bandwidths (to see if there is an effect of the critical band) and very large bandwidths (to see if the correlation keeps decreasing).

The transition from within-channel analysis to across-channel comparison takes place near a bandwidth of 1 ST, which is about one-third to one-fourth of the critical bandwidth. With these frequency separations, Plomp (1964) found that subjects were able to just distinguish the individual pitches of the components. With narrower bandwidths, the two components are inseparable, and the signal is processed as a whole. With larger bandwidths, the two components can be separated, and the signal is processed by analysis of the individual components.

The "near-miss" to Weber's Law was not observed if a change in spectral shape occurred. If discrimination were governed solely by changes in the excitation pattern, discrimination would be due to a shift in excitation pattern over the basilar membrane, rather than a growth of the excitation pattern. In that sense, changes in the amplitude should be related to single-tone frequency JNDs, just as was expected with the EWAIF-model. However, JNDs for frequency shifts also decrease as the intensity increases (Wier et al., 1977), and hence cannot account for the data. Apart from the excitation pattern, temporal coding may also provide cues, but it is difficult to give predictions for threshold behaviour.

An interesting question is to what extent the obtained thresholds (especially for narrowband signals) are a consequence of processing at the auditory periphery. To examine this, one should perform an experiment where the component with frequency f_1 is directed to one ear, whereas the other component, with frequency f_2, goes to the other ear. If behaviour remains unchanged, one can almost be sure that the processing takes place at a central level, and models that operate on the entire signal (such as the EWAIF-model) cannot account for the data. If, on the other hand, threshold increases dramatically (which seems to be the case for some preliminary results), the signal presumably needs to be processed as a whole. Then, models that operate on the signal as a whole have validity.

In conclusion, thresholds obtained with broadband (i.e., 1 ST and beyond) two-tone complexes can be accounted for by the multi-channel model of Durlach et al. (1986). Very narrowband stimuli give thresholds that can be accounted for by the EWAIF-model.


Chapter 3

Discrimination of spectral changes in noise bands¹

Abstract

Adaptive, 3-interval, 3-alternative forced-choice discrimination experiments were performed for sign changes of the spectral slopes of noise bands. Thresholds were measured at several bandwidths and centre frequencies, and for several tokens. Experiments were performed while roving the overall intensity. At a fixed centre frequency of 1000 Hz, sensitivity is best for bandwidths of 3-6 semitones (ST). At larger bandwidths, threshold increases only slowly. If the bandwidth is kept constant at 1 ST, threshold is practically independent of centre frequency. Attempts have been made to explain the results in terms of the EWAIF model (for narrowband stimuli) and a multi-channel model (for broadband stimuli). The former model fails to describe the data. The latter model can only qualitatively account for the data. The present data show a remarkable similarity with results obtained with two-tone complexes under similar conditions, which suggests that mainly changes in two regions (the spectral edges of the noise band) are monitored by the auditory system. This idea is supported by several other results reported in the literature.

¹ Part of the results presented in this chapter appeared in: Versfeld, N.J. (1991). "Perception of spectral changes in noise bands of varying bandwidth", IPO Annual Progress Report 25, 23-31; and in: Versfeld, N.J. (1992). "On the perception of spectral changes in noise bands", in: The processing of speech, M.E.H. Schouten, ed. (Mouton-de Gruyter, Berlin).


3.1 Introduction

In profile-analysis experiments, stimuli usually comprise a finite number of components, often arranged such that each component falls into a separate critical band. This is done to avoid interaction between the components at a peripheral level, so that mechanisms based on across-channel comparison can be studied. Most profile-analysis experiments reported in the literature have dealt with the detectability of an increment in the amplitude of only the middle component of a multi-tone complex (Green, 1988). Some experiments have dealt with the discriminability of broadband changes (Bernstein & Green, 1987b; Green, Onsan & Forrest, 1987). A main finding of these experiments was that the discriminability of broadband changes was much poorer than would be expected from single-tone discrimination thresholds. Presumably, in the decision process, not all information is used with equal weight. Other profile-analysis experiments (e.g., Bernstein & Green, 1987a) showed furthermore that, due to masking, discrimination degrades as components fall into the same critical band, which makes it difficult to study the discriminability of spectral changes between signals with a dense spectrum.

Against this background, data obtained with broadband changes in a dense spectrum can be difficult to interpret. Nonetheless, the topic is relevant, since many everyday signals, such as speech and music, do have dense spectra and continuously undergo broadband variations. Noise-like spectra are especially interesting from this point of view, if one thinks of the confusion between fricatives and plosives. There are a number of papers dealing with identification of noise-like speech signals, such as consonants. Unfortunately, only very few papers deal with the detectability of broadband spectral changes in noise. Farrar, Reed, Ito, Durlach, Delhorne, Zurek & Braida (1987) studied the discrimination of noises with different speech-like spectra, embedded in a background of non-changing noise. They found that, even in the presence of a roving intensity level, different spectra could be discriminated. This in itself is not surprising, since in real life, one is able to understand speech, even in a noisy background, and under loudness changes. Moore, Oldfield & Dooley (1989) studied the detection of peaks or notches in an otherwise flat noise spectrum with a roving intensity level. They found that such changes could generally be discriminated, peaks better than notches, the latter suffering from masking effects. Both Farrar et al. (1987) and Moore et al. (1989) could explain their results in terms of the multi-channel model of Durlach, Braida & Ito (1986), though with a very low efficiency. Again, not all channel information seemed to be used with equal weight.

The present experiments can first of all be seen as an attempt to gain more insight into the discriminating power of the auditory system for spectral changes in noise-like signals, either broadband or narrowband. Three experiments are reported. The first one deals with the detection of a change in the spectral slope of noise bands as a function of the bandwidth. The second one investigates the influence of the noise sample on threshold. The last experiment examines the dependence of threshold on centre frequency, under the condition that the bandwidth is kept fixed at either 1 semitone or 58 Hz. The results are discussed in terms of within- and across-channel models, such as the EWAIF model of Feth (1974) or the multi-channel model of Durlach et al. (1986). Furthermore, an attempt is made to relate the discrimination of changes in the spectral shape of noise-like signals to that of two-tone and multi-tone complexes, as described in chapters 1 and 2.

3.2 Experiments

3.2.1 General description

The present experiment deals with the discrimination between two noise bands that differ from one another only with respect to the sign of the spectral slope. In an adaptive, three-interval, three-alternative forced-choice procedure, the subject's task was to discriminate between a noise band having a positive spectral slope and one having a negative spectral slope (see Figure 3.1). The magnitude of the slope was varied adaptively. The idea of using these sounds was derived from experiments with two-tone complexes in the previous chapters. The spectral shape was chosen such that changes in the shape were very simple and symmetrical, while the overall intensity remained constant.

Figure 3.1: Frequency spectrum of stimuli used in the experiment. (a) Noise band with a negative spectral slope. (b) Noise band with a positive spectral slope. The abscissa is frequency (log scale); the ordinate is level, marked at L and L + ΔL.

Stimuli

Sounds were noise bands, generated digitally by summation of sinusoids of the appropriate amplitude and starting phase, spaced 1 Hz apart. In each experimental condition (i.e., fixed bandwidth and centre frequency) the starting phase of each component was preserved from trial to trial, so that, in fact, the signal should not be considered as true noise in a statistical sense, but rather as a frozen noise sample, or simply as a signal with a very dense spectrum. The spectral slope was linear on a log-frequency, log-amplitude scale, and was expressed as the level difference ΔL between the two spectral edges (see Figure 3.1). The bandwidth was expressed as the ratio in semitones (ST) of the two edge frequencies. One stimulus consisted of three noise bursts, each with a duration of 400 ms (including a 20-ms onset and a 20-ms offset). Bursts were separated by 100-ms silent periods. Thus, one stimulus lasted 1400 ms.
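The stimulus construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the original stimulus-generation software. The function name, the seeded random phases, the placement of the first component exactly on the lower band edge, and the normalisation to unit total power are all illustrative choices. Only the component table is produced; the waveform is the sum of the sinusoids it lists.

```python
import math
import random

def noise_band_components(fc, bandwidth_st, delta_l_db, seed=0):
    """Return (frequencies, amplitudes, phases) for a sloped noise band.

    fc           : centre frequency (Hz), geometric mean of the band edges
    bandwidth_st : ratio of the edge frequencies, in semitones
    delta_l_db   : level difference (dB) between upper and lower edge;
                   positive gives a rising slope, negative a falling slope
    """
    f_lo = fc * 2.0 ** (-bandwidth_st / 24.0)
    f_hi = fc * 2.0 ** (bandwidth_st / 24.0)
    octaves = bandwidth_st / 12.0
    rng = random.Random(seed)             # fixed seed: "frozen" phases
    freqs, amps, phases = [], [], []
    f = f_lo
    while f <= f_hi:
        # level is linear in log-frequency: -dL/2 at f_lo, +dL/2 at f_hi
        level_db = delta_l_db * math.log2(f / fc) / octaves
        freqs.append(f)
        amps.append(10.0 ** (level_db / 20.0))
        phases.append(rng.uniform(0.0, 2.0 * math.pi))
        f += 1.0                          # components spaced 1 Hz apart
    # normalise to unit total power so the slope leaves the level unchanged
    norm = math.sqrt(sum(a * a for a in amps))
    amps = [a / norm for a in amps]
    return freqs, amps, phases

# Hypothetical example: 6-ST band at 1 kHz with slopes of +2 dB and -2 dB.
up = noise_band_components(1000.0, 6.0, +2.0)
down = noise_band_components(1000.0, 6.0, -2.0)
print(len(up[0]), round(up[0][0], 1), round(up[0][-1], 1))
```

Because the same seed is used for both calls, the two bands share their starting phases and differ only in the sign of the spectral slope, which mirrors the frozen-noise property described in the text.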


Procedure

Before each new experimental condition, the subject, who was seated in a sound-insulated booth and received the stimuli diotically through TDH-49P headphones, was asked to adjust the overall intensity of a noise band with the desired bandwidth and centre frequency, and with a flat spectrum (ΔL = 0), such that it was barely audible. During the experiment the overall intensity of each noise burst was varied randomly between 30 and 50 dB above this empirically established threshold. The roving intensity was applied to prevent subjects from using loudness cues, leaving changes in the spectral shape as the only possible discrimination cues.

For each stimulus, one noise burst differed in the sign of the spectral slope from the other two. In each trial, one of six possible sequences (++-, +-+, -++, --+, -+-, or +--) could occur. The subject's task was to indicate, by pushing the appropriate button on the response box, which interval contained the odd sound. There was no response-time limit, and visual feedback was provided immediately after each response. A run comprised 150 trials and lasted 7 to 10 minutes.
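The trial structure can be sketched as follows. This is a minimal illustration, assuming that the position of the odd burst and the sign of its slope are chosen with equal probability and that the roved level is drawn uniformly and independently for each burst; the text specifies only the range of the rove and the six possible sequences. The function and variable names are hypothetical.

```python
import random

def make_trial(delta_l_db, rng, rove_range=(30.0, 50.0)):
    """Build one 3-interval trial with a single odd burst.

    Returns (slopes, levels, odd_index): the signed slope (dB) and the
    roved overall level (dB SL) of each burst, and the odd burst's position.
    """
    odd_index = rng.randrange(3)          # which interval holds the odd burst
    odd_sign = rng.choice((+1, -1))       # sign of the odd burst's slope
    slopes = [(odd_sign if i == odd_index else -odd_sign) * delta_l_db
              for i in range(3)]
    # overall level drawn independently for every burst (roving level)
    levels = [rng.uniform(*rove_range) for _ in range(3)]
    return slopes, levels, odd_index

rng = random.Random(1)
trials = [make_trial(2.0, rng) for _ in range(600)]
# collect the sign patterns that occurred, e.g. (True, True, False) for ++-
patterns = {tuple(s > 0 for s in slopes) for slopes, _, _ in trials}
print(len(patterns))
```

Three positions for the odd burst and two possible signs give the six sequences listed above. Because each burst receives its own level, the overall level difference between bursts carries no information about which one is odd.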

In a run, an adaptive staircase procedure was adopted. Each run started with values of ΔL lying well above the expected threshold. ΔL was changed according to the stepping rule described in Appendix A. The stepping rule was arranged such that most trials were taken for values of ΔL close to the 71%-correct score point. Only after the subject's performance had stabilized were results entered into the record. The entire psychometric function was then estimated by fitting the pooled data to a parameterized psychometric function, as described in Appendix B, with a Maximum-Likelihood procedure, described in Appendix C. It was assumed that d' ∝ ΔL. The threshold was defined as that value ΔL¹ for which, on the established psychometric function, d' = 1.

3.2.2 Experiment I

Experiment I investigated the ability of the auditory system to discriminate between a noise band with a positive spectral slope and one

¹ Thresholds are denoted with small caps.

Figure 3.2: Results of Experiment I. Thresholds ΔL (dB) are plotted as a function of the bandwidth (ST). Different symbols represent the individual subjects. The solid curve represents thresholds obtained with two-tone complexes.

with a negative spectral slope, as a function of the bandwidth. The bandwidth was varied between 0.5 and 24 ST. The centre frequency f_c, defined as the geometric mean of the two edge frequencies, was kept fixed at 1 kHz. Six subjects participated in the experiment. At least 750 trials per subject and per condition were taken.

Results & Discussion

The results of Experiment I are displayed in Figure 3.2, where thresholds ΔL (dB) are plotted as a function of the bandwidth (ST). Different symbols represent the individual subjects. The solid curve represents thresholds obtained with two-tone complexes in chapter 2, and will be discussed below. To make sure that no loudness cues were used, despite the roving level, the so-called "ceiling" had to be determined (Green, 1988). To that end, the discrimination threshold for changes in level of a noise band with a flat spectrum and a bandwidth of 6 ST was measured. Threshold was determined with and without a roving intensity level. One subject participated. With a roving level, threshold was 8 dB (the "ceiling"), whereas without a roving level it was about 2 dB. The latter result is in agreement with the data in Figure 4 of Moore et al. (1989). The 8-dB ceiling indicates that possible loudness cues could not account for the obtained thresholds in Figure 3.2. Subjects had to use the relative differences in the amplitude, or stated differently, they had to discriminate between the spectral shapes.

The spread of thresholds between subjects remained fairly constant, at about 1.5 to 2 dB, except at a bandwidth of 0.5 ST, where the spread was larger. More important is that the individual curves are parallel, indicating that subjects differ only in overall sensitivity. Going from left to right in Figure 3.2, threshold first decreases to about 0.8 to 2 dB, for a bandwidth of 3-6 ST. For larger bandwidths, threshold slowly starts to increase again. The average increase is 0.38 dB/octave (if averaged over 6, 12, and 24 ST).

Subjects reported that the perceptual cue with narrowband stimuli was a change in pitch. With broadband stimuli (i.e., larger than about 3 ST) the cue was a change in sharpness. Noise bands with a positive spectral slope were perceived as sounding sharper than bands with a negative slope. The reported percept is consistent with the findings of von Bismarck (1974b), who studied the verbal attributes of steady-state signals with different spectral shapes. Both threshold behaviour and reported perceptual cues suggest that, around a bandwidth of 3 ST, a transition occurs. For bandwidths smaller than 3 ST, the signal falls entirely within one critical band, so that it can be processed as a whole. Large-bandwidth signals are spread out over a number of bands, so that the outputs of different bands have to be combined in order to make a discrimination.

3.2.3 Experiment II

The second experiment investigated the influence of the noise sample (or the phase relation between components of the noise band) on the discriminability of changes in the spectral shape. Since the previous experiment dealt with frozen noise, it is interesting to know to what extent

Figure 3.3: Results of Experiment II. Thresholds ΔL (dB) are plotted for (a) the five noise samples, and (b) the six subjects.

threshold is influenced by a different phase relation between the components. It is known that beyond the critical bandwidth the auditory system is no longer very sensitive to phase. Only within-channel phase relations can be perceived. Therefore, thresholds for changes in the spectral slope of a noise band with a bandwidth of 1 ST were measured. Five different samples were taken at random, labeled (1) to (5). Sample (1) was used in the previous experiment. Thresholds were measured using six subjects (the same as in the previous experiment) and at least 450 trials were taken per subject and per condition.

Results & Discussion

Figure 3.3 displays the results of Experiment II, where thresholds have been plotted (a) as a function of the five noise samples, and (b) as a function of the six subjects. Figure 3.3 (a) shows that between-subject differences in threshold are 1.5 to 3.0 dB, which is somewhat larger than was seen in Figure 3.2. For all subjects, threshold for sample (1), leftmost in the figure, is lower than for samples (2), (3) and (4). Similarly, threshold for sample (4) is 0.4 dB lower than for sample (3). Figure 3.3 (b) shows that thresholds for the different noise samples do not vary much for subjects (1) and (5), 1 dB or less, whereas the other subjects have a larger variation in threshold. At the same time, however, they also have a higher threshold. In conclusion, the results show that phase relation may affect threshold, but the amount of variation is small enough to ensure that the shape of the threshold curve in Figure 3.2 is preserved. The location of the minimum, however, could be shifted somewhat if different samples were used.

3.2.4 Experiment III

The U-shaped curve of Figure 3.2 of Experiment I resembles the threshold function obtained with two-tone complexes (chapters 1 and 2), of which the amplitudes were changed in opposite directions such that a change in the spectral slope occurred, much like the change in slope of the noise bands.

In order to investigate to what extent the discriminability of changes in the spectral shape of noise bands is similar to the discriminability of changes in two-tone complexes, thresholds for a sign change in the spectral slope of noise bands were measured as a function of centre frequency fc. In one condition the bandwidth was kept fixed at 58 Hz, and in a second condition at 1 ST. Centre frequencies for both conditions were 125, 250, 500, 1000, 2000 and 4000 Hz. The centre frequency was defined as the geometric mean of the two edge frequencies of the noise band. In chapter 2, thresholds for two-tone complexes as a function of centre frequency were measured under identical conditions. There, threshold was hardly dependent on centre frequency if the bandwidth was kept fixed at 1 ST.
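The relation between the two bandwidth conditions can be made concrete with a small calculation. The following is a minimal sketch, assuming only the geometric-mean definition of centre frequency given above and 12 semitones per octave; the function names are illustrative and do not come from the thesis.

```python
import math

def bandwidth_in_semitones(fc_hz, bandwidth_hz):
    """Bandwidth in semitones of a band whose geometric-mean centre is fc_hz."""
    half = bandwidth_hz / 2.0
    # The edges satisfy f_hi - f_lo = bandwidth and sqrt(f_lo * f_hi) = fc.
    f_lo = -half + math.sqrt(half * half + fc_hz * fc_hz)
    f_hi = f_lo + bandwidth_hz
    return 12.0 * math.log2(f_hi / f_lo)

def bandwidth_in_hz(fc_hz, bandwidth_st):
    """Bandwidth in Hz of a band spanning bandwidth_st semitones around fc_hz."""
    half_ratio = 2.0 ** (bandwidth_st / 24.0)  # ratio of an edge to the centre
    return fc_hz * (half_ratio - 1.0 / half_ratio)

# Centre frequencies used in Experiment III.
for fc in (125, 250, 500, 1000, 2000, 4000):
    st = bandwidth_in_semitones(fc, 58.0)
    hz = bandwidth_in_hz(fc, 1.0)
    print(f"fc = {fc:4d} Hz: 58 Hz is {st:.2f} ST, 1 ST is {hz:.1f} Hz")
```

At fc = 1000 Hz the two conditions nearly coincide (58 Hz is very close to 1 ST), whereas at 125 Hz a 58-Hz band spans about 8 ST.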

Two subjects participated, both of whom had also taken part in the previous two experiments and in the experiments of chapter 2. Per subject and per condition, 600 trials were taken.

Figure 3.4: Results of Experiment III. Thresholds ΔL (dB) for noise bands with a bandwidth of 58 Hz are plotted as a function of the centre frequency fc (filled symbols). Plotted in open symbols are thresholds for two-tone complexes from chapter 2. Different symbols represent the individual subjects.

Results & Discussion

Figures 3.4 and 3.5 display the thresholds for two subjects for the 58-Hz and the 1-ST bandwidth condition, respectively (filled symbols). Also plotted (in open symbols) are thresholds for changes in the amplitude of two-tone complexes, obtained in chapter 2. The relation with two-tone complexes will be discussed below. The secondary axis at the top of each figure indicates the corresponding bandwidth of the noise band in ST (if the bandwidth was kept fixed at 58 Hz, Figure 3.4) and in Hz (if the bandwidth was 1 ST, Figure 3.5). At fc = 1 kHz, the stimuli in the two conditions had the same bandwidth. The samples, however, i.e., the phases of the components of the noise bands, were different for the two conditions. Yet, thresholds are very similar (cf. thresholds at fc = 1 kHz in Figure 3.4 with those of Figure 3.5). These thresholds are also close to the ones obtained in Experiment I for a bandwidth of 1 ST. It should be noted that the subjects of the present experiment were subjects number (2) and (5) of Experiment II, whose thresholds were influenced only slightly by the different noise samples.

Figure 3.5: Results of Experiment III. Thresholds ΔL (dB) for noise bands with a frequency ratio of 1 ST are plotted as a function of the centre frequency fc (filled symbols). Also plotted (open symbols) are thresholds for two-tone complexes from chapter 2. Different symbols represent the individual subjects.

Thresholds for the 58-Hz condition decrease at first as centre frequency increases. At 500 Hz a minimum of about 1.2 dB is reached. For higher centre frequencies, thresholds start to increase again. The largest difference between the two subjects is obtained for fc = 4000 Hz. The thresholds for two-tone complexes are very similar to the present results. Two main differences can be observed, however. Firstly, thresholds for two-tone complexes are much smaller. This was also observed in Experiment I. Because the phase relation between the components in the noise band was kept fixed, the spectro-temporal pattern was also fixed. The auditory system, however, analyses the signal with a time window that is small in comparison with the duration of the noise burst. Therefore, if analysed with a short time window, the spectrum of the noise band fluctuates. The internal representation of the spectrum thus is unstable, causing a poorer estimate of the spectral shape. Secondly, the minimum is situated at fc = 500 Hz, whereas for two-tone complexes it is at 1000 Hz. For the moment, no explanation can be given.

Thresholds for the 1-ST condition do not display a clear trend as a function of centre frequency. For one subject in particular, threshold does not depend on centre frequency at all, except perhaps for a slight increase at fc = 125 Hz. Threshold behaviour for amplitude changes in two-tone complexes (open symbols) again closely resembles the present behaviour.

Figure 3.2 displayed thresholds as a function of bandwidth in semitones. In a sense, this is also true of Figure 3.4, where, going from left to right, the bandwidth in semitones decreases. In the latter figure, the minimum seems to occur at a somewhat smaller bandwidth (2 ST, instead of 3-6 ST). Further comparison of the two figures shows that Figure 3.4 can be seen as the mirror image of Figure 3.2. Combining the results of Figures 3.2, 3.4 and 3.5 suggests an invariance of the shape of the curve in Figure 3.2 under transformation of centre frequency, i.e., threshold is not determined by the value of the centre frequency, but rather by the signal's bandwidth in semitones. It is not unreasonable to think that threshold will be constant if the bandwidth is constant in, for instance, critical bands.

3.2.5 The EWAIF model

For narrowband signals, the obtained thresholds might be modelled with the EWAIF model of Feth (1974). The model states that a change in the spectral shape is perceived as a change in pitch. The mapping of the signal's spectrum onto the pitch axis is done by calculation of the envelope-weighted average of the instantaneous frequency, or EWAIF:

$$\mathrm{EWAIF} = \frac{\int_0^T E(t)\, f(t)\, dt}{\int_0^T E(t)\, dt} \tag{3.1}$$

In this equation, E(t) and f(t) are the envelope function and the instantaneous frequency, respectively. Averaging is done over a time interval T. If, for two signals, the difference in the associated values for EWAIF becomes larger, the two signals can be discriminated better. Though for two-tone complexes (and even three-tone complexes) the envelope function and the instantaneous frequency can be derived analytically, this is in practice impossible for noise bands. Therefore, envelope and instantaneous frequency were extracted from the sampled waveform with the aid of a computer programme, using a Discrete Hilbert Transformation (Feth & Stover, 1987; Kidd Jr., Mason, Uchanski, Brantley & Shah, 1991). E(t) and f(t) were calculated for noise bands with a positive and with a negative spectral slope, where the slope was set to the empirically obtained threshold value of ΔL. Obtained functions for E(t) and f(t) were inserted into Equation (3.1) and the difference in EWAIF between a noise band with a positive and with a negative slope, ΔEWAIF, was determined. The integration time T was 400 ms, the stimulus duration. It was impossible to check whether the obtained instantaneous frequency had been determined correctly: sometimes, the analytical instantaneous frequency may have negative values, or values larger than half the sampling frequency. Then the analytical and computed results will deviate.

Figure 3.6: The ratio ΔEWAIF(58 Hz)/ΔEWAIF(1 ST) as a function of centre frequency. Different symbols represent the individual subjects.

If EWAIF were the discrimination cue, ΔEWAIF(58 Hz) should be equal to ΔEWAIF(1 ST) at threshold. In other words, the ratio ΔEWAIF(58 Hz)/ΔEWAIF(1 ST) should be close to unity. Figure 3.6 displays this ratio for both subjects. For fc = 1 kHz the ratio is close to unity, which of course is due to the fact that the signals at this centre frequency are identical (except for the phase relation, which may actually influence the EWAIF values). For larger centre frequencies, the ratio stays close to unity. For fc = 125 to 500 Hz, the ratio is a factor 2 to 4 too large. This result implies either that the model does not work for low centre frequencies, or that the model does not work for signals with a bandwidth exceeding 1 ST, or both.
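To illustrate how Equation (3.1) maps a change in spectral slope onto a single frequency value, the sketch below evaluates EWAIF for a two-tone complex, the case in which E(t) and f(t) are known analytically. It is not the Hilbert-transform procedure used for the noise bands. The function names, the midpoint-rule integration, and the convention that each component moves by half of ΔL are assumptions made for this example only.

```python
import math

def two_tone_ewaif(f1, f2, a1, a2, duration=0.4, steps=40000):
    """EWAIF (Hz) of a1*cos(2*pi*f1*t) + a2*cos(2*pi*f2*t), by the midpoint rule."""
    dt = duration / steps
    num = 0.0  # approximates the integral of E(t) * f(t)
    den = 0.0  # approximates the integral of E(t)
    for k in range(steps):
        t = (k + 0.5) * dt
        c = math.cos(2.0 * math.pi * (f2 - f1) * t)
        env_sq = a1 * a1 + a2 * a2 + 2.0 * a1 * a2 * c  # E(t) squared
        if env_sq <= 0.0:
            continue  # zero envelope: this instant carries no weight
        env = math.sqrt(env_sq)
        # Instantaneous frequency of the analytic signal of the complex.
        inst_f = (a1 * a1 * f1 + a2 * a2 * f2 + a1 * a2 * (f1 + f2) * c) / env_sq
        num += env * inst_f * dt
        den += env * dt
    return num / den

def delta_ewaif(f1, f2, delta_l_db):
    """EWAIF difference between a rising and a falling two-tone spectrum."""
    g = 10.0 ** (delta_l_db / 40.0)  # assumption: each component moves by delta_l_db / 2
    rising = two_tone_ewaif(f1, f2, 1.0 / g, g)   # upper component stronger
    falling = two_tone_ewaif(f1, f2, g, 1.0 / g)  # lower component stronger
    return rising - falling

print(two_tone_ewaif(970.0, 1030.0, 1.0, 1.0))
print(delta_ewaif(970.0, 1030.0, 1.0))
```

With equal amplitudes the instantaneous frequency is constant at the mean of the two component frequencies, so EWAIF equals that mean; tilting the spectrum moves EWAIF towards the stronger component, which is the pitch cue the model relies on.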

Feth (1974) stated that, since spectral changes are transformed to pitch changes, the difference in EWAIF should be correlated with pure-tone frequency JNDs. For the present two subjects, such pure-tone JNDs were measured as described in chapter 2, yielding values of ΔF measured under conditions similar to those used for noise band discrimination. Figure 3.7 displays the ratio ΔEWAIF/ΔF as a function of centre frequency. Ideally, the ratio should be close to unity. Figure 3.7 shows that for the 1-ST condition this is true, especially at low centre frequencies. In the 58-Hz condition, the ratio is larger than unity for low centre frequencies, but decreases as the centre frequency increases. The results thus indicate that it is not the centre frequency that is a limiting factor, but the bandwidth: if the bandwidth is smaller than 1 ST, the EWAIF model can account for the present data. Results obtained with two-tone complexes in chapter 2 are even more constraining: with these signals the EWAIF model can only account for the data if bandwidths are smaller than 0.5 ST. The applicability of the EWAIF model is limited to narrowband stimuli because the model treats the sound as a whole. With broadband signals, the model cannot be applied because, in the auditory periphery, the signal is filtered into different channels.

Figure 3.7: Ratio ΔEWAIF/ΔF as a function of centre frequency fc. Open symbols represent the 58-Hz condition; filled symbols the 1-ST condition. Different symbols represent the individual subjects.


Figures 3.4 and 3.5 show a remarkable similarity between thresholds obtained with noise bands and thresholds obtained with two-tone complexes. It therefore should be possible to map the noise-band thresholds into two-tone-complex thresholds. A transformation model could be the EWAIF model. The EWAIF model maps differences in spectral shape into pitch differences, for noise bands as well as for two-tone complexes. At threshold, the model should yield the same ΔEWAIF value for noise bands of a certain bandwidth as for two-tone complexes with the same bandwidth. In other words, the ratio ΔEWAIF(noise band)/ΔEWAIF(two-tone complex) should be close to unity. This ratio was determined for the individual centre frequencies, for both the 1-ST and the 58-Hz condition. The results are plotted in Figure 3.8. Only a few data points are in the vicinity of unity, namely those obtained in the 1-ST condition at low centre frequencies. Thresholds obtained with narrow bandwidth and high centre frequency do not seem to satisfy the EWAIF model. Therefore, the applicability of the model is restricted further to narrowband signals (less than 1 ST) at low centre frequencies (500 Hz and below). This may indicate that the instantaneous frequency can only be tracked at low centre frequencies, where the fluctuations are rather slow. Of course, the possibility exists that, due to the Discrete Hilbert Transformation, EWAIF values are incorrectly determined. After all, negative instantaneous frequencies or extremely high frequencies are awkward to interpret in psychophysical terms. Also, one might argue that cues other than EWAIF, such as differences in the temporal envelope, may have played a role and provided an additional cue.

3.2.6 The multi-channel model

In multi-channel models, as developed by Plomp (1976), Durlach et al. (1986) and Ito (1990), a broadband signal is filtered by a set of (non-overlapping) bandpass filters. In each band the amount of activity is measured. Thus the output of the model is a crude spectral representation of the signal. The frequency bands usually are identified with critical bands. Therefore, a multi-channel model can only be applied to the results in Experiment I, since the signals used in Experiments II and III all fall within one critical band.

Figure 3.8: The ratio ΔEWAIF(noise band)/ΔEWAIF(two-tone complex) as a function of centre frequency. Different symbols represent the individual subjects. Open symbols are the 58-Hz condition; filled symbols the 1-ST condition.

Though the theoretical background of the multi-channel model is rather straightforward, a precise quantitative implementation is difficult, since assumptions have to be made about the auditory filter shape. Nevertheless, some qualitative statements can be made. In its simplest form, no interchannel correlation noise is present, and bands do not overlap. Discriminability between two signals improves as the difference between the two spectra (e.g., the sum of differences in intensity per critical band) gets larger. If, at threshold, that difference between a noise band with a positive and a negative spectral slope remained the same, threshold ΔL should decrease as bandwidth increases. Results of Experiment I show that the opposite is true. By introducing interchannel correlation noise and weighting coefficients, Durlach et al. (1986) were able to explain the ability of the auditory system to detect changes in spectral shape in the presence of a roving intensity level. Threshold in Experiment I increases as the bandwidth increases. Since, with these stimuli, most information is situated at the spectral edges, this threshold behaviour can be accounted for by stating that the ability to perform across-channel comparisons worsens (or correlation decreases) as bands become more remote. This was also observed with two-tone complexes in chapter 2. Since the multi-channel model operates in the spectral domain, it cannot directly account for differences in threshold due to different phase relations of the components.
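The qualitative prediction of the simplest multi-channel model, namely that threshold should fall as bandwidth grows, can be illustrated with a toy calculation. The sketch below is not an implementation of any of the cited models. It assumes a spectrum whose level changes linearly (in dB) with log frequency, hypothetical non-overlapping channels 3 ST wide, a level read off at each channel centre, and no internal or interchannel noise; all names are illustrative.

```python
def summed_channel_difference(bandwidth_st, delta_l_db, channel_width_st=3.0):
    """Sum over channels of |level difference| between a rising and a falling band."""
    n_channels = max(1, round(bandwidth_st / channel_width_st))
    total = 0.0
    for k in range(n_channels):
        # Channel centre as a fraction of the bandwidth, between -0.5 and +0.5.
        x = (k + 0.5) / n_channels - 0.5
        level_rising = delta_l_db * x    # dB relative to the level at band centre
        level_falling = -delta_l_db * x
        total += abs(level_rising - level_falling)
    return total

def predicted_threshold(bandwidth_st, criterion_db=1.0):
    """Edge-to-edge level difference needed to reach a fixed summed difference."""
    per_db = summed_channel_difference(bandwidth_st, 1.0)
    if per_db == 0.0:
        return float("inf")  # a single channel carries no across-channel cue here
    return criterion_db / per_db

for bw in (6, 12, 24):
    print(bw, "ST ->", predicted_threshold(bw), "dB")
```

Under these assumptions the summed difference grows with the number of channels, so the predicted ΔL falls as bandwidth increases, which is the trend that the results of Experiment I contradict.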

3.3 General Discussion

Noise bands with spectral slopes that change in sign can be discriminated while a roving intensity level is present. Results of Experiment I showed that discrimination was based on spectral-shape discrimination only, and that the signal can be processed either by using within-channel cues, meaning that the signal is processed as a whole, or across-channel cues, where the outputs of the different bands are compared.

Spectral-shape discrimination with narrowband signals (smaller than one critical band) was examined throughout the experiments. Results indicate that discriminability remains constant if the signal's bandwidth is constant on a critical band scale or equivalent rectangular bandwidth scale. The EWAIF model, as proposed by Feth (1974), could account for the data only to a limited extent. The results indicate that fluctuations of the instantaneous frequency can be used by the audi­tory system, only if they are slow. This is the case for very narrow bandwidths at low centre frequencies.

Considering the great similarity in threshold behaviour between noise bands and two-tone complexes, it is tempting to state that, in some way, the noise bands are processed in a similar way to two-tone complexes. The increase in threshold with increasing bandwidth is 0.38 dB/oct. for noise bands, and 0.58 dB/oct. for two-tone complexes, which is rather similar. More specifically, it seems as if mainly two regions of the spectrum are used in the decision process. This is in fact a very simple version of the multi-channel model. With a roving intensity level present, the minimum number of channels which has to be observed is two (except for narrowband signals, since then only one band is used, and the decision is based on within-channel cues, such as a change in pitch). With two-tone complexes, it is obvious that only two channels are involved. With large-bandwidth noise bands, there is quite some information present in channels near the centre of the band. Since the increase in threshold is only slightly less for noise bands than for two-tone complexes, it seems that only very little "centre-band" information is used, compared with the edge information. Therefore, with the present noise bands, results indicate that mainly two regions are used, viz. the spectral edges.
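Why at least two channels are needed under a roving level can be shown with a few lines of arithmetic. The sketch below is only an illustration: the rove range of plus or minus 10 dB and the 2-dB edge-to-edge difference are hypothetical values, not the parameters of the experiments.

```python
import random

def channel_levels(edge_difference_db, rove_db):
    """Levels (dB) in the low-edge and high-edge channels for one presentation."""
    low = -edge_difference_db / 2.0 + rove_db
    high = edge_difference_db / 2.0 + rove_db
    return low, high

rng = random.Random(1)
for _ in range(5):
    rove = rng.uniform(-10.0, 10.0)          # hypothetical rove range in dB
    slope = rng.choice((-1.0, 1.0)) * 2.0   # rising or falling, 2 dB edge to edge
    low, high = channel_levels(slope, rove)
    # A single channel level follows the rove; the across-channel difference does not.
    print(f"rove {rove:+6.2f} dB   low {low:+6.2f} dB   high-low {high - low:+5.2f} dB")
```

The level in either channel alone varies with the rove, whereas the difference between the two edge channels always recovers the sign of the slope.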

Versfeld & Houtsma (1991) studied the discriminability of changes in the spectral slope of a multi-tone complex as a function of the number of components. The distance between two adjacent components remained 1 ST, so the bandwidth (in ST) was equal to the number of components minus one. In another experiment, they measured thresholds for a two-tone complex as a function of bandwidth. The multi-tone complex can be regarded as a two-tone complex with a number of components in between. Figure 3.9 displays the thresholds for the two-tone complexes and the multi-tone complexes as a function of the total bandwidth. These results show that thresholds for both conditions remain about the same up to about 9 ST. This implies that components inside the multi-tone complex do not help in the discrimination process, and only the two outer components are used. At larger bandwidths, thresholds for multi-tone complexes are smaller. This indicates that possibly more inward-lying components are also used. These experiments were done without a roving intensity level, so one cannot rule out the possibility that thresholds for broadband two-tone and multi-tone complexes were obtained by discrimination on the basis of within-channel level cues.

Figure 3.9: Threshold ΔL (dB) for multi-tone complexes (filled symbols) and two-tone complexes (open symbols) as a function of the bandwidth (ST). The error bars denote the standard deviation between subjects.

Bernstein & Green (1987b) reported several experiments where the threshold for detecting an increase in amplitude of only one component in a 21-component spectrum was compared to thresholds for detecting broadband spectral changes (e.g., flat vs. tilted spectra). They found that thresholds obtained with broadband changes could not be predicted by the single-component thresholds, unless it was assumed that only two regions of the spectrum were used in the discrimination process. In profile-analysis experiments, it was found in general that complex changes give poorer thresholds than would be expected from thresholds obtained with single-component changes in a multi-tone spectrum (Green & Kidd Jr., 1983; Green et al., 1987; Richards, Onsan & Green, 1989).

Farrar et al. (1987) studied the discriminability of different speech-like noise spectra, embedded in long-term averaged speech noise. In an attempt to predict their results with a simple multi-channel model, they found rather large values for the internal noise variance. This again indicates poor performance. This large variance was obtained because the same variance was assumed for all bands in the multi-channel model. If only two regions of the spectra were used (which can be made plausible by inspection of their Figure 6), the variance of two regions would be rather small, whereas at the other regions it would be large.

Moore et al. (1989) investigated the ability of the auditory system to detect a peak or a notch in an otherwise flat spectrum. A simple multi-channel model would predict that the threshold should decrease as the bandwidth of the peak or notch increases. The results, however, showed no such dependence. This again suggests that only two regions of the spectrum are used. One region then is situated at the peak (notch), the other at the non-changing part of the spectrum.

Berg (1990) proposed a model that can account for threshold behaviour for detection of a change in amplitude of one component of a multi-tone complex. By introducing random variations in the amplitudes of the components in the complex, the influence of these variations on the discriminability of an increment in the target component can be measured. For each individual component this influence can be transformed to a weight, indicating to which degree it aids the detection of an amplitude increment in the target component. This model has only been tested for single-component amplitude changes (Green & Berg, 1991). Though it will need a great experimental effort, it will be interesting to see what the model weights are if thresholds for complex spectral changes are measured (either in multi-tone complexes or noise bands). The distribution of the weights would show directly whether or not only two regions are used in the discrimination process and, if so, which regions.
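How a distribution of weights could reveal a "two-region" strategy can be sketched with a toy simulation. This is not Berg's analysis procedure. It assumes a hypothetical observer who forms a linear combination of randomly perturbed component levels plus Gaussian internal noise, and it estimates relative weights as the difference between the mean perturbation on "rising" and on "falling" responses. All parameter values and names are illustrative.

```python
import random

def estimate_weights(true_weights, n_trials=4000, perturb_sd=1.0,
                     internal_sd=0.5, seed=0):
    """Estimate relative decision weights from responses to perturbed levels."""
    rng = random.Random(seed)
    n = len(true_weights)
    sums = {True: [0.0] * n, False: [0.0] * n}
    counts = {True: 0, False: 0}
    for _ in range(n_trials):
        # Random level perturbation (dB) applied to each component.
        perturbation = [rng.gauss(0.0, perturb_sd) for _ in range(n)]
        decision = sum(w * p for w, p in zip(true_weights, perturbation))
        decision += rng.gauss(0.0, internal_sd)  # internal noise of the observer
        response = decision > 0.0                # True means "rising spectrum"
        counts[response] += 1
        for i, p in enumerate(perturbation):
            sums[response][i] += p
    # Mean perturbation given "rising" minus mean perturbation given "falling".
    return [sums[True][i] / counts[True] - sums[False][i] / counts[False]
            for i in range(n)]

# Hypothetical observer that listens only to the two spectral edges.
weights = estimate_weights([-0.5, 0.0, 0.0, 0.0, 0.5])
print([round(w, 2) for w in weights])
```

For such an observer the estimated weights are large and of opposite sign at the two edges and close to zero for the inner components, which is the pattern the "two-region" idea would predict.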

It seems that the discrimination of complex spectral changes can be explained qualitatively by assuming that only two regions are used. One must keep in mind, however, that all experiments reported are discrimination experiments, where only two signals had to be discriminated. It is very well possible that in other experiments a set of spectra has to be discriminated (identified), which requires more than two regions. This is true, for example, for experiments where different timbres (Plomp, 1976) or vowels (Pols, van der Kamp & Plomp, 1969) have to be judged for dissimilarity. In that case, the whole spectrum has to be monitored, not just the two regions which contain the most reliable source for discrimination.

The multi-channel model is a very powerful model, due to its general nature. Then again, this also is its weakness, since the many free parameters mean that the model is not easy to test. Further research will be needed to find out how well the multi-channel model can account for the observed phenomenon that roughly two regions are used in the discrimination process.


Summary

In this thesis an attempt was made to gain insight into the capability of the human auditory system to discriminate between sounds that differ only with respect to their spectral shape. Since the spectrum of a signal can be varied in many ways, the signals used in this thesis were restricted to very simple signals that usually differed from each other in the sign of the spectral slope. Discrimination thresholds for spectral-slope changes of two-tone complexes were measured in chapters 1 and 2. Chapter 1 also reported thresholds for changes in the spectral slope of multi-tone complexes. In chapter 3 thresholds for changes in the spectral slope of noise bands were discussed. Thresholds were generally measured as a function of frequency separation (bandwidth), centre frequency and overall intensity. In several experiments, the overall intensity for each sound burst was randomized, so that the change in the spectral shape was left as the only reliable cue for discrimination.

The general finding in this thesis is that, despite the roving intensity level, the auditory system can discriminate surprisingly well between signals with different spectral shapes. Perhaps the most striking example is given by two-tone complexes with a frequency separation of one semitone (ST), where the threshold for component amplitudes changing in opposite direction is 0.25 dB, about four times better than single-tone intensity thresholds. However, it may be unfair to compare the present thresholds to intensity thresholds, since the reported perceptual cues were changes in pitch and sharpness, rather than changes in loudness.

Auditory frequency resolving power plays an important role in the discrimination process, in the sense that changes in the spectral shape of unresolved signals are processed in a different way from those of resolved signals. Thresholds for a particular signal did not depend on centre frequency, as long as the signal could be resolved to the same degree. The role of the critical band can be seen in the shape of the threshold as a function of the bandwidth for two-tone complexes or noise bands. Discrimination was best if the signal's bandwidth was about 1 ST (for two-tone complexes) or 3-6 ST (for noise bands).

If the signal cannot be resolved by the auditory system (which is the case for bandwidths smaller than 1 ST), the signal has to be processed as a whole, and therefore can be interpreted as one spectral component with a time-varying frequency and amplitude. Changes in the spectral shape caused a change in pitch or roughness, or both. However, models that assume that spectral changes in narrowband signals are trans­formed to changes in pitch (Feth, 1974), could only partly account for the results.

With resolved signals (bandwidths larger than 1 ST), the outputs of the separate frequency channels can be (and, in case of a roving intensity level, have to be) compared in a relative fashion, but neither with two-tone complexes, nor with noise bands, is the auditory system able to combine the outputs optimally. With noise bands, it seemed as if only two regions of the spectrum (probably the spectral edges) were taken into account. If so, the discrimination strategy for noise bands should resemble that of two-tone complexes very much, and, indeed, threshold functions for both stimuli were, under all conditions, very similar.

Though changes in simple sounds already have been examined in this thesis, still more insight can be gained with such simple sounds. An important question, for instance, is how far auditory profile analysis has a central origin and to what extent processes in the auditory periphery influence threshold functions. A possible approach to this question is to repeat the experiments with two-tone complexes, but with frequency components going to the different ears. By roving the overall intensity (to eliminate the use of loudness cues) and by randomly assigning one component to one ear and the other to the opposite ear (to eliminate localization cues), evidence for central or peripheral processing can be found.

The multi-channel model of Durlach et al. (1986) seems to be able to account for the present two-tone data, even for a separation of only 1 ST. Further exploration of the model with two-, three- and multi-tone complexes may provide more insight into the applicability of the model. The present two-tone complexes are a special case for the model, in the sense that if optimum processing takes place, no influence of a roving intensity level is to be expected. The effects of a roving level thus are a direct measure of the inefficiency of the auditory system in comparing different spectral shapes.

Measurements obtained with noise bands suggest that only two regions of the entire spectrum are used in the discrimination process. One can imagine that if only two sounds have to be discriminated, in the presence of a roving level, the minimum number of spectral regions that need to be watched is two. A logical experiment to be done in the future is to examine the generality of the "two-region" idea by extending the experiments to threshold measurements for changes in the spectral shape of signals (noise bands as well as multi-tone complexes) with a few peaks in the spectrum. If the "two-region" model holds, then one may wonder what will happen if roving signals have to be discriminated, that is, if in each 3I3AFC trial the subject is presented two signals selected from a set with different kinds of spectral shapes. It may well be that the model is an artifact of the experimental paradigm, in the sense that, in a run, always only the same two signals are compared.


Appendix A

A novel adaptive staircase procedure

Abstract

A decision criterion is proposed for changing the stimulus level in adaptive staircase procedures. The criterion does not assume knowledge of the shape of the psychometric function. The decision criterion depends on two parameters: one parameter determines the target probability of a correct response, the other parameter determines the transition speed, viz. the maximum number of consecutive trials at one stimulus level. The procedure resembles the transformed up-down methods (and can be used to imitate some of them), but has the advantage that relatively more trials are concentrated at the target value. The decision criterion is very easy to implement.

A.1 Introduction

Soon after Dixon & Mood (1948) introduced the simple one-down, one-up (1D1U) adaptive procedure as a means of estimating a (discrimination) threshold, alternatives were presented to improve the efficiency, or to estimate points other than the 50%-correct point of the psychometric function. These so-called transformed up-down (TUD) procedures, as proposed by Wetherill & Levitt (1965) and Levitt (1971), are nowadays frequently used in psychophysical studies. The strategy with these procedures is simple. At a given stimulus level, a subject is allowed to make $L - 1$ ($L \geq 1$) incorrect responses before the stimulus level is increased. If the subject, on the other hand, manages to give $K$ ($K \geq 1$) correct responses, the stimulus level is decreased. This procedure is called a K-down, L-up procedure. Thus, after a maximum of $K + L - 1$ trials, a change in level occurs. Consider the example where a subject has to discriminate between two signals that differ by an amount $\Delta$. It is assumed that discriminability improves as $\Delta$ increases, so the psychometric function is monotonic. Let $p_s(\Delta_1)$ be the probability that a subject gives one correct response at stimulus level $\Delta_1$. If $p_s(\Delta_1)$ is small, only few consecutive correct responses will be given, and the staircase procedure will cause the stimulus level to be increased until a difference $\Delta_2$ is reached for which $p_s(\Delta_2)$ is sufficiently large that the number of correct responses makes the level decrease again. Therefore, there exists a value for $\Delta$ for which the probability of decreasing the stimulus level equals the probability of increasing the level

$$ P(\text{level decrease}) = P(\text{level increase}) \tag{A.1} $$

$$ = 1 - P(\text{level decrease}), \tag{A.2} $$

so that

$$ P(\text{level decrease}) = \frac{1}{2}. \tag{A.3} $$

With a K-down, L-up procedure, the probability of a correct response at the point of equilibrium, denoted as $p_t$ (the target probability), is determined by the solution of the equation

$$ P(\text{level decrease}) = p_t \sum_{l=0}^{L-1} \binom{K-1+l}{K-1} [p_t]^{K-1} [1 - p_t]^{l} = \frac{1}{2}. \tag{A.4} $$

Equation (A.4) is to be interpreted as the sum over all possible chains containing $l$ incorrect and $K$ correct responses, with the restriction that the last response is correct. Equation (A.4) is a polynomial in $p_t$ of order $K + L - 1$. Solutions for $p_t$ for $K \leq 5$ and $L \leq 5$ are displayed in Table I.
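As an illustration of how Equation (A.4) can be solved numerically, the following minimal sketch finds $p_t$ by bisection. The function names `p_level_decrease` and `target_probability` are hypothetical, and bisection is only one possible root finder; it relies on the left-hand side of (A.4) increasing monotonically with $p_t$.

```python
from math import comb

def p_level_decrease(p, K, L):
    """Left-hand side of Equation (A.4): probability that K correct responses
    are collected before L incorrect ones, for per-trial probability p."""
    return sum(comb(K - 1 + l, l) * p**K * (1.0 - p)**l for l in range(L))

def target_probability(K, L, tol=1e-10):
    """Solve P(level decrease) = 1/2 for p by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p_level_decrease(mid, K, L) < 0.5:
            lo = mid   # root lies above mid
        else:
            hi = mid   # root lies at or below mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # One printed row per L, one column per K, as in Table I.
    for L in range(1, 6):
        print(L, " ".join(f"{target_probability(K, L):.4f}" for K in range(1, 6)))
```

For example, `target_probability(2, 1)` corresponds to the familiar 2D1U procedure.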

Often, at the beginning of a run, a simple 1D1U procedure is used to reach the vicinity of threshold quickly, whereafter a 2D1U or a 3D1U procedure is followed. These procedures are in fact two separate procedures. In this paper, a stepping rule is presented in which several TUD procedures are mapped into one equation containing two variables. By adjustment of the two variables, an original TUD procedure can be regained, but new stepping rules can also be generated. With runs containing several procedures (hybrid procedures), little programming is necessary. Furthermore, an example will be given and implications of this new stepping rule will be discussed.


A.2 Experimental procedure

In a typical adaptive procedure, a decision has to be made after each trial to either increase the stimulus level, decrease it, or take at least one more trial at the same level. Suppose that at a given level $\Delta_1$, a subject has already responded to $N_t$ trials, of which $N_c$ answers were correct. Let the subject's performance at this level be $p_s(\Delta_1)$. In the end, the experimenter wishes to estimate the threshold, defined as that value for $\Delta$ for which the probability of a correct response is $p_t$. Thus the experimenter has to determine whether at the present level the subject's performance $p_s(\Delta_1)$ is better or worse than $p_t$, or is close enough to it. The probability of $N_c$ correct responses out of $N_t$ trials, given some probability $p$ of a correct response, is

$$ \Pr(N_t, N_c, p) = \binom{N_t}{N_c} p^{N_c} (1 - p)^{N_t - N_c}, \tag{A.5} $$

which is one term of the Binomial distribution. Figure A.1 displays $\Pr(N_t, N_c, p)$ as a function of $p$, where, as an example, $N_t = 10$ and $N_c = 9$. It can be seen from Figure A.1, but can also be shown, that $\Pr(N_t, N_c, p)$ is maximized for $p = N_c/N_t$. In an adaptive procedure, $p_t$ is kept fixed. So if $p_s(\Delta)$ is close to $p_t$, the probability of the fraction $N_c/N_t$ lying close to $p_t$ is large, and hence $\Pr(N_t, N_c, p_t)$ will be large. This property is used to decide whether or not to change the stimulus level. The experimenter chooses a target value $p_t$, and calculates after each trial the probability $\Pr(N_t, N_c, p_t)$. If

$$ \Pr(N_t, N_c, p_t) \geq \alpha, \quad (0 < \alpha < 1) \tag{A.6} $$

the probability of getting $N_c$ correct responses out of $N_t$ trials is sufficiently large (i.e., at least as large as some criterion $\alpha$), indicating that $N_c/N_t$ lies close enough to $p_t$, and the current stimulus level is maintained. Hence, at least one more trial is taken at that level. If, on the other hand,

$$ \Pr(N_t, N_c, p_t) < \alpha, \tag{A.7} $$

the experimenter should either increase the stimulus level (if $N_c/N_t < p_t$) or decrease the stimulus level (if $N_c/N_t > p_t$). The main point is that the closer $p_s$ comes to $p_t$, the more trials are needed to change the level, which is exactly what the experimenter wants: few trials at stimulus levels remote from threshold, and many trials at levels lying close to the desired threshold.

[Figure A.1: $\Pr(N_t, N_c, p)$ as a function of $p$ for $N_t = 10$ and $N_c = 9$.]

Table I: Values for $p_t$ for a K-down, L-up procedure.

| L \ K | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0.5000 | 0.7071 | 0.7937 | 0.8409 | 0.8706 |
| 2 | 0.2929 | 0.5000 | 0.6143 | 0.6862 | 0.7356 |
| 3 | 0.2063 | 0.3857 | 0.5000 | 0.5786 | 0.6359 |
| 4 | 0.1591 | 0.3138 | 0.4241 | 0.5000 | 0.5598 |
| 5 | 0.1294 | 0.2644 | 0.3641 | 0.4402 | 0.5000 |
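The decision criterion of Equations (A.5) to (A.7) can be sketched in a few lines. This is only an illustrative sketch: the names `binomial_term` and `decide` are hypothetical, and the handling of an exact tie ($N_c/N_t = p_t$ while the criterion is not met) is an assumption, since the text does not specify that case.

```python
from math import comb

def binomial_term(n_t, n_c, p):
    """Equation (A.5): probability of n_c correct responses in n_t trials."""
    return comb(n_t, n_c) * p**n_c * (1.0 - p)**(n_t - n_c)

def decide(n_t, n_c, p_t, alpha):
    """Return 'stay', 'up' or 'down' for the stimulus level after n_t trials
    at the current level, n_c of which were answered correctly."""
    if binomial_term(n_t, n_c, p_t) >= alpha:   # Equation (A.6)
        return "stay"
    fraction = n_c / n_t                        # Equation (A.7) holds from here on
    if fraction > p_t:
        return "down"   # subject performs better than the target
    if fraction < p_t:
        return "up"     # subject performs worse than the target
    return "stay"       # assumption: an exact tie keeps the current level
```

With `p_t = 0.75` and `alpha = 0.7`, one correct response gives "stay" and a second consecutive correct response gives "down", whereas a single incorrect response gives "up".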

Large values of $\alpha$ cause Equation (A.7) to be satisfied after a few trials. Hence the stimulus level is changed rather quickly. On the other hand, small values of $\alpha$ result (on the average) in many trials at the same level, so a change in level does not occur very frequently. By varying $\alpha$, the experimenter is able to manipulate the maximum number of trials at a level. By taking large $\alpha$ values at the beginning of an adaptive run, threshold is reached very quickly. After this, $\alpha$ can be decreased, resulting in many trials at the levels near threshold.

The present criterion for changing level can be used in the same way as in the TUD procedures. In fact, it should be seen as an extension of these. First of all, it is possible to imitate a number of TUD procedures. The value for $p_t$ is then given by Table I. Table II gives the $\alpha$ values for which the TUD procedure is regained. Not all TUD procedures can be imitated exactly, and the $\alpha$ values with an asterisk yield an approximation to the TUD procedure. The figures given in Table II have been chosen such that the maximum number of trials per level equals that of the associated TUD procedure (being $K + L - 1$). In general, the present procedure will change the stimulus level more often. Of course, by taking smaller values for $\alpha$, the average stay per level can be increased.

It is beyond the scope of this paper to discuss fully how thresholds should be estimated. A first approach - the quickest and simplest - is to average over the levels at reversals, just as in usual staircase procedures. All psychophysical procedures have the handicap that there is a trade-off between a quick estimate of the threshold, obtained by rapidly changing the levels but with large variances, and slow but accurate estimates. The present procedure is no exception. If a hybrid procedure is used, i.e., the value $\alpha$ is changed during a run, threshold can be determined by weighting the individual parts of a run. Suppose that $N_i$ trials have been taken with a value $\alpha_i$. For each part of the run the threshold $\Delta_{thr}^{i}$ can be determined by averaging over the levels at the reversals. The threshold $\Delta_{thr}$ is then given by the weighted average

$$ \Delta_{thr} = \frac{\sum_i N_i \Delta_{thr}^{i}}{\sum_i N_i}. \tag{A.8} $$

When small values for $\alpha$ are used, few reversals are obtained, and a better method will be to estimate threshold by a ($\chi^2$ or maximum-likelihood) fit of the pooled data to a parameterized psychometric function.
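A short sketch of the weighted estimate of Equation (A.8) follows. The function name `hybrid_threshold` and the input layout (one pair of trial count and reversal levels per part of the run) are assumptions made for illustration only.

```python
def hybrid_threshold(parts):
    """Equation (A.8): trial-weighted mean of the per-part thresholds.

    parts is a list of (n_trials, reversal_levels) pairs, one pair for each
    part of the run that used its own value of alpha. The threshold of a
    part is taken as the mean of its levels at reversals.
    """
    total_trials = sum(n for n, _ in parts)
    weighted = sum(n * (sum(levels) / len(levels)) for n, levels in parts)
    return weighted / total_trials
```

For instance, a run with 20 trials whose reversals average 5.0 followed by 80 trials whose reversals average 5.25 yields a threshold of 5.2.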


A.3 Example

Let $p_t = 0.75$, a conventional threshold performance level in psychophysics. It corresponds roughly to $d' = 1$ in a two-interval, two-alternative forced-choice discrimination task. For $0.75 < \alpha < 1$ the procedure is reduced to a simple 1D1U procedure. Note that the convergence level is not 75%, but rather 50%, a very crude estimate. On the other hand, the stimulus level will be changed after every trial, which may be used to come rapidly to the vicinity of what later will be defined as threshold. If $0.56 < \alpha < 0.75$, the procedure becomes a 2D1U procedure, estimating the 70.7%-correct level. For $0.42 < \alpha < 0.56$, the procedure is a 3D1U procedure, now estimating the 79.4%-correct level. Smaller values for $\alpha$ result more and more in procedures where the level is changed more slowly, but do approximate the 75% score more closely. As an example, computer simulations of staircases for procedures with values for $\alpha$ of 0.9, 0.7, 0.5, 0.4, 0.3, and 0.2 are plotted in Figure A.2. In the figure, the subject's performance $p_s$ is plotted along the ordinate. Correct responses are denoted with open symbols, incorrect responses with filled symbols. For $\alpha$ = 0.9, 0.7 and 0.5, the figure shows the 1D1U, 2D1U, and 3D1U procedure, respectively. With $\alpha = 0.4$ the procedure resembles a 4D1U, but sometimes mistakes are allowed, and do not result in an increase in level. With $\alpha = 0.3$ or 0.2, Figure A.2 clearly shows that as $p_s$ comes closer to $p_t$, more trials per level are gathered.
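A staircase of this kind can be simulated with the sketch below. It is not the simulation used for Figure A.2: the logistic psychometric function, its midpoint, the step size, the starting level and all function names are hypothetical choices made only to obtain a runnable example.

```python
import math
import random

def binomial_term(n_t, n_c, p):
    """Equation (A.5)."""
    return math.comb(n_t, n_c) * p**n_c * (1.0 - p)**(n_t - n_c)

def decide(n_t, n_c, p_t, alpha):
    """Return 0 to stay, -1 to decrease the level, +1 to increase it."""
    if binomial_term(n_t, n_c, p_t) >= alpha:
        return 0
    if n_c > p_t * n_t:
        return -1
    if n_c < p_t * n_t:
        return +1
    return 0  # assumption: an exact tie keeps the level

def simulate(p_t, alpha, n_trials=100, start=10.0, step=1.0, seed=0):
    """Return the stimulus level presented on each trial of one run."""
    rng = random.Random(seed)
    level, n_t, n_c = start, 0, 0
    track = []
    for _ in range(n_trials):
        track.append(level)
        # Hypothetical psychometric function: chance level 1/3, midpoint at 5.
        p_s = 1.0 / 3.0 + (2.0 / 3.0) / (1.0 + math.exp(-(level - 5.0)))
        n_t += 1
        n_c += 1 if rng.random() < p_s else 0
        move = decide(n_t, n_c, p_t, alpha)
        if move != 0:
            level += move * step
            n_t, n_c = 0, 0   # counts restart at every new level
    return track
```

With `alpha = 0.9` the level changes after every trial (1D1U behaviour), whereas with `alpha = 0.2` several trials are collected before any change occurs.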

| L \ K | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0.550 | 0.550 | 0.550 | 0.550 | 0.550 |
| 2 | 0.550 | 0.370 | 0.370 | 0.370* | 0.370* |
| 3 | 0.550 | 0.370 | 0.350* | 0.300* | 0.290* |
| 4 | 0.550 | 0.370* | 0.300* | 0.300* | 0.265* |
| 5 | 0.550 | 0.370* | 0.290* | 0.265* | 0.265* |

Table II: Values for $\alpha$ used for imitating a K-down, L-up procedure. An asterisk (*) denotes that the TUD procedure is approximated.

[Figure A.2: Staircases of the novel procedure with $p_t = 0.75$ and $\alpha$ either 0.9, 0.7, 0.5, 0.4, 0.3 or 0.2. Each of the six panels plots the subject's performance $p_s$ (ordinate, 0.0 to 1.0) against trial number (abscissa, 0 to 100).]


A.4 Discussion

The procedure proposed in this paper can be used to generate a number of adaptive staircase procedures. It resembles a TUD procedure, but also has some characteristics of the PEST procedure (Taylor & Creelman, 1967), in the sense that more trials are taken at levels giving scores close to the desired correct-response probability. The only constraint to the shape of the psychometric function is that it is monotonic.

Two further remarks are needed. First of all, the criterion for changing the level is not a statistical criterion in the sense that a hypothesis is tested with a given significance level. It can be shown that $\Pr(N_t, N_c, p_t)$ decreases with increasing $N_t$, so if enough trials are taken the criterion will always be met (which is not true for the PEST procedure), resulting in a change of level. The essential point is that the stay is longest at that level where $p_s(\Delta) = p_t$. Secondly, substantial bias may be introduced in the estimated threshold. The bias is due to the discrete nature of adaptive procedures: only after a whole number of trials can a decision be made whether or not to change the stimulus level. This implies that sometimes $p_t$ and $\alpha$ can be varied to some extent without actually changing the procedure. This means that, given the procedure, $p_t$ can be shifted slightly such that the bias is minimal or even zero. For a given value for $\alpha$ and $p_t$, the unbiased or actual value of convergence can be calculated with the method used by Wetherill & Levitt (1965) or Levitt (1971). They determined the convergence values of the TUD procedures by simply writing down the possible (Markov) chains at a level, just as was done in the introduction to the present paper. In the example of $p_t = 0.75$ and $\alpha = 0.7$, the actual point to which the procedure converges is 0.707 (since it is a 2D1U procedure). By changing $p_t$ to 0.707, the procedure remains unchanged, but now the estimate is unbiased.
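The chain-counting argument can also be carried out numerically. The sketch below enumerates all response chains at one level and searches for the probability of a correct response at which a decrease and an increase are equally likely. It is an illustration under stated assumptions: the function names are hypothetical, an exact tie ($N_c/N_t = p_t$) is assumed to keep the level, chains are cut off after `max_trials` trials, and the bisection assumes that the probability of a decrease grows monotonically with the subject's performance.

```python
from math import comb

def binomial_term(n_t, n_c, p):
    """Equation (A.5)."""
    return comb(n_t, n_c) * p**n_c * (1.0 - p)**(n_t - n_c)

def decide(n_t, n_c, p_t, alpha):
    """Return 0 to stay, -1 to decrease the level, +1 to increase it."""
    if binomial_term(n_t, n_c, p_t) >= alpha:
        return 0
    if n_c > p_t * n_t:
        return -1
    if n_c < p_t * n_t:
        return +1
    return 0  # assumption: an exact tie keeps the level

def p_step_down(p, p_t, alpha, max_trials=200):
    """Probability that the stay at one level ends with a decrease, for a
    subject whose true probability of a correct response is p."""
    states = {0: 1.0}   # n_c -> probability of still being at this level
    down = 0.0
    for n_t in range(1, max_trials + 1):
        survivors = {}
        for n_c, mass in states.items():
            for new_c, prob in ((n_c + 1, p), (n_c, 1.0 - p)):
                move = decide(n_t, new_c, p_t, alpha)
                if move == 0:
                    survivors[new_c] = survivors.get(new_c, 0.0) + mass * prob
                elif move == -1:
                    down += mass * prob
                # move == +1: the chain ends with an increase in level
        states = survivors
        if not states:
            break
    return down

def convergence_point(p_t, alpha, tol=1e-9):
    """Value of p for which P(level decrease) = 1/2, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p_step_down(mid, p_t, alpha) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For `p_t = 0.75` and `alpha = 0.7` the only chain ending in a decrease is two consecutive correct responses, so the search reproduces the 2D1U convergence point near 0.707.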

The advantages of the present procedure are the flexibility ($p_t$ can be chosen freely) and compatibility (several TUD procedures can be imitated). An important characteristic of the present procedure is that, during a run where $p_t$ is kept fixed, $\alpha$ can be decreased, causing less frequent changes in level, yet leaving the target level unchanged.


Appendix B

Three-interval, three-alternative, forced-choice paradigms

Abstract

This paper presents an alternative derivation of psychometric functions for several three-interval, three-alternative, forced-choice paradigms. It appears that, with a particular paradigm, a priori knowledge of the stimuli changes the shape of the psychometric function.

B.1 Introduction

Recently, a trend may be seen in psychoacoustics towards the use of the three-interval, three-alternative, forced-choice (3I3AFC) paradigm in favor of the (still more common) 2I2AFC one. The main reason for this trend is the more frequent use of complex stimuli. Though only the value for one physical parameter may differ between both stimuli, the subjective difference can be quite complex. The subject might well be able to perceive the difference, but may not be able to consistently decide which response alternative is correct. With the 3I3AFC paradigm, the subject's task is only to indicate the odd-sounding interval (hence the name "odd-ball" or "odd-man-out" paradigm).

The forced-choice paradigms can be separated into two categories. The first category contains paradigms in which the signal needs to be labeled or categorized. Examples in this category are the one-interval or two-interval, two-alternative, forced-choice (1I2AFC or 2I2AFC) paradigm. The second category contains paradigms in which the signal does not need to be labeled, and where a subject should be able to respond correctly even on the very first trial. Examples in this category are the same-different paradigms, and the "odd-ball" paradigms.

Signals whose differences are hard to label are best examined with the second category of paradigms. Signals whose differences are easy to label can be examined with paradigms from the first category, though, of course, it is possible to use second-category paradigms as well.

As an example, suppose that a subject has to discriminate between a sound N, representing a pure tone of frequency $f$, and a sound S, representing a pure tone with a slightly higher frequency $f + \Delta f$. In a 1I2AFC paradigm, one can imagine that the response on the very first presentation is based on guessing only. Only after several trials is the subject able to base decisions on knowledge of the stimulus. Next, suppose that the sound's frequency is roved, i.e., N comprises $f_R + f$ and S comprises $f_R + f + \Delta f$ ($f_R \gg \Delta f$), where $f_R$ is an offset frequency that is changed randomly between each trial (not within each trial). In the 1I2AFC paradigm, there is no way for the subject to give consistently correct responses, though otherwise perfect discrimination between $f$ and $f + \Delta f$ might be possible.

In a 3I3AFC paradigm where, in one trial, N is presented twice and S only once, the subject is able to give a consistent response, starting from the very first trial, regardless of a roving frequency. This indicates that the subject is able to use a cue that he or she cannot use in the first-category paradigms. The additional cue here can be obtained by a relative comparison of the signals S and N.

Considering the above, one might expect that if a second-category paradigm is used and, at the same time, the subject is able to label the stimuli, performance will be better than if labeling is not possible, leaving only a relative comparison as a cue. This paper is concerned with the derivation of the psychometric function of two 3I3AFC paradigms. One paradigm (the regular 3I3AFC paradigm; Elliot, 1988) deals with the discrimination between S and N if only sequences SNN, NSN and NNS can occur. In the second paradigm (triangular paradigm; Bradley, 1963) all six sequences SNN, NSN, NNS, NSS, SNS and SSN can occur. Psychometric functions will be derived under the assumption that the subject either can label the signal or cannot label the signal, and therefore can compare the signals in a relative way only.

B.2 Signal Detection Theory

Signal Detection Theory (Green & Swets, 1988) provides a framework in which a relationship can be established between the probability of a correct response $P_c$ in a given paradigm and $d'$, being the subject's sensitivity for the physical difference between S and N. In order to come to a relationship between the sensitivity and the probability of a correct response, several assumptions have to be made. Some of these are rather strict, while others have been made to simplify the derivation. The assumptions are (cf. Frijters, 1980; Green & Swets, 1988):

(1) The presentation of a signal $I$ ($I$ can be either N or S) results in a number $\xi$ on the decision axis $x$. $\xi$ is one realization of a stochastic process, caused by the presentation of the signal.

(2) $\xi$ is a random variable with a Gaussian probability density function $\mathrm{pdf}_I(x)$ with expected value $\mu_I$ and variance $\sigma_I^2$. The probability of obtaining a number $\xi$ between $x - dx/2$ and $x + dx/2$, given the presentation of signal $I$, is

$$ P(\xi \mid I) = \mathrm{pdf}_I(x)\, dx. \tag{B.1} $$

The stochastic process is the same for S as for N, therefore it is assumed that $\sigma_N = \sigma_S = \sigma$. The density function thus becomes

$$ \mathrm{pdf}_I(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{x - \mu_I}{\sigma} \right)^2 \right\}. \tag{B.2} $$

(3) The experiment is balanced, i.e., each sequence can occur with equal a priori probability.

(4) The subject acts as an optimum processor, therefore no bias exists.

(5) The values of $\xi_1$, $\xi_2$ and $\xi_3$, being the result of the three stimulus presentations in the 3I3AFC paradigm, are mutually independent.


[Figure B.1: Probability density functions $\mathrm{pdf}_N(x)$ and $\mathrm{pdf}_S(x)$ as a function of $x$. The distance $\sigma d'$ between the two expected values is indicated.]

Figure B.1 displays $\mathrm{pdf}_N(x)$ and $\mathrm{pdf}_S(x)$. Also indicated is the sensitivity $d'$, being the distance between $\mu_S$ and $\mu_N$, normalized by the standard deviation $\sigma$

$$ d' = \frac{\mu_S - \mu_N}{\sigma}. \tag{B.3} $$

Without loss of generality, it is assumed that $\mu_S \geq \mu_N$, so that $d' \geq 0$.

B.3 The regular 3I3AFC paradigm

In the regular 3I3AFC paradigm, the subject is presented with one out of the three sequences SNN, NSN and NNS on each trial. The subject's task is to indicate the odd interval, by responding "1", "2" or "3". The probability of a correct response $P_c$ is

$$ P_c = P(\text{"1"} \mid SNN)\,P(SNN) + P(\text{"2"} \mid NSN)\,P(NSN) + P(\text{"3"} \mid NNS)\,P(NNS) \tag{B.4} $$

$$ = P(\text{"1"} \mid SNN), \tag{B.5} $$

since the experiment is balanced ($P(SNN) = P(NSN) = P(NNS)$) and the subject has no bias ($P(\text{"1"} \mid SNN) = P(\text{"2"} \mid NSN) = P(\text{"3"} \mid NNS)$).

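Three presentations give rise to three numbers on the decision axis. These numbers are denoted as $\xi = (\xi_1, \xi_2, \xi_3)$. $\xi$ thus can be seen as one point in three-dimensional (decision) space. If SNN was presented, $\xi$ is a possible realization of the density function

$$ \mathrm{pdf}_{SNN}(x_1, x_2, x_3) = \mathrm{pdf}_S(x_1)\, \mathrm{pdf}_N(x_2)\, \mathrm{pdf}_N(x_3) \tag{B.6} $$

$$ = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{x_1 - \mu_S}{\sigma} \right)^2 \right\} \cdot \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{x_2 - \mu_N}{\sigma} \right)^2 \right\} \cdot \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{x_3 - \mu_N}{\sigma} \right)^2 \right\}. \tag{B.7} $$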

B.3.1 Case 1: Labeling is possible

Prior knowledge of the stimuli is modeled by the assumption that $\mu_{SNN}$, $\mu_{NSN}$ and $\mu_{NNS}$ are known by the subject. In the example this means that the subject is familiar with $f$ and $f + \Delta f$, and perceives these signals as "low in pitch" and "high in pitch", respectively. The sequence SNN, for instance, then is perceived as "high-low-low".

Suppose that the sequence SNN has been presented. The three signals give rise to $\xi = (\xi_1, \xi_2, \xi_3)$. One can imagine that, if $\xi$ is close to $\mu_{SNN}$, the subject indeed has perceived "snn", and will respond "1". If, on the other hand, $\xi$ is close to $\mu_{NSN}$ or $\mu_{NNS}$, "2" or "3" will be responded, respectively. The subject's decision is modeled with the aid of the likelihood ratios

$$ \Lambda_{SNN,NSN} = \frac{P(SNN \mid \xi)}{P(NSN \mid \xi)} \tag{B.8} $$

$$ = \frac{\mathrm{pdf}_{SNN}(\xi_1, \xi_2, \xi_3)}{\mathrm{pdf}_{NSN}(\xi_1, \xi_2, \xi_3)} \tag{B.9} $$

$$ = \frac{\mathrm{pdf}_S(\xi_1)\, \mathrm{pdf}_N(\xi_2)\, \mathrm{pdf}_N(\xi_3)}{\mathrm{pdf}_N(\xi_1)\, \mathrm{pdf}_S(\xi_2)\, \mathrm{pdf}_N(\xi_3)} \tag{B.10} $$

$$ = \frac{\mathrm{pdf}_S(\xi_1)\, \mathrm{pdf}_N(\xi_2)}{\mathrm{pdf}_N(\xi_1)\, \mathrm{pdf}_S(\xi_2)} \tag{B.11} $$

$$ = \frac{\frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{1}{2\sigma^2} \left[ (\xi_1 - \mu_S)^2 + (\xi_2 - \mu_N)^2 \right] \right\}}{\frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{1}{2\sigma^2} \left[ (\xi_1 - \mu_N)^2 + (\xi_2 - \mu_S)^2 \right] \right\}} \tag{B.12} $$

$$ = \exp\left\{ \frac{1}{2\sigma^2} \left[ (\xi_1 - \mu_N)^2 + (\xi_2 - \mu_S)^2 - (\xi_1 - \mu_S)^2 - (\xi_2 - \mu_N)^2 \right] \right\} \tag{B.13} $$

$$ = \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} (\xi_1 - \xi_2) \right\} \tag{B.14} $$

and, similarly,

$$ \Lambda_{SNN,NNS} = \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} (\xi_1 - \xi_3) \right\}. \tag{B.15} $$

If the subject has no bias, the response will be "1" if

$$ \begin{cases} \Lambda_{SNN,NSN} > 1 \\ \Lambda_{SNN,NNS} > 1 \end{cases} \tag{B.16} $$

meaning that the likelihood that $\xi$ comes from SNN is larger than the likelihood that $\xi$ comes from NSN or NNS. Equation (B.16) is easily solved with the aid of Equations (B.14) and (B.15). Thus, the subject will respond "1" if $\xi$ satisfies

$$ \begin{cases} \xi_2 < \xi_1 \\ \xi_3 < \xi_1 \end{cases} \tag{B.17} $$


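Similarly, the response will be "2" if

$$ \begin{cases} \xi_1 < \xi_2 \\ \xi_3 < \xi_2 \end{cases} \tag{B.18} $$

and "3" if

$$ \begin{cases} \xi_1 < \xi_3 \\ \xi_2 < \xi_3 \end{cases} \tag{B.19} $$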

Having three response categories means that decision-space is divided into three subspaces, as defined in Equations (B.17), (B.18) and (B.19).

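The probability of a correct response was given by Equation (B.5), and can now, with the constraints of Equation (B.17), be determined.

$$ P_c = P(\text{"1"} \mid SNN) \tag{B.20} $$

$$ = P(\xi_2 < \xi_1,\ \xi_3 < \xi_1 \mid SNN) \tag{B.21} $$

$$ = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{x_1} dx_2 \int_{-\infty}^{x_1} dx_3\ \mathrm{pdf}_S(x_1)\, \mathrm{pdf}_N(x_2)\, \mathrm{pdf}_N(x_3). \tag{B.22} $$

Inserting the expression for the density functions of Equation (B.2) into the last equation; substituting $u = (x_1 - \mu_S)/\sigma$, $v = (x_2 - \mu_N)/\sigma$ and $w = (x_3 - \mu_N)/\sigma$; thereafter inserting the expression for the sensitivity of Equation (B.3), gives

$$ P_c = \int_{-\infty}^{\infty} \phi(u)\, \Phi^2(u + d')\, du, \tag{B.23} $$

where $\phi(t)$ is the Normal Distribution Function

$$ \phi(t) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} t^2} \tag{B.24} $$

and $\Phi(x)$ the Cumulative Normal Integral

$$ \Phi(x) = \int_{-\infty}^{x} \phi(t)\, dt. \tag{B.25} $$

Equation (B.23) yields the desired psychometric function, and the expression is identical to the one of Elliot (1988).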


B.3.2 Case 2: Labeling is not possible

If, in our example, the frequency is roved across trials, the subject has to discriminate between $f + f_R$ and $f + f_R + \Delta f$, where $f_R$ is a roving frequency. It was assumed that $f$ was associated with a density function with expected value $\mu_N$, and $f + \Delta f$ with $\mu_S$. The introduction of a roving frequency will cause the expected values to rove as well, and become $\mu_R + \mu_N$ and $\mu_R + \mu_S$, respectively. Since $\mu_R$ changes in each trial, the subject cannot rely on these values, and is unable to calculate the likelihood of $\xi$ coming from distribution N or S. Instead of perceiving the sequence SNN as "high-low-low", the subject perceives the sequence in terms of "lower in pitch" or "higher in pitch", in a relative way. This is achieved by combining (adding, subtracting) $\xi_1$, $\xi_2$ and $\xi_3$ such that the influence of the rove is eliminated as much as possible. The rove can partly be eliminated by an orthonormal transformation¹

$$ \begin{cases} z_1 = \frac{1}{\sqrt{2}} (x_1 - x_3) \\ z_2 = \frac{1}{\sqrt{6}} (-x_1 + 2x_2 - x_3) \\ z_3 = \frac{1}{\sqrt{3}} (x_1 + x_2 + x_3) \end{cases} \tag{B.26} $$

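This transforms the expected values for SNN, NSN and NNS into

$$ (\mu_S + \mu_R,\ \mu_N + \mu_R,\ \mu_N + \mu_R) \rightarrow \left( \frac{\mu_S - \mu_N}{\sqrt{2}},\ -\frac{\mu_S - \mu_N}{\sqrt{6}},\ \frac{\mu_S + 2\mu_N + 3\mu_R}{\sqrt{3}} \right) \tag{B.27} $$

$$ (\mu_N + \mu_R,\ \mu_S + \mu_R,\ \mu_N + \mu_R) \rightarrow \left( 0,\ \frac{2(\mu_S - \mu_N)}{\sqrt{6}},\ \frac{\mu_S + 2\mu_N + 3\mu_R}{\sqrt{3}} \right) \tag{B.28} $$

$$ (\mu_N + \mu_R,\ \mu_N + \mu_R,\ \mu_S + \mu_R) \rightarrow \left( -\frac{\mu_S - \mu_N}{\sqrt{2}},\ -\frac{\mu_S - \mu_N}{\sqrt{6}},\ \frac{\mu_S + 2\mu_N + 3\mu_R}{\sqrt{3}} \right) \tag{B.29} $$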

The advantage of an orthonormal transformation is that, except for their expected values, the density functions do not change. $\xi$ is transformed to $\zeta$ as

$$ \begin{cases} \zeta_1 = \frac{1}{\sqrt{2}} (\xi_1 - \xi_3) \\ \zeta_2 = \frac{1}{\sqrt{6}} (-\xi_1 + 2\xi_2 - \xi_3) \\ \zeta_3 = \frac{1}{\sqrt{3}} (\xi_1 + \xi_2 + \xi_3) \end{cases} \tag{B.30} $$

The expected values thus are transformed such that $\mu_R$ is situated in the $z_3$-coordinate only. $z_3$ does not contain information about the sequence of the signals in the trial. Though it can be used by the subject, for instance to estimate something like the "overall pitch height", the value for $\zeta_3$ is not of use in the 3I3AFC paradigm, and the subject does best to ignore it.

¹ This transformation is not the only one that can be used. One may consider other (non-orthonormal) transformations, or even projections.
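The properties claimed for the transformation of Equation (B.26) can be checked numerically. The sketch below is only an illustration: the matrix name `T`, the helper functions and the numerical values chosen for $\mu_S$, $\mu_N$ and $\mu_R$ are hypothetical.

```python
from math import sqrt

# Rows of the transformation of Equation (B.26): z = T x.
T = [
    [1 / sqrt(2), 0.0, -1 / sqrt(2)],
    [-1 / sqrt(6), 2 / sqrt(6), -1 / sqrt(6)],
    [1 / sqrt(3), 1 / sqrt(3), 1 / sqrt(3)],
]

def transform(x):
    """Apply Equation (B.26) to a point x = (x1, x2, x3)."""
    return [sum(T[i][j] * x[j] for j in range(3)) for i in range(3)]

def is_orthonormal(m, tol=1e-12):
    """True if the rows of m are mutually orthogonal and of unit length."""
    for i in range(3):
        for j in range(3):
            dot = sum(m[i][k] * m[j][k] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

if __name__ == "__main__":
    mu_s, mu_n = 1.0, 0.0            # hypothetical expected values
    for mu_r in (0.3, 5.0):          # two different roves
        z = transform([mu_s + mu_r, mu_n + mu_r, mu_n + mu_r])  # sequence SNN
        print(mu_r, [round(c, 4) for c in z])
```

Only the third coordinate changes with the rove, in line with Equation (B.27).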

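$\zeta_1$ and $\zeta_2$ are freed from the roving term $\mu_R$, so the likelihood of $\zeta_1$ and $\zeta_2$ coming from SNN relative to the likelihood of $\zeta_1$ and $\zeta_2$ coming from NSN or NNS can be determined by the subject. In analogy with the previous section, two likelihood ratios are calculated.

$$ \Lambda_{SNN,NSN} = \frac{P(SNN \mid \zeta)}{P(NSN \mid \zeta)} \tag{B.31} $$

$$ = \frac{\mathrm{pdf}_{SNN}(\zeta_1, \zeta_2)}{\mathrm{pdf}_{NSN}(\zeta_1, \zeta_2)} \tag{B.32} $$

$$ = \frac{\mathrm{pdf}_S(\zeta_1)\, \mathrm{pdf}_N(\zeta_2)}{\mathrm{pdf}_N(\zeta_1)\, \mathrm{pdf}_S(\zeta_2)} \tag{B.33} $$

$$ = \frac{\frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{\zeta_1 - (\mu_S - \mu_N)/\sqrt{2}}{\sigma} \right)^2 \right\}}{\frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{\zeta_1}{\sigma} \right)^2 \right\}} \cdot \frac{\frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{\zeta_2 + (\mu_S - \mu_N)/\sqrt{6}}{\sigma} \right)^2 \right\}}{\frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left( \frac{\zeta_2 - 2(\mu_S - \mu_N)/\sqrt{6}}{\sigma} \right)^2 \right\}} \tag{B.34} $$

$$ = \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} \cdot \frac{\zeta_1 - \sqrt{3}\, \zeta_2}{\sqrt{2}} \right\} \tag{B.35} $$

and

$$ \Lambda_{SNN,NNS} = \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2}\, \zeta_1 \sqrt{2} \right\}. \tag{B.36} $$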

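The subject will respond "1" if both likelihood ratios are larger than one, which is the case if

$$ \begin{cases} \zeta_1 > 0 \\ \zeta_2 < \zeta_1 / \sqrt{3} \end{cases} \tag{B.37} $$

The probability of a correct response then is given by

$$ P_c = P(\text{"1"} \mid SNN) \tag{B.38} $$

$$ = P(\zeta_1 > 0,\ \zeta_2 < \zeta_1/\sqrt{3} \mid SNN) \tag{B.39} $$

$$ = \int_{0}^{\infty} dz_1 \int_{-\infty}^{z_1/\sqrt{3}} dz_2\ \mathrm{pdf}_{SNN}(z_1, z_2). \tag{B.40} $$

After some rearranging and substitutions, the psychometric function is obtained

$$ P_c = \int_{-d'/\sqrt{2}}^{\infty} \phi(u)\, \Phi\left( \frac{u + d'\sqrt{2}}{\sqrt{3}} \right) du. \tag{B.41} $$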

With the aid of Fourier theory, it can be shown that Equation (B.41) is identical to Equation (B.23). This means that, in this paradigm, prior knowledge of the stimulus does not improve performance.
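The equality of the two psychometric functions can also be checked numerically. The sketch below assumes the forms $P_c = \int_{-\infty}^{\infty} \phi(u)\,\Phi^2(u + d')\,du$ for Equation (B.23) and $P_c = \int_{-d'/\sqrt{2}}^{\infty} \phi(u)\,\Phi((u + d'\sqrt{2})/\sqrt{3})\,du$ for Equation (B.41). The function names, the use of Simpson's rule and the truncation of the infinite integration ranges are choices made for this illustration.

```python
from math import erf, exp, pi, sqrt

def phi(t):
    """Standard Normal density, Equation (B.24)."""
    return exp(-0.5 * t * t) / sqrt(2.0 * pi)

def Phi(x):
    """Cumulative Normal integral, Equation (B.25)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def integrate(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def pc_labeling(d):
    """Assumed form of Equation (B.23); infinite limits truncated at +-10."""
    return integrate(lambda u: phi(u) * Phi(u + d) ** 2, -10.0, 10.0)

def pc_relative(d):
    """Assumed form of Equation (B.41); upper limit truncated."""
    lower = -d / sqrt(2.0)
    return integrate(
        lambda u: phi(u) * Phi((u + d * sqrt(2.0)) / sqrt(3.0)),
        lower, lower + 20.0)

if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0, 2.0):
        print(d, round(pc_labeling(d), 6), round(pc_relative(d), 6))
```

At $d' = 0$ both expressions reduce to the chance level of 1/3.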

B.4 Triangular paradigm

In the triangular paradigm, one out of the six sequences SNN, NSN, NNS, NSS, SNS and SSN is presented in each trial. The subject's task is to indicate the interval that contained the "odd" sound. The subject's response alternatives are "1", "2" or "3". The probability of a correct response is

$$ P_c = P(\text{"1"} \mid SNN)\,P(SNN) + P(\text{"2"} \mid NSN)\,P(NSN) + P(\text{"3"} \mid NNS)\,P(NNS) + P(\text{"1"} \mid NSS)\,P(NSS) + P(\text{"2"} \mid SNS)\,P(SNS) + P(\text{"3"} \mid SSN)\,P(SSN) \tag{B.42} $$

$$ = P(\text{"1"} \mid SNN), \tag{B.43} $$

since the experiment is balanced and the subject's responses are unbiased. The probability density functions are given in Equation (B.7), and the decision model is applied in the same way as was done in the previous section.

B.4.1 Case 1: Labeling is possible

Suppose that the trial contained the sequence SN N. If the subject is fa­miliar with the stimuli, µsNN to µssN are known, and the likelihood ra­tios AsNN,NSN, AsNN,NNS, AsNN,Nss, AsNN,SNS and AsNN,SSN are de­termined. The first two likelihood ratios are given in Equations (B.14) and (B.15); the remaining three are calculated in the same way. They are

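$$\begin{aligned}
\Lambda_{SNN,NSN} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} \left( \xi_1 - \xi_2 \right) \right\} && \text{(B.44)} \\
\Lambda_{SNN,NNS} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} \left( \xi_1 - \xi_3 \right) \right\} && \text{(B.45)} \\
\Lambda_{SNN,NSS} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} \left( \xi_1 - \xi_2 - \xi_3 + \frac{\mu_N + \mu_S}{2} \right) \right\} && \text{(B.46)} \\
\Lambda_{SNN,SNS} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} \left( \frac{\mu_N + \mu_S}{2} - \xi_3 \right) \right\} && \text{(B.47)} \\
\Lambda_{SNN,SSN} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2} \left( \frac{\mu_N + \mu_S}{2} - \xi_2 \right) \right\} && \text{(B.48)}
\end{aligned}$$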


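The subject will not only respond "1" if

$$\Lambda_{SNN,NSN} > 1, \quad \Lambda_{SNN,NNS} > 1, \quad \Lambda_{SNN,NSS} > 1, \quad \Lambda_{SNN,SNS} > 1, \quad \Lambda_{SNN,SSN} > 1 \qquad \text{(B.49)}$$

but also if

$$\Lambda_{NSS,SNN} > 1, \quad \Lambda_{NSS,NSN} > 1, \quad \Lambda_{NSS,NNS} > 1, \quad \Lambda_{NSS,SNS} > 1, \quad \Lambda_{NSS,SSN} > 1 \qquad \text{(B.50)}$$

In the latter case, the subject perceives "nss", although SNN was presented. The subject will answer "1" if $\xi$ satisfies either

$$\xi_2 < \xi_1, \quad \xi_3 < \xi_1, \quad \xi_2 + \xi_3 < \xi_1 + \frac{\mu_N + \mu_S}{2}, \quad \xi_3 < \frac{\mu_N + \mu_S}{2}, \quad \xi_2 < \frac{\mu_N + \mu_S}{2} \qquad \text{(B.51)}$$

or

$$\xi_2 + \xi_3 > \xi_1 + \frac{\mu_N + \mu_S}{2}, \quad \xi_3 > \frac{\mu_N + \mu_S}{2}, \quad \xi_2 > \frac{\mu_N + \mu_S}{2}, \quad \xi_2 > \xi_1, \quad \xi_3 > \xi_1 \qquad \text{(B.52)}$$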

Similarly, the subspaces for answer "2" and "3" can be determined.

The probability of a correct response is given in Equation (B.43), and can be determined with the constraints given by Equations (B.51) and (B.52). Integrating over the two subspaces is not very difficult, but some insight is required to correctly determine the integration boundaries. The psychometric function becomes

$$\begin{aligned}
P_c &= P(\text{"1"} \mid SNN) && \text{(B.53)} \\
&= P(\text{"snn"} \mid SNN) + P(\text{"nss"} \mid SNN) && \text{(B.54)} \\
&= \cdots + \int_{(\mu_N + \mu_S)/2}^{\infty} dx_1 \int_{x_1}^{\infty} dx_2 \int_{x_1}^{\infty} dx_3\; \mathrm{pdf}_{SNN}(x_1, x_2, x_3) && \text{(B.55)}
\end{aligned}$$

Rearranging and using the definition of d' yields the psychometric function

$$P_c = \cdots + \int_{-d'/2}^{\infty} \phi(u)\, \left[ 1 - \Phi(u + d') \right]^2 du \qquad \text{(B.56)}$$
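Because the printed integrals are only partly legible, the psychometric function for this case can also be evaluated directly from the decision regions of Equations (B.51) and (B.52). The sketch below is an illustration under stated assumptions, not the author's closed form: it uses units in which σ = 1, µ_N = 0 and µ_S = d', so that the criterion (µ_N + µ_S)/2 equals d'/2. In both regions the condition on ξ₂ + ξ₃ is implied by the other four, which leaves ξ₂ and ξ₃ both below min(ξ₁, d'/2) for "snn" and both above max(ξ₁, d'/2) for "nss". The function names are hypothetical.

```python
import math

def phi(x):
    """Standard Normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard Normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def pc_triangular_known(d, lim=10.0):
    """P("1" | SNN) in the triangular paradigm when labeling is possible."""
    crit = d / 2.0  # (mu_N + mu_S) / 2 in the assumed units

    def integrand(x):
        # x plays the role of xi_1, which is Normal with mean d and unit variance
        snn = Phi(min(x, crit)) ** 2           # region of Eq. (B.51)
        nss = (1.0 - Phi(max(x, crit))) ** 2   # region of Eq. (B.52)
        return phi(x - d) * (snn + nss)

    # Split at the kink x = crit so that each piece is smooth
    return simpson(integrand, -lim, crit) + simpson(integrand, crit, lim + d)

for d in (0.0, 1.0, 2.0, 3.0):
    print(d, round(pc_triangular_known(d), 3))
```

The values agree to three decimals with the corresponding column of the table of Pc values in Section B.5 (for example 0.447 at d' = 1 and 0.677 at d' = 2).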

B.4.2 Case 2: Labeling is not possible

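If the subject is presented with stimuli that are difficult to label, the same strategy as in section B.3.2 is used. Thus the coordinate transformation of Equation (B.26) is adopted. The transformed expected values become

$$\begin{aligned}
(\mu_S + \mu_R,\; \mu_N + \mu_R,\; \mu_N + \mu_R) &\to \left( \frac{\mu_S - \mu_N}{\sqrt{2}},\; -\frac{\mu_S - \mu_N}{\sqrt{6}},\; \frac{\mu_S + 2\mu_N + 3\mu_R}{\sqrt{3}} \right) && \text{(B.57)} \\
(\mu_N + \mu_R,\; \mu_S + \mu_R,\; \mu_N + \mu_R) &\to \left( 0,\; \frac{2(\mu_S - \mu_N)}{\sqrt{6}},\; \frac{\mu_S + 2\mu_N + 3\mu_R}{\sqrt{3}} \right) && \text{(B.58)} \\
(\mu_N + \mu_R,\; \mu_N + \mu_R,\; \mu_S + \mu_R) &\to \left( -\frac{\mu_S - \mu_N}{\sqrt{2}},\; -\frac{\mu_S - \mu_N}{\sqrt{6}},\; \frac{\mu_S + 2\mu_N + 3\mu_R}{\sqrt{3}} \right) && \text{(B.59)} \\
(\mu_N + \mu_R,\; \mu_S + \mu_R,\; \mu_S + \mu_R) &\to \left( -\frac{\mu_S - \mu_N}{\sqrt{2}},\; \frac{\mu_S - \mu_N}{\sqrt{6}},\; \frac{\mu_N + 2\mu_S + 3\mu_R}{\sqrt{3}} \right) && \text{(B.60)} \\
(\mu_S + \mu_R,\; \mu_N + \mu_R,\; \mu_S + \mu_R) &\to \left( 0,\; -\frac{2(\mu_S - \mu_N)}{\sqrt{6}},\; \frac{\mu_N + 2\mu_S + 3\mu_R}{\sqrt{3}} \right) && \text{(B.61)} \\
(\mu_S + \mu_R,\; \mu_S + \mu_R,\; \mu_N + \mu_R) &\to \left( \frac{\mu_S - \mu_N}{\sqrt{2}},\; \frac{\mu_S - \mu_N}{\sqrt{6}},\; \frac{\mu_N + 2\mu_S + 3\mu_R}{\sqrt{3}} \right) && \text{(B.62)}
\end{aligned}$$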

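$\xi$ is transformed to $\zeta$, and analogously, five likelihood ratios are determined. They are ($\zeta_3$ is omitted)

$$\begin{aligned}
\Lambda_{SNN,NSN} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2 \sqrt{2}} \left( \zeta_1 - \zeta_2 \sqrt{3} \right) \right\} && \text{(B.63)} \\
\Lambda_{SNN,NNS} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2}\, \zeta_1 \sqrt{2} \right\} && \text{(B.64)} \\
\Lambda_{SNN,NSS} &= \exp\left\{ \frac{(\mu_S - \mu_N)\sqrt{2}}{\sigma^2} \left( \zeta_1 - \zeta_2 / \sqrt{3} \right) \right\} && \text{(B.65)} \\
\Lambda_{SNN,SNS} &= \exp\left\{ \frac{\mu_S - \mu_N}{\sigma^2 \sqrt{2}} \left( \zeta_1 + \zeta_2 / \sqrt{3} \right) \right\} && \text{(B.66)} \\
\Lambda_{SNN,SSN} &= \exp\left\{ -\frac{(\mu_S - \mu_N)\sqrt{2}}{\sigma^2}\, \zeta_2 / \sqrt{3} \right\} && \text{(B.67)}
\end{aligned}$$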

The subject responds "1" if Equation (B.49) or (B.50) is satisfied. This yields the condition that $\zeta$ must satisfy either

$$\zeta_1 > \zeta_2 \sqrt{3}, \quad \zeta_1 > 0, \quad \zeta_1 > \zeta_2 / \sqrt{3}, \quad \zeta_1 > -\zeta_2 / \sqrt{3}, \quad \zeta_2 < 0 \qquad \text{(B.68)}$$

or

$$\zeta_1 < \zeta_2 \sqrt{3}, \quad \zeta_1 < 0, \quad \zeta_1 < \zeta_2 / \sqrt{3}, \quad \zeta_1 < -\zeta_2 / \sqrt{3}, \quad \zeta_2 > 0 \qquad \text{(B.69)}$$

These restrictions are rather easily written as integration boundaries, so that the probability of a correct response is given by

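$$P_c = P(\text{"snn"} \mid SNN) + P(\text{"nss"} \mid SNN) \qquad \text{(B.70)}$$

(B.71)

$$\begin{aligned}
P_c &= \int_{-d'/\sqrt{2}}^{\infty} du\, \phi(u) \left[ \Phi\!\left( \frac{d'}{\sqrt{6}} \right) - \Phi\!\left( -u\sqrt{3} - d'\sqrt{\tfrac{2}{3}} \right) \right] \\
&\quad + \int_{-\infty}^{-d'/\sqrt{2}} du\, \phi(u) \left[ \Phi\!\left( -u\sqrt{3} - d'\sqrt{\tfrac{2}{3}} \right) - \Phi\!\left( \frac{d'}{\sqrt{6}} \right) \right] && \text{(B.72)}
\end{aligned}$$

which can, with the aid of Fourier theory, be rewritten as

$$P_c = 2 \int_{0}^{\infty} du\, \phi(u) \left[ \Phi\!\left( -u\sqrt{3} + d'\sqrt{\tfrac{2}{3}} \right) + \Phi\!\left( -u\sqrt{3} - d'\sqrt{\tfrac{2}{3}} \right) \right] \qquad \text{(B.73)}$$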

Figure B.2: Psychometric functions (Pc as a function of d') for the regular 3I3AFC paradigm (dotted), triangular paradigm, stimulus known (solid), and stimulus not known (dashed).

Equation (B.73) is identical to the expression obtained by Frijters (1980).
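The equivalence of Equations (B.72) and (B.73) can be illustrated numerically. The following sketch is not taken from the thesis; it integrates both expressions with a plain Simpson rule, and the names `pc_b72` and `pc_b73` are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)

def phi(x):
    """Standard Normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard Normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def pc_b72(d, lim=10.0):
    """Equation (B.72): sum of the "snn" and "nss" contributions."""
    shift = d * math.sqrt(2.0 / 3.0)
    edge = -d / math.sqrt(2.0)
    top = Phi(d / math.sqrt(6.0))
    snn = simpson(lambda u: phi(u) * (top - Phi(-u * SQRT3 - shift)), edge, lim)
    nss = simpson(lambda u: phi(u) * (Phi(-u * SQRT3 - shift) - top), -lim, edge)
    return snn + nss

def pc_b73(d, lim=10.0):
    """Equation (B.73): the single-integral form."""
    shift = d * math.sqrt(2.0 / 3.0)
    return 2.0 * simpson(
        lambda u: phi(u) * (Phi(-u * SQRT3 + shift) + Phi(-u * SQRT3 - shift)),
        0.0, lim)

for d in (0.0, 1.0, 2.0, 3.0):
    print(d, round(pc_b72(d), 4), round(pc_b73(d), 4))
```

Both forms start at 1/3 for d' = 0 and agree at every d' tried, consistent with the statement that (B.72) can be rewritten as (B.73).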

B.5 Discussion

Figure B.2 displays the psychometric function of Equation (B.23) or (B.41) as a dotted line, Equation (B.56) as a solid line, and Equation (B.73) as a dashed line. The figure shows that the regular 3I3AFC paradigm is the "easiest", in the sense that small values for the sensitivity d' already yield a reasonably large probability of a correct response. The differences in the psychometric functions between the two conditions of the triangular paradigm are not very large. Knowledge of the stimulus improves the probability of a correct response only slightly: at d' = 1, Pc rises from 0.418 to 0.447. Table B.5 gives some values for Pc as a function of d'. In the experiments reported in chapters 2 and 3, the triangular paradigm is used, and it is assumed that the subject is familiar with the stimuli, since in an adaptive procedure he or she is presented with the same signal continuously. Thus, Equation (B.56) is used to fit the data to the psychometric function.

  d'      Pc, Eq. (B.23)   Pc, Eq. (B.56)   Pc, Eq. (B.73)
 0.000        0.333            0.333            0.333
 0.200        0.391            0.338            0.337
 0.400        0.452            0.353            0.348
 0.600        0.513            0.377            0.365
 0.800        0.574            0.409            0.389
 1.000        0.634            0.447            0.418
 1.200        0.690            0.490            0.451
 1.265        0.707            0.505            0.461
 1.400        0.742            0.536            0.487
 1.600        0.789            0.584            0.526
 1.800        0.830            0.631            0.565
 2.000        0.866            0.677            0.605
 2.134        0.887            0.707            0.631
 2.200        0.896            0.721            0.644
 2.400        0.921            0.762            0.681
 2.544        0.936            0.789            0.707
 2.600        0.941            0.799            0.717
 2.800        0.957            0.832            0.750
 3.000        0.969            0.861            0.781

Table I: Pc for some values of d' for the different paradigms.
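In practice the table is read in the other direction: a measured proportion correct is converted into a sensitivity d'. A minimal sketch of that inversion is given below, assuming simple bisection on the monotonic function of Equation (B.73); the helper names (`pc_b73`, `dprime_for`) are hypothetical, and any of the three psychometric functions could be passed in instead.

```python
import math

def phi(x):
    """Standard Normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard Normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def pc_b73(d, lim=10.0):
    """Equation (B.73): triangular paradigm, stimulus not known."""
    r3, shift = math.sqrt(3.0), d * math.sqrt(2.0 / 3.0)
    return 2.0 * simpson(
        lambda u: phi(u) * (Phi(-u * r3 + shift) + Phi(-u * r3 - shift)), 0.0, lim)

def dprime_for(pc_func, target, lo=0.0, hi=6.0, tol=1e-6):
    """Bisection for the d' at which the increasing function pc_func reaches target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pc_func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# d' at which the two-down, one-up target of 70.71% correct is reached
print(round(dprime_for(pc_b73, 0.7071), 3))
```

The result can be compared with the table row in which the Eq. (B.73) column reads 0.707.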


Appendix C

Fitting the psychometric function

Abstract

In psychophysics, thresholds in an adaptive procedure are usually estimated by averaging over the levels at the reversals. Averaging implicitly requires that part of the underlying psychometric function can be considered as a straight line. In order to satisfy this condition, the first part of a run has to be omitted, leaving a number of trials unused. This paper proposes a Maximum-Likelihood fitting procedure of the experimentally obtained data to a parameterized psychometric function, after which a χ²-test is performed to test the goodness of fit. If the shape of the underlying psychometric function is more or less known, this method has the great advantage that all the data are used to estimate threshold. Moreover, the method does not depend on the adaptive procedure, nor on stimulus step size. Hence, much more experimental flexibility is offered.

C.1 The psychometric function

The theory of signal detectability, as developed in the sixties (Green & Swets, 1988), can be used to establish a relation between the sensitivity index d' and the probability of a correct response Pc in a given forced-choice paradigm

$$P_c(d') = \Psi[d'], \qquad \text{(C.1)}$$

where Ψ yields the shape of the psychometric function, which depends on the number of stimulus intervals, number of response alternatives, possible response bias, etc. Ψ increases monotonically with increasing d'. At d' = 0, Pc is at chance level; as d' → ∞, Pc approaches unity.

Psychophysical research mostly deals with the discrimination between two stimuli that differ by an amount Δ from each other with respect to some physical parameter. The relation between the (physical) parameter Δ and the (psychophysical) sensitivity d' is mostly unknown, but is written here as a function of M + 1 coefficients, which are used to fit the psychometric function to the data. Hence

$$d'(\Delta) = \mathcal{G}(\Delta, c_0, \ldots, c_M). \qquad \text{(C.2)}$$

d'(0) = 0, since then the two stimuli are physically identical, and thus indiscriminable. Inserting Equation (C.2) into Equation (C.1) yields a relationship between Pc and Δ

$$P_c(\Delta) = \Psi[\mathcal{G}(\Delta, c_0, \ldots, c_M)]. \qquad \text{(C.3)}$$

The introduction of detection theory, which writes Pc as a function of d' and d' in turn as a function of Δ, may seem an unnecessary detour. However, in this way Equation (C.1) yields a relationship that depends solely on the experimental paradigm, whereas Equation (C.2) relates the psychophysical parameter d' to the physical parameter Δ, independent of the experimental paradigm. This separation is therefore very useful, because it saves coefficients. Psychophysical research has shown that $\mathcal{G}$ usually is a very simple function, often even linear, yielding only one coefficient.
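The separation into a paradigm-dependent part and a stimulus-dependent part maps naturally onto two small functions that are composed. The sketch below is only an illustration: the one-interval form Φ(d'/2) is borrowed from the example in Section C.4, the polynomial without a constant term is one hypothetical choice of sensitivity function that guarantees d'(0) = 0, and the names `psi_one_interval`, `sensitivity` and `pc` do not come from the text.

```python
import math

def Phi(x):
    """Standard Normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def psi_one_interval(d):
    """Paradigm-dependent part, Eq. (C.1); here the one-interval form Phi(d'/2)."""
    return Phi(d / 2.0)

def sensitivity(delta, coeffs):
    """Stimulus-dependent part, Eq. (C.2); a polynomial without constant term,
    coeffs[0]*delta + coeffs[1]*delta**2 + ..., so that d'(0) = 0 (assumed form)."""
    return sum(c * delta ** (k + 1) for k, c in enumerate(coeffs))

def pc(delta, coeffs, psi=psi_one_interval):
    """Eq. (C.3): compose the two parts."""
    return psi(sensitivity(delta, coeffs))

print(pc(0.0, [1.0]))            # chance level for identical stimuli
print(round(pc(2.0, [1.0]), 4))  # linear sensitivity, d' = delta
```

Switching to another paradigm only requires passing a different `psi`; the coefficients of the sensitivity function keep their meaning.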

C.2 The Maximum-Likelihood fit

In adaptive procedures, only a limited number of stimulus levels, i.e. values of Δ, are used. These levels are labeled by the index l, so that the levels used in one experiment are $\delta_l$ ($0 \le l \le L$). During a run, these levels are presented to a subject one or more times. The total number of presented trials at level $\delta_l$ is $N_t(\delta_l)$, for which the subject gave $N_c(\delta_l)$ correct responses. The underlying probability of a correct response at level $\delta_l$ is given by Equation (C.3)

$$P_c(\delta_l) = \Psi[\mathcal{G}(\delta_l, c_0, \ldots, c_M)]. \qquad \text{(C.4)}$$

The probability $\Pr(\delta_l)$ of $N_c(\delta_l)$ correct responses out of $N_t(\delta_l)$ trials, assuming that the underlying probability of a correct response is $P_c(\delta_l)$, is given by the Binomial distribution:

$$\Pr(\delta_l) = \binom{N_t(\delta_l)}{N_c(\delta_l)} \left\{ P_c(\delta_l) \right\}^{N_c(\delta_l)} \left\{ 1 - P_c(\delta_l) \right\}^{N_t(\delta_l) - N_c(\delta_l)}. \qquad \text{(C.5)}$$

Thus, $\Pr(\delta_l)$ depends on the level $\delta_l$, but also on the values of the coefficients $c_i$. By variation of $c_i$, a Maximum-Likelihood fit is obtained if the likelihood

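$$\Lambda = \prod_{l=0}^{L} \Pr(\delta_l) \qquad \text{(C.6)}$$

is maximized. Instead of Λ, $\mathcal{L} = \ln(\Lambda)$ is usually maximized. Maximizing $\mathcal{L}$ yields, first of all,

$$\frac{\partial \mathcal{L}}{\partial c_i} = 0 \qquad \text{(C.7)}$$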

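for all i. Other conditions also have to be satisfied to make sure that $\mathcal{L}$ reaches a maximum, but it is assumed that they are fulfilled. Writing out Equation (C.7) gives

$$\begin{aligned}
\frac{\partial \mathcal{L}}{\partial c_i}
&= \frac{\partial}{\partial c_i} \sum_{l=0}^{L} \ln\left[ \Pr(\delta_l) \right] && \text{(C.8)} \\
&= \frac{\partial}{\partial c_i} \sum_{l=0}^{L} \ln\left[ \binom{N_t(\delta_l)}{N_c(\delta_l)} \left\{ P_c(\delta_l) \right\}^{N_c(\delta_l)} \left\{ 1 - P_c(\delta_l) \right\}^{[N_t(\delta_l) - N_c(\delta_l)]} \right] && \text{(C.9)} \\
&= \sum_{l=0}^{L} \left\{ \frac{\partial}{\partial c_i} \ln\binom{N_t(\delta_l)}{N_c(\delta_l)} + \frac{\partial}{\partial c_i} N_c(\delta_l) \ln\left[ P_c(\delta_l) \right] + \frac{\partial}{\partial c_i} \left[ N_t(\delta_l) - N_c(\delta_l) \right] \ln\left[ 1 - P_c(\delta_l) \right] \right\} && \text{(C.10)} \\
&= \sum_{l=0}^{L} \left\{ \frac{N_c(\delta_l)}{P_c(\delta_l)} \frac{\partial P_c(\delta_l)}{\partial c_i} - \frac{N_t(\delta_l) - N_c(\delta_l)}{1 - P_c(\delta_l)} \frac{\partial P_c(\delta_l)}{\partial c_i} \right\} && \text{(C.11)} \\
&= \sum_{l=0}^{L} \frac{N_c(\delta_l) - P_c(\delta_l) N_t(\delta_l)}{P_c(\delta_l) \left[ 1 - P_c(\delta_l) \right]} \frac{\partial P_c(\delta_l)}{\partial c_i} && \text{(C.12)} \\
&= \sum_{l=0}^{L} N_t(\delta_l) \frac{\hat{P}_c(\delta_l) - P_c(\delta_l)}{P_c(\delta_l) \left[ 1 - P_c(\delta_l) \right]} \frac{\partial P_c(\delta_l)}{\partial c_i} = 0. && \text{(C.13)}
\end{aligned}$$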


In Equation (C.13), $N_c(\delta_l)/N_t(\delta_l)$, the experimentally obtained proportion of correct responses at level $\delta_l$, has been written as $\hat{P}_c(\delta_l)$ to distinguish it from the model probability $P_c(\delta_l)$.

Expression (C.13) is a set of M + 1 coupled, non-linear equations in the M + 1 variables $c_i$, which can only be solved numerically. The outcome is a set of coefficients $c_i$ which, inserted into Equation (C.3), gives the shape of the psychometric function that fits best (in the Maximum-Likelihood sense) to the data. Since the whole psychometric function is now known, the threshold $\Delta_{thr}$ can be defined arbitrarily, e.g. as that value of Δ for which d' = 1.
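The step from the log-likelihood to Equation (C.13) can be checked numerically: the analytic expression should agree with a finite-difference derivative of the log-likelihood. The sketch below does this for a single coefficient, assuming the model Pc = Φ(cΔ/2) and three made-up data points; the data, the model choice and all function names are hypothetical and serve only to illustrate the derivation.

```python
import math

def phi(x):
    """Standard Normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard Normal cumulative distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def pc(delta, c):
    """Assumed one-coefficient model: d' = c * delta, Pc = Phi(d'/2)."""
    return Phi(c * delta / 2.0)

def dpc_dc(delta, c):
    """Derivative of the model probability with respect to c."""
    return phi(c * delta / 2.0) * delta / 2.0

def log_likelihood(c, data):
    """ln of Eq. (C.6), with the Binomial probabilities of Eq. (C.5)."""
    total = 0.0
    for delta, n_t, n_c in data:
        p = pc(delta, c)
        total += (math.log(math.comb(n_t, n_c))
                  + n_c * math.log(p) + (n_t - n_c) * math.log(1.0 - p))
    return total

def score(c, data):
    """Right-hand side of Eq. (C.13) for the single coefficient c."""
    total = 0.0
    for delta, n_t, n_c in data:
        p = pc(delta, c)
        total += n_t * (n_c / n_t - p) / (p * (1.0 - p)) * dpc_dc(delta, c)
    return total

# Hypothetical data: (level, number of trials, number correct)
data = [(2.0, 20, 17), (1.0, 30, 21), (0.5, 25, 15)]
c, h = 1.2, 1e-5
numeric = (log_likelihood(c + h, data) - log_likelihood(c - h, data)) / (2.0 * h)
print(numeric, score(c, data))
```

The Binomial coefficient drops out of the derivative, which is why a fitting routine can omit it from the function being maximized.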

C.3 Goodness of fit

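A χ²-test may provide a simple goodness-of-fit criterion, and can be used to test a number of hypotheses. The criterion is given by

$$\begin{aligned}
\chi^2 &= \sum_{l=0}^{L} \frac{\left[ N_{observed} - N_{expected} \right]^2}{N_{expected}} && \text{(C.14)} \\
&= \sum_{l=0}^{L} \left\{ \frac{\left\{ N_c(\delta_l) - P_c(\delta_l) N_t(\delta_l) \right\}^2}{P_c(\delta_l) N_t(\delta_l)} + \frac{\left\{ \left[ N_t(\delta_l) - N_c(\delta_l) \right] - \left[ 1 - P_c(\delta_l) \right] N_t(\delta_l) \right\}^2}{\left[ 1 - P_c(\delta_l) \right] N_t(\delta_l)} \right\} && \text{(C.15)} \\
&= \sum_{l=0}^{L} \frac{\left\{ N_c(\delta_l) - P_c(\delta_l) N_t(\delta_l) \right\}^2}{P_c(\delta_l) \left[ 1 - P_c(\delta_l) \right] N_t(\delta_l)} && \text{(C.16)} \\
&= \sum_{l=0}^{L} N_t(\delta_l) \frac{\left\{ \hat{P}_c(\delta_l) - P_c(\delta_l) \right\}^2}{P_c(\delta_l) \left[ 1 - P_c(\delta_l) \right]}, && \text{(C.17)}
\end{aligned}$$

where $\hat{P}_c(\delta_l) = N_c(\delta_l)/N_t(\delta_l)$ is the observed proportion of correct responses.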

Here, $N_{observed}$ and $N_{expected}$ stand for the observed and expected number of responses, respectively. The χ²-test is valid only if the underlying probability density function is Normal. With the present model, the underlying density function is Binomial, which implies that only those terms l can be incorporated for which the Binomial function can be approximated by a Normal function. This yields the restriction that only terms for which $N_c(\delta_l) \ge 5$ and $N_t(\delta_l) - N_c(\delta_l) \ge 5$ can be taken into account. Therefore, in determination of the goodness-of-fit criterion, a number of data points usually do not contribute. These points are often those values of l for which either $N_t(\delta_l)$ is small, or a near-perfect score is obtained. Assume that the number of points taken into account is L' (≤ L). The fit is good in a statistical sense if

$$\chi^2 < \chi^2_\alpha(\nu), \qquad \text{(C.18)}$$

where ν denotes the degrees of freedom, in this case ν = L' − (M + 1) − 1, and α denotes the level of significance. $\chi^2_\alpha(\nu)$ is tabulated in most books on statistics, or can be determined by computer.
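As an illustration of "determined by computer", the upper-tail probability of the χ² distribution has a closed form for integer degrees of freedom, so no statistics library is needed. The sketch below uses those standard finite sums; the function names `chi2_sf` and `degrees_of_freedom` are hypothetical. Comparing the returned probability with α is equivalent to comparing χ² with the tabulated $\chi^2_\alpha(\nu)$.

```python
import math

def chi2_sf(x, nu):
    """Upper-tail probability P(chi-square with nu degrees of freedom > x), integer nu >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if nu % 2 == 0:
        # Even nu: exp(-x/2) * sum_{k < nu/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, nu // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd nu: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * (1 + x/3 + x^2/15 + ...)
    term, total = 1.0, 0.0
    for k in range((nu - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)

def degrees_of_freedom(n_points, n_coefficients):
    """nu = L' - (M + 1) - 1, with n_points = L' and n_coefficients = M + 1."""
    return n_points - n_coefficients - 1

nu = degrees_of_freedom(6, 1)
print(nu, round(chi2_sf(4.80, nu), 4))
```

With six usable levels and one fitted coefficient, χ² = 4.80 gives a probability of about 0.31, in line with the corresponding row of Table II in Section C.4.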

Of course one wishes to describe the psychometric function with as few coefficients $c_i$ as possible. This can be done with the aid of the χ²-test. Suppose one has reason to believe that a coefficient $c_j$ has a given value, e.g., zero. Then the null hypothesis

$$H_0: c_j = 0 \qquad \text{(C.19)}$$

has to be tested. By taking $c_j = 0$, Equation (C.4) changes, and first a Maximum-Likelihood fit has to be made again. This will yield a new set of values for $c_i$, where of course $c_j$ is still equal to zero. Inserting this new set into Equation (C.17) presumably¹ results in a larger value for χ² and hence a worse fit, but also reduces the number of coefficients by one, i.e., increases the degrees of freedom ν by one. If, again, Equation (C.18) is satisfied, the fit is still good. Of course, more than one coefficient can be set to zero, and as long as $\chi^2 < \chi^2_\alpha(\nu)$ for the given α and ν, it remains possible to describe the psychometric function adequately.

Suppose that a decision has to be made whether one parameterized version of a certain psychometric function is better than a second one. In other words: Is there a significant difference between the functions $\mathcal{G}(\Delta, c_0, c_1, \ldots, c_M)$ and $\mathcal{H}(\Delta, d_0, d_1, \ldots, d_N)$? This problem is solved by determination of the Maximum-Likelihood fit for both functions $\mathcal{G}$ and $\mathcal{H}$, and next the values $\chi^2_{\mathcal{G}}$ and $\chi^2_{\mathcal{H}}$, with $\nu_{\mathcal{G}}$ and $\nu_{\mathcal{H}}$ degrees of freedom respectively. If

$$\frac{\chi^2_{\mathcal{G}} / \nu_{\mathcal{G}}}{\chi^2_{\mathcal{H}} / \nu_{\mathcal{H}}} > F_{\alpha'}(\nu_{\mathcal{G}}, \nu_{\mathcal{H}}), \qquad \text{(C.20)}$$

the two psychometric functions give a significantly different fit (significance level α'). Equation (C.20) represents the so-called F-test, where values for $F_{\alpha'}(\nu_{\mathcal{G}}, \nu_{\mathcal{H}})$ are tabulated.

¹ An optimal fit in the Maximum-Likelihood sense does not imply an optimal fit in the χ² sense. Therefore, a non-optimal Maximum-Likelihood fit may produce a better χ² fit.

C.4 Example

In the present example, a one-interval, two-alternative, forced-choice paradigm is adopted. Data were collected (in a simulation) with a simple two-down, one-up staircase procedure (Levitt, 1971). Signal detection theory predicts a simple relation between the sensitivity d' and the probability of a correct response Pc

$$\begin{aligned}
P_c(d') &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{d'/2} e^{-\frac{1}{2} z^2}\, dz && \text{(C.21)} \\
&= \Phi(d'/2), && \text{(C.22)}
\end{aligned}$$

where Φ is the Cumulative Normal Distribution. The relation between d' and Δ was simply assumed to be

$$d' = \Delta. \qquad \text{(C.23)}$$

The stimulus levels were taken as

$$\delta_l = 5 \cdot [0.75]^l, \qquad \text{(C.24)}$$

with l = 0, 1, 2, 3, ..., resulting in values $\delta_l$ of 5.000, 3.750, 2.8125, 2.1094, 1.5820, etc. Ten runs were taken, all started at l = 0 ($\delta_0$ = 5.000), each comprising 50 trials. The data obtained are given in Table I, and plotted in Figure C.1. The relation between Pc and d' is known, since it is determined entirely by the experimental paradigm. The relation between d' and Δ is a psychophysical one, and is not known beforehand. Suppose that the experimenter chooses the following relation

$$d' = c_1 \Delta + c_2 \Delta^2, \qquad \text{(C.25)}$$

which are the first two terms of a Taylor expansion. The function has the property that d' increases as Δ increases (if the coefficients are larger than zero) and that d' = 0 if Δ = 0. With the Maximum-Likelihood procedure, the values for the two coefficients are determined. They are given in Table II. After the Maximum-Likelihood fit, a χ²-test has been performed on the original fit, and on fits where either $c_1$, $c_2$, or both have been set to 0 or 1, according to some hypothesis. A better fit results in an α value closer to unity. Usually, a fit is judged to be poor if α < 0.05 or 0.01. By putting $c_1$ = 1, for instance, the χ² fit becomes worse, but one degree of freedom is gained, resulting in a higher value for α. The hypothesis $H_1$: $c_2$ = 0 comes from the fact that a negative value for $c_2$ (as obtained with $H_0$) will, for large d' values, result in a non-monotonic psychometric function, and therefore is not allowed.

  l      Δ       Nt    Nc    Pc
  0    5.0000    20    20   1.000  *
  1    3.7500    28    28   1.000  *
  2    2.8125    42    38   0.905  *
  3    2.1094    67    59   0.881
  4    1.5820    78    59   0.756
  5    1.1865    75    59   0.787
  6    0.8899    78    60   0.769
  7    0.6674    56    36   0.643
  8    0.5006    27    15   0.556
  9    0.3754     9     6   0.667  *
 10    0.2816    11    10   0.909  *
 11    0.2112     8     4   0.500  *
 12    0.1584     1     0   0.000  *

Table I: Result of the simulation. The (*) indicates that the level is not taken into account in the χ²-test.

Figure C.1: Data of ten simulated runs, plotted as a psychometric function (Pc as a function of Δ). The diameter of each symbol indicates the number of trials taken at that level. The curves represent different Maximum-Likelihood fits.

In Figure C.1, the fraction Pc = Nc/Nt is plotted as a function of Δ. The larger the symbol, the more trials were taken at that stimulus level. Furthermore, five curves are plotted, representing the functions $H_4$: d' = Δ (solid curve); $H_0$: d' = 1.190Δ − 0.0345Δ² (dashed curve); $H_1$: d' = 1.117Δ (dashed); $H_2$: d' = Δ + 0.0579Δ² (dashed); and $H_3$: d' = 0.6213Δ² (dotted). The figure shows that the solid and the dashed curves do not differ very much. Putting $c_1$ = 0 resulted in a rather poor fit, and indeed the dotted curve diverges somewhat from the other four. Therefore, it is interesting to know whether this fit differs significantly from the other ones. The answer to this is obtained by an F-test between $H_3$ and e.g. $H_1$. Equation (C.20) has to be filled in, and α' has to be calculated (e.g., with the aid of a table). If the two were to yield an identical fit, α' = 1/2. A significant difference would result in a value for α' close to 0 or 1. The F-test yields

$$\frac{\chi^2_3}{\nu_3} \cdot \frac{\nu_1}{\chi^2_1} = \frac{17.52}{4} \cdot \frac{4}{4.80} = 3.65. \qquad \text{(C.26)}$$

The corresponding value for α' under ν₁ = 4 and ν₂ = 4 is 0.12. Though the fit under hypothesis $H_3$ is bad, the curve under $H_1$ still does not fit the data significantly better than the other one. In an analogous way, $H_0$ can be compared with $H_1$, $H_2$ and so on. For this example, no significant differences have been found using the F-test.
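The value α' = 0.12 can be reproduced without a table. For even degrees of freedom the upper-tail probability of the F distribution reduces to a finite binomial sum, because it equals the regularized incomplete beta function $I_x(\nu_2/2, \nu_1/2)$ at $x = \nu_2/(\nu_2 + \nu_1 F)$ and both arguments are then integers. The sketch below relies on that standard identity; the function name `f_sf_even` is hypothetical and the routine deliberately refuses odd degrees of freedom.

```python
import math

def f_sf_even(f, nu1, nu2):
    """Upper-tail probability P(F > f) for an F distribution with even nu1 and nu2."""
    if nu1 % 2 or nu2 % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    a, b = nu2 // 2, nu1 // 2
    x = nu2 / (nu2 + nu1 * f)
    n = a + b - 1
    # I_x(a, b) for integer a and b is the upper tail of a Binomial(n, x) distribution
    return sum(math.comb(n, j) * x ** j * (1.0 - x) ** (n - j)
               for j in range(a, n + 1))

# Numbers from Equation (C.26): H3 against H1
ratio = (17.52 / 4) / (4.80 / 4)
alpha_prime = f_sf_even(ratio, 4, 4)
print(round(ratio, 2), round(alpha_prime, 2))
```

A ratio of exactly 1 gives 0.5 when both degrees of freedom are equal, which matches the remark that identical fits yield α' = 1/2.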

It is known that a two-down, one-up procedure estimates the point Δ for which Pc = 0.7071 (Levitt, 1971). This corresponds, according to Equation (C.22), to d' = 1.0899. The five relations obtained between d' and Δ result in a value for the threshold $\Delta_{thr}$, and are given in Table II. The results show that the estimated threshold under hypothesis $H_0$ deviates 16% from the actual threshold. Under the restriction that all coefficients have to be positive ($H_1$), threshold deviates about 12%. The other hypotheses (except for $H_3$) yield a better estimate of threshold, but then again, more information has been added in the form of restrictions for the coefficients.
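The thresholds follow from the fitted coefficients by solving $c_1\Delta + c_2\Delta^2 = d'$ at the target sensitivity. The sketch below shows one way to do this, assuming the quadratic form of Equation (C.25) and taking the smallest positive root; `threshold` is a hypothetical helper, and the coefficient pairs are those quoted for the five hypotheses.

```python
import math
from statistics import NormalDist

def threshold(c1, c2, d_target):
    """Smallest positive delta with c1*delta + c2*delta**2 = d_target."""
    if c2 == 0.0:
        return d_target / c1
    disc = c1 * c1 + 4.0 * c2 * d_target
    if disc < 0.0:
        raise ValueError("target sensitivity is never reached")
    roots = [(-c1 + sign * math.sqrt(disc)) / (2.0 * c2) for sign in (1.0, -1.0)]
    return min(r for r in roots if r > 0.0)

# Target of the two-down, one-up procedure, inverted through Eq. (C.22)
d_target = 2.0 * NormalDist().inv_cdf(0.7071)

fits = {"H0": (1.190, -0.0345), "H1": (1.117, 0.0), "H2": (1.000, 0.0579),
        "H3": (0.000, 0.6213), "H4": (1.000, 0.0)}
for name, (c1, c2) in fits.items():
    print(name, round(threshold(c1, c2, d_target), 3))
```

The printed thresholds can be compared with the threshold column of Table II.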

Hypothesis             c1       c2      Δ_thr    χ²     d.f.          α
H0: -                 1.190   -0.0345   0.942    4.64   6-2-1 = 3   0.2005
H1: c2 = 0            1.117    0.0000   0.976    4.80   6-1-1 = 4   0.3081
H2: c1 = 1            1.000    0.0579   1.029    5.29   6-1-1 = 4   0.2589
H3: c1 = 0            0.000    0.6213   1.324   17.52   6-1-1 = 4   0.0015
H4: c1 = 1, c2 = 0    1.000    0.0000   1.090    5.87   6-0-1 = 5   0.3195

Table II: Values for c1 and c2, obtained with a Maximum-Likelihood fit. Furthermore, results of the χ²-test are given.
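Two rows of Table II can be reproduced from the counts in Table I. The sketch below is one possible implementation, not the procedure used for the thesis: it fits the single coefficient of $H_1$ (d' = c₁Δ) by minimizing the negative log-likelihood with a golden-section search, which is an assumption made here because the text only says the equations are solved numerically, and it evaluates the χ² of Equation (C.16) over the levels that pass the Nc ≥ 5 and Nt − Nc ≥ 5 rule. All function names are hypothetical.

```python
import math

# (delta, N_t, N_c) copied from Table I
TABLE_I = [
    (5.0000, 20, 20), (3.7500, 28, 28), (2.8125, 42, 38), (2.1094, 67, 59),
    (1.5820, 78, 59), (1.1865, 75, 59), (0.8899, 78, 60), (0.6674, 56, 36),
    (0.5006, 27, 15), (0.3754, 9, 6), (0.2816, 11, 10), (0.2112, 8, 4),
    (0.1584, 1, 0),
]

def pc(delta, c1):
    """Eq. (C.22) with d' = c1 * delta, clipped away from 0 and 1 to keep logs finite."""
    p = 0.5 * math.erfc(-(c1 * delta / 2.0) / math.sqrt(2.0))
    return min(max(p, 1e-12), 1.0 - 1e-12)

def neg_log_likelihood(c1):
    """Minus ln of Eq. (C.6); the Binomial coefficients do not depend on c1 and are omitted."""
    return -sum(n_c * math.log(pc(d, c1)) + (n_t - n_c) * math.log(1.0 - pc(d, c1))
                for d, n_t, n_c in TABLE_I)

def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        x1, x2 = b - ratio * (b - a), a + ratio * (b - a)
        if f(x1) < f(x2):
            b = x2
        else:
            a = x1
    return 0.5 * (a + b)

def chi_square(c1):
    """Eq. (C.16), restricted to levels with N_c >= 5 and N_t - N_c >= 5."""
    total = 0.0
    for d, n_t, n_c in TABLE_I:
        if n_c >= 5 and n_t - n_c >= 5:
            p = pc(d, c1)
            total += (n_c - p * n_t) ** 2 / (p * (1.0 - p) * n_t)
    return total

c1_hat = golden_section_min(neg_log_likelihood, 0.1, 3.0)
print(round(c1_hat, 3), round(chi_square(c1_hat), 2))   # hypothesis H1
print(round(chi_square(1.0), 2))                        # hypothesis H4, no free coefficient
```

All thirteen levels enter the likelihood, whereas only the six unstarred levels of Table I enter χ², which is the practical advantage of the Maximum-Likelihood fit described in the text.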


C.5 Discussion

If the data of the adaptive procedure are processed with the present method, the sequential order of the stimuli is lost. This is allowed only if the subject's performance does not change in time. This is not really a restriction, since threshold determination with conventional methods (averaging over reversals, etc.) also needs this assumption. No restriction is needed for the experimental paradigm, nor for the stimulus step size, though the former determines the shape of Ψ in Equation (C.1). The possibility of choosing the step size freely therefore offers greater experimental flexibility. The present way of threshold estimation can also be used to fit data that are not obtained with an adaptive procedure. This method can be used as long as data can be written in the form of Table I. Instead of a Maximum-Likelihood fitting procedure, a least-χ² procedure could also have been used to estimate the coefficients $c_i$. However, the latter procedure can use only those data points for which the Normality assumption is valid. This restriction causes some data points not to be taken into account. Therefore, especially with only a few data points, the Maximum-Likelihood procedure is preferred.

With an adaptive procedure, most data points are gathered for values of Δ near the target value for Pc (which is determined by the adaptive procedure). Therefore, the estimated psychometric function will always pivot around that point, as can be seen from Figure C.1. This point will therefore be estimated quite robustly, independent of the exact shape of the psychometric function. Though the entire psychometric function is given by a Maximum-Likelihood fit, one does best to define threshold as that point on the psychometric function for which the probability of a correct response is estimated by the adaptive procedure.


References

Allik, J., Dzhafarov, E.N., Houtsma, A.J.M., Ross, J. & Versfeld, N.J. (1989). "Pitch motion with random chord sequences", Perception & Psychophysics 46, 513 - 527.

Ananthraraman, J.N., Krishnamurthy, A.K. & Feth, L.L. (1992). "Testing an extension of the EWAIF model: Intensity-weighted average of instantaneous frequency (IWAIF)", in: Abstracts of the fifteenth midwinter research meeting of the association for research in otolaryngology, 67.

Berg, B.G. (1990). "Spectral weights in profile listening", Journal of the Acoustical Society of America 88, 758 - 766.

Berg, B.G. (1992). "Discrimination of narrow band spectra based on pitch cues", in: Abstracts of the fifteenth midwinter research meeting of the association for research in otolaryngology, 93.

Berger, K.W. (1964). "Some factors in the recognition of timbre", Journal of the Acoustical Society of America 36, 1888 - 1891.

Bernstein, L.R. & Green, D.M. (1987a). "The profile analysis bandwidth", Journal of the Acoustical Society of America 81, 1888 - 1895.

Bernstein, L.R. & Green, D.M. (1987b). "Detection of simple and complex changes of spectral shape", Journal of the Acoustical Society of America 82, 1587 - 1592.

von Bismarck, G. (1974a). "Timbre of steady sounds: A factorial investigation of its verbal attributes", Acustica 30, 146 - 159.

von Bismarck, G. (1974b). "Sharpness as an attribute of the timbre of steady sounds", Acustica 30, 159 - 172.

Bradley, R.A. (1963). "Some relationships among sensory difference tests", Biometrics 19, 385 - 397.

Buus, S. (1983). "Discrimination of envelope frequency", Journal of the Acoustical Society of America 74, 1709 - 1715.

Buus, S. & Florentine, M. (1991). "Psychometric functions for level discrimination", Journal of the Acoustical Society of America 90, 1371 - 1380.

Cherry, E.C. & Phillips, V.J. (1961). "Some possible uses of single sideband signals in formant-tracking systems", Journal of the Acoustical Society of America 33, 1067 - 1077.

Dixon, W.J. & Mood, A.M. (1948). "A method for obtaining and analyzing sensitivity data", Journal of the American Statistical Association 43, 109 - 126.

Durlach, N.I., Braida, L.D. & Ito, Y. (1986). "Towards a model for discrimination of broadband signals", Journal of the Acoustical Society of America 80, 63 - 72.

Elliot, P.B. (1988). "Tables of d'", in: Signal detection and recognition by human observers, J.A. Swets, ed. (Peninsula Publishing, Los Altos, CA).

Emmerich, D.S., Ellermeier, W. & Butensky, B. (1989). "A reexamination of frequency discrimination of random-amplitude tones, and a test of Henning's modified energy-detector model", Journal of the Acoustical Society of America 85, 1653 - 1659.

Farrar, C.L., Reed, C.M., Ito, Y., Durlach, N.I., Delhorne, L.A., Zurek, P.M. & Braida, L.D. (1987). "Spectral shape discrimination. I. Results from normal-hearing listeners for stationary broadband noises", Journal of the Acoustical Society of America 81, 1085 - 1092.

Feth, L.L. (1974). "Frequency discrimination of complex periodic tones", Perception & Psychophysics 15, 375 - 378.

Feth, L.L. & O'Malley, H. (1977). "Two-tone auditory spectral resolution", Journal of the Acoustical Society of America 62, 940 - 947.

Feth, L.L., O'Malley, H. & Ramsey Jr., J.W. (1982). "Pitch of unresolved, two-component complex tones", Journal of the Acoustical Society of America 72, 1403 - 1412.

Feth, L.L. & Stover, L.J. (1987). "Demodulation processes in auditory perception", in: Auditory processing of complex sounds, W.A. Yost & C.S. Watson, eds. (Lawrence Erlbaum Associates, Hillsdale NJ). 76 - 86.

Florentine, M. (1983). "Intensity discrimination as a function of level and frequency and its relation to high-frequency hearing", Journal of the Acoustical Society of America 74, 1375 - 1379.

Florentine, M. & Buus, S. (1981). "An excitation-pattern model for intensity discrimination", Journal of the Acoustical Society of America 70, 1646 - 1654.

Frijters, J.E.R. (1980). Psychophysical and psychometrical models for the triangular method. Ph.D. Thesis, Utrecht University.

Goldstein, J.L. (1967). "Auditory nonlinearity", Journal of the Acoustical Society of America 41, 676 - 689.

Green, D.M. (1988). Profile analysis, (Oxford University Press, New York).

Green, D.M. & Berg, B.G. (1991). "Spectral weights and the profile bowl", Quarterly Journal of Experimental Psychology 43A, 449 - 458.

Green, D.M. & Kidd Jr., G. (1983). "Further studies of auditory profile analysis", Journal of the Acoustical Society of America 73, 1260 - 1265.

Green, D.M., Kidd Jr., G. & Picardi, M.C. (1983). "Successive versus simultaneous comparison in auditory intensity discrimination", Journal of the Acoustical Society of America 73, 639 - 643.

Green, D.M. & Mason, C.R. (1985). "Auditory profile analysis: Frequency, phase and Weber's Law", Journal of the Acoustical Society of America 77, 1155 - 1161.

Green, D.M., Mason, C.R. & Kidd Jr., G. (1984). "Profile analysis: Critical bands and duration", Journal of the Acoustical Society of America 75, 1163 - 1167.

Green, D.M., Onsan, Z.A. & Forrest, T.G. (1987). "Frequency effects in profile analysis and detecting complex spectral changes", Journal of the Acoustical Society of America 81, 692 - 699.

Green, D.M. & Swets, J.A. (1988). Signal detection theory and psychophysics, (Peninsula Publishing, Los Altos, CA).

Grey, J.M. (1977). "Multidimensional perceptual scaling of musical timbres", Journal of the Acoustical Society of America 61, 1270 - 1277.

Hawkins, J.E. & Stevens, S.S. (1950). "The masking of pure tones and of speech by white noise", Journal of the Acoustical Society of America 22, 6 - 13.

Helmholtz, H.L.F. (1954). On the sensations of tone, 2nd English edition, (Dover, New York).

Henning, G.B. (1966). "Frequency discrimination of random-amplitude tones", Journal of the Acoustical Society of America 39, 336 - 339.

Ito, Y. (1990). Auditory discrimination of power spectra for roving two-tone stimuli. Ph.D. Thesis, Massachusetts Institute of Technology.

Jeffress, L.A. (1968). "Beating sinusoids and pitch changes", Journal of the Acoustical Society of America 43, 1464.

Jesteadt, W. & Neff, D.L. (1982). "A signal-detection-theory measure of pitch shifts in sinusoids as a function of intensity", Journal of the Acoustical Society of America 72, 1812 - 1820.

Jesteadt, W. & Sims, S.L. (1975). "Decision processes in frequency discrimination", Journal of the Acoustical Society of America 57, 1161 - 1168.

Jesteadt, W., Wier, C.C. & Green, D.M. (1977). "Intensity discrimination as a function of frequency and sensation level", Journal of the Acoustical Society of America 61, 169 - 177.

Kidd Jr., G., Mason, C.R. & Green, D.M. (1986). "Auditory profile analysis of irregular sound spectra", Journal of the Acoustical Society of America 79, 1045 - 1053.

Kidd Jr., G., Mason, C.R., Uchanski, R.M., Brantley, M.A. & Shah, P. (1991). "Evaluation of simple models of auditory profile analysis using random reference spectra", Journal of the Acoustical Society of America 90, 1340 - 1354.

Laming, D.J. (1986). Sensory analysis, (Academic Press, London).

Levitt, H. (1971). "Transformed up-down methods in psychoacoustics", Journal of the Acoustical Society of America 49, 467 - 477.

Macmillan, N.A., Kaplan, H.L. & Creelman, C.D. (1977). "The psychophysics of categorical perception", Psychological Review 84, 452 - 471.

Mason, C.R., Kidd Jr., G., Hanna, T.E. & Green, D.M. (1984). "Profile analysis and level variation", Hearing Research 13, 269 - 275.

Moore, B.C.J. & Glasberg, B.R. (1989). "Mechanisms underlying the frequency discrimination of pulsed tones and the detection of frequency modulation", Journal of the Acoustical Society of America 86, 1722 - 1732.

Moore, B.C.J., Glasberg, B.R. & Shailer, M.J. (1984). "Frequency and intensity difference limens for harmonics within complex tones", Journal of the Acoustical Society of America 75, 550 - 561.

Moore, B.C.J., Oldfield, S.R. & Dooley, G.J. (1989). "Detection and discrimination of spectral peaks and notches at 1 and 8 kHz", Journal of the Acoustical Society of America 85, 820 - 836.

Moore, B.C.J., Peters, R.W. & Glasberg, B.R. (1990). "Auditory filter shapes at low centre frequencies", Journal of the Acoustical Society of America 88, 132 - 140.

Morton, J. & Carpenter, A. (1963). "Experiments relating to the perception of formants", Journal of the Acoustical Society of America 35, 475 - 480.

Plomp, R. (1964). "The ear as a frequency analyzer", Journal of the Acoustical Society of America 36, 1628 - 1636.

Plomp, R. (1965). "Detectability threshold for combination tones", Journal of the Acoustical Society of America 37, 1110 - 1123.

Plomp, R. (1976). Aspects of tone sensation, (Academic Press, London).

Plomp, R. & Levelt, W.J.M. (1965). "Tonal consonance and critical bandwidth", Journal of the Acoustical Society of America 38, 548 - 560.

Plomp, R. & Steeneken, H.J.M. (1968). "Interference between two simple tones", Journal of the Acoustical Society of America 43, 883 - 884.

Pols, L.C.W., van der Kamp, L.J.Th. & Plomp, R. (1969). "Perceptual and physical space of vowel sounds", Journal of the Acoustical Society of America 46, 458 - 467.

Rabinowitz, W.M., Lim, J.S., Braida, L.D. & Durlach, N.I. (1976). "Intensity perception. VI. Summary of recent data on deviations from Weber's law for 1000-Hz tone pulses", Journal of the Acoustical Society of America 59, 1506 - 1509.

Raney, J.J., Richards, V.M., Onsan, Z.A. & Green, D.M. (1989). "Signal uncertainty and psychometric functions in profile analysis", Journal of the Acoustical Society of America 86, 954 - 960.

Richards, V.M., Onsan, Z.A. & Green, D.M. (1989). "Auditory profile analysis: Potential pitch cues", Hearing Research 39, 27 - 36.

Riesz, R.R. (1928). "Differential intensity sensitivity of the ear", Physical Review 31, 867 - 875.

Scharf, B. (1970). "Critical bands", in: Foundations of modern auditory theory, Vol. 1, J.V. Tobias, ed. (Academic Press, New York).

Shower, E.G. & Biddulph, R. (1931). "Differential pitch sensitivity of the ear", Journal of the Acoustical Society of America 3, 275 - 287.

Smurzyński, J. & Houtsma, A.J.M. (1989). "Auditory discrimination of tone-pulse onsets", Perception & Psychophysics 45, 2 - 9.

Spiegel, M.F. & Green, D.M. (1982). "Signal and masker uncertainty with noise maskers of varying duration, bandwidth and center frequency", Journal of the Acoustical Society of America 71, 1204 - 1210.

Spiegel, M.F., Picardi, M.C. & Green, D.M. (1981). "Signal and masker uncertainty in intensity discrimination", Journal of the Acoustical Society of America 70, 1015 - 1019.

Taylor, M.M. & Creelman, C.D. (1967). "PEST: Efficient estimates on probability functions", Journal of the Acoustical Society of America 41, 782 - 787.

Verschuure, J. & van Meeteren, A.A. (1975). "The effect of intensity on pitch", Acustica 32, 33 - 44.

Versfeld, N.J. (1989). "Intensity perception in two-tone complexes", in: Proceedings of the 28th acoustic conference on physiological acoustics, psychoacoustics, acoustics of music and speech, C. Gorah'kova, ed. (VU.st, Prague), 167 - 170.

Versfeld, N.J. (1990). "Perception of spectral changes in noise bands of varying bandwidth", IPO annual progress report 25, 23 - 31.

Versfeld, N.J. (1992). "On the perception of spectral changes in noise bands", in: The processing of speech, M.E.H. Schouten, ed. (Mouton-de Gruyter, Berlin).

Versfeld, N.J. & Houtsma, A.J.M. (1991). "Perception of spectral changes in multi-tone complexes", Quarterly Journal of Experimental Psychology 43A, 459 - 479.

Versfeld, N.J. & Houtsma, A.J.M. (1992). "Spectral shape discrimination of two-tone complexes", in: Auditory physiology and perception, Y. Cazals, L. Demany & K. Horner, eds. (Pergamon Press, Oxford), 363 - 371.

Voelcker, H.B. (1966). "Toward a unified theory of modulation. Part I: Phase-envelope relationships", Proceedings of the IEEE 54, 340 - 353.

Wetherill, G.B. & Levitt, H. (1965). "Sequential estimation of points on a psychometric function", British Journal of Mathematical and Statistical Psychology 18, 1 - 10.

Wier, C.C., Jesteadt, W. & Green, D.M. (1977). "Frequency discrimination as a function of frequency and sensation level", Journal of the Acoustical Society of America 61, 178 - 184.

Summary

When we hear a sound, we can attribute a number of subjective properties to it. We can indicate whether a sound is loud or soft, and whether it has a pitch and, if so, whether that pitch is high or low. Furthermore, every sound has a tone colour or timbre, which lets us tell two sounds apart even though they are equally loud and have the same pitch.

Psychoacoustics is a branch of science that studies how sound signals are transformed into sensory impressions. One way to learn more about subjective perception is to investigate the discriminative ability of hearing. For example, we can present subjects with a series of three sounds, two of which are identical while the third differs slightly. The subject's task is then to indicate which of the three sounds differed from the other two. By varying the difference between the sounds, we arrive at the difference for which the subject scores a predetermined percentage correct. We call this the "threshold". By measuring thresholds we can find out how sensitive human hearing is to particular kinds of changes. Ultimately we can then find the regularities that describe the auditory system (the hearing organ and the parts of the brain that process the sound signals).
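The tracking idea in this paragraph, adjusting the difference until a fixed percentage correct is reached, can be sketched with a two-down/one-up rule of the kind described by Levitt (1971), which settles near 70.7% correct. The summary does not say which rule was used, so the rule, the function name `two_down_one_up`, the starting difference and the step size are all assumptions made for illustration.

```python
def two_down_one_up(responses, start_level, step):
    """Return the stimulus difference presented on each trial.

    responses: iterable of booleans, True for a correct answer.
    The difference shrinks by `step` after two consecutive correct
    answers and grows by `step` after any incorrect answer.
    (Hypothetical helper; not taken from the thesis.)
    """
    level = start_level
    correct_in_a_row = 0
    track = []
    for correct in responses:
        track.append(level)
        if correct:
            correct_in_a_row += 1
            if correct_in_a_row == 2:
                level -= step          # task gets harder
                correct_in_a_row = 0
        else:
            level += step              # task gets easier
            correct_in_a_row = 0
    return track


# Illustrative response sequence and step size (assumed values).
track = two_down_one_up([True, True, True, False, True, True],
                        start_level=8.0, step=2.0)
print(track)
```

In an actual measurement the responses would come from the listener, and the threshold would typically be estimated from the levels at which the track reverses direction.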

It is known that the inner ear performs a frequency analysis of sounds. This means that the different frequency components of which a signal is composed resonate at different places in the cochlea. If two components lie so close together that they interact (they then lie within each other's so-called "critical band"), a change in one component may be less well, or on the contrary better, perceptible because of the presence of the other component. Furthermore, we would expect two components that differ widely in frequency to interact hardly or not at all, because they activate different parts of the cochlea. This is often indeed the case, but under certain conditions the auditory system turns out to be able to compare components that are far apart in frequency and to use them to perceive the difference more accurately. It is then as if not the individual components but rather the spectral shape is perceived. In the literature this process is referred to as "profile analysis".

This thesis studies how the auditory system processes a change in the spectral shape of a sound. An attempt has been made to chart the properties of hearing for sounds whose spectral shape changes in a simple way, namely by varying the spectral slope. The sounds usually consisted of only two frequency components (two-tone complexes), but sounds with more components and sounds with a continuous (i.e. noise-like) spectrum were also studied. To prevent subjects from giving answers based on changes in intensity, the intensity of each sound interval was varied randomly in the experiments.

The results of this thesis show that a change in spectral slope can result in different percepts. If the bandwidth of the signal is small, a change in slope leads to a change in pitch. If the bandwidth is large, it leads in some cases to a change in timbre. When the bandwidth of the signal is very small, it seems as if the auditory system analyses the signal as a single component that varies over time. A model that approximately describes this process is the EWAIF model. In this thesis we show that this model can describe the results only for very small bandwidths. When the bandwidth is large, the frequency components of the sound signal are distributed over different critical bands. The results of this thesis show that the auditory system can compare the information from the separate critical bands relative to one another. This means in effect that the spectral shape can be determined or, in other words, that profile analysis takes place. A model that can describe this behaviour at least qualitatively is the multi-channel model. In the case of two-tone complexes this model can even describe the results quantitatively. It turns out that hearing does not process these sounds in an optimal way. We see the same result with the other sound signals. It seems as if the auditory system is unable to use more than two critical bands at the same time.

Curriculum Vitae

19 December 1962: Born in Eindhoven.

1975 - 1981: Lorentz Lyceum, Eindhoven, Atheneum B.

1981 - March 1988: Eindhoven University of Technology, Applied Physics (Technische Natuurkunde). Specialization: Psychophysics. Graduation project: "Detection of pitch motion in complex sounds".

July 1988: Music teaching certificate (Muziekonderwijs akte A), main subject piano.

June 1988 - May 1992: Trainee researcher (Onderzoeker in Opleiding) at the Institute for Perception Research (Instituut voor Perceptie Onderzoek), Eindhoven. Project title: "Auditory discrimination of the spectral shape of sounds".

Propositions

accompanying the thesis On the auditory discrimination of spectral shape

by Niek Versfeld

I

The EWAIF model of Feth (1974) cannot describe the discrimination behaviour of human hearing for changes in the modulation depth of amplitude-modulated signals.

Feth, L.L. (1974). "Frequency discrimination of complex periodic tones", Perception & Psychophysics 15, 375 - 378.

II

Although for two-tone complexes the amplitude envelope and the instantaneous frequency are periodic with the reciprocal of the difference frequency, the claim of Feth & Stover (1987) that the signal itself is then also periodic with this period is false. This is only the case when the two frequencies are successive harmonics.

Feth, L.L. & Stover, L.J. (1987). "Demodulation processes in auditory perception", in: Auditory processing of complex sounds, W.A. Yost & C.S. Watson, eds. (Lawrence Erlbaum Associates, Hillsdale NJ). pp. 76 - 86.
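The condition in proposition II can be checked directly: a sum of two sinusoids at f1 and f2 repeats after T = 1/(f2 - f1) only if each component completes a whole number of cycles in T, which happens exactly when f1 and f2 are successive multiples of the difference frequency. The following is a minimal sketch; the function name and the example frequencies are assumptions chosen for illustration and do not come from the thesis.

```python
from fractions import Fraction


def signal_repeats_at_difference_period(f1, f2):
    """True if sin(2*pi*f1*t) + sin(2*pi*f2*t) repeats after T = 1/(f2 - f1).

    Each component repeats after T only if it completes a whole number
    of cycles in T, i.e. if f1*T and f2*T are both integers.
    (Hypothetical helper; exact rational arithmetic avoids rounding.)
    """
    f1, f2 = Fraction(f1), Fraction(f2)
    period = 1 / (f2 - f1)
    cycles_1 = f1 * period
    cycles_2 = f2 * period
    return cycles_1.denominator == 1 and cycles_2.denominator == 1


# 1000 Hz and 1100 Hz are harmonics 10 and 11 of 100 Hz.
print(signal_repeats_at_difference_period(1000, 1100))
# 1050 Hz and 1150 Hz share the same 100-Hz difference but are not
# successive harmonics of it.
print(signal_repeats_at_difference_period(1050, 1150))
```

In the second case the envelope still repeats every 1/100 s, but the waveform itself only repeats every 1/50 s.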

III

Kidd et al. (1986) quite wrongly describe the learning process with a logarithmic curve. This implies that the threshold keeps decreasing the more the listener practises.

Kidd Jr., G., Mason, C.R. & Green, D.M. (1986). "Auditory profile analysis of irregular sound spectra", Journal of the Acoustical Society of America 79, 1045 - 1053.

IV

Under special conditions the human auditory system is able to perceive amplitude changes of less than 0.2 dB.

This thesis.

V

Auditory paradoxes such as those described by Shepard (1964), Schroeder (1986) and Risset (1986) can be explained simply with a Local Dipole Model (Allik et al., 1989).

Allik, J., Dzhafarov, E.N., Houtsma, A.J.M., Ross, J. & Versfeld, N.J. (1989). "Pitch motion with random chord sequences", Perception & Psychophysics 46, 513 - 527.

Risset, J.-C. (1986). "Pitch and rhythm paradoxes: Comment on "Auditory paradox based on fractal waveform", [J. Acoust. Soc. Am. 79, 186 - 189, 1986]", Journal of the Acoustical Society of America 80, 961 - 962.

Schroeder, M.R. (1986). "Auditory paradox based on fractal waveform", Journal of the Acoustical Society of America 79, 186 - 189.

Shepard, R.N. (1964). "Circularity in the judgements of relative pitch", Journal of the Acoustical Society of America 36, 2346 - 2353.

VI

Jesteadt et al. (1977) describe the "near miss" to Weber's law for intensity discrimination either as $\log(\Delta I / I) = k + p \log(I / I_0)$ or as $\log([I + \Delta I] / I) = k + p \log(I / I_0)$, where $I_0$ is the just-perceivable intensity and $\Delta I$ is the just-noticeable difference in intensity. The first description is to be preferred, since the "near-miss" parameter $p$ is then independent of the definition of the threshold.

Jesteadt, W., Wier, C.C. & Green, D.M. (1977). "Intensity discrimination as a function of frequency and sensation level", Journal of the Acoustical Society of America 61, 169 - 177.
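One way to read proposition VI is sketched below, under an assumption the proposition itself does not state: that a different threshold definition rescales the just-noticeable difference by a constant factor c. In the first form, log(c ΔI / I), the factor only shifts the intercept k and leaves the slope p unchanged. In the second form, log([I + c ΔI] / I), the slope against log(I / I0) changes with c. The values of k and p, the intensity range and all function names are illustrative assumptions, not data from Jesteadt et al. (1977).

```python
import math


def weber_fraction(intensity_ratio, k, p, c=1.0):
    """c * dI / I, assuming dI / I = 10**k * (I / I0)**p (first form)."""
    return c * 10 ** k * intensity_ratio ** p


def slope_first_form(x1, x2, k, p, c):
    """Slope of log10(c * dI / I) against log10(I / I0)."""
    y1 = math.log10(weber_fraction(x1, k, p, c))
    y2 = math.log10(weber_fraction(x2, k, p, c))
    return (y2 - y1) / (math.log10(x2) - math.log10(x1))


def slope_second_form(x1, x2, k, p, c):
    """Slope of log10((I + c * dI) / I) against log10(I / I0)."""
    y1 = math.log10(1 + weber_fraction(x1, k, p, c))
    y2 = math.log10(1 + weber_fraction(x2, k, p, c))
    return (y2 - y1) / (math.log10(x2) - math.log10(x1))


# Illustrative parameter values (assumed, not fitted to any data).
K, P = -0.3, -0.07
LOW, HIGH = 1e2, 1e6   # I / I0 at 20 dB and 60 dB sensation level

for c in (1.0, 2.0):
    print(c,
          slope_first_form(LOW, HIGH, K, P, c),
          slope_second_form(LOW, HIGH, K, P, c))
```

Under these assumptions the first slope equals p for both values of c, while the second slope differs between them.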

VII

The spectral envelope of a 21-tone complex, as drawn in Figure 3 of Bernstein & Green (1987), shows that the visual system, instead of

$a[i] = a + \Delta \sin[2\pi \cdot 10 i / 21]$

prefers to see

$a[i] = a + \Delta \sin[\pi i / 21] \sin[\pi i - \pi / 2], \quad i = 1, 2, \ldots, 21.$

Bernstein, L.R. & Green, D.M. (1987). "The profile analysis bandwidth", Journal of the Acoustical Society of America 81, 1888 - 1895.
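The two expressions in proposition VII, sin[2π·10i/21] and sin[πi/21]·sin[πi − π/2], take the same values at the integer component numbers i = 1, ..., 21, because sin(πi − π/2) equals −(−1)^i there. Sampled at 21 points, the ten-cycle ripple is therefore indistinguishable from a half-cycle sine whose sign alternates from one component to the next. A minimal numerical check follows; the function name is an assumption, and the minus sign in the second factor is the reading of the garbled original under which the identity holds.

```python
import math

N_COMPONENTS = 21


def ripple_forms():
    """Return both forms of the ripple term for i = 1 .. 21.

    fast: sin(2*pi*10*i/21), the ten-cycle ripple.
    slow: sin(pi*i/21) * sin(pi*i - pi/2), a half-cycle sine whose
          sign alternates between neighbouring components.
    (Hypothetical helper written for this check.)
    """
    fast, slow = [], []
    for i in range(1, N_COMPONENTS + 1):
        fast.append(math.sin(2 * math.pi * 10 * i / N_COMPONENTS))
        slow.append(math.sin(math.pi * i / N_COMPONENTS)
                    * math.sin(math.pi * i - math.pi / 2))
    return fast, slow


fast, slow = ripple_forms()
max_difference = max(abs(f - s) for f, s in zip(fast, slow))
print(max_difference)  # only floating-point rounding error remains
```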

VIII

The shape of the psychometric function may depend on familiarity with the stimulus.

This thesis.

IX

If a given measurement shows that subjects score significantly above chance, this does not imply that they perceive above threshold. Likewise, if subjects turn out to perceive above threshold, this does not mean that the score is significantly above chance.

X

Although the sound quality of the Digital Compact Cassette and that of the Compact Disc differ in physical terms, they are perceptually identical.

XI

If an experiment is worth doing, it is also worth doing again.

XII

Language is a matter of taste: one person would rather eat potatoes than fries (aardappels dan frites), another would rather eat potatoes as fries (aardappels als frites).