
Audio Engineering Society

Convention Paper
Presented at the 117th Convention

2004 October 28-31, San Francisco, CA, USA

This convention paper has been reproduced from the author's advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

Modeling Auditory Localization of Subwoofer Signals in Multi-Channel Loudspeaker Arrays

Jonas Braasch¹, William L. Martens¹, and Wieslaw Woszczyk¹

¹CIRMMT, Faculty of Music, McGill University, Montreal, Canada

Correspondence should be addressed to Jonas Braasch (jb@music.mcgill.ca)

ABSTRACT

For economic reasons, home entertainment surround sound systems are usually equipped with a single subwoofer channel. The main argument for this practice is the presumed inability of the auditory system to localize low frequencies in small reverberant rooms. However, a psychoacoustic localization test conducted using a standard 5-channel set-up with subwoofers showed that the listeners were able to determine the lateral displacement (left, center, or right) of the loudspeaker presenting the test stimulus (an octave-band noise burst at 31.5 Hz, 63 Hz, or 125 Hz center frequency). Using a binaural model simulating human perception, recordings of subwoofer signals at different positions were analyzed. As expected, the interaural level differences remained nearly constant for different subwoofer positions in the low-frequency range. On the basis of interaural time differences, however, the model was able to predict the position of the loudspeaker in the left-right dimension, confirming the outcome of the listening test. The results indicate the importance of considering more than one subwoofer in multi-channel audio systems.

1. INTRODUCTION

In its recommendation ITU-R BS.775 for multichannel stereophonic sound systems, the International Telecommunication Union (ITU) proposes the use of a single loudspeaker for the low-frequency range below � Hz as a cost-effective solution. While such an approach is generally accepted for home entertainment systems, it should be questioned whether this procedure has audible disadvantages when used in studio monitoring systems and other high-end applications.

Several recent articles address this topic from different perspectives, and different conclusions were drawn. Kügler and Theile [�] investigated whether substantial differences in sound are perceived between a full-range �-channel sound system and a �-channel sound system that are identical except that the latter uses one subwoofer. The authors conclude from the outcome of an AB comparison test with different music samples that the differences are negligible if the crossover frequency is under � Hz. Zacharov et al. [�] came to a similar conclusion and note that, for domestic environments, the loudspeaker set-up is less critical for a one-subwoofer system (tested crossover frequency: �� Hz), and that the audible differences between the experimental one-subwoofer case and the multiple-subwoofer case were found to be small. In the opinion of these authors, the use of multiple subwoofers can, on the other hand, lead to "spatial and temporal smearing" if the subwoofers are not properly aligned.

Noussaine [�] and Welti [�] draw the conclusion that a multiple-subwoofer set-up is advantageous over a one-subwoofer set-up after examining both options with respect to room-mode excitation. Both authors believe that the use of multiple subwoofers leads to a more "even" or "balanced" excitation of room modes. Griesinger [�] and Martens [�] conclude that multiple-subwoofer arrays can be useful in decorrelating low-frequency sounds, which is important for generating the impression of apparent source width (ASW) and listener envelopment (LEV).

When discussing spatial properties of sound, sound localization is probably the most frequently discussed issue in this field. The first systematic investigations date back to the late 19th century. With regard to low-frequency localization in loudspeaker arrays, however, our knowledge is still incomplete. This paper addresses the question of whether the multichannel information of sound reproduction systems should be extended to low frequencies for reasons of localizability. For this purpose, a localization experiment was conducted, which is described in the next section, followed by an extensive model simulation to support the psychoacoustic findings (Section 3).

2. PSYCHOACOUSTIC EXPERIMENT

While designing the listening test, we found it important to maintain the typical features of stereophonic listening situations so that our conclusions would be applicable for the practicing audio engineer. For this reason, we chose not to hide the architecture of the speaker set-up (e.g., by blindfolding the listeners), nor did we restrict the listeners from using head movements. In the test, the listeners were asked to indicate the speaker closest to the direction of the auditory event. Strictly speaking, our test resembled a source identification test rather than a pure localization experiment. Nonetheless, our aim was to show whether true multichannel sound should be provided at low frequencies for reasons of localizability, and not to investigate the general performance of the auditory system when localizing low-frequency sounds.

2.1. Methods

2.1.1. Listeners

Four listeners, including the first two authors (one female, three males), participated in the experiment. Their ages ranged from � to �� years. None of the listeners had known hearing disorders.

2.1.2. Apparatus and Stimuli

In the source identification test, the signals were presented to the listeners through subwoofer loudspeakers. The subwoofers were custom-built by Bang & Olufsen for use with the Beolab � loudspeakers, but in the test only the subwoofers were addressed. The loudspeakers were set up in MARLAB (Multi-Channel Audio Research Laboratory) at McGill University according to ITU recommendation ITU-R BS.775 at the following azimuth angles: �� and at a radius of �� m. The speakers were placed on the floor, and the center of the driver (�-in diameter) was at a height of � cm. The following reverberation times were measured for MARLAB �t �� Hz�� Hz�� Hz��

[Figure 1: scatter plots, one column per listener (L1 to L4) and one row per center frequency (fc = 31.5 Hz, 63 Hz, 125 Hz). x-axis: presented speaker (sl, l, c, r, sr); y-axis: localized speaker (sl, l, c, r, sr). Legend: circle sizes for 100%, 50%, 25%, 10%, 5%, and 1% of responses.]

Fig. 1: Localization performance of four listeners (L1-L4) localizing low-frequency octave-band noise bursts at different center frequencies as indicated in the right-hand labels.

The test stimuli were generated in Matlab on a personal computer (Hewlett Packard Pavilion a� n, Windows XP). For this purpose, white-noise bursts of � � ms duration were created at a sampling frequency of �� kHz and �-bit resolution. Afterwards, the signals were filtered with an octave-wide FIR filter (�� coefficients) at different center frequencies (31.5 Hz, 63 Hz, and 125 Hz). The signals had �-ms cos² onset and offset ramps to prevent the listeners from using "spectral widening" onset cues. The maximum sound pressure level of the loudspeaker signals was set to � dB(C), measured at the center position of the set-up. In the ��-Hz center frequency condition, the sound pressure level had to be reduced to �� dB(C) to avoid audible non-linear effects of the speakers. Each loudspeaker was calibrated individually using a mixing console (Yamaha �D). A level rove (-3 dB, 0 dB, and +3 dB) was included to prevent the listeners from identifying the loudspeaker signal by its sound pressure level.
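The stimulus shaping described in this section can be illustrated with a minimal Python sketch that applies cos²-shaped onset and offset ramps to a white-noise burst and then scales it by a roved level. The sampling rate, burst duration, and ramp duration below are placeholder values chosen for illustration, not the parameters of the experiment; the function names are hypothetical, and the octave-band FIR filtering step is omitted.

```python
import math
import random

def cos2_ramped_burst(n_samples, n_ramp, seed=0):
    """White-noise burst with squared-cosine onset and offset ramps."""
    rng = random.Random(seed)
    burst = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    for i in range(n_ramp):
        # Gain rises smoothly from 0 towards 1 (sin^2 = 1 - cos^2 shape).
        gain = math.sin(0.5 * math.pi * i / n_ramp) ** 2
        burst[i] *= gain                  # onset ramp
        burst[n_samples - 1 - i] *= gain  # offset ramp (mirror image)
    return burst

def apply_level_rove(signal, rove_db):
    """Scale a signal by a level offset given in decibels."""
    factor = 10.0 ** (rove_db / 20.0)
    return [factor * x for x in signal]

# Assumed illustration values, not the parameters of the experiment.
FS = 8000            # sampling rate in Hz
N = FS // 2          # 500-ms burst
RAMP = FS // 50      # 20-ms ramps

burst = cos2_ramped_burst(N, RAMP)
roved = {db: apply_level_rove(burst, db) for db in (-3.0, 0.0, 3.0)}
```

Keeping the rove as a separate scaling step means the same burst can be reused at all three levels, so that level is the only property that differs between the roved versions.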

2.1.3. Procedure

The experiment was divided into several sessions. In each session, the center frequency was kept constant, and only the azimuth of the signal was varied pseudo-randomly. The session started with a training phase in which the listener could become familiar with the experimental procedure. After a stimulus had been presented, the listener reported the direction of the auditory event on a graphical interface by clicking, with a mouse, one of five buttons representing the five loudspeakers. In those cases in which the auditory event did not match the position of any speaker, the listeners were asked to indicate the speaker closest to the auditory event.

After the response, the next stimulus was presented. At the end of the training phase, the recording of the listener's responses began. In each session, every stimulus was presented twelve times (four times for each of the three roved levels: -3 dB, 0 dB, and +3 dB). The duration of each session was about five minutes. No feedback was provided to the listeners during either the training phase or the recording phase. Listeners 1 and 2 participated in five sessions for each of the three center frequencies. Listeners 3 and 4 took part in only two sessions each for the two highest center frequencies (63 Hz and 125 Hz).
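The session structure described above (five speaker positions, three roved levels, four repetitions per level, pseudo-random order) can be sketched as follows. The identifiers and the seeding are assumptions made for this illustration; the original experiment software is not documented in the paper.

```python
import random

SPEAKERS = ["sl", "l", "c", "r", "sr"]   # the five loudspeaker positions
ROVE_DB = [-3, 0, 3]                     # roved relative levels in dB
REPEATS_PER_LEVEL = 4                    # four presentations per level

def build_session(seed=None):
    """Return a pseudo-randomly ordered list of (speaker, rove_dB) trials."""
    trials = [(spk, db)
              for spk in SPEAKERS
              for db in ROVE_DB
              for _ in range(REPEATS_PER_LEVEL)]
    random.Random(seed).shuffle(trials)
    return trials

session = build_session(seed=1)
print(len(session))   # 5 speakers x 3 levels x 4 repeats = 60 trials
```

With twelve presentations per speaker position, one session contains 60 trials, which fits a session length of about five minutes at roughly five seconds per trial.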


[Figure 2: two bar charts, titled "left side" and "right side". x-axis: listener ID (1 to 4); y-axis: correct answers [%] (0 to 100).]

Fig. 2: Percentage of correctly identified front-back directions. The left graph shows the results for the left and surround-left subwoofers, the right graph for the right and surround-right subwoofers. The black bars show the 31.5-Hz center frequency condition, the gray bars the 63-Hz center frequency condition, and the white bars the 125-Hz center frequency condition. In both plots, the horizontal dot-dashed line marks the percent-correct level above which performance is significantly better than chance would allow.

2.2. Results

The results of the localization experiment are shown separately for each of the four listeners (column-wise) and for the different center frequencies (row-wise) in Fig. 1. The data for all sound-pressure-level settings were combined to construct the response percentages plotted in this graph. The x-axis depicts the presented speaker positions: surround left (sl), left (l), center (c), right (r), and surround right (sr). The y-axis shows the reported speaker locations. The size of each black circle is proportional to the number of times the speaker position given by the y-coordinate of the circle's center was chosen for the presented speaker position given by the x-coordinate of the circle's center. A legend on the top right provides the relationship between circle size and the percentage of answers that were given for this combination of presented and perceived direction. The top-left panel, for example, shows the localization results for Listener 1 (L1) for the 31.5-Hz center frequency condition. If the listener had been able to identify all speaker positions correctly, we would expect five large circles aligned along the diagonal, where the localized and presented speaker positions correspond. For the center speaker, the listener's response pattern is quite close to this case, but for all remaining speakers we observe a number of confusions between the front and the corresponding surround speakers, e.g., left/surround-left and right/surround-right. The remaining panels in Fig. 1 show very similar patterns. Evidently, the listeners are fairly good at discriminating in the left-right dimension, but do not do so well in distinguishing between front and back. Only two significant exceptions were found: the ��-Hz center frequency conditions for L2 and L�. In this condition, both listeners are able to discriminate between the left speaker and the surround-left speaker and, to some extent, also between the right speaker and the surround-right speaker.

[Figure 3: three bar charts, titled "left, center, right", "surround left, surround right", and "left, right". x-axis: listener ID (1 to 4); y-axis: correct answers [%] (0 to 100).]

Fig. 3: Percentage of correctly identified left-right directions. The top-left graph shows the results for the left, center, and right subwoofers; the top-right graph for the surround-left and surround-right subwoofers; and the bottom-left graph for the left and right subwoofers. The black bars show the 31.5-Hz center frequency condition, the gray bars the 63-Hz center frequency condition, and the white bars the 125-Hz center frequency condition.

To elaborate on the details of this effect further, the data of the localization experiment are shown in the graph on the left side of Fig. 2. Each bar gives the percentage of correct responses given for the left and the surround-left speakers. The dashed and the dashed-dotted lines show the � � threshold of chance and the �� detection threshold, respectively. As Fig. 1 already revealed, only for L2 and L� do the percentages of correct responses clearly exceed the �� threshold in the ��-Hz condition. In two further cases, the percentage of correct responses is just above the �� threshold (L�, ��-Hz condition, and L�, ��-Hz condition). In all other cases, the percentages of correct responses are below the �� threshold. The right graph of Fig. 2 shows the percentages of correct responses given for the right and the surround-right speakers, in the same manner as the left graph of the same figure. Here, only in two cases did the percentages of correct responses exceed the �� threshold: L2 for the ��-Hz condition and L� for the ��-Hz condition.
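A significance line of the kind drawn in Fig. 2 can be motivated with a one-sided binomial test. The sketch below assumes a two-alternative (front versus back) decision with a chance probability of 0.5 and a 5% significance level; these values, the trial count in the example, and the function name are assumptions for illustration and are not taken from the paper.

```python
from math import comb

def min_correct_above_chance(n_trials, p_chance=0.5, alpha=0.05):
    """Smallest number of correct answers k with P(X >= k) <= alpha
    for X ~ Binomial(n_trials, p_chance).
    Returns n_trials + 1 if even a perfect score is not significant."""
    tail = 0.0
    # Accumulate the upper tail probability from k = n_trials downwards.
    for k in range(n_trials, -1, -1):
        tail += comb(n_trials, k) * p_chance**k * (1 - p_chance)**(n_trials - k)
        if tail > alpha:
            return k + 1
    return 0

n = 24  # hypothetical number of front/back trials for one listener
k = min_correct_above_chance(n)
print(f"{k} of {n} correct ({100 * k / n:.1f}%) needed to exceed chance")
```

Because the threshold depends on the number of trials, listeners who completed fewer sessions need a higher percentage of correct answers to reach the same significance level.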

Regarding the left-right dimension, the percentages of correct responses depicted in the top-left graph of Fig. 3 are near the possible maximum of 100%, indicating that the listeners did not face any difficulties judging the lateral displacement of the sound sources. In this figure, the data were collapsed across the stimulus presentations over the three front speakers. Those cases in which the listeners falsely indicated the surround-left speaker instead of the front-left speaker, and the cases in which the listeners reported the surround-right speaker instead of the front-right speaker, were counted as correct responses because the sidedness of the judgment (left, center, or right) remained correct. The left-center-right discrimination task should be easier with the two surround loudspeakers, since larger ITDs occur here. However, the performance cannot get much better, and only in the case with the lowest percentage of correct answers (L�, ��-Hz center frequency condition) was this value noticeably higher for the surround speaker case (Fig. 3, top-right graph). This observation is not affected when we compare the responses for both surround speakers to those for the front-left and front-right speakers only (Fig. 3, bottom-left graph).
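The left-right scoring rule described above, in which a front/surround confusion on the same side still counts as correct, can be written down compactly. The following sketch uses made-up responses and a hypothetical function name; it only illustrates the collapsing of speaker labels into sides.

```python
# Map each speaker label to its side of the listener.
SIDE = {"sl": "left", "l": "left", "c": "center", "r": "right", "sr": "right"}

def percent_side_correct(trials):
    """trials: iterable of (presented, reported) speaker labels.
    A response counts as correct when it lies on the same side
    (left, center, or right) as the presented speaker."""
    trials = list(trials)
    hits = sum(SIDE[p] == SIDE[r] for p, r in trials)
    return 100.0 * hits / len(trials)

# Made-up responses for illustration only.
example = [("l", "sl"), ("l", "l"), ("c", "c"), ("r", "c"), ("r", "sr")]
print(percent_side_correct(example))  # 80.0
```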

When analyzing the data, we were also interested in whether the variation of the sound-pressure level changes the performance of the listeners. Figure 4 shows the results of the localization experiment for L1 (first two columns from left) and L2 (first two columns from right). The scatter plots are shown in the same way as in Fig. 1, but now each level setting is depicted separately. To save space, we only show the -3 dB and +3 dB level conditions for L1 and L2. Recall that L3 and L4 participated in only two instead of five sessions for each center frequency. The corresponding plots for the -3 dB and +3 dB level conditions look very similar, with two exceptions. First, L1 seems to have fewer confusions between the surround speakers and the matching front left/right speakers in the 31.5-Hz center frequency condition (first two top panels from left). Second, L2 appears to indicate the front-right speaker instead of the surround-right speaker when the relative level was set to �� dB. One possible explanation for this degradation in performance is that L2 was monitoring monaural level cues and expected the level in both surround speakers to be lower than in the frontal speaker, as would result from shadowing effects of the head.

[Figure 4: scatter plots as in Fig. 1 for L1 and L2, with separate columns for the -3 dB and +3 dB level settings and one row per center frequency (fc = 31.5 Hz, 63 Hz, 125 Hz). x-axis: presented speaker (sl, l, c, r, sr); y-axis: localized speaker (sl, l, c, r, sr).]

Fig. 4: Same as Fig. 1, but with separate plots for different level settings (Listeners 1 and 2 only).

In the left graph of Fig. 5, the percentages of correct responses regarding the front-back direction are shown. The method of presenting these data is analogous to Fig. 2, except that this time the data for the left and right speakers were combined and, more importantly, all three level settings are now shown separately. For this reason, the shading of the bars indicates the relative sound pressure level setting (-3 dB, 0 dB, and +3 dB) of the stimuli rather than their center frequency. The latter is now depicted in separate rows as indicated by the right-hand labels. The only noteworthy change in the response patterns due to level variation of the stimuli is the previously mentioned case of L1 in the 31.5-Hz center frequency condition. Some improvement is also observable for L� in the ��-Hz condition.

In concluding this section, the percentage of correct responses regarding the left-right direction is depicted in the right graph of Fig. 5 in an analogous way to Fig. 3, but again all three level settings are shown separately. As already observed in Figs. 4 and 5 (left graph), the level variation has only a minor influence on the performance of the listeners. Only in three out of ten cases was the improvement at the highest level above ��% compared to the lowest level (L�, ��-Hz center frequency; L�, ��-Hz center frequency; and L�, ��-Hz center frequency).

[Figure 5: two columns of bar charts, titled "front-back discrimination" (left) and "left-right discrimination" (right), each with one row per center frequency (fc = 31.5 Hz, 63 Hz, 125 Hz). x-axis: listener ID (1 to 4); y-axis: correct answers [%] (0 to 100).]

Fig. 5: Left graph: percentage of correctly identified front-back directions for different level settings; right graph: percentage of correctly identified left-right directions for different level settings. In both graphs, the black bars show the -3 dB level rove condition, the gray bars the 0 dB level rove condition, and the white bars the +3 dB level rove condition.

3. MODEL ALGORITHMS

3.1. Introduction

Wightman and Kistler showed in a localization test that interaural time difference (ITD) cues dominate interaural level difference (ILD) cues for natural ITD/ILD combinations if low frequencies are present [�]. For that reason, our analysis focuses on cross-correlation algorithms based on ITD cues. In addition, stages simulating the auditory periphery and an algorithm for estimating the target position, which combines the outputs of the individual frequency bands, were implemented.

3.2. Stimuli

In principle, the same test signals as in the psychoacoustic experiment were used to evaluate the model. A dummy head (Brüel & Kjær) served as the "ears" for the binaural model. For this purpose, the binaural room impulse response (BRIR) was measured between the dummy head, placed in the center, and a subwoofer loudspeaker (Yamaha SW � �) positioned sequentially at all five locations of the surround set-up. For the impulse-response measurement, the Yamaha subwoofer was chosen over the Bang & Olufsen subwoofer for the simple reason that it allowed higher sound pressure levels, which results in a better signal-to-noise ratio of the measured impulse responses. On the other hand, the Yamaha speaker could not be used in the listening test, because not enough speakers were available for a five-channel set-up. Nevertheless, the model and the psychoacoustic test environments should be comparable, since the room and the loudspeaker positions were the same. The impulse responses were measured using custom-built software based on a cross-correlation algorithm and MLS signals, which was executed on a personal computer with an external sound device (Gateway Pentium �, Mark of the Unicorn MOTU �). Afterwards, the same signals that were previously used in the listening test were convolved with the BRIRs and then analyzed using the binaural model.

[Figure 6: two waveform panels, titled "left channel" and "right channel". x-axis: Time [ms] (0 to 100); y-axis: Rel. Amplitude (-1 to 1).]

Fig. 6: Binaural room impulse response measured between the right speaker (�� azimuth, �� elevation) and a dummy head placed in the center of the loudspeaker set-up.

A second set of BRIRs was measured in the TV studio of the Instructional Communication Centre (ICC studio) at McGill University. The ICC studio has the following reverberation times �t �� Hz�� Hz�� Hz�� For this measurement, a �-channel sound reproduction system was used, which was developed within the Valorisation-recherche Quebec (VRQ) project "Real-time Communication of High-resolution Multi-sensory Content via Broadband Networks". The system is basically an augmented standard surround system with � vertically aligned rings of custom-built ribbon loudspeakers. Five subwoofers (D-Box Mini Mammoth), which were set up on the studio floor in a standard ITU 5-channel configuration, are used to reproduce low frequencies. Full-frequency-range room impulse responses were measured using the subwoofers and those ribbon speakers that were mounted directly above them. Again, the dummy head was positioned in the center of the loudspeaker array. Figure 6 shows an example of a measured BRIR.

[Figure 7: block diagram. The left and right channels each pass through a bandpass filterbank and a hair-cell simulation (HC), together representing the inner ear. For each frequency band (1st, 2nd, ..., i-th, ..., n-th), the two sides feed a cross-correlation stage (CC), and the cross-correlation outputs feed a decision device representing the central nervous system.]

Fig. 7: General model structure of the localization algorithm.

3.3. Model Structure

Stages simulating the auditory periphery were implemented in the cross-correlation model (Fig. 7) in order to simulate human hearing with sufficient accuracy. The model is similar to an earlier model proposed by Blauert and Cobben [�]. The transformation from the sound sources to the eardrums is taken into account by filtering the sounds with BRIRs from a specific direction, as described in the previous section. Basilar-membrane and hair-cell behavior are simulated with a gammatone filter bank of �� bands at a sampling frequency of �� kHz (�-bit resolution), as described by Patterson et al. [�], and a simple half-wave rectification. With the exception of Fig. 8, only the low-frequency bands � to � (�� Hz to �� Hz) are analyzed.


[Figure 8: two panels (top and bottom). x-axis: ITD [ms] (-1.5 to 1.5); y-axis: Freq. [Hz] (47 to 3105).]

Fig. 8: Demonstration of the decompression algorithm that was introduced in the cross-correlation model to sharpen the peaks by taking the cross-correlation function to the power of � �. The top graph shows the output of the cross-correlation model without the decompression stage, the bottom graph with the decompression stage.

After the half-wave rectification, the interaural cross-correlation was estimated within each frequency band over the whole stimulus duration $t_2 - t_1$:

$$\Psi_{Y_{l,r}}(f, \tau) = \frac{1}{t_2 - t_1} \sum_{t=t_1}^{t_2} Y_l(f, t)\, Y_r(f, t + \tau) \qquad (1)$$

To observe a cross-correlation peak at low frequencies in the ITD range from �� ms to �� ms, the cross-correlation functions had to be decompressed. This was done by taking the cross-correlation function to the power of �. Otherwise, the cross-correlation functions would have resulted in a plateau. The difference the decompression stage makes is shown in Fig. 8. Since we are primarily interested in how the auditory system localizes low-frequency signals, we chose a constant decompression rate. In a full-range model, the decompression rate could be adapted with decreasing values at higher frequencies, such that the widths of the cross-correlation peaks correlate with variables like apparent source width (ASW) or listener envelopment (LEV). Unlike in previous studies by the first author [�], the cross-correlation functions are not remapped from ITD-based to azimuth-based functions before the decision device, for two reasons. First, the actual location of the auditory event is not of primary interest since, strictly speaking, the aim is to predict the outcome of a sound-source identification experiment rather than to simulate a pure localization test. Therefore, it is sufficient to show that the model can or cannot discriminate between different loudspeaker positions in a similar way to the performance of human listeners. The model should assign the test stimuli to speakers with an accuracy similar to that found in the listening test.
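The chain of half-wave rectification, interaural cross-correlation, and decompression can be illustrated for a single frequency band with a minimal Python sketch. It is not the authors' implementation: the gammatone filter bank is replaced by a pure tone, and the sampling rate, tone frequency, interaural delay, lag range, and in particular the decompression exponent are assumed values chosen for illustration. In this sketch a positive lag means that the right-ear signal lags the left-ear signal.

```python
import math

def half_wave(x):
    """Half-wave rectification: keep positive values, zero the rest."""
    return [v if v > 0.0 else 0.0 for v in x]

def cross_correlation(yl, yr, max_lag):
    """Mean of yl[t] * yr[t + lag] over a fixed window, for each lag."""
    t1, t2 = max_lag, len(yl) - max_lag   # keeps t + lag inside the signal
    lags = list(range(-max_lag, max_lag + 1))
    psi = [sum(yl[t] * yr[t + lag] for t in range(t1, t2)) / (t2 - t1)
           for lag in lags]
    return lags, psi

def decompress(psi, exponent):
    """Normalize to a peak of one, then raise to a power to sharpen it."""
    peak = max(psi)
    return [(v / peak) ** exponent for v in psi]

# Assumed illustration values (not the parameters used in the paper).
FS = 8000          # sampling rate in Hz
F0 = 62.5          # test tone in Hz (128 samples per period at FS)
DELAY = 4          # right ear lags the left by 4 samples = 0.5 ms
MAX_LAG = 12       # search range of +/-1.5 ms
N = 16 * 128 + 2 * MAX_LAG   # analysis window spans 16 full periods
EXPONENT = 8       # hypothetical decompression exponent

left = [math.sin(2 * math.pi * F0 * t / FS) for t in range(N)]
right = [math.sin(2 * math.pi * F0 * (t - DELAY) / FS) for t in range(N)]

lags, psi = cross_correlation(half_wave(left), half_wave(right), MAX_LAG)
sharp = decompress(psi, EXPONENT)
itd_ms = 1000.0 * lags[psi.index(max(psi))] / FS
print(itd_ms)   # estimated ITD in ms
```

For this low-frequency tone, the normalized cross-correlation stays above roughly 0.75 across the whole lag range, which corresponds to the plateau mentioned in the text; raising it to a power leaves the peak position unchanged but narrows the region close to the maximum.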

In the decision device, the average of the interaural cross-correlation (ICC) functions is calculated and normalized to one. The ICC peak in each frequency band is scaled beforehand to the sound pressure level measured in this band. The model estimates the sound source's direction of incidence from the position of the local peak of the averaged ICC function.
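One possible reading of this decision stage is sketched below: each band's function is scaled so that its peak equals a band weight, the scaled functions are averaged, the average is normalized to one, and the lag of the maximum is returned. How the sound pressure level enters the scaling (for example, as a linear or a logarithmic weight) is not specified in the text, so the linear weights, the function name, and the toy data are assumptions.

```python
def decision_device(lags, band_functions, band_weights):
    """Combine band-wise cross-correlation functions and pick the peak lag."""
    scaled = []
    for psi, w in zip(band_functions, band_weights):
        peak = max(psi)
        # Scale each band so that its peak equals the band weight.
        scaled.append([w * v / peak for v in psi])
    n_bands = len(scaled)
    avg = [sum(band[i] for band in scaled) / n_bands
           for i in range(len(lags))]
    top = max(avg)
    avg = [v / top for v in avg]                 # normalize the average to one
    best = max(range(len(avg)), key=avg.__getitem__)
    return lags[best], avg

# Toy data: two bands whose peaks sit at different lags.
lags = [-2, -1, 0, 1, 2]
bands = [[0.1, 0.4, 1.0, 0.4, 0.1],
         [0.2, 0.9, 0.6, 0.2, 0.1]]

lag_a, avg_a = decision_device(lags, bands, [1.0, 0.5])  # first band dominates
lag_b, avg_b = decision_device(lags, bands, [0.2, 1.0])  # second band dominates
print(lag_a, lag_b)   # 0 -1
```

The example shows why the band weighting matters: the same two band functions lead to different estimated positions depending on which band carries more level.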

3.4. Results

The outputs of the cross-correlation algorithm in different frequency bands are shown in Fig. 9. For both the test stimuli at ��-Hz and ��-Hz center frequencies and for each frequency band, we observe a clearly visible shift of the cross-correlation peaks as we move the speaker position from the center speaker to the surround-left speaker via the left speaker (top to bottom row). In the active region, the peaks of the cross-correlation functions for different frequency bands are more or less aligned at the same internal delay (ITD). Naturally, not only does the ��-Hz stimulus activate lower frequency bands than is observed for the ��-Hz stimulus, but its cross-correlation functions also appear to be wider, because the width of the cross-correlation peak depends on the wavelength of the analyzed signal.

[Figure 9: six panels; rows for the center, left, and surround-left speaker positions, columns for two center frequencies (��-Hz and ��-Hz). x-axis: ITD [ms] (-1.5 to 1.5); y-axis: Freq. [Hz] (23 to 176).]

Fig. 9: Outputs of the cross-correlation algorithm for different speaker positions in MARLAB (row-wise) and stimuli with different center frequencies (column-wise).

Figure 10 shows the results of the model simulation after the cross-correlation functions were summed over all frequency bands. According to the implementation of the decision device, the estimated stimulus position corresponds to the maximum of the cross-correlation peak. Since we are not interested in the height of the peak, the peaks of the cross-correlation functions were normalized to one for better readability of the graphs. The figure shows all simulated test conditions and not only the ones previously depicted in Fig. 9 (from top to bottom: 31.5-Hz, 63-Hz, and 125-Hz center frequencies). In each graph, the average cross-correlation function for each speaker position is shown. The legend in the graph for the ��-Hz center frequency condition, which is valid for all three frequency conditions, gives the line type used for each speaker location. Again, it is obvious that the peak of the cross-correlation functions widens with decreasing frequency. It might be surprising to some readers that the cross-correlation peak for the center speaker is not always located at 0 ms ITD (��-Hz and ��-Hz center frequency conditions), and that the peaks for the left and right speakers (and, respectively, the surround-left and surround-right speakers) are not equidistant from the position for the center speaker. However, we should consider that the room reflections present can account for the observed asymmetrical effects.

[Figure 10: three panels, one per center frequency (fc = 31.5 Hz, 63 Hz, 125 Hz), each with curves for sl, l, c, r, and sr. x-axis: ITD [ms] (-1.5 to 1.5); y-axis: ICC amp. (0 to 1).]

Fig. 10: Performance of the cross-correlation model for the lateralization of an octave-band-wide noise burst for different speaker positions in MARLAB (sl = surround left, l = left, c = center, r = right, and sr = surround right, as indicated in the legend) and center frequencies (from top to bottom row: 31.5 Hz, 63 Hz, 125 Hz).

In order to investigate the effect of reverberation more thoroughly, the simulation was rerun, this time using the BRIRs from the measurement in the ICC studio. The results are shown in Fig. 11 (left column). They are plotted in the same way as in the previous figure. Since the ICC studio is more reverberant than MARLAB, greater deviations are to be expected. Most interesting is the fact that in the ��-Hz condition the peak position for the center speaker is found to be left of the peak position for the left speaker, and in the ��-Hz condition, the peak position for the surround-right speaker is located left of the peak position for the right speaker. The peak positions for the center speaker and the right speaker nearly match for the ��-Hz condition.

In the next step, the ICC-studio BRIRs for all measured positions were truncated just before the arrival of the first prominent early reflection. The first prominent early reflection usually arrived on the order of ten milliseconds after the direct sound. The idea was to eliminate the room reflections as well as possible. After truncating the BRIRs, the simulation was rerun, and the results are depicted in the right column of Fig. 11. In comparison to the reverberant condition, the positions of the peaks are now more symmetrically spaced, and the peak positions for the center speaker are closer to the midline. In addition, the peak widths decreased in the absence of the reflections, and the peaks moved further outward. A look at the cross-correlation functions that were measured in the individual frequency bands reveals that the peaks are more aligned after the reflections have been removed (Fig. 12).
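The truncation step can be sketched as follows. The sketch assumes that the direct sound is the sample with the largest magnitude and that a fixed window of ten milliseconds after it is kept; the paper instead truncates just before the first prominent reflection of each measurement, so the fixed window, the function name, and the toy impulse responses are simplifications for illustration. Both ear signals are cut at the same sample so that their relative timing, and hence the ITD, is preserved.

```python
def truncate_brir(left, right, fs, window_ms=10.0):
    """Cut both ear signals at the same sample index, window_ms after
    the earlier of the two direct-sound peaks."""
    def direct_index(ir):
        # Assumption: the direct sound is the largest-magnitude sample.
        return max(range(len(ir)), key=lambda i: abs(ir[i]))
    first = min(direct_index(left), direct_index(right))
    cut = first + int(round(window_ms * fs / 1000.0))
    return left[:cut], right[:cut]

# Toy impulse responses: direct sound plus one reflection 15 ms later.
FS = 1000                      # 1 sample per millisecond, for readability
left = [0.0] * 40
right = [0.0] * 40
left[5], left[20] = 1.0, 0.6   # direct sound at 5 ms, reflection at 20 ms
right[6], right[21] = 0.9, 0.5 # same events, 1 ms later at the far ear

tl, tr = truncate_brir(left, right, FS)
print(len(tl), len(tr))        # 15 15
```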

Concluding the model analysis, the ILDs for each speaker position, estimated in each frequency band, are shown in Fig. 13. The legend of the graph provides the line type used for each speaker location. The measured ILDs, with maximum magnitudes of approximately two decibels, remain very low in the plotted frequency range, and the curves for the three frontal speakers cross each other several times.

4. DISCUSSION

One of the strongest arguments for the use of only one subwoofer in a standard two- or five-channel loudspeaker set-up is the inability of humans to localize low-frequency sounds, which is often believed to occur in reverberant listening spaces. The outcome of the psychoacoustic experiment gives evidence that this is not the case for the listening space of the experimental set-up. Even though the listeners had difficulties in discriminating between the left and surround-left speakers, and between the right and surround-right speakers, they showed no difficulty in determining whether the sound arrived from the left, center, or right.

The model simulation was established to decide whether the psychoacoustic findings are in line with our present knowledge of how the auditory system operates, or whether our psychoacoustic results cannot yet be explained theoretically. It should be pointed out that the human ability to localize low-frequency sounds is not questioned in general.




[Figure: six panels of cross-correlation amplitude (ICC amp., 0.00 to 1.00) versus ITD [ms] (−1.50 to 1.50), with peaks labeled sl, l, c, r, sr. Columns: with reverb (left), without reverb (right). Rows: fc = 31.5 Hz, 63 Hz, 125 Hz.]

Fig. �. Model performance of the cross-correlation model for the lateralization of an octave-band-wide noise burst for different speaker positions in the ICC studio (sl = surround left, l = left, c = center, r = right, and sr = surround right, as indicated in the legend) and center frequencies (from top to bottom row: 31.5 Hz, 63 Hz, 125 Hz). The left graph shows the results for the complete impulse responses (including reverberation); in the right graph, the impulse responses were truncated to eliminate early reflections and reverberation.

For example, low-frequency ILDs cause a great lateral displacement of the auditory event. What is questioned, rather, is whether the cues provided in a real environment are sufficient for the auditory system to estimate the position of the sound source. For low frequencies, the ILDs are very small, because the head dimensions are much smaller than the wavelength of the sound, and head-shadowing effects are no longer effective. For ITDs, on the other hand, reverberation can distort the cue, and furthermore, the physically measured phase differences become relatively small. An ITD of � ms, which is determined by the distance between both eardrums (approximately � cm) and which is about the natural limit, results in phase differences of � at � Hz center frequency, � at � Hz center frequency, and � at � Hz center frequency. More importantly, the peak of the cross-correlation function becomes very wide at low frequencies. The model results, however, show that the sounds can still be localized relatively well, which becomes more apparent after the cross-correlation peaks were decompressed to effectively reduce their width.
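The relation between an ITD and the resulting interaural phase difference of a tone is phase = 360° × f × ITD. The sketch below evaluates it, together with the wavelength in air, for the three octave-band center frequencies of the simulation (31.5, 63, and 125 Hz). The ITD of 0.5 ms and the speed of sound of 343 m/s are assumptions chosen for illustration only; they are not the values used in the paper.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def phase_difference_deg(itd_s, freq_hz):
    """Interaural phase difference (degrees) of a tone at freq_hz for a given ITD."""
    return 360.0 * freq_hz * itd_s


def wavelength_m(freq_hz):
    """Wavelength in air at freq_hz."""
    return SPEED_OF_SOUND / freq_hz


ASSUMED_ITD_S = 0.5e-3  # hypothetical ITD for illustration, not the paper's value

for fc in (31.5, 63.0, 125.0):
    print(f"{fc:6.1f} Hz: phase = {phase_difference_deg(ASSUMED_ITD_S, fc):5.2f} deg, "
          f"wavelength = {wavelength_m(fc):5.2f} m")
```

For any ITD within the natural range, the phase difference stays a small fraction of a cycle at these frequencies, and the wavelengths are several meters, far larger than the head.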

[Figure: four panels showing the cross-correlation output in each frequency band (Freq. [Hz], bands from 23 to 176 Hz) versus ITD [ms] (−1.50 to 1.50). Rows: left, surround-left. Columns: with reverb (left), without reverb (right).]

Fig. �. Outputs of the cross-correlation algorithm for different speaker positions in the ICC studio (row-wise) for octave-band-wide noise bursts at � Hz center frequency. The left graphs show the results for the complete impulse responses (including reverberation). In the right graphs, the impulse responses were truncated to eliminate early reflections and reverberation.

In the model simulation, the cross-correlation peak varied with the speaker position in such a way that the lateral displacement of the peak always matched the speaker position tested. The order of lateral displacement of the speakers from the center listening position (surround left, left, center, right, and surround right) was maintained by the peak positions of the cross-correlation functions. In particular, the distance between the peaks for the front-left and front-right speakers and those for the corresponding surround speakers was of the same order as their distance to the peak for the center speaker. This finding suggests that the listeners should be able to distinguish between the left and the surround-left speakers, and between the right and surround-right speakers, if they can distinguish between the left, center, and right speakers. At this point, one can only speculate why such a response pattern was not observed in most listening conditions. One explanation is that the density of coincidence cells is believed to be larger for small ITDs than for large ITDs [�], and therefore the ability to discriminate between ITDs decreases with the ITD magnitude. The assumption that the listeners cannot remap the ITDs of such low frequencies to a distinct azimuth value might serve as a second explanation.
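The peak positions discussed above come from an interaural cross-correlation analysis. The following sketch shows only the core idea, finding the lag at which the cross-correlation of the two ear signals is largest. It is not the model used in this paper, which operates on the outputs of a gammatone filter bank and includes further processing stages. The function name, the sign convention and the noise test signal are assumptions; the ±1.5 ms search window mirrors the ITD axis of the figures.

```python
import math
import random


def itd_from_cross_correlation(left, right, fs, max_lag_ms=1.5):
    """Return the lag (in ms) that maximises the cross-correlation of two ear signals.

    Positive values mean the right-ear signal lags the left-ear signal
    (sign convention chosen for this sketch).
    """
    max_lag = int(round(max_lag_ms * 1e-3 * fs))
    n = len(left)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        if acc > best_val:
            best_lag, best_val = lag, acc
    return 1e3 * best_lag / fs


# Hypothetical test signal: noise that reaches the right ear 0.5 ms later.
rng = random.Random(0)
fs = 8000
left = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
delay = 4  # samples, i.e. 0.5 ms at 8 kHz
right = [0.0] * delay + left[:-delay]
estimated_itd_ms = itd_from_cross_correlation(left, right, fs)
```

With broadband noise the peak is sharp and unambiguous. For the octave-band, low-frequency stimuli considered here the peak is much broader, which is why the peak width matters for discriminating neighboring speaker positions.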

Another interesting finding should be noted. In the � Hz center-frequency condition of the listening test (Fig. �), three out of four listeners (L�, L�, and L�) showed a better performance in discriminating between the left and surround-left speakers than between the right and surround-right speakers. The findings of the model simulation point in a similar direction. Here the cross-correlation peaks for the left and the surround-left speakers are closer to each other than is the case for the right and the surround-right speakers (Fig. �, bottom). This finding also supports the assumption that the listeners did not discriminate between the surround speakers and their corresponding front speakers on the basis of front-back discrimination but rather by cues regarding the lateral displacement. While ITD cues are very similar for the same left-right position in the frontal and the rear hemisphere, ILD cues and monaural cues typically provide the information for front-back discrimination. Since the stimuli applied were only an octave wide and low in frequency, it is unlikely that monaural cues were utilized by the listeners. The absolute signal level can also be excluded as a cue, because a level rove was applied during the experiment. The use of ILD cues is not very likely either: the difference in ILDs between the surround speakers and the corresponding front speakers is on the order of one decibel, which is near the absolute threshold in ILD discrimination tasks.

Regarding the center frequency of the stimuli, the distances between the cross-correlation peaks appear to be larger for the very low frequencies (Fig. �).




[Figure: ILD [dB] (−3 to 3) versus frequency [Hz] (50 to 200), one curve per speaker position: sl, l, c, r, sr.]

Fig. �. Interaural level differences measured with a binaural model for different loudspeaker positions: surround left (sl), left (l), center (c), right (r), and surround right (sr), as indicated in the legend. The ILDs are shown as a function of the center frequency of the frequency bands of the gammatone filter bank.

The explanation of why the front-back discrimination performance improved with increasing stimulus center frequency should therefore be connected to the decreasing peak width, assuming that this leads to a better discriminability of adjacent peak positions.
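The dependence of the peak width on frequency can be made concrete with a pure-tone approximation. The normalized cross-correlation of a tone of frequency f varies as cos(2πfτ) about its peak, so it falls to half its maximum at τ = ±1/(6f) and the main lobe has a full width of 1/(3f). The sketch below evaluates this for the three center frequencies. It is an idealization: octave-band noise gives a somewhat different peak shape, and the function name is hypothetical.

```python
import math


def tone_correlation_fwhm_ms(freq_hz):
    """Full width at half maximum (ms) of the main cross-correlation lobe of a pure tone.

    The normalised cross-correlation of a tone is cos(2*pi*f*tau); it falls
    to 0.5 at tau = +/- 1/(6 f), so the full width is 1/(3 f).
    """
    return 1e3 / (3.0 * freq_hz)


for fc in (31.5, 63.0, 125.0):
    print(f"{fc:6.1f} Hz: main-lobe width = {tone_correlation_fwhm_ms(fc):5.2f} ms")
```

Under this approximation the main lobe is wider than the entire ±1.5 ms ITD range at all three frequencies, and it narrows in proportion to 1/f, which is consistent with the decreasing peak width described above.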

So far, it has not been discussed to what extent the findings of this investigation can be generalized to other spaces. After comparing the data for MARLAB (Fig. �) with the data for the ICC studio (Fig. �, left panel), it can be assumed that the cues provided for the same listening situation are less reliable in the ICC studio. After reducing the room reflections, though, the model performance greatly improved. In the near future, further localization experiments in spaces more reverberant than MARLAB are planned. The results of these listening tests will hopefully show whether the human auditory system can utilize the precedence effect for such low frequencies, or whether the performance of the listeners degrades as the reverberant condition of the model simulation in the ICC studio would suggest.

It should also be discussed to what extent the binaural cues in the subwoofer frequency range (typically below � Hz) contribute to the position of the auditory event for sounds of broader bandwidths. In the psychoacoustic experiment, the frequency range of the noise bursts was limited to an octave, but in nature most sounds extend to higher frequencies. Research has previously been conducted to determine how the auditory system weights information across frequency when determining the lateral position of auditory events on an ITD basis. The spectral dominance region was found to be at � Hz, and the information in other frequency bands is weighted less the more their center frequency deviates from this value [�]. In this context it is noteworthy that even though conflicting cues caused by one-subwoofer presentations might not significantly influence the position of the auditory event, the conflicting cues could lead to larger apparent source widths and even split auditory events.

In summary, our findings do not support the general use of one-subwoofer systems in high-end audio applications. The question remains why the investigations of Kügler and Theile [7] and Zacharov et al. [13] came to the opposite conclusion. This discrepancy can be explained by the strong influence of reverberation. The space used in our investigation is likely to have been less reverberant than the spaces used by Kügler and Theile and by Zacharov et al. Another difference is that in our investigation the upper stimulus frequency was below the crossover frequency of the subwoofer, while in the other two investigations this was not the case. In the latter case, the binaural cues at higher frequencies might have masked the low-frequency cues. Differences between the three investigations are also found in the general methodology. Zacharov et al. report on an informal listening test, and the outcome of Kügler and Theile's experiment might have been different if it had been designed as a forced-choice experiment. In the graphs shown by the authors (Figs. � and �), the percentage of audible differences is higher for all tested crossover frequencies than in the control condition, and the recommendation to avoid crossover frequencies above � Hz does not appear to have been made on the basis of a statistical test or another systematic procedure.

�� CONCLUSION

As the results of our investigation show, low-frequency signals are often localizable, and we therefore recommend considering the use of two or more subwoofers or a full-range speaker system. When using two subwoofers in a �-channel surround set-up, it is often advantageous to place the subwoofers at the sides (� and �) rather than at the front-left and front-right positions. In the first case, larger ITDs can be generated and, because of the physical absence of ILDs, cues for front-back discrimination are missing for frequencies below � Hz.

�� ACKNOWLEDGEMENT

This investigation was supported by a grant of the Government of Québec within the project Real-time Communication of High-resolution Multi-sensory Content via Broadband Networks of the program Valorisation-recherche Québec (VRQ). We would like to thank our anonymous listeners for participating in the listening tests and Durand Begault and his students for providing us with the reverberation time measurements, which were made in Durand's class at McGill University.

�� REFERENCES

[1] F. A. Bilsen and J. Raatgever, "Spectral dominance in binaural lateralization," Acustica �, pp. �.

[2] J. Blauert and W. Cobben, "Some consideration of binaural cross correlation analysis," Acustica �, pp. �.

[3] J. Braasch, "Localization in the presence of a distracter and reverberation in the frontal horizontal plane. II. Model algorithms," ACUSTICA/acta acustica �, pp. �.

[4] J. Braasch, "Localization in the presence of a distracter and reverberation in the frontal horizontal plane. III. The role of interaural level differences," ACUSTICA/acta acustica �, pp. �.

[5] H. S. Colburn, "Theory of binaural interaction based on auditory-nerve data. II. Detection of tones in noise," J. Acoust. Soc. Am. �, pp. �.

[6] D. Griesinger, "Objective measures of spaciousness and envelopment," in: Proc. Audio Engineering Society �th Int. Conf. on Spatial Sound Reproduction, Rovaniemi, Finland, �, pp. �.

[7] C. Kügler and G. Theile, "Loudspeaker Reproduction: Study on the Subwoofer Concept," presented at the �nd Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), �, p. �, May �, preprint �.

[8] W. L. Martens, "Subjective evaluations of auditory spatial imagery associated with decorrelated subwoofer signals," Proc. of the Int. Conf. of Auditory Display, July �, Kyoto, Japan, �.

[9] T. Noussaine, "Multiple subwoofers for home theater," presented at the �rd Convention of the Audio Engineering Society, �, preprint �, H�.

[10] R. D. Patterson, M. H. Allerhand, and C. Giguère, "Time-domain modeling of peripheral auditory processing: A modular architecture and software platform," J. Acoust. Soc. Am. �, pp. �.

[11] T. Welti, "How many subwoofers are enough," presented at the �rd Convention of the Audio Engineering Society, �, preprint �.

[12] F. L. Wightman and D. J. Kistler, "The dominant role of low-frequency interaural time differences in sound localization," J. Acoust. Soc. Am. �, pp. �.

[13] N. Zacharov, S. Bech, and D. Meares, "The use of subwoofers in the context of surround sound program reproduction," J. Audio Eng. Soc. �, pp. �.

