9
Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions Wen Zhang 1,2 , Thushara D. Abhayapala 1,2 , Rodney A. Kennedy 1 1 Department of Information Engineering, Research School of Information Sciences and Engineering, The Australian National University, Canberra ACT 0200, Australia 2 Wireless Signal Processing Program, Canberra Research Laboratory, NICTA, Canberra ACT 2612, Australia Correspondence should be addressed to Wen Zhang ([email protected]) ABSTRACT This paper proposes a method to reproduce the Head Related Transfer Function (HRTF) in the horizontal auditory scene. The method is based on a separable representation which consists of a Fourier Bessel series expansion for the spectral components and a conventional Fourier series expansion for the spatial dependence. The proposed representation can be used to predict HRTFs at any azimuth position and at any frequency sampling point from a finite number of measurements. Implementation details are demonstrated in the paper. Measured HRTFs from a KEMAR manikin and analytically simulated HRTFs were used to validate the fidelity and predictive capabilities of the method. The average mean square error for model reconstruction is less than two percent. 1. INTRODUCTION The ability of human beings to localize the sound in three dimensions depends on the way the sound waves from the same source differ from each other as they reach the left and right ears. The head, torso, shoul- ders and the outer ears modify the sound arriving at a person’s ears. This modification can be described by a complex response function, the Head Related Transfer Function (HRTF), which depicts how a given sound wave input (parameterized as frequency and source location) is filtered by the diffraction and reflection properties of the individual body shape before the sound reaches the listener’s eardrum. Theoretically, HRTFs can be used to generate binaural sound, a “virtual acoustic environ- ment”, as it contains all the information about the sound source’s location (its direction and distance from the lis- tener). To synthesize the auditory scene, a straightforward way is to filter the original monaural sound through a proper set of measured HRTFs for each individual. However, to represent the entire auditory scene, one common ap- proach is to have functional representations of the HRTF, such as filter bank models [1, 2] and transfer decompo- sitions (principal component analysis [3, 4] and spheri- cal harmonics [5]). As the weights of these models are available only for either the measured directions or sam- pled frequency points, the HRTF must be interpolated between two discrete measurement positions or discrete frequencies. Many techniques have been proposed now to perform the interpolation of the HRTF, such as the bi- linear method [6], pole-zero approximation models [7] and spherical spline-based methods [8]. Nevertheless, the most appropriate interpolation is still considered as an open question. A more efficient approach is to have continuous functional representation of HRTFs. This paper proposes a method to reproduce HRTFs at all possible positions in the horizontal plane based on a novel HRTF Fourier Bessel functional representation. The Fourier Bessel functional model uses the Fourier series to separate the spatial and spectral dependence of the HRTF. According to the strong correlation be- tween the measured spectral structure of the HRTF and the family of Bessel functions of the first kind, we use the Fourier-Bessel series to represent the Fourier series weights (spectral components of HRTFs). Both Fourier series and Fourier-Bessel series are well-studied com- plete orthogonal functions and can be readily applied to accurately model the HRTF. Further, by applying these two sets of continuous functions, each individualized HRTF is transformed to a coefficient matrix, which could be very easily saved and processed to predict HRTFs at any arbitrary position in two dimensions. Given the AES 31 ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27 1

Horizontal plane HRTF reproduction using continuous ...users.cecs.anu.edu.au/~rod/papers/2007/AES-31.pdf · The Fourier Bessel series make use of the orthogonality between Bessel

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

  • Horizontal plane HRTF reproduction usingcontinuous Fourier-Bessel functionsWen Zhang1,2, Thushara D. Abhayapala1,2, Rodney A. Kennedy1

    1Department of Information Engineering, Research School of Information Sciences and Engineering, The Australian NationalUniversity, Canberra ACT 0200, Australia2Wireless Signal Processing Program, Canberra Research Laboratory, NICTA, Canberra ACT 2612, Australia

    Correspondence should be addressed to Wen Zhang ([email protected])

    ABSTRACTThis paper proposes a method to reproduce the Head Related Transfer Function (HRTF) in the horizontal

    auditory scene. The method is based on a separable representation which consists of a Fourier Bessel

    series expansion for the spectral components and a conventional Fourier series expansion for the spatial

    dependence. The proposed representation can be used to predict HRTFs at any azimuth position and at any

    frequency sampling point from a finite number of measurements. Implementation details are demonstrated

    in the paper. Measured HRTFs from a KEMAR manikin and analytically simulated HRTFs were used to

    validate the fidelity and predictive capabilities of the method. The average mean square error for model

    reconstruction is less than two percent.

    1. INTRODUCTIONThe ability of human beings to localize the sound inthree dimensions depends on the way the sound wavesfrom the same source differ from each other as theyreach the left and right ears. The head, torso, shoul-ders and the outer ears modify the sound arriving at aperson’s ears. This modification can be described by acomplex response function, the Head Related TransferFunction (HRTF), which depicts how a given sound waveinput (parameterized as frequency and source location)is filtered by the diffraction and reflection properties ofthe individual body shape before the sound reaches thelistener’s eardrum. Theoretically, HRTFs can be usedto generate binaural sound, a “virtual acoustic environ-ment”, as it contains all the information about the soundsource’s location (its direction and distance from the lis-tener).

    To synthesize the auditory scene, a straightforward wayis to filter the original monaural sound through a properset of measured HRTFs for each individual. However,to represent the entire auditory scene, one common ap-proach is to have functional representations of the HRTF,such as filter bank models [1, 2] and transfer decompo-sitions (principal component analysis [3, 4] and spheri-cal harmonics [5]). As the weights of these models areavailable only for either the measured directions or sam-

    pled frequency points, the HRTF must be interpolatedbetween two discrete measurement positions or discretefrequencies. Many techniques have been proposed nowto perform the interpolation of the HRTF, such as the bi-linear method [6], pole-zero approximation models [7]and spherical spline-based methods [8]. Nevertheless,the most appropriate interpolation is still considered asan open question. A more efficient approach is to havecontinuous functional representation of HRTFs.

    This paper proposes a method to reproduce HRTFs atall possible positions in the horizontal plane based ona novel HRTF Fourier Bessel functional representation.The Fourier Bessel functional model uses the Fourierseries to separate the spatial and spectral dependenceof the HRTF. According to the strong correlation be-tween the measured spectral structure of the HRTF andthe family of Bessel functions of the first kind, we usethe Fourier-Bessel series to represent the Fourier seriesweights (spectral components of HRTFs). Both Fourierseries and Fourier-Bessel series are well-studied com-plete orthogonal functions and can be readily applied toaccurately model the HRTF. Further, by applying thesetwo sets of continuous functions, each individualizedHRTF is transformed to a coefficient matrix, which couldbe very easily saved and processed to predict HRTFsat any arbitrary position in two dimensions. Given the

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–271

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    characteristics of the Fourier Bessel functional model,the reproduction implementation details are elaboratedin the paper. Finally, the reproduction results of the pro-posed method are validated by comparing a KEMARmanikin measured [9] and analytically simulated HRTFs[10] with the corresponding reproduced HRTFs.

    2. HRTF FOURIER BESSEL FUNCTIONALMODELA continuous functional representation is a mathematicalmodel or equation that represents the HRTF as a func-tion of continuous variables (source positions and fre-quency points). The horizontal plane HRTF is denotedby H( f ,φ) as a function of frequency f and azimuthalangle φ . In this section we develop spatial and spectralcontinuous representations of the HRTF.

    2.1. HRTF spatial components modellingOne of the noticeable characteristics of the HRTF is thatthe function is periodic with period 2π in the azimuthalvariable. A periodic function is most naturally expandedusing a Fourier series, which makes use of the orthogo-nality between the complex exponentials to break up anarbitrary periodic function into a series of simple termswhich converge in the mean. When truncated this infiniteseries can provide very good approximation to the origi-nal function given the HRTF has a low pass character inthe azimuthal variable. Thus, the continuous Fourier se-ries can be used as basis to extract the spatial dependenceof HRTFs written as

    H( f ,φ) =∞

    ∑m=−∞

    Am( f )eimφ , (1)

    where i=√−1 and the mth order Fourier series weights

    are given by

    Am( f ) =12π

    � 2π

    0H( f ,φ)e−imφdφ . (2)

    Equations (1) and (2) allow the calculation of the Fourierseries weights Am( f ) given a continuous HRTF function.In a practical context we need to reconsider equations(1) and (2). Two important modifications are required.Firstly (1) should be modified by truncating to a certainorder M depending on what accuracy is desired. Sec-ondly (2) should be modified by replacing the integralwith a finite summation to calculate coefficients Am( f )at discrete frequencies. As the solved Fourier series

    00.5

    11.5

    22.5

    x 104

    −40−20

    020

    400

    0.2

    0.4

    0.6

    0.8

    1

    Frequency (Hz)

    The magnitude spectrum of Am(f)

    Fourier series order: m

    Fig. 1: The magnitude spectrum of Am( f ) calculatedfrom the MIT data [9].

    weights Am( f ) are only available at some discrete fre-quencies, the problem of modelling Am( f ) as a functionof frequency is addressed below to achieve the goal of acontinuous representation.

    2.2. HRTF spectral components modellingBecause the physical understanding of the HRTF vari-ance with frequency remains an open issue, there is noobvious choice for the representation of the spectral com-ponents. A practically effective strategy is to use a repre-sentation consisting of a complete set of orthogonal func-tions which can be served as basis functions to any orderof Am( f ). The concept can be expressed as

    Am( f ) =∞

    ∑k=1

    Cmkϕk( f ), (3)

    where, ϕk( f ) is a suitable orthogonal set of functions de-fined on the interval f ∈ (0, fmax) ( fmax are the maximummeasurement frequency).

    Theoretically any general function can be perfectly ap-proximated by Eq. (3) without error; while for practi-cal implementation, truncation of the equation to a fi-nite number of terms, K, will bring in inaccuracies in therepresentation. In principle any orthogonal set of func-tions on a finite interval can be adapted for the purposeand provide an exact representation and in that sense areequivalent. However, under truncation to a certain num-ber of terms some orthogonal sets will perform better

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 2 of 9

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    00.5

    11.5

    2

    x 104

    −40−20

    020

    400

    0.5

    1

    1.5

    Frequency (Hz)

    The magnitude spectrum of Am(f)

    Fourier series order: m

    Fig. 2: The magnitude spectrum of Am( f ) calculatedfrom an analytical model [10].

    than others. That is, under truncation, different orthog-onal sets of functions are not equivalent and there willexist preferred choices.

    The aim of this work is to find the most efficient orthog-onal sets under truncation based on the pattern of Am( f )(the most resemble one should give the lowest truncationnumber). Fig. 1 and 2 depict a mesh plot of the mag-nitude spectrum of Am( f ) calculated from the MIT data[9] and a simple analytical model [10]. The figures revealthat the energy of the HRTF frequency components doesnot uniformly spread over the Fourier series expansionorder m. Instead, the pattern of Am( f ) from both datasets exhibits similarities to the family of Bessel functionsof the first kind (Fig. 3), such as the finite value of A0( f )at the origin f = 0, the prominence of other Am( f )s af-ter the origin and proportionally decaying oscillation likesine or cosine functions. This resemblance demonstratesthat there is a strong correlation between the spectralcomponents of HRTFs and the family of Bessel func-tions of the first kind. Thus, in this paper, a completeorthogonal set, the Fourier Bessel series, is proposed tomodel the HRTFs frequency components Am( f ) whosedependence on m is explained next.

    The Fourier Bessel series make use of the orthogonalitybetween Bessel functions of the first kind for a specificorder � on the interval (0,1) to expand a general func-tion [11]. The Fourier Bessel series expansion of the fre-

    0 2 4 6 8 10−0.5

    0

    0.5

    1

    Argument (n)

    l=0l=1l=2l=3l=4l=5

    Fig. 3: Plot of the Bessel functions of the first kind J�(n)for various orders �.

    quency component of HRTFs is given as

    Am( f ) =∞

    ∑k=1

    CmkJ�(β(�)k

    ffmax

    ), (4)

    where β (�)1 ,β(�)2 , ...,β

    (�)k are the positive roots of J�(x) =

    0 and � is the specific order of the Bessel function of thefirst kind. Cmk are complex coefficients whose depen-dence on the choice of � is suppressed.

    The coefficients of the Fourier Bessel series expansioncan thereby be resolved from [11] as

    Cmk =2

    [J�+1(β(�)k )]

    2

    � fmax

    0f Am( f )J�

    �β (�)k

    ffmax

    �d f .

    (5)

    As the Fourier Bessel series has infinite number of sets,we need to decide which specific order � to use. Firstly,the zero-th order J0(·) should be included due to the dom-inance of A0( f ); among all orders only J0(0) = 1 couldgive very efficient approximation to A0( f ) in the limit asf → 0. However, using only one term J0(·) is not theoptimum representation for the high order Am( f ) whenm �= 0. Noticing that high order Fourier Bessel series canmatch high order frequency components, we believe therepresentation with � depending on m is more effective.In this paper, we have not mathematically derived a spe-cific formula and a simple linear relationship � = |m| isapplied. The real measurements reconstruction perfor-mance in section 4 proves that this choice works well,

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 3 of 9

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    such as high approximation accuracy and significantlyreduced number of parameters (the verification of thissimple formula will be left as on open problem).

    Thus, the HRTF functional model in the frequency do-main is obtained as

    H( f ,φ) =∞

    ∑m=−∞

    ∑k=1

    CmkJ|m|�

    β (|m|)kffmax

    �eimφ , (6)

    where

    Cmk =1

    π[J|m+1|(β(|m|)k )]

    2

    � fmax

    0

    � π

    −πfH( f ,φ)

    J|m|�

    β (|m|)kffmax

    �e−imφdφd f . (7)

    Eq. (7) illustrates how to calculate the model parametersfrom a continuous HRTF function. While for experimen-tally measured HRTFs, the model coefficients Cmk arecalculated using the left Riemann sum to approximatethe integral. The implementation about appropriate trun-cation of (6) for efficient representation is illustrated insection 3.

    Now, the HRTF representation (6) are functions of con-tinuous variables representing the spectral and spatialcues. On the one hand, the proposed model can achieveHRTF reconstruction at any frequency point for an ar-bitrary azimuth. On the other hand, given the basisfunctions remain same for all listeners, for each indi-vidualized HRTF measurement, only a coefficient ma-trix Cmk needs to be saved. Note this coefficient matrixis much smaller in size compared to original measure-ments; hence the goal of data compression is achieved.

    3. HRTF REPRODUCTION IMPLEMENTATION

    In this section, issues about how to implement the pro-posed HRTF model (6) are elaborated including the lim-itation on experimental measurement.

    3.1. Angular Sampling

    The all-important implementation issue is to record theplane sound field with enough samples at the desired fi-delity. Ajdler in [12] proves that the appropriate way tosample HRTFs is through

    wt = ±c�θ0.09

    , (8)

    where c is the speed of sound propagation, �θ and wt arethe angular pulsation and temporal sampling frequency.�θ as defined above depends on the angular sampling in-terval. For example, under uniformly spaced sampling(the most widely used sampling method for HRTF mea-surement), �θ = 2π/�θ and �θ is the angular spacingbetween two consecutive HRTF measurement positions.

    Further, according to (8), in order to avoid spatial alias-ing in HRTF reproduction, there is a maximum azimuthalangular spacing�θmax beyond which fidelity is lost. The�θmax is related to the temporal sampling frequency wt .When the sampling frequency is 44.1 kHz (as used inMIT data set), the maximum angular sample interval isapproximate 5 degrees.

    In summary, the proposed HRTF reproduction methodrequires the sound field (HRTF) to be sampled uniformlyon the horizontal plane and the angular sample intervalno more than �θmax for a specific temporal samplingfrequency following (8).

    3.2. Choice of Truncation Number

    The primary parameters that influence the model fittingare the truncation numbers M and K in (6). The gen-eral rule is to monitor the distribution of model parame-ters Cmk over the Fourier series order m and the FourierBessel series order k for each specific data set, like Fig. 4from analytical model and Fig. 5 from MIT data. Thestructure of the coefficients has a clear downward trendin the average contribution made by Fourier series har-monics and Fourier Bessel series of increasing degree.Hence, a higher truncation number results in better ap-proximation; but too large value may lead to over fit-ting. In addition, make sure the value of Cmk is solvedover large orders. For example, the value of Cmk calcu-lated from MIT data is nearly zero when M > 20 andK > 100, which proves that our calculation order is ade-quate. Then, the criteria is that at least 90% of the total“energy” of Cmk is contained in the approximation. Inthis paper, for both analytical solutions and MIT mea-surements, the lowest truncation numbers are chosen asML = 16 and KL = 87 accordingly.

    Once the angular sampling interval and truncation num-bers are determined, we could estimate the model param-eters Cmk using (7) from the experimental measurementsand further reproduce HRTFs in the whole horizontal au-ditory scene.

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 4 of 9

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    020

    4060

    80100

    120

    −40−20

    020

    400

    0.5

    1

    1.5

    Fourier Bessel series: k

    Harmonic Amplitude

    Fourier series: m

    Fig. 4: The amplitude of model parameters Cmk (solvedfrom analytical solutions) over Fourier series m andFourier Bessel series k.

    020

    4060

    80100

    120

    −40−20

    020

    400

    0.5

    1

    1.5

    Fourier Bessel series: k

    Harmonic Amplitude

    Fourier series: m

    Fig. 5: The amplitude of model parameters Cmk (solvedfrom MIT data) over Fourier series m and Fourier Besselseries k.

    4. PRACTICAL RESULTS AND PERFOR-MANCE ASSESSMENTThe fidelity and predictive capabilities of our proposedhorizontal planar HRTF reproduction method is vali-dated by comparing the measured (or analytically sim-ulated) and model reproduced data. Two sets of data areemployed in evaluation: 1) the MIT data acquired usinga KEMAR manikin [9], and 2) some analytical solutions[10].

    The error metric is defined as the percent mean squareerror in the magnitude and phase spectrum at each az-imuthal location

    ei =∑Nn=1 �H( fn,φi)− Ĥ( fn,φi)�2

    ∑Nn=1 �H( fn,φi)�2×100%, (9)

    where for each azimuth, HRTFs are measured (or sim-ulated) at N frequency points. H( fn,φi) is the originalHRTFs at the nth frequency point and ith azimuth; andĤ( fn,φi) is the reproduced HRTF.

    The error performance is investigated for two differentclasses of directions; those for which measurements (an-alytical simulations) were conducted and contributed to-ward the derivation of the model parameters (Recon-struction) and directions for which had no experimentalreference (Interpolation).

    4.1. Real DataThe proposed method is tested on theMITmeasurementsfirst. The MIT measurements were made in an anechoicchamber; and both the “small” and “large” pinna mod-els were tested on the KEMAR DB 4004. The measure-ments are the head related impulse response in the timedomain at the 44.1 kHz sampling rate and each responseis 512 samples long. In the horizontal plane, a full 360degree of azimuth was sampled in equal sized increments(5 degrees approximately).

    In order to use one set of measurement to examine twokinds of error performance, 72 MIT measured HRTFsin the horizontal plane are divided into two groups witheach having 36 data sets spaced by 10◦. The first group isused to determine the model parameters Cmk; and the re-construction performance is evaluated by comparing themeasured and model reconstructed HRTFs at these lo-cations. The second group data is used to demonstratethe predictive power of the model (interpolation perfor-mance) as these locations are not used to determine Cmk.Note that from (8), we can only interpolate the HRTFs

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 5 of 9

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    0 0.5 1 1.5 2 2.5x 104

    −60

    −50

    −40

    −30

    −20

    −10

    0

    10

    20

    frequency (Hz)

    dB

    0 0.5 1 1.5 2 2.5x 104

    −70

    −60

    −50

    −40

    −30

    −20

    −10

    0

    10

    frequency (Hz)

    dB0 0.5 1 1.5 2 2.5

    x 104

    −70

    −60

    −50

    −40

    −30

    −20

    −10

    0

    10

    20

    frequency (Hz)

    dB

    0 0.5 1 1.5 2 2.5x 104

    −50

    −40

    −30

    −20

    −10

    0

    10

    20

    frequency (Hz)

    dB

    MeasuredReconstructed

    MeasuredReconstructed

    MeasuredReconstructed

    MeasuredReconstructed

    Az=0, err=0.78% Az=90, err=2.04%

    Az=180, err=0.70% Az=270, err=0.23%

    MIT LEFT−EAR HRTF RECONSTRUCTION

    Fig. 6: Example MIT (left ear) measured and model reconstructed HRTFs using frequency domain model at φ =0◦,90◦,180◦, and 270◦.

    0 2000 4000 6000 8000 10000 12000−40

    −30

    −20

    −10

    0

    10

    20

    frequency (Hz)

    dB

    0 2000 4000 6000 8000 10000 12000−35

    −30

    −25

    −20

    −15

    −10

    −5

    0

    5

    10

    frequency (Hz)

    dB

    0 2000 4000 6000 8000 10000 12000−35

    −30

    −25

    −20

    −15

    −10

    −5

    0

    5

    10

    15

    frequency (Hz)

    dB

    0 2000 4000 6000 8000 10000 12000−40

    −30

    −20

    −10

    0

    10

    20

    frequency (Hz)

    dB

    MeasuredInterpolated

    MeasuredInterpolated

    MeasuredInterpolated

    MeasuredInterpolated

    Az=5, err=1.49% Az=95, err=2.34%

    Az=185, err=0.94% Az=275, err=0.23%

    MIT LEFT−EAR HRTF INTERPOLATION

    Fig. 7: Example of model interpolated MIT left ear HRTFs at φ = 5◦,95◦,185◦,275◦ with original measurementsoverlaid.

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 6 of 9

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    0 50 100 150 200 250 300 3500

    0.01

    0.02

    0.03

    0.04

    Azimuth Angle

    MSE

    MIT LEFT−EAR HRTF RECONSTRUCTION ERROR

    0 50 100 150 200 250 300 3500

    0.01

    0.02

    0.03

    Azimuth Angle

    MSE

    MIT RIGHT−EAR HRTF RECONSTRUCTION ERROR

    Fig. 8: Reconstruction error distributions as a functionof the source position azimuthal angle φ for the MITHRTFs. The error reaches its maximum for source lo-cations where the ear is maximally shadowed.

    up to 10.8 kHz using the angular spacing of 10◦. In ad-dition, the second group locations are at the midpoints ofthose used to determine the modal parameters and thusrepresent the locations of maximum interpolation error.

    Fig. 6 compares measured and model reconstructedHRTF magnitude spectrum at four typical directions inthe horizontal plane. The responses qualitatively showthat the model reproduces the experimental measured re-sponses with very high accuracy. Fig. 7 illustrates thepredictive power of the model by comparing measuredand model interpolated HRTF magnitude spectrum. Theprecise interpolation results prove that the proposed con-tinuous functional model can achieve HRTF estimationat any point in the horizontal plane.

    The distribution of errors across all positions is also pre-sented in Fig. 8 and 9 for the MIT measurements. Theworst reconstruction error is around 3% at an azimuth of100◦ −130◦ for the left ear and 280◦ −320◦ for the rightear. In MIT measurements, a source located at 90◦ az-imuth is directly across from the right ear (the left ear isthe shadowed ear) and a source located at 270◦ is directlyacross from the left ear (the right ear is the shadowedear). Figures reveal that the reproduction of HRTFs isusually better at the source-facing side of the head than atthe head’s shadowed side. Observe that the interpolationerrors are slightly larger than previous reconstructional

    0 50 100 150 200 250 300 3500

    0.02

    0.04

    0.06

    0.08

    Azimuth Angle

    MSE

    MIT LEFT−EAR HRTF INTERPOLATION ERROR

    0 50 100 150 200 250 300 3500

    0.02

    0.04

    0.06

    0.08

    Azimuth Angle

    MSE

    MIT RIGHT−EAR HRTF INTERPOLATION ERROR

    Fig. 9: Interpolation error distributions as a function ofthe source position azimuthal angle φ for the MIT data.

    errors because these directions are actually not used todetermine the model parameters.

    4.2. Analytical Solutions

    The proposed method is also tested on synthetic data.We used the well known analytical sphere HRTF modelpresented in [10] and a circle of 72 points on the hori-zontal plane (note these 72 points are uniformly sampledaccording to section 3.1). The HRTFs are computed ateach point at a distance of 1.2m from the sphere (head).Then the same validation procedure is employed. Theresults are very similar to those from the MIT databasedescribed above. As an example, the error distributionfrom fitting the model to the analytic solution is shownin Fig. 10 and 11 (the left and right ears have symmetri-cal error performance as the head is modelled as a spherein the analytical model). Note that, as with the MIT data,the reconstruction error reaches maximum (nearly 1.4%)at the head’s shadowed area.

    4.3. Discussion of the Model

    The reproduction of HRTFs is usually better at thesource-facing side of the head than at the head’s shad-owed side. Two factors contribute to the relative largeerrors at the head shadowed side. The most significantis the relatively lower signal-to-noise ratio (SNR) at thehead shadowed side. The energy in the contralateralHRTFs is less than that in the ipsilateral HRTFs. Hence,the head-shadowed HRTFs contribute relatively little to

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 7 of 9

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    0 50 100 150 200 250 300 3500.002

    0.004

    0.006

    0.008

    0.01

    0.012

    0.014

    0.016

    0.018

    0.02

    Azimuth Angle

    MSE

    Analytical computed HRTF reconstruction error

    Fig. 10: Reconstruction error distributions as a functionof the source position azimuthal angle φ for the analyti-cal model.

    0 50 100 150 200 250 300 3500

    0.005

    0.01

    0.015

    0.02

    0.025

    0.03

    0.035

    0.04

    0.045

    Azimuth Angle

    MSE

    Analytically computed HRTF interpolation error

    Fig. 11: Interpolation error distributions as a function ofthe source position azimuthal angle φ for the analyticalmodel.

    estimate the model parameters Cmk. A second factor isthat the contralateral sounds may produce more varia-tions because of the diffraction around the head. Thisresults in the spectral shapes that are more complicatedand more difficult to model. Therefore, it is obviouslynoted from the error distributions that the error reachesits maximum at the head shadowed side.

    In view of the above two factors, we believe the syn-thesis error at the head shadowed side can be reducedby increasing the truncation numberM and K (includingmore basis functions) in the proposed model. As stated insection 2, truncation of the infinite series model (6) maycause the inaccuracy of the representation. While the co-efficients Cmk show a clear downward trend in the aver-age contribution to approximate HRTFs; the high orderparameters correspond to the fine detail in the response.Hence, a higher truncation number results in more accu-rate representation.

    5. CONCLUSIONIn this paper, a method was developed for HRTFs repro-duction in the horizontal plane. The method is basedon a continuous HRTF functional model which usesFourier series and Fourier Bessel series to separate thespatial and spectral dependence of the HRTF. MeasuredHRTFs from the KEMAR manikin and analytical simu-lated HRTFs were used to investigate the model perfor-mance. The average MSE between measured and modelreconstructed HRTFs is less than two percent; and theaverage MSE between measured and model interpolatedHRTFs is less than four percent. The implementation re-sults show that the method could give indistinguishablereproduction compared with the original measurements.

    6. REFERENCES[1] F. L. Wightman and D. J. Kistler, “Headphone sim-

    ulation of free-field listening. II: Psychophysicalvalidation,” J. Acoust. Soc. Am., 85, pp. 858-867,1989.

    [2] A. Kulkarnj and H. S. Colburn, “Infinite-impulse-response models of the head-related transfer func-tion,” J. Acoust. Soc. Am., 115, pp. 1714-1728,2003.

    [3] W. L. Martens, “Principal component analysis andresynthesis of spectral cues to perceived direction,”

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 8 of 9

  • Zhang et al. Horizontal plane HRTF reproduction using continuous Fourier-Bessel functions

    in The International Computer Music Conference,edited by J. Beauchamp, San Francisco, CA, pp.274-281, 1987.

    [4] D. J. Kistler and F. L.Wightman, “Amodel of head-related transfer functions based on principal com-ponents analysis and minimum-phase reconstruc-tion,” J. Acoust. Soc. Am., 91, pp. 1637-1647, 1992.

    [5] J. E. Michael, A. S. A. James and I. T. An-thony, “Analyzing head-related transfer functionmeasurements using surface spherical harmonics,”J. Acoust. Soc. Am., 104, pp. 2400-2411, 1998.

    [6] L. Savioja, J. Huopaniemi, T. Lokki and R.Väänänen, “Creating interactive virtual acousticenvironments,” J. Audio Eng. Soc., 47(9), pp. 675-705, 1999.

    [7] M. A. Blommer and G. H. Wakefield, “Pole-zeroapproximations for head-related transfer functionsusing a logarithmic error criterion,” IEEE Trans.Speech Audio Processing, 5, pp. 278-287, 1997.

    [8] S. Carlile, C. Jin and V. V. Raad, “Continuousvirtual auditory space using HRTF interpolation:Acoustic and psychophysical errors,” in 2000 In-ternational Symposium on Multimedia InformationProcessing, Sydney, Australia, pp. 220-223, 2000.

    [9] G. G. William and D. M. Keith, “HRTF measure-ments of a KEMAR,” J. Acoust. Soc. Am., 97, pp.3907-3908, 1995.

    [10] R. O. Duda and W. L. Martens, “Range depen-dence of the response of a spherical head model”,J. Acoust. Soc. Am., 104, pp. 3048-3058, 1998.

    [11] W. Kaplan, “Fourier Bessel Series”, in AdvancedCalculus (4th ed.), MA: Addision-Wesley, pp. 512-518, 1992.

    [12] T. Ajdler, L. Sbaiz and M. Vetterli, “Plenacousticfunction on the circle with application to HRTFinterpolation,” in Proc. 2005 IEEE InternationalConference on Acoustics, Speech, and Signal Pro-cessing, Philadelphia, PA, USA, pp. 273-276, 2005.

    AES 31ST INTERNATIONAL CONFERENCE, London, England, 2007 JUNE 25–27Page 9 of 9