8
Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020 FLEXIBLE REAL-TIME REVERBERATION SYNTHESIS WITH ACCURATE PARAMETER CONTROL Karolina Prawda Acoustics Lab Dept. of Signal Processing and Acoustics Aalto University, Espoo, Finland [email protected] Silvin Willemsen, Stefania Serafin Multisensory Experience Lab Dept. of Architecture, Design & Media Tech. Aalborg University, Copenhagen, Denmark {sil,sts}@create.aau.dk Vesa Välimäki Acoustics Lab Dept. of Signal Processing and Acoustics Aalto University, Espoo, Finland [email protected] ABSTRACT Reverberation is one of the most important effects used in audio production. Although nowadays numerous real-time implementa- tions of artificial reverberation algorithms are available, many of them depend on a database of recorded or pre-synthesized room impulse responses, which are convolved with the input signal. Im- plementations that use an algorithmic approach are more flexible but do not let the users have full control over the produced sound, allowing only a few selected parameters to be altered. The real- time implementation of an artificial reverberation synthesizer pre- sented in this study introduces an audio plugin based on a feed- back delay network (FDN), which lets the user have full and de- tailed insight into the produced reverb. It allows for control of reverberation time in ten octave bands, simultaneously allowing adjusting the feedback matrix type and delay-line lengths. The proposed plugin explores various FDN setups, showing that the lowest useful order for high-quality sound is 16, and that in the case of a Householder matrix the implementation strongly affects the resulting reverberation. Experimenting with delay lengths and distribution demonstrates that choosing too wide or too narrow a length range is disadvantageous to the synthesized sound quality. The study also discusses CPU usage for different FDN orders and plugin states. 1. INTRODUCTION Artificial reverberation is one of the most popular audio effects. It is used in music production, sound design, game audio, and movie production to enhance dry recordings with the impression of space. The development of digital artificial reverberation started nearly 60 years ago [1], and since then various improvements as well as dif- ferent techniques have been developed [2]. The designs available nowadays can be roughly divided into three groups: convolution algorithms, delay networks, and physical room models [2, 3, 4]. The methods involving physical modeling simulate sound prop- agation in a specific geometry. Due to their high computational cost, though, they are used mostly in off-line computer simulations of room acoustics [3]. Recent developments in hardware and soft- ware technologies have also allowed computationally expensive simulations, such as those based on 3-D finite-difference schemes, to run in real time [5]. This work was supported by the “Nordic Sound and Music Computing Network—NordicSMC”, NordForsk project number 86892. Copyright: © 2020 Karolina Prawda et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. The techniques convolving the input signal with a measured room impulse response (RIR) produce rich, high-fidelity reverber- ation. However, since the RIR samples serve as the coefficients of a finite impulse-response (FIR) filter, with which the dry signal is filtered, the computational cost is high, especially for long RIRs. Another group of artificial reverberation algorithms is based on networks of delay lines and digital filters. The first example of such reverberators was introduced by Schroeder and Logan [1], who used feedback-comb-filter structures to create a sequence of decaying echoes. A similar architecture using allpass filters was also proposed to ensure high echo density without spectral col- oration. The development of such structures led to the invention of feedback delay network (FDN) algorithms, which can be regarded as a “vectorized” comb filter [2]. The FDN, as used in its current form, was presented in the work of Jot and Chaigne [6, 7]. Over the years, many real-time implementations of artificial reverberation algorithms have been developed. The designs that use a convolution-based approach, however, depend on measured or pre-synthesized RIRs convolved with the signal, which are col- lected in groups of presets [3, 8, 9, 10]. Such Virtual Studio Tech- nology (VST) plugins allow modifying the reverberation by modu- lating, damping or equalizing the available RIRs. The possibilities are, however, limited by the size of the RIR databases and there- fore prove to be relatively inflexible. Algorithmic reverb plugins that are based on delay network designs are both computationally efficient and easily modulated, thus providing more flexibility and freedom in producing reverber- ated sounds [4, 11]. The available designs vary between simple so- lutions allowing the user to change only a few parameters [12] and complex architectures with an elaborate interface enabling control over a wide range of variables [13]. Many of those plugins, how- ever, still remain ambiguous about the reverberation they synthe- size, allowing the user to set only the broadband decay parameter, and rely on presets based on the types of rooms they are supposed to imitate (e.g., Bright Room or Dark Chamber [14]). Usually, they also lack the information about the reverberation algorithm they use and its elements. The present work proposes a real-time implementation of an FDN algorithm with accurate control over the reverberation time (RT) in ten octave frequency bands in the form of an audio plu- gin. The graphical user interface (GUI) gives a thorough insight into the attenuation filter’s magnitude response, corresponding RT curve, and resulting impulse response (IR). The plugin also pro- vides several possibilities to control the elements of the FDN struc- ture, such as the feedback matrix and delay lines. It gives the user a full view of the decay characteristics and quality of the synthe- sized reverberation. The study also presents the effect that the type and size of the feedback matrix and the lengths and distribution of DAF 2 x 21 in Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21 16

Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

FLEXIBLE REAL-TIME REVERBERATION SYNTHESIS WITH ACCURATE PARAMETERCONTROL

Karolina Prawda ∗

Acoustics LabDept. of Signal Processing and Acoustics

Aalto University, Espoo, [email protected]

Silvin Willemsen, Stefania Serafin

Multisensory Experience LabDept. of Architecture, Design & Media Tech.Aalborg University, Copenhagen, Denmark

{sil,sts}@create.aau.dk

Vesa Välimäki

Acoustics LabDept. of Signal Processing and Acoustics

Aalto University, Espoo, [email protected]

ABSTRACT

Reverberation is one of the most important effects used in audioproduction. Although nowadays numerous real-time implementa-tions of artificial reverberation algorithms are available, many ofthem depend on a database of recorded or pre-synthesized roomimpulse responses, which are convolved with the input signal. Im-plementations that use an algorithmic approach are more flexiblebut do not let the users have full control over the produced sound,allowing only a few selected parameters to be altered. The real-time implementation of an artificial reverberation synthesizer pre-sented in this study introduces an audio plugin based on a feed-back delay network (FDN), which lets the user have full and de-tailed insight into the produced reverb. It allows for control ofreverberation time in ten octave bands, simultaneously allowingadjusting the feedback matrix type and delay-line lengths. Theproposed plugin explores various FDN setups, showing that thelowest useful order for high-quality sound is 16, and that in thecase of a Householder matrix the implementation strongly affectsthe resulting reverberation. Experimenting with delay lengths anddistribution demonstrates that choosing too wide or too narrow alength range is disadvantageous to the synthesized sound quality.The study also discusses CPU usage for different FDN orders andplugin states.

1. INTRODUCTION

Artificial reverberation is one of the most popular audio effects. Itis used in music production, sound design, game audio, and movieproduction to enhance dry recordings with the impression of space.The development of digital artificial reverberation started nearly 60years ago [1], and since then various improvements as well as dif-ferent techniques have been developed [2]. The designs availablenowadays can be roughly divided into three groups: convolutionalgorithms, delay networks, and physical room models [2, 3, 4].

The methods involving physical modeling simulate sound prop-agation in a specific geometry. Due to their high computationalcost, though, they are used mostly in off-line computer simulationsof room acoustics [3]. Recent developments in hardware and soft-ware technologies have also allowed computationally expensivesimulations, such as those based on 3-D finite-difference schemes,to run in real time [5].

∗ This work was supported by the “Nordic Sound and Music ComputingNetwork—NordicSMC”, NordForsk project number 86892.Copyright: © 2020 Karolina Prawda et al. This is an open-access article distributed

under the terms of the Creative Commons Attribution 3.0 Unported License, which

permits unrestricted use, distribution, and reproduction in any medium, provided the

original author and source are credited.

The techniques convolving the input signal with a measuredroom impulse response (RIR) produce rich, high-fidelity reverber-ation. However, since the RIR samples serve as the coefficients ofa finite impulse-response (FIR) filter, with which the dry signal isfiltered, the computational cost is high, especially for long RIRs.

Another group of artificial reverberation algorithms is basedon networks of delay lines and digital filters. The first exampleof such reverberators was introduced by Schroeder and Logan [1],who used feedback-comb-filter structures to create a sequence ofdecaying echoes. A similar architecture using allpass filters wasalso proposed to ensure high echo density without spectral col-oration. The development of such structures led to the invention offeedback delay network (FDN) algorithms, which can be regardedas a “vectorized” comb filter [2]. The FDN, as used in its currentform, was presented in the work of Jot and Chaigne [6, 7].

Over the years, many real-time implementations of artificialreverberation algorithms have been developed. The designs thatuse a convolution-based approach, however, depend on measuredor pre-synthesized RIRs convolved with the signal, which are col-lected in groups of presets [3, 8, 9, 10]. Such Virtual Studio Tech-nology (VST) plugins allow modifying the reverberation by modu-lating, damping or equalizing the available RIRs. The possibilitiesare, however, limited by the size of the RIR databases and there-fore prove to be relatively inflexible.

Algorithmic reverb plugins that are based on delay networkdesigns are both computationally efficient and easily modulated,thus providing more flexibility and freedom in producing reverber-ated sounds [4, 11]. The available designs vary between simple so-lutions allowing the user to change only a few parameters [12] andcomplex architectures with an elaborate interface enabling controlover a wide range of variables [13]. Many of those plugins, how-ever, still remain ambiguous about the reverberation they synthe-size, allowing the user to set only the broadband decay parameter,and rely on presets based on the types of rooms they are supposedto imitate (e.g., Bright Room or Dark Chamber [14]). Usually,they also lack the information about the reverberation algorithmthey use and its elements.

The present work proposes a real-time implementation of anFDN algorithm with accurate control over the reverberation time(RT) in ten octave frequency bands in the form of an audio plu-gin. The graphical user interface (GUI) gives a thorough insightinto the attenuation filter’s magnitude response, corresponding RTcurve, and resulting impulse response (IR). The plugin also pro-vides several possibilities to control the elements of the FDN struc-ture, such as the feedback matrix and delay lines. It gives the usera full view of the decay characteristics and quality of the synthe-sized reverberation. The study also presents the effect that the typeand size of the feedback matrix and the lengths and distribution of

DAFx.1

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

16

Page 2: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

the delay lines have on the produced sound and the algorithm’sperformance.

This paper is organized as follows: Section 2 presents thetheory behind the FDN, and Section 3 shows the GUI of the im-plemented plugin, describes the functionalities and user-controlledparameters of the reverberator, presents the code structure and dis-cusses the real-time computation issues. Section 4 shows and dis-cusses results regarding the echo density produced by the imple-mentation and the CPU usage of the plugin. Finally, Section 5summarizes and concludes the work.

2. FEEDBACK DELAY NETWORK

Figure 1 presents a flow diagram of a conventional FDN, which isexpressed by the relation:

y(n) =

N∑i

cisi(n) + dx(n), (1a)

si(n+ Li) =

N∑j

Ai,j h̃i(n)sj(n) + bix(n), (1b)

where y(n) and x(n) are the output and input signal, respectively,at time sample n, si(n) is the output of the ith delay line, andAi,j is the element of an N -by-N feedback matrix (or scatteringmatrix) A, through which all the delay lines are interconnected.Parameters bi and ci symbolize input and output coefficients, re-spectively, d is the direct-path gain, and h̃i(n) is the attenuationfilter of the ith delay line.

When designing an FDN, a common practice is to first ensurethat the energy of the system will not decay for any possible typeof delay. Therefore, the matrix A should be unilossless [15]. Toobtain a specific frequency-dependent RT, each of the delay linesmust be cascaded with an attenuation filter, which approximatesthe target gain-per-sample expressed by

γdB(ω) =−60

fsT60(ω), (2)

where T60(ω) is the target RT in seconds, ω = 2πf/fs is thenormalized angular frequency, f is the frequency in Hz, and fs

is the sampling rate in Hz. In order to ensure that all delay linesapproximate the same RT, the gain-per-sample for each of themmust be scaled by a respective delay in samples L. This impliesthat the target magnitude response of the attenuation filter in dB isdefined as follows:

AdB(ω) = LγdB(ω). (3)

In order to provide an accurate approximation of the targetRT, and therefore to closely follow the AdB, the attenuation filterused in the FDN implementation in the present study is a graphicequalizer (GEQ), which controls the energy decay of the system inten octave bands, with center frequencies from 31.25 Hz to 16 kHz.The equalizer is composed of biquad filters [16] and designed withthe method proposed by Välimäki and Liski [17] with later modifi-cations, such as the scaling by a median of gains and the adding ofa first-order high-shelf filter as proposed in [18]. The GEQ mag-nitude response for the ith delay line is expressed in dB as

H̃dB,i(ejω) = g0 +

M∑m=1

(HdB,i,m(ejω)− g0

M

), (4)

ℎ�,1 ℎ�,� ℎ�,�

���

Filter

EQDelayLine

EQDL1

EQDL�

EQDL�

�−1

��,�,0

��,�,1 ��,�,1

�−1

��,�,2 ��,�,2

�1

��

��

�1

��

��

�(�)

�(�)

(�)�1

(�)��

(�)��

Figure 1: Flow diagram of an FDN with N equalized delay linesand their M octave-band biquad filters shown in detail. See Sec-tions 2 and 3.5 for more details.

where g0 is the broadband gain factor, HdB,i,m are the magni-tude responses of the band filters, and m = 1, 2, ...,M is thefrequency-band index with M controlled frequency bands. Thetime-domain representation h̃i(n) of H̃dB,i(e

jω) is used in Eq. (1b).

3. IMPLEMENTATION

This section describes the real-time implementation of late rever-beration synthesis using an FDN and a modified GEQ as the atten-uation filter. The algorithm has been implemented in the form of anaudio plugin in C++ using JUCE, an open-source cross-platformapplication framework [19]. The plugin can be downloaded from[20], and an explanatory demo video can be found in [21].

3.1. Control over RT Values

In the present implementation, the modified GEQ attenuation filterallows controlling the RT values in ten frequency bands. In orderto utilize the whole potential of the filter, the GUI of the plugin isequipped with ten vertical sliders, one for each frequency band, asdepicted in Fig. 2. By changing the value of each of the sliders, theuser is able to change the RT value for the corresponding frequencyband from 0.03 s to 15 s with a 0.01-s step.

Since too large a difference between two consecutive RT val-ues can cause instability [18, 22], two extra modes are imple-mented for better control: the All Sliders and the Smooth modes.

DAFx.2

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

17

Page 3: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

(a) The attenuation filter’s response (red line) and the corresponding RTcurve (black line). No preset is selected, and the Smooth button is pressed.

(b) Reverberator IR. The Fix coeffs button has been pressed, and the preset’sdrop-down menu and sliders have been disabled (see Sec. 3.6).

Figure 2: GUI of the implemented FDN plugin.

The modes are activated by pressing the respective buttons, as in-dicated in Fig. 2a, with the corresponding buttons on the GUI be-ing highlighted in green. If a mode is activated when the other isactive, the latter will deactivate. The All Sliders mode allows theuser to set all the RT values to be the same by changing the sliderposition in one of the frequency bands.

When the Smooth mode is activated, changing the value of oneRT will also adjust the RT in the other frequency bands. RT valuesof bands closer to the band that is changed are more affected thanother RT values via the formula

T60[m] = T60,init[m] +(T60[mc]− T60,init[mc]

)ϵ|m−mc|, (5)

where mc is the index of the currently adjusted slider, m = 1, 2, ...,M is the slider number, T60 and T60,init are the final and initial RTvalues, respectively, and ϵ = 0.6 is a heuristically chosen scalingfactor.

Five typical reverberation presets were created: Small Room,Medium Room, Large Room, Concert Hall, and Church. The firstthree presets are based on the measurement results presented in[23], whereas RT values for the last two are taken from [24]. Allexamples are available in a drop-down list in the top part of theGUI. If one of the sliders is changed, “– no preset –” is displayedin the drop-down list, as shown in Fig. 2a.

The Impulse button at the bottom right of the GUI emptiesthe delay lines and feeds a Dirac delta into the system so that theimpulse response of the reverberator is produced as an output.

3.2. Response Plotting

The window in the upper-half part of the GUI displays plots thatinform the user about the state of the plugin. As seen in Fig. 2a, theGUI can display the RT curve (black) and the corresponding mag-nitude response of the attenuation filter (red), which are plotted inreal time based on the values set by the sliders. This provides the

user with an insight into the actual decay characteristics of the syn-thesized reverberation, which may differ from the user-defined RTvalues. This happens due to the limited ability of the attenuationfilter in following the target RT curve, especially when the differ-ences between values set for the neighboring frequency bands arebig [18, 22]. Very extreme differences may lead to the filter’s mag-nitude response reaching or exceeding 0 dB, which results in thesystem’s instability. This state is signaled by the background colorof the window changing to light red. For the response, only onedelay line is used to retain real-time plotting. Due to the fact thatthe attenuation filter adopts smaller values for shorter delay-linelengths, the shortest delay line is chosen as it exhibits instabilitysooner than the others.

The Show IR button located in the top right of the window al-lows the user to toggle between the RT curve and filter’s responseplots and the reverberator’s IR plot, which is shown in Fig. 2b. Asopposed to the response, the longest delay line is used to calcu-late the IR. Even though the effect of the scattering matrix, andwith that the effect of other delay lines, are not included, using thelongest delay line has been proven empirically to give a good in-dication of the audible IR. The values displayed on the x-axis aredetermined by the average slider value, i.e., a shorter reverb time,results in a more detailed plot of the earlier seconds of the IR. Fur-thermore, not every sample is drawn, but 1,000 data points spreadover the plot-range.

3.3. Choice of Delay Lengths and Distribution

Although FDN-based reverbs are nowadays among the most pop-ular algorithmic reverbs, there is no clear rule on how to choosethe lengths of the delay lines [25, 26]. The common practice isto choose the number of samples that are mutually prime and uni-formly distributed between the maximum and minimum lengths toavoid clustering of echoes [26].

DAFx.3

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

18

Page 4: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

Figure 3: The advanced settings window.

Through the Advanced Settings window shown in Fig. 3, thedistribution of delay-line lengths can be chosen through a drop-down menu from four options: Random, Gaussian, Primes, andUniform. Whenever an option is selected, the delay-line lengthsare randomly generated based on the distribution selected and roundedto the nearest integers. The generation can be repeated by clickingon the Randomize button. Furthermore, the minimum (500 sam-ples) and maximum (10,000 samples) delay-line lengths can becontrolled; the minimum difference between the two has been setto 100 samples. Moreover, there is an option to have the lengthspre-defined for each distribution so that the plugin will have thesame behavior every time it is used. The minimum and maximumdelay-line lengths have been empirically set to 1,500 and 4,500samples, respectively (~30–100 ms at fs = 44.1 kHz).

3.4. Choice of Feedback Matrix

The choice of the feedback matrix is crucial for the FDN algo-rithm to work correctly. The popular matrix types used in FDNimplementations that fulfill the requirement of being unilosslessare Hadamard [27], Householder [27], random orthogonal, andidentity matrices [28]. Where the first three are chosen to en-hance specific properties of the algorithm, e.g., density of the im-pulse response, the identity matrix, however, reduces the FDN toa Schroeder reverberator, or a parallel set of comb filters [6, 28].The plugin presented in this study allows the user to choose be-tween these four matrices through a drop-down menu and to learnabout the differences in the sound obtained by changing this partof the FDN reverberator. Additionally, the order of the FDN, andthus the size of the feedback matrix, can be changed. The avail-able options are 2, 4, 8, 16, 32, and 64, which can be chosen froma menu.

In the case of the Householder matrix type, the implementa-tion of matrices of different sizes vary. For all orders except for16, the matrix is constructed using following the formula:

AN = IN − 2

NuNuT

N , (6)

where uTN = [1, . . . , 1], and IN is the identity matrix [27]. The

matrix of order 16, on the other hand, following [29], is con-structed using the recursive embedding of matrix of order 4:

A16 =1

2

A4 −A4 −A4 −A4

−A4 A4 −A4 −A4

−A4 −A4 A4 −A4

−A4 −A4 −A4 A4

. (7)

As a result, the matrix of order 16 consists of the same values,differing only in their sign.

3.5. Code Structure

The plugin is divided into two main components that run on dif-ferent threads at different rates. Firstly, the DSP component run-ning at 44,100 Hz (audio rate), is structured in the same fashionas shown in Fig. 1. An FDN class contains the scattering matrixA, vectors b and c that scale the inputs and outputs of each delayline (marked as bi in Eq. (1b) and ci in Eq. (1a), respectively, andin the current implementation all set to 1), and N instances of theEQDelayLine class. This class, in turn, contains a delay line oflength Li (implemented as a circular buffer) and M instances ofthe Filter class. This class does all the low level computation andcontains the filter states and coefficients bi,m and ai,m of the ithdelay line and the mth octave band.

Secondly, the GUI component running at 5 Hz is responsiblefor the graphics and control of the FDN. Apart from the controls,this component contains the Response class that is used to draw theRT and gain curves and the IR shown in Figs. 2a and 2b. The fil-ter coefficients necessary for drawing the curves are updated at theaforementioned rate. This calculation also provides informationabout the stability of the FDN and is used to trigger the light-redbackground denoting instability. The Response class also containsa single instance of the EQDelayLines class that is used to calcu-late the IR.

Communication from the GUI to the DSP component happensat a 5-Hz control rate, which has been found to be a great trade-offbetween speed and quality of control. When changing any of thenon-RT controls, the GUI triggers flags that are outside of the pro-cess buffer (512 samples) to avoid the manipulation of parameterswhen sample-by-sample calculations are being made.

3.6. Real-time Considerations

The components of the plugin requiring most computations are the(re-)calculation of the filter coefficients and the plotting of the re-sponses. Even though the filter coefficients only need to be recal-culated when the sliders’ values are changed, it is good practice fora plugin to have the same CPU usage when its values are changedas when its values are static to prevent unexpected spikes in theCPU usage. Instead, a Fix coeffs (coefficients) button has beenimplemented that, when clicked, will deactivate the preset’s drop-down menu and the sliders (as shown in Fig. 2b). Furthermore,the plugin will stop recalculating the plots and filter coefficients,greatly decreasing CPU usage (see Sec. 4.2). The CPU usages ofboth threads are shown at the top of the plugin.

When any change is made to the FDN, be it the order, delay-line distribution or length, the delay lines and filter states are set tozero to prevent any unwanted artifacts. Only the RT control worksin real time without emptying the delay lines and filter states.

DAFx.4

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

19

Page 5: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 1Time (s)

0

16

32

DL

num

ber

(a) Distribution of delay-line outputs for the option Primes.

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 1Time (s)

0

16

32

DL

num

ber

(b) Distribution of delay-line outputs for the option Uniform.

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 1Time (s)

0

16

32

DL

num

ber

(c) Distribution of delay-line outputs for the option Random.

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 10

16

32

DL

num

ber

0 0.2 0.4 0.6 0.8 1Time (s)

0

16

32

DL

num

ber

(d) Distribution of delay-line outputs for the option Gaussian.

Figure 4: Distribution of the outputs of 32 delay lines (without scattering) for the options (a) primes, (b) random, (c) uniform, and (d)Gaussian, and the length range (top) pre-defined lengths (1,500–4,500 samples), (middle) lengths randomized over the entire range (500–10,000 samples), and (bottom) lengths randomized over a narrow range (5,000–6,500 samples). Each dot marks the time when the givendelay line outputs a sample.

4. RESULTS AND DISCUSSION

This section presents results regarding the echo density producedby and CPU usage of the plugin.

4.1. Echo Density

To achieve smooth reverberation, a sufficient echo density, i.e., thenumber of echoes per time unit produced by the algorithm andtheir distribution [26], should be obtained. Echo density is affectedby a few factors, such as the lengths and the distribution of thedelay lines, the type of the feedback matrix [30] and its size, all ofwhich are discussed below.

4.1.1. Delay Lengths

The choice of delay-line length-distribution can help avoid morethan one sample appearing at the system’s output at the same timeand a clustering of the echoes, since both of these phenomenalower the echo density. Additionally, the range over which thedelay-line lengths are chosen also affects the quality of the synthe-sized sound. The distribution of delay-line outputs over time, with-out a scattering matrix (i.e., an identity feedback matrix is used),is shown in Fig. 4 for all the options available in the plugin. In thecase of the randomized selection of the delay-line lengths (middleand bottom panes of Figs. 4a–4d), the results show one of the pos-sible configurations. The delay-line lengths used in the examples

DAFx.5

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

20

Page 6: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

were sorted in ascending order.The top panes of Figures 4a–4d show the outputs of the pre-

defined delay lines, which depict the typical behavior of the FDNalgorithm. The outputs become more diffused over time, mak-ing the reverb smoother. It should be noticed, however, that whenusing Uniform distribution, the chosen range is divided into por-tions proportional to the FDN order, and the delay-line lengths arechosen from such “bands”. This makes the consecutive delay linesdiffer by a similar number of samples, and the possibility of outputsamples overlapping or clustering is higher than with other distri-butions. Choosing the Gaussian option, on the other hand, drawsthe delay-line lengths from the normal distribution with the meanbeing the midpoint between the range’s boundaries. This results inchoosing the lengths closer to the mean more often than those fur-ther from it, as depicted in Fig. 4d, potentially causing clusteringof echoes and slowing down the increase of the echo density.

The distribution of outputs presented in the middle panes showthat when the delay-length range is very wide, the output is dif-fused from the beginning. Since such decay is rarely met in reality,it is useful when recreating only specific spaces [31]. Additionally,very short delay lines create clusters of echoes and a huge portionof the output samples overlap. They do not contribute to the in-crease of echo density, but nevertheless add to the computation.Such clusters are well visible in Figs. 4a and 4c. Moreover, theattenuation applied to the short delay lines is usually small, andtherefore closer to 0 dB, which makes them more prone to causingthe system’s instability.

On the other hand, very long delay lines (10,000 samples trans-lates to about 0.23 s for the 44.1-kHz sample rate) may not producea meaningful contribution to the synthesized reverb for low RT val-ues. However, such long delay lines still add to the computation,since the order of the FDN, and at the same time, the size of thefeedback matrix needs to be equal to the number of delay lines.

Using a very narrow range over which the delay-line lengthsare distributed results in clusters of samples arriving at the outputwithin a very short time, as seen in the bottom panes of Figs. 4a–4d. Between the consecutive clusters, however, relatively longsilences occur. The synthesized reverberation tail diffuses veryslowly. Regardless of whether the delay-line lengths are chosen tobe prime, random, distributed normally or uniformly, choosing toonarrow a range results in low sound quality with clearly audiblesegmentation and in the effect’s behavior resembling more that ofa single delay line than a reverb.

4.1.2. Feedback Matrix

The normalized echo densities for all types of matrices availablein the plugin were calculated, following the method presented in[32, 33, 34], for orders 2–64 and the delay lines selected randomlyfrom the range between 1,500 and 4,500 samples (the same set ofdelay lines was used for all calculations). To avoid bias caused bythe smearing of echoes due to the filtering, the attenuation filterswere not used in the calculations. The results are presented inFig. 5 which generally show that the echo density increases fasterwith a higher FDN order than with a lower one.

When matrices of size 2 and 4 are used, the number of echoesin the output of the reverberator increases slowly and may neverreach saturation, i.e., the moment when there is an echo at ev-ery successive time unit [26]. Therefore, these low orders do notproduce smooth sound. In the case of an FDN of order 8, theecho density build-up is slow, which results in audible artifacts in

Table 1: CPU usage for all FDN orders in the cases of unfixed(plotting IR and EQ) and fixed coefficients.

FDN CPU usage (%)Order Unfixed (IR) Unfixed (EQ) Fixed

2 18.4 11.0 3.14 19.8 12.0 5.48 22.7 15.2 7.9

16 28.6 22.2 13.332 46.1 40.2 30.464 110.5 100.1 92.5

synthesized reverbs for as long as one second. Thus, a matrix ofsize 16 is the smallest that increases the number of echoes quicklyenough so that the resulting sound is perceived as smooth for allmatrix types (except for the identity matrix). For the Hadamardand random matrices, a further rise in the size accelerates the echodensity build-up, as evident in Fig. 5c and 5d.

Interestingly, the Householder matrix excels with the order of16 using the recursive embedding of Eq. (7). This can be explainedby the fact that for all other orders, the implementation followsEq. (6), which produces matrices in which the difference betweenthe diagonal and the rest of the elements grows proportionally tothe order. Effectively, this makes the FDN approach a bank ofdecoupled comb filters, which results in high variability of echodensity for orders 32 and 64, as seen in Fig. 5b, leading to audibleartifacts in the reverberation. For the matrix of order 16, however,the echo density increases fast and remains high once saturation isreached.

Because the identity matrices produce a very low echo den-sity that does not increase with time, as seen is Fig. 5a, they arenot well fitted for the FDN. Reverberation synthesized using suchmatrices is always low-quality. Being also an identity matrix, theHouseholder matrix of order 2 should be avoided as well.

4.2. CPU Usage

Table 1 shows the CPU usage for all implemented FDN orders forthree different plugin-states: unfixed coefficients plotting the IR,unfixed coefficients plotting the EQ, and fixed coefficients (plot-ting and recalculation of filter coefficients disabled). The perfor-mance has been measured on a MacBook Pro with a 2.2 GHz Inteli7 processor using Xcode’s time profiler [35].

For all plugin states, the CPU usage increases exponentiallywith the FDN order. Furthermore, fixing the coefficients, and thusdisabling the plotting and filter-coefficient calculation, greatly de-creases the plugin’s CPU usage. Comparing this to the unfixed EQcase, an additional ~8.0% is added to the CPU usage, and whenplotting the IR versus the EQ, an additional ~7.5% is added to theusage. This value, however, depends on the average reverb timeused. For testing, the Concert hall preset was used, which requirescalculating 2.5 s of sound for the IR plot. With a higher averageslider value, and thus a longer IR to be calculated and plotted, theCPU usage also increases.

The smallest useful FDN order is 16, as stated in Sec. 4.1.2.Table 1 shows that this order, or even 32, is unlikely to cause audi-tory drop-outs, especially when the coefficients are fixed.

DAFx.6

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

21

Page 7: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

0 0.2 0.4 0.6 0.8 1 1.2Time (s)

0

0.2

0.4

0.6

0.8

1

Ech

o de

nsity

Order 2Order 4Order 8Order 16Order 32Order 64

(a) Identity matrix.

0 0.2 0.4 0.6 0.8 1 1.2Time (s)

0

0.2

0.4

0.6

0.8

1

Ech

o de

nsity

(b) Householder matrix.

0 0.2 0.4 0.6 0.8 1 1.2Time (s)

0

0.2

0.4

0.6

0.8

1

Ech

o de

nsity

(c) Hadamard matrix.

0 0.2 0.4 0.6 0.8 1 1.2Time (s)

0

0.2

0.4

0.6

0.8

1

Ech

o de

nsity

(d) Random orthogonal matrix.

Figure 5: Normalized echo densities for four types of feedback matrices and different FDN orders.

5. CONCLUSIONS

The present study introduces the FDN-based artificial reverber-ation synthesis plugin. The implementation allows control overthe decay characteristics of the sound in ten octave bands in realtimeand plots the corresponding RT curve, the attenuation filter’smagnitude response, and the IR. Additionally, users can exploredifferent setups of the FDN by changing the type and size of thefeedback matrix, and the lengths and distribution of the delay lines.

Experiments with the delay-line lengths and their distributionssuggest that these parameters should always be used in a balancedmanner, that suit the target reverberation. A wrong choice mayresult in the creation of clusters of output samples and a low echodensity, which is undesirable in a reverberator. Choosing the lengthsover a narrow range results in low-quality, segmented sound, whichdiffuses slowly. Picking the right distribution of delay-line lengthsis also important.

The ability to choose from among different FDN orders showsthat the lowest useful order for high-quality sound processing is16, as it a sufficiently provides fast echo density build-up to ob-

tain smooth reverberation without audible artifacts. Shifting be-tween feedback matrix types proves that the identity matrix, eventhough it is lossless, should not be used in such applications, sincethe produced sound is fluttery. It also shows that, in the case ofthe Householder matrix, implementation affects the reverberation.Results show that using recursive embedding when constructingthe Householder matrix increases the echo density in the producedreverberation.

6. ACKNOWLEDGMENTS

This work was initialized, when Karolina Prawda made a Short-Term Scientific Mission to the Aalborg University Copenhagenfrom October 28 to November 15, 2019.

7. REFERENCES

[1] M. R. Schroeder and B. F. Logan, “Colorless artificial rever-beration,” J. Audio Eng. Soc., vol. 9, no. 3, pp. 192–197, Jul.

DAFx.7

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

22

Page 8: Flexible real-time reverberation synthesis with accurate ......Proceedings of the 23 rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8 12,

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx-20), Vienna, Austria, September 8–12, 2020

1961.

[2] V. Välimäki, J. D. Parker, L. Savioja, J. O. Smith, and J. S.Abel, “Fifty years of artificial reverberation,” IEEE Trans.Audio Speech Lang. Process., vol. 20, no. 5, pp. 1421–1448,Jul. 2012.

[3] N. Peters, J. Choi, and H. Lei, “Matching artificial reverbsettings to unknown room recordings: A recommendationsystem for reverb plugins,” in Proc. Audio Eng. Soc. 133rdConv., San Francisco, CA, USA, Oct. 2012.

[4] C. Kereliuk, W. Herman, R. Wedelich, and D. J. Gillespie,“Modal analysis of room impulse responses using subbandESPRIT,” in Proc. 21st Int. Conf. Digital Audio Effects,Aveiro, Portugal, 4–8 Sept. 2018.

[5] S. Bilbao and B. Hamilton, “Passive volumetric time domainsimulation for room acoustics applications,” J. Acoust. Soc.Am., vol. 145, no. 4, pp. 2613–2624, Apr. 2019.

[6] J. M. Jot and A. Chaigne, “Digital delay networks for de-signing artificial reverberators,” in Proc. 90th Audio Eng.Soc. Conv., Paris, France, 19–22 Febr. 1991.

[7] J. M. Jot and A. Chaigne, “Method and system for artificialspatialisation of digital audio signals,” Feb. 1996, U.S. Patent5,491,754.

[8] S. Heise, M. Hlatky, and J. Loviscach, “Automatic adjust-ment of off-the-shelf reverberation effects,” in Proc. AudioEng. Soc. 126th Conv., Munich, Germany, 7–10 May 2009.

[9] C. Borß, “A VST reverberation effect plugin based on syn-thetic Room Impulse Responses,” in Proc. 12th Int. Conf.on Digital Audio Effects (DAFx-09), Como, Italy, 1–4 Sept.2009.

[10] Ableton, “Convolution reverb,” Available online athttp://www.ableton.com/en/packs/convolution-reverb/, Ac-cessed: 2020-03-16.

[11] S. Philbert, “Developing a reverb plugin; utilizing Faustmeets JUCE framework,” in Proc. Audio Eng. Soc. 143rdConv., New York, NY, USA, 18–21 Oct. 2017.

[12] D. Moffat and M. B. Sandler, “An automated approach to theapplication of reverberation,” in Proc. Audio Eng. Soc. 147thConv., New York, NY, USA, 16–19 May 2019.

[13] T. Erbe, “Building the Erbe-Verb: Extending the feedbackdelay network reverb for modular synthesizer use,” in Proc.Int. Computer Music Conf., Denton, TX, USA, Sept. 2015.

[14] Valhalla DSP, “Valhalla Room,” Available onlineat http://valhalladsp.com/shop/reverb/valhalla-room/, Ac-cessed: 2020-03-31.

[15] S. J. Schlecht and A. P. Habets, “On lossless feedback delaynetworks,” IEEE Trans. Signal Process., vol. 65, no. 6, pp.1554–1564, Mar. 2017.

[16] S. J. Orfanidis, Introduction to Signal Processing, RutgersUniv., Piscataway, NJ, USA, 2010.

[17] V. Välimäki and J. Liski, “Accurate cascade graphic equal-izer,” IEEE Signal Process. Lett., vol. 24, no. 2, pp. 176–180,Feb. 2017.

[18] K. Prawda, S. J. Schlecht, and V. Välimäki, “Improved re-verberation time control for feedback delay networks,” inProc. 22nd Int. Conf. Digital Audio Effects, Birmingham,UK, Sept. 2019.

[19] ROLI, “JUCE,” Available at http://juce.com/, Accessed:2020-04-03.

[20] S. Willemsen, “FDN plugin github release v1.0,” Avail-able at https://github.com/SilvinWillemsen/FDN_/releases/tag/v1.0, Accessed: 2020-03-19.

[21] S. Willemsen, “Real-time FDN,” Available online athttps://youtu.be/ddgKMtW1Obc, Accessed: 2020-03-19.

[22] S. J. Schlecht and A. P. Habets, “Accurate reverberation timecontrol in Feedback Delay Networks,” in Proc. Digital AudioEffects (DAFx-17), Edinburgh, UK, 5–9 Sept. 2017, pp. 337–344.

[23] M. Jeub, M. Schäfer, and P. Vary, “A binaural room impulseresponse database for the evaluation of dereverberation algo-rithms,” in Proc. Int. Conf. Digital Signal Process. (DSP),Santorini, Greece, Jul. 2009, pp. 1–4.

[24] Audiolab University of York, “Open AIR library,” Availableat http://openairlib.net/, Accessed: 2020-04-07.

[25] D. Rocchesso and J. O. Smith, “Circulant and ellipticfeedback delay networks for artificial reverberation,” IEEETrans. Speech and Audio Process., vol. 5, no. 1, pp. 51–63,Jan. 1997.

[26] S. J. Schlecht and E. A. P. Habets, “Feedback delay net-works: Echo density and mixing time,” IEEE/ACM Trans.Audio, Speech Lang. Process., vol. 25, no. 2, pp. 374–383,Feb. 2017.

[27] J. M. Jot, “Efficient models for reverberation and dis-tance rendering in computer music and virtual audio reality,”in Proc. Int. Computer Music Conf., Thessaloniki, Greece,Sept. 1997.

[28] F. Menzer and C. Faller, “Unitary matrix design for diffuseJot reverberators,” in Proc. Audio Eng. Soc. 128th Conv.,London, UK, May 22–25 2010.

[29] J. O. Smith, Physical Audio Signal Processing, http://-ccrma.stanford.edu/˜jos/pasp/, Accessed 2020-04-17, online book, 2010 edition.

[30] O. Das, E. K. Canfield-Dafilou, and J. S. Abel, “On the be-havior of delay network reverberator modes,” in Proc. IEEEWorkshop Appl. Signal Process. Audio Acoustics (WASPAA),New Paltz, NY, USA, Oct. 2019, pp. 50–54.

[31] S. Oksanen, J. Parker, A. Politis, and V. Välimäki, “A di-rectional diffuse reverberation model for excavated tunnelsin rock,” in Proc. IEEE Int. Conf. Acoust. Speech SignalProcess. (ICASSP), Vancouver, Canada, May 2013, pp. 644–648.

[32] J. S. Abel and P. Huang, “A simple, robust measure of rever-beration echo density,” in Proc. Audio Eng. Soc. 121st Conv.,San Francisco, CA, USA, Oct. 2006.

[33] P. Huang and J. S. Abel, “Aspects of reverberation echo den-sity,” in Proc. Audio Eng. Soc. 123rd Conv., New York, NY,USA, Oct. 2007.

[34] P. Huang, J. S. Abel, H. Terasawa, and J. Berger, “Rever-beration echo density psychoacoustics,” in Proc. Audio Eng.Soc. 125th Conv., San Francisco, CA, USA, Oct. 2009.

[35] Apple Inc., “Xcode – Apple Developer,” Available athttps://developer.apple.com/xcode/, Accessed: 2020-03-18.

DAFx.8

DAF2

x21in

Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx2020), Vienna, Austria, September 2020-21

23