General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. Users may download and print one copy of any publication from the public portal for the purpose of private study or research. You may not further distribute the material or use it for any profit-making activity or commercial gain You may freely distribute the URL identifying the publication in the public portal If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Downloaded from orbit.dtu.dk on: Jun 13, 2020 Effects of Expanding Envelope Fluctuations on Consonant Perception in Hearing- Impaired Listeners Wiinberg, Alan; Zaar, Johannes; Dau, Torsten Published in: Trends in Hearing Link to article, DOI: 10.1177/2331216518775293 Publication date: 2018 Document Version Publisher's PDF, also known as Version of record Link back to DTU Orbit Citation (APA): Wiinberg, A., Zaar, J., & Dau, T. (2018). Effects of Expanding Envelope Fluctuations on Consonant Perception in Hearing-Impaired Listeners. Trends in Hearing, 22. https://doi.org/10.1177/2331216518775293



Original Article

Effects of Expanding Envelope Fluctuations on Consonant Perception in Hearing-Impaired Listeners

Alan Wiinberg1, Johannes Zaar1, and Torsten Dau1

Abstract

This study examined the perceptual consequences of three speech enhancement schemes based on multiband nonlinear expansion of temporal envelope fluctuations between 10 and 20 Hz: (a) "idealized" envelope expansion of the speech before the addition of stationary background noise, (b) envelope expansion of the noisy speech, and (c) envelope expansion of only those time-frequency segments of the noisy speech that exhibited signal-to-noise ratios (SNRs) above −10 dB. Linear processing was considered as a reference condition. The performance was evaluated by measuring consonant recognition and consonant confusions in normal-hearing and hearing-impaired listeners using consonant-vowel nonsense syllables presented in background noise. Envelope expansion of the noisy speech showed no significant effect on the overall consonant recognition performance relative to linear processing. In contrast, SNR-based envelope expansion of the noisy speech improved the overall consonant recognition performance equivalent to a 1- to 2-dB improvement in SNR, mainly by improving the recognition of some of the stop consonants. The effect of the SNR-based envelope expansion was similar to the effect of envelope-expanding the clean speech before the addition of noise.

Keywords

consonant recognition, hearing impairment, hearing instruments, speech enhancement, temporal envelope

Date received: 10 August 2017; revised: 13 February 2018; accepted: 16 February 2018

Introduction

People with a sensorineural hearing impairment often complain about difficulties understanding speech in situations with several interfering talkers or background noise, particularly in reverberant environments. Some of these difficulties are considered to be caused by loudness recruitment, reflecting a reduced sensitivity to soft sounds and a steeper loudness growth function than observed in normal-hearing (NH) people (e.g., Fowler, 1936; Steinberg & Gardner, 1937). Modern hearing aids attempt to compensate for loudness recruitment by applying multiband dynamic-range compression (DRC) that provides level-dependent amplification in various frequency bands, such that soft sounds are amplified more than higher level portions of the sound. Apart from reduced audibility, cochlear hearing loss is often associated with a "distortion loss" that is considered to reflect suprathreshold processing deficits and assumed to be caused by inner hair-cell damage or loss of auditory-nerve fibers and synapses (e.g., Festen & Plomp, 1990; Plomp, 1978). One of the perceptual consequences of a distortion loss could be a reduced ability to capture and discriminate envelope fluctuations in a sound (e.g., Schlittenlacher & Moore, 2016; Wiinberg, Jepsen, Epp, & Dau, 2018). The course of the envelope of speech in different frequency bands has been shown to be crucial for speech intelligibility (e.g., Shannon, Zeng, & Kamath, 1995; Stone, Anton, & Moore, 2012; Stone, Fullgrabe, & Moore, 2008) and contains information related to voicing, manner, and place of articulation (Xu, Thompson, & Pfingst, 2005). In the case of a background noise, the modulation depth of the speech envelope becomes reduced because of the less varying noise envelope. This commonly deteriorates speech intelligibility, particularly in listeners with a hearing impairment (e.g., Stone et al., 2008, 2012).

1Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, Lyngby, Denmark

Corresponding author:

Alan Wiinberg, Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, DK-2800 Lyngby, Denmark.

Email: [email protected]

Trends in Hearing

Volume 22: 1–12

© The Author(s) 2018

Reprints and permissions: sagepub.co.uk/journalsPermissions.nav

DOI: 10.1177/2331216518775293

journals.sagepub.com/home/tia

Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

It has been proposed that artificially increasing the modulation depth of the speech envelope may facilitate the extraction of speech cues and thereby improve speech intelligibility in noise (e.g., Plomp, 1988). Increasing the modulation depth of the speech envelope, without affecting the noise, would increase the signal-to-noise ratio (SNR) in the modulation domain, which has been shown to be related to speech intelligibility (Jørgensen, Decorsiere, & Dau, 2015). Consistent with this idea, recent speech intelligibility models based on the SNR in the modulation domain have been able to account for the effects of a large range of interferers and distortion types on speech intelligibility in NH listeners (Chabot-Leclerc, Jørgensen, & Dau, 2014; Chabot-Leclerc, MacDonald, & Dau, 2016; Jørgensen & Dau, 2011; Jørgensen, Ewert, & Dau, 2013).

Different implementations of temporal envelope enhancement schemes have been investigated, with varying degrees of success. Several studies found significant benefits from envelope expansion of the speech before the addition of noise both in NH and hearing-impaired (HI) listeners (e.g., Apoux, Tribut, Debruille, & Lorenzi, 2004; Langhans & Strube, 1982). However, the idealized processing of the speech before the addition of noise requires a priori knowledge of the clean speech signal, which cannot be assumed in practice (e.g., in hearing-aid signal processing schemes). If envelope expansion is instead applied to the noisy speech mixture, both the speech fluctuations and the intrinsic noise fluctuations are enhanced, such that no benefit in terms of the SNR in the modulation domain can be expected. In fact, consistent with this reasoning, several studies that applied envelope expansion to the noisy speech showed no benefit or even a decreased performance relative to linear processing (Freyman & Nerbonne, 1996; Van Buuren, Festen, & Houtgast, 1999) while others showed small benefits (e.g., Apoux et al., 2004; Clarkson & Bahgat, 1991). These results were typically consistent across NH and HI listeners when reduced audibility was compensated for by amplification. Part of the large variability regarding the benefit of envelope expansion across the different studies may have been caused by (a) differences in the details of the expansion schemes employed (e.g., the number of frequency bands, the range of modulation frequencies in which an expansion was applied, envelope thresholding, the amount of expansion, etc.), (b) differences in the modulation spectra of the (stationary vs. fluctuating) noise interferers and the speech material (e.g., sentences vs. consonant-vowel nonsense syllables [CVs]), as well as (c) differences in the tested stimulus SNRs.

In most studies, the envelope expansion was applied to the "entire" modulation-frequency range (e.g., between 0 and 500 Hz). The modulation power of long-term speech typically has a maximum around the syllabic rate, which is about 4 Hz for English, and decays thereafter with increasing modulation frequency (e.g., Plomp, 1988). Boosting modulation frequencies in this low-frequency range around the syllabic rate therefore enhances the overall dynamic range of the speech signal. Consequently, low-level speech segments are suppressed such that they may fall below the detection threshold while high-level speech segments may become uncomfortably loud, particularly for HI listeners with loudness recruitment. Therefore, audibility effects might contribute to the detrimental effects observed with some of the proposed expansion schemes. Using an alternative approach, Langhans and Strube (1982) applied expansion only at modulation frequencies above a lower cutoff modulation frequency of 2 Hz and provided DRC for slow envelope fluctuations (below 2 Hz). The idea behind this approach was that the DRC could compensate for loudness recruitment while the amplitude expansion could enhance speech envelope cues above 2 Hz. Langhans and Strube reported substantial benefits in NH listeners in terms of speech intelligibility when the processing was applied before the addition of noise. Even though this expansion scheme was successful in such idealized conditions, it might not be advantageous when applied at modulation frequencies as low as 2 Hz in the case of HI listeners with loudness recruitment as stimulus audibility might be affected. Alternatively, an enhancement of higher frequency modulations (e.g., above 10 Hz) may increase the robustness of stop consonants and vowel onsets without compromising audibility. For example, the intelligibility of /t/ utterances has been shown to be highly correlated with the detectability of the transient in the release burst when presented in noise (Li, Menon, & Allen, 2010; Regnier & Allen, 2008).

This study investigated the effects of expanding modulation frequencies (in the range from 10 to 20 Hz) on consonant recognition and consonant confusions in NH and HI listeners using consonant-vowel nonsense syllables (CVs) mixed with stationary Gaussian noise. It was hypothesized that the considered envelope expansion will improve the recognition of stop consonants and that detrimental effects caused by the enhancement of noise fluctuations will be minimized if the envelope expansion processing is only applied to those time-frequency segments that are dominated by speech. Three different envelope expansion methods were tested: (a) "idealized" envelope expansion of the speech before the addition of noise, (b) envelope expansion of the noisy speech, and (c) envelope expansion of only those time-frequency segments of the noisy speech that exhibited SNRs above a certain limit. Linear processing was considered as the reference condition. Loss of audibility was compensated for by providing individual linear frequency-dependent amplification for the HI listeners. The experimental data were analyzed with respect to overall and consonant group–specific consonant recognition scores, as well as in terms of listener-specific consonant recognition scores.

Methods

Listeners

Two groups of listeners participated in the experiments, an NH group and an HI group. The NH group consisted of eight adults with a median age of 26 years and ages ranging from 21 to 61 years. All had absolute thresholds below 20 dB HL for the octave frequencies between 0.125 and 8 kHz. The HI group consisted of 12 adults with symmetrical mild to moderately severe sensorineural hearing losses. The median age was 72 years and the range was 50 to 80 years. The absolute thresholds for the test ear, measured using conventional audiometry, are shown in Figure 1. All listeners reported Danish as their first language, signed an informed consent document, and were reimbursed for their efforts. Approval for the study was granted by the Scientific Ethical Committee of the Capital Region in Denmark (De Videnskabsetiske Komiteer for Region Hovedstaden).

Stimuli

The CVs consisted of 15 consonants (/p, t, k, b, d, g, f, s, ʃ, v, j, l, h, m, n/) followed by the vowel /i/. Two tokens (one recording of a female talker and one of a male talker) were selected per CV from the Pitu Danish nonsense syllable speech material (Christiansen, 2011), amounting to 30 tokens overall (15 CVs × two talkers). The tokens represent a subset of the speech tokens used in a recent study on consonant perception in white noise (Zaar & Dau, 2015), which considered three recordings of each CV per talker. For each CV, the most intelligible recording of each talker was selected in this study. The levels of the tokens were equalized using VUSOFT, a software implementation of an analog VU-meter developed by Lobdell and Allen (2007), such that all CVs showed the same VUSOFT peak value. This equalization strategy is mainly based on the vowel levels, thus ensuring realistic relations between the levels of the individual consonants. After equalization, the reference speech level for the SNR calculation was defined as the overall root-mean-square (RMS) level averaged across all speech tokens.

SNR conditions of 12, 6, and 0 dB were generated by fixing the noise level and adjusting the level of the speech tokens based on the reference speech level according to the desired SNR. The speech tokens were mixed with stationary Gaussian noise such that the speech token onset was temporally positioned 400 ms after the noise onset. The stimulus duration was 1 s, including 50-ms raised-cosine onset and offset ramps for the noise. The sound pressure level (SPL) of the noise was set to 65 dB, while the overall stimulus level differed depending on the level of the speech, that is, on the SNR. Envelope-expanded signals (clean speech or noisy speech) were equalized in RMS level with the corresponding signals obtained without expansion processing. For the HI listeners, the stimuli were linearly amplified according to the NAL-R(P) frequency-dependent prescription rule based on their individual audiometric thresholds (Byrne, Parkinson, & Newall, 1990). The frequency-dependent amplification was provided using a bank of seven octave-wide bandpass linear-phase, finite-impulse-response (FIR) filters with center frequencies between 0.125 and 8 kHz.
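The mixing rule described above (noise level fixed, speech scaled to reach the desired SNR, speech onset 400 ms into a 1-s noise) can be sketched in a few lines of Python. This is only an illustration and not the authors' MATLAB code: the function names `rms` and `mix_at_snr` are hypothetical, a tone burst would have to stand in for a CV token, the reference RMS is passed in by the caller rather than averaged over the 30 tokens, and the 50-ms raised-cosine ramps are omitted.

```python
import math
import random


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db, ref_speech_rms, onset):
    """Add scaled speech to a fixed-level noise.

    The noise is left untouched; the speech is scaled so that the
    reference speech RMS lies snr_db above the noise RMS.
    `onset` is the sample index at which the speech starts.
    Returns the mixture and the scale factor applied to the speech.
    """
    if onset + len(speech) > len(noise):
        raise ValueError("speech does not fit inside the noise")
    target_rms = rms(noise) * 10 ** (snr_db / 20)
    scale = target_rms / ref_speech_rms
    mixture = list(noise)
    for i, s in enumerate(speech):
        mixture[onset + i] += scale * s
    return mixture, scale
```

A call such as `mix_at_snr(token, noise, 6.0, rms(token), int(0.4 * 44100))` would correspond to the 6-dB condition under these assumptions.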

Envelope Expansion Processing

Figure 1. Mean absolute thresholds for the tested ear of the HI listeners, measured using conventional manual audiometry, and expressed in dB HL. Error bars represent ±1 standard deviation. HI = hearing impaired.

The proposed multiband envelope expansion algorithm, depicted in Figure 2, is similar to the algorithm described in Langhans and Strube (1982). The input signal was short-time Fourier transformed by Hann-windowing the signal in time frames of 256 samples with 75% overlap between frames using a sampling rate of 44100 Hz. Each of the windowed segments was padded with 128 zeros at the beginning and the end and transformed to the spectral domain using a 512-point fast Fourier transform (FFT). The power spectral density of the resulting frequency bins was combined into 15 third-octave wide frequency bands with center frequencies between 0.323 and 8.192 kHz. The power in each band was converted to dB SPL, and the resulting logarithmic representation of the temporal envelope was bandpass filtered over time frames using a zero-phase fourth-order Chebyshev Type II filter (24 dB/octave roll-off) with 3-dB cutoff frequencies at 10 and 20 Hz. The bandwise expansion gains per time frame were computed by multiplying the bandpass-filtered envelopes with a scaling factor of 1.3. Thus, a bandpass-filtered level of 1 dB resulted in an amplification of the output level by 1.3 dB. The value of the scaling factor was based on data from Wiinberg et al. (2018). The factor was chosen such that the expansion processing restored the average modulation-depth discrimination performance of the HI listener group to that of the NH listener group at a modulation frequency of 16 Hz. The bandwise gains were converted to linear units and smoothed in the frequency domain using a piecewise cubic interpolation to avoid aliasing artifacts. The frequency-smoothed gains were applied to the bins of the short-time Fourier transformed input stimulus and an inverse FFT was applied to produce time segments of the envelope-expanded stimuli. These time segments were subsequently windowed with a Hann window to avoid aliasing artifacts and combined using an overlap-add method to provide the processed temporal waveform.
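A few quantities implied by this description can be worked out directly, which may help when trying to reproduce the processing. The Python sketch below is not the authors' implementation. It assumes that the 15 band centers are spaced exactly one third of an octave apart and anchored at 8192 Hz (which reproduces the stated lower limit of about 323 Hz), and that the dB gain is converted to an amplitude factor with the usual 20·log10 convention; all function names are hypothetical.

```python
import math

FS = 44100               # sampling rate in Hz (from the text)
FRAME_LEN = 256          # Hann window length in samples (from the text)
OVERLAP = 0.75           # overlap between frames (from the text)
N_BANDS = 15             # number of third-octave bands (from the text)
TOP_CENTER_HZ = 8192.0   # highest center frequency (from the text)
EXPANSION_FACTOR = 1.3   # scaling factor for the filtered envelope (from the text)


def hop_size():
    """Frame advance in samples: 256 samples with 75% overlap."""
    return int(FRAME_LEN * (1 - OVERLAP))


def envelope_rate_hz():
    """Rate at which the band envelopes are sampled (one value per frame)."""
    return FS / hop_size()


def band_centers_hz():
    """Assumed third-octave spacing, anchored at the top center frequency."""
    return [TOP_CENTER_HZ * 2 ** ((k - (N_BANDS - 1)) / 3) for k in range(N_BANDS)]


def expansion_gain_db(bandpassed_level_db):
    """Gain in dB for one band and frame: 1.3 times the filtered level."""
    return EXPANSION_FACTOR * bandpassed_level_db


def db_to_amplitude(gain_db):
    """Assumed conversion of a dB gain to a linear amplitude factor."""
    return 10 ** (gain_db / 20)
```

Under these assumptions the frame advance is 64 samples, so each band envelope is sampled at roughly 689 Hz, comfortably above the 10 to 20 Hz passband of the modulation filter.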

For the SNR-based expansion scheme, a priori information about the speech and noise components of the noisy speech mixture was used. The power of both the speech and noise components was computed for each of the 15 frequency bands and the SNR was calculated in dB. For time-frequency segments with SNRs below −10 dB, the expansion gain was set to 0 dB. Otherwise, the expansion gain was not changed.
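The gating rule in this paragraph amounts to a single comparison per time-frequency segment. A minimal Python sketch follows, with hypothetical function names and with the handling of zero-power segments added as an assumption of this example (the text does not discuss that case).

```python
import math

SNR_LIMIT_DB = -10.0  # limit stated in the text


def band_snr_db(speech_power, noise_power):
    """SNR in dB of one time-frequency segment from a priori powers."""
    if speech_power <= 0:
        return float("-inf")   # assumption: silent speech counts as noise-dominated
    if noise_power <= 0:
        return float("inf")    # assumption: silent noise counts as speech-dominated
    return 10 * math.log10(speech_power / noise_power)


def gate_expansion_gain(gain_db, speech_power, noise_power, limit_db=SNR_LIMIT_DB):
    """Return 0 dB for segments below the SNR limit, else the unchanged gain."""
    if band_snr_db(speech_power, noise_power) < limit_db:
        return 0.0
    return gain_db
```

For instance, a segment whose speech power is 1% of the noise power (SNR of −20 dB) would receive no expansion, while a segment at 0 dB SNR would keep its computed gain.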

As listed in Table 1, three different envelope expansion settings were tested: envelope expansion of the noisy speech (Expmix), envelope expansion applied to time-frequency segments with SNRs above −10 dB (ExpSNR), and envelope expansion of the speech before the addition of noise (Expspeech).

Figure 3 shows the temporal waveform of the male speech token /mi/ along with the waveforms obtained with the same speech token mixed with noise at 0-dB SNR for linear processing and the three envelope expansion conditions. For illustration purposes, only the results at the output of an auditory-inspired gammatone filter tuned to 500 Hz are shown. From the top, the panels show the temporal output of the gammatone filter for the clean speech, linear processing, Expmix, ExpSNR, and Expspeech conditions, respectively. The illustration shows that Expspeech enhances only the speech modulations, ExpSNR enhances the modulations of the noisy-mixture portions with little noise contributions, and Expmix "blindly" enhances the modulations in the entire noisy mixture, irrespective of whether a particular portion of the signal is dominated by noise or speech.

Figure 2. Block diagram of the proposed envelope expansion algorithm. First, the signal was windowed in time segments and transformed into the frequency domain by an STFT. The frequency bins in each time window were combined into 15 third-octave spaced frequency bands (Filterbank). The power in each band was converted to dB SPL (lin2dB) and bandpass filtered (Bandpass filter). The filtered temporal envelope was then multiplied by an expansion factor. The bandpass-filtered temporal envelope in each of the frequency bands was converted to linear units (dB2lin) and thereafter used as gain values for the input. For the SNR-based expansion scheme, indicated in gray, the gain was set to 0 dB for time-frequency bands with SNRs below a certain limit. Finally, an ISTFT was computed to generate the final expanded signal. ISTFT = inverse short-time Fourier transform; STFT = short-time Fourier transform.

Table 1. Overview of the Three Different Envelope Expansion Conditions.

Abbreviation    Processing      Expander mode
Expmix          Noisy speech    Envelope expansion
ExpSNR          Noisy speech    SNR-based envelope expansion
Expspeech       Clean speech    Envelope expansion

SNR = signal-to-noise ratio.

Experimental Design

A control condition with speech presented in quiet was defined as "Q65." This control condition was included to evaluate whether the CV tokens were sufficiently audible in quiet at the lowest speech level occurring in the SNR conditions. The clean speech (without envelope enhancement) was therefore presented at 65 dB SPL for the NH listeners and 65 dB SPL + NAL-R(P) amplification for the HI listeners, corresponding to the speech level in the 0-dB SNR condition. The experimental sessions were split into four consecutive blocks corresponding to the four signal-processing conditions (Lin, Expmix, Expspeech, and ExpSNR). For each of the listener groups, the order of presentation for the experimental blocks was counterbalanced using a Latin-square design to control for order effects. In order to get the listeners accustomed to the task, the "easy" control listening condition Q65 was presented first. Within each of the succeeding four experimental blocks, the three SNR conditions were ranked from easy to difficult, that is, with SNR tested in the order 12, 6, and 0 dB. For each of the SNR conditions, the 30 CV tokens were presented in random order within each of five repetition blocks. This was done to facilitate the evaluation of potential learning effects.
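The block structure described here can be made concrete with a short Python sketch. The text does not say which Latin square was used, so the cyclic square below is purely an assumption, as are the function names and the mapping from listener index to row. The Q65 control block that precedes the four processing blocks is left out.

```python
import random

CONDITIONS = ["Lin", "Expmix", "Expspeech", "ExpSNR"]  # processing blocks (from the text)
SNRS_DB = [12, 6, 0]      # easy to difficult (from the text)
N_REPETITIONS = 5         # repetition blocks per SNR (from the text)
N_TOKENS = 30             # CV tokens per repetition block (from the text)


def cyclic_latin_square(items):
    """Each item appears once per row and once per column (assumed design)."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def session_plan(listener_index, rng):
    """List of (condition, snr_db, repetition, token_order) for one listener."""
    square = cyclic_latin_square(CONDITIONS)
    block_order = square[listener_index % len(square)]
    plan = []
    for condition in block_order:
        for snr_db in SNRS_DB:
            for repetition in range(N_REPETITIONS):
                token_order = list(range(N_TOKENS))
                rng.shuffle(token_order)  # random token order within the block
                plan.append((condition, snr_db, repetition, token_order))
    return plan
```

With these numbers, each listener completes 4 × 3 × 5 = 60 repetition blocks of 30 tokens, that is, 1,800 noisy trials in addition to the control condition.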

Procedure and Apparatus

Figure 3. Waveforms of the male speech token /mi/ along with waveforms obtained with the same speech token mixed with noise at 0-dB SNR for linear processing and the three envelope expansion conditions. For illustration purposes, only the results at the output of an auditory-inspired gammatone filter tuned to 500 Hz are shown. From the top, the panels show the temporal output of the gammatone filter for the clean speech, linear processing, Expmix, ExpSNR, and Expspeech conditions, respectively. The ordinate is the signal magnitude, expressed in arbitrary linear units. The abscissa is time, expressed in milliseconds. SNR = signal-to-noise ratio.

All signals were generated digitally in MATLAB (Version 2015b; The MathWorks, Inc., Natick, MA, United States) on a PC equipped with an RME UCX Fireface sound card at a sampling rate of 44.1 kHz and with a resolution of 16 bits per sample. The stimuli were presented in a sound-attenuating booth via Sennheiser HD 650 headphones to the better ear of the listeners, as derived from the average of the audiometric thresholds at 500 Hz, 1000 Hz, and 2000 Hz. The transfer function of each earpiece of the headphones was digitally equalized (101-point FIR filter) to produce a flat frequency response for frequencies between 0.100 and 10 kHz, measured with an ear simulator (B&K 4153) and a flat plate adaptor as specified in IEC 60318-1 (2009).
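The better-ear criterion mentioned above (average threshold at 500, 1000, and 2000 Hz) is simple to state in code. In the Python sketch below, the function names are hypothetical, and the tie-breaking rule (left ear on equal averages) is an assumption because the text does not specify one.

```python
def pure_tone_average(thresholds_db_hl, freqs_hz=(500, 1000, 2000)):
    """Mean audiometric threshold (dB HL) over the given frequencies."""
    return sum(thresholds_db_hl[f] for f in freqs_hz) / len(freqs_hz)


def better_ear(left_thresholds, right_thresholds):
    """Return 'left' or 'right' for the ear with the lower average threshold."""
    left = pure_tone_average(left_thresholds)
    right = pure_tone_average(right_thresholds)
    return "left" if left <= right else "right"  # assumption: ties go to the left ear
```

Each argument is a mapping from frequency in Hz to threshold in dB HL, for example `{500: 30, 1000: 40, 2000: 50}` for a made-up audiogram.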

Statistical Analysis

An analysis of variance (ANOVA) was conducted on a mixed-effect model to evaluate whether hearing impairment, SNR, and processing condition had an effect on consonant recognition performance. In the mixed-effect model, listeners were nested within hearing status (NH vs. HI). Listeners and repetitions were treated as random block effects, while SNR, processing condition, and hearing status were treated as fixed effects. The random-listener effect accommodates the repeated-measures design by assuming that observations from the same listener are correlated. The assumptions underlying a parametric analysis were met without transforming the dependent variable. Tukey's Honestly Significant Difference corrected post hoc tests were conducted to test for main effects and interactions. Effects were considered statistically significant at the 5% level. The statistical analysis was performed using the lme4 and lsmeans packages in R (Bates, Machler, Bolker, & Walker, 2015; Lenth, 2016).

Results

Consonant Recognition Scores of NH and HI Listeners

Figure 4 shows the consonant recognition scores obtained with the four different signal-processing conditions for the NH listeners (left panel) and the HI listeners (right panel) as a function of the SNR. The consonant recognition scores were calculated as the mean percentage correct across all consonants, talkers, repetitions, and listeners for both listener groups. For all SNRs and processing conditions, the consonant recognition scores were poorer for the HI than for the NH listeners. The scores generally increased with SNR and reached their maximum value for the quiet condition. The results showed that, for both listener groups, the two expansion conditions Expspeech (squares) and ExpSNR (triangles) provided a small but consistent improvement relative to linear processing (asterisks) except for the Expspeech results for the NH listeners at 12 dB SNR where a slightly detrimental effect was found. In contrast, the condition Expmix (circles) provided a small improvement for the NH listeners but not for the HI listeners.

Figure 4. Overall consonant recognition scores for the NH listeners (left) and the HI listeners (right) as a function of the SNR for the four different signal-processing conditions. (Circles: Expmix, triangles: ExpSNR, asterisks: linear processing, squares: Expspeech). The error bars represent ±1 standard error of the mean. A slight horizontal jitter was added to the data for better readability. HI = hearing impaired; NH = normal hearing.

The outcomes of the ANOVA, summarized in Table 2, showed main effects of hearing impairment, SNR, and processing condition as well as an interaction between hearing impairment and SNR. The effects of the envelope expansion schemes were largely consistent across NH and HI listeners. Post hoc comparisons confirmed that the consonant recognition performance was improved in the Expspeech condition (by 1.8 percentage points, p = .008) and the ExpSNR condition (by 2.1 percentage points, p = .001), relative to the linear processing condition. The standard error was 0.5 percentage points in both cases. In contrast, the consonant recognition scores for the Expmix and linear processing conditions were not significantly different (p = .99). There were no significant differences between the consonant recognition scores for the Expspeech and ExpSNR conditions (p = .95), but the scores in both of these conditions were significantly higher than in the Expmix condition (p = .01).

An alternative, more familiar, performance measure is the change in SNR corresponding to the improvement in recognition scores. The statistical analysis of the data (shown in Figure 4) indicates that the improvement in terms of percentage correct for the ExpSNR and Expspeech conditions versus linear processing was roughly constant across the tested SNRs, as there was no interaction between processing condition and SNR. Psychometric functions fitted to the data points obtained with linear processing in Figure 4 revealed that the recognition-score improvement for the ExpSNR and Expspeech conditions relative to linear processing was equivalent to a 1-dB change in SNR for the NH listeners. For the HI listeners, this improvement amounted to a 1.9-dB change in SNR. The difference in SNR improvement between the two listener groups, despite similar recognition-score improvements, was caused by differences in the slopes of the respective psychometric functions, which were shallower for the HI listeners than for the NH listeners.
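The conversion from a score improvement to an equivalent SNR change can be illustrated with a generic psychometric function. The Python sketch below assumes a logistic function running from 0 to 100% correct; the study does not state which function was fitted, and the midpoint and slope values are free parameters here rather than values reported in the article. All function names are hypothetical.

```python
import math


def logistic_score(snr_db, midpoint_db, slope_per_db):
    """Percent correct from an assumed logistic psychometric function."""
    return 100.0 / (1.0 + math.exp(-slope_per_db * (snr_db - midpoint_db)))


def snr_for_score(score, midpoint_db, slope_per_db):
    """Inverse of logistic_score; score must lie strictly between 0 and 100."""
    p = score / 100.0
    return midpoint_db - math.log(1.0 / p - 1.0) / slope_per_db


def equivalent_snr_gain(snr_db, benefit_pp, midpoint_db, slope_per_db):
    """SNR increase that would raise the linear-processing score by benefit_pp."""
    base = logistic_score(snr_db, midpoint_db, slope_per_db)
    return snr_for_score(base + benefit_pp, midpoint_db, slope_per_db) - snr_db
```

For a fixed benefit in percentage points, a shallower function yields a larger equivalent SNR gain, which is the mechanism the paragraph above gives for the difference between the two listener groups.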

Figure 5 compares the consonant recognition scores obtained in the linear reference condition to those obtained in the three expansion conditions. To evaluate how the individual expansion schemes affect different phonetic categories, the recognition scores were averaged within the categories /p,k,t/ (blue), /b,g,d/ (green), /f,s,ʃ,v/ (red), /n,m/ (black), and /h,j,l/ (cyan). The average recognition scores obtained with the three expansion schemes (left: Expmix, middle: ExpSNR, right: Expspeech) are shown as a function of the average recognition scores obtained with linear processing. The results for the NH listeners are shown in the top panels and the results for the HI listeners are shown in the bottom panels. None of the expansion schemes had a detrimental effect on the recognition scores in the NH listeners, as no points fall more than one percentage point below the diagonals in the top panels of Figure 5. As expected, the recognition scores were mainly increased for the stop consonants /p,k,t/ (blue). This improvement was largest for ExpSNR (upper middle panel), slightly smaller for Expmix (upper left panel), and small for the "ideal" Expspeech (upper right panel). In contrast to the NH listeners, the expansion schemes had a detrimental effect on the HI listeners for the consonant groups /n,m/ and /b,g,d/ (bottom panels of Figure 5). However, similar to the effects observed in the NH listeners, the recognition scores for /p,k,t/ (blue) were increased substantially in all expansion conditions. The effects of the expansion schemes varied strongly across the consonant groups. Interestingly, Expmix did not affect consonant recognition for the fricatives /f,s,ʃ,v/ (red dot, lower left panel), whereas ExpSNR (red dot, lower middle panel) and Expspeech (red dot, lower right panel) provided a benefit of 5% and 4%, respectively. Overall, the SNR-based expansion ExpSNR provided the largest benefits for /f,s,ʃ,v/ and the smallest detrimental effects (−2% for /n,m/ and −3% for /b,g,d/) in the HI listeners.

Individual Listener Analysis

The abovementioned analysis focused on group averages, showing moderate improvements of consonant recognition scores induced by the envelope expansion on a group level. However, the individual listeners may have experienced largely different benefits from the expansion processing. To analyze the individual differences in benefit, Figure 6 shows a scatter plot of the across-SNR average consonant recognition performance with linear processing on the abscissa and the across-SNR average performance with ExpSNR on the ordinate. Each symbol in Figure 6 represents the result for an individual listener (circles: NH; triangles: HI). The scatter plot reveals that the improvement in the overall recognition performance for the ExpSNR conditions was mainly driven by the six listeners (4 HI, 2 NH) for whom the expansion processing was most beneficial (on average 6.8% and 9.4%, respectively, for the ExpSNR and Expspeech conditions). For the 14 other listeners, the expansion processing affected the consonant recognition performance by less than ±5 percentage points (0.1% and −1.0% on average for ExpSNR and Expspeech, respectively).

Table 2. Summary of the ANOVA Outcomes for a Mixed-Effect Model Fitted to the Consonant Recognition Data With a Between-Listener Factor of Hearing Impairment, and Within-Listener Factors of SNR and Processing Condition.

| Factor | df | F ratio | Probability |
|---|---|---|---|
| Processing condition | (3, 1133) | 7.68 | <.001 |
| SNR | (2, 36) | 292.93 | <.001 |
| Hearing impairment | (1, 18) | 26.04 | <.001 |
| Hearing Impairment × SNR | (2, 36) | 22.38 | <.001 |
| Processing Condition × SNR | (6, 1127) | 1.89 | .08 |
| Processing Condition × Hearing Impairment | (3, 1124) | 2.08 | .10 |
| Hearing Impairment × Processing Condition × SNR | (6, 1118) | 1.26 | .27 |

ANOVA = analysis of variance; SNR = signal-to-noise ratio.

Discussion

Increasing the modulation depth of the speech envelope has been suggested to facilitate the extraction of speech cues and thereby improve speech intelligibility in noise. In this study, the effects of three different envelope expansion schemes that increase the depth of envelope fluctuations between 10 and 20 Hz were tested in a consonant identification task. Envelope expansion of the noisy speech showed no significant effect on the overall consonant recognition performance relative to linear processing, neither for the NH nor the HI listeners. This finding is consistent with results from earlier studies that investigated the effect of expanding the envelope of noisy speech (Apoux, Crouzet, & Lorenzi, 2001; Clarkson & Bahgat, 1991; Van Buuren et al., 1999). While the processing improved the intelligibility of some of the plosives, most likely because of an enhancement of the detectability of the transient release bursts, this was accompanied by an increased proportion of consonant confusions for the other consonant categories. In contrast, SNR-based envelope expansion of the noisy speech, which confined the enhancement to the time-frequency segments in which the speech power was present, improved the overall consonant recognition performance both for the NH and the HI listeners. Interestingly, the effect of the SNR-based envelope expansion was found to be similar to the effect of envelope-expanding the clean speech before the addition of noise.

Figure 5. Scatter plot of consonant recognition in percentage measured with the linear condition versus the three expansion conditions (from left to right: Expmix, ExpSNR, and Expspeech) in NH (top) and HI (bottom) listeners. The consonant recognition scores were averaged within the phonetic categories shown in the legend. Within each panel, the solid gray line represents equal performance while the dashed lines represent ±10% differences induced by the respective expansion scheme. HI = hearing impaired; NH = normal hearing.

While the expansion benefit in terms of recognition-score improvement was substantial for some listeners (about 10 percentage points), the average effect for the entire population was relatively small (about two percentage points improvement of consonant recognition). Nevertheless, the observation of similar results obtained with the SNR-based processing and the clean-speech envelope expansion is promising, given that previous studies reported substantial improvements in speech perception with envelope expansion of clean speech (e.g., Apoux et al., 2004; Langhans & Strube, 1982). This suggests that the SNR-based envelope expansion scheme proposed in this study could provide larger improvements in speech perception when combined with alternative parameter settings, such as those considered in the earlier studies. In contrast to the expansion of clean speech before the addition of noise, SNR-based envelope expansion is feasible in hearing-aid algorithms using blind SNR-estimation methods (e.g., Gerkmann & Hendriks, 2012; Martin, 2001) such that this approach might help improve speech perception in hearing-aid users.
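The idea of confining the expansion to segments where speech dominates can be sketched as a simple gating rule. The following is a hypothetical illustration only: the hard threshold, the expansion factor, the function name, and the representation of envelope fluctuations in dB relative to a local mean are all assumptions made for the example, not the processing used in this study. The per-segment SNR estimates are taken as given; in a hearing aid they would have to come from a blind estimator, which is not implemented here.

```python
def snr_gated_expansion(fluct_db, snr_est_db, expansion_factor=2.0,
                        snr_threshold_db=0.0):
    """Expand envelope fluctuations only where the estimated SNR is high.

    fluct_db:   envelope fluctuations per segment, in dB re. the local mean
                (hypothetical representation chosen for this sketch).
    snr_est_db: estimated SNR per segment, in dB (assumed to be available).
    Segments whose estimated SNR exceeds the threshold are scaled by
    expansion_factor; all other segments are passed through unchanged.
    """
    if len(fluct_db) != len(snr_est_db):
        raise ValueError("fluct_db and snr_est_db must have equal length")
    return [
        f * expansion_factor if s > snr_threshold_db else f
        for f, s in zip(fluct_db, snr_est_db)
    ]


if __name__ == "__main__":
    fluctuations = [3.0, -2.0, 1.0]   # invented values
    snr_estimates = [5.0, -5.0, 0.0]  # invented values
    print(snr_gated_expansion(fluctuations, snr_estimates))
```

A hard threshold is the simplest possible choice; a gain that varies smoothly with the estimated SNR would be an equally plausible design and would avoid abrupt switching between segments.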

When expressing the changes in consonant recognition performance as an equivalent change in SNR, the net effect was an improvement (relative to the linear processing condition) that was about 1 dB larger for the HI listeners than for the NH listeners in the ExpSNR and Expspeech conditions. Thus, a larger increase in SNR is required for the HI listeners than for the NH listeners to obtain the same increase of the recognition score. The benefit achieved with the expansion processing may thus be larger for HI listeners than for NH listeners.

The relatively small (yet statistically significant) improvements in consonant recognition performance induced by the proposed envelope expansion processing indicate that the chosen parameter settings were suboptimal and should therefore be optimized to achieve benefits that justify a potential hearing-aid application. Consistent with the results from this study where modulation frequencies between 10 and 20 Hz were enhanced, envelope filtering studies have demonstrated that the contribution of envelope fluctuations above 12 Hz to phoneme intelligibility is small in quiet and in stationary background noise (Drullman, Festen, & Plomp, 1994; Xu et al., 2005; Xu & Zheng, 2007). However, ceiling effects were observed in those studies and it thus remained unclear whether this finding could be reproduced if the experiment was not confounded by such effects. In contrast, in terms of sentence intelligibility, the envelope expansion of (clean) speech has been shown to provide a greater benefit when applied to a wider range of modulation frequencies. For example, Apoux et al. (2004) showed that their expansion processing of modulation frequencies in the range 0 to 256 Hz was more effective than in the range 0 to 16 Hz. Therefore, it is possible that an additional enhancement of a wider range of modulation frequencies (below 10 Hz and above 20 Hz) would increase the benefit provided by the expansion processing. However, it should be taken into account that while expanding clean speech up to modulation frequencies in the range of the fundamental frequency (about 100–200 Hz) may yield more robust periodicity information, this approach might be detrimental when applied to a noisy signal where the SNR in the modulation domain typically decreases monotonically with increasing modulation frequency. In any case, this type of expansion would still require that modulation frequencies below the syllabic rate are not enhanced to avoid compromising audibility for the HI listeners. Furthermore, expansion of slow envelope fluctuations tends to decrease the consonant-vowel intensity ratio (CVR), as the processing enhances high-intensity vowels more than low-intensity consonants (Apoux et al., 2004), which, in turn, may affect consonant recognition performance (Freyman & Nerbonne, 1989). A possible solution may be to apply expansion processing at modulation frequencies between 4 and 256 Hz in combination with amplitude compression of the slow envelope fluctuations below 4 Hz, such that the CVR is increased as compared with the case where only expansion is applied.
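The effect of expanding or compressing slow envelope fluctuations on the CVR can be made concrete with a small numerical sketch. It assumes a simplified power-law model in which the envelope is raised to a fixed exponent relative to a reference level, so that level differences in dB are scaled by that exponent. The model, the function names, and the levels are illustrative assumptions, not the processing or stimuli of this study.

```python
def apply_power_law(level_db, reference_db, exponent):
    """Level after a power law applied to the envelope re. a reference level.

    Raising the envelope to `exponent` scales its distance from the
    reference in dB: exponent > 1 expands, exponent < 1 compresses.
    """
    return reference_db + exponent * (level_db - reference_db)


def cvr_db(consonant_level_db, vowel_level_db):
    """Consonant-vowel intensity ratio in dB."""
    return consonant_level_db - vowel_level_db


def cvr_after_processing(consonant_level_db, vowel_level_db, reference_db,
                         exponent):
    """CVR after applying the same power law to both segments."""
    consonant_out = apply_power_law(consonant_level_db, reference_db, exponent)
    vowel_out = apply_power_law(vowel_level_db, reference_db, exponent)
    return cvr_db(consonant_out, vowel_out)


if __name__ == "__main__":
    # Invented levels: a 60-dB consonant, a 70-dB vowel, a 65-dB reference.
    for label, exponent in (("unprocessed", 1.0), ("expanded", 2.0),
                            ("compressed", 0.5)):
        print(label, cvr_after_processing(60.0, 70.0, 65.0, exponent))
```

Under this simplified model the CVR in dB is multiplied by the exponent, which is why expansion of the slow fluctuations lowers the CVR while compression raises it.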

Figure 6. Scatter plot of consonant recognition performance with ExpSNR as a function of recognition performance with linear processing. Circles and triangles show results for NH and HI listeners, respectively. For visual clarity, the data were averaged across SNR conditions. The solid, dotted, and dashed lines represent equal performance, ±5 percentage-point improvements, and ±10 percentage-point improvements, respectively, obtained with the expansion processing relative to linear processing. HI = hearing impaired; NH = normal hearing; SNR = signal-to-noise ratio.

The rationale for using stationary background noise rather than fluctuating background noise in this study was to maximize the benefit provided by the expansion processing in terms of consonant recognition. This expectation was based on the results from Apoux et al. (2004) who found larger benefits provided by their expansion processing in terms of word recognition scores in stationary noise than in fluctuating noise. Furthermore, supraprocessing deficits have been shown to provide stronger links to speech intelligibility in stationary noise than in fluctuating noise (e.g., Van Esch & Dreschler, 2015). Hence, the effect of the proposed expansion schemes may be smaller for fluctuating background noise maskers with a more similar modulation spectrum to the target speech. However, it should be noted that the bandpass filtering applied in the expansion algorithm corresponds to low-pass filtering in the modulation domain, such that the individual frequency bands of the noise considered for envelope expansion were in fact highly modulated. This makes the distinction between stationary and fluctuating noise less prominent than in the case of a wideband envelope expansion scheme as used in the Apoux et al. (2004) study.

It has been demonstrated that listeners can learn to adapt to artificially produced, nonlinear changes of the natural auditory cues that are used for auditory perception. For example, frequency-lowering signal processing strategies have been implemented in hearing aids. Frequency lowering shifts acoustic cues from high-frequency regions to lower frequencies where audibility is typically better, thereby potentially improving the listener’s access to the speech cues (for a review, see Simpson, 2009). Several studies have indicated that a period of acclimatization was necessary before frequency lowering provided benefits in speech recognition (Ellis & Munro, 2015; Glista, Scollie, & Sulkers, 2012; Wolfe et al., 2011). Thus, the benefit of nonlinear signal processing schemes, such as envelope expansion, may not be immediately apparent when assessed without a period of acclimatization.

The observed improvements in consonant recognition performance induced by the proposed envelope expansion schemes were found in a subgroup of the listeners, that is, only selected listeners benefited from this type of processing. Further research is needed to clarify why these differences in benefit occur and to establish to what extent they are related to intersubject variability caused by the experimental design and to what extent these differences can be accounted for by individual differences in psychoacoustic measures, such as temporal envelope detection and discrimination (e.g., Schlittenlacher & Moore, 2016; Wiinberg et al., 2018) or in terms of acclimatization to the processing.

Conclusion

This study investigated the effect of expanding envelope fluctuations between 10 and 20 Hz on consonant recognition performance in NH and HI listeners. Envelope expansion of noisy speech showed no significant effect on the overall consonant recognition performance relative to linear processing. In contrast, SNR-based envelope expansion of the noisy speech improved the overall consonant recognition performance by about two percentage points, mainly resulting from an improved recognition of some of the stop consonants. If the change in performance was expressed in terms of equivalent change in SNR, the net effect was an improvement (relative to the linear condition) of 1 dB and 1.9 dB for the NH and HI listeners, respectively. The effect of the SNR-based envelope expansion was comparable with the effect of “idealized” envelope expansion of the clean speech before the addition of noise. The size of the measured effects was relatively small compared with other related studies, indicating that extending the enhanced modulation-frequency range from 10–20 Hz to, for example, 4–20 Hz might yield larger benefits. Overall, the results support the hypothesis that the detrimental effect of enhancing the noise fluctuations in the different frequency bands on speech perception is effectively reduced by SNR-based envelope expansion. Furthermore, the results suggest that, because of its practical feasibility, the proposed SNR-based envelope expansion scheme may be interesting for speech-enhancement applications in hearing aids.

Acknowledgments

We would like to thank Morten Løve Jepsen and Christoph Scheidiger for helpful discussions. This project was carried out at the Centre for Applied Hearing Research, which is supported by Widex, Oticon, GN Hearing, and the Technical University of Denmark.

Declaration of Conflicting Interests

The authors declared no potential conflict of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Alan Wiinberg http://orcid.org/0000-0001-5239-1486

References

Apoux, F., Crouzet, O., & Lorenzi, C. (2001). Temporal envelope expansion of speech in noise for normal-hearing and hearing-impaired listeners: Effects on identification performance and response times. Hearing Research, 153(1–2), 123–131. doi: 10.1016/S0378-5955(00)00265-3

Apoux, F., Tribut, N., Debruille, X., & Lorenzi, C. (2004). Identification of envelope-expanded sentences in normal-hearing and hearing-impaired listeners. Hearing Research, 189(1–2), 13–24. doi: 10.1016/S0378-5955(03)00397-6

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1). Retrieved from https://cran.r-project.org/package=lmerTest. doi: 10.18637/jss.v067.i01

Byrne, D., Parkinson, A., & Newall, P. (1990). Hearing aid gain and frequency response requirements for the severely/profoundly hearing impaired. Ear and Hearing, 11(1), 40–49. doi: 10.1097/00003446-199002000-00009

Chabot-Leclerc, A., Jørgensen, S., & Dau, T. (2014). The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction. The Journal of the Acoustical Society of America, 135(6), 3502–3512. doi: 10.1121/1.4873517

Chabot-Leclerc, A., MacDonald, E. N., & Dau, T. (2016). Predicting binaural speech intelligibility using the signal-to-noise ratio in the envelope power spectrum domain. The Journal of the Acoustical Society of America, 140(1), 192–205. doi: 10.1121/1.4954254

Christiansen, T. U. (2011, June 26). Objective evaluation of consonant–vowel pairs produced by native speakers of Danish. Paper presented at the Proceedings of Forum Acusticum, Aalborg, Denmark.

Clarkson, P. M., & Bahgat, S. F. (1991). Envelope expansion methods for speech enhancement. The Journal of the Acoustical Society of America, 89(3), 1378–1382. doi: 10.1121/1.400538

Drullman, R., Festen, J. M., & Plomp, R. (1994). Effect of temporal envelope smearing on speech reception. The Journal of the Acoustical Society of America, 95(2), 1053–1064. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/8132899

Ellis, R. J., & Munro, K. J. (2015). Benefit from, and acclimatization to, frequency compression hearing aids in experienced adult hearing-aid users. International Journal of Audiology, 54(1), 37–47. doi: 10.3109/14992027.2014.948217

Festen, J. M., & Plomp, R. (1990). Effects of fluctuating noise and interfering speech on the speech-reception threshold for impaired and normal hearing. The Journal of the Acoustical Society of America, 88(4), 1725–1736. doi: 10.1121/1.400247

Fowler, E. P. (1936). A method for the early detection of otosclerosis: A study of sounds well above threshold. Archives of Otolaryngology, 24(6), 731–741.

Freyman, R. L., & Nerbonne, G. P. (1989). The importance of consonant–vowel intensity ratio in the intelligibility of voiceless consonants. Journal of Speech Language and Hearing Research, 32(3), 524–535. doi: 10.1044/jshr.3203.524

Freyman, R. L., & Nerbonne, G. P. (1996). Consonant confusions in amplitude-expanded speech. Journal of Speech Language and Hearing Research, 39(6), 1124–1137. doi: 10.1044/jshr.3906.1124

Gerkmann, T., & Hendriks, R. C. (2012). Unbiased MMSE-based noise power estimation with low complexity and low tracking delay. IEEE Transactions on Audio, Speech and Language Processing, 20(4), 1383–1393. doi: 10.1109/TASL.2011.2180896

Glista, D., Scollie, S., & Sulkers, J. (2012). Perceptual acclimatization post nonlinear frequency compression hearing aid fitting in older children. Journal of Speech, Language, and Hearing Research, 55(6), 1765–1787. doi: 10.1044/1092-4388

International Electrotechnical Commission. (2009). Electroacoustics - Simulators of human head and ear - Part 1: Ear simulator for the measurement of supra-aural and circumaural earphones. IEC 60318-1-2009, Geneva, Switzerland: IEC.

Jørgensen, S., & Dau, T. (2011). Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing. The Journal of the Acoustical Society of America, 130(3), 1475–1487. doi: 10.1121/1.3621502

Jørgensen, S., Decorsière, R., & Dau, T. (2015). Effects of manipulating the signal-to-noise envelope power ratio on speech intelligibility. The Journal of the Acoustical Society of America, 137(3), 1401–1410. doi: 10.1121/1.4908240

Jørgensen, S., Ewert, S. D., & Dau, T. (2013). A multi-resolution envelope-power based model for speech intelligibility. The Journal of the Acoustical Society of America, 134(1), 436–446. doi: 10.1121/1.4807563

Langhans, T., & Strube, H. (1982). Speech enhancement by nonlinear multiband envelope filtering. In C. Gueguen (Ed.), ICASSP ’82. IEEE international conference on acoustics, speech, and signal processing (vol 7, pp. 156–159). New York, NY: Institute of Electrical and Electronics Engineers. doi: 10.1109/ICASSP.1982.1171715

Lenth, R. V. (2016). Least-squares means: The R package lsmeans. Journal of Statistical Software, 69(1), 1–43. doi: 10.18637/jss.v069.i01

Li, F., Menon, A., & Allen, J. B. (2010). A psychoacoustic method to find the perceptual cues of stop consonants in natural speech. The Journal of the Acoustical Society of America, 127(4), 2599–2610. doi: 10.1121/1.3295689

Lobdell, B. E., & Allen, J. B. (2007). A model of the VU (volume-unit) meter, with speech applications. The Journal of the Acoustical Society of America, 121(1), 279–85. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/17297783

Martin, R. (2001). Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Transactions on Speech and Audio Processing, 9(5), 504–512.

Plomp, R. (1978). Auditory handicap of hearing impairment and the limited benefit of hearing aids. The Journal of the Acoustical Society of America, 63(2), 533–549. doi: 10.1121/1.381753

Plomp, R. (1988). The negative effect of amplitude compression in multichannel hearing aids in the light of the modulation-transfer function. The Journal of the Acoustical Society of America, 83(6), 2322–2327.

Régnier, M. S., & Allen, J. B. (2008). A method to identify noise-robust perceptual features: application for consonant /t/. The Journal of the Acoustical Society of America, 123(5), 2801–2814. doi: 10.1121/1.2897915

Schlittenlacher, J., & Moore, B. C. J. (2016). Discrimination of amplitude-modulation depth by subjects with normal and impaired hearing. The Journal of the Acoustical Society of America, 140(5), 3487–3495. doi: 10.1121/1.4966117

Shannon, R., Zeng, F., & Kamath, V. (1995). Speech recognition with primarily temporal cues. Science, 270, 303–304.

Simpson, A. (2009). Frequency-lowering devices for managing high-frequency hearing loss: A review. Trends in Amplification, 13(2), 87–106. doi: 10.1177/1084713809336421

Steinberg, J., & Gardner, M. (1937). The dependence of hearing impairment on sound intensity. The Journal of the Acoustical Society of America, 9, 11–23. doi: 10.1121/1.1915905

Stone, M. A., Anton, K., & Moore, B. C. J. (2012). Use of high-rate envelope speech cues and their perceptually relevant dynamic range for the hearing impaired. The Journal of the Acoustical Society of America, 132(2), 1141–1151.

Stone, M. A., Füllgrabe, C., & Moore, B. C. J. (2008). Benefit of high-rate envelope cues in vocoder processing: Effect of number of channels and spectral region. The Journal of the Acoustical Society of America, 124(4), 2272–2282. doi: 10.1121/1.2968678

van Buuren, R. A., Festen, J. M., & Houtgast, T. (1999). Compression and expansion of the temporal envelope: Evaluation of speech intelligibility and sound quality. The Journal of the Acoustical Society of America, 105(5), 2903–2913. doi: 10.1121/1.426943

Van Esch, T. E. M., & Dreschler, W. A. (2015). Relations between the intelligibility of speech in noise and psychophysical measures of hearing measured in four languages using the auditory profile test battery. Trends in Hearing, 19(0), 1–12. doi: 10.1177/2331216515618902

Wiinberg, A., Jepsen, M. L., Epp, B., & Dau, T. (2018). Effects of Hearing Loss and Fast-Acting Compression on Amplitude Modulation Perception and Speech Intelligibility. Ear and Hearing. Advance online publication. doi: 10.1097/AUD.0000000000000589

Wolfe, J., John, A., Schafer, E., Nyffeler, M., Boretzki, M., Caraway, T., & Hudson, M. (2011). Long-term effects of non-linear frequency compression for children with moderate hearing loss. International Journal of Audiology, 50(6), 396–404. doi: 10.3109/14992027.2010.551788

Xu, L., Thompson, C. S., & Pfingst, B. E. (2005). Relative contributions of spectral and temporal cues for phoneme recognition. The Journal of the Acoustical Society of America, 117(5), 3255–3267. doi: 10.1121/1.1886405

Xu, L., & Zheng, Y. (2007). Spectral and temporal cues for phoneme recognition in noise. The Journal of the Acoustical Society of America, 122(3), 1758–1764. doi: 10.1121/1.2767000

Zaar, J., & Dau, T. (2015). Sources of variability in consonant perception of normal-hearing listeners. The Journal of the Acoustical Society of America, 138(3), 1253–1267. doi: 10.1121/1.4928142
