

J Am Acad Audiol 18:700–717 (2007)

*University of Wisconsin–Madison; †Kent State University; ‡Washington University School of Medicine in St. Louis

Marios S. Fourakis, Ph.D., Department of Communicative Disorders, University of Wisconsin–Madison, 1975 Willow Drive, Madison, WI 53706; Phone: 608-262-7491; Fax: 608-262-6466; E-mail: [email protected]

This research was supported by Grant R01 DC000581 from the National Institute on Deafness and Other Communication Disorders and by the Commonwealth of Australia through the Cooperative Research Centre for Cochlear Implant and Hearing Aid Innovation (CRC HEAR) that provided the SPEAR3 processors.

Effect of Frequency Boundary Assignment on Speech Recognition with the Nucleus 24 ACE Speech Coding Strategy

Marios S. Fourakis*, John W. Hawks†, Laura K. Holden‡, Margaret W. Skinner‡, Timothy A. Holden‡

Abstract

The choice of frequency boundaries for the analysis channels of cochlear implants has been shown to impact the speech perception performance of adult recipients (Skinner et al, 1995; Fourakis et al, 2004). While technological limitations heretofore have limited the clinical feasibility of investigating novel frequency assignments, the SPEAR3 research processor affords the opportunity to investigate an unlimited number of possibilities. Here, four different assignments are evaluated using a variety of speech stimuli. All participants accommodated to assignment changes, and no one assignment was significantly preferred. The results suggest that better performance can be achieved using a strategy whereby (1) there are at least 7-8 electrodes allocated below 1000 Hz, (2) the majority of remaining electrodes are allocated between 1100 and 3000 Hz, and (3) the region above 3 kHz is represented by relatively few electrodes (i.e., 1-3). The results suggest that such frequency assignment flexibility should be made clinically available.

Key Words: Cochlear implant, frequency boundaries, maps, speech perception tests, questionnaires

Abbreviations: ACE = advanced combination encoder speech coding strategy; AI = articulation index; CIS = continuous interleaved sampling speech coding strategy; CNC = consonant-vowel nucleus-consonant; F0 = fundamental frequency; F1 = first formant; F2 = second formant; FIR = finite impulse response; FFT = fast Fourier transform; IIR = infinite impulse response; P = participant; pps/ch = pulses per second per channel; rms = root mean square; SII = speech intelligibility index; SPEAK = spectral peak speech coding strategy; SPS = SPEAR3 programming system

Summary

The choice of frequency boundaries for the analysis channels of cochlear implants has been shown to affect the speech perception performance of implanted adults (Skinner et al, 1995; Fourakis et al, 2004). While technological limitations have until now restricted the clinical feasibility of investigating new frequency assignments,


There are three cochlear implant systems in clinical use at the present time: the Advanced Bionics devices (Clarion and HiResolution Bionic Ear), the Med-El devices (Combi 40+ and Pulsar), and the Nucleus devices (22, 24, and Freedom). In a recent study of 78 adults (26 implanted with each device; Firszt et al, 2004), scores on recorded speech tests covered essentially the same large range for each of the three devices (i.e., monosyllabic words: 2 to 87%; sentences in quiet: 10 to 100%). This close similarity in range of performance occurred despite device differences in number of implanted electrodes (Advanced Bionics: 16; Med-El: 12; Nucleus: 22), speech coding strategies, and adjustable parameters that are available to clinicians for custom fitting of individual patients. A pressing clinical question is “On what bases can device parameters be chosen to give each patient the best opportunity to maximize their understanding of speech?”

All three devices initially filter the incoming sound into a number of overlapping frequency bands and then provide information about the energy in those bands to the available stimulating electrodes. With the Advanced Bionics and Med-El clinical speech processors, there is little control over the frequency range and no control over the assignment of frequency bands to electrodes within the clinical fitting parameters. With the Nucleus 22 and 24 devices, analysis of incoming sound is accomplished via a fast Fourier transform (FFT) that produces outputs in linearly spaced bins. The filter bank was created through a mathematical procedure (Seligman, pers. comm.) that provides a finite number of frequency assignment tables (see Skinner et al, 2002, for a description), a subset of which is available for recipients with different numbers of active electrodes. Clinical fitting includes choosing one of these tables for each speech processor program created. Although the initial design of these tables incorporated consideration of critical band data, or filtering likened to that performed in the cochlea (Zwicker 1972), no further consideration was given to potential optimization for speech processing.
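To illustrate why a filter bank built from linearly spaced FFT bins yields only a limited set of band layouts, the following minimal sketch counts how many bins fall inside each analysis band. The sampling rate (16 kHz), FFT length (128 points), and band edges are illustrative assumptions rather than values reported in this study, and `bins_per_band` is a hypothetical helper name.

```python
def bins_per_band(edges_hz, fs_hz=16000, n_fft=128):
    """Count FFT bins whose centre frequency lies in each band [lo, hi).

    fs_hz and n_fft are assumed values chosen for illustration only.
    """
    spacing = fs_hz / n_fft  # linear bin spacing in Hz
    centres = [k * spacing for k in range(n_fft // 2 + 1)]
    counts = []
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        counts.append(sum(1 for f in centres if lo <= f < hi))
    return counts


# Hypothetical band edges that double in width with frequency.
example_counts = bins_per_band([200, 400, 800, 1600])
print(example_counts)  # [2, 3, 6]
```

Under these assumptions the bins are 125 Hz apart, so a low-frequency band can contain only a couple of bins, and band edges can move only in steps of one bin. This is one way to see why such a design offers a finite number of frequency tables.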


the SPEAR3 research processor offers the opportunity to investigate an unlimited number of possibilities. Here, four different assignments are evaluated using a variety of speech stimuli. All participants accommodated to the assignment changes, and no assignment was significantly preferred. The results suggest that better performance can be obtained using a strategy in which (1) at least 7-8 electrodes are placed below 1000 Hz, (2) the majority of the remaining electrodes are placed between 1100 and 3000 Hz, and (3) the region above 3 kHz is represented by relatively few electrodes (e.g., 1-3). The results suggest that such flexibility in frequency assignment should be clinically available.

Key Words: cochlear implant, frequency boundaries, maps, speech perception tests, questionnaire

Abbreviations: ACE = advanced combination encoder speech coding strategy; AI = articulation index; CIS = continuous interleaved sampling speech coding strategy; CNC = consonant-vowel nucleus-consonant; F0 = fundamental frequency; F1 = first formant; FIR = finite impulse response; FFT = fast Fourier transform; IIR = infinite impulse response; P = participant; pps/ch = pulses per second per channel; rms = root mean square; SII = speech intelligibility index; SPEAK = spectral peak speech coding strategy; SPS = SPEAR3 programming system


These tables provide some flexibility but not enough to address pertinent research questions. Of particular interest is whether increasing the number of electrodes dedicated to lower frequency regions (< ~2600 Hz) in the area of the first two vowel formants (F1 and F2) will provide more speech information to implant recipients. The work of Skinner and Hawks (Skinner et al, 1995; Hawks et al, 1997) suggested that it may be important to consider assigning frequency bands to electrodes that would provide optimal recognition of vowels. In one of these studies, speech perception performance of Nucleus 22 cochlear implant recipients using the SPEAK strategy (Spectral Peak; Seligman and McDermott, 1995) and Spectra 22 processor was compared for two frequency boundary assignments differing mainly in the F1 region (Skinner et al, 1995). In one assignment, four filters were allocated to the frequency region between 150 and 950 Hz, while in the other assignment, six filters were allocated between 120 and 1080 Hz. Significant performance improvements were noted with the assignment of more electrodes to the F1 region. A similar approach was followed by Fourakis et al, 2004, with Nucleus 24 recipients using the ACE strategy (Advanced Combination Encoder; Skinner et al, 2002) with the SPrint™ processor. Speech recognition performance was compared using the manufacturer’s default frequency assignment and an experimental assignment dedicating 1 or 2 more electrodes in the F1 and F2 speech regions by means of limiting the high frequency range from 7937 Hz to ~6000 Hz. In this study, vowel identification was significantly improved with the experimental frequency assignment, and seven of the eight participants preferred its sound quality. The choice of ~6000 Hz as the highest frequency in the experimental assignment was supported by Henshall and McKay’s (2001) finding that extending the high frequency range beyond 6000 Hz does not improve consonant perception.

Results from Henry et al (2000) obtained from Nucleus 22 recipients using the SPEAK strategy with the Spectra 22 processor indicated that speech information below 2680 Hz was more difficult to perceive and that finer spectral discrimination in this frequency range may be more important than at higher frequencies. These findings were further investigated by McKay and Henshall (2002) by comparing Nucleus 22 recipients’ performance on words in quiet and sentences in noise for two assignments using only ten electrodes that were either (1) evenly spaced over a broad frequency band (five assigned to low frequencies; five assigned to high frequencies) or (2) nine of the ten electrodes dedicated to the frequency range below 2600 Hz. These experimental assignments were further compared to participants’ performance with their usual full (15-18) electrode configurations. Their results indicated that while performance with the nine-out-of-ten, low-frequency assignment was equivalent to that of their everyday full-electrode assignment for vowel perception and sentences in noise, consonant perception was poorer. The equally spaced frequency-to-electrode assignment yielded the opposite result, with consonant perception equivalent to the everyday full-electrode assignment, but poorer for vowels and sentences in noise. Taken together, the results suggest that cochlear implant users can utilize additional low frequency information provided by more electrodes resulting in improved vowel perception and listening in noise. Additionally, more than one higher frequency electrode is needed for adequate representation of consonants.

The Sound Processor for Electric and Acoustic Research (SPEAR3), developed by the Cooperative Research Centre for Cochlear Implant and Hearing Aid Innovation, is a research-oriented implant processor that has made the investigation of frequency to electrode assignment considerably more flexible than is possible with the filter bank implementations in current use with Nucleus 22 and 24 clinical processors. The SPEAR3 processor can emulate any of the current speech coding strategies (i.e., SPEAK, ACE, and CIS [Continuous Interleaved Sampling; Wilson et al, 1991]), but also has the computational power to use more complex filtering approaches (i.e., complex finite impulse response [FIR] and infinite impulse response [IIR]) that allow for considerable flexibility in frequency boundary assignment and filter bandwidth manipulation. Use of this processor and the SPEAK strategy with Nucleus 22 recipients allowed comparison of speech perception performance with a map similar to the recipient’s own clinic map and an experimental map (Leigh et al, 2004). Individual filter bandwidths in the experimental map were chosen to provide an additional 2 to 3 electrodes below 2600 Hz and correspondingly fewer electrodes for higher frequencies (3 to 4 electrodes as opposed to the typical 5 to 7). The results appeared conflicting in that, as anticipated, the experimental assignment provided significantly better transmission of first formant information in a vowel identification task (/h/-vowel-/d/; heed, had, head…) and no significant reduction in performance on a consonant identification task (/a/-consonant-/a/; aBa, aSa, aKa...). However, results from a Consonant-Vowel Nucleus-Consonant (CNC) monosyllabic word test indicated poorer vowel and consonant perception with the experimental assignment. Given these mixed results, the authors called for further investigations to address whether additional electrodes dedicated to lower frequency information can improve speech perception.

Beyond the issue of simply dedicating additional electrodes to lower frequency regions is the more specific selection of the bandpass filter frequencies themselves. Unlike the filtering processes of a normal cochlea, cochlear implant filtering must work with far fewer channels. Further, it is known from Speech Intelligibility Index (SII; ANSI S3.5-1997) and Articulation Index (AI) research (ANSI S3.5-1969; French and Steinberg, 1947; Pavlovic et al, 1985) that the relative importance of different frequency regions to speech intelligibility varies substantially (Studebaker et al, 1987; Studebaker and Sherbecoe, 1991; Studebaker et al, 1993). Thus, every attempt should be made to optimize the information provided to each of the available electrodes.

The objective of the present study was to determine whether speech recognition of Nucleus 24 implant recipients can be improved by reassigning frequency bands to electrodes within the ACE strategy implemented on the SPEAR3 processor. This objective was investigated in two experiments. In Experiment 1, three different frequency-to-electrode assignments were evaluated: one found best in Fourakis et al (2004) using one of the frequency assignment tables available for the SPrint™ processor; one using Bark-based spacing (Traunmüller, 1990); and one using filter parameters selected with consideration to their relative importance to speech perception. In Experiment 2, modifications to the third frequency assignment were implemented and tested. Performance was evaluated with recorded speech tests. Judgments of performance in everyday life were assessed in Experiment 1 by means of a questionnaire administered after a number of weeks of at-home use with each assignment.

EXPERIMENT 1

Method

Participants

Eight adult recipients of the Nucleus 24 Contour Cochlear Implant system participated. Their demographic information is given in Table 1. All participants were fluent auditory/oral communicators and had considerable open-set speech recognition (>20% on recorded CNC monosyllabic words; Luxford and Ad Hoc Subcommittee, 2001). Average age of the participants was 59 years (range = 40-74), average duration of deafness was 12 years (range = 1-25), and the average length of implant use was 2.5 years (range = 1-4). Three-D reconstructions of spiral CT scans showed that all participants had complete insertions of the Contour electrode array into the cochleae.


Table 1. Subject Demographic Information

Participant  Sex  Etiology      Ear Implanted  Age during Study  Duration of Deafness in Implanted Ear (yrs)  Length of Implant Use (yrs)
1            F    Unknown       R              44                15                                           1
2            F    Meningitis    R              59                25                                           4
3            F    Genetic       R              55                6                                            4
4            M    Otosclerosis  L              71                20                                           3
5            F    Unknown       R              74                1                                            3
6            F    Otosclerosis  L              56                7                                            2
7            M    Unknown       L              40                5                                            1
8            F    Unknown       R              76                2                                            2


Speech Processing Systems

Prior to the study, all participants used the ACE speech coding strategy (25 µs/phase, monopolar stimulation) with either the SPrint™ body worn or ESPrit 3G ear-level speech processor. The Nucleus R126 Version 2.1 clinical software was used for programming the processors. Four participants used the ESPrit 3G exclusively, 1 used the SPrint™ exclusively and 3 used both the 3G and SPrint™ processors on a regular basis. Table 2 shows parameters used with each participant’s everyday speech processor program or map. Half the participants used a stimulation rate of 900 pulses per second per channel (pps/ch) and half used 1800 pps/ch. The number of maxima (that is, channels selected from the filter bank bands having the highest signal levels within each analysis/stimulation cycle) varied among participants. The number of maxima multiplied by the stimulation rate per channel is the total stimulation rate. For the Nucleus 24 device, the maximum total stimulation rate is 14,400 pps. Consequently, participants who used a rate of 1800 pps/ch were limited to 8 maxima, while those who used 900 pps/ch could utilize more than 8 maxima. The total stimulation rate used by each participant ranged from 7,200 to 14,400 pps. The number of channels (i.e., number of frequency bands) into which incoming sound was filtered ranged from 18 to 22 across participants. Participant (P) 4 and P8 used maps with 22 channels with the SPrint™ processor and 20-channel maps with the 3G processor. The frequency table used by each participant with either their SPrint™ or 3G processor was similar to that which provided the best vowel recognition in the study by Fourakis et al (2004); it is listed in Table 2.
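The relationship between per-channel rate, number of maxima, and total stimulation rate described above can be checked with a short sketch. The 14,400 pps ceiling and the rate/maxima pairs come from the text and Table 2; the function names and the example band levels are hypothetical, and `select_maxima` is only a simplified illustration of picking the highest-level bands in one analysis cycle, not the processor's actual implementation.

```python
MAX_TOTAL_RATE_PPS = 14_400  # Nucleus 24 ceiling stated in the text


def total_rate(rate_per_channel, maxima):
    """Total stimulation rate = maxima x per-channel rate (pps)."""
    return rate_per_channel * maxima


def max_maxima(rate_per_channel):
    """Largest number of maxima that keeps the total rate within the ceiling."""
    return MAX_TOTAL_RATE_PPS // rate_per_channel


def select_maxima(band_levels, n_maxima):
    """Return indices of the n_maxima highest-level bands, in channel order."""
    ranked = sorted(range(len(band_levels)),
                    key=lambda i: band_levels[i], reverse=True)
    return sorted(ranked[:n_maxima])


print(total_rate(1800, 8))   # 14400
print(total_rate(900, 12))   # 10800
print(max_maxima(1800))      # 8
print(select_maxima([0.1, 0.9, 0.3, 0.7, 0.2], 2))  # [1, 3]
```

At 1800 pps/ch the ceiling allows at most 8 maxima, which matches the limit stated in the text, while 900 pps/ch leaves room for the 12 maxima used by two participants.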

The SPEAR3 body-worn speech processor, designed for research purposes, was used for Experiments 1 and 2. This processor was used with the standard Nucleus 24 HS8 headset that has a directional microphone (Knowles Model EL7189) and CI24 transmitter coil. The SPEAR3 processors were programmed using the SPEAR3 Programming System (SPS). The programming system consists of (1) the SPS hardware interface that connects the SPEAR3 processor to a standard serial port on a desktop computer (running MS-Windows) and (2) the Seed-Speak SPEAR3 programming software that was used to create and download participant-specific coding parameters for the ACE speech coding strategy. Unlike the clinical processors, only one map was stored on the SPEAR3 processor at a time. All maps were created using a series of Hanning-windowed, 256-point complex FIR filters with a minimum bandwidth of 90 Hz.

Table 2. Map Parameters Used by Each Participant for the ACE Strategy on Their Everyday Speech Processor Prior to the Study

Participant  Processor Used Prior to Study  Rate/channel (pps)  Maxima  Total Stimulation Rate  Channels  Frequency Table Used Prior to Study
1            3G                             1800                8       14,400                  18        7
2            3G                             900                 8       7,200                   20        5
3            3G                             1800                8       14,400                  19        7
4            3G/SPrint™                     900                 8       7,200                   20/22*    54
5            3G/SPrint™                     1800                8       14,400                  19/19*    77
6            3G                             1800                8       14,400                  20        5
7            SPrint™                        900                 12      10,800                  19        7
8            3G/SPrint™                     900                 12      10,800                  20/22*    76

*Number of channels used with 3G and SPrint™ processors.

For each participant, three maps using three different frequency-to-electrode assignments were created for use with the ACE speech coding strategy on the SPEAR3 processors. The frequency assignments were termed old experimental (OLD EXP), new experimental (NEW EXP), and bark-based (BARK). The OLD EXP assignment was based on the earlier study’s (Fourakis et al, 2004) experimental assignment except the lowest frequency was now 200 Hz (instead of 187 Hz) and the highest frequency was extended from ~6000 to 7000 Hz. The NEW EXP assignment also had a low frequency cut-off of 200 Hz and provided finer resolution in the F1 and F2 frequency ranges than OLD EXP while sacrificing some resolution in the 1100 to 1900 Hz and 3000 to 7000 Hz ranges. With the BARK assignment, the entire 200-7000 Hz range was divided into equal bark intervals. For all three assignments, the highest frequency of 7000 instead of 6000 Hz was chosen so that more of the spectral energy of fricatives (e.g., /s/) spoken by women could be analyzed by the processor. This choice was based on studies that showed the peak spectral energy for /s/ spoken by men was between 4000 and 6000 Hz, and for women, above 6000 Hz (Boothroyd et al, 1994; Nittrouer et al, 1989; Stelmachowicz et al, 2002). For participants 2, 3, 5, 6, and 7 the three new maps were identical to their everyday processor maps with the exception of the frequency assignments to electrodes. For P1, electrode four was activated such that 19 channels were used for each of the new maps rather than the 18 channels typically used by this participant. For P4 and P8, the three new maps used 20 channels and were thus identical to the maps used with their ESPrit 3G processors with the exception of the frequency assignments to electrodes. Figure 1 shows the frequency assignments for Experiment 1 as well as that used in Experiment 2 (NEWEST) when 20 or 19 active electrodes were used.
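As a rough illustration of how a 200-7000 Hz range can be divided into equal bark intervals, the sketch below uses the Traunmüller (1990) Hz-to-bark approximation and its algebraic inverse. It is a minimal sketch under stated assumptions: it omits any end corrections to the bark scale, the function names are hypothetical, and the resulting edges are not claimed to match the boundaries actually used for the BARK maps (those are shown in Figure 1).

```python
def hz_to_bark(f_hz):
    """Traunmueller (1990) approximation, without end corrections."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def equal_bark_edges(lo_hz, hi_hz, n_channels):
    """Band edges (Hz) that split [lo_hz, hi_hz] into equal bark steps."""
    z_lo, z_hi = hz_to_bark(lo_hz), hz_to_bark(hi_hz)
    step = (z_hi - z_lo) / n_channels
    return [bark_to_hz(z_lo + i * step) for i in range(n_channels + 1)]


# 20 channels is one of the channel counts used in the study.
edges = equal_bark_edges(200.0, 7000.0, 20)
for ch, (lo, hi) in enumerate(zip(edges[:-1], edges[1:]), start=1):
    print(f"channel {ch:2d}: {lo:7.1f} - {hi:7.1f} Hz")
```

With equal bark steps the bands widen steadily toward high frequencies, so more channels fall in the F1 and F2 regions than an equal-width division in Hz would give.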

Equipment/Test Environment

All test materials were presented in a double-walled sound-attenuating booth (IAC; model 1204-A; 254 cm x 264 cm x 198 cm) through a loudspeaker placed at ear-level height, 0° azimuth, and 1.5 m from the center of the participants’ heads in their absence. Test materials were presented via an IBM compatible, Pentium II computer. The computer controlled a mixing and attenuation network (Tucker-Davis Technologies) to present sound through a power amplifier (Crown, model D-150) and loudspeaker (JBL, model LSR32). The speech perception tests (i.e., CNC words, vowels, and TIMIT sentences) were stored as wave files on the computer hard disk. The level (SPL) of the sound stimuli was measured with the microphone (Brüel & Kjaer, model 4155) of the sound level meter (Brüel & Kjaer, model 2230) at the center of the participants’ heads. The overall SPL of the words, consonants, and vowels was measured as the average of peaks on the slow, rms, linear scale. Presentation of the vowel test and consonant (Experiment 2) test and collection of participants’ responses were accomplished with the Condor software (Shannon et al, 1999).

Test Materials

Speech perception tests. A closed-set vowel test, an open-set word test, and an open-set sentence test were chosen to evaluate differences in speech perception between the three frequency boundary assignments. The vowel test consisted of 132 tokens of 11 American English vowels in an /h/-vowel-/d/ context (heed, hid, head, had, hod, hood, who’d, hud, heard, hayed, hoed) from a recording made by Hillenbrand et al (1995). For this study, three examples of each vowel were selected from a variety of speakers representing each of four voice groups: men (20 talkers), women (24 talkers), boys (20 talkers), and girls (14 talkers). Only tokens that had been 100% correctly identified in Hillenbrand et al (1995) were considered, and token selection for each group represented a variety of fundamental frequency and talker characteristics.

Figure 1. Frequency assignments for the BARK, OLD EXP, NEW EXP and NEWEST assignments for both 20 and 19 channels. The width of the colored area represents the frequency band for that channel. The numbers in brackets represent the number of channels in each of the three frequency regions, i.e., F1, F2, and high frequency regions.

The open-set word test included selected lists of CNC monosyllabic words recently developed at the University of Melbourne and evaluated by Skinner et al (2006). That study found that the average score for the new set of CNC words (30 lists) was 22% less than the original ten lists of CNC words (Peterson and Lehiste, 1962) for a group of 22 cochlear implant recipients. The same male talker recorded the new lists 10 years after the original lists. Eight word lists (2, 4, 9, 10, 11, 17, 23, and 30) that were very similar in intelligibility (word score range = 35.5 to 37.5%; Skinner et al, 2006) were used for speech perception testing.

Open-set sentence perception in quiet was evaluated with the TIMIT sentences (Garofolo et al, 1993). The sentences, spoken by male and female talkers, are on topics not usually encountered in everyday conversations. Consequently, listeners often cannot use sentence context but must recognize individual words to answer correctly. Loizou and his colleagues (personal communication) have equalized intelligibility for the lists by rearranging the sentences included in each list based on recognition by normally hearing young adult listeners using simulated cochlear implant processing. Twenty-four of 34 lists of TIMIT sentences, 20 sentences per list, were used for this study.

Questionnaire. The questionnaire included 19 listening situations that were rated on a five-point scale (1=poor, 5=excellent) based on how well speech was perceived. Three additional questions were rated on the same scale based on (1) the ability to detect environmental sound, (2) enjoyment of music, and (3) overall understanding of speech in everyday life. The 22 questions are listed in Table 3.

Table 3. Listening Situations Listed on the Questionnaire

#   Listening situation
1   Conversation on the telephone or cell phone
2   Message on an answering machine
3   News on TV
4   Movies/dramas/sitcoms on TV
5   Radio in the car
6   Radio at home in a quiet room
7   Lyrics to music
8   Conversation with several friends around dinner table
9   Conversation in a quiet room with one person
10  Conversation in a quiet room with several people
11  Conversation in a car
12  Conversation with friends at a social gathering
13  Conversation at a restaurant
14  Conversation with a cashier at the grocery store
15  Conversation with a child
16  Conversation outside
17  Someone speaking from a distance at home
18  Church service
19  Meeting in large room
20  Environmental sounds (especially soft sounds)
21  Music
22  Overall speech understanding

Test Procedures

All testing sessions included two, one-hour test periods occurring on the same day for a total of 12 test sessions on six different days as shown in Table 4. During each test day, test-retest data were obtained for each of the speech perception measures. Study participants were initially programmed and acclimated to the OLD EXP assignment on the SPEAR3 processor for two weeks, with practice on all tests given at the end of the first week and a testing session the next week. This frequency assignment was similar to what each of the participants was using prior to the study. After these initial test sessions, participants were divided randomly into two groups (Group A: P1, P4, P6, P7; Group B: P2, P3, P5, P8) and provided with either the NEW EXP (Group A) or the BARK (Group B) frequency assignment. After one week, participants returned, if necessary, for minor adjustments to their maps with continued use for an additional two weeks before the next test session. At the end of this testing, Group A was provided the BARK assignment and Group B the NEW EXP assignment, and the same protocol was followed. On weeks nine and 11 of the study, frequency assignments were again switched between groups and worn for two weeks as shown in Table 4. At the end of testing on the 12th week, all participants returned to the map using the OLD EXP assignment. The OLD EXP assignment was used for two weeks by both groups followed by testing on week 14.

Table 4. Frequency Boundaries Used by Participants and Timing of Test Sessions for Each Week of the Study

Week of study  Frequency boundaries used by participants  Weeks testing occurred
1-2            OLD EXP                                    1 (practice testing), 2
3-5            Group A: NEW EXP; Group B: BARK            5
6-8            Group A: BARK; Group B: NEW EXP            8
9-10           Group A: NEW EXP; Group B: BARK            10
11-12          Group A: BARK; Group B: NEW EXP            12
13-14          OLD EXP                                    14
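The crossover schedule in Table 4 can be written down as a small lookup structure, which makes it easy to confirm that both groups wore each experimental assignment for the same number of weeks. This is only a sketch: the week ranges and group assignments are taken from Table 4, while the names `SCHEDULE`, `assignment_in_week`, and `weeks_of_use` are hypothetical.

```python
# Assignment worn by each group during each block of weeks (from Table 4).
SCHEDULE = [
    ((1, 2),   {"A": "OLD EXP", "B": "OLD EXP"}),
    ((3, 5),   {"A": "NEW EXP", "B": "BARK"}),
    ((6, 8),   {"A": "BARK",    "B": "NEW EXP"}),
    ((9, 10),  {"A": "NEW EXP", "B": "BARK"}),
    ((11, 12), {"A": "BARK",    "B": "NEW EXP"}),
    ((13, 14), {"A": "OLD EXP", "B": "OLD EXP"}),
]


def assignment_in_week(group, week):
    """Return the frequency assignment a group wore in a given study week."""
    for (first, last), by_group in SCHEDULE:
        if first <= week <= last:
            return by_group[group]
    raise ValueError(f"week {week} is outside the 14-week study")


def weeks_of_use(group, assignment):
    """Total number of weeks a group wore a given assignment."""
    return sum(last - first + 1
               for (first, last), by_group in SCHEDULE
               if by_group[group] == assignment)


print(assignment_in_week("A", 4), assignment_in_week("B", 4))
print(weeks_of_use("A", "NEW EXP"), weeks_of_use("B", "NEW EXP"))
```

Counting weeks this way gives five weeks of NEW EXP and five weeks of BARK for each group, with the order of the two reversed between groups.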

During the first hour of each test session, vowels for each voice group (i.e., men, women, boys, and girls) were presented once as well as two lists of CNC words and two lists of TIMIT sentences. The same order of testing was repeated during the second hour. All tests were presented at 60 dB SPL. The order in which word lists were presented was pseudo-randomized across participants. The repeated use of each list for each of the three assignments occurred at least three weeks apart. This design was supported by the lack of learning effect for CNC words when lists were repeated three weeks apart in a previous study (Skinner et al, 1997). In addition, it allowed analysis of percent information transmitted for the same tokens for each frequency assignment. The sentence lists were pseudo-randomized across participants; none of the lists were repeated. Participants were given the questionnaire to take home twice with each frequency assignment (i.e., first and second half of experiment). Participants were asked to respond to the questionnaire just before they returned for testing with each assignment. Their responses were based on their experience with that assignment in everyday life. At the end of the study, the participants returned to using either their SPrint™ or 3G processors with their everyday map and frequency assignment.

Figure 2. Average rating (across the 19 listening situations on the questionnaire) given by each participant for each of the three frequency assignments.

Figure 3. Percent correct identification of words in TIMIT sentences, CNC words, and each segment of the CNC words. Error bars represent one standard deviation.

Results

Figure 2 shows group mean ratings across the 19 listening situations on the questionnaire for each of the three assignments as well as group mean ratings for environmental sounds, music, and overall speech for each of the assignments. Across all three assignments, average ratings were within the range of one-half rating point. These small average rating differences probably do not reflect a true difference among assignments. All participants reported that they became accustomed to each of the frequency assignments within one to two weeks. Only P3 reported a strong, consistent preference for one of the assignments (NEW EXP). The other participants reported that once they became accustomed to the new assignment, they believed they understood speech equally well with each. In addition, because the SPEAR3 processor could not be programmed with all three assignments on the processor at the same time, it was difficult for participants to compare differences between the three assignments.

Figure 3 shows the group mean percent correct identification for words in the TIMIT sentences, CNC words, and for each phoneme segment of the CNC words for each of the three frequency assignments. No significant differences were found for any of these test measures. In addition, in order to investigate the possibility of learning effects across sessions, the scores for each session were compared for both the words in TIMIT sentences and the CNC words. There was no significant effect of session for the words in the TIMIT sentences, but there was a significant effect for CNC words (F(1,7)=16.754, p<.01). Scores for CNC words were higher by about 3.3% in the second session compared to the first session. There was no interaction with frequency assignment, thus the performance increase was relatively uniform across assignments.

Figure 4. Group mean performance (percent correct) for the /h/-vowel-/d/ vowel test for the men’s, women’s, boys’ and girls’ voices for the three frequency assignments. Error bars represent one standard deviation. Asterisks represent significantly different scores. Level of significance: ** = p<.01; * = p < .05; NS = p>.05.

Figure 4 shows the group mean percent correct identification of vowels in /h/-vowel-/d/ context as spoken by the four voice groups (men, women, boys, and girls) for each assignment. Participants’ percent correct scores were arcsine transformed and submitted to repeated-measures analysis of variance, with Session (2 levels), Voice (4 levels), and Assignment (3 levels) as factors. There was a marginally significant main effect of session (F(1,7)=7.281, p<.05). Vowel identification improved by 2% across voices and assignments in the second session relative to the first session. There was a statistically significant effect of voice (F(3,21)=11.285, p<.01). Vowels produced by women were identified with the highest accuracy (74%), followed by vowels produced by men (72.7%) and boys (71.6%), while vowels produced by girls were identified with the least accuracy (69.1%). There was no significant main effect of Assignment, but there was a significant Voice x Assignment interaction (F(6,42)=3.083, p=.014). Therefore separate analyses of variance were run for each voice condition. There was a small, but significant effect of assignment on the correct identification scores for vowels produced by men (F(2,14)=3.73, p=.05). The BARK assignment resulted in the best performance. There was also a significant effect for vowels produced by women (F(2,14)=7.36, p<.05) with the NEW EXP assignment providing the highest performance. Finally, there was a highly significant effect (F(2,14)=7.83, p<.01) for vowels produced by girls where the NEW EXP assignment produced the highest performance.
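As an illustration of the arcsine transform applied to the percent correct scores before the analysis of variance, the following Python sketch uses one common form of the transform, 2·arcsin(√p). The text does not state which variant was used, so this form and the function name `arcsine_transform` are assumptions; the sample inputs are the group means by voice reported above and serve only as example values.

```python
import math


def arcsine_transform(percent_correct):
    """Map a percent correct score (0-100) to 2*asin(sqrt(p)) in radians.

    Assumed variant of the arcsine transform; other variants exist.
    """
    p = percent_correct / 100.0
    if not 0.0 <= p <= 1.0:
        raise ValueError("percent_correct must lie between 0 and 100")
    return 2.0 * math.asin(math.sqrt(p))


# Group mean vowel scores by voice from the text, used only as sample inputs.
scores = {"women": 74.0, "men": 72.7, "boys": 71.6, "girls": 69.1}
transformed = {voice: arcsine_transform(s) for voice, s in scores.items()}

for voice, value in transformed.items():
    print(f"{voice}: {value:.3f} rad")
```

The transform stretches the ends of the percentage scale, which is the usual reason for applying it to proportion data before an analysis of variance.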

Bonferroni adjusted pairwise t-tests were also run comparing pairs of assignments for each voice condition. The results showed that the BARK assignment produced significantly higher scores than either the OLD EXP or the NEW EXP assignment for vowels produced by men. It was hypothesized that this was due to the fact that the BARK assignment allocated more channels to the F1 region which encompasses a large proportion of the energy in men’s voices (see Figure 1 above). In contrast, for girls’ voices, performance with the NEW EXP assignment was significantly higher than with the OLD EXP and BARK assignments. It was hypothesized that this may be due to the assignment of more channels in the F2 region, thus affording greater resolution of formants in voices with higher fundamental frequencies which typically yield less clearly defined formants. Performance with the NEW EXP assignment was significantly higher than with the BARK assignment for vowels produced by women.

To confirm the above hypotheses, group confusion matrices were submitted to information transmission analysis (Miller and Nicely, 1955; Wang and Bilger, 1973) using the feature matrix for vowels initially used in Skinner et al (1996) and shown in Table 5. The results of this analysis are shown in Figure 5 for men’s voices, Figure 6 for women’s voices, and Figure 7 for girls’ voices. Results for the boys’ voices are not shown due to the lack of any significant performance differences between assignments. Each figure shows the percent information transmitted for each feature. For men’s voices, Figure 5 shows that transmission levels were higher with the BARK assignment than with the OLD EXP and NEW EXP assignments for the features F1, F2, complex (a feature distinguishing monophthongs from either r-colored or simple diphthongs), and F2 movement (a feature distinguishing between vowels with little or no F2 change, F2 change to lower frequencies as in the vowel in the word “hoed,” or F2 movement to higher frequencies as in the word “hayed”). The transmission level was higher with the NEW EXP assignment than with the other two assignments only on the feature ‘r-color’ that is coded based on the lowering of the third formant to frequencies below 2000 Hz as in r-colored vowels. This result was expected because with the NEW EXP assignment, more channels were allocated to the frequency region where this movement takes place.
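The per-feature calculation can be sketched as follows. This is a minimal Python illustration of relative information transmitted in the sense of Miller and Nicely (1955), computed separately for each feature in Table 5. It does not implement the sequential procedure of Wang and Bilger (1973), and the function name, the confusion-matrix layout (rows are stimuli, columns are responses), and the example matrix are assumptions rather than details taken from the study.

```python
import math
from collections import defaultdict

# Feature values copied from Table 5, in the order the vowels are listed there.
VOWELS = ["heed", "hid", "head", "had", "hod", "hood",
          "who'd", "hud", "heard", "hayed", "hoed"]
FEATURES = {
    "Duration": [2, 1, 1, 2, 2, 1, 2, 1, 1, 2, 2],
    "F1":       [1, 1, 2, 3, 3, 1, 1, 3, 2, 2, 2],
    "F2":       [2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 1],
    "Complex":  [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2],
    "R_color":  [1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1],
    "F2_move":  [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3],
}


def relative_transmission(confusions, labels):
    """Return transmitted information divided by stimulus entropy (0 to 1).

    confusions[i][j] is the number of times vowel i was identified as vowel j;
    labels[i] is the feature value of vowel i (one row of Table 5).
    """
    # Collapse the vowel confusion matrix into a feature confusion matrix.
    joint = defaultdict(float)
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            joint[(labels[i], labels[j])] += count
    total = sum(joint.values())
    if total <= 0:
        raise ValueError("confusion matrix contains no responses")

    # Marginal probabilities of each stimulus and response feature value.
    stim = defaultdict(float)
    resp = defaultdict(float)
    for (s, r), count in joint.items():
        stim[s] += count / total
        resp[r] += count / total

    transmitted = sum(
        (count / total) * math.log2((count / total) / (stim[s] * resp[r]))
        for (s, r), count in joint.items() if count > 0
    )
    entropy = -sum(p * math.log2(p) for p in stim.values() if p > 0)
    return transmitted / entropy if entropy > 0 else 0.0


# Hypothetical confusion matrix: every vowel identified correctly 10 times.
perfect = [[10 if i == j else 0 for j in range(len(VOWELS))]
           for i in range(len(VOWELS))]
for name, labels in FEATURES.items():
    print(name, round(100 * relative_transmission(perfect, labels), 1))
```

With error-free identification every feature is fully transmitted; a real group confusion matrix would give lower percentages that differ by feature, which is the comparison plotted in the figures.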

Figure 6 shows the results for vowels produced by women’s voices. For all features, there are small, but consistently higher transmission levels with the NEW EXP assignment than for the other two assignments. Finally, Figure 7 shows the results for vowels produced by girls. Again, the NEW EXP assignment provided higher transmission levels for all features and especially the F2, complex, F2-movement, and r-color features. All of these formant-defined features were most likely transmitted with higher levels of accuracy because NEW EXP allocates considerably more channels to the relevant frequency region, 1000-3000 Hz.


Table 5. Feature Matrix Used in Information Transmission Analysis for the Vowel Test

Feature   Heed  Hid  Head  Had  Hod  Hood  Who’d  Hud  Heard  Hayed  Hoed
Duration  2     1    1     2    2    1     2      1    1      2      2
F1        1     1    2     3    3    1     1      3    2      2      2
F2        2     2    2     2    1    1     1      1    1      2      1
Complex   1     1    1     1    1    1     1      1    1      2      2
R_color   1     1    1     1    1    1     1      1    2      1      1
F2_move   1     1    1     1    1    1     1      1    1      2      3


Discussion

Although the use of the three assignments produced similar performance on the TIMIT sentences and CNC word tests, there were significant differences in performance on the closed-set vowel test. The vowel test was the only one for which scores were calculated for each of the different types of voices. For this test, the BARK assignment produced higher performance for vowels spoken by men while the NEW EXP assignment produced significantly higher performance for vowels spoken by girls and somewhat higher performance for vowels spoken by women. These findings were supported by the information transmission analysis, showing higher transmission scores for most features for the BARK assignment with men’s voices and higher transmission scores for most features for the NEW EXP assignment with girls’ voices. These results can be explained on the basis of the number of electrodes allocated by the three assignments to different frequency regions which afforded different resolution patterns for each type of voice. The use of more electrodes in the F1 region by the BARK assignment provided better resolution for men’s voices that typically contain denser harmonic spacing and, therefore, better defined first formants. This was confirmed by the increased transmission of the F1 feature, 50% for the BARK assignment versus 40% for the NEW EXP assignment. Likewise, the use of more electrodes in the F2 region by the NEW EXP assignment provided better resolution for the girls’ voices that typically contain fewer harmonics and less defined second formants. This was again confirmed by the transmission of the F2 feature, 70% for the NEW EXP assignment versus 52% for the BARK assignment.

Figure 5. Percent transmitted information for features in vowels produced by men.

Figure 6. Percent transmitted information for features in vowels produced by women.

Figure 7. Percent transmitted information for features in vowels produced by girls.

EXPERIMENT 2

The aim of Experiment 2 was to determine if modifications to NEW EXP could be made such that performance with this new assignment could match or exceed that obtained with the BARK assignment on the medial vowel test produced by men while, at the same time, maintain or improve performance on the TIMIT sentences, CNC words, and vowels produced by the other three speaker types. Modification considerations were driven by the hypothesis that the superior performance with the BARK assignment for men’s voices on the vowel test was due to the fact that the BARK assignment allocated more channels to the F1 region.

Method

Subjects

Seven of the eight participants from Experiment 1 agreed to participate in this second experiment. Participant 7 from the first experiment was unable to participate in the second experiment due to other commitments.

The time between the two studies was 9 months for P8, 10 months for P4 and P2, 11 months for P1, P5, and P6, and 12 months for P3. During the time between studies, participants used their own speech processors programmed with the frequency tables listed in Table 2.

Frequency Boundary Assignment

For Experiment 2, the NEW EXP assignment was modified to assign one extra channel in the F1 region and one fewer channel in the frequency region above 3000 Hz, yielding a new assignment, NEWEST. Figure 1 shows the frequency boundaries for frequency bands allocated to sequential electrodes for 20 and 19 electrode maps for all four assignments. This figure facilitates comparison of differences in frequency boundaries for the NEW EXP and the NEWEST assignments.

Speech Materials

Speech tests included the same three tests used in Experiment 1 (CNC words, TIMIT sentences, and /h/-vowel-/d/ vowels [spoken by men, women, boys, and girls]), with the addition of a closed-set consonant identification test. The consonant test was developed and recorded by Shannon et al (1999). The test comprised three randomized repetitions of a set of 20 consonants [b, ch, d, f, g, j, k, l, m, n, p, r, s, sh, t, th (voiced), v, w, y, z] recorded in an /a/-consonant-/a/ context. Wave files of these stimuli were used for testing. One example of each consonant from three male and three female speakers of American English was selected to represent a range of talker characteristics.

Procedure

On the first test day of Experiment 2, participants were initially given practice with the consonant test and then given one list each of the consonants spoken by the men and women using their everyday processor and map. As noted previously, the frequency assignment with this map was similar to the OLD EXP assignment used in Experiment 1. A map with the new frequency assignment (NEWEST) was then placed on the SPEAR3 processor, and one list of consonants spoken by the two voice groups was repeated that same day. This test was included to determine whether consonant identification was affected by the NEWEST frequency assignment.

The parameters for each participant’s new map were the same as those used in Experiment 1 except the NEWEST assignment was used. Practice testing for the vowels, CNC words, and TIMIT sentence tests was given after one week’s use of this assignment. In addition to practice testing, each participant was asked about the sound quality of their new map during the one-week visit. Only one participant, P4, needed minor adjustments to the map to improve the sound quality. After three weeks’ use, participants participated in two one-hour test sessions all on one day, and two more test sessions one week later. This testing replicated the design used with each of the other assignments in Experiment 1. Each test session consisted of two 50-word CNC lists, 132 vowel tokens of the four voice types, and two lists of TIMIT sentences. The word lists used in Experiment 1 were used again in Experiment 2. The order of word and sentence list presentation was pseudo-randomized across subjects. A questionnaire was not completed by the participants for Experiment 2; however, participants were queried by the investigators regarding sound quality and their speech understanding in everyday life with the NEWEST assignment.

Results

Four of the seven participants (P1, P3, P6, and P8) reported that the NEWEST assignment had a high pitched sound quality. Participants P2, P4, and P5 reported little difference in the sound quality between the NEWEST assignment and their everyday assignment. All participants were able to adjust to the sound quality with the NEWEST assignment within one to two weeks of use.

The group mean score on the closed-set consonant test was 76.2% with subjects using their everyday map on their own speech processor. The group mean score for consonants with subjects using the map with the NEWEST assignment on the SPEAR3 processor was 76.7%. There was no significant difference in these scores according to a paired samples t-test analysis (t(6) = -.242, p = .817). This finding was reassuring in that the assignment of fewer electrodes to the high frequency region did not result in a decrement in consonant perception performance.
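For reference, the paired-samples t statistic used in this comparison can be computed as in the sketch below. The function name and the individual scores are hypothetical (individual consonant scores are not listed here), so the output is not expected to reproduce the reported t(6) value.

```python
import math
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired-samples t-test."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least 2 scores")
    # Work on the per-participant differences between the two conditions.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical percent correct scores for seven listeners (not study data).
everyday_map = [70.0, 82.5, 75.0, 68.3, 80.0, 77.5, 79.2]
newest_map = [71.7, 81.7, 76.7, 67.5, 81.7, 78.3, 79.2]

t, df = paired_t(everyday_map, newest_map)
print(f"t({df}) = {t:.3f}")
```

With seven participants the test has six degrees of freedom, which matches the t(6) notation in the text.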

Figure 8 shows mean correct identification scores for the TIMIT sentences and CNC words. For these tests, the mean scores for the BARK, OLD EXP, and NEW EXP assignments were recalculated from Experiment 1 with results based on seven instead of eight participants. Repeated measures analysis of variance showed a significant effect of Assignment for the TIMIT sentences (F(3,18)=13.073, p<.01). Bonferroni adjusted, two-tailed, paired sample t-tests with six degrees of freedom produced significant differences that are indicated at the bottom of Figure 8 by the use of uppercase text for the assignment with which participants performed higher. That is, participants performed significantly higher with the NEWEST assignment than with the BARK (t=-5.042, p<.01), the OLD EXP (t=-7.085, p<.01) or the NEW EXP assignments (t=-5.526, p<.01).

Repeated measures analysis of variance showed a significant effect of Assignment for the CNC words (F(3,18)=9.375, p<.01). Bonferroni adjusted, two-tailed, paired sample t-tests with six degrees of freedom produced significant differences that, again, are indicated in Figure 8 by uppercase text for the assignment with which participants performed higher. That is, participants performed significantly higher with the NEWEST assignment than with the BARK (t=-4.394, p<.01) or NEW EXP (t=-4.924, p<.01) assignments. The comparison with the BARK assignment here is especially noteworthy because the CNC words were all spoken by a male speaker.

Figure 8. Group mean percent correct identification of words in TIMIT sentences and CNC words across the seven participants who participated in both experiments. Error bars indicate one standard deviation. Asterisks represent significantly different scores. Level of significance: ** = p <.01. The better assignment is shown in bold letters.

Figure 9 shows the results of the vowel test for the four voice types and the four assignments for the seven participants who participated in both experiments. Repeated measures analysis of variance showed a significant main effect of Voice type (F(3,18)=5.361, p<.01) and a significant Voice by Assignment interaction (F(9,54)=5.108, p<.01). Bonferroni adjusted, two-tailed, paired sample t-tests with six degrees of freedom produced significant differences that are indicated in Figure 9 by uppercase text for the assignment with which participants performed highest. For vowels produced by men, participants performed better with the BARK assignment than the OLD EXP assignment (t=2.737, p<.05), and they performed better with the NEWEST assignment than OLD EXP assignment (t=-3.313, p<.05). The important result here is that performance with the NEWEST assignment was not significantly different from that with the BARK assignment (unlike Experiment 1, where there was a significant advantage for the BARK assignment). Thus, the addition of the extra electrode in the F1 region in the NEWEST assignment provided increased resolution of the first formant with men’s voices now equal to that provided by the BARK assignment. Specifically, in Experiment 1 the transmission of the F1 feature with the BARK assignment had shown a ten percentage point advantage (50% vs. 40%) over the NEW EXP assignment. The NEWEST assignment yielded transmission of the F1 feature of 47% without losing information transmission in any of the other features.

For vowels produced by women, the only significant difference in performance was between the NEW EXP and the OLD EXP assignments, with participants performing better with NEW EXP than with OLD EXP (t=-3.355, p<.05). Thus, looking at the results of the seven participants common to both experiments, there was no significant reduction in performance on women’s vowels using the NEWEST assignment. As with Experiment 1, there were no significant differences for vowels produced by boys between any of the assignments.

For vowels produced by girls, the participants performed significantly better with the NEW EXP than the BARK assignment (t=-3.296, p<.05); they also performed significantly better with the NEWEST than the BARK assignment (t=-4.025, p<.01).

Figure 9. Group mean performance (percent correct) for the /h/-vowel-/d/ vowel test for the men’s, women’s, boys’ and girls’ voices for the four frequency assignments across the seven participants who participated in both experiments. Error bars represent one standard deviation. Asterisks represent significantly different scores. Level of significance: ** = p <.01; * = p < .05. The better assignment is shown in bold letters.

Figures 10 and 11 show the percentages of information transmitted for vowel features for men’s and girls’ voices, respectively, for the four assignments. Figure 10 shows that information transmission for men’s vowels was highest with the BARK assignment for all features except the R-color and duration features. As pointed out above in the discussion of Figure 9, the difference between the BARK and NEWEST assignments for the F1 feature is much smaller than between BARK and NEW EXP. Figure 11 shows that information transmission was higher with the NEW EXP and the NEWEST assignments than with the BARK and OLD EXP assignments for all features for vowels produced by girls. In contrast, the BARK assignment was associated with the lowest transmission percentage for all features.

Figure 10. Group mean percent transmitted information for features in vowels produced by men. Data are for the seven participants who participated in both experiments.

Figure 11. Group mean percent transmitted information for features in vowels produced by girls. Data are for the seven participants who participated in both experiments.

Discussion

Accomplishing the aim of the NEWEST assignment, increasing resolution in the F1 region, required the reassignment of an electrode to this region. This necessarily created the possibility of a loss of information elsewhere, in this case, above 3 kHz. However, from the results of the closed-set consonant test, there was no decrement in consonant identification with the NEWEST assignment relative to the subjects’ everyday assignment (similar to the OLD EXP assignment). Thus, high frequency information was adequately represented. More importantly, the NEWEST assignment was associated with significant improvements in word identification for both TIMIT sentences and CNC words compared to the other assignments. The TIMIT sentences, which include both men’s and women’s voices, yielded performance with the NEWEST table of 67% correct word identification, significantly better than with any of the other three assignments. For the CNC words recorded by a single male speaker, performance also improved significantly. For the vowel test, the NEWEST assignment yielded improved performance for men’s voices, such that there were no statistical differences between it and the BARK assignment for men’s voices. This improvement was associated with an assignment (NEWEST) having one fewer electrode allocated to the F1 region, three additional electrodes to the F2 region, and two fewer electrodes allocated between 3062 and 7000 Hz compared to the BARK assignment. With the NEWEST assignment, performance for vowel identification across the four voices was approximately equal: 76% for men’s voices, 78% for women’s voices, 75.5% for boys’ voices, and 73.4% for girls’ voices. The ordering of these results is in rough agreement with the trend reported by Loizou et al (1998). In their study, they found that vowels produced by men were perceived with the highest accuracy, followed by vowels produced by women, and then by vowels produced by boys and girls. The maximum difference in performance, 17%, was between men’s and girls’ voices. In the current study, performance on vowels produced by women was slightly higher than for those produced by men, but performance on boys’ and girls’ vowels followed the same pattern. The maximum difference here was between women’s and girls’ voices and, in this case, was only 4.6%.

Overall, the results suggest that the NEWEST assignment provided equivalent or superior performance across tests and voices compared to the other assignments evaluated. More exhaustive testing of this assignment should include CNC words spoken by women and children as well as sentences spoken by children. However, such recordings are not available at this time.

OVERALL DISCUSSION

Given the results presented here, the determination of filter frequency boundaries for speech coding strategies such as SPEAK and ACE should be guided by three principles.

First, an abundance of electrodes (in the case of the NEWEST assignment, 35%) should be placed below 1100 Hz to capture the relatively frequency specific variations that occur in the first formant region with male voices, but with some caution paid to maintaining channel bandwidths that are not too narrow. As Leigh et al (2004) point out, the use of narrow bandwidths should improve the perception of vowels produced by men, but potentially can cause degradation in performance with high fundamental frequency (F0) voices. Leigh et al (2004) further argue that narrower channel bandwidths in an energy-picking strategy such as SPEAK (in their experiment) or ACE (in the present experiment) could result in the selection of channels for stimulation that have detected individual harmonic energy rather than formants in high F0 voices. While this issue has not been explored in depth, the results presented here suggest that a minimum bandwidth of 90 Hz (as used here in the BARK assignment) is not so narrow as to seemingly cause degradation in the perception of higher pitched voices, and the NEWEST assignment’s bandwidths of 150 Hz in this frequency region can yield adequate performance for men’s voices.

Second, the majority of remaining electrodes (in the case of the NEWEST assignment, 50%) should be allocated to the second formant region (1100-2900 Hz) to preserve formant transition information and capture the higher formants of women’s and children’s voices. The importance of adequately representing spectral cues in this frequency region was recently demonstrated in a study examining the effects of noise on identification performance of synthetic speech continua by both implanted and normal hearing listeners (Munson and Nelson, 2005). Their findings suggested that the speech cues most susceptible to degradation by background noise were those exhibiting rapidly changing spectral patterns like the formant transitions found in their /wa/-/ja/ and /ra/-/la/ continua.

Third, the high frequency region above 2900 Hz can be left relatively sparse of electrodes with apparently very little, if any, detriment to consonant perception. This last factor has been previously proposed and supported in earlier studies (Henry et al, 2000; McKay and Henshall, 2002; Leigh et al, 2004) where the frequency region above about 2600 Hz has proven less critical with respect to spectral resolution and can be adequately represented with fewer electrodes than can vowel formant frequency regions.
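As a worked example of these proportions, the sketch below converts the percentages given for the NEWEST assignment into electrode counts for a 20-electrode map. The function name is hypothetical, and the count above 2900 Hz is taken as the remainder after the two stated percentages, which is an inference rather than a figure stated in the text.

```python
def electrodes_per_region(n_electrodes, percent_low, percent_mid):
    """Split an electrode count across three frequency regions.

    percent_low: share of electrodes below 1100 Hz (first formant region).
    percent_mid: share of electrodes from 1100 to 2900 Hz (second formant region).
    The region above 2900 Hz receives whatever is left (an assumption).
    """
    low = round(n_electrodes * percent_low / 100)
    mid = round(n_electrodes * percent_mid / 100)
    high = n_electrodes - low - mid
    if high < 0:
        raise ValueError("percentages allocate more electrodes than available")
    return {"below_1100_hz": low, "1100_to_2900_hz": mid, "above_2900_hz": high}


# Percentages stated for the NEWEST assignment, applied to a 20-electrode map.
counts = electrodes_per_region(20, 35, 50)
print(counts)
# {'below_1100_hz': 7, '1100_to_2900_hz': 10, 'above_2900_hz': 3}
```

For 20 electrodes this gives 7, 10, and 3 electrodes in the three regions, so only a few electrodes remain for the region above 2900 Hz.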

This allocation strategy generally follows the pattern of frequency importance and transfer functions found for various speech materials (Studebaker et al, 1987; Studebaker and Sherbecoe, 1991; Studebaker et al, 1993). These functions have been derived from speech recognition performance of normal hearing listeners on materials presented in a large number of filtered and background noise conditions and are often used for predicting and evaluating performance of hearing impaired individuals. The frequency importance functions assign a weighting to frequency bands (typically 1/3 octave) that relate each band’s relative importance to speech perception. In general, these functions reveal a bimodal pattern with peaks in the weighting functions seemingly related to the first and second formant frequency ranges. Typically, there is somewhat more weighting given to the second formant range with audiometric word list materials, and to the first formant range with continuous discourse. One common measure across these studies is the determination of the midpoint of the weightings, or crossover frequency, where importance is relatively balanced above and below this point. For more modern studies, this crossover frequency varies from about 1300 to 1600 Hz depending on the speech materials under evaluation. This is in good agreement with the frequency between the tenth and eleventh electrode (midway in the array) in the NEWEST assignment of 1550 Hz.
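The crossover frequency described above can be illustrated with a short sketch. The function below finds the frequency that splits total band importance in half, interpolating linearly within a band. The interpolation rule, the function name, and the band edges and weights in the example are assumptions for illustration only and are not values from the cited studies.

```python
def crossover_frequency(band_edges_hz, importance):
    """Return the frequency with half of the total importance below it.

    band_edges_hz: ascending band edges, one more entry than importance.
    importance: non-negative weight for each band.
    Importance is assumed to be spread evenly within each band.
    """
    if len(band_edges_hz) != len(importance) + 1:
        raise ValueError("need exactly one more band edge than weights")
    total = sum(importance)
    if total <= 0:
        raise ValueError("importance weights must sum to a positive value")

    half = total / 2.0
    cumulative = 0.0
    for lower, upper, weight in zip(band_edges_hz, band_edges_hz[1:], importance):
        if weight > 0 and cumulative + weight >= half:
            # Interpolate inside the band where the running sum reaches half.
            fraction = (half - cumulative) / weight
            return lower + fraction * (upper - lower)
        cumulative += weight
    return band_edges_hz[-1]


# Made-up band edges (Hz) and weights, chosen only to exercise the function.
edges = [100, 500, 1000, 2000, 4000, 8000]
weights = [0.10, 0.25, 0.30, 0.25, 0.10]
print(f"crossover near {crossover_frequency(edges, weights):.0f} Hz")
```

A published importance function would supply the real band weights; the same calculation would then give the crossover frequency that is compared with the boundary between the tenth and eleventh electrodes.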

Subjects’ questionnaire responses and personal comments indicated that none of the various assignments yielded any consistent qualitative trends in terms of a preference for one assignment over another. This finding is quite different from similar judgments garnered in Fourakis et al (2004). In that study, participants showed strong preferences with regard to assignment. However, the assignments evaluated in that study varied considerably, particularly in the high frequency range made available to the processor (~6000 Hz from ~7900 Hz). In the current study, all participants did notice an immediate difference between the assignment they had been using and a new assignment placed on their processor. The biggest difference was during Experiment 2 when they changed from their everyday assignment to the NEWEST assignment, but they were able to adjust to the difference in sound quality within a week or so and then thought the NEWEST assignment sounded fine. These findings regarding the qualitative similarity of assignments and ease of adaptation are important to note. It appears that, as long as assignments encompass approximately the same frequency range and individual electrode assignments are not radically different, participants are able to accommodate to a new assignment. This relative ease of accommodation runs counter to the findings of Fu et al (2002), where initial performance levels were never reached with new assignments. However, the new assignments in that study were substantially different from the assignments participants used with their everyday maps and were shifted to include a number of electrodes dedicated to a very low frequency range (25% of electrodes between 75–575 Hz). Unfortunately, their findings cannot easily disambiguate performance deficits due to a lack of spectral information with their experimental maps from those due to the inability of their participants to accommodate to the new assignments. Svirsky et al (2004), on the other hand, found that newly implanted cochlear implant users could adequately accommodate to typical clinical assignments, at least with respect to synthetic vowel categorization, within one to 24 months. Given that the initial accommodation to speech perception by means of a cochlear implant must be considered an extreme case, the ability of the participants in the current study to easily adapt to relatively minor changes in assignments does not appear remarkable.

With the Nucleus Freedom™ cochlear implant, a total of 22 active electrodes are available for mapping. Future research must address the best frequency allocation for these additional electrodes, as well as how to optimize a smaller number of electrodes for those unable to make use of a full electrode array. However, additional electrodes will not compensate for the current and continuing need for more flexibility in adjusting frequency assignments in the clinical software.

CONCLUSION

It can be argued, on the basis of the results of these two experiments, that there should be much more flexibility available to the programming clinician when it comes to assigning frequency bands to electrodes. Instead of the fixed number of frequency assignment tables provided by the manufacturer, the clinician should have the option of assigning frequency bands in the manner most appropriate for the specific patient, given the patient’s number of active electrodes. This can be accomplished by increasing the available set of tables through the inclusion of tables such as the ones presented here as well as tables designed for 22 or 21 active electrodes. The design of these tables must be guided by the same principle used to design the NEW EXP and NEWEST tables, that is, the maximization of resolution in frequency ranges that are important for speech perception. In addition, guidelines must be developed to assist the clinician in choosing and implementing the appropriate table.
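As an illustration of how such a table-design guideline might be checked, the following minimal sketch counts the analysis channels of a frequency table by region and tests them against the allocation pattern summarized in the abstract (at least 7 electrodes below 1000 Hz, the majority of the remaining electrodes up to 3000 Hz, and 1 to 3 electrodes above 3 kHz). The function names, the use of band center frequencies to classify channels, the single 1000 Hz cut between the low and middle regions, and the 20-channel band edges are all assumptions made for illustration; the edges are invented and are not the NEW EXP or NEWEST tables.

```python
def count_channels_by_region(edges_hz, low_cut=1000.0, high_cut=3000.0):
    """Count channels by the region in which their center frequency falls.

    edges_hz is an ascending list of band edges; N+1 edges define N channels.
    Classifying by arithmetic band center is an assumption of this sketch.
    """
    counts = {"low": 0, "mid": 0, "high": 0}
    for lower, upper in zip(edges_hz, edges_hz[1:]):
        center = (lower + upper) / 2.0
        if center < low_cut:
            counts["low"] += 1
        elif center <= high_cut:
            counts["mid"] += 1
        else:
            counts["high"] += 1
    return counts


def meets_guidelines(counts, min_low=7, max_high=3):
    """Apply the three allocation criteria summarized in the abstract."""
    remaining = counts["mid"] + counts["high"]
    enough_low = counts["low"] >= min_low
    mid_majority = counts["mid"] > remaining / 2.0
    few_high = 1 <= counts["high"] <= max_high
    return enough_low and mid_majority and few_high


# Hypothetical 20-channel table (21 band edges in Hz), invented for this example.
example_edges = [200, 300, 400, 500, 600, 700, 800, 900, 1000,
                 1150, 1300, 1450, 1600, 1800, 2000, 2250, 2500,
                 2800, 3200, 4500, 7000]

# A linearly spaced table over the same range, for contrast.
linear_edges = [200 + 340 * i for i in range(21)]

if __name__ == "__main__":
    for name, edges in (("example", example_edges), ("linear", linear_edges)):
        counts = count_channels_by_region(edges)
        print(name, counts, meets_guidelines(counts))
```

In this sketch the invented table satisfies all three criteria, whereas the linearly spaced table places too few channels below 1000 Hz. A clinical tool would need the manufacturer's actual filter definitions rather than simple band edges.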

Acknowledgments. Appreciation is expressed to the eight subjects who graciously gave their time and effort to participate in this research study. We are grateful to James Hillenbrand for the vowel recordings, to Robert Shannon for the consonant recordings and Condor software, and to Philip Loizou for his wave file recordings as well as the rearranged ordering of the TIMIT sentences into new lists. We are also grateful to Andrew Vandali (from CRC HEAR), who configured the filters and frequency boundary assignments using 19 as well as 20 electrodes with the SPEAR3 processor and provided valuable assistance. This research was approved by the Human Studies Committee at Washington University School of Medicine (Experiment 1: 04-0123; Experiment 2: 05-0894).

Journal of the American Academy of Audiology/Volume 18, Number 8, 2007


REFERENCES

American National Standards Institute. Methods for the Calculation of the Articulation Index. ANSI S3.5-1969. New York: ANSI.

American National Standards Institute. American National Standard for the Speech Intelligibility Index. ANSI S3.5-1997. New York: ANSI.

Boothroyd A, Erickson FN, Medwetsky L. (1994) The hearing aid input: a phonemic approach to assessing the spectral distribution of speech. Ear Hear 15:435–442.

Firszt JB, Holden LK, Skinner MW, Tobey EA, Peterson A, Gaggle W, Runge-Samuelson CL, Wackym PA. (2004) Recognition of speech presented at soft to loud levels by adult cochlear implant recipients of three cochlear implant systems. Ear Hear 24:375–387.

Fourakis MS, Hawks JW, Holden LK, Skinner MW, Holden TA. (2004) Effect of frequency boundary assignment on vowel recognition with the Nucleus 24 ACE Speech Coding Strategy. J Am Acad Audiol 15(4):281–299.

French NR, Steinberg GC. (1947) Factors governing the intelligibility of speech sounds. J Acoust Soc Am 19:90–119.

Fu Q-J, Shannon RV, Galvin JJ. (2002) Perceptual learning following changes in the frequency-to-electrode assignment with the Nucleus-22 cochlear implant. J Acoust Soc Am 112:1664–1674.

Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallett DS, Dahlgren NL. (1993) The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus. CDROM: NTIS order number PB91-100354.

Hawks JW, Fourakis MS, Skinner MW, Holden TA, Holden LK. (1997) Effects of formant bandwidth on the identification of synthetic vowels by cochlear implant recipients. Ear Hear 18:479–487.

Henry BA, McKay CM, McDermott HJ, Clark GM. (2000) The relationship between speech perception and electrode discrimination in cochlear implantees. J Acoust Soc Am 108:1269–1280.

Henshall KR, McKay CM. (2001) Optimizing electrode and filter selection in cochlear implant speech processor maps. J Am Acad Audiol 12:478–489.

Hillenbrand J, Getty LA, Clark MJ, Wheeler K. (1995) Acoustic characteristics of American English vowels. J Acoust Soc Am 97:3099–3111.

Leigh JR, Henshall KR, McKay CM. (2004) Optimizing frequency-to-electrode allocation in cochlear implants. J Am Acad Audiol 15:574–584.

Loizou PC, Dorman MF, Powell V. (1998) The recognition of vowels produced by men, women, boys, and girls by cochlear implant patients using a six-channel CIS processor. J Acoust Soc Am 103:1141–1149.

Luxford W, Ad Hoc Subcommittee. (2001) Minimum speech test battery for postlinguistically deafened adult cochlear implant patients. Otolaryngol Head Neck Surg 124:125–126.

McKay CM, Henshall KR. (2002) Frequency-to-electrode allocation and speech perception with cochlear implants. J Acoust Soc Am 111:1036–1044.

Miller GA, Nicely PE. (1955) Analysis of perceptual confusions among some English consonants. J Acoust Soc Am 27:338–352.

Munson B, Nelson PB. (2005) Phonetic identification in quiet and in noise by listeners with cochlear implants. J Acoust Soc Am 118:2607–2617.

Nittrouer S, Studdert-Kennedy M, McGowan RS. (1989) The emergence of phonetic segments: evidence from the spectral structure of fricative-vowel syllables spoken by children and adults. J Speech Hear Res 32:120–132.

Pavlovic CV, Studebaker GA, Sherbecoe RL. (1985) An articulation index based procedure for predicting speech recognition performance of hearing-impaired individuals. J Acoust Soc Am 80:50–57.

Peterson GE, Lehiste I. (1962) Revised CNC lists for auditory tests. J Speech Hear Disord 27:62–70.

Seligman PM, McDermott HJ. (1995) Architecture of the Spectra-22 speech processor. Ann Otol Rhinol Laryngol 104(Suppl. 166):139–141.

Shannon R, Jensvold A, Padilla M, Robert M, Wang X. (1999) Consonant recordings for speech testing. J Acoust Soc Am 106:L71–L74.

Skinner MW, Arndt PL, Staller SJ. (2002) Nucleus 24 advanced encoder conversion study: performance versus preference. Ear Hear 23:2S–16S.

Skinner MW, Fourakis MS, Holden TA, Holden LK, Demorest ME. (1996) Identification of speech by cochlear implant recipients with the Multipeak (MPEAK) and Spectral Peak (SPEAK) speech coding strategies: I. Vowels. Ear Hear 17:182–197.

Skinner MW, Holden LK, Fourakis MS, Hawks JW, Holden T, Arcaroli J, Hyde M. (2006) Evaluation of equivalency in two recordings of monosyllabic words. J Am Acad Audiol 17:350–366.

Skinner MW, Holden LK, Holden TA. (1995) Effect of frequency boundary assignment on speech recognition with the SPEAK speech coding strategy. Ann Otol Rhinol Laryngol 104(Suppl. 166):307–311.

Skinner MW, Holden LK, Holden TA, Demorest ME, Fourakis MS. (1997) Speech recognition at simulated soft, conversational, and raised-to-loud vocal efforts by adults with cochlear implants. J Acoust Soc Am 101:3766–3782.

Stelmachowicz PG, Pittman AL, Hoover BM, Lewis DE. (2002) Aided perception of /s/ and /z/ by hearing impaired children. Ear Hear 23:316–324.

Studebaker GA, Pavlovic CV, Sherbecoe RL. (1987) A frequency importance function for continuous discourse. J Acoust Soc Am 81:1130–1138.

Studebaker GA, Sherbecoe RL. (1991) Frequency-importance and transfer functions for recorded CID W-22 word lists. J Speech Hear Res 34:427–438.

Studebaker GA, Sherbecoe RL, Gilmore C. (1993) Frequency-importance and transfer functions for the Auditec of St. Louis recordings of the NU-6 word test. J Speech Hear Res 36:799–807.

Svirsky MA, Silveira A, Neuberger H, Teoh S-W, Suarez H. (2004) Long-term auditory adaptation to a modified peripheral frequency map. Acta Otolaryngol 124:381–386.

Traunmüller H. (1990) Analytical expressions for the tonotopic sensory scale. J Acoust Soc Am 88:97–100.

Wang MD, Bilger RC. (1973) Consonant confusions in noise: a study of perceptual features. J Acoust Soc Am 54:1248–1256.

Wilson BS, Finley CC, Lawson DT, Wolford RC, Eddington DK, Rabinowitz WM. (1991) Better speech recognition with cochlear implants. Nature 352:236–242.

Zwicker E. (1972) On the development of the critical band. J Acoust Soc Am 52:699–702.
