Dept. for Speech, Music and Hearing, Quarterly Progress and Status Report. Numerical aspects of the Speech Tracking procedure. Spens, K-E., Gnosspelius, J., Öhngren, G., Plant, G., and Risberg, A. Journal: STL-QPSR, volume 33, number 1, year 1992, pages 115-130. http://www.speech.kth.se/qpsr


STL-QPSR 1/1992

NUMERICAL ASPECTS OF THE SPEECH TRACKING PROCEDURE

Karl-Erik Spens, Johan Gnosspelius, Gunilla Öhngren, Geoff Plant*, & Arne Risberg

Abstract:

The Speech Tracking procedure developed by De Filippo & Scott (J. Acoust. Soc. Am. 63, 1978) has been used extensively to evaluate different technical aids for deaf people. An important question which should be considered is to what extent results from these evaluations can be compared, as modifications of the test design can have a significant influence on the result. This paper describes a numerical model of the tracking procedure. The model specifies some of the unknown factors which make a comparison of speech tracking scores difficult. It also indicates that very different performances found when using the same aid can be explained by differences in test design. This must also be considered when comparing the tracking results obtained with different aids.

INTRODUCTION

De Filippo & Scott's (1978) Speech Tracking procedure, or Connected Discourse Tracking (CDT), has been used to train and evaluate the effectiveness of a number of different technical aids developed to improve the lip-reading ability of profoundly hearing impaired people (Brooks, Frost, Mason, & Gibson, 1986; Cowan, Alcantara, Blamey, & Clark, 1988; Grant, Ardell, Kuhl, & Sparks, 1986; Weisenberger, Broadstone, & Saunders, 1989; and many other authors).

In Speech Tracking, the sender reads from a book phrase by phrase. The receiver is required to repeat back what is said without any errors. If errors are made, the sender repeats the phrase or uses other strategies to enable correct identification. At the completion of a specified time period (usually 5 or 10 minutes), the number of words correctly identified is calculated and divided by the time elapsed to give a word-per-minute (wpm) rate. For example, if a subject is able to correctly repeat back 356 words in a 10-minute Speech Tracking session s/he has a tracking rate of 35.6 wpm.
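The scoring rule described above can be sketched in a few lines of Python. The function name `tracking_rate` is chosen here for illustration only and does not come from the paper.

```python
def tracking_rate(words_conveyed: int, session_minutes: float) -> float:
    """Words per minute: correctly repeated words divided by elapsed time."""
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    return words_conveyed / session_minutes


# The example from the text: 356 words in a 10-minute session.
print(tracking_rate(356, 10))  # 35.6
```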

ADVANTAGES AND DISADVANTAGES OF SPEECH TRACKING

The method has a number of important advantages.

(i) The procedure is a straightforward one and requires no special training for either the sender or receiver and no special equipment.

(ii) The method of scoring is easy to understand and gives a measure of the fluency with which the receiver can "track" speech.

(iii) Speech Tracking has high face validity as it, in part, replicates everyday communication using connected discourse. As Tye-Murray & Tyler (1988) point out, "specialists desire a test with a high face validity, one that indexes how well a subject recognises speech encountered in normal everyday life" (p. 230).

(iv) Material can be drawn from a virtually unlimited number of sources and can be selected to meet the language skills and lip-reading ability of individual subjects.

* Guest researcher from National Acoustic Laboratories, Sydney, Australia.


There are also a number of potential problems with the procedure. Tye-Murray & Tyler (1988), in a critique of the method, list such factors as text selection and uncontrolled sender and receiver characteristics. Hochberg, Rosen, & Ball (1989) have investigated the effect of the sender-receiver pair and text difficulty on the result and conclude that it is not appropriate to compare tracking results across different sender-receiver pairs.

Another major problem with the speech tracking procedure today is the lack of a standardised protocol for dealing with breakdowns or "blockages" (Owens & Telleen, 1981) in understanding, when the receiver is unable to lip-read a certain word or phrase. Schoepflin & Levitt (1991), for example, found large differences in the correction or "repair" strategies used by experimenters in various studies using CDT as an evaluation method. Consequently, they argued that the procedure is unstandardised and subject to a number of methodological variables. There have been attempts to specify the procedures to be used when blockages occur, but none of these appears to have won wide acceptance.

De Filippo & Scott (1978), in their initial description of the method, outlined the protocol they used to resolve blockages. "If the repetition does not match the text exactly, the talker (a) chooses to present the segment again, making no change, modifying the style of presentation (especially timing and exaggeration of speech movements), shortening the segment to focus on a phrase, word, syllable or sound, or lengthening the segment to review or preview phonetic or linguistic context; (b) chooses to instruct the receiver with context comments by labelling the error, labelling the topic, or paraphrasing the text; or (c) chooses to combine or sequence several strategies. The basis for the talker's decision necessarily depends on the receiver's errors and changes as receiver skill changes" (De Filippo & Scott 1978, p. 1187).

Owens & Telleen (1981) and Owens & Raggio (1987) attempted to shift the responsibility for overcoming blockages to the receiver. In their adaptation of the method, the receiver is trained to use a series of questions or requests when breakdowns occur.

This approach has many benefits in training but in practice the receiver's ability to use these strategies varies widely. As a result, the amount of benefit derived will vary widely from receiver to receiver and will greatly influence the tracking rate obtained in experimental studies.

There are other strategies which can be adopted when breakdowns occur. These include writing down, finger spelling and signing the word or words which the receiver cannot lip-read. When artificially deafened hearing subjects are used in experimental studies, the sender may choose to present the blocked words to the receiver auditorily. The time taken by each of these approaches varies considerably. Presenting the words auditorily or via sign is the quickest alternative, while writing will probably occupy the most time. Schoepflin & Levitt (1991) found that "fewer than 7% of the talker-listener sequences extended beyond five trials" (p. 248) before the word or words were recognised correctly. When that many repeats are used, a single blockage can lead to a loss of more than 40 conveyed words in the session.

The time taken to resolve a blockage will obviously greatly influence the wpm score obtained by the subject. Fenn & Smith (1987) attempted to control this variable by limiting the time taken to resolve blockages. If the receiver was unable to recognise a phrase or a word after two repetitions, the timing was stopped and the blockage was then resolved. The timing was then recommenced and the next phrase presented for identification.

This introduces the problem of calculating the score. Should a word or words conveyed either outside the timed period or outside the communication system being evaluated (for example, lip-reading alone or lip-reading plus an experimental aid) be included in the total number of words identified in a tracking session? If they are included, the result will differ from the result obtained with those words excluded. The difference will increase with the number of blocked words conveyed outside the system. De Filippo (1988) recommends that blocked words resolved via writing or signing should not be included in the final score. She also suggests a penalty reduction of the final score.

It was mentioned above that sender-receiver characteristics play an important role. One uncontrolled sender-receiver characteristic that numerically affects the final score, as we will show later, is the rate of presentation. The faster the sender presents the material, the more words can be conveyed per unit of time. However, beyond a certain rate of presentation the score will no longer improve, because the proportion of blocked words will also increase and reduce the tracking score.

An important consequence is that there is an optimum presentation rate for each sender-receiver pair. This optimal presentation rate will depend not only on individual sender-receiver characteristics, but also on the difficulty of the particular text used, the repair strategy chosen, and the quality of the technical aid used. If the sender is for some reason unable to find this optimum rate of presentation, the final tracking rate is not a valid measure of the communication system under evaluation.

Where studies are conducted in different languages it is most likely that the final result will be influenced by differences in average word length. For example, the average word length in English is 6.09 letters/word (ls/w) while in Swedish the average is 6.43 ls/w, and German has as much as 7.69 ls/w. If phonetic data, which correspond more closely to the time it takes to pronounce a word, are considered, the average values are 4.96, 5.94, and 6.78 phonemes/word, respectively (Carlson, Elenius, Granström, & Hunnicutt, 1985). In Swedish nouns, for example, all definite articles are indicated by a suffix, such as en hund (a dog) and hunden (the dog), i.e., it is just one word, while the English language uses two words to convey the same information. If this relation is linear, the use of Swedish in a tracking test would yield a score about 20% lower than when English is used. A consequence of such language differences will be a need for some scaling factors to be used for across-language comparisons of tracking results. However, we will leave this parameter for analyses in later publications.
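As a rough illustration of what such scaling factors might look like, the sketch below uses the phonemes/word values quoted above. It rests on an assumption that is ours, not a result of the paper: that the time per word, and hence the inverse of the wpm score, scales linearly with phonemes per word. The names `PHONEMES_PER_WORD` and `rate_scale` are hypothetical.

```python
# Average phonemes per word quoted in the text (Carlson et al., 1985).
PHONEMES_PER_WORD = {"English": 4.96, "Swedish": 5.94, "German": 6.78}


def rate_scale(language: str, reference: str = "English") -> float:
    """Expected wpm in `language` relative to `reference`.

    Assumption (illustrative only): time per word is proportional to
    phonemes per word, so wpm is inversely proportional to it.
    """
    return PHONEMES_PER_WORD[reference] / PHONEMES_PER_WORD[language]


for lang, phonemes in PHONEMES_PER_WORD.items():
    length_ratio = phonemes / PHONEMES_PER_WORD["English"]
    print(f"{lang}: word length x{length_ratio:.2f}, expected wpm x{rate_scale(lang):.2f}")
```

Under this assumption Swedish words come out roughly 20% longer than English words in phonemes, and the corresponding wpm factor is a little above 0.83.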

In spite of these factors, the Speech Tracking technique is widely used for both evaluation and training purposes. It is, of course, tempting to make comparisons and forget about all those parameters which must be kept under control before it is possible to make a meaningful comparison.

If we look at across-study Speech Tracking investigations, they show a very large range of results. This is exemplified in Fig. 1, which shows tracking rates obtained via lip-reading alone and lip-reading supplemented by a number of different tactile aids reported by various researchers. In some cases, different researchers evaluated the same aid and yet obtained very different results. Therefore, it can be anticipated that the large range of results is not only an effect of the effectiveness of different aids. There must also be some dependency on other factors, such as those described above.

[Figure 1: chart with one row per aid and study; left panel: words/min, right panel: % benefit of the aid. Row labels: Tickle Talker, nh, 70 h, n=7 (Cowan 88); Tacticon, DL, preli. deaf (Kozma-Spytek); Tactaid V, LK, nh (Weisenb. 87); Queens, subjects LK, CD, LD, and KM, nh (Weisenb. 87); Queens (Brooks 85); Queens, KS, nh, 4 h (Spens 91); Hand, GS, hi (Plant 85); MiniVib3, n=6, hi (Axels. et al.).]

Fig. 1. This figure shows aided and unaided tracking rates (left) and relative benefit from different tactile aids (right) obtained by different investigators. When known, it is indicated whether the subjects were normal hearing (nh) or hearing impaired (hi), together with hours of training time and number of subjects (n) or initials of single subjects.

In this paper we will describe Speech Tracking using a numerical model in an attempt to estimate the effects of some of the factors mentioned above. We will define quantified measures of the experienced text difficulty, the maximum rate of conveying words (i.e., a sort of ceiling rate), and the repair strategy chosen for a particular session, and then try to express their internal relations.

NUMERICAL MODEL

Tracking as described by De Filippo (1988) is a composite measure which can be represented by the formula:

$$L = \frac{W_s}{T_s}$$

where L = the average wpm score, Ws = the total number of words conveyed during one session, and Ts = the session time in minutes.

In order to shed light on the influence of different components in CDT it is necessary to introduce some new parameters.

In CDT, the passage being used is presented in logical linguistic units of appropriate lengths. The receiver has to repeat back exactly what was presented with no deviations from the printed text. Sometimes the receiver is able to repeat the entire phrase or sentence after only one presentation. On other occasions, the receiver may "block" on a particular word or series of words and require one or more repeats of the word or words that are creating difficulties. Other strategies, such as those suggested by De Filippo & Scott (1978), may also be used to overcome blockages. Often the receiver is eventually able to correctly identify the full phrase without resorting to the use of any communication other than that being investigated. Sometimes, however, the receiver is unable to correctly identify the word or words even after repeats, and the sender is forced to resolve the problem by using a presentation method other than that being evaluated. For example, the sender may be forced to write down or sign the word(s) creating difficulties for the receiver.

We can now divide up the words in a particular tracking session into three separate categories determined by the receiver's response. The categories used are:

(1) Wc - words in units or phrases correctly identified after only one presentation. Words in such phrases are on average conveyed at the highest wpm rate, or ceiling rate, used in the session under consideration and using the communication system under evaluation. We will use the index c to indicate this definition of the ceiling rate for the communication system under evaluation.

(2) Wbi - words in phrases which are not correctly identified after one presentation but which are ultimately identified using repeats or modifications within the communication system being evaluated. These then are words in blocked units which are eventually conveyed inside the system.

(3) Wbo - those words in units which cannot be correctly conveyed within the communication system being evaluated. The sender is forced to use another communication mode such as signing or writing to convey the word(s). These are words in blocked units eventually conveyed outside the system.

For simplicity, we will in this paper use the definition that all words in a unit presented for identification which contains at least one blocked word are considered to be blocked. This is done even though resolving a blockage in most cases means successfully conveying just one word inside or outside the communication system. For example, all words in a unit which contains one or more words which have to be conveyed by writing or signing will be categorised as Wbo.

The above definitions could be made in other ways. These are chosen because they can be quantified by using the computer-assisted tracking procedure described by Gnosspelius & Spens (1992).

Two other factors need to be considered at this point. They are:

(4) Ws - the total number of words conveyed from the text in a tracking session.


(5) Wb - the total number of words in blocked units in a tracking session.

The relationship between these factors is represented graphically in Fig. 2.

[Figure 2: diagram dividing the total number of conveyed words into the number of words in non-blocked units and the number of words in blocked units, the latter resolved inside or outside the system.]

Fig. 2. A tracking session in which the total number of words (Ws) from the text are conveyed can be divided into words in units correctly repeated after only one presentation (Wc), words in units which require more than one repetition (Wb), and words in blocked units resolved inside (Wbi) or outside (Wbo) the communication system being evaluated.

We now need to consider the time required in any tracking session to deal with the various categories of words described above. This requires definitions of a number of parameters related to the temporal course of the tracking session.

(6) Ts - the total time of the tracking session.

(7) Tc - the time taken to identify directly conveyed units, that is, the words in units correctly identified after only one presentation. That is also the time used to convey words at the ceiling rate defined earlier.

(8) Tbi - the time taken to convey units containing blockages only resolved inside the communication system being evaluated.

(9) Tbo - the time taken to convey blocked units resolved outside the communication system under consideration. For example, units containing those words which are eventually conveyed via finger spelling or writing.

(10) Tb - the time taken to convey all blocked units. This is Tbi+Tbo.

Using these parameters it is possible to specify the average time per word taken to convey words in each of the three word categories Wc, Wbi, and Wbo. These can be represented as:

(11) tc - the average time taken to convey a directly conveyed (Wc) word, or the time taken to convey a word at the earlier defined ceiling rate for the communication system under evaluation.

That is, $t_c = T_c / W_c$

(12) tbi - the average time spent on a Wbi word, i.e., a blocked word resolved inside the communication system.

That is, $t_{bi} = T_{bi} / W_{bi}$

(13) tbo - the average time spent on a Wbo word, i.e., a blocked word resolved outside the communication system.

That is, $t_{bo} = T_{bo} / W_{bo}$

[Figure 3: diagram relating word counts (total number of conveyed words, number of words in non-blocked units, number of words in blocked units resolved inside or outside the system) to time (total session time, time spent on blocked units).]

Fig. 3. This is an expanded version of Fig. 2 and it shows the relationship between the words in a passage and the time taken for them to be conveyed.

(14) tb - the average time spent on a word (Wb) in a blocked phrase.

That is, $t_b = T_b / W_b$, or $t_b = (T_{bi} + T_{bo}) / (W_{bi} + W_{bo})$

We now have the opportunity to directly compare the time taken to convey non-blocked words (Wc), blocked words (Wb), blocked words resolved inside the system (Wbi), and blocked words resolved outside the system (Wbo).

These new parameters are:

(15) kb - how much longer it takes to convey a word in a blocked unit than a directly conveyed word.

That is, $k_b = t_b / t_c$

(16) kbi - how much longer it takes to convey a word in a blocked unit resolved inside the system than a word in a directly conveyed unit.

That is, $k_{bi} = t_{bi} / t_c$

(17) kbo - how much longer it takes to convey a word in a blocked unit resolved outside the system than a directly conveyed word.

That is, $k_{bo} = t_{bo} / t_c$
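The per-word times and k-factors defined in items (11) to (17) can be computed from six session tallies. The sketch below is only an illustration: the class `SessionTally`, its field names, and the example figures are invented here and are not data from the paper. It also assumes that every category contains at least one word.

```python
from dataclasses import dataclass


@dataclass
class SessionTally:
    """Word counts and times (minutes) per category; names are hypothetical."""
    words_c: int    # Wc: words in directly conveyed units
    words_bi: int   # Wbi: words in blocked units resolved inside the system
    words_bo: int   # Wbo: words in blocked units resolved outside the system
    time_c: float   # Tc
    time_bi: float  # Tbi
    time_bo: float  # Tbo


def k_factors(s: SessionTally) -> dict:
    """Return kb, kbi and kbo as ratios of per-word times to tc."""
    t_c = s.time_c / s.words_c
    t_bi = s.time_bi / s.words_bi
    t_bo = s.time_bo / s.words_bo
    t_b = (s.time_bi + s.time_bo) / (s.words_bi + s.words_bo)
    return {"k_b": t_b / t_c, "k_bi": t_bi / t_c, "k_bo": t_bo / t_c}


# Invented example: 200 words conveyed in a 10-minute session.
session = SessionTally(words_c=120, words_bi=60, words_bo=20,
                       time_c=2.0, time_bi=5.0, time_bo=3.0)
print(k_factors(session))
```

With these invented figures the directly conveyed words run at 60 wpm, and words in blocked units take about 5 (inside) and 9 (outside) times as long per word.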

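The session time is the sum of the time spent on the three word categories, Ts = tc·Wc + tbi·Wbi + tbo·Wbo. If the per-word times of the blocked words are expressed through the k-factors, and the ceiling rate is written as Lc = 1/tc, we will arrive at

$$L = \frac{W_s}{T_s} = L_c \, \frac{W_s}{W_c + k_{bi} W_{bi} + k_{bo} W_{bo}}$$

If we also substitute Ws according to Fig. 2, we will get

$$L = L_c \, \frac{W_c + W_{bi} + W_{bo}}{W_c + k_{bi} W_{bi} + k_{bo} W_{bo}}$$

Changing the absolute word number values to proportions by dividing with Ws, we will get a very general expression of the relation between the conventional tracking score and some important underlying time related parameters.

$$L = L_c \, \frac{W_c/W_s + W_{bi}/W_s + W_{bo}/W_s}{W_c/W_s + k_{bi} W_{bi}/W_s + k_{bo} W_{bo}/W_s} \qquad (19)$$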

This general expression (19) shows that the tracking score (L) is proportional to the maximum average speed (Lc) (or the ceiling rate for that particular session), which is the transmission speed for directly conveyed words. However, it should be pointed out again that there is, of course, a hidden relation between Lc and Wb, which means that the more Lc exceeds the optimal Lc, the lower the final score (L) because of an increased proportion of blocked words. Expression (19) also shows that L will depend in a non-linear fashion on the proportion of words in blocked units resolved inside and outside the system (Wbi/Ws and Wbo/Ws) with weights (kbi and kbo). These weights are related to the repair strategy chosen and are derived from the average time spent on words in blocked phrases compared to words in phrases that are conveyed on the first trial. For example, if the k-value is on average 7, each word in a blocked phrase takes as long as 7 directly conveyed words and thus costs the final score 6 words.

The relation also shows that the conventional tracking score would be as high as the ceiling rate if the two terms (Wbi/Ws) and (Wbo/Ws) were zero, i.e., if there are no blockages. As soon as there are blockages resolved either inside or outside the system, (Wc/Ws) in the numerator will be reduced and one or both of the two terms (kbi*Wbi/Ws) and (kbo*Wbo/Ws), related to the time spent on blockages, will increase the denominator and thereby reduce the conventional tracking score (L). The higher the k-factors, the faster the score (L) is reduced.
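A minimal sketch of relation (19) in proportion form, as it is described in the text: a numerator of three proportions summing to one, and a denominator in which the blocked proportions are weighted by their k-factors. The function name, argument names, and example values are illustrative assumptions, not taken from the paper.

```python
def tracking_score(l_c: float, wp_bi: float, wp_bo: float,
                   k_bi: float, k_bo: float) -> float:
    """Conventional tracking score L from the ceiling rate and blockage terms.

    l_c   : ceiling rate Lc in wpm
    wp_bi : Wbi/Ws, proportion of words in blocked units resolved inside
    wp_bo : Wbo/Ws, proportion of words in blocked units resolved outside
    k_bi, k_bo : time weights relative to a directly conveyed word
    """
    wp_c = 1.0 - wp_bi - wp_bo              # Wc/Ws, from Wc + Wbi + Wbo = Ws
    numerator = wp_c + wp_bi + wp_bo        # equals one when all words count
    denominator = wp_c + k_bi * wp_bi + k_bo * wp_bo
    return l_c * numerator / denominator


# No blockages: the score equals the ceiling rate.
print(tracking_score(60, 0.0, 0.0, 5, 9))
# Invented example: 30% of words blocked inside, 10% outside.
print(tracking_score(60, 0.3, 0.1, 5, 9))
```

With the invented values in the second call, a ceiling rate of 60 wpm drops to about 20 wpm, which illustrates how strongly the time spent on blocked units weighs on the score.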

Some authors, like De Filippo (1988) and Fenn & Smith (1987), specify scoring schemes that will have a modifying effect on relation (19). In the present relation (19) the sum of the three terms in the numerator equals one. That makes the relation simpler, but this is only applicable if all the words presented by the sender in one session are included in the relation. The two terms (Wbi/Ws) and (Wbo/Ws) are related to the proportion of words in blocked phrases resolved either inside or outside the communication system. Some authors, like De Filippo (1988) and others using her procedure, specify a scoring design which does include the time spent on resolving blocked words outside the system, i.e., the term (kbo*Wbo/Ws), but does not include the number of words resolved outside the system, i.e., the term (Wbo/Ws) in the numerator is set to zero. If (Wbo/Ws) in the numerator is omitted, that will of course have a negative effect on the resulting score L. However, the effect on the relative benefit from an aid, i.e., the relation aided/unaided, will be positive if the number of blocked phrases is lower in the aided situation.

For example, if the result is 250 wpm in the aided situation and 200 in the unaided, the relative improvement caused by the aid would be 25% if all words are included. If there are 20 blocked words resolved outside the system in the unaided situation and only 5 in the aided, the final tracking score would be 245 and 180 respectively, i.e., somewhat lower. However, the relative improvement would be 245/180, which equals about 36% improvement in the aided condition. Penalty schemes can increase relative improvements even further. In most cases, such schemes can be numerically included in relation (19).
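The arithmetic of this example can be checked with a short sketch. The helper name `relative_improvement` is hypothetical; the figures are those given in the paragraph above.

```python
def relative_improvement(aided: float, unaided: float) -> float:
    """Relative benefit of the aid, (aided - unaided) / unaided."""
    return aided / unaided - 1.0


# All words included.
all_words = relative_improvement(250, 200)
# Words resolved outside the system excluded: 5 aided, 20 unaided.
outside_excluded = relative_improvement(250 - 5, 200 - 20)

print(f"all words included:     {all_words:.1%}")
print(f"outside words excluded: {outside_excluded:.1%}")
```

The first ratio is exactly 0.25, and the second, 245/180 - 1, evaluates to roughly 0.36, so excluding the words resolved outside the system raises the apparent relative benefit even though both absolute scores fall.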

Fenn & Smith (1987) suggest a procedure in which the timing is stopped while resolving blockages outside the system. In this case, formula (19) will change so that the term (kbo*Wbo/Ws) in the denominator is set to zero because the timing is stopped, and the term (Wbo/Ws) in the numerator is also set to zero because blocked words are excluded. However, the time taken for the two repetitions done before realising that the word has to be resolved outside the system must somehow be taken care of in the denominator.

By using p as an index for proportions and the fact that Wpc + Wpbi + Wpbo = 1, we can simplify relation (19) to:

$$L = \frac{L_c}{1 + W_{pbi}(k_{bi} - 1) + W_{pbo}(k_{bo} - 1)}$$

If all blocked words are considered to be one category, we will get an even simpler version (20):

$$L = \frac{L_c}{1 + W_{pb}(k - 1)} \qquad (20)$$

This does not show the influence of different strategies to repair words inside or outside the communication system, nor does it show what happens if words resolved outside the system are omitted. However, it clearly illustrates the non-linear properties.

In Fig. 4, the conventional tracking score L is exemplified for two ceiling rates (Lc = 80 and 50 wpm) as a function of the proportion of words in blocked phrases (Wpb) and the k-values 2, 3, 5, 8, and 12.

[Figure 4: L = Lc/(1 + Wpb(k-1)) in words/min plotted against the proportion of words in blocked units, Wpb (0.00 to 1.00); within each family of curves, from above, k = 2, 3, 5, 8, and 12.]

Fig. 4. This is an example of relation (20) with two Lc-values (80 and 50 words/min) and k = 2, 3, 5, 8, or 12. It shows how L depends on k and the value of Wpb.
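The curves of Fig. 4 can be tabulated directly from relation (20), L = Lc/(1 + Wpb(k-1)). The sketch below uses the Lc- and k-values named in the caption; the function name and the chosen Wpb grid are illustrative assumptions.

```python
def tracking_score_simple(l_c: float, wp_b: float, k: float) -> float:
    """Relation (20): L = Lc / (1 + Wpb * (k - 1))."""
    if not 0.0 <= wp_b <= 1.0:
        raise ValueError("wp_b is a proportion and must lie in [0, 1]")
    return l_c / (1.0 + wp_b * (k - 1.0))


K_VALUES = (2, 3, 5, 8, 12)

for l_c in (80, 50):
    print(f"Lc = {l_c} wpm")
    for wp_b in (0.0, 0.25, 0.5, 0.75, 1.0):
        scores = "  ".join(
            f"k={k}: {tracking_score_simple(l_c, wp_b, k):5.1f}" for k in K_VALUES
        )
        print(f"  Wpb={wp_b:.2f}  {scores}")
```

At Wpb = 0 every row starts at the ceiling rate, and at Wpb = 1 the score is simply Lc/k, which makes the non-linear fall-off between those end points easy to see.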

It is obvious that the higher the proportion of blocked words, the lower the L-value. If the proportion of blocked words increases, it is also likely that the sender lowers the presentation rate to around 50 wpm to ensure fluency. That will lower the tracking rate (L) even further. It is also clear that a higher k-value gives a lower L. This means that if a high final score is desired, simple text material resulting in a low proportion of blocked words should be used. That was also shown by Hochberg & al. (1989). A repair strategy which is fast, i.e., a low k-value, will also contribute to a high tracking rate.

However, if results from an aided (La) and an unaided (Lua) condition are compared, the relative difference (La-Lua)/Lua will improve for higher k-values.

Both a high absolute tracking score in the aided condition and a high relative score aided/unaided are obtained if the text material results in a low proportion of blocked words and if the chosen repair strategy results in a high k-value. The high k-value will certainly reduce the absolute results, but the aided condition will be influenced less than the unaided.
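The dependence of the relative difference (La-Lua)/Lua on k can be illustrated with relation (20). The sketch assumes, purely for illustration, that both conditions share the same ceiling rate and differ only in the proportion of words in blocked units; the proportions 0.2 and 0.5 are invented values.

```python
def score(l_c: float, wp_b: float, k: float) -> float:
    """Relation (20): L = Lc / (1 + Wpb * (k - 1))."""
    return l_c / (1.0 + wp_b * (k - 1.0))


def relative_benefit(l_c: float, wp_b_aided: float,
                     wp_b_unaided: float, k: float) -> float:
    """(La - Lua) / Lua for a common ceiling rate and k-value."""
    aided = score(l_c, wp_b_aided, k)
    unaided = score(l_c, wp_b_unaided, k)
    return (aided - unaided) / unaided


# Invented proportions: 20% blocked words aided, 50% unaided.
for k in (2, 3, 5, 8, 12):
    aided = score(50, 0.2, k)
    benefit = relative_benefit(50, 0.2, 0.5, k)
    print(f"k={k:2d}  La={aided:5.1f} wpm  relative benefit={benefit:.0%}")
```

In this sketch the aided score falls as k grows while the relative benefit rises, which is the trade-off described in the two paragraphs above.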

Figure 5 shows the results obtained by a normally-hearing female subject when tracking materials were presented via lip-reading alone and lip-reading supplemented by auditorily-presented speech which was low-pass filtered (LPF) at 250, 500, and 1,000 Hz. There were three 10-minute tracking sessions in each condition. The repair strategy chosen was simply repetition of the blocked word in a somewhat more clearly articulated manner. After two unsuccessful repetitions, the blocked word was given in written form on an LED screen hanging just above the sender's head. The phrases were read from a screen by the sender, and the last-resort intervention was somewhat faster than a repetition: the sender just clicked on the blocked word with the mouse (Gnosspelius & Spens 1992).

The subject's wpm rate increases rapidly with the addition of the LPF speech signal. It can be seen that the subject's lip-reading alone tracking rate is approximately 20 wpm. When lip-reading was supplemented by the speech LPF at 250 Hz, her tracking rate rose to around 50 wpm. The tracking rate for lip-reading plus the


ing was supplemented with the LPF speech, usually only one word was blocked in a phrase. As the repair strategy was restricted to a maximum of three repetitions, the last one visual, the average k-value was only around 3. The average blocked phrase of about six words then contained one blockage. The average price for each blocked word in these sessions then equalled about twelve lost words.
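The "twelve lost words" figure follows from the definition of k: a blocked phrase of n words occupies the time of k·n directly conveyed words but yields only n, so the loss is (k-1)·n words. A minimal sketch, assuming one blocked word per phrase as in the text; the function name is hypothetical.

```python
def words_lost_per_blockage(k: float, words_in_phrase: int) -> float:
    """Time cost of one blocked phrase, in directly conveyed words.

    A phrase of n words that takes k times as long per word uses the time
    of k*n direct words but conveys only n, so (k - 1) * n words are lost.
    """
    return (k - 1) * words_in_phrase


print(words_lost_per_blockage(3, 6))  # 12, the example in the text
```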

Finally, the proportion of words in blocked phrases resolved outside the communication system is shown. This is around 15% in the lip-reading alone condition but close to zero in the remaining three presentation conditions. Allowing any number of interventions to repair these words inside the system would have reduced the final score from 20 to about 16 wpm.

As an example, when comparing the aided condition (low-pass filter 250 Hz) with the unaided condition, the relative benefit of the filtered sound would be 48/20 - 1, or an improvement of 140%. However, if a repair strategy was chosen that did not allow blockages to be resolved outside the system, the reduction of the unaided score from 20 to 16 wpm would give a relative improvement of 48/16 - 1, or 200%.

DISCUSSION

Relation (20) is illustrated in Fig. 4 with two Lc-values. It is indicated that the conventional tracking score will depend on Lc and will have a non-linear dependence on Wpb and k. It seems likely that knowledge of the parameters Lc, Wpb and k would make the across-subject comparisons of tracking results more valid. For example, when comparing across-subject results with two tactile aids, it could probably be done with a much higher degree of accuracy if the unaided base line (for example 25 wpm) was obtained with about the same proportion of words in blocked phrases and with the same k- and Lc-values. At least these values should be known. The k-value would then tell about the average time consumption to convey words in a blocked phrase when using the repair strategy chosen. The Lc-value expresses the ceiling rate for the sender-receiver pair and their particular communication system, and finally, the Wpb-value gives information about the experienced text difficulty. The experienced difficulty to lip-read the text will, of course, also depend to an extent on the presentation rate, i.e., the Lc-value, and it is anticipated that in most cases an optimal ceiling rate is reached after a rather short period of training.

For a perfect system (for example, normal hearing subjects in quiet), tracking speeds varying from 114-129 wpm depending on text complexity are reported by Hochberg & al. (1989). With a perfect system, blockages are very rare and the score will be very high. It seems that even a very few blockages will cause the sender to adapt to a system which has some imperfections by lowering her or his presentation rate. For example, as mentioned before, the subject in the study conducted by Plant & Spens (1986) performed 20-30 wpm lower than expected, based on his face-to-face conversational performance. This ceiling (Lc) for high performers seems to be in the range of 80-100 wpm. The adaptations made for the limitation imposed by the receiver's skills and the communication system can lower Lc for lip-reading alone to about 50 wpm (Gnosspelius & Spens, 1992). However, it is worth emphasising that Lc is an uncontrolled variable which can vary with the sender's will, habits, or experienced communication convenience. If Lc is not under control, there will be a big problem of validity if the aim is to assess the receiver's communication ability with or without a technical aid.

Fenn & Smith (1987) have described a method to reduce the variability introduced by the resolving of blockages by excluding time spent on blockages resolved outside the system and allowing only two repetitions. In their approach, they simply stop timing after two repetitions and, hence, the time to convey blocked words by means outside the system is not included in the session time (Tbo and Wbo are set to zero). It can, therefore, be assumed that kbi is stable and not very high. Their score contains both L and the percentage of blocked words conveyed outside the system. Analysing their results according to relation (20), we found that an estimate of Lc reached an asymptotic level of around 65 wpm after about ten sessions, and that improved performance thereafter was virtually due only to a reduced proportion of blocked words (Wpb). It is difficult to draw any definite conclusions based on such limited data, but it appears that the sender and the receiver adapt their Lc rather quickly to the communication difficulty, and that training will mainly reduce Wpb. Some preliminary results shown in Fig. 5 (Gnosspelius & Spens, 1992) also support this assumption. Here it is indicated that Lc very quickly reaches an asymptotic level, especially when lip-reading is supplemented by low-pass filtered speech.
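One way to back-calculate a ceiling rate from a reported score is to invert relation (20), giving Lc = L(1 + Wpb(k-1)). The sketch below is an assumption about how such an estimate could be made, not a description of the procedure actually applied to the Fenn & Smith data; the function name and the input values are invented, and k has to be measured or assumed.

```python
def estimate_ceiling_rate(l: float, wp_b: float, k: float) -> float:
    """Invert relation (20): Lc = L * (1 + Wpb * (k - 1))."""
    if not 0.0 <= wp_b <= 1.0:
        raise ValueError("wp_b is a proportion and must lie in [0, 1]")
    return l * (1.0 + wp_b * (k - 1.0))


# Invented session: 40 wpm, 30% of words in blocked units, k assumed to be 3.
print(estimate_ceiling_rate(40, 0.3, 3))
```

With these invented inputs the estimate is about 64 wpm. Tracking such an estimate over sessions would separate changes in the ceiling rate from changes in the proportion of blocked words.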

The adaptation of their own speech made by people talking to hearing impaired persons, as reported by Picheny & al. (1986), is probably reflected in the mentioned dependence of Lc on Wpb. They found "clear" speech presented at about half the rate (100 wpm) of "conversational" speech (200 wpm). It is obvious that the experienced communication difficulty in a lip-reading situation will make most talkers use the clear speech mode, which has a significantly higher intelligibility (Picheny, Durlach, & Braida, 1985). The Lc value of 50 wpm obtained in a lip-reading situation would correspond to the "clear" speech talking rate of 100 wpm. When the sender detects a lowered Wpb he will eventually try to increase his speech rate up to the conversation level of about 200 wpm. The receiver may not increase his speed that much, but the corresponding Lc will be in the area of 80 to 100 wpm for a very good communication system like the exemplified LP 1000 Hz condition in Fig. 5.

There are some very low (Rihkanen, 1988) and some very high (Weisenberger, 1989) Speech Tracking scores in Fig. 1. The very low scores obtained by Rihkanen using the MiniVib3, compared to other MiniVib3 studies, can be explained by the numerical model. The text used was difficult (Rihkanen, 1988), which meant that a very large proportion of the conveyed words were blocked, i.e., Wpb was high. Rihkanen does not describe how the blockages were resolved, but the subjects were deaf, which indicates that a portion of the blockages must have been conveyed by either sign language or writing. Therefore the k-value is high and we are working in the lower right part of the graph in Fig. 4, where L is very low. LC is probably also very low to ensure some fluency in both conditions. If this was the case, there would be no improvement due to an increased LC in the aided condition. Other users of MiniVib3 may have worked with lower Wpb-values by using either simpler texts (Axelsson, Berenstaf, & Spens, 1986) or a more skilled lip-reader (Plant & Spens, 1986), and they may have been able to benefit a little from an increased LC. This means they have most probably worked more to the left in Fig. 4.

In the tracking results analysed by Schoepflin & Levitt (1991), the proportion of utterances correctly repeated on the first trial is 180/412 = 0.44 or 44%, i.e., the proportion of words in blocked phrases (Wpb) is 0.56. There were 232 blocked utterances or phrases, which needed 845 repetitions to be resolved. An estimate of the average k-value is therefore 845/232 = 3.6, assuming all phrases took about the same time. The aver-
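The arithmetic behind the Schoepflin & Levitt figures quoted above can be checked with a minimal sketch. The function name `summarise_tracking` and its argument names are hypothetical and chosen only for illustration; the sketch assumes, as the text does, that all utterances contain about the same number of words and that every repetition takes about the same time.

```python
def summarise_tracking(total_utterances, correct_first_trial, repetitions_needed):
    """Return (first-trial proportion, Wpb estimate, average k estimate).

    Hypothetical helper: it only reproduces the ratios quoted in the text
    and is not part of the numerical model itself.
    """
    blocked = total_utterances - correct_first_trial
    p_first = correct_first_trial / total_utterances
    # Assumption: utterances are of equal length, so the proportion of
    # blocked utterances approximates the proportion of blocked words (Wpb).
    wpb = blocked / total_utterances
    # Assumption: each repetition takes about the same time, so the mean
    # number of repetitions per blocked utterance approximates k.
    k_avg = repetitions_needed / blocked
    return p_first, wpb, k_avg


# Counts quoted from the Schoepflin & Levitt (1991) analysis.
p_first, wpb, k_avg = summarise_tracking(412, 180, 845)
print(f"first trial: {p_first:.2f}, Wpb: {wpb:.2f}, k: {k_avg:.1f}")
```

With the quoted counts this gives 232 blocked utterances and values that round to 0.44, 0.56 and 3.6, in agreement with the figures in the text.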

tional tracking score will hopefully make comparisons of tracking results more interesting and informative than today.

REFERENCES

Axelsson, A., Berenstaf, E., & Spens, K-E. (1986): "Erfarenheter av träning med ett vibrotaktilt hjälpmedel," Kursbok för Nordiska Audiologiska Sällskapets möte i Åbo, pp. 167-169.

Brooks, P.L., Frost, B.J., Mason, J.L., & Gibson, D.M. (1986): "Continuing evaluation of the Queen's University tactile vocoder. II: Identification of open set sentences and tracking narrative," J.Rehab.Res. 23:1, pp. 129-138.

Carlson, R., Elenius, K., Granström, B., & Hunnicut, S. (1985): "Phonetic and orthographic properties of the basic vocabulary of five European languages," STL-QPSR No. 1, pp. 63-94.

Cowan, R.S.C., Alcantara, J.I., Blamey, P.J., & Clark, G.M. (1988): "Preliminary evaluation of a multichannel electrotactile speech processor," J.Acoust.Soc.Am. 83:6, pp. 2328-2338.

De Filippo, C.L. (1988): "Tracking for speech reading training," Volta Rev. 90:5, pp. 215-237.

Fenn, G. & Smith, B.Z.D. (1987): "The assessment of lip-reading ability: some practical considerations in the use of the tracking procedure," Brit.J.Audiol. 21, pp. 253-258.

Gnosspelius, J. & Spens, K-E. (1992): "A computer based Speech Tracking procedure," STL-QPSR No. 1, pp. 131-137.

Grant, K.W., Ardell, L.A.H., Kuhl, P.K., & Sparks, D.W. (1986): "The transmission of prosodic information via an electrotactile speech reading aid," Ear & Hear. 7, pp. 328-335.

Hochberg, I., Rosen, & Ball, V. (1989): "Effect of text complexity on Connected Discourse Tracking Rate," Ear & Hear. 10:3, pp. 192-199.

Kozma-Spytek, L. & Weisenberger, J.M. (1987): Evaluation of a multichannel electrotactile device for the hearing impaired, Central Institute for the Deaf, St. Louis, MS, USA.

Matthies, M.L. & Carney, A.E. (1988): "A modified speech tracking procedure as a communication performance measure," J.Speech & Hear.Res. 31, pp. 394-404.

Owens, E. & Teleen, C.C. (1981): "Tracking as an aural rehabilitative process," J.Acad.Rehab.Audiol. 14, pp. 259-273.

Picheny, M.A., Durlach, N.I., & Braida, L.D. (1985): "Speaking clearly for the hard of hearing I: Intelligibility differences between clear and conversational speech," J.Speech & Hear.Res. 28, pp. 96-103.

Picheny, M.A., Durlach, N.I., & Braida, L.D. (1986): "Speaking clearly for the hard of hearing II: Acoustic characteristics of clear and conversational speech," J.Speech & Hear.Res. 29, pp. 434-446.

Plant, G. & Spens, K-E. (1986): "An experienced user of tactile information as a supplement to lip-reading. An evaluation study," STL-QPSR No. 1, pp. 87-110.

Rihkanen, H. (1988): Rehabilitation Assessment of Postlingually Deaf Adults Using Single Channel Intracochlear Implants or Vibro-tactile Aids: A Prospective Clinical Study, diss.

Weisenberger, J.M. (1989): "Tactile aids for speech perception and production by the hearing impaired," Volta Rev. 91, pp. 79-100.

Weisenberger, J.M., Broadstone, S.M., & Saunders, F.A. (1989): "Evaluation of two multichannel tactile aids for the hearing-impaired," J.Acoust.Soc.Am. 86, pp. 1764-1775.

Weisenberger, J.M. (1991): Personal communication.