

400 IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, VOL. 16, NO. 4, AUGUST 2008

Computational Modeling of User Errors for the Design of Virtual Scanning Keyboards

Samit Bhattacharya, Anupam Basu, and Debasis Samanta, Member, IEEE

Abstract—Virtual scanning keyboards are used by persons with severe speech and motion impairments as communication aids. Each of these systems consists of a virtual keyboard and a “scanning and access switch” based alternate input method. Designers of such keyboards face problems due to the difficulties in testing prototypes with disabled users. Model-based design approaches were proposed in order to alleviate the problems. In model-based design, systems are evaluated with user models, reducing the need for extensive user testing. The existing model-based approaches, however, do not consider the effect of user errors in evaluating systems. The lack of consideration of errors limits the practical usefulness of the resulting designs. To overcome this limitation, we have performed empirical studies of errors on virtual scanning keyboards. From our study results, we have derived predictive models of user’s error behavior. We have used the models to develop “ErrorProneness,” a numerical error measure for virtual scanning keyboards. We have proposed a method using the “ErrorProneness” measure for taking the effects of errors into account in model-based design. Methods employed in our study, results obtained, the predictive user models, the error measure, and the proposed design method are presented in this paper.

Index Terms—ErrorProneness, focus distance, scanning and access switches, selection error, timing error, virtual keyboards.

I. INTRODUCTION

PERSONS having severe speech and motion impairments such as cerebral palsy, muscular dystrophy, quadriplegia, and the like, face difficulties in expressing themselves in an easy and intelligible way. Their difficulties result from nonfunctioning or partial functioning of body parts responsible for producing speech and motor actions. Consequently, they often rely on external aids to perform their day-to-day communication [1]. “Virtual” (also called “soft”) keyboards are commonly used as the input interface to computer-based communication aids developed for the purpose [2]–[5]. Virtual keyboards are on-screen representations of physical keyboards [6]. The keys of a virtual keyboard are laid out spatially on the computer screen. Users make single letter selections from the interface to compose text. Virtual keyboards used as communication aids usually come equipped with text-to-speech systems to enable the composed texts to be “spoken out.”

Selection of keys from virtual keyboard interfaces poses a problem to severely motion impaired users since they cannot use standard input devices like a mouse or keyboard. “Scanning and access switches” are commonly used alternate input methods developed for such users. Access switches are specially designed input devices that require less motor control to operate than a mouse or a keyboard.¹ These switches can be operated with any active body part of a user including hand, foot, mouth, or eye. To operate computers with access switches, scanning is used. Scanning is the periodic and successive highlighting of on-screen elements [7]. During scanning, the highlighter pauses at each element for a predefined time delay known as the “scan period.” When the highlighter pauses on the desired element, users activate an access switch to select that element. In the rest of the paper, we use the term VSK to refer to virtual keyboards operated with scanning and access switches.

Manuscript received October 14, 2007; revised March 17, 2008; accepted March 26, 2008. First published May 16, 2008; last published August 13, 2008 (projected). The work was supported by AICTE NDF under Grant 1-10/FD/NDF-PG/(IIT-KH(17))/2005-06.

The authors are with the Department of Computer Science, Indian Institute of Technology Kharagpur, Kharagpur 721302 India.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TNSRE.2008.925073

The design space (i.e., the set of possible designs) of a VSK is usually large, which results from the large number of possible ways to organize keys on the interface as well as the wide variation in the scanning input methods. To determine an appropriate design from the design space, designers often rely on their experience and intuition. Even for an experienced designer, however, experience and intuition rarely lead to a single design. Consequently, designers are required to implement prototypes of the alternate designs and evaluate them with users. However, evaluating prototypes with disabled users is not an easy task due to the following problems.

• It is very difficult to get a sufficient number of users for evaluating alternate designs.

• Collecting sufficiently large usage data for analysis is also problematic. Physical disabilities prevent the users from working continuously for a long stretch of time. Sometimes, it takes several months to collect data for evaluation.

As a result, most commercial systems are designed in such a way that the interface elements, layouts, scan period, and other timing parameters and details of switch operations can be configured according to the convenience of the users.² The configuration is usually done by the clinical professionals, relatives of the users, or the users themselves, based on their experience or by trial and error.

In order to reduce dependence on user testing, some researchers reported more systematic model-based design approaches. In model-based design, performance of VSK users (usually in terms of text entry rate) is computed with user models. Alternate designs are compared based on the computed performance. For example, Damper [8] proposed a model, based on the work of Rosen and Goodenough–Trepagnier [9], for the design of VSKs. A few other works have been reported in the literature on modeling performance of virtual keyboard users having motor disabilities. These include the models proposed by Horstmann and Levine [10] and Koester and Levine [11]–[13] that were based on the KLM/GOMS modeling techniques [14]. However, only mouse and keyboard based direct input methods were considered in these works. In Soukoreff and MacKenzie [15], a model of virtual keyboard users was reported (also see [16] and [17]), although it was developed for users without any disabilities.

¹See, for example, http://www.abilityhub.com/switch/switch.htm for a list of commercially available access switches.

²An example of a commercially available virtual scanning keyboard is the WiViK system. The keyboard has several features including configurable switch-based scanning. Details about the system can be found at http://www.wivik.com/

1534-4320/$25.00 © 2008 IEEE

Model-based design approaches allow designers or clinical professionals to evaluate a set of alternate designs or configurations without having to resort to extensive user trials. Thus, the approaches alleviate the problems associated with prototype testing. However, the existing models that are used to compute performance of VSK users do not consider errors in input selection. The lack of error consideration, which arises primarily due to the absence of data on the type, effect, and causes of VSK errors, limits the practical usefulness of the resulting designs or configurations. Trewin and Pain [18] reported an extensive study concerning errors caused by users’ motor disabilities. The types of disabilities considered in the study were the same as in our work. However, the study considered only mouse- and keyboard-based input methods. The results of the study, therefore, are not applicable to the design of VSKs. The present work is aimed at augmenting the model-based approach by taking into account the effect of errors in the design process.

This paper is organized as follows. We performed a set of experiments to determine VSK error types. The experimental details are presented in Section II. We found that mainly two types of errors, namely 1) timing errors and 2) selection errors, occur during interaction between motion impaired users and VSKs. We also estimated the effect of these errors on user performance. Error types and their effects are discussed in Section III. From analysis of the empirical data, we developed predictive models of the error behavior of VSK users. To develop the models, we propose “focus distance,” an entity that embodies both system and interaction characteristics of VSKs. The focus distance and the predictive error models are presented in Section IV. We used the models to develop “ErrorProneness,” a quantitative measure of VSK errors. Based on the measure, we propose a method for taking into account errors in the design. The error measure and the proposed design method are presented in Section V. Section VI discusses the strengths and limitations of the present work.

II. EXPERIMENT DESIGN

We collected data from six speech and motion impaired users for two sets of VSKs. The experiments were divided into two groups. Data collected from the first group of experiments were used to determine the types and effects of errors and develop predictive error models. Data from the second group of experiments were used to demonstrate the validity of the error models. The method and apparatus used for the experiments are detailed below.

A. Scanning Methods

We used three scanning methods in our experiments that are commonly used with VSKs [7], namely, three-level or block-row-item scanning, two-level or row-item scanning, and one-level or item scanning. In a three-level scan, the on-screen elements are divided into “blocks.” Each block is further divided into a set of “rows” and each row is in turn divided into a few “items.” The system initially starts a block level scan. During this process, the block that contains the desired item is selected by the user through the activation of an access switch. Once a block is selected, the system begins a row level scan inside the block starting from the first row. During the row level scanning, the row in which the desired item lies is selected. Then the items of the selected row are scanned starting from the first item. When the scanning reaches the desired item, the item is selected. In two-level scanning, on-screen elements are divided into rows and items, while in one-level scanning each on-screen element is scanned periodically. In our three-level scanning implementation, the highlighter returned to the block containing the last selected key after each key selection. In the two-level implementation, the row containing the last selected key was highlighted again after selection of each key. The key last selected was highlighted again in our implementation of the one-level scanning input method.

B. Interfaces and Access Switches

We developed two virtual keyboard layouts and implemented the three scanning methods on each of them. One of the layouts was a 27-key (26 letters and the “Backspace”) virtual keyboard in English. Implementation of the three scanning methods on this layout resulted in three VSKs, which we refer to as $V_3$ (the keyboard with three-level scanning), $V_2$ (the keyboard with two-level scanning), and $V_1$ (the keyboard with one-level scanning) in subsequent discussions. For $V_3$, three blocks were defined, each consisting of three rows. Each row in turn contained four items as shown in Fig. 1(a). Fig. 1(b) shows $V_2$ with nine rows and four items in each row. $V_1$ is shown in Fig. 1(c) with 27 items.

The other keyboard layout was in Bengali, a language spoken primarily in Eastern India and Bangladesh. The layout was a modification of the virtual keyboard reported in Mukherjee et al. [19]. We refer to the three VSK interfaces that were developed by implementing the three scanning methods on the Bengali layout as $VB_3$ (the keyboard with three-level scanning), $VB_2$ (the keyboard with two-level scanning), and $VB_1$ (the keyboard with one-level scanning) in subsequent discussions. The interfaces are shown in Fig. 1(d)–(f). Each of these VSKs had 16 rows of character keys. For $VB_3$, five blocks were defined. These were (a) block 1 comprising rows 1 and 2, (b) block 2 comprising rows 3–10, (c) block 3 comprising rows 11 and 12, (d) block 4 comprising rows 13 and 14, and (e) block 5 comprising rows 15 and 16.

In addition to the character keys, $V_3$, $V_2$, $VB_3$, and $VB_2$ contained some special purpose keys. These keys are shown in Fig. 1(g) and (h). If a user wrongly entered a block during input selections, the user could come out of the block by selecting the “block cancel” (BC) key [Fig. 1(g)]. In order to select block cancel, the user needed to select the row containing the key first. Similarly, the “row cancel” (RC) key [Fig. 1(h)] was used to come out from a wrongly selected row. However, for the first rows of each block in $V_3$ and $VB_3$, the block cancel keys had to be used for row cancel also. The sequence of operations in that case was to undo the block selection with the block cancel key, reselect the block, and then select the desired row in the block. Note that both the keys were used in $V_3$ and $VB_3$ while only the row cancel key was used in $V_2$ and $VB_2$.

Fig. 1. VSK interfaces and special keys used in the experiments. (a) $V_3$. (b) $V_2$. (c) $V_1$. (d) $VB_3$. (e) $VB_2$. (f) $VB_1$. (g) BC. (h) RC.

The VSKs were implemented in Java (JDK version 1.4.0). Each of the VSKs could automatically log any text entry task done with it along with the time of occurrence of the task. The system clock was used to log time. In addition to the VSKs, we also developed two types of hand-operated access switches for our study.

TABLE I
PROFILE OF THE PARTICIPANTS

C. Participants

Six volunteers participated in our experiments. Profiles of the participants are shown in Table I. Participants CP1, CP2, and CP4 could not produce clear and comprehensible speech while CP3 could produce understandable speech with much difficulty. Also, CP1 and CP2 had involuntary muscle movements and CP3 and CP4 had stiff limb muscles. FD and MS had symptoms of slurred speech and weak muscle movements. Apart from FD, all the participants were regular computer users. Among them, CP1, CP2, and CP3 had prior experience of working with scanning input communication aids. In particular, CP2 was a regular user of the “E Z Keys” system³ while CP1 and CP3 were users of the switch operated version of the “Clicker” system.⁴

D. Method

In the first group of experiments, we collected data for the interfaces $V_3$, $V_2$, and $V_1$ from each of the participants. Data were collected from the participants for $VB_3$, $VB_2$, and $VB_1$ in the second group of experiments. Five desktop PCs with 17-in displays were used for data collection. Each of these PCs had Windows 2000 running on a Pentium IV processor with 2.80-GHz clock speed.

Each experiment was divided into two phases, namely 1) the “training” phase and 2) the “usage” phase. The training phase for each experiment was of about 5 h duration, consisting of ten training sessions of half an hour each. In the training phase, participants were familiarized with the interfaces. Also, the most convenient scan period for each participant was determined in the training phase by conducting several trials. In the usage phase, each participant was given variable length texts (the reason for using variable length texts is discussed in Section IV-B) in printed form for each of the VSKs. The lengths of the texts were varied in the following way: 105 characters for $V_3$, 165 characters for $V_2$, 405 characters for $V_1$, 170 characters for $VB_3$, 210 characters for $VB_2$, and 830 characters for $VB_1$.

III. ERROR TYPES AND THEIR EFFECTS ON USER PERFORMANCE

We observed that the participants were making the following three types of errors while entering the texts through the interfaces.

1) Timing error (TE): Participants failed to activate the access switch when the element (block, row, or item) was highlighted. When a timing error occurred, participants had to wait until the element was highlighted again.

2) Selection error (SE): Participants selected a wrong element (block, row, or item). In case of a selection error, participants had to undo the wrong selection by selecting one of the block cancel, row cancel, or Backspace keys and reselect the element.

3) Transcription error: Participants entered a character that closely resembled the actual character, for example “C” instead of “G” or “V” instead of “Y.”

³http://www.words-plus.com/website/products/soft/ezkeys.htm

⁴http://www.cricksoft.com/us/products/clicker/default.aspx

Very few transcription errors were observed in the experiments (only four in total). Moreover, only CP2 (once) and FD (thrice) made such errors. Therefore, we neglect the effect of this error on user’s performance. However, SE and TE occurred more frequently and significantly increased the text entry time of each participant. We analyzed the usage logs to calculate the effect of TE and SE on text entry time. The method we adopted for log analysis is described in the following.

A. Calculation of Error Free Text Entry Time

First, we calculated the time needed to enter a given text on a given VSK in the absence of any errors. Let this time be denoted by . To calculate , we assign a unique location to each key on the VSK. For three-level scanning, each location is of the form where denotes the block, denotes the row in and denotes the item in . The numbering of blocks, rows, and items is done according to the order of the scan. In this scheme, if a block numbered is scanned before another block numbered , then . The same holds true for row and item numbering. Similarly, each key on two-level and one-level VSKs is assigned a location of the form and , respectively.

In order to select a key after a key on a three-level VSK, a user first waits until is highlighted again, which takes a nonzero time. Let this be called the system overhead time denoted by . Next, the user waits till the highlighter reaches from . This takes if or if , where is the scan period and is the total number of blocks on the interface. Once is highlighted, the user activates an access switch to select it. After time, the first row in is highlighted. The user now waits for the highlighter to reach from the first row, which takes a time equal to . The user makes another switch activation to select . After time, the first item in is highlighted and the user now waits for the highlighter to reach . The waiting time is . Subsequently, the user activates the access switch to select .

This sequence of events is summarized in (1) as the expression for calculating , the time to select after on a three-level VSK. The first part of (1) represents when and the second part represents when . is the average time required to activate an access switch

(1)
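The walk-through above can be written out as a single sum. The following is only a sketch of one possible reading, not the authors' equation (1): the symbols $t_o$ (system overhead), $\tau$ (scan period), $t_a$ (switch activation time), $B$ (number of blocks), and the locations $(b_1, r_1, i_1)$ and $(b_2, r_2, i_2)$ of the previous and next key are names assumed here for illustration, and it is further assumed that the pause before the first row (or first item) is highlighted equals one scan period.

```latex
% Sketch under stated assumptions; symbol names are illustrative.
T(k_1 \rightarrow k_2) =
\begin{cases}
  t_o + (b_2 - b_1)\,\tau + (r_2 + i_2)\,\tau + 3\,t_a, & b_2 \ge b_1 \\[4pt]
  t_o + (B - b_1 + b_2)\,\tau + (r_2 + i_2)\,\tau + 3\,t_a, & b_2 < b_1
\end{cases}
```

Here $r_2\,\tau$ collects the assumed one-period pause plus the $(r_2 - 1)$ row shifts, $i_2\,\tau$ does the same at the item level, and $3\,t_a$ accounts for the three switch activations (block, row, item).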

We estimated that in our implementations. Also, we used as the value for since any value of greater than results in TE. Incorporating these values in (1) and rearranging the terms, we arrive at (2a). The corresponding expressions for two-level and one-level VSKs, which are obtained using a similar approach, are shown in (2b) and (2c), respectively. and in (2b) and (2c) denote the total number of rows and items, respectively

(2a)

(2b)

(2c)

Each of the printed texts given to the participants can be thought of as a sequence of characters. Since each character maps to a key on the keyboard, this character sequence can be mapped to a key sequence. Therefore, is the total time required to enter the key sequence and can be represented as in (3), where denotes the time to enter two consecutive keys in the key sequence and is the total number of characters in the given text. in (3) should be replaced by (2a), (2b), or (2c) for three-level, two-level, and one-level VSKs, respectively

(3)
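As an illustration of how the error-free entry time could be accumulated over a key sequence, here is a minimal Python sketch for the three-level case. It is not the authors' implementation: the function and parameter names are hypothetical, key locations are taken to be 1-based `(block, row, item)` tuples in scan order, and the pause before the first row or item is highlighted is assumed to be one scan period.

```python
from typing import Sequence, Tuple

# Hypothetical representation: (block, row, item), 1-based, numbered in scan order.
Location = Tuple[int, int, int]


def block_shifts(b1: int, b2: int, num_blocks: int) -> int:
    """Highlighter moves needed to go from block b1 to block b2, wrapping around."""
    return b2 - b1 if b2 >= b1 else num_blocks - b1 + b2


def selection_time(prev: Location, nxt: Location, num_blocks: int,
                   tau: float, t_overhead: float, t_activate: float) -> float:
    """Error-free time to select `nxt` right after `prev` on a three-level VSK.

    Assumption (not confirmed by the source): after a block or row is selected,
    one scan period passes before its first row or item is highlighted.
    """
    b1, _, _ = prev
    b2, r2, i2 = nxt
    wait_block = block_shifts(b1, b2, num_blocks) * tau
    wait_row = r2 * tau    # one assumed pause + (r2 - 1) row shifts
    wait_item = i2 * tau   # one assumed pause + (i2 - 1) item shifts
    return t_overhead + wait_block + wait_row + wait_item + 3 * t_activate


def error_free_entry_time(keys: Sequence[Location], num_blocks: int,
                          tau: float, t_overhead: float, t_activate: float) -> float:
    """Sum of selection times over consecutive key pairs of the sequence."""
    return sum(selection_time(a, b, num_blocks, tau, t_overhead, t_activate)
               for a, b in zip(keys, keys[1:]))
```

The same structure would carry over to two-level and one-level scanning by dropping the block term, or the block and row terms, respectively.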

B. Calculation of Increase in Text Entry Time Due to SE

The usage logs obtained from the experiments contained original texts as well as extra selections made by the participants due to SE. After calculation of , we recomputed (3) taking the usage logs as the input texts. Let the time thus obtained be denoted by . Using and , the percentage increase in text entry time due to SE is computed with

(4)

C. Calculation of Increase in Text Entry Time Due to TE

Finally, we calculated the total text entry time for each of the participants from their corresponding logs. Using and , the percentage increase in text entry time due to TE is calculated with (5). It may be noted that (5) takes SE into account, i.e., denotes the increase in usage time due to TE that occurred in entering the original texts as well as the extra selections due to SE

(5)
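The two percentage increases can be illustrated with a small Python helper. This is a sketch under assumptions: the function name and the sample numbers are hypothetical, and the choice of baseline for the TE case (the SE-inclusive recomputed time rather than the error-free time) is one plausible reading of the text, not a confirmed form of (4) and (5).

```python
def percent_increase(baseline: float, observed: float) -> float:
    """Percentage by which `observed` exceeds `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline time must be positive")
    return 100.0 * (observed - baseline) / baseline


# Illustrative values only (seconds); they are not taken from the study.
t_error_free = 200.0   # time from (3) on the original text
t_with_se = 230.0      # time from (3) recomputed on the usage log
t_logged = 299.0       # total entry time read from the log timestamps

se_increase = percent_increase(t_error_free, t_with_se)
# Assumed baseline for TE: the SE-inclusive time, so that TE made during
# the extra (corrective) selections is counted as well.
te_increase = percent_increase(t_with_se, t_logged)
```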

Table II shows the total number of occurrences of each of these errors. The results show that the number of occurrences of TE was much higher than that of SE. Consequently, the effect of TE on performance was significantly greater. For three-level and two-level scanning input methods, the effect of TE on performance was almost double that of SE. For one-level scanning, the contribution of TE to the increase in participants’ text entry time was almost triple that of SE.

IV. PREDICTIVE ERROR MODELING

Results of our empirical study (Table II) imply that the designers need to consider the effect of errors in the design to improve user performance. For extending the existing design approaches with error consideration, we developed mathematical models of user’s error behavior. Development of the models is described next.

TABLE II
EFFECT OF ERRORS ON PERFORMANCE

A. Focus Distance Between Two Keys

Let be any two keys on a VSK such that is selected after . Selection of involves two events in sequence, namely 1) “shift” and 2) “activation.” The shift event implies shifting of the highlighter from one on-screen element (a block, row, or item) to the next element. The activation event implies activation of the access switch to select an element. We define the focus distance $f$ between in terms of these events as “the total number of shift and activation events required to select after selection of .”

Let there be two keys and on a three-level VSK. In order to select after , the user waits until the highlighter reaches . This requires number of shift operations if or number of shift operations if where represents total number of blocks on the interface. Once is highlighted, the user makes an activation event to select . Then, shifts occur before the highlighter reaches . The user makes another activation to select . A further shifts occur after which the highlighter reaches . In order to select the key, the user makes a final activation event. This sequence of events is summarized in (6a) (after some manipulation) that gives a general expression for calculating $f$ between for three-level scanning. Expressions for $f$ for two-level and one-level scanning were obtained in a similar way and are shown in (6b) [assuming , ] and (6c) [assuming , ]. and in (6b) and (6c) represent total number of rows and items, respectively

(6a)

(6b)

(6c)
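The event counting described above can be sketched in Python. This is an illustration under assumptions rather than the authors' equations (6a)–(6c): the function names are hypothetical, positions are 1-based indices in scan order, and wrap-around of the highlighter is handled the same way at every level.

```python
def wrap_shifts(src: int, dst: int, total: int) -> int:
    """Shift events needed to move the highlighter from index src to dst."""
    return dst - src if dst >= src else total - src + dst


def focus_distance_three_level(prev, nxt, num_blocks: int) -> int:
    """Shifts plus activations to select nxt=(b2, r2, i2) after prev=(b1, r1, i1)."""
    b1, _, _ = prev
    b2, r2, i2 = nxt
    return (wrap_shifts(b1, b2, num_blocks) + 1   # reach and select the block
            + (r2 - 1) + 1                        # reach and select the row
            + (i2 - 1) + 1)                       # reach and select the item


def focus_distance_two_level(prev, nxt, num_rows: int) -> int:
    """Shifts plus activations to select nxt=(r2, i2) after prev=(r1, i1)."""
    r1, _ = prev
    r2, i2 = nxt
    return wrap_shifts(r1, r2, num_rows) + 1 + (i2 - 1) + 1


def focus_distance_one_level(prev_item: int, next_item: int, num_items: int) -> int:
    """Shifts plus the single activation on a one-level VSK."""
    return wrap_shifts(prev_item, next_item, num_items) + 1
```

Under this reading, selecting a key that needs no shift at any level costs only the activations: three events on a three-level VSK, two on a two-level VSK, and one on a one-level VSK.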

Equation (6a) shows that $f = 3$ when and . This particular value implies the minimum focus distance (say, $f_{\min}$) for three-level VSKs when a key is selected successively without any shift. Similarly, (6b) shows that $f_{\min} = 2$ for any two-level VSK and (6c) shows that $f_{\min} = 1$ for any one-level VSK. The maximum focus distance (say, $f_{\max}$), however, depends on the key organization on VSK interfaces.

It may be noted that the number of shift events considered in $f$ depends directly on the position of keys on the interface, i.e., the layout. The activation events, on the other hand, imply that the interaction method is also taken care of by $f$. Hence, $f$ is representative of VSK design. We analyzed experimental data to observe the variation of error probabilities with $f$, which is described next.

B. Observed Variation of Error Probabilities With $f$

In our experimental setup, the maximum focus distance, $f_{\max}$, for $V_3$, $V_2$, and $V_1$ was 9, 12, and 27, respectively. It may be recalled that the texts given to the participants were of variable lengths. The purpose of using variable length texts was to keep the number of occurrences (say, , which was 15 in our first group of experiments) of each focus distance within the range of $f_{\min}$ and $f_{\max}$ fixed. For example, $V_3$ had 7 focus distances (i.e., $f_{\max} - f_{\min} + 1$) between $f_{\min} = 3$ and $f_{\max} = 9$. The 105 character text given to the participants for $V_3$ was designed in such a way that each of these focus distances occurred in the text exactly 15 times. The same holds true for the other texts used in our experiments.
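The relation between text length and focus-distance range can be checked arithmetically. The short Python sketch below (the function name is hypothetical) multiplies the number of distinct focus distances by the 15 occurrences per distance, using the minimum values 3, 2, and 1 from Section IV-A and the maximum values 9, 12, and 27 quoted above; the products match the 105-, 165-, and 405-character texts listed in Section II-D.

```python
def required_text_length(f_min: int, f_max: int, occurrences: int) -> int:
    """Characters needed so every focus distance in [f_min, f_max] occurs
    exactly `occurrences` times (one focus distance per entered character)."""
    return (f_max - f_min + 1) * occurrences


OCCURRENCES = 15  # occurrences per focus distance in the first group of experiments

lengths = {
    "three-level": required_text_length(3, 9, OCCURRENCES),
    "two-level": required_text_length(2, 12, OCCURRENCES),
    "one-level": required_text_length(1, 27, OCCURRENCES),
}
```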

From the usage logs, we determined the number of times TE (say, ) and SE (say, ) occurred for each and every focus distance. Next, we calculated the ratios of and for each $f$, which gave the probabilities of TE [denoted by $P_{TE}(f)$] and SE [denoted by $P_{SE}(f)$] for the corresponding $f$. These probabilities were then plotted against $f$ to obtain the error distributions. Distributions of TE for the three VSKs are shown in Fig. 2(a)–(c). Fig. 2(d)–(f) shows the distribution of SE for the three VSKs. In Fig. 2, the X-axes denote focus distance and the Y-axes denote corresponding error probabilities.

Fig. 2. Distribution of $P_{TE}(f)$ and $P_{SE}(f)$ for the three VSKs in the first group of experiments. (a) $P_{TE}(f)$ for $V_3$. (b) $P_{TE}(f)$ for $V_2$. (c) $P_{TE}(f)$ for $V_1$. (d) $P_{SE}(f)$ for $V_3$. (e) $P_{SE}(f)$ for $V_2$. (f) $P_{SE}(f)$ for $V_1$.
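The ratio computation described above can be sketched as follows. This is a minimal Python illustration with hypothetical names: it assumes the log has already been reduced to a list holding the focus distance at which each error occurred, and that every focus distance was attempted the same fixed number of times.

```python
from collections import Counter
from typing import Dict, Iterable


def error_probabilities(error_focus_distances: Iterable[int],
                        focus_range: Iterable[int],
                        occurrences: int) -> Dict[int, float]:
    """Per-focus-distance error probability: error count / number of occurrences."""
    if occurrences <= 0:
        raise ValueError("occurrences must be positive")
    counts = Counter(error_focus_distances)
    # Focus distances with no recorded error get probability 0.0.
    return {f: counts.get(f, 0) / occurrences for f in focus_range}


# Illustrative data only: two errors at f = 3 and one at f = 5.
example = error_probabilities([3, 3, 5], range(3, 6), 15)
```

Running the same function once on the TE events and once on the SE events would give the two distributions that are plotted against focus distance.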

The TE distributions of Fig. 2 show that for all the three VSKs, was maximum at . Afterwards, it gradually decreased with the increase in $f$. The decrease continued until about half the range of $f$ for . For , decreased until about one third the range of $f$. From these points on, increased again for both and until . The regions of decrease and increase are demarcated in the corresponding figures by the dotted vertical lines. A slightly different behavior, however, was observed for , for which showed only a decreasing trend within and .

Unlike , no pattern was observed for . The dotted horizontal lines parallel to the X-axes shown in the SE distributions of Fig. 2 represent the sample mean of for all the participants. The figures show that was almost uniformly distributed around the sample mean for all the three VSKs.

C. Proposed Models

Fig. 2 shows that the TE behaviors were characterized by (a) , the TE probability at ; (b) , the critical focus distance that separates the regions of increase and decrease; (c) , the TE probability at ; and (d) , the TE probability at . In order to relate these parameters, we propose to use a combination of exponential growth and decay functions. The proposed model is shown in (7), where and are the decay and growth rates, respectively

whenwhen

(7)
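One way to combine exponential decay and growth through the four parameters named above is sketched below. The exact form of (7) is not reproduced here, so this is an assumed form and not necessarily the authors' equation: the rates are derived so that the curve passes through the three anchor points, and every name and numeric value is hypothetical.

```python
import math


def te_probability(f, f_min, f_c, f_max, p_min, p_c, p_max):
    """Assumed piecewise-exponential TE model (illustrative, not equation (7) itself).

    Decays from p_min at f_min to p_c at the critical distance f_c,
    then grows from p_c to p_max at f_max. All probabilities must be positive.
    """
    if not (f_min <= f_c <= f_max):
        raise ValueError("need f_min <= f_c <= f_max")
    if not (f_min <= f <= f_max):
        raise ValueError("f outside [f_min, f_max]")
    if f <= f_c:
        if f_c == f_min:
            return p_min  # no decay region
        decay_rate = math.log(p_min / p_c) / (f_c - f_min)
        return p_min * math.exp(-decay_rate * (f - f_min))
    # Reaching this branch implies f_c < f_max, so the division is safe.
    growth_rate = math.log(p_max / p_c) / (f_max - f_c)
    return p_c * math.exp(growth_rate * (f - f_c))


# Hypothetical parameter values, chosen only to exercise both branches.
params = dict(f_min=3, f_c=6, f_max=9, p_min=0.9, p_c=0.3, p_max=0.6)
curve = [round(te_probability(f, **params), 3) for f in range(3, 10)]
```

Setting `f_c` equal to `f_max` reproduces a purely decreasing distribution, which corresponds to the case where no increasing component is observed.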

TABLE III
MEAN AND STANDARD DEVIATIONS OF OBSERVED TE PARAMETER VALUES

The mean values of the TE parameters observed in the experiments for each type of scanning input method and associated standard deviations are shown in Table III. Both the mean and for were very high, which we approximate by 0.95. Also, Fig. 2(a) shows that for all the participants. We approximate this observation as for where . Moreover, we approximate with 0.5 for . Similarly from Table III and Fig. 2(c), we set , , , and for . The presence of only the decreasing component in the TE distribution of implies that . Since the increasing component was not present, we set for . In addition, we set and for based on Table III values. Substituting the values for , , , and in (7), we get and for , for and and for . Since no pattern was observed for the SE distribution, we propose to use , the sample mean, to model the SE behavior. We estimated from the logged data that for , for , and for . The assumed and estimated model parameter values for TE and SE are summarized in Table IV.

TABLE IV
PARAMETER VALUES FOR TE AND SE ERROR MODELS

TABLE V
MEAN MODEL PARAMETER VALUES IN THE SECOND GROUP OF EXPERIMENTS

The solid lines in the TE distributions of Fig. 2 show the model-predicted TE behavior. We measured the goodness of fit in . For the TE model, we found equal to 0.93 for , 0.91 for , and 0.95 for . The corresponding values for the SE model were 0.91 for , 0.89 for , and 0.92 for . The high values indicate that the models closely fit the experimental data.

We used the data collected for , and in the second set of experiments to test the hypothesis that the models are not specific to the particular VSKs used in the first experiment. The maximum focus distance for , , and were 19, 22, and 83, respectively. The number of occurrences of each focus distance in the texts given to the participants for these interfaces was kept fixed at 10. For , we observed that the increasing part was missing from the TE distribution. The observation implies that the assumptions and hold true for the second experiment as well. The values observed for and were and , respectively. The observed values, therefore, were in accordance with the results of the first experiment (Table IV).

The mean model parameter values observed in the second group of experiments are shown in Table V. We used a “dependent t-test” to observe the effect of layout on model parameters. In the statistical test, the values from the first experiments were used as pretest data and the values from the second experiments were used as posttest data; the level of significance was 0.05 and the degrees of freedom were 5 (i.e., sample size − 1). The results show that the effect of layout was not significant for any of , , , and since in all the cases, two-tail . The notation denotes the t-statistic for the VSK with -level scanning. Moreover, for the second group of experiments for , , and were found to be 0.89, 0.9, 0.94 for TE and 0.93, 0.9, 0.88 for SE models, respectively.

The closeness of the parameter values in the two groups of experiments, the t-test results and the high values indicate that the proposed error models are not specific to any of the particular experimental conditions described in this paper, supporting the validity of the proposed models.
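The dependent (paired) t-test used above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' analysis: the function name and the six pretest/posttest values are hypothetical, and the pairing of one value per participant is assumed from the reported 5 degrees of freedom. The critical value 2.571 is the standard two-tailed t value for 5 degrees of freedom at the 0.05 level.

```python
import math
import statistics


def paired_t_statistic(pretest, posttest):
    """t-statistic of the paired differences: mean(d) / (sd(d) / sqrt(n))."""
    if len(pretest) != len(posttest) or len(pretest) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [post - pre for pre, post in zip(pretest, posttest)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


T_CRITICAL_DF5 = 2.571  # two-tailed, alpha = 0.05, 5 degrees of freedom

# Hypothetical parameter values from the first and second group of experiments.
first_group = [0.90, 0.95, 0.93, 0.96, 0.94, 0.92]
second_group = [0.91, 0.93, 0.94, 0.95, 0.95, 0.91]
t_value = paired_t_statistic(first_group, second_group)
layout_effect_significant = abs(t_value) > T_CRITICAL_DF5
```

With these illustrative values the magnitude of the statistic stays below the critical value, so the difference between the two groups would not be judged significant.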

V. ACCOUNTING FOR ERRORS IN THE DESIGN

We have developed a numerical measure of user errors for VSKs from the error models. The measure is called “ErrorProneness” or EP and forms the basis of a method we propose to include the effect of errors in the design. The proposed error measure and the design method are described in the following.

A. EP Measure

For each key pair or digraph on a VSK, let denote its digraph probability or the probability of occurrence in any given text. can be calculated from a corpus of text using (8a), where is the number of occurrences of the key pair in the corpus and is the total number of keys on the interface. Using the digraph probability, we calculate the mean focus distance of the VSK with (8b), where denotes the focus distance between and

(8a)

(8b)
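The two quantities defined above can be sketched as follows. The normalization used here (digraph count divided by the total count over all key pairs) and the probability-weighted sum for the mean are assumptions about the form of (8a) and (8b), and all names, the toy corpus, and the toy distance function are hypothetical.

```python
from collections import Counter


def digraph_probabilities(corpus, keys):
    """Relative frequency of each ordered key pair among all digraphs in the corpus."""
    key_set = set(keys)
    counts = Counter(
        (a, b) for a, b in zip(corpus, corpus[1:]) if a in key_set and b in key_set
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus contains no digraph made of the given keys")
    return {pair: n / total for pair, n in counts.items()}


def mean_focus_distance(probabilities, focus_distance):
    """Probability-weighted mean of the focus distance over all digraphs."""
    return sum(p * focus_distance(a, b) for (a, b), p in probabilities.items())


# Toy example: a two-key interface and a made-up distance function.
probs = digraph_probabilities("abab", keys="ab")
toy_distance = lambda a, b: 2 if (a, b) == ("a", "b") else 5
f_mean = mean_focus_distance(probs, toy_distance)
```

In a real evaluation the distance function would come from the layout and scanning method of the VSK under study, and the corpus would be representative of the users' language.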

We define EP of the VSK as the probability of error in selecting two keys in succession having a focus distance . Since both TE and SE contribute significantly to errors, EP is defined as the joint probability of both TE and SE. From our empirical study results, we assume that TE and SE probabilities are mutually independent. Hence, EP can be expressed as in

(9)

Moreover, we modeled in terms of . Equation (9) therefore can be simplified, as shown in

(10)

The individual probabilities on the right-hand side of (10) can be obtained or calculated using Table IV and (7).
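A small sketch of how the two probabilities might be combined is given below. Equations (9) and (10) are not reproduced here, so the code shows two readings: the literal "joint probability" of independent events (a product), and, for comparison only, the probability that at least one of the two errors occurs. Which of these matches (10) is an assumption the reader should check against the equations; the function names and input values are hypothetical.

```python
def _check_probability(p: float) -> None:
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"{p} is not a probability")


def ep_joint(p_te: float, p_se_mean: float) -> float:
    """Literal joint probability of independent TE and SE: both occur."""
    _check_probability(p_te)
    _check_probability(p_se_mean)
    return p_te * p_se_mean


def ep_either(p_te: float, p_se_mean: float) -> float:
    """Alternative reading: at least one of TE or SE occurs (independent events)."""
    _check_probability(p_te)
    _check_probability(p_se_mean)
    return p_te + p_se_mean - p_te * p_se_mean


# Hypothetical inputs: TE probability at the mean focus distance, SE sample mean.
print(round(ep_joint(0.5, 0.2), 3))   # 0.1
print(round(ep_either(0.5, 0.2), 3))  # 0.6
```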


TABLE VI
POSSIBLE RELATIONSHIPS BETWEEN v AND v

B. Proposed Design Method

In model-based approaches, a set of design alternatives is compared based on the performance computed with the models. Existing models can compute performance in terms of text entry rate. The models, however, do not take into account the effect of errors on the entry rate. The computed performance, therefore, does not reflect the true performance of VSK users.

We propose to overcome this limitation with the use of the EP measure. In our proposed approach, each VSK is characterized by a two-tuple where is the error-free text entry rate and is the EP measure. The error-free text entry rate is essentially the reciprocal of the error-free text entry time as discussed in Section III-A. Any of the existing models can be used to compute (e.g., Damper’s model [8]). The EP measure can be computed with (10).

Let there be a set of alternative designs . We first calculate and for each of these designs. Based on the computed and values, we compare the designs in the set to determine the best among them. In order to compare, we adopt the following procedure: let be any two VSKs such that and . Then, and can be related to each other in any of the nine ways listed in Table VI.

Among the relationships of Table VI, performs better than for the condition r4 or r7 or r8. The two interfaces are equal for r5. For any of r2, r3, or r6, we say that performs better than . To compare and on the basis of r1 or r9, however, it is necessary to determine user preferences between text entry rate and errors.

We conducted informal interviews with the participants during our experiments and found that high text entry rate was preferred by them if errors were not very high. However, when the chances of making errors were high, participants preferred interfaces that reduce probabilities of errors. Our study results show that errors increase user’s text entry time by as much as 65% for three-level scanning. For two-level and one-level scanning, the increase was less than 50%. Assuming a 50% effect to be the cutoff between high and low errors, we propose that for one-level and two-level VSKs, is better than for r9 and vice versa for r1. For three-level VSKs, error gets precedence over text entry rate during comparison since probabilities of errors are very high. Consequently, for three-level VSKs, is better than for r1 and vice versa for r9.
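The comparison procedure described in this subsection can be sketched as a pairwise rule. The sketch does not use the r1 to r9 labels of Table VI; it only encodes what the text states: a design that is no worse on both measures wins, identical tuples are equal, and when one design is faster but more error-prone, text entry rate decides for one-level and two-level VSKs while EP decides for three-level VSKs. The class, function, and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Design:
    name: str
    rate: float  # error-free text entry rate (higher is better)
    ep: float    # ErrorProneness (lower is better)


def better_design(a: Design, b: Design, levels: int) -> Optional[Design]:
    """Return the preferred design, or None when the two are equal."""
    if levels not in (1, 2, 3):
        raise ValueError("levels must be 1, 2, or 3")
    if a.rate == b.rate and a.ep == b.ep:
        return None
    if a.rate >= b.rate and a.ep <= b.ep:
        return a  # a is no worse on both measures
    if b.rate >= a.rate and b.ep <= a.ep:
        return b  # b is no worse on both measures
    # Conflict: one design is faster but more error-prone.
    if levels == 3:
        return a if a.ep < b.ep else b      # errors take precedence
    return a if a.rate > b.rate else b      # text entry rate takes precedence


fast = Design("fast", rate=2.0, ep=0.30)
safe = Design("safe", rate=1.5, ep=0.20)
print(better_design(fast, safe, levels=2).name)  # fast
print(better_design(fast, safe, levels=3).name)  # safe
```

Applying the rule to every pair in a small set of alternatives identifies the best design for a given scanning level.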

VI. DISCUSSION

The proposed error models and the design method are aimed at reducing the role of the users in the design of VSKs, thus overcoming the difficulties associated with prototype testing. In order to achieve the objective, we propose to use two separate performance measures: the text entry rate assuming no error and the EP measure. We further propose to use the existing text entry rate models to compute the first measure. Therefore, our work extends the existing model-based approaches by providing a way to account for the effect of errors. However, the empirical study, the error models, and the EP measure are the first such work on scanning keyboards.

In the study, we found that the effect of errors depends on the scanning input method as well as the error type. It ranges from very high for three-level scanning to moderate for two-level and one-level scanning for TE. For SE, the effect of errors on performance is low for all the three scanning types. Moreover, it was found that moderate to low errors are acceptable to the users from the perspective of communication. These findings imply that designers of one-level or two-level VSKs should choose a design that improves user’s text entry rate. For three-level VSKs, however, the deciding factor should be the design’s ability to reduce errors. Consequently, the EP becomes important for three-level VSKs. The higher the EP of a system, the lower its ability to reduce errors.

We used data collected from six participants with disabilities for six scanning keyboards in order to develop the models and design method. Considering the wide variation in the types and degrees of disabilities and the large design space involved, the results of our study are indicative of the general trend. Our effort in collecting more data proved difficult owing to several factors. One of these factors was the social conditions that prevented many potential users from revealing their disabilities and participating in the experiments. Lack of exposure to computers among the individuals with disability compounded this problem. Even with those who participated, data collection was very slow since the participants could not work continuously for a long stretch of time due to their physical disabilities. Moreover, a couple of participants opted out without completing the experiments as they were not physically able to continue. Due to these difficulties, we were able to collect the data after putting in about a year of effort. The present work, therefore, intends to propose formal models and a design method which can be further refined through more validation experiments.

The proposed design method is a semiautomatic framework where a set of designs arrived at by the designer’s expertise are compared on the basis of the two measures of error-free text entry rate and EP. However, a single measure of text entry rate taking errors into account can simplify the comparison process by relying on a single measure rather than the two measures. Moreover, it is necessary to reduce the number of design alternatives in order to employ the proposed design method, so that they can be compared manually. This, in turn, requires the designer’s expertise. With automatic methods that rely on algorithms for searching through the entire design space, dependence on the designer’s expertise can be reduced. Development of a single text entry rate measure taking errors into account and algorithms for design space search would therefore be useful extensions of the present work.

The interfaces considered in this work had two characteristics, namely 1) each key held a single character and 2) no predictive elements were present. Presence of ambiguity (i.e., each key having multiple characters) or predictive elements may affect user’s error behavior. Ambiguous keyboards usually have fewer keys due to the presence of more than one character per key [3]. Consequently, one- or two-level scanning are more suitable for such keyboards. However, selection of a character from ambiguous keyboards may require multiple switch activations. Word or character predictions or both, on the other hand, are common features of most commercially available VSKs. User-computer interaction for predictive VSKs is different since after every key selection, the user searches the prediction list and decides whether to select from the list or to proceed with the next key selection. Such interaction may require more cognitive effort [11]. Hence, further work may be carried out to determine the applicability of the proposed models to ambiguous and predictive VSKs.

VII. CONCLUSION

We have presented empirical studies to identify types and effects of user errors on VSKs. In our study, we have found that two types of errors affect performance of scanning keyboard users significantly, namely 1) timing error that occurs when a user fails to select a key at the appropriate time and 2) selection error that occurs when the user selects a wrong key. These errors have been found to increase users’ text entry time by as much as 65% and 35%, respectively. From analysis of the empirical data, we have derived models of user’s error behavior. The models form the basis of the EP measure, which has been used to propose a design method for VSKs taking the effects of errors into account.

It may be noted that the proposed method is based on computational models of text entry rate and errors. An advantage of using computational models is that the method can be automated with little extra effort. Such an automated method can enable designers or clinical practitioners to rely less on user trials to develop or configure better systems, by comparing alternate designs or configurations automatically. Our experience shows that performing user trials with severely disabled users is a difficult task. Hence, the proposed method is expected to benefit designers and clinical professionals.

Apart from collecting more data from users to refine and validate the models and method, several extensions can be made to the present work in order to make it more effective and useful. These include 1) combining the two separate performance measures into a single measure to simplify the comparison of alternate designs, 2) development of design space search algorithms to reduce the dependence on the designer’s expertise, and 3) investigating the applicability of the proposed models to predictive and ambiguous keyboards.

ACKNOWLEDGMENT

The authors would like to thank the students and teachers at the Indian Institute of Cerebral Palsy, Kolkata, India for helping in collecting usage data.

REFERENCES

[1] R. D. Beukelman and P. Mirenda, Augmentative and Alternative Communication, 2nd ed. Baltimore, MD: Brookes, 1998.

[2] J. L. Arnott, “Text entry in augmentative and alternative communication,” in Proc. Efficient Text Entry, 2005 [Online]. Available: http://drops.dagstuhl.de/opus/volltexte/2006/519/pdf/05382.ArnottJohn.Paper.519.pdf

[3] G. W. Lesher, B. J. Moulton, and D. J. Higginbotham, “Optimal character arrangements for ambiguous keyboards,” IEEE Trans. Rehabil. Eng., vol. 6, no. 4, pp. 415–423, Dec. 1998.

[4] G. W. Lesher, B. J. Moulton, and D. J. Higginbotham, “Techniques for augmenting scanning communication,” Augmentative Alternative Commun., vol. 14, pp. 81–101, 1998.

[5] H. S. Venkatagiri, “Efficient keyboard layouts for sequential access in augmentative and alternative communication,” Augmentative Alternative Commun., vol. 15, pp. 126–134, 1999.

[6] I. S. MacKenzie, S. X. Zhang, and R. W. Soukoreff, “Text entry using soft keyboards,” Behaviour Inf. Technol., vol. 18, pp. 235–244, 1999.

[7] C. E. Steriadis and P. Constantinou, “Designing human-computer interfaces for quadriplegic people,” ACM Trans. Computer-Human Interaction, vol. 10, pp. 87–118, 2003.

[8] R. I. Damper, “Text composition by the physically disabled: A rate prediction model for scanning input,” Appl. Ergonomics, vol. 15, pp. 289–296, 1984.

[9] M. J. Rosen and C. Goodenough-Trepagnier, “Factors affecting communication rate in non-vocal communication systems,” in Proc. 4th Annu. Conf. Rehabil. Eng., 1981, pp. 194–196.

[10] H. M. Horstmann and S. P. Levine, “Modeling of user performance with computer access and augmentative communication systems for handicapped people,” Augmentative Alternative Commun., vol. 6, pp. 231–241, 1990.

[11] H. H. Koester and S. P. Levine, “Modeling the speed of text entry with a word prediction interface,” IEEE Trans. Rehabil. Eng., vol. 2, no. 3, pp. 177–187, Sep. 1994.

[12] H. H. Koester and S. P. Levine, “Keystroke level models for user performance with word prediction,” Augmentative Alternative Commun., vol. 13, pp. 239–257, 1997.

[13] H. H. Koester and S. P. Levine, “Model simulation of user performance with word prediction,” Augmentative Alternative Commun., vol. 14, pp. 25–36, 1998.

[14] B. E. John and D. E. Kieras, “Using GOMS for user interface design and evaluation: Which technique?,” ACM Trans. Computer-Human Interaction, vol. 3, no. 4, pp. 287–319, 1996.

[15] W. Soukoreff and I. S. MacKenzie, “Theoretical upper and lower bounds on typing speed using a stylus and soft keyboard,” Behaviour Inf. Technol., vol. 14, pp. 370–379, 1995.

[16] I. S. MacKenzie and R. W. Soukoreff, “Text entry for mobile computing: Models and methods, theory and practice,” Human-Computer Interaction, vol. 17, pp. 147–198, 2002.

[17] S. Zhai, M. Hunter, and B. A. Smith, “Performance optimization of virtual keyboards,” Human-Computer Interaction, vol. 17, pp. 89–129, 2002.

[18] S. Trewin and H. Pain, “Keyboard and mouse errors due to motor disabilities,” Int. J. Hum.-Comput. Studies, vol. 50, no. 2, pp. 109–144, 1999.

[19] A. Mukherjee, S. Bhattacharya, P. Halder, and A. Basu, “A virtual predictive keyboard as a learning aid for people with neuro-motor disorders,” in Proc. 5th IEEE Int. Conf. Adv. Learn. Technol. (ICALT), 2005, pp. 1032–1036.

Samit Bhattacharya received the B.Tech. degree in computer science and technology from Kalyani Government Engineering College, University of Kalyani, West Bengal, India, in 2001 and the M.S. degree in computer science and engineering, in 2005, from the Indian Institute of Technology, Kharagpur, India, where he is currently working toward the Ph.D. degree in computer science and engineering.

His research interests include human–computer interaction, assistive technology and rehabilitation engineering, user and cognitive modeling.


Anupam Basu received the B.E. degree in electronics and telecommunication engineering and the M.E. degree in computer engineering from Jadavpur University, Calcutta, India, in 1980 and 1982, respectively, and the Ph.D. degree in computer science and engineering from Indian Institute of Technology Kharagpur, India, in 1988.

He is currently a Professor at the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur. His research interests include development of cost effective assistive systems for the physically challenged.

Dr. Basu received several awards including the Da Vinci Award 2004, National Award for Technological Innovation for the Physically Challenged 2007, and the Outstanding Young Person Award 1996. He is a Fellow of Indian National Academy of Engineering and a past Humboldt Fellow.

Debasis Samanta (M’08) received the B.Tech. degree in computer science and engineering from the Calcutta University, Calcutta, India, in 1993, the M.Tech. degree in computer science and engineering from the Jadavpur University, Calcutta, India, in 1995, and the Ph.D. degree in computer science and engineering from the Indian Institute of Technology, Kharagpur, India, in 2002.

He is currently an Assistant Professor in the School of Information Technology at the Indian Institute of Technology, Kharagpur. He has previously worked at North Eastern Regional Institute of Science and Technology (NERIST), Itanagar (1995–2004), India. His research interests include human–computer interaction and information system design.