
IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, VOL. 16, NO. 5, OCTOBER 2008

Performance Models for Automatic Evaluation of Virtual Scanning Keyboards

Samit Bhattacharya, Debasis Samanta, and Anupam Basu

Abstract—Virtual scanning keyboards are commonly used augmentative communication aids by persons with severe speech and motion impairments. Designers of virtual scanning keyboards face problems in evaluating alternate designs and hence in choosing the better design among alternatives. Automatic evaluation of designs will be helpful to designers in making the appropriate design choice. In this paper, we present performance models for virtual scanning keyboards that can be used for automatic evaluation. The proposed models address the limitations present in the reported work on similar models. We compared the model predictions with results from user trials and established the validity of the proposed models.

Index Terms—Augmentative communication, Fitts' digraph model, scanning and access switches, virtual keyboards.

I. INTRODUCTION

PERSONS with speech and motor disorders face problems in expressing themselves in an easy and intelligible way. Examples of such disabilities include muscular dystrophy, Parkinson's disease, cerebral palsy, and spinal cord injuries. Their problems arise due to nonfunctioning or partial functioning of the body parts responsible for producing speech and motor actions. Thus, physical disabilities deprive them of two common forms of communication, speech and writing, significantly affecting their ability to interact with the outside world. Computer-based augmentative and alternative communication (AAC) systems are developed to assist these people [1].

For a user of computer-based AAC systems, traditional input devices like the mouse or keyboard may not be suitable. Such users may lack sufficient mobility to reach for these devices or sufficient motor control to switch accurately and efficiently between them. Alternate input devices and methods are used to make computers accessible to motion-impaired users. Commonly used alternate input devices are access switches.1 Access switches are specially designed devices that require less motor control to operate.

Manuscript revised September 30, 2007; accepted May 09, 2008. First published August 12, 2008; current version published November 05, 2008. This work was supported by the All India Council for Technical Education (AICTE) National Doctoral Fellowship (NDF) under Grant 1-10/FD/NDF-PG/(IIT-KH(17))/2005-06.

The authors are with the Indian Institute of Technology Kharagpur, Kharagpur 721 302, India (e-mail: [email protected]).

Color versions of one or more figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TNSRE.2008.2003377

1See, for example, http://www.abilityhub.com/switch/switch.htm for a list of commercially available access switches.

Any active body part of the user, including hand, foot, mouth, or head, can be used to operate such switches. Accordingly, there are different types of switches with wide variation in shapes and sizes. The switch-based input method is supported by scanning. Scanning is the successive and periodic highlighting of on-screen elements [2]. When the highlighter (also called "focus") reaches the desired element, the user activates an access switch to select that element. Switch activation depends on the type of access switch. For example, in case of a hand-operated press switch, switch activation implies pressing the switch, while for eye-operated switches, eye blink(s) are used to activate the switch.

For a user of an AAC system, performance depends on both the interface layout and the input method. In other words, each alternate design in the design space of AAC systems is characterized by a layout and an input method. Wide variation is possible in both of these, making the design space of AAC systems very large. A common practice employed by system designers is to decide on a small set of potentially good design alternatives. Many factors determine the alternate design set, including the experience and intuition of the designers and interaction with the user community. Prototypes of the alternate designs are then implemented and evaluated with a set of carefully chosen users representing the target user group. Based on the user evaluation results, a decision on the final design is made. However, designers of AAC systems face difficulties in applying the aforementioned procedure for the following reasons.

• It is very difficult to get a sufficient number of disabled users for evaluating alternate designs.

• Collection of sufficiently large usage data for analysis is also problematic. Physical disabilities prevent the users from working continuously for a long stretch of time. Sometimes, it takes several months to collect usage data for evaluation.

Automatic evaluation of AAC systems will be very helpful to designers in alleviating these problems. To perform automatic evaluation, quantitative performance prediction models of the systems are required. The purpose of such models is to give numerical measures of user performance for a given system, which form the basis of comparison among alternate designs. In this paper, we present the performance prediction models we have developed to automatically evaluate virtual scanning keyboards, a commonly used class of AAC systems in which a virtual keyboard is operated with a scanning input method [3]–[6].

This paper is organized as follows. Section II introduces virtual scanning keyboards. Related work on performance modeling of virtual scanning keyboards is discussed in Section III.




Fig. 1. Row–item scanning method of input selection from a virtual keyboard. (a) First step: row scan. (b) Second step: item scan.

We address the limitations in the existing work and propose models suitable for predicting the performance of virtual scanning keyboard users. The proposed models are described in Section IV. In Section V, we demonstrate the validity of the proposed models with results from user trials. Section VI concludes the paper.

II. VIRTUAL SCANNING KEYBOARDS

A virtual (also called soft or on-screen) keyboard refers to a software system having an on-screen representation of a physical keyboard [7]. The keys are laid out spatially on the keyboard interface. Users make single letter selections from the interface to compose text. To allow motion-impaired users to work with virtual keyboards, scanning and access switch based input methods are used.2 Fig. 1 illustrates a virtual scanning keyboard in which row–item scanning is used to operate a QWERTY-like virtual keyboard. Selection of the letter "F" from this interface is carried out in two steps. In the first step, shown in Fig. 1(a), rows of the keyboard are scanned. When the row containing "F" is highlighted, the row is selected by activating an access switch. In the second step, shown in Fig. 1(b), the items in the row are scanned. When "F" is highlighted, it is selected.

The row–item scanning of Fig. 1 is one of several scanning methods used to operate virtual keyboards [2]. These different methods are discussed in the following.

A. Methods of Scanning

The scanning methods that are implemented on virtual keyboards are often called matrix scanning. In such scanning, the screen is assumed to represent a matrix, and the items present on the screen are individual cells of that matrix. The most general form of matrix scanning is a three-level scan, often called the block–row–item scan. In a block–row–item scan, the on-screen items are grouped into blocks. Each block comprises a set of rows of items. The system initially starts a block-level scan. During this process, the block that contains the desired item is selected by the user. Once a block is selected, the system begins a row-level scan inside the block. During the row-level scanning, the row in which the desired item lies is selected. Then the items of the selected row are scanned. When the scanning reaches the desired item, the item is selected.

2An example of a commercially available virtual scanning keyboard is the WiViK system. The keyboard has several features, including configurable switch-based scanning. Details about the system can be found at http://www.wivik.com/.

A variation of the block–row–item scan is the diagonal selection mode. In this technique, a block is split into two triangular matrices based on the main diagonal. In the first stage of the scanning, the two parts of the matrix are periodically highlighted and the user selects the triangle where the target item is located. Then row scanning is applied to its rows, and so on. Two other variants of block–row–item scanning are row–item scanning and item scanning.

Each matrix scanning method is implemented in either of two ways, namely 1) auto scanning and 2) guided scanning. In auto scanning, a highlighter moves automatically through on-screen elements, pausing at each element for a predefined time delay called the scan period. Hence, a single access switch is needed for auto scanning, to select a highlighted element. In guided scanning, both the tasks of moving the highlighter (called navigation) and selecting a highlighted element are performed through activation of access switches, requiring more than one access switch to operate such interfaces.
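To make the step structure concrete, the following minimal sketch (ours, not from the paper; the layout and function names are illustrative) counts the highlighter steps a row–item scan needs to reach a target key, assuming the scan restarts from the top row after every selection:

```python
# Minimal sketch of scan-step counting under row-item scanning.
# Assumes the scan restarts from the top row after each selection.

def row_item_scan_steps(layout, target, start_row=0):
    """layout: list of rows (each a list of characters).
    Returns (row_steps, item_steps) needed to reach `target`."""
    for r, row in enumerate(layout):
        if target in row:
            row_steps = (r - start_row) % len(layout)  # row-level steps
            item_steps = row.index(target)             # item-level steps
            return row_steps, item_steps
    raise ValueError(f"{target!r} is not on the layout")

qwerty_like = [list("QWERTYUIOP"), list("ASDFGHJKL"), list("ZXCVBNM")]
print(row_item_scan_steps(qwerty_like, "F"))  # -> (1, 3)
```

In guided scanning the same step counts apply, but each step is driven by a navigation switch activation rather than by the scan period.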

B. Problem in Choosing Layout-Scanning Combination

Due to the wide variation in scanning input methods, a problem faced by designers is to choose a suitable scanning method for a virtual keyboard interface. There is a close relationship between scanning methods and interface organization, and an inappropriate choice may result in a drop in user performance. Most commercial systems are, in fact, designed in a way such that the interface elements, layouts, scan period, other timing parameters, and details of switch operations can be configured according to the convenience of the users. The configuration is usually performed by clinical professionals, relatives of the users, or the users themselves based on their experience or by "trial and error."

The task of designing or setting an appropriate combination of layout and scanning input method can be performed in a more systematic way with performance prediction models of virtual scanning keyboards. With a performance model, user performance for a given keyboard can be computed taking into consideration: 1) the system (i.e., the layout and the scanning input method); 2) the text input task; and 3) characteristics of the user. Usually for text entry systems like virtual scanning keyboards, user performance is measured as a text entry rate (i.e., words per minute, characters per second, etc.). The computed text entry rate can be used to compare alternate designs. In the next section, we present the related work reported in the literature on modeling the performance of virtual scanning keyboard users and its limitations.



III. RELATED WORK

We found three models for predicting the performance of virtual keyboard users, namely: 1) the keystroke level model/goals, operators, methods, and selection (KLM/GOMS) models; 2) the model by Rosen and Goodenough-Trepagnier [8], referred to as the "RG model" in subsequent discussions; and 3) the model by Soukoreff and MacKenzie [9], often called the "FD model" [7], [10], [11]. The models are discussed next.

A. KLM/GOMS Models for Virtual Keyboards

KLM [12] and its successor GOMS [13] are two predictive user modeling techniques widely used in HCI [14]. KLM/GOMS models allow designers to perform quantitative analysis of system behavior from a description of the system, making it possible to identify design problems or compare alternate designs. Both of these techniques have been applied to model the performance of AAC users. A GOMS model for AAC users was reported in [15]. The work described in [16]–[18] used KLM analysis for the same purpose. A KLM/GOMS analysis consists of the following stages.

• A prototype of the system is conceptualized/designed.
• A typical task is described using the corresponding modeling language. The description consists of the steps to be taken to carry out the given task. For KLM analysis, the description is a linear sequence of basic steps known as the primitives. A hierarchy of goals is used to describe the task in GOMS. At the lowest level of the goal hierarchy are the primitives.

• A quantitative analysis of the task description is performed using values of the primitives.

For a virtual scanning keyboard, a typical task consists of inputting a string of characters $S$. Performance of a user may vary depending on the nature of $S$ (i.e., the number of characters present in $S$, the frequency distribution of characters in $S$, etc.). Since it is not possible to evaluate user performance for all possible $S$, there is a possibility of wrong performance prediction by the KLM- or GOMS-based approaches. Also, in the KLM/GOMS analysis technique, it is necessary to construct a task description for each design in order to compare among a set of alternate designs. Construction of a task description, however, is performed manually, which is a tedious and time-consuming process. It is in fact impossible for virtual scanning keyboards, which have a large number of design alternatives, to construct task descriptions for each alternate design manually. Performance models that are easy to automate are more useful to evaluate and compare among a large number of alternate designs.

To address the limitations of KLM/GOMS-based approaches, two qualities are desirable in the performance models: 1) the models should not take task descriptions as input and 2) the models should be easy to automate. The RG model and the FD model are found to have these characteristics. These two models are described next.

B. RG Model

The RG model was developed with three parameters, namely: 1) linguistic cost, measured as the average number of language units per word and denoted here by $u$; 2) the average number of motor acts per language unit, denoted by $a$; and 3) the average time per act, denoted by $t$. From these three parameters, $T_w$, the average time required to form a word, was calculated using

$$T_w = u \cdot a \cdot t. \qquad (1)$$

The work reported in [19] considered three different types of virtual keyboards: 1) virtual keyboards with a single character per key; 2) virtual keyboards with multiple characters per key; and 3) virtual keyboards with encoded representations of alphabetic characters. Performance models were developed for each of these systems based on (1). However, the input mechanism considered for all these systems was direct selection from a physical keyboard rather than scanning and access switch-based alternate input.

An extension to the RG model was proposed by Damper [20] to make the model applicable to virtual scanning keyboards. In the extended RG model, the average time to compose a word [$T_w$ of (1)] was calculated with (2), where $v$ is the scan rate (i.e., the number of scan steps or highlighter movements per second), $p_i$ is the probability of occurrence of the $i$th character present on the interface (calculated from a corpus of text), $s_i$ is the number of scan steps required to select the $i$th character starting from a home position, $N$ is the number of characters present on the interface, and $u$ is the average number of language units per word, as in (1). Text entry rate in words per second was then calculated as $1/T_w$, with

$$T_w = \frac{u}{v}\sum_{i=1}^{N} p_i s_i. \qquad (2)$$

In the extended RG model, the task was represented by the unigram probabilities (i.e., $p_i$) and the number of scan steps (i.e., $s_i$) encoded the layout information. The user was represented by the scan rate (i.e., $v$). By embedding the task, the model removes the necessity of providing task descriptions as input. Also, the model is simple and easy to automate. However, it was assumed in the model that after every selection, the highlighter would return to a home position. The unigram probabilities of (2) reflect this assumption. As a consequence, the model can be used to compute the performance of only those keyboards for which each on-screen location requires a fixed and predetermined number of scan steps for its selection. This assumption is in fact a specific case; in more general scanning input methods, the number of scan steps required to select an element is not fixed or predetermined. Unigram probabilities are not suitable to model such general cases, and $n$-gram statistics with $n \ge 2$ are necessary.
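As a sketch of how (2) lends itself to automation (our illustration; parameter names such as scan_rate are not from the paper), the rate computation reduces to a probability-weighted sum of scan-step counts:

```python
# Hedged sketch of the extended RG model (2): average time per word from
# unigram probabilities p_i and scan-step counts s_i, assuming the
# highlighter returns to a home position after every selection.

def rg_words_per_second(p, s, scan_rate, units_per_word):
    """p[i]: unigram probability of character i; s[i]: scan steps needed
    to select character i from the home position; scan_rate: steps/s."""
    mean_steps_per_char = sum(pi * si for pi, si in zip(p, s))
    time_per_word = units_per_word * mean_steps_per_char / scan_rate
    return 1.0 / time_per_word

# toy example: three characters with probabilities 0.5/0.3/0.2
# requiring 2/4/6 scan steps from the home position
print(rg_words_per_second([0.5, 0.3, 0.2], [2, 4, 6],
                          scan_rate=2.0, units_per_word=5))
```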

C. FD Model

The FD model was developed primarily to evaluate virtual keyboards used on small-sized mobile devices like PDAs. In the model, it was assumed that a user would select keys with finger (touch) or stylus movement. The following components were considered in developing the model.

• Visual search time: To select a key, a user first needs to "locate" the key on the interface. The time to locate a key is called the visual search time and is denoted by RT.



In the FD model, RT was calculated with the Hick–Hyman law [21], [22], shown in (3), where $a$ and $b$ are constants and $n$ is the total number of keys present on the interface:

$$RT = a + b \log_2(n + 1). \qquad (3)$$

• Movement time: After locating a key, the user needs to move the finger or stylus from the current location to select the key. The time to make a manual movement from one key to another was calculated with Fitts' law [23]–[25], shown in (4), where $MT_{ij}$ is the movement time from the source key $i$ to the target key $j$, $a$ and $b$ are constants, $D_{ij}$ is the Cartesian distance between the two keys, and $W_j$ is the width of the target key:

$$MT_{ij} = a + b \log_2\left(\frac{D_{ij}}{W_j} + 1\right). \qquad (4)$$

• Digraph probability: In the FD model, the probability of occurrence of each character pair, or digraph, present on the interface was used to represent the task. The digraph probabilities were calculated from a text corpus by (5), where $f_{ij}$ is the digraph frequency and $p_{ij}$ is the digraph probability of characters $i$ and $j$, and $n$ is the total number of individual characters present on the interface:

$$p_{ij} = \frac{f_{ij}}{\sum_{i=1}^{n}\sum_{j=1}^{n} f_{ij}}. \qquad (5)$$
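A digraph table of this kind can be estimated directly from a corpus; the sketch below (ours, restricted to characters actually present on the interface) implements (5):

```python
from collections import Counter

# Sketch of (5): digraph probabilities over interface characters,
# estimated from a text corpus.

def digraph_probabilities(corpus, interface_chars):
    counts = Counter(
        (a, b) for a, b in zip(corpus, corpus[1:])
        if a in interface_chars and b in interface_chars
    )
    total = sum(counts.values())
    return {pair: f / total for pair, f in counts.items()}

probs = digraph_probabilities("the quick brown fox",
                              set("abcdefghijklmnopqrstuvwxyz "))
print(probs[("t", "h")])  # relative frequency of the digraph "th"
```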

Using $MT_{ij}$ and $p_{ij}$, the mean movement time for a given layout was calculated with

$$\overline{MT} = \sum_{i=1}^{n}\sum_{j=1}^{n} p_{ij}\, MT_{ij}. \qquad (6)$$

Using (3) and (6), user performance in characters per second (CPS) was calculated, as shown in (7). In calculating performance, the FD model assumed two categories of users: novice and expert. Novice users were assumed to take nonzero time to visually locate keys on an interface; thus, their performance was represented by (7a). Expert users, on the other hand, were assumed to have complete familiarity with the interface and hence a zero visual search time; thus, an expert user's performance was represented by (7b):

$$CPS_{novice} = \frac{1}{RT + \overline{MT}} \qquad (7a)$$

$$CPS_{expert} = \frac{1}{\overline{MT}}. \qquad (7b)$$
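Putting (3)–(7) together, an FD model prediction can be sketched as follows (our illustration; the constants a and b are placeholders rather than calibrated values, and the Hick–Hyman and Fitts forms are as reconstructed above):

```python
import math

# Hedged sketch of the FD model: visual search (3), Fitts' law (4),
# digraph-weighted mean movement time (6), and novice/expert CPS (7).

def visual_search_time(n, a=0.0, b=0.2):
    return a + b * math.log2(n + 1)                    # (3)

def movement_time(dist, width, a=0.0, b=0.2):
    return a + b * math.log2(dist / width + 1)         # (4)

def mean_movement_time(positions, key_width, digraph_p):
    """positions: key -> (x, y) center; digraph_p: (i, j) -> probability."""
    total = 0.0
    for (i, j), p in digraph_p.items():
        (x1, y1), (x2, y2) = positions[i], positions[j]
        total += p * movement_time(math.hypot(x2 - x1, y2 - y1), key_width)
    return total                                       # (6)

def fd_cps(mt_bar, n_keys, expert):
    rt = 0.0 if expert else visual_search_time(n_keys)
    return 1.0 / (rt + mt_bar)                         # (7a)/(7b)
```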

Use of the digraph probabilities in the FD model makes the model suitable to account for general scanning methods where the number of scan steps required to select an element depends on the previous selection. In addition to the digraph probabilities, the FD model has another advantage over the extended RG model. The extended RG model considered auto scanning input methods only. Consequently, the scan rate used in the model was sufficient to represent the user. In guided scanning, however, the user has to manually shift the highlighter among on-screen elements, which involves visual search. Since visual search is part of the FD model, it is better suited to model both auto and guided scanning input methods. As a result, we used the FD model as the basis of our work and modified it to account for scanning input methods.

IV. PROPOSED MODELS

To incorporate scanning in the FD model, we have replaced the Fitts' law component with the focus movement and selection time (FT), i.e., the time between selections of two on-screen elements $A$ and $B$, with the condition that $B$ is selected after $A$. FT incorporates both the time to move the highlighter from $A$ to $B$ and the user's switch activation time. In the following, we describe the calculation of FT for auto and guided scanning. We present the calculation for block–row–item scanning, which can easily be modified for other types of matrix scanning methods. Also, we make the following assumptions in describing our approach.

• The virtual keyboard layout is fixed, with a single character per key and without any language prediction elements.
• Keys are selected without errors.
• After selection of a key, the block containing the key is highlighted again.
• When a block or a row is selected, the first row of the block or the first item of the row, respectively, is highlighted first to indicate the start of row-level scanning in the block or item-level scanning in the row.

A. FT for Auto Scanning

To calculate FT for block–row–item auto scanning, we assign an integer (starting from 1) to each block on the interface according to the order of scan. In this assignment, the block assigned the highest integer gets the focus last before the focus returns to the first block again. Similarly, each row in a block and each item in a row are assigned integers according to the order of scan. Thus, each key can be represented by a triplet of the form $(b, r, i)$, where $b$ is the block number, $r$ is the row number, and $i$ is the item number. Let there be two different keys $A$ and $B$ on the interface, represented by the triplets $(b_1, r_1, i_1)$ and $(b_2, r_2, i_2)$, respectively. Selection of $B$ after $A$ follows a set of events, which are shown in Table I(a).

FT between $A$ and $B$ is the sum of the individual times taken by each of the events in Table I(a) to complete. Among the Table I(a) entries, the time taken to execute each selection (the EXEC events) is system dependent. The user's waiting-and-selection times at the block, row, and item levels should each be less than or equal to the scan period; otherwise the focus shifts to the next element (block, row, or item, respectively). We take the mean value $\tau/2$ for each of these quantities, where $\tau$ is the scan period. The expressions for calculating the time taken by each of the remaining events are shown in the last column of Table I(a), where $M$ represents the total number of blocks present on the keyboard interface. Combining and rearranging the Table I(a) entries, FT between $A$ and $B$ can be expressed as (8). In (8), $T_{exec}$ represents the total time to execute the three EXECs, $\delta = 0$ when $b_2 \ge b_1$, and $\delta = M$ when $b_2 < b_1$:

$$FT(A, B) = \tau\,(b_2 - b_1 + \delta) + \tau\,(r_2 - 1) + \tau\,(i_2 - 1) + \frac{3\tau}{2} + T_{exec}. \qquad (8)$$
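Under the reconstruction of (8) above, FT for auto scanning can be sketched as follows (ours; the parameter names tau and t_exec are illustrative):

```python
# Sketch of (8): FT for block-row-item auto scanning. Keys are 1-based
# triplets (block, row, item); tau is the scan period in seconds and
# t_exec the total system time to execute the three selections.

def ft_auto(key_a, key_b, n_blocks, tau, t_exec):
    (b1, _, _), (b2, r2, i2) = key_a, key_b
    delta = 0 if b2 >= b1 else n_blocks     # wrap-around at block level
    scan_steps = (b2 - b1 + delta) + (r2 - 1) + (i2 - 1)
    return tau * scan_steps + 3 * (tau / 2) + t_exec

# selecting key (2, 3, 4) after a key in block 1, with a 1.5-s scan period
print(ft_auto((1, 1, 1), (2, 3, 4), n_blocks=6, tau=1.5, t_exec=0.1))
```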



TABLE I
EVENTS BETWEEN SELECTION OF A KEY PAIR $A = (b_1, r_1, i_1)$ AND $B = (b_2, r_2, i_2)$ FOR AUTO AND GUIDED SCANNING AND THEIR EXECUTION TIMES: (A) EVENTS FOR AUTO SCANNING AND (B) EVENTS FOR GUIDED SCANNING

B. FT for Guided Scanning

In guided scanning, the user activates an access switch to shift focus among on-screen elements. This is in addition to the activation of an access switch for selecting a focused element. Consequently, user-computer interaction for guided scanning is different from auto scanning. Table I(b) shows the events that take place in selecting $B = (b_2, r_2, i_2)$ after $A = (b_1, r_1, i_1)$ on guided scanning interfaces.

Each SHIFT() event is composed of a sequence of unit shifts (i.e., shifting from one element to the next) until the desired element is reached, as shown in

$$\mathrm{SHIFT}(e_1 \rightarrow e_k) = \sum_{j=1}^{k-1} \mathrm{unit\_shift}(e_j, e_{j+1}). \qquad (9)$$

Moreover, each unit shift is composed of 1) the time to activate the access switch and 2) an EXEC event. As in auto scanning, the time taken to execute each EXEC is system dependent. Assuming switch activation times for both navigation and selection to be the same, let $T_s$ denote the average time a user takes to activate an access switch. Then each of the three selection events (block, row, and item) takes $T_s$ to execute. The time taken by the rest of the events is shown in the last column of Table I(b).

Rearranging the Table I(b) entries, FT between $A$ and $B$ for guided scanning can be represented with (10), where $T_{exec}$ is the same as in (8). In (10), $\delta = 0$ when $b_2 \ge b_1$ and $\delta = M$ when $b_2 < b_1$:

$$FT(A, B) = T_s\,(b_2 - b_1 + \delta) + T_s\,(r_2 - 1) + T_s\,(i_2 - 1) + 3T_s + T_{exec}. \qquad (10)$$
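Analogously, (10) can be sketched as below (ours; $T_s$ is the switch activation time estimated in the next subsection):

```python
# Sketch of (10): FT for block-row-item guided scanning. Every unit shift
# and every selection costs one switch activation t_s; t_exec as in (8).

def ft_guided(key_a, key_b, n_blocks, t_s, t_exec):
    (b1, _, _), (b2, r2, i2) = key_a, key_b
    delta = 0 if b2 >= b1 else n_blocks
    unit_shifts = (b2 - b1 + delta) + (r2 - 1) + (i2 - 1)
    return t_s * (unit_shifts + 3) + t_exec  # +3 selection activations

print(ft_guided((1, 1, 1), (2, 3, 4), n_blocks=6, t_s=0.478, t_exec=0.1))
```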

1) Estimation of $T_s$: During the activation of an access switch by a user, two signals are sent to the system. The first signal indicates that the user intends to activate the switch, and the second signal indicates that the user intends to deactivate it. For example, in press-type switches, pressing the switch sends the first signal and releasing the switch sends the second signal. Based on this knowledge, the task of switch activation can be modeled as a sequence of the following five subtasks.

1) Perception: The user perceives focusing (highlighting) of the region of interest on screen. Let the perception time be $T_p$.
2) Cognition: The user decides to activate an appropriate (selection/navigation) access switch. Let this cognitive activity take time $T_c$.
3) Motor action: The user activates the access switch. Let the motor action take time $T_m$.
4) Cognition: The user decides to deactivate the switch.
5) Motor action: The user deactivates the access switch.

Subtask 4) is included based on the hypothesized interaction proposed by Keates et al. [26]. Assuming that the user performs all these tasks each time s/he activates an access switch, $T_s$ can be expressed as in

$$T_s = T_p + 2T_c + 2T_m. \qquad (11)$$

Keates et al. [26] reported that $T_p$ for disabled users was comparable to that of able-bodied users. Since 100 ms is commonly used as the value of $T_p$ for able-bodied users [13], we take the same value for disabled users. However, $T_c$ and $T_m$ for disabled users were found to be approximately 20% and 50% higher than for able-bodied users, respectively. Usually, the value 70 ms is used for each of $T_c$ and $T_m$ for able-bodied users [13]. Thus, the corresponding values for disabled users come out to be $T_c = 84$ ms and $T_m = 105$ ms. Based on these values, we arrive at $T_s = 100 + 2(84) + 2(105) = 478$ ms.



TABLE II
SUMMARY OF PROPOSED PERFORMANCE MODELS

C. Performance Calculation

Using FT [(8) or (10)] and the digraph probability (5), the average time to move focus between two keys can be calculated with (12). In (12), $p_{ij}$ is the digraph probability between keys $i$ and $j$, and $n$ is the total number of keys present on the interface:

$$\overline{FT} = \sum_{i=1}^{n}\sum_{j=1}^{n} p_{ij}\, FT_{ij}. \qquad (12)$$

In auto scanning, the visual search time (RT) is taken care of by the scan period itself. Thus, (13) can be used to measure user performance in CPS for auto scanning interfaces:

$$CPS = \frac{1}{\overline{FT}}. \qquad (13)$$

However, we need to consider visual search time for guided scanning interfaces. Hence, CPS for guided scanning interfaces should be calculated using either (14a) or (14b), depending on the user's familiarity with the layout:

$$CPS_{novice} = \frac{1}{RT + \overline{FT}} \qquad (14a)$$

$$CPS_{expert} = \frac{1}{\overline{FT}}. \qquad (14b)$$
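Combining (12)–(14) with the characters-per-minute conversion used below, the whole performance calculation can be sketched as follows (our illustration):

```python
# Sketch of (12)-(14): digraph-weighted mean FT, then CPS and CPM.

def mean_ft(ft, digraph_p):
    """ft: (i, j) -> FT in seconds; digraph_p: (i, j) -> probability."""
    return sum(p * ft[pair] for pair, p in digraph_p.items())   # (12)

def cps_auto(ft_bar):
    return 1.0 / ft_bar                                         # (13)

def cps_guided(ft_bar, rt=0.0):
    """rt > 0 for novice users (14a); rt = 0 for experts (14b)."""
    return 1.0 / (rt + ft_bar)

def cpm(cps):
    return 60.0 * cps   # characters per minute, as used in Section V
```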

D. Summary of the Proposed Models

The performance models for virtual scanning keyboards are summarized in Table II. The entries for row–item and item scanning in Table II are arrived at using a similar procedure. For each of the auto and guided scanning entries, the "Variables" column in Table II contains three rows. These rows correspond to block–row–item (top row), row–item (middle row), and item (bottom row) scanning. The constants $R$ and $I$ used in Table II refer to the total number of rows on a row–item scanning interface and the total number of items on an item scanning interface, respectively.

Since CPS for a scanning keyboard is very low, we used characters per minute (CPM, i.e., $60 \times$ CPS) as the measure of user performance in our work.

V. MODEL VALIDATION

We have carried out user trials to demonstrate the validity of our proposed performance models. In the trials, we compared model predictions with observed user performance. The model validation process is described in the following.

Fig. 2. Layouts and access switches developed for model validation. (a) Layout 1. (b) Layout 2. (c) Access switches.

A. Interfaces and Access Switches

To perform the trials, we developed the two virtual keyboards shown in Fig. 2(a) and (b). The keyboards were in Bengali, a language spoken primarily in Eastern India and Bangladesh. These were developed by modifying the virtual keyboard reported in [27]. Each layout had 17 rows of character keys. Among these rows, row 3 to row 7 (from the top of the interfaces) were different in the two layouts.

For each of these layouts, block–row–item and row–item auto and guided scanning were implemented. To implement block–row–item scanning, six blocks were defined for each layout: block 1 comprising rows 1 and 2, block 2 comprising rows 3 to 10, block 3 comprising rows 11 and 12, block 4 comprising rows 13 and 14, block 5 comprising rows 15 and 16, and block 6 comprising row 17. We thus used eight interfaces in our experiments.



In subsequent discussions, we refer to these interfaces as I1 (Layout 1 with block–row–item auto scanning), I2 (Layout 2 with block–row–item auto scanning), I3 (Layout 1 with row–item auto scanning), I4 (Layout 2 with row–item auto scanning), I5 (Layout 1 with block–row–item guided scanning), I6 (Layout 2 with block–row–item guided scanning), I7 (Layout 1 with row–item guided scanning), and I8 (Layout 2 with row–item guided scanning). The interfaces were implemented in Java (JDK version 1.4.0). Each of the interfaces could automatically log any text entry task done with it along with the time of occurrence of the task. The system clock was used to log time.

We also developed the following three types of access switches for data collection, which are shown in Fig. 2(c). All of these switches worked on the press and release principle.

• Lip switches: A nipple with a micro switch inside that could be operated by the lips was the basis of this type of access switch.

• Box type switches: These switches could be operated by hand. Each was made of a plastic visiting-card box, the cover of which was lifted on one side and kept in place by a steel plate on top of a micro switch. When the lifted cover was pressed, the switch was put on.

• Ball type switches: These could be worn on the palm or forearm with the help of a Velcro band. Each was made of one half of a plastic ball with a base plate of aluminium. A hole was made on the side of the ball, where a micro switch was placed, with a small aluminium plate for support. This plate had to be pressed against some hard or semi-hard surface to produce the switch action.

B. Linguistic Resources

We developed the following resources for our experiments.
• Digraph probability table for Bengali: In order to work with the Bengali interfaces, we developed a table listing the digraph probabilities of Bengali character pairs. The table was developed from a text corpus created from the "Anandabazar Patrika," a vernacular daily newspaper in Bengali. The total size of the corpus was 96,012,779 characters. The digraph table had a size of 104 × 104 and also contained digraph probabilities of nonalphanumeric characters (e.g., the digraph probability of "Enter-Space").

• Text for collecting data: To collect usage data, users were given a piece of text that they entered through the interfaces. We developed the text from the "Anandabazar Patrika" corpus. To determine the text size, we assumed that the user would compose text for 30 min/day for a week at a rate of 0.5 words per minute (WPM). Also, we found separately, by processing the same corpus, that the average word length in Bengali was six characters including "space." With these values and the formula size = WPM × typing duration × average word length, the text size came out to be 0.5 × (30 × 7) × 6 = 630 characters.
To determine a 630-character text from the corpus, we used the cross entropy based similarity measure [28]. We first obtained the unigram probability distribution of the Bengali alphanumeric and other characters from the corpus. Using the probability distribution, we calculated the entropy of the whole corpus, which came out to be 5.103118. Next, 630-character texts were generated randomly from the corpus. We found that we could generate 32,760 distinct texts in that way.

TABLE III
PROFILE OF THE PARTICIPANTS

For each of these texts, we calculated the unigram character probability distribution and, with that, its cross entropy with the corpus. The cross entropy was then compared with the entropy of the corpus. Among all the randomly generated texts, the text with the minimum absolute difference between its cross entropy and the entropy of the corpus was selected. The text we selected had a cross entropy of 5.102970.
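A sketch of this selection procedure is given below (ours; candidate generation is simplified to random contiguous slices of the corpus rather than the paper's exact generation scheme):

```python
import math
import random
from collections import Counter

# Sketch of the cross-entropy based text selection: choose the candidate
# whose unigram cross entropy with the corpus is closest to the corpus
# entropy.

def unigram_dist(text):
    total = len(text)
    return {c: n / total for c, n in Counter(text).items()}

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values())

def cross_entropy(text_dist, corpus_dist):
    return -sum(p * math.log2(corpus_dist[c]) for c, p in text_dist.items())

def select_text(corpus, n_candidates=1000, size=630):
    corpus_dist = unigram_dist(corpus)
    target = entropy(corpus_dist)
    best, best_diff = None, float("inf")
    for _ in range(n_candidates):
        start = random.randrange(len(corpus) - size)
        cand = corpus[start:start + size]
        diff = abs(cross_entropy(unigram_dist(cand), corpus_dist) - target)
        if diff < best_diff:
            best, best_diff = cand, diff
    return best
```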

C. Participants

We took the help of eight volunteers to collect data, all within the age group of 25–35 years. Profiles of the participants are summarized in Table III. Among them, CP1, CP2, and CP4 could not produce clear and comprehensible speech, while CP3 could produce understandable speech with much difficulty. Also, CP1 and CP2 had involuntary muscle movements, and CP3 and CP4 had stiff limb muscles. FD and MS had symptoms of slurred speech and weak muscle movements. The remaining two participants had no disabilities. Three of the participants (CP1, CP2, and FD) could operate the interfaces with a single access switch only, supported by auto scanning.

Apart from FD, all the participants were regular computer users. Among them, CP1, CP2, and CP3 had prior experience of working with scanning input AAC systems. In particular, CP2 was a regular user of the "E Z Keys" system3 while CP1 and CP3 were users of the switch-operated version of the "Clicker" system4.

Data were collected for both auto and guided scanning. CP1, CP2, FD, A1, and A2 generated data for the auto scanning interfaces. Usage data for the guided scanning interfaces were generated by CP3, CP4, MS, A1, and A2. The last column in Table III indicates the type of access switch used by the corresponding participant during data collection.

D. Method

The experiment was divided into two phases, namely 1) a "training" phase and 2) a "usage" phase. The training phase was used to familiarize the participants with the interfaces. Appropriate scan periods for each of CP1, CP2, FD, A1, and A2 were also determined in the training phase. To determine the scan period, participants were given the task of entering randomly generated character strings with an interface. The scan period for the inputting task was varied, and the scan period that minimized errors was selected.

3http://www.words-plus.com/website/products/soft/ezkeys.htm
4http://www.cricksoft.com/us/products/clicker/default.aspx



TABLE IV
MODEL PREDICTIONS AND USER TESTING RESULTS: (A) AUTO SCANNING RESULTS AND (B) GUIDED SCANNING RESULTS

To familiarize the participants with the interfaces, randomly generated character strings were given to each of them, and they were asked to enter the texts through the interfaces. Each participant was given between ten and fifteen strings to enter for each interface. The duration of the training phase varied among the participants, from about three hours over two days (for A1) to about 13 h spread over a week (for FD).

In the usage phase, the text chunk was given to each of the participants in printed form. They were asked to enter the text with the corresponding interfaces. The order in which the interfaces were presented was different for each participant. Five desktop PCs with 17-in displays were used for data collection. Each of these PCs ran Windows XP on a Pentium IV processor with a 2.80-GHz clock speed. To minimize any learning bias in the data, data for each of the interfaces were collected with a one-week interval in between.

Participants were asked to enter the text as accurately as possible. The following procedure was adopted in case of errors in selection: 1) exit the system; 2) in case of wrong input, remove the erroneous selections from the log files manually; 3) restart the system; and 4) resume text entry with the character after the last correctly entered character.

E. Results and Discussion

The model-predicted user performances for the four auto scanning interfaces (I1–I4) and the observed user performances for these interfaces are shown in Table IV(a). For performance calculation, the value of $T_{exec}$ was determined empirically. User testing results along with the model-predicted performance for the four guided scanning interfaces (I5–I8) are shown in Table IV(b). We used (14b) for prediction for guided scanning interfaces, since data were collected after the participants became familiar with the interfaces. In Table IV, $CPM_{pred}$ represents the model-predicted CPM and $CPM_{obs}$ is the observed CPM. "Error" in Table IV indicates the model error in percentage, which is calculated as

$$\mathrm{Error} = \frac{|CPM_{pred} - CPM_{obs}|}{CPM_{obs}} \times 100.$$

The performance measures in CPM can be used to rank alternate designs of virtual scanning keyboards for comparison.

Fig. 3(a) shows the ranking of the four auto scanning interfaces predicted by our models. The model-predicted ranking can be represented as an ordering of I1–I4, where the notation $I_x \succ I_y$ implies that a user performs better with $I_x$ than with $I_y$. Similarly, from Fig. 3(b), the four guided scanning interfaces can be ranked based on model prediction. The corresponding rankings by the performances of the participants are also shown in Fig. 3(a) and (b). Fig. 3 shows that all the participants performed according to the model predictions for both auto and guided scanning interfaces, supporting the validity of the proposed models in a relative sense.

Table IV shows that for auto scanning, there were between 5%–10% model errors for each of the interfaces for the participants with disabilities. One reason for this is our assumption that each selection wait takes $\tau/2$ time, which varied among participants as well as usage sessions. For example, the logged data showed that for CP2, entering the text with a 1.5-s scan period, the sum of these three quantities was approximately 1.7 s rather than the assumed $3\tau/2 = 2.25$ s. For guided scanning interfaces, the model error for disabled participants varied between 2%–8%. Our estimated time of 478 ms for each switch activation is one reason for these differences. The estimation was based on the assumption that each switch activation involves all the five subtasks mentioned in Section IV-B1. However, switch activation time varied among participants and usage sessions. For example, we determined from the logged data that CP4 had actually taken 0.627 s on average for switch activation. Also, in calculating user performance for guided scanning interfaces, we assumed that both the switches took the same amount of time to activate. This assumption may not hold for all the participants and usage sessions. The assumption about visual search for guided scanning, that the participants were taking no time to locate the keys, also contributed to model errors. Sears et al. [29] demonstrated that the visual search time calculation in the FD model does not consider many factors that affect a user's visual search, and it is possible that the participants were taking some nonzero time for visual search.



Fig. 3. Ranking of the scanning interfaces based on model predictions and user trials. (a) Auto scanning interfaces ranking. (b) Guided scanning interfaces ranking.

Also, for prediction we used the digraph probabilities obtained from the "Anandabazar" corpus. Despite having the highest similarity, the text chunk used in the experiments was not exactly the same as the corpus. Consequently, the digraph probability distribution of the text chunk was different from that of the corpus, leading to errors in model prediction.

The reasons for model errors for the disabled participants described above also hold for the able-bodied participants. However, model errors were found to be nearly double for the able-bodied participants compared with the disabled participants. This results from the fact that, for the able-bodied participants, the differences between the assumed or estimated values for each event and the actual values were much larger than for the disabled participants.

VI. CONCLUSION

We have presented performance models for automatic evaluation of virtual scanning keyboards. The models are expected to help designers in determining the best design from a set of alternatives. Clinical practitioners are also expected to benefit from the work by having a basis for comparing among available options without extensive clinical trials. Results from user trials show between 2%–10% errors in model predictions for the disabled users. Individual differences that exist among users (i.e., switch activation times and visual search times), as well as differences between the modeling and testing activities (i.e., the digraph probability distribution of the corpus used for model predictions as against that of the test text), are responsible for such errors. Although such errors cannot be completely eliminated, we feel that further work may be carried out in the following areas to improve model accuracy.

• Investigating the cognitive, perceptual, and motor activities involved in switch activation tasks to improve the estimation accuracy of switch activation time.

• Investigating the visual search behavior of virtual scanning keyboard users.

Both the user testing results and the model predictions suggest that block–row–item scanning should be the most preferred among the three matrix scanning methods. This suggestion, however, raises the following questions.

• Is it possible or beneficial to apply block–row–item scanning to any layout?

• If block–row–item scanning is implemented, how should interface elements be organized into blocks, rows, and items to maximize user performance?

• For auto scanning interfaces, a related issue is: how should the best scan period be determined for a given layout organization to maximize user performance?

We plan to address these issues in future work.

ACKNOWLEDGMENT

The authors would like to thank the students and teachers of the Indian Institute of Cerebral Palsy, Kolkata, India, and other volunteers for their help in collecting usage data.

REFERENCES

[1] R. D. Beukelman and P. Mirenda, Augmentative and Alternative Communication, 2nd ed. Baltimore, MD: Brookes, 1998.

[2] C. E. Steriadis and P. Constantinou, "Designing human-computer interfaces for quadriplegic people," ACM Trans. Computer-Human Interaction, vol. 10, pp. 87–118, 2003.

[3] J. L. Arnott, "Text entry in augmentative and alternative communication," in Proc. Efficient Text Entry, 2005 [Online]. Available: http://drops.dagstuhl.de/opus/volltexte/2006/519/pdf/05382.ArnottJohn.Paper.519.pdf

[4] G. W. Lesher, B. J. Moulton, and D. J. Higginbotham, "Optimal character arrangements for ambiguous keyboards," IEEE Trans. Rehabil. Eng., vol. 6, no. 4, pp. 415–423, Dec. 1998.

[5] G. W. Lesher, B. J. Moulton, and D. J. Higginbotham, "Techniques for augmenting scanning communication," Augmentative Alternative Commun., vol. 14, pp. 81–101, 1998.

[6] H. S. Venkatagiri, "Efficient keyboard layouts for sequential access in augmentative and alternative communication," Augmentative Alternative Commun., vol. 15, pp. 126–134, 1999.

[7] I. S. MacKenzie, S. X. Zhang, and R. W. Soukoreff, "Text entry using soft keyboards," Behav. Inf. Technol., vol. 18, pp. 235–244, 1999.

[8] M. J. Rosen and C. Goodenough-Trepagnier, "Factors affecting communication rate in non-vocal communication systems," in Proc. 4th Annu. Conf. Rehabil. Eng., 1981, pp. 194–196.

[9] W. Soukoreff and I. S. MacKenzie, "Theoretical upper and lower bounds on typing speed using a stylus and soft keyboard," Behav. Inf. Technol., vol. 14, pp. 370–379, 1995.

[10] I. S. MacKenzie and R. W. Soukoreff, "Text entry for mobile computing: Models and methods, theory and practice," Human-Computer Interaction, vol. 17, pp. 147–198, 2002.

[11] S. Zhai, M. Hunter, and B. A. Smith, "Performance optimization of virtual keyboards," Human-Computer Interaction, vol. 17, pp. 89–129, 2002.

[12] S. K. Card, T. P. Moran, and A. Newell, "The keystroke-level model for user performance time with interactive systems," Commun. ACM, vol. 23, no. 7, pp. 396–410, 1980.

[13] S. K. Card, T. P. Moran, and A. Newell, The Psychology of Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum, 1983.

[14] B. E. John and D. E. Kieras, "The GOMS family of user interface analysis techniques: Comparison and contrast," ACM Trans. Computer-Human Interaction, vol. 3, no. 4, pp. 320–351, 1996.

[15] H. M. Horstmann and S. P. Levine, "Modeling of user performance with computer access and augmentative communication systems for handicapped people," Augmentative Alternative Commun., vol. 6, pp. 231–241, 1990.

[16] H. H. Koester and S. P. Levine, "Modeling the speed of text entry with a word prediction interface," IEEE Trans. Rehabil. Eng., vol. 2, no. 3, pp. 177–187, Sep. 1994.

[17] H. H. Koester and S. P. Levine, "Keystroke level models for user performance with word prediction," Augmentative Alternative Commun., vol. 13, pp. 239–257, 1997.

[18] H. H. Koester and S. P. Levine, "Model simulation of user performance with word prediction," Augmentative Alternative Commun., vol. 14, pp. 25–36, 1998.

[19] S. H. Levine and C. Goodenough-Trepagnier, "Customised text entry devices for motor-impaired users," Appl. Ergonomics, vol. 21, pp. 55–62, 1990.

[20] R. I. Damper, "Text composition by the physically disabled: A rate prediction model for scanning input," Appl. Ergonomics, vol. 15, pp. 289–296, 1984.

[21] W. E. Hick, "On the rate of gain of information," Q. J. Exp. Psychol., vol. 4, pp. 11–36, 1952.

[22] R. Hyman, "Stimulus information as a determinant of reaction time," J. Exp. Psychol., vol. 45, pp. 188–196, 1953.

[23] P. M. Fitts, "The information capacity of the human motor system in controlling the amplitude of movement," J. Exp. Psychol., vol. 47, pp. 381–391, 1954.

[24] P. M. Fitts and J. R. Peterson, "Information capacity of discrete motor responses," J. Exp. Psychol., vol. 67, pp. 103–112, 1964.

[25] I. S. MacKenzie, "Fitts' law as a research and design tool in human-computer interaction," Human-Computer Interaction, vol. 7, pp. 91–139, 1992.

[26] S. Keates, P. J. Clarkson, and P. Robinson, "Investigating the applicability of user models for motion-impaired users," in Proc. ACM ASSETS, Arlington, VA, 2000, pp. 129–136.

[27] A. Mukherjee, S. Bhattacharya, P. Halder, and A. Basu, "A virtual predictive keyboard as a learning aid for people with neuro-motor disorders," in Proc. 5th IEEE Int. Conf. Adv. Learn. Technol. (ICALT), 2005, pp. 1032–1036.

[28] C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press, 1999.

[29] A. Sears, J. A. Jacko, J. C. Chung, and F. Moro, "The role of visual search in the design of effective soft keyboards," Behav. Inf. Technol., vol. 20, pp. 159–166, 2001.

Samit Bhattacharya received the B.Tech. degree in computer science and technology from Kalyani Government Engineering College, University of Kalyani, West Bengal, India, in 2001, and the M.S. degree in computer science and engineering, in 2005, from the Indian Institute of Technology Kharagpur, Kharagpur, India, where he is currently working toward the Ph.D. degree in computer science and engineering.

His research interests include natural language processing, its application to assistive technology, and formal design techniques for assistive technology applications.

Debasis Samanta received the B.Tech. degree in computer science and engineering from Calcutta University, Calcutta, India, in 1993, the M.Tech. degree in computer science and engineering from Jadavpur University, Calcutta, India, in 1995, and the Ph.D. degree in computer science and engineering from the Indian Institute of Technology, Kharagpur, India, in 2002.

He is currently an Assistant Professor in the School of Information Technology at the Indian Institute of Technology, Kharagpur. He was with the North Eastern Regional Institute of Science and Technology (NERIST), Itanagar, India (1995–2004). His research interests include human–computer interaction and information system design.

Anupam Basu received the B.E. degree in electronics and telecommunication engineering and the M.E. degree in computer engineering from Jadavpur University, Calcutta, India, in 1980 and 1982, respectively, and the Ph.D. degree in computer science and engineering from the Indian Institute of Technology (IIT) Kharagpur, Kharagpur, India, in 1988.

He is currently a Professor at the Department of Computer Science and Engineering, IIT Kharagpur. His research interests include the development of cost effective assistive systems for the physically challenged.

Dr. Basu received the Da Vinci Award 2004, the National Award for Technological Innovation for the Physically Challenged 2007, and the Outstanding Young Person Award 1996. He is a Fellow of the Indian National Academy of Engineering and a past Humboldt Fellow.

