
The $3 Recognizer: Simple 3D Gesture Recognition on Mobile Devices


Sven Kratz
Deutsche Telekom Laboratories, TU Berlin

Ernst-Reuter-Platz 7
10587 Berlin, Germany
[email protected]

Michael Rohs
Deutsche Telekom Laboratories, TU Berlin

Ernst-Reuter-Platz 7
10587 Berlin, Germany

[email protected]

ABSTRACT
We present the $3 Gesture Recognizer, a simple but robust gesture recognition system for input devices featuring 3D acceleration sensors. The algorithm is designed to be implemented quickly in prototyping environments, is intended to be device-independent, and does not require any special toolkits or frameworks, but relies solely on simple trigonometric and geometric calculations. Our method requires significantly less training data than other gesture recognizers and is thus suited to be deployed and to deliver results rapidly.

Author Keywords
Gesture recognition, recognition rates, classifier, user interfaces, rapid prototyping, 3D gestures

ACM Classification Keywords
H5.2 [Information interfaces and presentation]: User interfaces – Input devices and strategies. I5.2 [Pattern Recognition]: Design methodology – Classifier design and evaluation. I5.5 [Pattern Recognition]: Implementation – Interactive Systems

General Terms
Algorithms, Performance

INTRODUCTION AND RELATED WORK
An increasing number of mobile devices are equipped with 3D accelerometers, which calls for suitable methods for 3D gesture recognition on these platforms. Gesture input for mobile devices can be a way to overcome the limitations of miniature input facilities and small displays, since the range of movement is not restricted by the size of the device. Our work is based on previous work by Wobbrock et al. [4], who developed a simple "$1 Recognizer" using basic geometry and trigonometry. The "$1 Recognizer" is targeted at user interface

Copyright is held by the author/owner(s).
IUI'10, February 7-10, 2010, Hong Kong, China.
ACM 978-1-60558-515-4/10/02.

Figure 1. The reference gesture vocabulary containing the gesture classes used for the preliminary evaluation. (b) describes a clockwise circular motion, (c) a wrist rolling motion, (e) a gesture resembling a tennis serve, and (j) a rapid forward-backward motion.

prototyping for 2D touch-screen-based gesture recognition and therefore focuses on ease of implementation on novel hardware platforms. Our contribution lies in extending and modifying Wobbrock et al.'s algorithm to work with 3D acceleration data. In contrast to exact pixel positions captured on a touch screen, acceleration data is of much lower quality: it is prone to noise, and, additionally, drift error accumulates as the path of a gesture entry is integrated. We extend Wobbrock's original algorithm with a scoring heuristic to lower the rate of false positives. The major contribution of this work is the creation of a simple gesture recognizer designed to recognize "true" 3D gestures, i.e., gestures that are not limited to shapes that can be drawn in a 2D plane. The advantage of true 3D gesture recognition is that more natural movements, such as a tennis serve or boxing punches, can be input by the user. Like the "$1 Recognizer," our approach is quick and cheap to implement, does not require library support, and needs only minimal parameter adjustment and minimal training: about five samples per gesture provide good recognition results. It is therefore very valuable for user interface prototyping and rapid application development. It can also easily be integrated into mobile interfaces that take advantage of other modalities, such as speech or touch-based interaction with RFID/NFC.

THE $3 GESTURE RECOGNIZER
Extending Wobbrock's [4] work, we present a gesture recognizer that takes 3D acceleration data as input. A detailed description of the algorithm we developed can be found in [2]. Our demo is implemented in Objective-C as an iPhone 3GS application, and in Java running on the Android platform, but the method is by no means limited to these devices.

Gesture Trace and Gesture Class Library
In contrast to [3], our method does not require the raw acceleration data to be modified or pre-processed in any way (filtering, smoothing, etc.). To determine the current change in acceleration, we subtract the previous acceleration value reported by the device's acceleration sensor from the current one, obtaining an acceleration delta. By summation of the acceleration deltas, we obtain a gesture trace T, which can be plotted in 3D space (Figure 1 (e), (j)) or projected onto a 2D plane (gestures (a)-(d), (f)-(i)) to obtain a graphical representation of the gesture [1]. The gesture class library L contains a predefined number of training gesture traces for each gesture class G.
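The trace construction described above can be sketched as follows. This is an illustrative Python sketch, not the authors' Objective-C/Java code; the function name and sample format are our own assumptions.

```python
def build_trace(samples):
    """Accumulate acceleration deltas into a 3D gesture trace.

    samples: list of (x, y, z) raw accelerometer readings.
    Returns a list of trace points, starting at the origin.
    (Hypothetical helper for illustration only.)
    """
    trace = [(0.0, 0.0, 0.0)]
    for prev, curr in zip(samples, samples[1:]):
        # Acceleration delta between consecutive sensor readings.
        delta = tuple(c - p for c, p in zip(curr, prev))
        # Sum deltas to extend the trace.
        last = trace[-1]
        trace.append(tuple(l + d for l, d in zip(last, delta)))
    return trace
```

Plotting the resulting point list directly gives the 3D visualization, and dropping one coordinate gives the 2D projection mentioned above.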

Gesture Recognition Problem
The basic task of our algorithm is to find the best-matching gesture class in the gesture class library. To do so, we compare the gesture trace entered by the user to all known gesture traces stored in the gesture class library, generating a score table that lists the comparison score of the entered gesture against each known gesture. A heuristic is then applied to the score table to determine whether a gesture has been correctly recognized.

Resampling
For optimal classification, the gesture trace needs to be resampled to have a number of points equal to that of the template gestures. Furthermore, our resampling method ensures that the points are redistributed to be at equal distances from each other.
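A $1-style equidistant resampling step, extended to 3D points, might look like the following sketch. This is our reading of the step, not the authors' code; function names are assumptions.

```python
import math

def path_length(pts):
    """Total arc length of a 3D point path."""
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def resample(pts, n):
    """Resample a 3D path to n points spaced at equal arc length.

    Follows the $1-recognizer resampling scheme: walk the path,
    emitting an interpolated point every `interval` of arc length.
    (Illustrative sketch only.)
    """
    interval = path_length(pts) / (n - 1)
    d_accum = 0.0
    out = [pts[0]]
    pts = list(pts)
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and d_accum + d >= interval:
            # Interpolate a new point at the exact interval boundary.
            t = (interval - d_accum) / d
            q = tuple(p + t * (c - p) for p, c in zip(pts[i - 1], pts[i]))
            out.append(q)
            pts.insert(i, q)  # continue measuring from the new point
            d_accum = 0.0
        else:
            d_accum += d
        i += 1
    while len(out) < n:  # floating-point rounding can drop the last point
        out.append(pts[-1])
    return out
```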

Rotation to "Indicative Angle" and Rescaling
To correct for rotational errors during gesture entry, the resampled gesture trace is rotated once along the gesture's indicative angle. Like Wobbrock, we define the indicative angle as the angle between the gesture's first point p0 and its centroid c = (x, y, z). After rotation, the gesture trace is scaled to fit into a normalized cube, to compensate for scaling differences between gestures. After these pre-processing steps, we have converted the original user input into a gesture that is ready for matching against candidates from the gesture class library.
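The centroid computation and the rescaling step can be sketched as below. The paper does not spell out the cube normalization, so uniform scaling by the largest bounding-box extent (which preserves the trace's aspect ratio) is our assumption.

```python
def centroid(pts):
    """Centroid c of a 3D point list, used for the indicative angle."""
    n = len(pts)
    return tuple(sum(coord) / n for coord in zip(*pts))

def scale_to_unit_cube(pts):
    """Translate and uniformly scale a trace into a unit bounding cube.

    Scaling by the largest axis extent is one reasonable reading of
    the 'normalized cube' step; this detail is an assumption.
    """
    mins = [min(p[i] for p in pts) for i in range(3)]
    maxs = [max(p[i] for p in pts) for i in range(3)]
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0
    return [tuple((p[i] - mins[i]) / extent for i in range(3)) for p in pts]
```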

Search for Minimum Distance at Best Angle
We use the average MSE (Mean Square Error) as the distance measure between the gesture entered by the user and candidate gestures in the gesture class library. To compensate for rotational differences, we adapted a Golden Section Search (GSS). This type of search uses the Golden Ratio (ϕ = 0.5(−1 + √5)) to iteratively estimate the optimal rotation around the three axes of the coordinate system. The end result of the GSS is a table of matching scores, sorted by score, with the corresponding gesture class IDs.
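A one-dimensional Golden Section Search over a rotation angle can be sketched as follows; the recognizer would apply such a search per rotation axis, with the distance function evaluating the average MSE after rotating the candidate. The function names and interval bounds are our assumptions, not the authors' code.

```python
import math

PHI = 0.5 * (-1 + math.sqrt(5))  # the Golden Ratio as defined in the text

def gss_minimize(f, lo, hi, tol=1e-4):
    """Golden Section Search: find the angle minimizing f on [lo, hi].

    Each iteration shrinks the bracketing interval by the factor PHI,
    reusing one of the two interior evaluations. Assumes f is unimodal
    on the interval. (Illustrative 1D sketch.)
    """
    a, b = lo, hi
    x1 = b - PHI * (b - a)
    x2 = a + PHI * (b - a)
    f1, f2 = f(x1), f(x2)
    while abs(b - a) > tol:
        if f1 < f2:
            # Minimum lies in [a, x2]; old x1 becomes the new x2.
            b, x2, f2 = x2, x1, f1
            x1 = b - PHI * (b - a)
            f1 = f(x1)
        else:
            # Minimum lies in [x1, b]; old x2 becomes the new x1.
            a, x1, f1 = x1, x2, f2
            x2 = a + PHI * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)
```

In the recognizer, `f` would be a closure that rotates the entered trace by the candidate angle and returns its average MSE against a library template.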

Scoring Heuristic
Wobbrock's original algorithm did not feature a heuristic to reduce the occurrence of false positives, which is a common problem for simple gesture recognition algorithms operating on large gesture vocabularies [4]. The matches obtained from gestures entered as 3D acceleration data are not as precise as strokes entered on a touch screen. To compensate for the weaker matches, we developed our own scoring heuristic, which processes the score table described in the previous section. After sorting the score table by maximum score, our heuristic determines the recognized gesture with the following rules:

• ε is defined as the threshold score.
• If the highest-scoring candidate in the score table has a score > 1.1ε, return this candidate's gesture ID.
• If, within the top three candidates in the score table, two candidates of the same gesture class exist, each with a score > 0.95ε, return the gesture ID of these two candidates.
• Else, return "Gesture not recognized!".
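The decision rules above translate directly into code. This is a sketch of our reading of the heuristic; the score-table representation and names are assumptions.

```python
def recognize(score_table, epsilon):
    """Apply the scoring-heuristic rules to a score table.

    score_table: list of (score, gesture_id) pairs.
    epsilon: the threshold score.
    Returns a gesture ID, or None for "Gesture not recognized!".
    (Illustrative sketch, not the authors' implementation.)
    """
    table = sorted(score_table, reverse=True)  # highest score first
    # Rule 1: a clear winner well above the threshold.
    if table and table[0][0] > 1.1 * epsilon:
        return table[0][1]
    # Rule 2: two of the top three candidates agree on a class,
    # each scoring just below the single-winner bar.
    top3 = table[:3]
    for i, (s1, g1) in enumerate(top3):
        for s2, g2 in top3[i + 1:]:
            if g1 == g2 and s1 > 0.95 * epsilon and s2 > 0.95 * epsilon:
                return g1
    # Rule 3: reject.
    return None
```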

Using this heuristic, we achieved a considerable reduc-tion of false recognitions compared to Wobbrock’s orig-inal strategy of selecting the gesture candidate with thehighest matching score to determine the recognized ges-ture.

SUMMARY
We present a simple, easy-to-implement gesture recognizer for input devices equipped with 3D acceleration sensors. The idea behind our gesture recognition algorithm is to provide a quick and cheap way to implement recognition of true 3D gestures (such as reference gesture (e)) on such devices. Our method does not require any advanced software frameworks or toolkits. An example application area for our gesture recognizer is user interface prototyping.

REFERENCES
1. Sanna Kallio, Juha Kela, Jani Mäntyjärvi, and Johan Plomp. Visualization of hand gestures for pervasive computing environments. In Proc. AVI '06, pages 480–483. ACM, 2006.

2. Sven Kratz and Michael Rohs. A $3 gesture recognizer – simple gesture recognition for devices equipped with 3D acceleration sensors. In Proc. IUI '10, 2010.

3. Thomas Schloemer, Benjamin Poppinga, Niels Henze, and Susanne Boll. Gesture recognition with a Wii controller. In Proc. TEI '08, pages 11–14. ACM, 2008.

4. Jacob O. Wobbrock, Andrew D. Wilson, and Yang Li. Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes. In Proc. UIST '07, pages 159–168. ACM, 2007.
