The $3 Recognizer: Simple 3D Gesture Recognition on Mobile Devices

Sven Kratz
Deutsche Telekom Laboratories, TU Berlin
Ernst-Reuter-Platz 7, 10587 Berlin, Germany
[email protected]

Michael Rohs
Deutsche Telekom Laboratories, TU Berlin
Ernst-Reuter-Platz 7, 10587 Berlin, Germany
ABSTRACT
We present the $3 Gesture Recognizer, a simple but robust gesture recognition system for input devices featuring 3D acceleration sensors. The algorithm is designed to be implemented quickly in prototyping environments, is intended to be device-independent, and does not require any special toolkits or frameworks, but relies solely on simple trigonometric and geometric calculations. Our method requires significantly less training data than other gesture recognizers and is thus suited to be deployed and to deliver results rapidly.
Author Keywords
Gesture recognition, recognition rates, classifier, user interfaces, rapid prototyping, 3D gestures
ACM Classification Keywords
H.5.2 [Information Interfaces and Presentation]: User Interfaces – Input devices and strategies. I.5.2 [Pattern Recognition]: Design Methodology – Classifier design and evaluation. I.5.5 [Pattern Recognition]: Implementation – Interactive systems.
General Terms
Algorithms, Performance
INTRODUCTION AND RELATED WORK
An increasing number of mobile devices are equipped with 3D accelerometers, which calls for suitable methods for 3D gesture recognition on these platforms. Gesture input for mobile devices can be a way to overcome the limitations of miniature input facilities and small displays, since the range of movement is not restricted by the size of the device. Our work is based on previous work by Wobbrock et al. [4], who developed a simple "$1 Recognizer" using basic geometry and trigonometry. The "$1 Recognizer" is targeted at user interface
Copyright is held by the author/owner(s). IUI'10, February 7–10, 2010, Hong Kong, China. ACM 978-1-60558-515-4/10/02.
Figure 1. The reference gesture vocabulary containing the gesture classes used for the preliminary evaluation. (b) describes a clockwise circular motion, (c) a wrist rolling motion, (e) a gesture resembling a tennis serve, and (j) a rapid forward-backward motion.
prototyping for 2D touch-screen-based gesture recognition and therefore focuses on ease of implementation on novel hardware platforms. Our contribution is in extending and modifying Wobbrock et al.'s algorithm to work with 3D acceleration data. In contrast to exact pixel positions captured on a touch screen, acceleration data is of much lower quality: it is prone to noise, and additionally, drift error accumulates as the path of a gesture entry is integrated. We extend Wobbrock's original algorithm with a scoring heuristic to lower the rate of false positives. The major contribution of this work is the creation of a simple gesture recognizer that is designed to recognize "true" 3D gestures, i.e. gestures that are not limited to shapes that can be drawn in a 2D plane. The advantage of true 3D gesture recognition is that more natural movements, such as a tennis serve or boxing punches, can be input by the user. Like the "$1 Recognizer," our approach is quick and cheap to implement, does not require library support, and needs only minimal parameter adjustment and minimal training: about 5 samples per gesture provide good recognition results. It is therefore very valuable for user interface prototyping and rapid application development. It can also easily be integrated into mobile interfaces that take advantage of other modalities, such as speech or touch-based interaction with RFID/NFC.
THE $3 GESTURE RECOGNIZER
Extending Wobbrock's [4] work, we present a gesture recognizer that can recognize gestures from 3D acceleration data as input. A detailed description of the algorithm we developed can be found in [2]. Our demo is implemented in Objective-C as an iPhone 3GS application, and in Java running on the Android platform, but the method is by no means limited to these devices.
Gesture Trace and Gesture Class Library
In contrast to [3], our method does not require the raw acceleration data to be modified or pre-processed in any way (filtering, smoothing, etc.). To determine the current change in acceleration, we subtract the previous acceleration value reported by the device's acceleration sensor from the current one, obtaining an acceleration delta. By summation of the acceleration deltas, we obtain a gesture trace T, which can be plotted in 3D space (Figure 1 (e), (j)) or projected into a 2D plane (gestures (a)–(d), (f)–(i)) to obtain a graphical representation of the gesture [1]. The gesture class library L contains a predefined number of training gesture traces for each gesture class G.
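The delta-and-summation step above can be sketched in a few lines. This is a minimal Python sketch, not the authors' Objective-C/Java code; `samples` is an assumed list of (x, y, z) accelerometer readings:

```python
def build_trace(samples):
    """Turn raw (x, y, z) accelerometer samples into a gesture trace
    by summing the deltas between consecutive sensor readings."""
    trace = []
    point = (0.0, 0.0, 0.0)
    prev = samples[0]
    for cur in samples[1:]:
        # acceleration delta between consecutive readings
        delta = tuple(c - p for c, p in zip(cur, prev))
        # summation of deltas yields the next trace point
        point = tuple(pt + d for pt, d in zip(point, delta))
        trace.append(point)
        prev = cur
    return trace
```

Note that no filtering or smoothing is applied, matching the paper's claim that raw sensor data suffices.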
Gesture Recognition Problem
The basic task of our algorithm is to find the best-matching gesture class from the gesture class library. To find a matching gesture class, we compare the gesture trace entered by the user to all known gesture traces stored in the gesture class library and thus generate a score table that lists the comparison score of the entered gesture with the known gestures. A heuristic is then applied to the score table to determine if a gesture has been correctly recognized.
Resampling
For optimal classification, the gesture trace needs to be resampled to have a number of points equal to that of the template gestures. Furthermore, our resampling method ensures that the points are redistributed to be at equal distances from each other.
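Resampling to n equidistant points can be done by walking along the trace and interpolating whenever the accumulated path length reaches the per-segment interval, in the style of the $1 Recognizer's resampling step extended to 3D. A Python sketch (an illustration, not the authors' implementation):

```python
import math

def resample(points, n):
    """Resample a 3D trace to n points spaced equally along the path."""
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

    # target spacing: total path length divided into n-1 equal segments
    interval = sum(dist(points[i - 1], points[i])
                   for i in range(1, len(points))) / (n - 1)
    new_points = [points[0]]
    accumulated = 0.0
    pts = list(points)
    i = 1
    while i < len(pts):
        d = dist(pts[i - 1], pts[i])
        if accumulated + d >= interval:
            # interpolate a new point at exactly `interval` along the path
            t = (interval - accumulated) / d
            q = tuple(p + t * (c - p) for p, c in zip(pts[i - 1], pts[i]))
            new_points.append(q)
            pts.insert(i, q)  # continue walking from the interpolated point
            accumulated = 0.0
        else:
            accumulated += d
        i += 1
    if len(new_points) < n:  # guard against floating-point rounding at the end
        new_points.append(pts[-1])
    return new_points
```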
Rotation to "Indicative Angle" and Rescaling
To correct for rotational errors during gesture entry, the resampled gesture trace is rotated once along the gesture's indicative angle. Like Wobbrock et al., we define the indicative angle as the angle between the gesture's first point p0 and its centroid c = (x, y, z). After rotation, the gesture trace is scaled to fit into a normalized cube, to compensate for scaling differences between gestures. After these pre-processing steps, we have converted the original user input into a gesture that is ready for matching with candidates from the gesture class library.
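The centroid and cube-normalization parts of this pre-processing are straightforward; a hedged Python sketch (the rotation step itself is omitted here, and handling of degenerate zero-width axes is an assumption of this sketch):

```python
def centroid(points):
    """Centroid c = (x, y, z) of a 3D gesture trace."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def scale_to_cube(points, size=1.0):
    """Scale a trace to fit a normalized cube of the given edge length.
    Axes with zero extent are left unscaled to avoid division by zero."""
    mins = [min(p[k] for p in points) for k in range(3)]
    maxs = [max(p[k] for p in points) for k in range(3)]
    spans = [mx - mn if mx > mn else 1.0 for mn, mx in zip(mins, maxs)]
    return [tuple((p[k] - mins[k]) * size / spans[k] for k in range(3))
            for p in points]
```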
Search for Minimum Distance at Best Angle
We use the average MSE (Mean Square Error) as the distance measure between the gesture entered by the user and candidate gestures in the gesture class library. To compensate for rotational differences, we adapted a Golden Section Search (GSS). This type of search uses the Golden Ratio (ϕ = 0.5(−1 + √5)) to iteratively estimate the optimal rotation around the three axes of the coordinate system. The end result of the GSS is a table sorted by matching score, with the corresponding gesture class ID for each entry.
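A one-dimensional Golden Section Search over a single rotation angle can be sketched as follows (the paper applies the search around all three axes; this sketch shows the core bracketing idea only, with `f` standing in for the MSE between the entered trace and a template as a function of rotation angle):

```python
import math

PHI = 0.5 * (-1.0 + math.sqrt(5.0))  # golden ratio, approx. 0.618

def golden_section_search(f, lo, hi, tol=1e-4):
    """Minimize a unimodal function f on [lo, hi] by golden-section
    bracketing; returns the estimated location of the minimum."""
    x1 = hi - PHI * (hi - lo)
    x2 = lo + PHI * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # minimum lies in [lo, x2]; reuse x1 as the new upper probe
            hi, x2, f2 = x2, x1, f1
            x1 = hi - PHI * (hi - lo)
            f1 = f(x1)
        else:
            # minimum lies in [x1, hi]; reuse x2 as the new lower probe
            lo, x1, f1 = x1, x2, f2
            x2 = lo + PHI * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)
```

Because each iteration reuses one previous function evaluation, only one new MSE computation is needed per step, which is what makes GSS cheap enough for interactive use.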
Scoring Heuristic
Wobbrock's original algorithm did not feature a heuristic to reduce the occurrence of false positives, which is a common problem for simple gesture recognition algorithms operating on large gesture vocabularies [4]. The matches obtained from gestures entered as 3D acceleration data are not as precise as strokes entered on a touch screen. To compensate for the weaker matches, we have developed our own scoring heuristic, which processes the score table described in the previous section. After sorting the score table by maximum score, our heuristic determines the recognized gesture with the following rules:
• ε is defined as the threshold score.
• If the highest-scoring candidate in the score table has a score > 1.1ε, return this candidate's gesture ID.
• If, within the top three candidates in the score table, two candidates of the same gesture class exist, each with a score > 0.95ε, return the gesture ID of these two candidates.
• Else, return "Gesture not recognized!".
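The three rules above translate directly into code. A Python sketch, assuming the score table is a list of (score, gesture_id) pairs (the pair layout and `None` for rejection are assumptions of this sketch):

```python
def recognize(score_table, epsilon):
    """Apply the scoring heuristic to a list of (score, gesture_id)
    pairs; returns a gesture ID or None for 'Gesture not recognized!'."""
    table = sorted(score_table, reverse=True)  # highest score first
    # Rule 1: a clear winner well above the threshold
    if table and table[0][0] > 1.1 * epsilon:
        return table[0][1]
    # Rule 2: two of the top three agree and both score near the threshold
    top3 = [gid for score, gid in table[:3] if score > 0.95 * epsilon]
    for gid in set(top3):
        if top3.count(gid) >= 2:
            return gid
    # Rule 3: reject
    return None
```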
Using this heuristic, we achieved a considerable reduction of false recognitions compared to Wobbrock's original strategy of simply selecting the gesture candidate with the highest matching score as the recognized gesture.
SUMMARY
We present a simple, easy-to-implement gesture recognizer for input devices equipped with 3D acceleration sensors. The idea behind our gesture recognition algorithm is to provide a quick and cheap way to implement gesture recognition for true 3D gestures (such as the reference gesture (e)) for devices equipped with 3D acceleration sensors. Our method does not require any advanced software frameworks or toolkits. An example application area for our gesture recognizer is user interface prototyping.
REFERENCES
1. Sanna Kallio, Juha Kela, Jani Mäntyjärvi, and Johan Plomp. Visualization of hand gestures for pervasive computing environments. In Proc. AVI '06, pages 480–483. ACM, 2006.
2. Sven Kratz and Michael Rohs. A $3 gesture recognizer: simple gesture recognition for devices equipped with 3D acceleration sensors. In Proc. IUI '10, 2010.
3. Thomas Schlömer, Benjamin Poppinga, Niels Henze, and Susanne Boll. Gesture recognition with a Wii controller. In Proc. TEI '08, pages 11–14. ACM, 2008.
4. Jacob O. Wobbrock, Andrew D. Wilson, and Yang Li. Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes. In Proc. UIST '07, pages 159–168. ACM, 2007.