SOMM: Self Organizing Markov Map for Gesture Recognition
Pattern Recognition, 2010 Spring
Seung-Hyun Lee
G. Caridakis et al., Pattern Recognition Letters, Vol. 31, pp. 52-59, 2010.
Slide 2
SOFT COMPUTING @ YONSEI UNIV. KOREA
Contents
  Introduction
  Related Work
    Hidden Markov Models
    Other Methods
  Proposed Method
  Experiments
  Conclusion
Slide 3
Introduction
  Gesture: a motion of the body that conveys information
  In this paper: focus on hand gestures
Slide 4
Introduction: Taxonomy of gesture
  McNeill (1992): Gesticulation, Speech-linked, Pantomime, Emblems, Sign Languages
  Other: (Kendon, 1992), (Quek, 1994)
Slide 5
Introduction: Taxonomy by functionality
  Gestures: Definition
  Symbolic gestures: gestures that, within each culture, have come to have a single meaning.
  Deictic gestures: the types of gestures most generally seen in HCI; they are the gestures of pointing to entities or directions.
  Iconic gestures: gestures used to convey information about the size, spatial relations, actions, shape or orientation of the object of discourse display.
  Pantomimic gestures: gestures typically used to mimic an action, object or concept.
Slide 6
Related Work: Hidden Markov Model
  Cogan (2006): discrete HMM that fuses hand shape and position
  Hossain (2005): implicit/explicit temporal information encoded HMM; discriminates attention and non-attention gestures
  Mantyla (2000): on mobile devices; uses SOM and HMM methods
  Starner (1998): HMM-based American Sign Language (ASL) recognition; sentence-level recognition is possible
Slide 7
Related Work: Other methods
  Black and Jepson (1998): CONDitional dENSity propagATION (CONDENSATION) algorithm
  Wong and Cipolla (2006): sparse Bayesian classifier
  Hong et al. (2000): Finite State Machines (FSM)
  Su (2000): fuzzy logic and rule-based approaches, and hyper-rectangular composite neural networks (HRCNNs)
  Juang and Ku (2005): fuzzified Takagi-Sugeno-Kang (TSK) type recurrent network
  Yang et al. (2002): Time Delay Neural Network
  Huang and Huang (1998): 3D Hopfield Neural Network
Slide 8
Proposed Method: Overview
  Modules
    Image processing: detection and tracking of hands
    SOM: quantization of hand location and direction
    HMM: transition probability matrix
Slide 9
Proposed Method: Feature Extraction
  Video-based method
    Creation of moving skin masks (skin color areas)
    Tracking the centroid of the skin masks
  Prior knowledge is required
    It should indicate different body parts (left hand, right hand, and head)
  Environment
    PC platform
    OpenCV
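The centroid tracking step above can be sketched as follows. This is a minimal illustration, assuming each skin mask is already available as a binary grid (nested lists of 0/1); the function names `mask_centroid` and `track_centroids` are hypothetical, and the slides only state that OpenCV was used, not how.

```python
def mask_centroid(mask):
    """Return the (row, col) centroid of the nonzero cells of a binary mask.

    Returns None when the mask is empty (no skin pixels detected).
    """
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return None
    return (row_sum / total, col_sum / total)


def track_centroids(masks):
    """Compute one centroid per frame, giving a trajectory over time."""
    return [mask_centroid(m) for m in masks]


# Two toy 4x4 frames: a 2x2 skin blob that moves one pixel to the right.
frames = [
    [[0, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 0]],
    [[0, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 0, 0]],
]
trajectory = track_centroids(frames)
print(trajectory)
```

In a real pipeline one mask per body part (left hand, right hand, head) would be tracked separately, which is where the prior knowledge mentioned on the slide comes in.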
Slide 10
Proposed Method: Dataset
  Gesture instances
Slide 11
Proposed Method: Position Model
  cf) SOM
    (1) continuous input space
    (2) discrete output space in the form of a lattice
    (3) time-varying neighborhood function defined around the winning neuron
    (4) decreasing learning rate parameter
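The four SOM properties listed above can be sketched in a single update step. This is an illustrative sketch, not the configuration used in the paper: the 2x2 lattice, the Gaussian neighborhood, the exponential decay schedules, and the parameters `lr0` and `sigma0` are all assumptions.

```python
import math


def best_matching_unit(weights, x):
    """Return the lattice coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], x)),
    )


def som_step(weights, x, t, n_steps, lr0=0.5, sigma0=0.5):
    """One SOM update: continuous input x, discrete lattice output (the winner)."""
    lr = lr0 * math.exp(-t / n_steps)        # (4) decreasing learning rate
    sigma = sigma0 * math.exp(-t / n_steps)  # (3) shrinking neighborhood width
    win = best_matching_unit(weights, x)
    for pos, w in weights.items():
        d2 = (pos[0] - win[0]) ** 2 + (pos[1] - win[1]) ** 2
        h = math.exp(-d2 / (2 * sigma ** 2))  # neighborhood around the winner
        weights[pos] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]
    return win


# (2) discrete output space: a 2x2 lattice, one weight vector per unit.
weights = {
    (0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
    (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0],
}
# (1) continuous input space: normalized 2-D hand positions (toy data).
samples = [[0.1, 0.1], [0.9, 0.8], [0.2, 0.9], [0.8, 0.1]]
n_steps = 40
for t in range(n_steps):
    som_step(weights, samples[t % len(samples)], t, n_steps)
```

After training, `best_matching_unit` maps any hand position to a lattice unit, which is the quantized symbol that later stages work with.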
Slide 12
Proposed Method: Position Model
  SOM based representation of hand position
Slide 13
Proposed Method: Direction Model
  Additional information: moving direction
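One way to turn moving direction into a discrete feature is to quantize the angle between consecutive hand positions. The sketch below assumes 8 equal angular bins and (x, y) points; both the bin count and the function names are hypothetical choices, since the slide does not state how direction is encoded.

```python
import math


def direction_bin(p_prev, p_curr, n_bins=8):
    """Quantize the movement from p_prev to p_curr into one of n_bins sectors.

    Bin 0 is centered on the positive x axis and bins increase with the angle
    returned by atan2. Returns None when the hand did not move.
    """
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    if dx == 0 and dy == 0:
        return None
    angle = math.atan2(dy, dx) % (2 * math.pi)
    width = 2 * math.pi / n_bins
    # Shift by half a bin so that each bin is centered on its direction.
    return int((angle + width / 2) // width) % n_bins


def direction_sequence(trajectory, n_bins=8):
    """Direction bin for every consecutive pair of points in a trajectory."""
    return [
        direction_bin(a, b, n_bins)
        for a, b in zip(trajectory, trajectory[1:])
    ]


trajectory = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
print(direction_sequence(trajectory))
```

Note that in image coordinates the y axis usually points downward, so the bins run clockwise on screen.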
Slide 14
Proposed Method: Generalized Median
  Based on Levenshtein distance (edit distance)
    Measures the amount of difference between two sequences
  Generalized median of data set Mj
    Mean Levenshtein distance between members
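A small sketch of the two ingredients named above: the Levenshtein distance between two symbol sequences, and a median chosen by mean distance to the other members. As a simplifying assumption the search is restricted to members of the set (often called the set median); the slide's exact formula for the generalized median is not preserved in this transcript.

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def set_median(sequences):
    """Member with the smallest mean Levenshtein distance to all members."""
    def mean_distance(candidate):
        return sum(levenshtein(candidate, s) for s in sequences) / len(sequences)
    return min(sequences, key=mean_distance)


# Toy instances of one gesture, each a sequence of SOM unit indices.
instances = [
    [1, 2, 3, 4],
    [1, 2, 4],
    [1, 2, 3, 4, 5],
    [1, 3, 4],
]
print(set_median(instances))
```

The selected sequence can serve as a representative of the gesture class, since it differs least on average from the recorded repetitions.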
Slide 15
Proposed Method: Gesture Decoding
  Position probability
    Calculation of S_som
      First state: initial probability
      From second state: transition probability
    Unit u
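The "initial probability, then transition probabilities" scheme can be sketched as a first-order Markov chain over SOM units. This is an assumption-laden illustration: the maximum-likelihood counting, the absence of smoothing, and the names `fit_markov` and `sequence_probability` are not taken from the slides, and the exact definition of S_som is not preserved in this transcript.

```python
from collections import Counter, defaultdict


def fit_markov(sequences):
    """Estimate initial and transition probabilities from unit sequences."""
    first = Counter(seq[0] for seq in sequences if seq)
    n = sum(first.values())
    initial = {u: c / n for u, c in first.items()}

    counts = defaultdict(Counter)
    for seq in sequences:
        for u, v in zip(seq, seq[1:]):
            counts[u][v] += 1
    transition = {
        u: {v: c / sum(row.values()) for v, c in row.items()}
        for u, row in counts.items()
    }
    return initial, transition


def sequence_probability(seq, initial, transition):
    """Initial probability of the first unit times all transition probabilities."""
    if not seq:
        return 0.0
    p = initial.get(seq[0], 0.0)
    for u, v in zip(seq, seq[1:]):
        p *= transition.get(u, {}).get(v, 0.0)  # unseen transitions score 0
    return p


# Toy training instances of one gesture, as sequences of SOM unit indices.
training = [[0, 1, 2], [0, 1, 1, 2], [1, 2, 2]]
initial, transition = fit_markov(training)
print(sequence_probability([0, 1, 2], initial, transition))
```

In practice some smoothing of unseen transitions would likely be needed; here they simply receive probability 0.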
Slide 16
Proposed Method: Gesture Decoding
  Direction probability
    Calculation of S_of
    Unit u
Slide 17
Proposed Method: Gesture Decoding
  Similarity measurement
    Problem: shorter gesture instances tend to gain an advantage because they have fewer transitions and thus fewer probability multiplications
    Measurement
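The length bias described above is easy to reproduce numerically. The sketch below compares a raw product of per-step probabilities with a length-normalized score (the geometric mean of the steps). The normalization shown is one common remedy and is an assumption here; the measure actually used on the slide is not preserved in this transcript.

```python
import math


def raw_score(step_probs):
    """Plain product of per-step probabilities; shrinks with every extra step."""
    return math.prod(step_probs)


def normalized_score(step_probs):
    """Geometric mean of per-step probabilities, independent of length."""
    if not step_probs or any(p <= 0.0 for p in step_probs):
        return 0.0
    mean_log = sum(math.log(p) for p in step_probs) / len(step_probs)
    return math.exp(mean_log)


short_instance = [0.5, 0.5]   # 2 steps, each a mediocre match
long_instance = [0.6] * 6     # 6 steps, each a better match

print(raw_score(short_instance), raw_score(long_instance))
print(normalized_score(short_instance), normalized_score(long_instance))
```

With the raw product the short instance wins despite matching worse at every step; after normalization the long instance scores higher.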
Slide 18
Proposed Method: Error Propagation
  Error definition for function f
  SOM based approach
    If data containing a small error is mapped to the same node of the SOM: no problem
    Otherwise
    Consequently, because of the neighboring relation of u, the error is not propagated to the next steps of the recognition process
Slide 19
Experiment: Data Set
  30 gestures
  10 repetitions each
Slide 20
Experiment: Result
  SOM clustering
    Blue: close to input vector
    Red: not close
  Recognition accuracy
    Test with training data: 100%
    10-fold cross validation: 93%
    0.843 ms for decoding a gesture
  Only HMM-based classifier: 86.36%
Slide 21
Conclusion
  Key features
    SOM and HMM based automatic recognition architecture
    ROI
    Relative hand position
    Moving direction
    Similarity of pattern
  Application
    Sign language
    Gaming environment