SOMM: Self Organizing Markov Map for Gesture Recognition Pattern Recognition 2010 Spring Seung-Hyun Lee G. Caridakis et al., Pattern Recognition, Vol

  • Slide 1
  • SOMM: Self Organizing Markov Map for Gesture Recognition. Pattern Recognition, 2010 Spring. Seung-Hyun Lee. G. Caridakis et al., Pattern Recognition, Vol. 31, pp. 52-59, 2010.
  • Slide 2
  • SOFT COMPUTING @ YONSEI UNIV. KOREA. Contents: Introduction; Related Work (Hidden Markov Models, Other Methods); Proposed Method; Experiments; Conclusion.
  • Slide 3
  • Introduction. Gesture: a motion of the body that conveys information. This paper focuses on hand gestures.
  • Slide 4
  • Introduction. Taxonomy of gestures (McNeill, 1992): Gesticulation; Speech-linked; Pantomime; Emblems; Sign Languages. Other: (Kendon, 1992), (Quek, 1994).
  • Slide 5
  • Introduction. Taxonomy by functionality (gesture type: definition):
      • Symbolic gestures: gestures that, within each culture, have come to have a single meaning.
      • Deictic gestures: the gestures most commonly seen in HCI; gestures of pointing to entities or directions.
      • Iconic gestures: gestures used to convey information about the size, spatial relations, actions, shape, or orientation of the object of discourse.
      • Pantomimic gestures: gestures typically used to mimic an action, object, or concept.
  • Slide 6
  • Related Work: Hidden Markov Model.
      • Cogan (2006): discrete HMM that fuses hand shape and position.
      • Hossain (2005): implicit/explicit temporal information encoded HMM; discriminates attention and non-attention gestures.
      • Mantyla (2000): runs on mobile devices; uses SOM and HMM.
      • Starner (1998): HMM-based American Sign Language (ASL) recognition; sentence-level recognition is possible.
  • Slide 7
  • Related Work: Other methods.
      • Black and Jepson (1998): CONDitional dENSity propagATION (CONDENSATION) algorithm.
      • Wong and Cipolla (2006): sparse Bayesian classifier.
      • Hong et al. (2000): finite state machines (FSM).
      • Su (2000): fuzzy logic and rule-based approaches, and hyper-rectangular composite neural networks (HRCNNs).
      • Juang and Ku (2005): fuzzified Takagi-Sugeno-Kang (TSK) type recurrent network.
      • Yang et al. (2002): time-delay neural network.
      • Huang and Huang (1998): 3D Hopfield neural network.
  • Slide 8
  • Proposed Method: Overview. Modules:
      • Image processing: detection and tracking of hands.
      • SOM: quantization of hand location and direction.
      • HMM: transition probability matrix.
  • Slide 9
  • Proposed Method: Feature Extraction.
      • Video-based method: creation of moving skin masks (skin-color areas); tracking of the centroid of each skin mask.
      • Prior knowledge is required: it should indicate the different body parts (left hand, right hand, and head).
      • Environment: PC platform, OpenCV.
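The centroid-tracking step on this slide can be illustrated with a minimal sketch. It assumes the skin mask is already available as a 2D grid of 0/1 values; the function name `mask_centroid` and the toy mask are hypothetical, and the slides do not say how the authors computed the centroid (an OpenCV pipeline would more likely derive it from image moments).

```python
def mask_centroid(mask):
    """Return the (row, col) centroid of the nonzero cells of a 2D binary
    mask, or None when the mask contains no skin pixels."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return None
    return (row_sum / total, col_sum / total)


# Toy 4x4 mask with a 2x2 "skin" blob in the middle (illustrative data only).
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
centroid = mask_centroid(mask)
```

Calling this once per frame and per body part would yield the centroid trajectory that the later position and direction models consume.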
  • Slide 10
  • Proposed Method: Dataset. Gesture instances.
  • Slide 11
  • Proposed Method: Position Model. cf) SOM:
      • (1) continuous input space
      • (2) discrete output space in the form of a lattice
      • (3) time-varying neighborhood function defined around the winning neuron
      • (4) decreasing learning-rate parameter
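The four SOM properties listed on this slide can be seen together in a small training loop. This is a generic textbook-style sketch, not the configuration used in the paper: the lattice size, the linear decay schedules, the Gaussian neighborhood, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the lattice coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(samples, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on continuous input vectors (property 1)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Discrete output space: one weight vector per lattice unit (property 2).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # decreasing learning rate (4)
        sigma = sigma0 * (1.0 - frac) + 1e-3  # shrinking neighborhood (3)
        for x in samples:
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Illustrative normalized hand positions; not data from the paper.
samples = [(0.1, 0.1), (0.15, 0.05), (0.9, 0.9), (0.85, 0.95)]
som = train_som(samples, rows=3, cols=3)
unit = best_matching_unit(som, (0.12, 0.08))
```

After training, each hand position is replaced by the coordinate of its best-matching unit, which turns a continuous trajectory into a sequence of discrete symbols.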
  • Slide 12
  • Proposed Method: Position Model. SOM-based representation of hand position.
  • Slide 13
  • Proposed Method: Direction Model. Additional information: moving direction.
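One way to turn the moving direction into a discrete symbol is to quantize the angle between consecutive centroids. The slide does not state how direction is encoded, so the eight-bin scheme, the bin layout, and the name `direction_symbol` below are assumptions.

```python
import math


def direction_symbol(p_prev, p_next, n_bins=8):
    """Quantize the movement from p_prev to p_next into one of n_bins
    direction bins centered on multiples of 2*pi/n_bins.
    Returns None when the point did not move."""
    dx = p_next[0] - p_prev[0]
    dy = p_next[1] - p_prev[1]
    if dx == 0 and dy == 0:
        return None
    angle = math.atan2(dy, dx) % (2.0 * math.pi)  # map to [0, 2*pi)
    width = 2.0 * math.pi / n_bins
    # Shift by half a bin so that bin 0 is centered on the +x axis.
    return int((angle + width / 2.0) // width) % n_bins


# Direction symbols along a short, made-up centroid trajectory.
trajectory = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
symbols = [direction_symbol(a, b) for a, b in zip(trajectory, trajectory[1:])]
```

In image coordinates the y axis usually points down, so the bin that corresponds to "up" depends on the coordinate convention in use.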
  • Slide 14
  • Proposed Method: Generalized Median. Based on the Levenshtein distance (edit distance), which measures the amount of difference between two sequences. Generalized median of data set M_j. Mean Levenshtein distance between members.
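The edit distance and a median sequence can be sketched as follows. The Levenshtein routine is the standard dynamic-programming formulation. The median shown is the set median (the member of the set with the smallest total distance to the others), which is a common simplification of the generalized median and may differ from what the paper computes. The names `set_median` and `mean_distance` are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def set_median(sequences):
    """Member of the set with the smallest summed distance to all members."""
    return min(sequences,
               key=lambda s: sum(levenshtein(s, t) for t in sequences))


def mean_distance(seq, sequences):
    """Mean Levenshtein distance from seq to every sequence in the set."""
    return sum(levenshtein(seq, t) for t in sequences) / len(sequences)


# Made-up sequences of SOM unit indices for one gesture class.
instances = [[1, 2, 3], [1, 2, 4], [1, 2, 3, 3], [7, 8, 9]]
median = set_median(instances)
spread = mean_distance(median, instances)
```

The mean distance gives a rough idea of how tightly the instances of one gesture class cluster around their median.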
  • Slide 15
  • Proposed Method: Gesture Decoding. Position probability: calculation of S_som. First state: initial probability. From the second state onward: transition probability. Unit u.
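The "initial probability, then transition probabilities" rule on this slide matches the probability of a sequence under a first-order Markov chain over SOM units. The sketch below estimates both tables by simple counting and multiplies them along a sequence. The counting estimator, the absence of smoothing, and the names `fit_markov` and `sequence_probability` are assumptions; the paper's exact definition of S_som is not reproduced in the slides.

```python
from collections import Counter, defaultdict


def fit_markov(sequences):
    """Estimate initial and transition probabilities from unit sequences."""
    initial_counts = Counter(seq[0] for seq in sequences if seq)
    transition_counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            transition_counts[a][b] += 1
    n = sum(initial_counts.values())
    initial = {u: c / n for u, c in initial_counts.items()}
    transition = {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in transition_counts.items()
    }
    return initial, transition


def sequence_probability(seq, initial, transition):
    """Initial probability of the first unit times the transition
    probability of each subsequent step; unseen events count as 0."""
    if not seq:
        return 0.0
    p = initial.get(seq[0], 0.0)
    for a, b in zip(seq, seq[1:]):
        p *= transition.get(a, {}).get(b, 0.0)
    return p


# Made-up training sequences of SOM unit indices for one gesture class.
training = [[0, 1, 2], [0, 1, 1, 2], [3, 1, 2]]
initial, transition = fit_markov(training)
score = sequence_probability([0, 1, 2], initial, transition)
```

A classifier built this way would fit one pair of tables per gesture class and pick the class whose model scores the observed sequence highest.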
  • Slide 16
  • Proposed Method: Gesture Decoding. Direction probability: calculation of S_of. Unit u.
  • Slide 17
  • Proposed Method: Gesture Decoding. Similarity measurement. Problem: shorter gesture instances tend to gain an advantage because they have fewer transitions and therefore fewer probability multiplications. Measurement.
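The length bias described on this slide is easy to reproduce numerically. The sketch below compares a raw product of per-step probabilities with a per-step geometric mean. The geometric mean is only one common way to remove the bias and is an assumption here; the measurement actually used in the paper is not given in the slide text. The function names and the probabilities are made up.

```python
import math


def raw_score(step_probs):
    """Plain product of per-step probabilities."""
    return math.prod(step_probs)


def length_normalized_score(step_probs):
    """Geometric mean of the per-step probabilities, computed in the log
    domain; returns 0.0 for empty input or any zero-probability step."""
    if not step_probs or any(p <= 0 for p in step_probs):
        return 0.0
    return math.exp(sum(math.log(p) for p in step_probs) / len(step_probs))


short_gesture = [0.8] * 3   # few steps, each a fairly poor match
long_gesture = [0.9] * 10   # many steps, each a better match

raw_short, raw_long = raw_score(short_gesture), raw_score(long_gesture)
norm_short = length_normalized_score(short_gesture)
norm_long = length_normalized_score(long_gesture)
```

With these numbers the raw product favors the short gesture even though every one of its steps matches worse, while the normalized score ranks the long gesture higher.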
  • Slide 18
  • Proposed Method: Error Propagation. Error definition for function f. SOM-based approach: if data containing a small error is mapped to the same SOM node, there is no problem; otherwise, because of the neighboring relation of u, the error is consequently not propagated to the next steps of the recognition process.
  • Slide 19
  • Experiment: Data Set. 30 gestures, 10 repetitions each.
  • Slide 20
  • Experiment: Result.
      • SOM clustering: blue: close to the input vector; red: not close.
      • Recognition accuracy, test with training data: 100%.
      • Recognition accuracy, 10-fold cross-validation: 93%.
      • 0.843 ms to decode a gesture.
      • HMM-only classifier: 86.36%.
  • Slide 21
  • Conclusion.
      • Key features: SOM- and HMM-based automatic recognition architecture; ROI; relative hand position; moving direction; similarity of pattern.
      • Applications: sign language; gaming environments.
  • Slide 22
  • Thank you