Real-time Fingertip Tracking and Detection using Kinect Depth Sensor for a New Writing-in-the Air System
Ziyong Feng, Shaojie Xu, Xin Zhang, Lianwen Jin, Zhichao Ye, and Weixin Yang
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012




Outline
• Introduction
• Related Work
• Proposed Method
• Experimental Results
• Conclusion


Introduction


Introduction
• Fingertip detection plays a very important role in natural HCI (Human-Computer Interaction)
• Challenges:
    • Variety of hand poses
    • Occlusion
• In this paper:
    • Propose a real-time finger-writing character recognition system using depth information
    • Accurate and fast


Related Work


Related Work
• Template matching [3]:
• Curvature fitting [6]:

[3] L. Jin, D. Yang, L. Zhen, and J. Huang. A novel vision based finger-writing character recognition system. Journal of Circuits, Systems, and Computers (JCSC), 16(3):421–436, 2007.
[6] D. Lee and S. Lee. Vision-based finger action recognition by angle detection and contour analysis. Electronics and Telecommunications Research Institute Journal, 33(3):415–422, 2011.


Proposed Method


Flow Chart
Hand Segmentation → Data Conversion → Region Clustering → Fingertip Identification


Hand Segmentation
• Extract the human body from the background:
    • User ID map (provided by Open Natural Interaction (OpenNI))
    • User Generator


Hand Segmentation
• Two kinds of hand-torso relationship:
    • 1) The hand is held up in front of the body.
    • 2) The hand is close to the body.
(Figure: depth histogram)


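Hand Segmentation: Two-Component
• Characterize the depth histogram with one of two models:
    • 1) Two-component Gaussian mixture model
    • 2) Single Gaussian model
• Two-component Gaussian mixture model, fitted with the expectation-maximization (EM) algorithm:
    $p(d) = \sum_{k=1}^{2} \pi_k \, \mathcal{N}(d \mid \mu_k, \sigma_k^2)$
    • $\pi_k$: weight of the k-th component
    • $\mu_k$: mean of the k-th component
    • $\sigma_k^2$: variance of the k-th component
    • $d$: depth value
• Hand pixels:
    • Belong to the Gaussian component with the smaller mean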
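A minimal Python sketch of the two-component fit described on this slide, assuming plain EM on a list of depth values in millimeters. The function names `fit_two_gaussians` and `hand_mask`, the initialization, the iteration count, and the 0.5 responsibility cut-off are illustrative assumptions, not details taken from the paper.

```python
import math

def fit_two_gaussians(depths, iters=50):
    """Fit a two-component 1-D Gaussian mixture to depth values with EM."""
    lo, hi = min(depths), max(depths)
    mu = [lo, hi]                                  # assumed init: nearest and farthest depth
    var = [((hi - lo) / 4.0) ** 2 + 1e-6] * 2      # assumed init: broad, equal variances
    weight = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each depth value
        resp = []
        for d in depths:
            p = [weight[k] * math.exp(-(d - mu[k]) ** 2 / (2 * var[k]))
                 / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean, and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            mu[k] = sum(r[k] * d for r, d in zip(resp, depths)) / nk
            var[k] = sum(r[k] * (d - mu[k]) ** 2
                         for r, d in zip(resp, depths)) / nk + 1e-6
            weight[k] = nk / len(depths)
    return weight, mu, var, resp

def hand_mask(depths):
    """Mark depth values owned by the component with the smaller mean."""
    _, mu, _, resp = fit_two_gaussians(depths)
    hand = 0 if mu[0] <= mu[1] else 1
    return [r[hand] > 0.5 for r in resp]

# Hypothetical depths: four hand pixels near 600 mm, six torso pixels near 900 mm.
depths = [600, 605, 610, 595, 900, 905, 910, 895, 902, 898]
mask = hand_mask(depths)
```

In this toy input the hand is well separated from the torso, which corresponds to the "hand held up in front" case; when the two fitted means end up close together, the slides switch to the single Gaussian model instead.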


Hand Segmentation: One-Component
• Single Gaussian fitting:
    • Used when the means of the two Gaussian components are too close
    • Distribution:
• Hand pixels:
    • Compared with the torso, the hand occupies only a small area.
    • Lower part of p:


Data Conversion
• Convert to real-world coordinates:
    • The accuracy of the world coordinates is about 1 mm.
    • The following discussion is based entirely on real-world coordinates.
• Quantities used in the conversion:
    • Projected point coordinate
    • d: depth value
    • Camera's focal lengths along the x and y axes
    • x: real-world coordinate
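The conversion formula itself did not survive on this slide. As a hedged illustration only, the sketch below uses the standard pinhole back-projection, which needs the same quantities the slide lists (projected point, depth value, focal lengths). The principal point `cx`, `cy`, the function name, and the numeric intrinsics are assumptions and may differ from what the authors used.

```python
def depth_pixel_to_world(u, v, d, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth d (mm) to a 3-D point (mm).

    Standard pinhole model; cx, cy (principal point) are an assumption,
    since the slide only lists the focal lengths.
    """
    x = (u - cx) * d / fx
    y = (v - cy) * d / fy
    return (x, y, d)

# Hypothetical intrinsics, chosen only to make the arithmetic easy to follow.
center = depth_pixel_to_world(320, 240, 1000.0, 525.0, 525.0, 320.0, 240.0)
offset = depth_pixel_to_world(845, 240, 1000.0, 525.0, 525.0, 320.0, 240.0)
```

With these made-up intrinsics, a pixel at the image center maps onto the optical axis, and a pixel 525 columns to the right at 1000 mm depth maps to a point 1000 mm to the side.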


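Region Clustering
• Clustering algorithm: K-means
    • Finger part vs. non-finger part (K = 2)
• Minimize the distortion measure J:
    $J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \lVert \mathbf{x}_n - \boldsymbol{\mu}_k \rVert^2$
    • $r_{nk}$: 1 if the n-th sample is assigned to the k-th cluster, 0 otherwise
    • $\boldsymbol{\mu}_k$: mean of the k-th cluster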
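A small Python sketch of K-means with K = 2 on 3-D points, returning the sum of squared distances of each point to its assigned cluster mean as the distortion. The initialization (first point and the point farthest from it), the fixed iteration count, and the names `two_means`, `sq_dist`, and `mean` are illustrative assumptions; the slides do not say how the authors initialize or stop the algorithm.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def two_means(points, iters=20):
    """K-means with K = 2; returns labels, cluster means, and distortion."""
    # Assumed deterministic init: first point and the point farthest from it.
    centers = [points[0], max(points, key=lambda p: sq_dist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to the nearer cluster mean.
        labels = [0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
                  for p in points]
        # Update step: recompute each mean from its members.
        for k in range(2):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centers[k] = mean(members)
    distortion = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return labels, centers, distortion

# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 10, 0), (11, 10, 0), (10, 11, 0)]
labels, centers, distortion = two_means(points)
```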


Fingertip Identification
• After clustering, the hand-related region is separated into two parts.
• The fingertip:
    • The point in one cluster that is farthest from the center of the other cluster
• Arm point: the mean of the points that share the maximum depth
• The fingertip:
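A hedged Python sketch of the two geometric rules on this slide: the arm point as the mean of the points sharing the maximum depth, and the fingertip as the point of one cluster farthest from the center of the other. How the authors decide which cluster is the finger is not recoverable from the slide, so the sketch assumes the caller already knows; the function names and toy coordinates are hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of 3-D points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def arm_point(points):
    """Mean of the points that share the maximum depth (z) value."""
    max_depth = max(p[2] for p in points)
    return centroid([p for p in points if p[2] == max_depth])

def fingertip(finger_cluster, other_cluster):
    """Point of the finger cluster farthest from the other cluster's center.

    Assumes the caller already knows which cluster is the finger.
    """
    center = centroid(other_cluster)
    return max(finger_cluster, key=lambda p: math.dist(p, center))

# Hypothetical clusters in millimeters: a flat palm and a finger pointing up.
palm = [(0, 0, 800), (10, 0, 800), (0, 10, 800), (10, 10, 800)]
finger = [(5, 20, 790), (5, 40, 780), (5, 60, 770)]
tip = fingertip(finger, palm)
arm = arm_point(palm + finger)
```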


Experimental Results


Experimental Results
• Resolution: 480 × 640
• 30 fps using OpenNI (Kinect)
• Dataset:
    • 2 subjects
    • 6 categories
    • 8185 frames in total


Experimental Results


Experimental Results
• Near mode (1 m)
• Far mode (1.5 m)


Experimental Results
• The distribution of errors from a sequence:
    • Fast movement
    • Finger is orthogonal to the camera plane


Experimental Results
• Smoothed trajectory: mean filter
• 90% recognition rate on English characters
• 80% recognition rate on Chinese characters
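A short Python sketch of mean-filter smoothing applied to a fingertip trajectory. The window size, the shrinking window at the ends of the trajectory, and the name `mean_filter` are assumptions for illustration; the slides only state that a mean filter is used.

```python
def mean_filter(trajectory, window=5):
    """Smooth a list of points with a centered moving average.

    Near the ends the window shrinks to the samples that exist
    (an assumed boundary rule).
    """
    half = window // 2
    smoothed = []
    for i in range(len(trajectory)):
        lo, hi = max(0, i - half), min(len(trajectory), i + half + 1)
        chunk = trajectory[lo:hi]
        smoothed.append(tuple(sum(c) / len(chunk) for c in zip(*chunk)))
    return smoothed

# Hypothetical 2-D trajectory with a spike in the middle sample.
smoothed = mean_filter([(0, 0), (3, 3), (6, 0)], window=3)
```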


Conclusion


Conclusion
• Proposes a novel real-time fingertip detection and tracking method.
• Uses depth sequences
• Accurate and fast fingertip detection and character recognition


Real-time Hand Tracking on Depth Images
Chia-Ping Chen, Yu-Ting Chen, Ping-Han Lee, Yu-Pao Tsai, and Shawmin Lei

Visual Communications and Image Processing (VCIP), 2011 IEEE


Outline
• Introduction
• Proposed Method
• Experimental Results
• Conclusion


Introduction


Introduction
• Most previous work tracked the hand position in color images and relied heavily on skin color information.
• Vulnerable to variations in lighting and skin color
• In this paper:
    • Propose a hand tracking algorithm that uses depth images only
    • Real-time and accurate
    • Hand click detection method


Proposed Method


Hand Position Detection
• Predict the new hand position based on the hand's moving velocity:
• H: hand moving velocity (estimated from hand positions tracked in previous frames)
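The prediction formula is missing from this slide. As a rough sketch under a constant-velocity assumption, the Python below estimates the velocity from the last two tracked positions and adds it to the latest one. Both the estimator and the name `predict_hand_position` are assumptions; the authors may estimate velocity over more frames.

```python
def predict_hand_position(history):
    """Predict the next hand position from previously tracked positions.

    Assumed model: velocity = last position minus the one before it,
    prediction = last position plus that velocity.
    """
    if len(history) < 2:
        return history[-1]          # no velocity estimate available yet
    prev, last = history[-2], history[-1]
    velocity = tuple(l - p for l, p in zip(last, prev))
    return tuple(l + v for l, v in zip(last, velocity))

# Hypothetical tracked positions in millimeters.
predicted = predict_hand_position([(0, 0, 1000), (10, 5, 990)])
```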


Hand Region Segmentation
• Hand region:
    • Connected component in the 3D point cloud P (from the 2D depth image)
• Seed point:
    • The point in the point cloud P nearest to the predicted hand position
    • d(·,·): Euclidean distance
(Figure: seed point and predicted hand position)
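The seed-point rule can be sketched in a few lines of Python: pick the point of the cloud with the smallest Euclidean distance to the predicted hand position. The function name and the toy cloud are hypothetical, and a brute-force search is used purely for clarity.

```python
import math

def seed_point(cloud, predicted):
    """Return the cloud point nearest (Euclidean) to the predicted position."""
    return min(cloud, key=lambda p: math.dist(p, predicted))

# Hypothetical point cloud and predicted hand position, in millimeters.
cloud = [(0, 0, 1000), (50, 0, 950), (200, 100, 900)]
seed = seed_point(cloud, (60, 10, 940))
```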


Hand Region Segmentation
• Connectivity:
• Entire hand region:
    • Found using standard region growing techniques
    • The hand region grows incrementally and stops when:
        • 1) Two neighboring points are no longer connected
        • 2) The geodesic distance to the seed point <
(Figure labels: Ω, ε, seed point, 250 mm, 30 mm)
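A hedged Python sketch of region growing from a seed point with a geodesic distance limit. It assumes two points are connected when their Euclidean distance is at most `connect_mm`, and it approximates geodesic distance as the shortest path length through connected points (Dijkstra's algorithm). The parameter names are hypothetical, and the default values merely reuse the 30 mm and 250 mm figure labels; which threshold each label belongs to is an assumption.

```python
import heapq
import math

def grow_hand_region(cloud, seed_index, connect_mm=30.0, max_geodesic_mm=250.0):
    """Grow a region from the seed; return {point index: geodesic distance}.

    Growth does not cross gaps larger than connect_mm and does not go
    beyond max_geodesic_mm of accumulated path length from the seed.
    """
    geodesic = {seed_index: 0.0}
    heap = [(0.0, seed_index)]
    while heap:
        dist, i = heapq.heappop(heap)
        if dist > geodesic.get(i, math.inf):
            continue                          # stale queue entry
        for j, q in enumerate(cloud):         # brute-force neighbor search
            step = math.dist(cloud[i], q)
            if j == i or step > connect_mm:
                continue                      # not connected to point i
            new_dist = dist + step
            if new_dist <= max_geodesic_mm and new_dist < geodesic.get(j, math.inf):
                geodesic[j] = new_dist
                heapq.heappush(heap, (new_dist, j))
    return geodesic

# Hypothetical cloud: a chain of points 20 mm apart plus one isolated point.
cloud = [(20.0 * i, 0.0, 0.0) for i in range(21)] + [(0.0, 100.0, 0.0)]
region = grow_hand_region(cloud, seed_index=0)
```

On a real depth image the neighbor search would normally be restricted to adjacent pixels rather than scanning the whole cloud; the full scan here only keeps the sketch short.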


Hand Region Segmentation
• A) Rough hand center:
    • The point with the maximum number of boundary points in its neighborhood
    • There should be more boundary points around the palm.
• B) Refined hand center:
• Mean-shift (one iteration)
(Figure labels: Ω, ε, 12 mm)


Hand Region Segmentation
• C) Hand center after mean-shift:
(Figure labels: Ω, ε)
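One mean-shift iteration can be sketched in Python as follows, assuming a flat kernel: the center moves to the mean of the hand points that lie within a fixed radius of the current center. The kernel choice, the radius value, and the name `mean_shift_step` are assumptions, since the slides show the result only as a figure.

```python
import math

def mean_shift_step(points, center, radius_mm):
    """One flat-kernel mean-shift step: move the center to the mean of
    the points within radius_mm of it (unchanged if none are in range)."""
    near = [p for p in points if math.dist(p, center) <= radius_mm]
    if not near:
        return center
    return tuple(sum(c) / len(near) for c in zip(*near))

# Hypothetical hand points in millimeters; the last one lies outside the window.
points = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (100, 0, 0)]
new_center = mean_shift_step(points, (5, 0, 0), radius_mm=20.0)
```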


Experimental Results


Experimental Results
• Resolution: 320 × 240
• 3 GHz Intel Core 2 Duo E8400
• Computational complexity:


Experimental Results


Experimental Results
• Ground truth vs. tracked position (in millimeters)


Conclusion


Conclusion
• Proposes a real-time hand tracking algorithm on depth images.
• Using:
    • Region growing
    • Geodesic distance
    • Mean-shift
• Can be further extended to two-hand tracking: