
EFFICIENT FACE AND GESTURE RECOGNITION TECHNIQUES FOR ROBOT CONTROL

Chao Hu, Xiang Wang, Mrinal K. Mandal, Max Meng, and Dong Li
Department of Electrical and Computer Engineering
University of Alberta, Edmonton, AB, T6G 2V4, Canada
Email: {hcsfds, mandal, max.meng}@ee.ualberta.ca

Abstract

This paper presents a visual recognition system for interactively controlling a mobile robot. First, the robot identifies the operator by human facial recognition, and then determines the actions by analyzing the human hand gestures. For facial recognition, an adaptive region-growing algorithm is proposed to estimate the location of the face region. A genetic algorithm is then applied to search for the accurate facial feature positions. For gesture recognition, we use adaptive color segmentation, hand finding and labeling with blocking, and morphological filtering; the gesture actions are found by template matching and skeletonizing. The results show a 95% correct recognition ratio, compared to less than 90% reported in other papers.

Keywords: Face and Gesture Recognition, Vision, Robot Control

1. INTRODUCTION

Mobile robots are becoming popular in automated applications, such as service robots in public places. Efficient techniques are needed to control the robots so that the desired tasks are performed. Typically, the robots are controlled by a human operator using input devices such as a keyboard, mouse, sensor gloves, or wireless controller, based on the field messages received from the video camera and other sensors of the robot. These methods have the drawback of indirect and unnatural communication between the operator and the robot. Hence it is desirable to develop more friendly and effective interactive tools between human operators and robots.

A few face and gesture recognition techniques have been proposed in the literature. Alattar et al. developed a model-based algorithm for locating facial features [1]. The algorithm estimates the parameters of the ellipse which best fits the head view in the image and uses these parameters to calculate the locations of the facial features. It then refines the locations by exploiting projections of the pixels in windows around the estimated locations of the features. Wu et al. presented an automatic feature extraction algorithm [2] that uses a second-chance region growing method to estimate the face region. A genetic search algorithm is then applied to extract the facial features. Fong et al. presented a virtual joystick technique with static gestures to drive a remote vehicle [7]. Here, hand motions are tracked with a color and stereo vision system. Moy proposed a technique for visual interpretation of 2-D dynamic hand gestures in complex environments. The technique is based on hand segmentation and feature extraction, and is used for humans to communicate and interact with a pet robot [8]. Iba et al. proposed an architecture of Hidden Markov Models (HMMs) for gesture-based control of mobile robots [9]. For robot control, the performance of these existing techniques is not satisfactory: some techniques have very high complexity, and others provide poor recognition.

In this paper, we present a novel visual human-machine interface for a mobile robot moving in public places. A two-step procedure is employed to detect the action control input from the operator. As shown in Fig. 1, the face of the operator is first identified from the captured image. The action gesture of the operator is then determined using a novel gesture recognition technique.

Figure 1. Robot control by face and gesture recognition.

The organization of this paper is as follows. In Section 2, we introduce the methods and procedures for human facial feature recognition. In Section 3, we present the gesture recognition technique for robot control. Results are also shown in Sections 2 and 3, respectively.


2. HUMAN FACE RECOGNITION

In order to identify the robot operator, we must locate the facial features. The recognition procedure can be divided into three steps, as shown in Fig. 2. The face region is located first, and then the facial feature locations are estimated by projection analysis. Based on the estimated locations, a genetic algorithm is applied to extract the accurate feature positions.

Figure 2. Steps for face recognition.

2.1 Preprocessing

In the face estimation step, the location of the face region is estimated by an adaptive region-growing algorithm. Because we assume the face region lies roughly in the center of the image, the algorithm can be performed by selecting the central point of the image as an initial seed. After the region is grown from the seed, its size must be checked to make sure that it is a reasonable face region. Assume that the initial seed is represented by S0, and the region grown from S0 is R. The size of R, denoted by |R|, should be restricted to a bounded range, that is, R1 <= |R| <= R2, where R1 and R2 are predefined constants. In practice, the size of the grown region depends on the predetermined growing threshold. When the contrast between the face region and the background is low and the threshold is set larger than a suitable value, the grown region will cover some background regions, and then |R| > R2 (see Fig. 3a). In this case, the threshold should be decreased to reduce |R|. Sometimes the central point of the facial image is located in a small bright or dark region; then R will be that region instead of the face region, and |R| < R1. In this case, the threshold should be increased to enlarge |R|.
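A minimal sketch of this adaptive region growing is given below (Python/NumPy). The homogeneity criterion (intensity difference to the seed value), the step size dt, and the bounds R1 and R2 are illustrative assumptions rather than the paper's values:

```python
import numpy as np
from collections import deque

def grow_region(gray, seed, thresh):
    """Grow a 4-connected region from `seed`, accepting pixels whose
    intensity differs from the seed value by less than `thresh`."""
    h, w = gray.shape
    seen = np.zeros((h, w), dtype=bool)
    seed_val = float(gray[seed])
    queue = deque([seed])
    seen[seed] = True
    region = []
    while queue:
        r, c = queue.popleft()
        region.append((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not seen[nr, nc] \
                    and abs(float(gray[nr, nc]) - seed_val) < thresh:
                seen[nr, nc] = True
                queue.append((nr, nc))
    return region

def estimate_face_region(gray, R1=2000, R2=20000, thresh=40.0, dt=5.0):
    """Adaptive region growing: shrink the threshold if the grown region is
    too large (|R| > R2), enlarge it if the region is too small (|R| < R1)."""
    seed = (gray.shape[0] // 2, gray.shape[1] // 2)   # central point as seed
    region = []
    for _ in range(20):                               # safety cap on iterations
        region = grow_region(gray, seed, thresh)
        if len(region) > R2:
            thresh -= dt      # region covers background: lower the threshold
        elif len(region) < R1:
            thresh += dt      # region stuck in a small blob: raise the threshold
        else:
            break             # |R| within [R1, R2]: accept as face region
    return region
```

Each pass grows a fresh region from the same central seed, so only the threshold adapts between iterations.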


[Figure: projection analysis — (a) Y-projection plotted against the Y-coordinate; (b) X-projection in the eye window.]
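The projection analysis used to estimate the feature locations sums pixel intensities along the rows and columns of the face region. A minimal sketch is shown below (Python/NumPy); treating the eye row as a minimum of the Y-projection in the upper half of the face is an illustrative assumption, not the paper's exact rule:

```python
import numpy as np

def projections(face_gray):
    """Return the Y-projection (one value per row) and the X-projection
    (one value per column) of a grayscale face region."""
    y_proj = face_gray.sum(axis=1)   # sum over columns -> profile along y
    x_proj = face_gray.sum(axis=0)   # sum over rows    -> profile along x
    return y_proj, x_proj

def estimate_eye_row(face_gray):
    """Rough estimate of the eye row: eyes and brows are dark, so they
    appear as a minimum of the Y-projection in the upper half of the face."""
    y_proj, _ = projections(face_gray)
    upper = y_proj[: len(y_proj) // 2]
    return int(np.argmin(upper))
```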


3. GESTURE RECOGNITION

The gestures should be clear, natural, and easy to demonstrate. For vehicle robot movement, we define seven 2-hand gestures corresponding to robot actions: Turn-left, Turn-right, Move-forward, Move-back, Increase-speed, Decrease-speed, and Stop (as shown in Fig. 10). The fist is a sign, used for activating the gesture and for the action Stop. The other 6 actions are determined by the direction formed by the index finger and the middle finger.

From X, Y, and Z we can get the HLS values:

    H = tan^-1(Y / X)
    L = Z
    S = 1 - min(R, G, B) / L

The segmentation is realized with a global threshold by controlling the absolute errors between the HLS values of the pixels and the preset average HLS values (Ha, La, Sa). Let G(x, y) denote the binary value of the segmented image (1: inside the segmented regions, 0: outside). G(x, y) is calculated as follows:

    G(x, y) = 1  if |H(x, y) - Ha| < H_T, |L(x, y) - La| < L_T, and |S(x, y) - Sa| < S_T
    G(x, y) = 0  otherwise
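A minimal sketch of this thresholding step is given below (Python); the standard RGB-to-HLS conversion from Python's colorsys is used as a stand-in for the conversion above, and the tolerance names H_T, L_T, S_T follow the reconstructed inequality rather than the original text:

```python
import colorsys
import numpy as np

def segment_hls(rgb, avg_hls, tol):
    """Binary segmentation G(x, y): 1 where |H-Ha|, |L-La| and |S-Sa| are all
    below their tolerances, 0 elsewhere.  `rgb` is an (h, w, 3) uint8 image,
    `avg_hls` = (Ha, La, Sa), `tol` = (H_T, L_T, S_T)."""
    Ha, La, Sa = avg_hls
    Ht, Lt, St = tol
    h, w, _ = rgb.shape
    G = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            r, g, b = rgb[y, x] / 255.0
            H, L, S = colorsys.rgb_to_hls(r, g, b)   # standard HLS conversion
            if abs(H - Ha) < Ht and abs(L - La) < Lt and abs(S - Sa) < St:
                G[y, x] = 1
    return G
```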


Figure 13. Adaptive segmentation: (a) original images; (b) segmented images.

Figure 14. Segmented images after morphological filtering.

3.2 Hand Finding and Labeling

The segmented image may have more than two bright regions, so we need to determine which regions represent the two hand regions. Here we design an effective and fast method called rectangle blocking. As shown in Fig. 15, the row and column lines with the maximum number of bright pixels are first found for each bright region. The hand rectangles are then found by extending the boundary outward from these maximum row and column lines.
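A sketch of rectangle blocking for a single bright region is given below (Python/NumPy). The stopping rule for extending the boundary, a fixed fraction of the peak count, is an assumption, since the exact rule is not stated:

```python
import numpy as np

def rectangle_block(region_mask, frac=0.1):
    """Find the hand rectangle for one bright region.  `region_mask` is a
    binary (h, w) array containing a single connected bright region."""
    row_counts = region_mask.sum(axis=1)   # bright pixels per row
    col_counts = region_mask.sum(axis=0)   # bright pixels per column
    r0 = int(np.argmax(row_counts))        # row line with maximum bright pixels
    c0 = int(np.argmax(col_counts))        # column line with maximum bright pixels

    # Extend the boundary outwards from the maximum row/column until the
    # bright-pixel count drops below a fraction of the peak (assumed rule).
    def extend(counts, start, peak):
        lo = hi = start
        while lo > 0 and counts[lo - 1] > frac * peak:
            lo -= 1
        while hi < len(counts) - 1 and counts[hi + 1] > frac * peak:
            hi += 1
        return lo, hi

    top, bottom = extend(row_counts, r0, row_counts[r0])
    left, right = extend(col_counts, c0, col_counts[c0])
    return top, bottom, left, right        # rectangle block boundaries
```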

Figure 15. (a) Rectangle block; (b) hand finding.

Of the two hands, one is just a fist, and we denote the other as the "index hand", which determines the robot action. The fist is smaller, and the pixels inside its rectangle block have a higher bright-to-dark pixel ratio than those of the index hand. Thus the hands are labeled as shown in Fig. 16.

Figure 16. Hand labeling: (a) hand labeling; (b), (c) index hand.
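A sketch of this labeling rule follows (Python/NumPy): the comparison by block area and by bright-to-dark pixel ratio follows the description above, while the fallback used when the two cues disagree is an added assumption:

```python
def label_hands(blocks, mask):
    """Given the two hand rectangle blocks (top, bottom, left, right) and the
    segmented binary image `mask`, return (fist_block, index_block)."""
    def stats(b):
        t, btm, l, r = b
        patch = mask[t:btm + 1, l:r + 1]
        area = patch.size
        bright = int(patch.sum())
        ratio = bright / max(area - bright, 1)   # bright-to-dark pixel ratio
        return area, ratio

    (a0, r0), (a1, r1) = stats(blocks[0]), stats(blocks[1])
    # The fist is the smaller block with the higher bright-to-dark ratio.
    if a0 < a1 and r0 > r1:
        return blocks[0], blocks[1]
    if a1 < a0 and r1 > r0:
        return blocks[1], blocks[0]
    # If the two cues disagree, fall back to size alone (assumption).
    return (blocks[0], blocks[1]) if a0 < a1 else (blocks[1], blocks[0])
```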

The real action is determined by the index hand when two hands are found in the image. In this case, we can simplify the problem of finding the gesture action to analyzing the index hand alone.

3.3 Recognition Methods

After the fist and the index hand are found, we analyze the index hand to recognize the gesture. Two methods are applied, skeletonizing and template matching, which are presented in the following.

3.3.1 Recognition by Skeletonizing

We use the Zhang-Suen transform for skeletonizing, and the performance is shown in Fig. 17. When the skeleton has no branch, the gesture curve is represented by 2 vectors, corresponding to the thumb and the fingers, using a linear fitting method. When there are branches, the three longest skeleton curves are converted to three vectors. Physically, these 3 vectors represent the fingers, the thumb, and the part of the palm toward the arm. Because the gesture action is determined by the fingers, the action rules can be derived by analyzing these vectors.

Figure 17. Vector representation of skeletons: (a) without branches; (b) with branches.
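A minimal sketch of the skeleton-to-vector step is given below; scikit-image's skeletonize is used as a stand-in for the Zhang-Suen transform, and a single fitted vector is computed for brevity rather than the two or three vectors described above:

```python
import numpy as np
from skimage.morphology import skeletonize

def skeleton_vector(hand_mask):
    """Skeletonize the index-hand region and fit one straight line to the
    skeleton points; the fitted direction approximates the finger vector."""
    skel = skeletonize(hand_mask.astype(bool))     # stand-in for Zhang-Suen
    ys, xs = np.nonzero(skel)
    if len(xs) < 2 or np.ptp(xs) == 0:             # too few points or vertical line
        return None
    a, b = np.polyfit(xs, ys, 1)                   # linear fitting y = a*x + b
    angle = np.degrees(np.arctan2(a, 1.0))         # pointing direction in degrees
    return a, b, angle
```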

The recognition performance is good: out of the 18 images, only one is incorrect, giving an efficiency of 94%. The error is due to a noise hole inside the index hand; hence it is necessary to remove any noise inside the index hand before applying the skeletonizing.

3.3.2 Template Matching

Eighteen training templates are created from the rectangle hand blocks after hand labeling. All training gesture blocks, with different widths and heights, are changed to square ones of 80 x 80 pixels using a geometrical transform (as shown in Fig. 18), where x' = x * j2/j1 and y' = y * i2/i1. The resulting images are shown in Fig. 19.

Figure 18. Geometrical transform.
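Read as a resampling rule, the transform can be implemented by mapping each pixel of the 80 x 80 target block back to a source pixel. A minimal sketch, assuming i1 x j1 is the original block size and i2 = j2 = 80 the target size (an interpretation, since the symbols are not defined here):

```python
import numpy as np

def to_square_block(block, size=80):
    """Rescale a hand block of height i1 and width j1 to a size x size square
    using nearest-neighbour sampling (the inverse of x' = x*j2/j1, y' = y*i2/i1)."""
    i1, j1 = block.shape
    ys = (np.arange(size) * i1 / size).astype(int)   # source row for each target row
    xs = (np.arange(size) * j1 / size).astype(int)   # source column for each target column
    return block[np.ix_(ys, xs)]
```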


In addition, we use 13 test images (j = 1, 2, ..., 13), also changed to the same square form. We match each test image with the 18 training templates Si (i = 1, 2, ..., 18). The matching error Rij between test image j and training template i is calculated as follows:

    (4)

Figure 19. Square templates.

The action of the training template that provides the minimum matching error is taken as the identified action. The results are perfect, with 100% correct decisions. This method provides an excellent recognition ratio if there are enough training templates. In addition, the computational complexity of this method is not high: using a C program, gesture recognition is done in about 0.8 seconds for the whole recognition procedure.
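The exact form of the matching error in Eq. (4) is not reproduced above; for binary 80 x 80 blocks, a natural reading is the number of pixels at which the test block and the template disagree. A sketch under that assumption:

```python
import numpy as np

def matching_error(test_block, template):
    """Matching error between two binary 80 x 80 blocks: the number of
    pixels at which they disagree (assumed form of Eq. (4))."""
    return int(np.sum(test_block != template))

def classify(test_block, templates, actions):
    """Return the action of the training template with minimum matching error."""
    errors = [matching_error(test_block, t) for t in templates]
    return actions[int(np.argmin(errors))]
```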

4. CONCLUSIONS

In this paper, we have presented efficient techniques for face and gesture recognition for robot control. The face recognition algorithm is robust to variations in subject head shape, eye shape, age, and motion such as tilting and nodding of the head. Because we use projection analysis to estimate the feature locations before applying the genetic algorithm, the computational complexity is reduced significantly, while the accuracy is better than that of other algorithms. For gesture recognition, the optimal procedure consists of HLS segmentation, morphological filtering, hand block labeling, the geometrical transform, and template matching. It provides a good correct recognition ratio, robustness, and speed. The simulation results demonstrate that the proposed techniques provide very good performance.

References

[1] A. M. Alattar and S. A. Rajala, "Facial features localization in front view head and shoulders images," Proc. of ICASSP, vol. 6, pp. 3557-3560, Phoenix, USA, 1999.
[2] C. H. Lin and J. L. Wu, "Automatic facial feature extraction by genetic algorithms," IEEE Trans. Image Processing, vol. 8, no. 6, pp. 834-845, June 1999.
[3] K. M. Lam and H. Yan, "Location and extraction of the eye in human face images," Pattern Recognition, vol. 29, no. 6, pp. 771-779, 1996.
[4] Olivetti Research Laboratory face database, http://www.uk.research.att.com/facedatabase.html
[5] K. S. Tang, K. F. Man, S. Kwong, and Q. He, "Genetic algorithms and their applications," IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 22-37, Nov. 1996.
[6] M. Srinivas and L. M. Patnaik, "Genetic algorithms: A survey," IEEE Computer, vol. 27, no. 6, pp. 17-26, June 1994.
[7] T. Fong, F. Conti, S. Grange, and C. Baur, "Novel interfaces for remote driving: Gesture, haptic and PDA," Proc. of SPIE, vol. 4195, pp. 300-311, Boston, 2001.
[8] M. C. Moy, "Gesture-based interaction with a pet robot," Proc. of the 16th National Conference on Artificial Intelligence, pp. 628-633, Orlando, USA, July 18-22, 1999.
[9] S. Iba, J. M. Vande Weghe, C. J. J. Paredis, and P. K. Khosla, "An architecture for gesture-based control of mobile robots," Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 2, pp. 851-857, Oct. 1999.
