Pattern Recognition 40 (2007) 1106-1122. www.elsevier.com/locate/pr

A survey of skin-color modeling and detection methods
P. Kakumanu, S. Makrogiannis, N. Bourbakis

ITRI/Department of Computer Science and Engineering, Wright State University, Dayton OH 45435, USA
Received 2 February 2005; received in revised form 3 February 2006; accepted 5 June 2006

    Abstract

Skin detection plays an important role in a wide range of image processing applications, from face detection, face tracking, gesture analysis and content-based image retrieval systems to various human-computer interaction domains. Recently, skin detection methodologies based on skin-color information as a cue have gained much attention, as skin color provides computationally effective yet robust information against rotations, scaling and partial occlusions. Skin detection using color information can be a challenging task, as the skin appearance in images is affected by various factors such as illumination, background, camera characteristics and ethnicity. Numerous techniques for skin detection using color are presented in the literature. In this paper, we provide a critical up-to-date review of the various skin modeling and classification strategies based on color information in the visual spectrum. The review is divided into three categories: first, we present the various color spaces used for skin modeling and detection. Second, we present different skin modeling and classification approaches; many of these works, however, are limited in performance under real-world conditions such as varying illumination and viewing conditions. To cope with rapidly changing illumination conditions, illumination adaptation techniques are applied along with skin-color detection. Third, we present various approaches that use skin-color constancy and dynamic adaptation techniques to improve skin detection performance under dynamically changing illumination and environmental conditions. Wherever available, we also indicate the various factors under which the skin detection techniques perform well.
© 2006 Published by Elsevier Ltd on behalf of Pattern Recognition Society.

    Keywords: Skin-color modeling; Skin detection; Color spaces and color constancy

    1. Introduction

Skin detection plays an important role in a wide range of image processing applications, from face detection, face tracking, gesture analysis and content-based image retrieval (CBIR) systems to various human-computer interaction domains. Recently, skin detection methodologies based on skin-color information as a cue have gained much attention, as skin color provides computationally effective yet robust information against rotations, scaling and partial occlusions. Skin color can also be used as complementary information to other features such as shape and geometry, and can be used to build accurate face detection systems [1-4]. Skin-color detection is often used as a preliminary step in face recognition, face tracking and CBIR systems. Skin-color information can be considered a very effective tool for identifying/classifying facial areas provided that the underlying skin-color pixels can be represented, modeled and classified accurately.

Corresponding author. E-mail addresses: [email protected] (P. Kakumanu), [email protected], [email protected], [email protected] (N. Bourbakis).

0031-3203/$30.00 © 2006 Published by Elsevier Ltd on behalf of Pattern Recognition Society. doi:10.1016/j.patcog.2006.06.010

Most of the research efforts on skin detection have focused on visible spectrum imaging. Skin-color detection in the visible spectrum can be a very challenging task, as the skin color in an image is sensitive to various factors such as:

Illumination: A change in the light source distribution and in the illumination level (indoor, outdoor, highlights, shadows, non-white lights) produces a change in the color of the skin in the image (the color constancy problem). Illumination variation is the most serious problem for current skin detection systems and can severely degrade performance.

Camera characteristics: Even under the same illumination, the skin-color distribution for the same person differs from one camera to another, depending on the camera sensor characteristics. The color reproduced by a CCD camera depends on the spectral reflectance, the prevailing illumination conditions and the camera sensor sensitivities.

Ethnicity: Skin color also varies from person to person across different ethnic groups and across different regions. For example, the skin color of people belonging to Asian, African, Caucasian and Hispanic groups differs from one another and ranges from white and yellow to dark.

Individual characteristics: Individual characteristics such as age, sex and body parts also affect the skin-color appearance.

Other factors: Different factors such as subject appearance (makeup, hairstyle and glasses), background colors, shadows and motion also influence skin-color appearance.

Many of the problems encountered in the visual spectrum can be overcome by using non-visual spectrum imaging such as infrared (IR) [5,6] and spectral imaging [7-9]. Skin color in non-visual spectrum methods is invariant to changes in illumination conditions, ethnicity, shadows and makeup. However, the expensive equipment necessary for these methods, combined with tedious setup procedures, has limited their use to specific application areas such as biomedical applications. In this paper, we concentrate on visual-spectrum skin detection techniques that are applicable to 2D images or single frames of video.

From a classification point of view, skin detection can be viewed as a two-class problem: skin-pixel vs. non-skin-pixel classification. The primary steps for skin detection in an image using color information are (1) to represent the image pixels in a suitable color space, (2) to model the skin and non-skin pixels using a suitable distribution and (3) to classify the modeled distributions. Several color spaces have been proposed and used for skin detection in the literature. The choice of color space also determines how effectively we can model the skin-color distribution. The skin-color distribution is modeled primarily either by histogram models or by single Gaussian/Gaussian mixture models. Several techniques for skin-color model classification, ranging from simple look-up table approaches to complex pattern recognition approaches, have been published. A survey of different color spaces for skin-color representation and skin-pixel detection methods is given by Vezhnevets et al. [10]. However, a comprehensive survey of up-to-date techniques on skin-color modeling and classification is still missing. The goal of this paper is to provide a critical review of different skin-color modeling and detection methods. Many of the existing skin detection strategies are not effective when the illumination conditions vary rapidly. Only a few skin detection strategies use color constancy and dynamic adaptation techniques to cope with changes in illumination conditions and viewing environment. In color constancy approaches, the images are first color corrected based on an estimate of the illuminant color. Skin-color modeling and detection are then applied to these preprocessed images. In dynamic adaptation techniques, the existing skin-color model is transformed to the changing illumination conditions. An up-to-date review of such illumination adaptation approaches for skin detection is also presented. The majority of the reported works on skin detection perform well only for a limited set of illumination conditions and skin types. Wherever available, we indicate the different factors under which the reported strategies perform well. We also report the skin detection performance in terms of true and false positives, if available. True positives represent the number of skin pixels correctly classified as skin pixels, while false positives represent the number of non-skin pixels classified as skin pixels.

The remainder of the paper is organized as follows: Section 2.1 gives a brief description of the popular color spaces used in skin-color detection. Skin modeling and classification techniques are described in Section 2.2. Skin detection strategies that use color constancy and dynamic adaptation techniques are reviewed in Section 3. Section 4 provides a summary and conclusions.

2. Skin-color modeling and classification

    2.1. Color spaces

The choice of color space can be considered the primary step in skin-color classification. The RGB color space is the default color space for most available image formats. Any other color space can be obtained via a linear or non-linear transformation from RGB. The color space transformation is assumed to decrease the overlap between skin and non-skin pixels, thereby aiding skin-pixel classification, and to provide parameters robust to varying illumination conditions. It has been observed that skin colors differ more in intensity than in chrominance [11]. Hence, it has been common practice to drop the luminance component for skin classification. Several color spaces have been proposed and used for skin detection. In this section, we review the most widely used color spaces for skin detection and their properties.

2.1.1. Basic color spaces (RGB, normalized RGB, CIE XYZ)

RGB is the most commonly used color space for storing and representing digital images, since the data captured by a camera is normally provided as RGB. R, G and B correspond to the three primary colors: red, green and blue, respectively. To reduce the dependence on lighting, the RGB color components are normalized so that the sum of the normalized components is unity (r + g + b = 1). Since the sum of these components is 1, the third component holds no significant information and is normally dropped so as to obtain a reduction in dimensionality. It has been observed that, under certain assumptions, the differences in skin-color pixels due to lighting conditions and ethnicity can be greatly reduced in normalized RGB (rgb) space. Also, the skin-color clusters in rgb space have relatively lower variance than the corresponding clusters in RGB and hence have been shown to be good for skin-color modeling and detection [11,12]. Due to these advantages, rgb has been a popular choice for skin detection and has been used by Bergasa et al. [13], Brown et al. [14], Caetano and Barone [15], Oliver et al. [16], Kim et al. [17], Schwerdt and Crowley [18], Sebe et al. [19], Soriano et al. [20], Störring et al. [21], Wang and Sung [22], Yang and Ahuja [12], Yang et al. [11]. The CIE (Commission Internationale de l'Eclairage) system describes color as a luminance component Y and two additional components X and Z. CIE XYZ values were constructed from psychophysical experiments and correspond to the color-matching characteristics of the human visual system [23]. This color space has been used by Brown et al. [14], Chen and Chiang [24], Wu et al. [25].
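The normalization step described above can be sketched directly; the pixel values below are illustrative (NumPy assumed), and the point of the example is that scaling a pixel's intensity leaves its normalized coordinates unchanged:

```python
import numpy as np

def normalize_rgb(rgb):
    """Map RGB values to normalized rgb with r + g + b = 1.

    Since the components sum to one, the third coordinate is
    redundant and is usually dropped in practice.
    """
    rgb = np.asarray(rgb, dtype=float)
    s = rgb.sum(axis=-1, keepdims=True)
    s[s == 0] = 1.0  # guard against division by zero for black pixels
    return rgb / s

# The same surface under dim and bright light maps to one chromaticity:
dim = normalize_rgb([120, 80, 60])
bright = normalize_rgb([240, 160, 120])
```

Here `dim` and `bright` agree up to floating-point error, which is exactly the intensity invariance that makes rgb attractive for skin modeling.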

2.1.2. Perceptual color spaces (HSI, HSV, HSL, TSL)

The perceptual features of color such as hue (H), saturation (S) and intensity (I) cannot be described directly by RGB. Many non-linear transformations have been proposed to map RGB onto perceptual features. The HSV space defines color as Hue, the property of a color that varies in passing from red to green; Saturation, the property of a color that varies in passing from red to pink; and Brightness (also called Intensity, Lightness or Value), the property that varies in passing from black to white. The transformation from RGB to HSV is invariant to high-intensity white lights, ambient light and surface orientations relative to the light source, and hence can be a very good choice for skin detection methods. The HSV color space has been used by Brown et al. [14], Garcia and Tziritas [26], McKenna et al. [27], Saxe and Foulds [28], Sobottka and Pitas [29], Thu et al. [30], Wang and Yuan [31], Zhu et al. [32]. A variation of the HS components using a logarithmic transformation, called Fleck HS, was introduced by Fleck and has been used by Zarit et al. [33]. Another similar color space is the TSL color space, which defines color as Tint (hue with white added), Saturation and Lightness. The TSL color space has been used by Terrillon et al. [34] and Brown et al. [14].
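One common RGB-to-HSV transform is available in Python's standard library; the sketch below (an arbitrary illustrative pixel, not one from the surveyed papers) shows the hue/saturation/value decomposition of a warm tone:

```python
import colorsys

# colorsys expects floats in [0, 1], so scale 8-bit values first.
r, g, b = 200 / 255, 140 / 255, 110 / 255  # an arbitrary warm tone
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(f"H = {h * 360:.0f} deg, S = {s:.2f}, V = {v:.2f}")
```

For this pixel the hue lands around 20 degrees, i.e. well inside the red-orange band where skin tones typically concentrate in H.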

2.1.3. Orthogonal color spaces (YCbCr, YIQ, YUV, YES)

The orthogonal color spaces reduce the redundancy present in the RGB color channels and represent color with statistically independent (or as independent as possible) components. As the luminance and chrominance components are explicitly separated, these spaces are a favorable choice for skin detection. The YCbCr space represents color as luminance (Y), computed as a weighted sum of RGB values, and chrominance (Cb and Cr), computed by subtracting the luminance component from the B and R values. The YCbCr space is one of the most popular choices for skin detection and has been used by Hsu et al. [35], Chai and Bouzerdoum [36], Chai and Ngan [37], Wong et al. [38]. A variant of YCbCr called YCgCr was used by de Dios and Garcia [39]. This color space differs from YCbCr in its use of the Cg component instead of the Cb component and was reported to perform better than YCbCr. Other similar color spaces in this category include YIQ, YUV and YES, which also represent color as luminance (Y) and chrominance. For skin detection, these color spaces have been used as follows: YIQ by Dai and Nakano [40]; YUV by Marques and Vilaplana [41]; YES by Saber and Tekalp [42] and Gomez et al. [43].
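A common concrete instance of this luminance/chrominance split is the ITU-R BT.601 full-range conversion; the survey does not fix particular coefficients, so the weights below are a representative choice rather than the ones used by the cited works:

```python
def rgb_to_ycbcr(r, g, b):
    """RGB -> YCbCr using the ITU-R BT.601 full-range convention.

    Y is a weighted sum of R, G and B; Cb and Cr are scaled
    differences of B and R from Y, offset to center on 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + 0.564 * (b - y)  # 0.564 is roughly 0.5 / (1 - 0.114)
    cr = 128 + 0.713 * (r - y)  # 0.713 is roughly 0.5 / (1 - 0.299)
    return y, cb, cr

# A neutral gray carries no chrominance, so Cb and Cr sit at 128:
y0, cb0, cr0 = rgb_to_ycbcr(100, 100, 100)
```

Because the Cb/Cr axes are offsets from luminance, a purely achromatic pixel collapses onto the center of the chrominance plane, which is what makes fixed CbCr skin boxes (Section 2.2.1) workable.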

2.1.4. Perceptually uniform color spaces (CIE-Lab and CIE-Luv)

Perceptual uniformity represents how two colors differ in appearance to a human observer, and hence uniform color spaces (UCS) were defined such that all colors are arranged by their perceptual difference. However, the perceptual uniformity in these color spaces is obtained at the expense of computationally heavy transformations. In these color spaces, the computation of the luminance (L) and the chroma (ab or uv) is obtained through a non-linear mapping of the XYZ coordinates. For skin detection, the CIE-Lab space has been used by Cai and Goshtasby [44] and Kawato and Ohya [45]. The CIE-Luv space has been used by Yang and Ahuja [12]. Farnsworth [46] proposed a more perceptually uniform system than Lab or Luv, and it has been used by Wu et al. [25].

2.1.5. Other color spaces

It has been observed that skin contains a significant level of red. Hence, some researchers [43,47] have used color ratios (e.g., R/G) to detect skin. Gomez et al. [43] used an attribute selection approach to select complementary color components from various color spaces. They showed that a mixture of color components (E of YES, the ratio R/G and H from HSV) performed better than the existing color spaces for indoor and outdoor scene images. The authors also argue that this new mixture space is insensitive to noise from a wide range of unconstrained sources and illumination conditions. Brand and Mason [47] evaluated the performance of color ratios against other algorithms on the Compaq data set [48]. They concluded that the combination of color ratios (R/G + R/B + G/B) provided a better response than the individual ratios.

2.2. Skin-color classification

From a classification point of view, skin-color detection can be viewed as a two-class problem: skin-pixel vs. non-skin-pixel classification. Different researchers have approached this problem with different techniques. The following sections give a brief description of the most common methods used.


2.2.1. Explicit skin-color space thresholding

The human skin colors of different individuals cluster in a small region of the color space, provided that the images are taken under illumination-controlled environments [11]. Hence, one of the easiest and most frequently used methods is to define skin-color cluster decision boundaries for different color space components. Single or multiple ranges of threshold values are defined for each color space component, and the image pixels whose values fall within these predefined range(s) for all the chosen color components are classified as skin pixels.

Dai and Nakano [40] used a fixed range on the I component in YIQ space for detecting skin pixels in images containing mostly people with yellow skin. The I component includes colors from orange to cyan. All pixel values in the range R_I = [0, 50] are classified as skin pixels in this approach. Sobottka and Pitas [29,49] used fixed range values in the HS color space. Pixel values in the ranges R_H = [0, 50] and R_S = [0.23, 0.68] are defined as skin pixels. These values were determined to be well suited for discriminating skin pixels from non-skin pixels on the M2VTS database, which contains images of yellow- and white-skinned people. Chai and Ngan [50] proposed a face segmentation algorithm in which they used a fixed-range skin-color map in the CbCr plane. Pixel values in the ranges R_Cb = [77, 127] and R_Cr = [133, 173] are defined as skin pixels on the ECU face and skin database. Garcia and Tziritas [26] segmented skin by using eight planes in YCbCr space or six planes in HSV space. Wang and Yuan [31] used threshold values in rg space and HSV space. Threshold values in the ranges R_r = [0.36, 0.465], R_g = [0.28, 0.363], R_H = [0, 50], R_S = [0.20, 0.68] and R_V = [0.35, 1.0] are used for discriminating skin and non-skin pixels. Yao and Gao [101] first transformed YUV space into skin chrominance and lip chrominance spaces and then used fixed range values in these spaces to detect skin and lip pixels, respectively. Wong et al. [38] used thresholding on the luminance level Y in YCbCr space. Tomaz et al. [51] also used thresholding in TS space. In Gomez and Morales [52], the authors start with rgb and the constant 1/3. A constructive induction algorithm is used to construct a number of three-component decision rules from these four components through simple arithmetic operations. The new color representation spaces thus found had better precision and success rate than skin probability maps (SPM) (Section 2.2.2), though the method is computationally slower than the SPM method.
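The CbCr rule of Chai and Ngan [50] quoted above is simple enough to state directly in code; a minimal sketch (NumPy assumed):

```python
import numpy as np

def skin_mask_cbcr(cb, cr):
    """Explicit-threshold rule of Chai and Ngan [50]: a pixel is skin
    when Cb lies in [77, 127] and Cr lies in [133, 173]."""
    cb, cr = np.asarray(cb), np.asarray(cr)
    return (77 <= cb) & (cb <= 127) & (133 <= cr) & (cr <= 173)

# First pixel falls inside the skin box, second falls outside:
mask = skin_mask_cbcr([100, 60], [150, 150])
```

Vectorizing the comparison this way lets the same rule run over the whole Cb and Cr planes of an image at once, which is why explicit thresholding is the fastest of the classifiers surveyed here.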

2.2.2. Histogram model with naïve Bayes classifiers

In this method, a 2D or 3D color histogram is used to represent the distribution of skin tones in color space. Color histograms are stable object representations unaffected by occlusion and changes in view, and can be used to differentiate a large number of objects [11]. The color space is quantized into a number of histogram bins. Each histogram bin (also viewed as a look-up table cell) stores the count of occurrences of the bin color in the training data set. The histogram bin counts are converted into a probability distribution P(c) as follows:

P(c) = count(c) / T,  (1)

where count(c) gives the count in the histogram bin associated with color c and T is the total count obtained by summing the counts in all the histogram bins. These values correspond to the likelihood that a given color belongs to skin. All pixel values for which the corresponding color likelihood is greater than a predefined threshold are classified as skin pixels. Zarit et al. [33] and Yoo and Oh [53] used a histogram-based approach to classify skin pixels.
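Eq. (1) amounts to a quantize-count-normalize pipeline; a sketch under the assumptions of 8-bit RGB input and 32 bins per channel (toy training pixels, not data from the surveyed papers):

```python
import numpy as np

def skin_histogram(pixels, bins=32):
    """Build a quantized 3D color histogram from (N, 3) training pixels
    and normalize counts to probabilities P(c) = count(c) / T (Eq. (1))."""
    pixels = np.asarray(pixels)
    width = 256 // bins                      # bin width per channel
    idx = pixels // width                    # quantize each channel
    hist = np.zeros((bins, bins, bins))
    np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return hist / hist.sum(), width

# Three identical "skin" samples and one dark outlier:
prob, w = skin_histogram(np.array([[200, 150, 130]] * 3 + [[20, 20, 20]]))
p = prob[200 // w, 150 // w, 130 // w]   # look-up for a query color
```

At classification time, P(c) for a pixel is a single table access after the same quantization, which is what makes the histogram method so cheap per pixel.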

Jones and Rehg [54] built a 3D RGB histogram model with two billion pixels collected from 18,696 web images. They reported that 77% of the possible RGB colors are not encountered and most of the histogram is empty. There is a marked skew in the distribution towards the red corner of the color cube due to the presence of skin in web images, though only 10% of the total pixels are skin pixels. This suggests that skin colors occur more frequently than other object colors. Jones and Rehg also computed two different histograms, skin and non-skin. Given skin and non-skin histograms, the probability that a given color belongs to the skin or non-skin class (the class-conditional probabilities) is defined as

P(c|skin) = s(c) / T_s,   P(c|non-skin) = n(c) / T_n,  (2)

where s(c) is the pixel count in the color-c bin of the skin histogram, n(c) is the pixel count in the color-c bin of the non-skin histogram, and T_s and T_n represent the total counts in the skin and non-skin histogram bins. From the generic skin and non-skin histograms, Jones and Rehg demonstrated that there is a reasonable separation between the skin and non-skin classes. This fact can be used to build fast and accurate skin classifiers even for images collected from unconstrained imaging environments such as web images, provided the training dataset is sufficiently large. (A larger training set leads to better probability density function estimates.) Given the class-conditional probabilities of the skin and non-skin color models, a skin classifier can be built using the Bayes maximum likelihood (ML) approach [55]. Using this, a given image pixel can be classified as skin if

P(c|skin) / P(c|non-skin) >= θ,  (3)

where 0 <= θ <= 1 is a threshold value which can be adjusted to trade off between true positives and false positives. This threshold is normally determined from the ROC (receiver operating characteristic) curve calculated from the training data set. The ROC curve shows the relationship between the true positives and false positives as a function of the detection threshold θ. The histogram-based Bayes classifier (also called the skin probability map, or SPM) has been widely used for skin segmentation. The method is simple and computationally fast, as only two table look-ups are needed to compute the probability of skin. It has been used by Brand and Mason [47], Chai and Bouzerdoum [36], Fleck et al. [56], Gomez and Morales [52], Gomez et al. [43], Jones and Rehg [54], Marcel and Bengio [57], Schwerdt and Crowley [18], Sigal et al. [95], Srisuk and Kuritach [58], Zarit et al. [33].
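Putting Eqs. (2) and (3) together, the SPM classifier reduces to two look-ups and one ratio test per pixel. The sketch below uses toy 1D histograms over four hypothetical color bins, and an arbitrary θ of 0.4 (in practice θ is read off the ROC curve):

```python
import numpy as np

def bayes_skin_rule(bin_idx, skin_hist, nonskin_hist, theta=0.4):
    """Bayes ML rule of Eq. (3): skin iff P(c|skin)/P(c|non-skin) >= theta."""
    p_skin = skin_hist[bin_idx] / skin_hist.sum()        # Eq. (2), skin
    p_non = nonskin_hist[bin_idx] / nonskin_hist.sum()   # Eq. (2), non-skin
    if p_non == 0:
        return bool(p_skin > 0)
    return bool(p_skin / p_non >= theta)

skin = np.array([1.0, 5.0, 3.0, 1.0])   # toy skin histogram counts
non = np.array([6.0, 1.0, 1.0, 2.0])    # toy non-skin histogram counts
labels = [bayes_skin_rule(b, skin, non) for b in range(4)]
```

Only bin 0, where non-skin colors dominate, fails the ratio test; raising θ would reject more bins and trade false positives for missed skin.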

2.2.3. Gaussian classifiers

Many of the representative works on skin-color distribution modeling have used Gaussian mixtures. The advantage of these parametric models is that they can generalize well with less training data and also have modest storage requirements.

2.2.3.1. Single Gaussian models (SGM). Under controlled illumination conditions, the skin colors of different individuals cluster in a small region of the color space. Hence, under certain lighting conditions, the skin-color distribution of different individuals can be modeled by a multivariate normal (Gaussian) distribution in normalized color space [11,59]. The skin-color distribution is modeled through an elliptical Gaussian joint probability density function (pdf), defined as

p(c) = 1 / ((2π)^(d/2) |Σ|^(1/2)) · exp[−(1/2) (c − μ)^T Σ^(−1) (c − μ)],  (4)

where c is a d-dimensional color vector, and μ and Σ are the mean vector and the diagonal covariance matrix, respectively,

μ = (1/n) sum_{j=1..n} c_j,   Σ = (1/(n − 1)) sum_{j=1..n} (c_j − μ)(c_j − μ)^T.  (5)

The parameters μ and Σ are estimated over all the color samples c_j from the training data using the ML estimation approach. The probability p(c) can be used directly as a measure of skin-color likeliness, and the classification is normally obtained by comparing it to a threshold value estimated empirically from the training data [17,60,61]. Alternatively, we can compare the Mahalanobis distance λ of the image pixel color c [35,62-64] to a threshold. This threshold value is determined from the ROC curve (Section 2.2.2) calculated from the training data:

λ = (c − μ)^T Σ^(−1) (c − μ).  (6)

2.2.3.2. Gaussian mixture models (GMM). Though the human skin-color samples of people of different races cluster in a small region of the color space, it has been shown that different modes co-exist within this cluster, and hence it cannot be modeled effectively by a single Gaussian distribution [12]. Also, under varying illumination conditions, the single-mode assumption does not hold. Many researchers have therefore used Gaussian mixtures, a more capable model for describing complex-shaped distributions. A Gaussian mixture density function is a weighted sum of individual Gaussians, expressed as

p(c) = sum_{i=1..N} w_i · 1 / ((2π)^(d/2) |Σ_i|^(1/2)) · exp[−(1/2) (c − μ_i)^T Σ_i^(−1) (c − μ_i)],  (7)

where c is a color vector, μ_i and Σ_i are the mean and the diagonal covariance matrix of the ith component, N is the number of Gaussians and the weight factor w_i is the contribution of the ith Gaussian. The parameters of a GMM (μ_i, Σ_i and w_i) are approximated from the training data through the iterative expectation-maximization (EM) technique. To converge well, the EM technique needs a good initial guess of the parameters; these initial parameters can be obtained by k-means clustering of the training data. A detailed description of the EM technique applied to skin GMMs is given in Yang and Ahuja [12]. The skin classification is performed using the same methodology as described in the SGM case.

The choice of the number of Gaussian components N is critical and depends on the training data and the choice of color space. Different researchers have used values of N ranging from 2 to 16. Yang and Ahuja [12] used two Gaussians in Luv color space on a Michigan face database. They provided statistical tests to justify their hypothesis that a SGM is not sufficient to model the skin distribution for the dataset considered. Greenspan et al. [65] also provided statistical tests to show that a GMM is a robust representation that can accommodate large variations in color space, highlights and shadows. In their trained two-component GMM, one component captured the distribution of normally lit skin color while the other captured the distribution of the more highlighted regions of the skin. Caetano et al. [66] used 2-8 Gaussians in rg color space on images containing both black and white people. Lee and Yoo [63] used six Gaussians for skin classification on the Compaq dataset, whereas Thu et al. [30] used four Gaussian components. The key idea behind using multiple components is that different parts of the face, illuminated in different manners, can be detected by different components. Jones and Rehg [54] trained two separate models for the skin and non-skin classes, using 16 Gaussians in each. Cai and Goshtasby [44], Jebara and Pentland (1998), McKenna et al. [27] and Oliver et al. [16] also used GMMs for skin classification.
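Eq. (7) is a weighted sum of Gaussian densities; a direct sketch with a hypothetical two-component (Cb, Cr) model, loosely in the spirit of the normal-light/highlight split observed by Greenspan et al. [65] (all parameter values here are invented for illustration):

```python
import numpy as np

def gmm_density(c, weights, means, covs):
    """Evaluate the mixture density of Eq. (7) at color vector c."""
    c = np.asarray(c, dtype=float)
    d = c.size
    p = 0.0
    for w, mu, cov in zip(weights, means, covs):
        diff = c - mu
        norm = 1.0 / ((2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov)))
        p += w * norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)
    return p

weights = [0.7, 0.3]                                   # w_i, summing to 1
means = [np.array([105.0, 150.0]), np.array([115.0, 140.0])]
covs = [np.diag([25.0, 25.0]), np.diag([64.0, 64.0])]  # diagonal, as in the text
p_center = gmm_density([105, 150], weights, means, covs)
p_far = gmm_density([200, 60], weights, means, covs)   # essentially zero
```

In practice the w_i, μ_i and Σ_i come from EM initialized by k-means, and p(c) is thresholded exactly as in the SGM case.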

2.2.4. Elliptical boundary model

As an alternative to the computationally intensive GMMs, Lee and Yoo [63] proposed an elliptical boundary model whose performance is comparable to that of a GMM, yet whose training is computationally as simple as that of a SGM. The elliptical boundary model is defined as

Φ(c) = [c − ψ]^T Λ^(−1) [c − ψ],  (8)

where c is the color vector, and ψ and Λ are the model parameters, defined as

ψ = (1/n) sum_{i=1..n} c_i,   Λ = (1/N) sum_{i=1..n} f_i (c_i − μ)(c_i − μ)^T,  (9)

where N is the total number of samples in the training data set, n is the number of distinct chrominance values, f_i is the number of samples with chrominance c_i and μ is the mean of the chrominance vectors in the training data set. The pixel with chrominance c is classified as a skin pixel if Φ(c) falls below a threshold chosen from the training data.
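Eqs. (8) and (9) can be sketched as follows; note the asymmetry in the model: ψ is an unweighted mean over the n distinct chrominance values, while Λ is a count-weighted scatter about the weighted mean μ. The data below are toy values (NumPy assumed):

```python
import numpy as np

def fit_elliptical(colors, counts):
    """Model parameters of Eq. (9) from the distinct chrominance values
    `colors` (n, 2) and their sample counts `counts` (n,)."""
    colors = np.asarray(colors, float)
    counts = np.asarray(counts, float)
    N = counts.sum()                                   # total sample count
    psi = colors.mean(axis=0)                          # unweighted mean
    mu = (counts[:, None] * colors).sum(axis=0) / N    # weighted mean
    diff = colors - mu
    lam = (counts[:, None, None] *
           np.einsum('ij,ik->ijk', diff, diff)).sum(axis=0) / N
    return psi, lam

def phi(c, psi, lam):
    """Phi(c) of Eq. (8); the pixel is skin when Phi(c) is below a threshold."""
    d = np.asarray(c, float) - psi
    return float(d @ np.linalg.inv(lam) @ d)

psi, lam = fit_elliptical([[100, 150], [110, 160], [90, 145]], [2, 1, 1])
inside = phi([100, 152], psi, lam)   # near the cluster: small Phi
outside = phi([200, 60], psi, lam)   # far from it: large Phi
```

Since only ψ and Λ need to be stored and Φ(c) is a single quadratic form, classification costs about the same as a single-Gaussian Mahalanobis test.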


The parameters of the MaxEnt model are then expressed analytically as functions of histograms of the features. Finally, a Gibbs sampler algorithm was used for inferring the probability of skin for each pixel. The details of the procedure can be obtained from Jedynak et al. [72]. However, the number of parameters in this model is very large: assuming that a color in RGB space can take 256^3 values, the total number of parameters in this model is 256^3 × 256^3 × 2^2. Hence the training of the model is very time consuming. To reduce the training time, the authors propose the use of a belief propagation algorithm [73]. On the Compaq database, the detection rate of this model is 82.9% at a false positive rate of 10%.

2.2.8. Bayesian network (BN) classifier

Bayesian networks are directed acyclic graphs that allow efficient and effective representation of joint probability density functions. Each vertex in the graph represents a random variable, and edges represent direct correlations between the variables [74]. Two examples of popular Bayesian network classifiers are the Naïve Bayes (NB) classifier and the Tree-Augmented Naïve Bayes (TAN) classifier. The NB classifier assumes that all the features are conditionally independent given the class label, though this is not always true. To enhance performance over the NB classifier, Friedman et al. [74] proposed the TAN classifier, which also considers the correlations between the variables. For learning the TAN classifier, we do not fix the structure of the Bayesian network, but instead search for the TAN structure that maximizes the likelihood function given the training data out of all possible TAN structures. In general, searching for the best tree structure has no efficient solution. However, Friedman et al. [74] showed that the best TAN structure can be computed in polynomial time.

Sebe et al. [19] used a BN for skin modeling and classification. One of the problems with pattern recognition approaches is the availability of labeled training data. The authors propose a new method for learning the structure of the BN with both labeled and unlabeled data. They used a stochastic structure search (SSS) algorithm for learning the structure of the BN, which improves the performance over NB and TAN classifiers. The details of the procedure can be obtained from Sebe et al. [19]. With a training set of only 60,000 samples (600 labeled + 54,000 unlabeled) from the Compaq dataset, the proposed approach has detection rates of 95.82% and 98.32% at only 5% and 10% false positives, respectively.

2.3. Comparison of skin-color classifiers

A good skin classifier should be able to detect different skin types (white, pink, yellow, brown and dark) under a wide variety of illumination conditions (white, non-white, shadows, indoor and outdoor) and different backgrounds. Many of the skin detection techniques consider only a few skin types and a few possible illumination conditions. Table 1 summarizes the skin detection strategies discussed in Section 2.2. It also lists the various illumination conditions and the different skin types reported for evaluating the performance of these classifiers. The performance of the skin classifiers in terms of true positive rate (TPR) and false positive rate (FPR) is also reported. Many of the reported performances were obtained on different datasets. Hence, it is not possible to obtain a fair evaluation of all these methods, as they are not evaluated on common representative training and test datasets.

Some of the works report performance on the Compaq and ECU skin/non-skin datasets. The Compaq dataset [54] consists of 13,640 color images collected from the World Wide Web. Of these, 4675 images contain skin pixels, while the remaining 8965 images do not. Since the images were collected from the web, they contain skin pixels belonging to persons of different origins under unconstrained illumination and background conditions. The ECU database [75] consists of 4000 color images: about 1% of these images were taken with a digital camera and the rest were collected manually from the Web. The lighting conditions in these images include indoor and outdoor lighting with varied backgrounds, and the skin types include whitish, brownish, yellowish and darkish skins.

The decision boundaries in the explicit thresholding method are fixed and determined empirically from skin-color images. On the Compaq database, thresholding of the I axis in YIQ has a detection rate of 94.7% (FPR 30.2%) [14]. On the ECU database, thresholding in YCbCr space has a detection rate of 82.0% (FPR 18.7%) [75]. It should be noted that this method achieves a good skin detection rate at the expense of high false positives. The explicit thresholding method has the advantage of being simple and fast. However, it has many limitations. The fixed threshold values differ from one color space to another and from one illumination to another. It is very difficult to find a range of threshold values that covers all subjects of different skin color. The technique is less accurate in the presence of shadows and in situations where the skin color is not distinguishable from the background. In most situations, the explicit thresholding technique for skin segmentation is less precise and hence is normally followed by a dynamic adaptation approach (Section 3.2).

The performance of the histogram technique is affected by the degree of overlap between the skin and non-skin classes in a given color space and by the choice of the detection threshold. The histogram technique has detection rates of 90.0% (FPR 14.2%) and 88.9% (FPR 10%) on the Compaq and ECU databases, respectively. As we can observe from the table, these rates are slightly higher than those of the GMM or MLP techniques on the corresponding datasets. However, due to its inability to interpolate and generalize the data, the histogram method needs a very large training dataset to achieve a good classification rate. This method also has high storage requirements. For example, a 3D RGB histogram with 256 bins per channel requires 2^26 bytes (64 MB) of storage, assuming one 4-byte integer per bin. One of

Table 1
Performance of different skin detection strategies

Authors                      | Color space | Intensity | Detection method           | Test database                    | TPR (%) | FPR (%)
Jones and Rehg, 02           | RGB         | Yes       | Bayes                      | Compaq                           | 90      | 14.2
Jones and Rehg, 02           | RGB         | Yes       | GMM (16)                   | Compaq                           | 90      | 15.5
Brown et al., 01             | TSL         | No        | SOM                        | Compaq                           | 78      | 32
Jedynak et al., 03           | RGB         | Yes       | MaxEnt model               | Compaq                           | 82.9    | 10
Lee and Yoo, 02              | XYZ         | No        | Elliptical model           | Compaq                           | 90      | 20.9
Lee and Yoo, 02              | YCbCr       | No        | SGM                        | Compaq                           | 90      | 33.3
Lee and Yoo, 02              | YIQ         | No        | GMM (6)                    | Compaq                           | 90      | 30
Brand et al., 00             | RGB         | Yes       | Bayes                      | Compaq                           | 93.4    | 19.8
Brand et al., 00             | YIQ         | Yes       | I-axis threshold           | Compaq                           | 94.7    | 30.2
Brand et al., 00             | RGB         | Yes       | Threshold ratios           | Compaq                           | 94.7    | 32.3
Sebe and Huang, 04           | rgb         | No        | BN                         | Compaq                           | 99.4    | 10
Fu and Yang, 04              | HSV         | Yes       | GMM (14) + hist. merging   | Compaq                           | n/a     | n/a
Phung et al., 05             | YCbCr       | Yes       | Thresholding               | ECU                              | 82.0    | 18.7
Phung et al., 05             | RGB         | Yes       | Bayes                      | ECU                              | 88.9    | 10
Phung et al., 05             | RGB         | Yes       | MLP                        | ECU                              | 88.5    | 10
Phung et al., 05             | YCbCr       | Yes       | SGM                        | ECU                              | 88.0    | 10
Phung et al., 05             | YCbCr       | Yes       | GMM                        | ECU                              | 85.2    | 10
Anagnostopoulos et al., 03   | RGB         | Yes       | Fuzzy rules + PNN          | 317 images                       | 82.4    | n/a
Yang and Ahuja, 99           | Luv         | No        | GMM (2)                    | Michigan Face Database           | n/a     | n/a
Caetano et al., 02           | rgb         | No        | SGM                        | 800 images (web + Stirling)      | n/a     | n/a
Caetano et al., 02           | rgb         | No        | GMM (2)                    | 800 images (web + Stirling)      | 87      | 30
Seow et al., 03              | RGB         | Yes       | NN                         | ODU                              | n/a     | n/a
Greenspan and Goldberger, 01 | rgb         | No        | GMM (2)                    | 682 images (AR + AH + video)     | n/a     | n/a
Thu and Meguro, 02           | HSV         | Yes       | GMM (4) + multi-threshold  | n/a                              | n/a     | n/a
Jayaram et al., 04           | SCT         | Yes       | Bayes                      | 805 images (AR + UOPB + UW CBIR) | 98.2    | n/a
Jayaram et al., 04           | SCT         | Yes       | SGM                        | 805 images (AR + UOPB + UW CBIR) | 94.4    | n/a
Marcel and Bengio, 02        | RGB         | Yes       | Histogram + MLP            | XM2VTS                           | n/a     | n/a

the other important factors to consider in this method is the number of histogram bins. A 3D RGB histogram with 256 bins per channel has 256^3 bins. Due to the sparse distribution of skin points in RGB space, the color cube can be quantized more coarsely, thereby reducing the number of bins in the histogram and yielding a more compact model. Jones and Rehg [54] found that a 32-bin histogram performed better than a 256-bin RGB histogram on the Compaq dataset. However, the number of bins that gives the best performance varies with the color space representation and the size of the training dataset. Phung et al. [75] found that, on the ECU database, a 256-bin histogram is more sensitive to the size of the training data than a 32-bin histogram; the 256-bin histogram performed better for large datasets. They also reported, however, that the performances of 64-, 128- and 256-bin histograms are almost identical.
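
A reduced-bin histogram Bayes classifier of the kind compared above can be sketched as follows. This is a toy version, not the exact implementation of [54] or [75]; with `bins=32` the model needs 32^3 = 32,768 bins instead of 2^24:

```python
import numpy as np

def build_hist(pixels, bins=32):
    """3-D color histogram: quantize each 8-bit channel into `bins` bins."""
    idx = (pixels // (256 // bins)).astype(int)   # per-channel bin index
    h = np.zeros((bins, bins, bins))
    np.add.at(h, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return h / max(h.sum(), 1)                    # normalize to P(c | class)

def skin_posterior(pixels, h_skin, h_nonskin, prior_skin=0.5, bins=32):
    """P(skin | c) from the two class-conditional histograms via Bayes rule."""
    idx = (pixels // (256 // bins)).astype(int)
    ps = h_skin[idx[:, 0], idx[:, 1], idx[:, 2]] * prior_skin
    pn = h_nonskin[idx[:, 0], idx[:, 1], idx[:, 2]] * (1 - prior_skin)
    return ps / np.maximum(ps + pn, 1e-12)
```

Thresholding the posterior at different values traces out the TPR/FPR trade-off discussed throughout this section.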

Caetano et al. [62] compared the performance of SGM and GMMs (2-8 components) on a dataset of 800 images containing people from a large spectrum of ethnic groups. The skin models were trained on 550 images collected from the web and the Stirling Face Database. For the dataset considered, the performance of GMMs with different numbers of components was similar. The performance of the SGM is similar to that of the GMMs at low FPR; however, for medium and high TPR, the GMMs performed better. From these results, they suggested that mixture models may be more appropriate than a single Gaussian model when high correct detection rates are required. Similar results were obtained by Lee and Yoo [63] on the Compaq dataset. However, as we can observe from the table, on the ECU dataset the performance of the Gaussian mixture is lower than that of the unimodal Gaussian; in this case, the Gaussian mixture was not trained on the ECU dataset, but its parameters were taken from Jones and Rehg [54]. It should be noted that histogram models were found to be slightly superior to GMMs in terms of skin-pixel classification performance [54,75,76]. Nevertheless, GMMs have been a popular choice for skin-color segmentation as they generalize well with less training data. Although mixture models approximate the skin-color distribution effectively, the initialization and iterative training of a GMM are computationally expensive, especially with large datasets. The mixture model is also slower to use during classification, since all the Gaussians must be evaluated to compute the probability of a single color value. To reduce the training and recall costs of GMMs, Fu et al. [77] proposed the use of multidimensional histograms in the EM framework to group neighboring data points and reduce the size of the dataset. The conventional EM algorithm is computationally expensive as it considers individual data points one by one, exhaustively. However, since overlapping or closely distributed data points have nearly equal probabilities, all these points can be grouped and the neighboring points treated as a single data point with multiple occurrences. To represent the dataset, the authors use multidimensional histograms, and the EM algorithm is modified accordingly. This method reduces the training time drastically. For

example, Jones and Rehg [54] spent about 24 hours training the skin and non-skin GMMs with 16 components on the Compaq database using 10 workstations in parallel. With the multidimensional histogram technique of Fu et al. [77], a 15-component skin GMM took only 250 s to train on the same Compaq database on a Pentium IV 3.2 GHz workstation.
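
The data-grouping idea can be sketched as a weighted EM: bin the pixels, then run EM on (bin center, count) pairs so each distinct color is visited once per iteration. The version below is a deliberately simplified sketch with spherical covariances and deterministic initialization, not Fu et al.'s exact algorithm:

```python
import numpy as np

def em_gmm_weighted(points, weights, k=2, iters=50):
    """EM for a spherical GMM where each point carries a multiplicity weight.

    `points` are bin centers (N x 3) and `weights` the bin counts; running
    EM on these pairs instead of on raw pixels is the data-grouping idea.
    Spherical covariances keep the sketch short.
    """
    points = np.asarray(points, float)
    weights = np.asarray(weights, float)
    # Deterministic init: spread the initial means across the data.
    mu = points[np.linspace(0, len(points) - 1, k).astype(int)].copy()
    var = np.full(k, points.var() + 1e-3)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities from (unnormalized) spherical Gaussians.
        d2 = ((points[:, None, :] - mu[None]) ** 2).sum(-1)   # N x k
        resp = pi * np.exp(-0.5 * d2 / var) / var ** 1.5
        resp /= resp.sum(1, keepdims=True) + 1e-12
        # M-step: every statistic is weighted by the bin multiplicity.
        wr = resp * weights[:, None]
        nk = wr.sum(0) + 1e-12
        mu = (wr[..., None] * points[:, None, :]).sum(0) / nk[:, None]
        var = (wr * d2).sum(0) / (3.0 * nk) + 1e-6
        pi = nk / nk.sum()
    return mu, var, pi
```

Because N drops from the number of pixels to the number of occupied bins, each EM iteration touches far fewer points, which is the source of the reported speed-up.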

In Phung et al. [75], the authors compared the performance of an MLP network with Bayesian, Gaussian and explicit-threshold skin classifiers on the ECU database. As we can observe from Table 1, the Bayesian and MLP classifiers have similar performance and outperform the other classifiers. However, compared with the Bayesian technique, the MLP has very low storage requirements and hence is a better candidate when storage is also a consideration. On the Compaq dataset, the Bayesian network of Sebe et al. [19] has superior performance; this method also has the advantage of requiring very little labeled data.

To characterize one of the methods described in Table 1 as a winner, we need to consider many other factors, such as the sizes of the train and test sets used. The size of the training set and the variety of its samples have a direct impact on classifier performance. Also, some skin classification methods require extensive training and fine-tuning of parameters to achieve optimal performance. The training time is often ignored; however, it may be important for real-time applications that require online training on different datasets. It should be noted that the evaluation criteria depend on the purpose of the skin classifier. For example, if the skin classifier is used as a pre-processing step in face detection, then the system may prefer high true positives at the expense of high false positives. In such systems, the false positives can be reduced as described in Section 3, or by using multiple features such as texture, shape and motion along with color information. However, if the skin classifier alone is used to identify faces, then attaining both high TPR and low FPR is very important.

    2.4. Comparison of color space representations

Several color spaces have been proposed and used for skin detection. Table 1 lists the various color spaces used by different skin classifiers. As indicated in Table 1, some methods drop the intensity component so as to obtain parameters that are robust against various illumination conditions. Given the different color spaces, choosing the most appropriate one for skin classification is a very difficult task.

To the best of our knowledge, the first comparison of color spaces was made by Littman and Ritter [78]. They compared a neural approach based on linear maps for skin color with normal distributions in three different color spaces (RGB, YIQ and YUV) on a small dataset containing hand images collected indoors. They reported that the performance is largely independent of the color space representation and that the neural approach gives the better performance. Zarit et al. [33]


compared five different color spaces (CIE Lab, HSV, Fleck HS, rgb, YCbCr) and two skin detection methods (look-up table and Bayes classifier) on a very limited dataset of 48 training and 64 test images. The performance of the color spaces with the Bayes classifier was similar, while the look-up table method performed best when used with HSV or Fleck HS. They concluded that HS-based color spaces provide better results. Terillion et al. [102] evaluated nine different chrominance spaces (normalized rg, CIE-xy, CIE-DSH, TSL, HSV, YES, YIQ, CIE-Lab, CIE-Luv) for skin segmentation with SGMs and GMMs on a dataset of 100 images with 144 faces and 65 subjects (30 Asians, 34 Caucasians, 1 African). They reported that normalized color spaces produced better discrimination between skin and non-skin distributions with SGMs, whereas for un-normalized color spaces comparable performance was achieved only with GMMs. They concluded that the TSL space provided the best results.

Albiol et al. [79] provided a theoretical proof that for every color space there is an optimum skin detector with comparable performance, provided that there is an invertible transformation between the color spaces. Skin detection using a Bayes classifier in three color spaces (RGB, YCbCr, HSV) was compared on a database of 200 images, and the performance was found to be similar. They also showed that the performance in a 3D color space (YCbCr) is better than in the corresponding 2D space (CbCr), as the transformation from the 3D to the 2D space is not invertible.

Many of the above comparisons were performed on very limited datasets. To rank the color spaces based on their separability between skin and non-skin pixels, however, the datasets considered must include a very large number of skin and non-skin samples. The skin pixels must include a large number of subjects from different races, with few restrictions on subject appearance, and must cover a wide range of illumination conditions, both indoor and outdoor. Shin et al. [80] evaluated the separability of skin and non-skin clusters in nine different color spaces (CIE-Lab, CIE-XYZ, HSI, normalized RGB, RGB, SCT, YCbCr, YIQ, YUV) using metrics derived from scatter matrices and skin/non-skin histograms. They used a database of 805 images with different skin tones and illuminations. Out of these, 507 skin images were collected from the AR face database [81] and the University of Oulu Physics database (UOPB) [82], and 298 non-skin images were collected from the University of Washington CBIR database [83]. They found that the separability between skin and non-skin clusters is highest in the RGB color space (i.e., without color transformation). Also, dropping the luminance component and transforming the 3D color space to 2D significantly decreases the separability, thereby degrading the skin segmentation results. Phung et al. [75] performed the same evaluation on the ECU face and skin database with the histogram technique and reached the same conclusions as Shin et al. [80]. In Jayaram et al. [76], the same group [80] compared different color spaces, but with two different skin classification approaches: histogram and Gaussian

approaches. In this work, they found that the choice of color modeling technique makes a significant difference, and that the choice of color transformation improves the performance, but not consistently. Fu et al. [77] compared the performance of four color spaces (RGB, HSV, YCbCr and rg) with GMMs. They found that rg performed the worst, due to its transformation from 3D to 2D, while the HSV color space, in which the chrominance and luminance information are decorrelated, performed the best. Their results also suggest that the choice of skin modeling technique plays an important role in the performance attainable in a particular color space.

The most important criterion for the performance of any given skin classifier is the degree of overlap between the skin and non-skin clusters in a color space. From the above discussion, we can conclude that non-parametric models such as the histogram-based Bayes classifier are not affected by the color transformation, as the degree of overlap is unaffected by a lossless mapping from one color space to another. However, parametric modeling techniques such as Gaussian modeling are affected by the choice of color space. It should be noted that parametric models are also affected by the amount and quality of the available training data. The choice of an appropriate color space should also depend on the available image format and on whether a particular color space is needed in post-processing steps; for example, some non-linear color space transformations are too computationally expensive to be used in real time. Many of the existing techniques drop the luminance component to reduce illumination effects in skin detection, without supporting evidence. As the results above suggest, ignoring the luminance component degrades skin detection performance.

    3. Illumination adaptation approaches

One of the most important factors for the success of any skin-color model is its ability to adapt to changes in the lighting and viewing environment. The skin-color distribution of the same person differs under different lighting conditions. Even under the same lighting, background colors and shadows may influence skin-color appearance. Furthermore, if a person is moving, the apparent skin color changes as the person's position relative to the camera or light source changes. The human visual system can dynamically adapt to varying lighting conditions and approximately preserve the actual color of an object. This human ability to reduce the effect of light on the color of an object and to retain a stable perceptual representation of the surface color is referred to as color constancy or chromatic adaptation. However, unlike humans, image capturing devices such as digital cameras are not capable of adapting to rapidly varying illumination across scenes. Most of the skin-color detection strategies described in Section 2 are immune only to slight variations in lighting and shading conditions. To handle rapidly changing illumination conditions for skin detection, there are


primarily two different classes of approaches: color constancy and dynamic adaptation, which are briefly described in the following sections.

    3.1. Skin-color constancy approaches

Color constancy approaches transform the image contents to a known canonical illuminant that reflects precisely the physical contents of the scene. This consists of estimating the illuminant color and then correcting the image pixel-wise based on the estimate of the illuminant. Estimating the illuminant is the critical step in solving the chromatic adaptation problem, and a number of strategies have been proposed for it. All these algorithms are based on assumptions about the camera characteristics, the illuminant properties or the distribution of color values. Gray World algorithms [84] assume that the average reflectance of the image is gray, and estimate the illuminant as the color shift from the average gray value of the image. Retinex algorithms [85] try to model the human color perception system and estimate the illuminant by comparing the average color value at each pixel to the maximum value found by looking at a larger area in the image. Gamut mapping algorithms [86,99] consider all the possible mappings between the sets of color values under the known and unknown illuminants; the set of all possible color values under an illuminant is referred to as its gamut, and a unique solution to this mapping is then obtained. In Bayesian color constancy [87], an ML approach is used to determine the illuminant estimate that maximizes the likelihood of the observed data. In the neural network-based method [88], an NN is used to estimate the chromaticity of the illuminant: the input to the network is a binary vector indicating the presence of sampled chromaticities, and the output is the expected chromaticity. For a comparison of different approaches to color constancy, readers are referred to Barnard et al. [89,90]. In skin-color constancy approaches, the color constancy algorithm is applied as a preprocessing step: based on the estimate of the illuminant, the image is first color corrected using a diagonal transformation, and skin-color modeling and detection are then applied to the color corrected images.

3.1.1. Gray World and white patch approaches

The Gray World algorithm is one of the simplest and most widely used algorithms for estimating the illuminant. It assumes that, given an image with sufficiently varied colors, the average reflectance of the surfaces in the image is gray. Hence, any shift from gray of the measured averages of the sensor responses corresponds to the color of the illuminant. Kovac et al. [91] used the Gray World method to color correct images before applying skin detection, and concluded on a dataset of 40 images that applying color correction improves the performance of the skin classifier.
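
Once the illuminant estimate is in hand, the correction is a diagonal (von Kries-style) transform. A minimal sketch follows; normalizing each channel gain to the overall mean intensity is one common convention, not mandated by [84]:

```python
import numpy as np

def gray_world_correct(img):
    """Gray World color correction via a diagonal transform.

    The per-channel means are assumed to be gray: each channel is scaled
    so that its mean matches the overall mean intensity of the image.
    """
    img = img.astype(float)
    means = img.reshape(-1, 3).mean(0)             # illuminant estimate
    gain = means.mean() / np.maximum(means, 1e-6)  # diagonal transform
    return np.clip(img * gain, 0, 255)
```
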

The white patch algorithm searches for a white patch in the image, under the assumption that the brightest patch in the image is white; the chromaticity of the illuminant is then the chromaticity of this white patch. Hsu et al. [35] used a version of the white patch algorithm for color correction. The pixels with the top 5% of luma values are taken as a reference white patch, and the R, G and B components of the image are adjusted such that the average color of this reference patch is gray. The color corrected pixels are then transformed non-linearly into the YCbCr space, and skin pixels are detected using an elliptical model with the Mahalanobis distance. Good skin detection results (around a 96% detection rate) were obtained on the HH1 MPEG-7 video and Champion databases, which contain images with both frontal and non-frontal faces under varying backgrounds and illumination conditions. However, the number of false positives is high; a subsequent facial feature detection procedure was therefore proposed, which dramatically reduces the false positives.
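
A sketch of this style of reference-white correction follows. The BT.601 luma weights and the choice to scale the reference patch to its own luma mean are illustrative assumptions; Hsu et al.'s exact normalization differs in detail:

```python
import numpy as np

def white_patch_correct(img_rgb, top_frac=0.05):
    """Reference-white color correction (a sketch, not the exact method of [35]).

    Pixels in the top `top_frac` of luma serve as the reference white; each
    RGB channel is scaled so the reference patch averages to gray.
    """
    img = img_rgb.astype(float)
    flat = img.reshape(-1, 3)
    luma = flat @ np.array([0.299, 0.587, 0.114])        # BT.601 luma
    ref = flat[luma >= np.quantile(luma, 1 - top_frac)]  # brightest pixels
    gain = ref.mean() / np.maximum(ref.mean(0), 1e-6)
    return np.clip(img * gain, 0, 255)
```
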

3.1.2. Neural network approaches

In NN-based approaches, an NN is trained to learn the relationship between the colors in an image and the expected illuminant. The advantage of using NNs is that there are no explicit assumptions regarding the image content, as there are in the Gray World or white patch methods. Nayak and Chaudari [92] used an NN for color constancy in tracking a human palm. In their approach, an NN (2 x 20 x 3) is trained using back propagation to directly learn the illuminant parameters. The inputs to the NN are the RGB components of the skin pixels, while the output of the network is the expected canonical RGB components. The NN is trained on a set of images containing a human palm under varying illumination conditions. The reported results suggest that the NN adapts to the illuminant parameters, and that the NN-adapted palm images can be tracked precisely against a variety of cluttered backgrounds and varying illuminations.

Kakumanu et al. [93,103] also used an NN for color constancy. The proposed three-layer NN (1600 x 48 x 8 x 2) directly estimates the illuminant so as to bring the skin color to gray. The input to the NN is an rg histogram and the output is the expected illuminant of the skin in rg space. The NN was trained on a dataset of 255 images and tested on 71 images, the images representing a wide range of indoor and outdoor illuminations, different backgrounds and non-white light sources. A simple thresholding technique is used to detect skin in the NN color corrected images.

3.1.3. Skin locus approach

Störring et al. [21] used a physics-based model to solve the color constancy problem for skin detection. The physics-based model describes an expected area for the skin chromaticities under certain illuminations; the area in chromaticity space defined by the knowledge of skin pixels is called the skin locus. A thresholding can be applied on this


skin locus to classify skin from non-skin pixels for a range of illumination and white balancing conditions. However, the skin locus region is strictly dependent on the camera sensor responses. Therefore, Soriano et al. [20] constructed the locus directly from images taken under four representative light sources: Horizon (2300 K), Incandescent A (2856 K), fluorescent TL84, and daylight (6500 K). The obtained locus was shown to be useful for selecting pixels in probability-based face tracking.

    3.2. Dynamic adaptation approaches

Dynamic color modeling approaches adapt by transforming previously developed skin-color models to the changing environment. Histograms and GMMs are the popular choices for modeling the skin-color distribution in dynamic approaches. The advantage of the histogram approach is that the probability density function can be computed trivially. The advantage of GMMs is that the parameters can be updated in real time, given that the number of Gaussian components is known a priori.

3.2.1. Explicit threshold adaptation

Cho et al. [94] proposed an adaptive thresholding technique in HSV color space. They defined a thresholding box in HSV color space to separate skin from non-skin pixels. Initially, the dimensions of this box are obtained by observing the skin-color distributions of several color sample images. When a new image is tested, the threshold values of the S and V components are updated, based on a color histogram built in SV space, to obtain robustness against illumination changes. The thresholding box is updated by considering the center of gravity of the color histogram calculated from the color values above 10% of the maximum color value in the box. However, this box is not sufficient to separate skin-color regions from background regions of slightly different colors. Cluster analysis is therefore performed to determine whether dominant background color vectors exist in the box; if present, they are separated from the skin-color vectors by a linear classifier. Results were reported on a dataset of 379 web images containing yellow-skinned people. The method has the disadvantage that the threshold values are bound to change if people from different races and a broader range of illuminations are considered.

3.2.2. GMM adaptation

Yang and Waibel [11,59] were among the first to propose an adaptive approach for skin-based face tracking. They used a GMM for skin modeling in rg space. As the illumination changes dynamically, the mean and covariance of the model are approximated using a linear combination of the previous parameters, based on an ML approach. This adaptive model was applied in a real-time face tracker and works well for slightly varying indoor illuminations. A similar approach was used by Zue and Yang [100] for tracking hands. In this

work, a restricted EM algorithm is used to train the adaptive GMM, where the background is modeled by four Gaussian components and the hand color by one Gaussian component. Oliver et al. [16] used an adaptive GMM in rgb space to cope with the changing viewing and illumination conditions: an initial mixture model is obtained off-line with the EM algorithm, and the GMM parameters are updated online with an incremental EM technique. McKenna et al. [27] also used an adaptive GMM, in HS space. The number of components in this model was fixed, while the parameters were updated online using stochastic equations. Due to the lack of ground truth, the adaptive model might adapt to non-skin image regions (for example, during occlusions). To reduce this problem, such outliers were detected using log-likelihood measurements, and the corresponding frame color data was not used for subsequent frame adaptation. Bergasa et al. [13] used an unsupervised and adaptive Gaussian skin-color model (UAGM) in rg space for skin segmentation. The GMM is initialized with clusters computed by a competitive vector quantization (VQ) learning algorithm, and the model parameters are updated over time using a linear combination of the previous parameters. Schwerdt and Crowley [18] used the first-order moments of the skin-color histogram in rg space to represent the position and spatial extent of the skin-colored region. To reduce the effect of non-skin pixels during tracking, the skin histogram is weighted by a Gaussian function, whose mean and covariance for a new image frame are updated from the values of the previous frame.
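
A minimal version of this kind of parameter adaptation, using a fixed forgetting factor in place of the ML-derived combination weights of Yang and Waibel, might look like:

```python
import numpy as np

class AdaptiveSkinGaussian:
    """Single skin-color Gaussian whose mean/covariance track illumination.

    Each frame's skin pixels yield new sample statistics, blended with the
    previous parameters by a forgetting factor `alpha`.  This is a
    simplification of the linear parameter combination used in adaptive
    GMM trackers, not any paper's exact update rule.
    """
    def __init__(self, mean, cov, alpha=0.2):
        self.mean = np.asarray(mean, float)
        self.cov = np.asarray(cov, float)
        self.alpha = alpha

    def update(self, frame_pixels):
        """Blend in the chromaticity statistics of the current frame."""
        x = np.asarray(frame_pixels, float)
        m = x.mean(0)
        c = np.cov(x.T) if len(x) > 1 else self.cov
        self.mean = (1 - self.alpha) * self.mean + self.alpha * m
        self.cov = (1 - self.alpha) * self.cov + self.alpha * c
```

Small `alpha` makes the model slow but stable; large `alpha` tracks fast illumination changes but, as noted above for McKenna et al., risks drifting onto non-skin regions when the segmentation is wrong.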

3.2.3. Histogram adaptation

Soriano et al. [20,21] developed an adaptive skin-color modeling technique using the skin locus (Section 3.1.3), the range of skin colors in chromaticity space. Initially, in an image frame, skin pixels are extracted from the tracked bounding box and within the skin locus defined in rg chromaticity space. A ratio histogram is updated using the histogram of these skin pixels and the histogram of the whole image; the updated histogram is back-projected to define the search space (the bounding box) for the next frame, and the process is repeated. The rationale behind the skin locus approach is that the skin chromaticities in an image occupy only a small portion of the skin locus, and it is easier to track this small region than the entire locus.
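
The ratio-histogram update and back-projection steps can be sketched as follows, assuming pixels are given as r-g chromaticity pairs in [0, 1]; the skin-locus gating and bounding-box bookkeeping of Soriano et al. are omitted:

```python
import numpy as np

def ratio_histogram(skin_pixels, image_pixels, bins=32):
    """Ratio histogram R = min(H_skin / H_image, 1) over r-g chromaticities."""
    def hist2d(p):
        h, _, _ = np.histogram2d(p[:, 0], p[:, 1],
                                 bins=bins, range=[[0, 1], [0, 1]])
        return h
    hs, hi = hist2d(skin_pixels), hist2d(image_pixels)
    return np.minimum(hs / np.maximum(hi, 1), 1.0)

def back_project(image_pixels, ratio, bins=32):
    """Per-pixel skin likelihood looked up from the ratio histogram."""
    idx = np.clip((image_pixels * bins).astype(int), 0, bins - 1)
    return ratio[idx[:, 0], idx[:, 1]]
```

Recomputing the ratio histogram every frame from the currently tracked skin pixels is what lets the model follow the skin colors as they drift within the locus.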

3.2.4. HMM adaptation

Sigal et al. [95] used a second-order Markov model to predict the evolution of the skin-color (HSV) histogram over time. To obtain an initial estimate of the skin-color distribution to be tracked, a Bayes classifier constructed on the Compaq dataset is used. Once initialized, the learning stage performs EM steps over the first few video frames: for each frame, the E-step of the EM algorithm is a histogram-based segmentation, and the M-step is a histogram adaptation. The evolution of the skin-color distribution at each frame is

  • 1118 P. Kakumanu et al. / Pattern Recognition 40 (2007) 11061122

    Tabl

    e2

    Perfo

    rman

    ceofdi

    ffere

    ntsk

    inde

    tectio

    nstra

    tegi

    eswith

    illum

    inatio

    nad

    aptatio

    nap

    proa

    ches

    Aut

    hors

    Colo

    rIn

    tens

    itySk

    inde

    tectio

    nIll

    um.

    Pre-

    Test

    Diff

    .ski

    nDiff

    .Sk

    inIll

    um.

    True

    spac

    eco

    mp.

    metho

    dad

    apt.

    train

    databa

    sety

    pes

    Illum

    type

    sty

    pes

    posit

    ives

    Skin

    -col

    orco

    nstan

    cymetho

    dsHsu

    etal.0

    2YCb

    CrNo

    SGM

    Table 2. Summary of the illumination adaptation approaches to skin-color modeling and detection.

    Skin-color constancy methods:
    - Kovac and Peer, 03: YUV; elliptic model with GW color correction; 60 images; white and non-white light sources.
    - Nayak et al., 03: RGB; NN color correction; cluttered backgrounds, low lights.
    - Kakumanu et al., 04: RGB; thresholding with NN color correction; 326 images; indoor, outdoor, incandescent and fluorescent illumination, shadows.
    - Störring et al., 03: rgb; thresholding with the skin locus; UOPB database; white, yellow and dark skin; horizon, incandescent, fluorescent and daylight illumination.

    Dynamic adaptation methods:
    - Cho and Jang, 01: HSV; adaptive thresholding; 379 web images; white and red illumination, bright and dark conditions.
    - Yang and Waibel, 98: rgb; adaptive SGM; white, yellow, brown and dark skin.
    - Oliver et al., 97: rgb; adaptive GMM.
    - Bergasa et al., 00: rgb; adaptive GMM; white, yellow and dark skin.
    - Soriano et al., 03: rgb; skin histogram with skin-locus adaptive thresholding; UOPB database; white, yellow and dark skin; horizon, incandescent, fluorescent and daylight illumination.
    - Sigal and Sclaroff, 04: HSV; Bayes histogram with HMM adaptation; movie sequences; cluttered backgrounds, indoor, outdoor, incandescent illumination, shadows; 86.8% skin detection rate.

P. Kakumanu et al. / Pattern Recognition 40 (2007) 1106–1122

    parameterized by translation, scaling, and rotation in color space. During the prediction and tracking stage, the histograms are dynamically updated based on feedback from the current segmentation and the predictions of the Markov model, and the updated histogram is used for skin segmentation in the next frame. The method is shown to be robust over a wide range of illumination conditions, including non-white and multiple light sources.
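    The feedback update described above can be illustrated with a simple exponential blend between the model histogram and the histogram of pixels accepted as skin in the current frame. The chromaticity binning, blend rate and function name below are illustrative assumptions, not the implementation of Sigal et al.:

```python
def update_skin_histogram(model_hist, frame_pixels, skin_mask, alpha=0.1, bins=32):
    """Blend the running skin-color histogram with evidence from the
    current frame's segmentation (illustrative sketch only)."""
    # Histogram over quantized (r, g) chromaticity of the pixels that the
    # current segmentation labeled as skin.
    new_hist, count = {}, 0
    for (R, G, B), is_skin in zip(frame_pixels, skin_mask):
        if not is_skin:
            continue
        s = (R + G + B) or 1        # avoid division by zero for black pixels
        key = (int(bins * R / s), int(bins * G / s))
        new_hist[key] = new_hist.get(key, 0) + 1
        count += 1
    if count == 0:                  # no skin evidence: keep the old model
        return dict(model_hist)
    # Exponential blend lets the model track slow illumination drift.
    keys = set(model_hist) | set(new_hist)
    return {k: (1 - alpha) * model_hist.get(k, 0.0)
               + alpha * new_hist.get(k, 0) / count
            for k in keys}
```

    A small blend rate keeps the model stable; a larger one tracks faster illumination changes but risks drifting onto non-skin regions if the segmentation is wrong.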

    3.3. Comparison of illumination adaptation approaches

    Table 2 summarizes the illumination adaptation approaches to skin-color modeling and detection discussed in Sections 3.1 and 3.2.2. Both the explicit skin-color constancy approaches and the dynamic adaptation approaches have been shown to provide better results than the static skin detection approaches discussed in Section 2.2, especially over a wide range of illumination conditions and in cluttered backgrounds.

    Kovac et al. [91] reported, on a dataset of 40 images, that applying color correction improves the performance of the skin classifier. Kakumanu et al. [93] trained a NN for skin-color adaptation on a dataset of 255 images and tested on 71 images, with the images representing a wide range of illuminations, both indoor and outdoor, different backgrounds, and non-white light sources. They reported that applying a color constancy technique improved skin-color stabilization in the images. Moreover, the proposed NN trained on skin patches performs favorably against the Gray World, White Patch and white-patch-trained NN methods. The advantage of the NN approach to color constancy is that it makes no explicit assumptions about the image content, unlike the Gray World or White Patch algorithms. Martinkauppi et al. [96] compared the skin locus approach with three other algorithms, the adaptive skin filter approach of Cho et al. (Section 3.2.1), the statistical approach of Jones and Rehg (Section 2.2.3) and the color correction approach of Hsu et al. (Section 3.1.2), on a video taken from the Face Video Database [82]. The video contains a person walking in a corridor; the illumination over the face varies from fluorescent lamp light to daylight from the windows and to mixtures of both. For the video considered, the performance of the skin-locus approach is better. However, the skin locus approach can be applied only if the camera parameters are known. Sigal et al. [95] tested the HMM illumination adaptation method on 21 video sequences from nine popular movies. These videos contain illumination conditions ranging from white to non-white light sources, cluttered backgrounds and shadows. The average classification rate is 86.84% for skin and 93.55% for the background, a 24% performance improvement compared with the static histogram technique of Jones and Rehg [54].

    The rapidly varying illumination conditions, shadows and cluttered backgrounds pose a major problem for existing skin-color classifiers. As the comparisons reported above indicate, using an illumination adaptation approach, either skin-color constancy or dynamic adaptation, improves the performance of a skin-color classifier in these situations; hence, one of these methods should be used to improve skin classification performance. An in-depth comparison of the methods is not possible, as they have not been evaluated on large, common datasets. The particular method to use depends on application constraints such as real-time requirements and the nature of the dataset at hand.

    4. Summary and conclusions

    In this paper, we have presented an extensive survey of up-to-date techniques for skin detection using color information in the visual spectrum for 2D images. A good skin classifier must be able to discriminate between skin and non-skin pixels for a wide range of people with different skin types, such as white, yellow, brown and dark, and must perform well under different illumination conditions, such as indoor, outdoor, and white and non-white illumination sources. We reviewed the following critical issues regarding skin detection:

    Choice of an appropriate color space: The color space representation is often led by the skin detection methodology and the application. Non-parametric methods, such as histogram-based methods, are not affected by the color space representation. However, parametric modeling techniques, such as Gaussian modeling, are affected by the choice of color space. It should be noted that parametric models are also affected by the amount and quality of the available training data. The choice of color space should also depend on the available image format and on whether a particular color space is needed in post-processing steps. For example, some non-linear color space transformations are too computationally expensive to be used in real time. Dropping the intensity component to obtain parameters robust against illumination conditions, as done in many of the works, actually reduces skin detection performance.
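    As an illustration of the intensity-dropping step mentioned above, a per-pixel transform to normalized rg chromaticity can be sketched as follows; this is a generic sketch, not tied to any one surveyed method:

```python
def rgb_to_rg(R, G, B):
    """Map an RGB pixel to intensity-normalized (r, g) chromaticity.
    Dividing by R+G+B factors out the illumination level, but the
    discarded intensity can also carry skin/non-skin cues."""
    s = R + G + B
    if s == 0:                       # chromaticity of pure black is undefined
        return (1.0 / 3.0, 1.0 / 3.0)
    return (R / s, G / s)
```

    The transform makes a pixel and its half-intensity version identical in (r, g), which is exactly the invariance, and the information loss, that the paragraph above refers to.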

    Skin-color modeling and classification techniques: The histogram-based Bayes classifier is feasible for skin detection only with large datasets. This method also has very high storage requirements compared to mixture or NN models. The performance of mixture and NN models is comparable to that of histogram methods even when only small datasets are available.
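    The histogram-based Bayes rule can be sketched with toy quantized-color histograms; the prior and the histograms below are illustrative assumptions rather than the setup of any particular surveyed work:

```python
def skin_posterior(color, skin_hist, nonskin_hist, prior_skin=0.5):
    """P(skin | color) via Bayes' rule over quantized color histograms.
    skin_hist and nonskin_hist map quantized colors to training counts;
    colors unseen in training get zero likelihood."""
    p_skin = skin_hist.get(color, 0) / max(sum(skin_hist.values()), 1)
    p_non = nonskin_hist.get(color, 0) / max(sum(nonskin_hist.values()), 1)
    num = p_skin * prior_skin
    den = num + p_non * (1.0 - prior_skin)
    return num / den if den > 0 else 0.0
```

    A pixel is then accepted as skin when the posterior exceeds a chosen threshold; the storage cost grows with the number of histogram bins, which is why large color histograms need far more memory than a handful of Gaussian or NN parameters.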

    Color constancy and dynamic adaptation approaches: Obtaining color representations that are robust against varied illumination conditions is still a major problem. However, applying color constancy techniques as a preprocessing step to skin-color modeling has been shown to improve performance. The NN-based color constancy techniques are very promising, as they make no explicit assumptions about the scene content. Dynamic adaptation techniques, which transform an existing color model so that it adapts to the changing viewing conditions, are also available. The problem with these approaches is that, if track of the ground truth is lost, the adaptive model may adapt to non-skin image regions. Problem-domain factors such as the characteristics of the dataset, real-time considerations and the skin detection method should guide the choice among the illumination adaptation methods.
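    For reference, the Gray World assumption behind one of the classical constancy baselines discussed in this survey (the average scene color is achromatic) reduces to a per-channel gain; a minimal sketch, not any specific surveyed implementation:

```python
def gray_world_correct(pixels):
    """Scale each RGB channel so that the image mean becomes achromatic,
    per the Gray World assumption that the average scene color is gray.
    pixels: list of (R, G, B) floats; returns corrected pixels."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0                       # target gray level
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [tuple(p[c] * gains[c] for c in range(3)) for p in pixels]
```

    The NN-based constancy methods cited above replace exactly this fixed scene assumption with a mapping learned from data.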

    Most of the skin detection techniques discussed in the literature are used as a preprocessor for face detection and tracking systems. Though skin color analysis often produces high TPRs, it also produces high FPRs when the image contains cluttered backgrounds and shadows. To improve the accuracy of the classifier, various other features such as shape, spatial and motion information can be used along with skin-color information. In the past decade, significant advances have been made in skin detection using color information in the visual spectrum. However, building an accurate classifier that can detect all skin types under different illuminations, shadows, cluttered backgrounds and makeup is still an unsolved problem. For example, it is extremely hard to build a skin classifier that can distinguish human skin color from dyes designed to mimic skin color. Many of the problems encountered in the visual spectrum can be overcome by using non-visual spectrum methods such as infrared (IR) and spectral imaging. The non-visual spectrum methods, though immune to illumination conditions and makeup, are very expensive and bulky, and hence are not suitable for many applications such as Tyflos [97] and iLearn [98]. Often the applications impose certain constraints on the skin detection methodology. When these methods are used in real time, meeting computational and storage requirements is extremely important. Sometimes, accuracy may need to be sacrificed when the skin detection strategy is used only as a preprocessing step to face detection.

    Acknowledgments

    This research was partially supported by awards from NSF-ASU, AIIS Inc. and WSU/DAGSI.

    References

    [1] R. Chellappa, C. Wilson, S. Sirohey, Human and machine recognition of faces: a survey, Proc. IEEE 83 (5) (1995) 705–740.

    [2] E. Hjelmas, B.K. Low, Face detection: a survey, J. Comput. Vision Image Understanding 83 (2001) 236–274.

    [3] M.H. Yang, D.J. Kriegman, N. Ahuja, Detecting faces in images: a survey, IEEE Trans. Pattern Anal. Mach. Intell. 24 (1) (2002) 34–58.

    [4] W. Zhao, R. Chellappa, P.J. Philips, A. Rosenfeld, Face recognition: a literature survey, ACM Comput. Surveys 35 (4) (2003) 399–458.

    [5] D.A. Socolinsky, A. Selinger, J.D. Neuheisel, Face recognition with visible and thermal infrared imagery, Comput. Vision Image Understanding 91 (1–2) (2003) 72–114.

    [6] S.G. Kong, J. Heo, B.R. Abidi, J. Paik, M.A. Abidi, Recent advances in visual and infrared face recognition: a review, Comput. Vision Image Understanding 97 (2005).

    [7] C. Balas, An imaging colorimeter for noncontact tissue color mapping, IEEE Trans. Biomed. Eng. 44 (6) (1997) 468–474.

    [8] E. Angelopoulou, R. Molana, K. Daniilidis, Multispectral skin color modeling, CVPR01, 2001.

    [9] Z. Pan, G. Healey, M. Prasad, B. Tromberg, Face recognition in hyperspectral images, IEEE Trans. Pattern Anal. Mach. Intell. 25 (12) (2003).

    [10] V. Vezhnevets, V. Sazonov, A. Andreeva, A survey on pixel-based skin color detection techniques, GRAPHICON03, 2003, pp. 85–92.

    [11] J. Yang, W. Lu, A. Waibel, Skin-color modeling and adaptation, ACCV98, 1998.

    [12] M.H. Yang, N. Ahuja, Gaussian mixture model for human skin color and its application in image and video databases, Proceedings of SPIE: Conference on Storage and Retrieval for Image and Video Databases, vol. 3656, 1999, pp. 458–466.

    [13] L.M. Bergasa, M. Mazo, A. Gardel, M.A. Sotelo, L. Boquete, Unsupervised and adaptive Gaussian skin-color model, Image Vision Comput. 18 (12) (2000) 987–1003.

    [14] D. Brown, I. Craw, J. Lewthwaite, A SOM based approach to skin detection with application in real time systems, BMVC01, 2001.

    [15] T.S. Caetano, D.A.C. Barone, A probabilistic model for the human skin-color, ICIAP01, 2001, pp. 279–283.

    [16] N. Oliver, A. Pentland, F. Berard, Lafter: lips and face real time tracker, CVPR97, 1997.

    [17] S.H. Kim, N.K. Kim, S.C. Ahn, H.G. Kim, Object oriented face detection using range and color information, AFGR98, 1998.

    [18] K. Schwerdt, J.L. Crowley, Robust face tracking using color, AFGR00, 2000.

    [19] N. Sebe, T. Cohen, T.S. Huang, T. Gevers, Skin detection, a Bayesian network approach, ICPR04, 2004.

    [20] M. Soriano, J.B. Martinkauppi, S. Huovinen, M. Laaksonen, Adaptive skin color modeling using the skin locus for selecting training pixels, Pattern Recognition 36 (3) (2003) 681–690.

    [21] M. Störring, T. Kocka, H.J. Andersen, E. Granum, Tracking regions of human skin through illumination changes, Pattern Recognition Lett. 24 (11) (2003).

    [22] J.G. Wang, E. Sung, Frontal-view face detection and facial feature extraction using color and morphological operations, Pattern Recognition Lett. 20 (1999) 1053–1068.

    [23] Poynton FAQ, Charles Poynton, http://www.poynton.com/PDFs/ColorFAQ.pdf.

    [24] C. Chen, S.P. Chiang, Detection of human faces in colour images, IEE Proc. Vision Image Signal Process. 144 (6) (1997) 384–388.

    [25] H. Wu, Q. Chen, M. Yachida, Face detection from color images using a fuzzy pattern matching method, IEEE Trans. Pattern Anal. Mach. Intell. 21 (6) (1999) 557–563.

    [26] C. Garcia, G. Tziritas, Face detection using quantized skin color regions merging and wavelet packet analysis, IEEE Trans. Multimedia 1 (3) (1999) 264–277.

    [27] S. McKenna, S. Gong, Y. Raja, Modeling facial colour and identity with Gaussian mixtures, Pattern Recognition 31 (12) (1998) 1883–1892.

    [28] D. Saxe, R. Foulds, Toward robust skin identification in video images, AFGR96, 1996.

    [29] K. Sobottka, I. Pitas, A novel method for automatic face segmentation, facial feature extraction and tracking, Signal Process. Image Commun. 12 (1998) 263–281.

    [30] Q.H. Thu, M. Meguro, M. Kaneko, Skin-color extraction in images with complex background and varying illumination, Sixth IEEE Workshop on Applications of Computer Vision, 2002.

    [31] Y. Wang, B. Yuan, A novel approach for human face detection from color images under complex background, Pattern Recognition 34 (10) (2001) 1983–1992.

    [32] Q. Zhu, K.-T. Cheng, C.-T. Wu, Y.-L. Wu, Adaptive learning of an accurate skin-color model, AFGR04, 2004.


    [33] B.D. Zarit, J.B. Super, F.K.H. Quek, Comparison of five color models in skin pixel classification, ICCV99, 1999.

    [34] J.C. Terrillon, M. David, S. Akamatsu, Detection of human faces in complex scene images by use of a skin color model and of invariant Fourier-Mellin moments, ICPR98, 1998, pp. 1350–1355.

    [35] R.L. Hsu, M. Abdel-Mottaleb, A.K. Jain, Face detection in color images, IEEE Trans. Pattern Anal. Mach. Intell. 24 (5) (2002) 696–706.

    [36] D. Chai, A. Bouzerdoum, A Bayesian approach to skin color classification in YCbCr color space, IEEE TENCON00, vol. 2, 2000, pp. 421–424.

    [37] D. Chai, K.N. Ngan, Locating facial region of a head-and-shoulders color image, ICFGR98, 1998.

    [38] K.W. Wong, K.M. Lam, W.C. Siu, A robust scheme for live detection of human faces in color images, Signal Process. Image Commun. 18 (2) (2003) 103–114.

    [39] J.J. de Dios, N. Garcia, Face detection based on a new color space YCgCr, ICIP03, 2003.

    [40] Y. Dai, Y. Nakano, Face-texture model based on SGLD and its application in face detection in a color scene, Pattern Recognition 29 (6) (1996) 1007–1017.

    [41] F. Marqués, V. Vilaplana, A morphological approach for segmentation and tracking of human face, ICPR 2000, 2000.

    [42] E. Saber, A.M. Tekalp, Frontal-view face detection and facial feature extraction using color, shape and symmetry based cost functions, Pattern Recognition Lett. 17 (8) (1998).

    [43] G. Gomez, M. Sanchez, L.E. Sucar, On selecting an appropriate colour space for skin detection, Springer-Verlag: Lecture Notes in Artificial Intelligence, vol. 2313, 2002, pp. 70–79.

    [44] J. Cai, A. Goshtasby, Detecting human faces in color images, Image Vision Comput. 18 (1999) 63–75.

    [45] S. Kawato, J. Ohya, Automatic skin-color distribution extraction for face detection and tracking, Fifth International Conference on Signal Processing, vol. 2, 2000, pp. 1415–1418.

    [46] G. Wyszecki, W.S. Stiles, Color Science, Wiley, New York, 1967.

    [47] J. Brand, J. Mason, A comparative assessment of three approaches to pixel level human skin-detection, ICPR01 1 (2000) 1056–1059.

    [48] M.J. Jones, J.M. Rehg, Statistical color models with application to skin detection, CVPR99, 1999.

    [49] K. Sobottka, I. Pitas, Extraction of facial regions and features using color and shape information, ICPR96, 1996.

    [50] D. Chai, K.N. Ngan, Face segmentation using skin-color map in videophone applications, IEEE Trans. Circuits Syst. Video Technol. 9 (4) (1999).

    [51] F. Tomaz, T. Candeias, H. Shahbazkia, Fast and accurate skin segmentation in color images, CRV04, 2004.

    [52] G. Gomez, E. Morales, Automatic feature construction and a simple rule induction algorithm for skin detection, Proceedings of Workshop on Machine Learning in Computer Vision, 2002, pp. 31–38.

    [53] T.W. Yoo, I.S. Oh, A fast algorithm for tracking human faces based on chromatic histograms, Pattern Recognition Lett. 20 (10) (1999) 967–978.

    [54] M.J. Jones, J.M. Rehg, Statistical color models with application to skin detection, J. Comput. Vision 46 (1) (2002) 81–96.

    [55] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, Wiley-InterScience, New York, 2002.

    [56] M.M. Fleck, D.A. Forsyth, C. Bregler, Finding naked people, Proceedings of European Conference on Computer Vision, vol. 2, 1996, pp. 592–602.

    [57] S. Marcel, S. Bengio, Improving face verification using skin color information, ICPR02, 2002.

    [58] S. Srisuk, W. Kurutach, New robust face detection in color images, AFGR02, 2002, pp. 291–296.

    [59] J. Yang, A. Waibel, A real-time face tracker, Proceedings of the Third IEEE Workshop on Applications of Computer Vision, WACV96, 1996.

    [60] M.H. Yang, N. Ahuja, Detecting human faces in color images, ICIP98, 1998.

    [61] P. Kuchi, P. Gabbur, S. Bhat, S. David, Human face detection and tracking using skin color modeling and connected component operators, IETE J. Res., Special Issue on Visual Media Processing, May 2002.

    [62] T.S. Caetano, S.D. Olabarriaga, D.A.C. Barone, Performance evaluation of single and multiple-Gaussian models for skin-color modeling, SIBGRAPI02, 2002.

    [63] J.Y. Lee, S.I. Yoo, An elliptical boundary model for skin color detection, Proceedings of the International Conference on Imaging Science, Systems and Technology, 2002.

    [64] J.C. Terrillon, M.N. Shirazi, H. Fukamachi, S. Akamatsu, Comparative performance of different skin chrominance models and chrominance spaces for the automatic detection of human faces in color images, AFGR00, 2000, pp. 54–61.

    [65] H. Greenspan, J. Goldberger, I. Eshet, Mixture model for face-color modeling and segmentation, Pattern Recognition Lett. 22 (14) (2001) 1525–1536.

    [66] T.S. Caetano, S.D. Olabarriaga, D.A.C. Barone, Do mixture models in chromaticity space improve skin detection?, Pattern Recognition 36 (12) (2003) 3019–3021.

    [67] J. Karlekar, U.B. Desai, Finding faces in color images using wavelet transform, International Conference on Image Analysis and Processing, 1999.

    [68] S.L. Phung, D. Chai, A. Bouzerdoum, A universal and robust human skin color model using neural networks, IJCNN01, 2001.

    [69] H. Sahbi, N. Boujemaa, Coarse to fine face detection based on skin color adaptation, Workshop on Biometric Authentication, Lecture Notes in Computer Science, vol. 2359, 2002, pp. 112–120.

    [70] M.J. Seow, D. Valaparla, V.K. Asari, Neural network-based skin color model for face detection, Proceedings of the 32nd Workshop on Applied Imagery Pattern Recognition, 2003.

    [71] I. Anagnostopoulos, C. Anagnostopoulos, V. Loumos, E. Kayafas, A probabilistic neural network for human face identification based on fuzzy logic chromatic rules, IEEE MED03, 2003.

    [72] B. Jedynak, H. Zheng, M. Daoudi, Statistical models for skin detection, IEEE Workshop on Statistical Analysis in Computer Vision, 2003.

    [73] H. Zheng, M. Daoudi, B. Jedynak, From maximum entropy to belief propagation: an application to skin detection, Proceedings of British Machine Vision Conference, BMVC04, 2004.

    [74] N. Friedman, D. Geiger, M. Goldszmidt, Bayesian network classifiers, Mach. Learn. 29 (2) (1997) 131–163.

    [75] S.L. Phung, A. Bouzerdoum, D. Chai, Skin segmentation using color pixel classification: analysis and comparison, IEEE Trans. Pattern Anal. Mach. Intell. 27 (1) (2005).

    [76] S. Jayaram, S. Schmugge, M.C. Shin, L.V. Tsap, Effect of color space transformation, the illuminance component, and color modeling on skin detection, CVPR04, 2004, pp. 813–818.

    [77] Z. Fu, J. Yang, W. Hu, T. Tan, Mixture clustering using multidimensional histograms for skin detection, ICPR04, 2004, pp. 549–552.

    [78] E. Littmann, H. Ritter, Adaptive color segmentation: a comparison of neural and statistical methods, IEEE Trans. Neural Networks 8 (1) (1997) 175–185.

    [79] A. Albiol, L. Torres, E.J. Delp, Optimum color spaces for skin detection, ICIP01, 2001.

    [80] M.C. Shin, K.I. Chang, L.V. Tsap, Does color space transformation make any difference on skin detection? IEEE Workshop on Applications of Computer Vision, Orlando, FL, December 2002, pp. 275–279.

    [81] A.M. Martinez, R. Benavente, The AR face database, CVC Technical Report #24, June 1998.

    [82] E. Marszalec, B. Martinkauppi, M. Soriano, M. Pietikäinen, A physics-based face database for color research, J. Electron. Imaging 9 (1) (2000) 32–38.

    [83] A. Berman, L.G. Shapiro, A flexible image database system for content-based retrieval, Comput. Vision Image Understanding 75 (1–2) (1999) 175–195.


    [84] G. Buchsbaum, A spatial processor model for object colour perception, J. Franklin Inst. 310 (1980) 1–26.

    [85] D.H. Brainard, B.A. Wandell, Analysis of the retinex theory of color vision, J. Opt. Soc. Am. A 3 (1986) 1651–1661.

    [86] G.D. Finlayson, S. Hordley, Improving gamut mapping colour constancy, IEEE Trans. Image Process. 9 (10) (2000) 1774–1783.

    [87] D.H. Brainard, W.T. Freeman, Bayesian color constancy, J. Opt. Soc. Am. A 14 (1997) 1393–1411.

    [88] V. Cardei, A neural network approach to colour constancy, Ph.D. Thesis, Simon Fraser University, January 2000.

    [89] K. Barnard, B. Funt, V. Cardei, A comparison of computational color constancy algorithms, Part I: theory and experiments with synthetic data, IEEE Trans. Image Process. 11 (9) (2002) 972–984.

    [90] K. Barnard, L. Martin, A. Coath, B. Funt, A comparison of computational color constancy algorithms, Part II: experiments with image data, IEEE Trans. Image Process. 11 (9) (2002) 985–996.

    [91] J. Kovac, P. Peer, F. Solina, Human skin color clustering for face detection, EUROCON2003, 2003.

    [92] A. Nayak, S. Chaudhuri, Self-induced color correction for skin tracking under varying illumination, ICIP03, 2003.

    [93] P. Kakumanu, S. Makrogiannis, R. Bryll, S. Panchanathan, N. Bourbakis, Image chromatic adaptation using ANNs for skin color adaptation, Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence, ICTAI04.

    [94] K.M. Cho, J.H. Jang, K.S. Hong, Adaptive skin-color filter, Pattern Recognition 34 (5) (2001) 1067–1073.

    [95] L. Sigal, S. Sclaroff, V. Athitsos, Skin color-based video segmentation under time-varying illumination, IEEE Trans. Pattern Anal. Mach. Intell. 26 (6) (2004).

    [96] B. Martinkauppi, M. Soriano, M. Pietikäinen, Detection of skin-color under changing illumination: a comparative study, 12th International Conference on Image Analysis and Processing, 2003.

    [97] N. Bourbakis, D. Kavraki, An intelligent agent for blind people navigation, Proc. IEEE International Symposium on BIBE-01, 2001, pp. 230–235.

    [98] S. Panchanathan et al., iCare: a user centric approach to the development of assistive devices for blind and visually impaired, ICTAI03, 2003, pp. 641–648.

    [99] D. Forsyth, A novel approach to color constancy, J. Comput. Vision 5 (1) (1990) 5–36.

    [100] X. Zhu, J. Yang, A. Waibel, Segmenting hands of arbitrary color, AFGR00, 2000.

    [101] H. Yao, W. Gao, Face detection and location based on skin chrominance and lip chrominance transformation from color images, Pattern Recognition 34 (8) (2001) 1555–1564.

    [102] J.-C. Terrillon, M.N. Shirazi, H. Fukamachi, S. Akamatsu, Comparative performance of different skin chrominance models and chrominance spaces for the automatic detection of human faces in color images, AFGR00, 2000, pp. 54–61.

    [103] P. Kakumanu, A face detection and facial expression recognition method applicable to assistive technologies and biometrics, PhD Dissertation, CSE Department, Wright State University, 2006.

    About the Author: Kakumanu received his MS in 2003 and his PhD in 2006 from Wright State University. His research interests are in human computer interaction, assistive technol