IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 14, NO. 10, OCTOBER 2005

Adaptive Perceptual Color-Texture Image Segmentation

Junqing Chen, Member, IEEE, Thrasyvoulos N. Pappas, Senior Member, IEEE, Aleksandra Mojsilović, Member, IEEE, and Bernice E. Rogowitz, Senior Member, IEEE

Abstract—We propose a new approach for image segmentation that is based on low-level features for color and texture. It is aimed at segmentation of natural scenes, in which the color and texture of each segment does not typically exhibit uniform statistical characteristics. The proposed approach combines knowledge of human perception with an understanding of signal characteristics in order to segment natural scenes into perceptually/semantically uniform regions. The proposed approach is based on two types of spatially adaptive low-level features. The first describes the local color composition in terms of spatially adaptive dominant colors, and the second describes the spatial characteristics of the grayscale component of the texture. Together, they provide a simple and effective characterization of texture that the proposed algorithm uses to obtain robust and, at the same time, accurate and precise segmentations. The resulting segmentations convey semantic information that can be used for content-based retrieval. The performance of the proposed algorithms is demonstrated in the domain of photographic images, including low resolution, degraded, and compressed images.

Index Terms—Content-based image retrieval (CBIR), adaptive clustering algorithm, optimal color composition distance (OCCD), steerable filter decomposition, Gabor transform, local median energy, human visual system (HVS) models.

EDICS Category: Trans. IP 2-SEGM (Segmentation)

I. INTRODUCTION

THE RAPID accumulation of large collections of digital images has created the need for efficient and intelligent schemes for image retrieval. Since manual annotation of large image databases is both expensive and time consuming, it is desirable to base such schemes directly on image content. Indeed, the field of Content-Based Image Retrieval (CBIR) has made significant advances in recent years [1], [2]. One of the most important and challenging components of many CBIR systems is scene segmentation.

This paper considers the problem of segmentation of natural images based on color and texture. Although significant

Manuscript received August 7, 2003; revised August 9, 2004. This work was supported by the National Science Foundation (NSF) under Grant No. CCR-0209006. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Michael Schmitt.

J. Chen is with Unilever Research, Trumbull, CT 06611 USA (e-mail: [email protected]).

T. N. Pappas is with the Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL 60208 USA (e-mail: [email protected]).

A. Mojsilović and B. E. Rogowitz are with the IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 USA (e-mail: [email protected]; [email protected]).

progress has been made in texture segmentation (e.g., [3–7]) and color segmentation (e.g., [8–11]) separately, the area of combined color and texture segmentation remains open and active. Some of the recent work includes JSEG [12], stochastic model-based approaches [13–15], watershed techniques [16], edge flow techniques [17], and normalized cuts [18].

Another challenging aspect of image segmentation is the extraction of perceptually relevant information. Since humans are the ultimate users of most CBIR systems, it is important to obtain segmentations that can be used to organize image contents semantically, according to categories that are meaningful to humans. This requires the extraction of low-level image features that can be correlated with high-level image semantics. This is a very challenging problem. However, rather than trying to obtain a complete and detailed description of every object in the scene, it may be sufficient to isolate certain regions of perceptual significance (such as "sky," "water," "mountains," etc.) that can be used to correctly classify an image into a given category, such as "natural," "man-made," "outdoor," etc. [19]. An important first step towards accomplishing this goal is to develop low-level image features and segmentation techniques that are based on perceptual models and principles about the processing of color and texture information.

A significant effort has been devoted recently to understanding perceptual issues in image analysis. This includes perceptual grouping of image contents (e.g., [18], [20], and [21]), perceptual modeling of objects (e.g., [22–24]), perceptual modeling of isolated textures for analysis/synthesis [25], [26], and perceptually based texture classification [27]. However, there has been relatively little work in applying perceptual principles to complex scene segmentation (e.g., [28]), which motivates our work. We focus on a broad domain of photographic images: outdoor and indoor scenes, landscapes, cityscapes, plants, animals, people, objects, etc. A challenging aspect of our work is that we attempt to accomplish both feature extraction and segmentation with relatively low resolution (thumbnail size or lower) and occasionally degraded or compressed images, just as humans do. This is especially important since low resolution images are most frequently used within WWW documents. In addition, the advantage of low resolution images is that access and processing time are significantly reduced.

A. Motivation and Justification for the Proposed Approach

There are two main goals in this work. The first is to develop segmentation algorithms for images of natural scenes, in which


color and texture typically do not exhibit uniform statistical characteristics. The second is to incorporate knowledge of human perception in the design of underlying feature extraction algorithms.

Segmentation of images of natural scenes is particularly difficult because, unlike artificial images that are composed of more or less pure textures, the texture properties are not well defined. The texture characteristics of perceptually distinct regions are not uniform due to effects of lighting, perspective, scale changes, etc. Fig. 1 shows two manually segmented images. Even though the water and the sky in both images are quite distinct segments, the color varies substantially within each segment. Similarly, the spatial characteristics of the city, forest, and mountain segments are also distinct but do not have well defined uniform characteristics. The human visual system (HVS) is very good at accounting for the various effects mentioned above in order to segment natural scenes into perceptually/semantically uniform regions. However, it is extremely difficult to automatically segment such images, and existing algorithms have been only partially successful. The key to addressing this problem is in combining perceptual models and principles of texture and color processing with an understanding of image characteristics.

Recently, there has been considerable progress in developing perceptual models for texture characterization in the areas of texture analysis/synthesis and texture classification. Several authors have presented models for texture analysis and synthesis using multiscale frequency decompositions [26], [29–34]. The most recent and complete results were presented by Portilla and Simoncelli [26], who proposed a statistical model for texture images that is consistent with human perception. Their model is quite elaborate and captures a very wide class of textures. Similarly, there has been considerable activity in texture classification [3–5], [27]. The segmentation problem is quite different, however. Most of the work in texture analysis/synthesis and texture classification has been focused on isolated samples of well-defined textures with relatively uniform characteristics (e.g., wavelet coefficients within each subband follow a certain distribution [35]). In addition, the methods for texture analysis, classification, and synthesis are designed to operate in high-resolution images, which allows for the precise estimation of a relatively large number of texture parameters (e.g., several hundred in [26]). In contrast, we want to segment textures in thumbnail images, which may contain several textures with spatially varying characteristics. Thus, by necessity, our texture models have to be far simpler so their parameters can be robustly estimated from a few sample points. Note that, as we discussed above, for segmentation it is not necessary to characterize every possible texture, only some key texture features that can help discriminate between perceptually important regions.

B. Outline of Proposed Approach

We present an image segmentation algorithm that is based on spatially adaptive texture features. As illustrated in Fig. 2, we develop two types of features: one describes the local color composition, and the other the spatial characteristics of the grayscale component of the texture. These features are first developed independently, and then combined to obtain an overall segmentation.

Fig. 1. Human segmentations (images shown in color).

Fig. 2. Schematic of proposed segmentation algorithm.

The initial motivation for the proposed approach came from the adaptive clustering algorithm (ACA) proposed by Pappas [8]. ACA has been quite successful for segmenting images with regions of slowly varying intensity but oversegments images with texture. Thus, a new algorithm is necessary that can extract color textures as uniform regions and provide an overall strategy for segmenting natural images that contain both textured and smooth areas. The proposed approach uses ACA as a building block. It separates the image into smooth and textured areas, and combines the color composition and spatial texture features to consolidate textured areas into regions.

The color composition features consist of the dominant colors and associated percentages in the vicinity of each pixel. They are based on the estimation of spatially adaptive dominant colors. This is an important new idea which, on one hand, reflects the fact that the HVS cannot simultaneously perceive a large number of colors, and on the other, the fact that region colors are spatially varying. Note that there have been previous approaches based on the concept of extracting


the dominant colors in the image [27], [36], [37]; however, none of them addresses the issue of spatial variations, which is one of the most common characteristics of images of natural scenes. Spatially adaptive dominant colors can be obtained using the ACA [8]. As we will see in Section II, the local intensity functions of the ACA can be used as spatially adaptive dominant colors. Finally, we propose a modified Optimal Color Composition Distance (OCCD) metric to determine the perceptual similarity of two color composition feature vectors [38].

The spatial texture features describe the spatial characteristics of the grayscale component of the texture, and are based on a multiscale frequency decomposition that offers an efficient and flexible approximation of early processing in the HVS. We use the local energy of the subband coefficients as a simple but effective characterization of spatial texture. An important novelty of the proposed approach is that a median filter operation is used to distinguish the energy due to region boundaries from the energy of the textures themselves. We also show that, while the proposed approach depends on the structure of the frequency decomposition, it is relatively independent of the detailed filter characteristics.

The proposed segmentation algorithm combines the color composition and spatial texture features to obtain segments of uniform texture. This is done in two steps. The first relies on a multigrid region growing algorithm to obtain a crude segmentation. The segmentation is crude due to the fact that the estimation of the spatial and color composition texture features requires a finite window. The second step uses an elaborate border refinement procedure to obtain accurate and precise border localization by appropriately combining the texture features with the underlying ACA segmentation.

The novelty of the proposed approach is twofold. First, by using features that adapt to the local image characteristics, it can account for the nonuniformity of the textures that are found in natural scenes; namely, the intensity, color, and texture of a perceptually uniform region can change gradually (but significantly) across the region. The proposed algorithm adapts to such variations by estimating the color composition texture parameters over a hierarchy of window sizes that progressively decrease as the algorithm converges to the final segmentation. Second, in contrast to texture analysis/synthesis techniques that use a large number of parameters to describe texture, it relies on only a small number of parameters that can be robustly estimated (and easily adapted) based on the limited number of pixels that are available in each region.

The paper is organized as follows. Section II presents the color composition texture features. The extraction of the spatial texture features is presented in Section III. Section IV discusses the proposed algorithm for combining the spatial texture and color composition features to obtain an overall segmentation. Segmentation results and comparisons to other approaches are presented in Section V. The conclusions are summarized in Section VI.

II. COLOR COMPOSITION TEXTURE FEATURES

Color has been used extensively as a low-level feature for image retrieval [1], [39–41]. In this section, we discuss new color composition texture features that take into account both image characteristics and human color perception.

A. Motivation and Prior Work

An important characteristic of human color perception is that the human eye cannot simultaneously perceive a large number of colors [27], even though, under appropriate adaptation, it can distinguish more than two million colors [42]. In addition, the number of colors that can be internally represented and identified in cognitive space is about 30 [43]. A small set of color categories provides a very efficient representation and, more importantly, makes it easier to capture invariant properties in object appearance [44].

The idea of using a compact color representation in terms of dominant colors for image analysis was introduced by Ma et al. [36]. The representation they proposed consists of the dominant colors along with the percentage of occurrence of each color

f_c(I) = { (c_i, p_i), i = 1, ..., N, p_i ∈ [0, 1] }     (1)

where each of the dominant colors, c_i, is a three-dimensional (3-D) vector in RGB space, and p_i are the corresponding percentages. Mojsilović et al. [27] adopted this representation using an (approximately) perceptually uniform color space (Lab). It has been shown that the quality of image retrieval algorithms can be substantially improved by using such color spaces [45].
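To make the representation in (1) concrete, the following is a minimal sketch that obtains a fixed set of dominant colors and their percentages of occurrence. It uses plain k-means clustering as a stand-in (it is not the extraction method of [36] or the mean-shift algorithm of [10]); the function name and parameters are our own illustration.

```python
import numpy as np

def dominant_colors(pixels, k=4, iters=20, seed=0):
    """Fixed (global) dominant colors in the spirit of Eq. (1):
    k cluster centers plus the percentage of pixels assigned to each.
    `pixels` is an (n, 3) array of color triples."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float)
    centers = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        # assign each pixel to its nearest center (Euclidean distance)
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned pixels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    percentages = np.bincount(labels, minlength=k) / len(pixels)
    return centers, percentages
```

The returned (centers, percentages) pair is exactly the { (c_i, p_i) } set of (1), with the percentages summing to one.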

As implied by (1), the dominant colors in [27], [36], [37] are fixed over an image or a collection of images. There are a number of approaches for extracting the dominant colors [27], [36], [38], [46]. A relatively simple and quite effective algorithm that can be used for obtaining the dominant colors of an image is the color segmentation algorithm proposed by Comaniciu and Meer [10], which is based on the "mean-shift" algorithm for estimating density gradients and is, thus, known as the mean-shift algorithm in the literature. However, it does not take into consideration spatial variations in the dominant colors of a (natural) image. Another approach that assumes constant dominant colors, but takes into account the spatial distribution of the original image colors, is presented in [47]. It recognizes the fact that human visual perception is more sensitive to changes in smooth regions and quantizes the colors more coarsely in detailed regions.

The above dominant color extraction techniques rely on the assumption that the characteristic colors of an image (or class of images) are relatively constant, i.e., they do not change due to variations in illumination, perspective, etc. This is true for images of fabrics, carpets, interior design patterns, and other pure textures. The class of images that we are considering, however, is more general and includes indoor and outdoor scenes, such as landscapes, cityscapes, plants, animals, people, and man-made objects. To handle such images, one has to account for color and lighting variations in the scene. Thus, while the above approaches can provide colors that are quite useful in characterizing the image as a whole, the resulting color classification (segmentation) could be quite inadequate due to lack of spatial adaptation and spatial constraints [8].


In addition to the spatially varying image characteristics, one has to take into consideration the adaptive nature of the HVS [48]. For example, we perceive regions with spatially varying color as a single color.

B. Proposed Color Composition Features

In order to account for the spatially varying image characteristics and the adaptive nature of the HVS, we introduce the idea of spatially adaptive dominant colors. The proposed color composition feature representation consists of a limited number of locally adapted dominant colors and the corresponding percentage of occurrence of each color within a certain neighborhood:

f_c(x, y, N_{x,y}) = { (c_i, p_i), i = 1, ..., M, p_i ∈ [0, 1] }     (2)

where each of the dominant colors, c_i, is a 3-D vector in Lab space, and p_i are the corresponding percentages. N_{x,y} denotes the neighborhood around the pixel at location (x, y), and M is the total number of colors in the neighborhood. A typical value is M = 4. As we will see below, this number can vary in different parts of the image.

One approach for obtaining spatially adaptive dominant colors is the ACA proposed in [8] and extended to color in [9]. The ACA is an iterative algorithm that can be regarded as a generalization of the K-means clustering algorithm [46], [49] in two respects: it is adaptive and includes spatial constraints. It segments the image into K classes. Each class is characterized by a spatially varying characteristic function μ_k(x, y) that replaces the spatially fixed cluster center of the K-means algorithm. Given these characteristic functions, the ACA finds the segmentation that maximizes the a posteriori probability density function for the distribution of regions given the observed image. The algorithm alternates between estimating the characteristic functions and updating the segmentation. The initial estimate is obtained by the K-means algorithm (and, in particular, the implementation described in [50]), which estimates the cluster centers (i.e., the dominant colors) by averaging the colors of the pixels in each class over the whole image. The key to adapting to the local image characteristics is that the ACA estimates the characteristic functions μ_k(x, y) by averaging over a sliding window whose size progressively decreases. Thus, the algorithm starts with global estimates and slowly adapts to the local characteristics of each region. As we will see below, it is these characteristic functions μ_k(x, y) that are used as the spatially adaptive dominant colors.
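The alternation between estimating local class colors and reassigning pixels can be illustrated with a much-simplified sketch. It keeps only the core idea, a spatially varying mean color per class estimated over local windows, and omits the spatial smoothness prior, the MAP formulation, and the shrinking window schedule of the actual ACA [8]; all names and parameters are hypothetical.

```python
import numpy as np

def adaptive_dominant_colors(img, labels, window=3, iters=3):
    """Toy version of the adaptive estimation loop.
    img:    H x W x 3 float image.
    labels: H x W integer class map (initial segmentation).
    Returns mu (K x H x W x 3 local class colors) and updated labels."""
    h, w, _ = img.shape
    K = int(labels.max()) + 1
    mu = np.zeros((K, h, w, 3))
    for _ in range(iters):
        # estimate mu_k(x, y): mean color of class-k pixels in the
        # window centered at (x, y)
        for k in range(K):
            mask = labels == k
            for y in range(h):
                for x in range(w):
                    y0, y1 = max(0, y - window), min(h, y + window + 1)
                    x0, x1 = max(0, x - window), min(w, x + window + 1)
                    m = mask[y0:y1, x0:x1]
                    if m.any():
                        mu[k, y, x] = img[y0:y1, x0:x1][m].mean(axis=0)
        # reassign each pixel to the class with the nearest LOCAL color
        dist = np.linalg.norm(mu - img[None], axis=3)
        labels = dist.argmin(axis=0)
    return mu, labels
```

Because each class color is re-estimated locally, one class can represent quite different colors in different parts of the image, which is the property exploited in the text.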

Fig. 3 compares the adaptive dominant colors obtained by the ACA [8] to the constant dominant colors obtained by the mean-shift algorithm [10]. The examples for the mean-shift algorithm were generated using the "oversegmentation" setting. Note the false contours in the mean-shift algorithm in the water and the sky. Also, while there are color variations in the forest region, the segment boundaries do not appear to correspond to any true color boundaries. The ACA, on the other hand, smoothes over the water, sky, and forest regions, while capturing the dominant edges of the scene. Note that the ACA was developed for images of objects with smooth surfaces and no texture. Thus, in many textured regions, like the mountain area, the ACA oversegments the image, but the segments do correspond to actual texture details. Thus, it preserves the essential color characteristics of the texture. In other textured areas, like the forest, the ACA consolidates everything into one region. In such cases, the color variations in the texture are not as significant and can be represented by their local average.

In contrast to the other approaches, the ACA is quite robust to the number of classes. This is because the gradual color adaptation makes it possible to use one color class to represent a wide range of similar colors, provided that they vary gradually over the image. In addition, as we move to another part of the image, the same color class can be used to represent an entirely different color. Thus, one of the advantages of using the ACA to obtain spatially adaptive dominant colors is that we only need to specify the parameter K, which then determines the maximum number of dominant colors (M ≤ K) in any given region of the image. We found that a small number (e.g., K = 4) is quite adequate.

The ACA segments the image into color classes, as shown in Fig. 3(d). At every pixel in the image, each class is represented by the characteristic function μ_k(x, y), i.e., a color that is equal to the average color of the pixels in its neighborhood that belong to that class [8]. In the example of Fig. 3(c), each pixel is painted with the representative color of the class that it belongs to. Since the characteristic functions (dominant colors) are slowly varying, we can assume that they are approximately constant in the immediate vicinity of a pixel. Thus, the color composition feature representation of the form (2) at each point in the image consists of the (up to) K characteristic colors of each class and the associated percentage of pixels within a given window. Note that, given an ACA segmentation, the color feature vectors can be computed using a different window size, by averaging the colors of each class in the window and computing the percentage of pixels in each class.
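The last step, computing a color composition feature of the form (2) from a segmentation and its per-pixel colors over a chosen window, can be sketched as follows. This is an illustrative reading of the text, not the authors' code; the window size and names are assumptions.

```python
import numpy as np

def color_composition_feature(colors, labels, y, x, half=6):
    """Per-class (average color, percentage) pairs inside the window
    centered at pixel (y, x).  `colors` is an H x W x 3 image (standing
    in for the ACA's characteristic colors) and `labels` the H x W
    class map; the percentages over the window sum to one."""
    h, w = labels.shape
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    lab = labels[y0:y1, x0:x1].reshape(-1)
    col = colors[y0:y1, x0:x1].reshape(-1, 3)
    feature = []
    for k in np.unique(lab):
        sel = lab == k
        # dominant color of class k in the window, and its area fraction
        feature.append((col[sel].mean(axis=0), float(sel.mean())))
    return feature
```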

C. Color Composition Similarity Metric

We now define a metric that measures the perceptual similarity between two color composition feature vectors. Based on human perception, the color composition of two images (or image segments) will be similar if the colors are similar and the total areas that each color occupies are similar [27], [38]. The definition of a metric that takes into account both the color and area differences depends on the mapping between the dominant colors of the two images [38]. Various suboptimal solutions have been proposed [27], [36]. Mojsilović et al. [38] proposed the OCCD, which finds the optimal mapping between the dominant colors of two images and, thus, provides a better similarity measure. The OCCD, which is closely related to the earth mover's distance [51],1 overcomes the (significant) problems of the other metrics but, in general, requires more computation. However, since we are primarily interested in comparing image segments that contain only a few colors (at most four), the additional overhead for the OCCD is reasonable. Moreover, we introduce an efficient implementation of OCCD for the problem at hand that

1 For a comparison of the two metrics, see [38].

Fig. 3. Color image segmentation (a, b, c shown in color). (a) Original color image. (b) Mean shift algorithm. (c) ACA. (d) ACA color classes.

produces a close approximation of the optimal solution. The steps of the proposed OCCD implementation are as follows:

1) Given two color composition feature vectors f_c^1 and f_c^2, create a stack of tokens (colors and corresponding percentages) for each feature vector, as shown in Fig. 4. Create an empty destination stack for each vector.

2) Select a pair of tokens (c_a, p_a) and (c_b, p_b) with nonzero percentages, one from each feature vector, whose colors are closest.

3) Move the token with the lowest percentage (e.g., (c_a, p_a)) to the destination stack. Split the other token into (c_b, p_a) and (c_b, p_b − p_a), and move the first to the corresponding destination stack.

4) Repeat the above steps with the remaining colors, until the initial stacks are empty.

An illustrative example is shown in Fig. 4. Note that even though this implementation is not guaranteed to result in the optimal mapping, in practice, given the small number of classes, it produces excellent results. On the other hand, it avoids the quantization error introduced by the original OCCD and, thus, can be even more accurate than the original implementation. Once the color correspondences are established, the OCCD distance is calculated as follows:

D(f_c^1, f_c^2) = Σ_{i=1}^{N} d(c_i^1, c_i^2) · p_i     (3)

where c_i^1, c_i^2, and p_i are the matched colors and corresponding percentages after the color matching process described above, and d(·) is the distance in some color space. We use the Euclidean distance in Lab space.
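The greedy matching steps 1)–4) together with the distance (3) can be sketched as follows, assuming each feature vector is a list of (Lab color, percentage) pairs whose percentages sum to one. This is an illustration of the described procedure, not the authors' implementation.

```python
import numpy as np

def occd_distance(f1, f2, tol=1e-9):
    """Greedy stack-based OCCD approximation: repeatedly pair the
    closest remaining colors, move the smaller percentage, split the
    larger token, and accumulate d(c1, c2) * p over the matched pairs
    as in Eq. (3), with d the Euclidean distance in Lab space."""
    s1 = [[np.asarray(c, float), p] for c, p in f1]
    s2 = [[np.asarray(c, float), p] for c, p in f2]
    total = 0.0
    while s1 and s2:
        # pick the pair of tokens (one from each stack) with closest colors
        i, j = min(
            ((i, j) for i in range(len(s1)) for j in range(len(s2))),
            key=lambda ij: np.linalg.norm(s1[ij[0]][0] - s2[ij[1]][0]),
        )
        p = min(s1[i][1], s2[j][1])          # move the smaller percentage
        total += np.linalg.norm(s1[i][0] - s2[j][0]) * p
        s1[i][1] -= p                        # split: leave the remainder
        s2[j][1] -= p
        s1 = [t for t in s1 if t[1] > tol]   # drop exhausted tokens
        s2 = [t for t in s2 if t[1] > tol]
    return total
```

For identical feature vectors the distance is zero, and for two single-color features it reduces to the plain color distance weighted by the (full) percentage.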

III. SPATIAL TEXTURE FEATURES

As we discussed in the introduction, the color composition and spatial texture features are developed independently. We use only the grayscale component2 of the image to derive the spatial texture features, which are then combined with the color composition features to obtain an intermediate crude segmentation. This is in contrast to the approaches described

2 The grayscale component is obtained as a standard linear combination of gamma corrected RGB values.

Fig. 4. Example of simplified version of OCCD (shown in color).

in [12], [27], where the color quantization/segmentation is used to obtain an achromatic pattern map which becomes the basis for texture feature extraction.

A. Motivation and Prior Work

Like many of the existing algorithms for texture analysis and synthesis (e.g., [5], [6], [26], [29–34], [52–57]), our approach is based on a multiscale frequency decomposition. Examples of such decompositions are the cortex transform [58], the Gabor transform [30], [59], the steerable pyramid decomposition [60–62], and the discrete wavelet transform (DWT) [63], [64], which can be regarded as a crude approximation of the cortex transform. We base our spatial texture feature extraction on one of the more accurate approximations of the visual cortex, the steerable pyramid decomposition, which can be designed to produce any number of orientation bands. The proposed methodology, however, can make use of any of the decompositions mentioned above. Fig. 5 shows examples of frequency decompositions that can be obtained with the steerable pyramid.

Fig. 5. Steerable filter decomposition. (a) Ideal two-level decomposition. (b) Ideal one-level decomposition (horizontal bands shown in gray). (c) Circular cross section of real steerable filter frequency response.

One of the most commonly used features for texture analysis in the context of multiscale frequency decompositions is the energy of the subband coefficients [3–7], [65]. Various nonlinear operations have been used to boost up the sparse subband coefficients [3], [36], [57], [65]. Our approach is based on the local median energy of the subband coefficients, where the energy is defined as the square of the coefficients. As we saw in the introduction, the advantage of the median filter is that it suppresses textures associated with transitions between regions, while it responds to texture within uniform regions. The use of median local energy as a nonlinear operation also agrees with Graham [66] and Graham and Sutter [67], [68], who conclude that a nonlinear operator in texture segregation must have an accelerating/expansive nature.
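As a concrete sketch, the local median energy can be computed by median-filtering the squared subband coefficients. This is our illustration, not the authors' implementation; `local_median_energy` is a hypothetical helper, and the 25×25 default window is one plausible choice given the window sizes discussed later in this section.

```python
import numpy as np
from scipy.ndimage import median_filter

def local_median_energy(subband, window=25):
    """Median of the squared subband coefficients over a window x window
    neighborhood. The median suppresses isolated responses (e.g., at
    region transitions) while preserving energy within uniform texture."""
    return median_filter(subband.astype(float) ** 2, size=window)
```

An isolated spike in an otherwise flat subband is completely suppressed, while a uniformly textured region keeps its energy, which is exactly the behavior the paragraph above attributes to the median.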

B. Proposed Spatial Texture Features

We use a steerable filter decomposition with four orientation subbands (horizontal, vertical, +45°, -45°), as shown in Fig. 5. Most researchers have used four to six orientation bands to approximate the orientation selectivity of the HVS (e.g., [58], [69]). Since the images are fairly small, we found that a one-level decomposition (lowpass band, four orientation bands, and highpass residue, as shown in Fig. 5(b)) is adequate. Out of those we use only the four orientation bands. Our goal is to identify regions with a dominant orientation (horizontal, vertical, +45°, -45°); all other regions will be classified as smooth (not enough energy in any orientation) or complex (no dominant orientation).

Fig. 5(c) shows a circular cross section of the steerable filter responses. Note that there is a large overlap between neighboring filters. Thus, even when there is a dominant orientation, the response of the neighboring filters will be quite significant, especially when the texture orientation falls between the main orientations of the steerable filters. Therefore, it is the maximum of the four coefficients that determines the orientation at a given pixel location.³

The spatial texture feature extraction consists of two steps. First, we classify pixels into smooth and nonsmooth categories. Then we further classify nonsmooth pixels into the remaining categories. Let $s_0(x, y)$, $s_1(x, y)$, $s_2(x, y)$, and $s_3(x, y)$ represent the steerable subband coefficients at location $(x, y)$ that correspond to the horizontal (0°), diagonal with positive slope (+45°), vertical (90°), and diagonal with negative slope (-45°) directions, respectively. We will use $s_{\max}(x, y)$ to denote the maximum (in absolute value) of the four coefficients at location $(x, y)$, and $i_{\max}(x, y)$ to denote the subband index that corresponds to that maximum.

³In [70], we used the closeness of the 1st and 2nd maxima of the four subband coefficients as an indication of a complex region. However, such a criterion misclassifies, as complex, textures with orientations that fall between the main orientations of the steerable filters, for which the responses of the two filters are close. Using sharper orientation filters will narrow the range of misclassified orientations, but will not entirely eliminate the problem.

A pixel will be classified as smooth if there is no substantial energy in any of the four orientation bands. As we discussed above, a median operation is necessary for boosting the response to texture within uniform regions and suppressing the response due to textures associated with transitions between regions. A pixel $(x, y)$ is classified as smooth if the median of $s_{\max}(x', y')$ over a neighborhood of $(x, y)$ is below a threshold $T_0$. This threshold is determined using a two-level $K$-means algorithm that segments the image into smooth and nonsmooth regions. A cluster validation step is necessary at this point. If the clusters are too close, then the image may contain only smooth or nonsmooth regions, depending on the actual value of the cluster center.
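The threshold selection can be sketched as a one-dimensional two-cluster K-means over the local median energies. This is our own sketch: the `min_sep` validation margin is an assumption, since the text does not quantify when the two clusters are "too close".

```python
import numpy as np

def smooth_threshold(median_energy, min_sep=0.1, iters=20):
    """Two-level (K = 2) k-means on local median energies.

    Returns the midpoint of the two cluster centers as the smooth /
    nonsmooth threshold, or None when the centers are too close
    (cluster validation: the image may then be all smooth or all
    nonsmooth, depending on the center value)."""
    x = median_energy.ravel().astype(float)
    c = np.array([x.min(), x.max()])  # initialize centers at the extremes
    for _ in range(iters):
        labels = np.abs(x[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = x[labels == k].mean()
    if abs(c[1] - c[0]) < min_sep * max(c.max(), 1e-9):
        return None  # validation failed: clusters too close
    return c.mean()
```

On a clearly bimodal energy distribution this returns the midpoint between the two modes; on a near-constant distribution it declines to pick a threshold.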

The next step is to classify the pixels in the nonsmooth regions. As we mentioned above, it is the index $i_{\max}(x, y)$ of the maximum of the four subband coefficients that determines the orientation of the texture at each image point. The texture classification is based on the local histogram of these indices. Again, a median type of operation is necessary for boosting the response to texture within uniform regions and suppressing the response due to textures associated with transitions between regions. This is done as follows. We compute the percentage for each value (orientation) of the index $i_{\max}(x', y')$ in the neighborhood of $(x, y)$. Only the nonsmooth pixels within the neighborhood are considered. If the maximum of the percentages is higher than a threshold $T_1$ (e.g., 36%) and the difference between the first and second maxima is greater than a threshold $T_2$ (e.g., 15%), then there is a dominant orientation in the window and the pixel is classified accordingly. Otherwise, there is no dominant orientation, and the pixel is classified as complex. The first threshold ensures the existence of a dominant orientation and the second ensures its uniqueness.

An example is presented in Fig. 6. The grayscale component of the original color image is shown in Fig. 6(a). In Fig. 6(b), the smooth regions are shown in black, and the nonsmooth regions are shown in different shades of gray representing the indices $i_{\max}$ of the subband coefficients with maximum energy. Fig. 6(c) shows the resulting texture classes, where black denotes smooth, white denotes complex, and light gray denotes horizontal textures. (There are no diagonal textures in this example.) The window for the median operation was 25×25.

Fig. 6. Texture map extraction. (a) Grayscale component of original image. (b) Smooth (black) and nonsmooth (different shades of gray) regions using steerable filter decomposition. (c) Texture classes using steerable filter decomposition. (d) Texture classes using Gabor decomposition. Texture window size = 25×25.
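The two-threshold orientation test described above can be sketched as follows. This is our illustration; the 36% and 15% defaults are the example values quoted in the text, and the helper name is hypothetical.

```python
import numpy as np

def classify_nonsmooth_pixel(idx_window, nonsmooth_mask, T1=0.36, T2=0.15):
    """Classify one nonsmooth pixel from the orientation indices
    in its neighborhood.

    idx_window:     orientation index (0..3) of each pixel in the window
    nonsmooth_mask: True where the corresponding window pixel is nonsmooth
    Returns the dominant orientation index, or "complex"."""
    idx = idx_window[nonsmooth_mask]
    if idx.size == 0:
        return "complex"
    pct = np.bincount(idx, minlength=4) / idx.size
    order = np.argsort(pct)[::-1]  # orientations sorted by percentage
    # T1 ensures a dominant orientation exists; T2 ensures it is unique.
    if pct[order[0]] > T1 and pct[order[0]] - pct[order[1]] > T2:
        return int(order[0])
    return "complex"
```

A window dominated by a single index passes both tests, while a window with an even mix of orientations fails the first test and is labeled complex.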

C. Spatial Texture Similarity Metric

To measure the similarity between two spatial texture features $f_t^1$ and $f_t^2$, we define the following distance:

$$D_t(f_t^1, f_t^2) = \begin{cases} 0, & \text{if } f_t^1 = f_t^2 \\ \tau_{a,b}, & \text{if } f_t^1 = a \neq b = f_t^2 \end{cases} \qquad (4)$$

where $\tau_{a,b}$ is a threshold that will, in general, depend on the combination of texture classes (smooth, horizontal, vertical, +45°, -45°, and complex); in the following, we will assume two different values for $\tau_{a,b}$, one for within the nonsmooth texture classes and the other for between the smooth and nonsmooth classes. This metric will be used in combination with the color metric to determine the overall similarity between two texture (color composition and spatial texture) feature vectors. The value of $\tau_{a,b}$ represents the penalty for inconsistent color composition and spatial texture classification. The idea is that, if the spatial texture classes are the same, then we allow for more color variation. If they are not the same, the colors have to be more similar in order for pixels to belong to the same class.
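In code, the metric of (4) reduces to a small helper. This is our sketch; the penalty value passed in is a tunable parameter, not a value taken from the paper.

```python
def texture_distance(t1, t2, tau_ab):
    """Spatial texture distance of (4): zero when the two texture class
    labels agree, otherwise the class-pair-dependent penalty tau_ab
    (chosen per pair, e.g., one value within the nonsmooth classes and
    another between smooth and nonsmooth classes)."""
    return 0.0 if t1 == t2 else tau_ab
```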

D. Implementation Details and Other Considerations

In the texture class extraction procedure, we found that the window size for the median operator is of critical importance. It must be large enough to capture the local texture characteristics, but not too large, to avoid border effects. Our experiments indicate that window sizes in the range of 17×17 to 25×25 pixels are suitable for the steerable filter decomposition. A more careful determination of the window size should be based on subjective experiments. Note also that the window size depends on the specific decomposition. For example, we found that the DWT requires smaller window sizes [71]. That is because in the DWT the subbands are downsampled, while in the steerable decomposition that we use they are not. The window size also depends on the extent of the analysis filters.

We have also experimented with alternative ways to obtain the smooth vs. nonsmooth classification. For example, we tried an approach similar to the one described in [71], whereby the local median energy of each subband coefficient is computed first, followed by a two-level $K$-means. A pixel is then classified as smooth if all subbands belong to the low energy class. This leads to similar results but involves more computation. Another approach is to apply $K$-means to the vector of the local median energies of the four subband coefficients. We found that the proposed algorithm has the best performance in terms of accuracy and robustness, as well as computational efficiency.

We also considered a number of alternative decompositions. In [70], [71] we compared the performance of the DWT and the steerable filter decomposition using similar classification procedures, and found that the steerable filter decomposition produces superior results. As we discussed above, this is mainly due to the fact that the DWT does not separate the two diagonal directions. A number of other filter banks that generate complete/over-complete orientational decompositions can be used instead of the steerable filters. For example, we tried a one-level, four-orientation Gabor decomposition⁴ with the rest of the procedure unchanged, and found that its performance is comparable to that of the steerable filters. Fig. 6(d) shows the resulting texture class map. Note that because of the "max" operator, using sharper orientation filters will not lead to better texture classification.

IV. SEGMENTATION ALGORITHM

In this section, we present an algorithm that combines the color composition and spatial texture features to obtain the overall image segmentation.

The smooth and nonsmooth regions are considered separately. As we discussed in Section II, the ACA was developed for images with smooth regions. Thus, in those regions, we can rely on the ACA for the final segmentation. However, some region merging may be necessary. Thus, in the smooth regions, we consider all pairs of connected neighboring segments, and merge them if the average color difference across the common border is below a given threshold. The color difference at each point along the border is based on the spatially adaptive dominant colors provided by the ACA, which thus provides a natural and robust region merging criterion. Finally, any remaining small color segments⁵ that are connected to nonsmooth texture regions are considered together with the nonsmooth regions, and are assumed to have the same label as any nonsmooth region they are connected to. Fig. 7 shows the different stages of the algorithm; (a) shows an original color image, (b) shows the ACA segmentation (dominant colors), (c) shows the texture classes, and (d) and (e) show the color segments in the smooth regions before and after the merging operation. The nonsmooth regions are shown in white, while the smooth regions have been painted with the average color of each connected segment.

⁴The Gabor filters we used are of size 9×9 pixels and we used the same filter design and parameters as in [57].

We now consider the nonsmooth regions, which have been further classified into horizontal, vertical, +45°, -45°, and complex categories. These categories must be combined with the color composition features to obtain segments of uniform texture. We obtain the final segmentation in two steps. The first combines the color composition and spatial texture features to obtain a crude segmentation, and the second uses an elaborate border refinement procedure, which relies on the color information to obtain accurate and precise border localization.

A. Crude Segmentation

The crude segmentation is obtained with a multigrid region growing algorithm. We start with pixels located on a coarse grid in nonsmooth regions, and compute the color composition features using a window size equal to twice the grid spacing, i.e., with 50% overlap with adjacent horizontal or vertical windows. Only pixels in nonsmooth regions and smooth pixels that are neighbors with nonsmooth pixels are considered. Note that the color composition features are computed at the full resolution; it is the merging only that is carried out on different grids. The merging criterion, which we discuss below, combines the color composition and spatial texture information.

Ideally, a pair of pixels belongs to the same region if their color composition features are similar and they belong to the same spatial texture category. Thus, to determine if a pair of pixels belong to the same region, we compute the distance between their feature vectors $f^1 = (f_c^1, f_t^1)$ and $f^2 = (f_c^2, f_t^2)$, which include both the color composition and spatial texture features:

$$D(f^1, f^2) = D_c(f_c^1, f_c^2) + D_t(f_t^1, f_t^2) \qquad (5)$$

where $D_c(\cdot)$ and $D_t(\cdot)$ were defined in the previous sections.

In addition, we incorporate spatial constraints in the form of Markov random fields (MRFs). Using an MAP formulation similar to that of [8], whereby the conditional density of the observation is Gaussian and the a priori density of the class assignments is MRF, a pixel is assigned to the class that minimizes the following function:

$$D(f^0, f^i) + \beta\,(M_i - V_i) \qquad (6)$$

where $f^0$ is the feature vector of the current pixel, $f^i$ is the feature vector of its $i$th neighbor, $V_i$ ($M_i$) is the number of nonsmooth neighbors that belong to the same (different) class as the $i$th neighbor, and $\beta$ represents the strength of the spatial constraint. Thus, a pixel is more likely to belong to a class when many of its neighbors belong to the same class. In order to allow new classes to be created, we arbitrarily set the feature distance between the current pixel and a pixel in a new class equal to a threshold $\tau_0$. Note that because of the MRF constraint, the likelihood of appearance of a new class decreases as $\beta$ increases.
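The per-class cost being minimized can be stated compactly. This is our sketch of the cost in (6); the function and argument names are ours.

```python
def mrf_cost(feature_dist, beta, n_same, n_diff):
    """Cost of assigning the current pixel to a candidate class:
    the feature distance (5) to that class plus beta times the number
    of nonsmooth neighbors in a different class minus the number in
    the same class. The candidate class minimizing this cost wins."""
    return feature_dist + beta * (n_diff - n_same)
```

With many same-class neighbors the penalty term becomes negative, so a slightly larger feature distance can still be overcome, which is the smoothing effect described above.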

Since the MRF constraint is symmetric, it is necessary to iterate a few times for a given grid spacing. The grid spacing and window size are then reduced by a factor of two, and the procedure is repeated until the spacing is equal to one pixel. Fig. 7(f) shows an example of the resulting crude segmentation. Fig. 8 shows examples of crude segmentations obtained with different values of the parameter $\beta$. Note that in Fig. 7(d), (e), (f), (g), and in Fig. 8, the different segments have been painted with the average color of the region, while in Fig. 7(d) and (e) white represents nonsmooth regions.

B. Border Refinement Using Adaptive Clustering

Once the crude segmentation is obtained, we refine it by adaptively adjusting the borders using the color composition texture features. The approach is similar to that of the ACA [8], and is illustrated in Fig. 9. The dotted line represents the actual boundary and the solid line denotes the boundary location in the current iteration. For each pixel in the image, we use a small window to estimate the pixel texture characteristics, i.e., a color composition feature vector of the form (2), and a larger window to obtain a localized estimate of the region characteristics. For each texture segment that the larger window overlaps, we obtain a separate color composition feature vector, that is, we find the average color and percentage for each of the dominant colors. We then use the OCCD criterion to determine which segment has a feature vector that is closest to the feature vector of the small window, and classify the pixel accordingly. An MRF constraint similar to the one in (6) is added to ensure region smoothness. The above procedure could be repeated for each pixel in a raster scan. To save computation, however, we only consider pixels on the border between nonsmooth segments or between smooth and nonsmooth segments. (The borders between smooth segments have already been fixed.) A few iterations are necessary for convergence. The iterations converge when the number of pixels that change class is below a given threshold (e.g., equal to the average of the widths of the two windows). We then reduce the window sizes and repeat the procedure. For example, we use a series of window pairs starting from 35/5 and ending with 11/3. (The window sizes are odd so that the windows are symmetric.)

One of the important details in the above procedure is that each of the candidate regions in the larger window must be large enough in order to obtain a reliable estimate of its texture attributes. If the area of a segment that overlaps the larger window is not large enough, then the region is not a valid candidate. A reasonable choice for the threshold on the overlapping area is the product of the window sizes divided by 2.

As we mentioned above, the refinement procedure is applied to the whole image except the smooth regions, where, as we saw, the ACA provides accurate segmentation and no refinement is necessary. Moreover, it is easy and interesting to explain why the border refinement procedure, which is designed for nonsmooth textures, will not work in the smooth regions. Let us assume we have a border between two smooth regions as shown in Fig. 9. Let the local feature be $f_c(x, y, N_{x,y}) = \{(c^1, p^1), (c^2, p^2)\}$ and the features of the two segments be $f_c^a(x, y, N_{x,y}) = \{(c^a, 1)\}$ and $f_c^b(x, y, N_{x,y}) = \{(c^b, 1)\}$. Note that these are smooth segments, and thus, each is characterized by one color. Since the colors are slowly varying, we have $c^1 \approx c^a$ and $c^2 \approx c^b$. Thus, the OCCD feature distances between the local feature and the two segment features become

$$D_c(f_c, f_c^a) = d(c^1, c^a)\,p^1 + d(c^2, c^a)\,p^2 \approx d(c^b, c^a)\,p^2$$
$$D_c(f_c, f_c^b) = d(c^1, c^b)\,p^1 + d(c^2, c^b)\,p^2 \approx d(c^a, c^b)\,p^1$$

where $d(\cdot)$ represents the distance between the dominant colors in a given color space, as we saw in (3). Thus, the OCCD feature distances are actually determined by the percentages of the colors, and, hence, the refinement will lead to the wrong results.

Fig. 7. Color and texture image segmentation (a, b, d, e, f, g, h shown in color). (a) Original color image. (b) Color segmentation (ACA). (c) Texture classes. (d) Smooth regions before merging. (e) Smooth regions after merging. (f) Crude segmentation. (g) Final segmentation. (h) Final segmentation (on original image). Texture window size = 25×25 and β = 0.8. White regions in (c) denote complex regions. White regions in (d) and (e) denote nonsmooth regions.

The final segmentation results are shown in Fig. 7(g) and (h). Additional segmentation results are shown in Fig. 10; the resolution of the images varies from image to image, but all are fairly small. Most of the images shown were found on the Internet; example (e) comes from the Berkeley image database [72]. Fig. 11 shows the segmentation results obtained by JSEG [12], a segmentation algorithm that is also based on texture and color. We chose the "no merge" option for the JSEG examples shown. Thus, in comparing with the results of the proposed algorithm in Fig. 10, one should keep in mind that the JSEG images are oversegmented. It is fair to assume that a reasonable region merging step could be applied, even though the JSEG merging criterion does not work that well. Thus, for example, there are no significant differences between the two algorithms in the forest area of example (b) or the flower area of example (c). On the other hand, there are significant differences in (g) that cannot be eliminated with region merging, e.g., around the boat or the boundary between the city and the forest at the top of the picture. Similarly, there are significant differences in example (i), where the tower behind the train is segmented well by our algorithm, but is merged with one of the sky segments by JSEG. Note that in example (h), the color of the sky is too close to the color of the mountains, and thus, both algorithms merge part of the mountains with the sky. Note that the proposed algorithm occasionally also oversegments some textured regions, e.g., in the lower left corner of example (a) and the forest area of example (b). For such cases, a region merging criterion similar to the one we described for the smooth regions can be applied to the textured regions. Fig. 11(a), (b), (i), (h), and (j) also demonstrate that the proposed algorithm can handle color and texture gradients.

Fig. 8. Illustrating the effects of spatial constraints (images shown in color). The left column shows crude segmentations and the right column shows final segmentations. From top to bottom, the value of β increases. Texture window size = 25×25.

Fig. 9. Illustration of border refinement.

V. CONCLUSION

We presented a new approach for image segmentation that is based on low-level features for color and texture. It is aimed at segmentation of natural scenes, in which the color and texture of each segment do not typically exhibit uniform statistical characteristics. The proposed approach combines knowledge of human perception with an understanding of signal characteristics in order to segment natural scenes into perceptually/semantically uniform regions.

The proposed approach is based on two types of spatially adaptive low-level features. The first describes the local color composition in terms of spatially adaptive dominant colors, and the second describes the spatial characteristics of the grayscale component of the texture. Together they provide a simple and effective characterization of texture that can be used to obtain robust, and at the same time, accurate and precise segmentations. The performance of the proposed algorithms has been demonstrated in the domain of photographic images, including low resolution, degraded, and compressed images. As we have shown, one of the strengths of the algorithm is that it can handle color and texture gradients, which are commonly found in perceptually uniform regions of natural scenes.

The image segmentation results can be used to derive region-specific color and texture features. These can be combined with other segment information, such as location, boundary shape, and size, in order to extract semantic information. Such semantic information may be adequate to classify an image correctly, even though our segmentation results may not always correspond to semantic objects as perceived by human observers.

ACKNOWLEDGMENT

This work was supported by the National Science Foundation (NSF) under Grant No. CCR-0209006. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

REFERENCES

[1] Y. Rui, T. S. Huang, and S.-F. Chang, "Image retrieval: Current techniques, promising directions and open issues," J. Visual Communication and Image Representation, vol. 10, pp. 39–62, Mar. 1999.

[2] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content-based image retrieval at the end of the early years," IEEE Trans. Pattern Anal. Machine Intell., vol. 22, pp. 1349–1379, Dec. 2000.

[3] A. Kundu and J.-L. Chen, "Texture classification using QMF bank-based subband decomposition," CVGIP: Graphical Models and Image Processing, vol. 54, pp. 369–384, Sept. 1992.

[4] T. Chang and C.-C. J. Kuo, "Texture analysis and classification with tree-structured wavelet transform," IEEE Trans. Image Processing, vol. 2, pp. 429–441, Oct. 1993.

[5] M. Unser, "Texture classification and segmentation using wavelet frames," IEEE Trans. Image Processing, vol. 4, pp. 1549–1560, Nov. 1995.

[6] T. Randen and J. H. Husøy, "Texture segmentation using filters with optimized energy separation," IEEE Trans. Image Processing, vol. 8, pp. 571–582, Apr. 1999.

[7] G. Van de Wouwer, P. Scheunders, and D. Van Dyck, "Statistical texture characterization from discrete wavelet representations," IEEE Trans. Image Processing, vol. 8, pp. 592–598, Apr. 1999.

[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Trans. Signal Processing, vol. 40, pp. 901–914, Apr. 1992.

[9] M. M. Chang, M. I. Sezan, and A. M. Tekalp, "Adaptive Bayesian segmentation of color images," Journal of Electronic Imaging, vol. 3, pp. 404–414, Oct. 1994.

[10] D. Comaniciu and P. Meer, "Robust analysis of feature spaces: Color image segmentation," in IEEE Conf. Computer Vision and Pattern Recognition, (San Juan, Puerto Rico), pp. 750–755, June 1997.

[11] J. Luo, R. T. Gray, and H.-C. Lee, "Incorporation of derivative priors in adaptive Bayesian color image segmentation," in Proc. Int. Conf. Image Processing (ICIP-98), vol. III, (Chicago, IL), pp. 780–784, Oct. 1998.

[12] Y. Deng and B. S. Manjunath, "Unsupervised segmentation of color-texture regions in images and video," IEEE Trans. Pattern Anal. Machine Intell., vol. 23, pp. 800–810, Aug. 2001.

[13] S. Belongie, C. Carson, H. Greenspan, and J. Malik, "Color- and texture-based image segmentation using EM and its application to content-based image retrieval," in ICCV, pp. 675–682, 1998.

[14] D. K. Panjwani and G. Healey, "Markov random-field models for unsupervised segmentation of textured color images," IEEE Trans. Pattern Anal. Machine Intell., vol. 17, pp. 939–954, Oct. 1995.

[15] J. Wang, "Stochastic relaxation on partitions with connected components and its application to image segmentation," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, no. 6, pp. 619–636, 1998.

[16] L. Shafarenko, M. Petrou, and J. Kittler, "Automatic watershed segmentation of randomly textured color images," IEEE Trans. Image Processing, vol. 6, no. 11, pp. 1530–1544, 1997.

[17] W.-Y. Ma and B. S. Manjunath, "Edge flow: A technique for boundary detection and image segmentation," IEEE Trans. Image Processing, vol. 9, pp. 1375–1388, Aug. 2000.

[18] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Trans. Pattern Anal. Machine Intell., vol. 22, pp. 888–905, Aug. 2000.

[19] A. Mojsilovic and B. Rogowitz, "Capturing image semantics with low-level descriptors," in Proc. Int. Conf. Image Processing (ICIP-01), (Thessaloniki, Greece), pp. 18–21, Oct. 2001.

[20] A. Amir and M. Lindenbaum, "A generic grouping algorithm and its quantitative analysis," IEEE Trans. Pattern Anal. Machine Intell., vol. 20, pp. 168–185, Feb. 1998.

[21] Y. Gdalyahu, D. Weinshall, and M. Werman, "Self-organization in vision: stochastic clustering for image segmentation, perceptual grouping, and image database organization," IEEE Trans. Pattern Anal. Machine Intell., vol. 23, pp. 1053–1074, Oct. 2001.

[22] T. P. Minka and R. W. Picard, "Interactive learning using a society of models," Pattern Recognition, vol. 30, pp. 565–581, Apr. 1997. Special issue on Image Databases.

[23] A. Jaimes and S. F. Chang, "Model-based classification of visual information for content-based retrieval," in Storage and Retrieval for Image and Video Databases VII, Proc. SPIE, vol. 3656, (San Jose, CA), pp. 402–414, 1999.

[24] J. Fan, Z. Zhu, and L. Wu, "Automatic model-based object extraction algorithm," IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 1073–1084, Oct. 2001.

Fig. 10. Image segmentation based on steerable filter decomposition (images shown in color). Texture window size = 25×25 and β = 0.8. Edges are superimposed on original images.

Fig. 11. Image segmentation using JSEG [12] with least merge setting (images shown in color). Edges are superimposed on original images.

[25] E. P. Simoncelli and J. Portilla, “Texture characterizationvia jointstatistics� of wavelet coefficient magnitudes,” in Proc. Int. Conf. ImageProcessing(ICIP-98), vol. I, (Chicago,IL), pp. 62–66,Oct. 1998.

[26] J. Portilla and E. P. Simoncelli, “A parametrictexture model basedonjoint staticticsof complex waveletcoefficients,” Int. J. ComputerVision,vol. 40, pp. 49–71,Oct. 2000.

[27] A. Mojsilovic, J.Kovacevic, J.Hu, R. J. Safranek,andS.K. Ganapathy,“Matching andretrieval basedon the vocabulary andgrammarof colorpatterns,” IEEE Trans.Image Processing, vol. 1, pp. 38–54,Jan.2000.

[28] M. Mirmehdi and M. Petrou,“Segmentationof color textures,” IEEETrans.Pattern Anal. Machine Intell., vol. 22, pp. 142–159,Feb. 2000.

[29] D. Canoand T. H. Minh, “Texture synthesisusing hierarchicallineartransforms,” SignalProcessing, vol. 15, pp. 131–148,1988.

[30] M. Porat and Y. Y. Zeevi, “Localized texture processingin vision:Analysis and synthesisin gaborianspace,” IEEE Trans.Biomed.Eng.,vol. 36, no. 1, pp. 115–129,1989.

[31] D. J. Heeger and J. R. Bergen, “Pyramid-based texture analy-sis/synthesis,” in Proc. Int. Conf. Image Processing(ICIP-95), vol. III ,(Washington,DC), pp. 648–651,Oct. 1995.

[32] J. Portilla, R. Navarro, O. Nestares,and A. Tabernero, “Texturesynthesis-by-analysisbasedon a multiscaleearly-visionmodel,” OpticalEngineering, vol. 35, no. 8, pp. 2403–2417,1996.

[33] S. Zhu, Y. N. Wu, and D. Mumford, “Filters, random fields andmaximum entropy (FRAME): Towards a unified theory for texturemodeling,” in IEEE Conf. ComputerVision and pattern Recognition,pp. 693–696,1996.

[34] J. S. De BonetandP. A. Viola, “A non-parametricmulti-scalestatisticalmodel for natural images,” Adv. in Neural Info. ProcessingSystems,vol. 9, 1997.

[35] M. N. Do and M. Vetterli, “Wavelet-basedtexture retrieval using gen-eralizedGaussiandensityandKullback-Leiblerdistance,” IEEE Trans.Image Processing, vol. 11, pp. 146–158,Feb. 2002.

[36] W. Y. Ma, Y. Deng,andB. S. Manjunath,“Toolsfor texture/colorbasedsearchof images,” in HumanVision and Electronic Imaging II (B. E.Rogowitz andT. N. Pappas,eds.),vol. Proc.SPIE,Vol. 3016,(SanJose,CA), pp. 496–507,Feb. 1997.

[37] Y. Deng,B. S. Manjunath,C. Kenney, M. S. Moore, andH. Shin, “Anefficient color representationfor image retrieval,” IEEE Trans. ImageProcessing, vol. 10, pp. 140–147,Jan.2001.

[38] A. Mojsilovic, J. Hu, and E. Soljanin, “Extraction of perceptuallyim-portantcolorsandsimilarity measurementfor imagematching,retrieval,and analysis,” IEEE Trans. Image Processing, vol. 11, pp. 1238–1248,Nov. 2002.

[39] M. Swain and D. Ballard, “Color indexing,” Int. J. ComputerVision,vol. 7, no. 1, pp. 11–32,1991.

[40] W. Niblack,R. Berber, W. Equitz,M. Flickner, E. Glaman,D. Petkovic,and P. Yanker, “The QBIC project: Quering imagesby contentusingcolor, texture,andshape,” in Storage andRetrieval for Image andVideoData Bases, vol. Proc.SPIE,Vol. 1908,(SanJose,CA), pp. 173–187,1993.

[41] B. S. Manjunath, J.-R. Ohm, V. V. Vasudevan, and A. Yamada, "Color and texture descriptors," IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 703–715, June 2001.

[42] M. R. Pointer and G. G. Attridge, "The number of discernible colors," Color Research and Application, vol. 23, no. 1, pp. 52–54, 1998.

[43] G. Derefeldt and T. Swartling, "Color concept retrieval by free color naming," Displays, vol. 16, pp. 69–77, 1995.

[44] S. N. Yendrikhovskij, "Computing color categories," in Human Vision and Electronic Imaging V (B. E. Rogowitz and T. N. Pappas, eds.), Proc. SPIE, vol. 3959, (San Jose, CA), Jan. 2000.

[45] J. Smith and S. F. Chang, "Single color extraction and image query," in Proc. IEEE Int. Conf. Image Proc., 1995.

[46] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. COM-28, pp. 84–95, Jan. 1980.

[47] Y. Deng, C. Kenney, M. S. Moore, and B. S. Manjunath, "Peer group filtering and perceptual color image quantization," in Proc. IEEE Int. Symposium on Circuits and Systems VLSI, vol. 4, (Orlando, FL), pp. 21–24, June 1999.

[48] P. K. Kaiser and R. M. Boynton, Human Color Vision. Washington, DC: Optical Society of America, 1996.

[49] J. T. Tou and R. C. Gonzalez, Pattern Recognition Principles. Addison-Wesley, 1974.

[50] R. M. Gray and Y. Linde, "Vector quantizers and predictive quantizers for Gauss-Markov sources," IEEE Trans. Commun., vol. COM-30, pp. 381–389, Feb. 1982.

[51] Y. Rubner, C. Tomasi, and L. J. Guibas, "A metric for distributions with applications to image databases," in Proc. IEEE International Conference on Computer Vision, (Bombay, India), 1998.

[52] J. R. Bergen and M. S. Landy, "Computational modeling of visual texture segregation," in Computational Models of Visual Processing (M. S. Landy and J. A. Movshon, eds.), pp. 253–271, Cambridge, MA: MIT Press, 1991.

[53] M. R. Turner, "Texture discrimination by Gabor functions," Biol. Cybern., vol. 55, pp. 71–82, 1986.

[54] A. C. Bovik, M. Clark, and W. S. Geisler, "Multichannel texture analysis using localized spatial filters," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 55–73, Jan. 1990.

[55] J. Malik and P. Perona, "Preattentive texture discrimination with early vision mechanisms," Journal of the Optical Society of America A, vol. 7, pp. 923–932, 1990.

[56] D. Dunn and W. E. Higgins, "Optimal Gabor filters for texture segmentation," IEEE Trans. Image Processing, vol. 4, pp. 947–964, July 1995.

[57] B. S. Manjunath and W. Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Trans. Pattern Anal. Machine Intell., vol. 18, pp. 837–842, Aug. 1996.

[58] A. B. Watson, "The cortex transform: Rapid computation of simulated neural images," Computer Vision, Graphics, and Image Processing, vol. 39, pp. 311–327, 1987.

[59] J. G. Daugman and D. M. Kammen, "Pure orientation filtering: A scale invariant image-processing tool for perception research and data compression," Behavior Research Methods, Instruments, and Computers, vol. 18, no. 6, pp. 559–564, 1986.

[60] W. T. Freeman and E. H. Adelson, "The design and use of steerable filters," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, pp. 891–906, Sept. 1991.

[61] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger, "Shiftable multi-scale transforms," IEEE Trans. Inform. Theory, vol. 38, pp. 587–607, Mar. 1992.

[62] E. P. Simoncelli and W. T. Freeman, "The steerable pyramid: A flexible architecture for multi-scale derivative computation," in Proc. ICIP-95, vol. III, (Washington, DC), pp. 444–447, Oct. 1995.

[63] A. Cohen, I. Daubechies, and J. C. Feauveau, "Biorthogonal bases of compactly supported wavelets," Commun. Pure Appl. Math., vol. 45, pp. 485–560, 1992.

[64] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.

[65] T. Chang and C.-C. J. Kuo, "Texture segmentation with tree-structured wavelet transform," in Proc. IEEE-SP Int. Symp. Time-Frequency and Time-Scale Analysis, pp. 543–546, Oct. 1992.

[66] N. Graham, "Non-linearities in texture segregation," in Ciba Foundation Symposium (G. R. Bock and J. A. Goode, eds.), vol. 184, pp. 309–329, New York: Wiley, 1994.

[67] N. Graham and A. Sutter, "Spatial summation in simple (Fourier) and complex (non-Fourier) texture channels," Vision Research, vol. 38, pp. 231–257, 1998.

[68] N. Graham and A. Sutter, "Normalization: Contrast-gain control in simple (Fourier) and complex (non-Fourier) pathways of pattern vision," Vision Research, vol. 40, pp. 2737–2761, 2000.

[69] S. Daly, "The visible differences predictor: An algorithm for the assessment of image fidelity," in Digital Images and Human Vision (A. B. Watson, ed.), pp. 179–206, Cambridge, MA: The MIT Press, 1993.

[70] J. Chen, T. N. Pappas, A. Mojsilovic, and B. E. Rogowitz, "Perceptual color and texture features for segmentation," in Human Vision and Electronic Imaging VIII (B. E. Rogowitz and T. N. Pappas, eds.), Proc. SPIE, vol. 5007, (Santa Clara, CA), pp. 340–351, Jan. 2003.

[71] J. Chen, T. N. Pappas, A. Mojsilovic, and B. E. Rogowitz, "Adaptive image segmentation based on color and texture," in Proc. Int. Conf. Image Processing (ICIP-02), vol. 2, (Rochester, NY), pp. 789–792, Sept. 2002.

[72] D. Martin, C. Fowlkes, D. Tal, and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," in ICCV, (Vancouver, Canada), July 2001.



Junqing Chen (S'02) was born in Hangzhou, China. She received the B.S. and M.S. degrees in electrical engineering from Zhejiang University, Hangzhou, China, in 1996 and 1999, respectively, and the Ph.D. degree in electrical engineering from Northwestern University, Evanston, IL, in 2003.

During the summer of 2001, she was a student intern with the Visual Analysis Group at the IBM T. J. Watson Research Center in New York. From January to July 2004, she was a Postdoctoral Fellow at Northwestern University. In August 2004, she joined Unilever Research at Edgewater, NJ, as an imaging scientist. Her research interests include image and signal analysis, perceptual models for image processing, image and video quality, and machine learning.


Thrasyvoulos N. Pappas (M'87, SM'95) received the S.B., S.M., and Ph.D. degrees in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge, MA, in 1979, 1982, and 1987, respectively. From 1987 until 1999, he was a Member of the Technical Staff at Bell Laboratories, Murray Hill, NJ. In September 1999, he joined the Department of Electrical and Computer Engineering at Northwestern University as an associate professor. His research interests are in image and video compression, video transmission over packet-switched networks, perceptual models for image processing, model-based halftoning, image and video analysis, video processing for sensor networks, audiovisual signal processing, and DNA-based digital signal processing.

Dr. Pappas has served as chair of the IEEE Image and Multidimensional Signal Processing Technical Committee, as associate editor and electronic abstracts editor of the IEEE Transactions on Image Processing, and as technical program co-chair of ICIP-01 and the Symposium on Information Processing in Sensor Networks (IPSN-04). Since 1997, he has been co-chair of the SPIE/IS&T Conference on Human Vision and Electronic Imaging. He is also co-chair of the 2005 IS&T/SPIE Symposium on Electronic Imaging: Science and Technology.


Aleksandra (Saska) Mojsilovic (S'93, M'98) was born in Belgrade, Yugoslavia. She received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of Belgrade, Yugoslavia, in 1992, 1994, and 1997, respectively. Since then, she has worked at IBM Research (2000-present), Bell Laboratories (1998-2000), and as a faculty member at the University of Belgrade (1997-1998). Her research interests include multidimensional signal processing, pattern recognition, modeling, forecasting, and human perception.

In 2001, Dr. Mojsilovic received the Young Author Best Paper Award from the IEEE Signal Processing Society. She is a member of the IEEE Multidimensional Signal Processing Technical Committee and an Associate Editor for the IEEE Transactions on Image Processing.


Bernice E. Rogowitz (SM'04) received her B.S. degree from Brandeis University and her Ph.D. degree in experimental psychology from Columbia University, and was an NIH Postdoctoral Fellow in the Laboratory of Psychophysics at Harvard University.

Bernice is currently the Program Director for Research Effectiveness at the IBM T. J. Watson Research Center in New York. Previously, she managed the Visual Analysis Group at IBM Research, which conducted experimental research in human perception, built interactive software tools for representing and exploring data, and worked with customers to develop these methods within the context of real-world problems. Her research includes publications in human spatial and color vision, visualization, and perceptually-based semantic approaches to image analysis and retrieval.

In 1988, Dr. Rogowitz founded the IS&T/SPIE Conference on Human Vision and Electronic Imaging, which she continues to co-chair. She served on the board of the IS&T from 1997 to 2002, was elected an IS&T Fellow in 2000, and became a Senior Member of the IEEE in 2004.