Image Database Indexing using a Combination of Invariant Shape and Color Descriptions

Ronald Alferez and Yuan-Fang Wang
Computer Science Department
University of California
Santa Barbara, CA, 93106 U.S.A.

Supported in part by a grant from the National Science Foundation, IRI-94-11330.

Abstract

Image and video library applications are becoming increasingly popular. The increasing popularity calls for software tools to help the user query and retrieve database images efficiently and effectively. In this paper, we present a technique which combines shape and color descriptors for invariant, within-a-class retrieval of images from digital libraries. We demonstrate the technique on a real database containing airplane images of similar shape and query images that appear different from those in the database because of lighting and perspective. We were able to achieve a very high retrieval rate.

Keywords: images, video, libraries, features

1 Introduction

Image and video library applications are becoming increasingly popular, as witnessed by many national and international research initiatives in these areas, an exploding number of professional meetings devoted to image/video/multi-media, and the emergence of commercial companies and products. The advent of high-speed networks and inexpensive storage devices has enabled the construction of large electronic image and video archives and greatly facilitated their access on the Internet. In line with this, however, is the need for software tools to help the user query and retrieve database images efficiently and effectively.

Querying an image library can be difficult, and one of the main difficulties lies in designing powerful features or descriptors to represent and organize images in a library. Many existing image database indexing and retrieval systems are only capable of between-classes retrieval (e.g., distinguishing fish from airplanes). However, these systems do not allow the user to retrieve images that are more specific. In other words, they are unable to perform within-a-class retrieval (e.g., distinguishing different types of airplanes or different species of fish). This is because the aggregate features adopted by many current systems (such as color histograms and low-order moments) capture only the general shape of a class and are not descriptive enough to distinguish objects within a particular class.

The within-a-class retrieval problem is further complicated if query images depicting objects, though belonging to the class of interest, look different due to non-essential or incidental environment changes, such as rigid-body or articulated motion, shape deformation, and change in illumination and viewpoint. In this paper, we address the problem of invariant, within-a-class retrieval of images by using a combination of invariant shape and color descriptors. By analyzing the shape of the object's contour as well as the color and texture characteristics of the enclosed area, information from multiple sources is fused for a more robust description of an object's appearance. This places our technique at an advantage over most current approaches that exploit either geometric information or color information exclusively.

The analysis involves projecting the shape and color information onto basis functions of finite, local support (e.g., splines and wavelets). The projection coefficients, in general, are sensitive to changes induced by rigid motion, shape deformation, and change in illumination and perspective. We derive expressions by massaging these sets of projection coefficients to cancel out the environmental factors and achieve invariance of the descriptors. Based on these features, we have conducted preliminary experiments to recognize different types of airplanes (many of them having very similar shape) under varying illumination and viewing conditions, and have achieved good recognition rates. We show that information fusion has helped to improve the accuracy in retrieval and shape discrimination.

2 Technical Description

In this section, we present the theoretical foundation of our image-derived, invariant shape and color features. Invariant features form a compact, intrinsic description of an object, and can be used to design retrieval and indexing algorithms that are potentially more efficient than, say, aspect-based approaches.

The search for invariant features (e.g., algebraic and projective invariants) is a classical problem in mathematics dating back to the 18th century. The need for invariant image descriptors has long been recognized in computer vision. Invariant features can be designed based on many different methods and made invariant to rigid-body motion, affine shape deformation, scene illumination, occlusion, and perspective projection. Invariants can be computed either globally, as is the case for invariants based on moments or Fourier transform coefficients, or based on local properties such as curvature and arc length. See [3, 4, 5] for surveys and discussion on the subject of invariants.

As mentioned before, our invariant features are derived from a localized analysis of an object's shape and color. The basic idea is to project an object's exterior contour or interior region onto localized bases such as wavelets and splines. The coefficients are then normalized to eliminate changes induced by non-essential environmental factors such as viewpoint and illumination. We will illustrate the mathematical framework using a specific scenario where invariants for curves are sought. The particular basis functions used in the illustration will be the wavelet bases and spline functions. Interested readers are referred to [1] for more details.

Several implementation issues arise in this invariant framework, which we will briefly discuss before describing the invariant expressions themselves. (A word on the notational convention: matrices and vectors will be represented by bold-face characters and scalar quantities by plain-face characters. 2D quantities will be in small letters, while 3D and higher-dimensional quantities will be in capital letters. For example, the coordinates of a 2D curve will be denoted by a bold, lower-case symbol such as $\mathbf{c}$.)

1. How are contours extracted?

Or, stated slightly differently, how is the problem of segmentation (separating objects from background) addressed? Segmentation turns out to be an extremely difficult problem and, as fundamental a problem as segmentation is, there is no fail-proof solution. A "perfect" segmentation scheme is like the holy grail of low-level computer vision and a panacea to many high-level vision problems as well.

We are not in search of this holy grail, which, we believe, is untenable in the foreseeable future. In an image database application, the problem of object segmentation is simplified because:

• Database images can usually be acquired under standard imaging conditions, which allows the ingest and catalog operations to be automated or semi-automated. For example, to construct a database of airplane images, many books on civil and military aircraft are available with standard front, side, and top views taken against a uniform or uncluttered background. (The above is also true for applications in botany and marine biology.) This allows the contours of the objects of interest to be extracted automatically or with the aid of standard tools such as the flood-fill mask in Photoshop (a minimal sketch of this extraction step appears after this list). Furthermore, the cataloging operations are usually done off-line and done only once. Hence, a semi-automated scheme will suffice.

• On the other hand, query images are usually taken under different lighting and viewing conditions. Objects of interest can be embedded deeply in cluttered background, which makes their extraction difficult. However, we can enlist the help of the user to specify the object of interest instead of asking the system to attempt the impossible task of automated segmentation. A query-by-sketch or "human-in-the-loop" type of solution, with an easy-to-use graphics interface and segmentation aids such as the flood-fill mask, is perfectly adequate here and does not impose undue burden on the user. This proved to be feasible in our experiments.
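As a concrete illustration of the ingest-side extraction, the following is a minimal sketch assuming a database image shot against a light, uncluttered background and an OpenCV 4.x installation; the threshold value and function names are our own illustrative choices, not the tools used in the original system.

import cv2
import numpy as np

def extract_object_contour(image_path, background_thresh=240):
    """Return the outer contour (N x 2 array of x, y points) of an object
    photographed against a light, uncluttered background (an assumption)."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Pixels darker than the background threshold are treated as the object.
    _, mask = cv2.threshold(gray, background_thresh, 255, cv2.THRESH_BINARY_INV)
    # OpenCV 4.x returns (contours, hierarchy); keep the largest outer contour.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea)
    return contour.reshape(-1, 2).astype(np.float64)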

2. How are contours parameterized?

For a contour-based description, a common frame of reference is usually needed that allows point correspondences to be established between two contours for comparison. The common frame of reference comprises a common starting point of traversal, a common direction of traversal, and a parameterization scheme that traverses to the corresponding points in the two contours at the same parameter setting. We will first discuss the parameterization issue and then address the issues of point correspondence and traversal direction.

When defining a parameterized curve $\mathbf{c}(\lambda) = [x(\lambda), y(\lambda)]^T$, most prefer the use of the intrinsic arc length parameter because of its simplicity and the fact that it is either invariant or transforms linearly under rigid-body motion and uniform scaling. However, under more general scenarios where shape deformation is allowed (e.g., deformation induced in an oblique view), the intrinsic arc length parameter is no longer invariant. Such deformation can stretch and compress different portions of an object's shape, and a parameterization based on intrinsic arc length will result in wrong point correspondences.

It is well known that many shape deformations and distortions resulting from imaging can be modeled as an affine transform, through which the intrinsic arc length is nonlinearly transformed [2]. An alternative parameterization is thus required. There are at least two possibilities. The first, called the affine arc length, is defined [2] as
$$\tau = \int_a^b \left(\dot{x}\,\ddot{y} - \ddot{x}\,\dot{y}\right)^{1/3} dt,$$
where $\dot{x}, \dot{y}$ are the first and $\ddot{x}, \ddot{y}$ the second derivatives with respect to any parameter $t$ (possibly the intrinsic arc length), and $(a, b)$ delimits the path along a segment of the curve.
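To make the definition concrete, here is a small numerical sketch of our own, under the assumption of a densely sampled closed contour; the finite differences and the absolute value (used to keep the cube root real) are illustrative choices, not the paper's implementation.

import numpy as np

def normalized_affine_arc_length(points):
    """Normalized affine arc length tau in [0, 1] for an (N, 2) array of
    contour samples, following the integral definition above with simple
    finite differences standing in for the derivatives."""
    x, y = points[:, 0], points[:, 1]
    dx, dy = np.gradient(x), np.gradient(y)        # first derivatives
    ddx, ddy = np.gradient(dx), np.gradient(dy)    # second derivatives
    # Cube root of the (absolute) affine "speed" at each sample.
    speed = np.abs(dx * ddy - ddx * dy) ** (1.0 / 3.0)
    tau = np.cumsum(speed)
    return tau / tau[-1]   # normalize by the total affine arc length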

Another possibility [2] is to use the enclosed area parameter
$$\sigma = \frac{1}{2} \int_a^b \left| x\,\dot{y} - \dot{x}\,y \right| dt.$$
One can interpret the enclosed area parameter as the area of the triangular region enclosed by the two line segments from the centroid of an object to two points $a$ and $b$ on the contour. It can be shown that both these parameters transform linearly under a general affine transform [2]. Hence, they can easily be made absolutely invariant by normalizing them with respect to the total affine arc length or the total enclosed area of the whole contour, respectively. We use these parameterizations in our experiments.

3. How are identical traversal direction and starting point guaranteed?

It will be shown that the invariant signatures (to be defined later) of two contours are phase-shifted versions of each other when only the starting point of traversal differs. Furthermore, the same contour parameterized in opposite directions produces invariant signatures that are flipped and inverted images of each other. Hence, a match can be chosen that maximizes certain cross-correlation relations between the two signatures.

Allowing an arbitrary change of origin and traversal direction, together with the use of an affine-invariant parameterization, implies that no point correspondence is required in computing our invariants.
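The alignment step can be pictured with the sketch below; the FFT-based circular correlation and the normalization are illustrative assumptions, standing in for the cross-correlation relations referred to above.

import numpy as np

def signature_match_score(sig_a, sig_b):
    """Best normalized circular cross-correlation between two equal-length
    signatures, tried in both traversal directions."""
    def best_circular_corr(a, b):
        a = (a - a.mean()) / (a.std() + 1e-12)
        b = (b - b.mean()) / (b.std() + 1e-12)
        # One correlation value per candidate starting-point shift.
        corr = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real / len(a)
        return corr.max()
    # Reversing the traversal direction flips the signature, so test both.
    return max(best_circular_corr(sig_a, sig_b),
               best_circular_corr(sig_a, sig_b[::-1]))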

Now we are ready to introduce the invariant expressions themselves. Our invariant framework is very general and considers variations in an object's image induced by rigid-body motion, affine deformation, and changes in parameterization, scene illumination, and viewpoint. Each formulation can be used alone, or in conjunction with others. Due to the page limitation, we can only give a brief discussion of the invariants under rigid-body and affine transform, and summarize the invariant expressions under change of illumination and viewpoint. Interested readers are referred to [1] for more details.

Invariants under Rigid-Body Motion and Affine Transform. Consider a 2D curve $\mathbf{c}(\lambda) = [x(\lambda), y(\lambda)]^T$, where $\lambda$ denotes a parameterization which is invariant under affine transform (as described above), and its expansion onto the wavelet basis $\psi_{m,n}(\lambda)$ (where $\psi$ is the mother wavelet) as $\mathbf{c}(\lambda) = \sum_{m,n} \mathbf{u}_{m,n}\, \psi_{m,n}(\lambda)$, where the $\mathbf{u}_{m,n}$ are the projection coefficients.
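As an illustration of computing such projection coefficients, the sketch below decomposes the two coordinate functions of a contour (assumed already resampled uniformly in the affine-invariant parameter) onto a compactly supported wavelet basis with PyWavelets; the particular wavelet, depth, and boundary handling are assumptions made for the example.

import numpy as np
import pywt

def contour_wavelet_coefficients(points, wavelet="db4", level=4):
    """Wavelet projection coefficients of x(lambda) and y(lambda) for a
    contour given as an (N, 2) array, returned per decomposition scale."""
    x, y = points[:, 0], points[:, 1]
    coeffs_x = pywt.wavedec(x, wavelet, mode="periodization", level=level)
    coeffs_y = pywt.wavedec(y, wavelet, mode="periodization", level=level)
    # Pair the x and y coefficients scale by scale; each pair plays the role
    # of a 2D coefficient vector u_{m,n} in the expansion above.
    return [np.stack([cx, cy], axis=1) for cx, cy in zip(coeffs_x, coeffs_y)]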

[Table: invariant expressions for each scenario — rigid-body motion (using the spline basis); affine transform (using the wavelet basis); affine transform (using the spline basis); perspective transform (using the rational spline basis), where the expression relates the observed image curve to the database curve in rational spline form; and change of illumination.]

Figure 1: A database of airplane models (models (1)-(16)).

We used a two-stage approach in information fusion. Features invariant to affine deformation and perspective projection were first used to match the silhouette of the query airplane with the silhouettes of those in the database. We then employed the illumination invariants computed on the objects' interiors to disambiguate among models with similar shape but different colors. The results show that we were able to achieve 100% accuracy using our invariants formulation for a database comprising very similar models, presented with query images of large perspective shape distortion and change in illumination.
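The two-stage fusion strategy can be summarized in the following sketch; shape_similarity and illumination_similarity are placeholder callables standing in for the invariant-signature comparisons described above, and the shortlist size is an assumption.

def two_stage_retrieve(query, models, shape_similarity, illumination_similarity,
                       shortlist_size=3):
    """Stage 1: rank all database models by silhouette similarity under
    affine/perspective invariants.  Stage 2: re-rank the shortlist by
    illumination-invariant similarity of the objects' interiors."""
    ranked = sorted(models, key=lambda m: shape_similarity(query, m), reverse=True)
    shortlist = ranked[:shortlist_size]
    shortlist.sort(key=lambda m: illumination_similarity(query, m), reverse=True)
    return shortlist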

Table 2 shows the performance of using affine and perspective invariants for shape matching under a large change of viewpoint. For each query image (A through K), the affine and perspective invariant signatures were computed and compared with the signatures of all models in the database. Correlation coefficients as described in Sec. 2 were used to determine the similarity between each pair of signatures. Each row in Table 2 refers to a query image. Each of the ten columns represents the rank given to an airplane model from the database (shown in parentheses). The columns are ordered from left to right, with the leftmost column being the best match found. Only the top ten matches are shown. The values (not in parentheses) are the correlation coefficients. Entries printed in boldface are the expected (correct) matches.

Figure 2: The same airplanes in varying poses and illumination (query images (A)-(K)).

As can be seen, all query images were identified correctly. Fig. 3 shows a sample result. The leftmost image is the query image. The top three matches are in the next three columns; for each, the query image (solid) and the estimated image (dashed, using perspective invariants) are shown with the corresponding database model.

For this experiment, all query images were correctly matched with the models from the database, using affine and perspective invariants. However, the error values of the top two matches for, say, airplane K were very close to each other. This is because the top two matches have similar shapes and both are similar to the query image. The confidence in the selected matches can be strengthened by testing whether the interior regions of the objects are also consistent. Illumination invariants readily apply here.

For illumination invariants, a characteristic curve was uniquely defined on the surface of each airplane model in the database (performed off-line), so that its superimposition over the model emphasizes important (or interesting) color patterns in the image. Our perspective invariants scheme computed the transformation parameters that best match the two given contours. The same parameters were used to transform the characteristic curve defined for each model to its assumed pose in the query image. Hence, the colors defined by the characteristic curve in the model should match the colors defined by the transformed curve in the query image (except for changes due to illumination). Illumination invariant signatures for the query images were then computed and compared with the signatures stored in the database using Eq. 4.
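A rough sketch of this interior check is given below; it samples colors along a characteristic curve already transformed into the query image and compares two color profiles after a crude per-channel normalization. This is an illustrative stand-in only and is not the illumination invariant of Eq. 4.

import numpy as np

def sample_colors_along_curve(image, curve_xy):
    """Sample the RGB values of `image` at the (x, y) points of a curve,
    using nearest-pixel lookup; returns an (N, 3) array."""
    xs = np.clip(np.round(curve_xy[:, 0]).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.round(curve_xy[:, 1]).astype(int), 0, image.shape[0] - 1)
    return image[ys, xs, :].astype(np.float64)

def color_profile_score(colors_a, colors_b):
    """Correlation of two color profiles after dividing out each channel's
    mean, a crude way to discount a global illumination scaling."""
    a = colors_a / (colors_a.mean(axis=0, keepdims=True) + 1e-12)
    b = colors_b / (colors_b.mean(axis=0, keepdims=True) + 1e-12)
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])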

We show one result of illumination invariants, where the (perspective invariant) errors of the top matches were very close (see Fig. 4).

Table 2: Rank (using affine and perspective invariants) of each database model for each query image.

Figure 3: Query image A, with the top three matches from the database, using perspective invariants. (Solid for the query image, dashed for the estimated image using perspective invariants.)

Figure 4: (a),(d) Airplane models with the characteristic curves superimposed; (b),(e) query image with the transformed characteristic curves superimposed; and (c),(f) illumination invariant signatures for query image K (solid) and for models 14 and 3 (dashed).