Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Imageregistration usingalpha-entropy measures andentropicgraphs
Huzefa Neemuchwala���
, Alfred Hero�����
, andPaulCarson���
Dept.of BiomedicalEngineering�, Dept.of EECS
�, Dept.of Statistics
�, andDept.of Radiology
�TheUniversityof MichiganAnn Arbor, MI 48109,USA
June2002
Abstract
Registrationof a referenceimageto a secondaryimageextractedfrom a databaseof transformedexemplarsconstitutesan important imageretrieval andindexing application. Two importantproblemsare: specificationof a general classof discriminatory imagefeatures andan appropriate similaritymeasureto rank theclosenessof thequery to thedatabase.In this paper we proposea solutionusinghighdimensional imagefeatures, whichcanbeeithercontinuousor discretevalued, anda general classof featuresimilarity measuresbasedonRenyi’s � -entropy function. Thisclassof measurescontainsthewell known mutual informationmeasure, its mutual � -information( � -MI) variants, andthe � -Jensendifference. Whenthe featuresarediscretevalued, the � -MI canbe estimatedfrom the joint featurehistogramconstructed from the referenceand the secondary imagesusing a datastructurecalled afeaturecoincidencetree.However, histogramestimationtechniquesbecome impractical for continuousvaluedfeaturesin high dimension. For such featureswe proposean alternative similarity measure:the � -Jensendifferencewhich canbeaccuratelyestimatedusinganentropic-graph estimatorsuchastheminimal spanning tree(MST). A low time-memory complexity MST is usedto compare a varietyof continuousanddiscrete featuresincludingsinglepixel gray levels,tagsubimages,andindependentcomponent analysis (ICA) coefficient vectors. The methodology is illustratedfor ultrasound breastimageregistrationfor which we find that the bestsmall angleregistrationperformance is attainedbyimplementing minimalgraph entropy estimatorsonasetof ICA featurevectors.
Keywords: minimal spanning treeentropy estimators, feature coincidence trees,ultrasoundbreast image
registration
1Corresponding author:Prof. Alfred Hero,4229EECS,1301BealSt.,Ann Arbor, MI 48109-2122USA. Tel: 734-763-0564.FAX: 734-763-8041.email: [email protected] work was supportedin partby PHSgrantP01CA85878andin partby aBiomedical EngineeringFellowshipto thefirst author.
1
1 Intr oduction
Theimageregistrationproblemfalls in thegeneralareaof indexing andretrieval over databasesof images��� ������������for thepurposesof finding thebestmatchto a referenceimage
���. A comprehensivereview
of many techniques in imageretrieval andindexing is presentedin [50]. Thebestmatchis expressedasa
partial re-indexing of the databasein decreasingorderof similarity to the referenceimageusing an index
function. In the context of imageregistration the databasecorrespondsto a setof transformedversionsof
the secondary image,e.g. rotation andtranslation, which arecompared to the referenceimage. Thereare
threekey ingredients to imageretrieval andindexing which impact accuracy andcomputational efficiency
[12]:
1. Selection of imagefeaturesthat discriminatebetween different imageclasses yet possessinvarianceto
unimportantattributesof theimagese.g.rigid translation, rotation andscale;
2. Applicationof an index function thatquantifies feature similarity, is capable of resolving important dif-
ferencesbetweenimages, yet is robustto imageperturbations;
3. Queryprocessingandoptimizationtechniqueswhich allow fastimplementation of search for bestmatch-
ing.
This paper is concernedwith the appropriatechoice of featuresandthe selection of the feature similarity
measure.
The focus application of this paperis the co-registration of a pair of ultrasound (US) imagesof the
breast. Successin this application should leadto full 3D and4D registration: the fourth dimension being
time. Accurateregistration of 3D breastUS imagevolumescould be thebreakthrough to addressefficient
useof whole breast imagingto detect andquantify changesthat: areindicative of a breast lesion; canaid
discrimination of malignantfrom benign lesions [4, 53]; canbeusedto detectmultifocal secondarymasses
[9] and can quantify responseto chemotherapy or radiation therapy [21]. Breastlesions are missedby
community practitionersin upto 45%of womenwith densebreasts [23]. Improvedimageregistration could
alsoimprove spatial compoundingof US images. In compounding, partly correlatedviews of theregion of
interest (ROI) are generatedby scanning the ROI at different transducer tilt angles and then registering
pairsof separateviews. Compounding canresult in an improvement in the signal-to-speckle ratio of US
2
imagesandleadto better delineation of specular reflectors [24]. The field of view of high resolution US
imagesis insufficient for full useof 3D USin detecting asymptomatic lesionsin thebreast andfor tracking
changesin responseto treatment. To create an imageof theentire breast or a large fraction of it, thesmall
volume covered by a single scancanbe extendedby repeatedlyscanning the breast in parallel, partially
overlapping sweepsthat canthenbe combined using registration techniques [25]. Finally, registration of
imagescollectedfrom different isonation anglescanalsobeusedin Dopplerimaging wherethecolor flow
acquisitions do not detect blood flow well when its direction of motion is normal to the direction of the
ultrasoundbeamandwhereaccuratetriangulation canmeasureflow velocity.
To datethemosteffective methodsfor imageretrieval andimageregistration have beenpixel andvoxel
based and include: color histogrammatching [19], texture matching [2], intensity cross correlation [35],
optical flow matching [28] and, mutual information(MI) registration [57]. MI registration maximizes the
MI maximization on pixel coincidence histogramsandis usedin theMIAMI-Fusec�
registrationalgorithm
at theUniversity of MichiganMedical Center[38]. A general survey of medical imageregistrationmethods
is presentedin [35] andvoxel specific methods arethefocus of [18]. Color indexing wasfirst proposedby
Swain andBallard [54] andvariations of this techniquehave appearedin [10, 19, 51]. A histograminter-
section formulawaspresentedwhich,under mostconditions, reducestheinfluenceof background pixelson
the sharpnessof the registration peak. This approachallowsthecomparison of knownobjects basedupon
thecolorsof interest.Thepattern matching techniquesdescribedin ourpaper areapplicableto multichannel
(color) imagery. Texture-basedimageretrieval using statistical methods[2, 45] work well for imageswith
uniform scale andorientation, while spectral imagefeatures like Gaborfilters asusedin [29, 33, 7] and
DCT [55, 56] work well for natural imagery. However, US images arenoisy andoften showshadow and
enhancementanddegradationeffects. Independent componentanalysiscoefficients, usedin this study, are
tailoredto theUS imagedomainandarethus morerobust to noise andother artifacts.Various otherimage
registration measures, such assumof squared differencesandcorrelation coefficients (see[35, 18]), have
beenutil ized for registration using single pixel features. However, pixel-specific methods and measures
oftensuccumb to various artifacts like spuriouslarge intensity differences, non-uniform illumination,satu-
ration, occlusion andrequire linear relationshipsbetweentheintensitiesof referenceandsecondaryimages.
Themethodof mutual information (MI) registration haslargely overcomemany of thesedifficulties.Mutual
3
informationprovidesan informationtheoretic approachto imageregistrationandhasbeen found to bethe
mostrobust andaccurateimageregistrationalgorithm for a wide variety of modalities[18].
Ultrasonic imageregistration presents special challengesdue to the fact that rigid body registration
techniques, developedinitially for magnetic resonance(MR) andcomputedtomography(CT)images(e.g.of
thehumanbrain) cannotbeapplieddirectly to thehigh noiseimagesfrom non-linearly compressibleorgans,
suchas the bladder, breast, heart, and in imagesthat contain speckle, nonlinear (shear) distortions and
refractionartifacts.A manualmethod whichutil izedcorrespondingpoints,linesandplaneswasinvestigated
by Moskalik etal [40] in thefirst effort for 3D USimageregistration.Rohling [47], first usedvoxel intensity
based techniquesfor ultrasoundimageregistrationin 3D compounding. In histechniquethecorrelationratio
wasusedin conjunction with a 3D gradient measure[36]. A similar bivariateapproachwasimplemented
for crossmodality US-MR registration by Rocheet al [46]. Both casesworked reasonably well for clean
US images. Meyer et al [37] demonstratedthat their mutual informationbasedsoftwarecalled “MIAMI
Fuse”developedfor multimodality imagefusion [22, 38] is capableof registering reasonably difficult 3D
US images. MIAMI Fusewasalsousedin 3D US imagecompounding[24], for registrationof multimode
andextended-volumeUS imagesandto trackchangesof tissuestructureandvascularity of serial US scans
obtainedmonthsapart [25].
Despitetheinitial successreportedonUSimages,pixel- or voxel- based registrationmethodshavebeen
disappointing in application to certain clinical cases. Ultrasonic imagesarehighly sensitive to transducer
orientationduringscanning, astheattenuationartifactsoften present themselvesdifferently in thesecondary
andreferenceimages [37]. Further inconsistency in tissue structureandgeometry might arise from coher-
ent echoes andshadowing effects. Partial or complete shadows, caused by objects nearthe probe, area
hindranceto registrationduring compounding. Densetissue suchasribs (in breast imaging) andmalignant
calcificationcauseshadows,whichpresentthemselvesin differentdirectionsin thereferenceandsecondary
images. The problem is further complicatedby intensity similarity of shadows,cysts and tumorswhich
makesintensity-baseddiscrimination difficult. Ultrasonic images typically showblurredboundariesaround
anatomical features.Theseboundarieshave varying intensity dueto changesin surfacecurvatureandtrans-
ducer orientation. ThiseffectoccursbecauseUSimagesexhibit astrongangular dependency of theapparent
brightnessof specularreflectors. They alsohaveasmaller signal-to-noisedynamic rangethanmagnetic res-
4
onance(MR) andcomputed tomography (CT) images.Furthermore,a smallfield of view (FOV) andlarge
motion anddeformationof the compliant breast tissue makes it difficult to obtain consistent imagepairs
for registration undermanual scanning. Due to thesefactors developing robust registration algorithms is
extremely challenging [25]. Without modification,single-pixel based MI registrationmethodsbased areun-
likely to succeed in USregistration. Ourexperiencehasshown thatMI suffersseveremisalignmenterrors in
those imagevolumeswith majorshadowsandobtainedat differentbeamorientations. This paper describes
amethodto improveupon theaccuracy of single-pixel-basedMI registration usinghigher orderfeaturesand
better feature matching criteria.
A recent study by ShekharandZagrodsky [49] delves into 3D US imageregistration for the human
heartusing MI-basedregistration. Thestudy is interestingbecauseUS imagesof anon-linearly deformable
organareregistered.They show that theMI criterion, on raw USimages, producesa rippledsurfacefor 2D
translations of theimageandemphasize that theripplesconfoundthesearch for themaximumcorrespond-
ing to the solution, and that the removal of undesired local maximain the MI function is key to making
optimization robust andreliable. Medianfiltering for speckle removal leadsto marginal improvementin
registrationaccuracy, thoughsmoothing of themutualinformationfunction is observed. They alsosuggest
Partial Volumeinterpolation [34] andimagerequantization for smoothing. Themethodproposedhereis an
alternative way to improve thebehavior of theobjective function.
In this paperwe extend andimprove upon MI registration technology. The key to our approachis the
inclusionof highly specific imagefeaturesanduseof a generalizedinformationdivergencematching crite-
rion relatedto theChernoff bound of detection theory. An innovationis theintroduction of ageneral pattern
matching criterion which canbe usedto register high dimensional imagefeaturessuchascurves, edges,
texturesandspatial relations. The criterion canbe adapted to both discretevalued andcontinuousvalued
feature vectors selectedusinga representative training setof images. In thecaseof discretefeatureswe can
compute thejoint coincidencehistogramof featuresof agiventypeoccurring at thesamespatial location in
boththereferenceandthetransformedexemplar (secondary) images.This histogramis constructedusing a
efficient hierarchical database,which we call a feature coincidence tree, andits spread,asmeasured by the
joint entropy, is anindicatorof thedegreeof misregistration.
As in standardMI registration, to register thesecondary to thereferencea sequenceof transformations
5
is applied to the secondary until the spreadof the coincidencehistogramis minimized. However, un-
like standard MI registration methods, we replace the MI similarity function with the moregeneral class
of � -entropy functions, where � is a parameter in the interval � ��� "! . This classincludesvarious informa-
tion divergence measuressuch as: the mutual � -information ( � -MI), the � divergence, and the � -Jensen
difference. The combination of higher order features and the � -entropy similarity functions specifies a
large andflexible setof registration algorithms. As the purposeof this paper is to introducethis powerful
feature-similarity combination wedo notconsidertheimportantproblemof practical implementation of the
sequenceof registration transformationswhich will bethesubject of a future paper.
Our approachspecializesto thestandard mutualinformationalgorithm (e.g. MIAMI-Fusec�
) whenthe
featuresarethesetof single-pixel graylevelsand �$#% . Theadvantagesof our approachare:1) useof the
generalized mutual � -informationcanleadto a morestable objective function; and2) useof higher order
featurescancapture non-local spatial information which is ignored in the standardsingle pixel MIAMI-
Fusec�
algorithm andcanleadto moreaccurateandrobustimageregistration.
We investigatetwo typesof discretefeatures,calledtags, to populatethefeature coincidencetree.One
of these is based on gray-scale adaptive thresholding and the other on independent component analysis
(ICA). Grayscalethresholding is a fastandsimpleadaptive quantization schemeproposedby Gemanand
Koloydenko [8]. ICA is an iterative method which is closely related to theprojectionpursuit techniqueof
non-linear regression andhasbeenapplied to imageanalysisby Olshausen,Hyvarinenandothers [30, 20].
ThecontinuousvaluedICA coefficients,obtainedby projecting theimageontoits ICA bases,arediscretized
by vector quantization, implemented by partitioning thecoefficient vector into a finite numberof (Voronoi)
cells.
When the number of dimensions of the feature spaceincreasesbeyond 10 or so histogram methods
become impractical dueto the curseof dimensionality: for fixed resolution per dimension the number of
histogram bins increasesgeometrically in dimension. As high dimensional feature spaces can be more
discriminatory this createsa barrier to performing histogram-basedregistration. We circumventthis barrier
by applying a novel technique for estimating the � -entropy using entropic graphs whosevertices are the
locations of the feature vectors in feature space. As introduced in [16] and studied in [12, 13, 15, 17]
an entropic graph is any graph whosenormalized total weight (sum of the edgelengths) is a consistent
6
estimator of � -entropy. An exampleof anentropic graphis theminimalspanning tree(MST) anddueto its
low computational complexity it is thefocusof this imageregistrationapplication. TheMST is constructed
over thesetof joint feature vectors from thesecondaryandreferenceimageto yield anestimateof Jensen’s
entropy difference.We compare andcontrastthis approachwith histogrammethods for theUS registration
application.
All of thecomparisonsareevaluatedthrough a combination of simulationsandrealdataacquired from
clinicalstudies.Thedatabaseusedin thispaperwasasetof 3D ultrasound scansof theleft or right breast of
21 femalesubjects,aged21-49 years, going to biopsy for possible breast cancer. The lower agerangewas
chosen to provide a sampleof morecomplex breasts, which arealsosomewhat moredifficult to diagnose
thantypical breastsof olderwomen.Fig 1 showsslicesof breast ultrasoundimagevolumesrepresentativeof
thosefound in clinicalpractice.Thewomenwereimagedontheirbackswith thetransducer imaging through
thebreast towardthechest wall. Threetestcaseschosen from thebreast databaseandreferredto asCase151,
Case142andCase162arepresented. Theimageslicechosenfrom Case151exhibitssignificant connective
tissuestructureasthebright thin linesor edges.Case142wasdiagnosedasamalignanttumorin echogenic
fibroglandular tissues. The tumor characteristically shows discontinuous edges with a darker center and
shadows below theborders.Theareaof enhancementbelow thetumor is not common.Case162shows an
uncommondegree of degradationdueto shadowing. Thebottomtwo-thirds of the imageinclude thechest
wall andthe dark shadow andreverberationsbehind the acoustically impenetrable boundarybetweenthe
lung andchestwall. Someedgeinformation is evident, however shadowy streaksareobserveddueto dense
tissue absorbing the soundbeam,refraction and phase correlation at oblique boundary or poor acoustic
impedancematch(air bubbles) betweenthe transducerandthe skin. For clarity of presentation we focus
on registrationof 2D slices. Theextension to 3D voxel registration is straightforwardbut will bepresented
elsewhere.
Theoutline of this paperis asfoll ows. Section2 introduces � -mutual informationasa similarity mea-
sure. Section 3 presents feature coincidence treesfor discrete features. Methods of tag selection using
adaptive thresholding arepresentedin Section 4, while discreteandcontinuous ICA featuresarediscussed
in Section5. In Section 6 the � -Jensendifferenceis discussedandin Section7 theMST entropy estimator
is presentedalongwith methods of acceleration. A simplenumerical exampleis presented in Section8
7
andSection 9 presents registration results for a ultrasonicbreast imagedatabase.Conclusionsandfuture
directionsaredescribedin Section10.
2 Indexing with Mutual & -Inf ormation asa Similarity Measure
Let���
beareferenceimageandconsideradatabase�'�
, ( � )�+*+*+*,�.- of imagesgeneratedfrom adatabase
of secondaryimagesto beindexedrelative to thereference.Let / � , /10 befeaturevectorsextractedfrom� �
,� 0 anddefinethe joint density 2435/ � �6/7098 andthemarginal densities 2435/ � 8 , 2435/10,8 . Thesimilarity between
features / � and / � canbegaugedby thedifferencebetween2435/ � �6/ � 8 andtheproduct 2435/ � 8:2435/ � 8 which
canbemeasuredby themutualinformation(MI).
Whenthe featuresarediscretethe joint probability distributions 2435/ � �6/ � 8 of referenceandsecondary
imagescanbeplottedon a two-dimensional histogramof intensity values(Fig 2). Fig. 3 showsfour joint
histogramscatter plots of a pair of reference(�'�
) andsecondary (� 0 ) ultrasoundimagesextracted from a
breast US volume. It canbe seenthat whenthe referenceandsecondary images arethe same(� 0 �;���
)
except for rotation, thejoint histogramshows perfect correlation of intensity values in pixel coordinates[a].
Whenthereferenceandsecondaryimagesaremis-alignedby 8 degrees,thejoint histogramis dispersed[b].
However, consider thecasewherethe referenceandsecondaryimagesareextracted from two slices in the
samebreastUS volume wherethe slicesareseparatedby 2mm away from eachother. At this separation
distance along the depthof the scan, the speckle in the imagesis decorrelated, but the anatomy in the
imagesremains largely unchanged. Now, thehistogramscatterplot [c] showsadispersion similar to theone
seenin [b], in spite of perfectalignment. In caseof perfectly correlatedimageslicesa simpleregistration
objective function which might be applied is the spread of the two dimensional histogram in (b) about a
fixed reference,suchasthe diagonal shown in [c]. However, ultrasoundimagesobtained at different scan
anglesshow different speckle patterns.Henceplots [c] and[d] areobservedmoreoften in practiceandsuch
a simplereference-basedobjective function is not effective. A moreeffective objective function might be
the entropy of the joint histogramswhich measures the spread of the joint distribution independent of any
fixed reference. The mutual information is a generalization of this entropy measurewhich measures the
8
spread relative to themaximallyspreadproduct of themarginal histograms:<=?>A@ =�B 243AC � �.C � 8EDGFIHJ3K243AC � �.C � 8�L��M243AC � 8:243AC � 8�!N8 (1)
which is usedby Viola andWells [57] andothers for imageregistration. Note that this is a different usage
from the KL divergencebetween2435/ � 8 and 2435/ � 8 which hasbeenproposedasan indexing measureby
several authors [52, 6, 55].
The mutual � -information, a generalization of (1), is definedas the � -divergence of fractional order�POQ� ��� "! between 2435/ � �6/ � 8 and 2435/ � 8:2435/ � 8 [3]. For discrete valued featuresthe � -MI is:R�S 3K2435/ � �6/ � 8UTV2435/ � 8:2435/ � 8�8 � �XWY DGFIH <=?>K@ = B 2 S 3AC � �.C � 8:2 ��Z[S 3AC � 8:2 ��Z[S 3AC � 8 (2)
wherethesummationis overall values that / � and / � cantake on. For continuousvaluedfeaturesthe � -MI
is: R�S 3K2435/ � �6/ � 8UTV2435/ � 8:2435/ � 8�8 � �XWY D�FIH]\ =�>A@ = B 2 S 3AC � �.C � 8:2 ��Z[S 3AC � 8:2 ��Z[S 3AC � 8 (3)
The � -divergence is equal to the Hellinger distancesquaredwhen � � ,L)^ , and to the Kullback-
Liebler (KL) divergence [27] when �Y# . Thecase�Y#_ corresponds to thestandardShannon mutual
information(1).
The mutual � -informationcanbe justified as an appropriate registration function by large deviations
theory throughtheChernoff bound. Definetheaverageprobability of error `ba93Nc�8 associatedwith deciding
whether / � and / � aredependent or independent random variables. i.e. deciding betweendependentfea-
tures, d �fe / � �6/ �hg 243AC � �.C � 8 vs. dependentfeatures d �ie / � �6/ �jg 243AC � 8:243AC � 8 , basedon a setof i.i.d.
samples /lk �?m0 �+*+*+* �6/nkpo m0 , q � ����( . This errorprobability hastherepresentation:
` a 3Nc�8 �sr 3Nc�8�`t3Ad � 87uv�w3Nc�8�`t3Ad � 86�where
r k c�8 and �x3Nc�8 areType II (miss)andType I (false alarm) errors, respectively, of the testof d �vs. d � . ThentheChernoff boundimpliesthat[5]:
9
DGyGzYy|{~}o)�n� c D�FIHw` a 3Nc�8 � W��.�~�SE�E� � @ �K� 3: �WQ��8 R�S 3K2435/ � �6/ � 8�Tx2435/ � 8:2435/ � 8�8 � (4)
Thusthemutual � -information givesthe asymptotically optimal rateof exponential decayof theerror
probability for testing d � vs d � asa function of c . While we do not investigate this issuein this paper,
theappearanceof themaximumover � suggeststheexistenceof anoptimalparameter� ensuring low error
registration.
3 FeatureCoincidenceTreesfor DiscreteFeatures
ToperformMI registrationwith discretefeatureswemustpopulatethebinsof thejoint histogram 243AC � �.C � 8 �
for eachimagein thedatabase.First, a universalsetof featuresis selectedaccording to certain criteria dis-
cussedbelow. Thesefeaturesareorganizedinto binson a tree-structureddatabasefor which thecomplexity
of the feature indexedat eachnodeincreasesastreedepth increases.Figures 4 and5 illustratethe feature
treedatastructurewhenthe featuresaredefinesas ����� sub-images. The two imagesareeachdropped
down a feature treeand incidences andcoincidencesof featuresover all pixel locations andat all of the
nodes of thetwo treesarecounted.Thecounter is incrementedfor every coincidence of a particular feature
pair occurring at a commonposition within eachof the two images. SeeFig 6 for illustration. This results
in ahistogramcalledthefeaturecoincidencehistogramdenoted �2�35/ � �6/��,8 . Thehistogrammarginals �2�35/ � 8and �2�35/ � 8 of thecoincidence histogramareextractedby summingover oneof theargumentsof 2435/ � �6/ � 8 .Thesearethenused in the mutual � -information formula (2) to comeup with a registration score for the
images.
Our general feature selection andorganization schemeis similar to therandomizedtreeclassifier struc-
turesintroduced by Amit andGeman[1] andusedfor shaperecognition from binary transcriptionsof hand-
writing. A setof primitive local features,called tags,areselected which provide a coarse description of
the topography of the intensity surface in the vicinity of a pixel. Local imageconfigurations, e.g. ���P�pixel neighborhoods, arecapturedby coding eachpixel with labels derivedfrom thetags.Non-local spatial
featuresarethencapturedby cataloging pairsof tagswhich arein particular relative spatial configurations.
10
This allows usto sidesteptheissue of rigidly definingnon-local imageconfigurationsandclusterssuchas,
imagegradients,boundarydetectors,distinctpoints,discriminating structuresor otherstatisticalparameters.
For grayscaleimages, thenumber of different tag typescanbeextremelylarge. For example,if the image
contains256grayvalues thentherewould exist 3K^I�I��8 �:� different �f��� tagtypes. Thesewould beuselessas
feature discriminants sincetheir individual occurrenceswould be extremely low. Methods for pruning the
tagtypesaredescribedbelow.
4 TagSelectionvia Adaptive Thr esholding
Adaptive thresholding is a quantization schemedescribedby GemanandKoloydenko [8] which wasintro-
duced to study theinvariantcharacteristicsof natural images.Let � beapositivegranularity parameter. The
quantized valueassignedto a pixel within a �t�$� sub-imagedependson the gray values of its neighbors.
Thedarkestpixel(s) areassigned0, thenext brightestpixel(s) areassigned0 if thedifferenceis lessthan �andlabel1 otherwise,thenext brightestpixel(s)areassignedlabel2 if thedifferenceis lessthan � , andso
on. Usingthis schemeon our ultrasoundbreast imagedatabasetagsassociated with therelatively uniform
background areas (darkor bright) with smallspatial variancesarecorrectly classified asspeckle andcanbe
easily eliminated. Figure6 shows tag coincidencesin the reference andsecondary images.Coincidences
of tag typesarecalculatedusing the feature tree. Theexaggeratedtagpattern is meantto capture theedge
of the tumor. A similar tag type will be observed in the secondaryimagealso. The tagscapture the local
intensity pattern in theneighborhoodof thepixel.
As mentionedearlier, the tagfeature spaceis potentially large andrequirespruning. We select relevant
tagtypesby aprocessof randomizedtraining. Fromwithin theUSbreastimagedatabasea largenumberof
pixel neighborhoodsareselected andquantized. After discarding repetitions, tag types that correspondto
spuriousspeckle pattern areidentified andeliminated. Speckle,afterquantization using theabove scheme,
can easily be identified. since it exhibits only one or two non-zero pixels amongst the �X�Y� pixels in
the tag. 300 different tag typesare identified amongthe samples.The number of tag typesis controlled
by selectively traversingthe decision tree so that it is balanced, and by imposing constraints on the tag
types. Also, tagsarerequired to have at least two different intensity typeswithin the centerpixels so that
11
processingis centeredneartheedgesin the image.Eachimageis thenblock-quantizedanddroppeddown
thepartition tree.Thepairwisecoincidencesof the tagtypesat the leavesarerecordedin a histogramover
’tagspace’. If an imageblock doesnot correspond to any of the tagtypes, thenthepixel is not used in the
imageregistrationalgorithm. While wedonot explore it here, thesetagfeaturescanbeextendedto account
for spatial dependenciesbetween pairsof tagtypes[1].
5 Continuousand DiscreteFeatureSelectionvia ICA
5.1 Continuous ICA Features
Imagescanbe described in termsof their projection onto a basis, representedasa linear superposition of
bases functions. Sucha projection feature setwasinvestigatedby Vasconcelos[56, 55] for general image
indexing problemsusingaGaborwavelet basis.Herewetakeadifferentapproachadopting abasisextracted
from an independent components analysis (ICA) of the imagedatabase.The ICA basis canbe adapted so
asto best account for the imagestructure in termsof a collectionof statistically independent components.
This is an alternative to adaptive thresholding andis calledthe ICA approachin [30]. In ICA, an optimal
basisis foundwhichdecomposestheultrasoundimage�4�
into asmallnumberof approximatelystatistically
independent components9� 0 � : ���7� �<0 ����� � 0 � 0 (5)
The basiselements9� 0 � are selected from an over-completelinearly dependent basisusing randomized
selection over the training set of ultrasound imagesin a representative database. The number of basis
elements are selected according to the minimum description length (MDL) criterion. For image ( the
feature vectors / � are definedas the coefficients � � 0 � in 5 obtained by projecting the imageonto the
basis. Here ICA was implemented using Hyvarinen and Oja’s [20] FastICA code (available from
http://www.cis.hut.fi/projects/ica/fastica/) whichusesafixed-point algorithmto per-
form maximumlikelihoodestimationof thebasis elements in theICA datamodel(5). Figure7 shows a set
of 40 �t��� basis vectors which werelearnedfrom over 5000 ���X� training subimagesrandomly selected
from 10 consecutive imageslices of a single ultrasoundvolume scanof the breast (Case151 in Fig. 1).
12
Giventhis ICA basisandapair of to-be-registered � �¢¡ images,coefficient vectors areextractedby pro-
jecting each���X� neighborhood in theimagesontothebasis set.For the40 dimensionalICA basis shown
in Fig. 7 this yields a setof ��¡ vectors in a 40 dimensionalvectorspace which will be usedto define
features.
5.2 DiscreteICA Features
Recallthat in thefeature treemethodwe generatehistogramin higher dimensionalspaceby labeling pixels
with tagtypes. To extend this labelingtechniqueto ICA featurevectors,discretization of theICA coefficient
vectors is essential. We do this by vector quantization: we partition the spaceof the multi-dimensional
ICA coefficients into a finite numberof Voronoi cells. The Voronoi partitions are created using the K-
Meansclustering algorithm [31] on featuresextractedfrom the imagetraining set. This algorithmscreates
a partition which minimizes the meansquarederror of the Euclidian norm betweenthe centroid of the
Voronoi cell andthe feature vectors in thecell. If c Voronoi cellsarecreatedthen,eachpixel £ � is givena
label ��� )�+*+*+*9��c basedupon theICA coefficient at thatpixel neighborhood. 512cells werecreatedusing8
and16 dimensional ICA bases. Thesamepartitioning is used for thereferenceandthesecondaryimagesto
maintain consistency in pixel labels.
6 & -JensenDiffer enceFunction for ContinuousFeatures
With continuousfeaturesthe � -MI canbeestimatedfrom continuous densities,e.g.,throughhistogramor
kernel density estimators. However theperformanceof such ”plug-in” estimates becomesvery poor asthe
featuredimension increases[12]. An alternative,discussedhereandin thenext section, is adirect estimator
of another entropy-basedindex function: the � -Jensendifference. This function hasbeen independently
proposedby Ma [14] andHe et al [11] for imageregistrationproblems.Let 2 � and 2 � betwo densitiesand
letr O [0, 1] bea mixture parameter. It wasalsousedby Michel et al in [39] for time-frequency images.
The � -Jensendifferencewasdefinedin [3] is thedifferencebetween� -entropy of themixture:
2 �sr 2 � u¤3: UW r 8:2 � (6)
13
andthemixtureof the � -entropiesof 2 � and 2 � :��d S 3 r �b2 � �b2 � 8 � d S 3 r 2 � u¤3: �W r 8:2 � 8¥W¦� r d S 3K2 � 87u§3: UW r 8�d S 3K2 � 8�! (7)
where�POQ� ��� "!5* The � -Jensendifferenceis ameasureof dissimilarity between 2 � and 2 � : asthe � -entropyd S 3K278 is concave in 2 it is clearfrom Jensen’s inequality that ��d S 3 r �b2 � �b2 � 8 � � if f 2 �]� 2 � a.e.
The � -Jensendifferencecanbemotivated asan index function asfoll ows. Assumetwo setsof feature
vectors / �¨�j /¨k �Gm� � �G��� @M©M©M© @ o B and / �U�ª /nk ��m� � �G��� @M©M©M© @ o¬« areextractedfrom images�l�
and���
, respectively.
Assumethateachof thesesetsconsist of independent realizations from densities 2 � and2 � respectively. De-
finetheunion / � / �® / � containing c � c � u¯c � unorderedfeaturevectors. Any consistententropy esti-
matorconstructedonthe / k ��m ’swill convergeto d S 3 r 2 � u�3: °W r 8:2 � 8 as c¯#²± wherer$� DGyGz oI�n� c � L�c .
This motivatesthefollowing consistentminimal-graphestimatorof Jensen differenceforrX� c � L�c :
� �d S 3 r �b2 � �b2 � 8 � �d S 35/ � / � 8¥W¦� r �d S 35/ � 8®u³3: �W r 8 �d S 35/ � 8�! (8)
where ��O�3A��� ,86� �d S 35/ �� / � 8 is the minimal graphentropy estimatorconstructed on the c point union
of both setsof feature vectors and �d S 35/ � 8 , �d S 35/ � 8 areconstructedon the individual setsof c � and c �featurevectors,respectively. Wecansimilarly definethedensity-basedestimatorof Jensendifferencebased
on the entropy estimates of the form constructedon / � / � , / � and / � . For rigid registration prob-
lemsthe marginal entropies d S 3K2 � 8 � ��G��� over the databaseareall identical so that the indexing function d S 3 r 2 � u¤3: UW r 8:2 � 8 � ��G��� is equivalentto
��d S 3 r �b2 � �b2 � 8 � ��G���7 Minimum Spanning Treeand Renyi Entr opy
Implementationof the � -Jensenregistrationcriterioncanbeaccomplished by plugging in thedensity esti-
matesto (7) but it is betterto usedirect methodsbasedonentropicgraphssuchastheMinimal SpanningTree
(MST). TheMST methodis agraph-theoretictechnique, which determinesthedominant skeletal patternof
a point setby mapping theshortestpathof nearest neighbor connections.Givenaset / o �� C � �.C��)�+*�*�*�*��.C o �of n, i.i.d vectors / � in ´tµ eachwith density 2 , a spanning treeis a connectedacyclic graph which passes
throughall coordinatesassociatedwith / o . In thisgraph all c pointsareconnectedby c¶W� edges�· � �
. For
agivenrealweightexponent ¸XO (0,1) theminimumspanning treeis thespanning treewhichminimizes the
14
total edge weight: ¹35/ o 8 � z�yG{a ��º < a T · T�» (9)
where T · T denotesEuclidean (L2) norm of the edge. SeeFig 8 for illustration. The overall length of the
MST canbeusedto construct a strongly consistentestimator of entropy.
TheMST length
¹o �
¹35/ o 8 is plottedasa function of c in Fig. 9 for thecaseof i.i.d uniformly and
non-uniformly distributed points in the planefor ¸ =1. It is intuitive that the lengthof the MST spanning
the moreconcentrated non-uniform setof points increasesat a slowerratethandoes the MST’s spanning
theuniformly distributedpoints. This factmotivates theapplication of MST to testrandomnessof a setof
points. It canbeshown [12] thelengthfunction whennormalizedby ¼ c producessequencesthatconverge
within a constantfactor to thealphaentropieswith � =1/2, asillustratedin Fig. 9 . More generally, for i.i.d.
points in ½x¾ , by changing thevalueof ¸ in (9), onecanchangetheconvergentlimit to the � -entropy where� � 3A¿¨W¯¸®L9¿�8 . TheMST is called anentropic spanning graph asits normalizedlog-length converges (a.s.)
within a constantto analpha-entropy. Specifically, theRenyi entropy estimator
�d S 35/ o 8 � ,L�3: UWÀ�¥8Á� ÂNc¹35/ o 8�L�c S WQÂÃc rÅÄ @ » ! (10)
is anasymptotically unbiasedandalmostsurely consistentestimatorof the � -entropy of 2 wherer Ä @ » is
a constant biascriterion independent of 2 [12, 17]. In imageregistration, whentwo imagesareproperly
matched under a sequenceof transformations, corresponding regions of interest should overlap and the
resulting joint probability distribution will behighly concentrated.ThustheRenyi entropy of theoverlapped
imagesshould achieve the minimum value over all of the transformations. This would be reflected asthe
smallest length of the MST [32]. A transformation that minimizesRenyi entropy canbe calculated, since
misregistration would increasethedispersion of thejoint probability distribution.
As contrastedwith density based estimates of entropy, theMST estimator enjoys thefoll owing proper-
ties: it hasafaster asymptoticconvergencerate,especially for nonsmoothdensitiesandfor low dimensional
feature spaces[13, 15]; it completely by-passesthe complication of choosing andfine-tuning parameters
suchashistogrambin size,density kernel width, complexity andadaptation speed; the � parameterin the� -entropy function is varied by varying theinter-point distancemeasureusedto compute theweightof the
15
MST. Furthermore,for large dimensionstheMST canbeimplementedwhenhistogramscannot,dueto the
“curseof dimensionality”. For example,for a ÆI^ dimensional space,only +� cellsperdimension gives +�MÇ �bins in thehistogram,anunworkableandimpractically large burdenfor any computer. On the otherhand
the needfor combinatorial optimization may be a bottleneck for a large numberof feature samplesand
acceleratedMST algorithmsarenecessary.
7.1 Computational Acceleration of the Kruskal MST Algorithm
The Kruskal Algorithm [26] is widely believed to be the fastest general purposealgorithm to solve the
MST problem for sparsegraphs. A setof given edges,sortedby their weights,is maintainedin a list and
Kruskal’s algorithm grows the tree an edge at a time. Cyclesare avoided within the tree by discarding
edgesthat connecttwo sub-treesalready joinedthroughaprior established path.Thetimecomplexity of the
algorithm is of È�3AÉiÂAÊ,ËEÉÌ8 where É is theiniti al numberof edgesin thegraph. ThememoryrequirementisÈ�3AÉÍ8 .In the present application, the mostsimple-mindediniti al estimateof the MST includesall the possi-
ble edgeswithin thepoint set. This results in � � edges for � points; a time requirementof �3A� � 8 anda
memoryrequirementof È�3A� � ÂAÊ�ˬ�$8 . Thenumberof points in thegraphis thetotal numberof pixelspartic-
ipating in theregistrationfrom thetwo images.If eachimagehas ¡ ��` pixels, thetotal number of points
in thegraphis ^Ì�Ρ ��`�Ï 150,000 for ultrasoundimagesof size256 � 256.Desktopprocessors cannot
fulfill memoryrequirementsof thestandard Kruskal algorithm in this case.Evenwith larger machines, the
algorithm hasa forbidding time requirementfor treeconstruction.
A significant acceleration canbe obtained however by a processof sparsification of the initial graph
before treeconstruction. A selection criterion is imposedon theedges, which ensuresthatonly those edges
likely to occur in thefinal MST areincluded in theoriginal graph. While constructing theedge list, a disc
is placedon each point under consideration. As seenin Fig. 11,only thoseedgeswith lengthssmaller than
discradiusareacceptedinto thelist. Theedge-length sortalgorithm, within Kruskal’salgorithm, now hasto
sort �3A�$8 number of edges. For approximately uniform distributions,a constant discradius is optimal for
all areas within the distribution. Moreover, for non uniform distributions, the disc radius may be changed
16
to adaptaccording to the underlying distribution. This can be achieved by selecting the distanceof theÐEÑGÒ-nearestneighbor (kNN) as the disc radiusfor a given point. Further reduction in time and memory
requirementscanbe obtained by first rank-ordering vertex coordinatesalong an arbitrary dimension. It is
not necessaryto computeall � � edge-lengths. Only thoseedges thathave lengths lessthanthediscradius
in thedimensionof ordering needto beconsidered (Fig. 11). Thus � � edgelengthcomputations canalso
be avoided. If during treeconstruction the algorithm runsout of edges, expanding the radius of the disc
reaps-in additional edges. Fig. 10 shows biasof the modifiedMST algorithm asa function of the radius
parameter.
It is straightforward to prove that, if the radius is suitably specified, the disc based tree construction
describedabove is a minimumspanning tree.Recallthat theKruskalalgorithm ensures construction of the
exactMST [26].3: ,8 If point £ � is includedin thetree, thenthepathof its connectionto thetreehasthelowestweightamongst
all possible non-cyclic connections. To prove this is trivial. Thedisc criterion includeslower weightedge
before considering anedgewith a higherweight. Hence,if a pathis found by imposing thedisc, that path
is the smallest possible non-cyclic path. Thenon-cyclicity of the pathis ensured in the Kruskal algorithm
through a standardUnion-Find dataset.3K^�8 If a point £ � is not in the tree,it is becauseall theedgesbetween£ � andits neighbors consideredusing
the disc criterion of edgeinclusionhave total edgeweight greater thandisc radiusor have led to a cyclic
path. Expanding thediscradius would thenprovide thepathwhich is lowestin weightandnon-cyclic.
If the disc radius is underestimated the treecannot be completed without first adding more edges to
the list. If it is overestimateda surplus of edges will result in the edgelist, however the final tree will
have therequired �ÓWÔ edgesonly. It hasbeen observedempirically that theoptimaldiscradius includes
roughly between10-20 edges from neighboring points. This numbervarieswith the dimensionality and
the underlying density of the data. The number of edges É thus reducesfrom � � to roughly �Õ�¦ +� .The memoryrequirementof the modifiedalgorithm is of Èt3AÉÌ8 . The time requirementnow optimizestoÈ�3AÉiÂAÊ,ËEÉÌ8 , where É is a fraction of � � for large � . Figure11 comparestheperformanceof thestandard
Kruskal algorithm with our modified algorithm. The time and memoryrequirements are tremendously
reduced. To obtain the optimal trade-off betweenbias of the MST length and time-memory complexity,
17
the disc radiusshould be selected at the kneeof the curve seenin Fig 10. For uniform distributions, the
distanceof theÐ9ÑGÒ
-NN alongthedimensionof ordering, remains roughly thesame.Therealadvantageof
the kNN techniquelies in adapting the disc radius on a point-by point basisfor non-uniform distributions
(Fig 12). However, oneof the problemsunderlying non-uniform distribution canbeseenthere. TheMST
length estimate from thedisc basedalgorithm convergesslowly to thetruelength, requiring examinationof
a large numberof neighbors in theprocess. Thenumberof nearestneighbors requiring examinationgrows
in higher dimensionsandthe searchnow approachesa full search, thuslosing someof its scalability (Fig
12 b). To addressthis issue, we utilize a kNN approach to determine the radii of the disks, following a
list intersection approachsimilar to [44]. Variants of the techniquedescribed above have beenexamined
andaredescribedin [42]; wherewe alsopresentresults andmethods to dealwith dimensionality issuesfor
non-uniform distributions.
8 Numerical Illustration
In this simple example we compare the MST-basedestimate of � -Jensendifferencewith ICA features
against the standardhistogram-basedestimate of MI using single pixel features. The backgrounds in the
imagesin Fig 13 have beenconstructedusinga random mixture of 15 ICA bases elements. Theaim is to
register thedominantpatternseenin theforegroundasit translatesacrosstheimageon a fixedbackground.
Theproblemis similar to trackingprominent featuresasthey move in theimage. Themutual � -information
based on singlepixels,for � = 0.5andthe � -Jensendivergencebased on the15 ICA featuresarecalculated
for each integertranslation for thepair of images. UsingtheICA techniquewecansystematically eliminate
featuresarising from thebackgroundandcomputetheMST-basedentropy estimatesolelyonthedistribution
of the prominentfeature of interest. This is not possible with the single pixel method.Profilesin Fig. 13
show thatascontrastedto thefeature based� Jensendivergence,the � MI hasa lower discrimination ability
evident from thecurvatureof thepeak around perfect alignment (0 pixel translation). Theexamplesuggests
that significant improvement in resolution canbe achieved through the useof imagespecific featuresand
entropic graphestimators.
18
9 Ultrasonic BreastImageRegistration
Figure14 show therepresentative profilesof theregistration objective function for registering a sliceof US
breast imagevolume,labeledCase142in Fig. (1), to aslice2mmdeeper in thesameimagevolume.At this
separationdistance,thespeckle noisedecorrelates. Howevertheunderlying anatomyremainsapproximately
unchanged. As theaim of this study is to quantitatively compare different feature selection andregistration
similarity indiceswe restricted our investigation to rotation transformations over a small range ( WU� Ö tou¨�)Ö ) of angles. The panelon far left of Fig. 14 indicatesthat, for single pixel features,entropic-graph
(MST) estimates of � -Jensendifferenceandhistogramplug-in estimatesof � -MI give similarity functions
with virtually identical profiles with a uniqueglobal minimumat � Ö rotation. Theprofile of thehistogram
plug-in estimateof the � -Jensendifferencefor single pixels(not shown)is very similar to the � -MI profile.
For 8 dimensional ICA feature vectors, chosenby training on Case151, we observe that the profile of
thehistogramplug-in estimateof � -Jensendegrades,with theappearanceof several local minima(middle
panel). This is expectedsincehistogramestimation becomesunstable in high dimensionalfeature space.In
a 64 dimensional ICA feature space,againchosen by training on Case151,we observe that the entropic-
graph maintains a smoothprofile with a singleglobal minimum(right panel) while thehistogramestimate
of � -MI is not implementable.
We next investigatedthe effect of additive noiseon small-angle registration performance. We tested
registration accuracy for single-pixels, tags, and discrete and continuous ICA featuresusing the � -MI,
entropic-graph, and � -Jensendiscriminating criteria under increasing noiseconditions. Figures(15 and
16) showplots of registration root meansquare (rms)error versus increasinglevels of additive (truncated)
Gaussian noise in the images. Shown on the plots arestandarderror bars. The resultant registration peak
shifts from the perfect alignmentposition ( ��Ö relative rotation), to somearbitrary valuedepending on the
SNR,theregistration features,andentropy/MI estimation techniques implemented.Thelack of smoothness
in theplots is likely dueto the image-specific nature of the simulation - averagingresults from registering
differenttypesof breastimages would undoubtedly producesmoother graphs.
Figure15 shows a comparisonof the histogramandfeature coincidence treemethods for tag features.
Weseethattagfeatureshave lower MSEthanthestandardsinglepixel features.Also higher order ICA fea-
19
turesperform betterthansingle pixels. Noticealsothatincreasingthedimensionality of theICA coefficient
makestheregistration morerobust,by further lowering misregistrationerror. Figure16showsacomparison
of the � -MI and � -Jensendiscrimination criteria. The � -Jensendifferencefunction is calculatedfor single
pixelsand8D ICA coefficientsusing higherorderhistogramsandtheMST entropic-graph estimate. It also
shows the single pixel MI criterion under a range of SNRconditions similar to the onesexplainedabove.
Theperformanceof higherordercontinuousICA featuresestimated from theMST is seen to bebetter than
thoseof single pixel featuresestimated from histograms.Moreextensiveexperimentsarenecessarybut thus
far our results indicate that the MST methodof entropy estimation hassignificantly greater robustness to
additive noise thanhistogrambased methods.
10 Conclusions
In thispaperwedemonstratedthathigher order featuresoffer distinct advantagesoverstandard singlepixels
for entropy-basedUS imageregistration. They canaccount for local spatial information between image
featureswhich is ignored by single-pixel features. We presentedtwo techniquesfor generating imagede-
pendenthigher orderfeature vectors. DiscretefeaturessuchastagsanddiscreteICA coefficientswereused
for imageregistration using ageneralized � -MI matching criteria. Featurecoincidencetreeswerepresented
to generatehistogramsin higher dimensional feature space. We concludedthat tagsandICA featuresare
morerobust to noise artifactsandgive lower misregistrationerrors thansingle-pixel based MI methods.
The � -Jensendifferencefunction canbe used to obtain a consistentestimateof entropy of continuous
feature vectorsusing minimal graphssuchastheMST. Computing thehistogramestimateof � -MI is expo-
nentially complex in higherdimensions,sincethebiasof theestimate increasesin thenumber of dimensions
of the feature space. The direct graphbasedMST estimateof � -Jensendifferenceavoids otherproblems
thatplagueplug-in histogramestimates,which include ideal bin sizeanddensity kernel width selection and
it alsohasa faster asymptotic convergenceratefor non-smoothdensities. Combinatorial optimizationis a
bottleneckin MST computation in large featuresetsin high dimensions.We have presenteda techniqueto
accelerateMST computationandreduceits memorycomplexity soasto beableto usedesktop processing
for large datasets.Using the disc-basedalgorithm it waspossible to rapidly compute the MST length for
20
datasetsup to 1 million points in 8 dimensional space. The MST-basedestimateof � -Jensengave lower
misregistration errorsthan discretefeaturevectorscomputedusing � -MI, andboth techniquesoutperformed
single-pixel MI methods.
Applying these retrieval andregistrationtechniques to color images,suchasregistrationof USDoppler
blood flow color imageswith othercolor flow or gray scale imagesis a direction of future study. Another
natural extension of this work is to incorporate spatial relations amongstpairsof spatially separatedtags
to eliminate the effect of shadowsand other nonlinear artifactswhich poseproblems during compound-
ing registration of ultrasonic images. We areactively pursuing other techniquesto further accelerate the
MST algorithm using parallel machinesandlist intersection. Optimaldisc-radiusselection hasbeen found
critical to complexity reduction in the disc-basedMST construction algorithm. We areinvestigating other
techniquesto judge theoptimaldiscradiusfor MST construction. To increasetheclassification capabilities
of thefeature trees,differentmethodsof randomizedtreesarebeing studied. Finally, thebestvalue of � in
the � -entropy similarity criteriais anopen issuethat needsfurtherinvestigation.
11 Acknowledgments
To SunChungPhD,ComputerScience Department, St. ThomasUniversity for thevaluable suggestions on
acceleration of theMST algorithm.
References
[1] Amit Y andGemanD,“Shapequantization andrecognition with randomizedtrees”, Neural Computa-
tion, Vol. 9, pp.1545-1588, 1997.
[2] Ashley J,Barber R, Flickner M, HafnerJL, LeeD, Niblack W andPetkovic D, “Automaticandsemi-
automatic methodsfor imageannotationandretrieval in QBIC”, Proc.SPIEStorage andRetrieval for
Image andVideoDatabasesIII , pp.24-35,1995.
[3] Bassevil le M,“Distancemeasuresfor signalprocessing andpattern recognition”, Signal Processing ,
Vol. 18,pp.349-369,1989.
21
[4] Bhatti PT, LeCarpentier GL, Roubidoux MA, FowlkesJB, Helvie MA, CarsonP.L., “Discrimination
of SonographicBreastLesionsUsing Frequency Shift Color Doppler Imaging, Age andGray Scale
Criteria”, J. Ultrasound Med., vol. 20,no.4, pp.343-350,2001.
[5] DemboA andZeitouniO,“Largedeviationstechniquesandapplications”, Springer-Verlag NY, 1998.
[6] Do MN andVetterli M,“Texture similarity measurementusingKullback-Liebler distanceon wavelet
subbands”, Proc. IEEE International Conference on Image Processing, Vancouver, BC, pp. 367-370,
2000.
[7] DunnD, Higgins WE andWakeley J, ”Texture segmentation using 2D Gaborelementary functions”,
IEEETrans.PatternAnal.Mach. Intelligence, vol 16,no.2, pp.130-149,1994.
[8] GemanD andKoloydenko A,“Invariantstatisticsandcoding of natural microimages”, IEEEWorkshop
on Statis. Computat. Theoriesof Vision,Fort Collins,CO, June 1999.
[9] GordonPB andGoldenberg SL, “Mali gnantbreast massesdetectedonly by ultrasound”, Cancer,vol.
76,pp 626-630,1995
[10] HafnerJ,Sawhney HS,EquitzW, FlicknerM andNiblack W, “Efficient color histogramindexing for
quadratic form distancefunction”, IEEE TransPattern Anal. Mach. Intelligence, vol. 17, no.7,July
1995.
[11] He Y, Ben-HamzaA andKrim H, “An information divergencemeasure for ISAR imageregistration”,
Signal Processing, submitted,2001.
[12] HeroAO, Ma B, Michel O andGormanJD, ”Appl icationsof entropic spanninggraphs,” to appear in
IEEESignal Proc.Magazine(Special Issueon Mathematical Imaging) Oct.2002.
[13] Hero AO, Ma B, Michel O andGormanJD, “Alpha-Divergence for Classification, Indexing andRe-
trieval”, Technical ReportCSPL-328CommunicationsandSignal ProcessingLaboratory, TheUniver-
sity of Michigan,48109-2122, May 2001
[14] Hero AO, Ma B andMichel O, “Imaging applications of stochastic minimal graphs”, Proc. of IEEE
Int. Conf. on Image Proc.,Thesaloniki, Greece, Oct 2001.
22
[15] HeroAO,CostaJandMa B, “Convergenceratesof minimalgraphswith random vertices”,(submitted
to) IEEETrans.Information Theory, 2001.
[16] Hero AO andMichel O, “Asymptotic theory of greedy approximations to minimal K-point random
graphs”, IEEETrans.InformationTheory, Vol. IT-45,No. 6, pp.1921-1939, Sept.1999.
[17] HeroAO andMichel O, “Estimation of Renyi InformationDivergencevia PrunedMinimal Spanning
Trees”,1999 IEEEWorkshop on Higher Order Statistics, Caesaria Israel, 1999.
[18] Hill DLG, Batchelor PG,HoldenM andHawkesDJ, “Medical imageregistration”, Phys.Med.Biol.,
vol 26,pp R1-R45,2001.
[19] HuangJ, KumarSR,Mitra M, Zhu W, “Spatial Color Indexing andApplications”, IEEE Int’l Conf.
ComputerVision ICCV‘98, Bombay, India, pp 602-608,Jan.1998.
[20] HyvarinenA andOja E, “Independentcomponentanalysis: algorithmsandapplications”, Neural Net-
works, vol. 13,no.4-5,pp.411-430,1999.
[21] KedarRP, Cosgrove DO, Smith IE, Mansi JL andBamberJC,“BreastCarcinoma: Measurementof
tumor responseto primary medical therapy with color Doppler flow imaging”, Radiology, vol. 190,
pp.825-830,1994.
[22] Kim B, BoesJL, Frey KA, Meyer CR, “Mutual information for automated unwrapping of rat brain
autoradiographs”, Neuroimage, vol. 5, pp.31-40, 1997.
[23] Kolb TM, Lichy J andNewhouseJH, “Occult cancer in womenwith densebreasts: Detection with
screening ultrasound: Diagnostic yield andtumor characteristics”, Radiology, vol. 207,pp. 191-198,
1998
[24] Krucker JF, Meyer CR,LeCarpentier GL, FowlkesJB, CarsonPL, “3D SpatialCompounding of Ul-
trasoundImagesUsingImage-Based,Nonrigid Registration”, Ultrasound MedBiol, vol. 26,no.9,pp.
1475-1488, 2000.
[25] Krucker JF, LeCarpentier GL, Meyer CR,FowlkesJB,RoubidouxMA, CarsonPL, “3D imageregis-
tration for multimode, extendedfield of view, andsequential ultrasoundimaging”, RSNA ej, 1999.
23
[26] KruskalJB, “On theshortest subtreeof a graphandthetraveling salesmanproblem”, Proc.American
Math.Societ, vol. 7, 48-50, 1956.
[27] Kullback S andLiebler R,“On information andsufficiency”, Ann.Math. Statist., Vol. 22, pp. 79-86,
1951.
[28] Lefebure M andCohenLD, “ ImageRegistration, optical flow and local rigidity”, J. Mathematical
Imaging andVision, vol. 14,no.2, pp.131-147,March2001.
[29] Leow W, Lai S, “Scaleandorientation-invariant texturematching for imageretrieval”, in Pietikainen
MK ed.,Texture Analysis in Machine Vision, World Scientific, 2000.
[30] Lewicki M andOlshausenB, “Probabilistic framework for the adaptation andcomparison of image
codes”, J. Opt.Soc.Am., vol. 16,no.7, pp..1587-1601, 1999.
[31] Linde Y, BuzoA andGrayRM, “An algorithm for vector quantization design”, IEEETransactionson
Communications, vol. 28,pp.84-95, January 1980.
[32] Ma B, HeroAO, GormanJ andMichel O, “Imageregistrationwith minimal spanning treealgorithm”,
IEEEInternational Conf. on Image Processing, Vancouver, Oct.2000.
[33] Ma WY andManjunath BS,“NETRA: A toolbox for navigatinglarge imagedatabases”, Proc. IEEE
International Conferenceon Image Processing, Santa Barbara, California, Vol. I, pp. 568-571, Oct
1997.
[34] MaesF, Colignon A, VandermeulenD, Marchal G andSuetensP, “Multimodality imageregistration
by maximization of mutualinformation”, IEEETransMedical Imaging, vol. 16,pp 187-198,1997.
[35] Maintz JBA andViergever MA,“A survey of medicalimageregistration”, Medical Image Analysis,
vol. 2, no.1, pp 1-36,1998.
[36] MaintzJBA, vandenElsenPA, ViergeverMA, “Comparisonof edge-basedandridgebasedregistration
of CT andMR brain images”, Med.Image Analysis, vol. 1, pp 151-161,1996.
24
[37] Meyer CR, BoesJL, Kim B, Bland PH, LeCarpentier GL, FowlkesJB, Roubidoux MA, CarsonPL,
“SemiautomaticRegistration of Volumetric Ultrasound Scans”,Ultrasound Med.Biol., vol. 25, no.3,
pp 339-347,1999.
[38] Meyer CR,BoesJL, Kim B, et al,“Demonstration of accuracy andclinicla versatility of mutualinfor-
mationfor automaticmultimodality imagefusion usingaffine andthin-platespline warped geometric
deformations”, MedImage Analysis, vol. 1, pp.195-206,1996/97.
[39] Michel O, Baraniuk R, andFlandrin P, “Time-frequency based distance anddivergencemeasures”,
IEEEInternational Time-Frequency andTime-Scale Analysis Symposium, pp.64-67, Oct 1994.
[40] Moskalik A, Carson PL, Meyer CR, FowlkesJB, Rubin JB, Roubidoux MA, “Registration of three-
dimensionalCompoundUltrasoundScansof theBreastfor Refraction andMotion Correction”, Ultra-
sound Med.Biol., vol. 21,no.6,pp 769-778,1995.
[41] Neemuchwala HF, HeroAO, andCarsonPL, “Image registration usingentropic spanning graphs”, (to
appear in) Proc.of36thAsilomarConfSignals, SystemsandComputers,PacificGrove, CA, Nov 2002.
[42] Neemuchwala HF, Hero AO, andCarsonPL, “Fastalgorithms for Minimum spanning treeconstruc-
tion”, (to apear in) Technical ReportCSPLCommunicationsand Signal Processing Laboratory, The
University of Michigan,48109-2122, 2002
[43] Neemuchwala HF, HeroAO, andCarsonPL, “Feature coincidence treesfor registration of ultrasound
breast images”, Proc.ofIEEEInt. Conf. on Image Proc.,Thesaloniki, Greece, Oct 2001.
[44] NeneSA andNayarSK, “A SimpleAlgorithm for Nearest Neighbor Searchin High Dimensions”,
IEEETrans.PatternAnalysis andMachineIntelligence, vol.19, 1997
[45] Pentland A, PicardW, Sclaroff S.,“Photobook: Content-BasedManipulation of ImageDatabases”,
Proc SPIEStorage andRetrieval for Image andVideoDatabasesII , no.2185, 1994.
[46] RocheA, Xavier P, Malandain G, “Generalizedcorrelationratio for rigid registration of 3D ultraosund
with MR images”, IEEETran.Medical Imaging, vol 20,no.10,pp.25-31, Oct 2001.
25
[47] Rohling R, GeeA, BermanL, “Automatic registration of 3D ultrasound images”, Ultrasound Med.
Biol., vol. 24,pp 769-778,1995.
[48] RudinLI andYu P, “Improvedforensicphotogrammetricmeasurementswith global geometrical con-
straints”, Proc.SPIE, vol. 4709,2001.
[49] Shekhar R andZagrodsky V, “Mutual information-basedrigid andnonrigid registration of ultrasound
volumes”,IEEETransMedical Imaging, vol. 21.no.1,pp 9-22,2002.
[50] SmeuldersAWM, Worring M, Santini S, GuptaA, JainR, “Content-based imageretrieval at the end
of theearlydays”, IEEETransPatternAnalysisandMachine Intelligence, vol. 22,no.12,2000.
[51] Smith JR andChangSF, “Automated imageretrieval usingcolor andtexture”, Columbia University
Technical Report TR414-95-20, July 1995.
[52] StoicaR, ZerubiaJ andFrancos JM,“Imageretrieval andindexing: A hierarchical approachin com-
puting distancebetween texturedimages”, Proc.IEEEInternational ConferenceonImageProcessing,
Chicago, 1998.
[53] StavrosAT, ThickmanDI, RappCL, DennisMA, ParkerSH,Sisney GA, “Solid breastnodules:useof
sonography to distinguish betweenbenign andmalignant lesions”, Radiology, vol.196, pp. 123-134,
1995.
[54] Swain MJ andBallardDH, ”Color indexing”, Int’l J. ComputerVision, vol.7, no.1,pp 11-32, 1991.
[55] VasconcelosN andLippmanA,“Bayesianrepresentationsandlearning mechanismsfor contentbased
imageretrieval”, SPIEStorage andRetrieval for MediaDatabases,SanJose, CA, 2000.
[56] VasconcelosN andLippmanA,“Embeddedmixturemodeling for efficient probabilistic content-based
indexing andretrieval”, SPIEMultimedia Storage andArchiving Systems,Boston, MA, 1998.
[57] Viola P andWells WM, “Alignmentby Maximizationof Mutual Information”, Fifth Int’ l Conf. Com-
puterVision,Cambridge, MA, pp 16-23, IEEE,1995
26
Figure1: Threeultrasoundbreast scans. Fromleft to right are:Case151, Case142andCase152.
Figure2: Single-pixel gray level coincidencesarerecordedby counting numberof co-occurencesof a pairof graylevel acrossthereference(left) andsecondary(right) imagesatapairof homologouspixel locations3N×1��Ø~8 . Herethesecondaryimage(right) is rotatedby ��Ö relative to thereferenceimage(left).
27
50 100 150 200 250
50
100
150
200
250
50 100 150 200 250
50
100
150
200
250
50 100 150 200 250
50
100
150
200
250
50 100 150 200 250
50
100
150
200
250
Figure3: Jointcoincidencehistogramsfor single-pixel graylevel features.Bothhorizontalandvertical axesof eachpanel areindexed over thegraylevel rangeof 0 to 255. Top left: joint histogramscatter plot for thecasethatreferenceimage(
���) andsecondaryimage(
� 0 ) arethesamesliceof theUSimagevolume(Case142) at perfect � Ö alignment (
� 0 �j� �). Top right: sameastop left exceptthat reference andsecondary
aremisalignedby � Ö relative rotation asin Fig. 2. Bottomleft: sameastop right exceptthat the referenceandsecondaryimagesarefrom adjacent(2mmseparation) slicesof theimagevolume.Bottomright: sameasbottom right except that imagesaremisalignedby � Ö relative rotation.
Root Node
Depth 1
Depth 2
Not examined further
Figure4: Part of feature treedatastructure.
Terminal nodes (Depth 16)
Figure5: Leavesof feature treedatastructure.
28
Figure6: Tags: Eachpixel is labeled by a tag type. Occurrencesandcoincidencesof tag labels arethenplottedon a joint histogramof tagfeatures.
Figure7: ICA Basis:estimated40-dimensional ICA basissetobtainedfromtraining on randomly selected����� blocks in 10 ultrasoundbreast images.
29
0 0.2 0.4 0.6 0.8 10
0.2
0.4
0.6
0.8
1
z0
z1
100 uniformly distributed points
0 0.2 0.4 0.6 0.8 10
0.2
0.4
0.6
0.8
1
z0
z1
MST through 100 uniformly distributed points
Figure8: A setof c points / ��� in the plane (left) andthe corresponding Minimal Spanning Tree(MST)
(right).
0 10000 20000 30000 40000 500000
50
100
150Effect of entropy on MST Length
MS
T L
engt
h
Number of points, N
Uniform Dist.Gaussian Dist.
0 10000 20000 30000 40000 50000 600000.3
0.35
0.4
0.45
0.5
0.55
0.6
0.65
0.7Normalized MST length functions, red: Unif. blue: Gauss.
Number of Points
Nor
mal
ized
MS
T le
ngth
Figure 9: Length functions
¹o of MST (left) and
¹o / ¼ c (right) as a function of n for the uniform and
normaldistributedpoints in figure.
30
0 0.02 0.04 0.06 0.08 0.10
10
20
30
40
50
60
70Effect of disc radius on MST Length
MS
T L
engt
h
Radius of disc
Real MST LengthMST Length using Disc
0 50 100 150 200 2500
10
20
30
40
50
60
70
Nearest Neighbors along 1st dimension
MS
T L
engt
h
Automatic disc radius selection using kNN
MST length using kNNTrue MST Length
Figure10: Biasof the c logc MST algorithm asafunction of radiusparameter(left) andasa functionof thenumberof nearest neighbors(right)
0 0.2 0.4 0.6 0.8 10
0.2
0.4
0.6
0.8
1
z0
z1
Selection of nearest neighbors for MST using disc
r=0.15
0 50000 1000000
50
100
150
200
Number of points, N
Exe
cutio
n tim
e in
sec
onds
Linearization of Kruskals MST Algorithm for N 2 edges
Standard Kruskal Algorithm O(N 2)Intermediate: Disc imposed, no rank orderingModified algorithm: Disc imposed, rank ordered
Figure 11: Acceleration of Kruskals MST algorithm from Ù)Ú logÙ to Ù logÙ (left) and Comparison ofKruskal’s MST to our Ù logÙ MST algorithm (right)
31
0 100 200 300 400 500 600 700 8000
50
100
150
200
250
300
350
Nearest Neighbors along 1st dimension
MS
T L
engt
h
Radius selection w/ kNN for Gaussian Dist., 10000 2Dpoints
MST length using kNNTrue MST Length
Figure12: [a] Constructing theMST basedon kNN baseddiscradiusestimate couldbea problem ion non-uniform distributionsdueto theslow convergenceof thelength function (left). [b] As thedimensionality ofthedataset increasesthis problem aggravatesandthepartial search now approachesthefull search (right).
50 100 150 200 250 300 350
50
100
150
200
250
300
350
50 100 150 200 250 300 350
50
100
150
200
250
300
350−10 −5 0 5 100
0.2
0.4
0.6
0.8
1
Relative rotation between images
MS
T L
engt
h &
Sha
nnon
MI
Norm. MST Length & Shannon MI for example image
MST Length (ICA)MI (Single−Pixel)
Figure13: Imagebackgrounds have beensynthesizedusing15 ICA bases elements each. The prominentfeature seenin the images (right) is translated across the image(center) (exaggeratedfor effect). Thebackground remainsunchanged. Curvesfor singlepixel basedÛ MI andICA basedÛ Jensen(right)
32
−10 −5 0 5 100
0.2
0.4
0.6
0.8
1
1.2
1.4
Relative rotation between images
MS
T L
engt
h an
d al
phaM
I, al
pha=
0.5
Norm. MST Length & alphaMI, Single Pixel Domain, case142
MST LengthalphaMI, alpha=0.5
−5 0 50
0.2
0.4
0.6
0.8
1
Relative rotation between images
Norm. MST Length & alphaJensen, 8D ICA coeff., case142
MST LengthalphaJensen, alpha=0.5
−10 −5 0 5 100
0.2
0.4
0.6
0.8
1
MS
T L
engt
h
Relative rotation between images (degrees)
MST Length profile for 64D ICA feature vector, vol. 142
Figure14: For registration of two slicestaken from Case142: MST andhistogrambasedentropy estimationfor single pixel features(left). MST andhistogrambasedentropy estimationfor 8D ICA features(center)and64D ICA feature vectors(right)
0 5 10 15 20 250
5
10
15
Standard Deviation of Noise
Pos
ition
of P
eak
(Reg
istr
atio
n E
rror
)
Effect of Additive Noise on peak of objective function
Single Pixel MI, alpha=0.5Feature Tag (8x8) MI, alpha=0.5
0 5 10 15 200
5
10
15
Standard Deviation of Noise
Pos
ition
of P
eak
(Reg
istr
atio
n E
rror
)
Effect of Additive Noise on peak of objective function
Single Pixel MI, alpha=0.5Discrete 8D ICA MI, alpha=0.5Discrete 16D ICA MI, alpha=0.5
Figure15: Effect of additive Gaussian noiseon the rms of the peak position of the ÜEÝßÞ°àáÜ - MI objectivefunction estimatedusing histograms onsingle-pixel andfeaturecoincidencetreesof â'ã¨â tagfeatures (left)andfeaturecoincidencetrees on discreteICA (8D and16D) features (right). Theseplots arebasedon 20repeatedexperimentsfor theCase142imageandits rotatedcousin (maximum rotation anglewas ä å æ ).
33
0 5 10 15 20 250
5
10
15
Standard Deviation of Noise
Pos
ition
of P
eak
(reg
istr
atio
n E
rror
)
Effect of Additive Noise on peak of objective function
alphaJensen Diff MST on 8D−ICAalphaMI on single pixels w/ HitogramsalphaJensen on 8D−ICA using HistogramsalphaJensen on single pixels w/ MST
Figure16: Effectof additivenoiseonthepeakof the Û - MI objectivefunctionestimatedusing histogramsonsingle pixels, Û - Jensenfunction estimatedusinghistogramsonsingle-pixelsand8D discreteandcontinuousICA features.
34