
[IEEE 2009 22nd IEEE International Symposium on Computer-Based Medical Systems (CBMS) - Albuquerque, NM, USA (2009.08.2-2009.08.5)] 2009 22nd IEEE International Symposium on Computer-Based


Unsupervised Scaling of Multi-Descriptor Similarity Functions for Medical Image Datasets

Renato Bueno¹, Daniel S. Kaster¹,²,*, Adriano A. Paterlini¹, Agma J. M. Traina¹ and Caetano Traina Jr.¹

¹ Department of Computer Science – ICMC, University of São Paulo at São Carlos, SP, Brazil
{rbueno, paterlini, agma, caetano}@icmc.usp.br

² Department of Computer Science, University of Londrina, Londrina, PR, Brazil
[email protected]

Abstract

Content-based search has proven to be a proper complement to textual queries over medical image databases. In many applications, employing multiple image descriptors and combining the respective distance functions using adequate scale factors improves the retrieval accuracy. However, the existing weighting methods are either exhaustive or supervised. In this paper, we present the Fractal-scaled Product Metric, an unsupervised method based on Fractal Theory to determine a scale factor among features in multi-descriptor image similarity assessment. The composite distance function obtained is not limited to dimensional image descriptors and enables the use of scalable indexing structures. Experiments have shown that the proposed method determines near-optimal scale factors for the descriptors involved and always improves the precision of the results, outperforming the individual descriptors by up to 31% in average precision.

1. Introduction

With the recent explosion in the availability of filmless radiology equipment, the management of digital medical images is receiving more and more attention. Picture Archiving and Communication Systems (PACS) have been successfully introduced in many hospitals and specialized clinics, providing quick access to screening exams and integrating the actors involved in the enterprise's workflow. The radiological databases built by storing digital images have evolved from a simple deposit of past exams kept for legal reasons to an active, continuously updated repository of professional knowledge. This knowledge base is useful for many tasks, including computer-aided diagnosis, research, continuing medical education and radiological training.

* On leave at Department of Computer Science – ICMC, University of São Paulo at São Carlos, SP, Brazil.

The Digital Imaging and Communications in Medicine (DICOM) standard for image communication allows storing the study information with the image. Most queries in a PACS are based on these metadata. However, querying images based on their visual characteristics has proven to be a suitable complement to text-based search. For instance, visual features not only allow the retrieval of cases where patients have similar diagnoses, but also the identification of cases with visual similarity but different diagnoses [13].

Content-Based Image Retrieval (CBIR) is a technology that helps to manage digital picture archives organized by their visual content [7]. The original representation of an image as an array of pixels corresponds poorly to the human visual response, so processing steps are required to acquire a semantic interpretation of the content. To do so, the images are submitted to image processing algorithms, which generate a mathematical signature describing the properties of the image content. Thereafter, the signatures can be compared through a function that measures the similarity between them, which makes it possible to perform queries based on similarity criteria.

In many real situations, employing various feature extractors to describe the image content improves the accuracy. However, an uncontrolled increase in the number of features in a composite vector may degrade both the performance and the precision of the results. Combining multiple descriptors requires a strategy that both enables employing the best pairing of feature extractor and distance function for each descriptor, and establishes a proper balance of the descriptors in the similarity calculations. However, the methods for weighting multiple descriptors encountered in

978-1-4244-4878-4/09/$25.00 ©2009 IEEE


the literature are based either on exhaustive experimentation or on supervised techniques.

In this paper we present a new method, called Fractal-scaled Product Metric (FPM), to determine a scale factor among features in multi-descriptor image similarity assessment, which does not require any user-input parameter. This method is based on the intrinsic dimensions of the feature spaces, approximated by their respective fractal dimensions. Moreover, the composite distance function is not limited to image extractors generating vectors (i.e. data inherently immersed in a multidimensional space) and enables using scalable access methods to speed up data searching. Experiments have shown that the proposed method determines near-optimal scale factors for the descriptors involved and always improves the precision of the results.

The remainder of the paper is organized as follows. Section 2 introduces the basic concepts for the paper and the main related work. The proposed method is described in Section 3. Section 4 shows results of some of the experiments performed to evaluate the proposed method. Finally, Section 5 presents the conclusions and future work.

2. Background and Related Work

By the nature of its task, CBIR technology depends on two intrinsic problems: (i) how to state a signature that mathematically describes an image, and (ii) how to assess the similarity between a pair of images based on their signatures [7]. The visual properties of an image are commonly described in terms of color, texture and shape. These descriptors are extracted by image processing algorithms either globally for the entire image or locally for regions of interest, defined manually or by a segmentation algorithm. These algorithms, called feature extractors, compute a number of values, usually numeric, that compose the signatures representing the images, also referred to as feature vectors. There are many image feature extractors documented in the literature, such as the histogram for color, the Haralick descriptors for texture and the Zernike moments for shape [1].

Due to the nature of image data, the similarity among elements has emerged as the most important property for image querying. Similarity relies on a function δ : S × S → ℝ that compares a pair of elements in the data domain S and returns a real value quantifying how much they differ. Many measures proposed in the literature treat similarity (or, more precisely, dissimilarity) as a distance, so smaller values denote more similar elements.

The basic image similarity evaluation considers only one descriptor. However, in most real situations, employing various feature extractors to describe the image content improves the accuracy of the results, because they provide complementary characteristics that contribute to the representation of the image content. The next subsection discusses the combination of multiple descriptors for similarity assessment.

2.1. Multi-Descriptors Distance Functions

The basic approach to combine multiple features is to concatenate every value in a “super-vector” and compare the resulting vectors using a Minkowski distance function. This approach fails in three aspects. First, it has been shown that there is a close relationship between the features and the distance function used to compare the data, so obtaining better results requires finding the best pairing between them [4]. Therefore, employing a single distance function to compare all the descriptors results in lower precision. However, most medical CBIR systems haphazardly employ either the Euclidean distance (L2) or the Manhattan distance (L1) [13], even when combining multiple feature extractors. Second, only feature extractors that produce dimensional descriptors can be combined. Finally, the resulting vectors have high dimensionality and therefore face the curse of dimensionality. Several research works employ dimensionality reduction techniques to soften this problem, such as Principal Component Analysis (PCA) and feature selection algorithms. Nonetheless, the mismatch between features and distance function remains.

A better method to combine multiple descriptors computes the similarity for each of them at a base level and then aggregates the partial similarities using a composition function. Let any image x be represented by a set of n descriptors x_1, ..., x_n, each generated by a feature extraction algorithm, and δ_1, ..., δ_n be distance functions defined over the domains of the respective descriptors. The composition distance function ∆ between two images x, y is a combination of the individual distances δ_i(x_i, y_i), usually given by:

\[ \Delta(x, y) = \sum_{i=1}^{n} w_i \cdot \delta_i(x_i, y_i) \qquad (1) \]

where w_i is the weight assigned to the respective descriptor. This method does not have the problems of the super-vector approach. Moreover, dimensionality reduction techniques can also be applied to individual dimensional descriptors, further improving the retrieval precision.
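As an illustration of Equation 1, the following Python sketch aggregates per-descriptor distances with fixed weights. The function names (`composite_distance`, `euclidean`, `manhattan`), the toy descriptors and the weights are assumptions made for this example only; they are not taken from any of the cited systems.

```python
import math


def euclidean(a, b):
    """L2 distance between two equal-length numeric sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    """L1 distance between two equal-length numeric sequences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def composite_distance(x, y, distances, weights):
    """Weighted sum of per-descriptor distances, as in Equation 1.

    x, y      -- one descriptor per feature extractor for each image
    distances -- one distance function per descriptor (delta_i)
    weights   -- one weight per descriptor (w_i)
    """
    if not (len(x) == len(y) == len(distances) == len(weights)):
        raise ValueError("one distance and one weight are needed per descriptor")
    return sum(w * d(xi, yi) for w, d, xi, yi in zip(weights, distances, x, y))


# Two hypothetical images, each described by two descriptors.
img_a = ([0.0, 0.0], [1.0, 2.0, 3.0])
img_b = ([3.0, 4.0], [1.0, 0.0, 3.0])

# Illustrative weights: 0.5 for the first descriptor, 2.0 for the second.
total = composite_distance(img_a, img_b, [euclidean, manhattan], [0.5, 2.0])
print(total)
```

Because each descriptor keeps its own distance function, a descriptor that is not a fixed-length vector can take part in the sum as long as a distance is defined for it.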

Several systems have employed this strategy, such as the work of Heesch et al. for sketch retrieval [11] and the medical CBIR system of the Telemedicine Center at the National University of Colombia [6]. The main drawback of such systems is that they determine the weights of each component by means of exhaustive search. To find adequate weights for each component, some approaches were proposed relying either on relevance feedback cycles [10] or on supervised techniques, such as the work presented in [5], which determines the weights using a heuristic that is dynamically computed for each query element based on a set of previously labeled elements. In contrast to these works, we propose in this paper an unsupervised technique to reach proper scale factors among the multiple descriptors employed.

2.2. The Metric Approach to Similarity

Many image feature extractors generate a constant number of real values representing the intrinsic characteristics of an image. However, there are exceptions. For instance, the Metric Histogram [15] performs a piecewise linear approximation over a normalized histogram, yielding a descriptor with a varying number of bins. Most CBIR approaches based on recognizing multiple regions in each image do not generate dimensional signatures either. In those techniques, the similarity between images is computed based on the regions identified by a segmentation process. Thus, two images may have different numbers of regions, and the region descriptors obtained may have different sizes [14]. Extractors that generate any sort of nominal values are also not inherently embedded into a dimensional space. Therefore, the common geometric interpretation of distances in a multidimensional space does not apply in such cases.

In this sense, metric spaces are more adequate to represent multimedia data, as they only require the elements and their pairwise distances. A metric space is formally defined as a pair ⟨S, δ⟩, where S is a data domain and δ is a metric. A metric is a distance function that satisfies the following metric axioms, ∀x, y, z ∈ S: (1) symmetry: δ(x, y) = δ(y, x); (2) non-negativity: 0 < δ(x, y) < ∞ if x ≠ y and δ(x, x) = 0; and (3) triangular inequality: δ(x, y) ≤ δ(x, z) + δ(z, y). Therefore, both adimensional¹ and dimensional data can be represented in a metric space, provided that proper metrics are defined to compare the data.
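The three axioms can be tested empirically on a sample of elements. The sketch below is only a sanity check under stated assumptions: the helper name `first_violated_axiom`, the tolerance and the toy distance functions are hypothetical, and a finite sample can refute the axioms but never prove them for the whole domain.

```python
import math
from itertools import product


def first_violated_axiom(elements, dist, tol=1e-12):
    """Return the name of the first metric axiom violated on the sample, or None."""
    for x, y in product(elements, repeat=2):
        d_xy = dist(x, y)
        # (1) symmetry
        if abs(d_xy - dist(y, x)) > tol:
            return "symmetry"
        # (2) non-negativity: zero for identical elements, finite and positive otherwise
        if x == y:
            if abs(d_xy) > tol:
                return "non-negativity"
        elif not (0 < d_xy < math.inf):
            return "non-negativity"
    # (3) triangular inequality over every triple of the sample
    for x, y, z in product(elements, repeat=3):
        if dist(x, y) > dist(x, z) + dist(z, y) + tol:
            return "triangular inequality"
    return None


def manhattan(a, b):
    """L1 distance, a known metric."""
    return sum(abs(p - q) for p, q in zip(a, b))


def squared_difference(a, b):
    """Squared difference of the first coordinate; known NOT to be a metric."""
    return (a[0] - b[0]) ** 2


sample = [(0.0,), (1.0,), (2.0,)]
print(first_violated_axiom(sample, manhattan))
print(first_violated_axiom(sample, squared_difference))
```

The squared difference fails the triangular inequality on this sample because the distance between the two extreme points (4) exceeds the sum of the distances through the middle point (1 + 1).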

Employing a metric to compute the dissimilarity between images also makes it possible to adopt a Metric Access Method (MAM) to speed up similarity queries, addressing the performance gap in medical CBIR systems [8]. A MAM employs properties of metric spaces to reduce the search space. Many MAMs have been proposed [12], such as the VP-tree, the M-tree and the Slim-tree. Experimental evaluation has shown that MAMs are very effective for indexing image data, achieving better performance and scalability than other structures.

The proposed method for scaling multiple image descriptors uses concepts of Fractal Theory, which is introduced in the next subsection.

¹ Adimensional data are those to which no dimensionality can be assigned.

2.3. Fractal Theory and Databases

A fractal is characterized by self-similarity, independent of scale or size, i.e. parts of a fractal object are directly or statistically similar to the whole object. Experimental evidence has shown that the distribution of distances between pairs of elements, in the majority of real datasets, presents self-similarity. Thus, they can be considered as fractal datasets, at least for some distance ranges [9].

An interesting result of Fractal Theory is that any fractal has intrinsic dimensions, independent of the space in which the object is immersed, and they can be measured by its so-called fractal dimensions. One of the most useful fractal dimensions for databases is the correlation fractal dimension D_2. Knowing the D_2 of a dataset allows predicting its properties as being similar to those of a dimensional dataset with approximately the same embedded dimension.

The correlation fractal dimension can be calculated using the Box Counting method [17] or using the technique proposed in [16], which is described next. Given a set of elements in a dataset endowed with a distance function δ, the average number of neighbors within a given distance r is proportional to r raised to a value D. Therefore, the pair-count PC(r) of elements within distance r follows the power law:

\[ PC(r) = K_p \cdot r^{D} \qquad (2) \]

where K_p is a proportionality constant. The graph obtained by plotting Equation 2 in log-log scale is called the distance plot. The plot of a perfectly fractal dataset is a straight line, and its slope is the exponent D in Equation 2, called the distance exponent, which closely approximates D_2. Real datasets result in curves; however, the majority of them can be fitted by a line whose slope gives D. Figure 1 shows the spatial distribution and the distance plot of a US Bureau of Census dataset whose elements are the geographical coordinates of the crossroads of Montgomery County, MD, USA, which has D_2 = 1.82.
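The line fitting described above can be sketched in a few lines of Python. This is a minimal illustration rather than the algorithm of [16]: it counts pairs by brute force (quadratic in the number of elements), uses an ordinary least-squares fit over hand-picked radii, and the names `pair_counts`, `least_squares_slope` and `distance_exponent` are hypothetical. The default distance is Euclidean (`math.dist`, Python 3.8 or later), but any metric can be passed in.

```python
import bisect
import math


def pair_counts(points, radii, dist=math.dist):
    """PC(r): number of element pairs whose distance is at most r, for each radius."""
    n = len(points)
    dists = sorted(dist(points[i], points[j])
                   for i in range(n) for j in range(i + 1, n))
    return [bisect.bisect_right(dists, r) for r in radii]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def distance_exponent(points, radii, dist=math.dist):
    """Slope of the distance plot (log PC(r) versus log r), i.e. D in Equation 2."""
    counts = pair_counts(points, radii, dist)
    logs = [(math.log(r), math.log(c)) for r, c in zip(radii, counts) if c > 0]
    xs, ys = zip(*logs)
    return least_squares_slope(xs, ys)


# Toy dataset: 400 evenly spaced points on a straight line embedded in 2-D.
n = 400
line = [(i / (n - 1), 2 * i / (n - 1)) for i in range(n)]
# Eight radii spaced geometrically between 0.02 and 0.5 (chosen by hand).
radii = [0.02 * 25 ** (j / 7) for j in range(8)]
estimate = distance_exponent(line, radii)
print(round(estimate, 2))
```

For this toy dataset the estimate should come out close to 1, the intrinsic dimension of a line, even though the points are embedded in a two-dimensional space. In practice the range of radii matters, since real datasets are fractal only over some distance ranges.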

Having introduced the basic concepts, we present our multi-descriptor scaling method in the next section.

3. The Unsupervised Scaling Factor for Multi-Descriptors Distance Functions

Feature extractors produce signatures comparable through metrics, so each one generates a metric space. The standard way of aggregating n metric spaces M_i = ⟨S_i, δ_i⟩, 1 ≤ i ≤ n, is to define a metric over the Cartesian product M_1 × M_2 × ... × M_n, which is called a product metric. In this sense, any product metric can be used to aggregate multiple image descriptors, such as the linear, maximal and multiplicative combinations.

Figure 1. Geographic coordinates of the Montgomery County crossroads. (a) Plot of the dataset. (b) Distance plot graph.

The proposed method, called the Fractal-scaled Product Metric (FPM), integrates multiple image descriptors using a product metric following Equation 1, satisfying the metric axioms. Our approach is to calculate the scale factors (the weights w_i) between the composed metrics based on the correlation fractal dimension D_2 of each original metric space created by the extracted descriptors.

The main idea of the FPM is to identify the contribution of each descriptor to the overall similarity calculation. The value of the intrinsic dimension reflects the existence of correlations among the attributes of a dataset. Thus, the fractal dimension provides an estimate of a lower bound for the number of features needed in a similarity search to keep the essential data characteristics. Therefore, scaling the multiple descriptors in the product metric by their intrinsic dimensions, approximated by the value of D_2, neither overestimates nor underestimates the contribution of each descriptor in the similarity assessment.

The results of the individual metrics must be normalized to reduce the effects of the range difference in the similarity computation, preventing features with higher values from dominating the final result. We perform such normalization using the largest known distance dmax_i for each descriptor. Therefore, the FPM is given by:

\[ \Delta(x, y) = \sum_{i=1}^{n} D_{2_i} \cdot \frac{\delta_i(x_i, y_i)}{\mathit{dmax}_i} \qquad (3) \]

The values of dmax_i can be measured by computing the distances between all pairs of elements considering each descriptor, but this operation is quadratic in the number of elements. However, cheaper techniques can be employed to estimate this value. For example, if a MAM is indexing the data of descriptor i, a close approximation can be obtained by taking dmax_i as the diameter of the region covered by the MAM's root node.
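A minimal Python sketch of Equation 3 follows, assuming that the fractal dimensions and the largest distances have already been measured for each descriptor. The names `fpm_distance` and `max_pairwise_distance`, as well as the numeric values in the example, are hypothetical; the exact all-pairs computation of dmax_i shown here is the quadratic option mentioned above, not the MAM-based estimate.

```python
import math
from itertools import combinations


def max_pairwise_distance(values, dist):
    """Exact dmax_i for one descriptor: largest distance over all pairs (quadratic)."""
    return max(dist(a, b) for a, b in combinations(values, 2))


def fpm_distance(x, y, distances, fractal_dims, max_dists):
    """Equation 3: each normalized partial distance is scaled by its D_2.

    x, y         -- one descriptor per feature extractor for each image
    distances    -- metric delta_i for each descriptor
    fractal_dims -- correlation fractal dimension D_2 of each descriptor dataset
    max_dists    -- largest known distance dmax_i of each descriptor dataset
    """
    return sum(d2 * dist(xi, yi) / dmax
               for dist, d2, dmax, xi, yi
               in zip(distances, fractal_dims, max_dists, x, y))


def manhattan(a, b):
    """L1 distance between two equal-length numeric sequences."""
    return sum(abs(p - q) for p, q in zip(a, b))


# Two hypothetical images with two descriptors each.
x = ((0.0, 0.0), (1.0, 2.0, 3.0))
y = ((3.0, 4.0), (1.0, 0.0, 3.0))

# Illustrative values only: D_2 = 2.0 and 4.0, dmax = 10.0 and 8.0.
value = fpm_distance(x, y, [math.dist, manhattan], [2.0, 4.0], [10.0, 8.0])
print(value)
```

Since every partial distance is divided by its own dmax_i, each normalized term lies in [0, 1] for elements of the measured dataset, so the relative influence of a descriptor is governed by its D_2 value rather than by the numeric range of its metric.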

We have already employed a special case of this technique in another work to define a metric-temporal space, which allows comparing elements by similarity while taking into account time-related evolution, and also obtained good results. Refer to [3] for a more complete discussion of the notion of scaling metric spaces using fractals.

The experimental evaluation of our method, shown in the next section, confirms that the proposed way to calculate scaling factors gives a suitable representation of the contributions of each of the multiple descriptors employed.

4. Experiments

This section presents results of experiments performed to evaluate the proposed method. The software was implemented in C++ and uses the Slim-tree MAM, available in the Arboretum² library, to index the data.

The next subsections describe the testing datasets and the retrieval evaluation employed, and discuss the experiments.

4.1. Dataset Description

We have run experiments over several image databases. We present herein the results of two representative ones, provided by the Clinical Hospital of the Faculty of Medicine of Ribeirão Preto, University of São Paulo, Brazil.

The first image database, called MRI 704, is composed of 704 images of Magnetic Resonance Imaging (MRI) exams, with size 256 x 256 pixels. The color depth of the images was reduced to 8 bits, resulting in 256 gray levels. The database is divided into classes according to the body part examined, view plane and slice position, resulting in a total of 40 classes. Each image of the MRI 704 database was processed by three feature extractors, generating three distinct feature datasets for this database. The first dataset, called Metric Histogram, was generated by the adimensional color extractor Metric Histogram. The second dataset, called Haralick, was built by computing the Haralick texture descriptors of the images. The last dataset, called Zernike, contains the first 256 Zernike moments describing the image shapes.

The second image database, called CT Lung ROIs, is a collection of 3257 images, with size 64 x 64 pixels and 256 gray levels, containing Regions of Interest (ROIs) of images of Computed Tomography (CT) lung exams. This database is organized in 6 classes, one of normal and 5 of abnormal patterns. We did not employ a color extractor for CT Lung ROIs, because this feature is not helpful to represent ROIs of images used in medical exams. The images of the CT Lung ROIs database were processed using the Haralick texture descriptors and the first 256 Zernike moments, generating respectively the feature datasets Haralick and Zernike for this database.

² Available at http://gbdi.icmc.usp.br/arboretum

Figure 2. Distance plots of the datasets Metric Histogram, Haralick and Zernike of the MRI 704 database.

Figure 3. Average precisions for the individual descriptors of the MRI 704 database and for varying ratios between the weights for each pairwise combination.

4.2. Retrieval Evaluation

To evaluate the quality of the query results we employed Precision versus Recall (P×R) graphs [2]. The recall of a query is calculated as recall = |R_a| / |R|, where |R_a| is the number of relevant elements retrieved and |R| is the number of relevant elements that should be retrieved. The precision of a query is given by precision = |R_a| / |A|, where |R_a| is the number of relevant elements retrieved and |A| is the size of the answer set. A simple way to interpret a P×R curve is that the closer the curve is to the top, the better the technique.
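The two ratios can be computed directly from an answer set and the set of relevant elements. The sketch below is a hypothetical helper (`precision_recall` is not part of the software described in the paper) and assumes that the answer contains no duplicates and that both sets are non-empty.

```python
def precision_recall(answer, relevant):
    """Return (precision, recall) of an answer set against the relevant set.

    answer   -- retrieved elements, assumed to contain no duplicates (A)
    relevant -- elements that should be retrieved (R)
    """
    relevant = set(relevant)
    # |R_a|: relevant elements that were actually retrieved
    retrieved_relevant = sum(1 for element in answer if element in relevant)
    precision = retrieved_relevant / len(answer)
    recall = retrieved_relevant / len(relevant)
    return precision, recall


# Toy query: 4 elements retrieved, 3 relevant elements exist, 2 of them retrieved.
precision, recall = precision_recall(["a", "b", "c", "d"], {"a", "c", "e"})
print(precision, recall)
```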

P×R curves can be summarized by a single numeric value to allow an overall evaluation of different queries at once. There are several strategies to perform this summarization. In this work we employed the average precision, which is given by the average of the precision at all recall levels, as defined in [2]. Intuitively, the average precision is the area below a P×R curve.
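One common way to compute such a summary for a single ranked result is to average the precision observed each time a relevant element is retrieved, since each relevant element raises the recall by one level. The sketch below follows that convention as an assumption; the exact definition used in the experiments is the one given in [2], and the function name `average_precision` is hypothetical.

```python
def average_precision(ranking, relevant):
    """Mean of the precision values measured at the rank of each relevant element.

    ranking  -- all retrieved elements ordered by increasing distance to the query
    relevant -- elements that should be retrieved
    """
    relevant = set(relevant)
    hits = 0
    precisions = []
    for rank, element in enumerate(ranking, start=1):
        if element in relevant:
            hits += 1
            # precision of the answer set formed by the first `rank` elements
            precisions.append(hits / rank)
    # Relevant elements never retrieved contribute zero precision.
    return sum(precisions) / len(relevant)


# Toy ranking: relevant elements appear at ranks 1, 3 and 5.
ap = average_precision(["a", "x", "c", "y", "e"], {"a", "c", "e"})
print(round(ap, 4))
```

In this toy ranking the precision values at the three recall levels are 1/1, 2/3 and 3/5, so the summary is their mean.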

To generate the precision versus recall graphs, we employed every element of the dataset as the query center of a k-NN query that retrieves the full dataset ordered by similarity to the query center. Therefore, each P×R curve, and consequently each average precision value, was obtained by averaging 704 queries for the datasets of the MRI 704 database and 3257 queries for the datasets of the CT Lung ROIs database.

4.3. Results and Discussion

To evaluate the proposed method we initially tested various metrics over each dataset, in order to identify the metric that achieves the highest individual precision for each feature extractor. Thereafter, we evaluated the retrieval quality achieved by several weight combinations between the features employed and by the scaling provided by the FPM method. This section presents the results obtained, first with the datasets of the MRI 704 database and afterwards with the datasets of the CT Lung ROIs database.

Concerning the MRI 704 database, to represent the color features of the images we first tested the precision of the standard gray-scale histogram extractor using various metrics. Then, we generated metric histograms from the gray-scale histograms and compared this descriptor, using the Metric Histogram Distance (MHD) [15], to the best metric for the gray-scale histogram. The combination of metric histogram and MHD achieved the best precision, so it was selected. For the datasets Haralick and Zernike of the MRI 704 database the chosen metrics were respectively the Canberra and the Euclidean distances.

Thereafter, we calculated the correlation fractal dimensions (D_2) for the feature datasets. This procedure resulted in the D_2 values of 2.1, 2.4 and 4.4 respectively for the datasets Metric Histogram, Haralick and Zernike of the database MRI 704. The distance plots for these datasets are shown in Figure 2.

Next, we calculated the average precisions obtained by combining pairs of descriptors, varying their weights. Figure 3 shows the average precisions for each descriptor of the database MRI 704 individually, denoted by the horizontal lines. The average precisions for the Metric Histogram, Haralick and Zernike datasets were respectively 69.2%, 68% and 80.7%. This figure also presents the plot of the average precisions obtained by varying the ratio between the weights w_i for each pairwise combination. Notice that, for most weights, the combinations achieved better average precision than using only one descriptor. The vertical line in Figure 3 represents the proposed scale factor ratio. As can be seen, for the three combinations, our technique achieved a near-optimal scaling between the descriptors.

Finally, we evaluated the effectiveness of the FPM method to combine the three descriptors of the MRI 704 database. The scale factors were set to the measured D_2 values: w_Metric Hist. = 2.1, w_Haralick = 2.4 and w_Zernike = 4.4. The FPM outperformed every pairwise combination, achieving 89.2% of average precision, which improves the average precision by 29%, 31% and 11% respectively over the individual descriptors Metric Histogram, Haralick and Zernike.
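The reported gains appear to be relative improvements, i.e. the difference between the FPM average precision and the individual one, divided by the individual one. This reading is an assumption, but it reproduces the reported percentages from the average precisions stated in the text, as the short Python check below shows (the helper name `relative_improvement` is hypothetical).

```python
def relative_improvement(combined, individual):
    """Relative gain, in percent, of the combined precision over an individual one."""
    return 100.0 * (combined - individual) / individual


# Average precisions reported for the MRI 704 database (in percent).
fpm = 89.2
individual = {"Metric Histogram": 69.2, "Haralick": 68.0, "Zernike": 80.7}

gains = {name: relative_improvement(fpm, value) for name, value in individual.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
```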

The average precision gives only a concise view of the effectiveness of a retrieval method. Therefore, Figure 4 shows the complete P×R curve of the FPM against the curves obtained using each descriptor individually. It can be seen that the FPM achieved better results at all recall levels, at some of them outperforming the precision of the individual descriptors by more than 100%.

With respect to the CT Lung ROIs database, we also performed tests to identify the best metric for each descriptor. The most accurate results were obtained with the Canberra distance for both datasets Haralick and Zernike of the CT Lung ROIs database.

We calculated the correlation fractal dimension of the datasets of the CT Lung ROIs database using the linear Box Counting method. Figure 5a-1 and Figure 5a-2 show the plots generated for the datasets Haralick and Zernike of the CT Lung ROIs database, yielding respectively the values of 3.9 and 5.1 for D_2 regarding these datasets.

Figure 4. Comparing the precision of the single descriptors of the MRI 704 database and the combination given by the FPM method.

Thereafter, we calculated the average precisions of the individual descriptors of the CT Lung ROIs database, as well as of several weight combinations between them (Figure 5b). The average precision achieved by the Haralick dataset of this database was 36.7% and the average precision of the Zernike dataset was 29.2%. Concerning the pairwise combination of the descriptors, the experiments revealed an interesting behavior in the weight proportions. As can be seen in Figure 5b, the highest precision was achieved by setting a higher weight for the Zernike descriptor, although the Haralick descriptor had achieved a better individual precision. Nevertheless, FPM defined an adequate weight ratio and again achieved a near-optimal precision, setting w_Haralick = 3.9 and w_Zernike = 5.1 as given by Equation 3, indicating that it correctly identifies the contribution of the descriptors in the similarity assessment.

Figure 6 shows the P×R graph comparing the retrieval quality of employing each individual descriptor of the CT Lung ROIs database with the results obtained with the proposed method. The FPM method achieved 39.2% of average precision and outperformed the individual descriptors at almost all recall levels, at some of them improving the precision by up to 64% over using only the Zernike descriptor and by up to 13% over using only the Haralick descriptor.

Finally, Figure 7 shows the software developed for theevaluation, illustrating a 5-NN example query over theMRI 704 database, comparing the results obtained using alldescriptors combined and scaled with the FPM to the resultsobtained using each descriptor individually. The imageswith a cross mark are false positives returned by each op-tion. It is worth mentioning that these experiments targeted


Figure 5. Experiments over the CT Lung ROIs image database. a-1) Plot of the D2 calculation for the Haralick dataset. a-2) Plot of the D2 calculation for the Zernike dataset. b) Comparing the average precisions of each single descriptor and of pairwise weight combinations.

Figure 6. P×R graph of each single descriptor of the CT Lung ROIs database and of the FPM method combining them.

a general-purpose application, aimed at evaluating the scaling between multiple image descriptors. However, the FPM method can be employed in any medical CBIR application, using domain-specific feature extractors and metrics.

5. Conclusions and Future Work

In this paper we proposed the Fractal-scaled Product Metric, an unsupervised method to define the scale factors of a multi-descriptor product metric for image similarity evaluation, based on the correlation fractal dimension of the descriptors. The composite distance function can handle both dimensional and adimensional image descriptors, integrates any number of descriptors, and allows using scalable indexing structures. We also showed experiments that confirm the effectiveness of the approach. The proposed method determined near-optimal scale factors for the descriptors involved and improved the precision of the results in all the experiments. It improved the average precision by up to 31% over the individual descriptors.

Future work includes testing other product metrics and combining the proposed method with dimensionality reduction algorithms for individual descriptors.

Acknowledgments

This work has been supported by FAPESP (Sao Paulo State Research Foundation), CNPq (National Council for Scientific and Technological Development) and CAPES (Brazilian Federal Funding Agency for Graduate Education Improvement). The authors also thank Marcelo Ponciano da Silva (ICMC-USP, Brazil) for kindly providing the CT Lung ROIs image database.

References

[1] Y. A. Aslandogan and C. T. Yu. Techniques and systems for image and video retrieval. IEEE Trans. on Knowl. and Data Eng., 11(1):56–63, 1999.

[2] R. A. Baeza-Yates and B. A. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, Wokingham, UK, 1999.

[3] R. Bueno, D. S. Kaster, A. J. M. Traina, and C. Traina Jr. Time-aware similarity search: a metric-temporal representation for complex data. In SSTD, volume 5644 of LNCS, pages 302–319, Aalborg, Denmark, 2009. Springer.

[4] P. H. Bugatti, A. J. M. Traina, and C. Traina Jr. Assessing the best integration between distance-function and image-feature to answer similarity queries. In SAC, pages 1225–1230, Fortaleza, CE, Brazil, 2008. ACM.

[5] B. Bustos, D. Keim, D. Saupe, T. Schreck, and D. Vranic. Automatic selection and combination of descriptors for effective 3d similarity search. In Multimedia Software Engineering, pages 514–521, Miami, FL, USA, 2004. IEEE.


Figure 7. Employing the FPM method and each single descriptor in an example query over the MRI 704 database.

[6] J. C. Caicedo, F. A. Gonzalez, E. Triana, and E. Romero. Design of a medical image database with content-based retrieval capabilities. In Advances in Image and Video Technology, pages 919–931, Santiago, Chile, 2007. Springer.

[7] R. Datta, D. Joshi, J. Li, and J. Z. Wang. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput. Surv., 40(2):1–60, 2008.

[8] T. M. Deserno, S. Antani, and R. Long. Ontology of gaps in content-based image retrieval. Journal of Digital Imaging, 2(22):1–14, 2009. Epub 2008 Feb 1.

[9] C. Faloutsos and I. Kamel. Beyond uniformity and independence: Analysis of R-trees using the concept of fractal dimension. In PODS, pages 4–13. ACM, 1994.

[10] C. D. Ferreira, R. da Silva Torres, M. A. Goncalves, and W. Fan. Image retrieval with relevance feedback based on genetic programming. In SBBD, pages 120–134, 2008.

[11] D. Heesch and S. Ruger. Combining features for content-based sketch retrieval – a comparative evaluation of retrieval performance. In Proc. 24th BCS-IRSG European Colloquium on IR Research, pages 41–52. Springer, 2002.

[12] G. R. Hjaltason and H. Samet. Index-driven similarity search in metric spaces. ACM TODS, 21(4):517–580, 2003.

[13] H. Muller, N. Michoux, D. Bandon, and A. Geissbuhler. A review of content-based image retrieval systems in medical applications - clinical benefits and future directions. Int. Journal of Medical Informatics, 73(1):1–23, 2004.

[14] R. O. Stehling, M. A. Nascimento, and A. X. Falcao. MiCRoM: A metric distance to compare segmented images. In VISUAL, pages 12–23, Hsin Chu, Taiwan, 2002. Springer.

[15] A. J. M. Traina, C. Traina Jr., J. M. Bueno, F. J. T. Chino, and P. M. d. A. Marques. Efficient content-based image retrieval through metric histograms. World Wide Web Journal, 6(2):157–185, 2003.

[16] C. Traina Jr., A. J. M. Traina, and C. Faloutsos. Distance exponent: a new concept for selectivity estimation in metric trees. In ICDE, page 195, San Diego, CA, USA, 2000. IEEE.

[17] C. Traina Jr., A. J. M. Traina, L. Wu, and C. Faloutsos. Fast feature selection using fractal dimension. In SBBD, pages 158–171, Joao Pessoa, PB, Brazil, 2000.