[IEEE 2006 Annual Meeting of the North American Fuzzy Information Processing Society - Montreal, QC, Canada (2006.06.3-2006.06.6)] NAFIPS 2006 - 2006 Annual Meeting of the North American Fuzzy Information Processing Society

1-4244-0363-4/06/$20.00 ©2006 IEEE

Extension of the Concept Representation in Conceptual Spaces

Mircea Ionescu and Anca Ralescu
Department of ECECS, University of Cincinnati, Cincinnati, OH 45221-0030, USA
ionescmm@ececs.uc.edu, Anca.Ralescu@uc.edu

keywords: conceptual spaces, co-occurrence matrix, fuzzy Hamming distance

I. ABSTRACT

A compact matrix representation for concepts and instances, based on co-occurrence of properties, is considered. The study builds on ideas from the theory of conceptual spaces and on current work of the authors on content-based image retrieval systems and image similarity measures. Initial experimental results are used to support the proposed representation.

II. INTRODUCTION

Introduced in [1] as a geometry of thought, conceptual spaces provide, on a general level, a "framework for cognitive representation", a bridge between symbolic and connectionist approaches. On a more specific level, this framework can be used as an "empirically testable theory" or "constructive models". This aspect of the conceptual spaces framework is very important to the applications of interest to us, namely representing image contents in a way which is both compatible with human understanding and amenable to computation. The initial motivation for adopting this framework is our previous work on content-based image retrieval systems [2], [3], [4], [5]. The usual approach in such systems is to model the image content and then apply image similarity measures in order to identify like images. The image content is modeled in terms of attributes or properties of the image, usually the output of some image processing operators (such as segmentation, color, or texture operators). Occasionally, high level descriptions extracted by a human user are also used. In general, the similarity between images is obtained within each attribute and the results are aggregated into an overall similarity score. The aggregation operator depends on the specific framework in which the problem is solved and usually does not make any assumptions on the interaction between attributes.

Conceptual spaces theory makes possible a much more sophisticated form of capturing and representing knowledge about the image content, one that considers attribute co-occurrence in a systematic way.

The basic notions for conceptual spaces are those of domain, dimension and property. Loosely defined, domains correspond to groups of related concepts. For example, spatial concepts belong to one domain, visual concepts to a different domain, etc. Domains are built up from quality dimensions. For example, the spatial domain is built up from spatial dimensions such as height, width, depth; the visual domain is built up from visual dimensions such as color, texture, etc. Quality dimensions are used to represent concept qualities. They make possible similarity judgements between two stimuli (objects), such as less red, for color, or louder, for sound. The dimensions are grouped into domains. Similarity between two concepts, in fact points in the conceptual space, can be assessed only within the same domain, using an appropriate metric.

Dimensions provide the framework to define and assign properties to an object, as well as to specify relations between such properties. A (natural) property is defined as a convex region of a domain in a conceptual space.

To illustrate the above notions consider the example of the concept/object apple in the visual domain made up of the dimensions color and texture. This domain and these dimensions are of interest for the image representation problem. In another problem, for instance that of cultivation of apples, the conceptual space may have additional domains, such as taste or spatial (with dimensions diameter, volume, etc).

A concept is an aggregation of object descriptions in different domains. This aggregation is realized through a measure of co-occurrence of properties, across multiple domains. This co-occurrence makes the representation framework offered by conceptual spaces more powerful than those based on assessment of individual properties.

First, assume that for an observation o on a dimension d, its similarity to the properties of this dimension is calculated. For a concept/object such similarities are computed along all properties (across domains, for all dimensions). A convenient way of representing the result is a connection matrix, C, with matrix elements $C_{jk}$ conveying the co-occurrence of properties $P_j$ and $P_k$. The convention made in [6] restricts the values that $C_{jk}$ can take as follows. For two properties, $P_j$ and $P_k$,


$$
C_{jk} \begin{cases}
= 0 & \text{if } P_j \neq P_k \text{ are in the same domain} \\
= 1 & \text{if } P_j = P_k \\
\in [0,1] & \text{if } P_j \text{ and } P_k \text{ are in different domains}
\end{cases} \qquad (1)
$$

According to equation (1) the matrix C records the co-occurrence of different properties only when they are from different domains. Moreover, C is symmetric. The exact value of $C_{jk}$, when $P_j$ and $P_k$ are in different domains, is computed from the similarity between the observation instance and the property along a given dimension (e.g. the similarity between the color of an object and the property red, in the color dimension). The nice thing about the representation (1) is that it can be used both to describe an instance and to capture a collection of instances (the training set for defining a concept).

To describe an instance, if $s_j$ and $s_k$ denote the similarities along $P_j$ and $P_k$, then, as in [6],

f MaxiCDPJ {si} if j~k
C =iMaxfieDpklD mTn{nsi, si Dp, 7 Dpjjk V PklDp otherwise    (2)

To illustrate this consider again the apple example (as in [6]), in the visual space, with domains:
• color and its properties, red, green, yellow, and
• texture and its properties, smooth and wrinkled.

Consider now an instance of apple for which the similarities for each property are as follows:

property    red    green    yellow    smooth    wrinkled
instance    0.9    0.2      0.1       0.8       0

Then the co-occurrence matrix is given by

$$
\begin{array}{c|ccccc}
  & r & g & y & s & w \\ \hline
r & 0.9 & 0 & 0 & 0.8 & 0 \\
g & 0 & 0.9 & 0 & 0.8 & 0 \\
y & 0 & 0 & 0.9 & 0.8 & 0 \\
s & 0.8 & 0.8 & 0.8 & 0.8 & 0 \\
w & 0 & 0 & 0 & 0 & 0.8
\end{array} \qquad (3)
$$

B. The concept connection matrix

The concept connection matrix is obtained from the training set of instances, $i = 1, \ldots, n$. For each instance i and property $P_j$ the similarity $z_{ij}$ is computed. For N properties, and for each i, one obtains the vector $[z_{i1}, \ldots, z_{iN}]$. The entry $C_{jk}$ is then defined as [6]:

$$
C_{jk} = \frac{\sum_i \min(z_{ij}, z_{ik})}{\sum_i z_{ij}} \qquad (4)
$$

What is the meaning of (4)? One possible meaning is as follows: the quantity $\min(z_{ij}, z_{ik})$ evaluates how strongly properties $P_j$ and $P_k$ co-occur in i. Similarly, the quantity $z_{ij}$ shows how strongly the property $P_j$ holds in instance i. In the framework of fuzzy sets these can be easily expressed as $\mu_{jk}(i)$ and $\mu_j(i)$. The summations in (4) can then be interpreted as the respective cardinalities ($\Sigma$-count) of these fuzzy sets, which leads to the interpretation of (4), $C_{jk}$, as the conditional probability of property $P_k$ given property $P_j$, given the training set of concept instances.

In light of this interpretation, the condition that $C_{jk} = 0$ if $j \neq k$ and $P_j$, $P_k$ are from the same dimension means that, for each dimension, the properties form a partition, and no two properties in the same dimension can co-occur. For the apple example this means that an apple cannot be at the same time red and yellow, even though, as seen in the small example above, for a particular instance all the properties for the color dimension can be detected with degrees different from zero.

In effect, $C_{jk}$ conveys the conditional probability, in the concept, of the property $P_k$ given the property $P_j$. In [6] the numerator and denominator of (5) are expressed as $\sum_{i=1}^{n} \min(q_{ij}, q_{ik})$ and $\sum_i q_{ij}$ respectively; the former is the $\Sigma$-count of the fuzzy set with elements $\min(q_{ij}, q_{ik})$, where j, k are fixed and $i = 1, \ldots, n$.

IV. CONCEPT REPRESENTATION EXTENSION

The approach presented in this paper drops the a priori requirement that $C_{jk} = 0$ when $j \neq k$ and $P_j$, $P_k$ are properties of the same dimension. Instead it starts by defining the entry $C_{jk}$ as the conditional probability of $P_k$ given $P_j$, as it is estimated from the training sets. That is,

$$
C_{jk} = \frac{\#\ \text{of instances with properties } P_j \text{ and } P_k}{\text{total } \#\ \text{of instances with property } P_j} \qquad (5)
$$

Any one of the formulae developed for the cardinality of a fuzzy set can be used to calculate the quantities on the right hand side of (5). For the experiments carried out in this study the formula in [7] is used. In the new approach $C_{jk}$ may still be 0, not by definition, but because this is how it results from the data.

Related to the new definition of $C_{jk}$ is an assumption on the properties for a given domain: it is assumed that, for a given dimension, all objects using that dimension will be described in terms of the same set of properties. To understand this assumption, consider the color dimension and two concepts, such as apples and planes. Rather than design color properties specific to these two concepts, a standard set of properties (e.g. RGB) will be used for both. Any color of an object (e.g. brown) is described by a set of values and, even more importantly, by a certain combination of these values in RGB. Such combinations will be captured by $C_{jk}$, the conditional probability of $P_k$ given $P_j$. Cases when $C_{jk} = 0$ correspond to data sets for a concept in which $P_j$ and $P_k$ did not (as different from could not) occur together.

A. Computing instance-concept similarity - the fuzzy Hamming distance

It is clear at this point that the co-occurrence matrix is a very useful device to capture both an instance and a concept (set of instances). Moreover, the representation of a concept requires


no additional storage regardless of the number of instances considered. It follows that to calculate the similarity between two instances, an instance and a concept, or two concepts, the same operation (comparison of two co-occurrence matrices) is required.

By "unwinding" it, each co-occurrence matrix represents a vector in a high dimensional space, with components in [0,1], and the similarity problem reduces to that of the similarity between two such vectors. Based on our previous work [8], [4], [3], the fuzzy Hamming distance is used here to evaluate image similarity. The exact definition of the fuzzy Hamming distance (FHD), its properties, and its use in image similarity and content-based image retrieval can be found in [2], [4], [3]. Here FHD is used as a similarity measure in clustering the concept representation of each concept example, in our case an image.

FHD is a generalization of the Hamming distance over the set of real-valued vectors. It preserves the original meaning of the Hamming distance as the number of different components between the input vectors, with the added features that it uses real-valued vectors and that it takes into account the amount of the difference between each pair of components. FHD is the (fuzzy) number of different components of the input vectors. It shows the degree to which the input vectors are different by exactly 0, 1, ..., n components, where n is the size of the vectors. In short, the fuzzy Hamming distance is the fuzzy cardinality of the difference fuzzy set. The following defines the fuzzy Hamming distance (which is described in detail in [8] and [4]):

Definition 4.1: (The Fuzzy Hamming Distance) [8] Given two n dimensional real-valued vectors, x and y, for which the difference fuzzy set $D_\alpha(x, y)$ has membership function $\mu_{D_\alpha(x,y)}(i) = 1 - e^{-\alpha (x_i - y_i)^2}$, the fuzzy Hamming distance between x and y, denoted by $FHD_\alpha(x, y)$, is the fuzzy cardinality of the difference fuzzy set $D_\alpha(x, y)$. Here $\mu_{FHD(x,y)}(\cdot\,; \alpha): \{0, \ldots, n\} \to [0, 1]$ denotes the membership function for $FHD_\alpha(x, y)$ corresponding to the parameter $\alpha$. More precisely,

$$
\mu_{FHD(x,y)}(k; \alpha) = \mathrm{Card}_{D_\alpha(x,y)}(k) \qquad (6)
$$

for $k \in \{0, \ldots, n\}$, where $n = |\mathrm{Support}\, D_\alpha(x, y)|$.

The modulation constant $\alpha$ can be used to further introduce context dependency along each component.

V. EXPERIMENTS AND RESULTS

For our experiments a set of 45 images in two categories is selected. The first category consists of 30 images representing apples and the second one has 15 images representing oranges. Two domains were used to represent the images:
• Dominant Color in RGB space - 3 dimensions and 16 properties;
• Texture: MPEG-7 homogeneous texture descriptor - 62 dimensions and 10 properties.

In total each image is represented by 65 dimensions with values in [0,1]. This way the original concept representation [6] has size 26 x 26 while the extended representation has a size of 65 x 65. The properties for each domain were selected by clustering over all the images, down to 16 properties for color and 10 for texture. In order to assess the performance of the new concept representation two experiments were performed:
• Experiment 1: clustering is performed using the concept matrix on both representations, and a classifier that uses the two clusters is created;
• Experiment 2: the concept representation is computed for each class and then a minimum distance classifier is used to classify the entire data set.

Analysis of classifier results is based on the confusion matrix for each classifier, shown in Table I. In both classifiers apples represent the positive class and oranges represent the negative class.

                       Predicted
                       Negative    Positive
Actual    Negative     a           b
Actual    Positive     c           d

TABLE I
CONFUSION MATRIX

The following measures are computed from the confusion matrix:
• Accuracy: $AC = \frac{a+d}{a+b+c+d}$
• Recall or true positive rate: $TP = \frac{d}{c+d}$
• False positive rate: $FP = \frac{b}{a+b}$
• True negative rate: $TN = \frac{a}{a+b}$
• False negative rate: $FN = \frac{c}{c+d}$
• Positive Precision: $P = \frac{d}{b+d}$
• Negative Precision: $P_N = \frac{a}{a+c}$
• F measure: $F = \frac{1}{\lambda \frac{1}{P} + (1-\lambda) \frac{1}{TP}} = \frac{(\beta^2+1) \cdot P \cdot TP}{\beta^2 \cdot P + TP}$, where $\lambda = \frac{1}{\beta^2+1} \in [0, 1]$ and $\beta \in [0, \infty)$ determine the importance of precision versus recall.

Table II, Figure 1 and Figure 2 show the results of the first experiment. Figures 3 and 4 show Receiver Operating Characteristic (ROC) points for the apples class and the oranges class. The perfect classifier would exhibit a point at the (0, 1) coordinate. It can be observed that the extended concept representation produces increased values for precision on both classes, while the original representation yields a higher recall (TP) for the apples class but a lower recall for the oranges class. Considering both classes, the extended representation provides a better accuracy.

Table III, Figure 6 and Figure 7 show the results of the minimum distance classifier. It can be observed that the classes are slightly different but the confusion matrix is the same.

VI. CONCLUSIONS

The co-occurrence matrix for instance and concept representation introduced in [6] has been extended and formally defined as the matrix of conditional probabilities of properties. Initial experimental results of using this approach are promising: for the data set used, the approach produced better accuracy and


TABLE II
CONFUSION MATRIX MEASURES

Concept           Accuracy   Precision   Precision   Recall (TP)   False      Recall (TN)   False
Representation               Apples      Oranges     Apples        Positive   Oranges       Negative
Original          0.64       0.7         0.45        0.8           0.66       0.33          0.67
Extended          0.69       0.83        0.52        0.67          0.26       0.73          0.33

TABLE III
CONFUSION MATRIX MEASURES FOR A MINIMUM DISTANCE CLASSIFIER

Concept           Accuracy   Precision   Precision   Recall (TP)   False      Recall (TN)   False
Representation               Apples      Oranges     Apples        Positive   Oranges       Negative
Original          0.71       0.84        0.55        0.7           0.26       0.73          0.30
Extended          0.71       0.84        0.55        0.7           0.26       0.73          0.30
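The measures reported in Tables II and III are simple ratios of the four confusion-matrix counts, with a = actual negative predicted negative, b = actual negative predicted positive, c = actual positive predicted negative, d = actual positive predicted positive. The following is a minimal Python sketch of those ratios and of the F measure in its $\lambda$ form. The names `Confusion`, `measures` and `f_measure` are hypothetical, and the counts a = 11, b = 4, c = 9, d = 21 are not stated in the paper: they are back-computed here from the Table III rates and the class sizes (30 apples, 15 oranges), so they should be read as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    a: int  # actual negative, predicted negative
    b: int  # actual negative, predicted positive
    c: int  # actual positive, predicted negative
    d: int  # actual positive, predicted positive


def measures(m: Confusion) -> dict:
    """Ratios computed from the confusion matrix (apples = positive class)."""
    total = m.a + m.b + m.c + m.d
    return {
        "accuracy": (m.a + m.d) / total,
        "recall_pos": m.d / (m.c + m.d),      # true positive rate (TP)
        "false_pos": m.b / (m.a + m.b),       # false positive rate (FP)
        "recall_neg": m.a / (m.a + m.b),      # true negative rate (TN)
        "false_neg": m.c / (m.c + m.d),       # false negative rate (FN)
        "precision_pos": m.d / (m.b + m.d),   # positive precision
        "precision_neg": m.a / (m.a + m.c),   # negative precision
    }


def f_measure(precision: float, recall: float, lam: float) -> float:
    """F = 1 / (lam / P + (1 - lam) / TP), with lam in [0, 1]."""
    return 1.0 / (lam / precision + (1.0 - lam) / recall)


# Assumed counts, back-computed from Table III (not given in the paper).
table3 = measures(Confusion(a=11, b=4, c=9, d=21))
for name, value in table3.items():
    print(f"{name}: {value:.2f}")
```

With $\lambda = 1$ the F measure reduces to the precision, with $\lambda = 0$ to the recall, and with $\lambda = 0.5$ to their harmonic mean, which is the range swept along the Lambda axis of Fig. 5.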

Fig. 1. Clustering using original concept representation

precision. More importantly, adopting a uniform representation, in which each dimension has the same properties across domains, allows comparison of concepts from different domains.
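The matrix of conditional probabilities summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the $\Sigma$-count form of equation (4), that is $C_{jk} = \sum_i \min(z_{ij}, z_{ik}) / \sum_i z_{ij}$, whereas the experiments in the paper use the cardinality formula of [7]. The function name `concept_matrix`, the `domains` argument and the toy similarity values are hypothetical. Passing `domains` reproduces the original convention (zero for distinct properties of the same domain); omitting it gives the extended representation.

```python
import numpy as np


def concept_matrix(z, domains=None):
    """Estimate C[j, k] as the sigma-count ratio of equation (4).

    z       : (n_instances, n_properties) similarities in [0, 1]
    domains : optional domain label per property; when given, entries for
              distinct properties of the same domain are forced to 0
              (original convention); when None, nothing is forced
              (extended representation).
    """
    z = np.asarray(z, dtype=float)
    # co[j, k] = sum over instances of min(z_ij, z_ik)
    co = np.minimum(z[:, :, None], z[:, None, :]).sum(axis=0)
    mass = z.sum(axis=0)[:, None]  # sum over instances of z_ij, as a column
    # Rows whose property never occurs (mass 0) are left at 0.
    c = np.divide(co, mass, out=np.zeros_like(co), where=mass > 0)
    if domains is not None:
        d = np.asarray(domains)
        same_domain = d[:, None] == d[None, :]
        off_diagonal = ~np.eye(d.size, dtype=bool)
        c[same_domain & off_diagonal] = 0.0
    return c


# Toy training set: two instances, properties red, green, yellow, smooth, wrinkled.
z = [[0.9, 0.2, 0.1, 0.8, 0.0],
     [0.7, 0.4, 0.0, 0.6, 0.2]]
extended = concept_matrix(z)
original = concept_matrix(z, domains=["color"] * 3 + ["texture"] * 2)
print(np.round(extended, 3))
print(np.round(original, 3))
```

In this sketch the diagonal is 1 whenever the property occurs at all, and the matrix is in general not symmetric, since each row is normalized by the mass of its own property.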

In the experiments presented in this study the number of properties in each domain is small. This small number of properties was possible because the data set was small. For a real world application the number of properties in each domain can be expected to be much larger than the number of dimensions. In general the new representation shrinks the size of the concept matrix not only by removing the sparseness of the matrix but by reducing the matrix size, using the domain's dimensions. Further studies will include large data sets and more domains.

ACKNOWLEDGMENTS

This work was partially supported by Grant N000140310706 from the Department of the Navy, the Rindsberg Graduate Fellowship of the College of Engineering, University of Cincinnati, and from a grant from Lockheed Martin Corporation.

REFERENCES

[1] P. Gärdenfors, Conceptual Spaces: The Geometry of Thought. Cambridge: MIT Press, 2000.
[2] M. M. Ionescu and A. L. Ralescu, "The impact of image partition granularity using fuzzy hamming distance as image similarity measure," in MAICS, April 2004, pp. 111-118.
[3] M. M. Ionescu and A. L. Ralescu, "Fuzzy hamming distance as image similarity measure," in IPMU, July 2004, pp. 1517-1523.
[4] M. M. Ionescu and A. L. Ralescu, "Fuzzy hamming distance in a content-based image retrieval system," in FUZZ-IEEE, July 2004.
[5] M. M. Ionescu and A. L. Ralescu, "Implementation of concepts space with applications to image retrieval," accepted to IPMU, July 2006.
[6] J. T. Rickard, "A concept geometry for conceptual spaces with applications to levels 2 & 3 fusion," LM Technical Report, 2005.
[7] A. L. Ralescu, "A note on rule representation on expert systems," Information Science, vol. 38, pp. 193-203, 1986.
[8] A. L. Ralescu, "Generalization of the hamming distance using fuzzy sets," Research Report, JSPS Senior Research Fellowship, Laboratory for Mathematical Neuroscience, The Brain Science Institute, RIKEN, Japan, May-June 2003.

Fig. 2. Clustering using the extended concept representation and Fuzzy ...

Fig. 3. True Positives vs False Positives (horizontal axis: False Positive; series: Original Concept Representation, Extended Concept Representation)

Fig. 4. True Negatives vs False Negatives (horizontal axis: False Negative; series: Original Concept Representation, Extended Concept Representation)

Fig. 5. F-Measure results for apples and oranges classes using original and extended concept representations (horizontal axis: Lambda; vertical axis: F; series: Original (Apples), Extended (Apples), Original (Oranges), Extended (Oranges))

Fig. 6. The classes obtained by a minimum distance classifier using extended concept representation

Fig. 7. The classes obtained by a minimum distance classifier using the original concept representation
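The minimum distance classification behind Figs. 6 and 7 compares the unwound co-occurrence matrix of an instance with that of each concept using the fuzzy Hamming distance. The Python sketch below illustrates one way this could look; it is not the authors' implementation. It assumes the difference membership $1 - e^{-\alpha (x_i - y_i)^2}$ and a fuzzy cardinality of the form $\mathrm{Card}(k) = \min(\mu_{(k)}, 1 - \mu_{(k+1)})$, where $\mu_{(1)} \geq \ldots \geq \mu_{(n)}$ are the sorted memberships, $\mu_{(0)} = 1$ and $\mu_{(n+1)} = 0$. Because FHD is a fuzzy set over $\{0, \ldots, n\}$, a crisp value is needed to rank concepts; the centroid used here is our own choice, and all function names and toy matrices are hypothetical.

```python
import numpy as np


def fuzzy_hamming(x, y, alpha=1.0):
    """Membership of FHD(x, y) at k = 0..n (degree of 'exactly k differences')."""
    x = np.asarray(x, dtype=float).ravel()  # ravel() "unwinds" a matrix
    y = np.asarray(y, dtype=float).ravel()
    mu = 1.0 - np.exp(-alpha * (x - y) ** 2)        # difference fuzzy set
    ordered = np.sort(mu)[::-1]                     # mu_(1) >= ... >= mu_(n)
    padded = np.concatenate(([1.0], ordered, [0.0]))  # mu_(0) = 1, mu_(n+1) = 0
    # Card(k) = min(mu_(k), 1 - mu_(k+1)) for k = 0..n
    return np.minimum(padded[:-1], 1.0 - padded[1:])


def fhd_centroid(x, y, alpha=1.0):
    """Crisp summary of FHD: membership-weighted mean of k (assumed choice)."""
    m = fuzzy_hamming(x, y, alpha)
    k = np.arange(m.size)
    return float((k * m).sum() / m.sum())


def nearest_concept(instance, concepts, alpha=1.0):
    """Label of the concept matrix closest to the instance matrix."""
    return min(concepts, key=lambda name: fhd_centroid(instance, concepts[name], alpha))


# Toy 2 x 2 "concept matrices" standing in for the real co-occurrence matrices.
concepts = {"apples": np.eye(2), "oranges": np.zeros((2, 2))}
instance = 0.9 * np.eye(2)
print(nearest_concept(instance, concepts, alpha=5.0))
```

For identical vectors the sketch returns full membership at k = 0 and zero elsewhere, and larger values of `alpha` make small component differences count more strongly as differences.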