Knowledge Management Institute

707.009 Foundations of Knowledge Management
"Categorization & Formal Concept Analysis"

Markus Strohmaier
Univ. Ass. / Assistant Professor
Knowledge Management Institute
Graz University of Technology, Austria

e-mail: [email protected]
web: http://www.kmi.tugraz.at/staff/markus

Markus Strohmaier 2009


Slides in part based on

• Gerd Stumme
  – Course at Otto-von-Guericke Universität Magdeburg, Summer Term 2003
  – ECML PKDD Tutorial
• Rudolf Wille
  – "Formal Concept Analysis as Mathematical Theory of Concepts and Concept Hierarchies", in Formal Concept Analysis, eds. B. Ganter et al., LNAI 3626, pp. 1-33 (2005)

Further Literature:
http://www.aifb.uni-karlsruhe.de/WBS/gst/FBA03/chapter1_2.pdf (Ganter / Stumme)

Overview

Today's Agenda:

Categorization & Formal Concept Analysis

• Formal Context
• Formal Concepts
• Formal Concept Lattices
• FCA Implications
• Constructing Concept Lattices

Categorization [Mervis Rosch 1981]

Intension (Meaning)
• The specification of those qualities that a thing must have to be a member of the class

Extension (the objects in the class)
• Things that have those qualities

Categorization [Mervis Rosch 1981]

Six salient problems:
• Arbitrariness of categories. Are there any a priori reasons for dividing objects into categories, or is this division initially arbitrary?
• Equivalence of category members. Are all category members equally representative of the category, as has often been assumed?
• Determinacy of category membership and representation. Are categories specified by necessary and sufficient conditions for membership? Are boundaries of categories well defined?
• The nature of abstraction. How much abstraction is required, that is, do we need only memory for individual exemplars to account for categorization? Or, at the other extreme, are higher-order abstractions of general knowledge, beyond the individual categories, necessary?
• Decomposability of categories into elements. Does a reasonable explanation of objects consist in their decomposition into elementary qualities?
• The nature of attributes. What are the characteristics of these "attributes" into which categories are to be decomposed?

Terminology

ISO 704: Terminology Work: Principles and Methods
DIN 2330: Begriffe und ihre Benennungen (Concepts and their designations)

[Figure: three levels of terminology work]
• Representation level: Name, Definition
• Concept level: Concept (attribute a, attribute b, attribute c)
• Object level: Object 1, Object 2, Object 3 (each with property A, property B, property C)

Formal Concept Analysis [Wille 2005]

Models concepts as units of thought. A concept is constituted by its:
• Extension: consists of all objects belonging to a concept
• Intension: consists of all attributes common to all those objects

Concepts "live" in relationships with many other concepts, where the subconcept-superconcept relation plays a prominent role.

Formal Concept Analysis [Wille 2005]

Formal context:
A formal context is a triple (G, M, I), where G and M are sets and I is a binary relation between G and M.

Formal concept:
A formal concept of a formal context K := (G, M, I) is defined as a pair (A, B) with $A \subseteq G$, $B \subseteq M$, $A = B'$, and $B = A'$; A and B are called the extent and the intent of the formal concept (A, B), respectively.

Formal Concept Analysis

Running Example:

[Figure: cross table of fruits and their attributes. Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/.., Texture: Smooth/Bumpy] [Mervis Rosch 1981]

"I would only like to point out that there was a time when the similarity of sensations was made the basis for categorizing plants and animals. Think [...] of the early taxonomies of Ulisse Aldrovandi from the 16th century, who put the hideous animals (the spiders, newts and snakes) and the beauties (the leopards, the eagles, etc.) into groups [of living beings] of their own."

Heinz von Foerster, Wahrheit ist die Erfindung eines Lügners, pp. 22/23

Formal Concept Analysis

Def.: A formal context is a triple (G, M, I), where
• G is a set of objects,
• M is a set of attributes,
• and I is a relation between G and M.

$(g, m) \in I$ is read as "object g has attribute m".

[Figure: running example cross table. Taste: Sweet/Sour, Shape: Round/Long/, Color: Red/Yellow/.., Texture: Smooth/Bumpy]
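The definition above can be written down directly as three Python sets. This is only a sketch: the names `ROWS`, `G`, `M`, `I` and `has_attribute` are illustrative, and the attribute sets are an assumed encoding read off the many-valued fruit table on the "FCA / Scaling" slide, with every listed value (e.g. Red/green/yellow) turned into its own attribute.

```python
# Hypothetical encoding of the running example (an assumption, not the
# original cross table): one single-valued attribute per listed value.
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}

G = set(ROWS)                                              # set of objects
M = set().union(*ROWS.values())                            # set of attributes
I = {(g, m) for g, attrs in ROWS.items() for m in attrs}   # relation between G and M


def has_attribute(g, m):
    """Return True if (g, m) is in I, read as 'object g has attribute m'."""
    return (g, m) in I


print(has_attribute("Lemon", "sour"))    # True
print(has_attribute("Banana", "round"))  # False
```

Storing I as a set of pairs mirrors the definition literally; a dictionary of rows such as `ROWS` is usually more convenient for computation.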

Formal Concept Analysis
Derivation Operators

• $A \subseteq G$, $B \subseteq M$ (A: extent, B: intent)
• $A' := \{m \in M \mid (g, m) \in I \text{ for all } g \in A\}$: all attributes shared by all objects of A
• $B' := \{g \in G \mid (g, m) \in I \text{ for all } m \in B\}$: all objects having all attributes of B

A formal concept is defined as a pair (A, B) with A = B' and B = A'.

[Figure: running example cross table with an object set A1 and its derivation A1' marked]

The two derivation operators satisfy the following 3 conditions (stated for $A, A_1, A_2 \subseteq G$; the dual statements hold for attribute sets):
1. $A_1 \subseteq A_2 \Rightarrow A_2' \subseteq A_1'$
2. $A \subseteq A''$
3. $A' = A'''$
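A minimal sketch of the two derivation operators follows. The function names `common_attributes` (for A') and `common_objects` (for B') and the `ROWS` data are assumptions made for illustration; the data is a hypothetical encoding of the fruit table from the "FCA / Scaling" slide. The assertions at the end spot-check the three standard conditions (antitone, extensive, A' = A''') on one example and are not a proof.

```python
# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}
G = set(ROWS)
M = set().union(*ROWS.values())


def common_attributes(objects):
    """A': all attributes shared by all objects in the given set."""
    return {m for m in M if all(m in ROWS[g] for g in objects)}


def common_objects(attributes):
    """B': all objects having all attributes in the given set."""
    return {g for g in G if all(m in ROWS[g] for m in attributes)}


A1 = {"Banana"}
A2 = {"Banana", "Pear"}

# (1) A1 is a subset of A2, so A2' is a subset of A1'
assert common_attributes(A2) <= common_attributes(A1)
# (2) A is a subset of A''
assert A1 <= common_objects(common_attributes(A1))
# (3) A' equals A'''
assert common_attributes(A1) == common_attributes(
    common_objects(common_attributes(A1))
)
```

Note that `common_attributes(set())` returns all of M, because "for all g in the empty set" is vacuously true; this matches the set-theoretic definition.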

Formal Concept Analysis

Def.: A formal concept is a pair (A, B), with
• $A \subseteq G$, $B \subseteq M$
• A' = B and B' = A

where A' is the set of all attributes shared by all objects of A, and B' is the set of all objects having all attributes of B.

Set A is called the extent (a set of objects) and set B is called the intent (a set of attributes) of the formal concept (A, B).

[Figure: running example cross table with the extent and the intent of a concept marked]
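The definition can be checked mechanically: a pair (A, B) is a formal concept exactly when A' = B and B' = A. The sketch below assumes a hypothetical encoding of the fruit table from the "FCA / Scaling" slide; `ROWS`, `is_formal_concept` and `concept_from_objects` are illustrative names.

```python
# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}
M = set().union(*ROWS.values())


def common_attributes(objects):
    """A': attributes shared by all given objects."""
    return {m for m in M if all(m in ROWS[g] for g in objects)}


def common_objects(attributes):
    """B': objects having all given attributes."""
    return {g for g in ROWS if all(m in ROWS[g] for m in attributes)}


def is_formal_concept(extent, intent):
    """True if extent' == intent and intent' == extent."""
    return common_attributes(extent) == intent and common_objects(intent) == extent


def concept_from_objects(objects):
    """Smallest concept whose extent contains the given objects: (A'', A')."""
    intent = common_attributes(objects)
    return common_objects(intent), intent


print(is_formal_concept({"Apple", "Banana", "Pear"}, {"yellow", "sweet", "smooth"}))  # True
print(is_formal_concept({"Banana"}, {"yellow", "long", "sweet", "smooth"}))           # False
```

The second pair fails because, in this encoding, Pear has exactly the same attributes as Banana, so the extent of that intent is {Banana, Pear}.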

Formal Concept Analysis
Sub/Superconcept Relation

• $A \subseteq G$, $B \subseteq M$
• A': all attributes shared by all objects of A
• B': all objects having all attributes of B

[Figure: running example cross table with two highlighted concepts, one orange and one blue, labelled A1, B1 ↔ A1' and A2, B2 ↔ A2']

The orange concept is a subconcept of the blue concept, since its extent is contained in the blue one (equivalently, the blue intent is contained in the orange one).
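As a small illustration of the subconcept test, the sketch below compares two concepts by their extents and confirms that the intents are ordered the other way round. The `Concept` type, `is_subconcept`, and the two example concepts are assumptions; the concepts come from a hypothetical encoding of the fruit table on the "FCA / Scaling" slide, not from the highlighted figure.

```python
from typing import FrozenSet, NamedTuple


class Concept(NamedTuple):
    extent: FrozenSet[str]   # set of objects
    intent: FrozenSet[str]   # set of attributes


def is_subconcept(c1, c2):
    """(A1, B1) <= (A2, B2) iff A1 is a subset of A2."""
    return c1.extent <= c2.extent


# Two concepts of the assumed fruit context.
long_fruit = Concept(
    frozenset({"Banana", "Pear"}),
    frozenset({"yellow", "long", "sweet", "smooth"}),
)
yellow_sweet_smooth = Concept(
    frozenset({"Apple", "Banana", "Pear"}),
    frozenset({"yellow", "sweet", "smooth"}),
)

print(is_subconcept(long_fruit, yellow_sweet_smooth))  # True
# Equivalent formulation: the intent of the superconcept is contained
# in the intent of the subconcept.
print(yellow_sweet_smooth.intent <= long_fruit.intent)  # True
```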

Formal Concept Analysis
Concept Lattices (cf. Galois Lattices)

[Figure: running example cross table with the object sets A1 and A2, their derivations A1' and A2', and the resulting concepts C1 and C2 marked]

Formal Concept C1 = (A1, A1') = (extent, intent):
the set of objects that are "yellow", "sweet" and "smooth"

Formal Concept Analysis
Concept Lattices (cf. Galois Lattices)

[Figure: running example cross table with A1 and A1' marked; the object rows are labelled "Extent (Objects)" and the attribute columns "Intent (Attributes)"]

Formal Concept Analysis
Concept Lattices

Def.: The concept lattice of a formal context (G, M, I) is the set of all formal concepts of (G, M, I), together with the partial order

$$(A_1, B_1) \le (A_2, B_2) \;:\Longleftrightarrow\; A_1 \subseteq A_2 \quad (\Longleftrightarrow B_1 \supseteq B_2).$$

The concept lattice is denoted by B(G, M, I).

Theorem: The concept lattice is a lattice, i.e. for two concepts (A1, B1) and (A2, B2), there is always
• a greatest common subconcept
• and a least common superconcept

Formal Concept Analysis
Greatest Common Subconcept

Which objects share the attributes "smooth" and "red" and "sour"?

A: Grapes, Apples

This is the greatest common subconcept (infimum).

Recall that there is always
• a greatest common subconcept
• and a least common superconcept
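The question above can be answered by intersecting extents. The sketch below computes the infimum of the attribute concepts of "smooth", "red" and "sour", using the standard fact that the infimum of two concepts has the intersection of their extents as its extent. The names `ROWS`, `attribute_concept` and `meet` are illustrative, and the data is an assumed encoding of the fruit table from the "FCA / Scaling" slide.

```python
from functools import reduce

# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}
M = set().union(*ROWS.values())


def common_attributes(objects):
    """A': attributes shared by all given objects."""
    return {m for m in M if all(m in ROWS[g] for g in objects)}


def common_objects(attributes):
    """B': objects having all given attributes."""
    return {g for g in ROWS if all(m in ROWS[g] for m in attributes)}


def attribute_concept(m):
    """The concept (m', m'') generated by a single attribute m."""
    extent = common_objects({m})
    return extent, common_attributes(extent)


def meet(c1, c2):
    """Greatest common subconcept: intersect the extents, then derive the intent."""
    extent = c1[0] & c2[0]
    return extent, common_attributes(extent)


infimum = reduce(meet, (attribute_concept(m) for m in ("smooth", "red", "sour")))
print(sorted(infimum[0]))  # ['Apple', 'Grapes']
```

The intent of the result contains more than the three attributes asked for: in this encoding, every object that is smooth, red and sour is also green, round and sweet.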

Formal Concept Analysis
Least Common Superconcept

Which attributes do the objects "strawberries" and "lemon" share?

A: Bumpy, round

This is the least common superconcept (supremum).

Recall that there is always
• a greatest common subconcept
• and a least common superconcept
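Dually, the supremum is obtained by intersecting intents. The sketch below joins the object concepts of Strawberries and Lemon; `ROWS`, `object_concept` and `join` are illustrative names, and the data is an assumed encoding of the fruit table from the "FCA / Scaling" slide.

```python
# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}


def common_objects(attributes):
    """B': objects having all given attributes."""
    return {g for g in ROWS if attributes <= ROWS[g]}


def object_concept(g):
    """The concept (g'', g') generated by a single object g."""
    intent = set(ROWS[g])
    return common_objects(intent), intent


def join(c1, c2):
    """Least common superconcept: intersect the intents, then derive the extent."""
    intent = c1[1] & c2[1]
    return common_objects(intent), intent


supremum = join(object_concept("Strawberries"), object_concept("Lemon"))
print(sorted(supremum[1]))  # ['bumpy', 'round']
print(sorted(supremum[0]))  # ['Lemon', 'Strawberries']
```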

Formal Concept Analysis
Implications

Def.: An implication A → B holds in a context (G, M, I) if every object intent respects A → B, i.e. if each object that has all the attributes in A also has all the attributes in B.

We also say A → B is an implication of (G, M, I). Equivalently, $A' \subseteq B'$, i.e. $B \subseteq A''$.
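The definition translates into a one-line check over all object intents. This is a sketch under the assumption that the fruit table from the "FCA / Scaling" slide is encoded as below; `ROWS` and `holds` are illustrative names.

```python
# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}


def holds(premise, conclusion):
    """True if every object that has all premise attributes also has all conclusion attributes."""
    return all(conclusion <= attrs for attrs in ROWS.values() if premise <= attrs)


print(holds({"long"}, {"yellow", "sweet", "smooth"}))  # True
print(holds({"bumpy"}, {"round"}))                     # True
print(holds({"yellow"}, {"sweet"}))                    # False (Lemon is yellow but not sweet)
```

An implication whose premise is satisfied by no object holds vacuously, which is why tools usually report how many objects support each implication.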

Formal Concept Analysis
Implications

[Figure: implications of the running example, each annotated with its object count f]
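To give a feel for implications with an object count, the sketch below lists all implications with a single-attribute premise together with a count f. It assumes that f is the number of objects satisfying the premise, and it uses a hypothetical encoding of the fruit table from the "FCA / Scaling" slide; the function name `single_premise_implications` is illustrative. It is not the algorithm a tool such as ConExp uses to compute an implication base.

```python
# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}


def single_premise_implications(rows):
    """Map each attribute m to (conclusion, f) for the implication {m} -> {m}'' minus {m}.

    f is assumed to be the number of objects that have attribute m.
    Attributes with an empty conclusion are omitted.
    """
    all_attributes = set().union(*rows.values())
    result = {}
    for m in sorted(all_attributes):
        support = [g for g, attrs in rows.items() if m in attrs]
        # Closure {m}'': attributes shared by all objects that have m.
        closure = set.intersection(*(rows[g] for g in support))
        conclusion = closure - {m}
        if conclusion:
            result[m] = (conclusion, len(support))
    return result


for premise, (conclusion, f) in single_premise_implications(ROWS).items():
    print(f"<{f}> {premise} -> {sorted(conclusion)}")
```

In this encoding, "smooth" implies "sweet" with f = 4, which is the kind of implication with f > 3 that the bonus task asks for.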

Formal Concept Analysis
Applications / AOL Search Query Logs

Implications: [figure not preserved]

Formal Concept Analysis
Applications / Goal Tagging

Implications: [figure not preserved]

Formal Concept Analysis
Applications / Bugs - Bugzilla

Implications: [figure not preserved]

FCA / Scaling
Transforming many-valued into single-valued contexts

What hasn't been mentioned yet: many-valued contexts

              Color              Shape   Taste        Texture
Apple         Red/green/yellow   Round   Sweet/sour   Smooth
Lemon         Yellow             Round   Sour         Bumpy
Banana        Yellow             Long    Sweet        Smooth
Strawberries  Red                Round   Sweet        Bumpy
Grapes        Red/green          Round   Sweet/sour   Smooth
Pear          Yellow             Long    Sweet        Smooth

Via scales, e.g. scale S1 for Color:

S1       red   green   yellow
red      X
green          X
yellow                 X
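The scale shown for Color maps each value to exactly one single-valued attribute (a nominal scale). A minimal sketch of applying such a scale to every column of the table follows. It assumes that a cell such as "Red/green/yellow" lists several admissible values and that each of them becomes its own attribute; `MANY_VALUED` and `scale_nominally` are illustrative names.

```python
# The many-valued fruit table, with multi-value cells split into lists
# (splitting on "/" is an assumption about how the table is meant to be read).
MANY_VALUED = {
    "Apple":        {"Color": ["red", "green", "yellow"], "Shape": ["round"],
                     "Taste": ["sweet", "sour"], "Texture": ["smooth"]},
    "Lemon":        {"Color": ["yellow"], "Shape": ["round"],
                     "Taste": ["sour"], "Texture": ["bumpy"]},
    "Banana":       {"Color": ["yellow"], "Shape": ["long"],
                     "Taste": ["sweet"], "Texture": ["smooth"]},
    "Strawberries": {"Color": ["red"], "Shape": ["round"],
                     "Taste": ["sweet"], "Texture": ["bumpy"]},
    "Grapes":       {"Color": ["red", "green"], "Shape": ["round"],
                     "Taste": ["sweet", "sour"], "Texture": ["smooth"]},
    "Pear":         {"Color": ["yellow"], "Shape": ["long"],
                     "Taste": ["sweet"], "Texture": ["smooth"]},
}


def scale_nominally(table):
    """Turn every (attribute, value) pair into one single-valued attribute 'Attribute=value'."""
    return {
        obj: {f"{attribute}={value}" for attribute, values in row.items() for value in values}
        for obj, row in table.items()
    }


single_valued = scale_nominally(MANY_VALUED)
print(sorted(single_valued["Lemon"]))
# ['Color=yellow', 'Shape=round', 'Taste=sour', 'Texture=bumpy']
```

Other scale types (for example ordinal scales for numeric values) would map one value to several attributes; the nominal scale is the simplest case.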

Knowledge Acquisition
CIMIANO, HOTHO, & STAAB 2005

[Figures on four consecutive slides not preserved]

Formal Concept Analysis
Lattice Construction Algorithms

A Naive Approach: [figure not preserved] Ganter / Stumme 2003

Formal Concept Analysis
Lattice Construction Algorithms

Example (Formal Concepts): [figure not preserved] Ganter / Stumme 2003

Formal Concept Analysis
Lattice Construction Algorithms

A Naive Approach: [figure not preserved] Ganter / Stumme 2003
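The content of the "naive approach" slides is not preserved. One simple way to enumerate all formal concepts, given here only as a sketch and not necessarily the procedure shown on the slides, is to start from the attribute extents, close them under intersection, add G, and derive the intent of every extent. The data is an assumed encoding of the fruit table from the "FCA / Scaling" slide; `ROWS` and `all_concepts` are illustrative names.

```python
# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}


def all_concepts(rows):
    """Enumerate all formal concepts by closing the attribute extents under intersection."""
    objects = frozenset(rows)
    attributes = set().union(*rows.values())

    # Step 1: start with G and the extent of every single attribute.
    extents = {objects}
    extents.update(frozenset(g for g in objects if m in rows[g]) for m in attributes)

    # Step 2: add pairwise intersections until nothing new appears.
    changed = True
    while changed:
        changed = False
        for a in list(extents):
            for b in list(extents):
                if a & b not in extents:
                    extents.add(a & b)
                    changed = True

    # Step 3: derive the intent of every extent.
    def intent(extent):
        return {m for m in attributes if all(m in rows[g] for g in extent)}

    return [(set(extent), intent(extent)) for extent in extents]


concepts = all_concepts(ROWS)
print(len(concepts))  # 16
```

This is quadratic in the number of extents per pass and only suitable for small contexts; dedicated algorithms avoid recomputing intersections.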

Formal Concept Analysis
Lattice Construction Algorithms

Example:

({T2, T7}, {e}) <= ({T1, T2, T3, T4, T5, T6, T7}, ∅)
"5" is a subconcept of "9"

(∅, {a, b, c, d, e}) <= ({T4}, {a, b, c})

In the line diagram, a circle for a concept is always positioned higher than all circles for its proper subconcepts.

Ganter / Stumme 2003

Formal Concept Analysis
Lattice Construction Algorithms

Example ctd.:

[Figure: line diagram of the concept lattice with the concepts numbered 1 to 10, shown with reduced labeling (top to bottom)]

Ganter / Stumme 2003
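In a reduced labeling, each attribute m is written only at its attribute concept (m', m'') and each object g only at its object concept (g'', g'); all other memberships can be read off by following the lines of the diagram. The sketch below computes these labels for an assumed encoding of the fruit table from the "FCA / Scaling" slide, not for the T1..T7 example of the figure. `ROWS` and `reduced_labels` are illustrative names.

```python
# Hypothetical encoding of the running example (assumed, see lead-in).
ROWS = {
    "Apple":        {"red", "green", "yellow", "round", "sweet", "sour", "smooth"},
    "Lemon":        {"yellow", "round", "sour", "bumpy"},
    "Banana":       {"yellow", "long", "sweet", "smooth"},
    "Strawberries": {"red", "round", "sweet", "bumpy"},
    "Grapes":       {"red", "green", "round", "sweet", "sour", "smooth"},
    "Pear":         {"yellow", "long", "sweet", "smooth"},
}


def reduced_labels(rows):
    """Map concept extents to (object labels, attribute labels).

    Concepts that carry no label at all do not appear in the result.
    """
    attributes = set().union(*rows.values())

    def extent_of(attrs):
        # All objects having every attribute in attrs.
        return frozenset(g for g in rows if attrs <= rows[g])

    labels = {}
    for m in attributes:
        # Attribute m labels the concept with extent m'.
        labels.setdefault(extent_of({m}), (set(), set()))[1].add(m)
    for g in rows:
        # Object g labels the concept with extent g''.
        labels.setdefault(extent_of(rows[g]), (set(), set()))[0].add(g)
    return labels


labels = reduced_labels(ROWS)
print(labels[frozenset({"Banana", "Pear"})])
```

For the extent {Banana, Pear} this yields both objects as object labels and "long" as the only attribute label; "yellow", "sweet" and "smooth" are inherited from concepts further up.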

Formal Concept Analysis

Tools, e.g.:
• ConExp http://conexp.sourceforge.net/index.html
• Networks.tb http://networks-tb.sourceforge.net/
• JaLaBa http://maarten.janssenweb.net/jalaba/JaLaBA.pl

Further Information:
• FCA Homepage http://www.upriss.org.uk/fca/fca.html

Formal Concept Analysis

• ConExp http://conexp.sourceforge.net/index.html

[Figure: ConExp screenshot, not preserved]

Bonus Task

• Sketch a categorization example
• Define a formal context for which |G| >= 10, |M| >= 10 and |I| ~ |G| + |M|
• Use ConExp to
  – Name all elements of G and M (choose plausible G and M)
  – Represent your formal context (choose plausible I)
  – Draw the concept lattice
  – Calculate implications (there should be at least one implication with f > 3)
• Submit
  – A .pdf file that contains 1) your context, 2) the laid-out(!) lattice, 3) the top 10 implications and 4) a brief explanation
  – Name the .pdf file using the following syntax: "GWM09-BT2-YOURMATR-YOURLASTNAME.cex"
  – To me via e-mail using subject "[GWM09-BT2-YOURMATR]"
  – Before the beginning of next week's class

Any questions?

See you next week!