
Dept. of Computer Science, University of Crete
HY577 – Machine Learning, Fall 2000

CONCEPTUAL CLUSTERING

• What is conceptual clustering … why?
• Conceptual vs. numerical clustering
• Definitions & key-points
• Approaches
• The AQ/CLUSTER approach: adapting STAR generation for conceptual clustering
• The COBWEB conceptual clustering approach


Conceptual Clustering: What

• How to group examples / cases / observations / objects … based on their descriptions

• An unsupervised learning method … no class assignment to cases

Example: Taxonomy of species

      BODY_COVER   HEART_CHAMBER   BODY_TEMP     FERTILIZATION
s1    hair         four            regulated     internal
s2    feathers     four            regulated     internal
s3    cornified    imperf-four     unregulated   internal
s4    moist        three           unregulated   external
s5    scales       two             unregulated   external

[Figure: hierarchical conceptual clustering of {s1, …, s5} — the root splits into {s1, s2} (hair/feathers, four, …), {s3} (cornified, imperf-four, …), and {s4, s5} (moist/scales, …), with singleton leaves {s1}, {s2}, {s4}, {s5} below.]


Conceptual Clustering vs. Numerical Clustering

Numerical: based on distances → two groups … hard to interpret?

Conceptual: based on descriptions → one group → the DIAMOND concept

Points, facts, observations, instances, examples, cases … are put together if they represent the same concept


Conceptual Clustering: Key-points

A conceptual clustering system accepts a set of object descriptions (events, facts, observations, …) and produces a classification scheme over them

• Semantic network: class → sub-class → … → instances (a HIERARCHY)

• Does not require a teacher → unsupervised learning

• An evaluation function is needed to judge the goodness of a clustering

Contextual factors:
• Performance … "are the resulting classifications any good?"
• Environment … if it changes dynamically, then hierarchical clustering


Conceptual Clustering: Definition

Given:
• A set of unclassified instances I
• An evaluation function e

Do: create a set of clusters over I that maximizes e
• Clusters need to be disjoint ??
• Clusters can be hierarchically related

Evaluation functions (for the quality of clusters)

• Maximize intra-cluster similarity
• Maximize inter-cluster dissimilarity
• Prefer simpler clusterings (Occam's razor)
(a toy scoring sketch follows)
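To make these criteria concrete, here is a minimal scoring sketch in Python for nominal data: it rewards attribute agreement within clusters and penalizes agreement across clusters. The scoring rule and the function names are illustrative choices, not the criterion used by CLUSTER or COBWEB; the objects are the species from the taxonomy example above, and later sketches reuse the s1–s5 dicts defined here.

```python
# Toy evaluation function for nominal data (illustrative, not CLUSTER's or COBWEB's).
from itertools import combinations

s1 = {"BODY_COVER": "hair",      "HEART_CHAMBER": "four",        "BODY_TEMP": "regulated",   "FERTILIZATION": "internal"}
s2 = {"BODY_COVER": "feathers",  "HEART_CHAMBER": "four",        "BODY_TEMP": "regulated",   "FERTILIZATION": "internal"}
s3 = {"BODY_COVER": "cornified", "HEART_CHAMBER": "imperf-four", "BODY_TEMP": "unregulated", "FERTILIZATION": "internal"}
s4 = {"BODY_COVER": "moist",     "HEART_CHAMBER": "three",       "BODY_TEMP": "unregulated", "FERTILIZATION": "external"}
s5 = {"BODY_COVER": "scales",    "HEART_CHAMBER": "two",         "BODY_TEMP": "unregulated", "FERTILIZATION": "external"}

def overlap(a, b):
    """Fraction of attributes on which two objects agree."""
    shared = [k for k in a if k in b]
    return sum(a[k] == b[k] for k in shared) / len(shared) if shared else 0.0

def evaluate(clusters):
    """Mean intra-cluster overlap minus mean inter-cluster overlap."""
    intra = [overlap(a, b) for c in clusters for a, b in combinations(c, 2)]
    inter = [overlap(a, b)
             for c1, c2 in combinations(clusters, 2) for a in c1 for b in c2]
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(intra) - mean(inter)

good = [[s1, s2], [s3], [s4, s5]]       # the taxonomy-like split
bad  = [[s1, s4], [s2, s5], [s3]]       # an arbitrary split
print(evaluate(good), evaluate(bad))    # ≈ 0.50 vs ≈ -0.28 — the taxonomy-like split scores higher
```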


Conceptual Clustering: Definitions & key-points -2

Performance measures:
• Ability to predict all, or the important, attributes
• Comprehensibility & utility of the induced clusters
• Ability to generate a hierarchy

• Recognition process
• Structured descriptions

ML contributions to clustering:
• Representation: symbolic variables
• Automatic characterization of the induced clusters


Conceptual Clustering: Approaches

• CLUSTER [Michalski & Stepp, 1983]
  - STAR generation
  - Hierarchical organization
  - Branches are distinguishing characterizations
  - Hill-climbing with backtracking
  - Pre-specified # of clusters

• AUTOCLASS [Cheeseman et al., 1988]
  - Probability distributions over members' values → Bayesian
  - Finds the most probable partition of the instances, maximizing the posterior

      p(θ, π | D, N) = p(θ, π | N) · p(D | θ, π, N) / p(D | N)

  - # of clusters NOT pre-specified

• COBWEB [Fisher, 1987]
  - Statistical measure of category (cluster) utility
  - # of clusters NOT pre-specified
  - Incremental


Conceptual Clustering: AQ/ CLUSTER

Adapt AQ for conceptual clustering

AQ requires the classification into POSitive and NEGative examples

Given:
• A collection of events, E
• The number of clusters desired, k
• The criterion of clustering quality → LEF

Find: a disjoint clustering of the collection of events that optimizes the given clustering-quality criterion


AQ/ CLUSTER: Terminology

Variables
• Nominal (categorical): DOMAIN(Xi) = {v1, v2, …, vm}
• Linear (quantitative): DOMAIN(Xi) = [vi .. vj]

• Structured: a generalization hierarchy, e.g., shape

  [Figure: generalization tree for shape — shape branches into polygon and oval; polygon into 3-sides (triangle) and 4-sides (trapezoid, square, rectangle); oval into ellipse and circle.]

Syntactic distance:   d(e1, e2) = Σ_i sd(x1;i, x2;i)

Relational statement:   [Xi : Ri] … Ri is the reference of variable Xi
e.g., [length > 2], [color = blue OR red], [weight = 2..5], …
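A small Python sketch of this terminology — events, a complex built from relational statements, cover testing, and the syntactic distance — under the assumption of a 0/1 per-variable distance for nominal values; the variable names and the bounded range standing in for [LENGTH > 2] are illustrative, not Michalski's implementation.

```python
# Toy rendering of events, complexes and syntactic distance.
def sd(u, v):
    """Per-variable syntactic distance: 0 if the values are equal, 1 otherwise."""
    return 0 if u == v else 1

def d(e1, e2):
    """Syntactic distance between two events (tuples of variable values)."""
    return sum(sd(u, v) for u, v in zip(e1, e2))

def covers(cpx, event):
    """A complex covers an event if every variable's value lies in its reference."""
    return all(event[i] in ref for i, ref in cpx.items())

# Events over the variables (LENGTH, COLOR, WEIGHT)
e1 = (3, "blue", 4)
e2 = (1, "red", 4)

# The complex [LENGTH > 2][COLOR = blue OR red][WEIGHT = 2..5]
cpx = {0: range(3, 10),          # LENGTH > 2 (upper bound added only for illustration)
       1: {"blue", "red"},       # COLOR = blue OR red
       2: range(2, 6)}           # WEIGHT = 2..5

print(d(e1, e2))        # 2 — the events differ on LENGTH and COLOR
print(covers(cpx, e1))  # True
print(covers(cpx, e2))  # False — LENGTH = 1 violates [LENGTH > 2]
```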


Conceptual Clustering: Adapting STAR generation

1. k events are selected (= the number of clusters wanted)

2. A star G(ei | E − {ei}) is generated for each seed event against the other events

3. The complexes are modified to construct a disjoint cover that optimizes the LEF …

4. Is the termination condition satisfied?

5. Choose new seeds
   • If cluster quality improves → choose central events
   • If cluster quality is not improving → choose border events

Central events: those nearest the geometric mean of the events in the cluster (a toy sketch of this choice follows below)

• At the end one has a set of clusters and their descriptions (RULES)
• Repeat the same procedure within each cluster → hierarchy
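The sketch below illustrates the seed-reselection choice of step 5, assuming numeric event vectors and using the arithmetic mean of the cluster as a stand-in for the slide's "geometric mean"; the names are illustrative.

```python
# Toy choice of central vs. border events for seed reselection (step 5).
def mean_vector(cluster):
    n = len(cluster)
    return [sum(e[i] for e in cluster) / n for i in range(len(cluster[0]))]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def central_event(cluster):
    """Event nearest the cluster mean."""
    m = mean_vector(cluster)
    return min(cluster, key=lambda e: dist(e, m))

def border_event(cluster):
    """Event farthest from the cluster mean."""
    m = mean_vector(cluster)
    return max(cluster, key=lambda e: dist(e, m))

cluster = [(2, 3, 0, 1), (0, 2, 1, 1), (1, 2, 0, 1)]
print(central_event(cluster))   # (1, 2, 0, 1) — closest to the cluster mean
print(border_event(cluster))    # (2, 3, 0, 1) — one of the events farthest from the mean
```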


Adapting STAR generation for conceptual clustering:

Disjoint covers

We have k covers … each covering one seed event and none of the other k−1 seeds

• From the rest of the events in the given set, determine those covered by more than one of the k covers (the Multiple-Covered-Event list … m-list)
• The size of this list is a measure of cluster quality
• If the m-list is empty → termination

Refunion: complexes → complex

Linear variables:
  e1 = (2, 3, 0, 1) … newly selected event
  e2 = (0, 2, 1, 1) … newly selected event
  c  = [X1 = 2..3][X2 = 4][X3 = 0][X4 = 2]
  c' = [X1 = 0..3][X2 = 2..4][X3 = 0..1][X4 = 1..2]
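A minimal Python sketch of refunion for linear variables, mirroring the example above: each interval of the complex is extended just enough to also cover the newly selected events. It is an illustration, not the CLUSTER/2 code.

```python
# Refunion for linear variables: widen each interval to absorb the new events.
def refunion(complex_, events):
    """complex_: list of (lo, hi) intervals, one per variable.
    events: tuples of variable values to be absorbed."""
    new = []
    for i, (lo, hi) in enumerate(complex_):
        values = [e[i] for e in events]
        new.append((min([lo] + values), max([hi] + values)))
    return new

e1 = (2, 3, 0, 1)
e2 = (0, 2, 1, 1)
c  = [(2, 3), (4, 4), (0, 0), (2, 2)]       # [X1=2..3][X2=4][X3=0][X4=2]

print(refunion(c, [e1, e2]))
# [(0, 3), (2, 4), (0, 1), (1, 2)]  ->  [X1=0..3][X2=2..4][X3=0..1][X4=1..2]
```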

Structured variables: climb the generalization tree

Quality: sparseness of a cluster   r(c) = 1 − p(c) / (p(c) + s(c))

  p(c): # of events covered by c         MINIMIZE TOTAL SPARSENESS
  s(c): # of events covered by E − c     MAXIMIZE SIMPLICITY … use as few attributes as possible


AQ/CLUSTER: Flow Chart

Given:
  E   – a set of data events
  k   – the number of clusters
  LEF – the clustering-quality criterion

(1) Choose the initial k "seed" events from E

(2) Determine a star for each seed against the other seed events

(3) By appropriately modifying and selecting complexes from the stars, construct a disjoint cover of E that optimizes the LEF criterion

(4) Is the termination criterion satisfied? If yes → stop

(5) Is the clustering quality improving?
    If yes → choose k new central events and return to (2)
    If no  → choose k new border events and return to (2)
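To tie the chart together, here is a high-level Python skeleton of the loop. The star generation, disjoint-cover construction, LEF scoring and seed-choice routines are passed in as placeholders (e.g., the central/border helpers sketched two slides back); only the control flow is intended to follow the chart, and the fixed iteration cap stands in for a real termination test such as an empty m-list.

```python
# Control-flow skeleton of the AQ/CLUSTER loop; the heavy lifting is delegated
# to caller-supplied placeholder functions.
import random

def aq_cluster(E, k, lef, generate_star, make_disjoint_cover,
               choose_central, choose_border, max_iters=20):
    seeds = random.sample(E, k)                            # (1) initial seeds
    best_cover, best_score = None, float("-inf")
    for _ in range(max_iters):                             # (4) crude termination cap
        stars = [generate_star(s, [t for t in seeds if t is not s], E)
                 for s in seeds]                           # (2) one star per seed
        cover = make_disjoint_cover(stars, E, lef)         # (3) disjoint cover of E
        score = lef(cover)
        if score > best_score:                             # (5) quality improving?
            best_cover, best_score = cover, score
            seeds = [choose_central(c) for c in cover]     #     yes -> central events
        else:
            seeds = [choose_border(c) for c in cover]      #     no  -> border events
    return best_cover
```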


COBWEB: The Basics

• Representation: attribute-value pairs
• Search: heuristic – a statistical evaluation measure
• Hierarchical clustering: different state representations
• Method: operators to build classification schemes
• Control: a high-level algorithmic process → apply the evaluation measure; form states → apply operators

• Ability to identify basic-level categories
  Basic-level categories (e.g., Bird) are retrieved more quickly than either more general (e.g., Animal) or more specific (e.g., robin) ones

  [Figure: Animal → Bird → robin; Bird is the basic-level category]

→ Maximize inference-related capabilities
  • Efficient recognition process
  • Better classification


COBWEB: Towards a measure of Category Utility

Trade-off between intra-class similarity & inter-class dissimilarity

• An index for intra-class similarity

maximize   p(Ai = Vij | Ck)   … a continuous analogue of logical necessity

• "… the higher this probability, the more necessary Ai = Vij is for predicting Ck"
  "… the more necessary it is that objects sharing this att-value pair be in the same category"

• "… the higher this probability, the greater the proportion of class members sharing this att-value pair"


COBWEB: Towards a measure of Category Utility -2

Trade-off between intra-class similarity & inter-class dissimilarity

• An index for inter-class dissimilarity

maximize   p(Ck | Ai = Vij)   … a continuous analogue of logical sufficiency

• "… the higher this probability, the more sufficient Ai = Vij is for predicting Ck"

• "… the higher this probability, the less this att-value pair predicts classes other than Ck"
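A small Python sketch computing the two probabilities of these two slides from a labelled partition, reusing the species dicts s1–s5 defined earlier; the function names are illustrative.

```python
# Predictability p(Ai=Vij | Ck) and predictiveness p(Ck | Ai=Vij) from a partition.
def p_value_given_class(clusters, k, attr, value):
    """p(Ai = Vij | Ck): proportion of members of cluster k having that value."""
    c = clusters[k]
    return sum(o[attr] == value for o in c) / len(c)

def p_class_given_value(clusters, k, attr, value):
    """p(Ck | Ai = Vij): of all objects having that value, the fraction in cluster k."""
    matches = [sum(o[attr] == value for o in c) for c in clusters]
    return matches[k] / sum(matches) if sum(matches) else 0.0

taxonomy = [[s1, s2], [s3], [s4, s5]]
print(p_value_given_class(taxonomy, 0, "BODY_TEMP", "regulated"))      # 1.0
print(p_class_given_value(taxonomy, 0, "BODY_TEMP", "regulated"))      # 1.0 — fully predictive of cluster 0
print(p_class_given_value(taxonomy, 0, "FERTILIZATION", "internal"))   # ≈ 0.67 — s3 also shares this pair
```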


Category Utility: Final Definition

CU({C1, C2, …, Cn}) = (1/n) Σ_k=1..n p(Ck) [ Σ_i Σ_j p(Ai=Vij | Ck)² − Σ_i Σ_j p(Ai=Vij)² ]

  Σ_i Σ_j p(Ai=Vij | Ck)² : expected # of att-values correctly guessed, given membership in Ck
  Σ_i Σ_j p(Ai=Vij)²      : expected # of att-values correctly guessed, with no class information

• The special case of irrelevant attributes
  Ai = Vij independent of class membership  ⇒  p(Ai=Vij | Ck) = p(Ai=Vij)

  If this holds for all values j  ⇒  the attribute contributes 0 to CU  ⇒  Ai is irrelevant
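A minimal Python sketch of the formula above for nominal data, reusing the species dicts s1–s5 from the earlier evaluation-function sketch; missing attribute values, should they occur, are simply treated as one more value.

```python
# Category utility of a partition (list of clusters, each a list of attribute-value dicts).
from collections import Counter

def category_utility(clusters):
    objects = [o for c in clusters for o in c]
    attrs = {a for o in objects for a in o}                 # union of attributes seen

    def sum_sq(objs):
        # sum over attributes i and values j of p(Ai = Vij | objs)^2
        total = 0.0
        for a in attrs:
            counts = Counter(o.get(a) for o in objs)
            total += sum((c / len(objs)) ** 2 for c in counts.values())
        return total

    base = sum_sq(objects)                                  # Σ_i Σ_j p(Ai=Vij)²
    score = sum(len(c) / len(objects) * (sum_sq(c) - base) for c in clusters)
    return score / len(clusters)                            # divide by n, the # of clusters

print(category_utility([[s1, s2], [s3], [s4, s5]]))         # ≈ 0.63
print(category_utility([[s1, s3, s5], [s2, s4]]))           # ≈ 0.17  (a worse partition)
```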


COBWEB: The Operators

Operator-1: Placing an object in an existing node (sub-cluster in the hierarchy)

(a) Place the object in each existing sub-cluster → compute CUi (i = 1, …, # of sub-clusters so far)

(b) Identify the BEST CUi → the object is placed in the corresponding node

Operator-2: Creation of a new class (sub-cluster)

(a) Apply Operator-1 … compute CU[object in BEST host] … reuse the previous results

(b) Compute CU[n so-far sub-clusters + NEW-NODE];
    if CU[n so-far + NEW-NODE] > CU[object in BEST host] → create NEW-NODE (a new class / sub-cluster)

… the # of clusters is not predefined


COBWEB: The Operators -2

Operator-3: Merging two nodes … up one level
(… because operators 1-2 are biased toward the initial input order … objects arrive incrementally!)

(a) Consider all node pairs → MERGED-NEW-NODE: sum the probabilities of the merged nodes

(b) If CU < CU[merged] → quality improves → keep the merge


COBWEB: The Operators -3

Operator-4: Splitting a node into its children … down one level
(… because operators 1-2 are biased toward the initial input order … objects arrive incrementally!)

(a) Consider all nodes → replace a node by its children (SPLIT-NEW-NODEs)

(b) If CU < CU[+ split-new-nodes, − split-node] → quality improves → keep the split


COBWEB: The algorithmic process – The CONTROL

Search: Hill-climbing with Backtracking

N = node;  I = new instance

Train(N, I) =
  IF leaf(N)
  THEN create_sub-tree(N, I)
  ELSE
    Incorporate(I, N); update N's probabilities

    Compute the score of placing I in each child of N
      N1    = child with the highest score          = HIGH
      N2    = child with the second-highest score
      NEW   = score when placing I as a new child of N
      MERGE = score of merging N1 and N2 … and putting I in the merged node
      SPLIT = score of splitting N1 into its children

    IF the highest score is:
      HIGH  → Train(N1, I)
      NEW   → add I as a new child of N
      MERGE → Train(merge(N1, N2, N), I)
      SPLIT → Train(split(N1, N), I)

Repeat until all instances have been presented.
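The sketch below is a compact, simplified Python rendering of this control loop: nodes store their objects and children, and candidate placements are scored with the category_utility function from the earlier sketch. Operators 3 and 4 (merge/split) are omitted for brevity, so this illustrates the hill-climbing control rather than Fisher's full algorithm.

```python
# Simplified COBWEB-style control: only operators 1 (existing host) and 2 (new class).
import copy

class Node:
    def __init__(self, objects=None, children=None):
        self.objects = list(objects or [])        # all objects stored under this node
        self.children = list(children or [])

def cobweb_train(N, I):
    N.objects.append(I)
    if not N.children:                            # a leaf: expand it into singleton children
        if len(N.objects) > 1:
            N.children = [Node([o]) for o in N.objects]
        return

    def score(children):
        return category_utility([c.objects for c in children])

    candidates = []
    for i in range(len(N.children)):              # Operator-1: try each existing host
        trial = copy.deepcopy(N.children)
        trial[i].objects.append(I)
        candidates.append((score(trial), "HIGH", i))
    trial = copy.deepcopy(N.children) + [Node([I])]
    candidates.append((score(trial), "NEW", None))    # Operator-2: a new class

    _, op, idx = max(candidates, key=lambda t: t[0])
    if op == "NEW":
        N.children.append(Node([I]))              # add I as a new child of N
    else:
        cobweb_train(N.children[idx], I)          # descend into the best host

root = Node()
for obj in [s1, s2, s3, s4, s5]:                  # species dicts from the earlier sketches
    cobweb_train(root, obj)
print([[o["BODY_COVER"] for o in c.objects] for c in root.children])
```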


COBWEB: The Different Uses

CLASSIFICATION

1. Eliminate class from data

2. Form COBWEB classification tree

3. Each unseen (test) example is passed through the tree to reach a leaf

4. The best-host node is used to classify the case → take as its class the class with the highest # of objects in the node

INFER ATT-VALUES

5. For an unknown value in a test example, predict it from the att-values of the objects in the best-host node → pick the most frequent att-value
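A sketch of the prediction uses above (steps 3-5), reusing Node, cobweb_train, root and category_utility from the previous sketches: descend to the best host by trial placement without modifying the tree, then read the answer off the host's objects. Classification (step 4) would proceed the same way, taking the majority class label among the host's objects.

```python
# Best-host descent and attribute-value inference on the sketched COBWEB tree.
from collections import Counter

def best_host(N, I):
    """Descend the tree, at each level choosing the child whose trial placement
    of I gives the highest category utility; return the final host node."""
    while N.children:
        scores = []
        for i, child in enumerate(N.children):
            trial = [list(c.objects) for c in N.children]
            trial[i].append(I)
            scores.append((category_utility(trial), i))
        _, i = max(scores)
        N = N.children[i]
    return N

def infer_value(tree, I, attr):
    """Step 5: predict a missing attribute value from the best host's objects."""
    host = best_host(tree, I)
    values = Counter(o[attr] for o in host.objects if attr in o)
    return values.most_common(1)[0][0]

# Predict FERTILIZATION for a new animal whose fertilization is unknown:
new = {"BODY_COVER": "scales", "HEART_CHAMBER": "two", "BODY_TEMP": "unregulated"}
print(infer_value(root, new, "FERTILIZATION"))    # expected: "external" (it resembles s5)
```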


COBWEB: Incrementality & its evaluation

4 criteria for evaluating an incremental system (… like COBWEB)

• COST of incorporating a single instance
• QUALITY of the learned classification tree
• # of objects needed to STABILIZE the classification tree

COST = O(B² · log_B n · A · V)

  B        : average branching factor
  log_B n  : maximum depth of the tree
  n        : # of objects classified so far
  A        : # of attributes
  V        : mean # of values per attribute
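For a feel of the numbers (illustrative values, not from the slides): with B = 3, n = 100 objects seen so far, A = 10 attributes and V = 5 values per attribute, incorporating one more instance costs on the order of B² · log_B n · A · V = 9 · log_3 100 · 10 · 5 ≈ 9 · 4.2 · 50 ≈ 1,900 elementary operations.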