Classification by Association Rules: Use Minimum Set of Rules
Jianyu Yang
December 10, 2003


Page 1: Classification by Association Rules: Use Minimum Set of Rules

Classification by Association Rules:

Use Minimum Set of Rules

Jianyu Yang

December 10, 2003

Page 2: Classification by Association Rules: Use Minimum Set of Rules

Classification System

• Problem: (A, B, C) => y | n ?
  – Decision tree learning, etc.

• Association rules: X => c
  – X: antecedent, c: consequent
  – Support & Confidence
  – Algorithms: Apriori
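The support and confidence of a rule X => c can be illustrated with a small sketch. The toy dataset and the function name `support_confidence` are made up for illustration; the definitions used are the usual ones for class association rules (support: fraction of all instances that contain X and have class c; confidence: fraction of instances containing X that have class c).

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: (set of attribute items, class label).
Instance = Tuple[FrozenSet[str], str]


def support_confidence(
    data: List[Instance], antecedent: FrozenSet[str], label: str
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent => label."""
    # Classes of all instances whose items contain the antecedent X.
    covered = [cls for items, cls in data if antecedent <= items]
    # Instances that contain X and also carry the consequent class c.
    correct = sum(1 for cls in covered if cls == label)
    support = correct / len(data) if data else 0.0
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


# Toy data, invented for this example only.
TOY_DATA: List[Instance] = [
    (frozenset({"A", "B"}), "y"),
    (frozenset({"A", "B", "C"}), "y"),
    (frozenset({"A", "C"}), "n"),
    (frozenset({"C"}), "n"),
]

sup, conf = support_confidence(TOY_DATA, frozenset({"A"}), "y")
print(f"(A) => y: support={sup:.2f}, confidence={conf:.2f}")
```

On this toy data, (A) => y is matched by three instances, two of which are class y, so the support is 2/4 and the confidence is 2/3.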

Page 3: Classification by Association Rules: Use Minimum Set of Rules

Association Rules: Issues

• Too many rules
  – Inefficient
  – Overfitting

• Applying order matters
  – Example: (A, B) => y, (C) => n

• Minimum Support (minsup)

• Minimum Confidence (minconf)
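Why the applying order matters can be seen with the two example rules on an instance that contains A, B, and C. The sketch below assumes a first-match classifier (the first rule whose antecedent is satisfied decides the class); the slides do not state the matching policy, so this is an assumption, and `classify_first_match` is a hypothetical helper.

```python
from typing import FrozenSet, List, Optional, Tuple

Rule = Tuple[FrozenSet[str], str]  # (antecedent, class label)


def classify_first_match(
    rules: List[Rule], items: FrozenSet[str], default: Optional[str] = None
) -> Optional[str]:
    """Return the label of the first rule whose antecedent is contained in items."""
    for antecedent, label in rules:
        if antecedent <= items:
            return label
    return default  # no rule fired


r1: Rule = (frozenset({"A", "B"}), "y")
r2: Rule = (frozenset({"C"}), "n")
instance = frozenset({"A", "B", "C"})  # satisfies both antecedents

print(classify_first_match([r1, r2], instance))  # (A, B) => y is tried first
print(classify_first_match([r2, r1], instance))  # (C) => n is tried first
```

The same instance is labeled y or n depending only on which rule is listed first, which is why a rule-based classifier needs a well-defined total order on its rules.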

Page 4: Classification by Association Rules: Use Minimum Set of Rules

MSR Algorithm

Ideas:

• No redundant rules
  – (A, B) => y
  – (A, B, C) => y

• Total order of rules
  – "Occam's razor": favor general rules

• Pre-pruning
  – (A, B) => y
  – (A, B, D) => ?
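The first two ideas can be sketched as follows. This is not the MSR implementation: the slides do not give the exact ordering or redundancy criteria, so the sketch assumes that "general" means a shorter antecedent, that ties are broken by higher confidence, and that a rule is redundant when an already kept rule with the same class has an antecedent contained in its own. The names `Rule`, `order_general_first`, and `drop_redundant`, and the confidence values, are invented for illustration.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    antecedent: FrozenSet[str]
    label: str
    confidence: float


def order_general_first(rules: List[Rule]) -> List[Rule]:
    """Assumed total order: fewer conditions first, then higher confidence."""
    return sorted(rules, key=lambda r: (len(r.antecedent), -r.confidence))


def drop_redundant(rules: List[Rule]) -> List[Rule]:
    """Keep a rule only if no kept, more general rule predicts the same class."""
    kept: List[Rule] = []
    for rule in order_general_first(rules):
        covered = any(
            k.label == rule.label and k.antecedent <= rule.antecedent
            for k in kept
        )
        if not covered:
            kept.append(rule)
    return kept


# Made-up confidences, only to exercise the code.
rules = [
    Rule(frozenset({"A", "B", "C"}), "y", 1.0),
    Rule(frozenset({"A", "B"}), "y", 0.9),
    Rule(frozenset({"C"}), "n", 0.8),
]
minimal = drop_redundant(rules)
for r in minimal:
    print(sorted(r.antecedent), "=>", r.label)
```

Here (A, B, C) => y is dropped because (A, B) => y already covers every instance it would match and predicts the same class.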

    L1 = {large 1-ruleitems};
    CAR1 = genRules(L1);
    pruneSet(L1);
    for (k = 2; Lk-1 ≠ ∅; k++) do begin
        Ck = apriori-gen(Lk-1);
        forall training instances t ∈ D do begin
            Ct = subset(Ck, t);
            forall candidates c ∈ Ct do
                ci.count++ for class label i;
        end
        Lk = {c ∈ Ck | ci.count ≥ minsup for any class i};
        CARk = genRules(Lk);
        pruneSet(Lk);
    end
    CARs = UNIONk(CARk);
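The level-wise loop above can be sketched in Python under some assumptions. The slides do not define `pruneSet`, so it is omitted here; `gen_rules` is assumed to emit, for each large itemset, the rule for its most frequent class when its confidence reaches minconf; and `apriori_gen` is the standard Apriori join-and-prune step. All function names and the toy data are illustrative, not taken from the MSR implementation.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Itemset = FrozenSet[str]
Instance = Tuple[Itemset, str]        # (attribute items, class label)
Rule = Tuple[Itemset, str, float]     # (antecedent, class, confidence)
Counts = Dict[Itemset, Dict[str, int]]


def count_candidates(data: List[Instance], candidates: Set[Itemset]) -> Counts:
    """Per-class counts of the instances matched by each candidate."""
    counts: Counts = {c: defaultdict(int) for c in candidates}
    for items, label in data:
        for c in candidates:
            if c <= items:
                counts[c][label] += 1
    return counts


def gen_rules(counts: Counts, large: Set[Itemset], minconf: float) -> List[Rule]:
    """Assumed genRules: one rule per large itemset, for its majority class."""
    rules: List[Rule] = []
    for c in large:
        total = sum(counts[c].values())
        label, hits = max(counts[c].items(), key=lambda kv: kv[1])
        if hits / total >= minconf:
            rules.append((c, label, hits / total))
    return rules


def apriori_gen(large: Set[Itemset], k: int) -> Set[Itemset]:
    """Join large (k-1)-itemsets; keep k-itemsets whose (k-1)-subsets are all large."""
    candidates: Set[Itemset] = set()
    for a, b in combinations(large, 2):
        union = a | b
        if len(union) == k and all(
            frozenset(s) in large for s in combinations(union, k - 1)
        ):
            candidates.add(union)
    return candidates


def mine_cars(data: List[Instance], minsup: float, minconf: float) -> List[Rule]:
    """Level-wise generation of class association rules (pruneSet omitted)."""
    n = len(data)
    candidates: Set[Itemset] = {frozenset({i}) for items, _ in data for i in items}
    cars: List[Rule] = []
    k = 1
    while candidates:
        counts = count_candidates(data, candidates)
        # Large: some class reaches minsup together with the itemset.
        large = {
            c for c in candidates
            if any(cnt / n >= minsup for cnt in counts[c].values())
        }
        cars.extend(gen_rules(counts, large, minconf))
        k += 1
        candidates = apriori_gen(large, k)
    return cars


# Toy data, invented for this example only.
toy = [
    (frozenset({"A", "B"}), "y"),
    (frozenset({"A", "B", "C"}), "y"),
    (frozenset({"A", "C"}), "n"),
    (frozenset({"C"}), "n"),
]
cars = mine_cars(toy, minsup=0.5, minconf=0.6)
for antecedent, label, conf in cars:
    print(sorted(antecedent), "=>", label, round(conf, 2))
```

Because pruning is left out, this sketch still returns (A, B) => y alongside the more general (A) => y and (B) => y; removing such rules is the job of the redundancy and pre-pruning ideas listed on this slide.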

Page 5: Classification by Association Rules: Use Minimum Set of Rules

minsup

[Figure: error rate (%) plotted against minimum support (%), both axes from 0 to 20, for the crx, austra, and auto datasets.]

Page 6: Classification by Association Rules: Use Minimum Set of Rules

minconf

[Figure: error rate (%) plotted against minimum confidence (%), from 40 to 100, for the crx, austra, and auto datasets.]

Page 7: Classification by Association Rules: Use Minimum Set of Rules

Results: Error Rate Comparison

[Figure: error rate (%) of C4.5 (R8), CBA (V2), and MSR.]

Page 8: Classification by Association Rules: Use Minimum Set of Rules

Conclusions

• A new algorithm was designed to build a classification system using a minimum set of association rules.

• In general, low minsup and high minconf produce low error rates.

• Experiments on 26 benchmark datasets showed lower error rates in 17 datasets than C4.5 (R8), in 16 than CBA (v2.0).

• The new algorithm does not always produce lower error rates than other algorithms.