Chapter 8
Covering (Rules-based) Algorithm
Written by Shakhina Pulatova
Presented by Zhao Xinyou
Data Mining Technology
Some materials (examples) are taken from the Web.
Contents
What is the Covering (Rule-based) algorithm?
Classification Rules - Straightforward
  1. If-Then rule
  2. Generating rules from Decision Tree
Rule-based Algorithms
  1. The 1R Algorithm / Learn One Rule
  2. The PRISM Algorithm
  3. Other Algorithms
Application of Covering algorithms
Discussion on e/m-learning application
Introduction - Application - 1 (PP87-88)
[Figure: a training-data table of records and attributes, classified by rules]
Rules:
1. Rules given by people
2. Rules generated by computer
Setting:
1. (0, 1.75) -> short
2. [1.75, 1.95) -> medium
3. [1.95, ∞) -> tall
Training Data
Introduction - Application - 2 (PP87-88)
How to get all tall people from B based on A?
[Figure: training data A plus new data B]
What is a Rule-based Algorithm?
Definition: each classification method uses an algorithm to generate rules from the sample data. These rules are then applied to new data.
Rule-based algorithms provide mechanisms that generate rules by
1. concentrating on a specific class at a time;
2. maximizing the probability of the desired classification.
Rules should be compact, easy to interpret, and accurate.
PP87-88
Classification Rules - Straightforward
1. If-Then rule
2. Generating rules from Decision Tree
PP88-89
Formal Specification of Rule-based Algorithm
A classification rule, r = <a, c>, consists of:
- a (antecedent/precondition): a series of tests that can be evaluated as true or false;
- c (consequent/conclusion): the class or classes that apply to instances covered by rule r.
PP88
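To make the <a, c> structure concrete, here is a minimal Python sketch (not from the chapter); the Rule class, the dict-per-tuple layout, and the covers method are illustrative assumptions.

```python
# Minimal representation of a classification rule r = <a, c>: the antecedent is a
# list of attribute tests, the consequent is the class assigned to covered tuples.
from dataclasses import dataclass

@dataclass
class Rule:
    antecedent: list   # e.g. [("Height", ">", 1.95)] -- tests evaluated as true/false
    consequent: str    # class that applies to instances covered by the rule

    def covers(self, row):
        """True if every test in the antecedent holds for this tuple (a dict)."""
        ops = {">": lambda x, y: x > y, "<=": lambda x, y: x <= y, "==": lambda x, y: x == y}
        return all(ops[op](row[attr], value) for attr, op, value in self.antecedent)

# Example: "if Height > 1.95 then class = tall"
tall_rule = Rule(antecedent=[("Height", ">", 1.95)], consequent="tall")
print(tall_rule.covers({"Gender": "M", "Height": 2.0}))   # True
```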
[Figure: a two-level decision tree that first tests a=0 (yes/no) and then b=0 on each branch, assigning class X or Y to each of the four combinations a=0,b=0; a=0,b=1; a=1,b=0; a=1,b=1]
Remarks on Straightforward Classification
The antecedent contains a predicate that can be evaluated as true or false against each tuple in the database. These rules relate directly to a corresponding decision tree (DT) that could be created. A DT can always be used to generate rules, but they are not equivalent.
Differences:
- the tree has an implied order in which the splitting is performed; rules have no order;
- a tree is created by looking at all classes; with rules, only one class must be examined at a time.
PP88-89
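As an illustration of how a DT can always be turned into rules (one rule per root-to-leaf path), here is a small Python sketch; the nested-dict tree format and the particular leaf labels are assumptions made up for this example, not the figure's exact tree.

```python
# Hypothetical two-level tree over attributes a and b; leaves hold class labels.
tree = {"attr": "a",
        0: {"attr": "b", 0: "X", 1: "Y"},
        1: {"attr": "b", 0: "Y", 1: "X"}}

def tree_to_rules(node, conditions=()):
    """Every root-to-leaf path becomes one if-then rule."""
    if not isinstance(node, dict):                 # leaf: emit (antecedent, class)
        return [(list(conditions), node)]
    rules = []
    for value, child in node.items():
        if value == "attr":
            continue
        rules += tree_to_rules(child, conditions + ((node["attr"], value),))
    return rules

for antecedent, cls in tree_to_rules(tree):
    print("if", " and ".join(f"{a}={v}" for a, v in antecedent), "then class =", cls)
```

Note that the extracted rules inherit the tree's splitting order, whereas rules written directly have no such order.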
If-Then Rule (Example 1)
A straightforward way to perform classification is to generate if-then rules that cover all cases.
PP88
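As a concrete instance, the height setting from the introduction can be written as a complete if-then rule set; the sketch below assumes the cut-offs 1.75 and 1.95 read from that slide.

```python
# If-then rules that cover all cases of the introduction's height classification.
def classify_height(height):
    if height < 1.75:
        return "short"    # rule 1: (0, 1.75) -> short
    elif height < 1.95:
        return "medium"   # rule 2: [1.75, 1.95) -> medium
    else:
        return "tall"     # rule 3: [1.95, ...) -> tall

print(classify_height(1.8))   # medium
```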
Generating Rules from Decision Tree - 1 (cont'd)
[Figure: a decision tree (Example 2)]
Generating Rules from Decision Tree - 2 (cont'd)
[Figure: a decision tree with internal tests a, b, c, d and leaf classes x, y, n]
Generating Rules from Decision Tree - 3 (cont'd)
Remarks:
- Rules derived from a DT may be more complex and harder to comprehend.
- A new test or rule requires reshaping the whole tree.
[Figure: the corresponding decision tree contains duplicate subtrees - the same tests c and d, with leaves x, y, n, are repeated under several branches]
Rules obtained without decision trees are more compact and accurate, so many other covering algorithms have been proposed.
PP89-90
[Figure: the two-level decision tree over a and b shown again, together with the rule: a=1 and c=0 -> Y]
Rule-based Classification
Generate rules with:
1. The 1R Algorithm / Learn One Rule
2. The PRISM Algorithm
3. Other Algorithms
PP90
Generating Rules without Decision Trees - 1 (cont'd)
Goal: find rules that identify the instances of a specific class.
- Generate the "best" rule possible by optimizing the desired classification probability.
- Usually, the "best" attribute-value pair is chosen.
Remark: these techniques are also called covering algorithms because they attempt to generate rules that exactly cover a specific class.
Generate Rules - Example - 2 (cont'd)
Example 3. Question: we want to generate a rule to classify persons as tall.
Basic format of the rule: if ? then class = tall
Goal: replace "?" with predicates that can be used to obtain the "best" probability of being tall.
PP90
Generate Rules - Algorithm - 3 (cont'd)
1. Generate rule R on training data S;
2. Remove the training data covered by rule R;
3. Repeat the process.
PP90
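A minimal Python sketch of this loop (sequential covering), assuming a hypothetical learn_one_rule helper and the Rule.covers method from the earlier sketch:

```python
# Steps 1-3 above: learn a rule, remove the tuples it covers, repeat.
def sequential_covering(data, target_class, learn_one_rule):
    rules = []
    remaining = list(data)
    while any(row["class"] == target_class for row in remaining):
        rule = learn_one_rule(remaining, target_class)   # 1. generate rule R on the data
        if rule is None:                                 # stop if no useful rule is found
            break
        rules.append(rule)
        remaining = [row for row in remaining
                     if not rule.covers(row)]            # 2. remove data covered by R
    return rules                                         # 3. repeat until the class is covered
```

Any rule learner (1R, PRISM, ...) can play the role of learn_one_rule.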
Generate Rules - Example - 4 (cont'd): Sequential Covering
[Figure: (i) original data, r = NULL; (ii) step 1 finds R1, r = R1; (iii) step 2 finds R2, r = R1 ∪ R2; (iv) step 3 finds R3, r = R1 ∪ R2 ∪ R3; remaining points are marked "wrong class"]
1R Algorithm / Learn One Rule (cont'd)
A simple and cheap method: it only generates a one-level decision tree and classifies an object on the basis of a single attribute.
Idea: rules are constructed to test a single attribute and branch for every value of that attribute. For each branch, the class assigned is the one occurring most often in the training data.
PP91
1R Algorithm / Learn One Rule (cont'd)
Idea:
1. Rules are constructed to test a single attribute and branch for every value of that attribute.
2. For each branch, count how often each class occurs.
3. Take the most frequent class for each value as the rule.
4. Evaluate the error rate of each attribute's rule set.
5. Choose the attribute with the minimum error rate.
PP91
1R illustration (attribute Gender; other attributes A2 ... An are handled the same way):

Gender  S  M  T
F       2  5  1
M       1  4  10

Rules: F -> M(edium), error = 3; M -> T(all), error = 5; total error = 8.
Each remaining attribute gets its own total error (Total Error = ...), and the smallest one is chosen; here a competing attribute with total error = 3 would win.
1R Algorithm
Input:  D  // Training Data
        T  // Attributes to consider for rules
        C  // Classes
Output: R  // Rules

Algorithm:
R = Ø;
for all A in T do
  RA = Ø;
  for all possible values v of A do
    for all Cj ∈ C do
      find count(Cj);
    end for
    let Cm be the class with the largest count;
    RA = RA ∪ {(A = v) -> (class = Cm)};
  end for
  ERRA = number of tuples incorrectly classified by RA;
end for
R = RA where ERRA is minimum
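A runnable Python sketch of the pseudocode above, assuming each tuple is a dict whose class label is stored under the key "class" and that attribute values are already discrete (e.g. heights binned in 0.1 steps as in Example 5):

```python
from collections import Counter, defaultdict

def one_r(data, attributes):
    """Return (attribute, {value: class}, errors) with the minimum number of errors."""
    best = None
    for attr in attributes:
        counts = defaultdict(Counter)              # value -> Counter of classes
        for row in data:
            counts[row[attr]][row["class"]] += 1
        rules = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(sum(c.values()) - c[rules[v]] for v, c in counts.items())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best
```

On the training data behind Example 5 (Gender vs. binned Height), this would return the Height rules, since their total error (1/15) beats Gender's (6/15).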
Example inputs:
D = Training Data
T = {Gender, Height}
C = {{F, M}, {0, ∞}}  (C1, C2)
Counts for the Gender attribute:

Gender  Short  Medium  Tall
F       3      6       0
M       1      2       3

R1 = F -> Medium    R2 = M -> Tall
(A similar count table is built for the Height attribute.)
Example 5 - 1R - 3 (cont'd)

Option  Attribute            Rules                   Error  Total Error
1       Gender               F -> medium             3/9    6/15
                             M -> tall               3/6
2       Height (step = 0.1)  (0, 1.6] -> short       0/2    1/15
                             (1.6, 1.7] -> short     0/2
                             (1.7, 1.8] -> medium    0/3
                             (1.8, 1.9] -> medium    0/4
                             (1.9, 2.0] -> medium    1/2
                             (2.0, ∞] -> tall        0/2

The rules based on Height have the minimum total error, so they are chosen.
Example 6 - 1R

Option  Attribute    Rules             Error  Total Error
1       outlook      Sunny -> no       2/5    4/14
                     Overcast -> yes   0/4
                     Rainy -> yes      2/5
2       temperature  Hot -> no         2/4    5/14
                     Mild -> yes       2/6
                     Cool -> yes       1/4
3       humidity     High -> no        3/7    4/14
                     Normal -> yes     1/7
4       windy        False -> yes      2/8    5/14
                     True -> no        3/6

Choose the rules based on humidity (High -> no, Normal -> yes) OR the rules based on outlook (Sunny -> no, Overcast -> yes, Rainy -> yes); both have the minimum total error of 4/14.
PP92-93
PRISM Algorithm (cont'd)
PRISM generates rules for each class by looking at the training data and adding rules that completely describe all tuples in that class.
It generates only correct or perfect rules: the accuracy of the rules so constructed is 100%.
It measures the success of a rule by the ratio p/t, where
- p is the number of positive instances,
- t is the total number of instances covered by the rule.
Counts (class = Tall), with other attributes A2 ... An handled the same way:

Gender  S  M  T
F       2  5  1
M       0  0  10

Gender = Male:   p = 10, t = 10
Gender = Female: p = 1,  t = 8
R = (Gender = Male), since its p/t is 100% ...
PRISM Algorithm
Input:  D  // Training Data
        C  // Classes
Output: R  // Rules

Steps (given D and C, for each (Attribute -> Value) pair):
1. Compute p/t for every (Attribute -> Value) pair;
2. Find one or more (Attribute -> Value) pairs with p/t = 100%;
3. Select such an (Attribute -> Value) pair as a rule;
4. Repeat steps 1-3 until no data remain in D.
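A minimal Python sketch of steps 1-2, assuming the same dict-per-tuple layout as before and discrete attribute values (heights binned as in Example 8); the function names are illustrative.

```python
def p_over_t(data, target_class):
    """Step 1: {(attribute, value): (p, t)} for every attribute-value pair."""
    scores = {}
    for row in data:
        for attr, value in row.items():
            if attr == "class":
                continue
            p, t = scores.get((attr, value), (0, 0))
            scores[(attr, value)] = (p + (row["class"] == target_class), t + 1)
    return scores

def perfect_pairs(data, target_class):
    """Step 2: the (attribute, value) pairs whose p/t equals 100%."""
    return [pair for pair, (p, t) in p_over_t(data, target_class).items()
            if t > 0 and p == t]
```

For Example 8 below, only the Height bin above 2.0 reaches p/t = 2/2, which becomes R1.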
Example 8 (cont'd) - which persons are tall?
Compute the value p/t for every (Attribute, value) pair; which one is 100%?

Num  (Attribute, value)    p/t
1    Gender = F            0/9
2    Gender = M            3/6
3    Height ≤ 1.6          0/2
4    1.6 < Height ≤ 1.7    0/2
5    1.7 < Height ≤ 1.8    0/3
6    1.8 < Height ≤ 1.9    0/4
7    1.9 < Height ≤ 2.0    1/2
8    2.0 < Height          2/2

R1 = 2.0 < Height
PP94-95
Refining the interval 1.9 < Height ≤ 2.0:

Num  (Attribute, value)     p/t
…    …                      …
     1.9 < Height ≤ 1.95    0/1
     1.95 < Height ≤ 2.0    1/1

R2 = 1.95 < Height ≤ 2.0
R = R1 ∪ R2
PP94-96
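As a usage sketch, the combined rule set R = R1 ∪ R2 from this example can be applied directly; the function name and dict layout below are illustrative.

```python
# A person is classified as tall if either learned rule fires.
def is_tall(person):
    r1 = person["Height"] > 2.0                    # R1: 2.0 < Height
    r2 = 1.95 < person["Height"] <= 2.0            # R2: 1.95 < Height <= 2.0
    return r1 or r2

print(is_tall({"Gender": "M", "Height": 1.97}))    # True (covered by R2)
```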
Example 9 (cont'd) - on which days may we play?
Compute the value p/t. The predicate outlook = overcast correctly implies play = yes on all four rows, so:
R1 = if outlook = overcast, then play = yes

Example 9 (cont'd)
R2 = if humidity = normal and windy = false, then play = yes

Example 9 (cont'd)
R3 = …   R = R1 ∪ R2 ∪ R3 ∪ …
Application of Covering Algorithms
They are used to derive classification rules for diagnosing illness, business planning, banking, and government.
In machine learning they suit tasks such as text classification, but for photos rule-based classification is difficult. And so on.
Application to E-learning / M-learning
Adaptive and personalized learning materials; virtual group classification.
[Figure: initial learner's information -> classification of learning styles (rule-based algorithm, or similarity/Bayesian methods from Chapter 2 or 3) -> provide adaptive and personalized materials -> collect learning styles -> feedback]
Discussion