Fuzzy association rules pre final

Use of fuzzy logic to identify associations

Fuzzy Association Rules

Aswin – 215 111 074
Deepak – 215 111 049

Content

• Introduction
– Data Mining
– Association Rules
– Fuzzy logic
– Applications

• Procedure
– Support and confidence
– Steps

• Example: Risk analysis
– Fundamentals: Conditional probability
– Example analysis

Data Mining

Data -> Information -> Knowledge

• KDD (Knowledge Discovery in Databases)

• Extraction of knowledge from huge amounts of data

• This knowledge is
– implicit,
– previously unknown
– and potentially useful

Association rules

• Item sets (Z:C)

• Antecedent (X:A)

• Consequent (Y:B)

• Ex: If X is A, then Y is B

Application

• Strategic Decision Making

• Marketing Strategy formulation

• Predictive analytics:
– CRM
– Machine Maintenance
– Employee Relations

• Artificial Intelligence: Video games, Robots

• Machines: Air conditioner, Washing machines, ABS

What is likely to happen, if so...

IF – THEN
Antecedent – Consequent

Procedure

• Two factors: Support and Confidence

Trans_ID  Bread  Butter  Biscuit  Milk
1         1      1       0        1
2         0      1       1        0
3         1      0       1        0
4         1      1       0        1
5         1      1       1        0
6         1      1       1        1

Procedure

• Hypothesis : Customers who buy bread and butter, also buy milk.

• Support = Desired Outcome / Total Opportunities

• Support = 3/6 = 0.5

Procedure

• Customers who buy bread and butter, also buy milk.

• Confidence = Desired Outcome / Desired Opportunities

• Confidence = 3/4 = 0.75

Inference

• Hypothesis becomes rule: Customers who buy bread and butter also buy milk.

• With 75% confidence and 50% support from past transaction records
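The two figures above can be reproduced directly from the transaction table. The following is a minimal sketch; the names `transactions`, `count`, `support` and `confidence` are illustrative and not part of the slides.

```python
# Transactions from the slide's table: 1 = item bought, 0 = not bought.
transactions = [
    {"Bread": 1, "Butter": 1, "Biscuit": 0, "Milk": 1},  # Trans_ID 1
    {"Bread": 0, "Butter": 1, "Biscuit": 1, "Milk": 0},  # Trans_ID 2
    {"Bread": 1, "Butter": 0, "Biscuit": 1, "Milk": 0},  # Trans_ID 3
    {"Bread": 1, "Butter": 1, "Biscuit": 0, "Milk": 1},  # Trans_ID 4
    {"Bread": 1, "Butter": 1, "Biscuit": 1, "Milk": 0},  # Trans_ID 5
    {"Bread": 1, "Butter": 1, "Biscuit": 1, "Milk": 1},  # Trans_ID 6
]


def count(items):
    """Number of transactions that contain every item in `items`."""
    return sum(all(row[item] for item in items) for row in transactions)


def support(items):
    """Desired outcome / total opportunities."""
    return count(items) / len(transactions)


def confidence(antecedent, consequent):
    """Desired outcome / transactions where the antecedent holds."""
    return count(antecedent + consequent) / count(antecedent)


rule_support = support(["Bread", "Butter", "Milk"])
rule_confidence = confidence(["Bread", "Butter"], ["Milk"])
print(rule_support, rule_confidence)  # 0.5 0.75
```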

Procedure

Fuzzification
• Continuous to discrete data

Analysis
• Threshold

Defuzzification
• State rule with confidence and support
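As an illustration of the fuzzification step, the sketch below maps a continuous age to degrees of membership in three levels. It assumes shoulder and triangular membership functions; the breakpoints 25, 35 and 45 are hypothetical and are not taken from the slides.

```python
def left_shoulder(x, full, zero):
    """Membership is 1 up to `full` and falls linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def right_shoulder(x, zero, full):
    """Membership is 0 up to `zero` and rises linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def triangle(x, left, peak, right):
    """Membership rises from `left` to `peak`, then falls to 0 at `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify_age(age):
    """Degrees of membership in Young / Middle / Old (hypothetical breakpoints)."""
    return {
        "Young": left_shoulder(age, 25, 35),
        "Middle": triangle(age, 25, 35, 45),
        "Old": right_shoulder(age, 35, 45),
    }


print(fuzzify_age(20))  # {'Young': 1.0, 'Middle': 0.0, 'Old': 0.0}
```

With these assumed breakpoints, an age between two levels belongs partly to both, for example age 28 is mostly Young and partly Middle.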

Risk analysis for issuing loan

Bank customer Data set

RISK = ASSETS – DEBT – WANTS

Case  Age  Income  Risk     Credit  Result
1     20   52,623  -38,954  red     0
2     26   23,047  -23,636  green   1
3     46   56,810  45,669   green   1
4     31   38,388  -7,968   amber   1
5     28   80,019  -35,125  green   1
6     21   74,561  -47,592  green   1
7     46   65,341  58,119   green   1
8     25   46,504  -30,022  green   1
9     38   65,735  30,571   green   1
10    27   26,047  -6       red     1

Bank’s weight for each attribute and condition for analysis

Attribute  Weight
Credit     0.800
Risk       0.700
Income     0.550
Age        0.450
Result     0.691

Objective: Provide a confidence/risk factor for the bank to issue loans to customers

Function            Percentage
Minimum Support     25
Minimum Confidence  90

Membership Function

Fuzzification

Attribute  Level    Representation  Weight  Membership value  Support (Rjk)
Age        Young    R11             0.450   0.580             0.261
Age        Middle   R12             0.450   0.300             0.135
Age        Old      R13             0.450   0.131             0.059
Income     High     R21             0.550   0.000             0.000
Income     Middle   R22             0.550   0.890             0.490
Income     Low      R23             0.550   0.109             0.060
Risk       High     R31             0.700   0.457             0.320
Risk       Middle   R32             0.700   0.208             0.146
Risk       Low      R33             0.700   0.332             0.233
Credit     Good     R41             0.800   0.720             0.576
Credit     Bad      R42             0.800   0.280             0.224
Result     On Time  R51             0.691   0.930             0.643
Result     Default  R52             0.691   0.069             0.048
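The Support (Rjk) column appears to be the attribute weight multiplied by the membership value. That reading is inferred from the numbers in the table, not stated on the slide. The sketch below recomputes the column under that assumption and keeps the items at or above the 25% minimum support; since the membership values are rounded, a product can differ from the slide in the third decimal (R33 gives 0.232 rather than 0.233).

```python
MINSUPP = 0.25  # minimum support of 25%

# Bank's weight per attribute.
weights = {"Age": 0.450, "Income": 0.550, "Risk": 0.700,
           "Credit": 0.800, "Result": 0.691}

# Representation -> (attribute, level, membership value), from the table.
membership = {
    "R11": ("Age", "Young", 0.580),
    "R12": ("Age", "Middle", 0.300),
    "R13": ("Age", "Old", 0.131),
    "R21": ("Income", "High", 0.000),
    "R22": ("Income", "Middle", 0.890),
    "R23": ("Income", "Low", 0.109),
    "R31": ("Risk", "High", 0.457),
    "R32": ("Risk", "Middle", 0.208),
    "R33": ("Risk", "Low", 0.332),
    "R41": ("Credit", "Good", 0.720),
    "R42": ("Credit", "Bad", 0.280),
    "R51": ("Result", "On Time", 0.930),
    "R52": ("Result", "Default", 0.069),
}

# Assumed relationship: support = weight * membership value.
support = {code: weights[attr] * mu
           for code, (attr, _level, mu) in membership.items()}

# Items that reach the minimum support.
frequent_items = sorted(code for code, s in support.items() if s >= MINSUPP)
print(frequent_items)  # ['R11', 'R22', 'R31', 'R41', 'R51']
```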

Item set

• C = complete sets, individual items

• L = set of items above minimum support, grouped items

• minsupp = 0.25

• Conditional probability = support

Apriori Algorithm

• START

• Compute conditional probability of each element in set C

• Eliminate items < minsupp to form set L

• Is L empty?
– NO: nCr to form new set C, then repeat
– YES: STOP

C1 -> L1 -> C2

C1   Support
R11  0.261
R12  0.135
R13  0.059
R21  0.000
R22  0.490
R23  0.060
R31  0.320
R32  0.146
R33  0.233
R41  0.576
R42  0.224
R51  0.643
R52  0.048

L1
R11, R22, R31, R41, R51

C2
(R11, R22)
(R11, R31)
(R11, R41)
(R11, R51)
(R22, R31)
(R22, R41)
(R22, R51)
(R31, R41)
(R31, R51)
(R41, R51)

C2 -> L2 -> C3

C2          Support
(R11, R22)  0.235
(R11, R31)  0.207
(R11, R41)  0.212
(R11, R51)  0.230
(R22, R31)  0.237
(R22, R41)  0.419
(R22, R51)  0.449
(R31, R41)  0.266
(R31, R51)  0.264
(R41, R51)  0.560

L2
(R22, R41)
(R22, R51)
(R31, R41)
(R31, R51)
(R41, R51)

C3
(R22, R41, R51)
(R22, R31, R41)
(R22, R51, R31)
(R31, R41, R51)

C3 -> L3 -> C4

C3               Support
(R22, R41, R51)  0.417
(R22, R31, R41)  0.198
(R22, R51, R31)  0.196
(R31, R41, R51)  0.264

L3
(R22, R41, R51)
(R31, R41, R51)

C4
(R22, R31, R41, R51)

C4 -> L4

C4                    Support
(R22, R31, R41, R51)  0.1957

L4
Empty (0.1957 < minsupp 0.25): STOP
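The C -> L -> C passes can be traced with a short loop. This is a sketch that looks supports up from the slide values rather than computing them, because the slides do not state how the support of a combined item set is obtained. Candidate generation follows the flowchart's nCr step (all combinations of the surviving items), which is simpler than the subset-pruning join of textbook Apriori. The names `support` and `level_wise` are illustrative.

```python
from itertools import combinations

MINSUPP = 0.25

# Supports copied from the C1..C4 tables; keys are sorted tuples of items.
support = {
    ("R11",): 0.261, ("R12",): 0.135, ("R13",): 0.059, ("R21",): 0.000,
    ("R22",): 0.490, ("R23",): 0.060, ("R31",): 0.320, ("R32",): 0.146,
    ("R33",): 0.233, ("R41",): 0.576, ("R42",): 0.224, ("R51",): 0.643,
    ("R52",): 0.048,
    ("R11", "R22"): 0.235, ("R11", "R31"): 0.207, ("R11", "R41"): 0.212,
    ("R11", "R51"): 0.230, ("R22", "R31"): 0.237, ("R22", "R41"): 0.419,
    ("R22", "R51"): 0.449, ("R31", "R41"): 0.266, ("R31", "R51"): 0.264,
    ("R41", "R51"): 0.560,
    ("R22", "R41", "R51"): 0.417, ("R22", "R31", "R41"): 0.198,
    ("R22", "R31", "R51"): 0.196, ("R31", "R41", "R51"): 0.264,
    ("R22", "R31", "R41", "R51"): 0.1957,
}


def level_wise(support, minsupp):
    """Return [L1, L2, ...], stopping at the first level that comes up empty."""
    candidates = sorted(key for key in support if len(key) == 1)  # C1
    levels = []
    size = 1
    while candidates:
        frequent = [c for c in candidates if support[c] >= minsupp]
        if not frequent:
            break  # L is empty: STOP
        levels.append(frequent)
        # nCr step: combine every item that survived this level.
        items = sorted({item for itemset in frequent for item in itemset})
        size += 1
        candidates = list(combinations(items, size))
    return levels


levels = level_wise(support, MINSUPP)
for k, frequent in enumerate(levels, start=1):
    print(f"L{k}:", frequent)
```

The loop ends after L3 because the only C4 candidate has support 0.1957, below the 0.25 threshold.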

Possible Associations

Items from L3    Associations
(R22, R41, R51)  R22, R41->R51
                 R22, R51->R41
                 R51, R41->R22
(R31, R41, R51)  R31, R41->R51
                 R31, R51->R41
                 R51, R41->R31

Items from L2    Associations
(R22, R41)       R22->R41
                 R41->R22
(R22, R51)       R22->R51
                 R51->R22
(R31, R41)       R31->R41
                 R41->R31
(R31, R51)       R31->R51
                 R51->R31
(R41, R51)       R41->R51
                 R51->R41

Confidence of each association

Items from L3    Associations   Confidence
(R22, R41, R51)  R22, R41->R51  0.995
                 R22, R51->R41  0.928
                 R51, R41->R22  0.744
(R31, R41, R51)  R31, R41->R51  0.993
                 R31, R51->R41  1.000
                 R51, R41->R31  0.472

Items from L2    Associations   Confidence
(R22, R41)       R22->R41       0.855
                 R41->R22       0.727
(R22, R51)       R22->R51       0.916
                 R51->R22       0.697
(R31, R41)       R31->R41       0.831
                 R41->R31       0.462
(R31, R51)       R31->R51       0.825
                 R51->R31       0.410
(R41, R51)       R41->R51       0.972
                 R51->R41       0.870
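The confidence values are consistent with dividing the support of the whole item set by the support of the antecedent, for example 0.417 / 0.419 for R22, R41->R51. The sketch below applies that ratio to the supports from the C1 to C3 tables; `fs` and `confidence` are illustrative helper names. Because the supports are rounded to three decimals, a recomputed value can differ from the table in the last digit.

```python
def fs(*items):
    """Shorthand for an order-independent item set."""
    return frozenset(items)


# Supports of the frequent item sets, copied from the C1, C2 and C3 tables.
support = {
    fs("R22"): 0.490, fs("R31"): 0.320, fs("R41"): 0.576, fs("R51"): 0.643,
    fs("R22", "R41"): 0.419, fs("R22", "R51"): 0.449,
    fs("R31", "R41"): 0.266, fs("R31", "R51"): 0.264,
    fs("R41", "R51"): 0.560,
    fs("R22", "R41", "R51"): 0.417, fs("R31", "R41", "R51"): 0.264,
}


def confidence(antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    a = frozenset(antecedent)
    return support[a | frozenset(consequent)] / support[a]


print(round(confidence(["R22", "R41"], ["R51"]), 3))  # 0.995
print(round(confidence(["R41"], ["R51"]), 3))         # 0.972
```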

Associations meeting minconf

Items from L3    Associations   Confidence
(R22, R41, R51)  R22, R41->R51  0.995
                 R22, R51->R41  0.928
(R31, R41, R51)  R31, R41->R51  0.993
                 R31, R51->R41  1.000

Items from L2    Associations   Confidence
(R22, R51)       R22->R51       0.916
(R41, R51)       R41->R51       0.972

Confident Associations that meet the objective of the analysis

Items from L3    Associations   Confidence
(R22, R41, R51)  R22, R41->R51  0.995
(R31, R41, R51)  R31, R41->R51  0.993

Items from L2    Associations   Confidence
(R22, R51)       R22->R51       0.916
(R41, R51)       R41->R51       0.972

Defuzzification

• If Income is middle, then payment will be received on time: R22->R51 (91.6%)

• If Credit is good, then payment will be received on time: R41->R51 (97.2%)

• If Income is middle and Credit is good, then payment will be received on time: R41, R22->R51 (99.5%)

• If Risk is high and Credit is good, then payment will be received on time: R31, R41->R51 (99.25%)
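The selection behind these four rules can be sketched as a filter: keep associations with confidence of at least 90% whose consequent is R51 (Result = On Time), then state each one in words. The rule list reuses the confidence values from the tables; the `labels` wording and variable names are illustrative.

```python
MINCONF = 0.90
TARGET = "R51"  # Result = On Time, the outcome the bank cares about

labels = {
    "R22": "Income is middle",
    "R31": "Risk is high",
    "R41": "Credit is good",
    "R51": "payment will be received on time",
}

# (antecedent, consequent, confidence) for every candidate association.
rules = [
    (("R22", "R41"), "R51", 0.995),
    (("R22", "R51"), "R41", 0.928),
    (("R41", "R51"), "R22", 0.744),
    (("R31", "R41"), "R51", 0.993),
    (("R31", "R51"), "R41", 1.000),
    (("R41", "R51"), "R31", 0.472),
    (("R22",), "R41", 0.855),
    (("R41",), "R22", 0.727),
    (("R22",), "R51", 0.916),
    (("R51",), "R22", 0.697),
    (("R31",), "R41", 0.831),
    (("R41",), "R31", 0.462),
    (("R31",), "R51", 0.825),
    (("R51",), "R31", 0.410),  # reverse direction of the R31 -> R51 rule
    (("R41",), "R51", 0.972),
    (("R51",), "R41", 0.870),
]

# Keep rules that meet minconf and predict the target outcome.
selected = [r for r in rules if r[2] >= MINCONF and r[1] == TARGET]

for antecedent, consequent, conf in selected:
    condition = " and ".join(labels[a] for a in antecedent)
    print(f"If {condition}, then {labels[consequent]} ({conf:.1%})")
```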

Conclusion

References

Thank you
