Linear Programming
As Used for Discriminant Analysis
McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved
Objectives
• Maximize minimum distance from critical value
• Minimize sum of deviations from critical value
  – Simple
  – Direct
  – Free of statistical assumptions
  – Flexible
Requirements to use LP
• LP modeling skills
• Commercial software
Linear Discriminant Analysis
• Separate data into groups so as to:
  – Minimize distance within each group
  – Maximize distance to other groups
• Can have:
  – Binary (2 groups)
  – Multiple categories (more than 2 groups)
Minimize Sum of Deviations (MSD)
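$$
\begin{aligned}
\text{Minimize} \quad & \alpha_1 + \dots + \alpha_n \\
\text{subject to} \quad & A_{11} x_1 + \dots + A_{1r} x_r \le b + \alpha_1 \quad \text{for } A_1 \in B, \\
& \qquad \vdots \\
& A_{n1} x_1 + \dots + A_{nr} x_r \ge b - \alpha_n \quad \text{for } A_n \in G, \\
& \alpha_1, \dots, \alpha_n \ge 0
\end{aligned}
$$
Here B is the Bad group, G is the Good group, and b is the critical value.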
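Maximize Minimum Distance (MMD)
$$
\begin{aligned}
\text{Maximize} \quad & \beta_1 + \dots + \beta_n \\
\text{subject to} \quad & A_{11} x_1 + \dots + A_{1r} x_r \ge b - \beta_1 \quad \text{for } A_1 \in B, \\
& \qquad \vdots \\
& A_{n1} x_1 + \dots + A_{nr} x_r \le b + \beta_n \quad \text{for } A_n \in G, \\
& \beta_1, \dots, \beta_n \ge 0
\end{aligned}
$$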
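Example
$$
\begin{aligned}
\text{Minimize} \quad & \alpha_1 + \alpha_2 \\
\text{subject to} \quad & 6 x_1 + 8 x_2 \le b + \alpha_1 \quad \text{for } A_1 \in B, \\
& 15 x_1 + 31 x_2 \ge b - \alpha_2 \quad \text{for } A_2 \in G, \\
& \alpha_1, \alpha_2 \ge 0
\end{aligned}
$$
Use b = 9
Optimal solution: x1* = 0, x2* = 0.290323
A1 X* = 2.35 < 9, so BAD; A2 X* = 9.00001 > 9, so GOOD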
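The classification step in this example can be checked with a few lines of Python. This is only a sketch: it assumes Bad records should score at or below b and Good records at or above b, and the function names (`score`, `msd_deviations`, `classify`) are hypothetical, not taken from the slides. It evaluates a given solution; it does not solve the LP itself.

```python
def score(record, weights):
    """Linear score A_i X for one record."""
    return sum(a * x for a, x in zip(record, weights))


def msd_deviations(bad, good, weights, b):
    """Deviation of each record from the cutoff b.

    Assumption: Bad records should score at most b and Good records at
    least b, so a deviation is positive only on the wrong side of b.
    """
    dev_bad = [max(0.0, score(r, weights) - b) for r in bad]
    dev_good = [max(0.0, b - score(r, weights)) for r in good]
    return dev_bad + dev_good


def classify(record, weights, b):
    """Label a record GOOD if its score reaches the cutoff, else BAD."""
    return "GOOD" if score(record, weights) >= b else "BAD"


# Data and solution from the example slide.
bad = [(6, 8)]
good = [(15, 31)]
weights = (0.0, 0.290323)
b = 9

deviations = msd_deviations(bad, good, weights, b)
total_deviation = sum(deviations)
labels = [classify(r, weights, b) for r in bad + good]
print(total_deviation, labels)
```

A total deviation of zero means no record sits on the wrong side of the cutoff, which is the perfect-separation case.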
Perfect Separation
[Figure: scores plotted against the boundary AX* = 9, with 2.345840 on the Bad side and 9.000013 on the Good side.]
Overlapping Data
Three-Class Linear Discriminant Analysis
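[Figure: three classes C1, C2 and C3, each with a lower and an upper boundary (bL1, bU1; bL2, bU2; bL3, bU3). Other labels in the figure: a1, X.]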
MCLP Classification
• Two or more criteria
• Create deviational variables for each criterion a: $\text{Function}_a + d_a^- - d_a^+ = \text{Target}_a$
Objective: Min weighted sum of deviations
IDEAL POINT: all desired deviations = 0
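To make the deviational variables concrete, here is a minimal Python sketch. It assumes the function values are already known, whereas in the actual LP the deviations are decision variables chosen by the solver. The names `deviations` and `weighted_deviation`, and all numbers, are hypothetical.

```python
def deviations(value, target):
    """Return (d_minus, d_plus) with value + d_minus - d_plus == target."""
    d_minus = max(0.0, target - value)  # shortfall below the target
    d_plus = max(0.0, value - target)   # overshoot above the target
    return d_minus, d_plus


def weighted_deviation(values, targets, weights):
    """Weighted sum of deviations; weights are (w_minus, w_plus) pairs."""
    total = 0.0
    for value, target, (w_minus, w_plus) in zip(values, targets, weights):
        d_minus, d_plus = deviations(value, target)
        total += w_minus * d_minus + w_plus * d_plus
    return total


# Hypothetical two-criterion case.
values = [8.0, 12.0]
targets = [10.0, 10.0]
# Penalise shortfall on criterion 1 and overshoot on criterion 2.
weights = [(1.0, 0.0), (0.0, 2.0)]

total = weighted_deviation(values, targets, weights)
at_ideal_point = weighted_deviation(targets, targets, weights)
print(total, at_ideal_point)
```

Setting a weight to zero marks that deviation as not undesired; the ideal point is the case where every penalised deviation is zero.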
Fuzzy LP Classification
• Not all data precise
• Fuzzy concept:
  – Membership function 0 ≤ MF ≤ 1
  – Can have MF for any number of states
  – Example: 50 degrees
    • Cold MF might be 0.7
    • Warm MF might be 0.4
    • Hot MF might be 0
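The 50-degree example can be sketched with piecewise-linear membership functions in Python. The breakpoints below (35, 85, 30, 80, 110, 70, 100) are hypothetical, picked only so that 50 degrees reproduces the memberships on the slide; the slides do not specify the shape of the functions.

```python
def falling(x, full, zero):
    """Membership 1 at or below `full`, 0 at or above `zero`, linear between."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising(x, zero, full):
    """Membership 0 at or below `zero`, 1 at or above `full`, linear between."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def memberships(temp):
    """Degree to which a temperature is cold, warm and hot (assumed shapes)."""
    return {
        "cold": falling(temp, 35, 85),
        # Warm rises to a peak at 80, then falls off again.
        "warm": min(rising(temp, 30, 80), falling(temp, 80, 110)),
        "hot": rising(temp, 70, 100),
    }


m = memberships(50)
print(m)
```

Note that the memberships need not sum to 1: a reading can belong partly to several states at once.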
Fuzzy MOLP
• Discriminate among the various classes available
X-axis is alpha; Y-axis is beta
Real Application: Credit Card
• Outcomes
  – Bankruptcy
  – Good
• Scoring techniques
1. Behavior Score
2. Credit Bureau Scores
3. Proprietary Bankruptcy Score
4. Set Enumeration Decision Tree
Real Application – Credit Card
• LP an alternative to these scoring methods
• Classify cardholders in terms of payment
• Common variables:
  – Balance
  – Purchase
  – Payment
  – Cash advance
  – State of residence
  – Job security
Real Application – Credit Card
• FDR model
  – 38 original variables over 7 months
  – 65 derived variables generated
• Separation criteria:
  – Information value – mean difference/STD
  – Concordance
  – Kolmogorov-Smirnov (best)
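As an illustration of the Kolmogorov-Smirnov criterion, the Python sketch below computes the largest gap between the cumulative score distributions of the two groups. The scores are made up for the example and the function name `ks_statistic` is hypothetical; it is not the procedure used in the FDR model.

```python
def ks_statistic(bad_scores, good_scores):
    """Largest gap between the two empirical cumulative distributions."""

    def cdf(sample, s):
        # Fraction of the sample scoring at or below s.
        return sum(1 for v in sample if v <= s) / len(sample)

    cut_points = sorted(set(bad_scores) | set(good_scores))
    return max(abs(cdf(bad_scores, s) - cdf(good_scores, s)) for s in cut_points)


# Hypothetical scores: the two groups overlap on 3 and 4.
bad_scores = [1, 2, 3, 4]
good_scores = [3, 4, 5, 6]

ks = ks_statistic(bad_scores, good_scores)
print(ks)
```

A value of 1 means some cutoff separates the groups completely; a value of 0 means the score distributions are identical.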
Real Application – Credit Cards
• Sampled 6,000 records
• 2-class output
• 65 attributes
• 50 LP solutions computed
  – Varied fuzzy parameters, setoff limits
  – Used 1000, 3000, 6000 records
  – Compared with decision tree, neural network model
  – MCLP best at not calling actual bad cases good
    • But this was on a small test set
  – Fuzzy LP best on large test set
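The measure behind "not calling actual bad cases good" can be sketched in Python as the share of actual bad cases that a model labels good. The labels below are invented for illustration and are not results from the study; `bad_called_good_rate` is a hypothetical name.

```python
def bad_called_good_rate(actual, predicted):
    """Share of actual bad cases that the model labelled good."""
    bad_predictions = [p for a, p in zip(actual, predicted) if a == "bad"]
    if not bad_predictions:
        raise ValueError("no actual bad cases in the sample")
    missed = sum(1 for p in bad_predictions if p == "good")
    return missed / len(bad_predictions)


# Hypothetical test set: four actual bad cases, two actual good cases.
actual = ["bad", "bad", "bad", "bad", "good", "good"]
predicted = ["bad", "good", "bad", "bad", "good", "bad"]

rate = bad_called_good_rate(actual, predicted)
print(rate)
```

Lower is better on this measure, and with only a handful of bad cases in a test set the rate can shift a lot from one sample to the next.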