Page 1: Linear Programming

Linear Programming

As Used for Discriminant Analysis

Page 2: Linear Programming

McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved


Objectives
• Maximize minimum distance from critical value
• Minimize sum of deviations from critical value
– Simple
– Direct
– Free of statistical assumptions
– Flexible

Page 3: Linear Programming


Requirements to use LP

• LP modeling skills

• Commercial software

Page 4: Linear Programming


Linear Discriminant Analysis

• Separate data into groups such that
– Minimize distance within group
– Maximize distance to other groups
• Can have:
– Binary (2 groups)
– Multiple categories (more than 2 groups)

Page 5: Linear Programming


Minimize Sum of Deviations (MSD)

Minimize α1 + … + αn

Subject to:

A11 x1 + … + A1r xr ≤ b + α1 for A1 in B,

…………

An1 x1 + … + Anr xr ≥ b − αn for An in G,

α1, … , αn ≥ 0
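A minimal Python sketch of how the MSD objective can be evaluated for one fixed candidate weight vector and cutoff. For each constraint it computes the smallest nonnegative deviation that satisfies it, then sums the deviations. It does not solve the LP itself; the function names, data, and weights below are hypothetical.

```python
def msd_deviations(bad, good, x, b):
    """Smallest deviations that satisfy the MSD constraints for fixed x and b."""
    def score(row):
        # Linear score A_i X for one observation.
        return sum(a * w for a, w in zip(row, x))

    # Bad observations should score at or below b; any excess is the deviation.
    alphas_bad = [max(0.0, score(row) - b) for row in bad]
    # Good observations should score at or above b; any shortfall is the deviation.
    alphas_good = [max(0.0, b - score(row)) for row in good]
    return alphas_bad, alphas_good


def msd_objective(bad, good, x, b):
    """Sum of all deviations, i.e. the MSD objective at this candidate."""
    alphas_bad, alphas_good = msd_deviations(bad, good, x, b)
    return sum(alphas_bad) + sum(alphas_good)


# Hypothetical overlapping data: two bad and two good observations.
bad_rows = [[1, 2], [4, 4]]
good_rows = [[5, 6], [2, 1]]
candidate_x = [1, 1]
cutoff_b = 5

print(msd_deviations(bad_rows, good_rows, candidate_x, cutoff_b))
print(msd_objective(bad_rows, good_rows, candidate_x, cutoff_b))
```

An LP solver would search over x for the weights that make this sum as small as possible; the sketch only scores one candidate.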

Page 6: Linear Programming


Maximize Minimum Distance (MMD)

Maximize β1 + … + βn

Subject to:

A11 x1 + … + A1r xr ≥ b − β1 for A1 in B,

…………

An1 x1 + … + Anr xr ≤ b + βn for An in G,

β1, …, βn ≥ 0

Page 7: Linear Programming


Example

Minimize α1 + α2

Subject to:

6 x1 + 8 x2 ≤ b + α1 for A1 in B,

15 x1 + 31 x2 ≥ b − α2 for A2 in G,

α1, α2 ≥ 0

Use b = 9

Optimal solution: x1* = 0, x2* = 0.290323

A1 X* = 2.35 < 9 so BAD; A2 X* = 9.00001 > 9 so GOOD
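The classification step in this example can be sketched in Python as follows. The weights and cutoff are the ones printed on the slide; the function names and the rule that a score exactly equal to b counts as GOOD are assumptions. With these rounded weights the first score evaluates to roughly 2.32 rather than the 2.35 printed on the slide; the classification is unaffected.

```python
B_CUTOFF = 9.0
X_STAR = (0.0, 0.290323)  # optimal weights as printed on the slide


def score(row, weights=X_STAR):
    """Linear score A_i X for one observation."""
    return sum(a * w for a, w in zip(row, weights))


def classify(row, weights=X_STAR, b=B_CUTOFF):
    """Label an observation by comparing its score with the cutoff b."""
    # Assumption: a score exactly equal to b is treated as GOOD.
    return "GOOD" if score(row, weights) >= b else "BAD"


observations = {"A1": (6, 8), "A2": (15, 31)}
for name, row in observations.items():
    print(name, round(score(row), 6), classify(row))
```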

Page 8: Linear Programming


Perfect Separation

[Figure: scores plotted against the cutoff AX* = 9. The Bad observation scores 2.345840; the Good observation scores 9.000013.]

Page 9: Linear Programming


Overlapping Data

Page 10: Linear Programming


Three-Class Linear Discriminant Analysis

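[Figure: three classes C1, C2, C3 along the axis X, with lower and upper bounds bL1 and bU1, bL2 and bU2, bL3 and bU3; one point is labeled a1.]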

Page 11: Linear Programming


MCLP Classification

• Two or more criteria

• Create deviational variables for each

Function_a + d_a⁻ − d_a⁺ = Target_a

Objective: Min weighted sum of deviations

IDEAL POINT: all desired deviations = 0
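The deviational-variable idea above can be sketched in Python for fixed function values. Each gap between a function value and its target is split into a nonnegative shortfall and a nonnegative excess, and the weighted sum of those deviations is the quantity to be minimized. The function names, weights, and numbers are hypothetical.

```python
def deviations(value, target):
    """Split the gap into (d_minus, d_plus) so value + d_minus - d_plus == target."""
    d_minus = max(0.0, target - value)  # shortfall below the target
    d_plus = max(0.0, value - target)   # excess above the target
    return d_minus, d_plus


def weighted_deviation(values, targets, w_minus, w_plus):
    """Weighted sum of deviations over all criteria."""
    total = 0.0
    for v, t, wm, wp in zip(values, targets, w_minus, w_plus):
        d_minus, d_plus = deviations(v, t)
        total += wm * d_minus + wp * d_plus
    return total


# Two hypothetical criteria, both with target 10.
values = [7, 12]
targets = [10, 10]
print(deviations(7, 10))
print(deviations(12, 10))
print(weighted_deviation(values, targets, w_minus=[1, 1], w_plus=[0, 2]))
```

Setting a weight to zero marks that direction of deviation as acceptable; at the ideal point every penalized deviation is zero.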

Page 12: Linear Programming


Fuzzy LP Classification
• Not all data precise
• Fuzzy concept:
– Membership function 0 ≤ MF ≤ 1
– Can have MF for any number of states
– 50 degrees
   • Cold MF might be 0.7
   • Warm MF might be 0.4
   • Hot MF might be 0
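A small Python sketch of membership functions for the temperature example. The piecewise-linear shapes and every breakpoint (35, 85, 30, 80, 130, 70, 100) are assumptions, chosen only so that 50 degrees gives the memberships quoted above; the slide does not specify them.

```python
def ramp_down(t, full, zero):
    """Membership 1 at or below `full`, falling linearly to 0 at `zero`."""
    if t <= full:
        return 1.0
    if t >= zero:
        return 0.0
    return (zero - t) / (zero - full)


def ramp_up(t, zero, full):
    """Membership 0 at or below `zero`, rising linearly to 1 at `full`."""
    if t <= zero:
        return 0.0
    if t >= full:
        return 1.0
    return (t - zero) / (full - zero)


def memberships(t):
    """Membership of temperature t in each state; values need not sum to 1."""
    return {
        "cold": ramp_down(t, 35, 85),
        # Triangular shape: rises from 30 to 80, then falls back to 0 at 130.
        "warm": min(ramp_up(t, 30, 80), ramp_down(t, 80, 130)),
        "hot": ramp_up(t, 70, 100),
    }


print(memberships(50))
```

Note that one temperature can belong partly to several states at once, which is the point of the fuzzy concept.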

Page 13: Linear Programming


Fuzzy MOLP
• Discriminate to various classes available

X-axis is alpha; Y-axis is beta

Page 14: Linear Programming


Real Application: Credit Card
• Outcomes
– Bankruptcy
– Good

• Scoring techniques

1. Behavior Score

2. Credit Bureau Scores

3. Proprietary Bankruptcy Score

4. Set Enumeration Decision Tree

Page 15: Linear Programming


Real Application – Credit Card
• LP an alternative to these scoring methods
• Classify cardholders in terms of payment
• Common variables:
– Balance
– Purchase
– Payment
– Cash advance
– State of residence
– Job security

Page 16: Linear Programming


Real Application – Credit Card
• FDR model
– 38 original variables over 7 months
– 65 derived variables generated
• Separation criteria:
– Information value – mean difference/STD
– Concordance
– Kolmogorov-Smirnov (best)
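As an illustration of the Kolmogorov-Smirnov criterion, the sketch below computes the two-sample KS statistic as the largest gap between the empirical cumulative distributions of the Good and Bad scores. The function name and the score lists are hypothetical; this is the generic statistic, not necessarily the exact procedure used in the FDR study.

```python
def ks_statistic(good_scores, bad_scores):
    """Largest gap between the empirical CDFs of the two score samples."""
    cutoffs = sorted(set(good_scores) | set(bad_scores))
    largest_gap = 0.0
    for c in cutoffs:
        # Fraction of each group scoring at or below the cutoff c.
        cdf_good = sum(s <= c for s in good_scores) / len(good_scores)
        cdf_bad = sum(s <= c for s in bad_scores) / len(bad_scores)
        largest_gap = max(largest_gap, abs(cdf_bad - cdf_good))
    return largest_gap


# Hypothetical scores: one bad account scores as high as a good one.
good = [7, 8, 9, 10]
bad = [1, 2, 3, 8]
print(ks_statistic(good, bad))
```

A value of 1 means the two groups are perfectly separated by some cutoff; a value of 0 means their score distributions coincide.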

Page 17: Linear Programming


Real Application – Credit Cards
• Sampled 6,000 records
• 2-class output
• 65 attributes
• 50 LP solutions computed
– Varied fuzzy parameters, setoff limits
– Used 1000, 3000, 6000 records
– Compared with decision tree, neural network model
– MCLP best at not calling actual bad cases good
• But this was on a small test set
– Fuzzy LP best on large test set

