Linear Programming

As Used for Discriminant Analysis

McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved

Objectives

• Maximize minimum distance from critical value

• Minimize sum of deviations from critical value
– Simple
– Direct
– Free of statistical assumptions
– Flexible


Requirements to use LP

• LP modeling skills

• Commercial software


Linear Discriminant Analysis

• Separate data into groups so as to:
– Minimize distance within each group
– Maximize distance to other groups

• Can have:
– Binary (2 groups)
– Multiple categories (more than 2 groups)


Minimize Sum of Deviations (MSD)

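Minimize α1 + … + αn

Subject to:

A11 x1 + … + A1r xr ≤ b + α1 for A1 in B,

…………

An1 x1 + … + Anr xr ≥ b − αn for An in G,

α1, …, αn ≥ 0,

where Ai is the attribute vector of observation i, x1, …, xr are the attribute weights, b is the critical value, αi is the amount by which observation i may fall on the wrong side of b, and B and G are the Bad and Good groups.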
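The MSD model above can be handed to any LP solver once it is written in matrix form. The sketch below is one possible way to assemble that form in plain Python. The function name `build_msd_lp` and the variable ordering are illustrative assumptions, not part of the slides, and the slides do not say whether the weights x are sign-restricted, so that choice is left to the caller.

```python
def build_msd_lp(bad, good, b):
    """Return (c, A_ub, b_ub) for: minimize c.z subject to A_ub z <= b_ub.

    Variable order (an assumption of this sketch):
    z = [x1..xr, alpha1..alphan], with the bad observations listed first.
    """
    rows = [(a, +1) for a in bad] + [(a, -1) for a in good]
    n, r = len(rows), len(rows[0][0])
    c = [0.0] * r + [1.0] * n           # objective counts only the alphas
    A_ub, b_ub = [], []
    for i, (a, sign) in enumerate(rows):
        slack = [0.0] * n
        slack[i] = -1.0                 # coefficient of alpha_i
        # Bad:   a.x - alpha_i <=  b
        # Good: -a.x - alpha_i <= -b    (same as a.x >= b - alpha_i)
        A_ub.append([sign * v for v in a] + slack)
        b_ub.append(sign * b)
    return c, A_ub, b_ub


# Two-observation illustration: one bad case and one good case, b = 9.
c, A_ub, b_ub = build_msd_lp(bad=[[6, 8]], good=[[15, 31]], b=9)
print(c)
print(A_ub)
print(b_ub)
```

The three lists could then be passed to a solver such as `scipy.optimize.linprog`, with bounds that keep every alpha non-negative.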


Maximize Minimum Distance (MMD)

Maximize β1 + … + βn

Subject to:

A11 x1 + … + A1r xr ≥ b − β1 for A1 in B,

…………

An1 x1 + … + Anr xr ≤ b + βn for An in G,

β1, …, βn ≥ 0


Example

Minimize α1 + α2

Subject to:

6 x1 + 8 x2 ≤ b + α1 for A1 in B,

15 x1 + 31 x2 ≥ b − α2 for A2 in G,

α1, α2 ≥ 0

Use b = 9

Optimal solution: x1* = 0, x2* = 0.290323

A1 X* = 2.35 < 9, so BAD; A2 X* = 9.00001 > 9, so GOOD
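As a quick check of the example, the short script below recomputes the two scores from the quoted weights and classifies each observation against b = 9. It is only a sketch: the helper name `score` is made up for illustration. Recomputed this way, the A1 score comes out near 2.32 rather than the quoted 2.35, which may reflect rounding in the reported weights. The classification is unaffected.

```python
def score(a, x):
    """Weighted score A_i X for one observation."""
    return sum(ai * xi for ai, xi in zip(a, x))


b = 9                       # critical value from the example
x_star = [0.0, 0.290323]    # optimal weights quoted in the example
A1, A2 = [6, 8], [15, 31]   # A1 is in B (bad), A2 is in G (good)

s1, s2 = score(A1, x_star), score(A2, x_star)
alpha1 = max(0.0, s1 - b)   # bad case: deviation only if the score exceeds b
alpha2 = max(0.0, b - s2)   # good case: deviation only if the score falls below b

for name, s in (("A1", s1), ("A2", s2)):
    print(name, round(s, 6), "GOOD" if s >= b else "BAD")
print("sum of deviations:", alpha1 + alpha2)
```

Both deviations are zero at these weights, so the two observations are separated without any overlap.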


Perfect Separation

[Figure: Bad and Good observations separated at AX* = 9; the Bad score is 2.345840 and the Good score is 9.000013]


Overlapping Data


Three-Class Linear Discriminant Analysis

[Figure: three classes C1, C2 and C3, with lower and upper boundaries bL1/bU1, bL2/bU2 and bL3/bU3; other labels: a1, X]


MCLP Classification

• Two or more criteria

• Create deviational variables for each criterion a:

Function_a + d_a⁻ − d_a⁺ = Target_a

Objective: Min weighted sum of deviations

IDEAL POINT: all desired deviations = 0
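The deviational-variable idea can be illustrated numerically. In the sketch below, each criterion's achieved value is compared with its target and split into an under-achievement d_minus and an over-achievement d_plus. The function names, weights and numbers are all invented for illustration. In a real MCLP model the deviations are decision variables chosen by the solver, not computed after the fact.

```python
def deviations(value, target):
    """Return (d_minus, d_plus), both >= 0, such that
    value + d_minus - d_plus == target."""
    gap = target - value
    return max(0.0, gap), max(0.0, -gap)


def weighted_deviation(values, targets, w_minus, w_plus):
    """Weighted sum of deviations over all criteria."""
    total = 0.0
    for v, t, wm, wp in zip(values, targets, w_minus, w_plus):
        d_minus, d_plus = deviations(v, t)
        total += wm * d_minus + wp * d_plus
    return total


# Two hypothetical criteria, both with target 10.
# Criterion 1 penalizes only falling short; criterion 2 penalizes overshoot twice as much.
result = weighted_deviation(values=[8, 12], targets=[10, 10],
                            w_minus=[1, 1], w_plus=[0, 2])
print(result)
```

When every value equals its target, all deviations are zero and the weighted sum is 0, which corresponds to the ideal point.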


Fuzzy LP Classification

• Not all data precise

• Fuzzy concept:
– Membership function 0 ≤ MF ≤ 1
– Can have MF for any number of states
– Example: 50 degrees
  • Cold MF might be 0.7
  • Warm MF might be 0.4
  • Hot MF might be 0
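To make the membership-function idea concrete, the sketch below defines three piecewise-linear functions for temperature. The breakpoints are hypothetical, chosen only so that 50 degrees yields the memberships quoted above; the slides do not specify any particular shape.

```python
def clamp01(v):
    """Restrict a value to the interval [0, 1]."""
    return max(0.0, min(1.0, v))


# Hypothetical membership functions; all breakpoints are illustrative.
def cold(t):
    return clamp01((85 - t) / 50)    # 1 at or below 35 degrees, 0 at or above 85


def warm(t):
    # rises from 30 to 80 degrees, then falls back to 0 at 100
    return clamp01(min((t - 30) / 50, (100 - t) / 20))


def hot(t):
    return clamp01((t - 70) / 30)    # 0 at or below 70 degrees, 1 at or above 100


t = 50
memberships = {"cold": cold(t), "warm": warm(t), "hot": hot(t)}
print(memberships)
```

Note that the memberships need not sum to 1: a reading of 50 degrees can be partly cold and partly warm at the same time.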


Fuzzy MOLP

• Discriminate to various classes available

X-axis is alpha; Y-axis is beta


Real Application: Credit Card

• Outcomes
– Bankruptcy
– Good

• Scoring techniques

1. Behavior Score

2. Credit Bureau Scores

3. Proprietary Bankruptcy Score

4. Set Enumeration Decision Tree


Real Application – Credit Card

• LP an alternative to these scoring methods

• Classify cardholders in terms of payment

• Common variables:
– Balance
– Purchase
– Payment
– Cash advance
– State of residence
– Job security


Real Application – Credit Card

• FDR model
– 38 original variables over 7 months
– 65 derived variables generated

• Separation criteria:
– Information value – mean difference/STD
– Concordance
– Kolmogorov-Smirnov (best)
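For readers unfamiliar with the Kolmogorov-Smirnov criterion, the sketch below computes the standard two-sample KS statistic: the largest vertical gap between the empirical cumulative distributions of the good and bad scores. The scores are made-up illustration data, not FDR figures, and the slides do not say exactly how the statistic was computed in the study.

```python
def ks_statistic(good_scores, bad_scores):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two score samples."""
    thresholds = sorted(set(good_scores) | set(bad_scores))
    best = 0.0
    for t in thresholds:
        cdf_good = sum(s <= t for s in good_scores) / len(good_scores)
        cdf_bad = sum(s <= t for s in bad_scores) / len(bad_scores)
        best = max(best, abs(cdf_good - cdf_bad))
    return best


# Made-up scores: bad accounts mostly score low, good accounts mostly high.
good = [7, 8, 9, 10]
bad = [2, 3, 4, 8]
ks = ks_statistic(good, bad)
print(ks)
```

A value of 1 means the two groups' scores do not overlap at all, while a value of 0 means the two score distributions are identical.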


Real Application – Credit Cards

• Sampled 6,000 records

• 2-class output

• 65 attributes

• 50 LP solutions computed
– Varied fuzzy parameters, setoff limits
– Used 1,000, 3,000 and 6,000 records
– Compared with decision tree, neural network model
– MCLP best at not calling actual bad cases good
  • But this was on a small test set
– Fuzzy LP best on large test set
