Linear Programming As Used for Discriminant Analysis


Page 1: Linear Programming

Linear Programming

As Used for Discriminant Analysis

Page 2: Linear Programming

McGraw-Hill/Irwin ©2007 The McGraw-Hill Companies, Inc. All rights reserved


Objectives

• Maximize minimum distance from critical value

• Minimize sum of deviations from critical value
  – Simple
  – Direct
  – Free of statistical assumptions
  – Flexible

Page 3: Linear Programming


Requirements to use LP

• LP modeling skills

• Commercial software

Page 4: Linear Programming


Linear Discriminant Analysis

• Separate data into groups such that
  – Minimize distance within group
  – Maximize distance to other groups

• Can have:
  – Binary (2 groups)
  – Multiple categories (more than 2 groups)

Page 5: Linear Programming


Minimize Sum of Deviations (MSD)

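Minimize $\alpha_1 + \dots + \alpha_n$

Subject to:

$A_{11} x_1 + \dots + A_{1r} x_r \le b + \alpha_1$ for $A_1$ in B,

…………

$A_{n1} x_1 + \dots + A_{nr} x_r \ge b - \alpha_n$ for $A_n$ in G,

$\alpha_1, \dots, \alpha_n \ge 0$

Here B is the Bad group, G is the Good group, b is the critical value, and $\alpha_i$ is the amount by which observation $A_i$ falls on the wrong side of b.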
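As a rough illustration, the MSD model can be handed to a general LP solver. The sketch below assumes SciPy's `linprog` is available and reads the constraints as "bad scores at most b + α_i, good scores at least b − α_i". The function name `solve_msd`, the four sample observations, and the cutoff b = 5 are hypothetical and not taken from the slides.

```python
import numpy as np
from scipy.optimize import linprog


def solve_msd(bad, good, b):
    """Minimize the sum of deviations alpha_i for a fixed cutoff b.

    bad, good: arrays of shape (n_bad, r) and (n_good, r).
    Returns (x, alpha): attribute weights and per-observation deviations.
    """
    bad = np.asarray(bad, dtype=float)
    good = np.asarray(good, dtype=float)
    n_bad, r = bad.shape
    n_good = good.shape[0]
    n = n_bad + n_good

    # Decision vector: [x_1..x_r, alpha_1..alpha_n]; only alphas cost anything.
    c = np.concatenate([np.zeros(r), np.ones(n)])

    # Bad rows:  A_i x - alpha_i <= b
    # Good rows: A_i x + alpha_i >= b, rewritten as -A_i x - alpha_i <= -b
    A_ub = np.zeros((n, r + n))
    A_ub[:n_bad, :r] = bad
    A_ub[n_bad:, :r] = -good
    A_ub[np.arange(n), r + np.arange(n)] = -1.0
    b_ub = np.concatenate([np.full(n_bad, b), np.full(n_good, -b)])

    # Weights x are free in sign; deviations are non-negative.
    bounds = [(None, None)] * r + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:r], res.x[r:]


# Hypothetical, cleanly separable data with two attributes.
bad = [[1, 2], [2, 1]]
good = [[4, 5], [5, 4]]
x, alpha = solve_msd(bad, good, b=5)
print("weights:", x, "total deviation:", alpha.sum())
```

When the groups can be separated, the total deviation comes out as zero; overlapping data would leave some α_i positive.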

Page 6: Linear Programming


Maximize Minimum Distance (MMD)

Maximize $\beta_1 + \dots + \beta_n$

Subject to:

$A_{11} x_1 + \dots + A_{1r} x_r \le b - \beta_1$ for $A_1$ in B,

…………

$A_{n1} x_1 + \dots + A_{nr} x_r \ge b + \beta_n$ for $A_n$ in G,

$\beta_1, \dots, \beta_n \ge 0$
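The MMD model can be sketched the same way. The version below is only one plausible reading: it assumes bad scores must sit at least β_i below b and good scores at least β_i above it, and it adds a box bound on the weights (`x_bound`, an assumption not on the slide) because otherwise the distances could be inflated without limit by scaling x. It uses SciPy's `linprog`; the function name and data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def solve_mmd(bad, good, b, x_bound=1.0):
    """Maximize the sum of distances beta_i from the cutoff b.

    Assumes the data are separable; otherwise the LP is infeasible.
    """
    bad = np.asarray(bad, dtype=float)
    good = np.asarray(good, dtype=float)
    n_bad, r = bad.shape
    n_good = good.shape[0]
    n = n_bad + n_good

    # linprog minimizes, so maximize sum(beta) by minimizing -sum(beta).
    c = np.concatenate([np.zeros(r), -np.ones(n)])

    # Bad rows:  A_i x + beta_i <= b
    # Good rows: A_i x - beta_i >= b, rewritten as -A_i x + beta_i <= -b
    A_ub = np.zeros((n, r + n))
    A_ub[:n_bad, :r] = bad
    A_ub[n_bad:, :r] = -good
    A_ub[np.arange(n), r + np.arange(n)] = 1.0
    b_ub = np.concatenate([np.full(n_bad, b), np.full(n_good, -b)])

    # Assumed normalization: each weight is kept inside [-x_bound, x_bound].
    bounds = [(-x_bound, x_bound)] * r + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:r], res.x[r:]


# Hypothetical, cleanly separable data with two attributes.
bad = [[1, 2], [2, 1]]
good = [[4, 5], [5, 4]]
x, beta = solve_mmd(bad, good, b=5)
print("weights:", x, "total distance:", beta.sum())
```

The choice of normalization matters: a different bound on x changes the reported distances, so the β values are only comparable under the same bound.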

Page 7: Linear Programming


Example

Minimize $\alpha_1 + \alpha_2$

Subject to:

$6 x_1 + 8 x_2 \le b + \alpha_1$ for $A_1$ in B,

$15 x_1 + 31 x_2 \ge b - \alpha_2$ for $A_2$ in G,

$\alpha_1, \alpha_2 \ge 0$

Use b = 9

Optimal solution: $x_1^* = 0$, $x_2^* = 0.290323$

$A_1 X^* = 2.35 < 9$ so BAD; $A_2 X^* = 9.00001 > 9$ so GOOD
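The quoted solution can be checked by plugging it back into the constraints. The short script below is a sketch in plain Python; the helper name `score` is an assumption. With the coefficients as printed, the score of A1 evaluates to about 2.32 rather than the 2.35 quoted, which does not change the classification because both values are far below b = 9.

```python
b = 9.0
x = (0.0, 0.290323)   # optimal weights quoted on the slide
a1 = (6.0, 8.0)       # observation in B (bad)
a2 = (15.0, 31.0)     # observation in G (good)


def score(a, x):
    """Weighted sum A_i X for one observation."""
    return sum(ai * xi for ai, xi in zip(a, x))


s1, s2 = score(a1, x), score(a2, x)

# Deviations are only needed when an observation is on the wrong side of b.
alpha1 = max(0.0, s1 - b)   # bad observation scoring above the cutoff
alpha2 = max(0.0, b - s2)   # good observation scoring below the cutoff

print(f"score A1 = {s1:.6f} ({'BAD' if s1 < b else 'GOOD'})")
print(f"score A2 = {s2:.6f} ({'GOOD' if s2 > b else 'BAD'})")
print(f"sum of deviations = {alpha1 + alpha2}")
```

Both deviations are zero, so the objective value is zero and the two observations are perfectly separated.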

Page 8: Linear Programming


Perfect Separation

[Figure: scores plotted against the boundary AX* = 9. The Bad observation scores 2.345840 and the Good observation scores 9.000013.]

Page 9: Linear Programming


Overlapping Data

Page 10: Linear Programming


Three-Class Linear Discriminant Analysis

[Figure: classes C1, C2, C3, each with a lower and an upper boundary (bL1, bU1), (bL2, bU2), (bL3, bU3); other labels: a1, X.]

Page 11: Linear Programming


MCLP Classification

• Two or more criteria

• Create deviational variables for each criterion:

$\text{Function}_a + d_a^- - d_a^+ = \text{Target}_a$

Objective: Min weighted sum of deviations

IDEAL POINT: all desired deviations = 0
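To make the deviational variables concrete, the sketch below computes them for criteria whose function values are already known. In the actual LP they are decision variables chosen by the solver; here they are simply read off from value and target. All names, weights, and numbers are hypothetical.

```python
def deviations(value, target):
    """Return (d_minus, d_plus) so that value + d_minus - d_plus == target."""
    d_minus = max(0.0, target - value)   # shortfall below the target
    d_plus = max(0.0, value - target)    # excess above the target
    return d_minus, d_plus


def weighted_deviation(values, targets, w_minus, w_plus):
    """Weighted sum of deviations across all criteria."""
    total = 0.0
    for v, t, wm, wp in zip(values, targets, w_minus, w_plus):
        d_minus, d_plus = deviations(v, t)
        total += wm * d_minus + wp * d_plus
    return total


# Two hypothetical criteria, both with target 10.
values = [7.0, 12.0]
targets = [10.0, 10.0]
# Penalize shortfalls on both criteria, but overshoot only on the first.
w_minus = [1.0, 1.0]
w_plus = [1.0, 0.0]

print(deviations(7.0, 10.0))
print(weighted_deviation(values, targets, w_minus, w_plus))
```

Setting a weight to zero marks that deviation as acceptable, so the ideal point is reached when every penalized deviation is zero.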

Page 12: Linear Programming


Fuzzy LP Classification

• Not all data precise

• Fuzzy concept:
  – Membership function 0 ≤ MF ≤ 1
  – Can have MF for any number of states
  – 50 degrees:
    • Cold MF might be 0.7
    • Warm MF might be 0.4
    • Hot MF might be 0
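A membership function can be sketched as a simple piecewise-linear curve. The breakpoints below are assumptions chosen only so that 50 degrees reproduces the memberships quoted above (cold 0.7, warm 0.4, hot 0); the slides do not give the actual shapes.

```python
def falling(x, full, zero):
    """Membership 1 at or below `full`, 0 at or above `zero`, linear between."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def rising(x, zero, full):
    """Membership 0 at or below `zero`, 1 at or above `full`, linear between."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def triangular(x, left, peak, right):
    """Rises from `left` to `peak`, then falls back to 0 at `right`."""
    return min(rising(x, left, peak), falling(x, peak, right))


def memberships(temp):
    """Degree of membership of a temperature in each state (assumed shapes)."""
    return {
        "cold": falling(temp, 35, 85),
        "warm": triangular(temp, 40, 65, 90),
        "hot": rising(temp, 70, 95),
    }


print(memberships(50))
```

Note that the memberships need not sum to 1: a single temperature can belong partly to several states at once.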

Page 13: Linear Programming


Fuzzy MOLP

• Discriminate to various classes available

[Figure: x-axis is alpha; y-axis is beta]

Page 14: Linear Programming


Real Application: Credit Card

• Outcomes
  – Bankruptcy
  – Good

• Scoring techniques

1. Behavior Score

2. Credit Bureau Scores

3. Proprietary Bankruptcy Score

4. Set Enumeration Decision Tree

Page 15: Linear Programming


Real Application – Credit Card

• LP an alternative to these scoring methods
• Classify cardholders in terms of payment
• Common variables:
  – Balance
  – Purchase
  – Payment
  – Cash advance
  – State of residence
  – Job security

Page 16: Linear Programming


Real Application – Credit Card

• FDR model
  – 38 original variables over 7 months
  – 65 derived variables generated

• Separation criteria:
  – Information value – mean difference/STD
  – Concordance
  – Kolmogorov-Smirnov (best)
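For readers unfamiliar with the Kolmogorov-Smirnov criterion, it measures the largest gap between the cumulative score distributions of the two groups. The function below is a minimal sketch in plain Python with made-up scores; it is not the procedure used in the FDR study.

```python
def ks_statistic(good_scores, bad_scores):
    """Largest gap between the two empirical cumulative distributions."""
    cutoffs = sorted(set(good_scores) | set(bad_scores))
    largest_gap = 0.0
    for c in cutoffs:
        # Fraction of each group scoring at or below the cutoff c.
        f_good = sum(s <= c for s in good_scores) / len(good_scores)
        f_bad = sum(s <= c for s in bad_scores) / len(bad_scores)
        largest_gap = max(largest_gap, abs(f_bad - f_good))
    return largest_gap


# Hypothetical scores: higher means more likely to be a good account.
print(ks_statistic([5, 6, 7], [1, 2, 3]))   # fully separated groups
print(ks_statistic([2, 4], [1, 3]))         # interleaved groups
```

A value of 1 means some cutoff splits the groups perfectly, while a value near 0 means the scores barely distinguish them.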

Page 17: Linear Programming


Real Application – Credit Cards

• Sampled 6,000 records
• 2-class output
• 65 attributes
• 50 LP solutions computed
  – Varied fuzzy parameters, setoff limits
  – Used 1000, 3000, 6000 records
  – Compared with decision tree, neural network model
  – MCLP best at not calling actual bad cases good
    • But this was on a small test set
  – Fuzzy LP best on large test set