36
Copyright © 2011, SAS Institute Inc. All rights reserved. Logistic Regression: What is it and What can I learn from it? Melodie Rush Senior Systems Engineer

Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

Copyright © 2011, SAS Institute Inc. All rights reserved.

Logistic Regression: What is it and What can I learn from it? Melodie Rush Senior Systems Engineer

Page 2: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

2

Copyright © 2011, SAS Institute Inc. All rights reserved.

Agenda

Why would you use it?

Goal

Application

What is Logistic Regression?

Examples

Data layout

Simple

Multiple

Page 3: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

3

Copyright © 2011, SAS Institute Inc. All rights reserved.

What is our goal?

Page 4: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

4

Copyright © 2011, SAS Institute Inc. All rights reserved.

Common Applications

Target Marketing

Attrition Prediction

Credit Scoring

Fraud Detection

Customer Satisfaction

Page 5: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

5

Copyright © 2011, SAS Institute Inc. All rights reserved.

Good or No Good?

Page 6: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

6

Copyright © 2011, SAS Institute Inc. All rights reserved.

What is Logistic Regression?

Logistic Regression is essentially a regression model tailored to fit a categorical dependent variable.

Page 7: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

8

Copyright © 2011, SAS Institute Inc. All rights reserved.

Response Analysis

Continuous

Categorical

Linear

Regression

Analysis

Logistic

Regression

Analysis

Page 8: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

10

Copyright © 2011, SAS Institute Inc. All rights reserved.

Types of Logistic Regression

Response Variable

• Binary • Yes, No

• 0, 1

• Good, Bad

Two Categories

• Nominal • Region

• Ordinal • Age Group

Three or more

Categories

Type of Logistic Regression

Binary

Nominal

Ordinal

Page 9: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

11

Copyright © 2011, SAS Institute Inc. All rights reserved.

Why not use Regression (OLS)?

Biggest issue is that the predicted values will take on values that have no meaning to your response

Added mathematical inconvenience of not being able to assume normality and constant variance with the response variable that has only 2 values

Page 10: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

12

Copyright © 2011, SAS Institute Inc. All rights reserved.

Logistic Regression Model

logit(pi) = β0 + β 1X1i+…+ β kxki

Where

logit (pi)=logit of the probability of the event

β 0= intercept of the regression equation

β k = parameter estimate of the kth predictor variable

logit(pi) =log(pi / (1-pi))

Page 11: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

14

Copyright © 2011, SAS Institute Inc. All rights reserved.

Mason Crosby’s Career Field Goal Statistics

Page 12: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

15

Copyright © 2011, SAS Institute Inc. All rights reserved.

Mason Crosby’s Career Field Goal Statistics

Page 13: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

16

Copyright © 2011, SAS Institute Inc. All rights reserved.

What might determine a successful field goal?

Page 14: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

18

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Data for Simple Model Continuous Predictor

Y = FGM (Field Goals Made)

X = Dist (Distance)

Page 15: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

19

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC syntax

Page 16: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

20

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Code for Simple Model Continuous Predictor

PROC LOGISTIC DATA=WORK.Crosby_FG;

MODEL FGM (Event = '1')=Dist/

RUN;

Page 17: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

21

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Output for Simple Model Continuous Predictor

Page 18: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

22

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Output for Simple Model Continuous Predictor

Page 19: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

23

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Output for Simple Model Continuous Predictor

Is the model any good?

• Counting concordant, discordant,

and tied pairs is a way to assess

how well the model predicts its

own data and therefore how well

the model fits

• In general, you want a high

percentage of concordant pairs

and low percentages of

discordant and tied pairs

Closer the area under the curve is to

1 the better the model, the closer to

0.5 the worse the model.

Page 20: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

25

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Data for Simple Model Categorical Predictor

Y = FGM (Field Goals Made)

X = Dist_grp (Distance Grouped)

Page 21: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

26

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Code for Simple Model – Categorical Predictor

Create Categorical Variable (CASE

WHEN t1.Dist <= 29 THEN ‘1. < 29 yards’

WHEN t1.Dist >= 30 AND t1.Dist <= 39 THEN ‘2. 30-39 yards’

WHEN t1.Dist >= 40 AND t1.Dist <= 49 THEN ‘3. 40-49 yards’

WHEN t1.Dist >= 50 THEN ‘4. >= 50 yards’

ELSE t1.Dist

END)

LABEL="Distance Grouped" AS Dist_Grp

Page 22: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

27

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Code for Simple Model Categorical Predictor

PROC LOGISTIC DATA=WORK.Crosby_FG;

CLASS Dist_Grp(PARAM=EFFECT);

MODEL FGM (Event = '1')=Dist_Grp;

RUN;

Page 23: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

28

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Output for Simple Model Categorical Predictor

Page 24: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

29

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Code for Simple Model Categorical Predictor

Page 25: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

30

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Output for Simple Model Categorical Predictor

Is the model any good?

Better or worse than the

Continuous Model?

Page 26: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

31

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Data for Multiple Model

Y = FGM (Field Goals Made)

X = Dist (Distance)

Year, Quarter, Win or Loss, Home or Away

Page 27: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

32

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Code for Multiple Model

PROC LOGISTIC DATA=WORK.Crosby_FG;

CLASS Year Away_Game Quarter Win_or_Loss;

MODEL FGM (Event = '1')=Dist Year Away_Game Quarter Win_or_Loss ;

RUN;

Page 28: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

33

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Output for Multiple Model

Page 29: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

34

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Code for Multiple Model

Page 30: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

35

Copyright © 2011, SAS Institute Inc. All rights reserved.

PROC LOGISTIC Output for Multiple Model

Is the model any good?

Better or worse than the

Simple Models?

Page 31: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

36

Copyright © 2011, SAS Institute Inc. All rights reserved.

Stepwise Options

Forward

Backward

Stepwise

Page 32: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

38

Copyright © 2011, SAS Institute Inc. All rights reserved.

Challenges

Missing Value

Errors and Outliers

Massive Data size

Operational vs. observational

Page 33: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

39

Copyright © 2011, SAS Institute Inc. All rights reserved.

Page 34: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

40

Copyright © 2011, SAS Institute Inc. All rights reserved.

Resources Public SAS Courses

Statistics 1: Introduction to ANOVA, Regression, and Logistic Regression

Predictive Modeling Using Logistic Regression

Categorical Data Analysis Using Logistic Regression

Books

Logistic Regression Using SAS Theory and Application, Second Edition by Paul D Allison

Online Tutorials

Logistic Regression in SAS Enterprise Guide Example 1

Logistic Regression in SAS Enterprise Guide Example 2

Page 35: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

41

Copyright © 2011, SAS Institute Inc. All rights reserved.

Page 36: Logistic Regression: What is it and What can I learn from it? · 23 Copyright © 2011, SAS Institute Inc. All rights reserved. PROC LOGISTIC Output for Simple Model Continuous Predictor

Copyright © 2011, SAS Institute Inc. All rights reserved.

Thank you for using SAS!