Logistic Regression
Saed Sayad
1www.ismartsoft.com
Definition
Logistic Regression is a type of regression model where the dependent variable (target) has just two values, such as:
• 0, 1
• Y, N
• F, T
Sample Dataset
Months in Business   Balance     Default
189                  $429,916    0
170                  $240,319    1
166                  $231,327    0
423                  $196,105    0
145                  $193,907    1
60                   $190,944    0
97                   $184,333    0
354                  $152,126    0
99                   $151,061    1
80                   $135,885    0
25                   $119,751    1
118                  $116,578    1
74                   $123,864    0
...                  ...         ...
Linear Regression (Continuous Dependent Variable)
[Figure: Balance (y-axis) plotted against Months in Business (x-axis)]
Linear Regression (Binary Dependent Variable)
[Figure: Default (y-axis) plotted against Months in Business (x-axis)]
Linear Regression Model – Binary Target
• If the actual Y is a binary variable, the predicted Y can be less than 0 or greater than 1.
• If the actual Y is a binary variable, the error is not normally distributed.
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
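The first point can be checked with a minimal sketch that fits a straight line to a 0/1 target by ordinary least squares. The counts are taken from the Frequency Table slide later in this deck. Placing every business at its bin midpoint, and using 25 and 325 months for the two open-ended bins, is an assumption made only for illustration, as are the names `GROUPS` and `predict`.

```python
# (assumed bin midpoint in months, number of businesses, number of defaults).
# Counts are from the Frequency Table slide; the midpoints are an assumption.
GROUPS = [(25, 4, 0), (75, 12, 1), (125, 4, 1), (175, 4, 2),
          (225, 4, 3), (275, 1, 1), (325, 4, 4)]

# Expand the grouped counts into one (x, y) pair per business, y in {0, 1}.
points = []
for x, n, d in GROUPS:
    points += [(x, 1)] * d + [(x, 0)] * (n - d)

# Closed-form least-squares estimates for Y = b0 + b1 * X.
count = len(points)
mean_x = sum(x for x, _ in points) / count
mean_y = sum(y for _, y in points) / count
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
sxx = sum((x - mean_x) ** 2 for x, _ in points)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x


def predict(x):
    """Predicted Y from the fitted straight line."""
    return b0 + b1 * x


for months in (25, 175, 325):
    print(months, round(predict(months), 3))
```

Under these assumptions the fitted line gives a negative value at 25 months and a value above 1 at 325 months, so it cannot be read as a probability at either end.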
Linear Regression Model
[Figure: linear regression line, Y (axis marked at 0 and 1) plotted against X]
Frequency Table
Months in Business   Count   Default Count   Default Frequency
<50                  4       0               0
50-100               12      1               0.083
100-150              4       1               0.25
150-200              4       2               0.5
200-250              4       3               0.75
250-300              1       1               1
>300                 4       4               1
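The Default Frequency column is the default count divided by the bin count. A short sketch of that arithmetic, using the counts from the table (the variable names are illustrative):

```python
# (bin label, number of businesses, number of defaults) as listed in the table.
rows = [("<50", 4, 0), ("50-100", 12, 1), ("100-150", 4, 1), ("150-200", 4, 2),
        ("200-250", 4, 3), ("250-300", 1, 1), (">300", 4, 4)]

# Default frequency = defaults / businesses in the bin.
frequencies = {label: defaults / count for label, count, defaults in rows}

for label, freq in frequencies.items():
    print(f"{label:>8}  {freq:.3f}")
```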
Frequency Plot
[Figure: Default Probability (y-axis) plotted against Months in Business - Bins (x-axis)]
Logistic Function
$f(z) = \frac{1}{1 + e^{-z}}$
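A minimal sketch of the logistic function in Python. The two-branch form is an implementation choice, not something from the slides: it avoids overflow in `math.exp` when z is a large negative number.

```python
import math


def logistic(z):
    """f(z) = 1 / (1 + e^(-z)), evaluated without overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)          # safe here because z < 0
    return ez / (1.0 + ez)


for z in (-10, -2, 0, 2, 10):
    print(z, round(logistic(z), 4))
```

Every output lies between 0 and 1, f(0) is 0.5, and f(-z) = 1 - f(z).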
Logistic Regression
The logistic distribution constrains the estimated probabilities to lie between 0 and 1.
Maximum Likelihood Estimation is a statistical method for estimating the coefficients of a model.
$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$
Logistic Regression Model
[Figure: Linear Model and Logistic Model curves, Y (axis marked at 0 and 1) plotted against X]
Maximum Likelihood Estimation (MLE)
• MLE maximizes the log likelihood (LL), which reflects how likely the observed values of the dependent variable are, given the independent variables and the coefficients.
• MLE is an iterative algorithm that starts with arbitrary initial values for the coefficients.
• After this initial function is estimated, the coefficients are updated and the process is repeated until LL no longer changes significantly.
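The iteration described above can be sketched as follows. The slides do not say which update rule is used; Newton-Raphson is swapped in here as one common choice. The counts come from the Frequency Table slide, while the bin midpoints (25 and 325 for the open-ended bins) and all function names are assumptions for illustration.

```python
import math

# (assumed bin midpoint in months, number of businesses, number of defaults).
# Counts are from the Frequency Table slide; the midpoints are an assumption.
GROUPS = [(25, 4, 0), (75, 12, 1), (125, 4, 1), (175, 4, 2),
          (225, 4, 3), (275, 1, 1), (325, 4, 4)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1):
    """LL = sum over bins of d*ln(p) + (n - d)*ln(1 - p)."""
    ll = 0.0
    for x, n, d in GROUPS:
        p = sigmoid(b0 + b1 * x)
        ll += d * math.log(p) + (n - d) * math.log(1.0 - p)
    return ll


def fit(tol=1e-8, max_iter=50):
    """Newton-Raphson updates, repeated until LL stops changing."""
    b0, b1 = 0.0, 0.0                      # arbitrary starting values
    history = [log_likelihood(b0, b1)]
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, d in GROUPS:
            p = sigmoid(b0 + b1 * x)
            r = d - n * p                  # observed minus expected defaults
            w = n * p * (1.0 - p)          # binomial variance weight
            g0 += r                        # gradient of LL
            g1 += r * x
            h00 += w                       # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information matrix) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
        history.append(log_likelihood(b0, b1))
        if abs(history[-1] - history[-2]) < tol:
            break
    return b0, b1, history


b0, b1, history = fit()
for step, ll in enumerate(history):
    print(step, round(ll, 4))
print("b0 =", round(b0, 4), "b1 =", round(b1, 5))
```

Printing `history` shows LL rising from its value at the starting coefficients and then levelling off, which is the stopping condition the last bullet describes.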
Copyright iSmartsoft Inc. 2008
Log Likelihood (LL)
• Likelihood is the probability of the observed values of the dependent variable, given the independent variables and the model coefficients.
• LL is calculated through iteration, using maximum likelihood estimation (MLE).
• Log likelihood is the basis for tests of a logistic model.
Log Likelihood Test (-2LL)
• The log likelihood test assesses the significance of the difference between -2LL for the reduced (simpler) model and -2LL for the full model.
• This difference is called the "model chi-square".
• It is also called the Likelihood Ratio test.
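As a hedged illustration, the sketch below compares an intercept-only model with a model that adds Months in Business. The counts are from the Frequency Table slide. The bin midpoints, the Newton-Raphson fit, and the choice of an intercept-only reduced model are assumptions, not taken from the slides. With one extra coefficient the statistic is referred to a chi-square distribution with 1 degree of freedom, whose upper tail equals `erfc(sqrt(x / 2))`.

```python
import math

# (assumed bin midpoint in months, number of businesses, number of defaults).
# Counts are from the Frequency Table slide; the midpoints are an assumption.
GROUPS = [(25, 4, 0), (75, 12, 1), (125, 4, 1), (175, 4, 2),
          (225, 4, 3), (275, 1, 1), (325, 4, 4)]


def log_likelihood(b0, b1):
    ll = 0.0
    for x, n, d in GROUPS:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        ll += d * math.log(p) + (n - d) * math.log(1.0 - p)
    return ll


def fit_full(iterations=25):
    """Newton-Raphson estimate of (b0, b1) for the full model."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, d in GROUPS:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r, w = d - n * p, n * p * (1.0 - p)
            g0, g1 = g0 + r, g1 + r * x
            h00, h01, h11 = h00 + w, h01 + w * x, h11 + w * x * x
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
    return b0, b1


# Reduced model: intercept only, so the fitted probability is the overall rate.
total = sum(n for _, n, _ in GROUPS)
defaults = sum(d for _, _, d in GROUPS)
p0 = defaults / total
ll_reduced = defaults * math.log(p0) + (total - defaults) * math.log(1.0 - p0)

ll_full = log_likelihood(*fit_full())

# Model chi-square: (-2LL of reduced model) - (-2LL of full model).
chi_sq = (-2.0 * ll_reduced) - (-2.0 * ll_full)
p_value = math.erfc(math.sqrt(chi_sq / 2.0))   # chi-square upper tail, 1 df

print("-2LL reduced:", round(-2.0 * ll_reduced, 3))
print("-2LL full:   ", round(-2.0 * ll_full, 3))
print("model chi-square:", round(chi_sq, 3), "p-value:", p_value)
```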
Wald Test
• A Wald test is used to test the statistical significance of each coefficient ($\beta$) in the model.
• A Wald test calculates a Z statistic, which is: $Z = \hat{\beta} / SE$
• This Z value is then squared, yielding a Wald statistic with a chi-square distribution.
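The calculation can be sketched in a few lines. The coefficient and standard error passed in below are hypothetical placeholder values, not estimates from the deck's data, and `wald_test` is an illustrative name. The p-value uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math


def wald_test(beta_hat, se):
    """Return (Z, Wald statistic, p-value) for a single coefficient."""
    z = beta_hat / se                              # Z = beta_hat / SE
    wald = z * z                                   # chi-square with 1 df
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # P(chi-square_1 > wald)
    return z, wald, p_value


# Hypothetical inputs, chosen only to show the arithmetic.
z, wald, p_value = wald_test(beta_hat=0.02, se=0.008)
print("Z =", round(z, 3), "Wald =", round(wald, 3), "p =", round(p_value, 4))
```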
Summary
• Logistic Regression is a classification method.
• It returns the probability that the binary dependent variable may be predicted from the independent variables.
• Maximum Likelihood Estimation is a statistical method for estimating the coefficients of the model.
• The Likelihood Ratio test is used to test the statistical significance between the full model and the simpler model.
• The Wald test is used to test the statistical significance of each coefficient in the model.
Questions?