PUBL0055: Introduction to Quantitative Methods
Lecture 10: Binary Dependent Variable Models
Jack Blumenau and Benjamin Lauderdale
Example: Representation in Parliament
E-petitions and MP support

A crucial question in political science is whether representatives are responsive to their constituents. We will examine this question by looking at signatures to an e-petition, and seeing whether MPs who received lots of signatures were more likely to support the petition in parliamentary debate.

• $Y$: MP supported (1) or opposed (0) the petition in debate
• $X_1$: Number of petition signatures from the MP's constituency
• $X_2$: The party of the MP: Conservative (1) or Labour (0)

(To read the excellent paper on which this example is based, see http://www.jackblumenau.com/papers/petitions.pdf)
• The dependent variable has only 2 values, coded as 1 and 0
  • $Y = 1$ if the MP supported the petition in a speech
  • $Y = 0$ if the MP did not support the petition in a speech

Supported Petition   Frequency   Percent
Supported                   26        51
Opposed                     25        49
Total                       51       100
Binary dependent variables
Binary variables are those with two categories

• $Y = 1$ if something is "true", or occurred
• $Y = 0$ if something is "not true", or did not occur

Examples of binary response variables

• Survey questions: yes/no; agree/disagree
• In politics: vote/do not vote
• In medicine: have/do not have a certain condition
• In education: correct/incorrect; graduate/do not graduate; pass/fail

We have used binary variables as explanatory variables ($X$) in previous weeks. This week we focus on models for binary dependent variables ($Y$).
Why do we need a new regression model?

• A regression for this dependent variable would be useful to describe how explanatory variables predict MP support for the petition
• Why can't we just run a linear regression?
Outline
The Linear Probability Model
The Binary Logistic Regression Model
Interpretation
Predicted Probabilities
Inference
The Linear Probability Model
Linear Probability Model
The linear regression for binary outcome variables is known as the linear probability model:

Linear Probability Model

$$E[Y \mid X_1, X_2, \ldots, X_k] = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$
$$\Pr(Y = 1 \mid X_1, X_2, \ldots) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$

Advantages:

• We can use a well-known model for a new class of phenomena
• Easy to interpret the marginal effects of the $X$ variables

Disadvantages:

• The linear model assumes a continuous dependent variable; if the dependent variable is binary, we run into problems.
Linear Probability Model: Advantages
lpm
Linear Probability Model: Disadvantages

Problems with Linear Probability Model (I)

Predictions, $\hat{Y}$, are interpreted as the probability that $Y = 1$

• $P(Y = 1) = \hat{Y} = \alpha + \beta_1 X$
• Can be above 1 if $X$ is large enough
• Can be below 0 if $X$ is small enough

[Figure: whether the MP supported the petition in parliament (y-axis, 0 or 1) against number of signatures (x-axis, 0 to 2500)]

Problem: the linear regression can predict probabilities greater than 1 and less than 0.
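A minimal sketch of this first problem, using a small made-up data set rather than the lecture's petition data. The data frame `toy`, its column names and all of its values are assumptions for illustration only. It fits a linear probability model with `lm()` and checks whether any fitted values fall outside the range from 0 to 1.

```r
# Hypothetical toy data (NOT the lecture's data): six MPs, with support
# switching from 0 to 1 as the number of signatures grows.
toy <- data.frame(
  signatures = c(0, 500, 1000, 1500, 2000, 2500),
  supported  = c(0, 0, 0, 1, 1, 1)
)

# Linear probability model: ordinary least squares on a 0/1 outcome.
lpm_toy <- lm(supported ~ signatures, data = toy)

# Fitted values are read as estimated probabilities that Y = 1.
fitted_probs <- fitted(lpm_toy)
print(round(fitted_probs, 3))

# Flag fitted values that are impossible as probabilities.
out_of_range <- fitted_probs < 0 | fitted_probs > 1
print(out_of_range)
```

In this toy example the smallest and largest values of `signatures` produce fitted values below 0 and above 1 respectively, which is the behaviour described on the slide.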
Problems with Linear Probability Model (II)

The linear function may not be appropriate

• e.g. Does an additional 1000 signatures have the same effect going from 1000 to 2000 as from 3000 to 4000?

[Figure: curve against number of signatures (x-axis, 0 to 2500), with the y-axis running from 0 to 1]

Implication: We need a model that will account for these types of non-constant effects.
The Binary Logistic Regression Model
Proportions and probabilities
• When we have a binary $Y$, we are interested in the proportion of the subjects in the population for whom $Y = 1$
• We can also think of this as the probability $\pi$ that a randomly selected member of the population will have the value $Y = 1$ rather than $Y = 0$

$$\pi = P(Y = 1)$$
$$1 - \pi = P(Y = 0)$$

• If $\pi = 0$, no unit in the population has $Y = 1$; if $\pi = 1$, every unit in the population has $Y = 1$.
• We want to model $\pi$, given one or more explanatory variables $X$.
Continuous predictor of petition support
Does petition support depend on the number of signatures an MP receives?
[Figure: whether the MP supported the petition in parliament (y-axis, 0 or 1) against number of signatures (x-axis, 0 to 2500)]
Binary predictor of petition support
Does petition support depend on the party of the MP?
              Opposed  Supported
Labour              8         20
Conservative       17          6

Table 2: Sample counts

              Opposed  Supported
Labour           0.29       0.71
Conservative     0.74       0.26

Table 3: Sample proportions
Conditional probabilities
• Consider the dummy variable $X = 1$ if an MP is a member of the Conservative Party and $X = 0$ if they are a member of the Labour Party
• We would like to estimate conditional probabilities of supporting the petition separately for these two groups:

$$\hat{P}(Y = 1 \mid X = 0) = \hat{\pi}_{X=0} = 0.71$$
$$\hat{P}(Y = 1 \mid X = 1) = \hat{\pi}_{X=1} = 0.26$$

• The estimated conditional probability ($\hat{\pi}$) of supporting the petition is higher for Labour MPs than Conservative MPs
• More generally, we would like to model how the probability $\pi = P(Y = 1)$ depends on one or more explanatory variables, which might be continuous.
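The conditional probabilities above can be recovered from the sample counts in Table 2. The following is a small sketch in base R; the object names (`counts`, `cond_props`, and so on) are chosen here for illustration and are not from the lecture.

```r
# Sample counts from Table 2 (rows: party, columns: position on the petition).
counts <- matrix(
  c( 8, 20,
    17,  6),
  nrow = 2, byrow = TRUE,
  dimnames = list(party    = c("Labour", "Conservative"),
                  position = c("Opposed", "Supported"))
)

# Row proportions: each row sums to 1, giving P(position | party).
cond_props <- prop.table(counts, margin = 1)
print(round(cond_props, 2))

# Estimated conditional probabilities of support, by party.
p_labour       <- cond_props["Labour", "Supported"]
p_conservative <- cond_props["Conservative", "Supported"]
```

Using `margin = 1` conditions on the row variable (party), which is what the notation $P(Y = 1 \mid X)$ asks for.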
How to model $\pi$?

• Linear regression model: the conditional mean is equal to a linear combination of explanatory variables:

$$E(Y_i) = \mu_i = \alpha + \beta_1 X_{1i} + \cdots$$

• Linear probability model: the conditional probability is equal to a linear combination of $X$:

$$E(Y_i \mid X_i) = P(Y_i = 1 \mid X_i) = \pi_i = \alpha + \beta_1 X_{1i} + \cdots$$

• However, we would like a way to make sure $0 \le \pi_i \le 1$
• We cannot fit a linear model for $\pi$ directly
• Instead, we build a linear model for a transformation of $\pi$
From probabilities to odds
Odds: the ratio of the probabilities of the event and the non-event:

$$\text{Odds} = \frac{P(Y = 1)}{1 - P(Y = 1)} = \frac{\pi}{1 - \pi}$$

• If the probability of supporting the petition is $\pi = 0.25$...
  • the odds of supporting the petition are 0.25/0.75 = 0.33
  • the odds of not supporting the petition are 0.75/0.25 = 3
• Odds vs. probabilities $\pi$:
  • If odds > 1 → $P(Y = 1) > P(Y = 0)$ → $\pi > 0.5$
  • If odds < 1 → $P(Y = 1) < P(Y = 0)$ → $\pi < 0.5$
  • If odds = 1 → $P(Y = 1) = P(Y = 0)$ → $\pi = 0.5$
[Figure: probability (y-axis, 0 to 1) against odds (x-axis, 0 to 100)]

Range of $\pi$ is $(0, 1)$ → Range of odds is $(0, \infty)$
Conditional odds
Conditional odds: the odds of an event, conditional on another event:
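              Opposed  Supported
Labour           0.29       0.71
Conservative     0.74       0.26

Table 4: Sample proportions

• Odds of supporting the petition if you are a Labour MP:

$$\widehat{\text{Odds}}_L = \frac{\hat{\pi}_L}{1 - \hat{\pi}_L} = \frac{0.71}{1 - 0.71} \approx 2.5$$

(The value is exactly 2.5 when computed from the counts, 20/8; the rounded proportions give about 2.45.)

• Odds of supporting the petition if you are a Conservative MP:

$$\widehat{\text{Odds}}_C = \frac{\hat{\pi}_C}{1 - \hat{\pi}_C} = \frac{0.26}{1 - 0.26} = 0.35$$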
Odds ratios
Odds ratio: the ratio of two conditional odds

• Describes the association between two variables

$$\widehat{\text{OR}}_{LC} = \frac{\widehat{\text{Odds}}_L}{\widehat{\text{Odds}}_C} = \frac{2.5}{0.35} \approx 7.08$$

(7.08 uses the unrounded odds, 2.5 and 6/17; dividing the rounded values shown gives 7.14.)

• $\widehat{\text{Odds}}_L$ is the odds that $Y = 1$ for Labour MPs
• $\widehat{\text{Odds}}_C$ is the odds that $Y = 1$ for Conservative MPs

Interpretation:

• The odds of a Labour MP supporting the petition are 7.08 times the odds of a Conservative MP supporting the petition
• This also means that the probability of supporting the petition is higher for Labour MPs than Conservative MPs
• → being a Labour MP is associated with higher odds (and probability) of supporting the petition

• In our example,
  • $Y$ = supported the petition (1 = supported, 0 = opposed)
  • $X$ = party (1 = Labour, 0 = Conservative; note that this reverses the coding used in the regression model, where Conservative = 1)
• The association is described by comparing the odds of $Y = 1$ for levels of variable $X$
  • If odds ratio = 1 → no association between $X$ and $Y$
  • If odds ratio > 1 → positive association between $X$ and $Y$
  • If odds ratio < 1 → negative association between $X$ and $Y$
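The conditional odds and the odds ratio can be checked directly from the sample counts in Table 2. This is a short sketch in base R with variable names chosen here for illustration.

```r
# Sample counts from Table 2, by party.
supported <- c(Labour = 20, Conservative = 6)
opposed   <- c(Labour = 8,  Conservative = 17)

# Odds of support = P(support) / P(oppose); the group totals cancel,
# so this reduces to a ratio of counts.
odds <- supported / opposed
print(round(odds, 2))

# Odds ratio comparing Labour MPs to Conservative MPs.
odds_ratio <- unname(odds["Labour"] / odds["Conservative"])
print(round(odds_ratio, 2))
```

Working from counts avoids the small discrepancies that appear when the odds are rounded before the ratio is taken.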
From odds to log-odds

• Recall that we need to solve the problem that
  • The linear predictor $\alpha + \beta_1 X_{1i} + \cdots$ can take values from $-\infty$ to $\infty$.
  • The probability $\pi_i$ must be between 0 and 1.
• We now have the necessary pieces to solve the problem.
• Turning $\pi_i$ into the odds expanded the range to:

$$0 < \frac{\pi}{1 - \pi} < +\infty$$

• By taking the logarithm of the odds:

$$-\infty < \log\left(\frac{\pi}{1 - \pi}\right) < +\infty$$

• This transformation is known as the logit transformation
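As a worked illustration of the chain from probability to odds to log-odds, here are three example probabilities. The natural logarithm is assumed, and values are rounded to two decimals.

```latex
\begin{align*}
\pi = 0.25: &\quad \text{Odds} = \frac{0.25}{0.75} \approx 0.33,
  &\log(\text{Odds}) &\approx -1.10 \\
\pi = 0.50: &\quad \text{Odds} = \frac{0.50}{0.50} = 1,
  &\log(\text{Odds}) &= 0 \\
\pi = 0.75: &\quad \text{Odds} = \frac{0.75}{0.25} = 3,
  &\log(\text{Odds}) &\approx 1.10
\end{align*}
```

Probabilities that are symmetric around 0.5 map to log-odds that are symmetric around 0, which is one reason the log-odds scale is convenient for a linear model.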
From probabilities to log-odds

[Figure: probability (y-axis, 0 to 1) against log-odds (x-axis, -6 to 6)]

Range of $\pi$ is $(0, 1)$ → Range of log-odds is $(-\infty, \infty)$
The Binary Logistic Regression Model
The logistic regression model, also known as the logit model, is a model for the log-odds of an outcome:

• $Y$ is a binary response variable, with values 0 and 1
• $X_1, \ldots, X_k$ are $k$ explanatory variables of any type
• Observations $Y_i$ are statistically independent of each other
• For each observation $i$, the following equation holds:

$$\log(\text{Odds}_i) = \log\left(\frac{\pi_i}{1 - \pi_i}\right) = \alpha + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}$$

where $\alpha$ and $\beta_1, \ldots, \beta_k$ are the unknown parameters of the model, to be estimated from data
Model for the probabilities

• Although the model is written first for the log-odds, it also implies a model for the probabilities, $\pi_i$:

$$\pi_i = \frac{\exp(\alpha + \beta_1 X_{1i} + \cdots + \beta_k X_{ki})}{1 + \exp(\alpha + \beta_1 X_{1i} + \cdots + \beta_k X_{ki})}$$

• This is always between 0 and 1
• The plots on the next slide give examples of

$$\pi = \frac{\exp(\alpha + \beta X)}{1 + \exp(\alpha + \beta X)}$$

for a simple logistic model with one continuous $X$
Probabilities from a logistic model

[Figure, two panels, each showing the fitted value (y-axis, 0 to 1) against X (x-axis, -4 to 4). Left panel, "Changing Alpha" (Beta = 1): curves for Alpha = 0, Alpha = 1 and Alpha = -2. Right panel, "Changing Beta" (Alpha = 0): curves for Beta = 1, Beta = 2, Beta = 0 and Beta = -1.]
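The curves described above can be evaluated numerically. The sketch below uses base R's `plogis()`, which computes $\exp(z) / (1 + \exp(z))$; the helper `logistic_prob` and the grid of `x` values are assumptions made for this illustration, while the alpha and beta values are the ones named in the two panels.

```r
# Inverse logit for a one-predictor model:
# pi = exp(alpha + beta * x) / (1 + exp(alpha + beta * x)).
logistic_prob <- function(x, alpha, beta) {
  plogis(alpha + beta * x)
}

x <- seq(-4, 4, by = 2)

# Changing alpha (beta = 1) shifts the curve left or right.
shift <- sapply(c(0, 1, -2), function(a) logistic_prob(x, alpha = a, beta = 1))
colnames(shift) <- c("alpha=0", "alpha=1", "alpha=-2")

# Changing beta (alpha = 0) changes the steepness and direction of the curve.
slope <- sapply(c(1, 2, 0, -1), function(b) logistic_prob(x, alpha = 0, beta = b))
colnames(slope) <- c("beta=1", "beta=2", "beta=0", "beta=-1")

print(round(cbind(x, shift), 3))
print(round(cbind(x, slope), 3))
```

Every value stays strictly between 0 and 1, and with beta = 0 the probability is 0.5 at every value of X, because the log-odds are then constant at alpha = 0.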
Interpretation
Petition signatures and MP support
Consider again the following variables:
• $Y$: MP support for petition (1 = Supported, 0 = Opposed)
• $X_1$: Number of petition signatures
• $X_2$: Party of the MP (1 = Conservative, 0 = Labour)
We can estimate the logistic model using the glm function in R:
logit_model
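The lecture's own call is not shown in full here. As a rough sketch only, the following fits a model of the same form with `glm()` on simulated data. The data frame `sim_mps`, the "true" coefficients used to simulate it and the object name `sim_model` are all assumptions for illustration; they are not the lecture's data or results.

```r
set.seed(1)
n <- 200

# Simulated (hypothetical) data: signatures per constituency and MP party.
sim_mps <- data.frame(
  signatures = round(runif(n, min = 0, max = 2500)),
  party = factor(sample(c("Labour", "Conservative"), n, replace = TRUE),
                 levels = c("Labour", "Conservative"))
)

# Assumed data-generating log-odds, chosen only to produce a mix of 0s and 1s.
true_log_odds <- -4 + 0.004 * sim_mps$signatures -
  1.5 * (sim_mps$party == "Conservative")
sim_mps$supported <- rbinom(n, size = 1, prob = plogis(true_log_odds))

# Binary logistic regression: binomial family with the logit link.
sim_model <- glm(supported ~ signatures + party,
                 data = sim_mps,
                 family = binomial(link = "logit"))

print(coef(sim_model))
```

Because Labour is the first factor level, it is the reference category, so the party coefficient is labelled `partyConservative`, matching the label in the regression output shown in this lecture.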
Interpretation of the coefficients

$$\widehat{\log\left(\frac{\pi_i}{1 - \pi_i}\right)} = -9.36 + 0.01 \cdot \text{signatures}_i - 3.09 \cdot \text{Conservative}_i$$

##
## =============================
##                    Model 1
## -----------------------------
## (Intercept)         -9.36 **
##                     (3.09)
## signatures           0.01 ***
##                     (0.00)
## partyConservative   -3.09 *
##                     (1.23)
## -----------------------------
## AIC                 29.23
## BIC                 35.02
## Log Likelihood     -11.61
## Deviance            23.23
## Num. obs.           51
## =============================
## *** p < 0.001; ** p < 0.01; * p < 0.05

Some aspects of interpretation are straightforward:

• The signs of the coefficients indicate the direction of the associations
  • $\beta_{\text{signatures}} > 0$ → more signatures increase the probability of MP support
  • $\beta_{\text{party}} < 0$ → being a Conservative MP decreases the probability of MP support
• The significance of the coefficients is still determined by $\hat{\beta} / SE(\hat{\beta})$
  • More on this later

It is possible to interpret the coefficients directly...

• → a one-signature increase is associated with an increase of $\hat{\beta}_{\text{signatures}} = 0.01$ in the log-odds of MP support, holding party constant
• → being a Conservative MP is associated with log-odds of petition support that are 3.09 lower ($\hat{\beta}_{\text{party}} = -3.09$), holding signatures constant
• ...but no-one thinks in terms of log-odds!
Interpretation of the coefficients
Instead of interpreting the log-odds ratios, we can convert the $\hat{\beta}$ coefficients into (slightly) more intuitive odds ratios:

• $\exp(\hat{\beta}_{\text{signatures}}) = \exp(0.01) = 1.01$
• $\exp(\hat{\beta}_{\text{party}}) = \exp(-3.09) = 0.05$

In R:

round(exp(coef(logit_model)), 2)

##       (Intercept)        signatures partyConservative
##              0.00              1.01              0.05

Where

• coef returns the estimated coefficients
• exp exponentiates the coefficients
• round rounds the results to 2 digits

• $\exp(\hat{\beta}_{\text{signatures}}) = 1.01$: Controlling for party, an increase of 1 signature multiplies the odds of petition support by 1.01 (i.e. it increases the odds by 1%)
• $\exp(\hat{\beta}_{\text{party}}) = 0.05$: Controlling for signatures, being a Conservative MP multiplies the odds of petition support by 0.05 (i.e. it decreases the odds by 95%)
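The percentage statements above follow from a simple conversion of an odds ratio into a percentage change in the odds. A short worked version, using the rounded odds ratios reported above:

```latex
\begin{align*}
\%\,\Delta\,\text{Odds} &= 100 \times \left(\exp(\hat{\beta}) - 1\right) \\
\text{signatures:} \quad 100 \times (1.01 - 1) &= 1\% \\
\text{party:} \quad 100 \times (0.05 - 1) &= -95\%
\end{align*}
```

An odds ratio above 1 therefore corresponds to a percentage increase in the odds, and an odds ratio below 1 to a percentage decrease.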
Interpretation of the coefficients

• We can directly interpret the coefficients of the binary logistic regression as partial log-odds ratios
• We can exponentiate the coefficients, and then interpret them as partial odds ratios
• But this still requires having to think in terms of odds...
• Instead, we can directly calculate predicted probabilities from the model and communicate those
Predicted Probabilities
Calculating predicted probabilities
• The logistic regression gives us an equation for calculating the fitted log-odds that $Y = 1$ for a given set of X values:

$$\widehat{\log\left(\frac{\pi_i}{1 - \pi_i}\right)} = \hat{\alpha} + \hat{\beta}_1 \cdot X_{1i} + \hat{\beta}_2 \cdot X_{2i}$$

• To recover the probability that $Y = 1$, we use

$$\hat{\pi}_i = \frac{\exp(\hat{\alpha} + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i})}{1 + \exp(\hat{\alpha} + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i})}$$

for selected values of the explanatory variables $X_1, \ldots, X_k$. [1]

• Typically, we will calculate $\hat{\pi}$ for different "profiles" of our X variables, where we only change the values of one variable

[1] exp() is the exponential function, the inverse of the log() function
First differences

A simple way to communicate the effects of our X variables on Y is to report first differences in the predicted probabilities. For example:

$$\Delta\pi = \pi_2 - \pi_1$$

$$\pi_1 = \frac{\exp(\alpha + \beta_1 X_1 + \beta_2 X_2)}{1 + \exp(\alpha + \beta_1 X_1 + \beta_2 X_2)}$$

$$\pi_2 = \frac{\exp(\alpha + \beta_1 X_1 + \beta_2 (X_2 + 1))}{1 + \exp(\alpha + \beta_1 X_1 + \beta_2 (X_2 + 1))}$$

This allows us to describe how $\pi$ changes when $X_2$ changes while holding $X_1$ constant.
First differences

What is the probability of supporting a petition for a Labour MP who receives 1200 signatures?

##
## ============================
##                    Model 1
## ----------------------------
## (Intercept)         -9.36 **
##                     (3.09)
## signatures           0.01 ***
##                     (0.00)
## partyConservative   -3.09 *
##                     (1.23)
## ----------------------------
## Num. obs.           51
## ============================

$$\pi_1 = \frac{\exp(\alpha + \beta_1 X_{1i} + \beta_2 X_{2i})}{1 + \exp(\alpha + \beta_1 X_{1i} + \beta_2 X_{2i})}$$

• Substitute $\alpha$, $\beta_1$, $\beta_2$ with estimated values
• Set $X_1 = 1200$ and $X_2 = 0$
• $\pi_1 = \frac{\exp(-9.36 + 0.01 \cdot 1200 - 3.09 \cdot 0)}{1 + \exp(-9.36 + 0.01 \cdot 1200 - 3.09 \cdot 0)}$
• $\pi_1 = \frac{\exp(0.14)}{1 + \exp(0.14)} = \frac{1.14}{2.14}$
• $\pi_1 = 0.53$

Note: the table rounds the signatures coefficient to 0.01. The linear predictor of 0.14 comes from the unrounded estimate (roughly 0.0079); plugging in 0.01 literally would give $-9.36 + 12 = 2.64$.

Result: the probability for an MP with these X values would be 0.53

What is the probability of supporting a petition for a Conservative MP who receives 1200 signatures?

$$\pi_2 = \frac{\exp(\alpha + \beta_1 X_{1i} + \beta_2 X_{2i})}{1 + \exp(\alpha + \beta_1 X_{1i} + \beta_2 X_{2i})}$$

• Substitute $\alpha$, $\beta_1$, $\beta_2$ with estimated values
• Set $X_1 = 1200$ and $X_2 = 1$
• $\pi_2 = \frac{\exp(-9.36 + 0.01 \cdot 1200 - 3.09 \cdot 1)}{1 + \exp(-9.36 + 0.01 \cdot 1200 - 3.09 \cdot 1)}$
• $\pi_2 = \frac{\exp(-2.96)}{1 + \exp(-2.96)} = \frac{0.05}{1.05}$
• $\pi_2 = 0.05$

Result: the probability for an MP with these X values would be 0.05
• In R, we can calculate the predicted probabilities using predict()
• To do so, we need to specify values for all the explanatory variables

predict(logit_model,
        newdata = data.frame(signatures = 1200, party = "Labour"),
        type = "response")

##         1
## 0.5337154

predict(logit_model,
        newdata = data.frame(signatures = 1200, party = "Conservative"),
        type = "response")

##         1
## 0.0493317

• where type = "response" tells R to calculate predicted probabilities
• If $\pi_1 = .53$ and $\pi_2 = .049$, then the first difference $\pi_2 - \pi_1 = -.48$
• → the probability of a Conservative MP supporting the petition is .48 lower than the probability of a Labour MP supporting the petition, when both receive 1200 signatures
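The same first difference can be approximated by hand from the reported coefficients, without access to the fitted model object. In the sketch below, the intercept and party coefficient are taken from the regression table, but the signatures coefficient is an assumed approximation (0.0079) backed out from the numbers in the worked example, because the table shows it only to two decimals. The helper `predict_prob` is hypothetical.

```r
# Estimates as reported in the regression table.
alpha_hat <- -9.36
b_con_hat <- -3.09
# ASSUMPTION: approximate unrounded signatures coefficient, implied by the
# linear predictor of about 0.14 at 1200 signatures (the table shows 0.01).
b_sig_hat <- 0.0079

# Predicted probability for a profile: signatures, and conservative = 0 or 1.
predict_prob <- function(signatures, conservative) {
  plogis(alpha_hat + b_sig_hat * signatures + b_con_hat * conservative)
}

p_labour       <- predict_prob(1200, conservative = 0)
p_conservative <- predict_prob(1200, conservative = 1)

# First difference: change in probability when party switches from Labour to
# Conservative, holding signatures fixed at 1200.
first_difference <- p_conservative - p_labour
print(round(c(p_labour, p_conservative, first_difference), 2))
```

The results agree with the `predict()` output to two decimals; small differences in later decimals are expected because the coefficients used here are rounded.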
Non-linear relationship between X and $\pi$

[Figure: predicted probability $\pi_i$ (y-axis, 0 to 1) against number of signatures (x-axis, 0 to 2500) for Labour MPs]

• The plot shows the predicted probability of MP support over the range of the signature variable for Labour party MPs
• Notice that the predictions are no longer linear: the effect of $X$ on $\pi$ is not constant
• Consider the change in $\pi$ that results from an increase in signatures from 500 to 1000
  • → $\pi$ increases from about 0 to about .18
• Consider the change in $\pi$ that results from an increase in signatures from 1000 to 1500
  • → $\pi$ increases from .18 to about .92
• Implication: the same change in X results in different changes in $\pi$ depending on which values of X we consider
$X_1$, $X_2$ and $\pi$

[Figure: predicted probability $\pi_i$ (y-axis, 0 to 1) against number of signatures (x-axis, 0 to 2500), with separate curves for Labour and Conservative MPs]

• The plot shows the predicted probability of MP support over the range of signatures for Labour and Conservative MPs
• Question: Is the effect of party constant across the range of the signature variable?
• Answer: No!
• Set the signature variable equal to 1000
  • Calculate the difference in probability for Labour and Conservative MPs
  • $\pi_L - \pi_C \approx 0.18$
• Set the signature variable equal to 1500
  • Calculate the difference in probability for Labour and Conservative MPs
  • $\pi_L - \pi_C \approx 0.57$
• Implication: Even exactly the same change in $X_1$ will result in different changes in $\pi$ depending on which values of $X_2$ we consider
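The two comparisons above can be reproduced approximately from the reported coefficients. As a sketch under stated assumptions: the intercept and party coefficient come from the regression table, while the signatures coefficient (0.0079) is an assumed approximation of the unrounded estimate, which the table shows only as 0.01.

```r
# Reported estimates, plus an ASSUMED approximate signatures coefficient.
alpha_hat <- -9.36
b_sig_hat <- 0.0079   # assumption: the table shows this rounded to 0.01
b_con_hat <- -3.09

signatures <- c(500, 1000, 1500, 2000)

# Predicted probabilities for Labour (X2 = 0) and Conservative (X2 = 1) MPs.
p_labour       <- plogis(alpha_hat + b_sig_hat * signatures)
p_conservative <- plogis(alpha_hat + b_sig_hat * signatures + b_con_hat)

# Difference in predicted probability between the parties at each value.
party_gap <- p_labour - p_conservative

print(data.frame(signatures,
                 p_labour       = round(p_labour, 2),
                 p_conservative = round(p_conservative, 2),
                 party_gap      = round(party_gap, 2)))
```

The party coefficient is the same in every row, yet the gap in predicted probability changes with the number of signatures, which is why a single number cannot summarise the effect on the probability scale.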
Summary

• No single number can describe the effect of $X$ on $\pi$ everywhere
  • → The effect of a one-unit change in $X_1$ will be different for different starting values of $X_1$
  • → The effect of a one-unit change in $X_1$ will be different for different values of $X_2$
• Because of this, best practice is to provide predicted probabilities for some key comparisons in order to describe the effects of your explanatory variables
Inference
Statistical inference for logistic regression

• Logistic regression is not estimated by ordinary least squares (OLS), but rather by maximum likelihood estimation (MLE).
• However, the estimates from this method still have approximately normal sampling distributions in large samples.
• This feature means we can use familiar statistical tests:
  • Tests ask where a statistic (e.g., $\hat{\beta}$) falls in the sampling distribution that would result under a null hypothesis (e.g., $\beta = 0$).
  • We already know how to do hypothesis tests with normal sampling distributions.
Hypothesis Tests
Hypothesis tests for coefficients take a familiar form:

• We test against a null hypothesis of no effect → $H_0: \beta_j = 0$
• We compute a test statistic, the $z$ value:

$$z = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$$

• It is a $z$ value because we compute p-values from the standard normal distribution instead of Student's t
• We reject the null at the 5% significance level (95% confidence level) when $|z| \ge 1.96$
• These estimates become unstable in small samples (< 100)
Hypothesis test example

screenreg(logit_model)

##
## =============================
##                    Model 1
## -----------------------------
## (Intercept)         -9.36 **
##                     (3.09)
## signatures           0.01 ***
##                     (0.00)
## partyConservative   -3.09 *
##                     (1.23)
## -----------------------------
## AIC                 29.23
## BIC                 35.02
## Log Likelihood     -11.61
## Deviance            23.23
## Num. obs.           51
## =============================
## *** p < 0.001; ** p < 0.01; * p < 0.05

• We reject $H_0$ if $p$ is small, e.g. < 0.05
  • That is, if $z > 1.96$ or $z < -1.96$

For signatures:

• $z = 0.01 / 0.001 = 10$
• $p = 0.0000001$
• Do we reject $H_0$?
• Yes. We can reject the null that the relationship between signatures and petition support is zero

For party:

• $z = -3.09 / 1.23 = -2.51$
• $p = 0.012$
• Do we reject $H_0$?
• Yes. We can reject the null that the relationship between party and petition support is zero
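The test for the party coefficient can be reproduced from the printed estimate and standard error. This is a small sketch in base R; the variable names are chosen here for illustration.

```r
# Coefficient and standard error for partyConservative, as printed in the table.
estimate  <- -3.09
std_error <- 1.23

# z value: the estimate divided by its standard error.
z_value <- estimate / std_error

# Two-sided p-value from the standard normal distribution.
p_value <- 2 * pnorm(-abs(z_value))

print(round(c(z = z_value, p = p_value), 3))

# Decision rule at the 5% significance level.
reject_null <- abs(z_value) >= 1.96
print(reject_null)
```

The calculation uses `pnorm()` rather than `pt()` because, as noted above, the reference distribution for logistic regression coefficients is the standard normal.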
Conclusion
What have we learned? (I)

• Many research questions in the social sciences require analysing binary outcomes
• While we can use linear regression to analyse these outcomes, OLS has some unattractive properties
• Logistic regression is a helpful alternative to OLS, which avoids the main problem: that probabilities need to be constrained to be between 0 and 1
• We need to be careful when interpreting the output of the model
Seminar
In seminars this week, you will learn to...

1. Implement some binary logistic regression models
2. Interpret the resulting coefficients
3. Calculate some fitted probabilities
What have we learned? (II)

Substantive finding:

• Politicians are more likely to speak on issues where local support for the issue is strong!

Question: Does this mean that higher numbers of signatures cause better parliamentary representation?
Logit and causality

• Logistic regression is a method for describing variation in observed data
• As with linear regression, we cannot claim to be describing a causal relationship unless we are confident that we have controlled for all possible confounding variables
• No new method will guarantee us a way of making causal statements!
From PUBL0055 to PUBL0050

In the "Introduction" module, we have covered:

1. Several commonly applied statistical methods for quantitative analysis
2. How to use R
3. An introduction to quantitative causal analysis

In the "Advanced" module, we will cover:

1. More "cutting-edge" statistical methods for quantitative analysis
2. More R!
3. In-depth exploration of the different approaches to causal analysis
4. More focus on developing research questions/designs in your own work

Thanks for watching, have a good break, and hopefully see many of you next term!