A Fully Nonparametric Modeling Approach to Binary Regression
Maria De Yoreo
Department of Applied Mathematics and Statistics, University of California, Santa Cruz
SBIES, April 27-28, 2012
De Yoreo BNP Binary Regression
Outline
1 Introduction
2 Methodology: Model Formulation, Posterior Inference
3 Data Illustrations: Simulation Example, Atmospheric Measurements, Credit Card Data
4 Discussion
Motivation
- binary responses along with covariates are present in many settings, including biometrics, econometrics, and social sciences
- Goal: determine the relationship between response and covariates
- examples: credit scoring, medicine, population dynamics, environmental sciences
- the response-covariate relationship is described by the regression function
- standard approaches involve linearity and distributional assumptions, e.g., GLMs
Bayesian Nonparametrics
- Bayesian nonparametrics can be used to relax common distributional assumptions, resulting in flexible regression models with proper uncertainty quantification
- rather than modeling the regression function directly, model the joint distribution of response and covariates using a nonparametric mixture model (West et al., 1994; Müller et al., 1996)
- this implies a form for the conditional response distribution, which is implicitly modeled nonparametrically
- the approach treats the covariates as random
Latent Variable Formulation
- introduce latent continuous random variables z that determine the binary responses y, so that y = 1 if and only if z > 0 (e.g., Albert and Chib, 1993)
- estimate the joint distribution of latent responses and covariates, f(z, x), using a nonparametric mixture model, to obtain flexible inference for the regression function pr(y = 1 | x)
- the latent variables may be of interest in some applications, since they contain more information than a 0/1 observation
- in biological applications, they may be thought of as maturity, latent survivorship, or a measure of health
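The latent-variable idea can be illustrated with a minimal sketch. It assumes the simplest parametric special case, a single latent normal with unit variance, rather than the mixture model developed in these slides; the names `simulate_binary`, `std_normal_cdf` and the value `mean_z = 0.5` are hypothetical and chosen only for illustration.

```python
import math
import random

def std_normal_cdf(t):
    """Standard normal CDF, Phi(t), via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def simulate_binary(mean_z, n, seed=0):
    """Draw z ~ N(mean_z, 1) and record y = 1 exactly when z > 0."""
    rng = random.Random(seed)
    return [1 if rng.gauss(mean_z, 1.0) > 0 else 0 for _ in range(n)]

mean_z = 0.5                      # hypothetical latent mean at a fixed covariate value
y = simulate_binary(mean_z, n=20000)
empirical = sum(y) / len(y)       # Monte Carlo estimate of pr(y = 1 | x)
exact = std_normal_cdf(mean_z)    # pr(z > 0) = Phi(mean_z) under unit variance
print(round(empirical, 3), round(exact, 3))
```

The two printed numbers should agree up to Monte Carlo error, which is the sense in which the latent z "determines" the binary response.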
DP Mixture Model
The Dirichlet Process (DP) (Ferguson, 1973) generates random distributions, and can be used as a prior for spaces of distribution functions.
- DP constructive definition (Sethuraman, 1994): if $G \sim \mathrm{DP}(\alpha, G_0)$, then it is almost surely of the form $\sum_{l=1}^{\infty} p_l \delta_{\nu_l}$
  → $\nu_l \overset{iid}{\sim} G_0$, $l = 1, 2, \ldots$
  → $z_r \overset{iid}{\sim} \mathrm{Beta}(1, \alpha)$, $r = 1, 2, \ldots$
  → define $p_1 = z_1$, and $p_l = z_l \prod_{r=1}^{l-1} (1 - z_r)$, for $l = 2, 3, \ldots$
- DP mixture model for the latent responses and covariates
  $$f(z, x; G) = \int N_{p+1}(z, x; \mu, \Sigma)\, dG(\mu, \Sigma)$$
  $$G \mid \alpha, \psi \sim \mathrm{DP}(\alpha, G_0(\mu, \Sigma; \psi))$$
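The stick-breaking construction can be sketched in a few lines. This is an illustration under stated assumptions: the sum is truncated at `num_atoms` terms by setting the last stick proportion to 1 so the weights sum to one, and the base distribution is a stand-in N(0, 1) rather than the G0 used in the talk. The function name is hypothetical.

```python
import numpy as np

def stick_breaking_weights(alpha, num_atoms, rng):
    """Truncated stick-breaking: p_1 = z_1, p_l = z_l * prod_{r<l} (1 - z_r)."""
    z = rng.beta(1.0, alpha, size=num_atoms)
    z[-1] = 1.0  # assumption: the last stick takes all remaining mass (truncation)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - z[:-1])))
    return z * remaining

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=2.0, num_atoms=25, rng=rng)
atoms = rng.normal(0.0, 1.0, size=25)  # nu_l drawn iid from a stand-in G0 = N(0, 1)
print(weights.sum())
```

Smaller values of alpha put most of the mass on the first few atoms; larger values spread it over many.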
Implied Conditional Regression
- From the constructive definition, the model has an a.s. representation as a countable mixture of MVNs
  $$f(z, x; G) = \sum_{l=1}^{\infty} p_l N_{p+1}(z, x; \mu_l, \Sigma_l)$$
- Binary regression functional: pr(y = 1 | x; G)
  → marginalize over z to obtain f(x; G) and f(y, x; G)
  $$f(x; G) = \sum_{l=1}^{\infty} p_l N_p(x; \mu_l^x, \Sigma_l^{xx})$$
  And the joint distribution
  $$f(y, x; G) = \sum_{l=1}^{\infty} p_l N_p(x; \mu_l^x, \Sigma_l^{xx})\, \mathrm{Bern}\left(y; \Phi\left(\frac{\mu_l^z + \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} (x - \mu_l^x)}{(\Sigma_l^{zz} - \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} \Sigma_l^{xz})^{1/2}}\right)\right)$$
The Regression Function
- implied regression function: $\mathrm{pr}(y = 1 \mid x; G) = \sum_{l=1}^{\infty} w_l(x) \pi_l(x)$, with covariate-dependent weights
  $$w_l(x) \propto p_l N_p(x; \mu_l^x, \Sigma_l^{xx})$$
  and probabilities
  $$\pi_l(x) = \Phi\left(\frac{\mu_l^z + \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} (x - \mu_l^x)}{(\Sigma_l^{zz} - \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} \Sigma_l^{xz})^{1/2}}\right)$$
- Notice that the probabilities have the probit form with component-specific intercept and slope parameters
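The regression function above can be evaluated directly for a finite number of components. The sketch below assumes each component stores its mean and covariance with z in the first coordinate; the function names and the two-component parameter values are hypothetical and only serve to exercise the formula.

```python
import math
import numpy as np

def std_normal_cdf(t):
    """Standard normal CDF, Phi(t)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def mvn_density(x, mean, cov):
    """Multivariate normal density N(x; mean, cov)."""
    diff = x - mean
    quad = diff @ np.linalg.solve(cov, diff)
    norm = math.sqrt((2.0 * math.pi) ** len(mean) * np.linalg.det(cov))
    return math.exp(-0.5 * quad) / norm

def regression_function(x, p, mus, Sigmas):
    """pr(y = 1 | x) = sum_l w_l(x) pi_l(x) for a finite mixture."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    weights, probs = [], []
    for p_l, mu, Sigma in zip(p, mus, Sigmas):
        mu_z, mu_x = mu[0], mu[1:]
        S_zz, S_zx, S_xx = Sigma[0, 0], Sigma[0, 1:], Sigma[1:, 1:]
        slope = np.linalg.solve(S_xx, S_zx)        # (Sigma^xx)^{-1} Sigma^xz
        cond_mean = mu_z + slope @ (x - mu_x)      # mean of z given x
        cond_var = S_zz - S_zx @ slope             # variance of z given x
        weights.append(p_l * mvn_density(x, mu_x, S_xx))
        probs.append(std_normal_cdf(cond_mean / math.sqrt(cond_var)))
    weights = np.array(weights)
    return float(weights @ np.array(probs) / weights.sum())

# Hypothetical two-component mixture with one covariate (p = 1).
p = [0.6, 0.4]
mus = [np.array([-1.0, 0.0]), np.array([1.0, 3.0])]
Sigmas = [np.array([[1.0, 0.5], [0.5, 1.0]]), np.array([[1.0, -0.3], [-0.3, 2.0]])]
prob = regression_function(1.0, p, mus, Sigmas)
print(prob)
```

Because the weights depend on x, the curve can take shapes that no single probit component could produce, which is the source of the model's flexibility.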
Identifiability
Can the entire covariance matrix Σ be estimated?
- Probit Regression: $z \sim N(x^T \beta, 1)$
- the binary responses are not able to inform about the scale of the latent responses
- retaining $\Sigma^{zx}$ is important: if we set it to 0, then $\pi_l(x)$ becomes just $\pi_l$
- We have shown that if $\Sigma^{zz}$ is fixed, the remaining parameters are identifiable in the kernel of the mixture model for y and x
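A short worked step shows why the scale of the latent response cannot be learned from binary data. It is a sketch of the standard argument rather than the proof referred to above. Rescale z by any constant c > 0 within component l, so that $\mu_l^z \to c\mu_l^z$, $\Sigma_l^{zx} \to c\Sigma_l^{zx}$ and $\Sigma_l^{zz} \to c^2\Sigma_l^{zz}$. The event z > 0 is unchanged, and so is the probit argument:

```latex
\frac{c\mu_l^z + c\Sigma_l^{zx} (\Sigma_l^{xx})^{-1} (x - \mu_l^x)}
     {\left(c^2\Sigma_l^{zz} - c^2\Sigma_l^{zx} (\Sigma_l^{xx})^{-1} \Sigma_l^{xz}\right)^{1/2}}
=
\frac{\mu_l^z + \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} (x - \mu_l^x)}
     {\left(\Sigma_l^{zz} - \Sigma_l^{zx} (\Sigma_l^{xx})^{-1} \Sigma_l^{xz}\right)^{1/2}}
```

Every value of c gives the same distribution for (y, x), so one constraint, such as fixing $\Sigma^{zz}$, is needed to pin down the remaining parameters.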
Facilitating Identifiability
How to fix only one element of the covariance matrix?
- the usual inverse-Wishart distribution will not work
- square-root-free Cholesky decomposition of Σ uses the relationship $\Delta = \beta \Sigma \beta^T$, with Δ diagonal with all elements $\delta_i > 0$, and β lower triangular with 1 on its diagonal (Daniels and Pourahmadi, 2002; Webb and Forster, 2007)
- For $y = (y_1, \ldots, y_m) \sim N(\mu, \Sigma)$, with $\Delta = \beta \Sigma \beta^T$, the joint distribution for y can be expressed in a recursive form:
  $$y_1 \sim N(\mu_1, \delta_1), \qquad (y_k \mid y_1, \ldots, y_{k-1}) \sim N\Big(\mu_k - \sum_{j=1}^{k-1} \beta_{k,j} (y_j - \mu_j),\ \delta_k\Big), \quad k = 2, \ldots, m$$
  → useful for modeling longitudinal data and specifying conditional independence assumptions
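The decomposition can be computed from an ordinary Cholesky factor. The sketch below is one possible route, not necessarily the one used by the cited authors: write Σ = L Lᵀ, rescale the columns of L to get a unit lower triangular factor C, and take β as its inverse. The function name and the example matrix are hypothetical.

```python
import numpy as np

def sqrt_free_cholesky(Sigma):
    """Return (beta, delta) with beta @ Sigma @ beta.T = diag(delta), beta unit lower triangular."""
    L = np.linalg.cholesky(Sigma)   # Sigma = L L^T, L lower triangular
    d = np.diag(L)                  # positive diagonal of L
    C = L / d                       # divide column j by L_jj, giving a unit diagonal
    beta = np.linalg.inv(C)         # inverse of unit lower triangular is unit lower triangular
    delta = d ** 2                  # Sigma = C diag(delta) C^T
    return beta, delta

Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 2.0, 0.3],
                  [0.2, 0.3, 1.5]])
beta, delta = sqrt_free_cholesky(Sigma)
print(delta[0], Sigma[0, 0])  # the first diagonal element of Delta equals Sigma_11
```

The printed pair illustrates the property exploited next: the first element of Δ is exactly the variance of the first coordinate, so it can be fixed on its own.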
Facilitating Identifiability
- here, no natural ordering is present, but the parameterization has other useful properties which we exploit
- $\delta_1 = \Sigma^{zz}$
  → fix $\delta_1$, and mix on $\delta_2, \ldots, \delta_{p+1}$ and the $p(p+1)/2$ free elements of β, denoted by the vector $\tilde{\beta}$
Then the DP mixture model becomes
  $$f(z, x; G) = \int N_{p+1}(z, x; \mu, \beta^{-1} \Delta \beta^{-T})\, dG(\mu, \beta, \Delta)$$
- computationally convenient: there exist conjugate prior distributions for $\tilde{\beta}$ and $\delta_2, \ldots, \delta_{p+1}$, which are MVN and (independent) inverse-gamma
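To see how fixing δ1 constrains only the latent variance, the covariance can be rebuilt from the free parameters. This is a sketch with assumptions: the fixed value `delta1 = 1.0` is an illustrative choice (the slides do not state the value), the free elements of β are filled row by row below the diagonal, and the function name is hypothetical.

```python
import numpy as np

def build_covariance(beta_tilde, delta_free, delta1=1.0):
    """Sigma = beta^{-1} Delta beta^{-T}, with delta_1 held fixed."""
    dim = len(delta_free) + 1                       # dim = p + 1
    assert len(beta_tilde) == dim * (dim - 1) // 2  # p(p+1)/2 free elements
    beta = np.eye(dim)
    beta[np.tril_indices(dim, k=-1)] = beta_tilde   # fill strictly lower triangle
    delta = np.concatenate(([delta1], delta_free))
    beta_inv = np.linalg.inv(beta)
    return beta_inv @ np.diag(delta) @ beta_inv.T

# One covariate (p = 1): one free element of beta and one free delta.
Sigma = build_covariance(beta_tilde=[0.4], delta_free=[2.0])
print(Sigma)
```

Whatever values the free parameters take, the (1, 1) entry of the result stays at `delta1`, while the covariance between z and x remains unrestricted.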
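Hierarchical Model
Blocked Gibbs sampler: truncate G to $G_N(\cdot) = \sum_{l=1}^{N} p_l \delta_{W_l}(\cdot)$, with $W_l = (\mu_l, \tilde{\beta}_l, \Delta_l)$, and introduce configuration variables $(L_1, \ldots, L_n)$ taking values in $\{1, \ldots, N\}$.
$$y_i \mid z_i \overset{ind}{\sim} 1_{(y_i = 1)} 1_{(z_i > 0)} + 1_{(y_i = 0)} 1_{(z_i \le 0)}, \quad i = 1, \ldots, n$$
$$(z_i, x_i) \mid W, L_i \overset{ind}{\sim} N_{p+1}\big((z_i, x_i); \mu_{L_i}, \beta_{L_i}^{-1} \Delta_{L_i} \beta_{L_i}^{-T}\big), \quad i = 1, \ldots, n$$
$$L_i \mid p \sim \sum_{l=1}^{N} p_l \delta_l(L_i), \quad i = 1, \ldots, n$$
$$W_l \mid \psi \overset{ind}{\sim} N_{p+1}(\mu_l; m, V)\, N_q(\tilde{\beta}_l; \theta, cI) \prod_{i=2}^{p+1} \mathrm{IG}(\delta_{i,l}; \nu_i, s_i), \quad l = 1, \ldots, N$$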
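The first two levels of the hierarchy imply that, given its component and covariates, each latent z_i is normal and restricted to the side of zero indicated by y_i. The slides do not spell out this update, so the sketch below is an assumption: it uses the usual truncated-normal draw from latent-variable probit samplers, via inverse-CDF sampling with the standard library. The name `sample_latent_z` and the numerical inputs are hypothetical, and `cond_mean` and `cond_sd` stand for the conditional mean and standard deviation of z given x in the allocated component.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf and inv_cdf

def sample_latent_z(y, cond_mean, cond_sd, rng):
    """Draw z from N(cond_mean, cond_sd^2) truncated to z > 0 (y = 1) or z <= 0 (y = 0)."""
    p_nonpos = STD_NORMAL.cdf(-cond_mean / cond_sd)      # pr(z <= 0)
    lo, hi = (p_nonpos, 1.0) if y == 1 else (0.0, p_nonpos)
    u = lo + (hi - lo) * rng.random()                    # uniform over the allowed CDF range
    u = min(max(u, 1e-12), 1.0 - 1e-12)                  # inv_cdf requires 0 < u < 1
    return cond_mean + cond_sd * STD_NORMAL.inv_cdf(u)

rng = random.Random(1)
draws_pos = [sample_latent_z(1, cond_mean=-0.3, cond_sd=0.8, rng=rng) for _ in range(1000)]
draws_neg = [sample_latent_z(0, cond_mean=-0.3, cond_sd=0.8, rng=rng) for _ in range(1000)]
print(min(draws_pos), max(draws_neg))
```

Inverse-CDF sampling is simple but loses accuracy far in the tails, so a production sampler would likely use a dedicated truncated-normal routine.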
Posterior Inference
- Gibbs sampling may be used to simulate from the full posterior $p(W, L, p, \psi, \alpha, z \mid \text{data})$, with the conditionally conjugate base distribution, and conjugate priors on ψ and α.
- The posterior for $G_N = (p, W)$ is imputed in the MCMC, enabling full inference for any functional of $f(z, x; G_N)$, now a finite sum
- Binary regression functional: for any covariate value $x_0$, at iteration r of the MCMC, calculate $\mathrm{pr}(y = 1 \mid x_0; G_N^{(r)})$
  → provides point estimate and uncertainty quantification for the regression function
- Same can be done for other functionals, such as the latent response distribution $f(z \mid x_0; G_N)$ at any covariate value $x_0$
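Once the regression functional has been evaluated at each MCMC iteration, point estimates and interval bands are simple column summaries. The sketch below assumes the draws are stored as an array with one row per iteration and one column per grid value of x0; the function name is hypothetical and `toy_draws` is a tiny made-up array used only to exercise the code, not output from the model.

```python
import numpy as np

def summarize_regression_draws(draws, level=0.95):
    """Pointwise posterior mean and equal-tail interval across MCMC iterations."""
    draws = np.asarray(draws, dtype=float)   # shape: (iterations, grid points)
    tail = (1.0 - level) / 2.0
    mean = draws.mean(axis=0)
    lower = np.quantile(draws, tail, axis=0)
    upper = np.quantile(draws, 1.0 - tail, axis=0)
    return mean, lower, upper

# Toy array: 3 "iterations" evaluated at 2 covariate values.
toy_draws = np.array([[0.1, 0.5], [0.2, 0.6], [0.3, 0.7]])
mean, lower, upper = summarize_regression_draws(toy_draws)
print(mean, lower, upper)
```

The same function applies to any other functional evaluated on a grid, for example density values of f(z | x0; G_N).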
Simulated Data
- Data {(z_i, x_i) : i = 1, ..., n} were simulated from a mixture of 3 bivariate normals, and y was determined from z.
- compare inference from the binary regression model with data (y, x) to that from the model which views (z, x) as data
- a practical prior specification approach, appropriate when little is known about the problem, is applied here
- to specify priors on ψ, consider only one mixture component and use an approximate center and range of the data, as well as prior simulation to induce an approximate unif(−1, 1) prior on corr(z, x)
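A data set of this kind can be generated as follows. The mixture weights, means and covariances in the sketch are hypothetical placeholders, since the slides do not report the values used in the talk; only the structure (three bivariate normal components for (z, x), with y = 1 exactly when z > 0) follows the description above.

```python
import numpy as np

def simulate_binary_data(n, weights, means, covs, rng):
    """Draw (z, x) from a bivariate normal mixture and set y = 1 when z > 0."""
    labels = rng.choice(len(weights), size=n, p=weights)
    zx = np.array([rng.multivariate_normal(means[k], covs[k]) for k in labels])
    z, x = zx[:, 0], zx[:, 1]
    y = (z > 0).astype(int)
    return y, z, x

rng = np.random.default_rng(42)
weights = [0.3, 0.4, 0.3]                                 # hypothetical mixture weights
means = [[-1.0, -1.0], [0.5, 1.0], [1.0, 3.0]]            # hypothetical (mean of z, mean of x)
covs = [[[1.0, 0.6], [0.6, 1.0]],
        [[1.0, -0.5], [-0.5, 1.0]],
        [[1.0, 0.3], [0.3, 1.0]]]                         # hypothetical covariances
y, z, x = simulate_binary_data(200, weights, means, covs, rng)
print(y.mean())
```

Fitting one model to (y, x) and another to (z, x) from the same simulation is what allows the comparison described in the second bullet.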
[Figure: two panels plotting Pr(z>0|x;G) (left) and Pr(y=1|x;G) (right) against x.]
The inference for pr(z > 0 | x; G) (left) is compared to that for pr(y = 1 | x; G) (right) and the truth (solid line).
[Figure: two rows of three panels plotting f(z|x=x1), f(z|x=x2), and f(z|x=x3) against z.]
Top row: Inference for f(z | x0; G) under the model which views z as observed, with true densities as dashed lines, at 3 values of x0. Bottom row: Inference from the binary regression model.
Ozone and Wind Speed
- 111 daily measurements of wind speed (mph) and ozone concentration (parts per billion) in NYC over a 4-month period
- objective: model the probability of exceeding a certain ozone concentration as a function of wind speed
- the model only sees whether or not there was an exceedance, but there is an actual ozone concentration underlying this 0/1 value
[Figure: probability of ozone exceedance against wind speed (left); ozone concentration against wind speed (right).]
Left: The probability that ozone concentration (parts per billion) exceeds a threshold of 70 decreases with wind speed (mph). Right: For comparison, the actual non-discretized ozone measurements as a function of wind speed.
[Figure: four panels plotting f(z|x0) against z.]
Estimates for f(z | x0; G) at wind speed values of 5, 8, 10, and 15 mph.
Credit Cards and Income
- n = 100 subjects in a study were asked whether or not they owned a travel credit card, and their income was recorded (Agresti, 1996)
- In this situation, it is not clear that the latent continuous random variables have a meaningful interpretation, but we can still use the method for regression
- Does the probability of owning a credit card change with income?
[Figure: Pr(y=1|x;G) plotted against income in thousands, with plotted points.]
Probability of owning a credit card appears to increase with income, with a slight dip or leveling off around income of 40-50, since none of the subjects in that region owned a credit card.
Extensions to Ordinal Responses
- similar methodology, wider range of applications
- for an ordinal response with C categories, assume y = j if and only if $\gamma_{j-1} < z \le \gamma_j$, for j = 1, ..., C, and apply the same DP mixture of MVNs for (z, x)
- for fixed cut-off points γ, it can be shown that all of µ and Σ are identifiable in the induced kernel for the observables
- the C − 1 free cut-off points can be fixed to arbitrary increasing values (Kottas et al., 2005), which is a computational advantage
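The cut-off rule maps a latent value to a category by locating it among the increasing cut-off points. A minimal sketch, assuming the outer cut-offs are minus and plus infinity so only the C − 1 interior points are passed in; the function name and the cut-off values are hypothetical.

```python
import numpy as np

def ordinal_from_latent(z, cutoffs):
    """Return y = j when cutoffs[j-2] < z <= cutoffs[j-1], with categories 1, ..., C."""
    cutoffs = np.asarray(cutoffs, dtype=float)   # the C - 1 interior cut-off points
    # side="left" finds i with cutoffs[i-1] < z <= cutoffs[i], matching the rule above
    return np.searchsorted(cutoffs, z, side="left") + 1

z = np.array([-2.0, 0.0, 0.5, 3.0])
y = ordinal_from_latent(z, cutoffs=[0.0, 1.0])   # C = 3 categories
print(y)  # [1 1 2 3]
```

With a single cut-off at 0 the rule reduces to the binary case, up to relabeling the two categories.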
Other Extensions
- multivariate ordinal responses: J ordinal responses associated with a vector of covariates for each subject, with $C_j$ categories for the jth response
- several applications, but limited existing methods for flexible inference
- y and z are vectors, and $y_j = l$ if and only if $\gamma_{j,l-1} < z_j \le \gamma_{j,l}$, for j = 1, ..., J, and l = 1, ..., $C_j$
- if $C_j > 2$ for all j, then no identifiability restrictions are needed
- if $C_j = 2$ for some j, then the (β, ∆) parameterization can be used, and fixing certain elements of δ provides the necessary restrictions
- mixed ordinal-continuous responses
Conclusions
- Binary responses measured along with covariates represent a simple setting, but the scope of problems in this category is large.
- This framework yields flexible, nonparametric inference for the regression relationship in a general binary regression problem.
- The methodology extends easily to larger classes of problems in ordinal regression, including multivariate responses and mixed responses, making the framework much more powerful, with utility in a wide variety of applications.