
Aust. N. Z. J. Stat. 00(0), 2011, 1–2 doi: 10.1111/j.1467-842X.2011.00640.x

BOOK REVIEW

Logistic Regression Models. By J.M. Hilbe. Boca Raton, Florida: CRC Press. 2009. xviii+637 pages. £49.99 (hardback). ISBN 978-1-4200-7575-5.

This book covers what would these days be regarded as a very narrow topic, namely logistic regression models. So I began reading with some misgivings. Why not at least consider the more general topic of binomial regression models? Or even add in other count data models such as Poisson and negative binomial and extend to a book on discrete data? So from the beginning I was skeptical that there would be 621 pages of good material to write on the subject.

Chapter 1 is an introduction that begins by contrasting normal models with binomial models. However, there is not a single equation or example given, which makes this of questionable value to the student. I was a little worried about the statement that ‘IRLS estimation is a type of ML estimation and shares many of its features’. As we all know, IRLS is an algorithm for obtaining the ML estimates for GLMs. In the next sentence, we learn that ‘GLM’s typically suffer few convergence difficulties’. Now, I always thought that it was algorithms that failed to converge, not models. Luckily, there is a historical overview that follows, which I found quite interesting. For instance, I did not know that George Barnard coined the term log-odds.

Chapter 2 is entitled ‘Concepts related to the logistic model’ and is supposed to give some basic concepts to the student to carry forward through the book. In principle, this might be a good idea, but the first section of the chapter is on data recoding. There then follows a Stata analysis of a simple 2×2 table. The calculation of the odds ratio in the output is illustrated using Stata as a simple calculator. Then the Stata commands ‘logistic’ and ‘glm’ are used to generate the same odds ratio. At this stage, the odds ratio has not been explained or interpreted. Rather, the reader is asked to ‘note that the odds ratio is the same in all three derivations [sic]’.
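The equality the reader is asked to note is at least easy to verify outside Stata. A minimal Python sketch (the 2×2 counts are hypothetical, not taken from the book) computes the odds ratio both as the cross-product ratio and as the exponentiated difference in log-odds — the latter being exactly what a logistic regression coefficient for a binary predictor estimates:

```python
from math import exp, log

# Hypothetical 2x2 table: rows = exposed/unexposed, columns = event/no event
a, b = 30, 70   # exposed:   30 events, 70 non-events
c, d = 15, 85   # unexposed: 15 events, 85 non-events

# Odds ratio as the cross-product ratio of the table
or_direct = (a * d) / (b * c)

# The same quantity via log-odds: the logistic regression slope for a
# binary predictor is the difference in log-odds between the two groups
logit_exposed = log(a / b)
logit_unexposed = log(c / d)
or_from_logits = exp(logit_exposed - logit_unexposed)

print(or_direct, or_from_logits)  # both equal 2550/1050, about 2.43
```

This is the sort of two-line demonstration that would have made the ‘three derivations’ remark self-explanatory.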

On page 22, we are shown how to convert a linear predictor (which is denoted as ‘xb’, even though the linear model has not yet been introduced) to a probability. It is pointed out that you get the same answer either by using p = 1/(1 + exp(−xb)) or by using an ‘alternative transform’ p = exp(xb)/(1 + exp(xb)). That is because this is the same transform. Then, on page 23, the odds is finally defined. However, it is defined as the ‘relationship of p to 1 − p’, rather than as the ratio. There is no real framework given in this chapter for students to grasp any concepts, which was the intention implied in the title. Instead, there are various items of Stata output that are calculated directly, without further explanation.
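The algebra behind the ‘alternative transform’ takes one line: multiplying the numerator and denominator of 1/(1 + exp(−xb)) by exp(xb) gives exp(xb)/(1 + exp(xb)). A quick numerical check in Python confirms the two forms agree everywhere:

```python
from math import exp

def logistic_direct(xb):
    # The form p = 1/(1 + exp(-xb))
    return 1.0 / (1.0 + exp(-xb))

def logistic_alternative(xb):
    # The book's 'alternative transform' p = exp(xb)/(1 + exp(xb)):
    # identical, by multiplying top and bottom of the first form by exp(xb)
    return exp(xb) / (1.0 + exp(xb))

# The two forms agree (up to rounding) across the linear predictor's range
for xb in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert abs(logistic_direct(xb) - logistic_alternative(xb)) < 1e-12
```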

In Section 2.3 there is an intriguing discussion of modelling a quantitative predictor (since the earlier examples involved group labels as predictors). It is claimed that ‘considerable discussion has been given in the recent literature to the question of categorizing continuous predictors … . Others argue that continuous predictors are fine as long as they are monotonically and smoothly increasing or decreasing.’ I wish I knew what point the author was trying to make. What I can say for sure is that this will not help any student who is reading an introductory chapter called ‘Concepts related to the logistic model’.

Chapter 3 consists of 16 pages devoted to writing down the IRLS estimating equations for the GLM. Exponential families are not defined until equation (3.38). There are two conflicting notations for the gradient and Hessian operators in (3.7) and (3.9), and a missing summation in the final derived estimating equation in (3.18). In Section 3.3 it is claimed that ‘full maximum likelihood methodology uses the observed information matrix’, and the alternative Hessian matrix is derived. Now while I am not against basing standard errors on the observed information, the fitting algorithm will often work with expected or observed information. More to the point, for the logistic model the observed and the expected information are the same. So why introduce the observed information concept?
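That last point — for the canonical logit link the Hessian does not involve y, so observed and expected information coincide — can be seen directly in a minimal Newton/IRLS sketch. The one-parameter model and the data below are illustrative assumptions of mine, not anything from the book:

```python
from math import exp

def fit_logistic_1d(x, y, n_iter=25):
    """Newton/IRLS for the one-parameter model p_i = 1/(1 + exp(-b*x_i)).

    The information sum(x_i^2 * p_i * (1 - p_i)) contains no y_i, so for
    the logit link the observed and expected information are identical
    and Newton's method and Fisher scoring are the same algorithm.
    """
    b = 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + exp(-b * xi)) for xi in x]
        score = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        info = sum(xi * xi * pi * (1.0 - pi) for xi, pi in zip(x, p))
        b += score / info  # one Newton (= Fisher scoring) step
    return b

# Hypothetical data with a finite MLE (no complete separation)
x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
y = [0, 0, 1, 0, 1, 1]
b_hat = fit_logistic_1d(x, y)
```

At convergence the score is zero at b_hat, which is all the ‘full maximum likelihood methodology’ amounts to here.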

In Chapter 4, the estimating equations for the GLM are specialized to the logistic binomial model. Section 4.2 actually contrasts the GLM and ML algorithms. The basic problem here is that the IRLS algorithm returns the ML estimates of the GLM model. There does not seem to be any distinction between algorithm, estimates and model. Section 4.3 mentions the three other standard link functions for binomial data. Unfortunately though, we will never learn more about these in a book on logistic regression.

At this point, I decided not to review the book any further, although I continued reading to the end of Chapter 5. If you want a book on logistic regression I suggest you use Cox & Snell (1989) or Collett (1991). However, I prefer books on the more general topic of categorical data (Agresti and I both have honours-level textbooks on the subject).

CHRIS LLOYD

Melbourne Business School, University of Melbourne
e-mail: [email protected]

References

AGRESTI, A. (2002). Categorical Data Analysis, 2nd edn. New York: Wiley.
COLLETT, D. (1991). Modelling Binary Data. London: Chapman and Hall.
COX, D.R. & SNELL, E.J. (1989). Analysis of Binary Data, 2nd edn. London: Chapman and Hall.
LLOYD, C.J. (1999). Statistical Analysis of Categorical Data. New York: Wiley.

© 2011 Australian Statistical Publishing Association Inc. Published by Blackwell Publishing Asia Pty Ltd.