6
An Appraisal of Multivariable Logistic Models in the Pulmonary and Critical Care Literature* Marc Moss, MD; D. Andrew Wellman, MD; and George A. Cotsonis, MA Objective: Multivariable modeling techniques are appearing in today’s medical literature with increasing frequency. Improper reporting of these statistical models can potentially make the results of a study inaccurate, misleading, or difficult to interpret. We performed a manual literature search of five international pulmonary and critical care journals to determine the accuracy in the reporting of logistic regression modeling strategies. Design: We examined all of the published manuscripts for 12 potential limitations in the reporting of important statistical methodologies over a 6-month period from July 1, 2000, until December 31, 2000. Results: Of the 81 articles that included multivariable logistic regression analyses, only 65% (53 analyses) properly reported the coding classification of pertinent independent variables that were included in the final model. An odds ratio and confidence interval were reported for the independent variables included in the final model for 79% (64 analyses) and 74% (60 analyses), respectively. Only 12% (10 articles) referenced whether interaction terms or effect modifications were examined, 1% (1 article) reported testing for collinearity, and only 16% (13 articles) included a goodness-of-fit analysis of the logistic model. The type of statistical package was reported in 69% (56 articles). Finally, approximately 39% of the articles (22 of 57) may have overfit the logistic regression model, leading to potentially unreliable regression coefficients and odds ratios. Conclusions: Our results indicate that the reporting of multivariable logistic regression analyses in the pulmonary and critical care literature is often incomplete, therefore making it difficult for the reader to accurately interpret the manuscript. We recommend the implementation of adequate guidelines that will lead to overall improvements in the reporting and possibly to the conducting of multivariable analyses in the pulmonary medicine and critical care medicine literature. (CHEST 2003; 123:923–928) Key words: critical care medicine; logistic models; pulmonary medicine; study design Abbreviation: ROC receiver operating characteristic A lthough most investigators of basic science and clinical medicine receive little or no training in statistics and epidemiology, multivariable regression- modeling techniques are now frequently used in medical research. 1 These statistical models are used primarily to identify either the effect of an individual variable on a specific outcome while adjusting for For editorial comment see page 677 differences in other factors (descriptive modeling) or to estimate the likelihood of an outcome for a given patient from a set of observations that are particular to that patient (predictive modeling). 2 All multivari- able modeling strategies have specific assumptions and several limitations that, when violated, may impact on the accuracy of the results and the ability of the reader to properly interpret the study. Logistic *From the Department of Medicine (Drs. Moss and Wellman), Division of Pulmonary and Critical Care Medicine, and the Rollins School of Public Health (Mr. Cotsonis), Emory Unversity, Atlanta, GA. Manuscript received May 4, 2001; revision accepted June 18, 2002. Reproduction of this article is prohibited without written permis- sion from the American College of Chest Physicians (e-mail: [email protected]). Correspondence to: Marc Moss, MD, Division of Pulmonary and Critical Care, Grady Memorial Hospital, 69 Jesse Hill Junior Dr, Suite 2C007, Atlanta, GA 30335; e-mail: [email protected] special report www.chestjournal.org CHEST / 123 / 3 / MARCH, 2003 923

special report

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: special report

An Appraisal of Multivariable LogisticModels in the Pulmonary and CriticalCare Literature*

Marc Moss, MD; D. Andrew Wellman, MD; and George A. Cotsonis, MA

Objective: Multivariable modeling techniques are appearing in today’s medical literature withincreasing frequency. Improper reporting of these statistical models can potentially make the resultsof a study inaccurate, misleading, or difficult to interpret. We performed a manual literature search offive international pulmonary and critical care journals to determine the accuracy in the reporting oflogistic regression modeling strategies.Design: We examined all of the published manuscripts for 12 potential limitations in the reporting ofimportant statistical methodologies over a 6-month period from July 1, 2000, until December 31, 2000.Results: Of the 81 articles that included multivariable logistic regression analyses, only 65% (53analyses) properly reported the coding classification of pertinent independent variables that wereincluded in the final model. An odds ratio and confidence interval were reported for the independentvariables included in the final model for 79% (64 analyses) and 74% (60 analyses), respectively. Only12% (10 articles) referenced whether interaction terms or effect modifications were examined, 1% (1article) reported testing for collinearity, and only 16% (13 articles) included a goodness-of-fit analysisof the logistic model. The type of statistical package was reported in 69% (56 articles). Finally,approximately 39% of the articles (22 of 57) may have overfit the logistic regression model, leading topotentially unreliable regression coefficients and odds ratios.Conclusions: Our results indicate that the reporting of multivariable logistic regression analyses in thepulmonary and critical care literature is often incomplete, therefore making it difficult for the readerto accurately interpret the manuscript. We recommend the implementation of adequate guidelinesthat will lead to overall improvements in the reporting and possibly to the conducting of multivariableanalyses in the pulmonary medicine and critical care medicine literature.

(CHEST 2003; 123:923–928)

Key words: critical care medicine; logistic models; pulmonary medicine; study design

Abbreviation: ROC � receiver operating characteristic

A lthough most investigators of basic science andclinical medicine receive little or no training in

statistics and epidemiology, multivariable regression-modeling techniques are now frequently used in

medical research.1 These statistical models are usedprimarily to identify either the effect of an individualvariable on a specific outcome while adjusting for

For editorial comment see page 677

differences in other factors (descriptive modeling) orto estimate the likelihood of an outcome for a givenpatient from a set of observations that are particularto that patient (predictive modeling).2 All multivari-able modeling strategies have specific assumptionsand several limitations that, when violated, mayimpact on the accuracy of the results and the abilityof the reader to properly interpret the study. Logistic

*From the Department of Medicine (Drs. Moss and Wellman),Division of Pulmonary and Critical Care Medicine, and theRollins School of Public Health (Mr. Cotsonis), Emory Unversity,Atlanta, GA.Manuscript received May 4, 2001; revision accepted June 18,2002.Reproduction of this article is prohibited without written permis-sion from the American College of Chest Physicians (e-mail:[email protected]).Correspondence to: Marc Moss, MD, Division of Pulmonary andCritical Care, Grady Memorial Hospital, 69 Jesse Hill Junior Dr,Suite 2C007, Atlanta, GA 30335; e-mail: [email protected]

special report

www.chestjournal.org CHEST / 123 / 3 / MARCH, 2003 923

Page 2: special report

regression analysis is one type of multivariable mod-eling procedure that is used to assess the relationshipbetween two or more continuous or categoric, ex-planatory (independent) variables and a single di-chotomous response or outcome variable, such ashospital mortality (alive or deceased).3 Logistic re-gression is commonly used in the medical literaturefor several reasons. There is a high degree of flexi-bility associated with logistic regression models, in-cluding that the continuous explanatory variables donot need to be normally distributed and that otherexplanatory variables can be dichotomous variablessuch as gender (male or female). In addition, theresults of logistic regression models are easy tointerpret because the coefficients can be easily trans-formed into odds ratios, which are an acceptedmeasure of association in medical research.4,5

In 1993, Concato and colleagues1 reported a re-view of 60 randomly selected articles from theLancet and the New England Journal of Medicinethat utilized multivariable statistical methods. In73% of the articles, they identified violations andomissions of methodological guidelines that wouldmake the reported results potentially inaccurate,misleading, or difficult to interpret. Based on theirobservations, the authors concluded that there was aneed for improvement in the reporting and perhapsin the conducting of multivariable analysis in thegeneral medical literature.

It has been almost a decade since this importantobservational study was published. The methodolog-ical and statistical review policies of many majormedical journals have improved over the last de-cade.6 However, deficiencies in the reporting ofmultivariable regression models still exist in somesubspecialty journals, including the obstetrics andgynecology literature.4 It is presently unknownwhether similar deficits remain in the pulmonarymedicine and critical care medicine literature. There-fore, we used preexisting guidelines for assessingmultivariable models and investigated the quality ofreporting multivariable logistic regression analyses infive major pulmonary and critical care journals.

Materials and Methods

We manually searched all of the original research articlespublished in five prominent pulmonary and critical care journalsover a 6-month period (July 1 through December 31, 2000). Thefive journals (American Journal of Respiratory and Critical CareMedicine, CHEST, Critical Care Medicine, The European Respi-ratory Journal, and Thorax) were chosen for the following threereasons: (1) their large circulation size; (2) their internationaldistribution; and (3) the inclusion of original research articles.Any article stating that either a multivariable logistic regressionanalysis was performed in the “Methods” section or reporting the

results of a multivariable logistic regression analysis in the“Results” section was selected for further examination andreview.

Various guidelines have been recommended that can improvethe appropriate execution, interpretation, and reporting of mul-tivariable logistic regression methods.1,5,7 Using some of thecriteria established by Concato et al,1 we examined publishedmanuscripts for 12 potential limitations in their reporting ofmultivariable logistic regression analysis, including the following:(1) inclusion of odds ratio and confidence intervals for variablesin the final model; (2) listing the type of statistical softwarepackage that was used to perform the analysis; (3) classification orcoding of the independent variables included in the final model;(4) testing for interaction terms; (5) collinearity of independentvariables; (6) goodness of fit; (7) potential overfitting of themodel; (8) conformity of linear gradients; and (9) explanation ofhow the explanatory variables that were used in the model wereinitially selected. The 10th potential limitation was reserved onlyfor those articles that used pair-matched case-control data. Wedetermined whether the authors reported that a conditionallogistic regression model was used. Finally, if a logistic regressionanalysis was used to create a predictive model, we determined(11) whether a receiver operating characteristic (ROC) curve wasreported for the final model and (12) whether a separatevalidation set was used.

Study Criteria

The following criteria were used to categorize each article in asystematized manner for each of the potential limitations inreporting style:

1. Inclusion of Odds Ratio and Confidence Intervals for theFinal Model. This criterion was fulfilled if the odds ratioand 95% confidence interval for at least the primaryexplanatory (independent) variable of interest were in-cluded anywhere in the text of the manuscript or in a table.If the article included more than one multivariable logisticregression model, only one of the models had to include anodds ratio and confidence interval to be classified as beingproperly recorded. In addition, we recorded whether thearticle reported that the stepwise regression was performedin a forward or backward manner.

2. List Statistical Package. Articles also were examined todetermine whether the authors included the type of statis-tical program that was used in the analyses. Citing thestatistical program used to perform the multivariable anal-ysis should be analogous to a basic science researcherindicating the particular experimental protocol used forphysiologic measurements.1 The type of program could bementioned anywhere in the text of the manuscript orincluded as a reference in order to fulfill this criterion.

3. Coding of Variables. The proper reporting of the coding ofvariables is important because the effect of an independentvariable on the outcome variable depends on the corre-sponding units of measurement and the manner in whichthe variable was coded. For example, the regression coef-ficient for the impact of age on mortality will be different ifage is coded in 1-year increments, in 10-year increments, ordichotomously as � 65 years vs � 65 years.1 Articles wereconsidered to properly report the coding of variables if themethod of coding for all of the variables that remained inthe final statistical model could easily be determined (suchas gender being a dichotomous variable) or were refer-enced anywhere in the article. If the odds ratio could not becalculated accurately because the coding of all of thevariables in the final model was not designated or able to be

924 Special Report

Page 3: special report

determined, the article was considered to have improperlyreported the coding of variables.

4. Interaction or Effect Modification. An interaction occursbetween independent variables if the impact of one vari-able on the outcome event depends on the level of anothervariable.3 Articles were considered to properly report theirtesting for interaction terms if the concepts of interactionor effect modification were mentioned anywhere in themanuscript or if a single independent variable in anymultivariable logistic regression model was reported as across between two single variables.

5. Collinearity. If two of the independent variables are highlycorrelated with one another, then collinearity occurs andmay create highly unstable estimated regression coeffi-cients. Articles were considered to properly report testingfor collinearity if a test for collinearity or the concept ofcollinearity was discussed anywhere in the article.

6. Goodness of Fit. Indexes of goodness-of-fit evaluate howeffectively the calculated model fits the actual data forestimating the outcome variable.1,5 Articles were consid-ered to properly report goodness-of-fit testing if a test forgoodness of fit, such as the Hosmer-Lemeshow statistic,5was used in the article. If the article did not mention anygoodness-of-fit techniques (eg, jackknife, bootstrap, Hos-mer-Lemeshow, deviance, or reporting of a percentage forthe dependent variable that was correctly identified by themodel), the article was classified as having improperlyreported a goodness-of-fit test.

7. Overfitting of the Model. Risk estimates may be unreliableif the model includes too many independent variables andtoo few outcome events. Although controversial, resultshaving � 10 outcome events per explanatory (independent)variable may result in inaccurate risk estimates.1 We at-tempted to determine the number of outcome events andthe number of explanatory variables included in the finalregression model. In those articles that reported thisinformation, we categorized the articles with a ratio of� 10:1 (10 outcome events for each single explanatoryvariable in the final model) as possibly having an overfitmodel.

8. Nonconformity to a Linear Gradient. When a linear regres-sion coefficient is estimated for a variable, the implicationis that, regardless of the value of the variable, each specificunit of change of this variable should have the same effecton the outcome. For explanatory variables that are ordinalor continuous, several gradients of effect on the outcomemay occur for different ranges of the explanatory variable.1If the value of the regression coefficient is reported as anaverage effect over the range of the explanatory variable,this may be inaccurate as the variable may have differenteffects at different ranges. Therefore, in the multivariablelogistic models that include any continuous or ordinalexplanatory variables, we examined the article to assesswhether the authors had assessed for conformity to a lineargradient for these variables.5 Unless the manuscript re-ported determining the impact of each explanatory variableseparately in zones of ranked data or mentioned thatconformity to a linear gradient was addressed, the manu-script was coded as being improperly reported.

9. Selection Process of Explanatory Variable Initially Includedin the Multivariable Model. The explanatory variables thatare included in the initial multivariable logistic regressionanalysis should have a specific reason for being selected.We classified an article as properly addressing this concernif the article included any explanation as to why theexplanatory variables included in the final analysis wereselected, such as significance in a prior univariate analysis,

a previously reported association in the medical literature,or some degree of evidence of the biological plausibility ofan association.

10. Criteria for Conditional Regression Analyses. For thosearticles that used pair-matched, case-control data, werequired the authors to report in any part of the manu-script that a conditional (as opposed to unconditional)logistic regression analysis was used.

11. and 12. Criteria for Prediction Models. In some manu-scripts, multivariable logistic regression analyses are usedto develop predictive models. We developed additionalcriteria for these specific articles. In regard to goodness offit, the authors were required to report some measure ofmodel fit such as an ROC curve to receive credit forproperly reporting their results. In addition, we classifiedarticles as to whether a specific validation patient cohortalso was used to test the validity of the initial model (eg,performing a test analysis on a subsample of patientsfollowed by a subsequent validation analysis on the re-maining patients or repeating the analysis on an indepen-dent sample of patients).

All of the articles were distributed and examined by two of thethree authors to determine the accuracy of logistic regressionanalysis reporting. If there was disagreement between theinitial reviews, then the third person, who had not previouslyexamined the article, rendered the final decision.

Results

Over the 6-month period, a total of 964 originalarticles in five pulmonary-critical care journals weremanually searched to determine whether each indi-vidual manuscript included a multivariable logisticregression analysis. A total of 81 articles was identi-fied that used multivariable logistic regression mod-els in the analysis of the data, representing 8.4% ofall published original research articles (81 articles).These articles were distributed among the followingfive journals: American Journal of Respiratory andCritical Care Medicine, 25 articles; CHEST, 19 articles;Critical Care Medicine, 17 articles; The EuropeanRespiratory Journal, 12 articles; and Thorax, 8 articles.

The overall results are summarized in Table 1.Odds ratios for the independent variables includedin the final multivariable logistic model were re-ported in 79% (64 articles). Confidence intervals forthese odds ratios were included in 74% of the articles(60). Only 22 of the articles (27%) specified whetherthe multivariable regression analysis was performedin a forward or backward manner. Of these 22 arti-cles, 12 used a backward regression analysis and10 used a forward process. The type of statisticalpackage was reported in 69% (56 articles). The mostcommonly used programs were SPSS (SPSS; Chi-cago, IL) [39 articles; 39%], SAS (SAS Institute;Cary, NC) [18 articles; 32%], and STATA (STATA;College Station, TX) [9 articles; 16%].

Only 65% of the 81 articles (53 articles) properlyincluded a coding classification for those indepen-dent variables that had been included in the final

www.chestjournal.org CHEST / 123 / 3 / MARCH, 2003 925

Page 4: special report

model. Only 12% (10 articles) mentioned whetherinteraction terms were examined in their statisticalmodels. Testing for collinearity was identified in 1%(1 article). Goodness-of-fit analyses were included in16% (13 articles). In addition, only three articles(4%) specifically mentioned in the manuscript howmissing data points were handled.

Overfitting of the regression model could be as-sessed in 57 of the manuscripts. Using the criteria ofmaintaining � 10 outcome events per each explana-tory (independent) variable, 39% of the articles (22articles) included final logistic regression models thatwere suspicious for overfitting.1 The criterion fornonconformity to a linear gradient did not apply to42 of the articles in which the analyses used onlybinary independent variables. For the remaining 39articles, only 6 (15%) included any indication ofassessing for nonconformity to a linear gradient inthe text of the manuscript. Finally, the reason thatthe specific explanatory variables were chosen forthe multivariable logistic regression analysis wasreported in only 23% of the 81 articles (19 articles).The most common reason for the inclusion of aspecific variable was a significant association in aunivariate analysis (95%; 18 of 19 articles). Onemanuscript (5%) reported that the selection of ex-planatory variables was due to reasons of biologicalplausibility.

The majority of articles (94%; 76 of 81 articles)used multivariable logistic regression modeling in a

descriptive manner in order to determine the effectof an individual variable on a specific outcome whileadjusting for differences in other factors. The re-maining articles (6%; 5 articles) were primarily con-cerned with creating a predictive modeling strategyto estimate the likelihood of an outcome for a givenpatient from a set of observations particular to thatpatient. Goodness-of-fit tests and ROC curves werereported in 80% (four of five articles). However, avalidation set was included in only three of thearticles (60%). Five other articles were pair-matchedcase-control studies. Four of the five articles prop-erly indicated that a conditional multivariable logisticregression analysis had been performed.

Discussion

The results of this study raise concerns about theproper reporting of multivariable logistic regressionanalyses in today’s pulmonary medicine and criticalcare medicine literature. In our review of the pul-monary and critical care medicine literature, weidentified several important deficiencies in the man-ner in which multivariable logistic regression modelswere expressed. Previous investigations4,8 have re-ported the use of inefficient statistical analyses inunivariate models in the general medical literatureand multivariable logistic regression analyses in theobstetrics and gynecology literature. The inaccuratereporting of multivariable models makes a properand comprehensive interpretation of a study’s resultsdifficult, if not impossible.

For example, the apparent effect of an indepen-dent variable will depend on the corresponding unitsof measure and the coding of that variable.1 If thevalues of the regression coefficients are reportedwithout identifying the units of coding for the inde-pendent variable, then readers will be unable tointerpret the actual magnitude of the risk estimates.The importance of testing for interaction terms isevident in a 1999 trial9 examining the effect of bodyposition on the incidence of nosocomial pneumoniain mechanically ventilated patients. If interactionwas not considered in the logistic regression model,then the regression coefficient for the independentvariable (eg, the body position of an intubatedpatient that had been coded as semi-recumbent orhorizontal) represents the impact of this variable onthe outcome event (such as the development ofventilator-associated pneumonia) for all levels ofanother variable (eg, the use of enteral nutritioncoded as present or absent). If body position andenteral nutrition have a significant interaction, theimpact of body position depends on the presence orabsence of enteral feedings.9 Without attention to

Table 1—Summary of Results

Potential Limitations

Articles

No. %

Odds ratios for independent variables included inthe final model

64/81 79

Confidence intervals for the odds ratios reported 60/81 74Specified type of multivariable regression

modeling (forward or backward)22/81 27

Type of statistical package reported or referencedin the manuscript

56/81 69

Proper coding classification for independentvariables in the final model reported

53/81 65

Reported testing for interaction terms or effectmodification

10/81 12

Reported testing for collinearity betweenindependent variables

1/81 1

Possibility of including an overfit regression model 22/57 39Assessment for nonconformity to a linear gradient 6/39 15Specified the reason for choosing the variables

included in the multivariable model19/81 23

Goodness-of-fit tests and ROC curves included forpredictive models

4/5 80

Inclusion of a validation set for predictive models 3/5 60Use of conditional multivariable analysis for pair-

matched case-control studies4/5 80

926 Special Report

Page 5: special report

interaction, the coefficient for body position willreport a misleading quantitative estimate of theimpact of the body position of an intubated patienton the development of ventilator-associated pneu-monia.1 Finally, goodness-of-fit indexes evaluate howeffectively the calculated model fits the actual datafor estimating the outcome variable,1 and, therefore,the validity of all of the results and conclusions stronglydepends on assessment of the adequacy of the regres-sion model.10

Most clinicians and researchers depend on jour-nals, via the editorial and peer review processes,to ensure that statistical methods in published arti-cles are being used and interpreted appropriately.6Goodman and colleagues6 conducted a cross-sectional survey of medical journals to determine thegeneral policies of the statistical review process.Approximately one third of the 114 journals thatresponded to the survey require statistical review forall accepted manuscripts. The statistical review pol-icies differed between journals according to the sizeof their circulation. Eighty-two percent of journalswith a circulation of � 25,000 maintained a statisticalconsultant on staff compared to only 31% for thosejournals with a circulation of � 4,100. In addition,the managing editors of these sampled journalsestimated that a formal statistical review resulted inan important change in approximately 50% of themanuscripts. Goodman and colleagues6 further eval-uated the peer review process by examining thequality of manuscripts before and after editing at theAnnals of Internal Medicine.11 The quality of multi-variable reporting was rated as one of the mostdeficient factors at the time of submission. However,after the manuscript was edited and revised, theclarity in the reporting of multivariable analyses wasmarkedly improved.

Given the large number of papers submitted forpublication and the limitations on journal space,editors must make important decisions concerningwhat type of information about regression modelsshould be included, excluded, or made available onrequest. Campillo12 suggested that clear-cut publica-tion criteria for generalized regression modelsshould be made available for contributors to medicaljournals. Lang and Secic13 have recommended thatan article with a multivariable logistic regressionanalysis should include a table with the coefficient(�), SE, Wald �2 test value, p value, odds ratio, and95% confidence interval for each independent vari-able. In addition, the coding of each independentvariable and a statement concerning whether themodel was validated should be included. Finally,they recommended that the important issues ofinteraction and collinearity should be addressed inthese manuscripts (Table 2). One of the major

pulmonary and critical care journals has changed its“Instructions for Contributors” in order to limit thelength of each article and to utilize the onlinereporting of the medical literature.14 Authors willnow be required to limit the “Methods” section to500 words; however, an extended account of meth-ods, when appropriate, will be included online in thejournal’s Web repository. Therefore, an additionalvenue for reporting multivariable statistical analysesin the future may be online, thereby optimallyutilizing a Web repository site.

There are some potential modifiers that mayinfluence the interpretation of our results. It was notpossible to determine whether the authors improp-erly performed the statistical analyses or simply didnot accurately report their methods in their articles.However, an improvement in the required reportingof multivariable logistic regression results wouldclearly facilitate the reader’s ability to interpret thedata appropriately. In addition, our review examinedarticles from 2000, and it is possible that the stan-dards for reporting statistical procedures have im-proved over the last few years. Finally, manuscriptswith other forms of multivariable regression analy-ses, such as linear or proportional hazards regression,were not studied due to there being relatively fewarticles that used these techniques.

Accurate and understandable results are requiredto properly communicate medical research.1 Severaldeficiencies in the reporting of multivariable logisticregression analyses in the pulmonary and criticalcare medicine literature were identified. We recom-mend that the editorial boards of these pulmonaryand critical care journals implement sufficient guide-lines for the proper statistical reporting of multivar-iate analyses that will enable the reader to accuratelyinterpret an article’s results (Table 2). In addition, itmay be reasonable for the editorial boards to con-

Table 2—Potential Requirements for the Reporting ofMultivariable Logistic Regression Analyses in thePulmonary and Critical Care Medicine Literature

Summarize the logistic regression equation in a table including• No. of observations• Coefficient of the explanatory variable and the associated SE• Odds ratio• 95% confidence interval of the odds ratio• p value

Name the statistical package used in the analysisIdentify the variable used in the analysis and include descriptive

statisticsSpecify whether the explanatory variables were assessed for

collinearitySpecify whether the explanatory variables were tested for

interaction or effect modificationWhen possible, state whether the model was validated

www.chestjournal.org CHEST / 123 / 3 / MARCH, 2003 927

Page 6: special report

sider increasing the frequency of involving statisticalconsultants to review manuscripts to a similar degreeas journals use copy editors to correct the grammarand format.

It has been nearly a decade since Cancato et al1first identified problems in the reporting of multiva-riable statistical analyses in the medical literature.Presently, the pulmonary and critical care medicineliterature has not fully addressed these issues orimplemented acceptable guidelines or criteria forthe reporting of multivariable analyses. Hopefully,this article will promote improvements in the report-ing of multivariable analyses and will allow readers tomore accurately interpret study results in futurearticles.

References1 Concato J, Feinstein AR, Holford TR. The risk of determin-

ing risk with multivariable models. Ann Intern Med 1993;118:201–210

2 Palmas W, Denton TA, Diamond GA. Publication criteria forstatistical prediction models. Ann Intern Med 1993, 118:231–232

3 Kleinbaum DG. Logistic regression: a self-learning test. NewYork, NY: Springer-Verlag, 1994

4 Khan KS, Chien PFW, Dwarakanath LS. Logistic regression

models in obstetrics and gynecology literature. Obstet Gy-necol 1999; 93:1014–1020

5 Hosmer DW, Lemeshow S. Applied logistic regression. NewYork, NY: Wiley, 1989

6 Goodman SN, Altman DG, George SL. Statistical reviewingpolicies of medical journals: caveat lector? J Gen Intern Med1998; 13:753–756

7 Harrell FE Jr, Lee KL, Matchar DB, et al. Regression modelsfor prognostic prediction: advantages, problems, and sug-gested solutions. Cancer Treat Rep 1985; 69:1071–1077

8 Bridge PD, Sawilowsky SS. Increasing physicians’ awarenessof the impact of statistics on research outcomes: comparativepower of the t-test and Wilcoxon rank-sum test in smallsamples applied research. J Clin Epidemiol 1999; 52:229–235

9 Drakulovic MB, Torres A, Bauer TT, et al. Supine bodyposition as a risk factor for nosocomial pneumonia in mechan-ically ventilated patients: a randomised trial. Lancet 1999;354:1851–1858

10 Bender R, Grouven U. Logistic regression models used inmedical research are poorly presented [letter]. BMJ 1996;313:628

11 Goodman SN, Berlin J, Fletcher SW, et al. Manuscriptquality before and after peer review and editing at the Annalsof Internal Medicine. Ann Intern Med 1994; 121:11–21

12 Campillo C. Standardizing criteria for logistic regressionmodels. Ann Intern Med 1993; 119:540–541

13 Lang TA, Secic M. How to report statistics in medicine.Philadelphia, PA: American College of Physicians, 1997

14 Tobin MJ. Authors, authors, authors: follow instructions orexpect delay. Am J Respir Crit Care Med 2000; 162:1193–1194

928 Special Report