Statistics for clinicians


Biostatistics course by Kevin E. Kip, Ph.D., FAHA

Professor and Executive Director, Research Center
University of South Florida, College of Nursing
Professor, College of Public Health
Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer’s Institute
Morsani College of Medicine
Tampa, FL, USA

SECTION 6.6

Introduction to survival analysis

Learning Outcome:

Recognize concepts and methods used in survival analysis

Survival Analysis

• A technique to estimate the probability of “survival” (and also risk of disease) that takes into account incomplete subject follow-up.

• Calculates risks over a time period with changing incidence rates.

• Wide application in a variety of disciplines, such as engineering.

Survival Analysis

• With the Kaplan-Meier method (“product-limit method”), survival probabilities are calculated at each time interval in which an event occurs.

• The cumulative survival over the entire follow-up period is derived from the product of all interval survival probabilities.

• Cumulative incidence (risk) is the complement of cumulative survival.

K-M formula:

$$S = \prod_{k=1}^{K} \frac{N_k - A_k}{N_k}$$

Where:
K = number of time intervals
k = sequence of time interval
N_k = number of subjects at risk
A_k = number of outcome events
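The product in the K-M formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `km_survival` and the counts in `example` are hypothetical.

```python
def km_survival(intervals):
    """Cumulative survival from (N_k, A_k) pairs, one pair per event time.

    N_k = number of subjects at risk, A_k = number of outcome events.
    """
    s = 1.0
    for n_at_risk, n_events in intervals:
        # Interval survival probability (N_k - A_k) / N_k, multiplied in turn
        s *= (n_at_risk - n_events) / n_at_risk
    return s


# Hypothetical counts (not from the slides): three event times
example = [(20, 2), (15, 1), (10, 1)]
survival = km_survival(example)
risk = 1 - survival  # cumulative incidence is the complement of survival
print(survival, risk)
```

Because each factor uses only the subjects still at risk at that event time, subjects who leave the study between event times simply shrink the next denominator.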

Survival Analysis

• With the Kaplan-Meier method, subjects with incomplete follow-up (FU) are “censored” at their last known follow-up time.

• An important assumption (often not upheld) is that censoring is “non-informative” (the survival experience of censored subjects is the same as that of subjects with complete FU).

• Non-fatal outcomes can also be studied.
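To make the role of censoring concrete, the sketch below builds the risk set at each event time from individual follow-up records. The cohort of six subjects is hypothetical, and the convention that a subject censored at the same time as an event still counts as at risk is an assumption (it is the usual one, but the slides do not state it).

```python
from collections import Counter


def km_table(times, events):
    """Return (time, n_at_risk, n_events, cumulative_survival) per event time.

    times  : follow-up time for each subject
    events : 1 if the outcome occurred at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    rows, surv = [], 1.0
    for t in sorted(event_counts):
        # Risk set: everyone still under observation at time t. Subjects
        # censored earlier have already dropped out of the denominator.
        n_at_risk = sum(1 for u in times if u >= t)
        surv *= (n_at_risk - event_counts[t]) / n_at_risk
        rows.append((t, n_at_risk, event_counts[t], surv))
    return rows


# Hypothetical cohort (not from the slides): months of follow-up and outcome
times = [3, 5, 5, 8, 10, 12]
events = [1, 0, 1, 0, 1, 0]
table = km_table(times, events)
for row in table:
    print(row)
```

Censored subjects contribute to the denominator up to their last known follow-up time and never to the numerator, which is why the estimate is only unbiased when censoring is non-informative.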

Survival Analysis

• The Life-Table method is conceptually similar to the Kaplan-Meier method.

• The primary difference is that survival probabilities are determined at pre-determined intervals (e.g., years), rather than at the times when events occur.
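A life-table calculation over fixed intervals might look like the sketch below. One common convention, assumed here and not stated in the slides, treats subjects withdrawn during an interval as at risk for half of it. The yearly counts are hypothetical.

```python
def life_table(intervals):
    """Cumulative survival at the end of each fixed interval.

    intervals: list of (n_entering, n_deaths, n_withdrawn), one per interval.
    Assumption: withdrawals count as at risk for half of the interval.
    """
    surv, out = 1.0, []
    for n_entering, n_deaths, n_withdrawn in intervals:
        effective_n = n_entering - n_withdrawn / 2
        surv *= 1 - n_deaths / effective_n
        out.append(surv)
    return out


# Hypothetical yearly counts (not from the slides)
yearly = [(100, 10, 4), (86, 8, 6)]
cumulative = life_table(yearly)
print(cumulative)
```

With many subjects and narrow intervals this approach and the Kaplan-Meier method tend to give similar answers; the interval-based form is mainly convenient when exact event times are not recorded.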

SECTION 6.7

Calculation and Interpretation of Survival Analysis Estimates

Learning Outcome:

Calculate and interpret survival analysis estimates of incidence

Survival Analysis

Example:

• Assume a study of 10 subjects conducted over a 2-year period.

• A total of 4 subjects die.

• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).

What is the probability of 2-year survival, and the corresponding risk of 2-year death?

Columns:
(1) Time to death from entry (months)
(2) No. alive at each time
(3) No. who died at each time
(4) No. lost to FU prior to next time
(5) Proportion died at that time = (3) / (2)
(6) Proportion surviving at that time = 1 – (5)
(7) Cumulative survival to that time
(8) Cumulative risk to that time = 1 – (7)

(1)   (2)   (3)   (4)   (5)     (6)     (7)     (8)
5     10    1     1     0.10    0.90    0.90    0.10
7     8     1     0     0.125   0.875   0.788   0.212
20    ?     1     1     ?       ?       ?       ?


(1)   (2)   (3)   (4)   (5)     (6)     (7)     (8)
5     10    1     1     0.10    0.90    0.90    0.10
7     8     1     0     0.125   0.875   0.788   0.212
20    7     1     1     0.143   0.857   0.675   0.325
22    5     1     0     0.20    0.80    0.54    0.46
24    4     0     0     0.0     1.0     0.54    0.46

Interpretation: What is the 2-year risk of death?


Interpretation: What is the 1-year risk of death?
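Both interpretation questions can be answered by reading the cumulative risk off the table at the chosen horizon. The sketch below recomputes it from columns (1) to (3); the function name `cumulative_risk` is illustrative only. Under these counts the risk comes out at roughly 0.21 at 1 year (only the deaths at 5 and 7 months have occurred) and 0.46 at 2 years.

```python
# (month, no. alive at that time, no. who died) from columns (1)-(3) of the table
rows = [(5, 10, 1), (7, 8, 1), (20, 7, 1), (22, 5, 1), (24, 4, 0)]


def cumulative_risk(rows, horizon_months):
    """Cumulative risk of death up to and including horizon_months."""
    surv = 1.0
    for month, alive, died in rows:
        if month > horizon_months:
            break  # later event times do not affect risk at this horizon
        surv *= (alive - died) / alive
    return 1 - surv


risk_1yr = cumulative_risk(rows, 12)
risk_2yr = cumulative_risk(rows, 24)
print(risk_1yr, risk_2yr)
```

The estimate is a step function: it changes only at event times, so the 1-year value is simply the cumulative risk after the last death before month 12.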

Survival Analysis (Practice)

Example:

• Assume a study of 12 subjects conducted over a 3-year period.

• A total of 5 subjects die.

• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).

What is the probability of 3-year survival, and the corresponding risk of 3-year death?

Complete the worksheet below

Columns:
(1) Time to death from entry (months)
(2) No. alive at each time
(3) No. who died at each time
(4) No. lost to FU prior to next time
(5) Proportion died at that time = (3) / (2)
(6) Proportion surviving at that time = 1 – (5)
(7) Cumulative survival to that time
(8) Cumulative risk to that time = 1 – (7)

(1)   (2)   (3)   (4)   (5)      (6)      (7)      (8)
7     12    1     1     0.0833   0.9167   0.9167   0.0833
11    10    1     0     0.10     0.90     0.8250   0.1750
16    ____  1     0     ____     ____     ____     ____
24    ____  1     1     ____     ____     ____     ____
30    ____  1     0     ____     ____     ____     ____
36    ____  0     0     ____     ____     ____     ____

What is the probability of 3-year survival, and the corresponding risk of 3-year death? Survival _______ Death _________

(1)   (2)   (3)   (4)   (5)      (6)      (7)      (8)
7     12    1     1     0.0833   0.9167   0.9167   0.0833
11    10    1     0     0.10     0.90     0.8250   0.1750
16    9     1     0     0.1111   0.8889   0.7333   0.2667
24    8     1     1     0.125    0.875    0.6417   0.3583
30    6     1     0     0.1667   0.8333   0.5347   0.4653
36    5     0     0     0.0      1.0      0.5347   0.4653

What is the probability of 3-year survival, and the corresponding risk of 3-year death? Survival _0.5347_ Death _0.4653_
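As a check on the worksheet, the cumulative survival at 36 months can be written as a single product of the interval survival probabilities in column (6), using exact fractions so that no intermediate rounding enters:

$$S(36) = \frac{11}{12} \times \frac{9}{10} \times \frac{8}{9} \times \frac{7}{8} \times \frac{5}{6} = \frac{385}{720} \approx 0.535$$

$$\text{Risk}(36) = 1 - S(36) \approx 0.465$$

Carrying rounded intermediate values through the table can shift the fourth decimal place, so small discrepancies at that level are a rounding effect rather than a different method.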

SECTION 6.8

Logistic Regression Model

Learning Outcome:

Recognize components and interpret parameters from the logistic regression model

Logistic Regression Analysis

Conceptually similar to linear regression, but with a dichotomous outcome.

Outcome is usually coded as “0” or “1”, with “1” referring to presence of the outcome of interest (although SAS by default models the lower-ordered value, 0, as the event).

p represents the probability that the outcome is present (i.e., a value of 1), given the particular covariate values of an individual.

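Logistic Regression Analysis

The multiple logistic regression model can be written in different ways:

$$p = \frac{\exp(b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p)}{1 + \exp(b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p)}$$

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

where:
p = expected probability that outcome is present
x1 through xp = independent variables (the subscript p here counts predictors; it is not the probability)
b0 through bp = regression coefficients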

Logistic Regression Analysis

bi = change in the expected log odds of the outcome per 1-unit change in xi, holding other predictors constant

The antilog of a regression coefficient, exp(bi), gives the odds ratio

Logistic Regression Analysis

Example: Estimate the risk of incident CVD among persons defined as obese.

Variable               b        χ2       p-value
Intercept              -2.367   307.38   0.0001
Obesity (yes vs. no)   0.658    9.87     0.0017

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

$$\ln\left(\frac{p}{1-p}\right) = -2.367 + 0.658(\text{Obesity}) = \text{log odds}$$

exp(0.658) = 1.93 (odds ratio)
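The fitted equation can also be turned back into a predicted risk. The sketch below assumes obesity is coded 1 = yes and 0 = no (the slide says only "yes vs. no") and uses the two coefficients from the table; the function name is illustrative. Under that coding the predicted risk is roughly 0.15 for obese persons and 0.09 for non-obese persons.

```python
import math

b0, b1 = -2.367, 0.658  # intercept and obesity coefficient from the table


def predicted_probability(obese):
    """Predicted probability of incident CVD; obese is 1 (yes) or 0 (no)."""
    log_odds = b0 + b1 * obese
    odds = math.exp(log_odds)
    return odds / (1 + odds)


p_nonobese = predicted_probability(0)
p_obese = predicted_probability(1)

# The ratio of the two odds recovers exp(b1), the odds ratio
odds_ratio = (p_obese / (1 - p_obese)) / (p_nonobese / (1 - p_nonobese))
print(p_nonobese, p_obese, odds_ratio)
```

Note that the ratio of the two probabilities (a risk ratio) is smaller than the odds ratio of 1.93; the two measures agree closely only when the outcome is rare.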

Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

Variable                        b        Wald χ2   p-value
Intercept                       -3.065   8.015     0.027
Age (per year)                  0.036    5.334     0.021
Gender (female = 1)             -0.530   5.082     0.024
Body mass index (per unit)      0.029    2.187     0.139
Physical activity (per unit)    -0.001   0.000     0.996
History of diabetes (1 = yes)   1.067    9.250     0.002

Write out the logistic regression equation below. (Practice)

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

$$\ln\left(\frac{p}{1-p}\right) = \text{\_\_\_\_\_\_\_\_}$$

Write out the logistic regression equation below.

$$\ln\left(\frac{p}{1-p}\right) = -3.065 + 0.036(\text{age}) - 0.530(\text{female}) + 0.029(\text{BMI}) - 0.001(\text{physical activity}) + 1.067(\text{diabetes})$$

So, the predicted odds of an individual being on a statin drug =

exp[-3.065 + 0.036(age) – 0.530(female) + 0.029(BMI) – 0.001(physical activity) + 1.067(diabetes)]

AND

Predicted probability = predicted odds / (1 + predicted odds).

Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics:

Age=55; male; BMI=31.4; physical activity level=2; diabetic

Predicted odds = exp[-3.065 + 0.036(55) – 0.530(0) + 0.029(31.4) – 0.001(2) + 1.067(1)]

= exp(0.8906) = 2.437

Predicted probability = predicted odds / (1 + predicted odds) = 2.437 / 3.437 = 0.71
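The same calculation can be scripted so that any covariate profile can be plugged in. This is a minimal sketch using the coefficients from the table; the dictionary keys and the function name `predict` are illustrative choices, not names from the slides. For the profile above it gives a predicted probability of about 0.71.

```python
import math

# Coefficients copied from the statin-use table
COEF = {
    "intercept": -3.065,
    "age": 0.036,
    "female": -0.530,
    "bmi": 0.029,
    "physical_activity": -0.001,
    "diabetes": 1.067,
}


def predict(age, female, bmi, physical_activity, diabetes):
    """Return (log odds, odds, probability) of statin use for one profile."""
    log_odds = (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["female"] * female
        + COEF["bmi"] * bmi
        + COEF["physical_activity"] * physical_activity
        + COEF["diabetes"] * diabetes
    )
    odds = math.exp(log_odds)
    return log_odds, odds, odds / (1 + odds)


# Age 55, male, BMI 31.4, physical activity level 2, diabetic
log_odds, odds, prob = predict(
    age=55, female=0, bmi=31.4, physical_activity=2, diabetes=1
)
print(log_odds, odds, prob)
```

Keeping the signs attached to the coefficients, rather than retyping them term by term, avoids the easiest slip in hand calculation: adding a term whose coefficient is negative.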

Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics: PRACTICE

Age=52; female; BMI=29.5; physical activity level=3; non-diabetic

Predicted odds = ________

Predicted probability = predicted odds / (1 + predicted odds) = ________

Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics:

Age=52; female; BMI=29.5; physical activity level=3; non-diabetic

Predicted odds = exp[-3.065 + 0.036(52) – 0.530(1) + 0.029(29.5) – 0.001(3) + 1.067(0)]

= exp(-0.8705) = 0.42

Predicted probability = predicted odds / (1 + predicted odds) = 0.42 / 1.42 = 0.296
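Written out term by term with the coefficients from the table, the arithmetic for this profile is as follows. The symbol $\eta$ for the log odds is notation introduced here for convenience; it does not appear in the slides.

$$\eta = -3.065 + 0.036(52) - 0.530(1) + 0.029(29.5) - 0.001(3) + 1.067(0)$$

$$\eta = -3.065 + 1.872 - 0.530 + 0.8555 - 0.003 + 0 = -0.8705$$

The probability can then be obtained in one step, which is algebraically the same as odds / (1 + odds):

$$p = \frac{1}{1 + e^{-\eta}} = \frac{1}{1 + e^{0.8705}} \approx 0.30$$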

Produce odds ratio estimates of statin use for the following (Practice):

Age (per year) =
Age (per 10 years) =
Female gender =
History of diabetes =

Produce odds ratio estimates of statin use for the following:

Age (per year) = exp(0.036) = 1.04
Age (per 10 years) = exp(10 x 0.036) = 1.43
Female gender = exp(-0.530) = 0.59
History of diabetes = exp(1.067) = 2.91
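These odds ratios follow directly from exponentiating the coefficients, and rescaling a continuous predictor just multiplies the coefficient before exponentiating. A short sketch, with the helper name `odds_ratio` chosen for illustration:

```python
import math


def odds_ratio(b, units=1):
    """Odds ratio for a change of `units` in the predictor: exp(units * b)."""
    return math.exp(units * b)


or_age_1 = odds_ratio(0.036)             # per 1 year of age
or_age_10 = odds_ratio(0.036, units=10)  # per 10 years of age
or_female = odds_ratio(-0.530)           # female vs. male
or_diabetes = odds_ratio(1.067)          # diabetes vs. no diabetes

# Rescaling is multiplicative on the odds scale: OR per 10 years = (OR per year)^10
assert math.isclose(or_age_10, or_age_1 ** 10)
print(or_age_1, or_age_10, or_female, or_diabetes)
```

An odds ratio below 1, as for female gender, corresponds to a negative coefficient and indicates lower odds relative to the reference group.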


Interpret odds ratio estimates of statin use for the following:

Age (per 10 years) = exp(10 x 0.036) = 1.43
For every 10-year increase in age, the adjusted odds of being on a statin drug increase 1.43-fold.

History of diabetes = exp(1.067) = 2.91
Persons with diabetes have 2.91 times the adjusted odds of being on a statin drug compared to persons without diabetes.

SECTION 6.9

SPSS for Logistic Regression Analysis

Learning Outcome:

Use SPSS to fit and interpret a logistic regression model

SPSS: Analyze > Regression > Binary Logistic

Enter the Dependent Variable and the Covariates

SPSS: Analyze > Descriptive Statistics > Crosstabs

Row = Hx diabetes; Col = Statin use

Odds Ratio = (odds of exposure among cases) / (odds of exposure among controls)

= (17 / 88) / (24 / 372) = 0.193 / 0.0645 = 2.99
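The crude odds ratio from the crosstab can be verified with the cross-product form. The cell counts are the four numbers shown above; the function name and argument labels are illustrative.

```python
def odds_ratio_2x2(a, b, c, d):
    """Odds ratio (a / b) / (c / d), computed as the cross-product (a*d) / (b*c)."""
    return (a * d) / (b * c)


# Counts from the crosstab: (17 / 88) / (24 / 372)
crude_or = odds_ratio_2x2(17, 88, 24, 372)
print(crude_or)
```

This unadjusted value of about 2.99 is close to, but not the same as, the adjusted odds ratio of 2.91 from the logistic model, which also accounts for age, gender, body mass index, and physical activity.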
