BIOSTATS 640 Exam 3 – Spring 2020 *** CORRECTED 4/29/2020 *** Name ________________

Z:\bigelow\...\2020\...\BIOSTATS 640 Exam III 2020.docx Page 1 of 12

BIOSTATS 640 Intermediate Biostatistics Spring 2020

Examination III Unit 6 – Analysis of Variance

Unit 7 – Logistic Regression & Unit 8 – Introduction to Survival Analysis

Due: Thursday May 7, 2020 Sorry. I cannot accept late submissions as I must submit grades by May 12, 2020

Before you begin: As before, this is a "take-home" exam. You are welcome to use any reference materials you wish. You are welcome to use the computer as you wish, too. However, you MUST work this exam by yourself, and you may not consult with anyone (except me, and that is fine…).

Spring 2020 How to submit your exam:

Under the circumstances of remote learning during the COVID-19 pandemic, all students are kindly requested to submit their exams in BLACKBOARD LEARN using the ASSIGNMENT tab at left. Thank you – cb.

Signature

This is to confirm that in completing this exam, I worked independently and did not consult with anyone.

Name: ___________________________________________________________

Date: ___________________________

Thank you!


1. (30 points total) A logistic regression analysis was used to explore the relationship between diabetes (presence or absence) and body mass index (BMI). The Y-variable for this analysis was Y = Diabetes, coded Y = 1 for persons with diabetes and Y = 0 for persons without diabetes. The X-variable for this analysis was X = BMI, where BMI is measured in kg/m². The following fitted model was obtained:

$$\ln\left(\frac{\hat{\pi}}{1-\hat{\pi}}\right) = -3.034 + 0.075X$$

With the following values of (-2) ln-likelihood:

(-2) ln-Likelihood (intercept only) = 233.304
(-2) ln-Likelihood (intercept + BMI) = 227.704

1a. (10 points) Using the information given in the fitted model, together with your understanding of logistic regression, complete the following table by filling in the four blanks in the 2nd row.

| | Coefficient | Standard Error | Wald Statistic | p-value | OR | 95% CI for OR |
|---|---|---|---|---|---|---|
| Intercept | Not asked | 0.893 | Not asked | Not asked | - | - |
| BMI | _________ | 0.032 | ________ | _______ | _______ | Not asked |

1b. (10 points) Using the information given in the fitted model, calculate the value of the estimated odds ratio for the outcome of diabetes in relationship to a 5 kg/m² increase in BMI.

1c. (10 points) Perform a likelihood ratio test of the null hypothesis that the unknown true beta coefficient for BMI is zero.
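The quantities asked for in problems 1a through 1c follow from the fitted BMI coefficient, its standard error, and the two (-2) ln-likelihood values. The following is a minimal Python sketch of that arithmetic, not an official solution. The variable names and the helper `chi2_sf_1df` are illustrative, and the sketch assumes a two-sided Wald test and a chi-square reference distribution with 1 degree of freedom for the likelihood ratio statistic.

```python
import math

# Values taken from the problem statement
beta_bmi = 0.075        # fitted coefficient for BMI
se_bmi = 0.032          # standard error of the BMI coefficient
neg2ll_null = 233.304   # (-2) ln-likelihood, intercept only
neg2ll_bmi = 227.704    # (-2) ln-likelihood, intercept + BMI


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# 1a: Wald z statistic, its two-sided p-value, and the OR per 1 kg/m^2
wald_z = beta_bmi / se_bmi
wald_p = chi2_sf_1df(wald_z ** 2)   # z^2 is chi-square with 1 df under the null
or_per_unit = math.exp(beta_bmi)

# 1b: OR for a 5 kg/m^2 increase (multiply the coefficient by 5, then exponentiate)
or_per_5 = math.exp(5 * beta_bmi)

# 1c: likelihood ratio statistic = difference in (-2) ln-likelihood, 1 df
lr_stat = neg2ll_null - neg2ll_bmi
lr_p = chi2_sf_1df(lr_stat)

print(f"Wald z = {wald_z:.3f}, p = {wald_p:.4f}, OR = {or_per_unit:.3f}")
print(f"OR for +5 kg/m^2 = {or_per_5:.3f}")
print(f"LR chi-square(1) = {lr_stat:.3f}, p = {lr_p:.4f}")
```

Some software reports the Wald statistic as the chi-square value z² rather than z; the p-value is the same either way.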


2. (20 points total) Consider again the same logistic regression analysis setting of problem #1. Further analysis of diabetes explored two additional predictors: treatment with digoxin (X2) and non-white race (X3). The following is a full coding manual.

| Role | Variable | Label | Codings |
|---|---|---|---|
| Outcome | Y | Diabetes | 1 = yes, 0 = no |
| Predictors | X1 | BMI | continuous, kg/m² |
| | X2 | Digoxin | 1 = yes, 0 = no |
| | X3 | Race | 1 = non-white, 0 = other |

The fitted logit model is now the following.

$$\ln\left(\frac{\hat{\pi}}{1-\hat{\pi}}\right) = -2.948 + 0.081X_1 - 0.796X_2 + 0.904X_3$$

The following (-2) ln-likelihood values are provided for you:

(-2) ln-Likelihood (intercept only model) = 233.304
(-2) ln-Likelihood (intercept + X1 model) = 227.704
(-2) ln-Likelihood (intercept + X1 + X2 + X3 model) = 217.94

2a. (10 points) Using the fitted logit model, calculate the estimated probability of diabetes for a person with BMI of 24 kg/m², on digoxin treatment, and being of non-white race.
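Problem 2a amounts to evaluating the fitted logit at the stated covariate values and then inverting the logit. A minimal Python sketch of that calculation, offered as an illustration rather than an official solution (the function name `predicted_probability` is hypothetical):

```python
import math

# Coefficients from the fitted three-predictor logit model
b0, b_bmi, b_digoxin, b_race = -2.948, 0.081, -0.796, 0.904


def predicted_probability(bmi, digoxin, nonwhite):
    """Return (logit, probability) for the given covariate values."""
    logit = b0 + b_bmi * bmi + b_digoxin * digoxin + b_race * nonwhite
    prob = 1.0 / (1.0 + math.exp(-logit))   # inverse logit
    return logit, prob


# BMI of 24 kg/m^2, on digoxin (X2 = 1), non-white race (X3 = 1)
logit, prob = predicted_probability(bmi=24, digoxin=1, nonwhite=1)
print(f"logit = {logit:.3f}, estimated probability = {prob:.3f}")
```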


2b. (2 points) Carry out the appropriate likelihood ratio test to compare the reduced model containing X1 = BMI with a full model containing all three predictors X1 = BMI, X2 = Digoxin and X3 = Race.

2c. (2 points) Compare the value of the regression coefficient for BMI in the reduced model (shown in question #1) versus in the full model (shown here in question #2). By what percent does it change, relative to its value in the full model?

2d. (6 points) Using your answers to problems #2b and #2c, what is your conclusion regarding the existence of evidence of a confounding of a BMI-diabetes relationship by the other two variables digoxin and race? Give your answer in terms that a layperson can understand.
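The two numerical ingredients of problems 2b and 2c can be sketched as follows. This is an illustrative Python calculation, not an official solution. It assumes the likelihood ratio statistic is referred to a chi-square distribution with 2 degrees of freedom (two added predictors), for which the upper-tail probability has the closed form exp(-x/2); the variable names are hypothetical.

```python
import math

neg2ll_reduced = 227.704   # intercept + X1
neg2ll_full = 217.94       # intercept + X1 + X2 + X3
df_added = 2               # X2 and X3 are added to the reduced model

# 2b: likelihood ratio statistic and its p-value
lr_stat = neg2ll_reduced - neg2ll_full
lr_p = math.exp(-lr_stat / 2.0)   # exact chi-square upper tail when df = 2

# 2c: percent change in the BMI coefficient, relative to the full model
beta_reduced = 0.075
beta_full = 0.081
pct_change = 100.0 * (beta_reduced - beta_full) / beta_full

print(f"LR chi-square({df_added}) = {lr_stat:.3f}, p = {lr_p:.4f}")
print(f"Change in BMI coefficient = {pct_change:.1f}% of its full-model value")
```

The closed form exp(-x/2) holds only for 2 degrees of freedom; other degrees of freedom need a chi-square table or statistical software.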


3. (20 points total) A balanced factorial two-way analysis of variance study was conducted to investigate variations in concentration of the amino acid alanine (mg/100 ml) in the hemolymph of millipedes by levels of two factors: gender at two levels (male and female) and species at three levels (species 1, species 2, and species 3). For each combination of gender and species, the sample size obtained was 4.

3a. (5 points) State the analysis of variance model using the notation that is appropriate for a deviation from means parameterization. Be sure to define all your terms, subscripts and constraints on the parameters.

3b. (5 points) Complete the following analysis of variance table by replacing all instances of ??.

| Source | df | Sum of Squares | Mean Square | F | p-value |
|---|---|---|---|---|---|
| Due Model | ?? | ?? | ?? | ?? | ?? |
| Gender | ?? | 138.72 | ?? | ?? | ?? |
| Species | ?? | 55.26 | ?? | ?? | ?? |
| Gender x Species | ?? | 6.89 | ?? | ?? | ?? |
| Due Error ("within groups") | ?? | 38.02 | ?? | | |
| Total, corrected | ?? | ?? | ?? | | |

3c. (10 points) Perform the appropriate hypothesis tests to assess variations in alanine concentration by gender and species and by gender and species in combination. Tip!! Take care to think about the order in which you perform hypothesis tests and the implication of the findings you obtain from one hypothesis test on subsequent hypothesis tests (if any). Report p-values. Write a one paragraph report of your findings and your conclusions.
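The degrees of freedom, mean squares, and F statistics in a balanced two-way table follow mechanically from the design (2 genders, 3 species, 4 observations per cell) and the given sums of squares. A minimal Python sketch of that bookkeeping, offered as an illustration only; the p-values would still have to come from an F table or statistical software, and the dictionary layout is an assumption of this sketch.

```python
# Design: a = 2 genders, b = 3 species, n = 4 observations per cell
a, b, n = 2, 3, 4

ss = {"Gender": 138.72, "Species": 55.26, "Gender x Species": 6.89}
ss_error = 38.02

df = {
    "Gender": a - 1,
    "Species": b - 1,
    "Gender x Species": (a - 1) * (b - 1),
}
df_model = sum(df.values())     # model df is the sum of its component df
df_error = a * b * (n - 1)      # within-cell df
df_total = a * b * n - 1        # corrected total df

ss_model = sum(ss.values())
ss_total = ss_model + ss_error

ms_error = ss_error / df_error
ms = {term: ss[term] / df[term] for term in ss}
f_stat = {term: ms[term] / ms_error for term in ss}
f_model = (ss_model / df_model) / ms_error

print(f"Model: df={df_model}, SS={ss_model:.2f}, F={f_model:.2f}")
for term in ss:
    print(f"{term}: df={df[term]}, MS={ms[term]:.3f}, F={f_stat[term]:.2f}")
print(f"Error: df={df_error}, MS={ms_error:.4f}")
print(f"Total: df={df_total}, SS={ss_total:.2f}")
```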


4. (20 points, total)

Questions #4a – #4c all pertain to the following setting: The World Almanac and Book of Facts lists notable people of the past in various occupational categories; it also reports how many years each person lived. Do these sample data provide evidence that notable people in different occupations have different average lifetimes? To investigate this question, the lifetimes for 973 people in various occupation categories were recorded. Consider the following ANOVA output.

| Source | DF | SS | MS | F | P-value |
|---|---|---|---|---|---|
| Occupation | ?_____ | ?____ | 2749 | ?____ | < .0001 |
| Error | 968 | 195149 | 202 | | |
| Total | 972 | 206147 | | | |

4a. (2 points) Fill in the three missing values in this ANOVA table. Show your work.

4b. (1 point) How many different occupations were considered in this analysis? Explain how you know.

4c. (2 points) In 1-2 sentences, what is your conclusion from this ANOVA?
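In a one-way ANOVA table the rows add up: the total df and SS are the sums of the between-group and error entries, and F is the ratio of the two mean squares. A short Python sketch of how the missing entries and the number of groups could be recovered from the values given, as an illustration only (variable names are hypothetical):

```python
# Given entries from the ANOVA table
df_error, df_total = 968, 972
ss_error, ss_total = 195149, 206147
ms_occupation, ms_error = 2749, 202

# Rows add up to the total
df_occupation = df_total - df_error
ss_occupation = ss_total - ss_error

# F is the ratio of mean squares
f_occupation = ms_occupation / ms_error

# Between-group df equals (number of groups - 1)
n_occupations = df_occupation + 1

print(f"Occupation: df={df_occupation}, SS={ss_occupation}, F={f_occupation:.2f}")
print(f"Number of occupation categories = {n_occupations}")
```

As a consistency check, SS divided by df for Occupation is 2749.5, which agrees with the tabled mean square of 2749 up to rounding.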


Questions #4d – #4f pertain to the following setting: A food company was interested in how texture might affect the palatability of a particular food. They set up an experiment in which they looked at two different coarseness levels of the final product (coarse or fine). The experimenters then randomly assigned 400 people to each of the two treatments. The data collected resulted in the following ANOVA table.

| Source | DF | SS | MS | F |
|---|---|---|---|---|
| Coarseness | ?_____ | ?_____ | ?_____ | ?_____ |
| Error | ?_____ | 6113 | ?_____ | |
| Total | ?_____ | 16722 | | |

4d. (2 points) Fill in the seven missing values in this ANOVA table. Show your work.

4e. (1 point) Find the p-value for the F-test in the table.

4f. (2 points) In words, describe the hypotheses being evaluated by the F-test in the table. Using the p-value you obtained in #4e, in 1-2 sentences, what do you conclude?
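The seven missing entries follow from the design (two treatments with 400 people each) and the two sums of squares given. A minimal Python sketch of that arithmetic, offered as an illustration rather than an official solution; the p-value itself would come from the F distribution with the computed degrees of freedom, via a table or statistical software.

```python
# Design: 2 coarseness levels, 400 people randomly assigned to each
n_groups, n_per_group = 2, 400
n_total = n_groups * n_per_group

ss_error, ss_total = 6113, 16722

df_coarseness = n_groups - 1
df_total = n_total - 1
df_error = df_total - df_coarseness

ss_coarseness = ss_total - ss_error
ms_coarseness = ss_coarseness / df_coarseness
ms_error = ss_error / df_error
f_coarseness = ms_coarseness / ms_error

print(f"Coarseness: df={df_coarseness}, SS={ss_coarseness}, "
      f"MS={ms_coarseness:.1f}, F={f_coarseness:.1f}")
print(f"Error: df={df_error}, MS={ms_error:.3f}")
print(f"Total: df={df_total}")
```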


Questions #4g – #4j pertain to the following setting: If you carry out a two-factor ANOVA (with interaction) on a data set with Factor A at four levels and Factor B at five levels with three observations per cell …

4g. (1 point) How many degrees of freedom will there be for FACTOR A?

4h. (1 point) How many degrees of freedom will there be for FACTOR B?

4i. (1 point) How many degrees of freedom will there be for INTERACTION?

4j. (1 point) How many degrees of freedom will there be for ERROR?
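For a balanced two-factor design with interaction, the degrees of freedom depend only on the number of levels of each factor and the number of observations per cell. A small Python sketch of the standard formulas, as an illustration (the function name `two_way_df` is hypothetical):

```python
def two_way_df(a, b, n):
    """Degrees of freedom for a balanced two-factor ANOVA with interaction.

    a, b: number of levels of Factor A and Factor B
    n:    number of observations per cell
    """
    return {
        "A": a - 1,
        "B": b - 1,
        "A x B": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
        "Total": a * b * n - 1,
    }


# Factor A at four levels, Factor B at five levels, three observations per cell
dfs = two_way_df(a=4, b=5, n=3)
for source, value in dfs.items():
    print(f"{source}: {value}")
```

The component degrees of freedom sum to the corrected total, which is a quick way to check the table.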

4k. (6 points) Is interaction present in the following data? How can you tell? Show your work.

| | Heart | Soul |
|---|---|---|
| Democrats | 2, 3 | 10, 12 |
| Republicans | 8, 4 | 11, 8 |
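One way to look for interaction in a 2 x 2 layout is to compare the Soul-minus-Heart difference in cell means across the two rows: with no interaction the two differences would be about equal. The Python sketch below computes the cell means, that difference of differences, and the corresponding interaction F statistic for a balanced design. It is an illustration under the assumption of two observations per cell, not an official solution, and the variable names are hypothetical.

```python
from statistics import mean

cells = {
    ("Democrats", "Heart"): [2, 3],
    ("Democrats", "Soul"): [10, 12],
    ("Republicans", "Heart"): [8, 4],
    ("Republicans", "Soul"): [11, 8],
}
n = 2  # observations per cell

means = {key: mean(values) for key, values in cells.items()}

# Soul minus Heart, separately within each row
diff_dem = means[("Democrats", "Soul")] - means[("Democrats", "Heart")]
diff_rep = means[("Republicans", "Soul")] - means[("Republicans", "Heart")]
contrast = diff_dem - diff_rep   # zero would mean parallel profiles

# Interaction SS for a balanced 2 x 2 design, and the within-cell error SS
ss_interaction = n * contrast ** 2 / 4
ss_error = sum((y - means[key]) ** 2 for key, values in cells.items() for y in values)
df_error = sum(len(values) - 1 for values in cells.values())
f_interaction = ss_interaction / (ss_error / df_error)   # 1 numerator df

print(f"Cell means: {means}")
print(f"Differences: Democrats {diff_dem}, Republicans {diff_rep}")
print(f"Interaction F(1, {df_error}) = {f_interaction:.2f}")
```

A plot of the four cell means (non-parallel lines) conveys the same descriptive information as the difference of differences.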


5. (10 points total)

5a. (5 points) The following table shows events of mortality and censoring for n=975 children who underwent kidney transplantation.

Table (corrected)

| Days Since Transplant, t | # deaths on day = t | # censored on day = t |
|---|---|---|
| 0 = Day of surgery | 0 | 0 |
| 1 | 7 | 0 |
| 1.5 | 0 | 14 |
| 2 | 5 | 0 |
| 2.5 | 0 | 8 |
| 3 | 5 | 0 |
| 3.5 | 0 | 12 |
| 4 | 7 | 0 |
| 4.5 | 0 | 41 |
| 5 | 3 | 0 |
| 5.5 | 0 | 54 |
| 6 | 2 | 0 |
| 6.5 | 0 | 57 |
| 7 | 0 | 0 |
| 7.5 | 0 | 50 |
| 8 | 4 | 0 |
| 8.5 | 0 | 49 |
| 9 | 0 | 0 |
| 9.5 | 0 | 28 (corrected) |
| 10 | 3 | 0 |

Complete the Following Worksheet

Chronology of Table

| Occasion of Mortality, t | # At risk at instant before time t | # Surviving Beyond day = t | Conditional % Surviving beyond t, Pr [ S > t given S > (t-1) ] |
|---|---|---|---|
| 1 | 975 | 975 - 7 = 968 | 968/975 = .9928 |
| 2 | | | |
| 3 | | | |
| 4 | | | |
| 5 | | | |
| 6 | | | |
| 7 | | | |
| 8 | | | |
| 9 | | | |
| 10 | | | |


Complete the Following Kaplan-Meier Curve Estimation Table (the first few entries have been done for you)

| Days Since Transplantation | Formula for % Alive = Pr [ S > t ] | Solution |
|---|---|---|
| 0 | Pr [ S > 0 ] | 1 |
| 1 | Pr [ S > 1 ] = Pr[S > 0] * Pr[S > 1 given S > 0 ] | (1)(.9928) = .9928 |
| 2 | Pr [ S > 2 ] = Pr[S > 1] * Pr[S > 2 given S > 1 ] | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |
| 8 | | |
| 9 | | |
| 10 | | |
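The worksheet and the Kaplan-Meier table apply one recursion: the number at risk just before day t is the previous number at risk minus all earlier deaths and censorings, the conditional survival is (at risk minus deaths) divided by at risk, and Pr[S > t] is the running product of the conditional survivals. A minimal Python sketch of that recursion using the corrected table, as an illustration rather than an official solution. It assumes, as the table layout suggests, that censorings recorded at a half day leave the risk set before the next whole day; the variable names are hypothetical.

```python
# (day, deaths, censored) rows from the corrected table
rows = [
    (0, 0, 0), (1, 7, 0), (1.5, 0, 14), (2, 5, 0), (2.5, 0, 8),
    (3, 5, 0), (3.5, 0, 12), (4, 7, 0), (4.5, 0, 41), (5, 3, 0),
    (5.5, 0, 54), (6, 2, 0), (6.5, 0, 57), (7, 0, 0), (7.5, 0, 50),
    (8, 4, 0), (8.5, 0, 49), (9, 0, 0), (9.5, 0, 28), (10, 3, 0),
]

at_risk = 975      # children at risk on the day of surgery
survival = 1.0     # Pr[S > 0]
worksheet = []     # (day, at risk, surviving beyond day, conditional, cumulative)

for day, deaths, censored in rows:
    if day >= 1 and day == int(day):      # whole days 1..10 are the worksheet rows
        surviving = at_risk - deaths
        conditional = surviving / at_risk
        survival *= conditional           # Kaplan-Meier product-limit step
        worksheet.append((int(day), at_risk, surviving, conditional, survival))
    at_risk -= deaths + censored          # remove deaths and censorings from the risk set

for day, n_risk, n_surv, cond, cum in worksheet:
    print(f"t={day:2d}  at risk={n_risk:3d}  surviving={n_surv:3d}  "
          f"conditional={cond:.4f}  Pr[S > t]={cum:.4f}")
```

The first printed row reproduces the worked entries already given in the tables (975 at risk, 968 surviving, .9928).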

Questions #5b and #5c pertain to the following setting: A cohort of n=312 participants took part in a randomized controlled trial of D-penicillamine (DPCA) for primary biliary cirrhosis (PBC). PBC destroys bile ducts in the liver, causing bile to accumulate. Tissue damage is progressive and ultimately leads to liver failure. Time from diagnosis to end-stage liver disease ranges from a few months to 20 years. During the approximate 10-year follow-up period, 125 study participants died. The following output is from a fit of a Cox PH model with two predictors: 1.rx (this is a 0/1 indicator of receipt of DPCA) and bilirubin (mg/dL).

    No. of subjects =          312                  Number of obs    =         312
    No. of failures =          125
    Time at risk    =  1713.853528
                                                    LR chi2(2)       =       85.79
                                                    Prob > chi2      =      0.0000

    (-2) Log likelihood (model) = 1194.1682 (corrected)
    (-2) Log likelihood (intercept only model) = 1279.9582

    ------------------------------------------------------------------------------
              _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            1.rx |   .8181612   .1500579    -1.09   0.274     .5711117    1.172078
       bilirubin |   1.163459   .0154566    11.40   0.000     1.133556    1.194151
    ------------------------------------------------------------------------------


5b. (2 points) In 2-3 sentences at most, interpret the estimated hazard ratios shown: 0.8181 for the predictor 1.rx and 1.16 for the predictor bilirubin.

5c. (3 points) By hand, perform a likelihood ratio test comparison that compares the 2 predictor model to the intercept only model, thus verifying what the output shows, namely: LR chi2(2) = 85.79. In 2-3 sentences, interpret.
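The likelihood ratio statistic in problem 5c is the difference between the two (-2) log likelihood values reported with the output, referred to a chi-square distribution with 2 degrees of freedom (one per predictor). The Python sketch below checks that arithmetic and also converts the two hazard ratios into percent changes in hazard, which can help with the wording in 5b. It is an illustration, not an official solution; it relies on the closed form exp(-x/2) for the chi-square upper tail with 2 degrees of freedom, and the variable names are hypothetical.

```python
import math

neg2ll_null = 1279.9582    # (-2) log likelihood, model with no predictors
neg2ll_model = 1194.1682   # (-2) log likelihood, model with 1.rx and bilirubin

# Likelihood ratio statistic, 2 df (two predictors added)
lr_stat = neg2ll_null - neg2ll_model
lr_p = math.exp(-lr_stat / 2.0)   # exact chi-square upper tail when df = 2

# Hazard ratios from the output, expressed as percent change in hazard
hr_rx = 0.8181612
hr_bilirubin = 1.163459
pct_rx = 100.0 * (hr_rx - 1.0)                 # DPCA versus no DPCA
pct_bilirubin = 100.0 * (hr_bilirubin - 1.0)   # per 1 mg/dL increase in bilirubin

print(f"LR chi-square(2) = {lr_stat:.2f}, p = {lr_p:.3g}")
print(f"1.rx: {pct_rx:.1f}% change in hazard")
print(f"bilirubin: {pct_bilirubin:.1f}% change in hazard per 1 mg/dL")
```

Each hazard ratio is adjusted for the other predictor in the model, and the confidence interval for 1.rx in the output includes 1.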