Term 4, 2006 BIO656--Multilevel Models 1
140.656 Multi-Level Statistical Models
If you did not receive the welcome email from me, email me at: ([email protected])

ROOM CHANGE, AGAIN!
• Starting Thursday, March 30th and henceforth, lectures will be in W2030
• Labs will still be in W2009

Prerequisites, resources and Grading

Learning Objectives

Content & Approach

Approach
• Lectures include basic illustrations and case studies, structuring an approach and interpreting results
– Labs address computing and amplify on the foregoing
• My approach is formal, but not “mathematical”
• To understand MLMs, you need a very good understanding of single-level models
– If you understand these, you are ready to multi-level!

Structure

RULES FOR HOMEWORK, MID-TERM AND PROJECT
Homework
• Must be individually prepared, but you can get help
• Homework due dates should be honored
• Turn in hard copy for grading
The in-class midterm
• Must be prepared absolutely independently
• During the exam, no advice or information can be obtained from others
• You can use your notes and reference materials
The term project
• Must be individually prepared, but you can get help
• Must be electronically submitted

Handouts and the Web
• Virtually all course materials will be on the web
• Check frequently for updates
• I’ve provided hard copy of the general information sheet
• However, other lectures will be on the web in PowerPoint format and won’t be handed out
• Download to your computer so you have an electronic version of each part
• Print if you need hard copy, but do it 4 or 6 to a page to save paper
• More generally, try to “go electronic”, printing sparingly

COMPUTING & DATA
• We will support WinBUGS and Stata
• We provide partial support for SAS, which should be used only by current SAS users; we aren’t teaching it from scratch
• Some homeworks require use of WinBUGS and another “traditional” program (Stata, SAS, R, ...)
• We provide datasets, including some in the WinBUGS examples

WHY BUGS?
• Freeware!
• In MLMs, it’s important to see distributions
– e.g., Skewness of sampling distribution of variance component estimates
• It’s important to incorporate all uncertainties in estimating random effects
• Note that WinBUGS isn’t very data input friendly
• And, it’s difficult to produce P-values

STATISTICAL MODELS
• A statistical model is an approximation
• Almost never is there a “correct” or “best” model, no holy grail
• A model is a tool for structuring a statistical approach and addressing a scientific question
• An effective model combines the data with prior information to address a question

MULTI-LEVEL MODELS
• Biological, physical, psycho/social processes that influence health occur at many levels:
– Cell → Organ → Person → Family → Neighborhood → City → Society → ... → Solar system
– Crew → Vessel → Fleet → ...
– Block → Block Group → Tract → ...
– Visit → Patient → Physician → Clinic → HMO → ...
• Covariates can be at each level
• Many “units of analysis”
• More modern and flexible parlance and approach: “many variance components”

Example: Alcohol Abuse
• Cell: neurochemistry
• Organ: ability to metabolize ethanol
• Person: genetic susceptibility to addiction
• Family: alcohol abuse in the home
• Neighborhood: availability of bars
• Society: regulations; organizations; social norms

ALCOHOL ABUSE: A multi-level, interaction model
• Interaction between existence of bars & state drunk driving laws
• Alcohol abuse in a family & ability to metabolize ethanol
• Genetic predisposition to addiction & household environment
• State regulations about intoxication & job requirements

Many names for similar, but not identical models, analyses and goals
• Multi-Level Models
• Random effects models
• Mixed models
• Random coefficient models
• Hierarchical models
• Bayesian Models

We don’t need MLMs
• If your question is about slopes on regressors, you can run a standard regression and (usually) get valid slope estimates
$Y = \beta_0 + \beta_1(\text{areal monitor}) + \beta_2(\text{home monitor}) + \dots$
$Y = \beta_0 + \beta_1(\text{zipcode income}) + \beta_2(\text{personal income}) + \dots$
$\text{logit}(P) = \dots$
• Analysis can be followed by computing a “robust” SE to get valid inferences
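As a rough illustration of the regression-plus-robust-SE route, the sketch below fits a one-regressor least-squares line and computes a cluster-robust ("sandwich") standard error for the slope. It assumes the robust SE meant here is the cluster version with no small-sample correction; the function name and the toy data are hypothetical.

```python
import math

def slope_with_cluster_robust_se(x, y, cluster):
    """OLS slope for y = b0 + b1*x and a cluster-robust (sandwich) SE for b1.

    Assumption: basic sandwich form with no small-sample correction.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar

    # Sum the score contributions (x - xbar) * residual within each cluster.
    scores = {}
    for xi, yi, g in zip(x, y, cluster):
        resid = yi - (intercept + slope * xi)
        scores[g] = scores.get(g, 0.0) + (xi - xbar) * resid

    robust_var = sum(s ** 2 for s in scores.values()) / sxx ** 2
    return slope, math.sqrt(robust_var)

# Toy data (hypothetical): four observations in two clusters.
slope, robust_se = slope_with_cluster_robust_se(
    x=[0, 1, 2, 3], y=[0, 1, 2, 5], cluster=["a", "a", "b", "b"]
)
```

The slope estimate is the ordinary one; only the standard error changes, which is why this route answers questions about slopes but not about variance components.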

We do need MLMs
• If your question is about variance components, you need to build the multi-level model
$Y_{ijkl} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \epsilon_{ijkl}$
$\text{Var}(Y_{ijkl}) = \text{Var}(\epsilon_{ijkl}) = V_{\text{Hospital}} + V_{\text{Clinic}} + V_{\text{Physician}} + V_{\text{Patient}} + V_{\text{unexplained}}$
• These variances depend on what Xs are in the model

We do need MLMs
• To create a broad class of correlation structures
– Longitudinal correlations
– Nested correlations
• To structure the improvement of unit-level estimates (latent effects) and to make unit-level predictions

MLMs are effective in producing “working models” that incorporate stochastic realities
• Producing efficient population estimates
• Broadening the inference beyond “these units”
• Protecting against some types of informative missing data processes
• Producing correlation structures
• Generating “overdispersed” versions of standard models
• Structuring estimation of latent effects
But, MLMs can be fragile and care is needed

MLMs are not and should not be
• A religion
• A truth
• The only way to model multi-level data!

Improving individual-level estimates
Similar to the BUGS rat data
• Dependent variable ($Y_{ij}$) is weight for rat “i” at age $X_{ij}$
$i = 1, \dots, I\ (= 10);\quad j = 1, \dots, J\ (= 5)$
$X_{ij} = X_j = (-14, -7, 0, 7, 14) = (8-22, 15-22, 22-22, 29-22, 36-22)$
$Y_{ij} = b_{i0} + b_{i1} X_j + \epsilon_{ij}$
– As usual, the intercept depends on the centering
• Analyses
– Each rat has its own line
– All rats follow the same line: $b_{i0} = \beta_0$, $b_{i1} = \beta_1$
– A compromise between these two
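The first two analyses can be sketched in a few lines. The example below simulates weights for 10 rats at the five centered ages, fits a separate least-squares line to each rat, and fits one pooled line to all rats. The simulated intercept, slope and noise values are hypothetical and are not taken from the rat data.

```python
import random

X = [-14, -7, 0, 7, 14]  # ages 8, 15, 22, 29, 36, centered at 22

def ols_line(x, y):
    """Least-squares (intercept, slope) for one set of (x, y) points."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Hypothetical simulated weights: each rat gets its own intercept and slope.
rng = random.Random(656)
rats = []
for _ in range(10):
    b0 = rng.gauss(240, 5)
    b1 = rng.gauss(6, 1)
    rats.append([b0 + b1 * x + rng.gauss(0, 6) for x in X])

# Analysis 1: each rat has its own line.
own_lines = [ols_line(X, y) for y in rats]

# Analysis 2: all rats follow the same line (pool every observation).
pooled_x = X * len(rats)
pooled_y = [y for rat in rats for y in rat]
pop_line = ols_line(pooled_x, pooled_y)
```

Because every rat is measured at the same ages, the pooled intercept and slope equal the averages of the per-rat intercepts and slopes. The third analysis pulls each rat's line part of the way toward that pooled line.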

Each rat has its own (LSE, MLE) line (with the population line)
[Figure: fitted lines for the individual rats, with the population line labeled “Pop line”]

A multi-level model: Each rat has its own line, but the lines come from the same distribution
• The $b_{i0}$ are independent $N(\beta_0, \tau_0^2)$
• The $b_{i1}$ are independent $N(\beta_1, \tau_1^2)$
Overdispersion
• Sample variance of the OLS estimated intercepts: $345 = SE_{\text{int}}^2 + \tau_0^2 = 320 + \tau_0^2 \Rightarrow \tau_0^2 = 25,\ \tau_0 = 5$
• Sample variance of the OLS estimated slopes: $4.25 = SE_{\text{slope}}^2 + \tau_1^2 = 3.25 + \tau_1^2 \Rightarrow \tau_1^2 = 1.00,\ \tau_1 = 1.00$
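The overdispersion arithmetic amounts to a moment calculation: the between-rat variance is what remains of the sample variance of the estimated coefficients once their sampling variance (squared SE) is subtracted. A minimal sketch follows; the function name is hypothetical, and the floor at zero is an added safeguard that the slide does not mention.

```python
import math

def between_unit_variance(sample_var_of_estimates, sampling_var):
    """Moment estimate of the between-unit variance.

    Assumption: negative differences are floored at zero.
    """
    return max(sample_var_of_estimates - sampling_var, 0.0)

# Numbers from the slide.
tau0_sq = between_unit_variance(345, 320)     # intercepts
tau1_sq = between_unit_variance(4.25, 3.25)   # slopes

tau0 = math.sqrt(tau0_sq)
tau1 = math.sqrt(tau1_sq)
```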

A compromise: each rat has its own line, but the lines come from the same distribution
[Figure: fitted lines for the individual rats, with the population line labeled “Pop line”]
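One common way to form such a compromise is to shrink each rat's own estimate toward the population value, weighting the population value by sampling variance / (sampling variance + between-rat variance). The surviving slide text does not state this formula, so treat it as an assumption. The sketch uses the variance figures from the overdispersion slide; the example rat's own slope (8.0) and the population slope (6.0) are hypothetical.

```python
def shrink(own_estimate, population_estimate, sampling_var, between_var):
    """Weighted compromise between a unit's own estimate and the population value."""
    b = sampling_var / (sampling_var + between_var)  # weight on the population value
    return b * population_estimate + (1 - b) * own_estimate

# Weights implied by the overdispersion figures:
# intercepts 320 and 25, slopes 3.25 and 1.00.
weight_int = 320 / (320 + 25)
weight_slope = 3.25 / (3.25 + 1.00)

# Hypothetical rat: own slope 8.0, population slope 6.0.
compromise_slope = shrink(8.0, 6.0, sampling_var=3.25, between_var=1.00)
```

With these figures most of the weight goes to the population value, because the sampling variance of each rat's own estimate is large relative to the between-rat variance.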

ONE-WAY RANDOM EFFECTS ANOVA

Simulated “Neighborhood Clustering”
• Random mean for each of 10 neighborhoods (J=10): $b_1, b_2, \dots, b_{10}$ (iid) $N(10, 9)$
• Random deviation from neighborhood mean for each of 10 persons in each neighborhood (n=10): $Y_{ij} = b_j + e_{ij}$, $e_{ij}$ (iid) $N(0, 4)$
Conditional Independence
Over-dispersion: Variance of each point is 13 (= 4 + 9)
Correlation: Measurements within each cluster are correlated
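The simulation described above can be reproduced roughly as follows. This is a sketch rather than the code behind the slides; the function name, the seed and the 2000-neighborhood check are assumptions added for illustration.

```python
import random
import statistics

def simulate_neighborhoods(n_neighborhoods=10, n_persons=10, seed=656):
    """Draw Y_ij = b_j + e_ij with b_j ~ N(10, 9) and e_ij ~ N(0, 4)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_neighborhoods):
        b_j = rng.gauss(10, 3)  # variance 9, so standard deviation 3
        # variance 4, so standard deviation 2
        data.append([b_j + rng.gauss(0, 2) for _ in range(n_persons)])
    return data

small = simulate_neighborhoods()  # the 10 x 10 layout on the slide

# With many neighborhoods the overall variance should settle near 4 + 9 = 13.
large = simulate_neighborhoods(n_neighborhoods=2000)
total_var = statistics.pvariance([y for hood in large for y in hood])
```

Given its neighborhood mean, each person's value is an independent draw. The shared mean is what makes values within a neighborhood correlated and inflates the variance of a single point from 4 to 13.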

Intra-class Correlation (ICC)
• Correlation of two observations in the same cluster:
$\text{ICC} = \text{Var(Between)}/\text{Var(Total)} = 1 - \text{Var(Within)}/\text{Var(Total)}$
Estimated ICC: 0.67 = (9.8 - 3.2)/9.8
True ICC: 0.69 = 9/(9 + 4) = 9/13
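A small sketch of the ICC calculation, using the second form of the formula (one minus within-cluster variance over total variance). Pooling the within-cluster sums of squares and using the overall sample variance as the total is one reasonable estimator; whether the slide's 9.8 and 3.2 were computed exactly this way is an assumption. The toy clusters are hypothetical.

```python
import statistics

def estimated_icc(clusters):
    """ICC estimate as 1 - Var(Within)/Var(Total)."""
    # Pooled within-cluster variance: within-cluster sums of squares
    # divided by their total degrees of freedom.
    ss_within = 0.0
    df_within = 0
    for values in clusters:
        mean = sum(values) / len(values)
        ss_within += sum((v - mean) ** 2 for v in values)
        df_within += len(values) - 1
    var_within = ss_within / df_within
    var_total = statistics.variance([v for values in clusters for v in values])
    return 1 - var_within / var_total

true_icc = 9 / (9 + 4)            # from the simulation's true variances
slide_icc = (9.8 - 3.2) / 9.8     # from the estimated variances on the slide
toy_icc = estimated_icc([[1, 3], [5, 7]])  # hypothetical two-cluster data
```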

[Figures, slides 31-44: images not recovered; surviving labels are “V(b)”, “regression line”, “Pop line” and “45° line”]

WEIGHTED MEANS

INFERENCE SPACE (Sanders)
• The choice between fixed and random effects depends in part on the reference population (the inference space)
– These studies or people
– Studies or people like these
– ...

Random Effects should replace “unit of analysis”
• Models contain Fixed-effects, Random effects (via Variance Components) and other correlation-inducers
• There are many “units” and so in effect no single set of units
• Random Effects induce unexplained (co)variance
• Some of the unexplained (co)variance may be explained by including additional covariates
• MLMs are one way to induce a structure and estimate the REs

END OF PART I