Biostatistics Case Studies 2014





REI Summer Fellowship Biostatistics

Session 3: Research Study Designs I
Youngju Pak, PhD, Biostatistician
ypak@labiomed.org

Type of Research Study Designs
  • Observational Study: Researchers do not attempt to influence subjects or their surroundings. The goal is to OBSERVE/COLLECT data on characteristics of interest without influencing the subjects.
  • Experimental Study: Researchers deliberately influence the course of events and investigate the effect of the treatment on a selected population of subjects.

More specific types of observational studies
  • Ecological Studies: use population-level data, e.g., total cigarette consumption and lung cancer prevalence by country
  • Case Reports / Case Series
      • Single subject or case
      • Simple description of a series of individual cases
      • e.g., Centers for Disease Control and Prevention (CDC) Morbidity and Mortality Weekly Report (MMWR) on Pneumocystis pneumonia in previously healthy homosexual men (LA, 1981)

More specific types of observational studies (cont.)
  • Cross-Sectional
      • Single-time-point studies that define a population at a specific time point; may be unsuitable for rare diseases
      • Prevalence or incidence of disease or other characteristics
      • e.g., National Health and Nutrition Examination Survey on overweight and obesity in the US
  • Case-Control
      • Typically retrospective studies
      • Good for rare diseases
      • Cases and controls are collected by the PI, who then looks retrospectively for risk factors/exposures
  • Prospective Longitudinal Cohort Study
      • Suitable for rare exposures
      • Large sample sizes are needed for rare diseases
      • Risk factors/exposures are collected by the PI, and study participants are followed up over time

How to make a better Cross-Sectional Study
  • Sometimes it is hard to define the denominator if it is an incidence study.
  • Determining what is to be studied is the most important thing.
  • A disease, disease condition, or characteristic may be very difficult to define at a given time point, e.g., atherosclerosis is so common, and its manifestations at times can be very subtle.
  • The definition of the condition and health characteristics under study SHOULD be standardized, reproducible, and feasible to apply in a larger-scale study.

Advantages and Disadvantages of Cross-Sectional studies
  • Can avoid potential biases if it is a truly population-based sample
  • Short duration and less expensive for common diseases in a particular target population (e.g., workers in a given industry)
  • More expensive and time-consuming than case-control studies, particularly for rare diseases
  • Unsuitable for rare diseases or for diseases of short duration (e.g., influenza)
  • Potential bias due to non-responses (

usually zero difference assumed but it is assumed 0.5 for the IOP study
SD of changes of IOP = 3.5
(usually set to 2.5%) since the confidence level of the confidence interval is (100-2 x ) %

Cross-Sectional Examples
  • Jonas JB, et al. Diabetes mellitus in rural India. Epidemiology. 2010;21:754-755.
  • Hedley AA, Ogden CL, Johnson CL, Carroll MD, Curtin LR, Flegal KM. Prevalence of overweight and obesity among US children, adolescents, and adults, 1999-2002. JAMA 2004;291:2847-50. Measured height and weight in the National Health and Nutrition Examination Survey (NHANES).
  • Flegal KM, Graubard BI, Williamson DF, Gail MH. Cause-specific excess deaths associated with underweight, overweight, and obesity. JAMA 2007;298:2028-37.

Case-Control Studies
  • Observations regarding possible associations between a single outcome (usually a disease) and one or more hypothesized risk factors or exposures

  • Well suited for studying
      • Rare diseases
      • Diseases with long latency periods
  • Generally quicker and less expensive than cohort studies

[Diagram: Disease and No Disease groups, each divided into Exposed and Non-exposed]

Advantages and Disadvantages of Case-Control studies
  • Suitable for rare diseases; unsuitable for rare exposures
  • Multiple etiological factors can be studied simultaneously
  • Less expensive and time-consuming
  • Associations with risk factors are consistent with other types of study if assumptions are met
  • Do not estimate prevalence or incidence
  • Relative risk can be estimated indirectly by the odds ratio if the disease is rare
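The last point above (the odds ratio approximating the relative risk when the disease is rare) can be illustrated with a small sketch. All counts below are hypothetical and are treated as full-population counts so that both measures can be computed; in an actual case-control study only the odds ratio is available.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: (a/b) / (c/d), equivalently (a*d) / (b*c)."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """Risk in the exposed, a/(a+b), divided by risk in the unexposed, c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts, in the order:
# a = exposed with disease, b = exposed without disease,
# c = unexposed with disease, d = unexposed without disease
rare = (20, 9980, 10, 9990)        # disease risk of 0.1% to 0.2%
common = (4000, 6000, 2000, 8000)  # disease risk of 20% to 40%

for label, table in (("rare", rare), ("common", common)):
    print(label, "OR =", round(odds_ratio(*table), 3),
          "RR =", round(relative_risk(*table), 3))
```

With the rare-disease counts the two measures nearly coincide, while with the common-disease counts the odds ratio is noticeably larger than the relative risk.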

How to make a better Case-Control study?

Cases
  • Represent all patients who developed the disease
  • Standardized selection criteria from a well-defined population
  • Can be NESTED in a larger cohort
  • Where?
      • Case registries
      • Admission records
      • Pathology logs
  • High participation rate

Controls
  • Represent the healthy population without the disease
  • No perfect control group exists
  • Standardized selection criteria from a well-defined population
  • Where?
      • General population
      • Neighborhood
      • Families
      • Hospitals

How to make a better Case-Control study?
  • All observations are made using the same methods for cases and controls (consistency)
  • To avoid selection bias: use controls from the same hospital or family
  • To avoid interviewer or recall bias: standardize data collection methods and train the interviewers
  • Consider cost and accessibility
  • To minimize confounding: match controls on age, sex, or other risk factors that are not of interest in the study

Analyses for Case-Control Studies
Summarizing frequencies with a 2x2 contingency table:

                    Presence of Disease
  Exposure      Disease     No Disease     Total
  Present       a           b              a+b
  Absent        c           d              c+d
  Total         a+c         b+d            a+b+c+d

  • The odds ratio, OR = [a/b]/[c/d], is usually used to measure the association.
  • When a and c are very small (rare disease), OR ≈ RR.
  • Chi-square or Fisher's exact test
  • If the risk factor (X) is a continuous measure such as BMI, then a logistic regression model is used to estimate the OR per one-unit change in X.

Prospective or Longitudinal Cohort Studies

  • Observations concerning associations between a given exposure and the subsequent development of disease
  • Examine multiple outcomes for a single exposure
  • Directly calculate the incidence of disease for each exposure group

Concurrent vs. Non-concurrent Prospective Cohort

Concurrent
  • A defined population is surveyed
  • Identify a group with the supposed risk factor
  • Identify a similar group without the risk factor
  • Follow them forward in time
  • Compare incidence rates between groups

Non-Concurrent
  • Define a population whose presence/absence of exposure was ascertained in an accurate, objective fashion in the past
  • A retrospective study, since it is based on historical data
  • Disease occurrence is surveyed in the present
  • Define incidence rates and compare them between the two groups
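As a minimal sketch of comparing incidence between the exposed and non-exposed groups, the snippet below computes the cumulative incidence in each group, the relative risk, and an approximate 95% confidence interval based on the standard error of log(RR). The counts and the function name `cohort_summary` are hypothetical, chosen only for illustration.

```python
import math


def cohort_summary(exposed_cases, exposed_total,
                   unexposed_cases, unexposed_total, z=1.96):
    """Cumulative incidence per group, relative risk, and a log-based 95% CI."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Large-sample standard error of log(RR)
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return {"risk_exposed": risk_exposed, "risk_unexposed": risk_unexposed,
            "rr": rr, "ci_lower": lower, "ci_upper": upper}


# Hypothetical cohort: 30 of 1000 exposed and 15 of 1000 non-exposed develop disease
summary = cohort_summary(30, 1000, 15, 1000)
print(summary)
```

The log-based interval is a large-sample approximation; with very few events in either group an exact method would be a safer choice.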

Advantages and Disadvantages of Prospective or Longitudinal Cohort studies

  • More representative of cases than case-control studies (incidence)
  • Natural history of disease
  • Directly measure the Relative Risk (RR)
  • Less bias than case-control studies
  • Firmly establish the temporal relationship between exposure and disease, but the exposure must be IDENTIFIED and MEASURED at the initiation and should be followed during the study period
  • Suitable for rare exposures

Advantages and Disadvantages of Prospective or Longitudinal Cohort studies

  • Long follow-up is needed, and following up a free-living population is both difficult and expensive
  • Usually a large-scale study
  • Extensive baseline data may be needed
  • Unsuitable for rare diseases (a cell in the 2x2 table can have zero frequency if the sample size is not large enough)
  • Bias still exists (e.g., participant selection, exposure assessment, or loss to follow-up)

How to make a better Prospective Cohort study
  • Exposed and non-exposed groups should be representative and well defined.
  • Non-exposed status should be maintained during the study period.
  • Disease outcomes should be well defined prior to the study and should not change during the study period.
  • Standard criteria should be applied to both the exposed and the non-exposed.
  • Minimize loss to follow-up (follow-up rate > 80%).

Analyses for Longitudinal Cohort Studies
  • Calculate the incidence for the study period in the exposed and unexposed groups, and test using the chi-square or Fisher's exact test
  • Measure association with the relative risk (or odds ratio) and 95% confidence limits
  • Life tables (another way to say survival analysis) for time-to-event data
  • Regression models

Nested Case-Control studies
  • Selected from a prospective cohort study, e.g., stored samples
  • Use baseline and follow-up samples and data from newly occurring cases
  • Compare to matched or unmatched controls
  • Efficient when measurements are expensive or difficult to obtain
  • Helps avoid selection and data collection biases
  • Need to have enough cases in the cohort
  • Need to store all the samples and data

Nested Case-Cohort studies
  • Similar to Nested Case-Control
  • Controls come from a subcohort sampled from the entire cohort at baseline (t0), while controls for a nested case-control study are sampled from individuals at risk at the times (t1) when cases are identified
  • Typically done when
      • The failure or event of interest is rare
      • Enormous resources are needed to ascertain covariate values
  • Very difficult to analyze

Nested Case-Control vs Nested Case-Cohort Example:
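As a rough illustration of the sampling difference between the two nested designs, the sketch below simulates a small cohort and draws controls both ways. The cohort, event probability, follow-up length, and all function names are assumptions made up for this example; it shows only how controls are selected, not how either design is analyzed.

```python
import random


def simulate_cohort(n, event_prob, seed=0):
    """Hypothetical cohort: cases get an event time in (0, 10) years,
    non-cases are followed event-free to year 10."""
    rng = random.Random(seed)
    cohort = []
    for subject_id in range(n):
        has_event = rng.random() < event_prob
        time = rng.uniform(0, 10) if has_event else 10.0
        cohort.append({"id": subject_id, "time": time, "event": has_event})
    return cohort


def case_cohort_sample(cohort, subcohort_size, rng):
    """Case-cohort: a subcohort is drawn from the entire cohort at baseline (t0),
    before anyone's outcome is known; all cases are then added."""
    subcohort = rng.sample(cohort, subcohort_size)
    cases = [s for s in cohort if s["event"]]
    return subcohort, cases


def nested_case_control_sample(cohort, controls_per_case, rng):
    """Nested case-control: for each case, controls are drawn from subjects
    still at risk at that case's event time (t1)."""
    matched_sets = []
    for case in (s for s in cohort if s["event"]):
        risk_set = [s for s in cohort if s["time"] > case["time"]]
        k = min(controls_per_case, len(risk_set))
        matched_sets.append((case, rng.sample(risk_set, k)))
    return matched_sets


cohort = simulate_cohort(500, 0.05, seed=1)
rng = random.Random(2)
subcohort, cases = case_cohort_sample(cohort, 50, rng)
matched_sets = nested_case_control_sample(cohort, 2, rng)
print(len(cases), "cases;", len(subcohort), "subcohort members;",
      len(matched_sets), "matched sets")
```

Note that in the nested case-control sampling a subject who becomes a case later can still serve as a control for an earlier case, because that subject was at risk at the earlier time.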

Prospective Cohort: Example

Methods
  • SEER registered cancer incidence for 10% of the US population in 1973
  • Current coverage is about 26% of the US population as of 2005
  • Analyzed breast cancer patients aged 20-79 without previous cancer who were registered in SEER until Jan 1, 2002
  • Excluded: women with bilateral breast cancer and women whose cancer was found at autopsy or on the death certificate
  • Exposure: irradiation from radiotherapy
  • Disease outcomes: cause-specific mortality
      • Primary: death from heart disease: acute myocardial infarction, other ischaemic heart disease, or other heart disease (using ICD-9 codes)
      • Secondary: death from lung cancer

Results

Why didn't they compare the radiotherapy group with the no-radiotherapy group?

Results

Nested Case-Control: Example
Risk Factors for Deep Vein Thrombosis and Pulmonary Embolism: A Population-Based Case-Control Study
John A. Heit, MD; Marc D. Silverstein, MD; et al. JAMA Internal Medicine 2000;160:809-815

Deep Vein Thrombosis(DVT) occurs when a blood clot (thrombus) forms in one or more of the deep veins in your body. Deep