Measures of disease occurrence October 5 2004 Epidemiology 511 W. A. Kukull


Defining disease (health events)

• What disease features do cases have in common?

• What disease features make cases different from non-cases?

• How can we observe disease features?
  – Interview
  – Exam
  – Lab test or autopsy

Observing onset

• Clinical diagnosis: history, signs, symptoms

• Pathological diagnosis: examination of biological specimens, e.g., biopsy, labs

• Insidious onset

• Abrupt onset

• Recurrent: many “onsets” possible

• Persistent/Chronic

Defining a Population

• What characteristics do members of the chosen population have?

• How are member characteristics different from non-members?
  – Geography: residents of County
  – Individual features: 75–79 year-old men
  – Time period: 1929–1938

Population and time

• Closed population: once defined, no new persons may enter.
  – Disease occurrence and death reduce pool
  – Airline passengers on a non-stop flight

• Open population: new members may be added, loss may occur
  – Non-diseased persons may be lost
  – Boeing machinists employed 2000–2003

Who is “at-risk” ?

• Susceptible: the probability you could get the disease is NOT zero.
  – Does not mean you are especially likely to get the disease, or suffer the health event.

• Non-susceptible: the probability you could get the disease IS zero.
  – Persons who have had their appendix removed are non-susceptible to future appendicitis

Goals

• Define disease (or health event)

• Define population

• Find all cases in the population
  – Existing cases
  – New cases

• Create measures of case frequency per population

Counts: “Numerator data”

• Number of people with the disease

• “We report 5 cases of Parkinson’s disease in 20–30 year olds”

• Numerator data: often hard to interpret without knowing the size of the population giving rise to the cases
  – Very rare or unusual occurrences

[Figure: Cases per year, 1930–2000. X-axis: year; y-axis: cases (0–300). Series: dementia, warts, mycoses.]

Problems determining disease

• Diagnostic criteria

• Poor recognition

• Survey errors
  – respondents
  – interviewers

• Hospital data not meant for research

Creating a frequency measure: Critical questions

• Count cases in relation to the population at-risk (per time)

• If each of the cases had not developed disease, would they have been in the population (denominator)?

• If each of the non-cases in the population had developed disease would they have been included as a case?

• The answers should be “yes”

[Figure: Mortality for selected causes per 100,000 population (hypothetical data), 1930–2000. X-axis: year; y-axis: deaths per 100,000 (0–180). Series: anxiety, rage, joy.]

Prevalence: How common is the disease today?

• EXISTING CASES at a specified time / persons in defined population at that time

• “47% of persons over 85 years old in East Boston were demented in 1990.”

• A “snapshot” view of the disease at a single point in time (a.k.a. point prevalence)

• NOT a measure of risk and NOT a Rate

Incidence: counting the new cases that occur with time

• Cumulative Incidence (a “risk”)
  – NEW CASES / initial pop-at-risk
  – “The incidence of nasal papilloma in Seattle was 6 per million population in 1984”

• Incidence rate (a “rate”)
  – NEW CASES / at-risk time
  – Stroke incidence is 5 per 100,000 person-years
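The two incidence measures differ only in their denominators. A minimal Python sketch follows, using a made-up cohort: the counts and person-years are hypothetical and not from the lecture, and the function names are illustrative.

```python
def cumulative_incidence(new_cases, initial_at_risk):
    """Risk: new cases divided by persons at risk at the start of follow-up."""
    return new_cases / initial_at_risk


def incidence_rate(new_cases, person_time):
    """Rate: new cases divided by total at-risk time (e.g., person-years)."""
    return new_cases / person_time


# Hypothetical cohort: 1,000 people at risk at baseline, 30 new cases,
# and 2,900 person-years of at-risk time accumulated during follow-up.
ci = cumulative_incidence(30, 1000)
ir = incidence_rate(30, 2900)

print(f"Cumulative incidence: {ci:.3f}")
print(f"Incidence rate: {ir * 1000:.1f} per 1,000 person-years")
```

The cumulative incidence is a unitless proportion tied to a stated follow-up period, while the incidence rate carries units of inverse time.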

Prevalent case bias: Longer disease duration increases chance of selection

[Figure: time axis with a cross-sectional sample marked.]

Mortality: an incidence-like measure

• [Deaths from disease X in 19xx] divided by [midyear population]

• “The annual CHD mortality rate dropped from 370 per 100,000 in 1968 to 270 per 100,000 in 1975”

• Risk of dying from disease X, during the time interval, for someone in the population

Disease Frequency Relationships

• P = I * D
  – prevalence = incidence times average duration of the diseased state
  – Robust when I and D are stable and P is <10%

• M = I * C
  – Mortality = incidence times Case Fatality Rate
  – this holds when I and C are approximately stable over time

Example: Prevalence, incidence and duration

Population                  Prevalence of     New cases per     Duration
                            positive X-ray    pop per year      (yrs)
Rancho Bernardo (N=1000)    100               4                 25
Mean Streets (N=1000)       60                20                3

Where is disease risk highest?
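One way to work through this example is to check each population against P = I * D and then compare incidence, since incidence (not prevalence) reflects the risk of becoming a new case. The sketch below uses the figures from the example; the dictionary layout and variable names are illustrative only.

```python
# Values from the example (each population has N = 1000).
populations = {
    "Rancho Bernardo": {"prevalence": 100, "incidence": 4, "duration": 25},
    "Mean Streets": {"prevalence": 60, "incidence": 20, "duration": 3},
}

for name, p in populations.items():
    implied = p["incidence"] * p["duration"]  # P = I * D
    print(f"{name}: I * D = {implied}, reported prevalence = {p['prevalence']}")

# Risk of developing disease is measured by incidence, so compare on that column.
highest_risk = max(populations, key=lambda n: populations[n]["incidence"])
print("Highest incidence:", highest_risk)
```

The population with the higher prevalence has the lower incidence here; its prevalence is driven by the long duration of the diseased state.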

Comparing measures (“Rate” used in a broad sense)

• Crude Rates
  – overall, summary rate for a population or comparison group
  – may differ between populations due to other factors, e.g., age distribution
  – usually not used for inter-population comparisons

• Specific Rates: can “always” be compared

Standardized Rates

• Alternative to Crude rate when a single summary rate is needed for comparison
  – example: when age distributions are different and disease is age related
  – “fictitious” summary rates are computed reflecting state “if the populations had the same age distributions”

Example (Direct) (After Jekel, Katz & Elmore, 2001)

              Population “A”                      Population “B”
Age           Size     Age-specific  Expected     Size     Age-specific  Expected
                       rate          number                rate          number
Young          1,000   0.001           1           4,000   0.002           8
Middle aged    5,000   0.010          50           5,000   0.020         100
Older          4,000   0.100         400           1,000   0.200         200
Total         10,000                 451          10,000                 308

Crude rate (CR), A = 451 / 10,000 = 4.51%
Crude rate (CR), B = 308 / 10,000 = 3.08%

Example (Direct): Standard Population

              Rates from “A”                      Rates from “B”
Age           Size:    Age-specific  Expected     Size:    Age-specific  Expected
              A+B      rate          number       A+B      rate          number
Young          5,000   0.001           5           5,000   0.002          10
Middle aged   10,000   0.010         100          10,000   0.020         200
Older          5,000   0.100         500           5,000   0.200        1000
Total         20,000                 605          20,000                1210

Standardized rate, A = 605 / 20,000 = 3.03%
Standardized rate, B = 1210 / 20,000 = 6.05%

Direct Standardization (there will be an exercise in homework)

• Choose a “standard population”

• Multiply (age)-specific rates from pop #1 by the standard population's age-group sizes; repeat for pop #2

• Sum the pop #1 expected numbers and divide by total standard population; repeat for pop #2

• Compare!

• This adjusts for the confounding effect of age
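The steps above can be sketched in a few lines of Python. The sketch uses the age-specific rates and population sizes from the lecture's direct-standardization example, with the combined population A+B as the standard; the helper name `summary_rate` is illustrative.

```python
# Age groups in order: Young, Middle aged, Older (figures from the example).
pop_a = [1_000, 5_000, 4_000]
pop_b = [4_000, 5_000, 1_000]
rates_a = [0.001, 0.010, 0.100]
rates_b = [0.002, 0.020, 0.200]


def summary_rate(rates, weights):
    """Expected cases under the given age distribution, divided by its total size."""
    expected = sum(r * w for r, w in zip(rates, weights))
    return expected / sum(weights)


# Crude rates: each population's rates weighted by its own age distribution.
crude_a = summary_rate(rates_a, pop_a)
crude_b = summary_rate(rates_b, pop_b)

# Directly standardized rates: both sets of rates weighted by the same
# standard population (here A + B combined, as in the example).
standard = [a + b for a, b in zip(pop_a, pop_b)]
std_a = summary_rate(rates_a, standard)
std_b = summary_rate(rates_b, standard)

print(f"Crude:        A = {crude_a:.3%}, B = {crude_b:.3%}")
print(f"Standardized: A = {std_a:.3%}, B = {std_b:.3%}")
```

The crude rates rank A above B, while the standardized rates reverse that order: B's age-specific rates are twice A's in every age group, and the crude comparison was masked by A's older age distribution.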

Indirect Standardization

• An alternative method of standardization
  – when you know the total deaths and you know your age distribution but you don’t know age-specific rates

• Apply (age)-specific rates from a standard population to compute “expected” deaths

• [Observed deaths] / [expected deaths] *100 = SMR (standardized mortality ratio)
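As a rough illustration of the SMR calculation, the sketch below applies standard-population rates to a study population's age distribution. All input numbers are hypothetical and chosen only to make the arithmetic easy to follow; the function name `smr` is illustrative.

```python
def smr(observed_deaths, study_population, standard_rates):
    """Standardized mortality ratio: observed / expected deaths, times 100."""
    # Expected deaths: standard age-specific rates applied to the study
    # population's age-group sizes.
    expected = sum(n * r for n, r in zip(study_population, standard_rates))
    return observed_deaths / expected * 100


# Hypothetical inputs, for illustration only.
study_population = [2_000, 3_000, 1_000]  # persons per age group in the study population
standard_rates = [0.001, 0.010, 0.100]    # age-specific death rates in the standard population
observed = 198                            # total deaths observed in the study population

print(f"SMR = {smr(observed, study_population, standard_rates):.0f}")
```

An SMR above 100 means more deaths were observed than the standard rates would predict for a population with that age structure.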

Direct and Indirect Standardization

                          Direct                Indirect
Data from standard pop    Age-distribution      Age-specific rates
Data from study pop       Age-specific rates    Age-distribution
End product               Age-adjusted rate     SMR

Summary rates

• Magnitude depends on choice of standard population

• Give “what if” comparison between groups

• Specific rates are usually preferable (and are comparable)

Proportional Mortality

• [# Deaths from a specific cause] divided by [all cause deaths] for a given time period

• Example: The proportion of all deaths (in NYC males 15-25) that were due to homicide in 1998

• This is neither a risk nor a rate; the denominator is all deaths.

[Figure: Proportions of all deaths due to specific causes (hypothetical data). Categories: stroke, heart disease, cancer, infections, other.]

Proportionate Mortality Ratio

• PMR = [observed deaths in population A] / [expected deaths based on the proportion in population B]

• Sometimes seen in occupational studies

Proportional mortality and PMR

• Often used when you don’t know the number of persons in the population

• Frequently used in Occupational Studies

• Can be misleading
  – if all-cause death rate differs, cause-specific rates can differ greatly but proportionate mortality may stay the same
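A small numeric sketch of that last point, using hypothetical rates per 100,000 (not from the lecture): two populations with very different cause-specific death rates can show identical proportional mortality.

```python
# Hypothetical death rates per 100,000 (illustration only).
pop_x = {"cause": 10.0, "all_causes": 100.0}
pop_y = {"cause": 30.0, "all_causes": 300.0}


def proportional_mortality(p):
    """Share of all deaths that are due to the specific cause."""
    return p["cause"] / p["all_causes"]


pm_x = proportional_mortality(pop_x)
pm_y = proportional_mortality(pop_y)
rate_ratio = pop_y["cause"] / pop_x["cause"]

print(f"Proportional mortality: X = {pm_x:.0%}, Y = {pm_y:.0%}")
print(f"Cause-specific rate ratio (Y / X): {rate_ratio:.1f}")
```

Both proportions are the same even though the cause-specific rate in Y is three times that in X, because Y's all-cause rate is also three times higher.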

PMR

• In Bantu laborers in South Africa, 91% of cancer deaths were due to liver cancer

• Usually liver cancer accounts for about 1% of cancer deaths

• Therefore Bantus have an unusually high liver cancer death rate

Example: Mortality per 100,000 in 19xx (After MacMahon & Trichopoulos)

              Bantu    African-American
Liver Ca      12.7      3.0
Other Ca       1.3     61.5
All Cancer    14.0     64.5

PMR overstated excess of liver Ca in Bantu and did not reveal great difference at other sites
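To see how much the proportional comparison overstates the excess, the mortality figures from this example can be compared both ways. This is a sketch; the variable and function names are illustrative.

```python
# Cancer mortality per 100,000, from the example.
rates = {
    "Bantu": {"liver": 12.7, "other": 1.3},
    "African-American": {"liver": 3.0, "other": 61.5},
}


def liver_share(group):
    """Proportion of cancer deaths due to liver cancer."""
    r = rates[group]
    return r["liver"] / (r["liver"] + r["other"])


share_ratio = liver_share("Bantu") / liver_share("African-American")
liver_rate_ratio = rates["Bantu"]["liver"] / rates["African-American"]["liver"]
other_rate_ratio = rates["African-American"]["other"] / rates["Bantu"]["other"]

for group in rates:
    print(f"{group}: liver cancer = {liver_share(group):.1%} of cancer deaths")
print(f"Ratio of proportions (Bantu / African-American): {share_ratio:.1f}")
print(f"Liver cancer rate ratio (Bantu / African-American): {liver_rate_ratio:.1f}")
print(f"Other cancer rate ratio (African-American / Bantu): {other_rate_ratio:.1f}")
```

The ratio of proportions is several times larger than the ratio of the liver cancer rates themselves, and the proportions alone hide the large difference in mortality from other cancers.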

Sources of Morbidity Data

• Disease registries
• Insurance Plans
• State L&I
• Medicare/HCFA; VA, armed forces
  – CDC web sites, MMWR
• Hospitals
• Industries, Schools
• Surveys and specific studies

Sources of Mortality Data

• US Vital Statistics

• State Vital Statistics

• Individual death certificates

• Disease registries

• Health maintenance organizations

• cdc.wonder.gov

Causes of death seen on death certificates (after Gordis)

• A mother died in infancy

• Deceased had never been fatally sick

• Died suddenly, nothing serious

• Went to bed feeling well, but woke up dead

• Died suddenly without the aid of a physician

• Cardio-Respiratory arrest

Rate confusion

• “Rates” loosely used includes: proportions, ratios, risk and instantaneous rate (Δt)

• Proportions include the numerator in the denominator (e.g., prevalence is a proportion but not a risk nor a rate)

• Ratios: numerator and denominator may be different groups, e.g., male/female ratio

Rates and Risks

• Rate:
  – denominator in person-time; time must be part of the measure
  – average population during the observation time

• Risk:
  – result of rates that prevailed over a period
  – denominator: persons at-risk at beginning; a closed population followed over time
  – time is not a dimension but used descriptively to specify period of observation

Incidence Density and Cumulative Incidence

• ID = [new cases] / [person-years]
  – technically the rate

• CI = [new cases] / [initial pop-at-risk]
  – the cumulative effect of the ID on pop-at-risk over a specified time period
  – technically a risk

• $CI_t = 1 - e^{-ID \cdot t}$
  – to estimate the cumulative effect of a rate [ID] on a population after “t” years (units of time)

Example Calculation

$CI_t = 1 - e^{-ID \cdot t}$

Where:
e = 2.71828…, the base of natural logs (or just push the ‘e’ button on your calculator)
ID = incidence density rate (= 124.7 per 1000)
t = years of observation (2, 5, 10 or 20)

So, e is raised to the “power” [−(0.1247)(2)]; the result is then subtracted from 1 to yield CI.

Example: Constant mortality rate of 124.7 per 1000 person-years (ID). What is cumulative risk (CI) at 2, 5, 10 and 20 years? [$CI_t = 1 - e^{-ID \cdot t}$]

Number of Years    Cumulative Risk of Death
 2                 0.2207 (22%)
 5                 0.4639 (46%)
10                 0.7126 (71%)
20                 0.9174 (92%)
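The same calculation can be reproduced in Python rather than on a calculator. This is a minimal sketch assuming a constant rate over the whole period, as the example does; the function name is illustrative.

```python
import math


def cumulative_incidence(rate, years):
    """Cumulative risk after `years` under a constant rate: 1 - exp(-rate * years)."""
    return 1 - math.exp(-rate * years)


ID = 124.7 / 1000  # incidence density: 124.7 per 1000 person-years

for t in (2, 5, 10, 20):
    print(f"{t:>2} years: {cumulative_incidence(ID, t):.4f}")
```

Note that the risk does not grow linearly with time: simply multiplying the rate by 20 years would give a value above 1, which is impossible for a risk.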
