Biostatistics and Research Design, #6350 Sampling 10-11


Thought for the Day:

“Education is what remains after one has forgotten what one has learned in school”
Albert Einstein


Sampling Overview

• The method of sampling is vital to the outcome of a study:
  – Do we believe the study result based on the subjects and manner of selection? (Noecker study; journal review by Paugh)
• E.g., in a dry eye study:
  – Were the subjects really dry?
  – How was this known? (what testing and by whom)
  – Did they represent a range of severity?
  – Were they sourced in an appropriate manner? (random, clinic based, etc.)


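[Diagram: from the total population to the enrolled sample. Labels in the order they appear on the slide:]

• Total population
• Sample
• Sampling method
• Eligible subjects
• Inclusion / exclusion criteria
• Subjects asked to participate
• Informed consent
• Subjects enrolled in study
• Consent obtained

Randomization: (2) forms:
• Random assignment
• Random selection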


A Note Concerning Randomization:

• Two major issues:
  – Random selection (underpinning for statistical inference)
    • Important for observational studies (cohort, case-control, cross-sectional)
  – Random assignment to treatment or control:
    • Important for experimental studies such as RCTs; we want the test and control groups to be as similar as possible so that any differences are due to the treatments
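Random assignment can be sketched in a few lines. This is only an illustration, assuming 30 already-enrolled subjects numbered 1 to 30 and two arms; the function name `randomly_assign` and the seed are hypothetical and not taken from any study mentioned here.

```python
import random

def randomly_assign(subject_ids, seed=None):
    """Randomly split enrolled subjects into two arms of near-equal size."""
    rng = random.Random(seed)       # a seed makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)           # every ordering is equally likely
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }

# Hypothetical example: 30 enrolled subjects, IDs 1..30
arms = randomly_assign(range(1, 31), seed=42)
print(len(arms["treatment"]), len(arms["control"]))
```

Because chance alone decides the arm, known and unknown subject characteristics tend to even out between the groups as the sample grows.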


Selection Bias

• Volunteers may differ by age, ethnicity, economic status, education level, gender

• Volunteers may have a different health status
  – “Healthy Worker Effect”

• Clinic Based vs. Population Based


Sample: Definition

• A sample is a subset of the population of interest, selected in such a way that it is HOPEFULLY representative of the whole population

• The population may be people, animals, organisms, inanimate objects, e.g. lenses, solutions


Reasons to Study Samples vs. Entire Population

1. Samples can be studied more rapidly

2. Samples can be studied less expensively

3. Can’t study the entire population in most situations

4. Sample results can be more accurate: can use more expensive methods that improve accuracy

5. Proper sampling allows inference

6. Samples reduce heterogeneity: e.g., sub-types of lupus, dry eye, etc.


Why a “Random” Sample?

• Statistical inference is underpinned by the collection of a truly random sample
• We must know how the sample was collected so that we can determine the probabilities associated with various outcomes
  – Random samples = known probabilities = can make inferences about the population
  – Non-random samples = unknown probabilities; cannot make inferences
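To make "known probabilities" concrete, the sketch below computes the chance that one particular unit lands in a simple random sample by counting samples. The sizes (50 drawn from 250) are illustrative assumptions, and `inclusion_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def inclusion_probability(population_size, sample_size):
    """Chance that one particular unit is in a simple random sample."""
    # Samples that contain the unit: choose the other n-1 members from N-1
    samples_containing_unit = comb(population_size - 1, sample_size - 1)
    # All equally likely samples of size n from N
    all_samples = comb(population_size, sample_size)
    return Fraction(samples_containing_unit, all_samples)

# Illustrative sizes: 50 units drawn from a population of 250
p = inclusion_probability(250, 50)
print(p)
```

The count reduces to n/N, the same for every unit. With a non-random sample there is no comparable calculation, which is why inference breaks down.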


A Further Note About Random Samples*

• We state that the probabilities of random samples can be determined, and we can use these probabilities to make inferences about the population from which the sample was drawn

• However,¹ “…It is unlikely that we would ever achieve a truly random sample, because the probabilities of selection will not always be exactly equal. But we do the best we can.”

¹ Lyman Ott, An Introduction to Statistical Methods and Data Analysis. 1977, Wadsworth Publishing, Belmont, CA. Page 50.

* This slide is not in the body of the handout, but a separate page


Population

• Describe your population in as much detail as possible
• If you are using human subjects, consider factors such as: age, gender, ethnicity, geographic area
• Describe in detail the characteristics of the population, such as: refractive status, ocular health status, accommodative status, binocularity


Inclusion / Exclusion Criteria

Are used to determine who can be enrolled in the study

• Examples:
  – Simple: no ocular pathology, visual acuity better than 20/30, no previous strabismus surgery
  – Complex: FDA drug trial (androgen study at SCCO):
    • 13 inclusion and 27 exclusion criteria!
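The "simple" criteria above can be written as an explicit screening rule. This is a sketch only: the `Candidate` record, its field names, and the five example candidates are hypothetical, and "better than 20/30" is interpreted here as a Snellen denominator strictly below 30.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    subject_id: int
    ocular_pathology: bool
    snellen_denominator: int        # 20 for 20/20, 40 for 20/40, ...
    prior_strabismus_surgery: bool

def is_eligible(c):
    """Apply the three simple criteria; all must be satisfied."""
    return (
        not c.ocular_pathology
        and c.snellen_denominator < 30   # assumed reading of "better than 20/30"
        and not c.prior_strabismus_surgery
    )

# Made-up candidates for illustration only
candidates = [
    Candidate(1, False, 20, False),
    Candidate(2, False, 40, False),   # acuity too poor
    Candidate(3, True, 20, False),    # ocular pathology
    Candidate(4, False, 25, True),    # previous strabismus surgery
    Candidate(5, False, 25, False),
]
eligible_ids = [c.subject_id for c in candidates if is_eligible(c)]
print(eligible_ids)
```

Writing the rule down this way forces each criterion to be stated unambiguously before enrollment begins.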


Why Be So Specific?

• To prevent confounding:
  – remember the myopia-night light example
  – could have included equal numbers of hyperopes
• To prevent selection bias:
  – generally must balance factors that influence an outcome (e.g., age, gender, ethnicity)
• To more clearly interpret the outcome:
  – e.g., “in this group of 20-40 yr old dry eye subtypes of various ethnicities and balanced for gender, we found residence time of XYZ”


Issues with Being Too Specific

May make it extremely difficult to enroll!


Simple Random Sample

• Random does NOT mean arbitrary or without thought!

• Characteristics of a random sample:
  – Every “unit” in the population has an identifier
  – Each “unit” has an equal chance of being selected
  – The selection of one “unit” does not influence the selection of another


Simple Random Sample

• Enumerate or identify every “unit”
• Use a random number table or random number generator to randomly select your subjects
• Example: chart review for diabetic retinopathy:
  – 250 charts for 06-07; need 50 according to your sample size estimate
  – Random numbers up to 250; make your 50 picks from the list
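With a random number generator, the chart review example might look like the sketch below. It assumes the 250 charts have been numbered 1 to 250; the seed value is arbitrary and only there so the picks can be reproduced.

```python
import random

rng = random.Random(2024)        # arbitrary fixed seed: picks can be re-created later
chart_numbers = range(1, 251)    # assumed numbering: charts 1..250

# Draw 50 distinct chart numbers; every chart has the same chance of selection
selected_charts = sorted(rng.sample(chart_numbers, 50))
print(selected_charts[:10])
```

`random.sample` draws without replacement, so no chart can be picked twice.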


Sequential or Systematic Random Sample

• Method: Select the first unit randomly, then select every xth unit thereafter
  – E.g., x-rays: need 200, have 3400; 3400 / 200 = 17; select every 17th x-ray

• Not a “true” random sample, because the selection of one unit influences the selection of another
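The x-ray example can be sketched as follows, assuming the x-rays are numbered 1 to 3400 and the random start is drawn from the first 17. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    interval = population_size // sample_size   # 3400 // 200 = 17
    rng = random.Random(seed)
    start = rng.randint(1, interval)            # only this step is random
    picks = range(start, population_size + 1, interval)
    return list(picks)[:sample_size]

xrays = systematic_sample(3400, 200, seed=7)
print(xrays[:5])
```

Once the start is fixed, every other pick is determined, which is the dependence between selections noted above.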


Randomized Block Sample

• Not a “true” random sample
• Units are organized into sections or groups called blocks, e.g. zip codes, lens batches, day of the week for appointment

• The block is randomly selected, then every unit in the block is included in the sample
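A minimal sketch of the block approach, using lens batches as the blocks. The batch names, lens labels, and the choice of two blocks are made up for illustration, and `block_sample` is a hypothetical helper.

```python
import random

def block_sample(blocks, n_blocks, seed=None):
    """Randomly choose whole blocks, then take every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(blocks), n_blocks)         # random step: which blocks
    units = [unit for b in chosen for unit in blocks[b]]  # no sampling within a block
    return chosen, units

# Hypothetical lens batches (block name -> units in that block)
lens_batches = {
    "batch_A": ["A1", "A2", "A3"],
    "batch_B": ["B1", "B2"],
    "batch_C": ["C1", "C2", "C3", "C4"],
    "batch_D": ["D1", "D2"],
}
chosen_batches, sampled_lenses = block_sample(lens_batches, 2, seed=1)
print(chosen_batches, sampled_lenses)
```

Units in the same block are selected together or not at all, so individual units are not chosen independently of one another.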


Convenience Sample

• No element of random selection
• Subjects are selected “at your convenience” rather than systematically
• Examples: by location (SCCO), friends
  – Most samples for student research at SCCO!

10-11

Sampling and Randomization: putting it all together

• Dr. Paugh’s SCL Lubricant Study: n = 30
  – Sample: convenience: asked who was interested among the students and staff at SCCO
    • Issues?
      – Age: young, healthy
      – Refractive error: very high number of high myopes
      – Non-random selection (issue of external validity)
  – Randomization: lubricant randomly assigned
    • Minimizes temporal effects, “learning”
  – Masking: third party assigns lubricant
