Basic structure of the frequently used methods in medical research
04/11/2023 Dr. Tarek Tawfik 1
Research methods “an overview”
Dr. Tarek Tawfik
Professor of Public Health
Cairo University
Research?
More than a set of skills, it is a way of thinking:
examining critically the various aspects of day-to-day professional work;
understanding and formulating guiding principles that govern a particular procedure;
developing and testing new theories for the enhancement of your practice.
It is the habit of questioning, with systematic examination of the observed information, to find answers that may result in more effective professional services (Kumar R 2005).
Definition: Research is a structured inquiry that utilizes acceptable scientific methodology to solve problems and creates new knowledge that is generally applicable (Grinnell 1993).
Types of research
By application: pure research, applied research.
By objectives: exploratory research, descriptive research, explanatory research, correlational research.
By inquiry mode: quantitative research, qualitative research.
Research process “the 8 steps model”
Phases: what, how, and conducting of the study.
Steps, with the supporting knowledge for each:
1. Formulating a research question (FINER; literature review; variables and hypotheses: definition and typology).
2. Research design (research design: functions; study designs).
3. Instruments for data collection (methods and tools of data collection; validity and reliability of the research tool; field test of the tools).
4. Selecting a sample (sampling theory and designs).
5. Research protocol writing (contents of research proposal).
6. Data collection.
7. Data processing (editing; code book; coding; methods of data processing: computing and statistics).
8. Research report (principles of scientific writing).
The structure of a research project is set out in its protocol, the written plan of the study.
The functions of the protocol are:
Seeking grant funds.
Helping the investigator to organize his research in a logical, focused, and efficient way.
Elements of the protocol and their purpose:
Research questions: What questions will the study address?
Significance (background): Why are these questions important?
Design (time frame, epidemiologic approach): How is the study structured?
Subjects (selection criteria, sampling design): Who are the subjects and how will they be selected?
Variables (predictor variables, confounding variables, outcome variables): What measurements will be made?
Statistical issues (hypotheses, sample size, analytic approach): How large is the study and how will it be analyzed?
I- Conceiving the Research Question.
The research question is the uncertainty about something in the population that the investigator wants to resolve by making measurements on his study subjects.
No shortage of questions as one leads to another.
Tamoxifen and breast cancer.
Tamoxifen reduces the risk of breast cancer during 4 years of use by women at high risk of breast cancer.
Many other questions evolved:
o Does tamoxifen reduce the risk of death due to breast cancer?
o How long should treatment be continued?
o Might other drugs with the same action be beneficial without the risk of tamoxifen-induced thromboembolism?
o Does the use of such a drug increase the risk of other cancers (ovarian)?
The difficulty lies in finding a question that can be transformed into a feasible and valid study plan.
Origins of a research question.
For established investigators: the best research questions usually emerge from findings and problems faced and observed in prior studies, and in those of other workers in the field “Major Players”.
For new and other investigators:
☼ Mastering the literature.
☼ Being alert to new ideas and techniques.
☼ Keeping the imagination roaming.
☼ Attending seminars, workshops and conferences.
Characteristics of a good research question “FINER Criteria”.
Feasible: adequate number of subjects; adequate technical expertise; affordable in time and money; manageable in scope.
Interesting: to the investigator.
Novel: confirms or refutes previous findings; extends previous findings; provides new findings.
Ethical.
Relevant: to scientific knowledge; to clinical and health policy; to future research directions.
Developing the research question and study plan.
☼ A one- or two-page outline of the study question and the study plan at an early stage is very helpful.
☼ This focuses attention on clarifying the ideas about the plan and on discovering specific potential problems that need correction.
The research question should specify:
Predictor (exposure): smoking.
Outcome (disease): lung cancer.
Confounders: occupational hazards.
The research question and study plan: problems and solutions
Potential problem: the research question is not FINER.
1- Not feasible
Too broad: specify a smaller set of variables; narrow the question.
Not enough subjects available: expand the inclusion criteria; eliminate or modify exclusion criteria; add other sources of subjects; lengthen the time frame for entry into the study; use strategies to decrease sample size.
Methods beyond the skills of the investigator: collaborate with those who have the skills; consult and review the literature for alternative methods.
Too expensive.
2- Not interesting, novel, or relevant: consult and modify the research question.
3- Uncertain ethical suitability: consult and modify the research question.
Potential problem: the study plan is vague.
Exercise:
Consider the following research questions.
First, write each question in a single sentence that specifies a predictor, outcome, and population.
Then discuss whether it meets the FINER criteria.
Rewrite the question in a form that overcomes any problems in meeting these criteria.
Exercise:
A. What is the relationship between depression and health?
B. Does eating red meat cause cancer?
C. Does lowering serum cholesterol prevent heart disease?
D. Can a relaxation exercise decrease the anxiety associated with mammography?
E. Do contraceptive vaginal sponges prevent HIV infection?
F. Does dietary pattern among school children affect their health?
Assignment:
Formulate research questions regarding health and health-related problems that may be encountered in:
A. A rural community and the available health facilities.
B. An urban primary health care facility.
C. Primary schools.
II- Rationale (Significance).
This section sets the proposed study in context and gives its rationale:
What is known about the topic at hand?
Why is the research question important?
What kind of answers will the study provide?
Rationale “Background”
۞ This section cites previous research that is relevant (including the investigator’s own work) and indicates the problems with that research and what questions remain.
۞ It makes clear how the findings of the proposed study will help in:
o Resolving uncertainties,
o Leading to new scientific understanding, and
o Influencing clinical and public health policy.
Sequence of the rationale
In a concise logical sequence:
Discuss the importance of the topic “significance”.
Review the relevant literature and current knowledge (including deficiencies in knowledge that make the study worth doing).
Describe any results you have already obtained in the area of the proposed study.
Sequence of the rationale (continued)
Indicate how the research question has emerged and fits logically with the above.
Outline in broad terms how you intend to address the research question.
Explain how your study will add to knowledge and help to improve health and/or save money.
How to determine research priorities?
(Importance/Significance)
I- How frequent is the condition relative to other conditions?
Prevalence
As a cause of death
II- What is the degree of disability or dysfunction due to the condition?
III- Are there cost-effective means to cure, control, or prevent such a condition?
Assignment:
State the rationale (significance) for the proposed study question.
III-Setting up research objectives.
Purpose: broad objectives (aims)
☼ The statement of purpose of a research project should describe the main questions to be addressed by the research without going into details.
☼ It should give the reader a clear idea of the nature of the research that will be undertaken.
‘The purpose is to measure the effect of a Plasmodium falciparum asexual blood-stage vaccine in reducing morbidity and mortality due to malaria.’
‘This study is conducted to assess the nutritional problems among primary school children.’
Specific objectives
The specific objectives should be SMART:
S Specific
M Measurable (effect size)
A Applicable, achievable
R Relevant
T Timely (a time frame and end point).
Objectives “characteristics”
Descriptive studies: objectives should be clear, complete, and specific.
Correlational studies (experimental and non-experimental): as above, and the objectives should also identify the main variables to be correlated.
Hypothesis-testing studies: as above, and the objectives should also identify the direction of the relationship.
Specific objectives in research
They should include a concise but detailed description of:
o The intervention (study) to be evaluated,
o The outcome(s) of interest,
o And the population in which the study will be conducted.
Why is asthma among children in Istanbul exceptionally frequent?
The purpose of the study is to determine whether the excess asthma in Istanbul is related to a combination of genetic predisposition (estimated by atopy) and socio-economic status and/or indoor air pollution.
What are the specific objectives to achieve such a type of study?
I. Identify a suitable source of childhood asthma cases and select 200 cases, following a specific case definition.
II. Identify and select suitable control subjects (individuals without asthma).
III. Measure indoor particulate exposure on each of 3 randomly selected days for each participant.
IV. Perform allergy skin tests on cases and controls (atopy).
V. Record personal, demographic, and socio-economic information about cases and controls.
VI. Compare cases and controls with respect to atopy, low socio-economic status, and increased indoor air pollution (in a case-control comparison such as this one, the measure of association is the odds ratio rather than the risk ratio).
Hypotheses and Underlying Principles
Hypothesis definition
A hypothesis is written in such a way that it can be proven or disproved by valid and reliable data; it is in order to obtain these data that we perform our study (Grinnell 1988:200).
A hypothesis has certain characteristics:
1. It is a tentative proposition “hunch”.
2. Its validity is unknown.
3. In most cases, it specifies a relationship between two or more variables.
Functions of hypothesis
Formulation of a hypothesis provides a study with focus “specific aspects of a research problem to investigate”.
It tells you what data are necessary to collect to test the hypothesis.
It enables you to specifically conclude what is true or what is false.
Process of testing a hypothesis
Phase I: Formulate your hunch or assumption.
Phase II: Collect the required data.
Phase III: Analyze the data to draw conclusions about the hunch (true/false).
Hypotheses
It is the further formulation of the study question into a final and more specific version that summarizes the elements of the study: the sample, the design, and the predictor and outcome variables.
The primary purpose is to establish the basis for tests of statistical significance.
Hypotheses
I- Hypotheses are not needed in descriptive studies, which describe how characteristics are distributed in a population.
Example: the prevalence of a particular genotype among patients with hip fracture.
II- Hypotheses are needed in most of the observational and experimental studies that address statistical comparison.
Example: the study of whether a particular genotype is more common in patients with hip fracture than in controls.
Hypotheses
If any of the following terms appear in the research question, then the study is not descriptive and a hypothesis should be formulated:
greater than, less than, causes, leads to, compared with, more likely than, associated with, related to, similar to, or correlated with.
Characteristics of a good hypothesis
Simple, Specific, Stated in advance (3Ss)
A-Simple versus complex
A simple hypothesis contains one predictor and one outcome variable (a sedentary lifestyle is associated with an increased risk of proteinuria in patients with diabetes).
A complex hypothesis contains more than one predictor (a sedentary lifestyle and alcohol consumption are associated with increased risk of proteinuria in patients with diabetes).
Simple versus complex (continued)
Or more than one outcome variable (alcohol consumption is associated with an increased risk of proteinuria and neuropathy in patients with diabetes).
Complex hypotheses cannot be readily tested with a single statistical test and are more easily approached by breaking them into two or more simple hypotheses.
Simple versus complex: exercise
(Smoking cigarettes, cigars, or a pipe is associated with an increased risk of proteinuria in patients with diabetes.)
What type of hypothesis is this?
B-Specific versus Vague
A specific hypothesis leaves no ambiguity about the subjects, the variables, or how the test of statistical significance will be applied.
It uses concise operational definitions that summarize the nature and source of the subjects and how the variables will be measured
(a history of using tricyclic antidepressant medications, as measured by review of pharmacy records, is more common in patients hospitalized with an admission diagnosis of myocardial infarction at Longview Hospital in the past year than in controls hospitalized for pneumonia).
Specific versus Vague
It is often obvious from the research hypothesis whether the predictor variable and the outcome variable are dichotomous, continuous, or categorical.
(alcohol consumption (in mg/day) is associated with an increased risk of proteinuria (> 30 mg/dL) in patients with diabetes).
C-In Advance versus After-the-Fact
The hypothesis should be stated in writing at the outset of the study.
A single pre-stated hypothesis creates a stronger basis for interpreting the study results than several hypotheses that emerge as a result of data inspection.
Hypotheses that are formulated after data examination are a form of multiple hypothesis testing that often leads to over-interpreting the importance of the findings.
Types of hypothesis
Alternate hypothesis
Research hypothesis, which may be a:
o Hypothesis of no difference “null hypothesis”
o Hypothesis of difference
o Hypothesis of point-prevalence
o Hypothesis of association
Types of hypothesis “examples”
There is no significant difference in the proportion of male and female smokers in the study population. Hypothesis is ?
A greater proportion of females than males are smokers in the study population. Hypothesis is ?
A total of 60% of females and 30% of males in the study population are smokers. Hypothesis is ?
There are twice as many female smokers as male smokers in the study population. Hypothesis is ?
Types of Hypotheses
1- Null and Alternative
I- The null hypothesis states that there is no association between the predictor and outcome variables in the population (there is no difference in the frequency of drinking well water between subjects who develop peptic ulcer disease and those who do not).
II- It is the formal basis for testing statistical significance.
Statistical tests help to estimate the probability that an association observed in a study is due to chance.
Null and Alternative
o The proposition that there is an association is called the alternate hypothesis.
o The alternative hypothesis cannot be tested directly; it is accepted by default if the test of statistical significance rejects the null hypothesis. “accepted when null is rejected”
2- One and Two-sided alternative Hypothesis
I- A one-sided hypothesis specifies the direction of the association between the predictor and the outcome variables.
Drinking well water is more common among subjects who develop peptic ulcer (one-sided).
II- A two-sided hypothesis states only that an association exists; it does not specify the direction.
Subjects who develop peptic ulcer disease have a different frequency of drinking well water than those who do not (two-sided).
Indications
For one-sided:
When only one direction for an association is important or biologically meaningful (a new drug for hypertension is more likely to cause rashes than a placebo).
When there is good evidence from prior studies that an association is unlikely to occur in one of the two directions (smoking affects the risk of brain cancer).
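Underlying Statistical Principles
[Figure: designing and implementing a study. Research question (truth in the universe: target population, phenomena of interest) → design → study plan (truth in the study: intended sample, intended variables) → implement → actual study (findings in the study: actual subjects, actual measurements). Inferences run back from the findings in the study to the truth in the study, and from there to the truth in the universe; random and systematic error can enter at each step.]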
Underlying Statistical Principles
The analogy between a jury decision and statistical tests:
Jury decision: innocence (the defendant did not counterfeit money). Statistical test: null hypothesis (there is no association between dietary carotene and incidence of colon cancer).
Jury decision: guilt (the defendant did counterfeit money). Statistical test: alternative hypothesis (there is an association between dietary carotene and colon cancer incidence).
Jury decision: the standard for rejecting innocence is beyond a reasonable doubt. Statistical test: the standard for rejecting the null hypothesis is the level of statistical significance (α ≤ 0.05).
Jury decision: correct judgment, convict a counterfeiter. Statistical test: correct inference, conclude that there is an association when one does exist in the population.
Jury decision: correct judgment, acquit an innocent person. Statistical test: correct inference, conclude that there is no association between carotene and colon cancer when one does not exist.
Jury decision: incorrect judgment, convict an innocent person. Statistical test: incorrect inference (Type I error), conclude that there is an association when there actually is none.
Jury decision: incorrect judgment, acquit a counterfeiter. Statistical test: incorrect inference (Type II error), conclude that there is no association when there actually is one.
Type I and type II error
A type I error (false-positive) occurs if the investigator rejects a null hypothesis that is actually true in the population.
A type II error (false-negative) occurs if the investigator fails to reject a null hypothesis that is actually not true in the population.
Truth in the population vs. the results in the study sample (the four possibilities).
Reject null hypothesis, no association between predictor and outcome in the population: Type I error.
Reject null hypothesis, association between predictor and outcome in the population: Correct.
Fail to reject null hypothesis, no association between predictor and outcome in the population: Correct.
Fail to reject null hypothesis, association between predictor and outcome in the population: Type II error.
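The four possibilities above can be written as a small lookup. This is only an illustrative sketch; the function name and the returned labels are hypothetical and are not part of the slides.

```python
def classify_inference(association_in_population: bool, null_rejected: bool) -> str:
    """Return which of the four possibilities a study result falls into."""
    if null_rejected and not association_in_population:
        # The null hypothesis was true but the study rejected it.
        return "Type I error (false positive)"
    if not null_rejected and association_in_population:
        # The null hypothesis was false but the study failed to reject it.
        return "Type II error (false negative)"
    # In the two remaining cells the study result agrees with the truth.
    return "Correct"


for truth in (False, True):
    for rejected in (True, False):
        print(truth, rejected, classify_inference(truth, rejected))
```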
α, β, and Power
The probability of committing a type I error (rejecting the null hypothesis when it is actually true) is called α (alpha); another name for it is the level of statistical significance.
An α level of 0.05 sets 5% as the maximum chance of incorrectly rejecting the null hypothesis.
The probability of making a type II error (failing to reject the null hypothesis when it is actually false) is called β (beta).
The quantity (1 - β) is called power: the probability of rejecting the null hypothesis in the sample if the actual effect in the population is equal to (or greater than) the specified effect size.
If β is set at 0.10, we are willing to accept a 10% chance of missing an association of a given effect size. This represents a power of 90% (there is a 90% chance of finding an association of that size).
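To make the relationship between the significance level, the type II error rate, and power concrete, the sketch below computes approximate power for one simple situation that is not taken from the slides: a two-sided z-test comparing the means of two equal-sized groups, using a normal approximation. The function name, the standardized effect size of 0.5, and the group size of 85 are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two group means.

    effect_size is the standardized difference (difference in means / SD).
    Assumes equal group sizes and a normal approximation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)              # critical value for the chosen alpha
    shift = effect_size * sqrt(n_per_group / 2)    # expected value of the test statistic
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = approximate_power(effect_size=0.5, n_per_group=85)
beta = 1 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

With no true effect (effect size 0) the function returns alpha itself, which matches the definition of the type I error rate; power rises as either the effect size or the sample size increases.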
P Value
A ‘non-significant’ result (i.e., one with a P value greater than α) does not mean that there is no association in the population; it only means that the result observed in the sample is small compared with what could have occurred by chance alone.
Example: those with hypertension were twice as likely to develop prostate cancer as normotensive subjects (P = 0.08).
Sampling
In research, what are we looking for?
The variable: a condition, quality or trait that varies from one case to another in the target population (population of interest).
This variation provokes research to study these variables.
To study them, we either include the whole population or a sample.
Basic Terms and Concepts
Target Population and Sample
o A population is a complete set of units with a specified set of characteristics, while a sample is a subset of that population.
o The defining characteristics of a population include geographic, clinical, demographic and temporal characteristics.
o Clinical and demographic characteristics define the target population, the large set of people throughout the world to which the results will be generalized (all teenagers with asthma).
o The study sample is the subset of the target population available for study (teenagers with asthma in the investigator’s town in 2005).
Steps in designing the protocol for choosing the study subjects
[Figure: the design step leads from the research question (truth in the universe) to the study plan, and from there to the findings in the study.]
Target population: specify clinical and demographic, and then geographic and temporal, characteristics.
Intended sample: specify the accessible population and the approach to selecting the sample.
Selection Criteria
How would you define the population to be studied?
Through establishing selection criteria that include inclusion and exclusion criteria.
Example: demonstrate the selection criteria for subjects in a study to evaluate the efficacy of calcium supplements for preventing osteoporosis.
Designing selection criteria for a clinical trial of calcium supplements to prevent osteoporosis
Inclusion criteria (be specific)
Considerations: specifying the characteristics that define populations that are relevant to the research question and efficient for study.
Example: a 5-year trial of calcium supplementation for preventing osteoporosis might specify that the subjects be:
® Demographic (age, sex, and race): white females 50 to 60 years old.
® Clinical characteristics: in good general health.
® Geographic (administrative): patients attending clinic at X Hospital.
® Temporal characteristics: between January 1st and December 31st of next year.
Designing selection criteria for a clinical trial of calcium supplements to prevent osteoporosis (continued)
Exclusion criteria (be parsimonious)
Considerations: specifying the subsets of the population that will not be studied because of:
o A high likelihood of being lost to follow-up.
o An inability to provide good data.
o Being at high risk of side effects.
o Characteristics that make it unethical to withhold the study treatment.
Example: the calcium supplementation trial might exclude subjects who:
o Are alcoholic or plan to move out of the country or region.
o Are disoriented or have a language barrier.
o Have sarcoidosis or hypercalcemia.
o Are taking steroids.
Clinical versus Community populations
If the research question involves patients with a disease, hospitalized or clinic-based patients are inexpensive and easy to recruit, but selection factors that determine who comes to the hospital or clinic may have an important effect.
Tertiary clinics tend to accumulate patients with serious forms of disease.
Population-based samples, chosen in the community to represent a non-clinical population, are difficult and expensive to recruit, but they are particularly useful for guiding public health and clinical practice in the community.
Studying the whole population
Resorted to if we are interested in the characteristics of each individual, particularly with descriptive research questions, and there is a need for generalizing the findings.
Probability sampling is the gold standard. It provides a rigorous basis for estimating the fidelity with which phenomena observed in the sample represent those in the population, and for computing statistical significance and confidence intervals.
A. It is expensive.
B. It is time consuming.
C. It has a higher chance of error because of the many persons, the equipment, and the wide geographic area covered.
D. It is carried out in censuses.
Sampling
Resorted to if we are interested in studying the prevalence of a problem, associations, or an intervention effect, etc.
A. It is less expensive.
B. It is less time consuming.
C. It has a lower chance of error because fewer persons, less equipment, and a smaller geographic area are involved.
D. Only estimates are obtained; the reality is unknown.
E. It allows for continuous study of the population “longitudinal studies”.
Study of a sample is carried out in the majority of biomedical research.
The concept of sampling
Study population: made up of sampling units.
You select a few sampling units from the study population: the sample.
You collect information from these people to find answers to your research questions.
You make an estimate “prediction” (prevalence, outcomes, etc.) extrapolated to the study population.
Principles of sampling
I. In the majority of cases of sampling there will be a difference between the sample statistic and the true population mean, which is attributable to the selection of the units in the sample “sampling error”.
II. The greater the sample size, the more accurate will be the estimate of the true population mean “reduction in sampling error”.
III. The greater the variability of the variable under study in a population “heterogeneous variable”, for a given sample size, the greater will be the difference between the sample statistic and the true population mean “the larger the sampling error”.
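Principles II and III can be illustrated with the standard error of the mean for a simple random sample, which is the population standard deviation divided by the square root of the sample size. This formula is standard but is not stated on the slides; the sketch also ignores the finite population correction, and the function name is hypothetical.

```python
from math import sqrt


def standard_error_of_mean(population_sd: float, sample_size: int) -> float:
    """Typical size of the sampling error of a sample mean (simple random sampling)."""
    return population_sd / sqrt(sample_size)


# Principle II: a larger sample gives a smaller sampling error.
print(standard_error_of_mean(10, 100))   # SD 10, n = 100
print(standard_error_of_mean(10, 400))   # same SD, n = 400

# Principle III: a more heterogeneous variable gives a larger sampling error.
print(standard_error_of_mean(20, 100))   # SD 20, n = 100
```

Note that quadrupling the sample size only halves the standard error, which is why very precise estimates are expensive.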
Types of sampling
Random/probability sampling:
o Simple random
o Stratified random (proportionate, disproportionate)
o Cluster (single stage, double stage, multi-stage)
Non-random/non-probability sampling:
o Quota
o Accidental
o Judgmental
o Snowball
Mixed sampling:
o Systematic sampling
Types of Samples
Probability samples: units are selected according to probability laws, i.e. everyone in the underlying population has an equal (or a specified) and independent chance of appearing in the sample.
Non-probability (convenience) samples: units are selected based on known factors. In clinical research the study sample is usually made up of people who meet the inclusion criteria and are easily accessible to the investigator.
Probability Samples
In order to be able to infer from sample results to the underlying population, the sample should be representative, i.e. it should represent the population from which it is drawn in every respect.
Because we cannot anticipate all the characteristics of the population that the sample should represent, we choose a probability (random) sample.
How to draw a probability Sample?
I. Identify the study units (individuals, villages, houses, …etc).
II. Make a complete list of the study units in the underlying population. That complete list is known as the sampling frame.
III. Each of these units is given a number.
IV. Then select the required number of units (sample size) at random from that frame.
The selection of units can be made either by:
1. The lottery method “fishbowl draw” (the numbers of the frame units are written on identical pieces of paper, mixed thoroughly in a bowl, and the required number is blindly picked).
2. Through the use of random numbers tables.
3. Computer generated random numbers.
Two systems of drawing a random sample:
Sampling without replacement.
Sampling with replacement.
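As an illustration of computer generated random numbers and of the two systems of drawing, the following sketch numbers the units of a sampling frame from 1 upwards and draws from them with Python's standard `random` module. The function name, the frame size of 1,000 and the seed are illustrative assumptions.

```python
import random


def draw_sample(frame_size, sample_size, replacement=False, seed=None):
    """Draw unit numbers from a sampling frame numbered 1..frame_size."""
    rng = random.Random(seed)           # a seed makes the draw reproducible
    frame = range(1, frame_size + 1)
    if replacement:
        # With replacement: a unit may be drawn more than once.
        return rng.choices(frame, k=sample_size)
    # Without replacement: each unit can appear at most once.
    return rng.sample(frame, sample_size)


without = draw_sample(1000, 50, replacement=False, seed=1)
with_repl = draw_sample(1000, 50, replacement=True, seed=1)
print(sorted(without)[:10])
print(len(set(with_repl)), "distinct units among", len(with_repl), "draws")
```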
Random number table
Random Sampling Techniques
1-Simple random sample
2-Stratified random sample
3-Systematic random sample
4-Cluster random sample
5-Multistage random sample
1-Simple random sample
We prepare a complete and up-to-date list of the underlying population (sampling frame). The specified sample size is drawn from that frame at random.
Disadvantages:
Suitable only for a homogeneous population (e.g. single sex).
A larger sample size is required.
More expensive, as we have to get the cases from widely scattered areas.
Time consuming and more laborious.
Some groups might not be represented in the sample.
Extreme values can occur by chance.
Example of Simple random sample using random digit table.
Draw at random a sample of size 50 from a population of 10,000.
A. The size of the population is 10,000, i.e. it is formed of 5 digits.
B. Select at random a page from the random numbers table.
C. Select 5 adjacent columns.
D. Proceed from the top down; any value falling between 00001 and 10000 is chosen, and so on until you have completed your 50 cases.
E. Duplicate numbers are left aside.
F. Individuals with those 50 numbers compose our sample.
The First 15 columns of the first page of a Random numbers table
26804 00010 93445
90720 12805 58563
85027 32242 86468
09362 16212 00128
64590 75362 32348
29273 34703 23763
96215 01556 63708
59207 22211 48522
49674 01534 98685
04104 00047 14986
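The selection rule can be traced on the excerpt of the table shown above. The sketch below reads each block of 5 digits from the top down, keeps values between 00001 and 10000, and sets duplicates aside. Moving on to the next block of 5 columns once a block is exhausted is an assumption, as the slides only say to proceed from the top down; the function name is hypothetical.

```python
# The excerpt of the random numbers table, as three blocks of 5 columns.
RANDOM_TABLE = [
    ["26804", "00010", "93445"],
    ["90720", "12805", "58563"],
    ["85027", "32242", "86468"],
    ["09362", "16212", "00128"],
    ["64590", "75362", "32348"],
    ["29273", "34703", "23763"],
    ["96215", "01556", "63708"],
    ["59207", "22211", "48522"],
    ["49674", "01534", "98685"],
    ["04104", "00047", "14986"],
]


def select_from_table(table, population_size, sample_size):
    """Pick unit numbers from a random digit table, top down, block by block."""
    chosen = []
    for block in range(len(table[0])):       # assumed: continue with the next 5 columns
        for row in table:                    # proceed from the top down
            number = int(row[block])
            in_range = 1 <= number <= population_size
            if in_range and number not in chosen:   # duplicates are left aside
                chosen.append(number)
                if len(chosen) == sample_size:
                    return chosen
    return chosen


selected = select_from_table(RANDOM_TABLE, population_size=10_000, sample_size=50)
print(selected)
```

The excerpt yields only 7 usable numbers, so reaching 50 cases would require continuing through further rows and columns of the full table.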
2-Stratified random sampling
o Based upon the logic of heterogeneity of the included variables.
o Ensures homogeneity of sub-populations through ranking them into strata.
2-Stratified random sample
Ensures representativeness with regard to important characteristics such as age, sex, educational or socio-economic levels.
The population is divided into strata (subgroups) according to the different levels of the important variable. The population in each stratum is homogeneous, so sampling accuracy is increased.
We choose a simple random sample from each stratum, the size of which is proportionate to the size of that stratum.
In other words, the sampling fraction is the same for each stratum and for the total sample:
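$$\frac{n_1}{N_1} = \frac{n_2}{N_2} = \frac{n_3}{N_3} = \frac{n}{N}$$
where $n_i$ and $N_i$ are the sample size and the population size of stratum $i$, and $n$ and $N$ are the total sample size and the total population size.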
Example of stratified random sample
A town with a total population of 12,000 was classified into 4 homogeneous socioeconomic strata. The population in each stratum was 2,000 (class I), 4,000 (class II), 5,000 (class III) and 1,000 (class IV). A sample of 600 is to be drawn from the town. Calculate the number of individuals to be drawn at random from each of the 4 strata.
$$\text{Sampling fraction} = \frac{600}{12{,}000} = \frac{1}{20}$$
$$\text{Stratum 1 sample} = 2000 \times \frac{1}{20} = 100$$
$$\text{Stratum 2 sample} = 4000 \times \frac{1}{20} = 200$$
$$\text{Stratum 3 sample} = 5000 \times \frac{1}{20} = 250$$
$$\text{Stratum 4 sample} = 1000 \times \frac{1}{20} = 50$$
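The same proportionate allocation can be reproduced in a few lines. This is a minimal sketch with a hypothetical function name; it rounds each stratum's share to the nearest whole person, which in less tidy examples may leave the total one or two off the target and require a manual adjustment.

```python
def proportionate_allocation(strata_sizes, sample_size):
    """Give each stratum the same sampling fraction as the whole population."""
    total = sum(strata_sizes.values())
    return {
        name: round(size * sample_size / total)
        for name, size in strata_sizes.items()
    }


town = {"I": 2000, "II": 4000, "III": 5000, "IV": 1000}
allocation = proportionate_allocation(town, sample_size=600)
print(allocation)
print(sum(allocation.values()))
```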
3-Systematic random sample
1. The underlying population is classified into intervals:
The size of each interval = the size of the population ÷ the required sample size.
2. The first case is selected at random from the first interval, and the others are selected by systematically adding the size of the interval.
3. Accordingly, we take every nth individual, where n is the size of the interval. If the interval is 10, we take every tenth observation.
Example of systematic random sample
1000 patients visit King Faisal University outpatient clinics every day. We need a systematic random sample of 100 patients. Explain how we should proceed in selecting the 100 patients composing our sample.
We classify the patients into 100 intervals and select a patient from each.
Size of each interval = 1000/100 = 10
Choose at random a number that lies between 1 and 10, say 9.
Choose from the second interval patient number 19: 19 = 9 + 10, or 19 = 9 + 1 × 10.
Choose from the third interval patient number 29: 29 = 19 + 10, or 29 = 9 + 2 × 10.
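The clinic example can be sketched as follows. The function name is hypothetical, and the sketch assumes the population size is an exact multiple of the sample size, as it is here.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit after a random start within the first interval."""
    interval = population_size // sample_size       # size of each interval
    if start is None:
        start = random.Random(seed).randint(1, interval)   # random start in 1..interval
    return [start + i * interval for i in range(sample_size)]


# Reproduce the worked example by fixing the random start at 9.
sample = systematic_sample(1000, 100, start=9)
print(sample[:3], "...", sample[-1])
```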
4-Cluster random sample
۞ In this method, the sampling units are clusters (groups) of individuals rather than individuals. It is used when the sampling frame is incomplete and/or the total sampling population is large.
۞ The clusters (schools, houses, villages, etc.) form the sampling frame, from which the required number of clusters is selected at random.
۞ All individuals in a cluster, a specific group, or a random sample of them are included.
۞ Very useful when the population is widely dispersed and it is impractical to list and sample from all its elements.
Example of random cluster sample
In one study, the objective was to measure the prevalence of malnutrition among primary school children in Hofuf. There are 200 primary schools in Hofuf. The estimated sample size is 20 clusters.
Describe how you would proceed in drawing such a sample.
A. List all 200 schools.
B. Give each a number.
C. Use the random numbers tables to select the 20 schools whose numbers fall between 001 and 200.
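Steps A to C can be sketched in Python, with a pseudo-random generator standing in for the printed random numbers table. The function name `select_clusters` and the seed value are assumptions used only for illustration.

```python
import random


def select_clusters(n_clusters, n_selected, seed=None):
    """Number the clusters 1..n_clusters and draw n_selected of them without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, n_clusters + 1), n_selected))


# 200 numbered schools, 20 to be selected as clusters.
schools = select_clusters(200, 20, seed=1)
print(schools)
```

Fixing the seed makes the draw reproducible, which helps when the selection has to be documented in the study protocol.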
5-Multistage random sample
We use this method if the target population is spread over a wide geographic area and there is a limited budget or resources (in community-based surveys).
In this method, the sample is drawn in many stages.
The area is divided into smaller clusters, the clusters are divided into smaller clusters and so on. Random selection is carried out at each level successively.
You were asked to head a research team to investigate the problem of handicap in K.S.A. How would you proceed in drawing your sample?
1. List all governorates.
2. Select 4 governorates at random.
3. List the districts in each of the 4 governorates.
4. Select a district from each governorate at random.
5. List all villages and urban areas in each district.
6. Select a village and an urban centre from each district randomly.
7. Study all, or a sub-sample, of the individuals in the selected villages and urban centres.
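The stages listed above can be sketched as nested random selections. The sampling frame below is entirely hypothetical (6 governorates, 3 districts each, 4 villages and 2 urban centres per district), and the name `multistage_sample` is an assumption.

```python
import random


def multistage_sample(frame, n_governorates, seed=None):
    """Select governorates, then one district per governorate, then one village and one urban centre."""
    rng = random.Random(seed)
    selected = []
    for gov in rng.sample(sorted(frame), n_governorates):   # stage 1: governorates
        district = rng.choice(sorted(frame[gov]))           # stage 2: one district
        units = frame[gov][district]
        village = rng.choice(units["villages"])             # stage 3: one village
        urban = rng.choice(units["urban"])                  # stage 3: one urban centre
        selected.append((gov, district, village, urban))
    return selected


# Hypothetical sampling frame, built only to make the sketch runnable.
frame = {
    f"G{g}": {
        f"G{g}-D{d}": {
            "villages": [f"G{g}-D{d}-V{v}" for v in range(1, 5)],
            "urban": [f"G{g}-D{d}-U{u}" for u in range(1, 3)],
        }
        for d in range(1, 4)
    }
    for g in range(1, 7)
}

sample = multistage_sample(frame, 4, seed=0)
for row in sample:
    print(row)
```

The final stage (studying all or a sub-sample of individuals in the selected places) is left out, since it depends on the lists available in the field.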
II-Non-probability (convenience) samples
A convenience sample can minimize volunteerism and other selection biases by consecutively selecting every accessible person who meets the inclusion criteria.
A consecutive sample is especially desirable when it amounts to taking the entire accessible population over a long enough period to include seasonal variation or other changes over time that are considered important to the research question.
Representativeness is a matter of judgment.
Non-probability samples
These designs are used when the number of elements in a population is unknown or the elements cannot be individually identified.
Quota sampling. Accidental sampling. Judgmental or purposive sampling. Snowball sampling.
Non-probability (convenience) samples
1-Purposive sample: Chosen according to the investigator’s judgement in such a way that maximizes the chances of proving the study hypothesis. “selecting patients with ESRD”
2-Quota sample: Involves only a few strata, e.g. men and women >20 years. The enumerators select any individual belonging to those strata from whom they can get the required information in an easy, quick and accessible way.
Sample size
How many observations should we include?
The greater the sample size:
I. The more precise are the estimates derived.
II. The more powerful are the tests (probability of rejecting a false null hypothesis): larger degrees of freedom, a smaller critical value of the test statistic required, and a smaller standard error.
III. The higher the costs, and the more time and effort needed.
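The link between sample size and precision can be illustrated with the standard error of a mean, SE = s / √n. The standard deviation of 10 used below is an arbitrary illustrative value, not a figure from the lecture.

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean for standard deviation sd and sample size n."""
    return sd / math.sqrt(n)


# Quadrupling the sample size halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
```

Because precision improves only with the square root of n, each further gain costs proportionally more observations, which is the trade-off against point III.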
Sample size
The size of the sample depends on:
1. Study design,
2. Maximum tolerable sampling error,
3. Homogeneity of the population,
4. Number of variables studied,
5. The extent of breaking down the data in analysis,
6. Cost,
7. Available staff, equipment, time and tools,
8. Statistical tests used.
Data Collection Techniques and Tools
Dr. Tarek Tawfik
Objective of data collection techniques:
Allow the investigator to systematically collect data about the subjects under study, including the setting in which they occur.
Methods of data collection
Secondary sources
Documents:
o Govt publications
o Earlier research
o Census
o Personal records
o Client histories
o Service records
Primary sources
Observation: participant, non-participant
Interviewing: structured, unstructured
Questionnaire: mailed, collective
Observation
Participant: the researcher participates in the activities of the group being observed “submitted to clinical examination to observe practice of physicians”
Non-participant: the researcher is not involved in the activities and remains a passive observer “functions carried out by nurses in a hospital”
Problems with observation:
Hawthorne effect: change in behavior as a result of the observation process.
Observer bias.
Inter-observer variation in interpretation.
Incomplete observation and/or recording “keen observation with missing recording or vice versa”.
Recording of observation
Narrative: description of the process in the researcher’s own words “deeper insight in interpretation and conclusions”.
Scales: interpreting in a form of rates using scales for measurements. No in-depth interpretation, error of central tendency and Halo effect.
Categorical recording: yes/no, always/sometimes/never.
Using mechanical devices: videotape “subjects may feel uncomfortable or behave differently before a camera or cassette recorder”.
Scale “example”
[Figure: rating scale for “Aggressive behavior of nurses in hospital Z”, running from 5 to 1 on either side of a neutral 0, with a positive and a negative end.]
Interviewing
Different levels of flexibility and specificity.
Unstructured interviews: flexible interview structure, flexible contents, flexibility in questions.
Forms: in-depth interviews, focus group discussion, narratives, oral histories.
Structured interviews: rigid interview structure, rigid contents, rigidity in questions and their wording.
Tools: interview schedule, questionnaire.
Techniques of data collection
Using the available information (records and registries).
Observing and recording using an observation check list.
Interviewing (face to face).
Self-administered questionnaire.
Telephone and net surveys.
Focus group discussion.
Measuring scales.
Others (life histories, essay, case studies, and mapping).
Techniques of data collection (advantages and disadvantages)
Technique: Records and registries
Advantages: 1. Inexpensive. 2. Permit examination of past trends.
Disadvantages: 1. Not always accessible. 2. Ethical issues (confidentiality). 3. Incomplete and imprecise.
Technique: Observation
Advantages: A. More detailed information. B. Facts not mentioned by questioning. C. Test reliability.
Disadvantages: A. Ethical issues. B. Observer bias. C. Data collector may influence results. D. Need training.
Techniques of data collection (advantages and disadvantages)
Technique: Personal interviewing
Advantages: I. Suitable for illiterates. II. Permits clarification. III. High response rate.
Disadvantages: I. Interviewer may influence results. II. Less accurate recording than observation. III. Needs trained personnel.
Technique: Self-administered questionnaire
Advantages: 1. Less expensive. 2. Permit anonymity. 3. Less personnel. 4. Eliminate bias.
Disadvantages: 1. Not suitable for illiterates. 2. Low response rate. 3. Problem of misunderstanding.
Techniques of data collection (advantages and disadvantages)
Technique: Focus group discussion
Advantages: Collection of in-depth information and exploration.
Disadvantages: 1. Interviewer may influence results. 2. Open-ended questions. 3. Domination. 4. Non-response.
Technique: Measuring scale
Advantages: o Precision. o Eliminate bias.
Disadvantages: o Training. o Validity and accuracy.
Differentiation between data collection techniques and tools.
Technique: Using available data. Tools: data compilation sheet.
Technique: Observation. Tools: check list, eye, watch, scales, microscope, pen and paper.
Technique: Interviewing. Tools: schedule, agenda, questionnaire, recorder.
Technique: Self-administered questionnaire. Tools: questionnaire.
Designing Questionnaire and Data Collection Instruments.
In many instances the validity of the results depends on the quality of the data collection instruments.
Choosing between an interview schedule and a questionnaire.
Nature of the investigation: respondents may be reluctant to discuss sensitive topics “sexuality, drug use”.
Geographical distribution of the study population.
The type of study population. “illiterate, young, handicapped, very old”.
Administration of questionnaire.
Mailed or via other electronic media.
Collective administration “people attending some function (schooling)”.
Administration in a public place “hospital, medical center”.
Advantages and disadvantages of questionnaire.
Advantages:
1. Less expensive.
2. Offers greater anonymity.
Disadvantages:
1. Application is limited.
2. Response rate is low.
3. Self-selecting bias.
4. Opportunity to clarify is lacking.
5. Spontaneous responses are not allowed.
6. Possible to consult others.
Advantages and disadvantages of interview.
Advantages:
1. More appropriate for complex situations.
2. Collecting in-depth information.
3. Information can be supplemented.
4. Questions can be explained.
5. Has a wider application “any type of population”.
Disadvantages:
1. Time consuming and expensive.
2. Quality of data depends on the quality of interaction.
3. Quality of data depends on the quality of the interviewer.
4. Quality of data may vary when many interviewers are used.
5. Interviewer bias.
Designing Good Questions and Instruments
Open-ended and Closed-ended Questions
Open-ended question: Useful when it is important to hear what respondents have to say in their own words:
What habits do you believe increase a person’s chance of having a heart attack?
------------------------------------------------------------
It leaves the respondent free to answer without limits that may be imposed by the interviewer.
Designing Questionnaire and Data Collection Instruments.
Open-ended questions:
A. Often used in exploratory phases of question design because they facilitate understanding a concept as respondents express it.
B. Phrases and words used by respondents can form the basis for more structured items in a later phase.
Disadvantage: They usually require qualitative methods to code and analyze the responses, which take more time and subjective judgment than coding closed-ended questions.
Designing Questionnaire and Data Collection Instruments.
Closed-ended questions:
More commonly used, and form the basis for most standardized measures.
Ask the respondent to choose from one or more pre-selected answers:
Which one of the following do you think increases a person’s chance of having a heart attack the most? (Check one)
Smoking
Being overweight
Stress
Closed-ended questions:
They are quicker and easier to answer.
The answers are easier to tabulate and analyze.
The list of possible answers often helps to clarify the meaning of the question.
Disadvantages:
i. They may lead the respondent, and do not allow respondents to express their own, potentially unique answers.
ii. The potential responses listed may not include the answer most appropriate for a particular respondent.
Designing Questionnaire and Data Collection Instruments.
Whenever there is a chance that the set of answers is not exhaustive (does not include all the possible options), include the option ‘Other (please specify)’ or ‘None of the above’.
When a single response is desired, the set of possible responses should be mutually exclusive ‘the categories should not overlap’ to ensure clarity.
‘All that apply’ is used when multiple answers are allowed.
The Visual Analog Scale
Used for recording the answers to closed-ended questions using lines or other drawings.
The participant is asked to mark a line at a spot, along the continuum from one extreme to another, that best represents his characteristics.
It is important that the words that anchor each end describe the most extreme value for the item of interest.
The line is 10 cm long and score is the distance, in cm from the lowest extreme.
Visual Analog Scale for Rating the Severity of Pain
4- Please use an X to mark the place on this line that best describes the severity of your pain in general over the past week.
None |--------------------------------| Unbearable
A participant might answer as follows:
None |---------X----------------------| Unbearable
There is a 10 cm line, and the mark is 3 cm from the end (30% of the distance from none to unbearable), so the respondent’s pain would be recorded as having a severity of 30%.
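The scoring rule can be written as a one-line calculation. This is a minimal sketch; the function name `vas_score` and the range check are assumptions added for illustration.

```python
def vas_score(mark_cm, line_cm=10.0):
    """Convert the distance of the mark from the lowest extreme into a percentage score."""
    if not 0 <= mark_cm <= line_cm:
        raise ValueError("the mark must lie on the line")
    return 100 * mark_cm / line_cm


# A mark 3 cm from the 'None' end of a 10 cm line.
print(vas_score(3))
```

Expressing the score as a percentage of the line length keeps results comparable if a printed line is not exactly 10 cm long.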
Formatting of questionnaire
It is customary to describe the purpose of the study and how the data will be used in a brief statement on the cover together with name of the institution, assure anonymity, contact number for any questions, return address, deadline date and thank them for participation. “the covering letter”
To ensure accurate and standardized responses, all instruments must have instructions specifying how they should be filled out.
Sometimes it is helpful to provide an example of how to complete a question, using a simple question that is easily answered.
Formatting
To improve the flow of the instrument, questions concerning major subject areas should be grouped together and introduced by headings or short descriptive statements. “personal data include: age, sex, educational status, marital status”
To warm up the respondent to the process of answering questions, it is helpful to begin with emotionally neutral questions such as self-rated health or functioning.
More sensitive questions can be placed in the middle.
Questions about personal characteristics such as income or sexual function are often placed at the end of the instrument.
Formatting
The visual design should make it as easy as possible for the respondent to complete all questions in the correct sequence.
With too complex a format, the respondent or interviewer may skip questions, provide wrong answers, or even refuse to complete the instrument.
An instrument with plenty of space is more attractive and easier to use than one that is crowded.
When open-ended questions are used, the space for responding should be big enough to allow respondents with large handwriting to answer comfortably.
Formatting
People with visual problems, including the elderly, will appreciate large type (font size 14) and high contrast (black on white).
Possible answers to closed-ended questions should be lined up vertically and preceded by boxes or brackets to check, or by numbers to circle, rather than open blanks:
How many different medicines do you take every day? (Check one)
[ ] None
[ ] 1-2
[ ] 3-4
[ ] 5-6
[ ] 7 or more
Formatting
The Branched Question:
Sometimes the investigator may wish to follow up certain answers with more detailed questions.
Respondents’ answers to the initial question (screener) determine whether they are directed to answer additional questions or to skip ahead to later questions:
10- Have you ever been told that you have high blood pressure? Yes / No
If yes, how old were you when you were first told that you had high blood pressure? -------------- years old.
If no, go to question 11.
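When completed forms are edited, the skip pattern of a branched question can be checked automatically. The sketch below is only an illustration; the function name `check_branch` and the answer codes "yes" and "no" are assumptions.

```python
def check_branch(q10_answer, q10_age=None):
    """Check that the follow-up to screener question 10 was answered only when it applies."""
    if q10_answer == "yes":
        return q10_age is not None   # follow-up (age when first told) is required
    if q10_answer == "no":
        return q10_age is None       # follow-up must be skipped
    return False                     # missing or invalid screener answer


print(check_branch("yes", 45))   # consistent
print(check_branch("no", 45))    # inconsistent: follow-up answered after 'no'
```

Records that fail the check can be flagged for review during the editing step of data processing.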
Wording: Clarity, Simplicity, Neutrality
Every word in a question can influence the validity and reproducibility of the responses.
• Constructed questions should be simple and free of ambiguity.
• Encourage accurate and honest responses without embarrassing or offending the respondent.
Clarity
o Questions must be as clear and specific as possible.
o Concrete words are preferred over abstract words:
“How much exercise do you usually get?”
is less clear than
“During a typical week, how many hours do you spend exercising (e.g., vigorous walking or sports)?”
Simplicity
Simple and common wording should be used to convey the idea, avoid technical terms and jargon.
“Drugs you can buy without a doctor’s prescription” is clearer than “over-the-counter medications”.
The sentences should also be simple, using the fewest words and simplest grammatical structure.
Neutrality
Avoid loaded words and stereotypes that suggest that there is a most desirable answer.
“During the last month, how often did you drink too much alcohol?”
A less judgmental question:
“During the last month, how often did you drink more than five drinks in one day?”
Neutrality
It is useful to set a tone that permits the respondent to express behaviors and attitudes that may be considered undesirable.
“ People sometimes forget to take medications their doctor prescribed. Do you ever forget to take your medications?”
Avoid Pitfalls
I. Double-barreled questions.
II. Hidden assumptions.
III. The question and answer options do not match.
IV. Leading questions.
I- Double-Barreled Questions.
Each question should contain only one concept; using “or” or “and” leads to unsatisfactory responses.
“How many cups of coffee or tea do you drink during a day?”
In this case you should ask two questions to assess two things.
II- Hidden Assumptions.
“How many cigarettes do you smoke in a day?”
“What contraceptives do you use?”
Each question assumes that the respondent smokes or uses contraceptives; a screening question should come first.
III-The question and answer options do not match.
“Have you had pain in the last week?”
The options are (never, seldom, often, very often), which is grammatically incorrect.
Either the question should become “How often have you had pain in the last week?” or the answer options should change to (yes, no).
The question and answer options do not match.
Question about intensity:
“ I am sometimes depressed” (agree) (disagree).
For those who are often depressed, it is unclear how to respond: disagreeing with this statement could mean that the person is often depressed or never depressed.
(never, sometimes, and often) should be the options.
IV-Leading questions
A leading question is one whose contents, wording or structure lead a respondent to answer in a certain direction “judgmental questions”.
“Unemployment is increasing, isn’t it?”
“Smoking is bad, isn’t it?”
Collecting data using attitudinal scales
Dr Tarek Tawfik
Function of attitudinal scales
Attitudinal scales measure the intensity of respondents’ attitudes towards the various aspects of a given situation or issue and provide a technique to combine the attitudes towards different aspects into one overall indicator.
To develop an overall picture out of various opinions and perspectives.
Developing a scale
1. Which aspects are going to be measured?
2. Which procedures are adopted to combine these aspects into an indicator for measurement?
3. How valid is such a scale?
Types of attitudinal scales
Summated rating scale “Likert scale”
Cumulative scale “Guttman scale”
Differential scale “Thurstone scale”
I-Likert Scale
Basic Research Designs
Dr. Tarek Tawfik
Definition of a research design
A traditional research design is a blueprint or detailed plan for how a research study is to be completed:
o Operationalizing variables so they can be measured,
o Selecting a sample of interest to study,
o Collecting data to be used as a basis for testing hypotheses, and
o Analyzing the results. ‘Thyer 1993’
Types of study design (I)
Classification base: number of contacts, reference period, nature of investigation.
Number of contacts: one (cross-sectional studies), two (before-and-after studies), three or more (longitudinal studies).
Reference period: retrospective, prospective, retrospective-prospective.
Nature of investigation: experimental, non-experimental, semi-experimental.
Did the investigator assign exposure “intervention”?
Yes: Experimental study. Random allocation?
   Yes: Randomized controlled trial (RCT)
   No: Non-randomized controlled trial
No: Observational study. Comparison group?
   Yes: Analytical study. Direction?
      Exposure → outcome: Cohort study
      Exposure ← outcome: Case-control study
      Exposure and outcome at the same time: Cross-sectional study
   No: Descriptive study
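The decision tree above can be expressed as a small function. This is a sketch for self-study only; the function name `classify_design` and the direction codes are assumptions, not standard terminology.

```python
def classify_design(assigned_exposure, random_allocation=False,
                    comparison_group=False, direction=None):
    """Walk the study-design decision tree and return the name of the design."""
    if assigned_exposure:                       # experimental branch
        if random_allocation:
            return "randomized controlled trial"
        return "non-randomized controlled trial"
    if not comparison_group:                    # observational, no comparison group
        return "descriptive study"
    analytical = {                              # observational with comparison group
        "exposure_to_outcome": "cohort study",
        "outcome_to_exposure": "case-control study",
        "same_time": "cross-sectional study",
    }
    return analytical[direction]


print(classify_design(False, comparison_group=True, direction="outcome_to_exposure"))
```

Working through published abstracts with the three questions in this order (assignment, comparison group, direction) is usually enough to name the design.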
Research designs (II): phases and indications of basic study designs
Type of study: Cross-sectional
Timing: cross-sectional. Form: observational.
Action: collect all information in present time.
Typical uses: prevalence estimates, reference range, current health status.
Type of study: Repeated cross-sectional
Timing: cross-sectional. Form: observational.
Action: collect all information in past, present and future time.
Typical uses: changes over time.
Type of study: Cohort
Timing: longitudinal (prospective). Form: observational.
Action: define cohort and assess risk factors in present time; follow to observe outcome in future time.
Typical uses: prognosis and natural history, etiology.
Type of study: Case-control
Timing: longitudinal (retrospective). Form: observational.
Action: define cases and controls (outcome) in present time; trace back to assess risk factors in past time.
Typical uses: etiology, particularly for rare diseases.
Type of study: C.T
Timing: longitudinal (prospective). Form: experimental.
Action: apply intervention in present time; follow to observe outcome in future time.
Typical uses: clinical trials to assess therapy, trials to assess preventive measures, lab. experiments.
Descriptive Studies
The Descriptive Pentad
Descriptive studies are ‘the first toe in the water’.
They are concerned with and designed only to describe the existing distribution of variables, without regard to causal or other hypotheses.
A good descriptive study should answer five basic ‘Ws’.
The Five Ws
Who has the disease? Age, sex, and other characteristics.
What is the condition or disease being studied? A clear, specific, and measurable case definition is essential.
Why did the condition or disease arise? Descriptive studies often provide clues about cause that can be pursued with more sophisticated research designs.
When is the condition common or rare? Time provides important clues about health events.
Where does or does not the disease or condition arise? Geography has a huge effect on health.
So what? The implicit W relates to the public health effect.
Descriptive Studies
Deal with individuals: case report, case-series report, cross-sectional (prevalence) study, surveillance.
Relate to the population: ecological correlational studies.
I- Case Report
o The least publishable units in the medical literature.
o An observant clinician reports an unusual disease or association, which prompts further investigations with a more rigorous study design.
Example: benign hepatocellular adenoma and high-dose contraceptive pills.
o Not all case reports deal with serious health threats; some simply enliven the generally drab medical literature.
What is the most probable diagnosis?
II-Case-series report
A case series report aggregates individual cases in one report.
Sometimes, the appearance of several similar cases heralds an epidemic.
Example: a cluster of homosexual men in Los Angeles with a similar syndrome alerted the medical community to the HIV/AIDS epidemic in North America.
A case-series report is a stronger trigger for further investigations than a single case report.
Can constitute the case group for a case-control study.
III- Cross-sectional (prevalence) Studies.
Prevalence studies describe the health of populations.
Examples: Health and Nutrition Examination Survey (HNES), and censuses.
These studies provide a snapshot of the population at a particular time.
Both exposure and outcome are identified at one point in time.
Particularly useful for estimating the point prevalence of a condition in the population:
$$\text{Point prevalence} = \frac{\text{Number with the disease at a single time point}}{\text{Total number studied at the same time point}}$$
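The point prevalence formula translates directly into code. The survey figures below (150 cases among 3,000 children examined) are hypothetical and serve only to show the arithmetic; the function name `point_prevalence` is likewise an assumption.

```python
def point_prevalence(n_with_disease, n_studied):
    """Proportion of the studied group that has the disease at a single time point."""
    if n_studied <= 0:
        raise ValueError("the number studied must be positive")
    return n_with_disease / n_studied


# Hypothetical survey: 150 children with the condition among 3,000 examined.
prevalence = point_prevalence(150, 3000)
print(f"{prevalence:.1%}")
```

Prevalence is a proportion, so it is usually reported as a percentage or per 1,000 population rather than as a rate over time.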
Design of a Cross-Sectional Study
Begin with: a defined population.
Gather data on exposure and disease.
End with four possible groups:
Exposed: have disease
Exposed: do not have disease
Not exposed: have disease
Not exposed: do not have disease
Cross-sectional (prevalence) Studies.
Advantages:
Low costs.
No follow-up is required.
Quick.
Disadvantages:
Only association can be inferred “not causation”.
Temporal sequence is difficult to ascertain “exposure-outcome sequence”.
Incidence cannot be estimated “occurrence of new cases over time”.
Trend over time cannot be identified “change of magnitude/pattern over time”.
Assignment:
o The New Valley Governorate is located in the Western desert of Egypt; several reports have described grade II goiter among primary school children, but little is known about the prevalence and socio-demographic characteristics of the condition.
o Some clinicians have reported observing a large number of cases of renal failure in the Manzala region on the northern coast of the Nile delta; data on the prevalence and distribution are lacking.
o Little is known about the magnitude of extra-pulmonary tuberculosis in Egypt.
For each of the situations above, give the most appropriate study design.
IV-Repeated cross-sectional studies “Longitudinal study”
Studies that may be carried out at different time points to assess trends over time.
These studies involve different groups of individuals at each time point.
It can be difficult to assess whether apparent changes over time simply reflect differences in the groups included in the study rather than changes in the condition itself.
Longitudinal study design.
[Figure: four successive rounds of data collection on the study population, separated by three intervals.]
Disadvantages:
1. Maturation effect ‘maturation of responses in young subjects’.
2. Reactive effect ‘instrument educates the respondents’.
3. Regression towards the mean ‘shift of extreme attitudes and behavior towards the average’.
4. Conditioning effect ‘repeated contact with the same persons’.
V- Surveillance
The ongoing systematic collection, analysis, and interpretation of health data essential to the planning, implementation, and evaluation of public health practices, closely integrated with timely dissemination of these data to those who need to know.
Passive: data gathered through the traditional channels, e.g., death certificates.
Active: searching for and reporting cases.
VI-Ecological Correlational Studies
Look for associations between exposures and outcomes in the population rather than in individuals.
Can be a convenient initial search for hypotheses as the data are already collected.
The correlation coefficient r indicates how linear the relation between exposure and outcome is.
The mortality of coronary heart disease correlates with per capita sales of cigarettes.
Inverse correlation between access to safe abortion and maternal mortality rate.
[Figure: ecological study relating consumption of dietary fat and fast food in a certain community to high mortality from coronary heart disease (high incidence of MI).]
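The correlation coefficient r used in ecological studies can be computed from population-level pairs of values. The five data points below are invented for illustration (they are not real sales or mortality figures), and the function name `pearson_r` is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical population-level data, one pair per community.
cigarette_sales = [1.0, 2.0, 3.0, 4.0, 5.0]        # per capita sales (arbitrary units)
chd_mortality = [110.0, 135.0, 150.0, 180.0, 195.0]  # deaths per 100,000
r = pearson_r(cigarette_sales, chd_mortality)
print(round(r, 3))
```

A value of r close to +1 or -1 shows a nearly linear relation at the population level; it says nothing about whether the exposed individuals are the ones who develop the outcome.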
Ecological Correlational Studies
The two major limitations of this type of study are:
The inability to link exposure to outcome in individuals.
The inability to control for confounders.
Death rates from coronary heart disease are positively correlated with the number of color television sets per capita????
VII- Before-and After study design.
“pre-test/post-test design”
The most appropriate design for measuring the impact or effectiveness of a program.
Described as two sets of cross-sectional data collection on the same population to find out the change in the phenomenon or variables between two points in time.
The change is measured by the difference between the observations before and after the intervention.
It could be experimental or non-experimental.
Commonly used in evaluation studies.
[Figure: the same study population is observed before the program/intervention (before/pre observation; data collection, actual or recall) and again after it (after/post data collection), along a time axis.]
Disadvantages
®Two sets of data collection, more expensive and more difficult to implement.
®Time lapse may cause attrition of participants.
®It only measures total change without ruling out the role of other variables “confounders”
®Maturation of the response of young participants “maturation effect”
®Reactive effect.
®Regression effect.
Uses of Descriptive Studies
Monitor health of the population, provided by ongoing surveillance: epidemic syphilis in USSR, international epidemic of multiple births, prematurity, caused by assisted reproductive technologies.
Health services: Laparoscopy, introduction of Anti HIV/AIDS therapy.
Development of hypotheses: retrolental fibroplasia, and painted radium dial watches.
Trend analysis.
Planning
Clues about cause
Descriptive Studies.
Overstepping of the data:
Post hoc inference: a temporal association is incorrectly inferred to be a causal one.
Intake of 6 cups of coffee /day is associated with lower risk of colonic cancer!!!!
The role of the media, The damage in the control efforts, Damage to the public health.
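Research design in relation to time
Concurrent: exposure and outcome are both assessed now.
Retrospective: the outcome is identified now and the exposure is traced back in time.
Prospective: the exposure is identified now and the outcome is followed forward in time.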
Finding Your Way in the Terminology Jungle
Case-control study = Retrospective study
Cohort study = Longitudinal study = Prospective study
Concurrent cohort study = Prospective cohort = Concurrent prospective
Retrospective cohort study = Historical cohort = Non-concurrent prospective
Randomized trial = Experimental study
Cross-sectional study = Prevalence study
Experimental or Observational Study
Experimental studies involve the investigator intervening in some way to affect the outcome.
Clinical trial is an example of an experimental study in which the investigator introduces some form of ‘treatment, vaccine, new surgical procedure, change in the health policy or introduction of behavioral interventions’.
Other examples include animal studies or laboratory studies that are carried out under experimental conditions.
These studies provide the most convincing evidence for any hypothesis because they make it possible to control confounders.
Experimental or Observational Study
Observational studies, such as cohort or case-control studies, are those in which the investigator does nothing to affect the outcome, but simply observes what happens.
These studies provide weaker evidence than experimental studies because it is often impossible to control for all factors that may affect the outcome ‘confounders’.
Epidemiological studies which assess the relationship between factors of interest and disease in the population, are observational.
Observational (Analytical) Studies.
Bias and Causal Associations in Observational Research.
I-Validity and Reliability
Definitions : Validity
*Internal validity: the ability of the tool/test to measure what it sets out to measure.
The inference from participants in a study should be accurate, avoiding systematic errors and bias. Wrong extrapolation to the general population is potentially dangerous.
** External validity: can results from study participants be extrapolated to the reader’s patients?
Including the results into the clinical practice.
II-Bias
Bias in research denotes deviation from the truth.
(when there is systematic difference between the results from study and the truth).
All observational studies and badly done randomized controlled trials have built-in bias.
The most often used classification of bias includes:
I. Selection bias,
II. Information bias,
III. Confounding.
I- Selection Bias
Are the groups similar in all important respects?
Selection bias stems from absence of comparability between groups being studied.
In a cohort study, are participants in the exposed and unexposed groups similar in all important respects except for exposure?
In a case-control study, are cases and controls similar in all respects except for the disease in question?
Selection Bias
Biases accompanying case-control studies:
Berkson bias (admission-rate bias): knowledge of the exposure of interest might lead to an increased rate of admission to hospital (admission preference for the disease of interest).
Neyman bias (incidence-prevalence bias): arises when a gap in time occurs between exposure and selection of study subjects. This bias crops up in studies of diseases that are quickly fatal, transient, or sub-clinical. Example: myocardial infarction and its relation to snow shoveling.
Selection Bias
Unmasking bias: an exposure might provoke the detection of an outcome. Example: estrogen replacement therapy and symptomless endometrial cancer.
Non-respondent bias: in observational studies, non-respondents differ from respondents. Example: smokers are less likely to return questionnaires than non-smokers or pipe and cigar smokers.
II- Information Bias: Has the information been gathered in the same way?
Also known as observation, classification, or measurement bias; it results from incorrect determination of exposure, outcome, or both.
Information should be gathered in the same way in any comparative study.
II- Information Bias (cont.)
Sources:
Differentials in information gathering (bedside interviews for cases, telephone interviews for controls).
Diagnostic suspicion bias (intensive search for HIV in drug addicts).
Family history bias: medical information flows differently to affected and non-affected family members (rheumatoid arthritis).
Information Bias
Recall bias: cases are more motivated than healthy people to search their memories in order to identify the cause of their illness.
Observer bias: one observer consistently under-reports or over-reports a particular variable, for example through more meticulous observation of the exposed than of the non-exposed.
Information Bias control
Possible maneuvers to lower information bias:
Blinding of the observer and the data gatherer.
Using standardized instruments for data collection.
Proper selection of the subjects.
III- Confounding: Is an extraneous factor blurring the effect?
A confounding variable is associated with the exposure and it affects the outcome, but it is not an intermediate link in the chain of causation between exposure and outcome.
[Figure: two examples of confounding. Smoking as a confounder of the association between oral contraceptive use and myocardial infarction; STDs as a confounder of the association between IUD insertion and salpingitis.]
Confounding ‘Control’
Restriction (exclusion or specification): enrollment with restricted selection criteria, e.g. including non-smokers only.
Matching: pairwise matching (for every case who smokes, a control who smokes is found).
Stratification: used after completion of the study. Results can be stratified by the levels of the confounding factor.
Multivariate analysis techniques: logistic regression, proportional hazards regression, and others.
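The stratification step can be illustrated with a short Python sketch. The counts below are hypothetical (they are not from the slides), and the Mantel-Haenszel pooled odds ratio is used here as one common way to combine strata; the slides do not name a specific estimator.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for one 2x2 table.
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of the confounder."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical data, stratified by the confounder (smoking status).
smokers = (40, 20, 20, 10)      # odds ratio within smokers = 1.0
non_smokers = (5, 20, 10, 40)   # odds ratio within non-smokers = 1.0
strata = [smokers, non_smokers]

# Collapsing the strata gives the crude (unadjusted) table.
crude_table = tuple(sum(cells) for cells in zip(*strata))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)

print("crude OR:", round(crude_or, 3))
print("adjusted OR:", round(adjusted_or, 3))
```

In this made-up example the crude odds ratio suggests an association, while the stratum-specific and pooled odds ratios show none: the apparent effect comes entirely from the confounder.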
Judgment of Associations: Bogus, indirect, or real?
Statistical associations do not imply causal associations.
Types of associations:
Bogus or spurious associations: results of selection bias, information bias, and chance.
Indirect associations: stem from confounding.
Real associations.
Hill’s Criteria for Real Associations
Temporal sequence: Did exposure precede outcome? The cause must antedate the outcome.
Strength of association: How strong is the effect, measured as relative risk (> 3) or odds ratio (> 1)?
Consistency of association: Has the effect been seen by others, in different populations and with different study designs?
Hill’s Criteria for Real Associations
Biological gradient (dose-response relationship): Does increased exposure result in more of the outcome? Example: lung cancer and years of cigarette smoking.
Specificity of association: Does exposure lead only to the outcome? (Weak criterion: few exposures lead to only one outcome.)
Biological plausibility: Does the association make sense? (Weak criterion: limited by our lack of knowledge.)
Hill’s Criteria for Real Associations
Coherence with existing knowledge: Is the association consistent with available evidence? The effect of cigarette smoke on the bronchial epithelium of animals is coherent with an increased risk of cancer in humans.
Experimental evidence: Has a randomized controlled study been done?
Analogy: Is the association similar to others?
Case-control Design: Research in Reverse
Dr. Tarek Tawfik
Examples of Topics Investigated with Case-control Studies
Outcome | Exposure
Schizophrenia, schizoaffective disorder, or bipolar disorder | Cat ownership in childhood
Pancreatic cancer | Body mass index
Earthquake mortality | Physical disability
Reflux oesophagitis | Hiatus hernia
Connective tissue disorders | Hair dyes
Systemic lupus erythematosus | History of shingles
Nipah virus infection | Pig farming
Neonatal tetanus | Ghee applied to umbilical cord
Esophageal cancer | Pickled vegetables
Metastatic prostate cancer | Digital rectal examination
Dementia | Statins for lipid lowering
Ovarian cancer | Paracetamol use
Breast cancer prevention | Phyto-estrogens
Genital warts | Male condom use
Ovarian cancer | Physical activity
Colon cancer | Sigmoidoscopy screening
Recurrent myocardial infarction prevention | Influenza vaccination
Case-Control Studies: Structure
A case-control study compares the characteristics of a group of patients with a particular disease outcome (the cases) with those of a group of individuals without that outcome (the controls), to see whether any factors occurred more or less frequently in the cases than in the controls.
Such retrospective studies do not provide information on the prevalence or incidence of disease but may give clues as to which factors elevate or reduce the risk of disease.
[Figure: Basic structure of case-control design. At the starting point (present time), a sample of diseased (cases) and disease-free (controls) individuals is drawn from the population and traced back into the past. Cases are classified as exposed (a) or unexposed (b) to the factor, and controls as exposed (c) or unexposed (d). The odds (chance) of exposure is calculated for both groups.]
Calculate the difference in odds for the included exposures for comparison.
Selection of Cases
Incident cases: patients who are recruited at the time of diagnosis.
1. Less recall bias
2. Less altered behavior
3. But we have to wait for cases to be diagnosed
Prevalent cases: patients who were already diagnosed before entering the study.
1. Recall bias
2. Altered behavior
3. Risk factors may be related more to survival
Selection of Cases
Sources: hospital patients, patients in physicians' practices, clinic patients.
Problems:
* Single or multiple hospitals: some hospitals have a greater aggregation of certain risk factors than others.
* Tertiary health care facility: a tendency to select severely ill cases; any risk factors identified may be found only in these severe forms of the disease.
Selection of Controls
Non-hospitalized persons:
- Community-based probability sample: school rosters, selective service lists, insurance company lists.
- Neighborhood controls: door-to-door approach or random digit dialing (socio-economic and cultural similarity).
- Best-friend controls: similarity in demographic characteristics (lifestyle pattern).
- Spouse or sibling controls: sibling controls may provide some control over genetic differences between cases and controls.
Hospitalized persons:
- A captive population, but they represent a sample of an ill population; hospital patients differ from people in the community.
- Should the controls be a sample of all other admitted patients, or should specific other diagnoses be selected?
Problems in Controls Selection
When a difference in exposure is observed between cases and controls, we must ask whether the level of exposure observed in the controls is really the level expected in the population in which the study was carried out, or whether (perhaps due to the manner of selection) the controls have a particularly high or low level of exposure that is not representative of that population.
Distribution of Cases (cancer of the pancreas) and Controls by Coffee-drinking Habits and Estimates of Risk Ratios
Coffee consumption (cups/day)
Sex | Category | 0 | 1-2 | 3-4 | >5 | Total
Male | No. of cases | 9 | 94 | 53 | 60 | 216
Male | No. of controls | 32 | 119 | 74 | 82 | 307
Male | Adjusted RR | 1.0 | 2.6 | 2.3 | 2.6 | 2.6
Male | 95% CI | - | 1.2-5.5 | 1.0-5.3 | 1.2-5.8 | 1.2-5.4
Female | No. of cases | 11 | 59 | 53 | 28 | 151
Female | No. of controls | 56 | 152 | 80 | 48 | 336
Female | Adjusted RR | 1.0 | 1.6 | 3.3 | 3.1 | 2.3
Female | 95% CI | - | 0.8-3.4 | 1.6-7.0 | 1.4-7.0 | 1.2-4.6
Estimates of Relative Risk of Cancer of the Pancreas Associated with Use of Coffee and Cigarettes
Coffee drinking (cups/day)
Cigarette smoking status | 0 | 1-2 | >5 | Total
Never smoked | 1.0 | 2.1 | 3.1 | 1.0
Ex-smokers | 1.3 | 4.0 | 3.0 | 1.3
Current smokers | 1.2 | 2.2 | 4.6 | 1.2 (0.9-1.8)
Total (RR, 95% CI) | 1.0 | 1.8 (1.0-3.0) | 2.7 (1.6-4.7) |
Matching
The process of selecting controls so that they are similar to the cases in certain characteristics, such as age, race, sex, socioeconomic status, and occupation.
The aim is to nullify differences between cases and controls in characteristics or exposures other than the one targeted for study.
Types of Matching
Individual matching (matched pairs): for every case included, an identically matched control is selected; for a 45-year-old white female case, we seek a 45-year-old white female control. Used in hospital-based case-control studies.
Group matching (frequency matching): the proportion of controls with a certain characteristic is made identical to the proportion of cases with it; if 25% of cases are married, then 25% of controls are married. All cases should be selected first and the proportions calculated before the controls are selected.
Problems with Matching
Practical problems: when matching on too many characteristics, it becomes very difficult or impossible to identify an appropriate control.
Example: a 48-year-old black female, married, with 4 children, who lives in zip code 21209 and works in a photo-processing plant. Find her control?
Conceptual problems: once we have matched controls to cases on a given characteristic, we cannot study that characteristic.
Example: marital status and breast cancer. If cases and controls are matched on marital status, we are no longer able to study that factor. Why?
Matching ensures the same prevalence of that characteristic in both cases and controls.
Uses of Multiple Controls
In case-control studies we usually use more than one control per case to increase the power of the study.
1-Multiple controls of the same type.
The power of the study increases with each additional control per case, up to about 4 controls per case.
Why not keep the ratio of controls to cases at 1:1 and just increase the number of cases?
1. For many rare diseases (cancer, connective tissue disorders) the number of cases available for study is limited.
2. In addition, the limited time frame of the study may not allow inclusion of more cases.
3. In the absence of multi-centric collaboration, the remaining option is to increase the number of controls.
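Why the gain levels off at about 4 controls per case can be sketched with a standard approximation that is not stated on the slide: with r controls per case, the statistical efficiency relative to an unlimited supply of controls is roughly r/(r+1). The function name below is illustrative.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency of r controls per case, relative to
    having unlimited controls (assumed approximation: r / (r + 1))."""
    r = controls_per_case
    return r / (r + 1)


for r in range(1, 7):
    print(f"{r} control(s) per case: {relative_efficiency(r):.3f}")
```

Under this approximation, moving from 1 to 4 controls per case raises efficiency from 50% to 80%, while each further control adds only a few percentage points.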
2-Multiple Controls of Different Types
The use of hospital and neighborhood controls:
To assess the level of exposure among the different control groups in relation to the cases.
Cases are compared with hospital controls and then with neighborhood controls to assess any discrepancy in the level of exposure; if one is present, the reason should be sought.
Nested Case-Control Studies
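[Figure: Nested case-control study. Initial data and/or specimens are obtained from a population (cohort). Over time, some members develop disease (the cases) and the others do not; a subgroup of those who do not develop disease is selected as controls.]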
Advantages of Nested Case-Control Design
Interviews are performed at the beginning of the study (baseline), so the data are obtained before any disease has developed and the problem of possible recall bias is eliminated.
If abnormalities in biologic characteristics are found in specimens obtained years before the development of clinical disease, it is more likely that these findings represent risk factors or other pre-morbid characteristics than a manifestation of early, sub-clinical disease.
Temporal association can not be concluded from the ordinary case-control design.
More economical to conduct.
Assignments:
The risk factors for end-stage renal disease are largely unknown. Describe a study to identify such factors.
The prevalence of iodine deficiency disorders shows a geographic discrepancy between Jeddah and Qaseem. Propose a design to explore this discrepancy.
A cross-sectional study reported a difference in dietary fat intake among obese subjects. How would you confirm this difference?
Cohort Study Design
Dr. Tarek Tawfik
Cohort study (marching towards outcomes)
The term cohort has military, not medical roots.
A cohort was a 300-600-man unit in the Roman army; ten cohorts formed a legion.
A cohort study consists of bands or groups of persons marching forward in time from an exposure to one or more outcomes.
Basic Structure of cohort study
[Figure: Basic structure of a cohort study. At the starting point (present time), a disease-free sample is drawn from the population, classified as exposed or unexposed to the factor, and followed into the future. In the exposed group, some develop disease (a) and some remain disease-free (b); in the unexposed group, some develop disease (c) and some remain disease-free (d). The incidence of disease is compared between the groups.]
The relative risk is calculated for the exposure.
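[Figure: Direction of cohort studies in time (example outcome: incidence of lung cancer). Prospective: exposure is assessed first and the outcome follows later. Retrospective: both exposure and outcome have already occurred. Ambi-directional: data are collected both backwards and forwards in time (short- and long-term effects).]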
Design of Cohort
First select | Then follow to see whether: disease develops | Disease does not develop | Totals | Incidence rate of disease
Exposed | a | b | a+b | a/(a+b)
Not exposed | c | d | c+d | c/(c+d)
Data collection in cohort: forwards and backwards
A cohort study follows up two or more groups from exposure to outcome.
In its simplest form, it compares the experience of a group exposed to some factor with another group not exposed to that factor. The frequency of the outcome (whether higher or lower) in the exposed group relative to the unexposed group gives the evidence of association between exposure and outcome.
In general, the cohort should always move in the same direction, although the data gathering might not.
Cohort versus Randomized Trials
Both types compare exposed with non-exposed groups (or a group with a certain exposure with a group with another exposure).
For ethical and other reasons, we cannot randomize people to receive a putatively harmful substance (such as a carcinogen), so the exposure in RCTs is often a treatment or preventive measure.
In cohort studies investigating etiology, the exposure is often a toxic or carcinogenic agent.
The difference between the two designs is the presence or absence of randomization, which is critical in interpreting the study findings.
Selection of Study Population
Comparison of outcomes in an exposed group and a non-exposed group (or a group with a certain characteristic and a group without it).
Two ways to create the study population:
1. Select groups for inclusion on the basis of whether or not they were exposed (e.g. occupationally exposed cohorts).
2. Select a defined population before any of its members become exposed or before their exposures are identified, using a factor not related to exposure (e.g. residence); then take histories or perform tests and separate the population into exposed and non-exposed groups.
In both cases we wait for the outcome.
Types of Cohort Studies: Concurrent (Prospective)
Uses a defined population (example: smoking and lung cancer in a population of elementary school children).
[Figure: Time frame for a hypothetical concurrent cohort study begun in 2000, with time points 2000, 2010, and 2020. A defined, non-randomized population is classified as exposed (smokers) or non-exposed (non-smokers), and each group is followed for disease or no disease.]
Types of Cohort Studies: Retrospective (Historical)
[Figure: Time frame for a hypothetical retrospective cohort study begun in 2000, with time points 1980, 1990, and 2000. A defined, non-randomized population (an old roster of elementary school children is found) is surveyed for smoking habit and classified as exposed (smokers) or non-exposed (non-smokers); disease or no disease is then ascertained in each group.]
Advantages of Cohort Design
I. The best way to ascertain both the incidence and the natural history of a disease (the temporal sequence between the putative cause and the outcome is usually clear).
II. Useful in investigating multiple outcomes that might arise after a single exposure (sometimes misleading).
III. Useful in the study of rare exposures.
IV. Reduces the risk of survival bias (diseases that are rapidly fatal are difficult to study because of this factor).
V. Allows calculation of incidence rates, relative risks, and confidence intervals.
VI. Other outcome measures include life table rates, survival curves, and hazard ratios.
Potential Biases in Cohort Studies
1) Bias in assessment of the outcome (blinding or masking is used to avoid it).
2) Information bias (particularly in historical or retrospective cohort).
3) Bias from non-response and losses to follow-up (attrition).
4) Analytic bias (blinding is needed).
When Is A Cohort study Warranted?
A. When good evidence suggests an association of a disease with a certain exposure (from clinical observations, case-control studies, or other types of studies).
B. When we are able to minimize attrition of the study population.
C. When the interval between exposure and development of outcome is relatively short.
What To Look For In Cohort Studies
Who is at risk? All participants in a cohort study must be at risk of developing the outcome.
Who is exposed? A clear, unambiguous definition of exposure is required at the outset (sometimes quantifying the exposure by degrees, rather than yes/no).
Who is an appropriate control? The unexposed should be similar to the exposed in all aspects except for the exposure. They may come from internal or external sources. Beware the healthy worker effect.
Have outcomes been assessed equally? Outcomes must be defined in advance; they should be clear, measurable, and specific.
Reporting of Cohort Studies
The first table in a report should provide demographic and other prognostic factors for both groups, with hypothesis testing (P values) to show the likelihood that observed differences could be due to chance.
For dichotomous outcome measures (sick/well), provide raw data sufficient for the reader to confirm the results.
For cumulative incidence, calculate the proportion who develop the outcome during the specified study interval.
For incidence rates, the value is expressed per unit of time.
The relative risks and confidence intervals should be provided.
P values should not replace interval estimation (relative risk with confidence interval).
How to Choose the Study Design?
Study design | Selection of subjects by status | Information collected on exposure | Information collected on disease
Cross-sectional | No | Current | Current
Case-control | Disease | Past | Current
Cohort, prospective | Exposure | Current | Future
Cohort, retrospective | Exposure | Past | Current
How to Choose the Study Design? (cont.)
Options | Case-control | Concurrent cohort | Retrospective cohort
Study time | Short | Long | Short
Cost | Low | High | Low
Rare diseases | Yes | No | No
Sample size | Small | Large | Large
Loss to follow-up | No | Yes | Yes
Incidence | No | Yes | Yes
Relative risk | Approx. | Yes | Yes
Experimental study design
Dr Tarek Tawfik
Experimental study designs
[Figure: Experimental versus non-experimental studies. Experimental studies start from a treatment/intervention/program applied to the study population and explore the outcome/impact/change. Non-experimental studies start from the effect and explore its causes/associations. Other label: randomization.]
Experimental: starts from the cause and moves to the effect.
Non-experimental: starts from the effect and traces the cause.
Semi (quasi) experimental: a mix of both.
The concept of Randomization
[Figure: The concept of randomization. Members of the study population are randomly allocated to Group A or Group B.]
Any individual or unit of a study population has an equal and independent chance of becoming part of an experimental or a control group or, in the case of multiple treatment modalities, any treatment has an equal and independent chance of being assigned to any of the population groups.
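The definition above can be illustrated with a minimal Python sketch of simple randomization, in which each participant is allocated by an independent, fair "coin flip". The function name, participant IDs, and seed are illustrative assumptions, not part of the slides; real trials usually rely on dedicated allocation procedures.

```python
import random


def simple_randomization(participants, seed=None):
    """Assign each participant to the experimental or control group
    with an equal (50%) and independent chance."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    experimental, control = [], []
    for person in participants:
        if rng.random() < 0.5:
            experimental.append(person)
        else:
            control.append(person)
    return experimental, control


# Hypothetical participant IDs 1..20.
experimental_group, control_group = simple_randomization(range(1, 21), seed=1)
print("Experimental:", experimental_group)
print("Control:", control_group)
```

Because every allocation is independent, the two groups can end up unequal in size; designs that force equal group sizes trade away some of that independence.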
The control group design (the control experimental design)
[Figure: The control group design. Baseline data are collected from the study population in both groups. The experimental group receives the intervention (independent variable); the control group receives no intervention. The dependent variables (outcome) are then measured in both groups.]
The chief objective of the control group is to quantify the impact of extraneous factors (possible confounders), which helps to ascertain the impact of the intervention alone.
The placebo design
A patient's belief that he or she is receiving treatment can play an important role in recovery from an illness even if the treatment is ineffective (a psychological effect known as the placebo effect).
The placebo design attempts to determine the extent of this effect.
The placebo design
[Figure: The placebo design. Three groups are compared: an experimental group (treatment), a placebo group (placebo), and a control group. The outcome in the experimental group reflects treatment + placebo effect + confounders; in the placebo group, placebo effect + confounders; in the control group, confounders only. Subtracting (-) one outcome from another isolates the treatment effect and the placebo effect.]
Cross-over comparative design
Denial of treatment to the control group is considered unethical.
Denial of treatment may be unacceptable to some individuals in the control group, which could result in drop out of cases.
The cross-over experimental design makes it possible to measure the impact of a treatment without denying treatment to any group.
Design is based upon the assumption that participants at different stages are similar in terms of their characteristics and the problem for which they are seeking intervention.
Cross-over experimental design
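[Figure: Cross-over experimental design. The study population is divided into two blinded groups. In the first period one group receives Drug A and the other receives placebo, and the outcome is measured. After a washout period the groups switch treatments, and the outcome is measured again.]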
Meta-analysis and systematic review.
Estimating Risk
Dr. Tarek Tawfik
Absolute Risk
The incidence of a disease in a population is termed absolute risk.
* Can indicate the magnitude of the risk in a group of people with a certain exposure, but:
* It does not take into consideration the risk of disease in the non-exposed individuals,
* It does not indicate whether the exposure is associated with an increased risk of disease.
Absolute risk does not stipulate an explicit comparison.
Example: rubella in the 1st trimester. What is the risk that my child will be malformed? Abortion will be decided on the basis of this information.
Determination that a certain disease is associated with a certain exposure.
By using the case-control and cohort studies we can assess whether there is an excess risk of disease in persons who have been exposed.
We have to compare the risks among different groups to assess the presence of excess risk (by calculating the incidence rates, or 'attack rates', and the difference in the risks).
So, estimation of the relative risk is vital in determining who will be at higher risk following the exposure.
Relative Risk (concept)
o Both case-control and cohort studies are designed to determine whether there is an association between exposure to a factor and development of a disease.
o If an association exists, how strong is it?
o If we carry out a cohort study, we can put the question another way: what is the ratio of the risk of disease in exposed individuals to the risk of disease in non-exposed individuals? This ratio is called the relative risk.
Relative risk = (risk in exposed) / (risk in non-exposed)
Interpreting the Relative Risk (a measure of the strength of the association)
If RR = 1: risk in exposed equal to risk in non-exposed (no association).
If RR > 1: risk in exposed greater than risk in non-exposed (positive association; possibly causal).
If RR < 1: risk in exposed less than risk in non-exposed (negative association; possibly protective).
Calculating the Relative Risk in Cohort Studies
First select | Then follow to see whether: disease develops | Disease does not develop | Totals | Incidence rate of disease
Exposed | a | b | a+b | a/(a+b)
Not exposed | c | d | c+d | c/(c+d)
a/(a+b) = incidence in exposed
c/(c+d) = incidence in non-exposed
Hypothetical Cohort
3,000 smokers and 5,000 non-smokers, followed to investigate the relation of smoking to the development of coronary heart disease (CHD) over a 1-year period.
Group | Develop CHD | Do not develop CHD | Totals | Incidence per 1,000/year
Smoke cigarettes | 84 | 2,916 | 3,000 | 28.0
Do not smoke cigarettes | 87 | 4,913 | 5,000 | 17.4
Incidence among the exposed = 84/3,000 = 28.0 per 1,000
Incidence among the non-exposed = 87/5,000 = 17.4 per 1,000
Relative risk = (incidence in exposed) / (incidence in non-exposed) = 28.0/17.4 = 1.61
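The arithmetic of this hypothetical cohort can be checked with a few lines of Python. This is only a sketch; the function names are illustrative and the cell labels a, b, c, d follow the 2x2 layout used on the slides.

```python
def incidence_per_1000(new_cases, group_total):
    """Incidence expressed per 1,000 persons over the study period."""
    return 1000 * new_cases / group_total


def relative_risk(a, b, c, d):
    """a, b = exposed who do / do not develop disease;
    c, d = non-exposed who do / do not develop disease."""
    return (a / (a + b)) / (c / (c + d))


smokers_incidence = incidence_per_1000(84, 3000)
non_smokers_incidence = incidence_per_1000(87, 5000)
rr = relative_risk(84, 2916, 87, 4913)

print(round(smokers_incidence, 1), round(non_smokers_incidence, 1), round(rr, 2))
```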
Example: the British Heart Study
A large cohort study of 7735 men aged 40-59 years, randomly selected from general practices in 24 British towns, with the aim of identifying risk factors for ischemic heart disease.
At recruitment to the study, the men were asked about a number of demographic and lifestyle factors, including cigarette smoking habits.
Of the 7718 men who provided information on smoking status, 5899 (76.4%) had smoked at some stage during their lives (including those who were current smokers and those who were ex-smokers).
Over the subsequent 10 years, 650 of these 7718 men (8.4%) had a myocardial infarction (MI).
Smoking status at baseline | MI in subsequent 10 years: Yes | No | Total
Ever smoked | 563 (9.5%) | 5336 (90.5%) | 5899
Never smoked | 87 (4.8%) | 1732 (95.2%) | 1819
Total | 650 (8.4%) | 7068 (91.6%) | 7718
The estimated relative risk = (563/5899) / (87/1819) = 2.00
CI = 1.60-2.49 (does not include 1)
A middle-aged man who has ever smoked is twice as likely to suffer an MI over the next 10-year period as a man who has never smoked.
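The slide reports the confidence interval without showing how it is obtained. One common approach, assumed here rather than taken from the slides, is the log (Katz) method: compute the standard error of ln(RR) and exponentiate the limits. Applied to the table above, it gives limits close to the reported 1.60-2.49.

```python
import math


def relative_risk_with_ci(a, b, c, d, z=1.96):
    """Relative risk with an approximate 95% CI (log method).
    a, b = exposed with / without the outcome;
    c, d = non-exposed with / without the outcome."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Ever smoked: 563 MI, 5336 no MI; never smoked: 87 MI, 1732 no MI.
mi_rr, mi_lower, mi_upper = relative_risk_with_ci(563, 5336, 87, 1732)
print(f"RR = {mi_rr:.2f}, 95% CI {mi_lower:.2f} to {mi_upper:.2f}")
```

Because the interval excludes 1, the association is unlikely to be explained by chance alone at the 5% level.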
The Odds Ratio (relative odds)
* In order to calculate a relative risk, we must have values for the incidence in the exposed and the non-exposed, as can be obtained in a cohort study.
* In a case-control study, however, we do not know the incidence in the exposed population or the incidence in the non-exposed population, because we start with diseased people (cases) and non-diseased people (controls).
* Hence, we cannot estimate the RR in a case-control study directly, and we use another measure of association called the odds ratio.
Defining the Odds ratio in Cohort and in case-control studies.
Suppose we are betting on a horse named Little Beauty, which has a 60% probability of winning the race (P). Little Beauty therefore has a 40% probability of losing (1-P). What are the odds that the horse will win the race?
The odds are defined as the ratio of the number of ways the event can occur to the number of ways the event cannot occur.
Odds = (probability that Little Beauty will win the race) / (probability that Little Beauty will lose the race)
Odds = P/(1-P) = 60%/40% = 1.5:1 = 1.5
The probability of winning is 60%, while the odds of winning are 1.5 to 1.
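The conversion between probability and odds in the Little Beauty example can be sketched in Python; the function names are illustrative.

```python
def odds_from_probability(p):
    """Odds = P / (1 - P)."""
    return p / (1 - p)


def probability_from_odds(odds):
    """Inverse conversion: P = odds / (1 + odds)."""
    return odds / (1 + odds)


winning_odds = odds_from_probability(0.60)
print(round(winning_odds, 2))                         # about 1.5
print(round(probability_from_odds(winning_odds), 2))  # back to about 0.6
```

For small probabilities the odds and the probability are nearly equal, which is why the odds ratio approximates the relative risk when a disease is infrequent.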
Odds Ratios in Case-Control and Cohort Studies
Cohort study | Develop disease | Do not develop disease
Exposed | a | b
Not exposed | c | d
Odds ratio = (odds that an exposed person develops disease) / (odds that a non-exposed person develops disease) = (a/b) / (c/d) = ad/bc

Case-control study | Cases | Controls
History of exposure | a | b
No history of exposure | c | d
Odds ratio = (odds that a case was exposed) / (odds that a control was exposed) = (a/c) / (b/d) = ad/bc
Example: HRT
* A total of 1327 women aged 50 to 81 years with hip fractures, who lived in a largely urban area in Sweden, were investigated in this unmatched case-control study. They were compared with 3262 controls within the same age range selected from the National register.
* Interest was centered on determining whether postmenopausal hormone replacement therapy (HRT) substantially reduced the risk of hip fracture.
* The results in the table show the number of women who were current users of HRT and those who had never used or formerly used HRT in cases and controls.
Group | Current users of HRT | Never used HRT / former users of HRT | Total
With hip fracture (cases) | 40 (14%) | 1287 (30%) | 1327
Without hip fracture (controls) | 239 | 3023 | 3262
Total | 279 | 4310 | 4589
The observed odds ratio = (40 x 3023) / (239 x 1287) = 0.39
CI = 0.28 to 0.56
A postmenopausal woman in this age range in Sweden who was a current user of HRT thus had 39% of the risk of hip fracture of a woman who had never used or formerly used HRT.
Being a current user of HRT reduced the risk of hip fracture by 61%.
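The cross-product calculation can be reproduced in Python. The confidence interval below uses the Woolf (log) method, which is an assumption on my part: the slide does not say how its interval was computed, and the crude interval from this sketch comes out at roughly 0.28 to 0.55, slightly narrower at the upper end than the 0.56 reported.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio (ad/bc) with an approximate 95% CI (Woolf method).
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    or_estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_estimate) - z * se_log_or)
    upper = math.exp(math.log(or_estimate) + z * se_log_or)
    return or_estimate, lower, upper


# Current HRT users: 40 cases, 239 controls; never/former users: 1287 cases, 3023 controls.
hrt_or, hrt_lower, hrt_upper = odds_ratio_with_ci(40, 239, 1287, 3023)
print(f"OR = {hrt_or:.2f}, 95% CI {hrt_lower:.2f} to {hrt_upper:.2f}")
```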
When is the Odds Ratio a Good Estimate of the Relative Risk?
In case-control, only the odds ratio can be calculated as a measure of association, whereas in a cohort, either the relative risk or the odds ratio is a valid measure of association.
Nevertheless, the odds ratio can be used as an estimate of the RR in interpreting a case-control study under the following conditions:
1. When the cases are representative, with regard to history of exposure, of all people with the disease in the population from which the cases were drawn.
2. When the controls are representative, with regard to history of exposure, of all people without the disease in the population from which the cases were drawn.
3. When the disease being studied does not occur frequently.
Odds Ratios and Relative risk
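Example 1 (disease is infrequent)
Group | Disease develops | Do not develop disease | Total
Exposed | 200 | 9800 | 10,000
Not exposed | 100 | 9900 | 10,000
Relative risk = (200/10,000) / (100/10,000) = 2
Odds ratio = (200 x 9900) / (100 x 9800) = 2.02

Example 2 (disease is frequent)
Group | Develop disease | Do not develop disease | Total
Exposed | 50 | 50 | 100
Not exposed | 25 | 75 | 100
Relative risk = (50/100) / (25/100) = 2
Odds ratio = (50 x 75) / (25 x 50) = 3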
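The two tables can be compared side by side with a short Python sketch (function names are illustrative): the relative risk is 2 in both, but the odds ratio stays close to it only when the disease is infrequent.

```python
def relative_risk(a, b, c, d):
    """a, b = exposed with / without disease; c, d = non-exposed with / without."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc for the same 2x2 layout."""
    return (a * d) / (b * c)


rare_disease = (200, 9800, 100, 9900)   # risk of 1-2% in each group
common_disease = (50, 50, 25, 75)       # risk of 25-50% in each group

for label, table in (("rare", rare_disease), ("common", common_disease)):
    print(label, round(relative_risk(*table), 2), round(odds_ratio(*table), 2))
```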
Remember
The relative odds (odds ratio) is a useful measure of association in and of itself, in both case-control and prospective studies “Cohort”.
In a cohort study, the relative risk can be calculated directly.
In a case-control study, the relative risk cannot be calculated directly, so that the relative odds or odds ratio (cross-product ratio) is used as an estimate of the relative risk when the risk of the disease is low.
Calculating the Odds ratio in a Matched Pairs Case-Control Study.
According to the exposure of the case and of the control, the matched pairs in a case-control study can be classified into four groups:
Concordant pairs:
- pairs in which both the case and the control were exposed.
- pairs in which neither the case nor the control was exposed.
Discordant pairs:
- pairs in which the case was exposed but the control was not.
- pairs in which the control was exposed and the case was not.
2x2 table (each count is a number of pairs)
Cases | Control exposed | Control not exposed
Exposed | a (both the case and the control were exposed) | b (the case was exposed and the control was not)
Not exposed | c (the case was not exposed and the control was exposed) | d (neither the case nor the control was exposed)
The calculation uses the discordant pairs only (b and c); we ignore the concordant pairs because they do not contribute to our knowledge of how cases and controls differ in regard to past history of exposure.
The odds ratio then equals b/c.
Case-control study of brain tumors in children.
o A number of studies have suggested that children with higher birth weights are at increased risk for childhood cancer.
o In the next analysis, exposure is defined as birth weight greater than 8 lbs.
Cases | Normal control 8+ lbs | Normal control < 8 lbs | Total
8+ lbs | 8 | 18 | 26
< 8 lbs | 7 | 38 | 45
Total | 15 | 56 | 71
Odds ratio = 18/7 = 2.57
χ² = 4.00, P = 0.046
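The matched-pairs figures can be reproduced with the sketch below. The slide does not name the test behind its chi-square value; McNemar's test with continuity correction is assumed here because it gives exactly the reported 4.00 from the 18 and 7 discordant pairs.

```python
import math


def matched_pairs_odds_ratio(b, c):
    """b = pairs with case exposed, control not;
    c = pairs with control exposed, case not."""
    return b / c


def mcnemar_chi_square(b, c):
    """McNemar chi-square with continuity correction (1 degree of freedom)."""
    return (abs(b - c) - 1) ** 2 / (b + c)


def p_value_1df(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(chi_square / 2))


pairs_or = matched_pairs_odds_ratio(18, 7)
chi_square = mcnemar_chi_square(18, 7)
p_value = p_value_1df(chi_square)
print(round(pairs_or, 2), round(chi_square, 2), round(p_value, 3))
```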
Attributable Risk
How much of the disease that occurs can be attributed to a certain exposure?
Attributable risk is defined as the amount or proportion of disease incidence (or disease risk) that can be attributed to a specific exposure.
Example: how much of the lung cancer risk experienced by smokers can be attributed to smoking?
It is more important than the RR for clinical practice and public health, because it addresses the question: how much of the risk (incidence) of disease can we hope to prevent if we are able to eliminate exposure to the agent in question?
Attributable Risk for the Exposed Group
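[Figure: Attributable risk for the exposed group. Bar chart of the level of risk in the exposed and non-exposed groups. The risk in the non-exposed group is the background risk (incidence not due to exposure); the exposed group carries the same background risk plus the incidence due to exposure.]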
Calculations
The incidence of a disease that is attributable to the exposure in the exposed group can be calculated as follows:
(incidence in the exposed group) - (incidence in the non-exposed group)
Then, what proportion of the risk in exposed persons is due to the exposure?
[(incidence in the exposed group) - (incidence in the non-exposed group)] / (incidence in the exposed group)
Attributable Risk for the Total Population
What proportion of the disease incidence in a total population (both exposed and non-exposed) can be attributable to a specific exposure?
What would be the total impact of a prevention program on the community?
Calculations entail:
(incidence in the total population) - (incidence in the non-exposed group, the 'background risk')
In proportion:
[(incidence in the total population) - (incidence in the non-exposed group, the 'background risk')] / (incidence in the total population)
Example for calculating the attributable risk in the exposed group

Smoking status             Develop CHD   Do not develop CHD   Total   Incidence per 1,000 per year
Smoke cigarettes                84             2,916          3,000            28.0
Do not smoke cigarettes         87             4,913          5,000            17.4
Incidence among smokers = 84/3,000 = 28.0 per 1,000
Incidence among non-smokers = 87/5,000 = 17.4 per 1,000
The AR = (incidence in the exposed group) - (incidence in the non-exposed group) = (28.0 - 17.4)/1,000 = 10.6/1,000
In proportion, the AR = [(incidence in the exposed group) - (incidence in the non-exposed group)] / (incidence in the exposed group) = (28.0 - 17.4)/28.0 = 10.6/28.0 = 0.379 = 37.9%
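The same arithmetic as a minimal sketch, using the counts from the smoking and CHD table. The function names are hypothetical and chosen only for this illustration.

```python
def incidence_per_1000(cases, total):
    """Incidence expressed per 1,000 persons per year."""
    return 1000 * cases / total


def attributable_risk(inc_exposed, inc_unexposed):
    """Incidence in the exposed group minus the background risk."""
    return inc_exposed - inc_unexposed


def attributable_proportion(inc_exposed, inc_unexposed):
    """Share of the risk in the exposed group that is due to the exposure."""
    return (inc_exposed - inc_unexposed) / inc_exposed


smokers = incidence_per_1000(84, 3000)
non_smokers = incidence_per_1000(87, 5000)
ar = attributable_risk(smokers, non_smokers)
proportion = attributable_proportion(smokers, non_smokers)
print(f"AR = {ar:.1f} per 1,000; proportion = {proportion:.1%}")
```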
What does this mean?
The attributable risk = 10.6/1,000. It means that 10.6 of the 28.0 incident cases per 1,000 smokers are attributable to the fact that these people smoke.
Thus, if we had an effective smoking cessation campaign, we could prevent 10.6 of the 28.0 incident cases of CHD per 1,000 that smokers experience.
In proportion, 37.9% of the morbidity from CHD among smokers may be attributable to smoking and could presumably be prevented by eliminating smoking.
Attributable risk in total population
The incidence in the total population that is attributable to the exposure can be calculated by subtracting the background risk:
(incidence in the total population) - (incidence in the non-exposed group)
For this calculation we must know the incidence of the disease in the total population (which we often do not know), or all of the following three values, from which we can then calculate the incidence in the total population:
The incidence among the exposed.
The incidence among the non-exposed.
The proportion of the total population that is exposed (frequently assumed or judged).
AR in total population.
Assume that the proportion of smokers in the total population is 44% (and therefore the proportion of non-smokers is 56%).
The incidence in the total population can then be calculated as follows:
(incidence in smokers)(% of smokers in the population) + (incidence in non-smokers)(% of non-smokers in the population)
= (28.0/1,000)(0.44) + (17.4/1,000)(0.56) = 22.1/1,000
Then the AR = 22.1/1,000 - 17.4/1,000 = 4.7/1,000.
This is the reduction in the incidence of CHD that could be anticipated if we had an effective prevention program.
AR in total population
* Proportion of the incidence in the total population attributable to the exposure = [(incidence in the total population) - (incidence in the non-exposed group)] / (incidence in the total population) = (22.1 - 17.4)/22.1 = 21.3%.
* Thus, 21.3% of the incidence of CHD in this total population can be attributed to smoking. If an effective prevention program eliminated smoking, the best we could hope to achieve would be a reduction of 21.3% in the incidence of CHD in the total population, which consists of both smokers and non-smokers.
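A minimal sketch of the population calculation, assuming the same inputs as the slides (28.0 and 17.4 per 1,000, and 44% smokers). The function names are hypothetical.

```python
def population_incidence(inc_exposed, inc_unexposed, prop_exposed):
    """Weighted average of the two group incidences."""
    return inc_exposed * prop_exposed + inc_unexposed * (1 - prop_exposed)


def population_attributable_risk(inc_total, inc_unexposed):
    """Incidence in the total population minus the background risk."""
    return inc_total - inc_unexposed


def population_attributable_proportion(inc_total, inc_unexposed):
    """Share of the total incidence that is due to the exposure."""
    return (inc_total - inc_unexposed) / inc_total


total = population_incidence(28.0, 17.4, 0.44)
par = population_attributable_risk(total, 17.4)
share = population_attributable_proportion(total, 17.4)
print(f"total = {total:.1f}, AR = {par:.1f} per 1,000, proportion = {share:.1%}")
```

Working from the unrounded total incidence (22.064 per 1,000) gives a proportion of about 21.1%. The slide figure of 21.3% comes from rounding the total incidence to 22.1 and the difference to 4.7 before dividing.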
Clinical Trials: An introduction
Dr. Tarek Tawfik
What is a Clinical Trial?
A prospective study comparing the effect and value of intervention(s) against a control in human beings.
Friedman, 1998
Hierarchy of Medical Evidence
From weakest to strongest evidence:
Case report.
Case series.
Database studies.
Observational studies.
Controlled clinical trials.
Randomized controlled trials (RCTs).
Leon Gordis 2001.
[Figure: level of evidence, rising from case reports, case series, and database studies, through observational studies and controlled clinical trials, to RCTs.]
Randomized controlled clinical trial: the gold standard of research
Intervention studies (Clinical Trials)
In an intervention study, the investigator determines which individuals are exposed to the factor of interest (intervention arm) and which are unexposed (control arm).
Essential Elements of RCTs
Properly designed.
Unbiased treatment assignment (randomization).
Comparable test groups “similar baseline data”.
Intervention and control arms.
Follow-up for a specific outcome.
Types of RCTS
Treatment trials.
Preventive trials (vaccine).
Diagnostic or screening tests.
Trials of health care delivery.
Trials of health care policy.
Types of Treatment Trials
Pharmaceutical (treatment, prevention, biological, synthetic).
Device (prosthesis, sensory aids).
Procedure (surgery, laser, radiological).
Behavior change (smoking cessation, dietary modification, exercise).
Other (counseling, information provision).
Pharmaceutical Development
Clinical Trials Phases – Phase I
Purpose: determine basic safety and pharmacological information:
I. Route of administration.
II. Safe dosage range.
III. Toxicity.
IV. Pharmacokinetics.
o Treat: small numbers of patients over a short period of time.
o Usually no control group.
o Healthy adult volunteers or patients who have exhausted all other options (terminal cancer patients).
Clinical Trials Phases – Phase II
Purpose: evaluate the drug in patients who suffer from the disease or condition that the drug is proposed to treat:
Provide a preliminary evaluation of efficacy.
Identify the group of patients most likely to benefit.
Collect additional dosage and safety data.
Usually has a comparison group, but is not always randomized.
Clinical Trials Phases – Phase III
Purpose: further evaluate the efficacy and safety:
⌂ Randomized.
⌂ New agent compared to placebo or current therapy.
⌂ Usually multicenter.
⌂ Serves as the basis for the NDA, i.e. the new drug application for marketing approval.
Clinical Trials Phases – Phase IV
Drug is on the market – post-marketing surveillance study.
Purpose: collect longer term data on safety and efficacy and identify an advantage over other therapies.
Conducted for the approved indication, but may evaluate different doses or effects of extended therapy.
Outcomes of Trial Phases
o Phase I: maximum tolerated dose.
o Phase II: biological effect, adverse events.
o Phase III: efficacy, adverse events.
o Phase IV: long-term effectiveness and safety.
Measures For Bias Control
☻ Written protocol.
☻ Tested data collection forms, handbooks, manuals of procedures.
☻ Written definitions.
☻ Standard equipment.
☻ Training and certification of personnel.
☻ Independent data entry.
Sampling procedures
[Figure: flow chart. Reference population -> sampling (random) -> experimental population (willing / unwilling) -> screening by selection criteria (eligible / ineligible) -> study population.]
RCTs “Basic Structure”
[Figure: flow chart. Reference population -> random sampling -> sample population -> randomization -> control arm and intervention arm, each followed up for outcome / no outcome.]
Control arm
WHY? ☻ Spontaneous cure ☼ Side effects
HOW? ⌂ Criteria ► Historical ▼ Ethical
Examples of control arm
Standard care.
Placebo.
Careful follow-up.
Early or late application of the same intervention.
Higher or lower dose level.
Randomization
o Randomization means that subjects recruited from the study population are allocated to either the intervention or the control arm by chance.
o Random procedure ≠ haphazard procedure.
Why Randomization
o Ensures comparability of the two arms regarding known and unknown factors.
o Avoids selection bias.
o Provides a basis for standard statistical analysis.
o Marked differences in the baseline characteristics of the study arms may indicate a breakdown in randomization.
Why Randomization is difficult?
Any randomization technique must ensure:
Every new subject has an equal chance of being allocated to either arm (alternation?!).
A nearly equal number of subjects in each arm (coin toss?!).
Randomization Techniques
I- Fixed allocation randomization:
☺ Simple randomization.
☺ Blocked randomization.
☺ Stratified randomization.
II- Outcome adaptive designs:
☻ Play the winner.
III- Others.
Simple randomization
Sealed envelopes.
Random number tables.
Computer generated.
Blocked Randomization
I. Blocks containing a specific number of participants are generated (5 blocks, each containing 4 participants, for a study with a total of 20 participants).
II. Within each block, participants are randomly allocated to either arm.
T C T C | C C T T | T T C C | T C C T | C T C T
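The blocked scheme above can be sketched as follows. This is a minimal illustration, not a validated allocation system: the function name and the fixed seed are assumptions, and it simply shuffles a balanced list of T (test) and C (control) inside each block.

```python
import random


def blocked_randomization(n_blocks, block_size, arms=("T", "C"), seed=None):
    """Return a list of blocks, each balanced across the arms and shuffled."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    blocks = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm  # e.g. T C T C before shuffling
        rng.shuffle(block)            # random order within the block
        blocks.append(block)
    return blocks


# 5 blocks of 4 participants for a study with 20 participants in total
schedule = blocked_randomization(n_blocks=5, block_size=4, seed=2023)
print(" | ".join(" ".join(block) for block in schedule))
```

Because every block holds two T and two C, the arms can never differ by more than two participants at any point during recruitment, which is the property a simple coin toss does not guarantee.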
Stratified randomization
[Figure: enrolled subjects are divided into two strata, Age<40 and Age>40. Within each stratum, subjects are allocated to Test or Control.]
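A minimal sketch of stratified randomization by age group. It assumes that allocation within each stratum uses shuffled balanced blocks; the slide does not specify the within-stratum method. The participant ages, the cut-off handling at exactly 40, and all names are hypothetical.

```python
import random


def stratified_randomization(participants, stratum_of, block_size=4, seed=None):
    """Allocate each participant to Test or Control within their own stratum."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two arms")
    rng = random.Random(seed)
    pending = {}     # stratum -> allocations left in the current block
    allocation = {}  # participant id -> arm
    for pid in participants:
        stratum = stratum_of(pid)
        if not pending.get(stratum):
            # Start a new balanced block for this stratum
            block = ["Test", "Control"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        allocation[pid] = pending[stratum].pop()
    return allocation


# Hypothetical enrolled subjects: id -> age
ages = {1: 25, 2: 61, 3: 38, 4: 45, 5: 33, 6: 52, 7: 29, 8: 70}


def age_stratum(pid):
    # Assumption: a subject aged exactly 40 goes to the older stratum
    return "Age<40" if ages[pid] < 40 else "Age>=40"


arms = stratified_randomization(list(ages), age_stratum, seed=7)
for pid in sorted(arms):
    print(pid, age_stratum(pid), arms[pid])
```

Each age stratum ends up with its own balanced sequence, so the two arms stay comparable on age even in a small trial.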
Baseline measurements
Useful to check that comparability has been successfully achieved.
Design of the trial
The methodology section should include the following:
I. Patient inclusion criteria.
II. Time of patients' inclusion in the study.
III. Presence of a comparison group.
IV. Matching criteria of the two groups.
V. Method used for randomization.
Blindness
Means ensuring that a person “investigator, data collector, or analyst” remains unaware of which arm a subject has been allocated to.
Why Blindness?
I. To reduce selection bias.
II. To avoid bias in outcome measures.
Blinding is not possible in all studies so, one needs to consider how important it is, and to what extent it can be achieved.
Blindness (continued)
Trials are often described by who is kept unaware of the allocation:
o Single-blind: the subject participating in the trial.
o Double-blind: the subject and the investigators (clinicians, interviewers, laboratory personnel).
o Triple-blind: the subject, the investigators, and the committee (including data entry and analysis) responsible for monitoring outcome.
Procedures to be considered
Compliance of the subjects can be assessed by:
I. Questioning.
II. Observing.
III. Checking the drug.
Complications.
Completeness of follow-up.
Complex designs
1- Multiple treatment groups
More than 2 different treatments (or doses) may be compared with a control group.
[Figure: the sample population is allocated to Control, Drug A, Drug B, or Drug C.]
Complex designs
2- Cross-over trial
o Each subject receives both the active and control treatments during two periods separated by a wash-out period.
[Figure: the enrolled population is allocated to Drug A or placebo and followed for outcome / no outcome. After a wash-out period, subjects again receive Drug A or placebo and are followed for outcome / no outcome.]
Complex designs
3- Factorial design
Used to evaluate the separate and combined effects of two different factors:
Group 1: Placebo.
Group 2: Iron.
Group 3: Folate.
Group 4: Iron + Folate.
[Figure: the sample population is allocated to Control, Surgery, Radiotherapy, or Surgery + Radiotherapy.]
Losses to follow-up (Attrition)
1) One of the most important sources of bias, since those lost may be different from those seen.
2) Compare drop-outs to non-drop-outs.
3) Perform sensitivity analysis.
Interpretation of trial
1- Reporting the data.
2- Statistical methods.
3- Statistical analysis.
4- Power.
P < 0.05 ??
A good RCT should report
Clear definition of patients.
Comparison group.
Randomization and blindness.
Outcome criteria and variables.
Compliance and completeness.
Complications of treatment.
Statistical manipulation.
Sample Size Calculation
Standard formulae and look-up tables are available to calculate the minimum sample size and the ratio of controls to cases.
Some computer packages (Epi-Info, MedCalc) are also available for free (internet).
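As an illustration of one such standard formula, the sketch below estimates the sample size per arm for comparing two proportions. It assumes equal group sizes, a two-sided test, and the normal approximation without continuity correction; the example proportions are hypothetical, and this is not claimed to be the method used by Epi-Info or MedCalc.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per arm to detect p1 versus p2.

    n = (z_alpha/2 + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical outcome rates: 60% in the control arm, 40% in the intervention arm
print(sample_size_two_proportions(0.60, 0.40))
```

Smaller expected differences between the arms, a stricter alpha, or higher power all increase the required sample size.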
Ethical Issues In RCT
The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research.
It sets out the fundamental ethical principles underlying the acceptable conduct of research involving human participants:
• Respect for persons
• Beneficence
• Justice
Critical Appraisal of Published Medical Research
Dr. Tarek Tawfik
Steps in evaluation of a published paper
Consider the research hypothesis
Consider the study design
Consider the outcome variable
Consider the predictor variables
Consider the methods of analysis
Consider the possible sources of bias
Consider the interpretation of results
Consider the utility of the results
Stepwise Approach for Appraisal.
Step 1. Consider the research hypothesis
Is there a clear statement of the research hypothesis?
Does the study address a question that has clinical relevance?
Step 2. Consider the Study Design
Is the study design appropriate for the hypothesis?
Does the design represent an advance over prior approaches?
Does the study use an experimental or an observational design?
Step 3. Consider the Outcome Variable
Is the outcome being studied relevant to clinical practice?
What criteria are used to define the presence of disease?
Is the determination of the absence or presence of disease accurate?
Step 4. Consider the Predictor Variable
How many exposures or risk factors are being studied?
How is the presence or absence of exposure determined?
Is the assessment of exposure likely to be precise and accurate?
Is there an attempt to quantify the amount or duration of exposure?
Are biological markers of exposure used in the study?
Step 5. Consider the Methods of Analysis
Are the statistical methods employed suitable for the types of variables (nominal versus ordinal versus continuous) in the study?
Have the levels of type I and type II errors been discussed appropriately?
Is the sample size adequate to answer the research question?
Have the assumptions underlying the statistical tests been met?
Has chance been evaluated as a potential explanation of the results?
Step 6. Consider Possible Sources of Bias (Systematic Error)
Is the method of selection of subjects likely to have biased results?
Is the measurement of either the exposure or the disease likely to be biased?
Have the investigators considered whether confounders could account for the observed results?
In what direction would each potential bias influence the results?
Step 7. Consider the interpretation of the results.
How large is the observed effect?
Is there evidence of a dose-response relationship?
Are the findings consistent with laboratory models?
Are the effects biologically plausible?
If the findings are negative, was there sufficient statistical power to detect an effect?
Step 8. Consider how the results of the study can be used in practice.
Are the findings consistent with other studies of the same questions?
Can the findings be generalized to other human populations?
Do the findings warrant a change in current clinical practice?
Thank you