Lectures of Stat 145 (Biostatistics)
Text book: Biostatistics: Basic Concepts and Methodology for the Health Sciences
By Wayne W. Daniel
Prepared by: Sana A. Abunasrah
Text Book : Basic Concepts and Methodology for the Health Sciences
Chapter 1
Introduction To
Biostatistics
Key words:
Statistics, data, biostatistics, variable, population, sample
Introduction: Some Basic Concepts
Statistics is a field of study concerned with:
1- the collection, organization, summarization, and analysis of data;
2- the drawing of inferences about a body of data when only a part of the data is observed.
Statisticians try to interpret and communicate the results to others.
* Biostatistics:
The tools of statistics are employed in many fields: business, education, psychology, agriculture, economics, etc.
When the data analyzed are derived from the biological sciences and medicine, we use the term biostatistics to distinguish this particular application of statistical tools and concepts.
* Data:
• The raw material of statistics is data.
• We may define data as figures. Figures result from the process of counting or from taking a measurement.
• For example:
- When a hospital administrator counts the number of patients (counting).
- When a nurse weighs a patient (measurement).
* Sources of Data:
We search for suitable data to serve as the raw material for our investigation.
Such data are available from one or more of the following sources:
1- Routinely kept records. For example:
- Hospital medical records contain immense amounts of information on patients.
- Hospital accounting records contain a wealth of data on the facility's business activities.
2- External sources.
The data needed to answer a question may already exist in the form of published reports, commercially available data banks, or the research literature, i.e. someone else has already asked the same question.
3- Surveys.
The source may be a survey when the data are needed to answer certain questions.
For example: If the administrator of a clinic wishes to obtain information regarding the mode of transportation used by patients to visit the clinic, then a survey may be conducted among patients to obtain this information.
4- Experiments.
Frequently the data needed to answer a question are available only as the result of an experiment.
For example: If a nurse wishes to know which of several strategies is best for maximizing patient compliance, she might conduct an experiment in which the different strategies of motivating compliance are tried with different patients.
* A variable:
It is a characteristic that takes on different values in different persons, places, or things.
For example:
- heart rate,
- the heights of adult males,
- the weights of preschool children,
- the ages of patients seen in a dental clinic.
Types of variables: Quantitative and Qualitative

Quantitative Variables
A quantitative variable can be measured in the usual sense.
For example:
- the heights of adult males,
- the weights of preschool children,
- the ages of patients seen in a dental clinic.

Qualitative Variables
Many characteristics are not capable of being measured. Some of them can be ordered or ranked.
For example:
- classification of people into socio-economic groups,
- social classes based on income, education, etc.
Types of quantitative variables: Discrete and Continuous

A discrete variable is characterized by gaps or interruptions in the values that it can assume.
For example:
- The number of daily admissions to a general hospital,
- The number of decayed, missing or filled teeth per child in an elementary school.

A continuous variable can assume any value within a specified relevant interval of values assumed by the variable.
For example:
- Height,
- weight,
- skull circumference.
No matter how close together the observed heights of two people, we can find another person whose height falls somewhere in between.
* A population:
It is the largest collection of values of a random variable for which we have an interest at a particular time.
For example: The weights of all the children enrolled in a certain elementary school.
Populations may be finite or infinite.
* A sample:
It is a part of a population.
For example: The weights of only a fraction of these children.
Exercises
• Question (6) – Page 17
• Question (7) – Page 17, “Situation A, Situation B”
Chapter (2)
Strategies for understanding the meanings of Data
Pages (19 – 27)
Key words:
frequency table, bar chart, range, width of interval, mid-interval, histogram, polygon
Descriptive Statistics
Frequency Distribution for Discrete Random Variables
Example:
Suppose that we take a sample of size 16 from children in a primary school and get the following data about the number of their decayed teeth:
3, 5, 2, 4, 0, 1, 3, 5, 2, 3, 2, 3, 3, 2, 4, 1
To construct a frequency table:
1- Order the values from the smallest to the largest.
0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5, 5
2- Count how many numbers are the same.

No. of decayed teeth    Frequency    Relative Frequency
0                       1            0.0625
1                       2            0.125
2                       4            0.25
3                       5            0.3125
4                       2            0.125
5                       2            0.125
Total                   16           1
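The two steps above (order, then count) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_table` and the variable names are assumptions made for this example, not part of the textbook.

```python
from collections import Counter

# Sample of 16 children: number of decayed teeth (data from the example above).
decayed_teeth = [3, 5, 2, 4, 0, 1, 3, 5, 2, 3, 2, 3, 3, 2, 4, 1]

def frequency_table(values):
    """Return (value, frequency, relative frequency) rows, ordered by value."""
    counts = Counter(values)          # step 2: count how many numbers are the same
    n = len(values)
    # step 1: sorted() orders the distinct values from smallest to largest
    return [(v, counts[v], counts[v] / n) for v in sorted(counts)]

table = frequency_table(decayed_teeth)
for value, freq, rel in table:
    print(f"{value:>5} {freq:>9} {rel:>10.4f}")
print(f"Total {len(decayed_teeth):>9} {sum(r for _, _, r in table):>10.4f}")
```

The relative frequency of each value is its frequency divided by the sample size n = 16, so the relative frequencies add up to 1.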
Representing the simple frequency table using the bar chart
We can represent the above simple frequency table using a bar chart.
[Figure: bar chart. Horizontal axis: number of decayed teeth (0 to 5). Vertical axis: frequency (0 to 6). Bar heights: 1, 2, 4, 5, 2, 2.]
2.3 Frequency Distribution for Continuous Random Variables
For large samples, we can't use the simple frequency table to represent the data.
We need to divide the data into groups, intervals, or classes.
So, we need to determine:
1- The number of intervals (k).
Too few intervals are not good because information will be lost.
Too many intervals are not helpful to summarize the data.
A commonly followed rule is that 6 ≤ k ≤ 15, or the following formula may be used:
k = 1 + 3.322 (log n), where log is the base-10 logarithm and n is the number of observations.
2- The range (R).
It is the difference between the largest and the smallest observation in the data set.
3- The width of the interval (w).
Class intervals generally should be of the same width. Thus, if we want k intervals, then w is chosen such that
w ≥ R / k.
Example:
Assume that the number of observations equals 100. Then
k = 1 + 3.322 (log 100) = 1 + 3.322 (2) = 7.6 ≈ 8.
Assume that the smallest value = 5 and the largest value in the data = 61. Then
R = 61 – 5 = 56 and w = 56 / 8 = 7.
To make the summarization more comprehensible, the class width may be 5 or 10 or a multiple of 10.
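The arithmetic for k, R and w can be checked with a short Python sketch. The formula k = 1 + 3.322 log n is commonly known as Sturges' rule. The function names below are hypothetical, and rounding k up to the next whole number is an assumption; the slides only show 7.6 being taken as 8.

```python
import math

def number_of_intervals(n):
    """k = 1 + 3.322 * log10(n), rounded up to a whole number (assumed rounding)."""
    return math.ceil(1 + 3.322 * math.log10(n))

def interval_width(smallest, largest, k):
    """Smallest width w satisfying w >= R / k, where R = largest - smallest."""
    return (largest - smallest) / k

k = number_of_intervals(100)      # 1 + 3.322 * 2 = 7.644, rounded up to 8
w = interval_width(5, 61, k)      # R = 56, so w = 56 / 8 = 7.0
print(k, w)
```

In practice the computed width is only a lower bound. A rounder width such as 5 or 10 is usually chosen so that the class limits are easy to read.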
Example 2.3.1
We wish to know how many class intervals to have in the frequency distribution of the data in Table 1.4.1, Pages 9-10, of the ages of 189 subjects who participated in a study on smoking cessation.
Solution:
Since the number of observations equals 189, then
k = 1 + 3.322 (log 189) = 1 + 3.322 (2.276) ≈ 9,
R = 82 – 30 = 52 and w = 52 / 9 = 5.778.
It is better to let w = 10; then the intervals will be in the form:
Class interval    Frequency
30 – 39           11
40 – 49           46
50 – 59           70
60 – 69           45
70 – 79           16
80 – 89           1
Total             189

Sum of frequencies = sample size = n
The Cumulative Frequency:
It can be computed by adding successive frequencies.
The Cumulative Relative Frequency:
It can be computed by adding successive relative frequencies.
The Mid-interval:
It can be computed by adding the lower bound of the interval to its upper bound and then dividing by 2.
For the above example, the following table represents the cumulative frequency, the relative frequency, the cumulative relative frequency and the mid-interval.

Class interval    Mid-interval    Frequency (f)    Cumulative Frequency    Relative Frequency (R.f)    Cumulative Relative Frequency
30 – 39           34.5            11               11                      0.0582                      0.0582
40 – 49           44.5            46               57                      0.2434                      -
50 – 59           54.5            -                127                     -                           0.6720
60 – 69           -               45               -                       0.2381                      0.9101
70 – 79           74.5            16               188                     0.0847                      0.9948
80 – 89           84.5            1                189                     0.0053                      1
Total                             189                                      1

R.f = freq / n
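As a minimal sketch, the columns of this table can be computed from the class intervals and frequencies alone. The variable names are assumptions made for this example. The code also fills the cells the table leaves as "-", so it can be used to check a completed table.

```python
from itertools import accumulate

# Class intervals and frequencies from the grouped frequency table above.
intervals = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]
frequencies = [11, 46, 70, 45, 16, 1]

n = sum(frequencies)                                   # sample size, 189
midpoints = [(lo + hi) / 2 for lo, hi in intervals]    # (lower + upper) / 2
cumulative = list(accumulate(frequencies))             # running totals of f
relative = [f / n for f in frequencies]                # R.f = freq / n
cumulative_relative = [c / n for c in cumulative]      # cumulative freq / n

for (lo, hi), m, f, c, r, cr in zip(
    intervals, midpoints, frequencies, cumulative, relative, cumulative_relative
):
    print(f"{lo}-{hi}  {m:5.1f}  {f:3d}  {c:3d}  {r:.4f}  {cr:.4f}")
```

Here the cumulative relative frequency is computed as cumulative frequency / n, which gives 0.9947 for the 70 – 79 row. The table's 0.9948 comes from adding the already rounded relative frequencies, so a difference in the last decimal place is expected.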
Example:
From the above frequency table, complete the table, then answer the following questions:
1- The number of subjects with age less than 50 years?
2- The number of subjects with age between 40 and 69 years?
3- The relative frequency of subjects with age between 70 and 79 years?
4- The relative frequency of subjects with age more than 69 years?
5- The percentage of subjects with age between 40 and 49 years?
6- The percentage of subjects with age less than 60 years?
7- The range (R)?
8- The number of intervals (k)?
9- The width of the interval (w)?
Representing the grouped frequency table using the histogram
To draw the histogram, the true class limits should be used. They can be computed by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each interval.

True class limits    Frequency
29.5 – < 39.5        11
39.5 – < 49.5        46
49.5 – < 59.5        70
59.5 – < 69.5        45
69.5 – < 79.5        16
79.5 – < 89.5        1
Total                189

[Figure: histogram. Horizontal axis: mid-intervals 34.5, 44.5, 54.5, 64.5, 74.5, 84.5. Vertical axis: frequency (0 to 80).]
Representing the grouped frequency table using the polygon
[Figure: frequency polygon. Horizontal axis: mid-intervals 34.5, 44.5, 54.5, 64.5, 74.5, 84.5. Vertical axis: frequency (0 to 80).]
Exercises
Pages: 31 – 34
Questions: 2.3.2(a), 2.3.5(a)
H.W.: 2.3.6, 2.3.7(a)
Section (2.4): Descriptive Statistics
Measures of Central Tendency
Pages 38 - 41
Key words: descriptive statistic, measure of central tendency, statistic, parameter, mean (μ), median, mode.
The Statistic and The Parameter
• A Statistic:
It is a descriptive measure computed from the data of a sample.
• A Parameter:
It is a descriptive measure computed from the data of a population.
Since it is difficult to measure a parameter from the population, a sample of size n is drawn, whose values are $x_1, x_2, \dots, x_n$. From these data, we compute the statistic.
Measures of Central Tendency
A measure of central tendency is a measure which indicates where the middle of the data is.
The three most commonly used measures of central tendency are: the Mean, the Median, and the Mode.
The Mean:
It is the average of the data.
The Population Mean:
$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$, which is usually unknown, so we use the sample mean to estimate or approximate it.
The Sample Mean:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Example:
Here is a random sample of size 10 of ages, where $x_1$ = 42, $x_2$ = 28, $x_3$ = 28, $x_4$ = 61, $x_5$ = 31, $x_6$ = 23, $x_7$ = 50, $x_8$ = 34, $x_9$ = 32, $x_{10}$ = 37.
$\bar{x}$ = (42 + 28 + … + 37) / 10 = 36.6
Properties of the Mean:
• Uniqueness. For a given set of data there is one and only one mean.
• Simplicity. It is easy to understand and to compute.
• Affected by extreme values, since all values enter into the computation.
Example: Assume the values are 115, 110, 119, 117, 121 and 126. The mean = 118.
But assume that the values are 75, 75, 80, 80 and 280. The mean = 118, a value that is not representative of the set of data as a whole.
The Median:
When the data are ordered, the median is the observation that divides the set of observations into two equal parts, such that half of the data are before it and the other half are after it.
* If n is odd, the median is the middle observation, i.e. the (n+1)/2 th ordered observation.
When n = 11, the median is the 6th observation.
* If n is even, there are two middle observations, and the median is the mean of these two. Its position is again (n+1)/2.
When n = 12, the median is the 6.5th observation, i.e. the value halfway between the 6th and 7th ordered observations.
Example:
For the same random sample, the ordered observations are:
23, 28, 28, 31, 32, 34, 37, 42, 50, 61.
Since n = 10, the median is the 5.5th observation, i.e. (32 + 34)/2 = 33.
Properties of the Median:
• Uniqueness. For a given set of data there is one and only one median.
• Simplicity. It is easy to calculate.
• It is not affected by extreme values as the mean is.
The Mode:
It is the value which occurs most frequently.
If all values are different, there is no mode.
Sometimes there is more than one mode.
Example:
For the same random sample, the value 28 is repeated two times, so it is the mode.
Properties of the Mode:
• Sometimes, it is not unique.
• It may be used for describing qualitative data.
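The three measures can be reproduced for the sample of ages with Python's standard `statistics` module. This is an illustrative sketch, and the variable names are assumptions. The second data set repeats the extreme-value example to show that the mean moves with the outlier while the median does not.

```python
import statistics

# Random sample of 10 ages used in the examples above.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

mean = statistics.mean(ages)          # sum of the values divided by n
median = statistics.median(ages)      # n is even: mean of the 5th and 6th ordered values
modes = statistics.multimode(ages)    # list of every most-frequent value
print(mean, median, modes)

# Extreme-value example: one large value pulls the mean but not the median.
with_outlier = [75, 75, 80, 80, 280]
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

`multimode` returns a list because the mode is not always unique. A data set with several equally frequent values yields more than one entry.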
Section (2.5): Descriptive Statistics
Measures of Dispersion
Pages 43 - 46
Key words: descriptive statistic, measure of dispersion, range, variance, coefficient of variation.
2.5. Descriptive Statistics – Measures of Dispersion:
• A measure of dispersion conveys information regarding the amount of variability present in a set of data.
• Note:
1. If all the values are the same → there is no dispersion.
2. If the values are different → there is dispersion:
a) If the values are close to each other → the amount of dispersion is small.
b) If the values are widely scattered → the dispersion is greater.
Ex. Figure 2.5.1 – Page 43
• ** Measures of Dispersion are:
1. Range (R).
2. Variance.
3. Standard deviation.
4. Coefficient of variation (C.V).
1. The Range (R):
• Range = Largest value − Smallest value = $x_L - x_S$
• Note:
• The range depends on only two values.
• Example 2.5.1 Page 40:
• Refer to Ex 2.4.2 Page 37.
• Data: 43, 66, 61, 64, 65, 38, 59, 57, 57, 50.
• Find the range.
• Range = 66 − 38 = 28
2. The Variance:
• It measures dispersion relative to the scatter of the values about their mean.
a) Sample Variance ($S^2$):
• $S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the sample mean.
• Example 2.5.2 Page 40:
• Refer to Ex 2.4.2 Page 37.
• Find the sample variance of the ages, $\bar{x}$ = 56.
• Solution:
• $S^2$ = [(43 − 56)² + (66 − 56)² + … + (50 − 56)²] / (10 − 1)
• = 810 / 9 = 90
• b) Population Variance ($\sigma^2$):
• $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$, where $\mu$ is the population mean.
3. The Standard Deviation:
• It is the square root of the variance: $\sqrt{\text{Variance}}$.
a) Sample Standard Deviation: $S = \sqrt{S^2}$
b) Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
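The range, variance and standard deviation of the ten ages from Ex 2.4.2 can be checked with the standard `statistics` module. This is a sketch for verification only, and the variable names are assumptions. Note that `variance` and `stdev` divide by n − 1 (sample formulas), while `pvariance` divides by N (population formula).

```python
import statistics

# Ages from Ex 2.4.2, as listed in the range example above.
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]

data_range = max(ages) - min(ages)            # R = x_L - x_S
sample_var = statistics.variance(ages)        # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(ages)            # square root of the sample variance
population_var = statistics.pvariance(ages)   # same sum divided by N instead

print(data_range, sample_var, round(sample_sd, 4), population_var)
```

The sum of squared deviations about the mean 56 is 810, so the sample variance is 810 / 9 = 90. Treating the same ten values as a whole population would give 810 / 10 = 81.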
4. The Coefficient of Variation (C.V):
• It is a measure used to compare the dispersion in two sets of data, and it is independent of the unit of measurement.
• $C.V = \frac{S}{\bar{X}} (100)$
• where S: sample standard deviation,
• $\bar{X}$: sample mean.
Example 2.5.3 Page 46:
• Suppose two samples of human males yield the following data:

                      Sample 1        Sample 2
Age                   25-year-olds    11-year-olds
Mean weight           145 pounds      80 pounds
Standard deviation    10 pounds       10 pounds
• We wish to know which is more variable.
• Solution:
• C.V (Sample 1) = (10/145) × 100 = 6.9
• C.V (Sample 2) = (10/80) × 100 = 12.5
• Then the weights of the 11-year-olds (Sample 2) are more variable relative to their mean.
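The comparison can be written as a one-line function. The name `coefficient_of_variation` is chosen for this sketch and is not taken from the textbook.

```python
def coefficient_of_variation(sd, mean):
    """C.V = (S / x-bar) * 100, a unit-free percentage."""
    return sd / mean * 100

cv_sample1 = coefficient_of_variation(10, 145)   # 25-year-olds
cv_sample2 = coefficient_of_variation(10, 80)    # 11-year-olds
print(round(cv_sample1, 1), round(cv_sample2, 1))
```

Both samples have the same standard deviation (10 pounds), so the standard deviation alone cannot separate them. Dividing by the mean shows that the same spread is relatively larger in the sample with the smaller mean.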
Exercises
• Pages: 52 – 53
• Questions: 2.5.1, 2.5.2, 2.5.3
• H.W.: 2.5.4, 2.5.5, 2.5.6, 2.5.14
• * You can also solve the review questions on page 57:
• Q: 12, 13, 14, 15, 16, 19
Chapter 3
Probability
The Basis of Statistical Inference
Key words:
Probability, objective probability, subjective probability, equally likely, mutually exclusive, multiplicative rule, conditional probability, independent events, Bayes theorem
3.1 Introduction
The concept of probability is frequently encountered in everyday communication. For example, a physician may say that a patient has a 50-50 chance of surviving a certain operation. Another physician may say that she is 95 percent certain that a patient has a particular disease.
Most people express probabilities in terms of percentages.
But it is more convenient to express probabilities as fractions. Thus, we may measure the probability of the occurrence of some event by a number between 0 and 1.
The more likely the event, the closer the number is to one. An event that cannot occur has a probability of zero, and an event that is certain to occur has a probability of one.
3.2 Two views of Probability: objective and subjective
*** Objective Probability
** Classical and Relative
Some definitions:
1. Equally likely outcomes: outcomes that have the same chance of occurring.
2. Mutually exclusive: two events A and B are said to be mutually exclusive if they cannot occur simultaneously, i.e. A ∩ B = Φ.
The universal set (S): the set of all possible outcomes.
The empty set Φ: contains no elements.
The event E: a set of outcomes in S which has a certain characteristic.
Classical Probability: If an event can occur in N mutually exclusive and equally likely ways, and if m of these possess a trait E, the probability of the occurrence of event E is equal to m/N.
For example: in the rolling of a die, each of the six sides is equally likely to be observed. So, the probability that a 4 will be observed is equal to 1/6.
Relative Frequency Probability:
Def: If some process is repeated a large number of times, n, and if some resulting event E occurs m times, the relative frequency of occurrence of E, m/n, will be approximately equal to the probability of E: P(E) ≈ m/n.
*** Subjective Probability:
Probability measures the confidence that a particular individual has in the truth of a particular proposition.
For example: the probability that a cure for cancer will be discovered within the next 10 years.
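The two objective views can be compared with a small simulation: the classical probability of observing a 4 is 1/6, and the relative frequency m/n from many simulated rolls should come out close to it. This is a sketch under assumptions: the die is simulated with Python's pseudo-random generator, and the function name, the seed and the number of rolls are arbitrary choices for illustration.

```python
import random

def relative_frequency_of_four(n_rolls, seed=0):
    """Simulate n_rolls of a fair die and return m / n for the event 'a 4 is observed'."""
    rng = random.Random(seed)    # fixed seed so the run is repeatable
    m = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 4)
    return m / n_rolls

classical = 1 / 6                                 # m / N: 1 favourable side out of 6
estimate = relative_frequency_of_four(60_000)     # m / n from repeated trials
print(classical, estimate)
```

The estimate changes with the seed and with n. Increasing n tends to bring m/n closer to 1/6, which is why the definition requires a large number of repetitions.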
3.3 Elementary Properties of Probability:
Given some process (or experiment) with n mutually exclusive events $E_1, E_2, E_3, \dots, E_n$, then
1- $P(E_i) \ge 0$, i = 1, 2, 3, …, n
2- $P(E_1) + P(E_2) + \dots + P(E_n) = 1$ (the n events together cover all possible outcomes)
3- $P(E_i + E_j) = P(E_i) + P(E_j)$, where $E_i$, $E_j$ are mutually exclusive.
Rules of Probability
1- Addition Rule:
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
2- If A and B are mutually exclusive (disjoint), then P(A ∩ B) = 0, and the addition rule becomes
P(A ∪ B) = P(A) + P(B).
3- Complementary Rule:
P(A') = 1 – P(A), where A' = $\bar{A}$ = the complement of event A.
Consider Example 3.4.1 Page 63.
Table 3.4.1 in Example 3.4.1

Family history of Mood Disorders    Early ≤ 18 (E)    Later > 18 (L)    Total
Negative (A)                        28                35                63
Bipolar Disorder (B)                19                38                57
Unipolar (C)                        41                44                85
Unipolar and Bipolar (D)            53                60                113
Total                               141               177               318
** Answer the following questions:
Suppose we pick a person at random from this sample.
1- What is the probability that this person will be 18 years old or younger?
2- What is the probability that this person has a family history of mood disorders Unipolar (C)?
3- What is the probability that this person has no family history of mood disorders Unipolar ($\bar{C}$)?
4- What is the probability that this person is 18 years old or younger or has no family history of mood disorders, Negative (A)?
5- What is the probability that this person is more than 18 years old and has a family history of mood disorders Unipolar and Bipolar (D)?
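One way to work through these five questions is to store the counts of Table 3.4.1 and apply the marginal, complementary, addition and joint-probability ideas directly. The sketch below uses exact fractions. The dictionary layout and the helper names `p_history`, `p_age` and `p_joint` are assumptions made for this example.

```python
from fractions import Fraction

# Counts from Table 3.4.1: rows are family history (A-D),
# columns are the age groups E (18 or younger) and L (over 18).
counts = {
    "A": {"E": 28, "L": 35},   # negative
    "B": {"E": 19, "L": 38},   # bipolar disorder
    "C": {"E": 41, "L": 44},   # unipolar
    "D": {"E": 53, "L": 60},   # unipolar and bipolar
}
total = sum(sum(row.values()) for row in counts.values())   # 318 subjects

def p_history(h):
    """Marginal probability of a family-history row."""
    return Fraction(sum(counts[h].values()), total)

def p_age(a):
    """Marginal probability of an age-group column."""
    return Fraction(sum(row[a] for row in counts.values()), total)

def p_joint(h, a):
    """Joint probability of one cell of the table."""
    return Fraction(counts[h][a], total)

p1 = p_age("E")                                        # 1) 18 or younger
p2 = p_history("C")                                    # 2) unipolar history
p3 = 1 - p_history("C")                                # 3) complementary rule
p4 = p_age("E") + p_history("A") - p_joint("A", "E")   # 4) addition rule
p5 = p_joint("D", "L")                                 # 5) joint probability
for p in (p1, p2, p3, p4, p5):
    print(p, round(float(p), 4))
```

In question 4 the cell count 28 is subtracted once because those subjects are counted in both the E column total and the A row total.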
Conditional ProbabilityConditional Probability::
P(A\B) is the probability of A assuming P(A\B) is the probability of A assuming that B has happened.that B has happened.
P(A\B)= , P(B)≠ 0P(A\B)= , P(B)≠ 0
P(B\A)= , P(A)≠ 0P(B\A)= , P(A)≠ 0
)()(
BPBAP
)()(
APBAP
Example 3.4.2 Page 64
From the previous example, 3.4.1 (page 63), answer:
- Suppose we pick a person at random and find he is 18 years or younger (E). What is the probability that this person will be one who has no family history of mood disorders (A)?
- Suppose we pick a person at random and find he has a family history of unipolar and bipolar mood disorders (D). What is the probability that this person will be 18 years or younger (E)?
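As a hedged illustration of Example 3.4.2, the sketch below computes the two conditional probabilities from the counts of Table 3.4.1. The function name `conditional` is hypothetical; because both probabilities share the denominator 318, it cancels and only the counts are needed.

```python
from fractions import Fraction

# Counts taken from Table 3.4.1.
n_E = 141        # early onset (18 or younger)
n_A_and_E = 28   # negative family history and early onset
n_D = 113        # unipolar and bipolar family history
n_E_and_D = 53   # early onset and unipolar-and-bipolar history


def conditional(n_joint, n_given):
    """P(X | Y) = P(X and Y) / P(Y); the common denominator cancels."""
    if n_given == 0:
        raise ValueError("the conditioning event must have nonzero probability")
    return Fraction(n_joint, n_given)


p_A_given_E = conditional(n_A_and_E, n_E)   # 28/141
p_E_given_D = conditional(n_E_and_D, n_D)   # 53/113
print(round(float(p_A_given_E), 4), round(float(p_E_given_D), 4))
```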
Calculating a joint Probability:
Example 3.4.3 Page 64
Suppose we pick a person at random from the 318 subjects. Find the probability that he will be early (E) and has no family history of mood disorders (A).
Multiplicative Rule:
P(A∩B) = P(A|B)P(B)
P(A∩B) = P(B|A)P(A)
Where,
P(A): marginal probability of A.
P(B): marginal probability of B.
P(B|A): the conditional probability of B given A.
Example 3.4.4 Page 65
From the previous example, 3.4.1 (page 63), we wish to compute the joint probability of early age at onset (E) and a negative family history of mood disorders (A) from a knowledge of an appropriate marginal probability and an appropriate conditional probability.
Exercise: Example 3.4.5, page 66
Exercise: Example 3.4.6, page 67
Independent Events:
If A has no effect on B, we say that A and B are independent events.
Then,
1- P(A∩B) = P(B)P(A)
2- P(A|B) = P(A)
3- P(B|A) = P(B)
Example 3.4.7 Page 68
In a certain high school class consisting of 60 girls and 40 boys, it is observed that 24 girls and 16 boys wear eyeglasses. If a student is picked at random from this class, the probability that the student wears eyeglasses, P(E), is 40/100 or 0.4.
What is the probability that a student picked at random wears eyeglasses given that the student is a boy?
What is the probability of the joint occurrence of the events of wearing eyeglasses and being a boy?
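A short sketch of Example 3.4.7, assuming only the counts stated in the example. Variable names are illustrative. It checks the independence conditions by comparing P(E|B) with P(E) and P(E∩B) with P(E)P(B).

```python
from fractions import Fraction

girls, boys = 60, 40
girls_glasses, boys_glasses = 24, 16
total = girls + boys

p_E = Fraction(girls_glasses + boys_glasses, total)  # P(E): wears eyeglasses
p_B = Fraction(boys, total)                          # P(B): student is a boy
p_E_and_B = Fraction(boys_glasses, total)            # P(E and B)
p_E_given_B = p_E_and_B / p_B                        # P(E | B)

# E and B are independent when the joint probability equals the product
# of the marginal probabilities.
independent = (p_E_and_B == p_E * p_B)
print(float(p_E_given_B), float(p_E_and_B), independent)
```

In this class the proportion wearing eyeglasses is the same among boys as in the class overall, so the two events come out independent.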
Example 3.4.8 Page 69
Suppose that of 1200 admissions to a general hospital during a certain period of time, 750 are private admissions. If we designate these as a set A, then compute P(A) and P($\bar{A}$).
Exercise: Example 3.4.9, page 76
Marginal Probability:
Definition:
Given some variable that can be broken down into m categories designated by $A_1, A_2, \ldots, A_i, \ldots, A_m$ and another jointly occurring variable that is broken down into n categories designated by $B_1, B_2, \ldots, B_j, \ldots, B_n$, the marginal probability of $A_i$, $P(A_i)$, is equal to the sum of the joint probabilities of $A_i$ with all the categories of B. That is,
$P(A_i) = \sum_{j} P(A_i \cap B_j)$, for all values of j.
Example 3.4.9 Page 76
Use the data of Table 3.4.1 and the rule of marginal probabilities to calculate P(E).
Exercise: Pages 76-77
Questions: 3.4.1, 3.4.3, 3.4.4
H.W.: 3.4.5, 3.4.7

Bayes' Theorem
Pages 79-83
Definition.1
The sensitivity of the symptom
This is the probability of a positive result given that the subject has the disease. It is denoted by $P(T \mid D)$.
Definition.2
The specificity of the symptom
This is the probability of a negative result given that the subject does not have the disease. It is denoted by $P(\bar{T} \mid \bar{D})$.
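Definition.3
The predictive value positive of the symptom
This is the probability that a subject has the disease given that the subject has a positive screening test result. It is calculated using Bayes' theorem through the following formula
$P(D \mid T) = \dfrac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid \bar{D})\,P(\bar{D})}$
where $P(T \mid \bar{D}) = 1 - P(\bar{T} \mid \bar{D})$ and $P(\bar{D}) = 1 - P(D)$.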
Definition.4
The predictive value negative of the symptom
This is the probability that a subject does not have the disease given that the subject has a negative screening test result. It is calculated using Bayes' theorem through the following formula
$P(\bar{D} \mid \bar{T}) = \dfrac{P(\bar{T} \mid \bar{D})\,P(\bar{D})}{P(\bar{T} \mid \bar{D})\,P(\bar{D}) + P(\bar{T} \mid D)\,P(D)}$
where $P(\bar{T} \mid D) = 1 - P(T \mid D)$.
Example 3.5.1 page 82
A medical research team wished to evaluate a proposed screening test for Alzheimer’s disease. The test was given to a random sample of 450 patients with Alzheimer’s disease and an independent random sample of 500 patients without symptoms of the disease. The two samples were drawn from populations of subjects who were 65 years or older. The results are as follows.
Alzheimer's disease present?

Test result             Yes (D)   No ($\bar{D}$)   Total
Positive (T)              436          5            441
Negative ($\bar{T}$)       14        495            509
Total                     450        500            950
In the context of this example:
a) What is a false positive?
A false positive is when the test indicates a positive result (T) when the person does not have the disease ($\bar{D}$).
b) What is a false negative?
A false negative is when the test indicates a negative result ($\bar{T}$) when the person has the disease (D).
c) Compute the sensitivity of the symptom.
$P(T \mid D) = \dfrac{436}{450} = 0.9689$
d) Compute the specificity of the symptom.
$P(\bar{T} \mid \bar{D}) = \dfrac{495}{500} = 0.99$
e) Suppose it is known that the rate of the disease in the general population is 11.3%. What are the predictive value positive and the predictive value negative of the symptom?
The predictive value positive of the symptom is calculated as
$P(D \mid T) = \dfrac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid \bar{D})\,P(\bar{D})} = \dfrac{(0.9689)(0.113)}{(0.9689)(0.113) + (0.01)(1 - 0.113)} = 0.925$
The predictive value negative of the symptom is calculated as
$P(\bar{D} \mid \bar{T}) = \dfrac{P(\bar{T} \mid \bar{D})\,P(\bar{D})}{P(\bar{T} \mid \bar{D})\,P(\bar{D}) + P(\bar{T} \mid D)\,P(D)} = \dfrac{(0.99)(0.887)}{(0.99)(0.887) + (0.0311)(0.113)} = 0.996$
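The calculations of Example 3.5.1 can be reproduced with a few lines of Python. This is a sketch under the assumptions stated in the example (counts from the table, prevalence 11.3%); the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (predictive value positive, predictive value negative)
    computed with Bayes' theorem."""
    p_d, p_not_d = prevalence, 1 - prevalence
    false_pos = 1 - specificity   # P(T | not D)
    false_neg = 1 - sensitivity   # P(not T | D)
    pv_pos = sensitivity * p_d / (sensitivity * p_d + false_pos * p_not_d)
    pv_neg = specificity * p_not_d / (specificity * p_not_d + false_neg * p_d)
    return pv_pos, pv_neg


sensitivity = 436 / 450   # P(T | D)
specificity = 495 / 500   # P(not T | not D)
pv_pos, pv_neg = predictive_values(sensitivity, specificity, prevalence=0.113)
print(round(sensitivity, 4), round(specificity, 2))
print(round(pv_pos, 3), round(pv_neg, 3))
```

Changing the `prevalence` argument shows how strongly the predictive value positive depends on the rate of the disease in the population.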
Exercise: Page 83
Questions: 3.5.1, 3.5.2
H.W.: Page 87: Q4, Q5, Q7, Q9, Q21

Chapter 4: Probabilistic Features of Certain Data Distributions
Pages 93-111

Key words
Probability distribution, random variable, Bernoulli distribution, Binomial distribution, Poisson distribution
The Random Variable (X):
When the values of a variable (height, weight, or age) can’t be predicted in advance, the variable is called a random variable.
An example is the adult height.
When a child is born, we can’t predict exactly his or her height at maturity.
4.2 Probability Distributions for Discrete Random Variables
Definition:The probability distribution of a discrete random variable is a table, graph, formula, or other device used to specify all possible values of a discrete random variable along with their respective probabilities.
The Cumulative Probability Distribution of X, F(x):
It shows the probability that the variable X is less than or equal to a certain value, $P(X \le x)$.
Example 4.2.1 page 94:

Number of programs (x)   Frequency   P(X = x)   F(x) = P(X ≤ x)
1                            62       0.2088        0.2088
2                            47       0.1582        0.3670
3                            39       0.1313        0.4983
4                            39       0.1313        0.6296
5                            58       0.1953        0.8249
6                            37       0.1246        0.9495
7                             4       0.0135        0.9630
8                            11       0.0370        1.0000
Total                       297       1.0000
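The probability and cumulative columns of the table can be rebuilt from the frequency column alone. A minimal sketch, with illustrative variable names:

```python
from itertools import accumulate

# Frequencies from the table in Example 4.2.1 (number of programs -> families).
freq = {1: 62, 2: 47, 3: 39, 4: 39, 5: 58, 6: 37, 7: 4, 8: 11}
total = sum(freq.values())  # 297 families

# Relative frequency gives P(X = x).
pmf = {x: f / total for x, f in freq.items()}

# Running totals of the counts divided by the total give F(x) = P(X <= x).
cdf = dict(zip(freq, (c / total for c in accumulate(freq.values()))))

for x in freq:
    print(x, freq[x], f"{pmf[x]:.4f}", f"{cdf[x]:.4f}")
```

Accumulating the integer counts before dividing avoids the small rounding drift that comes from adding the four-decimal probabilities.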
See figure 4.2.1 page 96
See figure 4.2.2 page 97
Properties of the probability distribution of a discrete random variable:
1. $0 \le P(X = x) \le 1$
2. $\sum P(X = x) = 1$
3. $P(a \le X \le b) = P(X \le b) - P(X \le a-1)$
4. $P(X < b) = P(X \le b-1)$
Example 4.2.2 page 96: (use table in example 4.2.1)
What is the probability that a randomly selected family will be one who used three assistance programs?
Example 4.2.3 page 96: (use table in example 4.2.1)
What is the probability that a randomly selected family used either one or two programs?
Example 4.2.4 page 98: (use table in example 4.2.1)
What is the probability that a family picked at random will be one who used two or fewer assistance programs?
Example 4.2.5 page 98: (use table in example 4.2.1)
What is the probability that a randomly selected family will be one who used fewer than four programs?
Example 4.2.6 page 98: (use table in example 4.2.1)
What is the probability that a randomly selected family used five or more programs?
Example 4.2.7 page 98: (use table in example 4.2.1)
What is the probability that a randomly selected family is one who used between three and five programs, inclusive?
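Examples 4.2.2 to 4.2.7 all reduce to differences of the cumulative probabilities F(x) in the table of Example 4.2.1. The sketch below assumes those tabulated four-decimal values; the helper `prob_between` is a hypothetical name.

```python
# Cumulative probabilities F(x) = P(X <= x) from the table in Example 4.2.1.
# F[0] = 0 is added so that the rule below also works when a = 1.
F = {0: 0.0, 1: 0.2088, 2: 0.3670, 3: 0.4983, 4: 0.6296,
     5: 0.8249, 6: 0.9495, 7: 0.9630, 8: 1.0000}


def prob_between(a, b):
    """P(a <= X <= b) = F(b) - F(a - 1) for integers a <= b."""
    return F[b] - F[a - 1]


results = {
    "4.2.2": prob_between(3, 3),   # exactly three programs
    "4.2.3": prob_between(1, 2),   # one or two programs
    "4.2.4": F[2],                 # two or fewer
    "4.2.5": F[3],                 # fewer than four: P(X < 4) = P(X <= 3)
    "4.2.6": 1 - F[4],             # five or more
    "4.2.7": prob_between(3, 5),   # between three and five, inclusive
}
for example, p in results.items():
    print(example, round(p, 4))
```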
4.3 The Binomial Distribution:
The binomial distribution is one of the most widely encountered probability distributions in applied statistics. It is derived from a process known as a Bernoulli trial.
Bernoulli trial:
When a random process or experiment, called a trial, can result in only one of two mutually exclusive outcomes, such as dead or alive, sick or well, the trial is called a Bernoulli trial.
The Bernoulli Process
A sequence of Bernoulli trials forms a Bernoulli process under the following conditions:
1- Each trial results in one of two possible, mutually exclusive, outcomes. One of the possible outcomes is denoted (arbitrarily) as a success, and the other is denoted a failure.
2- The probability of a success, denoted by p, remains constant from trial to trial. The probability of a failure, 1-p, is denoted by q.
3- The trials are independent, that is the outcome of any particular trial is not affected by the outcome of any other trial
The probability distribution of the binomial random variable X, the number of successes in n independent trials, is:
$f(x) = P(X = x) = \binom{n}{x} p^{x} q^{\,n-x}, \quad x = 0, 1, 2, \ldots, n$
where $\binom{n}{x} = \dfrac{n!}{x!\,(n-x)!}$ is the number of combinations of n distinct objects taken x of them at a time, and
$x! = x(x-1)(x-2)\cdots(1)$
* Note: 0! = 1
Properties of the binomial distribution
1. $f(x) \ge 0$
2. $\sum f(x) = 1$
3. The parameters of the binomial distribution are n and p
4. $\mu = E(X) = np$
5. $\sigma^2 = \mathrm{var}(X) = np(1-p)$
Example 4.3.1 page 100
If we examine all birth records from the North Carolina State Center for Health Statistics for the year 2001, we find that 85.8 percent of the pregnancies had delivery in week 37 or later (full-term birth).
If we randomly select five birth records from this population, what is the probability that exactly three of the records will be for full-term births?
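For Example 4.3.1 the binomial probability function can be evaluated directly with n = 5, p = 0.858 and x = 3. A minimal sketch using the standard library; `binomial_pmf` is an illustrative name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


# Exactly three full-term births among five records, p = 0.858.
p_exactly_three = binomial_pmf(3, n=5, p=0.858)
print(round(p_exactly_three, 4))

# The probabilities over x = 0, ..., n should add up to one.
total_probability = sum(binomial_pmf(x, 5, 0.858) for x in range(6))
```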
Exercise: example 4.3.2 page 104
Example 4.3.3 page 104
Suppose it is known that in a certain population 10 percent of the population is color blind. If a random sample of 25 people is drawn from this population, find the probability that
a) Five or fewer will be color blind.
b) Six or more will be color blind.
c) Between six and nine inclusive will be color blind.
d) Two, three, or four will be color blind.
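The four parts of Example 4.3.3 are usually read from a cumulative binomial table. The sketch below computes the same cumulative probabilities by summation instead, assuming n = 25 and p = 0.10 as stated; function names are illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def binomial_cdf(x, n, p):
    """P(X <= x), summing the probability function from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 25, 0.10
part_a = binomial_cdf(5, n, p)                          # five or fewer
part_b = 1 - binomial_cdf(5, n, p)                      # six or more
part_c = binomial_cdf(9, n, p) - binomial_cdf(5, n, p)  # six through nine
part_d = binomial_cdf(4, n, p) - binomial_cdf(1, n, p)  # two, three, or four
print(round(part_a, 4), round(part_b, 4), round(part_c, 4), round(part_d, 4))
```

Each part uses the rule P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a−1) stated earlier for discrete distributions.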
Exercise: example 4.3.4 page 106
4.4 The Poisson Distribution
If the random variable X is the number of occurrences of some random event in a certain period of time or space (or some volume of matter), the probability distribution of X is given by:
$f(x) = P(X = x) = \dfrac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, \ldots$
The symbol e is the constant equal to 2.7183. λ (lambda) is called the parameter of the distribution and is the average number of occurrences of the random event in the interval (or volume).
Properties of the Poisson distribution
1. $f(x) \ge 0$
2. $\sum f(x) = 1$
3. $\mu = E(X) = \lambda$
4. $\sigma^2 = \mathrm{var}(X) = \lambda$
Example 4.4.1 page 111
In a study of drug-induced anaphylaxis among patients taking rocuronium bromide as part of their anesthesia, Laake and Rottingen found that the occurrence of anaphylaxis followed a Poisson model with λ = 12 incidents per year in Norway. Find:
1- The probability that in the next year, among patients receiving rocuronium, exactly three will experience anaphylaxis.
2- The probability that fewer than two patients receiving rocuronium in the next year will experience anaphylaxis.
3- The probability that more than two patients receiving rocuronium in the next year will experience anaphylaxis.
4- The expected number of patients receiving rocuronium in the next year who will experience anaphylaxis.
5- The variance of the number of patients receiving rocuronium in the next year who will experience anaphylaxis.
6- The standard deviation of the number of patients receiving rocuronium in the next year who will experience anaphylaxis.
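The six parts of Example 4.4.1 follow from the Poisson probability function with λ = 12. A hedged sketch using only the standard library; `poisson_pmf` is an illustrative name.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)


lam = 12  # average number of incidents per year, as given in the example

p_exactly_3 = poisson_pmf(3, lam)
p_fewer_than_2 = poisson_pmf(0, lam) + poisson_pmf(1, lam)
p_more_than_2 = 1 - sum(poisson_pmf(k, lam) for k in range(3))  # 1 - P(X <= 2)
mean = lam             # E(X) equals lambda
variance = lam         # var(X) equals lambda
std_dev = sqrt(lam)

print(f"{p_exactly_3:.5f} {p_fewer_than_2:.7f} {p_more_than_2:.5f}")
print(mean, variance, round(std_dev, 4))
```

With an average of 12 incidents per year, small counts such as 0, 1 or 2 are very unlikely, which is why the third probability is close to one.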
Example 4.4.2 page 111: Refer to example 4.4.1
1- What is the probability that at least three patients in the next year will experience anaphylaxis if rocuronium is administered with anesthesia?
2- What is the probability that exactly one patient in the next year will experience anaphylaxis if rocuronium is administered with anesthesia?
3- What is the probability that none of the patients in the next year will experience anaphylaxis if rocuronium is administered with anesthesia?
4- What is the probability that at most two patients in the next year will experience anaphylaxis if rocuronium is administered with anesthesia?
Exercises: examples 4.4.3, 4.4.4 and 4.4.5, pages 111-113
Exercises: Questions 4.3.4, 4.3.5, 4.3.7, 4.4.1, 4.4.5
4.5 Continuous Probability Distribution
Pages 114 – 127

• Key words: Continuous random variable, normal distribution, standard normal distribution, T-distribution
• Now consider distributions of continuous random variables.
Properties of continuous probability distributions:
1- Area under the curve = 1.
2- P(X = a) = 0, where a is a constant.
3- Area between two points a, b = P(a < x < b).
4.6 The normal distribution:
• It is one of the most important probability distributions in statistics.
• The normal density is given by
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$
• π, e: constants
• µ: population mean.
• σ: population standard deviation.
Characteristics of the normal distribution: Page 111
• The following are some important characteristics of the normal distribution:
1- It is symmetrical about its mean, µ.
2- The mean, the median, and the mode are all equal.
3- The total area under the curve above the x-axis is one.
4- The normal distribution is completely determined by the parameters µ and σ.
5- The normal distribution depends on the two parameters µ and σ. µ determines the location of the curve (as seen in figure 4.6.3), but σ determines the scale of the curve, i.e. the degree of flatness or peakedness of the curve (as seen in figure 4.6.4).
[Figure 4.6.3: three normal curves with means µ1 < µ2 < µ3]
[Figure 4.6.4: three normal curves with standard deviations σ1 < σ2 < σ3]
Note that: (As seen in Figure 4.6.2)
1. P(µ − σ < x < µ + σ) = 0.68
2. P(µ − 2σ < x < µ + 2σ) = 0.95
3. P(µ − 3σ < x < µ + 3σ) = 0.997
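The three areas above can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; because the areas do not depend on µ and σ, the standard normal distribution is enough. The helper name `within` is illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1


def within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X."""
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(k, round(within(k), 4))
```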
The Standard normal distribution:
• It is a special case of the normal distribution with a mean of 0 and a standard deviation of 1.
• The equation for the standard normal distribution is written as
$f(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty$
Characteristics of the standard normal distribution
1- It is symmetrical about 0.
2- The total area under the curve above the x-axis is one.
3- We can use table (D) to find the probabilities and areas.
"How to use tables of Z"
Note that the cumulative probabilities P(Z ≤ z) are given in tables for -3.49 < z < 3.49. Thus, P(-3.49 < Z < 3.49) ≈ 1.
For the standard normal distribution, P(Z > 0) = P(Z < 0) = 0.5
Example 4.6.1:
If Z is a standard normal variable, then
1) P(Z < 2) is the area to the left of 2, and it equals 0.9772.
Example 4.6.2:
P(-2.55 < Z < 2.55) is the area between -2.55 and 2.55. Then it equals
P(-2.55 < Z < 2.55) = 0.9946 – 0.0054 = 0.9892.
Example 4.6.2:
P(-2.74 < Z < 1.53) is the area between -2.74 and 1.53.
P(-2.74 < Z < 1.53) = 0.9370 – 0.0031 = 0.9339.
Example 4.6.3:
P(Z > 2.71) is the area to the right of 2.71. So,
P(Z > 2.71) = 1 – 0.9966 = 0.0034.
Example:
P(Z = 0.84) is the area at the single point z = 0.84. Since Z is a continuous random variable,
P(Z = 0.84) = 0.
How to transform normal distribution (X) to standard normal distribution (Z)?
• This is done by the following formula:
$z = \dfrac{x - \mu}{\sigma}$
• Example:
• If X is normal with µ = 3, σ = 2, find the value of standard normal Z if X = 6.
• Answer:
$z = \dfrac{x - \mu}{\sigma} = \dfrac{6 - 3}{2} = 1.5$
4.7 Normal Distribution Applications
The normal distribution can be used to model the distribution of many variables that are of interest. This allows us to answer probability questions about these random variables.
Example 4.7.1:
The 'Uptime' is a custom-made lightweight battery-operated activity monitor that records the amount of time an individual spends in the upright position. In a study of children ages 8 to 15 years, the researchers found that the amount of time children spend in the upright position followed a normal distribution with a mean of 5.4 hours and a standard deviation of 1.3 hours. Find:
If a child is selected at random, then:
1- The probability that the child spends less than 3 hours in the upright position in a 24-hour period:
$P(X < 3) = P\left(\dfrac{X - \mu}{\sigma} < \dfrac{3 - 5.4}{1.3}\right) = P(Z < -1.85) = 0.0322$
2- The probability that the child spends more than 5 hours in the upright position in a 24-hour period:
$P(X > 5) = P\left(\dfrac{X - \mu}{\sigma} > \dfrac{5 - 5.4}{1.3}\right) = P(Z > -0.31) = 1 - P(Z < -0.31) = 1 - 0.3783 = 0.6217$
3- The probability that the child spends exactly 6.2 hours in the upright position in a 24-hour period:
$P(X = 6.2) = 0$
4- The probability that the child spends from 4.5 to 7.3 hours in the upright position in a 24-hour period:
$P(4.5 < X < 7.3) = P\left(\dfrac{4.5 - 5.4}{1.3} < \dfrac{X - \mu}{\sigma} < \dfrac{7.3 - 5.4}{1.3}\right) = P(-0.69 < Z < 1.46)$
$= P(Z < 1.46) - P(Z < -0.69) = 0.9279 - 0.2451 = 0.6828$
• H.W.: Ex. 4.7.2 – 4.7.3
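The probabilities of Example 4.7.1 can also be computed without a Z table. This sketch assumes the stated mean of 5.4 hours and standard deviation of 1.3 hours and uses `statistics.NormalDist`; the results may differ in the third decimal place from table-based answers, because the table method rounds z to two decimals first.

```python
from statistics import NormalDist

# Time in the upright position, modelled as normal(5.4, 1.3).
upright = NormalDist(mu=5.4, sigma=1.3)

p_less_than_3 = upright.cdf(3)                     # P(X < 3)
p_more_than_5 = 1 - upright.cdf(5)                 # P(X > 5)
p_exactly_6_2 = 0.0                                # a single point has zero area
p_4_5_to_7_3 = upright.cdf(7.3) - upright.cdf(4.5)  # P(4.5 < X < 7.3)

print(round(p_less_than_3, 4), round(p_more_than_5, 4), round(p_4_5_to_7_3, 4))
```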
6.3 The T Distribution: (pages 167-173)
1- It has a mean of zero.
2- It is symmetric about the mean.
3- It ranges from −∞ to ∞.
4- Compared to the normal distribution, the t distribution is less peaked in the center and has higher tails.
5- It depends on the degrees of freedom (n-1).
6- The t distribution approaches the standard normal distribution as (n-1) approaches ∞.
Examples
t (7, 0.975) = 2.3646
------------------------------
t (24, 0.995) = 2.7969
------------------------------
If P (T(18) > t) = 0.975, then t = -2.1009
------------------------------
If P (T(22) < t) = 0.99, then t = 2.508
• Exercise:
• Questions: 4.7.1, 4.7.2
• H.W.: 4.7.3, 4.7.4, 4.7.6

Chapter 6
Using sample data to make estimates about population parameters (P162-172)

Key words:
Point estimate, interval estimate, estimator, confidence level, α, confidence interval for mean μ, confidence interval for two means, confidence interval for population proportion P, confidence interval for two proportions
6.1 Introduction:
Statistical inference is the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample drawn from that population.
Suppose that an administrator of a large hospital is interested in the mean age of patients admitted to his hospital during a given year.
1. It will be too expensive to go through the records of all patients admitted during that particular year.
2. He consequently elects to examine a sample of the records from which he can compute an estimate of the mean age of patients admitted to his hospital that year.
• For any parameter, we can compute two types of estimate: a point estimate and an interval estimate.
A point estimate is a single numerical value used to estimate the corresponding population parameter.
An interval estimate consists of two numerical values defining a range of values that, with a specified degree of confidence, we feel includes the parameter being estimated.
The Estimate and The Estimator:
The estimate is a single computed value, but the estimator is the rule that tells us how to compute this value, or estimate.
For example, $\bar{x} = \dfrac{\sum_i x_i}{n}$ is an estimator of the population mean, µ. The single numerical value that results from evaluating this formula is called an estimate of the parameter µ.
6.2 Confidence Interval for a Population Mean: (C.I)
Suppose researchers wish to estimate the mean of some normally distributed population.
They draw a random sample of size n from the population and compute $\bar{x}$, which they use as a point estimate of µ.
Because random sampling involves chance, $\bar{x}$ can't be expected to be equal to µ.
The value of $\bar{x}$ may be greater than or less than µ.
It would be much more meaningful to estimate µ by an interval.
The (1 − α)100 percent confidence interval (C.I.) for µ:
We want to find two values L and U between which µ lies with high probability, i.e.
P(L ≤ µ ≤ U) = 1 − α
For example: when α = 0.01, then 1 − α = 0.99; when α = 0.05, then 1 − α = 0.95.
We have the following cases:
a) When the population is normal
1) When the variance is known and the sample size is large or small, the C.I. has the form:
$P\left(\bar{x} - Z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
2) When the variance is unknown and the sample size is small, the C.I. has the form:
$P\left(\bar{x} - t_{(1-\alpha/2),\,n-1}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_{(1-\alpha/2),\,n-1}\,\dfrac{s}{\sqrt{n}}\right) = 1 - \alpha$
b) When the population is not normal and n is large (n > 30)
1) When the variance is known, the C.I. has the form:
$P\left(\bar{x} - Z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
2) When the variance is unknown, the C.I. has the form:
$P\left(\bar{x} - Z_{1-\alpha/2}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + Z_{1-\alpha/2}\,\dfrac{s}{\sqrt{n}}\right) = 1 - \alpha$
Example 6.2.1 Page 167:
Suppose a researcher, interested in obtaining an estimate of the average level of some enzyme in a certain human population, takes a sample of 10 individuals, determines the level of the enzyme in each, and computes a sample mean of approximately $\bar{x} = 22$.
Suppose further it is known that the variable of interest is approximately normally distributed with a variance of 45. We wish to estimate µ. (α = 0.05)
Solution:
1 − α = 0.95 → α = 0.05 → α/2 = 0.025
variance = σ² = 45 → σ = √45, n = 10, $\bar{x} = 22$
The 95% confidence interval for µ is given by:
$P\left(\bar{x} - Z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
$Z_{1-\alpha/2} = Z_{0.975} = 1.96$ (refer to table D)
$Z_{0.975}\,\dfrac{\sigma}{\sqrt{n}} = 1.96\,\dfrac{\sqrt{45}}{\sqrt{10}} = 4.1578$
22 ± 4.1578 → (22 − 4.1578, 22 + 4.1578) → (17.84, 26.16)
Exercise: example 6.2.2 page 169
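The solution of Example 6.2.1 can be sketched in Python as follows. The function name `z_interval` is hypothetical, and the critical value is obtained from `statistics.NormalDist().inv_cdf` rather than from table D, so it is 1.95996 instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for a mean when the population variance is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_(1 - alpha/2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


lower, upper = z_interval(xbar=22, sigma=sqrt(45), n=10, confidence=0.95)
print(round(lower, 2), round(upper, 2))
```

The same function covers the large-sample case with unknown variance if the sample standard deviation s is passed in place of σ.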
Example
The activity values of a certain enzyme measured in normal gastric tissue of 35 patients with gastric carcinoma have a mean of 0.718 and a standard deviation of 0.511. We want to construct a 90% confidence interval for the population mean.
Solution:
Note that the population is not normal, n = 35 (n > 30) so n is large, and σ is unknown, s = 0.511.
1 − α = 0.90 → α = 0.1 → α/2 = 0.05 → 1 − α/2 = 0.95
Then the 90% confidence interval for µ is given by:
$P\left(\bar{x} - Z_{1-\alpha/2}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + Z_{1-\alpha/2}\,\dfrac{s}{\sqrt{n}}\right) = 1 - \alpha$
$Z_{1-\alpha/2} = Z_{0.95} = 1.645$ (refer to table D)
$Z_{0.95}\,\dfrac{s}{\sqrt{n}} = 1.645\,\dfrac{0.511}{\sqrt{35}} = 0.1421$
0.718 ± 0.1421 → (0.718 − 0.1421, 0.718 + 0.1421) → (0.576, 0.860)
Exercise: example 6.2.3 page 164
Example 6.3.1 Page 174:
Suppose a researcher studied the effectiveness of early weight bearing and ankle therapies following acute repair of a ruptured Achilles tendon. One of the variables measured following treatment was the muscle strength. In 19 subjects, the mean of the strength was 250.8 with a standard deviation of 130.9.
We assume that the sample was taken from an approximately normally distributed population. Calculate the 95% confidence interval for the mean of the strength.
Solution:
1 − α = 0.95 → α = 0.05 → α/2 = 0.025
Standard deviation = S = 130.9, n = 19, $\bar{x} = 250.8$
The 95% confidence interval for µ is given by:
$P\left(\bar{x} - t_{(1-\alpha/2),\,n-1}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_{(1-\alpha/2),\,n-1}\,\dfrac{s}{\sqrt{n}}\right) = 1 - \alpha$
$t_{(1-\alpha/2),\,n-1} = t_{0.975,\,18} = 2.1009$ (refer to table E)
$t_{0.975,\,18}\,\dfrac{s}{\sqrt{n}} = 2.1009\,\dfrac{130.9}{\sqrt{19}} = 63.1$
250.8 ± 63.1 → (250.8 − 63.1, 250.8 + 63.1) → (187.7, 313.9)
Exercise: 6.2.1, 6.2.2, 6.3.2 page 171
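A minimal sketch of the t-based interval in Example 6.3.1. The Python standard library has no t quantile function, so the critical value 2.1009 for 18 degrees of freedom is passed in as read from table E; the function name `t_interval` is illustrative.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for a mean with unknown variance.

    t_crit is the tabulated value t_(1 - alpha/2), n - 1.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


lower, upper = t_interval(xbar=250.8, s=130.9, n=19, t_crit=2.1009)
print(round(lower, 1), round(upper, 1))
```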
6.4 Confidence Interval for the difference between two Population Means: (C.I)
If we draw two samples from two independent populations and we want the confidence interval for the difference between the two population means, then we have the following cases:
a) When the populations are normal
1) When the variances are known and the sample sizes are large or small, the C.I. has the form:
$(\bar{x}_1 - \bar{x}_2) - Z_{1-\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{1-\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
2) When the variances are unknown but equal, and the sample sizes are small, the C.I. has the form:
$(\bar{x}_1 - \bar{x}_2) - t_{(1-\alpha/2),\,(n_1+n_2-2)}\,S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{(1-\alpha/2),\,(n_1+n_2-2)}\,S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
where
$S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
3) When the variances are unknown and the sample sizes are large, the sample variances $S_1^2$ and $S_2^2$ replace $\sigma_1^2$ and $\sigma_2^2$, and the C.I. has the form:
$(\bar{x}_1 - \bar{x}_2) - Z_{1-\alpha/2}\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{1-\alpha/2}\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}$
Text Book : Basic Concepts and Methodology for the Health Sciences
142
Example 6.4.1, page 174:
A research team is interested in the difference between serum uric acid levels in patients with and without Down’s syndrome. In a large hospital for the treatment of the mentally retarded, a sample of 12 individuals with Down’s syndrome yielded a mean of $\bar{x}_1 = 4.5$ mg/100 ml. In a general hospital, a sample of 15 normal individuals of the same age and sex were found to have a mean value of $\bar{x}_2 = 3.4$ mg/100 ml. If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1 and 1.5, find the 95% C.I. for μ1 - μ2.
Solution:
1 - α = 0.95 → α = 0.05 → α/2 = 0.025 → $Z_{1-\alpha/2} = Z_{0.975} = 1.96$
$$(\bar{x}_1-\bar{x}_2) \pm Z_{1-\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}} = (4.5-3.4) \pm 1.96\sqrt{\frac{1}{12}+\frac{1.5}{15}}$$
$= 1.1 \pm 1.96\,(0.4282) = 1.1 \pm 0.84$ → (0.26, 1.94)
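As a check on Example 6.4.1, here is a minimal Python sketch of the known-variance interval. The function name `z_interval_diff_means` is an illustrative choice, and the critical value comes from the standard library's `NormalDist` instead of table D, so it is 1.95996 rather than the rounded 1.96.

```python
import math
from statistics import NormalDist

def z_interval_diff_means(x1, x2, var1, var2, n1, n2, conf_level):
    """C.I. for mu1 - mu2 when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # about 1.96 for 95%
    se = math.sqrt(var1 / n1 + var2 / n2)               # standard error of x1 - x2
    diff = x1 - x2
    return diff - z * se, diff + z * se

# Down's syndrome sample (n=12, variance 1) vs. normal sample (n=15, variance 1.5)
lower, upper = z_interval_diff_means(4.5, 3.4, 1.0, 1.5, 12, 15, 0.95)
print(f"({lower:.2f}, {upper:.2f})")
```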
Example 6.4.1, page 178:
The purpose of the study was to determine the effectiveness of an integrated outpatient dual-diagnosis treatment program for mentally ill subjects. The authors were addressing the problem of substance abuse issues among people with severe mental disorders. A retrospective chart review was carried out on 50 patients; the researchers were interested in the number of inpatient treatment days for psychiatric disorder during the year following the end of the program. Among 18 patients with schizophrenia, the mean number of treatment days was 4.7 with a standard deviation of 9.3. For 10 subjects with bipolar disorder, the mean number of treatment days was 8.8 with a standard deviation of 11.5. We wish to construct a 99% C.I. for the difference between the means of the populations represented by the two samples.
Solution:
1 - α = 0.99 → α = 0.01 → α/2 = 0.005 → 1 - α/2 = 0.995
n1 + n2 - 2 = 18 + 10 - 2 = 26
$t_{1-\alpha/2,\,(n_1+n_2-2)} = t_{0.995,\,26} = 2.7787$, then the 99% C.I. for μ1 - μ2 is
$$(\bar{x}_1-\bar{x}_2) \pm t_{1-\alpha/2,\,(n_1+n_2-2)}\,S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$$
where
$$S_p^2 = \frac{(n_1-1)S_1^2+(n_2-1)S_2^2}{n_1+n_2-2} = \frac{(17)(9.3)^2+(9)(11.5)^2}{18+10-2} = 102.33$$
then
$$(4.7-8.8) \pm 2.7787\,\sqrt{102.33}\,\sqrt{\frac{1}{18}+\frac{1}{10}} = -4.1 \pm 11.086 = (-15.186,\ 6.986)$$
Exercises: 6.4.2, 6.4.6, 6.4.7, 6.4.8, page 180
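The pooled-variance interval can be reproduced with a short Python sketch. The helper names `pooled_variance` and `pooled_t_interval` are illustrative, and the critical value 2.7787 is taken from table E as in the solution.

```python
import math

def pooled_variance(s1, s2, n1, n2):
    """Pooled estimate S_p^2 of the common variance."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def pooled_t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """C.I. for mu1 - mu2 with unknown but equal variances."""
    sp2 = pooled_variance(s1, s2, n1, n2)
    margin = t_crit * math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Schizophrenia group (n=18) vs. bipolar group (n=10); t_{0.995,26} from table E.
sp2 = pooled_variance(9.3, 11.5, 18, 10)
lower, upper = pooled_t_interval(4.7, 8.8, 9.3, 11.5, 18, 10, t_crit=2.7787)
print(f"Sp^2 = {sp2:.2f}, C.I. = ({lower:.3f}, {upper:.3f})")
```

Because the interval contains 0, these data do not indicate a difference between the two population means at the 99% level.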
6.5 Confidence Interval for a Population Proportion (P):
A sample is drawn from the population of interest, and the sample proportion $\hat{P}$ is computed as
$$\hat{P} = \frac{\text{no. of elements in the sample with some characteristic}}{\text{total no. of elements in the sample}} = \frac{a}{n}$$
This sample proportion is used as the point estimator of the population proportion. A confidence interval is obtained by the following formula:
$$\hat{P} \pm Z_{1-\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}}$$
Example 6.5.1
The Pew Internet and American Life Project reported in 2003 that 18% of internet users have used the internet to search for information regarding experimental treatments or medicine. The sample consisted of 1220 adult internet users, and information was collected from telephone interviews. We wish to construct a 98% C.I. for the proportion of internet users who have searched for information about experimental treatments or medicine.
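Solution:
1 - α = 0.98 → α = 0.02 → α/2 = 0.01 → 1 - α/2 = 0.99
$Z_{1-\alpha/2} = Z_{0.99} = 2.33$, n = 1220, $\hat{p} = \frac{18}{100} = 0.18$
The 98% C.I. is
$$\hat{P} \pm Z_{1-\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} = 0.18 \pm 2.33\sqrt{\frac{0.18(1-0.18)}{1220}}$$
$= 0.18 \pm 0.0256$ = (0.1544, 0.2056)
Exercises: 6.5.1, 6.5.3, page 187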
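A minimal Python sketch of the same calculation follows. The function name `proportion_interval` is illustrative, and `z_crit` is the table D value used in the solution.

```python
import math

def proportion_interval(p_hat, n, z_crit):
    """C.I. for a population proportion: p_hat ± z * sqrt(p_hat(1 - p_hat)/n)."""
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 18% of 1220 sampled internet users; Z_{0.99} = 2.33 from table D.
lower, upper = proportion_interval(p_hat=0.18, n=1220, z_crit=2.33)
print(f"({lower:.4f}, {upper:.4f})")
```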
6.6 Confidence Interval for the Difference between Two Population Proportions:
Two samples are drawn from two independent populations of interest, and the sample proportion is computed for each sample for the characteristic of interest. An unbiased point estimator for the difference between the two population proportions is $\hat{P}_1 - \hat{P}_2$.
A 100(1-α)% confidence interval for P1 - P2 is given by
$$(\hat{P}_1-\hat{P}_2) \pm Z_{1-\alpha/2}\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1}+\frac{\hat{P}_2(1-\hat{P}_2)}{n_2}}$$
Example 6.6.1
Connor investigated gender differences in proactive and reactive aggression in a sample of 323 adults (68 females and 255 males). In the sample, 31 of the females and 53 of the males were using the internet in the internet café. We wish to construct a 99% confidence interval for the difference between the proportions of adults who go to the internet café in the two sampled populations.
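Solution:
1 - α = 0.99 → α = 0.01 → α/2 = 0.005 → 1 - α/2 = 0.995
$Z_{1-\alpha/2} = Z_{0.995} = 2.58$, $n_F = 68$, $n_M = 255$,
$$\hat{p}_F = \frac{a_F}{n_F} = \frac{31}{68} = 0.4559, \qquad \hat{p}_M = \frac{a_M}{n_M} = \frac{53}{255} = 0.2078$$
The 99% C.I. is
$$(\hat{P}_F-\hat{P}_M) \pm Z_{1-\alpha/2}\sqrt{\frac{\hat{P}_F(1-\hat{P}_F)}{n_F}+\frac{\hat{P}_M(1-\hat{P}_M)}{n_M}}$$
$$= (0.4559-0.2078) \pm 2.58\sqrt{\frac{0.4559(1-0.4559)}{68}+\frac{0.2078(1-0.2078)}{255}}$$
$= 0.2481 \pm 2.58\,(0.0655)$ = (0.0791, 0.4171)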
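The two-proportion interval can be checked in the same way. The function name `diff_proportions_interval` is illustrative, and the sketch works from the raw counts, so its endpoints differ from the rounded hand calculation in the fourth decimal place.

```python
import math

def diff_proportions_interval(a1, n1, a2, n2, z_crit):
    """C.I. for P1 - P2 from counts a1 of n1 and a2 of n2."""
    p1, p2 = a1 / n1, a2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# 31 of 68 females and 53 of 255 males; Z_{0.995} = 2.58 from table D.
lower, upper = diff_proportions_interval(31, 68, 53, 255, z_crit=2.58)
print(f"({lower:.4f}, {upper:.4f})")
```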
Exercises: Questions 6.2.1, 6.2.2, 6.2.5, 6.3.2, 6.3.5, 6.4.2, 6.5.3, 6.5.4, 6.6.1
Chapter 7
Using Sample Statistics to Test Hypotheses about Population Parameters
Pages 215-233
Key words :
Null hypothesis $H_0$, alternative hypothesis $H_A$, testing hypothesis, test statistic, P-value
Hypothesis Testing
One type of statistical inference, estimation, was discussed in Chapter 6.
The other type, hypothesis testing, is discussed in this chapter.
Definition of a hypothesis
It is a statement about one or more populations. It is usually concerned with the parameters of the population, e.g. the hospital administrator may want to test the hypothesis that the average length of stay of patients admitted to the hospital is 5 days.
Definition of statistical hypotheses
They are hypotheses that are stated in such a way that they may be evaluated by appropriate statistical techniques.
There are two hypotheses involved in hypothesis testing:
Null hypothesis $H_0$: It is the hypothesis to be tested.
Alternative hypothesis $H_A$: It is a statement of what we believe is true if our sample data cause us to reject the null hypothesis.
7.2 Testing a Hypothesis about the Mean of a Population:
We have the following steps:
1. Data: determine the variable, sample size (n), sample mean ($\bar{x}$), and population standard deviation (σ), or sample standard deviation (s) if σ is unknown.
2. Assumptions: We have two cases:
Case 1: The population is normally or approximately normally distributed with known or unknown variance (sample size n may be small or large).
Case 2: The population is not normal with known or unknown variance (n is large, i.e. n ≥ 30).
3. Hypotheses: we have three cases
Case I: $H_0$: μ = μ0
$H_A$: μ ≠ μ0
e.g. we want to test that the population mean is different from 50
Case II: $H_0$: μ = μ0
$H_A$: μ > μ0
e.g. we want to test that the population mean is greater than 50
Case III: $H_0$: μ = μ0
$H_A$: μ < μ0
e.g. we want to test that the population mean is less than 50
4. Test Statistic:
Case 1: The population is normal or approximately normal
- σ² is known (n large or small):
$$Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}$$
- σ² is unknown, n large:
$$Z = \frac{\bar{X}-\mu_0}{s/\sqrt{n}}$$
- σ² is unknown, n small:
$$T = \frac{\bar{X}-\mu_0}{s/\sqrt{n}}$$
Case 2: The population is not normally distributed and n is large
i) If σ² is known:
$$Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}$$
ii) If σ² is unknown:
$$Z = \frac{\bar{X}-\mu_0}{s/\sqrt{n}}$$
5. Decision Rule:
i) If $H_A$: μ ≠ μ0
Reject $H_0$ if $Z > Z_{1-\alpha/2}$ or $Z < -Z_{1-\alpha/2}$ (when using the Z-test)
or
Reject $H_0$ if $T > t_{1-\alpha/2,\,n-1}$ or $T < -t_{1-\alpha/2,\,n-1}$ (when using the T-test)
ii) If $H_A$: μ > μ0
Reject $H_0$ if $Z > Z_{1-\alpha}$ (when using the Z-test)
or
Reject $H_0$ if $T > t_{1-\alpha,\,n-1}$ (when using the T-test)
iii) If $H_A$: μ < μ0
Reject $H_0$ if $Z < -Z_{1-\alpha}$ (when using the Z-test)
or
Reject $H_0$ if $T < -t_{1-\alpha,\,n-1}$ (when using the T-test)
Note:
$Z_{1-\alpha/2}$, $Z_{1-\alpha}$, $Z_{\alpha}$ are tabulated values obtained from table D.
$t_{1-\alpha/2}$, $t_{1-\alpha}$, $t_{\alpha}$ are tabulated values obtained from table E with (n-1) degrees of freedom (df).
6. Decision:
If we reject $H_0$, we can conclude that $H_A$ is true.
If, however, we do not reject $H_0$, we may conclude that $H_0$ may be true; the data do not provide sufficient evidence against it.
An Alternative Decision Rule Using the p-value
Definition: The p-value is defined as the smallest value of α for which the null hypothesis can be rejected.
If the p-value is less than or equal to α, we reject the null hypothesis (p ≤ α).
If the p-value is greater than α, we do not reject the null hypothesis (p > α).
Example 7.2.1, page 223
Researchers are interested in the mean age of a certain population.
A random sample of 10 individuals drawn from the population of interest has a mean of 27.
Assuming that the population is approximately normally distributed with variance 20, can we conclude that the mean is different from 30 years? (α = 0.05)
If the p-value is 0.0340, how can we use it in making a decision?
Solution
1. Data: variable is age, n = 10, $\bar{x} = 27$, σ² = 20, α = 0.05
2. Assumptions: the population is approximately normally distributed with variance 20
3. Hypotheses:
$H_0$: μ = 30
$H_A$: μ ≠ 30
4. Test Statistic:
$$Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} = \frac{27-30}{\sqrt{20/10}} = -2.12$$
5. Decision Rule:
The alternative hypothesis is $H_A$: μ ≠ 30.
Hence we reject $H_0$ if $Z > Z_{1-0.05/2} = Z_{0.975}$ or $Z < -Z_{1-0.05/2} = -Z_{0.975}$
$Z_{0.975} = 1.96$ (from table D)
6. Decision:
We reject $H_0$, since -2.12 is in the rejection region (-2.12 < -1.96).
We can conclude that μ is not equal to 30.
Using the p-value, we note that p-value = 0.0340 < 0.05, therefore we reject $H_0$.
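The test statistic and the two-sided p-value of Example 7.2.1 can be reproduced with a minimal Python sketch. The function name `z_test_one_mean` is illustrative, and the p-value comes from the standard library's `NormalDist` rather than table D, so it agrees with 0.0340 only up to rounding.

```python
import math
from statistics import NormalDist

def z_test_one_mean(xbar, mu0, sigma2, n):
    """Z statistic and two-sided p-value for H0: mu = mu0 (sigma^2 known)."""
    z = (xbar - mu0) / math.sqrt(sigma2 / n)
    p_two_sided = 2 * NormalDist().cdf(-abs(z))  # area in both tails
    return z, p_two_sided

alpha = 0.05
z, p = z_test_one_mean(xbar=27, mu0=30, sigma2=20, n=10)
reject_h0 = p <= alpha  # p-value decision rule
print(f"Z = {z:.2f}, p = {p:.4f}, reject H0: {reject_h0}")
```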
Example 7.2.2, page 227
Referring to Example 7.2.1, suppose that the researchers have asked: Can we conclude that μ < 30?
1. Data: see previous example
2. Assumptions: see previous example
3. Hypotheses:
$H_0$: μ = 30
$H_A$: μ < 30
4. Test Statistic:
$$Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} = \frac{27-30}{\sqrt{20/10}} = -2.12$$
5. Decision Rule: Reject $H_0$ if $Z < Z_{\alpha}$, where $Z_{\alpha} = Z_{0.05} = -1.645$ (from table D)
6. Decision: Reject $H_0$, since -2.12 < -1.645; thus we can conclude that the population mean is smaller than 30.
Example 7.2.4, page 232
Among 157 African-American men, the mean systolic blood pressure was 146 mm Hg with a standard deviation of 27. We wish to know if, on the basis of these data, we may conclude that the mean systolic blood pressure for a population of African-American men is greater than 140. Use α = 0.01.
Solution
1. Data: Variable is systolic blood pressure, n = 157, $\bar{x} = 146$, s = 27, α = 0.01.
2. Assumption: the population is not normal, σ² is unknown, and n is large
3. Hypotheses:
$H_0$: μ = 140
$H_A$: μ > 140
4. Test Statistic:
$$Z = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} = \frac{146-140}{27/\sqrt{157}} = \frac{6}{2.1548} = 2.78$$
5. Decision Rule: we reject $H_0$ if $Z > Z_{1-\alpha} = Z_{0.99} = 2.33$ (from table D)
6. Decision: We reject $H_0$, since 2.78 > 2.33.
Hence we may conclude that the mean systolic blood pressure for a population of African-American men is greater than 140.
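The critical-value form of the decision rule can be sketched in Python as follows. The function name `z_stat_large_sample` is illustrative, and the critical value is computed with `NormalDist().inv_cdf` as a stand-in for looking up table D.

```python
import math
from statistics import NormalDist

def z_stat_large_sample(xbar, mu0, s, n):
    """Large-sample Z statistic using the sample standard deviation s."""
    return (xbar - mu0) / (s / math.sqrt(n))

alpha = 0.01
z = z_stat_large_sample(xbar=146, mu0=140, s=27, n=157)
z_crit = NormalDist().inv_cdf(1 - alpha)  # about 2.326; table D rounds to 2.33
reject_h0 = z > z_crit                    # one-sided rule for HA: mu > mu0
print(f"Z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```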
7.3 Hypothesis Testing: The Difference between Two Population Means:
We have the following steps:
1. Data: determine the variable, sample sizes (n1, n2), sample means, and population standard deviations, or sample standard deviations (s) if σ is unknown, for the two populations.
2. Assumptions: We have two cases:
Case 1: The populations are normally or approximately normally distributed with known or unknown variances (sample sizes may be small or large).
Case 2: The populations are not normal with known variances (n is large, i.e. n ≥ 30).
3. Hypotheses: we have three cases
Case I: $H_0$: μ1 = μ2 → μ1 - μ2 = 0
$H_A$: μ1 ≠ μ2 → μ1 - μ2 ≠ 0
e.g. we want to test that the mean of the first population is different from the mean of the second population.
Case II: $H_0$: μ1 = μ2 → μ1 - μ2 = 0
$H_A$: μ1 > μ2 → μ1 - μ2 > 0
e.g. we want to test that the mean of the first population is greater than the mean of the second population.
Case III: $H_0$: μ1 = μ2 → μ1 - μ2 = 0
$H_A$: μ1 < μ2 → μ1 - μ2 < 0
e.g. we want to test that the mean of the first population is less than the mean of the second population.
4. Test Statistic:
Case 1: The two populations are normal or approximately normal
- σ² is known (n1, n2 large or small):
$$Z = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$$
- σ² is unknown (n1, n2 small), population variances equal:
$$T = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$
- σ² is unknown (n1, n2 small), population variances not equal:
$$T = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}}$$
where
$$S_p^2 = \frac{(n_1-1)S_1^2+(n_2-1)S_2^2}{n_1+n_2-2}$$
Case 2: If the populations are not normally distributed, n1 and n2 are large (n1 ≥ 30, n2 ≥ 30), and the population variances are known:
$$Z = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$$
5. Decision Rule:
i) If $H_A$: μ1 ≠ μ2 → μ1 - μ2 ≠ 0
Reject $H_0$ if $Z > Z_{1-\alpha/2}$ or $Z < -Z_{1-\alpha/2}$ (when using the Z-test)
or
Reject $H_0$ if $T > t_{1-\alpha/2,\,(n_1+n_2-2)}$ or $T < -t_{1-\alpha/2,\,(n_1+n_2-2)}$ (when using the T-test)
ii) If $H_A$: μ1 > μ2 → μ1 - μ2 > 0
Reject $H_0$ if $Z > Z_{1-\alpha}$ (when using the Z-test)
or
Reject $H_0$ if $T > t_{1-\alpha,\,(n_1+n_2-2)}$ (when using the T-test)
iii) If $H_A$: μ1 < μ2 → μ1 - μ2 < 0
Reject $H_0$ if $Z < -Z_{1-\alpha}$ (when using the Z-test)
or
Reject $H_0$ if $T < -t_{1-\alpha,\,(n_1+n_2-2)}$ (when using the T-test)
Note:
$Z_{1-\alpha/2}$, $Z_{1-\alpha}$, $Z_{\alpha}$ are tabulated values obtained from table D.
$t_{1-\alpha/2}$, $t_{1-\alpha}$, $t_{\alpha}$ are tabulated values obtained from table E with (n1 + n2 - 2) degrees of freedom (df).
6. Conclusion: reject or fail to reject $H_0$
Example 7.3.1, page 238
Researchers wish to know if the data they have collected provide sufficient evidence to indicate a difference in mean serum uric acid levels between normal individuals and individuals with Down’s syndrome. The data consist of serum uric acid readings on 12 individuals with Down’s syndrome from a normal distribution with variance 1 and 15 normal individuals from a normal distribution with variance 1.5. The means are $\bar{X}_1 = 4.5$ mg/100 ml and $\bar{X}_2 = 3.4$ mg/100 ml, and α = 0.05.
Solution:
1. Data: Variable is serum uric acid level, n1 = 12, n2 = 15, $\sigma_1^2 = 1$, $\sigma_2^2 = 1.5$, α = 0.05.
2. Assumption: The two populations are normal, and $\sigma_1^2$, $\sigma_2^2$ are known
3. Hypotheses:
$H_0$: μ1 = μ2 → μ1 - μ2 = 0
$H_A$: μ1 ≠ μ2 → μ1 - μ2 ≠ 0
4. Test Statistic:
$$Z = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}} = \frac{(4.5-3.4)-(0)}{\sqrt{\frac{1}{12}+\frac{1.5}{15}}} = 2.57$$
5. Decision Rule: Reject $H_0$ if $Z > Z_{1-\alpha/2}$ or $Z < -Z_{1-\alpha/2}$
$Z_{1-\alpha/2} = Z_{1-0.05/2} = Z_{0.975} = 1.96$ (from table D)
6. Conclusion: Reject $H_0$ since 2.57 > 1.96.
Or, using the p-value: p-value = 0.0102 < α = 0.05, so we reject $H_0$.
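For Example 7.3.1, a minimal Python sketch of the two-sample Z test with known variances is given below. The function name `z_test_two_means` is illustrative. The two-sided p-value it computes is about 0.010, which supports rejecting $H_0$ at α = 0.05.

```python
import math
from statistics import NormalDist

def z_test_two_means(x1, x2, var1, var2, n1, n2):
    """Z statistic and two-sided p-value for H0: mu1 - mu2 = 0 (variances known)."""
    z = (x1 - x2) / math.sqrt(var1 / n1 + var2 / n2)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

alpha = 0.05
z, p = z_test_two_means(4.5, 3.4, 1.0, 1.5, 12, 15)
print(f"Z = {z:.2f}, p = {p:.4f}, reject H0: {p <= alpha}")
```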
Example 7.3.2, page 240
The purpose of a study by Tam was to investigate wheelchair maneuvering in individuals with lower-level spinal cord injury (SCI) and healthy controls (C). Subjects used a wheelchair modified to incorporate a rigid seat surface to facilitate the specified experimental measurements. The data for measurements of the left ischial tuberosity (the thigh bones and how they are affected by the wheelchair) for SCI and control C are shown below.

C:   131  115  124  131  122  117   88  114  150  169
SCI:  60  150  130  180  163  130  121  119  130  143
We wish to know if we can conclude, on the basis of the above data, that the mean of the left ischial tuberosity measurements for control C is lower than the mean for SCI. Assume normal populations with equal variances. α = 0.05, p-value > 0.10.
Solution:
1. Data: $n_C = 10$, $n_{SCI} = 10$, $S_C = 21.8$, $S_{SCI} = 32.2$, α = 0.05, $\bar{X}_C = 126.1$, $\bar{X}_{SCI} = 133.1$ (calculated from data)
2. Assumption: The two populations are normal, and $\sigma_1^2$, $\sigma_2^2$ are unknown but equal
3. Hypotheses:
$H_0$: $\mu_C = \mu_{SCI}$ → $\mu_C - \mu_{SCI} = 0$
$H_A$: $\mu_C < \mu_{SCI}$ → $\mu_C - \mu_{SCI} < 0$
4. Test Statistic:
$$T = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} = \frac{(126.1-133.1)-0}{\sqrt{756.04}\,\sqrt{\frac{1}{10}+\frac{1}{10}}} = -0.569$$
where
$$S_p^2 = \frac{(n_1-1)S_1^2+(n_2-1)S_2^2}{n_1+n_2-2} = \frac{9(21.8)^2+9(32.2)^2}{10+10-2} = 756.04$$
5. Decision Rule: Reject $H_0$ if $T < -t_{1-\alpha,\,(n_1+n_2-2)}$
$t_{1-\alpha,\,(n_1+n_2-2)} = t_{0.95,\,18} = 1.7341$ (from table E)
6. Conclusion: Fail to reject $H_0$ since -0.569 > -1.7341.
Or, fail to reject $H_0$ since p-value > 0.10 > α = 0.05.
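The pooled-variance T statistic of Example 7.3.2 can be checked with a short Python sketch. The function name `pooled_t_stat` is illustrative. The sketch assumes $S_{SCI} = 32.2$, the value that reproduces the pooled variance 756.04 used in the solution, and takes the critical value from table E.

```python
import math

def pooled_t_stat(x1, x2, s1, s2, n1, n2):
    """Pooled-variance T statistic for H0: mu1 - mu2 = 0; returns (T, Sp^2)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (x1 - x2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, sp2

# Control (C) vs. SCI summary statistics; S_SCI = 32.2 is an assumption (see above).
t_stat, sp2 = pooled_t_stat(126.1, 133.1, 21.8, 32.2, 10, 10)
t_crit = 1.7341               # t_{0.95,18} from table E
reject_h0 = t_stat < -t_crit  # one-sided rule for HA: mu_C < mu_SCI
print(f"Sp^2 = {sp2:.2f}, T = {t_stat:.3f}, reject H0: {reject_h0}")
```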
Example 7.3.3, page 241
Dernellis and Panaretou examined subjects with hypertension and healthy control subjects. One of the variables of interest was the aortic stiffness index. Measures of this variable were calculated from the aortic diameter evaluated by M-mode and blood pressure measured by a sphygmomanometer. Physicians wish to reduce aortic stiffness. In the 15 patients with hypertension (Group 1), the mean aortic stiffness index was 19.16 with a standard deviation of 5.29. In the 30 control subjects (Group 2), the mean aortic stiffness index was 9.53 with a standard deviation of 2.69. We wish to determine if the two populations represented by these samples differ with respect to mean stiffness index. We wish to know if we can conclude that, in general, persons with thrombosis have, on average, higher IgG levels than persons without thrombosis, at α = 0.01 (p-value = 0.0559).
Solution:
1. Data: n1 = 53, n2 = 54, S1 = 44.89, S2 = 34.85, α = 0.01.

Group           Mean IgG level   Sample size   Standard deviation
Thrombosis      59.01            53            44.89
No thrombosis   46.61            54            34.85

2. Assumption: The two populations are not normal, $\sigma_1^2$, $\sigma_2^2$ are unknown, and the sample sizes are large
3. Hypotheses:
$H_0$: μ1 = μ2 → μ1 - μ2 = 0
$H_A$: μ1 > μ2 → μ1 - μ2 > 0
4. Test Statistic:
$$Z = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}} = \frac{(59.01-46.61)-0}{\sqrt{\frac{44.89^2}{53}+\frac{34.85^2}{54}}} = 1.59$$
5. Decision Rule: Reject $H_0$ if $Z > Z_{1-\alpha}$
$Z_{1-\alpha} = Z_{0.99} = 2.33$ (from table D)
6. Conclusion: Fail to reject $H_0$ since 1.59 < 2.33.
Or, fail to reject $H_0$ since p = 0.0559 > α = 0.01.
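The large-sample Z statistic for the thrombosis data can be reproduced with a minimal Python sketch. The function name `z_test_large_samples` is illustrative, and the upper-tail p-value is computed from the unrounded statistic, so it differs slightly from the table-based 0.0559.

```python
import math
from statistics import NormalDist

def z_test_large_samples(x1, x2, s1, s2, n1, n2):
    """Z statistic and upper-tail p-value for HA: mu1 > mu2 (large samples)."""
    z = (x1 - x2) / math.sqrt(s1**2 / n1 + s2**2 / n2)
    p_upper = 1 - NormalDist().cdf(z)
    return z, p_upper

alpha = 0.01
z, p = z_test_large_samples(59.01, 46.61, 44.89, 34.85, 53, 54)
print(f"Z = {z:.2f}, p = {p:.4f}, reject H0: {p <= alpha}")
```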
7.5 Hypothesis Testing: A Single Population Proportion:
Testing hypotheses about a population proportion (P) is carried out in much the same way as for a mean when the conditions necessary for using the normal curve are met.
We have the following steps:
1. Data: sample size (n), sample proportion ($\hat{p}$), $P_0$
$$\hat{p} = \frac{\text{no. of elements in the sample with some characteristic}}{\text{total no. of elements in the sample}} = \frac{a}{n}$$
2. Assumptions: $\hat{p}$ is approximately normally distributed
3. Hypotheses: we have three cases
Case I: $H_0$: P = $P_0$
$H_A$: P ≠ $P_0$
Case II: $H_0$: P = $P_0$
$H_A$: P > $P_0$
Case III: $H_0$: P = $P_0$
$H_A$: P < $P_0$
4. Test Statistic:
$$Z = \frac{\hat{p}-p_0}{\sqrt{\frac{p_0 q_0}{n}}}, \qquad q_0 = 1-p_0$$
which, when $H_0$ is true, is distributed approximately as the standard normal.
Text Book : Basic Concepts and MeText Book : Basic Concepts and Methodology for the Health Sciences thodology for the Health Sciences
190190
5. Decision Rule:
   i) If HA: P ≠ P0, reject H0 if Z > Z1-α/2 or Z < -Z1-α/2
   ii) If HA: P > P0, reject H0 if Z > Z1-α
   iii) If HA: P < P0, reject H0 if Z < -Z1-α
   Note: Z1-α/2, Z1-α, Zα are tabulated values obtained from Table D
6. Conclusion: reject or fail to reject H0
Example 7.5.1 page 259
Wagen collected data on a sample of 301 Hispanic women living in Texas. One variable of interest was the percentage of subjects with impaired fasting glucose (IFG). In the study, 24 women were classified in the IFG stage. The article cites population estimates for IFG among Hispanic women in Texas as 6.3 percent. Is there sufficient evidence to indicate that the population of Hispanic women in Texas has a prevalence of IFG higher than 6.3 percent? Let α = 0.05.
Solution:
1. Data: n = 301, p0 = 6.3/100 = 0.063, a = 24, q0 = 1 - p0 = 1 - 0.063 = 0.937, α = 0.05

$$\hat{p} = \frac{a}{n} = \frac{24}{301} = 0.08$$

2. Assumptions: p̂ is approximately normally distributed
3. Hypotheses: H0: P = 0.063
               HA: P > 0.063
4. Test Statistic:

$$Z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \frac{0.08 - 0.063}{\sqrt{\dfrac{0.063(0.937)}{301}}} = 1.21$$

5. Decision Rule: Reject H0 if Z > Z1-α, where Z1-α = Z1-0.05 = Z0.95 = 1.645
6. Conclusion: Fail to reject H0 since Z = 1.21 < Z1-α = 1.645
   Or: P-value = 0.1131 > α, so we fail to reject H0
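A short sketch of the same one-sample proportion test in Python follows. The helper name `one_proportion_z` is an assumption made for illustration. Note that the code keeps p̂ = 24/301 unrounded, so its Z is slightly smaller than the 1.21 obtained above with p̂ rounded to 0.08; the decision is unchanged.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Z statistic for H0: P = p0, using the normal approximation (hypothetical helper)."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # uses p0 and q0 = 1 - p0, as in step 4
    return (p_hat - p0) / standard_error


# Data from Example 7.5.1: a = 24 women with IFG out of n = 301, p0 = 0.063.
z = one_proportion_z(24, 301, 0.063)

alpha = 0.05
critical_value = NormalDist().inv_cdf(1 - alpha)  # Z_{0.95}, about 1.645
p_value = 1 - NormalDist().cdf(z)                 # upper tail, because HA: P > p0

print(f"Z = {z:.3f}, critical value = {critical_value:.3f}, p = {p_value:.4f}")
print("Reject H0" if z > critical_value else "Fail to reject H0")
```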
7.6 Hypothesis Testing: The Difference between two population proportions:
Testing hypotheses about two population proportions (P1, P2) is carried out in much the same way as for the difference between two means when the conditions necessary for using the normal curve are met.
We have the following steps:
1. Data: sample sizes (n1 and n2), sample proportions (p̂1, p̂2), number with the characteristic in the two samples (x1, x2), and the pooled proportion

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

2. Assumption: The two populations are independent.
3. Hypotheses: we have three cases
   Case I:   H0: P1 = P2 → P1 - P2 = 0
             HA: P1 ≠ P2 → P1 - P2 ≠ 0
   Case II:  H0: P1 = P2 → P1 - P2 = 0
             HA: P1 > P2 → P1 - P2 > 0
   Case III: H0: P1 = P2 → P1 - P2 = 0
             HA: P1 < P2 → P1 - P2 < 0
4. Test Statistic:

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n_1} + \dfrac{\bar{p}(1-\bar{p})}{n_2}}}$$

When H0 is true, Z is distributed approximately as the standard normal.
5. Decision Rule:
   i) If HA: P1 ≠ P2, reject H0 if Z > Z1-α/2 or Z < -Z1-α/2
   ii) If HA: P1 > P2, reject H0 if Z > Z1-α
   iii) If HA: P1 < P2, reject H0 if Z < -Z1-α
   Note: Z1-α/2, Z1-α, Zα are tabulated values obtained from Table D
6. Conclusion: reject or fail to reject H0
Example 7.6.1 page 262
Noonan syndrome is a genetic condition that can affect the heart, growth, blood clotting, and mental and physical development. Noonan examined the stature of men and women with Noonan syndrome. The study contained 29 male and 44 female adults. One of the cut-off values used to assess stature was the third percentile of adult height. Eleven of the males fell below the third percentile of adult male height, while 24 of the females fell below the third percentile of female adult height. Does this study provide sufficient evidence for us to conclude that among subjects with Noonan syndrome, females are more likely than males to fall below the respective third percentile of adult height? Let α = 0.05.
Solution:
1. Data: nM = 29, nF = 44, xM = 11, xF = 24, α = 0.05

$$\hat{p}_M = \frac{x_M}{n_M} = \frac{11}{29} = 0.379, \qquad \hat{p}_F = \frac{x_F}{n_F} = \frac{24}{44} = 0.545$$

$$\bar{p} = \frac{x_M + x_F}{n_M + n_F} = \frac{11 + 24}{29 + 44} = 0.479$$
2. Assumption: The two populations are independent.
3. Hypotheses: Case II: H0: PF = PM → PF - PM = 0
                        HA: PF > PM → PF - PM > 0
4. Test Statistic:

$$Z = \frac{(\hat{p}_F - \hat{p}_M) - (p_F - p_M)}{\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n_F} + \dfrac{\bar{p}(1-\bar{p})}{n_M}}} = \frac{(0.545 - 0.379) - 0}{\sqrt{\dfrac{(0.479)(0.521)}{44} + \dfrac{(0.479)(0.521)}{29}}} = 1.39$$

5. Decision Rule: Reject H0 if Z > Z1-α, where Z1-α = Z1-0.05 = Z0.95 = 1.645
6. Conclusion: Fail to reject H0 since Z = 1.39 < Z1-α = 1.645
   Or: P-value = 0.0823 > α, so we fail to reject H0
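The two-proportion test of Example 7.6.1 can be reproduced with a few lines of Python. This is only a sketch under the pooled-proportion formula used in the example; the function name `two_proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: P1 = P2 (hypothetical helper)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    standard_error = sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)
    return (p1_hat - p2_hat) / standard_error


# Example 7.6.1: group 1 = females (24 of 44), group 2 = males (11 of 29).
z = two_proportion_z(24, 44, 11, 29)

alpha = 0.05
critical_value = NormalDist().inv_cdf(1 - alpha)  # Z_{0.95}, about 1.645
p_value = 1 - NormalDist().cdf(z)                 # upper tail, because HA: PF > PM

print(f"Z = {z:.2f}, critical value = {critical_value:.3f}, p = {p_value:.4f}")
print("Reject H0" if z > critical_value else "Fail to reject H0")
```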
Exercises:
Questions: Page 234-237: 7.2.1, 7.8.2, 7.3.1, 7.3.6, 7.5.2, 7.6.1
H.W: 7.2.8, 7.2.9, 7.2.11, 7.2.15, 7.3.7, 7.3.8, 7.3.10, 7.5.3, 7.6.4
Chapter 9
Statistical Inference and The Relationship between two variables
Prepared By: Dr. Shuhrat Khan
REGRESSION, CORRELATION, ANALYSIS OF VARIANCE
• Regression, Correlation and Analysis of Covariance are all statistical techniques that use the idea that one variable may be related to one or more variables through an equation. Here we consider the relationship of two variables only in a linear form, which is called linear regression and linear correlation, or simple regression and correlation. The relationships between more than two variables, called multiple regression and correlation, will be considered later.
• Simple regression uses the relationship between the two variables to obtain information about one variable by knowing the values of the other. The equation showing this type of relationship is called the simple linear regression equation. The related method of correlation is used to measure how strong the relationship between the two variables is.
EQUATION OF REGRESSION
Line of Regression
• Simple Linear Regression:
• Suppose that we are interested in a variable Y, but we want to know about its relationship to another variable X, or we want to use X to predict (or estimate) the value of Y that might be obtained without actually measuring it, provided the relationship between the two can be expressed by a line. 'X' is usually called the independent variable and 'Y' is called the dependent variable.
• We assume that the values of variable X are either fixed or random. By fixed, we mean that the values are chosen by the researcher: either an experimental unit (patient) is given this value of X (such as the dosage of a drug), or a unit (patient) is chosen which is known to have this value of X.
• By random, we mean that units (patients) are chosen at random from all the possible units, and both variables X and Y are measured.
• We also assume that for each value x of X, there is a whole range or population of possible Y values and that the mean of the Y population at X = x, denoted by µy/x, is a linear function of x. That is,
  µy/x = α + βx
[Diagram: dependent variable and independent variable; two random variables, or a bivariate random variable]
ESTIMATION
We select a sample of n observations (xi, yi) from the population, with the goals:
• Estimate α and β.
• Predict the value of Y at a given value x of X.
• Make tests to draw conclusions about the model and its usefulness.
• We estimate the parameters α and β by 'a' and 'b' respectively by using the sample regression line:
  Ŷ = a + bx
ESTIMATION AND CALCULATION OF CONSTANTS 'a' AND 'b'
Where we calculate, by least squares,

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}, \qquad a = \bar{y} - b\bar{x}$$
EXAMPLE
• Investigators at a sports health centre are interested in the relationship between oxygen consumption and exercise time in athletes recovering from injury. Appropriate mechanics for exercising and measuring oxygen consumption are set up, and the results are presented below:

x variable: exercise time (min)   y variable: oxygen consumption
0.5                               620
1.0                               630
1.5                               800
2.0                               840
2.5                               840
3.0                               870
3.5                               1010
4.0                               940
4.5                               950
5.0                               1130
Calculations
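The calculation for the exercise-time example can be sketched as follows. This assumes the standard least-squares formulas b = (nΣxy - ΣxΣy)/(nΣx² - (Σx)²) and a = ȳ - b·x̄; the function name `least_squares_line` is hypothetical and the result is a check on the arithmetic, not a reproduction of the original slide.

```python
def least_squares_line(xs, ys):
    """Return (a, b) of the fitted line y-hat = a + b*x (hypothetical helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)  # slope
    a = sum_y / n - b * sum_x / n                               # intercept
    return a, b


# Data from the example: exercise time (min) and oxygen consumption.
exercise_time = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
oxygen = [620, 630, 800, 840, 840, 870, 1010, 940, 950, 1130]

a, b = least_squares_line(exercise_time, oxygen)
print(f"y-hat = {a:.1f} + {b:.2f} x")

# Predicted oxygen consumption at x = 3.0 minutes of exercise.
print(f"prediction at x = 3.0: {a + b * 3.0:.1f}")
```

With these data the intermediate sums are Σx = 27.5, Σy = 8630, Σx² = 96.25 and Σxy = 25750, which give a slope of about 97.8 and an intercept of about 594.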
Pearson's Correlation Coefficient
• With the aid of Pearson's correlation coefficient (r), we can determine the strength and the direction of the relationship between X and Y variables,
• both of which have been measured and must be quantitative.
• For example, we might be interested in examining the association between height and weight for the following sample of eight children:
Height and weights of 8 children

Child     Height (inches) X   Weight (pounds) Y
A         49                  81
B         50                  88
C         53                  87
D         55                  99
E         60                  91
F         55                  89
G         60                  95
H         50                  90
Average   54 inches           90 pounds
[Figure: Scatter plot for 8 babies, height (horizontal axis) against weight (vertical axis)]
Table: The Strength of a Correlation

Value of r (positive or negative)   Meaning
0.00 to 0.19                        A very weak correlation
0.20 to 0.39                        A weak correlation
0.40 to 0.69                        A modest correlation
0.70 to 0.89                        A strong correlation
0.90 to 1.00                        A very strong correlation
FORMULA FOR CORRELATION COEFFICIENT (r)
• With Pearson's r, the sum Σ(X - X̄)(Y - Ȳ) means that we add the products of the deviations to see if the positive products or negative products are more abundant and sizable. Positive products indicate cases in which the variables go in the same direction (that is, both taller and heavier than average or both shorter and lighter than average);
• negative products indicate cases in which the variables go in opposite directions (that is, taller but lighter than average or shorter but heavier than average).
Computational Formula for Pearson's Correlation Coefficient r

$$r = \frac{SP}{\sqrt{SS_x \, SS_y}}$$

where SP (sum of the products), SSx (sum of the squares for x) and SSy (sum of the squares for y) can be computed as follows:

$$SP = \sum XY - \frac{(\sum X)(\sum Y)}{n}, \qquad SS_x = \sum X^2 - \frac{(\sum X)^2}{n}, \qquad SS_y = \sum Y^2 - \frac{(\sum Y)^2}{n}$$
Child   X    Y    X²    Y²     XY
A       12   12   144   144    144
B       10   8    100   64     80
C       6    12   36    144    72
D       16   11   256   121    176
E       8    10   64    100    80
F       9    8    81    64     72
G       12   16   144   256    192
H       11   15   121   225    165
∑       84   92   946   1118   981
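The column totals of this table are all that is needed to obtain r. The sketch below assumes the usual computational definitions SP = ΣXY - (ΣX)(ΣY)/n, SSx = ΣX² - (ΣX)²/n and SSy = ΣY² - (ΣY)²/n, with r = SP/√(SSx·SSy); the function name `pearson_r_from_sums` is hypothetical.

```python
from math import sqrt


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from column totals (hypothetical helper)."""
    sp = sum_xy - sum_x * sum_y / n   # sum of products
    ss_x = sum_x2 - sum_x**2 / n      # sum of squares for x
    ss_y = sum_y2 - sum_y**2 / n      # sum of squares for y
    return sp / sqrt(ss_x * ss_y), sp, ss_x, ss_y


# Totals from the table of eight children: n, sum X, sum Y, sum X^2, sum Y^2, sum XY.
r, sp, ss_x, ss_y = pearson_r_from_sums(8, 84, 92, 946, 1118, 981)

print(f"SP = {sp:.0f}, SSx = {ss_x:.0f}, SSy = {ss_y:.0f}, r = {r:.3f}")
```

For these totals SP = 15, SSx = 64 and SSy = 60, so r is about 0.24, a weak positive correlation on the strength scale given earlier.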
Table 2: Chest circumference and Birth Weight of 10 babies

x (cm)   y (kg)   x²        y²      xy
22.4     2.00     501.76    4.00    44.8
27.5     2.25     756.25    5.06    61.88
28.5     2.10     812.25    4.41    59.85
28.5     2.35     812.25    5.52    66.98
29.4     2.45     864.36    6.00    72.03
29.4     2.50     864.36    6.25    73.5
30.5     2.80     930.25    7.84    85.4
32.0     2.80     1024.0    7.84    89.6
31.4     2.55     985.96    6.50    80.07
32.5     3.00     1056.25   9.00    97.5
TOTAL
292.1    24.8     8607.69   62.42   731.61
Checking for significance
• There appears to be a strong correlation between chest circumference and birth weight in babies.
• We need to check that such a correlation is unlikely to have arisen by chance in a sample of ten babies.
• Tables are available that give the significant values of this correlation coefficient at two probability levels.
• First we need to work out the degrees of freedom. They are the number of pairs of observations less two, that is (n – 2) = 8.
• Looking at the table we find that our calculated value of 0.86 exceeds the tabulated value at 8 df of 0.765 at p = 0.01. Our correlation is therefore statistically highly significant.
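As a check on the value r = 0.86, the coefficient can be recomputed from the raw pairs in Table 2. This is a minimal sketch: the function name `pearson_r` is hypothetical, the deviation form of the formula is used instead of the computational one, and the critical value 0.765 is simply copied from the text above rather than derived.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from raw pairs, using deviations from the means (hypothetical helper)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)


# Table 2: chest circumference (cm) and birth weight (kg) of 10 babies.
chest = [22.4, 27.5, 28.5, 28.5, 29.4, 29.4, 30.5, 32.0, 31.4, 32.5]
weight = [2.00, 2.25, 2.10, 2.35, 2.45, 2.50, 2.80, 2.80, 2.55, 3.00]

r = pearson_r(chest, weight)
df = len(chest) - 2          # degrees of freedom = number of pairs less two
tabulated_value = 0.765      # from the text: 8 df at p = 0.01

print(f"r = {r:.2f} with {df} df")
print("significant at p = 0.01" if abs(r) > tabulated_value else "not significant at p = 0.01")
```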
Chapter 12
Analysis of Frequency Data
An Introduction to the Chi-Square Distribution
Prepared By: Dr. Shuhrat Khan
TESTS OF INDEPENDENCE
To test whether two criteria of classification are independent. For example, whether socioeconomic status and area of residence of people in a city are independent.
We divide our sample according to status (low, medium and high incomes, etc.), and the same sample is categorized according to area (urban, rural, suburban, slums, etc.).
Put the first criterion in columns, equal in number to the categories of the 1st criterion (socioeconomic status), and the 2nd in rows, where the number of rows equals the number of categories of the 2nd criterion (areas of cities).
The Contingency Table
Table: Two-Way Classification of Sample

Second Criterion ↓   First Criterion of Classification →
                     1     2     3     ...   c     Total
1                    N11   N12   N13   ...   N1c   N1.
2                    N21   N22   N23   ...   N2c   N2.
3                    N31   N32   N33   ...   N3c   N3.
.                    .     .     .     ...   .     .
.                    .     .     .     ...   .     .
r                    Nr1   Nr2   Nr3   ...   Nrc   Nr.
Total                N.1   N.2   N.3   ...   N.c   N
Observed versus Expected Frequencies
Oij: The frequencies in the ith row and jth column given in any contingency table are called observed frequencies; they result from the cross classification according to the two criteria.
eij: Expected frequencies, on the assumption of independence of the two criteria, are calculated by multiplying the marginal totals of any cell and then dividing by the total frequency.
Formula:

$$e_{ij} = \frac{(N_{i.})(N_{.j})}{N}$$
Chi-square Test
After the calculation of the expected frequencies, prepare a table of expected frequencies and use chi-square:

$$\chi^2 = \sum_{i=1}^{k} \left[ \frac{(o_i - e_i)^2}{e_i} \right]$$

where the summation is over all r x c = k cells.
D.F.: the degrees of freedom for using the table are (r-1)(c-1) for α level of significance.
Note that the test is always one-sided.
Example 12.4.1 (page 613)
The researchers are interested in determining whether preconception use of folic acid and race are independent. The data are:

Observed Frequencies Table

          Use of Folic Acid
          Yes    No     Total
White     260    299    559
Black     15     41     56
Other     7      14     21
Total     282    354    636

Expected Frequencies Table

          Yes                         No                          Total
White     (282)(559)/636 = 247.86     (354)(559)/636 = 311.14     559
Black     (282)(56)/636 = 24.83       (354)(56)/636 = 31.17       56
Other     (282)(21)/636 = 9.31        (354)(21)/636 = 11.69       21
Total     282                         354                         636
Calculations and Testing
Data: See the given table
Assumption: Simple random sample
Hypotheses: H0: race and use of folic acid are independent
            HA: the two variables are not independent
Let α = 0.05
The test statistic is the chi-square statistic given earlier.
Distribution: when H0 is true, χ² is distributed as chi-square with (r-1)(c-1) = (3-1)(2-1) = 2 d.f.
Decision Rule: Reject H0 if the value of χ² is greater than the tabulated χ² value with (r-1)(c-1) degrees of freedom = 5.991
Calculations:

$$\chi^2 = \frac{(260 - 247.86)^2}{247.86} + \frac{(299 - 311.14)^2}{311.14} + \cdots + \frac{(14 - 11.69)^2}{11.69} = 9.091$$
Conclusion
Statistical decision: We reject H0 since 9.08960 > 5.991
Conclusion: We conclude that H0 is false, and that there is a relationship between race and preconception use of folic acid.
P value: Since 7.378 < 9.08960 < 9.210, 0.01 < p < 0.025
We also reject the hypothesis at the 0.025 level of significance but do not reject it at the 0.01 level.
Solve Ex 12.4.1 and 12.4.5 (p 620 & p 622)
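The expected frequencies and the chi-square statistic for the folic acid example can be recomputed as a check. The sketch below uses only the standard library; the function names are hypothetical. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail chi-square probability is exp(-χ²/2); it would not be valid for other degrees of freedom.

```python
from math import exp

# Observed frequencies: rows = White, Black, Other; columns = Yes, No.
observed = [[260, 299], [15, 41], [7, 14]]


def expected_frequencies(table):
    """e_ij = (row total)(column total) / grand total (hypothetical helper)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (o - e)^2 / e over all cells (hypothetical helper)."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_frequencies(observed)
chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = exp(-chi2 / 2)  # closed form, valid only because df == 2 here

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.4f}")
print("Reject H0" if chi2 > 5.991 else "Fail to reject H0")
```

The p-value obtained this way falls between 0.01 and 0.025, in agreement with the bracketing from the table values 7.378 and 9.210.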
ODDS RATIO
In a retrospective study, samples are selected from those who have the disease, called 'cases', and those who do not have the disease, called 'controls'. The investigator looks back (has a retrospective look) at the subjects and determines which ones have (or had) and which ones do not have (or did not have) the risk factor.
The data are classified into a 2x2 table for comparing cases and controls with respect to the risk factor, and the ODDS RATIO is calculated.
ODDS are defined to be the ratio of the probability of success to the probability of failure.
The estimate of the population odds ratio is

$$\widehat{OR} = \frac{a/b}{c/d} = \frac{ad}{bc}$$
ODDS RATIO
where a, b, c and d are the numbers given in the following table:

                Sample
Risk Factor ↓   Cases   Controls   Total
Present         a       b          a + b
Absent          c       d          c + d
Total           a + c   b + d

We may construct a 100(1-α)% CI for OR by the formula:

$$\widehat{OR}^{\,1 \pm \left(z_{1-\alpha/2} / \sqrt{X^2}\right)}$$
Example 12.7.2 for Odds Ratio
Example 12.7.2 page 640: The data relate to the obesity status of children aged 5-6 and the smoking status of their mothers during pregnancy.

                                     Obesity status
Smoking status (during pregnancy)    Cases   Non-cases   Total
Smoked throughout                    64      342         406
Never smoked                         68      3496        3564
Total                                132     3838        3970

Hence the OR for the table is:

$$\widehat{OR} = \frac{(64)(3496)}{(342)(68)} = 9.62$$
Confidence Interval for Odds Ratio
The (1-α)100% Confidence Interval for the Odds Ratio is:

$$\widehat{OR}^{\,1 \pm \left(z_{1-\alpha/2} / \sqrt{X^2}\right)}$$

where

$$X^2 = \frac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$$

For Example 12.7.2 we have: a = 64, b = 342, c = 68, d = 3496, therefore:

$$X^2 = \frac{3970\,(64 \times 3496 - 342 \times 68)^2}{(132)(3838)(406)(3564)} = 217.68$$

Its 95% CI is:

$$9.62^{\,1 \pm \left(1.96 / \sqrt{217.6831}\right)}$$

or (7.12, 13.00)
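The odds ratio, the X² value and the interval for Example 12.7.2 can be verified with a short script. This is a sketch under the formulas stated above; the function name `odds_ratio_ci` is hypothetical, and the normal quantile comes from the standard library instead of a table, so the limits may differ from (7.12, 13.00) in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, confidence=0.95):
    """Odds ratio, X^2 and CI of the form OR^(1 +/- z/sqrt(X^2)) (hypothetical helper)."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    x2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    lower = odds_ratio ** (1 - z / sqrt(x2))
    upper = odds_ratio ** (1 + z / sqrt(x2))
    return odds_ratio, x2, lower, upper


# Example 12.7.2: a = 64, b = 342 (smoked throughout); c = 68, d = 3496 (never smoked).
odds_ratio, x2, lower, upper = odds_ratio_ci(64, 342, 68, 3496)

print(f"OR = {odds_ratio:.2f}, X^2 = {x2:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```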
Interpretation of Example 12.7.2 Data
The 95% confidence interval (7.12, 13.00) means that we are 95% confident that the population odds ratio is somewhere between 7.12 and 13.00.
Since the interval does not contain 1, and in fact contains only values larger than one, we conclude that, in the population, obese children (cases) are more likely than non-obese children (non-cases) to have had a mother who smoked throughout the pregnancy.
Solve Ex 12.7.4 (page 646)
Interpretation of ODDS RATIO
The sample odds ratio provides an estimate of the relative risk in the population in the case of a rare disease.
The odds ratio can assume values between 0 and ∞.
A value of 1 indicates no association between risk factor and disease status.
A value greater than one indicates increased odds of having the disease among subjects in whom the risk factor is present.
Chapter 13Chapter 13 Special Techniques for use Special Techniques for use
when population parameters when population parameters and/or population distributions and/or population distributions
are unknoenare unknoenpages 683-689pages 683-689
Prepared By : Dr. Shuhrat KhanPrepared By : Dr. Shuhrat Khan
NON-PARAMETRIC STATISTICS
The t-test, z-test, etc. are all parametric tests, as they are based on assumptions of normality or known variances.
When we make no assumptions about the sampled population or about the population parameters, the tests are called non-parametric and distribution-free.
ADVANTAGES OF NON-PARAMETRIC STATISTICS
- Testing hypotheses about simple statements (not involving parameter values), e.g.
  - The two criteria are independent (test for independence)
  - The data fit a given distribution well (goodness-of-fit test)
- Distribution free: non-parametric tests may be used when the form of the sampled population is unknown.
- Computationally easy
- Analysis is possible for ranked or categorical data (data not based on a measurement scale)
The Sign Test
- This test is used as an alternative to the t-test when the normality assumption is not met.
- The only assumption is that the distribution of the underlying variable (data) is continuous.
- The test focuses on the median rather than the mean.
- The test is based on signs, pluses and minuses.
- The test is used for one sample as well as for two samples (paired data).
Example (One Sample Sign Test)
Scores of 10 mentally retarded girls:

Girl    1   2   3   4   5   6   7   8   9   10
Score   4   5   8   8   9   6   10  7   6   6

We wish to know if the median of the population is different from 5.
Solution:
Data: The data are the scores of 10 mentally retarded girls.
Assumption: The measurements are of a continuous variable.
Continued...
Hypotheses:
H0: The population median is 5
HA: The population median is not 5
Let α = 0.05
Test Statistic: The test statistic for the sign test is either the observed number of plus signs or the observed number of minus signs. The nature of the alternative hypothesis determines which of these test statistics is appropriate. In a given test, any one of the following alternative hypotheses is possible:
HA: P(+) > P(-)   one-sided alternative
HA: P(+) < P(-)   one-sided alternative
HA: P(+) ≠ P(-)   two-sided alternative
Continued...
- If the alternative hypothesis is HA: P(+) > P(-), a sufficiently small number of minus signs causes rejection of H0. The test statistic is the number of minus signs.
- If the alternative hypothesis is HA: P(+) < P(-), a sufficiently small number of plus signs causes rejection of H0. The test statistic is the number of plus signs.
- If the alternative hypothesis is HA: P(+) ≠ P(-), either a sufficiently small number of plus signs or a sufficiently small number of minus signs causes rejection of the null hypothesis. We may take as the test statistic the less frequently occurring sign.
Continued...
Distribution of test statistic: We assign a plus sign to those scores that lie above the hypothesized median, a minus sign to those that fall below, and a 0 to any score equal to it:

Girl                            1   2   3   4   5   6   7   8   9   10
Score relative to median = 5    -   0   +   +   +   +   +   +   +   +

The score equal to 5 is dropped, leaving n = 9. When H0 is true, the number of plus (or minus) signs follows a binomial distribution with p = 0.5.
Decision Rule: Let k = the smaller of the number of pluses and the number of minuses. Here k = 1, the number of minus signs.
For HA: P(+) > P(-), reject H0 if, when H0 is true, the probability of observing k or fewer minus signs is less than or equal to α.
Continued...
- For HA: P(+) > P(-), reject H0 if, when H0 is true, the probability of observing k or fewer minus signs is less than or equal to α.
- For HA: P(+) < P(-), reject H0 if, when H0 is true, the probability of observing k or fewer plus signs is less than or equal to α.
- For HA: P(+) ≠ P(-), reject H0 if, when H0 is true, the probability of obtaining a value of k as extreme as or more extreme than the one actually computed is less than or equal to α/2.
Calculation of test statistic: The probability of observing k or fewer minus signs, given a sample of size n and parameter p, is found by evaluating the following expression (with q = 1 - p):
$$P(X \le k \mid n, p) = \sum_{x=0}^{k} {}_{n}C_{x}\, p^{x} q^{n-x}$$
Continued...
For our example we would compute
$$P(k \le 1 \mid 9, 0.5) = {}_{9}C_{0}(0.5)^{0}(0.5)^{9-0} + {}_{9}C_{1}(0.5)^{1}(0.5)^{9-1} = 0.00195 + 0.01758 = 0.0195$$
Statistical decision: In Appendix Table B we find P(k ≤ 1 | 9, 0.5) = 0.0195.
Conclusion: Since 0.0195 is less than 0.025, we reject the null hypothesis and conclude that the median score is not 5.
p value: The p value for this test is 2(0.0195) = 0.0390, because it is a two-sided test.
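The one-sample sign test above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the helper name binom_cdf and the variable names are illustrative assumptions, not part of the textbook. It drops the score equal to the hypothesized median, counts the signs, and evaluates the binomial sum with p = 0.5.

```python
from math import comb

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p) (hypothetical helper name)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

scores = [4, 5, 8, 8, 9, 6, 10, 7, 6, 6]   # scores of the 10 girls
median0 = 5                                 # hypothesized median

plus = sum(s > median0 for s in scores)     # scores above 5
minus = sum(s < median0 for s in scores)    # scores below 5
n = plus + minus                            # the score equal to 5 is dropped
k = min(plus, minus)                        # less frequent sign

tail = binom_cdf(k, n)                      # P(k <= 1 | 9, 0.5)
p_value = min(1.0, 2 * tail)                # two-sided p value

print(f"plus={plus}, minus={minus}, n={n}, k={k}")
print(f"tail probability = {tail:.4f}, two-sided p = {p_value:.4f}")
```

Under these assumptions the tail probability agrees with the Table B value of 0.0195 to four decimals; the exact two-sided p value is 0.0391, against the slide's 0.0390 obtained by doubling the rounded 0.0195.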
SIGN TEST: Paired Data
- This is used as an alternative to the t-test for paired observations when the underlying assumptions of the t-test are not met.
- Null hypothesis to be tested: the median difference is zero, or equivalently P(Xi > Yi) = P(Yi > Xi).
- Subtract Yi from Xi. If Yi is less than Xi, the sign of the difference is (+); if Yi is greater than Xi, the sign of the difference is (-), so that
  H0: P(+) = P(-) = 0.5
- TEST STATISTIC: As before, the test statistic is k, the number of occurrences of the less frequent sign (plus or minus).
SIGN TEST: Example 13.3.2
A dental research team matched 24 patients into 12 pairs on age, sex, and intelligence. One member of each pair was instructed how to brush; the other remained uninstructed. Six months later an evaluation showed the following scores (a low score indicates a higher level of hygiene):

Pair no.             1    2    3    4    5    6    7    8    9    10   11   12
Instructed           1.5  2.0  3.5  3.0  3.5  2.5  2.0  1.5  1.5  2.0  3.0  2.0
Not instructed       2.0  2.0  4.0  2.5  4.0  3.0  3.5  3.0  2.5  2.5  2.5  2.5
Sign of difference   -    0    -    +    -    -    -    -    -    -    +    -

(The difference is instructed minus not instructed.)
1. Data: Scores of dental hygiene for 12 matched pairs, one member instructed how to brush and the other uninstructed.
2. Assumption: The distribution of the variable is continuous.
3. Hypotheses:
   H0: The median of the differences is zero [P(+) = P(-) = 0.5]
   HA: The median of the differences is negative [P(+) < P(-)]
Continued...
Let α be 0.05.
4. Test Statistic: The test statistic is the number of plus signs, the less frequent sign, i.e. k = 2.
5. Distribution: k is binomial with n = 11 (as one observation, the zero difference, is discarded) and p = 0.5.
6. Decision Rule: Reject H0 if P(k ≤ 2 | 11, 0.5) ≤ 0.05.
7. Calculations:
$$P(k \le 2 \mid 11, 0.5) = \sum_{k=0}^{2} {}_{11}C_{k}(0.5)^{k}(0.5)^{11-k}$$
Table B or direct calculation shows the probability is equal to 0.0327, which is less than 0.05, so we must reject H0.
8. Conclusion: The median difference is negative, and the instructions are beneficial.
9. p value: Since it is a one-sided test, the p value is p = 0.0327.
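The paired sign test of Example 13.3.2 can be reproduced in a few lines. This is a sketch under the assumption that differences are taken as instructed minus not instructed and that zero differences are discarded; the variable names are illustrative only.

```python
from math import comb

instructed     = [1.5, 2.0, 3.5, 3.0, 3.5, 2.5, 2.0, 1.5, 1.5, 2.0, 3.0, 2.0]
not_instructed = [2.0, 2.0, 4.0, 2.5, 4.0, 3.0, 3.5, 3.0, 2.5, 2.5, 2.5, 2.5]

# Difference for each matched pair: instructed minus not instructed
diffs = [x - y for x, y in zip(instructed, not_instructed)]

n_plus = sum(d > 0 for d in diffs)    # number of plus signs
n_minus = sum(d < 0 for d in diffs)   # number of minus signs
n_used = n_plus + n_minus             # zero differences are discarded

# One-sided test of HA: P(+) < P(-): probability of n_plus or fewer plus signs
# when the number of plus signs is Binomial(n_used, 0.5)
p_one_sided = sum(comb(n_used, j) for j in range(n_plus + 1)) / 2**n_used

print(f"plus={n_plus}, minus={n_minus}, n={n_used}, p={p_one_sided:.4f}")
```

With these data the sketch gives k = 2 plus signs out of n = 11 and a one-sided probability of about 0.0327, matching the value read from Table B.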
EXAMPLE 1
Cardiac output (liters/minute) was measured by thermodilution in a simple random sample of 15 postcardiac surgical patients in the left lateral position. The results were as follows:

4.91  4.10  6.74  7.27  7.42  7.50  6.56  4.64  5.98  3.14  3.23  5.80  6.17  5.39  5.77

We wish to know if we can conclude on the basis of these data that the population mean is different from 5.05.
Solution:
1. Data. As given above.
2. Assumptions. We assume that the requirements for the application of the Wilcoxon signed-ranks test are met.
3. Hypotheses.
   H0: µ = 5.05
   HA: µ ≠ 5.05
   Let α = 0.05.
EXAMPLE 1 (continued)
4. Test statistic. The test statistic T is T+ or T-, whichever is smaller.
5. Distribution of test statistic. Critical values of the test statistic are given in Table K of the Appendix.
6. Decision rule. We will reject H0 if the computed value of T is less than or equal to 25, the critical value for n = 15 and α/2 = 0.0240, the closest value to 0.0250 in Table K.
7. Calculation of test statistic. The calculation of the test statistic is shown in the table.
Cardiac output   di = xi - 5.05   Rank of |di|   Signed rank of |di|
4.91             -0.14            1              -1
4.10             -0.95            7              -7
6.74             +1.69            10             +10
7.27             +2.22            13             +13
7.42             +2.37            14             +14
7.50             +2.45            15             +15
6.56             +1.51            9              +9
4.64             -0.41            3              -3
5.98             +0.93            6              +6
3.14             -1.91            12             -12
3.23             -1.82            11             -11
5.80             +0.75            5              +5
6.17             +1.12            8              +8
5.39             +0.34            2              +2
5.77             +0.72            4              +4

T+ = 86, T- = 34, T = 34
EXAMPLE 1 (continued)
8. Statistical decision. Since 34 is greater than 25, we are unable to reject H0.
9. Conclusion. We conclude that the population mean may be 5.05.
10. p value. From Table K we see that the p value is p = 2(0.0757) = 0.1514.
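The signed-rank calculation for Example 1 can be sketched as follows. This is an illustrative Python version, not the textbook's procedure verbatim: the helper average_ranks is a hypothetical name, differences are rounded to two decimals to match the table, and tied absolute differences (none occur here) would receive average ranks. The critical value 25 is the one quoted from Table K.

```python
def average_ranks(values):
    """Ranks 1..len(values); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1      # mean of 1-based positions
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

cardiac_output = [4.91, 4.10, 6.74, 7.27, 7.42, 7.50, 6.56, 4.64,
                  5.98, 3.14, 3.23, 5.80, 6.17, 5.39, 5.77]
mu0 = 5.05                                   # hypothesized value

# d_i = x_i - 5.05, rounded to two decimals as in the table; zeros are dropped
d = [round(x - mu0, 2) for x in cardiac_output]
d = [v for v in d if v != 0]

ranks = average_ranks([abs(v) for v in d])   # ranks of |d_i|
t_plus = sum(r for r, v in zip(ranks, d) if v > 0)
t_minus = sum(r for r, v in zip(ranks, d) if v < 0)
T = min(t_plus, t_minus)

critical_value = 25                          # from Table K, as quoted above
reject_h0 = T <= critical_value

print(f"T+ = {t_plus}, T- = {t_minus}, T = {T}, reject H0: {reject_h0}")
```

For these data the sketch reproduces T+ = 86, T- = 34, and T = 34, so H0 is not rejected.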
EXAMPLE 2
A researcher designed an experiment to assess the effects of prolonged inhalation of cadmium oxide. Fifteen laboratory animals served as experimental subjects, while 10 similar animals served as controls. The variable of interest was hemoglobin level following the experiment. The results are shown in Table 2.
We wish to know if we can conclude that prolonged inhalation of cadmium oxide reduces hemoglobin level.
EXAMPLE 2 (continued)
TABLE 2. HEMOGLOBIN DETERMINATIONS (GRAMS) FOR 25 LABORATORY ANIMALS

EXPOSED ANIMALS (X)   UNEXPOSED ANIMALS (Y)
14.4                  17.4
14.2                  16.2
13.8                  17.1
16.5                  17.5
14.1                  15.0
16.6                  16.0
15.9                  16.9
15.6                  15.0
14.1                  16.3
15.3                  16.8
15.7
16.7
13.7
15.3
14.0
EXAMPLE 2 (continued)
Solution:
1. Data. See Table 2 above.
2. Assumptions. We presume that the assumptions of the Mann-Whitney test are met.
3. Hypotheses.
   H0: Mx ≥ My
   HA: Mx < My
where Mx is the median of a population of animals exposed to cadmium oxide and My is the median of a population of animals not exposed to the substance. Suppose we let α = 0.05.
EXAMPLE 2 (continued)
4. Test statistic. The test statistic is
$$T = S - \frac{n(n+1)}{2}$$
where n is the number of sample X observations and S is the sum of the ranks assigned to the sample observations from the population of X values. The choice of which sample's values we label as X is arbitrary.
TABLE 2. ORIGINAL DATA AND RANKS

X      Rank    Y      Rank
13.7   1
13.8   2
14.0   3
14.1   4.5
14.1   4.5
14.2   6
14.4   7
               15.0   8.5
               15.0   8.5
15.3   10.5
15.3   10.5
15.6   12
15.7   13
15.9   14
               16.0   15
               16.2   16
               16.3   17
16.5   18
16.6   19
16.7   20
               16.8   21
               16.9   22
               17.1   23
               17.4   24
               17.5   25

Sum of the X ranks = S = 145
EXAMPLE 2 (continued)
5. Distribution of test statistic. The critical values are given in Table K.
6. Decision rule. Reject H0: Mx ≥ My if the computed T is less than w_α, the critical value for n, the number of X observations; m, the number of Y observations; and α, the chosen level of significance.
If the hypotheses were of the type
   H0: Mx ≤ My
   HA: Mx > My
we would reject H0: Mx ≤ My if the computed T is greater than w_{1-α}, where w_{1-α} = nm - w_α.
EXAMPLE 2 (continued)
For the two-sided test situation with
   H0: Mx = My
   HA: Mx ≠ My
reject H0: Mx = My if the computed value of T is either less than w_{α/2} or greater than w_{1-α/2}, where w_{α/2} is the critical value of T for n, m, and α/2 given in Appendix II Table K, and w_{1-α/2} = nm - w_{α/2}.
For this example the decision rule is to reject H0 if the computed value of T is smaller than 45, the critical value of the test statistic for n = 15, m = 10, and α = 0.05 found in Table K.
EXAMPLE 2 (continued)
7. Calculation of test statistic. We have S = 145, so that
$$T = 145 - \frac{15(15+1)}{2} = 25$$
8. Statistical decision. When we enter Table K with n = 15, m = 10, and α = 0.05, we find the critical value of w_α to be 45. Since 25 is less than 45, we reject H0.
9. Conclusion. We conclude that Mx is smaller than My. This leads us to the conclusion that prolonged inhalation of cadmium oxide does reduce the hemoglobin level.
10. p value. Since 22 < 25 < 30, we have for this test 0.005 > p > 0.001.
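The rank sum S and the statistic T = S - n(n+1)/2 for Example 2 can be verified with a short script. This is a sketch only: the helper average_ranks is a hypothetical name, tied values receive the mean of the ranks they occupy, and the critical value 45 is simply the one quoted from the table in the text.

```python
def average_ranks(values):
    """Ranks 1..len(values); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1      # mean of 1-based positions
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

exposed = [14.4, 14.2, 13.8, 16.5, 14.1, 16.6, 15.9, 15.6,
           14.1, 15.3, 15.7, 16.7, 13.7, 15.3, 14.0]          # X sample
unexposed = [17.4, 16.2, 17.1, 17.5, 15.0, 16.0, 16.9, 15.0,
             16.3, 16.8]                                       # Y sample

n, m = len(exposed), len(unexposed)
ranks = average_ranks(exposed + unexposed)   # rank the pooled 25 observations

S = sum(ranks[:n])                           # sum of the ranks of the X values
T = S - n * (n + 1) / 2                      # Mann-Whitney test statistic

critical_value = 45                          # quoted in the text for n=15, m=10, alpha=0.05
reject_h0 = T < critical_value               # lower-tail test of H0: Mx >= My

print(f"S = {S}, T = {T}, reject H0: {reject_h0}")
```

With the data of Table 2 the sketch gives S = 145 and T = 25, which is below 45, so H0 is rejected.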
EXAMPLE 2 (continued)
When either n or m is greater than 20 we cannot use Appendix Table K to obtain critical values for the Mann-Whitney test. When this is the case we may compute
$$z = \frac{T - mn/2}{\sqrt{nm(n+m+1)/12}}$$
and compare the result, for significance, with critical values of the standard normal distribution.
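The large-sample formula can be sketched in Python as below. The function name mann_whitney_z is an illustrative assumption. For lack of a larger data set, the call reuses T = 25, n = 15, m = 10 from Example 2, where neither sample exceeds 20, so the result only illustrates the arithmetic and is not a recommended use of the approximation.

```python
from math import sqrt
from statistics import NormalDist

def mann_whitney_z(T, n, m):
    """Normal approximation: z = (T - mn/2) / sqrt(nm(n + m + 1)/12)."""
    return (T - m * n / 2) / sqrt(n * m * (n + m + 1) / 12)

# Illustration only: values taken from Example 2 (n and m are both below 20)
z = mann_whitney_z(25, 15, 10)

# Lower-tail probability for the one-sided alternative HA: Mx < My
p_lower = NormalDist().cdf(z)

print(f"z = {z:.4f}, one-sided p = {p_lower:.4f}")
```

Here z is about -2.77, and the corresponding lower-tail probability falls inside the range 0.005 > p > 0.001 reported for the exact test.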